Functional Collections and Arity Exceptions

“We need to show that off to the Scheme programmers.”

Rich Hickey

David Nolen was live coding at Clojure/conj when Rich Hickey raised his hand. David, unsure of what to expect, invited Rich’s question. It was more of a suggestion about his open Emacs buffer:

“Why don’t you use the set as a function?”

David had eta-expanded a set.

Dan Friedman and Will Byrd were in the audience and had presented one of their famous paired miniKanren talks. Entirely in Scheme, of course. Tongue in cheek, Rich justified his comment in a way resembling the quote at the top of this page.

What made this quote memorable, perhaps, was that I was also trying to impress them. Dan is a non-stop recruiter of grad students for Indiana University, and having warmed the crowd for Dan and Will's talk by giving a talk on core.logic myself at the same conference, I had their attention. They easily hooked me on the idea of studying at Indiana University, and I started my PhD there a few short years later.

So, sets are functions. In Clojure, most of the interesting data structures are too. Called with a key, they usually look that key up in themselves.
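Here is a small REPL-style sketch of what that looks like in practice, including the eta-expanded form next to the direct one. The name `allowed` and the sample data are made up for illustration.

```clojure
;; Hypothetical example data: a set we want to use as a predicate.
(def allowed #{2 4})

;; Collections called directly as functions of a key.
(allowed 2)             ;; => 2   (the element itself when present)
(allowed 3)             ;; => nil (not a member)
({:a 1} :a)             ;; => 1
({:a 1} :b :missing)    ;; => :missing (maps also accept a default)
([10 20 30] 1)          ;; => 20  (vectors are indexed by position)

;; Eta-expanded: a wrapper function that only forwards its argument.
(filter (fn [x] (allowed x)) [1 2 3 4])   ;; => (2 4)

;; Equivalent, using the set itself as the function.
(filter allowed [1 2 3 4])                ;; => (2 4)
```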

Clojure makes this possible by making a function an abstraction via the interface clojure.lang.IFn. The JVM is very good at dispatching methods based on the number of arguments you provide, so its method invoke has 22 overloads: one for each of the common cases of 0-20 arguments, plus a final catch-all that takes 20 arguments followed by an array holding the rest.
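If you want to see those overloads for yourself, plain Java reflection from the REPL should do. This is only a sketch; `invoke-methods` is a name I made up, and the rest is standard java.lang.reflect.

```clojure
;; Collect the public methods named `invoke` on the IFn interface.
;; `invoke-methods` is a hypothetical name used only for this sketch.
(def invoke-methods
  (filter #(= "invoke" (.getName ^java.lang.reflect.Method %))
          (.getMethods clojure.lang.IFn)))

(count invoke-methods)
;; => 22

;; Parameter counts of the overloads, sorted: 0 through 21.
;; The 21-parameter overload is the catch-all: 20 arguments plus an
;; Object array for everything after the 20th.
(sort (map #(.getParameterCount ^java.lang.reflect.Method %) invoke-methods))

;; Anything implementing IFn can be called, even if it is not a `fn`.
(ifn? #{})   ;; => true
(fn? #{})    ;; => false
```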

While this is great for performance, it’s not so great for implementors of this interface. You need to implement 22 methods to be a function, even if, like sets, you only need to work on 1 or 2 arguments. This is where implementation inheritance comes in: the abstract class clojure.lang.AFn implements all 22 methods to throw an arity exception based on the number of arguments provided. Now, each collection can simply extend this class and override just a couple of methods, and you get good defaults for the rest.
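To watch the inherited defaults at work, one option is a proxy that extends AFn and overrides nothing at all. The names `bare-fn` and `arity-error?` below are hypothetical helpers for this sketch, not part of Clojure.

```clojure
;; A "function" that extends AFn but overrides none of its invoke methods,
;; so every call should fall through to AFn's arity-throwing defaults.
(def bare-fn (proxy [clojure.lang.AFn] []))

;; Hypothetical helper: does calling f with args throw an ArityException?
(defn arity-error? [f & args]
  (try
    (apply f args)
    false
    (catch clojure.lang.ArityException _
      true)))

(arity-error? bare-fn)          ;; => true
(arity-error? bare-fn 1 2)      ;; => true

;; A set overrides the 1-argument invoke and inherits the rest.
(arity-error? #{:a} :a)         ;; => false
(arity-error? #{:a} :a :b :c)   ;; => true
```

A proxy is used here only because it is an easy way to subclass AFn at the REPL; the built-in collections do the same thing in Java.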

At least that’s the theory.

Today I found a minor bug in AFn: for 22 or more arguments, the default implementation of invoke will throw an arity exception that claims only 21 arguments in its error message:

Clojure 1.11.1
user=> (apply {} (range 22))
Execution error (ArityException) at user/eval1 (REPL:1).
Wrong number of args (21) passed to: clojure.lang.PersistentArrayMap

user=> (apply {} (range 23))
Execution error (ArityException) at user/eval3 (REPL:1).
Wrong number of args (21) passed to: clojure.lang.PersistentArrayMap
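Rather than reading error messages by eye, you can compare the number of arguments passed with the number the exception reports. The sketch below assumes the public `actual` field on clojure.lang.ArityException; `reported-arity` is a hypothetical helper name.

```clojure
;; Hypothetical helper: call f with n arguments and return the argument
;; count named in the resulting ArityException, or :no-arity-error if the
;; call does not throw one.
(defn reported-arity [f n]
  (try
    (apply f (range n))
    :no-arity-error
    (catch clojure.lang.ArityException e
      (.-actual e))))

;; Pairs of [arguments passed, arguments reported] for the empty map.
(for [n [3 20 21 22 23]]
  [n (reported-arity {} n)])
```

Going by the transcript above, on Clojure 1.11.1 the pairs for 22 and 23 should both report 21. On a version where the count is computed correctly, each pair would hold the same number twice.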

This affected almost every functional collection in Clojure, including:

clojure.lang.PersistentArrayMap
clojure.lang.PersistentArrayMap$TransientArrayMap
clojure.lang.PersistentHashMap
clojure.lang.PersistentHashMap$TransientHashMap
clojure.lang.PersistentVector
clojure.lang.PersistentVector$TransientVector
clojure.lang.PersistentHashSet
clojure.lang.PersistentHashSet$TransientHashSet
clojure.lang.PersistentTreeMap
clojure.lang.PersistentTreeSet
clojure.lang.MapEntry

It’s also likely that libraries similarly extend AFn to implement their own collections. Try it out with your favourite 3rd-party collection.
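A quick way to tell whether a given collection inherits these defaults is to ask whether it is an instance of AFn. This is a sketch; `extends-afn?` and `samples` are hypothetical names, and you would substitute a value from your own library.

```clojure
;; Hypothetical helper: does this value's class inherit from AFn?
(defn extends-afn? [x]
  (instance? clojure.lang.AFn x))

;; One value for each of the built-in classes listed above.
(def samples
  [{} (transient {})
   (hash-map :a 1) (transient (hash-map :a 1))
   [] (transient [])
   #{} (transient #{})
   (sorted-map) (sorted-set)
   (first {:a 1})])

(every? extends-afn? samples)   ;; => true

;; Lists are not functions at all, so they are unaffected.
(extends-afn? '(1 2 3))         ;; => false
```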

Beyond that, the most important kind of function was impacted: fn.

user=> (apply (fn []) (range 25))
Execution error (ArityException) at user/eval5 (REPL:1).
Wrong number of args (21) passed to: user/eval5/fn--141

The Clojure compiler uses AFn in the same way as a shortcut for defining functions. The emitted code extends AFn and just overrides the methods it needs:

user=> (supers (class (fn [])))
#{... clojure.lang.AFn ...}

That means probably every function defined by fn and defn that accepts at most 20 arguments suffers from this problem. Try it on your own functions.

user=> (apply identity (range 25))
Execution error (ArityException) at user/eval148 (REPL:1).
Wrong number of args (21) passed to: clojure.core/identity

user=> (apply fnil (range 100))
Execution error (ArityException) at user/eval152 (REPL:1).
Wrong number of args (21) passed to: clojure.core/fnil

If the fn has rest arguments, then a different code path is taken. A RestFn is created instead which redirects arguments slightly differently: its method getRequiredArity returns the number of fixed arguments it has. The Clojure compiler allows up to 20 fixed arguments, but you can create your own instances of RestFn that exceed this with familiar results.

user=> (apply (proxy [clojure.lang.RestFn] []
                (getRequiredArity [] 30))
              (range 25))
Execution error (ArityException) at user.proxy$clojure.lang.RestFn$ff19274a/throwArity (REPL:-1).
Wrong number of args (21) passed to: user.proxy/clojure.lang.RestFn/ff19274a

This is just a coincidence, though: the source of this second problem is RestFn itself, not AFn. It’s a different and more (completely?) benign issue.
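To see which of the two code paths a particular function takes, you can inspect it at the REPL. The sketch below uses two throwaway functions, `fixed-fn` and `rest-fn`, which are names invented for this example.

```clojure
;; Hypothetical example functions: one with fixed arguments only,
;; one with rest arguments.
(def fixed-fn (fn [a b] [a b]))
(def rest-fn  (fn [a b & more] [a b more]))

;; Only the function with rest arguments is compiled to a RestFn.
(instance? clojure.lang.RestFn fixed-fn)   ;; => false
(instance? clojure.lang.RestFn rest-fn)    ;; => true

;; getRequiredArity reports the number of fixed arguments.
(.getRequiredArity ^clojure.lang.RestFn rest-fn)   ;; => 2

;; Both still inherit from AFn.
(instance? clojure.lang.AFn fixed-fn)      ;; => true
(instance? clojure.lang.AFn rest-fn)       ;; => true
```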

I have reported both of these to Clojure: