clojure.core goodies

A random collection of functions/macros in clojure.core.

count, bounded-count, counted?, last, peek

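  • count can be slow depending on the underlying data type. It is constant time only for collections that are counted?; for other seqs it walks the elements one by one (stopping early only if it reaches a counted tail).
    (count (cons x huge-list))
    
    ;; count on a lazy seq realizes and walks the whole seq;
    ;; whether count on a range is constant time depends on the Clojure version and the argument types
    
  • bounded-count: (bounded-count n coll) returns the count of coll if it is counted?; otherwise it counts at most the first n elements of coll using its seq.
  • Avoid using last with large seqs, since it traverses every element in the seq.
  • For a large vector, (nth v (dec (count v))) or (peek v) can be much faster than (last v).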
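
The points above can be checked at the REPL with a small sketch. The names v and lazy are only illustrative, and bounded-count assumes Clojure 1.9 or later.

```clojure
(def v (vec (range 1000)))
(def lazy (map inc v))        ; a LazySeq, which is not counted?

(counted? v)                  ;; => true
(counted? lazy)               ;; => false
(bounded-count 10 lazy)       ;; => 10, stops after the first 10 elements
(bounded-count 10 v)          ;; => 1000, v is counted? so the bound is ignored
(peek v)                      ;; => 999, no traversal
(last v)                      ;; => 999, but walks the whole seq
```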

recur

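recur does not gather variadic/rest args: when the recur target is a variadic fn, the rest parameter must be passed as a single seq (or nil).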

(def conj 
  (fn ^:static conj
    ([coll x & xs]
     (if xs
       ;; note that params are passed to recur as if conj is a three-arg function
       (recur (clojure.lang.RT/conj coll x) (first xs) (next xs))
       (clojure.lang.RT/conj coll x)))))

(def
  assoc
  (fn ^:static assoc
    ([map key val] (clojure.lang.RT/assoc map key val))
    ([map key val & kvs]
     (let [ret (clojure.lang.RT/assoc map key val)]
       (if kvs
         (if (next kvs)
           (recur ret (first kvs) (second kvs) (nnext kvs))
           (throw (IllegalArgumentException.
                   "assoc expects even number of arguments after map/vector, found odd number")))
         ret)))))
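
The same pattern in user code, as a minimal sketch (sum-all is a made-up function, not part of clojure.core): the rest parameter xs is handed to recur as one seq argument.

```clojure
(defn sum-all
  "Adds acc and all remaining args; recur sees two parameters: acc and xs."
  [acc & xs]
  (if xs
    (recur (+ acc (first xs)) (next xs)) ; (next xs) is a seq or nil
    acc))

(sum-all 1 2 3 4) ;; => 10
(sum-all 5)       ;; => 5
```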

LazySeq, lazy-seq

Be aware that calling seq on a lazy seq is synchronized:

final synchronized public ISeq seq(){
	sval();
	if(sv != null)
		{
		Object ls = sv;
		sv = null;
		while(ls instanceof LazySeq)
			{
			ls = ((LazySeq)ls).sval();
			}
		s = RT.seq(ls);
		}
	return s;
}
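
One visible consequence of this synchronization is that the body of a lazy-seq runs at most once even when several threads race to realize it. A small sketch (the atom calls exists only to count realizations):

```clojure
(def calls (atom 0))
(def s (lazy-seq
         (swap! calls inc)   ; side effect used to count realizations
         (list 1 2 3)))

(realized? s)                ;; => false

;; four threads race to realize the same lazy seq
(def results (mapv deref (mapv (fn [_] (future (first s))) (range 4))))

results                      ;; => [1 1 1 1]
@calls                       ;; => 1
(realized? s)                ;; => true
```

In a script, call (shutdown-agents) at the end so the future threads do not keep the JVM alive.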

hash-set

hash-set builds the set through a transient (backed by an ITransientMap) and then makes it persistent; the resulting PersistentHashSet is backed by a persistent hash map.

sorted-map

sorted-map is implemented using PersistentTreeMap

sorted-map-by, sorted-set-by: create sorted map/set using the provided comparator

some?, any?

(some? x) is the same as (not (nil? x))

(any? x) always returns true

Not to be confused with some, which takes a predicate and applies it to a collection. There is no any function in clojure.core; see some, not-any?, and every?.
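
A few REPL checks that illustrate the difference (any? assumes Clojure 1.9 or later):

```clojure
(some? false)            ;; => true, only nil gives false
(some? nil)              ;; => false
(any? nil)               ;; => true, for every argument

(some even? [1 2 3])     ;; => true, first truthy result of the predicate
(some #{:b} [:a :b :c])  ;; => :b, a set used as a predicate
(not-any? even? [1 3 5]) ;; => true
```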

gensym

gensym uses an AtomicInteger (through RT.nextID) to generate unique ids:

(defn gensym
  [prefix-string] (. clojure.lang.Symbol (intern (str prefix-string (str (. clojure.lang.RT (nextID)))))))

symbol

Clojure symbols are NOT singletons: each read creates a new Symbol object, hence (identical? 'a 'a) returns false even though (= 'a 'a) is true.

static public Symbol intern(String nsname){
	int i = nsname.indexOf('/');
	if(i == -1 || nsname.equals("/"))
		return new Symbol(null, nsname);
	else
		return new Symbol(nsname.substring(0, i), nsname.substring(i + 1));
}

keyword

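Clojure keywords are interned in a ConcurrentHashMap.

The table holds each keyword through a WeakReference (created at 2) that is registered with the ReferenceQueue created at 1, so the table acts as a cache: once all non-weak references to a keyword are released, the keyword can be garbage collected and its stale entry is cleared later by Util.clearCache. See Using Java's ReferenceQueue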

private static ConcurrentHashMap<Symbol, Reference<Keyword>> table = new ConcurrentHashMap();
static final ReferenceQueue rq = new ReferenceQueue(); // 1
public static Keyword intern(Symbol sym){
	Keyword k = null;
	Reference<Keyword> existingRef = table.get(sym);
	if(existingRef == null)
		{
		Util.clearCache(rq, table);
		if(sym.meta() != null)
			sym = (Symbol) sym.withMeta(null);
		k = new Keyword(sym);
		existingRef = table.putIfAbsent(sym, new WeakReference<Keyword>(k, rq)); // 2
		}
	if(existingRef == null)
		return k;
	Keyword existingk = existingRef.get();
	if(existingk != null)
		return existingk;
	//entry died in the interim, do over
	table.remove(sym, existingRef);
	return intern(sym);
}
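
The practical difference between symbols and keywords shows up with identical?. A short check:

```clojure
;; symbols: equal by value, but each read creates a new object
(= 'a 'a)                      ;; => true
(identical? 'a 'a)             ;; => false

;; keywords: interned, so equal keywords are the same object
(identical? :a :a)             ;; => true
(identical? (keyword "a") :a)  ;; => true
```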

find-keyword

(find-keyword "my-key") returns the keyword :my-key if it has already been interned and nil otherwise. Unlike keyword, it never interns a new keyword.

with-meta, vary-meta

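Both return a copy of an object with modified metadata: with-meta replaces the metadata with the given map, while vary-meta applies a function to the existing metadata.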
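
A short sketch (the metadata keys :source and :checked are arbitrary examples). Metadata does not take part in equality:

```clojure
(def v (with-meta [1 2 3] {:source :db}))

(meta v)                                 ;; => {:source :db}
(meta (vary-meta v assoc :checked true)) ;; => {:source :db, :checked true}
(meta v)                                 ;; => {:source :db}, v itself is unchanged
(= v [1 2 3])                            ;; => true
```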

defonce

defonce is not thread-safe, so make-body in (defonce av (make-body)) may execute more than once if several threads evaluate the form concurrently. This is rarely a concern in production but might affect development.

delay

A delayed object is implemented as follows:

public class Delay implements IDeref, IPending{
Object val;
Throwable exception;
IFn fn;

public Delay(IFn fn){
	this.fn = fn;
	this.val = null;
        this.exception = null;
}

static public Object force(Object x) {
	return (x instanceof Delay) ?
	       ((Delay) x).deref()
	       : x;
}

synchronized public Object deref() {
	if(fn != null)
		{
		try
			{
			val = fn.invoke();
			}
		catch(Throwable t)
			{
			exception = t;
			}
		fn = null;
		}
	if(exception != null)
		throw Util.sneakyThrow(exception);
	return val;
}

synchronized public boolean isRealized(){
	return fn == null;
}
}
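
From the Clojure side, this means the body of a delay runs at most once and its result is cached. A minimal sketch (the atom runs only counts evaluations):

```clojure
(def runs (atom 0))
(def d (delay (swap! runs inc) :done))

(realized? d)  ;; => false, nothing has run yet
@d             ;; => :done, runs the body
@d             ;; => :done, served from the cached value
@runs          ;; => 1
(force 42)     ;; => 42, force returns non-delays unchanged
```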

arbitrary precision

+', -', *', inc', dec' etc. promote to BigInt on long overflow instead of throwing an ArithmeticException.

static public Number multiplyP(long x, long y){
    if (x == Long.MIN_VALUE && y < 0)
        return multiplyP((Number)x,(Number)y);
    long ret = x * y;
    if (y != 0 && ret/y != x)
        return multiplyP((Number)x,(Number)y);
    return num(ret);
}

final public Number multiplyP(Number x, Number y){
    long lx = x.longValue(), ly = y.longValue();
    if (lx == Long.MIN_VALUE && ly < 0)
        return BIGINT_OPS.multiply(x, y);
    long ret = lx * ly;
    if (ly != 0 && ret/ly != lx)
        return BIGINT_OPS.multiply(x, y);
    return num(ret);
}

Note that doubles are never promoted: (*' Double/MAX_VALUE 2.0) gives you infinity.

public static final double POSITIVE_INFINITY = 1.0 / 0.0;
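
A quick comparison of the checked and the promoting operators:

```clojure
;; the plain operators throw on long overflow
(try (* Long/MAX_VALUE 2)
     (catch ArithmeticException _ :overflow)) ;; => :overflow

;; the quoted variants promote to clojure.lang.BigInt
(*' Long/MAX_VALUE 2)   ;; => 18446744073709551614N
(inc' Long/MAX_VALUE)   ;; => 9223372036854775808N
(*' 2 3)                ;; => 6, stays a long when the result fits

;; doubles overflow to infinity instead
(Double/isInfinite (*' Double/MAX_VALUE 2.0)) ;; => true
```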

reverse, rseq

reverse creates a new list by adding the items of the original seq one by one, which takes linear time:

(defn reverse
  [coll]
  (reduce1 conj () coll))

rseq returns a reversed seq of a vector or sorted map in constant time:

(defn rseq
  [^clojure.lang.Reversible rev]
    (. rev (rseq)))

Use rseq to swap the keys and values of a map:

(into {} (map (comp vec rseq) m))

Note that accessing an rseq (APersistentVector$RSeq) by index is inefficient, since it traverses each element up to the specified index.
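
Some example calls, as a sketch:

```clojure
(reverse [1 2 3])                  ;; => (3 2 1), linear time, works on any seq
(rseq [1 2 3])                     ;; => (3 2 1), constant time, vectors and sorted colls only
(rseq (sorted-map :a 1 :b 2))      ;; => ([:b 2] [:a 1])

;; a map entry is itself reversible, so rseq turns [k v] into (v k)
(into {} (map (comp vec rseq) {:a 1 :b 2})) ;; => {1 :a, 2 :b}
```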

even? and odd?

(zero? (bit-and (clojure.lang.RT/uncheckedLongCast n) 1))

complement

(complement compare) will always return false, because compare returns a number and every number, including 0, is truthy.

(defn complement
  [f] 
  (fn 
    ([] (not (f)))
    ([x] (not (f x)))
    ([x y] (not (f x y)))
    ([x y & zs] (not (apply f x y zs)))))

peek and pop

peek and pop behave differently for lists and vectors.

(defn peek
  "For a list or queue, same as first, for a vector, same as, but much
  more efficient than, last. If the collection is empty, returns nil."
  {:added "1.0"
   :static true}
  [coll] (. clojure.lang.RT (peek coll)))

(defn pop
  "For a list or queue, returns a new list/queue without the first
  item, for a vector, returns a new vector without the last item. If
  the collection is empty, throws an exception.  Note - not the same
  as next/butlast."
  {:added "1.0"
   :static true}
  [coll] (. clojure.lang.RT (pop coll)))
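
The difference in a few calls:

```clojure
;; lists and queues work at the front
(peek '(1 2 3))  ;; => 1
(pop  '(1 2 3))  ;; => (2 3)
(peek (conj clojure.lang.PersistentQueue/EMPTY 1 2 3)) ;; => 1

;; vectors work at the end
(peek [1 2 3])   ;; => 3
(pop  [1 2 3])   ;; => [1 2]

;; empty collections: peek returns nil, pop throws
(peek [])        ;; => nil
(try (pop []) (catch IllegalStateException _ :empty)) ;; => :empty
```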

macro: recursive calls

(defmacro ..
  ([x form] `(. ~x ~form))
  ([x form & more] `(.. (. ~x ~form) ~@more)))
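
Because the second arity expands into another .. form, macroexpand keeps expanding until only nested . forms remain. A small check, using String methods purely as an example:

```clojure
(.. "hello" (toUpperCase) (substring 1))
;; => "ELLO"

(macroexpand '(.. "hello" (toUpperCase) (substring 1)))
;; => (. (. "hello" (toUpperCase)) (substring 1))
```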

methods

Given a multimethod, methods returns a map of dispatch values -> dispatch fns

if-some, when-some

Unlike if-let, if-some takes the else branch only when the bound value is nil, so false still takes the then branch:

(defmacro if-some
  ([bindings then]
   `(if-some ~bindings ~then nil))
  ([bindings then else & oldform]
   (assert-args
     (vector? bindings) "a vector for its binding"
     (nil? oldform) "1 or 2 forms after binding vector"
     (= 2 (count bindings)) "exactly 2 forms in binding vector")
   (let [form (bindings 0) tst (bindings 1)]
     `(let [temp# ~tst]
        (if (nil? temp#)
          ~else
          (let [~form temp#]
            ~then))))))
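
A side-by-side comparison with if-let:

```clojure
(if-let  [x false] :then :else)  ;; => :else, false is falsy
(if-some [x false] :then :else)  ;; => :then, false is not nil
(if-some [x nil]   :then :else)  ;; => :else

(when-some [x 0] (inc x))        ;; => 1
(when-some [x nil] (inc x))      ;; => nil, body is skipped
```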

agent

Agent thread pools: send-off uses a cached thread pool that grows on demand, while send uses a fixed thread pool sized to the number of processors plus 2.

volatile public static ExecutorService pooledExecutor =
  Executors.newFixedThreadPool(2 + Runtime.getRuntime().availableProcessors(),
                                 createThreadFactory("clojure-agent-send-pool-%d", sendThreadPoolCounter));

volatile public static ExecutorService soloExecutor =
 Executors.newCachedThreadPool(
    createThreadFactory("clojure-agent-send-off-pool-%d", sendOffThreadPoolCounter));
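
A minimal usage sketch (the agent counter is only an example). send suits CPU-bound actions, send-off suits actions that may block; actions sent to the same agent run one at a time in order.

```clojure
(def counter (agent 0))

(send counter + 5)                        ; runs on the fixed pool
(send-off counter (fn [n]
                    (Thread/sleep 10)     ; blocking work belongs on send-off
                    (inc n)))

(await counter)   ; block until both actions have run
@counter          ;; => 6
```

In a script, call (shutdown-agents) at the end so the pool threads do not keep the JVM alive.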

volatile!

Changes are guaranteed to propagate to other threads. However, a volatile does not guarantee atomicity of read-then-write operations: vswap! is not atomic.

final public class Volatile implements IDeref {

  volatile Object val;

  public Volatile(Object val){
    this.val = val;
  }

  public Object deref() {
    return val;
  }

  public Object reset(Object newval) {
    return this.val = newval;
  }

}

Use a volatile when mutable state is required but only one thread at a time touches it, as in stateful transducers:

(defn distinct
  "Returns a lazy sequence of the elements of coll with duplicates removed.
  Returns a stateful transducer when no collection is provided."
  {:added "1.0"
   :static true}
  ([]
   (fn [rf]
     (let [seen (volatile! #{})]
       (fn
         ([] (rf))
         ([result] (rf result))
         ([result input]
          (if (contains? @seen input)
            result
            (do (vswap! seen conj input)
                (rf result input)))))))))
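
Basic usage, as a sketch:

```clojure
(def v (volatile! 0))

(volatile? v)     ;; => true
(vreset! v 10)    ;; => 10
(vswap! v inc)    ;; => 11, a plain read followed by a write, not atomic
@v                ;; => 11

;; the transducer arity of distinct keeps its `seen` set in a volatile
(into [] (distinct) [1 2 1 3 2]) ;; => [1 2 3]
```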

comparator, compare, sort

  • comparator converts a predicate function to a Java comparator.
  • compare works like obj.compareTo in Java, except that it also handles nil. It returns a number, just like a Java Comparator/compare.
  • sort takes an optional comparator and sorts a collection.

Note: comparator is rarely needed, because Clojure functions already implement java.util.Comparator. You can use sort like this: (sort > (range -3 3))

(defn comparator
  "Returns an implementation of java.util.Comparator based upon pred."
  {:added "1.0"
   :static true}
  [pred]
    (fn [x y]
      (cond (pred x y) -1 (pred y x) 1 :else 0)))

(defn compare
  "Comparator. Returns a negative number, zero, or a positive number
  when x is logically 'less than', 'equal to', or 'greater than'
  y. Same as Java x.compareTo(y) except it also works for nil, and
  compares numbers and collections in a type-independent manner. x
  must implement Comparable"
  {
   :inline (fn [x y] `(. clojure.lang.Util compare ~x ~y))
   :added "1.0"}
  [x y] (. clojure.lang.Util (compare x y)))
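
A few example calls showing the three functions together:

```clojure
(sort > [3 1 2])               ;; => (3 2 1), a predicate works directly
(sort (comparator >) [3 1 2])  ;; => (3 2 1), same result with an explicit comparator

(compare 1 1)    ;; => 0
(compare 2 1)    ;; => 1
(compare nil 1)  ;; => -1, nil sorts before everything else

;; a custom comparator built on compare
(sort #(compare (count %1) (count %2)) ["ccc" "a" "bb"]) ;; => ("a" "bb" "ccc")
```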

doto

doto is defined as a macro:

(defmacro doto
  "Evaluates x then calls all of the methods and functions with the
  value of x supplied at the front of the given arguments.  The forms
  are evaluated in order.  Returns x.

  (doto (new java.util.HashMap) (.put \"a\" 1) (.put \"b\" 2))"
  {:added "1.0"}
  [x & forms]
    (let [gx (gensym)]
      `(let [~gx ~x]
         ~@(map (fn [f]
                  (if (seq? f)
                    `(~(first f) ~gx ~@(next f))
                    `(~f ~gx)))
                forms)
         ~gx)))

memfn

memfn converts a Java method to a function. Additional argument names must be supplied to match the method's argument list.

(defmacro memfn
  [name & args]
  (let [t (with-meta (gensym "target")
            (meta name))]
    `(fn [~t ~@args]
       (. ~t (~name ~@args)))))

;; usage
(def starts-with (memfn startsWith prefix))
(starts-with "abcd" "a")

array operations

Array manipulations: aclone, alength, aget, aset, amap, and areduce. There is also System/arraycopy for copying ranges between arrays.

(def ar (int-array (range 1e5)))

;; Returns a new array
;; Note the performance difference with or without type hint
(time (amap ar idx ret (inc (aget ar idx))))
;; "Elapsed time: 2256.856522 msecs"

(time (amap ^ints ar idx ret (inc (aget ^ints ar idx))))
;; "Elapsed time: 10.307571 msecs"

;; copy 
(def src (byte-array (range 200)))
(def target (byte-array 4))
(System/arraycopy src 0 target 1 3)
(map int target)
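
A smaller sketch with results that are easy to verify by hand (xs and dst are just example names):

```clojure
(let [^ints xs  (int-array [1 2 3 4])
      ^ints dst (int-array 6)]
  ;; copy 4 elements from xs (starting at 0) into dst (starting at 1)
  (System/arraycopy xs 0 dst 1 4)
  {:sum     (areduce xs i acc 0 (+ acc (aget xs i)))
   :doubled (vec (amap xs i ret (* 2 (aget xs i))))
   :copied  (vec dst)
   :length  (alength xs)})
;; => {:sum 10, :doubled [2 4 6 8], :copied [0 1 2 3 4 0], :length 4}
```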

the-ns, ns-map

  • the-ns converts a symbol to a namespace and throws an exception if the namespace is not found.
    (the-ns 'wine.core)
    
  • ns-map returns a map of all the mappings for the namespace
    (ns-map 'wine.core)
    
    (defn the-ns
      "If passed a namespace, returns it. Else, when passed a symbol,
      returns the namespace named by it, throwing an exception if not
      found."
    ;; meta data
      {:added "1.0"
       :static true}
    ;; type hint return value
      ^clojure.lang.Namespace [x]
      (if (instance? clojure.lang.Namespace x)
        x
        (or (find-ns x) (throw (Exception. (str "No namespace: " x " found"))))))
    

boolean

(boolean x) returns false if x is false or nil, returns true when x is anything else.

read and evaluate

Load file/reader and evaluate sequentially.

(defn load-reader
  "Sequentially read and evaluate the set of forms contained in the
  stream/file"
  {:added "1.0"
   :static true}
  [rdr] (. clojure.lang.Compiler (load rdr)))

(defn load-string
  "Sequentially read and evaluate the set of forms contained in the
  string"
  {:added "1.0"
   :static true}
  [s]
  (let [rdr (-> (java.io.StringReader. s)
                (clojure.lang.LineNumberingPushbackReader.))]
    (load-reader rdr)))
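
For comparison, read-string only reads one form without evaluating it, while load-string evaluates every form and returns the last result:

```clojure
(read-string "(+ 1 2) (* 6 7)")        ;; => (+ 1 2), first form only, not evaluated
(eval (read-string "(+ 1 2)"))         ;; => 3
(load-string "(+ 1 2) (* 6 7)")        ;; => 42, both forms evaluated in order
```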

ex-info, ex-data

ex-info creates an ExceptionInfo instance, which carries a map of data in addition to the message. ex-data extracts that data from any exception that implements IExceptionInfo.

ExceptionInfo implements the IExceptionInfo interface and is a subclass of RuntimeException, so it is an unchecked exception.

(defn ex-info
  "Create an instance of ExceptionInfo, a RuntimeException subclass
   that carries a map of additional data."
  {:added "1.4"}
  ([msg map]
     (ExceptionInfo. msg map))
  ([msg map cause]
     (ExceptionInfo. msg map cause)))
     
;; usage
(try 
 (throw (ex-info "my error" {:data "my-data"}))
 (catch Exception e
        (ex-data e)))
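
The type relationships and the three-argument arity can be checked directly (the message and data map are arbitrary examples):

```clojure
(def e (ex-info "lookup failed" {:id 42} (RuntimeException. "root cause")))

(instance? RuntimeException e)             ;; => true
(instance? clojure.lang.IExceptionInfo e)  ;; => true
(ex-data e)                                ;; => {:id 42}
(.getMessage e)                            ;; => "lookup failed"
(.getMessage (.getCause e))                ;; => "root cause"
(ex-data (Exception. "plain"))             ;; => nil, not an IExceptionInfo
```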

tree-seq

tree-seq performs a depth-first walk of a tree data structure and returns a lazy seq of its nodes. It takes a branch? predicate, a children function, and the root.

The following function prints all clojure source files under the provided directory.

(defn list-clj-files [^java.io.File dir]
  (dorun
   (tree-seq 
    (fn [f] (.isDirectory f))
    (fn [d] (seq (.listFiles
                  d
                  (proxy [java.io.FileFilter] []
                         (accept [f]
                                 (boolean ;; accept must not return nil
                                  (or (.isDirectory f)
                                      (when (.endsWith (.getName f) ".clj")
                                        (prn (.getAbsolutePath f))))))))))
    dir)))
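
The traversal order is easier to see on plain nested vectors. In this sketch sequential? plays the role of branch? and seq returns the children:

```clojure
(tree-seq sequential? seq [1 [2 [3]] 4])
;; => ([1 [2 [3]] 4] 1 [2 [3]] 2 [3] 3 4)

;; keep only the leaves
(filter number? (tree-seq sequential? seq [1 [2 [3]] 4]))
;; => (1 2 3 4)
```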

max-key, min-key

max-key/min-key finds an entry in a seq for which a function has the maximum/minimum value.

More specifically, (max-key k & args) returns an x in args for which (k x) is the greatest. (k x) must return a number.

Examples: instead of using

(first (sort-by count ["a" "bb" "ccc" "e"]))

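use (this avoids the sort; note that when several elements tie for the minimum, the two forms may return different elements)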

(apply min-key count ["a" "bb" "ccc" "e"])

Find the key with the highest value:

(key (apply max-key val {:a 3 :b 7 :c 9}))

defmethod, remove-all-methods, remove-method, methods

  • methods returns a map from each dispatch value to its method function.
  • Use remove-method to remove the method for a single dispatch value, and remove-all-methods to remove all of the methods of a multimethod.
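
A small sketch tying these together (the multimethod area and its shapes are made up for illustration):

```clojure
(defmulti area :shape)
(defmethod area :circle [{:keys [r]}]    (* Math/PI r r))
(defmethod area :square [{:keys [side]}] (* side side))

(area {:shape :square :side 3})   ;; => 9
(set (keys (methods area)))       ;; => #{:circle :square}

(remove-method area :circle)
(keys (methods area))             ;; => (:square)

(remove-all-methods area)
(empty? (methods area))           ;; => true
```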

empty, empty?

  • empty returns an empty collection of the same category as the input collection.
  • empty? returns true if a collection is empty; it is equivalent to (not (seq coll)).

compare-and-set!, set-validator!, get-validator

compare-and-set! atomically sets an atom to a new value if and only if its current value is identical to the given old value, and returns true if the set happened.

When the validator function passed to set-validator! rejects a value by throwing, throw a RuntimeException with a meaningful error message (ex-info works well).

(def af (atom 0))
(set-validator! 
 af
 #(if (neg? %) 
      (throw (ex-info  "must not be negative" {:val %}))
    true))
(swap! af - 5)
;; ExceptionInfo must not be negative  clojure.core/ex-info (core.clj:4617)
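
A sketch of compare-and-set! and of a validator that returns false instead of throwing (state and n are example atoms). Keywords are used for the state because the comparison is by identity:

```clojure
(def state (atom :idle))

(compare-and-set! state :idle :running)  ;; => true, value was :idle
(compare-and-set! state :idle :done)     ;; => false, value is now :running
@state                                   ;; => :running

(def n (atom 0))
(set-validator! n (complement neg?))     ; returning false also rejects the value
(fn? (get-validator n))                  ;; => true

(try (swap! n - 5)
     (catch IllegalStateException _ :rejected)) ;; => :rejected
@n                                       ;; => 0, the atom is unchanged
```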

var-get, var-set

var-get returns the value in the var object; it is equivalent to dereferencing the var with @.

(= (var-get #'map) map @#'map)
;; true
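
var-set only works on a var that has a thread-local binding, for example inside binding or with-local-vars. A sketch (*level* is an example var):

```clojure
(def ^:dynamic *level* 1)

(binding [*level* 2]
  (var-set #'*level* 3)     ; changes the thread-local binding only
  (var-get #'*level*))      ;; => 3

*level*                     ;; => 1, the root value is untouched

(with-local-vars [n 0]
  (var-set n (inc (var-get n)))
  @n)                       ;; => 1
```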

with-in-str, print-str, println-str, prn-str

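with-in-str binds *in* to a StringReader over the given string, so the body reads from the string instead of from user input (prompt below stands for any function that reads from *in*):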

(with-in-str "34" (prompt "How old are you?"))

print-str prints its arguments to a string and returns it; println-str and prn-str do the same for println and prn.
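
Self-contained examples that use only core functions:

```clojure
(with-in-str "34" (read-line))        ;; => "34"
(with-in-str "1 2" [(read) (read)])   ;; => [1 2]

(print-str "a" 1)                     ;; => "a 1", human-readable
(pr-str "a" 1)                        ;; => "\"a\" 1", reader-readable
;; println-str and prn-str additionally append a newline
```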

dosync

The body of dosync must not have side effects, because the transaction may be retried and the body may run more than once.

(defmacro dosync
  [& exprs]
  `(sync nil ~@exprs))

(defmacro sync
  [flags-ignored-for-now & body]
  `(. clojure.lang.LockingTransaction
      (runInTransaction (fn [] ~@body))))

runInTransaction, which dosync ultimately calls, is implemented as follows:

final static ThreadLocal<LockingTransaction> transaction = new ThreadLocal<LockingTransaction>();
static public Object runInTransaction(Callable fn) throws Exception{
	LockingTransaction t = transaction.get();
	Object ret;
	if(t == null) {
		transaction.set(t = new LockingTransaction());
		try {
			ret = t.run(fn);
		} finally {
			transaction.remove();
		}
	} else {
		if(t.info != null) {
			ret = fn.call();
		} else {
			ret = t.run(fn);
		}
	}

	return ret;
}
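
A minimal usage sketch with two refs (the account names and transfer! are illustrative). Both alters commit together or not at all:

```clojure
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from checking to savings in one transaction."
  [amount]
  (dosync
    (alter checking - amount)
    (alter savings + amount)))

(transfer! 30)
[@checking @savings]  ;; => [70 30]
```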

subseq, rsubseq

subseq/rsubseq returns a (reversed) sub seq of a sorted collection, bounded by one or two tests.

The input to subseq must implement clojure.lang.Sorted, for example a sorted-set or sorted-map.

(subseq (sorted-set 1 2 3 4 5 6 7 8 9 0) > 2 < 9)
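
The results of a few calls, as a sketch:

```clojure
(def s (sorted-set 1 2 3 4 5 6 7 8 9 0))

(subseq s > 2 < 9)    ;; => (3 4 5 6 7 8)
(rsubseq s > 2 < 9)   ;; => (8 7 6 5 4 3)
(subseq s >= 8)       ;; => (8 9), a single bound is also allowed

(subseq (sorted-map :a 1 :b 2 :c 3) >= :b) ;; => ([:b 2] [:c 3])
```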

seque

seque creates a queued seq on top of another (presumably lazy) seq. It produces items in the background and can get up to n items ahead of the consumer.

bases, supers, parents, ancestors, descendants, derive, underive, isa?

(supers (class (hash-map)))
;; #{clojure.lang.IPersistentMap clojure.lang.AFn java.util.Map clojure.lang.IHashEq java.lang.Iterable java.lang.Runnable java.util.concurrent.Callable java.lang.Object clojure.lang.Counted clojure.lang.IMapIterable clojure.lang.IKVReduce clojure.lang.MapEquivalence clojure.lang.IPersistentCollection java.io.Serializable clojure.lang.ILookup clojure.lang.IEditableCollection clojure.lang.IFn clojure.lang.Seqable clojure.lang.Associative clojure.lang.IObj clojure.lang.APersistentMap clojure.lang.IMeta}

;; bases returns immediate superclasses and direct interfaces
(bases (class (hash-map)))
;; (clojure.lang.APersistentMap clojure.lang.IObj clojure.lang.IEditableCollection clojure.lang.IMapIterable clojure.lang.IKVReduce)

;; `parents` works like `bases` but also includes direct parents created by `make-hierarchy` 
(parents (class (hash-map)))
;; #{clojure.lang.IMapIterable clojure.lang.IKVReduce clojure.lang.IEditableCollection clojure.lang.IObj clojure.lang.APersistentMap}

;; `descendants` returns immediate and indirect children 

;; `derive` creates a parent/child relationship
(derive ::dog ::animal)
(isa? ::dog ::animal)
;; true

;; isa? returns true if the `child` class directly or indirectly extends/implements `parent`
(isa? (class (hash-map)) clojure.lang.IMeta)
;; true
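
The hierarchy functions also accept an explicit hierarchy instead of the global one. A sketch with made-up tags:

```clojure
(def h (-> (make-hierarchy)
           (derive ::dog ::mammal)
           (derive ::mammal ::animal)))

(isa? h ::dog ::animal)    ;; => true, indirect relationship
(parents h ::dog)          ;; the set containing only ::mammal
(ancestors h ::dog)        ;; the set of ::mammal and ::animal
(descendants h ::animal)   ;; the set of ::mammal and ::dog

;; underive returns a new hierarchy without the relationship
(isa? (underive h ::dog ::mammal) ::dog ::animal) ;; => false
```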

distinct?

distinct? returns true if no two of its arguments are equal (by =).

Not to be confused with distinct which returns a lazy sequence of the elements in a collection with duplicates removed.

(apply distinct? (cons 2 (range 3)))
;; false

fnil

Use fnil to wrap a function so that a nil argument is replaced by a default value before the function is called.

((fnil + 2) 3 4)
;; 7
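
The default only kicks in when the argument is nil. A common use is updating a map key that may be missing (the :hits key is just an example):

```clojure
((fnil + 2) nil 4)                       ;; => 6, nil is replaced by 2
((fnil + 2) 3 4)                         ;; => 7, default unused

(update {} :hits (fnil inc 0))           ;; => {:hits 1}
(update {:hits 1} :hits (fnil inc 0))    ;; => {:hits 2}
```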

filter, filterv, keep, keep-indexed

  • The pred argument of filter and filterv must be free of side effects.
  • keep and keep-indexed return the non-nil results of (f item) or (f index item). Note that false results are included, unlike with filter.
    (keep identity [1 'a "b" nil false []])
    ;; (1 a "b" false [])
    
    (filter identity [1 'a "b" nil false []])
    ;; (1 a "b" [])
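
keep-indexed passes the index as the first argument, and both functions return the results of f rather than the original items:

```clojure
(keep-indexed (fn [i x] (when (even? i) x)) [:a :b :c :d])
;; => (:a :c)

(keep #(when (pos? %) (* % %)) [-1 2 -3 4])
;; => (4 16)

(filterv even? (range 6))
;; => [0 2 4], an eager vector instead of a lazy seq
```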
    

future, future-call, future-cancel, future-cancelled?

future-cancel cancels the future if possible.

From Stack Overflow: with-timeout executes the body within a specified time limit and throws a TimeoutException if the limit is exceeded.

(defmacro with-timeout
  [msec & body]
  `(let [f# (future (do ~@body))
         v# (gensym)
         result# (deref f# ~msec v#)]
    (if (= v# result#)
      (do
        (future-cancel f#)
        (throw (java.util.concurrent.TimeoutException.)))
      result#)))
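
The building blocks of that macro can be tried separately. In this sketch the future sleeps far longer than the timeout, so deref returns the fallback value:

```clojure
(def f (future (Thread/sleep 10000) :done))

(deref f 50 :timed-out)  ;; => :timed-out, after waiting 50 ms
(future-cancel f)        ;; => true, the sleeping thread is interrupted
(future-cancelled? f)    ;; => true
(future-done? f)         ;; => true, a cancelled future counts as done
```

In a script, call (shutdown-agents) at the end so the future threads do not keep the JVM alive.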

pmap, pcalls, pvalues

pmap works like map but executes in parallel.

pcalls executes no-args fns in parallel and returns a lazy seq of the return values.

pmap, pcalls, and pvalues run their tasks through future, which uses the same unbounded cached thread pool as send-off. You cannot configure the degree of parallelism, and with many long-running or blocking tasks this can drain system resources or cause expensive context switching among threads. To control the parallelism with a custom thread pool, use the claypoole versions of these functions.

pcalls, pvalues source code:

(defn pcalls [& fns] (pmap #(%) fns))

(defmacro pvalues [& exprs]
  `(pcalls ~@(map #(list `fn [] %) exprs)))

Examples

(defn rn [] (rand-int 10))

;; generating random digits
(clojure.string/join (apply pcalls (repeat 4 rn)))
;; "7995"

(pvalues (rand) (rand) (rand-int 5))
;; (0.28873409146622275 0.10551504557045066 1)
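
A deterministic sketch (slow-square is a made-up function that simulates slow work). Results come back in input order even though the calls run in parallel:

```clojure
(defn slow-square [n]
  (Thread/sleep 50)   ; stand-in for real work
  (* n n))

(doall (pmap slow-square (range 8)))  ;; => (0 1 4 9 16 25 36 49)
(pcalls #(+ 1 2) #(* 3 4))            ;; => (3 12)
(pvalues (+ 1 2) (* 3 4))             ;; => (3 12)
```

In a script, call (shutdown-agents) at the end so the worker threads do not keep the JVM alive.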

clojure-version

*clojure-version* as a dynamic var is defined as follows

(let [properties (with-open [version-stream (.getResourceAsStream
                                             (clojure.lang.RT/baseLoader)
                                             "clojure/version.properties")]
                   (doto (new java.util.Properties)
                     (.load version-stream)))
      version-string (.getProperty properties "version")
      [_ major minor incremental qualifier snapshot]
      (re-matches
       #"(\d+)\.(\d+)\.(\d+)(?:-([a-zA-Z0-9_]+))?(?:-(SNAPSHOT))?"
       version-string)
      clojure-version {:major       (Integer/valueOf ^String major)
                       :minor       (Integer/valueOf ^String minor)
                       :incremental (Integer/valueOf ^String incremental)
                       :qualifier   (if (= qualifier "SNAPSHOT") nil qualifier)}]
  (def ^:dynamic *clojure-version*
    (if (.contains version-string "SNAPSHOT")
      (clojure.lang.RT/assoc clojure-version :interim true)
      clojure-version)))

promise, deliver

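A promise can be delivered at most once; later calls to deliver have no effect. Dereferencing an undelivered promise blocks until a value is delivered.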
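
A short sketch of that behavior:

```clojure
(def p (promise))

(realized? p)          ;; => false
(deref p 10 :pending)  ;; => :pending, deref with a timeout avoids blocking forever
(deliver p 1)
(deliver p 2)          ;; => nil, ignored because p is already delivered
@p                     ;; => 1
```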

group-by, partition-by, frequencies

(partition-by (partial > 5) (range 10))
(split-at 5 (range 10))
(split-with (partial > 5) (range 10))
;; each form splits the range into (0 1 2 3 4) and (5 6 7 8 9);
;; split-at and split-with return the two parts in a vector
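
For the other two functions in the heading, a sketch:

```clojure
(group-by even? (range 6))
;; => {true [0 2 4], false [1 3 5]}

(frequencies "hello")
;; => {\h 1, \e 1, \l 2, \o 1}

;; partition-by starts a new group each time the result of f changes
(partition-by even? [2 4 1 3 6])
;; => ((2 4) (1 3) (6))
```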

every-pred, some-fn

every-pred doc: Takes a set of predicates and returns a function f that returns true if all of its composing predicates return a logical true value against all of its arguments, else it returns false. Note that f is short-circuiting in that it will stop execution on the first argument that triggers a logical false result against the original predicates.

(def f (every-pred (complement nil?) integer? pos?))
(f 1 2 3 4 5)
;; true
(f 1 2 3 4 '() 5)
;; false

some-fn doc: Takes a set of predicates and returns a function f that returns the first logical true value returned by one of its composing predicates against any of its arguments, else it returns logical false. Note that f is short-circuiting in that it will stop execution on the first argument that triggers a logical true result against the original predicates.

To get the first non-nil value for keys :price, :bid, :ask

;; Instead of using 
(some
 identity
 ((juxt :price :bid :ask)
  {:bid 1.22 :ask nil :price 1.22}))

;; use 
((some-fn :price :bid :ask) {:bid 1.22 :ask nil :price 1.22})

random-sample

(random-sample prob coll) is equivalent to (filter (fn [_] (< (rand) prob)) coll): it lazily returns items from coll, keeping each item independently with probability prob.
