Clojure Journey V – Simplicity, Evaluation and Syntax

Most of the time that I heard someone talking about Lisp family languages, probably was something like “that ugly language, why people program with that?”, “why so many parentheses?”, unfortunately since I started to learn Clojure, I hear it twice more.

This post is the fifth of a long journey on learning Clojure, and the first to really talk about code, but it can be served to learn about any language of the LISP family, this post will talk about syntax of the language, and the most important, we’ll understand why LISP languages look like this, the philosophy behind.

When you learn every language, most of the books in the first chapters is like a cake recipe, a long and boring list of all statements of the language, control flow, loops, var declarations, and much more.

Back to the day that you started to learn programming, probably you started with simple math and strings:

1 + 1;
=> 2
"Hello World";
=> "Hello World"

After that you learned about variables:

var age = 23;

Some control flow:

if (10 > 5) {
  "Hello World"  
}

And the most strange thing in every language, the for loop:

for (var i = 0; i < 10; i++) {
  console.log(42)
}

Probably you’re questing yourself about what I’m talking about, but have you looked at this common languages statements and noticed that they didn’t follow any logic or consistency in the syntax? The traditional for statements is not simple, we’re just used to view it all day. These syntax is not simple, and that’s the point.

Simplicity

In LISP family languages all the code follows the same logic, and after you’re capable of understand one expression, you’ll be able to understand the entire language. It doesn’t have many special forms, or something different from the rest, and that’s why “simplicity”is one of the worlds in this post title! Let’s take a look at some Clojure code, starting from the classical:

=> (println "Hello World")
Hello World

Let’s see some arithmetic:

=> (+ 1 2 3 4)
10
=> (= 1 2 3)
false
=> (+ (* 2 2) (* 3 3))
13

Can you imagine how if looks?

=> (if (> 2 1) "Greater" "Smaller")
"Greater"

What all these expressions have in common? They follow the same pattern. Let’s take a look:

LISP family syntax

In the image, the expression xxx is read as a list, the first element can be called “function position”, and the rest as arguments. When evaluating it will always try to evaluate the first item as a thing to invoke (you can think like a function, but sometimes can be other things like macros), and the rest as arguments the invoked.

Positioning the “function” before its arguments it’s called prefix/polish notation, the expression (f x y) looks a lot like the traditional mathematics notation f(x, y). All lisp expressions use polish notation.

Evaluation

During the process of evaluating expressions, it will evaluate each argument of the expression and then invoke the function of the first position with the result of evaluating each argument, in other words, it will evaluate all expressions from the innermost to the outermost recursively, forming a evaluation tree. Let’s assume (+ (* 2 2) (* 3 3)) as our example:

Evaluation Tree

For those who already studied programming languages design probably realized that this evaluation tree looks a lot like an AST (Abstract Syntax Tree), and it’s kind off.

You may have noticed that the numbers in the bottom of the evaluation tree are evaluating themselves, and it leads us to another important topic.

Different from other languages, Clojure doesn’t have statements, it has only expressions that evaluates to a value, and its programs is composed of n expressions. Every form not handled specially by any macro or special forms is considered a expression by the compiler and it will be compiled and has it’s value returned.

Most literal Clojure forms evaluate to themselves, except symbols and list, that’s why numbers was evaluated to themselves in the evaluation tree image. Other common examples of forms that evaluates to themselves are, booleans, strings, characters and keywords. Symbols are resolved based on a set of rules, but in general, it will look for what was “binded” to the symbol.

=> "String"
"String"
=> 42
42
=> :keyword
:keyword

It will try to evaluate everything, but we have a way to delay this process and say “don’t evaluate it, just return it raw”, just put an apostrophe before your expression:

=> '+
+
=> '(+ 1 2)
(+ 1 2)
=> 'variable
variable

Syntax

If you read the text so far, probably you already know how to write an expression and how literals are evaluated, you also know how to delay a evaluation using the apostrophe, now we can talk about one of the most interesting thing in LISP family languages, the “code as data and data as code” foundation.

Languages that are from LISP family, have a data structure called list, lists are basically zero or more forms enclosed in parentheses (1 2 3 42) is a example of one list with integers 1, 2, 3 and 42, and now comes the plot… Other examples of list is (+ 1 2 3) or (println "Hello World").

OMG! It’s a LIST

All the programs are composed of lists, and LISP/Clojure is defined in terms of evaluate its own data, and not a stream of keywords like other languages. LISP stands for LIST-Processing, and it explain this name.

Languages that have its programs represented by its own data are called “homoiconic”, and LISP family languages are one of them, based on this the affirmation “code is data and data is code” emerges and it is commonly used when talking about Lisp/Clojure.

When writing Clojure probably you’ll use lists as a data structure to store data, but if you try to use it to store a list of integers, probably you’ll get error during the evaluation proccess:

=> (1 2 3 4)
Execution error (ClassCastException) at user/eval2018 
(REPL:1)
java.lang.Long cannot be cast to clojure.lang.IFn

If we take a look at the error, java.lang.Long cannot be cast to clojure.lang.IFn, this error is saying that a integer can’t be cast as function and this error occurs because we tried to evaluate a list without any function in the first position, by default when evaluating any list, the Clojure reader tries to find a thing to invoke in the first position, that’s why when evaluating a list just as data structure, clojurists use the apostrophe to tell Clojure to not try to evaluate it:

=> '(1 2 3 4)
(1 2 3 4)
=> '(+ 1 2 3 4)
(+ 1 2 3 4)

And that’s the way that you usually will see programmers using lists as data inside Clojure programs, in a future post we’ll talk about how to use it.

Let’s finish our study about syntax with an experiment, Clojure has a function in it’s core named eval, if we pass a valid list, it will evaluate it:

(eval '(+ 1 2 3))
=> 6

This is the core of the REPL, and with Clojure simplicity, we can write our own in a simple way:

(loop [] (println (eval (read))) (recur))

Combining a function to read user input read, one to eval this input eval, print the result of evaluation with println and finally loop infinitely doing it. Run it and you’ll have your own read-eval-print-loop (REPL).

Invoking REPL

Conclusions

This a long post but essential on our journey, we ha been introduced to Clojure syntax and the logic behinds it. We learned about the famous “code is data and data is code” jargon, beautiful behind lists and built our own REPL.

LISP/Clojure syntax is beautiful and simple, it can afraid people because it’s different from what we’re used to, but if we study we start to understand it fast!

Now that we know the syntax, we can take a look in data types on the next post, and after that advance to functions. Some topics are missing in this post like special forms, but in a future post we’ll talk about it.

Final thought

If you have any questions that I can help you with, please ask! Send an email (otaviopvaladares at gmail.com), pm me on my Twitter or comment on this post!

Useful Links

Evaluation, Clojure docs

The Reader, Clojure docs

Syntax, Clojure docs

Homoiconicity, Wikpedia

Polish Notation, Wikpedia

S-expressions, the Syntax of Lisp

Lecture 1A: Overview and Introduction to Lisp

Why code-as-data?

Hitchhiker’s Guide to Clojure