Faster GraalVM Clojure Compilation Times

<img alt="Crispin Wellington" class="blog-splash" src="https://epiccastle.io/blog/faster-graalvm-clojure-compilation-times/splash-20.jpg" style="width:50%;"><h3>Slash GraalVM Clojure Compilation Time With This One Weird Trick</h3><p>Are you building a clojure project and compiling it to a native binary with GraalVM? As the project has grown, has the GraalVM compilation time gotten longer and longer? Here is the story of how I discovered <a href="https://twitter.com/borkdude">@borkdude</a>'s trick to getting GraalVM native image times in under 5 minutes.</p><h4>The Story</h4><p>As I've been building <a href="https://epiccastle.io/spire/">spire</a>, compilation times have gradually been increasing. Then in order to bring a bunch of increased functionality I added a large number of Java class bindings to my sci environment. I check the code in and go to bed. When I wake I discover my <a href="https://circleci.com/gh/epiccastle/spire/1055">Circle CI builds</a> have been killed in the night for taking too long. The build was underway for an hour when the CI system killed the build!</p><p>So I try compiling on my local linux workstation to see what is going on. After an eternity the build crashes with:</p><pre><code class="shell">[build/spire:2202] analysis: 12,784,610.52 ms, 6.22 GB Fatal error:java.lang.OutOfMemoryError: GC overhead limit exceeded Error: Image build request failed with exit status 1 com.oracle.svm.driver.NativeImage$NativeImageError: Image build request failed with exit status 1 at com.oracle.svm.driver.NativeImage.showError(NativeImage.java:1527) at com.oracle.svm.driver.NativeImage.build(NativeImage.java:1289) at com.oracle.svm.driver.NativeImage.performBuild(NativeImage.java:1250) at com.oracle.svm.driver.NativeImage.main(NativeImage.java:1209) at com.oracle.svm.driver.NativeImage$JDK9Plus.main(NativeImage.java:1707) </code></pre><p><strong>12,784,610ms</strong> spent in analysis. That is over <strong>three and a half hours</strong>!</p><p>As I just added a bunch of new class bindings and reflection config for them all, I assume that must be the culprit. I remove 75% of them. And build again. I time this run. It succeeds, but it isn't pretty.</p><pre><code class="shell">[build/spire:11341] classlist: 5,953.42 ms, 0.94 GB [build/spire:11341] (cap): 800.78 ms, 0.94 GB [build/spire:11341] setup: 2,536.63 ms, 0.94 GB [build/spire:11341] (typeflow): 1,544,374.55 ms, 6.51 GB [build/spire:11341] (objects): 5,164,930.94 ms, 6.51 GB [build/spire:11341] (features): 4,320.62 ms, 6.51 GB [build/spire:11341] analysis: 6,724,448.94 ms, 6.51 GB [build/spire:11341] (clinit): 1,898.20 ms, 6.51 GB [build/spire:11341] universe: 47,567.23 ms, 6.51 GB [build/spire:11341] (parse): 12,451.68 ms, 4.17 GB [build/spire:11341] (inline): 9,098.24 ms, 3.94 GB [build/spire:11341] (compile): 75,592.43 ms, 3.59 GB [build/spire:11341] compile: 102,870.54 ms, 3.59 GB [build/spire:11341] image: 14,383.80 ms, 3.61 GB [build/spire:11341] write: 1,725.39 ms, 3.61 GB [build/spire:11341] [total]: 6,899,951.34 ms, 3.61 GB cp build/spire spire real 115m59.386s user 839m24.587s sys 1m56.493s </code></pre><p>Wow! Analysis is now down to only 1 hour and 50 minutes! Good times!</p><p>So this leads me to wonder how Michiel (<a href="https://twitter.com/borkdude">@borkdude</a>) is building his babashka images on Circle CI without hitting the wall. I go over to have a look at his <a href="https://circleci.com/gh/borkdude/babashka/9392">Circle CI builds</a>. They are building his images in <strong>less that 5 minutes</strong>.... whaaaaaaat?</p><h4>That One Weird Trick</h4><p>This leads me to a bunch of comparisons between the compilation and building of babashka with that of spire. I rule out reflection as the source of the problem. I rule out GraalVM options. And then I discover <a href="https://github.com/borkdude/babashka/blob/c3f9480efe08827dfa4ac0fb21f7376d80287ce6/project.clj#L53">the magical incantation</a>.</p><p>I had this setting on my GraalVM build, but I did not have it on when clojure was doing AOT compilation for the uberjar. Before we turn it on lets look into what it does.</p><h4>Direct Linking</h4><p>Clojure's direct linking can be activated by passing <code>-Dclojure.compiler.direct-linking=true</code> to the compiler. This feature is <a href="https://clojure.org/reference/compilation#directlinking">documented here</a>. From this discussion we read:</p><blockquote><p> &quot;Normally, invoking a function will cause a var to be dereferenced to find the function instance implementing it, then invoking that function... <em>Direct linking</em> can be used to replace this indirection with a direct static invocation of the function instead. This will result in faster var invocation. Additionally, the compiler can remove unused vars from class initialization and direct linking will make many more vars unused. Typically this results in smaller class sizes and faster startup times.&quot; </p></blockquote><p>And faster Graal compilation times to boot! This option will produce JVM byte code that will be much more like what a standard Java program will produce. Java is a statically typed language after all, and Java programs are not dereferencing vars every time they are invoked. The Graal compiler, being built primarily to compile Java programmes, is obviously having a very hard time with the dynamic nature of clojure's compiled byte code.</p><p>But do we lose anything if we compile our code with direct linking? According to the docs:</p><blockquote><p> &quot;One consequence of direct linking is that var redefinitions will not be seen by code that has been compiled with direct linking (because direct linking avoids dereferencing the var). Vars marked as ^:dynamic will never be direct linked. If you wish to mark a var as supporting redefinition (but not dynamic), mark it with ^:redef to avoid direct linking.&quot; </p></blockquote><p>As Michiel pointed out to me, things like a general use of <code>with-redefs</code> won't work with direct linking. But if we do want to do something dynamic like <code>with-redefs</code> in our code, we can individually mark those vars with <code>^:redef</code> meta data to allow them to work. Also, things like <code>with-redefs</code> is more commonly used in writing tests, so we can keep the option off in our test code and save direct linking for our uberjar builds.</p><h4>The Fixed Build</h4><p>Now taking the original problematic build that crashed after three and a half hours, I switch that setting on in my uberjar compilation and rebuild. Here's the result:</p><pre><code class="shell">[build/spire:22871] classlist: 4,379.73 ms, 0.96 GB [build/spire:22871] (cap): 739.03 ms, 0.96 GB [build/spire:22871] setup: 2,268.45 ms, 0.96 GB [build/spire:22871] (typeflow): 50,731.79 ms, 5.92 GB [build/spire:22871] (objects): 116,840.36 ms, 5.92 GB [build/spire:22871] (features): 2,682.61 ms, 5.92 GB [build/spire:22871] analysis: 173,423.16 ms, 5.92 GB [build/spire:22871] (clinit): 1,193.08 ms, 5.92 GB [build/spire:22871] universe: 3,049.85 ms, 5.92 GB [build/spire:22871] (parse): 7,224.44 ms, 5.79 GB [build/spire:22871] (inline): 13,408.30 ms, 4.19 GB [build/spire:22871] (compile): 77,773.85 ms, 4.16 GB [build/spire:22871] compile: 104,068.58 ms, 4.16 GB [build/spire:22871] image: 13,542.17 ms, 4.16 GB [build/spire:22871] write: 1,792.26 ms, 4.22 GB [build/spire:22871] [total]: 302,882.67 ms, 4.22 GB cp build/spire spire real 5m32.050s user 33m7.865s sys 0m12.756s </code></pre><p>Analysis is down from 3½ hours... to a little under <strong>3 minutes</strong>. And amazing improvement!</p><h4>Summary</h4><p>Add <code>-Dclojure.compiler.direct-linking=true</code> to your clojure compilation JVM options when building your uberjar and when compiling with GraalVM.</p><p>In lein:</p><pre><code class="clojure">(defproject foo &quot;0.1.0-SNAPSHOT&quot; ;; missing lines :profiles {:uberjar {:aot :all :jvm-opts [&quot;-Dclojure.compiler.direct-linking=true&quot;]} ) </code></pre><p>And in GraalVM native-image:</p><pre><code class="shell">graalvm-ce-java11-20.1.0-dev/bin/native-image \ ... -J-Dclojure.compiler.direct-linking=true \ ... </code></pre>