Sunday 14 February 2021

JITs, AOT, Caching

I've already written several posts about JIT compilers in the past [1], [2], [3] and [4]. Recently I've been reflecting on the behaviour of current JITs and runtimes.

We know that Java HotSpot and .NET Core (from 3.0 onwards) will JIT compile frequently used methods over and over, applying optimizations only knowable at runtime and replacing (hot swapping) the previous native code with a further optimized version. So, with all these precious profile-guided optimizations done at runtime, does the optimized native code get saved to disk to be reused in the next executions?
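
A quick way to watch this happening in HotSpot is to run a small hot loop with compilation logging enabled (the class below is just an illustration):

    // Run with: java -XX:+PrintCompilation HotLoop
    // HotSpot first interprets the code, then compiles sum() with the
    // baseline (C1) compiler, and finally hot swaps in C2-optimized
    // code once the invocation counters cross the compilation thresholds.
    public class HotLoop {
        static long sum(int n) {
            long total = 0;
            for (int i = 0; i < n; i++) {
                total += i;
            }
            return total;
        }

        public static void main(String[] args) {
            long acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += sum(1000); // hot call site: triggers C1, then C2
            }
            System.out.println(acc);
        }
    }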

For .NET, as far as I know, nothing like that is done, but in Java it seems to be done by means of the Shared Classes Cache (these caches are used for sharing between multiple JVMs and can be configured to be persisted to disk).
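
For instance, OpenJ9's shared classes cache is created and persisted with the -Xshareclasses option (the cache name and directory below are illustrative values), while HotSpot's closest analogue is (Application) Class-Data Sharing, which archives class metadata rather than JIT output:

    java -Xshareclasses:name=demo,cacheDir=/tmp/sc -jar app.jar

    java -XX:ArchiveClassesAtExit=app.jsa -jar app.jar    (JDK 13+, dump the archive on exit)
    java -XX:SharedArchiveFile=app.jsa -jar app.jar       (reuse the archive on later runs)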

It's interesting how this other question gets a rather different answer, one that is wrong in a way (as it would deny the existence of those shared classes caches) but that nonetheless makes a lot of sense:

The output of a JIT may be different for each run - it can optimize for the current load pattern. This sometimes allows it to optimize more aggressively than what would be possible for pre-compiled code that needs to be reusable.

If the load-pattern changes and the optimization is found to be sub-optimal or even adverse, a JIT can de-optimize and possibly attempt a different optimization (see also About the dynamic de-optimization of HotSpot)

Now, it might sometimes save some performance to keep these results of various compiled versions around and reuse them later, but it would also require a significant amount of book-keeping to be able to find out if a section of code has already been compiled with currently relevant optimizations.

I suppose that is just not considered worth the effort. It's a tradeoff between doing file-IO and usually fast compilation of small sections of code.
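
To make the book-keeping point concrete, here is a toy sketch (every name in it is hypothetical; no real JVM exposes anything like this) of a persistent code cache that can only hand back compiled code when the speculative assumptions it was compiled under still hold:

    import java.util.HashMap;
    import java.util.Map;

    // Toy sketch of the book-keeping the quoted answer alludes to: cached
    // native code is only reusable if the speculative assumptions it was
    // compiled under (observed receiver types, branch profiles...) still
    // hold. Every name here is hypothetical.
    class PersistentCodeCache {
        record Key(String methodSignature, long assumptionsHash) {}

        private final Map<Key, byte[]> nativeCode = new HashMap<>();

        // Returns cached code only when it was compiled under the same
        // assumptions as the ones observed in the current run.
        byte[] lookup(String method, long currentAssumptionsHash) {
            return nativeCode.get(new Key(method, currentAssumptionsHash));
        }

        void store(String method, long assumptionsHash, byte[] code) {
            nativeCode.put(new Key(method, assumptionsHash), code);
        }
    }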

Trying to apply the above wisdom to the existence of a Class Cache, one could think that what is saved in the Class Cache is maybe the first native code produced by the JIT, without any particular optimizations. The optimized code that is hot swapped in as the application runs, and that can contain optimizations based on the current data set, would not be saved to the class cache. What would also make sense to me is saving some profiling information to the Class Cache, so that in future executions it can be compared with the current profiling information, to check whether all executions work in similar environments.
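
As a sketch of what that comparison could look like (again purely hypothetical, not a real JVM API): a saved profile and the current one could be considered compatible when they mostly agree on which methods are hot:

    import java.util.Map;

    // Hypothetical sketch of the profile comparison suggested above:
    // decide whether a profile saved in the class cache still resembles
    // the behaviour of the current run.
    class ProfileComparator {
        // Fraction of the methods that were hot in the saved profile
        // that are also hot in the current run; 1.0 means identical hot sets.
        static double similarity(Map<String, Long> saved,
                                 Map<String, Long> current,
                                 long hotThreshold) {
            long sharedHot = saved.entrySet().stream()
                .filter(e -> e.getValue() >= hotThreshold)
                .filter(e -> current.getOrDefault(e.getKey(), 0L) >= hotThreshold)
                .count();
            long savedHot = saved.values().stream()
                .filter(v -> v >= hotThreshold).count();
            return savedHot == 0 ? 1.0 : (double) sharedHot / savedHot;
        }
    }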

If what I say above is correct, one could wonder why not use AOT compilation rather than the class cache (and then, at runtime, continue to apply additional JIT compilations of hot methods for extra optimizations). There are good discussions about the pros and cons of AOT vs JIT, like this one. One interesting point brought up there, and one that I had never considered before, is that with JIT you only compile the parts of your application that are really executed, while with AOT you compile the whole application. For a large application in which your use cases are restricted to just a small part of it, doing AOT would mean wasting a lot of processing time.
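
As a concrete data point, HotSpot shipped an experimental AOT compiler, jaotc, in JDK 9 (it was removed again in JDK 17). It compiled the classes you gave it in their entirety, whether or not they would ever be executed, which is exactly the waste described above:

    jaotc --output libMain.so Main.class
    java -XX:AOTLibrary=./libMain.so Main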

In the JavaScript world I know that V8 has two JITs, a non-optimizing one and an optimizing one. Hot code is JITted again by the optimizing JIT. I'm not sure whether this optimizing JIT is run additional times for "very hot" code, performing additional hot swappings; I guess so. Anyway, what I've learnt here is that V8 saves the compiled code to disk to speed up further executions. My understanding is that this is done only for the code generated by the non-optimizing JIT.

Before finishing this post I have to mention the very interesting approach used in modern Android (ART, with its hybrid mode since Android 7), where the interpreter, JIT and AOT compilation are all used. From the documentation:

ART uses ahead-of-time (AOT) compilation, and starting in Android 7.0 (Nougat or N), it uses a hybrid combination of AOT, just-in-time (JIT) compilation, and profile-guided compilation. The combination of all these compilation modes is configurable and will be discussed in this section. As an example, Pixel devices are configured with the following compilation flow:

An application is initially installed without any AOT compilation. The first few times the application runs, it will be interpreted, and methods frequently executed will be JIT compiled.
When the device is idle and charging, a compilation daemon runs to AOT-compile frequently used code based on a profile generated during the first runs.
The next restart of an application will use the profile-guided code and avoid doing JIT compilation at runtime for methods already compiled. Methods that get JIT-compiled during the new runs will be added to the profile, which will then be picked up by the compilation daemon.

The background AOT compilation of only frequently used code (rather than the whole application) is very interesting. I don't know whether the "profile generated" contains only information about which methods are frequently run, or whether it contains additional generic profiling information that the AOT compiler could use to optimize those methods. I guess Android's JIT can do hot swapping; if so, it would make sense for the AOT code to be hot swapped with code generated by the JIT based on the specifics of the current execution. I think this could be the case, because I've read somewhere about de-optimizations.
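
Incidentally, the work of that compilation daemon can also be triggered by hand with ART's package manager commands (the package name below is illustrative):

    adb shell cmd package compile -m speed-profile com.example.app    (AOT-compile using the collected profile)
    adb shell cmd package compile -m speed com.example.app            (AOT-compile everything, ignoring the profile)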
