Thursday, 21 December 2023

Kotlin and "Generators"

In my previos post I mentioned how Kotlin suspendable functions are the basis for asynchronous programming and also for generators-like functionality. In JavaScript, Python and C# we have different constructs-keywords for these 2 things, async-await on one side and yield on the other, so it's interesting to see how Kotlin manages both things through the same suspend construct/keyword.

The equivalent to a JavaScript or Python generator function (a funtion that produces elements as we iterate it) is created in Kotlin by means of the sequence() and yield() functions, like this:



val cities = sequence {
        yield("Paris")
        yield("Lyon")
        yield("Bordeaux")
    }
    
    for (city: String in cities) {
        println(city)
    }


sequence is a function (a Sequence Builder) that receives a "suspendable block" that has as receiver the abstract SequenceScope class, and yield() is a suspendable function of the SequenceScope class. Let me show their signatures:


fun  sequence(
    block: suspend SequenceScope.() -> Unit
): Sequence

abstract suspend fun yield(value: T)


The usage of suspendable functions for providing generators-like functionality seemed surprising to me. The "external interface" of a suspendable function is that you have a function that you invoke once, will get suspended 1 to n times in its suspension points, and finally will return one result to you. With generators what we have is a function that gets involved n times, producing one value each time. So, how can a suspendable function work for this second case? Well, as I explained in my previous post the internal details of a suspendable function is that the function is transformed by the compiler into a state machine, with each suspension point corresponding to one of those states. When the function invokes another suspendable function (suspension point) it returns to the caller a COROUTINE_SUSPENDED value, so that the caller knows that it has been suspended. When the async operation we're waiting is completed the coroutine will resume the suspend function (by invoking its corresponding Continuation). When the suspendable function reaches its end it will return its result, rather than a COROUTINE_SUSPENDED. For our generator like functionality we'll be calling an iterator.next() method. If this iterator invokes the suspendable function and the suspendable function (that in this case does not need of any asynchronous call to produce each of its iteration values) stores the value produced for this iteration "shomewhere" and returns COROUTINE_SUSPENDED, then the iterator can retrieve that value. Then the next call to iterator.next() will resume the suspendable function so that it produces the next value, and so on. So this should work!

I've been digging into Kotlin's source code (SequenceBuilder.kt) to verify that the strategy that I've just described is indeed what Kotlin is doing. The key element is the SequenceBuilderIterator class, that is no more than a SequenceScope, an Iterator and a Continuation. I'll just copy below its definition and main method signatures:


private class SequenceBuilderIterator : SequenceScope(), Iterator, Continuation {
    private var state = State_NotReady
    private var nextValue: T? = null
    private var nextIterator: Iterator? = null
    var nextStep: Continuation? = null

    override fun hasNext(): Boolean {
        ...
    }

    override fun next(): T {
        ...
    }

 
    override suspend fun yield(value: T) {
        ...
    }

    override suspend fun yieldAll(iterator: Iterator) {
        ...
    }

    // Completion continuation implementation
    override fun resumeWith(result: Result) {
       ...
    }

    override val context: CoroutineContext
        get() = EmptyCoroutineContext
}

So when we invoke the sequence() function passing to it a suspendable function/block that uses as receiver a SequenceScope:
- This sequence() creates a SequenceBuilderIterator that stores in its nextStep property a Continuation object for that suspend function (block), that in turn will store a SequenceScope (indeed that SequenceBuilderIterator instance) object in its state.
- It wraps the SequenceBuilderIterator in a Sequence object (which iterator() method will just return that SequenceBuilderIterator)

Then when we want to iterate that Sequence we invoke its iterator() method to obtain the SequenceBuilderIterator instance. Invoking .next() on this iterator will invoke the Continuation for the suspend function/block (referenced from the nextStep property. From this suspending function/block we invoke yield(), that is another suspend function that receives the SequenceScope (our almighty SequenceBuilderIterator) as receiver and that will set in that SequenceBuilderIterator the value to return (in the nextValue property). So the so versatile SequenceBuilderIterator object is what joins everything. It has a next() method for the iteration, that returns the value set in the nextValue property by its yield method, and has another property (nextStep) pointing to the continuation (that wraps the main suspend function).

Notice that the CoroutineContext associated to a SequenceBuilderIterator is an EmptyCorotuineContext. We saw in my previous posts that Continuations have a CoroutineContext that holds a Job and a Dispatcher, but for the Continuation for our "generator-like" functionality, that Job and Dispatcher are not needed at all.

Monday, 18 December 2023

Kotlin Asynchronous Programming

One of the first things I looked into when I started to fiddle with Kotlin was if it had the async-await "magic keywords" and it took me a while to understand that the approach to asynchrous programming in Kotlin is rather different from the strategies followed in JavaScript, Python or C#.

Asynchronous programming in JavaScript revolves around Promises, both if you use the async-await magic or if you go more old-school and explicitely chain Promises. A function marked as async in JavaScript implicitly returns a Promise, and when you await on that Promise the compiler "breaks" the function transforming it into a state-machine so that when that promise is resolved the function is invoked again in the point after the await (the corresponding state in the state-machine). Asynchrony in C# revolves around Tasks, and methods marked as async return a Task. In Python asynchronous programming revolves around awaitables (Tasks, Futures and Coroutine objects) and when a function marked as async (a coroutine function returning a coroutine object) performs a real asynchrouns operation (it gets "suspended"), it will yield a Future to the event loop (the python approach is more confusing to me, but I managed to more or less explain it last year)

In Kotlin the key element in asynchronous programming are suspendable functions (marked with the suspend keyword) and the Continuation Passing Style (CSP) that comes with them. Indeed suspendable functions are a broader concept, as they are also used for the equivalent to generators in JavaScript and Python. As you can guess from its name a suspendable function can be suspended at different points, suspension points, points from where the function is calling other suspendable functions. Once the function invoked from the suspension point completes the caller/calling function will be resumed from that suspension point. Each suspend function is indeed a state machine. Well, this feels like async functions in other languages, but the internals are a bit different. Suspendable functions do not return Promises/Futures... but receive something that at first (wrong) sight seemed like callbacks to me, Continuations. A Continuation contains the function to be invoked, its state (the values of its variables) and the point where the function has to be resumed (state machine), so it's more complex than a simple callback. From this excellent article:

Each suspendable lambda is compiled to a continuation class, with fields representing its local variables, and an integer field for current state in the state machine. Suspension point is where such lambda can suspend: either a suspending function call or suspendCoroutineUninterceptedOrReturn intrinsic call.

So for each suspendable function we have a Continuation class that contains the code of the function, its variables and the point where it has been suspended (the function is made up of a switch-case state machine, so that point is a label for that switch). When a suspendable function f1 invokes another suspendable function f2 it passes over to it its own continuation object. f2 will set in its own continuation a reference to the f1's continuation. This way, once the invoked suspend function f2 completes the continuation for the caller function f1 will be invoked. If in turn f2 invokes another suspendable function f3, this will be added to the chain of continuations: f3 -> f2 -> f1. You can read about this for example here or here

Notice that the way this chain of Continuations is created is like the opposite of what we do in JavaScript (or C#, or Python) when creating a chain of Promises, where when an async function returns a Promise the caller function attaches to the Promise the next function to execute (via .then()

Apart from suspend functions and continuations the other key aspect of asynchrony in Kotlin are coroutines. A chain of suspend function calls runs in a coroutine. A suspend function can only be invoked from another suspend function or from a Coroutine. From here.

A coroutine is an instance of a suspendable computation. It is conceptually similar to a thread, in the sense that it takes a block of code to run that works concurrently with the rest of the code. However, a coroutine is not bound to any particular thread. It may suspend its execution in one thread and resume in another one.

So a coroutine is a "suspendable execution flow", similar to threads or processes, but we don't create coroutines (instances of classes inheriting from the AbstractCoroutine class) directly, we create them through a CoroutineBuilder function. CoroutineBuilders like launch or runBlocking are extension methods of the CoroutineScope interface. The source code for coroutine builders is for sure pretty interesting. We can see there that a coroutine builder creates a CoroutineContext that passes over to the Coroutine constructor and then the builder invokes Coroutine.start().

CoroutineScopes and CoroutineContexts are the other essential pieces of Kotlin asynchrony. They are well explained here, where we can learn that a CoroutineScope holds a CorotuineContext, and that the CoroutineContext is a set of various elements. The main elements are the Job of the coroutine and its Dispatcher. The Dispatcher determines what thread/s will be used for executing the different suspend functions of the coroutine. A Continuation has a reference to its Context. From another article I noted down this: "When resuming, continuations don't invoke their functions directly, but they ask the dispatcher to do it. The Dispatcher is referenced from a CoroutineContext, that is referenced from the Continuation."

Another good article about coroutines is this one

There's an important usage difference that comes from the differences between Promises/Tasks/Futures... and Continuations. In JavaScript when you invoke an async function you have to explicitly state (by means of the await keyword) that you are "suspending" at that point until the Promise is resolved. In Kotlin there's no option to indicate if you want to suspend or not, any invocation to a suspend function means that the caller function will get suspended (if the called one really does something asynchronous). I guess the explanation for this is because you can only pass your continuation to the suspend function at the point of invokation, while with promises you can decide to await for the result just as you obtain the Promise, or await for it later (doing something else in between). The problem with this is that when your suspend function invokes several functions (some "normal" funtions, some suspend functions), you don't know at first sight if it's suspending at those invokations points or not, you have to check if the invoked function is a "normal" or suspend one. In one hand forcing you to use a sort of "await" keyword when invoking a suspend function would be redundant, but on the other hand would make code more clear to me.

What can be a bit confusing is that while Kotlin does not need async-await keywords for writing asynchronous code, it does indeed provide an async() function and an await() function, that you can need in certain cases. async() is a coroutine builder, that contrary to other builders like launch(), returns a Deferred (a sort of Promise/Future) that completes when the corresponding coroutine finishes. In turn, Deferred has an await suspendable method. We can see an example combining both functions here. Related to the previous paragraph, if you want to invoke a suspend function, but not get your current function suspended (as when in JavaScript we invoke an async function but do not await for it at that point) you can wrap the call to the suspend function in a block and pass it to async().

Reading a bit about asynchrony in Raku (that pretty interesting descendent of the honorable Perl language) I've seen that they have Promises also, but they keep() or break() rather than resolve() or reject(). It seemed odd at first sight, but indeed that's what we use in natural language, keeping or breaking promises, not resolving or rejecting.

Sunday, 3 December 2023

Truffle vs Invokedynamic

Reading (and writing) about Truffle and graalpy lately has had me thinking (again) about languages and interpreters. When I first heard about Truffle I wrote in the corresponding post:

Indeed, the Java guys spent much time and efforts to adapt the JVM, by means of the Da Vinci project and the new invokedynamic bytecode instruction, to strongly increase its performance with dynamic languages. After doing that, all of a sudden comes Truffle and you can write interpreters for languages that you compile to "self-modifying AST's" (not to Java bytecodes), and those implementations of JavaScript, Ruby... seem to be faster that the implementations that compiled those languages to Java bytecodes and that leveraged that recent and magical invokedynamic instruction!

After that initial astonishement I managed to more or less understand how Truffle, Partial Evaluation and so on work and make code fly. It seems like the future of dynamic languages in the JVM will be based on Truffle rather than invokedynamic. Oracle discontinued Nashorn (Javascript engine based on invokedynamic) while promoting its Truffle equivalent. For Python on the JVM, given that Jython remains stuck in Python 2.7, it's clear that graalpy has no other contender. However, for Ruby we have alive and well both a "traditional" version, JRuby that years ago was updated to leverage invokedynamic, and a Truffle version, TruffleRuby.

In case you wonder if TruffleRuby could be making some use of invokedynamic (in the code of the interpreter methods for the different AST nodes... the answer is NO

- How is this related to invokedynamic?
TruffleRuby doesn't use invokedynamic, as it doesn't emit bytecode. However it does have an optimizing method dispatch mechanism that achieves a similar result.

Already in my post from 2021 I had written something that I had forgotten and that pretty much surprised me at that time:

the "classic" JRuby (that continues to use as far as I know a strange mix of a Ruby interpreter written in Java and a runtime compiler to java bytecodes, that then will be either interpreted or JIT compiled by the JVM)

I tend to forget that interpreters do not need a bytecode stage, they can parse the source into an AST and directly interpret that AST, and indeed this used to be the common behaviour of first interpreters. The CPython interpreter compiles the python sources to bytecodes for a stack based VM (and saves them to a .pyc file) and then interprets those bytecodes (as we saw in my last post graalpy also uses those pyc bytecode files as an intermediate step for building its ASTs. The modern ruby C runtime YARV (that includes both an interpreter and a JIT) also compiles ruby source code to bytecodes for a stack based VM, but the old C runtime, MRI, was just building an AST from the ruby source and direclty interpreting that complex AST (I think this is what is called a pure AST-walking interpreter)

JRuby is a complex beast that is well explained here. It does not use the "YARV ruby bytecode format" or a specific one, it creates an AST from the ruby source and starts to interpret it (AST-walking interpreter). When a method becomes "hot" it will JIT compile it to Java bytecodes that will be run by the Java interpreter, that in turn can decide that this method has become "hot" and compile it to native code (and reoptimize via PGO and so on...).

JRuby’s JIT compiler is similar in nature, the primary difference being the source and destination formats. In Java the source format is Java bytecode, and the destination format is native machine instructions. Conversely, in JRuby, the source format is the JRuby AST, and the destination format is Java bytecode. Perhaps the most fascinating aspect of the JRuby JIT compiler is that it benefits from both the initial compilation into Java bytecode, and then later, when Java may attempt to translate the JRuby-generated bytecode into native machine instructions. So effectively, it is possible to get a double-JIT on your executing code.

That's amazing, you end up with 2 interpreters (for the AST built from the ruby source and for the Java bytecodes) and 2 JITs (for turning the ruby AST into Java bytecodes and for turning those bytecodes into native code, which indeed goese throught the 2 Java JITs: C1 and C2 or C1 and GraalCompiler). Regardless of how cool that looks, it makes pretty good sense to me that RubyTruffle performs better with its more straightforward approach, interpreting the AST, especializing/Partial Evaluation, and JIT compilation to Native code.

What is not clear to me is whether TruffleRuby creates its AST directly from ruby source code or whether it has an intermediate step and leverages YARV bytecodes (or another bytecode format).

As a final note, it's interesting to note that as explained here, Jython goes a different route from JRuby, it does not have an interpreter phase, it directly compiles to Java bytecodes and feeds them to the JVM. This is the same behaviour of IronPython on .Net.

Jython uses an ANTLRv3 generated parser to generate an AST that conforms to the spec for the one you can get from Pythons built in compile function (if you ask for AST Jython will just return after this stage). This is then fed through a custom compiler that uses the ASM Java bytecode generation library to generate Java bytecode that is then loaded and executed. Jython does not (in contrast to JRuby) have an initial interpreted stage, but compiles directly to Java bytecode and lets the JVM handle it from there. Because of this I've never liked it when people describe Jython as a Python interpreter written in Java, I'd much rather call it a Python implementation for Java.

By the way, while reading different sources about all this stuff, I've found a couple of references to a free ebook about interpreters programming. that looks pretty interesting.