Thursday 21 December 2023

Kotlin and "Generators"

In my previous post I mentioned that Kotlin's suspend functions are the basis both for asynchronous programming and for generator-like functionality. JavaScript, Python and C# use different constructs/keywords for these two things, async-await on one side and yield on the other, so it's interesting to see how Kotlin manages both through the same suspend construct/keyword.

The equivalent of a JavaScript or Python generator function (a function that produces elements as we iterate over it) is created in Kotlin by means of the sequence() and yield() functions, like this:



val cities = sequence {
    yield("Paris")
    yield("Lyon")
    yield("Bordeaux")
}

for (city: String in cities) {
    println(city)
}
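
Because the block only runs as far as the consumer asks for values, a generator can even be endless. Here is a minimal sketch of that idea (the fibonacci() name is just mine for illustration, it is not part of any library):

```kotlin
// Hypothetical helper: an endless generator of Fibonacci numbers.
// The while(true) loop is safe because the block is suspended at each yield()
// and only resumed when the consumer asks for another element.
fun fibonacci(): Sequence<Long> = sequence {
    var a = 0L
    var b = 1L
    while (true) {
        yield(a)
        val next = a + b
        a = b
        b = next
    }
}

fun main() {
    // take(10) stops asking after ten elements, so the loop never runs away.
    println(fibonacci().take(10).toList()) // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
}
```

This is pretty much what you would write with function* in JavaScript or with a def containing yield in Python.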


sequence is a function (a sequence builder) that receives a "suspendable block" whose receiver is the abstract SequenceScope class, and yield() is a suspend function of that SequenceScope class. Let me show their signatures:


fun <T> sequence(
    block: suspend SequenceScope<T>.() -> Unit
): Sequence<T>

abstract suspend fun yield(value: T)
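
To make the "block with receiver" part more tangible, here is a small sketch. Inside the block, this is the SequenceScope, so yield() and yieldAll() are just member calls on it, and we can write our own suspend extension functions on SequenceScope. The names countdown, yieldTwice and echoedCities are made up for this example:

```kotlin
// Hypothetical helper: a suspend extension on SequenceScope can call yield()
// because the scope is its receiver.
suspend fun SequenceScope<String>.yieldTwice(value: String) {
    yield(value)
    yield(value)
}

// Inside the block `this` is a SequenceScope<Int>; writing this.yield() makes
// the receiver explicit. yieldAll() delegates to another sequence.
fun countdown(from: Int): Sequence<Int> = sequence {
    this.yield(from)
    if (from > 0) yieldAll(countdown(from - 1))
}

fun echoedCities(): Sequence<String> = sequence {
    yieldTwice("Paris")
    yieldTwice("Lyon")
}

fun main() {
    println(countdown(3).toList())   // [3, 2, 1, 0]
    println(echoedCities().toList()) // [Paris, Paris, Lyon, Lyon]
}
```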


Using suspend functions to provide generator-like functionality seemed surprising to me. The "external interface" of a suspend function is that you invoke it once, it gets suspended 0 to n times at its suspension points, and it finally returns one result to you. With generators what we have is a function that gets invoked n times, producing one value each time. So, how can a suspend function work for this second case?

Well, as I explained in my previous post, internally the compiler transforms a suspend function into a state machine, with each suspension point corresponding to one of its states. When the function invokes another suspend function (a suspension point) and that call really suspends, it returns a COROUTINE_SUSPENDED value to its caller, so that the caller knows that it has been suspended. When the async operation we're waiting for completes, the suspend function is resumed (by invoking its corresponding Continuation). When the suspend function reaches its end it returns its result, rather than COROUTINE_SUSPENDED.

For our generator-like functionality we'll be calling an iterator.next() method. If this iterator invokes the suspend function, and the suspend function (which in this case does not need any asynchronous call to produce each of its values) stores the value produced for this iteration "somewhere" and returns COROUTINE_SUSPENDED, then the iterator can retrieve that value. The next call to iterator.next() will resume the suspend function so that it produces the next value, and so on. So this should work!
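
To convince myself that this strategy works, here is a hand-rolled sketch of it using the low-level intrinsics from kotlin.coroutines.intrinsics. TinyGenerator, tinyGenerator() and emit() are names I made up. This is a simplified illustration of the idea under those assumptions, not the standard library implementation:

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.intrinsics.COROUTINE_SUSPENDED
import kotlin.coroutines.intrinsics.createCoroutineUnintercepted
import kotlin.coroutines.intrinsics.suspendCoroutineUninterceptedOrReturn
import kotlin.coroutines.resume

// Hypothetical class: iterator, receiver of the block and completion continuation in one.
class TinyGenerator<T> : Iterator<T>, Continuation<Unit> {
    var nextStep: Continuation<Unit>? = null // where to resume the block
    private var pending: T? = null           // the "somewhere" that holds the value
    private var hasPending = false
    private var done = false

    // Stores the value, remembers the continuation and suspends the block.
    suspend fun emit(value: T) {
        pending = value
        hasPending = true
        suspendCoroutineUninterceptedOrReturn<Unit> { cont ->
            nextStep = cont
            COROUTINE_SUSPENDED
        }
    }

    override fun hasNext(): Boolean {
        if (!hasPending && !done) {
            // Assumption: the block only ever suspends through emit(). The real
            // SequenceScope enforces that with @RestrictsSuspension; this sketch does not.
            val step = nextStep ?: error("block suspended outside emit()")
            nextStep = null
            step.resume(Unit) // runs the block up to its next emit() or to its end
        }
        return hasPending
    }

    override fun next(): T {
        if (!hasNext()) throw NoSuchElementException()
        hasPending = false
        @Suppress("UNCHECKED_CAST")
        val value = pending as T
        return value
    }

    override val context: CoroutineContext get() = EmptyCoroutineContext

    // Invoked once, when the block reaches its end (or throws).
    override fun resumeWith(result: Result<Unit>) {
        done = true
        result.getOrThrow()
    }
}

// Hypothetical factory: creates the coroutine without starting it.
fun <T> tinyGenerator(block: suspend TinyGenerator<T>.() -> Unit): Iterator<T> {
    val generator = TinyGenerator<T>()
    generator.nextStep = block.createCoroutineUnintercepted(receiver = generator, completion = generator)
    return generator
}

fun main() {
    val cities = tinyGenerator<String> {
        emit("Paris")
        emit("Lyon")
    }
    while (cities.hasNext()) println(cities.next())
}
```

Notice that nothing asynchronous happens here: resume() runs the block synchronously on the caller's thread until the next emit(), and then returns control to hasNext().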

I've been digging into Kotlin's source code (SequenceBuilder.kt) to verify that the strategy I've just described is indeed what Kotlin does. The key element is the SequenceBuilderIterator class, which is at once a SequenceScope, an Iterator and a Continuation. I'll just copy its definition and main method signatures below:


private class SequenceBuilderIterator<T> : SequenceScope<T>(), Iterator<T>, Continuation<Unit> {
    private var state = State_NotReady
    private var nextValue: T? = null
    private var nextIterator: Iterator<T>? = null
    var nextStep: Continuation<Unit>? = null

    override fun hasNext(): Boolean {
        ...
    }

    override fun next(): T {
        ...
    }

 
    override suspend fun yield(value: T) {
        ...
    }

    override suspend fun yieldAll(iterator: Iterator<T>) {
        ...
    }

    // Completion continuation implementation
    override fun resumeWith(result: Result<Unit>) {
       ...
    }

    override val context: CoroutineContext
        get() = EmptyCoroutineContext
}

So when we invoke the sequence() function, passing it a suspend function/block that uses a SequenceScope as receiver:
- sequence() wraps the block in a Sequence object whose iterator() method builds the iterator. In the library source the iterator is only created when iterator() is called, so each iteration gets a fresh one.
- That iterator() call creates a SequenceBuilderIterator and stores in its nextStep property a Continuation object for the suspend function (block), which in turn holds a SequenceScope (indeed that same SequenceBuilderIterator instance) as part of its state.
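
A quick way to observe how the Sequence and its iterators relate is to count how many times the block runs. In this little sketch (countBlockRuns is just a name I've chosen) building the sequence runs nothing, and every full iteration runs the block again from the start:

```kotlin
// Hypothetical helper: records the run counter after building the sequence
// and after each complete iteration.
fun countBlockRuns(): List<Int> {
    var runs = 0
    val cities = sequence {
        runs++
        yield("Paris")
        yield("Lyon")
    }
    val afterBuild = runs          // nothing has executed yet
    cities.toList()
    val afterFirstPass = runs
    cities.toList()
    val afterSecondPass = runs
    return listOf(afterBuild, afterFirstPass, afterSecondPass)
}

fun main() {
    println(countBlockRuns()) // [0, 1, 2]
}
```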

Then when we want to iterate that Sequence we invoke its iterator() method to obtain a SequenceBuilderIterator instance. Invoking .next() on this iterator (or hasNext(), which next() relies on when no value is ready yet) will resume the Continuation for the suspend function/block (referenced from the nextStep property). From this suspend function/block we invoke yield(), which is another suspend function that receives the SequenceScope (our almighty SequenceBuilderIterator) as receiver and sets in that SequenceBuilderIterator the value to return (in the nextValue property). So the ever so versatile SequenceBuilderIterator object is what joins everything together. It has a next() method for the iteration, which returns the value set in the nextValue property by its yield() method, and it has another property (nextStep) pointing to the continuation (which wraps the main suspend function).
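
This ping-pong between the iterator and the block can be traced with the standard sequence() builder alone. The sketch below (tracedIteration is a made-up helper) logs what the caller and the block do, in the order it happens:

```kotlin
// Hypothetical helper: returns a log of the interleaving between the consumer
// and the suspend block.
fun tracedIteration(): List<String> {
    val log = mutableListOf<String>()
    val cities = sequence {
        log += "block: started"
        yield("Paris")
        log += "block: resumed after Paris"
        yield("Lyon")
        log += "block: finished"
    }
    log += "caller: sequence created"
    val iterator = cities.iterator()
    log += "caller: iterator obtained"
    // Each hasNext() resumes the block until its next yield() (or its end).
    while (iterator.hasNext()) {
        log += "caller: got ${iterator.next()}"
    }
    return log
}

fun main() {
    tracedIteration().forEach(::println)
    // caller: sequence created
    // caller: iterator obtained
    // block: started
    // caller: got Paris
    // block: resumed after Paris
    // caller: got Lyon
    // block: finished
}
```

The block does not start until the first hasNext(), and after each yield() it stays parked until the caller asks for more.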

Notice that the CoroutineContext associated with a SequenceBuilderIterator is an EmptyCoroutineContext. We saw in my previous posts that Continuations have a CoroutineContext that holds a Job and a Dispatcher, but for the Continuation behind our "generator-like" functionality that Job and Dispatcher are not needed at all.
