Sunday 20 February 2022

itertools.tee implemented in JavaScript

Generator functions and generator objects, iterators and iterables work in quite a similar way in Python and JavaScript, though there are some differences that I may address in a separate post. In both JavaScript and Python, generator functions return generator objects that are both iterables and iterators (so the generator object iterates itself). Because of this, you can iterate a generator object only once. Note that, as I explained some time ago, this works differently in C#.
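A minimal sketch of that last point, using a throwaway generator function called letters (a name made up just for this example): the generator object returns itself from [Symbol.iterator](), so a second pass over it finds it already exhausted.

```javascript
// Hypothetical generator function, used only for illustration.
function* letters() {
    yield "a";
    yield "b";
}

const genObj = letters();

// A generator object is its own iterator.
const isOwnIterator = genObj[Symbol.iterator]() === genObj;

// The first pass consumes the generator; the second pass finds nothing left.
const firstPass = [...genObj];
const secondPass = [...genObj];

console.log(isOwnIterator); // true
console.log(firstPass);     // [ 'a', 'b' ]
console.log(secondPass);    // []
```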

Looking into the functionality provided by the Python itertools module I came across an interesting function, tee, that allows creating multiple independent iterators over the same iterable. These iterators read from a shared buffer wrapped around the original iterator. It's a nice idea, and I've implemented it in JavaScript:



function tee(iterable, num){
    let internalIterator = iterable[Symbol.iterator]();
    let buffer = [];

    //it's very interesting that a generator function can trap the outer variables in a closure, though
    //in the end what is being generated by the compiler is an object with a next() method, not a function that traps the activation object
    function* generatorFn(){
        let pos = 0;
        let finished = false;
        while (!finished){
            if (pos < buffer.length){
                yield buffer[pos];
            }
            else{
                // declare it locally; a bare assignment would create an implicit global
                const it = internalIterator.next();
                if (it.done){
                    finished = true;
                }
                else{
                    buffer.push(it.value);
                    yield it.value;
                }
            }
            pos++;
        }
    };
    
    let generatorObjs = []
    for (let i=0; i<num; i++){
        generatorObjs.push(generatorFn());
    }
    return generatorObjs;
}


let generatorObj = (function*(){
    yield "a";
    yield "b";
    yield "c";
})()

let [iter1, iter2, iter3, iter4] = tee(generatorObj, 4);

console.log(iter1.next().value);

console.log(iter2.next().value);
console.log(iter2.next().value);
console.log(iter2.next().value);
console.log(iter2.next().value);

console.log(iter1.next().value);
console.log(iter1.next().value);
console.log(iter1.next().value);
for (const it of iter3){
    console.log(it);
}

for (const it of iter4){
    console.log(it);
}
/*
a
a
b
c
undefined
b
c
undefined
a
b
c
a
b
c
*/

After coding it (and uploading it to a gist) I realized that the generators and closures machinery is even more impressive than I already thought. In the above code we have a generator function that captures the "internalIterator" and "buffer" variables from its lexical scope in a closure (well, it gets them through the [[scope]] property...). The amazing thing is that, as we know, the generator function creates generator objects on which we invoke the .next() method, and it is really that .next() method that accesses the captured variables. So somehow the compiler has to translate the "closure behaviour" (which in this case is a sort of "virtual closure") to a normal object, I guess by adding references to the trapped variables to each of the generator objects that it creates. Equally impressive is the fact that the node.js debugger shows us everything at runtime as if we were just dealing with a normal closure.
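To make that idea of "closure state living in a normal object" more concrete, here is a hand-written sketch of the same tee logic without a generator function. It is only an illustration of the concept and makes no claim about what the engine really generates; the names TeeIterator and teeWithObjects are made up for this example.

```javascript
// Illustration only: a hand-written counterpart of tee() in which the state a
// closure would capture is stored as ordinary object fields.
class TeeIterator {
    constructor(shared) {
        this.shared = shared; // { source, buffer }, shared by all siblings
        this.pos = 0;         // private cursor into the shared buffer
    }

    next() {
        const { source, buffer } = this.shared;
        if (this.pos === buffer.length) {
            // Nothing buffered at our position yet: pull from the source.
            const item = source.next();
            if (item.done) {
                return { value: undefined, done: true };
            }
            buffer.push(item.value);
        }
        return { value: buffer[this.pos++], done: false };
    }

    // Like a generator object, the iterator is also an iterable over itself.
    [Symbol.iterator]() {
        return this;
    }
}

function teeWithObjects(iterable, num) {
    const shared = { source: iterable[Symbol.iterator](), buffer: [] };
    return Array.from({ length: num }, () => new TeeIterator(shared));
}

const [first, second] = teeWithObjects(["a", "b", "c"], 2);
const firstValues = [...first];
const secondValues = [...second];
console.log(firstValues);  // [ 'a', 'b', 'c' ]
console.log(secondValues); // [ 'a', 'b', 'c' ]
```

Here the "captured variables" are just the shared field that every iterator holds a reference to, while pos plays the role of the generator's local variable.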
