Sunday 26 February 2023

Kotlin MutableIterable

This post is a sort of Kotlin follow-up to this post from last year. Kotlin follows the common Iterable-Iterator nomenclature (the only platform that uses a different naming, IEnumerable-IEnumerator, is .NET...), and as in Python, Iterators are also iterable. They are iterable because there is an extension function for the Iterator interface that returns the iterator itself. This means that the Iterator interface does not inherit from the Iterable interface. I wonder if there's any reason for implementing it this way rather than as a default interface method. From here:

Returns the given iterator itself. This allows to use an instance of iterator in a for loop.

Kotlin adds something new (to me) to the Iterable-Iterator pair: it also features a MutableIterable-MutableIterator pair. A MutableIterator has a remove method that removes from the underlying collection the last element returned by the iterator. You can see a usage example here:


val numbers = mutableListOf(1,2,3,4,5,6)
val numberIterator = numbers.iterator()
while (numberIterator.hasNext()) {
    val integer = numberIterator.next()
    if (integer < 3) {
        numberIterator.remove()
    }
}

Including a remove-delete capability in the iteration logic is not common. Neither JavaScript, nor Python, nor .NET does this. I wrote an article about iteration and removal a decade ago, and reading it again now has helped me to refresh ideas. I had forgotten that, as I explain there, Java iterators have a remove method. The Kotlin approach of having 2 different pairs, Iterable-Iterator and MutableIterable-MutableIterator, seems much more correct. As Kotlin has had mutable and immutable collections since the beginning (I think), the need for this distinction must have seemed evident to the smart guys that designed Kotlin.

Prompted by that old post of mine I've been thinking about deleting while iterating in Python. I'm talking about deleting from the original object, not about creating a new collection with the elements that are not removed (that is as simple as just filtering the collection). Of course you can always use the "universal" solution of using a normal while loop and an index, and either traversing backwards or moving forward and being careful with how you update your position. But what about a for-in loop and an iterator? If we try this:


l1 = ["Paris", "Toulouse", "Torino", "Xixon"]

for i, v in enumerate(l1):
    if v.startswith("T"):
        del l1[i]

print(l1)
# ['Paris', 'Torino', 'Xixon']


We don't get an exception as we would in .NET (because of that version check that I explain in the old post), but the iteration skips the next item: after deleting Toulouse, Torino shifts into the index that has already been visited, so it never goes through the check and remains in the list.
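For comparison, the "universal" index-based approach mentioned earlier can be sketched like this. It's just a minimal illustration; the helper name remove_in_place is something I've made up for the example, not a standard function.

```python
def remove_in_place(items, should_remove):
    """Delete matching elements from `items` itself, walking backwards."""
    # Going from the last index down to 0 means a deletion only shifts
    # elements that have already been visited, so nothing gets skipped.
    for i in range(len(items) - 1, -1, -1):
        if should_remove(items[i]):
            del items[i]


l1 = ["Paris", "Toulouse", "Torino", "Xixon"]
remove_in_place(l1, lambda v: v.startswith("T"))
print(l1)
# ['Paris', 'Xixon']
```

Traversing backwards avoids having to adjust the index after each deletion, which is the fiddly part of the forward version.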

Googling around I came across an amazing solution that uses slice assignment to do an in-place replacement of the whole original list with the filtered one.


l1 = ["Paris", "Toulouse", "Torino", "Xixon"]

# the index is not needed here, so a plain comprehension is enough
l1[:] = [v for v in l1 if not v.startswith("T")]
print(l1)
# ['Paris', 'Xixon']


I have to admit that I had never heard about slice assignment. The idea is that when we do:



l1 = ["a", "b", "c", "d"]
l2 = ["X", "Y"]

l1[1:3] = l2
print(l1)
# ['a', 'X', 'Y', 'd'] 

We are replacing that slice of the list on the left with the list on the right.
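What makes this "in place" is that the list object itself survives, so any other reference to it sees the change. Here is a small sketch comparing it with plain assignment (the variable names alias, alias2 and same_object are just for illustration):

```python
l1 = ["Paris", "Toulouse", "Torino", "Xixon"]
alias = l1                  # second reference to the same list object
original_id = id(l1)

# Slice assignment mutates the existing list...
l1[:] = [v for v in l1 if not v.startswith("T")]
same_object = id(l1) == original_id
print(same_object, alias)
# True ['Paris', 'Xixon']

# ...whereas plain assignment just rebinds the name to a new list
l2 = ["Paris", "Toulouse", "Torino", "Xixon"]
alias2 = l2
l2 = [v for v in l2 if not v.startswith("T")]
print(alias2)
# ['Paris', 'Toulouse', 'Torino', 'Xixon']
```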

Saturday 18 February 2023

Python Type Hints, Cast and Any

When I came back into Python programming 1 year ago and came across type hints I was not particularly excited about them. I had the same feeling I'd had when TypeScript arrived in the JavaScript world, that it was somehow limiting the power of an almighty dynamic language. I started to use them occasionally, but now I'm totally hooked on type hints. I use them even for the most trivial scripts. The sort of "basic documentation" that type annotations provide, and the way they empower intellisense and prevent bugs, is something that seems essential to me now. The amazing thing is that using Any, cast and Protocols you can have the best of both worlds: ultra dynamic, duck-typed code with object expansion, inheritance modification and so on, plus the development friendliness provided by type hints.

Any is what in this post I qualified as Top and Bottom type. Objects of any type can be assigned to a variable hinted as Any, and an object hinted as Any can be assigned to a variable of whatever type.
We can use Any in sections of code where we don't care about type checking. Assign the Any type to a variable or parameter and the type-checker will not complain about anything that we try to do with that object.
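A tiny sketch of what that means in practice (the shout function is a made-up example): the type checker stays silent for both calls, but the runtime still applies its usual rules.

```python
from typing import Any


def shout(value: Any) -> str:
    # The type checker accepts any attribute access on a value hinted as Any
    return value.upper()


print(shout("bonjour"))
# BONJOUR

try:
    shout(42)  # also fine for the type checker...
except AttributeError as err:
    # ...but at runtime an int has no upper() method
    error_message = str(err)
    print(error_message)
```

So Any switches off the checking, not the errors.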

Casting. The typing.cast function in Python behaves like the rest of the type hints infrastructure: it's used only by the type checker (mypy for example) and has no effect at runtime (we know that the only effect that type hints have at runtime is that they are available as annotations). While in statically typed languages like Java and C# casting an object to an incorrect type will cause a runtime error, in Python it'll do nothing, as cast() just returns the object passed to it. Duck typing is one of those Python features that we love so much, and we can make it play nicely with type hints just by using a cast. My object is not a Duck, but it quacks, and my function expects a Duck, but only to make it quack, so let's pretend it is a Duck by casting it to Duck, so that the function accepts it.


from typing import cast, Any

class Animal:
    def __init__(self, name):
        self.name = name

    def move(self):
        print(f"Animal {self.name} is moving")

class Button:
    def __init__(self, name):
        self.name = name

    def move(self):
        print(f"Button {self.name} is moving")

def do_sport(item: Animal):
    print("doing sport")
    item.move()

a1 = Animal("Xana")
do_sport(a1)

b1 = Button("Slide")
# inferred type is Button

# at runtime this is fine thanks to Duck Typing
# but the type-checker complains:
# error: Argument 1 to "do_sport" has incompatible type "Button"; expected "Animal"
do_sport(b1)

# so let's cast it
do_sport(cast(Animal, b1))

# if we cast it to Any we get the same effect, but obviously we lose intellisense
do_sport(cast(Any, b1))
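Protocols, which I mentioned at the start of the post, give another way to type this kind of duck-typed code without casting. This is only a sketch: the Mover protocol is a name I've invented for the example, and I've made move() return a string instead of printing so that the result is easy to check.

```python
from typing import Protocol


class Mover(Protocol):
    """Hypothetical protocol: anything that has a move() method."""

    def move(self) -> str: ...


class Animal:
    def __init__(self, name: str) -> None:
        self.name = name

    def move(self) -> str:
        return f"Animal {self.name} is moving"


class Button:
    def __init__(self, name: str) -> None:
        self.name = name

    def move(self) -> str:
        return f"Button {self.name} is moving"


def do_sport(item: Mover) -> str:
    # Accepts any object whose shape matches Mover; no inheritance needed
    return "doing sport: " + item.move()


print(do_sport(Animal("Xana")))
# doing sport: Animal Xana is moving
print(do_sport(Button("Slide")))
# doing sport: Button Slide is moving
```

Here the type checker accepts both calls because Animal and Button structurally match Mover, so the cast is not needed when we control the signature of the function.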

We have an interesting combined use of casting to Any and casting to a specific type when we are expanding an object (adding a method to it, for example). Let's see it:


from typing import cast, Any
import types 

class Animal:
    def __init__(self, name):
        self.name = name

    def move(self):
        print(f"{self.name} is moving")

class Person:
    def __init__(self, name):
        self.name = name

    def speak(self):
        print(f"{self.name} is speaking")

p1 = Person("Francois")
# let's add to this instance (expand it) the move method from the Animal class. We have to create a bound method, which we do with types.MethodType

# this works fine at runtime:
p1.move = types.MethodType(Animal.move, p1)
p1.move()
#but the type-checker complains
#expando.py:21: error: "Person" has no attribute "move"  [attr-defined]
#expando.py:22: error: "Person" has no attribute "move"  [attr-defined]


# we cast to Any so that the type-checker accepts assigning an attribute that Person does not declare
(cast(Any, p1)).move = types.MethodType(Animal.move, p1)
# and now we cast it to Animal and assign to a new variable so that we can use it comfortably
p2 = cast(Animal, p1)
p2.move()
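Both examples above rely on cast doing nothing at runtime. A quick sketch to check that claim (Duck and Robot are throwaway classes for the illustration):

```python
from typing import cast


class Duck:
    def quack(self) -> str:
        return "Quack"


class Robot:
    def quack(self) -> str:
        return "Beep-quack"


r1 = Robot()
d1 = cast(Duck, r1)  # the type checker now treats d1 as a Duck

print(d1 is r1)            # True: cast returned the very same object
print(type(d1).__name__)   # Robot
print(d1.quack())          # Beep-quack

# Even an absurd cast passes silently at runtime
n = cast(int, "not a number")
print(type(n).__name__)    # str
```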

Saturday 11 February 2023

JavaScript arguments vs Python locals()

There's a technique in JavaScript combining the arguments (array-like) object and the spread operator that comes in pretty handy. When I have a function receiving multiple parameters and invoking another function with those same parameters, I use the arguments object for convenience, rather than typing the parameter list again. I mean:


function format(name, age, city, country){
    console.log(`formatting: ${name} - ${age} [${city}, ${country}]`);
}

function validate(name, age, city, country){
    if (city == "Paris") {
        console.log("validation OK");
        format(...arguments);   //equivalent to: format(name, age, city, country)
    }
    else {
        console.log("validation KO");
    }
}


validate("Francois", 40, "Paris", "France");
validate("Francois", 40, "Lyon", "France");

// validation OK
// formatting: Francois - 40 [Paris, France]
// validation KO



One could wonder what happens if we modify the value of one of the function arguments before invoking the second function. Does the arguments object that we are passing over reflect those changes? This is an important consideration in some use cases. Well, this is explained in this subsection. In non-strict mode the value gets updated, I mean:


function format(name, age, city, country){
    console.log(`formatting: ${name} - ${age} [${city}, ${country}]`);
}

function enhance(name, age, city, country){
    if (city == "Paris") {
        city = city.toUpperCase()
        age += 2;
        format(...arguments);  
    }
}

enhance("Francois", 40, "Paris", "France");

// formatting: Francois - 42 [PARIS, France]

What about using this same technique in Python? Well, Python does not have an equivalent to the arguments-object, but in many cases the locals function will do the trick.



def format(name, age, city, country):
    print(f"formatting: {name} - {age} [{city}, {country}]")

def validate(name, age, city, country):
    if city == "Paris":
        print("validation OK")
        format(**locals()) # equivalent to: format(name, age, city, country)
    else:
        print("validation KO")


validate("Francois", 40, "Paris", "France")
validate("Francois", 40, "Lyon", "France")

# validation OK
# formatting: Francois - 40 [Paris, France]
# validation KO

locals() returns a dictionary representing the current local symbol table, which includes the function arguments; that's why the above example works. But this means that if our function is a closure, the free variables trapped by the closure are also included, so this technique won't work for closures. There are more limitations. Any new variables that we define in our function before the call are also included in the dictionary returned by locals(), so again the technique would fail. Notice that, in CPython up to 3.12, different calls to locals() in one function return the same dictionary (I mean, it points to the same position in memory, the same id()), which is refreshed on each call; from Python 3.13 on, each call returns a fresh snapshot instead. Either way, a call to locals() reflects the state at that moment: new variables show up as new keys and updated values show up updated. This means that our second JavaScript example also works in Python:



def format(name, age, city, country):
    print(f"formatting: {name} - {age} [{city}, {country}]")
    
def enhance(name, age, city, country):
    if city == "Paris":
        city = city.upper()
        age += 2
        format(**locals())


enhance("Francois", 40, "Paris", "France")
# formatting: Francois - 42 [PARIS, France]
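The limitation with extra local variables described above, and one possible way around it, can be sketched as follows. The function names broken, fixed and format_person are invented for the example, and copying locals() on the first line is just one option I find reasonable, not an established idiom.

```python
def format_person(name, age, city, country):
    return f"{name} - {age} [{city}, {country}]"


def broken(name, age, city, country):
    label = "extra local"             # now part of locals() too
    return format_person(**locals())  # fails: unexpected keyword 'label'


def fixed(name, age, city, country):
    args = dict(locals())  # copy taken first: holds only the four parameters
    label = "extra local"
    return format_person(**args)


try:
    broken("Francois", 40, "Paris", "France")
except TypeError as err:
    print("broken:", err)

print("fixed:", fixed("Francois", 40, "Paris", "France"))
# fixed: Francois - 40 [Paris, France]
```

The price of the copy is that it no longer reflects later changes to the parameters, so it does not combine with the "modify then forward" behaviour of the previous example.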


Thursday 2 February 2023

Kotlin and inline functions

One of the many surprising (at least for me) features that one finds in Kotlin is inline functions. The inline expansion of a function was not new to me. I know that compilers (either ahead-of-time compilers or JITs) do it occasionally. They come across a small function and, based on some rules, decide that it will be faster to expand its code at the call site than to invoke it. What was new to me is that the programmer can tell the compiler which functions to inline. It's the first time I've come across this option, but I've been investigating and it also exists in F#, Scala and C/C++.

This excellent article explains pretty well why inlining is important in Kotlin (it's also pretty interesting how, as we saw in my previous post, the compiler creates either singletons or normal classes for hosting function references). Long story short, in Kotlin we use functions (lambdas) in places where in other languages we just write the code in place (scope functions are a great example, and indeed I still don't find them particularly appealing save for DSL cases). This comes with a performance price, so the Kotlin guys (who are really, really clever people) came up with the inline thing.

The article says:

When using inline functions, the compiler inlines the function body. That is, it substitutes the body directly into places where the function gets called. By default, the compiler inlines the code for both the function itself and the lambdas passed to it.

So with regard to inlining the functions passed to the inline function, it only mentions lambdas (same as the Kotlin documentation does), so I was wondering whether an anonymous function would also be inlined. I've just done a quick test to verify that yes, it is inlined. In the 2 functions below (notice that the apply scope function is defined with the inline modifier) both the lambda expression and the anonymous function get inlined.


// notice that apply is defined like this:
// inline fun <T> T.apply(block: T.() -> Unit): T

class Person constructor (var name: String, var age: Int, var city: String) {
    fun sayHi(): String {
        return "Bonjour, je suis $name et j'habite a $city"
    }
}

// lambda gets inlined
fun test1() {
    val p1 = Person("Francois", 4, "Xixón")
    p1.apply {
        name = name.uppercase()
        age += 1
    }
}

// anonymous function defined in place gets inlined
fun test2() {
    val p1 = Person("Francois", 4, "Xixón")
    p1.apply(fun(p: Person) {
        p.name = p.name.uppercase()
        p.age += 1
    })
}

All the examples of inlining that I've seen show the lambda being defined right in the place of the parameter in the function invocation, but what if we define it beforehand? It's clear that in most cases inlining cannot be done. Normally, when we have a variable that references a function, the compiler cannot know which specific function that variable will be referencing (maybe the variable is initialized from a parameter, maybe there are conditionals...). But there are very specific cases where the compiler could check the whole function, know the real value being used and inline it. I mean cases as simple as this:


// lambda not defined right in place does not get inlined
fun test3() {
    val p1 = Person("Francois", 4, "Xixón")
    val fn: Person.() -> Unit = { 
        name = name.uppercase()
        age += 1
    }
    p1.apply(fn)
}

//anonymous function not defined right in place does not get inlined
fun test4() {
    val p1 = Person("Francois", 4, "Xixón")
    val fn: (Person) -> Unit = fun (p: Person) {
        p.name = p.name.uppercase()
        p.age += 1
    }    
    p1.apply(fn)
}

I've verified that it's not being inlined.

There's another topic that is a bit related to inlining: the usage of return. Believe it or not, apart from the normal "return" that we are used to in other languages (an unqualified return), Kotlin also has qualified returns, where the return is accompanied by a label (return@label) and exits the function or lambda that the label identifies. Related to this are non-local returns: when a lambda is passed to an inline function, an unqualified return inside it returns from the enclosing function rather than from the lambda. Honestly I find this quite confusing, but probably the most important thing to take from this is: unqualified (unlabelled) returns are not allowed in lambda expressions unless the lambda expression is inlined. In that case, returning from the lambda returns from the first "normal function" enclosing the lambda.

At first it seemed odd to me, but now I think I understand the reasoning for this restriction. When we use a lambda as in my test1 function above, that is, when we define a lambda right in the place where it's being passed as a parameter to another function, the visual effect is that it really does not look like a function; it looks more like a language construct, like an if{} or a while{}. In these cases, if we see a return written inside those {}, it looks more like a return from the enclosing function. If the lambda is being inlined that perception is right, as the return ends up as another statement of the enclosing function, but if it's not inlined, the lambda remains a function and that return would just be returning from it, not from the enclosing function. So in the end the programmer would have to be particularly attentive to what the return is really doing. For example, in the same lambda expression a return would do different things depending on whether the lambda is declared, assigned to a variable and passed to a function, or directly defined and passed, in which case the behaviour would again depend on whether the function gets inlined or not... So yes, it could be rather confusing, and restricting the usage of return makes things a bit more straightforward.

All this talk about "inlining" has reminded me of another very different kind of inlining: the possibility provided by C and C++ of writing inline assembly code. I've just learnt that Rust also allows writing inline assembly!