Thursday 18 April 2024

Static Members Comparison

Companion objects are a rather surprising Kotlin feature. They replace (allegedly as an improvement) the static members that we find in most other languages (Java, C#, Python), so in order to grasp their advantages I first needed to review how static members work in those other languages. That's what this post is about.

In Java, static members (fields or methods) can be accessed through the class and also through instances of the class (which is not recommended, for the reason I'm about to explain). Static members are inherited, but static methods are not virtual: their resolution is done at compile time, based on the compile-time type of the expression. This is pretty important to take into account if we are going to invoke them through an instance rather than through the class (it seems to be a source of confusion, and one of the reasons why the Kotlin designers decided not to add "static" to the language). If we define the same static method in a Parent class and in its Child class, and we invoke it through a Parent variable pointing to a Child instance, the method being invoked will be the one in Parent rather than the one in Child, because the resolution is done at compile time (there's no polymorphism for static members). You can read more here.

Things are a bit different in C#. Probably aware of that problem in Java, the C# designers decided to make static members accessible only through the class, not through instances. Static members are inherited (you can use class Child to access a static member defined in class Parent) and you can redefine a static method in a Child class (hiding the inherited one with a new one) using the new modifier.

Recent versions of JavaScript have seen the addition of static members to classes (of course remember that classes in JavaScript are just syntactic sugar, the language continues to be prototype based). They work in the same way as in C#. They can be accessed only through the class, not through instances. You have access to them using a Child class (they are inherited) and you can also redefine them in a Child class.


class Person {
    static planet = "Earth"
    
    constructor(name) {
        this.name = name;
    }

    static shout() {
        return `${this.planet} inhabitant AAAAAAAAAAAA`;
    }

}

class ExtendedPerson extends Person {

}

console.log(Person.shout())

try {
    console.log(new Person("Francois").shout());
}
catch (ex) {
    console.log(ex);
}

// inheritance of static fields/methods works OK
console.log(ExtendedPerson.shout());

//it works because of this:
console.log(Object.getPrototypeOf(ExtendedPerson) === Person);
//true

I assume static members are implemented by just setting properties in the object for that class (Person is indeed a function object), I mean: Person.shout = function(){};. Inheritance works because as you can see in the last line [[Prototype]] of a Child "class" points to the Parent.

An interesting thing is that from a static method you can (and should) access other static methods of the same class using "this". This makes pretty good sense, "this" is dynamic, it's the "receiver" and in a static method such receiver is the class itself. Using "this" rather than the class name allows a form of polymorphism, let's see:


class Person {
    static shout() {
        return "I'm shouting";
    }

    static kick() {
        return "I'm kicking";
    }

    static makeTrouble() {
        return `${this.shout()}, ${Person.kick()}`;
    }

}

class StrongPerson extends Person {
    static shout() {
        return "I'm shouting Loud";
    }
    static kick() {
        return "I'm kicking Hard";
    }    
}

console.log(Person.makeTrouble());
console.log("--------------");
console.log(StrongPerson.makeTrouble());

// I'm shouting, I'm kicking
// --------------
// I'm shouting Loud, I'm kicking


Notice how, thanks to using this, we end up invoking the Child shout() method (StrongPerson.shout()), while for kick() we are stuck with the Parent one (Person.kick()).

Static/class members in Python have some particularities. In Python any attribute declared in the body of a standard class belongs to the class. This means that for static data attributes we don't need any extra keyword, we just add them at the class level (rather than in the __init__() method). For static/class methods we have to use the @classmethod decorator (if the method needs access to the class, for example to call other class methods) or the @staticmethod decorator if not. When we invoke a method on an object, Python uses the attribute lookup algorithm to get the function that will then be invoked. As explained here, functions are indeed non-data descriptors (they have a __get__ method but no __set__), so when we retrieve a function via the attribute lookup the __get__ method of the descriptor is executed. For a normal function retrieved through an instance this creates a bound method object, bound to the instance; if the function has been decorated with classmethod the method is bound to the class; and if it has been decorated with staticmethod we get back the plain function, not bound to anything. Based on this, class/static methods can be invoked both via the class and via an instance, they are inherited, and the polymorphism we saw in JavaScript also works nicely in Python. Let's see some code:


class Person:
    planet = "Earth"
    
    def __init__(self, name: str):
        self.name = name

    def say_hi(self):
        return f"Bonjour, je m'appelle {self.name}"
    
    @staticmethod
    def shout():
        return "I'm shouting"

    @staticmethod   
    def kick():
        return "I'm kicking"

    @classmethod
    def makeTrouble(cls):
        return f"{cls.shout()}, {cls.kick()}"


class StrongPerson(Person):
    @staticmethod
    def shout():
        return "I'm shouting Loud"

    @staticmethod   
    def kick():
        return "I'm kicking hard"


print(Person.makeTrouble())
p1 = Person("Iyan")
print(p1.makeTrouble())

print("--------------")

# inheritance works fine, with polymorphism, both invoked through the class or through an instance
print(StrongPerson.makeTrouble())
p2 = StrongPerson("Iyan")
print(p2.makeTrouble())

# I'm shouting, I'm kicking
# I'm shouting, I'm kicking
# --------------
# I'm shouting Loud, I'm kicking hard
# I'm shouting Loud, I'm kicking hard


print(Person.planet) # Earth
print(p1.planet) # Earth

Person.planet = "New Earth"
print(Person.planet) # New Earth
print(p1.planet) # New Earth

# this assignment will set the attribute in the instance, not in the class
p1.planet = "Earth 22"
print(Person.planet) # New Earth
print(p1.planet) # Earth 22

Notice how we can read a static attribute (planet) both via the class and via an instance, but if we assign to it via an instance the attribute will be added to the instance rather than updated in the class.
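To make the descriptor lookup a bit more tangible, here is a minimal sketch that calls __get__ by hand on the raw objects stored in the class namespace. The class and the variable names are just illustrative; the point is only to show what each kind of descriptor hands back.

```python
class Person:
    @staticmethod
    def shout():
        return "I'm shouting"

    @classmethod
    def make_trouble(cls):
        return f"{cls.__name__}: {cls.shout()}"

    def say_hi(self):
        return "hi"


p = Person()

# Raw objects stored in the class namespace, before any descriptor logic runs
raw_static = Person.__dict__["shout"]
raw_class = Person.__dict__["make_trouble"]
raw_func = Person.__dict__["say_hi"]

# staticmethod.__get__ hands back the plain function, with nothing bound
static_via_instance = raw_static.__get__(p, Person)
# classmethod.__get__ returns a method bound to the class, even via an instance
class_via_instance = raw_class.__get__(p, Person)
# a plain function's __get__ returns a method bound to the instance
bound_to_instance = raw_func.__get__(p, Person)

print(static_via_instance is raw_static.__func__)  # True
print(class_via_instance.__self__ is Person)       # True
print(bound_to_instance.__self__ is p)             # True
# functions define __get__ but not __set__, so they are non-data descriptors
print(hasattr(raw_func, "__get__"), hasattr(raw_func, "__set__"))  # True False
```

Because plain functions are non-data descriptors, an attribute stored in the instance dictionary takes precedence over them, which is the same mechanism that lets p1.planet shadow Person.planet.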

One extra note. We know that when using dataclasses we declare the instance members at the class level (the dataclass decorator then takes care of the logic for setting them in the instance on each instantiation), so to declare static/class attributes in our dataclasses we have to use the ClassVar type annotation: cvar: ClassVar[float] = 0.5
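A short sketch of that ClassVar note, assuming a Person dataclass similar to the one used above (the field names are only illustrative):

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Person:
    name: str                        # instance field: becomes an __init__ parameter
    planet: ClassVar[str] = "Earth"  # class attribute: skipped by the dataclass machinery


p1 = Person("Iyan")
field_names = [f.name for f in fields(Person)]

print(field_names)           # ['name']
print(p1.planet)             # Earth
print(Person.planet)         # Earth
print("planet" in vars(p1))  # False
```

The ClassVar attribute is not part of the generated __init__ and does not show up in fields(), so it behaves like any other class-level attribute.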

Sunday 24 March 2024

Python Object Literals

Kotlin object expressions are pretty nice. They can be used as the object literals that we like so much in JavaScript, but furthermore you can extend other classes or implement interfaces. In Kotlin for the JVM these object expressions are instances of (anonymous) classes that the compiler creates for us.

Python does not provide syntax for object literals, but it's rather easy to get something similar to what we have in Kotlin. For easily creating a "bag of attributes" we can leverage the SimpleNamespace class. To add methods to our object we have to be careful, because if we just set an attribute to a function, when we later invoke it it will be called without the "receiver" (self). We have to simulate the "bound methods" magic applied to functions declared inside a class (which are indeed descriptors that return bound methods), and for that we just have to use types.MethodType to bind the function to the "receiver". Of course, we also want to allow our "literal objects" to extend other classes. This turns out to be pretty easy too, given that in Python we can change the class of an existing object via the __class__ attribute (I tend to complain about Python syntax, but its dynamic features are so cool!), a feature that we combine with the sometimes disdained multiple inheritance of classes. So we'll create a new class (with types.new_class) that extends both the object's current class and the new class, and assign this class to our object.
So, less talk and more code! I ended up with a class extending SimpleNamespace with bind and extend methods. Both methods modify the object in place and return it to allow chaining.


import types
from types import MethodType, SimpleNamespace
from typing import Callable

class ObjectLiteral(SimpleNamespace):
    def bind(self, name: str, fn: Callable) -> "ObjectLiteral":
        setattr(self, name, MethodType(fn, self))
        return self
    
    def extend(self, cls, *args, **kwargs) -> "ObjectLiteral":
        parent_cls = types.new_class("Parent", (self.__class__, cls))
        self.__class__ = parent_cls
        cls.__init__(self, *args, **kwargs)
        return self

We'll use it like this:



class Formatter:
    def __init__(self, w1: str):
        self.w1 = w1
    
    def format(self, txt) -> str:
        return f"{self.w1}{txt}{self.w1}"
    

p1 = (ObjectLiteral(
        name="Xuan",
        age="50",
    )
    .extend(Formatter, "|")
    .bind("format2", lambda x, wr: f"{wr}{x.name}-{x.age}{wr}")
    .bind("say_hi", lambda x: f"Bonjour, je m'appelle {x.name} et j'ai {x.age} ans")
)

print(p1.format("Hey"))
print(p1.format2("|"))
print(p1.say_hi())
print(f"mro: {p1.__class__.__mro__}")
# let's add new attributes
p1.extra = "aaaa"
print(p1.extra)

# let's extend another class
class Calculator:
    def __init__(self, v1: int):
        self.v1 = v1
    
    def calculate(self, v2: int) -> int:
        return self.v1 * v2
    
p1.extend(Calculator, 2)
print(f"mro: {p1.__class__.__mro__}")
print(p1.format("Hey"))
print(p1.format2("|"))
print(p1.say_hi())
print(p1.calculate(4))

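print(f"instance Formatter: {isinstance(p1, Formatter)}")
print(f"instance Calculator: {isinstance(p1, Calculator)}")


# |Hey|
# |Xuan-50|
# Bonjour, je m'appelle Xuan et j'ai 50 ans
# mro: (<class 'types.Parent'>, <class '__main__.ObjectLiteral'>, <class 'types.SimpleNamespace'>, <class '__main__.Formatter'>, <class 'object'>)
# aaaa
# mro: (<class 'types.Parent'>, <class 'types.Parent'>, <class '__main__.ObjectLiteral'>, <class 'types.SimpleNamespace'>, <class '__main__.Formatter'>, <class '__main__.Calculator'>, <class 'object'>)
# |Hey|
# |Xuan-50|
# Bonjour, je m'appelle Xuan et j'ai 50 ans
# 8
# instance Formatter: True
# instance Calculator: True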

Notice how after the initial creation of our object we've continued to expand it with additional attributes, binding new methods and extending other classes.
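To see why the binding step matters, here is a minimal sketch (the say_hi function and the obj variable are just for illustration) comparing a plain function stored on an instance with the same function wrapped in types.MethodType:

```python
from types import MethodType, SimpleNamespace


def say_hi(self):
    return f"Bonjour, je m'appelle {self.name}"


obj = SimpleNamespace(name="Xuan")

# Plain assignment: the function lives in the instance dict, so the
# descriptor protocol is skipped and no receiver is passed in.
obj.say_hi = say_hi
try:
    obj.say_hi()
    unbound_error = None
except TypeError as ex:
    unbound_error = ex

# MethodType builds the bound method by hand, fixing obj as the receiver.
obj.say_hi = MethodType(say_hi, obj)
greeting = obj.say_hi()

print(type(unbound_error).__name__)  # TypeError
print(greeting)                      # Bonjour, je m'appelle Xuan
```

The first call fails because say_hi is missing its self argument, which is exactly what happens if we just set an attribute to a function.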

Tuesday 19 March 2024

Named Arguments Differences

In two previous posts, [1] and [2], we saw that Python, JavaScript and Kotlin have some differences (Python limitations, we could say) in how they deal with default arguments. I was wondering whether there are any differences in how they handle named arguments.

As for JavaScript, given that it does not support named arguments (which is rather surprising), there's not much to say.

The main difference between Python and Kotlin named arguments (also known as keyword arguments in Python) has to do with mixing positional and named arguments in the same call. In Python the named arguments can be provided in any order, but you cannot use a positional (not-named) argument after a named argument. In Kotlin you can also provide named arguments in any order, as long as you are not providing positional arguments after them. So yes, that "as long as" means that you are allowed to provide positional arguments after named arguments, but if you do this, all the arguments (positional ones and named ones) have to be provided in exactly the order in which they were defined in the function signature.

Let's see some examples in Python:


def format(w1, w2, txt):
    return f"{w1}{w2}{txt}{w2}{w1}"

print(format("-", "|", "Bonjour"))

print(format("-", "|", txt="Bonjour"))

print(format("-", w2="|", txt="Bonjour"))

print(format("-", txt="Bonjour", w2="|"))

# Positional argument cannot appear after keyword arguments
#print(format("-", txt="Bonjour", "|"))
#print(format("-", w2="|", "Bonjour"))
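One detail worth noting about the commented-out calls: a positional argument after a keyword argument is rejected when the code is compiled, not when the call runs. A small sketch, using compile() on a string only so that the rest of the example can still execute:

```python
def format(w1, w2, txt):
    return f"{w1}{w2}{txt}{w2}{w1}"


# Source text of a call with a positional argument after a keyword argument
bad_call = 'format("-", txt="Bonjour", "|")'

try:
    compile(bad_call, "<example>", "eval")
    syntax_error = None
except SyntaxError as ex:
    syntax_error = ex

print(type(syntax_error).__name__)  # SyntaxError

# keyword arguments, by contrast, can come in any order
print(format(txt="Bonjour", w2="|", w1="-"))  # -|Bonjour|-
```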


And in Kotlin:


package namedParameters


fun format(w1: String, w2: String, txt: String): String {
    return "$w1$w2$txt$w2$w1"
}


fun main() {
    println(format("-", "|", "Bonjour"))

    println(format("-", "|", txt = "Bonjour"))
    
    println(format("-", w2 = "|", txt = "Bonjour"))
    
    println(format("-", txt = "Bonjour", w2 = "|"))

    //error: Mixing named and positioned arguments is not allowed
    //println(format("-", txt="Bonjour", "|"))
    
    //If I'm going to provide an unnamed parameter after a named one (not available in Python), all the parameters in the call have to be passed in order
    
    println(format("-", w2 = "|", "Bonjour"))

}

From what I've read here, named arguments in C# have the same behaviour as in Kotlin (you can provide positional arguments after named ones, but this forces you to provide all of them in order). At first sight this feature seemed of little use to me. Once you provide some named argument, why would you skip the name of an ensuing argument if that's going to force you to provide all the arguments in the order defined in the signature? Well, it does make sense, let me explain. When we have variables that have the same names as the function parameters it can seem a bit redundant to use named arguments, so let's say we are passing as arguments some "inline values" and some variables with a different name from the parameter. This is a case where, even if we are keeping the same order as in the function signature, using named arguments for those "inline values" and variables will make our code clearer. But if after those arguments we are going to pass a variable that has the same name as the parameter, using a named argument is redundant, so it's nice not to be forced to name it.

Somewhat related to this, a discussion in the Python community comes to mind (there's a recently created PEP draft for it) about a syntax for shortening named arguments when a variable and a parameter have the same name. The idea seems to be inspired by Ruby and would look like this:


#For example, the function invocation:

my_function(my_first_variable=, my_second_variable=, my_third_variable=)

#Will be interpreted exactly equivalently to following in existing syntax:

my_function(
  my_first_variable=my_first_variable,
  my_second_variable=my_second_variable,
  my_third_variable=my_third_variable,
)


This looks like an excellent idea to me, and it would make it unnecessary to implement the Kotlin/C# feature that we've been discussing.

There's one feature of named arguments in Python that I mentioned in my previous post. In Python a function can force us to provide certain (or all) arguments as named ones by means of "*" in its signature; you can read more here and here. It seems there's been some discussion about adding this to Kotlin, but nothing has been done so far.

Sunday 10 March 2024

Default Arguments differences

After my previous post about late-bound default arguments/parameters (again, I continue to be confused about whether to say "argument" or "parameter", and indeed the Kotlin documentation also seems to use them interchangeably on some occasions) I realised that there's another limitation of Python default arguments when compared to JavaScript or Kotlin. In Python, once you define a default argument, all the ensuing arguments also have to be defined as default ones, otherwise you'll get a SyntaxError.

This is not the case in Kotlin (nor, it seems, in Ruby), where:

If a default parameter precedes a parameter with no default value, the default value can only be used by calling the function with named arguments.


fun foo(
    bar: Int = 0,
    baz: Int,
) { /*...*/ }

foo(baz = 1) // The default value bar = 0 is used

So it's OK to skip a default argument that has non-default arguments after it, but in that case you have to invoke the function naming the remaining arguments. There's an exception to this: if the last of those arguments is a lambda (trailing lambda), you can just use the nice syntax of passing it outside the parentheses:

If the last argument after default parameters is a lambda, you can pass it either as a named argument or outside the parentheses

As all this ends up being a bit confusing, I think it's generally advised to put all the default arguments at the end of the function signature (if the last parameter is a lambda, it is excluded from the rule).

In JavaScript you can also declare them as in Kotlin, but given that there's no support for named arguments, if you pass fewer arguments than the function defines in its signature they are assigned to the parameters in order, so it seems like it's not particularly useful. Well, it is useful if we explicitly pass undefined as the value for a default parameter, as in that case the default value is used rather than undefined (that's not the case if we pass null).


const format = (l1, l2 = "_", txt) => `${l1}${l2}${txt}${l2}${l1}`;

// not what we would like
console.log(format("|", "Iyan"));
//|IyanundefinedIyan|

console.log(format("|"));
// |_undefined_|

console.log(format("|", "_", "Iyan"));
// |_Iyan_|

// the thing of having declared a default parameter before a non default parameter is only useful if we decide to pass undefined, that will be replaced by the default value
console.log(format("|", undefined, "Iyan"));
// |_Iyan_|

Well, there is a syntax "trick" in Python that sort of allows the Kotlin behaviour, as explained here. It leverages a Python feature that was unknown to me so far, keyword-only arguments. You can declare non-default arguments after default arguments by placing a ", *" between them, as this forces you to pass the remaining non-default arguments by name (so you get the same behaviour as in Kotlin). The disadvantage is that you are always forced to pass those parameters by name, even when you are providing all the values and not using any of the defaults.


# We're not allowed to write this:
# Non-default argument follows default argument (Pylance)
# def format(l1, l2 = "_", txt):
#     return f"{l1}{l2}{txt}{l2}{l1}"

# but we can use the * feature
def format(l1, l2 = "_", *, txt):
    return f"{l1}{l2}{txt}{l2}{l1}"


# both fail:
# format() missing 1 required keyword-only argument: 'txt'
#print(format("|", "Iyan"))
#print(format("|"))

# works fine
print(format("|", txt="Iyan"))
# |_Iyan_|

# the disadvantage is that we are forced to always pass txt by name, even when we are passing all values
#TypeError: format() takes from 1 to 2 positional arguments but 3 were given
#print(format("|", "-", "Iyan"))

print(format("|", "-", txt="Iyan"))
# |-Iyan-|

Tuesday 5 March 2024

Destructuring, astuple, attrgetter

I already talked about destructuring in JavaScript/Python/Kotlin in this previous post. To sum up, we can use destructuring with any iterable Python object (that's equivalent to JavaScript Array destructuring). That's nice, but it would be even nicer to have something that we could use with any object (a bit in the vein of JavaScript Object destructuring) with no need to make it iterable. I've been looking into some possibilities:

If you are working with simple dataclasses the astuple() function comes in pretty handy. Notice, though, that it will recursively convert nested dataclasses, lists, dictionaries... which is probably not what you want.


from dataclasses import astuple, dataclass

@dataclass
class Person:
    name: str
    city: str
    post_code: int
    age: int
    job: str

p1 = Person("Iyan", "Xixon", 33200, 49, "Coder")

# fields are (name, city, post_code, age, job): skip city and age
name, _, pc, _, job = astuple(p1)
print(name, pc, job)
#Iyan 33200 Coder


That approach works in a positional way. We know the order of the attributes of the class, and we take the positions we want and discard the others (with the _ convention).
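The recursive behaviour of astuple() is easier to see with a nested dataclass. In this sketch the Address class and the shallow_astuple helper are hypothetical, added only for illustration; the helper builds a one-level tuple from dataclasses.fields() so that nested objects are left intact.

```python
from dataclasses import astuple, dataclass, fields


@dataclass
class Address:
    city: str
    post_code: int


@dataclass
class Person:
    name: str
    address: Address


def shallow_astuple(obj):
    """Hypothetical helper: one value per field, nested objects left intact."""
    return tuple(getattr(obj, f.name) for f in fields(obj))


p1 = Person("Iyan", Address("Xixon", 33200))

deep = astuple(p1)
shallow = shallow_astuple(p1)

print(deep)     # ('Iyan', ('Xixon', 33200))
print(shallow)  # ('Iyan', Address(city='Xixon', post_code=33200))

name, address = shallow
print(address.city)  # Xixon
```

With astuple() the nested Address has already been flattened into a tuple, so address.city would no longer be available after destructuring.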

Another option, this one more nominal, as we take attributes based on their name, is using operator.attrgetter, like this:


import operator

name, pc, job = operator.attrgetter("name", "post_code", "job")(p1)
print(name, pc, job)
#Iyan 33200 Coder


It looks OK, but using attribute names as strings is a huge risk. If you rename an attribute, refactoring tools will know nothing about your string, and you have to remember that you are accessing that attribute through a string in order to fix it manually. It's easy to make a mess... With that in mind, I think I prefer the redundancy of writing the object name multiple times:


name, pc, job = p1.name, p1.post_code, p1.job
print(name, pc, job)
#Iyan 33200 Coder


Related to this, what if we want to transform each of the values we are destructuring, but with a different transformation for each value? I'm not talking about a normal map() call where we apply the same function to every item, I'm talking about applying a different function to each one. For that, a transform function like this will do the trick:


def transform(funcs, items):
    return [func(item) for func, item in zip(funcs, items)]


That way we can write something like this:



from operator import attrgetter

name, pc, job = transform(
    (str.upper, lambda x: x, str.lower), 
    attrgetter("name", "post_code", "job")(p1), 
)
print(name, pc, job)
#IYAN 33200 coder


name, pc, job = transform(
    (str.upper, lambda x: x, str.lower), 
    (p1.name, p1.post_code, p1.job), 
)
print(name, pc, job)
#IYAN 33200 coder
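One thing to keep in mind with a zip-based transform is that zip() silently stops at the shortest input, so a missing function or value would go unnoticed until the destructuring fails. A possible stricter variant could look like this sketch (transform_strict and identity are hypothetical names, not part of the code above):

```python
def transform_strict(funcs, items):
    """Hypothetical variant of transform that refuses mismatched lengths."""
    funcs, items = tuple(funcs), tuple(items)
    if len(funcs) != len(items):
        raise ValueError(f"expected {len(funcs)} items, got {len(items)}")
    return [func(item) for func, item in zip(funcs, items)]


def identity(x):
    return x


result = transform_strict((str.upper, identity, str.lower), ("Iyan", 33200, "Coder"))
print(result)  # ['IYAN', 33200, 'coder']

try:
    transform_strict((str.upper, str.lower), ("Iyan", 33200, "Coder"))
    mismatch_error = None
except ValueError as ex:
    mismatch_error = ex

print(mismatch_error)  # expected 2 items, got 3
```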