Wednesday, 26 July 2023

Python Closure in a Loop

A long, long time ago I wrote a post about the particularities of creating closures in a loop trapping the loop variable. At that time, 2010, both in JavaScript and C# that variable (declared with var in JavaScript) was scoped to the whole function and in order to trap a different variable for each closure-iteration we needed create an extra scope by means of an additional factory function in each iteration. A bit later, in 2012, that was no longer necessary for C# foreach, and 2 or 3 years later, with the addition of let scope the trick was no longer necessary in JavaScript either.

I've been reminded of all this because recently I've had to use that trick again, this time in Python, as in python variables are scoped to the whole function. So the problem is this:


def closure_in_loop_1():
    fns = []
    #i is scoped to the whole function
    for i in range(0,4):
        def fn():
            print("i: " + str(i))
        fns.append(fn)
    for fn in fns:
        fn()

closure_in_loop_1()        

# i: 3
# i: 3
# i: 3
# i: 3


And the solution, use an additional wrapper function receiving the value we want to trap in the closure, is this:


def closure_in_loop_2():
    # do the trick that I used to do in javasccript before the arrival of "let" 
    def scope(i):
        def fn():
            print("i: " + str(i))
        return fn
    fns = []
    for i in range(0,4):
        fns.append(scope(i))
    for fn in fns:
        fn()
        
closure_in_loop_2()

# i: 0
# i: 1
# i: 2
# i: 3


Python gives us access from the outer world to the values trapped by the closure. Functions are objects that have a __closure__ attribute that points to a tuple containing those trapped values (it's None if the function has not trapped any value). The tuple does not directly contain the trapped values, but a cell object that points to that value through the cell_contents attribute. This extra level of indirection is what allows 2 closures to share variables (both closures point to the same cell object):


def closure_cells_test():
    city = "Paris"
    def print_fav_city():
        print(f"my favorite city: {city}")
    
    def print_expensive_city():
        print(f"expensive city: {city}")

    print_fav_city()
    print_expensive_city()  
    print(f"city: {city}")

    print(type(print_fav_city.__closure__[0]).__name__)
    #__closure__ is a tuple of cell objects

    print(f"id1: {id(print_fav_city.__closure__[0])}") # id1: 139805268884496
    print(f"id2: {id(print_expensive_city.__closure__[0])}") # id2: 139805268884496
    # so it's  the same cell object in both cases 

	# the cell object has a cell_contents attribute
    print_fav_city.__closure__[0].cell_contents = "Moscow"
    print_fav_city() # Moscow
    print_expensive_city() # Moscow
    print(f"city: {city}") # Moscow

    city = "Beijing"
    print_fav_city() # Beijing
    print_expensive_city() # Beijing
    print(f"city: {city}") # Beijing

    # to get the names of the variables trapped by the closure:
    print(f"free vars: {print_fav_city.__code__.co_freevars}")
    # free vars: ('city',)

There's one low level details that escapes to me. How the cell object and the outer variable (on which the closure has closed over) are kept in sync (as you can see I can modifiy any of them and it's all kept synced)?.

Additionally you can see in the code above that we can get the names of the variables trapped by the closure (free variables) accessing the co_freevars attibute of the code object:
my_function.__code__.co_freevars

Sunday, 23 July 2023

Lost Kidz (aka OEGP kidz)

If you've read this post you should already know my devotion/obsession for One Eyed God Prophecy (and Uranus). In this later post I talked about what seemed to me the most OEGP similar band that I had managed to find til that day, the excellent German band Arsen. Every now and then I do some web search for "One Eyed God Prophecy similar bands" with hope of coming up with a new band or an old one unknown to me that someone relates to OEGP, and recently I've nailed it!!!

Somehow I ended up in this post providing some information about OEGP and discovering to me a band, Lost Kidz from their same mid-sized Quebecois town, Serbrooke formed a few years later. Being from the same town and sounding pretty similar to OEGP (which is not easy at all) the blog author initially thought that it was a continuation or at least featured some OEGP members, but no, it just was a band profoundly influenced by them.

Lost Kidz just released a DIY CD (apparently in 2003) that you can download from the link to mega.nz in that post. It's an excellent record, dark, emotional and very aggressive, close to that unique sound created by OEGP (but not so magical). You won't find the acoustic passages found in OEGP and there are no moments as violent as the most violent ones in OEGP, but the emotional heavy darkness is well there. They have a few parts with dual vocals with one of them slightly more on the growling than on the screaming sprectrum, which feels pretty nice. The best track for me is the last one, "Soumission librement consentie", a beautiful title for a gorgeous song. A great plus is that 4 of the 5 songs are in French rather than in English (Vive le Quebec Libre!!! :-) dealing with nationalism, consumerism, Mother Earth and self liberation. Additionally I quite like the simple layout. Some raw "existentialist" drawings much in the vein of many 90's screamo records. "Je suis perdu comme jamais nul ne le fut!" (I'm lost as nobody ever was!). Enjoy the darkness!!!

Friday, 14 July 2023

Python Protocols

When I started to use Type Hints in Python (with whatever static type checker, normally mypy) it felt a bit to me as using TypeScript, though I feel way more free and comfortable with type hints. Additionally, there's a huge difference. TypeScript type system uses Structural Typing (objects are compatible if their types have the same "shape" regarless), and Python Type Hints use Nominal Typing. From wikipedia:

In computer science, a type system is a nominal or nominative type system (or name-based type system) if compatibility and equivalence of data types is determined by explicit declarations and/or the name of the types. Nominal systems are used to determine if types are equivalent, as well as if a type is a subtype of another. Nominal type systems contrast with structural systems, where comparisons are based on the structure of the types in question and do not require explicit declarations.

However, Python has some support for Structural Typing (both at "static type checking time" and at runtime) thanks to the recent addition of Protocols. We define a protocol by defining a class inheriting from Protocol. Whatever methods, properties or attributes we include in that class define a protocol (a structure, a shape) and the static type checker (mypy) will check if the "shape" of an object matches with what is defined in the Protocol, regardless of checking the type (the name of the class) of that object.

The above seems clear, and for example we can define a protocol for objects that can be moved like this:


from typing import Protocol

class Movable(Protocol):
    def move(self, to_x: int, to_y:int):
        ...

Unrelated objects that can be moved (feature a "move" method) will be considered as Movable by the static type checker (e.g. mypy) without explicitly inheriting from it.


class Animal:
    def move(self, to_x: int, to_y:int):
        print(f"moving to {to_x}-{to_y}")

# fine for mypy, Animal complies with the Movable protocol
mv3: Movable = Animal()

That's the most basic use of Protocols, but there's much more it it.

Protocols are related to abstract classes, indeed, the Protocol class is an abstract class. As we saw in a previous post, normally we create abstract classes by inheriting from abc.ABC, but what we achieve with that is having abc.ABCMeta as our metaclass, and it's in ABCMeta where the abstract magic is implemented. Likewise, when we create a protocol by inheriting from Protocol what we achieve is having typing._ProtocolMeta as our metaclass, and _ProtocolMeta inherits from ABCMeta, so Protocol is an abstract class.


print(f"issubclass of abc.ABCMeta: {issubclass(type(Protocol), abc.ABCMeta)}")
# True

print(f"type(Protocol): {type(Protocol)}")
# type(Protocol): typing._ProtocolMeta

Protocols are intended as "contracts" and we can not create instances of a Protocol. We get a runtime error if we try:


# Trying to instantiate a Protocol causes both:
# - typings error: Cannot instantiate protocol class "Movable"
# - runtime error: TypeError('Protocols cannot be instantiated')

mv: Movable = Button()

A class is considered a protocol only if it directly inherits from Protocol. A class that inherits from another class that inherits from Protocol is not a Protocol, as I've said, it has to directly inherit from Protocol. Anyway, we can define subprotocols (protocols that extend other protocols) by inheriting from those protocols and also directly inheriting from Protocol. From PEP 544:

Subclassing a protocol class would not turn the subclass into a protocol unless it also has typing.Protocol as an explicit base class. Without this base, the class is “downgraded” to a regular ABC that cannot be used with structural subtyping.
If Protocol is included in the base class list, all the other base classes must be protocols. A protocol can’t extend a regular class.

A very interesting feature is that we can do structural typing checks at runtime with issubclass() and isinstance(). For that we have to decorate our protocol with @runtime_checkable. Otherwise those methods will throw an Exception, and the static type checker (mypy) will also do so. If we look into the cpython source code we can see that class _ProtocolMeta(ABCMeta): has its own implementation of the __subclasscheck__ and __instancecheck__ methods that we saw in another previous post, and they check for the existence of an _is_runtime_protocol attribute in the class (that is added by the @runtime_checkable decorator).

When we define a class that we know complies with an existing protocol it makes sense to explicitly inherit from it. On one side this makes the contract explicit, and on the other side if the protocol changes mypy will warn us about the class no longer complying with it. There's an additional use for this, when combined with marking the empty methods defined in a protocol as abstract methods. This is useful because if then we decide to explicitly inherit from the protocol in those clasess that comply with it, if we forget to implement any of those methods we'll get not only a mypy typechecking error, but also a runtime error "Can't instantiate abstract class X with abstract methods ..."

Once you start to create classes that inherit from protocols there's another feature that makes sense. Same as Kotlin and modern Java and C# interfaces can have default implementations, you can have non empty methods in your protocols.

Thursday, 6 July 2023

Python ABC's subclasshook and more

I think the combination in Python of dynamic features with "static friendly" features like ABC's and type hints is really amazing. While keeping all its dynamism, the language has become rather "static friendly" if you want it. One rather surprising mechanism is the possibility of interfering in the process of checking if a class inherits from another class (checked with the issubclass function) or if an object is an instance of a class (checked with the isinstance function).

issubclass(Student, Person) and isinstance(p1, Person) will search for a __subclasscheck__ and an __instancecheck__ dunder method respectively. They don't check in the Person class, but in the class of the class (the metaclass). So if you are not using some new metaclass it will end up using the methods in the type metaclass (the type class is implemented in pure c, so the type.__dict__["__instancecheck__"] and type.__dict__["__subclasscheck__"] functions are functions in typeobject.c).

The __isinstance__ and __issubclass__ mechanism was added to expand, not to restrict, the cases when an object passes the isinstance, issubclass check, and should not be used for the contrary, for restricting (that is, making objects that comply with that type-check fail), which indeed should have little use. If you try to use it for restriction you'll get odd results, it will work for the issubclass check, but will fail for some isinstance checks.

This is because of an optimization applied to __instancecheck__ explained here

an optimization in isinstance's implementation that immediately returns True if type(obj) == given_class:

You can see here the confusing results that you'll get if used for restriction (I have no idea of why for the issubclass check they do not apply an "if cls == cls" optimization


    # https://stackoverflow.com/questions/52168971/instancecheck-overwrite-shows-no-effect-what-am-i-doing-wrong

class Meta1(type):
    def __instancecheck__(cls, inst):
        """Implement isinstance(inst, cls)."""
        print("__instancecheck__")
        #return super().__instancecheck__(inst)
        return False
    
    def __subclasscheck__(cls, sub):
        """Implement issubclass(sub, cls)."""
        print("__subclasscheck__")
        #return super().__subclasscheck__(sub)
        return False
    

class Person1(metaclass=Meta1):
    pass

class Worker1(Person1):
    pass


p1 = Person1()
print(isinstance(p1, Person1))
# True
# it uses the optimization and our custom __isinstance__ is not even invoked
# type(p1) 

print("------------------------")

p1 = Worker1()
print(isinstance(p1, Person1))
# False
# the optimization returns False, so it invokes our custom __isinstance__ that returns False

print("------------------------")

print(issubclass(Person1, Person1))
# False
# __subclasscheck__ is invoked and returns False

print("------------------------")

print(issubclass(Worker1, Person1))
# False
# __subclasscheck__ is invoked and returns False
print("------------------------")



This machinery was added to python in part to give "extra powers" to abstract classes. It makes easier performing "multiple duck type checks". Let's say you need an object that conforms to an "interface" defined by an abc, but you don't need it to inherit from that class, you just want it to have those methods (duck typing). Furthermore let's say you don't need just one method of the abc, but all or many of them. Checking beforehand if all those methods exist rather than calling methods in the object until one of them fails cause it's not implemented seems like a better option. The __subclasshook__ was born for that. You can define a @classmethod def __subclasshook__(cls, subclass) for that. abc's have ABCMeta as metaclass, and __isinstance__ and __issubclass__ have been defined in it, and both of them will invoke the __subclasshook__ of that abstract class if it exists. The source code for this is in _py_abc.py.


from abc import ABC, abstractmethod

class Movable(ABC):
    @abstractmethod
    def move_rigth(self, x):
        pass

    @abstractmethod
    def move_left(self, x):
        pass

    @classmethod
    def __subclasshook__(cls, C):
        # Both isinstance() and issubclass() checks will call into this (through ABCMeta.__instancecheck__, ABCMeta.__subclasscheck__ )
        if cls is Movable:
            # Note that you generally check if the first argument is the class itself. 
            # That's to avoid that subclasses "inherit" the __subclasshook__ instead of using normal subclass-determination.
            return hasattr(C, "move_right") and hasattr(C, "move_left")
        return NotImplemented


class Button:
    def move_right(self, x):
        print("moving rigth")

    def move_left(self, x):
        print("moving left")

# Button is a Movable in duck-typing and structural typing terms
# but it does not strictly implement the Movable "interface" (it does not inherit from Movable)
# thanks to the __subclasshook__ both checks below are true

if issubclass(Button, Movable):
    bt = Button()
    bt.move_right(4)
    bt.move_left(2)


bt = Button()
if isinstance(bt, Movable):
    bt.move_right(4)
    bt.move_left(2)


#moving rigth
#moving left
#moving rigth
#moving left


This discussion probably explains all this quite better than this post :-)

Additionally there's an alternative/complementary way to declare that a class complains with the "interface" defined by an abc, the ABCMeta.register method. With this you declare that specific classes complain with the "interface" rather than defining the conditions to comply with it, as you do with __subclasshook__. The collections.abc module is a very interesting sample of the use of these mechanisms.