Sunday, 5 January 2025

Closing Javascript Generators

In this previous post about closing Python generators I mentioned that JavaScript generators have a similar feature that deserved a post of its own, so here it is.

JavaScript generators have a return() method. We can think of it as partially equivalent to Python's close() method. That's the case for the simple (and, I guess, main) use cases that I explained in my previous post: using it as a replacement for break, and when you pass the generator around to other functions and one of them decides to close it. For example:


function* citiesGen() {
    yield "Paris";
    yield "Porto";
    return "Europe";
}

// using .return() rather than break
let cities = citiesGen();
for (let city of cities) {
    if (city == "Porto") {
        cities.return();
        console.log("closing generator");
    }
    console.log(city);
}
/*
Paris
closing generator
Porto
*/


Then we have the more advanced behaviour, for which honestly I don't see an obvious use case. Here is where the differences with Python's close() matter. JavaScript's return() accepts a value that will be returned as part of the value-done pair once the generator is finished. This "when it's finished" is key, as a try-finally in the generator code can prevent the return() call from finishing the generator right away: the generator will continue to produce the values yielded from the finally block, and once those are exhausted it will return the value that we passed to the return() call. The theory:

The return() method, when called, can be seen as if a return value; statement is inserted in the generator's body at the current suspended position, where value is the value passed to the return() method. Therefore, in a typical flow, calling return(value) will return { done: true, value: value }. However, if the yield expression is wrapped in a try...finally block, the control flow doesn't exit the function body, but proceeds to the finally block instead. In this case, the value returned may be different, and done may even be false, if there are more yield expressions within the finally block.

And the practice:




function* citiesGen2() {
    yield "Paris";
    try {
        yield "Lyon";
        yield "Porto";
        return "Stockholm";
    }
    finally {
        yield "Lisbon";
        yield "Berlin";
    }
}

cities = citiesGen2();
console.log(cities.next());
console.log(cities.next());
console.log(cities.return("Over"));
console.log(cities.next());
console.log(cities.next());
console.log(cities.next());


// { value: 'Paris', done: false }
// { value: 'Lyon', done: false }
// { value: 'Lisbon', done: false }
// { value: 'Berlin', done: false }
// { value: 'Over', done: true }
// { value: undefined, done: true }


If this feels odd to you, you're not alone :-D. This is quite different from Python, where a call to close() always finishes the generator, even if the generator catches the exception and returns something from it.

generator.close()

Raises a GeneratorExit at the point where the generator function was paused. If the generator function catches the exception and returns a value, this value is returned from close(). If the generator function is already closed, or raises GeneratorExit (by not catching the exception), close() returns None. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. If the generator has already exited due to an exception or normal exit, close() returns None and has no other effect.

As in Python, JavaScript generators also have a throw() method, and again, I don't see much use for it.

Monday, 30 December 2024

Python altinstall and WSL2

When working with multiple Python versions installed on the same machine, on Windows I use pyenv, but on Linux I prefer to do an altinstall. This means downloading the Python source code and building it. It may sound scary, but it's pretty simple. Once done you'll have a new Python installation in /usr/local/bin - /usr/local/lib that will not interfere with the system/default one (the one that came "from factory" with your Linux distribution, that is used by the Operating System itself and should not be modified, and that lives in /usr/bin - /usr/lib). Just use python3.xx to invoke the newly installed version, and python3 to invoke the system Python.

What feels odd is that it's not explained in that many places. The official Python documentation just mentions this. The more detailed instructions that I've always followed are here, and as they explain, the installation is as simple as this:

sudo apt install build-essential zlib1g-dev \
libncurses5-dev libgdbm-dev libnss3-dev \
libssl-dev libreadline-dev libffi-dev

wget https://www.python.org/ftp/python/3.xx.x/Python-3.xx.x.tar.xz
tar xf Python-3.xx.x.tar.xz
cd Python-3.xx.x
./configure
sudo make altinstall

The first step is particularly important. The Python standard library includes modules that wrap native (C) extension modules, and compiling those extensions requires some -dev packages installed on your system (these packages mainly contain C header files). Otherwise the compilation of those extensions will fail and you'll end up with an incomplete installation that throws errors when you try to import the missing modules. If you plan to use sqlite on your system, given that the sqlite3 module in the Python distribution depends on a native module, you must add libsqlite3-dev to the list of dependencies I listed above.
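As a quick sanity check after an altinstall, something like the following can reveal an incomplete build. It's a minimal sketch (the module-to-package mapping is mine and far from exhaustive) that tries to import a few of those C-backed modules with the freshly installed interpreter:


import importlib

# stdlib modules backed by C extensions, and the Debian/Ubuntu -dev package
# that each one needs at build time (an illustrative, partial mapping)
native_modules = {
    "_ctypes": "libffi-dev",
    "_sqlite3": "libsqlite3-dev",
    "_ssl": "libssl-dev",
    "readline": "libreadline-dev",
    "_bz2": "libbz2-dev",
}

for module_name, dev_package in native_modules.items():
    try:
        importlib.import_module(module_name)
        print(f"{module_name}: OK")
    except ModuleNotFoundError:
        print(f"{module_name}: MISSING (was {dev_package} installed at build time?)")


Any MISSING line points to a -dev package to install before rebuilding.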

My work laptop (the one provided by my employer, I mean) is still a Windows one. I have no problem with that; I used to have good knowledge of Windows internals, and even now that I'm more of a Linux person (all my personal computers are Linux based) I still consider Windows architecture really good (though I've come to dislike the updates system, the restore points, the UI...). That said, I'm using WSL2 more and more these days. I have Python 3.13 installed as an altinstall on it and it's been working perfectly fine for testing on Linux the stuff that I develop on Windows. The other day I went one step further and wanted to debug that code on Linux. Your Windows VS Code can work with folders on your WSL2 installation just as it works with code on a remote Linux machine. The WSL extension works in combination with the Remote SSH extension, installing into the $HOME/.vscode-server/ folder in WSL2 the code it needs on the Linux side (the same as it does when working with any remote Linux server). I think all this remote development is something that a few years back one could not even dream about.

With VS Code and the WSL extension combined, VS Code’s UI runs on Windows, and all your commands, extensions, and even the terminal, run on Linux. You get the full VS Code experience, including autocomplete and debugging, powered by the tools and compilers installed on Linux.

The thing is that when trying to run my code under the debugger on WSL2 I was confronted with this:

debugpy/launcher/debuggee.py", line 6, in module
    import ctypes
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 8, in module
    from _ctypes import Union, Structure, Array
ModuleNotFoundError: No module named '_ctypes'

Initially I thought it would be some problem with the debugger itself, some issue with the amazing "remote development experience" making it fail to find that module, but just by jumping into a WSL2 terminal, opening a Python 3.13 REPL and trying to import _ctypes I got the same error. So the _ctypes module was really missing from my Python 3.13 WSL2 altinstall.

Jumping to my main Ubuntu personal laptop, which also has a Python 3.13 altinstall, and importing _ctypes, I got:

$ python3.13
Python 3.13.0 (main, Nov  9 2024, 16:10:52) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import _ctypes
>>> _ctypes.__file__
'/usr/local/lib/python3.13/lib-dynload/_ctypes.cpython-313-x86_64-linux-gnu.so'

lib-dynload seems to be where native modules get installed (I can also see there, for example, the _sqlite3 .so), so if the _ctypes .so is missing it's because some necessary -dev package was absent when I did my altinstall on WSL2. For the sake of experiment I decided to just copy the _ctypes .so from my laptop to the WSL2 one. Doing that, I got another missing piece, libffi, a shared library that _ctypes links against. Running sudo apt list --installed | grep libffi I saw that the libffi-dev package was not installed on my WSL2, so somehow, when a while ago I installed the different -dev packages needed to compile Python, I missed that one (so the Python build could not create the _ctypes .so in lib-dynload), and the issue had not hit me until now. To fix the problem I installed libffi-dev, uninstalled Python 3.13 and did a new altinstall. It works like a charm now.
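By the way, the lib-dynload directory is easy to locate programmatically. A minimal sketch (using sysconfig, and assuming the standard Linux layout) that lists the compiled extension modules of the running interpreter:


import pathlib
import sysconfig

# on Linux, compiled stdlib extension modules live in <stdlib>/lib-dynload,
# e.g. /usr/local/lib/python3.13/lib-dynload for an altinstall
dynload_dir = pathlib.Path(sysconfig.get_path("stdlib")) / "lib-dynload"

print(dynload_dir)
for so_file in sorted(dynload_dir.glob("*.so")):
    print(so_file.name)

# a complete build shows the _ctypes, _sqlite3, _ssl... shared objects in that list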

There does not seem to be an automatic mechanism to uninstall a Python version installed as an altinstall (a Python install takes little space, and indeed I assume I could just have done a new install without removing the existing one and it would have been correctly updated), but anyway, as explained here, I removed this list of folders/files:

    directory /usr/local/lib/python3.13
    directory /usr/local/include/python3.13
    file /usr/local/lib/libpython3.13.a
    file /usr/local/lib/pkgconfig/python-3.13-embed.pc
    6 files /usr/local/bin/*3.13*

While checking this thing of the missing native module (.so) I also used these commands:
lsof -p [PID] | grep .so: to see the shared objects loaded by a process (lsof was an old friend of mine)
readelf -d: this one was new to me. It gives you information about an ELF binary file (executable or shared object, the equivalent of a Windows PE file), and among that information you can see the shared objects needed by that binary, e.g.:

$ readelf -d _ctypes.cpython-313-x86_64-linux-gnu.so

Dynamic section at offset 0x21cf8 contains 25 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libffi.so.8]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x7000

Saturday, 21 December 2024

Python Generators Close and Return Enhancement

In my previous post I mentioned how strange I find it that the value returned by generator.close() (if it returns one) is not made available in the next call to next() (or generator.send()) as the value of the StopIteration exception. A few months ago I wrote about the problems with for loops when you need access to the value returned by a generator, and provided a solution. Well, I've improved that solution into a generator object that wraps the original generator and addresses both issues. If we close the generator and it returns a value, the next time we call next() or send() that value is available in StopIteration.value. Additionally, if the generator returns a value (either when it's closed or when it finishes normally), that return value is made accessible in a result attribute of our generator wrapper. OK, much talk, show me the code:


import inspect
from typing import Generator, Any

def cities_gen_fn():
    try:
        yield "Xixon"
        yield "Paris"
        yield "Lisbon"
        yield "Bilbao"
    except GeneratorExit:
        pass    
    # return this value both if closed or in a normal execution
    return "XPLB"


# wraps a generator in an "extended generator" that stores the value returned by close() to "return it" in the next call to next() or send()
# it also stores the returned value if the original generator returns something
# that stored return/close value is only returned as StopIteration.value in the first call to next()/send(); ensuing calls raise StopIteration with value None

class GeneratorWithEnhancedReturnAndClose:
    def __init__(self, generator_ob: Generator[Any, Any, Any]):
        self.generator_ob = generator_ob
        self.result = None
        self.just_closed = False
        self.closed = False
        
    def __iter__(self):
        return self
    
    def _do_iterate(self, caller: str, val: Any) -> Any:
        if self.just_closed:
            self.just_closed = False
            ex = StopIteration()
            ex.value = self.result
            raise ex
        try:
            if caller == "__next__":
                return next(self.generator_ob)
            else:
                return self.generator_ob.send(val)
        except StopIteration as ex:
            if self.result is None:
                self.result = ex.value
            raise ex
            
    def __next__(self):
        return self._do_iterate("__next__", None)

    def send(self, val):
        return self._do_iterate("send", val)       

    def close(self):
        if not self.closed:
            self.closed = True
            self.just_closed = True 
            self.result = self.generator_ob.close()
            return self.result
        
    def throw(self, ex):
        return self.generator_ob.throw(ex)

print("- getting return value after for-loop")
cities = GeneratorWithEnhancedReturnAndClose(cities_gen_fn())
for city in cities:
    print(city)
print(f"return value: {cities.result}")

print("------------------------")
print("- using next() and close()")

cities = GeneratorWithEnhancedReturnAndClose(cities_gen_fn())
print(next(cities))
print(next(cities))
print(f"closing generator: {cities.close()}")
# first iteration after closing it returns the close-value in the StopIteration.value
try:
    next(cities)
except Exception as ex:
    print(f"generator finished {ex.value}")

# next iteration returns StopIteration with value = None
try:
    next(cities)
except Exception as ex:
    print(f"generator finished {ex.value}")

print(f"return value: {cities.result}")

print("------------------------")
print("- using send() and close()")
# test now that send() also works OK

def freak_cities_gen():
    try:
        w = yield "Xixon"
        w = yield f"{w}Paris{w}"
        w = yield f"{w}Lisbon{w}"
        yield f"{w}Bilbao{w}"
    except BaseException: #GeneratorExit:
        pass    
    # return this value both if closed or in a normal execution
    return "XPLB"
 
cities = GeneratorWithEnhancedReturnAndClose(freak_cities_gen())
print(next(cities))
print(cities.send("||"))
print(f"closing generator: {cities.close()}")
# first iteration after closing it returns the close-value in the StopIteration.value
try:
    next(cities) #it's the same using next or send
except Exception as ex:
    print(f"generator finished {ex.value}")

# next iteration returns StopIteration with value = None
try:
    cities.send("|") #it's the same using next or send
except Exception as ex:
    print(f"generator finished {ex.value}")

print(f"return value: {cities.result}")



# - getting return value after for-loop
# Xixon
# Paris
# Lisbon
# Bilbao
# return value: XPLB
# ------------------------
# - using next() and close()
# Xixon
# Paris
# closing generator: XPLB
# generator finished XPLB
# generator finished None
# return value: XPLB
# ------------------------
# - using send() and close()
# Xixon
# ||Paris||
# closing generator: XPLB
# generator finished XPLB
# generator finished None
# return value: XPLB



Notice that we could inspect the call stack to find out from which method we are being called, and rewrite the code this way:


    def _do_iterate_slow(self, val: Any) -> Any:
        # this is very cool with the use of introspection to check the caller name, but that's pretty slow
        if self.just_closed:
            self.just_closed = False
            ex = StopIteration()
            ex.value = self.result
            raise ex
        try:
            if inspect.stack()[1][3] == "__next__":
                return next(self.generator_ob)
            else:
                return self.generator_ob.send(val)
        except StopIteration as ex:
            if self.result is None:
                self.result = ex.value
            raise ex
           
    def __next__(self):
        return self._do_iterate_slow(None)

    def send(self, val):
        return self._do_iterate_slow(val)


But other than being very cool, that stack access is rather slow, so we'd better avoid that technique.
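For the record, CPython offers a cheaper way to get the caller's name: sys._getframe(), which grabs a single frame instead of materializing the whole stack. A small sketch comparing both (sys._getframe is implementation-specific, so take this as an illustration, not a recommendation):


import inspect
import sys
import timeit

def caller_name_slow():
    # inspect.stack() builds FrameInfo records (file, line, code context...)
    # for every frame on the stack, which is expensive
    return inspect.stack()[1][3]

def caller_name_fast():
    # CPython-specific: grab just the caller's frame object
    return sys._getframe(1).f_code.co_name

def some_caller():
    return caller_name_slow(), caller_name_fast()

print(some_caller())
# ('some_caller', 'some_caller')

print(timeit.timeit(caller_name_slow, number=10_000))  # noticeably slower
print(timeit.timeit(caller_name_fast, number=10_000))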

Tuesday, 17 December 2024

Closing Python Generators

This post is about some rarely used features of Python generators (JavaScript generators are pretty similar, but with some differences that would deserve their own post).

First of all, the internals of Python generators are pretty interesting. They are quite different from C# generators or Kotlin suspend functions, where the compiler converts the function into a class with a "state machine method" with labels for each "suspension point" and properties for the current label and the local variables. In Python, the generator object created from a generator function points to that function as such, and holds a frame object with the variables and the next instruction to run. Each time the generator function is resumed it gets back this frame object (gi_frame), rather than starting with an uninitialized one, containing its state and the position of its next instruction (gi_frame.f_lasti). It's very nicely explained here. We can see with this simple code that the gi_frame and the frame taken (via inspect) from the stack during the generator function's execution are indeed the same object, not a copy:


import inspect

def cities_gen_fn():
    print(f"frame id: {id(inspect.stack()[0].frame)}")    
    yield "Xixon"
    print(f"frame id: {id(inspect.stack()[0].frame)}")
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

cities = cities_gen_fn()
print(next(cities))
print(f"gi_frame id: {id(cities.gi_frame)}")
print(next(cities))

# frame id: 2550405375184
# Xixon
# gi_frame id: 2550405375184
# frame id: 2550405375184
# Paris


Python generator objects have a close() method that allows us to mark the generator as finished. One common use case is when looping over a generator and at some point a condition tells us to stop. Of course you can leave the loop using the break statement, but that's a bit different: break leaves the loop immediately, not in the next iteration, and as the generator has not been finished, we can still continue to iterate it after the loop.



def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

print("- using break")
cities = cities_gen_fn()
for city in cities:
    if (city := city.upper())[0] == "L":
        break
    print(city)
print(next(cities))

print("- using .close()")
cities = cities_gen_fn()
for city in cities:
    if (city := city.upper())[0] != "L":
        cities.close()
        print(city)
try:
    print(next(cities))
except StopIteration as ex:
    print("generator is finished")

# - using break
# XIXON
# PARIS
# Bilbao

# - using .close()
# XIXON
# generator is finished


I can think of some situation in the past where this .close() method would have come in handy. Let's say we have a main function that creates a generator and delegates certain tasks involving iterating that generator to other functions. Each of those functions could determine, based on its own logic, that the generator is finished, so it would close it, and then the main function would no longer invoke the remaining functions with it. Unaware of this .close() functionality, I was returning an "is_finished" boolean from each of those functions.
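A minimal sketch of that scenario (the function names are made up for illustration): each consumer can close the shared generator, and the main code checks the generator state instead of an "is_finished" boolean:


import inspect

def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

def print_some(cities):
    # this consumer's own logic decides that the generator is done
    for city in cities:
        print(f"consumer 1: {city}")
        if city == "Paris":
            cities.close()  # finishes the generator for everybody

def print_rest(cities):
    for city in cities:
        print(f"consumer 2: {city}")

cities = cities_gen_fn()
for consumer in (print_some, print_rest):
    if inspect.getgeneratorstate(cities) == inspect.GEN_CLOSED:
        break  # the remaining consumers are not invoked
    consumer(cities)

# consumer 1: Xixon
# consumer 1: Paris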

The documentation on .close() shows that it's quite an interesting and complex beast. Raises a GeneratorExit at the point where the generator function was paused. Wow, that's quite mind blowing. So it's as if, when the generator function is resumed, the interpreter somehow injected a raise GeneratorExit() statement at the place gi_frame.f_lasti is pointing to! If the generator does not catch the exception, the generator finishes (the next iteration attempt will throw a StopIteration) and the close() call returns None (that's the behaviour in the examples above). Python 3.13 has introduced a new feature: the generator can catch the exception and return a value to the close() method. The main effect, finishing the generator, is the same, but we get the extra of returning a value to the caller. Let's see:



def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"


def cities2_gen_fn():
    try:
        yield "Xixon"
        yield "Paris"
        yield "Lisbon"
        yield "Bilbao"
    except BaseException: #GeneratorExit:
        return "No City"
        #this returned value in case of a close() is returned by close(), but not as return value of the generator (StopIteration.value is None)


for cities_gen in [cities_gen_fn(), cities2_gen_fn()]:
    print(next(cities_gen))
    print(f"close result: {cities_gen.close()}")
    print("generator has been closed")
    try:
        next(cities_gen)
    except Exception as ex:
        print(f"Exception: {type(ex)}, value: {ex.value}")
    print("--------------------------")

# Xixon
# close result: None
# generator has been closed
# Exception: 'StopIteration', value: None
# --------------------------
# Xixon
# close result: No City
# generator has been closed
# Exception: 'StopIteration', value: None
# --------------------------

What feels a bit odd to me is that the value returned by the generator to .close() is not treated as a generator return value and made available as the .value property of the next StopIteration exception.

We have another related method, generator.throw(). It's also used to finish a generator, but by throwing an exception into it, and I don't see any clear use case for it.

Raises an exception at the point where the generator was paused, and returns the next value yielded by the generator function. If the generator exits without yielding another value, a StopIteration exception is raised. If the generator function does not catch the passed-in exception, or raises a different exception, then that exception propagates to the caller.

I'll show some examples, but honestly I don't see when this method can be useful.



def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

cities_gen = cities_gen_fn()
print(next(cities_gen))
try:
    print(f"throw result: {cities_gen.throw(Exception())}")
    print("after generator throw")
except Exception as ex:
    print(f"Exception: {ex}")
try:
    print("next iteration attempt")
    next(cities_gen)
except Exception as ex:
    print(f"Exception in next() call: {type(ex)}, value: {ex.value}")

# Xixon
# Exception: 
# next iteration attempt
# Exception in next() call: 'StopIteration', value: None


print("--------------------------")


def cities2_gen_fn():
    try:
        yield "Xixon"
        yield "Paris"
        yield "Lisbon"
        yield "Bilbao"
    except Exception: 
        yield "Except City"


cities_gen = cities2_gen_fn()

print(next(cities_gen))
print(f"throw result: {cities_gen.throw(Exception())}")
print("after generator throw")
try:
    print("next iteration attempt")
    next(cities_gen)
except Exception as ex:
    print(f"Exception in next() call: {type(ex)}, value: {ex.value}")


# Xixon
# throw result: Except City
# after generator throw
# next iteration attempt
# Exception in next() call: 'StopIteration', value: None


Monday, 2 December 2024

Python locals(), f_locals, local namespace

The freshly released Python 3.13 mentions some updates to the locals() behaviour. Reading those notes confirms to me (as I had outlined here) that trying to create new variables in exec()/compile() will have no effect outside of the "block" executed in the exec/compile itself (reassigning an "external" variable will not have any effect either): the code will always run against an independent snapshot of the local variables in optimized scopes, and hence the changes will never be visible in subsequent calls to locals(). It also opens the door to some really interesting stuff: FrameType.f_locals now returns a write-through proxy to the frame's local and locally referenced nonlocal variables in these scopes.

Let's go a bit deeper into the above statements. Each time we execute a function, a "local namespace" object is created for that function (a sort of dictionary) where local variables and parameters are stored (and also free variables if the function is a closure). I guess we can think of this local namespace object as JavaScript's Activation Object. Let's see:


def create_fn():
    trapped = "aaa"
    def fn(param1):
        nonlocal trapped
        trapped = "AAAA"
        local1 = "bbb"
        print(f"fn local namespace: {locals()}")
    return fn

fn1 = create_fn()
fn1("ppppppp")

# fn local namespace: {'param1': 'ppppppp', 'local1': 'bbb', 'trapped': 'AAAA'}


As aforementioned, code executed by the exec()/compile() functions receives a snapshot of the local namespace of the invoking function, meaning that adding a variable or reassigning a variable in that snapshot will have no effect outside the exec() itself. I mean:


def declare_new_variable(param1):
    # creating a new variable or setting an existing one inside exec() will not crash,
    # but it happens in the local namespace snapshot that exec() receives,
    # so it has no effect in the original local namespace
    print(f"- {declare_new_variable.__name__}")
    # create new variable
    exec(
        "a = 'Bonjour'\n"
        "print('a inside exec: ' + a)\n"
    )
    # a inside exec: Bonjour

    p_v = "bbb"
    # assign to existing variable
    exec(
        "p_v = 'cccc'\n"
        "print('pv inside exec: ' + p_v)\n"
    )
    # pv inside exec: cccc
    
    print(f"locals: {locals()}")
    # locals: {'param1': '11111', 'p_v': 'bbb'}
    # the new variable "a" has not been created in the local namespace, and p_v has not been updated
	

And now the second part of the first paragraph: FrameType.f_locals. I've been playing with it and learned that from a Python function we can traverse its call stack, getting references to a write-through proxy of the local namespace of each stack frame. This means that from one function we have access (read and write) to any variable in any of its calling functions (any function down in the stack), and we can even "sort of" add new variables. I'm using inspect.stack() to get access to the stack chain, then freely move through it, get the stack frame I want, and use f_locals to get that "write-through proxy" to its local namespace.



import inspect


def child2():
    print("- enter child2")
    c2_v1 = "child2 v1"
    c2_v2 = 200
    print("child2")
    parent_locals = inspect.stack()[2].frame.f_locals
    print(f"parent_locals viewed from child2: {parent_locals}")
    print("modify existing parent variable, p_v1")
    parent_locals["p_v1"] = parent_locals["p_v1"] + "_modified"
    print("add variable p_v3 to parent")   
    parent_locals["p_v3"] = "extra var"   
    # remove variable this way fails:
    #del parent_locals["p_v2"] 
    # TypeError: cannot remove variables from FrameLocalsProxy
    print("- exit child2")


def child1():
    print("- enter child1")
    c1_v1 = "child1 v1"
    c1_v2 = 20
    child2()
    print("- exit child1")


def parent():
    p_v1 = "parent v1"
    p_v2 = 2
    print("before calling child")
    print(f"parent: {locals()}")
    child1()
    print("after calling child")
    # p_v1 has been updated and p_v3 has been added:
    print(f"parent: {locals()}")

    # I can see the updated value of this var
    print(f"p_v1: {p_v1}")

    #but trying to access the new variable like this will fail:
    try:
        print(f"p_v3: {p_v3}")
    except Exception as ex:
        print(f"Exception: {ex}")


parent()

# before calling child
# parent: {'p_v1': 'parent v1', 'p_v2': 2}
# - enter child1
# - enter child2
# child2
# parent_locals viewed from child2: {'p_v1': 'parent v1', 'p_v2': 2}
# modify existing parent variable, p_v1
# add variable p_v3 to parent
# - exit child2
# - exit child1
# after calling child
# parent: {'p_v1': 'parent v1_modified', 'p_v2': 2, 'p_v3': 'extra var'}
# p_v1: parent v1_modified
# Exception: name 'p_v3' is not defined


As you can see at the end of the above code, adding new variables to a function through f_locals has an odd behaviour. A new entry is created in the local namespace corresponding to that f_locals. We can see the variable with locals() (regardless of whether it was added by code deeper in the stack chain), but trying to access it directly by its name will fail. The new variable exists in the local namespace, but it seems as if the variable name does not exist. And yes, it's just that, as explained in this post:

Functions are special, because they introduce a separate local scope. The variables inside that scope are fixed when the function is compiled (the number and names of the variables, not their values). You can see that by inspecting a function's .__code__.co_varnames attribute.
That fixed registry of variable names is what is used when names are looked up from inside the function. And that registry is not updated when you're calling exec.

Saturday, 23 November 2024

FIC Xixón 2023

While going through the Program of the 2024 FICXixon edition I've realised that I had forgotten to finish and publish my post about the previous edition; indeed, publishing this kind of post one year later has become as much of a tradition as writing the post about the festival itself. OK, here it goes:

One more year and one more FICXixon edition, number 61, from November 17th to November 25th, 2023. Once again I feel that sort of pride when I see how my hometown manages to organize such an excellent event: a middle-sized city (and a small region) in one corner of our continent that, after striving to survive different waves of massive economic crises and population losses, now seems to be on the way to stabilization and even recovery (the IT sector, wind turbine factories, small private shipyards, some remote workers moving/returning here).

As always I'm a bit constrained by work and by not feeling like going to sessions in those cinemas far away from my flat (Yelmo in La Calzada, Laboral), and in the end I only attended 4 sessions (I could not watch Baltimore, by Molloy and Lawlor, as when I finally made up my mind to get a ticket for it it was already sold out). I pretty much nailed it with these 4 films, as all of them were good or very good:

  • Shoshana. Saturday 18, Teatro Jovellanos. Based on a true story, excellent, absolutely excellent, so much so that I should write a separate post for it, but I think that is not happening, so I'd better write some lines directly here. I've watched quite a bunch of films (normally very good ones) dealing with the complexities of life in Israel-Palestine, from different angles and perspectives, but I think this is the first one I've watched about the (recent) beginning of all that mess. Set in the 1930s/1940s, in the British colony of Palestine, where European Jews have been settling recently, escaping the fascist threat in Europe and pursuing the dream of thriving in the land of their ancestors. Many of them are left-wing, not only the ones in the kibbutzim, also those in Tel-Aviv. They dream of having their own independent state, a homeland, and they buy lands and work hard. Most of them think it will be possible to get along with the Arabs, but as the film ends it's sadly clear, even for those that believed in "living together" and not just in "living separate and not killing each other", that coexisting with Muslims in equality (not in dhimmitude) is for the most part impossible.
  • Day of the Tiger (Tigru). Wednesday 22, OCine. An interesting Romanian film inspired by true events. A woman working in a zoo is going through a bad time and by mistake she lets a tiger escape. A patrol sets off to track the animal, and a fucking motherfucker whose main pride and fun in life consists of hunting animals becomes its leader. There was an encounter with the director at the end, where he explained that his love for animals encouraged him to make this film. I was intrigued about where you find wild animals like these for a film (the tiger role was played by 2 different animals), whether there is a "tigers' casting agency" :-) Well, it's a bit like that. These tigers come from Northern France, where a guy has a huge farm where he takes care of a bunch of wild animals. When the animals are "hired" you also hire the guy, as of course some expert has to handle them during the shooting.
  • Disco Boy. Thursday 23, Teatro Jovellanos. Very powerful French drama. A tough story. A Belarusian young man enters France illegally and decides to join the Foreign Legion, where if he manages to serve for 3 years he will be granted French citizenship. The way he entered France was as hard as the way he chose to try to stay. If you know how France works it's revolting. It's revolting that for a guy with a culture similar to ours, one who would surely work hard and integrate into French society without a problem, the Far-Left (anti)French system makes things as hard as possible, while anyone coming from an incompatible culture, with hatred and distaste for our values, and with a firm desire to profit from our welfare system, will find all sorts of support from all kinds of Far-Left associations. I say "ours" in this paragraph because though I don't have French citizenship, I consider it my second home/country/identity/belonging (and I've paid taxes there for years, not like all that scum that is destroying the country).
  • Matronas (Sages-femmes). Saturday 25, OCine. I was not particularly expectant about this film, but it was the last day of the festival and I wanted to attend one more screening, and the thing is that it was excellent. I had never thought about how intense and demanding the work of bringing new lives into this world is. This film takes us through the lives of several French women that do just that, working as midwives (sage-femme in French). An unexpected surprise. When leaving the screening I heard some women that seemed to be midwives themselves praising how well they felt reflected in the film.

Friday, 15 November 2024

Python Functions Disallowing Named Arguments

There are some builtin Python functions (written in C) that do not accept named (keyword) arguments, forcing the use of positional ones: for example range, map, filter... If you try to pass them a keyword argument you'll get an error: [function] takes no keyword arguments. This feels like an odd limitation to me and, furthermore, a very confusing one, as when looking into the documentation (for example for the functions filter and map) I see no indication of this restriction, so you "learn by trying".
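A quick demonstration (the exact error message can vary a bit between Python versions):


# range() and map() are C builtins whose parameters are positional-only
try:
    range(stop=5)
except TypeError as ex:
    print(ex)  # range() takes no keyword arguments

try:
    map(func=str.upper, iterable=["a", "b"])
except TypeError as ex:
    print(ex)  # map() takes no keyword arguments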

I've been trying to find information about the reason for this odd behaviour, and have not found much. There is this discussion that says that some builtin functions are defined with a "fast header" that does not accept keyword arguments:

This is because they use only a fast simple header ...(... PyObject *args) and therefore all named parameters are rejected automatically. (Python can never introspect into names of arguments in C source. :-)

Many other C functions have a header ...(... PyObject *args, PyObject *kwds) and they support exact list of names explicit by implementing much more complicated validation PyArg_ParseTupleAndKeywords. These names must be written to docs strings manually.

This is a bit confusing to me, but here go some of my ideas on this (comparing it with other platforms). From what I've read, the JVM has no notion of named arguments, meaning that the Kotlin compiler takes care of them when producing the bytecodes for the function call, with minimal performance implications at runtime:

Using named arguments produces the same bytecode as using positional arguments. However, changing the order of the arguments leads to the creation of temporary variables. This is because the arguments need to be evaluated in the order they appear, but passed to the method in the original order.

On the other hand, Python has the notion of keyword arguments at the bytecode level, so it's the interpreter at runtime that takes care of them, which I guess has some minor performance implication. I would say that when writing native functions that you want to be as fast as possible, preventing a function from being invoked with keyword arguments provides some "nano-improvement".
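We can glimpse that in the bytecode with the dis module. A quick sketch (the exact opcodes vary across CPython versions, but the keyword call has to carry the argument names so the interpreter can match them at runtime):


import dis

# a plain positional call
dis.dis("f(1, 2)")

# the same call with keyword arguments: the argument names travel with the call
dis.dis("f(a=1, b=2)")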

Thanks to this searching I've found a surprising feature. I already knew that you can use the * symbol in your function signature to force all parameters after it to be passed as named (keyword) arguments, but I did not know that Python 3.8 (PEP 570) introduced positional-only parameters: using the / symbol in your function signature, the parameters defined before it can only be provided as positional arguments, not as keyword ones.
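Both markers can live in the same signature. A small sketch (the connect() function is made up) with positional-only parameters before the / and keyword-only ones after the *:


def connect(host, port, /, *, timeout=10):
    # host and port are positional-only, timeout is keyword-only
    return f"{host}:{port} (timeout={timeout})"

print(connect("localhost", 8080, timeout=5))  # OK
# connect(host="localhost", port=8080)  -> TypeError (positional-only)
# connect("localhost", 8080, 5)         -> TypeError (keyword-only)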

At first sight it seems like an odd addition. That some builtin functions behave like that is a disturbance to me, so why would I want to disturb others by using it in some of my own functions? Well, there's a pretty good explanation here. Of the 4 reasons given there, the one that really makes this feature useful and valuable to me is the last one:

Since the parameters to the left of / are not exposed as possible keywords, the parameters names remain available for use in **kwargs:

Honestly, I'd never thought about that potential problem when using a variable number of keyword arguments. When your packed **kwargs receives an argument with a name that you already use for one of your other parameters, you get an exception (TypeError: [function] got multiple values for argument '[xxx]'), as the interpreter does not know which parameter that named argument refers to. Let's see an example with the problem and the solution using /:



def format_msg(msg: str):
    return f"[[{msg}]]"

def log_call(msg, fn, *args, **kwargs):
    print(f"{fn.__name__} {msg}")
    return fn(*args, **kwargs)
    

# no keyword argument, so it works fine
print(log_call("invoked", format_msg, "hi"))
# format_msg invoked
# [[hi]]

# but here we have a problem
try:
    print(log_call("invoked", format_msg, msg="hi"))
except Exception as ex:
    print(f"Exception: {ex}")

# Exception: log_call() got multiple values for argument 'msg'


# that we can prevent by redefining the function like this:
def log_call(msg, fn, /, *args, **kwargs):
    print(f"{fn.__name__} {msg}")
    return fn(*args, **kwargs)
    

print(log_call("invoked", format_msg, msg="hi"))
# format_msg invoked
# [[hi]]


Now that I'm aware of this potential problem in Python, a further question arises: how do they manage this in Kotlin? Well, the thing is that they don't have this problem, because Kotlin supports a variable number of unnamed arguments, but not of named arguments. The vararg modifier used in a function signature denotes an array of arguments, but there's no additional modifier for denoting a dictionary of parameters (like Python's **). Related to this, the spread operator * only applies to arrays; there's no equivalent to Python's ** (packing/unpacking) for dictionaries.