Sunday, 30 November 2025

exec, eval, return and ruby

In this post about expressions I mentioned that Kotlin features return expressions, which is a rather surprising feature. Let's see it in action:


// Kotlin code:
fun getCapital(country: Country?): String {
    val city = country?.capital ?: return "Paris"
    // this won't run if we could not find a capital
    logger.log("We've found the capital")
    return city
}

Contrary to try or throw expressions, which can be simulated (in JavaScript, Python...) with a function [1], [2], there's no way to mimic a return expression with a "return() function" (it would exit from that function itself, not from the calling one). Well, it came to my mind that maybe we could use a trick in JavaScript with eval() (I already knew that it would not work in Python with exec()), but no, it does not work in JavaScript either.


// JavaScript code:
function demo() {
    eval("return 42;");
    console.log("This will never run");
}

console.log(demo());
// Uncaught SyntaxError: Illegal return statement (thrown when eval() parses the string)


JavaScript gives us a SyntaxError when we try that because that return cannot work the way we intend (returning from the enclosing function), so it prevents us from trying it. The code that eval compiles and runs executes inside the eval call; it's not as if it were magically placed inline in the enclosing function, so return (or break, or continue) would just return from eval itself, not from the enclosing function, and to prevent confusion JavaScript forbids it.

The reason why I thought that maybe this would be possible is that, as I had already explained in this previous post, JavaScript eval() is more powerful than Python exec(), as it allows us to modify and even add variables to the enclosing function. As a reminder:


// JavaScript code:
function declareNewVariable() {
    // "a" has to be declared with "var" rather than let/const in the eval'd code
    // so that it gets added to the enclosing function scope
    let block = "var a = 'Bonjour';";
    eval(block);
    console.log(`a: ${a}`);
}

declareNewVariable();
// a: Bonjour


This works because when JavaScript compiles and executes a "block" of code with eval() it gives it access to the scope chain of the enclosing function.

Python could also have implemented this feature, but it would be very problematic in performance terms. Each Python function stores its variables in an array (I think it's the f_localsplus field of the internal interpreter frame, not to be confused with the higher level PyFrameObject wrapper), and the bytecode accesses variables by index in that array (using the LOAD_FAST and STORE_FAST instructions), not by name. exec() accepts an arbitrary dictionary to be used as locals, meaning that it will access either that custom locals or one created as a snapshot of the real locals, through dictionary lookups (with LOAD_NAME, STORE_NAME). Basically there's no easy way to reconcile both approaches. Well, indeed exec() could have been designed to receive by default a write-through proxy like the one created by frame.f_locals (PEP 667). That would allow modifying variables of the enclosing function, but it would not work for adding variables to it (see this post). So I guess the Python designers saw it as more coherent to prevent both cases rather than having one case work (modification of a variable) and the other not (addition of a new variable). As for the PyFrameObject stuff that I mention, some GPT information:
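
A quick sketch of that limitation (same spirit as the JavaScript example above):

# Python code:
def demo():
    a = "Hello"
    # the exec'd code gets a snapshot dictionary of the locals,
    # so the assignment lands in that dictionary...
    exec("a = 'Bonjour'")
    # ...and the real local (a fast-local slot) is untouched
    print(a)  # Hello

demo()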

In Python 3.11+, the local variables and execution state are stored in interpreter frames (also called "internal frames"), which are lower-level C structures that are much more lightweight than the old PyFrameObject.
When you call sys._getframe() or use debugging tools, CPython creates a PyFrameObject on-demand that acts as a Python-accessible wrapper around the internal frame data. This wrapper is what you can inspect from Python code, but it's only created when needed.

So all in all we can say (well, a GPT says...):

Bottom line: Neither Python’s exec() nor JavaScript’s eval() can magically splice control-flow into the caller’s code. They both create separate compilation units. JavaScript feels “closer” because eval() shares lexical scope, but the AST boundaries still apply.

After all this, an interesting question comes up: is there any language where the equivalent to eval/exec allows us to return from the enclosing function? The answer is yes: Ruby (and obviously it also allows modifying and adding new variables to the enclosing function). Additionally, notice that Ruby also supports return expressions (well, everything in Ruby is an expression).


# ruby code:
def example
  result = eval("return 5")
  puts "This won't execute"
end

example  # returns 5

Ruby's eval is much more powerful than JavaScript's or Python's - it truly executes code as if it were written inline in the enclosing context.

The "as if" is important. It's not that Ruby compiles the code passed to eval and somehow embeds it in the middle of the currently running function. That could be possible I guess in a Tree parsing interpreter, modifying the AST of the current function, but Ruby has long ago moved to bytecode and JIT. What really happens is this

Ruby's eval compiles the string to bytecode and then executes it in the context of the provided binding, which includes:

- Local variables
- self (the current object)
- The control flow context (the call stack frame)

That last part is key. When you pass a Binding object, you're not just passing variables - you're passing a reference to the actual execution frame. So when the evaled code does return, break, or next (Ruby's continue), it operates on that captured frame. Here's where it gets wild.

The Binding object idea (an object that represents the execution context of a function) is amazing. By default (when you don't explicitly provide a binding object) the binding represents the current execution frame, but you can even pass as binding the execution frame of another function!!! You can get access to variables of another function, and if that function is still active (it's up in the call stack) you can even return from it, meaning you can make control flow jump from one function to another one up in the stack chain!

eval operates on a Binding object (which you can pass explicitly), and that binding captures the complete execution context - local variables, self, the surrounding scope, everything. You can even capture and pass bindings around

Just notice that Python offers a small subset of the binding object functionality by allowing us to explicitly provide custom dictionaries as locals and globals to exec().
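
For instance, a minimal sketch of that subset (the names are just illustrative):

# Python code:
namespace = {"a": 1}
# run the code against our explicit globals/locals dictionaries
exec("a += 1; b = a * 10", {}, namespace)
print(namespace["a"], namespace["b"])  # 2 20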

Sunday, 23 November 2025

How Exceptions Work

It's been quite a while since I first complained about the lack of safe-navigation and coalesce operators in Python and provided a basic alternative. I've also complained about the lack of try-expressions in Python, and also provided a basic alternative. Indeed, there's no strong reason for having 2 separate functions; I think we can just use do_try for the safe-get and coalesce option.



from collections.abc import Callable
from typing import Any

def do_try(action: Callable,
           exceptions: type[BaseException] | tuple[type[BaseException], ...] = Exception,
           on_except: Any = None) -> Any:
    """
    simulate 'try expressions'
    on_except can be a value or a Callable (that receives the Exception)
    """
    try:
        return action()
    except exceptions as ex:
        return on_except(ex) if callable(on_except) else on_except

person = Person()  # assume some Person entity whose nested attributes may be missing
embassy = do_try(lambda: person.country.main_cities[0])


I also complained about how absurd it feels having a get method for dictionaries but not for sequences. That means that we end up writing code like this:


x = items[i] if len(items) > i else "default"

Of course, that would not be necessary if we had safe-navigation, but as we don't have it, we can just use the do_try function:


x = do_try(lambda: items[i], on_except="default")

And here comes the interesting part: obviously using do_try means using try-except under the covers, which compared to an if conditional seems something to avoid in performance terms, right? Well, I've been revisiting the internals and cost of exceptions a bit. Since version 3.11 Python has zero-cost exceptions. This means that (as in Java) having try-except blocks in your code has no performance effect if no exception is thrown/raised; the only costs occur if an exception is actually raised. The "zero cost" refers to the cost when no exception is raised; there is still a cost when exceptions are thrown.

Modern Python uses exception tables. For each function containing try-except blocks an exception table is created, linking the try part to the handling code in the except part. Exception tables are created at compile time and stored in the code object. Then at runtime, if an exception occurs, the interpreter consults the exception table to find the handler for the given exception and jumps to it. Obviously creating an exception object, searching the exception table and jumping to the handler has a cost. Given that in Python compilation occurs when we launch the script, just before we can run the code, we can say that this exception table creation also has a runtime cost, but it's minimal as it happens only once per function (when the function is created), not every time the function is executed. So that's where the cost goes when an exception is raised: creating the Exception object, unwinding the stack and jumping to the handler.
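
We can actually look at those tables, as since 3.11 the dis module prints them. A small sketch:

import dis

def careful_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None

# on Python 3.11+ the disassembly ends with an "ExceptionTable:" section
# mapping the offsets of the try body to the offset of the handler
dis.dis(careful_divide)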

Throwing/raising an exception felt like a low level mechanism to me, but it's not at all.

Language-level exceptions are software constructs managed by the runtime (JVM for Java, CPython for Python). They do not involve the OS unless the program crashes. So when you use a throw/raise statement in your code there's no sort of software interrupt involved; it's just one more (or a few more) instructions. The Python interpreter will come across a RAISE_VARARGS bytecode instruction, and it will search the exception tables of the current function and/or the functions up the call stack, trying to find an exception handler.

Notice that the same happens in Java/the JVM. The Java compiler creates an exception table for each method and stores it in the .class file. This table maps bytecode ranges to handlers (catch blocks) and the type of exception they handle. When the class loader loads the class, the JVM stores this table in the method's metadata. Given that the JVM comes with JIT compilation, there's an additional level. When the JIT compiles the method:

The JIT generates native machine code for the method.
It also creates a new exception table for the compiled code, because:
The original bytecode offsets are no longer relevant.
The JIT needs to map native instruction addresses to handler entry points.

This table is stored alongside the compiled code in the JVM’s internal structures.

So once a method has been compiled by the JIT at runtime we'll have two exception tables: the initial one for the bytecode form of the method (which is kept around in case we have to deoptimize from native code back to bytecode), and the table for the native code. Notice that when the JIT compiles the bytecode to native code we'll incur a very small extra cost for the creation of this additional table.

With all the above, using do_try() for safe indexed access seems a bit overkill (unless we're sure the access is very rarely going to fail and throw), so having a specific convenience function for it makes sense:


from collections.abc import Sequence
from typing import Any, TypeVar

T = TypeVar("T")

def get_by_index(sequence: Sequence[T], index: int, default: Any = None) -> T | Any:
    """
    Safely access an element by index in a sequence, where a sequence is any class
    supporting __getitem__ and __len__, like: list, str, tuple and bytes.
    Note that negative indexes are not supported, they just return the default.
    Usage: get_by_index(my_list, 2, default='Not Found')
    """
    return sequence[index] if 0 <= index < len(sequence) else default
    

We could generalize the function for nested access, but once we start looping over if conditions, at some number of iterations the try-except will probably end up being better for performance.
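
As a sketch of that generalization (an illustrative helper, with the try-except doing the looping for us):

def get_nested(container, *path, default=None):
    # walk a chain of keys/indexes, bailing out with the default at the first miss
    for step in path:
        try:
            container = container[step]
        except (KeyError, IndexError, TypeError):
            return default
    return container

data = {"countries": [{"name": "France", "cities": ["Paris"]}]}
print(get_nested(data, "countries", 0, "cities", 0))          # Paris
print(get_nested(data, "countries", 5, "name", default="?"))  # ?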

Friday, 14 November 2025

FICXixon 2024

I guess as you get older (OK, let's say more mature, to sound nicer) traditions become more and more important. One more year, one more edition of FICXixón is almost here, and as usual I realise I have not yet published my post about the previous edition, so here it goes:

The 62nd edition of FICXixón took place from November 15th to 23rd. This was a dry November here, which makes me rather angry; I love rainy weather, I've always loved it, and now even more, probably due to my youth memories (and I guess my Galician heritage also plays its part). All in all I attended 6 films, which is my record since I returned to Xixón in 2018. This year I was not so busy at work and had more time to check the programme and attend screenings. I watched 2 excellent films, a good film, an interesting documentary, and 2 films that were OK but that I would not watch again. From the micro-reviews below you can guess which is which.

  • The Antique, Friday 15, 19:00, OCine. The best film I watched in this edition. I was quite in doubt between this one and Bird, but the trailer of "The Antique" looked so good (additionally, a story set in Russia is more appealing to me now than one set in the UK), and in the end I settled on this gorgeous Georgian film. An old flat in a historical building in central Saint Petersburg, the snow covered streets, an old man approaching the end, a gorgeous Georgian woman, antiquities, social unrest. I think I've said enough.
  • Una Ballena, Saturday 16, 22:00, Teatro Jovellanos. Basque film mixing neo-noir and horror-fantasy. It had excellent reviews, but I'm not sure why it did not work for me. There was an "Encuentro con el público" after the screening, where the director and the main actress, the beautiful Ingrid, discussed the film with the audience; some people expressed how (positively) shocked they were by the film, and honestly I felt a bit out of place.
  • La Prisonnière de Bordeaux, Tuesday 19, 21:30, Yelmo Ocimax. When FICXixón 2022 dedicated a retrospective to Patricia Mazuy I watched three of her films, and I loved 2 of them. And I loved even more the "Encuentros con el público" with her; she was so funny and expressive. So when I found out that they were programming her latest film I could not miss it, all the more so when it stars the charming Isabelle Huppert, one of my favorite actresses. So I even took the bus to go to the Yelmo cinema located far away in the outskirts, something I'd never done before for a FIC film! I did not regret it, the film is the kind of bizarre, funny, melancholic product that you could expect from these 2 crazy women.
  • When the Light Breaks, Wednesday 20, 22:00, Teatro Jovellanos. Excellent Icelandic drama. Grieving for a loved one is even worse when you are (and he was) young, and even worse when you have to hide how broken you are cause you had a secret relation. The light sequences at the start and the end of the film are mesmerizing.
  • Que se sepa (Indarkeriaren pi(h)artzunak), Thursday 21, 22:00, Escuela de Comercio. Interesting and necessary Basque documentary about one more of the many episodes of sorrow and pain brought about by the long and bloody conflict in the Basque Country. This time ETA and the Spanish Government join forces to kill an innocent man and destroy his family.
  • Fréwaka, Saturday 22, 19:15, OCine. It was preceded by a Colombian short, "La noche del Minotauro". An Irish horror film; it was entertaining, I think, but it had so little effect on me that a year later I hardly remember anything about it.

Sunday, 9 November 2025

CGNAT

I've got rather basic networking knowledge and I've lately come across a problem/limitation that I was not aware of and that I think is increasingly common: CGNAT. With my Internet provider (Telecable) in Asturies, my FTTH router (a nice ZTE F6640) has a stable IP. I mean, it's not static, but it rarely changes (even after rebooting the router). So when I recently felt that it could be convenient to occasionally connect to one of the computers in my LAN from outside, I thought it would be feasible.

So let's say I want to be able to ssh into my RasPi 5 from downtown while I discuss with my friends about how woke ideology is destroying humanity. The DHCP server in my router is configured to provide a static IP to all significant devices in my LAN, let's say 192.168.1.5 for my RasPi 5. To make port 22 on my RasPi accessible from outside I have to configure port forwarding in my router. It's just a matter of telling the router "forward incoming connections to one of your ports (let's say 10022) to port 22 on 192.168.1.5". I'd never done it before, but it seems like something that has existed for decades and should just work. So I connected my laptop to my mobile phone hotspot, to simulate the "I'm in the outside world" thing, and tried. And tried, and tried... to no avail.

Checking some forums with similar questions involving other Internet providers in Spain I came across this fucking technology: CGNAT

Carrier-grade NAT (CGN or CGNAT), also known as large-scale NAT (LSN), is a type of network address translation (NAT) used by ISPs in IPv4 network design. With CGNAT, end sites, in particular residential networks, are configured with private network addresses that are translated to public IPv4 addresses by middlebox network address translator devices embedded in the network operator's network, permitting the sharing of small pools of public addresses among many end users. This essentially repeats the traditional customer-premises NAT function at the ISP level.

My Internet provider in Asturies continues to use IPv4 (that's not the case in France, where, to my surprise, I recently found that my provider there uses IPv6), and given that it does not have enough public IP addresses for all its customers, it's adding an extra NAT (Network Address Translation) layer.

I had got my router's public address using curl ident.me, which gave me a nice public 85.152.xxx.yyy address, but if I connect to my fiber router and check there, I see a different one: 100.102.x.y. Well, that's not a public IP, and it's an indicator that my ISP is using CGNAT, as explained here.

If it's any of the following, then your router doesn't have a public IP address:

  • 192.168.x.x
  • 10.x.x.x
  • 172.16.x.x through 172.31.x.x
  • 100.64.x.x through 100.127.x.x

The last one is usually indicative of your ISP using CGNAT.

Summing up: my laptop has a 192.168.x.x private IP address. My fiber router faces the outside world with another private IP address (100.102.x.y). Other customers in my area and I are connected to an upstream router in my ISP's network, and that one faces the outside world with the 85.152.xxx.yyy public IP that I can see with ident.me. So in order for a connection from the outside to reach my RasPi I would also have to set up port-forwarding in that upstream ISP router shared with my "neighbours". So, no way...

Well, there's another way (that I have not tried) to set this up, a sort of reverse approach. In the last year I've been using SSH tunnels to connect to some non-public servers at work through a "Bastion" work server with a public IP. With a standard SSH tunnel I basically create an SSH connection to that Bastion server telling it that any connection that goes through that "tunnel" has to be forwarded to another server. There are also reverse SSH tunnels, where I create an SSH connection to a server (a tunnel) telling that server that any connections it receives on a certain port have to be forwarded to "me" through that tunnel, to a certain port on my machine. So if you have a server on the internet (Azure, AWS...) you could use it to create a reverse SSH tunnel from your PC located behind CGNAT, as sketched below. All this is explained for example here.
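
Just to sketch the idea (hostnames and ports here are hypothetical):

# on the machine behind CGNAT (e.g. the RasPi): keep a reverse tunnel open
# to a server with a public IP
ssh -N -R 10022:localhost:22 user@vps.example.com

# from anywhere: log into that server, then hop back through the tunnel
ssh user@vps.example.com
ssh -p 10022 pi_user@localhost   # run on the server; it lands on the Pi's port 22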

Tuesday, 28 October 2025

SqlAlchemy Registry

After talking about Persistence Ignorance and mapping styles in SqlAlchemy in my previous post, it's time now to take a look at an interesting technique used by the Declarative mapping. Whatever mapping style you use, SqlAlchemy relies on a registry where the information that maps entities to tables, properties to columns, relations, etc. is stored. When using the Imperative mapping you work directly with that registry (sqlalchemy.orm.registry):


import dataclasses

from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.orm import registry

@dataclasses.dataclass
class Post:
    title: str
    content: str

metadata = MetaData()
mapper_registry = registry(metadata=metadata)

table_post = Table(
    "Posts",
    metadata,
    Column("PostId", Integer, primary_key=True),
    Column("Title", String),
    Column("Content", String),
)

mapper_registry.map_imperatively(
    Post,
    table_post,
    properties={
        # "post_id": table_post.c.PostId,
        "title": table_post.c.Title,
        "content": table_post.c.Content,
    },
)

But when using the Declarative mapping you're hardly aware of that registry, as you normally don't interact with it at all; notice though that you still have access to it through the Base class.


from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class Post(Base):
    __tablename__ = "Posts"
    post_id: Mapped[int] = mapped_column("PostId", primary_key=True, autoincrement=True)
    title: Mapped[str] = mapped_column("Title")
    content: Mapped[str] = mapped_column("Content")

# we have not directly used the registry at all in the above code,
# but it's still there, accessible through the Base class:
print(f"{Base.registry=}")
# Base.registry=<sqlalchemy.orm.decl_api.registry object at 0x...>
    

So how does the registry get set? Well, your entities get registered in that registry by leveraging the inheritance and metaclasses machinery to obtain a behaviour similar to the Ruby inherited hook. Remember that I already talked in a previous post about simulating another Ruby metaprogramming hook, method_added, by means of metaclasses. We can use metaclasses to execute some action each time a class based on that metaclass is created (putting the code to execute in the __new__ or __init__ methods of the metaclass). In our case, we want to execute code that adds each model class to the registry. For that, the Base class that we define for our entities must have DeclarativeMeta as its metaclass. We can do this by directly setting the metaclass and the registry instance ourselves:


from sqlalchemy.orm import DeclarativeMeta, registry

mapper_registry = registry()

class Base(metaclass=DeclarativeMeta):
    registry = mapper_registry


Or by inheriting from DeclarativeBase (which already has DeclarativeMeta as its metaclass). DeclarativeBase will also take care of setting the registry in our Base class.



class Base(DeclarativeBase):
    pass


We can take a look at the DeclarativeMeta code to see how it works its magic:


class DeclarativeMeta(DeclarativeAttributeIntercept):
    metadata: MetaData
    registry: RegistryType

    def __init__(
        cls, classname: Any, bases: Any, dict_: Any, **kw: Any
    ) -> None:
        # use cls.__dict__, which can be modified by an
        # __init_subclass__() method (#7900)
        dict_ = cls.__dict__

        # early-consume registry from the initial declarative base,
        # assign privately to not conflict with subclass attributes named
        # "registry"
        reg = getattr(cls, "_sa_registry", None)
        if reg is None:
            reg = dict_.get("registry", None)
            if not isinstance(reg, registry):
                raise exc.InvalidRequestError(
                    "Declarative base class has no 'registry' attribute, "
                    "or registry is not a sqlalchemy.orm.registry() object"
                )
            else:
                cls._sa_registry = reg

        if not cls.__dict__.get("__abstract__", False):
            _ORMClassConfigurator._as_declarative(reg, cls, dict_)
        type.__init__(cls, classname, bases, dict_)
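
To see the trick in isolation, here's a minimal sketch (my own code, not SQLAlchemy's) of a metaclass that registers every subclass at creation time, much like Ruby's inherited hook:

my_registry = []

class RegisteringMeta(type):
    def __init__(cls, name, bases, namespace, **kw):
        super().__init__(name, bases, namespace)
        # skip the Base class itself (it has no bases), register every subclass
        if bases:
            my_registry.append(cls)

class Base(metaclass=RegisteringMeta):
    pass

class Post(Base):  # merely defining the class registers it
    pass

print(my_registry)  # [<class '__main__.Post'>]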


I'm involved in some projects where the database is not a critical element. We don't retrieve data from it, we just use it as an additional storage for our results, but the main store for those results is json/csv files. This means that if the database is down the application should run anyway. So it's important for me to be clear about which operations involve database access (and hence an error if the DB is not accessible), and also about when the model mapping will throw an error if the mapping is incorrect. Let's see:

  • Adding classes to the registry (either explicitly with the imperative mapping or implicitly with the declarative one) does not perform any check with the database (so if the DB is down or there's something wrong in our mapping, like wrong tables or columns, we won't find it until later).
  • Creating a SqlAlchemy engine does not perform any connection to the DB either (see the snippet after this list).
  • Creating a Session does not connect to the database by itself either; a connection is acquired lazily the first time the session needs one (first query or flush), and in any case no model verification is performed.
  • Adding objects to a Session won't check the model until the moment when you do a flush or a commit (that indirectly performs a flush).
  • Performing a select Query through a Session will obviously generate an error if any of the mappings for the tables involved in the query is wrong.
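
As a quick illustration of the engine point (the connection string is made up):

from sqlalchemy import create_engine

# this succeeds even if nothing is listening at that address...
engine = create_engine("postgresql+psycopg2://user:pwd@localhost:5432/mydb")
# ...it's the first real use that fails if the DB is down:
# engine.connect()  # would raise OperationalError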



Friday, 24 October 2025

Persistence Ignorance

I've used SqlAlchemy in some projects (basic use, projects where the database is just one of multiple datasources), and until recently I'd been sticking to the Imperative mapping style. I grew up as a developer with Persistence Ignorance (PI) as a guiding principle (keep your domain model free from infrastructure concerns like database access, so it remains clean, testable, and focused on business logic), so that was the natural thing for me, and I was really surprised to see that SqlAlchemy recommends using the Declarative mapping style, where the entities are totally aware of the specific persistence mechanism. .NET Entity Framework and NHibernate do a good job in allowing us to have entities that are "almost" persistence ignorant. I say "almost" cause if you check this list of things that go against Persistence Ignorance, you'll recognize some Entity Framework requirements, like the parameterless constructor and using virtual properties for lazy loaded relations. You can have all the additional constructors that make sense for your entities; EF just needs the parameterless one, as it will initialize your entities by calling it and then setting properties as needed. As for the virtual properties, EF implements lazy-loading by means of proxy classes. If you have a Country entity with a lazy-loaded navigation property cities, EF will create a proxy class that inherits from Country and overrides the cities property, implementing there the lazy-loading logic.

Using the Imperative mapping in SqlAlchemy gives you even more freedom. Your entities can have any constructor, as SqlAlchemy leverages Python's __new__ and __init__ separation: it does not invoke __init__ to initialize the entities, but sets attributes one by one. Then the dynamic nature of the language means that you don't have to mark in any special way the properties corresponding to lazy loaded relationships, and it does not need to resort to proxy classes to implement lazy loading, as it leverages Python's dynamism and attribute lookup logic. I think that for each lazy relation in an entity a descriptor is added to the class. When you first try to access the corresponding attribute, the lookup will reach the descriptor, which will perform the corresponding query and set the result in an attribute of the instance, so that the next time you access the relation the values will be retrieved from the instance. I guess this is more or less related to what I discuss here.
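
A hedged, minimal sketch of that descriptor idea (this is not SQLAlchemy's actual implementation):

class LazyRelation:
    def __init__(self, name, loader):
        self.name = name      # attribute name, e.g. "cities"
        self.loader = loader  # callable that would run the query

    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self.loader(instance)
        # cache in the instance's __dict__; as this is a non-data descriptor,
        # the next lookups will find the instance attribute first and skip us
        instance.__dict__[self.name] = value
        return value

class Country:
    cities = LazyRelation("cities", lambda self: ["Xixon", "Uvieu"])

country = Country()
print(country.cities)  # first access runs the loader
print(country.cities)  # now served straight from country.__dict__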

All this said, we should also note that (as explained here) there's still some "persistence leakage" into your entities when using the Imperative mapping. While you define your entity classes fully unaware of the persistence, SqlAlchemy (when adding them to the registry) makes them aware of the persistence mechanism by adding different attributes at the class level and at the instance level, for example attributes like _sa_instance_state or _sa_lazy_loader (these are part of SQLAlchemy's internal machinery to track state and identity, manage lazy loading and relationship resolution, and hook into attribute access dynamically). So your entities become bloated with extra attributes that you don't use on your own, and if you serialize them to json or whatever, they'll show up.

In the end I've ended up having separate Model entities (that use the Declarative mapping) and Domain entities (that know nothing about the database), plus mapper classes/functions that map Model entities to Domain entities and viceversa. This gives you almost full PI. I say almost cause you still end up with table IDs leaking into your Domain entities, but this is a more than acceptable compromise. Anyway, you could still get rid of that by declaring your Domain entities without the ID (a Country class) but declaring additional child entities (a CountryIdAware class) that incorporate the ID. Your Model to Domain mappers will indeed create CountryIdAware instances that will be passed to your Domain, but the Domain will be aware of them just as Country instances, it won't see the ID attribute.

Sunday, 12 October 2025

Truffle Bytecode DSL

I have a fascination with Graal and the Truffle interpreters framework, though it's all from a theoretical standpoint, as I've never built an interpreter myself. The thing is that I've recently found out about a new addition to Truffle, the Bytecode DSL. This means that Truffle now supports the 2 main types of interpreters: Tree Parsing (AST) interpreters and bytecode interpreters.

I found this a bit odd at first, as it was not clear to me how to reconcile it with what I understood as the main superpowers of Truffle. The "traditional" approach in Truffle is writing Tree Parsing Interpreters (AST interpreters). Summarizing what I explain in some of my previous posts: the nodes in this tree correspond to Java methods (Java bytecode) that the interpreter invokes. These nodes can get specialized into more specific nodes thanks to profiling, and then, when a guest language method is hot, the Java bytecode for the nodes making up that method is sent to Graal for compilation to native code (this is the Partial Evaluation part). The equivalent of specializing AST nodes also exists for the bytecode case: those bytecodes can be specialized/quickened in a way very similar to what the Python Adaptive Specializing Interpreter does. But for the compilation part, if with the Bytecode DSL we no longer have a tree made up of nodes, are we missing the Partial Evaluation magic?

No, we are not missing anything. For each guest language bytecode of the program we are executing we'll have a Java method that executes it. When a guest language method is hot, the Java methods for the bytecodes making up that method will be sent to Graal for compilation, the same thing we do with AST nodes.

Confirming this with a GPT has given me a better understanding of Partial Evaluation and its optimizations. When a method is hot and its nodes (or guest language bytecodes) are sent for compilation, Truffle can decide to send only part of the method, not the whole method (path-specific compilation). When Truffle does partial evaluation, it traces the actual execution path that was taken during profiling. This means that if we have an if-else and the profiling shows that the condition is always true, it will only send the "if part" for compilation. Of course it adds guards so that if the assumptions taken become false it can deoptimize the code (transfer back to the interpreter).

There's an additional element in how Truffle achieves such excellent performance: inlining (both for AST and bytecode interpreters). When Truffle sends the Java methods for the nodes or bytecodes of a method (or of part of a method, based on optimizations) for compilation, it will also send the methods called from that method, and will inline them in the generated native code.

A common taxonomy of JIT compilers is method-based JITs (Graal, HotSpot, .NET...) vs tracing JITs (LuaJIT, the older TraceMonkey).

Method-Based JITs (like Graal/Truffle, HotSpot, .NET)

  • Compilation unit: entire method
  • When a method becomes hot, compile it
  • Can still do path specialization within that method
  • Inlining: pulls called methods into the caller during compilation

Tracing JITs (like LuaJIT, older TraceMonkey)

  • Compilation unit: hot trace across multiple methods
  • Traces execution through method calls, loops, returns
  • The "trace" might start in methodA(), call into methodB(), and return - all one compiled unit
  • More aggressive cross-method optimization

The interesting thing is that while Graal is primarily method-based, with very aggressive inlining it can achieve trace-like behavior.

Pure tracing JITs can cross method boundaries more naturally, but modern method-based JITs like Graal blur this distinction through aggressive inlining. The end result can be quite similar, just with different conceptual models!

Example:

Tracing JIT: hot loop detected spanning multiple methods

  • Record exact execution path
  • Compile: loop_header → methodA → methodB → loop_back
  • One flat piece of native code

Graal: methodA is hot

  • Compile methodA
  • Inline methodB call
  • Inline methodC call
  • Result looks similar but structured around methodA

Saturday, 4 October 2025

Python venvs

In the past I used to install Python modules globally, but for quite a while now I've been careful to use separate virtual environments (venvs) for all my projects. I guess anyone doing any non-basic Python work will be familiar with venvs, so I'm not going to explain here what a venv is, but to provide some information that, though pretty simple, feels useful and interesting to me.

We create a new virtual environment with: python -m venv .venv. Notice that .venv is just the name of the folder where the venv will be created, we can use any name, but .venv (or env or venv) is a sort of convention. Then we activate the venv with: source .venv/bin/activate (or .venv\Scripts\activate in Windows).

The Python version that we use when creating the virtual environment is the one the venv will use. That means that if we have several Python versions (the system one, let's say 3.12, and several altinstalls, let's say 3.11 and 3.13), and we create the venv using 3.13 (python3.13 -m venv .venv), when we activate the venv it will use the python3.13 altinstall, regardless of whether we type python, python3 or python3.13.

That's so because inside the venv (.venv/bin) we have these symlinks:
python -> python3.13
python3 -> python3.13
python3.13 -> /usr/local/bin/python3.13

If we want to launch a python script in a certain venv (I mean in one go, not the typical thing of opening a terminal, activating the venv in that terminal and then launching the python script), we can just put this in a launcher.sh script:
source /path/to/.venv/bin/activate && python /path/to/script.py
This will activate the venv in the bash process that runs the script, and hence the python invocation will be done with the python pointed to by the venv.

There's a more direct approach that I was not aware of until recently. We don't need to activate the venv, we can just type this:
/path/to/.venv/bin/python /path/to/script.py

All this works because the venv mechanism is implemented by Python itself; it's not a third party addition. When we activate a venv with source .venv/bin/activate, what mainly happens is that the path to .venv/bin gets prepended to our PATH variable, that's all. That way we'll reach those symlinks that we've seen, which point to the Python installation used during the venv creation. So if in the end we're just running that global Python installation, how is it that it will find the packages locally installed in .venv/lib/python3.13/site-packages?

Well, that's because at startup Python checks whether a pyvenv.cfg file exists in a path relative to the path used for launching the interpreter (so in this case the path to that symlink). I guess it gets the path used for launching it by checking argv[0]. If that file (.venv/pyvenv.cfg) exists, Python will use it as follows (a sample of the file itself is shown after the list):

  • It adjusts sys.path to point to the venv's lib/python3.13/site-packages
  • It sets sys.prefix and sys.exec_prefix to the venv directory
  • It avoids loading global site-packages (unless configured to do so)
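
A typical pyvenv.cfg looks roughly like this (paths are illustrative; the executable and command keys appear only in recent Python versions):

home = /usr/local/bin
include-system-site-packages = false
version = 3.13.0
executable = /usr/local/bin/python3.13
command = /usr/local/bin/python3.13 -m venv /myProjects/my_app/.venv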

With regard to installing packages with pip in a venv, we have to notice that pip is both a bootstrap python script and a python module. When we create a venv, 3 pip scripts are created in .venv/bin:


pip
pip3
pip3.13

Each of them is a python script with a shebang pointing to the python version used during the venv creation. They look like this:


$ cd .venv/bin 
$ more pip
#!/myProjects/my_app/.venv/bin/python3.13
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())

And a pip module is installed inside the site-packages of that venv (e.g. .venv/lib/python3.13/site-packages/pip). So when we run any of those pip scripts in the venv, they load the python version that was used when creating the venv, and that python version will see the pyvenv.cfg file, prepend the venv's site-packages to sys.path, and that way load the pip module in the venv's site-packages.

Thursday, 25 September 2025

Kotlin Unit

When I wrote this post and mentioned the Kotlin use function for Automatic Resource Management I had some doubts about the signature. It's a generic function that returns R:


inline fun <T : Closeable?, R> T.use(block: (T) -> R): R

which fits so nicely in the more functional style code I was discussing in that post. But what if we want to do something on that closeable resource without returning any value? Well, the thing is that the function signature is also valid for that case, as in Kotlin what I mean by "not returning anything" (I'll be careful not to say "returning nothing", as in Kotlin Nothing has a particular meaning) means indeed returning Unit, a singleton object/class descending from Any, so it perfectly fits the R generic type.

Having said this, it feels interesting to me to review how this "not returning anything" works in different languages and compare it to the Kotlin approach and its advanced type system.

In Java and C#, functions that do not return anything are marked as returning void. void is indeed a keyword representing "no return value"; it's not part of the normal type hierarchy.

In Python, a function that does not have an explicit return statement, or just does "return" without providing a value, is indeed returning None (the single instance of the NoneType singleton class). If using type hints, mypy will consider it correct that a function declared as returning Any returns None (as None is just a normal object). In JavaScript, the lack of an explicit return statement (or a simple "return;") will return the undefined value (JavaScript has this messy distinction between undefined and null that feels more accidental than intentional). For dynamic languages where type checking has come as an afterthought (either directly in Python via typing, or indirectly in JavaScript via its TypeScript "friend") this seems OK, but in static languages things have to be stricter.

Kotlin elegantly establishes a subtle difference between functions that can return a value or the absence of that value, and functions that never return anything useful. For the former we use a nullable type as the return type: a getUser(id) function returns User? cause it returns a User if the id exists or null otherwise. On the other hand, a writeLog() function returns Unit because it never returns anything useful. In Python we don't have that subtle difference, as both write_log and get_user (for a missing id) return None (see the sketch after this list). With this, we can say that:

  • Kotlin uses Unit for "no meaningful value" and null for "the absence of an expected value"
  • Python lacks the semantic distinction between "no meaningful value" vs "absent value"
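
A small Python sketch of that missing distinction:

from typing import Optional

def write_log(msg: str) -> None:  # "no meaningful value" (Kotlin's Unit)
    print(msg)

def get_user(user_id: int) -> Optional[str]:  # "maybe absent value" (Kotlin's User?)
    users = {1: "ana"}
    return users.get(user_id)

print(write_log("hi"))  # None
print(get_user(99))     # also None: both meanings collapse into one value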

As I said at the start of this post, I've been saying "does not return anything" (I should better say "does not return anything useful") rather than "returns nothing" because, as I mention in this post, Nothing has a special meaning in Kotlin. It's a bottom type, used to express that a function does not return (it throws an exception, it's an infinite loop...). There's a good discussion here. I think Nothing is a confusing name; I prefer the naming used in Python typing for the same concept: Never and NoReturn (both types are equivalent).

I was wondering, what if we want to declare a Kotlin function that always returns null? Well, probably that does not make much sense and we should just declare it as returning Unit. Googling around I found this discussion where they propose returning Nothing?. They support the idea by copy-pasting a fragment of the documentation that no longer seems to exist in the current documentation, so maybe the Kotlin designers just discarded that notion.

As a conclusion: in languages like Kotlin, Python or JavaScript we should not think about functions that do not return anything (save for functions that crash). All functions that finish normally return something, but that something can be a non-usable value (Unit) or an absent value (null, None, undefined).

Sunday, 14 September 2025

Python Currying

Last year I wrote a post about function currying in JavaScript that corrected the (absolutely) wrong implementation I had written some years before. Even that corrected implementation did not feel particularly easy to understand to me: I was using three functions, curry, one for saving args, and the curried function itself... I've recently implemented currying in Python, without looking into the JavaScript version, and it's interesting how thinking in Python terms has helped me think better about a JavaScript implementation.

Currying a function means ending up with an invokable/callable object that references the original function and stores the provided parameters until all of them have been provided. In each "incomplete invocation" it has to return another invokable object trapping the expanded list of parameters. In Python, an "invokable/callable object with state" is either a closure or an instance of a callable class (well, indeed normal functions are also instances of callables). And the creator of that invokable object is either a closure factory or a callable class. For my implementation I've used a callable class rather than a closure; somehow this time it felt more intuitive:


import functools
import inspect
from typing import Any

class Curried:
    def __init__(self, fn, args: list[Any] | None = None):
        # in the initial call to create the initial curried function args is None
        self.fn = fn
        self.saved_args = args or []
        self.expected_args_len = len(inspect.signature(self.fn).parameters)
        # notice how for class based decorators we have to use update_wrapper here, rather than wraps
        functools.update_wrapper(self, fn)

    def __call__(self, *args):
        current_args = [*self.saved_args, *args]
        if len(current_args) > self.expected_args_len:
            raise Exception("too many arguments!!!")
        if len(current_args) == self.expected_args_len:
            return self.fn(*current_args)
        else:
            return Curried(self.fn, current_args)

# alias for better semantics when used as decorator        
curry = Curried


def test_curried(): 
    def format_city(planet: str, continent: str, country: str, region: str, city: str) -> str:
        """I'm the format_city docstring"""       
        return f"{planet}.{continent}.{country}.{region}_{city}"

    curried_format_city = Curried(format_city)
    
    # or if used as decorator:
    # @curry
    # def format_city(planet: str, continent: str, country: str, region: str, city: str) -> str:
    #     """I'm the format_city docstring"""       
    #     return f"{planet}.{continent}.{country}.{region}_{city}"

    print(curried_format_city("Earth")("Europe", "Spain")("Asturies", "Xixon"))

    format1 = curried_format_city("Earth", "Europe")
    format2 = curried_format_city("Earth", "Asia")
    # update_wrapper works nicely
    print(f"{format1.__name__=}, {format1.__doc__=}, ")

    print(format1("Spain")("Asturies", "Xixon"))
    print(format1("France")("Ile de France", "Paris"))

    print(format2("China")("Beijing", "Beijing"))
    print(format2("China")("Guandong", "Shenzen"))
    format3 = format2("Russia")("Northwestern")
    print(f"{format3.__name__=}, {format3.__doc__=}, ")
    print(format3("Saint Petersburg"))
    
# Earth.Europe.Spain.Asturies_Xixon
# format1.__name__='format_city', format1.__doc__="I'm the format_city docstring", 
# Earth.Europe.Spain.Asturies_Xixon
# Earth.Europe.France.Ile de France_Paris
# Earth.Asia.China.Beijing_Beijing
# Earth.Asia.China.Guandong_Shenzen
# format3.__name__='format_city', format3.__doc__="I'm the format_city docstring", 
# Earth.Asia.Russia.Northwestern_Saint Petersburg

As you see, a curried function is a callable object, an instance of the Curried class (that has a __call__ method). As I said, the curried function is equivalent to a closure, and the Curried class is equivalent to a closure factory. Notice that as I'm creating a callable object rather than a standard function I'm using functools.update_wrapper (rather than functools.wraps) to set the original __name__, __doc__, etc in the curried callable. I can invoke it directly (Curried(fn)) or use it as a decorator at function definition time.

In my previous JavaScript implementation I had 3 elements: a curry function, a saveArgs function and the closure itself. That's why it felt a bit strange to me. Following the Python implementation, I only need 2 elements, the closure factory and the closure. So here goes my new JavaScript implementation:


let curry = function createCurriedFn(fn, args) {
    // createCurriedFn is a closure factory, creates a closure that traps original fn and parameters
    let savedArgs = args ?? [];
    // return the curriedFn/closure
    return (...args) => {
        const curArgs = [...savedArgs, ...args];
        return curArgs.length >= fn.length 
            ? fn(...curArgs)
            : createCurriedFn(fn, curArgs);
    };
}

// formatMessages is assumed here to be a 3-parameter function like this one:
const formatMessages = (a, b, c) => [a, b, c].join("-");

let curriedFormat = curry(formatMessages);
curriedFormat("a")("b")("c");
curriedFormat("d")("e")("f");
curriedFormat("g", "h")("i");
curriedFormat("j", "k", "l");

// a-b-c
// d-e-f
// g-h-i
// j-k-l

As we know, Python features named parameters (contrary to JavaScript), so we should account for that in our curry function. This is the improved version, which does just that:


import functools
import inspect
from typing import Any

class Curried:
    def __init__(self, fn, args: list[Any] | None = None, kwargs: dict[str, Any] | None = None):
        # in the initial call to create the initial curried function args is None
        self.fn = fn
        self.saved_args = args or []
        self.saved_kwargs = kwargs or {}
        self.expected_args_len = len(inspect.signature(self.fn).parameters)
        # notice how for class based decorators we have to use update_wrapper here, rather than wraps
        functools.update_wrapper(self, fn)

    def __call__(self, *args, **kwargs):
        current_args = [*self.saved_args, *args]
        current_kwargs = {**self.saved_kwargs, **kwargs}
        
        if (cur_len := (len(current_args) + len(current_kwargs))) > self.expected_args_len:
            raise Exception("too many arguments!!!")
        if cur_len == self.expected_args_len:
            return self.fn(*current_args, **current_kwargs)
        else:
            #return wraps(self.fn)(Curried(self.fn, cur_arguments))
            return Curried(self.fn, current_args, current_kwargs)

# alias for better semantics when used as decorator        
curry = Curried


def test_curried(): 
    def format_city(planet: str, continent: str, country: str, region: str, city: str) -> str:
        """I'm the format_city docstring"""       
        return f"{planet}.{continent}.{country}.{region}_{city}"

    curried_format_city = Curried(format_city)
    #print(curried_format_city.__name__)
    print(curried_format_city("Earth")("Europe", "Spain")("Asturies", "Xixon"))

    format1 = curried_format_city("Earth", "Europe")
    format2 = curried_format_city("Earth", "Asia")
    # update_wrapper works nicely
    print(f"{format1.__name__=}, {format1.__doc__=}, ")

    print(format1("Spain")(region="Asturies", city="Xixon"))
    print(format1("France")("Ile de France", city="Paris"))

    print(format2(country="Chinaaa")(country="China")(city="Guangzhou", region="Guangdong"))
    print(format2("China")("Guandong", "Shenzen"))

    print(format2("China")("Guangdong", "Guangzhou"))
    print(format2("China")("Guandong", city="Shenzen"))
    format3 = format2("Russia")("Northwestern")
    print(f"{format3.__name__=}, {format3.__doc__=}, ")
    print(format3("Saint Petersburg"))


test_curried()
print("----------------------")

# Earth.Europe.Spain.Asturies_Xixon
# format1.__name__='format_city', format1.__doc__="I'm the format_city docstring", 
# Earth.Europe.Spain.Asturies_Xixon
# Earth.Europe.France.Ile de France_Paris
# Earth.Asia.China.Guangdong_Guangzhou
# Earth.Asia.China.Guandong_Shenzen
# Earth.Asia.China.Guangdong_Guangzhou
# Earth.Asia.China.Guandong_Shenzen
# format3.__name__='format_city', format3.__doc__="I'm the format_city docstring", 
# Earth.Asia.Russia.Northwestern_Saint Petersburg


Notice that, contrary to what happens with standard functions, with the curried functions created by this implementation we can pass unnamed parameters after named ones, though the unnamed ones have to be provided in the same order as in the original function. Same as with functools.partial, we can provide the same named parameter multiple times; each newly provided value overwrites the previous one.

Wednesday, 10 September 2025

Dans les Brumes de Capelans

I have to sadly admit that I'm not a great reader (I'm talking about literature; as for programming/technical stuff, political crap, history and so on, I read tons of stuff). Just a few books per year (these last years a bit more, hopefully). Years ago (between 2009 and 2013 mainly) I was very much into Nordic noir. I started with Stieg Larsson and continued with Asa Larsson (my favorite), Camilla Lackberg and Jo Nesbo. In Asa Larsson and Camilla Lackberg I deeply appreciated the "darkness" of many of the characters, that sadness, those difficult existences... In recent years I've moved back into crime/thriller/police books, but this time into what I would call "French blood noir", that is, crime-police-dark thriller novels where the crimes are particularly bloody, violent, evil (involving torture, some sort of ritual, mutilations, BDSM...). It's what in Les Rivieres Pourpres (the series) they call "crimes de sang". By the way, that series is really good, particularly seasons 1 and 2 (seasons 3 and 4 felt a bit weaker to me, but they also have some excellent chapters).

Since 2022 I've been reading the Sharko and Lucie Henebelle stories by Franck Thilliez. I cannot recommend them enough. The crimes are horrible, bloody, sick, conducted by lonely psychopaths, organised elitist groups, pseudo-vampires... there are secret clubs that remind me of the 28 mms film, but above all I've come to love the main characters, particularly Sharko, and Nicolas Bellanger, whose nightmarish existence has become more and more important in the last books. They live a painful life, they fall, get up, fall again, overcome all sorts of crap that leaves such deep scars... I should write several posts about them, but this one is not intended for that, but for a book from a different author, Dans les Brumes de Capelans by Olivier Norek.

I had previously read "Trilogy 93", which follows the misadventures of police captain Victor Costa and his team tracking criminals in Seine-Saint-Denis. Pretty good, but it's more "standard crime-police literature" than the aforementioned "crimes de sang" stuff. In his real life Norek worked as a policeman in Seine-Saint-Denis, so one can imagine that there's much reality poured into those novels. "Dans les brumes de Capelans" is quite a different beast, much more of a dark thriller, of a "crimes de sang" story. Several years after the tragic end of the trilogy, Costa has managed to survive by running away from Paris and his previous life, living a lonely existence in such a secluded place as Saint Pierre et Miquelon, where he works for the Witness Protection Service, managing a house by a cliff where he receives guests who have to remain hidden until they are provided with a new identity. We could say that Costa wants to remain as hidden as his guests.

This time he receives a young woman, Anna, the only survivor of a maniac who has been seizing, torturing and murdering young girls for more than a decade, but who decided to keep her alive "for some reason". The girl is fucked up, Costa is fucked up, and strong bonds get woven between these 2 broken souls, 2 partners in pain and desolation. There are some very beautiful cathartic moments, there's the maniac killer who resurfaces, and there are many, many surprises. There's another interesting character, the policeman who dealt with Anna's and the other girls' disappearances and with Anna's liberation. He appears only in the first and last chapters, playing an important role. Setting the story on this mysterious island adds darkness and loneliness to the story, a hard place for hard people.

I won't tell you more, go for the book and enjoy it.

Thursday, 4 September 2025

Automatic Resource Management as Expression

As happened a few weeks ago, going through the Python Ideas forum has introduced me to another interesting idea. As usual, someone proposes a syntax for a feature, and as it's clear that no syntax changes will be made to provide it, people come up with interesting workarounds.

So someone proposed allowing the use of with, the syntax for Automatic Resource Management with context managers, as an expression (so having a with expression along with the existing with statement). He was proposing something like this:
txt = do_something(json.load(f) with open('foo.json') as f)
That indeed reminds me of the syntax outlined in the rejected PEP for exception-catching expressions (aka try-expressions):
msg = (parse(txt) except ParsingError: None)

As I said, it's obvious that, given how reluctant the Python leaders are to any syntax change, neither of those ideas will ever make it into the language. The good thing is that, same as we can easily define a do_try function like the one we saw in this previous post, we can also define a using/do_with function, like this (taken from the discussion thread):


#def with_do(mgr, fn): 
def using(mgr, fn):
    with mgr as res:
        return fn(res)

# or maybe this is more semantic?
def do_with(fn, mgr):
    with mgr as res:
        return fn(res)

#config = tomllib.load(with open("file.toml", "rb") as f: f)
config = using(open("file.toml", "rb"), tomllib.load)
config = do_with(tomllib.load, open("file.toml", "rb"))

#data = with open("file.txt", "r") as f: f.read()
data = using(open("file.txt", "r"), lambda f: f.read())
data = do_with(lambda f: f.read(), open("file.txt", "r"))

All the above examples are pretty contrived, as "with open() as" can be replaced by pathlib.Path.read_text, which takes care of opening and closing the file. I mean:


config = tomllib.loads(pathlib.Path('file.toml').read_text())

But there are other context manager use cases for which this kind of function would come in handy:
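
For instance (an illustrative case, any context manager works, not just files):

import pathlib
import tempfile

# compute something inside a temporary directory that gets cleaned up right after
n_entries = using(tempfile.TemporaryDirectory(),
                  lambda d: len(list(pathlib.Path(d).iterdir())))
print(n_entries)  # 0, the fresh directory is empty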

The other Automatic Resource Management (ARM) mechanisms I'm familiar with are the C# using statement with IDisposables and the Java try-with-resources statement with Closeables. So in Python, C# and Java, ARM is provided via statements, not expressions; that's why I think I had never thought of using it as an expression. Given that in Kotlin almost everything is an expression, it's easy to imagine that they have taken this into account. Kotlin does not have a specific syntax construct for ARM, as given its rich and expressive syntax it can be nicely implemented with an extension function of the Closeable interface, use. It executes the given block function on this resource (a Closeable object) and then closes it down correctly whether an exception is thrown or not:


inline fun <T : Closeable?, R> T.use(block: (T) -> R): R

As you can see in the signature, the block returns a value R, which in turn is returned by use. So what if we just want to execute a block that does not return anything? Well, that signature is also valid. In Kotlin, a function that does not return anything does indeed return Unit (a singleton class), so when passing to use a block that does not return anything, the generic type R becomes Unit, and everything is perfectly valid.

Monday, 25 August 2025

Python Adaptive Specializing Interpreter

It's clear that I have a fascination with the interaction between interpreters and JITs, optimizations, deoptimizations and so on. I've written multiple posts about that, this one being the most recent. I recently came across this interesting thesis, whose introduction mentions:

For both variants, we can quicken certain operations. Quickening entails replacing one operation with another, which usually handles a subset of possible values or a more restrictive set of preconditions but performs the operation quicker. In tree-walk interpreters, this is performed by node replacement, and in bytecode-based interpreters, by changing the instruction stream, replacing one instruction with another. In both cases, the target of the dispatch changes. How this is implemented in Operation DSL is detailed in section 3.7.

The "replacing one instruction with another" suddenly reminded me of something I had read some months ago regarding Python performance improvements but that I had forgotten to dive into and had indeed forgotten. I'm talking about Adaptive Specializing Interpreter, that is something pretty surprising to me. I've talked in my previous posts about interpreters that find hotspots in your code and send those hot methods to a fast JIT compiler to turn them into native code. Then that native code continues to be monitored and if it's hot enough it's sent to a more aggressive JIT compiler that spends more time in producing a more optimized native code. Then we have cases where the code has to be deoptimized, returning the function to its interpreted form, and the cycle starts again. But the idea of monitoring the code to find specific bytecode instructions (opcodes) that can be replaced by an optimized (specialized/quickened) version of that bytecode instruction is something that was pretty new to me

As of today (Python 3.13) the main Python environment, CPython, does not come with a JIT compiler enabled (an experimental one exists since 3.13, but you have to build CPython with a special flag to get it). CPython just uses a bytecode interpreter (historical note: the move from the initial tree-walk interpreter to a bytecode interpreter happened between Python 0.9.x and Python 1.0, likely around 1992–1993 during the pre-1.0 development phase). Python 3.11 implemented PEP-659 - Specializing Adaptive Interpreter as part of the Faster CPython project. It introduced bytecode instructions that are generic (for example BINARY_OP, the one used for adding 2 items with +) and that, if a constant execution pattern is found, will be replaced (specialized/quickened) by a specialized version ('BINARY_OP_ADD_FLOAT', 'BINARY_OP_ADD_INT', 'BINARY_OP_ADD_UNICODE'). If that pattern changes, the instruction will be replaced by the initial, generic bytecode. This discussion has some interesting information.

You probably know that (as I mention in this post) Python compiles functions to code objects, and then each function object points to a code object (via its __code__ attribute). A code object has a co_code attribute pointing to a bytes object containing the bytecodes.
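
We can poke at this very easily (a tiny sketch of my own):

def add(a, b):
    return a + b

print(type(add.__code__))           # <class 'code'>
print(type(add.__code__.co_code))   # <class 'bytes'>
print(add.__code__.co_code[:8])     # the first raw bytecode bytes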

bytes objects are immutable, so the bytecode specialization has to happen in another structure. I could not find much information about this, so ChatGPT came to the rescue. So yes, there's an additional structure that contains a mutable copy of the bytecodes. It's from that structure that the interpreter reads the bytecodes to execute for a given function, applying adaptations/specializations/quickening as it sees fit.

  • The co_code itself remains immutable and is not rewritten at runtime. It continues to contain the canonical, "baseline" bytecode sequence as emitted by the compiler.
  • When a code object is executed, CPython creates an internal _PyCodeRuntime structure (not exposed to Python), which contains a mutable copy of the bytecode in a field called co_firstinstr (technically in co_warm → co_warm.instructions).
  • That runtime bytecode buffer is where the interpreter patches in "quickened" instructions (specialized opcodes). For example, a generic BINARY_OP might be replaced at runtime with BINARY_OP_ADD_INT if it sees enough hot integer additions.

That mutable copy of the bytecodes seems to be referenced from the code object via a private _co_code_adaptive attribute (but this is an internal, undocumented detail that can change from version to version). Python allows us to very easily check the bytecodes of a given function by using the standard dis module: dis.dis(my_function). By default dis.dis shows the immutable bytecodes in co_code, but since Python 3.12 we can use the adaptive=True flag to see the adapted/quickened instructions. This is pretty amazing, because we can so easily see how a function's bytecodes evolve over time!


import dis

def f(a, b):
    return a + b

print("before warming up")
dis.dis(f, adaptive=True)  # Only in 3.12+
# BINARY_OP                0 (+)

# Warm it up
for _ in range(10_000):
    f(1, 2)

# Disassemble with quickening shown
print("after warming up with ints")
dis.dis(f, adaptive=True)
#BINARY_OP_ADD_INT        0 (+)

# now let's try to break the quickening by passing strings rather than ints
print("first call with strings")
f("a", "b")
dis.dis(f, adaptive=True)
# it's still quickened
#BINARY_OP_ADD_INT        0 (+)

print("second call with strings")
f("c", "b")
dis.dis(f, adaptive=True)
# it's still quickened
#BINARY_OP_ADD_INT        0 (+)

print("Warm it up again, this time with strings")
for _ in range(10_000):
    f("a", "b")

print("after warming up again")
dis.dis(f, adaptive=True)
# BINARY_OP_ADD_UNICODE    0 (+)

So initially the bytecode for an addition of 2 values uses the BINARY_OP opcode (Python is a dynamic language where a and b could be of any type, so BINARY_OP is a generic (and slower) instruction for summing up any values). Then we do a good bunch of additions, all of them with int values, so the interpreter decides to specialize the generic addition to the fast BINARY_OP_ADD_INT opcode. After that we do a couple of invocations using strings rather than ints. The specialized opcode checks if the operands are of the expected types (here, two ints); as they are not, it falls back to the generic implementation of the operation (the slow path), but for the moment it still keeps the specialized opcode. It takes note of these divergences so that if they continue it will revert the specialization. The thing is that in my tests I have not managed to find the number of failed executions that makes the interpreter revert the specialization; what we can see is that after a good bunch of executions using strings, the int specialization is changed to a string specialization (BINARY_OP_ADD_UNICODE).

In a previous post I mentioned some crazy Python projects that manipulate functions by creating a new code object (an instance of types.CodeType) based on the original one, with a modified version of its bytecodes (adding extra instructions, whatever), and assigning it to the function. How does this play with the adaptive version of the code? Well, thanks to ChatGPT we learn that the adaptation process starts again (see the sketch after this list):

  • The quickened bytecode (co_code_adaptive) is built lazily, the first time the interpreter executes a CodeType.
  • It is not stored permanently in the CodeType; rather, it is in a per-runtime structure that references the original co_code.
  • If you assign a different code object to a function (func.__code__ = new_code), that’s a new CodeType with its own co_code_adaptive buffer, initially empty.
  • Therefore, execution will start again with baseline opcodes and caches, and the specializing interpreter will re-warm and re-specialize.
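
A minimal sketch of my own of such a swap, using code.replace() (the supported way since Python 3.8 to clone a code object with modifications); after the assignment the specializing interpreter starts from baseline bytecode again:

import dis

def f(a, b):
    return a + b

# warm it up so the addition gets specialized
for _ in range(10_000):
    f(1, 2)
dis.dis(f, adaptive=True)  # shows BINARY_OP_ADD_INT

# clone the code object (no actual changes here, but it's a brand new CodeType)
f.__code__ = f.__code__.replace()
dis.dis(f, adaptive=True)  # back to the baseline BINARY_OP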

Thursday, 7 August 2025

Python Decorators Implementation

There's something in the inner workings of the amazing multimethod library that we saw in my previous post that has had me pretty confused. In that post I outline how the multidispatch decorator works internally, but for the multimethod decorator, things are more complex. From my previous post we know we use it like this:


from multimethod import multimethod

class Formatter:
    def __init__(self, wrapper: str):
        self.wrapper = wrapper

    @multimethod
    def format(self, item: str, starter: str):
        return f"{starter}{self.wrapper}{item}{self.wrapper}"
    
    @multimethod
    def format(self, item: int, starter: str):
        return f"{starter}{self.wrapper * 2}{item}{self.wrapper * 2}"   

multimethod is a class-based decorator. On the first invocation it returns an instance of the multimethod class, which will store the format function and its signature. Each new invocation of the decorator has to store each of those additional overload functions and signatures in that already existing instance, rather than creating a new instance each time. To do that, the decorator checks if in the current scope (the class declaration scope) there already exists a variable with the name of the function being decorated that points to an instance of multimethod. For that it inspects the previous frame in the stack, like this (taken from its source code):


    def __new__(cls, func):
        homonym = inspect.currentframe().f_back.f_locals.get(func.__name__)
        if isinstance(homonym, multimethod):
            return homonym
        ...
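
The frame-peeking trick is easy to reproduce in isolation (a toy sketch of my own):

import inspect

def peek_callers_local(name):
    # look up a variable in the caller's local scope, like multimethod does
    return inspect.currentframe().f_back.f_locals.get(name)

def demo():
    x = 42
    print(peek_callers_local("x"))  # 42

demo()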

That's really, really nice code, but there's something else that I could not grasp. We know that Python decorators are just callables (functions or classes) that are invoked receiving the function (or class) being decorated as a parameter. So they are normally explained like this:


@my_deco
def fn(): 
   # whatever

Is (in principle) just equivalent (syntactic sugar) to this:


def fn():
   # whatever
fn = my_deco(fn)

The problem is that it would mean that our example above is indeed translated by the Python compiler into something like this:



    def format(self, item: str, starter: str):
        return f"{starter}{self.wrapper}{item}{self.wrapper}"
    format = multimethod(format)
    # here format is pointing to a multimethod instance, good
    
   
    def format(self, item: int, starter: str):
        return f"{starter}{self.wrapper * 2}{item}{self.wrapper * 2}" 
    # but here format is pointing to the function that we've just defined, so when we apply the decorator again and it does the isinstance(homonym, multimethod) check, it will be False! so this can not work
    format = multimethod(format)

I've explained the problem in the comments. When defining the second format function, the format variable in the current scope is set to that second function, so it's no longer pointing to the multimethod instance previously created; the isinstance(homonym, multimethod) check will be False and a new multimethod instance will be created, so the whole thing can not work!

Well, after discussing this with Claude AI we've found an explanation. When applying a decorator to a function, the function definition doesn't immediately overwrite the namespace (setting a variable with the name of the function to point to the function); what really happens is this:

  1. def format(...) creates a temporary function object
  2. @multimethod is applied to that temporary function object
  3. The decorator returns an object (which could be the existing multimethod)
  4. Only then does format = assignment happen

So indeed a decorator works more like this:


format = my_deco(
	def format():
	    # whatever
)

But as Python lacks statement lambdas, the above code is not valid, and hence the different articles explaining decorators can not use it as pseudo-code for what the runtime really does, and use an inaccurate approximation instead. Usually that's not a problem, but in this very specific case that approximation prevented me from envisioning how this particular decorator works.
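
We can verify this ordering empirically (a small sketch of my own): the decorator runs before the name gets (re)bound in the enclosing namespace, which is exactly what the multimethod trick relies on:

def probe(fn):
    # at this point the def statement has created the new function object,
    # but the name in the module namespace has not been rebound yet
    print("while decorating:", type(globals().get(fn.__name__)).__name__)
    return fn

@probe
def dup():
    pass

@probe
def dup():
    pass

# while decorating: NoneType    (first time: the name is not bound yet)
# while decorating: function    (second time: still the previous binding)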

Friday, 1 August 2025

Visitor vs Multiple Dispatch

The Visitor pattern has always felt a bit odd to me. The thing is that, more than as a good practice, it was born to make up for a limitation: the lack of multiple dispatch (aka multimethods) in most programming languages. Being forced to have an accept() method in the classes that are going to be visited feels terribly intrusive to me; it violates the separation of concerns principle. So, when we have a language that supports multiple dispatch (either directly in the language or through some library) we should favor that and forget about Visitors.

That said, Python has no support for multiple dispatch at the language level, but has at least 2 interesting libraries that leverage Python's dynamism and introspection to provide us with the feature. I have only tried multimethod, and it's pretty amazing. I'll show some examples.


from multimethod import multimethod

# multimethod does not support keyword arguments

class Formatter:
    def __init__(self, wrapper: str):
        self.wrapper = wrapper

    @multimethod
    def format(self, item: str, starter: str):
        return f"{starter}{self.wrapper}{item}{self.wrapper}"
    
    @multimethod
    def format(self, item: int, starter: str):
        return f"{starter}{self.wrapper * 2}{item}{self.wrapper * 2}"
    
f1 = Formatter("|")
print(f1.format("aaa", "- "))

print(f1.format(25, "- "))

# I can even register new overloads to an existing one
@Formatter.format.register
def _(self, item: bool):
    return f"{self.wrapper * 3}{item}{self.wrapper * 3}"

print(f1.format(True))

# and I can even overwrite an existing overload!
@Formatter.format.register
def _(self, item: int, starter: str):
    return f"{starter}{self.wrapper * 5}{item}{self.wrapper * 5}"

print(f1.format(25, "- "))

# - |aaa|
# - ||25||
# |||True|||
# - |||||25|||||

# but there's one limitation, keyword arguments do not work
#print(f1.format(item="bbb", starter="- "))
#print(f1.format(starter="- ", item="bbb"))
# multimethod.DispatchError

Well, the code is self-explanatory. I can define different overloads of the same method using the multimethod decorator, and the method invocation will be redirected to the right overload based on the parameters. There's one limitation though: this redirection only works when using positional parameters. If we use named parameters, it fails with a multimethod.DispatchError. To work with named parameters we have to use the multidispatch decorator.


from multimethod import multidispatch 

# multidispatch supports keyword arguments  
class Formatter:
    def __init__(self, wrapper: str):
        self.wrapper = wrapper

    @multidispatch
    def format(self, item: str, starter: str):
        return f"{starter}{self.wrapper}{item}{self.wrapper}"
    
    # @multidispatch
    # def format(self, item: int, starter: str):
    #     return f"{starter}{self.wrapper * 2}{item}{self.wrapper * 2}"

    @format.register
    def _(self, item: int, starter: str):
        return f"{starter}{self.wrapper * 2}{item}{self.wrapper * 2}"

    
f1 = Formatter("|")
print(f1.format("aaa", "- "))

print(f1.format(25, "- "))

print(f1.format(starter="- ", item="bbb"))
print(f1.format(item=25, starter="- "))

# I can register new overloads to an existing class!
@Formatter.format.register
def _(self, item: bool):
    return f"{self.wrapper * 3}{item}{self.wrapper * 3}"

print(f1.format(True))
 
# and I can even overwrite an existing overload!
@Formatter.format.register
def _(self, item: int, starter: str):
    return f"{starter}{self.wrapper * 5}{item}{self.wrapper * 5}"

print(f1.format(item=25, starter="- "))

#  - |aaa|
# - ||25||
# - |bbb|
# - ||25||
# |||True|||
# - |||||25|||||


This multidispatch decorator is implemented quite differently from the multimethod decorator, so apart from supporting named parameters, you use it differently, via the register method. Let's see how the whole thing works.

First of all, let's revisit how class creation works in Python. When the runtime comes across a class declaration it executes the statements inside that declaration (normally we mainly have function definitions, but we can also have assignments, conditionals...) and the different elements (methods and attributes) defined there are added to a namespace object (a sort of dictionary). Then, the class object is created by invoking type(classname, superclasses, namespace) (or, if the class is using a metaclass other than type, MyMeta(classname, superclasses, namespace, **kwargs)). You can read further about this in these 2 previous posts: [1] and [2].
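
As a quick refresher (a minimal sketch of my own), this is roughly what the runtime does when it builds a class:

def init(self, wrapper):
    self.wrapper = wrapper

# what "class Formatter: ..." boils down to: run the body, collect
# the definitions in a namespace, then invoke type()
namespace = {"__init__": init}
Formatter = type("Formatter", (object,), namespace)

print(Formatter("|").wrapper)  # |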

multidispatch (same as multimethod) is a class-based decorator, so when you do "@multidispatch def format()..." you are creating an instance of the multidispatch class and adding a "format" entry to the class namespace pointing to it. Then you decorate the other "overloads" of the function with calls to the register() method of that multidispatch instance, so the instance will store the different overloads with their corresponding signatures. Notice that because of its different internal implementation you can not name the overload functions with the same name as the first one (which will become the "public" name for this multiple dispatch method), so we use a placeholder name like "_". Finally, the class object will be created by invoking type with a namespace object that contains this instance of the multidispatch class. Obviously the multidispatch class is callable, so it has a __call__ method that, when invoked, searches through its list of registered overloads based on the signature.
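
To make that mechanism more tangible, here's a toy sketch of my own (far simpler than the real library, and not its actual implementation) of a class-based dispatching decorator with a register() method:

import functools

class mydispatch:
    def __init__(self, fn):
        self.overloads = {}
        self.register(fn)

    def register(self, fn):
        # key each overload by its annotated parameter types, e.g. (str, str)
        key = tuple(t for name, t in fn.__annotations__.items() if name != "return")
        self.overloads[key] = fn
        return self

    def __get__(self, obj, objtype=None):
        # descriptor protocol: bind the instance, just as plain functions do
        return functools.partial(self.__call__, obj)

    def __call__(self, *args):
        key = tuple(type(a) for a in args[1:])  # skip self
        return self.overloads[key](*args)

class Greeter:
    @mydispatch
    def greet(self, item: str):
        return f"hello {item}"

    @greet.register
    def _(self, item: int):
        return f"hello number {item}"

g = Greeter()
print(g.greet("bob"))  # hello bob
print(g.greet(42))     # hello number 42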

Notice in both samples above that after having declared a class we can still add additional overloads to it using the register method (we can even overwrite an existing overload). This is a particularly nice feature, as your class remains open for extension.