Deploy to nenyures: May 2026

Sunday, 17 May 2026

Type Hints Notes 2026

Type hints are more and more prevalent in recent Python code. I'm still not too severe about them, but my level of strictness continues to grow over time. I've lately learnt a couple of things:

Variadic Parameters. When typing a function that has packed (variadic) parameters in its signature (*args, **kwargs), we annotate the type of each individual argument. We don't type the parameter as a collection or dictionary (unless each argument really is a collection). For instance, for a pipe function we should write:


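from typing import Any, Callable

# this is RIGHT: each positional argument is a Callable
def pipe(val: Any, *fns: Callable) -> Any: ...

# this is WRONG: it says that each positional argument is a list of Callables
def pipe(val: Any, *fns: list[Callable]) -> Any: ...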
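
To make the difference concrete, here is a minimal sketch of what such a pipe function could look like. The body and the helpers inc and double are my own illustration, not taken from any library. Inside the function, fns is a tuple of callables, which is exactly what the *fns: Callable annotation expresses:

```python
from functools import reduce
from typing import Any, Callable


def pipe(val: Any, *fns: Callable) -> Any:
    # fns arrives as a tuple; each element is one Callable
    return reduce(lambda acc, fn: fn(acc), fns, val)


def inc(x: int) -> int:  # hypothetical helper, for illustration only
    return x + 1


def double(x: int) -> int:  # hypothetical helper, for illustration only
    return x * 2


print(pipe(3, inc, double))  # 8
print(pipe(3))               # 3 (no functions, the value is returned unchanged)
```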

Tuples. In Python we use tuples for "groups" of a fixed number of elements, a pair, a trio... We express it in the signature like this: tuple[str, str] or tuple[int, str, str]... But how to express that a function returns (or receives) a "group" of an unknown number of elements? We also use tuples, combined with ellipsis (...), like this:


tuple[int, ...]         # any number of ints: (), (1,), (1, 2, 99), ...
tuple[int | str, ...]   # any number of elements, each one an int or a str: (), ("a", 1), ("a", "b", "c"), (1, 2, 1), ...

tuple[int, str, bool]   # exactly 3 elements: an int, a str, a bool
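
As a small sketch of both forms in real signatures (evens_up_to and min_max are hypothetical names chosen for this example), one function returns a tuple of unknown length and the other always returns a pair:

```python
def evens_up_to(limit: int) -> tuple[int, ...]:
    # variable length: the result may hold zero, one or many ints
    return tuple(n for n in range(limit + 1) if n % 2 == 0)


def min_max(values: tuple[int, ...]) -> tuple[int, int]:
    # fixed length: always exactly two ints (a pair)
    return min(values), max(values)


print(evens_up_to(6))           # (0, 2, 4, 6)
print(min_max(evens_up_to(6)))  # (0, 6)
```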

There's an important detail that I've learnt thanks to a typing issue. We know that in a Python try-except block the except clause can handle multiple exception types, I mean: except RuntimeError, TypeError, NameError: (this form without parentheses is accepted since Python 3.14; earlier versions require except (RuntimeError, TypeError, NameError):). Those multiple exceptions are a tuple, not just any iterable. Let's see an example (the last call is what is WRONG):


# multiple exceptions
def multiple_exceptions(exceptions: tuple[type[Exception], ...]) -> None:
    try:
        raise ValueError("This is a ValueError")
    except exceptions as e:
        print(f"Caught an exception: {e}")

multiple_exceptions((ValueError, TypeError))
# Caught an exception: This is a ValueError

# important, this is WRONG, we have to pass a tuple, not just any collection
multiple_exceptions([ValueError, TypeError])
# TypeError: catching classes that do not inherit from BaseException is not allowed

Indeed, an equivalent function with a variadic signature feels more natural and idiomatic than the above (and furthermore prevents the confusion of passing any collection rather than exactly a tuple):


# this variadic signature feels more natural
def multiple_exceptions2(*exceptions: type[Exception]) -> None:
    if not exceptions:
        raise ValueError("pass at least one exception type")
    try:
        raise ValueError("This is a ValueError")
    except exceptions as e:
        print(f"Caught an exception: {e}")

multiple_exceptions2(ValueError, TypeError)

And notice also how I've added a guard against the empty-call case: an empty tuple is accepted by except, but it matches nothing, so the exception would simply propagate uncaught.
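
The three cases (a tuple, an empty tuple, a list) can be checked with a small experiment. This is only a sketch; outcome is a hypothetical helper that raises a ValueError inside a try block and reports what the except clause did with it:

```python
def outcome(exc_types) -> str:
    """Raise ValueError in a try block whose except clause uses exc_types."""
    try:
        try:
            raise ValueError("boom")
        except exc_types:
            return "caught"
    except ValueError:
        # the inner except clause did not match, so the error reached us
        return "not caught: ValueError propagated"
    except TypeError as err:
        # the inner except clause itself failed while matching
        return f"TypeError: {err}"


print(outcome((ValueError, TypeError)))  # caught
print(outcome(()))                       # not caught: ValueError propagated
print(outcome([ValueError, TypeError]))  # TypeError: catching classes that do not inherit ...
```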

Sunday, 10 May 2026

Python partial and placeholders

When Python 3.14 was released I had already read about some of its main features (those that involve a PEP and that have been discussed in the Python discussion forums), like Lazy Annotations and Template Strings. Recently, when reading the release notes in depth, I came across a small feature added to functools.partial (and partialmethod) that I find particularly useful:

functools:
Add the Placeholder sentinel. This may be used with the partial() or partialmethod() functions to reserve a place for positional arguments in the returned partial object. (Contributed by Dominykas Grigonis in gh-119127.)

Just a reminder of what partial function application is (don't confuse it with the related concept of curried functions):

In computer science, partial application (or partial function application) refers to the process of fixing a number of arguments of a function, producing another function of smaller arity.

Indeed I already talked about functools.partial some time ago

The "basic" approach to partial function application is that we can just fix (pre-fill) arguments from left to right. This is what we also have in JavaScript with Function.prototype.bind (which takes the "this" value as its first argument). As Python supports named arguments, functools.partial already supported fixing named arguments.


import functools

def format_geo_info(country, region, city, population):
    return f"{city}, {region.upper()} ({country}) - {population}"

bound_format = functools.partial(format_geo_info, "France")
print(bound_format("Occitanie", "Toulouse", 500_000))
# Toulouse, OCCITANIE (France) - 500000
print(bound_format("Occitanie", city="Toulouse", population=500_000))
# Toulouse, OCCITANIE (France) - 500000
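
The example above only fixes a positional argument. As a short sketch of the keyword case mentioned before (unknown_population is just a name picked for this example), partial can also fix an argument by name, and the caller can still override it:

```python
from functools import partial


def format_geo_info(country, region, city, population):
    return f"{city}, {region.upper()} ({country}) - {population}"


# fix one argument by keyword; the remaining ones are still open
unknown_population = partial(format_geo_info, population=0)
print(unknown_population("France", "Occitanie", "Toulouse"))
# Toulouse, OCCITANIE (France) - 0

# a keyword fixed in the partial can still be overridden at call time
print(unknown_population("France", "Occitanie", "Toulouse", population=500_000))
# Toulouse, OCCITANIE (France) - 500000
```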

What was not possible until this version was fixing some intermediate positional argument while leaving earlier ones open, but this is possible since version 3.14 thanks to the Placeholder sentinel value:



from functools import partial, Placeholder  # Placeholder requires Python 3.14+

format_french_city_with_unknown_population = partial(format_geo_info, "France", Placeholder, Placeholder, 0)
print(format_french_city_with_unknown_population("Ile de France", "Saint Denis"))
# Saint Denis, ILE DE FRANCE (France) - 0

Not a revolutionary feature, but one that I've missed occasionally. A trivial implementation could be something like this:


# supports positional and keyword arguments, but not placeholders
def my_basic_partial(func, *args, **kwargs):
    return lambda *fargs, **fkwargs: func(*args, *fargs, **(kwargs | fkwargs))
    
# add support for placeholders in the arguments
PLACEHOLDER = object()
def my_complete_partial(func, *args, **kwargs):
    def new_func(*fargs, **fkwargs):
        merged_args = []
        fargs_iter = iter(fargs)
        for arg in args:
            if arg is PLACEHOLDER:
                merged_args.append(next(fargs_iter))
            else:
                merged_args.append(arg)
        merged_args.extend(fargs_iter)
        return func(*merged_args, **(kwargs | fkwargs))
    return new_func

format_french_city_with_unknown_population = my_complete_partial(format_geo_info, "France", PLACEHOLDER, PLACEHOLDER, 0)
print(format_french_city_with_unknown_population("Ile de France", "Saint Denis"))
# Saint Denis, ILE DE FRANCE (France) - 0

format_2 = my_complete_partial(format_geo_info, "France", city="Toulouse")
print(format_2("Occitanie", population=500_000))
# Toulouse, OCCITANIE (France) - 500000
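
One edge case that the sketch above leaves open: if the new function is called with fewer positional arguments than there are placeholders, next() lets a bare StopIteration escape. A possible variant counts the placeholders first and raises a TypeError instead. Note that my_checked_partial is a hypothetical name and the error message is my own wording, not the one functools uses:

```python
PLACEHOLDER = object()


def my_checked_partial(func, *args, **kwargs):
    n_placeholders = sum(1 for arg in args if arg is PLACEHOLDER)

    def new_func(*fargs, **fkwargs):
        if len(fargs) < n_placeholders:
            raise TypeError(
                f"expected at least {n_placeholders} positional arguments, got {len(fargs)}"
            )
        fargs_iter = iter(fargs)
        # fill each placeholder, in order, with the next call-time argument
        merged_args = [next(fargs_iter) if arg is PLACEHOLDER else arg for arg in args]
        # whatever is left in fargs_iter is appended after the fixed arguments
        return func(*merged_args, *fargs_iter, **(kwargs | fkwargs))

    return new_func


def format_geo_info(country, region, city, population):
    return f"{city}, {region.upper()} ({country}) - {population}"


french_city = my_checked_partial(format_geo_info, "France", PLACEHOLDER, PLACEHOLDER, 0)
print(french_city("Ile de France", "Saint Denis"))
# Saint Denis, ILE DE FRANCE (France) - 0

try:
    french_city("Ile de France")  # only one of the two placeholders is filled
except TypeError as err:
    print(f"TypeError: {err}")
```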

In my aforementioned previous post about partial in Python I gave some reasons for using partial over directly trapping the variables with a closure (of course, internally partial has to use either closures or a callable class). I've just realised that I was missing the main reason: partial is more semantic.

- Intent-Revealing Code: partial(func, arg) explicitly states your intent to partially apply arguments, improving readability and self-documentation.
- Declarative Style: It focuses on the result (a new specialized function) rather than the imperative mechanics of capturing lexical scope.
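
A small sketch of what "intent-revealing" can mean in practice (power and the two square_* names are made up for this example): a partial object keeps the wrapped function and the fixed arguments available for inspection, while a lambda hides them inside its body.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


square_partial = partial(power, exponent=2)
square_lambda = lambda base: power(base, 2)

print(square_partial(5), square_lambda(5))  # 25 25

# the partial object documents itself: wrapped function and fixed arguments
print(square_partial.func is power)  # True
print(square_partial.args)           # ()
print(square_partial.keywords)       # {'exponent': 2}
```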

Lodash, the excellent JavaScript library, also features placeholders in its implementation of partial.

Sunday, 3 May 2026

Glibc

Compiling to native code, and furthermore for a Linux system... wow, sounds scary, and very, very far away from what I've been doing in the last decade(s). Well, the thing is that my employer decided some months ago that we had to compile some of our Python applications to native code. It's not performance related; it's for preventing access to the source code of these applications. We were looking into Cython, but we settled on Nuitka, an amazing piece of software that has been serving us very well.

Normally almost every native application compiled for a Linux system has been dynamically linked against glibc. OK, and, what's glibc?

The GNU C Library, commonly known as glibc, is the GNU Project implementation of the C standard library. It provides a wrapper around the system calls of the Linux kernel and other kernels for application use. Despite its name, it now also directly supports C++ (and, indirectly, other programming languages).

So when a native Linux application (using glibc) starts, the dynamic linker (ld.so, for instance /lib64/ld-linux-x86-64.so.2 on x86-64) loads the shared objects (SO, .so files, the equivalent of Windows DLLs) needed by the application (like libc.so.6) and links the call sites to the functions imported from those libraries.

Obviously glibc evolves over time, so what about versions? First, what glibc version is installed on my system? You can check the SO's loaded by a running process by doing: lsof -p PID | grep '\.so'. Normally you'll see that it's using libc.so.6 (in Ubuntu it is located here: /usr/lib/x86_64-linux-gnu/libc.so.6). That 6 is not the version number (libc.so.6 has been the name of the library since 1997!); the version number is something like 2.XX (2.39 in my Ubuntu 24.04). You can find it by using: ldd --version
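
Since our applications are Python, the same information can also be read from Python itself. This is a minimal sketch, assuming a glibc-based Linux system; glibc_version is a hypothetical helper name, while os.confstr and platform.libc_ver are standard library functions:

```python
import os
import platform
from typing import Optional


def glibc_version() -> Optional[str]:
    """Return a string like 'glibc 2.39', or None when it can't be determined."""
    try:
        # CS_GNU_LIBC_VERSION is only meaningful on glibc-based systems
        return os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        return None


print(glibc_version())
print(platform.libc_ver())  # on a glibc system, a pair like ('glibc', '2.39')
```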

So, what happens if I compile my application in a system with one version of glibc and try to run it in a system with a different version? Well, the situation is quite a bit more fine-grained than I thought. Version numbers are not checked at the glibc level, but at the function level. This is so because glibc uses symbol versioning:

The "Symbol Versioning" Approach (Advanced)

This is what glibc uses. It is also used by heavy-hitters like OpenSSL, Qt, and libgcc.

    The Logic: The filename stays the same (e.g., libc.so.6 has been the name since 1997), but individual functions inside the file are tagged with versions.

    The Result: Multiple versions of the same function can coexist in one file. This allows for extreme backward compatibility without breaking the system every time a single function is updated.
    
    The primary goal of symbol versioning is backward compatibility. It allows a single library file to provide multiple versions of the same function so that:

    Old binaries compiled against v2.10 continue to use the v2.10 implementation.

    New binaries compiled against v2.11 use the new v2.11 implementation.

So multiple versions of the same function live inside glibc, and your binary will dynamically link against the one it was compiled for. And when does the version number of a function change? Normally it only changes if the function interface (the contract, the ABI) changes, but not if it's only its internal implementation that changes. So if we compare symbol versioning to semantic versioning (SemVer, a more familiar versioning scheme), we could say that in symbol versioning a version change corresponds to a Major version bump in semantic versioning.

You are exactly right: A new Symbol Version is functionally equivalent to a Major Version bump for that specific function. It signals to the linker that the "Contract" for that specific symbol has changed, and old programs should look for the previous contract elsewhere in the same file.

Notice how symbol versioning is used for functions inside a library, while semantic versioning (when used) is normally applied to whole libraries.

The glibc version (that one obtained with ldd --version) has no importance in terms of loading the library in memory (the dynamic linker will load libc.so.6 regardless of its "internal" version), the important part is the specific version of each function that we try to link.

I guess that when you program in C you are aware of the version of each function that you are using, as you have to adapt your code to the ABI of the function if it has changed, but when that happens behind the scenes it's quite different. In our case we just write Python code, and the beautiful Nuitka takes care of transforming it to C and then compiling it to native code, so the resulting binary is linked in accordance with the function versions inside the glibc of the build system.

If you then run that binary in a system with an older glibc, it can happen that your binary is "pointing" to a function with a symbol version (let's say a hypothetical openEncryptedFile@GLIBC_2.12) higher than the one present in the older glibc of that system (let's say openEncryptedFile@GLIBC_2.10), and your application will fail to start. Basically this means that you have to compile your Python application in a system with a glibc version lower than or equal to the glibc version in the target system.

It feels odd at first, as the starting point is just the same Python code: if in one system it can just use openEncryptedFile@GLIBC_2.10, why isn't it always compiled against that 2.10 even if a higher version (openEncryptedFile@GLIBC_2.12) is present? Well, that's how things work by default: when compiling, the code is linked to the highest version of each function present in the glibc of the compilation machine.
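
The "build on a glibc that is not newer than the target" rule can be turned into a quick check. What follows is only a sketch under some assumptions: the helper names are hypothetical, inspect_binary assumes that objdump (from binutils) is installed, and the sample text is hand-written in the style of objdump -T output rather than taken from a real binary:

```python
import re
import subprocess


def required_glibc_versions(objdump_output: str) -> set[tuple[int, ...]]:
    """Collect every GLIBC_x.y[.z] tag found in `objdump -T` output."""
    tags = re.findall(r"GLIBC_(\d+(?:\.\d+)+)", objdump_output)
    return {tuple(int(part) for part in tag.split(".")) for tag in tags}


def can_run_on(required: set[tuple[int, ...]], target: tuple[int, ...]) -> bool:
    """True when no required symbol version is newer than the target glibc."""
    return all(version <= target for version in required)


def inspect_binary(path: str) -> set[tuple[int, ...]]:
    # assumes binutils' objdump is installed and on the PATH
    result = subprocess.run(
        ["objdump", "-T", path], capture_output=True, text=True, check=True
    )
    return required_glibc_versions(result.stdout)


# hand-written sample, only for illustration
sample = """
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.2.5) puts
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.34) __libc_start_main
"""
needed = required_glibc_versions(sample)
print(max(needed))                  # (2, 34)
print(can_run_on(needed, (2, 39)))  # True
print(can_run_on(needed, (2, 31)))  # False
```

Comparing versions as tuples of ints avoids the trap of comparing them as strings, where "2.9" would sort after "2.34".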

If you wonder whether other .so libraries (SO, ELF libraries) also use symbol versioning, it depends. For smaller, simpler libraries what is usually used is the SONAME approach: the library (.so file) name changes with each major version (this is a coarse-grained approach).

Symbolic versioning is the technically superior approach, but it is not the universal standard for all ELF libraries. It depends entirely on the library maintainers and their commitment to long-term ABI stability.

In the Linux ecosystem, there are two primary ways to manage library changes:  
1. The "SONAME" Approach (Common)

Most smaller or simpler libraries use the SONAME mechanism. 
You’ve likely seen files like libfoo.so.1 and libfoo.so.2.

    The Logic: If the developers change the interface, they increment the "Major" version number in the filename itself.  

    The Result: Programs linked against libfoo.so.1 will refuse to start 
    if only libfoo.so.2 is present. This is a "heavy-handed" fix because 
    it requires recompiling every program that uses the library even if 
    the specific function they use didn't actually change.

2. The "Symbol Versioning" Approach (Advanced)

This is what glibc uses. It is also used by heavy-hitters like OpenSSL, Qt, and libgcc.

    The Logic: The filename stays the same (e.g., libc.so.6 has been the name since 1997), 
    but individual functions inside the file are tagged with versions.

    The Result: Multiple versions of the same function can coexist in one file. 
    This allows for extreme backward compatibility without breaking the system every time a single function is updated.

To complete this post, I'll add some useful, related commands:

To check the SO's used by a given program:
- For a binary on disk: ldd /usr/bin/program_name
- For a running process: lsof -p [PID] | grep '\.so'

To view the symbols used by a program (the specific functions imported from SO's), where /usr/bin/program_name again stands for your binary:
- All imported symbols: nm -Du /usr/bin/program_name
- Symbols + versions: objdump -T /usr/bin/program_name | grep -F '*UND*'
- Only glibc symbols: objdump -T /usr/bin/program_name | grep 'GLIBC_'
- Library version map: readelf -V /usr/bin/program_name

To view the symbols/functions exported by glibc in your system: objdump -T /usr/lib/x86_64-linux-gnu/libc.so.6

Some additional findings related to the last command. For example I want to see the versions of pthread_spin_init present in my glibc: objdump -T /usr/lib/x86_64-linux-gnu/libc.so.6 | grep pthread_spin_init

That gives me:

00000000000a4130 g    DF .text  000000000000000d  GLIBC_2.34    pthread_spin_init
00000000000a4130 g    DF .text  000000000000000d (GLIBC_2.2.5)  pthread_spin_init

Which is very interesting, as it shows us that a symbol version is not a sequential counter for that specific function. Instead, it is a timestamp or a marker of the glibc release that defined that specific version of the function's ABI. From a GPT:

How glibc handles ABI changes with symbol versioning

Original version: Suppose foo() was introduced in GLIBC_2.2.5. That version is tagged as foo@GLIBC_2.2.5.

ABI change in glibc 2.32: If glibc developers change the ABI of foo() in version 2.32 (e.g., change its behavior, arguments, or return type in a way that breaks compatibility), they will:

Keep the old implementation as foo@GLIBC_2.2.5.
Add a new implementation as foo@GLIBC_2.32.

At runtime:

A binary linked against glibc 2.2.5 will request foo@GLIBC_2.2.5, and the dynamic linker will resolve it to the old implementation.
A binary linked against glibc 2.32 will request foo@GLIBC_2.32, and get the new implementation.

This mechanism ensures backward compatibility while allowing glibc to evolve.
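
The mechanism described above can be mimicked with a toy model. This is only an illustration: the real resolution is done by the dynamic linker on ELF data, and foo, its two implementations and the LIBRARY table are invented for the example.

```python
# Toy model of a versioned symbol table: (name, version tag) -> implementation.
def foo_old(x):
    return x + 1  # stands in for the original ABI


def foo_new(x, scale=1):
    return (x + 1) * scale  # stands in for the changed ABI


LIBRARY = {
    ("foo", "GLIBC_2.2.5"): foo_old,
    ("foo", "GLIBC_2.32"): foo_new,
}


def resolve(name: str, version: str):
    """Return the implementation tagged with exactly this version."""
    try:
        return LIBRARY[(name, version)]
    except KeyError:
        raise LookupError(f"version {version} not found for symbol {name}") from None


print(resolve("foo", "GLIBC_2.2.5")(1))     # 2, an old binary keeps the old behavior
print(resolve("foo", "GLIBC_2.32")(1, 10))  # 20, a new binary gets the new one

try:
    resolve("foo", "GLIBC_2.40")  # a binary built against a newer library
except LookupError as err:
    print(err)
```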
