Friday, 15 November 2024

Python Functions Disallowing Named Arguments

Some built-in Python functions (written in C) do not accept named (keyword) arguments, so they force you to use positional ones; range, map and filter are examples. If you try to pass them a keyword argument you get an error: TypeError: [function] takes no keyword arguments. This feels like an odd limitation to me, and a confusing one, because when I look at the documentation (for example for filter and map) I see no indication of this restriction, so you "learn by trying".
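A quick way to see the restriction, as a minimal sketch assuming CPython (other implementations may behave differently, and the exact wording of the message can vary between versions):

```python
# Positional call: accepted as usual.
positional = list(range(3))
print(positional)

# Keyword calls: rejected by the C implementations of these built-ins.
keyword_errors = {}
for name, call in (
    ("range", lambda: range(stop=3)),
    ("len", lambda: len(obj=[1, 2, 3])),
):
    try:
        call()
    except TypeError as ex:
        # Keep the message so it can be inspected afterwards.
        keyword_errors[name] = str(ex)

for name, message in keyword_errors.items():
    print(f"{name}: TypeError: {message}")
```

Both keyword calls end up in keyword_errors, while the positional call works normally.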

I've been trying to find information about the reason for this odd behaviour, and have not found much. There is this discussion that says that some builtin functions are defined with a "fast header" that does not accept keyword arguments.

This is because they use only a fast simple header ...(... PyObject *args) and therefore all named parameters are rejected automatically. (Python can never introspect into names of arguments in C source. :-)

Many other C functions have a header ...(... PyObject *args, PyObject *kwds) and they support exact list of names explicit by implementing much more complicated validation PyArg_ParseTupleAndKeywords. These names must be written to docs strings manually.

This is a bit confusing to me, but here are some of my ideas on this (comparing it to other platforms). From what I've read, the JVM has no notion of named arguments, meaning that the Kotlin compiler resolves them when producing the bytecode for the function call, with minimal performance implications at runtime:

Using named arguments produces the same bytecode as using positional arguments. However, changing the order of the arguments leads to the creation of temporary variables. This is because the arguments need to be evaluated in the order they appear, but passed to the method in the original order.

On the other hand, Python has the notion of keyword arguments at the bytecode level, so it is the interpreter that matches them to parameters at runtime, which I guess has some minor performance cost. I would say that when writing native functions that you want to be as fast as possible, preventing a function from being invoked with keyword arguments provides some "nano-improvement".
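The bytecode difference can be checked with the standard dis module. This is only a sketch: the opcode names are a CPython implementation detail and change between versions, so the code only looks for keyword-related opcodes (names containing "KW") rather than for one specific name. The two wrapper functions are hypothetical and exist only to be disassembled.

```python
import dis


def call_positional(f):
    # Compiled as a plain call with two positional arguments.
    return f(1, 2)


def call_keyword(f):
    # Compiled as a call that also carries the keyword names.
    return f(a=1, b=2)


def opnames(func):
    """Return the opcode names of a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


positional_ops = opnames(call_positional)
keyword_ops = opnames(call_keyword)

print("positional:", positional_ops)
print("keyword:   ", keyword_ops)
print("keyword-specific opcodes:", [op for op in keyword_ops if "KW" in op])
```

On the CPython versions I know of, only the keyword call contains an opcode with "KW" in its name, so the keyword names are indeed handled at runtime rather than being resolved away by the compiler.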

Thanks to this search I've found a surprising feature. I already knew that you can use the * symbol in your function signature to force all parameters after it to be passed as named (keyword) arguments, but I did not know that Python 3.8 (PEP 570) introduced positional-only parameters. With the / symbol in your function signature, the parameters defined before it can only be provided as positional arguments, not as keyword ones.
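Both markers can be combined in a single signature. Here is a minimal sketch (Python 3.8+); clamp and its parameters are hypothetical names chosen just for illustration:

```python
def clamp(value, /, low, high, *, strict=False):
    """Clamp value into [low, high].

    value  -> positional-only (before the /)
    low    -> positional or keyword
    high   -> positional or keyword
    strict -> keyword-only (after the *)
    """
    if strict and not (low <= value <= high):
        raise ValueError(f"{value} is outside [{low}, {high}]")
    return max(low, min(value, high))


# value and low passed positionally, high by keyword: accepted.
print(clamp(5, 0, high=3))

# Calls that break the rules raise TypeError.
rejected = []
for bad_call in (
    lambda: clamp(value=5, low=0, high=3),  # value is positional-only
    lambda: clamp(5, 0, 3, True),           # strict is keyword-only
):
    try:
        bad_call()
    except TypeError as ex:
        rejected.append(str(ex))
        print(f"TypeError: {ex}")
```

The first call succeeds, while the two calls in the loop are rejected before the function body ever runs.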

At first sight it seems like an odd addition. The fact that some built-in functions behave like that bothers me, so why would I want to bother others by using it in my own functions? Well, there's a pretty good explanation here. Of the 4 reasons given there, the one that really makes this feature useful and valuable to me is the last one:

Since the parameters to the left of / are not exposed as possible keywords, the parameters names remain available for use in **kwargs:

Honestly, I'd never thought about that potential problem when using a variable number of keyword arguments. When a caller passes a keyword argument meant for your **kwargs whose name matches one of your other parameters, you get an exception (TypeError: [function] got multiple values for argument '[xxx]'), because the interpreter binds that name to the parameter, which has already received a positional value, instead of packing it into **kwargs. Let's see an example of the problem and the solution using /:



def format_msg(msg: str):
    return f"[[{msg}]]"

def log_call(msg, fn, *args, **kwargs):
    print(f"{fn.__name__} {msg}")
    return fn(*args, **kwargs)
    

# no keyword argument, so it works fine
print(log_call("invoked", format_msg, "hi"))
# format_msg invoked
# [[hi]]

# but here we have a problem: msg="hi" clashes with log_call's own msg parameter
try:
    print(log_call("invoked", format_msg, msg="hi"))
except TypeError as ex:
    print(f"TypeError: {ex}")

# TypeError: log_call() got multiple values for argument 'msg'


# that we can prevent by redefining the function like this:
def log_call(msg, fn, /, *args, **kwargs):
    print(f"{fn.__name__} {msg}")
    return fn(*args, **kwargs)
    

print(log_call("invoked", format_msg, msg="hi"))
# format_msg invoked
# [[hi]]


Now that I'm aware of this potential problem in Python, a further question arises: how do they manage this in Kotlin? Well, the thing is that they don't have this problem, because Kotlin supports a variable number of unnamed arguments, but not of named arguments. The vararg modifier used in a function signature denotes an array of arguments, but there is no additional modifier for denoting a dictionary of parameters (like Python's **). Related to this, the spread operator * only applies to arrays; there is no equivalent to Python's ** (packing/unpacking) for dictionaries.
