It's been quite a while since I first complained about the lack of safe-navigation and coalesce operators in Python and provided a basic alternative. I've also complained about the lack of try-expressions in Python, and likewise provided a basic alternative. Indeed, there's no strong reason for having 2 separate functions; I think we can just use do_try for both the safe-get and the coalesce cases.
from collections.abc import Callable
from typing import Any


def do_try(action: Callable[[], Any], exceptions: type[BaseException] | tuple[type[BaseException], ...] = Exception, on_except: Any = None) -> Any:
    """
    Simulate 'try expressions'.
    on_except can be a plain value or a Callable (that receives the Exception).
    """
    try:
        return action()
    except exceptions as ex:
        # if on_except is a callable, let it build the fallback from the exception
        return on_except(ex) if callable(on_except) else on_except
person = Person()
embassy = do_try(lambda: person.country.main_cities[0])
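As a side note (these calls are just illustrative, reusing the hypothetical person object above), on_except is what gives us both the coalesce behaviour and a way to inspect the exception, but it has to be passed by keyword, since exceptions is the second positional parameter:

# coalesce style: fall back to a fixed value
embassy = do_try(lambda: person.country.main_cities[0], on_except="Unknown")

# or pass a callable that receives the exception and builds the fallback
embassy = do_try(
    lambda: person.country.main_cities[0],
    on_except=lambda ex: f"lookup failed: {ex}",
)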
I also complained about how absurd it feels having a get method for dictionaries but not for other collections. That means that we end up writing code like this:
x = items[i] if len(items) > i else "default"
Of course, that would not be necessary if we had safe-navigation, but as we don't have it, we can just use the do_try function:
x = do_try(lambda: items[i], on_except="default")
And here comes the interesting part: obviously using do_try means using try-except under the covers, which, compared to a plain if conditional, sounds like something to avoid for performance reasons, right? Well, I've been revisiting the internals and cost of exceptions a bit. Since version 3.11 Python has zero-cost exceptions. This means that (as in Java) having try-except blocks in your code has no performance effect as long as no exception is thrown/raised; the "zero cost" refers to the case when no exception is raised, and there is still a cost when one actually is.
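Just to make that claim concrete, a quick (and admittedly naive) timeit sketch along these lines should show near-identical timings for both functions, since the except branch never runs; the function names here are made up for the example:

import timeit

def plain_lookup(items: list, i: int):
    return items[i]

def guarded_lookup(items: list, i: int):
    # the except branch never runs for valid indexes, so with
    # zero-cost exceptions the try block adds no real overhead
    try:
        return items[i]
    except IndexError:
        return None

data = list(range(100))
print(timeit.timeit(lambda: plain_lookup(data, 50)))
print(timeit.timeit(lambda: guarded_lookup(data, 50)))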
Modern Python uses exception tables. For each function containing try-except blocks an exception table is created, linking the try part to the handling code in the except part. Exception tables are created at compile time and stored in the code object. Then, at runtime, if an exception occurs the interpreter consults the exception table to find the handler for the given exception and jumps to it. Obviously, creating an exception object, searching the exception table and jumping to the handler has a cost. Given that in Python compilation occurs when we launch the script, just before we can run the code, we can say that this exception table creation also has a runtime cost, but it's minimal, as it happens only once per function (when the function is created), not every time the function is executed. That's where the cost lies when an exception is raised: creating the Exception object, unwinding the stack and jumping to the handler.
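We can actually peek at one of these tables with the dis module (Python 3.11 or newer); take this as a quick illustration, as the exact disassembly varies between versions:

import dis

def guarded(items, i):
    try:
        return items[i]
    except IndexError:
        return None

# in 3.11+ the disassembly ends with an "ExceptionTable:" section that maps
# the bytecode range of the try body to the offset of the except handler
dis.dis(guarded)

# the raw (binary) table is also stored on the code object itself
print(guarded.__code__.co_exceptiontable)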
Throwing/raising an exception felt like a low-level mechanism to me, but it's not at all. Language-level exceptions are software constructs managed by the runtime (the JVM for Java, CPython for Python). They do not involve the OS unless the program crashes. So when you use a throw/raise statement in your code there's no sort of software interrupt; it's just one more (or a few more) bytecode instructions. The Python interpreter will come across a RAISE_VARARGS bytecode instruction, and it will search the exception tables of the current function and/or the functions in the call stack, trying to find an exception handler.

Notice that the same happens in the Java JVM. The Java compiler creates an exception table for each method and stores it in the .class file. This table maps bytecode ranges to handlers (catch blocks) and the type of exception they handle. When the class loader loads the class, the JVM stores this table in the method's metadata. Given that the JVM comes with JIT compilation, there's an additional level. When the JIT compiles the method:
The JIT generates native machine code for the method.
It also creates a new exception table for the compiled code, because:
The original bytecode offsets are no longer relevant.
The JIT needs to map native instruction addresses to handler entry points.
This table is stored alongside the compiled code in the JVM’s internal structures.
So once a method has been compiled by the JIT, at runtime we'll have two exception tables: the initial one for the bytecode form of the method (which is kept around in case we have to deoptimize from native code back to bytecode), and the table for the native code. Notice that when the JIT compiles the bytecode to native code we incur a very small extra cost for the creation of this additional table.
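Coming back to the Python side, the "raising is just another instruction" point is easy to check by disassembling a function that raises (again, the output details vary by version):

import dis

def fail():
    raise ValueError("boom")

# the disassembly shows a plain RAISE_VARARGS opcode: raising is just
# one more bytecode the interpreter executes, with no OS-level machinery
dis.dis(fail)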
With all the above, using do_try() for safe indexed access seems a bit overkill (unless we're sure the access is very rarely going to fail and throw), and having a specific convenience function for it makes sense:
from collections.abc import Sequence
from typing import Any, TypeVar

T = TypeVar("T")

def get_by_index(sequence: Sequence[T], index: int, default: Any = None) -> T | Any:
    """
    Safely access an element by index in a sequence, where sequence is any class
    supporting __getitem__ and __len__, like: list, str, tuple and bytes.
    Usage: get_by_index(my_list, 2, default='Not Found')
    """
    # only non-negative, in-range indexes are treated as valid
    return sequence[index] if 0 <= index < len(sequence) else default
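A couple of quick calls, just to show it behaves the same for lists and strings:

colors = ["red", "green", "blue"]
print(get_by_index(colors, 1))                  # green
print(get_by_index(colors, 7, default="none"))  # none
print(get_by_index("hello", 10, default="?"))   # ?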
We could generalize the function for nested access, but once we start looping with if conditions, at some number of iterations the try-except approach will probably end up performing better.
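Just as a sketch of what that generalization could look like (get_nested is a hypothetical name, not one of the helpers above), an if-based nested getter might be:

from collections.abc import Mapping, Sequence
from typing import Any

def get_nested(obj: Any, *path: Any, default: Any = None) -> Any:
    # walk the path step by step, bailing out with default as soon
    # as a key, index or attribute is missing
    current = obj
    for step in path:
        if isinstance(current, Mapping):
            if step not in current:
                return default
            current = current[step]
        elif isinstance(current, Sequence) and isinstance(step, int):
            if not 0 <= step < len(current):
                return default
            current = current[step]
        elif hasattr(current, str(step)):
            current = getattr(current, str(step))
        else:
            return default
    return current

# e.g. the person example from above, without any exception handling:
# embassy = get_nested(person, "country", "main_cities", 0)

Once the path gets long enough, the repeated checks per step are exactly the kind of looping if logic where a single try-except around the whole access will likely win.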