Tuesday, 17 December 2024

Closing Python Generators

This post is about some rarely used features of Python generators (JavaScript generators are pretty similar, but with some differences that would deserve their own post).

First of all, the internals of Python generators are pretty interesting. They are quite different from C# generators or Kotlin suspend functions, where the compiler converts the function into a class with a "state machine method" that has labels for each "suspension point" and properties for the current label and the local variables. In Python, the generator object created from a generator function points to that function's code as such (gi_code), and holds a frame object with the variables and the position where execution will continue. Each time the generator function is resumed it gets this frame object (gi_frame), rather than starting with an uninitialized one, containing its state and the position of the last instruction it executed (gi_frame.f_lasti), from which it carries on. It's very nicely explained here. We can see with this simple code that the gi_frame and the frame taken (via inspect) from the stack while the generator function is executing are indeed the same object, not a copy:


import inspect

def cities_gen_fn():
    print(f"frame id: {id(inspect.stack()[0].frame)}")    
    yield "Xixon"
    print(f"frame id: {id(inspect.stack()[0].frame)}")
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

cities = cities_gen_fn()
print(next(cities))
print(f"gi_frame id: {id(cities.gi_frame)}")
print(next(cities))

# frame id: 2550405375184
# Xixon
# gi_frame id: 2550405375184
# frame id: 2550405375184
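
As a small complement, the standard inspect module can also report which state the generator is in, and gi_frame.f_lasti can be seen moving as the generator advances. This is only a sketch for CPython; letters_gen_fn and the variable names are made up for the illustration.

```python
import inspect

def letters_gen_fn():
    yield "a"
    yield "b"

letters = letters_gen_fn()
state_created = inspect.getgeneratorstate(letters)   # created, not started yet

next(letters)
frame_after_first = letters.gi_frame
lasti_after_first = frame_after_first.f_lasti        # offset of the first yield

next(letters)
same_frame = letters.gi_frame is frame_after_first   # the frame object is reused
lasti_after_second = letters.gi_frame.f_lasti        # offset of the second yield
state_suspended = inspect.getgeneratorstate(letters)

next(letters, None)                                  # run the generator to its end
state_closed = inspect.getgeneratorstate(letters)
frame_after_end = letters.gi_frame                   # the frame is released

print(state_created, state_suspended, state_closed)
print(lasti_after_first, lasti_after_second, same_frame, frame_after_end)
```

Once the generator has finished there is nothing left to resume, so gi_frame becomes None.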


Python generator objects have a close() method that allows us to mark the generator as finished. One common use case is when looping over a generator and at some point a condition tells us to stop. Of course you can leave the loop using the break statement, but that's a bit different: break leaves the loop immediately, not at the next iteration, and as the generator has not been finished, we can still continue to iterate it after the loop.



def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

print("- using break")
cities = cities_gen_fn()
for city in cities:
    if (city := city.upper())[0] == "L":
        break
    print(city)
print(next(cities))

print("- using .close()")
cities = cities_gen_fn()
for city in cities:
    if (city := city.upper())[0] != "L":
        cities.close()
        print(city)
try:
    print(next(cities))
except StopIteration as ex:
    print("generator is finished")

# - using break
# XIXON
# PARIS
# Bilbao

# - using .close()
# XIXON
# generator is finished


I can think of some situations in the past where this .close() method would have come in handy. Let's say we have a main function that creates a generator and delegates to other functions certain tasks that involve iterating that generator. Each of those functions could determine, based on its own logic, that the generator is finished, so it would close it, and then the main function would no longer invoke the remaining functions with it. Unaware of this .close() functionality, I was returning an "is_finished" boolean from each of those functions.
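
A minimal sketch of that scenario could look like this. The task functions and run_tasks are hypothetical names invented for the example; the only library call involved is inspect.getgeneratorstate(), which the main function uses to find out whether some task has already closed the generator.

```python
import inspect

def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

def first_task(cities, log):
    # Consumes a single item and leaves the rest for the next task
    log.append(("first", next(cities, None)))

def second_task(cities, log):
    for city in cities:
        log.append(("second", city))
        if city.startswith("L"):
            # This task decides that nobody should iterate any further
            cities.close()

def third_task(cities, log):
    for city in cities:
        log.append(("third", city))

def run_tasks():
    cities = cities_gen_fn()
    log = []
    for task in (first_task, second_task, third_task):
        # No "is_finished" flag needed: the generator itself knows its state
        if inspect.getgeneratorstate(cities) == inspect.GEN_CLOSED:
            break
        task(cities, log)
    return log

print(run_tasks())
```

Here third_task is never invoked, because second_task closes the generator when it reaches "Lisbon".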

The documentation on .close() shows that it's quite an interesting and complex beast: "Raises a GeneratorExit at the point where the generator function was paused." Wow, that's pretty mind-blowing. It's as if, when the generator function is resumed, the interpreter somehow injected a raise GeneratorExit() statement at the point where the generator was suspended (the position recorded in gi_frame.f_lasti). If the generator does not catch the exception, the generator finishes (the next iteration attempt will raise a StopIteration) and the close() call returns None (that's the behaviour in the examples above). Python 3.13 has introduced a new feature: the generator can catch the exception and return a value, which close() then hands back to its caller. The main effect, finishing the generator, is the same, but we get the extra of returning a value to the caller. Let's see:



def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"


def cities2_gen_fn():
    try:
        yield "Xixon"
        yield "Paris"
        yield "Lisbon"
        yield "Bilbao"
    except GeneratorExit:
        # On Python 3.13+ this value is handed back by close(), but the
        # StopIteration raised by a later next() still carries value None
        return "No City"


for cities_gen in [cities_gen_fn(), cities2_gen_fn()]:
    print(next(cities_gen))
    print(f"close result: {cities_gen.close()}")
    print("generator has been closed")
    try:
        next(cities_gen)
    except Exception as ex:
        print(f"Exception: {type(ex)}, value: {ex.value}")
    print("--------------------------")

# Xixon
# close result: None
# generator has been closed
# Exception: <class 'StopIteration'>, value: None
# --------------------------
# Xixon
# close result: No City
# generator has been closed
# Exception: <class 'StopIteration'>, value: None
# --------------------------

What feels a bit odd to me is that the value returned by the generator to .close() is not treated like a normal generator return value: it is not made available as the .value attribute of the next StopIteration exception.
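
To make the contrast explicit, here is a small sketch (countdown_gen_fn and drain are names invented for the illustration) comparing a generator that runs to completion with one that is closed early. The result of close() depends on the Python version, so the code only prints it.

```python
def countdown_gen_fn():
    try:
        yield 2
        yield 1
    except GeneratorExit:
        return "closed early"
    return "ran to completion"

def drain(gen):
    """Exhaust gen and return the value carried by its StopIteration."""
    while True:
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value

# Normal completion: the return value travels in StopIteration.value
normal_value = drain(countdown_gen_fn())

# Early close: the return value (if any) goes to close(), nowhere else
closed_gen = countdown_gen_fn()
next(closed_gen)
close_result = closed_gen.close()      # "closed early" on 3.13+, None before
value_after_close = drain(closed_gen)  # the generator is already finished

print(normal_value, close_result, value_after_close)
```

So the value is available only to whoever calls close(); any later next() on the closed generator just gets a plain StopIteration.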

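We have another related method, generator.throw(). It raises an exception of our choice inside the generator, which finishes it unless the generator catches the exception and yields again. I don't see any clear use case for it. The documentation describes it like this: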

Raises an exception at the point where the generator was paused, and returns the next value yielded by the generator function. If the generator exits without yielding another value, a StopIteration exception is raised. If the generator function does not catch the passed-in exception, or raises a different exception, then that exception propagates to the caller.

I'll show some examples, but honestly I don't see when this method can be useful.



def cities_gen_fn():
    yield "Xixon"
    yield "Paris"
    yield "Lisbon"
    yield "Bilbao"

cities_gen = cities_gen_fn()
print(next(cities_gen))
try:
    print(f"throw result: {cities_gen.throw(Exception())}")
    print("after generator throw")
except Exception as ex:
    print(f"Exception: {ex}")
try:
    print("next iteration attempt")
    next(cities_gen)
except Exception as ex:
    print(f"Exception in next() call: {type(ex)}, value: {ex.value}")

# Xixon
# Exception: 
# next iteration attempt
# Exception in next() call: <class 'StopIteration'>, value: None


print("--------------------------")


def cities2_gen_fn():
    try:
        yield "Xixon"
        yield "Paris"
        yield "Lisbon"
        yield "Bilbao"
    except Exception: 
        yield "Except City"


cities_gen = cities2_gen_fn()

print(next(cities_gen))
print(f"throw result: {cities_gen.throw(Exception())}")
print("after generator throw")
try:
    print("next iteration attempt")
    next(cities_gen)
except Exception as ex:
    print(f"Exception in next() call: {type(ex)}, value: {ex.value}")


# Xixon
# throw result: Except City
# after generator throw
# next iteration attempt
# Exception in next() call: <class 'StopIteration'>, value: None
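
The behaviour quoted from the documentation also means that throw() does not have to finish the generator at all. In this sketch (resilient_cities_gen_fn is a made-up name) the try/except sits inside the loop, so the generator handles the exception, yields a value that becomes the result of throw(), and then carries on with the remaining cities.

```python
def resilient_cities_gen_fn():
    for city in ["Xixon", "Paris", "Lisbon", "Bilbao"]:
        try:
            yield city
        except ValueError as ex:
            # The value yielded here is what throw() returns to its caller
            yield f"skipped after {city}: {ex}"

cities_gen = resilient_cities_gen_fn()
first = next(cities_gen)
throw_result = cities_gen.throw(ValueError("bad city"))
rest = list(cities_gen)   # the generator is still alive and keeps going

print(first)
print(throw_result)
print(rest)

# Xixon
# skipped after Xixon: bad city
# ['Paris', 'Lisbon', 'Bilbao']
```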

