Wednesday 26 July 2023

Python Closure in a Loop

A long, long time ago I wrote a post about the particularities of creating closures in a loop that trap the loop variable. At that time, 2010, in both JavaScript and C# that variable (declared with var in JavaScript) was scoped to the whole function, so in order to trap a different variable in each iteration's closure we needed to create an extra scope by means of an additional factory function called in each iteration. A bit later, in 2012, that was no longer necessary for C# foreach, and 2 or 3 years later, with the addition of let and its block scope, the trick was no longer necessary in JavaScript either.

I've been reminded of all this because recently I've had to use that trick again, this time in Python, where variables are scoped to the whole function (a for loop does not create a new scope). So the problem is this:


def closure_in_loop_1():
    fns = []
    # i is scoped to the whole function, so every fn sees its final value
    for i in range(0,4):
        def fn():
            print("i: " + str(i))
        fns.append(fn)
    for fn in fns:
        fn()

closure_in_loop_1()        

# i: 3
# i: 3
# i: 3
# i: 3


And the solution is to use an additional wrapper function that receives the value we want to trap in the closure:


def closure_in_loop_2():
    # do the trick that I used to do in JavaScript before the arrival of "let"
    def scope(i):
        def fn():
            print("i: " + str(i))
        return fn
    fns = []
    for i in range(0,4):
        fns.append(scope(i))
    for fn in fns:
        fn()
        
closure_in_loop_2()

# i: 0
# i: 1
# i: 2
# i: 3
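
The wrapper function is not the only way to freeze the value per iteration. As a minimal sketch of two other common idioms (the function names here are just illustrative, they are not from the code above): a default argument is evaluated once when the def statement runs, and functools.partial stores the argument at the moment it is called.

```python
from functools import partial


def make_printers_default_arg():
    fns = []
    for i in range(4):
        # The default value is evaluated once, when this def runs,
        # so each function keeps the value i had in its own iteration.
        def fn(i=i):
            return "i: " + str(i)
        fns.append(fn)
    return fns


def show(i):
    return "i: " + str(i)


def make_printers_partial():
    # partial(show, i) stores the current value of i as a bound argument.
    return [partial(show, i) for i in range(4)]


print([fn() for fn in make_printers_default_arg()])
print([fn() for fn in make_printers_partial()])
# both print ['i: 0', 'i: 1', 'i: 2', 'i: 3']
```

Strictly speaking neither of these is a closure: in the first version i is a plain parameter of fn, so nothing from the enclosing scope gets trapped.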


Python gives us access from the outside to the values trapped by a closure. Functions are objects with a __closure__ attribute that points to a tuple with one entry per trapped variable (it's None if the function has not trapped any variable). The tuple does not contain the trapped values directly, but cell objects, each of which points to its value through the cell_contents attribute. This extra level of indirection is what allows 2 closures to share a variable (both closures point to the same cell object):


def closure_cells_test():
    city = "Paris"
    def print_fav_city():
        print(f"my favorite city: {city}")
    
    def print_expensive_city():
        print(f"expensive city: {city}")

    print_fav_city()
    print_expensive_city()  
    print(f"city: {city}")

    print(type(print_fav_city.__closure__[0]).__name__) # cell
    # __closure__ is a tuple of cell objects

    print(f"id1: {id(print_fav_city.__closure__[0])}") # id1: 139805268884496
    print(f"id2: {id(print_expensive_city.__closure__[0])}") # id2: 139805268884496
    # so it's the same cell object in both cases

    # the cell object has a cell_contents attribute (writable since Python 3.7)
    print_fav_city.__closure__[0].cell_contents = "Moscow"
    print_fav_city() # Moscow
    print_expensive_city() # Moscow
    print(f"city: {city}") # Moscow

    city = "Beijing"
    print_fav_city() # Beijing
    print_expensive_city() # Beijing
    print(f"city: {city}") # Beijing

    # to get the names of the variables trapped by the closure:
    print(f"free vars: {print_fav_city.__code__.co_freevars}")
    # free vars: ('city',)

closure_cells_test()

There's one low-level detail that escapes me: how are the cell object and the outer variable (the one the closure has closed over) kept in sync? As you can see, I can modify either of them and the other one reflects the change.
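
A hint about this, at least for CPython, comes from looking at the enclosing function itself. The sketch below (outer and inner are made-up names, and the opcode check is a CPython implementation detail, not a language guarantee) suggests that nothing is being synced at all: once a nested function captures a variable, the enclosing function stops treating it as an ordinary local and reads and writes it through the very same cell.

```python
import dis


def outer():
    city = "Paris"      # city is captured below, so it lives in a cell

    def inner():
        return city

    city = "Beijing"    # rebinding writes into that same cell
    return inner


# Names of outer's own variables that nested functions capture
print(outer.__code__.co_cellvars)   # ('city',)

# Opcodes used by outer itself. On CPython the assignments to city are
# expected to appear as STORE_DEREF (store through the cell) rather than
# STORE_FAST (store into an ordinary local slot).
opnames = {ins.opname for ins in dis.get_instructions(outer)}
print("STORE_DEREF" in opnames)

# inner sees the value assigned after it was defined
print(outer()())                    # Beijing
```

So the name listed in co_freevars of the inner function shows up in co_cellvars of the outer one, and both sides go through one shared cell object.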

Additionally, you can see in the code above that we can get the names of the variables trapped by the closure (its free variables) through the co_freevars attribute of the function's code object:
my_function.__code__.co_freevars
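
The same attributes can be pointed at the loop problem from the beginning of the post. This is a small sketch (shared_cell and cell_per_closure are illustrative names, and the functions return i instead of printing it so the results are easy to compare): the closures created directly in the loop all share one cell, while each call to the wrapper function gets a cell of its own.

```python
def shared_cell():
    fns = []
    for i in range(4):
        def fn():
            return i
        fns.append(fn)
    return fns


def cell_per_closure():
    def scope(i):
        def fn():
            return i
        return fn
    return [scope(i) for i in range(4)]


shared = shared_cell()
separate = cell_per_closure()

print(shared[0].__code__.co_freevars)                        # ('i',)

# every closure created in the plain loop points at the same cell
print(len({id(fn.__closure__[0]) for fn in shared}))         # 1

# each call to scope() creates a fresh cell for its own i
print(len({id(fn.__closure__[0]) for fn in separate}))       # 4

print([fn.__closure__[0].cell_contents for fn in shared])    # [3, 3, 3, 3]
print([fn.__closure__[0].cell_contents for fn in separate])  # [0, 1, 2, 3]
```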
