Sunday 8 May 2022

Python Lazy Object Initialization

Over the years I've posted several times about "lazy objects" (lazy initialization) in different languages. While implementing it in Python recently I've come up with a solution that I think is really interesting.

Of course the lazy object is transparent. I mean, the client does not know whether the object is being lazily initialized or not; "laziness" is an implementation detail that does not affect the interface. This has nothing to do with things like C#'s Lazy<T>. The class of the object that we're going to lazily initialize does not require any changes either. We design our class normally, and then when we see fit to use lazy initialization we create a lazy object from it.

Both in C# and in JavaScript I've used proxy objects to implement lazy initialization. I think this is the common approach and it looks quite good to me, but it has some performance implications. Once the object has been initialized the proxy mechanism remains in place, and on every property access it still checks whether the initialization has already been done (so that it either uses the initialized instance or invokes the initialization method).

Python has an amazing combination of features that allows the clean lazy initialization technique that I'm going to show here:

  • The __getattribute__ and __setattr__ methods. When defined in a class, these methods intercept attribute access on instances of that class and of derived classes. They intercept the access both "directly" (I mean, instance.attribute) and through the getattr and setattr functions. Notice that adding these methods directly to an instance rather than to the class has no effect.
  • Like JavaScript, Python allows us to create a class inside a function. This class can inherit from the value held in a variable (so it can dynamically inherit from one class or another). We can return this dynamic class from our function for further use outside it.

I leverage these features to create lazy objects by creating a new class that inherits from the class of the instance that I want to behave lazily. In this class I define the __getattribute__ and __setattr__ hooks/traps, and in its __init__ method I just store the parameters for later use. I carefully invoke the initialization method through the class rather than the instance, to avoid triggering the traps again. What is very interesting is that once I do the object initialization I remove both hooks from the class, so they no longer interfere with subsequent attribute accesses and there is no performance penalty.

So I have a factory function that creates instances of a dynamic _Lazy class:


def lazy(cls, *args, **kwargs):
    class _Lazy(cls):
        def __init__(self, *args, **kwargs):
            _Lazy.original_args = args
            _Lazy.original_kwargs = kwargs


        def _lazy_init(self):
            print(f"_lazy_init")
            # remove the traps so that they do not interfere in the next accesses
            del _Lazy.__setattr__
            del _Lazy.__getattribute__
            # invoke the __init__ of the "target" class
            super().__init__(*self.original_args, **self.original_kwargs)
            # change the class of the instance so that when we do a "type()" we no longer get "_Lazy", but the "real" class
            self.__class__ = _Lazy.__bases__[0]
            
        
        def __setattr__(self, name, value):
            print(f"setting attribute: {name}")
            #self._lazy_init() # can't do this as it will trigger the traps again
            # however, traps do not have effect on accesses to attributes through the class itself rather than through instances
            _Lazy._lazy_init(self)
            setattr(self, name, value)

        def __getattribute__(self, name):
            print(f"getting attribute: {name}")
            _Lazy._lazy_init(self)
            return getattr(self, name)
    
    return _Lazy(*args, **kwargs)

And I use it like this:


class Person:
    def __init__(self, name, age):
        print(f"{Person.__name__}.__init__")
        self.name = name
        self.age = age

    def say_hi(self, to_someone):
        print(f"{self.name} with age {self.age} says Bonjour to {to_someone}")

def test_1():
    lazy_p1 = lazy(Person, "Lazy Francois", 14)
    print(f"type: {type(lazy_p1).__name__}")
    # initialization takes place
    lazy_p1.say_hi("Xose") 
    
    # trap has been deactivated
    print(lazy_p1.name)
    lazy_p1.say_hi("Xose") 
    print(f"type: {type(lazy_p1).__name__}")

test_1()
# output:
# type: _Lazy
# getting attribute: say_hi
# _lazy_init
# Person.__init__
# Lazy Francois with age 14 says Bonjour to Xose
# Lazy Francois
# Lazy Francois with age 14 says Bonjour to Xose
# type: Person

As the icing on the cake, once the initialization is done I also change the __class__ of the instance (yes, one more powerful Python feature) so that it points to the original class again rather than to the derived _Lazy one. This way, if we check the type with type() we get the original type, meaning that after the initialization we can no longer tell whether the object was initialized normally or lazily.
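
The __class__ reassignment can be tried in isolation with the small sketch below. `Base` and `Derived` are hypothetical classes used only here. One caveat, as far as I know: CPython only accepts the assignment when both classes have compatible instance layouts, so a target class that declares __slots__, for example, would make it fail with a TypeError.

```python
class Base:
    def hello(self):
        return "hello from Base"


class Derived(Base):
    def hello(self):
        return "hello from Derived"


d = Derived()
before = (type(d).__name__, d.hello())

# Re-point the instance at its parent class; its attributes stay as they are.
d.__class__ = Base
after = (type(d).__name__, d.hello(), isinstance(d, Derived))

print(before)  # ('Derived', 'hello from Derived')
print(after)   # ('Base', 'hello from Base', False)
```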

I've uploaded the code to this gist.
