Friday, 12 August 2022

Python Descriptors

Properties (aka getters/setters) in Python do not look like a particularly "advanced" feature. In principle, it's just adding getter and setter methods to access a given attribute. Well, properties are a specific case of descriptors. In short, a descriptor is an object with a __get__ method and, optionally, __set__ and __delete__ methods (strictly speaking, defining any one of the three is enough). Adding a property to a class is not just directly adding a getter (setter, deleter) method to the class; it's adding an object that is an instance of the property class. There's a very nice explanation here:

descriptors are a low-level mechanism that lets you hook into an object's attributes being accessed. Properties are a high-level application of this; that is, properties are implemented using descriptors. Or, better yet, properties are descriptors that are already provided for you in the standard library.
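To make this concrete, here is a minimal sketch of a hand-written descriptor. The names Positive and Item are hypothetical, invented just for this illustration. It also checks that the built-in property class exposes the same __get__ and __set__ hooks.

```python
class Positive:
    """Hypothetical data descriptor: stores a value per instance, rejects values <= 0."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created; remember where to store the value.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed from the class: return the descriptor itself.
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage_name, value)


class Item:
    price = Positive()  # the descriptor object lives in Item.__dict__

    def __init__(self, price):
        self.price = price  # goes through Positive.__set__


item = Item(10)
print(item.price)                                    # 10, via Positive.__get__
print(isinstance(Item.__dict__["price"], Positive))  # True
print(hasattr(property, "__get__"), hasattr(property, "__set__"))  # True True
```

The value itself ends up in the instance's __dict__ under "_price"; the descriptor only controls how it is read and written.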

When we do an attribute lookup on an object and the attribute found is a descriptor (found in the class of that object, not directly in the object's __dict__), the runtime invokes its __get__ method (or __set__ or __delete__, for assignment and deletion). Yes, descriptor objects have to be in the class, not directly in the instance; you can read an explanation here
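A small sketch of that rule, using two hypothetical classes (Loud and Holder): the same kind of descriptor object is placed once in the class and once directly in the instance, and only the first one has its __get__ invoked.

```python
class Loud:
    """Hypothetical descriptor that announces when its __get__ runs."""

    def __get__(self, instance, owner=None):
        return "from __get__"


class Holder:
    in_class = Loud()  # stored in Holder.__dict__


h = Holder()
h.in_instance = Loud()  # stored in h.__dict__

print(h.in_class)                    # from __get__
print(type(h.in_instance).__name__)  # Loud (the raw object, __get__ was not called)
```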

What is really interesting about descriptors is that they are used to implement much of the Python behaviour we like so much. For example, I used to think that, in order to return a bound method when you access a method from an instance, the Python runtime would need some very specific code. It doesn't. A function (something that we define with "def" or "lambda") is an instance of the function class, which happens to be a descriptor. When we define a method in a class, we are adding that function object to the class's __dict__. So when we invoke that method on an instance (my_instance.method1("aaa")), we first look up the "method1" attribute in the my_instance object. It is found in the __dict__ of my_instance's class and, as it has a __get__ method (and has not been found directly in my_instance), that __get__ method is executed; it takes care of returning an object that is an instance of the method class, bound to my_instance. Finally, for the call itself (the ("aaa") part), as the retrieved object is callable, the __call__ method of the method class is invoked. So we have a language feature that is not implemented by some dedicated runtime/interpreter code, but just by using a standard Python feature.
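The lookup described above can be reproduced by hand. In this sketch (Greeter is a hypothetical class), we take the plain function out of the class __dict__ and call its __get__ ourselves, which gives the same bound method that normal attribute access returns.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, other):
        return f"Hi {other}, I am {self.name}"


g = Greeter("Iris")

func = Greeter.__dict__["greet"]  # the plain function stored in the class
bound = func.__get__(g, Greeter)  # what attribute lookup does for us

print(type(func).__name__)                          # function
print(type(bound).__name__)                         # method
print(bound.__self__ is g, bound.__func__ is func)  # True True
print(bound("Ann"))                                 # Hi Ann, I am Iris
print(bound == g.greet)                             # True
```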

The Python documentation about descriptors is really, really good, as it helps you understand pretty important elements of the language, like attribute lookup, bound methods, class methods (a decorator applied to a function descriptor)... It provides Python implementations of these features, so it's really worth a read.

Attribute lookup is implemented in object.__getattribute__ and type.__getattribute__. When looking up an attribute of an object, the runtime searches for the __getattribute__ method in that object's class. So for an instance of a "normal class" (let's say an instance p1 of class Person, accessed as p1.instance_attribute), the runtime searches for __getattribute__ in Person.__dict__ where, unless we have redefined it for the class, it won't find it, so it continues searching up the inheritance chain and ends up finding it in object.__dict__["__getattribute__"]. If we are looking up an attribute on a class itself (Person.my_class_attribute), it searches in Person's class, which is either type or a class derived from type (if Person uses another metaclass, I mean: class Person(metaclass=CustomMetaclass):). So, unless the metaclass overrides it, we end up with type.__dict__["__getattribute__"].
If the __getattribute__ that we've executed finds a descriptor, it invokes the descriptor's __get__ with desc.__get__(a, type(a)) when coming from object.__getattribute__ (a being the instance), and with desc.__get__(None, A) when coming from type.__getattribute__ (A being the class).
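A quick way to see both call forms is a descriptor that simply returns the arguments it receives. Spy and A are hypothetical names used only for this sketch.

```python
class Spy:
    """Hypothetical descriptor that returns the arguments passed to __get__."""

    def __get__(self, instance, owner=None):
        return (instance, owner)


class A:
    attr = Spy()


a = A()

# Instance access goes through object.__getattribute__: desc.__get__(a, type(a))
print(a.attr == (a, A))     # True

# Class access goes through type.__getattribute__: desc.__get__(None, A)
print(A.attr == (None, A))  # True

# The same calls done by hand on the raw descriptor object
desc = A.__dict__["attr"]
print(desc.__get__(a, type(a)) == a.attr)  # True
print(desc.__get__(None, A) == A.attr)     # True
```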

When you check the class of an object using the __class__ attribute, that attribute is also implemented as a descriptor (same as __name__, as I already mentioned here), so __class__ is not in the object's dictionary but in the class, specifically up its inheritance chain, in object.__dict__['__class__']. There's a good explanation here
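This can be checked directly. The sketch below assumes CPython, where these attributes are exposed as descriptor objects in object.__dict__ and type.__dict__; Person is just a placeholder class.

```python
class Person:
    pass


p1 = Person()

# __class__ is stored neither in the instance nor in Person itself
print("__class__" in p1.__dict__)      # False
print("__class__" in Person.__dict__)  # False

# It is a descriptor living in object.__dict__
class_desc = object.__dict__["__class__"]
print(hasattr(class_desc, "__get__"))         # True
print(class_desc.__get__(p1, Person) is Person)  # True

# __name__ works the same way, but it lives in type.__dict__
name_desc = type.__dict__["__name__"]
print(name_desc.__get__(Person, type))  # Person
```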

It's interesting that properties are implemented to match the behaviour of object.__getattribute__ and type.__getattribute__, so that accessing a property from an instance gives us the value, while accessing it from the class returns the property object.


In [70]: class Book:
    ...:     @property
    ...:     def price(self):
    ...:         return 4
    ...:         

In [71]: b1 = Book()

In [72]: b1.price
Out[72]: 4


In [74]: b1.__class__.__dict__["price"]
Out[74]: <property at 0x1c31f647ce0>

In [75]: b1.__class__.price
Out[75]: <property at 0x1c31f647ce0>
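The behaviour in that session comes down to how __get__ treats its first argument. Below is a simplified, read-only stand-in for property, written in pure Python. It is only a sketch: the real property is implemented in C and also supports setters and deleters, and ReadOnlyProperty is a hypothetical name.

```python
class ReadOnlyProperty:
    """Simplified, hypothetical stand-in for the built-in property (getter only)."""

    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner=None):
        if instance is None:
            # Called as desc.__get__(None, Book): return the descriptor itself.
            return self
        # Called as desc.__get__(b1, Book): compute and return the value.
        return self.fget(instance)

    def __set__(self, instance, value):
        raise AttributeError("can't set attribute")


class Book:
    @ReadOnlyProperty
    def price(self):
        return 4


b1 = Book()
print(b1.price)                                   # 4
print(Book.price is Book.__dict__["price"])       # True
print(isinstance(Book.price, ReadOnlyProperty))   # True
```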

This is a bit different in JavaScript, a language that also has the concept of descriptors (there they are used for every property, not just for getters/setters, aka "accessor descriptors"), but where, to get the descriptor object rather than the value, we have to use a specific method: Object.getOwnPropertyDescriptor


class Person{
    constructor(name, lastName){
        this.name = name;
        this.lastName = lastName;
    }

    get fullName(){
        return `${this.name} ${this.lastName}`; 
    }

    sayHi(toX){
        return `Hi ${toX} I am ${this.name}`
    }

}

let p1 = new Person("Francois", "Belmondo")
console.log(p1.fullName); //Francois Belmondo

// the fullName accessor descriptor lives in Person.prototype (just like the sayHi function)
// both accesses below run with Person.prototype as "this", so name and lastName are undefined
console.log(Person.prototype.sayHi("aa")); //Hi aa I am undefined
console.log(Person.prototype.fullName); //undefined undefined

let d1 = Object.getOwnPropertyDescriptor(Person.prototype, "fullName"); //the descriptor

let d2 = Object.getOwnPropertyDescriptor(p1, "fullName"); // undefined
let d3 = Object.getOwnPropertyDescriptor(Person, "fullName"); //undefined

