Note that in this post I'm talking about "standard" Python classes, not dataclasses. I recently became aware that you can use class-level type hints in your classes, but when I read the documentation I found it rather confusing. To make sense of it we have to keep in mind the difference between the intent we express with those class-level hints and their runtime effects. The documentation gives this example:
class BasicStarship:
    captain: str = 'Picard'  # instance variable with default
    damage: int  # instance variable without default
    stats: ClassVar[dict[str, int]] = {}  # class variable
The 'damage: int' part is the form of class-level type hint that I already knew about and that was clear to me. We declare an attribute and its type, but we don't initialize it. Python treats this purely as typing information: it has no runtime effect other than adding an entry to the class's __annotations__, and no attribute is created in the class object.
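A quick way to check this at runtime, as a minimal sketch (the `Ship` class is a made-up name for illustration, it is not taken from the documentation):

```python
class Ship:
    damage: int  # annotation only, no value assigned


# The annotation is recorded on the class...
annotations = Ship.__annotations__
# ...but no attribute named 'damage' exists on the class or on its instances.
on_class = hasattr(Ship, "damage")
on_instance = hasattr(Ship(), "damage")

print(annotations)   # {'damage': <class 'int'>}
print(on_class)      # False
print(on_instance)   # False
```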
The "captain: str = 'Picard'" line is what I could not understand. To me it looks like the normal way of adding a class attribute, only with the type indicated as well, so how can the documentation call it an "instance variable with default"? Well, it's the type-checking meaning versus the runtime effect. I am right that an attribute gets created at the class level (in the class __dict__), just see:
>>> class User:
...     continent: str = "Europe"
...     active = True
...
>>> User.__dict__
mappingproxy({'__module__': '__main__', '__firstlineno__': 1, '__annotations__': {'continent': <class 'str'>}, 'continent': 'Europe', 'active': True, '__static_attributes__': (), '__dict__': <attribute '__dict__' of 'User' objects>, '__weakref__': <attribute '__weakref__' of 'User' objects>, '__doc__': None})
>>> User.continent
'Europe'
>>> User.active
True
But for the type checker, that typed declaration means that instances of the class will have a captain (or continent, in my example) attribute. This may feel contradictory, but given how attribute lookup works it's perfectly fine. Initially the 'captain' attribute is created at the class level. If we read it through an instance (my_ship.captain), the lookup mechanism won't find it in the instance, but it will find it in the class and return it. Then, when we write to it through an instance (not through the class), the write goes to the instance, so a 'captain' attribute is added to the instance. That's fine; indeed, it's very nice: while the attribute is only being read, it is shared between instances and kept in the class (saving memory), and as soon as you write to it through an instance, the instance attribute shadows the class one.
s = BasicStarship()
print(s.captain) # "Picard" via class lookup
s.captain = "Xuan" # creates an instance attribute
print(s.__dict__) # {'captain': 'Xuan'}
print(BasicStarship.__dict__['captain']) # 'Picard'
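To make the sharing and shadowing more visible, here is a small sketch with two instances. It uses a trimmed-down `Starship` class (a hypothetical name, keeping only the `captain` attribute) so that it runs on its own:

```python
class Starship:
    captain: str = "Picard"  # stored once, on the class


a = Starship()
b = Starship()

a.captain = "Xuan"            # shadows the class attribute on `a` only
print(a.captain)              # Xuan (found in a.__dict__)
print(b.captain)              # Picard (still read from the class)

Starship.captain = "Janeway"  # rebinding on the class...
print(b.captain)              # Janeway (...is seen by instances that don't shadow it)
print(a.captain)              # Xuan

del a.captain                 # removing the instance attribute...
print(a.captain)              # Janeway (...uncovers the class attribute again)
```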
We can summarize it like this:
Type hints alone do not create attributes; they only declare intent.
If you want the attribute to exist on the class (and thus be visible via Foo.x), you must assign a default value.
By the way, this is not the first time I have seen this behaviour of reading values from a "parent object" until we write the value to the object itself, shadowing it. This is just how things work in JavaScript with the [[Prototype]] chain.
I'm not much of a fan of defining instance attributes at the class level. It's true that it makes it very explicit that an attribute is part of the public contract of the class, but I think most of the time it's just boilerplate. Type checkers and autocomplete work perfectly fine with the classical style of initializing attributes in the __init__ method, and if an attribute is internal/private and should not be considered part of the public API we should just follow the convention of starting its name with '_'. So normally I would write the above code like this:
from typing import ClassVar
class AdvancedStarship:
    # stats = {}  # mypy complains about this: the empty dict needs a type annotation
    stats: ClassVar[dict[str, int]] = {}  # class variable
    def __init__(self, damage: int, captain: str = 'Picard') -> None:
        self.captain = captain
        self.damage = damage
The case where these class-level type hints feel very useful to me is Protocols. They make it unnecessary to declare the "data part" of the protocol with properties (get/set descriptors), which is the approach I had been following so far.
from typing import Protocol
class Foo(Protocol):
    x: int  # part of the interface
class Bar:
    def __init__(self):
        self.x = 42  # matches Foo
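For comparison, a minimal sketch of both ways of declaring the data part of a protocol. The names `HasXProperty`, `HasXAttribute` and the two `read_*` helper functions are hypothetical, made up for this illustration, and the property version shown here is read-only, which is only one of the ways the property style can be written:

```python
from typing import Protocol


class HasXProperty(Protocol):
    # Property-based style: the data member is declared as a getter.
    @property
    def x(self) -> int: ...


class HasXAttribute(Protocol):
    # Class-level type hint style: just declare the attribute.
    x: int


class Bar:
    def __init__(self) -> None:
        self.x = 42  # a plain instance attribute


def read_via_property(obj: HasXProperty) -> int:
    return obj.x


def read_via_attribute(obj: HasXAttribute) -> int:
    return obj.x


# Bar never inherits from either protocol; the match is purely structural.
print(read_via_property(Bar()))   # 42
print(read_via_attribute(Bar()))  # 42
```

Both functions should accept `Bar` under a type checker, but the annotation version says the same thing in one line.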
It's also useful if we have attributes that won't be set in __init__, but in some later method call. This way we make them part of the class contract and initialize them to a default value (probably None) that is shared by all instances via the class attribute (as we saw with BasicStarship.captain). The attribute is then added to each instance when it is first set to a specific value through that instance.
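A minimal sketch of that pattern, assuming a made-up `Connection` class whose `session_id` is only known once a hypothetical `open()` method is called (none of these names come from the documentation):

```python
from typing import Optional


class Connection:
    # Declared at class level: part of the contract, None until open() runs.
    session_id: Optional[str] = None

    def __init__(self, host: str) -> None:
        self.host = host

    def open(self, session_id: str) -> None:
        # The first write through the instance creates the instance attribute.
        self.session_id = session_id


conn = Connection("example.org")
before = dict(vars(conn))    # only 'host' so far
default = conn.session_id    # None, read from the class
conn.open("abc123")
after = dict(vars(conn))     # now also holds 'session_id'

print(before)   # {'host': 'example.org'}
print(default)  # None
print(after)    # {'host': 'example.org', 'session_id': 'abc123'}
```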