Friday 14 July 2023

Python Protocols

When I started to use Type Hints in Python (with whatever static type checker, normally mypy) it felt a bit to me as using TypeScript, though I feel way more free and comfortable with type hints. Additionally, there's a huge difference. TypeScript type system uses Structural Typing (objects are compatible if their types have the same "shape" regarless), and Python Type Hints use Nominal Typing. From wikipedia:

In computer science, a type system is a nominal or nominative type system (or name-based type system) if compatibility and equivalence of data types is determined by explicit declarations and/or the name of the types. Nominal systems are used to determine if types are equivalent, as well as if a type is a subtype of another. Nominal type systems contrast with structural systems, where comparisons are based on the structure of the types in question and do not require explicit declarations.

However, Python has some support for Structural Typing (both at "static type checking time" and at runtime) thanks to the recent addition of Protocols. We define a protocol by defining a class inheriting from Protocol. Whatever methods, properties or attributes we include in that class define a protocol (a structure, a shape) and the static type checker (mypy) will check if the "shape" of an object matches with what is defined in the Protocol, regardless of checking the type (the name of the class) of that object.

The above seems clear, and for example we can define a protocol for objects that can be moved like this:


from typing import Protocol

class Movable(Protocol):
    def move(self, to_x: int, to_y:int):
        ...

Unrelated objects that can be moved (feature a "move" method) will be considered as Movable by the static type checker (e.g. mypy) without explicitly inheriting from it.


class Animal:
    def move(self, to_x: int, to_y:int):
        print(f"moving to {to_x}-{to_y}")

# fine for mypy, Animal complies with the Movable protocol
mv3: Movable = Animal()

That's the most basic use of Protocols, but there's much more it it.

Protocols are related to abstract classes, indeed, the Protocol class is an abstract class. As we saw in a previous post, normally we create abstract classes by inheriting from abc.ABC, but what we achieve with that is having abc.ABCMeta as our metaclass, and it's in ABCMeta where the abstract magic is implemented. Likewise, when we create a protocol by inheriting from Protocol what we achieve is having typing._ProtocolMeta as our metaclass, and _ProtocolMeta inherits from ABCMeta, so Protocol is an abstract class.


print(f"issubclass of abc.ABCMeta: {issubclass(type(Protocol), abc.ABCMeta)}")
# True

print(f"type(Protocol): {type(Protocol)}")
# type(Protocol): typing._ProtocolMeta

Protocols are intended as "contracts" and we can not create instances of a Protocol. We get a runtime error if we try:


# Trying to instantiate a Protocol causes both:
# - typings error: Cannot instantiate protocol class "Movable"
# - runtime error: TypeError('Protocols cannot be instantiated')

mv: Movable = Button()

A class is considered a protocol only if it directly inherits from Protocol. A class that inherits from another class that inherits from Protocol is not a Protocol, as I've said, it has to directly inherit from Protocol. Anyway, we can define subprotocols (protocols that extend other protocols) by inheriting from those protocols and also directly inheriting from Protocol. From PEP 544:

Subclassing a protocol class would not turn the subclass into a protocol unless it also has typing.Protocol as an explicit base class. Without this base, the class is “downgraded” to a regular ABC that cannot be used with structural subtyping.
If Protocol is included in the base class list, all the other base classes must be protocols. A protocol can’t extend a regular class.

A very interesting feature is that we can do structural typing checks at runtime with issubclass() and isinstance(). For that we have to decorate our protocol with @runtime_checkable. Otherwise those methods will throw an Exception, and the static type checker (mypy) will also do so. If we look into the cpython source code we can see that class _ProtocolMeta(ABCMeta): has its own implementation of the __subclasscheck__ and __instancecheck__ methods that we saw in another previous post, and they check for the existence of an _is_runtime_protocol attribute in the class (that is added by the @runtime_checkable decorator).

When we define a class that we know complies with an existing protocol it makes sense to explicitly inherit from it. On one side this makes the contract explicit, and on the other side if the protocol changes mypy will warn us about the class no longer complying with it. There's an additional use for this, when combined with marking the empty methods defined in a protocol as abstract methods. This is useful because if then we decide to explicitly inherit from the protocol in those clasess that comply with it, if we forget to implement any of those methods we'll get not only a mypy typechecking error, but also a runtime error "Can't instantiate abstract class X with abstract methods ..."

Once you start to create classes that inherit from protocols there's another feature that makes sense. Same as Kotlin and modern Java and C# interfaces can have default implementations, you can have non empty methods in your protocols.

No comments:

Post a Comment