Thursday, 28 September 2023

Python, abc's vs Protocols

In languages like Kotlin or C# the differences between interfaces and abstract classes are quite clear. Both constructs have different features. Interfaces cannot have state (save using very arcane Kotlin techniques) but abstract classes can. We can implement multiple interfaces but only extend one abstract class. In the past interfaces could not have "real" methods, just method declarations, but Kotlin and modern C# provide default interface methods. Because of the multiple inheritance from interfaces but single inheritance from classes there's normally a semantic difference. Interfaces normally define capabilities (and a class can have multiple capabilities by implementing multiple interfaces), while using an abstract class implies an "is-a" relation. That said, I've worked on projects where the architects had defined interfaces without default methods that were not intended for multiple inheritance (they were not capabilities, they were a "main responsibility"), and I think pure abstract classes would have made more sense.

In Python, given its dynamic nature and the fact that it features multiple inheritance from classes, it didn't seem necessary to add interfaces to the language. I think we can say that abstract classes in Python (abc's, which were not initially present either) play the role of both abstract classes and interfaces in other languages. Comparing the different capabilities of Python, Kotlin and C#, the only reason that occurs to me for having a specific interface mechanism in Python would be to have a restricted form of abc's, lacking state. Well, there's a package on PyPI for declaring interfaces (https://pypi.org/project/python-interface/), but reading its documentation on how they differ from abc's, the motivation is throwing errors earlier and making the method signatures part of the contract...
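
As a rough sketch of that double role (Serializable, Shape and Square are hypothetical names made up for this illustration): a stateless abc behaves like an interface, a stateful one behaves like an abstract class, and a concrete class can inherit from both.

```python
from abc import ABC, abstractmethod


class Serializable(ABC):
    # "interface-like" abc: no state, just a method declaration
    @abstractmethod
    def serialize(self) -> str:
        ...


class Shape(ABC):
    # "abstract-class-like" abc: shared state plus an abstract method
    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def area(self) -> float:
        ...


class Square(Shape, Serializable):
    def __init__(self, side: float):
        # initialize the state defined in the abstract base
        super().__init__("square")
        self.side = side

    def area(self) -> float:
        return self.side ** 2

    def serialize(self) -> str:
        return f"{self.name}:{self.side}"


sq = Square(2)
print(sq.area())       # 4
print(sq.serialize())  # square:2
```

Nothing in the language distinguishes the two base classes; the "interface" is just an abc that happens to carry no state.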

As we've seen in a previous post, a few releases ago Python introduced a new feature, Protocols. The main motivation was adding structural typing, but when you see the non-basic examples and learn that they are a particular form of abc's, it seems like Protocols are abc's with extra powers (structural typing) and that, unless you don't want structural typing, we should move from abc's to Protocols. Well, in Python 3.10 (and below) there was an important difference: protocols could not have an __init__ method, which means that they could not have useful state (you could add attributes to them when executing other methods, but in most use cases state is needed at construction/initialization time). This lack of state indeed made them feel like interfaces. The limitation seemed more like a byproduct than an intended feature. It was the way to prevent Protocols from being instantiated, and it was implemented by replacing any __init__ method that you had defined in the Protocol with an "empty" _no_init method. The problem was that this not only prevented you from instantiating the protocol (which is good) but also prevented you from defining common state and initialization logic that could be invoked from child classes.
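
As a quick reminder of what structural typing means here, a minimal sketch (Greeter, EnglishGreeter and welcome are hypothetical names, not part of any library). I use @runtime_checkable only so the structural check can also be seen at runtime with isinstance; a static type checker does not need it.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Greeter(Protocol):
    def greet(self, name: str) -> str:
        ...


class EnglishGreeter:
    # note: it does NOT inherit from Greeter, it just has the right method
    def greet(self, name: str) -> str:
        return f"Hello {name}"


def welcome(greeter: Greeter, name: str) -> str:
    # a type checker accepts any object whose shape matches Greeter
    return greeter.greet(name)


print(welcome(EnglishGreeter(), "Francois"))  # Hello Francois
print(isinstance(EnglishGreeter(), Greeter))  # True
```

With a plain abc the isinstance check would only pass if EnglishGreeter inherited from (or was registered with) the base class.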

This limitation made sense when using protocols in the basic way, just for structural typing. But when you combine structural typing with abc features, decorating the protocol methods with @abstractmethod, this __init__ replacement is not necessary, as having abstract methods already prevents instantiation of the protocol (the same way they prevent instantiation of abstract classes).
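
A small sketch of both mechanisms, under the assumption that only the outcome matters and not the exact error text, which varies between Python versions (PlainProto, AbstractProto, Impl and can_instantiate are hypothetical names used just for this example):

```python
from abc import abstractmethod
from typing import Protocol


class PlainProto(Protocol):
    # basic protocol: no abstract methods, no __init__ of its own
    def run(self) -> None:
        ...


class AbstractProto(Protocol):
    # protocol using the abc machinery
    @abstractmethod
    def run(self) -> None:
        ...


class Impl(AbstractProto):
    # explicit (nominal) implementation of the protocol
    def run(self) -> None:
        print("running")


def can_instantiate(cls) -> bool:
    # helper for this example only: True if cls() succeeds
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(PlainProto))     # False: protocols can't be instantiated
print(can_instantiate(AbstractProto))  # False: it has an abstract method
print(can_instantiate(Impl))           # True
```

In the second case the rejection comes from the abstract method, so the protocol's own instantiation guard adds nothing.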

You can read this Stack Overflow discussion about the replacement of __init__ by _no_init and how many consider it a problem.

In Python 3.11 an __init__ defined in a Protocol is no longer replaced by _no_init (there's a long discussion about it here), so you can write code like this:



from abc import abstractmethod
from typing import Protocol


class Formatter(Protocol):
    # In Python 3.10 and below this __init__ gets replaced by _no_init,
    # so the initialization logic below would never run.
    # In Python 3.11 that's no longer the case.
    def __init__(self, name):
        print("Formatter.__init__")
        self.name = name

    # If I don't declare it as abstract I'll be able to instantiate this Protocol, which makes no sense,
    # so I declare it as @abstractmethod and get:
    # "Can't instantiate abstract class Formatter with abstract method format"
    @abstractmethod
    def format(self, txt):
        pass


class WrapFormatter(Formatter):
    def __init__(self, name, wrap):
        print("WrapFormatter.__init__")
        # in Python 3.11 this reaches Formatter.__init__
        # (in 3.10 it would reach the _no_init replacement instead)
        super().__init__(name)
        self.wrap = wrap

    def format(self, txt):
        return f"{self.wrap}{txt}{self.wrap}"


try:
    f = Formatter("abstract formatter")
except TypeError as ex:
    print(f"Exception: {ex}")
    # as I have an abstract method (format) I get: "Can't instantiate abstract class Formatter with abstract method format"

f1 = WrapFormatter("wrapper", "||")
# it prints:
# WrapFormatter.__init__
# Formatter.__init__
print(f1.format("Francois"))
# it prints:
# ||Francois||



With all the above, my conclusion is that unless for some reason you don't want the structural typing feature, we should use Protocols instead of abc's.
