Saturday 5 November 2022

Python Identity and Equality

There are quite a few things we should be aware of regarding identity, equality and comparison in Python. I won't explain here the difference between identity (also known as reference equality, mainly in C#) and equality (also known, again mainly in C#, as value equality); it's a general Computer Science concept that you should already be clear on before reading this post. I already talked about it quite a while ago.

In Python, for identity comparison we use the is and is not operators. They check whether two variables point to the same object, that is, to the same memory location. ob1 is ob2 is the same as doing id(ob1) == id(ob2).
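As a quick illustration of the paragraph above, here is a minimal sketch; the variable names a, b and c are just placeholders chosen for this example.

```python
# Two lists with equal contents but distinct identities.
a = [1, 2, 3]
b = [1, 2, 3]
c = a  # c is just another name for the object that a points to

print(a == b)          # True: same contents
print(a is b)          # False: two separate objects
print(a is c)          # True: one object, two names
print(id(a) == id(c))  # True: agrees with `a is c`
print(a is not b)      # True
```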

For equality comparison we use the == and != operators. Their behaviour is determined by the __eq__ and __ne__ methods in the class of our object, so we can customize it (operator overloading) by implementing them in our class. Otherwise, the attribute resolution algorithm will end up using object.__eq__ and object.__ne__. These default methods just do an identity/reference comparison. Obviously, classes like str or int implement their own __eq__ and __ne__. This means that the expression str.__eq__ is object.__eq__ evaluates to False. You can read more here:
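The following sketch shows both situations side by side. Plain and Point are hypothetical classes invented for this example: the first one relies on the default object.__eq__, the second one overrides it.

```python
class Plain:
    """No __eq__ defined, so object.__eq__ (identity comparison) is used."""
    def __init__(self, x):
        self.x = x


class Point:
    """Customizes == so that it compares values instead of identities."""
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented  # let Python try the other operand
        return self.x == other.x


print(Plain(1) == Plain(1))           # False: different objects
print(Point(1) == Point(1))           # True: same value
print(Plain.__eq__ is object.__eq__)  # True: inherited from object
print(str.__eq__ is object.__eq__)    # False: str has its own __eq__
```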

Because all types are (direct or indirect) subtypes of object, they inherit the default comparison behavior from object. Types can customize their comparison behavior by implementing rich comparison methods like __lt__(), described in Basic customization.

The default behavior for equality comparison (== and !=) is based on the identity of the objects. Hence, equality comparison of instances with the same identity results in equality, and equality comparison of instances with different identities results in inequality. A motivation for this default behavior is the desire that all objects should be reflexive (i.e. x is y implies x == y).

I've just talked about implementing the __ne__ method. Well, that's not really necessary: we only need to implement __eq__ to customize the equality behaviour for a class. != will look for a __ne__ method in the class of our instance and, as we have not implemented it, it will end up in object.__ne__, which will call __eq__ on that instance (the method that we have implemented) and negate its result (unless that result is NotImplemented). You can find an in-depth discussion here:

Python 3.x: indeed, the default implementation of __ne__ calls __eq__ and negates the result (but on the technical level there are some subtleties, please read on).
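A small sketch of that behaviour, using a hypothetical Money class that defines only __eq__:

```python
class Money:
    """Defines __eq__ only; != falls back to object.__ne__."""
    def __init__(self, amount):
        self.amount = amount

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return self.amount == other.amount


print(Money(5) == Money(5))           # True
print(Money(5) != Money(5))           # False: negation of __eq__
print(Money(5) != Money(7))           # True
print(Money.__ne__ is object.__ne__)  # True: we never defined __ne__
```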

Fortunately, contrary to JavaScript, we could say that Python does not have type coercion/implicit conversion between unrelated types (numeric types are the exception: 1 == 1.0 is True). I mean, the equality comparison implementations (__eq__, __ne__) in classes like int and str (or the __add__ method) will not automatically convert one type to another if they receive an object of a different type. In Python 5 == "5" is False, while in JavaScript it's true. In the same way, 3 + "5" in Python raises a TypeError, while JavaScript converts the number to a string and returns "35".
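The Python side of that comparison can be checked with a short sketch like this one (error_name is just a helper variable for the example). If we do want a conversion, we have to ask for it explicitly.

```python
print(5 == "5")  # False: an int is never equal to a str

try:
    3 + "5"
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)  # TypeError

# Conversions must be explicit
print(3 + int("5"))  # 8
print(str(3) + "5")  # 35
```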

in operator. The in operator is really useful and expressive. One might wonder how it works regarding value or reference equality. Well, this answer gives an excellent explanation. In short, in uses the __contains__ method defined in the container object. For standard containers __contains__ does this:

For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).
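The identity check that comes first has visible consequences. In this sketch NeverEqual is a hypothetical class whose __eq__ always returns False, and NaN is the classic built-in value that is not equal to itself.

```python
class NeverEqual:
    """__eq__ always says no, even when compared with itself."""
    def __eq__(self, other):
        return False


item = NeverEqual()
print(item == item)            # False
print(item in [item])          # True: `x is e` succeeds before == is tried
print(NeverEqual() in [item])  # False: neither identical nor equal

nan = float("nan")
print(nan == nan)    # False
print(nan in [nan])  # True, again thanks to the identity check
```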

Additional considerations. When comparing strings or numbers, we should always use == rather than is. When checking whether an object is None, and given that None is a singleton, in most cases using is or == would be equivalent, and is is considered better style. There could be some classes for which __eq__ has been implemented so that comparing to None returns True, which I think is very unlikely. In those odd cases you should use ==, unless you want to skip that odd behaviour...
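To make the odd case concrete, here is a sketch with a hypothetical Weird class whose __eq__ returns True for anything, None included:

```python
class Weird:
    """Claims to be equal to everything."""
    def __eq__(self, other):
        return True


w = Weird()
print(w == None)  # True: the custom __eq__ is consulted
print(w is None)  # False: identity cannot be overridden

value = None
print(value is None)  # True: the usual, recommended check
```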

Sort comparisons. Regarding the >, <, >=, <= comparisons, this explains it pretty well.
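As a minimal sketch of how these operators can be customized, the hypothetical Version class below defines only __eq__ and __lt__ and lets functools.total_ordering from the standard library derive the remaining comparisons. This is just one convenient way of doing it, not the only one.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Ordered by its (major, minor) tuple."""
    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key


v1, v2 = Version(1, 2), Version(1, 10)
print(v1 < v2)   # True
print(v1 >= v2)  # False: derived by total_ordering
print([v.key for v in sorted([v2, v1])])  # [(1, 2), (1, 10)]
```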

Some polyglot notes: In Java we use the == operator for identity, and Object.equals (which we can override) for equality. In JavaScript it's common to say that we have an identity operator === and an equality operator ==. As there is no operator overloading we cannot customize equality ourselves; it's all determined by the algorithms defined by the language, which are very well explained here.
