Every now and then we need a flag or sentinel value: a special, unique value that we can distinguish from the normal values we are processing and that carries a particular meaning, such as signaling the end of data processing, of a loop, or of an operation. For example, in my recent post about adding null-safety to a pipe function I used two sentinels/flags: NULL_SAFE and COALESCE.
The essential requirement for a sentinel value is a unique identity, so that comparing it by identity (Python: is, JavaScript: ===) with any other value/object in our system returns false. So the simplest approach is to use a new object for each of our sentinels.
NO_INVEST = object()  # Sentinel value

def invest(amount: int | None | object) -> str:
    if amount is NO_INVEST:
        return "No investment"
    else:
        amount = amount or 0
        return f"we've invested {amount}"

print(invest(NO_INVEST))  # Output: No investment
print(NO_INVEST)  # Output: <object object at 0x...>
That simple approach works fine, but it's missing a few things. Printing the value is messy (we get an "<object object at 0x...>" representation, when it would be nice to get NO_INVEST) and it's rather typing-unfriendly. Saying that invest can receive an object, apart from int or None, lacks any meaning, since every value is an object. What kind of object is that? A sentinel value should (mainly) have these features:
- A unique identity (is comparison)
- A meaningful repr (debuggability)
- Clear typing (especially for static type checkers)
Our basic sentinel only provides the first one (identity). That's why some smart guy came up with the very interesting PEP 661, proposing a new Sentinel class. Unfortunately that PEP is in deferred status. The document also presents different techniques commonly used for sentinel values, like the simple object() that I've just shown, using an enum, or using a class. Using a class is the best approach to me; I'll show several iterations until getting to what I think is the best we can get so far.
Approach 1. Classes are objects in Python. We can use a class for each sentinel object, and to get a nice representation we can give it a metaclass with a custom __repr__. The missing piece is typing friendliness: we're still stuck with an Any signature. Additionally, declaring a class for something that is not intended to work as an object factory, but to be used as an object in itself, is rather unnatural.
from typing import Any

class SentinelMeta(type):
    # repr() of the class itself (not of its instances) is its bare name
    def __repr__(cls):
        return cls.__name__

class Sentinel(metaclass=SentinelMeta):
    pass

class NO_INVEST(Sentinel): pass

#def invest(value: int | None | Type[NO_INVEST]) -> str: # this signature feels a bit strange, but it works
#def invest(value: int | None | Literal[NO_INVEST]) -> str: # this one feels better, but only works with mypy, not with pylance
def invest(amount: int | None | Any) -> str:
    if amount is NO_INVEST:
        return "No investment"
    else:
        amount = amount or 0
        return f"we've invested {amount}"

print(invest(NO_INVEST))  # Output: No investment
print(NO_INVEST)  # Output: NO_INVEST
Approach 2. We can make the usage much more natural by hiding the class creation behind a function. That also allows us to skip the Sentinel base class.
from typing import Any

class SentinelMeta(type):
    def __repr__(cls):
        return cls.__name__

def sentinel(name: str):
    # Create a new class (our sentinel object) named `name`, with no bases
    return SentinelMeta(name, (), {})

NO_INVEST = sentinel("NO_INVEST")

#def invest(value: int | Type[NO_INVEST]) -> str: # pylance doesn't like it, using a variable as type
#def invest(value: int | Sentinel) -> str: # we don't have a Sentinel class... so forget it
def invest(amount: int | None | Any) -> str:
    if amount is NO_INVEST:
        return "No investment"
    else:
        amount = amount or 0
        return f"we've invested {amount}"

print(invest(NO_INVEST))  # Output: No investment
print(NO_INVEST)  # Output: NO_INVEST
This one feels quite natural to use, but we still have the problem with typing. We can leverage the generic types/class subscripting that we saw in my previous post to get something like this (Approach 3):
class SentinelMeta(type):
    def __repr__(cls):
        return cls.__name__

class Sentinel(metaclass=SentinelMeta):
    # Sentinel[X] is purely decorative: it evaluates to Sentinel itself
    def __class_getitem__(cls, item):
        return cls

def sentinel(name: str):
    return SentinelMeta(name, (Sentinel,), {})

NO_INVEST = sentinel("NO_INVEST")

#def invest(value: int | Sentinel) -> str:
def invest(amount: int | None | Sentinel[NO_INVEST]) -> str: # at a typing level it's equivalent to the above, but it provides extra meaning
    if amount is NO_INVEST:
        return "No investment"
    else:
        amount = amount or 0
        return f"we've invested {amount}"

print(invest(NO_INVEST))  # Output: No investment
print(NO_INVEST)  # Output: NO_INVEST
Hey, this one feels rather good to me. The Sentinel[NO_INVEST] looks pretty nice in that signature. The type checking is mostly off, because our sentinel() function is not typed, so it's considered to return Any, and passing an Any to a function disables type checking for that argument. This means that for the type system any sentinel that we create with sentinel() is just an Any, so the type checker will allow passing any sentinel to a function that expects a specific one. It's not a problem for me; what I'm mainly interested in is the semantics, the clarity, that this type annotation gives to the function signature.
Given that we are using sentinel classes as objects, not as object factories, we can make these classes more object-like by preventing instantiation. Additionally, sentinels do not have attributes and are not intended to be extended with attributes, so we can prevent attributes from being added dynamically. All in all we get:
class SentinelMeta(type):
    def __repr__(cls):
        return cls.__name__

    # prevent sentinel classes from being instantiated
    def __call__(cls, *args, **kwargs):
        raise TypeError(f"{cls.__name__} is a sentinel and cannot be instantiated")

    # prevent sentinel classes from being modified
    def __setattr__(cls, name, value):
        raise AttributeError(f"Cannot modify sentinel {cls.__name__}")

class Sentinel(metaclass=SentinelMeta):
    def __class_getitem__(cls, item):
        return cls

def sentinel(name: str):
    return SentinelMeta(name, (Sentinel,), {})

NO_INVEST = sentinel("NO_INVEST")
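A quick way to check those two guards is to trigger them on purpose. The sketch below repeats the definitions above so it runs standalone; the messages list and the limit attribute are just helpers I made up for the demonstration.

```python
class SentinelMeta(type):
    def __repr__(cls):
        return cls.__name__

    def __call__(cls, *args, **kwargs):
        raise TypeError(f"{cls.__name__} is a sentinel and cannot be instantiated")

    def __setattr__(cls, name, value):
        raise AttributeError(f"Cannot modify sentinel {cls.__name__}")


class Sentinel(metaclass=SentinelMeta):
    def __class_getitem__(cls, item):
        return cls


def sentinel(name: str):
    # Calling the metaclass goes through type.__call__, not through
    # SentinelMeta.__call__, so creating the sentinel itself still works
    return SentinelMeta(name, (Sentinel,), {})


NO_INVEST = sentinel("NO_INVEST")

messages = []  # collects the error messages, for illustration only

try:
    NO_INVEST()  # "instantiating" the sentinel is rejected
except TypeError as exc:
    messages.append(str(exc))

try:
    NO_INVEST.limit = 10  # hypothetical attribute; assignment is rejected
except AttributeError as exc:
    messages.append(str(exc))

for message in messages:
    print(message)
print(hasattr(NO_INVEST, "limit"))
```

Note that only attribute assignment is blocked here; the sketch does not try to cover every way of mutating a class.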
There's something missing in this implementation: support for pickling our sentinels (particularly important if we plan to use them in multiprocessing scenarios). That complicates the design and I've deliberately left it aside for the moment. Maybe we'll see it in another post.