Saturday, 22 February 2025

Python walrus limitation

Some time ago I talked about Python Assignment Expressions (aka the walrus operator), and over time I've really come to appreciate it. Some weeks ago I came across an odd limitation of this operator: it cannot be used to assign to an attribute of an object (you can use it only with plain variable names), as you'll get a "SyntaxError: cannot use assignment expressions with attribute" error. I don't remember what I was trying when I hit this problem, but now I can think of an example like the verify method below:


from datetime import datetime


class Country:
    def __init__(self, name):
        self.name = name
        self.cities = None
        self.last_verification = None
    
    def _lookup_cities(self):
        print("looking up cities")
        return ["Paris", "Toulouse", "Lyon"]

    def verify(self):
        # [code to perform verification here]
        print(f"last verification done at: {(self.last_verification := datetime.now())}")
        # SyntaxError: cannot use assignment expressions with attribute
        

So the above raises "SyntaxError: cannot use assignment expressions with attribute". I can think of one technique to circumvent this limitation: leveraging a custom "assign()" function that I sometimes use to conveniently set several attributes in one go.


from typing import Any


def assign(val: Any, **kwargs) -> Any:
    """Set each keyword argument as an attribute on val and return val."""
    for key, value in kwargs.items():
        setattr(val, key, value)
    return val

    # inside class Country:
    def verify(self):
        # [code to perform verification here]
        print(f"last verification done at: {assign(self, last_verification=datetime.now()).last_verification}")

That syntax is cool, but having the print() call as the most visible part of the statement is probably confusing: it makes us think that the important action in that line is the print, while setting the last_verification attribute is the real deal. So using the "traditional syntax" probably makes more sense:


    def verify(self):
        # [code to perform verification here]
        self.last_verification = datetime.now()
        print(f"last verification done at: {self.last_verification}")

Another example of this technique:


    def _lookup_cities(self):
        print("looking up cities")
        return ["Paris", "Toulouse", "Lyon"]

    def get_cities(self) -> list[str]:
        # return self.cities or (self.cities := self._lookup_cities())
        # SyntaxError: cannot use assignment expressions with attribute
        return self.cities or assign(self, cities=self._lookup_cities()).cities


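Notice that this case could be rewritten as a lazy property using functools' @cached_property (in which case __init__ must no longer set self.cities = None, because that instance attribute would shadow the cached property and the lookup would never run):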


    # requires: from functools import cached_property
    @cached_property
    def cities(self):
        print("initializing lazy property")
        return self._lookup_cities()
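
As a minimal, self-contained sketch of that lazy property (the lookups counter is a hypothetical addition, there only to make the caching visible), something like this should show that the lookup runs just once per instance:

```python
from functools import cached_property


class Country:
    def __init__(self, name: str):
        self.name = name
        self.lookups = 0  # hypothetical counter, not part of the original class
        # note: no "self.cities = None" here, it would shadow the cached property

    def _lookup_cities(self) -> list[str]:
        self.lookups += 1
        return ["Paris", "Toulouse", "Lyon"]

    @cached_property
    def cities(self) -> list[str]:
        return self._lookup_cities()


france = Country("France")
first = france.cities   # runs _lookup_cities() and stores the result on the instance
second = france.cities  # served from the instance __dict__, no second lookup
```

After the first access the value lives in the instance's __dict__ under the name "cities", which is why a second access does not call the method again.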


That looks really neat, but I think we should be careful with cached/lazy properties. On one hand, cities represents data belonging to the object, it's part of its state, so using a property rather than a method feels natural. On the other hand, to obtain those cities we may have to do an HTTP or DB request, an external request. This kind of external interaction can be considered a side effect, so in that sense we should use a method. In general, I think lazy properties should only be used for data that is calculated from other data belonging to the object (and only if that data is read-only, or we observe it and update the property accordingly, and if the calculation is not too lengthy, as accessing a property should always be fast). This is an interesting topic and it has prompted me to revisit this Stack Overflow question that I remember having read several times over the last 15 years.
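
To illustrate the "observe it and update the property accordingly" case, here is a small sketch under my own assumptions: the Rectangle class and all its names are hypothetical, not taken from the code above. The cached value is derived only from the object's own data, and the setter for that data drops the cached value so it gets recomputed on the next access.

```python
from functools import cached_property


class Rectangle:
    def __init__(self, width: float, height: float):
        self._width = width
        self._height = height
        self.area_computations = 0  # only here to make recomputation visible

    @property
    def width(self) -> float:
        return self._width

    @width.setter
    def width(self, value: float) -> None:
        self._width = value
        # cached_property stores its result in the instance __dict__ under
        # the property name, so removing that entry invalidates the cache
        self.__dict__.pop("area", None)

    @cached_property
    def area(self) -> float:
        self.area_computations += 1
        return self._width * self._height


rect = Rectangle(2, 3)
area_before = rect.area  # computed
area_again = rect.area   # cached
rect.width = 4           # invalidates the cached area
area_after = rect.area   # recomputed
```

This keeps the property fast and free of external side effects, which is the situation where a lazy property seems easiest to justify.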

I have to add that these examples would look much nicer if we had a pipe operator (like in Elixir) and could write something like this:
return self.cities or (self | assign(cities=self._lookup_cities()) | .cities)

Thursday, 13 February 2025

File Locking 2

A decade ago I wrote this post about file-locking (aka file-sharing) and I've been revisiting it lately. That post was focused on Windows, and I had not realized that "the attitude" towards file-locking in Windows and Linux is pretty different. We can say that in Windows file-locking is an integral part of how the OS manages files. The sharing mode (dwShareMode) parameter of the CreateFile function (used to open files) determines file-sharing.

On the other side, for Unix-like OSes file-locking does not seem to be a major concern. By default no locking of any kind is performed when opening a file (so multiple processes can read and write to the file at the same time). The open() system call does not have any parameter related to locking or sharing. It's true that there is support for file-locking (fcntl, flock, lockf), but it's rather loose, since it is cooperative:

File locks under Unix are by default advisory. This means that cooperating processes may use locks to coordinate access to a file among themselves, but uncooperative processes are also free to ignore locks and access the file in any way they choose. In other words, file locks lock out other file lockers only, not I/O.
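
A minimal sketch of what "advisory" means in practice, assuming a Unix-like system (the fcntl module does not exist on Windows). The function name and the temporary file are just for illustration, and three handles in one process stand in for three different processes: a second locker that asks for the lock is refused, while a writer that never asks can write anyway.

```python
import fcntl
import os
import tempfile


def demo_advisory_lock() -> tuple[bool, str]:
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "a") as locker:
            # take an exclusive advisory lock on the file
            fcntl.flock(locker, fcntl.LOCK_EX)

            # a cooperative party asks for the lock first and is refused
            with open(path, "a") as cooperative:
                try:
                    fcntl.flock(cooperative, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    lock_refused = False
                except BlockingIOError:
                    lock_refused = True

            # an uncooperative party ignores locks and simply writes
            with open(path, "a") as rogue:
                rogue.write("hi\n")

        with open(path) as f:
            content = f.read()
        return lock_refused, content
    finally:
        os.remove(path)


lock_refused, content = demo_advisory_lock()
```

So the lock only keeps out code that bothers to ask for it, which matches the "file locks lock out other file lockers only, not I/O" remark.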

The above was pretty reassuring because at work we have an application (not developed by us) that appends information to a file, and we wanted to write some code that would periodically read that file, with both applications running on Linux. We could not afford to have the file open for reading and, right at that moment, have the other application try to open it for writing and fail (maybe crashing, as we don't know what kind of error handling, if any, the application has).

For full peace of mind I did a quick check: open a file for reading in Python (fr = open("test.txt", "r")) and, while it's open, append lines to it from the terminal (echo hi >> test.txt). No crash, and the file gets updated without a problem.
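
The same check can be sketched entirely in Python. This is only an approximation of the scenario: a second file handle in the same process stands in for the other application, and the function name and temporary file are made up for the example.

```python
import os
import tempfile


def read_while_appending() -> list[str]:
    fd, path = tempfile.mkstemp(text=True)
    os.close(fd)
    seen = []
    try:
        # the reader stays open for the whole test
        with open(path, "r") as reader:
            for line in ("first\n", "second\n"):
                # the "other application": open for append, write, close
                with open(path, "a") as writer:
                    writer.write(line)
                # the reader picks up whatever was appended since its last read
                seen.append(reader.read())
        return seen
    finally:
        os.remove(path)


chunks = read_while_appending()
```

Each append succeeds while the reader is still open, and each read() returns only the newly appended data, which is the behaviour a periodic reader needs.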

The surprising thing is that doing just the same test on Windows also works fine! Well, it actually makes good sense, let me explain. The Python open() function does not provide any sort of "file sharing" parameter, while the different .NET file-opening methods do. One can easily think that this is because, while both languages are multiplatform, Python was born in a more Linux-oriented community, while .NET was for many years Windows-centric, so both libraries reflect what exists in their "favorite" OS. But at the same time, I guess Python developers decided that it should try to show the same behaviour on any OS. Given that Python's open() on Linux cannot provide any locking behaviour (because, as I've already mentioned, the underlying Linux open() system call does not), it should do the same on Windows, so when it invokes the underlying Windows API CreateFileW function, it does so requesting read and write sharing. From here:

Python’s builtin open() shares read and write access, but not delete access. If you need a different share mode, you’ll have to call CreateFile directly via ctypes or PyWin32’s win32file module.

Sunday, 2 February 2025

JavaScript Arguments and Arrow Functions

After writing last week about how to sort of emulate JavaScript's arguments-object in Python, it seems a good idea to mention the special behaviour of the arguments-object in arrow functions. When arrow functions were added to JavaScript, most articles promptly explained the particular behaviour of the this-object in arrow functions (that is indeed one of their main features). The main idea that got stamped in my brain was "arrow functions do not have dynamic this, but lexical this". This means that they do not receive as "this" value the "receiver", the object on which they get invoked, but the "this" of their lexical environment. I used to think that such a "this" value would get bound to the arrow function in a similar way to when we use function.bind(), but it's not like that. Arrow functions are not bound functions with some particular property pointing to the "this"; they just look it up in their scope chain. From [1] and [2]:

[1] In essence, lexical this means that an arrow function will keep climbing up the scope chain until it locates a function with a defined this keyword. The this keyword in the arrow function is determined by the function containing it.

[2] Arrow functions lack their own this binding. Therefore, if you use the this keyword inside an arrow function, it behaves just like any other variable. It will lexically resolve to an enclosing scope that defines a this keyword.

What has prompted this post is another thing that I had not realized until recently (and it's not hard to find, as it's right there in the MDN documentation): something similar happens with the arguments-object.

Arrow functions don't have their own bindings to this, arguments, or super, and should not be used as methods.

"don't have their own bindings" means that they'll be looked up in the scope chain (like any other variable). Arrow functions do not have their own arguments-object containing the arguments received by the arrow; they look up that arguments-object in the scope chain, so it will contain the arguments received by the first enclosing function that is not an arrow (the one in whose scope the arrow was created, regardless of where the arrow is invoked). Notice how the example below prints "aaa" (from its scope chain) rather than "bbb" (which is what the arrow receives as parameter).


// arrow functions do not have an "arguments" or "this" binding, they take them from their scope chain
function f2(a) {
    console.log("inside f");
    let fn1 = (b) => console.log(arguments);
    return fn1;
}

let arrow = f2("aaa")
arrow("bbb")
//[Arguments] { '0': 'aaa' }

I'll leverage this post to mention that (in non-strict functions with a simple parameter list) changing where an entry in the arguments-object points to also changes where the parameter variable itself points to. So this behaviour corresponds to what happens in Python with the FrameType.f_locals write-through proxy, not to what we have in Python via locals() (a snapshot through which we cannot change where the original variables point to).


function test(user) {
    console.log(`user: ${user}`)
    console.log(`arguments[0]: ${arguments[0]}`);
    arguments[0] = "Iyán";
    console.log(`arguments[0]: ${arguments[0]}`);
    console.log(`user: ${user}`)
}

test("Xuan");

// user: Xuan
// arguments[0]: Xuan
// arguments[0]: Iyán
// user: Iyán
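
For the Python side of that comparison, here is a small sketch with hypothetical function names. As far as I understand it, the write-through behaviour of f_locals only exists from Python 3.13 on (PEP 667), so the second function returns a different value on older interpreters, while the locals() version behaves the same everywhere.

```python
import sys


def rebind_via_locals(user: str) -> str:
    # locals() gives a snapshot: writing to it does not rebind the variable
    locals()["user"] = "Iyán"
    return user


def rebind_via_frame(user: str) -> str:
    # On Python 3.13+ f_locals is a write-through proxy, so this rebinds
    # the real local variable; on older versions it has no such effect.
    sys._getframe().f_locals["user"] = "Iyán"
    return user


via_locals = rebind_via_locals("Xuan")  # stays "Xuan" on any version
via_frame = rebind_via_frame("Xuan")    # "Iyán" only on Python 3.13+
```

Note that sys._getframe() is a CPython implementation detail, so this is a demonstration of the mechanism rather than something to rely on in real code.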