Thursday, 27 March 2025

type.__call__ and more

In the past I've written some posts about Python metaclasses [1], [2] and [3]. Metaclasses are very powerful and very interesting, particularly because you won't find them in other languages (save for Smalltalk, which I think is what inspired Python metaclasses). Notice that Ruby has probably even more powerful metaprogramming constructs, but they work differently. Additionally, it seems like each time I look into metaclasses I find something I had not thought about. I have some new stuff not covered in my previous posts, so I'll write it down here.

First, this is some of the best information about metaclasses that you can find. I guess each time I need to refresh my mind about how metaclasses work I'll jump into that article. Second, I've found a use of metaclasses I'd never thought about. Here one guy is using metaclasses to implement lazy objects. There are other ways to do that, but the use of metaclasses for it is an interesting approach.

There's a method that plays a crucial role in object creation, both when creating an instance of a regular class, and when creating a class (an instance of a metaclass), the type.__call__ method.

Constructing an instance of a class: given a "normal" class (class Person:), when we create an instance of that class (a = Person()), the runtime searches for __call__ in Person's metaclass, that is type, so it ends up invoking type.__call__.
Constructing a class with a custom metaclass: given a metaclass Meta1 (class Meta1(type):), creating an instance of that metaclass (class A(metaclass=Meta1)) ends up in a call like this: A = Meta1(xxx), which searches for __call__ in Meta1's metaclass, that is also type, so it's again a type.__call__ invocation.
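Both paths can be seen in a quick sketch (the class names here are mine, just for illustration):

```python
class Person:
    pass

class Meta1(type):
    pass

# Creating an instance goes through the metaclass's __call__:
p = type.__call__(Person)            # same as Person()
assert isinstance(p, Person)

# Creating a class with a custom metaclass is also a type.__call__:
A = type.__call__(Meta1, "A", (), {})  # same as class A(metaclass=Meta1)
assert isinstance(A, Meta1)
assert A.__name__ == "A"
```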

The confusing thing is that type.__call__ is represented in several places with 2 different signatures:
For regular classes, __call__(cls, *args, **kwargs) handles instance creation.
For metaclasses, __call__(metacls, name, bases, namespace, **kwargs) manages class creation.

We should see those 2 different signatures as "virtual signatures": the parameters that we have to provide in each of those 2 cases. But underneath, type.__call__ is a C function that receives a bunch of unnamed and named arguments (don't ask me how that works in C...). It will check if the first argument is a class or a metaclass (in Python we would do: issubclass(cls, type)), and depending on that it will interpret the rest of the parameters as if it were signature 1 or signature 2. In both cases, type.__call__ will invoke the __new__ and __init__ methods of the class or metaclass that it received as first parameter. Well, from this article I've learned that the call to __init__ won't happen if __new__ returns an object that is not an instance of cls:

Python will always first call __new__() and then call __init__(). How could I get one to run but not the other? It turns out that one way of doing this is by changing the type (i.e. class) of the object returned by __new__(). Python will call the __init__() constructor defined for the class of the object. If we change the object’s class to something else, then the original class’s __init__() will not get run. We can do this by modifying the __class__ attribute of the object returned by __new__(), swapping it to refer to some other class.
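A minimal sketch of that behaviour (the class is mine, just to make the point):

```python
class Weird:
    def __new__(cls, *args, **kwargs):
        # return something that is NOT a Weird instance
        return 42

    def __init__(self, *args, **kwargs):
        # never runs: type.__call__ skips it because 42 is not a Weird
        raise AssertionError("__init__ should not run")

obj = Weird()
print(obj)  # 42, and no exception was raised
```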

You can see that also in the first article, that provides this implementation of a metaclass that behaves just like type does:


class Meta(type):
    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        assert issubclass(metacls, Meta)
        return {}
    def __new__(metacls, name, bases, namespace, **kwargs):
        """Construct a class object for a class whose metaclass is Meta."""
        assert issubclass(metacls, Meta)
        cls = type.__new__(metacls, name, bases, namespace)
        return cls
    def __init__(cls, name, bases, namespace, **kwargs):
        assert isinstance(cls, Meta)
    def __call__(cls, *args, **kwargs):
        """Construct an instance of a class whose metaclass is Meta."""
        assert isinstance(cls, Meta)
        obj = cls.__new__(cls, *args, **kwargs)
        if isinstance(obj, cls):
            cls.__init__(obj, *args, **kwargs)
        return obj

I'll leverage this post to mention that while the metaclass of a class does not play a role when looking up an attribute in an instance of that class (p.something), it does play that role when looking up an attribute in the class itself. Given a MetaPerson metaclass, a Person class and a user object, user.city will search city in user and in type(user), that is Person. And Person.city will search city in Person and in type(Person), that is MetaPerson.
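We can verify that lookup rule with a small example (names are mine):

```python
class MetaPerson(type):
    city = "Metropolis"

class Person(metaclass=MetaPerson):
    pass

user = Person()

# class-level lookup falls back to the metaclass
print(Person.city)  # Metropolis

# instance-level lookup does NOT reach the metaclass
print(hasattr(user, "city"))  # False
```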

Sometimes you'll find comments stating that metaclasses only affect the class creation process. That's only true if the metaclass only implements the __new__ and __init__ methods. However, if we implement the __call__ method, the metaclass will affect the creation of instances of classes of that metaclass. Furthermore, we can think of other interesting uses. If we define __getattribute__ in a metaclass, it will come into play when an attribute lookup is done on a class of that metaclass (not when it's done on an instance).

It's also important to note that we could say that in the last years metaclasses have become even more "esoteric", in the sense that after the inclusion of __init_subclass__ and __set_name__ they are no longer necessary for some of their most common use cases (but there are still things that can only be achieved via metaclasses). There's a good explanation here.
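For example, subclass registration, a classic metaclass use case, can now be done with __init_subclass__ alone (the plugin names here are made up):

```python
class PluginBase:
    registry = []

    def __init_subclass__(cls, **kwargs):
        # runs for every subclass; before Python 3.6 this kind of
        # registration typically needed a metaclass
        super().__init_subclass__(**kwargs)
        PluginBase.registry.append(cls.__name__)

class CsvPlugin(PluginBase):
    pass

class JsonPlugin(PluginBase):
    pass

print(PluginBase.registry)  # ['CsvPlugin', 'JsonPlugin']
```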

Finally, if you think metaclasses are complex, enter the world of metametaclasses! A metametaclass is a metaclass that is used as the metaclass of another metaclass, not just of a class. Well, indeed this is not that surprising: type is a meta-metaclass. All classes have a metaclass that, if not provided explicitly, is type. So when you define a metaclass, which is indeed a class, its metaclass is type. This smart response exposes the 2 main uses I can envision for metametaclasses: interfering in the normal class creation process, by defining __call__ in the metaclass of another metaclass, and allowing composition of metaclasses by defining __add__ in their metaclass.

Sunday, 23 March 2025

FileSystems and Inodes

When writing this recent post about file-locking I came across an interesting Unix/Linux feature: you can remove (or move) an open file. The file entry in the filesystem is removed (so you can not open it again), but if any process already has the file open, the data-blocks that make up the file will remain until every process that has the file open closes it. What they mention here is that the inode for that file remains (until all processes with a handle to the file have closed it). This applies not just to files opened by a process, but to the process itself. I mean, a process corresponds to an executable file; I can remove that executable file and the process will keep running normally until it decides to finish. I've mentioned inodes, and I realize I had not thought about what an inode is and how filesystems work in almost 2 decades! So I think it's time to refresh my mind and write down a summary here. From wikipedia:

The inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data.[1] File-system object attributes may include metadata (times of last change,[2] access, modification), as well as owner and permission data.[3]

A directory is a list of inodes with their assigned names. The list includes an entry for itself, its parent, and each of its children.

So an inode contains data about a file (attributes containing metadata) and the file data (via pointers to the file data-blocks). How the file system converts a path "/usr/xose/myFile.txt" to an inode whose data and metadata it can use goes like this: a directory is a file, a file whose data (what is stored in the data-blocks pointed to by that directory's inode) are pairs of fileName -> inodeNumber. So in my example, for the "xose" directory we have a file (an inode) that contains an entry like this: "myFile.txt, 11111" (inode number). Walking back, usr is a file with an entry "xose, xose-inode", and the same for the "/" root directory. OK, and where is the entry that tells us the inode number for the "/" root directory? Well, that's a fixed number, which in principle for all unix filesystems is inode 2. This discussion makes a good read:

Directories are just special files that map an inode number to a string filename. Each inode is numbered and usually represents an offset in some array-like structure in the filesystem. This mapping between inode to filename is a hard link. A file must have 1 or more hard links to be accessible. If you create another hard link, you’re just pointing another filename to the same inode. All of them are equally “the file”, and there’s no way to detect which hard link came first. As part of the inode contents, there’s a counter of how many hard links each inode has. It’s eligible for cleanup and reuse when this count is zero.

The root directory is usually some specially reserved inode number.
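The (name, inode number) mapping described above can be poked at from Python on a Unix-like system (the file names here are throwaway temp files):

```python
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "myFile.txt")
with open(path, "w") as f:
    f.write("hello")

# every filesystem object (the directory included) has an inode number
print(os.stat(d).st_ino)
print(os.stat(path).st_ino)

# a directory entry is a (name, inode number) pair;
# os.scandir exposes the inode stored in the entry itself
for entry in os.scandir(d):
    assert entry.inode() == os.stat(entry.path).st_ino
    print(entry.name, entry.inode())
```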

They mention hard links. I have to shamefully admit that I'd always been a bit confused about the hard link vs symbolic link difference (I come from a Windows background...) when it's a damn simple thing. A hard link corresponds to the "name" part in the "name, inode number" pairs that we have in a directory (remember, a directory is a file containing "name to inode" pairs). You can have multiple hard links pointing to the same inode, and indeed you can not tell which one was created first. That's why you can read in the wikipedia article: "Inodes do not contain their hard link names, only other file metadata". Sure, because as multiple names can reference that inode, it would be a mess to keep track of that in the inode itself. Symbolic links (aka soft links) are quite a different thing. They are files that contain just a path to another file. As wikipedia explains:

A symbolic link contains a text string that is automatically interpreted and followed by the operating system as a path to another file or directory. This other file or directory is called the "target". The symbolic link is a second file that exists independently of its target.

So, as mentioned above, given a hard link:
"/usr/xose/myFile.txt", inside the "xose" directory-file we have an entry: [myFile.txt, 11111 (inode number)]
while for a soft link "/apps/important/file1.txt -> /usr/xose/myFile.txt" we have:
- an entry inside the "important" directory-file: [file1.txt, 22222 (inode number)]
- the data contained inside the 22222 inode, that is just "/usr/xose/myFile.txt"
- The OS (which knows how to treat symbolic links) will handle that path as if it had been given to it in the first place.

From the previously linked discussion:

You asked about symbolic links. As I mentioned above, they’re a special kind of file. The filesystem knows to interpret its contents differently. The content of a directory is the mapping for filenames, but the content for symbolic links (soft links) is a file path string. Symlinks consume new inodes, and they do not increment the destination file’s hard link count. Deleting the destination file does not update any symlinks pointing to them.

So notice that an inode contains a counter of how many hard links point to it. There's another great discussion here:

The term hardlink is actually somewhat misleading. While for symlinks source and destination are clearly distinguishable (the symlink has its own entry in the inode table), this is not true for hardlinks. If you create a hardlink for a file, the original entry and the hardlink are indistinguishable in terms of what was there first. (Since they refer to the same inode, they share their file attributes such as owner, permissions, timestamps etc.) This leads to the statement that every directory entry is actually a hardlink, and that hardlinking a file just means to create a second (or third, or fourth...) hardlink. In fact, each inode stores a counter for the number of hardlinks to that inode.

The directory entries of "original file" and "hard link" are totally indistinguishable in quality: both establish a reference between a file name and the inode of a file.

One of the main visible differences between hardlinks and symlinks (a.k.a. softlinks) is that symlinks work across filesystems while hardlinks are confined to one filesystem. That is, a file on partition A can be symlinked to from partition B, but it cannot be hardlinked from there. This is clear from the fact that a hardlink is actually an entry in a directory, which consists of a file name and an inode number, and that inode numbers are unique only per file system.
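All of the above is easy to verify from Python on a Unix-like filesystem (throwaway temp files again):

```python
import os
import tempfile

d = tempfile.mkdtemp()
target = os.path.join(d, "target.txt")
with open(target, "w") as f:
    f.write("data")

hard = os.path.join(d, "hard.txt")
soft = os.path.join(d, "soft.txt")
os.link(target, hard)     # hard link: a second name for the SAME inode
os.symlink(target, soft)  # symlink: a new inode whose data is just a path

assert os.stat(hard).st_ino == os.stat(target).st_ino  # same inode
assert os.stat(target).st_nlink == 2                   # link count bumped
# lstat inspects the symlink itself: a different inode
assert os.lstat(soft).st_ino != os.stat(target).st_ino
assert os.readlink(soft) == target  # its content is the target path
```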

Notice that a process references the files that it has open by inode, not by path. Well, this all comes down to File Descriptors and will be the subject of a future post.
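This also lets us check the deleted-but-still-open behaviour mentioned at the start of the post (Unix-like systems only; on Windows this unlink would fail):

```python
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "data.txt")
with open(path, "w") as f:
    f.write("still here")

f = open(path)   # keep a handle to the file
os.unlink(path)  # remove the directory entry (the hard link)

# the path is gone, but the inode and data blocks survive
# while at least one process holds the file open
assert not os.path.exists(path)
content = f.read()
print(content)   # still here
f.close()        # link count 0 and no open handles: inode reclaimed
```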

Normally users don't have to deal with inodes in their daily life, but there are at least a couple of situations where some basic knowledge about them will come in handy. In inode-based filesystems (for example ext4) the number of inodes is normally determined when the filesystem is created. This is so because inodes are preallocated and stored in an inode table. If we have many small files in our filesystem it could happen that we use all the available inodes without having used all the disk space, so we won't be able to create new files even though our friend df -h will tell us that there's disk space available. Using df -i we can see the inode usage.

There are chances that we have heard about another inode-related concept, orphan inodes. An orphan inode is an inode that is still allocated but no longer has any directory entry pointing to it. This can be something normal (we have deleted a file but it's still open by a process, so the inode will be released when the process finishes) or problematic, due to some system crash during a write operation, or a crash in the previous situation: a process keeps open a file that has been deleted and the OS crashes, so there's no time to release the inode. Fortunately, filesystems keep a list of orphan inodes and normally will be able to delete them on the next boot.

Friday, 14 March 2025

Kotlin cast operator vs !! operator

We know that null-safety is a key element in Kotlin programming. Kotlin will try by all means to prevent you from writing code that could throw an infamous NPE (Null Pointer Exception). Anyway, as the article mentions, there are a few cases where you still can get an NPE, like a problematic "leaky constructor" (something I'd never thought about) and the not-null assertion operator (!!).

The not-null assertion operator !! converts any value to a non-nullable type.

When you apply the !! operator to a variable whose value is not null, it's safely handled as a non-nullable type, and the code executes normally. However, if the value is null, the !! operator forces it to be treated as non-nullable, which results in an NPE.

So the !! operator allows us to "convert" a type from nullable to not nullable. "Convert" here does not mean transforming a value (like transforming from int to string or vice versa) but transforming its "contract" (its type indeed). We had the contract that the value was "string or null" and now we narrow that contract to just "string". If we are sure that the value adheres to this more restrictive contract, there are cases when it will be useful. If our assumption fails and the value is indeed null, we'll get a terrible NPE.


>>> val a: String? = null;
>>> a!!
java.lang.NullPointerException
	at Line_4.(Line_4.kts:1)

But, this thing of changing the type, it feels familiar to me, it's what in other languages I've always used casting for. Of course Kotlin supports casting, with the as "unsafe" cast operator and as? safe (nullable) cast operator. So we can write:


>>> val a: String? = "aaa"   

// this fails to compile
>>> val b: String = a
//error: type mismatch: inferred type is String? but String was expected
//val b: String = a

// but this works fine                
>>> val b: String = a as String

// and this one also
>>> val c: String = a!!

So the not-null assertion operator (!!) and the "as" unsafe cast operator seem to be just equivalent; the only difference is that when they fail they throw different exceptions: NullPointerException for !! and ClassCastException for the "as" operator. They are so similar that one can wonder why the !! operator was introduced. I think the reasoning for having (and using) a dedicated not-null assertion operator is that it's more specific. It serves only one purpose, converting from nullable to not nullable, while the cast operator is broader: it can convert from any type to any other type. So when using !! you are more clearly communicating the idea of skipping null safety.

A bit related to this I came across this StackOverflow question about the difference between "x as? String" and "x as String?". If x is a String or null, both cases are equivalent, we get a nullable String. The difference is when x is neither null nor String, in that case the safe cast will return null while the unsafe cast will throw an exception. I just copy-paste the code from here


fun <T> safeCast(t: T) {
    val res = t as? String // Type: String?
}

fun <T> unsafeCast(t: T) {
    val res = t as String? // Type: String?
}

fun test(){
    safeCast(1234);//No exception, `res` is null
    unsafeCast(null);//No exception, `res` is null
    unsafeCast(1234);//throws a ClassCastException
}

Wednesday, 5 March 2025

Exceptions vs Errors

I think I've always thought of Errors as "old-school" error codes, and Exceptions as that "more modern" thing that you throw/raise and catch. I also used to expect exceptions to have an Exception suffix (I guess due to my C# background). Over the years I've noticed how JavaScript and Python follow a different convention for exceptions, and I've recently learnt about some Java particularities (beyond the checked-unchecked mess). So I thought it would be a good idea to write a post about this.

Java has a Throwable class, and any object that you intend to throw has to be an instance of a class inheriting from Throwable. Then, Java makes a clear distinction between Errors and Exceptions, as we have different classes for them, both inheriting from Throwable, so both can be thrown and caught, but there's a clear semantic difference. An Error represents something critical; you can catch it to log it, but most likely you can not recover from it and there's nothing more you can do. An Exception represents an "exceptional" situation from which in principle you can recover. We have to take into account one extra thing, that awful idea of checked exceptions (you are forced to handle them) and unchecked exceptions (you are not). I think checked exceptions represent expected situations/problems (so in the end they are not so exceptional) and as such you have to be ready to deal with them and recover. Unchecked exceptions (those inheriting from RuntimeException) represent conditions that probably should never happen, but if they were to occur, maybe you could manage to recover from them. This discussion has helped me to wrap my head around all this.

In JavaScript you can throw any object, but there's an Error class, and your custom errors/exceptions should inherit from Error, as the different built-in errors/exceptions do. Most of these built-ins have the Error suffix, save for some cases like DOMException, which also inherits from Error. I'm not sure if there's a reason for having suffixed a few classes as "Exception" rather than "Error" or if it's just an inconsistency. Based on the Java logic one would think that the "Error" suffix is used for serious, unrecoverable errors/exceptions, and the "Exception" suffix for less critical stuff you can recover from. I've read some discussions, like this one, and have not found a clear answer. ChatGPT and Claude seem to recommend using the "Error" suffix for all custom exceptions, but I see much code that does just the contrary, so I'm inclined to use "Exception" if it's not critical. All in all, we can say that JavaScript does not impose any distinction between Error and Exception (as the base class used for all errors/exceptions in the standard library is named just Error), and we are free to establish that distinction by naming our classes one way or another.

In Python we raise objects that are instances of BaseException or any of its subclasses. For that we can either create the instance and raise it, or raise the class (and Python will take care of creating an instance):

The sole argument to raise indicates the exception to be raised. This must be either an exception instance or an exception class (a class that derives from BaseException, such as Exception or one of its subclasses). If an exception class is passed, it will be implicitly instantiated by calling its constructor with no arguments:

We also have the concept of fatal (not recoverable) exceptions vs non-fatal ones:

BaseException is the common base class of all exceptions. One of its subclasses, Exception, is the base class of all the non-fatal exceptions. Exceptions which are not subclasses of Exception are not typically handled, because they are used to indicate that the program should terminate. They include SystemExit which is raised by sys.exit() and KeyboardInterrupt which is raised when a user wishes to interrupt the program.
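That hierarchy is easy to check directly: a plain `except Exception` lets the fatal ones propagate (a small sketch of mine):

```python
# KeyboardInterrupt derives from BaseException but NOT from Exception
assert issubclass(KeyboardInterrupt, BaseException)
assert not issubclass(KeyboardInterrupt, Exception)

caught_by = None
try:
    try:
        raise KeyboardInterrupt
    except Exception:
        caught_by = "Exception"   # not reached
except BaseException:
    caught_by = "BaseException"

print(caught_by)  # BaseException
```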

And now the confusing part. Checking the Exception Hierarchy you can see that many built-in exceptions inheriting from Exception are named with the "Error" suffix. PEP-8 says this about naming:

Because exceptions should be classes, the class naming convention applies here. However, you should use the suffix “Error” on your exception names (if the exception actually is an error)

The "if the exception actually is an error" feels pretty important to me. In the Exception hierarchy you can also see that there are several xxxWarning classes, and the infamous StopIteration class. So in Python it seems like Errors are one kind of Exception, and there are other kinds of Exceptions that are not real Errors, just warnings or normal situations like finishing an iterator. I think we can say that Python does not differentiate between Exceptions and Errors (as the base class in the hierarchy is just BaseException), but between Exceptions that are Errors and Exceptions that are not, and it does this based on the naming convention for the derived classes, using the Error suffix for errors.

I have to add that the use of an exception to indicate the end of an Iterator (StopIteration in Python, NoSuchElementException in Java) is something that has always felt rather odd to me. It's a totally normal situation, so using an Exception for something that is part of the normal flow feels strange. I quite prefer the JavaScript approach, returning an object with the done property set to true.
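For reference, this is that exception-driven flow, and the built-in escape hatch that next() provides:

```python
it = iter([1, 2])
assert next(it) == 1
assert next(it) == 2

# exhausting the iterator signals a perfectly normal
# situation through an exception
ended = False
try:
    next(it)
except StopIteration:
    ended = True
assert ended

# next() accepts a default to avoid the exception-driven flow
assert next(it, "done") == "done"
```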

Sunday, 2 March 2025

Kotlin pseudo-constructor

There's an uncommon language feature that I first came across in JavaScript, and then in Python: the ability for a constructor to return an object different from the one being constructed. I've talked about JavaScript "constructors" several times in the past [1] and [2] (I use quotes because any function, save arrows and methods (meaning those defined with the modern ES6 class syntax), can be used with new to construct an object), and the main idea of how they work is:

The new operator takes a function F and arguments: new F(arguments...). It does three easy steps:
- Create the instance of the class. It is an empty object with its __proto__ property set to F.prototype.
- Initialize the instance. The function F is called with the arguments passed and "this" set to be the instance.
- If the function returns an object instead of undefined (which is the default return value), that object will "replace" "this" as the result of the new expression.

Just to make it clear:


class Person {
	constructor() {
		return "Not a real Person";
	}
}

We can do the same in Python by defining a custom __new__() method in our class. Object creation and initialization in Python works like this:
When creating an instance of a class A (a = A()) it invokes its metaclass's __call__ (so for "normal" classes it's type.__call__ and for classes with a custom metaclass it's CustomMetaclass.__call__). Then type.__call__(cls, *args, **kwargs) basically does this: it invokes cls.__new__(cls) (which unless overridden is object.__new__) and with the returned instance object it invokes cls.__init__(instance).
So we can write something like this:


class Person:
	def __new__(cls, *args, **kwargs):
	    print("In new")
	    # this would be the normal implementation
	    #return super().__new__(cls)
	    # and this our odd one:
	    return "Not a real Person"

The obvious question is: what's this feature useful for? Mainly I think it's a very nice way to implement a transparent Singleton or Object pool. You invoke the constructor and it decides to always return the same object or one object from a pool. The client is not aware of this "singletonness" or object caching. You could also think of a constructor that returns instances of derived classes based on its arguments, so the constructor becomes a factory.
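A minimal sketch of that transparent Singleton in Python, leveraging __new__ (the class name is mine):

```python
class Config:
    _instance = None

    def __new__(cls, *args, **kwargs):
        # hand back the cached instance instead of building a new one
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

a = Config()
b = Config()
assert a is b  # callers just write Config() and always get the same object
```

Note that since __new__ returns a Config instance here, an __init__ (if we defined one) would still rerun on every call, which is something to keep in mind with this pattern.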

Apparently Kotlin lacks this feature. Constructors implicitly return an instance of their class, that's all. Well, the thing is that there is a "hack" to get a behaviour similar to that of JavaScript and Python, by combining the invoke() operator and companion objects. When a class has an invoke() method its instances become invokable, so it's the equivalent of Python's __call__ and callable objects. We know that when a class has a companion object, the companion's methods are accessible through the class. So if we have a MyClass class that has a companion object, and that companion has an invoke method, we can do "MyClass.Companion.invoke()", which can be rewritten as just "MyClass()", and that looks just like a constructor call (though it's not). The invoke in our companion can return whatever it wants, so we have a "constructor-like function" that can return whatever it wants. Notice that for Companion.invoke to get picked when we write "MyClass()", the main class can not have a real constructor with the same signature as our invoke() (making the real constructor private will work). I mean:


class OsManager private constructor() {  // Private constructor prevents instantiation
    companion object {
        operator fun invoke(): String {
            println("inside invoke")
            return "OS Manager"
        }
    }
}


class OsManager2() {
    companion object {
        operator fun invoke(nm: String): String {
            println("inside invoke")
            return "OS Manager"
        }
    }
}

// These 2 will invoke the "pseudo-constructor"
val os1 = OsManager() // equivalent to OsManager.invoke()
val os2 = OsManager2("aa") // equivalent to OsManager2.invoke("aa")


This trick can hence be used for implementing transparent object pools or singletons (well, indeed a singleton is an object pool with a single element). Let's see a Singleton example:


class Singleton private constructor() {  // Private constructor prevents instantiation
    init {
        println("Singleton instance created")
    }

    companion object {
        private val instance: Singleton by lazy { Singleton() }

        operator fun invoke(): Singleton {
            return instance
        }
    }
}

fun main() {
    val obj1 = Singleton()  // Calls the `invoke` operator
    val obj2 = Singleton()  // Calls the `invoke` operator again

    println(obj1 === obj2)  // true, same instance
}

An interface can also have a companion object, which allows this pretty nice pattern that I've found here. A sealed interface with a companion object with an invoke method that acts as a factory for instances of the different classes implementing that interface.

Saturday, 22 February 2025

Python walrus limitation

Some time ago I talked about Python Assignment Expressions aka the walrus operator, and over time I've really come to appreciate it. Some weeks ago I came across an odd limitation of this operator: it can not be used for assigning to an attribute of an object (so you can use it only with variables), as you'll get a "SyntaxError: cannot use assignment expressions with attribute" error. I don't remember what I was trying when I hit this problem, but now I can think of an example like the verify method below:


class Country:
    def __init__(self, name):
        self.name = name
        self.cities = None
        self.last_verification = None
    
    def _lookup_cities(self):
        print("looking up cities")
        return ["Paris", "Toulouse", "Lyon"]

    def verify(self):  
	# [code to perform verification here]
        print(f"last verification done at: {(self.last_verification := datetime.now())}")
        # SyntaxError: cannot use assignment expressions with attribute
        

So the above throws a SyntaxError: cannot use assignment expressions with attribute. I can think of one technique to circumvent this limitation, leveraging an "assign()" custom function that I use sometimes to conveniently set several properties in one go.


from typing import Any

def assign(val: Any, **kwargs) -> Any:
    for key, value in kwargs.items():
        setattr(val, key, value)
    return val

    def verify(self):
	# [code to perform verification here]
        print(f"last verification done at: {assign(self, last_verification=datetime.now()).last_verification}")

That syntax is cool, but having the print() call as the most visible part of the statement is probably confusing, as it makes us think that the important action in that line is print, while setting the last_verification attribute is the real deal in that line. So probably using the "traditional syntax" would make sense:


    def verify(self):
	# [code to perform verification here]
        self.last_verification = datetime.now()
        print(f"last verification done at: {self.last_verification}")

Another example for using this technique:


    def _lookup_cities(self):
        print("looking up cities")
        return ["Paris", "Toulouse", "Lyon"]
		
    def get_cities(self) -> list[str]:
        # return self.cities or (self.cities := self._lookup_cities())
        # SyntaxError: cannot use assignment expressions with attribute
        return self.cities or assign(self, cities=self._lookup_cities()).cities


Notice that this case could be rewritten using a lazy property via functools @cached_property


    @cached_property
    def cities(self):
        print(f"initializing lazy property")
        return self._lookup_cities()


That looks really neat, but notice that I think we should be careful with the use of cached/lazy properties. On one hand, cities represents data belonging to the object, it's part of its state, so using a property rather than a method feels natural. But on the other hand, to obtain those cities maybe we do an http or db request, an external request. This kind of external interaction can be considered a side-effect, so in that sense we should use a method. In general, I think lazy properties should only be used for data that is calculated based on other data belonging to the object (and if that data is read-only, or we observe it and update the property accordingly, and if that calculation is not too lengthy, as accessing a property should always be fast). This is an interesting topic and it has prompted me to revisit this stackoverflow question that I remember having read several times over the last 15 years.

I have to add that these examples would look much nicer if we had a pipe operator (like in elixir) and we could write something like this:
return self.cities or (self | assign(cities=self._lookup_cities()) | .cities)
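As a footnote, the limitation itself can be checked straight from the compiler:

```python
# a plain name works as a walrus target
compile("(x := 1)", "<test>", "eval")

# an attribute target is rejected at compile time
failed = False
try:
    compile("(self.x := 1)", "<test>", "eval")
except SyntaxError as e:
    failed = True
    print(e)  # cannot use assignment expressions with attribute
assert failed
```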

Thursday, 13 February 2025

File Locking 2

One decade ago I wrote this post about file-locking (aka file-sharing) and I've been revisiting it lately. That post was focused on Windows, and indeed I had not realized that "the attitude" towards file-locking in Windows and Linux is pretty different. We can say that in Windows file-locking is an integral part of how the OS manages files. The sharing mode (dwShareMode) parameter of the CreateFile function (used to open files) determines file-sharing.

On the other hand, for Unix-like OSes file-locking does not seem to be a major concern. By default no locking of any kind is performed when opening a file (so multiple processes can read and write to the file at the same time). The open() system call does not have any parameter related to locking/sharing. It's true that there is support for file-locking (fcntl, flock, lockf), but it's rather loose, as we can say that it's cooperative:

File locks under Unix are by default advisory. This means that cooperating processes may use locks to coordinate access to a file among themselves, but uncooperative processes are also free to ignore locks and access the file in any way they choose. In other words, file locks lock out other file lockers only, not I/O.

The above was pretty reassuring because at work we have one application (not developed by us) that appends information to a file, and we wanted to write some code that would periodically read that file, with both applications running on Linux. We could not afford the risk that we open the file for reading and right at that moment the other application tries to open it for writing and fails (maybe crashing, as we don't know what kind of error handling, if any, the application has).

For full peace of mind I did a fast check. Open a file for reading in Python (fr = open("test.txt", "r")) and while it's open append lines to it from the terminal (echo hi >> test.txt). No crash and the file gets updated without a problem.
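The advisory-lock machinery itself can be exercised from Python with the fcntl module (Unix only; paths here are throwaway temp files). Notice how a reader that never calls flock happily ignores the exclusive lock:

```python
import fcntl
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.txt")
with open(path, "w") as f:
    f.write("some data\n")

# take an exclusive advisory lock on the file
locked = open(path, "r+")
fcntl.flock(locked, fcntl.LOCK_EX)

# an "uncooperative" open that never calls flock is free to ignore
# the lock: this read succeeds while the exclusive lock is held
with open(path) as uncooperative:
    content = uncooperative.read()
print(content)

fcntl.flock(locked, fcntl.LOCK_UN)
locked.close()
```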

The surprising thing is that doing just the same test on Windows also works fine! Well, indeed it makes good sense, let me explain. The Python open() function does not provide any sort of "File Sharing" parameter, while the different .Net file opening methods do. One can easily think that this is because, while both languages are multiplatform, Python was born in a more Linux-oriented community, while .Net was for many years Windows-centric, so both libraries reflect what exists in their "favorite" OS. But at the same time, I guess Python developers decided that open() should show the same behaviour on any OS. Given that Python's open() on Linux can not provide any locking behaviour (because, as I've already mentioned, the underlying Linux open() system call does not), it should do the same on Windows, so when it invokes the underlying Windows API CreateFileW function, it does so requesting Read and Write sharing. From here:

Python’s builtin open() shares read and write access, but not delete access. If you need a different share mode, you’ll have to call CreateFile directly via ctypes or PyWin32’s win32file module.