Saturday 31 July 2021

Gegen die Wand

I think I've mentioned it in previous posts: when I really like a film I tend to watch it multiple times. Not immediately, but with a gap of 1, 2 or 3 years. Indeed, there are few films that I consider remarkable and that I've watched only once. Well, until last Tuesday, Head-On / Gegen die Wand / Contra la pared was one of them. I had watched this amazing film at FIC Xixón in 2009 (I indeed mention it here), when the festival held a Fatih Akin retrospective.

The film quite shocked me and delighted me then, and it has done the same 12 years later. I've watched it now on my tablet, during a bus trip coming back from Toulouse. It's funny, because tablets did not even exist (at least not for mainstream use) back in 2009, and back then I would not have expected to have a "secondary hometown" in France. In summer 2009 the world had been terrified about the possibility of a pandemic, the H1N1 influenza virus. Well, in the end it took 11-12 years for another pandemic to come and turn our world upside down...

The story in the film is relatively timeless: the culture shock and generational conflict in families of recent immigration, the passion for life and for death, the need for freedom and pleasure, the painful indifference to everything, self-hate, love... are elements that were here 20 years ago and will be here in 20 years' time. The fact that the film (shot in 2004) has no scenes (or hardly any) where mobile phones, computers, TV sets... play any role helps keep it independent of a specific year or decade (it would be different if you were confronted with several Nokia phones or CRT TVs...). One thing that makes it look a bit "2000s" now is the tribal tattoos that both Sibel and Maren (Catrin Striebeck) wear right above their beautiful asses...

I like it when artists put elements of their lives in their creations. That's what Fatih Akin, a German guy of Turkish descent (but Western culture), does with his films, and furthermore it's also what the gorgeous Sibel Kekilli does with her character in the film (who, as a sort of wink, is also called Sibel). In real life Sibel is a German woman of Turkish descent (and Western culture) who has also gone through problems in her life due to the clash between her (Western) culture and the "less evolved" culture of her ancestors. In 2006 she said "I have experienced for myself that both physical and psychological abuse are regarded as normal in a Muslim family. Sadly, violence is part of the cultural heritage in Islam"... wow, I love this woman! :-D

Sadly, it seems that the real life of Birol Ünel, who plays the main male character in the film, had many points of convergence with the film. Birol died in 2020 after having struggled with alcoholism for many years.

I won't tell you the plot here, you can just read it on Wikipedia. I'll just say that, though the story skillfully blends hilarious and tragic moments, this is mainly a drama. Anyway, though there's so much sadness in the film, there's also much hope, just as there is much self-destruction and much desire to rebuild. There's apathy for life, but it ends up being overcome by passion...

Tuesday 20 July 2021

Promises and Futures

In the last few years asynchronous programming has become more and more important. Fortunately, the old callback techniques have given way to objects that at some later point will hold a result, objects that we call Tasks in C# and Promises in JavaScript. I had also heard about Java using Future objects, so I just thought of Promises and Futures (and partially also Tasks) as interchangeable funny names "invented" by the designers of those languages.

Well, I've found out that the Promise and Future concepts were invented long ago in Computer Science (CS), and that they are related, but not the same (and that language designers could have been a bit more careful). If you read this Wikipedia article, and check some discussions like for example this one, you'll end up with this idea:

  • In CS, a Future is a read-only container for a result that does not yet exist
  • In CS, a Promise is also a container for a result that does not yet exist, but it can be written (normally only once)

A Future is read-only in the sense that it lacks a public "setResult" kind of method, but obviously there has to be an internal way to set its value. This is normally done by a callback passed to the object constructor (this sounds familiar to all JavaScript people out there, right?)
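To make the two definitions above more tangible, here is a minimal sketch in Python. The Promise and Future classes and the start_computation function are hypothetical names invented for this illustration (they are not taken from any library), and the design, a writable object that hands out a read-only view of itself, is just one possible way to model the idea:

```python
import threading


class Future:
    """Read-only view: callers can wait for the value but cannot set it."""

    def __init__(self, promise):
        self._promise = promise

    def result(self, timeout=None):
        # Block until the owning Promise has been fulfilled.
        if not self._promise._done.wait(timeout):
            raise TimeoutError("result not available yet")
        return self._promise._value


class Promise:
    """Writable side: the producer fulfils it, normally only once."""

    def __init__(self):
        self._done = threading.Event()
        self._value = None
        self._lock = threading.Lock()
        self.future = Future(self)

    def set_result(self, value):
        with self._lock:
            if self._done.is_set():
                raise RuntimeError("promise already fulfilled")
            self._value = value
            self._done.set()


def start_computation():
    """Hypothetical producer: keeps the Promise, hands out only the Future."""
    promise = Promise()
    # Fulfil the promise from another thread a little later.
    threading.Timer(0.05, promise.set_result, args=(42,)).start()
    return promise.future


future = start_computation()
print(future.result(timeout=2))  # the consumer can read, but has no set_result
```

The consumer only ever sees the Future, so the "write once" rule stays in the hands of whoever created the Promise.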

Java seems to be quite CS friendly regarding this topic, based on the naming chosen by its designers. It provides a Future interface (whose main implementation is FutureTask) that allows read access to a result. It also provides the CompletableFuture class, which implements the Future interface but allows write access (you set the result with .complete(result)). So a CompletableFuture is indeed a Promise in Computer Science terms. There is a very nice example here of creating a CompletableFuture but returning it just as a Future, so that the client has no write access to it.

In C# there are no Promise or Future objects, only Tasks, which are read-only. We could write the example above by means of Task and TaskCompletionSource. The difference is that there is no "ITask" interface (though we talk about awaitable objects), and TaskCompletionSource is not a Task (nor an awaitable). Instead, it "contains" a Task object and gives us direct access to it, and the result of that Task can be set "internally" by the "owner" of the TaskCompletionSource object (by calling TaskCompletionSource.SetResult).

JavaScript Promises correspond to a Future in CS parlance. They are read-only, and the only way to set their result is through the resolve-reject callback functions passed to the constructor. It's interesting to note that in JavaScript we use the thenable concept (an object with a "then" method). You can use the magic await keyword with any thenable (not just with Promises), and obviously a Promise is a "thenable", but there is at least one case where Promises and "thenables" are treated differently: the Promise.resolve method.

It's been centuries since the last time I wrote some real Python code, but as it's the first language where I saw the magic async-await construct, it was interesting to see how it deals with asynchrony. We can read this here:

We say that an object is an awaitable object if it can be used in an await expression. Many asyncio APIs are designed to accept awaitables.
There are three main types of awaitable objects: coroutines, Tasks, and Futures.

So Python also uses the awaitable concept. If you read the documentation to learn more about those 3 types of awaitables, you'll find that even the Python designers did not follow the expected CS rules: a Future in Python is not read-only, as it has a set_result method, so it would correspond to a CS Promise. Indeed, as the documentation puts it, a Future is needed to allow callback-based code to be used with async/await, and it is similar to a Java CompletableFuture (and partially similar to the Task-TaskCompletionSource pair in C#).
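As a small sketch of that "callback-based code used with async/await" idea: legacy_fetch below is a hypothetical callback-style function made up for this example, while loop.create_future, Future.set_result and loop.call_later are the regular asyncio calls.

```python
import asyncio


def legacy_fetch(callback):
    """Hypothetical callback-style API: calls `callback(value)` a bit later."""
    loop = asyncio.get_running_loop()
    loop.call_later(0.05, callback, "payload")


async def fetch():
    loop = asyncio.get_running_loop()
    future = loop.create_future()    # writable, so closer to a CS Promise
    legacy_fetch(future.set_result)  # the callback fulfils the Future
    return await future              # suspends here until set_result runs


result = asyncio.run(fetch())
print(result)
```

Whoever holds the Future object can call set_result on it, which is exactly the write access that a CS Future would hide.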

There's something I really like: Python uses a specific object, Task, to run code concurrently ("Tasks are used to schedule coroutines concurrently"). This is something that I've always found confusing in C#. In C# we use the same object, Task, for 2 related but different things:

  • For what we are seeing in this post, a placeholder for a value that will be available in the future and for which we can wait asynchronously
  • For running one piece of code in the ThreadPool by means of Task.Run.

In the second case we frequently also want to wait asynchronously for the returned value (but not always: maybe we just want to run some code with no need for its result, or without even knowing whether it has completed). Anyway, it would be interesting to differentiate between both cases, and it would help to avoid some confusion (beginners tend to wrongly think that an async method just magically runs in a separate thread; you can check this related old post).
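Going back to the Python side for a moment, this is a minimal sketch of what "Tasks are used to schedule coroutines concurrently" looks like in practice. The work coroutine, the log list and the delays are made up for the illustration:

```python
import asyncio

log = []


async def work(name, delay):
    log.append(f"start {name}")
    await asyncio.sleep(delay)
    log.append(f"end {name}")
    return name


async def main():
    # create_task wraps each coroutine in a Task and schedules it on the loop.
    first = asyncio.create_task(work("a", 0.05))
    second = asyncio.create_task(work("b", 0.10))
    # Awaiting a Task only waits for its result; both are already scheduled.
    return [await first, await second]


results = asyncio.run(main())
print(results)
print(log)
```

Both coroutines start before either of them finishes, and all of it happens on a single thread, which is the distinction that the Task name blurs in C#.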

Monday 12 July 2021

Out of Memory Killer

In this previous post I expressed my surprise at the fact that .Net does not allow you to limit the Heap size of an application (contrary to Java and Node). Bearing in mind that .Net was born as a Windows-only environment (fortunately things have moved a lot in recent years), this omission can be particularly harmful given the way Windows deals (or rather, doesn't deal) with those awful times when the RAM of your system (and the Swap space) is completely full.

Linux manages this situation in a rather sensible way. The Linux kernel provides the OOM Killer (Out Of Memory Killer) mechanism (it can be disabled, but on 99% of occasions you shouldn't). Basically, when an application (or the OS itself) requests more physical memory and the OS has no more space available (both in RAM and Swap), the OOM Killer will choose an application and kill it. This choice is made based on a series of heuristics (but I guess the program using the most RAM tends to be the best candidate). So one (or a few) programs are sacrificed in order to get the system back on track. The process gets a SIGKILL signal and dies immediately.

At work we have a Java application that consumes a good bunch of RAM and that occasionally stops running. We're running it with an -Xmx setting of 4GB (max Heap size) and we had seen it several times using more than 5GB of physical RAM (that means: Stack + MetaSpace + Code Cache + Heap...), so it seemed normal that at some point it would reach that 4GB limit. In that case we should get a "java.lang.OutOfMemoryError: Java heap space" Exception that would be caught by our code and written to our application log before the application shuts down. As we were not finding anything like this in our log we were a bit confused, until we found out about this nice thing called the OOM Killer. In our case, the whole system had run out of RAM (before the Java Heap of our application had reached its limit), so the OOM Killer was doing its job and killing our process (which, being the most RAM-consuming process on that machine, was the natural candidate).

When the OOM Killer kills a process it logs it to /var/log/messages (in RedHat), which is only accessible by root. We have no root access to that machine, but fortunately this information is also stored in the kernel ring buffer, and any user can read it like this:

dmesg -T | egrep -i 'killed process'

For our OOM we could read this:
[2298074.477791] Out of memory: Kill process 18020 (MyProcess) score 298 or sacrifice child
[2298074.478406] Killed process 18020 (MyProcess), UID 16941, total-vm:8588392kB, anon-rss:5474896kB, file-rss:0kB, shmem-rss:48k
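If you want to pull the numbers out of lines like these, a small sketch in Python could look as follows. The regular expression is only an assumption based on the log line shown above (the exact format of these kernel messages varies between kernel versions), and parse_killed_line is a name made up for this example:

```python
import re

# Assumed format, modelled on the "Killed process" line quoted above.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def parse_killed_line(line):
    """Return pid, name and anon-rss (in GiB) from a 'Killed process' line, or None."""
    match = KILLED_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match["pid"]),
        "name": match["name"],
        "anon_rss_gib": int(match["anon_rss_kb"]) / (1024 * 1024),
    }


sample = (
    "[2298074.478406] Killed process 18020 (MyProcess), UID 16941, "
    "total-vm:8588392kB, anon-rss:5474896kB, file-rss:0kB, shmem-rss:48k"
)
info = parse_killed_line(sample)
print(info["pid"], info["name"], round(info["anon_rss_gib"], 2))
# prints: 18020 MyProcess 5.22
```

The anon-rss value comes out at about 5.2 GiB, which matches the "more than 5GB of physical RAM" that we had seen the process using.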

Windows behaviour is pretty different (and not very smart, I would say...). There's nothing similar to the OOM Killer, so when the RAM and Swap are full Windows will gift the first process requesting more memory with an Out of Memory error, and the process will die. This can be any process, not necessarily the process taking up the most memory (which maybe is not actively requesting new memory right now). So you can have a situation where processes keep dying in a system that becomes unusable, until the moment when the main memory consumer finally dies or the user restarts the machine... I'm not making it up, you can read it from a more reliable source.

Sunday 4 July 2021

Modification vs Change time

Sometimes you have deployed something to a server, find that one value in a configuration file is wrong and want to change it. You have permissions, you are not changing any code, it's the same thing that was deployed to a lower environment and that everyone was happy with... it's just a typo in a server name, a port, whatever... in a config file. But ideally everything should have been correct from the beginning, when your complex deployment scripts deployed it all. If you modify that file by hand and later on someone navigates to that folder, they'll notice with a simple "ls -la" that one of the config files there has a different modification date, and maybe they'll say "hey, so in the end there was an error and you had to do some extra changes... bla, bla, bla". Not a big deal, but maybe you want to avoid that kind of remark...

Files in Linux have 3 timestamps (Last Modification time, Last Access time and Last Changed time). There is a fourth one, Birth, for the creation time. This value is available in file systems like Ext4, but most Linux tools do not read it (as the creation time is not defined in the POSIX standard). You can see all these values with the stat command.

When you list the files in a folder (e.g. ls -la) the date that you see is the Modification time. The OS will always update the Change time of a file when you modify anything in it (contents, permissions, name, location), but for the Modification time things are more relaxed. If you rename a file or move it to another location, the Modification time remains the same. If you make a copy of a file, by default the new file gets its Modification time set to the current datetime, but we are allowed to prevent this by means of the -p (preserve) flag. cp -p myFile myFileCopy will set the Modification time (and Access time) of the copy to the values of the source file (but of course its Change time is still set to the current time).
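The rules above can be checked with a short Python sketch. It assumes a Linux file system (on Windows st_ctime means something else), the file names are just examples, and shutil.copy2 plays roughly the role of cp -p here:

```python
import os
import shutil
import tempfile
import time

with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "db.conf")
    with open(original, "w") as handle:
        handle.write("port=5432\n")
    before = os.stat(original)

    time.sleep(0.05)
    os.chmod(original, 0o600)            # permissions change: metadata only
    after_chmod = os.stat(original)

    renamed = os.path.join(folder, "db.conf.bak")
    os.rename(original, renamed)         # rename: contents untouched
    after_rename = os.stat(renamed)

    copy_path = os.path.join(folder, "db.conf.copy")
    shutil.copy2(renamed, copy_path)     # copy2 also copies the timestamps it can
    after_copy = os.stat(copy_path)

    mtime_kept_by_chmod = after_chmod.st_mtime_ns == before.st_mtime_ns
    ctime_moved_by_chmod = after_chmod.st_ctime_ns > before.st_ctime_ns
    mtime_kept_by_rename = after_rename.st_mtime_ns == before.st_mtime_ns
    mtime_kept_by_copy = after_copy.st_mtime_ns == before.st_mtime_ns

print("mtime kept by chmod: ", mtime_kept_by_chmod)
print("ctime moved by chmod:", ctime_moved_by_chmod)
print("mtime kept by rename:", mtime_kept_by_rename)
print("mtime kept by copy2: ", mtime_kept_by_copy)
```

The Change time comparison depends on the timestamp granularity of the file system, so on a coarse-grained one it may need a longer sleep to show a difference.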

Even more interesting than the above is the fact that you can set the Access and Modification times of a file yourself. The touch command accepts an explicit date (with -d or -t), and with touch -r it copies the timestamps from a reference file. With that I can create a new file with the timestamps of a "source" file. Then I can freely modify the contents of my "source" file and afterwards copy the original timestamps from the new file back to the source file. Example:

# create the new file with the timestamp for later reuse
touch -r db.conf db.conf.timestamp
# modify db.conf correcting some values
vi db.conf
# the db.conf Modify timestamp has changed, let's set it back to the initial value
touch -r db.conf.timestamp db.conf
rm db.conf.timestamp

With the above, at first sight (with ls) it looks as if the config file had not changed. Obviously, checking in more depth (with "stat") will show you the truth in the Change timestamp. We could say that this is a sort of "cosmetic trick".
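The same cosmetic trick can be sketched in Python with os.stat and os.utime, without needing the intermediate timestamp file. The edit_keeping_mtime helper and the sample values are made up for this example:

```python
import os
import tempfile


def edit_keeping_mtime(path, new_text):
    """Rewrite `path`, then put back its previous access/modification times."""
    saved = os.stat(path)
    with open(path, "w") as handle:
        handle.write(new_text)
    # Same idea as `touch -r`: only atime and mtime can be set this way,
    # the Change time still moves to "now".
    os.utime(path, ns=(saved.st_atime_ns, saved.st_mtime_ns))


with tempfile.TemporaryDirectory() as folder:
    config = os.path.join(folder, "db.conf")
    with open(config, "w") as handle:
        handle.write("host=serverr01\n")          # the typo to correct
    os.utime(config, (1_600_000_000, 1_600_000_000))  # pretend it is an old deployment

    mtime_before = os.stat(config).st_mtime_ns
    edit_keeping_mtime(config, "host=server01\n")
    mtime_after = os.stat(config).st_mtime_ns
    with open(config) as handle:
        content = handle.read()

print(mtime_before == mtime_after, content.strip())
```

As with touch -r, a look at the Change time with stat would still reveal that the file was touched.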