Sunday, 31 August 2014

New IT Jargon

IT is a complex, constantly evolving world where it's rather difficult for someone with an average brain to keep up with even a small part of the many things one can find interesting. What is quite simple (and sometimes pretty helpful) is keeping up with the new vocabulary. When I say helpful, I mean that being able to name a few of the latest cool IT things can make you look smarter than you are (but notice that some managers consider that knowing what an acronym stands for is just the same as being an expert on it, so it can end up causing you some trouble).

In the last few weeks I've come across some names/acronyms that can help you look cool :-D

  • NoSQL is one of the most misleading terms of the last few years. OK, NoSQL stores do not use SQL, but that's not the real point. The real point is that those data stores are not based on the Relational Model we find in RDBMSs. So when I read somewhere that some people had started to use the term NoRel I immediately loved it.
  • Polyglot Persistence. When I first learned about the term polyglot programming I really liked it. It was a quick way to define what the programming world had evolved into: it's no longer enough to be proficient in one language/platform and blindly choose it for all your projects. I guess with Polyglot Persistence it's going to be just the same.
  • BASE. The counterpart to ACID in a NoRel world.
  • MEAN. I think it's been some years since the last time I heard about LAMP, and suddenly a modern, geek incarnation of that web stack is here: MongoDB, Express, AngularJS and Node. I'm almost certain I'll never use this combination. I love Node (but mainly as a scripting platform or just as a way to test JavaScript "proofs of concept") and I'm looking forward to having some time to play with Mongo and AngularJS, but I don't see myself doing any personal Web Development project, and in most companies the LOB (Line of Business) web applications are and will continue to be Java or .NET.
  • MVW. You're probably a bit fed up with having to stop and think about the distinction among MVC, MVP, MVVM... to explain to someone which one (or which variation of one) you're using. Saying "I'm using Model View Whatever" can come in really handy to spare you time in fruitless debates.
  • Hoisting. In JavaScript, variable declarations (but not initializations) are hoisted (that is, moved to the top of their scope). Function declarations (not function expressions) are also hoisted.
  • IIFE. If you've been into JavaScript development in the last few years it's pretty likely that you've made some use of them, but chances are that you don't know their name: IIFE (Immediately Invoked Function Expression), I mean:
    (function () {  // open IIFE
        var tmp = 100 - x;
        ...
    }());
    

Sunday, 17 August 2014

File Locking

I've gone through some problems with file locking at work lately, so I think I'll write some notes about it here. We had an application appending lines (results) to a CSV file in a shared folder. If that file is opened with Notepad++, results will continue to be appended successfully (and Notepad++ will ask you to reload the file as it's been modified). However, if it's opened with Excel by someone with write access to that file, new lines will fail to be appended (we won't lose them, as in that case we write to a secondary file, which means that in the end we'll have to consolidate both files). So Excel is locking the file while Notepad++ isn't.

We requested write access to be removed for our users, and just in case we also asked them to always copy the file to a local folder and open their local copy. Even so, we found later that errors continued to happen (we could see new lines being added to the secondary file). After some bewilderment we finally found the reason: the errors were happening while a user was copying the file to their local folder (the file has grown enough to take some seconds to copy, time enough to collide with the main process trying to append new results).

I have to admit that I wasn't very clear on how "file locking" works, and I was quite confused on some points. The only kind of locking that I had in mind was that 2 processes (threads, indeed) cannot write to the same file at the same time, but from the above it seems like there are more cases where locking happens (I guess copying a file means just opening it in Read mode). So first I've done some basic tests with a .NET program opening a file with a StreamReader, StreamWriter or FileStream with different parameters, and trying to read/write from/to that file with some basic commands: echo hi >> File.txt / type File.txt.
Notice that I've performed all these tests on Windows. I'll test this on Linux when I have a chance.

1) One process opens a file for writing to it (either with a StreamWriter or a FileStream):

using (StreamWriter sw = new StreamWriter(filePath)) //careful, the StreamWriter empties the file!

//or:

using (FileStream fs = new FileStream(filePath, FileMode.Append, FileAccess.Write))

As expected, if another process tries to write to the same file (just a simple echo time >> file.txt), it'll fail with a "The process cannot access the file because it is being used by another process" error.

Notice, however, that there will be no problem reading from this file from another process (a simple type file.txt).

2) Now, let's go for the interesting case, let's open a file for reading (with a StreamReader or a FileStream):

using (StreamReader sr = new StreamReader(filePath))

//or

using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))

and let's try to write to it (again with just a simple echo time >> file.txt). It'll fail. So it seems like by default opening a file for reading prevents other processes from writing to it. This fits quite well with the unexpected behaviour that we'd come across when trying to write while copying.
Again, notice that there will be no problem reading from the file from another process.

These results disconcerted me a bit. I guess there has to be a way to open a file for reading while allowing other processes to write to it. Trying to find some more information about file locking in Windows, it happened once again that Wikipedia has just the perfect amount of information.

Microsoft Windows uses three distinct mechanisms to manage access to shared files:

  • using share-access controls that allow applications to specify whole-file access-sharing for read, write, or delete
  • using byte-range locks to arbitrate read and write access to regions within a single file
  • by Windows file systems disallowing executing files from being opened for write or delete access

...

The sharing mode parameter in the CreateFile function used to open files determines file-sharing. Files can be opened to allow sharing the file for read, write, or delete access. Subsequent attempts to open the file must be compatible with all previously granted sharing-access to the file. When the file is closed, sharing-access restrictions are adjusted to remove the restrictions imposed by that specific file open.
Byte-range locking type is determined by the dwFlags parameter in the LockFileEx function used to lock a region of a file. The Windows API function LockFile can also be used and acquires an exclusive lock on the region of the file.

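If you look at the different overloads of the FileStream constructor you'll find a few with a FileShare parameter. This is what the article refers to as "share-access controls". In my examples I've not been using any of those overloads, so the defaults apply. In .NET the overloads without a FileShare parameter fall back to FileShare.Read, which matches what I saw: other processes can read the file, but not write to it. The general rule behind it is this one: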

Windows inherits the semantics of share-access controls from the MS-DOS system, where sharing was introduced in MS–DOS 3.3. Thus, an application must explicitly allow sharing; otherwise an application has exclusive read, write, and delete access to the file (other types of access, such as those to retrieve the attributes of a file are allowed.)
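The effect of the sharing mode can be checked from a single .NET program. What follows is only a minimal sketch: the class and method names (ShareModeSketch, CanAppendWhileOpenForRead) are made up for this example. It opens a file for reading with a given FileShare value and then tries to open it again for appending, as the main process in my scenario would do.

```csharp
using System;
using System.IO;

public static class ShareModeSketch
{
    // Hypothetical helper: holds the file open for reading with the given
    // sharing mode and reports whether a second handle can still be opened
    // for appending while the reader is alive.
    public static bool CanAppendWhileOpenForRead(string path, FileShare readerShare)
    {
        using (var reader = new FileStream(path, FileMode.Open, FileAccess.Read, readerShare))
        {
            try
            {
                // The writer itself allows readers (FileShare.Read), so the
                // only thing that can reject it is the reader's sharing mode.
                using (var writer = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read))
                {
                    return true;
                }
            }
            catch (IOException)
            {
                // On Windows a sharing violation surfaces as an IOException.
                return false;
            }
        }
    }
}
```

On Windows I would expect CanAppendWhileOpenForRead(path, FileShare.Read) to return false and CanAppendWhileOpenForRead(path, FileShare.ReadWrite) to return true. Other platforms emulate sharing in their own way, so the first result may differ there.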

I then found this excellent answer, where a guy opens a file with FileShare.ReadWrite and then copies it block by block. I've put it into my own FileCopyTaker class and done some tests, and it works like a charm. I can copy a large file and at the same time append additional information to it from another process.

I've uploaded it here.
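For reference, the idea can be sketched in a few lines. This is not the code from the linked answer nor the uploaded class, just a minimal illustration under my own assumptions: the names SharedCopySketch and CopyAllowingWriters and the 64 KB buffer size are invented for the example.

```csharp
using System;
using System.IO;

public static class SharedCopySketch
{
    // Copies sourcePath to destPath block by block. The source is opened with
    // FileShare.ReadWrite, so other processes can keep appending to it while
    // the copy is running. Returns the number of bytes copied.
    public static long CopyAllowingWriters(string sourcePath, string destPath)
    {
        long total = 0;
        byte[] buffer = new byte[64 * 1024];

        using (var source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var dest = new FileStream(destPath, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            int read;
            // Read returns 0 once the current end of the file is reached.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                dest.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

Bear in mind that the copy is only a snapshot up to wherever the reader happened to find the end of the file, so a line being appended at that very moment could end up truncated in the copy.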

It's clear from this that doing a file copy from Windows Explorer (or via the copy command) is not using this file-sharing feature at all, and one could wonder why. Well, I think that's the correct default behaviour. In most cases, modifying a file while it's being copied would leave the copy in an unusable state: when the copy finishes the other process could still be writing information to the file, so the copy would be incomplete, and in most cases (save for my particular case of appending a line to a text file) that would mean broken.

You'll find people wondering why there's no Windows function to check whether a file is locked. The answer is clear: it would not be useful, as you could run into synchronization issues. I mean, given this code

if (!isFileLocked(filePath)){
	//Do stuff with the file
}

between the moment when isFileLocked returns false and the moment you start to do things with the file, another process could lock it and your code would crash. So in any case, you should always put your file access code into a try-catch to deal with the possibility of the file being locked.
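As an illustration of that try-catch approach, here is a minimal sketch of an append with a few retries. The names (AppendSketch, TryAppendLine) and the retry parameters are assumptions made for the example, not the code of our real application. The caller decides what to do when it returns false, for instance writing to a secondary file.

```csharp
using System;
using System.IO;
using System.Threading;

public static class AppendSketch
{
    // Tries to append one line to the file, retrying when the open or write
    // fails with an IOException (which is how a locked file shows up in .NET).
    // Returns false if every attempt failed.
    public static bool TryAppendLine(string path, string line, int maxAttempts, int delayMs)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read))
                using (var sw = new StreamWriter(fs))
                {
                    sw.WriteLine(line);
                }
                return true;
            }
            catch (IOException)
            {
                // IOException also covers other I/O failures (a missing
                // directory, for example); a real application might want
                // to tell those apart instead of retrying blindly.
                if (attempt < maxAttempts)
                {
                    Thread.Sleep(delayMs);
                }
            }
        }
        return false;
    }
}
```

There is no "is it locked?" check anywhere: the code simply tries to open the file and reacts to the failure, which avoids the race described above.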

Sunday, 10 August 2014

Thread Suspend, Resume and Abort

At first sight aborting, suspending or resuming threads (in the .NET arena) seems pretty simple: the Thread class sports methods for each of these 3 tasks. However, if you check the documentation, you'll see that both Suspend and Resume are marked as obsolete and highly discouraged:

Do not use the Suspend and Resume methods to synchronize the activities of threads. You have no way of knowing what code a thread is executing when you suspend it. If you suspend a thread while it holds locks during a security permission evaluation, other threads in the AppDomain might be blocked. If you suspend a thread while it is executing a class constructor, other threads in the AppDomain that attempt to use that class are blocked. Deadlocks can occur very easily.

As for the native Win32 function SuspendThread, you'll find equally discouraging statements:

This function is primarily designed for use by debuggers. It is not intended to be used for thread synchronization. Calling SuspendThread on a thread that owns a synchronization object, such as a mutex or critical section, can lead to a deadlock if the calling thread tries to obtain a synchronization object owned by a suspended thread. To avoid this situation, a thread within an application that is not a debugger should signal the other thread to suspend itself. The target thread must be designed to watch for this signal and respond appropriately.

Thread.Abort is not marked as obsolete and no advice against its use is explicitly given in MSDN, but you'll find multiple articles and discussions (like this one) advising against its use, for just the same reasons as Suspend.

The basic idea is that these methods are too invasive: they'll just suspend or abort a thread irrespective of what it's doing at that moment, which can be pretty risky. We need then a more collaborative way to do this. Basically, a thread should periodically pause its main task and check for Suspend or Abort requests at moments when it can safely comply, aborting/suspending itself accordingly. Obviously the "periodically pause its main task" part can be complicated to accomplish depending on what that main task is... but that's separate from the suspend/abort logic.

For Abort it seems pretty simple. A given thread would just check for an Abort flag, and when true, it would just end.

while(!this.abort){ //check for Abort requests
     DoWork();
}
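One detail the loop above doesn't show is that the flag is written by one thread and read by another. A minimal sketch of how that could look follows, assuming a hypothetical AbortableWorker class in which a short Thread.Sleep stands in for DoWork. The flag is declared volatile so that the worker thread is guaranteed to notice the change.

```csharp
using System;
using System.Threading;

public class AbortableWorker
{
    // volatile: the flag is set by the controlling thread and read by the
    // worker thread, so its reads must not be cached or optimized away.
    private volatile bool abort;

    public void Run()
    {
        while (!abort) // check for Abort requests
        {
            Thread.Sleep(1); // stand-in for DoWork()
        }
    }

    public void RequestAbort()
    {
        abort = true;
    }
}
```

It could be used like this: create the worker, start it with new Thread(worker.Run), call worker.RequestAbort() when it's time to stop, and then call Join on the thread to wait until the loop has really ended.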

For Suspend/Resume it's a bit more complicated. We can check for a Suspend flag, but once suspended, how do we check for the Resume order? Well, signals are the main communication mechanism among threads, so we should just use that: a ManualResetEvent initially set as signaled. We would indicate a Suspend request by resetting it, and then send a Resume command by setting it again.

while(!this.abort){ //check for Abort requests
     DoWork();
     this.resumeSuspend.WaitOne(); //check for Resume/Suspend requests
}
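Put together, the whole thing could be sketched as follows. This is just an illustration under my own assumptions: the PausableWorker class and its member names are invented, and an iteration counter plus a short sleep stands in for DoWork. Notice that Abort also sets the event; otherwise a thread that is suspended at that moment would stay blocked in WaitOne and never get to see the abort flag.

```csharp
using System;
using System.Threading;

public class PausableWorker
{
    private volatile bool abort;
    // Signaled means "keep running"; reset means "suspend requested".
    private readonly ManualResetEvent resumeSuspend = new ManualResetEvent(true);
    private int iterations;

    // Number of work iterations done so far (safe to read from any thread).
    public int Iterations
    {
        get { return Volatile.Read(ref iterations); }
    }

    public void Run()
    {
        while (!abort) // check for Abort requests
        {
            Interlocked.Increment(ref iterations); // stand-in for DoWork()
            Thread.Sleep(1);
            resumeSuspend.WaitOne(); // blocks here while a Suspend is pending
        }
    }

    public void Suspend() { resumeSuspend.Reset(); }

    public void Resume() { resumeSuspend.Set(); }

    public void Abort()
    {
        abort = true;
        resumeSuspend.Set(); // wake the thread up if it is suspended
    }
}
```

The thread only suspends or ends between two units of work, never in the middle of one, which is the whole point of doing it cooperatively.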

To really understand it you'll have to see this full sample