Sunday 17 August 2014

File Locking

I've run into some problems with file locking at work lately, so I thought I'd write some notes about it here. We had an application appending lines (results) to a CSV file in a shared folder. If that file is opened with Notepad++, results continue to be appended successfully (and Notepad++ asks you to reload the file because it has been modified). However, if it's opened with Excel by someone with write permission on that file, new lines fail to be appended (we don't lose them, because in that case we write to a secondary file, which means that in the end we have to consolidate both files). So Excel is locking the file while Notepad++ isn't.

We requested that write access be removed for our users, and just in case we also asked them to always copy the file to a local folder and open their local copy. Even so, we found that errors kept happening (we could see new lines being added to the secondary file). After some bewilderment we finally found the reason: the errors happened while a user was copying the file to their local folder. The file had grown enough to take several seconds to copy, long enough to collide with the main process trying to append new results.

I have to admit that I didn't have a clear idea of how "file locking" works, and I was quite confused on some points. The only kind of locking I had in mind was that 2 processes (or rather threads) cannot write to the same file at the same time, but from the above it seems there are more cases where locking happens (I guess copying a file just means opening it in read mode). So first I did some basic tests with a .NET program opening a file with a StreamReader, StreamWriter or FileStream with different parameters, while trying to read from or write to that file with some basic commands: echo hi >> File.txt / type File.txt.
Notice that I performed all these tests on Windows; I'll test this on Linux when I have a chance.

1) One process opens a file for writing (either with a StreamWriter or a FileStream):

using (StreamWriter sw = new StreamWriter(filePath)) //careful, the StreamWriter empties the file!

//or:

using (FileStream fs = new FileStream(filePath, FileMode.Append, FileAccess.Write))

As expected, if another process tries to write to the same file (just a simple echo time >> file.txt), it fails with a "The process cannot access the file because it is being used by another process" error.

Notice, however, that reading the file from another process (a simple type file.txt) works without problems.

2) Now for the interesting case: let's open a file for reading (with a StreamReader or a FileStream):

using (StreamReader sr = new StreamReader(filePath))

//or

using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))

and let's try to write to it (again with a simple echo time >> file.txt). It fails, so it seems that by default opening a file for reading prevents others from writing to it. This fits the unexpected behaviour we came across when trying to write while the file was being copied.
Again, notice that reading the file from another process works without problems.
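Both experiments can also be reproduced from a single program, because the sharing rules apply to every open handle and not only across processes. This is just a minimal sketch, assuming Windows; SharingProbe, CanOpen and Run are names I made up for illustration.

```csharp
using System;
using System.IO;

static class SharingProbe
{
    // Returns true if the file can be opened with the requested access and share mode,
    // false if the open fails with an IOException (for example, a sharing violation).
    public static bool CanOpen(string path, FileAccess access, FileShare share)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, access, share))
            {
                return true;
            }
        }
        catch (IOException)
        {
            return false;
        }
    }

    // Holds the file open for reading, as in case 2 (no FileShare argument),
    // and reports what a second handle is allowed to do in the meantime.
    public static void Run(string path)
    {
        using (var reader = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            Console.WriteLine("second reader: " + CanOpen(path, FileAccess.Read, FileShare.ReadWrite));
            Console.WriteLine("second writer: " + CanOpen(path, FileAccess.Write, FileShare.ReadWrite));
        }
    }
}
```

Calling SharingProbe.Run on an existing file should match the manual tests on Windows: the second reader gets in, the second writer does not.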

These results disconcerted me a bit; there has to be a way to open a file for reading while allowing other processes to write to it. Looking for more information about file locking in Windows, I found once again that Wikipedia has just the right amount of information:

Microsoft Windows uses three distinct mechanisms to manage access to shared files:

  • using share-access controls that allow applications to specify whole-file access-sharing for read, write, or delete
  • using byte-range locks to arbitrate read and write access to regions within a single file
  • by Windows file systems disallowing executing files from being opened for write or delete access

...

The sharing mode parameter in the CreateFile function used to open files determines file-sharing. Files can be opened to allow sharing the file for read, write, or delete access. Subsequent attempts to open the file must be compatible with all previously granted sharing-access to the file. When the file is closed, sharing-access restrictions are adjusted to remove the restrictions imposed by that specific file open.
Byte-range locking type is determined by the dwFlags parameter in the LockFileEx function used to lock a region of a file. The Windows API function LockFile can also be used and acquires an exclusive lock on the region of the file.

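If you look at the different overloads of the FileStream constructor you'll find a few with a FileShare parameter. This is what the article refers to as "share-access controls". In my examples I've not been using any of those overloads, and the .NET constructors without a FileShare parameter default to FileShare.Read, which matches what I observed: other processes could read the file but not write to it. The general rule in Windows is that sharing has to be allowed explicitly: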

Windows inherits the semantics of share-access controls from the MS-DOS system, where sharing was introduced in MS–DOS 3.3. Thus, an application must explicitly allow sharing; otherwise an application has exclusive read, write, and delete access to the file (other types of access, such as those to retrieve the attributes of a file are allowed.)
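In .NET terms, allowing sharing explicitly means using one of the overloads that take a FileShare value. The following is a minimal sketch, assuming the file already exists; SharedReadDemo and ReadWhileAppending are hypothetical names. Note that the second handle also has to pass a share mode that tolerates the reader that is already open.

```csharp
using System;
using System.IO;

static class SharedReadDemo
{
    // Opens the file for reading while explicitly allowing other handles to read and write,
    // then appends a line through a second handle while the reader is still open.
    public static void ReadWhileAppending(string path, string line)
    {
        using (var reader = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        // The appender must tolerate the existing reader, so it shares Read (and Write) too.
        using (var appender = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.ReadWrite))
        using (var writer = new StreamWriter(appender))
        {
            writer.WriteLine(line);
        }
    }
}
```

If the reader were opened without the FileShare.ReadWrite argument, I would expect the second open to fail on Windows just as in the tests above.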

I then found this excellent answer, where someone opens a file with FileShare.ReadWrite and then copies it in blocks. I've put it into my own FileCopyTaker class and done some tests, and it works like a charm: I can copy a large file and at the same time append additional information to it from another process.

I've uploaded it here.
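For reference, the idea can be sketched roughly like this. It is not the code from that answer nor my FileCopyTaker class, just a minimal illustration of the technique; SharedCopy, CopyAllowingWrites and the 64 KB default block size are assumptions of this sketch.

```csharp
using System;
using System.IO;

static class SharedCopy
{
    // Copies source to destination in fixed-size blocks. The source is opened with
    // FileShare.ReadWrite so other processes can keep appending to it during the copy.
    // Returns the number of bytes copied.
    public static long CopyAllowingWrites(string source, string destination, int blockSize = 64 * 1024)
    {
        long total = 0;
        var buffer = new byte[blockSize];
        using (var input = new FileStream(source, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var output = new FileStream(destination, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            int read;
            // Read returns 0 at end of file; anything appended after that point is not copied.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

The open can still fail if some other process already holds the source with a share mode that excludes reading, so the call belongs inside a try-catch like any other file access.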

It's clear from this that copying a file from Windows Explorer (or via the copy command) does not share the file for writing, and one could wonder why. Well, I think that's the correct default behaviour. In most cases, modifying a file while it's being copied would leave the copy in an unusable state: when the copy finishes, the other process could still be writing to the file, so the copy would be incomplete, and in most cases (save for my particular case of appending a line to a text file) that would mean broken.

You'll find people wondering why there is no Windows function to check whether a file is locked. The answer is clear: it would not be useful, because you would run into a race condition. Given this code:

if (!isFileLocked(filePath)){
	//Do stuff with the file
}

between the moment isFileLocked returns false and the moment you start working with the file, another process could lock it and your code would crash. So in any case you should always put your file access code in a try-catch to deal with the possibility of the file being locked.
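A minimal sketch of that try-catch approach, with a few retries added. SafeAppend, TryAppendLine and the default of 3 attempts with a 200 ms pause are hypothetical choices for illustration, not what our application actually does.

```csharp
using System;
using System.IO;
using System.Threading;

static class SafeAppend
{
    // Tries to append a line, retrying a few times if the file is in use.
    // Returns true on success, false if every attempt failed with an IOException.
    public static bool TryAppendLine(string path, string line, int attempts = 3, int delayMs = 200)
    {
        for (int i = 0; i < attempts; i++)
        {
            try
            {
                // FileShare.Read lets other handles read the file while we append.
                using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read))
                using (var sw = new StreamWriter(fs))
                {
                    sw.WriteLine(line);
                }
                return true;
            }
            catch (IOException)
            {
                // Most likely a sharing violation; wait a little and try again.
                Thread.Sleep(delayMs);
            }
        }
        return false;
    }
}
```

When TryAppendLine returns false the caller can fall back to something else, such as writing to a secondary file.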
