Sunday 26 September 2021

Linux Process Memory Consumption

Properly understanding the memory consumption of a process is quite an endeavor. On Windows I've given up: over the years I've gone through different articles and discussions about the Working Set, Private Bytes and so on, and I've never managed to get a full understanding. In Linux things seem to make quite a bit more sense.

I have several recent posts that could join this one in a sort of "Understanding Memory" series: this one about the overall memory consumption in your system, this one about the available physical memory, and this one where, among other things, I talk about the Virtual Address Space.

I'm now going to talk about the physical memory consumed by a specific process and about "shared things". With the top command we see 3 memory values: VIRT, RES and SHR. VIRT is the size of the virtual address space, so as I explained in previous articles it says little about physical memory and normally it's not something we should care about. RES is the important one: it's the physical memory used by the process, and it's the same value that we see under the RSS column with the ps v pid command.
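As a rough sketch, the same two values can also be read straight from /proc/<pid>/status, where, as far as I know, VmSize corresponds to VIRT and VmRSS to RES/RSS. The helper names (parse_status_kb, read_status_kb) are made up for this example, and the code assumes a Linux system with /proc mounted.

```python
import os


def parse_status_kb(text):
    """Return the 'Vm*' fields of /proc/<pid>/status as {name: kB}."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Memory lines look like "VmRSS:      5120 kB"
        if name.startswith("Vm") and len(parts) == 2 and parts[1] == "kB":
            fields[name] = int(parts[0])
    return fields


def read_status_kb(pid="self"):
    """Read and parse /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status_kb(f.read())


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    mem = read_status_kb()
    print("VIRT (VmSize):", mem.get("VmSize"), "kB")
    print("RES  (VmRSS): ", mem.get("VmRSS"), "kB")
```

Passing a numeric pid instead of "self" inspects another process, provided you have permission to read its /proc entry.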

Well, things are not so simple. Processes share resources with each other. I can think of 2 main ones: SOs (shared objects/shared libraries, that is, the Linux equivalent of DLLs) and the code segment shared between different instances of the same program. From what I can read here and here, the memory taken by an SO is included in the RSS value of each process using that library (though, at least for most of the SO, it takes up physical memory only once). The same goes for the code segment: for each instance of the same program the RSS value includes the code segment, though it's loaded only once in physical memory. This means that if you add up the RSS values of all the processes in your system you can get a far bigger value than the real physical memory consumption reported by free -m.

RSS is the Resident Set Size and is used to show how much memory is allocated to that process and is in RAM. It does not include memory that is swapped out. It does include memory from shared libraries as long as the pages from those libraries are actually in memory. It does include all stack and heap memory.

Remember that code segment pages are shared among all of the currently running instances of the program. If 26 ksh processes are running, only one copy of any given page of the ksh executable program would be in memory, but the ps command would report that code segment size as part of the RSS of each instance of the ksh program.
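To put numbers on the ksh example, here is a small arithmetic sketch. The sizes (1000 kB of shared code, 200 kB of private memory per instance) are hypothetical and chosen only to make the gap visible, and the function name is made up for the illustration.

```python
def rss_sum_vs_physical(instances, shared_kb, private_kb):
    """Compare the naive RSS sum with the memory actually occupied.

    Each instance reports shared + private in its RSS, but the shared
    pages exist only once in physical memory.
    """
    rss_per_process = shared_kb + private_kb
    rss_sum = instances * rss_per_process
    physical = shared_kb + instances * private_kb
    return rss_sum, physical


# Hypothetical sizes: 1000 kB of shared code, 200 kB private per instance.
rss_sum, physical = rss_sum_vs_physical(26, 1000, 200)
print(f"sum of RSS: {rss_sum} kB, physical: {physical} kB")
```

With these made-up figures the 26 RSS values add up to 31200 kB, while only 6200 kB of physical memory is really in use.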

When a process forks, both the parent and the child will show with the same RSS. However linux employs copy-on-write so that both processes are really using the same memory. Only when one of the processes modifies the memory will it actually be duplicated. This will cause the free number to be smaller than the top RSS sum.
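The fork behaviour described above can be observed with a small experiment, sketched here under the assumption of a Linux system with /proc mounted. The function names are made up for the example: the parent fills a buffer, forks, and the child reports its own VmRSS through a pipe without ever writing to the buffer.

```python
import os


def vm_rss_kb():
    """Current process's VmRSS in kB, from /proc/self/status (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found")


def rss_after_fork(size_bytes):
    """Allocate size_bytes, fork, and return (parent_kb, child_kb)."""
    data = b"x" * size_bytes          # filling the buffer touches every page
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                      # child: report its own RSS and exit
        os.close(read_fd)
        os.write(write_fd, str(vm_rss_kb()).encode())
        os._exit(0)
    os.close(write_fd)
    child_kb = int(os.read(read_fd, 64))
    os.close(read_fd)
    os.waitpid(pid, 0)
    parent_kb = vm_rss_kb()
    del data                          # buffer was kept alive until both reads
    return parent_kb, child_kb


if __name__ == "__main__":
    parent_kb, child_kb = rss_after_fork(64 * 1024 * 1024)
    print(f"parent RSS: {parent_kb} kB, child RSS: {child_kb} kB")
```

Both processes should report an RSS that includes the 64 MiB buffer, even though the child never copied it, which is why adding the two values overstates the physical memory in use.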

There's an additional point to take into account: Shared Memory. Contrary to what you might think, when talking about "Shared Memory" in Linux systems we are not talking about shared libraries, but about a different concept: memory that processes explicitly share through calls such as shmget or mmap. Honestly, I know almost nothing about it; I'm just repeating what I've read here and here:

"shared libraries" != "shared memory". shared memory is stuff like shmget or mmap. The wording around memory stuff is very tricky. Using the wrong word in the wrong place can totally screw up the meaning of a sentence
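To make the distinction concrete, here is a minimal sketch of shared memory in the mmap sense. It relies on Python's mmap module, where passing -1 as the file descriptor creates an anonymous mapping that is shared by default on Unix; the function name is made up for the example.

```python
import mmap
import os


def share_message_with_child(message):
    """Let a forked child write into an anonymous shared mapping."""
    # fileno=-1 requests anonymous memory; on Unix the default flag is
    # MAP_SHARED, so the mapping stays shared across fork().
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:                      # child writes, then exits
        shared[:len(message)] = message
        os._exit(0)
    os.waitpid(pid, 0)                # parent waits, then sees the write
    result = bytes(shared[:len(message)])
    shared.close()
    return result


if __name__ == "__main__":
    print(share_message_with_child(b"hello"))
```

The parent sees what the child wrote because both processes map the same physical pages. With ordinary (private) memory, copy-on-write would have given the child its own copy and the parent would see nothing.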

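This Shared Memory is part of the SHR value in top's output, and all processes sharing a block of shared memory will have it counted under their SHR value. One of the sources I read claims that it is not counted under RSS:

The RSS value doesn't include shared memory. Because shared memory isn't owned by any one process, top doesn't include it in RSS.

However, that does not match the top man page, which describes SHR as a subset of resident memory (RES) that may be used by other processes, covering shared anonymous pages and shared file-backed pages as well as pages mapped from program images and shared libraries. So shared memory that is resident counts towards both RES and SHR.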
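How SHR relates to RES on a given machine can be checked directly. The sketch below parses /proc/<pid>/statm, whose first three fields are the total size, the resident pages and the resident shared pages; as far as I know these are the values top shows as VIRT, RES and SHR. The function name and the dictionary keys are made up for the example.

```python
import os


def parse_statm(text, page_size):
    """Convert the first three /proc/<pid>/statm fields from pages to kB."""
    size, resident, shared = (int(v) for v in text.split()[:3])
    to_kb = page_size // 1024
    return {
        "virt_kb": size * to_kb,
        "res_kb": resident * to_kb,
        "shr_kb": shared * to_kb,
    }


if __name__ == "__main__" and os.path.exists("/proc/self/statm"):
    with open("/proc/self/statm") as f:
        mem = parse_statm(f.read(), os.sysconf("SC_PAGE_SIZE"))
    print(mem)
```

Comparing shr_kb with res_kb for a few processes is a quick way to see whether the shared figure stays within the resident one.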
