Sunday 25 April 2021

Java Heap

I'm not a Java guy, and recently I found out about the -Xms and -Xmx settings that you can pass when starting the JVM to set the initial and maximum heap size. Because of that I've been investigating a bit, and I'll write my findings down here.

These settings quite surprised me, because while Node.js also has a --max-old-space-size setting that lets you set the maximum heap size, in .NET you have no control at all over the heap size.

Before continuing, some basic background. On 64-bit systems each application has an enormous, independent Virtual Address Space available where it can reserve memory (in theory 128 TiB on Linux x86-64, but the OS can set lower limits). What is really important is the Physical Address Space, which is shared by all applications. An application can reserve memory in its VAS for some future use without actually writing anything to it, so that reservation doesn't take up any physical memory. Only when the application really uses that memory does it get mapped to the shared Physical Address Space, and this Physical Address Space is limited by... well, obviously, by the physical and expensive RAM that you have bought for your system. You can check this link showing a good example with a native application on Linux. We could say that the virtual memory consumed by a 64-bit application is not particularly relevant; it's the physical memory that matters.

The Java heap, the memory where Java stores objects (those that you create with new) and that is taken care of by the Garbage Collector, is not the only memory used by your Java application (though normally it's the main one). You obviously have the stack, and then several additional areas where class definitions, compiled code, and other structures are stored. This is nicely explained here. You can expect similar memory layouts in other virtual environments (.NET, Node, Python...).
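You can actually see this split from inside a running JVM through the standard java.lang.management API, which reports heap and non-heap usage separately and lists the individual memory pools. A minimal sketch (the class name MemoryAreas is just for illustration):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class MemoryAreas {
    public static void main(String[] args) {
        // Heap vs non-heap totals, as reported by the JVM itself
        System.out.println("Heap:     " + ManagementFactory.getMemoryMXBean().getHeapMemoryUsage());
        System.out.println("Non-heap: " + ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage());

        // Individual pools: on a modern HotSpot you typically see the GC
        // generations plus non-heap pools like Metaspace and the code cache
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + " - " + pool.getName());
        }
    }
}
```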

The -Xmx setting makes sense to me. You set the maximum size that the application heap can reach. This limit applies to the reserved memory in your Virtual Address Space, so obviously it also limits how much physical memory (resident memory) your heap will take up. When your heap is full and cannot grow further, either the GC manages to release some space or your application will crash with an OutOfMemoryError. I guess the idea behind this is to force more GCs to happen once an application is taking up a lot of memory, rather than allowing it to keep taking up more and more. This way the application plays more fairly with other applications. Otherwise, an application with a serious memory leak would keep growing until it took up all the available physical memory and eventually crashed, and in the meantime it would be causing other applications that are behaving correctly to GC more often or even crash themselves.
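A cheap way to check which limit the JVM actually applied is Runtime.maxMemory(), which reflects -Xmx (or the default the JVM picked). The allocation loop below is only a sketch of how you could provoke the OutOfMemoryError described above; it runs only when explicitly asked, and you'd want a small -Xmx when trying it (the class name HeapLimit is mine):

```java
import java.util.ArrayList;
import java.util.List;

public class HeapLimit {
    public static long maxHeapBytes() {
        // Upper bound the heap may grow to; reflects -Xmx when it is set
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapBytes() / (1024 * 1024) + " MB");

        // Only when explicitly requested: fill the heap until the GC can free
        // nothing more, e.g.  java -Xmx64m HeapLimit fill
        if (args.length > 0 && args[0].equals("fill")) {
            List<byte[]> hoard = new ArrayList<>();
            try {
                while (true) {
                    hoard.add(new byte[1024 * 1024]); // 1 MB at a time
                }
            } catch (OutOfMemoryError e) {
                hoard.clear(); // release so we can still print
                System.out.println("Heap exhausted, OutOfMemoryError caught");
            }
        }
    }
}
```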

The reason for the -Xms setting is a bit less clear. You are setting the initial size of the heap, so you are reserving that space in the huge Virtual Address Space of your application, but it won't use physical memory until you actually fill those heap addresses with something. Well, with this initial reservation you make sure that your heap occupies a contiguous space, which is better for performance. Apart from that, the main point is that I think GCs won't be run until you have filled this "minimum heap size", so your application will work faster until that moment is reached. That's the reason why it's sometimes recommended to run your application with -Xms and -Xmx set to the same value. You decide how much memory you allow your application to use (without it being a burden for other applications on that machine), and let it run freely without garbage collection interference until the limit is reached. Also, as your heap has a fixed reserved size, you avoid the costs of resizing it.
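The MemoryUsage object the JVM reports distinguishes exactly these numbers: init comes from -Xms, max from -Xmx, and committed is what is actually backed for the heap at the moment. A sketch to print them (with -Xms and -Xmx set to the same value, init, committed and max should all line up; HeapSizing is a name I'm making up):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSizing {
    public static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        // init      -> initial reservation (-Xms)
        // used      -> bytes currently occupied by objects
        // committed -> memory currently guaranteed/backed for the heap
        // max       -> ceiling the heap may grow to (-Xmx)
        System.out.println("init:      " + heap.getInit());
        System.out.println("used:      " + heap.getUsed());
        System.out.println("committed: " + heap.getCommitted());
        System.out.println("max:       " + heap.getMax());
    }
}
```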

I've been doing some tests with the -Xms value, as there was some wrong information in different posts and I wanted to confirm that this value has no effect at all on the initial physical memory consumed by your application. To see the physical memory (resident memory, the RES column) and virtual memory (the VIRT column) consumed by a process on Linux, the best command I came up with is:
top -p pid

I'm running a sort of "Hello World" Java application on Ubuntu, OpenJDK 64-Bit Server VM (build 14.0.2+12-Ubuntu-120.04, mixed mode, sharing). The application creates no objects other than the ones needed to read from and write to the console, so it's hardly putting anything into the heap.
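Such a program needs nothing more than this (a minimal sketch, with HelloHeap as a hypothetical name; the readLine keeps the process alive so it can be inspected with top):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class HelloHeap {
    static String greeting() {
        return "Hello, heap!";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(greeting());
        // Keep the process alive so its memory can be inspected with top
        System.out.println("Press Enter to exit...");
        new BufferedReader(new InputStreamReader(System.in)).readLine();
    }
}
```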

Running the application without providing any -Xms -Xmx setting I get this:

So this JVM is already reserving a large amount of virtual memory (the total reserved memory is more than 4 GB, and based on what we'll see in the next execution, I guess the reserved heap is around 1 GB), but the consumed physical memory is pretty low, just 33 MB.

Running it with an -Xms12g setting (12 GB) we get this:

The JVM is reserving 15 GB of virtual memory (so, apart from the heap itself, there are 3 additional GB reserved for other memory areas), but the consumed physical memory is just 87 MB. Anyway, comparing both cases, it's interesting to see that just reserving a much larger heap involves a slight increase in the consumed physical memory.

I've done the same test on Windows, and I get 10 GB of virtual memory / 20 MB of working set for the first case, and 13 GB of virtual memory / 60 MB of working set for the second case. Notice that I've never managed to fully understand the Windows memory metrics (Private Bytes, Working Set...), so I won't comment further on them.
