Friday, September 11, 2009

how much memory is my application really using?

In order to properly visualize how much memory is being used by an application, you must understand the basics of how memory is managed by the Virtual Memory Manager.

Memory is allocated by the kernel in units called 'pages'. Memory in userland applications is lumped into two generic categories: pages that have been allocated and pages that have actually been used. This distinction is based on the idea of 'overcommitting' memory. Many applications allocate much more memory for themselves than they ever use. If an application tries to allocate memory and the kernel refuses, the program usually dies (though that's up to the application to handle). To prevent this, the kernel allows programs to over-commit themselves to more memory than they could possibly use. The result is that programs think they can use a lot of memory when your system probably doesn't even have that much.

The Linux kernel supports tuning of several VM parameters, including overcommit. There are 3 basic modes of operation: heuristic, always and never. Heuristic overcommit (the default) allows programs to overcommit memory, but refuses an allocation that is wildly larger than what the system contains. Always overcommit will pretty much guarantee an allocation. Never overcommit will strictly only allow allocation up to the amount of swap + a configurable percentage of real memory (RAM). The benefit of never overcommit is that applications won't simply be killed because the system ran out of free pages; instead, an application receives a nice friendly error when trying to allocate memory and *can* decide for itself what to do.
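
As a rough illustration of the three modes, the sketch below reads the kernel's overcommit settings and works out the limit that 'never' mode would enforce. It assumes a Linux system that exposes /proc/sys/vm/overcommit_memory (0 = heuristic, 1 = always, 2 = never) and /proc/sys/vm/overcommit_ratio. The function names are made up for this example, and the arithmetic ignores huge pages and the overcommit_kbytes setting, so treat the result as an approximation.

```python
import os

# Sketch only: these helper names are hypothetical, not part of any existing tool.
OVERCOMMIT_MODES = {0: "heuristic", 1: "always", 2: "never"}


def overcommit_mode_name(value):
    """Map the integer stored in /proc/sys/vm/overcommit_memory to a mode name."""
    return OVERCOMMIT_MODES.get(value, "unknown")


def never_mode_limit_kb(ram_kb, swap_kb, ratio_percent):
    """Approximate 'never' mode limit: swap + a percentage of RAM."""
    return swap_kb + ram_kb * ratio_percent // 100


def read_int(path):
    """Read the first integer from a small /proc file."""
    with open(path) as handle:
        return int(handle.read().split()[0])


def meminfo_kb(field, path="/proc/meminfo"):
    """Return one field of /proc/meminfo (values are reported in kB)."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)


# Only touch /proc when it is actually there (i.e. on Linux).
if __name__ == "__main__" and os.path.exists("/proc/sys/vm/overcommit_memory"):
    mode = read_int("/proc/sys/vm/overcommit_memory")
    ratio = read_int("/proc/sys/vm/overcommit_ratio")
    limit = never_mode_limit_kb(
        meminfo_kb("MemTotal"), meminfo_kb("SwapTotal"), ratio
    )
    print("overcommit mode:", overcommit_mode_name(mode))
    print("limit if the mode were 'never': %d kB" % limit)
```

For example, a machine with 1000 kB of RAM, 500 kB of swap and a ratio of 50 would get a limit of 1000 kB in 'never' mode.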

Now, let's say the program has allocated all it needs. The userland app now uses some of the memory. But how much is it using? How much memory is available for other programs? The dirty truth is, this is hard to figure out, mostly because the kernel won't give us any easy answers. Most older kernels simply don't have the facility to report what memory is being used, so we have to guesstimate. What we want to know is, given one or more processes, how much RAM is actually being used versus what is allocated?

With the exception of very new kernels, all you can really tell about the used memory of a program is how much of its memory is in physical RAM. This is called the Resident Set Size, or RSS. Since the 2.6.14 kernel we can use the 'smaps' /proc/ file to get a better idea of the RSS use. The RSS is split into two basic groups: shared (pages used by 2 or more processes) and private (pages used by only one process). We know that the memory a process physically takes up in RAM combines the shared and private RSS, but this is misleading: the shared memory is used by many programs, so counting it more than once would falsely inflate the amount of used memory. The solution is to take all the programs whose memory you want to measure and count all the private pages they use, and then (optionally) count the shared pages too. A catch here is that there may be many other programs using the same shared pages, so the result can only be considered a rough estimate of the total memory your application is using (it may be an over-estimation).
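
The private/shared split described above can be sketched in a few lines of Python. This is a minimal illustration, not the script mentioned later in this post: the function names are hypothetical, and it assumes the usual smaps layout of one "Name: value kB" line per counter under each mapping. The sample text is made up for the example.

```python
import re

# Matches smaps counter lines such as "Private_Dirty:        16 kB".
FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB$")

# Made-up sample in the smaps layout (two mappings of one process).
SAMPLE = """\
08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
Size:                464 kB
Rss:                 424 kB
Shared_Clean:        424 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
080bc000-080c1000 rw-p 00073000 03:02 13130      /bin/bash
Size:                 20 kB
Rss:                  20 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:        16 kB
"""


def sum_smaps_fields(smaps_text):
    """Add up every kB-valued counter across all mappings in an smaps dump."""
    totals = {}
    for line in smaps_text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            name, kb = match.group(1), int(match.group(2))
            totals[name] = totals.get(name, 0) + kb
    return totals


def private_and_shared_kb(smaps_text):
    """Return (private RSS, shared RSS) in kB for one process."""
    totals = sum_smaps_fields(smaps_text)
    private = totals.get("Private_Clean", 0) + totals.get("Private_Dirty", 0)
    shared = totals.get("Shared_Clean", 0) + totals.get("Shared_Dirty", 0)
    return private, shared


if __name__ == "__main__":
    print(private_and_shared_kb(SAMPLE))  # (20, 424)
```

On a real system you would pass in the contents of /proc/<pid>/smaps for each process of interest and add up the private figures. Reading another user's smaps file normally requires root.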

Recent kernels have added some extra stats which can help give a good idea of the amount of memory used. Pss is a value that can be found in smaps with kernels greater than 2.6.24. It contains the private RSS plus the amount of shared memory divided by the number of processes that are using it, which (when added to the Pss of the other processes sharing the same memory) gives you a closer idea of the amount of memory really being used - but again this is not completely accurate, because many programs may all be using different shared memory. In very recent kernels the 'pagemap' file allows a program to examine all the pages allocated and get a very fine-grained look at what is allocated by what. This is also useful for determining what is in swap, which would otherwise be impossible to find out.
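
To make the Pss arithmetic concrete, here is a small worked example. It is a simplification: the kernel does this accounting page by page, while the hypothetical pss_kb function below treats each shared region as a single block. The worker sizes are invented numbers.

```python
def pss_kb(private_kb, shared_regions):
    """Private memory counts in full; each shared region is split evenly
    among the processes mapping it.

    shared_regions is a list of (size_kb, number_of_sharers) pairs.
    """
    return private_kb + sum(size_kb / sharers for size_kb, sharers in shared_regions)


# Three hypothetical workers: 10240 kB private each, all mapping
# the same 30720 kB of shared memory.
workers = [pss_kb(10240, [(30720, 3)]) for _ in range(3)]

# Naively summing RSS counts the shared region three times.
rss_total = 3 * (10240 + 30720)

if __name__ == "__main__":
    print(sum(workers))  # 61440.0
    print(rss_total)     # 122880
```

Summing Pss gives 61440 kB, which is the 30720 kB of private memory plus the shared 30720 kB counted once. Summing RSS reports double that.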

Based on all this information, one thing should be clear: without a modern kernel, you will never really know how much memory your processes are using! However, guesstimation can be very accurate, so we will try our best with what we have. I have created a Perl script which can give you a rough idea of memory usage based on smaps information. You can pass it a program name and it will try to summarize the memory of all processes based on that program name. Alternatively, you can pass it pid numbers and it will give stats on each pid and a summary. For example, to check the memory used by all apache processes, run "meminfo.pl httpd". To check the memory of all processes on the system, run "ps ax | awk '{print $1}' | xargs meminfo.pl".

Some guidelines for looking at memory usage:

* Ignore the free memory, buffers and cache settings. 99% of the time this will not apply to you and it is misleading. The buffer and cache may be reclaimed at any time if the system is running out of resources, and free memory is completely pointless - this relates to pages of physical memory not yet allocated, and realistically your RAM should be mostly allocated most of the time (for buffers, cache, etc. as well as miscellaneous kernel and userland memory).

* If you don't have Pss or pagemap in your kernel, a rough guess of used memory can be had by either adding up the Private RSS of every process or subtracting the Shared RSS from the RSS total for every process. This still doesn't account for kernel memory which is actually in use, among other factors, but it's a good start.

* Do not make the mistake of confusing swap use with 'running out of memory'. Swap is used by the kernel even if you have tons of free memory. The kernel tries to intelligently move inactive memory to swap to keep a balance between responsiveness and speed of memory allocation/buffer+cache reclamation. Basically, it's better to have lots of memory you don't use in swap than in physical RAM. You can tune your VM to swap less or more but it depends on your application.
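
To see why the 'free' figure alone says so little, the sketch below pulls a few counters out of /proc/meminfo-style text and lumps free, buffer and cache pages together. The function names and sample numbers are made up for this example. It is deliberately rough: not everything counted under Cached can actually be reclaimed, and kernel memory is not broken out at all.

```python
# Made-up sample in the /proc/meminfo layout.
SAMPLE = """\
MemTotal:        2048000 kB
MemFree:          102400 kB
Buffers:           51200 kB
Cached:           819200 kB
SwapTotal:       1048576 kB
SwapFree:        1000000 kB
"""


def parse_meminfo(text):
    """Turn 'Name:   value kB' lines into a {name: value} dictionary."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def summarize(info):
    """Rough split of RAM into 'could be handed out' and 'in use'."""
    reclaimable = info["MemFree"] + info["Buffers"] + info["Cached"]
    return {
        "reclaimable_kb": reclaimable,
        "in_use_kb": info["MemTotal"] - reclaimable,
        "swap_used_kb": info["SwapTotal"] - info["SwapFree"],
    }


if __name__ == "__main__":
    for key, value in sorted(summarize(parse_meminfo(SAMPLE)).items()):
        print("%s: %d" % (key, value))
```

In this sample only 102400 kB is 'free', yet roughly 972800 kB could be given to applications, and the 48576 kB sitting in swap says nothing about memory pressure. To try it on a live system, pass in the contents of /proc/meminfo instead of SAMPLE. The knob for swapping less or more is /proc/sys/vm/swappiness.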
