Monday, May 27, 2013

How to diagnose OutOfMemoryError in your JVM application?



Types of OOME

So here You are, running Your brand new Java application. Or maybe the day has finally come when Your application gets real user traffic. Your java process finally shows up in top and Your server is under some load. Everything is great and suddenly, bang: OutOfMemoryError. You restart Your application hoping it was a fluke, and sometimes that helps... for a day or so. So You search the web to find out how to give Java more memory and You double the JVM heap size. Sometimes that helps... for two days or so.
If You see OutOfMemoryError in Your logs, it can mean one of these (percentages are rough estimates based on my real-life experience):
  1. You have a memory leak and You need to diagnose it
    (10% of cases)
  2. Your application is not using memory the right way
    (89.9% of cases)
  3. Your application just needs more memory
    (0.1% of cases)
Some of You may be surprised that there is such a thing as a memory leak in a Java program. The GC handles most memory management issues, but it is not a mind reader. If You hold references to objects You no longer use, and those objects are still reachable from Your application's live references (the GC roots), the GC will not free their memory.
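A minimal sketch of such a leak, assuming a hypothetical LeakyCache class (the names are made up for illustration). Every buffer is used once and never needed again, but the static list keeps a reference to it, so the GC can never reclaim it:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
class LeakyCache {
    // A static collection stays reachable for as long as the class is loaded.
    private static final List<byte[]> PROCESSED = new ArrayList<>();

    static void handleRequest() {
        byte[] payload = new byte[1024]; // stands in for per-request data
        // ... the payload is used here and never needed again ...
        PROCESSED.add(payload); // the reference is kept, so the GC cannot free it
    }

    // Number of buffers that are still reachable through the static list.
    static int retained() {
        return PROCESSED.size();
    }

    // Dropping the references is what makes the buffers collectable again.
    static void clear() {
        PROCESSED.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        System.out.println("Buffers still reachable: " + retained());
    }
}
```

In a real application the list is usually a cache, a listener registry or a map keyed by session, which makes the leak much harder to spot than in this sketch.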
If Your process loads hundreds of thousands of records from the database into memory and then analyses them, Your application is not using memory the right way. When writing an application You must predict and control its memory usage, so that You don't end up with more objects than can fit in memory.
An example of such a mistake can be seen here:

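ResultSet rs = stmt.executeQuery(query);

// Every row becomes a task that waits in the executor's in-memory queue
while (rs.next()) {
    // getLong assumes a numeric "id" column
    executorService.submit(new MyTask(rs.getLong("id")));
}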

executorService.shutdown();
// ... waiting for finish

In most situations like this, executing a task takes more time than retrieving the next row. Until they are executed, the MyTask objects wait in the executorService's in-memory queue. Given a big enough result set, memory will run out and the error will be thrown.
Such errors are nasty, because they appear some time after the application launches. At first, when the application has little data, such result sets are small and there is no problem. The application lives on, new features are added, things are changed and suddenly an OutOfMemoryError appears and takes the application down. Such a memory usage bug may have been in the application from the beginning, and finding it can be troublesome, as most panicked people will target the new functionality, believing it is responsible for "the new bug".
Sometimes Your application just needs more memory. Java is fairly memory-hungry, and when a program runs many threads at the same time and each of them takes some memory, You need to provide more memory for the JVM.
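One possible way to keep that queue from growing without limit is to bound it and let the submitting thread do the work when it is full. This is only a sketch under assumptions: BoundedSubmit and newBoundedPool are hypothetical names, the counter stands in for MyTask, and the plain loop stands in for iterating over the result set.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, for illustration only.
class BoundedSubmit {
    // Builds a fixed-size pool whose queue holds at most queueCapacity waiting tasks.
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                // When the queue is full, the submitting thread runs the task itself.
                // That slows the producer down instead of piling up more tasks.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newBoundedPool(4, 100);
        AtomicInteger done = new AtomicInteger();
        for (int id = 0; id < 10_000; id++) {            // stands in for while (rs.next())
            pool.submit(() -> { done.incrementAndGet(); }); // stands in for new MyTask(id)
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("Tasks completed: " + done.get());
    }
}
```

With this setup at most 100 tasks wait in memory at any time, no matter how many rows the query returns. Whether running tasks on the submitting thread is acceptable depends on the application; blocking the producer or fetching rows in pages are other options.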

Facts about OOME

  • Any *Error, including OutOfMemoryError, can leave the JVM in an unstable state, and it may not behave normally until it is restarted.
  • The place an OutOfMemoryError is thrown from is typically a random one. It is thrown the first time an object cannot be allocated, which is not necessarily (and in fact rarely) in the class responsible for the memory problem.
  • You can tell that an OutOfMemoryError is approaching from the GC behaviour: it will desperately try to find at least some space to use, and the number of full collections will increase big time.
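The GC behaviour mentioned in the last point can be watched from inside the application through the standard java.lang.management beans. The sketch below is a minimal example, with GcWatch being a hypothetical class name; sampling these values periodically and watching the collection count climb while heap usage stays near the maximum is one way to see the problem coming. From the outside, jstat -gcutil with the PID gives a similar picture.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, for illustration only.
class GcWatch {
    // Total number of collections reported by all collectors since JVM start.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate total time spent in collections, in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Used heap as a fraction of the maximum heap, or -1 if the maximum is undefined.
    static double heapUsedRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getMax() > 0 ? (double) heap.getUsed() / heap.getMax() : -1.0;
    }

    public static void main(String[] args) {
        System.out.println("Collections so far: " + totalCollections());
        System.out.println("Time in GC (ms):    " + totalCollectionMillis());
        System.out.println("Heap used ratio:    " + heapUsedRatio());
    }
}
```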

How to fix OOME?

When facing an OutOfMemoryError, Your first task is to diagnose what fills the memory. Your first instinct will be to give Java more memory, but that is not likely to help in the long run, as heap size is rarely the real problem. In most cases it only buys You some time.
Fortunately, all the tools necessary to diagnose the problem come with the JDK and You should already have them in Your PATH (if not, You can find them next to the other Java binaries).
Here are the steps You should follow to diagnose the problem. What is also important: You can run this procedure (and normally will) after the OutOfMemoryError has occurred.
  1. You must be logged in as the owner of the process You want to diagnose.
  2. Get the PID of the process You want to diagnose. You can use Your OS monitoring tool for that, or just use jps:
    $ jps -l
    32513 sun.tools.jps.Jps
    9693  MyApp.jar
    
  3. Next You need to dump the heap to a file, using the PID of the process:
    $ jmap -dump:file=dump.map 9693
    Dumping heap to /tmp/dump.map ...
    Heap dump file created
    
  4. Such a dump isn't something You can read on Your own. Luckily someone came up with jhat, a web-based browser for the dump file.
    $ jhat -port 7401 dump.map
    Reading from dump.map...
    Dump file created Mon Nov 26 10:44:53 CET 2012
    Snapshot read, resolving...
    Resolving 9942216 objects...
    Chasing references, expect 1988 dots.................... [...]
    Eliminating duplicate references.................... [...]
    Snapshot resolved.
    Started HTTP server on port 7401
    Server is ready.
    
    Sometimes the default amount of memory jhat gets is too small for the dump, so You must give it more:
    jhat -port 7401 -J-mx4G dump.map
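Step 3 above takes the dump from the outside with jmap. On a HotSpot JVM the same kind of dump can also be triggered from inside the application through the HotSpotDiagnosticMXBean. The sketch below assumes a HotSpot-based JVM; HeapDumper and the temporary file location are hypothetical choices for illustration. The target file must not exist yet, and newer JDKs expect the name to end with .hprof.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only; works on HotSpot-based JVMs.
class HeapDumper {
    // Writes a heap dump to target; liveOnly = true dumps only reachable objects.
    static void dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // A fresh temporary directory guarantees the dump file does not exist yet.
        Path target = Files.createTempDirectory("heap").resolve("dump.hprof");
        dumpHeap(target, true);
        System.out.println("Heap dump written to " + target
                + " (" + Files.size(target) + " bytes)");
    }
}
```

If You would rather not touch the code, HotSpot can also write the dump by itself at the moment the error is thrown when started with -XX:+HeapDumpOnOutOfMemoryError.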
    
When You open http://yourhost:7401 in Your browser, You will see the jhat web interface. It takes some getting used to. Here is a tutorial that can help You out. Once You know what is what in jhat, You must find out:

  1. What objects eat up most of the memory
  2. What threads eat up most of the memory
  3. Sometimes: what class loaders eat up most of the memory (on application servers, this helps determine which application created the problem).
If the diagnosed threads consume the predicted amount of memory, then You're lucky: tuning the number of threads and the heap size will solve Your problem (You might also think about getting a new server to scale the workload horizontally).
In most cases You will need to find the lines of code that create new instances of the problematic class and check whether You are creating too many objects before they get used. Maybe some queue grows too quickly or some list is too big?
If none of this is true, You have a memory leak and You will need to analyze what is holding references to the objects eating too much memory. That is probably the hardest case to debug.

Photos courtesy of wikia: 1, 2.
