Sunday, June 2, 2013

Short jhat tutorial: diagnosing OutOfMemoryError by example

Last time we learned what can cause OutOfMemoryErrors and what tool-chain we can use to diagnose them. Today we will learn by example how to use the most important tool: jhat.
I've prepared a sample project for this exercise, which can be cloned from GitHub. The code is really simple and its problem is obvious, but this simplicity will make it easy to get to know jhat.
First, we need to run our program. We will use a small heap size, for two reasons:
  1. The program will throw the exception faster
  2. jhat will start more quickly, as the heap dump will be smaller
$ git clone https://github.com/petermodzelewski/snippets.git
$ cd snippets/OutOfMemoryErrorGenerator/
$ mvn package
$ java -Xmx128m -Xms128m -jar target/OutOfMemoryErrorGenerator-1.0.jar
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at pl.keyer.oome.generator.Sleepyhead(Sleepyhead.java:6)
        at pl.keyer.oome.generator.App.main(App.java:11)

Notice that the program is still running, so we will need another console to run jhat. We use the following commands:
$ jps -l
752 target/OutOfMemoryErrorGenerator-1.0.jar
4480 sun.tools.jps.Jps
$ jmap -dump:file=dump.map 752
$ jhat -port 7401 dump.map
Reading from dump.map...
Dump file created Sat Jun 01 23:25:55 CEST 2013
Snapshot read, resolving...
Resolving 561438 objects...
Chasing references, expect 112 dots................................................................................................................
Eliminating duplicate references................................................................................................................
Snapshot resolved.
Started HTTP server on port 7401
Server is ready.

Important notes about that process:
  • All commands must be executed by the same user: the owner of the java process
  • The "expect X dots" message is not a joke. While processing bigger heap dumps, one can check the number of dots printed so far (for example by pasting them into an editor) to see the progress, as processing such a file can take quite a while.
  • When processing bigger dumps, one must watch the heap size of jhat itself. This depends on the case, but to be safe (given enough resources) jhat should have 2-4 times more heap than the heap of the process it will diagnose. If the memory given to jhat is too small, it will simply crash after using it up, and the whole run will need to be repeated with more memory. For example, to give jhat 4 GB the command will be:
    $ jhat -port 7401 -J-mx4G dump.map
  • The diagnosed process may be terminated after dumping the heap with jmap.
  • Obviously jhat can be run on any machine where the dump is present. Developers often zip the dump and move the debugging to a machine that is more accessible to them and has enough RAM.
After executing the commands we can visit http://localhost:7401/

When facing jhat you will quickly realize that this tool comes from times when such tools were designed without consideration for prettiness or usability. This tutorial will show how to navigate it in most cases - all its features are cool and necessary, but the everyday programmer will use only a subset of them to quickly diagnose where the OOME came from.
The jhat main page can be divided into sections:

  1. List of all classes in your program (excluding platform classes - that is, showing everything that is not from the standard library). This list is normally really long and in most cases it is not needed. Normally you will scroll down to the "Other Queries" section right away.
  2. More options for listing classes
  3. The bread and butter of memory debugging; we will use these in a moment
  4. More tools for debugging, but not as helpful as section 3.
    • The heap histogram is sometimes useful to compare quantity vs size of objects
    • Once you become a jhat ninja, you can sometimes use OQL to diagnose the application. It is an SQL-like language for searching the heap and calculating its statistics.
First, let's see the heap histogram.


This view illustrates a very important fact: jhat does not compute "deep" object size. That's why on top of our process' memory consumption histogram we see class [B, which is really an array of bytes: byte[]. Primitive arrays are often on top of such a histogram and, what's more, in most cases that doesn't mean there is a problem. For example, if your program processes lots of strings, it will naturally hold lots of primitive arrays, as each String object references a backing array (char[], shown as [C, up to Java 8; byte[] since Java 9) that is counted separately from the String itself. A similar problem will show up when we follow "Show instance counts for all classes (including platform)".


That view is very similar to the histogram sorted by quantity. Normally during an OOME we will look for "our" classes. We need to exclude platform classes to easily detect abnormalities in the quantity (or size) of our classes. A good start is to follow "Show instance counts for all classes (excluding platform)".

Things to look for: unnaturally large numbers of objects of some class. In our example it is the number of workers. Our example illustrates a common problem: the producer creates tasks faster than the consumer handles them, and the producer-consumer queue does not block after a limit is reached.
Unfortunately, in most cases it is not that easy. Diagnosing that objects of some class are eating too much memory is one thing. Diagnosing why and where they are allocated is another. To do so, we need to track the objects' references back to something "active", for example a thread, a controller, etc. Given that "active" object we can then analyze the algorithm and find out why so many objects are created.
To illustrate such a process, let's track the references of class [B.



jhat lets you follow references for as long as you need. Unfortunately, when you click on a class "page" it will display all its instances. To dodge that time-consuming view, you can copy the link of the class page (for example from the histogram view) and construct the "references by type" view link from it. For example, the class [B page can have the following URL:
http://localhost:7401/class/0x3881d790
so the references-by-type summary will have the following URL:
http://localhost:7401/refsByType/0x3881d790
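
If you do this often, the rewrite is easy to script. A tiny sketch, assuming only the two URL shapes shown above; the class and method names are made up for this example and are not part of jhat:

```java
// Hypothetical helper: turns a jhat class page URL into the matching
// "references by type" URL by swapping the path segment.
final class JhatLinks {

    static String refsByTypeUrl(String classPageUrl) {
        String marker = "/class/";
        int idx = classPageUrl.lastIndexOf(marker);
        if (idx < 0) {
            throw new IllegalArgumentException("not a jhat class page URL: " + classPageUrl);
        }
        // Keep host and port, replace the segment, keep the object id.
        return classPageUrl.substring(0, idx)
                + "/refsByType/"
                + classPageUrl.substring(idx + marker.length());
    }

    public static void main(String[] args) {
        System.out.println(refsByTypeUrl("http://localhost:7401/class/0x3881d790"));
    }
}
```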

Those methods are normally all you need to detect the memory trouble in a process. Additionally, if you are working with an application container (like Tomcat) and have trouble seeing which application (or pool) is leaking the objects, you should look at the ClassLoader section of the class page:


That's it. That is all you need to know about jhat to start debugging your own memory problems. Hope it'll help.

One last tip: many developers, after solving their memory problem, take a heap dump of the new application version and run jhat on it - just in case. They are often terrified that, despite their efforts, the object count is still huge, yet somehow the OOME does not appear. Of course, they were so focused on fighting the memory leak that they forgot how the GC works. Remember: always trigger a GC (for example with VisualVM) to clean the old generation of unnecessary objects before taking the dump and analyzing it with jhat.
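
If the application can dump its own heap, the same effect can be had from code. The sketch below is only an illustration and assumes a HotSpot JVM: it uses com.sun.management.HotSpotDiagnosticMXBean, whose dumpHeap(file, live) with live set to true writes only objects that are still reachable. Unlike jmap, it dumps the JVM it runs in, not another process. The class name and the file name are made up for this example. As far as I know, jmap's -dump option accepts a similar "live" suboption.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper, HotSpot-specific: dumps the heap of the current JVM.
final class LiveHeapDump {

    /** Writes a dump containing reachable objects only and returns the file. */
    static File dumpLiveObjects(File target) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            // live = true: unreachable garbage is left out of the dump.
            // The target file must not exist yet.
            bean.dumpHeap(target.getAbsolutePath(), true);
        } catch (IOException e) {
            throw new IllegalStateException("heap dump failed", e);
        }
        return target;
    }

    public static void main(String[] args) {
        File target = new File(System.getProperty("java.io.tmpdir"),
                "live-" + System.nanoTime() + ".hprof");
        System.out.println("dump written to " + dumpLiveObjects(target));
    }
}
```

The resulting file can be opened with jhat in the same way as a dump taken with jmap.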

Monday, May 27, 2013

How to diagnose OutOfMemoryError in your JVM application?



Types of OOME

So here you are, running your brand new Java application. Or maybe the day has finally come when your application gained real user traffic. Your java process at last started to show up in top and your server gained some load. Everything was great and suddenly "bang!" - OutOfMemoryError. You restart your application hoping it was some strange one-off, and sometimes it helps... for a day or so. So you search the web to find out how to give java more memory and you double the JVM heap size. Sometimes it helps... for two days or so.
If you see OutOfMemoryError in your logs, it can mean one of these (percentage of cases based on my real-life experience):
  1. You have a memory leak and you need to diagnose it
    (10% of cases)
  2. Your application is not using the memory right
    (89.9% of cases)
  3. Your application just needs more memory
    (0.1% of cases)
Some of you may be surprised that there is such a thing as a memory leak in a Java program. We must understand that the GC handles most memory management issues, but it is not a telepath. If you are holding references to objects you are no longer using, and they are still reachable from your application's main reference tree - the GC will not free those objects' memory.
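
A minimal sketch of such a leak. ListenerRegistry is a hypothetical class invented for this illustration: a long-lived collection keeps every listener, and whatever the listener references, reachable until it is explicitly removed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only.
final class ListenerRegistry {
    // A long-lived collection: everything stored here stays reachable,
    // so the GC can never free it.
    private final List<Runnable> listeners = new ArrayList<>();

    void register(Runnable listener) { listeners.add(listener); }
    void unregister(Runnable listener) { listeners.remove(listener); }
    int size() { return listeners.size(); }

    /** Registers n listeners, optionally unregisters them, returns what is still held. */
    static int retainedAfter(int n, boolean unregister) {
        ListenerRegistry registry = new ListenerRegistry();
        for (int i = 0; i < n; i++) {
            byte[] payload = new byte[1024];      // data kept alive by the listener
            Runnable listener = () -> payload[0]++;
            registry.register(listener);
            listener.run();                       // the listener has done its job...
            if (unregister) {
                registry.unregister(listener);    // ...but only this makes it collectable
            }
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("forgotten listeners: " + retainedAfter(1000, false));
        System.out.println("after unregistering: " + retainedAfter(1000, true));
    }
}
```

In the first call all 1000 listeners, each with its 1 KB payload, stay reachable through the registry even though nobody will use them again.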
If your process downloads hundreds of thousands of records from the database into memory and then analyzes them - your application is not using the memory right. While writing an application you must predict and control the memory usage of the program, so you don't end up with a great number of objects in memory (which may not fit there).
An example of such a mistake can be observed here:

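// Anti-pattern: every row becomes a task that waits in the executor's
// in-memory queue until a worker thread picks it up.
ResultSet rs = stmt.executeQuery(query);

while (rs.next()) {
    // MyTask is this example's own task class; "id" is assumed to be numeric
    executorService.submit(new MyTask(rs.getLong("id")));
}

executorService.shutdown();
// ... waiting for finish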

In situations like that, executing a task usually takes more time than retrieving the next row. Until they are executed, MyTask objects wait in the executorService's in-memory queue. Given a big enough result set, memory will run out and the error will be thrown.
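
One possible way out, sketched under assumptions: give the executor a bounded queue and a CallerRunsPolicy, so that when the queue is full the submitting thread runs the task itself and stops reading new rows for a moment. The pool size of 4, the queue capacity and the class name are arbitrary choices for this illustration, and the plain loop stands in for iterating over the result set.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: at most queueCapacity tasks ever wait in memory.
final class BoundedExecutorSketch {

    /** Runs taskCount trivial tasks and returns how many were executed. */
    static int runAll(int taskCount, int queueCapacity) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                // Queue full: the producer thread runs the task itself,
                // which throttles it instead of growing the queue.
                new ThreadPoolExecutor.CallerRunsPolicy());

        AtomicInteger done = new AtomicInteger();
        for (int id = 0; id < taskCount; id++) {      // stands in for rs.next()
            executor.execute(() -> done.incrementAndGet());
        }

        executor.shutdown();
        try {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks executed: " + runAll(100_000, 100));
    }
}
```

The trade-off is that the producer slows down to the pace of the consumers, which is usually exactly what we want when the alternative is running out of heap.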
Such errors are nasty, because they appear some time after the application launches. At first, when the application has little data, such result sets are small and there is no problem. The application lives on, new features are added, things are changed and suddenly an OutOfMemoryError appears, bringing the application down. Such a memory usage bug could have been in the application from the beginning, and finding it can be troublesome, as most panicked people will target new functionality, believing it is responsible for "the new bug".
Sometimes your application just needs more memory. Java is kind of memory-hungry, and when a program runs many threads at the same time, each of them taking some memory - you need to provide more memory for the JVM.

Facts about OOME

  • Any *Error, including OutOfMemoryError, destabilizes the JVM, and it will not perform normally until it is restarted.
  • The place an OutOfMemoryError is thrown from is typically a random one. It is thrown the first time an object cannot be allocated, which is not necessarily (and in fact rarely) an object of the class responsible for the memory problem.
  • You can tell that an OutOfMemoryError is getting close from the GC's behaviour - it will desperately try to find at least some space to use. The number of full GC collections will go up big time.
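
The GC behaviour from the last point can be watched from inside the application. A small sketch using the standard java.lang.management API; the class name is made up, and which collectors get listed depends on the JVM and its settings:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads the JVM's own garbage collection counters.
final class GcActivity {

    /** Sum of collections reported by all collectors of this JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections: " + totalCollections());
    }
}
```

Sampling these numbers periodically and seeing the old generation collector's count and time climb quickly is a hint that the heap is nearly full.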

How to fix OOME?

When facing an OutOfMemoryError your first task is to diagnose what fills the memory. Your first instinct will be to give java more memory, but that is not likely to help in the long run, as the heap size is rarely the problem. In most cases it only buys you some time.
Fortunately all the tools necessary to diagnose the problem come with the JDK and you should already have them in your PATH (if not, you can find them next to the other java binaries)
Here are the steps you should follow to diagnose the problem. What is also important: you can run this procedure (and normally will) after the OutOfMemoryError has occurred.
  1. You must be logged in as the owner of the process you want to diagnose.
  2. Get the PID of the process you want to diagnose. You can use your OS monitoring tool for that, or just use jps
    $ jps -l
    32513 sun.tools.jps.Jps
    9693  MyApp.jar
    
  3. Next you need to dump the heap to a file using the PID of the process
    $ jmap -dump:file=dump.map 9693
    Dumping heap to /tmp/dump.map ...
    Heap dump file created
    
  4. Such a dump isn't good for reading on your own. Luckily, someone came up with jhat. It is a web-based browser for the dump file.
    $ jhat -port 7401 dump.map
    Reading from dump.map...
    Dump file created Mon Nov 26 10:44:53 CET 2012
    Snapshot read, resolving...
    Resolving 9942216 objects...
    Chasing references, expect 1988 dots........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
............................
    Eliminating duplicate references............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
........................
    Snapshot resolved.
    Started HTTP server on port 7401
    Server is ready.
    
    Sometimes the default amount of memory taken by jhat is too little, so you must provide it with more:
    $ jhat -port 7401 -J-mx4G dump.map
    
When you open http://yourhost:7401 you will see the jhat web interface. It takes some getting used to. Here is a tutorial that can help you out. When you know what-is-what in jhat, you must find out:

  1. What objects eat up most of the memory
  2. What threads eat up most of the memory
  3. Sometimes - what class loaders eat up most of the memory (when handling application servers to determine which application created the problem). 
If the diagnosed threads consume the predicted amount of memory - then you're lucky, and tuning the thread count and heap size will solve your problem (you might also think about getting a new server to scale the workload horizontally).
In most cases you will need to find the lines of code that produce new instances of the problematic class and check whether you are creating too many new objects before using them. Maybe some queue grows too quickly or some list is too big?
If none of this is true - a memory leak has occurred and you will need to analyze what is holding references to the objects eating too much memory. That is probably the hardest one to debug.

Photos courtesy of wikia: 1, 2.

Saturday, May 18, 2013

On language popularity in first quarter of 2013

Recently I found this great graph showing language popularity based on Stack Overflow and GitHub tags:

On the upper right-hand side we see the most popular languages: the front line. Most of them are really not a surprise: Java, PHP, C/C++, C#, Obj-C, Perl or Ruby. Those are the ones that have been in the spotlight for a while now. What's interesting is the strong position of JavaScript (man, that language grows) and Python with its second youth sponsored by indie games.
Not so far away (compared with older reports) is the second wave of computer languages. One could say that those are the newcomers still building their communities, but there are some old friends too. The unquestionable leader of the second wave is Scala, almost ready to join the mainstream (I've separated it into a one-dot set on the graph). The other new popular JVM languages are also there: Clojure and Groovy, although that could have been predicted, as those three have been gathering bigger and bigger communities all the time. It's about time for them to slowly replace Java in some applications.
It is interesting that the mentioned old friends - Prolog, Haskell and Lua - are there. Those three are really passing the test of time - always in the shadow of mainstream languages but never going down.
What really got me thinking was how far Rust was from D. Is it that C-family programmers are not so eager to try out new things?
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is most adaptable to change."
One thing is certain - new times require new tools, and more and more people realize that every day. Mainstream languages will be with us for a long time, but the faster we can adapt to the "second wave", the better our situation on the job market will be and, what's more important, the more exciting our everyday work will be. The second wave of languages is steadily on its way to join the first wave, or even to take its place.

Friday, May 17, 2013

Atmosphere 2013

Earlier this week I had the great pleasure of participating in the first edition of Atmosphere - an e-commerce industry conference. I must admit I went there full of doubts. Can the first edition of such a little-known event be a success, especially as it was a paid conference? Fortunately it can! When I arrived I saw the whole place buzzing with people talking about all the cool stuff.
It all started with a refreshing keynote from Brian McCallister. It was interesting to hear that all startups travel the same evolutionary path. The refreshing part was hearing that "this is OK": the technical debts we take on at the start of a project are caused by good intentions and important reasons. How to reconcile this general truth with the rules of software craftsmanship is something I have yet to discover.
"There are two types of startups. Those successful and a little ashamed of their code and those who're out of business"
Continuing the tale of startups, Paul Hammond spoke about the choices you can make so as not to run out of money during the first months of a project. That was a really great talk divided into a few simple lessons.
My colleagues from Toruń working on the Allegro Charity Platform (running for example wosp.allegro.pl) gave a great talk about the technology and architecture they used to handle the traffic of the latest WOSP final.
Speaking of handling Allegro traffic, our Poznań division did a great job sharing some of our experiences in that field, followed by a lecture on both the technologies and the methodologies we use to handle such high traffic with good site responsiveness.
There were plenty more interesting presentations, but those above are my personal favorites. I'm really looking forward to the publication of the videos so I can watch the ones I missed from other tracks.
Personally I had the honor of giving a talk about choosing the programming language for a project, with Java, PHP, Erlang and Perl as the main characters. I would like to thank all the attendees for being there, and I hope you had a good time taking part in this trip through the magical lands of programming languages.
The conference was a blast. On top of great speakers, delicious food and a wild party, all of the attendees left with their own Raspberry Pi as a gadget. I have some plans for how to use mine and I will share them with you if I succeed.
I'm looking forward to next year's edition. Hope to meet you there too!

Photos courtesy of Atmosphere Facebook page.

Thursday, May 16, 2013

Hello World!

It's been a while since I had a blog. Because many people at conferences asked me whether I had one, I've decided to start one. I believe I have some cool stuff to share. I hope you will find it interesting. Don't hesitate to leave comments to let me know how you like it.