Friday, May 23, 2014

MJStack -> MJProf

Two months ago I started the MJStack project (see the previous blog post). The project continues to evolve (I am getting help from anatmaor@github), and as time passed I realised it was turning into a sampling profiler, so I decided to rename the project to mjprof.

So what is new?

First of all, the data source concept. MJStack collected its data from standard input only. MJProf can collect stack dumps by invoking jstack, via JMX (including remote JMX), from standard input, or directly from a file. Since mjprof is extensible via plugins, you can add additional data sources.

In addition, stack traces can now be merged over time (into profiles) based on thread id.
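For example, assuming the same shell-script entry point and monad syntax as mjstack (the script name and exact invocation here are my illustration, check the project README for the real ones), piping a live jstack dump through mjprof via standard input might look like this:

    jstack -l {your process id} | ./mjprof.sh contains/state,RUNNABLE/.group/name/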

This is a great leap forward. I would appreciate your feedback and contributions.
Stay tuned, and fork me on GitHub: https://github.com/lifey/mjprof


Saturday, March 1, 2014

MJStack- The Monadic JStack Analyzer

   When you come to debug a JVM in production (or in a performance lab), one of the available techniques is to grab a thread dump, using either JMX or the jstack command-line tool. Sometimes it is all you have, which is why it is called the 'poor man's profiler'. The JDK (1.6 and up) includes the jstack command-line tool. In order to use it you need to discover the relevant process id, either with ps (UNIX) / the task manager (Windows) or with jps, another command-line tool found in the JDK. Usually you will execute the following command:

jstack -l {your process id} > stacktrace.txt 

   This will create a file with stack traces for all the threads in the process. If you run your JVM on UNIX you need to make sure you are using the same user (or root). On Windows you also need to make sure you are running in the same session.
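   In practice the whole sequence might look like this (the pid and class name below are made up):

    jps -l
    4242 com.mycompany.MyServer
    jstack -l 4242 > stacktrace.txt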
   The stacktrace.txt file contains a short header and a per-thread section. For each thread we get some information: thread name, priority, thread id, native thread id, status, state, stack trace and "locked ownable synchronizers" (java.util.concurrent ReentrantLocks which the thread is holding).

Following is an example of a thread named ISCREAM3 which is currently sleeping.

"ISCREAM3" prio=5 tid=0x00007fd48a9b2800 nid=0x9203 waiting on condition [0x000000010ddec000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.rest(Sleeper.java:44)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.singleCycle(Sleeper.java:23)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.ExceptionThrower.singleCycle(ExceptionThrower.java:19)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:30)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:32)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:32)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:32)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:32)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:32)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.loopForever(Sleeper.java:37)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.run(Sleeper.java:51)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Recurse.doRecursion(Recurse.java:25)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Recurse.doRecursion(Recurse.java:25)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Recurse.singleCycle(Recurse.java:16)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:32)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.longerStackTrace(Sleeper.java:32)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.loopForever(Sleeper.java:37)
        at com.performizeit.jpo.ex.cpuprofile.threadClasses.Sleeper.run(Sleeper.java:51)
        at java.lang.Thread.run(Thread.java:744)

   Locked ownable synchronizers:
        - None
  
   This is really cool: you take a quick glance and you can see what the thread is doing. The problem starts when you come to analyze real applications. These applications are huge and may have hundreds of such threads (and from time to time even thousands) with long stack traces, so you will probably need to read through a file with tens of thousands of lines.

   That's where mjstack comes into the picture. It enables you to transform the thread dump into something that is useful to you. It does not have a fancy UI; as a matter of fact it has no user interface at all. The way to work with it is to compose several small steps that let you focus on what you really want. With this tool you can filter out threads, sort them, eliminate some of the data in them and group them by a certain attribute.

Each such step is called a monad and there are three kinds of monads:
  • Filters - filter out certain threads 
  • Mappers - change thread info
  • Terminals - perform a final transformation; subsequent monads cannot be applied after them.
   Sounds like Project Lambda? Well, that is not a coincidence. I was inspired by the concepts of functional programming: streams, closures and lambda expressions. You can take those monads and compose them as you wish, as the sketch below illustrates.
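   To make the analogy concrete, here is a hypothetical Java 8 streams pipeline over a live thread dump that expresses the same idea as contains/name,ISCREAM/.group/state/ (this is not mjstack's implementation, just the analogy):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

public class ThreadHistogram {
    public static void main(String[] args) {
        // Grab a dump of all live threads in this JVM via the ThreadMXBean
        ThreadInfo[] dump = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);

        // "Filter" monad: keep threads whose name contains ISCREAM
        // "Terminal" monad: group them into a histogram of thread states
        Map<Thread.State, Long> histogram = Arrays.stream(dump)
                .filter(t -> t.getThreadName().contains("ISCREAM"))
                .collect(Collectors.groupingBy(ThreadInfo::getThreadState,
                                               Collectors.counting()));

        System.out.println(histogram);
    }
}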

   The argument passed to mjstack is a list of monads separated by periods (.); each monad contains a list of arguments enclosed in two slashes (//) and separated by commas (,).

   For example, the monad contains/name,mythread/ will filter out any thread whose name does not contain the string "mythread". The terminal group/state/ will create a histogram of the different states. We can also combine them: contains/name,mythread/.group/state/ will build a histogram by state of only those threads whose name contains mythread.



Let's see some examples:

1. Focus on runnable threads:
    cat stacktrace.txt | ./mjs.sh contains/state,RUNNABLE/

2. Show only the names of the runnable threads:
    cat stacktrace.txt | ./mjs.sh contains/state,RUNNABLE/.group/name/

3. Focus on the threads which belong to your ISCREAM thread pool and group them by state:
    cat stacktrace.txt | ./mjs.sh contains/name,ISCREAM/.group/state/ 

4. Look for threads that actually run your application code:
    cat stacktrace.txt | ./mjs.sh contains/stack,mycompany/

5. Look for threads that actually run your application code and take only the 5 topmost stack frames:
    cat stacktrace.txt | ./mjs.sh contains/stack,mycompany/.keeptop/5/

And so much more! You have great power in your hands, since you can compose monads as you like.

If you have reached this point and are still interested, don't hesitate: the project is open source and you can download it from:  

Disclaimer: This is an initial version, so it may contain bugs and it has not been optimized yet. I would love to get feedback.

Enjoy !



Tuesday, September 24, 2013

JVM Params Everyone Should Have in Production

I know I have not published anything useful in the last few months. As a matter of fact, I have had this blog post almost ready for quite a long time. Unfortunately, I have a lot to say about procrastination, but I will leave it for tomorrow :) Anyway, after the JVM HotSpot sessions at JavaOne, after a direct question in the JClarity friends newsletter, and after suffering from severe jet lag, I have decided to finalise this post.


When talking about JVM configuration I find myself breaking the DRY principle, so it is about time to write a blog post on this topic. Following is a list of the must-have JVM params for (almost) every server installation:

(parameterised values are specified in curly braces)

-Xms{#MB}m -Xmx{#MB}m
-XX:PermSize={#MB}m -XX:MaxPermSize={#MB}m
-XX:+HeapDumpOnOutOfMemoryError
-XX:+PrintFlagsFinal
-server
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps 

-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles={#files}
-XX:GCLogFileSize={#MB}M
-Xloggc:{some gc log file}.gc 

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port={a port}
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false




Memory params (blue): You probably need to set the minimal and maximal heap size to suit your needs. In most cases ms=mx, but that is not always the case. IMO it is really important to get a heap dump whenever you encounter an OutOfMemoryError, especially if you want to be able to isolate fast "spike" memory-depletion bugs.

-XX:+PrintFlagsFinal - displays the configuration of the JVM: all available parameters with their default values or the overridden ones.
-server is not really needed; it is just a safety net to make sure you are not running the client compiler or something stupid like that :) 

GC logging configuration (green): If you want to be able to estimate GC efficiency, GC logging is the way. Starting from Java 7, log file rotation is supported; use it to keep at least two to four weeks of GC activity.

Remote JMX configuration (purple): Opening a JMX port for remote monitoring can help you debug and monitor your server with tools like VisualVM and JConsole. If security is an issue, you may also require a password and encrypt communications. 
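Putting it all together, an illustrative command line (the concrete sizes, file names and port below are placeholders, not recommendations) would look something like this:

    java -server -Xms2048m -Xmx2048m \
         -XX:PermSize=256m -XX:MaxPermSize=256m \
         -XX:+HeapDumpOnOutOfMemoryError \
         -XX:+PrintFlagsFinal \
         -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
         -XX:+PrintTenuringDistribution \
         -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime \
         -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M \
         -Xloggc:myapp.gc \
         -Dcom.sun.management.jmxremote \
         -Dcom.sun.management.jmxremote.port=7091 \
         -Dcom.sun.management.jmxremote.ssl=false \
         -Dcom.sun.management.jmxremote.authenticate=false \
         -jar myapp.jar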

A few more notes:
1. I am not getting into GC tuning params, as these vary from one application to another and tuning GC is not in the scope of this post. However, if you are using CMS I think that
-XX:+ExplicitGCInvokesConcurrent should be your default unless proven otherwise.
2. Java 8 has the new Metaspace instead of the PermGen space, so the perm gen params are no longer required.

What is your common set of params ?

Tuesday, July 16, 2013

Scalapeño 2013

Tomorrow I am going to give the session "Tales about Scala Performance" at the Scalapeño 2013 conference in Tel Aviv. See ya there :)



Saturday, December 17, 2011

A tale about a JVM crash


It happened again :) 
After a long time, it happened again...
I was called to investigate a JVM crash. Usually this is the sign of a long adventure alongside the Java platform, thrilling and frustrating at the same time. This time I was lucky: it took me only 4 hours to reach a valid solution.


Chapter I : First Encounter

First of all I looked at the directory from which the process had been started and at the hs_err_pidXXXXX.log files that existed there: about 50 such files, since a watchdog process relaunches the application immediately after each crash. In this case, just for the sake of crashing again.
I was happy to discover they accumulate very fast: the process needs merely 45 seconds to crash. Someone who is not familiar with these kinds of problems may be startled by the fact that the process crashes so fast, but for me that was heaven! 45 seconds to reproduction.
I started looking at the crash files. They were very disappointing: each one of them crashed on a different address, the crashing thread was applicative, and the stack trace was missing. I snooped around to make sure no native code was running in this process (no native code, good news).
All I had was a pale description of the JVM's frustration:  
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x0000000000000000, pid=4384, tid=8364
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode windows-amd64 compressed oops)
# Problematic frame:
# C  0x0000000000000000
#
# If you would like to submit a bug report, please visit:
#

---------------  T H R E A D  ---------------

Current thread (0x000000001029e000):  JavaThread "XXXXXXXXXXXXX" daemon [_thread_in_Java, id=8364, stack(0x000000001a840000,0x000000001a940000)]

siginfo: ExceptionCode=0xc0000005, ExceptionInformation=0x0000000000000008 0x0000000000000000

The fact that the crashing thread was an applicative thread indicated that the crash was not related to GC. I tried switching the GC mechanism anyhow, with no change in behavior. I also turned off compressed oops and the crash remained there, standing still.
My next guess was the JIT compiler, so as a first step I added the command line option -Djava.compiler=NONE. That guess worked like a charm: the JVM stopped crashing. That was still no reason for euphoria; running in interpreted mode is not a very good choice, so the next step was divide and conquer.

Chapter II: Lion hunting

I added the following flags to the JVM: -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation
When a HotSpot JVM runs with the LogCompilation flag it creates a hotspot.log file which includes detailed info about all of its JIT activity. For a detailed document about this log file take a look here.
Truth be told, I still don't fully understand the generated XML, but I was looking for the following lines:
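Roughly, a queued-compilation record in hotspot.log looks something like this (an illustrative line, not one taken from the actual crash):

    <task_queued compile_id='1234' method='java/util/Properties$LineReader readLine ()I' bytes='452' count='10000' iicount='10000' stamp='2.341'/>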


Compilation of a method is performed asynchronously by dedicated JVM threads, and every request for JIT compilation is queued beforehand. Grabbing all these lines gave me a full list of all compilation attempts made by the JVM. This list was then taken to a text editor and manipulated so it could be fed to the .hotspot_compiler file.

.hotspot_compiler is a JIT exclusion file which should be placed in the working directory from which the JVM is launched. It contains lines of the following format:
exclude java/util/Properties$LineReader readLine ()I

Telling the JVM:
"Dear JVM,
 Please do not compile the method readLine, which resides in the inner class LineReader of the class java.util.Properties. Thanks in advance,
      A desperate programmer"
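The trailing ()I is a JVM method descriptor: no arguments, returning int. A couple more lines in the same format, with made-up class names just to show the notation:

    exclude com/mycompany/Foo process (Ljava/lang/String;)V
    exclude com/mycompany/Bar average ([D)D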

I started with approx. 2000 such lines, fed them to the .hotspot_compiler file (after removing the java.compiler=NONE flag, of course), and the JVM did not crash. After the divide-and-conquer process I was left with a single method whose compilation made the JVM freak out.
Everybody is happy: crash eliminated, with minimal effect on performance.

Chapter III: Aftermath

Everybody is happy, except for me... a lot of questions remain unanswered.
The method which triggered the crash looks like this: 
protected boolean isXXXXXXXXX(double stdDev,double mean){
        //|stddev/mean| > 10^-5
        if(stdDev==0.0){
            return mean==0.0;
        }
        if(Double.isNaN(stdDev) || Double.isNaN(mean)){
            return true;
        }
        return Math.abs(mean/stdDev) < 100000.0;
    } 
What? That's a method which can cause the JVM to crash? I don't think so... 
My guess was that it is related to some kind of interaction.
I tried to reproduce the crash in an isolated test and, as expected, nothing.
I could not find anything on the internet, since I don't even know how to ask the right question.

What am I going to do next?
  • Try to upgrade the JDK minor version, though I could not find a related bug fix in the release notes.
  • Try JDK 1.7
  • Try a 32-bit JVM 
  • Try to disable certain optimizations, such as escape analysis 
  • Use -XX:+PrintOptoAssembly
But most important of all: hope for some input from the community...

Monday, November 14, 2011

Java 7 and String.intern()

For a long time, interning strings in the HotSpot VM would store the result in the permanent generation. Since Java 7 this is no longer true: interned strings are stored in the heap. As a result, if you have an application that relies on intern() you may need to re-size your perm gen and your heap; the heap needs to be larger and the perm gen can be smaller.
I understand that the permanent generation is going to be removed from the HotSpot VM in future releases of Java.
Maybe this is the first step in that direction :)
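As a quick reminder of what intern() actually does, here is a minimal illustration (nothing Java 7 specific in the code itself):

public class InternDemo {
    public static void main(String[] args) {
        String a = new String("hello");   // a fresh object on the heap
        String b = a.intern();            // the canonical copy from the string pool
        System.out.println(a == "hello"); // false - a is a distinct object
        System.out.println(b == "hello"); // true  - b is the pooled instance
        // Before Java 7 that pooled copy lived in the permanent generation;
        // from Java 7 it lives in the regular heap.
    }
}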
Reference
http://www.oracle.com/technetwork/java/javase/jdk7-relnotes-418459.html

Wednesday, April 27, 2011

Java 6u25 released

Two new operations in the ThreadMXBean (com.sun.management):
 - getThreadCpuTime - now accepts an array of thread ids
 - getThreadAllocatedBytes - allocated bytes on a per-thread basis 

http://www.oracle.com/technetwork/java/javase/6u25releasenotes-356444.html
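For example, the per-thread allocation counter can be read through the extended MBean in com.sun.management (a minimal sketch; the cast and method names are as I recall them from the 6u25 API, so double-check against the release notes):

import java.lang.management.ManagementFactory;

public class AllocatedBytesDemo {
    public static void main(String[] args) {
        // The extended interface lives in com.sun.management, not java.lang.management
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

        long selfId = Thread.currentThread().getId();
        byte[] junk = new byte[1000000];  // allocate something measurable
        System.out.println("CPU time (ns): " + bean.getThreadCpuTime(selfId));
        System.out.println("Allocated bytes so far: " + bean.getThreadAllocatedBytes(selfId));
    }
}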

- Tiered compilation is in...


HotSpot diagnostic information has been expanded in several ways:
  • Tracking of cumulative Java heap bytes allocated on a per-thread basis (the ThreadMXBean gained the getThreadAllocatedBytes operation)

  • On OutOfMemoryError, indication of the faulting thread in the heap dump
  • Improved handling of unexpected exceptions in application native code
  • Better indication of native heap exhaustion
  • More details in hs_err files