
Analysing Memory Leaks with BTrace

Posted by ice09 on July 12, 2009

After experimenting with Eclipse Galileo and the now-included Memory Analyzer in the last post, I came across BTrace, which fits very nicely into the tool chain for analyzing memory leaks and/or performance problems in applications.

Continuing the analysis from the last post: having identified the object responsible for the memory leak (leakingSet), it would be interesting to find out where the objects are inserted.

Preparation

  1. Download visualvm
  2. Install the BTrace plugin as described here
  3. Extend the object insertion with a Thread.sleep (I used 100ms); a sketch of the sample application follows this list
  4. Start the sample application
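
For reference, here is a minimal sketch of what the sample application might look like. The names (com.commons.Tester, Scopetest, uniqueCode, leakingSet, the bean name scopetest and the config file name) are assumptions reconstructed from the last post and the stack traces below, not the original code:

package com.commons;

import java.util.HashSet;
import java.util.Set;

import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Tester {

    // objects keep accumulating here - the "leakingSet" from the last post
    private final Set<Integer> leakingSet = new HashSet<Integer>();

    public static void main(String[] args) throws Exception {
        new Tester().start();
    }

    public void start() throws Exception {
        // "context.xml" is an assumed name; it is expected to define "scopetest" as a
        // prototype-scoped bean of type Scopetest behind a CGLIB scoped proxy
        ClassPathXmlApplicationContext ctx = new ClassPathXmlApplicationContext("context.xml");
        Scopetest scopetest = (Scopetest) ctx.getBean("scopetest");
        while (true) {
            // each call goes through the scoped proxy to a fresh prototype instance,
            // so uniqueCode() keeps producing new values and the set keeps growing
            leakingSet.add(scopetest.uniqueCode());
            Thread.sleep(100); // step 3 above: slow the loop down for tracing
        }
    }
}

// the prototype-scoped bean (sketch); CGLIB proxying needs a non-final class and methods
class Scopetest {
    public int uniqueCode() {
        return System.identityHashCode(this); // illustrative; any instance-specific value
    }
}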

Usage

  1. Start visualvm and search for the started application PID
  2. Start tracing…
  3. Install the following script
import com.sun.btrace.annotations.*;
import com.sun.btrace.AnyType;
import static com.sun.btrace.BTraceUtils.*;

@BTrace public class TraceLeakingSet {
    // probe every add() on java.util.Set and all of its subclasses (the "+" prefix)
    @OnMethod(
        clazz="+java.util.Set",
        method="add"
    )
    public static void method(AnyType[] arg) {
        // print the probed class, method and the argument passed to add()
        print(strcat("entered ", name(probeClass())));
        print(strcat(".", probeMethod()));
        println(strcat(" with argument ", str(arg[1])));
        // dump the current call stack to see who triggered the add
        jstack();
        println();
    }
}

The script logs every call to the method “add” on “java.util.Set” and all of its subclasses (that is what the “+” prefix means). The argument passed to add is available in the AnyType array (accessed here as arg[1]).
probeClass() and probeMethod() return the probed class and method, and jstack() prints the current stack trace. The stack trace, combined with the class, method and argument, is the most useful part of the output.

Sample output is as follows:

BTrace code successfuly deployed
===========================================================
entered java.util.HashSet.add with argument scopedTarget.scopetest
java.util.Collections$SynchronizedCollection.add(Collections.java:1577)
org.springframework.beans.factory.support.AbstractBeanFactory.markBeanAsCreated(AbstractBeanFactory.java:1257)
org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:256)
org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:175)
org.springframework.aop.target.SimpleBeanTargetSource.getTarget(SimpleBeanTargetSource.java:33)
org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.getTarget(Cglib2AopProxy.java:657)
org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:608)
com.commons.Scopetest$$EnhancerByCGLIB$$3824588a.uniqueCode()
com.commons.Tester.start(Tester.java:25)
com.commons.Tester.main(Tester.java:14)

entered java.util.HashSet.add with argument -841086979
com.commons.Tester.start(Tester.java:25)
com.commons.Tester.main(Tester.java:14)

So there are two alternating locations where add is called on Set and its subclasses: one is inside Spring itself (interesting, but of no concern for the problem), and the other is, of course, the line in the start method where the object is added. This is the location we are looking for; the remaining question is why so many objects with different arguments are added (instead of always the same argument, as would be the case with singleton rather than prototype scope).

So, BTrace is very easy to use and very useful for tracking down problems in running systems, with very low overhead and low intrusion (no change to the deployed code is necessary).

9 Responses to “Analysing Memory Leaks with BTrace”

  1. Very cool! 🙂

  2. williamlouth said

    This approach is not at all scalable if “running system” means there is some actual work being performed other than a single click through.

    Do you know the cost (overhead) of a stack trace dump? It is huge in comparison to the cost of a set.add(..), not to mention all the string concatenation of the class and method names with a stringified version of the argument, which can have untold consequences.

    Do this in any enterprise setting with millions of add calls per second (prior to the deployment of this script) and you can kiss goodbye to your response times, throughput and disk space (if anyone even let a developer with such tooling close enough to a production system).

    There are much less painful ways of doing this.

    Of course in a development context you can do just about anything you wish.

    • williamlouth said

      By the way, most heap profilers give you aggregation by allocation call site (call stacks). You would not even have to move outside of the tool that found the initial suspect map, as the path would reveal it immediately unless the code itself was convoluted.

      • ice09 said

        Yes, for the given example, which is oversimplified as I stated, you are right.
        However, I can think of problems which are more difficult to track and BTrace is a nice, simple and easy way of starting the problem search.
        There are tools (profilers) out there which give much more information and are better suited for complex problem analysis; however, they have to be “installed” beforehand, slow down the system constantly and simply do not always fit.
        For me, BTrace is just a fine little extension of my toolbox.

      • Jaroslav Bachorik said

        But you need to enable allocation tracking for the whole application, don’t you? AFAIK there are no allocation stack traces in heap dumps obtained from Sun’s JVM (up to the latest update of JDK6). This information is supposed to appear in one of the upcoming updates, though. Then all you would need is a heap-analyzing tool supporting this additional info.
        Also, you can lower the overhead of the BTrace script by not tracking each single call to the “add” method – you can narrow down tracing to the calls to “add” initiated directly from certain classes/methods.
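
        For example, a rough (untested) sketch of such a narrowed script using a call-site probe; the caller class com.commons.Tester and its start method are taken from the stack trace in the post, and it assumes the set is declared as java.util.Set at the call site:

        import com.sun.btrace.annotations.*;
        import static com.sun.btrace.BTraceUtils.*;

        @BTrace public class TraceLeakingSetNarrow {
            // only probe call sites inside Tester.start that invoke Set.add,
            // instead of instrumenting every Set implementation in the JVM
            @OnMethod(
                clazz="com.commons.Tester",
                method="start",
                location=@Location(value=Kind.CALL, clazz="java.util.Set", method="add")
            )
            public static void onAddCall() {
                print(strcat("Set.add called from ", name(probeClass())));
                println(strcat(".", probeMethod()));
            }
        }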

    • ice09 said

      Sure, you would not run such a script in a production system; most companies would not let you, and you should not try to.
      I am not saying it comes at no cost and has no overhead or intrusion; depending on what you instrument, you can easily bring a system almost to a halt. But I am pretty sure people will evaluate this correctly.
      In a performance test scenario, for example, this is a nice approach to track down a problem without having to change the code.
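
      As a rough, untested sketch of one lower-overhead variant (assuming the common BTrace counter/timer idiom; not from the post): count the add calls and report periodically instead of printing a stack trace on every call:

      import com.sun.btrace.annotations.*;
      import static com.sun.btrace.BTraceUtils.*;

      @BTrace public class CountSetAdds {
          // running counter, updated from the probe handler
          private static long addCount;

          @OnMethod(
              clazz="+java.util.Set",
              method="add"
          )
          public static void onAdd() {
              addCount++;
          }

          // report every 5 seconds instead of printing (and walking the stack) on every call
          @OnTimer(5000)
          public static void report() {
              println(strcat("Set.add calls so far: ", str(addCount)));
          }
      }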

  3. williamlouth said

    Why enable object allocation tracking for the whole JVM unless one is only using JVMPI?

    JVMTI, which removed object allocation event notification, requires vendors to instrument the constructors themselves. So again, why instrument everything when you can narrow this down to the particular instances you want tracked? I know one product that allows this and I doubt others will not follow, unashamedly.

    By the way I would approach this from a caller perspective (who calls Set.add(..)) rather than instrument all Set implementation classes. This allows me to narrow this down to those methods called directly from my own application code base.

    The overhead of the above (print-out) would be HUGE. There is no debating that. Here is how this can be achieved without the HUGE overhead of the BTrace script: simply replace the “hibernate.add.count” meter with the “set.add.count” meter, which we provide as a set of extensions for Collections metering.

    http://williamlouth.wordpress.com/2009/02/24/efficient-runtime-analysis-of-hibernate/

    I will try to get out a blog showing this for those that cannot see the method (approach) from the detail in the entry.

    William

    • Jaroslav Bachorik said

      From the example given I cannot see how to get the call traces (which create the biggest overhead in the BTrace approach) for the Set.add() calls. How can I get the information about who initiated the operation that resulted in Set.add() being called 1000x?

      • williamlouth said

        There are two ways of doing this: the new approach (cost-center metering) and the old-style way (probe tracking).

        The new approach (which you missed in the entry) scales in production: it shows you which packages, classes, and methods directly (inherent total) or indirectly (total) result in a Set.add(…) call being made, across threads, processes, hosts and clusters, with maximum values and variances recorded as well. Most of the time one is only concerned with what is called from one's own code base, which would be satisfied via the inherent total analysis.

        Then there is the tracking probes extension for those that like to click, click, click, click…….

        http://www.jinspired.com/products/jxinsight/new-in-5.7.html
        “Trackings can be used to Track (trace) back the usage of resources to one or more metered calling activities (probes). Like most other elements of the resource metering model Trackings support aggregation across Metering Contexts (hosts, processes and threads).”

        You can see the use of tracking in this blog entry:
        http://williamlouth.wordpress.com/2009/01/27/abc-for-cloud-computing/

        Alternatively you can use our charges extension and custom probes, which map meters to the actual metering activity representing the operation (Set.add) itself.

        Search for: “Custom Resource Meters” and “Charges”
        http://www.jinspired.com/products/jxinsight/new-in-5.7.html

        We have various approaches built on a bare-metal runtime, each with varying runtime collection and overhead characteristics. But at least you have the choice of going super fast or just fast. 😉
