Exploring Java Heap Dumps Ryan Cuprak
Java Heap Review • Java objects are stored in the heap • All objects are globally reachable in the heap • Heap is created when an application starts • Size of heap is configured using -Xms (initial size) and -Xmx (maximum size) • Garbage collection prunes the heap and removes objects no longer reachable • Stack memory - variable values are stored when their methods are invoked • Heap contains everything and can be DUMPED to DISK
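To see the effect of the heap size flags from inside a running JVM, the standard `Runtime` API can be queried. This is a minimal sketch; the class name `HeapSizeReport` is hypothetical and not part of the talk.

```java
final class HeapSizeReport {
    /** Maximum heap the JVM will attempt to use, in bytes (bounded by -Xmx when it is set). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently in use: the committed heap minus its free portion. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("Max heap (MB):  " + maxHeapBytes() / mb);
        System.out.println("Used heap (MB): " + usedHeapBytes() / mb);
    }
}
```

Running it with different `-Xmx` values (for example `java -Xmx256m HeapSizeReport.java`) should change the reported maximum.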
Why Analyze Heaps? • IT reports Java EE/Spring server memory footprint has grown to 9 gigs • Server app logs contain OutOfMemoryError entries • Connections to queueing or database are exhausted • Serialized Java objects in queue are unreasonably large • Desktop application becomes unresponsive • Excessive amount of garbage collection
Java Heap Analysis MAT Profiler All you need is a profiler, right?
Memory Leaks Textbook memory leaks - easy to find and fix. JConsole
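A textbook leak is usually a long-lived collection that only ever grows. The sketch below is illustrative only; `TextbookLeak` and `handleRequest` are hypothetical names, not code from the talk.

```java
import java.util.ArrayList;
import java.util.List;

final class TextbookLeak {
    // A static collection stays reachable for the life of the class,
    // so nothing added to it can be garbage collected.
    private static final List<byte[]> RETAINED = new ArrayList<>();

    /** Simulates a unit of work that "temporarily" keeps a buffer. */
    static void handleRequest() {
        byte[] buffer = new byte[1024];
        RETAINED.add(buffer); // bug: the buffer is never removed
    }

    static int retainedCount() {
        return RETAINED.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            handleRequest();
        }
        System.out.println("Buffers still reachable: " + retainedCount());
    }
}
```

A profiler shows this kind of leak quickly because one class (`byte[]` here) dominates the heap and keeps growing.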
Production Heap Dumps • 9 gigs of data • 13k classes loaded • ~ 136 million instances • ~6,000 GC roots
Production Heap Dumps Capture the Heap Dump and…
Production Heap Dumps
Heap Dump Panic Too much data! • Impossible to comprehend • No human way to explore the data • Application data model is too complicated
Real Memory Leaks Bank Account: 1231209 Owner Bob Owner Julie Report January 2018 Bank Account: 1231210 Bank Account: 1231209 Report January 2018 Owner Bob Challenge: Data looks good everywhere…
Real Memory Leaks Causes: • Faulty clone methods • Duplicate singletons • Accidentally cached data • Cache logic bugs Complications: • May NOT GROW over time (leaks get cleaned up) • More than one non-trivial memory leak
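One way a cache logic bug can produce duplicate data that "looks good everywhere" is a cache key without `equals`/`hashCode`. The following sketch is an assumption-laden illustration: `ReportCacheLeak` and `ReportKey` are hypothetical and only loosely modeled on the bank account example.

```java
import java.util.HashMap;
import java.util.Map;

final class ReportCacheLeak {
    /** Hypothetical cache key. It does NOT override equals/hashCode, which is the bug. */
    static final class ReportKey {
        final long accountId;
        final String month;

        ReportKey(long accountId, String month) {
            this.accountId = accountId;
            this.month = month;
        }
    }

    private final Map<ReportKey, String> cache = new HashMap<>();

    /** Returns the report text; intended to reuse a cached copy but never does. */
    String report(long accountId, String month) {
        ReportKey key = new ReportKey(accountId, month);
        // Lookup falls back to object identity, so a fresh key never matches
        // an existing entry and every call adds a duplicate.
        return cache.computeIfAbsent(key, k -> "Report " + k.month + " for " + k.accountId);
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        ReportCacheLeak reports = new ReportCacheLeak();
        reports.report(1231209L, "January 2018");
        reports.report(1231209L, "January 2018");
        System.out.println("Cache entries: " + reports.size());
    }
}
```

Every caller receives a correct report, so the defect is invisible from the outside; only the duplicated instances in the heap reveal it.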
What about OQL? • OQL • Object Query Language – used for querying heaps • SQL-like language • Supports JavaScript expressions • Supported in NetBeans and VisualVM • Downside • Poorly documented and hard to use • Easy to create runaway queries
Heap Analysis Solution
NetBeans Profiler • NetBeans is open source IDE/platform • Modular architecture • Clean code base Profiler GUI Profiler API
NetBeans Profiler API • Parses hprof files • Creates an object model representing the hprof file • Pages data in from disk • Simple API (master in about 10 minutes) • Independent of NetBeans • Can be extracted and used in any IDE – Plain old Java Talk is really about how to build a custom heap analysis tool: • To answer specific data model questions • With custom logic for your data model
Generating Heap Dumps
Generating Heap Dumps • Command line parameter: • -XX:+HeapDumpOnOutOfMemoryError • Command line: • jmap -dump:format=b,file=dump.hprof <pid> • jhsdb jmap --binaryheap --pid <pid> • jcmd <pid> GC.heap_dump <file name> • Ctrl-Break Command Line
Generating Heap Dumps Programmatic
Generating Heap Dumps JMX
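The programmatic and JMX routes both go through the HotSpot diagnostic MXBean on HotSpot-based JVMs. The sketch below shows one way to call it in-process; the `HeapDumper` class and its `dump` helper are hypothetical, and the code on the original slides may differ.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class HeapDumper {
    /**
     * Writes a heap dump of the current JVM to the given file.
     * liveOnly = true dumps only objects that are still reachable.
     * Recent JDKs require the file name to end in ".hprof".
     */
    static Path dump(Path target, boolean liveOnly) {
        try {
            // dumpHeap refuses to overwrite, so remove any earlier dump first.
            Files.deleteIfExists(target);
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(args.length > 0 ? args[0] : "dump.hprof");
        System.out.println("Heap dump written to " + dump(file, true).toAbsolutePath());
    }
}
```

The same MXBean is registered under the name `com.sun.management:type=HotSpotDiagnostic`, so a remote JMX client can invoke `dumpHeap` on a server without code changes.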
Heap Dump Warning Dumping the heap: • Takes time • Consumes disk space • Negatively affects performance
Targeted Heap Dumps • Serialize object graphs from application to a file. • Read the serialized data into another tool and then programmatically create a heap dump.
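The first half of a targeted dump, writing one object graph to a file and reading it back in a separate tool, can be sketched with standard Java serialization. `GraphSnapshot` is a hypothetical helper; the talk does not show its own implementation here, and the graph must consist of `Serializable` classes for this to work.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;

final class GraphSnapshot {
    /** Serializes the object graph reachable from root into the given file. */
    static void save(Serializable root, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a previously saved graph; the classes must be on the reader's classpath. */
    static Object load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class missing from classpath", e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get("graph.ser");
        save(new ArrayList<>(Arrays.asList("Bob", "Julie")), file);
        System.out.println("Restored: " + load(file));
    }
}
```

The reading tool can then trigger a heap dump of its own, much smaller JVM, which yields a dump containing little besides the graph of interest.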
Building a Profiler
Building Custom Profiler Create NetBeans Platform App Copy API src out of NetBeans
NetBeans Platform App
NetBeans Platform App Add dependency on “Java Profiler (JFluid)”
Profiler Sources Check out source: https://github.com/apache/incubator-netbeans.git Profiler code: netbeans/profiler/lib.profiler/src/netbeans/lib.profiler/heap Copy heap directory
Which Approach? • Copying sources is easiest • Most analysis apps are command line (one-offs) - Note - You don’t need the classpath of the application from which the heap was generated.
NetBeans Profiler API
Opening a Heap That’s All!
Heap Object Methods getJavaClassByName(String fqn) : JavaClass getAllClasses() : List getBiggestObjectsByRetainedSize(int number) : List getGCRoots() : GCRoot getInstanceByID(long instanceId) : Instance getJavaClassByID(long javaclassId) : JavaClass getJavaClassesByRegExp(String regexp) : Collection getSummary() : HeapSummary getSystemProperties() : Properties
System Properties
Heap Summary • getTotalLiveInstances() : long • getTime() : long • getTotalAllocatedBytes() : long • getTotalAllocatedInstances() : long • getTotalLiveBytes() : long
Exploration Starting Points • GCRoots • Threads (really GCRoots) • Class Types
GC Roots • Garbage Collection Root is an object that is accessible from outside the heap. • Objects that aren’t accessible from a GC Root are garbage collected • GC root categorization: • Class loaded by system class loader • Thread • Stack Local • Monitor • JNI Reference • Held by JVM
Garbage Collection Roots Java frame: 44 thread object: 5 JNI global: 29 sticky class: 1284
GCRoot Objects
GC Roots JNI_GLOBAL = "JNI global"; JNI_LOCAL = "JNI local"; JAVA_FRAME = "Java frame"; NATIVE_STACK = "native stack"; STICKY_CLASS = "sticky class"; THREAD_BLOCK = "thread block"; MONITOR_USED = "monitor used"; THREAD_OBJECT = "thread object"; UNKNOWN = "unknown"; INTERNED_STRING = "interned string"; FINALIZING = "finalizing"; DEBUGGER = "debugger"; REFERENCE_CLEANUP = "reference cleanup"; VM_INTERNAL = "VM internal"; JNI_MONITOR = "JNI monitor"; root.getKind() : String
Finding Classes • Can perform lookup using: • Fully qualified class name (ex. java.lang.String) • Class ID • Instance ID • IDs are unique to heap dump • Hash codes are not available!
Profiler Data Model JavaClass Instance B Value Value Instance A Value Value
Class java.lang.String java.util.List
Instances From an instance: • Who references the instance • Who does the instance reference Perform instanceof to find out: • ObjectArray • PrimitiveArray GCRoot can take forever…
Values If you ask an instance for its references, you get a list of Value objects.
Example: Member Variables Iterates over all Person objects and prints member variables.
Example: Static Variables
Example: References
String Implementation
String Extraction Strings are objects – array of characters
LinkedList Implementation
LinkedList Implementation
LinkedList Extract
ArrayList
ArrayList
Thread Extraction
Noise: Ignore Internal Classes Ignore internal JVM classes
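Filtering out JVM classes can be as simple as a prefix check on the class name before any further analysis. The prefix list below is an assumption chosen for illustration, and `InternalClassFilter` is a hypothetical helper that works on plain class-name strings rather than on the profiler's own types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class InternalClassFilter {
    // Assumed set of "internal" package prefixes; adjust it to your own analysis.
    private static final List<String> INTERNAL_PREFIXES =
            Arrays.asList("java.", "javax.", "jdk.", "sun.", "com.sun.");

    /** True if the fully qualified class name starts with one of the internal prefixes. */
    static boolean isInternal(String className) {
        return INTERNAL_PREFIXES.stream().anyMatch(className::startsWith);
    }

    /** Keeps only the class names that are not considered internal, preserving order. */
    static List<String> applicationClasses(List<String> classNames) {
        return classNames.stream()
                .filter(name -> !isInternal(name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "java.util.HashMap", "com.example.Person", "sun.misc.Unsafe");
        System.out.println(applicationClasses(names));
    }
}
```

Applying the filter to class names first, before touching instances, avoids paging large amounts of irrelevant data in from disk.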
Puzzler Prints: Count: 231 • 230 entries are referenced by one other object • 1 entry is “owned” by 822 other objects
Demo App Exploration
Demo App Note: Used HashSets, Arrays[][], Lists, and Vectors
Demo App
Demo App
Demo App
String Utilization • 5159 Strings in heap dump • 15 associated with data model
String Utilization Output
Data Model Leak Add logic to fire an employee:
Data Model Leak Fired Adam – shouldn’t be in the system!
Data Model Leak Found!
Data Model Leak Leaking here!
Best Practices • Be mindful of your heap • Cache analysis on disk when processing large heaps • Heap processing is I/O bound • Not all profiler calls are the same • Look for Javadoc: Speed: normal • Maintain a list of processed objects • Easy to run in circles • Exclude JVM internal classes from analysis • Revisit graph algorithms!
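The advice to maintain a list of processed objects amounts to a graph traversal with a visited set. The sketch below is generic and does not use the profiler API: `ReachabilityWalker` is a hypothetical helper, and the `references` function stands in for whatever lookup returns the outgoing references of an object (for example, by instance ID).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

final class ReachabilityWalker {
    /** Breadth-first walk that processes each node once, even when the graph has cycles. */
    static <T> Set<T> reachableFrom(T start, Function<T, Collection<T>> references) {
        Set<T> visited = new LinkedHashSet<>();
        Deque<T> pending = new ArrayDeque<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            T current = pending.poll();
            if (!visited.add(current)) {
                continue; // already processed: this is what stops the walk running in circles
            }
            for (T next : references.apply(current)) {
                if (!visited.contains(next)) {
                    pending.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Instance IDs with a cycle: 1 -> 2 -> 3 -> 1
        Map<Long, List<Long>> graph = new HashMap<>();
        graph.put(1L, Arrays.asList(2L));
        graph.put(2L, Arrays.asList(3L));
        graph.put(3L, Arrays.asList(1L));
        Set<Long> seen = reachableFrom(1L,
                id -> graph.getOrDefault(id, Collections.<Long>emptyList()));
        System.out.println("Visited " + seen.size() + " instances: " + seen);
    }
}
```

Tracking IDs rather than object references keeps the visited set small, which matters when the dump holds millions of instances.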
Summary • Heap snapshot can be easily explored • Excellent way to verify application logic • Only way to identify deep data model/logic errors • Can be used to recover data • Generate a heap snapshot from a frozen/corrupted application and then mine
Q&A Twitter: @ctjava Email: rcuprak@gmail.com / r5k@3ds.com Blog: cuprak.info Linkedin: www.linkedin.com/in/rcuprak Slides: www.slideshare.net/rcuprak/presentations

Exploring Java Heap Dumps (Oracle Code One 2018)
