Java Runtime: Everyday Duties of the JVM
Andrey Pangin, lead developer, Odnoklassniki
JVM Iceberg [figure]
Launcher
• java.exe – just a simple C program
  1. Locate JRE
  2. Select version
  3. Parse arguments
  4. JNI_CreateJavaVM
  5. JNIEnv::FindClass
  6. JNIEnv::GetStaticMethodID (main)
  7. JNIEnv::CallStaticVoidMethod
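Steps 5 to 7 have a close analogue in plain Java reflection. The sketch below is not the launcher's C code; it only mirrors the same three steps from inside a running VM. `MiniLauncher`, `HelloApp`, and `demo` are hypothetical names introduced for illustration.

```java
import java.lang.reflect.Method;

final class MiniLauncher {
    /** Hypothetical application used only by this demo. */
    public static class HelloApp {
        static String lastArgs;

        public static void main(String[] args) {
            lastArgs = String.join(",", args);
        }
    }

    /** Finds the main class, looks up main(String[]), and calls it. */
    static void launch(String mainClassName, String... args)
            throws ReflectiveOperationException {
        Class<?> mainClass = Class.forName(mainClassName);          // cf. JNIEnv::FindClass
        Method main = mainClass.getMethod("main", String[].class);  // cf. JNIEnv::GetStaticMethodID
        main.invoke(null, (Object) args);                           // cf. JNIEnv::CallStaticVoidMethod
    }

    /** Launches HelloApp with two arguments and reports what it received. */
    static String demo() {
        try {
            launch(HelloApp.class.getName(), "a", "b");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return HelloApp.lastArgs;
    }
}
```

The cast to `Object` in `main.invoke` keeps the whole `String[]` as a single argument instead of letting varargs spread it.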
Other launchers
• Differ only by Main Class
  – javac: com.sun.tools.javac.Main
  – javadoc: com.sun.tools.javadoc.Main
  – javah: com.sun.tools.javah.Main
  – jconsole: sun.tools.jconsole.JConsole
  – jmap: sun.tools.jmap.JMap
  – …
• Even -version calls Java
  – sun.misc.Version.print()
Class loading
• Class loading != Class initialization
• ClassLoader prepares byte[] definition
• Real loading is done inside VM
  – ClassLoader.defineClass0
  – Parses, verifies and creates internal VM structures
• Unreferenced classes may be unloaded
  – -XX:+CMSClassUnloadingEnabled
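The difference between loading and initialization can be observed from Java code. In this minimal sketch, `LoadVsInitDemo` and `Lazy` are hypothetical names. It relies on the JLS rule that evaluating a class literal does not initialize the class, while `Class.forName` with `initialize = true` does.

```java
final class LoadVsInitDemo {
    /** Set by Lazy's static initializer, so it tells us whether <clinit> has run. */
    static boolean initialized = false;

    static class Lazy {
        static {
            initialized = true;
        }
    }

    /** Returns { flag after obtaining the Class object, flag after forcing initialization }. */
    static boolean[] run() {
        Class<?> lazy = Lazy.class;            // Class object is available, <clinit> has not run
        boolean afterLoad = initialized;
        try {
            Class.forName(lazy.getName(), true, lazy.getClassLoader());  // forces initialization
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        boolean afterInit = initialized;
        return new boolean[] { afterLoad, afterInit };
    }
}
```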
Class Metadata
• Constant Pool, Interfaces, Methods, Fields
• Exceptions, Annotations, vtables, itables
• Java 8: PermGen → Metaspace
  – java.lang.OutOfMemoryError: PermGen space
• Field reordering
  – -XX:+CompactFields, -XX:FieldsAllocationStyle=1
Class data sharing
• Snapshot of commonly used classes
  – jre/lib/classlist
• Mapped into memory directly from disk
  – classes.jsa
• Shared between multiple JVM instances
• Improves start-up time, reduces footprint
  – -XX:+UseSharedSpaces
  – -XX:+DumpSharedSpaces
Bytecode verification
• Java 6 Split Verifier
  – Inferencing verifier (javac) + Type checking (run-time)
  – StackMapTable attribute in .class
• Skip verification
  – -XX:-BytecodeVerificationRemote
  – -XX:-BytecodeVerificationLocal
  – Inherit sun.reflect.MagicAccessorImpl
Class initialization
• When?
• 12-step procedure described in JLS §12.4
• Basically, the invocation of <clinit> (static initializer)
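A small experiment can make the "When?" question concrete. In the sketch below (`InitOrderDemo`, `Base`, and `Derived` are hypothetical names), reading a compile-time constant does not trigger `<clinit>`, while the first static method call does, with the superclass initialized first.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        static {
            LOG.add("Base.<clinit>");
        }
    }

    static class Derived extends Base {
        static final int CONSTANT = 42;   // compile-time constant, inlined by javac
        static int counter;

        static {
            LOG.add("Derived.<clinit>");
        }

        static int touch() {
            return ++counter;
        }
    }

    /** Meaningful on the first call only: a class is initialized once per class loader. */
    static List<String> run() {
        LOG.clear();
        LOG.add("read constant " + Derived.CONSTANT);  // no initialization
        Derived.touch();                               // initializes Base, then Derived
        LOG.add("called touch");
        return new ArrayList<>(LOG);
    }
}
```

On a first call, the log should read: `read constant 42`, `Base.<clinit>`, `Derived.<clinit>`, `called touch`.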
Bytecodes
• Stack machine
• 203 Java bytecodes
  – Stack manipulation: iconst, fconst, bipush, pop, dup, swap
  – Type conversion: i2l, f2i, l2d
  – Arithmetic / logic: iadd, isub, imul, ixor, ishr
  – Local variables: iload, aload, istore, fstore
  – Arrays: iaload, aaload, iastore, aastore, arraylength
  – Fields: getfield, putfield, getstatic, putstatic
  – Branches: ifeq, ifgt, goto, tableswitch, lookupswitch
  – Method calls: invokevirtual, invokeinterface, invokestatic, return
  – Allocation: new, anewarray, multianewarray
  – Other: monitorenter, monitorexit, instanceof, athrow
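To illustrate what "stack machine" means here, the following toy interpreter evaluates a handful of int instructions over an operand stack and a local variable array. It is a sketch only: the opcode numbers and the `TinyStackMachine` class are invented for this example and do not match real JVM encodings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TinyStackMachine {
    // Hypothetical opcodes, loosely modeled on iconst / iload / istore / iadd / imul / ireturn.
    static final int ICONST = 0, ILOAD = 1, ISTORE = 2, IADD = 3, IMUL = 4, IRETURN = 5;

    /** Executes code until IRETURN and returns the value on top of the operand stack. */
    static int run(int[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int op = code[pc++];
            switch (op) {
                case ICONST:  stack.push(code[pc++]); break;               // push immediate
                case ILOAD:   stack.push(locals[code[pc++]]); break;       // push local variable
                case ISTORE:  locals[code[pc++]] = stack.pop(); break;     // pop into local variable
                case IADD:    stack.push(stack.pop() + stack.pop()); break;
                case IMUL:    stack.push(stack.pop() * stack.pop()); break;
                case IRETURN: return stack.pop();
                default: throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
    }

    /** Computes (a + b) * 2 for locals a = 3, b = 4. */
    static int demo() {
        int[] code = { ILOAD, 0, ILOAD, 1, IADD, ICONST, 2, IMUL, IRETURN };
        return run(code, new int[] { 3, 4 });
    }
}
```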
JVM specific bytecodes
• fast_igetfield, fast_iputfield
  – -XX:+RewriteBytecodes
• fast_iload2, fast_aaccess_0
  – -XX:+RewriteFrequentPairs
• fast_binaryswitch
• return_register_finalizer
Bytecode interpreter
• C++ and Assembler (template) interpreter
• Generated at VM start-up
• Run-time code profiling
  – -XX:CompileThreshold=10000
• Run-time CP resolution and bytecode rewriting
• Optimizations
  – Dispatch table, top-of-stack caching
Stack
• Java (expression) vs. Native (execution)
  – -XX:ThreadStackSize=320
• Guard pages
  – -XX:StackShadowPages=20
  – -XX:StackYellowPages=2
  – -XX:StackRedPages=1
[figure: thread stack layout with regions labeled Empty, Shadow, and Frame, Frame, Frame]
Stack frame
[figure: frame layout, in the order shown on the slide]
  SP → Expression stack
       Monitors
       Method
       old SP
  FP → old FP
       Return addr
       Locals
       Expression stack (previous frame)
Frame types
• Interpreted, Compiled, Native
  – Different layout
• Inlining
  – 1 frame for multiple nested methods
• Deoptimization
  – When optimistic assumptions fail
  – (exceptions, class hierarchy changes etc.)
Threads
• Java threads & VM threads
• States: in_java, in_vm, in_native, blocked
• Thread pointer in register
  – Fast Thread.currentThread()
• Priority policy (0 – normal, 1 – aggressive)
  – -XX:ThreadPriorityPolicy=N
• TLAB allocation
  – -XX:+UseTLAB, -XX:TLABSize=0
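The slide only names TLAB allocation. As background, a thread-local allocation buffer is commonly described as a bump-the-pointer allocator that needs no synchronization because only its owning thread uses it. The sketch below models that idea with plain integers; `TlabSketch` and its fields are hypothetical and greatly simplified compared to the VM.

```java
final class TlabSketch {
    private final int end;   // first address past the buffer
    private int top;         // next free address

    TlabSketch(int start, int size) {
        this.top = start;
        this.end = start + size;
    }

    /**
     * Returns the address of the new block, or -1 if it does not fit.
     * On failure a real VM would take a slower path; this model just reports it.
     */
    int allocate(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        if (end - top < size) {
            return -1;
        }
        int result = top;
        top += size;         // the "bump": no locks, no CAS
        return result;
    }

    /** Allocates from a 64-byte buffer starting at address 1000. */
    static int[] demo() {
        TlabSketch tlab = new TlabSketch(1000, 64);
        return new int[] {
            tlab.allocate(16), tlab.allocate(16), tlab.allocate(40), tlab.allocate(32)
        };
    }
}
```

In `demo`, the third request fails because only 32 bytes remain, and the fourth request then fits exactly.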
Synchronization
• Simple uncontended lock – CAS
• Biased locking
  – -XX:+UseBiasedLocking
  – -XX:BiasedLockingStartupDelay=4000
  – -XX:BiasedLockingBulkRebiasThreshold=20
  – -XX:BiasedLockingBulkRevokeThreshold=40
• Contended lock optimizations
  – Spin → Yield → Park
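The CAS fast path and the Spin → Yield → Park escalation can be sketched at the Java level with `AtomicBoolean` and `LockSupport`. This is an illustration of the idea, not HotSpot's monitor implementation; `SpinYieldParkLock` and the `SPINS` / `YIELDS` limits are arbitrary assumptions, and the lock is neither reentrant nor fair.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

final class SpinYieldParkLock {
    private static final int SPINS = 100;   // arbitrary limits chosen for the sketch
    private static final int YIELDS = 10;

    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

    void lock() {
        // Uncontended case: the very first CAS succeeds. Otherwise spin briefly.
        for (int i = 0; i <= SPINS; i++) {
            if (locked.compareAndSet(false, true)) return;
        }
        // Still contended: give up the CPU a few times.
        for (int i = 0; i < YIELDS; i++) {
            Thread.yield();
            if (locked.compareAndSet(false, true)) return;
        }
        // Last resort: register as a waiter and park until an unlocker wakes us.
        Thread current = Thread.currentThread();
        waiters.add(current);
        while (!locked.compareAndSet(false, true)) {
            LockSupport.park(this);          // may return spuriously, hence the loop
        }
        waiters.remove(current);
    }

    void unlock() {
        locked.set(false);
        LockSupport.unpark(waiters.peek());  // unpark(null) is a no-op
    }

    /** Increments a shared counter from several threads and returns the final value. */
    static int demo(int threads, int incrementsPerThread) {
        SpinYieldParkLock lock = new SpinYieldParkLock();
        int[] counter = { 0 };
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

A waiter adds itself to the queue before its final CAS attempt, so an unlocker that runs afterwards always finds a thread to unpark.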
Java-level locks
• java.util.concurrent.locks.LockSupport
• park() / unpark()
• OS-level primitives
  – Mutex + Condition variable
wait / notify
• What's wrong with this code?

    void waitForCompletion() {
        synchronized (lock) {
            if (!completed) {
                lock.wait();
            }
        }
    }

    void setCompleted() {
        synchronized (lock) {
            completed = true;
            lock.notifyAll();
        }
    }

• -XX:+FilterSpuriousWakeups
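The flag on the slide points at spurious wakeups: `Object.wait()` may return without a matching notification, so the condition has to be re-checked in a loop. Below is a corrected, runnable sketch; the `Completion` class and its `demo` method are hypothetical wrappers around the slide's two methods.

```java
final class Completion {
    private final Object lock = new Object();
    private boolean completed = false;

    void waitForCompletion() throws InterruptedException {
        synchronized (lock) {
            while (!completed) {   // 'while', not 'if': re-check after every wakeup
                lock.wait();
            }
        }
    }

    void setCompleted() {
        synchronized (lock) {
            completed = true;
            lock.notifyAll();
        }
    }

    /** Starts a waiting thread, completes, and reports whether the waiter finished. */
    static boolean demo() {
        Completion completion = new Completion();
        Thread waiter = new Thread(() -> {
            try {
                completion.waitForCompletion();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        completion.setCompleted();
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !waiter.isAlive();
    }
}
```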
Object header
• Unlocked: unused | hashCode | 0 | age | 0 | 01
• Thin lock: displaced header ptr (points into the Stack) | 00
• Inflated lock: inflated lock ptr (points to the Monitor) | 10
• Biased lock: JavaThread | epoch | 0 | age | 1 | 01
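The tag bits in the layout above are enough to tell the lock states apart. The sketch below decodes only the lowest three bits of a header word, assuming (as the slide suggests) that the biased bit sits directly above the two-bit tag. `MarkWordTag` is a hypothetical helper and the sample values in the usage note are made up.

```java
final class MarkWordTag {
    /** Classifies a header word by its two tag bits and the biased bit above them. */
    static String describe(long markWord) {
        int tag = (int) (markWord & 0b11);
        boolean biased = (markWord & 0b100) != 0;
        switch (tag) {
            case 0b01: return biased ? "biased lock" : "unlocked";
            case 0b00: return "thin lock";
            case 0b10: return "inflated lock";
            default:   return "not shown on the slide";   // tag 11
        }
    }
}
```

For example, `describe(0x1L)` yields `unlocked`, `describe(0x5L)` yields `biased lock`, and a pointer-like value such as `0x7f001000L` (tag 00) yields `thin lock`.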
Safepoints
• When?
  – GC phases
  – Thread dump
  – Deoptimization
  – Revoke/rebias BiasedLock
• How?
  – Interpreted: switch dispatch table
  – Compiled: page polling
  – Native: on return and on JNI calls
• -XX:+PrintSafepointStatistics
Native methods
• System.loadLibrary()
• Lazy linking
• Expensive invocation
  1. Create stack frame
  2. Set up arguments according to native calling convention
  3. Pass JNIEnv* and jclass
  4. Lock / unlock if synchronized
  5. Trace method entry / method exit
  6. Check for safepoint
  7. Check exceptions
JNI functions
• Executed in VM context
• Check for safepoint
• jobject == index in thread-local JNIHandles array
• Optional verification
  – -XX:+CheckJNICalls
Exceptions
• Which code is faster?

    for (;;) {
        int index = getNextIndex();
        if (index >= arr.length) {
            break;
        }
        sum += arr[index];
    }

    try {
        for (;;) {
            int index = getNextIndex();
            sum += arr[index];
        }
    } catch (IndexOutOfBoundsException e) {
        // break
    }

• try-catch is free (exception tables)
• throw is expensive (find handler, unwind stack, release locks, build stack trace)
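For readers who want to experiment with the two variants, here is a runnable sketch. `LoopExitDemo` and the cursor behind `getNextIndex()` are assumptions made for the example; the slide does not show how `getNextIndex()` is implemented. The code only checks that both loops compute the same sum and makes no timing claims.

```java
final class LoopExitDemo {
    private final int[] arr;
    private int next;   // hypothetical cursor backing getNextIndex()

    LoopExitDemo(int[] arr) {
        this.arr = arr;
    }

    private int getNextIndex() {
        return next++;
    }

    /** Variant 1: explicit bounds check, no exception is ever thrown. */
    long sumWithCheck() {
        next = 0;
        long sum = 0;
        for (;;) {
            int index = getNextIndex();
            if (index >= arr.length) {
                break;
            }
            sum += arr[index];
        }
        return sum;
    }

    /** Variant 2: the loop ends when the array access throws. */
    long sumWithCatch() {
        next = 0;
        long sum = 0;
        try {
            for (;;) {
                int index = getNextIndex();
                sum += arr[index];
            }
        } catch (IndexOutOfBoundsException e) {
            // the thrown exception serves as the loop exit
        }
        return sum;
    }
}
```

The second variant pays the cost of exactly one throw per call, which is the expensive part according to the slide.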
Reflection
• getDeclaredFields(), getDeclaredMethods()
  – VM Internal structures → Java representation
• Field getters and setters
  – sun.misc.Unsafe
• Method.invoke()
  1. Native implementation
  2. Dynamic bytecode generation (up to 10x faster)
  – -Dsun.reflect.inflationThreshold=15
MethodAccessor example
• void Point.move(float x, float y, boolean relative);

    class PointMove_MethodAccessor implements MethodAccessor {
        public Object invoke(Object target, Object[] args)
                throws InvocationTargetException {
            // Unbox the arguments to the types Point.move expects
            float x = ((Float) args[0]).floatValue();
            float y = ((Float) args[1]).floatValue();
            boolean relative = ((Boolean) args[2]).booleanValue();
            try {
                ((Point) target).move(x, y, relative);
            } catch (Throwable t) {
                throw new InvocationTargetException(t);
            }
            return null;   // void method
        }
    }
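The accessor above can be tried outside the JDK with stand-in types. In this sketch the `MethodAccessor` interface is a local stand-in for the JDK-internal one, and the body of `Point.move` is an assumption, since the slide gives only its signature.

```java
import java.lang.reflect.InvocationTargetException;

final class AccessorDemo {
    /** Local stand-in for the JDK-internal accessor interface. */
    interface MethodAccessor {
        Object invoke(Object target, Object[] args) throws InvocationTargetException;
    }

    /** Hypothetical Point; only the signature of move() comes from the slide. */
    static class Point {
        float x, y;

        void move(float x, float y, boolean relative) {
            if (relative) {
                this.x += x;
                this.y += y;
            } else {
                this.x = x;
                this.y = y;
            }
        }
    }

    /** Hand-written equivalent of what a generated accessor would do. */
    static class PointMoveAccessor implements MethodAccessor {
        public Object invoke(Object target, Object[] args) throws InvocationTargetException {
            float x = (Float) args[0];
            float y = (Float) args[1];
            boolean relative = (Boolean) args[2];
            try {
                ((Point) target).move(x, y, relative);
            } catch (Throwable t) {
                throw new InvocationTargetException(t);
            }
            return null;   // move() returns void
        }
    }

    /** Moves a point absolutely, then relatively, through the accessor. */
    static float[] demo() {
        Point point = new Point();
        MethodAccessor accessor = new PointMoveAccessor();
        try {
            accessor.invoke(point, new Object[] { 1f, 2f, false });
            accessor.invoke(point, new Object[] { 0.5f, 0.5f, true });
        } catch (InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
        return new float[] { point.x, point.y };
    }
}
```

The point of such a class is that the call to `move` becomes an ordinary typed invocation that the JIT can optimize, with boxing confined to the argument array.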
Troubleshooting
• Signal handler (SIGSEGV, SIGILL, SIGFPE…)
• Not all SEGV are fatal
  – Polling page
  – Implicit NullPointerException, StackOverflowError
• hs_err.log
  – Threads, Stack frames, Memory map
  – Registers, Top of stack, Instructions
  – -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly
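Java code cannot observe the signal handling itself, but it can observe the result: conditions that, per the slide, may begin as a hardware fault inside the VM surface as ordinary, catchable throwables. A minimal sketch, with `ImplicitChecksDemo` as a hypothetical class name:

```java
final class ImplicitChecksDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();   // unbounded recursion, eventually runs out of stack
    }

    /** Recurses until the stack is exhausted and returns the depth reached. */
    static int overflowDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The VM reports stack exhaustion as an error the program can catch.
        }
        return depth;
    }

    private static int lengthOf(int[] array) {
        return array.length;   // no explicit null check in the source
    }

    /** Dereferences null and reports whether a NullPointerException was raised. */
    static boolean nullDereferenceIsRecoverable() {
        try {
            lengthOf(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

The depth reached by `overflowDepth()` depends on the thread stack size and on how the method was compiled, so it varies between runs and VMs.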
Still much to learn
Thank you!
• OpenJDK sources
  – http://hg.openjdk.java.net
• Contacts
  – andrey.pangin@odnoklassniki.ru
• Open Source @ Odnoklassniki
  – https://github.com/odnoklassniki
• Career @ Odnoklassniki
  – http://v.ok.ru
