Notes from "Java Memory Management" by Maaike van Putten, Seán Kennedy. Packt Publishing
- https://learning.oreilly.com/library/view/java-memory-management/9781801812856/
- https://www.amazon.com/Java-Memory-Management-comprehensive-collection/dp/1801812853
The Runtime data areas is essentially the Java memory; which is managed by JVM for executing code.
Runtime data areas:
- The stack
- Contains a stack for each thread. Primitives and reference variables to heap are stored in the stack. Each method call results in a stack frame being created in the stack. This frame contains values for the method. A thread can only access it's own stack.
- The heap
- The heap is used for dynamic memory allocation of instances of object at runtime. The heap is managed and cleaned up by garbage collection
- The method area/Metaspace
- Class's metadata is stored here. It used to be called PermGen (permanent generation). All threads share this method area. It contains the runtime code, constants, static variables & constructor code
- The runtime constant pool
- Is part of the method area or Metaspace
- The program counter register
- Each thread has it's own PC register (Call stack) which tells what code is being executed by holding the address of the instruction that's being executed. A separate PC for each thread allows concurrent execution of the same code
- The native method stack
- Also known as the C stack for invoking native methods
- Primitives (local variables) and reference variables are stored on the stack. The actual object is stored on the heap
- Variables used in methods are stored on the stack (local variables)
- Stack memory is used for executing methods
- Each method call results in a stack frame being created in the stack memory; so the stack consists of frames corresponding to method invocations.
- When the method is finished; the stack is removed
- When stack memory is too low
StackOverFlowError
is thrown - When a new stack cannot be allocated;
OutOfMemoryError
is thrown - Top frame of the stack corresponds to the currently executing method; and is called the current frame
A stack frame has the following components
- The array length of the local variables array is set at compile time
- The array has single & double spots; single for int, short, char, float, byte, boolean & reference. Double for long & double (64bits)
- For instance methods (non-static); index 0 is always the reference to the object they exist on (
this
). Parameters given to the method start from index 1 - For static methods; there's no reference to an instance
- Stores operands for method calls & execution
- Contains reference to constant pool, how to return a method normally; and how to return abruptly completed methods (exceptions)
- Objects can be created and stored on the heap using the new keyword. References to these objects are stored in the stack
- The objects contains primitive values & references to other objects on the heap
- When the heap is full and new object cannot be created,
OutOfMemoryError
is thrown
- Holds the class metadata necessary for runtime (from Java SE 7)
- It used to be called PermGen
- Contains
- Klass structure
- Bytecode of methods
- The constant pool
- Annotations & more
- Instance primitive variables (members of the class) are stored on the heap
- Local variables to methods are stored on the stack
- References can also be stored on either the heap or the stack (local variables array on the stack frame); depending on whether the reference is a instance variable or a local method variable
- We cannot create objects of
abstract
classes &interface
. The references of these types always point to objects which implement them
- String objects are stored in a special area on the heap called the String Pool (AKA String Constant Pool)
- Whenever a method is called, the passed arguments are copied and put into the new stack frame created for the method
- The primitive variables' copies are local to the stack frame; where as reference variable copies would still point to the same object on the heap (since it's a reference)
- Therefore any change inside the called method
- To primitives would not reflect outside to the calling method
- To references would be reflected outside to the calling method
- Even though StringBuilder is private to the person class; but due to the reference being passed around different scopes (because of call-by-value); other objects are also able to change it's value. Essentially everyone holds reference to the same object
- This is primarily because StringBuilder is mutable; in contrast to String which is immutable
- The wrapper types (Double, Long, Integer) are also immutable
- The solution revolves around
defensive copying
; where we copy the incoming (passed object) and the outgoing (returned object) for any mutable objects - The solution is to copy the contents of the passed object into a new object and store the reference to that new object. This is knows as
Deep Copying
; as against shallow copying (done by call-by-value for references)
Manage object references with care. If not managed properly, you could end up with _escaping references_.
Java uses call-by-value, which means a copy is made of the argument passed or returned.
Depending on whether the argument is a primitive or reference, it can have major implications.
If it’s a copy of a reference to a mutable type, then the calling code can change your supposedly **private** data. This is not proper encapsulation.
The heap space consists of two different memory areas:
- Young generation space
- Old generation (tenured) space
GC roots are special types of objects not eligible for GC. All objects approachable by GC roots are also ineligible for GC (called live objects). Common GC roots are:
- Local variables on the stack
- All active Java threads
- Static variables (can be referenced by Class)
- JNI references
- Young generation space: AKA nursery or new space contains two separate areas; eden space & survivor space
- Eden space: is where new objects are allocated. When eden space is full; a young generation (minor) garbage collector runs
- Survivor space: These are two equally divided spaces S0 & S1. The minor garbage collector uses these spaces alternatively
- Old generation space: AKA tenured space. This is where longer-lived objects reside. The GC moves objects that have survived a certain number of GCs here. When the old generation space is full; this triggers a major GC
- The tenuring threshold is configurable using a JVM argument;
- -XX:MaxTenuringThreshold
- Care must be exercised with the argument, as a value greater than 15 specifies that objects should never tenure, thereby indefinitely filling up the survivor space with old objects Other important GC tuning parameters for JVM
- -Xms and -Xmx specify the heap’s minimum and maximum sizes, respectively.
- -XX:NewSize and -XX:MaxNewSize specify the young generation’s minimum and maximum sizes, respectively.
- -XX:SurvivorRatio specifies the relative sizes of the two survivor spaces with respect to the eden space. For example, -XX: SurvivorRatio=6 sets the ratio between eden and a survivor space to 1:6. In other words, each survivor space will be one-sixth the size of eden and therefore, one-eighth the size of the young generation (not one-seventh, as there two survivor spaces).
- -XX:NewRatio expresses the relative size of the new generation with respect to the old generation. For example, -XX:NewRatio=3 sets the ratio between the new generation and old generation to 1:3. This means that the new generation (eden plus both survivor spaces) occupies 25% of the heap and the old generation occupies the remaining 75%.
- -XX:PretenureSizeThreshold – If an object’s size is greater than the size specified by this flag, then the object is tenured immediately, meaning the object is allocated straight to the old generation space. The default value is 0, which means no object will be directly allocated to the old generation of the heap.
- Any objects which don't have a reference on the stack are eligible for GC; somehow in order to determine what objects are eligible for GC is a difficult task and there are several algorithms for that Garbage collection has the following major steps:
- Marking: Where eligible objects are marked for GC
- Sweeping: Clearing out the eligible objects
- Compacting: Defragmenting the memory
- Objects keep a bit which either marks (1: as live, reachable)/unmarks (0:unreachable) them for GC
- All the object references from the stack/GC roots are followed and any unreachable object is marked as 0; as eligible for GC
- When an object is first created in the eden space; the GC bit is set to 0 (unmarked); hence it could be (mistakably) collected by GC
- All the threads are stopped for GC; otherwise the new objects in the eden space would also be collected
- The then GC roots are traversed to mark the unreachable (0) / reachable (1) objects
- After that the sweep occurs
- Since stop the world pauses all the threads; there is important performance implications for it. That's why more frequent minor GC events happen and this is also why the heap memory is structured into Young / Old Generation
- We can also maintain the number of reference to an object
- Any object with >= 1 number of reference cannot be garbage collected
- Somehow this approach is not used; as it doesn't work with objects which reference each other without having a connection to the stack. This is called Islands of isolation
The deletion of objects is called Sweeping in GC jargon. There are three kinds of sweeping
- Normal sweeping
- Sweeping with compacting
- Sweeping with copying
This only deletes the objects but leaves the memory fragmented. Sometimes; even with enough overall memory available; there isn't enough contiguous space available to allocate memory for that object due to fragmentation. This leads to OutOfMemoryError
being thrown.
- This strategy adds an extra step where after freeing up memory; the remaining objects are moved to ensure there are no gaps between them
- The end result is no fragmentation in the memory
- Somehow this comes at a performance cost
- For this strategy; two memory regions are required
- First all the reachable objects are copied over to the new memory area
- Then the first memory area is cleared of all objects
Standard JVM has 5 different GC implementations
- Serial GC
- Parallel GC
- CMS (Concurrent Mark Sweep) GC
- G1 GC
- Z GC
- Doing a full object tree traversal for every GC along with stop-the-world is an expensive performance hit
- Some GC implementations make use of young/old generation areas of the heap to improve performance
- Generational garbage collectors focus on certain parts of the memory; for e.g.: the young generation. This works well for applications where most objects die young & it saves a lot of marking. (See weak generational hypothesis)
- For applications which have most of their objects in the old generation; this strategy won't free up a whole lot of memory
- So generational GCs may have to use different strategies for different heap areas
- For example; use stop-the-world for young generation cleanup and CMS for old generation
- Generational garbage collectors use Remembered Set. This contains all the references from the old generation to the young generation. This way old generation doesn't need to be scanned; as the references are already in the Remembered Set
- Single threaded
- Uses stop-the-world strategy (application threads are paused)
- Uses mark & sweep with copying for young generation
- Uses mark & sweep with compacting for old generation
- Ideal for small programs, single core processors
- Default GC for Java 8
- Uses stop-the-world strategy
- Uses mark & sweep with copy for young generation
- Uses mark & sweep with compacting for old generation
- All the above is same as Serial GC, however here; multiple threads are used for cleaning up the heap space
- Since it uses multiple threads; the world needs to be stopped for a shorter time which results in better application performance
- Works well on multi-core processors
- Uses multiple threads to reduce the pause times
- Separate cycles for young & old generation
- For young generation it uses stop-the-world mark & sweep with copying (main application threads are paused)
- For old generation; it's mostly concurrent mark & sweep
- mostly concurrent because it still pauses the application twice during marking; first at the beginning & then around the middle of the GC cycle
- The pauses are very short; since it does most of the work concurrently while the application is also running
- Somehow; there are exceptions which result in a complete pause due to concurrent mode failure
- Default since Java 9
- G1 GC is a parallel, concurrent GC with short pauses
- Uses a technique called incremental compacting
- It divides the heap into smaller regions; (smaller than generational garbage collectors)
- The regions with the most garbage are collected first (garbage first)
- Keeps track of reachable/unreachable objects per memory area
- During mark & sweep it also copies objects from one memory area to another; thus freeing up an entire memory area at the same time. This also accomplishes compacting (via copying)
- It still uses stop-the-world; but since the memory areas are much smaller; the pause times are much shorter
- It can also concurrently do string deduplication in the String pool
- This needs to be enables with
-XX:+UseStringDeduplication
- This needs to be enables with
- However; there are exceptions and sometimes it cannot do GC concurrently and the application needs to be paused
- Good for high performing applications with a large memory footprint
- Concurrent GC which guarantees 10 ms pause times
- Works only on 64 bit systems
- It marks live objects by using reference coloring. I.e.: Storing the live state within the reference variable (requires extra bits)
- Avoids fragmentation by doing concurrent relocation.
- During a brief stop-the-world pause; adds live objects to a relocation set
- Then concurrently relocates the objects and maintains a forwarding table for the application to be still able to get to the new location of the objects
- Also does concurrent reference update
- Great for low latency applications
Metaspace is a special area of native memory outside of the heap. JVM uses the Metaspace to store class's metadata
Metadata consists of the follwoing:
- Class files
- Structure and methods of classes
- Constants
- Annotations
- Optimizations
- Before Java 8; Metadata was stored within the heap in an area known as Permgen or permanent generation
- Permgen stored
- Class metadata
- Interned strings
- class/static variables
- As of Java 8; the class metadata is now stored in Metaspace; whereas interned strings & class/static variables are still stored on the heap
- Class loaders load class metadata into the metaspace
- The class loader instance is loaded onto the heap
- Bootstrap class loader loads other class loaders
- Boostrap & application class loaders reside permanently in Metaspace & are never garbage collected
- Dynamic class loaders & the classes they load are eligible for garbage collection witin the metaspace
- GC in meta space is only done when
- Metaspace runs out of memory
- Metaspace exceeds JVM set threshold (set using
-XX:MetaspaceSize
flag)
- We know that dynamic class loader instances reside in the heap; and they are responsible for loading class's meta data into the metaspace
- The metadata for the classes loaded by a class loader can only be garbage collected once all the classes loaded by that class loader are out of scope
- This again will only happen if the GC is triggered in the metaspace (hitting the threshold or high watermark)
In order to tune the JVM memory management; it's a good idea to have an understanding of the following
- What is a well functioning memory for the application
- Normal latency
- Normal throughput
- Find out the stable memory footprint of an application
- There should be more memory available than the stable memory-usage
- Also make sure not to allocate too much memory to JVM and leave enough for the operating system to function properly
- Knowing the normal memory usage will set a benchmark to measure the consequences of any memory tuning
- Measuring the latency of an application could be tricky if the application is dependent on other services which incurs network/IO costs
- In these cases latency issues might not be related to JVM memory
- High throughput requires more memory which may affect latency (larger memory takes longer to cleanup)
- Profiling affects application performance, so it's advisable to profile in development environment if possible
- Use
jps
to get PID of all Java processes - We can do basic profiling with
- jstat
- e.g:
jstat -gc -t -h 10 58707 500 5000
- e.g:
- jmap
- e.g:
jhsdb jmap --heap --pid 35384
- e.g:
- VisualVM
- jstat
- Tuning should not be the first step to improve performance; good code should be
- It's generally a best practice to not set heap size more than half of what is available on the server
- Use the following command to see the defaults for heap size
java -XX:+PrintFlagsFinal -version | grep HeapSize
- The heap size affects garbage collection performance
- Too small & the GC needs to run more frequently and spends more CPU time on GC
- Too large and it take more time to do a full GC
- Good rule of thumb is that the application should spend less than 5% of it's execution time collecting garbage
- Examples for setting size limits for overall memory pool & stack size
- -Xms1024m (initial size heap)
- -Xmx1024m (maximum size heap)
- -Xss1024m (thread stack size)
- Examples for setting Young Generation size
- -XX:MaxNewSize=1024m (maximum new size)
- -XX:NewSize=1024m (minimum new size)
- Example for creating a heap dump on OOME (`OutOfMemoryException)
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/some/path/to/dumps ExampleAnalysis
- Example for specifying a GC
- XX:+UseSerialGC
- XX:+UseParallelGC
- XX:+UseConcMarkSweepGC
- XX:+G1GC
- XX:+UseZGC