In one of the previous lessons we saw the difference between JIT, JVM, JRE and JDK. This lesson looks inside the JVM in somewhat greater detail. The JVM has too many components to cover in a single lesson, so this one focuses on the memory areas inside the JVM and touches upon class loading and bytecode interpretation. It is extremely important for a Java developer (especially a beginner) to understand the memory areas. Without knowing how memory is organized and where variables sit inside it, it is nearly impossible to design linked data structures.
Given the complexity of the topic at hand, there will be inside-jvm-201 and inside-jvm-301 lessons in the future. There will be separate lessons on garbage collection.
What does JVM do anyway?
JVM has two main tasks –
- Converting the bytecode contained in a .class file to machine code.
- Running the machine code generated in the previous step.
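To make "bytecode" a little more concrete, here is a minimal sketch of a trivial method together with the instructions that `javap -c` typically shows for it. The class name `BytecodeDemo` and the method `add` are made up for this illustration, and the exact disassembly layout can differ between compiler versions.

```java
// Hypothetical class used only to illustrate what bytecode looks like.
class BytecodeDemo {

    // javac compiles this method into a few stack-based instructions.
    // Disassembling the class with `javap -c BytecodeDemo` typically shows:
    //   iload_0   // push the first int argument
    //   iload_1   // push the second int argument
    //   iadd      // pop both values, push their sum
    //   ireturn   // return the int on top of the operand stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // At run time the JVM executes the instructions listed above.
        System.out.println(add(2, 3));
    }
}
```

The `.class` file stores these instructions in binary form; turning them into something the CPU can run is the JVM's job.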
Secondarily, the JVM also manages memory, because Java is a memory-managed language. This essentially means that you, as a Java developer, do not have to worry about allocating and freeing memory blocks. The underlying operating system allocates memory to the Java process, and the JVM further partitions this memory into smaller areas, each of which serves a different purpose. The JVM sets up the shared areas when it starts and the per-thread areas when a thread is created; class loading then fills in the class-level data, so that the execution engine can reference it while running code.
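A quick way to see this managed memory from inside a program is the standard `java.lang.Runtime` class. The sketch below is only an illustration: the class name `MemorySnapshot` and the helper `snapshot` are made up here, and the numbers printed depend entirely on your JVM and its settings.

```java
// Hypothetical demo class; only the Runtime calls are part of the standard library.
class MemorySnapshot {

    // Returns {max, total, free} memory in bytes, as reported by the running JVM.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] before = snapshot();
        System.out.println("Max memory the JVM will try to use: " + before[0] / mb + " MB");
        System.out.println("Memory currently reserved by JVM  : " + before[1] / mb + " MB");
        System.out.println("Free memory within that reserve   : " + before[2] / mb + " MB");

        // Allocate with `new`; there is no matching call to free the array.
        byte[] block = new byte[16 * 1024 * 1024];
        long[] after = snapshot();
        System.out.println("Free memory after allocating " + block.length / mb
                + " MB: " + after[2] / mb + " MB");
    }
}
```

Notice that the program asks for memory with `new` but never gives it back explicitly; reclaiming unused objects is left to the JVM.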
Memory areas inside JVM
JVM has the following memory areas
- Heap – this is the place where new objects (including arrays) are created. Local variables of primitive types such as int are not allocated on the heap, but primitive fields live there as part of the object that contains them. Consider this area the central memory store of Java. Various threads can reference the same object in the heap, which forms a blackboard kind of scenario. Note that an update made by one thread is not guaranteed to be visible to other threads right away; that requires synchronization or volatile fields.
- Stack (Java) – Every thread in Java has its own stack, which consists of stack frames. A stack frame is a data structure created when a method is invoked. When the method invocation completes, the stack frame is popped. Local variables and method parameters sit on the Java stack: primitive values are stored there directly, while for objects only the reference is stored and the object itself lives on the heap. Static variables are not stored on the Java stack.
- Method area – This part of memory stores class-level details, such as which methods a class contains, the signatures of those methods, and static variables. Like the heap, the method area is shared among all threads.
- Program Counter Register – referred to as the PC register for short, it holds the address of the JVM instruction that its thread is currently executing. Like the Java stack, the PC register is per thread: each thread has its own PC register.
- Native stack – This is used during native code execution. It comes into play if you are using JNI and have native C code.
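The split between the shared heap and the per-thread stacks can be sketched in a few lines. In the example below, the class `MemoryAreasDemo` and its methods are invented for illustration. Two threads update one counter object on the heap, while each thread keeps its own copy of the local variable `local` in its own stack frame. An `AtomicInteger` is used so that updates from both threads are applied safely.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class illustrating heap versus stack.
class MemoryAreasDemo {

    // The AtomicInteger object lives on the heap and is reachable from every thread.
    static final AtomicInteger shared = new AtomicInteger();

    static int countTo(int limit) {
        int local = 0;                 // local variable: lives in the calling thread's stack frame
        for (int i = 0; i < limit; i++) {
            local++;                   // private to this thread
            shared.incrementAndGet();  // shared heap object, updated atomically
        }
        return local;
    }

    // Runs countTo on two threads and returns the final value of the shared counter.
    static int runTwoThreads(int limit) {
        shared.set(0);
        Thread t1 = new Thread(() -> countTo(limit));
        Thread t2 = new Thread(() -> countTo(limit));
        t1.start();
        t2.start();
        try {
            t1.join();                 // wait for both threads to finish
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(1000)); // prints 2000
    }
}
```

Each thread counts to 1000 in its own `local` variable, yet the shared counter ends at 2000 because both threads work on the same heap object.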
Classloader and Execution Engine Inside JVM
The classloader is the subsystem of the JVM that
- Loads and parses the bytecode presented to it. When the current class references another class, the classloader loads that dependency as well. This process continues recursively until it reaches classes that have no further dependencies.
- Verifies the integrity of the presented bytecode. Bytecode in a .class file generated by javac should always pass the verifier. But a malicious user may tamper with the bytecode and introduce harmful or illegal instructions to harm the JVM. The verifier therefore acts as another line of defence against malformed bytecode as well as malicious users.
- Populates the method area with details of the class structure.
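The result of class loading can be inspected through the standard reflection API. The sketch below uses an invented class `ClassLoaderDemo` with invented helpers. It reports which loader loaded a class and lists the class-level details (here, method names) that the JVM keeps for it.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; getClassLoader and getDeclaredMethods are standard API.
class ClassLoaderDemo {

    // Names the loader of a class. Core classes such as String are loaded by the
    // bootstrap loader, which getClassLoader() reports as null.
    static String loaderOf(Class<?> c) {
        ClassLoader cl = c.getClassLoader();
        return cl == null ? "bootstrap" : cl.getClass().getName();
    }

    // Collects the names of the methods declared directly in the given class.
    static Set<String> declaredMethodNames(Class<?> c) {
        Set<String> names = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            names.add(m.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("String loaded by    : " + loaderOf(String.class));
        System.out.println("This class loaded by: " + loaderOf(ClassLoaderDemo.class));
        System.out.println("Methods of this class: " + declaredMethodNames(ClassLoaderDemo.class));
    }
}
```

The name printed for the application class loader differs between Java versions, so treat that line of output as informational only.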
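The execution engine is the component that runs the loaded bytecode. Its interpreter reads bytecode instructions one at a time and executes them. The JIT compiler, which was discussed previously, is also part of the execution engine and compiles frequently executed bytecode into machine code.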
The bytecode interpreter is responsible only for running Java code. Native (C/C++) code is executed through the native interface, and it uses the native stack inside the JVM.
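If you are curious whether your JVM has a JIT compiler and how much work it has done, the standard `java.lang.management` package can tell you. This is a minimal sketch: the class `JitInfo` and its methods are invented for this lesson, and the compiler name and timings reported will vary from one JVM to another.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class; the management calls are standard API.
class JitInfo {

    // Describes the JIT compiler of the running JVM, if it has one.
    static String describeJit() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler available";
        }
        String info = jit.getName();
        if (jit.isCompilationTimeMonitoringSupported()) {
            info += ", total compilation time so far: "
                    + jit.getTotalCompilationTime() + " ms";
        }
        return info;
    }

    // A simple loop that runs often enough to become a candidate for JIT compilation.
    static long hotLoop(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("Before: " + describeJit());
        hotLoop(50_000_000);
        System.out.println("After : " + describeJit());
    }
}
```

On a typical JVM the reported compilation time grows between the two lines, which hints at the JIT compiler working alongside the interpreter.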