Sunday, June 22, 2014

A little bit on the JVM and JIT

As you might be aware, the JVM(Java Virtusal Machine) is what makes it possible for Java to adhere to the write-once-run-anywhere paradigm. At its core, the JVM consists of the following components;

Heap
Stack
PermGen and Method Area
JIT Compiler
Code cache

The heap is where memory is allocated for every new operator you use during the application code development stage. Stack will store the local variables that you will assign within the scope of a method. One thing to note is that the variables defined within the scope of a method will be removed after the completion of the method. If for example, a String is assigned within the scope of a method, and its scope is guaranteed to be of local scope, then this will be stored in the stack which would otherwise be assigned within the heap.
The PermGen space will store class and method level data as well as static variables that are defined in your application. The method area is actual an area within the PermGen space where it will store all method,field, constant pool level details of your application.

The JIT compiler and the code cache go hand in hand. The JVM at its core interprets the Java Byte Code into assembly code at runtime. Interpreting can be a slow process because the code needs to be converted from byte code to machine code at runtime every time a portion of your application code is executed. This is where the JIT compiler comes into action, with its super awesome compilation of methods which it then stores in the code cache.

The JIT compiler analyzes the application code at runtime to understand which methods can be categorized as hot methods. Hot in this context meaning code fragments that are accessed more frequently. At a very high level, what the JIT compiler does is that it will have a counter for each method executed in order to understand the frequency of its usage. When the counter reaches a defined threshold value, the method then becomes eligible to be compiled by the JIT compiler to its respective assemble code which will then be stored within the code cache. What happens is that now, whenever the JIT compiler comes across calls to those methods which were compiled and stored within the code cache, it will not try to interpret them yet again but will use the already compiled assembly code available within the code cache. This gives your application a performance boost because using the compiled code is much faster than interpreting it during runtime.

When talking about the JIT compiler, there are mainly two flavors of it which we are mostly oblivious of due to the fact of the lack of documentation around them. The two types are;

Client
Server

The default compiler used will defer according to the machine architecture and the JVM version (32bit or 64bit) that you are running on. Let us briefly see what each one does.

The client compiler starts compiling your byte code to assembly code at the application startup time. What this indirectly means is that your application will have a much improved startup time. But the main disadvantage this brings along with it is that your code cache will run out of memory faster. Most optimizations can be made only after your application has run for a brief period of time. But since the client compiler already took up the code cache space, you will not have space to store the assembly code for these optimizations. This is where the server cache excels.

Unlike the client compiler, the server compiler will not start compiling at the start of your application. It will allow the application code to run for some time (which is often referred to as the warm-up period) after which it will start compiling the byte code to assembly code which it will then store within the code cache.
In my next post I will discuss how we can actually mix and match the client and server compilation and also introduce you to a few more JVM flags that we seldom come across but are vital for increasing the performance of your application.

Friday, June 6, 2014

Finding the Equilibrium index of an array

I wanted to do a brain teaser today so i took up an algorithmic question to give a shot at. Now i know this question already has many answer on the internet if you do a quick search. But i wanted to try it out with the solution that came to my mind. The problem statement is as follows;

Equilibrium index of an array is an index such that the sum of elements at lower indexes is equal to the sum of elements at higher indexes. For example, in an array A:

A[0] = -7, A[1] = 1, A[2] = 5, A[3] = 2, A[4] = -4, A[5] = 3, A[6]=0

3 is an equilibrium index, because:
A[0] + A[1] + A[2] = A[4] + A[5] + A[6]

6 is also an equilibrium index, because sum of zero elements is zero, i.e., A[0] + A[1] + A[2] + A[3] + A[4] + A[5]=0

7 is not an equilibrium index, because it is not a valid index of array A.

Write a function int equilibrium(int[] arr); that given a sequence arr[] of size n, returns an equilibrium index (if any) or -1 if no equilibrium indexes exist.

The solution i came up with is as follows;


 
public static int solution(int[] A) {

  if (A == null || A.length < 3)
   throw new RuntimeException("Cannot find equilirbium");
  int pointer = 1;
  int lowerIndCount = A[0];
  int upperIndCount = 0;

  for (int i = 2; i < A.length; i++) {
   upperIndCount += A[i];
   if (lowerIndCount < 0) {
    if (upperIndCount > lowerIndCount && i != A.length - 1)
     continue;
    if (upperIndCount == lowerIndCount && i == A.length - 1)
     break;
    lowerIndCount += A[pointer];
    upperIndCount = 0;
    ++pointer;
    i = pointer;
   } else {
    if (upperIndCount > lowerIndCount) {
     lowerIndCount += A[pointer];
     upperIndCount = 0;
     ++pointer;
     i = pointer;
    }
   }
  }

  if (upperIndCount == lowerIndCount)
   return pointer;

  return -1;
 }

If this is the optimum or not im not sure. Im also not sure if this will work on very large data sets. But for the data sets i tried(both negative and positive) it worked fine. And also this has O(N) complexity as I am iterating through the array only once.