Performance Tips: Java without a JIT

Variables

Variable access

In terms of the source code, variables in Java reside in two principal places. They can belong to an object ("instance variables") or to a class (static variables or "class variables"), in which case their declaration sits outside the method definitions. Or they can be "local variables" declared inside a method. So in the following example, byteData is an instance variable and belongs to each Obj object, whereas total (and indeed i) is a local variable and can only be seen within the doCalculation method:

public class Obj {
  private byte[] byteData = new byte[10];

  public void doCalculation() {
    int total = 0;
    for (int i = 0; i < 10; i++) {
      total += byteData[i];
    }
  }
}

Other than the place in which they are declared, the syntax for declaring and accessing these types of variables is very similar. What happens under the hood is quite different.

Local variables such as total are relatively straightforward to access. On entering the method, the JVM reserves some space for local variables. To the JVM, local variables are referred to by number as though stored in an array (indeed, in most JVMs that's probably how they're implemented). Although in the source code the variable is called total, to the VM it's just "local variable number 1" (or whatever slot the compiler assigns; in an instance method, slot 0 holds this).

Instance variables such as byteData work differently. In the Java bytecode, the variable is actually referenced by object and name (using the GETFIELD instruction). Without optimisation, the VM must do more work to determine, for that particular object and variable name, whereabouts the variable is actually stored in memory.
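As a rough illustration of the difference, the comments in the sketch below show approximately the bytecode that javac emits for the loop body; running javap -c on the compiled class shows the real listing. The class name FieldAccessDemo and the sum() method (which returns the total so that the result can be checked) are made up for this sketch, and the exact slot numbers and constant pool indices may differ with your compiler.

```java
// Hypothetical variant of Obj that returns the total so it can be checked.
class FieldAccessDemo {
  private byte[] byteData = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

  int sum() {
    int total = 0;                 // slot 0 is 'this', so total is typically slot 1
    for (int i = 0; i < 10; i++) { // i is typically slot 2
      // Approximate bytecode for the next line:
      //   iload_1        push local variable 1 (total)
      //   aload_0        push 'this'
      //   getfield ...   look up the byteData field on 'this'
      //   iload_2        push local variable 2 (i)
      //   baload         load the byte at that index
      //   iadd           add it to total
      //   istore_1       store the result back into local variable 1
      total += byteData[i];
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(new FieldAccessDemo().sum()); // prints 55
  }
}
```

Note that the two local variables are reached with single-purpose load and store instructions, while the field needs aload_0 followed by getfield on every pass.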

A potential performance hit of the code above, then, is that it uses the relatively expensive GETFIELD instruction on each pass through the loop. A minor modification to the method removes the need for a GETFIELD every time through the loop:

public class Obj {
  private byte[] byteData = new byte[10];

  public void doCalculation() {
    int total = 0;
    byte[] byteData = this.byteData;
    for (int i = 0; i < 10; i++) {
      total += byteData[i];
    }
  }
}

In this version, the VM 'finds' the byteData field once before entering the loop and stores a reference to the array in a local variable. Now during the loop, the array is a little faster for the JVM to track down and access each time.
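If you want to see whether the change matters on your own VM, a crude timing harness along the following lines may help. This is only a sketch: the class LoopTiming, its method names and the run count are invented for illustration, and timings from a loop this simple are noisy at best.

```java
// Hypothetical harness: class, method and variable names are illustrative only.
class LoopTiming {
  private byte[] byteData = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

  // Reads the field on every pass through the loop.
  int sumViaField() {
    int total = 0;
    for (int i = 0; i < 10; i++) {
      total += byteData[i];
    }
    return total;
  }

  // Copies the field reference into a local variable before the loop.
  int sumViaLocal() {
    int total = 0;
    byte[] byteData = this.byteData;
    for (int i = 0; i < 10; i++) {
      total += byteData[i];
    }
    return total;
  }

  public static void main(String[] args) {
    LoopTiming obj = new LoopTiming();
    final int runs = 1000000;
    long sink = 0; // accumulate the results so the calls are not trivially dead

    long t0 = System.nanoTime();
    for (int r = 0; r < runs; r++) sink += obj.sumViaField();
    long t1 = System.nanoTime();
    for (int r = 0; r < runs; r++) sink += obj.sumViaLocal();
    long t2 = System.nanoTime();

    System.out.println("field access: " + (t1 - t0) / 1000000 + " ms");
    System.out.println("local copy:   " + (t2 - t1) / 1000000 + " ms");
    System.out.println("(checksum " + sink + ")");
  }
}
```

On a HotSpot-based JVM, running this with java -Xint LoopTiming forces interpreted execution, which is the closest desktop approximation to a VM without a JIT. With the JIT enabled, any difference between the two versions may well disappear.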

Look out for this technique in various places in the JDK source code.

Variable initialisation

It's common to see code like this:

public class Counter {
  private int value = 0;
  //...
}

This makes it nice and clear that the variable value will be initialised to zero, but strictly speaking it's unnecessary and makes the VM do extra work every time one of these objects is created. Any instance variable is initialised to its default value (zero for numeric types, false for booleans, null for references) "for free" when the object is created. So the following is functionally equivalent and saves a couple of bytecodes:

public class Counter {
  private int value;
  //...
}
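The default values are easy to check for yourself. The following sketch uses a made-up class, Defaults, with one uninitialised field of each of several types:

```java
// Hypothetical class: none of these fields has an explicit initialiser.
class Defaults {
  int count;      // defaults to 0
  long big;       // defaults to 0L
  double ratio;   // defaults to 0.0
  boolean flag;   // defaults to false
  char letter;    // defaults to '\u0000'
  Object ref;     // defaults to null
  int[] array;    // defaults to null (no array is allocated)

  public static void main(String[] args) {
    Defaults d = new Defaults();
    System.out.println(d.count);          // prints 0
    System.out.println(d.big);            // prints 0
    System.out.println(d.ratio);          // prints 0.0
    System.out.println(d.flag);           // prints false
    System.out.println((int) d.letter);   // prints 0
    System.out.println(d.ref);            // prints null
    System.out.println(d.array == null);  // prints true
  }
}
```

This applies to instance and static variables only. Local variables get no default value, and the compiler rejects code that reads one before it has been assigned.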

Conversely, if you are not initialising to zero, then there's no "magical" way to initialise variables. The first of these two code fragments will actually get turned into something like the following by the javac compiler:

public class Counter {
  private int value;
  
  public Counter() {
    super();
    value = 0;
  }
}
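One way to convince yourself that a field initialiser really is ordinary constructor code, run after the call to super(), is to observe the field from the superclass constructor. The sketch below is purely illustrative: InitBase, InitOrder and their members are invented names, and it uses a non-zero initial value (5) so that the before and after states can be told apart.

```java
// Hypothetical superclass whose constructor calls an overridable method.
class InitBase {
  InitBase() {
    report(); // runs before the subclass's field initialisers
  }

  void report() {}
}

class InitOrder extends InitBase {
  // Static, so it is not reset when the instance is initialised.
  static int seenBySuper = -1;

  private int value = 5; // compiled into the constructor, after super()

  @Override
  void report() {
    seenBySuper = value; // still the default value at this point
  }

  int getValue() {
    return value;
  }

  public static void main(String[] args) {
    InitOrder o = new InitOrder();
    System.out.println(seenBySuper);  // prints 0
    System.out.println(o.getValue()); // prints 5
  }
}
```

The superclass constructor sees the "free" default of zero; the assignment of 5 only happens afterwards, as part of the subclass constructor.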

Similarly, if you write:

public class Obj {
  private int[] data = {1, 2, 3};
}

this is again just a shorthand and will be turned into something like this by the compiler:

public class Obj {
  private int[] data;

  public Obj() {
    super();
    data = new int[3];
    data[0] = 1;
    data[1] = 2;
    data[2] = 3;
  }
}
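A consequence of this expansion is that the array is allocated and filled afresh for every object created. The following sketch, using a made-up class ArrayInit, shows that two instances end up with separate arrays:

```java
// Hypothetical class with an array initialiser on an instance variable.
class ArrayInit {
  int[] data = {1, 2, 3};

  public static void main(String[] args) {
    ArrayInit a = new ArrayInit();
    ArrayInit b = new ArrayInit();

    a.data[0] = 99; // modify the first object's array only

    System.out.println(a.data == b.data); // prints false: two separate arrays
    System.out.println(b.data[0]);        // prints 1: b's array is untouched
  }
}
```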

In other words, there isn't a performance gain from declaring initial values at the top of the class: it's just a lexical shorthand for code that runs in the constructor. This is different from C programming on some platforms, where the initial value of a global variable can be embedded in a program's object file.

All editorial content copyright (c) Neil Coffey 2007. All rights reserved.