Java: Class File and Char Arrays
Monday, 26 July 2010 20:06

Below are rough notes on how the contents of a Java class file differ between two ways of initializing a character array.

  • The first variable is initialized from a string literal by calling 'toCharArray()'.
  • The second variable is initialized by using an array initializer.

public class mymain {
	public static void main(String[] args) {
		char[] helloarray = "hello".toCharArray();
		char[] worldarray = {' ','w','o','r','l','d','!'};
		String s = new String(helloarray) + new String(worldarray);
		System.out.println(s);
	}
}

The first declaration generates two things in the 'mymain.class' file:

  • A UTF8 constant pool entry is generated for the string 'hello' (the String constant that 'ldc' pushes refers to this entry).
    	ENTRY 0017: eCONSTANT_Utf8: hello
    
  • The method 'toCharArray()' is called using the following byte codes:
                ...
                ...
                ldc		//Push item from runtime constant pool
      invokevirtual		//Invoke instance method; dispatch based on class
           astore_1		//Store reference into local variable
                ...
                ...
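
One way to check the constant pool entry described above is to read the class file header and walk the constant pool directly. The following is a minimal sketch, not the tool that produced the 'ENTRY 0017' line: the class name 'Utf8Lister' and the method 'utf8Entries' are hypothetical names chosen for illustration, and the entry sizes follow the class file format as I understand it. It only collects the UTF8 entries and skips over everything else.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the original notes): lists the UTF8
// entries found in the constant pool of a class file.
final class Utf8Lister {

    static List<String> utf8Entries(byte[] classFile) {
        List<String> result = new ArrayList<>();
        try (DataInputStream data = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (data.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            data.readUnsignedShort();             // minor_version
            data.readUnsignedShort();             // major_version
            int count = data.readUnsignedShort(); // constant_pool_count

            // Valid constant pool indexes run from 1 to count - 1.
            for (int i = 1; i < count; i++) {
                int tag = data.readUnsignedByte();
                switch (tag) {
                    case 1:  // Utf8: u2 length followed by modified UTF-8 bytes
                        result.add(data.readUTF());
                        break;
                    case 5: case 6:  // Long, Double: 8 bytes, occupy two pool slots
                        data.readLong();
                        i++;
                        break;
                    case 3: case 4:            // Integer, Float
                    case 9: case 10: case 11:  // Fieldref, Methodref, InterfaceMethodref
                    case 12:                   // NameAndType
                    case 17: case 18:          // Dynamic, InvokeDynamic
                        data.readInt();
                        break;
                    case 7: case 8:    // Class, String
                    case 16:           // MethodType
                    case 19: case 20:  // Module, Package
                        data.readUnsignedShort();
                        break;
                    case 15:  // MethodHandle: u1 kind followed by u2 index
                        data.readUnsignedByte();
                        data.readUnsignedShort();
                        break;
                    default:
                        throw new IllegalArgumentException("unknown constant pool tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the path to a class file (for example mymain.class) is the first argument.
        for (String entry : utf8Entries(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(entry);
        }
    }
}
```

Run against 'mymain.class', the list should include 'hello' but none of the characters from the second array as a string.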
    

The second declaration generates only byte codes (similar to those below) and no constant pool entry for the characters; each character is an immediate operand of a 'bipush'.

         bipush		//The immediate byte is sign-extended to an int value. That value is pushed onto the operand stack.
       newarray		//Create new array
            dup		//Duplicate the top operand stack value
       iconst_0		//Push the int constant  (-1, 0, 1, 2, 3, 4 or 5) onto the operand stack.
         bipush		//The immediate byte is sign-extended to an int value. That value is pushed onto the operand stack.
        castore		//Store into char array
            dup		//Duplicate the top operand stack value
       iconst_1		//Push the int constant  (-1, 0, 1, 2, 3, 4 or 5) onto the operand stack.
         bipush		//The immediate byte is sign-extended to an int value. That value is pushed onto the operand stack.
        castore		//Store into char array
            dup		//Duplicate the top operand stack value
       iconst_2		//Push the int constant  (-1, 0, 1, 2, 3, 4 or 5) onto the operand stack.
         bipush		//The immediate byte is sign-extended to an int value. That value is pushed onto the operand stack.
        castore		//Store into char array
            dup		//Duplicate the top operand stack value
            ...
            ...
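
To make the listing above easier to read, here is a rough Java equivalent of what the array initializer expands to: allocate the array, then store one element at a time. This is a sketch for illustration only. The class and method names are hypothetical, and the comments show the initializer's instructions that each statement mirrors, not the exact byte codes the compiler would emit for these hand-written stores.

```java
import java.util.Arrays;

// Hypothetical illustration (not from the original notes).
final class ArrayInitSketch {

    // The array initializer from the original example.
    static char[] withInitializer() {
        return new char[] {' ', 'w', 'o', 'r', 'l', 'd', '!'};
    }

    // The same array built with one explicit store per element.
    static char[] withExplicitStores() {
        char[] a = new char[7]; // bipush 7; newarray char
        a[0] = ' ';             // dup; iconst_0; bipush 32;  castore
        a[1] = 'w';             // dup; iconst_1; bipush 119; castore
        a[2] = 'o';             // dup; iconst_2; bipush 111; castore
        a[3] = 'r';             // dup; iconst_3; bipush 114; castore
        a[4] = 'l';             // dup; iconst_4; bipush 108; castore
        a[5] = 'd';             // dup; iconst_5; bipush 100; castore
        a[6] = '!';             // dup; bipush 6; bipush 33;  castore
        return a;
    }

    public static void main(String[] args) {
        // Both approaches produce the same contents.
        System.out.println(Arrays.equals(withInitializer(), withExplicitStores()));
    }
}
```

Each element costs roughly four instructions, so the byte code for an array initializer grows with the length of the array.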

Which one is better? My guess is the first technique. I'll confirm this later, but for now here's some quick reasoning:

  • When literal data is stored in the constant pool (rather than in the byte codes), it is easier for the JVM to share that data.
  • Frequently used methods like 'toCharArray()' should be marked 'hot' and heavily optimized by the JIT.
  • Fewer byte codes are loaded, parsed, and JIT-compiled.
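
The data-sharing point can be checked in a small way. The sketch below (class and method names are hypothetical) relies on two things: identical string literals are interned and so refer to one String object, and 'toCharArray()' returns a newly allocated array on every call. So the sharing applies to the literal itself, not to the char arrays made from it.

```java
import java.util.Arrays;

// Hypothetical illustration (not from the original notes).
final class LiteralSharingSketch {

    static String first()  { return "hello"; }
    static String second() { return "hello"; }

    public static void main(String[] args) {
        // Identical literals are interned, so both methods return the same object.
        System.out.println(first() == second());

        // Each toCharArray() call allocates its own copy of the characters.
        char[] a = first().toCharArray();
        char[] b = second().toCharArray();
        System.out.println(a == b);
        System.out.println(Arrays.equals(a, b));
    }
}
```

This prints true, false, true. It says nothing about speed; it only shows what is and is not shared.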