
The Java class file lifestyle

An introduction to the basic structure and lifestyle of the Java class file

By Bill Venners, JavaWorld.com, 07/01/96

Welcome to another installment of "Under the Hood." In last month's article I discussed
the Java Virtual Machine, or JVM, the abstract computer for which all Java programs are
compiled. If you are unfamiliar with the JVM, you may want to read last month's article
before this one. In this article I provide a glimpse into the basic structure and lifestyle of
the Java class file.

Born to travel

The Java class file is a precisely defined format for compiled Java. Java source code is
compiled into class files that can be loaded and executed by any JVM. The class files
may travel across a network before being loaded by the JVM.

In fact, if you are reading this article via a Java-capable browser, class files for the
simulation applet at the end of the article are flying across the Internet to your computer
right now. If you'd like to listen in on them (and your computer has audio capability),
push the following button:

Sounds like they're having fun, huh? That's in their nature. Java class files were designed
to travel well. They are platform-independent, so they will be welcome in more places.
They contain bytecodes, the compact instruction set for the JVM, so they can travel light.
Java class files are constantly zipping through networks at breakneck speed to arrive at
JVMs all over the world.

What's in a class file?

The Java class file contains everything a JVM needs to know about one Java class or
interface. In their order of appearance in the class file, the major components are:
magic, version, constant pool, access flags, this class, super class, interfaces, fields,
methods, and attributes.

Information stored in the class file often varies in length -- that is, the actual length of the
information cannot be predicted before loading the class file. For instance, the number of
methods listed in the methods component can differ among class files, because it depends
on the number of methods defined in the source code. Such information is organized in
the class file by prefacing the actual information by its size or length. This way, when the
class is being loaded by the JVM, the size of variable-length information is read first.
Once the JVM knows the size, it can correctly read in the actual information.
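
The size-first layout described above can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name CountedArrayDemo, the method readCountedArray, and the sample bytes are hypothetical, and each item is assumed to be a two-byte unsigned value.

```java
import java.nio.ByteBuffer;

class CountedArrayDemo {
    // Reads a two-byte unsigned count, then that many two-byte unsigned items.
    static int[] readCountedArray(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;     // the size is read first
        int[] items = new int[count];
        for (int i = 0; i < count; i++) {
            items[i] = in.getShort() & 0xFFFF;  // items follow with no padding
        }
        return items;
    }

    public static void main(String[] args) {
        // Hypothetical input: a count of 2, followed by the items 5 and 9.
        byte[] bytes = {0x00, 0x02, 0x00, 0x05, 0x00, 0x09};
        int[] items = readCountedArray(ByteBuffer.wrap(bytes));
        System.out.println(items.length + " items: " + items[0] + ", " + items[1]);
    }
}
```

Because the count arrives before the data, the reader never has to guess where the array ends.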

Information is generally written to the class file with no space or padding between
consecutive pieces of information; everything is aligned on byte boundaries. This helps
keep class files petite so they will be aerodynamic as they fly across networks.

The order of class file components is strictly defined so JVMs can know what to expect,
and where to expect it, when loading a class file. For example, every JVM knows that the
first eight bytes of a class file contain the magic and version numbers, that the constant
pool starts on the ninth byte, and that the access flags follow the constant pool. But
because the constant pool is variable-length, it doesn't know the exact whereabouts of the
access flags until it has finished reading in the constant pool. Once it has finished reading
in the constant pool, it knows the next two bytes will be the access flags.

Magic and version numbers

The first four bytes of every class file are always 0xCAFEBABE. This magic number
makes Java class files easier to identify, because the odds are slim that non-class files
would start with the same initial four bytes. The number is called magic because it can be
pulled out of a hat by the file format designers. The only requirement is that it is not
already being used by another file format that may be encountered in the real world.
According to Patrick Naughton, a key member of the original Java team, the magic
number was chosen "long before the name Java was ever uttered in reference to this
language. We were looking for something fun, unique, and easy to remember. It is only a
coincidence that 0xCAFEBABE, an oblique reference to the cute baristas at Peet's
Coffee, was foreshadowing for the name Java."

The second four bytes of the class file contain the major and minor version numbers.
These numbers identify the version of the class file format to which a particular class file
adheres and allow JVMs to verify that the class file is loadable. Every JVM has a
maximum version it can load, and JVMs will reject class files with later versions.
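
A header check along these lines might look like the following sketch. The class name HeaderCheckDemo, the method isLoadable, and the maximum supported version of 45.3 are assumptions chosen for illustration, and the comparison rule (major first, then minor) is one plausible reading of "later versions" rather than a statement of what any particular JVM does.

```java
import java.nio.ByteBuffer;

class HeaderCheckDemo {
    static final int MAGIC = 0xCAFEBABE;
    // Assumption: this hypothetical JVM supports class file versions up to 45.3.
    static final int MAX_MAJOR = 45;
    static final int MAX_MINOR = 3;

    // Returns true if the first eight bytes look like a loadable class file header.
    static boolean isLoadable(byte[] header) {
        ByteBuffer in = ByteBuffer.wrap(header);   // ByteBuffer is big-endian by default
        if (in.remaining() < 8 || in.getInt() != MAGIC) {
            return false;                          // too short, or wrong magic number
        }
        int minor = in.getShort() & 0xFFFF;        // minor version is stored before major
        int major = in.getShort() & 0xFFFF;
        return major < MAX_MAJOR || (major == MAX_MAJOR && minor <= MAX_MINOR);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 3, 0, 45};
        System.out.println("loadable: " + isLoadable(header));
    }
}
```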

Constant pool

The class file stores constants associated with its class or interface in the constant pool.
Some constants that may be seen frolicking in the pool are literal strings, final variable
values, class names, interface names, variable names and types, and method names and
signatures. A method signature is its return type and set of argument types.

The constant pool is organized as an array of variable-length elements. Each constant
occupies one element in the array. Throughout the class file, constants are referred to by
the integer index that indicates their position in the array. The initial constant has an
index of one, the second constant has an index of two, etc. The constant pool array is
preceded by its array size, so JVMs will know how many constants to expect when
loading the class file.

Each element of the constant pool starts with a one-byte tag specifying the type of
constant at that position in the array. Once a JVM grabs and interprets this tag, it knows
what follows the tag. For example, if a tag indicates the constant is a string, the
JVM expects the next two bytes to be the string length. Following this two-byte length,
the JVM expects to find length number of bytes, which make up the characters of the
string.
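
The tag-then-length-then-bytes sequence for a string constant can be sketched as follows. The class and method names are hypothetical. The tag value 1 is the one the JVM specification assigns to string data, and the sketch assumes plain ASCII content, for which the class file's string encoding and standard UTF-8 agree.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class StringConstantDemo {
    static final int CONSTANT_UTF8 = 1;  // tag value that marks string data

    // Reads one string constant: a one-byte tag, a two-byte length, then the bytes.
    static String readStringConstant(ByteBuffer in) {
        int tag = in.get() & 0xFF;
        if (tag != CONSTANT_UTF8) {
            throw new IllegalArgumentException("not a string constant, tag = " + tag);
        }
        int length = in.getShort() & 0xFFFF;  // length in bytes, not characters
        byte[] chars = new byte[length];
        in.get(chars);                        // exactly 'length' bytes, no trailing null
        return new String(chars, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical element: tag 1, length 3, then the bytes of "Act".
        ByteBuffer buf = ByteBuffer.allocate(6);
        buf.put((byte) 1).putShort((short) 3).put("Act".getBytes(StandardCharsets.US_ASCII));
        buf.flip();
        System.out.println(readStringConstant(buf));
    }
}
```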

In the remainder of the article I'll sometimes refer to the nth element of the constant pool
array as constant_pool[n]. This makes sense to the extent the constant pool is organized
like an array, but bear in mind that these elements have different sizes and types and that
the first element has an index of one.

Access flags

The first two bytes after the constant pool, the access flags, indicate whether or not this
file defines a class or an interface, whether the class or interface is public or abstract, and
(if it's a class and not an interface) whether the class is final.
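
As a rough sketch, the two access-flag bytes can be decoded with bit masks. The mask values below come from the JVM specification rather than from this article, and the class name AccessFlagsDemo and the method describe are hypothetical.

```java
class AccessFlagsDemo {
    // Bit masks as given in the JVM specification.
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    // Turns the two-byte access flags value into a readable description.
    static String describe(int flags) {
        StringBuilder sb = new StringBuilder();
        if ((flags & ACC_PUBLIC) != 0) sb.append("public ");
        if ((flags & ACC_FINAL) != 0) sb.append("final ");
        if ((flags & ACC_ABSTRACT) != 0) sb.append("abstract ");
        sb.append((flags & ACC_INTERFACE) != 0 ? "interface" : "class");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0011));
        System.out.println(describe(0x0601));
    }
}
```

Each property occupies its own bit, so any combination fits in the same two bytes.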

This class

The next two bytes, the this class component, are an index into the constant pool array.
The constant referred to by this class, constant_pool[this_class], has two parts, a one-byte
tag and a two-byte name index. The tag will equal CONSTANT_Class, a value that
indicates this element contains information about a class or interface.
Constant_pool[name_index] is a string constant containing the name of the class or
interface.

The this class component provides a glimpse of how the constant pool is used. This class
itself is just an index into the constant pool. When a JVM looks up
constant_pool[this_class], it finds an element that identifies itself as a
CONSTANT_Class with its tag. The JVM knows CONSTANT_Class elements always
have a two-byte index into the constant pool, called name index, following their one-byte
tag. So it looks up constant_pool[name_index] to get the string containing the name of
the class or interface.
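
The two-step lookup can be sketched with a small in-memory model of the constant pool. Everything here is an assumption made for illustration: the pool is modeled as an Object array, ClassConstant is a hypothetical stand-in for a CONSTANT_Class element, and string constants are stored as plain Java strings.

```java
class ThisClassDemo {
    // Hypothetical in-memory form of a CONSTANT_Class element: only its name_index.
    static final class ClassConstant {
        final int nameIndex;
        ClassConstant(int nameIndex) { this.nameIndex = nameIndex; }
    }

    // Follows class index -> CONSTANT_Class -> name_index -> string.
    static String resolveClassName(Object[] constantPool, int classIndex) {
        Object entry = constantPool[classIndex];
        if (!(entry instanceof ClassConstant)) {
            throw new IllegalArgumentException(
                "constant_pool[" + classIndex + "] is not a CONSTANT_Class");
        }
        return (String) constantPool[((ClassConstant) entry).nameIndex];
    }

    public static void main(String[] args) {
        Object[] constantPool = new Object[3];   // index 0 is never used
        constantPool[1] = new ClassConstant(2);  // this_class would be 1 here
        constantPool[2] = "Act";
        System.out.println(resolveClassName(constantPool, 1));
    }
}
```

The same lookup serves the super class and interfaces components, since they also refer to CONSTANT_Class elements.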

Super class

Following the this class component is the super class component, another two-byte index
into the constant pool. Constant_pool[super_class] is a CONSTANT_Class element that
points to the name of the super class from which this class descends.

Interfaces

The interfaces component starts with a two-byte count of the number of interfaces
implemented by the class (or interface) defined in the file. Immediately following is an
array that contains one index into the constant pool for each interface implemented by the
class. Each interface is represented by a CONSTANT_Class element in the constant pool
that points to the name of the interface.

Fields

The fields component starts with a two-byte count of the number of fields in this class or
interface. A field is an instance or class variable of the class or interface. Following the
count is an array of variable-length structures, one for each field. Each structure reveals
information about one field such as the field's name, type, and, if it is a final variable, its
constant value. Some information is contained in the structure itself, and some is
contained in constant pool locations pointed to by the structure.

The only fields that appear in the list are those that were declared by the class or interface
defined in the file; no fields inherited from superclasses or superinterfaces appear in the
list.

Methods

The methods component starts with a two-byte count of the number of methods in the
class or interface. This count includes only those methods that are explicitly defined by
this class, not any methods that may be inherited from superclasses. Following the
method count are the methods themselves.

The structure for each method contains several pieces of information about the method,
including the method descriptor (its return type and argument list), the number of stack
words required for the method's local variables, the maximum number of stack words
required for the method's operand stack, a table of exceptions caught by the method, the
bytecode sequence, and a line number table.

Attributes

Bringing up the rear are the attributes, which give general information about the
particular class or interface defined by the file. The attributes section has a two-byte
count of the number of attributes, followed by the attributes themselves. For example,
one attribute is the source code attribute; it reveals the name of the source file from which
this class file was compiled. JVMs will silently ignore any attributes they don't recognize.
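
Skipping unrecognized attributes is straightforward because each attribute carries its own length. The sketch below relies on a layout taken from the JVM specification rather than from this article: a two-byte name index, a four-byte length, then that many bytes of data. The class name, the method name, and the String array standing in for the constant pool are hypothetical, and lengths are assumed to fit in a positive int.

```java
import java.nio.ByteBuffer;

class AttributeSkipDemo {
    // Returns the source file name, or null if no SourceFile attribute is present.
    static String findSourceFile(ByteBuffer in, String[] pool) {
        String sourceFile = null;
        int count = in.getShort() & 0xFFFF;               // two-byte attribute count
        for (int i = 0; i < count; i++) {
            String name = pool[in.getShort() & 0xFFFF];   // attribute name via the pool
            int length = in.getInt();                     // size of the attribute data
            if ("SourceFile".equals(name)) {
                sourceFile = pool[in.getShort() & 0xFFFF]; // data is one pool index
            } else {
                in.position(in.position() + length);      // silently skip the unknown
            }
        }
        return sourceFile;
    }

    public static void main(String[] args) {
        String[] pool = {null, "Mystery", "SourceFile", "Act.java"};
        ByteBuffer buf = ByteBuffer.allocate(19);
        buf.putShort((short) 2);                                          // two attributes
        buf.putShort((short) 1).putInt(3).put(new byte[] {9, 9, 9});      // unknown one
        buf.putShort((short) 2).putInt(2).putShort((short) 3);            // SourceFile
        buf.flip();
        System.out.println(findSourceFile(buf, pool));
    }
}
```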

Getting loaded: a simulation of a class file reaching its JVM destination

The applet below simulates a JVM loading a class file. The class file being loaded in the
simulation was generated by the javac compiler given the following Java source code:

class Act {
    public static void doMathForever() {
        int i = 0;
        while (true) {
            i += 1;
            i *= 2;
        }
    }
}

The above snippet of code comes from last month's article about the JVM. It is the same
doMathForever() method executed by the EternalMath applet from last month's article. I
chose this code to provide a real example that wasn't too complex. Although the code
may not be very useful in the real world, it does compile to a real class file, which is
loaded by the simulation below.

The GettingLoaded applet allows you to drive the class load simulation one step at a
time. For each step along the way you can read about the next chunk of bytes that is
about to be consumed and interpreted by the JVM. Just press the "Step" button to cause
the JVM to consume the next chunk. Pressing "Back" will undo the previous step, and
pressing "Reset" will return the simulation to its original state, allowing you to start over
from the beginning.

The JVM is shown at the bottom left consuming the stream of bytes that makes up the
class file Act.class. The bytes are shown in hex streaming out of a server on the bottom
right. The bytes travel right to left, between the server and the JVM, one chunk at a time.
The chunk of bytes to be consumed by the JVM on the next "Step" button press is
shown in red. These highlighted bytes are described in the large text area above the JVM.
Any remaining bytes beyond the next chunk are shown in black.

I've tried to fully explain each chunk of bytes in the text area. Because there is a lot
of detail in the text area, you may wish to skim through all the steps first to get the
general idea, then look back for more details.

Source Code
/********************************************************************

Copyright (c) 1996 Artima Software Company. All Rights Reserved.

PROJECT: JavaWorld
MODULE: Under The Hood
FILE: GettingLoaded.java
AUTHOR: Bill Venners, June 1996

DESCRIPTION:

This file contains all the code for the Java class load simulator that
accompanies the Under The Hood article titled, "The Java Class File
Lifestyle".
I developed this under Symantec Cafe on Windows 95. As I developed it I
had each class in its own file, which made for very speedy compile and test
cycles. I lumped all the files together into this file to make it easier
to download.

This applet retrieves two files from the server, the Act.class file
itself, from which it gets the bytes to display along the bottom, and a
text file which contains the text that accompanies each step. Each block
of text is separated by a line of stars which contains one star for each
byte consumed by the step.

*********************************************************************/

import java.awt.*;
import java.applet.*;
import java.io.InputStream;
import java.io.DataInputStream;
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.EOFException;
import java.net.URL;
import java.net.URLConnection;
import java.net.MalformedURLException;

public class GettingLoaded extends Applet
implements Runnable {

private URL theClassFileURL;
private URL theActTextURL;
private Thread runner;
private TextArea ta = new TextArea();
private StepNode firstNode;
private StepNode lastNode;
private StepNode currentNode;
private boolean ready = false;
private boolean jvmFinishedGobbling = false;
private int currentGobblePosition = 0;
private JVMPacman jvmPacman;
private String titleString = "Getting Loaded\n\n";
private boolean urlExceptionWasThrown = false;
private String cantGoFurtherString = "Unfortunately this means the "
+ "applet cannot go any further.\n";
private String ioErrorMsg = "An IO Error occurred while trying to "
+ "read a file from the server.\n";
private String securityErrorMsg = "A security exception occurred "
+ "while trying to read a file from the server.\n";
private String urlErrorMsg = "This HTML file contains a malformed "
+ "URL of a file required by this applet.\n";

public void init() {

super.init();

ta.setEditable(false);

setBackground(Color.blue);

String url = getParameter("classURL");


try { this.theClassFileURL = new URL(url); }
catch (MalformedURLException e) {
urlExceptionWasThrown = true;
ta.setText(titleString + "Bad URL: " + url + "\n\n" +
urlErrorMsg + cantGoFurtherString);
}
url = getParameter("textURL");
try { this.theActTextURL = new URL(url); }
catch (MalformedURLException e) {
urlExceptionWasThrown = true;
ta.setText(titleString + "Bad URL: " + url + "\n\n" +
urlErrorMsg + cantGoFurtherString);
}
ControlPanel controlPanel = new ControlPanel();
jvmPacman = controlPanel.getJVMPacman();
setLayout(new BorderLayout(5, 5));

ta.setBackground(Color.white);

add("North", new ColoredLabel("GETTING LOADED", Label.CENTER,
Color.cyan));
add("South", controlPanel);
add("Center", ta);
}

public boolean handleEvent(Event event) {


return super.handleEvent(event);
}

public boolean action(Event evt, Object arg) {


if (evt.target instanceof Button) {
String bname = (String) arg;
if (bname.equals("Reset")) {
if (ready) {
if (!currentNode.first()) {
currentNode = firstNode;
currentGobblePosition = 0;
jvmPacman.setGobblePosition(0,
currentNode.getByteCount());
ta.setText(currentNode.getString());
}
}
}
else if (bname.equals("Step")) {
if (ready) {
if (!currentNode.last()) {
currentGobblePosition +=
currentNode.getByteCount();
currentNode = currentNode.getNext();
jvmPacman.gobbleToPosition(currentGobblePosition,
currentNode.getByteCount());
ta.setText(currentNode.getString());
}
else {
if (!jvmFinishedGobbling) {
currentGobblePosition +=
currentNode.getByteCount();
jvmPacman.gobbleToPosition(currentGobblePosition, 0);
jvmFinishedGobbling = true;
ta.setText("(The End)");
}
}
}
}
else if (bname.equals("Back")) {
if (ready) {
if (!currentNode.first()) {
if (jvmFinishedGobbling) {
jvmFinishedGobbling = false;
currentGobblePosition -=
currentNode.getByteCount();
}
else {
currentNode = currentNode.getPrev();
currentGobblePosition -=
currentNode.getByteCount();
}
jvmPacman.setGobblePosition(currentGobblePosition,
currentNode.getByteCount());
ta.setText(currentNode.getString());
}
}
}
}
return true;
}

public Insets insets() {


return new Insets(5, 5, 5, 5);
}

public void start() {


if (runner == null && !ready && !urlExceptionWasThrown) {
runner = new Thread (this);
runner.start();
}
}

public void stop() {


if (runner != null) {
runner.stop();
runner = null;
}
}

public void run() {


InputStream conn = null;
DataInputStream data = null;
String line;
StringBuffer buf = new StringBuffer();

ta.setText(titleString + "Loading First Of Two Files...\n");

try {
conn = this.theClassFileURL.openStream();
data = new DataInputStream(new BufferedInputStream(conn));

try {
while (true) {
int unsignedByte = data.readUnsignedByte();
HexString hexStr = new HexString(unsignedByte, 2);
buf.append(hexStr.getString());
}
}
catch (EOFException e) {
jvmPacman.setText(buf.toString());
}
try {
ta.setText(titleString + "Loading Second Of Two Files...\n");
conn = this.theActTextURL.openStream();
data = new DataInputStream(new
BufferedInputStream(conn));
buf.setLength(0);

while ((line = data.readLine()) != null) {


if (line.length() > 0 && line.charAt(0) == '*') {
int starCount = line.length();
StepNode nextNode = new
StepNode(buf.toString(), starCount);
if (firstNode == null) {
firstNode = nextNode;
lastNode = nextNode;
}
else {
lastNode.setNext(nextNode);
nextNode.setPrev(lastNode);
lastNode = nextNode;
}
buf.setLength(0);
}
else {
buf.append(line + "\n");
}
}
ready = true;
currentNode = firstNode;
jvmPacman.setGobblePosition(0,
firstNode.getByteCount());
ta.setText(currentNode.getString());
}
catch (IOException e) {
ta.setText(titleString + "IO Error: " + e.getMessage()
+ "\n\n"
+ ioErrorMsg + cantGoFurtherString);
}
catch (SecurityException e) {
ta.setText(titleString + "Security Exception: " +
e.getMessage() + "\n\n"
+ securityErrorMsg + cantGoFurtherString);
}
}
catch (IOException e) {
ta.setText(titleString + "IO Error: " + e.getMessage() +
"\n\n"
+ ioErrorMsg + cantGoFurtherString);
}
catch (SecurityException e) {
ta.setText(titleString + "Security Exception: " +
e.getMessage() + "\n\n"
+ securityErrorMsg + cantGoFurtherString);
}
}
}

class ButtonPanel extends Panel {

ButtonPanel() {
setLayout(new GridLayout(3, 1, 5, 5));
setBackground(Color.blue);
Button b = new Button("Step");
b.setBackground(Color.lightGray);
add(b);
b = new Button("Back");
b.setBackground(Color.lightGray);
add(b);
b = new Button("Reset");
b.setBackground(Color.lightGray);
add(b);
}

public Insets insets() {


// top, left, bottom, right
return new Insets(0, 0, 0, 0);
}
}

// The string passed to the constructor and to addText must be one line to
// be printed out, excluding a closing return.
class JVMPacman extends Canvas {

private String theString;


private boolean stringValid = false;
private int currentGobblePosition;
private int interestingCharsCount;
private int charsThatFitBetweenRectanglesCount = 2;

JVMPacman() {
setBackground(Color.cyan);
}

void setText(String passedText) {

theString = passedText;
stringValid = true;
}

public Dimension minimumSize() {


return new Dimension(110, 60);
}
public Dimension preferredSize() {
return new Dimension(110, 60);
}

public void setGobblePosition(int pos, int interesting) {


// Multiply the passed position by two because the passed position
// represents a byte position whereas we want currentGobblePosition to
// represent a character position, and there are two hex characters for
// each byte.
currentGobblePosition = pos * 2;
interestingCharsCount = interesting * 2;
repaint();
}

public void gobbleToPosition(int pos, int interesting) {


currentGobblePosition = pos * 2;
interestingCharsCount = interesting * 2;
repaint();
}

public void paint(Graphics g) {

Font font = getFont();


FontMetrics fm = getFontMetrics(font);
int heightOfOneLine = fm.getHeight();

// Calculate x starting point


Dimension dim = new Dimension();

dim = size();

int xStartingPoint = 5;

// Calculate y starting point


int totalHeight = heightOfOneLine * 2;
int yStartingPoint = (dim.height - totalHeight) / 2;
if (yStartingPoint < 5) {
yStartingPoint = 5;
}

// Calculate width of JVM rectangle. This will be heightOfOneLine more than
// the stringWidth of "JVM" which I'll write in the middle of the rectangle.
// This will make the border around the "JVM" the same width and height on
// all sides and will equal heightOfOneLine / 2. I'll make the height of the
// rectangle heightOfOneLine * 2.
int jvmRectangleWidth = fm.stringWidth("JVM") + heightOfOneLine;

// Draw the filled rectangle


g.setColor(Color.green);
g.fillRoundRect(xStartingPoint, yStartingPoint,
jvmRectangleWidth, totalHeight,
5, 5);

// Give it a handsome black outline


g.setColor(Color.black);
g.drawRoundRect(xStartingPoint, yStartingPoint,
jvmRectangleWidth, totalHeight,
5, 5);

// Calculate width of Server rectangle. This will be heightOfOneLine more than
// the stringWidth of "Server" which I'll write in the middle of the rectangle.
// This will make the border around the "Server" the same width and height on
// all sides and will equal heightOfOneLine / 2. I'll make the height of the
// rectangle heightOfOneLine * 2.
int serverRectangleWidth = fm.stringWidth("Server") +
heightOfOneLine;

// Draw the filled rectangle. The x starting point is the width of the
// canvas minus the width of the server rectangle minus the 5 pixel margin.
int xStartingPointServerRect = dim.width - serverRectangleWidth
- 5;
g.setColor(Color.green);
g.fillRoundRect(xStartingPointServerRect, yStartingPoint,
serverRectangleWidth, totalHeight, 5, 5);

// Give this rectangle a handsome black outline


g.setColor(Color.black);
g.drawRoundRect(xStartingPointServerRect, yStartingPoint,
serverRectangleWidth, totalHeight, 5, 5);

int whiteRectangleWidth = xStartingPointServerRect -
jvmRectangleWidth - 5;
if (whiteRectangleWidth > 0) {
g.setColor(Color.white);
g.fillRect(jvmRectangleWidth + 5, yStartingPoint +
(heightOfOneLine / 2),
whiteRectangleWidth, heightOfOneLine);
}

// Draw "JVM" inside the rectangle


g.setColor(Color.black);
xStartingPoint += (heightOfOneLine / 2);
int ascent = fm.getAscent();
yStartingPoint += ascent + (heightOfOneLine / 2);
g.drawString("JVM", xStartingPoint, yStartingPoint);

// Draw "Server" inside the rectangle


int xStartingPointServerText = xStartingPointServerRect +
(heightOfOneLine / 2);
g.drawString("Server", xStartingPointServerText,
yStartingPoint);

// The string should be written so that it fits between the JVM and Server
// rectangles, leaving at least 5 pixels space between the rectangle and
// the string.
if (stringValid && currentGobblePosition < theString.length()) {

// First need to figure out how many characters will fit in
// the space between the two rectangles.
int xTextStartingPoint = jvmRectangleWidth + 10;
int xTextEndingPoint = xStartingPointServerRect - 5;
int pixelsAvailableBetweenRectangles = xTextEndingPoint -
xTextStartingPoint;
if (pixelsAvailableBetweenRectangles < 0) {
pixelsAvailableBetweenRectangles = 0;
}

// Initialize the number of characters to write as the number of
// remaining characters. This will be reduced below if this amount of
// characters doesn't fit.
int charsToWriteCount = theString.length() -
currentGobblePosition;

// Check to see if the string to be displayed already fits between the
// two rectangles. If so, we'll just use the total number of characters
// remaining as the number of characters to write.
int pixelWidthOfRemainingString =
fm.stringWidth(theString.substring(currentGobblePosition));
if (pixelWidthOfRemainingString >
pixelsAvailableBetweenRectangles) {

// The first while loop increments charsThatFitBetweenRectanglesCount
// until the width of the string in pixels just exceeds the available space.
String tryThisString =
theString.substring(currentGobblePosition,
currentGobblePosition +
charsThatFitBetweenRectanglesCount);
int pixelsEaten = fm.stringWidth(tryThisString);
while (pixelsEaten <= pixelsAvailableBetweenRectangles)
{
++charsThatFitBetweenRectanglesCount;
tryThisString =
theString.substring(currentGobblePosition,
currentGobblePosition +
charsThatFitBetweenRectanglesCount);
pixelsEaten = fm.stringWidth(tryThisString);
}

// The second while loop decreases the charsThatFit variable until the
// width of the string in pixels is just under the available width.
while (pixelsEaten > pixelsAvailableBetweenRectangles) {
--charsThatFitBetweenRectanglesCount;
tryThisString =
theString.substring(currentGobblePosition,
currentGobblePosition +
charsThatFitBetweenRectanglesCount);
pixelsEaten = fm.stringWidth(tryThisString);
}

charsToWriteCount = charsThatFitBetweenRectanglesCount;
}

// Draw the interesting characters in red.


g.setColor(Color.red);
int redCharsCount = interestingCharsCount;
if (redCharsCount > charsToWriteCount) {
redCharsCount = charsToWriteCount;
}
String redString =
theString.substring(currentGobblePosition,
currentGobblePosition + redCharsCount);
g.drawString(redString, xTextStartingPoint, yStartingPoint);

// Draw the remaining characters in black.


int blackStringStartingPosition = currentGobblePosition +
redCharsCount;
int blackCharsCount = charsToWriteCount - redCharsCount;
if (blackStringStartingPosition < theString.length()
&& blackCharsCount > 0) {

xTextStartingPoint += fm.stringWidth(redString);
g.setColor(Color.black);
g.drawString(theString.substring(blackStringStartingPosition,
blackStringStartingPosition + blackCharsCount),
xTextStartingPoint, yStartingPoint);
}
}
}
}

class ControlPanel extends Panel {

JVMPacman jvmPacman = new JVMPacman();

ControlPanel() {
setLayout(new BorderLayout(5, 5));
setBackground(Color.blue);
add("West", new ButtonPanel());
add("Center", jvmPacman);
}

public JVMPacman getJVMPacman() {


return jvmPacman;
}

public Insets insets() {


// top, left, bottom, right
return new Insets(0, 0, 0, 0);
}
}

class StepNode {

private String theString;


private StepNode next;
private StepNode prev;
private boolean nextValid = false;
private boolean prevValid = false;
private int byteCount = 0;

StepNode(String s, int bytes) {


theString = s;
byteCount = bytes;
}

String getString() {
return theString;
}

int getByteCount() {
return byteCount;
}

StepNode getNext() {
// Should probably throw an exception here if !nextValid
return next;
}

void setNext(StepNode n) {
next = n;
nextValid = true;
}
boolean last() {
return !nextValid;
}

StepNode getPrev() {
// Should probably throw an exception here if !prevValid
return prev;
}

void setPrev(StepNode n) {
prev = n;
prevValid = true;
}

boolean first() {
return !prevValid;
}
}

// I used this class because I can't seem to set the background color of
// a label. I only want a label, but I want the background to be gray.
class ColoredLabel extends Panel {

private Label theLabel;

ColoredLabel(String label, int alignment, Color color) {

setLayout(new GridLayout(1,1));

setBackground(color);

theLabel = new Label(label, alignment);

add(theLabel);
}

public void setLabelText(String s) {

theLabel.setText(s);
}

public Insets insets() {


return new Insets(0, 0, 0, 0);
}
}

class HexString {

private final String hexChar = "0123456789ABCDEF";


private StringBuffer buf = new StringBuffer();

void Convert(int val, int maxNibblesToConvert) {

buf.setLength(0);

int v = val;
for (int i = 0; i < maxNibblesToConvert; ++i) {

if (v == 0) {

if (i == 0) {
buf.insert(0, '0');
}
break;
}

// Get lowest nibble


int remainder = v & 0xf;

// Convert nibble to a character and insert it into the
// beginning of the string
buf.insert(0, hexChar.charAt(remainder));

// Shift the int to the right four bits


v >>>= 4;
}
}

HexString(int val, int minWidth) {

Convert(val, minWidth);

int charsNeeded = minWidth - buf.length();


for (int i = 0; i < charsNeeded; ++i) {
buf.insert(0, '0');
}
}

public String getString() {

return buf.toString();
}
}

ASCII FILE for previous source code


Step 1. Magic Number

hex bytes  name
---------  ----
CAFEBABE   magic

First the JVM must make sure that the class file starts with the proper
magic number. In this case our JVM will be happy because it will find
the CafeBabe magic right where it's supposed to be.

By the way, all numbers are stored in the class file in big-endian
order, which means the higher order bytes come first. The very first
byte of every class file, therefore, will be 0xCA.
****
Step 2. Version Numbers

hex   dec  name
---   ---  ----
0003    3  minor_version
002D   45  major_version

Next, the JVM must make sure that it recognizes and fully understands the
format of the class file being loaded. If either the major or minor version
number is higher than those version numbers for which this JVM was
implemented, the JVM must reject the class file. In this case, our JVM is
relieved to find that the file has major version 45 and minor version 3, of
which it has intimate knowledge.
****
Step 3. Constant Pool Count

0011 17 constant_pool_count

The next two bytes make up an unsigned short integer which indicates
the number of elements in the constant pool array. In this case the
constant pool will have 17 elements, but because the zeroth element
doesn't appear in the class file, the JVM will expect to find elements 1
through 16 next in the stream.
**
Step 4. constant_pool[1]

07 7 tag
000C 12 name_index

Each constant pool element contains a structure whose type is indicated
by the first byte of the element which is interpreted as a type tag.

The first constant pool entry is a CONSTANT_Class_info structure. The JVM
knows this because it finds a 7, which means CONSTANT_Class, in the tag
byte position. Aside from the tag the CONSTANT_Class_info structure has
only a name_index, which in this case is 12. This is an index into the
constant pool. The JVM doesn't know it yet, because it hasn't read in the
12th element of the constant pool, but the 12th element will be the string
"java/lang/Object", which is the name of the superclass of this class.
Because class Act doesn't explicitly descend from any other class, it by
default descends from class Object.

Note that this is the first constant pool element and that it already has
index 1. constant_pool[0] doesn't appear in the class file.
***
Step 5. constant_pool[2]

07 7 tag
000D 13 name_index

The 2nd constant pool entry is another CONSTANT_Class_info structure. The
JVM knows this because it again finds a 7, which still means CONSTANT_Class,
in the tag byte position. The name_index here, however, is 13.
constant_pool[13] will be the string "Act", although once again the JVM
doesn't know this yet because it hasn't read that far into the constant
pool. At this point the JVM only knows that when it does get around to the
13th constant pool element, that element will contain the name of the class
which is represented by the CONSTANT_Class_info structure occupying
constant_pool[2].

constant_pool[2] therefore represents the "this" class, class Act, which
this class file defines.
***
Step 6. constant_pool[3]

0A 10 tag
0001 1 class_index
0004 4 name_and_type_index

constant_pool[3] contains a CONSTANT_Methodref_info structure, indicated by
the tag of 10, which means CONSTANT_Methodref. A CONSTANT_Methodref_info
structure represents a method and records its name, type, and the class to
which it belongs. The first two bytes after the tag, the class_index, form
an index into the constant pool, in this case a 1. constant_pool[1]
represents the superclass Object, so this method should be declared in
Object.

The next two bytes make up the name_and_type_index, which is an index into
the constant pool pointing to a CONSTANT_NameAndType structure. In this case
name_and_type_index is 4 and constant_pool[4] defines "void <init>()".
Because class Act doesn't declare its own constructor, the javac compiler
wrote one of its own which calls <init>() in superclass Object.
*****
Step 7. constant_pool[4]

0C 12 tag
000E 14 name_index
0010 16 descriptor_index

constant_pool[4] is a CONSTANT_NameAndType structure identified by its
initial tag of 12, which stands for CONSTANT_NameAndType. Following the
tag byte is the name_index, which is 14. constant_pool[14] will be the
string "<init>". Next is the descriptor_index, which is 16.
constant_pool[16] will be the string "()V".

"<init>" is the name of the method being described. "()V" is the type.
In
plain Java it would look like "void <init>()". "()V" is a method
descriptor.
The "()" indicates that there are no arguments to "<init>". The "V"
indicates
the return type of "<init>" is void.
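
The three fixed-size entries from Steps 5 to 7 can be read with a simple
loop that switches on the tag byte. The following is only a sketch of
that idea, not how any particular JVM does it. The class name
ConstantPoolSketch and the describe() helper are my own inventions for
illustration, and only the three tags discussed so far are handled.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class ConstantPoolSketch {
    // Tag values as quoted in the walkthrough.
    static final int CONSTANT_CLASS = 7;
    static final int CONSTANT_METHODREF = 10;
    static final int CONSTANT_NAME_AND_TYPE = 12;

    /** Reads entries until the bytes run out and describes each one. */
    static List<String> describe(byte[] raw) {
        ByteBuffer in = ByteBuffer.wrap(raw);   // big-endian by default
        List<String> out = new ArrayList<>();
        while (in.hasRemaining()) {
            int tag = in.get() & 0xFF;          // one unsigned tag byte
            switch (tag) {
                case CONSTANT_CLASS:
                    out.add("Class name_index=" + u2(in));
                    break;
                case CONSTANT_METHODREF:
                    out.add("Methodref class_index=" + u2(in)
                            + " name_and_type_index=" + u2(in));
                    break;
                case CONSTANT_NAME_AND_TYPE:
                    out.add("NameAndType name_index=" + u2(in)
                            + " descriptor_index=" + u2(in));
                    break;
                default:
                    throw new IllegalArgumentException(
                            "tag not handled in this sketch: " + tag);
            }
        }
        return out;
    }

    /** Reads a two byte unsigned integer. */
    private static int u2(ByteBuffer in) {
        return in.getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        // constant_pool[2], [3] and [4] exactly as they appear in Act.class
        byte[] raw = { 0x07, 0x00, 0x0D,
                       0x0A, 0x00, 0x01, 0x00, 0x04,
                       0x0C, 0x00, 0x0E, 0x00, 0x10 };
        describe(raw).forEach(System.out::println);
    }
}
```

Note that nothing is resolved while reading: the indexes are simply
recorded, just as the JVM "doesn't know yet" what the later entries hold.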
*****
Step 8. constant_pool[5]

01 1 tag
000D 13 length
436F6E7374616E7456616C7565 "ConstantValue" bytes[length]

constant_pool[5] is a CONSTANT_Utf8_info structure, as identified by its
tag of 1, which means CONSTANT_Utf8. CONSTANT_Utf8 is how constant
strings are encoded in the constant pool. The UTF-8 format allows any
16 bit character to be represented by 1, 2, or 3 bytes. The characters
'\u0001' to '\u007F' occupy only one byte. Other characters require 2 or
3 bytes; however, these characters are expected to appear less
frequently than the characters that require only 1 byte. By this means
the full 16 bit Unicode character set can be supported by the class file
without using as much space as would likely be used by making each
character 16 bits long.

The tag byte of a CONSTANT_Utf8_info structure is followed by a length,
in this case 13, which indicates the length in bytes of the UTF-8 format
string that follows. In this case the string spells out
"ConstantValue". This string is used internally by the class file in a
manner that will be described later in the file loading simulation.

There is no trailing null character in CONSTANT_Utf8_info.bytes, as that
would unnecessarily waste bandwidth.
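
To make the tag, length, bytes layout concrete, here is a small sketch
that decodes one CONSTANT_Utf8_info entry from its raw bytes. The class
Utf8EntrySketch and its hex() helper are hypothetical names of mine.
Using DataInputStream.readUTF() is my own shortcut: it happens to expect
a two byte length followed by that many bytes in the same modified UTF-8
encoding, so it fits the part of the entry after the tag.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class Utf8EntrySketch {
    /** Decodes one entry: tag (1 byte), length (2 bytes), then the bytes. */
    static String decode(byte[] entry) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();
            if (tag != 1) {
                throw new IllegalArgumentException("not CONSTANT_Utf8: " + tag);
            }
            // readUTF consumes the two byte length and the string bytes;
            // no terminating null is expected or consumed.
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Turns a hex dump such as "01000D43..." into bytes. */
    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // constant_pool[5] from Act.class; prints "ConstantValue"
        System.out.println(decode(hex("01000D436F6E7374616E7456616C7565")));
    }
}
```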
****************
Step 9. constant_pool[6]

01 1 tag
000D 13 length
646F4D617468466F7265766572 "doMathForever" bytes[length]

constant_pool[6] is a CONSTANT_Utf8_info structure which contains the
string "doMathForever". This is the name of the lone method defined in
class Act.
****************
Step 10. constant_pool[7]

01 1 tag
000A 10 length
457863657074696F6E73 "Exceptions" bytes[length]

constant_pool[7] is a CONSTANT_Utf8_info structure which contains the
string "Exceptions".
*************
Step 11. constant_pool[8]

01 1 tag
000F 15 length
4C696E654E756D6265725461626C65 "LineNumberTable" bytes[length]

constant_pool[8] is a CONSTANT_Utf8_info structure which contains the
string "LineNumberTable".
******************
Step 12. constant_pool[9]

01 1 tag
000A 10 length
536F7572636546696C65 "SourceFile" bytes[length]

constant_pool[9] is a CONSTANT_Utf8_info structure which contains the
string "SourceFile".
*************
Step 13. constant_pool[10]

01 1 tag
000E 14 length
4C6F63616C5661726961626C6573 "LocalVariables" bytes[length]

constant_pool[10] is a CONSTANT_Utf8_info structure which contains the
string "LocalVariables".
*****************
Step 14. constant_pool[11]

01 1 tag
0004 4 length
436F6465 "Code" bytes[length]

constant_pool[11] is a CONSTANT_Utf8_info structure which contains the
string "Code".
*******
Step 15. constant_pool[12]

01 1 tag
0010 16 length
6A6176612F6C616E672F4F626A656374 "java/lang/Object" bytes[length]

constant_pool[12] is a CONSTANT_Utf8_info structure which contains the
string "java/lang/Object". This is the fully qualified class name of
"java.lang.Object" with the dots changed to slashes. (The slashes are
there in place of the dots for historical reasons.)
*******************
Step 16. constant_pool[13]

01 1 tag
0003 3 length
416374 "Act" bytes[length]

constant_pool[13] is a CONSTANT_Utf8_info structure which contains the
string "Act". This is the name of the class which is being defined by
this file.
******
Step 17. constant_pool[14]

01 1 tag
0006 6 length
3C696E69743E "<init>" bytes[length]

constant_pool[14] is a CONSTANT_Utf8_info structure which contains the
string "<init>", the name of a method in the superclass, Object.
*********
Step 18. constant_pool[15]

01 1 tag
000B 11 length
736E697065742E6A617661 "snipet.java" bytes[length]

constant_pool[15] is a CONSTANT_Utf8_info structure which contains the
string "snipet.java", the name of the source file in which class Act was
defined.
**************
Step 19. constant_pool[16]

01 1 tag
0003 3 length
282956 "()V" bytes[length]

constant_pool[16] is a CONSTANT_Utf8_info structure which contains the
string "()V", a method descriptor which translates to a method that
takes no arguments and returns void.
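
A descriptor such as "()V" can be turned back into something that looks
like plain Java with a short recursive reader. The sketch below is one
way to do it. DescriptorSketch and render() are names I made up, and the
type letters beyond the "V" seen in this class file (I for int, J for
long, L...; for class types, [ for arrays, and so on) come from the JVM
specification rather than from anything in Act.class.

```java
import java.util.ArrayList;
import java.util.List;

final class DescriptorSketch {
    private final String d;   // the descriptor being read
    private int pos;          // current position within d

    private DescriptorSketch(String d) {
        this.d = d;
    }

    /** Renders a name plus descriptor, e.g. "<init>" and "()V". */
    static String render(String name, String descriptor) {
        DescriptorSketch p = new DescriptorSketch(descriptor);
        if (p.d.charAt(p.pos++) != '(') {
            throw new IllegalArgumentException(descriptor);
        }
        List<String> params = new ArrayList<>();
        while (p.d.charAt(p.pos) != ')') {
            params.add(p.type());
        }
        p.pos++;                          // skip ')'
        String returnType = p.type();
        return returnType + " " + name + "(" + String.join(", ", params) + ")";
    }

    /** Reads one type starting at pos and advances past it. */
    private String type() {
        char c = d.charAt(pos++);
        switch (c) {
            case 'V': return "void";
            case 'Z': return "boolean";
            case 'B': return "byte";
            case 'C': return "char";
            case 'S': return "short";
            case 'I': return "int";
            case 'J': return "long";
            case 'F': return "float";
            case 'D': return "double";
            case '[': return type() + "[]";     // array of the next type
            case 'L': {                         // class name up to ';'
                int end = d.indexOf(';', pos);
                String cls = d.substring(pos, end).replace('/', '.');
                pos = end + 1;
                return cls;
            }
            default:
                throw new IllegalArgumentException("bad descriptor char: " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("<init>", "()V"));   // void <init>()
    }
}
```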
******
Step 20. Access Flags

0000 access_flags

The access flags are a two byte unsigned integer composed by bitwise
ORing the bitmasks of individual flags that represent modifiers of the
class or interface defined by this file. For example, ACC_PUBLIC is
0x0001 and ACC_FINAL is 0x0010. A class declared to be both public and
final would have its access flags set to 0x0011, or
(ACC_PUBLIC | ACC_FINAL).

In this case no access_flags are set because the class being defined,
class Act, was not declared to be public, final, or abstract.

Classes which are declared with these modifiers would have the
appropriate bitmasks from ACC_PUBLIC, ACC_FINAL, and ACC_ABSTRACT ORed
together to make the resultant access_flags. Also, if a class file
defines an interface and not a class, then an ACC_INTERFACE bit is set
in access_flags. This is how the JVM knows whether a class or an
interface is being defined by the file.
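
Testing for a flag is the mirror image of setting it: AND the mask with
access_flags and see whether anything is left. Here is a minimal sketch.
The class ClassFlagsSketch is hypothetical, the values of ACC_PUBLIC and
ACC_FINAL are the ones quoted above, and the values I use for
ACC_INTERFACE (0x0200) and ACC_ABSTRACT (0x0400) are taken from the JVM
specification.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassFlagsSketch {
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_INTERFACE = 0x0200;   // from the JVM specification
    static final int ACC_ABSTRACT  = 0x0400;   // from the JVM specification

    /** Lists the class-level modifiers whose bits are set. */
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0)    names.add("public");
        if ((accessFlags & ACC_FINAL) != 0)     names.add("final");
        if ((accessFlags & ACC_INTERFACE) != 0) names.add("interface");
        if ((accessFlags & ACC_ABSTRACT) != 0)  names.add("abstract");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(ACC_PUBLIC | ACC_FINAL));  // [public, final]
        System.out.println(decode(0x0000));                  // [] (class Act)
    }
}
```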
**
Step 21. This Class and Super Class

0002 2 this_class
0001 1 super_class

this_class is a two byte unsigned integer index into the constant pool,
where constant_pool[this_class] is a CONSTANT_Class_info structure
representing the class defined by this file. In our case it is the
CONSTANT_Class_info structure for class Act.

super_class is a two byte unsigned integer index into the constant pool,
where constant_pool[super_class] is a CONSTANT_Class_info structure
representing the superclass from which the class defined by this file
descends. In our case it is the CONSTANT_Class_info structure for class
Object.
****
Step 22. Interfaces Count and Fields Count

0000 0 interfaces_count
0000 0 fields_count

interfaces_count is a two byte unsigned integer which indicates the
number of interfaces implemented by this class. Because class Act
implements no interfaces, interfaces_count is zero in this case.

fields_count is a two byte unsigned integer which indicates the number
of fields (class or instance variables) declared by this class. Because
class Act has no class or instance variables, fields_count is zero in
our case.
****
Step 23. Methods Count

0002 2 methods_count

methods_count is a two byte unsigned integer indicating the number of
methods defined by this class. The count does not include any methods
inherited from superclasses, only those methods explicitly defined in
this class.

In this case methods_count is 2 because doMathForever() is defined in
the source file and the constructor Act() is defined by the compiler.

The JVM will expect an array of 2 method_info structures to immediately
follow this methods_count.
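
Steps 20 through 23 boil down to six two byte reads in a row. The sketch
below reads exactly the twelve bytes shown in those steps. The class
ClassHeaderSketch is my own invention, and it leans on an assumption
that only holds for files like Act.class: because interfaces_count and
fields_count are both zero, no interface or field entries sit between
the counts.

```java
import java.nio.ByteBuffer;

final class ClassHeaderSketch {
    final int accessFlags;
    final int thisClass;
    final int superClass;
    final int interfacesCount;
    final int fieldsCount;
    final int methodsCount;

    /** Reads the fields that follow the constant pool, in file order. */
    ClassHeaderSketch(byte[] raw) {
        ByteBuffer in = ByteBuffer.wrap(raw);   // big-endian by default
        accessFlags = u2(in);
        thisClass = u2(in);
        superClass = u2(in);
        interfacesCount = u2(in);
        if (interfacesCount != 0) {
            // A full reader would consume interfacesCount two byte indexes here.
            throw new IllegalArgumentException("sketch assumes no interfaces");
        }
        fieldsCount = u2(in);
        if (fieldsCount != 0) {
            // A full reader would consume fieldsCount field_info structures here.
            throw new IllegalArgumentException("sketch assumes no fields");
        }
        methodsCount = u2(in);
    }

    /** Reads a two byte unsigned integer. */
    private static int u2(ByteBuffer in) {
        return in.getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        // 0000 0002 0001 0000 0000 0002, as in Act.class
        byte[] raw = { 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 2 };
        ClassHeaderSketch h = new ClassHeaderSketch(raw);
        System.out.println("this_class=" + h.thisClass
                + " super_class=" + h.superClass
                + " methods_count=" + h.methodsCount);
    }
}
```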
**
Step 24. doMathForever()'s Access Flags, Name Index, and Descriptor Index

0009 access_flags
0006 6 name_index
0010 16 descriptor_index

This is the beginning of the array of method_info structures that
immediately follows the methods_count. The first method_info structure,
methods[0], gives information about the doMathForever() method. The
second method_info structure, methods[1], gives information about the
Act() constructor.

The first three parts of methods[0] are shown here. access_flags gives
the modifiers with which the method was declared. In this case
access_flags is 0x0009, which equates to (ACC_PUBLIC | ACC_STATIC). If
you look back at the source code, the doMathForever() method is indeed
declared public and static.

The name_index indicates the constant pool entry where the name of the
method is stored. In this case name_index is 6 and constant_pool[6] is
indeed the UTF-8 string "doMathForever".

The descriptor_index indicates the constant pool entry where the
descriptor of this method is stored. In this case descriptor_index is 16
and constant_pool[16] is the string "()V". This descriptor string
indicates that doMathForever() is a method which takes no arguments and
returns void.
******
Step 25. doMathForever() Attributes

0001 1 attributes_count

methods[0].attributes[0]
000B 11 attribute_name_index
00000030 48 length

Following the descriptor_index of a method is its attributes_count. In
this case there is only 1 attribute for the doMathForever() method.

The list of attributes follows the attributes_count. Here there is only
one attribute, attributes[0]. The first two bytes of the attribute are
the attribute_name_index, the index of the constant pool entry which
contains the name of the attribute. In this case attribute_name_index is
11 and constant_pool[11] is "Code". Therefore this is the Code attribute
of doMathForever() which will contain, among a few other items, the
actual bytecode sequence for the method.

The length word indicates the number of bytes in the attribute, in this
case 48 bytes. This means 48 bytes will follow this length word.
********
Step 26. doMathForever() Max Stack and Max Locals

0002 2 max_stack
0001 1 max_locals

Max stack is a two byte unsigned integer that indicates the maximum
number of entries on the JVM's operand stack at any point in the method.
Max locals is a two byte unsigned integer that indicates the number of
local variable slots used by the method. Each local variable slot is a
four byte word.

In this case, max_stack is 2. If you look back at the Eternal Math
applet from last month's article and watch the operand stack as the JVM
executes the doMathForever() bytecodes, you will see that the operand
stack never has more than two words in it. The max_locals is 1 in this
case, because doMathForever() takes no arguments and has only one
variable, i. If you look back at the Eternal Math applet you can see
that there is only one word in the local variables section of
doMathForever()'s stack frame, the integer i.
****
Step 27. doMathForever() Bytecodes and Exception Table

0000000C 12 code_length
033B8400011A05683BA7FFF9 code[code_length]
0000 0 exception_table_length

code_length indicates the number of bytes in the bytecode sequence for
this method. The actual bytecodes follow the code_length. In this case
the doMathForever() method bytecode sequence fills 12 bytes. If you look
back to the Eternal Math applet from last month's article you will see
that these are the bytecodes being executed in that simulation.

pc instruction mnemonic
-- ----------- --------
0 03 iconst_0
1 3B istore_0
2 840001 iinc 0 1
5 1A iload_0
6 05 iconst_2
7 68 imul
8 3B istore_0
9 A7FFF9 goto 2

exception_table_length indicates the number of exceptions that are
caught in the method. In the case of doMathForever(),
exception_table_length is zero because no exceptions are explicitly
caught in this method. For methods that do catch exceptions, an array of
exception handler structures immediately follows the
exception_table_length.

One other thing you may notice at this point is that these bytecodes
haven't been assigned a place in memory yet. This is done by the JVM
when it loads the class file. Therefore, the value I called pc is really
an offset from the address at which this bytecode sequence is loaded by
a JVM.
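
The listing in this step can be reproduced with a few lines of code,
which also shows where the "2" in "goto 2" comes from: the operand FFF9
is a signed 16 bit offset of -7, added to the pc of the goto itself
(9 - 7 = 2). The sketch below is deliberately tiny. TinyDisassembler is
a hypothetical name, and it knows only the six opcodes that appear in
doMathForever().

```java
import java.util.ArrayList;
import java.util.List;

final class TinyDisassembler {
    /** Returns one "pc mnemonic" line per instruction. */
    static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;   // offset from the start of the bytecode sequence
        while (pc < code.length) {
            int op = code[pc] & 0xFF;
            switch (op) {
                case 0x03: out.add(pc + " iconst_0"); pc += 1; break;
                case 0x05: out.add(pc + " iconst_2"); pc += 1; break;
                case 0x1A: out.add(pc + " iload_0");  pc += 1; break;
                case 0x3B: out.add(pc + " istore_0"); pc += 1; break;
                case 0x68: out.add(pc + " imul");     pc += 1; break;
                case 0x84:   // iinc: unsigned local index, signed constant
                    out.add(pc + " iinc " + (code[pc + 1] & 0xFF)
                            + " " + code[pc + 2]);
                    pc += 3;
                    break;
                case 0xA7: { // goto: signed 16 bit offset from this pc
                    int offset = (short) (((code[pc + 1] & 0xFF) << 8)
                            | (code[pc + 2] & 0xFF));
                    out.add(pc + " goto " + (pc + offset));
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException(
                            "opcode not handled in this sketch: " + op);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] code = { 0x03, 0x3B, (byte) 0x84, 0x00, 0x01, 0x1A,
                        0x05, 0x68, 0x3B, (byte) 0xA7, (byte) 0xFF, (byte) 0xF9 };
        disassemble(code).forEach(System.out::println);
    }
}
```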
******************
Step 28. Attributes of the doMathForever() Code Attribute

0001 1 attributes_count
0008 8 attribute_name_index
00000012 18 attribute_length
0004 4 line_number_table_length

The doMathForever() "Code" attribute itself has a "sub-attribute"
section; i.e., the attributes listed here belong to the "Code" attribute
of doMathForever().

attributes_count is 1, which indicates there is only one attribute
belonging to the "Code" attribute of doMathForever().
attribute_name_index is an 8, and constant_pool[8] is the string
"LineNumberTable". This is the name of the attribute. attribute_length
gives the length of attribute "LineNumberTable" as 18. The
line_number_table_length is 4, meaning that there will be 4
line_number_table entries immediately following.

If you look back at the snippet of Java code which defines
doMathForever() you'll see that there are only 4 lines of code which get
translated to bytecodes:

line 4: int i = 0;
line 5: while (true) {
line 6: i += 1;
line 7: i *= 2;

These four lines will be matched up with the corresponding bytecodes in
the line_number_table that follows.
**********
Step 29. Line Number Table for doMathForever()

hex dec name what it refers to
--- --- ---- -----------------
line_number_table[0]
0000 0 start_pc iconst_0, istore_0
0004 4 line_number int i = 0;

line_number_table[1]
0002 2 start_pc iinc 0 1
0006 6 line_number i += 1

line_number_table[2]
0005 5 start_pc iload_0, iconst_2, imul, istore_0
0007 7 line_number i *= 2

line_number_table[3]
0009 9 start_pc goto 2
0005 5 line_number while (true) {

The above table associates each line in the doMathForever() code with
its corresponding bytecode instruction. The start_pc value of each table
entry indicates the zero based byte position inside the doMathForever()
bytecode sequence. The line_number value of each table entry is the line
number of the source file that corresponds to the start_pc position in
the bytecode sequence.

The line number information is of general utility. It is useful to
debuggers, for example. It also allows the line number at which an
exception is thrown to be written to the Java console.
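
Going from a pc back to a source line is a matter of finding the table
entry with the largest start_pc that does not exceed the pc. The sketch
below does that for the doMathForever() table. LineNumberSketch and
lineFor() are hypothetical names, and the linear scan is just the
simplest approach I could write, not a claim about how any JVM or
debugger does the lookup.

```java
final class LineNumberSketch {
    // {start_pc, line_number} pairs copied from the table for doMathForever()
    static final int[][] DO_MATH_FOREVER = { {0, 4}, {2, 6}, {5, 7}, {9, 5} };

    /** Returns the source line covering pc, or -1 if no entry covers it. */
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int bestLine = -1;
        for (int[] entry : table) {
            // Keep the entry with the greatest start_pc that is <= pc.
            if (entry[0] <= pc && entry[0] > bestStart) {
                bestStart = entry[0];
                bestLine = entry[1];
            }
        }
        return bestLine;
    }

    public static void main(String[] args) {
        System.out.println(lineFor(DO_MATH_FOREVER, 7));   // imul -> line 7
        System.out.println(lineFor(DO_MATH_FOREVER, 9));   // goto -> line 5
    }
}
```

Because the scan does not assume the entries are sorted by start_pc, it
copes with the last entry pointing back to line 5.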

This is the last of the information in the class file about the
doMathForever()
method. Next will come the information about the other method of this
class,
the constructor Act().
****************
Step 30. Act()'s Access Flags, Name Index, and Descriptor Index

methods[1]
0000 access_flags
000E 14 name_index
0010 16 descriptor_index

This is the beginning of the second method_info structure, methods[1],
which gives information about the Act() constructor.

The first three parts of methods[1] are shown here. access_flags gives
the modifiers with which the method was declared. In this case
access_flags is 0x0000, which means it's not public, private, protected,
static, final, synchronized, native, or abstract. It's just a plain old
method.

The name_index indicates the constant pool entry where the name of the
method is stored. In this case name_index is 14 and constant_pool[14] is
indeed the UTF-8 string "<init>". "<init>" is a special internal method
name used to represent constructors.

The descriptor_index indicates the constant pool entry where the
descriptor of this method is stored. In this case descriptor_index is 16
and constant_pool[16] is the string "()V". This descriptor string
indicates that <init>() or Act() is a method which takes no arguments
and returns void.
******
Step 31. Act() Attributes

0001 1 attributes_count

000B 11 attribute_name_index
0000001D 29 attribute_length

Following the descriptor_index of a method is its attributes_count. In
this case there is only 1 attribute for the Act() method.

The list of attributes follows the attributes_count. Here there is only
one attribute, attributes[0]. The first two bytes of the attribute are
the attribute_name_index, the index of the constant pool entry which
contains the name of the attribute. In this case attribute_name_index is
11 and constant_pool[11] is "Code". Therefore this is the Code attribute
of Act() which will contain, among a few other items, the actual
bytecode sequence for the method.

The length word indicates the number of bytes in the attribute, in this
case 29 bytes. This means 29 bytes will follow this length word.
********
Step 32. Act() Max Stack and Max Locals

0001 1 max_stack
0001 1 max_locals

Max stack is a two byte unsigned integer that indicates the maximum
number of entries on the JVM's operand stack at any point in the method,
in this case 1. Max locals is a two byte unsigned integer that indicates
the number of local variable slots used by the method, in this case 1.
Each local variable slot is a four byte word.
****
Step 33. Act() Bytecodes and Exception Table

00000005 5 code_length
2AB70003B1 code[code_length]
0000 0 exception_table_length

code_length indicates the number of bytes in the bytecode sequence for
this method. The actual bytecodes follow the code_length. In this case
the Act() method bytecode sequence fills 5 bytes. This method was
generated by the compiler and doesn't appear in the Java source file for
class Act.

pc instruction mnemonic
-- ----------- --------
0 2A aload_0
1 B70003 invokenonvirtual #3 <Method java.lang.Object.<init>()V>
4 B1 return

If you look at the operand for the second instruction, invokenonvirtual
(bytecode B7), you will see a 0003. If you look at constant_pool[3]
you'll find the method reference constant which, through its name and
type entry, describes the <init> method of java.lang.Object.
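
Following that 0003 through the constant pool takes three hops:
Methodref to Class to Utf8 for the owner, and Methodref to NameAndType
to Utf8 for the name and descriptor. The sketch below walks those hops
over a hand-copied subset of the pool. MethodrefResolveSketch and its
maps are hypothetical, and the name_index of 12 for constant_pool[1] is
inferred from the fact that constant_pool[12] is the only entry holding
"java/lang/Object".

```java
import java.util.HashMap;
import java.util.Map;

final class MethodrefResolveSketch {
    // Hand-copied subset of the Act.class constant pool, keyed by index.
    static final Map<Integer, String> UTF8 = new HashMap<>();
    static final Map<Integer, Integer> CLASS_NAME_INDEX = new HashMap<>();
    static final Map<Integer, int[]> METHODREF = new HashMap<>();     // {class_index, name_and_type_index}
    static final Map<Integer, int[]> NAME_AND_TYPE = new HashMap<>(); // {name_index, descriptor_index}

    static {
        UTF8.put(12, "java/lang/Object");
        UTF8.put(14, "<init>");
        UTF8.put(16, "()V");
        CLASS_NAME_INDEX.put(1, 12);          // inferred, see lead-in
        METHODREF.put(3, new int[] {1, 4});
        NAME_AND_TYPE.put(4, new int[] {14, 16});
    }

    /** Reads the two byte unsigned operand that follows the opcode at pc. */
    static int operandIndex(byte[] code, int pc) {
        return ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
    }

    /** Resolves a Methodref index to "owner.name" plus descriptor. */
    static String resolveMethod(int index) {
        int[] ref = METHODREF.get(index);
        int[] nameAndType = NAME_AND_TYPE.get(ref[1]);
        String owner = UTF8.get(CLASS_NAME_INDEX.get(ref[0])).replace('/', '.');
        return owner + "." + UTF8.get(nameAndType[0]) + UTF8.get(nameAndType[1]);
    }

    public static void main(String[] args) {
        byte[] code = { 0x2A, (byte) 0xB7, 0x00, 0x03, (byte) 0xB1 };
        int index = operandIndex(code, 1);
        // prints: #3 -> java.lang.Object.<init>()V
        System.out.println("#" + index + " -> " + resolveMethod(index));
    }
}
```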

exception_table_length indicates the number of exceptions that are
caught in the method. In the case of Act(), exception_table_length is
zero because no exceptions are explicitly caught in this method. For
methods that do catch exceptions, an array of exception handler
structures immediately follows the exception_table_length.
***********
Step 34. Line Number Table Attribute of the Act() Code Attribute

hex dec name
--- --- ----
0001 1 attributes_count
0008 8 attribute_name_index
00000006 6 attribute_length
0001 1 line_number_table_length

The Act() "Code" attribute itself has a "sub-attribute" section; i.e.,
the attributes listed here belong to the "Code" attribute of Act().

attributes_count is 1, which indicates there is only one attribute
belonging to the "Code" attribute of Act(). attribute_name_index is an
8, and constant_pool[8] is the string "LineNumberTable". This is the
name of the attribute. attribute_length gives the length of attribute
"LineNumberTable" as 6. The line_number_table_length is 1, meaning that
there will be only 1 line_number_table entry immediately following.
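
The lengths quoted for the two methods (48 and 18 for doMathForever(),
29 and 6 for Act()) can be cross-checked by adding up the sizes of the
fields walked through so far. The sketch below does the sum.
AttributeLengthSketch is a hypothetical name, and the formula assumes
what is true of both methods here: no exception handlers and exactly one
LineNumberTable sub-attribute.

```java
final class AttributeLengthSketch {
    /** LineNumberTable body: a two byte count plus 4 bytes per entry. */
    static int lineNumberTableLength(int entries) {
        return 2 + 4 * entries;   // start_pc (2) + line_number (2) per entry
    }

    /** Code body with no handlers and one LineNumberTable sub-attribute. */
    static int codeAttributeLength(int codeBytes, int lineEntries) {
        int fixed = 2 + 2 + 4;        // max_stack, max_locals, code_length
        int exceptionTable = 2;       // exception_table_length only
        int subAttributeCount = 2;    // attributes_count
        int subAttributeHeader = 2 + 4; // attribute_name_index, attribute_length
        return fixed + codeBytes + exceptionTable + subAttributeCount
                + subAttributeHeader + lineNumberTableLength(lineEntries);
    }

    public static void main(String[] args) {
        System.out.println(codeAttributeLength(12, 4));   // 48, doMathForever()
        System.out.println(codeAttributeLength(5, 1));    // 29, Act()
    }
}
```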
**********
Step 35. Line Number Table for Act()

hex dec name what it refers to
--- --- ---- -----------------
0000 0 start_pc aload_0, invokenonvirtual #3, return
0002 2 line_number class Act {

The above table associates a line in the Java source file with the
starting
instruction of the Act() method. The start_pc value of the table entry
indicates the zero based byte position inside the Act() bytecode
sequence. The line_number value of the table entry is the line number of
the source file that corresponds to the start_pc position in the
bytecode
sequence.

This is the last of the information in the class file about the Act()
method, which was the last method to be described by this file. The last
section of the class file follows, the general class file attributes.
****
Step 36. General Attributes

0001 1 attributes_count

0009 9 attribute_name_index
00000002 2 attribute_length
000F 15 sourcefile_index

A class file can have any number of attributes at the end. In this case
attributes_count is a 1, so there is only 1 attribute here. The
attribute_name_index
is 9, and constant_pool[9] is the string "SourceFile". Therefore, this
is the
"SourceFile" attribute. attribute_length gives the length of the
attribute as 2,
which means that two bytes will follow the attribute_length field. The
last two
bytes are the sourcefile_index, which in this case is 15.
constant_pool[15] is
the string "snipet.java", which as it happens, is the name of the
source file
I compiled with javac to generate this file, Act.class.
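
Every attribute in the file, whether it hangs off a method, a Code
attribute, or the class itself, starts the same way: a two byte name
index followed by a four byte length. That is what lets a reader skip
over attributes it does not understand. Here is a sketch of reading one
attribute generically and then interpreting its body as a SourceFile
attribute. AttributeSketch is a hypothetical class of mine, and it
assumes the length fits in a positive Java int.

```java
import java.nio.ByteBuffer;

final class AttributeSketch {
    final int nameIndex;   // constant pool index of the attribute's name
    final byte[] info;     // the raw body, attribute_length bytes long

    /** Reads one attribute starting at the buffer's current position. */
    AttributeSketch(ByteBuffer in) {
        nameIndex = in.getShort() & 0xFFFF;
        int length = in.getInt();   // four byte length, assumed non-negative
        info = new byte[length];
        in.get(info);               // consume the body, understood or not
    }

    /** For a SourceFile attribute the body is one two byte pool index. */
    int sourceFileIndex() {
        if (info.length != 2) {
            throw new IllegalStateException("body is not SourceFile-shaped");
        }
        return ((info[0] & 0xFF) << 8) | (info[1] & 0xFF);
    }

    public static void main(String[] args) {
        // 0009 00000002 000F, the last eight bytes of Act.class
        byte[] raw = { 0x00, 0x09, 0x00, 0x00, 0x00, 0x02, 0x00, 0x0F };
        AttributeSketch a = new AttributeSketch(ByteBuffer.wrap(raw));
        System.out.println("name_index=" + a.nameIndex
                + " sourcefile_index=" + a.sourceFileIndex());
    }
}
```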
**********
