PART I: Overview Material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler
PART II: Inside a Compiler
4 Syntax analysis
5 Contextual analysis
6 Run-time organization
7 Code generation
PART III: Conclusion
8 Interpretation
9 Review
The Java Virtual Machine
Supplementary material: Java's run-time organization and the Java Virtual Machine
What This Topic is About
We look at the JVM as an example of a real-world run-time system for a modern object-oriented programming language.
The JVM is probably the most common and widely used VM in the world, so you'll get a better idea of what a real VM looks like.
The JVM is an abstract machine.
What is the JVM architecture?
What is the structure of .class files?
How are JVM instructions executed?
What is the role of the constant pool in dynamic linking?
Also visit this site for more complete information about the JVM:
http://java.sun.com/docs/books/vmspec/2ndedition/html/VMSpecTOC.doc.html
Recap: Interpretive Compilers
Why?
A trade-off between fast(er) compilation and a reasonable run-time performance.
How?
Use an intermediate language that is
  more high-level than machine code => easier to compile to
  more low-level than the source language => easy to implement as an interpreter
Example: A Java Development Kit for machine M consists of
  a Java -> JVM compiler that runs on M
  a JVM interpreter that runs on M
Abstract Machines
An abstract machine implements an intermediate language in between the high-level language (e.g. Java) and the low-level hardware (e.g. Pentium).
High level: Java
  Java compiler (implemented in Java: machine independent)
JVM (.class files)
  JVM interpreter or JVM JIT compiler
Low level: Pentium
Abstract Machines
An abstract machine is intended specifically as a run-time system for a particular (kind of) programming language.
The JVM is a virtual machine for Java programs.
It directly supports object-oriented concepts such as classes, objects, methods, method invocation, etc.
Easy to compile Java to JVM
=> 1. easy to implement compiler
   2. fast compilation
Another advantage: portability
Class Files and Class File Format
External representation (platform independent): .class files, which are loaded into the JVM
Internal representation (implementation dependent): classes, objects, primitive types, arrays, strings, methods
The JVM is an abstract machine in the truest sense of the word.
The JVM specification does not give implementation details (these can depend on the target OS/platform, performance requirements, etc.).
The JVM specification defines a machine-independent class file format that all JVM implementations must support.
Data Types
The JVM (and Java) distinguishes between two kinds of types:
Primitive types:
  boolean: boolean
  numeric integral: byte, short, int, long, char
  numeric floating point: float, double
  internal, for exception handling: returnAddress
Reference types:
  class types
  array types
  interface types
Note: Primitive types are represented directly; reference types are represented indirectly (as pointers to array or class instances).
JVM: Runtime Data Areas
Besides OO concepts, the JVM also supports multi-threading. Threads are directly supported by the JVM.
=> Two kinds of runtime data areas:
1. shared between all threads
2. private to a single thread
Shared: garbage-collected heap, method area
Thread 1 (private): pc, Java stack, native method stack
Thread 2 (private): pc, Java stack, native method stack
Java Stacks
The JVM is a stack-based machine, much like TAM.
JVM instructions
  implicitly take arguments from the stack top
  put their result on the top of the stack
The stack is used to
  pass arguments to methods
  return a result from a method
  store intermediate results while evaluating expressions
  store local variables
This works similarly to (but not exactly the same as) what we previously discussed about stack-based storage allocation and routines.
Stack Frames
The Java stack consists of frames. The JVM specification does not say exactly how the stack and frames should be implemented.
The JVM specification specifies that a stack frame has areas for:
  a pointer to the run-time constant pool
  args + local vars
  the operand stack
A new call frame is created by executing some JVM instruction for invoking a method (e.g. invokevirtual, invokenonvirtual, ...).
The operand stack is initially empty, but grows and shrinks during execution.
Stack Frames
The role/purpose of each of the areas in a stack frame:
pointer to constant pool: used implicitly when executing JVM instructions that contain indexes into the constant pool (more about this later).
args + local vars: space where the arguments and local variables of a method are stored. This includes a space for the receiver (this) at position/offset 0.
operand stack: stack for storing intermediate results during the execution of the method. Initially it is empty. The maximum depth is known at compile time.
Stack Frames
One possible implementation uses registers such as SB, ST, and LB together with a dynamic link.
[Figure: the stack runs from SB up to ST. LB points to the current frame, which holds a dynamic link to the previous frame on the stack, a pointer to the run-time constant pool, the args + local vars area, and the operand stack.]
The JVM instructions store and load (for accessing args and locals) use addresses which are numbers from 0 to #args + #locals - 1.
JVM Interpreter
The core of a JVM interpreter is basically this:
do {
    byte opcode = fetch an opcode;
    switch (opcode) {
        case opCode1:
            fetch operands for opCode1;
            execute action for opCode1;
            break;
        case opCode2:
            fetch operands for opCode2;
            execute action for opCode2;
            break;
        case ...
    }
} while (more to do)
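The loop above can be made concrete with a deliberately tiny sketch. It is not how a production JVM is built: it handles only five opcodes (with their numeric values taken from the JVM specification), has no frames, locals, or constant pool, and the names MiniInterpreter and run are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the fetch/decode/execute loop. Only a handful of
// opcodes are handled; there are no frames, locals, or constant pool.
class MiniInterpreter {
    // Opcode values as listed in the JVM specification.
    static final int ICONST_1 = 4, BIPUSH = 16, IADD = 96, IMUL = 104, IRETURN = 172;

    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int pc = 0;                                // program counter
        while (true) {
            int opcode = code[pc++] & 0xFF;        // fetch an opcode
            switch (opcode) {
                case ICONST_1:
                    stack.push(1);                 // operand is part of the opcode
                    break;
                case BIPUSH:
                    stack.push((int) code[pc++]);  // operand is the byte after the opcode
                    break;
                case IADD: {
                    int b = stack.pop(), a = stack.pop(); // operands come from the stack
                    stack.push(a + b);
                    break;
                }
                case IMUL: {
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a * b);
                    break;
                }
                case IRETURN:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // bipush 6; bipush 7; imul; iconst_1; iadd; ireturn  computes 6 * 7 + 1
        byte[] code = {BIPUSH, 6, BIPUSH, 7, IMUL, ICONST_1, IADD, (byte) IRETURN};
        System.out.println(run(code)); // 43
    }
}
```

The three cases also preview the three kinds of operands: part of the opcode (iconst_1), bytes following the opcode (bipush), and values on the operand stack (iadd, imul).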
Instruction set: typed instructions!
JVM instructions are explicitly typed: there are different opcodes for instructions on integers, floats, arrays, reference types, etc.
This is reflected by a naming convention in the first letter of the opcode mnemonics.
Example: different types of load instructions
iload    integer load
lload    long load
fload    float load
dload    double load
aload    reference-type load
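As a small illustration of the naming convention, the sketch below maps the first character of a JVM type descriptor (I for int, J for long, F for float, D for double, L... or [ for references) to the letter that prefixes the load mnemonic. The names TypedMnemonics, typePrefix, and loadMnemonic are hypothetical. One detail the table does not show: boolean, byte, char, and short values are loaded with the int instructions.

```java
// Hypothetical helper: maps the first character of a type descriptor to the
// one-letter prefix of the typed load mnemonics (iload, lload, ...).
class TypedMnemonics {
    static char typePrefix(String descriptor) {
        switch (descriptor.charAt(0)) {
            case 'Z': case 'B': case 'C': case 'S': case 'I':
                return 'i'; // boolean, byte, char, and short are handled as int
            case 'J':
                return 'l'; // long
            case 'F':
                return 'f'; // float
            case 'D':
                return 'd'; // double
            case 'L': case '[':
                return 'a'; // class, interface, and array references
            default:
                throw new IllegalArgumentException("not a field descriptor: " + descriptor);
        }
    }

    static String loadMnemonic(String descriptor) {
        return typePrefix(descriptor) + "load";
    }

    public static void main(String[] args) {
        System.out.println(loadMnemonic("I"));                  // iload
        System.out.println(loadMnemonic("J"));                  // lload
        System.out.println(loadMnemonic("Ljava/lang/Object;")); // aload
    }
}
```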
Instruction set: kinds of operands
JVM instructions have three kinds of operands:
  from the top of the operand stack
  from the bytes following the opcode
  part of the opcode itself
Each instruction may have different forms supporting different kinds of operands.
Example: different forms of iload
Assembly code     Binary instruction code layout
iload_0           26
iload_1           27
iload_2           28
iload_3           29
iload n           21 n
wide iload n      196 21 n (n takes two bytes)
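The table can be read as a small encoding rule. The following sketch, with the made-up names IloadEncoder and encode, picks the shortest form of iload for a given local index; byte values are kept as ints so they print as unsigned numbers.

```java
import java.util.Arrays;

// Hypothetical encoder: picks the shortest form of iload for local index n.
class IloadEncoder {
    static int[] encode(int n) {
        if (n < 0 || n > 0xFFFF) {
            throw new IllegalArgumentException("local index out of range: " + n);
        }
        if (n <= 3) {
            return new int[] {26 + n};                // iload_0 .. iload_3: index is part of the opcode
        }
        if (n <= 0xFF) {
            return new int[] {21, n};                 // iload n: one index byte follows the opcode
        }
        return new int[] {196, 21, n >> 8, n & 0xFF}; // wide iload n: two index bytes
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(2)));   // [28]
        System.out.println(Arrays.toString(encode(5)));   // [21, 5]
        System.out.println(Arrays.toString(encode(300))); // [196, 21, 1, 44]
    }
}
```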
Instruction set: accessing arguments and locals
The arguments and locals area inside a stack frame is indexed 0, 1, 2, 3, ...
  args: indexes 0 .. #args - 1
  locals: indexes #args .. #args + #locals - 1
Instruction examples:
iload_1
istore_1
iload_3
astore_1
aload 5
fstore_3
aload_0
A load instruction takes something from the args/locals area and pushes it onto the top of the operand stack.
A store instruction pops something from the top of the operand stack and places it in the args/locals area.
Instruction set: non-local memory access
In the JVM, the contents of different kinds of memory can be accessed by different kinds of instructions.
  accessing locals and arguments: load and store instructions
  accessing fields in objects: getfield, putfield
  accessing static fields: getstatic, putstatic
Note: Static fields are a lot like global variables. They are allocated in the method area, where the code for methods and the representations of classes (including method tables) are also stored.
Note: getfield and putfield access memory in the heap.
Note: The JVM doesn't have anything similar to registers L1, L2, etc.
Instruction set: operations on numbers
Arithmetic
  add: iadd, ladd, fadd, dadd
  subtract: isub, lsub, fsub, dsub
  multiply: imul, lmul, fmul, dmul
  etc.
Conversion
  i2l, i2f, i2d, ...
  l2f, l2d, f2d, ...
  f2i, d2i, ...
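In Java source, the conversion instructions correspond to casts between numeric types. The sketch below wraps a few casts in methods named after the instruction javac emits for them (the class name Conversions is made up); the printed values follow from the Java language rules for casts.

```java
// Each cast below corresponds to the conversion instruction named in the
// method; the results follow from the Java rules for numeric casts.
class Conversions {
    static long i2l(int x) { return (long) x; }   // widening, always exact
    static float i2f(int x) { return (float) x; } // may lose precision for large values
    static int d2i(double x) { return (int) x; }  // rounds toward zero, saturates, NaN becomes 0
    static int f2i(float x) { return (int) x; }   // same rules as d2i

    public static void main(String[] args) {
        System.out.println(i2l(-1));         // -1
        System.out.println(i2f(16777217));   // 1.6777216E7 (the last digit is lost)
        System.out.println(d2i(3.99));       // 3
        System.out.println(d2i(-3.99));      // -3
        System.out.println(d2i(Double.NaN)); // 0
        System.out.println(d2i(1e20));       // 2147483647
        System.out.println(f2i(2.5f));       // 2
    }
}
```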
Instruction set
Operand stack manipulation
  pop, pop2, dup, dup2, swap, ...
Control transfer
  Unconditional: goto, jsr, ret, ...
  Conditional: ifeq, iflt, ifgt, if_icmpeq, ...
Instruction set
Method invocation:
  invokevirtual: the usual instruction for calling a method on an object.
  invokeinterface: same as invokevirtual, but used when the called method is declared in an interface (requires a different kind of method lookup).
  invokespecial: for calling things such as constructors, which are not dynamically dispatched (this instruction is also known as invokenonvirtual).
  invokestatic: for calling methods that have the static modifier (these methods are sent to a class, not to an object).
Returning from methods:
  return, ireturn, lreturn, areturn, freturn, ...
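To connect these instructions to source code, the sketch below annotates a few Java calls with the invoke instruction that javac typically emits for them. Shape, Circle, and InvokeKinds are hypothetical names; running javap -c on the compiled classes is one way to check the comments.

```java
// Sketch: which invoke instruction each kind of call compiles to.
interface Shape {
    double area();
}

class Circle implements Shape {
    private final double r;

    Circle(double r) {
        // the implicit super() call here is an invokespecial of Object's constructor
        this.r = r;
    }

    @Override
    public double area() {
        return Math.PI * r * r;
    }

    static Circle unit() {
        return new Circle(1.0);         // constructor call: new, dup, invokespecial
    }
}

class InvokeKinds {
    static String describe() {
        Circle c = Circle.unit();       // invokestatic: sent to the class, no receiver
        Shape s = c;
        double viaClass = c.area();     // invokevirtual: static type is a class
        double viaInterface = s.area(); // invokeinterface: static type is an interface
        return viaClass == viaInterface ? "same method" : "different methods";
    }

    public static void main(String[] args) {
        System.out.println(describe()); // same method
    }
}
```

Both area() calls end up in Circle.area at run time; only the lookup mechanism named by the instruction differs.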
Instruction set: Heap Memory Allocation
Create a new class instance (object):
  new
Create a new array:
  newarray: for creating arrays of primitive types.
  anewarray: for creating arrays of reference types.
  multianewarray: for creating multidimensional arrays.
Instructions and the Constant Pool
Many JVM instructions have operands which are indexes pointing to an entry in the so-called constant pool.
The constant pool contains all kinds of entries that represent symbolic references for linking. This is the way that instructions refer to things such as classes, interfaces, fields, methods, and constants such as string literals and numbers.
These are the kinds of constant pool entries that exist:
  Integer, Float, Long, Double, String, Utf8_info (Unicode characters)
  Class_info, Fieldref_info, Methodref_info, InterfaceMethodref_info, Name_and_Type_info
Instructions and the Constant Pool
Example: We examine the getfield instruction in detail.
Format: 180 indexbyte1 indexbyte2
The two index bytes select a Fieldref entry in the constant pool:
CONSTANT_Fieldref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}
class_index refers to:
Class_info {
    u1 tag;
    u2 name_index;          -> Utf8Info: fully qualified class name
}
name_and_type_index refers to:
CONSTANT_Name_and_Type_info {
    u1 tag;
    u2 name_index;          -> Utf8Info: name of field
    u2 descriptor_index;    -> Utf8Info: field descriptor
}
Instructions and the Constant Pool
That previous picture is rather complicated, so let's simplify it a little:
Format: 180 indexbyte1 indexbyte2
The index refers to a Fieldref entry, which in turn refers to:
  Class -> Utf8Info: fully qualified class name
  Name_and_Type -> Utf8Info: name of field
                -> Utf8Info: field descriptor
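The chain of indexes can be followed in a few lines of code. This is only a sketch under simplifying assumptions: real class files store these entries as tagged byte sequences, whereas here they are plain objects, and every class and method name below (ConstantPoolDemo, describeField, samplePool, and the entry classes) as well as the sample pool contents are made up for illustration.

```java
// Hypothetical in-memory model of the constant pool entries used by getfield.
class ConstantPoolDemo {
    static class Utf8 {
        final String value;
        Utf8(String value) { this.value = value; }
    }
    static class ClassInfo {
        final int nameIndex;
        ClassInfo(int nameIndex) { this.nameIndex = nameIndex; }
    }
    static class NameAndType {
        final int nameIndex, descriptorIndex;
        NameAndType(int n, int d) { nameIndex = n; descriptorIndex = d; }
    }
    static class Fieldref {
        final int classIndex, nameAndTypeIndex;
        Fieldref(int c, int nt) { classIndex = c; nameAndTypeIndex = nt; }
    }

    // Follows the index chain Fieldref -> Class / Name_and_Type -> Utf8.
    static String describeField(Object[] pool, int indexbyte1, int indexbyte2) {
        int index = (indexbyte1 << 8) | indexbyte2; // the two operand bytes form one index
        Fieldref ref = (Fieldref) pool[index];
        ClassInfo cls = (ClassInfo) pool[ref.classIndex];
        NameAndType nt = (NameAndType) pool[ref.nameAndTypeIndex];
        String className = ((Utf8) pool[cls.nameIndex]).value;
        String fieldName = ((Utf8) pool[nt.nameIndex]).value;
        String descriptor = ((Utf8) pool[nt.descriptorIndex]).value;
        return className + " " + fieldName + " " + descriptor;
    }

    static Object[] samplePool() {
        return new Object[] {
            null,                        // index 0 is not used for entries
            new Fieldref(2, 4),          // 1
            new ClassInfo(3),            // 2
            new Utf8("mypackage/Queue"), // 3
            new NameAndType(5, 6),       // 4
            new Utf8("i"),               // 5
            new Utf8("I"),               // 6
        };
    }

    public static void main(String[] args) {
        // getfield with indexbyte1 = 0 and indexbyte2 = 1 selects entry 1
        System.out.println(describeField(samplePool(), 0, 1)); // mypackage/Queue i I
    }
}
```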
Instructions and the Constant Pool
The format of the constant pool entries is part of the Java class file format.
Luckily, we have a Java assembler that allows us to write a kind of textual assembly code, which is then transformed into a binary .class file.
This assembler takes care of creating the constant pool entries for us.
When an instruction operand expects a constant pool entry, the assembler allows you to enter the entry in place, in an easy syntax.
Example:
getfield mypackage/Queue i I
Instructions and the Constant Pool
Fully qualified class names and descriptors in constant pool UTF8 entries.
1. Fully qualified class name: a package + class name string. Note that this uses / instead of . to separate each level along the path.
2. Descriptor: a string that defines a type for a method or field.
Java                    descriptor
boolean                 Z
int                     I
Object                  Ljava/lang/Object;
String[]                [Ljava/lang/String;
int foo(int,Object)     (ILjava/lang/Object;)I
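The descriptor rules are regular enough to write down as a short function. The sketch below is a hand-written illustration (the names Descriptors, of, and ofMethod are made up), not the code a JVM or compiler actually uses.

```java
// Hypothetical helper that builds descriptors from java.lang.Class objects.
class Descriptors {
    // Field descriptor for a Java type.
    static String of(Class<?> type) {
        if (type == boolean.class) return "Z";
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == short.class) return "S";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == float.class) return "F";
        if (type == double.class) return "D";
        if (type == void.class) return "V";
        if (type.isArray()) return "[" + of(type.getComponentType());
        // Class and interface types: L, the name with / instead of ., then ;
        return "L" + type.getName().replace('.', '/') + ";";
    }

    // Method descriptor: parameter descriptors in parentheses, then the return type.
    static String ofMethod(Class<?> returnType, Class<?>... params) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : params) {
            sb.append(of(p));
        }
        return sb.append(")").append(of(returnType)).toString();
    }

    public static void main(String[] args) {
        System.out.println(of(boolean.class));                         // Z
        System.out.println(of(Object.class));                          // Ljava/lang/Object;
        System.out.println(of(String[].class));                        // [Ljava/lang/String;
        System.out.println(ofMethod(int.class, int.class, Object.class)); // (ILjava/lang/Object;)I
    }
}
```

For method descriptors, the standard library offers the same result through java.lang.invoke.MethodType.toMethodDescriptorString().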
Linking
In general, linking is the process of resolving symbolic references in binary files.
Most programming language implementations have what we call separate compilation. Modules or files can be compiled separately and transformed into some binary format. But since these separately compiled files may have connections to other files, they have to be linked.
=> The binary file is not yet executable, because it has some kind of symbolic links in it that point to things (classes, methods, functions, variables, etc.) in other files/modules.
Linking is the process of resolving these symbolic links and replacing them by real addresses so that the code can be executed.
Loading and Linking in JVM
In the JVM, loading and linking of class files happens at run time, while the program is running!
Classes are loaded as needed.
The constant pool contains symbolic references that need to be resolved before a JVM instruction that uses them can be executed (this is the equivalent of linking).
In the JVM, a constant pool entry is resolved the first time it is used by a JVM instruction.
Example:
When a getfield is executed for the first time, the constant pool entry index in the instruction can be replaced by the offset of the field.
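Resolution on first use can be sketched as a cached lookup. The code below is an assumption-laden illustration, not a description of any particular JVM: objects are modelled as int arrays, the class layout is a simple map from field name to offset, and all names (LazyResolution, FieldEntry, QUEUE_LAYOUT, the field head) are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of lazy constant pool resolution: a symbolic field
// reference is turned into a numeric offset the first time it is used.
class LazyResolution {
    // Stand-in for a loaded class: field name -> offset within an object.
    static final Map<String, Integer> QUEUE_LAYOUT = new LinkedHashMap<>();
    static {
        QUEUE_LAYOUT.put("head", 0);
        QUEUE_LAYOUT.put("i", 1);
    }

    static class FieldEntry {
        final String fieldName; // the symbolic reference
        int offset = -1;        // -1 means "not yet resolved"
        int lookups = 0;        // counts how often the symbolic lookup ran

        FieldEntry(String fieldName) { this.fieldName = fieldName; }

        int resolve() {
            if (offset < 0) {   // first use: perform the symbolic lookup
                lookups++;
                Integer found = QUEUE_LAYOUT.get(fieldName);
                if (found == null) throw new NoSuchFieldError(fieldName);
                offset = found;
            }
            return offset;      // later uses: the cached offset
        }
    }

    // getfield on an object represented as an array of field values.
    static int getfield(int[] object, FieldEntry entry) {
        return object[entry.resolve()];
    }

    public static void main(String[] args) {
        int[] queue = {10, 42};                    // head = 10, i = 42
        FieldEntry entry = new FieldEntry("i");
        System.out.println(getfield(queue, entry)); // 42, resolves the entry
        System.out.println(getfield(queue, entry)); // 42, uses the cached offset
        System.out.println(entry.lookups);          // 1
    }
}
```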
Closing Example
As a closing example on the JVM, we will take a look at the compiled code of the following simple Java class declaration.
class Factorial {
    int fac(int n) {
        int result = 1;
        for (int i=2; i<n; i++) {
            result = result * i;
        }
        return result;
    }
}
Compiling and Disassembling
% javac Factorial.java
% javap -c -verbose Factorial
Compiled from Factorial.java
class Factorial extends java.lang.Object {
Factorial();
/* Stack=1, Locals=1, Args_size=1 */
int fac(int);
/* Stack=2, Locals=4, Args_size=2 */
}
Method Factorial()
0 aload_0
1 invokespecial #1 <Method java.lang.Object()>
4 return
Compiling and Disassembling ...
Args/locals area throughout: 0: this, 1: n, 2: result, 3: i
Method int fac(int)
// address: instruction    // operand stack after the instruction
   0 iconst_1              // stack: 1
   1 istore_2              // stack: (empty)
   2 iconst_2              // stack: 2
   3 istore_3              // stack: (empty)
   4 goto 14               // stack: (empty)
   7 iload_2               // stack: result
   8 iload_3               // stack: result i
   9 imul                  // stack: result*i
  10 istore_2              // stack: (empty)
  11 iinc 3 1              // stack: (empty)
  14 iload_3               // stack: i
  15 iload_1               // stack: i n
  16 if_icmplt 7           // stack: (empty)
  19 iload_2               // stack: result
  20 ireturn
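To see the listing for fac(int) in action, the sketch below encodes its instructions as bytes (opcode values from the JVM specification) and executes them with a minimal interpreter. The class FacBytecode and its run method are hypothetical and implement only the opcodes this one method needs. Because the loop condition in the source is i < n, the code multiplies the values 2 up to n - 1, so an argument of 5 yields 24.

```java
// Hypothetical mini interpreter that executes the bytes of fac(int).
class FacBytecode {
    static final byte[] FAC = {
        4,                                    //  0 iconst_1
        61,                                   //  1 istore_2
        5,                                    //  2 iconst_2
        62,                                   //  3 istore_3
        (byte) 167, 0, 10,                    //  4 goto 14       (offset +10)
        28,                                   //  7 iload_2
        29,                                   //  8 iload_3
        104,                                  //  9 imul
        61,                                   // 10 istore_2
        (byte) 132, 3, 1,                     // 11 iinc 3 1
        29,                                   // 14 iload_3
        27,                                   // 15 iload_1
        (byte) 161, (byte) 0xFF, (byte) 0xF7, // 16 if_icmplt 7   (offset -9)
        28,                                   // 19 iload_2
        (byte) 172                            // 20 ireturn
    };

    static int run(byte[] code, int n) {
        int[] locals = new int[4]; // Locals=4: this, n, result, i
        locals[1] = n;             // slot 0 (this) is not needed for the arithmetic
        int[] stack = new int[2];  // Stack=2: the maximum depth is known at compile time
        int sp = 0;                // number of values on the operand stack
        int pc = 0;
        while (true) {
            int at = pc;           // address of the current instruction
            int opcode = code[pc++] & 0xFF;
            switch (opcode) {
                case 4: stack[sp++] = 1; break;                    // iconst_1
                case 5: stack[sp++] = 2; break;                    // iconst_2
                case 27: case 28: case 29:                         // iload_1 .. iload_3
                    stack[sp++] = locals[opcode - 26]; break;
                case 61: case 62:                                  // istore_2, istore_3
                    locals[opcode - 59] = stack[--sp]; break;
                case 104: {                                        // imul
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case 132: {                                        // iinc index const
                    int index = code[pc++] & 0xFF;
                    locals[index] += code[pc++];
                    break;
                }
                case 161: {                                        // if_icmplt
                    int offset = (short) (((code[pc] & 0xFF) << 8) | (code[pc + 1] & 0xFF));
                    pc += 2;
                    int b = stack[--sp], a = stack[--sp];
                    if (a < b) pc = at + offset;                   // branch is relative to the opcode
                    break;
                }
                case 167: {                                        // goto
                    int offset = (short) (((code[pc] & 0xFF) << 8) | (code[pc + 1] & 0xFF));
                    pc = at + offset;
                    break;
                }
                case 172: return stack[--sp];                      // ireturn
                default: throw new IllegalStateException("unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(FAC, 5)); // 24
    }
}
```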
Writing Factorial in jasmin
Jasmin is a Java Assembler Interface. It takes ASCII descriptions for Java classes, written in a simple assembler-like syntax and using the Java Virtual Machine instruction set. It converts them into binary Java class files suitable for loading into a JVM implementation.
.class package Factorial
.super java/lang/Object
.method package <init>()V
    .limit stack 50
    .limit locals 1
    aload_0
    invokenonvirtual java/lang/Object/<init>()V
    return
.end method
Writing Factorial in jasmin (continued)
.method package fac(I)I
    .limit stack 50
    .limit locals 4
    iconst_1
    istore 2
    iconst_2
    istore 3
Label_1:
    iload 3
    iload 1
    if_icmplt Label_4
    iconst_0
    goto Label_5
Label_4:
    iconst_1
Label_5:
    ifeq Label_2
    iload 2
    iload 3
    imul
    dup
    istore 2
    pop
Label_3:
    iload 3
    dup
    iconst_1
    iadd
    istore 3
    pop
    goto Label_1
Label_2:
    iload 2
    ireturn
    iconst_0
    ireturn
.end method