Information taken from http://www.sable.mcgill.ca/soot/tutorial/pldi03/tutorial.pdf

General Overview
  - Soot was developed by the Sable Research Group at McGill University in 1996-1997
  - Used to optimize Java bytecode
  - 4 source languages
  - 4 intermediate representations (IRs)

Source Languages
  - Primarily takes Java source as its input
  - Can also take:
      - SML
      - Scheme
      - Eiffel

IRs
  - Baf: streamlined, stack-based representation of bytecode
      - Abstracts type-dependent variations of an expression into a single expression
  - Jimple: stack-less, typed, 3-address representation of bytecode
      - A mix between Java source and Java bytecode
      - Linearizes a single compound expression into separate simple statements
      - Each statement refers to at most 3 local variables or constants
      - Only 15 Jimple instructions are used, compared to about 200 possible instructions in Java bytecode
  - Shimple: SSA-form version of Jimple, the main IR used
      - Each local variable has a single static point of definition (it is never reassigned)
      - Uses phi-nodes to merge definitions where control-flow paths join
  - Grimp: similar to Jimple, but allows trees of expressions
      - Includes a representation of the new operator
      - Expressions are aggregated

Phases of the Optimization

Analysis
  - Tested using 8 SPECjvm98 benchmarks running on JDK 1.2
  - Showed:
      - 8% improvement when the optimized bytecode is run using an interpreter
      - 21% improvement when the optimized bytecode is run using a JIT compiler
  - Used in research on traditional compiler analyses, analyses for software engineering, analyses for distributed programs, and software verification:
      - Ptolemy Project
      - Bandera
      - Canvas Project

Strengths and Future Enhancements
  - Used as a common infrastructure with which researchers can compare analyses
  - Enhancements coming:
      - Attribute management
      - Attribute legends
      - Improved visual attributes in source
      - Interactive CFGs
      - Growable graphical call graph
      - Making the conversion from Java to Jimple more stable and complete
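
Example: three-address linearization
  The Jimple notes above describe breaking a compound expression into simple statements that each touch at most three locals or constants. The sketch below illustrates that idea only. It does not use Soot's API and is not Soot's actual algorithm. The names ThreeAddressSketch, Expr, and linearize, and the $t temporary naming, are hypothetical and chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a toy lowering of an expression tree into
// three-address statements. Not part of Soot; all names are hypothetical.
class ThreeAddressSketch {

    // Minimal expression tree: a leaf holds a local name or constant,
    // an inner node holds a binary operator and two operands.
    static final class Expr {
        final String op;    // null for leaves
        final String leaf;  // null for inner nodes
        final Expr left;
        final Expr right;

        Expr(String leaf) {
            this.op = null;
            this.leaf = leaf;
            this.left = null;
            this.right = null;
        }

        Expr(String op, Expr left, Expr right) {
            this.op = op;
            this.leaf = null;
            this.left = left;
            this.right = right;
        }
    }

    private final List<String> statements = new ArrayList<>();
    private int tempCounter = 0;

    // Returns the name that holds the value of e, emitting one statement
    // per operator. Each emitted statement has one target and two operands.
    private String lower(Expr e) {
        if (e.op == null) {
            return e.leaf;
        }
        String l = lower(e.left);
        String r = lower(e.right);
        String temp = "$t" + tempCounter++;
        statements.add(temp + " = " + l + " " + e.op + " " + r);
        return temp;
    }

    // Lowers "target = e" into a list of three-address statements.
    static List<String> linearize(String target, Expr e) {
        ThreeAddressSketch sketch = new ThreeAddressSketch();
        String result = sketch.lower(e);
        sketch.statements.add(target + " = " + result);
        return sketch.statements;
    }

    public static void main(String[] args) {
        // x = (a + b) * (c - 1)
        Expr e = new Expr("*",
                new Expr("+", new Expr("a"), new Expr("b")),
                new Expr("-", new Expr("c"), new Expr("1")));
        for (String s : linearize("x", e)) {
            System.out.println(s);
        }
    }
}
```

  Running main on x = (a + b) * (c - 1) prints one statement per operator plus a final copy into x, so no statement mentions more than three names or constants. A real IR would also carry types on every local, which this sketch omits.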