What is a Compiler Design? Types, Construction Tools, Example
John Smith June 24, 2023

What is Compiler Design?

Compiler Design is the structure and set of defined principles that guide the translation, analysis, and optimization of the entire compiling process.


What is a Compiler?
A compiler is a computer program which helps you transform source
code written in a high-level language into low-level machine language. It
translates the code written in one programming language into another
language without changing the meaning of the code. The compiler also
makes the end code efficient, optimized for execution time and
memory space.

The compiling process includes basic translation mechanisms and error
detection. The compiler goes through lexical, syntax, and semantic
analysis at the front end, and code generation and optimization at the
back end.

In this Compiler Design tutorial, you will learn


What is a Compiler?
Features of Compilers
Types of Compiler
Tasks of Compiler
History of Compiler
Steps for Language processing systems
Compiler Construction Tools
Why use a Compiler?
Application of Compilers

Features of Compilers
Correctness
Speed of compilation
Preserves the correct meaning of the code
Speed of the target code
Recognizes legal and illegal program constructs
Good error reporting/handling
Code debugging help

Types of Compiler
Following are the different types of Compiler:

Single Pass Compilers
Two Pass Compilers
Multipass Compilers

Single Pass Compiler


In a single pass compiler, the source code is directly transformed into
machine code in one traversal. Pascal, for example, was designed so it
could be compiled in a single pass.

Two Pass Compiler

A two pass compiler is divided into two sections:

1. Front end: It maps legal code into an Intermediate Representation (IR).
2. Back end: It maps the IR onto the target machine.

The Two pass compiler method also simplifies the retargeting process. It
also allows multiple front ends.

Multipass Compilers

The multipass compiler processes the source code or syntax tree of a
program several times. It divides a large program into multiple small
programs and processes them, developing multiple intermediate codes.
Each pass takes the output of the previous phase as input, so it requires
less memory. It is also known as a 'Wide Compiler'.

Tasks of Compiler
The main tasks performed by the Compiler are:

Breaks up the source program into pieces and imposes grammatical
structure on them
Allows you to construct the desired target program from the
intermediate representation and also create the symbol table
Compiles source code and detects errors in it
Manages storage of all variables and code
Supports separate compilation
Reads and analyzes the entire program, and translates it to a
semantically equivalent program
Translates the source code into object code depending upon the
type of machine

History of Compiler
Important landmarks in the compiler's history are as follows:

The word "compiler" was first used in the early 1950s by Grace
Murray Hopper.
The first compiler was built by John Backus and his group
between 1954 and 1957 at IBM.
COBOL was the first programming language which was compiled on
multiple platforms, in 1960.
The study of scanning and parsing issues was pursued in the
1960s and 1970s to provide a complete solution.

Steps for Language processing systems

Before learning about the concept of compilers, you first need to
understand a few other tools which work with compilers.

Preprocessor
The preprocessor is considered a part of the compiler. It is a tool
which produces input for the compiler. It deals with macro
processing, augmentation, language extension, etc.

Interpreter
An interpreter, like a compiler, translates high-level language into
low-level machine language. The main difference between the two is
that an interpreter reads and transforms code line by line, whereas a
compiler reads the entire code at once and creates the machine code.

Assembler
The assembler translates assembly language code into machine
understandable language. The output of the assembler is known as an
object file, which is a combination of machine instructions as well as
the data required to place these instructions in memory.
Linker
The linker helps you to link and merge various object files to
create an executable file. These files might have been compiled by
separate assemblers. The main task of a linker is to search for
called modules in a program and to find out the memory locations
where all modules are stored.

Loader
The loader is a part of the OS, which performs the task of loading
executable files into memory and running them. It also calculates
the size of a program, which determines the additional memory space
needed.

Cross-compiler
A cross compiler in compiler design runs on one platform but
generates executable code for a different platform.

Source-to-source Compiler
A source-to-source compiler translates the source code of one
programming language into the source code of another language.

Compiler Construction Tools


Compiler construction tools were introduced as computer-related
technologies spread all over the world. They are also known as
compiler-compilers, compiler-generators, or translator-writing systems.

These tools use a specific language or algorithm for specifying and
implementing the components of the compiler. Following are
examples of compiler construction tools.

Scanner generators: This tool takes regular expressions as input.
For example, LEX for the Unix operating system.
Syntax-directed translation engines: These software tools produce
intermediate code by using the parse tree. The goal is to
associate one or more translations with each node of the parse
tree.
Parser generators: A parser generator takes a grammar as input
and automatically generates source code which can parse streams
of characters with the help of that grammar.
Automatic code generators: These take intermediate code and
convert it into machine language.
Data-flow engines: This tool is helpful for code optimization. Here,
information supplied by the user and the intermediate code are
compared to analyze any relations. It is also known as data-flow
analysis. It helps you to find out how values are transmitted from
one part of the program to another.

Why use a Compiler?


The compiler verifies the entire program, so there are no syntax or
semantic errors.
The executable file is optimized by the compiler, so it executes
faster.
Allows you to create internal structures in memory.
There is no need to execute the program on the same machine it
was built on.
Translates the entire program into another language.
Generates files on disk.
Links the files into an executable format.
Checks for syntax errors and data types.
Helps you to enhance your understanding of language semantics.
Helps to handle language performance issues.
Provides an opportunity for a non-trivial programming project.
The techniques used for constructing a compiler can be useful for
other purposes as well.

Application of Compilers
Compiler design helps in the full implementation of high-level
programming languages.
Supports optimization for computer architecture parallelism.
Aids the design of new memory hierarchies of machines.
Widely used for translating programs.
Used with other software productivity tools.

Summary
A compiler is a computer program which helps you transform
source code written in a high-level language into low-level machine
language.
Correctness, speed of compilation, and preserving the correct
meaning of the code are some important features of compiler
design.
Compilers are divided into three parts: 1) Single Pass Compilers,
2) Two Pass Compilers, and 3) Multipass Compilers.
The word "compiler" was first used in the early 1950s by Grace
Murray Hopper.
Steps for a language processing system are: Preprocessor,
Interpreter, Assembler, Linker/Loader.
Important compiler construction tools are 1) Scanner generators,
2) Syntax-directed translation engines, 3) Parser generators, 4)
Automatic code generators, 5) Data-flow engines.
The main task of the compiler is to verify the entire program, so
there are no syntax or semantic errors.

Phases of Compiler with Example:
Compilation Process & Steps
By John Smith Updated June 24, 2023

What are the Phases of Compiler Design?
A compiler operates in various phases; each phase transforms the
source program from one representation to another. Every phase takes
input from its previous stage and feeds its output to the next phase of
the compiler.
There are 6 phases in a compiler. Each of these phases helps in
converting the high-level language into machine code. The phases of a
compiler are:

1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Intermediate code generator
5. Code optimizer
6. Code generator
Phases of Compiler

All these phases convert the source code by dividing it into tokens,
creating parse trees, and optimizing the source code in different phases.

In this tutorial, you will learn:

What are the Phases of Compiler Design?


Phase 1: Lexical Analysis
Phase 2: Syntax Analysis
Phase 3: Semantic Analysis
Phase 4: Intermediate Code Generation
Phase 5: Code Optimization
Phase 6: Code Generation
Symbol Table Management
Error Handling Routine:

Phase 1: Lexical Analysis


Lexical Analysis is the first phase, in which the compiler scans the source
code. The scan proceeds left to right, character by character, grouping
the characters into tokens.
Here, the character stream from the source program is grouped into
meaningful sequences by identifying the tokens. The lexer makes entries
for the corresponding tokens in the symbol table and passes each token
to the next phase.
The primary functions of this phase are:

Identify the lexical units in a source code
Classify lexical units into classes like constants and reserved words,
and enter them in different tables; comments in the source program
are ignored
Identify tokens which are not a part of the language

Example:

x = y + 10

Tokens:

x    Identifier
=    Assignment operator
y    Identifier
+    Addition operator
10   Number
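The grouping above can be sketched as a tiny scanner. This is an illustrative sketch only: the token names and regular expressions here are our own, not part of any particular compiler.

```python
import re

# Ordered token patterns; the first pattern that matches wins.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN",     r"="),
    ("PLUS",       r"\+"),
    ("SKIP",       r"\s+"),   # whitespace is discarded, not tokenized
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token_kind, lexeme) pairs for `source`."""
    tokens = []
    for match in MASTER.finditer(source):
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
    return tokens

print(tokenize("x = y + 10"))
# [('IDENTIFIER', 'x'), ('ASSIGN', '='), ('IDENTIFIER', 'y'), ('PLUS', '+'), ('NUMBER', '10')]
```

Real lexer generators such as LEX build a single automaton from such patterns instead of trying them with a regular-expression engine.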
Phase 2: Syntax Analysis
Syntax analysis is all about discovering structure in code. It determines
whether or not the text follows the expected format. The main aim of this
phase is to make sure that the source code written by the programmer
follows the grammar of the language.
Syntax analysis is based on the rules of the specific programming
language; it constructs the parse tree with the help of tokens. It also
determines the structure of the source language and the grammar or
syntax of the language.
Here, is a list of tasks performed in this phase:

Obtain tokens from the lexical analyzer


Checks if the expression is syntactically correct or not
Report all syntax errors
Construct a hierarchical structure which is known as a parse tree

Example:
Any identifier/number is an expression.
If x is an identifier and y+10 is an expression, then x = y+10 is a statement.
Consider the parse tree for the following example:

(a+b)*c
In the Parse Tree:

Interior node: a record with an operator field and two fields for children
Leaf: a record with two or more fields; one for the token and the
others for information about the token
Phase 3: Semantic Analysis
Semantic analysis checks the semantic consistency of the code. It uses
the syntax tree of the previous phase along with the symbol table to
verify that the given source code is semantically consistent. It also
checks whether the code conveys an appropriate meaning.
The Semantic Analyzer will check for type mismatches, incompatible
operands, a function called with improper arguments, an undeclared
variable, etc.
Functions of the semantic analysis phase are:

Helps you to store the type information gathered and save it in the
symbol table or syntax tree
Allows you to perform type checking
In the case of a type mismatch, where there are no exact type
correction rules which satisfy the desired operation, a semantic error
is shown
Collects type information and checks for type compatibility
Checks if the source language permits the operands or not

Example

float x = 20.2;
float y = x*30;

In the above code, the semantic analyzer will typecast the integer 30 to
float 30.0 before multiplication.
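A toy version of this check can make the idea concrete. The function below is a simplified illustration with made-up type rules (int is implicitly widened to float; everything else is a mismatch), not a real semantic analyzer:

```python
def check_binary(op, left_type, right_type):
    """Toy semantic check for a binary expression: returns the result
    type, widening int to float, or raises on an incompatible pair."""
    if left_type == right_type:
        return left_type
    if {left_type, right_type} == {"int", "float"}:
        return "float"  # the int operand is typecast to float
    raise TypeError(f"incompatible operands for '{op}': {left_type}, {right_type}")

# float y = x * 30  ->  the int literal 30 is widened, result is float
print(check_binary("*", "float", "int"))  # float
```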

Phase 4: Intermediate Code Generation


Once the semantic analysis phase is over, the compiler generates
intermediate code for the target machine. It represents a program for
some abstract machine.
Intermediate code sits between the high-level and machine-level
languages. This intermediate code needs to be generated in such a
manner that it is easy to translate into the target machine code.
Functions of intermediate code generation:

It should be generated from the semantic representation of the


source program
Holds the values computed during the process of translation
Helps you to translate the intermediate code into target language
Allows you to maintain precedence ordering of the source language
It holds the correct number of operands of the instruction

Example
For example:

total = count + rate * 5

The intermediate code, using the three-address code method, is:

t1 := int_to_float(5)
t2 := rate * t1
t3 := count + t2
total := t3
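Such three-address code can be produced mechanically from an expression tree. The sketch below is a simplified illustration: it ignores the int_to_float conversion shown above, and the nested-tuple tree format and temporary naming are our own assumptions.

```python
from itertools import count

def gen_three_address(node, temps=None, code=None):
    """Emit three-address code for a nested tuple expression tree,
    e.g. ("+", "count", ("*", "rate", 5)). Returns (result_name, code);
    instructions are appended to `code` in evaluation order."""
    if temps is None:
        temps, code = count(1), []
    if not isinstance(node, tuple):          # leaf: a variable or constant
        return str(node), code
    op, left, right = node
    l, _ = gen_three_address(left, temps, code)
    r, _ = gen_three_address(right, temps, code)
    t = f"t{next(temps)}"                    # fresh temporary for this op
    code.append(f"{t} := {l} {op} {r}")
    return t, code

# total = count + rate * 5
result, code = gen_three_address(("+", "count", ("*", "rate", 5)))
code.append(f"total := {result}")
for line in code:
    print(line)
# t1 := rate * 5
# t2 := count + t1
# total := t2
```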
Phase 5: Code Optimization
The next phase is the optimization of the intermediate code. This phase
removes unnecessary code lines and arranges the sequence of statements
to speed up the execution of the program without wasting resources.
The main goal of this phase is to improve the intermediate code so as to
generate code that runs faster and occupies less space.
The primary functions of this phase are:

It helps you to establish a trade-off between execution and
compilation speed
Improves the running time of the target program
Generates streamlined code still in intermediate representation
Removes unreachable code and gets rid of unused variables
Moves statements that are not altered by the loop out of the loop

Example:
Consider the following code:

a = int_to_float(10)
b = c * a
d = e + b
f = d

It can become:

b = c * 10.0
f = e + b
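One of the transformations involved here, constant folding with simple propagation, can be sketched as a small pass. The (dest, op, a, b) instruction format is our own illustrative assumption, not a standard representation:

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def fold_constants(instructions):
    """One optimization pass: evaluate operations whose operands are
    already known constants, and substitute the results elsewhere."""
    known = {}        # variable name -> constant value
    optimized = []
    for dest, op, a, b in instructions:
        a = known.get(a, a)                  # propagate known constants
        b = known.get(b, b)
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            known[dest] = OPS[op](a, b)      # fold: no instruction emitted
        else:
            optimized.append((dest, op, a, b))
    return optimized

# a = 2 * 5; b = c * a   becomes   b = c * 10
print(fold_constants([("a", "*", 2, 5), ("b", "*", "c", "a")]))
# [('b', '*', 'c', 10)]
```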

Phase 6: Code Generation


Code generation is the last and final phase of a compiler. It gets its input
from the code optimization phase and produces the target code or object
code as a result. The objective of this phase is to allocate storage and
generate relocatable machine code.
It also allocates memory locations for the variables. The instructions in
the intermediate code are converted into machine instructions. This
phase converts the optimized intermediate code into the target
language.
The target language is the machine code. Therefore, all the memory
locations and registers are also selected and allotted during this phase.
The code generated by this phase is executed to take inputs and
generate the expected outputs.

Example:
a = b + 60.0
would possibly be translated to the register-based code:

MOVF b, R1
ADDF #60.0, R1
MOVF R1, a
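A toy instruction selector for statements of the form dest = src OP constant shows how such code might be emitted. The register choice and MOVF/ADDF-style mnemonics follow the example above, but the mapping is a simplified sketch, not a real instruction set:

```python
def gen_assembly(dest, src, op, constant):
    """Emit pseudo-assembly for `dest = src OP constant` using a single
    floating-point register (illustrative only)."""
    mnemonic = {"+": "ADDF", "*": "MULF"}[op]
    return [
        f"MOVF {src}, R1",              # load the variable into a register
        f"{mnemonic} #{constant}, R1",  # combine with the immediate constant
        f"MOVF R1, {dest}",             # store the result back to memory
    ]

for line in gen_assembly("a", "b", "+", 60.0):
    print(line)
# MOVF b, R1
# ADDF #60.0, R1
# MOVF R1, a
```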

Symbol Table Management


A symbol table contains a record for each identifier, with fields for the
attributes of the identifier. This component makes it easier for the
compiler to search the identifier record and retrieve it quickly. The
symbol table also helps you with scope management. The symbol
table and error handler interact with all the phases, and the symbol
table is updated correspondingly.

Error Handling Routine:


In the compiler design process, errors may occur in all of the below-given
phases:

Lexical analyzer: Wrongly spelled tokens
Syntax analyzer: Missing parenthesis
Intermediate code generator: Mismatched operands for an operator
Code optimizer: When the statement is not reachable
Code generator: When the memory is full or proper registers are not
allocated
Symbol tables: Errors of multiply declared identifiers

The most common errors are invalid character sequences in scanning,
invalid token sequences in parsing, and scope and type errors in
semantic analysis.
An error may be encountered in any of the above phases. After finding
an error, the phase needs to deal with it to continue with the
compilation process. These errors are reported to the error handler,
which handles them so that compilation can proceed. Generally, the
errors are reported in the form of a message.

Summary
A compiler operates in various phases; each phase transforms the
source program from one representation to another.
The six phases of compiler design are 1) Lexical analysis 2) Syntax
analysis 3) Semantic analysis 4) Intermediate code generation 5)
Code optimization 6) Code generation.
Lexical analysis is the first phase, in which the compiler scans the
source code.
Syntax analysis is all about discovering structure in text.
Semantic analysis checks the semantic consistency of the code.
Once the semantic analysis phase is over, the compiler generates
intermediate code for the target machine.
The code optimization phase removes unnecessary code lines and
arranges the sequence of statements.
The code generation phase gets its input from the code optimization
phase and produces the target code or object code as a result.
A symbol table contains a record for each identifier, with fields for
the attributes of the identifier.
The error handling routine handles and reports errors during many
phases.



© Copyright - Guru99 2023 Privacy Policy | Affiliate Disclaimer | ToS
Lexical Analysis (Analyzer) in Compiler
Design with Example
By John Smith Updated June 24, 2023

What is Lexical Analysis?
Lexical analysis is the very first phase in compiler design. A lexer takes
the modified source code, which is written in the form of sentences. In
other words, it converts a sequence of characters into a sequence of
tokens. The lexical analyzer breaks the syntax into a series of tokens. It
removes any extra spaces or comments written in the source code.

Programs that perform lexical analysis in compiler design are called
lexical analyzers or lexers. A lexer contains a tokenizer or scanner. If the
lexical analyzer detects that a token is invalid, it generates an error.
The role of the lexical analyzer in compiler design is to read character
streams from the source code, check for legal tokens, and pass the data
to the syntax analyzer when it demands it.

Example

How Pleasant Is The Weather?


In this lexical analysis example, we can easily recognize that
there are five words: How, Pleasant, Is, The, Weather. This is very natural
for us, as we can recognize the separators, blanks, and the punctuation
symbol.

HowPl easantIs Th ewe ather?

Now check this example; we can still read it. However, it takes
some time, because the separators are put in odd places. It is not
something which comes to you immediately.

In this tutorial, you will learn

Basic Terminologies:
Lexical Analyzer Architecture: How tokens are recognized
Roles of the Lexical analyzer
Lexical Errors
Error Recovery in Lexical Analyzer
Lexical Analyzer vs. Parser
Why separate Lexical and Parser?
Advantages of Lexical analysis
Disadvantage of Lexical analysis

Basic Terminologies
What’s a lexeme?
A lexeme is a sequence of characters that is included in the source
program according to the matching pattern of a token. It is nothing but
an instance of a token.

What’s a token?
Tokens in compiler design are the sequence of characters which
represents a unit of information in the source program.

What is Pattern?
A pattern is a description which is used by the token. In the case of a
keyword which uses as a token, the pattern is a sequence of characters.

Lexical Analyzer Architecture: How tokens are recognized
The main task of lexical analysis is to read input characters in the code
and produce tokens.

The lexical analyzer scans the entire source code of the program. It
identifies each token one by one. Scanners are usually implemented to
produce tokens only when requested by a parser. Here is how the
recognition of tokens in compiler design works:
Lexical Analyzer Architecture

1. “Get next token” is a command which is sent from the parser to the
lexical analyzer.
2. On receiving this command, the lexical analyzer scans the input
until it finds the next token.
3. It returns the token to Parser.

The Lexical Analyzer skips whitespace and comments while creating
these tokens. If any error is present, the Lexical Analyzer will correlate
that error with the source file and line number.

Roles of the Lexical analyzer


The lexical analyzer performs the below-given tasks:

Helps to identify tokens and enter them into the symbol table
Removes white spaces and comments from the source program
Correlates error messages with the source program
Helps you to expand macros if they are found in the source program
Reads input characters from the source program

Example of Lexical Analysis, Tokens, Non-Tokens


Consider the following code that is fed to the Lexical Analyzer:

#include <stdio.h>
int maximum(int x, int y) {
    // This will compare 2 numbers
    if (x > y)
        return x;
    else {
        return y;
    }
}


Examples of Tokens created

Lexeme       Token
int          Keyword
maximum      Identifier
(            Operator
int          Keyword
x            Identifier
,            Operator
int          Keyword
y            Identifier
)            Operator
{            Operator
if           Keyword

Examples of Nontokens
Type Examples

Comment // This will compare 2 numbers

Pre-processor directive #include <stdio.h>

Pre-processor directive #define NUMS 8,9

Macro NUMS

Whitespace \n \b \t

Lexical Errors
A character sequence which cannot be scanned into any valid token is
a lexical error. Important facts about lexical errors:

Lexical errors are not very common, but they should be managed by
a scanner
Misspellings of identifiers, operators, and keywords are considered
lexical errors
Generally, a lexical error is caused by the appearance of some illegal
character, mostly at the beginning of a token.
Error Recovery in Lexical Analyzer
Here are a few of the most common error recovery techniques:

Remove one character from the remaining input
In panic mode, successive characters are ignored until we reach a
well-formed token
Insert a missing character into the remaining input
Replace a character with another character
Transpose two adjacent characters

Lexical Analyzer vs. Parser


Lexical Analyser                          Parser
Scans the input program                   Performs syntax analysis
Identifies tokens                         Creates an abstract representation of the code
Inserts tokens into the Symbol Table      Updates symbol table entries
Generates lexical errors                  Generates a parse tree of the source code

Why separate Lexical and Parser?


The simplicity of design: It eases the process of lexical analysis and
syntax analysis by eliminating unwanted tokens
To improve compiler efficiency: A separate lexical analyzer helps you
to improve the efficiency of the compiler
Specialization: Specialized techniques can be applied to improve
the lexical analysis process
Portability: Only the scanner is required to communicate with the
outside world
Higher portability: Input-device-specific peculiarities are restricted to
the lexer

Advantages of Lexical analysis


The lexical analyzer method is used by programs like compilers,
which can use the parsed data from a programmer's code to create
a compiled binary executable
It is used by web browsers to format and display a web page with
the help of parsed data from JavaScript, HTML, and CSS
A separate lexical analyzer helps you to construct a specialized and
potentially more efficient processor for the task

Disadvantages of Lexical analysis

You need to spend significant time reading the source program and
partitioning it into tokens
Some regular expressions are quite difficult to understand
compared to PEG or EBNF rules
More effort is needed to develop and debug the lexer and its token
descriptions
Additional runtime overhead is required to generate the lexer tables
and construct the tokens

Summary
Lexical analysis is the very first phase in compiler design
Lexemes and tokens are the sequences of characters that are
included in the source program according to the matching pattern
of a token
The lexical analyzer is implemented to scan the entire source code of
the program
The lexical analyzer helps to identify tokens and enter them into the
symbol table
A character sequence which cannot be scanned into any valid
token is a lexical error
Removing one character from the remaining input is a useful error
recovery method
The lexical analyser scans the input program while the parser performs
syntax analysis
It eases the process of lexical analysis and syntax analysis by
eliminating unwanted tokens
The lexical analyzer is used by web browsers to format and display a
web page with the help of parsed data from JavaScript, HTML, and CSS
The biggest drawback of using a lexical analyzer is the additional
runtime overhead required to generate the lexer tables and construct
the tokens

Syntax Analysis: Compiler Top Down & Bottom Up Parsing Types
By John Smith Updated June 24, 2023

What is Syntax Analysis?
Syntax analysis is the second phase of the compiler design process, in
which the given input string is checked for conformance to the rules and
structure of the formal grammar. It analyses the syntactical structure
and checks if the given input is in the correct syntax of the programming
language or not.

Syntax analysis in the compiler design process comes after the lexical
analysis phase. Its output is known as the Parse Tree or Syntax Tree. The
parse tree is developed with the help of the pre-defined grammar of the
language. The syntax analyser also checks whether a given program
fulfills the rules implied by a context-free grammar. If it does, the
parser creates the parse tree of that source program. Otherwise, it
will display error messages.
Syntax Analyser Process

In this tutorial, you will learn

Why do you need Syntax Analyser?


Important Syntax Analyser Terminology
Why do we need Parsing?
Parsing Techniques
Error – Recovery Methods
Grammar:
Notational Conventions
Context Free Grammar
Grammar Derivation
Syntax vs. Lexical Analyser
Disadvantages of using Syntax Analysers
Why do you need Syntax Analyser?
It checks if the code is valid grammatically
The syntactical analyser helps you to apply rules to the code
Helps you to make sure that each opening brace has a
corresponding closing brace
Checks that each declaration has a type and that the type exists

Important Syntax Analyser Terminology

Important terminologies used in the syntax analysis process:

Sentence: A sentence is a group of characters over some alphabet.

Lexeme: A lexeme is the lowest level syntactic unit of a language
(e.g., total, start).

Token: A token is just a category of lexemes.

Keywords and reserved words – An identifier which is used
as a fixed part of the syntax of a statement. It is a reserved word
which you can't use as a variable name or identifier.

Noise words – Noise words are optional words which are inserted in a
statement to enhance the readability of the sentence.

Comments – A very important part of the documentation, mostly
delimited by /* */ or //.

Delimiters – A syntactic element which marks the start or end
of some syntactic unit, like a statement or expression: "begin"…"end",
or {}.

Character set – ASCII, Unicode.

Identifiers – Restrictions on the length of identifiers can reduce
the readability of the sentence.

Operator symbols – + and – perform two basic arithmetic
operations.
Syntactic elements of the Language

Why do we need Parsing?

A parser also checks that the input string is well-formed, and if not,
rejects it.
Following are important tasks performed by the parser in compiler design:

Helps you to detect all types of syntax errors
Finds the position at which an error has occurred
Provides a clear and accurate description of the error
Recovers from an error to continue and find further errors in the
code
Should not affect the compilation of "correct" programs
The parser must reject invalid texts by reporting syntax errors
Parsing Techniques
Parsing techniques are divided into two different groups:

Top-Down Parsing
Bottom-Up Parsing

Top-Down Parsing:
In top-down parsing, the construction of the parse tree starts at the root
and then proceeds towards the leaves.

Two types of Top-down parsing are:

1. Predictive Parsing:

A predictive parser can predict which production should be used to
replace the specific input string. The predictive parser uses a look-ahead
pointer, which points towards the next input symbols. Backtracking is
not an issue with this parsing technique. It is known as an LL(1) parser.

2. Recursive Descent Parsing:

This parsing technique recursively parses the input to build a parse tree.
It consists of several small functions, one for each nonterminal in the
grammar.
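A minimal recursive-descent parser for the expression grammar used earlier, (a+b)*c, can illustrate the "one function per nonterminal" idea. The grammar (expr → term ('+' term)*, term → factor ('*' factor)*, factor → NAME | '(' expr ')') and the nested-tuple output are simplifying assumptions of this sketch:

```python
def parse_expression(tokens):
    """Recursive-descent parse of a token list; returns a nested tuple
    representing the parse, e.g. ('*', ('+', 'a', 'b'), 'c')."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def expr():                      # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    def term():                      # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def factor():                    # factor -> NAME | '(' expr ')'
        nonlocal pos
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        name = peek()
        pos += 1
        return name

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"trailing input at position {pos}")
    return tree

print(parse_expression(["(", "a", "+", "b", ")", "*", "c"]))
# ('*', ('+', 'a', 'b'), 'c')
```

Note how each function mirrors one grammar rule, and the while loops encode the repetition in the rule, which also gives * higher precedence than + by construction.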

Bottom-Up Parsing:
In bottom-up parsing in compiler design, the construction of the
parse tree starts with the leaves, and then it proceeds towards the root.
It is also called shift-reduce parsing. This type of parsing in compiler
design is usually created with the help of software tools.

Error – Recovery Methods

Common errors that occur in parsing in system software:

Lexical: Name of an incorrectly typed identifier
Syntactical: Unbalanced parenthesis or a missing semicolon
Semantical: Incompatible value assignment
Logical: Infinite loop and unreachable code

A parser should be able to detect and report any error found in the
program. So, whenever an error occurs, the parser should be able to
handle it and carry on parsing the remaining input. A program can have
the following types of errors at various stages of the compilation
process. There are five common error-recovery methods which can be
implemented in the parser.

Statement mode recovery

When the parser encounters an error, it helps you to take
corrective steps. This allows the rest of the inputs and states to be
parsed ahead.
For example, adding a missing semicolon comes under the statement
mode recovery method. However, the parser designer needs to be
careful while making these changes, as one wrong correction may
lead to an infinite loop.

Panic-Mode recovery

When the parser encounters an error, this mode ignores the rest of
the statement and does not process input from the erroneous point
up to a delimiter, such as a semicolon. This is a simple error recovery
method.
In this type of recovery method, the parser rejects input symbols
one by one until a single designated group of synchronizing tokens
is found. The synchronizing tokens are generally delimiters, such as
a semicolon.
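Panic-mode recovery can be sketched in a few lines. The token list and the particular choice of synchronizing tokens (';' and '}') are illustrative assumptions:

```python
SYNCHRONIZING = {";", "}"}

def panic_mode_skip(tokens, position):
    """On a syntax error at `position`, discard tokens until a
    synchronizing delimiter, then resume parsing just past it."""
    while position < len(tokens) and tokens[position] not in SYNCHRONIZING:
        position += 1
    return min(position + 1, len(tokens))

# error detected at the first '@'; recovery resumes after the ';'
tokens = ["x", "=", "@", "@", ";", "y", "=", "2", ";"]
print(tokens[panic_mode_skip(tokens, 2):])  # ['y', '=', '2', ';']
```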

Phrase-Level Recovery:
The compiler corrects the program by inserting or deleting tokens.
This allows it to resume parsing from where it was. It performs the
correction on the remaining input: it can replace a prefix of the
remaining input with some string, which helps the parser continue
the process.

Error Productions
Error-production recovery expands the grammar of the language with
productions that generate the erroneous constructs. The parser then
performs error diagnostics about those constructs.

Global Correction:

The compiler should make as few changes as possible while processing an
incorrect input string. Given an incorrect input string a and a grammar,
the algorithm searches for a parse tree for a related string b such that
the number of insertions, deletions, and modifications of tokens needed
to transform a into b is as small as possible.
Grammar:
A grammar is a set of structural rules which describe a language.
Grammars assign structure to every sentence. The term also refers to the
study of these rules, and this field includes morphology, phonology, and
syntax. A grammar is capable of describing much of the syntax of
programming languages.

Rules of Form Grammar

The non-terminal symbol should appear on the left of at least one
production
The goal symbol should never appear to the right of the ::= of
any production
A rule is recursive if its LHS appears in its RHS

Notational Conventions
An optional element may be indicated by enclosing the element in square
brackets. An arbitrary sequence of instances of an element can be
indicated by enclosing the element in braces followed by an asterisk
symbol, { … }*.
A choice between alternatives may use the symbol | within a single rule,
enclosed in parentheses when needed.

Two types of notational conventions are Terminals and Non-terminals

1. Terminals:

Lower-case letters of the alphabet, such as a, b, c
Operator symbols such as +, -, *, etc.
Punctuation symbols such as parentheses, hash, comma
Digits 0, 1, …, 9
Boldface strings like id or if, anything which represents a single
terminal symbol

2. Nonterminals:

Upper-case letters such as A, B, C
Lower-case italic names, such as the name of an expression

Context Free Grammar


A context-free grammar (CFG) has productions whose left-hand side is a
single nonterminal; the rules in a context-free grammar are mainly
recursive. A syntax analyser checks whether a specific program satisfies
the rules of the context-free grammar or not. If it does meet these
rules, the syntax analyser may create a parse tree for that program.

expression -> expression + term
expression -> expression - term
expression -> term
term -> term * factor
term -> term / factor
term -> factor
factor -> ( expression )
factor -> id

Grammar Derivation
A grammar derivation is a sequence of grammar rules which transforms the
start symbol into a string. A derivation proves that the string belongs to
the grammar’s language.

Left-most Derivation

When the sentential form of the input is scanned and replaced in
left-to-right sequence, it is known as left-most derivation. The sentential
form derived by the left-most derivation is called the left-sentential form.

Right-most Derivation

A right-most derivation scans and replaces the input with production rules
from right to left. The sentential form derived from the right-most
derivation is known as the right-sentential form.
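Using the expression grammar above, a left-most derivation of the string id + id * id proceeds as follows, expanding the left-most nonterminal at each step:

```
expression ⇒ expression + term
           ⇒ term + term
           ⇒ factor + term
           ⇒ id + term
           ⇒ id + term * factor
           ⇒ id + factor * factor
           ⇒ id + id * factor
           ⇒ id + id * id
```

A right-most derivation of the same string would instead expand the right-most nonterminal first, but it would produce the same parse tree for this grammar.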

Syntax vs. Lexical Analyser


Syntax Analyser | Lexical Analyser

The syntax analyser mainly deals with recursive constructs of the
language. | The lexical analyser eases the task of the syntax analyser.

The syntax analyser works on tokens in a source program to recognize
meaningful structures in the programming language. | The lexical analyser
recognizes the tokens in a source program.

It receives its input, in the form of tokens, from the lexical
analyser. | It is responsible for the validity of the tokens it supplies
to the syntax analyser.

Disadvantages of using Syntax Analysers


It will never determine whether a token is valid or not
It does not help you determine whether an operation performed on a token
type is valid or not
It cannot decide whether a token is declared and initialized before it is
used

Summary
Syntax analysis is the second phase of the compiler design process
that comes after lexical analysis
The syntax analyser helps you to apply rules to the code
Sentence, Lexeme, Token, Keywords and reserved words, Noise
words, Comments, Delimiters, Character set, and Identifiers are some
important terms used in syntax analysis in compiler construction
The parser checks that the input string is well-formed, and if not,
rejects it
Parsing techniques are divided into two different groups: Top-Down
Parsing and Bottom-Up Parsing
Lexical, syntactical, semantical, and logical are some common
errors that occur during parsing
A grammar is a set of structural rules which describe a language
Notational conventions, such as optional elements, may be indicated by
enclosing the element in square brackets
In a context-free grammar, the productions are mainly recursive
A grammar derivation is a sequence of grammar rules which
transforms the start symbol into a string
The syntax analyser mainly deals with recursive constructs of the
language, while the lexical analyser eases the task of the syntax
analyser
The drawback of the syntax analyser is that it will never
determine whether a token is valid or not

Compiler vs Interpreter – Difference Between Them
By John Smith Updated June 24, 2023

Key Difference between Compiler and Interpreter

A compiler transforms code written in a high-level programming language
into machine code at once, before the program runs, whereas an
interpreter converts each high-level program statement, one by one, into
machine code during the program run.
Compiled code runs faster, while interpreted code runs slower.
A compiler displays all errors after compilation; on the other hand, the
interpreter displays the errors of each line one by one.
A compiler is based on the translation linking-loading model, whereas an
interpreter is based on the interpretation method.
A compiler takes an entire program, whereas an interpreter takes a
single line of code.

What is Compiler?
A compiler is a computer program that transforms code written in a high-
level programming language into machine code. It is a program which
translates human-readable code into a language a computer processor
understands (binary 1 and 0 bits). The computer then processes the
machine code to perform the corresponding tasks.
The source program should comply with the syntax rules of the programming
language in which it is written. The compiler is only a program and
cannot fix errors found in that program. So, if you make a mistake, you
need to make changes in the syntax of your program. Otherwise, it will
not compile.

What is Interpreter?
An interpreter is a computer program which converts each high-level
program statement into machine code. This includes source code,
pre-compiled code, and scripts. Both compilers and interpreters do the
same job, which is converting a higher-level programming language to
machine code. However, a compiler converts the code into machine code
(creating an exe) before the program runs, while interpreters convert
code into machine code when the program is run.

Difference between Compiler and Interpreter


Here are the important differences between Compiler and Interpreter:

Programming Steps
Compiler: Create the program. The compiler parses or analyses all of the
language statements for correctness; if incorrect, it throws an error. If
there is no error, the compiler converts the source code to machine code
and links the different code files into a runnable program (known as an
exe). Finally, run the program.
Interpreter: Create the program. No linking of files or machine-code
generation; source statements are executed line by line during execution.

Advantage
Compiler: The program code is already translated into machine code, so
its execution time is less.
Interpreter: Interpreters are easier to use, especially for beginners.

Disadvantage
Compiler: You can’t change the program without going back to the source
code.
Interpreter: Interpreted programs can only run on computers that have the
corresponding interpreter.

Machine code
Compiler: Stores machine language as machine code on the disk.
Interpreter: Does not save machine code at all.

Running time
Compiler: Compiled code runs faster.
Interpreter: Interpreted code runs slower.

Model
Compiler: It is based on the translation linking-loading model.
Interpreter: It is based on the interpretation method.

Program generation
Compiler: Generates an output program (in the form of an exe) which can
be run independently from the original program.
Interpreter: Does not generate an output program, so the source program
is evaluated every time during execution.

Execution
Compiler: Program execution is separate from compilation; it is performed
only after the entire output program is compiled.
Interpreter: Program execution is part of the interpretation process, so
it is performed line by line.

Memory requirement
Compiler: The target program executes independently and does not require
the compiler in memory.
Interpreter: The interpreter exists in memory during interpretation.

Best suited for
Compiler: Bounded to the specific target machine; the output cannot be
ported. C and C++ are the most popular programming languages that use the
compilation model.
Interpreter: Web environments, where load times are important. Because
exhaustive analysis is done, compilers take relatively longer to compile
even small code that may not be run multiple times; in such cases,
interpreters are better.

Code Optimization
Compiler: The compiler sees the entire code upfront, so it performs lots
of optimizations that make the code run faster.
Interpreter: Interpreters see code line by line, so optimizations are not
as robust as with compilers.

Dynamic Typing
Compiler: Difficult to implement, as compilers cannot predict what
happens at run time.
Interpreter: Interpreted languages support dynamic typing.

Usage
Compiler: Best suited for the production environment.
Interpreter: Best suited for the program development environment.

Error execution
Compiler: Displays all errors and warnings at compilation time;
therefore, you can’t run the program without fixing the errors.
Interpreter: Reads a single statement and shows the error, if any. You
must correct the error to interpret the next line.

Input
Compiler: Takes an entire program.
Interpreter: Takes a single line of code.

Output
Compiler: Generates intermediate machine code.
Interpreter: Never generates any intermediate machine code.

Errors
Compiler: Displays all errors after compilation, all at the same time.
Interpreter: Displays the errors of each line one by one.

Programming languages
Compiler: C, C++, C#, Scala, and Java all use a compiler.
Interpreter: PHP, Perl, and Ruby use an interpreter.


Role of Compiler
Compilers read the source code and output executable code
A compiler translates software written in a higher-level language into
instructions that the computer can understand. It converts the text that
a programmer writes into a format the CPU can understand.
The process of compilation is relatively complicated. It spends a lot
of time analyzing and processing the program.
The executable result is some form of machine-specific binary code.

Also Check:- Compiler Design Tutorial for Beginners

Role of Interpreter
The interpreter converts the source code line by line at run time.
The interpreter executes a program written in a high-level language
without producing a separate machine-language program.
The interpreter allows evaluation and modification of the program while
it is executing.
Relatively less time is spent analyzing and processing the program.
Program execution is relatively slow compared to a compiler.

HIGH-LEVEL LANGUAGES
High-level languages, like C, C++, Java, etc., are very near to English.
They make the programming process easy. However, they must be translated
into machine language before execution. This translation process is
conducted by either a compiler or an interpreter. High-level code is also
known as source code.

MACHINE CODE
Machine languages are very close to the hardware. Every computer has its
own machine language. Machine-language programs are made up of a series
of binary patterns (e.g., 110110) representing the simple operations to
be performed by the computer. Machine-language programs are executable,
so they can be run directly.

OBJECT CODE
On compilation of source code, the machine code generated for different
processors like Intel, AMD, and ARM is different. To make code portable,
the source code is first converted to object code, an intermediate code
(similar to machine code) that no processor will understand directly. At
run time, the object code is converted to the machine code of the
underlying platform.

Java is both Compiled and Interpreted.


To exploit the relative advantages of compilers and interpreters, some
programming languages like Java are both compiled and interpreted. The
Java code itself is compiled into object code. At run time, the JVM
interprets the object code into the machine code of the target computer.
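CPython follows a similar hybrid model, which can be observed with the built-in compile() and exec() functions: the source is first compiled to an intermediate bytecode object, which the interpreter then executes.

```python
source = "x = 6 * 7\nprint(x)"

# Compilation step: source text -> bytecode (a code object).
code_obj = compile(source, "<demo>", "exec")

# Interpretation step: the interpreter executes the bytecode. Prints 42.
exec(code_obj)
```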

Also Check:- Java Tutorial for Beginners: Learn Core Java Programming
