HTML Syntax Analyzer: Project::Compiler Construction
PROJECT::COMPILER CONSTRUCTION
Introduction
The HTML syntax analyzer reads the HTML that the user enters at run time, recognizes each piece
of syntax, and reports what it means. It is built with the Lex tool. For example, if the user
enters <title>, the analyzer reports that it is a starting title tag; if the user enters </title>,
it reports that it is an ending title tag. In short, it analyzes HTML syntax and tells the user
what each construct means.
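As a rough illustration of the behaviour described above, the following C sketch classifies a single tag string. The function name describe_tag and the list of known tag names are assumptions made for this example only; the project itself does this with Lex rules, not hand-written C.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: describe one tag string such as "<title>" or "</title>".
   Writes the description into out (at most n bytes) and returns 1 when the
   tag is recognised, 0 otherwise. */
int describe_tag(const char *text, char *out, size_t n)
{
    /* Assumed list of tag names, chosen for this example. */
    static const char *known[] = { "html", "head", "title", "body", "p" };
    size_t len = strlen(text);

    /* A tag needs at least "<x>" and must be wrapped in angle brackets. */
    if (len < 3 || text[0] != '<' || text[len - 1] != '>')
        return 0;

    int closing = (text[1] == '/');                /* "</" marks an ending tag */
    const char *name = text + (closing ? 2 : 1);   /* first character of the name */
    size_t name_len = len - (closing ? 3 : 2);     /* length without < / > */

    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++) {
        if (strlen(known[i]) == name_len &&
            strncmp(name, known[i], name_len) == 0) {
            snprintf(out, n, "%s tag %s", known[i], closing ? "end" : "start");
            return 1;
        }
    }
    return 0;
}
```

Calling describe_tag("</title>", buf, sizeof buf) fills buf with "title tag end"; input that is not one of the listed tags returns 0 and leaves buf unchanged.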
%{
/* Lex specification for the HTML syntax analyzer. */
#include <stdio.h>
%}
doctag "<!DOCTYPE html>"
html "html"
body "body"
heading h[1-6]
alphabets [a-zA-Z][a-zA-Z]*
numbers [0-9][0-9]*
spchr [+,@#$%^&*)(!-]+
string [A-Za-z0-9 ]+
tagstart "<"
tagend ">"
entagst "</"
str {spchr}
title "title"
titles {tagstart}{title}{tagend}
titlend {entagst}{title}{tagend}
htmls {tagstart}{html}{tagend}
htmlend {entagst}{html}{tagend}
bodys {tagstart}{body}{tagend}
bodye {entagst}{body}{tagend}
hes {tagstart}{heading}{tagend}
hede {entagst}{heading}{tagend}
head "head"
heads {tagstart}{head}{tagend}
heade {entagst}{head}{tagend}
paras {tagstart}[pP]{tagend}
parae {entagst}[pP]{tagend}
%%
{doctag} {printf("DOCTYPE tag start: \n");}
{htmls} {printf("html tag start: \n");}
{htmlend} {printf("html tag end ");}
{bodys} {printf("body tag start: \n");}
{bodye} {printf("body tag end ");}
{hes} {printf("heading tag start: \n");}
{hede} {printf(" heading tag end ");}
{string} {printf(" string ");}
{str} {printf(" special character ");}
{titles} {printf(" title tag start: ");}
{titlend} {printf(" title tag end ");}
{heads} {printf(" head tag start: ");}
{heade} {printf(" head tag end ");}
{paras} {printf(" paragraph tag start: ");}
{parae} {printf(" paragraph tag end ");}
%%
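/* Tell the scanner there is no further input once EOF is reached. */
int yywrap(void)
{
    return 1;
}
int main(void)
{
    printf("enter: "); /* prompt for HTML input */
    yylex();           /* scan standard input until EOF */
    return 0;
}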
CFG MODULES
Numbers:
S -> AS|A
A -> 0|1|2|…|9
String:
S -> XS|X
X -> A|B|C|D
A -> 'A'|'B'|'C'|…|'Z'
B -> 'a'|'b'|'c'|…|'z'
C -> 0|1|2|…|9
D -> ' '
Special chr:
S -> XS|X
X -> A|B|C|D|E|F|G|H|I|J|K|L|M
A -> +
B -> -
C -> @
D -> #
E -> $
F -> %
G -> ^
H -> &
I -> *
J -> (
K -> )
L -> !
M -> ,
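The numbers and string patterns in the Lex file ([0-9][0-9]* and a bracketed set followed by +) both mean "one or more characters from a fixed set". A minimal C sketch of a recursive recognizer for that shape is given here, assuming the grammar form S -> XS | X. The helper names are hypothetical, and treating the string set as letters, digits and space is an assumption about the intended pattern.

```c
#include <ctype.h>

/* Character-set predicates (hypothetical names for this example). */
static int is_digit_ch(int c)  { return isdigit(c) != 0; }
static int is_string_ch(int c) { return isalnum(c) != 0 || c == ' '; }

/* Recognise S -> X S | X: one or more characters accepted by in_set.
   Returns 1 when the whole string s derives from S, 0 otherwise. */
int matches_one_or_more(const char *s, int (*in_set)(int))
{
    if (!in_set((unsigned char)s[0]))   /* X must match; also rejects "" */
        return 0;
    if (s[1] == '\0')                   /* S -> X */
        return 1;
    return matches_one_or_more(s + 1, in_set);   /* S -> X S */
}

int is_number(const char *s) { return matches_one_or_more(s, is_digit_ch); }
int is_string(const char *s) { return matches_one_or_more(s, is_string_ch); }
```

With these definitions is_number("2024") succeeds, while is_number("") and is_number("12a") fail, because the grammar requires at least one digit and nothing else.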
Tagstart:
S -> B
B -> <
Tagend:
S -> B
B -> >
Entagst:
S -> B
B -> </
Title:
S -> AS|BS|C
A -> tagstart
B -> title
C -> tagend
Titlend:
S -> AS|BS|C
A -> entagst
B -> title
C -> tagend
htmls:
S -> AS|BS|C
A -> tagstart
B -> html
C -> tagend
htmlend:
S -> AS|BS|C
A -> entagst
B -> html
C -> tagend
bodys:
S -> AS|BS|C
A -> tagstart
B -> body
C -> tagend
bodye:
S -> AS|BS|C
A -> entagst
B -> body
C -> tagend
hes:
S -> AS|BS|C
A -> tagstart
B -> heading
C -> tagend
paras:
S -> AS|BS|C
A -> tagstart
B -> pP
C -> tagend
parae:
S -> AS|BS|C
A -> entagst
B -> pP
C -> tagend
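The tag grammars above all share the shape S -> AS | BS | C. As a sketch of how such a grammar can be checked by recursive descent, the C code here walks a token array that ends with an END marker. The token codes and function names are assumptions made for this illustration and do not come from the Lex file.

```c
/* Hypothetical token codes: A -> tagstart, B -> tag name, C -> tagend. */
enum token { TAGSTART, NAME, TAGEND, END };

/* Recognise S -> A S | B S | C.
   Returns the number of tokens consumed, or -1 if no derivation exists. */
int parse_S(const enum token *toks)
{
    switch (toks[0]) {
    case TAGSTART:                      /* S -> A S */
    case NAME: {                        /* S -> B S */
        int rest = parse_S(toks + 1);
        return rest < 0 ? -1 : rest + 1;
    }
    case TAGEND:                        /* S -> C */
        return 1;
    default:                            /* END or unknown token */
        return -1;
    }
}

/* Accept only if S covers the whole input up to the END marker. */
int accepts(const enum token *toks)
{
    int used = parse_S(toks);
    return used >= 0 && toks[used] == END;
}
```

The sequence tagstart, name, tagend is accepted, as intended for a tag such as <title>. Because A and B may repeat in any order before C, the grammar as written also accepts sequences such as name, name, tagend; a stricter form would be S -> A B C.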
STRINGS:
Tagstart:
CFG       FIRST    FOLLOW
S -> B    { < }    { $ }
B -> <    { < }    { $ }
Tagend:
CFG       FIRST    FOLLOW
S -> B    { > }    { $ }
B -> >    { > }    { $ }
Entagst:
CFG       FIRST    FOLLOW
S -> B    { </ }   { $ }
B -> </   { </ }   { $ }
Title
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { tagstart, title, tagend }     { $ }
A -> tagstart    { tagstart }                    { tagstart, title, tagend }
B -> title       { title }                       { tagstart, title, tagend }
C -> tagend      { tagend }                      { $ }
Titlend
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { entagst, title, tagend }      { $ }
A -> entagst     { entagst }                     { entagst, title, tagend }
B -> title       { title }                       { entagst, title, tagend }
C -> tagend      { tagend }                      { $ }
htmls
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { tagstart, html, tagend }      { $ }
A -> tagstart    { tagstart }                    { tagstart, html, tagend }
B -> html        { html }                        { tagstart, html, tagend }
C -> tagend      { tagend }                      { $ }
htmlend
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { entagst, html, tagend }       { $ }
A -> entagst     { entagst }                     { entagst, html, tagend }
B -> html        { html }                        { entagst, html, tagend }
C -> tagend      { tagend }                      { $ }
bodys
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { tagstart, body, tagend }      { $ }
A -> tagstart    { tagstart }                    { tagstart, body, tagend }
B -> body        { body }                        { tagstart, body, tagend }
C -> tagend      { tagend }                      { $ }
bodye
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { entagst, body, tagend }       { $ }
A -> entagst     { entagst }                     { entagst, body, tagend }
B -> body        { body }                        { entagst, body, tagend }
C -> tagend      { tagend }                      { $ }
hes
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { tagstart, heading, tagend }   { $ }
A -> tagstart    { tagstart }                    { tagstart, heading, tagend }
B -> heading     { heading }                     { tagstart, heading, tagend }
C -> tagend      { tagend }                      { $ }
paras
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { tagstart, pP, tagend }        { $ }
A -> tagstart    { tagstart }                    { tagstart, pP, tagend }
B -> pP          { pP }                          { tagstart, pP, tagend }
C -> tagend      { tagend }                      { $ }
parae
CFG              FIRST                           FOLLOW
S -> AS|BS|C     { entagst, pP, tagend }         { $ }
A -> entagst     { entagst }                     { entagst, pP, tagend }
B -> pP          { pP }                          { entagst, pP, tagend }
C -> tagend      { tagend }                      { $ }
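The FIRST and FOLLOW sets listed for the tag grammars can be reproduced mechanically. This C sketch runs the usual fixed-point iteration over the grammar S -> AS | BS | C, A -> tagstart, B -> name, C -> tagend. The symbol encoding, the bit-mask representation of sets and all identifiers are assumptions made for this example. Since the grammar has no empty productions, the nullable case is left out.

```c
/* Nonterminals come first, then terminals (hypothetical encoding). */
enum { S, A, B, C, NT_COUNT,
       T_TAGSTART = NT_COUNT, T_NAME, T_TAGEND, T_END };

/* Each terminal is one bit in a set. */
#define BIT(t) (1u << ((t) - NT_COUNT))

struct production { int lhs; int rhs[2]; int len; };

/* Unused rhs slots are 0 and are ignored because of len. */
static const struct production grammar[] = {
    { S, { A, S }, 2 }, { S, { B, S }, 2 }, { S, { C, 0 }, 1 },
    { A, { T_TAGSTART, 0 }, 1 },
    { B, { T_NAME, 0 }, 1 },
    { C, { T_TAGEND, 0 }, 1 },
};
enum { PROD_COUNT = sizeof grammar / sizeof grammar[0] };

/* FIRST of a single symbol: a terminal is its own FIRST set. */
static unsigned first_of(int sym, const unsigned first[NT_COUNT])
{
    return sym >= NT_COUNT ? BIT(sym) : first[sym];
}

/* Iterate until no set grows any further. */
void compute_first_follow(unsigned first[NT_COUNT], unsigned follow[NT_COUNT])
{
    for (int i = 0; i < NT_COUNT; i++)
        first[i] = follow[i] = 0;
    follow[S] = BIT(T_END);             /* $ follows the start symbol */

    int changed = 1;
    while (changed) {
        changed = 0;
        for (int p = 0; p < PROD_COUNT; p++) {
            const struct production *pr = &grammar[p];

            /* FIRST(lhs) gains FIRST of the first right-hand symbol. */
            unsigned f = first[pr->lhs] | first_of(pr->rhs[0], first);
            if (f != first[pr->lhs]) { first[pr->lhs] = f; changed = 1; }

            for (int i = 0; i < pr->len; i++) {
                int sym = pr->rhs[i];
                if (sym >= NT_COUNT)
                    continue;           /* FOLLOW is kept for nonterminals only */
                /* Next symbol's FIRST, or FOLLOW(lhs) at the end of the rule. */
                unsigned add = (i + 1 < pr->len)
                             ? first_of(pr->rhs[i + 1], first)
                             : follow[pr->lhs];
                unsigned g = follow[sym] | add;
                if (g != follow[sym]) { follow[sym] = g; changed = 1; }
            }
        }
    }
}
```

After compute_first_follow(first, follow), first[S] holds the bits for tagstart, name and tagend, follow[A] and follow[B] equal first[S], and follow[S] and follow[C] hold only the end marker, which matches the rows of the tables.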
Parse table:
[Figure not recovered: parse table for special chr]
Stack and parse tree
[Figures not recovered: Numbers, String, Special chr, Tagstart, Tagend, Entagst]
Semantic Analysis:
[Figures not recovered: bodys, Numbers, bodye, htmle, htmls]
Results and conclusions
Conclusion
In conclusion, this analyzer identifies the HTML syntax it supports (the DOCTYPE declaration and
the html, head, title, body, heading and paragraph tags) and tells the user what each construct
means. For example, given <head>, it reports that it is the starting head tag.
Results
[Figures not recovered]