
Synopsis of

NLP Based Grammar Checker


For the award of the degree of
Bachelor of Engineering in Computer Science

RAJIV GANDHI PROUDYOGIKI VISHWAVIDYALAYA


(University of Technology of Madhya Pradesh)
BHOPAL (M.P)
Submitted by
Prakash Jha
Kumar Prasanna
Ishita Verma
Prakhar Gupta

Department of Computer Science Engineering


Sagar Institute of Research & Technology
Session 2015-2016

NLP Based Grammar Checker


A grammar checker is one of the basic Natural Language Processing (NLP) tools for any
language. The NLP field is relatively new in India, and many tools have yet to be
developed; a grammar checker is one of them.

Goals
To implement a text-processing system that checks the grammar of input text and
identifies the types of errors it contains.

Description in detail:

1. POS tagging
Before grammar checking can be performed on a text, the text needs to be run
through a part-of-speech (POS) tagger and a parser. This enables the grammar
checker to recognise the types of words within each sentence. The text is first run
through a POS tagger, which generates a tag for each word in a sentence. The
tag indicates the word's class. Next, the tagged text is run through a parser,
which performs syntactic analysis on it, adding tags to parts of the sentence and
marking the phrases within it and their syntactic roles.
For example:
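As a rough illustration of the tagging step, the sketch below tags a sentence by looking each word up in a small hand-written lexicon. It is written in Python purely for illustration; the lexicon, the `UNK` fallback, and the Penn-Treebank-style tag names (`DT`, `NNS`, `VBP`, and so on) are assumptions, not the project's actual tagger or tag set. A real tagger would resolve ambiguous words from context rather than by simple lookup.

```python
# Toy lexicon-based POS tagger. The lexicon and the tag names are
# illustrative assumptions, not the tag set used by the proposed system.
LEXICON = {
    "the": "DT",
    "students": "NNS",
    "are": "VBP",
    "playing": "VBG",
    "football": "NN",
    "in": "IN",
    "playground": "NN",
    ".": ".",
}


def tokenize(text):
    """Split on whitespace and separate a trailing full stop from its word."""
    tokens = []
    for piece in text.split():
        if len(piece) > 1 and piece.endswith("."):
            tokens.extend([piece[:-1], "."])
        else:
            tokens.append(piece)
    return tokens


def pos_tag(text):
    """Return (word, tag) pairs; words missing from the lexicon get 'UNK'."""
    return [(tok, LEXICON.get(tok.lower(), "UNK")) for tok in tokenize(text)]


tagged = pos_tag("The students are playing football in the playground.")
for word, tag in tagged:
    print(f"{word}/{tag}")
```

Each printed item pairs a word with its class, which is the form of input the later chunking step works on.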

2. Making Chunk-based Sentence Patterns


Chunking is the process of parsing a sentence into a chunk-based sentence
structure. A chunk is a textual unit of adjacent POS tags that displays the relations
between its internal words. The input English sentence is converted into a chunk
structure using hand-written rules, which represent how the chunks fit together to
form the constituents of the sentence.

Context Free Grammar (CFG): CFGs constitute an important class of grammars, with a
broad range of applications including programming languages, natural language
processing, bioinformatics and so on. CFG rules have a single symbol on the
left-hand side and are a sufficiently powerful formalism to describe most of the
structure in natural language.

A context-free grammar $G = (V, T, S, P)$ is given by

A finite set $V$ of variables or non-terminal symbols.
A finite set $T$ of symbols or terminal symbols. We assume that the sets $V$ and $T$
are disjoint.
A start symbol $S \in V$.
A finite set $P \subseteq V \times (V \cup T)^*$ of productions. A production
$(A, \alpha)$, where $A \in V$ and $\alpha \in (V \cup T)^*$ is a sequence of
terminals and variables, is written as $A \to \alpha$.
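The four components of the definition can be written down directly as a data structure. The following Python sketch is illustrative only: the class name `CFG`, its field names, and the toy grammar over POS tags are assumptions introduced here, not part of the proposed system. The `validate` method checks the conditions stated in the definition (V and T disjoint, S in V, every production mapping a variable to a string over V and T).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # T
    start: str             # S
    productions: tuple     # P, as (A, alpha) pairs; alpha is a tuple of symbols

    def validate(self):
        """Raise ValueError if a condition of the definition is violated."""
        if self.variables & self.terminals:
            raise ValueError("V and T must be disjoint")
        if self.start not in self.variables:
            raise ValueError("S must be a member of V")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            if head not in self.variables:
                raise ValueError(f"left-hand side {head!r} is not in V")
            for sym in body:
                if sym not in symbols:
                    raise ValueError(f"unknown symbol {sym!r} in production")
        return True


# Hypothetical toy grammar: variables are chunk labels, terminals are POS tags.
toy = CFG(
    variables=frozenset({"S", "NC", "VC"}),
    terminals=frozenset({"DT", "NNS", "VBP", "VBG", "NN"}),
    start="S",
    productions=(
        ("S", ("NC", "VC", "NC")),
        ("NC", ("DT", "NNS")),
        ("NC", ("NN",)),
        ("VC", ("VBP", "VBG")),
    ),
)
toy.validate()
```

Storing each production as a pair `(A, alpha)` mirrors the set-theoretic definition, so the single-symbol left-hand side is enforced by construction.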
Parsing is the process of automatically analysing text by assigning it a syntactic
structure according to the grammar of the language. A parser is used to analyse the
syntax of natural language sentences that conform to the grammar. There are two
methods of parsing: top-down and bottom-up. Top-down parsing begins with the start
symbol and attempts to derive the input sentence by substituting the right-hand
sides of productions for non-terminals. Bottom-up (shift-reduce) parsing begins
with the input sentence and combines words into higher-level chunks until the unit
finally becomes a sentence.
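The two strategies can be contrasted on a very small grammar. The Python sketch below is illustrative only: the rule set `RULES`, the function names, and the POS-tag terminals are assumptions. The top-down function is a backtracking recogniser that does not terminate on left-recursive grammars, and the bottom-up function is a greedy shift-reduce loop without backtracking, so it can reject derivable input for grammars that need lookahead.

```python
# Hypothetical toy grammar over POS tags (terminals) and chunk labels.
RULES = {
    "S": [("NC", "VC", "NC")],
    "NC": [("DT", "NNS"), ("NN",)],
    "VC": [("VBP", "VBG")],
}


def top_down(symbols, tokens):
    """Start from the start symbol and expand the leftmost symbol first,
    trying each production in turn until the input is derived."""
    if not symbols:
        return not tokens
    first, rest = symbols[0], symbols[1:]
    if first in RULES:
        return any(top_down(body + rest, tokens) for body in RULES[first])
    return bool(tokens) and tokens[0] == first and top_down(rest, tokens[1:])


def bottom_up(tokens, start="S"):
    """Shift one token at a time, then reduce while the top of the stack
    matches the right-hand side of a rule. Returns (accepted, trace)."""
    stack, trace = [], []
    for tok in tokens:
        stack.append(tok)              # shift
        trace.append(list(stack))
        reduced = True
        while reduced:
            reduced = False
            for head, bodies in RULES.items():
                for body in bodies:
                    if tuple(stack[-len(body):]) == body:
                        stack[-len(body):] = [head]   # reduce
                        trace.append(list(stack))
                        reduced = True
                        break
                if reduced:
                    break
    return stack == [start], trace


tags = ("DT", "NNS", "VBP", "VBG", "NN")
print(top_down(("S",), tags))
accepted, trace = bottom_up(tags)
for step in trace:
    print(step)
```

The printed trace shows the stack after every shift and reduce, ending with the single start symbol when the tag sequence is accepted.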
Parsing chunks by using CFG:
The syntactic chunk structure of a sentence is necessary to determine its
grammatical correctness. In the proposed system, ten general chunk types are used
to make the chunk structure as shown in Table.

The proposed grammar checker identifies the chunks using CFG-based bottom-up
parsing, assembling POS tags into higher-level chunks until a complete sentence
has been found. For example, the simple sentence "The students are playing football
in the playground." is chunked as follows:
NC_VC_NC_PPC_NC_END (Chunk-based Sentence Pattern)
NC_VC_NC_PPC_NC
NC_VC_NC_PPC
NC_VC_NC
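The step from POS tags to a chunk-based sentence pattern can be sketched as follows. This Python example is a simplified left-to-right rule matcher, not the CFG-based bottom-up parser itself. The chunk labels come from the pattern above, but the tag sequences assigned to each label, the reading of `NC`, `VC` and `PPC` as noun, verb and prepositional chunks, the `UNK` label, and the set `VALID_PATTERNS` are all assumptions made for illustration.

```python
# Hand-written chunk rules, longer tag sequences first. The mapping from
# tag sequences to chunk labels is an assumption, not the system's rule set.
CHUNK_RULES = [
    (("DT", "NNS"), "NC"),
    (("DT", "NN"), "NC"),
    (("VBP", "VBG"), "VC"),
    (("NNS",), "NC"),
    (("NN",), "NC"),
    (("VBP",), "VC"),
    (("IN",), "PPC"),
    ((".",), "END"),
]

# Hypothetical set of sentence patterns accepted as grammatical.
VALID_PATTERNS = {"NC_VC_NC_PPC_NC_END", "NC_VC_NC_END", "NC_VC_END"}


def chunk(tags):
    """Scan left to right, replacing each matching tag run with its chunk
    label; tags that no rule covers are labelled 'UNK'."""
    chunks, i = [], 0
    while i < len(tags):
        for body, label in CHUNK_RULES:
            if tuple(tags[i:i + len(body)]) == body:
                chunks.append(label)
                i += len(body)
                break
        else:
            chunks.append("UNK")
            i += 1
    return "_".join(chunks)


def is_valid(pattern):
    """Accept a sentence only if its chunk pattern is a known one."""
    return pattern in VALID_PATTERNS


# Tags for "The students are playing football in the playground."
tags = ["DT", "NNS", "VBP", "VBG", "NN", "IN", "DT", "NN", "."]
pattern = chunk(tags)
print(pattern)            # NC_VC_NC_PPC_NC_END
print(is_valid(pattern))  # True
```

With these assumed rules, a sentence such as "The students playing football." (tags `DT NNS VBG NN .`) yields a pattern containing `UNK`, which is not in the accepted set and would therefore be flagged.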

System Components
1. POS Tagger
2. Chunk-based Grammar Checker

Applications

Text Processing
Machine Translation Systems
Search Engine
Spell-checker
Grammar Checker
Named Entity Identification
Information Extraction
Information Retrieval
Text Classification and Clustering
Question Answering Systems

Custom Search Systems

Technologies Used

PHP
AngularJS
