ABSTRACT

The aim is to develop a Question Answering System that can be deployed in
industries that generate large volumes of textual data, to help answer specific
questions based on the text provided. On a much larger scale, this concept would
help process entire documents or books while forming a back-end analysis. This
analysis would then serve as the basis for the front-end Question Answering
System. Such systems would help improve the efficiency of certain phases of a
large-scale automation process and thus reduce the existing requirement for
resources. The proposed system is implemented using Natural Language Processing
and Neural Networks, and is trained using various models to provide a comprehensive
overview of the confidence score.

ACKNOWLEDGEMENTS

With utmost joy and satisfaction, we submit this Project Report on “QUESTION
ANSWERING SYSTEM”. It has been completed as part of the curriculum of
Visvesvaraya Technological University.
The satisfaction that accompanies the successful completion of our project would
be incomplete without mentioning the people who made it possible, and whose constant
guidance and encouragement crowned our efforts with success.
We take immense pleasure in thanking Dr. Mrityunjaya V Latte, Principal,
JSSATE, Bengaluru, for kindly providing us with the opportunity to carry out this
project in this institution.
We are also thankful to Dr. Naveen N C, Professor and Head of the Department of
Computer Science and Engineering, for his cooperation and encouragement whenever
we approached him.
We are thankful to Mr. Sreenatha M and Mr. Rohitaksha K, Assistant Professors and
Project Coordinators, for their cooperation and support.
We are thankful to our project guide, Dr. Prabhudev Jagadeesh, Professor, for his
constant support and encouragement.
We wish to thank all the teaching and non-teaching staff of our department for
always being there to support and guide us.

ALI ASGER MUSTAFA TABHA (1JS15CS016)

ARJUN H M (1JS15CS021)

ARVIND R (1JS15CS023)

Table of Contents
Chapter Title Page No.

Abstract i
Acknowledgements ii
Table of Contents iii
List of Figures vi

Chapter 1 Introduction 1
1.1 Overview 1
1.2 Scope 2
1.3 Assumptions 2
1.4 Existing System 3
1.5 Proposed System 3
1.6 Problem Statement 4

Chapter 2 Literature Survey 5
2.1 Semantic Parsing via Staged Query Graph Generation 5
2.2 The NarrativeQA Reading Comprehension Challenge 7
2.3 Natural Language Processing in Information Retrieval 8
2.4 Preprocessing Techniques for Text Mining 9

Chapter 3 System Requirements 10
3.1 Hardware Requirements 10
3.2 Software Requirements 10
3.2.1 NLTK (Natural Language Toolkit) 11
3.2.2 TensorFlow 12
3.2.3 SciPy 14
3.2.4 NumPy 16
3.2.5 Scikit-learn 17
3.2.6 Pandas 18
3.2.7 PyTorch 21
3.2.8 Pickle 22

Chapter 4 System Architecture 23
4.1 System Structure 23
4.2 System Design 25
4.2.1 Tokenization 25
4.2.2 Removing Stopwords 26
4.2.3 Removing Accented Characters 26
4.2.4 Removing Contractions 26
4.2.5 Stemming and Lemmatization 26
4.3 Design Description 27
4.3.1 Capture 27
4.3.2 Preprocess 27
4.3.3 Localize 27
4.3.4 Connected Component Analysis 28
4.3.5 Segment 28
4.3.6 Context Recognition 28

Chapter 5 Implementation 29
5.1 Introduction 29
5.2 Programming Language Selection 30
5.2.1 Python 30
5.3 Data Flow Diagram 31
5.4 Activity Diagram 32
5.5 Use Case Diagram 33
5.6 Sequence Diagram 34

Chapter 6 System Study 36
6.1 Feasibility Study 36
6.1.1 Economic Feasibility 36
6.1.2 Technical Feasibility 37
6.1.3 Social Feasibility 37

Chapter 7 System Testing 38
7.1 Types of Testing 38
7.1.1 Unit Testing 38
7.1.2 Integration Testing 38
7.1.3 Functional Testing 39
7.1.4 System Testing 39
7.1.5 Black Box Testing 40
7.1.6 White Box Testing 40
7.1.7 Acceptance Testing 40
7.2 Test Cases 40

Chapter 8 Results and Discussions 42

Chapter 9 Conclusion and Future Enhancements 48
9.1 Conclusions 48
9.2 Future Scope 48

References

List of Figures
Figure Number Figure Title Page No.
3.1 NLTK Hierarchy 12
3.2 TensorFlow Toolkit Hierarchy 12
4.1 Overall System Structure 24
4.2 System Modules 25
4.3 Main Steps in Text Processing 27
5.1 Level 0 Data Flow Diagram 31
5.2 Level 1 Data Flow Diagram 32
5.3 Activity Diagram 33
5.4 Use Case Diagram 34
5.5 Sequence Diagram 35
8.1 Using NLP rules (Brute Force Method) 42
8.2 Grammatical changes in the input 42
8.3 Input Paragraph used for analysis 43
8.4 Answer generation based on input document 43
8.5 System answering indirect question 44
8.6 Generating answer by summarizing input data 44
8.7 Answer with Listings 45
8.8 Answer in a single word/phrase 45
8.9 Answer with underlying context 45
8.10 Different input text document 46
8.11 Generating answer by summarizing input 46
8.12 Answer based on numerical context 47
