BHARATI VIDYAPEETH COLLEGE OF ENGINEERING,
NAVI MUMBAI-400614
(2020-21)
PRESENTS
MINI PROJECT
ON
“Named Entity Recognition (NER)”
COURSE OUTCOMES
CO1:
Students will have a broad understanding of the field of natural
language processing.
CO2:
Students will have a sense of the capabilities and limitations of current
natural language technologies.
CO3:
Students will be able to model linguistic phenomena with formal grammars.
CO4:
Students will be able to design, implement and test algorithms of NLP
problems.
CO5:
Students will be able to understand the mathematical and linguistic foundations
underlying approaches to the various areas in NLP.
CO6:
Students will be able to apply NLP techniques to design real-world NLP
applications such as machine translation, text summarization, etc.
BHARATI VIDYAPEETH COLLEGE OF ENGINEERING,
NAVI MUMBAI 400614
PROJECT REPORT
ON
“Named Entity Recognition (NER)”
PROJECT MEMBERS
SR NO   NAME              ROLL NO   MARKS
1       Dhyeya Dhaktode   13
2       Rohit Dudhal      15
3       Tejas Ghadshi     16
LAB IN-CHARGE
Prof. Dr. D. R. Ingle
Table of Contents
SR. NO.   CHAPTER
1         Introduction
2         System Working
3         System Approach
4         System Implementation
5         Acknowledgement
6         References
CHAPTER 1
INTRODUCTION
Named-entity recognition is a subtask of information extraction that seeks
to locate and classify named entities mentioned in unstructured text into
pre-defined categories such as person names, organizations, locations,
medical codes, time expressions, quantities, monetary values,
percentages, etc.
There are two broad approaches to named entity recognition: ontology-based
NER and machine learning-based NER, which includes deep learning. The
ontology-based approach is a knowledge-based recognition process that relies
on a collection of data sets containing words, terms, and their interrelations.
CHAPTER 2
SYSTEM WORKING
Loading the Data for Named Entity Recognition (NER)
The first step is to load the data and inspect it.
Import the pandas library and load the data:
In the data, the words are spread across columns that together form our
feature X, and the Tag column on the right is our label Y.
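A minimal sketch of this loading step is given here. The column names (`Sentence #`, `Word`, `POS`, `Tag`) and the small in-memory sample are assumptions modeled on a typical token-per-row NER corpus, not the project's actual dataset; a real run would pass the dataset file path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical sample in a token-per-row layout; a real run would point
# pd.read_csv at the dataset file instead of this in-memory text.
SAMPLE_CSV = """Sentence #,Word,POS,Tag
Sentence: 1,Asha,NNP,B-per
,lives,VBZ,O
,in,IN,O
,Mumbai,NNP,B-geo
Sentence: 2,Google,NNP,B-org
,opened,VBD,O
,an,DT,O
,office,NN,O
"""

data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Words (feature X) and tags (label Y) sit side by side, one token per row.
print(data.head())
print("rows:", len(data), "| unique tags:", sorted(data["Tag"].unique().tolist()))
```

Only the first token of each sentence carries a sentence label in this sample, so the remaining rows of that column are read as missing values.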
Extracting the mappings that are required to train the neural network:
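The mapping step can be sketched as follows, assuming the mappings are word-to-index and tag-to-index dictionaries with their inverses. The function name `build_index_maps` and the sample tokens are hypothetical.

```python
def build_index_maps(items):
    """Return (item -> index, index -> item) dictionaries for a vocabulary."""
    vocab = sorted(set(items))  # sorted so the indices are reproducible
    item2idx = {item: idx for idx, item in enumerate(vocab)}
    idx2item = {idx: item for item, idx in item2idx.items()}
    return item2idx, idx2item


# Hypothetical tokens and tags standing in for the Word and Tag columns.
words = ["Asha", "lives", "in", "Mumbai", "Google", "opened", "an", "office"]
tags = ["B-per", "O", "O", "B-geo", "B-org", "O", "O", "O"]

token2idx, idx2token = build_index_maps(words)
tag2idx, idx2tag = build_index_maps(tags)

print(tag2idx)
```

The inverse dictionaries are kept so that predicted tag indices can later be turned back into readable tag names.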
Now transform the columns in the data to extract the sequential data for our
neural network:
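One possible way to carry out this transformation is sketched here, assuming a token-per-row table in which only the first token of each sentence carries the sentence label. The column names and sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical token-per-row frame; only the first token of each sentence
# carries the sentence label.
data = pd.DataFrame({
    "Sentence #": ["Sentence: 1", None, None, None, "Sentence: 2", None],
    "Word": ["Asha", "lives", "in", "Mumbai", "Google", "hired"],
    "Tag": ["B-per", "O", "O", "B-geo", "B-org", "O"],
})

# Integer ids for words and tags (sorted for reproducibility).
token2idx = {w: i for i, w in enumerate(sorted(set(data["Word"])))}
tag2idx = {t: i for i, t in enumerate(sorted(set(data["Tag"])))}

# Propagate the sentence label down to every token of that sentence.
data["Sentence #"] = data["Sentence #"].ffill()
data["Word_idx"] = data["Word"].map(token2idx)
data["Tag_idx"] = data["Tag"].map(tag2idx)

# Collapse the rows of each sentence into one list per column.
grouped = data.groupby("Sentence #", sort=False)[["Word", "Word_idx", "Tag_idx"]].agg(list)
print(grouped)
```

Each row of `grouped` now holds one sentence as a sequence of word ids and a matching sequence of tag ids.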
Now pad the sequences and split the data into training and test sets. LSTM
layers accept only sequences of the same length, so every sentence, which is
now encoded as a list of integers, must be padded to the same length. Create
a function that pads and splits the data:
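The padding and splitting step can be sketched in plain Python as shown here. The helper names `pad_to_length` and `split_train_test`, the padding ids, and the sample sequences are assumptions for illustration. In practice the same job is often done with the Keras `pad_sequences` utility and scikit-learn's `train_test_split`.

```python
import random


def pad_to_length(sequences, pad_value, max_len=None):
    """Right-pad (or truncate) every sequence to the same length."""
    if max_len is None:
        max_len = max(len(seq) for seq in sequences)
    return [list(seq[:max_len]) + [pad_value] * (max_len - len(seq))
            for seq in sequences]


def split_train_test(X, y, test_fraction=0.2, seed=42):
    """Shuffle sentence indices once and cut them into train and test parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical integer-encoded sentences and their tag sequences.
sentences = [[0, 5, 4, 2], [1, 3], [2, 4, 0], [5, 1, 3, 0, 4]]
tag_seqs = [[2, 3, 3, 0], [1, 3], [0, 3, 2], [3, 1, 3, 2, 3]]
PAD_TOKEN, PAD_TAG = 6, 3  # assumed: an extra padding word id and the "O" tag id

X = pad_to_length(sentences, PAD_TOKEN)
y = pad_to_length(tag_seqs, PAD_TAG)
X_train, X_test, y_train, y_test = split_train_test(X, y, test_fraction=0.25)
print(len(X_train), "training sentences,", len(X_test), "test sentence")
```

Words and tags are padded with different values so that padded positions are labeled with a harmless tag rather than an entity tag.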
CHAPTER 3
SYSTEM APPROACH
Natural Language Processing (NLP)
In the 2010s, representation learning and deep neural network-style machine
learning methods became widespread in natural language processing, due in
part to a flurry of results showing that such techniques can achieve
state-of-the-art results in many natural language tasks, for example in language
modeling, parsing, and many others. This is increasingly important in
medicine and healthcare, where NLP is being used to analyze notes and text
in electronic health records that would otherwise be inaccessible for study
when seeking to improve care.
NER Approaches
NER systems have been created that use linguistic grammar-based
techniques as well as statistical models such as machine learning.
Hand-crafted grammar-based systems typically obtain better precision, but at
the cost of lower recall and months of work by experienced computational
linguists. Statistical NER systems typically require a large amount of
manually annotated training data. Semi-supervised approaches have been
suggested to avoid part of the annotation effort.
Many different classifier types have been used to perform machine-learned
NER, with conditional random fields being a typical choice.
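To make the precision and recall trade-off mentioned above concrete, the sketch here computes both measures over sets of predicted entities. The entity spans and the two hypothetical systems are invented for illustration and are not measured results.

```python
def precision_recall(gold, predicted):
    """Compute precision and recall over sets of (start, end, label) entities."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical annotations: (token_start, token_end, label).
gold = {(0, 1, "PER"), (3, 4, "LOC"), (6, 8, "ORG"), (10, 11, "LOC")}

# A cautious rule-based system: few predictions, all of them correct.
rule_based = {(0, 1, "PER"), (3, 4, "LOC")}

# A statistical system: finds more entities but also makes mistakes.
statistical = {(0, 1, "PER"), (3, 4, "LOC"), (6, 8, "ORG"),
               (9, 11, "LOC"), (12, 13, "PER")}

print("rule-based :", precision_recall(gold, rule_based))
print("statistical:", precision_recall(gold, statistical))
```

In this toy setting the rule-based system has perfect precision but finds only half of the entities, while the statistical system finds more entities at the cost of some wrong predictions.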
CHAPTER 4
SYSTEM IMPLEMENTATION
Training Neural Network for Named Entity Recognition (NER)
Now, proceed with training the neural network architecture of our model.
Importing all the packages we need for training our neural network:
The output layer takes the sequence produced by the LSTM layer and gives, for
every sentence, an output of shape (maximum length, number of tags):
Now create a helper function that prints a summary of every layer of the
neural network model for Named Entity Recognition (NER):
Now create a helper function to train the Named Entity Recognition
model:
Driver Code:
The model produces its final output after training for 25 epochs.
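The steps described in this chapter (imports, output layer, summary helper, training helper, and driver code) can be sketched together with the Keras API of TensorFlow. This is one possible arrangement under stated assumptions, not necessarily the exact network used in the project. The function names (`build_model`, `summarize`, `train_model`), the layer sizes, the softmax output with a sparse categorical cross-entropy loss, and the random synthetic data are all assumptions. The sketch runs for 2 epochs to stay quick, whereas the report trains for 25.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_model(vocab_size, n_tags, max_len, embed_dim=64):
    """Embedding -> BiLSTM -> LSTM -> per-token softmax over tags."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(max_len,)),
        layers.Embedding(input_dim=vocab_size, output_dim=embed_dim),
        layers.Bidirectional(layers.LSTM(embed_dim, return_sequences=True)),
        layers.LSTM(embed_dim, return_sequences=True),
        # One tag distribution per time step: output shape (max_len, n_tags).
        layers.TimeDistributed(layers.Dense(n_tags, activation="softmax")),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def summarize(model):
    """Print the layer-by-layer summary of the model."""
    model.summary()


def train_model(model, X, y, epochs=25, batch_size=32):
    """Fit one epoch at a time and collect the training loss per epoch."""
    losses = []
    for _ in range(epochs):
        history = model.fit(X, y, batch_size=batch_size, epochs=1, verbose=0)
        losses.append(history.history["loss"][0])
    return losses


# Driver code on synthetic data (random ids, for shape checking only).
VOCAB_SIZE, N_TAGS, MAX_LEN = 50, 5, 10
rng = np.random.default_rng(0)
X = rng.integers(0, VOCAB_SIZE, size=(32, MAX_LEN))
y = rng.integers(0, N_TAGS, size=(32, MAX_LEN))

model = build_model(VOCAB_SIZE, N_TAGS, MAX_LEN)
summarize(model)
losses = train_model(model, X, y, epochs=2)  # the report trains for 25 epochs
print("loss per epoch:", losses)
```

Because the data here is random, the loss values carry no meaning; the sketch only shows that the input and output shapes fit together.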
Testing the Named Entity Recognition (NER) Model:
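A minimal sketch of the testing step is given here, assuming the model returns one probability per tag for every token. The probabilities, the tag index, and the helper names `decode_predictions` and `token_accuracy` are hypothetical stand-ins for real model output.

```python
def decode_predictions(probabilities, idx2tag):
    """Pick the most probable tag for every token of one sentence."""
    tags = []
    for token_probs in probabilities:
        best = max(range(len(token_probs)), key=token_probs.__getitem__)
        tags.append(idx2tag[best])
    return tags


def token_accuracy(true_tags, predicted_tags):
    """Fraction of tokens whose predicted tag matches the true tag."""
    matches = sum(t == p for t, p in zip(true_tags, predicted_tags))
    return matches / len(true_tags)


# Hypothetical tag index and per-token probabilities, shaped like one row of
# the model output: (tokens, tags).
idx2tag = {0: "B-geo", 1: "B-per", 2: "O"}
words = ["Asha", "lives", "in", "Mumbai"]
probabilities = [
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
    [0.10, 0.10, 0.80],
    [0.30, 0.10, 0.60],
]
true_tags = ["B-per", "O", "O", "B-geo"]

predicted = decode_predictions(probabilities, idx2tag)
for word, tag in zip(words, predicted):
    print(f"{word:10s} {tag}")
print("token accuracy:", token_accuracy(true_tags, predicted))
```

In this invented example the last token is mislabeled as `O`, which shows how a missed entity lowers the score.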
CHAPTER 5
ACKNOWLEDGEMENT
I would like to express my sincere gratitude to our subject in-charge as
well as our HOD Prof. Dayanand Ingle, who gave us the golden opportunity to
do this project on the topic “Named Entity Recognition”. The project also led
me to do a lot of research, through which I learned many new things. I
am thankful to him.
CHAPTER 6
REFERENCES
▪ Python programming, NLP
▪ http://www.Google.co.in/
▪ https://www.tensorflow.org/