
Mini Project

Natural Language Processing

“AUTOMATIC TEXT SUMMARIZATION”

Group Members
Makarand Bhalerao - A - 21
Tejas Hasabnis - A - 35
Shrutika Kadam - A - 40

R1(2 M) R2(2 M) R3(1 M) Total(5M) Sign

Datta Meghe College of Engineering


Department of Computer Engineering
October 2022
Introduction

Text summarization is an application of Natural Language Processing
(NLP) that is bound to have a huge impact on our lives. With digital
media growing and the volume of published material ever increasing,
who has the time to go through entire articles, documents, or books?
This is where text summarization helps, and apps like Inshorts use it
effectively.

Automatic text summarization gained attention as early as the 1950s.
A research paper published by Hans Peter Luhn in the late 1950s,
titled “The Automatic Creation of Literature Abstracts”, used features
such as word frequency and phrase frequency to extract important
sentences from the text for summarization purposes.

Summarization is the task of condensing a piece of text into a shorter
version, reducing the size of the initial text while preserving its
key information and meaning. Since manual text summarization is
time-consuming and laborious, automating the task is gaining
popularity and constitutes a strong motivation for academic research.

In the big data era, there has been an explosion in the amount of text
data from a variety of sources. This volume of text is an invaluable
source of information and knowledge, which needs to be effectively
summarized to be useful. The increasing availability of documents has
driven extensive research in NLP on automatic text summarization: the
task of producing a concise and fluent summary without human help
while preserving the meaning of the original document.


Problem Definition

Automatic text summarization is one of the most challenging and
interesting problems in the field of Natural Language Processing
(NLP). It is the process of generating a concise and meaningful
summary of text from sources such as books, news articles, blog
posts, research papers, emails, and tweets.

The demand for automatic text summarization systems is rising these
days thanks to the availability of large amounts of textual data.

Summarization is a technique in which a computer summarizes a text: a
text is given to the computer, and the computer returns the required
extract of the original document. Our method for sentence
extraction-based summarization uses a graph-based algorithm to
calculate the importance of each sentence in the document, and the
most important sentences are extracted to form the summary. Such
extraction-based methods assign an indexing weight to the document
terms in order to compute similarity values between sentences.
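
As an illustration of how term weights can lead to sentence similarity values, here is a minimal sketch. It assumes TF-IDF as the indexing weight and cosine similarity as the similarity measure; the report does not say which weighting it actually uses, so both choices, and the function names tokenize, tfidf_vectors, and cosine_similarity, are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def tfidf_vectors(sentences):
    """Return one {term: weight} dict per sentence, using TF-IDF weights.

    Each sentence is treated as a "document" for the IDF statistics.
    """
    token_lists = [tokenize(s) for s in sentences]
    n = len(token_lists)
    # Document frequency: the number of sentences each term occurs in.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed IDF (one common variant, assumed here): rare terms
        # weigh more, and terms found everywhere stay positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With these helpers, two sentences that share weighted terms get a similarity above zero, while sentences with no terms in common get exactly zero.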

Thus, automatic text summarization is very helpful in today's era.


Proposed Solution
The first step is to concatenate all the text contained in the
articles and then split it into individual sentences. Next, we find a
vector representation (built from word embeddings) for every
sentence.
Similarities between sentence vectors are then calculated and stored in
a matrix. The similarity matrix is converted into a graph, with
sentences as vertices and similarity scores as edge weights, for
sentence rank calculation. Finally, a certain number of top-ranked
sentences form the final summary.
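
The pipeline described above can be sketched in a few dozen lines. This is a simplified illustration, not the project's actual code. It makes three assumptions: a naive regular-expression sentence splitter, bag-of-words count vectors as a stand-in for the word-embedding sentence vectors, and a weighted PageRank computed by plain power iteration as the sentence rank calculation. All function names and the damping and iteration defaults are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_vector(sentence):
    """Stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Pairwise similarities; the diagonal is zero (no self-loops)."""
    n = len(vectors)
    return [
        [0.0 if i == j else cosine(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]


def rank_sentences(sim, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over the similarity graph."""
    n = len(sim)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in sim]  # total edge weight per vertex
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the
            # weight of edge (j, i); isolated vertices pass on nothing.
            incoming = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(text, top_n=2):
    """Return the top_n ranked sentences in their original order."""
    sentences = split_sentences(text)
    vectors = [sentence_vector(s) for s in sentences]
    scores = rank_sentences(similarity_matrix(vectors))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(top)]
```

Calling summarize(text, top_n=2) on a short article returns the two sentences most similar to the rest of the text. Swapping sentence_vector for averaged pretrained word embeddings would bring the sketch closer to the approach described above without changing the other steps.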

Extractive summarization picks sentences directly from the document
based on a scoring function to form a coherent summary. This method
works by identifying important sections of the text, cropping them
out, and stitching them together to produce a condensed version.
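
To make the idea of a scoring function concrete, the sketch below scores each sentence by the average frequency of its non-stopword terms and keeps the highest-scoring ones. This is only one simple choice of scoring function, picked for illustration. The tiny STOPWORDS set and the names score_sentences and extract_summary are hypothetical and not taken from the project.

```python
import re
from collections import Counter

# A deliberately tiny, illustrative stopword list (an assumption; a real
# system would use a fuller list).
STOPWORDS = {"a", "an", "and", "are", "as", "in", "is", "it", "of", "on", "the", "to"}


def score_sentences(sentences):
    """Score each sentence by the mean document frequency of its terms."""
    words_per_sentence = [
        [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOPWORDS]
        for s in sentences
    ]
    freq = Counter(w for words in words_per_sentence for w in words)
    scores = []
    for words in words_per_sentence:
        # Averaging keeps long sentences from winning on length alone.
        scores.append(sum(freq[w] for w in words) / len(words) if words else 0.0)
    return scores


def extract_summary(sentences, top_n=1):
    """Pick the top_n highest-scoring sentences, kept in original order."""
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(top)]
```

The selected sentences are returned in document order so that the stitched-together extract reads as naturally as the source allows.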
Steps of the project:

Importing libraries

Load and preprocess the data

Apply Tokenization

Creating the model

Plot the model

Build the model

Prediction
Workflow of Project:

Code
Conclusion

Thus, text summarization is the technique of generating a concise and
precise summary of voluminous texts while focusing on the sections
that convey useful information, without losing the overall meaning.

A final advantage of text summarization lies in its ability to increase
user engagement. When people read short summaries instead of lengthy
articles, they tend to spend less time on each piece and typically
read more as a result. This can lead to higher levels of engagement.

The study of automated text summarization still has a long way to go
before we can really claim to understand the nature of summaries.

The vast growth of information on the internet has created a need for
efficient summarization systems. Although research on text
summarization started many years ago, much remains to be explored.
