Professional Documents
Culture Documents
AURANGABAD
Project Synopsis on
Text Summerizer
Submitted by
This approach weights the important part of sentences and uses the same to form
the summary. Different algorithm and techniques are used to define weights for the
sentences and further rank them based on importance and similarity among each
other.
Objectives and Future Scope:
Summaries reduce reading time
It refines most useful information from the source file and will provide an
Process Description:
Text summarization takes a document as input. The document is split into
sentences and similarity matrix will be build and lastly by using pagerank
algorithm important sentences will be picked for output summary.
Cosine Similarity:
In this project we’ll be using cosine distance for generating similarity matrix first
then we’ll rank this matrix using pagerank algorithm.
Pagerank Algorithm:
PageRank is used primarily for ranking web pages in online search results.In our
case In place of web pages, we use sentences.
Working:
● The first step would be to concatenate all the text contained in the articles
● In the next step, we will find vector representation (word embeddings) for each
● Similarities between sentence vectors are then calculated and stored in a matrix
● The similarity matrix is then converted into a graph, with sentences as vertices