
Fast-Track Semester 2022

Technical Answers to Real-World Problems


Digital Assignment 3
Prof. Rajakumar K

Informative Text Summarizer Using NLP and DL

Team Members:
Saumitra Pathak (19BCE2411)
Shivam Bansal (19BCE0930)
Arkaraj Ghosh (19BCE24218)
Debalay Dasgupta (19BCE2423)
Pratyay Piyush (19BCE2364)
Proposed Architecture
Methodology

The proposed model gathers data from three distinct media sources and then applies a hybrid strategy built around abstractive text summarization. The long video transcripts and journal datasets are first pre-processed with extractive summarization to create a homogeneous dataset for T5. The T5 transformer model is then applied in the following stage, keeping the ontological relationships in mind. The proposed hybrid model is evaluated on a test dataset and produces a predicted summary. The summaries produced from the several sources are merged into a single document so that they can be accessed in the shortest amount of time.
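
The extractive pre-processing stage can be sketched in Python roughly as follows. This is a minimal illustration, assuming the NLTK toolkit and a simple word-frequency scoring scheme; it is not the project's exact pre-processing code.

# Minimal extractive pre-processing sketch (assumes NLTK; not the
# project's exact code). Sentences are scored by the frequencies of
# their content words, and the top-k sentences are kept in order.
from collections import Counter

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize, word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)

def extractive_summary(text: str, k: int = 5) -> str:
    stop = set(stopwords.words("english"))
    words = [w.lower() for w in word_tokenize(text)
             if w.isalpha() and w.lower() not in stop]
    freq = Counter(words)
    sentences = sent_tokenize(text)
    # Score each sentence by the total frequency of its content words.
    scored = [(sum(freq.get(w.lower(), 0) for w in word_tokenize(s)), i, s)
              for i, s in enumerate(sentences)]
    # Keep the k highest-scoring sentences, restored to document order.
    top = sorted(sorted(scored, reverse=True)[:k], key=lambda t: t[1])
    return " ".join(s for _, _, s in top)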

T5 Transformer Model
The T5 model summarizes the long video transcripts and research journals before sending them to the LSTM model for a better abstractive summary. The T5 transformer model produced excellent results when evaluated on CNN/DM, MSMO, and XSUM, with ROUGE scores above 42 and BLEU scores above 43 on the MSMO dataset. The T5 model uses stacked self-attention layers and is composed of encoder-decoder layers followed by a feed-forward network.
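
A minimal sketch of invoking a pre-trained T5 checkpoint for summarization through the Hugging Face transformers library is given below. The t5-base checkpoint and the generation parameters are illustrative assumptions, not the exact configuration used in this work.

# Hedged sketch: summarizing a transcript with a pre-trained T5
# checkpoint via Hugging Face transformers. The "t5-base" checkpoint
# and the generation parameters are assumptions, not this project's
# exact setup.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def t5_summarize(text: str, max_len: int = 150) -> str:
    # T5 is a text-to-text model; the task is selected with a prefix.
    inputs = tokenizer("summarize: " + text,
                       return_tensors="pt",
                       max_length=512,
                       truncation=True)
    ids = model.generate(inputs.input_ids,
                         max_length=max_len,
                         min_length=40,
                         num_beams=4,
                         length_penalty=2.0,
                         early_stopping=True)
    return tokenizer.decode(ids[0], skip_special_tokens=True)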

Pretrained Abstractive Summarizer Comparison


A comprehensive comparison between the best pre-trained abstractive text summarization models was performed. The T5, Hugging Face, and BART transformer models were used to process 20 research papers published by Elsevier and the transcripts of 25 video lectures from a YouTube playlist. The length of the first-hand summary generated for each resource and its grammatical correctness were the two important metrics taken into consideration. For detecting grammar errors and spelling mistakes, language_tool_python, an open-source grammar-checking tool, was used. The results were plotted and the averages were calculated, which led to the selection of the T5 model.
Fig: Video Lectures

Fig: Research Paper
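
The grammatical-correctness metric can be computed with language_tool_python roughly as shown below. This is a minimal sketch of the scoring step; the per-100-words normalization is an illustrative assumption, and the exact rule filtering used in the comparison may differ.

# Hedged sketch: counting grammar and spelling issues in a generated
# summary with language_tool_python (a wrapper around LanguageTool).
# Normalizing to errors per 100 words is an assumption made here so
# that summaries of different lengths stay comparable.
import language_tool_python

tool = language_tool_python.LanguageTool("en-US")

def error_rate(summary: str) -> float:
    matches = tool.check(summary)          # list of detected issues
    n_words = max(len(summary.split()), 1)
    return 100.0 * len(matches) / n_words  # errors per 100 words

# Illustrative usage: compare candidate summaries from each model
# (t5_out and bart_out are hypothetical variables holding the outputs).
# for name, text in {"t5": t5_out, "bart": bart_out}.items():
#     print(name, error_rate(text))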
