Team Members:
Saumitra Pathak (19BCE2411)
Shivam Bansal (19BCE0930)
Arkaraj Ghosh (19BCE24218)
Debalay Dasgupta (19BCE2423)
Pratyay Piyush (19BCE2364)
Proposed Architecture:
Methodology
The proposed model gathers data from three distinct media sources and then applies a
hybrid strategy built around abstractive text summarization. The long video transcripts
and journal datasets are first pre-processed with an extractive summarizer to produce a
homogeneous dataset for T5. The T5 transformer model is then applied in the next
stage, taking the ontological relationships into account. The proposed hybrid model is
evaluated on a test dataset and produces a predicted summary. The summaries
produced from the several sources are merged into a single document so that they can
be accessed in the shortest amount of time.
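The extractive pre-processing stage described above is not specified in detail here; a minimal sketch of one common approach, frequency-based sentence scoring, is shown below. The function name `extractive_summary` and the scoring scheme are illustrative assumptions, not the project's actual implementation.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=3):
    """Toy extractive summarizer: score each sentence by the average
    corpus frequency of its words, then keep the top-scoring sentences
    in their original order. (Illustrative only; the proposed system
    may use a different extractive method.)"""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    if len(sentences) <= num_sentences:
        return text.strip()
    freq = Counter(re.findall(r'[a-z]+', text.lower()))

    def score(sentence):
        tokens = re.findall(r'[a-z]+', sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore document order
    return ' '.join(sentences[i] for i in keep)
```

A pass like this shortens long video transcripts and journal articles to a uniform length before the abstractive T5 stage sees them.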
T5 Transformer Model
The T5 model summarizes the long video transcripts and research journals before
sending them to the LSTM model for a better abstractive summary. The T5 transformer
model produced excellent results when evaluated on CNN/DM, MSMO, and XSUM,
scoring over 42 ROUGE and 43 BLEU on the MSMO dataset. The T5 model is composed
of stacked encoder-decoder layers, each combining self-attention with a
feed-forward network.
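The self-attention-plus-feed-forward layer structure mentioned above can be illustrated with a toy sketch. This is not T5 itself: real T5 layers use learned Q/K/V projection matrices, multiple heads, layer normalization, and residual connections, all of which are omitted here for clarity.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention over a list of token vectors,
    with identity Q/K/V projections (T5 learns these as weight matrices)."""
    d = len(X[0])
    out = []
    for q in X:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in X]
        weights = softmax(scores)
        # Each output vector is a weighted mix of all token vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, X))
                    for j in range(d)])
    return out

def feed_forward(X):
    """Position-wise feed-forward step (here just a ReLU, no weights)."""
    return [[max(0.0, x) for x in row] for row in X]

def encoder_layer(X):
    """One toy encoder layer: self-attention followed by feed-forward."""
    return feed_forward(self_attention(X))
```

Stacking several such layers in the encoder, and pairing them with decoder layers that additionally attend to the encoder output, gives the encoder-decoder structure the section describes.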