Abstractive Text Summarization
Review - I
Prepared by: 21MCEC01 - Prashant Godhani
Guided by: Dr. Priyank Thakkar
Outline
• Problem Statement
• Objective
• Comments from the previous review
• Dataset
• Base Paper
• Implementation
• Tools and Technologies
• Conclusion & Future Work
• Reference
Problem Statement
▸ Data volume is increasing daily, and large documents are becoming very
difficult to handle. The goal is to produce a summary, a shorter version of the
document that retains the most relevant information, using different deep
learning algorithms.
Objective
▸ These are the objectives of the research [1], [2], [3] & [4]:
• Produce a short summary from a long text
• Research and analyse different techniques
• Implement the base paper's model
• Analyse the results of the implementation
• Enhance the model used in the base paper with a new approach
• Improve the accuracy of the text summaries
Comments from the previous review
Comments | Progress
Table 1 - Comments
Dataset
▸ CNN/DailyMail dataset [1], [2] & [3]:
▹ Articles and highlights
▹ More than 300k news articles
▹ Data fields:
▹ id: a string containing the hash of the URL the story was retrieved from
▹ article: a string containing the body of the news article
▹ highlights: a string containing the highlights of the article, written by its authors
▹ Split sizes:
Train | 287,113
Validation | 13,368
Test | 11,490
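The slides list the dataset fields but show no code. Below is a minimal sketch of a record with those fields and the common Lead-3 baseline (take the first three sentences of the article as the summary), often reported alongside models on CNN/DailyMail. The sample `id` and text are illustrative, not real dataset entries, and the naive `'. '` split stands in for a proper sentence tokenizer.

```python
def lead_3(article, k=3):
    """Lead-3 baseline: the first k sentences of the article as the summary.

    The split on '. ' is a naive stand-in; a real pipeline would use a
    proper sentence tokenizer (e.g. NLTK's punkt).
    """
    sentences = [s.strip().rstrip(".") for s in article.split(". ") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:k]) + "."

# Illustrative record mirroring the dataset's fields (id / article / highlights).
sample = {
    "id": "0001d1afc246a7964130f43ae940af6bc6c57f01",  # hash-like id, made up
    "article": "Data is increasing daily. Large documents are hard to handle. "
               "Summarization produces a shorter version. It retains the key information.",
    "highlights": "Summarization shortens documents while keeping key information.",
}
print(lead_3(sample["article"]))
```

Lead-3 is extractive, so it serves only as a floor that an abstractive model should beat.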
Base Paper
▸ Approach:
Base Paper
▸ Result and Analysis
• They similarly compare on the DUC 2004 dataset.
• Compared with other methods, this model gives a better ROUGE F1 score.
• They try different variants of DEATS (varying the decoding length).
• It gives precise results with a decoding length of 20-25.
• It works best for short datasets.
Base Paper
▸ Limitations
• The model does not produce semantically accurate results.
• The model works only with smaller decoding lengths.
• A higher decoding length renders the secondary encoder ineffective.
• The ROUGE-L score still needs improvement on large data.
Implementation
▸ Implementation steps
• Implement an encoder-decoder model.
• Use the CNN/DailyMail dataset.
• Pre-process the dataset (word embedding).
• Train the model.
• Evaluate the results.
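The steps above name an encoder-decoder model but the deck shows no code. Below is a minimal NumPy sketch of the idea: embed the source tokens, pool them into a context vector, then greedily decode one token at a time. The vocabulary, dimensions, and randomly initialised weights are all illustrative assumptions, not the base paper's actual model, and nothing here is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["<pad>", "<sos>", "<eos>", "data", "is", "growing", "daily", "summary"]
V, D, H = len(VOCAB), 16, 32  # vocab size, embedding dim, hidden dim

# Randomly initialised parameters; a real model learns these from the corpus.
E = rng.normal(0, 0.1, (V, D))          # word-embedding matrix (the pre-processing step)
W_enc = rng.normal(0, 0.1, (D, H))      # encoder projection
W_dec = rng.normal(0, 0.1, (D + H, H))  # decoder step: [prev embedding; context] -> hidden
W_out = rng.normal(0, 0.1, (H, V))      # hidden -> vocabulary logits

def encode(token_ids):
    """Encode the source sequence into one context vector via mean pooling."""
    hidden = np.tanh(E[token_ids] @ W_enc)  # (T, H)
    return hidden.mean(axis=0)              # (H,)

def decode(context, max_len=5):
    """Greedy decoding: feed the previous token back in until <eos> or max_len."""
    out, prev = [], VOCAB.index("<sos>")
    for _ in range(max_len):
        step = np.tanh(np.concatenate([E[prev], context]) @ W_dec)
        prev = int(np.argmax(step @ W_out))
        if VOCAB[prev] == "<eos>":
            break
        out.append(VOCAB[prev])
    return out

src = [VOCAB.index(w) for w in ["data", "is", "growing", "daily"]]
summary = decode(encode(src))  # arbitrary tokens, since the weights are untrained
```

A trained model would replace mean pooling with a recurrent or attention-based encoder and learn the weights by minimising cross-entropy against the reference highlights.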
Implementation
▸ Model Diagram
Implementation
▸ Result
• The implementation achieves the ROUGE scores below.
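The results are reported as ROUGE scores. As a minimal sketch of what the metric measures, the function below computes ROUGE-1 F1 (unigram overlap between candidate and reference); this toy implementation is an assumption for illustration, and a real evaluation would use an established library such as `rouge-score`, including ROUGE-2 and ROUGE-L.

```python
from collections import Counter

def rouge_1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge_1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 3))  # 5 of 6 unigrams match in each direction -> 0.833
```

ROUGE rewards lexical overlap only, which is one reason the base paper's semantic-accuracy limitation does not show up in the score.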
Conclusion & Future work
▸ In this research, we surveyed several papers on different methodologies and
models for abstractive text summarization.
▸ Presented an overview of the base paper.
▸ Identified limitations of the current model that can be improved.
▸ Implemented the encoder-decoder model and obtained results.
▸ Future work: change the approach used in the existing model.
▸ Propose a new model to obtain better results.
▸ Implement the proposed model.
Reference
1) T. Ma, Q. Pan, H. Rong, Y. Qian, Y. Tian and N. Al-Nabhan, "T-BERTSum: Topic
Aware Text Summarization Based on BERT," in IEEE Transactions on Computational Social Systems, vol. 9,
no. 3, pp. 879-890, June 2022, doi: 10.1109/TCSS.2021.3088506.
2) K. Yao, L. Zhang, D. Du, T. Luo, L. Tao and Y. Wu, "Dual Encoding for Abstractive Text Summarization," in
IEEE Transactions on Cybernetics, vol. 50, no. 3, pp. 985-996, March 2020, doi:
10.1109/TCYB.2018.2876317.
3) W. Ren, H. Zhou, G. Liu and F. Huan, "ASM: Augmentation-based Semantic Mechanism on Abstractive
Summarization," 2021 International Joint Conference on Neural Networks (IJCNN), 2021, pp. 1-8, doi:
10.1109/IJCNN52387.2021.9534156.
6) N. Raphal, H. Duwarah and P. Daniel, "Survey on Abstractive Text Summarization," 2018 International
Conference on Communication and Signal Processing (ICCSP), 2018, pp. 0513-0517, doi:
10.1109/ICCSP.2018.8524532.
https://github.com/workofart/abstractive-text-summary.
THANKS!
Any questions?