
TECHNICAL APPROVAL COMMITTEE

GUIDE APPROVAL FORM


Date: __ / __ / 2023
Starting Date of Work

Sl. No.   Student Name   Reg. No.       Role          Signature

1         Raghulraj M    7376212AD177   Team Leader

Applying for the work: Project

PDF summarizer
Title of Work

(To be Filled by Faculty In charge)


No. of students: 1
I acknowledge that I will act as a faculty in charge for the aforementioned students and guide
them to complete the work by adopting the guidelines provided.

Lab Name: DATA SCIENCE – BIG DATA ANALYTICS LAB
(In case of Faculty belonging to any special lab)

Name & Signature of the Faculty In charge with the date
Add process flow chart or simulated image of prototype or any relevant image related to your idea
Idea/Approach Details

Problem Statement
This project addresses the need for a versatile and efficient PDF summarization tool. PDF documents often contain lengthy, detailed content, which makes it
difficult for users to extract essential information quickly. Manual summarization is time-consuming and may not be as comprehensive as users need.

Proposed Solution
The proposed solution is to develop a PDF Summarizer using the Flask web framework, capable of performing both abstractive and extractive summarization. This tool
aims to:

● Read and extract the text content of PDF files using the PyPDF library.
● Provide abstractive summarization of the PDF content using pre-trained large language models via their API.
● Perform extractive summarization using the Latent Semantic Analysis (LSA) summarizer.
● Offer users the option to choose between abstractive and extractive summarization methods.
● Present summarized content in an easily accessible and user-friendly format.

The PDF Summarizer will empower users to quickly obtain concise summaries of PDF documents, enhancing their ability to digest and utilize information effectively.

METHODOLOGY

STEP 1 :

Develop a web application using Flask to create an intuitive user interface for uploading PDF files and selecting summarization methods.
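
A minimal sketch of what the upload endpoint in Step 1 could look like. The route name /summarize, the form field names "pdf" and "method", and the JSON response shape are assumptions made for illustration and are not specified in this proposal. The summarization itself is left as a placeholder.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# The two summarization modes named in the proposal.
ALLOWED_METHODS = {"abstractive", "extractive"}


@app.route("/summarize", methods=["POST"])  # hypothetical route name
def summarize():
    uploaded = request.files.get("pdf")                   # hypothetical field name
    method = request.form.get("method", "extractive")     # hypothetical field name

    # Reject requests without a file or with a non-PDF file name.
    if uploaded is None or not (uploaded.filename or "").lower().endswith(".pdf"):
        return jsonify(error="Please upload a .pdf file."), 400
    if method not in ALLOWED_METHODS:
        return jsonify(error="Unknown summarization method."), 400

    data = uploaded.read()
    # Placeholder: the real application would extract the text here and
    # pass it to the chosen summarizer instead of echoing metadata.
    return jsonify(filename=uploaded.filename, method=method, size_bytes=len(data))
```

The endpoint validates the input before any processing so that the later steps only ever receive a PDF upload and a known method name.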

STEP 2 :

Utilize the PyPDF library to extract and convert the content of uploaded PDF files into a suitable format for summarization.

STEP 3 :

Implement abstractive summarization using pre-trained large language models via their API to generate human-like summaries.
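
The proposal does not name a specific model or API, so the sketch below keeps the model call abstract: call_model is a hypothetical callable that takes a text and returns its summary. What the sketch shows is one common way to handle documents longer than a model's input limit: split the text into chunks, summarize each chunk, then summarize the combined partial summaries. The 4000-character limit is an arbitrary assumption.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 4000) -> List[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # A single paragraph longer than the limit is cut into hard slices.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def summarize_abstractive(
    text: str,
    call_model: Callable[[str], str],  # hypothetical wrapper around the chosen API
    max_chars: int = 4000,
) -> str:
    """Summarize each chunk, then summarize the joined partial summaries."""
    chunks = split_into_chunks(text, max_chars)
    if not chunks:
        return ""
    partial = [call_model(chunk) for chunk in chunks]
    if len(partial) == 1:
        return partial[0]
    # Note: for very long documents the joined partial summaries may
    # themselves exceed max_chars; this sketch does not recurse.
    return call_model("\n\n".join(partial))
```

Passing the model call in as a parameter keeps the chunking logic independent of any particular provider and makes it easy to test with a stand-in function.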

STEP 4:

Utilize the LSA summarizer to perform extractive summarization by identifying key sentences within the PDF content.
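
To illustrate the idea behind LSA-based extractive summarization, here is a minimal sketch written directly with NumPy. It is not the implementation of any particular LSA summarizer library: the function name lsa_summarize, the simple regex sentence splitter, the raw term counts without stop-word removal, and the default num_topics are all simplifying assumptions. It builds a term-by-sentence matrix, applies singular value decomposition, and scores each sentence by its weight across the strongest latent topics.

```python
import re
from typing import List

import numpy as np


def lsa_summarize(text: str, num_sentences: int = 3, num_topics: int = 2) -> List[str]:
    """Return the highest-scoring sentences in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if len(sentences) <= num_sentences:
        return sentences

    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    vocab = sorted({word for words in tokenized for word in words})
    if not vocab:
        return sentences[:num_sentences]
    index = {word: i for i, word in enumerate(vocab)}

    # Term-by-sentence matrix of raw counts (rows: terms, columns: sentences).
    matrix = np.zeros((len(vocab), len(sentences)))
    for j, words in enumerate(tokenized):
        for word in words:
            matrix[index[word], j] += 1.0

    # SVD: rows of vt describe how strongly each sentence expresses each topic.
    _, sigma, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(num_topics, len(sigma))

    # Score = length of the sentence vector in the top-k topic space,
    # with each topic weighted by its singular value.
    scores = np.sqrt(((sigma[:k, None] * vt[:k, :]) ** 2).sum(axis=0))

    top = sorted(np.argsort(scores)[::-1][:num_sentences])
    return [sentences[i] for i in top]
```

Returning the selected sentences in document order, rather than score order, tends to keep the extract readable.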

STEP 5:

Design a user-friendly interface that allows users to upload PDF files, choose summarization methods, and view summarized content.

Describe the features / functions of the concerned work here

⮚ PDF Content Analysis: The tool extracts and analyzes content from uploaded PDF files.
⮚ Abstractive Summarization: Abstractive summarization generates human-like summaries using pre-trained language models via their API.
⮚ Extractive Summarization (LSA): Extractive summarization identifies key sentences in the PDF content to create concise summaries.
⮚ User Selection: Users can choose between abstractive and extractive summarization methods.
⮚ User-Friendly Interface: The web application provides an intuitive interface for PDF upload and summarization method selection.

Describe your required technologies / facilities to complete the prescribed work here

❖ Flask Framework: Utilized for building the web-based PDF Summarizer and creating the user interface.
❖ PyPDF Library: Employed to read and extract the content of PDF files for summarization.
❖ Pre-Trained Language Models API: Integrated to perform abstractive summarization by generating human-like summaries.
❖ LSA Summarizer: Used for extractive summarization by identifying key sentences from the PDF content.

Signature of Faculty In Charge
