
Department of Computer Science and Engineering

III B.Tech-I Semester Mini Project-2023


Submitted by
Batch –A17

YOUTUBE TRANSCRIPT SUMMARIZER

Presented by:
Zaheer Anwar 22R25A0501
CH Pradyumna 21R21A0508
M Mounika 22R25A0506
Introduction:
 YouTube viewership reached around 2.3 billion users in 2020 and
continues to grow rapidly.
 Approximately 300 hours of video are uploaded to YouTube every
minute.
 A Google study revealed that nearly one-third of YouTube users in India
access videos on their mobile devices, spending over 48 hours per
month on the platform.
 Despite the vast content available, finding specific information in
lengthy videos is frustrating and time-consuming.
 Existing ML-based video summarization methods require substantial
processing power and time because they must analyze a large number of frames.
 To address this, we propose using Latent Semantic Analysis (LSA), a Natural
Language Processing algorithm that works on the transcript text, for more
efficient processing.
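As a rough illustration of the LSA idea, the sketch below ranks transcript sentences by their weight in the dominant latent topic. It is a simplified, rank-1 variant: power iteration stands in for a full singular value decomposition, and the stop-word list, the regex tokenizer, and the function names are assumptions made for this example rather than part of the project.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in",
              "it", "for", "on", "that", "this", "with", "as"}


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return lowercase word tokens with stop words removed."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def lsa_summarize(text, num_sentences=2, iterations=50):
    """Pick the sentences that load most heavily on the dominant topic."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= num_sentences:
        return sentences

    # Term-by-sentence count matrix A, stored as one Counter per column.
    columns = [Counter(tokenize(s)) for s in sentences]

    # Power iteration on A^T A approximates the first right singular vector.
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        u = {}  # u = A v, a vector in term space
        for j, col in enumerate(columns):
            for term, count in col.items():
                u[term] = u.get(term, 0.0) + count * v[j]
        w = [sum(count * u[term] for term, count in col.items())
             for col in columns]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    # Keep the highest-scoring sentences, returned in their original order.
    ranked = sorted(range(n), key=lambda j: abs(v[j]), reverse=True)
    return [sentences[j] for j in sorted(ranked[:num_sentences])]
```

A fuller implementation would typically compute several singular vectors with a numerical library and select sentences across multiple topics. The rank-1 version is shown only because it needs nothing beyond the standard library.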

Abstract:

 The YouTube summarizer is a digital tool that uses natural language processing to
condense extended YouTube videos into brief summaries, providing essential
insights.
 Its objectives include time-saving, improved accessibility, and an enhanced
viewing experience for content creators and viewers. Utilizing advanced
algorithms and machine learning models, the tool analyzes topics, key points,
and timestamps.
 Successful implementation mandates careful consideration of system
requirements, encompassing hardware specifications and security protocols. This
enhances accessibility for individuals with disabilities or limited time, thereby
facilitating knowledge sharing and research.
Literature Survey:

Title : Automated YouTube Video Transcription
Authors : P Nagaraj; B Rohith; B Sai Vasanth; G Veda Varshith Reddy; A Koushik Teja
Objective : Creates a Python video summarization system emphasizing audio for
efficient content retrieval and browsing.
Methodology : Build a module that generates audio paraphrases from transcribed text,
preserving key statements and concepts with NLP techniques for contextual accuracy.
Features : Speech Recognition Integration; Audio Paraphrasing; Summary Integration
Date of Publication : 24-05-2023

Title : Video Transcript Summarizer
Authors : Atluri Naga Sai Sri Vybhavi; Laggisetti Valli Saroja; Jahnavi Duvvuru;
Jayanag Bayana
Objective : Design a globally accessible system for summarizing and providing
educational content, considering language diversity and cultural nuances.
Methodology : Integrate modules for popular web platforms (YouTube, Facebook, Google,
Instagram), using platform-specific APIs to fetch video transcripts and metadata for
summarization.
Features : NLP-Driven Key Element Identification; Educational Content Prioritization;
Global Accessibility Considerations
Date of Publication : 14-04-2022

Title : Text Summarizer Transcript Generator
Authors : Eesha Inamdar; Varada Kalaskar; Vaidehi Zade
Objective : Uses NLP techniques to identify key elements in YouTube video transcripts,
extracting important phrases, keywords, and contextual information.
Methodology : Use spaCy for preprocessing (cleaning, stop word removal), frequency
normalization, and calculating sentence weightage to generate concise and informative
summaries.
Features : spaCy Integration; Frequency Normalization; Text Preprocessing; Weighted
Sentence Calculation
Date of Publication : 2022

Title : YouTube Transcript Summarizer
Authors : Gousiya Begum; N. Musurat Sultana; Dharma Ashritha
Objective : This project aims to alleviate the challenges posed by the overwhelming
volume of online video content, where creators may prioritize views over content
accuracy.
Methodology : Uses spaCy for effective text summarization, mitigating misleading
content on YouTube.
Features : Misleading Content Mitigation; User Empowerment; Deployment Versioning
Date of Publication : 03-03-2022
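
The frequency-normalization and weighted-sentence approach described in the survey can be sketched roughly as follows. This is a minimal illustration under stated assumptions: a regex tokenizer and a small hand-picked stop-word list stand in for spaCy's preprocessing, and the function name `summarize_by_frequency` is hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list; the surveyed work relies on spaCy's list instead.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in",
              "it", "for", "on", "that", "this", "with", "as"}


def summarize_by_frequency(text, num_sentences=1):
    """Score each sentence by the normalized frequency of its words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in STOP_WORDS]
    if not words:
        return sentences[:num_sentences]

    # Frequency normalization: divide every count by the largest count,
    # so the most frequent word gets weight 1.0.
    counts = Counter(words)
    max_count = max(counts.values())
    weights = {word: count / max_count for word, count in counts.items()}

    # Sentence weightage: sum of the weights of the words it contains
    # (stop words are absent from `weights` and contribute 0).
    scored = []
    for index, sentence in enumerate(sentences):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        scored.append((sum(weights.get(t, 0.0) for t in tokens), index))

    # Take the best sentences, then restore their original order.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:num_sentences]
    return [sentences[index] for _, index in sorted(best, key=lambda pair: pair[1])]
```

Summing weights favors longer sentences. Dividing each score by the sentence length is a common adjustment if shorter summaries are preferred.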
Software Requirements:

 Operating system : Windows 10.

 Coding Language : Python


 Web Framework : Flask
Hardware Requirements:
 System : Intel Core i3 Processor.
 Hard Disk : 500 GB.
 Monitor : 15" LED
 Input Devices : Keyboard, Mouse
 RAM : 4 GB
Existing System:
 The current extension does not deliver satisfactory speed or accuracy.
 Summarization is available only for videos that already have
subtitles.
 Audio summaries are not sufficiently accurate.
 Very short videos produce extremely short text summaries.
Disadvantages of Existing system:

 The current extension lacks optimal speed and accuracy, and cannot summarize
videos without built-in subtitles.
 Summarization depends on subtitles and produces errors for long videos whose
transcripts exceed the 1024-word limit.
 Longer videos cannot produce audio summaries due to capacity constraints.
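
One common way to work around a fixed input limit such as the 1024-word limit mentioned above is to split the transcript into chunks, summarize each chunk, and join the partial summaries. The sketch below assumes the limit is counted in whitespace-separated words. The `summarize` argument is a hypothetical callable standing in for whatever summarizer the system uses.

```python
def chunk_transcript(text, max_words=1024):
    """Split a transcript into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def summarize_long_transcript(text, summarize, max_words=1024):
    """Summarize each chunk with the supplied callable and join the results."""
    chunks = chunk_transcript(text, max_words)
    return " ".join(summarize(chunk) for chunk in chunks)
```

For example, a 2500-word transcript yields three chunks of 1024, 1024, and 452 words. Chunk boundaries here may fall mid-sentence, and splitting on sentence boundaries would be a natural refinement.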
Proposed System :
Advantages of Proposed System:

 Speed

 Accuracy

 Larger videos eligible for summarization

 Summarization of videos without subtitles.


Functional Requirements:

 Automatic Speech Recognition:
Recognizes spoken words in YouTube videos for accurate transcription.
 Natural Language Processing:
Analyzes transcriptions to identify topics, key points, and timestamps for
summarization.
 Summarization Algorithm:
Implements an effective algorithm for condensing video transcripts into concise
summaries.
 User Authentication:
Ensures secure access to the system with unique credentials for users.
 Database Management:
Stores and retrieves video transcripts, summaries, and user data efficiently.
 User Interface:
Provides an intuitive interface for users to interact with and initiate summarization
tasks.
 Compatibility:
Ensures compatibility with various devices and browsers for a diverse user base.
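
The database management requirement could be met in many ways. As one possible sketch, the example below uses Python's built-in `sqlite3` module to store and retrieve a transcript and its summary keyed by video ID. The table name, column names, and function names are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the summary store; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS summaries ("
        "video_id TEXT PRIMARY KEY, "
        "transcript TEXT NOT NULL, "
        "summary TEXT NOT NULL)"
    )
    return conn


def save_summary(conn, video_id, transcript, summary):
    """Insert a summary, replacing any earlier entry for the same video."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO summaries (video_id, transcript, summary) "
            "VALUES (?, ?, ?)",
            (video_id, transcript, summary),
        )


def load_summary(conn, video_id):
    """Return the stored summary for video_id, or None if not present."""
    row = conn.execute(
        "SELECT summary FROM summaries WHERE video_id = ?", (video_id,)
    ).fetchone()
    return row[0] if row else None
```

Caching summaries this way also helps the performance requirement, since a video that has already been summarized does not need to be processed again.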
Non-Functional Requirements:

 Performance:
 Processes summarization tasks swiftly to provide a seamless user experience.
 Scalability:
 Handles a growing user base and increasing data volume without
compromising performance.
 Security:
 Implements robust security protocols to protect user data and ensure privacy.
 Reliability:
 Ensures consistent and accurate summarization results across different videos
and users.
 Accessibility:
 Enhances accessibility for users with disabilities through features like screen
reader compatibility.
 Usability:
 Provides an intuitive and user-friendly interface for users to navigate and
understand the summarization process.
 Maintainability:
 Allows for easy maintenance and updates to address evolving requirements
and technology changes.
SYSTEM ARCHITECTURE:
Data Flow Diagram:
Use Case Diagram:
Implementation:
Chrome Extension Manifest:
HTML:
JavaScript:
EXECUTION:
CHROME EXTENSION :
TRANSCRIPT SUMMARISING:
Result:
Conclusion:
 The YouTube Transcript Summarizer Chrome Extension, powered by
natural language processing and spaCy, delivers accurate and
concise video summaries, meeting diverse user needs for efficient content
consumption.
 With a modular design, specialized audio paraphrase integration, and real-
time processing, it excels in adaptability and scalability.
 Ongoing user feedback and iterative enhancements are expected to
reinforce its role as a useful tool that enhances the YouTube viewing
experience.
Feb 7, 2024