Professional Documents
Culture Documents
On
In
this project to a successful end. We humbly pray with sincere heart for this
Rakesh Kumar Roushan who has given guidance and light tous during
this project. Her versatile knowledge has cased us in the critical times
during the span of this project. We also take this opportunity to express
our gratitude to all those people who have been directly and indirectly with
us during the completion of the project. We want to thanks our friends who
have always encouraged us during this project. At the last but not least
suggestion.
TABLE OF CONTENTS
CHAPTERS:
ABSTRACT
CHAPTER 1: INTRODUCTION
1.1 SCOPE
1.2 PURPOSE
FEATURES
CHAPTER 4: DESIGN
CHAPTER 5: LIMITATIONS
CHAPTER 7: CONCLUSIONS
ABSTRACT
Every day, countless videos are being created and shared on YouTube and Such
videos are the primary source of learning for college and school going students,
people preparing for competitive exams and many more people use YouTube for
productive outcomes. But longer than expected videos can be difficult to watch,
and if we don't learn anything useful from them, our efforts might be in vain. Even
sometimes while watching video people face lots of obstacles like network issues
and all may lead to wastage of time. Automated summarization of video text
allows us to quickly spot important trends and efficiently streamline the video's
content, thus saving our time and efforts. In this YouTube transcript summarizer
web application developed using flask, the transcript of the video is converted
into text and thus summarizing that text and in case if there is no transcript
available then the model will convert the audio directly into text using speech
NLP.
1.INTRODUCTION
The YouTube Transcript Summarizer is a Chrome extension designed to condense lengthy video
transcripts into concise summaries.
In modern times, there is a vast quantity of videos being produced and shared on youtube
continuously. Globally, YouTube ranks as the second-highest frequented website.
YouTube offers a diverse selection of content, varying from short films and music videos to
feature films, documentaries, corporate sponsored movie trailers, live streams, vlogs, and other
material produced by famous YouTubers. Every day, video content on YouTube is being watched
collectively by its users for more than a total of one billion hours. In 2020, there were
approximately 2.3 billion people who used YouTube and the number of users has been quickly
growing each year. At a rate of 300 hours of video uploaded per minute, YouTube constantly
receives an immense amount of content.
According to research conducted by Google, almost 33% of viewers on YouTube in India use
their mobile devices to watch videos and spend more than 48 hours on the platform every month.
youtube is the primary source for each and every student where they can learn new concept and
can do the self study.But Watching such lengthy videos has become challenging because it is
possible to waste time without finding the desired information as our efforts may be unproductive
if we fail to retrieve the relevant information we seek. Searching for videos that contain the
relevant content can be a tedious and exasperating process. Many videos posted online involve a
speaker discussing a subject at length, yet it can prove challenging to locate the main message of
the presentation without viewing the entire video. Python offers different packages that can be
extremely useful. Accessing YouTube content, such as transcripts of videos, has now become
more convenient with the assistance of the API in the Python library. We are able to view the
video content directly and provide users with a summary by utilizing this benefit. One way to
achieve this is through the application of Hugging Face transformer, a method for summarizing
text. The generated summary is a result of using the hugging face transformer package.
Typically, written descriptions are used to encapsulate the content of YouTube videos rather than
automation. our model proposes the usage of a transformer package for summarizing the
transcripts of the video, thereby providing a meaningful and important summary of the video. Our
main concern is to summarize the data, by using the pre-trained summarization techniques.
1.1 SCOPE
The primary Scope is to provide users with a quick overview of video content
without having to watch the entire video.
1.2 PURPOSE
This project proposes the usage of a transformer package for summarizing the
transcripts of the video, thereby providing a meaningful and germane summary of
the video. T5 is an encoder-decoder model which is pre-trained on a set of
unsupervised and supervised tasks and for which each task is converted into a
text-to-text format. Our main concern is to summarize the data, so a pre-trained
summarization technique is used.
To increase the user interface to the functionality part Chrome extensions can be
used.This Chrome extensions contains a summarize button, which on click
displays the summarized text of the current YouTube video on the Google
Chrome web browser. This summarized text is generated by the hugging face
transformer package.
The YouTube videos are usually summarized through manual descriptions and
thumbnails. YouTube is the second most visited website worldwide. The range of
videos on YouTube includes short films, music videos, feature films,
documentaries, audio recordings, corporate sponsored movie trailers, live
streams, vlogs, and many other contents from popular YouTubers. YouTube
users watch more than one billion hours of video every day.
FEATURES
Users can easily install the extension from the Chrome Web Store, enhancing
accessibility.
Transcript Retrieval:
The extension fetches video transcripts directly from YouTube, utilizing YouTube
API for seamless integration.
User-Friendly Interface:
The extension integrates into the YouTube interface, allowing users to access
summarized content with ease.
2: REQIUREMENTS & ANALYSIS
1) Window 7
B. Hardware
C. Software
1) Python
2) Visual Studio
1) Flask
2) Ffmpeg
3: STEPS FOR YOUTUBE TRANSCRIPT SUMMARIZATION
1) Using a Python API, find the transcripts and subtitles for a particular YouTube
video ID.
3) If transcript is not available then download then extract audio from the video
APIs changed the way we build applications, there are countless examples of
APIs in the world, and many ways to structure or set up our APIs [5]. We create a
In this module, we are going to utilize a python API which allows you to get the
function which will accept YouTube video id as an input parameter and return
parsed full transcript as output. As we get the transcript in the json format with
text, start and duration attributes we only need text so we parse the data from the
library in python. Now using ffmpeg convert the required audio file into .wav
format.Using python speech recognition module system will convert audio into
text.
Text summarization is the task of shortening longer text into a precise summary
that preserves key information content and overall meaning. Thereare two
different approaches that are widely used for text summarization :Extractive
Summarization: In this the model identifies the important sentences and phrases from
The following step is to define the resources that will be used in implementation
● In app.py, We create a Flask API Route with GET HTTP Request method with
YouTube video id from the YouTube URL which is obtained from the query
Finally, we return the summarized transcript with HTTP Status OK and handle
❖ CHROME EXTENSION
4: DESIGN
5: LIMITATIONS
❖ Long videos with extensive technical jargon may pose challenges for
accurate summarization.
6: FUTURE ENHANCEMENTS
accuracy.
This project has proposed a YouTube Transcript summarizer. The system takes
the input YouTube video from the Chrome extension of the Google Chrome
browser when the user clicks on the summarize button on the chrome extension
web page, and access the transcripts of that video with the help of python API.
The accessed transcripts are then summarized with the transformers package.
Then the summarized text is shown to the user in the chrome extension web
page. This project helps the users a lot by saving their valuable time and
resources. This helps us to get the gist of the video without watching the whole
video. It also helps the user to identify the unusual and unhealthy content so that
it may not disturb their viewing experience. This project also ensures great user
interface experience in finding out the summarized text as chrome extensions
have been used. This helps in getting the summarized text without copying the
URL and pasting at terminals or by any third-party applications.
8.REFRENCES
[1] I. Awasthi, K. Gupta, P. S. Bhogal, S. S. Anand and P. K. Soni, "Natural Language
Processing (NLP) based Text Summarization - A Survey," 2021 6th International
Conference on Inventive Computation Technologies (ICICT), 2021, pp. 1310-1317, doi:
10.1109/ICICT50816.2021.9358703.
[5] https://huggingface.co/transformers/installation.htm
[6] https://atmamani.github.io/blog/building-restful-apis-with-flask-in-python/
[7] https://pypi.org/project/youtube-transcript-api/
[8] https://blog.miguelgrinberg.com/post/designing-a-restful-api-with-python-and-flask
[9] https://medium.com/swlh/parsing-rest-api-payload-and-query-parameters-with-flask-
better-than-marshmallow-aa79c889e3ca
[10] https://betterprogramming.pub/the-ultimate-guide-to-building-a-chrome-extension-
4c01834c63ec
[11] https://developer.chrome.com/docs/extensions/mv2/