A PROJECT REPORT
Submitted by
Student Name (Enrollment No.)
Of
BACHELOR OF ENGINEERING
In
Computer Engineering
Kalol Institute of Technology & Research Center
CERTIFICATE
This is to certify that the dissertation entitled “PDF to Audio Convertor & Translator”
has been carried out by Aditi Chauhan under my guidance in fulfillment of the degree of
Bachelor of Engineering in Computer Engineering (5th semester) of Gujarat Technological
University, Ahmedabad during the academic year 2023-24.
I take this opportunity to humbly express my thanks to everyone concerned
with my project.
There are many people without whose help I could never have conceived this project
and learnt from it, and to whom I would like to express my gratitude: my friends,
colleagues, and of course the IT Department of KITRC.
With regards,
Aditi Chauhan
(210260107012).
1. INTRODUCTION
PROJECT PROFILE
SIZE Individual
• Submitted By
Aditi Chauhan (210260107012)
• Submitted To
Kalol Institute of Technology and Research Centre
In this section, you can introduce the problem statement, the objectives, the scope, and the significance
of your project. You can also mention the challenges and limitations that you faced while developing your
project. For example:
• The problem statement is to develop a Python program that can convert and translate PDF files to audio files
in different languages, using various libraries and tools.
• The objectives are to provide a user-friendly interface, to support multiple languages, to handle different
types of PDF files, and to produce high-quality audio files.
• The scope is to implement the basic functionalities of converting and translating PDF files to audio files,
such as selecting a PDF file, choosing a language, setting the output folder, and playing the audio file.
• The significance is to help users who want to listen to PDF files instead of reading them, such as students,
researchers, visually impaired people, etc. It can also help users who want to learn a new language or
improve their pronunciation and listening skills.
• The challenges are to deal with complex PDF files that contain images, tables, graphs, etc., to handle
different languages and accents, to optimize the speed and performance of the program, and to ensure the
accuracy and quality of the conversion and translation.
• The limitations are to depend on external libraries and tools that may have some errors or limitations, to
require an internet connection for the translation service, and to support only a limited number of languages.
Project Management:
In this section, you can describe the methodology, the tools, the resources, and the timeline of your project.
You can also mention the roles and responsibilities of the team members, if any. For example:
• The methodology is to follow the agile approach, which involves iterative and incremental development,
testing, and feedback. The project is divided into several sprints, each with a specific goal and
deliverable. The project is also documented and presented at each stage.
• The tools are PyCharm, which is an integrated development environment (IDE) for Python, GitHub,
which is a version control and collaboration platform, and Google Slides, which is a presentation
software.
• The resources are various Python libraries and packages, such as tkinter, PyPDF2, pyttsx3, gTTS,
googletrans, etc., that provide the functionalities of creating a graphical user interface (GUI), reading and
extracting text from PDF files, converting text to speech, translating text to different languages, etc.
• The timeline is to complete the project within four weeks, with the following milestones:
o Week 1: Designing and developing the GUI, selecting and installing the required libraries and
packages, testing the basic functionalities of reading and converting PDF files to audio files.
o Week 2: Implementing and testing the translation feature, supporting multiple languages and
accents, handling different types of PDF files, optimizing the speed and performance of the
program.
o Week 3: Debugging and fixing any errors or bugs, improving the quality and accuracy of the
conversion and translation, adding some additional features, such as pausing, resuming, and
stopping the audio playback, adjusting the volume and speed of the audio, etc.
o Week 4: Documenting and presenting the project, evaluating the results and feedback, making any
final changes or improvements, submitting the project report and code.
• The roles and responsibilities are to assign each team member a specific task or module, such as
designing the GUI, reading and converting PDF files, translating text, etc., and to coordinate and
communicate with each other regularly, using GitHub and Google Meet.
3. System requirement study: In this section, you can specify the hardware and software requirements of
your project, such as the operating system, the processor, the memory, the disk space, the internet
connection, the Python version, the libraries and packages, etc. You can also mention the assumptions and
dependencies of your project, such as the format and size of the PDF files, the availability and reliability
of the translation service, the compatibility and accessibility of the audio files, etc. For example:
• PyCharm: It is an IDE for Python that provides many features and tools for software development,
such as code completion, inspection, debugging, testing, etc. It helped us in writing and
running our Python code, and in finding and fixing any errors or bugs.
• GitHub: It is a version control and collaboration platform that allows us to store, manage, and share our
code and files online. It helped us in tracking and documenting the changes and progress of our project,
and in collaborating and communicating with our team members.
• Google Slides: It is a presentation software that allows us to create and edit slides, add images,
animations, transitions, etc. It helped us in presenting and demonstrating our project, and in explaining the
problem statement, the objectives, the scope, the significance, the methodology, the tools, the resources,
the timeline, the roles and responsibilities, the system requirements, the results, the feedback, the
conclusion, and the bibliography of our project.
• tkinter: It is a Python library that provides a simple and easy way to create GUI applications. It helped us
in designing and developing the GUI of our project, and in providing a user-friendly interface for the user
to interact with the program.
• PyPDF2: It is a Python library that provides a pure-Python toolkit for working with PDF files. It helped
us in reading and extracting text from PDF files. Text that appears only inside images or graphs is not
extracted, because PyPDF2 does not perform optical character recognition (OCR).
• pyttsx3: It is a Python library that provides a cross-platform text to speech conversion. It helped us in
converting text to speech, and in supporting multiple languages and accents.
• gTTS: It is a Python library that provides an interface to the Google Translate text-to-speech API. It
helped us in converting the translated text to speech, and in producing high-quality audio files.
• googletrans: It is a Python library that provides an interface to Google Translate API. It helped us in
detecting the language of the text, and in translating text to different languages.
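Before running the program, it can be useful to confirm that the libraries listed above are installed. The following is a minimal sketch, not part of the original project: the function name find_missing_modules is hypothetical, and the import names are assumptions (gTTS is assumed to be importable as "gtts").

```python
import importlib.util

# Import names are assumptions: gTTS is assumed to install as "gtts";
# the other libraries are assumed to import under the names used in this report.
REQUIRED_MODULES = {
    "tkinter": "graphical user interface",
    "PyPDF2": "PDF text extraction",
    "pyttsx3": "offline text to speech",
    "gtts": "online text to speech (gTTS)",
    "googletrans": "language detection and translation",
}


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the names of modules that cannot be located, without importing them."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

Any library reported as missing can then be installed with pip before the program is started.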
5. Functional & behavioral Design of System: In this section, you can describe the functional and behavioral
design of your system, such as the input, the output, the process, the data flow, the use cases, the
algorithms, the pseudocode, the flowcharts, the diagrams, etc. You can also explain the logic and the steps
of your system, and how it works. For example:
• The input is a PDF file that the user selects from their computer, and a language that the user chooses
from a list of options.
• The output is an audio file that is saved in the output folder, and that is played aloud by the program.
• The process is as follows:
o The user runs the program and sees the GUI of the project, which has a button to select a PDF file,
a drop-down menu to choose a language, a button to convert and translate the PDF file to audio
file, a button to play the audio file, and a button to exit the program.
o The user clicks on the button to select a PDF file, and browses their computer to find and select the
PDF file that they want to convert and translate to audio file.
o The user clicks on the drop-down menu to choose a language, and selects the language that they
want to translate the PDF file to.
o The user clicks on the button to convert and translate the PDF file to audio file, and waits for the
program to complete the conversion and translation process.
o The program reads and extracts the text from the PDF file using the PyPDF2 library, and detects
the language of the text using the googletrans library.
o The program translates the text to the language that the user selected using the googletrans library,
and converts the translated text to speech using the gTTS or pyttsx3 library.
o The program saves the audio file in the output folder, and displays a message that the conversion
and translation process is done.
o The user clicks on the button to play the audio file, and listens to the audio file that is played aloud
by the program.
o The user clicks on the button to exit the program, and closes the GUI of the project.
• The data flow is as follows:
o PDF file -> PyPDF2 -> Text
o Text -> googletrans -> Language detection
o Text -> googletrans -> Translated text
o Translated text -> gTTS / pyttsx3 -> Speech
o Speech -> Audio file
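The process and data flow above can be sketched as a chain of three stages: extract, translate, and synthesize. This is only an illustrative sketch under stated assumptions. The Pipeline class and its field names are hypothetical and do not come from the project code, and each stage is passed in as a plain function so that the chain can be tried out without a PDF file or an internet connection.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Chains the stages of the data flow; every stage is a plain callable."""

    extract_text: Callable[[str], str]           # PDF path -> text
    translate: Callable[[str, str], str]         # (text, language code) -> translated text
    synthesize: Callable[[str, str, str], None]  # (text, language code, audio path) -> writes audio

    def run(self, pdf_path: str, lang: str, audio_path: str) -> str:
        text = self.extract_text(pdf_path).strip()
        if not text:
            # For example, a scanned PDF that contains only images.
            raise ValueError("No extractable text found in the PDF")
        translated = self.translate(text, lang)
        self.synthesize(translated, lang, audio_path)
        return audio_path


if __name__ == "__main__":
    # Stand-in stages for a dry run; real stages would call PyPDF2,
    # googletrans and gTTS (or pyttsx3) instead.
    demo = Pipeline(
        extract_text=lambda path: "Hello world",
        translate=lambda text, lang: f"[{lang}] {text}",
        synthesize=lambda text, lang, out: print(f"would write {out!r} from {text!r}"),
    )
    demo.run("sample.pdf", "hi", "sample.mp3")
```

Keeping the stages separate means that the speech stage can be switched between gTTS and pyttsx3 without touching the extraction or translation code.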
PDF to Audio Converter and Translator
Be it browsing through the seemingly endless pages of terms and conditions on an important
official document or kicking back and flipping through an intriguing eBook, reading is an
undeniable and inescapable part of our everyday lives.
However, reading anything demands our complete, undivided attention, making it nearly
impossible for us to multitask. Moreover, staring at a screen for long periods also strains
our eyes.
This PDF to Audio Converter and Translator, developed using Python, converts the text of
a PDF document into audio.
Along with reading any PDF document out loud, this application can also translate and
vocalize the text in up to five languages.
Moreover, this system can also benefit visually impaired individuals and people with learning
disabilities such as dyslexia.
2) Working of the Project
1. This project has a user page that first lets you sign up, after which you can log in.
2. The user then uploads a PDF file containing text.
3. The file is opened in binary (byte) mode and read with "PyPDF2.PdfFileReader".
4. The text of each page is then extracted by calling "extractText()" on the page.
5. The library googletrans is used to translate the text from the language it is written in into
the language selected by the user.
6. We have several options in languages, such as English, Hindi, Marathi, Gujarati, etc.
7. Using "gTTS", the translated text is converted to audio that the user can hear.
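The steps above can be sketched with the three libraries as follows. This is a hedged sketch, not the project's actual code: the helper names are hypothetical, it assumes PyPDF2 2.0 or later (where "PdfReader" and "extract_text()" replace the older "PdfFileReader" and "extractText()" names), a googletrans release whose "translate" call is synchronous, and an internet connection for both googletrans and gTTS.

```python
# Language names from the report mapped to ISO 639-1 codes.
LANGUAGE_CODES = {"English": "en", "Hindi": "hi", "Marathi": "mr", "Gujarati": "gu"}


def language_code(name):
    """Look up the code for a language name, with a clear error if unsupported."""
    try:
        return LANGUAGE_CODES[name]
    except KeyError:
        raise ValueError(f"Unsupported language: {name}") from None


def extract_text(pdf_path):
    """Return the text of all pages, joined by newlines."""
    from PyPDF2 import PdfReader  # assumption: PyPDF2 >= 2.0

    reader = PdfReader(pdf_path)
    # extract_text() may yield nothing for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def translate_text(text, language):
    """Translate text into the chosen language (requires internet access)."""
    from googletrans import Translator  # assumption: synchronous googletrans release

    return Translator().translate(text, dest=language_code(language)).text


def save_speech(text, language, mp3_path):
    """Write the spoken text to an MP3 file (requires internet access)."""
    from gtts import gTTS

    gTTS(text=text, lang=language_code(language)).save(mp3_path)
```

A typical call sequence would be extract_text("book.pdf"), then translate_text(text, "Hindi"), then save_speech(translated, "Hindi", "book.mp3"), where the file names are placeholders. The library imports sit inside the functions so that the language table can be used even when a library is not installed.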
3) Advantages
a) Very efficient
b) People from different regions can use this
c) Saves time
4) System Description
The system comprises one major module with the following sub-modules:
User:
o The waterfall model is a classical model used in the system development life cycle to create a
system with a linear and sequential approach. It is termed waterfall because the model develops
systematically from one phase to another in a downward fashion. The waterfall approach does not
define a process for going back to a previous phase to handle changes in requirements.
The waterfall approach is the earliest approach that was used for software development.
6) System Requirement
I. Hardware Requirement
i. Laptop or PC
• Intel Core i3 processor or higher
• 4 GB RAM or higher
• 100 GB disk storage or higher
II. Software Requirement
i. Laptop or PC
• Windows 7 or higher
• VS Code
• Python 3.7
• Django 3
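The Python version requirement can be checked when the program starts. The sketch below is illustrative only: the name python_ok is hypothetical, and treating 3.7 as a minimum rather than an exact version is an assumption.

```python
import sys

MIN_PYTHON = (3, 7)  # taken from the software requirements above


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        sys.exit("Python %d.%d or higher is required." % MIN_PYTHON)
    print("Python version is sufficient.")
```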
7) Limitation/Disadvantages
8) Application
• Can be used to vocalise any PDF file.
• Besides reading out text, this system instantly and efficiently translates it as well.
• Useful for visually impaired individuals and people with learning disabilities as well.