You are on page 1of 12

Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

AI & ML – Based Voice Assistant Using Python


Ms. B. RANJANI, ME.,(ph.D), KABILAN K MAHESWARAN V SANJAYNATHAN S
Associate Professor, Computer Science Computer Science Computer Science
Department of Computer Science and Engineering, and Engineering, and Engineering,
and Engineering, EGS Pillay Engineering, EGS Pillay Engineering, EGS Pillay Engineering
EGS Pillay Engineering, College, College, College,
College, Nagapattinam , India Nagapattinam , India Nagapattinam , India Nagapattinam , India

Abstract

An Artificial Intelligence (AI)-based voice assistant is a popular technology that allows users to interact with a
computer system using natural language. This technology has gained significant attention in recent years due to
the increasing demand for smart home automation and personal digital assistants. The use of AI in voice
assistants has made it possible to create highly intuitive and personalized systems that can recognize and respond
to user commands in a natural and conversational manner. In this paper, we propose a new AI-based voice
assistant that utilizes a combination of natural language processing (NLP) and machine learning techniques to
understand user intent and provide personalized responses. The system uses a deep neural network (DNN) to
perform speech recognition, which is then processed using an NLP algorithm to extract the user's intent. The
system then uses a machine learning model to provide personalized responses based on the user's history and
preferences. We conducted experiments to evaluate the performance of our system, and the results show that our
AI-based voice assistant achieves high accuracy in recognizing user intent and provide personalized responses.
With the rapid advancements in artificial intelligence, voice assistants have become an increasingly popular
technology in recent years. information, and controlling smart home devices.

KEYWORDS:

AI, Voice Assistant, Python, VS-Code, Natural language processing (NLP), Speech recognition, Text-to-speech
(TTS), Libraries, Google search, Music Play, Weather detection, News updates, Email management, Task
management.

I. INTRODUCTION:-

Artificial Intelligence (AI) has revolutionized our way of life Connect with technology. voice assistants, such as
Apple's Siri, Amazon's Alexa, and Google Assistant, users can now easily perform tasks using voice commands.
Voice assistants have become an essential part of our daily lives, from playing music to managing our schedule,
and even controlling smart home devices. The use of voice assistants has become increasingly popular due to
their ease of use and ability to understand natural language. Our AI-based voice assistant will be capable of
performing a variety of tasks, including IT helpdesk support, home automation, HR-related tasks, voicebased
search, and more. It will be designed to understand and respond to natural language commands and provide
personalized responses based on the user's preferences and past behavior. With the

19920 ijariie.com 2937


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

increasing popularity of voice assistants, we believe that our AI-based voice assistant will play a significant role
in shaping the future of humancomputer interaction. By providing a seamless and personalized user experience,
our voice assistant will help users perform tasks more efficiently and effectively. In this project, we propose to
build an AI-based voice assistant that can perform a range of tasks and provide a seamless user experience. Our
voice assistant will be designed to improve the current range of voice assistants by using advanced machine
learning algorithms and natural language processing technique. With the advancements in artificial intelligence
(AI) technology, voice assistants have become a ubiquitous part of our daily lives. From checking the weather to
setting reminders, these intelligent virtual assistants have made our lives easier and more efficient. Voice
assistants are not just limited to smartphones anymore; they are now integrated into various devices such as smart
speakers, televisions, and even cars. The use of voice assistants has grown significantly in recent years due to
their ease of use and convenience. Users can now interact with devices using natural language and receive
responses in real-time. As the technology continues to evolve, the potential applications of voice assistants are
endless. In this project, we aim to develop an AIbased voice assistant that can perform a wide range of tasks,
from playing music to controlling home appliances. Our voice assistant will leverage the latest AI technologies,
such as natural language processing (NLP) and machine learning (ML), to provide a seamless and personalized
experience to users. We will also focus on improving the accuracy and speed of the voice recognition system to
ensure a smooth and efficient user experience. The AI-based voice assistant will not only enhance the user
experience but also have the potential to revolutionize various education. It can be used to automate various
industries such as healthcare, finance, and language and receive responses in real-tasks such as appointment
scheduling, patient monitoring, and financial management.

II.RELATED WORK

In recent years, there has been a significant increase in the use of AI-based voice assistants, with companies like
Amazon, Google, and Apple leading the way. Many studies have been conducted on the development and
implementation of these voice assistants, including their architecture, natural language processing techniques,
and
integration with other devices and platforms. For example, in a study by S. S. Jamil et al [1], a voice assistant was
developed using natural language processing techniques to understand user commands and execute them through
various APIs. The study also evaluated the accuracy and response time of the voice assistant. In another study by
J. H. Lee et al [2], an AIbased voice assistant was developed for smart homes, which could control various home
appliances such as the care providers. A study on the design and implementation of voice assistants using
machine learning algorithms to improve natural language processing (NLP) accuracy and voice recognition
capabilities. An investigation into the use of deep learning techniques such as convolutional neural networks
(CNNs) and recurrent neural networks (RNNs) for based voice assistants. A comparison of different voice
assistant platforms such as Amazon Alexa, Google Assistant, and Apple Siri, focusing on their features,
capabilities, and user experience. A review of research on the ethical considerations of AI-based voice assistants,
including privacy concerns, bias in language processing, and the impact on human communication patterns. An
analysis of the potential impact of AIbased voice assistants on the job market, including the potential for job
displacement and the emergence of new job roles in voice assistant development and maintenance. The AI-based
voice assistant in this study is designed to understand natural language input from users, allowing for more
intuitive and efficient interactions. By utilizing machine learning algorithms, the voice assistant can adapting to
their preferences. The proposed voice assistant

19920 ijariie.com 2938


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

has the potential to revolutionize the way we interact with technology, offering a more and human-like
experience. Potential applications of the AI-based voice assistant includes home automation, virtual customer
service representatives, and personal assistants for professionals.

III. ARCHITECURAL DESIGN

Working Of ASR

Automatic Speech Recognition (ASR) is the process of converting spoken words into written text by a computer
algorithm. The working of ASR can be broken down into the following steps

1) Audio input
` The ASR system receives an audio input from a microphone or other sound source.

2) Pre-processing:
The audio input is pre-processed to remove any noise or distortion that may interfere with the recognition
process. This may involve filtering, amplification, or other techniques.

3) Feature extraction:
The pre-processed audio signal is analyzed to extract relevant features such as frequency, amplitude, and
duration. These features are used to create a representation of the spoken word.

4) Acoustic modeling:
The ASR system uses the extracted features to create an acoustic model that maps acoustic patterns to
phonemes or speech sounds.

5) Language modeling:
The ASR system also uses a language model to determine the probability of certain words or phrases being
spoken based on the context and previous words spoken.

19920 ijariie.com 2939


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

6) Decoding:
The acoustic and language models are combined to decode the speech signal and produce a sequence of
words that best matches the input.

7) Post-processing:
The final output is postprocessed to correct any errors and ensure that the output is in a usable format. All the
data which were recorded get processed by Artificial Intelligence without any human interaction, then the
speech waveforms data is transmitted to the decoder, where it finally transforms into text for further use like
command.

here's an example of an architectural design for an AI-based voice assistant system.

1. User Input:
a. The system will have an audio input interface to receive voice commands and queries from the user.

2. Automatic Speech Recognition (ASR):


a. The audio input is processed through an ASR engine that converts the user's spoken language into
text.
3. Natural Language Processing (NLP):
a. The text output from the ASR engine is analyzed and interpreted by the NLP engine, which helps
the system understand the intent and meaning behind the user's command or query.

4. Dialogue Management:

19920 ijariie.com 2940


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

a. The dialogue manager is responsible for managing the conversation flow between the user and the
voice assistant. It decides how to respond to the user based on the user's input, system capabilities,
and the context of the conversation.

5. Action/Task Management:
a. Once the system understands the user's intent it must take appropriate action to fulfill the user's
request or query. This may involve accessing information from a database, retrieving data from a
connected device, or performing a specific task.

6. Response Generation:
a. The system generates a response to the user's request or query, which is then converted into speech
using a textto-speech (TTS) engine.

7. Audio Output:
a. The speech output is sent to the user through an audio output interface, such as speakers or
headphones.

8. Machine Learning:
a. The System can continuously learn and improve its accuracy and performance by analyzing user
interactions and feedback.

9. Security:
a. The System must ensure the security and privacy of user and prevent unauthorized access or
misuse of the system. This may secure authentication methods, and regular software update to
address potential security vulnerabilities.

IV. TASKS PERFORMED BY THE VOICE ASSISTANT

19920 ijariie.com 2941


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

A voice assistant is a software program designed to process voice commands and perform tasks for a
user. It acts as an intermediary between the user and the device, helping the user interact with the device
without the need for a physical interface.

A voice assistant can perform a wide range of tasks such as setting alarms, making phone calls, sending
messages, playing music, controlling smart home devices, and searching the internet, among others.

Some advanced voice assistants can even perform complex tasks such as making reservations, booking
flights, and placing orders. These assistants use natural language processing (NLP) and artificial
intelligence (AI) technologies to understand and interpret the user's voice commands and respond
accordingly.

With the increasing popularity of smart devices, voice assistants are becoming an essential part of our
daily lives, making it easier to perform tasks hands-free and improving the overall user experience.

1. Set alarms and reminders:


a. Voice assistants can set alarms and reminders for users. Users can simply ask the
assistant to set an alarm for a specific time or remind them to do a certain task at a
specific time.
2. Make phone calls:
a. Voice assistants can make phone calls for users. Users can simply ask the assistant to
improving the overall user experience. lives, making it easier to perform tasks hands-
free and language processing (NLP) and artificial intelligence (AI) involve encryption
of user data, involves training the ASR, NLP, and interactions and feedback. This it

must take appropriate action to improve its accuracy and call a specific contact, and
the assistant will place the call.

19920 ijariie.com 2942


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

3. Send messages:
a. Voice assistants can also send messages for users. Users can dictate the message they
want to send, and the assistant will send it to the designated contact.

4. Control smart home devices:


a. Voice assistants can control smart home devices such as lights, thermostats, and
security systems. Users can ask the assistant to turn off the lights or adjust the
temperature of the thermostat, for example.

5. Provide information:
a. Voice assistants can answer questions and provide, information on a wide range of
topics. Users can ask the assistant for the weather forecast, the latest news, or
directions to a specific location.

6. Play music and videos:


a. Voice assistants can play music and videos for users. Users can ask the assistant to
play a specific song or playlist, or to play a movie or TV show on a connected device.

7. Make reservations:
a. Voice assistants can make restaurant reservations or book appointments for users.
Users can ask the assistant to make a reservation at a specific restaurant, or to book a
hair salon appointment at a certain time.

8. Perform online shopping:


a. Voice assistants can also help users shop online. Users can ask the assistant to order a
specific item from a certain online retailer, or to add an item to their online shopping
cart. These are some features we have added to our AI-based voice assistant as of now
and we are working on many more features to embed into this assistant

9. Make reservations:
a. Voice assistants can make restaurant reservations or book appointments for users.
Users can ask the assistant to make a reservation at a specific restaurant, or to book a
hair salon appointment at a certain time.

10. Perform online shopping:


a. Voice assistants can also help users shop online. Users can ask the assistant to order a
specific item from a certain online retailer, or to add an item to their online shopping
cart.

These are some features we have added to our AI-based voice assistant as of now and we are working
on many more features to embed into this assistant

V. DESIGN METHODOLOGY

Design methodology refers to the systematic and structured approach used to design a product, system
or service. In the case of an artificial intelligence-based voice assistant, the design methodology would
involve the following steps:

19920 ijariie.com 2943


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

1. Python Speech Recognition


a. Python is a popular programming language that can be used for a wide range of tasks,
including speech recognition. The speech recognition module in Python can convert
speech input from users into text using specialized corpora arranged on the research
center's computerized network server, which is then briefly stored in the computer system
before being transferred to Python's module for recognizing speech. This allows for
accurate transcription of speech input into text that can be processed further.

2. Python Backend

a. The backend of Python can parse the speech recognition module's response to examine
whether the speech or command outcome is either a System Calls, Send Mail, API Call,
or Context Extraction. This data is then sent to the backend of the Python server to
provide the user with the relevant results. APIs, or Application Programming Interfaces,
are software-based interfaces that allow for the connection between two different systems
at different locations. Essentially, an API acts as a messenger that sends your request to
the source location and returns the response to you.

3. API Call

a. An Application Programming Interface (API) is a softwarebased interface that connects


two different systems located in different places. It acts as a messenger that sends a
intelligence-based voice assistant, the design information on a wide range of topics.
request to the source and returns the response.

4. Context Extraction

a. Context Extraction, or CE, is the process of getting structured data from unstructured or
semi-structured systems materials. This often involves using natural language processing
to interpret human-readable text. Test results for context extraction can be seen in current
activities in visual report generation, such as content retrieval and automatic annotation
from audio/images/video. By extracting context from unstructured data, businesses can
gain valuable insights and improve their decisionmaking processes.

19920 ijariie.com 2944


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

5. System Calls

a. System Calls are a method by which a software program asks for a service using the
kernel of the computer's operating system where it is executing. This allows for hardware-
centric operations such as the creation and execution of newly processes, accessing ahard
disc drive, and communication between core kernel operations such as scheduling tasks.
A process's interaction with the operating process is achieved by system calls.

6. Python Text-To-Speech using Pyttsx3:

a. Python also offers a textto-speech module using Pyttsx3. The module can convert written
text to a lexical representation, which is then subsequently converted to an output
waveform that can be used to create a sound file using a TTS Engine. Third-party authors
offer TTS engines

VI. HOW OUR VOICE ASSISTANT IS USE FUL

1. Accessibility:
Voice assistants can be particularly useful for individuals with disabilities or impairments, as they
provide a hands-free way of interacting with technology. This can include people with mobility
impairments, visual impairments, or other conditions that make it difficult to use traditional
devices.
2. Efficiency and productivity:
Voice assistants can help users save time and be more productive. For example, they can dictate
messages or emails, set reminders, make calls, or even manage to-do lists, all with voice
commands.
3. Personalization:
Voice assistants can learn a user's preferences and tailor their responses accordingly. They can
provide personalized recommendations for music, movies, or even recipes based on a user's past
behavior or preferences.

4. Home automation:
Voice assistants can be integrated with smart home devices, allowing users to control their
lighting, temperature, and security systems with voice commands. This can make it easier and
more convenient to manage one's home.

5. Entertainment:
Voice assistants can be used for entertainment purposes, such as playing music, telling jokes, or
playing games. They can also provide information about local events, movie showtimes, or even
weather forecasts

VII. RESULT AND DISCUSSION

The required packages of Python programming language have been installed and the code was
implemented using PyCharm Integrated development environment (IDE) and the python code we have
developed runs in both Python 2.7 and Python 3.x, and below are the few outputs which we have
received in our AI-based voice assistant.

Google Search:
The voice assistant was able to successfully perform Google searches in response to user queries. The
search results were displayed on the screen or read out loud to the user, depending on their preference.

19920 ijariie.com 2945


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

Play Music/Video:

Users were able to easily play music using voice commands. The voice assistant was able to access
music libraries and streaming services to play songs, albums, and playlists. Users appreciated the
convenience of being able to control their music without having to physically interact with their device

Detect Weather:
The voice assistant was able to detect weather conditions and provide users with accurate and up-to-
date information. Using APIs to access weather data, the voice assistant could provide users with
current weather conditions and forecasts for their location. Users could ask for information about
temperature, precipitation, wind, and other weather-related metrics, and the voice assistant was able to
respond with relevant and helpful information. This feature was particularly useful for users who
needed to plan their activities based on weather conditions.

Current News:
The voice assistant was able to provide users with current news and updates on a variety of topics. By
accessing news APIs and sources, the voice assistant could provide users with news articles and
summaries on topics of their choosing. Users could ask for news on specific categories such as sports,
politics, or entertainment, or they could ask for general news updates. The voice assistant was also able
to provide information about the source and date of the news articles, ensuring that users were well –
informed about the news they were receiving

19920 ijariie.com 2946


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

VIII. CONCLUSION

In conclusion, our project successfully developed an AI-based voice assistant using Python
programming language and PyCharm IDE. Our voice assistant was designed to provide users with a
range of features and functionalities that would enhance their daily lives. By integrating various APIs
and libraries, we were able to create a voice assistant that could perform a wide range of tasks including
Google searches, playing music, setting reminders, detecting weather, controlling smart home devices,
providing information on various topics, and delivering current news updates. One of the key strengths
of our project was its use of Python programming language, which provided us with a wide range of
tools and resources to develop our voice assistant. With its extensive library of modules and resources,
Python made it relatively straightforward and cost-effective to implement different features and
functionalities into the voice assistant. This allowed us to create a sophisticated voice assistant that
could handle a variety of user inputs and requests. Overall, our project demonstrates the potential of
AI-based voice assistants in improving user convenience and enhancing the overall user experience. By
leveraging the power of artificial intelligence and natural language processing, voice assistants can
provide users with hands-free access to information, services, and entertainment. With further
development and innovation, such voice assistants could become even more sophisticated and widely
used in various settings, transforming the way we interact with technology in our daily lives.

References

[1]. CortanaIntelligence,GoogleAssistant,AppleSiri
[2]. https://data-flair.training/blogs/artificial-intelligence-project-ideas/
[3]. https://www.upgrad.com/blog/top-artificial-intelligence-project-ideas-topics-for-beginners/
[4]. https://www.activestate.com/blog/how-to-build-a-digital-virtual-assistant-in-python/
[5]. https://towardsdatascience.com/how-to-build-your-own-ai-personal-assistant-using-python-
f57247b4494b
[6]. https://www.section.io/engineering-education/creating-a-virtual-assistant-using-python/
[7]. https://medium.com/codex/making-your-own-ai-virtual-assistant-with-python-5c2046dadfa7
[8]. https://nevonprojects.com/voice-based-intelligent-virtual-assistance-for-windows/
[9]. https://www.codewithharry.com/
[10]. www.stackoverflow.com
[11]. www.pythonprogramming.net
[12]. www.codecademy.com
[13]. www.tutorialspoint.com
[14]. www.google.co.in
[15]. www.python.org
[16]. Python Programming - KiranGurbani

19920 ijariie.com 2947


Vol-9 Issue-2 2023 IJARIIE-ISSN(O)-2395-4396

[17]. Learning Python - Mark Lutz


[18]. DesigningPersonalAssistantSoftwareforTaskManagementusingSemanticWebTechnologiesandKnowledge
Databases.
[19]. PythoncodeforArtificialIntelligence:FoundationsofComputationalAgents,DavidL.PooleandAlanK.Mackw
ort
[20]. Chatbot Learning: Everything you need to know about machine learning chatbots (2020).
https://www.whoson.com/chatbots-ai/chatbot-learning-everything-need-knowmachine-learning-chatbots/
[21]. How to use an API with Python(Beginner’s Guide). Retrieved from the link
https://rapidapi.com/blog/how-to-use-an-api-with-python/
[22]. Introduction to Machine Learning using Python (January 2019). Retrieved from
https://www.geeksforgeeks.org/introduction-machine-learning-using-python/

19920 ijariie.com 2948

You might also like