
Technobot Using Python

Project Report
<Version 1.0>

Industrial Training

BACHELOR OF TECHNOLOGY (CSE)

PROJECT GUIDE:
Mr. Anurag Gupta
Assistant Professor

SUBMITTED BY:
Vidhi Jain
TCA1909008

December, 2021

COLLEGE OF COMPUTING SCIENCES AND INFORMATION TECHNOLOGY


TEERTHANKER MAHAVEER UNIVERSITY, MORADABAD
ACKNOWLEDGEMENT

I take this opportunity to express my profound gratitude and deep regards to my
teacher, Mr. Anurag Gupta, for his exemplary guidance, monitoring and
constant encouragement throughout the course of this project. The blessing, help
and guidance given by him from time to time shall carry me a long way in the journey
of life on which I am about to embark.

Place:

Date:

Sign:

DECLARATION

We hereby declare that this Project Report titled Technobot Using Python,
submitted by us and approved by our project guide Mr. Anurag Gupta, College of
Computing Sciences and Information Technology (CCSIT), Teerthanker Mahaveer
University, Moradabad, is a bonafide work undertaken by us and that it has not been
submitted to any other University or Institution for the award of any degree /
diploma / certificate, or published at any time before.
Project Group

Student Name: Vidhi Jain

Project Guide (external):

Project Guide (internal): Mr. Anurag Gupta

Brief About the Company

LinuxWorld ('LW') is a fast-growing ISO 9001:2008 certified
organisation, fully governed by young and energetic technocrats
dedicated to open source technologies and Linux promotion.
Since its inception in 2005, LW has achieved the status of a
centre of excellence, with the latest technology, innovative
development methodology and state-of-the-art infrastructure, where the
individual needs of employees are identified and addressed professionally,
efficiently and ethically.
LW provides training by in-house experts, certified trainers or corporate
developers in areas related to System & Network Administration,
Programming Languages, Applications/Software Packages, and Server
Administration, depending on the skills required by the customer.
LW is structured around its customers, who are the focus of
the organisation and lie at the heart of its strategy. LW encourages
its customers to self-manage, identifying for themselves where their needs
lie. Courses are available in areas such as security,
certificate courses, development and database management.

Table of Contents
TMU-FOE&CS Version 5.0 T004A-Project Report

1 PROJECT TITLE
2 PROBLEM STATEMENT
3 PROJECT DESCRIPTION
3.1 SCOPE OF THE WORK
3.2 PROJECT MODULES
3.3 CONTEXT DIAGRAM (HIGH LEVEL)
4 IMPLEMENTATION METHODOLOGY
5 TECHNOLOGIES TO BE USED
5.1 SOFTWARE PLATFORM
5.2 HARDWARE PLATFORM
5.3 TOOLS, IF ANY
6 ADVANTAGES OF THIS PROJECT
7 ASSUMPTIONS, IF ANY
8 FUTURE SCOPE AND FURTHER ENHANCEMENT OF THE PROJECT
9 PROJECT REPOSITORY LOCATION
10 DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
11 CONCLUSION
12 REFERENCES

Appendix
A: Data Flow Diagram (DFD)
B: Entity Relationship Diagram (ERD)
C: Use Case Diagram (UCD)
D: Data Dictionary (DD)
E: Screen Shots

Project Title: <Project Name> Page 5 of 22



Project Title

In the present era, integration is believed to be the key to all technologies, but
not every user has knowledge of every technology. We are therefore building a
technical assistant, "TECHNOBOT", which will take care of industrial needs in any
technical domain.

Problem Statement

Initially, manual testing and manually operated software were adequate, but the
world is now moving towards automation. Every company wants its software to
be automated and able to control multiple different technologies. It is very
difficult to learn every technology deeply and to remember its commands.
Many people also lack knowledge in these domains, so this project will help
them as well.

Project Description

We want to build a Python-based automated bot that is capable of
performing different industrial tasks on multiple technologies. The user only needs to
choose the name of the technology on which he/she wants to perform tasks. The software
helps the user in such a way that he/she does not need deep knowledge of
that particular technology, nor needs to remember the commands and long
code of every technology. The software can perform tasks on several
technologies such as AWS, Hadoop, Docker, Linux and many more.
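
As a rough illustration of the selection step described above, the following minimal sketch maps a technology name and a task label to a command-line invocation. The `COMMANDS` table, the task labels and the helpers `list_tasks` and `build_command` are hypothetical names chosen for this example; the report does not show the actual Technobot code. The commands themselves (`docker ps`, `docker images`, `hadoop fs -ls /`, `aws ec2 describe-instances`, `df -h`) are standard invocations of the respective tools.

```python
# Hypothetical lookup table: technology -> task label -> command (argument list).
COMMANDS = {
    "docker": {
        "list containers": ["docker", "ps"],
        "list images": ["docker", "images"],
    },
    "hadoop": {
        "list root directory": ["hadoop", "fs", "-ls", "/"],
    },
    "aws": {
        "list ec2 instances": ["aws", "ec2", "describe-instances"],
    },
    "linux": {
        "show disk usage": ["df", "-h"],
    },
}


def list_tasks(technology):
    """Return the task labels offered for a technology (case-insensitive)."""
    return sorted(COMMANDS[technology.strip().lower()])


def build_command(technology, task):
    """Translate the user's choice into the command the bot would run."""
    tasks = COMMANDS.get(technology.strip().lower())
    if tasks is None:
        raise ValueError(f"unsupported technology: {technology!r}")
    if task not in tasks:
        raise ValueError(f"unknown task for {technology}: {task!r}")
    return list(tasks[task])  # copy, so callers cannot mutate the table


if __name__ == "__main__":
    print(list_tasks("Docker"))
    print(build_command("Docker", "list containers"))
```

With a table like this, the user never types a command; adding a new technology would only mean adding one more entry to the table.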


1.1 Scope of the Work


The current demand of the industry is for every technology to be automated and
integrated in a single piece of software, so that any client can perform their task
with just core knowledge of the particular technology.

1.2 Project Modules


- Backend coding for every technology used in the software, such as AWS, Linux,
Hadoop and Docker.
- Integration of all technologies with Python.
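
One way the integration module could hand a prepared command to the operating system is through Python's standard `subprocess` module. This is a sketch under assumptions: `run_command` is a hypothetical helper, Python 3.7 or later is assumed, and the report does not state how Technobot actually invokes the underlying tools.

```python
import shutil
import subprocess
import sys


def run_command(argv, timeout=30):
    """Run a command (argument list) and return (exit_code, stdout, stderr).

    Hypothetical helper: it first checks that the executable exists, so the
    user gets a clear message instead of a traceback.
    """
    if shutil.which(argv[0]) is None:
        return 127, "", f"{argv[0]}: command not found"
    result = subprocess.run(
        argv,
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,      # do not hang forever on a stuck tool
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # The current Python interpreter stands in for docker/aws/hadoop here,
    # so the example runs on any machine.
    code, out, err = run_command([sys.executable, "-c", "print('hello')"])
    print(code, out.strip())
```

Passing the command as an argument list rather than a shell string avoids shell quoting problems when user input ends up in the command.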

Implementation Methodology


2 Technologies to be used
2.1 Software Platform
a) Front-end - Web-App

b) Back-end - Python

2.2 Hardware Platform


RAM, Hard Disk, OS, Editor, Browser etc.

2.3 Tools, if any


Python 3.0
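
The report names a web app as the front end and Python as the back end but does not say how the two are connected. One possibility, sketched here with the standard-library `wsgiref` module, is a small WSGI application that reads the chosen technology from the query string. The `SUPPORTED` set, the `technology` parameter name and the `serve` helper are assumptions made for this example, and the f-strings require Python 3.6 or later.

```python
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server

# Assumed list of technologies, taken from the project description.
SUPPORTED = {"aws", "hadoop", "docker", "linux"}


def app(environ, start_response):
    """WSGI entry point: /?technology=docker -> plain-text reply."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    technology = query.get("technology", [""])[0].lower()
    if technology in SUPPORTED:
        status, body = "200 OK", f"Selected technology: {technology}"
    else:
        status, body = "400 Bad Request", "Unknown or missing technology"
    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def serve(host="localhost", port=8000):
    """Start a development server (blocks until interrupted)."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` and opening `http://localhost:8000/?technology=docker` in a browser would show the selected technology; a production deployment would normally use a dedicated WSGI server instead.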

Advantages of this Project


⦁ This software will help the user perform multiple tasks in less time
and without errors.
⦁ High accuracy

3 Assumptions, if any
<Guidelines: Mention NONE, if there are NO Assumptions>

4 Future Scope and further enhancement of the Project


In future, we will add more technologies such as Kubernetes, Podman,
CRI-O, DevOps, Big Data, databases and Splunk.

5 Project Repository Location


<Guidelines: Mention the location of the latest Source Code and all related documents, like- Project
Synopsis Report, Project Progress updates, Project Requirement Details, Project Report (Softcopy), Test
Repository (all test scenarios, test cases etc.) used for Functional Testing of the project etc. The
repository location must be somewhere in CCSIT-Lab>

S#  Project Artifacts (softcopy)             Location (Mention Lab-ID,      Verified by           Verified by
                                             Server ID, Folder Name etc.)   Project Guide         Lab In-Charge
1.  Project Synopsis Report (Final Version)                                 Name and Signature    Name and Signature
2.  Project Progress updates                                                Name and Signature    Name and Signature
3.  Project Requirement specifications                                      Name and Signature    Name and Signature
4.  Project Report (Final Version)                                          Name and Signature    Name and Signature
5.  Test Repository                                                         Name and Signature    Name and Signature
6.  Any other document, give details                                        Name and Signature    Name and Signature

Conclusion

We can conclude that if a company uses our software, it will save money and
time, as everything will be automated and the company will not need to hire a
specialist in a particular technology for its project.

References


LITERATURE REVIEW

1. TRAINING ASPECTS OF AUTOMATIC SPEECH RECOGNITION SYSTEMS DURING
CHAT BOT CREATION
By Belenko, M.; Burym, N.; Muratova, U.; Balakshin, P. (19th International
Multidisciplinary Scientific GeoConference SGEM 2019, 30 June - 6 July, 2019)

The paper describes the key features of the automatic speech recognition system
training process that can be used in the work of a chat bot or voice assistant.
Kaldi and CMU Sphinx are used as examples of open source systems.
The whole installation and training process is reviewed, and instructions for
further researchers are documented. The examined systems were trained with
different parameter sets and training data. As a result, the most important
training parameters of an automatic speech recognition system for a chat bot are
identified and described. The results of this research can help to significantly
improve the speed and quality of speech recognition in chat bots and voice
assistants.


2. AN INTELLIGENT WEB-BASED VOICE CHAT BOT
By S. J. du Preez, M. Lall, S. Sinha (IEEE EUROCON 2009)

This paper presents the design and development of an intelligent voice recognition
chat bot. It presents a technology demonstrator to verify a proposed
framework required to support such a bot (a Web service). A black box
approach is used: by controlling the communication structure to and from the
Web service, the Web service allows all types of clients to communicate with the
server from any platform. The service is accessible through a generated
interface which allows for seamless XML processing, and this extensibility
improves the lifespan of the service. By introducing an artificial brain, the
Web-based bot generates customized user responses, aligned to the desired character.
Questions asked of the bot that are not understood are further processed using a
third-party expert system (an online intelligent research assistant), and the
response is archived, improving the artificial brain's capabilities for future
generation of responses.

3. INTERNAL LANGUAGE MODEL ESTIMATION FOR DOMAIN ADAPTIVE END-TO-END
SPEECH
By Zhong Meng, Sarangarajan Parthasarathy, Yashesh Gaur (2019)

This paper explains that external language model (LM) integration remains a
challenging task for end-to-end (E2E) automatic speech recognition (ASR), which
has no clear division between acoustic and language models. The authors
propose an internal LM estimation (ILME) method to facilitate a more effective
integration of the external LM with all pre-existing E2E models with no additional
model training, including the most popular recurrent neural network transducer
(RNN-T) and attention-based encoder-decoder (AED) models. Trained with
audio-transcript pairs, an E2E model implicitly learns an internal LM that characterizes
the training data in the source domain. With ILME, the internal LM scores of an
E2E model are estimated and subtracted from the log-linear interpolation
between the scores of the E2E model and the external LM. The internal LM scores
are approximated as the output of an E2E model when its acoustic
components are eliminated. ILME can alleviate the domain mismatch between training and
testing, or improve multi-domain E2E ASR.
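
The score combination described in this summary can be written out in a few lines. This is only an illustrative sketch: the function names, the weights `lambda_ext` and `lambda_ilm`, and the example log-probabilities are assumptions made here, not values from the paper.

```python
def ilme_score(log_p_e2e, log_p_ext_lm, log_p_internal_lm,
               lambda_ext=0.5, lambda_ilm=0.3):
    """Combine hypothesis scores as described for ILME.

    Log-linear interpolation of the E2E model score and the external LM
    score, minus the weighted estimate of the internal LM score.
    All inputs are log-probabilities of the same hypothesis.
    """
    return (log_p_e2e
            + lambda_ext * log_p_ext_lm
            - lambda_ilm * log_p_internal_lm)


def best_hypothesis(candidates, **weights):
    """Pick the candidate text with the highest ILME-adjusted score.

    `candidates` maps text -> (log_p_e2e, log_p_ext_lm, log_p_internal_lm).
    """
    return max(candidates,
               key=lambda text: ilme_score(*candidates[text], **weights))


if __name__ == "__main__":
    # Made-up scores for two competing transcripts.
    candidates = {
        "start the docker container": (-4.0, -6.0, -9.0),
        "start the doctor container": (-3.8, -12.0, -7.0),
    }
    print(best_hypothesis(candidates))
```

In this made-up example the E2E score alone slightly prefers the second transcript, while the combined score prefers the first, which is the kind of domain correction the method aims for.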

1. MACHINE LEARNING IN AUTOMATIC SPEECH RECOGNITION
By Jayashree Padmanabhan, Melvin Jose Johnson Premkumar (2015)

This paper explains that over the past few decades there has been
tremendous development in the machine learning paradigms used in automatic
speech recognition (ASR), in applications ranging from home automation to space
exploration. Though commercial speech recognizers are available for certain
well-defined applications like dictation and transcription, many issues in ASR,
like recognition in noisy environments, multilingual recognition, and
multi-modal recognition, are yet to be addressed effectively. A comprehensive
review of common machine learning techniques like artificial neural networks,
support vector machines, and Gaussian mixture models, along with the hidden
Markov models employed in ASR, is provided. A thorough review of recent
developments in deep learning, which has provided significant improvements in
ASR performance, along with its relevance to the future of ASR, is also presented.

2. AUTOMATIC SPEECH RECOGNITION – A BRIEF HISTORY OF THE TECHNOLOGY
DEVELOPMENT
By B.H. Juang & Lawrence R. Rabiner (2014)
This paper explains that designing a machine that mimics human behaviour,
particularly the capability of speaking naturally and responding properly to
spoken language, has intrigued engineers and scientists for centuries. Since the
1930s, when Homer Dudley of Bell Laboratories proposed a system model for
speech analysis and synthesis, the problem of automatic speech recognition has
been approached progressively, from a simple machine that responds to a small
set of sounds to a sophisticated system that responds to fluently spoken natural
language and takes into account the varying statistics of the language in which the
speech is produced. Based on major advances in the statistical modelling of speech in
the 1980s, automatic speech recognition systems today find widespread
application in tasks that require a human-machine interface, such as automatic
call processing in the telephone network and query-based information systems
that provide updated travel information, stock price quotations,
weather reports, etc. The article reviews some major highlights in the
research and development of automatic speech recognition during the last few
decades so as to provide a technological perspective and an appreciation of the
fundamental progress that has been made in this important area of information
and communication technology.

3. REDUCING DFT LEAKAGE IN SPEECH RECOGNITION USING PITCH
SEGMENTATION
By Sopon Wiriyarattanakul & Piroon Kaewfoongrungsi (10 September 2021)

This paper explains that frame segmentation, which divides a signal into
frames, is one of the most important parts of speech recognition. In general, the
size of all segmented frames is fixed at around 20–40 ms, which leads to
the occurrence of a non-periodic signal in each frame. Consequently, spectral
leakage arises when the Discrete Fourier Transform is applied to these non-periodic
frames in the process of computing MFCCs, and it reduces speech recognition
performance. In the paper, pitch segmentation is proposed to reduce the spectral
leakage issue by applying a new pitch detection technique to produce a periodic
signal in each frame. Neural network models are trained on collected speech signals,
and the performance of fixed-frame and pitch-frame segmentation is compared.
As a result, the pitch frame gives higher accuracy than the
fixed frame in speech recognition.
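
The leakage effect this summary refers to can be reproduced with a toy signal. The sketch below is not the paper's pitch detection method; it only compares the DFT of a frame that holds a whole number of sine periods with one that does not. The frame length of 64 samples, the cycle counts and all function names are arbitrary choices for this illustration.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude of each DFT bin of a real-valued frame (direct O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n)
    ]


def sine_frame(cycles, length=64):
    """A frame holding `cycles` periods of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * cycles * i / length) for i in range(length)]


def active_bins(frame, threshold=1e-6):
    """Count DFT bins whose magnitude is above a small threshold."""
    return sum(1 for m in dft_magnitudes(frame) if m > threshold)


if __name__ == "__main__":
    # Whole number of periods in the frame: energy stays in two mirror bins.
    print("periodic frame:    ", active_bins(sine_frame(4)))
    # 4.5 periods in the frame: energy leaks into the other bins as well.
    print("non-periodic frame:", active_bins(sine_frame(4.5)))
```

A frame cut at pitch boundaries behaves like the first case, which is the motivation for segmenting by pitch rather than by a fixed duration.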


4. USE OF MACHINE LEARNING SERVICES IN CLOUD
By Amit Ganatra, Amit Nayak (22 June 2021)
This paper explains that machine learning services are a comprehensive
description of integrated and semi-automated web tools covering most infrastructure
problems, such as preprocessing information, design preparation, and design
assessment, with further forecasting. REST APIs can bridge the outcomes of
predictions with one's internal IT infrastructure. Like the original SaaS, IaaS, and
PaaS cloud delivery models, the ML and AI fields cover high-level services that provide
infrastructure and platform, exposed as APIs. The article identifies the most
used cloud technologies for Machine Learning as a Service (MLaaS): Google Cloud
AI, Amazon.

5. A MODERN APPROACH TO BUILDING A DATA SCIENCE FRAMEWORK
DELIVERY PIPELINE USING DEVOPS PRACTICES
By Surbhi Saxena (2021)
From the survey, it can be concluded that machine learning models and
algorithms are very efficient at recognizing or detecting patterns in different writing
styles. Testing various algorithms shows that the lower the complexity,
the greater the efficiency and accuracy for any digital data set. According to the
survey, a convolutional neural network achieves the highest accuracy of all,
99.89%. Similarly, the Double Q-learning algorithm also
gives high accuracy, but on the MATLAB dataset only.

6. EXPLORING THE MEL SCALE FEATURES USING SUPERVISED LEARNING
CLASSIFIERS FOR EMOTION CLASSIFICATION
By Kalpana Rangara and Monit Kapoor (Aug 2021)

In this work, a new challenging Arabic digit dataset is collected from different
study levels of schools. A large dataset is collected after considerable effort in
distributing and collecting digit forms from hundreds of primary, high school and
college students. After finding that the existing Arabic digit datasets are few and
not challenging, the authors made a considerable effort to prepare such a
challenging dataset. The collected dataset is then used to train an efficient CNN
model, which represents the current state of the art for a variety of applications.
The authors extensively analyze the model by carefully selecting its parameters and
showing its robustness in handling the dataset.

7. AUTOMATIC SPEECH RECOGNITION
By Xugang Lu and Sheng Li (Nov 2019)

In this paper, handwritten digit recognition using deep learning methods has
been implemented. The most widely used machine learning algorithms, KNN,
SVM, RFC and CNN, have been trained and tested on the same data in order to
compare the classifiers. Using these deep learning
techniques, a high level of accuracy can be obtained. Compared to other
research methods, this method focuses on which classifier works better by
improving the accuracy of classification models by more than 99%. Using Keras with
TensorFlow as the backend, a CNN model is able to give an accuracy of
about 98.72%. In this initial experiment, CNN gives an accuracy of 98.72% and
KNN gives an accuracy of 96.67%, while RFC and SVM are not as outstanding.

8. Review on Deep Learning Handwritten Digit Recognition using
Convolutional Neural Network
By Akanksha Gupta, Ravindra Pratap Narwaria, Madhav Singh

This paper notes that handwritten digit recognition (HDR) has immense applications
in the fields of medicine, banking, student management, taxation, etc. Many
classifiers, like KNN, SVM and CNN, are used to identify the digit from the
handwritten image. As per the review, CNN provides better performance than the
others. The stages of HDR using a CNN classifier are discussed in the paper. The MNIST
dataset consists of handwritten digits from 0 to 9 and is a standard dataset used
to measure the performance of classifiers. HDR consists of four stages. The first is
preprocessing, where the dataset is converted into binary form and image processing
is applied to it. The second stage is segmentation, where the image is
divided into multiple segments. The third stage is feature extraction, where
features of the image are identified. The last stage is classification, where
classifiers like KNN, SVM and CNN are used. The results of HDR are improved
considerably by using a CNN classifier,
but they can be improved further in terms of complexity, duration of execution and
accuracy of results by combining classifiers or using some additional
algorithm with it.

9. Handwritten Digit Recognition Using Computer Vision
By Ashish Shekhar, Ajay Kaushik

In this paper, the authors demonstrate residual neural networks. The technology has
evolved over two decades, and with the improvement of graphics hardware these
networks are becoming the industry standard for solving a large number of computer
vision problems, as well as problems in other domains. In particular, the best
results reported in the survey are based on complex neural networks, sometimes
combined with other methods to improve performance. In some cases, CNNs can be
combined with evolutionary algorithms or other optimization techniques to improve the
selection of hyperparameters, allowing better accuracy. In other cases, different
CNN models are combined into one committee, reducing classification error.
Furthermore, it can be inferred from the surveyed work that data enhancement
techniques improve performance. These methods are often used with neural
networks or SVMs, and when combined with CNN classifiers, they can top the
entire ranking on the MNIST dataset.


10. Handwritten Digit Recognition Using Various Machine Learning Algorithms
and Models
By Pranit Patil, Bhupinder Kaur
From the survey, it can be concluded that machine learning models and
algorithms are very efficient at recognizing or detecting patterns in different writing
styles. Testing various algorithms shows that the lower the complexity,
the greater the efficiency and accuracy for any digital data set. According to the
survey, a convolutional neural network achieves the highest accuracy of all,
99.89%. Similarly, the Double Q-learning
algorithm also gives high accuracy, but on the MATLAB dataset only. SVM also gives an
accuracy of 99.36%. CNN produces the highest accuracy because it uses a layered
architecture that improves computer vision and follows a hierarchical model that
builds up the network and ends in fully connected layers, so that neurons are
connected to each other and the output is processed.

11. Fast Efficient Artificial Neural Network for Handwritten Digit Recognition
By Viragkumar N. Jagtap, Shailendra K. Mishra

In this paper, the authors present a fast, efficient artificial neural network for
handwritten digit recognition on a GPU, reducing training time with PTM (Parallel
Training Method). They conclude that GPU-based parallelization of the back
propagation algorithm should generally be preferred over a CPU-based program.
However, for back propagation with small input data and few hidden neurons,
CPU-based execution is better, whereas if the input dataset is large, GPU-based
parallelization is suitable for reducing training time. The authors expect to explore
the capabilities of GPUs and multicores in cloud environments and to offer them as
services, where users can query and select them depending on the respective service
level agreements, and the system provides high performance through automatic
parallelization.

12. Offline Handwritten Digits Recognition Using Machine learning
By Shengfeng Chen, Rabia Almamlook, Yuwen Gu, Dr. Lee Wells
This paper applies different machine learning techniques and different
models for data training, attempting to discover a representation of isolated
handwritten digits that allows their effective recognition and to achieve the
highest accuracy in predicting handwritten numerals. The study settles on
classifying a given handwritten digit image as the required digit using five
different algorithms and then testing their accuracy. The study built
handwritten digit recognizers, evaluated their performance on the MNIST (Modified
National Institute of Standards and Technology) dataset and then improved the
training speed and the recognition performance. In addition, it aims to develop a
word-based handwriting recognition system, test the handwriting of a
given word, and detect the writer by selecting the one recognized most often
for a given training sample. The study discusses in detail the advances
in the area of handwritten character recognition. The most accurate solution
provided in this area directly or indirectly depends upon the quality as well as the
nature of the material to be read. Various techniques for character recognition in
handwriting recognition systems are described in the paper.

The result of the study shows that accuracy improves as the number of blocks is
increased.
Apart from that, a 4x4, 7x7 and 14x14 attribute reduction is performed separately
to compare and find the optimal number of attributes that best represent the
image. The initial hypothesis was answered with concrete data support.
Comparing all the classification models tested, K-Nearest Neighbor is the
preferred choice in terms of its high accuracy and computational efficiency.
However, there is no single classifier that works best on all given problems. The
results show that probabilistic methods are better suited to handwriting recognition. By
varying the training and testing ratios (from 10% to 90%), the authors found that a larger
training data size improves accuracy, but a smaller testing dataset may also favor
better accuracy. Preprocessing such as attribute reduction (784 reduced to 196)
reduces runtime and increases accuracy (from 93.2% to 95.9%). The proposed
algorithm tries to address both factors and performs well in terms of accuracy and time
complexity. The general accuracy of the tested K-NN was found to be 96.7%, while
96.8% was achieved by the neural network.
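
The attribute reduction from 784 to 196 values corresponds to shrinking a 28x28 image to 14x14. The summary does not say how the reduction was done; the sketch below assumes simple averaging over non-overlapping 2x2 blocks, and `reduce_attributes` and the synthetic image are made up for this illustration.

```python
def reduce_attributes(image, block=2):
    """Shrink a square image by averaging non-overlapping block x block cells."""
    size = len(image)
    if size % block != 0:
        raise ValueError("image size must be a multiple of the block size")
    reduced = []
    for r in range(0, size, block):
        row = []
        for c in range(0, size, block):
            # Collect the pixels of one block and replace them by their mean.
            cell = [image[r + dr][c + dc]
                    for dr in range(block) for dc in range(block)]
            row.append(sum(cell) / len(cell))
        reduced.append(row)
    return reduced


if __name__ == "__main__":
    # Synthetic 28x28 "image" (MNIST-sized), 784 attributes in total.
    image = [[(r + c) % 256 for c in range(28)] for r in range(28)]
    small = reduce_attributes(image, block=2)
    print(len(image) * len(image[0]), "->", len(small) * len(small[0]))
```

With `block=4` the same function yields a 7x7 image and with `block=7` a 4x4 image, matching the other reduction sizes mentioned in the summary.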

CERTIFICATE
