
INTEGRATING DIFFERENT ILLNESS PREDICTION METHODS INTO A SINGLE INTERFACE

A Main project thesis submitted in partial fulfillment of requirements for the award
of degree for VIII semester

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING

by
M.MONISHA (19131A05C0)
K.SAI HARSHITA (19131A05K9)
K.HARITHA (19131A05A3)
SATYAJIT MISRO (19131A05L3)

Under the esteemed guidance of


Mr. CH. SRIKANTH VARMA,
Assistant Professor,
Department of Computer Science and Engineering

GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING (AUTONOMOUS)
(Affiliated to JNTU-K, Kakinada)
VISAKHAPATNAM
2022-2023

CERTIFICATE
This is to certify that the main project entitled “Integrating Different Illness Prediction
Methods into a Single Interface” being submitted by

M.MONISHA (19131A05C0)
K.SAI HARSHITA (19131A05K9)
K.HARITHA (19131A05A3)
SATYAJIT MISRO (19131A05L3)
in partial fulfilment for the award of the degree “Bachelor of Technology” in Computer
Science and Engineering to the Jawaharlal Nehru Technological University, Kakinada is
a record of bonafide work done under my guidance and supervision during VIII semester
of the academic year 2022-2023.

The results embodied in this record have not been submitted to any other
university or institution for the award of any Degree or Diploma.

Guide
Mr. Ch. Srikanth Varma
Assistant Professor
Department of CSE
GVPCE(A)

Head of the Department
Dr. D. N. D. Harini
Associate Professor & H.O.D
Department of CSE
GVPCE(A)

DECLARATION

We hereby declare that this project entitled “INTEGRATING DIFFERENT ILLNESS
PREDICTION METHODS INTO A SINGLE INTERFACE” is a bonafide work done by us and
submitted to the Department of Computer Science and Engineering, G. V. P College of
Engineering (Autonomous), Visakhapatnam, in partial fulfilment for the award of the
degree of B. Tech. It is our own work and has not been submitted to any other
university or published before.

PLACE: VISAKHAPATNAM
DATE:

M. MONISHA (19131A05C0)
K. SAI HARSHITA (19131A05K9)
K. HARITHA (19131A05A3)
SATYAJIT MISRO (19131A05L3)

ACKNOWLEDGEMENT

We would like to express our deep sense of gratitude to our esteemed institute
Gayatri Vidya Parishad College of Engineering (Autonomous), which has provided
us an opportunity to fulfill our cherished desire.

We express our sincere thanks to our principal, Dr. A. B. KOTESWARA RAO,
Gayatri Vidya Parishad College of Engineering (Autonomous), for his encouragement
to us during this project, giving us a chance to explore and learn new technologies in the
form of mini projects.

We express our deep sense of gratitude to Dr. D. N. D. HARINI, Associate
Professor and Head of the Department of Computer Science and Engineering,
Gayatri Vidya Parishad College of Engineering (Autonomous), for giving us an
opportunity to do the project in college.

We express our profound gratitude and our deep indebtedness to our guide
Mr.Ch.SRIKANTH VARMA, whose valuable suggestions, guidance and
comprehensive assessments helped us a lot in realizing our project.

We also thank our coordinator, Dr. CH. SITA KUMARI, Associate Professor,
Department of Computer Science and Engineering, for the kind suggestions and guidance
for the successful completion of our project work.

M.MONISHA (19131A05C0)
K.SAI HARSHITA (19131A05K9)
K.HARITHA (19131A05A3)
SATYAJIT MISRO (19131A05L3)

ABSTRACT

We are all aware of how AI is advancing medical science in the modern world and how it
is proving to be a savior. AI has brought a revolution in the field of medical science and is
going beyond our imagination day by day. AI is transforming the practice of medicine.
It’s helping doctors diagnose patients more accurately, make predictions about patients’
future health, and recommend better treatments.

Disease prediction using Machine learning is the system that is used to predict diseases
from the symptoms which are given by the patients or any user. The system processes the
symptoms provided by the user as input and gives the output as the probability of the
disease occurrence.

From tasks as basic as diabetes or breast cancer detection using simple machine
learning models to tasks as complex as coronavirus or Alzheimer's detection using
segmentation and other advanced techniques, AI now covers a wide range of medical problems.

That is why we came up with the idea of creating an online platform that brings
all these disease detections under one roof. The web application can detect a
variety of diseases. After a person enters their information and uploads their
scan reports, the machine learning model determines whether they are suffering from a
specific disease. It displays whether the person is suffering from that disease,
along with the severity.

For each disease listed above the model employs various machine learning techniques.

KEYWORDS – socioeconomic, CNN, Random Forest, web interface, feature engineering,
operability, feasibility, bagging, boosting, pedigree function.

INDEX

CHAPTER 1. INTRODUCTION
1.1 Objective
1.2 About the Algorithm
1.3 Purpose
1.4 Scope

CHAPTER 2. SRS DOCUMENT
2.1 Functional Requirements
2.2 Non-functional Requirements
2.3 Minimum Hardware Requirements
2.4 Minimum Software Requirements

CHAPTER 3. ALGORITHM ANALYSIS
3.1 Existing Algorithm
3.2 Proposed Algorithm
3.3 Feasibility Study
3.4 Cost Benefit Analysis

CHAPTER 4. SOFTWARE DESCRIPTION
4.1 Visual Studio Code
4.2 Flask
4.3 Python
4.4 HTML/CSS
4.5 TensorFlow
4.6 Matplotlib

CHAPTER 5. PROJECT DESCRIPTION
5.1 Problem Definition
5.2 Project Overview
5.3 Module Description
5.3.1 Python and Flask Framework
5.3.2 Model

CHAPTER 6. SYSTEM DESIGN
6.1 Introduction to UML
6.2 Building Blocks of the UML
6.3 UML Diagrams

CHAPTER 7. DEVELOPMENT
7.1 Datasets used
7.2 Sample Code
7.3 Results

CHAPTER 8. TESTING
8.1 Introduction of Testing

CHAPTER 9. CONCLUSION

FUTURE SCOPE

REFERENCE LINKS

1. INTRODUCTION

In contrast to the natural intelligence of humans and other animals, artificial intelligence
(AI) is the intelligence exhibited by machines. AI can also be described as the study of
"intelligent agents": any agent or machine that can perceive and
comprehend its environment and take appropriate action to increase its chances of
success. AI can also refer to situations in which machines mimic human minds
in learning and analysis and can thus be used to solve problems. Machine learning (ML)
is the subfield of AI concerned with this kind of learning from data.

In most cases, AI uses a system that combines hardware and software. From a
software standpoint, AI is primarily concerned with algorithms. An artificial neural
network (ANN) is one conceptual framework for implementing AI algorithms.

Our project's main objective is to integrate various illness prediction models into a
single web interface. It helps patients and medical experts determine whether a person has a
certain condition. It reports the prediction and the degree of severity based on the
data the user has submitted.

1.1. OBJECTIVE

With traditional patient risk identification techniques, it was found that the number of
different combinations of variables about conditions, lab values, socioeconomic
information, and other data points corresponding to individuals made it very challenging
to identify the relationship among the data points. Machine learning models can address
some of the weaknesses of traditional linear models as they are better at handling
nonlinearity and can better identify implicit relationships among the input variables [1].
Machine learning models can handle feature selection, which is a process that is used to
determine the variables and relationships to be considered while building the model.

Hence, machine learning algorithms can be used to identify high-risk patients based on
historical data, assuming a high number of variables and data points.

We have used Python which has an extensive and comprehensive collection of freely
available packages covering a variety of topics. Scientific Python libraries such as
NumPy, SciPy, and pandas provide efficient implementation of numerical operations and
tasks common in science and engineering.

These libraries provide a strong base from which more advanced scientific
software can be built without needing to worry about low-level algorithms. Additionally,
many domain-specific packages are built on top of them. Convolutional neural networks
and other machine learning techniques were also employed to classify the different diseases.

1.2. ABOUT THE ALGORITHM


1.2.1 Convolutional Neural Network (CNN)

CNN stands for Convolutional Neural Network. It is a type of neural network that is
commonly used for image and video analysis but can also be used for other types of data
such as audio signals or natural language processing.

A typical Convolutional Neural Network (CNN) consists of several layers. The most
common layers in a CNN are:

1. Convolutional Layer: The first layer of a CNN is typically a convolutional layer.
It applies a set of learnable filters to the input image, which allows the network to
extract important features from the image. Each filter produces a feature map,
which represents the response of the filter at each location of the input.
2. Activation Layer: The activation layer applies a non-linear activation function to
the output of the convolutional layer. This introduces non-linearity into the model
and helps it learn more complex patterns.
3. Pooling Layer: The pooling layer reduces the size of the feature maps generated
by the convolutional layer. This helps reduce the number of parameters in the
model and makes it more efficient. The most common type of pooling layer is the max
pooling layer, which selects the maximum value from each region of the feature
map.
4. Dropout Layer: The dropout layer randomly drops out some of the neurons in the
network during training. This helps prevent overfitting and improves the
generalization ability of the network.
5. Fully Connected Layer: The fully connected layer takes the output of the previous
layers and applies a set of weights to generate a prediction. This layer is similar to
the output layer in a traditional neural network.

These layers are typically stacked one on top of the other to form a deep neural network.
The output of the last layer is then fed into a loss function, which measures how well the
network is performing on the given task. The goal of training the network is to minimize
the loss function by adjusting the weights of the network.
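
To make the first three layer types concrete, the following is a minimal sketch of a forward pass through a convolution, an activation (ReLU), and a max pooling step. It is written in plain Python for readability and is not the project's model: the function names, the 5x5 toy image, and the hand-picked 2x2 filter are assumptions chosen for illustration. In a trained CNN the filter values are learned, and a library such as TensorFlow performs these operations.

```python
# Illustrative forward pass: convolution -> ReLU -> 2x2 max pooling.
# The kernel values below are hand-picked for the example, not learned.

def conv2d(image, kernel):
    """'Valid' 2D convolution (cross-correlation) with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Non-linear activation: negative responses are clipped to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = len(feature_map) // size * size
    w = len(feature_map[0]) // size * size
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, w, size)
        ]
        for i in range(0, h, size)
    ]

# Toy 5x5 "image" with a bright vertical stripe in the middle column.
image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
# 2x2 filter that responds where intensity rises from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)  # 4x4 feature map
activated = relu(feature_map)        # negative responses removed
pooled = max_pool2d(activated)       # 2x2 summary of the strongest responses
print(pooled)
```

The filter responds only on the left edge of the stripe, and pooling keeps that response while shrinking the map from 4x4 to 2x2, which is the size reduction described for the pooling layer.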

CNNs have been used for a variety of applications, such as object recognition, facial
recognition, image and video classification, and medical image analysis. They have
shown excellent performance in these tasks, often outperforming other types of machine
learning algorithms.

FIG 1.1 CNN ARCHITECTURE

1.2.2. Random Forest

Random Forest is a type of ensemble learning algorithm used in machine learning for
classification and regression tasks [2]. It is based on the concept of decision trees and
uses a combination of multiple decision trees to make predictions.

1. Random subsets of the training data are selected, and a decision tree is built on each
subset.

2. Each decision tree is built by recursively splitting the data based on the best split that
maximizes the information gain. The split is chosen based on a randomly selected subset
of features.

3. The process of building the decision tree continues until a stopping criterion is reached,
such as the maximum depth of the tree or the minimum number of samples required to
make a split.

4. The output of each decision tree is a prediction, which is either a class label in the case
of classification or a continuous value in the case of regression.

5. To make a prediction, the Random Forest combines the predictions from all the
decision trees by taking the mode of the class labels in the case of classification or the
mean of the continuous values in the case of regression.

6. The final prediction is based on the combined predictions of all the decision trees in the
Random Forest.
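
The steps above can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the project's implementation: each "tree" is limited to a single split (a decision stump), the helper names are hypothetical, and the eight data rows are made-up toy values (glucose, BMI) rather than values from the project's datasets. A production model would typically use a library implementation with full-depth trees.

```python
import math
import random
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_stump(rows, labels, feature_ids):
    """Depth-1 tree: choose the split with the highest information gain."""
    base = entropy(labels)
    # Fallback when no split helps: always predict the majority class.
    best = (0.0, feature_ids[0], math.inf, majority(labels), majority(labels))
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not right:
                continue
            weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            gain = base - weighted
            if gain > best[0]:
                best = (gain, f, t, majority(left), majority(right))
    return best[1:]  # (feature, threshold, left_label, right_label)

def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(math.sqrt(n_features)))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # step 1: bootstrap sample
        feats = rng.sample(range(n_features), k)        # step 2: random feature subset
        forest.append(best_stump([rows[i] for i in idx],
                                 [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    # Steps 4-6: every tree votes, and the mode of the votes is the prediction.
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)

# Made-up toy rows: [glucose, BMI]; label 1 = positive, 0 = negative.
rows = [[85, 21], [90, 23], [95, 22], [100, 24],
        [150, 31], [160, 33], [170, 30], [180, 35]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = train_forest(rows, labels)
print(predict(forest, [80, 20]), predict(forest, [190, 36]))
```

Limiting each tree to one split keeps the example short; the bootstrap sampling, random feature subsets, information-gain splits, and majority vote are the parts that correspond to the numbered steps.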

FIG 1.2 WORKING OF RANDOM FOREST ALGORITHM

1.3. PURPOSE
The purpose of this disease prediction project is to develop accurate
and reliable machine learning models that can predict the likelihood of an individual
developing one of the target diseases, and thereby to
leverage the power of machine learning to improve disease prevention and detection.

By leveraging large datasets of medical records, genetic data, and other relevant
information, the project aims to identify risk factors and develop models that can predict
disease risk with high accuracy. The developed models can be integrated into clinical
practice or electronic health record systems to provide healthcare providers with disease
risk predictions, helping them make more informed decisions about patient care and
disease prevention.

The web interface includes predictions for five different diseases:

1) Pneumonia - An infection that inflames the air sacs in one or both lungs, which may
fill with fluid.

2) Alzheimer's - A progressive neurodegenerative disorder that primarily affects the
elderly population. It affects the brain, leading to memory loss,
cognitive decline, and behavioral changes.

3) Diabetes - A chronic disease that occurs either when the pancreas does not
produce enough insulin or when the body cannot effectively use the insulin it produces
[3].

4) Breast cancer - A type of cancer that develops in the breast tissue. It is the most
common cancer among women worldwide. Breast cancer usually begins as a small lump
in the breast that can be detected through a breast exam or mammography.

5) Covid-19 - A highly infectious respiratory illness that spreads primarily through
respiratory droplets when an infected person coughs or sneezes, or through close contact
with an infected person. The virus can also be transmitted by touching a surface
contaminated with the virus and then touching one's face.

1.4. SCOPE
The scope of this project can be quite broad, depending on the specific goals and
objectives of the project. However, in general the project aims to improve health
outcomes [12]. With the increasing availability of health data, the project focuses on
using machine learning to identify patterns and trends in health outcomes and to develop
more targeted interventions.

The scope of a disease prediction project can vary depending on the specific diseases
being targeted and the goals of the project. Here are some potential areas of focus for a
disease prediction project that targets the five diseases listed above:

1. Data collection: Collecting and curating large datasets of medical records, genetic
data, and other relevant information for patients with the five target diseases.
2. Feature engineering: Identifying and selecting the most informative features from
the collected data to build accurate disease prediction models.
3. Model development: Developing machine learning models, such as logistic
regression or random forest, that can accurately predict the likelihood of an
individual developing one of the five target diseases.
4. Model evaluation and validation: Evaluating the performance of the developed
models and validating their accuracy using independent test datasets.
5. User interface development: Developing an easy-to-use interface for clinicians or
individuals to input patient data and receive disease risk predictions.
6. Deployment and integration: Deploying the developed models into clinical
practice or integrating them into electronic health record systems to provide
disease risk predictions.

2. SRS DOCUMENT

A software requirements specification (SRS) is a document that describes what the
software will do and how it will be expected to perform.

2.1 FUNCTIONAL REQUIREMENTS

A functional requirement (FR) is a description of a service that the software
must offer. It describes a software system or one of its components. A function is
defined by the inputs to the software system, its behavior, and its outputs. It can be a
calculation, data manipulation, business process, user interaction, or any other specific
functionality that defines what a system is expected to perform. Functional
requirements are also called functional specifications.

• Maintains all the records of patients and doctors, which can be accessed
through the same command prompt.
• Saves time, since searching digitized records is quicker than searching
manually; although some digitized records already exist, they are kept separate
from one another.

2.2 NON-FUNCTIONAL REQUIREMENTS

A non-functional requirement (NFR) specifies a quality attribute of a
software system. NFRs judge the software system on responsiveness,
usability, security, and portability. Non-functional requirements are also called the
qualities of a system. They are as follows:

• Performance - The average response time of the system is low.
• Reliability - The system is highly reliable.
• Operability - The interface of the system will be consistent.
• Efficiency - Once users have learned the system through interaction,
they can perform tasks easily.
• Understandability - Because of its user-friendly interface, the system is easy
for users to understand.

2.3 MINIMUM HARDWARE REQUIREMENTS

• Processor – Intel Core i7
• Hard Disk – 256GB
• RAM – 8GB
• Operating System – Windows 10

2.4 MINIMUM SOFTWARE REQUIREMENTS

Python based Computer Vision and Deep Learning libraries will be exploited for
the development and experimentation of the project.

• Programming Language – Python 3.0
• IDE – Visual Studio Code
• TensorFlow – GPU
• OpenCV
• Flask
• HTML/CSS

3. ANALYSIS

3.1. EXISTING SYSTEMS


• Current illness prediction systems have a separate user interface for each
disease, which is expensive: building a dedicated interface for every
prediction task consumes considerable money and resources.

• A single interface, by contrast, can be user-friendly enough that even a novice user
can understand how to use it.

3.1.1. DRAWBACKS OF EXISTING ALGORITHM

While there are numerous online platforms for identifying individual diseases
separately, we have developed a platform that combines different illness
prediction methods into a single interface as part of our research. By simply
supplying the necessary test information, the user can predict multiple
diseases with ease.

3.2. PROPOSED ALGORITHM

We came up with the idea of creating an online platform that brings all
these disease detections under one roof. The web application can detect a variety of
diseases. After a person enters their information and uploads their scan reports, the
machine learning model determines whether they are suffering from a specific disease. It
displays whether the person is suffering from that disease, along with the
severity.

To train the models that identify the various diseases, we employ the following machine
learning algorithms:

• Pneumonia - (CNN)

• Alzheimer- (CNN)

• Covid- (CNN)

• Diabetes- (Random Forest)

• Breast cancer- (Random Forest)
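
One way to picture how a single interface can serve all five predictors is a small lookup table keyed by disease. The sketch below is an assumption made for illustration, not the project's code: `MODEL_REGISTRY` and `route_request` are hypothetical names, and the only facts taken from the list above are which algorithm handles which disease.

```python
# Hypothetical registry: one entry per disease, so a single interface can
# route each request to the right kind of model.
MODEL_REGISTRY = {
    "pneumonia":     {"algorithm": "CNN",           "input": "image"},
    "alzheimer":     {"algorithm": "CNN",           "input": "image"},
    "covid":         {"algorithm": "CNN",           "input": "image"},
    "diabetes":      {"algorithm": "Random Forest", "input": "tabular"},
    "breast_cancer": {"algorithm": "Random Forest", "input": "tabular"},
}

def route_request(disease, payload):
    """Validate the request and return the algorithm that should handle it."""
    entry = MODEL_REGISTRY.get(disease)
    if entry is None:
        raise ValueError(f"Unsupported disease: {disease!r}")
    expected = entry["input"]
    if expected == "image" and not isinstance(payload, (bytes, bytearray)):
        raise TypeError(f"{disease} expects an uploaded scan (bytes)")
    if expected == "tabular" and not isinstance(payload, dict):
        raise TypeError(f"{disease} expects lab values as a dict")
    return entry["algorithm"]

print(route_request("diabetes", {"glucose": 120}))  # Random Forest
print(route_request("pneumonia", b"raw image bytes"))  # CNN
```

Keeping the mapping in one table means that adding a sixth disease would only require a new entry, rather than a new user interface.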

3.2.1. ADVANTAGES OF PROPOSED MODEL

• It is very time-saving
• Automatic medical report generation
• Accurate results
• Degree of severity
• User-friendly graphical interface
• Highly reliable
• Cost effective

3.3 FEASIBILITY STUDY
A feasibility study is an analysis that takes all of a project's relevant factors into
account, including economic, technical, legal, and scheduling considerations,
to ascertain the likelihood of completing the project successfully. It is
essential for evaluating whether a proposed project is feasible. Put simply, a
feasibility study is an assessment of the practicality of a proposed plan or project.

The main objectives of feasibility are mentioned below:

The main aim of the feasibility study is to determine whether the product is
technically and financially feasible to develop. A feasibility study should
provide management with enough information to decide:
• Whether the project can be done.
• How successful the proposed action will be.
• Whether the final product will benefit its intended users.
• The nature and complexity of the project.
• What alternatives exist among which a solution will be chosen (during
subsequent phases).
• Whether the software meets organizational requirements.

There are various types of feasibility that can be determined. They are:

Operational - Defines the urgency of the problem and the acceptability of any
solution. It includes people-oriented and social issues: internal issues, such as
manpower problems, labor objections, manager resistance, organizational conflicts,
and policies; and external issues, including social acceptability, legal aspects, and
government regulations.

Technical - Is the project feasible within the limits of current technology? Does the
technology exist at all? Is it available within the given resource constraints?

Economic - Is the project possible, given resource constraints? Are the benefits that
will accrue from the new system worth the costs? What are the savings that will
result from the system, including tangible and intangible ones? What are the
development and operational costs?

Schedule - Constraints on the project schedule and whether they could be
reasonably met.

3.3.1. ECONOMIC FEASIBILITY:

Economic analysis could also be referred to as cost/benefit analysis. It is the most
frequently used method for evaluating the effectiveness of a new system. In
economic analysis the procedure is to determine the benefits and savings that are
expected from a candidate system and compare them with costs. An economic
feasibility study covers the price and all kinds of expenditure related to the scheme
before the project starts. This study also improves project reliability. It also
helps decision-makers decide whether the planned scheme should proceed now or
later, depending on the financial condition of the organization. This evaluation
process also studies the cost benefits of the proposed scheme. An economic
feasibility study also addresses the following:

• Cost of packaged software.
• Cost of doing a full system study.
• Is the system cost-effective?

3.3.2. TECHNICAL FEASIBILITY:

A large part of determining resources has to do with assessing technical feasibility.
It considers the technical requirements of the proposed project. The technical
requirements are then compared to the technical capability of the organization. The
systems project is considered technically feasible if the internal technical capability
is sufficient to support the project requirements. The analyst must find out whether
current technical resources are sufficient for the project. This is where the expertise of
system analysts is beneficial, since using their own experience and their contact with
vendors they will be able to answer the question of technical feasibility.

A technical feasibility study also addresses the following:

• Is the technology available within the given resource constraints?
• Does the technology have the capacity to handle the solution?
• Is the relevant technology stable and established?
• Does the technology chosen for software development have a large number of
users, so that they can be consulted when problems arise or improvements
are required?

3.3.3. OPERATIONAL FEASIBILITY:

Operational feasibility is a measure of how well a proposed system solves the
problems and takes advantage of the opportunities identified during scope
definition, and how it satisfies the requirements identified in the requirements
analysis phase of system development. Operational feasibility also refers to the
availability of the operational resources needed to extend results beyond the
setting in which they were developed; ideally, the operational requirements are
minimal and easily accommodated. In addition, operational feasibility
includes any rational compromises users make in adjusting the technology to the
limited operational resources available to them. An operational feasibility study also
addresses the following:

• Does the current mode of operation provide adequate response time?
• Does the current mode of operation make maximum use of resources?
• Is the solution suggested by the software development team acceptable?
• Does the operation offer an effective way to control the data?
• Our project runs on a standard processor, and the installed packages are supported
by the system.

3.4. COST BENEFIT ANALYSIS

The financial and economic questions raised during the preliminary investigation are
examined to estimate the following:

• The cost of the hardware and software for the class of application being
considered.
• The benefits in the form of reduced cost.
• The proposed system will, as a result, provide detailed information.
• Performance is improved, which in turn may be expected to provide
increased profits.
• Whether the system can be developed with the available funds.
• The project can be done economically if planned judiciously, so it is economically
feasible.
• The cost of the project depends on the number of man-hours required.

4. SOFTWARE DESCRIPTION

4.1. Visual Studio Code


Visual Studio Code (commonly known as VS Code) is a free, open-source text editor
by Microsoft. VS Code is available for Windows, Linux, and macOS. VS Code
supports a wide array of languages and file types, from Java, C++, and Python to
CSS, Go, and Dockerfiles. Moreover, VS Code lets you install and even
create extensions, including code linters, debuggers, and cloud and web
development support. Its user interface is more interactive than that of
many other text editors.

4.2. Flask

Flask is a web framework: a Python module that lets you develop web applications
easily. It has a small and easy-to-extend core; it is a microframework that does not include
an ORM (Object-Relational Mapper) or similar features. It does provide features
such as URL routing and a template engine [9]. It is a WSGI web application framework.
Current Flask releases require Python 3; Python 2.7 is supported only by legacy releases. Flask is
designed to keep the core of the application simple and scalable. Instead of an abstraction
layer for database support, Flask supports extensions to add such capabilities to the
application.
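
To show what "WSGI" and "URL routing" mean in practice, the sketch below builds a tiny WSGI application using only the Python standard library. It is not Flask code and not the project's application: the route paths, handler names, and response strings are hypothetical. It illustrates the interface Flask implements on the developer's behalf, where a routing table maps URL paths to handler functions and a callable receives each request.

```python
from wsgiref.util import setup_testing_defaults

def home(environ):
    return "200 OK", "Disease prediction portal"

def diabetes_form(environ):
    return "200 OK", "Diabetes input form"

# URL routing table: path -> handler. A framework such as Flask builds a
# comparable mapping from its route decorators.
ROUTES = {"/": home, "/diabetes": diabetes_form}

def application(environ, start_response):
    """A WSGI app: called once per request with the request environment."""
    handler = ROUTES.get(environ.get("PATH_INFO", "/"))
    if handler is None:
        status, body = "404 Not Found", "Not found"
    else:
        status, body = handler(environ)
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]

def call_app(path):
    """Invoke the app directly, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")

print(call_app("/diabetes"))
print(call_app("/unknown"))
```

With Flask, the routing table, request parsing, and response construction are handled by the framework, which is why the application code stays small.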

4.3. Python

Python is an interpreted, object-oriented, high-level programming language with dynamic
semantics. Its high-level built-in data structures, combined with dynamic typing and
dynamic binding, make it very attractive for Rapid Application Development, as well as
for use as a scripting or glue language to connect existing components together. Python's
simple, easy to learn syntax emphasizes readability and therefore reduces the cost of
program maintenance. Python supports modules and packages, which encourages
program modularity and code reuse. The Python interpreter and the extensive standard
library are available in source or binary form without charge for all major platforms, and
can be freely distributed.

4.4. HTML/CSS

The Hypertext Markup Language or HTML is the standard markup language for
documents designed to be displayed in a web browser. It is often assisted by technologies
such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript. HTML
elements are the building blocks of HTML pages. With HTML constructs, images and
other objects such as interactive forms may be embedded into the rendered page. HTML
can embed programs written in a scripting language such as JavaScript, which affects the
behavior and content of web pages. The inclusion of CSS defines the look and layout of
content. CSS is designed to enable the separation of content and presentation, including
layout, colors, and fonts. This separation can improve content accessibility; provide more
flexibility and control in the specification of presentation characteristics; enable multiple
web pages to share formatting by specifying the relevant CSS in a separate .css file,
which reduces complexity and repetition in the structural content.

4.5. TensorFlow

TensorFlow is a free and open-source software library for machine learning and artificial
intelligence. It can be used across a range of tasks but has a particular focus on training
and inference of deep neural networks [10]. TensorFlow was developed by the Google
Brain team for internal Google use in research and production. The initial version was
released under the Apache License 2.0 in 2015. Google released the updated version of
TensorFlow, named TensorFlow 2.0, in September 2019. TensorFlow can be used in a
wide variety of programming languages, including Python, JavaScript, C++, and Java.

4.6. Matplotlib

Matplotlib is a comprehensive library for creating static, animated, and interactive
visualizations in Python. Matplotlib makes easy things easy and hard things possible.
It can create publication-quality plots, make interactive figures that can zoom, pan,
and update, customize visual style and layout, export to many file formats, and be
embedded in Jupyter and graphical user interfaces. A rich array of third-party packages
is built on Matplotlib [11].
5. PROBLEM DESCRIPTION

5.1. PROBLEM DEFINITION


Disease Prediction using Machine Learning is a system that predicts diseases from the
values and images provided by the user. The system processes the input provided by the
user and outputs the probability of the disease, along with the degree of severity. With
the growth of biomedical and healthcare data, accurate analysis of medical data benefits
early disease detection and patient care [4].

We used CNN and Random Forest algorithms for the different disease predictions. The main
objective is to develop a reliable system for disease prediction with accurate results.

5.2. PROJECT OVERVIEW

Our project's primary goal is to assist users in determining whether or not they have a
specific disease based on the information provided by the users.

We used a CNN (Convolutional Neural Network), a deep learning algorithm that is widely
used for image classification, for the image-based predictions. We also used the
Random Forest algorithm for the prediction of Breast Cancer and Diabetes. For Pneumonia,
Covid and Alzheimer’s disease prediction, an image is provided to the model. For Diabetes
and Breast Cancer, lab result values such as glucose, insulin, concave points mean and
radius mean are given as input to the model, based on which the model predicts whether a
person is suffering from the particular disease or not.

The steps involved in the project are:

1. Collection of datasets.
2. Data split for training and testing.
3. Training the model.
4. Repeating steps 2 and 3 for different ratios of training and testing data to
maximize the accuracy.
5. Webpage creation.
6. Linking the Machine Learning model and the webpage.
7. Displaying results.
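
Steps 2 to 4 can be illustrated with a minimal sketch. It is not the project's training code: the helper names `split_dataset` and `best_split_ratio`, and the `evaluate` callback that would train a model and return its accuracy, are assumptions introduced only for illustration.

```python
import random


def split_dataset(samples, train_fraction, seed=0):
    """Shuffle a copy of samples and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def best_split_ratio(samples, ratios, evaluate):
    """Score every ratio with evaluate(train, test) and return the best one.

    evaluate is a hypothetical callback: it is expected to train a model on
    `train`, measure it on `test`, and return a number (higher is better).
    """
    scored = []
    for ratio in ratios:
        train, test = split_dataset(samples, ratio)
        scored.append((evaluate(train, test), ratio))
    best_score, best_ratio = max(scored)
    return best_ratio, best_score
```

In practice `evaluate` would wrap the training step (step 3) and return the test accuracy, so that the loop corresponds to step 4.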

The output of our project consists of the name of the person, their age, and the final result
stating whether they are suffering from the particular disease or not, along with the degree
of severity.

5.3. MODULE DESCRIPTION

5.3.1. PYTHON AND FLASK FRAMEWORK

Flask is a web framework: a Python module that lets you develop web applications
easily. It has a small and easy-to-extend core; it is a microframework that does not
include an ORM (Object Relational Mapper) or similar features. It does provide features
such as URL routing and a template engine. It is a WSGI web application framework. To
install Flask, Python must already be installed on the system (current Flask releases
require Python 3). Flask is designed to keep the core of the application simple and
scalable. Instead of an abstraction layer for database support, Flask supports extensions
that add such capabilities to the application.

Python is an interpreted, object-oriented, high-level programming language with
dynamic semantics. Its high-level built-in data structures, combined with dynamic typing
and dynamic binding, make it very attractive for Rapid Application Development, as well
as for use as a scripting or glue language to connect existing components together.
Python's simple, easy to learn syntax emphasizes readability and therefore reduces the
cost of program maintenance. Python supports modules and packages, which encourages
program modularity and code reuse. The Python interpreter and the extensive standard
library are available in source or binary form without charge for all major platforms, and
can be freely distributed.

5.3.2. MODEL

CNN (Convolutional Neural Network)

In neural networks, convolutional neural networks (ConvNets or CNNs) are one of the
main categories of model used for image recognition and image classification. Object
detection and face recognition are some of the areas where CNNs are widely used.
A CNN image classifier takes an input image, processes it and classifies it under
certain categories [6]. A computer sees an input image as an array of pixels whose
size depends on the image resolution: h x w x d (h = height, w = width, d = depth,
i.e. the number of colour channels).
A convolutional neural network consists of an input layer and an output layer, as
well as multiple hidden layers. The hidden layers of a CNN typically consist of a
series of convolutional layers that convolve their input with a multiplication or other
dot product. The activation function is commonly a ReLU layer, and is subsequently
followed by additional layers such as pooling layers, fully connected layers
and normalization layers. These are referred to as hidden layers because their inputs
and outputs are masked by the activation function and final convolution.

FIG 5.1 The layers of a CNN have neurons arranged in 3 dimensions: width,
height and depth
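
To make the convolution and pooling steps concrete, the following is a minimal pure-Python sketch on a single-channel image. It is an illustration under simplifying assumptions (no padding, stride 1, one kernel), not the layers used by the project's models; the function names `conv2d_valid` and `max_pool2d` are hypothetical. As in most deep learning libraries, the "convolution" is computed without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled
```

A real CNN applies many such kernels per layer across all d channels and learns the kernel values during training.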

ReLU layer

ReLU stands for Rectified Linear Unit. It is one of the most widely used activation
functions and appears in almost all convolutional neural networks and deep learning
models. The ReLU is half rectified (from the bottom): f(z) is zero when z is less than
zero, and f(z) is equal to z when z is greater than or equal to zero.
Function: f(z) = max(0, z), with range [0, ∞)
The drawback is that all negative values become zero immediately, which can decrease
the ability of the model to fit or train on the data properly. Any negative input given
to the ReLU activation function is mapped to zero, so negative values are not represented
in the output.

Graphically it looks like this-

Fig 5.2 ReLU graphical representation
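
The definition f(z) = max(0, z) translates directly into code. The sketch below is only an illustration; the function names `relu` and `relu_layer` are hypothetical and are not taken from the project's code.

```python
def relu(z):
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def relu_layer(values):
    """Apply ReLU elementwise to a list of activations."""
    return [relu(v) for v in values]
```

For example, `relu_layer([-1.0, 0.0, 4.0])` maps the negative entry to zero and leaves the others unchanged.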

Random Forest
Random Forest is one of the most popular and commonly used algorithms among data
scientists. It is a supervised machine learning algorithm that is widely used in
classification and regression problems. It builds decision trees on different samples
and takes their majority vote for classification and their average for regression.
One of the most important features of the Random Forest algorithm is that it can
handle data sets containing continuous variables, as in the case of regression, and
categorical variables, as in the case of classification [7]. It performs well on both
classification and regression tasks.

Working of Random Forest Algorithm:


Ensemble simply means combining multiple models. Thus, a collection of models is used
to make predictions rather than an individual model.

Ensemble uses two types of methods:

1. Bagging– It creates different training subsets from the sample training data by
sampling with replacement, and the final output is based on majority voting. For example,
Random Forest.

2. Boosting– It combines weak learners into strong learners by creating sequential
models such that the final model has the highest accuracy. For example, AdaBoost and
XGBoost.

Fig 5.3 Bagging Ensemble method
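
The bagging idea can be sketched in a few lines. This is a toy illustration, not the Random Forest implementation used in the project: the base learner `train_stump` is a hypothetical one-feature threshold classifier standing in for a decision tree, and all function names are assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def train_stump(sample):
    """Toy learner: threshold halfway between the two class means."""
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:
        # Degenerate bootstrap sample: always predict the only class seen
        only = 0 if zeros else 1
        return lambda x: only
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x >= threshold else 0


def bagging_fit(rows, n_models=25, seed=0):
    """Train one model per bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(rows, rng)) for _ in range(n_models)]


def majority_vote(predictions):
    """Return the most common label among the predictions."""
    return Counter(predictions).most_common(1)[0][0]


def bagging_predict(models, x):
    """Combine the individual predictions by majority voting."""
    return majority_vote([model(x) for model in models])
```

A random forest follows the same pattern with decision trees as the base learners and, in addition, a random subset of features per tree.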

Important Features of Random Forest
• Diversity: Not all attributes/variables/features are considered while making an
individual tree; each tree is different [7].
• Reduced sensitivity to the curse of dimensionality: Since each tree does not consider
all the features, the feature space seen by each tree is reduced.
• Parallelization: Each tree is created independently out of different data and
attributes. This means we can make full use of the CPU to build random forests.
• Train-Test split: Each tree is trained on a bootstrap sample, so roughly one-third of
the data (about 37% for large data sets) is not seen by that tree. These out-of-bag
samples can be used to estimate accuracy without a separate test set.
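
The share of data left out of a bootstrap sample can be checked with a short simulation. When n rows are drawn with replacement from n rows, the probability that a given row is never drawn is (1 - 1/n)^n, which approaches 1/e ≈ 0.368 as n grows. The function name `out_of_bag_fraction` is an assumption used only for this sketch.

```python
import random


def out_of_bag_fraction(n_rows, seed=0):
    """Fraction of rows that never appear in one bootstrap sample of size n_rows."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n_rows) for _ in range(n_rows)}
    return 1 - len(drawn) / n_rows
```

Calling `out_of_bag_fraction(10000)` should give a value close to 0.37, which is the portion of the data each tree does not see.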

Fig 5.4 Example of Random Forest algorithm

The implementation consists of the following modules:
The user is first directed to the homepage of our website, where they can choose to
evaluate any of the listed diseases: Pneumonia, Covid, Alzheimer’s, Breast Cancer, and
Diabetes.

• When a user submits a chest X-ray for Covid or Pneumonia, a machine learning
model analyses the image and determines whether the patient has the disease or
not. It also determines the severity of the condition, such as mild, moderate, or
high.
• In the case of Alzheimer's, a brain MRI is given as input to the machine learning
model. The model analyses the image and determines whether or not the patient has
the disease. It also determines the severity of the condition: non-demented, very
mild demented, mild demented or moderate demented.
• In the case of Diabetes, the user provides the machine learning model with input
data on glucose, insulin, age, the number of pregnancies, blood pressure, skin
thickness, BMI, and the Diabetes Pedigree function. The model predicts whether
a person has the disease or not based on these variables.
• In the case of Breast Cancer, the user provides the machine learning model with
input data on concave points mean, area mean, radius mean, perimeter mean and
concavity mean. The model predicts whether a person has the disease or not
based on these variables.
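
The step from a raw model output to the result shown to the user can be sketched as follows. This is a hedged illustration rather than the project's code: the helper names are hypothetical, the 0.5 threshold and the four Alzheimer's class names are taken from the app.py listing in Section 7.2, and it is an assumption that the class order there matches the order used when the model was trained. Which index means "positive" differs between models in that listing, so the label order is left as a parameter.

```python
# Class names as they appear in the app.py listing; the order is an assumption
ALZHEIMER_CLASSES = ["NonDemented", "VeryMildDemented",
                     "MildDemented", "ModerateDemented"]


def binary_label(probability, labels=("NEGATIVE", "POSITIVE"), threshold=0.5):
    """Map a single sigmoid-style output to one of two labels."""
    return labels[1] if probability >= threshold else labels[0]


def severity_label(class_scores, class_names=ALZHEIMER_CLASSES):
    """Pick the class with the highest score (argmax) and return its name."""
    best_index = max(range(len(class_scores)), key=class_scores.__getitem__)
    return class_names[best_index]
```

For instance, `severity_label([0.1, 0.2, 0.6, 0.1])` selects the third class, and passing `labels=("POSITIVE", "NEGATIVE")` to `binary_label` covers a model whose output 0 denotes the positive class.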

FLOWCHART

Fig 5.5 Flowchart describing the Machine Learning model

6. SYSTEM DESIGN

6.1 Introduction to UML

Unified Modeling Language (UML) is a general-purpose modeling language. The main
aim of UML is to define a standard way to visualize the way a system has been
designed. It is quite like the blueprints used in other fields of engineering. UML is
not a programming language; it is rather a visual language. We use UML diagrams to
portray the behavior and structure of a system. UML helps software engineers,
business stakeholders and system architects with modeling, design and analysis. The
Object Management Group (OMG) adopted the Unified Modeling Language as a standard
in 1997 and has managed it ever since. The International Organization for
Standardization (ISO) published UML as an approved standard in 2005. UML has been
revised over the years and is reviewed periodically.

Why we need UML

• Complex applications need collaboration and planning from multiple teams
and hence require a clear and concise way to communicate amongst them.

• Non-programmers, such as business stakeholders, do not read code. UML therefore
becomes essential for communicating the essential requirements, functionalities and
processes of the system to them.

• A lot of time is saved down the line when teams can visualize processes,
user interactions and the static structure of the system.

UML is linked with object-oriented design and analysis. UML makes use of
elements and forms associations between them to form diagrams. Diagrams
in UML can be broadly classified as:

• Structural Diagrams – Capture static aspects or structure of a system.
Structural diagrams include Component Diagrams, Object Diagrams, Class
Diagrams and Deployment Diagrams.

• Behaviour Diagrams – Capture dynamic aspects or behaviour of the system.
Behaviour diagrams include Use Case Diagrams, State Diagrams, Activity
Diagrams and Interaction Diagrams.


Fig 6.1 BUILDING BLOCKS IN UML

6.2 Building Block of the UML


The vocabulary of the UML encompasses three kinds of building blocks:

• Things
• Relationships
• Diagrams

Things are the abstractions that are first-class citizens in a model; relationships tie
these things together; diagrams group interesting collections of things.

Things in the UML

There are four kinds of things in the UML:

• Structural things
• Behavioural things
• Grouping things
• Annotational things

These things are the basic object-oriented building blocks of the UML. You use
them to write well-formed models.

Structural Things

Structural things are the nouns of UML models. These are the mostly static
parts of a model, representing elements that are either conceptual or physical.
Collectively, the structural things are called classifiers.

A class is a description of a set of objects that share the same attributes,
operations, relationships, and semantics. A class implements one or more interfaces.
Graphically, a class is rendered as a rectangle, usually including its name, attributes,
and operations.

Class - A Class is a set of identical things that outlines the functionality and
properties of an object. It also represents the abstract class whose functionalities are
not defined. Its notation is as follows

Interface - A collection of functions that specify a service of a class or component,
i.e., the externally visible behavior of that class.

Collaboration - A larger pattern of behaviors and actions. Example: All classes
and behaviors that create the modeling of a moving tank in a simulation.

Use Case - A sequence of actions that a system performs that yields an observable
result. Used to structure behavior in a model. Is realized by collaboration.

Component - A physical and replaceable part of a system that implements a
number of interfaces. Example: a set of classes, interfaces, and collaborations.

Node - A physical element that exists at run time and represents a resource.

Behavioral Things

Behavioral things are the dynamic parts of UML models. These are the verbs of a
model, representing behavior over time and space. Two primary kinds of behavioral
things are described here:

• Interaction
• State machine

Interaction

An interaction is a behavior that comprises a set of messages exchanged among a set of
objects or roles within a particular context to accomplish a specific purpose. The
behavior of a society of objects or of an individual operation may be specified with
an interaction. An interaction involves a number of other elements, including
messages, actions, and connectors (the connection between objects). Graphically, a
message is rendered as a directed line, almost always including the name of its
operation.

State machine

A state machine is a behaviour that specifies the sequences of states an object
or an interaction goes through during its lifetime in response to events, together
with its responses to those events. The behaviour of an individual class or a
collaboration of classes may be specified with a state machine. A state machine
involves a number of other elements, including states, transitions (the flow from
state to state), events (things that trigger a transition), and activities (the response to
a transition). Graphically, a state is rendered as a rounded rectangle, usually
including its name and its substates.

Grouping Things

Grouping things can be defined as a mechanism to group elements of a
UML model together. There is only one grouping thing available.

Package − The package is the only grouping thing available for gathering structural
and behavioural things.

Annotational Things
Annotational things are the explanatory parts of UML models. These are the
comments you may apply to describe, illuminate, and remark about any element in
a model. There is one primary kind of annotational thing, called a note. A note is
simply a symbol for rendering constraints and comments attached to an element or
a collection of elements.

Relationships in the UML

Relationships are another important building block of UML. They
show how the elements are associated with each other, and this association
describes the functionality of an application.

There are four kinds of relationships in the UML:

• Dependency
• Association
• Generalization
• Realization

Dependency
A dependency is a relationship between two elements in which a change to one
element (the independent one) may affect the semantics of the other element (the
dependent one). Graphically, a dependency is rendered as a dashed line, possibly
directed, and occasionally including a label.

Association
Association is basically a set of links that connects the elements of a UML
model. It also describes how many objects are taking part in that relationship.

Generalization
It is a specialization/generalization relationship in which the specialized
element (the child) builds on the specification of the generalized element (the
parent). The child shares the structure and the behavior of the parent. Graphically, a
generalization relationship is rendered as a solid line with a hollow arrowhead
pointing to the parent.

Realization
Realization can be defined as a relationship in which two elements are
connected. One element describes some responsibility, which is not implemented
and the other one implements them. This relationship exists in case of interfaces.
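
The four relationships have rough counterparts in object-oriented code. The sketch below is only an illustration of the concepts; every class and function name in it (`Predictor`, `ThresholdPredictor`, `StrictThresholdPredictor`, `Patient`, `Report`, `build_report`) is hypothetical and does not come from the project.

```python
from abc import ABC, abstractmethod


class Predictor(ABC):
    """Interface: declares a responsibility without implementing it."""

    @abstractmethod
    def predict(self, features):
        ...


class ThresholdPredictor(Predictor):
    """Realization: implements the Predictor interface."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, features):
        # Returns 1 when the mean of the features reaches the threshold
        return int(sum(features) / len(features) >= self.threshold)


class StrictThresholdPredictor(ThresholdPredictor):
    """Generalization: the child reuses the parent's structure and behavior."""

    def __init__(self):
        super().__init__(threshold=0.9)


class Patient:
    def __init__(self, name):
        self.name = name


class Report:
    """Association: a Report keeps a link to one Patient."""

    def __init__(self, patient, result):
        self.patient = patient
        self.result = result


def build_report(patient, predictor, features):
    """Dependency: this function uses a Predictor but does not own one."""
    return Report(patient, predictor.predict(features))
```

In a class diagram, `ThresholdPredictor` would be drawn with a realization arrow to `Predictor`, `StrictThresholdPredictor` with a generalization arrow to its parent, and `Report` with an association line to `Patient`.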

6.3 UML DIAGRAMS

UML is a modern approach to modeling and documenting software. It is
based on diagrammatic representations of software components. The diagrams are
the final output of the modeling process and represent the system.

UML includes the following

• Class diagram
• Object diagram
• Component diagram
• Composite structure diagram
• Use case diagram
• Sequence diagram
• Communication diagram
• State diagram
• Activity diagram

Fig 6.2 Use-Case diagram

Fig 6.3 Sequence Diagram

7. DEVELOPMENT

7.1. RAW DATA

Pneumonia Dataset

Fig 7.1 Xray of various patients for detecting Pneumonia

Covid Dataset

Fig 7.2 Xray of various patients for detecting Covid

Alzheimer’s dataset

Fig 7.3 MRI Scan of various patients for detecting Alzheimer’s

Breast Cancer dataset

Fig 7.4 Lab results of various patients for detecting breast cancer

Diabetes dataset

Fig 7.5 Lab results of various patients for detecting diabetes

7.2. SAMPLE CODE
app.py

from flask import Flask, flash, request, redirect, url_for, render_template
import urllib.request
import os
from werkzeug.utils import secure_filename
import cv2
import pickle
import imutils
import sklearn
from tensorflow.keras.models import load_model
# from pushbullet import PushBullet
import joblib
import numpy as np
from tensorflow.keras.applications.vgg16 import preprocess_input

# Loading Models
covid_model = load_model('models/covid.h5')
braintumor_model = load_model('models/braintumor.h5')
alzheimer_model = load_model('models/alzheimer_model.h5')
diabetes_model = pickle.load(open('models/diabetes.sav', 'rb'))
heart_model = pickle.load(open('models/heart_disease.pickle.dat', "rb"))
pneumonia_model = load_model('models/pneumonia_model.h5')
breastcancer_model = joblib.load('models/cancer_model.pkl')

# Configuring Flask
UPLOAD_FOLDER = 'static/uploads'
ALLOWED_EXTENSIONS = set(['png', 'jpg', 'jpeg'])

app = Flask(__name__)
app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 0
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
app.secret_key = "secret key"

def allowed_file(filename):
    # Compare the extension case-insensitively so that e.g. "scan.PNG" is accepted
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


def preprocess_imgs(set_name, img_size):
    """
    Resize and apply VGG-16 preprocessing
    """
    set_new = []
    for img in set_name:
        img = cv2.resize(img, dsize=img_size, interpolation=cv2.INTER_CUBIC)
        set_new.append(preprocess_input(img))
    return np.array(set_new)

def crop_imgs(set_name, add_pixels_value=0):
    """
    Finds the extreme points on the image and crops the bounding rectangle out of them
    """
    set_new = []
    for img in set_name:
        gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
        gray = cv2.GaussianBlur(gray, (5, 5), 0)

        # threshold the image, then perform a series of erosions +
        # dilations to remove any small regions of noise
        thresh = cv2.threshold(gray, 45, 255, cv2.THRESH_BINARY)[1]
        thresh = cv2.erode(thresh, None, iterations=2)
        thresh = cv2.dilate(thresh, None, iterations=2)

        # find contours in thresholded image, then grab the largest one
        cnts = cv2.findContours(
            thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cnts = imutils.grab_contours(cnts)
        c = max(cnts, key=cv2.contourArea)

        # find the extreme points
        extLeft = tuple(c[c[:, :, 0].argmin()][0])
        extRight = tuple(c[c[:, :, 0].argmax()][0])
        extTop = tuple(c[c[:, :, 1].argmin()][0])
        extBot = tuple(c[c[:, :, 1].argmax()][0])

        ADD_PIXELS = add_pixels_value
        new_img = img[extTop[1]-ADD_PIXELS:extBot[1]+ADD_PIXELS,
                      extLeft[0]-ADD_PIXELS:extRight[0]+ADD_PIXELS].copy()
        set_new.append(new_img)

    return np.array(set_new)

@app.route('/')
def home():
    return render_template('homepage.html')


@app.route('/covid')
def covid():
    return render_template('covid.html')


@app.route('/breastcancer')
def breast_cancer():
    return render_template('breastcancer.html')


@app.route('/braintumor')
def brain_tumor():
    return render_template('braintumor.html')


@app.route('/diabetes')
def diabetes():
    return render_template('diabetes.html')


@app.route('/alzheimer')
def alzheimer():
    return render_template('alzheimer.html')


@app.route('/pneumonia')
def pneumonia():
    return render_template('pneumonia.html')


@app.route('/heartdisease')
def heartdisease():
    return render_template('heartdisease.html')

@app.route('/resultc', methods=['POST'])
def resultc():
    if request.method == 'POST':
        firstname = request.form['firstname']
        lastname = request.form['lastname']
        email = request.form['email']
        phone = request.form['phone']
        gender = request.form['gender']
        age = request.form['age']
        file = request.files['file']
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            flash('Image successfully uploaded and displayed below')
            # Resize to the model's input size and scale pixel values to [0, 1]
            img = cv2.imread('static/uploads/'+filename)
            img = cv2.resize(img, (224, 224))
            img = img.reshape(1, 224, 224, 3)
            img = img/255.0
            pred = covid_model.predict(img)
            if pred < 0.5:
                pred = 0
            else:
                pred = 1
            # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour COVID-19 test results are ready.\nRESULT: {}'.format(firstname,['POSITIVE','NEGATIVE'][pred]))
            return render_template('resultc.html', filename=filename, fn=firstname,
                                   ln=lastname, age=age, r=pred, gender=gender)
        else:
            flash('Allowed image types are - png, jpg, jpeg')
            return redirect(request.url)

@app.route('/resultbt', methods=['POST'])
def resultbt():
    if request.method == 'POST':
        firstname = request.form['firstname']
        lastname = request.form['lastname']
        email = request.form['email']
        phone = request.form['phone']
        gender = request.form['gender']
        age = request.form['age']
        file = request.files['file']
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            flash('Image successfully uploaded and displayed below')
            # Crop to the largest contour, then apply VGG-16 preprocessing
            img = cv2.imread('static/uploads/'+filename)
            img = crop_imgs([img])
            img = img.reshape(img.shape[1:])
            img = preprocess_imgs([img], (224, 224))
            pred = braintumor_model.predict(img)
            if pred < 0.5:
                pred = 0
            else:
                pred = 1
            # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour Brain Tumor test results are ready.\nRESULT: {}'.format(firstname,['NEGATIVE','POSITIVE'][pred]))
            return render_template('resultbt.html', filename=filename, fn=firstname,
                                   ln=lastname, age=age, r=pred, gender=gender)
        else:
            flash('Allowed image types are - png, jpg, jpeg')
            return redirect(request.url)

@app.route('/resultd', methods=['POST'])
def resultd():
    if request.method == 'POST':
        firstname = request.form['firstname']
        lastname = request.form['lastname']
        email = request.form['email']
        phone = request.form['phone']
        gender = request.form['gender']
        pregnancies = request.form['pregnancies']
        glucose = request.form['glucose']
        bloodpressure = request.form['bloodpressure']
        insulin = request.form['insulin']
        bmi = request.form['bmi']
        diabetespedigree = request.form['diabetespedigree']
        age = request.form['age']
        skinthickness = request.form['skin']
        # Form values arrive as strings; convert them to floats before prediction
        features = [float(value) for value in (
            pregnancies, glucose, bloodpressure, skinthickness, insulin, bmi,
            diabetespedigree, age)]
        pred = diabetes_model.predict([features])
        # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour Diabetes test results are ready.\nRESULT: {}'.format(firstname,['NEGATIVE','POSITIVE'][pred]))
        return render_template('resultd.html', fn=firstname, ln=lastname, age=age, r=pred,
                               gender=gender)

@app.route('/resultbc', methods=['POST'])
def resultbc():
    if request.method == 'POST':
        firstname = request.form['firstname']
        lastname = request.form['lastname']
        email = request.form['email']
        phone = request.form['phone']
        gender = request.form['gender']
        age = request.form['age']
        cpm = request.form['concave_points_mean']
        am = request.form['area_mean']
        rm = request.form['radius_mean']
        pm = request.form['perimeter_mean']
        cm = request.form['concavity_mean']
        # dtype=float converts the submitted strings to numbers before prediction
        pred = breastcancer_model.predict(
            np.array([cpm, am, rm, pm, cm], dtype=float).reshape(1, -1))
        # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour Breast Cancer test results are ready.\nRESULT: {}'.format(firstname,['NEGATIVE','POSITIVE'][pred]))
        return render_template('resultbc.html', fn=firstname, ln=lastname, age=age,
                               r=pred, gender=gender)

@app.route('/resulta', methods=['GET', 'POST'])
def resulta():
    if request.method == 'POST':
        print(request.url)
        firstname = request.form['firstname']
        lastname = request.form['lastname']
        email = request.form['email']
        phone = request.form['phone']
        gender = request.form['gender']
        age = request.form['age']
        file = request.files['file']
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            flash('Image successfully uploaded and displayed below')
            img = cv2.imread('static/uploads/'+filename)
            img = cv2.resize(img, (176, 176))
            img = img.reshape(1, 176, 176, 3)
            img = img/255.0
            pred = alzheimer_model.predict(img)
            # Index of the class with the highest predicted score
            pred = pred[0].argmax()
            print(pred)
            # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour Alzheimer test results are ready.\nRESULT: {}'.format(firstname,['NonDemented','VeryMildDemented','MildDemented','ModerateDemented'][pred]))
            # Note: r is fixed at 0 here; pred holds the predicted class index
            return render_template('resulta.html', filename=filename, fn=firstname,
                                   ln=lastname, age=age, r=0, gender=gender)
        else:
            flash('Allowed image types are - png, jpg, jpeg')
            return redirect('/')

@app.route('/resultp', methods=['POST'])
def resultp():
    if request.method == 'POST':
        firstname = request.form['firstname']
        lastname = request.form['lastname']
        email = request.form['email']
        phone = request.form['phone']
        gender = request.form['gender']
        age = request.form['age']
        file = request.files['file']
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            flash('Image successfully uploaded and displayed below')
            img = cv2.imread('static/uploads/'+filename)
            img = cv2.resize(img, (150, 150))
            img = img.reshape(1, 150, 150, 3)
            img = img/255.0
            pred = pneumonia_model.predict(img)
            if pred < 0.5:
                pred = 0
            else:
                pred = 1
            # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour COVID-19 test results are ready.\nRESULT: {}'.format(firstname,['POSITIVE','NEGATIVE'][pred]))
            return render_template('resultp.html', filename=filename, fn=firstname,
                                   ln=lastname, age=age, r=pred, gender=gender)
        else:
            flash('Allowed image types are - png, jpg, jpeg')
            return redirect(request.url)

@app.route('/resulth', methods=['POST'])
def resulth():
    if request.method == 'POST':
        firstname = request.form['firstname']
        lastname = request.form['lastname']
        email = request.form['email']
        phone = request.form['phone']
        gender = request.form['gender']
        nmv = float(request.form['nmv'])
        tcp = float(request.form['tcp'])
        eia = float(request.form['eia'])
        thal = float(request.form['thal'])
        op = float(request.form['op'])
        mhra = float(request.form['mhra'])
        age = float(request.form['age'])
        print(np.array([nmv, tcp, eia, thal, op, mhra, age]).reshape(1, -1))
        pred = heart_model.predict(
            np.array([nmv, tcp, eia, thal, op, mhra, age]).reshape(1, -1))
        # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour Diabetes test results are ready.\nRESULT: {}'.format(firstname,['NEGATIVE','POSITIVE'][pred]))
        return render_template('resulth.html', fn=firstname, ln=lastname, age=age, r=pred,
                               gender=gender)

# No caching at all for API endpoints.
@app.after_request
def add_header(response):
    """
    Add headers to force the latest IE rendering engine or Chrome Frame,
    and to disable caching of the rendered page (max-age=0).
    """
    response.headers['X-UA-Compatible'] = 'IE=Edge,chrome=1'
    response.headers['Cache-Control'] = 'public, max-age=0'
    return response


if __name__ == '__main__':
    app.run(debug=True)

homepage.html
<!doctype html>

<html lang="en">

<style>

.headstyle {
color: rgb(255, 255, 255);
font-variant: petite-caps;
background-color: rgb(0, 0, 0, 0.8);
margin-bottom: 0px
}

.divstyle {
border-radius: 10px 10px 10px 10px;
margin-left: 1px;
margin-right: 1px
}

</style>

<head>

<!-- Required meta tags -->

<meta charset="utf-8">

<meta name="viewport" content="width=device-width, initial-scale=1">

<!-- Bootstrap CSS -->

<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-eOJMYsd53ii+scO/bJGFsiCZc+5NDVN2yr8+0RDqr0Ql0h+rP48ckxlpbzKgwra6" crossorigin="anonymous">

<title>AdvaCare</title>

</head>

<body>

<nav class="navbar navbar-expand-lg navbar-dark bg-dark">

<div class="container-fluid">

<a class="navbar-brand" href="/">AdvaCare</a>

<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">

<span class="navbar-toggler-icon"></span>

</button>

<div class="collapse navbar-collapse" id="navbarSupportedContent">

<ul class="navbar-nav ms-auto mb-2 mb-lg-0">

<li class="nav-item">

<a class="nav-link " aria-current="page" href="/covid">Covid</a>

</li>

<li class="nav-item">

<a class="nav-link " aria-current="page" href="/breastcancer">Breast Cancer</a>

</li>

<li class="nav-item">

<a class="nav-link " aria-current="page" href="/alzheimer">Alzheimer</a>

</li>

<li class="nav-item">

<a class="nav-link " aria-current="page" href="/diabetes">Diabetes</a>

</li>

<li class="nav-item">

<a class="nav-link " aria-current="page" href="/pneumonia">Pneumonia</a>

</li>

</ul>

</div>

</div>

</nav>

<h1 class='text-center py-3'

style="font-variant: petite-caps;margin-bottom:0px">

<b><i>Integrating different illness prediction methods into a single


interface</i></b>

</h1>

<div class="row" style="font-size: 20px;padding: 0px 50px 50px 50px;">

<p> The web application is about integrating different illness prediction
methods into a single interface. The web application determines whether a specific
person is suffering from a specific disease or not. It is an all in one medical solution
which brings 5 Disease Detections like Covid Detection, Breast Cancer Detection, Alzheimer
Detection, Diabetes Detection and Pneumonia Detection under one platform.</p>

<h2 class='text-center py-3'

style="color: rgb(255, 255, 255);font-variant: petite-caps;background-color:


rgb(0, 0, 0);margin-bottom:0px">

<b><i>5 Disease Detections</i></b>

</h2>

<div class='divstyle' style='margin:40px 20px 60px 20px'>

<div class="row py-3">

<div class="col md-3">

<h3 class='text-center py-3 headstyle' style="font-size: 18px;"><b>Covid


Detection</b></h3>

<a href="./covid"><img src="../static/icons/covid.jpg" class="img-fluid mx-auto d-block"></a>

</div>

<div class="col md-3">

<h3 class='text-center py-3 headstyle' style="font-size: 18px;"><b>Breast


Cancer Detection</b></h3>

<a href="./breastcancer"><img

src="../static/icons/breastcancer.png" class="img-fluid mx-auto

d-block"></a>

</div>

<div class="col md-3">

<h3 class='text-center py-3 headstyle' style="font-size: 18px;"><b>Alzheimer Detection</b></h3>
<a href="./alzheimer"><img

src="../static/icons/alzheimer.png" class="img-fluid mx-auto d-block"></a>

</div>

</div>

</div>

<div class='divstyle' style='margin:40px 20px 60px 20px'>

<div class="row py-3">

<div class="col md-3">

<h3 class='text-center py-3 headstyle' style="font-size: 18px;"><b>Diabetes


Detection</b></h3>

<a href="./diabetes"><img src="../static/icons/diabetes.png" class="img-fluid mx-auto d-block"></a>

</div>

<div class="col md-3">

<h3 class='text-center py-3 headstyle' style="font-size:


18px;"><b>Pneumonia Detection</b></h3>

<a href="./pneumonia"><img

src="../static/icons/pneumonia.png" class="img-fluid mx-auto d-block"></a>

</div>

<div class="col md-3">

</div>

</div>
</div>

<h3 class='text-center py-3'

style="color: rgb(255, 255, 255);font-variant: petite-caps;background-color:


rgb(0, 0, 0);margin-bottom:0px">

<b><i>AI in HealthCare</i></b>

</h3>

<div class="row py-3" style='margin-bottom: 30px;'>
  <div class="col">
    <p class="text-left" style='font-size:18px'>
      Artificial intelligence (AI) technologies, increasingly present in modern business and
      everyday life, are also steadily being applied to healthcare. The use of artificial
      intelligence in healthcare has the potential to assist healthcare providers in many
      aspects of patient care and administrative processes. Most AI and healthcare technologies
      have strong relevance to the healthcare field, but the tactics they support can vary
      significantly. And while some articles on artificial intelligence in healthcare suggest
      that it can perform just as well as or better than humans at certain procedures, such as
      diagnosing disease, it will be a significant number of years before AI in healthcare
      replaces humans for a broad range of medical tasks.</p>
  </div>
  <div class="col">
    <img src="../static/healthcure.png" class="img-fluid rounded mx-auto d-block" alt="...">
  </div>
</div>

<h3 class='text-center py-3'
    style="color: rgb(255, 255, 255);font-variant: petite-caps;background-color: rgb(0, 0, 0);margin-bottom:0px">
  <b><i>Machine Learning</i></b>
</h3>

<div class="row py-3" style='margin-bottom: 30px'>
  <div class="col">
    <p class="text-left" style='font-size:18px'>
      Machine learning is one of the most common forms of artificial intelligence in
      healthcare. It is a broad technique at the core of many approaches to AI and healthcare
      technology, and there are many versions of it. In healthcare, the most widespread
      application of traditional machine learning is precision medicine. Being able to predict
      which treatment procedures are likely to be successful with patients, based on their
      make-up and the treatment framework, is a huge leap forward for many healthcare
      organizations. Most machine learning and precision medicine applications in healthcare
      require training data for which the end result is known. This is known as supervised
      learning.</p>
  </div>
  <div class="col">
    <img src="../static/ml.png" class="img-fluid rounded mx-auto d-block" alt="...">
  </div>
</div>

<h3 class='text-center py-3'
    style="color: rgb(255, 255, 255);font-variant: petite-caps;background-color: rgb(0, 0, 0);margin-bottom:0px">
  <b><i>Natural Language Processing</i></b>
</h3>

<div class="row py-3" style='margin-bottom: 30px'>
  <div class="col">
    <p class="text-left" style='font-size:18px'>
      Making sense of human language has been a goal of artificial intelligence and healthcare
      technology for over 50 years. Most NLP systems include forms of speech recognition or
      text analysis followed by translation. A common use of artificial intelligence in
      healthcare involves NLP applications that can understand and classify clinical
      documentation. NLP systems can analyze unstructured clinical notes on patients, giving
      insight into quality of care, improved methods, and better results for patients.</p>
  </div>
  <div class="col">
    <img src="../static/nlp.jpg" class="img-fluid rounded mx-auto d-block"
         style='width:auto; height:300px'
         alt="...">
  </div>
</div>

</div>

<footer class='text-light bg-dark position-relative '>
  <p class='text-center py-1 my-0'>
    Made by Team - 47
  </p>
</footer>

<!-- Option 1: Bootstrap Bundle with Popper -->
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta3/dist/js/bootstrap.bundle.min.js"
        integrity="sha384-JEW9xMcG8R+pH31jmWH6WWP0WintQrMb4s7ZOdauHnUtxwoG2vI5DkLtS3qm9Ekf"
        crossorigin="anonymous"></script>

</body>

</html>

7.3. RESULTS
INITIAL WEBPAGE

Fig 7.6 The homepage of our website lists the following diseases: Pneumonia, Covid,
Alzheimer’s, Breast Cancer, and Diabetes.

Covid Detection

Fig 7.7 The user provides the following input values: First name, Last name, Phone number,
Email, Gender, and Age, and also uploads a chest scan.

Fig 7.8 The model analyzes the image and displays the result accordingly. In this
case, the model reports a positive result.

Fig 7.9 The user provides the following input values: First name, Last name, Phone number,
Email, Gender, and Age, and also uploads a chest scan.

Fig 7.10 The model analyzes the image and displays the result accordingly. In this
case, the model reports a negative result.

Breast Cancer Detection

Fig 7.11 The user provides the following input values: First name, Last name, Phone
number, Email, Gender, Age, concave points mean, area mean, radius mean, perimeter
mean, and concavity mean.

Fig 7.12 The model analyzes the values and displays the result accordingly. In this
case, the model reports the result as malignant (high in effect).

Fig 7.13 The user provides the following input values: First name, Last name, Phone
number, Email, Gender, Age, concave points mean, area mean, radius mean,
perimeter mean, and concavity mean.

Fig 7.14 The model analyzes the values and displays the result accordingly. In this
case, the model reports the result as benign (no effect).

Alzheimer’s Detection

Fig 7.15 The user provides the following input values: First name, Last name, Phone
number, Email, Gender, and Age, and also uploads an MRI scan.

Fig 7.16 The model analyzes the image and displays the result accordingly. In this
case, the model reports the result as Nondemented (no effect).

Diabetes Detection

Fig 7.17 The user provides the following input values: First name, Last name, Phone
number, Email, Gender, Age, number of pregnancies, glucose, blood pressure,
skin thickness, insulin, BMI, and diabetes pedigree.

Fig 7.18 The model analyzes the values and displays the result accordingly. In this
case, the model reports a positive result.

Fig 7.19 The user provides the following input values: First name, Last name, Phone
number, Email, Gender, Age, number of pregnancies, glucose, blood pressure,
skin thickness, insulin, BMI, and diabetes pedigree.

Fig 7.20 The model analyzes the values and displays the result accordingly. In this
case, the model reports a negative result.

Pneumonia Detection

Fig 7.21 The user provides the following input values: First name, Last name, Phone
number, Email, Gender, and Age, and also uploads a chest scan.

Fig 7.22 The model analyzes the image and displays the result accordingly. In this
case, the model reports a positive result.

Fig 7.23 The user provides the following input values: First name, Last name, Phone
number, Email, Gender, and Age, and also uploads a chest scan.

Fig 7.24 The model analyzes the image and displays the result accordingly. In this
case, the model reports a negative result.

8. TESTING

8.1 INTRODUCTION TO TESTING

Software testing is the activity of checking whether the actual results match the
expected results, so that the software system is as free of defects as possible. It
involves executing a software component or system component to evaluate one or
more properties of interest. Testing is a critical phase of software quality assurance
and serves as the final review of the code.

Importance of Testing
Software testing is essential. When this process is skipped, both the product and the
business may suffer. The key points are:

• Software testing saves money
• Provides security
• Improves product quality
• Increases customer satisfaction

Testing can be carried out in different ways. The main idea behind testing is to reduce
errors with minimum time and effort.

Benefits of Testing
• Cost-Effective: This is one of the important advantages of software testing. Testing
an IT project on time saves money in the long term, because bugs caught at an earlier
stage cost less to fix.

• Security: This is the most sensitive aspect of software testing. Users look for
products they can trust, and testing helps remove risks and problems early.

• Product quality: This is an essential requirement of any software product. Testing
helps ensure that a quality product is delivered to customers.

• Customer Satisfaction: The main aim of any product is to satisfy its customers.
UI/UX testing helps ensure the best user experience.

Different types of Testing


Unit Testing: Unit tests are very low level, close to the source of the application. They
consist of testing individual methods and functions of the classes, components or
modules used by the software. Unit tests are generally cheap to automate and can
be run very quickly by a continuous integration server.
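
As an illustration, a unit test might look like the minimal sketch below, written with Python's standard unittest module. The helper label_from_probability and its 0.5 threshold are hypothetical and are not taken from the project code; they only stand in for a small, isolated unit such as the step that turns a model output into a result label.

```python
import unittest


def label_from_probability(probability, threshold=0.5):
    """Hypothetical helper: map a model's output probability to a result label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return "positive" if probability >= threshold else "negative"


class LabelFromProbabilityTest(unittest.TestCase):
    def test_high_probability_is_positive(self):
        self.assertEqual(label_from_probability(0.91), "positive")

    def test_low_probability_is_negative(self):
        self.assertEqual(label_from_probability(0.12), "negative")

    def test_threshold_value_is_positive(self):
        self.assertEqual(label_from_probability(0.5), "positive")

    def test_out_of_range_value_is_rejected(self):
        with self.assertRaises(ValueError):
            label_from_probability(1.7)


# Run the test case explicitly so the example works in any environment.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LabelFromProbabilityTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test exercises one behaviour of one function, which is what keeps unit tests fast and easy to automate.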

Integration Testing: Integration tests verify that the different modules or services used by
the application work well together. For example, they can test the interaction with
the database or make sure that microservices work together as expected. These
tests are more expensive to run because they require multiple parts of the application to be
up and running.

Functional Tests: Functional tests focus on the business requirements of an application.
They only verify the output of an action and do not check the intermediate states of the
system when performing that action.

Integration tests and functional tests are sometimes confused because they
both require multiple components to interact with each other. The difference is that an
integration test may simply verify that the database can be queried, while a functional test
would expect to get a specific value from the database as defined by the product
requirements.
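
The difference described above can be sketched in Python with the standard sqlite3 module. The results table and the helpers save_result and latest_outcome are hypothetical; the project as described does not yet store detection records, so this is only an assumed schema used for illustration.

```python
import sqlite3


def make_database():
    """Create an in-memory database with a hypothetical results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "patient TEXT, disease TEXT, outcome TEXT)"
    )
    return conn


def save_result(conn, patient, disease, outcome):
    """Hypothetical helper: store one prediction record."""
    conn.execute(
        "INSERT INTO results (patient, disease, outcome) VALUES (?, ?, ?)",
        (patient, disease, outcome),
    )
    conn.commit()


def latest_outcome(conn, patient, disease):
    """Hypothetical helper: return the most recent outcome, or None."""
    row = conn.execute(
        "SELECT outcome FROM results WHERE patient = ? AND disease = ? "
        "ORDER BY id DESC LIMIT 1",
        (patient, disease),
    ).fetchone()
    return row[0] if row else None


def test_integration_database_can_be_queried():
    # Integration test: only checks that the code and the database work together.
    conn = make_database()
    save_result(conn, "patient-1", "diabetes", "negative")
    assert conn.execute("SELECT COUNT(*) FROM results").fetchone()[0] == 1


def test_functional_latest_result_is_returned():
    # Functional test: checks the specific value that the requirement asks for.
    conn = make_database()
    save_result(conn, "patient-1", "diabetes", "negative")
    save_result(conn, "patient-1", "diabetes", "positive")
    assert latest_outcome(conn, "patient-1", "diabetes") == "positive"


test_integration_database_can_be_queried()
test_functional_latest_result_is_returned()
```

The first test would still pass if the wrong record were returned; only the second one ties the behaviour to an assumed requirement ("show the most recent result").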

Regression Testing: Regression testing is a crucial stage for the product and helps
developers assess the stability of the product as requirements change.
It is performed to verify that a code change in the software
does not impact the existing functionality of the product.
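
One common way to do this is to record the outputs of a version known to be correct and re-check them after every change. The Python sketch below assumes a hypothetical preprocessing step, normalise_pixels, and a small hand-written baseline; neither is taken from the project code.

```python
def normalise_pixels(pixels):
    """Hypothetical preprocessing step: scale 8-bit pixel values to [0, 1]."""
    return [round(p / 255, 4) for p in pixels]


# Outputs recorded from a version that was known to be correct.
BASELINE = {
    (0, 255): [0.0, 1.0],
    (51, 102, 204): [0.2, 0.4, 0.8],
}


def run_regression_suite():
    """Return the inputs whose output no longer matches the recorded baseline."""
    return [
        inputs
        for inputs, expected in BASELINE.items()
        if normalise_pixels(inputs) != expected
    ]


failures = run_regression_suite()
print("regressions found:", failures)
```

If a later change to normalise_pixels alters any recorded output, the affected input appears in the list and the change can be reviewed before release.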

System Testing: System testing of software or hardware is testing conducted on a
complete integrated system to evaluate the system’s compliance with its specified
requirements. System testing is a series of different tests whose primary purpose is to
fully exercise the computer-based system.

Performance Testing: It checks the speed, response time, reliability, resource usage, and
scalability of a software program under its expected workload. The purpose of
performance testing is not to find functional defects but to eliminate performance
bottlenecks in the software or device.
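
A very small latency measurement can be sketched in Python with time.perf_counter. The function predict below is a hypothetical stand-in for a real model call, and the input values are arbitrary; a real performance test would load the trained model and use a realistic workload.

```python
import time


def predict(features):
    """Hypothetical stand-in for a model call; a real model would be loaded here."""
    return sum(features) / len(features) > 100


def average_latency_seconds(func, argument, repeats=1000):
    """Call func(argument) repeatedly and return the mean time per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(argument)
    return (time.perf_counter() - start) / repeats


latency = average_latency_seconds(predict, [148, 72, 35, 0, 33.6])
print(f"mean latency: {latency * 1000:.4f} ms per call")
```

Comparing this figure against a response-time budget, and repeating the measurement under heavier load, is what reveals bottlenecks rather than functional defects.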

Alpha Testing: This is a form of internal acceptance testing performed mainly by the
in-house software QA and testing teams. Alpha testing is the last testing done by the test
teams at the development site after acceptance testing and before the software is
released for the beta test. It can also be done by potential users or customers of the
application, but it remains a form of in-house acceptance testing.

Beta Testing: This is the testing stage that follows the full internal alpha test cycle. It is
the final testing phase, in which companies release the software to a few external user
groups outside the company's test teams and employees. This initial software version is
known as the beta version. Most companies gather user feedback in this release.

Black Box Testing: Black box testing, also known as Behavioural Testing, is a software
testing method in which the internal structure/design/implementation of the item being
tested is not known to the tester. These tests can be functional or non-functional, though
they are usually functional.

Fig 8.1.1 Blackbox Testing

This method is so named because the software program, in the eyes of the tester, is like a
black box, inside which one cannot see. This method attempts to find errors in the
following categories:

• Incorrect or missing functions
• Interface errors
• Errors in data structures or external database access
• Behaviour or performance errors
• Initialization and termination errors
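
In practice, black box test cases are derived from the specification alone, typically by picking boundary values and representative valid and invalid inputs. The Python sketch below assumes a hypothetical rule, "age is a whole number from 0 to 120", and a hypothetical validator is_valid_age; the tester is assumed to know only the rule, not the function body.

```python
def is_valid_age(value):
    """Hypothetical form validator; the tester treats its body as unknown."""
    return isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 120


# Test cases derived only from the stated rule "age is a whole number from 0 to 120".
BLACK_BOX_CASES = [
    (-1, False),    # just below the lower boundary
    (0, True),      # lower boundary
    (45, True),     # typical valid value
    (120, True),    # upper boundary
    (121, False),   # just above the upper boundary
    ("45", False),  # wrong type, such as raw text from a form
]

mismatches = [
    (value, expected)
    for value, expected in BLACK_BOX_CASES
    if is_valid_age(value) != expected
]
print("mismatches:", mismatches)
```

Any entry in mismatches points to an incorrect or missing function in the sense of the first category above.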

White Box Testing: White box testing (also known as Clear Box Testing, Open Box
Testing, Glass Box Testing, Transparent Box Testing, Code-Based Testing or Structural
Testing) is a software testing method in which the internal
structure/design/implementation of the item being tested is known to the tester. The
tester chooses inputs to exercise paths through the code and determines the appropriate
outputs. Programming know-how and knowledge of the implementation are essential. White
box testing goes beyond the user interface and into the details of a system. This
method is so named because the software program, in the eyes of the tester, is like a
white/transparent box, inside which one can clearly see.

Fig 8.1.2 Whitebox Testing
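
Choosing inputs to exercise each path can be sketched in Python as follows. The function risk_band and its thresholds are hypothetical and purely illustrative (they are not clinical values and are not part of the project); the point is that the test cases are selected by reading the code so that every branch is executed at least once.

```python
def risk_band(glucose, bmi):
    """Hypothetical rule with three paths; thresholds are illustrative only."""
    if glucose >= 140:
        return "high"
    if bmi >= 30:
        return "medium"
    return "low"


# One test per path through the code, chosen by reading the implementation.
PATH_CASES = [
    ((150, 22.0), "high"),    # first condition true
    ((110, 31.5), "medium"),  # first condition false, second true
    ((110, 22.0), "low"),     # both conditions false
]

covered = {risk_band(*args) for args, _ in PATH_CASES}
all_pass = all(risk_band(*args) == expected for args, expected in PATH_CASES)
print("paths covered:", sorted(covered), "all pass:", all_pass)
```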

9. CONCLUSION

There are situations in which a patient needs a doctor's assistance right away but no
doctor is available. In our project, we have created a disease prediction system that
allows patients to test for any of the listed diseases.

The system has both benefits and drawbacks. Its drawback is that it still requires certain
test results and patient scans as input before the model can determine whether a patient
has a particular disease. Its benefit is that, given these inputs, users can check whether
they have a specific condition and take preventive action as necessary.

The main challenge that we faced while working on this project was the limited
availability of data, mainly image data. Data scarcity is a common problem in machine
learning and deep learning, so we had to work with the amount of data we had.

10. FUTURE SCOPE

The scope of this work can be quite vast, and its precise extent depends on the project's
objectives and target diseases. The overarching goal, however, is to improve health
outcomes. As health data becomes more accessible, the project focuses on applying
machine learning to discover patterns and trends in health outcomes and to design more
focused interventions.

As time goes on, we expect to have access to an increasing amount of data, and we will
work to improve the accuracy of our models by training them on a larger volume of data.

We also intend to add more diseases that can be identified from X-ray scans or
from a few numerical inputs.

We also plan to add other functionality, such as the ability for our app to display
warnings and self-care instructions for users who test positive. Records of each
detection will also be kept.

These are the upgrades and enhancements that we intend to make.

