You are on page 1of 11

ABSTRACT

Ultrasound insonation of lungs that are dense with extravascular lung


water (EVLW) produces characteristic reverberation artifacts termed B-lines.
The number of B-lines present demonstrates reasonable correlation to the
amount of EVLW. However, analysis of B-line artifacts generated by this
modality is semi-quantitative relying on visual interpretation, and as a result,
can be subject to inter-observer variability. The purpose of this study was to
translate the use of a novel, quantitative lung ultrasound surface wave
elastography technique (LUSWE) into the bedside assessment of pulmonary
edema in patients admitted with acute congestive heart failure. To prevent this
problem in One of the most interesting (or perhaps most profitable) time series
data using machine learning techniques. Hence, pulmonary disease prediction
has become an important research area. The aim is to predict machine learning
based techniques for pulmonary disease prediction results in best accuracy. The
analysis of dataset by supervised machine learning technique(SMLT) to capture
several information’s like, variable identification, uni-variate analysis, bi-variate
and multi-variate analysis, missing value treatments and analyze the data
validation, data cleaning/preparing and data visualization will be done on the
entire given dataset. To propose a machine learning-based method to accurately
predict the pulmonary diseaseIndex value by prediction results in the form of
pulmonary disease classification best accuracy from comparing supervise
classification machine learning algorithms. Additionally, to compare and
discuss the performance of various machine learning algorithms from the given
dataset with evaluation classification report, identify the confusion matrix and
to categorizing data from priority and the result shows that the effectiveness of
the proposed machine learning algorithm technique can be compared with best
accuracy with precision, Recall and F1 Score.

5
TABLE OF CONTENT

CHAPTE TITLE PAGE.NO


R
01 INTRODUCTION 15
1.1 Data science 16
17
1.1.1 Data Scientist…
17
1.1.2 Required skills for a data scientist 20
20
1.2 Artificial Intelligence
23
1.3 Natural language processing 24
24
1.4 Machine Learning
25
1.5 Existing system
1.5.1 Drawbacks
1.6 Proposed System
1.6.1 Advantages

02 LITERATURE SURVEY 26
AIM AND SCOPE OF PRESENT 31
03 INVESTIGATION
3.1 Aim 31
3.2 Project goals 31
3.3 Scope of the project 32
3.4 Feasibility study 32

6
04 EXPERIMENTAL OR MATERIAL AND 35
METHODS; ALGORITHMS USED 35
4.1 List of Algorithm 35
4.2 Project Requirements 35
4.2.1 Functional Requirements 36
4.2.2 Non-Functional Requirements 37
4.3 Environmental Requirements 41
4.4 Software Description 47
4.4.1 Anaconda Navigator 59
4.4.2 Jupyter Notebook 60
4.5 Python 61
4.6 System Architecture 62
4.7 Workflow Diagram 63
4.8 Use Diagram 64
4.9 Class Diagram 65
4.10 Activity Diagram 66
4.11 Sequence Diagram 66
4.12 Entity Relationship Diagram 68
4.13 Module Description 70
4.13.1 Data preprocessing 72
4.13.2 Data validation/Cleaning /Preparing 78
process 82
4.13.3 Comparing Algorithms with 84
prediction in the form of best accuracy result 84
4.13.4 Deployment
05 RESULT AND DISCUSSION, 86
PERFORMANCE ANALYSIS
5.1 Module Results 86
5.1.1 Logistic Regression 87
5.1.2 Random Forest Classifier 88
5.1.3 Decision Tree Classifier 89
5.1.4 Naïve Bayes Algorithm 90
5.2 Prediction Result by Accuracy 90
06 SUMMARY AND CONCLUSION 93
6.1 Coding 93
6.2 Conclusion 126
6.3 Future Work 126

7
LIST OF FIGURES

SL.NO TITLE PAGE.NO


01 SYSTEM ARCHITECTURE 59
02 WORKFLOW DIAGRAM 60
03 USECASE DIAGRAM 61
04 CLASS DIAGRAM 62
05 ACTIVITY DIAGRAM 63
06 SEQUENCE DIAGRAM 64
07 ER – DIAGRAM 65
08 MODULE DIAGRAM 66

8
LIST OF SYMBOLS

9
S.N NOTATION NOTATION DESCRIPTION
O NAME

Represents a
collection of similar
1. Class
entities grouped
together.

2. Association NAME Associations


represents static
relationships
between classes.
Roles representsthe
way the two classes
see each other.
3. Actor It aggregates several
classes into a single
classes.

Aggregation Interaction between


the system and
4.
external
environment

5. Relation uses Used for additional


(uses) process
communication.

6. Relation EXTENDS Extends relationship


(extends) is used when one use
case is similar to
another use case but
does a bit more.
7. Communicatio Communication
n between various use
cases.

10
8. State State of the process.

9. Initial State Initial state of the


object

10. Final state Final state of the


object
11. Control flow Represents various
control flow between
the states.
12. Decision box Represents decision
making process from
a constraint

Usecase Interact ion between


the system and
13.
external environment.

14. Component Represents physical


modules which is a
collection of
components.

15. Node Represents physical


modules which are a
collection of
components.

16. Data A circle in DFD


represents a state or
Process/State
process which has
been triggered due to
some event or acion.

11
17. External entity Represents external
entities such as
keyboard,sensors,etc.

18. Transition Represents


communication that
occurs between
processes.

19. Object Lifeline Represents the


vertical dimensions
that the object
communications.

20. Message Message Represents the


message exchanged.

CHAPTER 1 : INTRODUCTION

Ultrasound insonation of lungs that are dense with extravascular lung


water (EVLW) produces characteristic reverberation artifacts termed B-lines.
The number of B-lines present demonstrates reasonable correlation to the
amount of EVLW. However, analysis of B-line artifacts generated by this
modality is semi-quantitative relying on visual interpretation, and as a result,
can be subject to inter-observer variability. The purpose of this study was to
translate the use of a novel, quantitative lung ultrasound surface wave
elastography technique (LUSWE) into the bedside assessment of pulmonary
edema in patients admitted with acute congestive heart failure. To prevent this
problem in One of the most interesting (or perhaps most profitable) time series
data using machine learning techniques. Hence, pulmonary disease prediction
has become an important research area. The aim is to predict machine learning

12
based techniques for pulmonary disease prediction results in best accuracy. The
analysis of dataset by supervised machine learning technique(SMLT) to capture
several information’s like, variable identification, uni-variate analysis, bi-variate
and multi-variate analysis, missing value treatments and analyze the data
validation, data cleaning/preparing and data visualization will be done on the
entire given dataset. To propose a machine learning-based method to accurately
predict the pulmonary diseaseIndex value by prediction results in the form of
pulmonary disease classification best accuracy from comparing supervise
classification machine learning algorithms. Additionally, to compare and
discuss the performance of various machine learning algorithms from the given
dataset with evaluation classification report, identify the confusion matrix and
to categorizing data from priority and the result shows that the effectiveness of
the proposed machine learning algorithm technique can be compared with best
accuracy with precision, Recall and F1 Score.

1.1 DATA SCIENCE:

Data science is an interdisciplinary field that uses scientific methods,


processes, algorithms and systems to extract knowledge and insights from
structured and unstructured data, and apply knowledge and actionable insights
from data across a broad range of application domains.

The term "data science" has been traced back to 1974, when Peter
Naur proposed it as an alternative name for computer science. In 1996, the
International Federation of Classification Societies became the first conference
to specifically feature data science as a topic. However, the definition was still
in flux.

13
The term “data science” was first coined in 2008 by D.J. Patil, and Jeff
Hammerbacher, the pioneer leads of data and analytics efforts at LinkedIn and
Facebook. In less than a decade, it has become one of the hottest and most
trending professions in the market.

Data science is the field of study that combines domain expertise, programming
skills, and knowledge of mathematics and statistics to extract meaningful
insights from data.

Data science can be defined as a blend of mathematics, business acumen, tools,


algorithms and machine learning techniques, all of which help us in finding out
the hidden insights or patterns from raw data which can be of major use in the
formation of big business decisions.

1.1.1 Data Scientist:

Data scientists examine which questions need answering and where to


find the related data. They have business acumen and analytical skills as well as
the ability to mine, clean, and present data. Businesses use data scientists to
source, manage, and analyze large amounts of unstructured data.

1.1.2 Required Skills for a Data Scientist:

● Programming: Python, SQL, Scala, Java, R, MATLAB.


● Machine Learning: Natural Language Processing, Classification, Clustering.
● Data Visualization: Tableau, SAS, D3.js, Python, Java, R libraries.
● Big data platforms: MongoDB, Oracle, Microsoft Azure, Cloudera.
14
1.2 ARTIFICIAL INTELLIGENCE

Artificial intelligence (AI) refers to the simulation of human intelligence


in machines that are programmed to think like humans and mimic their actions.
The term may also be applied to any machine that exhibits traits associated with
a human mind such as learning and problem-solving.

Artificial intelligence (AI) is intelligence demonstrated by machines, as opposed


to the natural intelligence displayed by humans or animals. Leading AI
textbooks define the field as the study of "intelligent agents" any system that
perceives its environment and takes actions that maximize its chance of
achieving its goals. Some popular accounts use the term "artificial intelligence"
to describe machines that mimic "cognitive" functions that humans associate
with the human mind, such as "learning" and "problem solving", however this
definition is rejected by major AI researchers.

Artificial intelligence is the simulation of human intelligence processes by


machines, especially computer systems. Specific applications of AI
include expert systems, natural language processing, speech recognition
and machine vision.

AI applications include advanced web search engines, recommendation systems


(used by Youtube, Amazon and Netflix), Understanding human speech (such
as Siri or Alexa), self-driving cars (e.g. Tesla), and competing at the highest
level in strategic game systems (such as chess and Go), As machines become
increasingly capable, tasks considered to require "intelligence" are often
removed from the definition of AI, a phenomenon known as the AI effect. For
instance, optical character recognition is frequently excluded from things
considered to be AI, having become a routine technology.

Artificial intelligence was founded as an academic discipline in 1956, and in the


years since has experienced several waves of optimism, followed by

15

You might also like