
GURU NANAK DEV UNIVERSITY, AMRITSAR

Six Months Industrial Training Report On

"DIABETIC RETINOPATHY DETECTION"

Bachelor of Technology
In
Computer Science & Engineering
Batch (2016-2020)

SUBMITTED TO: DR. PRABHSIMRAN SINGH
SUBMITTED BY: SANDEEP SINGH
ROLL NO: 2016CSA1180
CLASS: B.TECH 8TH SEM (SEC B)

DEPARTMENT OF COMPUTER ENGINEERING AND TECHNOLOGY

ATTESTATION OF AUTHORSHIP

I hereby declare that this submission for partial fulfilment of the requirements
for the degree of the B.Tech course is my own work and that, to the best of my
knowledge and belief, it contains no written material which to a substantial extent
has been accepted for the qualification of any other degree or diploma of a
university or other institution of higher learning, except where due acknowledgement
is made.

Name of the Student: SANDEEP SINGH

Roll No: 2016CSA1180

Branch & Semester: Computer Science and Engineering, 8th Sem

PREFACE

Technology has grown rapidly in the past two to three decades. An engineer without
practical knowledge and skills cannot survive in this technical field. Theoretical
knowledge does matter, but it is practical knowledge that makes the difference between
the best and the better. Industry also prefers experienced engineers over freshers
because of the former's practical knowledge and industrial exposure. Practical training
is highly conducive to building a solid foundation for:

1. Knowledge and personality.

2. Exposure to industrial environment.

3. Confidence building.

4. Enhancement of creativity.

5. Practicality.

ACKNOWLEDGEMENT

While presenting this report I would like to express my deep sense of gratitude to
the entire Centre for Development of Advanced Computing (C-DAC) staff, who were an
indispensable part of my training, giving me unending guidance, inspiration and
encouragement and providing an excellent environment throughout my training at
C-DAC. The training was an extremely productive and enriching experience, not only
technically but also in the practical skills it provided.

I am extremely thankful to Dr. Sanjay Madan, who devoted a lot of time to guiding
and supervising me during my training.

ABSTRACT

Industrial training is an important phase of a student's life. A well-planned, properly
executed and evaluated industrial training helps a lot in developing a professional
attitude. It develops an awareness of the industrial approach to problem solving,
based on a broad understanding of the processes and mode of operation of an organization.
The aim and motivation of this industrial training is to acquire discipline, skills,
teamwork and technical knowledge through a proper training environment, which
will help me, as a student in the field of Information Technology, to develop a
responsiveness to the self-disciplinary nature of problems in Computer Science
and Engineering (CSE). During a period of six months of training at the Centre for
Development of Advanced Computing (CDAC), I was assigned to create a more efficient
system that detects diabetic retinopathy, in order to help the medical field automate
detection of the disease and reduce human error. Beyond meeting the minimum requirements
of the company, the system will help the medical field easily detect this common
complication, which is reported to affect nearly 80 percent of people who have had
diabetes for 20 years or more. Throughout this industrial training, I learned a new
programming language required for the development of the system and the process of
developing AI-based software, and was able to apply what I have learnt during the past
years as a student of Bachelor of Technology in Computer Science & Engineering at
Guru Nanak Dev University, Amritsar.

COMPANY PROFILE

Centre for Development of Advanced Computing (C-DAC) is the premier R&D organization
of the Ministry of Electronics and Information Technology (MeitY) for carrying out
R&D in IT, electronics and associated areas. Different areas of C-DAC originated at
different times, many of which came out as a result of the identification of
opportunities. C-DAC itself was set up in 1988 to build supercomputers, in the context
of the denial of the import of supercomputers by the USA. Since then C-DAC has been
building multiple generations of supercomputers, starting from PARAM with 1 GF in 1988.
The National Centre for Software Technology, set up in 1985, had also initiated work
in Indian language computing around the same period. The Electronics Research and
Development Centre of India, with various constituents starting as adjunct entities
of various State Electronic Corporations, was brought under the hold of the Department
of Electronics and Telecommunications around 1988.

C-DAC has today emerged as a premier R&D organization in IT&E in the country, working
on strengthening national technological capabilities in the context of global
developments in the field and responding to changes in market needs in selected
foundation areas. In that process, C-DAC works in close conjunction with MeitY to
realize the nation's policy and pragmatic interventions and initiatives in Information
Technology. C-DAC has been at the forefront of the Information Technology revolution,
constantly building capacities in emerging/enabling technologies and innovating and
leveraging its expertise, calibre and skill sets to develop and deploy IT products and
solutions for different sectors of the economy, as per the mandate of its parent, the
Ministry of Electronics and Information Technology, Government of India, and other
stakeholders including funding agencies, collaborators, users and the marketplace.

TABLE OF CONTENTS

DESCRIPTION Page Number

ATTESTATION OF AUTHORSHIP 2

PREFACE 3

ACKNOWLEDGEMENT 4

ABSTRACT 4

COMPANY PROFILE 6

1. Introduction 10

1.1. Intelligence 10

1.1.1. Components of intelligence. 10

1.2. Artificial Intelligence 14

1.2.1. What is Artificial Intelligence? 14

1.2.2. Why is Artificial Intelligence important? 16

1.2.3. Steps of making Artificial Intelligence model. 17

2. Project description 23

2.1. Information about fundus image. 23

2.2. Diabetic Retinopathy 24

2.2.1. What is Diabetic Retinopathy? 24

2.2.2. Symptoms of Diabetic Retinopathy. 25

2.2.3. Cause of Diabetic Retinopathy. 25

2.2.4. Risks. 26

2.2.5. Stages of Diabetic Retinopathy. 26

2.2.6. Diagnosis of Diabetic Retinopathy. 28

2.3. Steps involved in making of Diabetic Retinopathy model. 28

3. Development Tools and Technologies 33


Data Flow Diagrams (DFD) 36

SYSTEM REQUIREMENTS 37

CONCLUSION 38

REFERENCES 39

FIGURES

Figure 1. Data Gathering 18


Figure 1.2. Straight Line Equation. 20
Figure 1.3. Training Data. 20
Figure 1.4. Prediction by input training data. 22
Figure 1.5. Prediction by giving output. 23
Figure 2. Left and Right eye fundus image of retina. 24
Figure 2.1. Normal Eye and Diabetic Retinopathy Eye 25

Figure 2.2. Stages of Diabetic retinopathy. 27
Figure 2.3. Stages of an ML Lifecycle. 29
Figure 2.4. Data set containing retina images of the eye. 30
Figure 2.5. Model with Layers. 32
Data Flow Diagrams (DFD)
DFD 1. Basic Model Representation. 36
DFD 2. Detailed DFD of Model. 36

1. INTRODUCTION

1.1 What Is Intelligence?

All but the simplest human behaviour is ascribed to intelligence, while even the most
complicated insect behaviour is never taken as an indication of intelligence. What is
the difference? Consider the behaviour of the digger wasp, Sphex ichneumoneus. When the
female wasp returns to her burrow with food, she first deposits it on the threshold,
checks for intruders inside her burrow, and only then, if the coast is clear, carries her
food inside. The real nature of the wasp's instinctual behaviour is revealed if the food
is moved a few inches away from the entrance to her burrow while she is inside: on
emerging, she will repeat the whole procedure as often as the food is displaced.

Psychologists generally do not characterize human intelligence by just one trait but
by the combination of many diverse abilities.

1.1.1 Components of intelligence

Research in AI has focused chiefly on the following components of intelligence:
learning, reasoning, problem solving, perception, and using language.

Learning

There are a number of different forms of learning as applied to artificial
intelligence. The simplest is learning by trial and error. For example, a
simple computer program for solving mate-in-one chess problems might try
moves at random until mate is found. The program might then store the solution
with the position so that the next time the computer encountered the same
position it would recall the solution. This simple memorizing of individual
items and procedures, known as rote learning, is relatively easy to implement
on a computer. More challenging is the problem of implementing what is called
generalization.

Generalization involves applying past experience to analogous new situations. For
example, a program that learns the past tense of regular English verbs by rote will
not be able to produce the past tense of a word such as jump unless it previously
had been presented with jumped, whereas a program that is able to generalize can
learn the "add -ed" rule and so form the past tense of jump based on experience with
similar verbs.
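To make the distinction concrete, the sketch below shows rote learning as a memoized
trial-and-error search in Python. The helpers legal_moves and is_mate are hypothetical
stand-ins for a real chess engine's API, not functions from any actual library.

import random

# Rote memory: maps a board position to a solution found earlier.
solved_positions = {}

def solve_mate_in_one(position, legal_moves, is_mate):
    # legal_moves and is_mate are hypothetical stand-ins for a real
    # chess engine's API.
    if position in solved_positions:
        return solved_positions[position]      # rote recall, no search
    moves = list(legal_moves(position))
    random.shuffle(moves)                      # trial and error
    for move in moves:
        if is_mate(position, move):
            solved_positions[position] = move  # memorize for next time
            return move
    return None

Note that this program never generalizes: a position it has not seen before always
triggers a fresh random search.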

Reasoning

To reason is to draw inferences appropriate to the situation. Inferences are
classified as either deductive or inductive. An example of the former is, "Fred must
be in either the museum or the café. He is not in the café; therefore, he is in the
museum," and of the latter, "Previous accidents of this sort were caused by instrument
failure; therefore this accident was caused by instrument failure." The most
significant difference between these forms of reasoning is that in the deductive case
the truth of the premises guarantees the truth of the conclusion, whereas in the
inductive case the truth of the premise lends support to the conclusion without giving
absolute assurance. Inductive reasoning is common in science, where data are collected
and tentative models are developed to describe and predict future behaviour until the
appearance of anomalous data forces the model to be revised. Deductive reasoning is
common in mathematics and logic, where elaborate structures of irrefutable theorems
are built up from a small set of basic axioms and rules. There has been considerable
success in programming computers to draw inferences, especially deductive inferences.
However, true reasoning involves more than just drawing inferences; it involves
drawing inferences relevant to the solution of the particular task or situation.

Problem Solving

Problem solving, particularly in artificial intelligence, may be characterized as a
systematic search through a range of possible actions in order to reach some predefined
goal or solution. Problem-solving methods divide into special purpose and general
purpose. A special-purpose method is tailor-made for a particular problem and often
exploits very specific features of the situation in which the problem is embedded. In
contrast, a general-purpose method is applicable to a wide variety of problems. One
general-purpose technique used in AI is means-end analysis: a step-by-step, or
incremental, reduction of the difference between the current state and the final goal.

The program selects actions from a list of means; in the case of a simple robot this
might consist of pickup, putdown, move forward, move back, move left, and move right,
until the goal is reached.
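As a rough illustration (a toy sketch, not the historical means-end analysis
programs), the following Python loop repeatedly picks whichever robot action most
reduces the remaining difference between the current state and the goal on a
simple grid:

# The robot's available "means" and the state change each produces.
ACTIONS = {
    "move right": (1, 0),
    "move left": (-1, 0),
    "move forward": (0, 1),
    "move back": (0, -1),
}

def difference(state, goal):
    # The "difference" to be reduced: city-block distance to the goal.
    return abs(goal[0] - state[0]) + abs(goal[1] - state[1])

def means_end(state, goal):
    plan = []
    while state != goal:
        # Pick the action whose result leaves the smallest difference.
        name, delta = min(
            ACTIONS.items(),
            key=lambda kv: difference(
                (state[0] + kv[1][0], state[1] + kv[1][1]), goal
            ),
        )
        state = (state[0] + delta[0], state[1] + delta[1])
        plan.append(name)
    return plan

print(means_end((0, 0), (2, 1)))
# ['move right', 'move right', 'move forward']

On an obstacle-free grid this difference reduction always terminates; fuller
means-end analysis systems also handle actions whose preconditions must first be
achieved as sub-goals.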

Many diverse problems have been solved by artificial intelligence programs. Some
examples are finding the winning move in a board game, devising mathematical proofs,
and manipulating "virtual objects" in a computer-generated world.

Perception

In perception the environment is scanned by means of various sensory organs, real or
artificial, and the scene is decomposed into separate objects in various spatial
relationships. Analysis is complicated by the fact that an object may appear different
depending on the angle from which it is viewed, the direction and intensity of
illumination in the scene, and how much the object contrasts with the surrounding
field. At present, artificial perception is sufficiently well advanced to enable
optical sensors to identify individuals, autonomous vehicles to drive at moderate
speeds on the open road, and robots to roam through buildings collecting empty soda
cans. One of the earliest systems to integrate perception and action was Freddy, a
stationary robot with a moving television eye and a pincer hand, constructed at the
University of Edinburgh, Scotland, during the period 1966–73 under the direction of
Donald Michie. Freddy was able to recognize a variety of objects and could be
instructed to assemble simple artefacts, such as a toy car, from a random heap of
components.

Language

A language is a system of signs having meaning by convention. In this sense, language
need not be confined to the spoken word. Traffic signs, for example, form a
minilanguage, it being a matter of convention that ⚠ means "hazard ahead" in some
countries. It is distinctive of languages that linguistic units possess meaning by
convention, and linguistic meaning is very different from what is called natural
meaning, exemplified in statements such as "Those clouds mean rain" and "The fall in
pressure means the valve is malfunctioning."

An important characteristic of full-fledged human languages, in contrast to birdcalls
and traffic signs, is their productivity. A productive language can formulate an
unlimited variety of sentences. It is relatively easy to write computer programs that
seem able, in severely restricted contexts, to respond fluently in a human language to
questions and statements. Although none of these programs actually understands
language, they may, in principle, reach the point where their command of a language is
indistinguishable from that of a normal human. What, then, is involved in genuine
understanding, if even a computer that uses language like a native human speaker is
not acknowledged to understand? There is no universally agreed upon answer to this
difficult question. According to one theory, whether or not one understands depends
not only on one's behaviour but also on one's history: in order to be said to
understand, one must have learned the language and have been trained to take one's
place in the linguistic community by means of interaction with other language users.

1.2.1. What is Artificial Intelligence?

Artificial intelligence (AI) is the ability of a digital computer or computer-controlled
robot to perform tasks commonly associated with intelligent beings. The term is
frequently applied to the project of developing systems endowed with the intellectual
processes characteristic of humans, such as the ability to reason, discover meaning,
generalize, or learn from past experience. Since the development of the digital
computer in the 1940s, it has been demonstrated that computers can be programmed to
carry out very complex tasks, for example, discovering proofs for mathematical
theorems or playing chess with great proficiency. Still, despite continuing advances
in computer processing speed and memory capacity, there are as yet no programs that
can match human flexibility over wider domains or in tasks requiring much everyday
knowledge. On the other hand, some programs have attained the performance levels of
human experts and professionals in performing certain specific tasks, so that
artificial intelligence in this limited sense is found in applications as diverse as
medical diagnosis, computer search engines, and voice or handwriting recognition.

1.2.2. Why is Artificial Intelligence important?

1. AI automates repetitive learning and discovery through data. But AI is different from
hardware-driven, robotic automation. Instead of automating manual tasks, AI performs
frequent, high-volume, computerized tasks reliably and without fatigue. For this type
of automation, human inquiry is still essential to set up the system and ask the right
questions.

2. AI adds intelligence to existing products. In most cases, AI will not be sold as an

individual application. Rather, products you already use will be improved with AI
capabilities, much like Siri was added as a feature to a new generation of Apple products.
Automation, conversational platforms, bots and smart machines can be combined with large
amounts of data to improve many technologies at home and in the workplace, from security
intelligence to investment analysis.

3. AI adapts through progressive learning algorithms to let the data do the programming.

AI finds structure and regularities in data so that the algorithm acquires a skill: The
algorithm becomes a classifier or a predictor. So, just as the algorithm can teach itself
how to play chess, it can teach itself what product to recommend next online. And the
models adapt when given new data. Backpropagation is an AI technique that allows
the model to adjust, through training and added data, when the first answer is not
quite right.

4. AI analyses more and deeper data using neural networks that have many hidden layers.
Building a fraud detection system with five hidden layers was almost impossible a few
years ago. All that has changed with incredible computer power and big data. You need
lots of data to train deep learning models because they learn directly from the data.
The more data you can feed them, the more accurate they become.

5. AI achieves incredible accuracy through deep neural networks – which was previously
impossible. For example, your interactions with Alexa, Google Search and Google Photos
are all based on deep learning – and they keep getting more accurate the more we use
them. In the medical field, AI techniques from deep learning, image classification and
object recognition can now be used to find cancer on MRIs with the same accuracy as
highly trained radiologists.

6. AI gets the most out of data. When algorithms are self-learning, the data itself
can become intellectual property. The answers are in the data; you just have to apply
AI to get them out. Since the role of the data is now more important than ever before,
it can create a competitive advantage. If you have the best data in a competitive
industry, even if everyone is applying similar techniques, the best data will win.

1.2.3. Steps of making Artificial Intelligence model

Consider a simple running example: teaching a model to tell wine apart from beer.
The power of machine learning is that the model, rather than human judgement and
manual rules, determines how to differentiate between the two. You can extrapolate
the ideas presented here to other problem domains as well, where the same principles
apply:
1. Gathering data

2. Preparing that data

3. Choosing a model

4. Training

5. Evaluation

6. Hyperparameter tuning

7. Prediction.

Gathering Data

Once we have our equipment and booze, it’s time for our first real step of machine
learning: gathering data. This step is very important because the quality and
quantity of data that you gather will directly determine how good your predictive
model can be. In this case, the data we collect will be the colour and the alcohol
content of each drink.

Figure 1. Data Gathering

This will yield a table of colour, alcohol%, and whether it’s beer or wine. This
will be our training data.
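A minimal sketch of what the gathered table might look like in Python, using pandas;
the numbers are illustrative, not real measurements:

import pandas as pd

# A toy version of the gathered table: colour (here a simple numeric
# hue value) and alcohol content for each drink, with its known label.
training_data = pd.DataFrame({
    "colour":  [610, 599, 693, 430, 448, 445],     # hypothetical hue units
    "alcohol": [5.0, 4.7, 5.5, 12.0, 13.5, 11.8],  # % by volume
    "label":   ["beer", "beer", "beer", "wine", "wine", "wine"],
})
print(training_data)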

Data Preparation

A few hours of measurements later, we have gathered our training data. Now
it’s time for the next step of machine learning: Data preparation, where we load
our data into a suitable place and prepare it for use in our machine learning
training. We’ll first put all our data together, and then randomize the ordering.
We don’t want the order of our data to affect what we learn, since that’s not part
of determining whether a drink is beer or wine. In other words, we make a
determination of what a drink is, independent of what drink came before or after
it.

This is also a good time to do any pertinent visualizations of your data, to help you
see if there are any relevant relationships between different variables you can take
advantage of, as well as to show you if there are any data imbalances. For example,
if we collected way more data points about beer than wine, the model we train will
be biased toward guessing that virtually everything that it sees is beer, since it
would be right most of the time. We also need to split the data into two parts. The
first part, used in training our model, will be the majority of the dataset. The
second part will be used for evaluating our trained model's performance. We don't
want to use the same data that the model was trained on for evaluation, since it
could then just memorize the "questions", just as you wouldn't use the same questions
from your math homework on the exam. Sometimes the data we collect needs other forms
of adjusting and manipulation: things like de-duping, normalization, error correction,
and more. These would all happen at the data preparation step. In our case, we don't
have any further data preparation needs, so let's move forward.
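Continuing the toy table from the previous sketch, the shuffle and the
training/evaluation split described above might look like this with scikit-learn:

from sklearn.model_selection import train_test_split

# Shuffle the rows and hold out 20% of them for the evaluation step.
X = training_data[["colour", "alcohol"]]
y = training_data["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42
)
print(len(X_train), "training rows,", len(X_test), "evaluation rows")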

Choosing a model

The next step in our workflow is choosing a model. There are many models that
researchers and data scientists have created over the years. Some are very well suited
for image data, others for sequences (like text, or music), some for numerical data,
others for text-based data. In our case, since we only have two features, colour and
alcohol percentage, we can use a small linear model, which is a fairly simple one that
should get the job done.

Training

Now we move on to what is often considered the bulk of machine learning: the training.
In this step, we will use our data to incrementally improve our model's ability to
predict whether a given drink is wine or beer.
In some ways, this is similar to someone first learning to drive. At first, they
don't know how any of the pedals, knobs, and switches work, or when any of them
should be used. However, after lots of practice and correcting for their mistakes,
a licensed driver emerges. Moreover, after a year of driving, they've become quite
adept. The act of driving and reacting to real-world data has adapted their driving
abilities, honing their skills.

Figure 1.2. Straight Line Equation.

We will do this on a much smaller scale with our drinks. In particular, the formula
for a straight line is y = m*x + b, where x is the input, m is the slope of that line,
b is the intercept, and y is the value of the line at the position x. The values we
have available to us for adjusting, or "training", are m and b. In machine learning
there are many m's, since there may be many features; the collection of these values
is usually written as a weight matrix W. The training process involves initializing
some random values for W and b and attempting to predict the output with those values.
As you might imagine, it does pretty poorly. But we can compare our model's
predictions with the output that it should produce, and adjust the values in W and b
such that we will have more correct predictions.

Figure 1.3. Training Data.
This process then repeats. Each iteration or cycle of updating the weights and biases
is called one training "step".

Let's look at what that means in this case, more concretely, for our dataset. When we
first start the training, it's like we drew a random line through the data. Then as
each step of the training progresses, the line moves, step by step, closer to an ideal
separation of the wine and beer.
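The following sketch makes this concrete with made-up 1-D data: a small NumPy
gradient-descent loop that nudges m and b a little on each training step until the
line fits.

import numpy as np

# Made-up data lying roughly on y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])

rng = np.random.default_rng(0)
m, b = rng.standard_normal(), rng.standard_normal()  # random start
learning_rate = 0.01

for step in range(1000):  # each loop pass is one training "step"
    error = (m * x + b) - y
    # gradients of the mean squared error with respect to m and b
    m -= learning_rate * 2 * np.mean(error * x)
    b -= learning_rate * 2 * np.mean(error)

print(f"m = {m:.2f}, b = {b:.2f}")  # should end up near m = 2, b = 1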

Evaluation

Once training is complete, it's time to see if the model is any good, using
evaluation. This is where the dataset that we set aside earlier comes into play.
Evaluation allows us to test our model against data that has never been used for
training. This metric allows us to see how the model might perform against data
that it has not yet seen, and is meant to be representative of how the model might
perform in the real world. A good rule of thumb I use is a training-evaluation
split somewhere on the order of 80/20 or 70/30. Much of this depends on the size
of the original source dataset: if you have a lot of data, perhaps you don't need as
big of a fraction for the evaluation dataset.
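Using the held-out split from the data-preparation sketch, a minimal evaluation with
scikit-learn might look like this (a simple logistic-regression classifier stands in
for our small linear model):

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Fit on the training rows, then score on rows never seen in training.
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print("held-out accuracy:", accuracy_score(y_test, predictions))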

Parameter Tuning

Once you’ve done evaluation, it’s possible that you want to see if you can
further improve your training in any way. We can do this by tuning our
parameters. There were a few parameters we implicitly assumed when we did
our training, and now is a good time to go back and test those assumptions and
try other values.

21
Figure 1.4. Prediction by input training data.
One such parameter is how many times we run through the training dataset. Another
is the "learning rate", which defines how far we shift the line during each step,
based on the information from the previous training step. These values all play a
role in how accurate our model can become, and how long the training takes. For more
complex models, initial conditions can play a significant role in determining the
outcome of training. Differences can be seen depending on whether a model starts off
training with values initialized to zeroes versus some distribution of values, which
leads to the question of which distribution to use.

As you can see there are many considerations at this phase of training, and it’s
important that you define what makes a model “good enough”, otherwise you
might find yourself tweaking parameters for a very long time.

These parameters are typically referred to as "hyperparameters". The adjustment, or
tuning, of these hyperparameters remains a bit of an art, and is more of an
experimental process that heavily depends on the specifics of your dataset, model,
and training process.
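A minimal sketch of what such tuning can look like, re-running the straight-line
training loop from earlier over a few candidate learning rates (the candidate values
are illustrative):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])

best_rate, best_loss = None, float("inf")
for rate in (0.001, 0.01, 0.1):
    m, b = 0.0, 0.0
    for step in range(1000):
        error = (m * x + b) - y
        m -= rate * 2 * np.mean(error * x)
        b -= rate * 2 * np.mean(error)
    loss = float(np.mean(((m * x + b) - y) ** 2))
    if loss < best_loss:           # keep the rate with the lowest error
        best_rate, best_loss = rate, loss

print("best learning rate:", best_rate)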

Prediction

Machine learning is using data to answer questions. So prediction, or inference, is
the step where we get to answer some questions. This is the point of all this work,
where the value of machine learning is realized.

Figure 1.5. Prediction by giving output.

We can finally use our model to predict whether a given drink is wine or beer,
given its colour and alcohol percentage.
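For example, reusing the classifier fitted in the evaluation sketch (the measurements
here are made up):

import pandas as pd

# Ask the trained model about a brand-new drink.
new_drink = pd.DataFrame({"colour": [602], "alcohol": [5.2]})
print(model.predict(new_drink))   # expected to come out as 'beer'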

2. PROJECT DESCRIPTION

Diabetic retinopathy (die-uh-BET-ik ret-ih-NOP-uh-thee) is a diabetes complication
that affects the eyes. It's caused by damage to the blood vessels of the
light-sensitive tissue at the back of the eye (retina). At first, diabetic
retinopathy may cause no symptoms or only mild vision problems. Eventually, it can
cause blindness. The condition can develop in anyone who has type 1 or type 2
diabetes. The longer you have diabetes and the less controlled your blood sugar is,
the more likely you are to develop this eye complication.

2.1. Fundus Photography

Fundus photography involves photographing the rear of an eye, also known as the
fundus. Specialized fundus cameras, consisting of an intricate microscope attached
to a flash-enabled camera, are used in fundus photography.

Figure 2. Left and Right eye fundus image of retina.

2.2.1. What is Diabetic Retinopathy?

Diabetic retinopathy is an eye condition that causes changes to the blood vessels
in the part of your eye called the retina. That's the lining at the back of your eye
that changes light into images. The blood vessels can swell, leak fluid, or bleed,
which often leads to vision changes or blindness. It usually affects both eyes.
When left untreated, diabetic retinopathy can scar and damage your retina.

Diabetic retinopathy is the most common cause of vision loss for people with
diabetes. It's the leading cause of blindness for adults in the U.S. There are no
symptoms when this disease starts, so it is important to get your eyes checked
regularly.

Figure 2.1. Normal Eye and Diabetic Retinopathy Eye

2.2.2 Symptoms

You might not have any signs of diabetic retinopathy until it becomes serious.
When you do have symptoms, you might notice:

 Loss of central vision, which is used when you read or drive

 Not being able to see colours

 Blurry vision

 Holes or black spots in your vision

 Floaters, or small spots in your vision caused by bleeding

2.2.3 Causes

If your blood glucose level (blood sugar) is too high for too long, it blocks off
the small blood vessels that keep your retina healthy. Your eye will try to grow
new blood vessels, but they won't develop well. The blood vessels start to weaken.
They can leak blood and fluid into your retina. This can cause another condition
called macular edema. It can make your vision blurry.

As your condition gets worse, more blood vessels become blocked. Scar tissue
builds up because of the new blood vessels your eye has grown. This extra
pressure can cause your retina to tear or detach.

This can also lead to eye conditions like glaucoma or cataracts (the clouding
of your eye’s lens) that may result in blindness.

2.2.4 Risks

If you have any form of diabetes (type 1, type 2, or gestational), you may get
diabetic retinopathy. Your chance goes up the longer you have diabetes. Almost
half of Americans diagnosed with diabetes have some stage of diabetic retinopathy,
and only about half of them know they have this disease. Other things that can
raise your odds of diabetic retinopathy include:
 High blood pressure

 High cholesterol

 Tobacco use

 Being African American, Hispanic, or Native American

2.2.5 Stages

Diabetic retinopathy tends to go through these four stages:

1. Mild nonproliferative retinopathy. In the disease's earliest stage, tiny blood
vessels in your retina change. Small areas swell; these are called microaneurysms.
Fluid can leak out of them and into your retina.

2. Moderate nonproliferative retinopathy. As your disease gets worse, blood vessels
that should keep your retina healthy swell and change shape. They can't deliver blood
to your retina. This can change the way your retina looks. These blood vessel changes
can trigger diabetic macular edema (DME). That's swelling in the area of your retina
called the macula.

3. Severe nonproliferative retinopathy. In the third stage, many blood vessels get
blocked. They can't deliver blood to your retina to keep it healthy. Areas of your
retina where this happens make special proteins called growth factors that tell
your retina to grow new blood vessels.

4. Proliferative diabetic retinopathy (PDR). This is the most advanced stage. New
blood vessels grow inside your retina and then into the jelly inside your eyeball,
called the vitreous humour. Fragile new blood vessels are more likely to leak fluid
and bleed. Scar tissue starts to form. This can cause retinal detachment, when
your retina pulls away from the tissue underneath. This can lead to permanent
blindness.

Figure 2.2. Stages of Diabetic retinopathy.

2.2.6 Diagnosis

Your eye doctor can usually tell if you have diabetic retinopathy during your eye
exam.

Pupil dilation. Your doctor will dilate your pupils to look for any changes in your
eye's blood vessels, or to see if any new ones have grown. They'll also see if your
retina is swollen or detached.

Fluorescein angiogram. This test can tell your doctor if you have DME or severe
diabetic retinopathy. It shows if any of your blood vessels are leaking or damaged.
Your doctor will give you an injection of fluorescent dye into a vein in your arm.
When the dye reaches your eyes, your doctor will be able to see images of the blood
vessels in your retina and spot any serious problems.
2.3 Steps involved in making of Diabetic retinopathy model

In this section we are going to study in depth how the process of developing a
machine learning model is done. There will be a lot of concepts explained, and we
will reserve others that are more specific for later sections. Concretely, it will
be discussed how to:

Define our problem adequately (objective, desired outputs…).

Gather data.

Choose a measure of success.

Set an evaluation protocol and the different protocols available.

Prepare the data (dealing with missing values, with categorical values…).

Split the data correctly.

Differentiate between overfitting and underfitting, defining what they are and
explaining the best ways to avoid them.

An overview of how a model learns.

What regularization is and when it is appropriate to use it.

Develop a benchmark model.

Choose an adequate model and tune it to get the best performance possible.

Figure 2.3. Stages of an ML Lifecycle.

Define the Problem Appropriately According to Business Need

The first, and one of the most critical, things to do is to find out what the
inputs and the expected outputs are. The following questions must be answered:

What is the main objective? What are we trying to predict?

What are the target features?

What is the input data? Is it available?

What kind of problem are we facing? Binary classification? Clustering?

What is the expected improvement?

What is the current status of the target feature?

How is the target feature going to be measured?

Not every problem can be solved; until we have a working model we can only make
certain hypotheses:

Our outputs can be predicted given the inputs.

Our available data is sufficiently informative to learn the relationship between
the inputs and the outputs.

It is crucial to keep in mind that machine learning can only be used to memorize
patterns that are present in the training data, so we can only recognize what we
have seen before. When using machine learning we are making the assumption that
the future will behave like the past, and this isn't always true.

Collect Data
This is the first real step towards the real development of a machine learning model:
collecting data. This is a critical step that cascades into how good the model will
be; the more and better data that we get, the better our model will perform.

There are several techniques to collect the data, like web scraping, but they are
out of the scope of this report. For this project I collected data in the field from
hospitals, and online from Kaggle, which really helped.
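A hedged sketch of how the collected images might be loaded for training with Keras;
the local path and the layout (one sub-folder per severity grade) are assumptions,
not the exact structure of the Kaggle download:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values and reserve 20% of images for validation.
datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)
train_gen = datagen.flow_from_directory(
    "retina_images/",            # hypothetical local directory
    target_size=(224, 224),      # resize every fundus photo
    class_mode="categorical",
    subset="training",
)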

Figure 2.4. Data set containing retina images of the eye.

Develop Model

At a high level, building a good ML model is like building any other product: You
start with ideation, where you align on the problem you're trying to solve and some
potential approaches. Once you have a clear direction, you prototype the solution,
and then test it to see if it meets your needs. You continue to iterate between
ideation, prototyping and testing until your solution is good enough to bring to
market, at which point you productize it for a broader launch. Now let's dive into
the details of each stage.

Since data is an integral part of ML, we need to layer data on top of this product
development process, so our new process looks as follows:

Ideation. Align on the key problem to solve, and the potential data inputs to
consider for the solution.

Data preparation. Collect and get the data in a useful format for a model to digest
and learn from.

Prototyping and testing. Build a model or set of models to solve the problem, test
how well they perform and iterate until you have a model that gives satisfactory
results.

Productization. Stabilize and scale your model as well as your data collection and
processing to produce useful outputs in your production environment.

For this model, a multilayer convolutional neural network (CNN) was used to build
the network, and lighting-condition effects were reduced to pre-process the images
fed to the neural network; a sketch of this preprocessing follows.
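The report does not record the exact preprocessing recipe, but one common way to
reduce lighting-condition effects in fundus images (popularized in Kaggle diabetic
retinopathy work) is to subtract a heavily blurred copy of each image so that local
structure, rather than overall illumination, dominates. A sketch with OpenCV, with
the weights and blur scale as tunable assumptions:

import cv2

def reduce_lighting_effects(image, sigma=10):
    # Subtract a heavily blurred copy: local detail stays, slow
    # illumination changes cancel out. 128 re-centres the result.
    blurred = cv2.GaussianBlur(image, (0, 0), sigma)
    return cv2.addWeighted(image, 4, blurred, -4, 128)

img = cv2.imread("retina_images/sample.jpeg")   # hypothetical file
processed = reduce_lighting_effects(img)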

The dataset is divided into training and testing sets: the model is trained on the
training set and tested on the testing set to find the accuracy.

Figure 2.5. Model with Layers.
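A minimal Keras sketch in the spirit of the layered model in Figure 2.5; the exact
layer sizes used in the project are not recorded in this report, so these choices
are illustrative:

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(5, activation="softmax"),  # five DR severity grades
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# train_gen is the generator from the data-collection sketch above.
model.fit(train_gen, epochs=10)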

Deploying Model
Deployment of an ML model simply means the integration of the model into an
existing production environment which can take in an input and return an output
that can be used in making practical business decisions. Putting the model into
industry with hardware means this disease can easily be identified.
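A minimal deployment sketch under those assumptions: persist the trained network
once, then load it inside the serving process (or on hardware attached to the fundus
camera) and return predictions. The file path and placeholder input are hypothetical:

import numpy as np
from tensorflow.keras.models import load_model

model.save("dr_model.h5")                  # once, after training
served_model = load_model("dr_model.h5")   # inside the serving process

# Placeholder: a batch of pre-processed fundus images would go here.
batch = np.zeros((1, 224, 224, 3), dtype="float32")
grades = served_model.predict(batch).argmax(axis=1)  # severity grade per image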

Monitor and Optimize

In this step the deployed model is monitored, and the model is analysed for its
accuracy and errors. This is a very important step: after monitoring the behaviour
of the model, we optimize it according to the requirements and increase its accuracy,
to avoid mistakes in predicting a disease.

3. DEVELOPMENT TOOLS AND TECHNOLOGIES

Python

Python is an interpreted, high-level, general-purpose programming language. Created

by Guido van Rossum and first released in 1991, Python's design philosophy
emphasizes code readability with its notable use of significant whitespace. Its language
constructs and object-oriented approach aim to help programmers write clear, logical
code for small and large-scale projects.

Python is dynamically typed and garbage-collected. It supports multiple


programming paradigms, including structured (particularly, procedural), object-
oriented, and functional programming.

Anaconda

Anaconda is a free and open-source distribution of the Python and R programming


languages for scientific computing (data science, machine learning applications,
large-scale data processing, predictive analytics, etc.), that aims to simplify
package management and deployment. The distribution includes data-science packages
suitable for Windows, Linux, and macOS. It is developed and maintained by Anaconda,
Inc., which was founded by Peter Wang and Travis Oliphant in 2012. As an Anaconda,
Inc. product, it is also known as Anaconda Distribution or Anaconda Individual
Edition, while other products from the company are Anaconda Team Edition and
Anaconda Enterprise Edition, which are both not free.

Package versions in Anaconda are managed by the package management system conda.
This package manager was spun out as a separate open-source package as it ended up
being useful on its own and for things other than Python. There is also a small,
bootstrap version of Anaconda called Miniconda, which includes only conda, Python,
the packages they depend on, and a small number of other packages.

Libraries used

NumPy

NumPy is a very popular Python library for large multi-dimensional array and
matrix processing, with the help of a large collection of high-level
mathematical functions. It is very useful for fundamental scientific
computations in Machine Learning. It is particularly useful for linear algebra,
Fourier transform, and random number capabilities.

Scikit-learn

Scikit-learn is one of the most popular ML libraries for classical ML algorithms.
It is built on top of two basic Python libraries, viz., NumPy and SciPy.
Scikit-learn supports most of the supervised and unsupervised learning algorithms.
Scikit-learn can also be used for data mining and data analysis, which makes it a
great tool for anyone who is starting out with ML.

Keras

Keras is a very popular machine learning library for Python. It is a high-level
neural networks API capable of running on top of TensorFlow, CNTK, or Theano. It
can run seamlessly on both CPU and GPU. Keras makes it really easy for ML beginners
to build and design a neural network. One of the best things about Keras is that it
allows for easy and fast prototyping.

Pandas

Pandas is a popular Python library for data analysis. It is not directly related to
machine learning, but as we know, the dataset must be prepared before training. In
this case, Pandas comes in handy, as it was developed specifically for data
extraction and preparation. It provides high-level data structures and a wide
variety of tools for data analysis, along with many inbuilt methods for grouping,
combining and filtering data.

Matplotlib

Matplotlib is a very popular Python library for data visualization. Like Pandas,
it is not directly related to machine learning. It particularly comes in handy
when a programmer wants to visualize the patterns in the data. It is a 2D plotting
library used for creating 2D graphs and plots. A module named pyplot makes plotting
easy for programmers, as it provides features to control line styles, font
properties, formatting axes, etc. It provides various kinds of graphs and plots for
data visualization, viz., histograms, error charts, bar charts, etc.
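A tiny illustration of the pandas-plus-Matplotlib workflow described above, with
made-up accuracy numbers:

import pandas as pd
import matplotlib.pyplot as plt

# Prepare tabular data with pandas, then plot it with pyplot.
history = pd.DataFrame({"epoch": [1, 2, 3, 4, 5],
                        "accuracy": [0.62, 0.74, 0.81, 0.85, 0.87]})
history.plot(x="epoch", y="accuracy", marker="o")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.title("Illustrative training accuracy per epoch")
plt.show()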

Data Flow Diagram (DFD)

DFD 1. Basic model representation.

DFD 2. Detailed DFD of the model.
SYSTEM REQUIREMENTS

Hardware Requirements

 Processor – Intel i5 7th generation

 RAM – 8 GB DDR4 2400 MHz

 2 TB Hard Disk (5400 RPM) + 512 GB SSD

 NVIDIA GeForce GTX 1050 4 GB

 Heat sink to keep temperature under control

Software Requirements

 RHEL/CentOS 6.5 to 7.4, Ubuntu 12.04+/Windows 10

 Ubuntu users may need to install cURL.

 Client environment may be Windows, macOS or Linux

 MongoDB 2.6 (provided)

 Anaconda Repository license file

 Cron entry to start the repo on reboot

 Linux system accounts

 mongod (RHEL) or mongodb (Ubuntu)

 anaconda-server

CONCLUSION

The preceding reports of the Diabetic Retinopathy Symposium are reviewed. Diabetic
retinopathy progresses with the duration of disease and often results in proliferative
retinopathy in the juvenile-onset patient and macular edema in the older-onset patient.
Periodic ophthalmoscopic examinations are essential in detecting the progression of
retinopathy and the development of disease characteristics which indicate a need for
treatment. Laboratory and clinical experience stress the importance of rigid glucose
control in preventing diabetic retinopathy. Ischemia of the midperipheral retina
stimulates the development of high-risk factors for which panretinal photocoagulation
is indicated, despite side effects such as decreased dark adaptation. Pars plana
vitrectomy results in substantial visual improvement in eyes with non-clearing vitreous
haemorrhage and/or traction retinal detachments involving the macula. Future advances
in our knowledge of diabetic retinopathy should come from the National Eye Institute's
Collaborative Diabetic Retinopathy Vitrectomy Study and Early Treatment Diabetic
Retinopathy Study, and the analysis of vasoformative factors.

REFERENCES

 GitHub

 https://www.sciencedirect.com

 https://www.kaggle.com/

 https://towardsdatascience.com

 https://www.pythonistaplanet.com/image-classification-using-deep-learning/

 https://www.coursera.org/lecture/machine-learning/basic-operations-9fHfl

 https://scholar.google.co.in/scholar

