
Deep Learning in Medical and Surgical

Instruments

by

Srivarna Settisara Janney


Dr. Sumit Chakravarty

Kennesaw State University


April 2017

Table of Contents
1. Abstract
2. Medical and Surgical Instruments
2.1 History
2.2 Concepts and Categories of Instruments
2.3 Types of Equipment
2.4 Surgical Instruments
3. Deep Learning
3.1 What is Deep Learning?
3.2 Difference between AI, Machine Learning and Deep Learning
3.3 Neural Network and its Architectures
3.3.1 History of Neural Network
3.3.2 Artificial Neural Network (ANN)
3.3.3 Convolutional Neural Networks (CNN)
3.3.4 Recurrent Neural Network (RNN)
3.4 Hardware and Software
4. Deep Learning in Health Care
4.1 Diagnosis in Medical Images and Signals
4.2 Robotic Surgery (Autonomous)
4.3 Genome and Bio-Informatics
4.4 Drug Discovery
4.5 Virtual Visualization
5. Key papers in Deep Learning relevant to Medical and Surgical Instruments
6. Conclusion
7. References

Table of Figures
Figure 1: Iris Scissors
Figure 2: Crile Hemostatic Forceps
Figure 3: Volkman Retractor
Figure 4: Allis Tissue Forceps
Figure 5: Fashion Item Search
Figure 6: Age Estimation from Facial Images
Figure 7: Comparison Between ANN and Deep Architecture
Figure 8: How Neural Networks recognize a Dog in a photo
Figure 9: A simple CNN Architecture
Figure 10: Unfolded basic Recurrent Neural Network [22]
Figure 11: State of Open Source Deep Frameworks in 2017 [31]
Figure 12: Companies working on Health Care Data as per survey conducted in April 2016 [34]
Figure 13: Application of Deep Learning in Bio-Informatics research
Figure 14: A view of the Human Brain. Courtesy BioDigital
Figure 15: A view of acupuncture points on a 3D model. Courtesy Medical Augmented Intelligence
Figure 16: Challenges encountered by tool detection and localization algorithms in real interventions [49]
Figure 17: Classical method architecture for tool detection [53]
Figure 18: Example of surgical tool segmentations. From left to right: image, ground truth, and the authors' proposed method [47]
Figure 19: Deep Residual Learning for Instrument Segmentation in Robotic Surgery
Figure 20: FCN based segmentation of 4 testing images, each one belonging to a different dataset. From left to right, EndoVisSub (robotic), EndoVisSub (non-robotic), Neuro-Surgical tools [49]
Figure 21: Experiment results comparing outputs of some of the Deep Learning models

Deep Learning in Medical and Surgical
Instruments
1. Abstract

Deep learning is a family of techniques used to implement machine learning on a grand scale. These techniques have gained popularity for their accuracy in analysis and for the limited human intervention required to produce predictions. This is a survey paper on deep learning for medical and surgical instruments. Medical and surgical instruments have long been part of our day-to-day lives and the well-being of our species; they provide monitoring, diagnosis, and medical procedures suited to different levels of illness. In the following sections, we review medical and surgical instruments in practice, followed by an exposé of deep learning techniques and available commercial and research tools. Finally, we provide a summary of some recent key papers that apply deep learning to the medical domain and its instrumentation.

2. Medical and Surgical Instruments


2.1 History

People rarely give credit to the medical equipment used in clinics and hospitals. Medical equipment is an integral part of diagnosing, monitoring, and assisting in medical surgeries. Even the simplest physical exam can require a variety of medical equipment.

In the 6th century BC, the famous Indian physician and surgeon Susruta wrote a Sanskrit book called “The Susruta Samhita,” which to this day is taught at the University of Benares (in the city now called Kasi or Varanasi, on the banks of the river Ganga). He was known for unconventional ways of curing patients, for his surgical knowledge, and for the medical tools he used in his practice. His texts laid the foundation for today's Indian Ayurvedic medicine. [1]

In the 14th century AD, a bubonic plague epidemic swept across Asia, Europe, and Africa; it is believed that around 50 million people died. Although classical Greek and Roman medical theories rested on philosophy and superstition, the deaths of 25% to 60% of the European population made bodies available to universities for autopsies. This exploration led the scientific community toward practical surgery and anatomical studies, finally leading to modern medical equipment and tools. [2]

Since the 15th century, Western science has focused on examining and observing the body and has created tools to make this easier. X-ray imaging and MRI devices can be considered extensions of the first autopsies and anatomical studies, which strove to understand how the human body actually operates. Diagnostic instruments like ophthalmoscopes, blood pressure monitors, and stethoscopes are likewise extensions of our quest to monitor human physiology. Medical technology and medical knowledge feed off each other. Take, for instance, hypertension: although devices for measuring blood pressure have existed for over 100 years, only in the last 20 years have the connections of blood pressure to disease, genetics, and lifestyle been fully explored. As the importance of measuring blood pressure increased, new technologies were explored to keep accurate measurements and records. It wasn't until the prevalence of automatic blood pressure monitors that a correlation could be made between readings taken by a human and readings taken in a controlled, isolated environment. [7]

Medical equipment acts as an extension of our investigation into the how's and why's of the human body; as science catches up with and surpasses these investigations, completely new kinds of medical diagnosis, monitoring, and therapy emerge. Technology thus marches forward and continues, as a process, to change human life.

2.2 Concepts and Categories of Instruments

Instruments used in the medical field are classified as non-invasive, minimally invasive, and invasive instruments.
 Non-invasive instruments – are those that do not require any break or incision in the human body. Examples are X-ray machines, CT scanners, ultrasound machines, thermometers, BP machines, uric acid meters, ECG machines, forensic acquisition equipment (fingerprint, retina), extracorporeal shock wave lithotripsy (ESWL, which treats kidney stones using an acoustic pulse), etc.
 Minimally invasive instruments – are those that limit the size of the incisions needed, requiring fewer stitches and causing minimal pain. Smaller incisions also tend to heal faster, allowing a quicker recovery. Examples are glucometers, DNA sequencers, and the instruments used in cataract surgery, refractive surgery (vision correction), etc. Minimally invasive surgery, which uses cameras to observe the internal anatomy, is now the preferred approach for many surgical procedures. The current trend is computer-assisted interventions (CAI), where medical interventions are supported by computer-based tools and technologies.
 Invasive instruments – multiple instruments are used in invasive surgeries. Here we consider extensive multi-organ transplantation, open-heart surgery, etc., which require incisions over a large portion of the body for a longer time frame. These wounds take a longer time to heal, and recovery is slow.

2.3 Types of Equipment

Medical equipment (also known as armamentarium [3]) is designed to aid in the diagnosis, monitoring or treatment of medical conditions. Different types of instruments are used on different parts of the body to diagnose diseases of varied kinds and ranges. There are several basic types; a few of them are mentioned below [4]:

 Diagnostic equipment includes medical imaging machines, used to aid in diagnosis. Examples are ultrasound and MRI (Magnetic Resonance Imaging) machines, PET (Positron Emission Tomography) and CT (Computed Tomography) scanners, and X-ray machines.
 Treatment equipment includes infusion pumps, medical lasers and LASIK
(Laser-Assisted in Situ Keratomileusis) surgical machines.
 Life support equipment is used to maintain a patient's bodily function. This
includes medical ventilators, incubators, anesthetic machines, heart-lung
machines, ECMO (Extracorporeal membrane oxygenation), and dialysis
machines.
 Medical monitors allow medical staff to measure a patient's medical state.
Monitors may measure patient vital signs and other parameters
including ECG (Electrocardiogram), EEG (electroencephalogram), and blood
pressure.
 Medical laboratory equipment automates or helps
analyze blood, urine, genes, and dissolved gases in the blood.
 Diagnostic medical equipment may also be used in the home for certain purposes, e.g. for the control of diabetes mellitus.
 Therapeutic equipment includes physical therapy machines like Continuous Passive Range of Motion (CPM) machines.

2.4 Surgical Instruments

Surgical instruments are specially designed tools that assist health care professionals in carrying out specific clinical actions during an operation. The earliest known surgical instrument was the hand, which was later supplemented by sticks for handling and by cutting tools. Most instruments crafted from the early 19th century onward are made from durable stainless steel [5]. Some are designed for general use, and others for specific procedures. There are surgical instruments available for almost every specialization in medicine, including precision instruments used in microsurgery, ophthalmology and otology. Most surgical instruments can be classified into the four basic types described below, referenced from [6]:

 Cutting and Dissecting – These instruments usually have sharp edges or tips to cut through skin, tissue, and suture material. Surgeons need to cut and dissect tissue to explore irregular growths and to remove dangerous or damaged tissue. These instruments have single or double razor-sharp edges or blades. Nurses and OR personnel need to be very careful to avoid injuries, and should regularly inspect these instruments before use, sending them for re-sharpening or replacement as needed.
Example shown: Iris Scissors

Figure 1: Iris Scissors

 Clamping and Occluding – These are used in many surgical procedures to compress blood vessels or hollow organs, preventing their contents from leaking. Occlude means to close or shut; these instruments are therefore also used to control bleeding. They are either straight, curved, or angled, and have a variety of inner jaw patterns. Hemostats and mosquito forceps are some examples of these types of instruments.
Example shown: Crile Hemostatic Forceps

Figure 2: Crile Hemostatic Forceps

 Retracting and Exposing – These surgical instruments are used to hold back, or retract, organs and tissue so the surgeon has access to the operative area. They spread open the skin, ribs, and other tissue, and are also used to separate the edges of a surgical incision. Some retracting and exposing instruments are “self-retaining,” meaning they stay open on their own; other, manual styles need to be held open by hand.
Example shown: Volkman Retractor

Figure 3: Volkman Retractor

 Grasping and Holding – These instruments, as their name suggests, are used to grasp and hold tissue or blood vessels that may be in the way during a surgical procedure. Forceps are a very good example of these types of instruments.
Example shown: Allis Tissue Forceps

Figure 4: Allis Tissue Forceps

In addition to these major categories, there are other narrower instrument classifications, such as viewing (specula, endoscopes), dilators/probes, suturing (needle holders), aspirating (suction tubes), and accessories (mallets, etc.).

3. Deep Learning
3.1 What is Deep Learning?

Deep learning enables a computer to build complex concepts out of simpler concepts. Deep learning is a part of machine learning, within artificial intelligence (AI), whose networks are capable of learning from data, even unsupervised, and improving future performance through many complex layers of algorithms. Examples: speech recognition, image recognition, art restoration, language translation, and the recommendation systems in shopping sites like Amazon, eBay, etc.

3.2 Difference between AI, Machine Learning and Deep Learning

There is a fine line of difference between artificial intelligence (AI), machine learning, and deep learning. According to experts like Calum McClelland, Director of Big Data at Leverege:
 Artificial Intelligence (AI) – “AI involves machines that can perform tasks
that are characteristic of human intelligence.”
 Machine Learning (ML) – “Machine learning is simply a way of achieving
AI.”
 Deep Learning (DL)–“Deep learning is one of many approaches to
machine learning.”

Demo – see the link below for an insight into one current application of machine learning that has revolutionized search for online shopping.
http://demo.markable.ai

Figure 5: Fashion Item Search

Machine learning algorithms can be classified as supervised, semi-supervised, or unsupervised learning.
1. Supervised learning (classification and regression problems) [8]
The majority of practical machine learning uses supervised learning. In supervised learning we have input variables (X) and output variables (Y), and algorithms are used to learn the mapping function from the input to the output:
Y = f(X)
The goal is to approximate this mapping function (also called the objective function) so well that when you have new input data (X), you can predict the output variables (Y) for it. It is called supervised learning because the process of the algorithm learning from the training dataset can be thought of as a teacher supervising the learning process: we know the correct answers, the algorithm iteratively makes predictions on the training data, and we, the teacher, validate them. Learning stops when the algorithm achieves an acceptable level of performance.
Supervised learning problems can be further grouped into regression and
classification problems.
 Classification: A classification problem is when the output variable is a
category, such as “red” or “blue” or “disease” and “no disease”.
 Regression: A regression problem is when the output variable is a real value, such as “age” or “weight”. Age estimation from facial images is an example of a regression problem.

Figure 6: Age Estimation from Facial Images

(Image adapted from Dr. Hung Huang and Dr. Chengkai Li at UT Arlington)
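As an illustrative sketch of the supervised setting (scikit-learn, the dataset, and the classifier are assumed choices, not prescribed by this survey), a learner fits the mapping f on labeled pairs (X, Y) and then predicts labels for unseen inputs:

# Minimal supervised-learning sketch with scikit-learn (illustrative only).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)                  # inputs X, known answers y
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0)     # approximates Y = f(X)
model.fit(X_train, y_train)                        # the "teacher-supervised" step
print(model.score(X_test, y_test))                 # accuracy on new, unseen inputs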

2. Semi-supervised learning
Problems where you have a large amount of input data (X) and only some of the data labeled (Y) are called semi-supervised learning problems; they sit between the supervised and unsupervised learning approaches.

A good example is a photo archive where only some of the images are labeled (e.g. dog, cat, person) and the majority are unlabeled. Many real-world machine-learning problems fall into this area, because labeling data can be expensive or time-consuming and may require access to domain experts, whereas unlabeled data is cheap and easy to collect and store. Unsupervised learning techniques can be used to discover and learn the structure in the input variables. They can also be applied to make best-guess predictions for the unlabeled data, which can then be fed back into the supervised learning algorithm as training data, with the resulting model used to make predictions on new data.

3. Unsupervised learning (clustering)

Unsupervised learning is where you have only input data (X) and no corresponding output variables. The goal of unsupervised learning is to model the underlying structure or distribution of the data in order to learn more about it. This is called unsupervised learning because, unlike in supervised learning above, there are no available labels; algorithms are left to their own devices to discover and present the interesting structure in the data. Unsupervised learning problems can be further grouped into clustering and association problems.

 Clustering: A clustering problem is where you want to discover the inherent


groupings in the data, such as grouping customers by purchasing behavior.
 Association: An association rule learning problem is where you want to
discover rules that describe large portions of your data, such as people that
buy X also tend to buy Y.

Some popular examples of unsupervised learning algorithms are:


 K-means for clustering problems.
 Apriori algorithm for association rule learning problems.
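A minimal sketch of the K-means idea follows, assuming scikit-learn and synthetic two-group data (both are illustrative choices): the algorithm recovers the groupings without ever seeing a label.

# Minimal unsupervised-learning sketch: K-means discovers groupings
# in unlabeled data (synthetic here, for illustration only).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),    # first unlabeled "customer" group
               rng.normal(5, 1, (50, 2))])   # second group, shifted away

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:5])            # cluster assignment per sample
print(kmeans.cluster_centers_)       # the two discovered group centers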

3.3 Neural Network and its Architectures


3.3.1 History of Neural Network

Warren McCulloch and Walter Pitts [9] (1943) created a computational model for neural networks based on mathematics and algorithms, called threshold logic. This model paved the way for neural network research to split into two approaches: one focused on biological processes in the brain, while the other focused on the application of neural networks to artificial intelligence. This work led to the concept of nerve networks and their link to finite automata.

Much of artificial intelligence had focused on high-level (symbolic) models that are processed using algorithms, characterized for example by expert systems with knowledge embodied in if-then rules, until research in the late 1980s expanded to low-level (sub-symbolic) machine learning, characterized by knowledge embodied in the parameters of a cognitive model.

3.3.2 Artificial Neural Network (ANN)

The human brain and its activities inspired the neural network. The human brain is said to have 100 billion neurons, all interconnected to form a network. This network is responsible for our perception of the world around us: what we perceive is captured, stored, and recalled when we want it, all thanks to this network of neurons. Scientists mimicked it to create the artificial neural network. When we pass an input such as an image, nodes similar to neurons, arranged in hidden layers, capture the features needed to replicate the input and give the output. The output can be either an identification of the input or a reproduction of it.

ANNs have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games, and medical diagnosis.

Figure 7: Comparison Between ANN and Deep Architecture

The figure shows the three layers and single transformation toward the final outputs usually found in ANNs, while several stacked layers of neural networks constitute a deep learning architecture. Layer-wise unsupervised pre-training allows deep networks to be tuned efficiently and to extract deep structure from inputs, which serves as higher-level features used to obtain better predictions. [11]

Figure 8: How Neural Networks recognize a Dog in a photo

On the Fortune.com site, blog author Roger Parloff illustrated how a photo of a dog can be recognized by a neural network. [12]
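As a concrete sketch of the idea (Keras is an assumed library choice, and the layer sizes are illustrative), an ANN with one hidden layer of neuron-like nodes between input and output can be written as:

# Minimal ANN sketch in Keras (illustrative sizes): input -> hidden -> output.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(128, activation="relu",    # hidden layer of "neurons"
                       input_shape=(784,)),       # e.g., a flattened 28x28 image
    keras.layers.Dense(10, activation="softmax")  # output identifying the input
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()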

3.3.3 Convolutional Neural Networks (CNN)

CNNs are relatively new deep learning techniques that use a variation of multilayer perceptrons designed to require minimal pre-processing [13]. They are also known as shift-invariant or space-invariant artificial neural networks (SIANN), based on their shared-weights architecture and translation-invariance characteristics. [14] Learning is usually done without unsupervised pre-training. A CNN consists of an input and an output layer, as well as multiple hidden layers.

Figure 9: A simple CNN Architecture

Image adapted from https://www.clarifai.com/technology

A CNN contains one or more of each of the following layers [15]:


 Convolution layer – This layer is the core building block of a CNN. Convolutional layers apply a convolution operation to the input, passing the result to the next layer. The convolution emulates the response of an individual neuron to visual stimuli [16]. It also helps resolve the vanishing or exploding gradient problem that arises when training traditional multi-layer neural networks with many layers using back-propagation.
 ReLU (Rectified Linear Units) layer – This layer commonly follows the convolution layer. The addition of the ReLU layer allows the neural network to account for non-linear relationships, i.e. the ReLU layer allows the CNN to account for situations in which the relationship between the pixel value inputs and the CNN output is not linear. This layer applies the non-saturating activation function f(x) = max(0, x). It increases the non-linear properties of the decision function and of the overall network without affecting the receptive fields of the convolution layer. [17] Other non-linear functions such as tanh or sigmoid can also be used instead of ReLU, but ReLU has been found to perform better in most situations.
 Pooling layer – The next important layer of a CNN is pooling, which is a form of non-linear down-sampling. The pooling layer serves to progressively reduce the spatial size of the representation, reducing the number of parameters and the amount of computation in the network, and hence also controlling over-fitting. [18][19] It is common to periodically insert a pooling layer between successive convolutional layers in a CNN architecture. The pooling operation provides another form of translation invariance. For example, max pooling uses the maximum value from each cluster of neurons at the prior layer [20], while average pooling uses the average value from each cluster of neurons at the prior layer.

 Fully connected layer – Fully connected layers connect every neuron in one layer to every neuron in another layer, the same as in the traditional multi-layer perceptron neural network (MLP). After several convolutional and max pooling layers, the high-level reasoning in the neural network is done via fully connected layers.
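Putting these four layer types together, a minimal CNN sketch in Keras (an assumed library choice; the input size and class count are illustrative) looks like:

# Minimal CNN sketch: convolution + ReLU, pooling, then a fully connected head.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",   # convolution layer + ReLU
                  input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),                   # pooling: spatial downsampling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation="softmax")         # fully connected classifier
])
model.summary()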

3.3.4 Recurrent Neural Network (RNN)

The idea behind RNNs is to make use of sequential information. In a traditional neural network we assume that all inputs (and outputs) are independent of each other, but for many tasks that is not appropriate: if we want to predict the next word in a sentence, we should know which words came before it. RNNs are called recurrent because they perform the same task for every element of a sequence, with the output depending on the previous computations. Another way to think about RNNs is that they have a “memory” which captures information about what has been calculated so far. [21] In theory RNNs can make use of information in arbitrarily long sequences, but in practice they are limited to looking back only a few steps. Figure 10 shows what a typical RNN looks like:

Figure 10: Unfolded basic Recurrent Neural Network [22]

The term “recurrent neural network” is used somewhat indiscriminately for two broad classes of networks with a similar general structure: one is finite impulse and the other is infinite impulse. Both classes of networks exhibit temporal dynamic behaviour. [23] A finite impulse recurrent network is a directed acyclic graph that can be unrolled and replaced with a strictly feed-forward neural network, while an infinite impulse recurrent network is a directed cyclic graph that cannot be unrolled.

Both finite impulse and infinite impulse recurrent networks can have additional
storage state, and the storage can be under direct control of the neural network.
The storage can also be replaced by another network or graph, if that incorporates
time delays or has feedback loops. Such controlled states are referred to as gated
state or gated memory, and are part of Long Short-Term Memory (LSTM) [24]
and gated recurrent units.

RNNs have shown great success in many natural language processing (NLP) tasks. The most commonly used type of RNN is the LSTM (Long Short-Term Memory), which is much better at capturing long-term dependencies than the traditional RNN [21].
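A minimal LSTM sketch in Keras (an assumed library choice; the vocabulary size is an illustrative placeholder) for next-word prediction, where the recurrent state carries earlier words forward, is:

# Minimal LSTM sketch: predict the next word from the words seen so far.
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 5000  # assumed toy vocabulary size
model = keras.Sequential([
    layers.Embedding(vocab_size, 64),   # map word IDs to dense vectors
    layers.LSTM(128),                   # gated "memory" over the sequence
    layers.Dense(vocab_size, activation="softmax")  # next-word distribution
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")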

3.4 Hardware and Software

GPUs (Graphics Processing Units) and their libraries (CUDA, OpenCL) are normally recommended due to the sheer volume of data required for training deep neural networks. GPUs are highly parallel computing engines with many more execution threads than central processing units (CPUs); a CPU would take several hours, or often several days, to complete the same task. [25]

Nvidia's hardware has established a quiet but prominent role in deep learning, and Nvidia's DGX-1 is installed in hospitals and medical research centers across the world. Some hospitals, such as Massachusetts General Hospital's new clinical data science center, are already using this new hardware for population health, comparing patients' test results and medical histories to identify correlations in the data. [26]
Google has unveiled Tensor Processing Units (TPUs), specifically designed to facilitate deep learning.

There is a huge range of software packages available in different programming languages; they make it convenient for users to implement models at a high level, without worrying about the low-level implementation, while still exposing customizable parameters for specific applications.

 Caffe: C++ and Python interfaces, developed by graduate students at UC Berkeley. [27]
 Theano: a Python interface, developed by the MILA lab in Montreal. [28]
 Torch: a Lua interface (cross-platform, since the interpreter is written in ANSI C and has a relatively simple C API), used by, among others, Facebook AI Research. [29]
 TensorFlow: C++ and Python interfaces, developed by Google and used by Google research. [30]

Figure 11: State of Open Source Deep Frameworks in 2017 [31]

In the past few years, different open-source deep learning frameworks, especially in Python, were introduced, often developed or backed by one of the big tech companies, and some gained a lot of traction. Open-source technologies and the companies behind them are shown in Figure 11.
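As a flavor of what such frameworks abstract away, here is a minimal TensorFlow sketch (assuming the 2.x eager API): the framework differentiates a toy loss automatically, which is the low-level machinery behind back-propagation.

# Minimal TensorFlow sketch (assumed 2.x API): automatic differentiation.
import tensorflow as tf

w = tf.Variable(3.0)
with tf.GradientTape() as tape:
    loss = w * w                  # toy loss L(w) = w^2
grad = tape.gradient(loss, w)     # dL/dw = 2w = 6.0, computed automatically
print(grad.numpy())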

To become a deep learning programmer, and to learn some of the tips and tricks of deep learning, refer to the blog written by Nikolas Markou, “The Black Magic of Deep Learning – Tips and Tricks for the Practitioner”. [56]

4. Deep Learning in Health Care

Health care data is available in abundance in the electronic health record (EHR), also called the electronic medical record (EMR) [32]. EHR systems are designed to store data accurately and to capture the state of a patient across time. They eliminate the need to track down a patient's previous paper medical records and help ensure that data are accurate and legible. They can reduce the risk of data replication, as there is only one modifiable file, which means the file is more likely to be up to date, and they decrease the risk of lost paperwork. Because the digital information is searchable and held in a single file, EMRs are more effective for extracting medical data to examine possible trends and long-term changes in a patient. Population-based studies of medical records may also be facilitated by the widespread adoption of EHRs and EMRs. [33]

We’ve got more healthcare data to use to train algorithms than ever before.
Today, there are more than 100 healthcare-related AI start-ups.

Figure 12: Companies working on Health Care Data as per survey conducted in April 2016 [34]

4.1 Diagnosis in Medical Images and Signals

Computer vision has been one of the most remarkable breakthroughs enabled by machine learning and deep learning, and it is a particularly active healthcare application of ML. As examples, Microsoft's InnerEye initiative (started in 2010) is presently working on image diagnostic tools [35], and Enlitic [36] uses deep learning to detect lung nodules in radiographs and CT and MRI scans and to determine whether they are benign or malignant. CEO Igor Barani, a former professor of radiation oncology at the University of California, San Francisco, claims that Enlitic's algorithms outperformed four radiologists in testing. Barani told Medical Futurist [37]:

“Until recently, diagnostic computer programs were written using a series of predefined assumptions about disease-specific features. A specialized program had to be designed for each part of the body and only a limited set of diseases could be identified, preventing their flexibility and scalability. The programs often oversimplified reality, resulting in poor diagnostic performance, and thus never reached widespread clinical adoption. In contrast, deep learning can readily handle a broad spectrum of diseases in the entire body, and all imaging modalities (X-rays, CT scans, etc.).”

Deep learning will probably play a more and more important role in diagnostic
applications as deep learning becomes more accessible, and as more data sources
(including rich and varied forms of medical imagery) become part of the AI
diagnostic process.

4.2 Robotic Surgery (Autonomous)

The da Vinci robot has gotten the bulk of the attention in the robotic surgery space, and some would argue for good reason. This device allows surgeons to manipulate dexterous robotic limbs to perform surgeries in fine detail, in tight spaces, and with less tremor than would be possible by the human hand alone. [35][38]

While not all robotic surgery procedures involve machine learning, some
systems use computer vision (aided by machine learning) to identify distances,
or a specific body part (such as identifying hair follicles for transplantation on
the head, in the case of hair transplantation surgery). In addition, machine
learning is in some cases used to steady the motion and movement of robotic
limbs when taking directions from human controllers.

4.3 Genome and Bio-Informatics

Freenome [39] uses deep learning to find cancer in blood samples or, more specifically, in the fragments of DNA that blood cells emit as they die. Venture capital firm Andreessen Horowitz sent the company five blood samples to analyze as a pre-investment test. The firm went ahead with its investment after Freenome identified all five (two normal and three cancerous) correctly. Founder Gabriel Otte told Fortune that his deep learning algorithm is detecting cancer signatures that cancer biologists have yet to characterize. [32]

In May, Babylon Health founder and CEO Ali Parsa told online tech show “Hot
Topics” that his team had recently submitted the world’s first AI-powered
clinical triage system to academic testing, during which his system proved itself
13% more accurate than a doctor and 17% more accurate than a nurse. [32] [40]

Figure 13: Application of Deep Learning in Bio-Informatics research

Figure (A) Overview diagram with input data and research objectives. (B) A research example in the omics
domain. Prediction of splice junctions in DNA sequence data with a deep neural network [41]. (C) A research
example in biomedical imaging. Finger joint detection from X-ray images with a convolutional neural network
[42]. (D) A research example in biomedical signal processing. Lapse detection from EEG signal with a recurrent
neural network [43].

4.4 Drug Discovery

Machine learning applications are being trained to prescribe drugs to patients based on their private health reports. [32]

IBM's own health applications have included drug discovery initiatives since their early days. Google has also jumped into the drug discovery fray, joining a host of companies already raising and making money by working on drug discovery with the help of machine learning. [44]

4.5 Virtual Visualization

Several companies are exploring 3D technologies, augmented reality (AR), and virtual reality (VR) in health care. Any technology or advancement in health care has two primary sets of stakeholders, the doctors/caregivers and the patients, and in this respect 3D, AR, and VR are no different. For doctors and other caregivers, these technologies are driving big leaps forward in training and education. For patients, it's all about greater engagement and enhanced healing, rehabilitation, and comfort.

BioDigital Inc. is enabling 3D exploration and is often called the Google Maps of the human body. “Doctors and patients alike are inundated with information,” says BioDigital CEO Frank Sculli. “With 3D, we can make the content more engaging, which leads to increased understanding and retention.” BioDigital's cloud-based Human 3D model features more than 5,000 anatomical objects to explore, and more than 2,500 schools are using the platform to educate and train students. [45]

Figure 14: A view of the Human Brain. Courtesy BioDigital

Figure 15: A view of acupuncture points on a 3D model. Courtesy Medical Augmented Intelligence

3D, VR (Virtual Reality), and AR (Augmented Reality) are enabling new types of medical training, and several companies are using these technologies to engage patients in learning, rehabilitation, and therapy to deal with things like pain, aging, and anxiety.

5. Key papers in Deep Learning relevant to Medical and Surgical Instruments

In laparoscopic surgery, the procedure is performed at a remove from the operative site: small incisions are made in the patient's body, and the doctor performs the surgery with the help of a video camera inserted through them. Computer-assisted interventions (CAI) are increasing exponentially, and the need for accurate and reliable intervention is paramount because of the critical nature of the domain. [53]

Figure 16: Challenges encountered by tool detection and localization algorithms in real
interventions [49]

CAI is used for staff assignment, automated guidance during interventions, surgical alert systems, automatic indexing of surgical video databases, and optimization of the real-time scheduling of operating rooms. Semantic segmentation is used for the accurate delineation of surgical tools from the background.

Efforts have been made to develop systems that are both fast and accurate, but this remains an active area of research due to its importance. Applications include identifying the location of a surgical tool, identifying the tools present in a given frame, and many more. With the advance of deep learning models, computer-assisted intervention is reaping the rewards, and many papers have been published in this domain recently. [46]

Figure 17: Classical method architecture for tool detection [53]

In [53], the authors developed a deep learning based multi-label classification method for identifying the surgical tools present in a given frame. The model consists mainly of a convolutional neural network with many layers; they used the Inception architecture and a standard feed-forward architecture to perform the prediction. This method beat the other results and took first place in the MICCAI 2017 challenge.
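The exact architecture of [53] is not reproduced here, but a minimal sketch of the multi-label idea (Keras, the Inception backbone configuration, and the class count are illustrative assumptions) uses one independent sigmoid output per tool, so several tools can be flagged as present in the same frame:

# Multi-label tool-presence sketch (illustrative, not the authors' exact model):
# an Inception backbone with one sigmoid output per surgical tool.
from tensorflow import keras
from tensorflow.keras import layers

num_tools = 7  # assumed number of tool classes
base = keras.applications.InceptionV3(include_top=False, pooling="avg",
                                      input_shape=(299, 299, 3))
outputs = layers.Dense(num_tools, activation="sigmoid")(base.output)
model = keras.Model(base.input, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")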

In semantic segmentation, each pixel is assigned a class label, either tool or background. In [47], the authors applied a hybrid method utilizing both recurrent and convolutional networks to achieve higher accuracy in surgical tool segmentation. Training and testing were carried out on the public MICCAI 2016 Endoscopic Vision Challenge Robotic Instruments dataset, “EndoVis”. The authors report that this outperforms prior work, with a balanced accuracy of 93.3% and a Jaccard index [46] of 82.7%.

Figure 18: Example of surgical tool segmentations. From left to right: image, ground truth, and the authors' proposed method [47]
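The Jaccard index cited above is the intersection over union of the predicted and ground-truth masks; a minimal sketch of its computation (with toy masks) is:

# Jaccard index (intersection over union) between two binary masks.
import numpy as np

def jaccard(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return intersection / union if union else 1.0  # two empty masks match

# Toy 2x2 masks: 2 pixels overlap out of 3 in the union -> 2/3.
print(jaccard(np.array([[1, 1], [0, 0]]), np.array([[1, 1], [1, 0]])))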

The authors of [48] contribute in two major ways. First, they leverage recent techniques such as deep residual learning (ResNet-101), which helped them achieve a 4% improvement in binary tool segmentation, and dilated convolutions to further advance binary-segmentation performance. Second, they extend the approach to multi-class segmentation, which lets them segment different parts of the tool in addition to the background.

Figure 19: Deep Residual Learning for Instrument Segmentation in Robotic Surgery

The figure shows a simplified CNN before and after being converted into an FCN (illustrations (a) and (b) respectively), and after reducing the downsampling rate by integrating dilated convolutions into its architecture with subsequent bilinear interpolation (illustration (c)). Illustration (a) shows an example of applying a CNN to an image patch centered at the red pixel, which gives a single vector of predicted class scores (manipulator, shaft, and background). Illustration (b) shows the fully connected layer converted into a 1x1 convolutional layer, making the network fully convolutional and thus enabling dense prediction. Illustration (c) shows the network with reduced downsampling and dilated convolutions, producing outputs that are upsampled to acquire pixel-wise predictions. [48]
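A minimal sketch of the conversion described in illustration (b), with illustrative channel counts and depths (Keras is an assumed library choice): a 1x1 convolution takes the place of the fully connected layer, so the network accepts any image size and emits a dense map of class scores, which is then upsampled.

# Sketch of a fully convolutional segmentation head (illustrative sizes).
from tensorflow.keras import Sequential, layers

num_classes = 3  # manipulator, shaft, background
fcn = Sequential([
    layers.Conv2D(64, (3, 3), padding="same", activation="relu",
                  input_shape=(None, None, 3)),  # any input size is accepted
    layers.MaxPooling2D((2, 2)),                 # downsampling stage
    layers.Conv2D(num_classes, (1, 1)),          # 1x1 conv replaces the Dense layer
    layers.UpSampling2D((2, 2), interpolation="bilinear"),  # pixel-wise scores
])
fcn.summary()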

The authors of [49] proposed a novel real-time automatic method based on fully convolutional networks (FCN) and optical flow tracking. Their method exploits the ability of deep neural networks to produce accurate segmentations of highly deformable parts, along with the high speed of optical flow. They validated it using existing and new benchmark datasets covering both ex vivo and in vivo real clinical cases in which different surgical instruments are employed. Two versions of the method are presented, non-real-time and real-time. The former, using only deep learning, achieves a balanced accuracy of 89.6% on a real clinical dataset, outperforming the (non-real-time) state of the art by 3.8 percentage points. The latter, a combination of deep learning with optical flow tracking, yields an average balanced accuracy of 78.2% across all the validated datasets.
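Balanced accuracy, the metric reported above, is the average of per-class recalls, which keeps the overwhelming number of background pixels from dominating the score; a minimal sketch (assuming scikit-learn) is:

# Balanced accuracy: mean of per-class recalls, robust to class imbalance.
from sklearn.metrics import balanced_accuracy_score

y_true = [0, 0, 0, 0, 1, 1]  # toy pixels: mostly background (0), some tool (1)
y_pred = [0, 0, 0, 1, 1, 1]
print(balanced_accuracy_score(y_true, y_pred))  # (3/4 + 2/2) / 2 = 0.875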

Figure 20: FCN based segmentation of 4 testing images, each one belonging to a different dataset.
From left to right, EndoVisSub(robotic), EndoVisSub(non-robotic), Neuro-Surgical tools [49]

The authors of [50] provide a good comparison of the results of different deep learning models. They compared:
1. FCN-VGG 400 – Fully Convolutional Network with VGG [51], trained on a smaller set of 400 samples
2. FCN-VGG 10k small – Fully Convolutional Network with VGG, trained on 10k samples at the smaller resolution of 256x256 pixels
3. FCN-VGG 10k large – Fully Convolutional Network with VGG, trained on 10k samples at the larger resolution of 940x940 pixels
4. P2P 400 – Pixel-to-Pixel use of a conditional generative adversarial network (cGAN), trained on 400 samples
5. P2P 10k – Pixel-to-Pixel trained on 10k samples

Figure 21: Experiment results comparing outputs of some of the Deep Learning models

The figure shows experimental results presenting outputs of some of the deep learning models for different parameters (training dataset size and resolution), applied on a simulation testing dataset. The simulated images are shown in the first column and the ground truth segmentation images in the second column. [50]

6. Conclusion
Deep learning (DL) is both the natural evolution of prior technologies, accelerated by improvements in algorithms and computing power, and a dramatic leap forward in our ability to extract critical information from data that may be difficult to observe using non-deep-learning techniques. Deep learning is changing the way doctors diagnose illnesses, making diagnostics faster, cheaper, and more accurate than ever before. Taking advantage of these advances requires certain preparatory steps, such as upgrading hardware.
We are still in the early stages of applying DL, and building DL systems is much more an art than a science at this stage, for example when deciding the optimal architecture of a DL network for a particular problem [55]. The type of problem being addressed also impacts the architecture. For image segmentation, the most commonly utilized architectures are fully convolutional networks (FCN) and autoencoders; these techniques have been successfully applied to the segmentation of several types of medical images, including brain, lung, prostate, and kidney. For image classification, CNNs have been the most common architecture.
The key elements related to DL in medical imaging and instrumentation are:
- Deep learning has dramatically improved the performance of computer algorithms outside of medicine, and it can be expected to dramatically improve the performance of medical applications in the near future.
- Detection, tracking, and pose estimation of surgical instruments are crucial tasks for computer assistance during minimally invasive robotic surgery. In the majority of cases, the first step is the automatic segmentation of surgical tools. [48]

7. References
[1] M. Loukas, A. Lanteri, J. Ferrauiola, R. S. Tubbs, G. Maharaja, M. M. Shoja, A. Yadav, and V. C. Rao, “Anatomy in ancient India: a focus on the Susruta Samhita”, Journal of Anatomy, published online 2010.
[2] http://baysidejournal.com/medical-equipment-development-and-history-of-medical-equipment/
[3] “ar·ma·men·tar·i·um”. www.thefreedictionary.com. Retrieved 14 November 2013.
[4] https://en.wikipedia.org/wiki/Medical_equipment
[5] R. Nemitz, “Surgical Instrumentation: An Interactive Approach”, Saunders, 2010, ISBN 1416037020, p. xiii
[6] M. Heller, “Clinical Medical Assisting: A Professional, Field Smart Approach to the Workplace”, 2016. http://research.sklarcorp.com/4-basic-types-of-surgical-instruments
[7] J. R. Kirkup, “The history and evolution of surgical instruments”, Annals of the Royal College of Surgeons of England, July 1981, vol. 63.
[8] https://machinelearningmastery.com/supervised-and-unsupervised-machine-learning-algorithms/
[9] W. McCulloch, W. Pitts, “A Logical Calculus of the Ideas Immanent in Nervous Activity”, Bulletin of Mathematical Biophysics, 5 (4): 115–133. doi:10.1007/BF02478259
[10] https://en.wikipedia.org/wiki/Artificial_neural_network
[11] R. Miotto, F. Wang, S. Wang, X. Jiang and J. T. Dudley, “Deep learning for
healthcare”, Briefings in Bioinformatics, 2017
[12] http://fortune.com/ai-artificial-intelligence-deep-machine-learning/
[13] LeCun, Yann. "LeNet-5, convolutional neural networks". Retrieved 16
November 2013.
[14] W. Zhang, “Shift-invariant pattern recognition neural network and its optical architecture”, Proceedings of the Annual Conference of the Japan Society of Applied Physics, 1988.
[15] https://en.wikipedia.org/wiki/Convolutional_neural_network#cite_note-deeplearning-7
[16] “Convolutional Neural Networks (LeNet) – DeepLearning 0.1 documentation”. DeepLearning 0.1. LISA Lab. Retrieved 31 August 2013.
[17] Neural Networks Part 1: Setting up the Architecture (Stanford CNN Tutorial): http://cs231n.github.io/neural-networks-1/
[18] D. Ciresan, U. Meier, J. Masci, L. M. Gambardella, J. Schmidhuber, “Flexible, High Performance Convolutional Neural Networks for Image Classification” (PDF), Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Volume Two, 2: 1237–1242.

[19] A. Krizhevsky, "ImageNet Classification with Deep Convolutional Neural
Networks". Retrieved 17 November 2013.
[20] D. Ciresan, U. Meier, J. Schmidhuber, "Multi-column deep neural networks for
image classification". 2012 IEEE Conference on Computer Vision and Pattern
Recognition.
[21] D. Britz, “Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs”, article published in WildML, Sept 2015
[22] https://en.wikipedia.org/wiki/Recurrent_neural_network
[23] M. Miljanovic, “Comparative analysis of Recurrent and Finite Impulse Response Neural Networks in Time Series Prediction” (PDF), Indian Journal of Computer Science and Engineering, Feb–Mar 2012.
[24] S. Hochreiter, J. Schmidhuber, “Long Short-Term Memory”, Neural Computation, 9 (8): 1735–1780.
[25] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M. van der Laak, B. van Ginneken, C. I. Sánchez, “A Survey on Deep Learning in Medical Image Analysis”, Medical Image Analysis, Volume 42, December 2017, Pages 60–88
[26] C. Caparas, “Google’s DeepMind to Scan a Million Eyes to Fight Blindness with
NHS”, article in Futurism.com in Jul 2016
[27] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama,
T. Darrell, 2014. “Caffe: Convolutional architecture for fast feature embedding”, In:
Proceedings of the 22nd ACM International Conference on Multimedia. pp. 675–678.
[28] F. Bastien, P. Lamblin, R. Pascanu, J. Bergstra, I. Goodfellow, A. Bergeron, N.
Bouchard, D. Warde-Farley, Y. Bengio, 2012. “Theano: new features and speed
improvements”, Deep Learning and Unsupervised Feature Learning NIPS 2012
Workshop.
[29] R. Collobert, K. Kavukcuoglu, C. Farabet, 2011. “Torch7: A Matlab-like environment for machine learning”, Advances in Neural Information Processing Systems.
[30] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, 2016. “TensorFlow: Large-scale machine learning on heterogeneous distributed systems”, arXiv:1603.04467.
[31] I.D. Bakker, “Battle of the Deep Learning frameworks — Part I: 2017, even
more frameworks and interfaces” article published in Towards Data Science, Dec
2017
[32] C. Reisenwitz, “How Deep Learning is Changing Healthcare Part 1: Diagnosis”
article published in Capterra Medical Software Blog, Oct 2017
[33] https://en.wikipedia.org/wiki/Electronic_health_record
[34] D. Varadharajan, “From Virtual Nurses To Drug Discovery: 106 Artificial
Intelligence Startups In Healthcare” article published in CBInsight.com
[35] D. Faggella, “Machine Learning Healthcare Applications – 2018 and Beyond”,
article published in techemergence.com in Mar 2018
[36] https://www.enlitic.com/
[37] “Artificial Intelligence will redesign healthcare” article published in
Medicalfuturist.com

[38] https://www.youtube.com/watch?v=0XdC1HUp-rU
[39] https://www.freenome.com/
[40] D. Faggella, “The Pathway to ‘Precision Medicine’ with Tute Genomics CEO
Reid Robison” article published in techemergence.com in Sept 2017
[41] T. Lee, S. Yoon, “Boosted Categorical Restricted Boltzmann Machine for
Computational Prediction of Splice Junctions”. International Conference on
Machine Learning. Lille, France, 2015. p. 2483-92.
[42] S. Lee, M. Choi, H. Choi, “FingerNet: Deep learning-based robust finger joint
detection from radiographs”. Biomedical Circuits and Systems Conference
(BioCAS), 2015 IEEE. 2015. p. 1-4. IEEE.
[43] P.R. Davidson, R. D Jones, M. T Peiris, “EEG-based lapse detection with high
temporal resolution”. Biomedical Engineering, IEEE Transactions on
2007;54(5):832-9.
[44] K. Sennaar, “AI in Pharma and Biomedicine - Analysis of the Top 5 Global Drug
Companies” article published in techemergence.com in Feb 2018.
[45] C. Mogk, “Virtual Reality in Health Care Makes Medical Reality Easier to
Endure” article published in Redshift by autodesk.com in Jun 2017.
[46] https://en.wikipedia.org/wiki/Jaccard_index
[47] M. Attia, M. Hossny, S. Nahavandi, and H. Asadi, “Surgical Tool Segmentation Using A Hybrid Deep CNN-RNN Auto Encoder-Decoder”, IEEE International Conference on Systems, Man, and Cybernetics, Oct 2017
[48] D. Pakhomov, V. Premachandran, M. Allan, M. Azizian and N. Navab, “Deep Residual Learning for Instrument Segmentation in Robotic Surgery”, Computer Vision and Pattern Recognition, Mar 2017. arXiv:1703.08580
[49] L. C. García-Peraza-Herrera, W. Li, C. Gruijthuijsen, A. Devreker, G. Attilakos, J. Deprest, E. V. Poorten, D. Stoyanov, T. Vercauteren, S. Ourselin, “Real-Time Segmentation of Non-Rigid Surgical Tools based on Deep Learning and Tracking”, International Workshop on Computer-Assisted and Robotic Endoscopy, CARE Workshop (MICCAI 2016).
[50] O. Zisimopoulos, E. Flouty, M. Stacey, S. Muscroft, P. Giataganas, J. Nehme, A. Chow, D. Stoyanov, “Can surgical simulation be used to train detection and classification of neural networks?”, Healthcare Technology Letters, Jul 2017
[51] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition”, CoRR, abs/1409.1556, 2014
[52] S. Petscharnig, K. Schöffmann, “Learning laparoscopic video shot classification for gynecological surgery”, Multimedia Tools and Applications, Springer Journal, Apr 2017
[53] S. Wang, A. Raju, J. Huang, “Deep Learning Based Multi-Label Classification for Surgical Tool Presence Detection in Laparoscopic Videos”, IEEE 14th International Symposium on Biomedical Imaging, Apr 2017
[54] Z. Zhao, S. Voros, Y. Weng, F. Chang and R. Li, “Tracking-by-detection of surgical instruments in minimally invasive surgery via the convolutional neural network deep learning-based method”, Computer Assisted Surgery 2017, Vol. 22, No. S1, 26–35
[55] B. J. Erickson, P. Korfiatis, T. L. Kline, Z. Akkus, K. Philbrick, A. D. Weston, “Deep Learning in Radiology: Does One Size Fit All?”, Journal of the American College of Radiology, Mar 2018
[56] N. Markou, “The Black Magic of Deep Learning – Tips and Tricks for the Practitioner”, article published in Envision in Feb 2017

