Particular Use of BIG DATA in Medical Diagnostic Tasks: Applied Problems

APPLIED PROBLEMS
Particular Use of BIG DATA in Medical Diagnostic Tasks1

N. Ilyasova (a, b, *), A. Kupriyanov (a, b, **), R. Paringer (a, b), and D. Kirsh (a, b)
(a) Samara University, Samara, Russia
(b) Image Processing Systems Institute – Branch of the Federal Scientific Research Centre “Crystallography and Photonics” of the Russian Academy of Sciences, Samara, Russia
* e-mail: ilyasova@smr.ru
** e-mail: alexkupr@gmail.com
Abstract: The paper presents the main research results on the application of data mining to medicine. We propose a new data mining information technology for different classes of biomedical images, based on a methodology for selecting diagnostically relevant information and constructing informative features. Applying Big Data technology in the proposed medical diagnostic systems made it possible to improve the quality of the training set and to reduce the classification error. From these results we conclude that using many heterogeneous sources of diagnostic information improved the overall quality of diagnostics.
Keywords: Big Data, medical diagnostics, data mining

DOI: 10.1134/S1054661818010066
1 The article is published in the original.
Received September 15, 2017
ISSN 1054-6618, Pattern Recognition and Image Analysis, 2018, Vol. 28, No. 1, pp. 114–121. © Pleiades Publishing, Ltd., 2018.

1. INTRODUCTION

The current level of technology allows us to speak of an evolution in world medicine. It is moving beyond evidence-based medicine toward personalized medicine, which closely combines information technology, science, and clinical therapy to achieve the best clinical or preventive results and considerably increases treatment efficiency. The problem of extracting knowledge from large data sets is denoted by the term “big data” (Big Data) [1, 2]. The problem of converting data arrays into knowledge, defined in 2013 by the National Institutes of Health (USA), is a priority for the International Data-Enabled Life Science Alliance, which involves over 80 of the world’s leading scientists working in medicine, public health, and applied informatics [3, 4].

Big Data has been characterized in terms of its volume, variety, velocity, veracity, and value [1, 2]. A data volume is considered big when processing and storing it by traditional methods becomes difficult, so that new approaches and more refined tools are required. The underlying reason for new data-processing techniques is the need to parallelize the processing and distribute it over a large number of independent data-processing flows. As for variety, such data volumes are rarely homogeneous: in the vast majority of cases, the overall data array includes both structured and unstructured data. Velocity refers both to the increasing rate of data accumulation (90% of information has been gathered in the last two years) and to the data processing rate; on-line data processing techniques have recently become more sought-after. When working with big data, particular significance attaches to data veracity, that is, separating true data from information noise and junk and sifting this noise out [3]. In biomedicine, the larger the sample size, the more accurate the estimates. However, large samples of poor-quality data may be seriously misleading. In public health service, accuracy and reliability are equally important. The value of information determines whether processing it is reasonable. Collected data should provide answers to questions articulated in advance and to recurring questions. The effects resulting from data collection and processing should justify the costs of these operations; collected data should bring positive results.

Public health statistics pass through the following three stages before they are used for structural diagnostics: (1) data collection, including acquisition of key data elements, data quality assurance, and integration of collected data into the work process; (2) data analysis, including data interpretation, data mining, and data quality assessment; (3) data submission, including data movement and visualization for clinicians.

State-of-the-art information technology makes it possible to collect, store, and process enormous aggregations of information. There are many data analysis methods, e.g., the methods of mathematical statistics, clustering, and classification, including ensembles (“sections”) of algorithms in which the results obtained by several algorithms are averaged to construct a final model [1, 5]. These methods may describe data mining with varying degrees of accuracy. Under modern conditions, as the amount of information grows excessively, its structure becomes more complicated.

One of the procedures that help to solve the tasks of data analysis and interpretation of diagnostic results is the Data Mining technique. It is used to identify and analyze relationships in arrays of semi-structured information and to build models that describe the behavior of complicated systems. Data Mining means the search for and detection, by a “mechanism” (algorithms, artificial intelligence tools), of knowledge in raw data that was formerly unknown and is practically useful and accessible to human interpretation [5].

2. MEDICINE AS A SOURCE OF BIG DATA

Modern medicine is one of the most high-tech fields of scientific and practical activity, and its high priority is the development of new, efficient techniques for early diagnosis of various health problems. The last few decades have been characterized by a significant breakthrough in the technological infrastructure of medicine. Computer image analysis has become a basic tool of medical diagnostic systems, making it possible to considerably increase diagnostic quality.

Medicine is one of the few fields that accumulate data in huge quantities (Fig. 1), at high velocity, and in very diverse formats: figures (test data), texts (medical notes), video (e.g., ultrasound investigation), photography (tomography, X-ray photography), technical signals from electric-signal recording equipment (electrocardiography, electroencephalography), etc. According to projections, the data volume will reach 35 zettabytes by 2020, a 44-fold increase over 2009.

[Figure 1: infographic on data growth. Volume: 800,000 petabytes in 2009 and 35 zettabytes in 2020 (growth in data and content over the coming decade). Variety: share of the world’s data that is unstructured. Velocity. Inset bar chart “Storage Growth”: total data held by healthcare providers, in PB (axis from 0 to 15,000), for 2010–2015, broken down into Admin, Imaging, EMR, Email, File, Non-clinical imaging, and Research.]
Fig. 1. Data volumes increase in public health service.

Most of the data growth is produced by unstructured data (medical imaging, video, texts, speech). It may currently be affirmed that 90% of all medical data is unstructured. In fact, public health data are usually held in different departments, in various formats, and come from several sources and different clinical systems (80% of electronic medical histories and medical images from computed tomography (CT) or magnetic resonance tomography (MRT)).

The Institute for Health Technology Transformation has shown that the human body represents an inexhaustible source of big data [6]. Image archive volumes in medicine increase by 20–40% annually: one 3D X-ray computed tomography snapshot (3D CT) takes about 1 GB; one 3D magnetic resonance tomography snapshot (3D MRT), about 150 MB; one X-ray snapshot, about 30 MB; one mammogram snapshot, about 120 MB. There is also a clear trend of rapid growth in the number of wearable devices that are worn on the patient’s body and transmit information on-line. About 500 million such devices were expected to be in use worldwide by 2018. Expanding the set of diagnostic information makes it possible to considerably increase the veracity of diagnosis of human diseases in personalized medicine. The objective is to quantify ever more fully the capacity of different approaches to efficient diagnostics.

The main objective of the research currently conducted at the Image Processing Systems Institute – Branch of the Federal Scientific Research Centre “Crystallography and Photonics” of the Russian Academy of Sciences (RAS), under the leadership of RAS Academician V.A. Soifer, is the development of computer techniques for remote high-performance processing, analysis, and interpretation of medical diagnostic images in order to identify cause-and-effect relationships in changes of diagnostic information of different image classes for various types of diseases. The relevance of this research also stems from the significance of early diagnosis, prediction of the course of a disease, and selection of an optimal therapeutic approach. Late diagnosis or late interpretation of changes often significantly reduces the efficiency of treatment and disease prevention. Currently used methods for recording and formally describing a patient’s status do not always provide the aggregate picture of factors required for proper diagnosis. There is an urgent need to introduce new diagnostic techniques for various diseases into clinical practice [7, 8].

The Image Processing Systems Institute (IPSI) studies the following imaging classes: human vascular system imaging for early diagnosis of diabetic retinopathy; bone tissue X-ray imaging (femoral neck fracture imaging) to diagnose osteoporosis; ultrasonic images of the renal system to diagnose pyelonephritis; and computed tomography scans of the lungs to diagnose chronic obstructive pulmonary disease (emphysema).

Research on diagnostic images comprises the three stages of data processing shown in Fig. 2: processing of biomedical signals, data analysis, and visualization of results.

Big data mining of the specified image classes, performed at IPSI RAS to identify regular changes in diagnostic information in medical images for various types of diseases, includes the use of new mathematical methods and algorithms for distributed processing and recognition of biomedical images for remote diagnostic systems. A common approach has been proposed for analyzing different classes of images; it is based on evaluating aggregate geometric and texture parameters of selected regions of interest, which form a basic feature set for further diagnostic analysis [9].

As integrated indices of the state of the fundus vessels and coronary vessels, a global set of geometric features is used; it characterizes diagnostic images sufficiently completely to allow efficient diagnosis of vascular malformations [10, 11]. As integrated indices of the state of crystallogram images of biological fluids, a set of texture features is proposed that enables efficient diagnosis of inflammatory diseases.

For renal ultrasonography, bone X-ray imaging, and lung CT scans, polynomial features are proposed that are matched to the textural properties of the given image classes [12].

A technique for constructing an efficient feature space has been developed to analyze diagnostic images; it is based on big data mining of unstructured information using methods of statistical analysis [13–18]. The informativeness of features is analyzed by discriminative analysis using separability criteria that depend neither on the distribution of objects among classes nor on the classifier used.

Remote high-performance processing, analysis, and interpretation of images to identify the main relationships are based on the Hadoop Big Data technology. Its use is justified by the large size of the arrays of semi-structured information generated by standard software and hardware used for medical diagnostic purposes. Hadoop makes it possible not only to reduce the time of data pre-processing and processing for imaging systems, but also to considerably enhance the capability to extract new information from semi-structured or completely unstructured data. To store and process large volumes of unstructured information efficiently when mining large images, we rely on parallel processing and distributed storage based on the MapReduce software infrastructure for distributed computations. The technique is also used to optimize existing data-processing operations, significantly reduces storage and processing costs, and ensures highly efficient data handling [19].
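The MapReduce pattern mentioned above can be illustrated with a minimal single-process sketch. It is not the authors' Hadoop implementation and does not use the Hadoop API; it only imitates the map, shuffle, and reduce phases in plain Python. The record layout (a class label paired with one texture feature value) and all function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record: (diagnosis class label, value of one texture feature).
Record = Tuple[str, float]
Partial = Tuple[float, int]  # (partial sum, partial count)


def map_phase(record: Record) -> Tuple[str, Partial]:
    """Emit (class label, (feature value, 1)) for one image record."""
    label, value = record
    return label, (value, 1)


def shuffle(pairs: Iterable[Tuple[str, Partial]]) -> Dict[str, List[Partial]]:
    """Group intermediate pairs by key, as a framework would between phases."""
    groups: Dict[str, List[Partial]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(label: str, values: List[Partial]) -> Tuple[str, float]:
    """Combine partial sums and counts into the per-class mean."""
    total = sum(v for v, _ in values)
    count = sum(c for _, c in values)
    return label, total / count


def class_means(records: Iterable[Record]) -> Dict[str, float]:
    """Run the three phases in sequence on an in-memory collection."""
    grouped = shuffle(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in grouped.items())
```

Calling `class_means` on a list of `(label, value)` pairs returns one mean per class. Because each map call touches a single record and each reduce call touches a single key, the same two functions could in principle be distributed over many workers, which is the property the text relies on.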

[Figure 2: diagram of the three research phases. Biomedical signal processing: kidney ultrasound (nephrological study), bone X-ray (osteoporosis), CT of lungs (emphysema), and fundus images (estimation of vessel geometrical parameters), with recognition based on texture analysis. Data analysis: feature distribution histogram for different stages of diabetes; samples in feature space before and after discriminative analysis; separability criterion for the features path tortuosity, path frequency, path amplitude, radius tortuosity, radius frequency, radius amplitude, beads factor, straightness, and mean diameter. Visualization and results: selection of areas of interest; reconstruction and visualization of coronary blood vessels; estimation of blood vessel features.]
Fig. 2. Main phases of research.
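The per-feature separability criterion shown in the data analysis phase of Fig. 2 can be illustrated with a small sketch. The paper does not give the formula of its criterion, so the code below assumes a Fisher-type ratio of between-class to within-class scatter for a single scalar feature. The function names and the sample values are hypothetical.

```python
from statistics import fmean
from typing import Dict, List, Sequence


def fisher_separability(samples: Dict[str, Sequence[float]]) -> float:
    """Ratio of between-class to within-class scatter for one scalar feature.

    `samples` maps a class label to the feature values observed in that class.
    This Fisher-type ratio is an assumed stand-in for the paper's criterion.
    """
    all_values = [v for values in samples.values() for v in values]
    grand_mean = fmean(all_values)
    between = 0.0
    within = 0.0
    for values in samples.values():
        class_mean = fmean(values)
        between += len(values) * (class_mean - grand_mean) ** 2
        within += sum((v - class_mean) ** 2 for v in values)
    if within == 0.0:
        raise ValueError("within-class scatter is zero; criterion undefined")
    return between / within


def rank_features(features: Dict[str, Dict[str, Sequence[float]]]) -> List[str]:
    """Order feature names from most to least separable."""
    return sorted(
        features,
        key=lambda name: fisher_separability(features[name]),
        reverse=True,
    )
```

A ratio of this kind is computed from the samples alone, without training any classifier, which matches the requirement stated in Section 2 that the criterion not depend on the classifier used.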
3. INFORMATION TECHNOLOGY OF DIAGNOSTIC IMAGE DATA MINING

The information technology of diagnostic image data mining includes a method for generating an efficient feature space to classify a given set of images.

The methodology for extracting significant diagnostic information from blood-vessel images is based on a new generalized mathematical model of blood vessels for two classes of diagnostic images, the fundus vessels and the coronary vessels, which is characterized by a set of geometric parameters [9]. Unlike traditional abstract spectral and correlation features, the geometric features are familiar and understandable to medical professionals, clearly evident, and take the specifics of the object into account, which ultimately improves the value of the diagnosis.

The methodology for extracting significant diagnostic information from lung CT images includes a method of automatic region-of-interest selection that is matched to the texture, geometric, and topological characteristics of regions of interest for a given training sample of grayscale diagnostic images. Besides further automating the diagnostics, it increased the veracity of pulmonary emphysema recognition from 0.94 to 0.97. The approach also includes the calculation of polynomial features and methods for matching them with the textural properties of X-ray diagnostic images, which increased the veracity of osteoporosis recognition from 0.9 to 0.95 [20, 21].

To select the most efficient features, we use their correlations with expert evaluation results and perform an analysis of variance of the training samples or of the diagnostic errors using some additional characteristics. We evaluate the efficiency of various features for the problem of automatic diagnosis and make recommendations on how to use different groups of features in medical practice.

The information technology of diagnostic image data mining involves the following methods and algorithms:

– a method of automatic region-of-interest selection in diagnostic images using the region-growing segmentation algorithm. It reduces the probability of incorrect recognition in lung CT diagnostics by considering only diagnostically important image areas;

– an algorithm for calculating polynomial features that are matched to the textural properties of grayscale diagnostic images while imposing natural restrictions on the physical feasibility of calculating quadratic features, which are significant in practice; their use can further reduce the probability of incorrect detection in the diagnostics of bone-tissue X-ray images [22, 23];

– a method and an algorithm for increasing the informative value of features, based on discriminative analysis and optimal sampling for training an expert system for disease diagnosis [14, 17, 18];

– an algorithm for reducing the dimension of the feature space for medical radiological images. A separate problem is the dimension of the space of features optimized while matching the features to the textural properties of halftone training-sample images. To solve the optimization problem effectively for such a large number of parameters, a large number of training images is needed. A finite sample of hundreds of previously prepared images is not enough: a constantly updated sample of thousands of images is needed, which cannot be obtained within a single clinic or handled on a single computer. Instead, a distributed software environment is built that allows us to store and simultaneously process large and constantly updated sets of diagnostic images;

– a method of optimal sampling for training the expert system for diagnosing vascular malformations, based on excluding discordant observations. Using hypothesis testing for discordant observations, we have developed an optimal sampling algorithm that improved the accuracy of disease diagnosis. To remove invalid data and to obtain standard feature values for each pathology group, cluster analysis tools are used to filter the training sample.

Problem-oriented software systems have been developed for the analysis of medical diagnostic images to detect pathological changes. They include software tools for quantitative estimation of the degree of pathology based on expert evaluations and the proposed classification methods: a software system for analysis of lung computed tomography, and computer systems for analysis of diagnostic images of the fundus vessels (OphthalmOffice) and the coronary vessels (CardiOffice). The software systems allow the user to control the analytics and decision-making processes and make it possible to analyze subclinical morphological changes of pathomorphological elements, computerize diagnostic steps, and carry out quantitative monitoring of pathological changes in diagnostic samples. A special feature is the use of elements of expert systems: a database of diagnostic features; correlation, discriminative, and cluster analysis of the feature space; and prognosis of the degree of pathology based on expert assessments.

The classification and diagnostic testing system [24] (Figs. 3, 4) provides tools for correlation and discriminative analysis to form an informative feature space, tools for optimal sampling based on separability criteria for pathology groups, and tools for cluster analysis that filter the training sample in order to remove invalid data and obtain standard feature values for each pathology group.

A data mining subsystem provides the user with the degree of pathology, standard feature values for each degree of pathology, and a prognosis of the possible development of the disease, and presents the diagnostic decisions made.

The use of Big Data technology in the developed medical diagnostic systems made it possible, by drawing on diverse sources of diagnostic information and larger amounts of data, to improve the training sample and reduce classification errors, which raised the diagnostic accuracy up to 95% (Table 1).

Table 1. Changes of separability criteria and diagnostic accuracy for various types of diagnostic data

Data set                        Increase in class separability    Reliability
Bone X-rays                     8%                                0.9
Ultrasonography of kidneys      23%                               0.87
Computed tomography of lungs    17%                               0.95
Blood vessels                   21%                               0.94

Research on various image classes showed that increasing the number of diversified diagnostic data sources considered, together with simultaneous consideration of various aspects of big data use, allowed us to reveal new trends that influence the diagnosis of diseases. Although the data are different, they share some common characteristics that are typical not so much of medicine as of data mining. Taking these characteristics into account in the data handling procedures resulted in improved diagnostic quality.
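The region-growing segmentation named among the methods of this section can be sketched as follows. This is a generic textbook variant, not the authors' algorithm: it grows a region from a single seed over 4-connected neighbours whose intensity differs from the seed by at most a tolerance. The image representation (a list of rows of integer intensities), the seed, and the tolerance parameter are assumptions made for illustration.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def region_grow(image: List[List[int]], seed: Pixel, tolerance: int) -> Set[Pixel]:
    """Collect 4-connected pixels whose intensity is within `tolerance` of the seed."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region: Set[Pixel] = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region
```

Restricting later feature computation to the returned pixel set is one way to consider only diagnostically important image areas, as the first list item describes. A production method would also need a rule for choosing seeds, which the paper does not specify.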

[Figure 3: block diagram of the software. INPUT AND OUTPUT SYSTEM: sample load and storage; imaging and storage of 2D and 3D graphics of mutually arranged data; organization of a user interface. EFFICIENT FEATURE GENERATION SYSTEM: discriminative analysis; feature transformation module. SAMPLE CLASSIFICATION SYSTEM: Support Vector Machines classification; clusterization; classification error analysis. SAMPLE FILTERING SYSTEM: feature space clusterization; modality criteria calculation for sample histograms. DATA ANALYSIS SYSTEM: data mining; standard values generation. Data exchanged between the user and the subsystems: sample, discriminative analysis criteria, samples with new features, classification parameters, training sample, sample parameters, filtered sample, classification result, processing result.]
Fig. 3. Architecture of classification and diagnostic testing software applications.
Fig. 4. User’s interface of a data analysis subsystem.
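The sample filtering step in Fig. 3, which Section 3 describes as exclusion of discordant observations, can be illustrated with a simple one-dimensional sketch. The paper does not state which hypothesis test is used, so the code below assumes a Grubbs-like rule with a fixed, illustrative threshold instead of tabulated critical values. The function name and the default threshold are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Tuple


def exclude_discordant(values: List[float], threshold: float = 2.0) -> Tuple[List[float], List[float]]:
    """Iteratively drop the most extreme value while it lies too far from the mean.

    A value is treated as discordant when its distance from the sample mean,
    measured in sample standard deviations, exceeds `threshold`. The fixed
    threshold is an assumption; a formal test would use critical values that
    depend on the sample size and significance level.
    """
    kept = list(values)
    removed: List[float] = []
    while len(kept) > 2:
        m, s = mean(kept), stdev(kept)
        if s == 0.0:
            break
        candidate = max(kept, key=lambda v: abs(v - m))
        if abs(candidate - m) / s <= threshold:
            break
        kept.remove(candidate)
        removed.append(candidate)
    return kept, removed
```

The function returns the filtered sample and the excluded values separately, so that excluded observations can be reviewed by an expert rather than silently discarded.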

4. CONCLUSIONS

The developed technique for generating an efficient feature space, used to analyze diagnostic images on the basis of big data mining of unstructured information with discriminative analysis and the methods proposed here, makes it possible to improve the quality of medical diagnosis by obtaining objective numerical estimates of biomedical image parameters from large arrays of available data.

The research has shown that the use of modern big data processing techniques, together with a further increase in the number and shared use of heterogeneous data sources, leads to the identification of new patterns and enhances diagnostic accuracy.

ACKNOWLEDGMENTS

This work was partially supported by the Ministry of Education and Science of the Russian Federation in the framework of the implementation of the Program of increasing the competitiveness of SSAU among the world’s leading scientific and educational centers for 2013–2020; by the Russian Foundation for Basic Research grants (nos. 15-29-03823, 15-29-07077, 16-41-630761, 16-29-11698, 17-01-00972); by the ONIT RAS program no. 6 “Bioinformatics, modern information technologies and mathematical methods in medicine” 2017, in the framework of the state task #0026-2018-0104 “Optoinformation technologies for obtaining and processing hyperspectral data.”

REFERENCES

1. A. Gandomi and M. Haider, “Beyond the hype: big data concepts, methods, and analytics,” Int. J. Inf. Manag. 35 (2), 137–144 (2015).
2. H. Ozkose, S. E. Ari, and C. Gencer, “Yesterday, today and tomorrow of big data,” Proc. – Soc. Behav. Sci. 195, 1042–1050 (2015).
3. E. Kolker, E. Stewart, and V. Ozdemir, OMICS 3 (16), 138–147 (2012).
4. V. Sujatha, S. P. Devi, S. V. Kiran, and S. Manivannan, “Bigdata analytics on diabetic retinopathy study (DRS) on real-time data set identifying survival time and length of stay,” Proc. Comput. Sci. 87, 227–232 (2016).
5. C. K. Emani, N. Cullot, and C. Nicolle, “Understandable big data: a survey,” Comput. Sci. Rev. 17, 70–81 (2015).
6. T. White, Hadoop: the Definitive Guide, 3rd ed. (O’Reilly Media. Yahoo Press, 2012) [in Russian].
7. N. Ilyasova, “Computer systems for geometrical analysis of blood vessels diagnostic images,” Opt. Mem. Neural Networks (Inf. Opt.) 23 (4), 278–286 (2014).
8. N. Yu. Ilyasova, “Methods for digital analysis of human vascular system. Literature review,” Comput. Opt. 37 (4), 517–541 (2013).
9. N. Yu. Ilyasova, A. V. Kupriyanov, and A. G. Khramov, Information Technologies of Image Analysis in Medical Diagnostics (Radio i svyaz, Moscow, 2012) [in Russian].
10. N. Ilyasova, “Evaluation of geometric characteristics of the spatial structure of vessels,” Pattern Recogn. Image Anal. 25 (4), 621–625 (2015).
11. N. Ilyasova, “Methods to evaluate the three-dimensional features of blood vessels,” Opt. Mem. Neural Networks (Inf. Opt.) 24 (1), 36–47 (2015).
12. A. V. Gaidel, “A method for adjusting directed texture features in biomedical image analysis problems,” Comput. Opt. 39 (2), 287–293 (2015).
13. N. Ilyasova, R. Paringer, A. Kupriyanov, and N. Ushakova, “The effective features formation for the identification of regions of interest in a fundus images,” CEUR Workshop Proc. 1638, 788–795 (2016).
14. N. Yu. Ilyasova, A. V. Kupriyanov, and R. A. Paringer, “The discriminative analysis application to refine the diagnostic features of blood vessels images,” Opt. Mem. Neural Networks (Inf. Opt.) 24 (4), 309–313 (2015).
15. E. Biryukova, R. Paringer, and A. Kupriyanov, “Development of the effective set of features construction technology for texture image classes discrimination,” CEUR Workshop Proc. 1638, 263–269 (2016).
16. N. Yu. Ilyasova and A. V. Kupriyanov, “The big data mining to improve medical diagnostics quality,” CEUR Workshop Proc. 1490, 346–354 (2015).
17. N. Yu. Ilyasova, A. V. Kupriyanov, and R. A. Paringer, “Formation of features for improving the quality of medical diagnosis based on discriminant analysis method,” Comput. Opt. 38 (4), 851–856 (2014).
18. N. Ilyasova, R. Paringer, and A. Kupriyanov, “Regions of interest in a fundus image selection technique using the discriminative analysis methods,” in Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016), Vol. 9972, pp. 408–417.
19. S. Maitrey and C. K. Jha, “Simplified data analysis of big data,” Proc. Comput. Sci. 57, 563–571 (2015).
20. A. V. Gaidel and A. G. Khramov, “Application of texture analysis for automated osteoporosis diagnostics by plain hip radiography,” Pattern Recogn. Image Anal. 25 (2), 301–305 (2015).
21. A. V. Gaidel, P. M. Zelter, A. V. Kapishnikov, and A. G. Khramov, “Possibilities of texture analysis of computed tomogram diagnosis of chronic obstructive disease,” Opt. Mem. Neural Networks 24 (3), 240–248 (2015).
22. A. V. Gaidel, “Adjusted polynomial features for analysis of lung CT images,” CEUR Workshop Proc. 1638, 313–319 (2016).
23. A. V. Gaidel, “Matched polynomial features for the analysis of grayscale biomedical images,” Comput. Opt. 40 (2), 232–239 (2016).
24. N. Yu. Ilyasova, “Diagnostic complex for analysis of fundus vessels,” Biotechnosphere 3, 132–138 (2014).

Nataly Yu. Ilyasova (born 1966) graduated with honors from S.P. Korolyov Samara State Aerospace University (SSAU) (1991). She received her PhD (1997) and DSc (2015) in Technical Sciences. At present, she is a senior researcher at the Image Processing Systems Institute of the Russian Academy of Sciences and holds a part-time position of Associate Professor at SSAU’s Technical Cybernetics sub-department. Her areas of interest include digital signal and image processing, pattern recognition and artificial intelligence, and biomedical imaging and analysis. Her list of publications contains more than 100 scientific papers, including 35 articles and 3 monographs published with coauthors.

Alexander Victorovich Kupriyanov (born 1978) graduated with honors from Samara State Aerospace University (SSAU) (2001). He received his Candidate’s degree in Technical Sciences (2004) and Doctor of Engineering Science degree (2013). Currently, he is a senior researcher at the Image Processing Systems Institute, Russian Academy of Sciences, and holds a part-time position as Associate Professor at SSAU’s sub-department of Technical Cybernetics. Areas of interest: digital signal and image processing, pattern recognition and artificial intelligence, nanoscale image analysis and understanding, biomedical imaging and analysis. He has more than 90 scientific papers, including 42 published articles and 2 monographs.

Rustam Aleksandrovich Paringer (born 1990) received his Master’s degree in Applied Mathematics and Informatics from Samara State Aerospace University (2013). He is a teaching assistant of the Technical Cybernetics Department and a junior researcher of Samara University, and an intern researcher of IPSI RAS – Branch of the FSRC “Crystallography and Photonics”. His research interests are currently focused on computer image processing, pattern recognition, and data mining.

Dmitriy Victorovich Kirsh (born 1990) graduated (2014) with a Master’s degree in Applied Mathematics and Informatics from Samara State Aerospace University. At present, he is a postgraduate student of Samara University and holds a part-time position of junior researcher at IPSI RAS – Branch of the FSRC “Crystallography and Photonics”. His areas of interest include digital image processing, pattern recognition, methods of mathematical formulation and comparison of crystal lattices, and classification of crystal lattices.
