Algorithms
Article info

Article history:
Received 28 July 2017
Received in revised form 21 October 2017
Accepted 22 October 2017
Available online xxx

Keywords:
Pattern recognition
Data mining
Intelligent systems
Technologies
Algorithms

Abstract

The integration of intelligent systems mainly involves the application of intelligent technologies, such as artificial intelligence and computational intelligence methods, at different levels of the system. This paper introduces the applications and technologies of several intelligent system integrations, and the advantages and disadvantages of learning theory and expert systems. Neural networks are applied in intelligent systems; we review several new developments in intelligent technology and describe the development direction of intelligent systems. This paper also introduces the basic concepts of data mining and its related technologies, including artificial intelligence, machine learning, statistical analysis, fuzzy logic, pattern recognition and artificial neural networks. We analyze the structure of the general data mining algorithm and classify data mining technology in detail, covering more than 10 techniques including decision trees, neural networks, rough sets and fuzzy sets. Finally, the research directions of data mining in artificial intelligence, e-commerce applications and mobile communication computing are discussed.
© 2017 Elsevier Inc. All rights reserved.
https://doi.org/10.1016/j.suscom.2017.10.010
2210-5379/© 2017 Elsevier Inc. All rights reserved.
Please cite this article in press as: J. Zhang, et al., Intelligent computing system based on pattern recognition and data mining algorithms,
Sustain. Comput.: Inform. Syst. (2017), https://doi.org/10.1016/j.suscom.2017.10.010
G Model
SUSCOM-201; No. of Pages 11 ARTICLE IN PRESS
2 J. Zhang et al. / Sustainable Computing: Informatics and Systems xxx (2017) xxx–xxx
Expert systems can support power emergency treatment, system restoration, system planning, fault isolation, and static and dynamic security analysis. The results are satisfactory when the signals are complete and accurate, but fault tolerance is still far from what is required: when the information is incomplete, the diagnosis of power systems is prone to errors. Neural networks have nonlinear characteristics, and their parallel processing ability, robustness and self-organizing learning ability have attracted researchers in various fields for many years. Neural network algorithms are better suited to failures in small and medium power systems [11].

In optimization-based fault diagnosis, a genetic algorithm is used to optimize the system diagnosis. The genetic algorithm needs a model of the fault point, so that the warning information can be used to predict the location of the fault point. A fitness function can be defined that turns the diagnosis into an integer programming problem.

2.2. Search engines

Searching for information is not a goal in itself; a search is successful only when it finds information that is useful to us. Intelligent search engines are developed to solve this problem. They have considerable knowledge processing and understanding ability. When keyword-based search no longer satisfies the search requirements, the requirement on the intelligent search engine is raised to the knowledge-based level.

An intelligent search engine can work across platforms, and its ability to process a variety of mixed documents allows us to retrieve the latest information accurately while information is continuously being replaced. An intelligent search engine is proactive: it actively observes the behavior of the user and understands the user's focus from the user's point of view. Through continuous learning, it improves search accuracy. Personalized search is an important breakthrough of intelligent search engines and an important way to improve search accuracy [12,13]. By classifying information effectively for its users, the search engine can meet each customer's needs, and can allow customers to customize a web page and then select the items of interest or frequently visited sites.

An intelligent search engine should have a knowledge base and an information base. Implementing personalized retrieval and retrieval learning requires information retrieval, computer networks, natural language processing, distributed artificial intelligence, automatic theorem proving and other theoretical techniques.

Intelligent agent technology can take the client's particular environment into account and complete the search according to the user's interest, reducing the user's workload during the search. Through machine learning, the agent can independently find databases of interest to the user. These databases can be divided into different regions to improve retrieval performance. Information sources can also be monitored in real time to reduce retrieval time. Net-to-net technology, that is, the application of neural network technology, must establish a stable data model of the Web in order to obtain information. Web mining technology discovers patterns of interest from information sources and activities. It includes content mining, structure mining and access information mining. Applying mining technology can improve the accuracy and generalization rate of queries.

2.3. Video monitoring system

Video surveillance technology is applied in enterprise production, daily life and so on. It can be used in real-time alarm systems and scene detection. Real-time alarm systems are divided into user-defined alarms and automatic alarms for abnormal behavior. They have broad application prospects for museums, traffic, road behavior, station monitoring and other settings. Scene detection is divided into detection of the number of people, traffic detection and production process detection, and is widely used in industry [14,15].

To realize intelligent video surveillance, several new technologies should be adopted, including moving object detection, motion tracking, automatic video retrieval and pattern analysis.

2.4. Knowledge processing methods contained in collaborative intelligent computing model

Intelligent computing is, in essence, a form of connection and information interaction in the environment of a new processing platform, involving sensing, data acquisition, data analysis and processing, large-system simulation technology and communication connections.

The combination of intelligent computing and mobile devices is the general trend for the future. At the latest Taipei computer show, more and more mobile PCs were following in the footsteps of mobile phones to realize many functions. In intelligent computing, transmission speed is the responsibility of the network, while in the embedded system the balanced development of energy saving and high performance increases the performance of portable processing products. This provides more processing power for executing software to obtain information, and makes both intelligence and energy saving more thorough [16,17].

One of the core concepts of intelligent computing is that terminals collect various kinds of information and gather it in a data center. Intelligent algorithms are then used for data analysis and mining to achieve the goal of intelligence. A typical application of embedded systems and networks is the vending machine: by sending the information collected by the different devices to the client, the data analysis can be submitted to the supplier for targeted delivery, and it can also reveal the demands of different user groups, so as to give targeted procurement proposals.

3. Pattern recognition

When people see an object or phenomenon, they first collect all the information about it and then compare it with related information already held in the mind; if an identical or similar match is found, they recognize the object or phenomenon. The relevant information about an object or phenomenon, such as spatial and temporal information, therefore constitutes the pattern of the object or phenomenon. Broadly, things that can be observed in time and space can be called patterns if they can be distinguished as being the same or similar [16,17].

Human beings have a strong pattern recognition ability. They use visual information to identify text, pictures and the surrounding environment, and auditory information to identify and understand language. Pattern recognition is a basic cognitive ability of human beings. It is an important component of human intelligence and plays an important role in all kinds of human activities. In real life, almost everyone can easily accomplish pattern recognition, but it is not that easy to make a machine do the same thing. From the point of view of artificial intelligence, this paper analyzes the concept of pattern recognition and how patterns can be recognized by machine.

Pattern recognition studies mathematical models of human recognition and uses computer technology to let computers model human recognition behavior. In other words, pattern recognition is the study of how to let a machine observe the environment, learn to distinguish patterns of interest from the background, and make accurate judgments about the class of a pattern. Recognition behavior can be divided into two broad categories: identifying concrete things and identifying abstract things. The identification of concrete things involves spatio-temporal information. The identification of abstract things involves identifying a problem, solution, or argument. In other words, the recognition of abstract things is the identification of phenomena that do not exist in material form, which belongs to conceptual recognition research. Pattern recognition here refers to the identification of concrete things, such as speech waveforms, seismic waves, ECG, EEG, pictures, text and symbols. Three-dimensional objects can be measured by physical, chemical, biological and other specific means.

3.1. Pattern recognition system

A complete pattern recognition system consists of data acquisition, data processing, feature extraction and selection, and classification decision making [18,19].

In designing a pattern recognition system, attention must be paid to pattern class definitions, the application, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. For different application purposes, the content of each part of the system can vary considerably, especially in data processing and pattern classification. To improve the reliability of the identification results, a knowledge base can be added to correct possible errors, or constraints can be introduced that greatly reduce the search space in the model library, in order to reduce the matching computation. In some specific applications, such as machine vision, in addition to identifying what the object is, the position and posture of the object must be determined to guide the robot's work.

3.1.1. Data acquisition

Data acquisition refers to the use of a variety of sensors to convert the various information about the object into a set of values or symbols that the computer can accept. The space of such numerical or symbolic (string) values is called the pattern space. The key to this step is the selection of sensors. To extract valid information from these numbers or symbols, data processing must be carried out, including digital filtering and feature extraction.

3.1.2. Data processing

Data processing eliminates noise and irrelevant signals in the input data, leaving only the features and properties that are closely related to the object under study and to the identification method (such as the representation of the object's shape, perimeter and area). For example, in fingerprint recognition, the images output by fingerprint scanning equipment differ in contrast, brightness and background, and are sometimes deformed. It is therefore necessary to adopt appropriate filtering algorithms, such as block-based directional filtering and binary (two-valued) filtering, to filter out the unnecessary parts of the fingerprint image [20].

3.1.3. Feature extraction

Feature extraction refers to deriving useful information from the filtered data and finding the most effective features among many features, in order to reduce the processing difficulty. Features that humans obtain easily are often difficult for machines to obtain. Feature selection and extraction are key problems in pattern recognition. In general, the more types of candidate features there are, the better the results should be. However, this may lead to the curse of dimensionality, that is, the feature dimension becomes too high for the computer to handle. Therefore, the key to the data processing stage is the choice of filtering algorithm and feature extraction method. For different applications, the filtering algorithm, the feature extraction method and the extracted features will differ [21].

3.1.4. Classification decision or model matching

Based on the pattern feature space generated by data processing, the last part of pattern recognition can be carried out: pattern classification or model matching. The final output of this stage may be the type of object to which the pattern belongs, or the number of the model in the model database that is most similar to the object. Classification or description is usually based on a collection of patterns that have already been classified or described; this collection is called the training set, and the resulting learning strategy is called supervised learning. Learning can also be unsupervised. In this case, the system is not given a priori knowledge of the pattern classes, but relies on the statistical regularities of the patterns or on their similarity.

Membership roster: template matching. Under this idea, the patterns belonging to each class are stored in advance. An unknown pattern entering the system is compared with the stored patterns, and the class of the identical or most similar stored pattern is taken as the class of the unknown pattern.

Common properties: the common properties of the patterns of each class are stored in the classification system. When an unknown pattern enters the system, its properties are compared with the common properties of the existing classes, and it is placed into the class with similar properties [22,23].

Clustering: if the target vectors are geometrically far apart from each other, it is easy to determine the class of an unknown pattern. However, if the target vectors are close together or even overlap, relatively complex algorithms are needed to determine the class of an unknown pattern. Minimum distance classification is a simple algorithm based on the concept of clustering. It computes the distance between the unknown pattern and each known pattern, determines which known pattern is nearest, and assigns the unknown pattern to the class of that nearest pattern. The algorithm is very effective when the target vectors of different classes are geometrically well separated.

Neuron: bionics refers to the application of biological knowledge to electronic machines. The neural approach brings biological knowledge to the machine for pattern recognition, thus introducing artificial neural networks. A neural network is an information processing system consisting of a large number of simple data processing units, which work together to achieve large-scale parallel distributed processing. The design and function of a neural network are intended to mimic the biological brain and nervous system [24,25]. Neural networks have the advantages of adaptive learning, self-organization and fault tolerance. Because of these prominent features, neural networks can be used for pattern recognition. Some of the best-known neural network models are back-propagation networks, higher-order networks, time-delay networks and recurrent networks.

In general, feed-forward networks are used for pattern recognition. Feed-forward means that there is no feedback to the input. Just as humans learn from mistakes, neural networks can learn from their mistakes by feeding information back to the input. Through feedback, the input pattern can be reconstructed to avoid errors and improve the performance of the neural network. Obviously, the construction of such a network is very complex. The back-propagation algorithm is used in this kind of neural network. One of the main problems of the back-propagation algorithm is the local minimum problem. In addition, neural networks have problems with learning speed, structure selection, feature representation, modularity and scaling. Although there are such problems and difficulties
in neural networks, there is still great potential for the development of such networks [26].

3.2. Pattern recognition method

Definition 1. Assume we have n fuzzy subsets in the domain of discourse, denoted as follows:

$$A^{n} = \{A_1, \ldots, A_n\} \tag{1}$$

$$U_i, \quad i = 1, 2, \ldots, n \tag{2}$$

The collection of fuzzy vectors is called the fuzzy vector set, denoted as:

$$F^{n} \equiv \{A\} = \prod_{i=1}^{n} F(U_i) \tag{3}$$

Definition 2. A fuzzy vector is selected arbitrarily from the fuzzy vector set, denoted as $A \in F^{n}$. We build a synthetic fuzzy set, denoted as $A \equiv \langle A \rangle \equiv \langle A_1, A_2, \ldots, A_n \rangle$. Its membership function is calculated as:

$$A(u) = M_n(A_1(u_1), A_2(u_2), \ldots, A_n(u_n)) \tag{4}$$

where

$$u = (u_1, u_2, \ldots, u_n) \in U^{*}. \tag{5}$$

Axiom of nearness: let A and B be two fuzzy subsets. The nearness mapping, written here as $\sigma$, satisfies the following equations:

$$\sigma : F(U) \times F(U) \to [0, 1], \quad (A, B) \mapsto \sigma(A, B) \tag{6}$$

$$\sigma(A, A) = 1 \tag{7}$$

$$\sigma(A, B) = \sigma(B, A) \tag{8}$$

$$A \subseteq B \subseteq C \Rightarrow \sigma(A, C) \le \sigma(A, B) \wedge \sigma(B, C) \tag{9}$$

3.2.1. Four basic models of fuzzy pattern recognition

The fuzzy pattern recognition problem can be divided into the following four basic models from the perspective of the identified object (Table 1).

Table 1
Four basic types of fuzzy pattern recognition.
Standard mode | Sample to be identified | Methods
R1, ..., Rc | X | Maximum membership principle
R1, ..., Rc | X ∼ | Proximity principle
R1, ..., Rc | X | Proximity principle / Synthetic fuzzy set
R1, ..., Rc | X ∼ | Synthetic fuzzy set / Synthetic approach degree

3.2.2. Artificial neural network based pattern recognition

In the late 1950s, F. Rosenblatt proposed the perceptron, a simplified mathematical model that simulates recognition in the human brain; it was a preliminary implementation of a system with training and recognition ability. In the early 1980s, J. Hopfield revealed the associative memory and computation capacity of artificial neural networks, which opened a new approach for pattern recognition technology and led to the artificial neural network pattern recognition method. Neural pattern recognition takes advantage of the neural computing patterns that arise in neural networks. Most neural networks have some training rule, such as adjusting connection weights based on existing patterns. In other words, the neural network learns from the examples directly and obtains their structural features for generalization [18].

Artificial neural networks can outperform traditional computer-based pattern recognition systems. Patterns can be recognized by using either conventional computers or neural networks. The computer uses traditional mathematical algorithms to detect whether a given pattern matches an existing pattern. This is a simple and easily understood method. However, it can only answer "yes" or "no", and it does not allow the pattern to contain noise. Neural networks, on the other hand, allow patterns to be noisy; if properly trained, they can respond correctly to unknown patterns. For example, the BP neural network learns directly from observation data, which is very simple and effective, so it has been widely used. It is, however, a heuristic technology without a solid theoretical basis for guiding engineering practice.

Fuzzy pattern recognition and neural network pattern recognition are newly developed pattern recognition methods, and they are important components of information science and artificial intelligence. In the past few decades, interest in fuzzy mathematics, artificial intelligence and rule-based expert systems has soared. Pattern recognition plays an important role in these research fields [28].

Table 2 describes some of these pattern recognition methods. In fact, the pattern recognition methods are not completely independent of each other. In many emerging applications, where there is no single optimal approach, several different pattern recognition methods must be used simultaneously. Attempts have been made to design pattern recognition systems that incorporate a number of identification methods (Table 2).

With the rapid development of computer hardware and software technology, pattern recognition has gained more and more attention. Pattern recognition technology is maturing and has been successfully applied in many fields, such as data mining, document classification, financial forecasting, multimedia database organization and retrieval, biometrics (for example, identifying people by physical characteristics such as face and fingerprint), medicine (medical image analysis), energy, geology, meteorology (weather forecasting), the chemical industry, metallurgy, aviation (interpretation of satellite and aerial photographs), and industrial product testing. The fastest-developing areas of pattern recognition in recent years are computer vision and audition, such as handwriting recognition and biometric identification (including fingerprint recognition, iris recognition, retina recognition, palmprint recognition, face recognition and palm vein recognition).

Pattern recognition is a fast developing discipline, so it is difficult to give a comprehensive and detailed summary of the latest research progress in this field. As pattern recognition has developed, many effective methods have emerged for solving different problems, but they have not yet developed into a unified and effective pattern recognition theory for all problems. The purpose of pattern recognition is to develop general data analysis techniques that do not depend on the application domain, so that machines can analyze and solve problems like human beings. This is a difficult goal; the current work is to combine methods for specific problems and to propose new methods of pattern recognition.

3.2.3. Template matching pattern recognition

The principle of template matching is to select a known object as a template and to compare it with selected areas of the image in order to identify the target. The computation required for template matching is very large, and so is the storage of the corresponding data. Moreover, as the image and template grow, the amount of computation and storage grows geometrically. If the image and template are large enough, the computer will not be able to handle them, and image recognition loses its purpose. Another disadvantage of template matching is that, because there are many matching points, the optimal solution can be achieved in theory but is very difficult to achieve in practice.
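The template-matching principle of Section 3.2.3 can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `match_template`, the sum-of-squared-differences score and the toy data are assumptions chosen for illustration. The nested loops make the computational cost discussed above visible, since every candidate position is compared pixel by pixel with the template.

```python
def match_template(image, template):
    """Exhaustive template matching on 2-D lists of numbers.

    Hypothetical helper: slides the template over every position of the
    image and scores each position by the sum of squared differences (SSD).
    Returns the (row, col) of the best match and its SSD (0 = exact match).
    Cost is (ih - th + 1) * (iw - tw + 1) * th * tw pixel comparisons.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_ssd = None, float("inf")
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if ssd < best_ssd:
                best_pos, best_ssd = (r, c), ssd
    return best_pos, best_ssd


# Toy example: a 5 x 6 blank image with the template pasted at row 2, column 3.
template = [[1, 2], [3, 4]]
image = [[0] * 6 for _ in range(5)]
for i in range(2):
    for j in range(2):
        image[2 + i][3 + j] = template[i][j]

position, score = match_template(image, template)
```

In practice a normalized correlation score is often preferred over SSD because it tolerates brightness changes; SSD is used here only because it keeps the sketch short.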
Table 2
Common pattern recognition methods.
Method | Representation | Recognition function | Typical criterion
Template matching | Sample, pixel, curve | Correlation, distance measurement | Classification error
Statistical pattern recognition | Feature | Generic discriminant function | Classification error
Structure pattern recognition | Element | Rules, grammar | Acceptance error
Fuzzy pattern recognition | Feature | Membership function | Membership degree
Neural network pattern recognition | Sample, pixel, feature | Nonlinear signal processing function | Mean square error
Template matching is mainly used for detecting the position of objects in images, tracking moving objects, and registering images taken in different spectral bands or at different times.

3.2.4. Pattern recognition based on support vector machines

The support vector machine is a classification technology proposed in 1963. Its basic idea is to construct, in the sample space, an optimal hyperplane that maximizes the distance between the hyperplane and the samples of the different classes, so as to achieve the maximum generalization ability. Support vector machines (SVM) are simple in structure, and have global optimality and good generalization ability. They have been widely studied since they were introduced.

The SVM is an effective tool for solving pattern recognition and function estimation problems. In digital image processing applications, the SVM is used to find the differences between image pixels based on the pixel itself, its features and its surrounding environment (adjacent pixels).

3.3. Synergetic pattern recognition

The basic idea of synergetics is that a high-dimensional nonlinear problem is reduced to a set of low-dimensional nonlinear equations, the order parameter equations. The order parameter equations control the dynamics of the system near the critical point. By solving the order parameter equations, we can obtain the temporal, spatial or spatiotemporal structure.

Assume a state vector describes the system status, denoted as

$$q = (q_1, \ldots, q_n) \tag{10}$$

In all cases considered by the mathematical theory of synergetics, the time derivative of the state vector obeys the following equation:

$$\dot{q}(x, t) = N[q(x, t), \nabla, \alpha, x, t] + F(t) \tag{11}$$

The dependence of the stable modes is established by the slaving (domination) principle of synergetics. As the system evolves, the stable modes are gradually weakened, and some unstable modes are enhanced and become the main structural factors of the system. The magnitude of an unstable mode is called the order parameter. In essence, the final state of the system is determined by the unstable mode with the largest initial order parameter. Therefore, only the order parameters of the unstable modes need to be considered. Modeling in synergetics mainly consists of establishing stochastic differential equations. The slaving principle is used to eliminate the stable modes so as to obtain a closed equation for the order parameters. Thus the high-dimensional problem is reduced to a low-dimensional problem.

The process of pattern recognition corresponds to a kinetic process. Assume that a virtual particle describing a pattern moves on a potential landscape; when the particle enters a certain attractive valley, the corresponding pattern is identified. This process can be described as:

$$q(0) \to q(t) \to v_k \tag{12}$$

In image pattern recognition, the image matrix is transformed into a one-dimensional vector. We do not consider the spatial information, so the state vector depends only on the temporal coordinate. The kinetic equation used for image recognition, printed in the source across two numbered lines, is:

$$\dot{q} = \sum_{k} \lambda_k v_k (v_k^{+} q) - \sum_{k' \neq k} B_{kk'} (v_{k'}^{+} q)^{2} (v_k^{+} q) v_k - C (q^{+} q) q + F(t) \tag{13, 14}$$

where $v_k$ denotes the $k$-th prototype pattern vector and $v_k^{+}$ its adjoint.

Compared with the traditional pattern recognition system, the synergetic pattern recognition system reduces the process of feature extraction and selection. Recognition is based on the prototype patterns, so the recognition ability of the synergetic method mainly depends on the prototype patterns, and the synergetic algorithm can only be further optimized in other aspects of the synergetic neural network. Therefore, the synergetic pattern recognition method is better suited to license plate recognition, fingerprint identification, face recognition, industrial parts recognition and so on.

To further improve the ability of image recognition and classification systems, the synergetic method can be combined with traditional recognition methods based on feature extraction. Learning in synergetic neural networks is also a research direction of the synergetic pattern recognition method. For the problem of invariant recognition, we can extract the invariants of the image under spatial variation and recognize these invariants by synergetic pattern recognition, so as to achieve invariant recognition of the target pattern (Fig. 3).

4. Data mining algorithms

Data mining algorithms are mechanisms for creating data mining models. To create a model, the algorithm first analyzes a set of data and looks for specific patterns and trends. The algorithm uses the results of this analysis to define the parameters of the mining model. These parameters are then applied to the entire data set to extract actionable patterns and detailed statistical information.

Data mining is closely related to knowledge discovery. Knowledge discovery refers to the whole process of discovering useful knowledge from databases. It includes data selection, preprocessing, data transformation, data mining, pattern interpretation and knowledge evaluation. Data mining is a key step in the process of knowledge discovery: it is the extraction of useful information patterns from large amounts of random data. The purpose of data mining is to improve market decision-making ability.

Model representation is the language used to describe the model. If the language is sufficiently expressive, it helps to find accurate mathematical models. However, a descriptive language that is too powerful may lead to overfitting of the model and reduce the accuracy of prediction. Commonly used model representation methods include decision trees, nonlinear regression, case-based reasoning, Bayesian networks and inductive programming.

Model evaluation criteria: for predictive models, test data sets can be used to evaluate accuracy. Descriptive models can be evaluated in terms of accuracy, novelty, practicability and understandability.

Search methods are divided into parameter search and model search. After the model representation and model evaluation criteria have been determined, data mining has become an
that may not be logically derivable from casual observation of the data set. For example, it is logical to expect that people who commute to work by bicycle usually live not far from their place of work. But the algorithm can identify other, less obvious features of cyclists.
The clustering analysis algorithm differs from other data mining algorithms in that it can generate a clustering model without a predictable column being specified. The algorithm relies strictly on the data, and the relations among the categories are identified by the algorithm itself. It first identifies the relations in the data set and then generates a series of clusters according to these relations. Scatter diagrams are a very useful way of visually representing how the algorithm groups data. A scatter diagram can represent all instances in a dataset, where each instance is a point (Fig. 5).
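The grouping behaviour described above can be illustrated with a minimal k-means sketch. This is only an assumption about which clustering method to show: the text does not name a specific algorithm, and the function name `k_means`, the farthest-point initialisation and the sample points are all hypothetical choices made for illustration. Note that no label (predictable) column is supplied; the groups come from the data alone.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, k, max_iter=100):
    """Cluster `points` (tuples of floats) into k groups; returns (centroids, labels)."""
    # Deterministic farthest-point initialisation (an illustrative choice):
    # start from the first point, then repeatedly take the point that is
    # farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each instance joins its nearest centroid.
        new_labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Hypothetical two-dimensional instances, as they would appear in a scatter diagram.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = k_means(points, k=2)
print(labels)
```

Plotting each instance as a point coloured by its label would give the kind of scatter diagram the paragraph refers to.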
A ⇒ B, A ⊂ I, B ⊂ I (15)
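To make the rule form in Eq. (15) concrete, the sketch below computes the two measures normally attached to a rule A ⇒ B, namely support and confidence. The definitions used are the standard ones and are an assumption here, since this excerpt does not state them; the function names and the toy transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(A ∪ B) / support(A) for the rule A => B (0.0 if A never occurs)."""
    a = frozenset(antecedent)
    sup_a = support(transactions, a)
    if sup_a == 0:
        return 0.0
    return support(transactions, a | frozenset(consequent)) / sup_a


# Toy transaction database over the item set I = {bread, milk, butter}.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support(transactions, {"bread", "milk"}))       # share of transactions with both items
print(confidence(transactions, {"bread"}, {"milk"}))  # how often milk accompanies bread
```

A rule is usually reported only if both values reach user-chosen thresholds (minimum support and minimum confidence).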
application difficulty is that, when dealing with large and sparse databases, considerable space is needed for mining, processing, and recursive computation.

4.10. Data set partitioning algorithm

The data set partitioning algorithms include the Partition algorithm proposed by Savasere and the DIC algorithm proposed by Brin. The Partition algorithm divides the database into several independent logical blocks, each of which can be stored and processed in memory, which saves disk access overhead. It considers each logical block individually to generate the corresponding frequent itemsets, then generates all possible global candidate itemsets from these frequent itemsets, and finally scans the database again to calculate the support of the candidates and perform the global count. The whole process requires only two scans of the database, but the number of candidate itemsets is large.
The DIC algorithm also partitions the database and marks the beginning of each partition. While the database is being scanned, candidate itemsets can be added at each marked point, and their support is counted in parallel with the counting of the other itemsets. The number of database scans is generally smaller than the size of the largest frequent itemset. When the data blocks are properly partitioned, all the frequent itemsets can be found with only two scans of the database.
In partition-based algorithms, the main bottleneck is the execution time, and the accuracy of the frequent itemsets is not very high. However, algorithms of this type are highly parallel. They need to scan the database only twice, which greatly reduces I/O and thus improves efficiency.

4.11. Incremental updating algorithm

The incremental updating algorithm uses previously mined association rules to discover new association rules after the database or the parameters have changed, and deletes outdated association rules, so that the rule set is kept consistent with the updated data. At present, most incremental updating algorithms are derived from the Apriori algorithm, including the FUP algorithm proposed by D.W. Cheung and the IUA, PIUA and IUAR algorithms.
The FUP algorithm is an improvement of the Apriori algorithm and a classical solution to the incremental updating problem. It addresses how to generate the association rules of the updated database when the database is modified while the minimum support and the minimum confidence remain unchanged. It reuses the frequent itemset information obtained in the earlier mining process to avoid recomputing the support counts of frequent itemsets, which improves the efficiency of the algorithm.

4.12. Parallel mining algorithm

A parallel algorithm uses a set of simultaneous processes that interact and coordinate to solve a given problem. Examples include the CD, DD and CaD algorithms proposed by Agrawal et al., the PDM algorithm proposed by Park et al., and the DMA and FDM algorithms proposed by Cheung et al.
The CD algorithm allows redundant computations to run in parallel on otherwise idle processors in order to reduce traffic, and it achieves almost linear speedup. Its disadvantage is that both the traffic and the set of candidate frequent itemsets are still relatively large.
The DD algorithm divides the candidate set among the processors to overcome the shortcomings of the CD algorithm. However, its data movement scheme is inefficient and imposes a heavy communication load, the large amount of interaction between processors can easily leave processors idle, and each transaction record must be processed against multiple hash trees, which leads to redundant computation.
The CaD algorithm attempts to reduce the data dependency between processors by partitioning both the database and the candidate sets so that each processor can compute independently.
The PDM algorithm is similar to the CD algorithm: all processors hold the same hash table and candidate set. Candidates are generated in parallel: each processor generates a subset of the candidate itemsets, and the processors then exchange all of these subsets to form the global candidate set. However, the PDM algorithm involves a large number of disk I/O operations.

5. Conclusion

Intelligent system integration mainly involves artificial intelligence, computational intelligence methods and other intelligent technologies. We introduced the applications and technologies of several intelligent system integrations, the advantages and disadvantages of learning theory and expert systems, and the application of neural networks in intelligent systems. Intelligent computation requires neither an explicit solution of the problem nor gradient information about the system, so both continuous and discrete problems can be handled. For different optimization problems, intelligent computing methods can find the global optimum with a greater probability, and they can easily incorporate the heuristic rules of logic-based methods, which are simple and easy to understand.
In this paper, association rule mining is discussed in detail, and some commonly used mining algorithms are analyzed, compared and summarized based on statistics. The existing improved algorithms cannot yet meet users' need for fast and timely responses from the mining system. Therefore, the efficiency of the mining process needs to be improved, and the system should interact with the user to generate visual results. We also introduced various techniques and models of pattern recognition. Several examples show that the intelligent computing system based on pattern recognition and data mining algorithms achieves higher efficiency and a higher recognition rate.

Acknowledgement

This paper is supported by the Large-scale public building energy consumption data fusion platform (Shaanxi Provincial Department of Education grant projects, Shaanxi [2016] 250 document).

References

[1] Poonam Sinai Kenkre, Anusha Pai, Louella Colaco, Real time intrusion detection and prevention system, in: Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014, Springer, Cham, 2015.
[2] Fei Tao, et al., CCIoT-CMfg: cloud computing and internet of things-based cloud manufacturing service system, IEEE Trans. Ind. Inf. 10 (2) (2014) 1435–1442.
[3] H. Wang, J. Wang, An effective image representation method using kernel classification, in: 2014 IEEE 26th International Conference on Tools with Artificial Intelligence (ICTAI), IEEE, November, 2014, pp. 853–858.
[4] Ahmed Patel, et al., An intrusion detection and prevention system in cloud computing: a systematic review, J. Netw. Comput. Appl. 36 (1) (2013) 25–41.
[5] Batya Friedman, Peter H. Kahn, Human agency and responsible computing: Implications for computer system design, J. Sys. Software 17 (1) (1992) 7–14.
[6] C. Bi, H. Wang, R. Bao, SAR image change detection using regularized dictionary learning and fuzzy clustering, in: 2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems (CCIS), IEEE, November, 2014, pp. 327–330.
[7] T. Qiu, Y. Zhang, D. Qiao, X. Zhang, M.L. Wymore, A.K. Sangaiah, A Robust Time Synchronization Scheme for Industrial Internet of Things, IEEE Trans. Ind. Inf. (2017), http://ieeexplore.ieee.org/abstract/document/8008773/ (in press).
[8] Pachipala Yellamma, Narasimham Challa, V. Sreenivas, Intelligent Data Security In Cloud Computing, Int. J. Curr. Eng. Technol. 4 (1) (2014).
[9] S. Zhang, H. Wang, W. Huang, Two-stage plant species recognition by local mean clustering and Weighted sparse representation classification, Cluster Comput. (2017) 1–9.
[10] D.V. Medhane, A.K. Sangaiah, ESCAPE: Effective Scalable Clustering Approach for Parallel Execution of continuous position-based queries in position monitoring applications, IEEE Trans. Sustain. Comput. 2 (2) (2017) 49–61, http://dx.doi.org/10.1109/TSUSC.2017.2690378.
[11] Jinsung Byun, et al., Intelligent household LED lighting system considering energy efficiency and user satisfaction, IEEE Trans. Consumer Electron. 59 (1) (2013) 70–76.
[12] A.K. Sangaiah, O.W. Samuel, X. Li, M. Abdel-Basset, H. Wang, Towards an efficient risk assessment in software projects-fuzzy reinforcement paradigm, Comput. Electr. Eng. (2017), http://www.sciencedirect.com/science/article/pii/S0045790617305530 (in press).
[13] Oscar Castillo, Patricia Melin, Janusz Kacprzyk (Eds.), Recent advances on hybrid intelligent systems, Springer, 2013.
[14] Omid Fatahi Valilai, Mahmoud Houshmand, A collaborative and integrated platform to support distributed manufacturing system using a service-oriented approach based on cloud computing paradigm, Robot. Comput.-Integrat. Manuf. 29 (1) (2013) 110–127.
[15] M. Zareapoor, P. Shamsolmoali, J. Yang, Kernelized support vector machine with deep learning: an efficient approach for extreme multiclass dataset, Pattern Recognit. Lett. (2017), http://www.sciencedirect.com/science/article/pii/S0167865517303276 (in press).
[16] Neeraj Kumar, Jong-Hyouk Lee, Joel J.P.C. Rodrigues, Intelligent mobile video surveillance system as a Bayesian coalition game in vehicular sensor networks: learning automata approach, IEEE Trans. Intell. Transp. Syst. 16 (3) (2015) 1148–1161.
[17] Sukhpal Singh, Inderveer Chana, A survey on resource scheduling in cloud computing: Issues and challenges, J. Grid Comput. 14 (2) (2016) 217–264.
[18] B.B. Gupta, S. Gupta, P. Chaudhary, Enhancing the browser-side context-aware sanitization of suspicious html5 code for halting the dom-based XSS vulnerabilities in cloud, Int. J. Cloud Appl. Comput. (IJCAC) 7 (1) (2017) 1–31.
[19] S. Bakshi, P.K. Sa, H. Wang, S.S. Barpanda, B. Majhi, Fast periocular authentication in handheld devices with reduced phase intensive local pattern, Multimedia Tools Appl. (2017) 1–29.
[20] J. Jin, J. Gubbi, S. Marusic, M. Palaniswami, An information framework for creating a smart city through internet of things, IEEE Internet Things J. 1 (2) (2014) 112–121.
[21] Chen Chen, Xiaomin Liu, Tie Qiu, Lei Liu, Arun Kumar Sangaiah, Latency estimation based on traffic density for video streaming in the internet of vehicles, Computer Communications, 111, 2017, pp. 176–186, http://dx.doi.org/10.1016/j.comcom.2017.08.010, ISSN 0140-3664.
[22] R. Nasim, A.J. Kassler, A. Antonic, Mobile publish/subscribe system for intelligent transport systems over a cloud environment, in: 2014 International Conference on Cloud and Autonomic Computing (ICCAC), IEEE, 2014.
[23] Sundarapandian Vaidyanathan, A novel chemical chaotic reactor system and its output regulation via integral sliding mode control. parameters 1 (2015): 4.
[24] A.K. Sangaiah, A.K. Thangavelu, X.Z. Gao, N. Anbazhagan, M.S. Durai, An ANFIS approach for evaluation of team-level service climate in GSD projects using Taguchi-genetic learning algorithm, Appl. Soft Comput. 30 (2015) 628–635.
[25] A.K. Sangaiah, A.K. Thangavelu, An adaptive neuro-fuzzy approach to evaluation of team-level service climate in GSD projects, Neural Comput. Appl. 25 (3–4) (2014) 573–583.
[26] J. Wang, Y. Zhou, H. Wang, X. Yang, F. Yang, A. Peterson, Image tag completion by local learning, in: International Symposium on Neural Networks, Springer, Cham, October, 2015, pp. 232–239.
[28] V. Jain, A.K. Sangaiah, S. Sakhuja, N. Thoduka, R. Aggarwal, Supplier selection using fuzzy AHP and TOPSIS: a case study in the Indian automotive industry, Neural Comput. Appl. (2016) 1–10.