Advanced Data Mining and Applications 14th International Conference ADMA 2018 Nanjing China November 16 18 2018 Proceedings Guojun Gan
https://textbookfull.com/product/advanced-data-mining-and-applications-15th-international-conference-adma-2019-dalian-china-november-21-23-2019-proceedings-jianxin-li/
https://textbookfull.com/product/advanced-data-mining-and-applications-10th-international-conference-adma-2014-guilin-china-december-19-21-2014-proceedings-1st-edition-xudong-luo/
https://textbookfull.com/product/computational-data-and-social-networks-7th-international-conference-csonet-2018-shanghai-china-december-18-20-2018-proceedings-xuemin-chen/
https://textbookfull.com/product/service-oriented-computing-16th-international-conference-icsoc-2018-hangzhou-china-november-12-15-2018-proceedings-claus-pahl/
Knowledge Engineering and Knowledge Management 21st International Conference EKAW 2018 Nancy France November 12 16 2018 Proceedings Catherine Faron Zucker
https://textbookfull.com/product/knowledge-engineering-and-knowledge-management-21st-international-conference-ekaw-2018-nancy-france-november-12-16-2018-proceedings-catherine-faron-zucker/
https://textbookfull.com/product/big-data-analytics-6th-international-conference-bda-2018-warangal-india-december-18-21-2018-proceedings-anirban-mondal/
https://textbookfull.com/product/bio-inspired-computing-theories-and-applications-13th-international-conference-bic-ta-2018-beijing-china-november-2-4-2018-proceedings-part-i-jianyong-qiao/
https://textbookfull.com/product/ambient-intelligence-14th-european-conference-ami-2018-larnaca-cyprus-november-12-14-2018-proceedings-achilles-kameas/
https://textbookfull.com/product/frontiers-in-cyber-security-first-international-conference-fcs-2018-chengdu-china-november-5-7-2018-proceedings-fagen-li/
Guojun Gan
Bohan Li
Xue Li
Shuliang Wang (Eds.)
LNAI 11323
Lecture Notes in Artificial Intelligence 11323
Editors
Guojun Gan, University of Connecticut, Storrs, CT, USA
Xue Li, The University of Queensland, Brisbane, QLD, Australia
Bohan Li, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Shuliang Wang, Beijing Institute of Technology, Beijing, China
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Finally, we would like to thank all researchers, practitioners, and volunteer students
who contributed with their work and participated in the conference.
With the new challenges in data mining research, we hope the participants in the
conference and the readers of the proceedings will enjoy the research outcome of
ADMA 2018.
General Chairs
Xue Li University of Queensland, Australia
Joao Gama University of Porto, Portugal
Acting Chair
Bing Chen Nanjing University of Aeronautics and Astronautics, China
Program Chairs
Songcan Chen Nanjing University of Aeronautics and Astronautics, China
Shuliang Wang Beijing Institute of Technology, China
Xingquan (Hill) Zhu Florida Atlantic University, USA
Demo Chairs
Zhifeng Bao RMIT, Australia
Jianqiu Xu Nanjing University of Aeronautics and Astronautics, China
Proceedings Chair
Yunlong Zhao Nanjing University of Aeronautics and Astronautics, China
Publicity Chair
Guojun Gan University of Connecticut, USA
Sponsorship Chairs
Donghai Guan Nanjing University of Aeronautics and Astronautics, China
Xiangping Zhai Nanjing University of Aeronautics and Astronautics, China
Local Chair
Bohan Li Nanjing University of Aeronautics and Astronautics, China
Web Chair
Xin Li Nanjing University of Aeronautics and Astronautics, China
Program Committee
Bin Guo Northwestern Polytechnical University, USA
Bin Yao Shanghai Jiao Tong University, China
Bin Zhao Nanjing Normal University, China
Bin Zhou National University of Defense Technology, China
Chandra Prasetyo Utomo University of Queensland, Australia
Changdong Wang Sun Yat-Sen University, China
Chuan Shi Beijing University of Posts and Telecommunications, China
Dechang Pi Nanjing University of Aeronautics and Astronautics, China
Guandong Xu University of Technology Sydney, Australia
Hongxu Chen University of Queensland, Australia
Hongzhi Wang Harbin Institute of Technology, China
Hongzhi Yin University of Queensland, Australia
Jianxin Li University of Western Australia
Jingfeng Guo Yanshan University, China
Lina Yao University of New South Wales, Australia
Luyao Liu University of Queensland, Australia
Meng Wang Xi’an Jiaotong University, China
Michael Sheng Macquarie University, Australia
Min Yao Zhejiang University, China
Moscato Pablo University of Newcastle, Australia
Nguyen Hung Griffith University, Australia
Peiquan Jin University of Science and Technology of China
Qilong Han Harbin Engineering University, China
Rui Mao Shenzhen University, China
Sen Wang Griffith University, Australia
Shuai Ma BeiHang University, China
Tong Chen University of Queensland, Australia
Unankard Sayan Maejo University, Thailand
Wei Zhang Macquarie University, Australia
Weitong Chen University of Queensland, Australia
Wenjie Ruan University of Oxford, UK
Xiaolin Qin Nanjing University of Aeronautics and Astronautics, China
Steering Committee
Jie Cao Nanjing University of Finance and Economics, China
Xue Li (Chair) University of Queensland, Australia
Shuliang Wang Beijing Institute of Technology, China
Michael Sheng University of Adelaide, Australia
Jie Tang Tsinghua University, China
Kyu-Young Whang Advanced Institute of Science and Technology, South Korea
Min Yao Zhejiang University, China
Osmar Zaiane University of Alberta, Canada
Chengqi Zhang University of Technology Sydney, Australia
Shichao Zhang Guangxi Normal University, China
Contents
Big Data
A Sparse and Low-Rank Matrix Recovery Model for Saliency Detection
Chao Wang, Jing Li, KeXin Li, and Yi Zhuang
Fault Diagnosis for an Automatic Shell Magazine Using FDA and ELM
Qiangqiang Zhao, Lingfeng Tao, Maosheng Li, and Peng Hong
Miscellaneous Topics
Abstract. Big time series data are generated daily by application domains
such as environmental monitoring, the Internet of Things, health care,
industry, and science. Mining these massive data is very challenging
because conventional data mining algorithms do not scale effectively to
massive time series. Moreover, applying a global classification approach
to highly similar and noisy data hinders classification performance.
Utilizing constrained subsequence patterns in data mining applications
therefore increases efficiency and accuracy, and can provide useful
insight into the data.
To address the above-mentioned limitations, we propose an efficient
subsequence processing technique with preference constraints. We then
introduce a sub-pattern analysis for time series data whose objective is
to maximize interclass separability using a localization approach.
Furthermore, we use the deviation from a correlation constraint as an
objective to minimize, and we include users' preferences as an objective
to maximize in proportion to the users' preferred time intervals. We
experimentally validate the efficiency and effectiveness of our proposed
algorithm on real data, demonstrating its superiority over recently
proposed correlation-based subsequence search algorithms.
1 Introduction
Time series data are continuously increasing in size and complexity. They
are being generated, gathered, and stored at an unprecedented rate, whether
for financial analysis (e.g., exchange rates, stock markets), environmental
monitoring [1–4], health care [13,16], or social networks [17]. This growth
easily overwhelms users who apply data mining applications to these
ever-growing time series data.
© Springer Nature Switzerland AG 2018
G. Gan et al. (Eds.): ADMA 2018, LNAI 11323, pp. 3–16, 2018.
https://doi.org/10.1007/978-3-030-05090-0_1
4 A. Albarrak et al.
Fig. 1. A snippet of air quality sensor data. Much of the data is redundant,
meaningless, or irrelevant and does not contribute positively to the
classification process. The two most significant sections are labeled
Pattern 1 and Pattern 2.
Efficiently Mining Constrained Subsequence Patterns 5
Example 1. Figure 1 shows a snippet of a large air quality time series
dataset collected from multiple sensors. Because most of these time series
are redundant, they do not contribute positively to the classification
process. However, it can be seen that pattern 1 and pattern 2 are the most
significant sections, since they provide highly distinguishable features.
Symbol      Definition
xi          A time series
n           Length of time series
m           Length of subsequence time series
xi[s, e]    A subsequence of xi
λ           Weight of preference
tc          Target correlation range [cl − cu]
SI          Input (initial) subsequence
Sc          Candidate subsequence
ΔρSc        Correlation deviation of Sc
ΔESc        Preference deviation of Sc
ΔSc         Total deviation of Sc
where ρ(Sc ) is a function that returns the Pearson correlation coefficient of the
subsequences in Sc . The more ΔρSc approaches zero, the more Sc is preferable.
Pearson correlation coefficient of two subsequences x and y (in Sc ) of length
m is computed as follows [10]:
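\[
\rho(S_c) = \rho(x, y) =
\frac{m \sum_{i=1}^{m} x_i y_i - \sum_{i=1}^{m} x_i \sum_{i=1}^{m} y_i}
     {\sqrt{m \sum_{i=1}^{m} x_i^2 - \left(\sum_{i=1}^{m} x_i\right)^2}\,
      \sqrt{m \sum_{i=1}^{m} y_i^2 - \left(\sum_{i=1}^{m} y_i\right)^2}}
\tag{2}
\]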
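Eq. (2) depends only on five sums over the two subsequences, which is what makes cumulative arrays useful later on. The following is a minimal sketch of that computation, not the authors' implementation; the function name pearson_from_sums is hypothetical, and the sums are computed directly here rather than read from cumulative arrays.

```python
import math


def pearson_from_sums(x, y):
    """Pearson correlation of two equal-length sequences via the sums in Eq. (2).

    Hypothetical helper for illustration. Raises ZeroDivisionError when either
    sequence is constant, because the correlation is undefined in that case.
    """
    m = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)

    numerator = m * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(m * sum_xx - sum_x ** 2)
                   * math.sqrt(m * sum_yy - sum_y ** 2))
    return numerator / denominator
```

Because only sums appear, growing or shrinking a subsequence by one value changes each of the five terms by a single addition or subtraction.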
But how is a candidate subsequence obtained in the first place? We explain
this next.
[Figure: time series x and y, each with values v1 ... vn. The subsequence between vs and ve is expanded on the s side to vs' and on the e side to ve'.]
3 Methodology
Mining for the optimal subsequence pattern is essentially a search problem.
That is, to find Sp, an algorithm has to iterate over all possible
combinations of subsequences and compute the objective in Eq. (4), keeping
the one with the minimum deviation.
Iterating over all subsequences is an intensive task that incurs high
computational overhead. We therefore tackle the problem from this angle and
propose a computation-centric algorithm (Sect. 3.3) with efficient
optimization techniques.
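The exhaustive search described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's Baseline: the name exhaustive_search, the min_len parameter, and the deviation callback (standing in for the objective of Eq. (4), which is not reproduced in this excerpt) are all hypothetical.

```python
import math


def exhaustive_search(n, deviation, min_len=2):
    """Scan every window [s, e] of a length-n series and keep the best one.

    `deviation(s, e)` is a caller-supplied stand-in for the total deviation
    objective; lower values are better. Indices are 0-based and inclusive.
    """
    best, best_dev = None, math.inf
    for s in range(n):
        for e in range(s + min_len - 1, n):
            d = deviation(s, e)
            if d < best_dev:
                best, best_dev = (s, e), d
    return best, best_dev
```

The nested loops visit on the order of n² windows, so any per-window cost that grows with the window length makes the search expensive for long series.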
Hence, with those α-skipping cumulative arrays, the components in Eq. 2 are
computed in O(α) time.
Fig. 4. Constructing the α-cumulative sum arrays S^α_x and S^α_{x²} for a time series x of length m, with α = 4. Each array holds m/α entries, indexed 1, 2, 3, ..., m/α.
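One possible reading of an α-skipping cumulative array, offered here as an assumption since the construction details are not in this excerpt, is that the running sum is stored only at every α-th position. A range sum then needs two stored entries plus fewer than α raw values at each boundary, which matches the O(α) cost quoted above. The function names below are hypothetical.

```python
from itertools import accumulate


def build_skipping_cumsum(values, alpha):
    """Keep the running sum only at every alpha-th position.

    Entry k holds sum(values[:k * alpha]); entry 0 is the empty sum.
    """
    running = list(accumulate(values))
    blocks = len(values) // alpha
    return [0] + [running[k * alpha - 1] for k in range(1, blocks + 1)]


def prefix_sum(values, skipped, alpha, end):
    """Return sum(values[:end]) using one stored entry and < alpha raw values."""
    block = end // alpha
    return skipped[block] + sum(values[block * alpha:end])


def range_sum(values, skipped, alpha, start, end):
    """Return sum(values[start:end]) as a difference of two prefix sums."""
    return (prefix_sum(values, skipped, alpha, end)
            - prefix_sum(values, skipped, alpha, start))
```

The same structure built over the squared values or over the products x_i y_i would supply the remaining terms of Eq. 2; the trade-off is memory reduced by a factor of α against up to α extra additions per query.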
part of the time series instead of the whole time series. To achieve that, it uses
the preference objective to limit the search space (lines 5–10).
Specifically, it defines two variables, maxs and maxe, as new boundaries for
the search space. Then, starting from SI's boundaries (line 4), it runs a
while-loop (line 5) until either (1) both sides of the time series have been
reached, or (2) the preference deviation has reached its maximum possible
value (lines 7, 8).
There are two prominent differences between Baseline and Baseline++.
First, the cumulative arrays are generated differently: in Baseline++ they
cover only a sub-interval of x and y, namely the newly found boundaries
[maxs, maxe] (line 10), rather than the whole interval as in Baseline.
Second, the new boundaries [maxs, maxe] in Baseline++ reduce the number of
candidate subsequences (when λ > 0), and the search cost decreases
accordingly. However, the saving that Baseline++ achieves is limited by the
ratio of subsequence length to time series length.
Next, we introduce our algorithm Incremental (INC), which uses the
preference deviation as an optimization technique to prune unpromising
subsequences. INC further optimizes the correlation computations with a
simple technique: creating and maintaining four instances of the cumulative
arrays, one for each enumeration operation: LE, RE, LC, RC.
Incremental (INC): Essentially, INC starts from SI and generates all possible
subsequences with the help of an auxiliary function Enumerate() and a
priority queue Q. This auxiliary function takes a subsequence as input and
performs the four enumeration operations defined earlier in Sect. 2.2 to
produce the set Sc. Then, for each Si ∈ Sc, the corresponding cumulative
array instance is updated incrementally. After that, the correlation is
computed from this instance, and Si is pushed into Q. Once all candidate
subsequences in Sc have been processed, INC pops an un-enumerated subsequence
Si from Q and calls Enumerate() on it, and so on. At any time, the candidate
subsequences in Q are sorted in ascending order of their distance from SI,
from closest to furthest, based on Eq. 3.
To avoid an exhaustive search, INC abandons the search when it reaches a
point where every candidate subsequence still to be generated has a
preference deviation higher than the best deviation found so far, or when
the queue becomes empty, as shown in Algorithm 3, line 7.
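The control flow of INC can be sketched as a best-first search over window boundaries. The sketch below rests on several assumptions that are not confirmed by this excerpt: LE, RE, LC and RC are read as moving the left or right boundary outward or inward by one position; the preference deviation (Eq. 3) is replaced by a hypothetical normalised boundary shift; the total deviation (Eq. 4) is replaced by a hypothetical λ-weighted sum; and the correlation is recomputed per window for brevity instead of being updated incrementally from cumulative arrays. The names inc_search, pearson and min_len are likewise illustrative.

```python
import heapq
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if undefined."""
    m = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    vx = m * sum(a * a for a in x) - sx * sx
    vy = m * sum(b * b for b in y) - sy * sy
    if vx <= 0 or vy <= 0:
        return None  # a constant window has no defined correlation
    return (m * sxy - sx * sy) / math.sqrt(vx * vy)


def inc_search(x, y, start, end, c_low, c_high, lam, min_len=3):
    """Best-first search from the window [start, end] (0-based, inclusive)."""
    n = len(x)

    def pref_dev(s, e):
        # Assumed stand-in for Eq. 3: normalised boundary shift from S_I.
        return (abs(s - start) + abs(e - end)) / n

    def total_dev(s, e):
        rho = pearson(x[s:e + 1], y[s:e + 1])
        if rho is None:
            return math.inf
        corr_dev = max(c_low - rho, rho - c_high, 0.0)
        # Assumed stand-in for Eq. 4: weighted sum of the two deviations.
        return (1 - lam) * corr_dev + lam * pref_dev(s, e)

    best, best_dev = None, math.inf
    queue = [(0.0, start, end)]  # ordered by preference deviation
    seen = {(start, end)}
    while queue:
        p, s, e = heapq.heappop(queue)
        if lam * p >= best_dev:
            break  # no remaining candidate can beat the best one found
        d = total_dev(s, e)
        if d < best_dev:
            best, best_dev = (s, e), d
        # Assumed LE, RE, LC, RC: shift one boundary by one position.
        for s2, e2 in ((s - 1, e), (s, e + 1), (s + 1, e), (s, e - 1)):
            valid = 0 <= s2 and e2 < n and e2 - s2 + 1 >= min_len
            if valid and (s2, e2) not in seen:
                seen.add((s2, e2))
                heapq.heappush(queue, (pref_dev(s2, e2), s2, e2))
    return best, best_dev
```

Under these assumptions the early exit is safe because candidates leave the queue in non-decreasing order of preference deviation, so once the weighted preference term alone reaches the best total deviation, no later candidate can improve on it. With λ = 0 the preference term never prunes, and the search stops only on a perfect match or an empty queue.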
4.1 Setup
Table 2 summarizes all parameters used throughout the experiments and the
dataset settings.
4.2 Results
5 Conclusion
We have addressed the challenging problem of finding constrained subsequence
patterns in time series data, to increase the efficiency and accuracy of data
mining applications. Then, we proposed our efficient subsequence processing
techniques and illustrated the reasons behind their design choices. Finally,
we empirically demonstrated the efficiency of our techniques using real and
synthetic datasets.
References
1. Al-Maskari, S., Bélisle, E., Li, X., Le Digabel, S., Nawahda, A., Zhong, J.:
Classification with quantification for air quality monitoring. In: Bailey, J.,
Khan, L., Washio, T., Dobbie, G., Huang, J.Z., Wang, R. (eds.) PAKDD 2016.
LNCS (LNAI), vol. 9651, pp. 578–590. Springer, Cham (2016).
https://doi.org/10.1007/978-3-319-31753-3_46
2. Al-Maskari, S., Guo, W., Zhao, X.: Biologically inspired pattern recognition for
e-nose sensors. In: Li, J., Li, X., Wang, S., Li, J., Sheng, Q.Z. (eds.) ADMA 2016.
LNCS, vol. 10086, pp. 142–155. Springer International Publishing, Cham (2016).
https://doi.org/10.1007/978-3-319-49586-6_10
3. Al-Maskari, S., Ibrahim, I.A., Li, X., Abusham, E., Almars, A.: Feature extraction
for smart sensing using multi-perspectives transformation. In: Wang, J., Cong, G.,
Chen, J., Qi, J. (eds.) ADC 2018. LNCS, vol. 10837, pp. 236–248. Springer, Cham
(2018). https://doi.org/10.1007/978-3-319-92013-9_19
4. Al-Maskari, S., Li, X., Liu, Q.: An effective approach to handling noise and drift in
electronic noses. In: Wang, H., Sharaf, M.A. (eds.) ADC 2014. LNCS, vol. 8506, pp.
223–230. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-08608-8_21
5. Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1),
164–181 (2011)
6. Gavrilov, M., Anguelov, D., Indyk, P., Motwani, R.: Mining the stock market
(extended abstract): which measure is best? In: Proceedings of the Sixth ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, 20–
23 August 2000, Boston, MA, USA, pp. 487–496 (2000)
7. Ghazavi, S.N., Liao, T.W.: Medical data mining by fuzzy modeling with selected
features. Artif. Intell. Med. 43(3), 195–206 (2008)
8. Ibrahim, I.A., Albarrak, A.M., Li, X.: Constrained recommendations for query
visualizations. Knowl. Inf. Syst. 51(2), 499–529 (2017)
9. Keogh, E.J., Kasetty, S.: On the need for time series data mining benchmarks:
a survey and empirical demonstration. Data Min. Knowl. Discov. 7(4), 349–371
(2003)
10. Li, Y., U, L.H., Yiu, M.L., Gong, Z.: Discovering longest-lasting correlation in
sequence databases. PVLDB 6(14), 1666–1677 (2013)
11. Mueen, A., Hamooni, H., Estrada, T.: Time series join on subsequence correla-
tion. In: 2014 IEEE International Conference on Data Mining, ICDM 2014, 14–17
December 2014, Shenzhen, China, pp. 450–459 (2014)
12. Mueen, A., Nath, S., Liu, J.: Fast approximate correlation for massive time-series
data. In: Proceedings of the ACM SIGMOD International Conference on Manage-
ment of Data, SIGMOD 2010, 6–10 June 2010, Indianapolis, Indiana, USA, pp.
171–182 (2010)
13. Raghupathi, W., Raghupathi, V.: Big data analytics in healthcare: promise and
potential. Health Inf. Sci. Syst. 2(1), 1 (2014)
14. Rakthanmanon, T., et al.: Searching and mining trillions of time series subse-
quences under dynamic time warping. In: The 18th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, KDD 2012, 12–16 August
2012, Beijing, China, pp. 262–270 (2012)
15. Sakurai, Y., Papadimitriou, S., Faloutsos, C.: BRAID: stream mining through
group lag correlations. In: Proceedings of the ACM SIGMOD International Con-
ference on Management of Data, 14–16 June 2005, Baltimore, Maryland, USA, pp.
599–610 (2005)
16. Utomo, C., Li, X., Wang, S.: Classification based on compressive multivariate time
series. In: Cheema, M.A., Zhang, W., Chang, L. (eds.) ADC 2016. LNCS, vol. 9877,
pp. 204–214. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46922-5_16
17. Nahar, V., Al-Maskari, S., Li, X., Pang, C.: Semi-supervised learning for
cyberbullying detection in social networks. In: Wang, H., Sharaf, M.A. (eds.)
ADC 2014. LNCS, vol. 8506, pp. 160–171. Springer, Cham (2014).
https://doi.org/10.1007/978-3-319-08608-8_14
18. Zhu, Y., Shasha, D.: Statstream: statistical monitoring of thousands of data
streams in real time. In: Proceedings of 28th International Conference on Very
Large Data Bases, VLDB 2002, 20–23 August 2002, Hong Kong, China, pp. 358–
369 (2002)
PLATE XXX
TURTLE-HEAD.—C. glabra.
TRAVELLER’S JOY.—Clematis
Virginiana.
Wild Balsam-apple.
Echinocystis lobata. Gourd Family.
White Asters.
Aster. Composite Family (p. 13).
Boneset. Thoroughwort.
Eupatorium perfoliatum. Composite Family (p. 13).
Stem.—Stout and hairy, two to four feet high. Leaves.—Opposite, widely
spreading, lance-shaped, united at the base around the stem. Flower-heads.—Dull
white, small, composed entirely of tubular blossoms borne in large clusters.
To one whose childhood was passed in the country some fifty
years ago the name or sight of this plant is fraught with unpleasant
memories. The attic or wood-shed was hung with bunches of the
dried herb which served as so many grewsome warnings against wet
feet, or any over-exposure which might result in cold or malaria. A
certain Nemesis, in the shape of a nauseous draught which was
poured down the throat under the name of “boneset tea,” attended
such a catastrophe. The Indians first discovered its virtues, and
named the plant ague-weed. Possibly this is one of the few herbs
whose efficacy has not been over-rated. Dr. Millspaugh says: “It is
prominently adapted to cure a disease peculiar to the South, known
as break-bone fever (Dengue), and it is without doubt from this
property that the name boneset was derived.”
White Snakeroot.
Eupatorium ageratoides. Composite Family (p. 13).
BONESET.—E. perfoliatum.
Climbing Hemp-weed.
Mikania scandens. Composite Family (p. 13).
Green-flowered Milkweed.
Asclepias verticillata. Milkweed Family.
A shrub from six to twelve feet high. Leaves.—Somewhat ovate and wedge-
shaped, coarsely toothed, or the upper entire. Flower-heads.—Whitish or
yellowish, composed of unisexual tubular flowers, the stamens and pistils
occurring on different plants.
Some October day, as we pick our way through the salt marshes
which lie back of the beach, we may spy in the distance a thicket
which looks as though composed of such white-flowered shrubs as
belong to June. Hastening to the spot we discover that the silky-
tufted seeds of the female groundsel tree are responsible for our
surprise. The shrub is much more noticeable and effective at this
season than when—a few weeks previous—it was covered with its
small white or yellowish flower-heads.
Grass of Parnassus.
Parnassia Caroliniana. Saxifrage Family.
Stem.—Scape-like, nine inches to two feet high, with usually one small
rounded leaf clasping it below; bearing at its summit a single flower. Leaves.—
Thickish, rounded, often heart-shaped, from the root. Flower.—White or cream-
color, veiny. Calyx.—Of five slightly united sepals. Corolla.—Of five veiny petals.
True Stamens.—Five, alternate with the petals, and with clusters of sterile gland-
tipped filaments. Pistil.—One, with four stigmas.
PLATE XXXIV
GRASS OF PARNASSUS.—P.
Caroliniana.
Fragrant Life-everlasting.
Gnaphalium polycephalum. Composite Family (p. 13).
Marsh Marigold.
Caltha palustris. Crowfoot Family.
of the “Winter’s Tale,” but insist on retaining for that larger, lovelier
garden in which we all feel a certain sense of possession—even if we
are not taxed on real estate in any part of the country—the “golden
eyes” of the Mary-buds, and we feel strengthened in our position by
the statement in Mr. Robinson’s “Wild Garden” that the marsh
marigold is so abundant along certain English rivers as to cause the
ground to look as though paved with gold at those seasons when they
overflow their banks.
These flowers are peddled about our streets every spring under
the name of cowslips—a title to which they have no claim, and which
is the result of that reckless fashion of christening unrecognized
flowers which is so prevalent, and which is responsible for so much
confusion about their English names.
The derivation of marigold is somewhat obscure. In the “Grete
Herball” of the sixteenth century the flower is spoken of as Mary
Gowles, and by the early English poets as gold simply. As the first
part of the word might be derived from the Anglo-Saxon mere—a
marsh, it seems possible that the entire name may signify marsh-
gold, which would be an appropriate and poetic title for this shining
flower of the marshes.
PLATE XXXV
Celandine.
Chelidonium majus. Poppy Family.
And when certain yellow flowers which frequent the village roadside
are pointed out to us as those of the celandine, we feel a sense of
disappointment that the favorite theme of Wordsworth should
arouse within us so little enthusiasm. So perhaps we are rather
relieved than otherwise to realize that the botanical name of this
plant signifies greater celandine; for we remember that the poet
never failed to specify the small celandine as the object of his praise.
The small celandine is Ranunculus ficaria, one of the Crowfoot
family, and is only found in this country as an escape from gardens.
PLATE XXXVI
Celandine Poppy.
Stylophorum diphyllum. Poppy Family.
Stem.—Low, two-leaved. Stem-leaves.—Opposite, deeply incised. Root-leaves.
—Incised or divided. Flowers.—Deep yellow, large, one or more at the summit of
the stem. Calyx.—Of two hairy sepals. Corolla.—Of four petals. Stamens.—Many.
Pistil.—One, with a two to four-lobed stigma.
In April or May, somewhat south and westward, the woods are
brightened, and occasionally the hill-sides are painted yellow, by this
handsome flower. In both flower and foliage the plant suggests the
celandine.
sings Bryant, in his charming, but not strictly accurate poem, for the
chances are that the “beechen buds” have almost burst into foliage,
and that the “bluebird’s warble” has been heard for some time when
these pretty flowers begin to dot the woods.
PLATE XXXVII