Download textbook Database Systems For Advanced Applications 22Nd International Conference Dasfaa 2017 Suzhou China March 27 30 2017 Proceedings Part I 1St Edition Selcuk Candan ebook all chapter pdf

Database Systems for Advanced
Applications 22nd International

Conference DASFAA 2017 Suzhou
China March 27 30 2017 Proceedings
Part I 1st Edition Selçuk Candan
Visit to download the full and correct content document:
https://textbookfull.com/product/database-systems-for-advanced-applications-22nd-int
ernational-conference-dasfaa-2017-suzhou-china-march-27-30-2017-proceedings-par
t-i-1st-edition-selcuk-candan/
More products digital (pdf, epub, mobi) instant
download maybe you interests ...
Database Systems for Advanced Applications 25th

International Conference DASFAA 2020 Jeju South Korea
September 24 27 2020 Proceedings Part I Yunmook Nah
https://textbookfull.com/product/database-systems-for-advanced-
applications-25th-international-conference-dasfaa-2020-jeju-
south-korea-september-24-27-2020-proceedings-part-i-yunmook-nah/

September 24 27 2020 Proceedings Part II Yunmook Nah
south-korea-september-24-27-2020-proceedings-part-ii-yunmook-nah/

September 24 27 2020 Proceedings Part III Yunmook Nah
south-korea-september-24-27-2020-proceedings-part-iii-yunmook-
nah/
Intelligent Information and Database Systems: 9th Asian

Conference, ACIIDS 2017, Kanazawa, Japan, April 3-5,
2017, Proceedings, Part I 1st Edition Ngoc Thanh Nguyen
https://textbookfull.com/product/intelligent-information-and-
database-systems-9th-asian-conference-aciids-2017-kanazawa-japan-
april-3-5-2017-proceedings-part-i-1st-edition-ngoc-thanh-nguyen/
Database Systems for Advanced Applications DASFAA 2020
International Workshops BDMS SeCoP BDQM GDMA and AIDE
Jeju South Korea September 24 27 2020 Proceedings
Yunmook Nah
applications-dasfaa-2020-international-workshops-bdms-secop-bdqm-
gdma-and-aide-jeju-south-korea-september-24-27-2020-proceedings-
yunmook-nah/
Requirements Engineering Foundation for Software

Quality 23rd International Working Conference REFSQ
2017 Essen Germany February 27 March 2 2017 Proceedings
1st Edition Paul Grünbacher
https://textbookfull.com/product/requirements-engineering-
foundation-for-software-quality-23rd-international-working-
conference-refsq-2017-essen-germany-
february-27-march-2-2017-proceedings-1st-edition-paul-grunbacher/
Emerging Technologies for Developing Countries: First

International EAI Conference, AFRICATEK 2017,
Marrakech, Morocco, March 27-28, 2017 Proceedings 1st
Edition Fatna Belqasmi Et Al. (Eds.)
https://textbookfull.com/product/emerging-technologies-for-
developing-countries-first-international-eai-conference-
africatek-2017-marrakech-morocco-
march-27-28-2017-proceedings-1st-edition-fatna-belqasmi-et-al-
eds/
Information Security and Privacy: 22nd Australasian
Conference, ACISP 2017, Auckland, New Zealand, July
3–5, 2017, Proceedings, Part I 1st Edition Josef
Pieprzyk
https://textbookfull.com/product/information-security-and-
privacy-22nd-australasian-conference-acisp-2017-auckland-new-
zealand-july-3-5-2017-proceedings-part-i-1st-edition-josef-
pieprzyk/
Business Information Systems: 20th International

Conference, BIS 2017, Poznan, Poland, June 28–30, 2017,
Proceedings 1st Edition Witold Abramowicz (Eds.)
https://textbookfull.com/product/business-information-
systems-20th-international-conference-bis-2017-poznan-poland-
june-28-30-2017-proceedings-1st-edition-witold-abramowicz-eds/
Selçuk Candan · Lei Chen
Torben Bach Pedersen · Lijun Chang
Wen Hua (Eds.)
LNCS 10177
Database Systems
for Advanced Applications
22nd International Conference, DASFAA 2017
Suzhou, China, March 27–30, 2017
Proceedings, Part I
123
Lecture Notes in Computer Science 10177
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, Lancaster, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Zurich, Switzerland
John C. Mitchell
Stanford University, Stanford, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/7409
Selçuk Candan Lei Chen
•
Torben Bach Pedersen Lijun Chang

•
Wen Hua (Eds.)
Database Systems
for Advanced Applications
22nd International Conference, DASFAA 2017
Suzhou, China, March 27–30, 2017
Proceedings, Part I
123
Editors
Selçuk Candan Lijun Chang
Arizona State University University of New South Wales
Tempe - Phoenix, AZ Sydney, NSW
USA Australia
Lei Chen Wen Hua
Hong Kong University of Science The University of Queensland
and Technology Brisbane, QLD
Hong Kong Australia
China
Torben Bach Pedersen
Aalborg University
Aalborg
Denmark
ISSN 0302-9743 ISSN 1611-3349 (electronic)

Lecture Notes in Computer Science
ISBN 978-3-319-55752-6 ISBN 978-3-319-55753-3 (eBook)
DOI 10.1007/978-3-319-55753-3
Library of Congress Control Number: 2017934640
LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI
© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Printed on acid-free paper
This Springer imprint is published by Springer Nature

The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
It is our great pleasure to welcome you to DASFAA 2017, the 22nd edition of the
International Conference on Database Systems for Advanced Applications (DASFAA),
which was held in Suzhou, China, during March 27–30, 2017.
The long history of Suzhou City has left behind many attractive scenic spots and
historical sites with beautiful and interesting legends. The elegant classical gardens, the
old-fashioned houses and delicate bridges hanging over flowing waters in the drizzling
rain, the beautiful lakes with undulating hills in lush green, and the exquisite arts and
crafts, among many other attractions, have made Suzhou a renowned historical and
cultural city full of eternal and poetic charm. Suzhou is best known for its gardens: the
Humble Administrator’s Garden, the Lingering Garden, the Surging Wave Pavilion,
and the Master of Nets Garden. These gardens weave together the best of traditional
Chinese architecture, painting, and arts. Suzhou is also known as the “Venice of the
East.” The city is sandwiched between Taihu Lake and Grand Canal. A network of
channels, criss-crossed with hump-backed bridges, give Suzhou an image of the city on
the water.
We were delighted to offer an exciting technical program, including three keynote
talks by Divesh Srivastava (AT&T Research), Christian S. Jensen (Aalborg Univer-
sity), and Victor Chang (Xi’an Jiaotong University and Liverpool University), one
10-year best paper award presentation; a demo session with four demonstrations; two
industry sessions with nine paper presentations; three tutorial sessions; and of course a
superb set of research papers. This year, we received 300 submissions to the research
track, each of which went through a rigorous review process. Specifically, each paper
was reviewed by at least three Program Committee (PC) members, followed by a
discussion led by the PC co-chairs. Several papers went through a shepherding process.
Finally, DASFA 2017 accepted 73 full research papers, yielding an acceptance ratio of
24.3%.
Four workshops were selected by the workshop co-chairs to be held in conjunction
with DASFAA 2017. They are the 4th International Workshop on Big Data Man-
agement and Service (BDMS 2017), the Second Workshop on Big Data Quality
Management (BDQM 2017), the 4th International Workshop on Semantic Computing
and Personalization (SeCoP), and the First International Workshop on Data Manage-
ment and Mining on MOOCs (DMMOOC 2017).
The workshop papers are included in a separate volume of the proceedings also
published by Springer in its Lecture Notes in Computer Science series.
The conference received generous financial support from Soochow University. The
conference organizers also received extensive help and logistic support from the
DASFAA Steering Committee and the Conference Management Toolkit Support Team
at Microsoft.
VI Preface
We are grateful to the general chairs, Karl Aberer, EPFL, Switzerland, Peter
Scheuermann, Northwestern University, USA, and Kai Zheng, Soochow University,
China, the members of the Organizing Committee, and many volunteers, for their great
support in the conference organization. Special thanks also go to the DASFAA 2017
local Organizing Committee: Zhixu Li and Jiajie Xu, both from Soochow University,
China. Finally, we would like to take this opportunity to thank the authors who sub-
mitted their papers to the conference and the PC members and external reviewers for
their expertise and help in evaluating the submissions.
February 2017 K. Selçuk Candan

Lei Chen
Torben Bach Pedersen
Organization
General Co-chairs
Karl Aberer EPFL, Switzerland
Peter Scheuermann Northwestern University, USA
Kai Zheng Soochow University, China
Program Committee Co-chairs

Selçuk Candan Arizona State University, USA
Lei Chen HKUST, Hong Kong, SAR China
Torben Bach Pedersen Aalborg University, Denmark
Workshops Co-chairs
Zhifeng Bao RMIT, Australia
Goce Trajcevski Northwestern University, USA
Industrial/Practitioners Track Co-chairs

Nicholas Jing Yuan Microsoft, China
Georgia Koutrika HP Labs, USA
Demo Track Co-chairs

Meihui Zhang Singapore University of Technology and Design,
Singapore
Wook-Shin Han POSTECH University, South Korea
Tutorial Co-chairs
Katja Hose Aalborg University, Denmark
Huiping Cao New Mexico State University, USA
Panel Chair
Xuemin Lin University of New South Wales, Australia
Proceedings Co-chairs
Lijun Chang University of New South Wales, Australia
Wen Hua University of Queensland, Australia
VIII Organization
Publicity Co-chairs
Bin Yang Aalborg University, Denmark
Xiang Lian University of Texas-Pan American, USA
Maria Luisa Sapino University of Turin, Italy
Local Organization Co-chairs

Zhixu Li Soochow University, China
Jiajie Xu Soochow University, China
Steering Committee Liaison

Xiaofang Zhou University of Queensland, Australia
Conference Secretary
Yan Zhao Soochow University, China
Webmaster
Yang Li Soochow University, China
Program Committee
Amr El Abbadi University of California at Santa Barbara, USA
Alberto Abello UPC Barcelona, Spain
Divyakant Agrawal University of California at Santa Barbara, USA
Marco Aldinucci University of Turin, Italy
Ira Assent Aarhus University, Denmark
Rafael Berlanga Llavori University Jaume I, Spain
Francescho Bonchi ISI Foundation, Italy
Selcuk Candan Arizona State University, USA
Huiping Cao New Mexico State University, USA
Barbara Catania Università di Genova, Italy
Qun Chen Northwestern Polytechnical University, China
Reynold Cheng University of Hong Kong, SAR China
Wonik Choi Inha University, South Korea
Gao Cong Nanyang Technological University, Singapore
Bin Cui Peiking University, China
Lars Dannecker SAP, Germany
Hasan Davulcu Arizona State University, USA
Ugur Demiryurek University of Southern California, USA
Francesco Di Mauro University of Turin, Italy
Curtis Dyreson Utah State University, USA
Hakan Ferhatosmanoglu Bilkent University, Turkey
Organization IX
Elena Ferrari Università dell’Insubria, Italy

Johann Gamper Free University of Bozen-Bolzano, Italy
Hong Gao Harbin Institute of Technology, China
Yunjun Gao Zhejiang University, China
Yash Garg Arizona State University, USA
Lukasz Golab University of Waterloo, Canada
Le Gruenwald University of Oklahoma, USA
Ismail Hakki Toroslu Middle East Technical University, Turkey
Jingrui He Arizona State University, USA
Yoshiharu Ishikawa Nagoya University, Japan
Linnan Jiang HKUST, SAR China
Cheqing Jin East China Normal University, China
Alekh Jindal MIT, USA
Sungwon Jung Jung Sogang University, South Korea
Arijit Khan Nanyang Technological University, Singapore
Latifur Khan University of Texas at Dallas, USA
Deok-Hwan Kim Inha University, South Korea
Jinho Kim Kangwon National University, South Korea
Jong Wook Kim Sangmyung University, South Korea
Jung Hyun Kim Arizona State University, USA
Sang-Wook Kim Hanyang University, South Korea
Peer Kroger LMU Munich, Germany
Anne Laurent Université de Montpellier II, France
Wookey Lee Inha University, South Korea
Young-Koo Lee Kyung Hee University, South Korea
Chengkai Li University of Texas at Arlington, USA
Feifei Li University of Utah, USA
Guoliang Li Tsinghua University, China
Zhixu Li Soochow University, China
Xiang Lian Kent State University, USA
Eric Lo Chinese University of Hong Kong, SAR China
Woong-Kee Loh Gachon University, South Korea
Hua Lu Aalborg University, Denmark
Nikos Mamoulis University of Ioannina, Greece/University
of Hong Kong, SAR China
Ioana Manolescu Inria, France
Rui Meng HKUST, SAR China
Parth Nagarkar Arizona State University, USA
Yunmook Nah Dankook University, South Korea
Kjetil Norvag Norwegian University of Science and Technology,
Norway
Sarana Yi Nutanong City University of Hong Kong, SAR China
Vincent Oria New Jersey Institute of Technology, USA
Paolo Papotti Arizona State University, USA
Torben Bach Pedersen Aalborg University, Denmark
Ruggero Pensa University of Turin, Italy
X Organization
Dieter Pfoser George Mason University, USA

Evaggelia Pitoura University of Ioannina, Greece
Silvestro Poccia University of Turin, Italy
Weixiong Rao Tongji University, China
Matthias Renz George Mason University, USA
Oscar Romero UPC Barcelona, Spain
Florin Rusu University of California Merced, USA
Simonas Saltenis Aalborg University, Denmark
Maria Luisa Sapino University of Turin, Italy
Claudio Schifanella University of Turin, Italy
Cyrus Shahabi University of Southern California, USA
Jieying She HKUST, SAR China
Hengtao Shen University of Queensland, China
Yanyan Shen Shanghai Jiao Tong University, China
Alkis Simitsis HP Labs, USA
Shaoxu Song Tsinghua University, China
Yangqiu Song HKUST, SAR China
Xiaoshuai Sun University of Queensland, Australia
Letizia Tanca Politecnico di Milano, Italy
Nan Tang Qatar Computing Research Institute, Qatar
Egemen Tanin University of Melbourne, Australia
Junichi Tatemura Google, USA
Christian Thomsen Aalborg University, Denmark
Hanghang Tong Arizona State University, USA
Yongxin Tong Beihang University, China
Panos Vassiliadis University of Ioannina, Greece
Sabrina De Capitani University of Milan, Italy
Vimercati
Bin Wang Northeastern University, China
Wei Wang National University of Singapore, Singapore
Xin Wang Tianjin University, China
John Wu Lawrence Berkeley Lab, USA
Xiaokui Xiao Nanyang Technological University, Singapore
Xike Xie University of Science and Technology of China, China
Jianliang Xu Hong Kong Bapatist University, SAR China
Jeffrey Xu Yu Chinese University of Hong Kong, China
Xiaochun Yang Northeastern University, China
Bin Yao Shanghai Jiao Tong University, China
Hongzhi Yin University of Queensland, Australia
Man Lung Yiu Hong Kong Polytechnic, SAR China
Yi Yu National Institute of Informatics, Japan
Ye Yuan Northeastern University, China
Meihui Zhang Singapore University of Technology and Design,
Singapore
Wenjie Zhang University of New South Wales, Australia
Ying Zhang University of Technology Sydney, Australia
Organization XI
Zhengjie Zhang Advanced Digital Sciences Center, Singapore

Xiangmin Zhou RMIT, Australia
Yongluan Zhou University of Southern Denmark, Denmark
Lei Zhu University of Queensland, Australia
Esteban Zimanyi Université Libre de Bruxelles, Belgium
Andreas Zufle George Mason University, USA
Industry Track Program Committee

Akhil Arora Xerox Research Centre, India
Jie Bao Microsoft, China
Senjuti Basu Roy UW Tacoma, USA
Neil Zhenqiang Gong Iowa State University, USA
Defu Lian University of Electronic Science and Technology,
China
Qi Liu University of Science and Technology of China, China
Alkis Simitsis HP Labs, USA
Kostas Stefanidis University of Tampere, Finland
Lu-An Tang NEC Lab, USA
Fuzheng Zhang Microsoft, China
Hengshu Zhu Baidu, China
Demo Track Program Committee

Jinha Kim Oracle Labs, USA
Xuan Liu Baidu, China
Yanyan Shen Shanghai Jiao Tong University, China
Yongxin Tong Beihang University, China
Additional Reviewers
Jinpeng Chen Beihang University, China
Xilun Chen Arizona State University, USA
Yu Cheng Turn Inc., USA
Alexander Crosdale UC Merced, USA
Tiziano De Matteis University of Pisa, Italy
Vasilis Efthymiou University of Crete, Greece
Roberto Esposito University of Turin, Italy
Yixiang Fang The University of Hong Kong, SAR China
Christian Frey LMU Munich, Germany
Yash Garg Arizona State University, USA
Concorde Habineza George Mason University, USA
Jiafeng Hu The University of Hong Kong, SAR China
Zhiyi Huang UC Merced, USA
Zhipeng Huang The University of Hong Kong, SAR China
Shengyu Huang Arizona State University, USA
XII Organization
Petar Jovanovic UPC Barcelona, Spain

Elvis Koci Technische Universität Dresden, Germany
Georgia Koloniari University of Macedonia, Greece
Haridimos Kondylakis ICS-FORTH, Greece
Rohit Kumar
Xiaodong Li The University of Hong Kong, SAR China
Xinsheng Li Arizona State University, USA
Mao-Lin Li Arizona State University, USA
Xuan Liu Google, USA
Sicong Liu Arizona State University, USA
Yujing Ma UC Merced, USA
Sebastian Mattheis BMW Car-IT, Germany
Mirjana Mazuran Politecnico di Milano, Italy
Gabriele Mencagli University of Pisa, Italy
Faisal Munir
Sergi Nadal Universitat Politècnica de Catalunya, Spain
Silvestro Poccia University of Turin, Italy
Emanuele Rabosio Politecnico di Milano, Italy
Klaus A. Schmid Ludwig-Maximilians Universität München, Germany
Caihua Shan The University of Hong Kong, SAR China
Haiqi Sun The University of Hong Kong, SAR China
Massimo Torquati University of Turin, Italy
Martin Torres UC Merced, USA
Jovan Varga Universitat Politècnica de Catalunya, Spain
Paolo Viviani University of Turin, Italy
Hongwei Wang
Jianqiu Xu Nanjing University of Aeronautics and Astronautics,
China
Yong Xu The University of Hong Kong, SAR China
Xiaoyan Yang Advanced Digital Sciences Center, Singapore
Liming Zhan University of New South Wales, Australia
Xin Zhang UC Merced, USA
Weijie Zhao UC Merced, USA
Yuhai Zhao Northeastern University, China
Contents – Part I
Semantic Web and Knowledge Management
A General Fine-Grained Truth Discovery Approach for Crowdsourced

Data Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Yang Du, Hongli Xu, Yu-E Sun, and Liusheng Huang
Learning the Structures of Online Asynchronous Conversations . . . . . . . . . . 19

Jun Chen, Chaokun Wang, Heran Lin, Weiping Wang, Zhipeng Cai,
and Jianmin Wang
A Question Routing Technique Using Deep Neural Network

for Communities of Question Answering . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Amr Azzam, Neamat Tazi, and Ahmad Hossny
Category-Level Transfer Learning from Knowledge Base to Microblog

Stream for Accurate Event Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Weijing Huang, Tengjiao Wang, Wei Chen, and Yazhou Wang
Indexing and Distributed Systems
AngleCut: A Ring-Based Hashing Scheme for Distributed

Metadata Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Jiaxi Liu, Renxuan Wang, Xiaofeng Gao, Xiaochun Yang,
and Guihai Chen
An Efficient Bulk Loading Approach of Secondary Index in Distributed

Log-Structured Data Stores. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Yanchao Zhu, Zhao Zhang, Peng Cai, Weining Qian, and Aoying Zhou
Performance Comparison of Distributed Processing of Large Volume

of Data on Top of Xen and Docker-Based Virtual Clusters . . . . . . . . . . . . . 103
Haejin Chung and Yunmook Nah
An Adaptive Data Partitioning Scheme for Accelerating Exploratory Spark

SQL Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Chenghao Guo, Zhigang Wu, Zhenying He, and X. Sean Wang
Network Embedding
Semi-Supervised Network Embedding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

Chaozhuo Li, Zhoujun Li, Senzhang Wang, Yang Yang,
Xiaoming Zhang, and Jianshe Zhou
XIV Contents – Part I
CirE: Circular Embeddings of Knowledge Graphs . . . . . . . . . . . . . . . . . . . . 148

Zhijuan Du, Zehui Hao, Xiaofeng Meng, and Qiuyue Wang
PPNE: Property Preserving Network Embedding. . . . . . . . . . . . . . . . . . . . . 163

Chaozhuo Li, Senzhang Wang, Dejian Yang, Zhoujun Li, Yang Yang,
Xiaoming Zhang, and Jianshe Zhou
HINE: Heterogeneous Information Network Embedding. . . . . . . . . . . . . . . . 180

Yuxin Chen and Chenguang Wang
Trajectory and Time Series Data Processing
DT-KST: Distributed Top-k Similarity Query on Big Trajectory Streams . . . . 199

Zhigang Zhang, Yilin Wang, Jiali Mao, Shaojie Qiao, Cheqing Jin,
and Aoying Zhou
A Distributed Multi-level Composite Index for KNN Processing on Long

Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Xiaqing Wang, Zicheng Fang, Peng Wang, Ruiyuan Zhu, and Wei Wang
Outlier Trajectory Detection: A Trajectory Analytics Based Approach . . . . . . 231

Zhongjian Lv, Jiajie Xu, Pengpeng Zhao, Guanfeng Liu, Lei Zhao,
and Xiaofang Zhou
Clustering Time Series Utilizing a Dimension Hierarchical

Decomposition Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Qiuhong Li, Peng Wang, Yang Wang, Wei Wang, Yimin Liu, Jiaye Wu,
and Danyang Dou
Data Mining
Fast Extended One-Versus-Rest Multi-label SVM Classification Algorithm

Based on Approximate Extreme Points . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Zhongwei Sun, Zhongwen Guo, Xupeng Wang, Jing Liu,
and Shiyong Liu
Efficiently Discovering Most-Specific Mixed Patterns from Large

Data Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Xiaoying Wu and Dimitri Theodoratos
Max-Cosine Matching Based Neural Models for Recognizing

Textual Entailment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Zhipeng Xie and Junfeng Hu
An Intelligent Field-Aware Factorization Machine Model . . . . . . . . . . . . . . . 309

Cairong Yan, Qinglong Zhang, Xue Zhao, and Yongfeng Huang
Contents – Part I XV
Query Processing and Optimization (I)
Beyond Skylines: Explicit Preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327

Markus Endres and Timotheus Preisinger
Optimizing Window Aggregate Functions in Relational Database Systems . . . 343

Guangxuan Song, Jiansong Ma, Xiaoling Wang, Cheqing Jin,
and Yu Cao
Query Optimization on Hybrid Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . 361

Anxuan Yu, Qingzhong Meng, Xuan Zhou, Binyu Shen,
and Yansong Zhang
Efficient Batch Grouping in Relational Datasets . . . . . . . . . . . . . . . . . . . . . 376

Jizhou Sun, Jianzhong Li, and Hong Gao
Text Mining
Memory-Enhanced Latent Semantic Model: Short Text Understanding for

Sentiment Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
Fei Hu, Xiaofei Xu, Jingyuan Wang, Zhanbo Yang, and Li Li
Supervised Intensive Topic Models for Emotion Detection over Short Text . . . . 408
Yanghui Rao, Jianhui Pang, Haoran Xie, An Liu, Tak-Lam Wong,
Qing Li, and Fu Lee Wang
Leveraging Pattern Associations for Word Embedding Models . . . . . . . . . . . 423

Qian Liu, Heyan Huang, Yang Gao, Xiaochi Wei, and Ruiying Geng
Multi-Granularity Neural Sentence Model for Measuring

Short Text Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Jiangping Huang, Shuxin Yao, Chen Lyu, and Donghong Ji
Recommendation
Leveraging Kernel Incorporated Matrix Factorization for Smartphone

Application Recommendation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Chenyang Liu, Jian Cao, and Jing He
Preference Integration in Context-Aware Recommendation . . . . . . . . . . . . . . 475

Lin Zheng and Fuxi Zhu
Jointly Modeling Heterogeneous Temporal Properties

in Location Recommendation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 490
Saeid Hosseini, Hongzhi Yin, Meihui Zhang, Xiaofang Zhou,
and Shazia Sadiq
XVI Contents – Part I
Location-Aware News Recommendation Using Deep Localized

Semantic Analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
Cheng Chen, Thomas Lukasiewicz, Xiangwu Meng, and Zhenghua Xu
Review-Based Cross-Domain Recommendation Through Joint

Tensor Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
Tianhang Song, Zhaohui Peng, Senzhang Wang, Wenjing Fu,
Xiaoguang Hong, and Philip S. Yu
Security, Privacy, Senor and Cloud
A Local-Clustering-Based Personalized Differential Privacy Framework

for User-Based Collaborative Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
Yongkai Li, Shubo Liu, Jun Wang, and Mengjun Liu
Fast Multi-dimensional Range Queries on Encrypted Cloud Databases. . . . . . 559

Jialin Chi, Cheng Hong, Min Zhang, and Zhenfeng Zhang
When Differential Privacy Meets Randomized Perturbation: A Hybrid

Approach for Privacy-Preserving Recommender System. . . . . . . . . . . . . . . . 576
Xiao Liu, An Liu, Xiangliang Zhang, Zhixu Li, Guanfeng Liu, Lei Zhao,
and Xiaofang Zhou
Supporting Cost-Efficient Multi-tenant Database Services with Service

Level Objectives (SLOs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592
Yifeng Luo, Junshi Guo, Jiaye Zhu, Jihong Guan, and Shuigeng Zhou
Recovering Missing Values from Corrupted Spatio-Temporal Sensory Data

via Robust Low-Rank Tensor Completion . . . . . . . . . . . . . . . . . . . . . . . . . 607
Wenjie Ruan, Peipei Xu, Quan Z. Sheng, Nickolas J.G. Falkner, Xue Li,
and Wei Emma Zhang
Social Network Analytics (I)
Group-Level Influence Maximization with Budget Constraint . . . . . . . . . . . . 625

Qian Yan, Hao Huang, Yunjun Gao, Wei Lu, and Qinming He
Correlating Stressor Events for Social Network Based Adolescent

Stress Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
Qi Li, Liang Zhao, Yuanyuan Xue, Li Jin, Mostafa Alli, and Ling Feng
Emotion Detection in Online Social Network Based

on Multi-label Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
Xiao Zhang, Wenzhong Li, and Sanglu Lu
Contents – Part I XVII
Tutorials
Urban Computing: Enabling Urban Intelligence with Big Data . . . . . . . . . . . 677

Yu Zheng
Incentive-Based Dynamic Content Management in Mobile Crowdsourcing

for Smart City Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 680
Sanjay K. Madria
Tutorial on Data Analytics in Multi-engine Environments . . . . . . . . . . . . . . 682

Verena Kantere and Maxim Filatov
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 685

Contents – Part II
Map Matching and Spatial Keywords
HIMM: An HMM-Based Interactive Map-Matching System . . . . . . . . . . . . . 3

Xibo Zhou, Ye Ding, Haoyu Tan, Qiong Luo, and Lionel M. Ni
HyMU: A Hybrid Map Updating Framework . . . . . . . . . . . . . . . . . . . . . . . 19

Tao Wang, Jiali Mao, and Cheqing Jin
Multi-objective Spatial Keyword Query with Semantics . . . . . . . . . . . . . . . . 34

Jing Chen, Jiajie Xu, Chengfei Liu, Zhixu Li, An Liu, and Zhiming Ding
Query Processing and Optimization (II)
RSkycube: Efficient Skycube Computation by Reusing Principle . . . . . . . . . 51

Kaiqi Zhang, Hong Gao, Xixian Han, Donghua Yang, Zhipeng Cai,
and Jianzhong Li
Similarity Search Combining Query Relaxation and Diversification . . . . . . . . 65

Ruoxi Shi, Hongzhi Wang, Tao Wang, Yutai Hou, Yiwen Tang,
Jianzhong Li, and Hong Gao
An Unsupervised Approach for Low-Quality Answer Detection

in Community Question-Answering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Haocheng Wu, Zuohui Tian, Wei Wu, and Enhong Chen
Approximate OLAP on Sustained Data Streams . . . . . . . . . . . . . . . . . . . . . 102

Salman Ahmed Shaikh and Hiroyuki Kitagawa
Search and Information Retrieval
Hierarchical Semantic Representations of Online News Comments

for Emotion Tagging Using Multiple Information Sources . . . . . . . . . . . . . . 121
Chao Wang, Ying Zhang, Wei Jie, Christian Sauer, and Xiaojie Yuan
Towards a Query-Less News Search Framework on Twitter . . . . . . . . . . . . . 137

Xiaotian Hao, Ji Cheng, Jan Vosecky, and Wilfred Ng
Semantic Definition Ranking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

Zehui Hao, Zhongyuan Wang, Xiaofeng Meng, Jun Yan,
and Qiuyue Wang
XX Contents – Part II
An Improved Approach for Long Tail Advertising in Sponsored Search . . . . 169

Amar Budhiraja and P. Krishna Reddy
String and Sequence Processing
Locating Longest Common Subsequences with Limited Penalty . . . . . . . . . . 187

Bin Wang, Xiaochun Yang, and Jinxu Li
Top-k String Auto-Completion with Synonyms . . . . . . . . . . . . . . . . . . . . . . 202

Pengfei Xu and Jiaheng Lu
Efficient Regular Expression Matching on Compressed Strings . . . . . . . . . . . 219

Yutong Han, Bin Wang, Xiaochun Yang, and Huaijie Zhu
Mining Top-k Distinguishing Temporal Sequential Patterns

from Event Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Lei Duan, Li Yan, Guozhu Dong, Jyrki Nummenmaa, and Hao Yang
Stream Data Processing
Soft Quorums: A High Availability Solution for Service Oriented

Stream Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Chunyao Song, Tingjian Ge, Cindy Chen, and Jie Wang
StroMAX: Partitioning-Based Scheduler for Real-Time Stream

Processing System. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Jiawei Jiang, Zhipeng Zhang, Bin Cui, Yunhai Tong, and Ning Xu
Partition-Based Clustering with Sliding Windows for Data Streams . . . . . . . . 289

Jonghem Youn, Jihun Choi, Junho Shim, and Sang-goo Lee
CBP: A New Parallelization Paradigm for Massively Distributed

Stream Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Qingsong Guo and Yongluan Zhou
Social Network Analytics (II)
Measuring and Maximizing Influence via Random Walk in Social

Activity Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
Pengpeng Zhao, Yongkun Li, Hong Xie, Zhiyong Wu, Yinlong Xu,
and John C.S. Lui
Adaptive Overlapping Community Detection with Bayesian NonNegative

Matrix Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Xiaohua Shi, Hongtao Lu, and Guanbo Jia
Contents – Part II XXI
A Unified Approach for Learning Expertise and Authority

in Digital Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
B. de La Robertie, L. Ermakova, Y. Pitarch, A. Takasu, and O. Teste
Graph and Network Data Processing
Efficient Local Clustering Coefficient Estimation in Massive Graphs . . . . . . . 371

Hao Zhang, Yuanyuan Zhu, Lu Qin, Hong Cheng, and Jeffrey Xu Yu
Efficient Processing of Growing Temporal Graphs . . . . . . . . . . . . . . . . . . . 387

Huanhuan Wu, Yunjian Zhao, James Cheng, and Da Yan
Effective k-Vertex Connected Component Detection

in Large-Scale Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
Yuan Li, Yuhai Zhao, Guoren Wang, Feida Zhu, Yubao Wu,
and Shengle Shi
Spatial Databases
Efficient Landmark-Based Candidate Generation for kNN Queries

on Road Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
Tenindra Abeywickrama and Muhammad Aamir Cheema
MinSum Based Optimal Location Query in Road Networks . . . . . . . . . . . . . 441

Lv Xu, Ganglin Mai, Zitong Chen, Yubao Liu, and Genan Dai
Efficiently Mining High Utility Co-location Patterns from Spatial Data Sets
with Instance-Specific Utilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Lizhen Wang, Wanguo Jiang, Hongmei Chen, and Yuan Fang
Real Time Data Processing
Supporting Real-Time Analytic Queries in Big

and Fast Data Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
Guangjun Wu, Xiaochun Yun, Chao Li, Shupeng Wang, Yipeng Wang,
Xiaoyu Zhang, Siyu Jia, and Guangyan Zhang
Boosting Moving Average Reversion Strategy for Online Portfolio

Selection: A Meta-learning Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
Xiao Lin, Min Zhang, Yongfeng Zhang, Zhaoquan Gu, Yiqun Liu,
and Shaoping Ma
Continuous Summarization over Microblog Threads . . . . . . . . . . . . . . . . . . 511

Liangjun Song, Ping Zhang, Zhifeng Bao, and Timos Sellis
Drawing Density Core-Sets from Incomplete Relational Data . . . . . . . . . . . . 527

Yongnan Liu, Jianzhong Li, and Hong Gao
XXII Contents – Part II
Big Data (Industrial)
Co-training an Improved Recurrent Neural Network with Probability

Statistic Models for Named Entity Recognition . . . . . . . . . . . . . . . . . . . . . . 545
Yueqing Sun, Lin Li, Zhongwei Xie, Qing Xie, Xin Li, and Guandong Xu
EtherQL: A Query Layer for Blockchain System. . . . . . . . . . . . . . . . . . . . . 556

Yang Li, Kai Zheng, Ying Yan, Qi Liu, and Xiaofang Zhou
Optimizing Scalar User-Defined Functions in In-Memory Column-Store

Database Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
Cheol Ryu, Sunho Lee, Kihong Kim, Kunsoo Park, Yongsik Kwon,
Sang Kyun Cha, Changbin Song, Emanuel Ziegler, and Stephan Muench
GPS-Simulated Trajectory Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581

Han Su, Wei Chen, Rong Liu, Min Nie, Bolong Zheng, Zehao Huang,
and Defu Lian
Social Networks and Graphs (Industrial)
Predicting Academic Performance via Semi-supervised Learning

with Constructed Campus Social Network . . . . . . . . . . . . . . . . . . . . . . . . . 597
Huaxiu Yao, Min Nie, Han Su, Hu Xia, and Defu Lian
Social User Profiling: A Social-Aware Topic Modeling Perspective. . . . . . . . 610

Chao Ma, Chen Zhu, Yanjie Fu, Hengshu Zhu, Guiquan Liu,
and Enhong Chen
Cost-Effective Data Partition for Distributed Stream Processing System . . . . . 623

Xiaotong Wang, Junhua Fang, Yuming Li, Rong Zhang,
and Aoying Zhou
A Graph-Based Push Service Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636

Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He
Edge Influence Computation in Dynamic Graphs . . . . . . . . . . . . . . . . . . . . 649

Yongrui Qin, Quan Z. Sheng, Simon Parkinson,
and Nickolas J.G. Falkner
Demos
DKGBuilder: An Architecture for Building a Domain Knowledge Graph

from Scratch. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
Yan Fan, Chengyu Wang, Guomin Zhou, and Xiaofeng He
Contents – Part II XXIII
CLTR: Collectively Linking Temporal Records Across

Heterogeneous Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668
Yanyan Zou and Kasun S. Perera
PhenomenaAssociater: Linking Multi-domain Spatio-Temporal Datasets. . . . . 672

Prathamesh Walkikar and Vandana P. Janeja
VisDM–A Data Stream Visualization Platform . . . . . . . . . . . . . . . . . . . . . . 677

Lars Melander, Kjell Orsborn, Tore Risch, and Daniel Wedlund
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681

Semantic Web and Knowledge
Management
A General Fine-Grained Truth Discovery
Approach for Crowdsourced Data Aggregation
Yang Du1(B) , Hongli Xu1 , Yu-E Sun2 , and Liusheng Huang1

1
School of Computer Science and Technology,
University of Science and Technology of China, Hefei, China
jannr@mail.ustc.edu.cn, {xuhongli,lshuang}@ustc.edu.cn
2
School of Urban Rail Transportation, Soochow University, Suzhou, China
sunye12@suda.edu.cn
Abstract. Crowdsourcing has been proven to be an efficient tool to

collect large-scale datasets. Answers provided by the crowds are often
noisy and conflicted, which makes aggregating them to infer ground
truth a critical challenge. Existing fine-grained truth discovery methods
solve this problem by exploring the correlation between source reliabil-
ity and task topics or answers. However, they can only work on lim-
ited tasks, which results in the incompatibility with Writing tasks and
Transcription tasks, along with the insufficient utilization of the global
dataset. To maintain compatibility, we consider the existence of clusters
in both tasks and sources, then propose a general fine-grained method.
The proposed approach contains two integral components: kl-means and
Pattern-based Truth Discovery (PTD). With the aid of ground truth
data, kl-means directly employs a co-clustering reliability model on the
correctness matrix to learn the patterns. Then PTD conducts the answer
aggregation by incorporating captured patterns, producing a more accu-
rate estimation. Therefore, our approach is compatible with all tasks
and can better demonstrate the correlation among tasks and sources.
Experimental results show that our method can produce a more precise
estimation than other general truth discovery methods due to its ability
to learn and utilize the patterns of both tasks and sources.
1 Introduction
In recent years, producing large-scale datasets has been vital for both research
and industry. Traditional strategies, mostly hiring motivated experts, are not
competent enough when collecting datasets like ImageNet [1] and TinyImages
[2] which contain millions of samples. On the other hand, crowdsourcing services,
like AMT (Amazon Mechanical Turk) [3] or CrowdFlower [4], offer a convenient
way to collect datasets by distributing tasks to workers (sources) all over the
world.
For its efficiency and effectiveness in producing large-scale datasets, crowd-
sourcing has become increasingly popular. As sources are not necessarily experts,
tasks are often distributed more than once, which results in the noisy and con-
flicted answers. Therefore, how to aggregate answers and infer the ground truth
has been a critical challenge in crowdsourcing.

c Springer International Publishing AG 2017
S. Candan et al. (Eds.): DASFAA 2017, Part I, LNCS 10177, pp. 3–18, 2017.
DOI: 10.1007/978-3-319-55753-3 1
4 Y. Du et al.
An intuitive solution to this challenge is majority voting, which selects the

answer with most votes as the final output. However, it overlooks the rugged
reliability of sources and shows poor performance when the number of low-
quality sources exceeds that of high-quality ones. To solve this problem, a set
of weighted voting approaches was proposed [5–7]. Regardless of the differences
in their models, their principles are assigning larger weights to sources with
higher reliability and outputting the answers with largest weights. Nevertheless,
these approaches assume that each source’s reliability is consistent with all tasks,
which is not true, as everyone has his areas of expertise.
Given the shortcomings of above approaches, current efforts try to utilize
sources’ fine-grained reliability. Some of them divide tasks into topical-level clus-
ters and estimate source’s topical reliability, like FaitCrowd [8], which employs
a topic model on task description and divides tasks into groups. Others attempt
to model source’s reliability for different answers [9,10]. However, a common
drawback of these methods is their limited compatibility with tasks.
Table 1. Crowdsourcing task templates in AMT
Task form Task description Candidate answers

task-specific non-specific task-specific non-specific
Image tagging/Moderation
Writing
Data collection
Transcription
Sentiment wizard
Categorization wizard
To better demonstrate the incompatibility issues of existing fine-grained

approaches, we consider the task templates in AMT and propose a rough classi-
fication for their task description and candidate answers. As Table 1 shows, we
divide each feature into two groups: task-specific or non-specific. For example,
the task description of Data Collection tasks, like “What is the height of the
Mount Everest?”, is labeled as task-specific for that each description is targeted
at a specific task. In contrast, the task description of Image Tagging tasks, like
“Is there a duck in the picture?”, which can be used for multiple images, is
labeled as non-specific. The classification process of candidate answer is similar.
Consider a typical Image Tagging task which asks workers to answer whether
they see birds in the picture. We label the candidate answers of this task as
non-specific since they can be shared among a set of Image Tagging tasks.
We find out that current fine-grained approaches are only competent for lim-
ited tasks in AMT. For instance, the approaches which divide tasks into topical-
level groups, like FaitCrowd, can only work with the tasks whose description
can reflect their topics. Therefore, those approaches can only perform well on
A General Fine-Grained Truth Discovery Approach 5
tasks with the task-specific description but are not competent for those with
non-specific description. Other approaches, which model fine-grained reliability
for different answers, limit the tasks to those whose candidate answers are non-
specific. As consequences, (1) none of these approaches can deal with the Writing
tasks or the Transcription tasks, and (2) they can only utilize a subset of the
global dataset to help infer the ground truth. Therefore, it is crucial to design a
general fine-grained truth discovery approach.
To tackle with challenges above, we consider the clusters in both tasks and
sources. It has been observed that members of the same task cluster or source
cluster share identical patterns. For instance, the pattern shared in a source
cluster is sources’ rugged reliability levels for different task clusters. More specif-
ically, sources can have different reliability levels for various task clusters, but
for a particular task cluster, sources coming from the same cluster show an equal
reliability level. The patterns of task clusters are similar to that of source clusters.
Therefore, we can first learn the patterns of task clusters and source clusters,
then derive correct answers for target tasks based on the learned patterns.
In this paper, we propose a general fine-grained truth discovery approach
to aggregate crowdsourced answers. The proposed method contains two integral
components: kl-means and Pattern-based Truth Discovery (PTD). With the aid
of standard tasks whose ground truth are known, kl-means directly employs a
co-clustering reliability model on the correctness matrix to learn the patterns
of sources and tasks simultaneously. Then PTD further uses captured patterns
to help infer the ground truth. Since these two components do not have any
additional requirements for tasks, like the task-specific description or non-specific
answers, our approach is compatible with all tasks. To the best of our knowledge,
this is the first work to propose a general fine-grained method with consideration
of the patterns of task clusters and source clusters. One important feature of our
approach is the utilization of co-clustering reliability model and standard tasks,
which allows us to learn the cluster patterns directly from the correctness matrix.
Therefore, our method can better demonstrate the relationship between tasks
and sources and produce more accurate inference.
In summary, three main contributions of this work are as follows:
(1) To the best of our knowledge, we are the first to recognize the compatibility
issues of state-of-art fine-grained truth discovery methods and propose a
general one to aggregate noisy crowdsourced answers.
(2) We employ a co-clustering reliability model to learn the patterns of task
clusters and source clusters simultaneously and use captured patterns to
derive a more accurate inference of ground truth.
(3) We empirically show that the proposed method outperforms existing general
truth discovery methods.
2 Problem Formulation
In this section, we first introduce some basic notations used in this paper, then
give a formal definition of our problem.
6 Y. Du et al.

The inputs of our method are m standard tasks T = {ti }m 1 , m target tasks
m+m m+m ,n
T = {ti }m+1 , n sources U = {uj }1 , answers matrix A = {aij }i=1,j=1 and cor-
n
rectness matrix R = {rij }m,n

i=1,j=1 , where standard tasks are those whose ground
truth we already know, the target tasks are those whose ground truth we want
to find out.
Definition 1. All tasks are single-choice questions. For task ti , it has B candi-
date answers ci = {ci1 , ci2 · · · ciB }, and its ground truth is denoted by c∗i ∈ ci .
Definition 2. Answer element aij denotes the answer given by source uj to task
ti . aij = b, 1 ≤ b ≤ B means source uj select task ti ’s candidate answer cib as
his answer, aij = 0 means source uj has not answered task ti .
Definition 3. Correctness element rij denotes the correctness of answer given

by source uj to standard task ti . rij = 1 means source uj answers task ti correctly,
rij = 0 means his answer is incorrect, while rij = −1 means source uj has not
answered task ti .

Our goal is to derive estimation of target tasks’ ground truth E = {ei }m+mm+1 ,
k task clusters I = {Ip }k1 , l source clusters {Jq }l1 and co-clustering reliability
matrix {ηpq }k,l
p=1,q=1 .
Definition 4. Estimated ground truth ei for task ti is the most accurate answer
provided by sources.
Definition 5. Co-clustering reliability matrix η k×l is referred as the reliability

levels of l source clusters on k task clusters.
Definition 6. Sources in same source cluster share an identical pattern on task

clusters, and a source’s reliability is consistent on the same task cluster.
Based on these definitions, we can formally define our problem as: given m
m+m
standard tasks T = {ti }m
1 , m target tasks T = {ti }m+1 , n sources U = {uj }1 ,
n

m+m ,n
answers matrix A = {aij }i=1,q=1 and correctness matrix R = {rij }m,n i=1,q=1 , our
goals are learning k task clusters I = {Ip }1 , l source clusters {Jq }1 , co-clustering
k l
reliability matrix {ηpq }k,l

p=1,q=1 and producing estimation of target tasks’ true

answers E = {ei }m+m
m+1 .
3 Co-clustering Reliability Model
The basic idea of proposed approach is to build a co-clustering reliability model

on the correctness matrix, which is based on the observation that there exist
clusters in both sources and tasks. For each source cluster, its members share
same reliability level on tasks belonging to the same task cluster, so do the task
clusters.
A General Fine-Grained Truth Discovery Approach 7
Therefore, we can divide tasks into k clusters, sources into l clusters by the
following functions:
ρ : {t1 , t2 · · · tm } → {1, 2 · · · k}
(1)
γ : {u1 , u2 · · · un } → {1, 2 · · · l},
where ρ(ti ) = r implies that task ti is in task cluster r, γ(uj ) = s implies that
source uj is in source cluster s.
Let I = {Ip }k1 denote the k task clusters and J = {Jq }l1 denote the source
clusters. A co-cluster is referred as the submatrix of correctness matrix R deter-
mined by a task cluster Ip and a source cluster Jq .
We evaluate the homogeneity of co-cluster by the sum of squared differences
between each entry and the mean of this co-cluster which was used by Hartigan
et al. [15]. Consider that the correctness matrix could be highly sparse since it
is unnecessary for sources to perform all tasks. We compute the residue of the
element rij differently based on whether source uj has answered task ti . Let
value −1 represent the unfilled items in the correctness matrix. Formally:

rij − ηpq , if rij = −1
hij = (2)
0, otherwise.
When source uj has answered task ti , the residue of rij is hij = rij − ηpq ,
where p is the cluster label of task ti , q is the cluster label of source uj , ηpq is
the mean of submatrix determined by Ip and Jq . But, when source uj has not
answered task ti , we set hij to 0, which in fact ignores the loss caused by unfilled
elements.
The objective of this co-clustering reliability model is to learn the patterns
of both tasks and sources. Hence, the objective function is to minimize the total
squared residues of all co-clusters, which is shown as follows.

L= Lpq = h2ij , (3)
Ip ,Jq Ip ,Jq i∈Ip ,j∈Jq
where Ip , Jq enumerates all co-clusters, Lpq represents the sum squared residues
of co-cluster determined by Ip and Jq .
4 Algorithms
The proposed approach estimates the ground truth by employing two integral
components: (1) using kl-means to learn the co-clustering reliability model from
sources performance on standard tasks, which contains the patterns of clusters,
and (2) infer the ground truth of target tasks with the help of capture patterns.
4.1 kl-means
We propose kl-means, an extended version of the k-means algorithm, to learn
co-clustering reliability of tasks and sources. Differing from previous work
Another random document with
no related content on Scribd:
⅓ cup soft shortening
1 cup brown sugar
1½ cups black molasses
Stir in ...
½ cup cold water
Sift together and stir in ...
6 cups sifted GOLD MEDAL Flour

1 tsp. salt
1 tsp. allspice
1 tsp. ginger
1 tsp. cloves
1 tsp. cinnamon
Stir in ...
2 tsp. soda dissolved in 3 tbsp. cold water
Chill dough. Roll out very thick (½″). Cut with 2½″ round cutter. Place
far apart on lightly greased baking sheet. Bake until, when touched
lightly with finger, no imprint remains.
temperature: 350° (mod. oven).
time: Bake 15 to 18 min.
amount: 2⅔ doz. fat, puffy 2½″ cookies.
FROSTED GINGIES
Follow recipe above—and frost when cool with Simple White Icing
(recipe below).
SIMPLE WHITE ICING

Blend together 1 cup sifted confectioners’ sugar, ¼ tsp. salt, ½ tsp.
vanilla, and enough milk or water to make easy to spread (about 1½
tbsp.). Part of icing may be
colored by adding a drop or two
of food coloring.
GINGERBREAD BOYS
Make holidays gayer than ever.
Follow recipe above—and mix in 1 more cup sifted GOLD MEDAL
Flour. Chill dough. Roll out very thick (½″). Grease cardboard
gingerbread boy pattern, place on the dough, and cut around it with
a sharp knife. Or use a gingerbread boy cutter. With a pancake
turner, carefully transfer gingerbread boys to lightly greased baking
sheet. Press raisins into dough for eyes, nose, mouth, and shoe and
cuff buttons. Use bits of candied cherries or red gum drops for coat
buttons; strips of citron for tie. Bake. Cool slightly, then carefully
remove from baking sheet. With white icing, make outlines for collar,
cuffs, belt, and shoes.
amount: About 12 Gingerbread Boys.
★ STONE JAR MOLASSES COOKIES

Crisp and brown ... without a bit of sugar.
Heat to boiling point ...
1 cup molasses
Remove from heat

Stir ...
½ cup shortening
1 tsp. soda
2¼ cups sifted GOLD MEDAL Flour

1¾ tsp. baking powder
1 tsp. salt
1½ tsp. ginger
Chill dough. Roll out very thin (¹⁄₁₆″). Cut into desired shapes. Bake
until, when touched lightly, no imprint remains. (Over-baking gives a
bitter taste.)
amount: About 6 doz. 2½″ cookies.
PATTERN FOR GINGER BREAD BOY

Trace on tissue paper. Then cut pattern from cardboard. Place
greased pattern on dough. Cut around it with a sharp knife. Other
cooky patterns can be made in same way.
To make “dancing” Gingerbread boys ... bend the legs and arms into
“action” positions when you place them on baking sheet (as shown in
small figures above).
Packing cookies successfully for mailing
1 Select heavy box, lined with waxed paper. Use plenty of filler
(crushed wrapping or tissue paper, or unbuttered popcorn or
Cheerios).
2 Wrap each cooky separately ... in waxed paper. Or place cookies
back-to-back in pairs ... then wrap each pair.
3 Pad bottom of box with filler.
Fit wrapped cookies into box
closely, in layers.
4 Use filler between layers to
prevent crushing of cookies.
5 Cover with paper doily, add
card, and pad top with crushed
paper. Pack tightly so contents
will not shake around.
6 Wrap box tightly with heavy
paper and cord. Address plainly
with permanent ink ... covering
address with Scotch tape or
colorless nail polish. Mark the
box plainly: “PERISHABLE.”
Festive Christmas Cookies
★ 1 Sandbakelser
★ 2 Spritz Rosettes
★ 3 Merry Christmas Cookies, Trees, Stars, etc.
★ 4 Nurnberger
★ 5 Almond Crescents
★ 6 Lebkuchen
★ 7 Scotch Shortbread
★ 8 Berliner Kranser
★ 9 Finska Kakor
★ 10 Russian Tea Cakes
Gay shapes ... for holiday cheer.
MERRY CHRISTMAS COOKIES ( Recipe)

Soft, cushiony cookies, dark or light.
DARK DOUGH.... For animal shapes, toy shapes, and boy and girl
figures.
Mix together thoroughly ...
⅓ cup soft shortening

⅓ cup brown sugar
1 egg
⅔ cup molasses
2¾ cups sifted GOLD MEDAL Flour

1 tsp. soda
1 tsp. salt
2 tsp. cinnamon
1 tsp. ginger
Chill dough. Roll out thick (¼″). Cut into desired shapes. Place 1″
apart on lightly greased baking sheet. Bake until, when touched
lightly with finger, no imprint remains. When cool, ice and decorate
as desired.
temperature: 375° (quick mod. oven).
LIGHT DOUGH
For bells, stockings, stars, wreaths, etc.
Follow recipe for Dark Dough above except substitute honey for
molasses, and granulated sugar for brown. Use 1 tsp. vanilla in
place of cinnamon and ginger.
TO HANG ON CHRISTMAS TREE

Just loop a piece of green string and press ends into the dough at
the top of each cooky before baking. Bake with string-side down on
pan.
TO DECORATE
Use recipe for Decorating Icing (p. 31) (thin the icing for spreading).
For decorating ideas, see picture on preceding page. Sugar in
coarse granules for decorating is available at bakery supply houses.
STARS
Cover with white icing. Sprinkle with sky blue sugar.
WREATHS
Cut with scalloped cutter ... using smaller
cutter for center. Cover with white icing.
Sprinkle with green sugar and decorate with clusters
of berries made of red icing—leaves of green icing—
to give the realistic effect of holly wreaths.
BELLS
Outline with red icing. Make clapper of red icing. (A
favorite with children.)
STOCKINGS
Sprinkle colored sugar on toes and heels before
baking. Or mark heels and toes of baked cookies
with icing of some contrasting color.
CHRISTMAS TREES
Spread with white icing ... then sprinkle with green sugar.
Decorate with silver dragées and tiny colored candies.
TOYS
(Drum, car, jack-in-the-box, etc.):
Outline shapes with white or colored icing.
ANIMALS
(Reindeer, camel, dog, kitten, etc.): Pipe icing
on animals to give effect of bridles, blankets,
etc.
BOYS AND GIRLS
Pipe figures with an icing to give desired effects:
eyes, noses, buttons, etc.
“Old country” Christmas treasures.

LEBKUCHEN ( Recipe)
The famous old-time German Christmas Honey Cakes.
Mix together and bring to a boil ...
½ cup honey
½ cup molasses
Cool thoroughly
Stir in ...
¾ cup brown sugar

1 egg
1 tbsp. lemon juice
1 tsp. grated lemon rind

½ tsp. soda
1 tsp. cinnamon
1 tsp. cloves
1 tsp. allspice
1 tsp. nutmeg
Mix in ...
⅓ cup cut-up citron

⅓ cup chopped nuts
Chill dough overnight. Roll small amount at a time, keeping rest

chilled. Roll out ¼″ thick and cut into oblongs 1½ × 2½″. Place one
inch apart on greased baking sheet. Bake until when touched lightly
no imprint remains. While cookies bake, make Glazing Icing (recipe
below). Brush it over cookies the minute they are out of oven. Then
quickly remove from baking sheet. Cool and store to mellow.
temperature: 400° (mod. hot oven).
amount: About 6 doz. 2″ × 3″ cookies.
GLAZING ICING
Boil together 1 cup sugar and ½ cup water until first indication of a
thread appears (230°). Remove from heat. Stir in ¼ cup
confectioners’ sugar and brush hot icing thinly over cookies. (When
icing gets sugary, reheat slightly, adding a little water until clear
again.)
★ NURNBERGER
Round, light-colored honey cakes from the famed old City of Toys.
Follow recipe above—except in place of honey and molasses use
1 cup honey; and reduce spices (using ¼ tsp. cloves, ½ tsp. allspice,
and ½ tsp. nutmeg ... with 1 tsp. cinnamon).
Roll out the chilled dough ¼″ thick. Cut into 2″ rounds. Place on
greased baking sheet. With fingers, round up cookies a bit toward
center. Press in blanched almond halves around the edge like petals
of a daisy. Use a round piece of citron for each center. Bake just until
set. Immediately brush with Glazing Icing (above). Remove from
baking sheet. Cool, and store to mellow.
TO “MELLOW” COOKIES
... store in an air-tight container for a few days. Add a cut orange or apple; but fruit
molds, so change it frequently.
ZUCKER HÜTCHEN (Little Sugar Hats)
From the collection of Christmas recipes by the Kohler Woman’s Club of Kohler,
Wisconsin.
6 tbsp. soft butter
½ cup sugar
1 egg yolk
Stir in ...
2 tbsp. milk
1⅜ cups sifted GOLD MEDAL Flour

½ tsp. baking powder
¼ tsp. salt
Mix in ...
¼ cup finely cut-up citron
Chill dough. Roll thin (⅛″). Cut into 2″ rounds. Heap 1 tsp. Meringue
Frosting (recipe below) in center of each round to make it look like
the crown of a hat. Place 1″ apart on greased baking sheet. Bake
until delicately browned.
amount: About 4 doz. 2″ cookies.
MERINGUE FROSTING
Beat 1 egg white until frothy. Beat in gradually 1½ cups sifted
confectioners’ sugar and beat until frosting holds its shape. Stir in ½
cup finely chopped blanched almonds.
Decorative favorites from lands afar.
SCOTCH SHORTBREAD
Old-time delicacy from Scotland ... crisp, thick, buttery.
Mix together thoroughly
...
1 cup soft butter

⅝ cup sugar (½ cup
plus 2 tbsp.)
Stir in ...
2½ cups sifted GOLD MEDAL Flour
Mix thoroughly with hands. Chill dough. Roll out ⅓ to ½″ thick. Cut
into fancy shapes (small leaves, ovals, squares, etc.). Flute edges if
desired by pinching between fingers as for pie crust. Place on
ungreased baking sheet. Bake. (The tops do not brown during
baking ... nor does shape of the cookies change.)
temperature: 300° (slow oven).
amount: About 2 doz. 1″ × 1½″ cookies.
★ FINSKA KAKOR (Finnish Cakes)

Nut-studded butter strips from
Finland.
¾ cup soft butter

¼ cup sugar
1 tsp. almond flavoring
Stir in ...
2 cups sifted GOLD

MEDAL Flour
Mix thoroughly with hands. Chill dough. Roll out ¼″ thick. Cut into
strips 2½″ long and ¾″ wide. Brush tops lightly with 1 egg white,
slightly beaten. Sprinkle with mixture of 1 tbsp. sugar and ⅓ cup
finely chopped blanched almonds. Carefully transfer (several strips
at a time) to ungreased baking sheet. Bake just until cookies begin to
turn a very delicate golden brown.
amount: About 4 doz. 2½″ × ¾″ cookies.
SANDBAKELSER (Sand Tarts)
Fragile almond-flavored shells of
Swedish origin, made in copper molds
of varied designs.
Put through fine knife of food
grinder twice ...
*⅓ cup blanched almonds

*4 unblanched almonds
Mix in thoroughly ...
⅞ cup soft butter (1 cup

minus 2 tbsp.)
The ring of sleigh bells fills the air as
¾ cup sugar everyone races to church on Christmas Day
1 small egg white, in Finland.
unbeaten
Stir in ...
*In place of the almonds, you may use 1 tsp. vanilla flavoring and 1
tsp. almond flavoring.
Chill dough. Press dough into Sandbakels molds (or tiny fluted tart
forms) to coat inside. Place on ungreased baking sheet. Bake until
very delicately browned. Tap molds on table to loosen cookies and
turn them out of the molds.
amount: About 3 doz. cookies.
MOLDED COOKIES Mold ’em fast with
a fork or glass!
HOW TO MAKE MOLDED COOKIES (preliminary steps on pp. 14-

15)
1 With hands, roll dough 2 Flatten balls of dough 3 Cut pencil-thick strips
into balls or into long, with bottom of a glass ... and shape as directed
pencil-thick rolls, as dipped in flour (or with a ... as for Almond
indicated in recipe. damp cloth around it), or Crescents (p. 41) or
with a fork—crisscross. Berliner Kranser (p. 42).
DATE-OATMEAL COOKIES
¾ cup soft shortening (half butter)

1 cup brown sugar
2 eggs
3 tbsp. milk
1 tsp. vanilla
2 cups sifted GOLD MEDAL Flour

¾ tsp. soda
1 tsp. salt
Stir in ...
2 cups rolled oats
1½ cups cut-up dates
¾ cup chopped nuts
Chill dough. Roll into balls size of large walnuts. Place 3″ apart on
lightly greased baking sheet. Flatten (to ¼″) with bottom of glass
dipped in flour. Bake until lightly browned.
★ PEANUT BUTTER COOKIES ( Recipe)

Perfect for the Children’s Hour.
½ cup soft shortening (half butter)

½ cup peanut butter
½ cup sugar
½ cup brown sugar
1 egg
1¼ cups sifted GOLD MEDAL Flour

½ tsp. baking powder
¾ tsp. soda
¼ tsp. salt
Chill dough. Roll into balls size of large walnuts. Place 3″ apart on
lightly greased baking sheet. Flatten with fork dipped in flour ...
crisscross. Bake until set ... but not hard.
HONEY PEANUT BUTTER COOKIES

Follow recipe above—except use only ¼
cup shortening, and in place of brown sugar
use ½ cup honey.
Sprightly tea cakes for friends and family.
THUMBPRINT COOKIES Nut-rich ... the thumb dents filled with

sparkling jelly.
I’m as delighted with this quaint addition to our cooky collection, from Ken
MacKenzie, as is the collector of old glass when a friend presents her with some
early thumbprint goblets.

¼ cup brown sugar
1 egg yolk
½ tsp. vanilla
1 cup sifted GOLD MEDAL Flour

¼ tsp. salt
Roll into 1″ balls. Dip in slightly beaten egg whites. Roll in finely
chopped nuts (¾ cup). Place about 1″ apart on ungreased baking
sheet. Bake 5 min. Remove from oven. Quickly press thumb gently
on top of each cooky. Return to oven and bake 8 min. longer. Cool.
Place in thumbprints a bit of chopped candied fruit, sparkling jelly, or
tinted confectioners’ sugar icing.
time: Bake 5 min., then 8 min.
★ ENGLISH TEA CAKES Tender, flavorful tidbits with a sugary

glaze.

¾ cup sugar
1 egg
3 tbsp. milk

1½ tsp. baking powder
¼ tsp. salt
Mix in ...
½ cup finely cut sliced citron

½ cup currants or raisins, cut-up
Chill dough. Roll into balls the size of walnuts. Dip tops in slightly
beaten egg white, then sugar. Place sugared-side-up 2″ apart on
ungreased baking sheet. Bake until delicately browned. The balls
flatten some in baking and become glazed.
temperature: 400° (mod. hot oven).
ALMOND CRESCENTS
Richly delicate, buttery. Party favorites.
1 cup soft shortening (half butter)

⅓ cup sugar
⅔ cup ground blanched almonds
Sift together and work in ...
1⅔ cups sifted GOLD MEDAL Flour

¼ tsp. salt
Chill dough. Roll with hands pencil-thick. Cut in 2½″ lengths. Form
into crescents on ungreased baking sheet. Bake until set ... not
brown. Cool on pan. While slightly warm, carefully dip in 1 cup
confectioners’ sugar and 1 tsp. cinnamon mixed.
temperature: 325° (slow mod. oven).
LEMON SNOWDROPS
Refreshing, lemony ... with snowy icing.
Follow recipe for English Tea Cakes above—except use 2 tbsp.
lemon juice and 1 tbsp. water in place of the milk. Add 2 tsp. grated
lemon rind. Omit citron and currants. Mix in ½ cup chopped nuts.
Chill dough. Roll into balls and bake. Then roll in confectioners’
sugar.
BUTTER FINGERS
Nut-flavored, rich buttery party cookies.
Follow recipe for Almond Crescents—except in place of almonds use
black walnuts or other nuts, chopped. Cut into finger lengths and
bake. While still warm, roll in confectioners’ sugar. Cool, and roll in
the sugar again.
Festive cookies for the holidays ... ideal for Christmas boxes.

Download textbook Database Systems For Advanced Applications 22Nd International Conference Dasfaa 2017 Suzhou China March 27 30 2017 Proceedings Part I 1St Edition Selcuk Candan ebook all chapter pdf

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Download textbook Database Systems For Advanced Applications 22Nd International Conference Dasfaa 2017 Suzhou China March 27 30 2017 Proceedings Part I 1St Edition Selcuk Candan ebook all chapter pdf

Uploaded by

Copyright:

Available Formats

Database Systems for Advanced

Applications 22nd International

Database Systems for Advanced Applications 25th

Database Systems for Advanced Applications 25th

Database Systems for Advanced Applications 25th

Intelligent Information and Database Systems: 9th Asian

Requirements Engineering Foundation for Software

Emerging Technologies for Developing Countries: First

Business Information Systems: 20th International

Torben Bach Pedersen Lijun Chang

Wen Hua (Eds.)

ISSN 0302-9743 ISSN 1611-3349 (electronic)

Library of Congress Control Number: 2017934640

© Springer International Publishing AG 2017

Printed on acid-free paper

This Springer imprint is published by Springer Nature

February 2017 K. Selçuk Candan

Program Committee Co-chairs

Industrial/Practitioners Track Co-chairs

Demo Track Co-chairs

Local Organization Co-chairs

Steering Committee Liaison

Elena Ferrari Università dell’Insubria, Italy

Dieter Pfoser George Mason University, USA

Zhengjie Zhang Advanced Digital Sciences Center, Singapore

Industry Track Program Committee

Demo Track Program Committee

Petar Jovanovic UPC Barcelona, Spain

Semantic Web and Knowledge Management

A General Fine-Grained Truth Discovery Approach for Crowdsourced

Learning the Structures of Online Asynchronous Conversations . . . . . . . . . . 19

A Question Routing Technique Using Deep Neural Network

Category-Level Transfer Learning from Knowledge Base to Microblog

Indexing and Distributed Systems

AngleCut: A Ring-Based Hashing Scheme for Distributed

An Efficient Bulk Loading Approach of Secondary Index in Distributed

Performance Comparison of Distributed Processing of Large Volume

An Adaptive Data Partitioning Scheme for Accelerating Exploratory Spark

Semi-Supervised Network Embedding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

CirE: Circular Embeddings of Knowledge Graphs . . . . . . . . . . . . . . . . . . . . 148

PPNE: Property Preserving Network Embedding. . . . . . . . . . . . . . . . . . . . . 163

HINE: Heterogeneous Information Network Embedding. . . . . . . . . . . . . . . . 180

Trajectory and Time Series Data Processing

DT-KST: Distributed Top-k Similarity Query on Big Trajectory Streams . . . . 199

A Distributed Multi-level Composite Index for KNN Processing on Long

Outlier Trajectory Detection: A Trajectory Analytics Based Approach . . . . . . 231

Clustering Time Series Utilizing a Dimension Hierarchical

Fast Extended One-Versus-Rest Multi-label SVM Classification Algorithm

Efficiently Discovering Most-Specific Mixed Patterns from Large

Max-Cosine Matching Based Neural Models for Recognizing

An Intelligent Field-Aware Factorization Machine Model . . . . . . . . . . . . . . . 309

Query Processing and Optimization (I)

Beyond Skylines: Explicit Preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327

Optimizing Window Aggregate Functions in Relational Database Systems . . . 343

Query Optimization on Hybrid Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . 361

Efficient Batch Grouping in Relational Datasets . . . . . . . . . . . . . . . . . . . . . 376

Memory-Enhanced Latent Semantic Model: Short Text Understanding for

Leveraging Pattern Associations for Word Embedding Models . . . . . . . . . . . 423

Multi-Granularity Neural Sentence Model for Measuring

Leveraging Kernel Incorporated Matrix Factorization for Smartphone

Preference Integration in Context-Aware Recommendation . . . . . . . . . . . . . . 475

Jointly Modeling Heterogeneous Temporal Properties

Location-Aware News Recommendation Using Deep Localized

Review-Based Cross-Domain Recommendation Through Joint