Lecture Notes in Computer Science 9916
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, Lancaster, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Zurich, Switzerland
John C. Mitchell
Stanford University, Stanford, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/7409
Enqing Chen · Yihong Gong · Yun Tie (Eds.)

Advances in Multimedia
Information Processing –
PCM 2016
17th Pacific-Rim Conference on Multimedia
Xi’an, China, September 15–16, 2016
Proceedings, Part I

Editors

Enqing Chen
Zhengzhou University
Zhengzhou, China

Yihong Gong
Xi’an Jiaotong University
Xi’an, China

Yun Tie
Zhengzhou University
Zhengzhou, China

Lecture Notes in Computer Science
ISSN 0302-9743          ISSN 1611-3349 (electronic)
ISBN 978-3-319-48889-9          ISBN 978-3-319-48890-5 (eBook)
DOI 10.1007/978-3-319-48890-5

Library of Congress Control Number: 2016956479

LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI

© Springer International Publishing AG 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

The 17th Pacific-Rim Conference on Multimedia (PCM 2016) was held in Xi’an,
China, during September 15–16, 2016, and hosted by Xi’an Jiaotong University
(XJTU). PCM is a leading international conference for researchers and industry
practitioners to share their new ideas, original research results, and practical
development experiences from all multimedia-related areas.
It was a great honor for XJTU to host PCM 2016, one of the most longstanding
multimedia conferences, in Xi’an, China. Xi’an Jiaotong University, located in the
capital of Shaanxi province, is one of the key universities run by the Ministry of
Education, China. Recently its multimedia-related research has been attracting
increasing attention from the local and international multimedia community. For over
2000 years, Xi’an has been the center for political and economic developments and the
capital city of many Chinese dynasties, with the richest cultural and historical heritage,
including the world-famous Terracotta Warriors, Big Wild Goose Pagoda, etc. We
hope that our venue made PCM 2016 a memorable experience for all participants.
PCM 2016 featured a comprehensive program. The 202 submissions from authors
in more than ten countries included a large number of high-quality papers in
multimedia content analysis, multimedia signal processing and communications, and
multimedia applications and services. We thank our 28 Technical Program Committee
members, who spent many hours reviewing papers and providing valuable feedback to
the authors. From the 202 submissions to the main conference, and based on at
least three reviews per submission, the program chairs decided to accept 111 regular
papers (54 %), among which 67 were posters (33 %). This volume of the conference
proceedings contains the abstracts of two invited talks and all the regular, poster, and
special session papers.
The technical program is an important aspect but only achieves its full impact if
complemented by challenging keynotes. We are extremely pleased and grateful to have
had two exceptional keynote speakers, Wen Gao and Alex Hauptmann, accept our
invitation and present interesting ideas and insights at PCM 2016.
We are also heavily indebted to many individuals for their significant contributions.
We thank the PCM Steering Committee for their invaluable input and guidance on
crucial decisions. We wish to acknowledge and express our deepest appreciation to the
honorary chairs, Nanning Zheng and Shin’ichi Satoh; the general chairs, Yihong Gong,
Thomas Plagemann, Ke Lu, and Jianping Fan; the program chairs, Meng Wang, Qi Tian,
Abdulmotaleb El Saddik, and Yun Tie; the organizing chairs, Jinye Peng, Xinbo Gao,
Ziyu Guan, and Yizhou Wang; the publicity chairs, Xueming Qian, Xiaojiang Chen,
Cheng Jin, and Xiangyang Xue; the publication chairs, Jun Wu and Enqing Chen; the
local arrangements chairs, Kuizi Mei and Xuguang Lan; the special session chairs,
Jianbing Shen, Jialie Shen, and Jianru Xue; the demo chairs, Yugang Jiang and Jitao
Sang; and the finance and registration chair, Shuchan Gao. Without their efforts and
enthusiasm, PCM 2016 would not have become a reality. Moreover, we want to thank
our sponsors: Springer, Peking University, Zhengzhou University, and
Ryerson University. Finally, we wish to thank all committee members, reviewers,
session chairs, student volunteers, and supporters. Their contributions are much
appreciated.

September 2016

Meng Wang
Yun Tie
Qi Tian
Abdulmotaleb El Saddik
Yihong Gong
Thomas Plagemann
Ke Lu
Jianping Fan
Organization

Honorary Chairs
Nanning Zheng Xi’an Jiaotong University, China
Shin’ichi Satoh National Institute of Informatics, Japan

General Chairs
Yihong Gong Xi’an Jiaotong University, China
Thomas Plagemann University of Oslo, Norway
Ke Lu University of Chinese Academy of Sciences, China
Jianping Fan University of North Carolina at Charlotte, USA

Program Chairs
Meng Wang Hefei University of Technology, China
Qi Tian University of Texas at San Antonio, USA
Abdulmotaleb El Saddik University of Ottawa, Canada
Yun Tie Zhengzhou University, China

Organizing Chairs
Jinye Peng Northwest University, China
Xinbo Gao Xidian University, China
Ziyu Guan Northwest University, China
Yizhou Wang Peking University, China

Publicity Chairs
Xueming Qian Xi’an Jiaotong University, China
Xiaojiang Chen Northwest University, China
Cheng Jin Fudan University, China
Xiangyang Xue Fudan University, China

Publication Chairs
Jun Wu Northwestern Polytechnical University, China
Enqing Chen Zhengzhou University, China

Local Arrangements Chairs
Kuizi Mei Xi’an Jiaotong University, China
Xuguang Lan Xi’an Jiaotong University, China

Special Session Chairs
Jianbing Shen Beijing Institute of Technology, China
Jialie Shen Singapore Management University, Singapore
Jianru Xue Xi’an Jiaotong University, China

Demo Chairs
Yugang Jiang Fudan University, China
Jitao Sang Institute of Automation, Chinese Academy of Sciences,
China

Finance and Registration Chair
Shuchan Gao Xi’an Jiaotong University, China
Contents – Part I

Visual Tracking by Local Superpixel Matching with Markov Random Field . . . 1
Heng Fan, Jinhai Xiang, and Zhongmin Chen

Saliency Detection Combining Multi-layer Integration Algorithm with Background Prior and Energy Function . . . 11
Chenxing Xia and Hanling Zhang

Facial Landmark Localization by Part-Aware Deep Convolutional Network . . . 22
Keke He and Xiangyang Xue

On Combining Compressed Sensing and Sparse Representations for Object Tracking . . . 32
Hang Sun, Jing Li, Bo Du, and Dacheng Tao

Leaf Recognition Based on Binary Gabor Pattern and Extreme Learning Machine . . . 44
Huisi Wu, Jingjing Liu, Ping Li, and Zhenkun Wen

Sparse Representation Based Histogram in Color Texture Retrieval . . . 55
Cong Bai, Jia-nan Chen, Jinglin Zhang, Kidiyo Kpalma, and Joseph Ronsin

Improving Image Retrieval by Local Feature Reselection with Query Expansion . . . 65
Hanli Wang and Tianyao Sun

Sparse Subspace Clustering via Closure Subgraph Based on Directed Graph . . . 75
Yuefeng Ma and Xun Liang

Robust Lip Segmentation Based on Complexion Mixture Model . . . 85
Yangyang Hu, Hong Lu, Jinhua Cheng, Wenqiang Zhang, Fufeng Li, and Weifei Zhang

Visual BFI: An Exploratory Study for Image-Based Personality Test . . . 95
Jitao Sang, Huaiwen Zhang, and Changsheng Xu

Fast Cross-Scenario Clothing Retrieval Based on Indexing Deep Features . . . 107
Zongmin Li, Yante Li, Yongbiao Gao, and Yujie Liu

3D Point Cloud Encryption Through Chaotic Mapping . . . 119
Xin Jin, Zhaoxing Wu, Chenggen Song, Chunwei Zhang, and Xiaodong Li

Online Multi-Person Tracking Based on Metric Learning . . . 130
Changyong Yu, Min Yang, Yanmei Dong, Mingtao Pei, and Yunde Jia

A Low-Rank Tensor Decomposition Based Hyperspectral Image Compression Algorithm . . . 141
Mengfei Zhang, Bo Du, Lefei Zhang, and Xuelong Li

Moving Object Detection with ViBe and Texture Feature . . . 150
Yumin Tian, Dan Wang, Peipei Jia, and Jinhui Liu

Leveraging Composition of Object Regions for Aesthetic Assessment of Photographs . . . 160
Hong Lu, Zeping Yao, Yunhan Bai, Zhibin Zhu, Bohong Yang, Lukun Chen, and Wenqiang Zhang

Video Affective Content Analysis Based on Protagonist via Convolutional Neural Network . . . 170
Yingying Zhu, Zhengbo Jiang, Jianfeng Peng, and Sheng-hua Zhong

Texture Description Using Dual Tree Complex Wavelet Packets . . . 181
M. Liedlgruber, M. Häfner, J. Hämmerle-Uhl, and A. Uhl

Fast and Accurate Image Denoising via a Deep Convolutional-Pairs Network . . . 191
Lulu Sun, Yongbing Zhang, Wangpeng An, Jingtao Fan, Jian Zhang, Haoqian Wang, and Qionghai Dai

Traffic Sign Recognition Based on Attribute-Refinement Cascaded Convolutional Neural Networks . . . 201
Kaixuan Xie, Shiming Ge, Qiting Ye, and Zhao Luo

Building Locally Discriminative Classifier Ensemble Through Classifier Fusion Among Nearest Neighbors . . . 211
Xiang-Jun Shen, Wen-Chao Zhang, Wei Cai, Ben-Bright B. Benuw, He-Ping Song, Qian Zhu, and Zheng-Jun Zha

Retrieving Images by Multiple Samples via Fusing Deep Features . . . 221
Kecai Wu, Xueliang Liu, Jie Shao, Richang Hong, and Tao Yang

A Part-Based and Feature Fusion Method for Clothing Classification . . . 231
Pan Huo, Yunhong Wang, and Qingjie Liu

Research on Perception Sensitivity of Elevation Angle in 3D Sound Field . . . 242
Yafei Wu, Xiaochen Wang, Cheng Yang, Ge Gao, and Wei Chen

Tri-level Combination for Image Representation . . . 250
Ruiying Li, Chunjie Zhang, and Qingming Huang

Accurate Multi-view Stereopsis Fusing DAISY Descriptor and Scaled-Neighbourhood Patches . . . 260
Fei Wang and Ning An

Stereo Matching Based on CF-EM Joint Algorithm . . . 271
Baoping Li, Long Ye, Yun Tie, and Qin Zhang

Fine-Grained Vehicle Recognition in Traffic Surveillance . . . 285
Qi Wang, Zhongyuan Wang, Jing Xiao, Jun Xiao, and Wenbin Li

Transductive Classification by Robust Linear Neighborhood Propagation . . . 296
Lei Jia, Zhao Zhang, and Weiming Jiang

Discriminative Sparse Coding by Nuclear Norm-Driven Semi-Supervised Dictionary Learning . . . 306
Weiming Jiang, Zhao Zhang, Yan Zhang, and Fanzhang Li

Semantically Smoothed Refinement for Everyday Concept Indexing . . . 318
Peng Wang, Lifeng Sun, Shiqiang Yang, and Alan F. Smeaton

A Deep Two-Stream Network for Bidirectional Cross-Media Information Retrieval . . . 328
Tianyuan Yu, Liang Bai, Jinlin Guo, Zheng Yang, and Yuxiang Xie

Prototyping Methodology with Motion Estimation Algorithm . . . 338
Jinglin Zhang, Jian Shang, and Cong Bai

Automatic Image Annotation Using Adaptive Weighted Distance in Improved K Nearest Neighbors Framework . . . 345
Jiancheng Li and Chun Yuan

One-Shot-Learning Gesture Segmentation and Recognition Using Frame-Based PDV Features . . . 355
Tao Rong and Ruoyu Yang

Multi-scale Point Set Saliency Detection Based on Site Entropy Rate . . . 366
Yu Guo, Fei Wang, Pengyu Liu, Jingmin Xin, and Nanning Zheng

Facial Expression Recognition with Multi-scale Convolution Neural Network . . . 376
Jieru Wang and Chun Yuan

Deep Similarity Feature Learning for Person Re-identification . . . 386
Yanan Guo, Dapeng Tao, Jun Yu, and Yaotang Li

Object Detection Based on Scene Understanding and Enhanced Proposals . . . 397
Zhicheng Wang and Chun Yuan

Video Inpainting Based on Joint Gradient and Noise Minimization . . . 407
Yiqi Jiang, Xin Jin, and Zhiyong Wu

Head Related Transfer Function Interpolation Based on Aligning Operation . . . 418
Tingzhao Wu, Ruimin Hu, Xiaochen Wang, Li Gao, and Shanfa Ke

Adaptive Multi-window Matching Method for Depth Sensing SoC and Its VLSI Implementation . . . 428
Huimin Yao, Chenyang Ge, Liuqing Yang, Yichuan Fu, and Jianru Xue

A Cross-Domain Lifelong Learning Model for Visual Understanding . . . 438
Chunmei Qing, Zhuobin Huang, and Xiangmin Xu

On the Quantitative Analysis of Sparse RBMs . . . 449
Yanxia Zhang, Lu Yang, Binghao Meng, Hong Cheng, Yong Zhang, Qian Wang, and Jiadan Zhu

An Efficient Solution for Extrinsic Calibration of a Vision System with Simple Laser . . . 459
Ya-Nan Chen, Fei Wang, Hang Dong, Xuetao Zhang, and Haiwei Yang

A Stepped-RAM Reading and Multiplierless VLSI Architecture for Intra Prediction in HEVC . . . 469
Wei Zhou, Yue Niu, Xiaocong Lian, Xin Zhou, and Jiamin Yang

A Sea-Land Segmentation Algorithm Based on Sea Surface Analysis . . . 479
Guichi Liu, Enqing Chen, Lin Qi, Yun Tie, and Deyin Liu

Criminal Investigation Oriented Saliency Detection for Surveillance Videos . . . 487
Yu Chen, Ruimin Hu, Jing Xiao, Liang Liao, Jun Xiao, and Gen Zhan

Deep Metric Learning with Improved Triplet Loss for Face Clustering in Videos . . . 497
Shun Zhang, Yihong Gong, and Jinjun Wang

Characterizing TCP Performance for Chunk Delivery in DASH . . . 509
Wen Hu, Zhi Wang, and Lifeng Sun

Where and What to Eat: Simultaneous Restaurant and Dish Recognition from Food Image . . . 520
Huayang Wang, Weiqing Min, Xiangyang Li, and Shuqiang Jiang

A Real-Time Gesture-Based Unmanned Aerial Vehicle Control System . . . 529
Leye Wei, Xin Jin, Zhiyong Wu, and Lei Zhang

A Biologically Inspired Deep CNN Model . . . 540
Shizhou Zhang, Yihong Gong, Jinjun Wang, and Nanning Zheng

Saliency-Based Objective Quality Assessment of Tone-Mapped Images . . . 550
Yinchu Chen, Ke Li, and Bo Yan

Sparse Matrix Based Hashing for Approximate Nearest Neighbor Search . . . 559
Min Wang, Wengang Zhou, Qi Tian, and Houqiang Li

Piecewise Affine Sparse Representation via Edge Preserving Image Smoothing . . . 569
Xuan Wang, Fei Wang, and Yu Guo

Author Index . . . 577


Contents – Part II

A Global-Local Approach to Extracting Deformable Fashion Items from Web Images . . . 1
Lixuan Yang, Helena Rodriguez, Michel Crucianu, and Marin Ferecatu

Say Cheese: Personal Photography Layout Recommendation Using 3D Aesthetics Estimation . . . 13
Ben Zhang, Ran Ju, Tongwei Ren, and Gangshan Wu

Speech Enhancement Using Non-negative Low-Rank Modeling with Temporal Continuity and Sparseness Constraints . . . 24
Yinan Li, Xiongwei Zhang, Meng Sun, Xushan Chen, and Lin Qiao

Facial Animation Based on 2D Shape Regression . . . 33
Ruibin Bai, Qiqi Hou, Jinjun Wang, and Yihong Gong

A Deep CNN with Focused Attention Objective for Integrated Object Recognition and Localization . . . 43
Xiaoyu Tao, Chenyang Xu, Yihong Gong, and Jinjun Wang

An Accurate Measurement System for Non-cooperative Spherical Target Based on Calibrated Lasers . . . 54
Hang Dong, Fei Wang, Haiwei Yang, Zhongheng Li, and Yanan Chen

Integrating Supervised Laplacian Objective with CNN for Object Recognition . . . 64
Weiwei Shi, Yihong Gong, Jinjun Wang, and Nanning Zheng

Automatic Color Image Enhancement Using Double Channels . . . 74
Na Li, Zhao Liu, Jie Lei, Mingli Song, and Jiajun Bu

Deep Ranking Model for Person Re-identification with Pairwise Similarity Comparison . . . 84
Sanping Zhou, Jinjun Wang, Qiqi Hou, and Yihong Gong

Cluster Enhanced Multi-task Learning for Face Attributes Feature Selection . . . 95
Yuchun Fang and Xiaoda Jiang

Triple-Bit Quantization with Asymmetric Distance for Nearest Neighbor Search . . . 105
Han Deng, Hongtao Xie, Wei Ma, Qiong Dai, Jianjun Chen, and Ming Lu

Creating Spectral Words for Large-Scale Hyperspectral Remote Sensing Image Retrieval . . . 116
Wenhao Geng, Jing Zhang, Li Zhuo, Jihong Liu, and Lu Chen

Rapid Vehicle Retrieval Using a Cascade of Interest Regions . . . 126
Yuanqi Su, Bonan Cuan, Xingjun Zhang, and Yuehu Liu

Towards Drug Counterfeit Detection Using Package Paperboard Classification . . . 136
Christof Kauba, Luca Debiasi, Rudolf Schraml, and Andreas Uhl

Dynamic Strategies for Flow Scheduling in Multihoming Video CDNs . . . 147
Ming Ma, Zhi Wang, Yankai Zhang, and Lifeng Sun

Homogenous Color Transfer Using Texture Retrieval and Matching . . . 159
Chang Xing, Hai Ye, Tao Yu, and Zhong Zhou

Viewpoint Estimation for Objects with Convolutional Neural Network Trained on Synthetic Images . . . 169
Yumeng Wang, Shuyang Li, Mengyao Jia, and Wei Liang

Depth Extraction from a Light Field Camera Using Weighted Median Filtering . . . 180
Changtian Sun and Gangshan Wu

Scale and Topology Preserving SIFT Feature Hashing . . . 190
Chen Kang, Li Zhu, and Xueming Qian

Hierarchical Traffic Sign Recognition . . . 200
Yanyun Qu, Siying Yang, Weiwei Wu, and Li Lin

Category Aggregation Among Region Proposals for Object Detection . . . 210
Linghui Li, Sheng Tang, Jianshe Zhou, Bin Wang, and Qi Tian

Exploiting Local Feature Fusion for Action Recognition . . . 221
Jie Miao, Xiangmin Xu, Xiaoyi Jia, Haoyu Huang, Bolun Cai, Chunmei Qing, and Xiaofen Xing

Improving Image Captioning by Concept-Based Sentence Reranking . . . 231
Xirong Li and Qin Jin

Blind Image Quality Assessment Based on Local Quantized Pattern . . . 241
Yazhong Zhang, Jinjian Wu, Xuemei Xie, and Guangming Shi

Sign Language Recognition with Multi-modal Features . . . 252
Junfu Pu, Wengang Zhou, and Houqiang Li

Heterogeneous Convolutional Neural Networks for Visual Recognition . . . 262
Xiangyang Li, Luis Herranz, and Shuqiang Jiang

Recognition Oriented Feature Hallucination for Low Resolution Face Images . . . 275
Guangheng Jia, Xiaoguang Li, Li Zhuo, and Li Liu

Learning Robust Multi-Label Hashing for Efficient Image Retrieval . . . 285
Haibao Chen, Yuyan Zhao, Lei Zhu, Guilin Chen, and Kaichuan Sun

A Second-Order Approach for Blind Motion Deblurring by Normalized l1 Regularization . . . 296
Zedong Chen, Faming Fang, Yingying Xu, and Chaomin Shen

Abnormal Event Detection and Localization by Using Sparse Coding and Reconstruction . . . 306
Jing Xue, Yao Lu, and Haohao Jiang

Real-Time Video Dehazing Based on Spatio-Temporal MRF . . . 315
Bolun Cai, Xiangmin Xu, and Dacheng Tao

Dynamic Contour Matching for Lossy Screen Content Picture Intra Coding . . . 326
Hu Yuan, Tao Pin, and Yuanchun Shi

A Novel Hard-Decision Quantization Algorithm Based on Adaptive Deadzone Offset Model . . . 335
Hongkui Wang, Haibing Yin, and Ye Shen

Comparison of Information Loss Architectures in CNNs . . . 346
Song Wu and Michael S. Lew

Fast-Gaussian SIFT for Fast and Accurate Feature Extraction . . . 355
Liu Ke, Jun Wang, and Zhixian Ye

An Overview+Detail Surveillance Video Player: Information-Based Adaptive Fast-Forward . . . 366
Lele Dong, Qing Xu, Shang Wu, Xueyan Song, Klaus Schoeffmann, and Mateu Sbert

Recurrent Double Features: Recurrent Multi-scale Deep Features and Saliency Features for Salient Object Detection . . . 376
Ziqin Wang, Peilin Jiang, Fei Wang, and Xuetao Zhang

Key Frame Extraction Based on Motion Vector . . . 387
Ziqian Qiang, Qing Xu, Shihua Sun, and Mateu Sbert

Haze Removal Technology Based on Physical Model . . . 396
Yunqian Cui and Xinguang Xiang

Robust Uyghur Text Localization in Complex Background Images . . . 406
Jianjun Chen, Yun Song, Hongtao Xie, Xi Chen, Han Deng, and Yizhi Liu

Learning Qualitative and Quantitative Image Quality Assessment . . . 417
Yudong Liang, Jinjun Wang, Ze Yang, Yihong Gong, and Nanning Zheng

An Analysis-Oriented ROI Based Coding Approach on Surveillance Video Data . . . 428
Liang Liao, Ruimin Hu, Jing Xiao, Gen Zhan, Yu Chen, and Jun Xiao

A Stepwise Frontal Face Synthesis Approach for Large Pose Non-frontal Facial Image . . . 439
Xueli Wei, Ruimin Hu, Zhen Han, Liang Chen, and Xin Ding

Nonlinear PCA Network for Image Classification . . . 449
Xiao Zhang and Youtian Du

Salient Object Detection in Video Based on Dynamic Attention Center . . . 458
Mengling Shao, Ruimin Hu, Xu Wang, Zhongyuan Wang, Jing Xiao, and Ge Gao

Joint Optimization of a Perceptual Modified Wiener Filtering Mask and Deep Neural Networks for Monaural Speech Separation . . . 469
Wei Han, Xiongwei Zhang, Jibin Yang, Meng Sun, and Gang Min

Automatic Extraction and Construction Algorithm of Overpass from Raster Maps . . . 479
Xincan Zhao, Yaodan Liu, and Yaping Wang

Geometric and Tongue-Mouth Relation Features for Morphology Analysis of Tongue Body . . . 490
Qing Cui, Xiaoqiang Li, Jide Li, and Yin Zhang

Perceptual Asymmetric Video Coding for 3D-HEVC . . . 498
Yongfang Wang, Kanghua Zhu, Yawen Shi, and Pamela C. Cosman

Recognition of Chinese Sign Language Based on Dynamic Features Extracted by Fast Fourier Transform . . . 508
Zhengchao Zhang, Xiankang Qin, Xiaocong Wu, Feng Wang, and Zhiyong Yuan

Enhanced Joint Trilateral Up-sampling for Super-Resolution . . . 518
Liang Yuan, Xin Jin, and Chun Yuan

Learning to Recognize Hand-Held Objects from Scratch . . . 527
Xue Li, Shuqiang Jiang, Xiong Lv, and Chengpeng Chen

Audio Bandwidth Extension Using Audio Super-Resolution . . . 540
Jiang Lin, Hu Ruimin, Wang Xiaochen, and Tu Weiping

Jointly Learning a Multi-class Discriminative Dictionary for Robust Visual Tracking . . . 550
Zhao Liu, Mingtao Pei, Chi Zhang, and Mingda Zhu

Product Image Search with Deep Attribute Mining and Re-ranking . . . 561
Xin Zhou, Yuqi Zhang, Xiuxiu Bai, Jihua Zhu, Li Zhu, and Xueming Qian

A New Rate Control Algorithm Based on Region of Interest for HEVC . . . 571
Liquan Shen, Qianqian Hu, Zhi Liu, and Ping An

Deep Learning Features Inspired Saliency Detection of 3D Images . . . 580
Qiudan Zhang, Xu Wang, Jianmin Jiang, and Lin Ma

No-Reference Quality Assessment of Camera-Captured Distortion Images . . . 590
Lijuan Tang, Leida Li, Ke Gu, Jiansheng Qian, and Jianying Zhang

GIP: Generic Image Prior for No Reference Image Quality Assessment . . . 600
Qingbo Wu, Hongliang Li, and King N. Ngan

Objective Quality Assessment of Screen Content Images by Structure Information . . . 609
Yuming Fang, Jiebin Yan, Jiaying Liu, Shiqi Wang, Qiaohong Li, and Zongming Guo

CrowdTravel: Leveraging Heterogeneous Crowdsourced Data for Scenic Spot Profiling and Recommendation . . . 617
Tong Guo, Bin Guo, Jiafan Zhang, Zhiwen Yu, and Xingshe Zhou

Context-Oriented Name-Face Association in Web Videos . . . 629
Zhineng Chen, Wei Zhang, Hongtao Xie, Bailan Feng, and Xiaoyan Gu

Social Media Profiler: Inferring Your Social Media Personality from Visual Attributes in Portrait . . . 640
Jie Nie, Lei Huang, Peng Cui, Zhen Li, Yan Yan, Zhiqiang Wei, and Wenwu Zhu

SSFS: A Space-Saliency Fingerprint Selection Framework for Crowdsourcing Based Mobile Location Recognition . . . 650
Hao Wang, Dong Zhao, Huadong Ma, and Huaiyu Xu

Multi-view Multi-object Tracking Based on Global Graph Matching Structure . . . 660
Chao Li, Shantao Ping, Hao Sheng, Jiahui Chen, and Zhang Xiong

Accelerating Large-Scale Human Action Recognition with GPU-Based Spark . . . 670
Hanli Wang, Xiaobin Zheng, and Bo Xiao

Adaptive Multi-class Correlation Filters . . . 680
Linlin Yang, Chen Chen, Hainan Wang, Baochang Zhang, and Jungong Han

Deep Neural Networks for Free-Hand Sketch Recognition . . . 689
Yuqi Zhang, Yuting Zhang, and Xueming Qian

Fusion of Thermal and Visible Imagery for Effective Detection and Tracking of Salient Objects in Videos . . . 697
Yijun Yan, Jinchang Ren, Huimin Zhao, Jiangbin Zheng, Ezrinda Mohd Zaihidee, and John Soraghan

RGB-D Camera based Human Limb Movement Recognition and Tracking in Supine Positions . . . 705
Jun Wu, Cailiang Kuang, Kai Zeng, Wenjing Qiao, Fan Zhang, Xiaobo Zhang, and Zhisheng Xu

Scene Parsing with Deep Features and Spatial Structure Learning . . . 715
Hui Yu, Yuecheng Song, Wenyu Ju, and Zhenbao Liu

Semi-supervised Learning for Human Pose Recognition with RGB-D Light-Model . . . 723
Xinbo Wang, Guoshan Zhang, Dahai Yu, and Dan Liu

Author Index . . . 739


Visual Tracking by Local Superpixel Matching
with Markov Random Field

Heng Fan(1), Jinhai Xiang(2)(B), and Zhongmin Chen(2)

(1) College of Engineering, Huazhong Agricultural University, Wuhan, China
(2) College of Informatics, Huazhong Agricultural University, Wuhan, China
    jimmy xiang@mail.hzau.edu.cn

Abstract. In this paper, we propose a novel method to track non-rigid
and/or articulated objects using superpixel matching and Markov random
field (MRF). Our algorithm consists of three stages. First, a superpixel
dataset is constructed by segmenting training frames into superpixels, and
each superpixel is represented by multiple features. The appearance
information of the target is encoded in the superpixel database. Second, each
new frame is segmented into superpixels and its object-background
confidence map is derived by comparing its superpixels with their k-nearest
neighbors in the superpixel dataset. Taking context information into account,
we utilize MRF to further improve the accuracy of the confidence map. In
addition, the local context information is incorporated through a feedback to
refine superpixel matching. In the last stage, visual tracking is achieved by
finding the best candidate by maximum a posteriori estimation based on the
confidence map. Experiments show that our method outperforms several
state-of-the-art trackers.

Keywords: Visual tracking · Superpixel matching · Markov Random
Field (MRF) · Local context information

1 Introduction

In the computer vision field, object tracking plays a crucial role owing to its
various applications, such as surveillance and robotics [14]. Numerous algorithms
have been proposed to develop robust trackers. Despite their reasonably good
results, visual tracking remains a challenge due to appearance variations caused
by occlusion and deformation. To address these problems, a wide range of
appearance models have been presented. In general, these appearance models
can be categorized into two types: discriminative models [2,3,6,9,10,13,18,20]
and generative models [1,5,7,8,15,16].
Discriminative algorithms focus on building online classifiers to distinguish
the target from the background. These methods employ both foreground and
background information. In [2], an adaptive ensemble of classifiers is trained
to separate target pixels from background pixels. Kalal et al. [13] introduce a
P-N learning algorithm for object tracking; however, this method easily drifts
when the object appearance varies. Babenko et al. [3] utilize the multiple
instance learning (MIL) method for visual tracking, which can alleviate drift to
some extent. Yang et al. [18] suggest a discriminative appearance model based
on superpixels, which facilitates the tracking algorithm to distinguish the target
from the background.
On the other hand, generative models formulate tracking as searching for
the regions most similar to the object. These methods are based on either
subspace models or templates and update the appearance model dynamically.
In [16], the incremental visual tracking method provides an online approach for
efficiently learning and updating a low-dimensional principal component analysis
(PCA) subspace representation of the object. However, this representation
scheme is sensitive to occlusion. Adam et al. [1] present a fragment-based
template model for visual tracking. Mei and Ling [15] model the object appearance
with sparse representation for visual tracking and achieve good performance.
Though having achieved promising performance, the aforementioned algorithms
often suffer from drifting when substantial non-rigid and articulated
motions are involved in the object.
To track non-rigid and/or articulated objects, we propose a novel tracking
algorithm with local superpixel matching and Markov random field (MRF). Our
method contains three main stages. In the first stage, we construct a superpixel
dataset by segmenting training frames into superpixels, and each superpixel in
the dataset is represented with multiple features. In this way, the appearance
information of the object is encoded in the superpixel dataset. In the second
stage, each new frame is represented by its superpixels, and its object-background
confidence map is computed by comparing its superpixels with their k-nearest
neighbors in the superpixel dataset. In this process, the tracking task is treated
as separating object pixels from background pixels. Taking the context
information into consideration, we utilize MRF to further improve the accuracy of
the confidence map. In addition, the local context information of each superpixel
is incorporated through a feedback to refine superpixel matching. In the final
stage, object tracking is achieved by searching for the best candidate by
maximum a posteriori estimation based on the confidence map. When tracking is
completed in each frame, we collect good tracking results to update the superpixel
dataset. With the help of this update scheme, our tracker is able to adapt
to appearance changes of the target. Figure 1 illustrates the framework of the
proposed method.

2 The Proposed Tracking Algorithm

2.1 Superpixel Dataset Construction

To build the superpixel dataset, we oversegment Q training frames to generate N
superpixels using the algorithm in [11]. For each superpixel s_i (1 ≤ i ≤ N), we
extract four kinds of features: a SIFT histogram, an RGB histogram, a location
histogram, and a PHOG histogram. These histograms are concatenated to
represent a superpixel, similarly to [19].

Fig. 1. Framework of the proposed method.

Let x_i denote the feature of s_i. We use y_i to represent its label, where
y_i ∈ {0, 1} (0 and 1 represent the background and object labels, respectively),
determined by

    y_i = \begin{cases} 1, & a_i \geq 95\% \\ 0, & \text{otherwise} \end{cases}    (1)

where a_i represents the target-area ratio of s_i. We then collect all the training
superpixels into a database and obtain the superpixel database D = \{s_i, x_i, y_i\}_{i=1}^{N}.
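
To make the data flow of this stage concrete, here is a minimal Python sketch (our illustration, not the authors' MATLAB code); `segment_fn` and `feature_fn` are hypothetical stand-ins for the oversegmentation of [11] and the concatenated SIFT/RGB/location/PHOG histograms.

```python
import numpy as np

def build_superpixel_database(train_frames, train_masks, segment_fn, feature_fn,
                              area_ratio_thresh=0.95):
    """Build the superpixel database D = {(s_i, x_i, y_i)} of Sect. 2.1.

    train_frames : list of H x W x 3 images (the Q training frames)
    train_masks  : list of H x W binary masks, 1 inside the labeled target
    segment_fn   : image -> H x W array of superpixel ids (oversegmentation [11])
    feature_fn   : (image, superpixel_mask) -> 1-D feature vector
                   (concatenated SIFT/RGB/location/PHOG histograms)
    """
    features, labels = [], []
    for frame, mask in zip(train_frames, train_masks):
        seg = segment_fn(frame)
        for sp_id in np.unique(seg):
            sp_mask = (seg == sp_id)
            a_i = mask[sp_mask].mean()      # target-area ratio a_i of Eq. (1)
            features.append(feature_fn(frame, sp_mask))
            labels.append(1 if a_i >= area_ratio_thresh else 0)   # Eq. (1)
    return np.asarray(features), np.asarray(labels)
```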

2.2 Object-Background Confidence Map

For each new frame, we first extract the surrounding region¹ of the target in the
last frame and then segment this region into superpixels with the same method
as in [11]. Let M be the number of its superpixels. For the j-th superpixel s_j
(1 ≤ j ≤ M), we calculate its label cost by comparing it with its k-nearest
neighbors N_k(j) in the superpixel dataset D as follows:

    U(y_j = c \mid s_j) = 1 - \frac{\sum_{i \in N_k(j),\, y_i = c} K(x_j, x_i)}{\sum_{i \in N_k(j)} K(x_j, x_i)}    (2)

where x_j denotes the feature of s_j, c ∈ {0, 1} represents the label, and
K(x_j, x_i) is the intersection kernel between features x_j and x_i.
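
A direct transcription of Eq. (2) might look as follows; this is a sketch that assumes the features are histograms, so the histogram intersection kernel applies, and that `db_features` and `db_labels` come from the database built above.

```python
import numpy as np

def intersection_kernel(x, Y):
    """Histogram intersection kernel K(x, y) between x and each row of Y."""
    return np.minimum(x[None, :], Y).sum(axis=1)

def label_cost(x_j, db_features, db_labels, k=7):
    """Label costs U(y_j = c | s_j) of Eq. (2); k = 7 as in Sect. 3."""
    sims = intersection_kernel(x_j, db_features)
    nn = np.argsort(-sims)[:k]                  # k-nearest neighbours N_k(j)
    total = sims[nn].sum()
    costs = {}
    for c in (0, 1):
        same = sims[nn][db_labels[nn] == c].sum()
        costs[c] = 1.0 - same / total           # Eq. (2)
    return costs
```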
In this work, tracking is treated as separating object pixels from background
pixels. In order to exploit the context relationship between object pixels and
background pixels, we utilize MRF inference for contextual constraints. The
energy function is given by

    E(Y) = \sum_{p} U(y_p = c) + \lambda \sum_{pq} V(y_p = c, y_q = c')    (3)

¹ The surrounding region is a square area centered at the target location X_t^c, and
its side length is equal to \lambda_s [X_t^s]^{1/2}, where X_t^c represents the center location
of the target region X_t and X_t^s denotes its size. The parameter \lambda_s is a constant
that determines the size of this surrounding region.
where p, q are pixel indices, c, c' are candidate labels, and λ is the weight of the
pairwise energy. The unary energy of a pixel is given by the superpixel it belongs to:

    U(y_p = c) = U(y_j = c \mid s_j),  p ∈ s_j    (4)

The pairwise energy on edges is given by a spatially variant label cost

    V(y_p = c, y_q = c') = d(p, q) \cdot \mu(c, c')    (5)

where d(p, q) = \exp(-\|I(p) - I(q)\|^2 / 2\sigma^2) is the color dissimilarity between
two adjacent pixels, and \mu(c, c') is the penalty of assigning labels c and c' to two
adjacent pixels, defined by the log-likelihood of label co-occurrence statistics:

    \mu(c, c') = -\log[(P(c \mid c') + P(c' \mid c))/2] \times \sigma    (6)

In this way, we can derive the labels of all pixels by performing MAP inference
on E(Y) with the graph cut optimization in [4].
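
For illustration, the sketch below evaluates E(Y) of Eqs. (3)-(6) for a given labelling; the authors minimize this energy with the graph cuts of [4], which we do not reproduce here, and λ, σ, and the co-occurrence statistics P(c | c') are assumed inputs.

```python
import numpy as np

def mrf_energy(labels, unary, image, lam=1.0, sigma=10.0, cooccur=None):
    """Energy E(Y) of Eq. (3) with unary term (4) and pairwise term (5)-(6).

    labels : H x W int array in {0, 1}, a candidate labelling Y
    unary  : H x W x 2 array, unary[p, c] = U(y_p = c) from Eq. (4)
    image  : H x W x 3 float array I, used for d(p, q)
    cooccur: 2 x 2 array of co-occurrence probabilities P(c | c');
             uniform if omitted (the paper uses label co-occurrence statistics)
    """
    if cooccur is None:
        cooccur = np.full((2, 2), 0.5)
    H, W = labels.shape
    rows, cols = np.indices((H, W))
    energy = unary[rows, cols, labels].sum()        # sum_p U(y_p = c)
    # 4-connected neighbours: vertical and horizontal edges
    for (p, q, Ip, Iq) in [
        (labels[:-1, :], labels[1:, :], image[:-1], image[1:]),
        (labels[:, :-1], labels[:, 1:], image[:, :-1], image[:, 1:]),
    ]:
        d = np.exp(-((Ip - Iq) ** 2).sum(axis=-1) / (2.0 * sigma ** 2))  # d(p, q)
        mu = -np.log((cooccur[p, q] + cooccur[q, p]) / 2.0) * sigma      # Eq. (6)
        energy += lam * (d * mu).sum()              # lambda * sum_pq V(., .)
    return energy
```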
Taking the local context information of each superpixel into account, we
adopt a simple yet effective feedback mechanism as in [19]. In the feedback
process, we obtain the pixel-wise classification likelihood of each pixel by

    \ell(p, c) = \frac{1}{1 + \exp(U(y_p = c))}    (7)

where U(y_p = c) is the cost of assigning label c to pixel p in Eq. (4).

Fig. 2. Local context descriptor of superpixel.

For robust superpixel matching, we exploit the local context of each superpixel.
For superpixel s_j, we divide its neighborhood into four cells: left, right, top,
and bottom, {lc_j^1, lc_j^2, lc_j^3, lc_j^4} (see Fig. 2). For each cell lc_j^k
(1 ≤ k ≤ 4), we compute its sparse context h_j^k = [h_{j1}^k, h_{j2}^k] by

    h_{jc}^k = \max_{p \in lc_j^k} \ell(p, c)    (8)

where \ell(p, c) is the pixel-wise classification likelihood obtained by Eq. (7). For
superpixel s_j, we thus obtain the spatial context descriptor h_j = [h_j^1; h_j^2; h_j^3; h_j^4]
and can classify the superpixels of the new frame by Eq. (2) with the new
feature [x_j; h_j].
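
The feedback of Eqs. (7)-(8) can be sketched as follows; the paper does not specify the extent of the four neighborhood cells, so the `margin` parameter is our assumption.

```python
import numpy as np

def pixel_likelihood(unary):
    """l(p, c) of Eq. (7); unary is H x W x 2 with U(y_p = c)."""
    return 1.0 / (1.0 + np.exp(unary))

def local_context_descriptor(sp_mask, likelihood, margin=10):
    """Local context descriptor h_j of Eq. (8) for one superpixel.

    sp_mask    : H x W boolean mask of superpixel s_j
    likelihood : H x W x 2 array of l(p, c) from pixel_likelihood
    margin     : assumed width of the four neighbourhood cells
    """
    ys, xs = np.nonzero(sp_mask)
    top, bottom, left, right = ys.min(), ys.max(), xs.min(), xs.max()
    H, W = sp_mask.shape
    cells = [  # left, right, top, bottom cells lc_j^1..lc_j^4 around the box
        (slice(top, bottom + 1), slice(max(left - margin, 0), left)),
        (slice(top, bottom + 1), slice(right + 1, min(right + 1 + margin, W))),
        (slice(max(top - margin, 0), top), slice(left, right + 1)),
        (slice(bottom + 1, min(bottom + 1 + margin, H)), slice(left, right + 1)),
    ]
    h = []
    for cell in cells:
        region = likelihood[cell]
        if region.size == 0:
            h.extend([0.0, 0.0])            # empty cell at the image border
        else:
            h.extend(region.reshape(-1, 2).max(axis=0))   # h_{jc}^k, Eq. (8)
    return np.asarray(h)                    # h_j = [h_j^1; h_j^2; h_j^3; h_j^4]
```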
Through the above process, we obtain the matching score Score(j) for
superpixel s_j by

    Score(j) = \frac{U(y_j = 1 \mid s_j) - U(y_j = 0 \mid s_j)}{U(y_j = 1 \mid s_j) + U(y_j = 0 \mid s_j)}    (9)

and the confidence map C for each pixel of the entire current frame as follows:
we assign 1 to every pixel whose label is object, and -1 to every pixel whose
label is background or that lies outside the surrounding region. Figure 3 shows
the matching maps, confidence maps, and tracking results of the target in the
video Iceskater obtained by our method.
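
Ignoring the MRF refinement for brevity, a sketch of Eq. (9) and of the confidence-map construction described above (in the paper, the object/background labels come from the MRF inference rather than directly from the raw costs):

```python
import numpy as np

def confidence_map_from_costs(seg, costs, region_mask):
    """Confidence map C: +1 on object pixels, -1 on background pixels and
    outside the surrounding region.

    seg         : H x W superpixel ids of the current frame
    costs       : dict sp_id -> {0: U(y=0|s), 1: U(y=1|s)} from Eq. (2)
    region_mask : H x W boolean mask of the surrounding region
    """
    C = -np.ones(seg.shape)
    for sp_id, u in costs.items():
        score = (u[1] - u[0]) / (u[1] + u[0])   # matching score, Eq. (9)
        sp = (seg == sp_id) & region_mask
        C[sp] = 1.0 if score < 0 else -1.0      # lower cost for c = 1 => object
    return C
```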

[Figure 3 panels: Iceskater frames 39, 86, 167, and 233.]

Fig. 3. Matching maps, confidence maps, and tracking results on the video Iceskater.
First row: original images. Second row: matching maps of the corresponding regions
obtained by our local superpixel matching. Third row: confidence maps of the
corresponding regions derived by performing MRF on the matching maps. Fourth row: the
final tracking results of each frame (see details in Sect. 2.3).

2.3 Tracking Formulation

Our tracker is implemented within a Bayesian framework. Given the observation
set of the target Z^t = {z_1, z_2, ..., z_t} up to frame t, where z_τ (τ = 1, 2, ..., t)
represents the observation of the target in frame τ, we obtain the estimate
\hat{X}_t by computing the maximum a posteriori via

    \hat{X}_t = \arg\max_{X_t^i} p(X_t^i \mid Z^t)    (10)

where X_t^i denotes the i-th sample of the state X_t. The posterior probability
p(X_t \mid Z^t) can be obtained recursively by Bayes' theorem via

    p(X_t \mid Z^t) \propto p(z_t \mid X_t) \int p(X_t \mid X_{t-1})\, p(X_{t-1} \mid Z^{t-1})\, dX_{t-1}    (11)

where p(X_t \mid X_{t-1}) and p(z_t \mid X_t) represent the dynamic model and the
observation model, respectively.

The dynamic model indicates the temporal correlation of the target state
between consecutive frames. We apply an affine transformation to model the
target motion between two consecutive frames within the particle filter framework.
The state transition can be formulated as

    p(X_t \mid X_{t-1}) = N(X_t; X_{t-1}, \Psi)    (12)

where Ψ is a diagonal covariance matrix whose elements are the variances of the
affine parameters. The observation model p(z_t \mid X_t) represents the probability
of the observation z_t at state X_t. In this paper, the observation for the i-th
sample at state X_t is designed as in [18] by

    p(z_t \mid X_t^i) \propto \sum_{(w,v) \in C_t^i} v_t^i(w, v) \times [S(X_t^i) / S(X_{t-1})]    (13)

where C_t^i is the confidence map of the i-th candidate warped from the confidence
map of the corresponding region, v_t^i(w, v) denotes the confidence value of the
pixel at location (w, v), S(X_t^i) represents the area of the i-th candidate, and
S(X_{t-1}) is the area of the object in the last frame. Through Bayesian inference,
we determine the candidate sample with the maximum observation likelihood as
the tracking result.
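
A minimal particle-filter step in the spirit of Eqs. (10)-(13); note that the paper uses a full affine state, whereas this sketch (our simplification) uses a translation/scale state, and the transition variances are assumed values for Ψ.

```python
import numpy as np

def track_frame(prev_state, confidence_map, n_particles=300, std=None, rng=None):
    """One Bayesian tracking step in the spirit of Eqs. (10)-(13).

    prev_state     : (x, y, w, h) of the target in the previous frame
    confidence_map : H x W array with +1 on object pixels and -1 elsewhere
    """
    if rng is None:
        rng = np.random.default_rng()
    if std is None:
        std = np.array([4.0, 4.0, 1.0, 1.0])   # assumed diagonal of Psi
    prev = np.asarray(prev_state, dtype=float)
    # Eq. (12): Gaussian state transition p(X_t | X_{t-1}) = N(X_t; X_{t-1}, Psi)
    particles = prev + rng.normal(size=(n_particles, 4)) * std
    H, W = confidence_map.shape
    prev_area = prev[2] * prev[3]
    best, best_score = prev, -np.inf
    for x, y, w, h in particles:
        x0, y0 = max(int(x), 0), max(int(y), 0)
        x1, y1 = min(int(x + w), W), min(int(y + h), H)
        if x1 <= x0 or y1 <= y0:
            continue
        # Eq. (13): summed confidence inside the candidate, scaled by area ratio
        score = confidence_map[y0:y1, x0:x1].sum() * ((w * h) / prev_area)
        if score > best_score:
            best, best_score = np.array([x, y, w, h]), score
    # Eq. (10): keep the maximum a posteriori candidate
    return best
```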

2.4 Online Update

Before tracking, the target in the initial frame is manually labeled. The Q
training samples used for constructing the database are the same as the first
frame and are stored in a set T with fixed length L. Our strategy is to choose
the latest good tracking result, add it into the set T, and remove the oldest
element of T if T is full. In this way, when superpixel matching starts, the
superpixel dataset can be effectively updated by the new set T to adapt to
changes of the object appearance.

Every H frames, we compute the occlusion coefficient O_t of the latest
tracking result X_t by

    O_t = 1 - \frac{\sum_{(w,v) \in C_t} v_t(w, v) / S(X_t)}{\sum_{(w,v) \in C_{t-1}} v_{t-1}(w, v) / S(X_{t-1})}    (14)

When heavy occlusion happens, the occlusion coefficient O_t becomes large, and
it is then unnecessary to add the tracking result into T. We set a threshold θ
to determine whether a tracking result is added into T: if O_t > θ, we skip this
frame to avoid introducing noise into T; otherwise, we add the tracking result
into T and remove the oldest element from T if the number of elements in T is
larger than L.
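
The update rule maps naturally onto a fixed-length queue; a sketch with the paper's settings θ = 0.8, L = 10, and H = 5:

```python
from collections import deque

def occlusion_coefficient(conf_sum_t, area_t, conf_sum_prev, area_prev):
    """O_t of Eq. (14): one minus the ratio of the mean confidence of the
    latest result to that of the previous one."""
    return 1.0 - (conf_sum_t / area_t) / (conf_sum_prev / area_prev)

def maybe_update(T: deque, tracking_result, o_t, theta=0.8):
    """Add a good result to the training set T (Sect. 2.4); T is a deque with
    maxlen = L = 10, so the oldest element is dropped automatically."""
    if o_t <= theta:        # skip heavily occluded results (O_t > theta)
        T.append(tracking_result)

# Usage sketch: T = deque(initial_samples, maxlen=10); every H = 5 frames,
# compute O_t via Eq. (14) and call maybe_update(T, X_t, O_t).
```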

3 Experiments

We evaluate our tracker on eight challenging image sequences and compare it
with seven state-of-the-art tracking methods: SPT [18], CT [20], SCM [23],
STC [21], ASLA [12], PCOM [17], and MTT [22].

Table 1. Average center location error (CLE) in pixels. The best and second-best
results are shown in red and blue fonts.

Sequence      ASLA   CT     MTT    PCOM   SCM    STC    SPT    Ours
Bikeshow      182.7  82.2   190.0  135.1  193.2  148.4  22.3   6.8
David3        104.5  89.8   307.0  100.0  67.6   6.3    8.3    8.1
Gymnastics    15.9   24.7   100.0  102.1  16.5   17.7   25.3   11.9
Iceskater     21.6   45.3   86.9   125.6  144.7  130.1  17.6   12.7
Motocross2    4.5    18.8   32.5   86.8   10.9   4.6    19.2   10.0
Skater        9.3    14.3   7.6    88.5   12.6   15.5   14.3   7.3
Skater2       28.4   41.0   76.3   107.6  88.5   28.4   24.2   9.1
Transformer   43.1   52.8   42.9   104.6  37.5   44.2   15.6   10.1
Average       51.3   46.1   105.4  106.3  71.4   49.4   18.4   9.5

[Figure 4 plots: per-frame center location error curves for ASLA, CT, MTT, PCOM,
SCM, STC, SPT, and Ours on the sequences Bikeshow, David3, Gymnastics, Iceskater,
Motocross2, Skater, Skater2, and Transformer; x-axis: Frame Index, y-axis: Center
Location Error (pixel).]

Fig. 4. Quantitative evaluation in terms of center location error in pixels. The proposed
method is compared with seven state-of-the-art algorithms on eight challenging test
sequences.

in MATLAB and runs at 1.5 frames per second on a 3.2 GHz Intel E3-1225 v3
Core PC with 8 GB memory. The parameters of the proposed tracker are fixed in
all experiments. The number of neighbors k in Eq. (2) is set to 7. The number of
particles in Bayesian framework is 300 to 600. The λs is set to 1.5. The number
of initial training samples is 5. The length L of set T is fixed to 10, and H is set
to 5. The threshold θ is 0.8.

3.1 Quantitative Comparison

We evaluate the above-mentioned trackers by center location error (CLE) in
pixels; the comparison results are shown in Table 1, and Fig. 4 plots the center
location error of the trackers on the eight test sequences. Overall, the proposed
tracker outperforms the other state-of-the-art algorithms.

3.2 Qualitative Comparison

Deformation: Deformation is a disaster for a tracker because it causes heavy
appearance variations. Figure 5(a) and (d) demonstrate the tracking results in
the presence of deformation. The proposed tracker is able to robustly locate the
non-rigid objects in these sequences because we represent the object appearance
with a robust superpixel database. With the help of the update scheme, the

[Figure 5 panels: (a) Bikeshow, (b) David3, (c) Motocross2, (d) Transformer;
legend: ASLA, CT, MTT, PCOM, SCM, STC, SPT, Ours.]

Fig. 5. Screenshots of some sample tracking results.

superpixel database can be updated to adapt to object appearance changes, and
thus our tracker is robust to deformation.

Occlusion: Occlusion is a common problem in visual tracking. Figure 5(b) shows
the performance of our tracker in the presence of occlusion. When occlusion
happens, the object appearance changes because part of the target is occluded.
Our tracker is nevertheless able to locate the object, because local superpixel
matching lets it exploit the unoccluded part of the target for tracking.

Rotation: Figure 5(c) shows sampled experimental results for a target with
drastic rotation. In this sequence, the object suffers not only from rotation but
also from scale variation. Our method demonstrates good performance in
tracking the target owing to our appearance model. When rotation happens, the
structure of the object appearance changes. Nevertheless, our superpixel database
can ignore these structural changes and distinguish object superpixels from
background superpixels via our matching method.

4 Conclusion

In this paper, we propose a novel method for object tracking, especially for
targets involved in non-rigid and articulated motions. The approach consists of
three main stages. In the first stage, a superpixel database is constructed to
represent the appearance of the object. In the second stage, when a new frame
arrives, it is first segmented into superpixels, and we compute its confidence
map via superpixel matching and MRF inference; taking context information
into account, the MRF further improves the accuracy of the confidence map,
and the local context information is incorporated through a feedback to refine
superpixel matching. In the last stage, visual tracking is achieved by finding the
best candidate by maximum a posteriori estimation based on the confidence
map. Experiments evidence the effectiveness of our method.

Acknowledgement. This work was primarily supported by the Fundamental Research
Funds for the Central Universities (Program No. 2662016PY008 and Program No.
2662014PY052).

References

1. Adam, A., Rivlin, E., Shimshoni, I.: Robust fragments-based tracking using the
   integral histogram. In: IEEE Conference on Computer Vision and Pattern
   Recognition (CVPR), pp. 798–805 (2006)
2. Avidan, S.: Ensemble tracking. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI)
   29(2), 261–271 (2007)
3. Babenko, B., Yang, M.-H., Belongie, S.: Robust object tracking with online
   multiple instance learning. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 33(8),
   1619–1632 (2011)
4. Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via
   graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 23(11), 1222–1239
   (2001)
Another random document with
no related content on Scribd:
(good)
Strawberry Isinglass Jelly 468
Fancy Jellies, and Jelly in 469
Belgrave mould
Queen Mab’s Pudding (an 470
elegant summer dish)
Nesselróde Cream 471
Crême à la Comtesse, or the 472
Countess’s Cream
An excellent Trifle 473
Swiss Cream, or Trifle (very 473
good)
Tipsy Cake, or Brandy Trifle 474
Chantilly Basket filled with 474
whipped Cream and fresh
Strawberries
Very good Lemon Cream, 475
made without Cream
Fruit Creams, and Italian 475
Creams
Very superior whipped 476
Syllabubs
Good common Blanc-mange, 476
or Blanc Manger (Author’s
receipt)
Richer Blanc-mange 477
Jaumange, or Jaune Manger; 477
sometimes called Dutch
Flummery
Extremely good Strawberry 477
Blanc-mange, or Bavarian
Cream
Quince Blanc-mange 478
(delicious)
Quince Blanc-mange, with 478
Almond Cream
Apricot Blanc-mange, or 479
Crême Parisienne
Currant Blanc-mange 479
Lemon Sponge, or Moulded 480
Lemon Cream
An Apple Hedgehog, or 480
Suédoise
Imperial Gooseberry-fool 480
Very good old-fashioned boiled 481
Custard
Rich boiled Custard 481
The Queen’s Custard 481
Currant Custard 482
Quince or Apple Custards 482
The Duke’s Custard 482
Chocolate Custards 483
Common baked Custard 483
A finer baked Custard 483
French Custards or Creams 484
German Puffs 484
A Meringue of Rhubarb, or 485
green Gooseberries
Creamed Spring Fruit, or 486
Rhubarb Trifle
Meringue of Pears, or other 486
fruit
An Apple Charlotte, or 486
Charlotte de Pommes
Marmalade for the Charlotte 487
A Charlotte à la Parisienne 486
A Gertrude à la Créme 486
Pommes au Beurre (Buttered 488
Apples) (excellent)
Suédoise of Peaches 488
Aroce Doce, or Sweet Rice à la 489
Portugaise
Cocoa Nut Doce 490
Buttered Cherries (Cerises au 490
Beurre)
Sweet Macaroni 490
Bermuda Witches 491
Nesselróde Pudding 491
Stewed Figs (a very nice 492
Compote)
CHAPTER XXIV.

PRESERVES.

General Remarks on the use and value of Preserved Fruits, 493
A few General Rules and Directions for Preserving, 496
To Extract the Juice of Plums for Jelly, 497
To weigh the Juice of Fruit, 498
Rhubarb Jam, 498
Green Gooseberry Jelly, 498
Green Gooseberry Jam (firm and of good colour), 499
To dry green Gooseberries, 499
Green Gooseberries for Tarts, 499
Red Gooseberry Jam, 500
Very fine Gooseberry Jam, 500
Jelly of ripe Gooseberries (excellent), 500
Unmixed Gooseberry Jelly, 501
Gooseberry Paste, 501
To dry ripe Gooseberries with Sugar, 501
Jam of Kentish or Flemish Cherries, 502
To dry Cherries with Sugar (a quick and easy method), 502
Dried Cherries (superior receipt), 503
Cherries dried without Sugar, 503
To dry Morella Cherries, 504
Common Cherry Cheese, 504
Cherry Paste (French), 504
Strawberry Jam, 504
Strawberry Jelly, a very superior Preserve (new receipt), 505
Another very fine Strawberry Jelly, 505
To preserve Strawberries or Raspberries, for Creams or Ices, without boiling, 506
Raspberry Jam, 506
Very rich Raspberry Jam, or Marmalade, 506
Good Red or White Raspberry Jam, 507
Raspberry Jelly for flavouring Creams, 507
Another Raspberry Jelly (very good), 508
Red Currant Jelly, 508
Superlative Red Currant Jelly (Norman receipt), 509
French Currant Jelly, 509
Delicious Red Currant Jam, 509
Very fine White Currant Jelly, 510
White Currant Jam, a beautiful Preserve, 510
Currant Paste, 510
Fine Black Currant Jelly, 511
Common Black Currant Jelly, 511
Black Currant Jam and Marmalade, 511
Nursery Preserve, 512
Another good common Preserve, 512
A good Mélange, or mixed Preserve, 513
Groseillée (another good Preserve), 513
Superior Pine-apple Marmalade (a new receipt), 513
A fine Preserve of the green Orange Plum (sometimes called the Stonewood Plum), 514
Greengage Jam, or Marmalade, 515
Preserve of the Magnum Bonum, or Mogul Plum, 515
To dry or preserve Mogul Plums in syrup, 515
Mussel Plum Cheese and Jelly, 516
Apricot Marmalade, 516
To dry Apricots (a quick and easy method), 517
Dried Apricots (French receipt), 517
Peach Jam, or Marmalade, 518
To preserve or to dry Peaches or Nectarines (an easy and excellent receipt), 518
Damson Jam (very good), 519
Damson Jelly, 519
Damson or Red Plum Solid (good), 519
Excellent Damson Cheese, 520
Red Grape Jelly, 520
English Guava (a firm, clear, bright Jelly), 520
Very fine Imperatrice Plum Marmalade, 521
To dry Imperatrice Plums (an easy method), 521
To bottle Fruit for winter use, 522
Apple Jelly, 522
Exceedingly fine Apple Jelly, 523
Quince Jelly, 524
Quince Marmalade, 523
Quince and Apple Marmalade, 525
Quince Paste, 525
Jelly of Siberian Crabs, 526
To preserve Barberries in bunches, 526
Barberry Jam (First and best receipt), 506
Barberry Jam (second receipt), 527
Superior Barberry Jelly, and Marmalade, 527
Orange Marmalade (a Portuguese receipt), 527
Genuine Scotch Marmalade, 528
Clear Orange Marmalade (Author’s receipt), 529
Fine Jelly of Seville Oranges (Author’s original receipt), 530
CHAPTER XXV.

PICKLES.

Observations on Pickles, 531
To pickle Cherries, 532
To pickle Gherkins, 532
To pickle Gherkins (a French receipt), 533
To pickle Peaches, and Peach Mangoes, 534
Sweet Pickle of Lemon (Foreign receipt) (to serve with roast meat), 534
To pickle Mushrooms, 535
Mushrooms in brine, for winter use (very good), 536
To pickle Walnuts, 536
To pickle Beet-Root, 537
Pickled Eschalots (Author’s receipt), 537
Pickled Onions, 537
To pickle Lemons and Limes (excellent), 538
Lemon Mangoes (Author’s original receipt), 538
To pickle Nasturtiums, 539
To pickle red Cabbage, 539
CHAPTER XXVI.

CAKES.

General Remarks on Cakes, 540
To blanch and to pound Almonds, 542
To reduce Almonds to a Paste (the quickest and easiest way), 542
To colour Almonds or Sugar-grains, or Sugar-candy, for Cakes or Pastry, 542
To prepare Butter for rich Cakes, 543
To whisk Eggs for light rich Cakes, 543
Sugar Glazings and Icings, for fine Cakes and Pastry, 543
Orange-Flower Macaroons (delicious), 544
Almond Macaroons, 544
Very fine Cocoa-nut Macaroons, 545
Imperials (not very rich), 545
Fine Almond Cake, 545
Plain Pound or Currant Cake (or rich Brawn Brack or Borrow Brack), 546
Rice Cake, 546
White Cake, 546
A good Sponge Cake, 547
A smaller Sponge Cake (very good), 547
Fine Venetian Cake or Cakes, 547
A good Madeira Cake, 548
A Solimemne (a rich French breakfast cake, or Sally Lunn), 549
Banbury Cakes, 549
Meringues, 550
Italian Meringues, 551
Thick, light Gingerbread, 551
Acton Gingerbread, 552
Cheap and very good Ginger Oven-cake or Cakes, 552
Good common Gingerbread, 553
Richer Gingerbread, 553
Cocoa-nut Gingerbread (original receipts), 553
Delicious Cream Cake and Sweet Rusks, 554
A good light Luncheon-cake and Brawn Brack, 554
A very cheap Luncheon-biscuit, or Nursery-cake, 555
Isle of Wight Dough-nuts, 556
Queen Cakes, 556
Jumbles, 556
A good Soda Cake, 556
Good Scottish Short-bread, 557
A Galette, 557
Small Sugar Cakes of various kinds, 558
Fleed, or Flead Cakes, 558
Light Buns of different kinds, 559
Exeter Buns, 559
Threadneedle-street Biscuits, 560
Plain Dessert Biscuits and Ginger Biscuits, 560
Good Captain’s Biscuits, 560
The Colonel’s Biscuits, 561
Aunt Charlotte’s Biscuits, 561
Excellent Soda Buns, 561
CHAPTER XXVII.

CONFECTIONARY.

To clarify Sugar, 562
To boil Sugar from Syrup to Candy, or to Caramel, 563
Caramel (the quickest way), 563
Barley-sugar, 564
Nougat, 564
Ginger-candy, 565
Orange-flower Candy, 565
Orange-flower Candy (another receipt), 566
Cocoa-nut Candy, 566
Everton Toffee, 567
Chocolate Drops, 567
Chocolate Almonds, 568
Seville Orange Paste, 568
CHAPTER XXVIII.

DESSERT DISHES.

Dessert Dishes, 569
Pearled Fruit, or Fruit en Chemise, 570
Salad of mixed Summer Fruits, 570
Peach Salad, 570
Orange Salad, 571
Tangerine Oranges, 571
Peaches in Brandy (Rotterdam receipt), 571
Brandied Morella Cherries, 571
Baked Compôte of Apples (our little lady’s receipt), 572
Dried Norfolk Biffins, 572
Normandy Pippins, 572
Stewed Pruneaux de Tours, or Tours dried Plums, 573
To bake Pears, 573
Stewed Pears, 573
Boiled Chestnuts, 574
Roasted Chestnuts, 574
Almond Shamrocks (very good and very pretty), 574
Small Sugar Soufflés, 575
Ices, 575
CHAPTER XXIX.

SYRUPS, LIQUEURS, ETC.

Strawberry Vinegar, of delicious flavour, 577
Very fine Raspberry Vinegar, 578
Fine Currant Syrup, or Sirop de Groseilles, 579
Cherry Brandy (Tappington Everard receipt), 579
Oxford Punch, 580
Oxford receipt for Bishop, 580
Cambridge Milk Punch, 581
To mull Wine (an excellent French receipt), 581
A Birthday Syllabub, 581
An admirable cool cup, 582
The Regent’s, or George the Fourth’s Punch, 582
Mint Julep (an American receipt), 582
Delicious Milk Lemonade, 583
Excellent portable Lemonade, 583
Excellent Barley Water (Poor Xury’s receipt), 583
Raisin Wine, which, if long kept, really resembles foreign, 583
Very good Elderberry Wine, 584
Very Good Ginger Wine, 584
Excellent Orange Wine, 585
The Counsellor’s Cup, 585
CHAPTER XXX.

COFFEE, CHOCOLATE, ETC.

Coffee, 587
To roast Coffee, 588
A few general directions for making Coffee, 589
Excellent Breakfast Coffee, 590
To boil Coffee, 591
Café Noir, 592
Burnt Coffee, or Coffee à la militaire (In France vulgarly called Gloria), 592
To make Chocolate, 592
A Spanish recipe for making and serving Chocolate, 592
To make Cocoa, 593
CHAPTER XXXI.

BREAD.

Remarks on Home-made Bread, 594
To purify Yeast for Bread or Cakes, 595
The Oven, 595
A few rules to be observed in making Bread, 596
Household Bread, 596
Bordyke Bread (Author’s receipt), 597
German Yeast (and Bread made with German Yeast), 598
Professor Liebig’s Bavarian Brown Bread (very nutritious and wholesome), 599
English Brown Bread, 599
Unfermented Bread, 599
Potato Bread, 600
Dinner or Breakfast Rolls, 600
Geneva Rolls or Buns, 601
Rusks, 602
Excellent Dairy Bread, made without Yeast (Author’s receipt), 602
To keep Bread, 603
To freshen stale Bread (and Pastry, &c.) and preserve it from mould, 603
To know when Bread is sufficiently baked, 604
On the proper fermentation of Dough, 604
CHAPTER XXXII.

FOREIGN AND JEWISH COOKERY.

Foreign and Jewish Cookery, 605
Remarks on Jewish Cookery, 606
Jewish Smoked Beef, 606
Chorissa (or Jewish Sausage) with Rice, 607
To fry Salmon and other Fish in Oil (to serve cold), 607
Jewish Almond Pudding, 608
The Lady’s or Invalid’s new Baked Apple Pudding (Author’s original receipt. Appropriate to the Jewish table), 608
A few general directions for the Jewish table, 609
Tomata and other Chatnies (Mauritian receipt), 609
Indian Lobster Cutlets, 610
An Indian Burdwan (Entrée), 611
The King of Oude’s Omlet, 611
Kedgeree or Kidgeree, an Indian breakfast-dish, 612
A simple Syrian Pilaw, 612
Simple Turkish or Arabian Pilaw (From Mr. Lane, the, 613
