FUZZ-IEEE 2019
June 23-26, 2019
The 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2019), the world-leading event focusing on the theory and application of fuzzy logic, will be held in New Orleans, Louisiana, USA. Nicknamed the "Big Easy," New Orleans is known for its round-the-clock nightlife, vibrant live-music scene, and spicy Cajun cuisine. It is located on the Mississippi River, near the Gulf of Mexico, and is a popular tourist destination for all ages.

FUZZ-IEEE 2019 will be hosted at the JW Marriott, a premier conference venue nestled in the heart of the world-famous French Quarter. You will be steps away from some of New Orleans's most iconic nightlife and restaurants. Take a walk outside and visit Jackson Square, shop at the lively French Market, or dance your way through Bourbon Street.

FUZZ-IEEE 2019 will represent a unique meeting point for scientists and engineers, from academia and industry, to interact and discuss the latest enhancements and innovations in the field. The topics of the conference will cover all aspects of the theory and applications of fuzzy logic and its hybridisations with other artificial and computational intelligence methods. In particular, FUZZ-IEEE 2019 topics include, but are not limited to:

Mathematical and theoretical foundations of fuzzy sets, fuzzy measures, and fuzzy integrals
Fuzzy control, robotics, sensors, fuzzy hardware and architectures
Fuzzy data analysis, fuzzy clustering, classification, and pattern recognition
Type-2 fuzzy sets, computing with words, and granular computing
Fuzzy systems with big data and cloud computing, fuzzy analytics, and visualization
Fuzzy systems design and optimization
Fuzzy decision analysis, multi-criteria decision making, and decision support
Fuzzy logic and its applications to industrial engineering
Fuzzy modeling, identification, and fault detection
Fuzzy information processing, information extraction, and fusion
Fuzzy web engineering, information retrieval, text mining, and social network analysis
Fuzzy image, speech and signal processing, vision, and multimedia data analysis
Fuzzy databases and information retrieval
Rough sets, imprecise probabilities, possibilistic approaches
Industrial, financial, and medical applications
Fuzzy logic applications in civil engineering and GIS
Fuzzy sets and soft computing in social sciences
Linguistic summarization, natural language processing
Computational intelligence in security
Hardware and software for fuzzy systems and logic
Fuzzy Markup Language and standard technologies for fuzzy systems
Adaptive, hierarchical, and hybrid neuro- and evolutionary-fuzzy systems

The conference will include regular oral and poster presentations, an elevator pitch competition, tutorials, panels, special sessions, and keynote presentations. Full details of the submission process for papers, tutorials, and panels will be made available on the conference website: http://www.fuzzieee.org

Organizing Committee
General Chairs: Tim Havens, USA; Jim Keller, USA
Program Chairs: Alina Zare, USA; Derek Anderson, USA
Special Sessions Chairs: Christian Wagner, UK; Thomas Runkler, Germany
Tutorials Chairs: Qiang Shen, UK; Jesus Chamorro, Spain
Keynotes Chair: Robert John, UK
Posters Chair: Marek Reformat, Canada
Finance Chair: Mihail Popescu, USA
Conflict-of-Interest Chairs: Sansanee Auephanwiriyakul, Thailand; CT Lin, Australia
Competitions Chairs: Christophe Marsala, France; Mika Sato-Ilic, Japan
Panel Sessions Chair: Humberto Bustince, Spain
Publications Chairs: Anna Wilbik, Netherlands; Tim Wilkin, Australia
Registrations Chair: Marie-Jeanne Lesot, France
Local Arrangement Chairs: Fred Petry, USA; Paul Elmore, USA
Publicity Chair: Daniel Sanchez, Spain
Web Chair: Tony Pinar, USA
Important dates
Deadline for special session, tutorial, competition, and panel session proposals: October 8, 2018
Notification of acceptance for tutorials, special sessions, and panels: November 2, 2018
Deadline for full paper submission: January 11, 2019
Notification of paper acceptance: March 4, 2019

http://www.fuzzieee.org
Features
12  Identifying DNA Methylation Modules Associated with a Cancer by Probabilistic Evolutionary Learning
    by Je-Keun Rhee, Soo-Jin Kim, and Byoung-Tak Zhang
20  Augmentation of Physician Assessments with Multi-Omics Enhances Predictability of Drug Response: A Case Study of Major Depressive Disorder
    by Arjun Athreya, Ravishankar Iyer, Drew Neavin, Liewei Wang, Richard Weinshilboum, Rima Kaddurah-Daouk, John Rush, Mark Frye, and William Bobo

Columns
32  Research Frontier
    Optimal Weighted Extreme Learning Machine for Imbalanced Learning with Differential Evolution
    by JongHyok Ri, Liang Liu, Yong Liu, Huifeng Wu, Wenliang Huang, and Hun Kim
    Learning Without External Reward
    by Haibo He and Xiangnan Zhong
55  Review Article
    Recent Trends in Deep Learning Based Natural Language Processing
    by Tom Young, Devamanyu Hazarika, Soujanya Poria, and Erik Cambria

Departments
2   Editor's Remarks
3   President's Message
    by Nikhil R. Pal
5   Conference Reports
    Conference Report on IEEE Computational Intelligence Society ExCom Workshop 2018
    by Jie Lu
8   Publication Spotlight
    by Haibo He, Jon Garibaldi, Kay Chen Tan, Julian Togelius, Yaochu Jin, and Yew Soon Ong
10  Guest Editorial
    Computational Intelligence Techniques in Bioinformatics and Bioengineering
    by Richard Allmendinger, Daniel Ashlock, and Sansanee Auephanwiriyakul
76  Conference Calendar
    by Bernadette Bouchon-Meunier

On the cover: ©istockphoto.com/kirstypargeter
IEEE Computational Intelligence Magazine (ISSN 1556-603X) is published quarterly by The Institute of Electrical and Electronics Engineers, Inc. Headquarters: 3 Park Avenue, 17th Floor, New York, NY 10016-5997, U.S.A. +1 212 419 7900. Responsibility for the contents rests upon the authors and not upon the IEEE, the Society, or its members. The magazine is a membership benefit of the IEEE Computational Intelligence Society, and subscriptions are included in the Society fee. Replacement copies for members are available for US$20 (one copy only). Nonmembers can purchase individual copies for US$201.00. Nonmember subscription prices are available on request. Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits of U.S. Copyright law for the private use of patrons: 1) those post-1977 articles that carry a code at the bottom of the first page, provided the per-copy fee is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01970, U.S.A.; and 2) pre-1978 articles without fee. For other copying, reprint, or republication permission, write to: Copyrights and Permissions Department, IEEE Service Center, 445 Hoes Lane, Piscataway, NJ 08854, U.S.A. Copyright © 2018 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Periodicals postage paid at New York, NY, and at additional mailing offices. Postmaster: Send address changes to IEEE Computational Intelligence Magazine, IEEE, 445 Hoes Lane, Piscataway, NJ 08854-1331, U.S.A. Printed in U.S.A. Canadian GST #125634188.
Digital Object Identifier 10.1109/MCI.2017.2770275
August 2018 | IEEE Computational Intelligence Magazine
Editor's Remarks
Hisao Ishibuchi
Southern University of Science and Technology, CHINA
Osaka Prefecture University, JAPAN

One Year in China

Since April 2017, I have been working in China. I live in an apartment on the university campus. When I moved to China last year, I knew only two Chinese words, "ni hao (hello)" and "xie xie (thank you)". Surprisingly, I have yet to learn any other Chinese words. I rely on my secretary's help, for which I am very thankful. My biggest achievement in China has been my weight loss: I have lost more than 10 kg! I eat three meals a day at the university's canteen. It is open seven days a week, but only for five hours a day: two hours for breakfast and one and a half hours each for lunch and dinner. Because of this, my daily schedule runs like clockwork around the canteen's opening hours. The large, hilly campus also lets me exercise regularly, which helps with my weight control.

Recently I visited Xiamen for a conference. Three undergraduate students came to see me at the high-speed train station with a large name board (see the photo below). I enjoyed dinner and a one-hour metro ride to the conference hotel. Thank you to the student who brought the name board back to the university. On the last day in Xiamen, they took me to the train station and gave me some local cakes and nuts. One of the students spoke excellent Japanese; her Japanese was much better than my English. When I received a phone call from her before meeting them, I thought that the call came from Japan. She had spent one year in Japan as a high-school student. I was impressed by her language-learning ability, especially when comparing it to my progress with Chinese after one year. Another student showed me amazing photos of her hometown, about 600 km north of Xi'an. She told me that she often enjoyed camel rides when she was a small girl. Next year, we will host the IEEE SSCI 2019 conference in Xiamen. I am looking forward to seeing all of you, and those students, again.

[Photo: With three first-year undergraduate students in Xiamen]

The feature topic of the current CIM issue is "Bioinformatics and Bioengineering." This has been an interesting and challenging application field for CI techniques. I hope you will enjoy all of the articles in this issue.

CIM Editorial Board
Editor-in-Chief: Hisao Ishibuchi, Southern University of Science and Technology, Department of Computer Science and Engineering, Shenzhen, Guangdong, CHINA; Osaka Prefecture University, Department of Computer Science, Sakai, Osaka 599-8531, JAPAN. Email: hisaoi@cs.osakafu-u.ac.jp
Founding Editor-in-Chief: Gary G. Yen, Oklahoma State University, USA
Past Editor-in-Chief: Kay Chen Tan, City University of Hong Kong, HONG KONG
Editors-At-Large: Piero P. Bonissone, Piero P. Bonissone Analytics LLC, USA; David B. Fogel, Natural Selection, Inc., USA; Vincenzo Piuri, University of Milan, ITALY; Marios M. Polycarpou, University of Cyprus, CYPRUS; Jacek M. Zurada, University of Louisville, USA
Associate Editors: José M. Alonso, University of Santiago de Compostela, SPAIN; Erik Cambria, NTU, SINGAPORE; Raymond Chiong, The University of Newcastle, AUSTRALIA; Yun Fu, Northeastern University, USA; Robert Golan, DBmind Technologies Inc., USA; Roderich Gross, The University of Sheffield, UK; Amir Hussain, University of Stirling, UK; John McCall, Robert Gordon University, UK; Zhen Ni, South Dakota State University, USA; Yusuke Nojima, OPU, JAPAN; Nelishia Pillay, University of Pretoria, SOUTH AFRICA; Rong Qu, University of Nottingham, UK; Dipti Srinivasan, NUS, SINGAPORE; Ricardo H. C. Takahashi, UFMG, BRAZIL; Kyriakos G. Vamvoudakis, Virginia Tech, USA; Nishchal K. Verma, IIT Kanpur, INDIA; Dongrui Wu, DataNova, USA; Bing Xue, Victoria University of Wellington, NEW ZEALAND; Dongbin Zhao, Chinese Academy of Sciences, CHINA
IEEE Periodicals/Magazines Department: Editorial/Production Associate, Heather Hilton; Senior Art Director, Janet Dudar; Associate Art Director, Gail A. Schnitzer; Production Coordinator, Theresa L. Smith; Director, Business Development, Media & Advertising, Mark David; Advertising Production Manager, Felicia Spagnoli; Production Director, Peter M. Tuohy; Editorial Services Director, Kevin Lisankie; Staff Director, Publishing Operations, Dawn M. Melley
President's Message
by Nikhil R. Pal

We held our society's first Executive Committee (ExCom) meeting of this year on April 8, 2018, in Sydney, Australia. Although I had attended several ExCom meetings as the Vice-President for Publications and as the President-Elect, this was my first meeting as the President and hence of special importance to me. We had a long but very successful meeting. I take this opportunity to sincerely thank all members of the ExCom for their very active participation and useful suggestions, which will certainly help take the society ahead.

In our society we make use of every opportunity to strengthen our interaction with people, and it is not restricted to members of the Computational Intelligence Society (CIS). Keeping this in mind, we carefully choose the locations of our ExCom meetings and usually organize one (and sometimes even two) one-day workshops on computational intelligence and its applications in the region where the meeting is held. To the extent I know, this initiative was started in 2011, and so far it has been very successful in achieving its intended goal of allowing us to reach out to students, researchers, and practitioners. Generally we find a "champion" from a local university/institute or from the local CIS chapter, if any, to organize the event. Every ExCom member who attends the ExCom meeting in person serves as a speaker at the workshop. In addition, to strengthen our bonding with the local researchers, as well as to get a better idea of some of the research activities that they are engaged in, we often invite a few speakers from the neighboring institutes/universities. In past years we organized very successful events in different countries, including Brazil, Cyprus, Peru, Spain, Ecuador, Chile, and Argentina. This year, after the ExCom meeting, we organized two such workshops, one in Sydney (on April 9) and the other in Canberra (on April 10). In Sydney, the workshop was organized by the University of Technology Sydney (UTS) under the leadership of Prof. Jie Lu. It was a very well-attended event: more than 100 researchers/students participated in this workshop. The second workshop was organized by the University of New South Wales (UNSW) Canberra, and Prof. Hussein Abbass took the lead in organizing this event. This was also a well-attended one, with more than 80 participants. For this event, in addition to the ExCom members, Prof. Jason Scholz, Adjunct Professor, University of New South Wales in Canberra and the Chief Scientist and Engineer, Defence Cooperative Research Centre (DCRC), was invited to give a talk. He spoke on "Trusted Autonomous Systems Defence CRC". These workshops covered a wide spectrum of related topics of contemporary interest, including deep learning, recognition technology, machine learning, big-data challenges in astrophysics, fuzzy information processing, and human-machine systems. In both workshops the participants were very enthusiastic and interactive. An important aspect of the Canberra program was that some young high school students were invited to attend the talk by Prof.

[Photo: Members of the IEEE Computational Intelligence Society with UTS hosts, IEEE Student Board volunteers, staff, and students.]

Vice President, Technical Activities: Hussein Abbass, University of New South Wales, AUSTRALIA

Publication Editors
IEEE Transactions on Neural Networks and Learning Systems: Haibo He, University of Rhode Island, USA
IEEE Transactions on Fuzzy Systems: Jon Garibaldi, University of Nottingham, UK
IEEE Transactions on Evolutionary Computation: Kay Chen Tan, City University of Hong Kong, HONG KONG
IEEE Transactions on Games: Julian Togelius, New York University, USA
IEEE Transactions on Cognitive and Developmental Systems: Yaochu Jin, University of Surrey, UK
IEEE Transactions on Emerging Topics in Computational Intelligence: Yew Soon Ong, Nanyang Technological University, SINGAPORE

Administrative Committee
Term ending in 2018: Piero P. Bonissone, Piero P. Bonissone Analytics LLC, USA; Carlos A. Coello Coello, CINVESTAV-IPN, MEXICO; Barbara Hammer, Bielefeld University, GERMANY; Alice E. Smith, Auburn University, USA; Jacek M. Zurada, University of Louisville, USA
Term ending in 2019: Cesare Alippi, Politecnico di Milano, ITALY; James C. Bezdek, USA; Gary B. Fogel, Natural Selection, Inc., USA; Hisao Ishibuchi, Southern University of Science and Technology, CHINA, and Osaka Prefecture University, JAPAN; Kay Chen Tan, City University of Hong Kong, HONG KONG
Term ending in 2020: Janusz Kacprzyk, Polish Academy of Sciences, POLAND; Sanaz Mostaghim, Otto von Guericke University of Magdeburg, GERMANY; Christian Wagner, University of Nottingham, UK; Ronald R. Yager, Iona College, USA; Gary G. Yen, Oklahoma State University, USA
IEEE Transactions on Neural Networks and Learning Systems

Robust C-Loss Kernel Classifiers, by G. Xu, B. Hu, and J. C. Principe, IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, No. 3, March 2018, pp. 510–522.
Digital Object Identifier: 10.1109/TNNLS.2016.2637351
"The correntropy-induced loss (C-loss) function has the nice property of being robust to outliers. In this paper, we study the C-loss kernel classifier with the Tikhonov regularization term, which is used to avoid overfitting. After using the half-quadratic optimization algorithm, which converges much faster than the gradient optimization algorithm, we find out that the resulting C-loss kernel classifier is equivalent to an iterative weighted least square support vector machine (LS-SVM). This relationship helps explain the robustness of iterative weighted LS-SVM from the correntropy and density estimation perspectives. On the large-scale data sets which have low-rank Gram matrices, we suggest to use incomplete Cholesky decomposition to speed up the training process. Moreover, we use the representer theorem to improve the sparseness of the resulting C-loss kernel classifier. Experimental results confirm that our methods are more robust to outliers than the existing common classifiers."

Multicolumn RBF Network, by A. O. Hoori and Y. Motai, IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, No. 4, April 2018, pp. 766–778.
Digital Object Identifier: 10.1109/TNNLS.2017.2650865
"This paper proposes the multicolumn RBF network (MCRN) as a method to improve the accuracy and speed of a traditional radial basis function network (RBFN). The RBFN, as a fully connected artificial neural network (ANN), suffers from costly kernel inner-product calculations due to the use of many instances as the centers of hidden units. This issue is not critical for small datasets, as adding more hidden units will not burden the computation time. However, for larger datasets, the RBFN requires many hidden units with several kernel computations to generalize the problem. The MCRN mechanism is constructed based on dividing a dataset into smaller subsets using the k-d tree algorithm. N resultant subsets are considered as separate training datasets to train N individual RBFNs. Those small RBFNs are stacked in parallel and bulged into the MCRN structure during testing. The MCRN is considered as a well-developed and easy-to-use parallel structure, because each individual ANN has been trained on its own subsets and is completely separate from the other ANNs. This parallelized structure reduces the testing time compared with that of a single but larger RBFN, which cannot be easily parallelized due to its fully connected structure. Small informative subsets provide the MCRN with a regional experience to specify the problem instead of generalizing it. The MCRN has been tested on many benchmark datasets and has shown better accuracy and great improvements in training and testing times compared with a single RBFN. The MCRN also shows good results compared with those of some machine learning techniques, such as the support vector machine and k-nearest neighbors."

IEEE Transactions on Fuzzy Systems

Analysis and Design of Functionally Weighted Single-Input-Rule-Modules Connected Fuzzy Inference Systems, by C. Li, J. Gao, J. Yi, and G. Zhang, IEEE Transactions on Fuzzy Systems, Vol. 26, No. 1, February 2018, pp. 56–71.
Digital Object Identifier: 10.1109/TFUZZ.2016.2637369
"Single-input-rule-modules (SIRMs) can efficiently solve the fuzzy rule explosion phenomenon, which usually occurs in the multivariable modeling and/or control applications. However, the performance of SIRM connected fuzzy inference systems (SIRM-FIS) is limited due to its simple input-output mapping. In this paper, to further enhance the performance of SIRM-FIS, a functionally weighted SIRM-FIS (FWSIRM-FIS), which adopts multi-variable functional weights to measure the important degrees of the SIRMs, is presented. Then, in order to show the fundamental differences of

Digital Object Identifier 10.1109/MCI.2018.2840645
Date of publication: 18 July 2018
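The MCRN mechanism summarized in the spotlight above (partition the data with a k-d tree, train one small RBF network per subset, and combine the columns at test time) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the partition is a simple median split on the widest dimension, a query is routed to the column with the nearest training centroid, and the RBF weights are fit by regularized kernel interpolation. All names (`rbf_features`, `kdtree_split`, `MultiColumnRBF`) are hypothetical.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    # Gaussian kernel activations between the rows of X and the centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def kdtree_split(X, y, max_leaf):
    # Recursively split on the median of the highest-variance dimension,
    # a simple k-d-tree-style partition of the training data.
    if len(X) <= max_leaf:
        return [(X, y)]
    dim = int(np.argmax(X.var(axis=0)))
    order = np.argsort(X[:, dim])
    mid = len(X) // 2
    left, right = order[:mid], order[mid:]
    return (kdtree_split(X[left], y[left], max_leaf)
            + kdtree_split(X[right], y[right], max_leaf))

class MultiColumnRBF:
    """One small RBF network per partition ('column'); a query is routed
    to the column whose training centroid is nearest (a simplification
    of the MCRN recombination step)."""
    def __init__(self, gamma=1.0, max_leaf=50, reg=1e-6):
        self.gamma, self.max_leaf, self.reg = gamma, max_leaf, reg

    def fit(self, X, y):
        self.columns = []
        for Xs, ys in kdtree_split(X, y, self.max_leaf):
            # Centers are the subset's own points; weights are found by a
            # regularized (ridge-style) kernel interpolation solve.
            Phi = rbf_features(Xs, Xs, self.gamma)
            w = np.linalg.solve(Phi + self.reg * np.eye(len(Xs)), ys)
            self.columns.append((Xs.mean(axis=0), Xs, w))
        return self

    def predict(self, X):
        centroids = np.stack([c for c, _, _ in self.columns])
        out = np.empty(len(X))
        for i, x in enumerate(X):
            j = int(np.argmin(((centroids - x) ** 2).sum(axis=-1)))
            _, centers, w = self.columns[j]
            out[i] = rbf_features(x[None, :], centers, self.gamma)[0] @ w
        return out
```

Routing each query to a single column keeps the number of kernel evaluations at test time proportional to the leaf size rather than to the full training set, which is the speed advantage the abstract describes.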
"Reinforcement learning (RL) problems are hard to solve in a robotics context as classical algorithms rely on discrete representations of actions and states, but in robotics both are continuous. A discrete set of actions and states can be defined, but it requires an expertise that may not be available, in particular in open environments. It is proposed to define a process to make a robot build its own representation for an RL algorithm. The principle is to first use a direct policy search in the sensori-motor space, i.e., with no predefined discrete sets of states nor actions, and then extract from the corresponding learning traces discrete actions and identify the relevant dimensions of the state to estimate the value function. Once this is done, the robot can apply RL: 1) to be more robust to new domains and, if required, 2) to learn faster than a direct policy search. This approach allows to take the best of both worlds: first learning in a continuous space to avoid the need of a specific representation, but at a price of a long learning process and a poor generalization, and then learning with an adapted representation to be faster and more robust."

IEEE Transactions on Emerging Topics in Computational Intelligence

Insights on Transfer Optimization: Because Experience is the Best Teacher, by A. Gupta, Y. S. Ong, and L. Feng, IEEE Transactions on Emerging Topics in Computational Intelligence, Vol. 2, No. 1, February 2018, pp. 51–64.
Digital Object Identifier: 10.1109/TETCI.2017.2769104
"Traditional optimization solvers tend to start the search from scratch by assuming zero prior knowledge about the task at hand. Generally speaking, the capabilities of solvers do not automatically grow with experience. In contrast, however, humans routinely make use of a pool of knowledge drawn from past experiences whenever faced with a new task. This is often an effective approach in practice as real-world problems seldom exist in isolation. Similarly, practically useful artificial systems are expected to face a large number of problems in their lifetime, many of which will either be repetitive or share domain-specific similarities. This view naturally motivates advanced optimizers that mimic human cognitive capabilities; leveraging on what has been seen before to accelerate the search toward optimal solutions of never before seen tasks. With this in mind, this paper sheds light on recent research advances in the field of global black-box optimization that champion the theme of automatic knowledge transfer across problems. We introduce a general formalization of transfer optimization, based on which the conceptual realizations of the paradigm are classified into three distinct categories, namely sequential transfer, multitasking, and multiform optimization. In addition, we carry out a survey of different methodological perspectives spanning Bayesian optimization and nature-inspired computational intelligence procedures for efficient encoding and transfer of knowledge building blocks. Finally, real-world applications of the techniques are identified, demonstrating the future impact of optimization engines that evolve as better problem-solvers over time by learning from the past and from one another."
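The sequential-transfer category named in the abstract above (reusing knowledge from a previously solved task to warm-start a related one) can be illustrated with a toy sketch. This is an assumed illustration, not the paper's formalization: both sphere functions and the simple (1+1) hill climber are hypothetical stand-ins for real tasks and solvers.

```python
import random

def hill_climb(f, x0, iters, step=0.1, seed=0):
    # Minimal (1+1) stochastic hill climber: keep a perturbed candidate
    # only if it improves the objective.
    rng = random.Random(seed)
    best, fbest = list(x0), f(x0)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, step) for xi in best]
        fc = f(cand)
        if fc < fbest:
            best, fbest = cand, fc
    return best, fbest

# A source task and a closely related target task (toy sphere functions
# whose optima lie near each other).
source = lambda x: sum((xi - 2.0) ** 2 for xi in x)
target = lambda x: sum((xi - 2.1) ** 2 for xi in x)

# Solve the source task thoroughly, then reuse its solution as the
# starting point for the target task: sequential transfer.
x_source, _ = hill_climb(source, [0.0] * 5, iters=2000)
_, f_transfer = hill_climb(target, x_source, iters=50)
_, f_scratch = hill_climb(target, [0.0] * 5, iters=50)
# With the same small budget, the warm-started search ends far closer
# to the target optimum than the cold start.
```

The design point is the one the abstract makes: when tasks share structure, the search need not restart from zero prior knowledge, and even this naive reuse of a previous optimum beats an identical budget spent from scratch.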
[Figure 1: Schematic overview for probabilistic evolutionary learning to identify DNA methylation modules. Panels show dependency graphs over methylation sites X1–X10, together with selection, evaluation, and training-dataset stages.]
$\arg\max_{r,\,pa} \prod_i I(X_i;\, X_{pa(i)})$  (3)

where c is the count of a variable X with a specific value and N is the total number of individuals.

...methylation sites to find an individual with a shorter length;

$f(D_j) = \begin{cases} 1, & \text{if } \sum_{i=0} w_i \, v_{ji} > 0, \\ -1, & \text{otherwise.} \end{cases}$  (8)
The difference between the predictions and the target values specified in the training sequence is used to represent the error of the current weight vector. The target function is optimized to minimize the classification error. The weight values are evaluated against a sequence of training samples and are updated to

D. Dataset

[Plot residue: fitness (0.6–1.0) against order (up to 600).]

Table 3: Classification performance using the selected sites and randomly selected sites.
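The thresholded classifier of Eq. (8), combined with the error-driven weight update described above, can be sketched as follows. This is a minimal sketch under stated assumptions: the perceptron-style update rule is assumed for illustration and is not necessarily the authors' exact procedure, and `classify`/`train` are hypothetical names.

```python
def classify(w, v):
    # Eq. (8): f(D_j) = 1 if sum_i w_i * v_ji > 0, else -1.
    return 1 if sum(wi * vi for wi, vi in zip(w, v)) > 0 else -1

def train(samples, labels, epochs=20, lr=0.1):
    # Error-driven update over a sequence of training samples:
    # the difference between target and prediction drives the change
    # to the weight vector (standard perceptron step, assumed).
    w = [0.0] * len(samples[0])
    for _ in range(epochs):
        for v, target in zip(samples, labels):
            error = target - classify(w, v)
            if error:
                w = [wi + lr * error * vi for wi, vi in zip(w, v)]
    return w
```

On a linearly separable toy set (with the first component acting as a bias feature) the weights converge so that every sample is classified correctly, which is the sense in which the classification error is minimized.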
...produced better results than the others, regardless of the number of the randomly selected sites. In particular, it was noted that the specificity using the selected sites by our method was much better than the others, even though the original data was highly imbalanced.

B. Modules Associated with Colorectal Cancer Using High-Throughput Sequencing Data

Recently, high-throughput sequencing technologies have been used to determine DNA methylation profiles. We applied our method to the sequencing-based methylation profile datasets produced by Simmer et al. [36].

The experiments were carried out using 25 cancer and 25 normal samples with 10,393 genomic regions on chromosome 17. Figure 3 depicts the improvement of the fitness in iterative learning procedures using these datasets, and finally 348 regions were selected to discriminate colorectal cancer and normal samples after convergence. Table 4 shows the average classification performance by 10-fold cross-validation using the selected sites.

We annotated the 348 selected regions using GPAT [56] and investigated which genes were located close to the selected regions. We determined which genes were enriched within the KEGG pathway using the genes whose transcription start sites are located within 5,000 bp from the selected genomic regions [57], [58]. Table 5 summarizes the significantly enriched pathways with low p-values and shows that most of these are closely associated with cancer-related networks. Note that the enriched signaling pathways were related to colorectal cancer. In colorectal cancer, the roles of the Wnt signaling pathway and MAPK signaling pathway have been studied intensively [59]–[62]. Genetic mutations affecting the pathway components and the alteration of their expression can enhance tumorigenicity in cancer cells. In addition, the neurotrophin signaling pathway could be related to growth of colorectal cancer cells [63] and the chemokine signaling pathway suppresses colorectal cancer metastasis [64], [65].

Table 5: Significantly enriched pathways.
Pathway | p-value | Corrected p-value
Endometrial cancer | 1.39e-03 | 2.35e-02
Basal cell carcinoma | 1.55e-03 | 2.41e-02
Colorectal cancer | 1.97e-03 | 2.73e-02
Pancreatic cancer | 2.50e-03 | 2.73e-02
Melanoma | 2.57e-03 | 2.73e-02
Chronic myeloid leukemia | 2.72e-03 | 2.73e-02
Cytokine-cytokine receptor interaction | 2.82e-03 | 2.73e-02
MAPK signaling pathway | 2.82e-03 | 2.73e-02
Phosphatidylinositol signaling system | 2.94e-03 | 2.73e-02
VEGF signaling pathway | 2.94e-03 | 2.73e-02
Fc epsilon RI signaling pathway | 3.17e-03 | 2.81e-02
Small cell lung cancer | 3.58e-03 | 2.98e-02
ErbB signaling pathway | 3.83e-03 | 2.98e-02
Apoptosis | 3.92e-03 | 2.98e-02
Prostate cancer | 4.01e-03 | 2.98e-02
Mark Frye
Department of Psychiatry and Psychology, Mayo Clinic, MN, USA
William Bobo
Department of Psychiatry and Psychology, Mayo Clinic, FL, USA
...measures with heterogeneous biological measures to accurately predict treatment outcomes using machine learning. Across many psychiatric illnesses, ranging from major depressive disorder to schizophrenia, symptom severity assessments are subjective and do not include biological measures, making predictability in eventual treatment outcomes a challenge. Using data from the Mayo Clinic PGRN-AMPS SSRI trial as a case study, this work demonstrates a significant improvement in the prediction accuracy for antidepressant treatment outcomes in patients with major depressive disorder from 35% to 80% individualized by patient, compared to using only a physician's assessment as the predictors. This improvement is achieved through an iterative overlay of biological measures, starting with metabolites (blood measures modulated by drug action) associated with symptom severity, and then adding in genes associated with metabolomic concentrations. Hence, therapeutic efficacy for a new patient can be assessed prior to treatment, using prediction models that take as inputs selected biological measures and physicians' assessments of depression severity. Of broader significance extending beyond psychiatry, the approach presented in this work can potentially be applied to predicting treatment outcomes for other medical conditions, such as migraine headaches or rheumatoid arthritis, for which patients

I. Introduction

In diseases that are characterized by the complex phenotypes (traits) listed in Table 1, such as psychiatric disorders, inflammatory diseases, and migraines, therapeutic/treatment decisions are primarily based on the subject-reported/physician-rated severity of symptoms (which are an example of complex phenotypes/traits) in conjunction with standard social/demographic factors. The ability of these measures to predict therapeutic success is slightly better than chance [1], [2], and is largely limited by the lack of biological measures that reflect the underlying molecular mechanisms of therapeutic agents (e.g., drugs) and could therefore potentially serve as stronger predictors of therapeutic outcomes. The key contribution of this work in addressing that limitation is a "learning-augmented clinical assessment" workflow to sequentially augment physicians' assessments of subject-specific ratings of symptoms with heterogeneous biological measures (such as metabolomics and genomics) to significantly enhance the predictability of drug treatment outcomes as shown in Fig. 1. As a case study, the workflow proposed in this work demonstrates improved predictability in antidepressant treatment outcomes of patients with major depressive disorder (MDD) by using biological measures (metabolomics and genomics)
Table 1: Efforts in integrating multiple measures to predict clinical outcomes in diseases characterized by complex phenotypes.

Disease | Study | Clinical measures (non-biomarkers) | Non-drug biomarkers | Drug-based biomarkers | Summary
MDD | Chekroud et al. [1], Iniesta et al. [2] | Yes | No | No | Ability to establish cross-trial prediction of clinical outcomes using only clinical measures
MDD | This work | Yes | Targeted metabolomics and genomics | Targeted metabolomics and genomics | Augmenting existing clinical measures with functionally validated biomarkers associated with disease pathophysiology or drug mechanisms improves predictability in treatment outcomes
MDD | Redlich et al. [3] | Yes | Magnetic resonance imaging (MRI) | No | Establishing the ability of imaging data to predict clinical outcomes
Schizophrenia | Koutsouleris et al. [4] | Yes | No | No | Ability to establish cross-trial prediction of clinical outcomes using only clinical measures
Bipolar disorder | Tighe et al. [5] | Yes | Functional MRI | No | Establishing the ability of imaging and transcriptomics data to predict clinical outcomes
Rheumatoid arthritis | Wijbrandts et al. [6] | Yes | No | Targeted transcriptomics | Ability to predict treatment outcomes using clinical measures and transcriptome variations associated with outcomes, but biomarkers need replication and functional validation
[Figure 1: The proposed analyses to establish predictability of clinical outcomes at eight weeks. Panels include genomics and metabolomics association results, plotted as −log10(P) by chromosome.]
derived from peripheral blood to augment the severity measures and other sociodemographic factors currently used in clinical practice as predictor variables. This improvement in the predictive accuracy of treatment outcomes motivates the development of antidepressant-specific prediction models, so that the choice of antidepressant can be based on the highest likelihood of remission of depressive symptoms. Choosing antidepressants that maximize therapeutic success marks a major shift from the current "try and wait" approach, which often requires multiple trials of antidepressants before patients achieve remission of their depressive symptoms.

To demonstrate the improved predictability in treatment outcomes, the workflow was developed using data from the clinical trial of the Mayo Clinic Pharmacogenomics Research Network Antidepressant Medical Pharmacogenomic Study (Mayo PGRN-AMPS) [7], the largest single-center selective serotonin reuptake inhibitor (SSRI) trial that has been conducted in the United States. A total of 603 patients completed the trial. They were administered citalopram/escitalopram (commonly prescribed SSRIs) for eight weeks, and psychiatric assessments of depression severity at baseline (pre-treatment), four weeks, and eight weeks were conducted by a clinician using the Quick Inventory of Depressive Symptomatology (QIDS-C). In this trial, biological measures for 290 of the 603 patients included genome-wide association study (GWAS) genotype data that, after imputation, included approximately 7 million
[...] four weeks, and eight weeks). Through augmentation of those biological measures with psychiatric assessments and sociodemographic factors as predictor variables, the prediction accuracy of antidepressant treatment outcomes in MDD patients improved from 35% to 80% relative to the use of clinical measures alone as the predictor variables.

The formalism for integrating multiple biological measures in this case study is as follows and is illustrated in Fig. 2. Just as tumor subtypes serve as a foundation for integrating biological measures in oncology, our formalism first established patient subtypes/stratification C by using mixture-model-based unsupervised learning techniques. In the first layer of overlaying the biological measures, a set of metabolites m ∈ M was identified based on significant associations of their concentrations with symptom severity in the previously inferred patient stratification. In the second layer of the overlay, in what is referred to as a metabolomics-informed genomics approach, we used GWAS to identify SNPs g ∈ G that are associated with concentrations of the metabolites comprising m. Through iterative overlaying of biological measures, starting with metabolites (blood measures reflecting drug action) associated with depressive severity and then adding the genes associated with metabolomic concentrations, the biological measures became more closely associated with the molecular mechanisms of antidepressant response. Finally, out of the more than 7 million possible predictor variables, the proposed approach identified about 65 predictor variables that comprised (1) SNPs (g) identified by the GWAS based on metabolomic concentrations, (2) metabolites (m) whose concentrations are significantly associated with depression severity in patient clusters, and (3) clinical measures (as shown in Table 2). Thus we made the size of the predictor data computationally tractable [...]

TABLE 2
Men: total 222; with omics 99. Women: total 381; with omics 191.
Social and demographic data (S), collected only at baseline:
- Age (in years)
- Body mass index (BMI, in kg/m2)
- Depression in {parents, siblings, children}
- Bipolar disorder in {parents, siblings, children}
- Alcohol abuse by {parents, siblings, children}
- Drug abuse by {parents, siblings, children}
- Seasonal pattern in symptom occurrence
- History of psychotherapy
Depressive severity assessment (C):
- Clinician-rated Quick Inventory of Depressive Symptomatology (QIDS-C) questionnaire (16 questions)
- QIDS-C total score
Biological data (B):
- M: 31 metabolites from the HPLC LCECA platform
- G: 7 million single nucleotide polymorphism genotypes

[Figure 2: formulation of multi-omic integration; patient stratification C = argmax_{k ∈ [1:K]} L_k(x), where L_k(x) = N(x; mu_k, sigma_k^2), followed by the overlay of metabolomic and genomic measures.]
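The two-layer overlay described above can be read as two successive statistical filters: first keep metabolites associated with severity at a Bonferroni-corrected threshold (0.05/|M|), then keep SNPs associated with those metabolites at a GWAS-style threshold (1E-06). The sketch below is illustrative only, on synthetic arrays, and is not the study's actual pipeline:

```python
import numpy as np
from scipy import stats

def select_metabolites(metab, severity, alpha=0.05):
    """Layer 1: keep metabolite columns whose concentrations correlate with
    symptom severity at the Bonferroni-corrected threshold alpha / |M|."""
    m_total = metab.shape[1]
    kept = []
    for j in range(m_total):
        r, p = stats.pearsonr(metab[:, j], severity)
        if p <= alpha / m_total:          # 0.05 / |M|
            kept.append(j)
    return kept

def select_snps(genotypes, metab_selected, threshold=1e-6):
    """Layer 2 (metabolomics-informed genomics): keep SNP columns whose
    genotype dosage associates with a selected metabolite at p <= 1E-06."""
    kept = set()
    for g in range(genotypes.shape[1]):
        for m in metab_selected.T:
            slope, icpt, r, p, se = stats.linregress(genotypes[:, g], m)
            if p <= threshold:
                kept.add(g)
                break
    return sorted(kept)
```

With real data, the severity vector would be the QIDS-C score within an inferred cluster rather than a raw column, but the filtering logic is the same.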
[Figure 3: probability density vs. baseline total QIDS-C score (10-25); (a) Estimating Distributions, (b) Clusters from Inferred GMM.]
FIGURE 3 Estimating parameters of a Gaussian mixture model for identifying clusters in the data. (a) The inference of mixtures comprising the distribution of symptom severity scores. (b) Distribution of symptom severity within the clusters, inferred using the sufficient statistics of the components inferred in (a).
increasing mean (mu_k) of the components were the outputs of the workflow [38]. Patients were assigned to the component C that maximized the likelihood L_k(x) given the component's sufficient statistics (gmmCluster), as illustrated in Fig. 3(b) and described as

  C = argmax_{k ∈ [1:K]} L_k(x), where L_k(x) = N(x; mu_k, sigma_k^2).   (1)

Results
At each time point t ∈ {b (baseline), f (four weeks), e (eight weeks)}, we found three clusters of men and women by using the proposed stratification process. Clusters at the baseline are C_b = {C_b^1, C_b^2, C_b^3}, at four weeks are C_f = {C_f^1, C_f^2, C_f^3}, and at eight weeks are C_e = {C_e^1, C_e^2, C_e^3}.

The clinical value of the clustering behavior is that C_e^1 in both men and women captures all patients who achieved remission at the end of eight weeks. Furthermore, C_e^2 in both men and women comprised patients who demonstrated response but did not achieve remission. Finally, patients in C_f^3 (both men and women) did not exhibit response or achieve remission. The same workflow demonstrated identical patient stratification in the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial [39], i.e., the Kolmogorov-Smirnov test for the symptom severity scores between clusters of similar average symptom severity had a p-value > 0.8 [19]. From the analytics perspective, the significance of the replication of patient stratification in two independent clinical trials is that the clustering behavior followed the existing definitions of clinical outcomes in psychiatry.

B. Omics Associations
Baseline concentrations of key metabolites m = {m ∈ M : p.value[m ~ x(C)] <= 0.05/|M|}, such as serotonin, kynurenine, tryptophan, tyrosine, and paraxanthine, were significantly correlated with depression severity in all three clusters at eight weeks. These correlations were biologically significant because they have been associated with MDD treatment and response, as these metabolites are related to the monoamine neurotransmitter pathways [40], [41]. Furthermore, SNPs {g ∈ G : p.value[GWAS(m)] <= 1E-06} in the TSPAN5 (rs10516436), AHR (rs17137566), ERICH3 (rs696692), and DEFB1 (rs5743467, rs2741130, rs2702877) genes have been associated, through use of GWAS, with concentrations of kynurenine and serotonin [42], [43]. The associations of these biological measures laid the foundation for assessing whether they could improve the predictability of clinical outcomes when combined with traditional clinical, social, and demographic variables.

V. Using Baseline Data to Predict Clinical Outcomes
A recent publication proposed a prediction model that uses elastic-net regularization for feature selection and a gradient boosting machine (GBM) for classification, but only using baseline social, demographic, and clinical data [S; C] from the STAR*D trial [1]. While their prediction accuracies were better than chance, the authors acknowledged the limitations of their work, which suggests that it might be worthwhile to study whether the addition of baseline biological measures together with the social, demographic, and clinical data would increase the predictability of the clinical outcomes. With access to metabolomics and genomics data in a smaller cohort of the Mayo PGRN-AMPS trial, we set out to answer the following questions (illustrated in Fig. 4):
1) Would augmenting social, demographic, and clinical data [S; C] with metabolomics data improve the prediction accuracies of treatment outcomes over using only social, demographic, and clinical data [S; C] as predictor variables?
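For already-estimated component parameters, the assignment rule of Eq. (1) reduces to a per-component Gaussian log-likelihood comparison. A minimal sketch (illustrative, not the authors' gmmCluster implementation; the means and variances are assumed to come from a previously fitted mixture model):

```python
import numpy as np

def gmm_cluster(x, means, variances):
    """Assign each severity score x to the component C that maximizes the
    Gaussian likelihood L_k(x) = N(x; mu_k, sigma_k^2), as in Eq. (1)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # log N(x; mu_k, sigma_k^2), dropping the constant term log(2*pi)/2
    log_lik = -0.5 * np.log(variances) - (x - means) ** 2 / (2 * variances)
    return np.argmax(log_lik, axis=1)
```

With equal variances this reduces to nearest-mean assignment; unequal variances shift the decision boundaries toward the tighter component.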
[Figure 4: clinical assessments (depression severity, social and demographic data, patient history) combined with metabolomics and genomics.]
FIGURE 4 The proposed analyses to establish improved predictability in antidepressant treatment outcomes by augmenting the clinicians' assessments with biological measures.
MEN

RESPONSE (first four model columns: clinical data only; last four: clinical and metabolomics data)
Model        SVM-RBF  SVM-linear  GLM    GBM    | SVM-RBF  SVM-linear  GLM    GBM
Accuracy     28.2     32          52     40     | 48       48          64     48
Sensitivity  0        16.67       16.67  33.33  | 33.33    33.33       50     33.33
Specificity  53.5     46.15       84.62  46.15  | 61.54    61.54       61.54  61.54
AUC          0.64     0.60        0.63   0.54   | 0.53     0.53        0.68   0.5
Top predictors: ATOCO, URIC, QIDS-1, KYN, 3OHKY

REMISSION
Model        SVM-RBF  SVM-linear  GLM    GBM    | SVM-RBF  SVM-linear  GLM    GBM
Accuracy     28       44          44     48     | 64       68          64     45.65
Sensitivity  38.46    38          53.85  46.15  | 76.52    76          76.92  65.22
Specificity  16.67    50          33.33  50     | 50       50          50     26.09
AUC          0.8      0.6         0.67   0.6    | 0.76     0.78        0.62   0.6
Top predictors: AMTRP, I3PA, drug dosage, GTOCO3, 5HT

WOMEN

RESPONSE
Model        SVM-RBF  SVM-linear  GLM    GBM    | SVM-RBF  SVM-linear  GLM    GBM
Accuracy     52.08    52.08       54.17  50     | 41.3     72.33       64.58  41.67
Sensitivity  18.18    18.18       27.27  18.18  | 34.78    18.18       36.36  0
Specificity  80.72    80.76       76.92  76.9   | 47.83    92.83       88.46  76
AUC          0.60     0.59        0.63   0.63   | 0.69     0.74        0.68   0.51
Top predictors: seasonal pattern, 5HT, MHPG, MET, QIDS-13

REMISSION
Model        SVM-RBF  SVM-linear  GLM    GBM    | SVM-RBF  SVM-linear  GLM    GBM
Accuracy     34.78    50          45.65  36.96  | 41.3     54.33       52.17  45.65
Sensitivity  26.09    65.22       56.52  47.83  | 34.78    56.52       76.92  65.22
Specificity  43.48    34.78       34.78  26.09  | 47.83    52.17       50     26.09
AUC          0.64     0.52        0.58   0.58   | 0.56     0.53        0.53   0.53
Top predictors: 5HT, HGA, 3OHKY, seasonal pattern, PARAXAN
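The accuracy, sensitivity, and specificity figures reported above follow the usual binary-classification definitions; for reference, a minimal computation (illustrative labels, not the study's data):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), and specificity
    (recall on negatives) from binary labels, as percentages' fractions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

Reporting sensitivity and specificity alongside accuracy matters here because remission and response are minority outcomes, so accuracy alone can look acceptable while missing most remitters.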
researchers and clinicians to collect more biological measures for psychiatric diseases other than major depressive disorder (e.g., bipolar disorder, schizophrenia, and various dementias) that would not only help subtype or stratify patients by their symptom severity profiles, but also combine biological characteristics that would enable treatment strategies closer to the kinds used in breast cancer therapeutics.

C. In Pharmacogenomics Research
Pharmacogenomics research focuses on understanding the interplay between drug effects and functions of the genome. In this context, we reflect on the improvements in breast cancer therapeutics, wherein treatment selection is based on molecular characteristics of the tumor. In diseases such as major depressive disorder, for which such biologically based subtyping is not possible, the approach described in this work for stratifying patients using symptomatic characteristics is of immense value to associated pharmacogenomics research. In particular, trials could be designed in which multi-omics (metabolomics, genomics, etc.) and other biological measures (neuroimaging, electrophysiology, etc.) could be collected that help to establish biological associations with patients stratified using the proposed approach. Then, longitudinal effects of the drug on these biomarkers can be used to study why patients either respond well to the intervention or do not. Furthermore, as already demonstrated in this work, associations of biological markers with inferred patient stratification can provide improved predictability for treatment outcomes.
Liang Liu
Institute of Cyber-Systems and Control, Zhejiang University, Zhejiang, China
Yong Liu
State Key Laboratory of Industrial Control Technology and Institute
of Cyber-Systems and Control, Zhejiang University, Zhejiang, China
Huifeng Wu
College of Computer Science and Technology, Zhejiang Dianzi University,
Zhejiang, China
Wenliang Huang
China Unicom Ltd. Beijing, China
Hun Kim
Department of Computer Science, Kim Il Sung University, DPR of Korea
Abstract
[...] high accuracy in representation learning [...]

[...] is based on the work regarding weighted regularized ELM presented by Toh [9] and Deng et al. [10], and the key essence of weighted ELM in Zong et al. [7] is to assign an extra weight to each sample to strengthen the impact of the minority class while weakening the relative impact of the majority class. Experimental results in their work showed superior performance of weighted ELM compared with the original ELM on various imbalanced datasets. However, the two weighting schemes used in their approach can obtain only empirical sub-optima, and global optima cannot be guaranteed. Subsequent work [8] introduced the boosting method to obtain better weighting schemes; however, how to set the optimal weighting scheme remains an open problem. In this paper, we present a formal [...] efficient for the practical imbalanced learning problem.

The remaining sections are organized as follows. Sections II and III introduce the theoretical background of ELM and related ELM methods on imbalanced learning. Section IV presents the proposed method. Section V reports the experimental results and performance analysis. Finally, the conclusions are summarized in Section VI.

II. Theoretical Background
This section introduces a brief theoretical background of ELM. ELM [1], [2] was originally proposed for single-hidden-layer feedforward neural networks (SLFNs) and then extended to "generalized" SLFNs where the hidden layer does not require tuning: the [...] neuron parameters are randomly assigned. [...] The regularized training problem can be written as

  Minimize:  (1/2) ||beta||^2 + (C/2) sum_{i=1}^{n} ||xi_i||^2
  Subject to:  h(x_i) beta = t_i^T - xi_i^T,  i = 1, ..., n,   (3)

where xi_i = [xi_{i,1}, ..., xi_{i,m}] is the training error vector of the m output nodes corresponding to training sample x_i, and C is the trade-off regularization parameter between the minimization of training errors and the maximization of the marginal distance. Based on the Karush-Kuhn-Tucker (KKT) theorem [16], we can solve the optimization problem of formula (3) and obtain the same solution as formula (2). Given a new sample x, the output function of ELM is obtained as follows:

  f(x) = h(x) H^T (I/C + H H^T)^{-1} T,  when n < l.
Figure 1 Classification results produced by ELM, WELM-W1, WELM-W2, and WELM with randomly selected weights on a randomly generated dataset with an imbalance ratio of 1:10. Blue circles and red circles represent the correctly and incorrectly classified instances in the majority class, respectively. Red crosses and blue crosses represent the correctly and incorrectly classified instances in the minority class, respectively.
This results in a boosting classifier with a smaller number of WELM classifiers and can save much computational time. Second, the distribution weights are updated separately for different classes to avoid destroying the distribution weights' asymmetry.

In this paper, we also address the problem of obtaining the best weight matrix for the ELM-based imbalanced learning problem, and we propose a DE-based WELM (DE-WELM) for imbalanced dataset learning.

DE was first presented by Storn and Price [12], and it has been recognized as a powerful method for solving optimization problems. It resembles the structure of evolutionary algorithms, but differs from their traditional versions in the generation of new candidate solutions and the use of a greedy selection scheme. As shown in previous research [25], DE is far more efficient and robust (with respect to reproducing the results in several runs) compared to other evolutionary computation algorithms such as particle swarm optimization (PSO) [26] and evolutionary programming (EP) [27]. In addition, it has few parameters to set, and the same settings can be adapted to many different problems. Thus, this method has been actively used in various fields [28]-[30]. Some works [31]-[33] have focused on updating ELM using DE. However, they only use DE to search the hidden neuron parameters instead of searching the weight matrix for imbalanced learning. As parameters such as NP, F, and CR in the DE algorithm are critical for its performance, there are also many improved DE algorithms [34]-[40], [41] which can adaptively control those parameters.

IV. Our Approach
Although the two weighting schemes in formula (8) can obtain superior results compared with the original ELM, they are only empirical schemes and cannot guarantee the optima, as the toy experiment demonstrated. In this section, we present the formal mathematical model to obtain the optimal weighting scheme, and we also introduce the DE method [12] to calculate the approximate optimal weight matrix used in WELM, as the exact calculation for the formal model is infeasible.

A. Mathematical Model for Optimal Weighted Scheme
The problem of calculating the optimal weight matrix in WELM can be formally defined as follows. Given training data X = {(x_i, t_i), t_i ∈ {1, 2, ..., m}, i = 1, 2, ..., n}, where there are m classes in X, the
Then the next candidate population W_{R+1} = {W_{i,R+1}, i = 1, 2, ..., NP} and its best candidate can be decided according to the following select operation:

  W_{i,R+1} = U_{i,R} if E_{U_{i,R}} <= E_{W_{i,R}}, otherwise W_{i,R};
  W_{best,R+1} = W_{best,R} if E_{W_{best,R}} < E_{U_{best,R}}, otherwise U_{best,R}.   (16)

We compare the E_{U_{i,R}} of each trial vector U_{i,R} with the E_{W_{i,R}} of its corresponding target vector in the current population. If the trial vector has an error value less than or equal to that of the corresponding target vector, the trial vector will replace [...]

Algorithm 1 DE-WELM.
Input: Training set X = {(x_i, t_i), t_i ∈ {1, 2, ..., m}, i = 1, 2, ..., n}, and R_max.
Output: Approximate optimal weight matrix W*.
1: Normalize all data values x_i of training set X to [-1, 1], and then select the validation set X_vd. Set R = 0.
2: Generate the initial population W_R = {W_{i,R}, i = 1, 2, ..., NP} by formula (11).
3: Compute the error E_{W_{i,R}} (i = 1, ..., NP) of the WELM classifier with weight matrix diag(W_{i,R}) on validation set X_vd, and the candidate optimal weight vector W_{best,R} by formula (12).
4: while R < R_max do
5:   Compute the trial vectors U_{i,R} (i = 1, ..., NP) from W_R by formulas (13) and (14).
6:   Compute the error E_{U_{i,R}} (i = 1, ..., NP) of the WELM classifier with weight matrix diag(U_{i,R}) on validation set X_vd by formula (15).
7:   Obtain the next-stage population W_{R+1} and the candidate optimal weight vector of the next stage, W_{best,R+1}, by formula (16).
8:   R = R + 1.
9: end while
10: W* = diag(W_{best,R_max}).
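One DE generation with the greedy selection of (16) can be sketched as follows. This is a generic rand/1/bin DE step, not the paper's code; the error function is a stand-in for the WELM validation error of (15), and the F and CR defaults are illustrative:

```python
import numpy as np

def de_step(pop, error, F=0.5, CR=0.9, rng=None):
    """One DE generation: rand/1 mutation, binomial crossover, and the
    greedy selection of Eq. (16) (a trial vector replaces its target
    when its error is less than or equal)."""
    rng = rng if rng is not None else np.random.default_rng()
    NP, D = pop.shape
    new_pop = pop.copy()
    for i in range(NP):
        # pick three distinct individuals other than the target i
        r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
        mutant = pop[r1] + F * (pop[r2] - pop[r3])
        cross = rng.random(D) < CR
        cross[rng.integers(D)] = True          # ensure at least one gene crosses
        trial = np.where(cross, mutant, pop[i])
        if error(trial) <= error(pop[i]):      # selection, Eq. (16)
            new_pop[i] = trial
    return new_pop
```

Because selection is greedy per individual, the best error in the population can never increase from one generation to the next, which is the property the convergence plots in Figs. 3 and 4 rely on.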
[Figure 2: example images for the ten CIFAR-10 classes (Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck). (a) MNIST, (b) CIFAR-10.]
Figure 2 Example digital images of the MNIST database and CIFAR-10 database, and number histograms of top training videos in YouTube-8M.
[Figure 3: training Gmean vs. iteration, one panel per dataset.]
Figure 3 Gmean values after every iteration of our DE-based method on six standard classification datasets.
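Gmean, the quantity tracked in Fig. 3, is the geometric mean of the per-class recalls; unlike plain accuracy, it cannot be inflated by predicting only the majority class. A minimal computation:

```python
import numpy as np

def gmean(y_true, y_pred):
    """Geometric mean of per-class recalls; a recall of zero on any class
    drives the score to zero, penalizing majority-only classifiers."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.prod(recalls) ** (1.0 / len(recalls)))
```

On a 1:10 imbalanced set, a classifier that always predicts the majority class scores about 0.91 accuracy but a Gmean of 0, which is why Gmean is the fitness tracked by the DE loop.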
[Figure 4: Gmean (0.85-0.95) vs. experiment number (1-10), two panels (a) and (b).]
Figure 4 Experimental results of the convergence on initialization of weight on two datasets (ECOLI1, GLASS2).
[Figure 5: six 10x10 confusion matrices over the CIFAR-10 classes (Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck), rows labeled by the true class.]
Figure 5 Confusion matrices obtained by applying six algorithms (CifarNet, ELM, WELM-W1, WELM-W2, BWELM, DE-WELM) on the CIFAR-10 dataset.
Xiangnan Zhong
Department of Electrical Engineering,
University of North Texas, Denton, TX, USA
Abstract
[...] that we learn by interacting with the [...] important role in the learning process. In [...]

[Figure 2: goal, critic, and action networks, with system state x(t), control action a(t), internal reinforcement signal s(t), and value V(t).]
Figure 2 Neural network training process. The goal network takes the system state x(t) and control action a(t) as its input, and outputs an estimated internal reinforcement signal s(t). The critic network similarly applies the neural network structure, but takes an additional input s(t) and outputs the value function V(t). Moreover, the action network uses a similar neural network structure with input x(t) and output a(t).
B. Critic Network
The value function V(t) in Eq. (2) is estimated by the critic network. To closely connect the critic network with [...]

Figure 3 Implementation architecture of the self-learning adaptive dynamic programming design: three neural networks are established as the action network, the critic network, and the goal network.
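The three-network wiring described in Fig. 2 can be sketched as a forward pass through three small MLPs. The layer sizes, tanh activations, and initialization below are assumptions for illustration, and all training updates are omitted:

```python
import numpy as np

def mlp(sizes, rng):
    """Build one small MLP as a list of (W, b) layers."""
    return [(rng.normal(scale=0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(net, x):
    """tanh hidden layers, linear output layer."""
    for W, b in net[:-1]:
        x = np.tanh(x @ W + b)
    W, b = net[-1]
    return x @ W + b

rng = np.random.default_rng(0)
n_x, n_a = 4, 1                                # state and action dimensions (assumed)
action_net = mlp([n_x, 16, n_a], rng)          # a(t) = action(x(t))
goal_net = mlp([n_x + n_a, 16, 1], rng)        # s(t) = goal(x(t), a(t))
critic_net = mlp([n_x + n_a + 1, 16, 1], rng)  # V(t) = critic(x(t), a(t), s(t))

x = rng.normal(size=n_x)                       # system state x(t)
a = forward(action_net, x)                     # control action a(t)
s = forward(goal_net, np.concatenate([x, a]))  # internal reinforcement signal s(t)
V = forward(critic_net, np.concatenate([x, a, s]))  # value estimate V(t)
```

The point of the sketch is the data flow: the goal network's output s(t) is an extra input to the critic, which is what distinguishes this design from a standard two-network actor-critic.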
[Simulation results: x (m), θ1, θ2, θ3 (rad), and θ1, θ2, θ3 (rad/s) vs. time step (0-6,000), panels (c)-(f).]
Soujanya Poria
Temasek Laboratories, Nanyang Technological University,
Singapore
Erik Cambria
School of Computer Science and Engineering,
Nanyang Technological University, Singapore
Abstract
[...] processing, in which the analysis of a [...] increasingly focusing on the use of new deep [...]

[Figure residue: percentage axis (30-50%) with labels (-) Man and (+) Woman.]
tasks. We review major deep learning-related models and methods applied to natural language tasks such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and recursive neural networks. We also discuss memory-augmenting strategies, attention mechanisms, and how unsupervised models, reinforcement learning methods, and, recently, deep generative models have been employed for language-related tasks.

To the best of our knowledge, this work is the first of its type to comprehensively cover the most popular deep learning methods in NLP research today. The work by Goldberg [6] only presented the basic principles for applying neural networks to NLP in a tutorial manner. We believe this paper will give readers a more comprehensive idea of current practices in this domain. The structure of the paper is as follows: Section II introduces the concept of distributed representation, the basis of sophisticated deep learning models; next, Sections III, IV, and V discuss popular models such as convolutional, recurrent, and recursive neural networks, as well as their use in various NLP tasks; following, Section VI lists recent applications of reinforcement learning in NLP and new developments in unsupervised sentence representation learning; later, Section VII illustrates the recent trend of coupling deep learning models with memory modules; finally, Section VIII summarizes the performance of a series of deep learning methods on standard datasets for major NLP topics.

II. Distributed Representation
Statistical NLP has emerged as the primary option for modeling complex natural language tasks. However, in its beginnings, it often suffered from the notorious curse of dimensionality while learning joint probability functions of language models. This led to the motivation of learning distributed representations of words existing in low-dimensional space [7].

A. Word Embeddings
Distributional vectors or word embeddings (Fig. 2) essentially follow the distributional hypothesis, according to which words with similar meanings tend to occur in similar contexts. Thus, these vectors try to capture the characteristics of the neighbors of a word. The main advantage of distributional vectors is that they capture similarity between words. Measuring similarity between vectors is possible using measures such as cosine similarity. Word embeddings are often used as the first data processing layer in a deep learning model. Typically, word embeddings are pre-trained by optimizing an auxiliary objective in a large unlabeled corpus, such as predicting a word based on its context [3], [8], where the learned word vectors can capture general syntactic and semantic information. Thus, these embeddings have proven to be efficient in capturing context similarity and analogies and, due to their smaller dimensionality, are fast and efficient for computing core NLP tasks.

Over the years, the models that create such embeddings have been shallow neural networks, and there has been no need for deep networks to create good embeddings. However, deep learning-based NLP models invariably represent their words, phrases, and even sentences using these embeddings. This is in fact a major difference between traditional word-count-based models and deep learning-based models. Word embeddings have been responsible for state-of-the-art results in a wide range of NLP tasks [9]-[12]. For example, Glorot et al. [13] used embeddings along with stacked denoising autoencoders for domain adaptation in sentiment classification, and Hermann and Blunsom [14] presented combinatory categorial autoencoders to learn the compositionality of sentences. Their wide usage across the recent literature shows their effectiveness and importance in any deep learning model performing an NLP task.

Distributed representations (embeddings) are mainly learned through context. During the 1990s, several research developments [15] marked the foundations of research in distributional semantics. A more detailed summary of these early trends is provided in [16], [17]. Later developments were adaptations of these early works, which led to the creation of topic models like latent Dirichlet allocation [18] and language models [7]. These works laid out the foundations of representation learning.
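Similarity between embedding vectors is, as noted above, typically measured with cosine similarity; a minimal illustration with toy vectors (not real embeddings):

```python
import numpy as np

def cosine_similarity(u, v):
    """cos(u, v) = u.v / (|u| |v|); 1 means same direction, 0 orthogonal."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# toy 3-d "embeddings", purely illustrative
king, queen, apple = [0.9, 0.8, 0.1], [0.85, 0.82, 0.15], [0.1, 0.05, 0.9]
assert cosine_similarity(king, queen) > cosine_similarity(king, apple)
```

Cosine similarity is preferred over Euclidean distance for embeddings because it is invariant to vector magnitude, which often correlates with word frequency rather than meaning.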
i= 1
tens et al. [20], which stated that com- ding dimension is determined by the
T
positionality is seen only when certain accuracy of prediction. As the embedding where, u i = v .v c . wi
assumptions are held, e.g., the assump- dimension increases, the accuracy of pre- The parameters i = {Vw, Vc} are
tion that words need to be uniformly diction also increases until it converges at learned by defining the objective func-
distributed in the embedding space. some point, which is considered the optimal tion as the log-likelihood and finding its
Pennington et al. [21] is another embedding dimension as it is the shortest gradient as
famous word embedding method which without compromising accuracy.
is essentially a “count-based” model. Here, Let us consider a simplified version l (i) = / log ` P ` w jj (3)
the word co-occurrence count matrix is of the CBOW model where only one w ! Vocabulary c
= Vc ` 1 - P ` w jj .
preprocessed by normalizing the counts word is considered in the context. 2l (i)
(4)
and log-smoothing them. This matrix is This essentially replicates a bigram lan- 2Vw c
then factorized to get lower dimensional guage model.
representations which is done by minimiz- The CBOW model is a simple fully In the general CBOW model, all the
ing a “reconstruction loss”. connected neural network with one one-hot vectors of context words are
Below, we provide a brief description hidden layer. The input layer, which taken as input simultaneously, i.e,
of the word2vec method proposed by takes the one-hot vector of context
h = W T (x 1 + x 2 + g + x c). (5)
Mikolov et al. [3]. word has V neurons while the hidden
layer has N neurons. The output layer is One limitation of individual word
B. Word2vec softmax of all words in the vocabulary. embeddings is their inability to represent
Word embeddings were revolutionized The layers are connected by weight phrases, where the combination of two
by Mikolov et al. [3], [8] who proposed matrix W ! R V # N and Wl ! R H #V , or more words (e.g., idioms like “hot
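The simplified one-context-word CBOW model and the log-likelihood gradient step of Eqs. (3)–(4) can be sketched in plain NumPy as below; the vocabulary size, embedding dimension, learning rate, and training pair are illustrative choices, not values from the article.

```python
# Minimal sketch of the one-context-word CBOW model described above.
import numpy as np

rng = np.random.default_rng(0)
V, N = 10, 4                                 # vocabulary size, embedding dim

W_in = rng.normal(scale=0.1, size=(V, N))    # input weights (context vectors V_c)
W_out = rng.normal(scale=0.1, size=(V, N))   # output weights (word vectors V_w)

def forward(context_id):
    """Hidden state and softmax P(w | c) for a one-hot context word."""
    h = W_in[context_id]                     # selecting a row == W^T x (Eq. 5, C=1)
    u = W_out @ h                            # scores u_i = v_{w_i} . v_c
    p = np.exp(u - u.max())                  # numerically stable softmax
    return h, p / p.sum()

def sgd_step(context_id, target_id, lr=0.5):
    """One ascent step on the log-likelihood, i.e. the sign pattern of Eq. (4)."""
    global W_in, W_out
    h, p = forward(context_id)
    err = p.copy()
    err[target_id] -= 1.0                    # dL/du = P(w|c) - 1 for the target word
    W_out -= lr * np.outer(err, h)
    # note: uses the freshly updated W_out, as in many simple implementations
    W_in[context_id] -= lr * (W_out.T @ err)

before = forward(2)[1][7]                    # P(word 7 | context word 2) at init
for _ in range(50):
    sgd_step(context_id=2, target_id=7)
after = forward(2)[1][7]
```

Repeating the step on a single (context, target) pair drives P(target | context) toward 1: exactly as Eq. (4) indicates, the context vector is pulled toward the target word's output vector.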
Figure 9 Image captioning using a CNN image embedder followed by an LSTM decoder. This architecture was proposed by Vinyals et al. [90].

Summarization can be cast as a sequence-to-sequence learning problem, where the input is the original text and the output is the condensed version. Intuitively, it is unrealistic to expect a fixed-size vector to encode all the information in a piece of text whose length can potentially be very long. Similar problems have also been reported in machine translation [94].
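The fixed-size bottleneck noted above is easy to see in code: a recurrent encoder returns one hidden vector of size H no matter how long the input is, so the decoder receives the same amount of information for a 3-token text as for a 300-token one. The sizes and the plain-RNN encoder below are illustrative assumptions, not the models from the cited papers.

```python
# Toy encoder illustrating the fixed-size representation bottleneck.
import numpy as np

rng = np.random.default_rng(1)
E, H = 8, 16                                  # embedding and hidden sizes (arbitrary)
W_xh = rng.normal(scale=0.1, size=(H, E))
W_hh = rng.normal(scale=0.1, size=(H, H))

def encode(token_embeddings):
    """Plain RNN encoder: h_t = tanh(W_xh x_t + W_hh h_{t-1})."""
    h = np.zeros(H)
    for x in token_embeddings:
        h = np.tanh(W_xh @ x + W_hh @ h)
    return h                                  # the final state is all the decoder sees

short_vec = encode([rng.normal(size=E) for _ in range(3)])
long_vec = encode([rng.normal(size=E) for _ in range(300)])
```

Both calls produce a vector of identical shape (H,), regardless of input length; attention mechanisms, discussed in the context of machine translation [94], were introduced precisely to relieve this bottleneck.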
Giménez and Màrquez [125] | SVM with manual feature patterns | 97.16
Santos and Zadrozny [31] | MLP with character + word embeddings | 97.32
Table 3 Parsing (UAS/LAS = Unlabeled/Labeled Attachment Score; WSJ = The Wall Street Journal Section of Penn Treebank).
Parsing type | Paper | Model | WSJ
Dependency parsing | Chen and Manning [129] | Fully-connected NN with features including POS | 91.8/89.6 (UAS/LAS)
Dependency parsing | Weiss et al. [130] | Deep fully-connected NN with features including POS | 94.3/92.4 (UAS/LAS)
Constituency parsing | Petrov et al. [133] | Probabilistic context-free grammars (PCFG) | 91.8 (F1 score)
Constituency parsing | Vinyals et al. [97] | Seq2seq learning with LSTM + attention | 93.5 (F1 score)
…a <story, question, answer> triple format. Xiong et al. [137] applied the same model to visual QA and proved that the memory module was applicable to visual signals.

Chiu and Nichols [139] | Bi-LSTM with word + char + lexicon embeddings | 90.77
Luo et al. [140] | Semi-CRF jointly trained with linking | 91.20
Lample et al. [85] | Bi-LSTM-CRF with word + char embeddings | 90.94
Lample et al. [85] | Bi-LSTM with word + char embeddings | 89.15
Strubell et al. [141] | Dilated CNN with CRF | 90.54

VIII. Performance of Different Models on Different NLP Tasks
We summarize the performance of a series of deep learning methods on standard datasets developed in recent years on 7 major NLP topics in Tables 2–7. Our goal is to show the readers common datasets used in the community and state-of-the-art results along with different models.

A. POS Tagging
The WSJ-PTB (the Wall Street Journal part of the Penn Treebank Dataset) corpus contains 1.17 million tokens and has been widely used for developing and evaluating POS tagging systems. Giménez and Màrquez [125] employed a one-against-all SVM based on manually defined features within a seven-word window, in which some basic n-gram patterns were evaluated to form binary features such as "previous word is the", "two preceding tags are DT NN", etc. One characteristic of the POS tagging problem is the strong dependency between adjacent tags. With a simple left-to-right tagging scheme, this method modeled dependencies between adjacent tags only by feature engineering. In an effort to reduce feature engineering, Collobert et al. [5] relied only on word embeddings within the word window with a multi-layer perceptron. Incorporating a CRF was proven useful in [5]. Santos and Zadrozny [31] concatenated word embeddings with character
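The window-based binary features just described can be written as a simple extractor; the helper name and feature strings below are hypothetical, sketched for illustration rather than taken from the authors' code.

```python
# Sketch of window-based binary features for left-to-right POS tagging,
# e.g. "previous word is 'the'" or "two preceding tags are DT NN".
def window_features(words, tags, i):
    """Binary features for position i: a +/-1 word window plus the two
    previously predicted tags (left-to-right tagging scheme)."""
    feats = set()
    prev_word = words[i - 1] if i > 0 else "<BOS>"
    next_word = words[i + 1] if i + 1 < len(words) else "<EOS>"
    feats.add(f"word={words[i]}")
    feats.add(f"prev_word={prev_word}")      # e.g. "previous word is 'the'"
    feats.add(f"next_word={next_word}")
    if i >= 2:
        feats.add(f"prev_tags={tags[i - 2]}_{tags[i - 1]}")  # e.g. "DT NN"
    return feats

words = ["the", "old", "man"]
tags = ["DT", "JJ"]                          # tags already predicted for positions 0-1
feats = window_features(words, tags, 2)
```

A one-against-all SVM (or an MLP over embeddings, in the spirit of Collobert et al. [5]) then consumes the one-hot encoding of such feature sets.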
Täckström et al. [142] | Manual features with DP for inference | 78.6 | 79.4
Sutskever et al. [67] | Reranking phrase-based SMT best list with LSTM seq2seq | 36.5
Wu et al. [147] | Residual LSTM seq2seq + reinforcement learning refining | 26.30 | 41.16
Lowe et al. [89] | Dual LSTM encoders for semantic matching | 55.22
* 13th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP 2018)
September 6-7, 2018
Place: Zaragoza, Spain
General Chairs: Sergio Ilarri, Fernando Bobillo, Raquel Trillo-Lado, and Martín López-Nores
Website: http://smap2018.unizar.es

* 2018 IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA 2018)
October 1-4, 2018
Place: Turin, Italy
General Chairs: Francesco Bonchi and Foster Provost
Website: https://dsaa2018.isi.it

* 2018 IEEE Smart World Congress (IEEE SWC 2018)
October 8-12, 2018
Place: Guangzhou, China
General Chairs: Guojun Wang and Yew Soon Ong
Website: http://www.smart-world.org/2018/

* 2018 IEEE Latin American Conference on Computational Intelligence (2018 IEEE LA-CCI)
November 7-9, 2018
Place: Guadalajara, Mexico
General Chairs: Alma Y. Alanis and Marco A. Perez-Cisneros
Website: http://la-cci.org

* 2018 IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2018)
November 18-21, 2018
Place: Bangalore, India
General Co-Chairs: Sundaram Suresh and Koshy George
Website: http://ieee-ssci2018.org

* 2019 IEEE Conference on Computational Intelligence for Financial Engineering and Economics (IEEE CIFEr 2019)
May 4-5, 2019
Place: Shenzhen, China
General Chairs: Hisao Ishibuchi and Dongbin Zhao
Website: TBA

* 2019 IEEE Congress on Evolutionary Computation (IEEE CEC 2019)
June 10-13, 2019
Place: Wellington, New Zealand
General Co-Chairs: Mengjie Zhang and Kay Chen Tan
Website: http://www.cec2019.org

* 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2019)
June 23-26, 2019
Place: New Orleans, USA
General Chairs: Timothy C. Havens and James M. Keller
Website: http://www.fuzzieee.org

* 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (IEEE CIBCB 2019)
July 24-27, 2019
Place: Siena, Italy
General Chair: Giuseppe Nicosia
Website: TBA

* The 9th Joint IEEE International Conference of Developmental Learning and Epigenetic Robotics (IEEE ICDL-EpiRob 2019)
August 19-22, 2019
Place: Oslo, Norway
General Chairs: Jim Torresen and Kerstin Dautenhahn
Website: TBA

* 2019 IEEE International Conference on Games (IEEE CoG 2019)
August 20-23, 2019
Place: London, United Kingdom
General Chairs: Diego Perez Liebana and Sanaz Mostaghim
Website: TBA

* 2019 IEEE Latin American Conference on Computational Intelligence (2019 IEEE LA-CCI)
November 11-15, 2019
Place: Guayaquil, Ecuador
General Chair: Otilia Alejandro
Website: http://la-cci.org

* 2019 IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2019)
December 6-9, 2019
Place: Xiamen, China
General Chair: Tingwen Huang
Website: http://ssci2019.org

* 2020 IEEE World Congress on Computational Intelligence (IEEE WCCI 2020)
July 19-24, 2020
Place: Glasgow, UK
General Chairs: Amir Hussain, Marios M. Polycarpou, and Xin Yao
Website: TBA

Digital Object Identifier 10.1109/MCI.2018.2840739
Date of publication: 18 July 2018
IEEE Transactions on Games: as of January 2018, the IEEE Transactions on Computational Intelligence and AI in Games was retitled IEEE Transactions on Games, a change that reflects the broadening of the journal's scope to research on all forms of games. The journal publishes high-quality original articles covering the scientific, technical, and engineering aspects of games, and welcomes submissions on topics including artificial intelligence and games, game design, games for education, human-computer interaction, software engineering in games, augmented and virtual reality, and other game-related topics.
The IEEE Congress on Evolutionary Computation (IEEE CEC) is a world-class event in the field of Evolutionary Computation. It provides a forum to bring together researchers and practitioners from all over the world to present and discuss their research findings on Evolutionary Computation.

IEEE CEC 2019 will be held in Wellington, New Zealand. Wellington is known as the 'Coolest Little Capital'. It is famous for a vibrant creative culture fuelled by events and great food, and offers a wide range of cosmopolitan amenities in a downtown that is safe, clean, and pedestrian friendly.

Call for Papers
Papers for IEEE CEC 2019 should be submitted electronically through the Congress website at www.cec2019.org. They will be refereed by experts in the field and ranked on the criteria of originality, significance, quality, and clarity.

Call for Special Sessions
Special session proposals are invited for IEEE CEC 2019. Proposals should include the title, aim, and scope (including a list of main topics), and the names of the organizers of the special session, together with a short biography of each organizer. A list of potential contributors will be very helpful. All special session proposals should be submitted to the Special Session Chair: Prof Chuan-Kang Ting (ckting@pme.nthu.edu.tw).

Call for Tutorials
IEEE CEC 2019 solicits proposals for tutorials covering specific topics in Evolutionary Computation. If you are interested in proposing a tutorial, would like to recommend someone who might be interested, or have questions about tutorials, please contact the Tutorial Chair: Prof Xiaodong Li (xiaodong.li@rmit.edu.au).

Call for Competitions
Competitions will be held as part of the Congress. Prospective competition organizers are invited to submit their proposals to the Competition Chair: Dr Jialin Liu (jialin.liu@qmul.ac.uk).

Call for Workshops
Workshops will be held to give participants the opportunity to present and discuss novel research ideas on active and emerging topics in Evolutionary Computation. Prospective workshop organizers are invited to submit their proposals to the Workshop Chair: Dr Handing Wang (handing.wang@surrey.ac.uk).

Important Dates
26 October 2018: Special Session Proposal Deadline
26 November 2018: Competition Proposal Deadline
7 January 2019: Paper Submission, Workshop Proposal, and Tutorial Proposal Deadline
7 March 2019: Notification Deadline
31 March 2019: Camera-Ready and Early Registration Deadline

Advisory Board: Hussein Abbass, Australia; Kalyanmoy Deb, USA; David Fogel, USA; Kwong Tak Wu Sam, Hong Kong SAR; Simon Lucas, UK; Zbigniew Michalewicz, Australia; Xin Yao, UK; Gary Yen, USA
General Co-Chairs: Mengjie Zhang, New Zealand; Kay Chen Tan, Hong Kong SAR
Program Chair: Carlos A. Coello Coello, Mexico
Technical Co-Chairs: Jürgen Branke, UK; Oscar Cordón, Spain; Hisao Ishibuchi, Japan; Jing Liu, China; Gabriela Ochoa, UK; Dipti Srinivasan, Singapore
Plenary Talk Chair: Yaochu Jin, UK
Special Session Chair: Chuan-Kang Ting, Taiwan
Tutorial Chair: Xiaodong Li, Australia
Competition Chair: Jialin Liu, UK
Workshop Chair: Handing Wang, UK
Submission Chair: Huanhuan Chen, China
Sponsorship Chair: Andy Song, Australia
Poster Chair: Kai Qin, Australia
Publicity Co-Chairs: Stefano Cagnoni, Italy; Anna I. Esparcia-Alcázar, Spain; Emma Hart, UK; Bin Hu, Austria; Sanaz Mostaghim, Germany; Yew Soon Ong, Singapore; Jun Zhang, China
Finance Chair: Bing Xue, New Zealand
Local Organising Co-Chairs: Will Browne, New Zealand; Hui Ma, New Zealand
Registration Chair: Aaron Chen, New Zealand
Proceedings Chair: Yi Mei, New Zealand
Web Masters: Harith Al-Sahaf, Yiming Peng, and Qi Chen

Sponsored by Victoria University of Wellington (Te Whare Wananga o te Upoko o te Ika a Maui)
Find us at: www.cec2019.org | admin@cec2019.org

Digital Object Identifier 10.1109/MCI.2018.2840759