Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
ICMLC’17, February 24–26, 2017, Singapore, Singapore
© 2017 ACM. ISBN 978-1-4503-4817-1/17/02…$15.00
DOI: http://dx.doi.org/10.1145/3055635.3056601

[Figure 1. Research overview]

The summary of the research is shown in Figure 1. The review protocol is developed (Section 2.1) to select 31 studies as per the selection / rejection criteria (Section 2.1.1). Subsequently, 14 ensemble techniques (Section 3.1), 26 classifiers (Section 3.4), 15 datasets (Section 3.5), 19 features (Section 3.2), and 8 tools (Section 3.3) are presented. The limitations and the answers to the RQs are provided in Section 4. Finally, the conclusion is given in Section 5.

selected study should be based on solid evidence. 2) The objective of this research is to include the most recent research; therefore, we try our best to do so, as 67% of the studies are from
Table 3. Identified ensemble methods for sentiment classification
[Check-mark matrix: studies [1]–[3] and [5]–[32] (rows) against 14 ensemble techniques (columns): BG, Boosting, Stacking, FC, RS, AB, Dagging, MV, WC, MCC, BRS, MC, SMOTE, and Novel Approach; the per-study check marks are not recoverable from the extracted text.]
Abbreviations: Bagging (Bootstrap) = BG, Majority Voting = MV, AdaBoost = AB, Weighted Combination = WC, Meta Classifiers Combination = MCC, Bagging Random Space = BRS, MetaCost = MC.
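As an illustration of the majority-voting (MV) technique that recurs throughout Table 3, the sketch below combines SVM with Naive Bayes and logistic regression via scikit-learn. The toy corpus and the choice of base classifiers are our own assumptions for demonstration, not details taken from any surveyed study.

```python
# Minimal majority-voting ensemble sketch (assumed toy data, not from the SLR).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline

docs = ["great movie, loved it", "terrible plot, waste of time",
        "wonderful acting", "boring and awful",
        "really enjoyable film", "worst film ever"]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

# Hard (majority) voting over three heterogeneous base classifiers,
# with SVM in the pool as in the surveyed ensemble configurations.
ensemble = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # unigram + bigram features
    VotingClassifier(
        estimators=[("svm", LinearSVC()),
                    ("nb", MultinomialNB()),
                    ("lr", LogisticRegression())],
        voting="hard"),
)
ensemble.fit(docs, labels)
print(ensemble.predict(["loved the acting", "awful movie"]))
```

Soft voting (averaging class probabilities) is an alternative, but it requires every base learner to expose `predict_proba`, which `LinearSVC` does not; hard voting only needs `predict`.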
3. RESULTS

3.1 Ensemble Techniques
We analyze the 31 selected studies and find 14 leading ensemble approaches for sentiment analysis, as given in Table 3.

3.2 Feature Selection
We identify 19 features frequently used in the domain of sentiment analysis, as given in Table 4. There are studies that utilize more than one feature simultaneously for sentiment analysis, as shown in the Studies column of Table 4.

Table 4. Prominent features for sentiment analysis
Sr.# | Feature Name | Studies
1 | Semantic | [29], [9]
2 | SentiWordNet | [16], [29], [28]
3 | Product attributes | [17]
4 | Word Pairs and Word Relation | [2], [6], [20], [29], [31], [11], [3], [27]
5 | N-gram | [1], [6], [7], [8], [10], [11], [13], [14], [15], [16], [18], [19], [20], [21], [22], [23], [24], [25], [26], [28], [30], [32]
6 | POS | [2], [6], [29]
7 | Hashing | [3], [10]
8 | Dependency relations | [6]
9 | Length | [24], [28], [11]
10 | Polarity Dictionary | [24]
11 | Abbreviation | [24]
12 | Negation | [24]
13 | Stems | [24]
14 | Clustering | [24]
15 | ALLCAPS | [24]
16 | VSM (Vector Space Model) | [25]
17 | Sentiment | [9], [12]
18 | Stylometric | [9]
19 | Numbers | [10]

3.3 Leading Tools
We identify 8 tools / frameworks that are commonly used to perform various tasks (e.g. pre-processing, training, classification) for sentiment analysis, as shown in Table 5.

Table 5. Identified tools / frameworks
Tools | Studies
WEKA | [1], [2], [3], [8], [13], [15], [17], [18], [19], [20], [22], [23], [27], [29], [30]
RapidMiner | [7], [9]
MATLAB | [7], [31]
YamCha | [11]
LibSVM | [6], [25], [26]
Mallet2 tool | [26]
SVMLight | [22], [32]
Sentiment analysis tool | [14]

3.4 Classifiers Utilized
We find 26 classifiers that are frequently ensembled with SVM to perform sentiment analysis, as given in Table 6.

Table 6. Leading classifiers for sentiment analysis
Classifier | Studies
RBF NN | [29]
Naïve Bayes (NB) | [1], [2], [3], [5], [6], [8], [9], [10], [15], [19], [20], [22], [26], [27], [29], [32]
Decision trees | [1], [9], [15], [19]
BLR | [2]
RF | [3], [19]
CRF | [5]
BPN | [7]
PNN | [1], [10], [20], [23]
K neighbors | [1], [10], [20], [23]
Max. Entropy | [1], [5], [6], [12], [15], [18], [20], [24], [26], [32]
LDA | [2], [7]
LR | [2], [3], [9], [23]
CRF | [11], [15]
ANN | [12]
HMM | [16]
BN | [15], [19], [27], [29]
CB | [20]
RBF | [23]
MLP | [23]
ELM | [31]
BPNN | [31]
SentiStrength | [22]
Scoring | [32]
SBC | [14]
GIBC | [14]
RBC | [14]

3.5 Important Datasets
We identify 15 benchmark datasets for sentiment analysis, as given in Table 7.

Table 7. Important datasets for sentiment analysis
Dataset | Open Source | Studies
Movie | Yes | [1], [5], [6], [8], [18], [14], [25]
Twitter | Yes | [3], [12], [22], [23], [24], [28], [29], [19]
Product Data | Yes | [1], [2], [5], [7], [21], [13], [10], [14], [20], [26], [30]
Medical | Yes | [1], [2], [27], [10]
SMD | Yes | [31]
B News | No | [11], [15]
E-Comer | Yes | [17]
NTCIR | Yes | [32]
Poem | Yes | [28]
Goog. Ad | No | [9]
Book | Yes | [8]
Shopping | Yes | [8]
My Space | Yes | [14]
Montada | Yes | [16]
AFF | Yes | [16]
4. RQ’s ANSWERS AND LIMITATIONS
The answer to RQ1 is provided in Table 3 and Table 6. Furthermore, the answers to RQ2, RQ3, and RQ4 are provided in Table 4, Table 7, and Table 5, respectively. Although we select four well-known scientific repositories for this SLR, there is a fair probability that we miss a few studies published in other repositories (e.g. Wiley). However, this limitation does not majorly affect the ultimate results of this SLR due to the selection of high-impact scientific databases.

5. CONCLUSION AND FUTURE WORK
This study explores modern sentiment analysis trends. A Systematic Literature Review (SLR) is executed to identify 31 studies published in 2008–2016. As a result, 14 modern ensemble techniques, 26 leading classifiers, 15 benchmark datasets, 19 prominent features, and 8 tools are identified in the context of sentiment analysis. Although the outcomes of the SLR are highly beneficial for scholars and industry experts, it is essential to perform a comparative analysis of the identified ensemble approaches, features, classifiers, tools, and datasets in order to provide in-depth details. We intend to perform such an analysis in the next article.

6. REFERENCES
[1] G. Wang, et al., “Sentiment classification: The contribution of ensemble learning,” Decision Support Systems, 2013, Vol. 57, pp. 77–93.
[2] Aytug Onan, Serdar Korukoğlu, Hasan Bulut, “A Multiobjective Weighted Voting Ensemble Classifier Based on Differential Evolution Algorithm for Text Sentiment Classification,” Expert Systems with Applications, 2016, Vol. 62, pp. 1–16.
[3] N.F.F. da Silva, et al., “Tweet sentiment analysis with classifier ensembles,” Decision Support Systems, Vol. 66, October 2014, pp. 170–179.
[4] Barbara Kitchenham, “Procedures for Performing Systematic Reviews,” Keele University, UK, 33.2004 (2004): 1–26.
[5] E. Fersini, E. Messina, F. A. Pozzi, “Sentiment analysis: Bayesian Ensemble Learning,” Decision Support Systems, 2014, Vol. 68, pp. 26–38.
[6] Rui Xia, Chengqing Zong, Shoushan Li, “Ensemble of feature sets and classification algorithms for sentiment classification,” Information Sciences, 2011, Vol. 181, Issue 6, pp. 1138–1152.
[7] G. Vinodhini, R. M. Chandrasekaran, “A Comparative Performance Evaluation of Neural Network Based Approach for Sentiment Classification of Online Reviews,” Journal of King Saud University - Computer and Information Sciences, 2016, Vol. 28, Issue 1, pp. 2–12.
[8] Cagatay Catal, Mehmet Nangir, “A Sentiment Classification Model Based on Multiple Classifiers,” Applied Soft Computing, 2017, Vol. 50, pp. 135–141.
[9] Michael A. Abrahams, T. Ragsdale, “Ensemble learning methods for pay-per-click campaign management,” Expert Systems with Applications, 2015, Vol. 42, Issue 10, pp. 4818–4829.
[10] Johannes V. Lochter, Rafael F. Zanetti, Dominik Reller, Tiago A. Almeida, “Short Text Opinion Detection using Ensemble of Classifiers and Semantic Indexing,” Expert Systems with Applications, 2016, Vol. 62, pp. 243–249.
[11] Asif Ekbal, Sriparna Saha, “Combining feature selection and classifier ensemble using a multiobjective simulated annealing approach: application to named entity recognition,” Soft Computing, 2013, Vol. 17, Issue 1, pp. 1–16.
[12] Yaowei, Rao, Xueying Zhan, Huijun Chen, Maoquan Luo, Jian Yin, “Sentiment and emotion classification over noisy labels,” Knowledge-Based Systems, 2016, Vol. 111, pp. 207–216.
[13] G. Vinodhini, “A sampling based sentiment mining approach for e-commerce applications,” Information Processing & Management, 2017, Vol. 53, Issue 1, pp. 223–236.
[14] Rudy Prabowo, Mike Thelwall, “Sentiment analysis: A combined approach,” Journal of Informetrics, 2009, Vol. 3, Issue 2, pp. 143–157.
[15] Sriparna Saha, Asif Ekbal, “Combining multiple classifiers using vote based classifier ensemble technique for named entity recognition,” Data & Knowledge Engineering, 2013, Vol. 85, pp. 15–39.
[16] Ahmed Abbasi, Hsinchun Chen, Sven Thoms, Tianjun Fu, “Affect Analysis of Web Forums and Blogs Using Correlation Ensembles,” IEEE Transactions on Knowledge and Data Engineering, 2008, Vol. 20, No. 9.
[17] G. Vinodhini and R. M. Chandrasekaran, “Sentiment Mining Using SVM-Based Hybrid Classification Model,” Springer, 2013, Vol. 246, pp. 155–162.
[18] Yumi Lin, Xiaoling Wang, Jingwei Zhang, Aoying Zhou, “Assembling the Optimal Sentiment Classifiers,” 13th International Conference, Paphos, Cyprus, November 28–30, 2012, Proceedings, Vol. 7651, pp. 271–283, 2012.
[19] Yun Wan, Qigang Gao, “An Ensemble Sentiment Classification System of Twitter Data for Airline Services Analysis,” IEEE 15th International Conference on Data Mining Workshops, 2015.
[20] Ying Su, Hongmiao Wang, “Ensemble Learning for Sentiment Classification,” Springer, 2013, Vol. 7717, pp. 84–93.
[21] Matthew Whitehead, Larry Yaeger, “Sentiment Mining Using Ensemble Classification Models,” Springer B.V., 2010.
[22] Tawunrat Chalothorn, Jeremy Ellman, “Simple approaches of sentiment analysis via ensemble learning,” Springer-Verlag Berlin Heidelberg, Vol. 339, pp. 631–639, 2015.
[23] Joseph Prusa, Taghi M. Khoshgoftaar, David J. Dittman, “Using Ensemble Learners to Improve Classifier Performance on Tweet Sentiment Data,” IEEE 16th International Conference on Information Reuse and Integration, 2015.
[24] Matthias Hagen, Martin Potthast, Michel Büchner, Benno Stein, “Twitter Sentiment Detection via Ensemble Classification Using Averaged Confidence Scores,” Springer, 2015, pp. 741–754.
[25] Lin Dai, Hechun Chen, Xuemei Li, “Improving Sentiment Classification Using Feature Highlighting and Feature Bagging,” 11th IEEE International Conference on Data Mining Workshops, 2011, pp. 61–66.
[26] Zhongqing Wang, Shoushan Li, Guodong Zhou, Peifeng Li, Qiaoming Zhu, “Imbalanced Sentiment Classification with Multi-Strategy Ensemble Learning,” Proceedings of the International Conference on Asian Language Processing, 2011.
[27] Wenjia Wang, “Heterogeneous Bayesian Ensembles for Classifying Spam Emails,” Proceedings of the International Joint Conference on Neural Networks, 2010.
[28] Vipin Kumar, Sonajharia Minz, “Multi-view Ensemble Learning for Poem Data Classification Using SentiWordNet,” Advanced Computing, Networking and Informatics, Proceedings of ICACNI 2014, Vol. 27, pp. 57–66.
[29] Ammar Hassan, Ahmed Abbasi, Daniel Zeng, “Twitter Sentiment Analysis: A Bootstrap Ensemble Framework,” International Conference on Social Computing, 2013.
[30] G. Vinodhini and R. M. Chandrasekaran, “Sentiment Mining Using SVM-Based Hybrid Classification Model,” Computational Intelligence, Cyber Security and Computational Models, Vol. 246, pp. 155–162, 2013.
[31] Feng Wang, Yongquan Zhang, Qi Rao, Kangshun Li, H. Zhang, “Exploring mutual information-based sentimental analysis with kernel-based extreme learning machine for stock prediction,” Soft Computing, 2016, pp. 1–13.
[32] Bin Lu, Benjamin K. Tsou, “Combining a large sentiment lexicon and machine learning for subjectivity classification,” Proceedings of the Ninth International Conference on Machine Learning and Cybernetics, 11–14 July 2010.