You are on page 1of 29

STUDYING CHURNING

FACTORS IN INDIAN

TELECOMMUNICATION SECTOR
USING SOCIAL MEDIA ANALYTICS
Nitish Varshey , S. K. Gupta

Indian Institute of Technology Delhi, India

OUTLINE
 Telecommunication

Sector of India

 Goal
 Related

Work

 Methodology
 Results

and Analysis

 Conclusion

and Future Work

.86 million subscribers [2]  13 mobile carriers : most of them having pan India presence [1]   Depicting situation : Telecom Service Providers are stable and market saturates Main Business Drivers [Hung [3] ]  Retention of customer subscriber’s base  Increase in average revenue per user TRAI received 109 million MNP requests within a span of 2 years [2].TELECOMMUNICATION SECTOR OF INDIA  Grown Market  Second largest in subscriber base [1]  899.

GOAL To identify factors pertaining to churn which may help decision makers improve operations in terms of their marketing strategy  Setting a relation between churning in telecommunication sector of India and sentiments present in social media feeds   Field : Telecommunication Sector  Churn is particularly high in Telecom Accenture 2011 Global Consumer Survey [4] ] [  Data [ availability TRAI Telecom Subscription Data Monthly Report ] .

RELATED WORK : DATA MINING TECHNIQUES  Predict customers getting ready to switch.7. account length.6.8]  Data Corpus : Actual customer transactions and billing data  Churn Pertaining Attributes Identified : geographic location. call length etc  Work Done : Patterns followed by churning customers:  “If I am calling to customer care more than Y times then I will churn”  “If I am calling more than X minutes and if someone else provides me a better plan I will churn”  “If my in net call duration is low I will churn”  Potential value of a customer  . understand why and connect with them to provide offers to mitigate change [5.

11.10.12]  Avoids use of proprietary customer data  Major Goal :  Finding customer loyalty and intention to recommend a service provider to other customers  Socio-economic factors can also be examined .RELATED WORK BASED TECHNIQUES : SURVEY  Based on consumer survey data [9.

October 2011 )  Opinions available on social media can be a valuable online source for mining reasons for customers dissatisfaction. which most likely lead to customers churn .METHODOLOGY : SOCIAL MEDIA ANALYSIS   People share opinion on different aspects of life everyday. including telecommunication services they are using 27% customers complains on social InformationWeek’s survey of 392 business tech pros at companies using one or more internal social media ( networking systems.

METHODOLOGY I/P Data Source :Twitter .

2012 to 31 Apr. 2013) .METHODOLOGY I/P Selection Corpus Creation : Telecom Specific Tweets : Queried for service provider within time range Data collected over a span of 9 months (1 Aug.

“gr8” • Emoticons .METHODOLOGY I/P Selection Data Cleansing • Non Relevant Tweets Removal • Spell Corrector • Google’s spell corrector • Net lingo’s “lv”.

coooool Y Language Dictionary N Netlingo Dictionary Y N Is Emoticon? Y N Norvig’s Toy Spell Corrector Processed Word .g.SPELL CORRECTOR Word Removal of any character that repeats more than twice e.

METHODOLOGY I/P Selection Data Cleansing Preprocessing • Stemming of each token .

METHODOLOGY I/P Selection Data Cleansing Preprocessing Transformation Relational DB tuples 86K tuples retrieved for 3 telecom players Manually Annotated Text .

Tweet : “A B C D E” .METHODOLOGY I/P Selection Data Cleansing Preprocessing Transformation Classification Eg.

Service. discount.METHODOLOGY N-gram Based Text Categorization [5.6] Feature-based sentiment analysis [8] Eg. messaging. recharge. billing  Satisfaction : (change|switch)*(to|from)*service_provider  Miscellaneous : waive off. bailout. Miscellaneous  Feature Indicators : Price : Tariff / costing / rate  Service : network. implement  . Satisfaction. interrupt. Tweet : “A B C D E”  Feature-based sentiment analysis:  Aspects act as object features  Object Features : Price. internet.

2012 to 31 Apr. Tweet : “A B C D E” ARM for most pertaining churning factor . Tata Selection Data collected over a span of 9 months (1 Aug. Aircel. 2013) • Non Relevant Tweets Removal • Spell Corrector Data Cleansing Preprocessing Transformation Classification • Stemming of each token • Stop Words Removal Relational Database Around 86K tweets retrieved Eg.METHODOLOGY I/P Data Source :Twitter Telecom Specific Tweets : BSNL.

with K=3 . Y : Aspect  Eg. X : Sentiment.ASSOCIATION RULE MINING  Finds correlations among different attributes (features) in a dataset   Correlation between aspects and sentiments Strongness of correlation among different features is measured in terms of interesting of rule  Support-Confidence used for generating interesting rules Constraint Based Association Rules are mined  X ->Y . ‘Service’ -> ‘Negative’  Most frequent rules are mined :   Top-K association rules are mined.

interruptions. flop. problem  . tough  cheap. lightning. slashes  Withdrawn.RESULTS AND ANALYSIS  Strongly positive / negative term list provided by Bing Liu [13] is domain independent  Needs to be modified for Telecommunication Sector Domain Dependent strongly positive / negative word  highspeed. right. shitty  Domain independent words may not be strongly positive / negative for telecommunication sector  pretty.

RESULTS AND ANALYSIS  Multiple service provider issue (inclusion of multiple service providers in a tweet presents semantic issues) (79.5%)  Examples : planning to move from BSNL to aircel  idea and bsnl both are shit i am pretty sure about this need to shift to virgin  bsnl services are much better than idea dont get an idea  .

is gong to expire after 03/01/2013  . however in tweets they represent negative sentiment (Our experiment show 86%)  Examples: bsnl internet is dead today no internet  aircel i hate u no wifi since morning  bsnl suck no line never work uncouth staff  @bsnlportal you r not responding to ur mistake by deducting my talktime 4 bsnl-tune how long it’ll take from 27th dec  my bsnl data card 9408914869 is not working however the balance is 3g 8gb and main balance is 400/.RESULTS AND ANALYSIS  Neg neg is generally positive.

0.284 . of Customers Customers 65952244   793717   66607361 655117         66786295 178934         65323317   -1462978   63347284   -1976033   61571291 -1775993         .) ==> ('positive'. 0.124 .) ==> ('positive'.) ==> ('positive'. 0.) .) ==> ('positive'.211 . 0.311 . 0.) ==> ('positive'. 0.942 ('service'.) ==> ('positive'.) .856 ('service'.302 . 0.185 . 2012   Dec.) .) .) . 2012   Jan. 0.609 ('miscellaneous'. 0. 0.) ==> ('positive'. 0. 0.129 .) .261 . 0. 0.948 ('service'. 2012   Sep.) ==> ('negative'.206 .921 ('service'.207 . 0.306 .) ==> ('service'. 0.917 ('service'. 2012   Nov.445 ('miscellaneous'.) .Confidence Rule.915 ('service'. 0. 'service') ==> ('positive'.) .249 .914 ('miscellaneous'.) .) . 0. Confidence ('miscellaneous'.) .) ==> ('positive'.713 ('negative'.) ==> ('positive'. 0.476 From reports published by TRAI Change in Total no.633 ('miscellaneous'. 2013   Support .635 ('miscellaneous'. 2012   Oct.375 ('miscellaneous'. 0. 0.) ==> ('negative'.) .380 . 0.625 ('service'. 0. 0.) ==> ('positive'. 0.Experiment 1 for Aircel Service Provider Performance Measure How to read Aug. 0.) .) .100 .) ==> ('positive'. 0.123 .) . 0. 0. Support. 0.

306 . 0.) .145 .166 .) ==> ('positive'. 0. 0.12   Nov.411     ('miscellaneous'.103 .232 .) ==> ('positive'.842     ('service'.Experiment 1 for BSNL service provider Performance Measure Support.591     ('service'.) . 'service') ==> ('positive'.) ==> ('positive'.13   From reports published by TRAI Change in no.763 99922347 13013 ('service'.) ==> ('positive'.) ==> ('negative'. 0. 0. 0. 0.114 . 0.) .846 99633207 393016 ('miscellaneous'. 0.698 100240893 318546 ('satisfaction'.210 .240 .) . 0.141 .) .125 . 0. 0.207 .) .227 .) ==> ('positive'. 0.) .) ==> ('positive'.830 99240191 491362 ('miscellaneous'.) ==> ('positive'.) .) ==> ('negative'. 0.327 .12   Sep.) . 0.627     ('satisfaction'.160 .610     .) .) . 0. 0. 0.) ==> ('positive'. 'service') ==> ('positive'.247 . 0. 0.13   Feb.) ==> ('negative'. 0. 0. 0.827     ('service'.694 100670567 429674 ('service'. 0.) ==> ('positive'.) . 0.259 .555     ('miscellaneous'. 0. 0. 0.) .Confidence How to read Rule.595     ('miscellaneous'.) . 0.187 . 0. 0.733 99909334 -81021 ('satisfaction'.) . Confidence Aug.) ==> ('negative'.12   Jan.) . 0.) ==> ('negative'. 0.12   Oct.) .) ==> ('positive'. 0.541     ('miscellaneous'. 0.) ==> ('positive'.12   Dec.498     ('miscellaneous'.568     ('service'. 0.804 99990355 357148 ('service'. 0. of Total Customer Customers s ('miscellaneous'.) . 0.111 .247 . 0.109 . 0. 0.502     ('service'.) ==> ('negative'.) . Support.) .582     ('miscellaneous'.) ==> ('positive'.589     ('service'.) ==> ('positive'.) .188 . 0.

it is highly probable that total customers for the service provider would decrease.CONCLUSION  Whenever negative sentiment is present in top rules. Service provider needs to work upon the aspects indicated in negative sentiment containing rule so that customer churn can be controlled .

FUTURE WORK Tweaking parameters used for Association Rule Mining  Disregard categorically circle specific issues as feeds geographic location not available   Based on GPS location : tagging location using smart phones  Inferring relation between tweets : People who are living in Britain are more likely to post about Royal Wedding in Britain  Direct Inference : Moving to Singapore   Work is based on assumption  All tweets are genuine : Identification and Removal of all such feeds .

Yen .REFERENCES 1. “India needs umbrella body on telecom standards”. Shin-Yuan Hung . . 2. information Week. Hsiu-Yu Wang. “Applying data mining to telecom churn management”. 78/2013. Feb 2012. 4. “From CRM to Social”. press Release No. David C. Accenture 2011 Global Consumer Survey. telecom subscription data monthly report” taken on 30th september. 3. “TRAI. Article published in Economic times on 16 August 2012. Expert Systems with Applications 31 (2006) 515–524.

P. 569–590. D. E.14(6). J. IEEE Transactions on Neural Networks. B. Mani... “Predicting subscriber dissatisfaction and improving retention in the wireless telecommunications industry”. L. K. & Kaushansky. 8.. Grimes. 103–112. Mozer.. M. Wei. Johnson. 3 (3). R.. R. 7. 690–696. “Turning telecommunications call details to churn prediction: A data mining approach.. P. I. & Liu.. D. NG. Wolniewicz. 6. Expert Systems with Applications” . 205–219. & Datta.REFERENCES 5. (2002). “Customer retention via data mining”. . Drew. A. H. C. & Chiu. H. (2001). T.. H. Betz. 11(3). (2000). (2000).. C. 23 (2). Artificial Intelligence Review. Journal of Service Research . “Targeting customers with statistical and data-mining techniques”.

Rams. . “Using survey data to predict adoption and switching for services. loyalty and satisfaction in the German mobile cellular telecommunications market. N. Kim. 751–765. & Moitra. & Schindler. Telecommunications Policy . D. Weerahandi.. Customer retention. Implications of loyalty program membership and service experiences for customer retention and value. Gerpott. Telecommunications Policy . C. & Bramlett. K.. S. M. Bolton. (2001). H. (2004). 11. Journal of Marketing Research”. S. R.. Journal of the Academy of Marketing Science . 85–96. Kannan. H. 249–269. W. Determinants of subscriber churn and customer loyalty in the Korean mobile telephony market. P. T. 10.REFERENCES 9. S. (1995).. & Yoon. A. 95–108.. 25 (4).. (2000). 28 (9/10). 28 (1). 32 (1). 12.

Rawia." Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. Bing. 14." Handbook of natural language processing 2 (2010): 627-666.REFERENCES 13. . Trenkle. "Language-model-based pro/con classification of political text. and John M. 15.2 (1994): 161-175. Awadallah. "Ngram-based text categorization.. Liu. 2010. and Gerhard Weikum. Maya Ramanath." Ann Arbor MI 48113. ACM. Cavnar. "Sentiment analysis and subjectivity. William B.

QUESTIONS? .