Contents
1 Techniques
2 Author profiling and the Internet
2.1 Social media
2.1.1 Facebook
2.1.2 Weibo
2.1.3 Chat logs
2.1.4 Blogs
2.2 Email
3 Applications
3.1 Forensic linguistics
3.2 Bot detection
3.3 Marketing
3.4 Literary works
3.5 Library cataloguing
4 In popular culture
5 See also
6 References
Techniques
Through the analysis of texts, various author profiling
techniques can be applied to predict information about the
author. For example, function words and part-of-speech
analysis can be used to infer the author's gender and to
assess the truthfulness of a text.[3]
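As a rough illustration of the kind of feature such techniques rely on, the sketch below computes the relative frequency of a few function words in a text. The word list and the name function_word_profile are assumptions made for this example and are not taken from the cited studies; in practice the resulting values would be one input among many to a trained classifier.

```python
import re
from collections import Counter

# Hypothetical, deliberately small function-word list (an assumption for
# illustration); published studies use much larger lists.
FUNCTION_WORDS = ("the", "a", "of", "and", "to", "in", "she", "he", "for", "with")


def function_word_profile(text):
    """Return the relative frequency of each listed function word in `text`."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    # Divide by the token count so texts of different lengths are comparable.
    return {w: (counts[w] / total if total else 0.0) for w in FUNCTION_WORDS}


sample = "The cat sat on the mat and looked at the dog."
profile = function_word_profile(sample)
```

Here profile maps each function word to its share of all tokens in the sample. A profiling system would compute such a vector for every text and compare the vectors across groups of authors.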
Social media
Chat logs
Blogs
Instance-based learning
Random Decision Forests
Applications
Author profiling has applications in various fields where there
is a need to identify specific characteristics of the author of a
text, and it is of growing importance in fields such as forensics
and marketing.[26] Depending on the application, the task of
author profiling can vary in the characteristics to be
identified, the number of authors studied, and the number of
texts available for analysis.
Bot detection
Bot and gender profiling was one of four shared tasks in the
2019 edition of PAN, a series of scientific events and shared
tasks on digital text forensics and stylometry.[33]
Participating teams achieved strong results: the best results
for bot detection on English and Spanish tweets were 95.95%
and 93.33% respectively.[35]
Marketing
Literary works
The Bible
Gospels of the New Testament
Shakespeare’s works[41]
The Federalist Papers, in studies from the 1960s and 1990s
Lithuanian literary texts[40]
Library cataloguing
In popular culture
Author profiling has been featured in popular culture. The
2017 Discovery Channel mini-series Manhunt: Unabomber is
a fictionalised account of the FBI investigation into the
Unabomber. It features a criminal profiler who identifies
defining characteristics of the Unabomber’s identity by
analysing the idiolect of the Unabomber’s published
manifesto and letters. The show highlighted the importance
of author profiling in criminal forensics, as it was critical to
the capture of the real Unabomber in 1996.[43]
See also
Related subjects
Computational linguistics
Forensic linguistics
Native-language identification
Social bot
Stylometry
References
1. Wiegmann, M., Stein, B. & Potthast, M. (2019).
"Overview of the Celebrity Profiling Task at PAN 2019."
CLEF.
2. Mikros, G.K., & Perifanos, K. (2013). "Authorship
attribution in Greek tweets using author's multilevel n-
gram profiles." 2013 AAAI Spring Symposium Series.
3. Koppel, M., Argamon, S., & Shimoni, A.R. (2002).
"Automatically categorizing written texts by author
gender." Literary and Linguistic Computing, 17(4), 401–412.
4. López-Monroy, A. P., Montes-y-Gómez, M.,
Escalante, H. J., Villaseñor-Pineda, L. & Stamatatos, E.
(2015). "Discriminative subprofile-specific
representations for author profiling in social media." In:
Knowledge-Based Systems, 89, 134–147.
5. Lundeqvist, E. & Svensson, M. (2017). "Author
profiling: A machine learning approach towards
detecting gender, age and native language of users in
social media." In: Department of Information
Technology.
6. Franco-Salvador, M., Plotnikova, N., Pawar, N., &
Benajiba, Y. (2017). "Subword-based deep averaging
networks for author profiling in social media." CLEF.
7. Kurita, K. (2018). "Paper dissected: Deep unordered
composition rivals syntactic methods for text
classification explained." Machine Learning Explained.
8. Bsi, B. & Zrigui, M. (2018). "Deep learning
techniques for author profiling in social media content."
In: 31st IBIMA Conference.
9. Bilan, I. & Zhekova, D. (2016). "CAPS: A cross-genre
author profiling system." CLEF.
10. Schler, J., Koppel, M., Argamon, S., & Pennebaker, J.W.
(2005). "Effects of Age and Gender on Blogging." AAAI
Spring Symposium: Computational Approaches to
Analyzing Weblogs.
11. Rangel, F., & Rosso, P. (2019). "Overview of the 7th
author profiling task at PAN 2019: Bots and gender
profiling in Twitter." CLEF.
12. Rosso, P., Rangel, F., Farías, I. H., Cagnina, L.,
Zaghouani, W., & Charfi, A. (2018). "A survey on author
profiling, deception, and irony detection for the Arabic
language." Language and Linguistics Compass, 12(4).
13. Gómez-Adorno, H., Markov, I., Sidorov, G.,
Posadas-Durán, J.-P., Sanchez-Perez, M. A., &
Chanona-Hernandez, L. (2016). "Improving Feature
Representation Based on a Neural Network for Author
Profiling in Social Media Texts." In: Computational
Intelligence and Neuroscience, 1–13.
14. Dam, J. W. V., & Velden, M. V. D. (2015). "Online
profiling and clustering of Facebook users". In: Decision
Support Systems, 70, 60–72.
15. Hsieh, F.C., Sandroni, R.F., & Paraboni, I. (2018).
"Author Profiling from Facebook Corpora." LREC.
16. Fatima, M., Hasan, K., Anwar, S., & Nawab, R. M. A.
(2017). "Multilingual author profiling on Facebook." In:
Information Processing & Management, 53(4), 886–904.
17. Rangel, F., & Rosso, P. (2013). "Use of Language and
Author Profiling: Identification of Gender and Age."
18. Zhang, W., Caines, A., Alikaniotis, D., & Buttery, P.
(2015). "Predicting author age from Weibo microblog
posts." LREC.
19. Chen, L., Qian, T., Wang, F., You, Z., Peng, Q., &
Zhong, M. (2015). "Age Detection for Chinese Users in
Weibo." WAIM 2015, LNCS 9098, 83–95.
20. Lin, J. (2007). "Automatic Author Profiling of Online
Chat Logs."
21. Bengel, J., Gauch, S., Mittur, E., & Vijayaraghavan, R. (2004).
"ChatTrack: Chat Room Topic Detection Using
Classification." In: Chen H., Moore R., Zeng D.D., Leavitt
J. (eds) Intelligence and Security Informatics. ISI 2004.
Lecture Notes in Computer Science, 3073. Springer,
Berlin, Heidelberg.
22. Pham, D.D., Tran, G.B., & Pham, S.B. (2009).
Author Profiling for Vietnamese Blogs. 2009
International Conference on Asian Language
Processing, 190-194.
23. Santosh, K., Bansal, R., Shekhar, M. & Varma, V. (2013).
Author Profiling: Predicting Age and Gender from Blogs
Notebook for PAN at CLEF 2013. CLEF.
24. Rangel, F. & Rosso, P. (2013). Use of Language and
Author Profiling: Identification of Gender and Age.
Natural Language Processing and Cognitive Science
2013.
25. Estival, D., Gaustad, T., Pham, S. B., Radford, W.,
& Hutchinson, B. (2007). Author Profiling for English
Emails.
26. Author Profiling 2018. (n.d.).
27. Foster, D. (2000). Author Unknown: On the Trail of
Anonymous. Henry Holt and Company
28. Grant, T. D. (2008). "Approaching questions in
forensic authorship analysis." In Gibbons, J. & Turell, M.
T. (Eds.). Dimensions of Forensic Linguistics. John
Benjamins.
29. Kotzé, E. F. (2010). "Author identification from
opposing perspectives in forensic linguistics". South
African Linguistics and Applied Language Studies.
28(2). 185-197
30. Yang, M. & Chow, K. P. (2014) "Authorship Attribution
for Forensic Investigation with Thousands of Authors."
In: Cuppens-Boulahia N., Cuppens F., Jajodia S., Abou
El Kalam A., Sans T. (eds) ICT Systems Security and
Privacy Protection. SEC 2014. IFIP Advances in
Information and Communication Technology, vol 428.
Springer, Berlin, Heidelberg.
31. Leonard, R. A. (2005). "Applying the Scientific
Principles of Language Analysis to Issues of the Law."
International Journal of Humanities. 3. 1-9
32. Chaski, C. E. (2001). "Empirical evaluations of
language-based author identification techniques."
Forensic Linguistics, 8, 1-65.
33. "Bots and Gender Profiling 2019". (n.d.).
34. Goubin, R., Lefeuvre, D., Alhamzeh, A., Mitrović, J.,
Egyed-Zsigmond, E., & Fossi, L. (2019). "Bots and Gender
Profiling using a Multi-layer Architecture: Notebook for
PAN at CLEF 2019."
35. Daelemans W. et al. (2019) "Overview of PAN
2019: Bots and Gender Profiling, Celebrity Profiling,
Cross-Domain Authorship Attribution and Style Change
Detection." In: Crestani F. et al. (eds) Experimental IR
Meets Multilinguality, Multimodality, and Interaction.
CLEF 2019. Lecture Notes in Computer Science, vol
11696. Springer, Cham.
36. Kovács, G., Balogh, V., Mehta, P., Shridhar, K., Alonso,
P., & Liwicki, M. (2019). "Author Profiling using
Semantic and Syntactic Features: Notebook for PAN at
CLEF 2019."
37. Raghunadha Reddy T., Lakshminarayana M., Vishnu
Vardhan B., Sai Prasad K., Amarnath Reddy E. (2019) "A
New Document Representation Approach for Gender
Prediction Using Author Profiles." In: Bapi R., Rao K.,
Prasad M. (eds) First International Conference on
Artificial Intelligence and Cognitive Computing.
Advances in Intelligent Systems and Computing, vol
815. Springer, Singapore
38. Maharjan, S., Shrestha, P., Solorio, T., & Hasan, R.
(2014). "A Straightforward Author
Profiling Approach in MapReduce." LNCS (LNAI).
39. Company, J. S., & Wanner, L. (2017). "On the Relevance
of Syntactic and Discourse Features for Author Profiling
and Identification." Proceedings of the 15th Conference
of the European Chapter of the Association for
Computational Linguistics, 2, 681–687.
40. Dzikiene, J. K., Utka, A., & Šarkute, L. (2015).
"Authorship Attribution and Author Profiling of
Lithuanian Literary Texts", 96–105.
41. Ledger, G. (1994). "Shakespeare, Fletcher, and the Two
Noble Kinsmen." Literary and Linguistic Computing,
9(3), 235–247.
42. Nomoto, T. (2009). "Classifying library catalogues
by author profiling." In: Proceedings of the 32nd
International ACM SIGIR Conference on Research and
Development in Information Retrieval - SIGIR 09.
43. Davies, D. (2017, August 22). "FBI Profiler Says
Linguistic Work Was Pivotal In Capture Of Unabomber."