Seminar Internal Evaluation Phase - 1 & 2: Automated Log Parsing For Large-Scale Log Data Analysis
Continued...
• This has triggered a number of studies on log parsing,
which aims to transform free-text log messages into
structured events.
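As a rough illustration of what turning a free-text log message into a structured event can look like, here is a minimal sketch. The log layout (`<date> <time> <LEVEL> <content>`), the `parse_line` helper, and the rule that numbers, hex ids, and IP addresses are the variable parts are all assumptions made for this example; they are not taken from the slides or from any particular parser.

```python
import re

# Hypothetical header layout: "<date> <time> <LEVEL> <free-text content>"
HEADER = re.compile(r"^(?P<timestamp>\S+ \S+) (?P<level>[A-Z]+) (?P<content>.*)$")

# Assumption: IP addresses (optionally with a port), hex ids, and plain
# numbers are the variable parts of a message; everything else is constant.
VARIABLE = re.compile(r"\b(?:\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?|0x[0-9a-fA-F]+|\d+)\b")


def parse_line(line):
    """Split a raw log line into a structured event, or return None."""
    match = HEADER.match(line)
    if match is None:
        return None
    content = match.group("content")
    return {
        "timestamp": match.group("timestamp"),
        "level": match.group("level"),
        # Event template: constant text with variables masked as <*>
        "template": VARIABLE.sub("<*>", content),
        # Variable values, in the order they appear in the message
        "parameters": VARIABLE.findall(content),
    }


if __name__ == "__main__":
    raw = "2018-11-01 12:00:03 INFO Received block 4821 of size 67108864 from 10.0.0.7"
    print(parse_line(raw))
```

Real log parsers learn the templates from the data instead of relying on hand-written patterns like these; the sketch only shows the shape of the input and output.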
Continued...
• The traditional method of log analysis, which relies
largely on manual inspection and is labor-intensive
and error-prone, has been complemented by
automated log analysis techniques.
Review of Literature
| Sl.No | Title | Authors | Year | Description |
|---|---|---|---|---|
| 3 | Structured Comparative Analysis of Systems Logs to Diagnose Performance Problems | K. Nagaraj, C. Killian, and J. Neville | 2012 | This paper proposes a tool that uses machine learning techniques to compare system behaviors extracted from the logs and automatically infer the strongest associations between system components and performance. |
Technical Relevance
Overview of POP
Implementation
Experimental Results
• F-measure is used as the metric for evaluating
parsing accuracy.
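The slides do not say how the F-measure is computed. One common way to score a log parser is to treat parsing as clustering and count pairs of messages: a pair is a true positive when both messages share a ground-truth event and the parser also puts them in the same group. The sketch below assumes that pairwise formulation; the function name `pairwise_f_measure` and the sample labels are made up for illustration.

```python
from itertools import combinations


def pairwise_f_measure(predicted, truth):
    """F-measure over message pairs (assumed pairwise clustering formulation).

    predicted[i] is the group the parser assigned to message i,
    truth[i] is its ground-truth event label.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_pred = predicted[i] == predicted[j]
        same_true = truth[i] == truth[j]
        if same_pred and same_true:
            tp += 1  # grouped together, and they belong together
        elif same_pred:
            fp += 1  # grouped together, but different events
        elif same_true:
            fn += 1  # same event, but split across groups
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = ["A", "A", "B", "B"]
    predicted = [0, 0, 0, 1]  # the parser wrongly merged one "B" message into group 0
    print(pairwise_f_measure(predicted, truth))
```

For the sample data, precision is 1/3 and recall is 1/2, so the F-measure comes out at 0.4. The pair loop is quadratic in the number of messages, so it suits small labeled samples only.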
Tools Used
• Commercial Tools
– Splunk
– Logentries
– Logmatic
• Open Source Tools
– Graylog
– Logstash
– Logz.io
Sustainability
When logs grow to a large scale (e.g., 200 million log
messages), which is common in practice, traditional
parsers are not efficient enough to handle such data
on a single computer. This limitation is overcome by
implementing a parallel log parser (namely POP) on
top of Spark.
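The idea of splitting the parsing work across workers can be sketched without a cluster. The example below is not POP and does not use Spark: it is a plain-Python stand-in that parses partitions of messages independently and then merges the per-partition results, which is roughly the pattern that operations such as Spark's `mapPartitions` and `reduceByKey` provide at scale. The number-masking rule and all function names are assumptions for illustration.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Assumption: standalone numbers are the only variable parts of a message.
NUMBER = re.compile(r"\b\d+\b")


def to_template(message):
    """Mask variable tokens so messages of the same event look identical."""
    return NUMBER.sub("<*>", message)


def parse_partition(messages):
    """'Map' step: count event templates within one partition."""
    return Counter(to_template(m) for m in messages)


def split(messages, n_partitions):
    """Deal messages round-robin into n_partitions lists."""
    return [messages[i::n_partitions] for i in range(n_partitions)]


def parallel_template_counts(messages, n_partitions=4):
    """Parse partitions independently, then merge ('reduce') the counts.

    Threads are used only to show the structure; in CPython they do not
    give a real speed-up for CPU-bound work like this.
    """
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        for counts in pool.map(parse_partition, split(messages, n_partitions)):
            total.update(counts)
    return total


if __name__ == "__main__":
    logs = [
        "Connected to node 1",
        "Connected to node 2",
        "Disk 3 full",
        "Connected to node 17",
    ]
    print(parallel_template_counts(logs))
```

Because each partition is processed without reference to the others, the same two functions could in principle be handed to a distributed engine, where the partitions would live on different machines.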
Conclusion
Automated log parsing for the large-scale log analysis
of modern systems is faster and more reliable than manual
processing of the logs.
REFERENCES
[1] P. He, J. Zhu, S. He, J. Li, and M. R. Lyu, “Towards automated log parsing for large-scale log data analysis,” IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 6, Nov.-Dec. 2018.
[2] W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, “Detecting large-scale system problems by mining console logs,” in SOSP’09: Proc. of the ACM Symposium on Operating Systems Principles, 2009.
[3] Q. Fu, J. Lou, Y. Wang, and J. Li, “Execution anomaly detection in distributed systems through unstructured log analysis,” in ICDM’09: Proc. of International Conference on Data Mining, 2009.
[4] K. Nagaraj, C. Killian, and J. Neville, “Structured comparative analysis of systems logs to diagnose performance problems,” in NSDI’12: Proc. of the 9th USENIX.
[5] A. Oprea, Z. Li, T. Yen, S. Chin, and S. Alrwais, “Detection of early-stage enterprise infection by mining large-scale log data,” in DSN’15, 2015.