Professional Documents
Culture Documents
PRESENTED BY,
GUIDED BY: ANNAPURNA.T(20104009)
MRS.S.SIVASELVI,AP/CSE ASWITHA.U(20104013)
KANAGALAKSHMI.V(20104050)
KAVYA.R(20104057)
AGENDA
ABSTRACT
OBJECTIVE
INTRODUCTION
LITERATURE REVIEW
DESIGN AND METHODOLOGIES
IMPLEMENTATION
CONCLUSION
REFERENCES
ABSTRACT
Cricket as a sport has the importance and entertaining nature of the
world.
For a team the performance of each individual player has great
importance for their victory.
In this project, an automated web scraping is to extract and store data
of cricket players from a website and the results will be visible as
dashboard.
This website contain web-enabled database of cricket players where the
data is contained on multiple web pages and proposed mechanism is
designed to extract this variation as per requirement of researcher.
This generic tool will be applicable in analyzing the formation of cricket
team as per rating of individual player.
Moreover, this tool will contribute as an essential ingredient for research
related in the domain of cricket analytics. Using this dashboard results
we can predict future analysis.
OBJECTIVES
Aim of the Project:
The main aim is to develop an automated generic
scraping framework that can work dynamically to
extract information about cricket for data analytics.
It contributes as an essential ingredient for research
related in the domain of cricket analytics.
This generic tool will be applicable in analyzing the
formation of cricket team as per rating of individual
player.
.
Scope of the Project:
This web scraping tool can enable a user to just type the
name of the player, after which the developed scraper
will automatically indexes the page where that player’s
data is located and delivers the question to the user to
select the format of which the data is to be extracted of
that player.
KhalilS et al., has proposed an R package for parallel web crawling and
scraping, Software X [3].
Step 4: Output
IMPLEMENTATION
Data –Flow Diagram
Class Diagram
DATA FLOW DIAGRAM
Class Diagram
REFERENCES
Khan JR, Biswas RK, Kabir E. A quantitative approach to influential factors in
One Day International cricket:Analysis based on Bangladesh, Journal of Sports
Analytics.2020; 5(1):57−63. https://doi.org/10.3233/JSA-170260.
Uzun Erdinç, Nusret Buluş H, Doruk Alpay, Özhan Erkan. Evaluation of Hap,
Anglesharp and Htmldocument in Web Content Extraction; 2020.Date
accessed:11/2021https://www.researchgate.net/publication/321228381_EVALU
ATION_OF_HAP_ANGLESHARP_AND_HTMLDOCUMENT_IN_WEB_CO
NTENT_EXTRACTION.
Khalil S, Fakir M. RCrawler: An R package for parallel web crawling and
scraping, Software X.2020;6:98−106https://doi.org/10.1016/j.softx.2020.04.004.
McNamara DJ, Gabbett TJ, Naughton GJSM. Assessment of workload and its
effects on performance and injury in elite cricket fast bowlers, Sports Medicine.
2021; 47(3):503−15.https://doi.org/10.1007/s40279-016-0588-8.
PMid:27435575.
Rahman AFR, Alam H, Hartono R. Content extraction from HTML documents.
International workshop on Web document Analysis; 2020.
CONCLUSION
The nature of this paper was to reflect and exhibit different tools
that can be implemented to leverage a scraping tool, for the
possession of any desired data available on the web.
Despite of the fact there are web-scraping techniques available but
for dynamic websites, it is still challenging researchers to extract
data of their use.
After analysis of available resources and their justification of use,
an automated scraper was developed to extract information of
cricket players with customized settings.
This research work aims to contribute towards statistical
comparisons between players of different teams irrespective of
their team ranking in international formats of cricket, generating a
rating system based on their performance in recent and past
matches and ranking them according to that rating.
THANK
YOU!