
Cricket Team Recommendation System Using Machine Learning Algorithm

BY-
o Anish Srivastava(08)
o Anurag(09)
o Ayush Chaurasia (16)
MOTIVATION

o The world is increasingly automated. Almost every field is either automated or moving toward automation: hiring processes, customer support desks, self-driving vehicles.

o In this project we try to automate cricket team selection by replacing the selection committee with a machine learning model.

o A selection committee picks players on the basis of individual performance measures such as batting average, strike rate and centuries. In the same way, the model is trained on each player's previous record.

o In practice this reduces human effort: the BCCI (Board of Control for Cricket in India) has a selection committee of around 5-10 people.

o A model reduces the chance of bias, since a selection panel leaves room for favouritism and nepotism toward some players.

o It also reduces the financial burden on the board, which pays the committee crores of rupees annually.

o The process is time-efficient: selection meetings take a lot of time and human labour.
METHODOLOGY: DATA EXTRACTION AND PREPROCESSING
o Raw data was collected for ODI matches as well as State Level matches. We collected the series-wise record of all matches to date from howstat.com for ODI series and crickbuzz.com for State Level series.
Methodology and Implementation
Data Pre-processing: Preparation of Data set

Initial data-set
PREPROCESSING

Batting measures and ranking index for batsmen:

FEATURES:

❏ BA = Runs scored / (Innings played - NOI)   (1)

❏ BS = Runs scored / Balls faced   (2)

❏ MRA = (100's + 50's) / Innings played   (3)

❏ Outrate = Number of times the batsman got out / Number of balls faced by the batsman   (4)

❏ BRPI = (4 * total number of fours + 6 * total number of sixes) / Innings played   (5)
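The five batting measures above can be sketched as a single function. This is only an illustration: the function name and argument names are hypothetical, and it assumes that NOI means the number of not-out innings, so that "Innings played - NOI" is the number of dismissals.

```python
def batting_features(runs, innings, not_outs, balls_faced,
                     hundreds, fifties, fours, sixes):
    """Return the batting measures of equations (1)-(5) as a dict.

    Assumption: NOI in equation (1) is the count of not-out innings,
    so innings - not_outs is the number of times the batsman got out.
    """
    dismissals = innings - not_outs
    if dismissals <= 0 or innings <= 0 or balls_faced <= 0:
        raise ValueError("need at least one dismissal, innings and ball faced")
    return {
        "BA": runs / dismissals,                       # equation (1)
        "BS": runs / balls_faced,                      # equation (2)
        "MRA": (hundreds + fifties) / innings,         # equation (3)
        "Outrate": dismissals / balls_faced,           # equation (4)
        "BRPI": (4 * fours + 6 * sixes) / innings,     # equation (5)
    }


# Made-up record, used only to show the call.
example = batting_features(runs=500, innings=12, not_outs=2, balls_faced=625,
                           hundreds=1, fifties=3, fours=40, sixes=10)
print(example)
```

Raising an error on a zero denominator is a design choice of this sketch; the slides do not say how such players are handled.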
Cont….

❏ Bat gen avg = Tr / Tw   (6)

❏ Bat gen outrate = Tw / Tb   (7)

❏ Bat gen sr = Tr / Tb   (8)

❏ AGR = (tbatsman - Bat gen sr * nb) + Bat gen avg * nb * (Bat gen outrate - outrate)   (9)

❏ RI = AGR / (10 * Bat gen avg)   (10)
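As a rough sketch, equations (6) to (10) chain together as follows. The symbols are not defined on the slide, so the reading used here is an assumption: Tr, Tw and Tb are the total runs, wickets and balls over all batsmen, tbatsman is the runs scored by the individual batsman, and nb is the number of balls he faced. All function and parameter names are hypothetical.

```python
def batting_ranking_index(player_runs, player_balls, player_outs,
                          total_runs, total_wickets, total_balls):
    """Batting ranking index RI, following equations (6)-(10).

    Assumed symbol meanings: Tr/Tw/Tb = total_runs/total_wickets/total_balls,
    tbatsman = player_runs, nb = player_balls.
    """
    gen_avg = total_runs / total_wickets          # (6) generic average
    gen_outrate = total_wickets / total_balls     # (7) generic out rate
    gen_sr = total_runs / total_balls             # (8) generic strike rate
    outrate = player_outs / player_balls          # (4) player's own out rate
    # (9): runs scored above the generic rate, plus credit for being
    # dismissed less often than the generic batsman.
    agr = (player_runs - gen_sr * player_balls) \
        + gen_avg * player_balls * (gen_outrate - outrate)
    return agr / (10 * gen_avg)                   # (10)


# Made-up numbers: a batsman who scores faster and gets out less than average.
ri = batting_ranking_index(player_runs=500, player_balls=500, player_outs=10,
                           total_runs=6000, total_wickets=200, total_balls=7500)
print(round(ri, 4))
```

Under this reading, a batsman whose strike rate and out rate both equal the generic values gets an index of zero.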


Bowling measures and bowler ranking index system:

FEATURES:

❏ Bowl avg = Runs conceded / Wickets taken   (11)

❏ Bowl sr = Balls bowled / Wickets taken   (12)

❏ Bowl er = Runs conceded / Overs bowled   (13)

❏ Outrate = Wickets taken / Balls bowled   (14)

❏ Bowl gen avg = tc / tw   (15)

❏ Bowl gen outrate = tw / tb   (16)

❏ Bowl gen sr = tc / tb   (17)

❏ AGR = (Bowl gen sr * tb - tc) + Bowl gen avg * tb * (Bowl gen outrate - outrate)   (18)

❏ RI = AGR / (10 * Bowl gen avg)   (19)
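The bowling index can be sketched in the same way. The slide reuses tb and tc in equations (15) to (18) without saying which are totals and which belong to the bowler, so this sketch assumes that the generic values in (15) to (17) use totals and that tb and tc in (18) are the individual bowler's balls bowled and runs conceded. All names are hypothetical, and the sign of the second term follows equation (18) exactly as written.

```python
def bowling_ranking_index(runs_conceded, balls_bowled, wickets,
                          total_conceded, total_wickets, total_balls):
    """Bowling ranking index RI, following equations (14)-(19).

    Assumption: equations (15)-(17) use totals over all bowlers, while
    tb and tc in equation (18) are this bowler's own balls and runs.
    """
    gen_avg = total_conceded / total_wickets      # (15)
    gen_outrate = total_wickets / total_balls     # (16)
    gen_sr = total_conceded / total_balls         # (17)
    outrate = wickets / balls_bowled              # (14)
    # (18), with the operand order of the last factor kept as on the slide.
    agr = (gen_sr * balls_bowled - runs_conceded) \
        + gen_avg * balls_bowled * (gen_outrate - outrate)
    return agr / (10 * gen_avg)                   # (19)


# Made-up numbers, used only to show the call.
ri = bowling_ranking_index(runs_conceded=350, balls_bowled=500, wickets=10,
                           total_conceded=6000, total_wickets=200,
                           total_balls=7500)
print(round(ri, 4))
```

With the factor written as (Bowl gen outrate - outrate), a bowler who takes wickets more often than average lowers the second term; if the intent is to reward wicket-taking, the two operands would need to be swapped.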

Input & Output features for the Machine Learning Algorithm:

❏ The data set has 12 features in total, but only two attributes are known before a match and can be used to select or reject a player:

❏ Batting Index

❏ Bowling Index

❏ Thus the input x is (BatIndex, BallIndex).

For k-means clustering there are five clusters as output:

❏ Cluster [0-1]: Batsman


❏ Cluster [2-3]: All-Rounder
❏ Cluster [4] : Bowler
3.1 Implementation Steps
After feature extraction from the raw data we obtain the batting index and the bowling index.
Applying Unsupervised Machine Learning

Data Representation before Clustering


After Applying k-Means Clustering
❏ There are five Clusters for the players.
❏ Cluster [0-1]: Batsman
❏ Cluster [2-3]: All-Rounder
❏ Cluster [4] : Bowler
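The clustering step can be illustrated with a minimal pure-Python k-means (Lloyd's algorithm) on (BatIndex, BallIndex) pairs. This is a sketch, not the project's code: the function name, the sample points and the choice of initial centroids are all made up, and a real run would more likely use a library implementation. Note that k-means cluster numbers are arbitrary, so the mapping of clusters to roles has to be read off the result.

```python
import math

# Role assigned to each cluster id, as listed on the slide.
ROLE_BY_CLUSTER = {0: "Batsman", 1: "Batsman",
                   2: "All-Rounder", 3: "All-Rounder",
                   4: "Bowler"}


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:   # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:            # an empty cluster keeps its old centroid
                centroids[k] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids


# Hypothetical (BatIndex, BallIndex) pairs, ordered from pure batsmen
# to pure bowlers; every second point seeds one of the five clusters.
points = [(0.9, 0.1), (1.0, 0.0), (0.7, 0.2), (0.6, 0.3), (0.5, 0.5),
          (0.45, 0.55), (0.3, 0.6), (0.2, 0.7), (0.0, 1.0), (0.1, 0.9)]
labels, centroids = kmeans(points, initial_centroids=points[0::2])
for point, label in zip(points, labels):
    print(point, label, ROLE_BY_CLUSTER[label])
```

Passing the initial centroids explicitly keeps the example deterministic; random initialisation is the more common choice.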
Predicting the Clusters of State Level Players:
We cannot directly cluster the data for State Level players, because their records come from state level matches, which are considerably easier than international matches. We therefore need a way to bring their performance onto the same scale as the ODI players. We use a weighted performance for State players, obtained by multiplying the original indexes by an alpha coefficient.
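A minimal sketch of this step is shown below. The slides do not give the value of alpha or the exact prediction rule, so both are assumptions here: alpha = 0.8 is a placeholder, the centroids are made up, and the scaled player is simply assigned to the nearest ODI cluster centroid.

```python
import math


def weighted_indexes(bat_index, bowl_index, alpha):
    """Scale a State player's indexes by alpha to the ODI performance scale."""
    return (alpha * bat_index, alpha * bowl_index)


def predict_cluster(point, centroids):
    """Return the id of the centroid nearest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: math.dist(point, centroids[k]))


# Hypothetical centroids of the five ODI clusters, as (BatIndex, BallIndex).
odi_centroids = [(0.95, 0.05), (0.65, 0.25), (0.475, 0.525),
                 (0.25, 0.65), (0.05, 0.95)]

ALPHA = 0.8  # placeholder value, not taken from the slides

state_player = weighted_indexes(bat_index=0.3, bowl_index=0.8, alpha=ALPHA)
print(state_player, predict_cluster(state_player, odi_centroids))
```

Nearest-centroid assignment is only one option; a classifier trained on the clustered ODI players would serve the same purpose.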
3.2.3.1 Selection of Players

After performing all of the above steps, we have clustered data for all players, ODI players as well as State players. For the selection of State players we have defined two methods: Overall Top Players and State-wise Top Players. We select the required number of Batsmen from clusters 0 and 1, All-Rounders from clusters 2 and 3, and Bowlers from cluster 4. For the selection we have defined a selector function, which takes the number of required players for each category and makes the selection.
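A selector of this kind might look like the sketch below. The function name, the player record layout and the ranking rule inside each category are assumptions: here batsmen are ranked by batting index, bowlers by bowling index, and all-rounders by the sum of both, which the slides do not specify.

```python
# Role assigned to each cluster id, as listed on the earlier slide.
ROLE_BY_CLUSTER = {0: "Batsman", 1: "Batsman",
                   2: "All-Rounder", 3: "All-Rounder",
                   4: "Bowler"}

# Assumed ranking score per role (not specified in the slides).
SCORE_BY_ROLE = {
    "Batsman": lambda p: p["bat_index"],
    "All-Rounder": lambda p: p["bat_index"] + p["bowl_index"],
    "Bowler": lambda p: p["bowl_index"],
}


def select_players(players, required):
    """Pick the top `required[role]` players of each role.

    players:  list of dicts with keys name, cluster, bat_index, bowl_index
    required: dict mapping role name to the number of players wanted
    """
    team = {}
    for role, count in required.items():
        candidates = [p for p in players
                      if ROLE_BY_CLUSTER[p["cluster"]] == role]
        candidates.sort(key=SCORE_BY_ROLE[role], reverse=True)
        team[role] = [p["name"] for p in candidates[:count]]
    return team


# Made-up players, used only to show the call.
players = [
    {"name": "A", "cluster": 0, "bat_index": 0.9, "bowl_index": 0.1},
    {"name": "B", "cluster": 1, "bat_index": 0.7, "bowl_index": 0.2},
    {"name": "C", "cluster": 0, "bat_index": 0.8, "bowl_index": 0.0},
    {"name": "D", "cluster": 2, "bat_index": 0.5, "bowl_index": 0.5},
    {"name": "E", "cluster": 3, "bat_index": 0.3, "bowl_index": 0.6},
    {"name": "F", "cluster": 4, "bat_index": 0.0, "bowl_index": 1.0},
    {"name": "G", "cluster": 4, "bat_index": 0.1, "bowl_index": 0.9},
]
team = select_players(players, {"Batsman": 2, "All-Rounder": 1, "Bowler": 1})
print(team)
```

A State-wise variant could apply the same function to the players of each state separately.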
User Interface and Working with the System
❏ First, users log in to the application by clicking on Login here.
❏ The login credentials are validated by the back end. If they are valid the user is logged in; otherwise access is denied.

Screen After logging in


TRAIN-TEST SPLIT FOR KNN CLASSIFIER:
❑ The dataset is split into two parts:

❑ Training dataset: A subset of the dataset used to train the model. (70%)

❑ Testing dataset: A subset of the dataset used to test the trained model. (30%)
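The 70/30 split and a k-nearest-neighbour classifier can be sketched in plain Python as follows. The function names, the fixed random seed, k = 3 and the sample rows are assumptions for illustration; the slides do not state the value of k or how the rows were shuffled.

```python
import math
import random
from collections import Counter


def split_dataset(rows, test_fraction=0.3, seed=0):
    """Shuffle rows and return (train, test) with the given test share."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, point, k=3):
    """Majority label among the k training rows nearest to point."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Share of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Made-up rows: ((BatIndex, BallIndex), cluster label).
rows = [((0.9, 0.1), 0), ((1.0, 0.0), 0), ((0.95, 0.05), 0),
        ((0.85, 0.1), 0), ((0.9, 0.0), 0),
        ((0.0, 1.0), 4), ((0.1, 0.9), 4), ((0.05, 0.95), 4),
        ((0.1, 0.85), 4), ((0.0, 0.9), 4)]
train, test = split_dataset(rows)
print(len(train), len(test), accuracy(train, test))
```

Here the labels produced by the clustering step serve as the classes the KNN classifier learns to predict.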


5.1 CONCLUSION:

By using a machine learning algorithm, we converted the manual selection of a cricket team into an automated selection.

The team selected by the machine learning algorithm is close to the BCCI's team. The results also suggest that 100% accuracy is not attainable with an ML algorithm. The accuracy of this model ranges from about 70 to 90 percent, which we consider good enough.
FUTURE WORK

Many areas of this project remain open for research because of the limited time period. Enhancing the features used for player selection is the key idea. To map onto the real-world selection process, more features need to be considered, along with research into a model or method that can assign weights to the features. More tuples are also needed so that the analysis yields better results and accuracy.

We can extend the system to manage dynamic match data, where an admin adds the data after each match so that the analysis stays up to date.