
STATISTICAL INFERENCES AND REGRESSION ANALYSIS IN CRICKET

SUBMITTED BY

GAGANDEEP SINGH - 12PGP015
MANOJ H - 12PGP026
NIKESH AGARWAL - 12PGP030
SOURAV MONDAL - 12PGP042
VIJAYKRISHNAN G - 12PGP016

ABSTRACT
Cricket is a sport that makes extensive use of statistical tools for the representation and analysis of data. In this project, we set out to find how the impact of the toss on the result differs between day and day-night matches. For this statistical inference, we used hypothesis testing for two populations to compare the means of the day and day-night populations. The findings showed no significant difference in the impact of the toss on the result between day and day-night matches. We also estimated, with ninety percent confidence, the likely interval for the runs scored by the Indian team while chasing against Pakistan, using single population estimation. The population for this estimate consisted of all the matches in which India faced Pakistan and batted second. In addition, we studied the compensation of IPL players and tried to establish the relationship between the players' skill, as measured by their statistical attributes, and the compensation they are paid, using simple linear regression and multiple linear regression analysis.

GAGANDEEP SINGH – 12PGP015 (pgp12015.gagandeep@iimraipur.ac.in)

VIJAY KRISHNAN G - 12PGP016 (pgp12016.vijay@iimraipur.ac.in)

MANOJ H - 12PGP026 (pgp12026.manoj@iimraipur.ac.in)

NIKESH AGARWAL - 12PGP030 (pgp12030.nikesh@iimraipur.ac.in)

SOURAV MONDAL - 12PGP042 (pgp12042.sourav@iimraipur.ac.in)

ACKNOWLEDGEMENT

We would like to sincerely thank Prof. B. S. Sahay, Director of Indian Institute of Management Raipur, for rendering his support during the entire project period. We would also like to thank our beloved Prof. Naval Bajpai, Indian Institute of Management Raipur, for his valuable guidance in this project right from the conception till the completion of the same. We also thank all the anonymous referees for their valuable comments on the report. Last but not the least, we thank our classmates for their encouragement and support.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1 INTRODUCTION
1.1 CRICKET
1.2 STATISTICS IN CRICKET
1.2.1 INDIVIDUAL STATISTICS
1.2.2 TEAM STATISTICS
1.3 APPLICATION OF TOOLS
1.3.1 PIE CHART
1.3.2 WAGON-WHEEL
1.3.3 WORM GRAPH
1.3.4 MANHATTAN CHART
1.4 OBJECTIVE OF THE PROJECT
1.5 STATISTICAL TOOLS EMPLOYED
1.5.1 CHARTS AND GRAPHS
1.5.2 SINGLE POPULATION ESTIMATION
1.5.3 HYPOTHESIS TESTING FOR TWO POPULATION
1.5.4 SIMPLE LINEAR REGRESSION
1.5.5 MULTIPLE LINEAR REGRESSION

CHAPTER 2 LITERATURE REVIEW

CHAPTER 3 RESEARCH METHODOLOGY
3.1 WINNING PERCENTAGE USING PIE CHART
3.1.1 OBJECTIVE
3.1.2 POPULATION
3.1.3 PIE CHART
3.1.4 INFERENCES
3.2 CAPTAINCY RECORD CALCULATION USING BAR CHART
3.2.1 OBJECTIVE
3.2.2 POPULATION
3.2.3 INFERENCES
3.3 ACHIEVABLE SCORE AT THE END OF 50 OVERS
3.3.1 POPULATION AND SAMPLING
3.3.2 TECHNIQUE EMPLOYED
3.4 DIFFERENCE IN IMPACT OF TOSS BETWEEN DAY AND DAY-NIGHT MATCHES
3.4.1 POPULATION AND SAMPLING
3.4.2 TECHNIQUE EMPLOYED
3.5 VALUATION OF PLAYERS IN IPL
3.5.1 REGRESSION

CHAPTER 4 STATISTICAL ANALYSIS AND INTERPRETATION
4.1 ESTIMATION OF SINGLE POPULATION
4.1.1 SET NULL AND ALTERNATE HYPOTHESIS
4.1.2 DETERMINE APPROPRIATE STATISTICAL TEST
4.1.3 LEVEL OF SIGNIFICANCE
4.1.4 SET THE DECISION RULE
4.1.5 COLLECTION OF DATA
4.1.6 ANALYZE THE DATA
4.1.7 STATISTICAL CONCLUSION AND BUSINESS IMPLICATION
4.2 HYPOTHESIS TESTING FOR TWO POPULATION
4.2.1 SET NULL AND ALTERNATE HYPOTHESIS
4.2.2 DETERMINE APPROPRIATE STATISTICAL TEST
4.2.3 LEVEL OF SIGNIFICANCE
4.2.4 SET THE DECISION RULE
4.2.5 COLLECTION OF DATA
4.2.6 ANALYZE THE DATA
4.2.7 STATISTICAL CONCLUSION AND BUSINESS IMPLICATION
4.3 REGRESSION ANALYSIS OF IPL VALUATION OF PLAYERS
4.4 REGRESSION ANALYSIS
4.4.1 AMOUNT VERSUS STRIKE RATE
4.4.2 ANALYSIS OF VARIANCE
4.4.3 AMOUNT VERSUS WICKETS, STRIKE RATE
4.4.4 ANALYSIS OF VARIANCE
4.5 DESCRIPTION OF STATISTICS OF BATSMAN
4.5.1 AMOUNT (IN US DOLLARS) VERSUS RUNS
4.5.2 ANALYSIS OF VARIANCE
4.5.3 AMOUNT (IN US DOLLARS) VERSUS RUNS, AVERAGE
4.5.4 ANALYSIS OF VARIANCE

CHAPTER 5 DISCUSSIONS
5.1 BOWLERS
5.2 BATSMEN
5.2.1 REASONS FOR NON-EXPLANATION

CHAPTER 6 CONCLUSION
6.1 LIMITATIONS
6.2 FUTURE SCOPE

REFERENCES

LIST OF FIGURES

FIGURE 3.1 PIE CHART FOR WINNING PERCENTAGE
FIGURE 4.1 RESIDUAL PLOTS FOR BOWLERS
FIGURE 4.2 RESIDUAL PLOTS FOR AMOUNT

LIST OF TABLES

TABLE 3.1 POPULATION DATA
TABLE 3.2 INDIA'S WINNING RECORD UNDER MS DHONI
TABLE 3.3 MS DHONI'S CAPTAINCY RECORD
TABLE 4.1 DISTRIBUTION PLOT
TABLE 4.2 DESCRIPTION OF VARIABLES
TABLE 4.3 DESCRIPTION OF STATISTICS OF BOWLERS
TABLE 4.4 BATSMAN STATISTICS

CHAPTER 1 INTRODUCTION

1.1 CRICKET

The game of cricket has fascinated the minds of many statisticians simply because of the sheer amount and variety of statistics it generates, which upon in-depth analysis tend to reveal a lot of clues on how the game has evolved over the years. For example, with the help of the game's statistics, taken over a period of time, one can predict the impact of a particular player on the outcome, and that would serve as the performance indicator of the player. Similarly, venue-based and team-based statistics could be arrived at.

1.2 STATISTICS IN CRICKET

The applications of statistics in cricket are very diverse, ranging from analysis of a team's or player's performance in a particular match or over a period of time, to a comprehensive study of the evolution of the various aspects of the game, based on the analysis of general statistics across the different formats of cricket. Individual statistics are recorded for each player during a match, and aggregated over a career for batting and bowling across formats. Team statistics are recorded and maintained separately for various teams in the different formats of cricket: Test matches, One Day Internationals, Twenty20s, First-Class matches and List-A matches. Test matches are the international variant of First-Class matches, and hence the corresponding statistics are included in the First-Class statistics of an individual or team. Similarly, One Day Internationals are a variant of List-A matches, and hence the corresponding statistics are included in the List-A statistics of an individual or team.

1.2.1 INDIVIDUAL STATISTICS

They are generally calculated for each individual player, either for a certain set of matches or aggregated over his career.

o Matches Played
o Runs Scored
o Highest Score
o Batting/Bowling Averages
o Centuries, Strike Rate
o Maiden Overs
o Economy Rate
o Best Bowling
o Wickets
o Partnerships
o Catches & Stumpings
o Captaincy Statistics

1.2.2 TEAM STATISTICS

They are generally calculated for the whole team taken together, taking all the individual players' statistics into account.

o Match Results
o Result Margins
o Series Results
o Innings Totals
o Match Scores
o Run Rate
o Extras etc.

1.3 APPLICATION OF TOOLS

Of late, the impact of television coverage on the sport has been profound, and it has provided a huge impetus to develop interesting forms of statistical representation for the viewers. The television networks are thus engaged in pioneering several innovative ways of presenting cricket statistics. Some of the most widely used new forms of statistical representation include:

1.3.1 PIE CHART

The pie chart is one of the most widely used methods of representing cricket statistics. It is a circular chart which is subdivided into sectors. The size of each sector depends on the proportion of the total quantity it represents. For example, the extras can be presented as a pie chart with the different sectors representing the leg-byes, no-balls, wides etc.

1.3.2 WAGON-WHEEL

It displays a 2D or 3D plot of the various shots or runs scored by a player or team upon an overhead view of the cricket field.

1.3.3 WORM GRAPH

This is used to represent the runs scored and wickets taken during an innings, plotted against the time or balls bowled during a match.

1.3.4 MANHATTAN CHART

This is used to represent the runs scored and wickets in each over during a match. It is a variant of the bar graph/histogram, and it is named the Manhattan Chart because of its similarity to the Manhattan skyline.

With the help of various tools like the ones mentioned above, many methods are devised by cricket pundits to analyse the statistics, and then to use statistical inferences to arrive at estimations and predictions about the game. Thereafter, the purpose is to make the viewer understand clearly the impact of statistics on the game of cricket.

1.4 OBJECTIVE OF THE PROJECT

The main objective of this project is to illustrate the application of statistical inferences and regression analysis in cricket, and to perform a pre-match analysis. The case considered is an India-Pakistan cricket match. The prediction of the outcome of the game is done in two stages:

a) In the pre-match analysis, all the One Day Internationals between India and Pakistan which ended in a result so far are taken into account, the results are represented using a pie chart, and the proportion of results in each team's favor is interpreted. Since the data represented in the pie chart was taken from matches spread across a long duration of time, the head-to-head record advantage of Pakistan would not have a significant say in the outcome of the game, and hence another type of statistic could be considered to perform the analysis. The wins, losses and other results achieved by Team India under the leadership of MS Dhoni are considered, and represented using the bar chart, which could be used to understand the extremely high win-loss ratio of MS Dhoni. Then, using two-population hypothesis testing, a prediction is made as to whether there would be a difference in the impact of the toss between day and day-night matches.

b) During the innings break, an achievable target score range for India is estimated with a confidence level of ninety percent.

In addition, a regression analysis is carried out to determine whether the pricing of the players in the IPL auction is explained fully by the various parametric statistics of the individual players or whether the pricing is influenced by other factors as well.

1.5 STATISTICAL TOOLS EMPLOYED

1.5.1 CHARTS AND GRAPHS

A chart is a graphical representation of data, in which the data is represented by symbols, such as bars in a bar chart, lines in a line chart, or slices in a pie chart. A chart can represent tabular numeric data, functions or some kinds of qualitative structures. Charts are often used to ease understanding of large quantities of data and the relationships between parts of the data. Charts can usually be read more quickly than the raw data that they are produced from.

1.5.2 SINGLE POPULATION ESTIMATION

The Z statistic can be used in the calculation of prediction intervals. A prediction interval, consisting of a lower endpoint and an upper endpoint, is an interval such that a future observation X will lie in the interval with high probability.

1.5.3 HYPOTHESIS TESTING FOR TWO POPULATION

A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study. In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level.

1.5.4 SIMPLE LINEAR REGRESSION

In statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. In other words, simple linear regression fits a straight line through the set of n points in such a way that the sum of squared residuals of the model is as small as possible.

1.5.5 MULTIPLE LINEAR REGRESSION

Multiple linear regression uses more than one explanatory variable, with the coefficients again estimated by least squares.
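For reference, the least squares estimators described in 1.5.4 and 1.5.5 can be written out as follows. The notation is an assumption made for this illustration and does not appear in the report: $x_i$ and $y_i$ are the observed explanatory and response values, $b_0$ and $b_1$ the fitted intercept and slope, and $X$ the design matrix with a leading column of ones.

```latex
% Simple linear regression: fitted line  \hat{y}_i = b_0 + b_1 x_i
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
           {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}

% Multiple linear regression: the coefficient vector minimising the
% sum of squared residuals  \lVert \mathbf{y} - X\mathbf{b} \rVert^2
\hat{\mathbf{b}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```

Both expressions minimise the same quantity, the sum of squared residuals. The simple case is the multiple case with a single explanatory column.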

CHAPTER 2 LITERATURE REVIEW

Estenson et al. (1994) and Bennett and Flueck (1983) have studied players' compensation in the game of baseball. Similarly, Dobson and Goddard (1998) and Kahn (1992) considered compensation for players in football. Jones and Walsh (1988) made similar studies in ice hockey and concluded that skills are the principal determinant of salaries at all positions. Berri (1999) addresses the question of measuring the productivity of an individual participating in a team sport, linking players' statistics in the National Basketball Association (NBA) to team wins. An economic model is employed in the measurement of each player's marginal product. Such a study is useful in answering the question offered in the title, or a broader list of questions by both industry insiders and other interested observers. Results of auctions have shown that salaries matched marginal revenue products and that the open auction showed the declining price anomaly found to exist in real-world auctions.

There are a few studies which deal with the game of cricket. Barr and Kantor (2004) set out to determine the important skill set for a batsman in one-day cricket. The batting average statistic has been used to assess the worth of a batsman. However, in the one-day game, limits on the number of balls bowled have introduced a very important additional dimension to performance. Assessing batting performance in the one-day game requires at least a two-dimensional measurement approach because of the time dimension imposed on limited-overs cricket. They used a new graphical representation with strike rate on one axis and the probability of getting out on the other, akin to the risk-return framework used in portfolio analysis, to obtain useful, direct and comparative insights into batting performance, particularly in the context of the one-day game. However, we have not come across any study that links compensation to player attributes.

Rosen (1974) based his model of product differentiation on the hypothesis that goods are valued for their utility-generating attributes. According to him, while making a purchase decision, consumers evaluate product quality attributes and pay the sum of implicit prices for each quality attribute, which is reflected in the observed market price. Hence, the price of a product is nothing but the summation of the prices of all quality attributes. Shapiro (1983) presented a theoretical framework to examine the halo effect on prices. Developing an equilibrium price-quality schedule for high-quality products, under the assumption of competitive markets and imperfect information, he showed that reputation facilitates a price premium; hence, reputation building can be considered an investment good. Weemaes and Riethmuller (2001) studied the role of quality attributes in preferences for fruit juices. The study involved market valuation of various attributes of fruit juice. It did not consider consumers' preferences, but generated quality attributes from the product label. The study revealed that consumers paid a premium for nutrition, convenience, and information. In a similar study on tea, Deodhar and Intodia (2004) showed that colour and aroma were the two important attributes of a prepared tea.

Extending the analogy to cricket, a cricket player is valued for his on-the-field (and perhaps off-the-field) performance. We propose that a cricket player sells his cricketing skills for the IPL tournament. The franchisee team owners bid for the players' services, for they would like to maximize their utility, and player performance is an important argument of their utility function. In equilibrium, the final bid price of a player must be a function of the valuation of the winning attributes of a player.

CHAPTER 3 RESEARCH METHODOLOGY

3.1 WINNING PERCENTAGE USING PIE CHART

3.1.1 OBJECTIVE

To give a clear representation of the matches ending in a result between India and Pakistan in the ODI matches played so far.

3.1.2 POPULATION

We consider all the matches played so far. This gives 117 matches, excluding four matches which ended in no result.

Table 3.1 Population Data

Total Matches    Won by India    Won by Pakistan
117              48              69

3.1.3 PIE CHART

A pie chart is the best way to represent the above data.

[Pie chart: Total Matches: 117; India 41%, Pakistan 59%]

Figure 3.1 Pie Chart for Winning Percentage

3.1.4 INFERENCES

The above pie chart shows that, of the total number of matches, Pakistan won more, with a win percentage of 59%, while India won 41%.
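The percentages shown in the pie chart follow directly from the counts in Table 3.1. A minimal sketch of that arithmetic is given below; the variable names are illustrative and not taken from the report.

```python
# Match counts from Table 3.1 (no-result matches are already excluded).
results = {"India": 48, "Pakistan": 69}

total = sum(results.values())  # matches that ended in a result

# Share of decided matches won by each team, rounded to a whole percent.
shares = {team: round(100 * wins / total) for team, wins in results.items()}

for team, pct in shares.items():
    print(f"{team}: {pct}% of {total} matches")
```

Rounding to whole percentages matches the precision used in Figure 3.1.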

3.2 CAPTAINCY RECORD CALCULATION USING BAR CHART

3.2.1 OBJECTIVE

The objective is to find the best way to represent the captaincy record of MS Dhoni.

3.2.2 POPULATION

The number of matches won or lost by India under the captaincy of Mahendra Singh Dhoni is taken as the population, and the graph is drawn for the same.

Table 3.2 India's Winning Record under MS Dhoni

Total Matches    Won    Lost    Tied    No result
117              80     32      2       3

A bar chart is the best tool to represent the above data.

[Bar chart: Won 80, Lost 32, Tied 2, No result 3]

Table 3.3 MS Dhoni's Captaincy Record

3.2.3 INFERENCES

From the above bar chart we can see that under the captaincy of Mahendra Singh Dhoni India played a total of 117 matches, of which India won 80, lost 32 and tied 2. For 3 of the matches there was no result.

3.3 ACHIEVABLE SCORE AT THE END OF 50 OVERS

In the game of cricket, the team chasing wins when it exceeds the score made by the opponent. For a successful chase, the team batting second must score more than the team batting first. Thus, we set out to find the runs that the Indian team could score while chasing a target against Pakistan.

3.3.1 POPULATION AND SAMPLING

The data of that particular team while chasing was considered as the population.

3.3.2 TECHNIQUE EMPLOYED

Estimation of a single population mean was applied to get the intended result. We were able to estimate the mean with a confidence level of 90%.

3.4 DIFFERENCE IN IMPACT OF TOSS BETWEEN DAY AND DAY-NIGHT MATCHES

We set out to study the impact of the toss in day and day-night matches played between India and Pakistan.

3.4.1 POPULATION AND SAMPLING

We used the data of matches played between India and Pakistan as the population. From the population we applied the technique of random sampling and arrived at a sample size of 38 for each of the two populations, day and day-night matches.

3.4.2 TECHNIQUE EMPLOYED

Hypothesis testing for two populations was applied to study the difference between the two population means.

3.5 VALUATION OF PLAYERS IN IPL

Next, our objective was to find whether the valuation of players in the IPL matches their skills, or whether they are over- or under-valued for their skill.

3.5.1 REGRESSION

We developed a regression model to find the relationship between a player's compensation and their skills. We chose a sample consisting of 8 batsmen and 8 bowlers and developed the regression.

CHAPTER 4 STATISTICAL ANALYSIS AND INTERPRETATION

4.1 ESTIMATION OF SINGLE POPULATION

4.1.1 SET NULL AND ALTERNATE HYPOTHESIS

In this step, we are trying to predict whether India will be able to successfully chase a total of 245 runs in 50 overs.

4.1.2 DETERMINE APPROPRIATE STATISTICAL TEST

As the sample size (64) is greater than 30, we use the z-test for a single population mean.

4.1.3 LEVEL OF SIGNIFICANCE

Alpha = 0.10

4.1.4 SET THE DECISION RULE

For α = 0.10, the value of Z from the z distribution table is ±1.645. The null hypothesis will be rejected if the computed value of z is outside ±1.645.

4.1.5 COLLECTION OF DATA

Sample size (Runs): 64
Standard Deviation: 52.71
Mean of Sample: 243.68

4.1.6 ANALYZE THE DATA

From the data given, we estimate the single population mean with an assumed standard deviation of 53. We calculate the estimate using the formula $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.

Z        P        90% CI              SE Mean
-0.95    0.340    (232.78, 254.58)    6.63

4.1.7 STATISTICAL CONCLUSION AND BUSINESS IMPLICATION

With 90% confidence, India's mean score while chasing lies in the range (232.78, 254.58). Since the total of 245 falls within this range, we infer that India can chase down a total of 245 in 50 overs.
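The interval in 4.1.6 can be cross-checked with a few lines of Python. This is a minimal sketch using only the standard library, assuming the usual z-interval for a mean with known standard deviation. The variable names are illustrative, and the inputs are the figures quoted in 4.1.5 and 4.1.6.

```python
from math import sqrt
from statistics import NormalDist

n = 64                 # sample size (innings in which India chased)
sample_mean = 243.68   # mean of sample, from 4.1.5
sigma = 53.0           # assumed population standard deviation, from 4.1.6
confidence = 0.90
target = 245           # total under consideration

# Two-sided critical value of the standard normal (about 1.645 for 90%).
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

se_mean = sigma / sqrt(n)      # standard error of the mean
margin = z_crit * se_mean      # half-width of the interval
ci_low = sample_mean - margin
ci_high = sample_mean + margin

print(f"SE Mean = {se_mean:.3f}")
print(f"{confidence:.0%} CI = ({ci_low:.2f}, {ci_high:.2f})")
print("Target inside interval:", ci_low <= target <= ci_high)
```

The interval describes the mean chasing score, so the check on `target` only says that 245 is consistent with India's average performance when batting second.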

Table 4.1 Distribution Plot

4.2 HYPOTHESIS TESTING FOR TWO POPULATION

4.2.1 SET NULL AND ALTERNATE HYPOTHESIS

Null Hypothesis: µ1 - µ2 = 0 (No significant difference in runs scored)
Alternate Hypothesis: µ1 - µ2 ≠ 0

4.2.2 DETERMINE APPROPRIATE STATISTICAL TEST

As both sample sizes are greater than 30, the samples are independent and the population variances are unknown, we use the z-test for the difference between two population means.

4.2.3 LEVEL OF SIGNIFICANCE

Alpha = 0.10

4.2.4 SET THE DECISION RULE

For α = 0.10, the value of Z from the z distribution table is ±1.645. The null hypothesis will be rejected if the computed value of z is outside ±1.645.

4.2.5 COLLECTION OF DATA

Sample size 1: 38
Sample size 2: 38
Variance of sample 1: 2370.775
Variance of sample 2: 3119.37
Mean of sample 1: 7.539473684
Mean of sample 2: 5.039473684
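The test statistic for the data collected in 4.2.5 can be computed with a short script. This is a minimal sketch using only the standard library, assuming the large-sample z-test with sample variances standing in for the unknown population variances. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from 4.2.5.
n1, n2 = 38, 38
mean1, mean2 = 7.539473684, 5.039473684
var1, var2 = 2370.775, 3119.37
alpha = 0.10

std_normal = NormalDist()

# Standard error of the difference between the two sample means.
se_diff = sqrt(var1 / n1 + var2 / n2)

# z statistic under H0: mu1 - mu2 = 0.
z = (mean1 - mean2) / se_diff

z_crit_two_tail = std_normal.inv_cdf(1 - alpha / 2)
p_two_tail = 2 * (1 - std_normal.cdf(abs(z)))
reject_null = abs(z) > z_crit_two_tail

print(f"z = {z:.4f}")
print(f"two-tail critical value = {z_crit_two_tail:.4f}")
print(f"two-tail p-value = {p_two_tail:.4f}")
print("Reject H0:", reject_null)
```

Because the alternate hypothesis is two-sided, the decision uses the two-tail critical value.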

4.2.6 ANALYZE THE DATA

Z                        0.207988776
P(Z<=z) one-tail         0.417618865
Z Critical one-tail      1.281551566
P(Z<=z) two-tail         0.835237731
Z Critical two-tail      1.644853627

4.2.7 STATISTICAL CONCLUSION AND BUSINESS IMPLICATION

The observed z-value (0.208) is less than the tabular z-value of 1.645, so we fail to reject the null hypothesis. It can be concluded that there is no significant difference in the impact of the toss between day and day-night matches.

4.3 REGRESSION ANALYSIS OF IPL VALUATION OF PLAYERS

We hypothesized that the equilibrium final bid price of an IPL cricket player, where buyers (team owners) and sellers (cricket players) participate in an auction, is nothing but the sum total of the prices of his performance over a given period of time in the IPL. The hedonic price equation, in this context, is a locus of equilibrium final bid prices and player attributes, as given in equation (1):

$P_i = f(z_{i1}, \ldots, z_{ij}, \ldots, z_{in})$    (1)

where $P_i$ is the final bid price paid to cricketer i for the IPL tournament and $z_{ij}$ is the value of attribute j of cricket player i. Variables that capture the batting and bowling performances of cricket players must contribute to the players' final bid price.

IPL cricket being both a sport and a source of entertainment, the winning and crowd-pulling abilities of a player are crucial for the IPL. The higher the winning ability and crowd-pulling attributes of the players, the higher the revenue earned from the sale of tickets, broadcasting rights, sports merchandise, memorabilia, and advertisements. While non-cricketing attributes matter in the IPL, we have deliberately excluded them from the analysis, as measuring such qualitative attributes is a subjective process and, at the same time, cost and time constraints restrict the scope for a fair evaluation of such attributes. From the point of view of the organizers and the team owners, we postulate that cricketing attributes at their best must have been promoted and incentivized, which in turn must contribute to the final bid price of the players. Thus we have hypothesized that the final bid price, and the consequent success of the IPL, depends mainly on the core competencies of the cricket players, i.e., their cricketing attributes. This lets us identify which quality of a player acts as the crucial factor. Data on final bid prices and values of cricketing attributes of players are readily available for IPL 2008 to IPL 2012.

The data sources include the official website of the IPL and two other websites, Cricinfo and Wikipedia. There is a wealth of data available on the cricketing attributes of IPL players hypothesized above. The relevant variables are drawn from observations on skills that are considered important for the Twenty20 form of the game. For example, in this shorter version of the game, no one is likely to make centuries frequently. However, a player contributing many runs on a continuous basis and having a high batting average would be an asset for the team. While the IPL is a batsman's game, a wicket-taking bowler could put a lot of pressure on the opposition, and hence he would be considered quite useful. Based on such guidelines, the variables chosen for estimating equation (1) and their descriptions are reported in Table 4.2.

In choosing the variables, we considered which combination offered the best goodness of fit in terms of R-square, adjusted R-square, t-statistics, correct signs of the coefficients, and F-statistics. In other words, the estimated coefficients should have the right signs and be statistically significant, the equation should have a reasonably high (adjusted) R-square and maintain parsimony, and there should be sufficient degrees of freedom.

For the sake of convenience we have considered only 8 Indian players in each category, i.e. bowlers and batsmen. We have data on the individual performances of these 16 players spanning all the IPLs that have taken place to date. The final bid price is the dependent variable in each case; the independent variables are as follows.

1) Batsman: For the multiple regression analysis we have taken 2 important independent variables which are the prime determinants of the performances of the players in the long run. The two variables are the total runs scored and the batting average.

2) Bowlers: For the multiple regression analysis of bowlers we have also taken 2 important independent variables which are the prime determinants of the performances of the players in the long run. These are the wickets taken and the strike rate.

The exact specification of the regression is given below in Equation (2).

$P_{\text{batsman}} = b_0 + b_1(\text{Runs}) + b_2(\text{Average})$
$P_{\text{bowler}} = b_0 + b_1(\text{Wickets}) + b_2(\text{Strike Rate})$    (2)

Table 4.2 Description of Variables

Variable       Description
P              Final bid price of a player.
Runs           Total runs scored over a span of 5 IPLs.
Average        Average runs scored in the same period.
Wickets        Total number of wickets taken by a bowler in 5 IPLs.
Strike Rate    Strike rate, i.e. balls per wicket.

Table 4.3 Description of statistics of bowlers
Name of the bowler   Wickets   Strike rate   Amount (in US Dollars)
Harbhajan Singh      54        18.88         1300000
Ishant Sharma        36        29.33         450000
Munaf Patel          70        21.22         700000
Pragyan Ojha         69        25.63         500000
Praveen Kumar        53        22.35         800000
R. Ashwin            49        20.66         850000
R.P. Singh           74        19.78         500000
Zaheer Khan          65        19.22         900000

4.4 REGRESSION ANALYSIS
4.4.1 AMOUNT VERSUS STRIKE RATE
The regression equation is
Amount = 1899656 - 51941 Strike Rate

Predictor     Coef      SE Coef   T       P
Constant      1899656   529535    3.59    0.012
Strike Rate   -51941    23649     -2.20   0.070

S = 226442   R-Sq = 44.6%   R-Sq(adj) = 35.3%

4.4.2 ANALYSIS OF VARIANCE
Source           DF   SS            MS            F      P
Regression       1    2.47345E+11   2.47345E+11   4.82   0.070
Residual Error   6    3.07655E+11   51275908679
Total            7    5.55000E+11

Durbin-Watson statistic = 1.42023
dl = 0.76, du = 1.33, 4-du = 2.67, 4-dl = 3.24
Hence there is no autocorrelation.

4.4.3 AMOUNT VERSUS WICKETS, STRIKE RATE
The regression equation is
Amount = 3190691 - 13138 Wickets - 75398 Strike Rate

Predictor     Coef      SE Coef   T       P
Constant      3190691   716164    4.46    0.007
Wickets       -13138    5955      -2.21   0.078
Strike Rate   -75398    21286     -3.54   0.017

S = 176571   R-Sq = 71.9%   R-Sq(adj) = 60.7%

4.4.4 ANALYSIS OF VARIANCE
Source           DF   SS            MS            F      P
Regression       2    3.99114E+11   1.99557E+11   6.40   0.042
Residual Error   5    1.55886E+11   31177190382
Total            7    5.55000E+11

Source        DF   Seq SS
Wickets       1    7940674349
Strike Rate   1    3.91173E+11

Durbin-Watson statistic = 1.39505
Hence there is no autocorrelation.

Figure 4.1 Residual Plots for Bowlers (Residual Plots for Amount: Normal Probability Plot, Versus Fits, Histogram, Versus Order)
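The coefficients reported in 4.4.3 can be cross-checked by hand. What follows is a minimal Python sketch, not the procedure the statistical package uses internally: it solves the two normal equations of ordinary least squares for the eight bowlers of Table 4.3. The function names `centered_sum` and `fit_two_predictors` are illustrative, and the strike rates are assumed to be paired with the bowlers in the table's row order.

```python
# Bowler data from Table 4.3, in row order (Harbhajan Singh ... Zaheer Khan).
wickets = [54, 36, 70, 69, 53, 49, 74, 65]
strike_rate = [18.88, 29.33, 21.22, 25.63, 22.35, 20.66, 19.78, 19.22]
amount = [1300000, 450000, 700000, 500000, 800000, 850000, 500000, 900000]


def centered_sum(u, v):
    """Sum of products of deviations from the means of u and v."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_two_predictors(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 via the 2x2 normal equations."""
    n, k = len(y), 2
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y, syy = centered_sum(x1, y), centered_sum(x2, y), centered_sum(y, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n
    ss_regression = b1 * s1y + b2 * s2y      # explained sum of squares
    ss_error = syy - ss_regression           # residual sum of squares
    r_sq = ss_regression / syy
    return {
        "b0": b0,
        "b1": b1,
        "b2": b2,
        "r_sq": r_sq,
        "r_sq_adj": 1 - (1 - r_sq) * (n - 1) / (n - k - 1),
        "f": (ss_regression / k) / (ss_error / (n - k - 1)),
    }


fit = fit_two_predictors(wickets, strike_rate, amount)
print(f"Amount = {fit['b0']:.0f} + ({fit['b1']:.0f}) Wickets + ({fit['b2']:.0f}) Strike Rate")
print(f"R-Sq = {fit['r_sq']:.1%}, R-Sq(adj) = {fit['r_sq_adj']:.1%}, F = {fit['f']:.2f}")
```

Run as a script, the printed values should agree, up to rounding, with the equation, R-Sq, R-Sq(adj) and F reported in 4.4.3 and 4.4.4.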

4.5 DESCRIPTION OF STATISTICS OF BATSMAN
4.5.1 AMOUNT (IN US DOLLARS) VERSUS RUNS

Table 4.4 Batsman Statistics
Name of the Batsman   Runs   Average   Amount (in US Dollars)
M.S. Dhoni            1782   37.12     1300000
S.K. Raina            2254   33.64     1800000
V. Sehwag             1879   30.3      1800000
S.R. Tendulkar        2047   37.9      2400000
V. Kohli              1639   28.25     500000
R.G. Sharma           1975   31.35     2000000
G. Gambhir            2065   33.31     1800000
R. Dravid             1703   27.91     1800000

The regression equation is
Amount (in US Dollars) = -1663273 + 1740 Runs

Predictor   Coef       SE Coef   T       P
Constant    -1663273   1649219   -1.01   0.352
Runs        1740.5     855.5     2.03    0.088

S = 467406   R-Sq = 40.8%   R-Sq(adj) = 31.0%

4.5.2 ANALYSIS OF VARIANCE
Source           DF   SS            MS            F      P
Regression       1    9.04188E+11   9.04188E+11   4.14   0.088
Residual Error   6    1.31081E+12   2.18469E+11
Total            7    2.21500E+12

Durbin-Watson statistic = 2.59499
Thus there is no autocorrelation.
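The simple regression of 4.5.1 and its Durbin-Watson check can be reproduced in a few lines. This is a hedged sketch in plain Python; the helper names `fit_line` and `durbin_watson` are illustrative, and the upper bound du = 1.33 is assumed to be the one quoted in 4.4.2 for eight observations and one predictor.

```python
# Batsman data from Table 4.4, in row order (M.S. Dhoni ... R. Dravid).
runs = [1782, 2254, 1879, 2047, 1639, 1975, 2065, 1703]
amount = [1300000, 1800000, 1800000, 2400000, 500000, 2000000, 1800000, 1800000]


def fit_line(x, y):
    """Least squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    return numerator / sum(e * e for e in residuals)


intercept, slope = fit_line(runs, amount)
residuals = [a - (intercept + slope * r) for r, a in zip(runs, amount)]
dw = durbin_watson(residuals)

# Assumption: du = 1.33, the bound quoted in 4.4.2 for n = 8 and one predictor.
D_U = 1.33
no_autocorrelation = D_U < dw < 4 - D_U

print(f"Amount = {intercept:.0f} + {slope:.1f} Runs")
print(f"Durbin-Watson = {dw:.5f}, no autocorrelation: {no_autocorrelation}")
```

A statistic between du and 4-du is the usual reading of "no autocorrelation"; values below dl or above 4-dl would point the other way, and the bands in between are inconclusive.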

4.5.3 AMOUNT (IN US DOLLARS) VERSUS RUNS, AVERAGE
The regression equation is
Amount (in US Dollars) = -1946333 + 1562 Runs + 19235 Average

Predictor   Coef       SE Coef   T       P
Constant    -1946333   1992016   -0.98   0.373
Runs        1562       1080      1.45    0.207
Average     19235      59656     0.32    0.760

S = 506777   R-Sq = 42.0%   R-Sq(adj) = 18.8%

4.5.4 ANALYSIS OF VARIANCE
Source           DF   SS            MS            F      P
Regression       2    9.30888E+11   4.65444E+11   1.81   0.256
Residual Error   5    1.28411E+12   2.56822E+11
Total            7    2.21500E+12

Source    DF   Seq SS
Runs      1    9.04188E+11
Average   1    26699560534

Durbin-Watson statistic = 2.39651
Thus there is no autocorrelation.

Figure 4.2 Residual Plots for Amount (in US Dollars): Normal Probability Plot, Versus Fits, Histogram, Versus Order

CHAPTER 5
DISCUSSIONS

5.1 BOWLERS
We can see that the coefficient of determination is low, i.e. 44.6%, for strike rate as an individual factor (simple linear regression). In other words, only around 44.6% of the change in amount is explained by the strike rate of the bowler. The rest of the change is unexplained. This indicates the limited power of a bowler's strike rate to explain the amount paid to him in IPL. The coefficient of determination in the multiple regression model is 71.9% (60.7% adjusted), which still leaves more than a quarter of the change in amount unexplained. Hence the performance factors are not alone at the helm of the determination of the bid price of the bowlers, which is also shaped by the other factors discussed in 5.2.1.

H0: Key performance indicators (wickets and strike rate) are the key determinant of the amount paid to bowlers in IPL.
H1: Other factors act as the key determinants of the amount paid to the bowlers.

The corresponding p-value has been obtained as 0.042, which is below 0.05, so wickets and strike rate taken together are statistically significant at the 5% level. With only eight observations and the unexplained share noted above, however, H0 is supported only in part: the key conclusion that can be derived from the above exercise is that a variety of other reasons also contribute to the high amount of money paid to bowlers.

5.2 BATSMEN
The coefficient of determination for runs is quite low at 40.8%, which indicates that runs do not play a major role in fixing the payments to the cricketers. It implies that only 40.8% of the change in amount is explained by the runs scored by the batsman in the T-20 format. The rest of the change is unexplained. This indicates the limited power of the runs scored by a batsman to explain the amount paid to him in IPL. Similarly, the coefficient of determination in the multiple regression model is 42.0% (18.8% adjusted), which too is really low. Hence it can be concluded that the performance factors are not the key factors, as the majority of the amount depends upon various other factors.

H0: Key performance indicators (runs and average) are the key determinant of the amount paid to batsmen in IPL.
H1: Other factors act as the key determinants of the amount paid to the batsmen.

The corresponding p-value has been obtained as 0.256, which is above 0.05, so runs and average taken together are not statistically significant. The data therefore do not support H0, driving home the point that there are various other factors in operation which may be responsible for the amount of money being so high.

5.2.1 REASONS FOR NON-EXPLANATION
Some of the reasons which may account for non-explanation of the relation might be as follows:
1. Glamour.
2. Age.
3. Controversy.
4. Iconic Value.
5. Popularity.

These high premiums, over and above the compensation for their cricketing attributes, seem to be a reflection of their ability to draw huge crowds nationally due to their charismatic association with film stars, the racial controversies surrounding them etc.
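The two overall p-values quoted in 5.1 and 5.2 can be cross-checked without statistical tables, because the F distribution has a closed-form upper tail when the regression has exactly two predictors. The sketch below is illustrative: the function name `overall_f_p_value` is hypothetical, the F ratios are taken from the ANOVA tables in 4.4.4 and 4.5.4, and the 5% level is an assumption, since the report does not state the significance level it used.

```python
def overall_f_p_value(f_stat, residual_df):
    """Upper-tail probability of an F(2, residual_df) statistic.

    This closed form holds only for two numerator degrees of freedom,
    i.e. a regression with exactly two predictors.
    """
    return (1.0 + 2.0 * f_stat / residual_df) ** (-residual_df / 2.0)


ALPHA = 0.05  # assumed significance level; not stated in the report

# F = MS regression / MS residual error, from the ANOVA tables.
f_bowlers = 1.99557e11 / 31177190382   # wickets and strike rate (4.4.4)
f_batsmen = 4.65444e11 / 2.56822e11    # runs and average (4.5.4)

p_bowlers = overall_f_p_value(f_bowlers, residual_df=5)
p_batsmen = overall_f_p_value(f_batsmen, residual_df=5)

for label, p in (("bowlers", p_bowlers), ("batsmen", p_batsmen)):
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{label}: p = {p:.3f} ({verdict} at the {ALPHA:.0%} level)")
```

Under that assumed level, only the bowlers' regression clears the threshold; the null hypothesis of this F-test is that all slope coefficients are zero.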

CHAPTER 6
CONCLUSION

6.1 LIMITATIONS
The limitations of our study are:
1. The usage of the pie chart and the bar chart to represent the statistics for earlier India-Pakistan matches was appropriate, but when predictions are made with the help of those representative forms with respect to the current match, it is not exactly possible because of the inherent unpredictability in the game of cricket.
2. While estimating the achievable target score with a ninety percent confidence interval range, we take into account only the matches played already between the two teams, without considering other factors like the difference in the set of players between those games and the current match, the form in which the individual players are currently in, the nature of the pitch, weather conditions etc. This might result in incorrect range estimation.
3. In the determination of the difference in the impact of toss, we calculate the net run-rate difference between the teams batting first and second, and arrive at two populations, one each for the Day and Day-night matches. But in this case, the net run rate difference is calculated across the maximum overs for all the matches, and the event of teams chasing down targets easily without losing wickets is not explained through our population.
4. During the process of developing a regression model for determining the pricing of an IPL player based on his statistical attributes, there are many intangible attributes of an individual player. For example, a player's brand value, image and relevance to the franchise are all taken into account while determining his price. But these aspects are completely ignored in our study while determining the regression model. This probably explains the low correlation between the independent variables and the pricing of the player.

6.2 FUTURE SCOPE
Single population estimation could be used to estimate, with a stated confidence, the likely scores of players based on their previous performances. This could help the teams in formulating strategies against the opponent. The regression model could be applied to fix a player's compensation based on his skill set. This could help the team franchise to fix a ceiling price on each player before going in for auction. This could help them spend the money accordingly and thus achieve maximum return on money.
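As an illustration of the ceiling-price idea, the fitted equations from 4.4.3 and 4.5.3 can be evaluated for a player's statistics. This is only a sketch: the function names and the two example players are hypothetical, and the batsmen's equation in particular explains little of the variation in price, so its output is indicative at best.

```python
def predicted_bowler_price(wickets, strike_rate):
    """Fitted value (US dollars) from the bowlers' equation in 4.4.3."""
    return 3190691 - 13138 * wickets - 75398 * strike_rate


def predicted_batsman_price(runs, average):
    """Fitted value (US dollars) from the batsmen's equation in 4.5.3."""
    return -1946333 + 1562 * runs + 19235 * average


# Hypothetical players, invented for illustration only.
bowler_ceiling = predicted_bowler_price(wickets=60, strike_rate=20.0)
batsman_ceiling = predicted_batsman_price(runs=2000, average=32.0)

print(f"Bowler (60 wickets, strike rate 20.0): {bowler_ceiling:,.0f} USD")
print(f"Batsman (2000 runs, average 32.0): {batsman_ceiling:,.0f} USD")
```

A franchise using such figures would also need to allow for the intangible attributes listed among the limitations, which the equations ignore.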

REFERENCES

Armstrong, J and Willis, R J (1993). "Scheduling the Cricket World Cup: A Case Study," The Journal of the Operational Research Society, 44(11), 1067-1072.

Barr, G D I and Kantor, B S (2004). "A Criterion for Comparing and Selecting Batsmen in Limited Overs Cricket," Journal of the Operational Research Society, 55(12), 1266-1274.

Bennett, J M and Flueck, J A (1983). "An Evaluation of Major League Baseball Offensive Performance Models," The American Statistician, 37(1), 76-82.

Berri, D J (1999). "Who Is 'Most Valuable'? Measuring the Player's Production of Wins in the National Basketball Association," Managerial and Decision Economics, 20(8), 411-427.

Cricinfo. http://www.cricinfo.com/, as on September 13, 2012.

Estenson, P S (1994). "Salary Determination in Major League Baseball: A Classroom Exercise," Managerial and Decision Economics, 15(5), 537-541.

Jones, J C H and Walsh, W D (1988). "Salary Determination in the National Hockey League: The Effects of Skills, Franchise Characteristics, and Discrimination," Industrial and Labor Relations Review, 41(4), 592-604.

Rastogi, S. K. "Player Pricing and Valuation of Cricketing Attributes: Exploring the IPL Twenty20 Vision", Vikalpa, Volume 34, (April - June 2009), 15-23.