
Arokia Rexton C, PGDMB13028, 1st April 13

Research on Factors influencing Milk Consumption

Introduction
Global milk consumption has risen significantly in the past few years and is expected to keep rising at a compound annual growth rate (CAGR) of 2.2% over the next three years, according to a report by Tetra Pak, a food processing and packaging solutions company. India, where around 65% of the population is classed as "Deeper in the Pyramid", still consumes mostly loose milk, but that is changing, particularly in cities. India is one of the world's biggest milk consumers, and milk consumption in India is forecast to grow at a CAGR of 2.9% over 2011-2014. Part of that growth is expected to come from less affluent consumers buying dairy snacks and drinks in a country where white milk sales still account for the bulk of consumption.

This report analyzes the factors affecting milk consumption in Chennai, which has been one of the major consumers of dairy products among India's metropolitan cities. The Chennai metro region has expanded and its population is growing. To meet the increasing demand, the Tamil Nadu Cooperative Milk Producers Federation (TCMPF) plans to increase the sale of milk by another one lakh litres per day. It proposes to sell 11 lakh litres of milk per day in Chennai Metro and 10 lakh litres per day in the District Unions, in various pack sizes and varieties.

Selling price of Milk Products (2012-2013)

Study Objectives
The main objective of this study is to demonstrate how econometric analysis can be used for three specific purposes: (1) historical analysis, that is, quantifying the factors which determine demand for milk and evaluating the effects of price increases, the effectiveness of advertising, and other marketing activity; (2) forecasting annual consumption; and (3) regression analysis, that is, studying how the consumption of packaged milk changes with independent variables such as income level, family size, beverage drinking habit and socio-economic status.

The focus is on the consumption pattern of packaged milk by households, and on how income level, family size, beverage drinking habit and socio-economic status affect the overall demand for packaged milk in Chennai. An increase in demand normally leads to an increase in prices, which has been a common phenomenon in Chennai. However, the data shows that sales have increased along with the prices of packaged milk products, which appears to contradict the law of demand and makes the subject interesting to study. A further purpose of the report is to understand how closely milk consumption is related to the standard of living of consumers, as reflected in their income levels and socio-economic status.

Preliminary Examination of Data


The variables analyzed in the report are as follows.

Dependent Variable

Milk consumed by the households in Chennai: Packaged milk consumption has been rising steadily in Chennai even though milk prices are rising, which makes it an interesting area of study. The econometric analysis specifies the effect of the chosen independent variables on the consumption of packaged milk. Milk consumption is measured in litres, and the scale of measurement for the variable is a ratio scale.

Independent Variables

Income of the Households: Monthly income is a factor which affects the level of consumption of food items. However, studies indicate that milk is regarded as a necessity by consumers and so is not affected by income changes. Average income data has been used, and the scale of measurement is a ratio scale. The unit of measurement is the Indian Rupee (INR).

Socio-Economic Status: This is a combination of the education level, asset holdings, and standard of living of the individuals. It has been the experience of many developed nations that an increase in socio-economic status leads to an increase in individuals' consumption, part of which is an increase in the consumption of necessities. We therefore use this variable to study its impact on the consumption of milk, which is a necessity. The scale of measurement for this variable is a nominal scale.

Tea/Coffee drinking habits: This variable indicates whether people habitually consume caffeinated drinks such as tea and coffee. The habit affects the demand for tea and coffee, which in turn affects the demand for milk. Since tea and coffee are the major beverages consumed in India, the habit is used as a variable to study how it affects the consumption pattern of packaged milk products. It is a categorical variable and enters the regression model as a dummy variable.

Family Size: In theory, the size of the family correlates directly with the consumption of milk. The study examines whether this holds in the data. The scale of measurement for this variable is again a ratio scale.

Frequency of milk purchase: This variable measures the number of times a household purchases milk. It is a clear indicator of the milk consumption pattern among households in Chennai. Frequency of milk purchase can have an increasing as well as a decreasing effect on the litres of milk consumed, depending on the volume bought by the household. The scale of measurement for this variable is a ratio scale.

Preliminary Data Description


The describe command gives a detailed description of the data used for the analysis. The output table shows that there are 304 observations in the data. The variable list includes the average milk usage per day in a household, the total number of family members, how frequently they buy milk, whether they have the habit of drinking tea or coffee, their socio-economic classification, and their average monthly income. The table also includes the variable labels and the data format in which they are saved.

Summary Statistics
The summary statistics give the statistical details of the variables used in the analysis: the number of observations (304 for each variable), the mean, the standard deviation, and the minimum and maximum values. The categorical variable indicating whether the family members have the habit of drinking tea or coffee acts as the dummy variable: 0 indicates that they do not have the habit and 1 indicates that they do. Later in the report, the dummy variable is found not to have a significant influence on the variance in the amount of milk consumed. The details are listed below.
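Because the tea/coffee variable is coded 0/1, its mean and standard deviation can be read directly as a share of households. The following is a minimal sketch using only the figures printed in the appendix summary table; the variable names are illustrative and are not taken from the dataset.

```python
import math

# Figures copied from the summary statistics in the appendix.
n_households = 304
mean_teacoffee = 0.8157895  # mean of the 0/1 tea/coffee dummy

# The mean of a 0/1 variable is the share of ones,
# so mean * n recovers the number of tea/coffee households.
drinkers = round(mean_teacoffee * n_households)
non_drinkers = n_households - drinkers

# Sample standard deviation of a 0/1 variable with share p of ones.
p = drinkers / n_households
sd_teacoffee = math.sqrt(p * (1 - p) * n_households / (n_households - 1))

print(drinkers, non_drinkers, round(sd_teacoffee, 6))
```

Under this reading, the standard deviation implied by the mean agrees with the value reported in the summary table (.388295).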

Correlations
The correlation matrix contains the correlation coefficients between the columns of a data matrix. That is, the entry in row i and column j of the correlation matrix is the correlation between column i and column j of the original matrix. The diagonal elements of the correlation matrix are 1, since each is the correlation of a column with itself. The correlation matrix is also symmetric, since the correlation of column i with column j is the same as the correlation of column j with column i.
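The two properties described above (a unit diagonal and symmetry) can be checked with a small sketch. The data below is made up purely for illustration and is not the survey data; the helper names `pearson` and `correlation_matrix` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_matrix(columns):
    """Entry [i][j] is the correlation between column i and column j."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

# Made-up illustrative columns (NOT the survey data).
milk = [500, 1000, 1500, 1000, 2000]
family = [2, 4, 5, 3, 6]
income = [22000, 30000, 41000, 28000, 60000]

corr = correlation_matrix([milk, family, income])
for row in corr:
    print([round(value, 3) for value in row])
```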

The correlation matrix shows a small negative correlation of -0.108 between the frequency of milk purchase and the amount of milk purchased, and a strong positive correlation between income and the amount of milk consumed in a household.

Base line model


$$\text{Amount of milk consumed} = \beta_0 + \beta_1(\text{family size}) + \beta_2(\text{frequency of milk purchase}) + \beta_3(\text{SEC classification}) + \beta_4(\text{average monthly income})$$

The result of the multiple regression in STATA shows that the model explains 58.61% of the variance in the dependent variable, i.e. 58.61% of the variation in the amount of milk consumed per household is accounted for by the listed independent factors, at the 90% confidence level. Frequency of milk purchase remains insignificant in the regression model at the 90% confidence level. The result also shows that a change in family size has a significant impact on the amount of milk consumed.

Omitting Frequency of Milk Purchase


$$\text{Amount of milk consumed} = \beta_0 + \beta_1(\text{family size}) + \beta_3(\text{SEC classification}) + \beta_4(\text{average monthly income})$$

The above model results from omitting the frequency of milk purchase from the base line model. There is no significant change in the value of R-squared, but the adjusted R-squared rises slightly, by 0.0014, from 0.5805 to 0.5819. All the other variables are significant even at the 95% confidence level. In the following models we therefore proceed without incorporating the effect of the frequency of milk purchase on the amount of milk consumed.
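The R-squared, adjusted R-squared and F statistic quoted for this model can be recomputed from the sums of squares printed in the appendix output. The sketch below uses the standard textbook formulas and only the figures from the appendix table for the regression without frequency; the variable names are illustrative.

```python
# Figures copied from the appendix output of the regression without frequency.
n_obs = 304
k_regressors = 3           # family size, SEC, income
ss_model = 21313059.5
ss_residual = 15052960.3
ss_total = 36366019.7

df_model = k_regressors
df_residual = n_obs - k_regressors - 1
df_total = n_obs - 1

# Share of the total variation explained by the model.
r_squared = ss_model / ss_total

# Adjusted R-squared penalises each additional regressor through the degrees of freedom.
adj_r_squared = 1 - (ss_residual / df_residual) / (ss_total / df_total)

# Overall F statistic: mean square of the model over mean square of the residual.
f_statistic = (ss_model / df_model) / (ss_residual / df_residual)

print(round(r_squared, 4), round(adj_r_squared, 4), round(f_statistic, 2))
```

Because the adjusted R-squared divides each sum of squares by its degrees of freedom, dropping a regressor that explains almost nothing can raise it even though the plain R-squared cannot increase.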

Log-Log Model
$$\log(\text{Amount of milk consumed}) = \beta_0 + \beta_1\log(\text{family size}) + \beta_2\log(\text{frequency of milk purchase}) + \beta_3\log(\text{SEC classification}) + \beta_4\log(\text{average monthly income})$$

In this model all the variables are significant except the frequency variable, just as in the lin-lin model. The R-squared and the adjusted R-squared decrease to 0.5426 and 0.5365 respectively, so the effectiveness of the model is reduced.
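In a log-log specification each slope coefficient can be read as an elasticity. As a hedged illustration, the sketch below takes the coefficient on log income from the appendix log-log output (1.486477) and computes the change in milk consumption implied by a 1% rise in income, holding the other regressors fixed. The variable names are illustrative.

```python
# Coefficient on l_income copied from the appendix log-log output.
income_elasticity = 1.486477
pct_income_change = 1.0  # a 1% rise in average monthly income

# Exact effect implied by the log-log form:
# consumption is multiplied by (1 + change) ** elasticity.
pct_milk_change = ((1 + pct_income_change / 100) ** income_elasticity - 1) * 100

# Usual first-order reading: elasticity times the percentage change.
pct_milk_change_approx = income_elasticity * pct_income_change

print(round(pct_milk_change, 2), round(pct_milk_change_approx, 2))
```

For small changes the exact and first-order figures are almost identical, which is why the coefficient itself is usually quoted as the elasticity.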

Log-Lin Model
$$\log(\text{Amount of milk consumed}) = \beta_0 + \beta_1(\text{family size}) + \beta_2(\text{frequency of milk purchase}) + \beta_3(\text{SEC classification}) + \beta_4\log(\text{average monthly income})$$

The model estimates the percentage change in the amount of milk consumed that results from a 1% change in the average monthly income of the family. In other words, it measures the income elasticity of milk consumption. The test result shows that a 1% increase in average monthly income would lead to a 1.6% increase in the average amount of milk consumed every day, at the 95% confidence level.

Quadratic Variable
$$\text{Amount of milk consumed} = \beta_0 + \beta_1(\text{family size})^2 + \beta_2(\text{frequency of milk purchase}) + \beta_3(\text{SEC classification}) + \beta_4(\text{average monthly income})$$

In this model, family size enters in quadratic form and fits better. The effectiveness of the model improves, with the adjusted R-squared reaching 58.80%, but the variable itself explains less of the variation in the dependent variable.
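With a squared regressor, the effect of one more family member is not constant but depends on the current family size. The sketch below illustrates this using the coefficient on squared family size from the appendix output (-10.7708) and the mean family size from the summary statistics. The function names are hypothetical, and treating family size as continuous in `marginal_effect` is an assumption made for illustration.

```python
# Coefficient on sq_size and mean family size, copied from the appendix.
beta_sq_size = -10.7708
mean_family_size = 4.009868

def marginal_effect(size, beta=beta_sq_size):
    """Derivative of beta * size**2 with respect to size (size treated as continuous)."""
    return 2 * beta * size

def discrete_change(size, beta=beta_sq_size):
    """Change in the fitted value when family size rises from size to size + 1."""
    return beta * ((size + 1) ** 2 - size ** 2)

print(round(marginal_effect(mean_family_size), 2))
print(round(discrete_change(4), 4))
```

Both figures are in the units of the dependent variable and hold the other regressors fixed; the effect grows in magnitude as family size increases.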

Impact of Dummy Variable


$$\text{Amount of milk consumed} = \beta_0 + \beta_1(\text{family size}) + \beta_2(\text{frequency of milk purchase}) + \beta_3(\text{SEC classification}) + \beta_4(\text{average monthly income}) + \beta_5(\text{Tea/Coffee})$$

The newly added variable indicates whether or not the respondent habitually drinks tea or coffee. This model checks whether milk consumption increases when there is such a caffeine habit.

The result shows a slight increase in the R-squared and adjusted R-squared values due to the inclusion of the dummy variable. Though the dummy variable is not significant, its coefficient is negative, indicating a negative relationship with the dependent variable.
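The significance of the dummy variable can be checked from its coefficient and standard error in the appendix output. The sketch below recomputes the t statistic and a 95% confidence interval. The critical value 1.968 is an assumed approximation for the two-sided 5% point of the t distribution with 298 degrees of freedom; it is not printed in the source.

```python
# Coefficient and standard error of TEACOFFEE, copied from the appendix output.
coef = -51.53908
std_err = 33.40568

# Assumption: approximate two-sided 5% critical value of t with 298 degrees of freedom.
t_crit = 1.968

t_stat = coef / std_err
ci_low = coef - t_crit * std_err
ci_high = coef + t_crit * std_err

# The coefficient is significant at the 5% level only if the interval excludes zero.
significant_at_5pct = not (ci_low <= 0 <= ci_high)

print(round(t_stat, 2), round(ci_low, 2), round(ci_high, 2), significant_at_5pct)
```

The interval spans zero, which is consistent with the negative but insignificant coefficient described above.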

Conclusion
The model consists of 6 independent variables explaining the variance in the dependent variable. Whatever variations are incorporated in the model, the factors analysed in the research account for only around 60% of the variance in the dependent variable. The effects of transforming the variables are analyzed and the results of all the regression models are interpreted. The results show a strong relation between income and the amount of milk consumed in every household, which may not be the case in reality. Family size, which had a positive correlation with the amount of milk consumed in the simple regression model, has a negative influence on the dependent variable once the other factors are included. The paper also finds no significant relation between the caffeine habit and the amount of milk consumed. The frequency of milk purchase has likewise been found to have no significant influence on the amount of milk consumed by the family. This suggests that people who buy milk on alternate days or only twice a week are in the habit of storing milk for later use. The SEC category of the customer is seen to have a better influence on the amount of milk consumed than the average monthly income. If the assumption behind including the income and SEC variables was that a better standard of living means greater consumption of milk and other healthy products, then the standard of living should be measured by the SEC classification rather than by income, at the 90% confidence level.

Description
. describe

Contains data
  obs:           304
 vars:             7
 size:         3,344

                 storage  display   value
variable name    type     format    label   variable label
------------------------------------------------------------------
totaldailymil~e  int      %14.2f            total daily milk usage
totalfamilyme~s  byte     %14.2f            total family members
frequency        byte     %14.2f            frequency
TEACOFFEE        byte     %14.2f            TEA/COFFEE
SEC              byte     %14.2f            SEC
Income           long     %14.2f            Income
NOteacoffee      byte     %10.0g            NO tea/coffee

Summary Statistics
. summ

Variable     |  Obs       Mean    Std. Dev.     Min      Max
-------------+------------------------------------------------
totaldaily~e |  304   1058.882    346.4388      200     3000
totalfamil~s |  304   4.009868    1.238772        1       11
frequency    |  304   4.526316    1.587689        3        7
TEACOFFEE    |  304   .8157895     .388295        0        1
SEC          |  304   6.269737    1.549919        1        8
Income       |  304   31111.84    6470.388    21200    64000

Correlation matrix explained by Scatter Plot

Distribution of Independent Variables with respect to Milk consumption

Charts

Regression
. reg totaldailymilkusage totalfamilymembers

Source   |         SS    df          MS        Number of obs =     304
---------+-------------------------------      F( 1, 302)    =   50.51
Model    | 5210952.63     1  5210952.63        Prob > F      =  0.0000
Residual | 31155067.1   302  103162.474        R-squared     =  0.1433
---------+-------------------------------      Adj R-squared =  0.1405
Total    | 36366019.7   303  120019.867        Root MSE      =  321.19

totaldailymilkus~e |     Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
-------------------+--------------------------------------------------------------
totalfamilymembers |  105.8634   14.89526     7.11   0.000    76.55177   135.1751
_cons              |  634.3832   62.50432    10.15   0.000    511.3841   757.3823

. reg totaldailymilkusage totalfamilymembers frequency TEACOFFEE SEC Income

Source   |         SS    df          MS        Number of obs =     304
---------+-------------------------------      F( 5, 298)    =   85.54
Model    | 21432432.8     5  4286486.57        Prob > F      =  0.0000
Residual | 14933586.9   298  50112.7077        R-squared     =  0.5894
---------+-------------------------------      Adj R-squared =  0.5825
Total    | 36366019.7   303  120019.867        Root MSE      =  223.86

totaldailymilkus~e |      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
-------------------+---------------------------------------------------------------
totalfamilymembers |  -98.21698    16.0407    -6.12   0.000   -129.7844  -66.64958
frequency          |   .1036065   8.342478     0.01   0.990   -16.31403   16.52124
TEACOFFEE          |  -51.53908   33.40568    -1.54   0.124     -117.28   14.20185
SEC                |   16.26448   8.522425     1.91   0.057   -.5072791   33.03624
Income             |   .0524321   .0030275    17.32   0.000    .0464741   .0583902
_cons              |  -238.9393    100.928    -2.37   0.019   -437.5612  -40.31743

Regressing without frequency factor


. reg totaldailymilkusage totalfamilymembers SEC Income

Source   |         SS    df          MS        Number of obs =     304
---------+-------------------------------      F( 3, 300)    =  141.59
Model    | 21313059.5     3  7104353.16        Prob > F      =  0.0000
Residual | 15052960.3   300  50176.5342        R-squared     =  0.5861
---------+-------------------------------      Adj R-squared =  0.5819
Total    | 36366019.7   303  120019.867        Root MSE      =     224

totaldailymilkus~e |      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
-------------------+---------------------------------------------------------------
totalfamilymembers |  -95.30542   15.82341    -6.02   0.000   -126.4444  -64.16649
SEC                |   16.63666   8.443745     1.97   0.050    .0201895   33.25313
Income             |   .0521273   .0030225    17.25   0.000    .0461793   .0580754
_cons              |  -285.0403   80.11384    -3.56   0.000   -442.6966  -127.3841

Log-Log Model
. reg l_totaldailymilkusage l_totalfamilymembers l_frequency l_SEC l_income

Source   |         SS    df          MS        Number of obs =     304
---------+-------------------------------      F( 4, 299)    =   88.68
Model    | 17.2796051     4  4.31990128        Prob > F      =  0.0000
Residual | 14.5650141   299  .048712422        R-squared     =  0.5426
---------+-------------------------------      Adj R-squared =  0.5365
Total    | 31.8446192   303  .105097753        Root MSE      =  .22071

l_totaldailymilkus~e |      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
---------------------+---------------------------------------------------------------
l_totalfamilymembers |  -.2311422   .0578603    -3.99   0.000   -.3450072  -.1172772
l_frequency          |  -.0571002   .0396761    -1.44   0.151   -.1351799   .0209796
l_SEC                |   .1086376   .0398637     2.73   0.007    .0301886   .1870866
l_income             |   1.486477   .0996683    14.91   0.000    1.290337   1.682618
_cons                |  -8.238261   .9732664    -8.46   0.000   -10.15358  -6.322941

Log-Lin Model
. reg totaldailymilkusage totalfamilymembers frequency SEC l_income

Source   |         SS    df          MS        Number of obs =     304
---------+-------------------------------      F( 4, 299)    =  110.02
Model    | 21653536.1     4  5413384.04        Prob > F      =  0.0000
Residual | 14712483.6   299  49205.6307        R-squared     =  0.5954
---------+-------------------------------      Adj R-squared =  0.5900
Total    | 36366019.7   303  120019.867        Root MSE      =  221.82

totaldailymilkus~e |      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
-------------------+---------------------------------------------------------------
totalfamilymembers |  -103.4793    16.0361    -6.45   0.000   -135.0372  -71.92137
frequency          |    4.38183   8.270006     0.53   0.597   -11.89296   20.65662
SEC                |   13.39921     8.4704     1.58   0.115   -3.269945   30.06836
l_income           |   1875.339   106.4918    17.61   0.000    1665.771   2084.908
_cons              |  -17996.65   1047.332   -17.18   0.000   -20057.72  -15935.57

Quadratic Variable- Family Size


. reg totaldailymilkusage frequency SEC Income sq_size

Source   |         SS    df          MS        Number of obs =     304
---------+-------------------------------      F( 4, 299)    =  109.09
Model    | 21579573.8     4  5394893.44        Prob > F      =  0.0000
Residual |   14786446   299  49452.9966        R-squared     =  0.5934
---------+-------------------------------      Adj R-squared =  0.5880
Total    | 36366019.7   303  120019.867        Root MSE      =  222.38

totaldaily~e |      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
-------------+---------------------------------------------------------------
frequency    |   1.172303   8.250041     0.14   0.887    -15.0632    17.4078
SEC          |   16.55755   8.444634     1.96   0.051   -.0608914     33.176
Income       |   .0534463   .0030424    17.57   0.000    .0474591   .0594335
sq_size      |   -10.7708   1.668011    -6.46   0.000   -14.05332  -7.488266
_cons        |  -523.3907   100.1633    -5.23   0.000   -720.5049  -326.2765

Dummy Variable Tea / Coffee


. reg totaldailymilkusage totalfamilymembers frequency TEACOFFEE SEC Income

Source   |         SS    df          MS        Number of obs =     304
---------+-------------------------------      F( 5, 298)    =   85.54
Model    | 21432432.8     5  4286486.57        Prob > F      =  0.0000
Residual | 14933586.9   298  50112.7077        R-squared     =  0.5894
---------+-------------------------------      Adj R-squared =  0.5825
Total    | 36366019.7   303  120019.867        Root MSE      =  223.86

totaldailymilkus~e |      Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
-------------------+---------------------------------------------------------------
totalfamilymembers |  -98.21698    16.0407    -6.12   0.000   -129.7844  -66.64958
frequency          |   .1036065   8.342478     0.01   0.990   -16.31403   16.52124
TEACOFFEE          |  -51.53908   33.40568    -1.54   0.124     -117.28   14.20185
SEC                |   16.26448   8.522425     1.91   0.057   -.5072791   33.03624
Income             |   .0524321   .0030275    17.32   0.000    .0464741   .0583902
_cons              |  -238.9393    100.928    -2.37   0.019   -437.5612  -40.31743
