CIA – III

BUSINESS STATISTICS I
Inspection of Data and Comprehensive Reporting
using Correlation, Regression/Time Series
Analysis
SUBMITTED BY:
VEDANSH MUDGAL (2123547)
SUBMITTED TO:
PROF. FEZEENA KHADIR
INTRODUCTION

The main objective of this project is to study and predict the relationship between cost, i.e.
total expenditure, and the total income of Biocon Limited. In this project, we fit a line
between the two selected data sets and calculate the correlation between them. We then plot a
scatter diagram of the data sets. A scatter diagram graphs pairs of numerical data, with one
variable on each axis, to reveal the relationship between the variables. If the variables are
correlated, the points fall along a line or curve. The probable error is then calculated. It
defines the half-range of an interval about the central point of a distribution such that half
of the values lie within the interval and half lie outside it. We also use regression to
forecast the chosen variable for the next five years. Regression is the process of estimating
the relationship between a dependent variable and one or more independent variables.

The method used for correlation is Karl Pearson's coefficient of correlation, and the method
used for regression is the least squares method. Karl Pearson's coefficient of correlation is a
widely used mathematical measure of the degree of relationship between linearly related
variables. It is denoted by r. The least squares method finds the best-fitting curve or line of
best fit for a set of data points by minimizing the sum of the squares of the offsets
(residuals) of the points from the curve. When the relationship between two variables is
studied in this way, the trend of outcomes is estimated quantitatively; this process is termed
regression analysis. Curve fitting is one approach to regression analysis, and the least
squares method is the procedure used to fit the equation of the curve to the raw data.
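
The definition of Karl Pearson's coefficient given above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `pearson_r` and the toy data are hypothetical and are not taken from the project data.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation, computed from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each observation from its mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Hypothetical toy data: y is exactly 2 * x, so the points lie on a straight line
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 6, 8, 10]
print(pearson_r(toy_x, toy_y))
```

A perfectly linear increasing relationship gives r = 1, and reversing one of the series gives r = -1; real data falls somewhere in between.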

DATA
Dataset description: For this project we use the total income (revenue) and total expenditure of
Biocon Limited. The dataset covers the period from March 2004 to March 2022.
Biocon Limited is India's largest fully integrated, innovation-led biopharmaceutical company. It is
based in Bangalore, India, and was founded by Kiran Mazumdar-Shaw. The company manufactures generic
active pharmaceutical ingredients (APIs) that are sold in over 120 countries across the globe,
including the developed markets of the United States and Europe. It also manufactures novel
biologics, as well as biosimilar insulins and antibodies, which are sold in India as branded
formulations. Biocon's biosimilar products are also sold in both bulk and formulation forms in
several emerging markets.

(In Crores)

YEAR    EXPENDITURE (X)    REVENUE (Y)
2004             364.59         512.07
2005             481.40         668.60
2006             536.91         697.79
2007             703.55         874.60
2008             729.80         936.00
2009             774.15         988.36
2010             946.92        1223.15
2011            1101.10        1618.30
2012            1318.00        1622.40
2013            1628.60        1989.50
2014            1854.50        2263.10
2015            1940.60        2390.70
2016            2026.40        2474.90
2017            2150.50        2686.70
2018            2238.10        2543.90
2019            2648.80        3002.20
2020            1801.60        2190.10
2021            1819.80        2178.60

Calculation of Correlation using Karl Pearson’s Coefficient of Correlation

dx = x - A = x - 1948.0733
dy = y - B = y - 1714.4983

        x         y           dx           dy              dx²             dy²           dx⋅dy
   364.59    512.07   -1583.4833   -1202.4283     2507419.4669    1445833.8968    1904025.2254
   481.40    668.60   -1466.6733   -1045.8983     2151130.6667    1093903.3237    1533991.1949
   536.91    697.79   -1411.1633   -1016.7083     1991381.9533    1033695.8351    1434741.5207
   703.55    874.60   -1244.5233    -839.8983     1548838.3272     705429.2103    1045273.0735
   729.80    936.00   -1218.2733    -778.4983     1484189.9147     606059.6550     948423.7595
   774.15    988.36   -1173.9233    -726.1383     1378095.9925     527276.8791     852430.7327
   946.92   1223.15   -1001.1533    -491.3483     1002307.9968     241423.1847     491915.0217
  1101.10   1618.30    -846.9733     -96.1983      717363.8274       9254.1193      81477.4230
 11318.00   1622.40    9369.9267     -92.0983    87795525.7387       8482.1030    -862954.6295
  1628.60   1989.50    -319.4733     275.0017      102063.2107      75625.9167     -87855.6991
  1854.50   2263.10     -93.5733     548.6017        8755.9687     300963.7887     -51334.4866
  1940.60   2390.70      -7.4733     676.2017          55.8507     457248.6940      -5053.4805
  2026.40   2474.90      78.3267     760.4017        6135.0667     578210.6947      59559.7279
  2150.50   2686.70     202.4267     972.2017       40976.5554     945176.0807     196799.5427
  2238.10   2543.90     290.0267     829.4017       84115.4674     687907.1247     240548.6007
  2648.80   3002.20     700.7267    1287.7017      491017.8614    1658175.5823     902326.8965
  1801.60   2190.10    -146.4733     475.6017       21454.4374     226196.9453     -69662.9615
  1819.80   2178.60    -128.2733     464.1017       16454.0480     215390.3570     -59531.8678

∑x = 35065.32    ∑y = 30860.97    ∑dx = 0    ∑dy = 0
∑dx² = 101347282.3508    ∑dy² = 10816253.391    ∑dx⋅dy = 8555119.5944

Solution:

$$
\begin{aligned}
\bar{x} &= \frac{\sum x_i}{n} \\
&= \frac{364.59+481.4+536.91+703.55+729.8+774.15+946.92+1101.1+11318+1628.6+1854.5+1940.6+2026.4+2150.5+2238.1+2648.8+1801.6+1819.8}{18} \\
&= \frac{35065.32}{18} \\
&= 1948.0733
\end{aligned}
$$

$$
\begin{aligned}
\bar{y} &= \frac{\sum y_i}{n} \\
&= \frac{512.07+668.6+697.79+874.6+936+988.36+1223.15+1618.3+1622.4+1989.5+2263.1+2390.7+2474.9+2686.7+2543.9+3002.2+2190.1+2178.6}{18} \\
&= \frac{30860.97}{18} \\
&= 1714.4983
\end{aligned}
$$

Use assumed mean $A = \bar{x} = 1948.0733$

Use assumed mean $B = \bar{y} = 1714.4983$

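Correlation Coefficient r :

$$
\begin{aligned}
r &= \frac{n \sum dx\,dy - \sum dx \cdot \sum dy}{\sqrt{n \sum dx^2 - \left(\sum dx\right)^2} \cdot \sqrt{n \sum dy^2 - \left(\sum dy\right)^2}} \\
&= \frac{18 \cdot 8555119.5944 - 0 \cdot 0}{\sqrt{18 \cdot 101347282.3508 - (0)^2} \cdot \sqrt{18 \cdot 10816253.391 - (0)^2}} \\
&= \frac{153992152.6992}{\sqrt{1824251082.3144} \cdot \sqrt{194692561.0389}} \\
&= \frac{153992152.6992}{42711.2524 \cdot 13953.2276} \\
&= \frac{153992152.6992}{595959826.8288} \\
&= 0.2584
\end{aligned}
$$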
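
The substitution above can be checked with a short Python sketch. It simply plugs the column totals from the worked table into the same formula; the variable names are chosen here for illustration only.

```python
import math

# Column totals copied from the worked table above
n = 18
sum_dx = 0.0
sum_dy = 0.0
sum_dx2 = 101347282.3508
sum_dy2 = 10816253.391
sum_dxdy = 8555119.5944

# r = (n*Σdxdy - Σdx*Σdy) / (sqrt(n*Σdx² - (Σdx)²) * sqrt(n*Σdy² - (Σdy)²))
numerator = n * sum_dxdy - sum_dx * sum_dy
denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
               * math.sqrt(n * sum_dy2 - sum_dy ** 2))
r = numerator / denominator
print(round(r, 4))
```

Because the deviations are taken about the actual means, ∑dx and ∑dy are zero and the formula reduces to ∑dx⋅dy / √(∑dx² ⋅ ∑dy²).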

Calculation of Probable Error

$$
\begin{aligned}
PE &= 0.6745 \times \frac{1 - r^2}{\sqrt{N}} \\
&= 0.6745 \times \frac{1 - 0.2584^2}{\sqrt{18}} \\
&= 0.1483
\end{aligned}
$$
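
As a rough check of the probable error, the formula can be written as a small Python function. The name `probable_error` is hypothetical; the comparison with 6 × PE follows the significance rule referred to in the interpretation of this report.

```python
import math


def probable_error(r, n):
    """Probable error of the correlation coefficient: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


r = 0.2584   # correlation coefficient obtained above
n = 18       # number of paired observations
pe = probable_error(r, n)

# Conventional rule: r is treated as significant only when r > 6 * PE
is_significant = r > 6 * pe
print(round(pe, 4), round(6 * pe, 4), is_significant)
```

The printed value may differ from 0.1483 in the last decimal place because of rounding.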

TIME SERIES ANALYSIS

YEAR (x)    EXPENDITURE (Y)    X = x - 2012.5       X²          XY
2004                 364.59              -8.5    72.25    -3099.02
2005                 481.40              -7.5    56.25    -3610.50
2006                 536.91              -6.5    42.25    -3489.92
2007                 703.55              -5.5    30.25    -3869.53
2008                 729.80              -4.5    20.25    -3284.10
2009                 774.15              -3.5    12.25    -2709.53
2010                 946.92              -2.5     6.25    -2367.30
2011                1101.10              -1.5     2.25    -1651.65
2012                1318.00              -0.5     0.25     -659.00
2013                1628.60               0.5     0.25      814.30
2014                1854.50               1.5     2.25     2781.75
2015                1940.60               2.5     6.25     4851.50
2016                2026.40               3.5    12.25     7092.40
2017                2150.50               4.5    20.25     9677.25
2018                2238.10               5.5    30.25    12309.55
2019                2648.80               6.5    42.25    17217.20
2020                1801.60               7.5    56.25    13512.00
2021                1819.80               8.5    72.25    15468.30
Total              25065.32               0.0   484.50    58983.72

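With the coded year $X = x - 2012.5$, the trend line is:

$$
\begin{aligned}
Y &= a + bX \\
a &= \frac{\sum Y}{n} = \frac{25065.32}{18} = 1392.51 \\
b &= \frac{\sum XY}{\sum X^2} = \frac{58983.72}{484.5} = 121.74 \\
Y &= 1392.51 + 121.74X
\end{aligned}
$$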
Putting X = year - 2012.5 for the next five years, we get:

YEAR       X          Y
2022     9.5    2549.07
2023    10.5    2670.81
2024    11.5    2792.55
2025    12.5    2914.29
2026    13.5    3036.03
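
The trend fit and forecasts above can be reproduced with the following Python sketch. The expenditure figures are those from the data table; the function names `fit_trend` and `forecast` are hypothetical. The shortcut a = ∑Y/n and b = ∑XY/∑X² is only valid because the origin 2012.5 is the midpoint of the years, so the coded X values sum to zero.

```python
years = list(range(2004, 2022))
expenditure = [364.59, 481.40, 536.91, 703.55, 729.80, 774.15,
               946.92, 1101.10, 1318.00, 1628.60, 1854.50, 1940.60,
               2026.40, 2150.50, 2238.10, 2648.80, 1801.60, 1819.80]

ORIGIN = 2012.5  # midpoint of 2004..2021, so the coded X values sum to zero


def fit_trend(years, values, origin):
    """Least squares line Y = a + bX with X = year - origin (requires sum of X = 0)."""
    xs = [year - origin for year in years]
    n = len(values)
    a = sum(values) / n
    b = sum(x * y for x, y in zip(xs, values)) / sum(x * x for x in xs)
    return a, b


def forecast(year, a, b, origin=ORIGIN):
    """Trend value for a given calendar year."""
    return a + b * (year - origin)


a, b = fit_trend(years, expenditure, ORIGIN)
print(round(a, 2), round(b, 2))
for year in range(2022, 2027):
    print(year, round(forecast(year, a, b), 2))
```

Small differences in the last decimal place compared with the table above can arise from rounding a and b before substituting.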
INTERPRETATION
Correlation is a statistic that measures the degree to which two variables move in relation to
each other. Since the correlation value is 0.2584, there is a low positive correlation between
total expenditure and total income: as total expenditure increases, total income also tends to
increase. Since the degree of correlation is low, the change in total expenditure is small
compared to the change in total income, so it can be interpreted that the pharmaceutical sector
is not a capital-intensive sector. The probable error of the data is 0.1483. If the correlation
coefficient satisfies r > 6PE, the value of r is significant and we can say that there is a
relationship between the two variables, here total expenditure and total income. In this case
6PE = 6 × 0.1483 = 0.8898, which is greater than r = 0.2584, so r does not meet this test of
significance. Regression literally means going back, or returning, to the mean. The value of
the independent variable can be used to predict the most likely value of the dependent
variable. We used the least squares method to predict values for the next five years. The
fitted trend shows the total expenditure of Biocon increasing as the years progress.
CONCLUSION
This project provided a perspective on correlation and regression. Correlation can be used to
find relationships and similarities between two or more variables, and the correlation value
can be applied in everyday activities. In this project, the total expenditure and total revenue
of Biocon show a positive correlation. Trend analysis can be used to predict future values of a
dataset, which is highly needed in today's world. This information can support many decisions
in day-to-day operations.

