
2020

Data Analysis

STUDENT NAME:
STUDENT ID:
Table of Contents

Introduction
    H0: People in higher age groups have higher salaries
    H1: People in higher age groups have lower salaries
Graphs and Charts
    Regression Analysis
    Plots
Online Calculation Results of T-test of Province A
    Difference Scores Calculations
    T-value Calculation
Online Calculation Results of T-test of Province B
    Difference Scores Calculations
    T-value Calculation
Conclusion

Introduction
The dataset collected for this case contains annual salaries for two years, broken down by age group. From the respondents' annual salaries and their likes and dislikes, it also indicates how strongly they are influenced by the services of KFC. The first piece of information retrieved from the data is the expected income of people in each age group. The second is whether respondents like or dislike KFC meals. The data suggest that people are strongly influenced by the services of KFC and that this influence is related to their salary. In the data set, people in the under-20 age group have the lowest salaries. People in the 20+ age group earn more than those under 20, and people in the 30+ age group earn more than those in the 20+ group. Hence there appears to be a linear relationship between age group and salary. The data also compare two services, McDonald's and KFC. From this data set we can state our hypotheses as follows:

H0: People in higher age groups have higher salaries.

The alternative hypothesis that can be drawn from the data set is:

H1: People in higher age groups have lower salaries.

The total population size is 32, of which 17 are from Province A and 15 are from Province B.

The second hypothesis that can be drawn from this study is:

H0: KFC is liked more than McDonald's.

The alternative hypothesis is:

H1: McDonald's is liked more than KFC.

Graphs and Charts

From the line graph we can directly determine the relationship between age group and salary. The line graph also gives the R-squared value and the equation of the fitted line, which provides the slope of the graph. From the bar graph we can read the salary for each age group. A frequency distribution graph shows the range over which the salaries vary. Regression and other statistical analysis can also be applied to this data set.
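
As a minimal sketch of how the equation of the line and R-squared can be obtained, the Python snippet below fits an ordinary least-squares line by hand. The function name `fit_line` and the sample values are assumptions made for illustration only; they are not the data used in this report.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept, returning R squared too."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical values for illustration only (not the report's data).
salaries = [10000, 12000, 14000, 16000, 18000]
ages = [20, 22, 25, 27, 30]

slope, intercept, r_squared = fit_line(salaries, ages)
print(slope, intercept, r_squared)
```

Here salary is placed on the horizontal axis and age on the vertical axis, which matches the orientation of the trend-line equations shown in the plots.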

Regression Analysis

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.99738
R Square            0.99478
Adjusted R Square   0.928113
Standard Error      1007.069
Observations        16

ANOVA
             df    SS          MS          F          Significance F
Regression   1     2.9E+09     2.9E+09     2858.364   1.37E-17
Residual     15    15212819    1014188
Total        16    2.91E+09
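
The entries of the regression output are linked by simple arithmetic, and a short Python check can confirm that they agree with each other. This is only a sketch: it assumes the wrapped digits in the output read as a residual SS of 15212819 and an F value of 2858.364, and the variable names are chosen here for illustration.

```python
import math

# Values read from the ANOVA table (wrapped digits joined; see the assumption above).
residual_df = 15
residual_ss = 15212819
f_statistic = 2858.364

# Mean square = sum of squares / degrees of freedom
residual_ms = residual_ss / residual_df

# The regression standard error is the square root of the residual mean square
standard_error = math.sqrt(residual_ms)

# F = regression MS / residual MS, so the regression MS can be recovered from F
regression_ms = f_statistic * residual_ms

print(round(residual_ms), round(standard_error, 3), regression_ms)
```

The recovered residual mean square and standard error should match the tabulated 1014188 and 1007.069, and the regression mean square should round to the tabulated 2.9E+09.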

Plots
[Figure: scatter plot titled "Relationship between Age group and Annual Salary of Province A", with fitted trend line f(x) = 0.00144754983752008x + 4.99970116869888 and R² = 0.865455600473044. Both axes carry the default label "Axis Title"; the horizontal axis runs from 9000 to 19000.]

Figure 1.1: Relationship between Annual Salary and Age group of Province A

[Figure: scatter plot titled "Relationship between Age group and Annual Salary", with fitted trend line f(x) = 0.0011630943081959x + 8.84669306166677 and R² = 0.788443737767416. Vertical axis: Age; horizontal axis: Annual Salary, running from 8000 to 17000.]

Figure 1.2: Relationship between Annual Salary and Age group of Province B

Hence, as both graphs show, there is a linear relationship between salary and age group.

Online Calculation Results of T-test of Province A

Difference Scores Calculations:

Treatment 1

$N_1 = 17$

$df_1 = N - 1 = 17 - 1 = 16$

$M_1 = 12990$

$SS_1 = 101949200$

$s_1^2 = \frac{SS_1}{N - 1} = \frac{101949200}{17 - 1} = 6371825$

Treatment 2

$N_2 = 17$

$df_2 = N - 1 = 17 - 1 = 16$

$M_2 = 24.06$

$SS_2 = 268.94$

$s_2^2 = \frac{SS_2}{N - 1} = \frac{268.94}{17 - 1} = 16.81$

T value calculation
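$s_p^2 = \left(\frac{df_1}{df_1 + df_2} \cdot s_1^2\right) + \left(\frac{df_2}{df_1 + df_2} \cdot s_2^2\right) = \left(\frac{16}{32} \cdot 6371825\right) + \left(\frac{16}{32} \cdot 16.81\right) = 3185920.9$

$s_{M_1}^2 = \frac{s_p^2}{N_1} = \frac{3185920.9}{17} = 187407.11$

$s_{M_2}^2 = \frac{s_p^2}{N_2} = \frac{3185920.9}{17} = 187407.11$

$t = \frac{M_1 - M_2}{\sqrt{s_{M_1}^2 + s_{M_2}^2}} = \frac{12965.94}{\sqrt{374814.22}} = 21.18$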

Online Calculation Results of T-test of Province B

Difference Scores Calculations:


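Treatment 1

$N_1 = 15$

$df_1 = N - 1 = 15 - 1 = 14$

$M_1 = 12075.33$

$SS_1 = 55075773.33$

$s_1^2 = \frac{SS_1}{N - 1} = \frac{55075773.33}{15 - 1} = 3933983.81$

Treatment 2

$N_2 = 15$

$df_2 = N - 1 = 15 - 1 = 14$

$M_2 = 21.47$

$SS_2 = 505.73$

$s_2^2 = \frac{SS_2}{N - 1} = \frac{505.73}{15 - 1} = 36.12$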

T value calculation
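$s_p^2 = \left(\frac{df_1}{df_1 + df_2} \cdot s_1^2\right) + \left(\frac{df_2}{df_1 + df_2} \cdot s_2^2\right) = \left(\frac{14}{28} \cdot 3933983.81\right) + \left(\frac{14}{28} \cdot 36.12\right) = 1967009.97$

$s_{M_1}^2 = \frac{s_p^2}{N_1} = \frac{1967009.97}{15} = 131134$

$s_{M_2}^2 = \frac{s_p^2}{N_2} = \frac{1967009.97}{15} = 131134$

$t = \frac{M_1 - M_2}{\sqrt{s_{M_1}^2 + s_{M_2}^2}} = \frac{12053.87}{\sqrt{262268}} = 23.54$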
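
The t-value calculations for both provinces follow the same pooled-variance steps, so they can be reproduced from the summary figures alone. The Python sketch below is one way to do that; the function name `pooled_t` is an assumption introduced here, and the inputs are the N, M and SS values reported for each province.

```python
import math


def pooled_t(n1, mean1, ss1, n2, mean2, ss2):
    """Independent-samples t statistic from summary values, using pooled variance."""
    df1, df2 = n1 - 1, n2 - 1
    # Sample variances: sum of squares divided by degrees of freedom
    var1 = ss1 / df1
    var2 = ss2 / df2
    # Pooled variance: variances weighted by their degrees of freedom
    pooled_var = (df1 / (df1 + df2)) * var1 + (df2 / (df1 + df2)) * var2
    # Standard error of the difference between the two means
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean1 - mean2) / se_diff


# Province A: N = 17 in each treatment
t_a = pooled_t(17, 12990, 101949200, 17, 24.06, 268.94)
# Province B: N = 15 in each treatment
t_b = pooled_t(15, 12075.33, 55075773.33, 15, 21.47, 505.73)

print(round(t_a, 2), round(t_b, 2))
```

Because the function works from unrounded intermediate values, small differences in the last decimal place relative to a hand calculation are possible; to two decimals the results should agree with the reported t values for Province A and Province B.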

From these results we can see that there is a relationship between age group and annual salary. The analysis also lets us identify the dependent and independent variables.

Conclusion

To sum up, the independent variable is age and the dependent variable in this case is annual salary, which depends on the age group. A linear trend has also been observed in the graphs. Moreover, the mean values show that the age group of 23 has the strongest preference for KFC services. Looking at both age groups and the data from each province, Province A shows a stronger tendency towards KFC meals and services.
