Business Analytics Assignment Riya Mathew 19021141088: STR (Crew - Data)

BUSINESS ANALYTICS ASSIGNMENT
RIYA MATHEW
19021141088
Import “crew data.csv” from MS Teams > Files and answer the following questions
1. List the categorical and numeric variables of the data set
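Categorical: Hire.date, Lastname, Firstname, Location, EmpId, Job.code
Numeric: Salary, bonus (Phne is also stored as an integer, though it appears to be a phone number rather than a quantity)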
> str(Crew.data)
'data.frame': 69 obs. of 9 variables:
$ Hire.date: Factor w/ 69 levels "1-Jul-87","1-Mar-90",..: 35 50 3 16 27 36 62 60 24 17 ...
$ Lastname : Factor w/ 69 levels "BEAUMONT","BERGAMASCO",..: 21 35 69 19 41 18 42 64 67 9 ...
$ Firstname: Factor w/ 69 levels "ANITA M.","ANNETTE M.",..: 30 29 24 58 54 26 68 39 59 37 ...
$ Location : Factor w/ 3 levels "CARY","FRANKFURT",..: 1 2 3 1 3 2 3 2 2 3 ...
$ Phne : int 1168 2164 1565 1157 2360 1595 2366 1197 1553 1369 ...
$ EmpId : Factor w/ 69 levels "E00034","E00084",..: 53 36 49 46 31 4 25 29 41 18 ...
$ Job.code : Factor w/ 6 levels "FLTAT1","FLTAT2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Salary : int 21000 22000 22000 23000 24000 25000 25000 26000 27000 28000 ...
$ bonus : num 2100 2200 2200 2300 2400 2500 2500 2600 2700 2800 ...
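The split between numeric and non-numeric columns can also be checked programmatically instead of reading it off the str() output. The following is a minimal sketch, assuming a small stand-in data frame named crew_demo (hypothetical; the real CSV is not reproduced here) with a few columns of the same types as Crew.data.

```r
# Hypothetical stand-in for Crew.data with the same column types
crew_demo <- data.frame(
  Location = factor(c("CARY", "FRANKFURT", "CARY", "FRANKFURT")),
  Job.code = factor(c("FLTAT1", "FLTAT1", "PILOT1", "PILOT2")),
  Phne     = c(1168L, 2164L, 1565L, 1157L),
  Salary   = c(21000L, 22000L, 69000L, 81000L),
  bonus    = c(2100, 2200, 6900, 8100)
)

# TRUE for every column stored as integer or double
is_num <- sapply(crew_demo, is.numeric)

numeric_vars     <- names(crew_demo)[is_num]
categorical_vars <- names(crew_demo)[!is_num]

print(numeric_vars)
print(categorical_vars)
```

Note that is.numeric() only reports the storage type. A column such as Phne is stored as an integer, so whether to treat it as a numeric variable is still a judgment call.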
2. Describe the numeric variable using descriptive function
> summary(Crew.data$bonus)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   2100    3300    4200    5214    7300   11200
> sd(Crew.data$bonus)
[1] 2552.178
> var(Crew.data$bonus)
[1] 6513610
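The sd() and var() results are related: the variance is the square of the standard deviation, and both use the n - 1 denominator. A short sketch that checks this by hand, assuming only the first ten bonus values shown in the str() output (bonus_head is an illustrative name, not the full column):

```r
# First ten bonus values as printed by str(Crew.data)
bonus_head <- c(2100, 2200, 2200, 2300, 2400, 2500, 2500, 2600, 2700, 2800)

s <- sd(bonus_head)   # sample standard deviation
v <- var(bonus_head)  # sample variance

# Sample variance computed by hand with the n - 1 denominator
n <- length(bonus_head)
manual_var <- sum((bonus_head - mean(bonus_head))^2) / (n - 1)

print(all.equal(s^2, v))         # sd squared matches var
print(all.equal(manual_var, v))  # manual formula matches var
```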
3. How many groups are contained in the variable “Job code”
There are 6 groups in variable “Job.code”
> Crew.data%>%count(Job.code)
Job.code n
1 FLTAT1 14
2 FLTAT2 18
3 FLTAT3 12
4 PILOT1 8
5 PILOT2 9
6 PILOT3 8
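The number of groups can also be read directly from the factor. A minimal sketch, assuming the Job.code column is rebuilt from the six counts shown above (row order is not preserved, so job_code is a stand-in rather than the original column):

```r
# Rebuild a factor with the same level counts as Crew.data$Job.code
job_code <- factor(rep(
  c("FLTAT1", "FLTAT2", "FLTAT3", "PILOT1", "PILOT2", "PILOT3"),
  times = c(14, 18, 12, 8, 9, 8)
))

n_groups <- nlevels(job_code)         # number of factor levels
n_unique <- length(unique(job_code))  # same count, also works for non-factors

print(levels(job_code))
print(n_groups)
print(n_unique)
```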
4. Enumerate all functions explained in the video for “Job code”
> table(Crew.data$Job.code)
FLTAT1 FLTAT2 FLTAT3 PILOT1 PILOT2 PILOT3
    14     18     12      8      9      8
> Emptb=table(Crew.data$Job.code)
> Emptb
FLTAT1 FLTAT2 FLTAT3 PILOT1 PILOT2 PILOT3
    14     18     12      8      9      8
> class(Emptb)
[1] "table"
> Empf=as.data.frame(Emptb)
> Empf
Var1 Freq
1 FLTAT1 14
2 FLTAT2 18
3 FLTAT3 12
4 PILOT1 8
5 PILOT2 9
6 PILOT3 8
> names(Empf)=c("Jobcat","count")
> Empf
Jobcat count
1 FLTAT1 14
2 FLTAT2 18
3 FLTAT3 12
4 PILOT1 8
5 PILOT2 9
6 PILOT3 8
Using dplyr
> Crew.data%>%count(Job.code)
Job.code n
1 FLTAT1 14
2 FLTAT2 18
3 FLTAT3 12
4 PILOT1 8
5 PILOT2 9
6 PILOT3 8
> Crew.data%>%group_by(Job.code)%>%summarise(count=n())
`summarise()` ungrouping output (override with `.groups` argument)

# A tibble: 6 x 2
Job.code count
<fct> <int>
1 FLTAT1 14
2 FLTAT2 18
3 FLTAT3 12
4 PILOT1 8
5 PILOT2 9
6 PILOT3 8
> Crew.data%>%group_by(Job.code)%>%summarise(mean(Salary))

# A tibble: 6 x 2
Job.code `mean(Salary)`
<fct> <dbl>
1 FLTAT1 25643.
2 FLTAT2 35111.
3 FLTAT3 44250
4 PILOT1 69500
5 PILOT2 80111.
6 PILOT3 99875
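The grouped mean from the dplyr pipeline has base R equivalents, which can be useful when dplyr is not loaded. The sketch below assumes a small hypothetical data frame crew_demo with made-up salaries; it is not the real Crew.data.

```r
# Hypothetical stand-in data (values are illustrative only)
crew_demo <- data.frame(
  Job.code = factor(c("FLTAT1", "FLTAT1", "FLTAT2", "PILOT1", "PILOT1")),
  Salary   = c(21000, 23000, 35000, 68000, 70000)
)

# tapply(): named array with the mean Salary per Job.code level
mean_by_job <- tapply(crew_demo$Salary, crew_demo$Job.code, mean)
print(mean_by_job)

# aggregate(): the same result as a data frame, closer to the tibble output
mean_df <- aggregate(Salary ~ Job.code, data = crew_demo, FUN = mean)
print(mean_df)
```

tapply() returns a named array, while aggregate() returns a data frame with one row per group, so the choice depends on what the next step needs.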
5. Enumerate all functions explained in the video for “Salary”
> summary(Crew.data$Salary)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  21000   33000   42000   52145   73000  112000
> table(Crew.data$Salary)
 21000  22000  23000  24000  25000  26000  27000  28000  29000  30000  32000  33000  34000  35000
     1      2      1      1      2      1      1      2      2      1      1      3      4      3
 36000  37000  38000  41000  42000  43000  44000  45000  47000  48000  65000  66000  68000  69000
     2      2      3      2      1      1      3      2      2      1      1      1      1      1
 71000  72000  73000  75000  76000  77000  78000  81000  82000  83000  86000  92000  93000  94000
     1      2      1      1      1      1      1      1      1      2      1      1      1      1
 95000 100000 105000 108000 112000
     1      1      1      1      1
> Emptb=table(Crew.data$Salary)
> Emptb
 21000  22000  23000  24000  25000  26000  27000  28000  29000  30000  32000  33000  34000  35000
     1      2      1      1      2      1      1      2      2      1      1      3      4      3
 36000  37000  38000  41000  42000  43000  44000  45000  47000  48000  65000  66000  68000  69000
     2      2      3      2      1      1      3      2      2      1      1      1      1      1
 71000  72000  73000  75000  76000  77000  78000  81000  82000  83000  86000  92000  93000  94000
     1      2      1      1      1      1      1      1      1      2      1      1      1      1
 95000 100000 105000 108000 112000
     1      1      1      1      1
> Empf=as.data.frame(Emptb)
> Empf
Var1 Freq
1 21000 1
2 22000 2
3 23000 1
4 24000 1
5 25000 2
6 26000 1
7 27000 1
8 28000 2
9 29000 2
10 30000 1
11 32000 1
12 33000 3
13 34000 4
14 35000 3
15 36000 2
16 37000 2
17 38000 3
18 41000 2
19 42000 1
20 43000 1
21 44000 3
22 45000 2
23 47000 2
24 48000 1
25 65000 1
26 66000 1
27 68000 1
28 69000 1
29 71000 1
30 72000 2
31 73000 1
32 75000 1
33 76000 1
34 77000 1
35 78000 1
36 81000 1
37 82000 1
38 83000 2
39 86000 1
40 92000 1
41 93000 1
42 94000 1
43 95000 1
44 100000 1
45 105000 1
46 108000 1
47 112000 1
> names(Empf)=c("Salary","count")
> Empf
Salary count
1 21000 1
2 22000 2
3 23000 1
4 24000 1
5 25000 2
6 26000 1
7 27000 1
8 28000 2
9 29000 2
10 30000 1
11 32000 1
12 33000 3
13 34000 4
14 35000 3
15 36000 2
16 37000 2
17 38000 3
18 41000 2
19 42000 1
20 43000 1
21 44000 3
22 45000 2
23 47000 2
24 48000 1
25 65000 1
26 66000 1
27 68000 1
28 69000 1
29 71000 1
30 72000 2
31 73000 1
32 75000 1
33 76000 1
34 77000 1
35 78000 1
36 81000 1
37 82000 1
38 83000 2
39 86000 1
40 92000 1
41 93000 1
42 94000 1
43 95000 1
44 100000 1
45 105000 1
46 108000 1
47 112000 1
Using dplyr
> Crew.data%>%count(Salary)
Salary n
1 21000 1
2 22000 2
3 23000 1
4 24000 1
5 25000 2
6 26000 1
7 27000 1
8 28000 2
9 29000 2
10 30000 1
11 32000 1
12 33000 3
13 34000 4
14 35000 3
15 36000 2
16 37000 2
17 38000 3
18 41000 2
19 42000 1
20 43000 1
21 44000 3
22 45000 2
23 47000 2
24 48000 1
25 65000 1
26 66000 1
27 68000 1
28 69000 1
29 71000 1
30 72000 2
31 73000 1
32 75000 1
33 76000 1
34 77000 1
35 78000 1
36 81000 1
37 82000 1
38 83000 2
39 86000 1
40 92000 1
41 93000 1
42 94000 1
43 95000 1
44 100000 1
45 105000 1
46 108000 1
47 112000 1
> Crew.data%>%group_by(Salary)%>%summarise(count=n())

# A tibble: 47 x 2
Salary count
<int> <int>
1 21000 1
2 22000 2
3 23000 1
4 24000 1
5 25000 2
6 26000 1
7 27000 1
8 28000 2
9 29000 2
10 30000 1
# ... with 37 more rows
> Crew.data%>%group_by(Salary)%>%summarise(mean(Salary))

# A tibble: 47 x 2
Salary `mean(Salary)`
<int> <dbl>
1 21000 21000
2 22000 22000
3 23000 23000
4 24000 24000
5 25000 25000
6 26000 26000
7 27000 27000
8 28000 28000
9 29000 29000
10 30000 30000
# ... with 37 more rows
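Because Salary has 47 distinct values across 69 rows, counting each value separately yields mostly ones. One possible alternative, not necessarily covered in the video, is to group salaries into bands with cut() before counting. The values in salary_demo and the 20000-wide bands are illustrative assumptions, not the real column.

```r
# Illustrative salaries within the observed range (21000 to 112000)
salary_demo <- c(21000, 22000, 22000, 33000, 34000,
                 44000, 65000, 83000, 83000, 112000)

# Bands of width 20000; each interval is open on the left, closed on the right
salary_band <- cut(salary_demo,
                   breaks = seq(20000, 120000, by = 20000),
                   dig.lab = 6)

# Frequency of observations per band
band_counts <- table(salary_band)
print(band_counts)
```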
Load the built-in “mtcars” dataset in R and answer the following questions

1. Enumerate all functions explained in the video for all categorical and numerical variables of the data set.
Note: All 11 variables in the dataset are stored as numeric (num); none is a factor

> data(mtcars) #importing the dataset
> str(mtcars)
'data.frame': 32 obs. of 11 variables:
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
$ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
$ disp: num 160 160 108 258 360 ...
$ hp : num 110 110 93 110 175 105 245 62 95 123 ...
$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ wt : num 2.62 2.88 2.32 3.21 3.44 ...
$ qsec: num 16.5 17 18.6 19.4 17 ...
$ vs : num 0 0 1 1 0 1 0 1 1 1 ...
$ am : num 1 1 1 0 0 0 0 0 0 0 ...
$ gear: num 4 4 4 3 3 3 3 4 4 4 ...
$ carb: num 4 4 1 1 2 1 4 2 2 4 ...
> summary(mtcars$mpg)

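   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  10.40   15.43   19.20   20.09   22.80   33.90
> summary(mtcars$cyl)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  4.000   4.000   6.000   6.188   8.000   8.000
> summary(mtcars$disp)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   71.1   120.8   196.3   230.7   326.0   472.0
> summary(mtcars$hp)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   52.0    96.5   123.0   146.7   180.0   335.0
> summary(mtcars$drat)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.760   3.080   3.695   3.597   3.920   4.930
> summary(mtcars$wt)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.513   2.581   3.325   3.217   3.610   5.424
> summary(mtcars$qsec)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  14.50   16.89   17.71   17.85   18.90   22.90
> summary(mtcars$vs)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0000  0.0000  0.0000  0.4375  1.0000  1.0000
> summary(mtcars$am)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0000  0.0000  0.0000  0.4062  1.0000  1.0000
> summary(mtcars$gear)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  3.000   3.000   4.000   3.688   4.000   5.000
> summary(mtcars$carb)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   2.000   2.000   2.812   4.000   8.000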
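Calling summary() on each column separately is repetitive. As a sketch, the same kind of descriptive statistics can be collected for all eleven columns at once with sapply(). The helper name describe_column is hypothetical and not taken from the video. The frequency count at the end treats cyl as a grouping variable, which is an interpretation rather than something str() reports.

```r
data(mtcars)  # built-in dataset shipped with R

# Hypothetical helper: a few descriptive statistics for one numeric column
describe_column <- function(x) {
  c(min = min(x), median = median(x), mean = mean(x), max = max(x), sd = sd(x))
}

# Apply the helper to every column; transpose so each variable is one row
stats <- t(sapply(mtcars, describe_column))
print(round(stats, 3))

# cyl takes only a few distinct values, so a frequency table is also meaningful
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)
```

Calling summary(mtcars) on the whole data frame is a shorter alternative when the six default statistics are enough.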
