Qn:
Text report of the document sent yesterday: bar chart, frequency count not less than for the 9th
column, and observations from the word cloud.
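library(tm)
library(wordcloud)
library(RColorBrewer)
# Text to analyse: the 9th column of Dmining
dmining <- Dmining[, 9]
dmining
# Assumed standard tm steps: corpus, term-document matrix, sorted word frequencies
TextDoc <- Corpus(VectorSource(dmining))
TextDoc_dtm <- TermDocumentMatrix(TextDoc)
dtm_m <- as.matrix(TextDoc_dtm)
dtm_m
dtm_v <- sort(rowSums(dtm_m), decreasing = TRUE)
dtm_d <- data.frame(word = names(dtm_v), freq = dtm_v)
head(dtm_d, 5)
head(dtm_d, 15)
set.seed(1234)
# Assumed arguments: words and freq; the colors argument is from the source
wordcloud(words = dtm_d$word, freq = dtm_d$freq,
          colors = brewer.pal(8, "Dark2"))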
output
Wordcloud
Report
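The frequency count and bar chart asked for in the question can be illustrated without any packages. The following is only a minimal sketch in base R: the three toy sentences, the variable names and the threshold `min_freq = 2` are assumptions made for illustration and are not taken from the assignment data.

```r
# Toy documents; these strings are invented for illustration only.
docs <- c("data mining finds patterns in data",
          "text mining turns text into data",
          "a word cloud shows frequent words")

# Lower-case, split on anything that is not a letter, drop empty tokens.
tokens <- unlist(strsplit(tolower(docs), "[^a-z]+"))
tokens <- tokens[tokens != ""]

# Count each word and sort from most to least frequent.
freq <- sort(table(tokens), decreasing = TRUE)
word_freq <- data.frame(word = names(freq), freq = as.integer(freq),
                        stringsAsFactors = FALSE)

# Keep words whose count is not less than a chosen threshold (2 is arbitrary).
min_freq <- 2
frequent <- word_freq[word_freq$freq >= min_freq, ]
print(frequent)

# Bar chart of the retained words.
barplot(frequent$freq, names.arg = frequent$word, las = 2,
        main = "Most frequent words", ylab = "Frequency")
```

The same filter, `freq >= min_freq`, could be applied to a word-frequency data frame such as dtm_d before plotting.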
Qn2
Submit the text mining descriptives report, and use R functions on Cardata to generate a
team-wise report with syntax.
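library(dplyr)    # %>% and count()
library(moments)  # skewness() and kurtosis(); assumed source package
Cardata
names(Cardata)
nrow(Cardata)
dim(Cardata)
str(Cardata)
# Descriptive statistics for MSRP
mean(Cardata$MSRP)
median(Cardata$MSRP)
sd(Cardata$MSRP)
skewness(Cardata$MSRP)
kurtosis(Cardata$MSRP)
# Frequency table of Engine.HP
table(Cardata$Engine.HP)
Cardata %>% count(Engine.HP)
x <- Cardata %>% count(Engine.HP)
x$prop <- x$n / 11914          # proportion of the 11914 rows
x
x$per <- (x$n / 11914) * 100   # percentage
x$cum <- cumsum(x$n)           # cumulative frequency
# Pie chart of Driven_Wheels, labelled with sample sizes
mytable <- table(Cardata$Driven_Wheels)
lbls <- paste(names(mytable), "\n", mytable, sep = "")
pie(mytable, labels = lbls, main = "Pie chart of Driven wheels\n(with sample sizes)")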
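The frequency, proportion, percentage and cumulative columns built above can also be reproduced in base R on a small vector. This is a hedged sketch, not the report's own code: the `hp` values and the names `freq_tab` and `total` are invented for illustration.

```r
# Toy horsepower values, invented for illustration only.
hp <- c(230, 300, 300, 172, 230, 230, 335, 172)

counts <- table(hp)                      # frequency of each distinct value
freq_tab <- data.frame(
  value = as.integer(names(counts)),
  n     = as.integer(counts)
)
total <- sum(freq_tab$n)                 # derived from the data, not hard-coded
freq_tab$prop <- freq_tab$n / total      # proportion
freq_tab$per  <- freq_tab$prop * 100     # percentage
freq_tab$cum  <- cumsum(freq_tab$n)      # cumulative frequency
print(freq_tab)
```

Deriving `total` from the counts avoids a hard-coded row count such as 11914. Note that `table()` drops missing values by default, so on real data `total` may be smaller than `nrow()`.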
output
Cardata
Make Model Year
1 BMW 1 Series M 2011
2 BMW 1 Series 2011
3 BMW 1 Series 2011
4 BMW 1 Series 2011
5 BMW 1 Series 2011
6 BMW 1 Series 2012
7 BMW 1 Series 2012
8 BMW 1 Series 2012
9 BMW 1 Series 2012
10 BMW 1 Series 2013
11 BMW 1 Series 2013
12 BMW 1 Series 2013
13 BMW 1 Series 2013
14 BMW 1 Series 2013
15 BMW 1 Series 2013
16 BMW 1 Series 2013
17 BMW 1 Series 2013
18 Audi 100 1992
19 Audi 100 1992
20 Audi 100 1992
21 Audi 100 1992
22 Audi 100 1992
23 Audi 100 1993
24 Audi 100 1993
25 Audi 100 1993
26 Audi 100 1993
27 Audi 100 1993
28 Audi 100 1994
29 Audi 100 1994
30 Audi 100 1994
31 Audi 100 1994
32 Audi 100 1994
33 FIAT 124 Spider 2017
34 FIAT 124 Spider 2017
35 FIAT 124 Spider 2017
36 Mercedes-Benz 190-Class 1991
37 Mercedes-Benz 190-Class 1991
38 Mercedes-Benz 190-Class 1992
39 Mercedes-Benz 190-Class 1992
40 Mercedes-Benz 190-Class 1993
41 Mercedes-Benz 190-Class 1993
42 BMW 2 Series 2016
43 BMW 2 Series 2016
44 BMW 2 Series 2016
45 BMW 2 Series 2016
46 BMW 2 Series 2016
47 BMW 2 Series 2016
48 BMW 2 Series 2016
49 BMW 2 Series 2016
50 BMW 2 Series 2016
51 BMW 2 Series 2017
52 BMW 2 Series 2017
53 BMW 2 Series 2017
54 BMW 2 Series 2017
55 BMW 2 Series 2017
56 BMW 2 Series 2017
57 BMW 2 Series 2017
58 BMW 2 Series 2017
59 Audi 200 1990
60 Audi 200 1990
61 Audi 200 1990
62 Audi 200 1991
Engine.Fuel.Type Engine.HP Engine.Cylinders
1 premium unleaded (required) 335 6
2 premium unleaded (required) 300 6
3 premium unleaded (required) 300 6
4 premium unleaded (required) 230 6
5 premium unleaded (required) 230 6
6 premium unleaded (required) 230 6
7 premium unleaded (required) 300 6
8 premium unleaded (required) 300 6
9 premium unleaded (required) 230 6
10 premium unleaded (required) 230 6
11 premium unleaded (required) 300 6
12 premium unleaded (required) 230 6
13 premium unleaded (required) 300 6
14 premium unleaded (required) 230 6
15 premium unleaded (required) 230 6
16 premium unleaded (required) 320 6
17 premium unleaded (required) 320 6
18 regular unleaded 172 6
19 regular unleaded 172 6
20 regular unleaded 172 6
21 regular unleaded 172 6
22 regular unleaded 172 6
23 regular unleaded 172 6
24 regular unleaded 172 6
25 regular unleaded 172 6
26 regular unleaded 172 6
27 regular unleaded 172 6
28 regular unleaded 172 6
29 regular unleaded 172 6
30 regular unleaded 172 6
31 regular unleaded 172 6
32 regular unleaded 172 6
33 premium unleaded (recommended) 160 4
34 premium unleaded (recommended) 160 4
35 premium unleaded (recommended) 160 4
36 regular unleaded 130 4
37 regular unleaded 158 6
38 regular unleaded 158 6
39 regular unleaded 130 4
40 regular unleaded 130 4
41 regular unleaded 158 6
42 premium unleaded (required) 240 4
43 premium unleaded (required) 240 4
44 premium unleaded (required) 320 6
45 premium unleaded (required) 240 4
46 premium unleaded (required) 240 4
47 premium unleaded (required) 320 6
48 premium unleaded (required) 240 4
49 premium unleaded (required) 320 6
50 premium unleaded (required) 320 6
51 premium unleaded (recommended) 335 6
52 premium unleaded (recommended) 335 6
53 premium unleaded (recommended) 335 6
54 premium unleaded (recommended) 335 6
55 premium unleaded (recommended) 248 4
56 premium unleaded (recommended) 248 4
57 premium unleaded (recommended) 248 4
58 premium unleaded (recommended) 248 4
59 regular unleaded 162 5
60 regular unleaded 162 5
61 regular unleaded 162 5
62 regular unleaded 217 5
Transmission.Type Driven_Wheels Number.of.Doors
1 MANUAL rear wheel drive 2
2 MANUAL rear wheel drive 2
3 MANUAL rear wheel drive 2
4 MANUAL rear wheel drive 2
5 MANUAL rear wheel drive 2
6 MANUAL rear wheel drive 2
7 MANUAL rear wheel drive 2
8 MANUAL rear wheel drive 2
9 MANUAL rear wheel drive 2
10 MANUAL rear wheel drive 2
11 MANUAL rear wheel drive 2
12 MANUAL rear wheel drive 2
13 MANUAL rear wheel drive 2
14 MANUAL rear wheel drive 2
15 MANUAL rear wheel drive 2
16 MANUAL rear wheel drive 2
17 MANUAL rear wheel drive 2
18 MANUAL front wheel drive 4
19 MANUAL front wheel drive 4
20 AUTOMATIC all wheel drive 4
21 MANUAL front wheel drive 4
22 MANUAL all wheel drive 4
23 MANUAL front wheel drive 4
24 AUTOMATIC all wheel drive 4
25 MANUAL front wheel drive 4
26 MANUAL front wheel drive 4
27 MANUAL all wheel drive 4
28 AUTOMATIC front wheel drive 4
29 MANUAL all wheel drive 4
30 MANUAL front wheel drive 4
31 AUTOMATIC front wheel drive 4
32 AUTOMATIC all wheel drive 4
33 MANUAL rear wheel drive 2
34 MANUAL rear wheel drive 2
35 MANUAL rear wheel drive 2
36 MANUAL rear wheel drive 4
37 MANUAL rear wheel drive 4
38 MANUAL rear wheel drive 4
39 MANUAL rear wheel drive 4
40 MANUAL rear wheel drive 4
41 MANUAL rear wheel drive 4
42 AUTOMATIC rear wheel drive 2
43 AUTOMATIC rear wheel drive 2
44 AUTOMATIC rear wheel drive 2
45 AUTOMATIC all wheel drive 2
46 AUTOMATIC all wheel drive 2
47 AUTOMATIC rear wheel drive 2
48 MANUAL rear wheel drive 2
49 AUTOMATIC all wheel drive 2
50 AUTOMATIC rear wheel drive 2
51 AUTOMATIC all wheel drive 2
52 AUTOMATIC rear wheel drive 2
53 AUTOMATIC all wheel drive 2
54 AUTOMATIC rear wheel drive 2
55 AUTOMATIC rear wheel drive 2
56 AUTOMATIC rear wheel drive 2
57 AUTOMATIC all wheel drive 2
58 AUTOMATIC all wheel drive 2
59 AUTOMATIC front wheel drive 4
60 MANUAL all wheel drive 4
61 MANUAL all wheel drive 4
62 MANUAL all wheel drive 4
Market.Category Vehicle.Size Vehicle.Style
1 Factory Tuner,Luxury,High-Performance Compact Coupe
2 Luxury,Performance Compact Convertible
3 Luxury,High-Performance Compact Coupe
4 Luxury,Performance Compact Coupe
5 Luxury Compact Convertible
6 Luxury,Performance Compact Coupe
7 Luxury,Performance Compact Convertible
8 Luxury,High-Performance Compact Coupe
9 Luxury Compact Convertible
10 Luxury Compact Convertible
11 Luxury,High-Performance Compact Coupe
12 Luxury,Performance Compact Coupe
13 Luxury,Performance Compact Convertible
14 Luxury Compact Convertible
15 Luxury,Performance Compact Coupe
16 Luxury,High-Performance Compact Convertible
17 Luxury,High-Performance Compact Coupe
18 Luxury Midsize Sedan
19 Luxury Midsize Sedan
20 Luxury Midsize Wagon
21 Luxury Midsize Sedan
22 Luxury Midsize Sedan
23 Luxury Midsize Sedan
24 Luxury Midsize Wagon
25 Luxury Midsize Sedan
26 Luxury Midsize Sedan
27 Luxury Midsize Sedan
28 Luxury Midsize Wagon
29 Luxury Midsize Sedan
30 Luxury Midsize Sedan
31 Luxury Midsize Sedan
32 Luxury Midsize Wagon
33 Performance Compact Convertible
34 Performance Compact Convertible
35 Performance Compact Convertible
36 Luxury Compact Sedan
37 Luxury Compact Sedan
38 Luxury Compact Sedan
39 Luxury Compact Sedan
40 Luxury Compact Sedan
41 Luxury Compact Sedan
42 Luxury,Performance Compact Coupe
43 Luxury Compact Convertible
44 Factory Tuner,Luxury,High-Performance Compact Convertible
45 Luxury,Performance Compact Coupe
46 Luxury Compact Convertible
47 Factory Tuner,Luxury,High-Performance Compact Coupe
48 Luxury,Performance Compact Coupe
49 Factory Tuner,Luxury,High-Performance Compact Coupe
50 Factory Tuner,Luxury,High-Performance Compact Convertible
51 Factory Tuner,Luxury,High-Performance Compact Coupe
52 Factory Tuner,Luxury,High-Performance Compact Convertible
53 Factory Tuner,Luxury,High-Performance Compact Convertible
54 Factory Tuner,Luxury,High-Performance Compact Coupe
55 Luxury,Performance Compact Convertible
56 Luxury,Performance Compact Coupe
57 Luxury,Performance Compact Coupe
58 Luxury Compact Convertible
59 Luxury Midsize Sedan
60 Luxury Midsize Wagon
61 Luxury Midsize Sedan
62 Luxury,Performance Midsize Sedan
highway.MPG city.mpg Popularity MSRP
1 26 19 3916 46135
2 28 19 3916 40650
3 28 20 3916 36350
4 28 18 3916 29450
5 28 18 3916 34500
6 28 18 3916 31200
7 26 17 3916 44100
8 28 20 3916 39300
9 28 18 3916 36900
10 27 18 3916 37200
11 28 20 3916 39600
12 28 19 3916 31500
13 28 19 3916 44400
14 28 19 3916 37200
15 28 19 3916 31500
16 25 18 3916 48250
17 28 20 3916 43550
18 24 17 3105 2000
19 24 17 3105 2000
20 20 16 3105 2000
21 24 17 3105 2000
22 21 16 3105 2000
23 24 17 3105 2000
24 20 16 3105 2000
25 24 17 3105 2000
26 24 17 3105 2000
27 21 16 3105 2000
28 21 16 3105 2000
29 22 16 3105 2000
30 22 17 3105 2000
31 22 16 3105 2000
32 21 16 3105 2000
33 35 26 819 27495
34 35 26 819 24995
35 35 26 819 28195
36 26 18 617 2000
37 25 17 617 2000
38 25 17 617 2000
39 26 18 617 2000
40 26 18 617 2000
41 25 17 617 2000
42 35 23 3916 32850
43 34 23 3916 38650
44 31 20 3916 48750
45 35 23 3916 34850
46 34 22 3916 40650
47 31 20 3916 44150
48 34 22 3916 32850
49 30 20 3916 46150
50 30 20 3916 50750
51 31 21 3916 46450
52 32 21 3916 49050
53 32 21 3916 51050
54 32 21 3916 44450
55 34 23 3916 38950
56 35 24 3916 33150
57 33 24 3916 35150
58 33 23 3916 40950
59 20 16 3105 2000
60 22 15 3105 2000
61 23 15 3105 2000
62 22 16 3105 2000
[ reached getOption("max.print") -- omitted 11852 rows ]
> names(Cardata)
 [1] "Make" "Model" "Year" "Engine.Fuel.Type"
 [5] "Engine.HP" "Engine.Cylinders" "Transmission.Type" "Driven_Wheels"
 [9] "Number.of.Doors" "Market.Category" "Vehicle.Size" "Vehicle.Style"
[13] "highway.MPG" "city.mpg" "Popularity" "MSRP"
> nrow(Cardata)
[1] 11914
> skewness(Cardata$MSRP)
[1] 11.76902
> kurtosis(Cardata$MSRP)
[1] 268.7673
> dim(Cardata)
[1] 11914 16
> str(Cardata)
'data.frame': 11914 obs. of 16 variables:
$ Make : Factor w/ 48 levels "Acura","Alfa Romeo",..: 6 6 6 6 6 6 6 6 6 6 ...
$ Model : Factor w/ 915 levels "09-Jyaistha",..: 4 3 3 3 3 3 3 3 3 3 ...
$ Year : int 2011 2011 2011 2011 2011 2012 2012 2012 2012 2013 ...
$ Engine.Fuel.Type : Factor w/ 10 levels "diesel","electric",..: 9 9 9 9 9 9 9 9 9 9 ...
$ Engine.HP : int 335 300 300 230 230 230 300 300 230 230 ...
$ Engine.Cylinders : int 6 6 6 6 6 6 6 6 6 6 ...
$ Transmission.Type: Factor w/ 5 levels "AUTOMATED_MANUAL",..: 4 4 4 4 4 4 4 4 4 4 ...
$ Driven_Wheels : Factor w/ 4 levels "all wheel drive",..: 4 4 4 4 4 4 4 4 4 4 ...
$ Number.of.Doors : int 2 2 2 2 2 2 2 2 2 2 ...
$ Market.Category : Factor w/ 73 levels "Crossover","Crossover,Diesel",..: 39 68 65 6
64 64 ...
$ Vehicle.Size : Factor w/ 3 levels "Compact","Large",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Vehicle.Style : Factor w/ 16 levels "2dr Hatchback",..: 9 7 9 9 7 9 7 9 7 7 ...
$ highway.MPG : int 26 28 28 28 28 28 26 28 28 27 ...
$ city.mpg : int 19 19 20 18 18 18 17 20 18 18 ...
$ Popularity : int 3916 3916 3916 3916 3916 3916 3916 3916 3916 3916 ...
$ MSRP : int 46135 40650 36350 29450 34500 31200 44100 39300 36900 37200 ...
>
> table(Cardata$Engine.HP)
55 62 63 66 73 74 78 79 81 82 84 88 90 92 93 94 95 96 97
2 2 13 7 9 18 8 12 19 5 6 3 21 25 28 10 22 9 1
98 99 100 101 102 103 105 106 107 108 109 110 111 113 114 115 116 118 119
32 13 45 22 1 4 10 31 8 31 32 32 7 14 25 33 30 7 14
120 121 122 123 124 125 126 127 128 130 131 132 133 134 135 136 137 138 140
67 18 40 1 4 37 15 58 11 85 12 65 4 44 15 15 16 199 143
141 142 143 144 145 146 147 148 150 151 152 153 154 155 156 157 158 159 160
31 4 43 2 32 12 17 94 232 1 40 23 1 156 2 16 82 21 146
161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179
23 44 2 37 73 57 11 55 46 351 22 44 59 33 87 30 33 32 19
180 181 182 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199
135 24 87 85 241 9 6 54 14 132 3 21 12 23 56 15 28 2 14
200 201 202 203 204 205 206 207 208 210 211 212 214 215 217 218 219 220 221
456 90 18 50 11 72 6 43 32 320 9 12 3 39 19 12 6 171 5
222 223 224 225 227 228 230 231 232 234 235 236 237 238 239 240 241 242 244
21 2 7 72 49 8 88 25 16 5 27 21 14 4 16 268 43 20 15
245 248 250 251 252 253 254 255 256 257 259 260 261 263 264 265 266 268 270
54 29 120 6 73 13 4 56 14 3 17 133 100 12 11 47 34 84 73
271 272 273 274 275 276 278 279 280 281 282 283 284 285 287 288 290 291 292
10 54 16 11 123 30 84 20 120 39 52 70 8 246 20 43 132 26 67
293 295 296 297 298 300 301 302 303 304 305 306 308 310 311 315 316 317 318
16 80 17 20 2 192 6 94 39 36 89 62 26 123 15 35 9 40 30
320 321 322 323 325 328 329 330 332 333 335 337 338 340 342 343 345 348 349
69 18 11 6 93 35 23 57 93 37 54 5 3 44 4 3 21 23 2
350 354 355 359 360 361 362 365 370 372 375 377 380 381 382 383 385 386 389
38 6 158 2 30 1 10 63 11 1 19 2 18 152 7 3 47 7 3
390 394 395 400 401 402 403 404 410 415 416 420 424 425 426 429 430 435 438
44 3 7 52 3 15 9 20 6 6 7 133 1 13 4 14 45 11 3
440 442 443 444 445 449 450 451 453 454 455 456 460 464 467 469 470 475 480
4 6 8 5 36 12 27 1 13 16 30 1 26 4 4 7 4 4 2
483 485 490 493 500 503 505 510 515 518 520 521 523 525 526 530 532 535 536
15 19 2 1 17 6 4 66 6 3 15 6 9 8 6 3 5 1 4
540 543 545 550 552 553 556 557 560 562 563 565 567 568 570 572 573 577 580
15 6 5 27 10 1 7 1 24 7 15 6 12 6 16 1 1 20 5
582 583 592 597 600 604 605 610 611 616 617 620 621 622 624 626 631 632 640
2 6 1 2 11 2 4 4 4 5 2 6 18 1 3 5 7 4 6
641 645 650 651 660 661 662 670 700 707 720 731 750 1001
3 14 21 3 1 1 4 1 6 6 4 3 2 3
> Cardata%>%count(Engine.HP)
# A tibble: 357 x 2
Engine.HP n
<int> <int>
1 55 2
2 62 2
3 63 13
4 66 7
5 73 9
6 74 18
7 78 8
8 79 12
9 81 19
10 82 5
# ... with 347 more rows
> x=Cardata%>%count(Engine.HP)
> x$prop=(x$n)/11914
> x
# A tibble: 357 x 3
Engine.HP n prop
<int> <int> <dbl>
1 55 2 0.000168
2 62 2 0.000168
3 63 13 0.00109
4 66 7 0.000588
5 73 9 0.000755
6 74 18 0.00151
7 78 8 0.000671
8 79 12 0.00101
9 81 19 0.00159
10 82 5 0.000420
# ... with 347 more rows
> x$per=((x$n)/11914)*100
> x$cum=cumsum(x$n)
> x
# A tibble: 357 x 5
Engine.HP n prop per cum
<int> <int> <dbl> <dbl> <int>
1 55 2 0.000168 0.0168 2
2 62 2 0.000168 0.0168 4
3 63 13 0.00109 0.109 17
4 66 7 0.000588 0.0588 24
5 73 9 0.000755 0.0755 33
6 74 18 0.00151 0.151 51
7 78 8 0.000671 0.0671 59
8 79 12 0.00101 0.101 71
9 81 19 0.00159 0.159 90
10 82 5 0.000420 0.0
Conclusion:
We performed descriptive analytics on the Cardata data set. The analysis aims to
build a better understanding of the data and to summarize it in order to identify
patterns. The data set contains 16 columns and 11914 rows.
We used MSRP to compute the mean, median, standard deviation, skewness and
kurtosis. The skewness of 11.77 and kurtosis of 268.77 indicate that MSRP is
strongly right-skewed with very heavy tails. We also examined the dimensions and
structure of the Cardata data set.
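To make the skewness and kurtosis figures easier to interpret, the sketch below writes out one common moment-based definition in base R. It is an illustration under stated assumptions: the function name `skew_kurt` and the toy `prices` vector are invented here, and the formulas use population (biased) central moments, which may differ from the variant a given package implements.

```r
# Moment-based skewness and kurtosis, written out in base R.
# Assumption: population (biased) central moments; packages may use other variants.
skew_kurt <- function(x) {
  x  <- x[!is.na(x)]
  d  <- x - mean(x)
  m2 <- mean(d^2)   # second central moment
  m3 <- mean(d^3)   # third central moment
  m4 <- mean(d^4)   # fourth central moment
  c(skewness = m3 / m2^1.5, kurtosis = m4 / m2^2)
}

# Toy prices, invented for illustration: one large value creates a long right tail.
prices <- c(2000, 2000, 2000, 27495, 31200, 36350, 46135, 500000)
res <- skew_kurt(prices)
print(res)
```

Under this definition a symmetric sample has skewness 0 and a normal distribution has kurtosis 3, so a positive skewness together with a kurtosis far above 3 points to a long right tail driven by a few very large values.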