Aravind Rangamreddy 500195259 As5
data = [4, 8, 13, 15, 21, 27, 32, 37, 45, 45, 46, 75, 85, 121, 155, 207, 254, 300,
1200]
Perform the following tasks:
1) Clean the data (use the 3-standard-deviation method to detect and remove
outliers, if any);
2) Apply data binning to the clean data. Use two different methods: equal width
(number of bins = 6) and equal depth (height) (4 items per bin).
1 Solution –
Step – 1:
Step – 2:
Calculating the variance: first compute the squared deviation of each value from the mean (141.52):
Squared deviation of 1st value: (4 - 141.52)² = (-137.52)² = 18,911.75
Squared deviation of 2nd value: (8 - 141.52)² = (-133.52)² = 17,827.59
Squared deviation of 3rd value: (13 - 141.52)² = (-128.52)² = 16,517.39
Squared deviation of 4th value: (15 - 141.52)² = (-126.52)² = 16,007.31
Squared deviation of 5th value: (21 - 141.52)² = (-120.52)² = 14,525.07
Squared deviation of 6th value: (27 - 141.52)² = (-114.52)² = 13,114.83
Squared deviation of 7th value: (32 - 141.52)² = (-109.52)² = 11,994.63
Squared deviation of 8th value: (37 - 141.52)² = (-104.52)² = 10,924.43
Squared deviation of 9th value: (45 - 141.52)² = (-96.52)² = 9,316.11
Squared deviation of 10th value: (45 - 141.52)² = (-96.52)² = 9,316.11
Squared deviation of 11th value: (46 - 141.52)² = (-95.52)² = 9,124.07
Squared deviation of 12th value: (75 - 141.52)² = (-66.52)² = 4,424.91
Squared deviation of 13th value: (85 - 141.52)² = (-56.52)² = 3,194.51
Squared deviation of 14th value: (121 - 141.52)² = (-20.52)² = 421.07
Squared deviation of 15th value: (155 - 141.52)² = (13.48)² = 181.71
Squared deviation of 16th value: (207 - 141.52)² = (65.48)² = 4,287.63
Squared deviation of 17th value: (253 - 141.52)² = (111.48)² = 12,427.79
Squared deviation of 18th value: (300 - 141.52)² = (158.48)² = 25,115.91
Squared deviation of 19th value: (1200 - 141.52)² = (1058.48)² = 1,120,379.91
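The 3-standard-deviation check that these squared deviations feed into can be sketched in Python as follows. This is only an illustration, not the author's own working. It assumes the population standard deviation (dividing by n); the sample version (dividing by n - 1) gives a slightly wider band but flags the same value. The list is copied from the top of the assignment, where the 17th value is 254, so the mean comes out near 141.58 rather than the 141.52 used in the hand calculation, which works with 253. The outlier conclusion is the same either way.

```python
import statistics

# Data as listed at the top of the assignment (17th value taken as 254).
data = [4, 8, 13, 15, 21, 27, 32, 37, 45, 45, 46, 75, 85,
        121, 155, 207, 254, 300, 1200]

mean = statistics.fmean(data)

# Assumption: population standard deviation (divide by n).
# Use statistics.stdev(data) for the sample version (divide by n - 1).
std = statistics.pstdev(data)

# Values outside mean +/- 3 standard deviations are treated as outliers.
lower = mean - 3 * std
upper = mean + 3 * std

outliers = [x for x in data if not (lower <= x <= upper)]
clean = [x for x in data if lower <= x <= upper]

print(f"mean = {mean:.2f}, std = {std:.2f}")
print(f"allowed range = [{lower:.2f}, {upper:.2f}]")
print("outliers:", outliers)
print("clean data:", clean)
```

Only 1200 falls outside the band, so the cleaned data set has 18 values.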
Step – 3:
2 Solution –
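A minimal Python sketch of the two binning methods named in task 2, offered as an illustration rather than as the author's solution. It assumes the clean data is the original list with 1200 removed, the only value outside mean ± 3 standard deviations. The helper names `equal_width_bins` and `equal_depth_bins` are hypothetical. For equal width, each bin spans (max - min) / 6, and the maximum value is placed in the last bin.

```python
# Assumption: clean data = original list with the single outlier 1200 removed.
clean = [4, 8, 13, 15, 21, 27, 32, 37, 45, 45, 46, 75, 85,
         121, 155, 207, 254, 300]


def equal_width_bins(values, n_bins):
    """Split values into n_bins intervals of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        bins[idx].append(v)
    return bins


def equal_depth_bins(values, depth):
    """Split sorted values into consecutive bins holding `depth` items each."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


width_bins = equal_width_bins(clean, 6)
depth_bins = equal_depth_bins(clean, 4)

for i, b in enumerate(width_bins, start=1):
    print(f"equal-width bin {i}: {b}")
for i, b in enumerate(depth_bins, start=1):
    print(f"equal-depth bin {i}: {b}")
```

With 18 values and 4 items per bin, the equal-depth method yields four full bins and a final bin holding the remaining 2 values. The equal-width bins are each about 49.33 wide, and the first bin holds 11 of the 18 values because the data is right-skewed.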