Professional Documents
Culture Documents
Gorakhpur Center
Ministry of Electronics & Information Technology (MeitY), Government of India
<Shivlok Singh>
<S.T.O.>
Qualitative vs Quantitative
•Discrete data can only take certain values (like whole numbers) Quantitative:
Quantitative:
•Height (Continuous)
•Weight (Continuous)
•Petals on a flower (Discrete)
•Customers in a shop (Discrete)
http://www.nielit.gov.in/gorakhpur /GKP.NIELIT @GKP_NIELIT /NIELITIndia /school/NIELITIndia
Data Collection
Collecting
Data can be collected in many ways. The simplest way is direct observation.
Example: Counting Cars
You want to find how many cars pass by a certain point on a road in a 10-minute interval.
Census or Sample
A Census is when we collect data for every member of the group (the whole "population").
A Sample is when we collect data just for selected members of the group.
A census is accurate, but hard to do. A sample is not as accurate, but may be good enough, and is a lot easier.
2 Numbers
With just 2 numbers the answer is easy: go half-way between.
Example: what is the central value for 3 and 7?
Answer: Half-way between, which is 5.
We can calculate it by adding 3 and 7 and then dividing the result by 2:
(3+7) / 2 = 10/2 = 5
3 or More Numbers
We can use that idea of "adding then dividing" when we have 3 or more numbers:
(13+13+13+13+13+13+1+1+1+1+1) / 11 = 7.5...
The Median age is 13 ... so let's have a Disco! Example: What is the Median of 3, 4, 7, 9, 12, 15
There are two numbers in the middle:
3, 4, 7, 9, 12, 15
So we average them:
(7+9) / 2 = 16/2 = 8
The Median is 8
"13" occurs 6 times, "1" occurs only 5 times, so the mode is 13.
But Mode can be tricky, there can sometimes be more than one Mode.
When there are two modes it is called "bimodal", when there are three or
more modes we call it "multimodal".
They can change the mean a lot, so we can either not use them (and say so)
or use the median or mode instead.
Mean: Add them up, and divide by 5 (as there are 5 numbers):
(3+4+4+5+104) / 5 = 24
24 does not represent those numbers well at all!
Median: They are in order, so just choose the middle number, which is 4:
3, 4, 4, 5, 104
In groups of 10, the "20s" appear most often, so we could choose 25 (the middle of the 20s
group) as the mode. You could use different groupings and get a different answer.
Grouping also helps to find what the typical values are when the real world messes things .
It takes longer when there is break time or lunch so an average is not very useful.
"35-39" appear most often, so we can say it normally takes about 37 minutes to fill a pallet.
Mr. Ajay timed 21 people in the sprint race, to the nearest second:
59, 65, 61, 62, 53, 55, 60, 70, 64, 56, 58, 58, 62, 62, 68, 65, 56, 59, 68, 61, 67
To find the Mean Alex adds up all the numbers, then divides by how many numbers:
Mean = ( 59 + 65 + 61 + 62 + 53 + 55 + 60 + 70 + 64 + 56 + 58 + 58 + 62 + 62 + 68 + 65 + 56 + 59 + 68 +
61 + 67 ) / 21
= 61.38095...
To find the Median , places the numbers in value order and finds the middle number.
In this case the median is the 11th number:
53, 55, 56, 56, 58, 58, 59, 59, 60, 61, 61, 62, 62, 62, 64, 65, 65, 67, 68, 68, 70
Median = 61
To find the Mode, or modal value, places the numbers in value order then counts how many of each
number. The Mode is the number which appears most often (there can be more than one mode):
53, 55, 56, 56, 58, 58, 59, 59, 60, 61, 61, 62, 62, 62, 64, 65, 65, 67, 68, 68, 70
62 appears three times, more often than the other values, so Mode = 62
Q2 = (5+6)/2 = 5.5
In three steps:
1. Find the mean of all values
2. Find the distance of each value from that mean (subtract the mean from each value, ignore minus signs)
3. Then find the mean of those distances
Variance
The Variance is defined as:
The average of the squared differences from the Mean.
Find out the Mean, the Variance, and the Standard Deviation.
Your first step is to find the Mean:
Answer:
Mean=600 + 470 + 170 + 430 + 300/ 5
= 1970/5=394
so the mean (average) height is 394 mm. Let's plot this on the chart:
Now we calculate each dog's difference from the Mean:
• heights of people
• size of things produced by machines
• errors in measurements
• blood pressure
• marks on a test
Standard Scores
• The test must have been really hard, so the Prof decides to Standardize all the scores and only fail
people more than 1 standard deviation below the mean.
• The Mean is 23, and the Standard Deviation is 6.6, and these are the Standard Scores:
• -0.45, -1.21, 0.45, 1.36, -0.76, 0.76, 1.82, -1.36, 0.45, -0.15, -0.91
• Now only 2 students will fail (the ones lower than −1 standard deviation)
• Much fairer!
http://www.nielit.gov.in/gorakhpur /GKP.NIELIT @GKP_NIELIT /NIELITIndia /school/NIELITIndia
Example -2
Some values are less than 1000g ... can you fix that?
• The normal distribution of your measurements looks like this:
• It is a random thing, so we can't stop bags having less than
1000g, but we can try to reduce it a lot.
• at −3 standard deviations:
From the big bell curve above we see that 0.1% are less. But
maybe that is too small.