MELJUN CORTES DATA TYPES Rm104tr-13

Qualitative

No particular order

Examples: Colour, Types of Materials

Quantitative

Discrete

E.g. Shoe size, Number of people

CS113/0401/v1

Lesson 13 - 1

Year 1

RAW DATA

Number of sheets of listing paper used by each of 120 jobs

 8 14 17  7  8  5  9  6 11 18 22 14
 6  9  9  5  8 17 18  7 12 18 14 17
24 11 18 14 29 13 23 16  7 21 14 12
27 13 12 16 27 21 14 11 19  2  7 18
14 21 27 11 10 19 14 12 17 14 21 17
19 24 26  8  7  8  9 14 13 28 16 17
24 13 17 19 16 18 24 27  9  9  8  8
20 16 14 16 19 11 17 23 12 11 22 10
17 24 20  5 16  9  7 18 12 10  7 19
13 25 18 21 10 15 11 14 14 28 12 10

Sheets used   Frequency   Tally
 0 -  4            1      |
 5 -  9           26      ||||| ||||| ||||| ||||| ||||| |
10 - 14           37      ||||| ||||| ||||| ||||| ||||| ||||| ||||| ||
15 - 19           31      ||||| ||||| ||||| ||||| ||||| ||||| |
20 - 24           16      ||||| ||||| ||||| |
25 - 29            9      ||||| ||||
Total            120

Raw data

Raw data are collected data which have not been organized numerically.

Array

An array is an arrangement of raw numerical data in ascending or descending order of magnitude. The difference between the largest and smallest numbers is called the range of the data.
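
As a rough illustration of the two definitions, the sketch below sorts a small sample into an array and computes its range. The function names `to_array` and `data_range` are made up for this example, and the sample is just the first ten values of the listing-paper data.

```python
def to_array(raw, descending=False):
    """Arrange raw numerical data in order of magnitude (an array)."""
    return sorted(raw, reverse=descending)


def data_range(raw):
    """Range = largest value minus smallest value."""
    return max(raw) - min(raw)


# First ten values of the listing-paper data, used only as a small sample.
sample = [8, 14, 17, 7, 8, 5, 9, 6, 11, 18]

print(to_array(sample))                   # ascending order
print(to_array(sample, descending=True))  # descending order
print(data_range(sample))                 # 18 - 5
```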

Frequency distribution

When summarizing large amounts of raw data, it is often useful to distribute the data into classes or categories and to determine the number of individuals belonging to each class, called the class frequency.
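
The following sketch shows one way this could be done for the 120 listing-paper values. It assumes classes of width 5 starting at 0 (0 - 4, 5 - 9, and so on); that choice of classes and the helper name `frequency_distribution` are assumptions made for illustration.

```python
from collections import Counter

# The 120 raw values (sheets of listing paper used by each job).
raw = [
    8, 14, 17, 7, 8, 5, 9, 6, 11, 18, 22, 14,
    6, 9, 9, 5, 8, 17, 18, 7, 12, 18, 14, 17,
    24, 11, 18, 14, 29, 13, 23, 16, 7, 21, 14, 12,
    27, 13, 12, 16, 27, 21, 14, 11, 19, 2, 7, 18,
    14, 21, 27, 11, 10, 19, 14, 12, 17, 14, 21, 17,
    19, 24, 26, 8, 7, 8, 9, 14, 13, 28, 16, 17,
    24, 13, 17, 19, 16, 18, 24, 27, 9, 9, 8, 8,
    20, 16, 14, 16, 19, 11, 17, 23, 12, 11, 22, 10,
    17, 24, 20, 5, 16, 9, 7, 18, 12, 10, 7, 19,
    13, 25, 18, 21, 10, 15, 11, 14, 14, 28, 12, 10,
]


def frequency_distribution(data, width=5, start=0):
    """Return [((lower, upper), class_frequency), ...] for integer data."""
    counts = Counter((x - start) // width for x in data)
    table = []
    for k in range(max(counts) + 1):
        lower = start + k * width
        upper = lower + width - 1
        table.append(((lower, upper), counts.get(k, 0)))
    return table


for (lower, upper), f in frequency_distribution(raw):
    print(f"{lower:2d} - {upper:2d}  {f}")
```

Changing `width` or `start` regroups the same raw data into different classes, which is a quick way to see how the choice of classes affects the shape of the distribution.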

EXAMPLE

A sample of 100 students is taken from an alphabetical listing of a university's records. Their masses, ranging from 60 kg to 74 kg, are tabulated.

Mass (kilograms)   Number of Students
60 - 62                  5
63 - 65                 18
66 - 68                 42
69 - 71                 27
72 - 74                  8
Total                  100

The first class or category, for example, consists of masses from 60 to 62 kg and is indicated by the symbol 60 - 62. Since 5 students have masses belonging to this class, the corresponding class frequency is 5. Data organized and summarized as in the above frequency distribution are often called grouped data.

CLASS INTERVAL

A symbol defining a class, such as 60 - 62, is called a class interval. The end numbers, 60 and 62, are called the class limits. The smaller number, 60, is the lower class limit and the larger number, 62, is the upper class limit.

CLASS MARK

A class mark is the midpoint of the class interval and is obtained by adding the lower and upper class limits and dividing by two. In the previous example, the class mark of the interval 60 - 62 is (60 + 62) / 2 = 61.
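
A minimal sketch of the same calculation applied to every class of the student-mass table; the name `class_mark` is chosen here for illustration.

```python
def class_mark(lower_limit, upper_limit):
    """Midpoint of a class interval: (lower limit + upper limit) / 2."""
    return (lower_limit + upper_limit) / 2


# Class limits from the student-mass frequency distribution.
mass_classes = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

marks = [class_mark(lower, upper) for lower, upper in mass_classes]
print(marks)
```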

MEDIAN (1)

The median of a set of numbers arranged in order of magnitude is the middle value or the arithmetic mean of the two middle values.

Example 1

The set of numbers 3, 4, 4, 5, 6, 8, 8, 8, 10 has N = 9 values. For an odd number of data the median occurs at position (N + 1) / 2 = 10 / 2 = 5, i.e. the 5th position, so the median is 6.

MEDIAN (2)

Example 2

For an even number of data the median is the average of the two middle values. With the values at positions 4 and 5 being 9 and 11:

Median = (Pos 4 + Pos 5) / 2 = (9 + 11) / 2 = 10
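
The odd and even cases can be sketched together as below. The nine-value set is the one from Example 1; the eight-value set is an illustrative assumption, chosen only so that positions 4 and 5 hold 9 and 11.

```python
def median(values):
    """Middle value of the ordered data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: position (N + 1) / 2, which is index mid when counting from 0.
        return ordered[mid]
    # Even N: average of positions N / 2 and N / 2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_set = [3, 4, 4, 5, 6, 8, 8, 8, 10]    # Example 1
even_set = [5, 5, 7, 9, 11, 12, 15, 18]   # hypothetical eight-value set

print(median(odd_set))
print(median(even_set))
```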

MEDIAN (1)

For grouped data the median, obtained by interpolation, is given by

$$\text{Median} = L_1 + \left( \frac{\frac{N}{2} - \left(\sum f\right)_1}{f_{\text{median}}} \right) c$$

Where

$L_1$ = lower class boundary of the median class (i.e. the class containing the median)

$N$ = number of items in the data (i.e. total frequency)

$\left(\sum f\right)_1$ = sum of frequencies of all classes lower than the median class

$f_{\text{median}}$ = frequency of the median class

$c$ = size of the median class interval
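
As a hedged sketch, the interpolation formula Median = L1 + ((N/2 - sum of lower frequencies) / f_median) x c can be applied to the student-mass table. The code assumes masses are recorded to the nearest kilogram, so each class boundary lies 0.5 below the lower limit and 0.5 above the upper limit; that assumption and the name `grouped_median` are not from the slides.

```python
def grouped_median(class_limits, freqs, gap=1):
    """Median of grouped data by interpolation within the median class.

    gap is the distance between one class's upper limit and the next
    class's lower limit (1 for data recorded to the nearest unit).
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("no data")
    half = n / 2
    below = 0  # sum of frequencies of classes lower than the current one
    for (lower, upper), f in zip(class_limits, freqs):
        if f > 0 and below + f >= half:
            l1 = lower - gap / 2          # lower class boundary
            c = (upper + gap / 2) - l1    # size of the class interval
            return l1 + (half - below) / f * c
        below += f
    raise ValueError("median class not found")


mass_classes = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]
mass_freqs = [5, 18, 42, 27, 8]

# Median class is 66 - 68: 65.5 + (50 - 23) / 42 * 3
print(round(grouped_median(mass_classes, mass_freqs), 2))
```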

Search for the middle value (N / 2) on the cumulative frequency axis and read off the corresponding value on the x axis.

This is the median.

MODE (1)

The mode of a set of numbers is the value which occurs with the greatest frequency, i.e. it is the most common value. The mode may not exist, and even if it does exist it may not be unique.

Example

The set 2, 2, 5, 7, 9, 9, 9, 10, 11, 12, 18 has mode 9.

Example

The set 3, 5, 8, 10, 12, 15, 16 has no mode.

MODE (2)

Example

The set 2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9 has two modes, 4 and 7, and is called bimodal.
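
The three situations (one mode, no mode, two modes) can be sketched with a single helper. Returning an empty list when every value occurs only once is a convention assumed here to match the "no mode" example; the function name `modes` is also an assumption.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the greatest frequency.

    An empty list means no mode (every value occurs exactly once).
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([2, 2, 5, 7, 9, 9, 9, 10, 11, 12, 18]))  # unimodal
print(modes([3, 5, 8, 10, 12, 15, 16]))              # no mode
print(modes([2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9]))      # bimodal
```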

Ungrouped data

Grouped data

MODAL CLASS

x          f
51 - 55    12
56 - 60    16
61 - 65    10

56 - 60 is the modal class.

N.B. We don't know the x values before grouping, so we can't find the mode exactly.

MODE (1)

For grouped data where a frequency curve has been constructed to fit the data, the mode will be the value (or values) of x corresponding to the maximum point (or points) on the curve. From a frequency distribution or histogram the mode can be obtained from the following formula:

$$\text{Mode} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$$

Where

$L_1$ = lower class boundary of the modal class (i.e. the class containing the mode)

$\Delta_1$ = excess of the modal frequency over the frequency of the next lower class

$\Delta_2$ = excess of the modal frequency over the frequency of the next higher class

$c$ = size of the modal class interval

$$\begin{aligned}
\text{Mode} &= 25 + 5 \times \frac{40}{40 + 64} \\
&= 25 + 5 \times \frac{40}{104} \\
&= 25 + 1.9 \\
&= 26.9
\end{aligned}$$
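
A small sketch of the grouped-mode formula, Mode = L1 + (d1 / (d1 + d2)) x c, reproducing the arithmetic above with L1 = 25, d1 = 40, d2 = 64 and c = 5. The function and parameter names are assumptions for illustration.

```python
def grouped_mode(l1, delta1, delta2, c):
    """Mode of grouped data.

    l1     -- lower class boundary of the modal class
    delta1 -- modal frequency minus frequency of the next lower class
    delta2 -- modal frequency minus frequency of the next higher class
    c      -- size of the modal class interval
    """
    if delta1 + delta2 == 0:
        raise ValueError("modal class is not higher than its neighbours")
    return l1 + delta1 / (delta1 + delta2) * c


mode = grouped_mode(l1=25, delta1=40, delta2=64, c=5)
print(round(mode, 1))
```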

The arithmetic mean, or the mean, of a set of N numbers X1, X2, X3, ..., XN is denoted by $\bar{X}$ and is defined as

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N}$$

Add them = 128

Divide by 8 = 16

This is the arithmetic mean. It is the most common definition of average. It only works with quantitative data.
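
The "add them, then divide by how many" steps can be sketched as follows. The eight values are hypothetical, picked only so that they add up to 128 like the totals quoted above; they are not the slide's original data.

```python
def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical data: eight values whose total is 128.
values = [12, 14, 15, 16, 16, 17, 18, 20]

print(sum(values))              # add them
print(arithmetic_mean(values))  # divide by 8
```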

If the numbers X1, X2, X3, ..., Xn occur f1, f2, f3, ..., fn times respectively, the arithmetic mean is

$$\bar{X} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_n X_n}{f_1 + f_2 + \dots + f_n} = \frac{\sum_{i=1}^{n} f_i X_i}{\sum_{i=1}^{n} f_i}$$

Age (x)   Frequency (f)   fx
17              3          51
18              8         144
19             14         266
20             21         420
21             24         504
22             13         286
23              7         161
24              6         144
25              3          75
26              1          26
Total         100        2077

Mean age = Σfx / Σf = 2077 / 100 = 20.77
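
The age table can be checked with a short sketch of the frequency-weighted mean (sum of f times x, divided by sum of f). The name `weighted_mean` is an assumption for this example.

```python
def weighted_mean(xs, fs):
    """Arithmetic mean of values xs occurring with frequencies fs."""
    total_f = sum(fs)
    if total_f == 0:
        raise ValueError("total frequency is zero")
    return sum(f * x for x, f in zip(xs, fs)) / total_f


ages = [17, 18, 19, 20, 21, 22, 23, 24, 25, 26]
freqs = [3, 8, 14, 21, 24, 13, 7, 6, 3, 1]

fx = [f * x for x, f in zip(ages, freqs)]  # the fx column
print(sum(freqs), sum(fx))                 # column totals
print(weighted_mean(ages, freqs))          # mean age
```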

HISTOGRAMS (1)

Only used for quantitative data.

A histogram is like a bar chart, but with no gaps between bars and a calibrated horizontal axis.

The order of bars depends on value and on the horizontal scale.

HISTOGRAMS (2)

HISTOGRAMS (3)

AREA IN HISTOGRAMS

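Lines of code         No of Programs
100 to under 125            3
125 to under 150           12
150 to under 175           24
175 to under 200           42
200 to under 225           51
225 to under 250           39
250 to under 275           30
275 to under 300           21
300 to under 325           12
325 to under 350            6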

Table 2:

Lines of code (less than)   Cumulative Frequency
100                               0
125                               3
150                              15
175                              39
200                              81
225                             132
250                             171
275                             201
300                             222
325                             234
350                             240
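
The cumulative frequencies in Table 2 are running totals of the numbers of programs per class. A minimal sketch, using `itertools.accumulate` from the standard library; the variable names are chosen for illustration.

```python
from itertools import accumulate

# Number of programs in each 25-line class, starting at 100 lines of code.
programs = [3, 12, 24, 42, 51, 39, 30, 21, 12, 6]

# "Less than" boundaries: 100, 125, ..., 350.
less_than = [100 + 25 * k for k in range(len(programs) + 1)]

# No program has fewer than 100 lines, so the running total starts at 0.
cumulative = [0] + list(accumulate(programs))

for bound, cf in zip(less_than, cumulative):
    print(f"{bound:3d}  {cf:3d}")
```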
[Figure: cumulative frequency curve. Vertical axis: Cumulative Frequency, 0 to 240. Horizontal axis: Lines of code (less than), 0 to 350.]
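
Reading a value off the cumulative frequency curve can be imitated numerically. The sketch below assumes the plotted points are joined by straight lines, which is an approximation to a hand-drawn curve, and looks up the lines-of-code value at cumulative frequency N / 2 = 120, i.e. the median. The helper name `read_from_curve` is made up for this example.

```python
def read_from_curve(bounds, cumulative, target):
    """Find x where the cumulative frequency polygon reaches target."""
    for i in range(1, len(bounds)):
        lo_cf, hi_cf = cumulative[i - 1], cumulative[i]
        if hi_cf > lo_cf and lo_cf <= target <= hi_cf:
            fraction = (target - lo_cf) / (hi_cf - lo_cf)
            return bounds[i - 1] + fraction * (bounds[i] - bounds[i - 1])
    raise ValueError("target is outside the range of the curve")


less_than = [100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350]
cumulative = [0, 3, 15, 39, 81, 132, 171, 201, 222, 234, 240]

n = cumulative[-1]
median_loc = read_from_curve(less_than, cumulative, n / 2)
print(round(median_loc, 1))
```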

The Standard Deviation of a set of N numbers X1, X2, ..., XN is denoted by S.D. and is defined by

$$\text{S.D.} = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}}$$

Where

$\bar{X}$ = arithmetic mean

$N$ = total number of elements in the set
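
A minimal sketch of this definition, dividing by N: the square root of the mean squared deviation from the arithmetic mean. The eight values are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import math


def standard_deviation(values):
    """Square root of the mean squared deviation from the arithmetic mean."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation of an empty data set is undefined")
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


# Hypothetical data: mean 5, squared deviations sum to 32, so S.D. = sqrt(32 / 8).
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(values))
```

Python's standard library offers `statistics.pstdev`, which uses the same divide-by-N definition.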

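If X1, X2, ..., Xn occur with frequencies f1, f2, ..., fn respectively, the standard deviation can be written as

$$\text{S.D.} = \sqrt{\frac{\sum_{j=1}^{n} f_j (X_j - \bar{X})^2}{\sum_{j=1}^{n} f_j}}$$

or

$$\text{S.D.} = \sqrt{\frac{\sum f_i X_i^2}{\sum f_i} - \left( \frac{\sum f_i X_i}{\sum f_i} \right)^2}$$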

Question 6 c) NCC 1/93: On test, the actual access times for 50 hard disc drives were distributed as follows:

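Alternative 2, Question 6c

x        f     fx        fx²
22.6     1     22.6      510.76
22.7     3     68.1     1545.87
22.8     6    136.8     3119.04
22.9    10    229.0     5244.10
23.0    14    322.0     7406.00
23.1     9    207.9     4802.49
23.2     5    116.0     2691.20
23.3     2     46.6     1085.78
Total   50   1149.0    26405.24

(1 mark for each total)

$$\text{S.D.} = \sqrt{\frac{\sum f x^2}{\sum f} - (\bar{X})^2} = \sqrt{\frac{26405.24}{50} - (22.98)^2} = 0.156$$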
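
The working for Question 6c can be reproduced with a short sketch that computes the standard deviation both ways: from the squared deviations, and from the shortcut (sum of fx squared over sum of f, minus the mean squared). The function names are assumptions for illustration.

```python
import math

x = [22.6, 22.7, 22.8, 22.9, 23.0, 23.1, 23.2, 23.3]  # access times
f = [1, 3, 6, 10, 14, 9, 5, 2]                         # number of drives


def sd_from_deviations(xs, fs):
    """sqrt( sum f (x - mean)^2 / sum f )"""
    n = sum(fs)
    mean = sum(fi * xi for xi, fi in zip(xs, fs)) / n
    return math.sqrt(sum(fi * (xi - mean) ** 2 for xi, fi in zip(xs, fs)) / n)


def sd_shortcut(xs, fs):
    """sqrt( sum f x^2 / sum f - mean^2 )"""
    n = sum(fs)
    mean = sum(fi * xi for xi, fi in zip(xs, fs)) / n
    mean_of_squares = sum(fi * xi * xi for xi, fi in zip(xs, fs)) / n
    return math.sqrt(mean_of_squares - mean ** 2)


print(round(sum(fi * xi for xi, fi in zip(x, f)), 1))       # total of fx
print(round(sum(fi * xi * xi for xi, fi in zip(x, f)), 2))  # total of fx squared
print(round(sd_shortcut(x, f), 3))
```

The shortcut form subtracts two nearly equal numbers, so with floating-point arithmetic the deviation form is usually the safer choice when the spread is tiny compared with the mean.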

VARIANCE

The variance of a set of data is defined as the square of the standard deviation and is thus given by

$$(\text{S.D.})^2 = \frac{\sum_{j=1}^{n} f_j (X_j - \bar{X})^2}{\sum_{j=1}^{n} f_j}$$


