
For use only in [the name of your school] 2014

S1 Note

S1 Notes
(Edexcel)

Copyright www.pgmaths.co.uk - For AS, A2 notes and IGCSE / GCSE worksheets 1


Definitions for S1
Statistical Experiment
A test/investigation/process adopted for collecting data to provide evidence for or against a
hypothesis.

“Explain briefly why mathematical models can help to improve our understanding of real
world problems”
Simplifies a real world problem; enables us to gain a quicker / cheaper understanding of a real world
problem

Advantage and disadvantage of statistical model


Advantage : cheaper and quicker
Disadvantage : not fully accurate

“Write down two reasons for using statistical models”


To simplify a real world problem
To improve understanding / describe / analyse a real world problem
Quicker and cheaper than using real thing
To predict possible future outcomes
Refine model / change parameters possible
(Any two of the above.)

“Statistical models can be used to describe real world problems. Explain the process involved
in the formulation of a statistical model.”
• Observe real-world problem
• Devise a statistical model and collect data
• (Experimental) data collected
• Model used to make predictions
• Compare and observe against expected outcomes and test model;
• Statistical concepts are used to test how well the model describes the real-world problem
• Refine model if necessary.

A sample space
A list of all possible outcomes of an experiment

Event
Sub-set of possible outcomes of an experiment.

Normal Distribution
• Bell shaped curve
• Symmetrical about the mean; mean = mode = median
• Approximately 95% of data lies within 2 standard deviations of the mean
• Approximately 68.3% of data lies within 1 standard deviation of the mean
• Curve is asymptotic to the horizontal axis (S1 Jan 04)
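
The two percentages above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `proportion_within` is an illustrative assumption, and it relies on the standard result that the proportion of a normal distribution within k standard deviations of the mean is erf(k / √2).

```python
import math


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean.

    Hypothetical helper for illustration; uses the identity P(|Z| < k) = erf(k / sqrt(2)).
    """
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    print(f"Within {k} standard deviation(s): {proportion_within(k):.1%}")
```

Running this gives values close to 68.3% for one standard deviation and a little over 95% for two, so the "95%" quoted in the list is a rounded figure.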

Independent Events
P( A ∩ B)= P( A) × P( B)

Mutually Exclusive Events


P( A ∩ B) = 0
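
As an illustration of the two definitions above, here is a small sketch using one roll of a fair die. The sample space, the chosen events and the function names (`prob`, `independent`, `mutually_exclusive`) are assumptions made for this example and do not come from the notes.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


def independent(a, b):
    """True when P(A ∩ B) = P(A) × P(B)."""
    return prob(a & b) == prob(a) * prob(b)


def mutually_exclusive(a, b):
    """True when P(A ∩ B) = 0."""
    return prob(a & b) == 0


even = {2, 4, 6}
odd = {1, 3, 5}
at_most_two = {1, 2}

print(independent(even, at_most_two))   # P = 1/6 on both sides
print(mutually_exclusive(even, odd))    # no outcome is both even and odd
print(independent(even, odd))           # 0 is not equal to 1/4
```

Note that "even" and "odd" are mutually exclusive but not independent, which is a common point of confusion.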


Explanatory and response variables


The response variable is the dependent variable. It depends on the explanatory variable (also called
the independent variable). So in a graph of length of life versus number of cigarettes smoked per
week, the dependent variable would be length of life. It depends (or may do) on the number of
cigarettes smoked per week.

Give two reasons to justify the use of statistical models


Used to simplify or represent a real world problem
Cheaper or quicker or easier (than the real situation) or more easily modified
To improve understanding of the real world problem
Used to predict outcomes from a real world problem (idea of predictions)
(Any two of the above.)

Describe the main features and uses of a box plot.

Two tests for skewness

Positive skew if ( Q3 − Q2 ) − ( Q2 − Q1 ) > 0 and if Mean > Median > Mode

Negative skew if ( Q3 − Q2 ) − ( Q2 − Q1 ) < 0 and if Mean < Median < Mode
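
The two tests can be written out as a short sketch. The function names and the labels returned when neither condition holds ("symmetric", "inconclusive") are assumptions added for illustration; the notes only state the positive and negative cases.

```python
def quartile_skew(q1, q2, q3):
    """Skewness test based on the quartiles: sign of (Q3 - Q2) - (Q2 - Q1)."""
    difference = (q3 - q2) - (q2 - q1)
    if difference > 0:
        return "positive"
    if difference < 0:
        return "negative"
    return "symmetric"  # assumed label when the difference is exactly zero


def average_skew(mean, median, mode):
    """Skewness test based on the ordering of mean, median and mode."""
    if mean > median > mode:
        return "positive"
    if mean < median < mode:
        return "negative"
    return "inconclusive"  # assumed label when neither ordering holds


# Example with made-up quartiles: (22 - 14) - (14 - 10) = 4 > 0
print(quartile_skew(10, 14, 22))
print(average_skew(mean=5, median=4, mode=3))
```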

A good way to remember the condition for skewness involving mean, median and mode is as follows.

Write down the three averages in alphabetical order, that is

Mean…Median…Mode

For positive skew fill the blank with a “>” sign to give

Mean > Median > Mode

For negative skew fill the blank with a “<” sign to give

Mean < Median < Mode

Which sort of average and which sort of measure of dispersion should you use?
If there are outliers or if the data is skewed then use the median and IQR.
If there are no outliers and the data is not skewed then use the mean and standard deviation.

Data
Discrete
Discrete data can only take certain values in any given range. Number of cars in a household is an
example of discrete data. The values do not have to be whole numbers (e.g. shoe size is discrete).

Continuous
Continuous data can take any value in a given range. So a person’s height is continuous since it
could be any value within set limits.

Categorical
Categorical data is data which is not numerical, such as choice of breakfast cereal etc.

Data may be displayed as grouped data or ungrouped data.

We say that data is “grouped” when we present it in the following way:

Weight (w) Frequency


65- 3
70- 7

Or
Score (s) Frequency
5-9 2
10-14 5

NB: We can group discrete data or continuous data.

We must know how to interpret these groups, so that
Weight (w)
65- 65 ≤ w < 70
70- 70 ≤ w < 75

Or
Score (s)
5-9 4.5 ≤ s < 9.5
10-14 9.5 ≤ s < 14.5
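
For groups written with stated limits such as 5-9, the boundaries can be found by extending each limit by half the gap between consecutive classes. The sketch below assumes that convention; the function names and the `gap` parameter are hypothetical and introduced only for this example.

```python
def class_boundaries(lower_limit, upper_limit, gap=1):
    """Return (lower boundary, upper boundary) for a class such as 5-9.

    `gap` is the distance between the upper limit of one class and the
    lower limit of the next (1 for classes 5-9, 10-14, ...).
    """
    half = gap / 2
    return lower_limit - half, upper_limit + half


def class_width(lower_limit, upper_limit, gap=1):
    """Class width measured between the boundaries, not the stated limits."""
    lower, upper = class_boundaries(lower_limit, upper_limit, gap)
    return upper - lower


print(class_boundaries(5, 9))    # boundaries of the class 5-9
print(class_boundaries(10, 14))  # boundaries of the class 10-14
print(class_width(5, 9))         # width between the boundaries
```

Classes written as "65-" need no adjustment, since 65 ≤ w < 70 already states the boundaries.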

Representation of Data
Histograms, stem and leaf diagrams, box plots. Use to compare distributions. Back-to-back stem
and leaf diagrams may be required.

Stem and Leaf Diagrams

The stem and leaf diagram is a very useful way of grouping data whilst retaining the original data.

For example suppose we had the following scores from children in a Maths test:
85, 18, 38, 67, 43, 75, 78, 81, 92, 71, 52, 62, 49, 62, 82, 69, 55, 57, 95, 62, 37

We see that the smallest value is 18 and the largest is 95. The classes of stem and leaf diagrams
must be of equal width and so it would seem sensible to choose classes 10-19, 20-29, etc.

The “stem” in this case represents the tens and the “leaf” represents the units so we have the
following:

Scores in Maths Test


Stem (Tens) | Leaf (Units)
1 | 8
2 |
3 | 87
4 | 39
5 | 257
6 | 72292
7 | 581
8 | 512
9 | 25

We then arrange these in numerical order to give the following:

Scores in Maths Test


NB: the data must be in order in a Stem and Leaf Diagram.

Stem (Tens) | Leaf (Units)
1 | 8
2 |
3 | 78
4 | 39
5 | 257
6 | 22279
7 | 158
8 | 125
9 | 25

We must also include a “key” with the diagram, so we say

1 | 8 means 18

This diagram tells us the basic shape of the distribution. We can easily see the smallest and largest
values and we can see that the mode is 62. We can also use it to calculate Q1 , Q2 and Q3 .
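
The construction above can be reproduced with a short sketch. It uses the test scores from this section; the function name `stem_and_leaf` and the choice to print leaves without spaces are assumptions made for illustration.

```python
from collections import Counter

# Scores from the Maths test example in this section.
scores = [85, 18, 38, 67, 43, 75, 78, 81, 92, 71, 52,
          62, 49, 62, 82, 69, 55, 57, 95, 62, 37]


def stem_and_leaf(data):
    """Return {stem: ordered leaves}, keeping empty stems so classes stay equal width."""
    stems = {s: [] for s in range(min(data) // 10, max(data) // 10 + 1)}
    for value in sorted(data):
        stems[value // 10].append(value % 10)  # tens digit is the stem, units the leaf
    return stems


diagram = stem_and_leaf(scores)
for stem, leaves in diagram.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
print("Key: 1 | 8 means 18")

# The mode is the most frequent score.
mode = Counter(scores).most_common(1)[0][0]
print("Mode:", mode)
```

Sorting the data before filling the stems means the leaves come out already in numerical order, so the separate "arrange in order" step is not needed.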


NB: If we wanted to represent the interval 18-22 on a stem & leaf we could not make 1 the stem
since not all the numbers would begin with 1. What we could do is have a stem of 18 and then make
the leaf the number we add on to the stem. In this case our key would be:

18 | 0 means 18 and 18 | 4 means 22

Back to back stem diagrams

We can use these to compare two samples by using a “back to back stem plot”. In this we put the stems
down the middle and then one set of data on the left and the other on the right. So we might end up
with a diagram as follows:

Physics | Stem | Maths
     75 |  1   | 8
      1 |  2   |
    653 |  3   | 78
    421 |  4   | 39
  94310 |  5   | 257
    842 |  6   | 22279
     63 |  7   | 158
     51 |  8   | 125
        |  9   | 25

Our key here would be


In Physics 7 | 1 means 17
In Maths 1 | 8 means 18

Histograms
Data that has been grouped can be represented using a histogram.
A histogram is made up of rectangles of varying widths and heights – there are no gaps between the
blocks.

The key feature of a histogram is that the area of each block is proportional to the frequency

In order for the area to be equal (or proportional) to the frequency we plot frequency density on the
vertical axis, where

frequency density = frequency / class width

The class width is the width of the interval (i.e. it runs from the lower boundary to the upper boundary).
Remember: Frequency Density is Frequency Divided by class width.

Example Plot a histogram for the following:

Length (h)   Frequency   Class width   Frequency Density
650-         3           20            0.15
670-         7           10            0.7
680-         20          10            2
690-         16          10            1.6
700-720      4           20            0.2

So the first block runs from 650 to 670 and has height 0.15 etc.
[Histogram: frequency density (FD) on the vertical axis against length on the horizontal axis]
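
The frequency densities in the table can be recomputed with a short sketch. The data are the classes from the example; the tuple layout and the function name `frequency_density` are assumptions made for illustration.

```python
# (lower boundary, upper boundary, frequency) for each class in the example.
classes = [
    (650, 670, 3),
    (670, 680, 7),
    (680, 690, 20),
    (690, 700, 16),
    (700, 720, 4),
]


def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)


densities = [frequency_density(*c) for c in classes]

# Area of each block = class width x frequency density, which recovers the frequency.
areas = [(upper - lower) * fd for (lower, upper, _), fd in zip(classes, densities)]

for (lower, upper, freq), fd in zip(classes, densities):
    print(f"{lower}-{upper}: frequency {freq}, width {upper - lower}, density {fd:g}")
```

Multiplying each density back by its class width returns the original frequency, which is the sense in which the area of each block is proportional to the frequency.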

NB: If there are gaps between the stated upper limit of one class interval and the lower limit of
the next class interval then we need to fill those gaps as shown below. For example,

Length (m)
15-19   14.5 ≤ x < 19.5
20-24   19.5 ≤ x < 24.5
25-29   24.5 ≤ x < 29.5

So the class width is 5, not 19 − 15 = 4. The class on the horizontal axis runs from 14.5 to 19.5.

When a question says “give a reason to justify the use of a histogram to represent these data”,
the answer is “Data is continuous”.

NB: Be careful with age since “15-19” would mean 15 ≤ x < 20 since one is 19 until the moment
before one’s 20th birthday.
The shape of the histogram gives us information about the mean and the dispersion