
Academic year 2023 – 24

Unit: Reasoning with data (Statistics) Grade: MYP 5

Key concept: Relationships

Related concepts: Representation, Validity

Global context: Globalization and sustainability

Exploration: Students will explore data-driven decision-making

SOI: Inquiring about the representation and validity of data can help to establish the
underlying relationships and trends, thus enhancing our decision-making skills.

Connection with SOI: Students will explore the validity and representation of
data using different techniques of data collection and presentation, which will enhance
their data-driven decision-making skills.

Topic: Basic Concepts of Statistics

Data:
The word 'data' is the plural of 'datum'. The dictionary meaning of 'datum' is 'fact', so
'data' signifies more than one fact. In a wider sense, the term data denotes evidence or facts
describing a group or a situation. In practice, however, the statistical term is generally used
for numerical facts such as measurements of height or scores on achievement tests.
Data can also be classified as primary or secondary.

Primary data is collected by a researcher from first-hand sources, using methods such as
surveys, interviews or experiments. It is collected with the research project in mind, directly from
primary sources. Secondary data, by contrast, is gathered from studies, surveys or experiments
that have been run by other people or for other research.

Statistics is a branch of applied mathematics that involves the collection, description and analysis
of quantitative data, and the drawing of conclusions from it. In Statistics we deal with data
collection, presentation, analysis and interpretation of results.

Data can be collected from

Population (the entire list of a specified group)

Sample (a subset of the Population)

We usually investigate a small sample of the population to draw conclusions about
the whole population.

A systematic arrangement of the data in tabular form is called tabulation or
presentation of the data. This grouping results in a table called the frequency table, which
indicates the number of observations within each group. Many conclusions about the
characteristics of the data, the behaviour of variables, etc. can be drawn from this table.
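
As an illustration, the grouping described above can be sketched in Python. This is only a
minimal sketch: the function name frequency_table and the class width of 10 are assumptions
made for the example, and the values are the 18 data points listed in the stem and leaf
example of this handout.

```python
# Values taken from the stem and leaf example in this handout.
data = [12, 14, 16, 16, 20, 21, 21, 21, 25, 32, 39, 40, 43, 44, 47, 48, 49, 53]


def frequency_table(values, width):
    """Count how many values fall in each class [start, start + width)."""
    table = {}
    for v in values:
        start = (v // width) * width  # lower boundary of the class containing v
        table[start] = table.get(start, 0) + 1
    return dict(sorted(table.items()))


# A class width of 10 is an assumption chosen for this illustration.
table = frequency_table(data, 10)
for start, count in table.items():
    print(f"[{start},{start + 10})  {count}")
```

Each printed row gives a class interval and the number of observations inside it, which is
the frequency table the paragraph describes.
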
Histogram (for continuous data)

Age Frequency
[0,10) 7
[10,20) 5
[20,30) 1
[30,40) 3
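
To show how the table above turns into a picture, here is a rough text version of the
histogram in Python. It is a sketch only: the function name text_histogram and the use of
'#' characters for bars are assumptions. Because all classes have the same width, the bar
length is simply the frequency.

```python
# Frequency table from the handout: (class interval, frequency).
age_table = [((0, 10), 7), ((10, 20), 5), ((20, 30), 1), ((30, 40), 3)]


def text_histogram(table, symbol="#"):
    """Return one line per class, with a bar as long as the frequency."""
    lines = []
    for (low, high), freq in table:
        label = f"[{low},{high})"
        lines.append(f"{label:<8} {symbol * freq} {freq}")
    return lines


bars = text_histogram(age_table)
print("\n".join(bars))

# Total number of observations represented by the histogram.
total = sum(freq for _, freq in age_table)
print("Total:", total)
```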

Stem and leaf Diagram

Data
12, 14, 16, 16, 20, 21
21, 21, 25, 32, 39, 40
43, 44, 47, 48, 49, 53

Key: 1|3 represents 13
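
The diagram for this data can be built by splitting each value into a stem (the tens digit)
and a leaf (the units digit), following the key 1|3 = 13. The Python sketch below is one
possible way to do this; the function name stem_and_leaf is an assumption, and it only
handles whole numbers such as the ones listed here.

```python
# The 18 values listed in the handout.
data = [12, 14, 16, 16, 20, 21, 21, 21, 25, 32, 39, 40, 43, 44, 47, 48, 49, 53]


def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem); the units digits are the leaves."""
    diagram = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 13 -> stem 1, leaf 3
        diagram.setdefault(stem, []).append(leaf)
    return diagram


diagram = stem_and_leaf(data)
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The first printed row is 1 | 2 4 6 6, which stands for the values 12, 14, 16 and 16.
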
Statistics Sampling Techniques

It is often not possible to gather data from every member of a population to obtain
statistical information. Statistics relies instead on different sampling techniques to create a
representative subset of the population that is easier to analyze. There are several
primary types of sampling.

When sampling, it is crucial to select a sample which is not
biased. There are several sampling techniques which address this bias.
Suppose that we have a population of 100,000 people and wish to select a
sample of 1000 people. If we select the first 1000 in a list, or the youngest 1000,
there is certainly a bias in our selection.

Simple random sampling: calls for every member within the population to have an equal
chance of being selected for analysis. The entire population is used as the basis for sampling,
and any random number generator can select the sample items. For example, 100
individuals are lined up and 10 are chosen at random.
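
The example of choosing 10 out of 100 individuals can be sketched with Python's standard
random module. This is an illustration, not a prescribed method: the function name
simple_random_sample and the seed value are assumptions, and the individuals are simply
numbered 1 to 100.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Pick k distinct members so that each member has an equal chance."""
    rng = random.Random(seed)  # a seed is used only to make the example repeatable
    return rng.sample(population, k)


# 100 individuals lined up and numbered 1..100 (labels are hypothetical).
people = list(range(1, 101))
sample = simple_random_sample(people, 10, seed=1)
print(sorted(sample))
```

Running it with a different seed (or none) gives a different sample, but always 10 distinct
individuals.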

Systematic sampling: calls for a random sample as well, but its technique is slightly modified
to make it easier to conduct. A single random number is generated and individuals are then
selected at a specified regular interval until the sample size is complete. For example, 100
individuals are lined up and numbered. The 7th individual is selected for the sample, followed by
every subsequent 9th individual (the 16th, the 25th, and so on up to the 88th) until 10 sample
items have been selected.
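
The same example can be written as a short Python sketch. The function name
systematic_sample and its parameters are assumptions made for the illustration; the start
(7) and the interval (9) are the numbers used in the example, whereas in practice the start
would come from a random number.

```python
def systematic_sample(population, start, step, size):
    """Take the item at position `start` (counting from 1), then every `step`-th item."""
    positions = range(start - 1, len(population), step)
    return [population[i] for i in positions][:size]


# 100 individuals lined up and numbered 1..100.
people = list(range(1, 101))
sample = systematic_sample(people, start=7, step=9, size=10)
print(sample)  # [7, 16, 25, 34, 43, 52, 61, 70, 79, 88]
```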

Stratified sampling: The population is divided into subgroups based on similar
characteristics. Then you calculate how many people from each subgroup are needed to
represent the entire population.

For example, 100 individuals are grouped by gender and race. Then a sample is taken from each
subgroup in proportion to that subgroup's share of the population. Alternatively, we
divide the population into subgroups (say men and women, or under and over 40
years old) and then pick a sample from each group.
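
A proportional calculation of this kind might look as follows in Python. Everything specific
here is an assumption for illustration: the function name stratified_sample, the two age
groups of 60 and 40 people, and the total sample size of 10.

```python
import random


def stratified_sample(strata, total_size, seed=None):
    """Sample from each subgroup in proportion to its share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounding can make the shares add up to slightly more or less than
        # total_size when the proportions are not whole numbers.
        share = round(total_size * len(members) / population_size)
        sample[name] = rng.sample(members, share)
    return sample


# Hypothetical population of 100: 60 people under 40 and 40 people aged 40 or over.
strata = {
    "under 40": [f"U{i}" for i in range(60)],
    "40 and over": [f"O{i}" for i in range(40)],
}
sample = stratified_sample(strata, 10, seed=1)
for name, chosen in sample.items():
    print(name, len(chosen))
```

With these numbers the under 40 group contributes 6 people and the other group 4, matching
their 60% and 40% shares of the population.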

Cluster sampling: calls for subgroups as well, but each subgroup should be representative of
the population. The entire subgroup is randomly selected instead of randomly selecting
individuals within a subgroup.
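
The difference from stratified sampling is that whole subgroups are drawn. A minimal Python
sketch, assuming a hypothetical population of 100 individuals arranged in 10 classes of 10
(the function name cluster_sample and the class labels are also assumptions):

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole subgroups and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


# Hypothetical clusters: 10 classes, each holding 10 numbered individuals.
clusters = {f"class {c}": list(range(c * 10 + 1, c * 10 + 11)) for c in range(10)}
sample = cluster_sample(clusters, 2, seed=1)

# Everyone in the chosen classes is in the sample.
selected_people = [p for members in sample.values() for p in members]
print(sorted(sample), len(selected_people))
```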

There are advantages and disadvantages to each method. Simple random sampling is
fair, but it may be very time-consuming compared with systematic sampling. In
systematic sampling, though, if there is a periodic pattern in the population there may
be a bias. Suppose that the 100,000 people are in groups of 100. If the first person of
each group is the leader, then the sampling method of selecting every 100th person
may provide a sample of only leaders or no leaders at all.
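
This bias can be checked with a short Python sketch. The names is_leader and
systematic_positions are assumptions, as is the starting position 37, which just stands
for any start other than a leader's position.

```python
GROUP_SIZE = 100
POPULATION = 100_000


def is_leader(position):
    """Positions count from 1; the first person of every group of 100 is the leader."""
    return position % GROUP_SIZE == 1


def systematic_positions(start, step=GROUP_SIZE, n=POPULATION):
    """Positions chosen by taking every `step`-th person beginning at `start`."""
    return list(range(start, n + 1, step))


only_leaders = systematic_positions(start=1)   # starts on a leader
no_leaders = systematic_positions(start=37)    # arbitrary non-leader start

print(len(only_leaders), sum(is_leader(p) for p in only_leaders))  # 1000 1000
print(len(no_leaders), sum(is_leader(p) for p in no_leaders))      # 1000 0
```

Both samples contain 1000 people, yet one is made up entirely of leaders and the other
contains none, so neither represents the population.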

Uses of Statistics: Statistics is used widely across an array of applications and professions.
It is applied whenever data are collected and analyzed, in settings ranging from government
agencies to academic research to investment analysis.
Resources:

“Statistics in Math: Definition, Types, and Importance.” Investopedia, 2023,
www.investopedia.com/terms/s/statistics.asp#toc-statistics-sampling-techniques. Accessed 18 Dec. 2023.

Nikolaidis, Christos. “IB DP Mathematics.” Math AI, 11 Sept. 2021, www.christosnikolaidis.com/en/.

Harris, David, and Peter Gray. Mathematics: Applications and Interpretations. Oxford University
Press, 2020.

“Mathematics Applications and Interpretations SL Study Guide.” https://www.ib.academy/, IB
Academy.
