
23 How are outliers represented on box plots?
With a cross or a dot. The end of the affected whisker should then be drawn at the first non-outlier data point, if known; otherwise, at the outlier boundary.

24 What do we call the process of removing anomalies from a data set?
Cleaning the data.
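As a rough illustration of the outlier boundary mentioned in question 23, the sketch below assumes the common convention that outliers lie more than 1.5 × IQR below Q1 or above Q3. The sheet does not state which rule to use, so the multiplier `k`, the function names and the sample data are all assumptions made for illustration. The quartiles are passed in as given values because methods for calculating them vary.

```python
def outlier_boundaries(q1, q3, k=1.5):
    """Return (lower, upper) outlier boundaries.

    ASSUMPTION: boundaries are Q1 - k*IQR and Q3 + k*IQR with k = 1.5;
    your specification or exam question may define them differently.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(data, q1, q3, k=1.5):
    """Return the outliers and the whisker ends (first non-outlier points)."""
    lower, upper = outlier_boundaries(q1, q3, k)
    outliers = [x for x in data if x < lower or x > upper]
    inside = [x for x in data if lower <= x <= upper]
    whiskers = (min(inside), max(inside))
    return outliers, whiskers


# Hypothetical data set with quartiles supplied as Q1 = 21 and Q3 = 29.
data = [2, 20, 22, 24, 26, 28, 30, 50]
outliers, whiskers = split_outliers(data, q1=21, q3=29)
print(outliers)   # points to mark with a cross or dot
print(whiskers)   # where the whiskers end
```

Here the whiskers stop at the most extreme data values that are not outliers, matching the "first non-outlier data point" rule on the card.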
25 What is the key feature of a histogram? And for what type of data would you use a histogram?
The key feature of a histogram is that the area of each block is proportional to the frequency. We use a histogram for continuous data.

26 On a histogram, what goes on the vertical axis?
Frequency density.
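To make the link between frequency density and area concrete, here is a minimal sketch. It assumes the usual convention frequency density = frequency ÷ class width, so that each bar's area equals its frequency (the card only requires area to be proportional to frequency). The class boundaries and frequencies are made-up values.

```python
def frequency_densities(class_boundaries, frequencies):
    """Return frequency / class width for each class.

    ASSUMPTION: the constant of proportionality between area and
    frequency is 1, i.e. bar area equals frequency.
    """
    densities = []
    for (lower, upper), freq in zip(class_boundaries, frequencies):
        width = upper - lower
        densities.append(freq / width)
    return densities


# Hypothetical grouped data: (lower, upper) class boundaries and frequencies.
classes = [(0, 10), (10, 15), (15, 20), (20, 40)]
freqs = [5, 10, 15, 8]

densities = frequency_densities(classes, freqs)

# Bar area (width x height) recovers the frequency of each class.
areas = [(upper - lower) * d for (lower, upper), d in zip(classes, densities)]
print(densities)
print(areas)
```

Note that the widest class (20 to 40) has the lowest bar even though its frequency is not the smallest, which is why the vertical axis must show frequency density rather than frequency.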
27 What do you get if you join the middle of the top of each bar in a histogram?
A frequency polygon.

28 What must you comment on when comparing data sets?
A measure of location and a measure of spread.
29 On a scatter graph, which variable should go on the x-axis? And what do you call the variable that goes on the y-axis?
The x-axis is for the explanatory variable (or independent variable). This is the variable you control. The y-axis is for the response (or dependent) variable. The researcher measures this variable.

30 How do we describe correlation and what does it tell us about the linear relationship between the two variables?
Negative correlation (when one variable increases the other decreases), positive correlation (when one variable increases the other also increases) or no linear correlation.
31 What do we mean by a causal relationship between variables?
When a change in one variable causes a change in the other. (Correlation does not imply causation, so consider the context carefully.)

32 In the equation of a regression line 𝑦 = 𝑎 + 𝑏𝑥, how do you interpret the values of 𝑎 and 𝑏?
𝑎 tells you the value of 𝑦 when 𝑥 = 0. 𝑏 tells you the change in 𝑦 for every unit change in 𝑥. If the data is positively correlated, 𝑏 will be positive. If the data is negatively correlated, 𝑏 will be negative.
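The interpretation of 𝑎 and 𝑏 can be checked numerically. The sketch below fits a least squares line using the standard formulas b = Sxy / Sxx and a = ȳ − b·x̄. The function name and the data are illustrative assumptions, not taken from the sheet.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxy and Sxx: sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx           # change in y per unit change in x
    a = mean_y - b * mean_x   # value of y when x = 0
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = least_squares_line(xs, ys)
print(a, b)
```

Because these points are positively correlated, the gradient `b` comes out positive, as the card describes.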
33 When are you able to make predictions using a least squares regression line?
Predictions inside the range of data (interpolation) should be accurate, as long as there is a fairly strong linear relationship (correlation) between the two variables. Extrapolation (estimating outside the range of data collected) is to be treated with caution as the linear relationship may not remain valid.

34 In probability, how do you know if two events are independent?
P(A∩B) = P(A) × P(B) (i.e. the probability of the intersection is the same as the two probabilities multiplied together).
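The independence condition P(A∩B) = P(A) × P(B) can be tested directly on a small sample space. This is a minimal sketch using one roll of a fair die with equally likely outcomes. The events and helper names are chosen purely for illustration.

```python
from fractions import Fraction


def probability(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def are_independent(a, b, sample_space):
    """Check whether P(A and B) equals P(A) * P(B)."""
    p_both = probability(a & b, sample_space)
    return p_both == probability(a, sample_space) * probability(b, sample_space)


die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_least_four = {4, 5, 6}

# P(even and <=4) = 1/3 and P(even) * P(<=4) = 1/2 * 2/3 = 1/3.
print(are_independent(even, at_most_four, die))
# P(even and >=4) = 1/3 but P(even) * P(>=4) = 1/2 * 1/2 = 1/4.
print(are_independent(even, at_least_four, die))
```

Using `Fraction` keeps the arithmetic exact, so the equality test is not affected by floating-point rounding.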
35 In probability, how do you know if two events are mutually exclusive?
P(A∩B) = 0 (the probability of the intersection is zero), or equivalently P(A∪B) = P(A) + P(B).

36 Describe and give an example of a discrete uniform distribution.
A discrete uniform distribution is a probability distribution where the probability of each outcome occurring is the same (e.g. rolling a fair die).
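The fair die example can be written out as a discrete uniform distribution and used to check the mutually exclusive condition. This is a sketch with illustrative helper names and events, not a prescribed method.

```python
from fractions import Fraction


def discrete_uniform(outcomes):
    """Assign the same probability 1/n to each of the n outcomes."""
    outcomes = list(outcomes)
    return {x: Fraction(1, len(outcomes)) for x in outcomes}


def prob(event, dist):
    """Sum the probabilities of the outcomes that belong to the event."""
    return sum((p for x, p in dist.items() if x in event), Fraction(0))


die = discrete_uniform(range(1, 7))   # each face has probability 1/6
odd = {1, 3, 5}
six = {6}

# Mutually exclusive: the events share no outcomes, so P(A and B) = 0 ...
print(prob(odd & six, die))
# ... and therefore P(A or B) = P(A) + P(B).
print(prob(odd | six, die), prob(odd, die) + prob(six, die))
```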

37 State the four conditions for a Binomial Distribution.
1. There is a fixed number of trials.
2. Each trial has two possible outcomes (success/failure).
3. The trials are independent of each other.
4. The probability of success is constant/fixed.

38 What is a null hypothesis?
A statement made about a value of the population parameter that we assume to be correct unless there is evidence to suggest otherwise.
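When the four binomial conditions hold, the number of successes X in n trials can be modelled as X ~ B(n, p), with P(X = k) = C(n, k) · p^k · (1 − p)^(n − k). The sketch below evaluates this standard formula. The function names and the values n = 10, p = 0.5 are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): n fixed trials, constant success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k), found by summing the individual probabilities."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Hypothetical example: 10 tosses of a fair coin, X = number of heads.
n, p = 10, 0.5
print(binomial_pmf(3, n, p))   # P(X = 3) = 120 / 1024
print(binomial_cdf(3, n, p))   # P(X <= 3) = 176 / 1024
```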
39 What notation do we use for the null and alternative hypothesis?
Null hypothesis: H₀
Alternative hypothesis: H₁

40 What is a critical region?
The range of values of the test statistic that would lead us to reject the null hypothesis.
41 What is a critical value?
The value(s) on the boundary of the critical region.

42 What is a test statistic?
The result of an experiment or the statistic that is calculated from the sample.
43 What’s the difference between a one-tailed and two-tailed test?
A one-tailed test looks either for an increase OR for a decrease in a parameter (i.e. p>… or p<…), and has a single critical value. A two-tailed test looks for a change in a parameter (i.e. 𝑝 ≠…), and has two critical values. For a two-tailed test, halve the significance level for each tail.

44 What’s an actual significance level?
The probability of incorrectly rejecting the null hypothesis.
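The ideas of critical region, critical value and actual significance level can be seen together in a binomial test. The sketch below assumes the convention that each tail is the largest region whose probability under the null hypothesis does not exceed the target level (some courses instead pick the probability closest to the target). The function names and the values n = 20, p = 0.5 and the 5% level are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def critical_region(n, p, alpha, tail):
    """Return (critical value, tail probability) for one tail under H0.

    tail="upper" gives the region X >= c; tail="lower" gives X <= c.
    ASSUMPTION: the tail probability must not exceed alpha.
    Returns (None, 0.0) if no value fits within alpha.
    """
    order = range(n, -1, -1) if tail == "upper" else range(0, n + 1)
    total, critical_value = 0.0, None
    for k in order:
        prob = binomial_pmf(k, n, p)
        if total + prob > alpha:
            break
        total += prob
        critical_value = k
    return critical_value, total


n, p0, level = 20, 0.5, 0.05

# One-tailed test for an increase: the whole 5% goes in the upper tail.
upper_c, upper_p = critical_region(n, p0, level, "upper")
print(upper_c, upper_p)

# Two-tailed test: halve the significance level for each tail.
low_c, low_p = critical_region(n, p0, level / 2, "lower")
high_c, high_p = critical_region(n, p0, level / 2, "upper")
actual_level = low_p + high_p   # probability of incorrectly rejecting H0
print(low_c, high_c, actual_level)
```

Because the binomial distribution is discrete, the actual significance level usually falls below the target level rather than matching it exactly.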
