Final Examination

1. There are various types of graphical representations used in statistics to visualize data. Three common
types are:

a. Bar Chart: A bar chart represents data with rectangular bars. It's used to show and compare the
frequency or distribution of categorical data. For example, you can create a bar chart to display the
number of cars of different colors in a parking lot.

b. Line Graph: A line graph is used to show trends and changes in data over a continuous range. For
instance, you can use a line graph to illustrate the change in temperature over a week.

c. Histogram: A histogram is used to represent the distribution of a continuous variable by dividing it
into bins or intervals and displaying the frequency or density of data points within each bin. It's
especially useful for visualizing the shape of data distributions, like the distribution of test scores in a
classroom.
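
As a rough illustration of how a histogram groups values into bins, the sketch below counts test scores per bin and prints a text histogram. The scores are the ones used in question 4 of this exam; the bin width of 10 and the helper name `bin_counts` are assumptions made for the example.

```python
from collections import Counter

# Scores reused from question 4; the bin width of 10 is an assumption.
scores = [40, 46, 49, 54, 55, 55, 60, 60, 64, 68,
          75, 76, 77, 79, 84, 89, 90, 90, 92, 94]
BIN_WIDTH = 10


def bin_counts(values, width):
    """Count how many values fall in each half-open bin [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


histogram = bin_counts(scores, BIN_WIDTH)
for start, count in histogram.items():
    # One '#' per score in the bin, e.g. the 60-69 bin holds four scores.
    print(f"{start}-{start + BIN_WIDTH - 1} | {'#' * count}")
```

A different bin width would change the apparent shape, which is why the choice of bins matters when reading a histogram.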

2. Skewness measures the asymmetry of a probability distribution. There are several ways to measure it; a
common one for sample data is Pearson's second coefficient of skewness:

Skewness = (3 * (Mean - Median)) / Standard Deviation

If the skewness is:

• Positive: The distribution is right-skewed (tail on the right).

• Negative: The distribution is left-skewed (tail on the left).

• Zero: The distribution is approximately symmetric (the mean and the median coincide).

3. Leptokurtic and Platykurtic:

• A computed kurtosis value is leptokurtic when it is greater than that of a normal distribution
(kurtosis above 3, or excess kurtosis above 0). It indicates more extreme values in the tails, so
the distribution has a higher peak and heavier tails.

• A computed kurtosis value is platykurtic when it is less than that of a normal distribution
(kurtosis below 3, or excess kurtosis below 0). It indicates fewer extreme values in the tails, so
the distribution has a flatter peak and lighter tails.
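
To make the comparison with the normal distribution concrete, here is a small sketch that computes excess kurtosis (kurtosis minus 3) from population moments and labels the result. The function names and both data sets are hypothetical, and "mesokurtic" is the usual label for a value equal to that of the normal distribution.

```python
def excess_kurtosis(values):
    """Fourth central moment divided by the squared variance, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


def classify(values):
    """Label a sample by the sign of its excess kurtosis."""
    k = excess_kurtosis(values)
    if k > 0:
        return "leptokurtic"
    if k < 0:
        return "platykurtic"
    return "mesokurtic"


flat_sample = [1, 2, 3, 4, 5]                         # evenly spread, light tails
heavy_tailed_sample = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]  # mostly central, two extremes

print(classify(flat_sample), classify(heavy_tailed_sample))
```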

4. To create a box-and-whisker plot for the given data, you'll first need to organize the data in ascending
order and then calculate the quartiles and other necessary statistics. After that, you can draw the plot
and interpret the results. Here's how you can do it step by step:

Step 1: Organize the data in ascending order:

40, 46, 49, 54, 55, 55, 60, 60, 64, 68, 75, 76, 77, 79, 84, 89, 90, 90, 92, 94

Step 2: Calculate the Quartiles:

a) Calculate the median (Q2 or the 2nd quartile):

• There are 20 data points, so the median is the average of the 10th and 11th values.

• Median = (64 + 68) / 2 = 66

b) Calculate the first quartile (Q1), which is the median of the lower half of the data (the 10 values
below the overall median):

• There are 10 data points in the lower half, so Q1 is the average of its 5th and 6th values.

• Q1 = (55 + 55) / 2 = 55

c) Calculate the third quartile (Q3), which is the median of the upper half of the data (the 10 values
above the overall median):

• There are 10 data points in the upper half, so Q3 is the average of its 5th and 6th values.

• Q3 = (84 + 89) / 2 = 86.5

Step 3: Find the Minimum and Maximum:

• The minimum value in the data set is 40.

• The maximum value in the data set is 94.

Step 4: Create the Box-and-Whisker Plot:

Now that you have the necessary statistics, you can create the box-and-whisker plot:

|---------[======|=============]----|
40        55     66           86.5  94

In the box-and-whisker plot:

• The line inside the box represents the median (Q2) at 66.

• The box spans the interquartile range (IQR), from Q1 (55) to Q3 (86.5), so IQR = 86.5 - 55 = 31.5.

• The "whiskers" extend from the box out to the minimum value (40) and the maximum value (94).

Step 5: Interpretation of Results:

• The median (Q2) is 66, indicating that 50% of the students scored below 66 on the pre-test
exam.

• The interquartile range (IQR) goes from Q1 (55) to Q3 (86.5), which represents the middle 50%
of the data. About half of the students scored between 55 and 86.5.

• The whiskers show the range of the data, with the lowest score being 40 and the highest being
94. There are no outliers by the usual 1.5 × IQR rule: the fences are 55 - 47.25 = 7.75 and
86.5 + 47.25 = 133.75, and every score lies between them.

• The box-and-whisker plot provides a clear visual summary of the distribution and spread of the
data.

5. To calculate the sample standard deviation for the given data, you can follow these steps:

1. Calculate the sample mean (average):

Mean (x̄) = (4 + 5 + 6 + 7 + 8 + 9 + 12 + 14 + 14 + 15 + 16 + 18) / 12
Mean (x̄) = 128 / 12
Mean (x̄) ≈ 10.67 (rounded to two decimal places)

2. Calculate the squared difference between each data point and the mean for all 12 data points:

(4 - 10.67)^2, (5 - 10.67)^2, (6 - 10.67)^2, (7 - 10.67)^2, (8 - 10.67)^2, (9 - 10.67)^2, (12 - 10.67)^2,
(14 - 10.67)^2, (14 - 10.67)^2, (15 - 10.67)^2, (16 - 10.67)^2, (18 - 10.67)^2

3. Calculate the sum of these squared differences.

4. Divide the sum of squared differences by (n-1), where n is the number of data points (12 in this
case) to calculate the sample variance.

5. Take the square root of the sample variance to obtain the sample standard deviation.

Let's calculate it step by step:

Step 2: Calculate squared differences (using the unrounded mean, x̄ = 128 / 12 ≈ 10.6667, to avoid
rounding error):

(4 - x̄)^2 ≈ 44.44
(5 - x̄)^2 ≈ 32.11
(6 - x̄)^2 ≈ 21.78
(7 - x̄)^2 ≈ 13.44
(8 - x̄)^2 ≈ 7.11
(9 - x̄)^2 ≈ 2.78
(12 - x̄)^2 ≈ 1.78
(14 - x̄)^2 ≈ 11.11
(14 - x̄)^2 ≈ 11.11
(15 - x̄)^2 ≈ 18.78
(16 - x̄)^2 ≈ 28.44
(18 - x̄)^2 ≈ 53.78

Step 3: Calculate the sum of squared differences:

44.44 + 32.11 + 21.78 + 13.44 + 7.11 + 2.78 + 1.78 + 11.11 + 11.11 + 18.78 + 28.44 + 53.78 ≈ 246.67
(the exact value is 2220 / 9)

Step 4: Calculate the sample variance:

Sample Variance (s²) = Sum of squared differences / (n - 1)
Sample Variance (s²) ≈ 246.67 / (12 - 1)
Sample Variance (s²) ≈ 246.67 / 11 ≈ 22.424

Step 5: Calculate the sample standard deviation:

Sample Standard Deviation (s) = √Sample Variance
Sample Standard Deviation (s) ≈ √22.424 ≈ 4.74 (rounded to two decimal places)

So, the sample standard deviation for the given data is approximately 4.74.
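
The five steps can be cross-checked with a short script. This sketch follows the steps by hand and then compares the result against `statistics.stdev` from the Python standard library, which also divides by n - 1; the variable names are chosen for the example.

```python
import math
from statistics import stdev

data = [4, 5, 6, 7, 8, 9, 12, 14, 14, 15, 16, 18]

n = len(data)
mean = sum(data) / n                               # step 1: sample mean
squared_diffs = [(x - mean) ** 2 for x in data]    # step 2: squared differences
sum_sq = sum(squared_diffs)                        # step 3: sum of squared differences
variance = sum_sq / (n - 1)                        # step 4: sample variance
std_dev = math.sqrt(variance)                      # step 5: sample standard deviation

print(f"mean={mean:.4f} sum_sq={sum_sq:.4f} variance={variance:.4f} s={std_dev:.4f}")

# The library routine should agree with the hand-rolled steps.
library_std_dev = stdev(data)
```

Keeping the mean unrounded until the final step avoids the small errors that creep in when 10.67 is substituted early.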

6. Here's how you would conduct a One-Way ANOVA to determine differences within a factor:

1. Collect Data: Collect data from multiple groups or levels under a single factor. For example, you
might be comparing the test scores of students who have received different types of tutoring.

2. State the Hypotheses:

• Null Hypothesis (H0): All group means are equal.

• Alternative Hypothesis (Ha): At least one group mean is different from the others.

3. Calculate Group Statistics:

• Calculate the mean (average) for each group.

• Calculate the sum of squares between groups (SSB), which quantifies the variability
between group means.

• Calculate the sum of squares within groups (SSW), which quantifies the variability within
each group.

• Calculate the degrees of freedom (df) for both: df_B = k - 1 for SSB and df_W = N - k for SSW,
where k is the number of groups and N is the total number of observations.

4. Calculate the F-Statistic:

• The F-statistic is the ratio of the between-group mean square to the within-group mean
square, that is, SSB and SSW each divided by its degrees of freedom.

• F = (SSB / df_B) / (SSW / df_W)

• Where df_B is the degrees of freedom between groups and df_W is the degrees of
freedom within groups.

5. Determine the Critical Value:

• Using an F-distribution table or statistical software, determine the critical F-value for
(df_B, df_W) degrees of freedom at a specified significance level (usually 0.05).

6. Compare the Calculated F-Statistic with the Critical Value:

• If the calculated F-statistic is greater than the critical F-value, you reject the null
hypothesis, indicating that there are statistically significant differences between at least
some of the group means.

• If the calculated F-statistic is less than the critical F-value, you fail to reject the null
hypothesis: the data do not show statistically significant differences between the group
means.

7. Post-Hoc Tests (if needed):

• If the One-Way ANOVA indicates that there are significant differences between groups,
you can perform post-hoc tests (e.g., Tukey's HSD, Bonferroni, Scheffe) to identify which
specific group pairs have significant differences.

8. Interpretation:

• If the null hypothesis is rejected, you can conclude that there are statistically significant
differences within the factor. If it is not rejected, there is insufficient evidence of a
difference among the groups.
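
The calculation in steps 3 to 6 can be sketched as follows. The tutoring scores are hypothetical data invented for this example, the function name `one_way_anova` is an assumption, and the critical value 3.89 is the tabulated F-value for (2, 12) degrees of freedom at a 0.05 significance level.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # SSB: variability of the group means around the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSW: variability of observations around their own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ssb / df_between) / (ssw / df_within)
    return f_stat, df_between, df_within


# Hypothetical test scores for three types of tutoring (5 students each).
tutoring_scores = [
    [70, 72, 68, 75, 71],
    [80, 82, 78, 85, 79],
    [90, 88, 92, 91, 89],
]

f_stat, df_b, df_w = one_way_anova(tutoring_scores)
F_CRITICAL = 3.89  # table value for df = (2, 12), alpha = 0.05
reject_null = f_stat > F_CRITICAL
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, reject H0: {reject_null}")
```

In practice a statistics package would also report the p-value, which leads to the same decision as the critical-value comparison.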

7. To determine if there is enough evidence to reject the null hypothesis, we can perform a hypothesis
test. In this case, the null hypothesis (H0) is that the true mean weight of all residents in Negros is 160
lbs, and the alternative hypothesis (Ha) is that the true mean is different from 160 lbs. We will perform a
two-tailed t-test at the 5% significance level (95% confidence level).

Here are the key values we have:

Sample mean (x̄) = 162.5 lbs
Sample standard deviation (s) = 3.6 lbs
Sample size (n) = 29
Population mean under the null hypothesis (μ0) = 160 lbs
Significance level (α) = 0.05 (for a 95% confidence level)

Now, let's perform the t-test:

1. Calculate the standard error of the sample mean:

Standard Error (SE) = s / √n
SE = 3.6 / √29 ≈ 0.6685 (about 0.67)

2. Calculate the t-statistic:

t = (x̄ - μ0) / SE
t = (162.5 - 160) / 0.6685 ≈ 3.74

3. Find the degrees of freedom (df):

Degrees of Freedom (df) = n - 1
df = 29 - 1 = 28

4. Find the critical t-values for a two-tailed test at α = 0.05, with 2.5% in each tail
(α/2 = 0.025). You can look up the t-table or use a t-distribution calculator. With 28 degrees of
freedom and α/2 = 0.025, the critical t-values are approximately ±2.048.

5. Compare the calculated t-statistic to the critical t-values:

|t| > 2.048

In this case, |3.74| > 2.048, which means that the calculated t-statistic falls in the rejection region. This
implies that you have enough evidence to reject the null hypothesis.

So, at the 5% significance level, there is enough evidence to conclude that the true mean weight of
residents in Negros is different from 160 lbs based on the sample data.
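
The same test can be written as a short script. This sketch uses the sample values given above and takes the critical value 2.048 from the t-table lookup in step 4; the variable names are chosen for the example.

```python
import math

sample_mean = 162.5   # x-bar, in lbs
sample_sd = 3.6       # s, in lbs
n = 29                # sample size
mu_0 = 160.0          # mean under the null hypothesis, in lbs
T_CRITICAL = 2.048    # two-tailed, alpha = 0.05, df = 28 (from the t-table)

standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error
df = n - 1

# Two-tailed decision: reject H0 when |t| exceeds the critical value.
reject_null = abs(t_stat) > T_CRITICAL
print(f"SE={standard_error:.4f} t={t_stat:.2f} df={df} reject H0: {reject_null}")
```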

8. The coefficient of correlation, often denoted as "r," measures the strength and direction of the linear
relationship between two variables. If the two simple regression coefficients are already known (b_yx for
the regression of Y on X and b_xy for the regression of X on Y), r is the square root of their product,
r = ±√(b_yx * b_xy), taking the sign that both coefficients share. The same value comes from the formula
for Pearson's correlation coefficient (r), computed directly from the data. Here's how you can do it:

Let's assume you have two variables, X and Y:

1. Calculate the means (average) of both X and Y:

Mean of X (X̄) = ΣX / n
Mean of Y (Ȳ) = ΣY / n

Where ΣX is the sum of all X values, ΣY is the sum of all Y values, and "n" is the number of data points.

2. Calculate the sum of the products of the deviations from the means:

Σ((X - X̄)(Y - Ȳ))

This means, for each pair of X and Y values, calculate the difference between each X value and the mean
of X, and the difference between each Y value and the mean of Y. Then, multiply these differences
together and sum them up for all data points.

3. Calculate the sum of squared deviations for both X and Y:

Σ((X - X̄)²)
Σ((Y - Ȳ)²)

Square the difference between each X value and the mean of X and sum them up, and do the same for Y
values.

4. Use the formula to calculate the correlation coefficient (r):

r = Σ((X - X̄)(Y - Ȳ)) / √(Σ(X - X̄)² * Σ(Y - Ȳ)²)

In this formula, Σ represents the summation symbol, and √ represents the square root.

5. Once you calculate "r," it will give you a value between -1 and 1. The sign of "r" indicates the
direction of the relationship:

• A positive "r" (r > 0) indicates a positive (direct) linear relationship. As X increases, Y also
increases.

• A negative "r" (r < 0) indicates a negative (inverse) linear relationship. As X increases, Y
decreases.

• An "r" close to 0 suggests a weak or no linear relationship.
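
The steps above can be sketched in Python. The X and Y values are hypothetical data made up for the example, and the names `deviation_sums`, `b_yx`, and `b_xy` are assumptions: `b_yx` is the slope of the regression of Y on X and `b_xy` the slope of the regression of X on Y. The script computes r from the deviation sums and then checks that the square root of the product of the two slopes gives the same value.

```python
import math


def deviation_sums(xs, ys):
    """Return (sum of cross-products, sum of squares for X, sum of squares for Y)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy, s_xx, s_yy


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_xy, s_xx, s_yy = deviation_sums(xs, ys)

# Pearson's r from the deviation sums.
r = s_xy / math.sqrt(s_xx * s_yy)

# Simple regression slopes; both carry the sign of s_xy.
b_yx = s_xy / s_xx
b_xy = s_xy / s_yy
r_from_slopes = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(f"r = {r:.4f}, from regression coefficients = {r_from_slopes:.4f}")
```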