
St. Paul University Dumaguete
Dumaguete City

I. Answer the following questions:

1. For each of the following situations, answer questions a through d:

a. Situation A. A study of 300 households in a small southern town revealed that 20 percent had
at least one school-age child present.

a. What is the sample in the study?

Answer: The 300 households surveyed in the town

b. What is the population?

Answer: All households in the small southern town

c. What is the variable of interest?

Answer: Whether a household has at least one school-age child present

d. What measurement scale was used?

Answer: Nominal Scale

b. Situation B. A study of 250 patients admitted to a hospital during the past year revealed that, on
the average, the patients lived 20 kilometers from the hospital.

a. What is the sample in the study?

Answer: The 250 patients whose records were studied

b. What is the population?

Answer: All patients admitted to the hospital during the past year

c. What is the variable of interest?

Answer: The distance of a patient’s home from the hospital

d. What measurement scale was used?

Answer: Ratio Scale (distance has a true zero point)


St. Paul University Dumaguete
Dumaguete City

2. Differentiate the following:

a. Descriptive statistics and inferential statistics

Answer:

According to Courtney Taylor (2020), the field of statistics is divided into two major divisions:
descriptive and inferential. Each of these segments is important, offering different techniques that accomplish
different objectives. Descriptive statistics describe what is going on in a population or data set. Inferential
statistics, by contrast, allow scientists to take findings from a sample group and generalize them to a larger
population. The two types of statistics have some important differences.

Descriptive statistics is the type of statistics that probably springs to most people’s minds when they
hear the word “statistics.” In this branch of statistics, the goal is to describe. Numerical measures are used to tell
about features of a set of data. There are a number of items that belong in this portion of statistics, such as:

- The average, or measure of the center of a data set, consisting of the mean, median, mode, or midrange
- The spread of a data set, which can be measured with the range or standard deviation
- Overall descriptions of data such as the five number summary
- Measurements such as skewness and kurtosis
- The exploration of relationships and correlation between paired data
- The presentation of statistical results in graphical form

These measures are important and useful because they allow scientists to see patterns among data, and thus to
make sense of that data. Descriptive statistics can only be used to describe the population or data set under
study: The results cannot be generalized to any other group or population.
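The descriptive measures listed above can be computed directly with Python's standard `statistics` module. This is a minimal sketch; the test scores are invented for illustration:

```python
import statistics

# A small data set: test scores for one class (made-up values).
scores = [72, 85, 90, 85, 78, 95, 85, 60, 88, 92]

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value
spread = max(scores) - min(scores)  # range
stdev = statistics.stdev(scores)    # sample standard deviation

print(mean, median, mode, spread)   # → 83.0 85.0 85 35
```

Note that all of these numbers describe only the ten scores in hand; nothing here says anything about students outside this data set.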

On the other hand, inferential statistics are produced through complex mathematical calculations that
allow scientists to infer trends about a larger population based on a study of a sample taken from it. Scientists
use inferential statistics to examine the relationships between variables within a sample and then make
generalizations or predictions about how those variables will relate to a larger population.

It is usually impossible to examine each member of the population individually. So scientists choose a
representative subset of the population, called a statistical sample, and from this analysis, they are able to say
something about the population from which the sample came. There are two major divisions of inferential
statistics:

- A confidence interval gives a range of values for an unknown parameter of the population by measuring
a statistical sample. This is expressed in terms of an interval and the degree of confidence that the
parameter is within the interval.
- Tests of significance, or hypothesis testing, in which scientists make a claim about the population by
analyzing a statistical sample. By design, there is some uncertainty in this process. This can be
expressed in terms of a level of significance.
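The confidence-interval idea above can be sketched in a few lines of Python. The sample values are hypothetical, and the normal critical value 1.96 is used for simplicity (a t critical value would give a slightly wider interval for small samples):

```python
import math
import statistics

# Hypothetical sample of 25 delivery times (minutes).
sample = [31, 29, 35, 28, 33, 30, 32, 27, 34, 31,
          30, 29, 36, 28, 32, 31, 33, 30, 29, 34,
          32, 30, 31, 28, 33]

n = len(sample)
xbar = statistics.mean(sample)   # point estimate of the population mean
s = statistics.stdev(sample)     # sample standard deviation

# 95% confidence interval using the normal critical value 1.96.
margin = 1.96 * s / math.sqrt(n)
low, high = xbar - margin, xbar + margin
print(f"95% CI for the population mean: ({low:.2f}, {high:.2f})")
```

The interval, not the single number `xbar`, is what the analyst reports about the population: a range of plausible values together with a degree of confidence.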

Techniques that social scientists use to examine the relationships between variables, and thereby to create
inferential statistics, include linear regression analyses, logistic regression analyses, ANOVA, correlation
analyses, structural equation modeling, and survival analysis. When conducting research using inferential
statistics, scientists conduct a test of significance to determine whether they can generalize their results to a
larger population. Common tests of significance include the chi-square and t-test. These tell scientists the
probability that the results of their analysis of the sample are representative of the population as a whole.

Hence, while descriptive statistics is helpful in learning things such as the spread and center of the data,
nothing in descriptive statistics can be used to make any generalizations. In descriptive statistics, measurements
such as the mean and standard deviation are stated as exact numbers.

Even though inferential statistics uses some similar calculations, such as the mean and standard
deviation, the focus is different. Inferential statistics starts with a sample and then generalizes to a
population. This information about a population is not stated as a single number. Instead, scientists
express these parameters as a range of potential numbers, along with a degree of confidence.

Work Cited

Taylor, Courtney. (2020, February 17). The Difference Between Descriptive and Inferential Statistics. Retrieved
from https://www.thoughtco.com/differences-in-descriptive-and-inferential-statistics-3126224

b. Measures of central tendency and measures of variability

Answer:

Measures of central tendency are measures of the location of the middle or the center of a distribution.
The most frequently used measures of central tendency are the mean, median and mode.
The mean is obtained by summing the values of all the observations and dividing by the number of
observations. The median (also referred to as the 50th percentile) is the middle value in a sample of ordered
values. Half the values are above the median and half are below the median. The mode is a value occurring
most frequently. It is rarely of any practical use for numerical data.
A comparison of the mean, median and mode can reveal information about skewness. The mean, median
and mode are similar when the distribution is symmetrical. When the distribution is skewed, the median is
more appropriate as a measure of central tendency.
On the other hand, variability refers to how spread out a group of data is. In other words, variability
measures how much your scores differ from each other. Variability is also referred to as dispersion or spread.
Data sets with similar values are said to have little variability, while data sets that have values that are spread
out have high variability.

A measure of variability is a summary statistic that represents the amount of dispersion in a dataset.
How spread out are the values? While a measure of central tendency describes the typical value, measures of
variability define how far away the data points tend to fall from the center. We talk about variability in the
context of a distribution of values. A low dispersion indicates that the data points tend to be clustered tightly
around the center. High dispersion signifies that they tend to fall further away. The common measures of
variability are the range, interquartile range, variance, and standard deviation.
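The four common measures of variability just named can all be computed with Python's standard `statistics` module. A minimal sketch, with made-up data:

```python
import statistics

data = [4, 8, 6, 5, 3, 7, 9, 5, 6, 7]

range_ = max(data) - min(data)                # range: max minus min
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
iqr = q3 - q1                                 # interquartile range
var = statistics.variance(data)               # sample variance
sd = statistics.stdev(data)                   # sample standard deviation

print(range_, iqr, round(var, 3), round(sd, 3))
```

A larger range, IQR, variance, or standard deviation all indicate the same thing: the values fall further from the center of the distribution.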

c. Parametric tests and Non-parametric tests

Answer:

In statistics, parametric and nonparametric methodologies refer to those in which a set of data has a
normal vs. a non-normal distribution, respectively. Parametric tests make certain assumptions about a data
set; namely, that the data are drawn from a population with a specific (normal) distribution. Non-parametric
tests make fewer assumptions about the data set. The majority of elementary statistical methods are
parametric, and parametric tests generally have higher statistical power. If the necessary assumptions cannot
be made about a data set, non-parametric tests can be used. Here, you will be introduced to two parametric
and two non-parametric statistical tests (Tyler, 2017).
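The contrast can be illustrated with one parametric and one non-parametric statistic computed on the same two groups. This is a sketch with invented data and no ties between values; the t statistic assumes roughly normal populations, while the Mann-Whitney U uses only the ranks of the observations:

```python
import math
import statistics

group_a = [12.1, 14.3, 13.5, 15.0, 13.8]
group_b = [11.0, 12.5, 11.8, 12.9, 11.4]

# Parametric: pooled two-sample t statistic (assumes normality).
na, nb = len(group_a), len(group_b)
ma, mb = statistics.mean(group_a), statistics.mean(group_b)
va, vb = statistics.variance(group_a), statistics.variance(group_b)
pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
t = (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb))

# Non-parametric: Mann-Whitney U, built from ranks only, so no
# assumption about the shape of the underlying distributions.
combined = sorted(group_a + group_b)
rank = {v: i + 1 for i, v in enumerate(combined)}  # valid: no tied values here
r_a = sum(rank[v] for v in group_a)
u_a = r_a - na * (na + 1) / 2

print(f"t = {t:.3f}, U = {u_a}")
```

Both statistics would then be compared against their own reference distributions; the non-parametric route is the fallback when the normality assumption behind the t statistic cannot be defended.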

3. Define.

a. What is hypothesis testing?

Answer:

Hypothesis testing was introduced by Ronald Fisher, Jerzy Neyman, Karl Pearson and Pearson’s son,
Egon Pearson. Hypothesis testing is a statistical method that is used in making statistical decisions using
experimental data. A hypothesis is basically an assumption that we make about a population parameter.

Hypothesis testing is an act in statistics whereby an analyst tests an assumption regarding a population
parameter. The methodology employed by the analyst depends on the nature of the data used and the reason for
the analysis.

Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data. Such data may come
from a larger population, or from a data-generating process. The word "population" will be used for both of
these cases in the following descriptions (Majaski, 2020).

b. Discuss the steps in hypothesis testing.

Answer:

In hypothesis testing, an analyst tests a statistical sample, with the goal of providing evidence on the
plausibility of the null hypothesis.

Statistical analysts test a hypothesis by measuring and examining a random sample of the population being
analyzed. All analysts use a random population sample to test two different hypotheses: the null hypothesis and
the alternative hypothesis.

The null hypothesis is usually a hypothesis of equality between population parameters; e.g., a null hypothesis
may state that the population mean return is equal to zero. The alternative hypothesis is effectively the opposite
of a null hypothesis; e.g., the population mean return is not equal to zero. Thus, they are mutually exclusive, and
only one can be true. However, one of the two hypotheses will always be true.

Four Steps of Hypothesis Testing


All hypotheses are tested using a four-step process:

1. The first step is for the analyst to state the two hypotheses so that only one can be right.
2. The next step is to formulate an analysis plan, which outlines how the data will be evaluated.
3. The third step is to carry out the plan and physically analyze the sample data.
4. The fourth and final step is to analyze the results and either reject the null hypothesis, or state that the
null hypothesis is plausible, given the data.

4. Indicate the significance of each of the following software:

a. Excel

Answer:

Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as
means and standard deviations, and conducting simple mathematical operations on your numbers. It can also
run five basic statistical tests.

- Descriptives

What they are: Numbers that describe a dataset.

When to use them: When you want to get a general sense of your findings, and how much agreement there
was across participants.

The details: Descriptives include measures of central tendency such as the mean (the average), median
(middle-point), and mode (the most common point), as well as measures of variability (how spread out your
data is) such as range and standard deviation. While the math behind these is much simpler than the other
statistical tests on this list, it is likely the most prevalent type of analysis in all of statistics. There is a rather
simple reason for that, too. Descriptive analyses tell you what your data look like, and as such, it is
frequently a first step (or a simultaneous step if you are using programs like SPSS) when doing data
analysis. Gaining a general sense of how your participants responded as a whole can help you understand
the results of some of the other tests you run. It can also help you understand how much (or how little) your
participants agreed with each other.

- T-test

What it is: A test for comparing the averages of two groups.

When to use it: When you want to see if two groups are actually distinct from each other, or if the same
group differs at two separate points in their experience or with two different devices.

The details: The T-test is the most basic way to see if two groups are “significantly” different from one
another. It is used in null hypothesis testing, wherein the goal is essentially to determine whether your
hypothesis (what you think is happening) is supported or not. Basically, you take a measurement of two groups
(e.g., one group using one device and another group using another device) under the assumption that there are no
differences between them. The T-test will tell you if there is a large enough difference between the two to
doubt that assumption, and if that is the case, you can reject the assumption (or null hypothesis).
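As a concrete sketch, a between-subjects device comparison like the one described can be run as a pooled two-sample t test. The timing data is made up, and 2.145 is the tabled critical value for df = 14 at the 5% level, two-tailed:

```python
import math
import statistics

# Hypothetical task-completion times (seconds) for two devices.
device_a = [34, 38, 31, 36, 33, 35, 37, 32]
device_b = [41, 39, 44, 40, 43, 38, 42, 45]

na, nb = len(device_a), len(device_b)
ma, mb = statistics.mean(device_a), statistics.mean(device_b)
va, vb = statistics.variance(device_a), statistics.variance(device_b)

# Pooled two-sample t statistic (equal-variance form).
pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
t = (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb))

# Tabled critical value for df = 14, alpha = 0.05, two-tailed.
if abs(t) > 2.145:
    print("significant difference: reject the null hypothesis")
else:
    print("no significant difference detected")
```

With these (invented) numbers the device A group is clearly faster, so the null hypothesis of no difference would be rejected.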

- ANOVA

What it is: A test that compares the means of multiple groups by analyzing their variance.

When to use it: When you have a complex study that involves multiple groups, multiple measures over time,
or both, and you want to know if there are differences between those groups or over time.

The details: The ANOVA, or Analysis of Variance, is one of the more common statistical tests used in
research. As the name implies, the ANOVA is a comparison of different means (typically 3+) to determine
whether or not they are different. ANOVA is comparable to the t-test, which compares two groups. In
fact, assuming your groups are equal, you may even be able to substitute a bunch of t-tests for an ANOVA.
Running many separate t-tests inflates the chance of error, however, so it’s probably best to stick with the ANOVA. There
is a wide variety of ways to run an ANOVA, so it is important to know what your study entails before
running this test. Knowing the number of independent variables (e.g., the number of devices being tested)
and whether you are running a within-subjects (e.g., every participant uses every device) or between-
subjects (e.g., half of the participants use one device and half use another) study is particularly important. If
used properly, the ANOVA can be a powerful analytical tool.
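The core one-way ANOVA computation is small enough to sketch directly: partition the variability into between-group and within-group sums of squares and form the F ratio. The three groups of ratings below are invented:

```python
import statistics

# Hypothetical ratings for three devices (between-subjects design).
groups = [
    [7, 8, 6, 7, 8],
    [5, 6, 5, 4, 5],
    [8, 9, 7, 8, 9],
]

k = len(groups)                  # number of groups
n = sum(len(g) for g in groups)  # total observations
grand_mean = statistics.mean(x for g in groups for x in g)

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

# F ratio: between-group variance over within-group variance.
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"F({k - 1}, {n - k}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group scatter can explain; the value would then be compared against an F table for the stated degrees of freedom.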

- Repeated measures

What it is: A test to analyze measurements of the same variable at multiple points in time.

When to use it: When you want to examine changes in a group over time (e.g., a longitudinal diary study),
particularly if you are intervening (e.g., with a device, a service, training, etc.) and want to see if that
intervention has an actual influence on the group.

The details: The statistical test for repeated measures is a specific subset of ANOVA, often called rANOVA
(think ‘r’ for ‘repeated’). It employs a mixture of within-subjects and between-subjects designs in order to
understand how interventions or other variables can influence groups over time. For example, a simple
repeated measures study may have a control group and a test group. Both groups are measured at the start,
then the test group alone is given some sort of intervention (they experience your product or undergo
some treatment), then both groups are again measured. The rANOVA looks at the differences between the
measurements taken to determine if the intervention had any true effects. This test has a broad range of
applicability, and as such is one of the most commonly used tests.

- Regression

What it is: A test for estimating how variables relate to each other and to potential outcomes.

When to use it: When you want to predict the future.

The Details: In essence, regression is simply a technique used to estimate how variables relate to one
another and how they each influence the outcome. Like the ANOVA, regression can come in many forms,
so before you attempt to use it, it is important to understand your research goals. Regression can be linear or
nonlinear, and can have just one input or many. Perhaps the most common use of regression is in linear
regression, which has many practical uses including being able to create predictive models. Let’s look at
weather as an example. If you wanted to determine whether it is going to rain or not tomorrow, you might
look at temperature, pressure, humidity, cloud cover, and historical weather pattern information. Each of
those variables will have a different influence on the ultimate outcome. For instance, cloud cover may influence
tomorrow’s rain potential more than historical data or temperature. Using regression, we can figure out how
strongly each variable relates to the ultimate outcome. Coupling that with current data, we have an estimate
of the likelihood of rain tomorrow. The same methods can be used to predict stock prices, election
outcomes, or just about anything else. Very useful stuff when used correctly (Brogdon, 2018).
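The simplest case, linear regression with one input, can be fitted by least squares in a few lines. The humidity/rainfall pairs below are invented purely to echo the weather example:

```python
import statistics

# Hypothetical data: humidity (%) vs. millimeters of rain the next day.
humidity = [30, 45, 50, 60, 70, 80, 85, 90]
rainfall = [0, 1, 2, 4, 5, 8, 9, 11]

# Least squares: slope = Sxy / Sxx, intercept = ybar - slope * xbar.
xbar = statistics.mean(humidity)
ybar = statistics.mean(rainfall)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(humidity, rainfall))
sxx = sum((x - xbar) ** 2 for x in humidity)
slope = sxy / sxx
intercept = ybar - slope * xbar

# Use the fitted line as a simple predictive model.
def predict(x):
    return intercept + slope * x

print(f"predicted rain at 75% humidity: {predict(75):.1f} mm")
```

The sign and size of `slope` show how strongly the input relates to the outcome, which is exactly the "which variable matters most" question the paragraph describes.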

b. Minitab

Answer:

Minitab provides the following statistical analyses:

1. Summarize the data: Descriptive statistics summarize and describe the prominent features of data. Use
Display Descriptive Statistics to determine how many book orders were delivered on time, how many were late,
and how many were initially back ordered for each shipping center.

2. Interpret the results: The Session window displays each center’s results separately. Within each center, you
can see the number of back orders, late orders, and on-time orders in the Total Count column.

3. Compare two or more means: One of the most common methods used in statistical analysis is hypothesis
testing. Minitab offers many hypothesis tests, including t-tests and ANOVA (analysis of variance). Usually,
when you perform a hypothesis test, you assume an initial claim to be true, and then test this claim using sample
data.

4. Interpret the ANOVA graphs: Minitab produces the following graphs: four-in-one residual plot, interval
plot, individual value plot, boxplot, and Tukey 95% confidence interval plot.

5. Access key results: Suppose you want more information about how to interpret a one-way ANOVA,
specifically Tukey’s multiple comparison method. Minitab provides detailed information about the Session
window output and graphs for most statistical commands.

6. Save the project: Save all your work in a Minitab project.

c. SPSS

Answer:

SPSS (Statistical Package for the Social Sciences) is a set of software programs combined in a single
package. The basic application of this program is to analyze scientific data related to the social sciences. This
data can be used for market research, surveys, data mining, etc. With the help of the obtained statistical
information, researchers can easily understand the demand for a product in the market, and can change their
strategy accordingly. Basically, SPSS first stores and organizes the provided data, then compiles the data set to
produce suitable output. SPSS is designed in such a way that it can handle a large set of variable data formats
(Thomes, 2018). SPSS plays a significant role in the following:

1. Data Transformation: This technique is used to convert the format of the data. After changing the data type, it
integrates data of the same type in one place, making it easier to manage.

2. Regression Analysis: It is used to understand the relation between dependent and independent variables
that are stored in a data file. It also explains how a change in the value of an independent variable can affect
the dependent data.

3. ANOVA (analysis of variance): It is a statistical approach to compare events, groups or processes, and find
out the differences between them. It can help you understand which method is more suitable for executing a task.
By looking at the result, you can find the feasibility and effectiveness of the particular method.

4. MANOVA (multivariate analysis of variance): This method extends ANOVA to compare groups on two or
more dependent variables at once. The MANOVA technique can also be used to analyze different types of
population and what factors can affect their choices.

5. T-tests: A t-test is used to understand the difference between two sample types, and researchers apply this
method to find out the difference in the interest of two kinds of groups. This test can also indicate whether an
observed difference is meaningful or merely due to chance.

5. What insights have you gained from your group research project?

Answer:

As a student in the Graduate School of St. Paul University Dumaguete, during my experience working
with my classmates, I found that I was drawn towards, and most enjoyed doing, the tasks assigned to me. In
thinking about how I can build on this experience, I am taking into account my goal of improving my skills.

The research project allowed me to pursue my interests, learn something new, hone my problem-
solving skills and challenge myself in new ways. Working on a group research project gave me the
opportunity to work closely with equally equipped teachers, which made the work easier. As always, I sincerely
appreciate any guidance you were able to provide us.
