All the questions must be answered and provide rationale for the answers and show the
calculations.
Name the file with your “last name” and attach the file to blackboard in “Assignment2”
1. What are the differences between population parameters and sample statistics?
The difference between population parameters and sample statistics is that a parameter is
a descriptive index derived from the whole population, using all possible data
(Introduction to Statistics_1, ppt slide 4).
A sample statistic is a descriptive index for a sample. Statistics are calculated from only
part of the population, and researchers mainly use them to make inferences about
parameters (Introduction to Statistics_1, ppt slide 4).
2. Why must the researchers begin with descriptive statistics when the goal is to conduct
inferential statistics?
Researchers must begin with descriptive statistics because these describe and
summarize the data in the sample. Once they have this “baseline”, they can draw
objective conclusions about a population from the sample data, basing the inferential
statistics on the laws of probability. To apply the laws of probability, the researchers must
first have some description of the data (Introduction to Statistics_1, ppt slide 4).
3. Imagine you are to conduct a study on how weight and age group (18-35, 36-53, and
≥54 years) relate to systolic blood pressure.
A variable is a characteristic that changes or varies among individuals in a sample
(subset of population) or in a population. Variables can be quantitative or categorical
(Introduction to Statistics_1, ppt slide 5).
5. True or false:
a. An instrument can be reliable without being valid
True
An instrument can be reliable without being valid. For example, a household scale
may report that I weigh 150 lbs every single time I weigh myself, yet still be
off. The scale is reliable because it gives the same measurement of 150 lbs
consistently, but it is not valid because it was not calibrated correctly.
False
An instrument cannot be valid without being reliable. For example, I use another
scale that says I weigh 120 lbs, later says I weigh 140 lbs, and the next time on the
same day says I weigh 110 lbs. The scale is not reliable because its readings are not
consistent, therefore it cannot be valid. Reliability is an essential component of
validity.
6. Which of these charts allows a researcher to examine a possible relationship between two
ratio variables
a. Histogram
b. Bar chart
c. Scatter plot
d. Line chart
The answer is c, scatter plot. The purpose of a scatter plot is to show how two
variables relate to each other. In a scatter plot the relationship between two variables
is termed their correlation. The correlation can be negative or positive, or there may be
no correlation (Introduction to Statistics_1, ppt slide 22).
The median is 117. Because the sample contains an even number of values, we arrange
them in increasing order, add the two middle values, and divide by 2. This gives
a median of 117.
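The even-sample rule described above can be sketched in a few lines of Python. The data values below are hypothetical, chosen only so that the two middle values average to 117; they are not the assignment's data.

```python
from statistics import median

# Hypothetical sample with an even number of values (NOT the assignment's data).
values = [121, 110, 118, 125, 114, 116]

ordered = sorted(values)
n = len(ordered)

# With an even n there is no single middle value, so average the two middle ones.
middle_pair = (ordered[n // 2 - 1], ordered[n // 2])
manual_median = sum(middle_pair) / 2

print(middle_pair, manual_median, median(values))
```

The manual result agrees with the standard library's `median`, which applies the same rule.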
9. The 95% confidence interval of sodium content level in 32 nursing home patients is 4,250
mg/day and 4,750 mg/day. What does this confidence interval tell us?
We can be 95% confident that the true mean sodium content level of the population
these 32 nursing home patients represent lies between 4,250 mg/day and 4,750 mg/day.
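As a small illustration, the two limits also tell us the point estimate and the margin of error, assuming the interval is symmetric around the sample mean (an assumption; the question does not say how the interval was built).

```python
# Limits in mg/day, taken from the question.
lower, upper = 4250.0, 4750.0

# Assumption: the interval is symmetric around the sample mean.
point_estimate = (lower + upper) / 2   # midpoint of the interval
margin_of_error = (upper - lower) / 2  # half-width of the interval

print(point_estimate, margin_of_error)
```

Under that assumption the sample mean would be 4,500 mg/day with a margin of error of 250 mg/day.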
The mean equals the average of the numbers or the calculated central value of a set
of numbers that is the sum of all the numbers in the data set divided by how many
numbers there are. (Introduction to statistics_1, ppt slide 15).
11. The purpose of a study is to test the effect of pressure ulcer prevention in reducing the
incidence of pressure ulcer in critically ill patients in intensive care units.
(Inferential Statistics, ppt slide, 15)
a. What is the null hypothesis?
Pressure ulcer prevention has no effect on the incidence of pressure ulcers in
critically ill patients in the ICU.
The null hypothesis states a prediction that variables in the study are NOT related;
there are no differences between the two groups. (Introduction to Statistics_1, ppt
slide 15).
answer is determined simply as a yes or no. Numbers are used simply as labels
for groups or classes.
(Introduction to Statistics_1, ppt slide 3).
e. The finding was as follows: intervention group had fewer pressure ulcers than the
control group (p=0.005). What is the status of the null hypothesis based on this
result?
Because the p-value of 0.005 is less than 0.05, we reject the null hypothesis;
the result is statistically significant.
A statistically significant result is one that has a high probability of being
real in the population.
(Inferential Statistics, ppt slide 20).
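The decision rule can be written as a one-line comparison. This is a minimal sketch; the function name `reject_null` is hypothetical, and the default alpha of 0.05 is an assumption, since this question does not state the significance level.

```python
def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the result is statistically significant.

    alpha = 0.05 is an assumed, conventional significance level.
    """
    return p_value < alpha


print(reject_null(0.005))  # p-value from the pressure ulcer finding
print(reject_null(0.20))   # a hypothetical non-significant p-value
```

Note that the comparison is against 0.05 (5%), not 0.5.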
A type II error is made when we fail to reject the null hypothesis although the null
hypothesis is false. E.g., an effective intervention is erroneously considered
ineffective, a false negative (Inferential Statistics, ppt slide 17).
14. True or false: normality is assumed with all parametric statistical tests; therefore, it is
important to check if the data is normally distributed or not.
True. A parametric test involves estimating a population parameter, typically
assumes the dependent variable is normally distributed in the population, and requires a
dependent variable that is measured on either an interval or a ratio scale
(Inferential Statistics, ppt slide 1).
Therefore, it is important to check whether the data are normally distributed, because
non-normal data could distort the results of the test.
15. Which of the following correlation coefficients represents the strongest relationship?
0= no correlation
1= perfect correlation
0.1-0.4= weak correlation
0.5-0.7= moderate
0.8-1.0= strong
a. 0.14 = positive weak
b. 0.82 = positive strong
c. -0.02 = negative, negligible (essentially no correlation)
d. -0.34 = negative weak
e. 0.56 = positive moderate
B. 0.82 is the correlation coefficient that represents the strongest relationship out of
all the choices. Strength is judged by the absolute value of the coefficient, and 0.82
has the largest absolute value, so it is closest to a perfect correlation of 1. Any
value between 0.8 and 1.0 is a strong correlation (Introduction to Statistics_1, ppt
slide 24). Choice B is a strong positive correlation.
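Picking the strongest relationship means comparing absolute values, so that a negative coefficient is not penalized for its sign. A minimal sketch using the five answer choices:

```python
# The five answer choices from the question.
choices = {"a": 0.14, "b": 0.82, "c": -0.02, "d": -0.34, "e": 0.56}

# Strength depends on distance from 0, so rank by absolute value, not raw value.
strongest = max(choices, key=lambda label: abs(choices[label]))

print(strongest, choices[strongest])
```

Had one of the choices been, for example, -0.90, it would have been the strongest even though it is the smallest number.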
16. True or false: if a correlation coefficient is -1.0, it means that the two variables will move
in opposite directions
True: a correlation coefficient of -1.0 means the two variables are perfectly
correlated in the negative direction, so as one variable increases the other
decreases. E.g., on a scatter plot the points would fall exactly on a straight line
running from high on the left to low on the right.
Correlation only describes linear relationships (Introduction to Statistics_1, ppt
slide 25).
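To see what a coefficient of -1.0 looks like in data, here is a minimal sketch that computes Pearson's r by hand. The helper `pearson_r` and the x and y values are hypothetical, chosen so that y falls by exactly 2 for every 1-unit rise in x.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)


# Hypothetical data lying exactly on a downward-sloping straight line.
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]

r = pearson_r(x, y)
print(r)
```

Every increase in x is matched by a proportional decrease in y, which is what r = -1.0 describes.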
17. What is the level of measurement for the following variables: age in years, income
group, and blood type?
Age in years is measured on a ratio scale: the intervals between values are equal and
meaningful, and the ratios between values are meaningful as well (Introduction to
Statistics_1, ppt slide 3). I.e., age in years has a true zero point (birth) and it is
continuous. Income group is ordinal (ordered categories) and blood type is nominal
(categories with no order).
18. What type of statistics (mention all possible statistics) can be used to describe the
variables: age in years, income groups, and blood type?
Examples of quantitative variables: how tall you are, your age, your blood cholesterol
level, the number of credit cards you own.
Examples of categorical variables: your blood type, hair color, ethnicity.
Distribution of data, data displays, and graphs:
Ways to graph categorical data: bar graphs, pie charts.
Ways to graph quantitative data: line graphs (time plots, where scales matter),
histograms, box plots.
(Introduction to Statistics_1, ppt slide 5).
19. Does a set of scores with most of its values below the mean have a negatively or
positively skewed distribution? Provide a rationale for your answer.
A set of scores with most of its values below the mean has a positively skewed
(right-skewed) distribution. Most of the values are less than the mean, so the peak
(mode) lies to the left of the mean, while a few high values form a long tail on the
right and pull the mean upward.
(Introduction to Statistics_1, ppt slide 11).
Skewed Distribution: Definition, Examples. (n.d.). Retrieved from
https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/skewed-distribution/
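A quick numeric check can make the rationale concrete. In the hypothetical scores below (not assignment data), one very high value pulls the mean above the median, leaving most values below the mean.

```python
from statistics import mean, median

# Hypothetical scores with a long right tail (NOT the assignment's data).
scores = [2, 3, 3, 4, 4, 5, 20]

m = mean(scores)
med = median(scores)
below_mean = sum(score < m for score in scores)

print(round(m, 2), med, below_mean, len(scores))
```

Six of the seven scores fall below the mean, and the mean exceeds the median, which is the usual signature of a positive skew.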
20. t-statistics = -7.9 and p-value=0.005 describe the difference between women and men for
mental health score.
a. If alpha is set to 0.05, is the p-value of 0.005 statistically significant?
Yes. The p-value of 0.005 is less than the alpha of 0.05, so the results are
statistically significant.
If the null hypothesis is retained (whenever p >= .05), the results are
statistically non-significant. Here, however, the p-value is less than 0.05
(0.5% < 5%), so the null hypothesis is rejected. A non-significant result
reflects an outcome that could have been obtained as a result of chance, i.e.,
more than 5 times out of 100 (just dumb luck). A statistically significant
result is one that has a high probability of being real in the population
(Inferential Statistics, ppt slide 18).
21. A study found that the Pearson correlation coefficient “r” value for the relationship
between serum level of cholesterol and the age of the patients in years is 0.77 and p-value
was 0.002.
a. Interpret the “r” value of 0.77 [strength and direction] and provide a rationale for
your answer.
“r” quantifies the strength and direction of a linear relationship between two
quantitative variables. Strength is how closely the points follow a straight line
and direction is positive when individuals with higher X values tend to have
higher values of Y. The correlation coefficient is calculated using the mean and
the standard deviation of both the x and y variables. (Introduction to
Statistics_1, ppt slide 24).
0= no correlation
1= perfect correlation
0.1-0.4= weak correlation
0.5-0.7= moderate correlation
0.8-1.0= strong correlation
The “r” value given is 0.77. The direction is positive, meaning the trend line
slopes upward from left to right, and the strength is a moderate correlation,
meaning the points follow a straight line fairly closely but not as closely as
they would for a strong correlation, so the straight line is less obvious.
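The strength and direction labels can be sketched as a small lookup. The function name `describe_r` is hypothetical, and the cutoffs of 0.5 and 0.8 are an assumption: the listed ranges leave gaps (for example between 0.7 and 0.8), so this sketch closes them in a way that keeps 0.77 in the moderate band used in the answer.

```python
def describe_r(r: float) -> str:
    """Label a correlation coefficient by direction and strength.

    Assumed cutoffs: below 0.5 weak, below 0.8 moderate, otherwise strong.
    """
    size = abs(r)
    if size == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    if size < 0.5:
        strength = "weak"
    elif size < 0.8:
        strength = "moderate"
    else:
        strength = "strong"
    return f"{direction} {strength}"


for value in (0.77, 0.82, -0.34):
    print(value, describe_r(value))
```

Different textbooks draw these bands in slightly different places, so the labels are a convention rather than a fixed rule.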
b. If alpha is set to be 0.05, is this “r” value of 0.77 and p-value of 0.002 statistically
significant?
Yes. Because the p-value of 0.002 is less than the alpha of 0.05, we reject the null
hypothesis of no correlation, and the “r” value of 0.77 is statistically significant.
A p-value of 0.002 means that a correlation at least this strong would occur only about
2 times in 1,000 (4 in 2,000) by chance if the null hypothesis were true. Setting alpha
at 0.05 means we accept a 5% risk of rejecting a null hypothesis that is actually true
(Type I error).
22. What is the statistical test (procedure) that is used to determine whether a significant
difference exists between three or more group means?
A) t-test
B) ANOVA = Analysis of Variance
C) Correlation coefficient
D) Mann Whitney U test
The answer is B, ANOVA. The simplest ANOVA, the one-way ANOVA, is an extension of the
independent groups t-test. It is the parametric test used for comparing the means of
three or more independent groups. One-way ANOVA computations involve deviations of
scores from the group means and from the overall grand mean (Inferential Statistics,
Analysis of Variance [ANOVA], ppt slide 3).
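The role of the group means and the grand mean can be sketched by computing the F ratio by hand. The three groups and their scores below are hypothetical, not assignment data, and the sketch stops at the F ratio without looking up a p-value.

```python
# Hypothetical scores for three independent groups (NOT the assignment's data).
groups = {
    "group_1": [100, 104, 102],
    "group_2": [108, 110, 112],
    "group_3": [118, 120, 122],
}

all_scores = [score for scores in groups.values() for score in scores]
grand_mean = sum(all_scores) / len(all_scores)

# Between-groups: deviations of each group mean from the grand mean.
ss_between = sum(
    len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
    for scores in groups.values()
)

# Within-groups: deviations of each score from its own group mean.
ss_within = sum(
    (score - sum(scores) / len(scores)) ** 2
    for scores in groups.values()
    for score in scores
)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

f_ratio = (ss_between / df_between) / (ss_within / df_within)
print(round(ss_between, 2), round(ss_within, 2), round(f_ratio, 2))
```

A large F ratio means the group means differ far more than the scatter inside each group would suggest.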
23. What type of hypothesis is represented by the statement “women who smoke are more
likely to have low-birth-weight babies relative to women who do not”?
A) Alternative hypothesis
B) Non-directional
C) Research
D) Null hypothesis
The alternative hypothesis states a prediction that variables in the study are related,
e.g., a difference in means between two groups (Inferential Statistics, ppt slide 15).
Here it means there is a difference between women who smoke and women who do not in
terms of the likelihood of having low-birth-weight babies.
24. The nurse researcher is calculating the standard deviation. What is the standard deviation?
A) The average amount of deviation of values from the mode and is calculated for every other
score
B) The average amount of deviation of values from the median and is calculated for every
other score
C) The average amount of deviation of values from the mean and is calculated for every
score
D) The average amount of deviation of values from the median and is calculated for every
score
The answer is C. The standard deviation is the square root of the variance. The sample
variance, s^2, is the sum of the squared deviations from the sample mean divided by
n - 1, which is approximately their arithmetic mean (Introduction to Statistics_1, ppt
slide 19). The standard deviation measures the average amount of variation or
dispersion of a set of data values.
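The relationship between variance and standard deviation can be checked step by step. The values below are hypothetical, and the sketch assumes the usual sample formula with n - 1 in the denominator.

```python
from math import sqrt
from statistics import stdev

# Hypothetical data values (NOT the assignment's data).
values = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(values)
sample_mean = sum(values) / n

# Squared deviation of every score from the mean, summed and divided by n - 1.
sample_variance = sum((v - sample_mean) ** 2 for v in values) / (n - 1)
sample_sd = sqrt(sample_variance)

print(sample_mean, round(sample_variance, 3), round(sample_sd, 3))
print(round(stdev(values), 3))  # library result for comparison
```

Every score contributes a deviation from the mean, which is why choice C refers to the mean and to every score.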
25. What is the name for the shape of distribution that occurs when the nurse
researcher has a bell-shaped curve distribution?
A) Frequency
B) Unimodal
C) Multimodal
D) Normal
The name of a bell-shaped curve distribution is normal because a bell-
shaped curve has both sides symmetrical to one another so if I were to fold
a paper with a bell-shaped curve distribution down the middle, both sides
would equal each other.
The normal distribution is bell-shaped: a specific shape that can be defined
by an equation, symmetrical around the midpoint, where the greatest
frequency of scores occurs (Introduction to Statistics_1, ppt slide 13).
26. What parametric statistical method(s) a researcher can use to determine if the mean
body mass index of the population is the same for two groups of subjects (group1=diet
restriction; group2=none).
A. The name of the statistical test is the independent-samples t-test, a between-subjects test.
“sample_data_Assignment2_SPRING2019.sav”
1. Do frequency for the following variables and interpret the findings: Agecat (Age
category), dhosp (died in hospital).
For the variable died in hospital, there are 194 participants in total. However, 20 of
them were D.O.A., so they cannot be placed in either the died or the did not die in
hospital group. Therefore, the valid total is reduced to 174 participants. Out of this
valid total, 90 participants (51.7%) did not die in the hospital and 84 participants
(48.3%) did die in the hospital.
I.e., those who did not die = 90/174 = 51.7%; those who died = 84/174 = 48.3%.
The percent and the valid percent differ because the valid percent uses a total of 174,
which excludes the 20 missing cases. The percent still uses the full total of 194, so
it understates the share of each group among the cases with known outcomes.
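The difference between the two columns comes down to the denominator. A minimal sketch using the counts reported above (the variable names are illustrative only):

```python
# Counts reported in the frequency table.
total = 194        # all participants
missing = 20       # D.O.A. cases, treated as missing
did_not_die = 90
died = 84

valid_total = total - missing

# Valid percent: denominator excludes the missing cases.
valid_pct_did_not_die = round(did_not_die / valid_total * 100, 1)
valid_pct_died = round(died / valid_total * 100, 1)

# Percent: denominator is the full sample, including the missing cases.
pct_did_not_die = round(did_not_die / total * 100, 1)
pct_died = round(died / total * 100, 1)

print(valid_total, valid_pct_did_not_die, valid_pct_died)
print(pct_did_not_die, pct_died)
```

The valid percents sum to 100 because they are based on the 174 cases with a known outcome, while the plain percents leave room for the missing cases.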
2. Do descriptive statistics and histogram with normal distribution and interpret the results
for the following variable: fasting_glucose_level (fasting glucose level)
Define:
Histograms give us the range of values that a variable can take when divided into equal size
intervals. The histogram shows the number of individual data points that fall in each
interval. (Introduction to Statistics_1, ppt slide 9).
Interpret:
The histogram covers fasting glucose levels ranging from 75 to 160. The cumulative
percentage shows that 63.4% of the participants are already accounted for by a glucose
level of 110, meaning that 63.4% of the values fall below the mean of 110.39. Because
most of the values are less than the mean, with the peak (mode) to the left of the mean,
the distribution is positively skewed (right-skewed).
This histogram is unimodal; however, it is not a normal distribution because it peaks
between 110 and 120 and is right-skewed, so the normal curve is a poor fit.
The total number of participants for this histogram, or sample size (N), is 194. The
mode is 110 because it is the most frequently occurring fasting glucose level, with 46
of the 194 participants falling at that level. The standard deviation is 16.105 and the
mean is 110.39. The standard deviation is a useful way of comparing the dispersion of
variables (Introduction to Stats_1, ppt slide 20). With this relatively small SD we can
conclude that the values cluster close to the mean.
The Pearson correlation between age in years and fasting glucose level has a correlation
coefficient “r” of .444. This is a weak positive correlation. The direction is positive,
meaning the trend line slopes upward from left to right, and the strength is weak,
meaning the points loosely follow a straight line but not closely enough for the
straight line to be obvious.
“r” quantifies the strength and direction of a linear relationship between two quantitative
variables. Strength is how closely the points follow a straight line and direction is positive
when individuals with higher X values tend to have higher values of Y. The correlation
coefficient is calculated using the mean and the standard deviation of both the x and y
variables. (Introduction to Statistics_1, ppt slide 24).
0= no correlation
1= perfect correlation
0.1-0.4= weak correlation
0.5-0.7= moderate correlation
0.8-1.0= strong correlation
4. Is there a difference between those who died in the hospital and those who did not die in
the hospital [variable name= dhosp (died in hospital)] in the fasting glucose level
a. What statistical test you will use? Is the difference statistically significant?
Explain and interpret the findings.
The statistical test used was the independent-samples t-test. This test is
a parametric statistical technique for determining whether the scores obtained
from two samples differ significantly. The t-test examines the difference
between the two group means and adjusts that difference for the variability
in the scores. In this type of test, the larger the calculated t ratio, in
absolute value, the greater the difference between the two groups.
The F statistic tests whether the variances are equal: when F is large and its
significance level is smaller than 0.10, 0.05, or 0.01, the hypothesis of equal
variances can be rejected.
Therefore, there is a difference in fasting glucose level between the two groups:
those who died in the hospital had higher fasting glucose levels than those who
did not die in the hospital.
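How the t ratio adjusts a mean difference for variability can be sketched by hand. The function `pooled_t` and the glucose values below are hypothetical, not the assignment's data, and the sketch assumes equal variances (the pooled form of the test).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent-samples t ratio, assuming equal variances in both groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical fasting glucose values (NOT the assignment's data).
died = [118, 122, 126, 130]
did_not_die = [100, 104, 108, 112]

t_ratio = pooled_t(died, did_not_die)
print(round(t_ratio, 3))
```

The same 18-point difference in means would give a smaller t ratio if the scores within each group were more spread out.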
5. Is there a difference between the Age groups [variable name= Agecat] in the following
variable: fasting glucose level (fasting_glucose_level)
a. What statistical test you will use? Is the difference statistically significant?
Explain and interpret the findings.
I will use the one-way ANOVA, also known as analysis of variance, a statistical technique
that is conducted to examine differences between two or more groups (Cipher & Grove,
2015, pg 259). For the dependent variable I used the fasting glucose level and for the
factor I used the age category.
The significance value (Sig.) for this test is .000 (that is, p < .001), which is less
than 0.05, therefore the difference is statistically significant.
This means the results presented here are very unlikely to be attributable to chance.
There is a relationship between higher age and higher glucose levels.
6. Write a ONE PAGE summary report for the results of the study and its impact on
nursing practice (i.e., summarize the findings from question 1 to 5) [two points]
Questions 1 through 5 dealt with age, fasting glucose level, and death or survival
in a hospital, as well as whether these variables are related. This type of research
is fundamentally important: to be better prepared to treat patients with these
characteristics in the future, we need to understand the relationships among them.
Question 1 asked for the frequencies of the variables age category and died in the
hospital. In the frequency table, the 55-64 age category had the highest frequency,
with 73 of the 194 participants falling under this category. This information is
important because it gives us a baseline. Now that we see that this age category is
the most frequent, we can start asking why, and how we can prevent it. We can start
focusing on making changes so this age category is no longer the highest-risk age.
Question 1 also makes it clear that there is not much of a difference between the
number of people who died in the hospital and the number who did not. Keeping in mind
that 20 of the original N = 194 were excluded because they were D.O.A., the valid
percent of people who died while in the hospital was 48.3%, and the remaining 51.7%
did not die while in the hospital. These are not very good statistics in terms of
survival rate.
Question 2 asked for a histogram with a normal distribution curve and an
interpretation of the variable fasting glucose level. The histogram showed a mean
fasting glucose level of 110.39. These findings are important because they suggest
that most of the people surveyed do not have good control of their glucose levels,
which points to the need for further teaching on the subject.
Question 3 asked whether there was a relationship between age in years and fasting
glucose level. The result was a weak positive correlation. That is, although there
does seem to be a relationship between higher age and higher fasting glucose level,
it is a weak one. Nursing practice could benefit from this result because it shows
that the relationship, although weak, is still there. It emphasizes the importance of
education at a young age and/or continued education about having good control of
glucose levels.
Question 4 asked whether there is a difference in fasting glucose level between those
who died in the hospital and those who did not. This finding is important because it
shows that those who died in the hospital were, more often than not, those who had
the higher fasting glucose levels. If we can study this further, we can hopefully
raise awareness of this result and highlight the importance of keeping good control
of glucose levels.
Question 5 asked whether there was a difference in fasting glucose level between age
groups. The difference was statistically significant, meaning the result is unlikely
to be attributable to chance. There is a relationship between higher age and higher
glucose levels, and we should focus our research on finding ways to address it. How
can we prevent, educate and reduce these numbers? By looking at research studies we
can help improve nursing practice by concentrating more on preventative measures
rather than on tertiary measures or treating the problems.