
MINISTRY OF EDUCATION AND TRAINING

NATIONAL ECONOMICS UNIVERSITY


------***------

GROUP MID-TERM EXAM


Course: Mathematical Statistics
TOPIC: Hypothesis testing
Lecturer: Trần Thị Bích

Class: Finance Economics – FE64


Group: 1
Members: Mai Thái Sơn – 11225626
Phan Mạnh Hùng – 11222590
Phạm Ngọc Hải – 11222029
Trần Mạnh Hào – 11222180
Phạm Văn Lâm – 11223240
Hoàng Phú Khánh - 11223030
CONTENTS
PART I: SUMMARIZE THE ARTICLE

1. The article

2. The issue of interest

3. The technique

4. Additional references of Hypothesis testing

5. Viewpoint

PART II: DATA ANALYSIS TO A SPECIFIC ORGANIZATIONAL PROBLEM

1. Overall view: Insurance claims and Dataset

2. The purpose

3. Descriptive Statistics for Variables

4. Data analysis

PART I: SUMMARIZE THE ARTICLE

1. The article
Stress management through regulation of blood pressure among college
students.
Source: https://content.iospress.com/articles/work/wor2308
2. The issue of interest
Stress is a pervasive condition that affects individuals on multiple levels,
disrupting their sleep, work, and overall quality of life. Extensive research has
identified job-related and academic-related factors as major contributors to
stress. However, the field lacks studies investigating immediate remedies or first
aid measures to alleviate stress. This article introduces the concept of Deep
Breathing Technique (DBT) and explores its potential application as a means of
stress management by regulating blood pressure among Indian college
engineering students. A comparative analysis is conducted between DBT and
Ordinary Breathing Technique (OBT) to assess their effectiveness.
The primary objective of this article is to investigate whether deep breathing
techniques can effectively control blood pressure and subsequently reduce stress
levels. By examining the impact of different breathing techniques, the article
aims to provide recommendations based on the findings.
3. The technique
This academic article used hypothesis testing to analyze data relating to stress
management. In data science and statistics, hypothesis testing is an important
step because it checks an assumption about a statistical parameter against
sample data. Analysts use it to judge whether a hypothesis is plausible or not.
To perform a hypothesis test, we first establish two hypotheses, the null
hypothesis and the alternative hypothesis. We then collect data, perform an
appropriate statistical test, and use its result to decide whether or not to
reject the null hypothesis.
Thus, as organizational managers, we see this as a useful tool for making
reliable decisions from sample data. It lets us move forward knowing that the
risk of drawing a wrong conclusion has been quantified rather than overlooked.
4. Additional references of Hypothesis testing
a. The first source
Hypothesis and Hypothesis Testing in the Clinical Trial.
https://www.psychiatrist.com/wp-content/uploads/2021/02/13947_hypothesis-hypothesis-testing-clinical-trial.pdf.
b. The second source
Testing a Hypothesis—Plant Growth.
https://fathom.concord.org/resources/tutorials/testing-a-hypothesis-plant-growth/.
5. Viewpoint
To better understand the Deep Breathing Technique (DBT) and its applications,
such as controlling blood pressure and stress levels, it is essential to examine
the technique carefully so that we can draw the most suitable conclusions.
For the research, a total of 123 students were selected. Students were filtered
through an initial screening (a questionnaire on academic stress), and those who
reported high mental stress during the interview were chosen for the main
drills. The sample was divided into two groups, a control group and an
experimental group. In the control group, the first readings were recorded as
“before the drill readings” for Systolic Blood Pressure (SBP) and Diastolic
Blood Pressure (DBP). The second readings were recorded as “after Ordinary
Breathing Technique (OBT)”.

Table 1. Hypothesis to be tested
Using the t test formula, the mean and standard deviation of the differences
can be calculated, as illustrated in Table 2. As listed in Table 2 (control
group), the average DBP is 87.75 before the OBT drill and 87.27 after it, with
variances of 7.54 and 9.34 respectively. Based on the t test, the P-values
(0.089 and 0.274) are greater than α. Therefore, both H01 and H02 are not
rejected at the 1% level of significance.
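
The before/after comparison described above can be sketched as a paired t statistic, assuming the article's test is the standard paired t test on the per-student differences. The function name and the five readings below are hypothetical and are not the article's data; only the formula (mean difference divided by its standard error) is the point.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (mean difference, sd of differences, t, degrees of freedom)."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    # Difference for each subject: reading before the drill minus reading after.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t, n - 1


# Hypothetical diastolic readings for five students (illustration only).
before = [88, 90, 86, 89, 87]
after = [80, 83, 79, 80, 80]
mean_diff, sd_diff, t, df = paired_t_statistic(before, after)
print(f"mean difference = {mean_diff:.2f}, t = {t:.2f}, df = {df}")
```

A large positive t here means readings fell after the drill; the P-value is then read from a t distribution with n - 1 degrees of freedom.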

Table 2. t test (Control Group)


Turning to Table 3 (experimental group), the mean DBP is 87.27 before the DBT
drill and 79.89 after it, which is significantly lower and closer to the
desired level of 80. As is evident, the P-value is less than α (P-value = 0.000
< α = 0.01). The conclusion is therefore “H03 is rejected”, and we can say
that DBT has a significant positive effect on DBP.
Similarly, the mean SBP was 128.82 before the DBT drill and 121.03 after it,
again indicating a significant positive effect, as the desired level is 120.
The P-value of 0.000 (lower than α = 0.01) in the t test confirms this (H04 is
rejected).

Table 3. t test (Experimentation Group)


Table 4 shows the status of hypothesis for control and experimental groups after
testing and statistical analysis.

Table 4. Hypothesis status


Based on the results of the hypothesis testing, we can conclude that the Deep
Breathing Technique significantly lowers students' blood pressure, whereas
ordinary breathing does not. It is recommended that people use this technique
to improve their health.

PART II: DATA ANALYSIS TO A SPECIFIC ORGANIZATIONAL PROBLEM

1. Overall view: Insurance claims and Dataset.


Leveraging customer information is of paramount importance for most
businesses. In the case of an insurance company, consumer characteristics such
as those listed below might be critical in making business decisions. We have
therefore collected data from 1338 of our customers (676 male and 662 female),
referred to below as policy holders or insurance holders.

Dataset source: https://www.kaggle.com/code/yogidsba/insurance-claims-eda-hypothesis-testing.
2. The purpose
The objective of this testing is to find the profile of customers that will
benefit the company most, through two questions:
- Do insurance holders with no kids pay lower charges than average, at the 90%
confidence level?
- Are the medical charges of people who smoke greater than those of people who
don’t, at the 90% confidence level?
DATASET:

- Age: This is an integer indicating the age of the primary beneficiary (excluding
those above 64 years, since they are generally covered by the government).
- Sex: This is the policy holder's gender, either male or female.
- BMI: This is the body mass index (BMI), which provides a sense of how over- or
under-weight a person is relative to their height. BMI is equal to weight (in
kilograms) divided by height (in meters) squared. An ideal BMI is within the
range of 18.5 to 24.9.
- Children: This is an integer indicating the number of children / dependents
covered by the insurance plan.
- Smoker: This is yes or no depending on whether the insured regularly smokes
tobacco.
- Region: This is the beneficiary's place of residence in the U.S., divided into four
geographic regions - northeast, southeast, southwest, or northwest.
- Charges: Individual medical costs billed to health insurance.

Above are all the variables in the data that we collected. However, to answer
the two questions of this test, we will only be using the "Children", "Smoker"
and "Charges" variables in our calculation.

3. Descriptive Statistics for Variables

As you can see from the table above, the dataset consists of 1338 samples and
all are valid with 0 missing information.

4. Data analysis
a. Question 1
- State the hypothesis
Null hypothesis: the average insurance charge of people who don’t have kids is
$13270. H0: µ = 13270.
Alternative hypothesis: the average insurance charge of people who don’t have
kids is less than $13270. Ha: µ < 13270.
- Compute the test statistic
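This measures, in standardized units, how far our sample average x̄ is from the
hypothesized mean µ:
$$ t = \frac{\bar{x} - \mu}{S / \sqrt{n}} $$
where S is the sample standard deviation and n is the sample size.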
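
The standardized distance described above (sample mean minus hypothesized mean, divided by S/√n) can be sketched in Python from summary statistics. The sample mean 12365.975, the test value 13270 and n = 574 (573 degrees of freedom plus one) come from this report; the sample standard deviation is not reported here, so the value 12000.0 below is a hypothetical placeholder and the printed t is illustrative only.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for a one-sample test computed from summary statistics."""
    if n < 2 or sample_sd <= 0:
        raise ValueError("need n >= 2 and a positive standard deviation")
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# sample_sd is a hypothetical placeholder, not a figure from the report.
t_example = one_sample_t(
    sample_mean=12365.975,
    hypothesized_mean=13270,
    sample_sd=12000.0,
    n=574,
)
print(f"t = {t_example:.3f} with {574 - 1} degrees of freedom")
```

A negative result means the sample mean lies below the hypothesized value, which is the direction the alternative hypothesis Ha: µ < 13270 predicts.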
Then, we used the One Sample T-Test to compare the means. In the “Test Value”
field, we entered 13270, the null-hypothesis value for the insurance charge of
people who don’t have children.
The outcomes:

- Make decisions
As we can see in the table, the average insurance charge of people who don’t
have kids in this sample is 12365.975. We can make the decision using the
rejection region, the p-value found from the appropriate distribution (standard
normal), or the confidence interval approach.
• With rejection region
We want to be 90% certain, which means accepting a 10% chance of rejecting H0
when it is true. Because the alternative hypothesis is one-sided (µ < 13270),
the rejection region is the lower 10% tail of the distribution.
From the output, the test statistic is t = -1.801 with 573 degrees of freedom.
With this many degrees of freedom the t distribution is very close to the
standard normal.
The t-value is negative, so significance can only be found in the lower-tailed
test. The one-tailed critical value at the 10% level is approximately -1.282;
the table value of -1.646 is the cutoff for 5% in one tail, which is a stricter
threshold. Our test statistic of -1.801 lies below both, so we reject the null
hypothesis in favor of the alternative hypothesis.
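
As a cross-check on the rejection-region step, the lower-tail cutoff can be computed with the standard normal approximation, which is reasonable here because 573 degrees of freedom make the t and normal distributions nearly identical. This is a minimal sketch using Python's standard library; the helper names are hypothetical.

```python
from statistics import NormalDist


def lower_tail_critical_value(alpha):
    """z value with probability alpha to its left under the standard normal."""
    return NormalDist().inv_cdf(alpha)


def reject_lower_tail(test_statistic, alpha):
    """True when the statistic falls inside the lower-tail rejection region."""
    return test_statistic < lower_tail_critical_value(alpha)


critical_10 = lower_tail_critical_value(0.10)  # about -1.2816
critical_05 = lower_tail_critical_value(0.05)  # about -1.6449
decision = reject_lower_tail(-1.801, 0.10)     # True: -1.801 is below the cutoff

print(f"10% cutoff = {critical_10:.4f}, 5% cutoff = {critical_05:.4f}")
print("Reject H0" if decision else "Do not reject H0")
```

The statistic -1.801 falls in the rejection region at either cutoff, so the decision to reject H0 does not depend on which of the two is used.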

• P-value
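We compare the Sig. (2-tailed) value with the significance level. Sig.
(2-tailed) = 0.072; because our test is one-tailed and the t-value is negative,
the one-tailed p-value is 0.072 / 2 = 0.036. Both values are smaller than
α = 0.1, therefore we reject the null hypothesis and conclude that the average
insurance charge of people who don't have kids is less than $13270.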
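
Because the reported significance value is two-tailed while the alternative hypothesis is one-sided, the p-value for the one-sided test can be derived from it. The sketch below assumes a symmetric test distribution (true for the t distribution); the function name is hypothetical.

```python
def one_tailed_p(two_tailed_p, test_statistic, alternative):
    """Convert a two-tailed p-value into a one-tailed one.

    alternative is "less" (Ha: mean below the test value) or
    "greater" (Ha: mean above the test value).
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = two_tailed_p / 2
    if alternative == "less":
        in_predicted_direction = test_statistic < 0
    else:
        in_predicted_direction = test_statistic > 0
    # If the statistic points the wrong way, the one-tailed p-value is large.
    return half if in_predicted_direction else 1 - half


p_one_tailed = one_tailed_p(0.072, -1.801, "less")
print(f"one-tailed p = {p_one_tailed:.3f}")  # 0.036
```

Halving is only valid when the statistic points in the direction the alternative hypothesis predicts, which is why the sign of t is checked first.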
• Confidence Interval (CI) approach
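The 90% confidence interval for the mean difference µ − 13270 is:
-1730.820 < µ − 13270 < -77.231, which corresponds to 11539.180 < µ < 13192.769.
Because µ = 13270 does not lie within this interval (equivalently, the interval
for the difference does not contain 0), we reject the null and conclude that the
average insurance charge of people who don’t have kids is less than $13270.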
Conclusion: Insurance holders with no kids pay lower charges than average at the
90% confidence level.
b. Question 2
- State the hypothesis
Null hypothesis: smokers have the same medical charges as non-smokers.
H0: µ1 = µ2
Alternative hypothesis: smokers have greater medical charges than non-smokers.
Ha: µ1 > µ2
- Compute the test statistic: run the data in SPSS with the Independent Samples
T-Test.
- Make decisions
We can see in the table that the first group, smokers, has a sample size of 274
and mean medical charges of $32050.232, while the non-smoker group has a sample
size of 1064 and mean medical charges of $8434.268. From observation,
non-smokers pay far less than smokers. However, a gap between sample means does
not by itself establish significance, and the two groups differ in size, so we
use the table below for testing.
• Tests for equal variances:
Null hypothesis: H0: σ1² = σ2²
Alternative hypothesis: Ha: σ1² ≠ σ2²
The Sig. value, which is the p-value, is almost equal to 0 (.000). It is
smaller than the significance level of 1%. Therefore, we reject the null
hypothesis "H0: σ1² = σ2²" and conclude "Ha: σ1² ≠ σ2²". This means that in the
next step, when testing for equal means, we will use the results in the "Equal
variances not assumed" row.
• Test for equal means:
The Sig. (2-tailed) value is also 0.000, smaller than the 1% significance
level. Because the sample mean for smokers is the larger one, the one-tailed
p-value is smaller still. We therefore reject the null hypothesis and conclude
that the average medical charges of smokers and non-smokers differ, with
non-smokers paying less than smokers on average, at the 99% confidence level.
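
The "Equal variances not assumed" comparison corresponds to Welch's t-test, which can be sketched from summary statistics as follows. The group sizes (274 and 1064) and means come from this report; the two standard deviations are not reported here, so the values passed below are hypothetical placeholders and the printed result is illustrative only.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1's mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2's mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# sd1 and sd2 are hypothetical placeholders, not figures from the report.
t_welch, df_welch = welch_t(
    mean1=32050.232, sd1=11500.0, n1=274,   # smokers
    mean2=8434.268, sd2=6000.0, n2=1064,    # non-smokers
)
print(f"t = {t_welch:.2f}, df = {df_welch:.1f}")
```

Unlike the pooled-variance test, this version does not assume the two groups share a variance, which is why it is the appropriate row once the equal-variance hypothesis has been rejected.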
