
RMIT International University Vietnam

Assignment cover page

Course code: ECON1193
Course: Business Statistics 1
Location: Ha Noi
Title: Individual Case Study - Inferential Statistics
Student name: Nguyen Le Duc Huy
Student ID: S3978053
Lecturer: Pham Thi Minh Thuy
Word count: 2528

I. Introduction:
Environmental protection has become a priority for humankind because of rapid population growth
and economic expansion. The world has clearly acknowledged the harm caused by accelerating
energy depletion, as the demand for natural resources keeps rising with daily human activity.
Over 90% of electricity is reported to come from coal, which is mainly consumed in developing
countries such as Congo and Ghana, and rising coal use is one of the major causes of today's
environmental problems (Karekezi S, 2002).
GNI stands for gross national income, the total yearly earnings of a country, measured here in
US dollars. GNI reflects both the poverty and the wealth behind residents' living standards, and
it also indicates the economic growth of different countries.
Understanding the connection between GNI and the rate of energy depletion plays an important
role in solving pollution problems. Moreover, identifying the relationship between these two
factors can raise awareness of the reasons behind growing environmental issues.

II. Descriptive statistics and probability:


a. Categories:

High-income countries (HG), GNI > US$14,000

Year   Country Name            GNI per capita, Atlas method (current US$)
2020   Denmark                 62710.0
2020   Netherlands             50170.0
2020   Canada                  43540.0
2020   United Arab Emirates    41770.0
2020   United Kingdom          38590.0
2020   Brunei Darussalam       31210.0
2020   Estonia                 23570.0


Middle-income countries (MG), GNI US$3,500-US$14,000

Year   Country Name            GNI per capita, Atlas method (current US$)
2020   Trinidad and Tobago     13830.0
2020   Romania                 12700.0
2020   Russian Federation      10740.0
2020   Malaysia                10320.0
2020   Mexico                  8750.0
2020   Kazakhstan              8710.0
2020   Guyana                  8190.0
2020   Thailand                6900.0
2020   South Africa            6090.0
2020   Peru                    6000.0
2020   Colombia                5820.0
2020   Ecuador                 5560.0
2020   Equatorial Guinea       5100.0
2020   Iraq                    4720.0
2020   Indonesia               3900.0

Low-income countries (LG), GNI < US$3,500

Year   Country Name            GNI per capita, Atlas method (current US$)
2020   Tunisia                 3230.0
2020   Cabo Verde              2920.0
2020   Papua New Guinea        2470.0
2020   Ghana                   2230.0
2020   Nigeria                 2020.0
2020   India                   1890.0
2020   Congo, Rep.             1760.0
2020   Uzbekistan              1740.0
2020   Myanmar                 1370.0
2020   Tanzania                1050.0
2020   Chad                    630.0
2020   Sudan                   630.0
2020   Niger                   550.0

b. Probability:

                            Low-income       Middle-income    High-income      Total
                            countries (LG)   countries (MG)   countries (HG)
High energy depletion (H)   2                5                1                8
Low energy depletion (L)    11               10               6                27
Total                       13               15               7                35
Table 1: Contingency table of the three country categories by energy depletion and GNI
P(LG|H) = 2/8    P(MG|H) = 5/8    P(HG|H) = 1/8
P(LG) = 13/35    P(MG) = 15/35    P(HG) = 7/35
According to the table, income level and energy depletion appear to be related: the
conditional probabilities given high depletion differ from the unconditional ones.
However, the groups cannot be compared directly because they contain different numbers
of countries; the MG group has more observations than the high-income and low-income
groups. Among the 27 countries with low energy depletion, 10/27 belong to MG and 11/27
to LG, both larger shares than the 6/27 for high-income countries. It is believed that
these two groups of countries have more potential for economic growth than the
high-income group. In addition, only 1/8 of the countries with high energy depletion
are high-income, which suggests that developed nations are more successful in
controlling the growth of energy consumption.
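The conditional probabilities quoted from Table 1 can be reproduced with a short Python sketch. The counts are copied from the contingency table; the function names are illustrative and not part of the original analysis. The second function, P(H | group), is an extra view added here only for illustration, because it corrects for the unequal group sizes mentioned in the text.

```python
from fractions import Fraction

# Counts copied from Table 1 (rows: depletion level, columns: income group).
counts = {
    "H": {"LG": 2, "MG": 5, "HG": 1},
    "L": {"LG": 11, "MG": 10, "HG": 6},
}

def p_group_given_level(group, level):
    """P(group | level): share of one income group within a depletion row."""
    row_total = sum(counts[level].values())
    return Fraction(counts[level][group], row_total)

def p_level_given_group(level, group):
    """P(level | group): share of one depletion level within an income column."""
    column_total = sum(row[group] for row in counts.values())
    return Fraction(counts[level][group], column_total)

# Fractions are printed in lowest terms (for example 2/8 appears as 1/4).
for g in ("LG", "MG", "HG"):
    print(g,
          "P(g|H) =", p_group_given_level(g, "H"),
          "P(H|g) =", p_level_given_group("H", g))
```

If income group and depletion level were independent, P(g|H) would equal P(g) for every group, so comparing the two rows of probabilities under Table 1 is the check being applied.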
Comparison and analysis:
1. Measurement of Central Tendency:
Group   Lower bound   Comparison   Min     Max      Comparison   Upper bound   Result
LG      -4.006        <            0.219   24.491   >            7.875         1 outlier
MG      -6.982        <            0.182   63.300   >            15.216        1 outlier
HG      -2.526        <            0.131   10.705   >            4.598         1 outlier
Table 2: Test of outliers
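
The report does not state how the lower and upper bounds in Table 2 were obtained. A common convention, assumed here, is the 1.5 × IQR rule: lower bound = Q1 - 1.5 × IQR and upper bound = Q3 + 1.5 × IQR. The sketch below applies that rule to a small hypothetical data set (not the country data, which is not listed in this report); the quartile method is also an assumption and other methods give slightly different bounds.

```python
import statistics

def iqr_fences(values, k=1.5):
    """Return (lower, upper) outlier bounds using the k * IQR rule (assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """List every value that falls outside the fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(iqr_fences(sample))
print(find_outliers(sample))
```

As in Table 2, a group is flagged when its minimum is below the lower bound or its maximum is above the upper bound.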

         LG      MG      HG
Mean     1.481   3.762   0.763
Median   1.276   2.742   0.224
Mode     X       X       X
Table 3: Best measure of Central Tendency (without outlier)
Table 3 shows that no mode exists in any of the three categories, so the alternative
measures are the mean and the median. The mean is usually regarded as the best measure of
central tendency; however, it is easily affected by outliers. The median is less sensitive to
outliers and is therefore preferred in this situation (McCluskey A & Lalkhen A.G, 2007).
In Table 3, HG has the smallest median, 0.224, whereas MG has the largest, 2.742, which is
roughly twelve times the HG value. LG lies between these two groups at 1.276. From an
economic point of view, HG has the lowest energy depletion, while developing countries,
both middle-income and low-income, have a more negative influence on the environment.
2. Measurement of Variation:
                           LG      MG       HG
Range                      4.061   11.931   3.368
IQR                        2.970   5.549    0.196
Variance                   1.832   11.952   1.806
Standard Deviation         1.354   3.457    1.344
Coefficient of Variation   0.914   0.919    1.762
Table 4: Measures of Variation table (without outliers)


Table 4 reports several measures of variation. Because the IQR is the least affected by
outliers, it is the most reliable measure of variation here. The MG group has the largest IQR
of energy depletion at 5.549%. In other words, the data for middle-income nations are the
most dispersed of the three categories. High-income nations, on the other hand, have the
smallest IQR (0.196%), suggesting that their values vary the least around the median. As a
result, the HG group has the most stable values of the three groups.
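As a consistency check on Table 4, the coefficient of variation and the variance can be recomputed from the rounded means in Table 3 and the rounded standard deviations in Table 4. This is a minimal sketch; because the inputs are already rounded to three decimals, the last digit may differ slightly from the table.

```python
# (mean from Table 3, standard deviation from Table 4), both without outliers.
groups = {
    "LG": (1.481, 1.354),
    "MG": (3.762, 3.457),
    "HG": (0.763, 1.344),
}

def coefficient_of_variation(mean, sd):
    """CV = standard deviation divided by the mean."""
    return sd / mean

for name, (mean, sd) in groups.items():
    cv = coefficient_of_variation(mean, sd)
    variance = sd ** 2  # variance is the squared standard deviation
    print(f"{name}: CV = {cv:.3f}, variance = {variance:.3f}")
```

Note that the CV ranks HG as the most variable group relative to its mean, while the IQR ranks it as the least variable in absolute terms; the two measures answer different questions.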
3. Reason:
Developed countries appear to be more successful than others at controlling the growth of
energy consumption. As populations grow, demand for energy is likely to rise. However,
European countries have made a remarkable effort to address environmental issues. They
introduced an energy label that informs customers about a product's energy efficiency. Before
products are released onto the market they are rated A+, A++, A+++, etc., and the rating is
displayed on those items (Bertoldi et al, 2016). Customers therefore get a better
understanding of each item, for example its electricity use or energy-saving rate.
Governments have also played an essential role in reducing energy usage by passing
legislation such as the Energy Efficiency Directive and the Energy Performance of
Buildings Directive (Bertoldi et al, 2016). As a result, there was a significant change across
European member states from 2000 to 2014: gas consumption dropped by 14%, gas
consumption for household needs decreased by 11.3%, and gas consumption for industrial
activities fell by 14.6%.
III. Confidence Intervals:
a. Calculate the confidence intervals for the world average energy depletion:
• Level of significance: α = 5%
• Level of confidence: 95%
• Sample size: n = 35
• Sample mean: X̄ = 5.578
• Sample standard deviation: S = 11.463
• α = 0.05, degrees of freedom = n - 1 = 34
• Using an online t calculator ⇒ t = ±2.032
• Confidence interval:
$$\bar{X} \pm t \times \frac{S}{\sqrt{n}} = 5.578 \pm 2.032 \times \frac{11.463}{\sqrt{35}} = 5.578 \pm 3.937$$
$$\Rightarrow 1.641 \le \mu \le 9.515$$
In conclusion, with 95% confidence, the world average energy depletion is between 1.641% and
9.515%.
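The interval above can be checked numerically. The sketch below uses only the figures stated in this section; the critical value t = 2.032 is taken from the report rather than recomputed, and the variable names are illustrative.

```python
import math

n = 35            # sample size
mean = 5.578      # sample mean of energy depletion (% of GNI)
s = 11.463        # sample standard deviation
t_crit = 2.032    # two-sided critical value for 95% confidence, df = 34 (from the report)

standard_error = s / math.sqrt(n)
margin = t_crit * standard_error
lower, upper = mean - margin, mean + margin

print(f"standard error = {standard_error:.3f}")
print(f"margin of error = {margin:.3f}")
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```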
b. Discuss whether and why any assumptions are required or not to calculate these
confidence intervals:
Although the population standard deviation is unknown, the sample size of 35 is larger than
the 30 commonly required for the Central Limit Theorem (CLT) to apply. The CLT is relevant
because, regardless of the shape of the population distribution, the sampling distribution of
the mean becomes approximately normal. Therefore, no assumption of a normally distributed
population is needed to calculate these confidence intervals.
IV. Hypothesis Testing:
a. Hypothesis testing:
We are 95% confident that the world average energy depletion in 2020 lies between 1.641% and
9.515% (part III.a). However, according to World Bank Open Data, the average global energy
depletion (% of GNI) in 2019 was 0.90%, which is below this interval. We therefore claim that
the average rate of energy depletion will rise in the future.

Significance Level (α)             5%
Confidence Level (1 - α) × 100%    95%
Population SD (σ)                  Unknown
Population mean (μ)                0.9
Sample SD (S)                      11.463
Sample Mean (X̄)                    5.578
Sample Size (n)                    35

Step 1: Check for CLT


The sample size is 35, which is larger than 30, so the CLT applies. Thus, the sampling
distribution of the mean is approximately normal.
Step 2: State null and alternative hypothesis

H0: μ ≥ 0.9

H1: μ < 0.9

Step 3: Tail test


Since H1 shows the sign “<”, the lower-tailed test is conducted.

Step 4: Determine which table to use:


Because the population standard deviation is unknown and the sampling distribution of the mean
is approximately normal, we use the t-table.
Step 5: Determine Critical Value (CV)
Degrees of freedom (d.f.) = n - 1 = 34
Significance level (α) = 0.05
⇒ Lower-tailed test, critical value t = -1.69
Step 6: Calculate test statistics t
$$t' = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{5.578 - 0.9}{11.463 / \sqrt{35}} = 2.414$$
Step 7: Make statistical decision.
Since t' = 2.414 > t = -1.69, the test statistic falls in the non-rejection region. We do not reject H0.
Step 8: Explanation.
Because H0 is not rejected, there is insufficient evidence at the 5% significance level that the
world's average energy depletion is below 0.9%. This supports our claim that the mean of energy
depletion will not decrease.
Step 9: Discuss possible errors.
Because the null hypothesis was not rejected, we may have made a Type II error: we conclude
that the world's average energy depletion will not decrease in the future, yet a decrease is still
possible. The risk of a Type II error can be reduced by raising the level of significance.
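Steps 5 to 7 can be reproduced with a short sketch. It uses the values from the table above and the critical value -1.69 quoted in Step 5; the function name is illustrative and not part of the original analysis.

```python
import math

def one_sample_t(sample_mean, mu0, sample_sd, n):
    """Test statistic t' = (sample mean - hypothesized mean) / (S / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))

t_stat = one_sample_t(sample_mean=5.578, mu0=0.9, sample_sd=11.463, n=35)
t_crit = -1.69  # lower-tailed critical value for alpha = 0.05, df = 34 (from Step 5)

# Lower-tailed test: reject H0 only when the statistic is below the critical value.
reject_h0 = t_stat < t_crit

print(f"t' = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```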
b. The number of countries is reduced by half:
When the number of nations in the dataset is cut in half, the sample size is likewise halved,
which changes the degrees of freedom and may have a small impact on the statistical decision
and test results. With a smaller sample, the standard error S/√n rises, so the test statistic
falls. Because the CI width is inversely proportional to the square root of the sample size, and
the critical value (CV) grows in magnitude as the degrees of freedom fall, the CI becomes wider.
As a result, the rejection region shrinks and the estimate of the true population value becomes
less precise. The null hypothesis, on the other hand, is still unlikely to be rejected, implying
that the conclusion of the test remains intact and that the world's average rate of energy
depletion will not decrease in the future. Because a smaller sample lowers the power of the
test, the chance of a Type II error (failing to reject a false H0) increases as the sample size
decreases, while the Type I error rate stays fixed at the chosen α. To summarize, lowering the
number of nations by half may affect the test statistic and CV, resulting in less trustworthy
findings with lower precision and an increased risk of Type II error.
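
The effect of halving the sample can be illustrated numerically. This sketch assumes, purely for illustration, that the halved sample has n = 17 and keeps the same sample mean and standard deviation as the full sample; a real subsample would have different values. The critical values for 16 degrees of freedom (-1.746 one-tailed, 2.120 two-sided) are standard t-table values, not figures from the report.

```python
import math

MEAN, MU0, S = 5.578, 0.9, 11.463  # values from the hypothesis-testing table

def summarize(n, t_lower, t_two_sided):
    """Return standard error, test statistic, decision and CI half-width."""
    se = S / math.sqrt(n)
    t_stat = (MEAN - MU0) / se
    return {
        "n": n,
        "se": se,
        "t_stat": t_stat,
        "reject_h0": t_stat < t_lower,      # lower-tailed rejection rule
        "ci_half_width": t_two_sided * se,  # margin of error of the 95% CI
    }

full = summarize(35, t_lower=-1.69, t_two_sided=2.032)   # critical values from the report
half = summarize(17, t_lower=-1.746, t_two_sided=2.120)  # assumed n = 17, df = 16

for row in (full, half):
    print(row)
```

Under these assumptions the standard error and CI width grow, the test statistic shrinks, and H0 is still not rejected, in line with the discussion above.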

V. Conclusion:
After conducting calculations, formulating hypotheses, and analyzing a sample of 35 countries
selected for their GNI and energy depletion, there are four main findings on this topic.
First, GNI and energy depletion are statistically dependent, because the conditional
probabilities for the three categories (LG, MG, and HG) differ from their overall probabilities.
The MG countries have the highest likelihood of high energy depletion in the contingency table,
followed by the LG group, and finally the HG category. This suggests that energy depletion is
most prevalent in middle-income nations. The descriptive statistics point the same way: MG
nations had the greatest mean, indicating that middle-income countries have the highest rate
of energy depletion.
Second, in the descriptive statistics section, the median and interquartile range were selected
as the best measures of central tendency and variation since both are least affected by extreme
values. With an IQR of 5.549, the MG group has the most dispersed energy depletion values of
any group. The HG group, on the other hand, has the smallest IQR value of 0.196, indicating
that high-income nations have the most consistent data and the least variability of the three
income groups.
Third, based on the confidence interval, we are 95% confident that the world's average energy
depletion rate in 2020 is between 1.641% and 9.515%. According to the World Bank, global
energy depletion averaged 0.9% in 2019, which lies below this confidence interval. In the
hypothesis test H0 was not rejected, which supports the claim that the average rate of energy
depletion will not decrease. Given that the null hypothesis was not rejected, we may have made
a Type II error. As a result, we conclude that the world's average rate of energy depletion will
not decrease in the future, although a decrease is still possible.

Finally, cutting the number of countries in the dataset in half reduces the sample size and
degrees of freedom, which may affect the statistical decision and the test statistic. A
smaller sample size raises the standard error and decreases the test statistic. Even with
these changes, the null hypothesis is still unlikely to be rejected, implying that the world's
average rate of energy depletion will not decrease in the future. When the sample size is
small, the test has less power, so the likelihood of a Type II error rises. In general, reducing
the number of nations in the dataset may give less precise results and a larger probability of
Type II error.

VI. Data collection in Vietnam:


To survey the petrol consumption of household vehicles in different cities in Vietnam, a
large amount of data must be collected. There are two primary types of sampling method:
probability and non-probability sampling. The former selects respondents at random, so the
analysis gives a broader view of the problem, whereas the latter collects data from
non-random sources. In this situation, probability sampling is the better choice. Errors can
occur at any point in the survey process, for example a population specification error: even
with random selection, the sample may not represent the target population. The respondents
reached could be only students or only elderly people, so the sample might not reflect the
population of Vietnam. Non-response error is also possible. Someone asked how much petrol
they use daily for travelling may refuse to answer for several understandable reasons, and a
non-response error then occurs. To handle these difficulties, the list of questions needs to
be fully prepared. Before moving to the main question, it is essential to classify whether the
respondent belongs to the right categories, such as age, gender, occupation, or type of
vehicle used. Therefore, the list of questions can include: Which city do you come from?
Are you working or not? What do you do for a living? What is your main vehicle?

VII. References:
Karekezi S, 2002, “Poverty and energy in Africa—A brief review”, Energy Policy, page 915-919.

Bertoldi P, Lorente J.L, Labanca N, 2016, “Energy Consumption and Energy Efficiency Trends
in the EU-28 2000-2014”, European Commission, page 30.

McCluskey A & Lalkhen A.G, 2007, “Statistics II: Central tendency and spread of data”,
Continuing Education in Anaesthesia, Critical Care & Pain, page 127.
