
GROUP ASSIGNMENT COVER SHEET


STUDENT DETAILS

Student name: Nguyễn Thị Ngọc Nguyên Student ID number: 22004157

Student name: Võ Minh Như Student ID number: 21001069

Student name: Saw Stephen Student ID number: 22000004

Student name: Nguyễn Thị Phương Thảo Student ID number: 21001461

Student name: Nguyễn Việt Vy Student ID number: 21001390


UNIT AND TUTORIAL DETAILS

Unit name: Statistic For Business Unit number: MAT102


Tutorial/Lecture: Class day and time: Friday, 12h-15h15
Lecturer or Tutor name: Dr. Nguyễn Thị Thu Vân
ASSIGNMENT DETAILS

Title: Assessment 3: Group Project


Length: Due date: 24/08/2023 Date submitted: 25/08/2023

DECLARATION

X I hold a copy of this assignment if the original is lost or damaged.

X I hereby certify that no part of this assignment or product has been copied from any other student’s work or from
any other source except where due acknowledgement is made in the assignment.

X I hereby certify that no part of this assignment or product has been submitted by me in another (previous or
current) assessment, except where appropriately referenced, and with prior permission from the Lecturer /
Tutor / Unit Coordinator for this unit.

X No part of the assignment/product has been written/produced for me by any other person except where
collaboration has been authorised by the Lecturer / Tutor / Unit Coordinator concerned.

X I am aware that this work may be reproduced and submitted to plagiarism detection software programs for the
purpose of detecting possible plagiarism (which may retain a copy on its database for future plagiarism
checking).

Student’s signature: Nguyên


Student’s signature: Như
Student’s signature: Stephen
Student’s signature: Thảo
Student’s signature: Vy
Note: An examiner or lecturer / tutor has the right to not mark this assignment if the above declaration has not been
signed.


INDIVIDUAL CONTRIBUTION RATE


Unit: Marketing Research

Class: SB-T223WSB-6
Group: 1
Assignment: Group Assignment
Individual contribution rate:

No.  Student full name        Student ID  Contribution rate (%)  Signature
1    Nguyễn Thị Ngọc Nguyên   22004157    20%                    Nguyên
2    Võ Minh Như              21001069    20%                    Như
3    Saw Stephen              22000004    20%                    Stephen
4    Nguyễn Thị Phương Thảo   21001461    20%                    Thảo
5    Nguyễn Việt Vy           21001390    20%                    Vy

Note:
- The contribution rate must be discussed and agreed by all group members.
- The maximum contribution rate is 100%, indicating that the student fully contributes to the
assignment.

SB-T223WSB-06
Dr. Nguyen Thi Thu Van
Western Sydney University


August 24, 2023

FINAL GROUP ASSIGNMENT REPORT


GROUP 1

PROJECT 1: As gasoline prices increase, alternative fuels appeal more to vehicle fleet
managers and consumers. Like gasoline, alternative fuel prices can fluctuate based on
location, time of year, and political climate. The Clean Cities Alternative Fuel Price Report
provides regional alternative and conventional fuel prices for biodiesel, compressed natural
gas, ethanol, hydrogen, propane, gasoline, and diesel. Use the dataset provided to:

Table of contents
I. Introduction
II. Analysis of data
1. By using a suitable display, describe the data for each kind of fuel price.
2. Sketch boxplots for each kind of fuel price and make comparisons between them.
3. Find the association between gasoline prices and alternative fuel prices.
4. Build a regression model to predict the fuel price of each type. In your own words,
describe the fitting model.
III. Finding/Collecting data
IV. Cleaning data
V. Analyzing data
VI. Interpreting results - Conclusion


I. Introduction
Rising gasoline prices have encouraged vehicle fleet managers and consumers to investigate
alternative fuels. Alternative fuels may lessen the impact of rising gasoline costs
while also addressing environmental concerns. However, like gasoline prices, alternative fuel prices
fluctuate with factors such as geographical location, time of year, and political climate.

This report examines the prices of alternative fuels using statistical analysis. The data come from
the Clean Cities Alternative Fuel Price Report. The study focuses on the prices of alternative and
conventional fuels, namely biodiesel, compressed natural gas (CNG), ethanol, hydrogen, propane,
gasoline, and diesel, across different regions. Factors including location, time of year, and
political climate are considered in the analysis.

The main aim of this study is to provide insight into the fluctuations and patterns
observed in alternative fuel prices, particularly in comparison with rising gasoline prices.
These findings are intended to help both vehicle fleet managers and consumers. To
explore the data, the report uses statistical techniques including descriptive statistics,
time series analysis, and regional comparisons. The outcomes of this analysis contribute to a
deeper understanding of the alternative fuel market and its potential as a viable choice within the
transportation sector.

II. Analysis of data

1. By using a suitable display, describe the data for each kind of fuel price.

Figure 1: The prices of Gasoline over the period.


The bar chart illustrates gasoline price fluctuations over time. The prices began at approximately
$1.52 in April 2000 and fluctuated, reaching a peak of around $4.13 in April 2022 before falling to
about $3.69 in April 2023. Notable fluctuations include a substantial increase from approximately
$1.99 in June 2004 to nearly $3.91 in July 2008, followed by a sharp drop during the recession of
2009 and subsequent recovery.


Figure 2: The prices of E85

E85 is a blend of 85% ethanol and 15% gasoline. The bar chart of E85 prices from
April 2000 to April 2023 shows considerable fluctuation. Beginning around $1.80 in
2000, the price of E85 fluctuated and peaked at approximately $5.10 in July 2022 before declining to
approximately $3.50 in April 2023. As with gasoline, E85 prices increased between 2008 and
2011 due to market volatility. The higher ethanol content of E85, along with factors such as ethanol
production costs and regulatory policies, contributes to its distinct price trajectory.


Figure 3: The prices of compressed natural gas

From April 2000 to April 2023, the bar chart vividly illustrates the fluctuations in Compressed
Natural Gas (CNG) prices. Beginning at approximately $0.89, CNG prices fluctuated and increased
to approximately $3.25 in January 2023 before stabilizing at approximately $2.99 in April 2023.
CNG attracts the attention of eco-conscious consumers and fleet managers seeking viable fuel
options amid shifting energy landscapes because it is a cleaner and less expensive alternative.


Figure 4: The prices of liquefied natural gas

From July 2016 to April 2023, the bar chart illustrates the fluctuations in LNG prices. Beginning at
approximately $2.41, LNG prices fluctuated, increasing to approximately $4.23 in January 2023
before decreasing to approximately $4.02 in April 2023. The graph illustrates the volatility of LNG
prices, resembling patterns observed with other fuel types. Notably, 2022 witnessed a significant
price increase, which peaked in early 2023.


Figure 5: The prices of liquefied Propane

The bar chart illustrates the fluctuations in the price of Propane fuel from April 2000 to April 2023.
Propane prices began at approximately $1.62, fluctuated widely, and peaked at approximately $5.19
in July 2022 before declining to approximately $4.50 in April 2023. The graph illustrates the
volatility of Propane prices, mirroring patterns observed with other fuel types. A notable price
increase in 2021 followed by a price apex in the middle of 2022 reflects the complex interaction of
economic factors. Propane's applications for heating and as a vehicle fuel make its price dynamics
pertinent for fleet managers and consumers in a dynamic market seeking cost-effective and versatile
energy solutions.


Figure 6: The prices of Diesel

The bar chart illustrates the price changes for Diesel fuel from April 2000 to April 2023. Beginning
at approximately $1.29, Diesel prices fluctuated, reaching a high of approximately $5.02 in July
2022 before falling to approximately $3.78 in April 2023. The graph illustrates the volatility of
Diesel prices, which mirrors patterns observed for the other fuel types. Notable is the steep increase in
2021, culminating in a peak in mid-2022, which reflects the complex interaction of global factors. As
Diesel powers commercial transportation, its price changes directly affect costs for industries and
consumers.


Figure 7: The Prices of B20

The figure displays the B20 fuel price over time, from April 2000 to April 2023.
B20 is a blend of 20% biodiesel and 80% petroleum diesel. Prices
fluctuated throughout the years, with some noticeable trends. The B20 price started
around $1.35 in October 2001, rose and fell over the years, and reached a peak of
approximately $4.80 in July 2022. The graph illustrates how market dynamics, energy policies, and
environmental concerns influence the price of this biodiesel blend.


Figure 8: The Prices of B99/ B100


The figure shows the B99 and B100 fuel prices from April 2000 through April
2023. B99 or B100 refers to a blend of 99% or 100% biodiesel, a renewable alternative fuel.
The data depict the fluctuation of biodiesel prices over time. Prices were relatively stable until
approximately 2004, after which they began to fluctuate, reaching a high of approximately $5.48 in
July 2022.

2. Sketch boxplots for each kind of fuel price and make comparisons between them

We use the data from the Clean Cities Alternative Fuel Price Report of January
2023 to sketch boxplots for each kind of fuel: biodiesel, compressed
natural gas, ethanol, propane, gasoline, and diesel. The boxplots were sketched in Excel.

Fuel       Min     Quartile 1   Quartile 2 (median)   Quartile 3   Max
Gasoline   $1.11   $2.22        $2.68                 $3.30        $4.70
Diesel     $1.04   $2.20        $2.67                 $3.37        $5.02
CNG        $0.89   $1.88        $2.09                 $2.18        $3.25
E85        $1.54   $2.58        $3.06                 $3.94        $5.10
Propane    $1.55   $3.55        $3.87                 $4.11        $5.19
B20        $1.18   $2.26        $2.63                 $3.56        $4.80

COMPARISON: The boxplots illustrate the prices of six types of fuel. Most of them
are skewed right. Compressed natural gas and propane, by contrast, are
skewed left. Propane reached the highest price of all the fuels,
approximately $5.19, while compressed natural gas had the lowest price,
roughly $0.89.
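
The skew direction read off the boxplots can be cross-checked from the quartiles alone. The sketch below uses the quartile (Bowley) skewness, a technique not used in the report itself, and assumes that the six five-number summaries above belong, in order, to gasoline, diesel, CNG, E85, propane, and B20 (matched by their maximum prices).

```python
# Five-number summaries (min, Q1, median, Q3, max) in USD, as listed above.
# Assumption: fuel labels are matched to the summaries by their maxima.
summaries = {
    "Gasoline": (1.11, 2.22, 2.68, 3.30, 4.70),
    "Diesel": (1.04, 2.20, 2.67, 3.37, 5.02),
    "CNG": (0.89, 1.88, 2.09, 2.18, 3.25),
    "E85": (1.54, 2.58, 3.06, 3.94, 5.10),
    "Propane": (1.55, 3.55, 3.87, 4.11, 5.19),
    "B20": (1.18, 2.26, 2.63, 3.56, 4.80),
}


def bowley_skewness(q1, median, q3):
    """Quartile skewness: above 0 suggests right skew, below 0 left skew."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


skew = {
    fuel: bowley_skewness(q1, median, q3)
    for fuel, (low, q1, median, q3, high) in summaries.items()
}

for fuel, value in skew.items():
    direction = "right" if value > 0 else "left"
    print(f"{fuel:<9} {value:+.2f}  skewed {direction}")
```

Under these assumptions the measure is negative only for CNG and propane, which agrees with the comparison above.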


3. Find the association between gasoline prices and alternative fuel prices.

a. Gasoline prices and E85 prices


r = 0.9657
=> There was a very strong positive association between E85 prices and gasoline prices.

b. Gasoline prices and CNG prices


r = 0.748
=> There was a strong positive association between CNG prices and gasoline prices.

c. Gasoline prices and LNG prices


r = 0.59
=> There was a moderate positive association between gasoline prices and LNG prices.

d. Gasoline prices and Propane prices


r = 0.795
=> There was a strong positive association between gasoline prices and Propane prices.

e. Gasoline prices and Diesel prices


r = 0.973
=> There was a very strong positive association between gasoline prices and Diesel
prices.

f. Gasoline prices and B20 prices


r = 0.9655
=> There was a very strong positive association between gasoline prices and B20
prices.

g. Gasoline prices and B99/B100 prices


r = 0.8686
=> There was a very strong positive association between gasoline prices and
B99/B100 prices.

In conclusion, there was a positive association between gasoline prices and every
alternative fuel price.
The relationship between gasoline prices and:
E85, Diesel, B20, and B99/B100 prices: very strong positive association
CNG and Propane prices: strong positive association
LNG prices: moderate positive association
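
As a rough illustration of how these labels follow from r, the sketch below computes Pearson's r for two paired lists and maps a coefficient to a verbal label. The cutoffs (0.8, 0.6, 0.4, 0.2) are an assumption chosen to be consistent with the labels above; they are not quoted from the report's guideline table.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Label a correlation. The cutoffs are assumed, not taken from the report."""
    size = abs(r)
    if size >= 0.8:
        label = "very strong"
    elif size >= 0.6:
        label = "strong"
    elif size >= 0.4:
        label = "moderate"
    elif size >= 0.2:
        label = "weak"
    else:
        label = "very weak"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


# Correlations with gasoline prices as reported in this section.
reported = {
    "E85": 0.9657, "CNG": 0.748, "LNG": 0.59, "Propane": 0.795,
    "Diesel": 0.973, "B20": 0.9655, "B99/B100": 0.8686,
}
for fuel, r in reported.items():
    print(f"{fuel:<9} r = {r:<6} {describe_strength(r)}")
```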

4. Build a regression model to predict the fuel price of each type. In your own
words, describe the fitting model.

Based on the historical data, the model predicts that the price of gasoline will fall slightly
from April 2, 2023.

The price of E85 is close to the price of gasoline. The prediction is based
on E85 prices from 2001 to 2023. According to the predicted
result, the price of E85 will decrease slowly from April 2, 2023.

The price of CNG increased steadily from 2009 to 2021 and reached its highest
peak in 2023. However, the linear regression model predicts that the price
of CNG will start to fall in late 2023.

There is no record of the price of LNG from 2001 to 2015. The price of
LNG then increased slowly from 2015 to 2022. According to the prediction based on
the historical LNG prices, the price of LNG starts to fall on April 10,
2023, and then increases gradually after that date.

The price of propane has followed a volatile trend over the past two decades. From 2001 to 2008, the
price steadily increased, reaching a high of $4.50 per gallon in 2008. However, the price then fell
sharply in 2009 to $3.10 per gallon. The price remained stable for the next few years, before beginning
to rise again in 2022. By the end of 2022, the price had reached $5.00 per gallon. However, a linear
regression model predicts that the price will start to decrease slightly in 2023 and stabilize at around
$4.80 per gallon.

The price of diesel has been much more volatile than propane over the past few years. In 2020, the
price of diesel dropped to a low of $2 per gallon, which was the lowest price in over a decade.
However, the price of diesel then began to rise sharply in 2021, reaching a record high of $5 per gallon
in 2022. The price of diesel started to decline in 2023 and is currently around $3.5 per gallon. A
linear regression model predicts that the price of diesel will continue to decline and stabilize at
around $3 per gallon by the end of 2023.

The price of B20 has followed a roller coaster pattern over the past two decades. From 2001 to 2008,
the price steadily increased, reaching a high of $4.10 per gallon in 2008. However, the price then fell
sharply in 2009 to $2.00 per gallon. There were no major changes in price for the next 13 years, until
2022, when the price began to rise again. By the end of 2022, the price had reached $4.90 per gallon.
However, a linear regression model predicts that the price will start to decrease in 2023 and stabilize
at around $3.50 per gallon.

The price of B99/B100 has fluctuated significantly over the past 17 years. From 2006 to 2008, the
price increased from $3.20 to $4.90 per gallon. However, the price then began to decline slowly, and
by 2021, it had reached $3.00 per gallon. In late 2021, the price of B99/B100 surged, almost reaching
$5.60 per gallon. However, a linear regression model predicts that the price will stabilize at around
$4.90 per gallon by the end of 2023.
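
A linear trend forecast of the kind described in this section can be sketched as follows. The price series is a hypothetical placeholder, not data from the report, and `fit_line` is an illustrative helper name; the report's own forecasts were produced in Excel.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical quarterly prices in USD (placeholders, NOT values from the report).
quarters = [0, 1, 2, 3, 4]
prices = [4.0, 4.4, 4.2, 3.9, 3.7]

slope, intercept = fit_line(quarters, prices)
next_quarter = 5
forecast = intercept + slope * next_quarter

print(f"slope per quarter: {slope:+.2f}")
print(f"forecast for quarter {next_quarter}: ${forecast:.2f}")
```

A straight line keeps rising or falling at a constant rate, so a forecast that levels off would need a different model from the one sketched here.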


III. Finding/Collecting data


The issue at hand is understanding how fluctuations in petroleum prices affect the appeal of alternative
fuels to both vehicle fleet managers and consumers. This information is crucial for making
informed decisions about fuel choices. To address this issue, the Clean Cities Alternative Fuel Price
Report, published in January 2023, serves as a valuable resource. The data were collected in Excel as
a dataset of 112 observations, each giving the regional retail prices on a specific report date for a
variety of alternative and conventional fuels, including biodiesel, compressed natural gas (CNG),
ethanol, hydrogen, propane, gasoline, and diesel. The prices were collected from retail
and at-the-pump sales and were submitted for the major alternative fuels currently in use, as well as for
conventional fuels, from stations selling alternative fuels or from nearby stations. These prices were then
averaged to identify regional price trends and fuel price variability within and among regions. By
using this secondary data from the report, stakeholders can analyze how petroleum prices have
affected the adoption of alternative fuels across regions and time periods and make more
informed decisions in response to changing fuel market dynamics.

IV. Cleaning data
Two kinds of error in the Clean Cities Alternative Fuel Price Report dataset need to be
cleaned:
1. Wrong order of month/day/year and inconsistent formatting of month/day/year
Step 1: Select the whole “Report Date” column so that it is highlighted
Step 2: Click Format Cells, select Date, and set the format to
mm/dd/yyyy
Step 3: Highlight the “Report Date” column again and select “Sort & Filter”
Step 4: Select Sort A->Z (shown as Sort Oldest to Newest for date cells); the rows are rearranged in date order.

2. Missing values:


Step 1: Highlight the header row of the value columns, such as Report Date, Gasoline, and E85
Step 2: Click Filter for that row (row number 3); an arrow appears in each header cell
Step 3: Click the arrows in the columns with missing values, such as LNG, B20, and B99/B100
Step 4: A small menu appears. Scroll down its filter list to the entry "Blanks", select it, and
delete the blank rows it shows for the three columns LNG, B20, and B99/B100. Remove any unnecessary
columns as well.

"Blanks" marks empty cells, which hold no value. A row with an empty cell cannot be counted when that
column is analyzed, even though it has values in other columns, so it is not necessary to keep it.
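
The two cleaning steps above can also be sketched in code. This is a minimal illustration and not the workflow used in the report, which was done in Excel. The rows are hypothetical placeholders, `None` stands for a blank cell, and `parse_date` and `clean` are illustrative names.

```python
from datetime import datetime

# Hypothetical rows standing in for the spreadsheet; None marks a blank cell.
rows = [
    {"Report Date": "04/02/2023", "Gasoline": 3.50, "LNG": 4.00},
    {"Report Date": "10/15/2001", "Gasoline": 1.50, "LNG": None},
    {"Report Date": "7/1/2022", "Gasoline": 4.50, "LNG": 3.80},
]


def parse_date(text):
    """Read an mm/dd/yyyy string as a real date so sorting is chronological."""
    return datetime.strptime(text, "%m/%d/%Y")


def clean(rows, required_columns):
    """Drop rows with a blank in any required column, then sort by date."""
    complete = [
        row for row in rows
        if all(row[col] is not None for col in required_columns)
    ]
    return sorted(complete, key=lambda row: parse_date(row["Report Date"]))


gasoline_rows = clean(rows, ["Gasoline"])
lng_rows = clean(rows, ["Gasoline", "LNG"])

print([row["Report Date"] for row in gasoline_rows])
print([row["Report Date"] for row in lng_rows])
```

Sorting the dates as plain text would place "04/02/2023" before "10/15/2001", which is why the sketch parses them first. Filtering only on the columns needed for a given comparison keeps the full history for fuels that have no blanks.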

V. Analyzing data
After cleaning the data, we analyzed it in two ways: exploratory data
analysis, and algorithms and models.

Exploratory data analysis (EDA):

Data visualization:

We observed a right-skewed distribution of the average retail fuel prices in the United States.
The chart shows that gas prices in the United States have increased over time.
Propane, however, reaches the highest price, roughly $5.0.

Descriptive statistics:

Fuel       Mean   Mode   Median   Min    Max    Standard deviation
Gasoline   2.82   2.22   2.67     1.91   4.7    0.67
E85        3.12   2.65   2.96     2.28   5.1    0.68
CNG        2.32   2.19   2.19     2.05   3.25   0.3
Propane    4.11   3.87   3.88     3.67   5.19   0.46
Diesel     2.9    2.2    2.71     2.13   5.02   0.78
B20        2.78   2.24   2.58     2.06   4.8    0.73
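
Statistics like those in the table can be computed with a few library calls. The sketch below uses Python's standard `statistics` module on a hypothetical price sample (placeholder values, not the report's data). It assumes the sample standard deviation is wanted, which corresponds to Excel's STDEV.S.

```python
import statistics

# Hypothetical price sample in USD (placeholders, NOT values from the report).
prices = [2.2, 2.2, 2.5, 2.7, 3.1, 3.4, 4.7]

summary = {
    "mean": statistics.mean(prices),
    "mode": statistics.mode(prices),      # most frequent value
    "median": statistics.median(prices),
    "min": min(prices),
    "max": max(prices),
    "stdev": statistics.stdev(prices),    # sample standard deviation (n - 1)
}

for name, value in summary.items():
    print(f"{name:<7} {value:.2f}")
```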

Using algorithms and models


Another way to analyze the data is to use linear regression to measure the association between
variables. Linear regression is a statistical method used to predict the value of a dependent variable
(Y) from the value of one or more independent variables (X). The degree of association is
measured by a correlation coefficient, denoted by r. It is sometimes called Pearson’s correlation
coefficient after its originator and is a measure of linear association.
Here are the steps to create a linear regression in Microsoft Excel:
1. Prepare the data: two columns (the gasoline prices and the prices of one alternative fuel)
2. Select the two columns to compare in a scatter plot
3. Click Insert -> Chart -> X Y Scatter (the chart appears on the page)
4. Right-click a data point in the chart -> choose Add Trendline (the trendline appears)
5. Double-click the trendline (the format pane appears on the right with three options: Fill & Line,
Effects, Trendline Options) -> choose Trendline Options -> tick the two boxes at the bottom (Display
Equation on chart and Display R-squared value on chart) to show the regression equation and the
R-squared value on the chart.

After that, use the R-squared value to calculate the r value. According to LaMorte (2021), the
sample correlation coefficient (r) is a measure of the closeness of association of the points in a scatter
plot to a linear regression line based on those points. Possible values of the correlation coefficient
range from -1 to +1, with -1 indicating a
perfectly linear negative, i.e., inverse, correlation (sloping downward) and +1 indicating a perfectly
linear positive correlation (sloping upward).
When we get the result for r, we can categorize the association using the table below, which provides
some guidelines for how to describe the strength of correlation coefficients.
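
The step from R-squared to r can be written out explicitly. For a simple linear regression with a single predictor, R-squared equals the square of r, so r is the square root of R-squared with the sign of the trendline slope. The symbols a and b below are assumed names for the intercept and slope of the trendline equation shown on the chart.

```latex
\hat{y} = a + b\,x,
\qquad
r = \operatorname{sign}(b)\,\sqrt{R^{2}}
```

For example, a trendline with positive slope and R-squared of about 0.9326 gives r of about 0.9657, the value reported for E85.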

VI. Interpreting results - Conclusion

We used information from the Clean Cities Alternative Fuel Price Report in this study. Through
statistical techniques such as time series analysis and regional comparisons,
the study has revealed valuable insights. Our investigation covered a range of fuels, including
biodiesel, CNG, ethanol, hydrogen, propane, gasoline, and diesel, as well as varied geographic and
temporal contexts.

The analysis showed that alternative fuel prices fluctuated similarly to gasoline prices and were
positively correlated with them. The correlation was moderate for LNG, strong for CNG and
propane, and very strong for E85, diesel, B20, and B99/B100. The regression models
forecast price drops for E85, CNG, and LNG, and varying degrees of price
stabilization for propane, diesel, B20, and B99/B100.

In conclusion, this work contributes to our understanding of alternative fuel pricing while offering
decision-makers relevant information to aid them in navigating the continuously evolving
transportation scenario. As gas prices continue to fluctuate, the data-driven conclusions drawn from
this analysis seek to promote informed decision-making, advancing both economic concerns and the
greater environmental aims of the transportation sector.

VII. References

LaMorte, W. (2021, April 21). The Correlation Coefficient (r). Sphweb.bumc.bu.edu.
https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module9-Correlation-Regression/PH717-Module9-Correlation-Regression4.html

US Department of Energy. (2023, January). Clean Cities Alternative Fuel Price Report.
https://afdc.energy.gov/files/u/publication/alternative_fuel_price_report_january_2023.pdf
