
Air Pollution Modelling Workshop

49049 – AIR AND NOISE POLLUTION


Raissa Gill & Peter Irga
Learning Objectives

1. Use online, publicly available datasets to compile PM and wind data for your own study

2. Generate wind roses and polar plots using the “openair” package in R statistical software

3. Use analysis outputs to determine the most likely source(s) of air pollution arriving at two study locations

4. Compare and contrast air pollution between the two study locations, and assess changes through time
Background: Source apportionment

• Source apportionment is a technique used to understand different sources of air pollution.
• It identifies the air constituents that make up air pollution, and
where they come from.
• A common method for source apportionment is the combined use
of wind roses and bivariate polar plots.
• A wind rose is a method of graphically presenting wind conditions,
direction, and speed over a period of time at a specific location.
• A bivariate polar plot is used to visualise and explore mean
pollutant concentrations based on wind speed and direction.
Background: Interpreting a wind rose

• Each branch of the rose represents wind coming from that direction, with north to the top of the diagram.

• The branches are divided into segments of different thickness and colour, which represent the range of wind speeds from that direction.

• The length of each segment within a branch is proportional to the frequency of winds blowing within the corresponding range of speeds from that direction.

[Figure: example wind rose with a “Branch” and its “Segments” labelled]
https://www.envitrans.com/how-to-interpret-a-wind-rose.php
Background: Wind rose example

An example for Chennai, India:

• Winds blow predominately from the south-east (SE), where the 3 spokes (ESE, SE, SSE) comprise ~39% of all hourly wind directions (sum of the frequencies associated with each spoke, as indicated by the circular lines): ESE ~13% + SE ~16% + SSE ~10% = 39%.

• Winds rarely blow from the north-west (NW): ~1% of the time (length of all segments).

• On the SE spoke, winds blow at 18.0-28.8 km/h ~5% of the time (length of red segment).
https://www.envitrans.com/how-to-interpret-a-wind-rose.php
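The spoke frequencies read off a wind rose are simply the percentage of hourly records falling in each compass sector. A minimal sketch of that calculation in base R is shown here; the wind directions are made-up values for illustration (not the Chennai data), and the 16-point compass with 22.5° sectors is an assumption about how the rose is drawn.

```r
# Hypothetical hourly wind directions in degrees (direction the wind blows FROM).
wd <- c(100, 110, 120, 135, 140, 150, 160, 170, 315, 20)

# 16-point compass labels; each sector spans 22.5 degrees, centred on its bearing.
sectors <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")

# Shift by half a sector so that 348.75-11.25 degrees all map to "N".
sector_index <- floor(((wd + 11.25) %% 360) / 22.5) + 1
spoke <- factor(sectors[sector_index], levels = sectors)

# Percentage of hours in each spoke (the total length of each branch).
spoke_pct <- 100 * table(spoke) / length(wd)

# Combined frequency of the three south-easterly spokes.
se_pct <- sum(spoke_pct[c("ESE", "SE", "SSE")])
print(spoke_pct)
print(se_pct)
```

Summing the three spokes in this way mirrors the ESE + SE + SSE addition in the Chennai example.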
Background: Interpreting a polar plot

• The centre of the plot (the “Origin”) indicates a wind speed of 0, which increases radially outwards by 2.5 m/s at each dashed circular line.

• The concentration of air pollutant PM10 (µg/m³) is shown in colour. The position of a colour relative to the origin gives a wind direction and speed: when wind blows from that direction at that speed, [PM10] at the study site is the value shown by the legend.
https://www-sciencedirect-com.ezproxy.lib.uts.edu.au/science/article/pii/S1364815211002064
Background: Polar plot example

• When wind speed is 0 m/s (the “Origin”), [PM10] at the site is ~35 µg/m³.

• When wind blows at 5 m/s from the north, [PM10] at the site is ~10 µg/m³.

• [PM10] at the site is highest when winds blow from ENE at ~7 m/s, with concentrations exceeding 45 µg/m³.

• [PM10] at the site is lowest when winds blow from WNW at ~6.5 m/s, with concentrations of ~5 µg/m³.
https://www-sciencedirect-com.ezproxy.lib.uts.edu.au/science/article/pii/S1364815211002064
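To see what a polar plot is summarising, it can help to compute mean concentrations by wind speed and direction yourself. The sketch below is a simplified, binned version in base R: openair's polarPlot fits a smoothed surface rather than plain cell means, so treat this only as an illustration of the idea. The observations, the 2.5 m/s rings and the 45° sectors are all assumptions chosen for the example.

```r
# Hypothetical hourly observations (not data from the slides).
obs <- data.frame(
  ws   = c(0.5, 1.0, 6.8, 7.2, 6.9, 5.1, 4.9, 6.4),
  wd   = c(10, 200, 55, 60, 58, 355, 5, 290),
  pm10 = c(34, 36, 46, 48, 47, 10, 11, 5)
)

# Bin wind speed into 2.5 m/s rings.
obs$ws_bin <- cut(obs$ws, breaks = seq(0, 10, by = 2.5), include.lowest = TRUE)

# Bin wind direction into eight 45-degree sectors centred on N, NE, E, ...
obs$wd_bin <- cut((obs$wd + 22.5) %% 360, breaks = seq(0, 360, by = 45),
                  labels = c("N", "NE", "E", "SE", "S", "SW", "W", "NW"),
                  include.lowest = TRUE, right = FALSE)

# Mean PM10 in each speed/direction cell that contains data.
cell_means <- aggregate(pm10 ~ ws_bin + wd_bin, data = obs, FUN = mean)

# The cell with the highest mean concentration.
highest <- cell_means[which.max(cell_means$pm10), ]
print(cell_means)
print(highest)
```

The row stored in `highest` plays the same role as the "hot spot" you would read off the coloured surface of a polar plot.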
Part 1: UTS Tech Lab mock analysis

First we will run through an example using the UTS Tech Lab to get you familiar with
the data structure and how to run the analysis, but this does not go in your report –
you will need to repeat the steps you learn here on different datasets.

Recently, UTS has built a new facility called UTS Tech Lab (-33.938726, 151.199040) which is located at an industrial estate in Botany, NSW.

Botany has a history of poor air quality and is considered the most polluted area in Australia!
Part 1: Potential sources of air pollution

If you look at a map of the area, the site has many potential contributing sources of outdoor air pollution:
• Industrial estates (S)
• Cargo shipping port (S)
• Oil refinery (SE)
• Airport (W)
• Residential (N)
• Vehicle traffic (N)
To determine the source of ambient PM10 at UTS Tech
Lab, PM10 sampling was conducted on the roof of the
building, in combination with a weather station, for a
period of 30 days.
Part 2: Download UTS Tech Lab data

Download the “UTSTechLab.csv” file from UTS Canvas and save it to your computer.
Look carefully at the way the sheet is set up. If the data sheet is not set up in this
format with separate columns for time, wind direction, wind speed and [PM10]
starting in the first row, it will be imported into R incorrectly.

Column A = Date
Column B = wind direction (°)
Column C = wind speed (m/s)
Column D = PM10 (µg/m³)
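If you want to double-check a data sheet before running the analysis, a quick structural check in R can catch most import problems. This is a minimal sketch only: the file is a small stand-in written to a temporary folder, and the column names (`date`, `wd`, `ws`, `pm10`) and date format are assumptions, so compare against the header row of your own file.

```r
# Build a small stand-in file; the header names here are assumptions.
csv_path <- file.path(tempdir(), "example_site.csv")
writeLines(c(
  "date,wd,ws,pm10",
  "1/05/2020 0:00,210,3.2,14.1",
  "1/05/2020 1:00,225,2.8,15.6",
  "1/05/2020 2:00,240,4.0,21.9"
), csv_path)

site <- read.csv(csv_path)

# Four columns, with the header in the first row.
has_four_columns <- ncol(site) == 4

# The three measurement columns should be read as numbers, not text.
numeric_ok <- all(sapply(site[, 2:4], is.numeric))

# Wind direction lies between 0 and 360 degrees; speed and PM10 cannot be negative.
ranges_ok <- all(site$wd >= 0 & site$wd <= 360) &&
  all(site$ws >= 0) && all(site$pm10 >= 0)

str(site)
```

If a measurement column comes through as text, the usual cause is extra rows above the header or stray characters in the data.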
Part 3: Download R software

Now that you have the data, it’s time to install R software:

1) Go to https://cran.csiro.au/ and download R by clicking the link relevant to your operating system:
Windows: Download R for Windows > install R for the first time > “Download R 4.0.0 for Windows”
Mac: Download R for (Mac) OS X > “R-4.0.0.pkg”
Other? Google it first. If it’s unclear, email us.

2) AFTER Step 1, go to https://rstudio.com/products/rstudio/download/ and download RStudio by clicking DOWNLOAD under the free Open Source License and selecting the installer relevant to your operating system.
Part 3: Set up your working directory

3) Download the “Openair.rmd” file from UTS Canvas and save it to the same
location as your UTSTechLab.csv file (in one folder).
This .rmd file is an “R notebook” containing the code required to generate your wind roses and polar
plots. Whenever you want to use this file, it must be in the same location (i.e. folder) as the data you
wish to use – remember this for when you rerun the analysis on the datasets for your report.
Part 3: Open the R notebook

4) Double click the Openair.rmd file to open it in RStudio. Some key features:

“Chunk” = shaded area which starts with ```{r} and ends in ```, contains code which you can organise into sections, you can press the play button to run the entire chunk

“Comment” = anything after a # symbol, it doesn’t run as code, but lets you create notes to help you later

“Line” = reference to a line of code
Part 3: Analyse UTS Tech Lab using “openair”

5) Work through the R notebook:


First chunk:
Click the play symbol to the top right of the first chunk to install and load the openair package. This will run the
code contained in this chunk, and the play symbol will change to a stop symbol. The chunk is finished running
when the stop symbol reverts back to a play symbol.
Part 3: Analyse UTS Tech Lab using “openair”

Second chunk:
Scroll to the second chunk, place your cursor on line 22 next to getwd() and press Ctrl+Enter to run that single
line of code. This will return your current working directory directly beneath this chunk, which tells you where
on your computer R is looking for files.
On line 24, change the working directory in quotation marks to the location on your computer containing the R
notebook and .csv file. If you’re not sure how to do this, use the output of getwd() to help you get the
structure/order right. Once you’ve adjusted the code, click the play symbol to run the entire chunk, which will
import the UTSTechLab.csv file and show you the data structure directly below.
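The working-directory steps can be sketched as follows. This is not the notebook's actual code: the folder is a temporary stand-in, and the file name and contents are made up so that the sketch runs on any computer. On your machine, the path inside setwd() would be the folder holding the notebook and .csv file.

```r
# Where is R currently looking for files?
original_wd <- getwd()
print(original_wd)

# Stand-in folder; replace with the folder containing your notebook and data.
data_dir <- tempdir()
setwd(data_dir)

# Stand-in data file so the sketch is self-contained.
writeLines(c("date,wd,ws,pm10", "1/05/2020 0:00,210,3.2,14.1"),
           "example_site.csv")

# Confirm the file is visible from the working directory before importing it.
file_found <- file.exists("example_site.csv")
mydata <- read.csv("example_site.csv")
head(mydata)

# Return to the original working directory.
setwd(original_wd)
```

If `file_found` is FALSE, the working directory and the file location do not match, which is the most common reason an import fails.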
Part 3: Analyse UTS Tech Lab using “openair”

Third chunk:
Scroll to the third chunk and click the play symbol to run the chunk on the UTSTechLab data. Under the chunk
you can click between the different figures to view the wind rose and polar plot produced using the openair package.
To insert these plots into a Word document, right click the plot and select Copy image, then right click a space in your Word
document and select the Picture paste option to paste the plot. Repeat this for both plots. Note, you are just
practising this step using the UTSTechLab data, but you should not include these plots in your actual lab report –
you need to repeat this for your report datasets.
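For reference, the two openair functions behind these plots are windRose() and polarPlot(). The sketch below shows how they are typically called; it is not a copy of the notebook's third chunk. The data are randomly generated stand-ins, and the column names (`date`, `wd`, `ws`, `pm10`) are assumptions that must match whatever your own data frame uses.

```r
# Synthetic stand-in data: 30 days of hourly records with assumed column names.
set.seed(1)
n <- 30 * 24
mydata <- data.frame(
  date = seq(as.POSIXct("2020-05-01 00:00", tz = "UTC"), by = "hour", length.out = n),
  wd   = runif(n, 0, 360),   # wind direction (degrees)
  ws   = runif(n, 0, 10),    # wind speed (m/s)
  pm10 = runif(n, 5, 25)     # PM10 (ug/m3)
)

# Only draw the plots if openair is installed
# (run install.packages("openair") once if it is not).
if (requireNamespace("openair", quietly = TRUE)) {
  # Wind rose: frequency of wind by direction and speed class.
  openair::windRose(mydata, ws = "ws", wd = "wd")
  # Bivariate polar plot: PM10 concentration by wind speed and direction.
  openair::polarPlot(mydata, pollutant = "pm10", x = "ws", wd = "wd")
}
```

Because the stand-in data are random, the resulting plots show no real pattern; the point is only the shape of the function calls.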
Part 4: UTS Tech Lab source apportionment

Now that you have your plots, you can identify the most likely source of
PM10 at the site. Let’s do this now with the UTS Tech Lab site – you will
need to do this on your datasets for the report!
• Winds blow predominately from NE, S, and WSW.

• PM10 concentrations under NE and S winds range from ~6 to ~18 µg/m³.

• When winds blow from WSW, you observe the highest concentrations of
PM10 (~22 µg/m³)

• Thus, comparing to the map of UTS Tech Lab, we can estimate that the
most likely source of PM10 is the airport and shipping dock.

*** Important note: Comparing your plots to the map is useful, but the
plot only tells you which direction & speed a pollutant is coming from,
not where it is – don’t interpret the map overlay literally!
Report: “A tale of two cities - what is the concentration
and source of air pollution in Orange and Bathurst?”

Use the modelling skills you have learned in this workshop and previous workshops to
answer the following questions in your report:
1. How has the local concentration of particulate matter (PM10 and PM2.5) changed in April from 2018
to 2020 (3 years)?
2. How have the source(s) of remotely sourced particulate matter (PM10 and PM2.5) changed in April
from 2018 to 2020 (3 years)?
3. How does a) the local concentration and b) source(s) of particulate matter (PM10 and PM2.5) differ
between Bathurst and Orange sites?

You will need to download these datasets and re-run the analysis in R.

You will also need to formulate your own testable hypotheses that address each question,
and generate results to investigate each hypothesis.
Part 5: Obtain Bathurst-Orange data

Air quality and environmental data can be acquired here:


https://www.dpie.nsw.gov.au/air-quality/search-for-and-download-air-quality-data

Scroll down and input the following parameters to the search tool:
Step 1: Site averages = Hourly
Select parameter = Particles – PM10, Particles – PM2.5, Wind speed, Wind direction
Step 2: Select sites = BATHURST, ORANGE
Step 3: Select data display = Download data as file
Set data period = Start date: 01/04/2020, End date: 30/04/2020
Click “Load data”

Scroll up a little and click the link “XLS-file: Hourly Averages Time Range: 01/04/2020 00:00 to
01/05/2020 00:00” to download the file.

Repeat these steps to download April 2019 and 2018 data with the same input parameters.
Part 6: Inspect Bathurst-Orange data

Your datasets (you should have three, one for each year) should look something like below, with
hourly averages for April at Bathurst and Orange stations for the following variables:
• WDR = wind direction in degrees (°)
• WSP = wind speed in m/s
• PM10 = particulate matter of size 10 µm or less (µg/m³)
• PM2.5 = particulate matter of size 2.5 µm or less (µg/m³)
Part 7: Restructure Bathurst & Orange data

Next we need to do some slight restructuring.


First, in all three of your .xls files (one for each year), rename the columns as per the new labels
(note this IS case sensitive):

Date Time Bathurst wd Bathurst ws Bathurst PM10 Bathurst PM2.5 Orange wd Orange ws Orange PM10 Orange PM2.5

Delete the first two rows so that the column names are in the first row and save the sheet
as a .csv (File > Save as > CSV comma delimited). Ensure your 3 csv files are saved in the same folder
as your Openair.rmd R notebook.
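One thing to be aware of when column names contain spaces: by default, read.csv() converts them into syntactically valid R names, so a space becomes a dot. The sketch below demonstrates this with a small stand-in file that uses four of the labels above; whether the notebook relies on the default behaviour is not stated here, so use head() or names() on your own imported data to confirm what the columns are called.

```r
# Stand-in file using four of the column labels from this slide.
csv_path <- file.path(tempdir(), "example_2020.csv")
writeLines(c(
  "Bathurst wd,Bathurst ws,Bathurst PM10,Bathurst PM2.5",
  "180,2.5,12.3,4.1"
), csv_path)

# Default import: names are made syntactically valid (spaces become dots).
default_names <- names(read.csv(csv_path))

# check.names = FALSE keeps the header exactly as typed in the sheet.
exact_names <- names(read.csv(csv_path, check.names = FALSE))

print(default_names)
print(exact_names)
```

Whichever form your data frame ends up with is the one to type, in quotation marks, when naming columns in the analysis.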
Part 8: Analyse Bathurst & Orange using “openair”

Adjust the R notebook for your Bathurst-Orange datasets:


First chunk:
No adjustments - place your cursor on line 13 and press Ctrl+Enter to load the package (no need to reinstall).
Second chunk:
Adjustments needed - ensure that the setwd() on line 24 contains the file location of your Bathurst-Orange
csv files, and change the name of the csv file on line 27 from ‘UTSTechLab.csv’ to the name of your first
Bathurst-Orange .csv file (don’t forget the quotation marks and .csv).
Third chunk:
Adjustments needed – you will need to generate wind roses and polar plots for 1) Bathurst PM10, 2)
Bathurst PM2.5, 3) Orange PM10, 4) Orange PM2.5. Change the names in quotation marks on lines 42 and 46
to match the names of the columns you’re trying to analyse (you can use the head() function to help you,
don’t forget the quotation marks). Remember, if you want to analyse Bathurst, you need to input the
Bathurst wind variables and particulate matter. Copy-paste your plots over as per prior instruction.
Repeat this for each of your three .csv files (one for each year).
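With two sites, two pollutants and three years, it is easy to lose track of which combinations you have run. A small checklist table can be built in R as sketched below. The column labels follow Part 7 as written; the layout of the table itself is just one possible way to organise the work.

```r
# Every combination of pollutant, site and year that needs a wind rose and polar plot.
runs <- expand.grid(
  pollutant = c("PM10", "PM2.5"),
  site      = c("Bathurst", "Orange"),
  year      = c(2018, 2019, 2020),
  stringsAsFactors = FALSE
)

# Column labels to use for each run, so wind and PM always come from the same site.
runs$wd_column        <- paste(runs$site, "wd")
runs$ws_column        <- paste(runs$site, "ws")
runs$pollutant_column <- paste(runs$site, runs$pollutant)

print(runs)
```

Ticking off each row as you copy its plots into your document helps ensure no site, pollutant or year is missed.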
Part 9: Bathurst & Orange source apportionment

To do this, you will need:

• Wind roses and polar plots from Part 8.

• Google map screenshots of Bathurst (-33.403333, 149.573333) and Orange sites (-33.274240, 149.094545), with a consistent radius to the N, S, E and W of the sensors. You can use the distance tool on Google Maps by right clicking the map > Measure distance, and creating transects from the site by clicking and creating continuous lines. The length of your transects will depend on the activities and population density of your site. If Sydney city has a transect of 5 km, how far should you look for sources in a semi-rural location?

• Information about potential sources of particulate matter relevant to Bathurst and Orange, including various sources of PM10 and PM2.5, current land use, and recent land use (past 3 years). This will require government and primary research and assessment of the layout of the town and its roads, land use, and industries.

*** Important note: To assess local PM, you need to look at concentrations when wind speed = 0 m/s (no transport of PM)
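The local PM concentration at wind speed = 0 m/s can be read from the centre of the polar plot, and it can also be calculated directly from the data. A minimal sketch follows, using made-up values and assumed column names (`ws`, `pm10`). If your dataset has very few hours with a wind speed of exactly 0, you may need to decide on and justify a small threshold instead; that choice is yours and is not prescribed here.

```r
# Hypothetical hourly records for one site.
site <- data.frame(
  ws   = c(0, 0, 0.2, 3.1, 5.4, 0),
  pm10 = c(12, 14, 20, 25, 30, 16)
)

# Calm hours: no wind, so no transport of PM to the sensor.
calm <- site$ws == 0

# Mean PM10 during calm hours, and how many hours that mean is based on.
local_pm10 <- mean(site$pm10[calm], na.rm = TRUE)
calm_hours <- sum(calm, na.rm = TRUE)

print(local_pm10)
print(calm_hours)
```

Reporting the number of calm hours alongside the mean shows how much data the local estimate rests on.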
Part 10: Temporal & comparative analysis

One aspect of your report is to assess changes through time, and to compare your Bathurst and Orange sites.

Think about how you could do this:
• How can you assess temporal trends?
• What metrics will you use?
• How will you present the data (in figures) concisely and appropriately?

Here are some resources to help (see next slides)


Part 10: Metric calculations

Useful metric calculations to summarise PM concentrations in Excel:

=AVERAGE(B2:B10)  Calculates the average
=STDEV.S(B2:B10)  Calculates the standard deviation
=MEDIAN(B2:B10)  Calculates the median
=MAX(B2:B10)  Calculates the maximum
=MIN(B2:B10)  Calculates the minimum
=COUNTIF(B2:B10, ">20")  Calculates the number of values exceeding 20
=CORREL(A:A, B:B)  Calculates the correlation coefficient

Think about what temporal scale you will use to calculate these – separately for each
month or collectively for the three years?
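If you would rather stay in R, the same metrics have direct base R equivalents. The sketch below uses made-up PM10 values, and the threshold of 20 simply mirrors the COUNTIF example; neither is taken from the report datasets.

```r
# Hypothetical PM10 values (ug/m3) and the hour index they were recorded at.
pm10 <- c(12, 18, 25, 9, 31, 22, 15, 20, 27)
hour <- seq_along(pm10)

metrics <- c(
  average   = mean(pm10),     # =AVERAGE()
  sd        = sd(pm10),       # =STDEV.S() (sample standard deviation)
  median    = median(pm10),   # =MEDIAN()
  maximum   = max(pm10),      # =MAX()
  minimum   = min(pm10),      # =MIN()
  n_over_20 = sum(pm10 > 20)  # =COUNTIF(range, ">20")
)

# Correlation coefficient between two variables, as in =CORREL().
r <- cor(hour, pm10)

print(metrics)
print(r)
```

If your data contain missing values, most of these functions accept `na.rm = TRUE`, and cor() accepts `use = "complete.obs"`.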
Part 10: Generating figures

First, you will need to save a copy of your data as an Excel spreadsheet (not .csv),
otherwise your figures will not save once you close the sheet!

Producing high quality figures in Excel:


• Highlight the data > Insert > Charts
• Double click elements of the chart or use the [+] button to make changes
• Explore this link for plotting dos and don’ts: https://www.clips.edu.au/displaying-data/
• Tables are OK, but use sparingly if at all. Note that figures are usually the best
way to convey information, and this is considered in the marking process.
• See the next slides for some common plotting tips.
Part 10: Some dos and don’ts…

https://www.clips.edu.au/displaying-data/
Part 10: Plotting Averages with SD

1. Create summary of Averages and Standard Deviations – in this case we are averaging CO2 concentrations for each hour across 3 days to determine when CO2 is highest (will be different for your report).
2. Insert plot of Averages > Click plot > [+] on top right > Error Bars
arrow > More options > Error Bar Options arrow >
Series “Average” X Error Bars
3. Select Custom [Specify Value] >
Enter 0 for Positive & Negative values
4. Select Error Bar Options Arrow > Series “Average” Y Error Bars
5. Select Custom [Specify Value] > Click upwards arrow >
Highlight SD column for Positive & Negative Error Values

This is a similar process for other chart types (bar/line etc.)
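The summary table in step 1 (an average and standard deviation for each hour) can also be produced in R before plotting. The following is a minimal sketch with made-up CO2 readings for three hours over three days; the values and column names are illustrative assumptions.

```r
# Hypothetical CO2 readings (ppm) at three times of day across three days.
co2 <- data.frame(
  day  = rep(1:3, each = 3),
  hour = rep(c(9, 12, 15), times = 3),
  ppm  = c(410, 450, 430, 420, 470, 440, 400, 460, 435)
)

# Average and standard deviation for each hour, calculated across the days.
avg <- tapply(co2$ppm, co2$hour, mean)
sdv <- tapply(co2$ppm, co2$hour, sd)

hourly <- data.frame(
  hour    = as.numeric(names(avg)),
  average = as.vector(avg),
  sd      = as.vector(sdv)
)

print(hourly)
```

The `average` column is what you would plot, and the `sd` column supplies the positive and negative error bar values.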


Part 10: Plotting Linear Relationships

1) Highlight the raw data containing the two variables you wish to plot against each other. In this
case, we will assess the relationship between Temperature and CO2 concentration (you might do
time and particulate matter for your report!)
2) Insert scatterplot > Click plot > [+] on top right > Trendline arrow > More options >
Trendline options > Select Linear > Tick Display Equation & R-squared value on chart

How to interpret:
Equation of line:
“For each unit increase in x, there is an m change in y”, where m is the slope of the line
“For each 1°C increase in temperature, there is a ~7.53 decrease in
CO2 (ppm)”

R² = value between 0 and 1, describes how well the line of best fit
captures your data points
0-0.3 = bad fit → no/weak relationship
0.3-0.6 = OK fit → moderate relationship
0.6-1.0 = good fit → moderate-strong/strong relationship
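The trendline equation and R² that Excel displays can be reproduced in R with lm(). A minimal sketch follows; the temperature and CO2 values are invented for illustration and are not the data behind the example on this slide.

```r
# Hypothetical paired observations.
temperature <- c(18, 20, 22, 24, 26)       # degrees C
co2         <- c(520, 505, 491, 474, 460)  # ppm

# Fit the line of best fit: co2 = intercept + slope * temperature.
fit <- lm(co2 ~ temperature)

slope     <- unname(coef(fit)["temperature"])
intercept <- unname(coef(fit)["(Intercept)"])
r_squared <- summary(fit)$r.squared

# Slope: change in CO2 (ppm) for each 1 degree C increase in temperature.
print(slope)
print(intercept)
print(r_squared)
```

A negative slope means the y variable decreases as x increases, and R² is interpreted using the same bands listed above.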
Report: Structure

Before we get into the specific structure of your report…

DO NOT INCLUDE THE UTS TECH LAB RESULTS OR ANALYSIS IN YOUR REPORT !!!

BATHURST AND ORANGE SITES ONLY !!!


Report: Structure

Section (word count): Information to include

Abstract (~200 words):
Brief summary of all sections of the report (without citations) – just the key points re-worded.

Introduction (~500 words):
Establish the importance of air pollution and its potential impact on Bathurst and Orange, with reference to primary and/or government literature:
• What is particulate matter and why is it important to study?
• What is the land use around the sensors at Bathurst and Orange relevant to PM10 and PM2.5?
State the three questions to investigate in this study, and your hypotheses relevant to each question.

Methodology (~no word limit):
Describe the source apportionment modelling, temporal and comparative analysis, including:
• The source of your datasets and the information you gathered.
• Brief description of how you analysed the data with the openair package in R.
• Description of metric calculations used to make temporal and site comparisons.

Results (~no word limit):
Your results must report the following:
• State and compare the local concentration of PM10 and PM2.5 over the three years in both sites – this can be summarised in a figure.
• State the concentration, wind direction and wind speed of the largest remote source(s) of PM10 and PM2.5 over the three years at both sites - this can be summarised in a table.
• The wind roses and polar plots generated for each year, pollutant type and site.
• The metrics and/or trends analysed to make your temporal and site comparisons described in text, as well as complementary figures to visualise the results.

Discussion (~max 1000 words):
You will need to discuss the following:
• Local sources of PM10 and PM2.5 at each of your sites, and whether their concentrations have changed in the past three years. If it has, describe possible causes and assert which you believe is most likely. Ensure you justify this with evidence and logical reasoning.
• Remote source(s) of PM10 and PM2.5 at each of your sites, and whether these source(s) and their associated concentrations have changed in the past three years. If it has, describe possible causes and assert which you believe is most likely. Ensure you justify this with evidence and logical reasoning, and include reference to your wind roses and/or polar plots, and to primary/government literature to back your source apportionment.
• Discuss the implications of the highest detected levels of particulate matter at each site in 2020 only. Compare to national standards (https://www.environment.gov.au/protection/publications/factsheet-national-standards-criteria-air-pollutants-Australia, ensure you adjust the temporal scale of your data where appropriate) and use primary literature to determine whether the levels detected pose any risk to human health for both pollutant types and sites.
• Draw conclusions in reference to your hypotheses – were they proven or disproven? How?

References:
Mostly primary literature (20 is good!) using APA referencing style.
Report: Marking Rubric
Weight (%): Criteria

40%: 1. Disciplinary knowledge & its appropriate application
Introduction
• Demonstrated understanding of research area and study sites, including discerning potential source(s) of air pollution as a function of land use.
• Ability to synthesise information and present concise hypotheses of the study.
Methodology
• Evidence of practical component completed, including obtaining data, restructuring data, openair analysis in R, and temporal and comparative analysis.
• Logical process of investigating concentration and source(s) of air pollutants between sites and through time to assess each hypothesis.
• Evidence of temporal and site comparison analysis, which is supported by generation of key metrics and tables/figures.

45%: 2. An enquiry orientated approach
Results
• Data is plotted and presented to a high standard, acceptable to the level of a designated engineering publication.
• Data presentation is relevant to the questions of this research and intended purpose of the figure.
• Written descriptions are concise, clearly describe the concentration of local and remote pollutants through time at both sites, and identify the wind direction and speed of the source(s) contributing the highest level of pollutants.
• Quantifiable and comparative language is used to describe local concentrations, temporal trends and differences/similarities between sites.
Discussion
• Local and remote source apportionment of both sites through time is adequately performed for pollutants with reference to results, and government and primary literature.
• Temporal trends are discussed and backed by government and peer reviewed literature to support/explain findings.
• Differences/similarities in pollutant concentration and sources between sites are insightful and backed by government and peer reviewed literature to support/explain findings.
• Risk hazard of study sites is adequately assessed using relevant government guidelines, and its implications for human health are discussed in context of peer reviewed literature.
• Conclusions are drawn in reference to the hypotheses presented in the Introduction.
References
• References cited correctly using APA Referencing style and formatted consistently in reference list
• References used are of high quality, from relevant engineering/scientific journals
• Every relevant statement or use of evidence is referenced throughout report, with minimum 20 references to peer reviewed literature

15%: 3. Communication skills
• The report is insightful, logical and clear, with use of disciplinary language that is engaging and comprehensive
• Writing is above all clear and concise
• Formatting guidelines are adhered to, with minimal spelling/grammatical errors
REMINDER:

DO NOT INCLUDE THE UTS TECH LAB RESULTS OR ANALYSIS IN YOUR REPORT !!!

BATHURST AND ORANGE SITES ONLY !!!
