
Air Pollution Personalization: A Data Fusion and Integration Project


Air pollution has been identified by the World Health Organization (WHO) as the world’s largest single
environmental health risk. In the year 2012 alone around 7 million people died - one in eight of total global
deaths - as a result of air pollution exposure [1]. WHO also found that China, which recently issued the
first national smog red alert, had the most deaths in the world, over 1 million, attributable to air pollution in
2012 [2]. Nowadays, as recent news reports indicate, the situation in India is worse, with pollution levels
exceeding 20 times the maximum indicated by the WHO [81]. In the U.S., 166 million people live in areas
with unhealthy air [3]. A Health Effects Institute study between 2000 and 2005 concluded that for each 10
μgm-3 increase in PM10 in the ambient air there was an approximate 1% increase in hospital admissions
for cardiovascular disease and about a 2% increase in admissions for pneumonia and chronic obstructive
pulmonary disease among those 65 years of age or older in fourteen U.S. cities [4]. Moreover, Pope III et
al. [5] found that each 10 μgm-3 elevation in PM2.5 air pollution was associated with approximately a 4%,
6%, and 8% increased risk of all-cause, cardiopulmonary, and lung cancer mortality, respectively.
In the U.S., the Environmental Protection Agency’s (EPA) AirNow program has been providing hourly
air quality data and daily forecasts to the public since 1998 [6]. The data source for AirNow is the ambient
air quality monitoring data from the EPA’s air monitoring station network. Beyond the raw air quality
data, AirNow reports the EPA Air Quality Index (AQI) (see Table 1), which ensures that air quality data is
presented with human health in mind (EPA, 2016).
Table 1. US EPA’s AQI for PM2.5

The EPA-published AQI and pollutant concentration values do not reflect the actual pollutant intake of
individuals, because personal intake also depends on the person’s activity (sitting, sleeping,
running, walking, biking, etc.) and physiology (age, gender, health condition). This project studies the
personalization of air pollution exposure via automatic recognition of the user’s activity and its intensity;
this recognition will mainly use the smartphone sensors. More specifically, the question of this project is
how accurately we can estimate a user’s breathing rate based on Intelligent Information Integration of data
from the smartphone sensors, current weather conditions, and the physiology of the user.
The objective of the work performed in this project is to determine the heart rate of a user automatically
and seamlessly, i.e. without user intervention. Construct a model that learns the heart rate
as follows. First, determine the activity using the iOS/Android activity recognition. If the activity is
walking, cycling, or running, then determine the corresponding breathing rate based on sensors such as
the GPS, accelerometer, and compass; meteorological features such as wind speed and direction (since the
effort of moving at the same speed against the wind is higher than with it); and physiological features
such as age, gender, and health condition. Labeled data will be collected by having subjects wear a heart rate
monitor. Each labeled example will consist of features (moving speed, moving direction, wind direction, wind
speed, age, gender, health condition), and the label will give the corresponding heart rate.
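The text does not say how moving direction and wind direction should be combined into a feature. One possible approach, sketched here under stated assumptions, is to reduce them to a single headwind component. The function name `headwind_component` is hypothetical, both angles are assumed to be in degrees clockwise from north, and the wind direction is assumed to follow the meteorological convention (the direction the wind blows from).

```python
import math


def headwind_component(heading_deg, wind_from_deg, wind_speed):
    """Return the wind speed component opposing the direction of travel.

    Assumptions (not specified in the project text):
    - heading_deg: direction of travel, degrees clockwise from north
    - wind_from_deg: direction the wind blows FROM, same convention
    - wind_speed: any consistent unit (e.g. m/s)

    Positive result = headwind, negative result = tailwind.
    """
    angle = math.radians(wind_from_deg - heading_deg)
    return wind_speed * math.cos(angle)


# Travelling north into a wind that blows from the north: full headwind.
print(headwind_component(0.0, 0.0, 5.0))
# Travelling north with a wind that blows from the south: full tailwind.
print(headwind_component(0.0, 180.0, 5.0))
```

A signed scalar like this keeps the feature vector small; a learned model could equally be given the raw directions and speeds instead.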
The heart rate should be plugged into the following formula to display the current pollution intake.
$$PAQI_p = \frac{Hb_{a,i}}{Hb_i} \cdot \frac{AQI_{Hi} - AQI_{Lo}}{BP_{Hi} - BP_{Lo}} \left( \bar{C}_{p,i} - BP_{Lo} \right) + AQI_{Lo} \tag{1}$$
where $PAQI_p$ is the personal AQI value for pollutant $p$; $C_p$ (written $\bar{C}_{p,i}$ in Eq. (1)) is the
concentration value of pollutant $p$ (web links to the air monitoring stations in and around Chicago and the
actual monitoring data are provided below); $BP_{Hi}$ is the upper bound of the concentration interval in which
$C_p$ falls (as given by Table 2 for PM2.5), and $BP_{Lo}$ is the lower bound of the same interval; $AQI_{Hi}$ is
the AQI value corresponding to $BP_{Hi}$, and $AQI_{Lo}$ is the AQI value corresponding to $BP_{Lo}$; $Hb_{a,i}$
is the heart rate of individual $i$ engaged in activity $a$; and $Hb_i$ is the normal (reference) heart rate of
individual $i$. The crux of the work performed in this project is to machine-learn $Hb_{a,i}$ for a user who is
not wearing a heart rate monitor, based on labeled data obtained from heart rate monitor straps (see e.g. [73]).
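Essentially, Eq. (1) represents the linear interpolation of the AQI between $AQI_{Lo}$ and $AQI_{Hi}$. For
example, if the PM2.5 concentration $C_p$ is 9 μg/m3, then the corresponding standard AQI value $AQI_p$ (that
is, with a heart rate ratio of 1) is 37.5: 9 μg/m3 lies in the 0.0 to 12.0 μg/m3 interval, which maps to AQI
values 0 to 50 in the EPA 2016 breakpoints, so (50/12) × 9 = 37.5.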
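The interpolation in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the project's reference implementation. The function name `personal_aqi` and its parameters are hypothetical. The breakpoint list is an assumption, because Table 2 is not reproduced in this text: the values are the PM2.5 breakpoints as given in the EPA 2016 Technical Assistance Document and should be checked against Table 2. The sketch also assumes that the heart rate ratio multiplies only the interpolation term, which is how Eq. (1) reads as laid out.

```python
import math

# (BP_Lo, BP_Hi, AQI_Lo, AQI_Hi) for PM2.5 in ug/m3.
# Assumed from the EPA 2016 Technical Assistance Document; verify against Table 2.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def personal_aqi(conc, hb_activity, hb_reference, breakpoints=PM25_BREAKPOINTS):
    """Evaluate Eq. (1) for one pollutant concentration.

    conc          -- pollutant concentration (C_p), ug/m3
    hb_activity   -- heart rate during the activity (Hb_{a,i})
    hb_reference  -- normal (reference) heart rate (Hb_i)
    """
    # Truncate to one decimal so values never fall between two intervals
    # (the small epsilon guards against floating-point noise).
    c = math.floor(conc * 10 + 1e-9) / 10
    for bp_lo, bp_hi, aqi_lo, aqi_hi in breakpoints:
        if bp_lo <= c <= bp_hi:
            slope = (aqi_hi - aqi_lo) / (bp_hi - bp_lo)
            ratio = hb_activity / hb_reference
            return ratio * slope * (c - bp_lo) + aqi_lo
    raise ValueError(f"concentration {conc} is outside the breakpoint table")


# With a heart rate ratio of 1 this reduces to the standard AQI;
# the document's example of 9 ug/m3 gives 37.5.
print(round(personal_aqi(9.0, 70, 70), 1))  # 37.5
```

Passing the same value for both heart rates recovers the standard AQI, which makes the document's 37.5 example a convenient sanity check.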
The project demo will produce a continuous display of $PAQI_p$ on the smartphone of a user who
is not wearing a heart rate monitor. The remaining values of Eq. (1) should be obtained from the
closest EPA air monitoring station (web links to the air monitoring stations in and around
Chicago and the actual monitoring data are provided below), and from other publicly available websites
(links to the weather data and the City of Chicago Data Portal are provided below).
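Selecting the closest monitoring station can be done with a great-circle distance. The sketch below uses the haversine formula as one reasonable choice. The function names, the dictionary layout, and the two station entries are hypothetical placeholders, not real EPA station identifiers or coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_station(user_lat, user_lon, stations):
    """Return the station dict nearest to the user's position."""
    return min(
        stations,
        key=lambda s: haversine_km(user_lat, user_lon, s["lat"], s["lon"]),
    )


# Placeholder stations for illustration only; real coordinates would come
# from the EPA monitoring station data linked in this document.
stations = [
    {"id": "station_A", "lat": 41.90, "lon": -87.60},
    {"id": "station_B", "lat": 41.70, "lon": -87.90},
]
print(closest_station(41.88, -87.63, stations)["id"])
```

For a city-scale area a simpler planar approximation would also work; the haversine form is used here only because it needs no projection step.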
Accuracy of the heart rate estimation should be evaluated and reported.
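The text does not prescribe an accuracy metric. Two common choices for a regression target such as heart rate are the mean absolute error (MAE) and the root mean squared error (RMSE), sketched below. The function names and the sample numbers are illustrative assumptions, not project results.

```python
import math


def mae(measured, predicted):
    """Mean absolute error between measured and predicted heart rates."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted heart rates."""
    return math.sqrt(
        sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)
    )


# Illustrative values only: monitor readings vs. model estimates, in bpm.
measured = [100, 120, 140]
predicted = [110, 110, 140]
print(mae(measured, predicted), rmse(measured, predicted))
```

Reporting both is often useful: MAE is in the same units as heart rate and easy to interpret, while RMSE penalizes occasional large misses more heavily.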

Table 2. The breakpoint values (BP) in Eq. (1) (source: EPA, 2016)

EPA, Technical Assistance Document for the Reporting of Daily Air Quality – the Air Quality Index (AQI),
EPA – 454/B-16-002, May 2016.

EPA air monitoring station interactive map: https://epa.maps.arcgis.com/apps/webappviewer/index.html?id=5f239fd3e72f424f98ef3d5def547eb5&extent=-146.2334,13.1913,-46.3896,56.5319

Below is the Chicago boundary, along with the locations of the EPA monitoring stations.
EPA Air Monitoring Station Data: https://www.epa.gov/outdoor-air-quality-data

Chicago hourly weather data: https://www.glerl.noaa.gov/metdata/chi/

Chicago Data Portal: https://data.cityofchicago.org/
