
SUMMER UNDERGRADUATE RESEARCH AWARD

Vehicle Emissions Prediction using Machine Learning Based on Data from Remote Sensing

Australian Data Visualized Week 4

Tanya Goyal - 2020CH10140
Hriday Goel - 2020CH10091

Facilitator: Dr. Divesh Bhatia
Emissions vs Year of Manufacture (YOM) Plots
We have grouped the data by unique year of manufacture, taking the mean of emissions
made by vehicles produced in the same year.
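The grouping step described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not our actual notebook code: the sample records are made up, and the helper name mean_by_key is hypothetical. Only the column names yom, hc_ppm and no_ppm come from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample records for illustration; not values from the real dataset.
records = [
    {"yom": 2005, "hc_ppm": 100.0, "no_ppm": 100.0},
    {"yom": 1985, "hc_ppm": 400.0, "no_ppm": 900.0},
    {"yom": 1985, "hc_ppm": 600.0, "no_ppm": 1100.0},
    {"yom": 2005, "hc_ppm": 200.0, "no_ppm": 200.0},
]


def mean_by_key(rows, key, columns):
    """Group rows by `key` and return {key_value: {column: mean}} sorted by key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {
        value: {col: mean(r[col] for r in grp) for col in columns}
        for value, grp in sorted(groups.items())
    }


# One averaged point per year of manufacture, ready for plotting against yom.
yearly = mean_by_key(records, "yom", ["hc_ppm", "no_ppm"])
for year, averages in yearly.items():
    print(year, averages)
```

Each key of yearly is one point on the x-axis of the YOM plots, and the averaged columns give the y-values.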

We observed that the two graphs are essentially mirror images of each other, meaning that the
more CO2 is formed, the less CO is formed. As far as we know, CO emissions are much more
harmful, so these graphs are consistent with older vehicles being more polluting on average.
These are plots of hc_ppm, hex_ppm and no_ppm against yom.
Here we observed high similarity between the plots: in each of
them, emissions fall as the age of the vehicle reduces, with a
steep incline near the 1980s.
Emissions vs Speed plots
We have grouped the data by the speed of the vehicle as measured during remote
sensing and created intervals of speed (km/hr). We have then plotted the
average of each interval to make observations.
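The binning described above can be sketched as follows. This is only an illustration: the interval width of 10 km/hr is an assumption (the width we actually used is in the speed data table), the readings are made up, and mean_by_speed_bin is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

BIN_WIDTH_KMH = 10  # assumed interval width, for illustration only

# Made-up readings: (speed in km/hr, hc_ppm)
readings = [(12.0, 300.0), (18.0, 500.0), (25.0, 200.0), (47.5, 350.0), (41.0, 450.0)]


def mean_by_speed_bin(rows, width):
    """Assign each reading to a [lower, lower + width) interval and average per interval."""
    bins = defaultdict(list)
    for speed, value in rows:
        lower = int(speed // width) * width  # lower edge of the interval
        bins[lower].append(value)
    return {(lo, lo + width): mean(vals) for lo, vals in sorted(bins.items())}


binned = mean_by_speed_bin(readings, BIN_WIDTH_KMH)
for interval, avg in binned.items():
    print(interval, avg)
```

Intervals with no readings simply do not appear in the result, so sparse speed ranges show up as gaps rather than zeros.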

Speed data table


https://docs.google.com/spreadsheets/d/1hxwWQjO1Gf0NwWMFnUkY1djaaHFekMrOYzsZnNCB_wE/edit?usp=sharing

The graphs are attached on the next page


CO2 vs CO plots

Here again we see a rise in CO emissions at higher speeds, and the CO plot is essentially the
mirror image of the CO2 plot.
This suggests that CO2 + CO is roughly constant across speed intervals.
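One quick way to check the CO2 + CO = constant observation is to add the two averaged series interval by interval and look at how much the sum varies. The sketch below uses made-up values, assumes both gases are reported in the same units, and uses a hypothetical tolerance; none of these numbers come from our data.

```python
# Made-up per-interval averages, assumed to be in the same units.
co2 = [14.2, 14.0, 13.6, 13.1]
co = [0.8, 1.1, 1.4, 1.8]

TOLERANCE = 0.5  # hypothetical: largest spread we would still call "constant"

# Sum of the two gases in each speed interval.
totals = [a + b for a, b in zip(co2, co)]

# Spread of the sum across intervals; a small spread supports the observation.
spread = max(totals) - min(totals)
roughly_constant = spread <= TOLERANCE

print("totals:", totals)
print("spread:", spread, "roughly constant:", roughly_constant)
```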
NO, HC, HEX Plots
In the HEX and HC plots it is evident that the emissions
are significantly higher at both low and high speeds,
suggesting that these emissions are higher when the engine is
under higher load.
CPCB Data Request
Please find attached the data that we received from CPCB. We feel the data is not of much
use to our case.

https://vahan.parivahan.gov.in/vahan4dashboard/vahan/dashboardview.xhtml

https://drive.google.com/file/d/1fmUaXUiGSrY7JITt2Awd1Wop9QIE2CoQ/view?usp=sharing

Next Steps
We need a binary output indicating whether a given vehicle is a high emitter or not, and the
current dataset lacks that. So, we are looking for methods to add such a label to our current
dataset.
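One possible way to add such a label, sketched here only as an option to explore and not as a method we have settled on, is to flag the top fraction of readings for a pollutant as high emitters. The 20% cutoff, the sample values and the helper name label_high_emitters are all hypothetical.

```python
def label_high_emitters(values, top_fraction):
    """Return 1 for readings in the top `top_fraction` of values, else 0."""
    n_high = max(1, round(len(values) * top_fraction))
    # The smallest value that still falls in the top group becomes the cutoff.
    cutoff = sorted(values, reverse=True)[n_high - 1]
    return [1 if v >= cutoff else 0 for v in values]


# Made-up hc_ppm readings for ten vehicles.
hc_ppm = [120, 90, 800, 150, 110, 95, 130, 1000, 105, 140]

# Hypothetical choice: treat the top 20% of readings as high emitters.
labels = label_high_emitters(hc_ppm, top_fraction=0.2)
print(labels)
```

A relative cutoff like this always labels some vehicles as high emitters, even in a clean fleet, so a fixed regulatory limit may be a better basis for the label if one is available.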

Also, to find outliers, one method that could be explored is Z-score outlier detection.
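A minimal sketch of the Z-score idea: a reading is flagged when it lies more than a chosen number of standard deviations from the mean. The threshold of 3 is a common convention rather than something we have tuned, and the readings and the helper name zscore_outliers are made up for illustration.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose absolute z-score exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up hc_ppm readings with one extreme value at the end.
hc_ppm = [90, 95, 100, 105, 110, 100, 98, 102, 97, 103, 100, 1000]

outliers = zscore_outliers(hc_ppm)
print([hc_ppm[i] for i in outliers])
```

Note that the extreme values themselves inflate the mean and standard deviation, so on very small samples no reading can reach a z-score of 3.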
