
B-Tech Project

Supervised predictive modelling of space weather events using remote sensing observations

ANIKET SUJAY - 17122006
AMAN KUMAR - 17122005
OVERVIEW
Nitric Oxide (NO)
During a geomagnetic storm, large amounts of energy and particles are deposited
in Earth's atmosphere, altering its structure, composition, and dynamics. The storm
energy is rapidly lost from the atmosphere, and heat balance is restored via infrared
emissions. The NO infrared emission at 5.3 μm is a dominant heat-balance
process in the atmosphere.

The nitric oxide emission at 5.3 μm acts as a natural thermostat in the
thermosphere, by which heat and energy are efficiently lost to space and to the
lower atmosphere. The NO radiative flux is therefore a promising quantity for
understanding thermospheric modulation due to space weather events.
Ozone (O3)
NO and its related species (NOx) play an important role in the formation and destruction of ozone in the atmosphere.

O3 formation:

O2 + sunlight → O + O
NO2 + sunlight → NO + O
O2 + O → O3

O3 destruction:

O3 + sunlight → O2 + O
NO + O3 → NO2 + O2
Objective
To build deep learning models to predict:

● NO infrared radiative flux (NO volume emission rate integrated along altitude from 90 to 250 km).

● Ozone density in the 70 to 110 km altitude range.


Data overview
SABER
● SABER stands for “Sounding of the Atmosphere using Broadband Emission Radiometry”.
● It is one of four instruments on NASA's TIMED (Thermosphere Ionosphere Mesosphere Energetics
Dynamics) satellite. The primary goal of SABER is to provide the data needed to study the fundamental
processes governing the energetics, chemistry, dynamics, and transport in the mesosphere and lower
thermosphere.
● SABER accomplishes this with global measurements of the atmosphere using a 10-channel broadband
limb-scanning infrared radiometer covering the spectral range from 1.27 µm to 17 µm.
● These measurements are used to provide vertical profiles of kinetic temperature, pressure, volume mixing
ratios for the trace species O3, CO2, H2O, [O], and [H], volume emission rates for NO, OH, and O2,
cooling and heating rates for many CO2, O3, and O2 bands, and chemical heating rates for important
reactions.
Data Source
1. SABER website.
FEATURES: Solar Ap, Solar Kp, Solar F10.7 index, solar zenith angle, time, tp_latitude, tp_longitude, tp_altitude,
temperature, ozone mixing ratios, neutral density & NO volume emission rate.

2. WDC-Kyoto website.
FEATURES: DST index, AE index, SYM-H component.
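A minimal sketch of how the two sources might be merged into one training table. The file names and the exact column layout below are hypothetical placeholders, not the actual SABER or WDC-Kyoto export format:

```python
import pandas as pd

# Hypothetical file names; the real SABER and WDC-Kyoto exports differ.
saber = pd.read_csv("saber_events.csv", parse_dates=["time"])      # Ap, Kp, F10.7, angles, profiles...
wdc = pd.read_csv("wdc_kyoto_indices.csv", parse_dates=["time"])   # DST, AE, SYM-H

# Align the two sources on the nearest timestamp so every SABER profile
# carries the geomagnetic indices measured closest to it in time.
data = pd.merge_asof(saber.sort_values("time"),
                     wdc.sort_values("time"),
                     on="time")
print(data.columns.tolist())
```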
Data

[Figure: correlation matrix of the input features]
[Figure: heat map of the correlation matrix]
MACHINE LEARNING
● Any kind of observation always has an underlying probability distribution.
● If the distribution for the system is known, then we can sample a value for a
given condition.
● Generally figuring out this distribution is a non-trivial task.
● One of the ways to achieve this is machine learning.
● Machine learning is divided into 3 major categories:
○ Supervised Machine Learning
○ Unsupervised Machine Learning
○ Reinforcement Learning
● In this project we make use of supervised learning techniques.
Supervised Machine Learning
● Supervised machine learning involves inferring a mapping between input
features and the output based on the set of examples provided.
● General roadmap for supervised learning (a minimal sketch in code follows this list):
○ We start with an initial representation for the data, i.e. we start with some functional form.
○ The parameters of the functional form are then adjusted using the available training data.
○ We feed in the data, obtain a result, compare it with the observed value, and use the
difference between the values to calculate the amount by which the parameters must be
adjusted.
○ We use a loss function which, in some sense, gives us the "distance" of the computed
values from the observed values.
○ The main task in supervised learning is to minimize this loss function over the whole
dataset.
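A minimal sketch of this roadmap in Keras, with random stand-in data; the shapes and hyperparameters are illustrative only, not the project's actual configuration:

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins for the real features/targets (hypothetical shapes).
X = np.random.rand(1000, 8).astype("float32")   # 8 input features
y = np.random.rand(1000, 1).astype("float32")   # 1 regression target

# Step 1: choose a functional form (here, a small fully connected network).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(1),
])

# Steps 2-4: pick a loss (the "distance" from the observations) and let the
# optimizer adjust the parameters to minimize it over the dataset.
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
```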
Learning
● Loss functions:
○ These functions tell us how far off the predicted values are from the
observed values for the present set of parameters.
○ Our aim is to minimize this loss.
○ One of the most popular ways to do this is gradient descent.
○ In this method we move over the loss surface in the direction of steepest
descent, i.e. opposite to the gradient.
○ If the surface is locally convex, this method will lead us to a minimum.
○ Sometimes the loss surface is too complex and we can only reach a local
minimum rather than the global one.
Gradient Descent
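A toy sketch of the method on a simple convex loss, assuming the gradient function is known in closed form:

```python
import numpy as np

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Walk opposite to the gradient, the direction of steepest descent."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Example: minimize the convex loss L(w) = (w - 3)^2, whose gradient is 2(w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=[0.0])
print(w_min)   # approaches 3.0, the global minimum
```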
NEURAL NETWORKS
Typical Neural Network Architecture

[Figure: a fully connected feed-forward network with inputs x1, x2, x3; hidden layers z1, z2, z3 and y1, y2; and output g]
The forward pass

z1 = f(x1·a11 + x2·a21 + x3·a31)
z2 = f(x1·a12 + x2·a22 + x3·a32)
z3 = f(x1·a13 + x2·a23 + x3·a33)

Here f is any sigmoidal function, e.g. sigmoid, tanh, etc.
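A small NumPy sketch of the same computation with toy weight values; the matrix A collects the weights a_ij, so the whole layer reduces to one matrix-vector product followed by the activation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Weights a_ij connecting input x_i to hidden node z_j (toy values).
A = np.array([[0.1, 0.4, 0.7],
              [0.2, 0.5, 0.8],
              [0.3, 0.6, 0.9]])
x = np.array([1.0, 0.5, -0.5])

# z_j = f(sum_i x_i * a_ij): one matrix-vector product plus the activation.
z = sigmoid(x @ A)
print(z)
```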
Calculating the value and loss function

g = y1·c1 + y2·c2

Loss Calculation

The loss function gives a measure of how well the model performs on the dataset.
We have to minimize this function.
For regression tasks we generally use MSE (Mean Squared Error), Mean Squared Logarithmic Error, etc.
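For example, MSE can be computed directly:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average squared distance from the observations."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

print(mse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))  # 0.02
```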
Back Propagation

[Figure: the loss L(g, g′) propagated backwards through the network, from the output g to the weights]
Updating the weights

Each weight is moved against its gradient:

w ← w − η · ∂L/∂w

The partial derivative ∂L/∂w could be further expanded using the chain rule.
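As an illustration, one chain-rule step for the output weights c1, c2 of the small network above, using toy numbers:

```python
import numpy as np

# Toy values for the output layer described above: g = y1*c1 + y2*c2.
y = np.array([0.6, 0.4])     # hidden activations y1, y2
c = np.array([0.5, -0.3])    # output weights c1, c2
g_true = 1.0                 # observed value g'

g = y @ c                    # forward pass: g = y1*c1 + y2*c2
loss = (g - g_true) ** 2     # squared-error loss L(g, g')

# Chain rule: dL/dc_i = dL/dg * dg/dc_i = 2*(g - g') * y_i
grad_c = 2 * (g - g_true) * y

lr = 0.1
c = c - lr * grad_c          # gradient-descent weight update
```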
RESULTS SO FAR
Calculation of NO infrared radiative flux
For altitudes from 90 km to 250 km.

Model Performance

[Figure: mean squared error and mean absolute error after regularization]
[Figure: Y_prediction vs Y_test; red: Y_test, blue: Y_prediction]
Mesospheric Ozone Density
For the 70 to 110 km altitude range.

[Figure: model performance after regularization]
IMPROVEMENTS
BIAS - VARIANCE TRADEOFF
REGULARIZATION
● One of the ways to reduce variance in a model is to use regularization techniques.
● Variance occurs when the model is too complex; due to this complexity the
model gets affected by the noise present in the data.
● In neural networks the complexity of the model is determined by the number of
layers and the number of nodes in those layers.
● To reduce the complexity we introduce a penalty term in the loss function
which causes the weights to shrink.
● This reduction in a weight is proportional to the weight itself, i.e. the
higher the weight of the node, the greater the decay.
● This effectively makes the network sparse and reduces the complexity, thus
reducing the variance.
Normal weight adjustment:

w ← w − η · ∂L/∂w

With the regularization term added (L2 penalty (λ/2) Σ w²):

w ← w − η · (∂L/∂w + λw) = (1 − ηλ) w − η · ∂L/∂w

The decay term −ηλw is proportional to the weight.
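In Keras this penalty can be attached per layer; λ = 1e-3 below is an illustrative value, not the tuned one:

```python
import tensorflow as tf

# An L2 penalty (weight decay) added to each layer's loss contribution.
l2 = tf.keras.regularizers.l2(1e-3)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="tanh", kernel_regularizer=l2),
    tf.keras.layers.Dense(1, kernel_regularizer=l2),
])
model.compile(optimizer="adam", loss="mse")
# Each update now includes the -eta*lambda*w decay term, shrinking large
# weights the most and keeping the effective model simpler.
```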


RECURRENT NEURAL NETWORKS
● As the data we have is time-series data, it could potentially have long-range
order.
● Long-range order generally means that data occurring later in the time series
has some kind of dependence on the data appearing before it.
● As the result obtained from a vanilla ANN depends only on the data at that
instant, it does not capture this long-range order, which can result in
worse performance.
● To take advantage of this dependence we use RNNs (Recurrent Neural
Networks).
● RNNs have a feedback loop in the layer: the hidden state produced at each
time step is fed back in as input at the next time step.
● This cascading effect preserves the long-range dependence in the
data.
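A minimal recurrent model sketch in Keras; the input shape (24 time steps, 8 features) is illustrative only:

```python
import tensorflow as tf

# A minimal recurrent model: the hidden state at each time step is fed
# back in at the next step, carrying long-range information forward.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(24, 8)),   # (timesteps, features), illustrative
    tf.keras.layers.SimpleRNN(16),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```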
Thank You
