
Non-Stochastic Volatility Vs. Stochastic Volatility Models

Student's Name

Institutional Affiliation

Course Number and Name

Professor's Name

Assignment Due Date



Non-Stochastic Volatility Vs. Stochastic Volatility Models

Introduction

A time-varying parameter model can be classified into one of two categories: parameter-driven and observation-driven specifications (Koopman et al., 2016). In an observation-driven model, the current parameters are deterministic functions of lagged dependent and lagged exogenous variables; the parameters evolve randomly but can be predicted one step ahead from past data. The prediction error decomposition therefore yields a closed-form conditional likelihood for observation-driven models (Koopman et al., 2016). This property has made the framework popular in applied econometrics and analytics, since it permits straightforward estimation procedures.

In parameter-driven models, the parameters are dynamic processes driven by their own idiosyncratic innovations, so they change over time. For these models the likelihood function has no closed-form analytical expression, and evaluating it generally requires efficient simulation methods. Examples of parameter-driven specifications include stochastic volatility models, stochastic conditional duration models, and stochastic copula models (Koopman et al., 2016). In light of the substantial effort that has been invested in studying and implementing both classes of models, it is important to examine the relative advantages of parameter-driven and observation-driven frameworks (Koopman et al., 2016). Any time series model's usefulness ultimately hinges on robust out-of-sample performance.

The generalized autoregressive score (GAS) model is an observation-driven approach that rivals nonlinear non-Gaussian state-space models in generality. The GAS approach updates the time-varying parameters using a scaled score vector of the conditional observation density (Blasques et al., 2018), so the GAS model can be used with any observation density. The GARCH model belongs to the GAS class, but new models, such as mixed-measurement dynamic factor models, can also be developed within the GAS framework. As a natural alternative to the state-space framework, the GAS framework can be applied to a wide variety of data-generating processes (Blasques et al., 2018). Regardless of how much data is available, observation-driven and parameter-driven models cannot be compared directly: in parameter-driven models the predictive density is a mixture of the observation distribution over the random time-varying parameter, whereas in observation-driven models the forecast density is simply the observation distribution given a perfectly predictable parameter (Blasques et al., 2018). Overdispersion, fatter tails, and other properties of parameter-driven models may give them a direct advantage over observation-driven models.

Research Motivation

Since the financial crisis of 2008, credit risk analysis has become increasingly important, and financial institutions and regulators focus on determining the common variation in corporate failures (Blasques et al., 2018). This study developed a novel modeling approach for mixed-measurement time series data. The framework accommodates a wide range of distributions, the possibility of missing data, and cross-sectional dependence arising from shared exposure to dynamic common components. The primary goal is to build a versatile framework for estimating, assessing, and projecting this risk. A significant advantage of the framework is that the likelihood is available in closed form and does not need to be evaluated through simulation (Blasques et al., 2018). As a result, parameter estimation can be performed using simple methods.

Stochastic volatility models can also be used to price options. Volatility forecasts with large standard errors need not produce equally imprecise option prices: because option prices depend on the average volatility over the life of the contract, this averaging is expected to lower the standard errors in option prices. Over a long horizon, volatility converges to the process's unconditional variance, which is known without error. If the volatility process's persistence parameter is close to unity, however, the smoothing estimator's errors will be highly autocorrelated and its convergence to the unconditional volatility will be slow. In that case, large standard errors will be present in both the option prices and the deltas.

Stochastic volatility (SVOL) models are a natural alternative to GARCH-type models of time-varying volatility. In SVOL models there is a separate error process for the conditional variance and the conditional mean. The basis of the SVOL model is an autoregressive lognormal variance equation with innovations independent of those in the mean equation. The SVOL family of models has proven more extensible than the GARCH family: an SVOL model can be extended naturally to allow fat-tailed conditional mean innovations or a "leverage" effect. Gallant and coworkers have reported findings in favor of fat tails. When the mean and variance errors are negatively correlated, volatility increases (decreases) in response to negative (positive) shocks to the mean, an effect that Black investigated; EGARCH and Glosten et al.'s (1991) modified GARCH are two models designed to capture it. According to Dumas et al. (1998), index option prices are negatively correlated with the underlying index return.

Academic Contribution

Using volatility models to analyze financial time series data has become widespread in recent years, and a large literature has developed. The ARCH model is a critical tool for studying changes in variance over time: Engle (1982) uses the ARCH process to explain time-varying conditional variance in terms of past disturbances. Early empirical evidence suggested that a high ARCH order is required to capture the persistence of conditional variance, and Bollerslev's (1986) GARCH model provides an answer to this problem. Comprehensive assessments of ARCH/GARCH models may be found in Bollerslev, Chou, and Kroner (1992), and Pantula (1986) describes maximum likelihood inference strategies for ARCH models under the normality assumption (Asai & McAleer, 2009).

Related applications are discussed in Mark (1988), Bodurtha and Mark (1991), and Simon (1989). In addition, Geweke (1988) established Bayesian inference procedures for ARCH models, using Monte Carlo approaches to compute exact posterior distributions. Robinson (1987) uses a nonparametric method, while Gallant and Nychka (1987) and Gallant, Rossi, and Tauchen (1990) use a semiparametric approach. Recent studies have found that a good vector autoregression (VAR) for forecasting and structural modeling of macroeconomic data requires a broad spectrum of macroeconomic variables as well as time variation in their volatilities. According to Banbura, Giannone, and Reichlin (2010), larger systems are better for prediction and structural analysis; volatility changes over time, and Primiceri (2005) emphasizes the importance of this variation. Bayesian estimation relies on the posterior density, which combines the likelihood and the prior and thereby accommodates both of these components (Asai & McAleer, 2009). The many parameters of a large model necessitate Bayesian shrinkage, and Bayesian computation is well suited to stochastic volatility.

Literature Review

Observation-Driven Models

An observation-driven model treats the current parameters as predictable functions of lagged dependent variables and exogenous variables (Creal et al., 2013). In this setting, parameters change randomly yet can be predicted exactly one step ahead based on prior information.

i. GARCH

According to Creal et al. (2013), heteroskedasticity means that the error variances are expected to be larger for some points or ranges of the data than for others, so the variances of the disturbances are not equal. Although the coefficient estimates of an ordinary least squares regression remain unbiased in the presence of heteroskedasticity, the standard errors and confidence intervals produced by conventional techniques will be too narrow, giving an illusion of precision. GARCH models treat heteroskedasticity as a variance to be modeled rather than a problem to be corrected. In addition to correcting the deficiencies of least squares, a prediction is produced for the variance of each error term, which is frequently of interest in the financial industry. The general GARCH formula is given by:
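A standard GARCH(1,1) specification of the conditional variance, which the surrounding discussion appears to assume, is

\[ r_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + \alpha\, r_{t-1}^2 + \beta\, \sigma_{t-1}^2, \]

where z_t is an i.i.d. innovation with zero mean and unit variance, ω > 0, and α, β ≥ 0.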



The return and GARCH equations define a class of GARCH-X models, where the X refers to the exogenous variable x_t. Chen et al. (2009) developed the HYBRID GARCH framework, which encompasses GARCH-X model variants and other related models. Because x_t can be used to measure h_t, we call the equation linking them the measurement equation; the most basic measurement equation is x_t = h_t + u_t. The measurement equation is a crucial part of the model and completes it. Since r_t and x_t are dependent on each other, the measurement equation gives a simple way to model this relationship, and the presence of z_t in the measurement equation captures this interdependence, which our empirical research shows to be quite significant. The Realized GARCH approach nests most, if not all, of the ARCH and GARCH model variants: in such models the measurement equation reduces to an identity, and nesting is achieved by setting x_t = r_t or x_t = r_t^2.

The multivariate Realized GARCH model, which includes return equations, GARCH equations, and measurement equations, can now be introduced using the appropriate notation. The return equation for the i-th asset at time t is:
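A return equation consistent with this setup (a sketch, since the exact specification is not reproduced here) is

\[ r_{i,t} = \mu_i + \sqrt{h_{i,t}}\, z_{i,t}, \qquad z_{i,t} \sim \text{i.i.d.}(0, 1), \]

where h_{i,t} denotes the conditional variance of the i-th asset and z_{i,t} its standardized return shock.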



For example, the original observation equation can be substituted for the simplified one under the structural model. In most cases, a useful model for the returns matters more than a model for the realized measurements, because the latter are secondary considerations. Using the transformed variance and correlation measurement equations, we can incorporate the realized measurements into the models more easily. As long as a factor structure is in place, the lower-dimensional measurement equation can be used for this purpose, making it possible to gain numerical advantages by employing a lower-dimensional system of equations.

It is important to remember that the first measurement equation should be used in some cases. To compare the overall log-likelihood of several model specifications, all models must use the same measurement equations. Different factor structures can be combined with distinct measurement equations, but comparing the total likelihoods of models with distinct measurement equations amounts to comparing models of different variables (and dimensions), which would be like comparing apples and oranges. If the total likelihoods of different models are to be compared, a common set of measurement equations, such as the first equation, is therefore required. Alternatively, a partial log-likelihood for the returns can be used to evaluate the models, instead of considering the entire likelihood and then omitting the part concerning the realized measurements. Since multivariate GARCH models are designed to model returns, this may be the most suitable basis for comparison. The following section goes through the log-likelihood terms pertinent to these comparisons.

Because all dynamic variables are specified in an observation-driven manner, it is simple to forecast return distributions from the model: all conditional variances and correlations for period t+1 can be computed from the GARCH equations. The components of H_{t+h} are not determined beyond horizon h = 1, because the future values of z_t and u_t cannot be predicted; however, simulated or bootstrapped estimates of the distribution of H_{t+h} are straightforward to compute, so multi-step predictions can be obtained from this model at any forecasting horizon. Such forecasting methodologies for Realized GARCH models are discussed in depth by Lunde and Olesen. The bootstrap method has the advantage of not relying on distributional assumptions about the data.
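A minimal sketch of such a simulation-based multi-step forecast, assuming a univariate GARCH(1,1) with Gaussian innovations (the multivariate case follows the same logic):

import numpy as np

def simulate_garch_forecast(h_t, omega, alpha, beta, horizon, n_paths=10_000, seed=0):
    """Simulate the distribution of future conditional variances for a GARCH(1,1):
    r_t = sqrt(h_t) * z_t,  h_{t+1} = omega + alpha * r_t**2 + beta * h_t."""
    rng = np.random.default_rng(seed)
    h = np.full(n_paths, h_t)                  # current conditional variance
    paths = np.empty((horizon, n_paths))
    for step in range(horizon):
        z = rng.standard_normal(n_paths)       # innovation draws
        r = np.sqrt(h) * z                     # simulated returns
        h = omega + alpha * r**2 + beta * h    # one-step GARCH recursion
        paths[step] = h
    return paths                               # rows: horizons, columns: simulated paths

# Distribution of the 10-step-ahead conditional variance
sims = simulate_garch_forecast(h_t=0.02, omega=0.00002, alpha=0.05, beta=0.90, horizon=10)
print(sims[-1].mean(), np.percentile(sims[-1], [5, 95]))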

Multivariate GARCH models describe the conditional distribution of the return vector, and this objective is reflected in the log-likelihood function for returns used in estimation. There are many ways to evaluate the different specifications, but comparing their return log-likelihoods, which measure their ability to predict the distribution of the return vector, is an excellent starting point. Since the parameter vector is estimated from in-sample data, specifications should be evaluated and compared using the average value of the return log-likelihood term, both in and out of sample. Using the mean predictive log-likelihood as a gain function in this type of model evaluation is similar to one-day-ahead density forecasting of the return vector.

ii. GAS

Huang et al. (2014) asserted that observation-driven models such as GAS have advantages over other models: the likelihood is simple to evaluate, and asymmetry, long memory, and other more intricate dynamics can be introduced without additional complexity. Because it is based on the score, the GAS model exploits the full density structure rather than means and higher-order moments alone. According to Benjamin et al. (2003) and Cipollini et al. (2012), this distinguishes it from other observation-driven modeling approaches, such as generalized autoregressive moving average models and vector multiplicative error models. The general formulation of the GAS model is represented below:
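A sketch of the model structure consistent with the description that follows (the exact equations in the source are not reproduced here) is

\[ r_t = \sqrt{f_t}\, z_t, \qquad (2.1) \]
\[ x_t = \xi + \varphi f_t + d_1\!\left(z_t^2 - 1\right) + d_2 z_t + u_t, \qquad (2.2) \]

where f_t is the conditional variance, x_t the realized variance, and z_t and u_t the return and measurement shocks.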

Equation (2.2) is the measurement equation because it connects the realized variance to the conditional variance. The shocks z_t and u_t are assumed to follow a bivariate Gaussian distribution, and the leverage function, d_1(z_t^2 - 1) + d_2 z_t, introduces dependence between the return shock and the volatility shock. The observations y_{1t} and y_{2t} are jointly distributed with density p(y_{1t}, y_{2t}). The innovations (z_t, u_t) embody the new information at time t, and the volatility changes as a result of these innovations and its previous value. The observation density, following the GAS definition, determines the precise functional form: the GAS model is an observation-driven model for latent dynamic components in which a scaled score drives the time-varying parameter θ_t = f_t.
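In the general GAS(p, q) formulation of Creal et al. (2013), the parameter is updated as

\[ f_{t+1} = \omega + \sum_{i=1}^{p} A_i\, s_{t-i+1} + \sum_{j=1}^{q} B_j\, f_{t-j+1}, \qquad s_t = S_t \nabla_t, \qquad \nabla_t = \frac{\partial \log p(y_t \mid f_t; \theta)}{\partial f_t}, \]

where S_t is a scaling matrix, typically based on the inverse of the Fisher information.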



A major advantage of the GAS(p, q) specification is that it can be used for a wide variety of models and parameterizations, since the recursion applies to essentially any model with a parametric likelihood. In the GAS model the score is scaled by the inverse of the information matrix. A simpler scaling option is the unit matrix, S_t = I, in which case unscaled gradients are used and the update resembles a steepest-ascent optimization step. However, our experience has shown that this updating mechanism is generally less reliable than the alternatives. For these reasons, we believe it is best to scale the score by the inverse information matrix, S_t = I_{t|t-1}^{-1}, rather than to leave it unscaled. A potential problem is carrying out the inversion when the information matrix is not of full rank, or is numerically unstable, for specific models.

The GAS model has the advantage of using all of the likelihood information: the time-varying parameter takes a scaled (local density) score step that reduces the one-step-ahead estimation error at the current observation. Despite being built on a fundamentally different paradigm, the GAS model offers a powerful and highly competitive alternative to conventional observation-driven models and parameter-driven models, as extensive nontrivial empirical and computational examples have demonstrated. Intriguing extensions and alternative specifications arise for state-space structures with stochastically time-varying parameters, multivariate marked point processes, and time-varying copula structures.

This versatility and relevance for a wide range of models make it challenging to develop a standard set of stationarity and regularity conditions that applies to all relevant scenarios. A more promising approach may be to derive such conditions for specific subsets of GAS specifications. A second research direction is to examine the finite-sample aspects of GAS models in greater detail. The statistical features of parameter estimates for GAS models would benefit from a more comprehensive investigation than the few empirical and computational examples supplied so far. This is especially true when the information matrix is unambiguously non-singular for all sample data, in which case the likelihood maximization converges quickly and reliably. When the observations carry little or no information about a particular parameter, it becomes more important to introduce information smoothing and to find suitable starting values; this is particularly pertinent at points where the information matrix degenerates, and in our opinion data smoothing is essential in these situations. Automated smoothing, in which the smoothing parameter is computed directly from the data, has also improved the likelihood value in numerous circumstances.

iii. Realized variance

According to Barndorff‐Nielsen et al. (2009), the realized kernel estimators make it possible to estimate the quadratic variation of an efficient price process from high-frequency noisy data. Together with alternative methodologies such as subsampling and pre-averaging, this extends the influential realized variance literature, allowing us to understand time-varying volatility better and to anticipate future volatility better. Realized variance (RV) is a popular empirical statistical measure, and RV provides a perfect approximation of volatility in the idealized setting in which prices are observed continuously and without measurement error.

Since RV is a sum-of-squared-returns calculation, this finding suggests that the data be sampled as frequently as possible when calculating RV. However, market microstructure noise then leads to a bias issue: Awartani, Corradi, and Distaso have recently documented the presence of noise in the volatility signature plots introduced by Andersen, Bollerslev, Diebold, and Labys (2000) (Barndorff‐Nielsen et al., 2009). As a result, returns are often sampled at a modest frequency, such as every five minutes, because of the trade-off between bias and variance. Filtering techniques, which earlier studies have employed to correct the bias, are an alternative method of dealing with the problem.

Parameter-Driven Models

Parameter-driven models come in many distinct variants because the parameters themselves change over time. Closed-form analytical expressions for the likelihood function are not available for these models, so effective simulation methodologies are generally required to evaluate the likelihood (Hansen et al., 2010). Time-varying parameters can be specified as stochastic processes for any conditional observation density, which makes parameter-driven models applicable in a wide variety of situations. However, in the absence of a flexible unifying framework of the kind available for observation-driven models, a new function for updating the time-varying parameter must be designed for each new observation density and parameterization, and it is often not obvious what the right function is in a given situation, such as volatility modeling.

i. Stochastic Volatility Models



According to Hansen et al. (2010), simplicity, flexibility, and resilience characterize the multivariate factor stochastic volatility (SV) model (Abbara & Zevallos, 2019). In this model, a potentially large observation space is reduced to a smaller space of orthogonal factors, much as in a factor model, which makes it simple. These factors are allowed to display volatility clustering and to follow stochastic volatility processes themselves, allowing the extent of volatility co-movement to be time-varying; this makes the model both flexible and resilient.
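A canonical univariate stochastic volatility specification of the kind these models build on (a sketch, not the exact multivariate factor formulation discussed above) is

\[ y_t = \exp(h_t/2)\, \varepsilon_t, \qquad h_{t+1} = \mu + \phi\,(h_t - \mu) + \sigma_\eta\, \eta_t, \]

with ε_t and η_t independent standard normal innovations and |φ| < 1 governing the persistence of log-volatility.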

In stochastic volatility models, volatility tends to revert to its long-term mean value. Stochastic volatility models thereby address the shortcoming of derivative-pricing models that assume constant volatility over a given time frame (Hansen et al., 2010), and they are used to value and manage the risk relating to derivative contracts. By contrast, in an ARCH model the variance of the error term is a function of past squared errors, while in a generalized autoregressive (GARCH) model the error variance is a function of past squared errors and the previous period's estimated variance (Hansen et al., 2010). These specifications capture volatility clustering, the empirical regularity in finance that periods of high volatility tend to be followed by further high volatility and calm periods by further calm.

ii. State-Space Models

The state-space model of a continuous-time dynamic system can be generated either from the time-domain system model supplied by a differential equation or from its transfer-function representation (Hansen & Lunde, 2006). This section deals with scenarios that involve the controller form, the observer form, the modal form, and the Jordan form, four state-space forms commonly employed in current control theory and applications. The general formula is represented below:
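The standard continuous-time state-space form assumed in discussions of this kind is

\[ \dot{x}(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t) + D\,u(t), \]

where x(t) is the state vector, u(t) the input, y(t) the output, and A, B, C, D constant matrices.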

Depending on how the state-space model is constructed, it may or may not incorporate all of the underlying transfer function's modes, i.e., the poles of the underlying transfer function before any zero-pole cancellation takes place (Hansen & Lunde, 2006). The state-space model will have a lower order if some of the transfer function's zeros and poles cancel, and the corresponding modes will not be visible in the transition matrix.

According to Costa and Alpuim (2010), the Kalman filter, developed by Kalman in 1960, has been widely applied to study the evolution of dynamic structures. Using a group of equations called a state-space model, the technique derives estimates of unobservable variables from associated observable variables. These models are defined by the equations used to construct them.
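Consistent with the description that follows, the model can be written as a measurement equation (1) and a transition (state) equation (2):

\[ Y_t = H_t\, b_t + e_t, \qquad (1) \]
\[ b_t = U\, b_{t-1} + \varepsilon_t, \qquad (2) \]

where Y_t is the n×1 vector of observed variables, b_t the m×1 state vector, H_t an n×m coefficient matrix, U the autoregressive coefficient matrix, and e_t and ε_t uncorrelated white-noise disturbances.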



The measurement equation (1) connects the n×1 vector of observed variables, Y_t, to the m×1 vector of unobservable states, b_t; it involves the n×1 white-noise vector e_t, referred to as the measurement error, and the n×m coefficient matrix H_t (Costa & Alpuim, 2010). Equation (2), the transition or state equation, shows how the state vector b_t changes over time through the vector autoregression coefficient matrix U and a disturbance with its own covariance matrix. The two disturbances e_t and ε_t are uncorrelated. One family of models of major relevance arises when the state vector follows a process with a constant mean.

iii. Quasi Maximum Likelihood

According to Wooldridge (2014), two-stage least squares (2SLS) is the most common method for estimating a linear model with one or more endogenous explanatory variables. However, as many authors have shown, the limited-information maximum likelihood estimator, computed under the nominal assumption of jointly normally distributed unobservables, has better small-sample properties, particularly when there are many overidentifying restrictions. The quasi-maximum likelihood estimator can be computed by:
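In generic form (a sketch; the specific objective used in the source is not reproduced here), the quasi-maximum likelihood estimator maximizes a possibly misspecified log-likelihood,

\[ \hat{\theta}_{QML} = \arg\max_{\theta} \sum_{t=1}^{n} \log f(y_t \mid x_t; \theta), \]

where f is the assumed conditional density, which need not coincide with the true one.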

Although dealing with endogenous explanatory variables in nonlinear models is notoriously challenging, solutions for specific cases have been provided. In nonlinear models, the nature of the endogenous explanatory variables (EEVs), whether continuous, discrete, or some mix, is of immediate importance (Durbin & Koopman, 2000). 2SLS can be utilized irrespective of the type of the EEVs, but plugging first-stage fitted values into the second stage frequently yields structural parameters and other quantities of relevance, e.g., average partial effects, that are inconsistent. Most of the time, two strategies are utilized to estimate nonlinear models with EEVs.

To achieve maximum likelihood, a model with unobserved errors requires an explicit specification of the distribution of the EEVs and of the response variable conditional on the EEVs (Durbin & Koopman, 2000). The MLE technique has various downsides, especially when dealing with binary responses. Managing many EEVs can be computationally taxing, and if the distributional assumptions are erroneous, the estimator is typically inconsistent, which is perhaps the most important reason to avoid it.

In the control function approach, residuals from a first-stage estimation involving the EEVs are used in a second-stage estimation problem. Researchers use the control function (CF) technique in many settings, including nonlinear models with cross-section data and with panel data (Wooldridge, 2014), and Wooldridge (2014) has shown how the method can be used in semiparametric and nonparametric circumstances. According to Blundell and Powell (BP), quantities of interest can be identified generically, without distributional or functional-form restrictions. Durbin & Koopman (2000) treat the average structural function and average partial effects as similar notions, with the APE accommodating unobservables that are not assumed independent of the external causes. Inserting first-stage fitted values for the EEVs can generate consistent estimators of the parameters up to a common scale factor in some cases, but the constraints under which this occurs are quite restrictive, and average partial effects cannot be easily retrieved (Wooldridge, 2014). It is also difficult to test the premise that the EEVs are exogenous because of the fitted-value technique.

iv. Indirect Inference

According to Gourieroux et al. (1993), indirect inference exploits the simplicity and convenience with which data can be simulated from even complex structural frameworks. The central notion is to look at the observed values and the simulated findings through the lens of an auxiliary (or instrumental) statistical framework with auxiliary parameters; the structural parameters are then chosen so that, viewed through this lens, the simulated outcomes resemble the observed data, i.e., so that the auxiliary parameter estimates match. To formalize these notions, suppose the actual choices {y_it}, i = 1, . . . , n, t = 1, . . . , T, are generated by the structural discrete choice model specified in (2.1) for a given value β0 of the structural parameter. An auxiliary framework can be estimated on the observed values to obtain parameter estimates θ̂_n. Formally, θ̂_n solves:
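A sketch of the auxiliary estimation problem, in the form typically used in this literature (the exact criterion in the source is not reproduced here), is

\[ \hat{\theta}_n = \arg\max_{\theta} \sum_{i=1}^{n} \sum_{t=1}^{T} \log \tilde{f}(y_{it} \mid x_{it}; \theta), \]

where \tilde{f} denotes the likelihood of the auxiliary model.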



Here the x's are observable exogenous variables, while the u's and ε's are not observed. Suppose, as assumption (A1), that (y_t, x_t) is a stationary Markov process and that (ε_t) is white noise whose distribution G0 is known; we further assume that (x_t) is a homogeneous Markov process. Notably, in the parametric situation it is not necessary to assume that ε_t is white noise with some known distribution such as the standard normal, since parameters describing its distribution can always be included in the structural parameter vector instead.

Because any simulated choice is a step function, step functions appear naturally when indirect inference is applied to discrete choice models. The sample binding function is hence discontinuous, and so are the II estimators' criterion functions. Due to this discreteness, gradient-based optimization methods cannot be used; the options that remain are derivative-free approaches, random search algorithms (such as simulated annealing), or simply abandoning optimization in favor of a Laplace-type estimator (Gourieroux et al., 1993). MCMC, on the other hand, can provide (in finite samples) an approximation that differs significantly from the statistical criterion's optimum, even when it converges slowly. Since non-smooth criterion functions define II estimators, their use with nonlinear data of this form is extremely problematic (Gourieroux et al., 1993). Despite the challenges of applying II to discrete choice models, several authors have persisted in doing so because of the attraction of the II approach.

v. Importance Sampling

According to Yuan and Druzdzel (2012), convergence of a Monte Carlo estimator is accelerated if the samples are drawn from a distribution comparable to the function in the integrand, and importance sampling exploits this fact. The underlying principle is that an accurate approximation is produced more quickly by focusing effort where the integrand's value is relatively high. The scattering equation for this model is computed using the formula below:
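A standard form of the scattering equation used in this rendering setting (a sketch; the exact equation in the source is not reproduced here) is

\[ L_o(p, \omega_o) = \int_{S^2} f(p, \omega_o, \omega_i)\, L_i(p, \omega_i)\, |\cos\theta_i|\, d\omega_i, \]

where f is the BSDF, L_i the incident radiance, and θ_i the angle between the incoming direction ω_i and the surface normal.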

The scattering equation illustrates the point. Suppose a random direction is selected that is nearly perpendicular to the surface normal, so that the cosine term is close to zero. Durbin & Koopman (2000) assert that evaluating the BSDF and then tracing a ray to determine the incident radiance at the sample point would then be largely wasted effort, since the contribution to the estimate would be insignificant. It is better to sample directions in such a manner that directions close to the horizon are chosen less often. In general, efficiency improves when directions are selected from distributions that match other integrand factors (the BSDF or the distribution of incoming illumination), and variance is decreased as long as the sample points are drawn from an integrand-like probability distribution.
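A minimal numerical sketch of this idea, using a generic one-dimensional integrand and an assumed proposal density (not tied to the rendering setting above):

import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Integrand concentrated near x = 0 (illustrative choice)
    return np.exp(-0.5 * x**2) * (1 + np.cos(x))

# Importance distribution p(x): standard normal, which resembles the integrand's shape
n = 100_000
x = rng.standard_normal(n)
p = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

# Importance sampling estimate of the integral of f over the real line:
# the sample mean of f(X)/p(X) with X ~ p approximates the integral
estimate = np.mean(f(x) / p)
print(estimate)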

vi. Local-Level Model

According to Durbin and Koopman (2000), the local level model is the setting in which filtering tools such as the Kalman filter, the regression lemma, the Bayesian treatment, the minimum variance linear unbiased treatment, and smoothed state variances are developed. Consider a time series in which the observations are ordered sequentially from y1 to yn. The additive model is the most fundamental representation of a time series, and the resultant model formula is:
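Consistent with the description of the three components below, the additive decomposition can be written as

\[ y_t = \mu_t + \gamma_t + \varepsilon_t, \qquad t = 1, \ldots, n, \qquad (2.1) \]

where μ_t is the trend, γ_t the seasonal, and ε_t the error or disturbance.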

These three components are called the trend, the seasonal, and the error or disturbance, respectively; μ_t is the slowly changing component. For this model, we assume the observation y_t and the other quantities listed in (2.1) are real-valued. In various applications, especially in economics, the components are multiplied together instead (Durbin & Koopman, 2000). The simplest special case of (2.1), the local level model, takes the condensed form:
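In the notation of Durbin and Koopman (2000), with the level α_t a random walk and the seasonal dropped, this is

\[ y_t = \alpha_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma_\varepsilon^2), \]
\[ \alpha_{t+1} = \alpha_t + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2), \]

with ε_t and η_t mutually independent.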

This model, despite its simplicity, does not represent a contrived special case; rather, it serves as a foundation for the investigation of key real-world challenges in time series analysis. According to standard multivariate analysis results, estimation of conditional means, variances, and covariance matrices is a routine affair based on the properties of the multivariate normal distribution, which apply here (Durbin & Koopman, 2000). However, as the number of observations y_t rises, the usual computations become increasingly time-consuming. This naïve approach to estimation can be greatly improved using the filtering and smoothing techniques discussed in the following sections (Durbin & Koopman, 2000): they yield the same results as multivariate analysis theory while providing fast computational algorithms.
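A minimal sketch of the Kalman filter for this local level model, assuming known variances (the parameter values are illustrative, not taken from the source):

import numpy as np

def local_level_kalman_filter(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e7):
    # Kalman filter for y_t = alpha_t + eps_t, alpha_{t+1} = alpha_t + eta_t
    n = len(y)
    filtered_mean = np.empty(n)
    filtered_var = np.empty(n)
    a_pred, p_pred = a0, p0                 # (nearly) diffuse initialization
    for t in range(n):
        f = p_pred + sigma2_eps             # prediction error variance
        k = p_pred / f                      # Kalman gain
        v = y[t] - a_pred                   # one-step-ahead prediction error
        filtered_mean[t] = a_pred + k * v   # updated (filtered) level
        filtered_var[t] = p_pred * (1 - k)  # updated variance
        a_pred = filtered_mean[t]           # state transition: level is a random walk
        p_pred = filtered_var[t] + sigma2_eta
    return filtered_mean, filtered_var

# Example with simulated data
rng = np.random.default_rng(0)
alpha = np.cumsum(rng.normal(0.0, 0.5, 200))
y = alpha + rng.normal(0.0, 1.0, 200)
level, variance = local_level_kalman_filter(y, sigma2_eps=1.0, sigma2_eta=0.25)
print(level[-5:])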

The Student T-Distribution

According to Zhu and Galbraith (2010), the t distribution is a family of distributions that resembles the normal distribution curve, although it is a little shorter and thicker in the tails. The t distribution is preferred over the normal distribution when working with small samples, and it resembles the normal distribution more closely as the sample size increases; for sample sizes greater than about 20, the distribution is nearly identical to the normal distribution. The associated statistic is computed using the formula presented below:
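The one-sample t score, which matches the hypothesis-testing usage described below (a sketch based on the cited page's standard notation), is

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \]

where x̄ is the sample mean, μ the hypothesized population mean, s the sample standard deviation, and n the sample size.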

Adapted from: https://www.statisticshowto.com/probability-and-statistics/t-distribution/



In hypothesis testing, the t distribution (and its associated t scores) is used to decide whether to accept or reject the null hypothesis. The center of the graph represents the region of acceptance, while the graph's extremities represent the region(s) of rejection; in a two-tailed test, both tails form the rejection region (Zhu & Galbraith, 2010). Either z-scores or t-scores can be used to describe the tail area.

Gaussian Distribution

The Gaussian distribution is defined by the formula presented below.
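In the usual notation, the Gaussian (normal) probability density function is

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad -\infty < x < \infty. \]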

According to Giner and Smyth (2016), the Gaussian distribution is a theoretical symmetric distribution used to compare scores or to make other statistical decisions involving the mean and standard deviation. Its shape implies that the bulk of the scores lies near the center of the distribution and that the frequency of scores diminishes as they deviate from the center. The normal distribution is the most common continuous probability distribution. A probability density function expresses the likelihood of a random variable taking a specific value; plotting the variable x against its likelihood of occurring, y, creates the familiar curve. Normal distributions have a symmetric bell shape and can have any real mean and any positive standard deviation. Put simply, normal distributions describe continuous data, so any value in the relevant range can be represented. As a special instance, the normal distribution can be standardized so that its mean is zero and its standard deviation is one, and it is possible to standardize any normal distribution to fit this normal curve. Its condensed formula is:
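In condensed form, a normally distributed variable and its standardization can be written as

\[ X \sim N(\mu, \sigma^2), \qquad Z = \frac{X - \mu}{\sigma} \sim N(0, 1), \]

which corresponds to the shorthand discussed next.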

The density is an exponential function of the mean (μ), the standard deviation (σ), and the variance (σ²); shorthand for the distribution is N(μ, σ²), and N(0, 1) denotes the standard normal distribution obtained when the parameter values equal zero and one (Giner & Smyth, 2016). The mean and the standard deviation determine the normal distribution's form: the mean fixes the location of the distribution's peak along the x-axis, so it is referred to as the location parameter, whereas the standard deviation determines how spread out the distribution appears and is referred to as the "scale parameter." The bell curve will be broader if the variance is greater.

Data Cleaning

Volatility estimation from high-frequency data requires careful data cleaning, and high-frequency data cleaning has been given considerable attention. Discarding a large amount of data can actually improve volatility estimators, as demonstrated by Barndorff‐Nielsen et al. (2009). The reasoning behind this conclusion may initially appear counterintuitive, yet it is rather simple: an estimator that is to use all available data to the fullest extent would have to place a high weight on reliable observations and a lower weight on inaccurate ones.

The generalized least-squares (GLS) estimator provides a suitable comparison point (Barndorff‐Nielsen et al., 2009). The conventional least squares estimator, by contrast, loses precision when it includes noisy data and can suffer more than it benefits from using low-quality observations. The comparison is apt because the realized kernel and related estimators, which weight all observations equally, can likewise be significantly affected by outliers (Barndorff‐Nielsen et al., 2009). The following steps can be followed for trade data to achieve high-frequency data cleaning; a brief sketch of these filters in code follows the list.

T1. Retain only the correct trades, deleting entries with a Correction Indicator CORR > 0 (Barndorff‐Nielsen et al., 2009).

T2. Remove any entries with an abnormal Sale Condition, i.e., trades whose letter-coded COND field contains a letter other than "E" or "F." The TAQ 3 User's Guide has more information on how sale conditions are handled (Barndorff‐Nielsen et al., 2009).

T3. If there are multiple transactions with the same time stamp, use the median price.

T4. Delete entries with prices above the ask plus the bid-ask spread; the treatment is analogous for prices below the bid minus the bid-ask spread (Barndorff‐Nielsen et al., 2009).
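A minimal pandas sketch of these filters, assuming a trades table and a quotes table with hypothetical column names (time, price, corr, cond, bid, ask); the names and rules are illustrative, not taken from the source:

import pandas as pd

def clean_trades(trades: pd.DataFrame, quotes: pd.DataFrame) -> pd.DataFrame:
    # Apply T1-T4 style cleaning rules to a trades table, using quotes for T4
    df = trades.copy()
    df = df[df["corr"] == 0]                                    # T1: keep uncorrected trades only
    df = df[df["cond"].isin(["", "E", "F"])]                    # T2: drop abnormal sale conditions
    df = df.groupby("time", as_index=False)["price"].median()   # T3: median price per time stamp
    # T4: drop prices outside [bid - spread, ask + spread] using the prevailing quote
    df = pd.merge_asof(df.sort_values("time"), quotes.sort_values("time"), on="time")
    spread = df["ask"] - df["bid"]
    keep = (df["price"] <= df["ask"] + spread) & (df["price"] >= df["bid"] - spread)
    return df[keep]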

Conclusion

This paper has discussed stochastic and non-stochastic volatility models within the observation-driven and parameter-driven classes. An observation-driven model treats the current parameters as predictable functions of lagged dependent and exogenous variables; the parameters change at random yet can be predicted exactly one step ahead based on prior information. In parameter-driven models, time-varying parameters can be specified as stochastic processes for any conditional observation density, which is why such models can be applied in a wide variety of situations.



References

Abbara, O., & Zevallos, M. (2019). A note on stochastic volatility model estimation. Brazilian

Review of Finance, 17(4), 22-32.

Asai, M., & McAleer, M. (2009). The structure of dynamic correlations in multivariate stochastic

volatility models. Journal of Econometrics, 150(2), 182-192.

Barndorff‐Nielsen, O. E., Hansen, P. R., Lunde, A., & Shephard, N. (2009). Realized kernels in

practice: trades and quotes.

Blasques, F., Gorgi, P., Koopman, S. J., & Wintenberger, O. (2018). Feasible invertibility

conditions and maximum likelihood estimation for observation-driven

models. Electronic Journal of Statistics, 12(1), 1019-1052.

Costa, M., & Alpuim, T. (2010). Parameter estimation of state-space models for univariate

observations. Journal of Statistical Planning and Inference, 140(7), 1889-1902.

Creal, D., Koopman, S. J., & Lucas, A. (2013). Generalized autoregressive score models with

applications. Journal of Applied Econometrics, 28(5), 777-795.

Durbin, J., & Koopman, S. J. (2000). Time series analysis of non‐Gaussian observations based on state space models from both classical and Bayesian perspectives. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 62(1), 3-56.



Giner, G., & Smyth, G. K. (2016). statmod: Probability calculations for the inverse Gaussian distribution. arXiv preprint arXiv:1603.06687.

Gourieroux, C., Monfort, A., & Renault, E. (1993). Indirect inference. Journal of Applied Econometrics, 8(S1), S85-S118.

Gustafsson, F. (2010). Particle filter theory and practice with positioning applications. IEEE

Aerospace and Electronic Systems Magazine, 25(7), 53-82.

Hansen, P. R., & Lunde, A. (2006). Realized variance and market microstructure noise. Journal

of Business & Economic Statistics, 24(2), 127-161.

Hansen, P. R., Huang, Z., & Shek, H. H. (2010). Realized GARCH: A complete model of returns

and realized volatility measures, mimeograph, Department of Economics.

Huang, Z., Wang, T., & Zhang, X. (2014). Generalized autoregressive score model with realized

measures of volatility. Available at SSRN 2461831.

Koopman, S. J., Lucas, A., & Scharth, M. (2016). Predicting time-varying parameters with

parameter-driven and observation-driven models. Review of Economics and

Statistics, 98(1), 97-110.

Wooldridge, J. M. (2014). Quasi-maximum likelihood estimation and testing for nonlinear

models with endogenous explanatory variables. Journal of Econometrics, 182(1), 226-

234.

Yuan, C., & Druzdzel, M. J. (2012). An importance sampling algorithm based on evidence pre-

propagation. arXiv preprint arXiv:1212.2507.

Zhu, D., & Galbraith, J. W. (2010). A generalized asymmetric Student-t distribution with

application to financial econometrics. Journal of Econometrics, 157(2), 297-305.
