11 Data Analysis
You can use discrete event simulations to generate different forms of output, as
described in Chapter 10 Simulation Design on page MC-10-1. These forms
include several types of numerical data, animation, and detailed statistics
provided by the debugger included with OPNET Modeler.
The most commonly used forms of output data for discrete event simulations
are those that are directly supported by Simulation Kernel interfaces for
collection and by existing tools for viewing and analysis. This data falls into two
primary categories:
Results Browser
The Results Browser is used to display information in the form of graphs.
Graphs are presented within rectangular areas called analysis panels. Each
analysis panel can have one or more graphs. A graph is the part of the analysis
panel that can contain statistics. A number of different operations can be used
to create graphs and analysis panels, all of which have as their basic purpose
to display a new set of data or to transform an existing one. An analysis panel
consists of a plotting area with two numbered axes, generally referred to as the
horizontal axis (abscissa) and the vertical axis (ordinate). The plotting area can
contain one or more graphs describing relationships between variables mapped
to the two axes. For example, the graph in the following figure shows how the
size of a queue varies as a function of time.
[Figure: an analysis panel containing a single graph. The graph plots a statistic whose entries are listed below.]

Entry   Abscissa   Ordinate
0       x0         y0
1       x1         y1
2       x2         y2
3       x3         y3
4       x4         y4
5       x5         y5
6       x6         y6
The relationship between the abscissa and ordinate variables is then described
by the correspondence established by each of the entries. For a given entry this
relationship can usually be read “When the abscissa variable takes on the value
x, the ordinate variable takes on the value y”, where x and y are the values
stored in the entry. In the analysis panel, this entry may be represented by a
point located at the intersection of the lines represented by the equations
abscissa = x and ordinate = y, as shown in the following figure.
[Figure: the point plotted for an entry with abscissa = 2.5 and ordinate = 3.5.]
Because each statistic may consist of multiple entries, panels usually contain
many points. The resulting graph describes the relationship between the
abscissa and the ordinate not only in terms of the dependency at each point, but
also by expressing sensitivity of one variable to the other; in other words, graphs
can give an indication of the effect that changing one variable has on the other.
Usually, if one variable is considered to be varied intentionally, or is treated as
a system input parameter, it is called an independent variable and is placed on
the horizontal axis. The second variable is called a dependent variable and is
mapped to the vertical axis.
[Figure: a panel in which the dependent variable “Throughput” is mapped to the vertical axis.]
Several special entries are defined to represent features of statistics that do not
correspond to ordinary numerical data. In some cases, these features are
naturally incorporated into certain statistics based on their definitions; in other
cases they result from transformations of ordinary statistical data by the
operations of the Results Browser.
The default graph style changes from linear to discrete if a graph has ten or
more disconnected lines or points. (This threshold may be lower for smaller
data sets.) You can switch the graph to linear mode in this case, but the line
will appear highly fractured.
• Infinite Value—Like an undefined value entry, this type of entry can arise
either from the definition of a statistic or from numerical manipulations
performed in the Results Browser. For example, the space available in a
queue with unlimited capacity is a statistic that has a permanently infinite
value. A common numerical manipulation that generates infinite value
entries is dividing a nonzero value entry by one with a zero value. Negative
infinite values are differentiated from positive infinite values.
Like an undefined value, an infinite value is omitted from a graph drawn with
the discrete style. The linear style distinguishes an infinite value from an
undefined value by drawing a line straight up or down (to positive or negative
infinity) from the last finite value and then back again to the next finite value.
A positive-infinity-to-negative-infinity transition appears as a straight
top-to-bottom line. The following panel depicts a statistic for the function
f(x) = 1 / x, containing a negative infinite value followed immediately by
a positive infinite value.
Data Sources
Analysis panels can be created by a number of different operations in the
Results Browser. Because all panels must contain at least one statistic, these
operations require a source of data on which to base the new panel. There are
two possible sources of data (only one might be applicable, depending on the
operation).
Vectors can be loaded into the Results Browser to serve as the basis of most of
the available operations. The simplest vector-loading operation allows one
statistic to be viewed in a panel. Numerous filters can also be applied to the
data. For more information, see View Results on page ITG-3-37.
Output vector files also store scalar statistics, which are stored as individual real
numbers. Typically, each scalar statistic accumulates one value per simulation,
although it is possible to accumulate multiple values in one simulation. A scalar
statistic can be thought of as a “summary” of some aspect of the system’s
behavior or performance, as evidenced during one particular simulation run.
Scalar statistics can also represent system input or operating conditions,
obtained either from model attributes or from measurements made during the
simulation. See Chapter 10 Simulation Design on page MC-10-1 for more
information about generating output scalars.
Because scalar statistics do not depend on time, but on other quantities in the
system, they cannot be plotted without choosing another variable with which a
dependency can be expressed. The Results Browser therefore supports plotting
of scalar statistics “against” one another, using the DES Parametric Studies
page. Plotting scalar Y against scalar X shows the possible values of scalar Y
for individual values of scalar X. If there are several values of Y for a given value
of X (e.g., in different simulations using distinct random number seeds), then
several vertically “stacked” data points appear in the graph.
The following figure of a scalar panel in the Results Browser shows this stacking
effect.
Note that the relationship that is shown in a scalar plot is not necessarily due to
an inherent dependency between the output scalars. The plot merely shows
how the two quantities varied simultaneously over a series of experiments. The
causal nature of the relationship between the two variables must be inferred by
the user based on additional knowledge about the actual meaning of these
variables. OPNET Modeler is not able to make such inferences in an automated
fashion.
The Results Browser supports a second approach to visualizing scalar data that
is useful when the relationship between three scalar variables is of interest. The
supporting panel is called a parametric scalar panel and is created on the DES
Parametric Studies tabbed page. In a parametric scalar panel, an abscissa
variable and an ordinate variable play the same role as in an ordinary scalar
panel. However, a third variable, called the “parameter”, is used to separate the sets
of resulting points into distinct subsets. In each subset, the parameter has a
fixed value which is distinct from the parameter’s value in each of the other
subsets. The result is a “family” of curves plotted in the panel, as shown in the
following example.
Statistic Data Option

The most detailed view of a statistic’s data can be
obtained by using the Statistic Data option, which is provided by the Statistic
Information operation. This option displays the explicit contents of the statistics
that the panel includes. The statistics’ lengths and axes labels are given as well
as each entry’s abscissa and ordinate value. This operation applies to the
visible portion of the panel, meaning that if a panel’s axes bounds have been
modified, or if the zoom operation has been used, less than the statistic’s full
content may be displayed. The following panel and editing pad illustrate the
capability provided by the Statistic Data option.
Therefore, if a panel’s full vertical span has been reduced by editing the
vertical or horizontal scales, or by zooming, less than the full content of the
statistic will be taken into account. The following table explains the information
provided by this option:
Value                  Description
expected value         Average value of the ordinate variable treated as a step
                       function (i.e., using a sample-and-hold interpretation of
                       the data), weighting each entry by the abscissa interval
                       until the next entry; corresponds to the calculation
                       performed by the “time-average” filter.
sample mean            Mean value of the entries’ ordinates, computed by
                       weighting all entries equally; corresponds to the
                       calculation performed by the “average” filter.
variance               Variance of the ordinate values; this is the mean value
                       of the squared deviation from the sample mean.
standard deviation     Square root of the variance; represents the typical
                       distance between an ordinate value and the mean
                       ordinate value.
confidence intervals   Intervals estimated to contain the true mean of the
                       entries’ ordinate values, at five separate levels of
                       confidence; calculations are based on principles
                       described in Computing Confidence in Simulation Results
                       on page MC-11-20; these results are meaningful only if
                       the entries are independent measurements.
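The difference between the “time-average” and “average” filters described above can be sketched as follows. This is a minimal illustration with hypothetical entries, not the Results Browser’s implementation; note that a closing abscissa value must be supplied to bound the final sample-and-hold interval.

```python
import math

def sample_mean(entries):
    """Mean of the ordinates, weighting all entries equally ("average" filter)."""
    return sum(y for _, y in entries) / len(entries)

def time_average(entries, end_abscissa):
    """Sample-and-hold mean: each ordinate is weighted by the abscissa
    span until the next entry ("time-average" filter)."""
    total = 0.0
    span = 0.0
    xs = [x for x, _ in entries] + [end_abscissa]
    for (x, y), x_next in zip(entries, xs[1:]):
        total += y * (x_next - x)
        span += x_next - x
    return total / span

def variance(entries):
    """Mean squared deviation of the ordinates from the sample mean."""
    m = sample_mean(entries)
    return sum((y - m) ** 2 for _, y in entries) / len(entries)

entries = [(0.0, 2.0), (1.0, 4.0), (3.0, 0.0)]
print(sample_mean(entries))          # 2.0
print(time_average(entries, 4.0))    # (2*1 + 4*2 + 0*1) / 4 = 2.5
print(math.sqrt(variance(entries)))  # standard deviation
```

The two means differ because the time average weights the ordinate 4.0 by the two time units it was held, while the sample mean counts it only once.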
• The External Model Access (Ema) interface allows an output vector file to be
created or data to be extracted from it. See External Model Access on
page MFA-1-1 for details.
• The Statistic Information operation displays statistic data as text. You can
then use the edit pad operations to export the data to a text file. For more
information, see Project Editor on page ER-3-1.
• The Export Data to Spreadsheet operation converts the data to a text file that
can be opened and converted by a spreadsheet program. For more
information, see Project Editor on page ER-3-1.
Template Statistics
Users of the Results Browser frequently find that they must execute the same
operations to view data from different simulation runs. In other words, after each
simulation, or set of simulations, the same statistics are loaded into the Results
Browser, with only the content of those statistics changing. This leads to the
notion that the specification for the manipulations and presentation of data can
be saved independently from the data itself. The specifications can then be
simply “applied” to data resulting from new simulations, to automatically obtain
processed and displayed information. The Results Browser supports this
capability with a feature called template statistics.
Each graph in an analysis panel can be given a special status called “template”.
A template graph contains no data (it is stripped of its data at the time that it
becomes a template). However, it does contain all of the configuration
information, such as the name of the original vectors or scalars that were used
to create it, and the operations that might have been applied to that data. It also
contains display information such as draw style and color. In other words, only
the graph’s entries (abscissa-ordinate pairs) are missing. The Results Browser
provides several operations that support converting graphs from ordinary form
to template form. See Project Editor on page ER-3-1 for more information.
The utility of a template graph is that it can again become an ordinary graph by
using its configuration information to display new data that is “applied” to it. The
new graph data need only match the graph’s requirements—namely that the
names of the original scalar or vector statistics be the same. Using this feature,
the output statistics from many different simulations can be automatically
processed and displayed in an identical manner without having to go through
the individual steps required to generate each graph. The Results Browser
supports applying data to template panels when the data is loaded from output
files. In other words, when an output vector file is opened, the Results Browser
provides the option to match the data against the template graphs’
specifications and “fill in” the data if possible.
[Figure: four separate output vector (OV) files are applied to the template panel to generate graphs of the same statistic for four separate simulations.]
Note—Output files can be applied to graphs at any time and only modify those
graphs that are selected and that match the data. This allows successive
applications to be performed to the same template panel to progressively fill in
additional data.
Data Presentation
The Results Browser offers a number of options with regard to the graphical
presentation of a panel. These options never affect the data content of the
panel, but only the manner in which the data is displayed. Access to the
presentation options is via the Edit Panel Properties and Edit Graph Properties
operations, which are activated by clicking with the right mouse button while the
cursor is in a panel or graph, respectively.
Graphs
The figures shown previously in this section depict analysis panels that contain
one graph. An analysis panel can hold more than one graph, however, provided
that all of its graphs share the same horizontal axis; only the vertical axes may
differ. Because of this, separate graphs in an analysis panel stack vertically, as
shown in the following figure.
You can create a panel with multiple graphs or add graphs to the analysis panel
later.
Panels
Graphs reside in an analysis panel. Right-clicking in the panel, as opposed to in
a graph, brings up the Edit Panel Properties dialog box, which allows the
appearance of the horizontal axis to be changed or the draw style for all
statistics in the panel to be set globally. The Statistics Info operation provides useful information about
the statistics contained in the panel, including the data points themselves. The
Edit Panel Properties dialog box also allows additional graphs to be added to
the panel.
Drawing Style
Each graph within a panel can be assigned one of five possible graphical
representations, called the graph’s draw style. Each graph’s draw style is
controlled independently of the draw styles of other graphs. The five drawing
styles are discrete, linear, sample-hold, bar, and square-wave.
The discrete draw style provides the most direct view of the actual data content
because one “dot” is used to represent each entry in the statistic (provided that
the ordinate is not undefined). Because no attempt is made to attribute ordinate
values to intermediate abscissa values, as is intrinsically done by the other draw
styles, the discrete drawing style is most appropriate for graphs that represent
a set of independent samples where intermediate values are not well defined.
For example, a typical statistic resulting from measuring end-to-end delay for
each received packet at the time where it is received is plotted below. Though
it may be of interest in some cases to use the linear draw style to emphasize a
trend in the discrete points, estimating the delay value at times between the
packet arrivals does not correspond to a measurement that could actually be
taken.
The linear draw style consists of drawing line segments between the points that
are defined by a statistic’s entries. One of the uses of this style is to represent
intermediate points for which the statistic contains no samples, but which can
be assumed to exist nonetheless. A common example of this is for panels
containing scalar data, where each point represents the result collected by a
simulation; the linear draw style can be used to “fill in” or approximate parts of
the curve that lie between available data points, as shown in the following
example.
Because the resulting graph is without breaks (except at undefined points), the
linear draw style is also sometimes used simply to emphasize the trend in a
statistic, even if the statistic is discrete in nature. An example of this is shown in
the second panel below, which contains the statistic for the size of a queue (i.e.,
the number of packets it contains) as it varies over time.
The sample-hold draw style is based on the notion that between abscissa
values, no new information is known about certain types of statistics, and
therefore these statistics should be assumed to maintain their previous ordinate
value. This interpretation of a statistic’s discrete set of entries makes sense for
many statistics collected in OPNET Modeler-based simulations. Any statistic
that represents a counter of some type, such as a queue size, the number of
packets received without errors, or the number of times a queue has
overflowed, inherently maintains its value until a new sample is obtained.
The bar draw style is essentially a simple extension of the sample-hold draw
style, where the horizontal segment that is drawn at each entry is instead
extended into a filled in bar that reaches down to the horizontal axis. This is the
traditional “bar chart” which is useful for expressing the weight associated with
each recorded abscissa value. This style is therefore often used to represent
histogram data and probability distributions.
The square wave draw style is similar to both the sample-hold draw style and
the bar draw style. It is, in effect, a bar graph that is not filled in. Vertical lines
connect each horizontal segment, but the horizontal segments do not extend to
the abscissa.
When you create a panel annotation, the original analysis panel window
disappears and the annotation displays simulation results as they appeared in
the panel window.
[Figure: the annotation displays simulation results and panel/graph properties as they appeared in the original window. Right-click an annotation to edit attributes, set view properties, or open the original analysis panel.]
After you edit the panel, double-click in the panel background (or right-click and
choose Make Panel Annotation in Network). The window again disappears and
the annotation displays the updated results.
• Fixed size (default)—The annotation is fixed at the same size as the original
panel window and retains its position in the Project Editor window (regardless
of the view’s zoom level or location in the network).
Confidence Intervals
The field of statistics provides methods for calculating confidence in an
estimate, based on a trial or series of random trials. The techniques that it
provides are also frequently used in applied sciences where field
measurements are subject to error, and multiple measurements are taken to
attempt to place a bound on the magnitude of that error. The Results Browser
provides a basic capability in this area by automatically calculating and
displaying confidence intervals for statistics already contained within panels.
This capability is supported by the Show Confidence Interval checkbox in the
Edit Graph Properties dialog box.
The confidence intervals calculated by the Results Browser are for the mean
ordinate value of a set of entries. For the purposes of this operation, entry sets
are defined by collocation at the same abscissa. This approach to calculating
confidence intervals is designed primarily to support confidence estimation for
scalar data collected in multi-seed parametric experiments, where one or more
input parameters are varied, and for each input parameter value, multiple
random number seeds are used to obtain multiple output parameters. The type
of statistics that result from this type of simulation study (prior to confidence
interval calculation) are illustrated by the example below. The vertical “columns”
of entries correspond to the multiple experiments run by varying the random
seed and maintaining a fixed value for an input parameter.
Figure 11-21 Statistic Consisting of Scalar Data from Multiple Simulation Runs
Suppose that a number of simulations of a system have been run with different
random number seeds to obtain N samples of the statistic X. Even though X
may take on many values, and X’s precise distribution is unknown, it is possible
to define a value µ, which is the true mean of the random variable X. One way
to think of µ is as the mean value of an extremely large set of samples of X, if it
were possible to run such a large number of simulations to obtain this sample
set. The reason µ is interesting as the true mean of X is that it represents the
typical behavior of the modeled system with regard to the statistic X.
[Figure: the sampling distribution of X, a bell curve centered at the true mean µ with standard deviation σ ⁄ √n; a randomly obtained sample x of X is marked on the horizontal axis.]
Because the distribution of X is normal, the probability that the random sample
x falls within a particular distance of µ can be computed. Usually this distance
is measured in terms of the number of standard deviations that separate the
random sample from the mean. This way, a “standardized normal variable”
z = (x − µ) ⁄ σx is defined, for which the standard deviation is unity and the
mean is zero, as shown below.

[Figure: the standardized normal distribution, centered at z = 0.]
If the positive value zα is defined such that Prob(−zα < z < zα) = α, then the
following statement can be made by substituting for z (note: most standard
statistics textbooks provide tables mapping α to zα, or equivalent variables).
This statement can simply be thought of as defining the probability that x is
within a particular distance of µ, based on the fact that the distribution of X is
normal.

Prob( |x − µ| ⁄ σx < zα ) = α

Substituting σx = σ ⁄ √N and solving for µ gives the confidence interval:

Prob( x − zα σ ⁄ √N  <  µ  <  x + zα σ ⁄ √N ) = α
From these definitions, it is clear that the confidence interval widens as the
degree of confidence increases; this makes sense, because to achieve a high
level of confidence that the true mean is within a particular interval, one can
expect to make a less restrictive hypothesis about that interval; similarly, if one
is willing to accept a lower degree of confidence, a more constraining
hypothesis can be made about the interval. As an extreme example, one can be
100% confident that the value µ lies between negative and positive infinity. In
practice, a few particular confidence levels are chosen as shown in the following
table.
Confidence Level (α)    zα
99%                     2.575
98%                     2.327
95%                     1.96
90%                     1.645
80%                     1.282
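A confidence interval of this form can be sketched as follows. This is a minimal illustration assuming the population standard deviation σ is known; the sample values and σ are hypothetical, and the zα values come from the table above.

```python
import math

# z values for common two-sided confidence levels (from standard tables)
Z_ALPHA = {0.99: 2.575, 0.98: 2.327, 0.95: 1.96, 0.90: 1.645, 0.80: 1.282}

def confidence_interval(samples, sigma, alpha=0.95):
    """Interval x_bar +/- z_alpha * sigma / sqrt(N) that contains the
    true mean mu with probability alpha, assuming sigma is known."""
    n = len(samples)
    x_bar = sum(samples) / n
    half_width = Z_ALPHA[alpha] * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# hypothetical scalar results from five simulation runs
lo, hi = confidence_interval([4.8, 5.1, 5.3, 4.9, 5.4], sigma=0.25, alpha=0.95)
print(lo, hi)  # interval centered at the sample mean 5.1
```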
Because x is the estimator for µ, the error is simply the absolute value of the
difference between these two values. Then if e is the upper bound on the error
with certainty α , the number of required samples n is given by:
e = zα σ ⁄ √n   ⇒   n = ( zα σ ⁄ e )²
For cases where variance is unknown and the sample size is small, a method
is used that is similar to the one described above, but is based on the
T-distribution rather than the normal distribution. The T-distribution resembles
the normal distribution in its characteristic “bell curve” shape. However, this
distribution is based on the use of the sample variance rather than the
assumed or known variance. It is therefore useful for simulation studies where
fewer than 30 samples are used to estimate µ, which is actually a frequent case.
Some common values of tα are provided in the following table. More extensive
tables are available in standard statistics textbooks.
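The small-sample variant can be sketched the same way, replacing σ with the sample standard deviation and zα with tα. The tα value below (2.262, two-sided 95% for 9 degrees of freedom) is taken from a standard t table; the sample data is hypothetical.

```python
import math

def t_confidence_interval(samples, t_alpha):
    """Small-sample interval for mu using the sample standard deviation s
    and the T-distribution: x_bar +/- t_alpha * s / sqrt(n)."""
    n = len(samples)
    x_bar = sum(samples) / n
    # unbiased sample variance (divide by n - 1, since sigma is unknown)
    s = math.sqrt(sum((x - x_bar) ** 2 for x in samples) / (n - 1))
    half_width = t_alpha * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# 10 hypothetical samples; t_alpha = 2.262 for 9 degrees of freedom at 95%
samples = [9.8, 10.2, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 9.9, 10.0]
lo, hi = t_confidence_interval(samples, 2.262)
print(lo, hi)  # interval centered at the sample mean 10.0
```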
Vector/Statistic Operations
In addition to displaying statistical data, the Results Browser provides a number
of operations that can be used to transform this data to generate new statistics.
Because vectors stored in output vector files have the same data content as
statistics, these operations can also be applied directly to vectors. However, to
simplify discussion, all operations are described in terms of their application to
statistics.
• Histogram (Sample-Distribution)
• Histogram (Time-Distribution)
Each operation is unary (i.e., requires only one statistic as input) and produces
a new single-statistic panel to hold its result when it completes. The
computations done by each of these operations are described in this section.
See Project Editor on page ER-3-1 for instructions on their use.
The actual definition of a probability density function is based on the fact that its
integral over a given interval yields the probability mass associated with that
interval. The probability mass associated with an interval can also be obtained
by computing the difference in the CDF for the upper and lower limits of the
interval. As interval widths become infinitesimally small, it can be seen that the
PDF is therefore the derivative of the CDF with respect to the outcome (i.e.,
ordinate) variable.
The relationship between a PDF and a CDF is in fact the basis for the method
used by the Results Browser to compute PDFs. A CDF is first computed as
described earlier in this section, and a differentiation is performed to construct
a PDF. Because the original statistic data is necessarily discrete, differentiation
is performed in an approximate manner by dividing probability mass associated
with an interval by the interval’s width. In other words, the difference between
two consecutive CDF values is divided by the difference in the corresponding
ordinates. The resulting value is taken as the density associated with the
interval and is placed at the interval’s lower limit. Therefore if a statistic contains
two consecutive ordinate values y1, and y2, the PDF is computed as follows:
PDF(y1) = ( CDF(y2) − CDF(y1) ) ⁄ ( y2 − y1 )
A second consequence of this calculation is that the PDF contains one less
entry than the CDF due to the fact that no forward-looking difference can be
calculated for the final (i.e., maximum) ordinate value.
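The forward-difference step can be sketched as follows; this is a minimal illustration assuming the CDF is given as sorted (ordinate, cumulative probability) pairs, not the browser’s actual code.

```python
def pdf_from_cdf(cdf_entries):
    """Approximate the PDF by forward differences: for consecutive
    ordinates y1 < y2, PDF(y1) = (CDF(y2) - CDF(y1)) / (y2 - y1).
    The result has one less entry than the CDF, since no forward
    difference exists for the final ordinate."""
    pdf = []
    for (y1, c1), (y2, c2) in zip(cdf_entries, cdf_entries[1:]):
        pdf.append((y1, (c2 - c1) / (y2 - y1)))
    return pdf

cdf = [(0.0, 0.25), (1.0, 0.5), (3.0, 1.0)]
print(pdf_from_cdf(cdf))  # [(0.0, 0.25), (1.0, 0.25)]
```

Note that the initial CDF value (0.25 here) does not survive the differencing, which is why integrating the PDF reproduces the CDF only up to a constant.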
Finally, the integral of the PDF statistic, which can be computed using the
correct filter, produces a statistic that is identical to the CDF in its shape.
However, the initial value of the CDF is lost in computing the PDF, meaning that
the two statistics differ by a constant. This difference is particularly noticeable
when the original statistic has a small number of distinct ordinate values,
because the CDF’s value for the minimum ordinate is at least the reciprocal of
this number (i.e., this is the probability mass associated with the first ordinate
value).
Figure 11-25 Results of PDF Operation for Regularly and Irregularly Spaced Ordinates
Two simple properties of the CDF result from the method of computation
described above: (1) because each CDF value is computed by adding a positive
probability mass to the previous value, CDFs are monotonically increasing;
(2) because the sum of all probability masses must add up to unity, all CDFs
must have a final value of 1.0; this also makes sense under the definition of the
CDF, because one would expect the likelihood of obtaining an ordinate value
less than or equal to the maximum ordinate value to simply be 1.0.
The counters used by the PMF operation to compute the frequency of each
ordinate value are normalized with respect to the total number of entries in the
original statistic. In other words, the resulting PMF represents the frequency of
occurrence of a particular ordinate value as a proportion of the number of
occurrences of all ordinate values. Therefore, the measurement provided by a
PMF can be thought of as the likelihood that an entry chosen at random among
all the entries of the original statistic would have a particular ordinate value. For
such a selection experiment, the likelihood of choosing a particular ordinate
value is also sometimes called the probability mass of that outcome, hence the
name of the operation.
The following set of data, and the accompanying statistic, illustrate the
calculation of a PMF.
Abscissa   Ordinate
0.0        3.0
1.0        3.0
1.0        4.0
1.0        4.0
2.0        4.0
2.0        5.0
2.0        5.0
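This counting and normalization can be sketched as follows, applied to the entries above (the function name is illustrative, not the browser’s API):

```python
from collections import Counter
from fractions import Fraction

def pmf(entries):
    """Count each distinct ordinate value and normalize by the total
    number of entries, yielding ordinate -> probability-mass pairs."""
    counts = Counter(y for _, y in entries)
    total = len(entries)
    return {y: Fraction(c, total) for y, c in sorted(counts.items())}

entries = [(0.0, 3.0), (1.0, 3.0), (1.0, 4.0), (1.0, 4.0),
           (2.0, 4.0), (2.0, 5.0), (2.0, 5.0)]
print(pmf(entries))  # masses 2/7 for 3.0, 3/7 for 4.0, 2/7 for 5.0
```

As described above, the abscissa values play no role; only the ordinates are counted.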
The fact that distinct ordinate values are not aggregated on the basis of intervals
makes PMFs appropriate to apply to statistics that contain a relatively small
number of discrete ordinate values. In such cases, sample-distribution
histograms may be less appropriate than PMFs due to one primary problem: if
the discrete values that are present are unevenly spaced, it may be difficult to
choose a histogram interval width that provides for both good separation of the
values and a reasonable number of intervals. For example, consider a statistic
containing the three ordinate values 0.0, 0.001, and 1000.0. To treat the values
distinctly, a sample-distribution histogram would require an interval width that is
the smallest difference between consecutive values, or in this case 0.001.
However, the highest value, 1000.0, can only be encompassed with one million
intervals in this case, causing the sample-distribution histogram to produce an
extremely large statistic.
Conversely, PMFs may not provide significant insight into the characteristics of
statistics containing a very diverse set of ordinate values. This is due to the fact
that each ordinate value is separately counted and that as a result, little can be
said about which ordinate region(s) exhibit the highest density in terms of the
statistic’s presence. In the extreme case, if each value in the original statistic is
unique, then the resulting PMF will have a constant value of 1 ⁄ N, where N is
the number of entries, providing almost no visually apparent information on the
distribution of the values.
Histogram (Sample-Distribution)
The sample-distribution histogram of a statistic reflects the distribution of its
ordinate values over evenly spaced intervals of the vertical axis. The vertical
axis is divided into N distinct intervals beginning at the lower bound and ending
at the upper bound. By default, N is 100, but this value may vary according to a
user-selected interval width. For each interval, the Sample-Distribution
Histogram operation then creates and initializes a separate counter to represent
the frequency with which entries occur in that interval. Subsequently, the entire
statistic is traversed and each entry analyzed; the counter whose interval
contains the entry’s ordinate value is incremented by one.
The statistic that results from this operation contains N entries corresponding to
the N intervals; because these intervals divide the vertical axis of the original
statistic, they appear on the horizontal axis of the new statistic, and the vertical
axis corresponds to the frequencies of occurrence held in the N counters. Note
from the description of this computation, that abscissa values in the original
statistic are not relevant to the sample-distribution histogram. As an example,
consider computing a histogram for the following set of entries:
Abscissa   Ordinate
1.0        1.0
2.0        4.0
3.0        1.0
4.0        2.0
5.0        1.0
6.0        3.0
7.0        4.0
8.0        6.0
9.0        1.0
10.0       0.0
The ordinate values of this statistic range from 0.0 to 6.0 and are all integers.
The default setting of 100 intervals would create far more intervals than there
are values, yielding an essentially empty histogram. An interval size of 1.0 is
more sensible. The counting process performed by the sample-distribution
histogram operation is summarized by the table below. Notice that intervals are
inclusive of their lower bound, but not of their upper bound, so that they provide
a complete partitioning of the vertical axis within its range, but do not overlap
with each other.
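The counting process can be sketched as follows for the example entries above, using an interval width of 1.0 and intervals inclusive of their lower bound only; this is a simplified illustration, not the operation’s actual code.

```python
import math

def sample_distribution_histogram(entries, lower, width, n_intervals):
    """One counter per interval [lower + i*width, lower + (i+1)*width);
    each entry increments the counter whose interval contains its
    ordinate value. Abscissa values are ignored."""
    counts = [0] * n_intervals
    for _, y in entries:
        i = int(math.floor((y - lower) / width))
        # fold an ordinate exactly at the upper bound into the last
        # interval (an assumption about boundary handling)
        i = min(i, n_intervals - 1)
        counts[i] += 1
    return counts

entries = [(1.0, 1.0), (2.0, 4.0), (3.0, 1.0), (4.0, 2.0), (5.0, 1.0),
           (6.0, 3.0), (7.0, 4.0), (8.0, 6.0), (9.0, 1.0), (10.0, 0.0)]
print(sample_distribution_histogram(entries, lower=0.0, width=1.0,
                                    n_intervals=7))
# counts for [0,1), [1,2), ..., [6,7): [1, 4, 1, 1, 2, 0, 1]
```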
For more complex input statistics containing a richer set of ordinate values,
sample-distribution histograms can be interpreted as a density profile, showing
where the ordinate values are concentrated.
Histogram (Time-Distribution)
Time-distribution histograms resemble sample-distribution histograms in the
sense that they establish a profile for the ordinate value of a statistic. The
resulting profile shows how frequently the ordinate value of the statistic lies
within specific ranges. Therefore, this operation divides the vertical axis into
intervals in the same manner as the sample-distribution histogram. However,
rather than use the number of entries falling within each interval as the measure
of frequency, a time-distribution histogram is based on the “time spent” by the
statistic within the intervals. In other words, ordinate values are still the basis for
the histogram, but weighting of each entry is performed differently:
sample-distribution histograms weight each entry with a coefficient of 1.0;
time-distribution histograms weight each entry with the difference between its
abscissa value and the abscissa value of the next entry. As an example,
consider computing a time-distribution histogram for the following set of entries
(the final line marks the end-of-statistic abscissa):
1.0 1.0
1.5 4.0
3.0 1.0
3.25 2.0
4.0 1.0
5.25 3.0
7.0 4.0
7.75 6.0
8.0 1.0
8.5 0.0
9.0 end-of-statistic
The ordinate values of this statistic range from 0.0 to 6.0 and are all integers.
The default setting of 100 intervals would create far more intervals than there
are values, yielding an essentially empty histogram. An interval size of 1.0 is
therefore more sensible. The calculation performed by the time-distribution
histogram operation is summarized in the table below. For each interval an
accumulator variable is maintained to compute the total abscissa span for which
the statistic’s ordinate falls within the interval. Notice that intervals are inclusive
of their lower bound, but not of their upper bound, so that they provide a
complete partitioning of the vertical axis within its range but do not overlap with
each other.
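The accumulation described above can be sketched as follows. This is a
hypothetical illustration; it assumes that the end-of-statistic abscissa closes the
span of the final entry:

```python
# Sketch of a time-distribution histogram: each entry is weighted by the
# abscissa span until the next entry (the end-of-statistic abscissa
# closes the span of the final entry).
def time_distribution(entries, end_abscissa, width=1.0):
    totals = {}
    next_xs = [x for x, _ in entries][1:] + [end_abscissa]
    for (x, y), x_next in zip(entries, next_xs):
        k = int(y // width)   # lower-inclusive intervals, as above
        totals[k] = totals.get(k, 0.0) + (x_next - x)
    return totals

# The example statistic from the text, with end-of-statistic at 9.0:
entries = [(1.0, 1.0), (1.5, 4.0), (3.0, 1.0), (3.25, 2.0), (4.0, 1.0),
           (5.25, 3.0), (7.0, 4.0), (7.75, 6.0), (8.0, 1.0), (8.5, 0.0)]
```

The accumulated spans total 8.0, the full abscissa extent of the statistic; the
interval [1.0, 2.0), for example, accumulates 2.5.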
Scatter Plots
This operation uses a simple mechanism to create the entries of the new
statistic. For each entry in the first statistic, an entry with equal abscissa is
searched for in the second statistic; if no exact match is found, then the nearest
entry with a lesser abscissa value is selected. The ordinate values of this pair
of entries are used to form an entry for the new statistic (i.e., the ordinate of the
first entry becomes the abscissa of the new entry, and the ordinate of the
second entry becomes the ordinate of the new entry).
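The pairing mechanism might be sketched as follows. This is a hypothetical
illustration in which statistics are lists of (abscissa, ordinate) pairs sorted by
abscissa:

```python
import bisect

# For each entry of the first statistic, find the entry of the second
# statistic with equal abscissa or, failing that, the nearest entry with
# a lesser abscissa; pair the two ordinates into a new entry.
def scatter_plot(stat_a, stat_b):
    xs_b = [x for x, _ in stat_b]
    pairs = []
    for x, y_a in stat_a:
        i = bisect.bisect_right(xs_b, x) - 1
        if i >= 0:   # skip entries with no entry at or before their abscissa
            pairs.append((y_a, stat_b[i][1]))
    return pairs
```

For example, scatter_plot([(0, 1), (1, 2), (2, 3)], [(0, 10), (1.5, 20)]) pairs the
entry at abscissa 1 with the ordinate recorded at abscissa 0, its nearest lesser
abscissa in the second statistic.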
The scatter plot statistic that results from this operation shows a possible
correlation between the two input statistics based on their abscissa variables. In
general, the resulting statistic is viewed using the discrete draw style and
appears as a cloud of points. If the cloud appears relatively shapeless with many
ordinates for each abscissa, and vice versa, then it can be assumed that there
is no strong correlation between the two input statistics. Otherwise, the scatter
plot statistic provides a mapping indicating either a dependency between the
ordinate variables of the two input statistics, or a correlated dependency on one
or more other factors. The following scatter plots provide examples of the
operation’s result for correlated and uncorrelated pairs of input statistics.
Filter Operations
In addition to histogram and probability distribution functions, the Results
Browser provides the ability to transform and combine statistic data with a
variety of mathematical operators, including arithmetic, calculus, and statistical
functions. Statistics and/or vectors may be fed through computational block
diagrams called filters to generate and plot new statistics. Filters are developed
using the Filter Editor. You apply the filter to a statistic in the Results Browser
by selecting the filter from the filter pull-down menu.
Filter Models
A filter model is a specification for a computation that operates on one or more
statistics to create exactly one new statistic. Abstractly, a filter can be thought
of as one system that has a defined set of inputs and an algorithm for computing
its output. In addition to inputs and outputs, a filter also has associated
parameters that may factor into the execution of its algorithm. Inputs and
parameters are given names when a filter model is created in the Filter Editor.
(Figure: a filter model with named inputs (input 0 through input n) and
parameters (parameter 0 through parameter n).)
To form macro filters, existing filter models are used to create subordinate filters
and are attached using filter connections. A filter connection is defined between
the output of a filter and the input of another filter to specify the flow of statistic
data. Each filter has only one output, but this output can support outgoing
connections to any number of destination filters. However, each filter input can
be the recipient of at most one connection.
Exactly one output of one subordinate filter must be left unattached when
compiling a filter model. This output becomes the output of the encompassing
macro filter. Therefore, the subordinate filter with no connection attached to its
output is the last subordinate filter to execute; its output data is then made
available to the encompassing filter or to the Results Browser to create a new
panel.
Filter connections may not be used to create feedback paths within a filter
model. Feedback paths are individual connections, or sequences of
connections that would create a flow of data such that a filter input would receive
data more than once in one filter execution. Feedback conditions are detected
during the compilation process in the Filter Editor (refer to Filter Execution on
page MC-11-40). Some examples of feedback paths are shown below.
In addition to preventing feedback paths, the Filter Editor also disallows circular
inclusion of filter models within macro filters. In other words, a filter model may
not appear at any level of depth within its own definition. Therefore if filter model
A incorporates a subordinate filter with model B, and model B incorporates
model C, then it would be illegal for models B or C to incorporate model A.
A numeric parameter of a subordinate filter can be set to promoted status,
causing it to automatically become a parameter of the macro filter. That is, the
promoted filter parameter will appear in the parameter menu of the macro filter
when the latter is deployed as part of a higher-level macro filter, or when it is
executed in the Results Browser.
Filter Execution
A filter can only be executed to operate on statistics in the Results Browser if
the filter model has been compiled at some earlier time in the Filter Editor.
Successful compilation is also required for a macro filter to be usable as a
component in a still higher-level macro filter. When a filter model is compiled, all
promoted inputs and parameters must be given names. These names serve to
identify the inputs and parameters when the filter model is used, both for
execution in the Results Browser and for deployment in other filter models.
When a macro filter is executed, all unconnected filter inputs must be provided
with either a statistic or a vector from an output vector file. This data, together
with assignments for promoted parameters, constitutes the input of the filter’s
computation and is responsible for directly or indirectly triggering all
computations of subordinate filters. The filter execution method follows the
data-flow paradigm, meaning that each subordinate filter may only be executed
after all its connections have received data, either directly from the
encompassing filter’s inputs, or from another subordinate filter. After a
subordinate filter is executed, the new statistic that results from its computation
is transferred from its output to each of the connected filter inputs. This may in
turn trigger the destination subordinate filters to be executed, provided that their
other inputs have also received data.
Execution completes when all subordinate filters have executed. The final filter
to execute must have no connection attached to its output. The output that it
produces is instead made available to the Results Browser to incorporate into a
new panel.
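The data-flow execution order described above can be sketched as follows.
This is a hypothetical miniature, not OPNET’s actual engine; the filter functions,
names, and connection encoding are invented for illustration:

```python
# A subordinate filter runs only once every one of its inputs has
# received data; its output is then forwarded to each connected
# downstream input. The filter with no outgoing connection produces
# the macro filter's final output.
def execute_macro(filters, connections, external_inputs):
    # filters: name -> (input_count, function taking a list of inputs)
    # connections: src name -> list of (dst name, dst input index)
    # external_inputs: (dst name, dst input index) -> statistic data
    pending = {name: [None] * n for name, (n, _) in filters.items()}
    for (dst, port), data in external_inputs.items():
        pending[dst][port] = data
    outputs, progress = {}, True
    while progress:
        progress = False
        for name, (n, fn) in filters.items():
            if name not in outputs and all(v is not None for v in pending[name]):
                outputs[name] = fn(pending[name])
                for dst, port in connections.get(name, []):
                    pending[dst][port] = outputs[name]
                progress = True
    final = next(name for name in filters if not connections.get(name))
    return outputs[final]
```

In this sketch a “scale” filter feeding a “shift” filter would run first, because the
shift filter’s only input is not populated until the scale filter’s output arrives.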
Predefined Filters
Macro filter models are based at the lowest level on predefined filters that
actually do computations to generate new statistic data. The top level macro
filter and intermediate levels of macro filters merely serve to structure the user’s
approach to designing a filter model that performs a particular computation.
The descriptions of some of the filters include equations and tables to explain
the computations that are performed. These tables and equations make use of
the following notations:
Arithmetic Filters
Adder Filter The “adder” filter is a binary filter used to combine two input
statistics T0 and T1 to generate a third statistic, Tout, which represents their
sum.
If the two input statistics T0 and T1 have exactly the same number of entries and
these entries are aligned with respect to their abscissa values, then Tout can be
computed simply by adding ordinate values for entries of equal abscissa, as
follows:
yout[n] = y0[n] + y1[n]
However, if the input statistics are not initially perfectly aligned with respect to
each other, then an abscissa alignment mechanism is automatically applied by
this filter before adding is performed. Alignment consists of two steps:
1) The two statistics are truncated so that both span a common abscissa
range; entries lying outside the range covered by both statistics are
discarded.
2) The truncated statistics resulting from step 1 are augmented to ensure that
each statistic contains entries at the same abscissas as the other; this
involves inserting points into each of the statistics. For example, if the two
statistics had no abscissa values in common, once augmented they would
contain a number of entries equal to the sum of their original lengths. When
an entry is inserted, its ordinate value is taken to be the ordinate value of
the previous entry in the same statistic (i.e., it is assumed that the statistic’s
ordinate value remains constant until the next original entry).
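The two alignment steps might be sketched as follows. This is a hypothetical
illustration; for simplicity it assumes both statistics begin at a common
abscissa:

```python
# Step 1: truncate both statistics to their common abscissa range.
# Step 2: insert sample-and-hold entries so that each statistic has an
# entry at every abscissa appearing in either statistic.
def align(t0, t1):
    lo = max(t0[0][0], t1[0][0])
    hi = min(t0[-1][0], t1[-1][0])
    t0 = [(x, y) for x, y in t0 if lo <= x <= hi]
    t1 = [(x, y) for x, y in t1 if lo <= x <= hi]
    xs = sorted({x for x, _ in t0} | {x for x, _ in t1})

    def augment(t):
        out, i = [], 0
        for x in xs:
            if i + 1 < len(t) and t[i + 1][0] <= x:
                i += 1
            out.append((x, t[i][1]))   # hold the previous ordinate
        return out

    return augment(t0), augment(t1)

def adder(t0, t1):
    a0, a1 = align(t0, t1)
    return [(x, y0 + y1) for (x, y0), (_, y1) in zip(a0, a1)]
```

For example, aligning [(0, 1), (2, 3)] with [(0, 10), (1, 20), (2, 30)] inserts a
held entry (1, 1) into the first statistic before the entry-by-entry sum is taken.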
After alignment has completed, the two resulting statistics can be added
directly, entry by entry. When adding entries, the rules presented in the following
table are applied (because T0 and T1 are treated identically, the table’s
corresponding column headings can be inverted to address symmetric cases).
T0          T1          Tout
a           +infinity   +infinity
a           -infinity   -infinity
*           undefined   undefined
Constant Shift Filter The “constant_shift” filter is a unary filter used to operate
on one input statistic T0 to generate a second statistic, Tout, which is a
translation of T0 by an amount ∆ along the direction of the vertical axis. The shift
quantity ∆ is a real number specified as a parameter of the filter.
The Tout statistic has the same number of entries as T0 and the two statistics are
aligned with each other with respect to the entries’ abscissa values. Only the
ordinate values differ by the constant ∆ as follows:
yout[n] = y0[n] + ∆
When computing the entries of Tout, the rules summarized in the following table
are applied (the content of the T0 and ∆ columns can be interchanged to
address symmetric cases).
T0          ∆           Tout
a           b           a + b
+infinity   a           +infinity
-infinity   a           -infinity
undefined   *           undefined
Gain Filter The “gain” filter is a unary filter used to operate on one input statistic
T0 to generate a second statistic, Tout, which is a scaled version of T0 by a factor
K along the direction of the vertical axis. The scaling factor, or gain K, is a real
number specified as a parameter of the filter.
Inputs: single input
Parameters: gain (scaling factor along the vertical axis)
Output: scaled version of the input statistic
The Tout statistic has the same number of entries as T0 and the two statistics are
aligned with each other with respect to the entries’ abscissa values. Only the
ordinate values differ by the factor K, as follows:
yout[n] = y0[n] · K
When computing the entries of Tout, the rules summarized in the following table
are applied (note: the content of the T0 and K columns can be interchanged to
address symmetric cases).
T0          K           Tout
a           b           a · b
undefined   *           undefined
Multiplier Filter The “multiplier” filter is a binary filter used to combine two input
statistics T0 and T1 to generate a third statistic, Tout, which represents their
product by real-number multiplication.
If the two input statistics T0 and T1 have the same number of entries and these
entries are aligned with respect to their abscissa values, then Tout can be
computed simply by multiplying ordinate values for entries of equal abscissa, as
follows:
yout[n] = y0[n] · y1[n]
However, if the input statistics are not initially perfectly aligned with respect to
each other, then an abscissa alignment mechanism is automatically applied by
this filter before multiplication is performed. This alignment process is identical
to that performed for the “adder” filter; a complete explanation of this
mechanism appears in the corresponding section.
After alignment has completed, the two resulting statistics can be multiplied
directly, entry by entry. When multiplying entries the rules presented in the
following table are applied (because T0 and T1 are treated identically, the table’s
corresponding column headings can be inverted to address symmetric cases).
T0          T1          Tout
undefined   *           undefined
Reciprocal Filter The “reciprocal” filter is a unary filter used to operate on one
input statistic T0 to generate a second statistic, Tout, which is obtained by
inverting the ordinate values in T0.
The reciprocal is not defined for an ordinate value of zero, so some entries of
Tout may hold special values. In general, however, the Tout statistic has a set of
entries with abscissa values that match those of T0. For entries where the
reciprocal is defined, the relationship between input and output statistic is given
by the following:
Tout(x) = 1 / T0(x)
When computing the entries of Tout, the rules summarized in the following table
are applied. The notation T0(x-) and T0(x+) refers to the ordinate values of the
entries immediately preceding and following the entry whose ordinate value is
being inverted.
T0(x)       T0(x-)      T0(x+)      Tout(x)
+infinity   *           *           0.0
-infinity   *           *           0.0
undefined   *           *           undefined
Statistical Filters
Average Filter The “average” filter is a unary filter used to operate on one input
statistic T0 to generate a second statistic, Tout, which represents the running
mean of the ordinate values of T0 beginning with the first entry.
The Tout statistic has the same number of entries as T0 and the two statistics are
aligned with each other with respect to the entries’ abscissa values. The
ordinate value of the i-th Tout entry is computed as a function of all entries up to
and including the i-th entry of T0. Because there is not necessarily a one-to-one
correspondence between the indices of the entries and their abscissa values,
discrete notation is used in the following expression for the value of the n-th
entry of Tout.
yout[n] = (Σ[i=0..n] y0[i]) / (n + 1)
Note in the above expression that the denominator is the entry index plus one,
due to the fact that entries are zero-indexed (in other words, n + 1 is the number
of entries).
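The running mean can be sketched as follows (a hypothetical illustration; the
special values discussed below are not modeled):

```python
# Running mean of the ordinate values: the n-th output entry averages
# all input ordinates up to and including index n.
def average_filter(t0):
    total, out = 0.0, []
    for n, (x, y) in enumerate(t0):
        total += y
        out.append((x, total / (n + 1)))   # n + 1 entries so far
    return out
```

For example, average_filter([(0, 2), (1, 4), (2, 6)]) yields the running means
2.0, 3.0, and 4.0 at the input abscissas.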
When using this filter, it is important to realize how special values such as
+infinity, -infinity, and undefined are treated. If an undefined value is
encountered in the input statistic a new entry is created at the same abscissa in
Tout. If no defined entries precede this entry in T0 then the new entry is marked
as undefined in Tout as well; however, in the opposite case, the new entry is
simply marked with the value of the preceding entry in Tout. In both cases,
undefined entries are not considered part of the mean that is computed, and
therefore their occurrence affects neither the numerator nor the denominator of
the expression above. Therefore, a discrepancy between the expression for Tout
and the actual algorithm of the filter can develop as undefined entries occur; a
correction to this discrepancy can be made by subtracting from the denominator
the number of undefined entries whose indices are less than or equal to n.
The first infinite value encountered in T0 causes the value of Tout to become
infinite as well (with same sign). The filter’s output will continue to be infinite for
all remaining entries unless an infinite value of opposite sign is encountered at
which point Tout will become undefined and remain so up to and including the
final entry.
The following table lists the rules applied in computing the values of Tout.
Because this filter incorporates the history of the input statistic into the
calculation of each entry, the table relies on a variable Sn, which is the sum of
all defined entry ordinates of T0 up to and including the n-th entry. In addition,
the variable Un is defined as the number of undefined entries in T0 with index
less than or equal to n.
Time Average Filter The “time_average” filter is a unary filter used to operate on
one input statistic T0 to generate a second statistic, Tout, which represents the
running continuous average of the ordinate values of T0 beginning with the first
entry. The difference between this filter and the “mean” filter, described
previously, is that entry values are not weighted equally, but are instead
weighted by the difference between their own abscissa and that of the
subsequent entry. The filter is named “time average” due to the fact that it is
frequently applied to statistics whose horizontal axis represents time; however,
it is also applicable to other types of statistics.
The Tout statistic has the same number of entries as T0 and the two statistics are
aligned with each other with respect to the entries’ abscissa values. The
ordinate value of the i-th Tout entry is computed as a function of all entries up to
and including the i-th entry of T0. Because there is not necessarily a one-to-one
correspondence between the indices of the entries and their abscissa values,
discrete notation is used in the following expression for the value of the n-th
entry of Tout.
yout[n] = (Σ[i=0..n–1] y0[i] · (x0[i+1] – x0[i])) / (Σ[i=0..n–1] (x0[i+1] – x0[i]))
With regard to treatment of the special ordinate values +infinity, -infinity, and
undefined, this filter behaves in a manner that is analogous to the “mean” filter,
described earlier. In particular, undefined values are essentially ignored: the
numerator of the above expression is not modified by their occurrence; and the
width of an undefined interval does not contribute to the denominator term. Also,
infinite ordinate values in the input statistic result in infinite values of the same
sign for all subsequent entries of the output statistic. However, if infinite values
of different sign are encountered, Tout becomes undefined. See Average Filter on
page MC-11-46 for a general understanding of how the time average filter is
implemented.
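The running continuous average can be sketched as follows (a hypothetical
illustration; special values are not modeled, and the first entry, which spans no
width, is assumed here to pass through unchanged):

```python
# Running sample-and-hold average: each input ordinate is weighted by
# the abscissa span until the next entry.
def time_average_filter(t0):
    out = [t0[0]]   # first entry passed through (assumption of this sketch)
    num = 0.0
    for n in range(1, len(t0)):
        num += t0[n - 1][1] * (t0[n][0] - t0[n - 1][0])
        out.append((t0[n][0], num / (t0[n][0] - t0[0][0])))
    return out
```

Unlike the “average” filter, an ordinate held for a long abscissa span dominates
the result: in time_average_filter([(0, 2), (1, 4), (3, 1)]) the value 4, held for two
units, pulls the final average up to 10/3.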
When the window parameter is set to a large value relative to the typical
abscissa spacing between entries of the input statistic, the “moving_average”
filter provides a smooth result that averages values together on a local basis.
However, the window parameter may be selected to be any positive size,
including extremely small values, making it possible for consecutive points to be
more than one window apart. In such cases, the moving average should still
exhibit change over the period of one window because it attempts to emulate a
continuous averaging calculation. To improve the smoothness of its results, the
“moving_average” filter may insert additional samples in the output statistic.
Therefore, there is not necessarily a direct correspondence between the lengths
of T0 and Tout.
The normal value for the minimum abscissa distance is min_dt/2, where min_dt
is the minimum distance between abscissa points in the original statistic. This
determines the number of entries in the new statistic. The
moving average filter limits the number of entries in the new statistic, however,
to no more than 10 times the number of original entries. When this occurs, the
x-axis increment is scaled up to result in a lower density of points on the x-axis
and fewer points overall.
The general calculation of Tout’s values is quite complicated and involves many
special cases for special undefined and infinite values. Due to the complexity of
the algorithm, this manual does not present a full description, as was done for
other predefined filters.
However, the following statements are useful to describe the filter’s behavior for
most purposes: because the window over which averaging is performed has a
fixed width, the denominator is simply the constant that corresponds to the
filter’s window parameter. As with other averaging filters, the general
expression provided here is not precisely accurate in the case where undefined
points appear in the input statistic.
yout[n] = (Σ[i=k..m–1] y0[i] · (x0[i+1] – x0[i]) + y0[k–1] · (x0[k] – xout[n] + W) + y0[m] · (xout[n] – x0[m])) / W
where:
W is the window size;
k is the minimum index of an entry in T0 such that the previous entry is outside the window;
m is the maximum index of an entry in T0 such that the following entry is outside the window.
Sample Sum Filter The “sample_sum” filter is a unary filter used to operate on
one input statistic T0 to generate a second statistic, Tout, which represents the
running total of the ordinate values of T0 beginning with the first entry.
The Tout statistic has the same number of entries as T0 and the two statistics are
aligned with each other with respect to the entries’ abscissa values. The
ordinate value of the i-th Tout entry is computed as a function of all entries up to
and including the i-th entry of T0. Because there is not necessarily a one-to-one
correspondence between entry indices and abscissa values, discrete notation
is used in the following expression for the value of the n-th entry of Tout.
yout[n] = Σ[i=0..n] y0[i]
When using this filter, it is important to realize how special values such as
+infinity, -infinity, and undefined are treated. If an undefined value is
encountered in the input statistic, a new entry is created at the same abscissa
in Tout. If no defined entries precede this entry in T0 then the new entry is marked
as undefined in Tout as well; however, in the opposite case, the new entry is
simply marked with the value of the preceding entry in Tout. In both cases,
undefined entries are not considered part of the sum that is computed, and
therefore their occurrence is essentially ignored.
The first infinite value encountered in T0 causes the value of Tout to become
infinite as well (with same sign). The filter’s output will continue to be infinite for
all remaining entries unless an infinite value of opposite sign is encountered, at
which point Tout will become undefined and remain so up to and including the
final entry.
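The running total can be sketched as follows (a hypothetical illustration; special
values are not modeled):

```python
# Running total of the ordinate values: the n-th output entry sums all
# input ordinates up to and including index n.
def sample_sum_filter(t0):
    total, out = 0.0, []
    for x, y in t0:
        total += y
        out.append((x, total))
    return out
```

For example, sample_sum_filter([(0, 1), (1, 2), (2, 3)]) produces the running
totals 1, 3, and 6.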
The following table lists the rules applied in computing the values of Tout.
Because this filter incorporates the history of the input statistic into the
calculation of each entry, the table uses the previous value of Tout as an input
appearing on the left side of the table. Note that the contents of the two left
columns of the table can be interchanged to address symmetric cases.
Tout (previous)   T0          Tout
a                 b           a + b
a                 +infinity   +infinity
a                 -infinity   -infinity
Differentiator Filter The “differentiator” filter is a unary filter used to operate on
one input statistic T0 to generate a second statistic, Tout, which represents the
discrete derivative of T0 with respect to its abscissa variable. Each entry of Tout
is computed as follows:
yout[n] = (y0[n+1] – y0[n]) / (x0[n+1] – x0[n])
The input and output statistics are aligned with respect to the abscissa values
of their entries. However, note that using the above expression, the
“differentiator” filter is unable to compute a value for the final entry of the input
statistic, because there is no following entry to calculate the required
differences. Therefore, Tout is shorter than T0 by one entry.
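The forward-difference computation can be sketched as follows (a hypothetical
illustration; the special-value rules in the table below are not modeled):

```python
# Forward-difference derivative: no value can be computed for the final
# input entry, so Tout is one entry shorter than T0.
def differentiator_filter(t0):
    return [(t0[n][0],
             (t0[n + 1][1] - t0[n][1]) / (t0[n + 1][0] - t0[n][0]))
            for n in range(len(t0) - 1)]
```

For example, differentiator_filter([(0, 0), (1, 2), (3, 2)]) yields a slope of 2.0
over the first span and 0.0 over the second.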
When computing the entries of Tout, the rules summarized in the following table
are applied.
y0[n+1] – y0[n]   x0[n+1] – x0[n]   Tout
a                 ±infinity         0.0
*                 0                 undefined
undefined         *                 undefined
*                 undefined         undefined
Integrator Filter The “integrator” filter is a unary filter used to operate on one
input statistic T0 to generate a second statistic, Tout, which represents the
integral of T0 with respect to its abscissa variable.
The integral statistic is computed as the area under the input statistic using the
“sample-and-hold” interpretation of the statistic. In other words, the area
between the statistic and the horizontal axis is obtained for a particular entry by
multiplying the width of the abscissa interval until the next entry by the ordinate
value of the entry. The calculation generally performed by the filter is the
following (with the exception of special handling for entries that contain special
values):
yout[n] = Σ[i=0..n–1] y0[i] · (x0[i+1] – x0[i])
The input and output statistics are aligned with respect to the abscissa values
of their entries. The output statistic may have one additional entry relative to the
input statistic to account for the area contributed by the final entry of the latter.
This depends upon whether the input statistic’s end-of-statistic marker has an
abscissa that exceeds that of the final entry with an ordinary value.
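The sample-and-hold integration can be sketched as follows (a hypothetical
illustration; special values and the possible extra end-of-statistic entry are not
modeled):

```python
# Sample-and-hold integral: the n-th output entry is the area
# accumulated over all abscissa spans before x0[n].
def integrator_filter(t0):
    area, out = 0.0, []
    for n, (x, y) in enumerate(t0):
        out.append((x, area))
        if n + 1 < len(t0):
            area += y * (t0[n + 1][0] - x)   # ordinate held until next entry
    return out
```

For example, integrator_filter([(0, 2), (1, 2), (3, 0)]) accumulates 2 units of
area over [0, 1] and 4 more over [1, 3].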
The “integrator” filter treats special values in much the same way as other
accumulating filters such as the “mean”, “time_average”, and “sample_sum”
filters. That is, adding infinite values of opposite sign yields an undefined value,
and undefined entries in the input statistic are essentially ignored (i.e., such
entries are treated as though their ordinate value were zero, since they
contribute nothing to the integral). The
following table lists the rules applied in computing the values of Tout. Because
this filter incorporates the history of the input statistic into the calculation of each
entry, the table uses the previous value of Tout as an input appearing on the left
side of the table. Note that the contents of the two left columns of the table can
be interchanged to address symmetric cases.
Tout (previous)   T0          Abscissa span   Tout
a                 ±infinity   c               ±infinity
a                 undefined   c               a
+infinity         b           c               +infinity
-infinity         b           c               -infinity
undefined         *           c               undefined
Exponent Filter The “exponent” filter is a unary filter used to operate on one
input statistic T0 to generate a second statistic, Tout, which is obtained by
raising the ordinate values of T0 to the power given by the filter’s exponent
parameter. The exponentiation result is not defined for all values, so some
entries of Tout may hold special values. In general, however, the Tout statistic
has a set of entries with abscissa values that match those of T0. For entries
where the exponentiation result is defined, the relationship between input and
output statistic is given by the following:
Tout(x) = T0(x)^exponent
The following table lists the rules applied in the computation of the entries of Tout
in the case where the exponent is positive or zero. For cases where the
exponent is negative, this table can be used to determine an intermediate result
and subsequently, the calculation rules of the reciprocal filter can be applied.
Table 11-26 Entry Calculation Rules for Exponent Filter (with Nonnegative or
Infinite Exponent)
T0(x)       exponent    Tout(x)
a ≥ 0       b ≥ 0       a^b
a < 0       b           a^b
Logarithm Filter The “logarithm” filter is a unary filter used to operate on one
input statistic T0 to generate a second statistic, Tout, which is obtained by
computing the base ten logarithm, or common logarithm, of the ordinate entries
of T0.
The number of entries in the Tout statistic is identical to that of T0 and the entries
of both statistics are aligned with each other on the horizontal axis. The
logarithm function is not defined for all values, meaning that some entries with
special values may be present in Tout. However, for entries where the logarithm
result is defined, the relationship between input and output statistic is given by
the following:
yout[n] = log10(y0[n])
The following table lists the rules applied in the computation of the entries of
Tout.
T0          Tout
a = 0.0     -infinity
+infinity   +infinity
-infinity   undefined
undefined   undefined
Miscellaneous Filters
Abscissa Filter The “abscissa” filter is a unary filter used to operate on one input
statistic T0 to generate a second statistic, Tout, whose ordinate values are simply
the abscissa values of the entries of T0.
The number of entries in the Tout statistic is identical to that of T0 and the entries
of both statistics are aligned with each other on the horizontal axis. Note that
only the abscissa values of the input statistic are relevant to the computation
performed by this filter, as shown in the following equation relating input and
output statistics.
yout[n] = x0[n]
Delay Element Filter The “delay_element” filter is a unary filter used to operate
on one input statistic T0 to generate a second statistic, Tout, which is obtained
by translating T0 along the horizontal axis by a fixed amount. The translation
distance is controlled by the sole parameter of the filter, called delay. A positive
value for delay causes a translation of the statistic in the positive abscissa
direction.
The lengths of T0 and Tout are identical, with the entries of Tout computed as
follows (note: all special values are simply translated to new abscissas).
yout[n] = y0[n]
xout[n] = x0[n] + delay
Glitch Notch Filter The “glitch_notch” filter is a unary filter used to operate on
one input statistic T0 to generate a second statistic, Tout, by eliminating the
occurrence of multiple entries that share the same abscissa value (such an
occurrence is referred to as a glitch). Tout is constructed by copying entries from
T0. However, if a sequence of consecutive entries (at least two) with the same
abscissa is encountered in T0, all but the last entry in this sequence are ignored,
and the last entry is copied into Tout. Therefore, Tout is glitch-free.
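The glitch-removal rule can be sketched as follows (a hypothetical illustration):

```python
# Keep only the last entry of any run sharing the same abscissa.
def glitch_notch_filter(t0):
    out = []
    for x, y in t0:
        if out and out[-1][0] == x:
            out[-1] = (x, y)   # later entry supersedes the earlier one
        else:
            out.append((x, y))
    return out
```

For example, glitch_notch_filter([(0, 1), (1, 2), (1, 5), (2, 3)]) keeps only the
last of the two entries at abscissa 1.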
Limiter Filter The “limiter” filter is a unary filter used to operate on one input
statistic T0 to generate a second statistic, Tout, which is obtained by constraining
the ordinate values of T0 within a specified range. The lower and upper limits of
this range are controlled by the filter parameters called min_val and max_val,
respectively. These are inclusive bounds for the range.
The lengths of T0 and Tout are identical, and the two statistics are aligned with
each other along the horizontal axis. The entries of Tout are computed as shown
below; undefined entries are not transformed by this filter, positive infinite
ordinates are clipped to the upper bound, and negative infinite ordinates are
clipped to the lower bound.
yout[n] = min(max(y0[n], min_val), max_val)
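The clipping rule can be sketched as follows (a hypothetical illustration; defined
finite ordinates only):

```python
# Clip each ordinate into the inclusive range [min_val, max_val].
def limiter_filter(t0, min_val, max_val):
    return [(x, min(max(y, min_val), max_val)) for x, y in t0]
```

For example, limiter_filter([(0, -5.0), (1, 3.0), (2, 9.0)], 0.0, 5.0) clips the first
and last ordinates to the bounds and passes the middle entry through.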
Time Window Filter The “time_window” filter is a unary filter used to operate on
one input statistic T0 to generate a second statistic, Tout, which is obtained by
eliminating all entries whose abscissa values lie outside a specified range. The
lower and upper limits of this range are controlled by the filter parameters called
min_time and max_time, respectively. These bounds themselves are also
included in the range. The names min_time and max_time are used due to the
fact that this filter is frequently applied to statistics and/or output vectors that use
time as their abscissa variable. However, the “time window” filter is equally
applicable to other types of statistics.
Inputs: single input
Parameters: min_time (lower bound of horizontal range); max_time (upper
bound of horizontal range)
Output: single statistic composed of only those entries in T0 whose abscissa
values are in the inclusive range [min_time, max_time]
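The windowing rule can be sketched as follows (a hypothetical illustration):

```python
# Keep only entries whose abscissa lies in the inclusive range
# [min_time, max_time].
def time_window_filter(t0, min_time, max_time):
    return [(x, y) for x, y in t0 if min_time <= x <= max_time]
```

For example, time_window_filter([(0, 1), (5, 2), (10, 3)], 1, 9) discards the
entries at abscissas 0 and 10.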
Value Notch Filter The “value_notch” filter is a unary filter used to operate on
one input statistic T0 to generate a second statistic, Tout. Tout is obtained by
eliminating all entries whose ordinate values are approximately equal to the
value parameter. “Approximately equal” is defined as falling within a range
determined by the specified ordinate value (v) and the value of the
value_notch_filter_tolerance preference (tol). If an entry’s ordinate value is
greater than v - tol and less than v + tol, the entry is eliminated; otherwise it is
simply copied to the output statistic Tout.
(value_notch_filter_tolerance has a default value of 10^-9.)
Inputs: single input
Parameters: value (entries with this ordinate value are eliminated)
Output: single statistic composed of only those entries in T0 whose ordinate
values differ from the specified value by at least the tolerance defined by the
value_notch_filter_tolerance preference
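The notch rule can be sketched as follows (a hypothetical illustration using the
open tolerance interval described above):

```python
# Drop entries whose ordinate falls within the open interval
# (value - tol, value + tol); all other entries are copied through.
def value_notch_filter(t0, value, tol=1e-9):
    return [(x, y) for x, y in t0 if not (value - tol < y < value + tol)]
```

For example, value_notch_filter([(0, 1.0), (1, 2.0), (2, 1.0)], 1.0, 0.5) removes
both entries whose ordinate equals 1.0.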