You are on page 1of 26

FACULTY OF ENERGY AND OIL & GAS INDUSTRY

Geostatistics.
Laboratory Session 2.
The Examples of Univariate Description

Prepared by Aibek Abdukarimov


Course Outline
• Lecture 1. Introduction to RM (Geostatistics)
• Lecture 2 and 3. Univariate and Bivariate description. Various ways of describing the spatial features of а data set.
• Lecture 4. Modeling Principles.
• Lecture 5. Structural Analysis
• Lecture 6. Variogram Modelling. Sample Variograms for U.
• MIDTERM
• Lecture 7. Random Function Models
• Lecture 8. Global Estimation
• Lecture 9. Kriging and the Estimation of in-situ Recourses.
• Lecture 10. Point Estimation
• Lecture 11. Ordinary Kriging
• Lecture 12. Block Kriging.
• ENDTERM
• FINAL EXAM.

Faculty of Energy and Oil & Gas Industry 2


Why Univariate Statistics?

• Univariate analysis explores each variable in a data set, separately. It looks at the
range of values, as well as the central tendency of the values. It describes the
pattern of response to the variable. It describes each variable on its own.   
• Descriptive statistics describe and summarize data.
• Univariate descriptive statistics describe individual variables.

Faculty of Energy and Oil & Gas Industry 3


How to Analyze One Variable

• 1) Raw Data
• Obtain a printout of the raw data for all the variables. Raw data resembles a
matrix, with the variable names heading the columns, and the information for each
case or record displayed across the rows.

Faculty of Energy and Oil & Gas Industry 4


Example: Raw data for a study of injuries among
county workers (first 10 cases)
Injury Report No. County Name Cause of Injury Severity of Injury
1 County A Fall 3
2 County B Auto 4
3 County C Fall 6
4 County C Fall 4
5 County B Fall 5
6 County A Violence 9
7 County A Auto 3
8 County A Violence 2
9 County A Violence 9
10 County B Auto 3

Faculty of Energy and Oil & Gas Industry 5


Frequency Distribution.
The variable Severity of Injury
Severity of Injury
• Obtain a frequency distribution of the data for the
3
variable. This is done by identifying the lowest and
4
highest values of the variable, and then putting all the
6
values of the variable in order from lowest to highest.
4
Next, count the number of appearance of each value of
5
the variable. This is a count of the frequency with which
each value occurs in the data set. For example, for the 9
variable "Severity of Injury," the values range from 2 to 3
9. 2
9
3

Faculty of Energy and Oil & Gas Industry 6


Frequency Distribution. Severity of Injury

Number of Injuries with this


Severity of Injury
Severity

2 1
3 3
4 2
5 1
6 1
9 2
Total 10

Faculty of Energy and Oil & Gas Industry 7


Grouped Data. Severity of Injury

• Decide on whether the data should be grouped into classes.   


• The severity of injury ratings can be collapsed into just a few categories or groups.
Grouped data usually has from 3 to 7 groups. There should be no groups with a
frequency of zero (for example, there are no injuries with a severity rating of 7 or
8).

Faculty of Energy and Oil & Gas Industry 8


Grouped Data. Severity of Injury

• One way to construct groups is to have equal class intervals (e.g., 1-3, 4-6, 7-9).
Another way to construct groups is to have about equal numbers of observations
in each group. Remember that class intervals must be both mutually exclusive and
exhaustive.
Severity of Injury Number of Injuries with this Severity
Mild (1-3)  4
Moderate (4-6)  4
Severe (6-9)  2
Total 10

Faculty of Energy and Oil & Gas Industry 9


The Uniform Distribution
• In uniform distribution the random variable is a continuous random variable
• The probability density function is calculated as:
• Mean
• Variance
• The cumulative distribution function is calculated by integrating the probability density
function f(x) to give
• Standard deviation is the under root of variance   

Faculty of Energy and Oil & Gas Industry 10


The Uniform Distribution. Simple example

• For example, if we say that it is observed in a school, over a period of 2 months


that the teacher arrives in school earliest by 4 minutes, before school starts or
latest by 6 minutes after school starts.
• Thus, looking at the figure below, we can deduce that the probability of a teacher
entering class anywhere between 7:56 to 8:06 is constant.

Faculty of Energy and Oil & Gas Industry 11


Cumulative Distribution Function

• The cumulative distribution function of a continuous random variable, is known


give the probabilities P(X ≤ x) and is calculated by integrating the probability
density function f(x) between the limits -∞ and x

Faculty of Energy and Oil & Gas Industry 12


Cumulative Distribution Function. Examples

1. A random variable is uniformly distributed over the interval 2 to 10. Find its
variance.
2. A zero mean random signal is uniformly distributed between limits -a and +a
and its mean square value is equal to its variance. Then the standard deviation of
the signal is?
3. Suppose a train is delayed by approximately 60 minutes. What is the probability
that train will reach by 57 minutes to 60 minutes?

Faculty of Energy and Oil & Gas Industry 13


Cumulative Distributions. Severity of Injury

• Cumulative frequency distributions include a third column in the table (this can be
done with either simple frequency distributions or with grouped data):
Severity of Injury Number of Injuries Cumulative frequency 

2 1 1
3 3 4
4 2 6
5 1 7
6 1 8
9 2 10

• A cumulative frequency distribution can answer questions such as, how many of
the injuries were at level 5 or lower? Answer=..

Faculty of Energy and Oil & Gas Industry 14


Percentage Distributions. Severity of Injury

•  Frequencies can also be presented in the form of percentage distributions and


cumulative percentages.

Severity of Injury Percent of Injuries  Cumulative percentages 


2 10 10
3 30 40
4 20 50
5 10 70
6 10 80
9 20 100

Faculty of Energy and Oil & Gas Industry 15


Why Graph? Graphing the Single Variable

Graphing is a way of visually presenting the data. Many people can grasp the
information presented in a graph better than in a text format. The purpose of
graphing is to:

• present the data


• summarize the data
• enhance textual descriptions
• describe and explore the data
• make comparisons easy
• avoid distortion
• provoke thought about the data

Faculty of Energy and Oil & Gas Industry 16


Bar Graphs

•  Bar graphs are used to display the


frequency distributions for variables
measured at the nominal and ordinal
levels. Bar graphs use the same width for
all the bars on the graph, and there is
space between the bars. Label the parts of
the graph, including the title, the left (Y)
or vertical axis, the right (X) or
horizontal axis, and the bar labels.

Faculty of Energy and Oil & Gas Industry 17


Bar Graphs

• Bar graphs can also be rotated so that the bars are parallel to the horizontal
orientation of the page. For example,

Faculty of Energy and Oil & Gas Industry 18


HISTOGRAM

•  A histogram is a chart that is similar to a bar chart, but it is used for interval and
ratio level variables. With a histogram, the width of the bar is important, since it is
the total area under the bar that represents the proportion of the phenomenon
accounted for by each category. The bars convey the relationship of one group or
class of the variable to the other(s).

Faculty of Energy and Oil & Gas Industry 19


HISTOGRAM County Name Rate of Injury 
per 1,000 workers

County A 5.5

• For example, in the case of the counties County B 4.2


and employee injuries, we might have County C 3.8
information on the rate of injury County D 3.6
according to the number of workers in County E 3.4
each county in State X.
County F 3.1
County G 1.8
County H 1.7
County I 1.6
County J 1.0
County K 0.9
County L 0.4

Faculty of Energy and Oil & Gas Industry 20


HISTOGRAM

• If we group the injury rates into three


groups, then a low rate of injury would
be 0.0-1.9 injuries per 1,000 workers;
moderate would be 2.0-3.9; and high
would be 4.0 and above (in this case, up
to 5.9). This could be graphed as
follows:

Faculty of Energy and Oil & Gas Industry 21


Frequency Polygon

• A frequency polygon is another way of displaying information for an interval or


ratio level variable. A frequency polygon displays the area under the curve that is
represented by the values of the variable. This type of chart is also used to show
time series graphs, or the changes in rates over time.

Faculty of Energy and Oil & Gas Industry 22


Frequency Polygon
• For example, the following table shows the average injury rate per 1,000
employees for counties in State X for the years 1980 to 1990.
Year 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990
Rate 3.6 4.2 3.4 5.5 3.8 3.1 1.7 1.8 1.0 1.6 0.9

• A cumulative frequency polygon is used to


display the cumulative distribution of values
for a variable.

Faculty of Energy and Oil & Gas Industry 23


Pie Chart

•  Another way to show the


relationships between classes or
categories of a variable is in a pie or
circle chart. In a pie chart, each
"slice" represents the proportion of
the total phenomenon that is due to
each of the classes or groups.

Faculty of Energy and Oil & Gas Industry 24


Rates and Ratios

• Other ways to look at the sub-groups or classes within one variable is by the
relation of each sub-group or class to the whole. This can be calculated with a
proportion. A proportion is obtained by dividing the frequency of observations
counted for one group or class (written as f) by the total number of observations
counted for the variable (written as N).
• This can be expressed as f / N
• A percentage is the same as a proportion, multiplied by 100.
• This can be expressed as f / N x 100

Faculty of Energy and Oil & Gas Industry 25


Rates and Ratios

• A rate is the relationship between two different numbers, for example, the number
of injuries among county workers and the population of the county. This can be
calculated as the first number (N1, or injuries) divided by the second number (N2,
or population).
• This can be expressed as N1 / N2
 
• Many health statistics are expressed as rates, for example, the birth rate is the
number of births per some population, such as number of births per 1,000 women.

Faculty of Energy and Oil & Gas Industry 26

You might also like