
Lesson 2. Basic concepts

• Statistical Tools
• Histograms
• Probability Distribution
• Categorical Variables
• Comparing Histograms
• Data Transformation
• Monte Carlo Simulation
• Bootstrap
• Geostatistical, and Other Key Concepts
• Numerical Facies Modeling
• Cell Based Modeling
• Object Based Modeling
• Lecture 2 Exercises
• Lecture 2 Quiz

Histograms
- a bar chart comparing a variable to its frequency of occurrence
- the most common way of graphically presenting a frequency distribution
- variable is usually organized into class intervals or bins
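A minimal sketch of the binning step behind a histogram, assuming equal-width class intervals between the sample minimum and maximum. The function name `histogram` and the sample values are illustrative and do not come from the lecture.

```python
def histogram(values, n_bins):
    """Count how many values fall in each of n_bins equal-width class intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; cannot build class intervals")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is assigned to the last bin instead of a bin of its own.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical sample, for illustration only.
sample = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
edges, counts = histogram(sample, n_bins=4)
print(edges)   # bin boundaries
print(counts)  # frequency of occurrence per bin
```

Dividing each count by `len(sample)` would give relative frequencies instead of raw counts.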
Cumulative Distribution Function
- A probability distribution summarizes the probabilities that the random variable will take a certain value.
- A probability distribution and a cumulative distribution function convey the same information. Probability can be defined as the relative frequency of an event in the long run: if we repeat the experiment many times, the relative frequencies of the outcomes should approach the random variable's probability histogram. Cumulative distribution functions (cdf) are defined mathematically by:
  F(x) = Prob(X ≤ x), with 0 ≤ F(x) ≤ 1
- A cumulative probability plot and a cumulative frequency plot are the same thing.
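The definition F(x) = Prob(X ≤ x) can be illustrated with an empirical cdf, where the probability is estimated as the fraction of data at or below x. This is a sketch under that assumption; `ecdf` and the sample values are hypothetical names and numbers.

```python
def ecdf(values, x):
    """Empirical cdf: fraction of the data that is less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)


# Hypothetical sample, for illustration only.
sample = [1.0, 2.0, 2.0, 3.0, 5.0]
for x in (0.0, 2.0, 5.0):
    print(x, ecdf(sample, x))
```

The result never leaves the interval [0, 1] and never decreases as x grows, which matches the bounds stated for F(x).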
Categorical Variables
The probability distribution of a categorical variable is defined by the probability or proportion of each category.
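As a small illustration, the proportion of each category can be computed by counting labels. The function name and the facies labels below are made up for the example.

```python
from collections import Counter


def category_proportions(labels):
    """Return the proportion of each category in a list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return {category: count / n for category, count in counts.items()}


# Hypothetical facies labels, for illustration only.
facies = ["shale", "sand", "sand", "shale", "sand"]
proportions = category_proportions(facies)
print(proportions)
```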
Quantile-Quantile Plots
Quantile-quantile plots (Q-Q plot) are useful for comparing two distributions. A change in slope indicates a difference in variance, and a parallel shift in any direction indicates a difference in the mean. Some uses of the Q-Q plot include core to log relations, comparing the results from different drilling campaigns, and comparing distributions by lithofacies.
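The points of a Q-Q plot are pairs of matching quantiles from the two samples. The sketch below assumes linearly interpolated quantiles (one of several conventions) and uses hypothetical data to show the two effects described: a constant shift moves the points parallel to the 45° line, and a rescaling changes their slope.

```python
def quantile(sorted_vals, p):
    """Linearly interpolated quantile of pre-sorted data, for 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def qq_pairs(a, b, probs):
    """Pairs (quantile of a, quantile of b) at the same probabilities."""
    sa, sb = sorted(a), sorted(b)
    return [(quantile(sa, p), quantile(sb, p)) for p in probs]


# Hypothetical samples, for illustration only.
base = [1.0, 2.0, 3.0, 4.0, 5.0]
shifted = [v + 2.0 for v in base]  # same variance, different mean
scaled = [v * 2.0 for v in base]   # different spread
probs = [0.0, 0.25, 0.5, 0.75, 1.0]

shift_pairs = qq_pairs(base, shifted, probs)  # points on a line parallel to y = x
scale_pairs = qq_pairs(base, scaled, probs)   # points on a line with slope 2
print(shift_pairs)
print(scale_pairs)
```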
Exercise 2:
Using the following data:
- Calculate the population mean, variance, median, 25th percentile, 75th percentile, the interquartile range, and the COV (coefficient of variation).
- Repeat the above summary statistics first without the zero values (leaving 16 data), then without the outlier (leaving 19 values). Compare the results from part b with those in part a and comment.
- Using the original data, draw the histogram using 5 bins and the cdf.
- Assume the values in question 1 are really measures of permeability (Ki = zi*1000, i.e. K5 = 1.20*1000 = 1200 md). A geologist has interpreted the core samples as follows: lithofacies 1 is shale having a permeability of 0.00 md, lithofacies 2 has permeabilities varying from 1000 md to 2999 md, lithofacies 3 has permeabilities varying from 3000 md to 3999 md, and lithofacies 4 has permeabilities varying from 4000 md and up. Using this information, transform the data from question 1 into categorical form and show a categorical histogram including all four lithofacies.
- Discuss the interpretation of the geologist using a Q-Q plot.
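The exercise data are not reproduced in this text, so the sketch below uses placeholder values only to show the mechanics. It assumes the "inclusive" quartile convention of Python's `statistics.quantiles` (other conventions give slightly different percentiles) and treats the lithofacies ranges as half-open intervals; permeabilities strictly between 0 and 1000 md are not covered by the geologist's interpretation and are returned as `None`.

```python
import statistics


def summary(values):
    """Population summary statistics; quartiles use the 'inclusive' convention."""
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean, "variance": variance, "median": median,
        "q1": q1, "q3": q3, "iqr": q3 - q1,
        "cov": variance ** 0.5 / mean,  # coefficient of variation = std / mean
    }


def lithofacies(k_md):
    """Map a permeability in md to the lithofacies code given in the exercise."""
    if k_md == 0:
        return 1
    if 1000 <= k_md < 3000:
        return 2
    if 3000 <= k_md < 4000:
        return 3
    if k_md >= 4000:
        return 4
    return None  # 0 < k < 1000 md is not covered by the interpretation


# Placeholder values, NOT the exercise data.
z = [0.0, 1.2, 2.5, 3.4, 4.8]
stats = summary(z)
facies = [lithofacies(round(zi * 1000)) for zi in z]  # Ki = zi * 1000
print(stats)
print(facies)
```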
Correlation:
Statistical relationship between two variables
X ↔ Y: a data set of n pairs (x1, y1), (x2, y2), … (xn, yn)
Coefficient of Correlation:
The correlation coefficient varies between +1 and −1 inclusive, where 1 is total positive correlation, 0 is no linear correlation, and −1 is total negative correlation.
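The slide's formula is not reproduced in this text. The sketch below assumes the usual Pearson product-moment coefficient, r = Σ(x − x̄)(y − ȳ) / sqrt(Σ(x − x̄)² · Σ(y − ȳ)²), and checks the two extreme cases on hypothetical data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data (assumed definition)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
rising = [2.0, 4.0, 6.0, 8.0]   # exact increasing line
falling = [8.0, 6.0, 4.0, 2.0]  # exact decreasing line
r_up = pearson_r(xs, rising)
r_down = pearson_r(xs, falling)
print(r_up, r_down)
```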
• Prediction:
  • A relationship model between 2 variables can be built
  • Choice of type of model is important
  • The model helps to predict: X → Y
  • X: independent variable, predictor variable, …
  • Y: dependent variable, response variable, …
• What is the relationship between the number of hours per day students watch TV and the semester score?
• Exercise 3:

  Hours watching TV per day   Semester score
  4                           4
  2.5                         5
  3                           4.8
  1                           6.4
  0.5                         7.6
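One way to approach this exercise is to measure the strength of the linear relationship in the table. The sketch below assumes the Pearson coefficient written as covariance divided by the product of the population standard deviations; it is one possible reading of the exercise, not necessarily the intended solution.

```python
import statistics

# Data from the Exercise 3 table.
hours = [4, 2.5, 3, 1, 0.5]
score = [4, 5, 4.8, 6.4, 7.6]

n = len(hours)
mean_h = statistics.fmean(hours)
mean_s = statistics.fmean(score)

# Population covariance of the paired data.
cov = sum((h - mean_h) * (s - mean_s) for h, s in zip(hours, score)) / n

# Correlation = covariance / (std of hours * std of score).
r = cov / (statistics.pstdev(hours) * statistics.pstdev(score))
print(round(r, 3))  # -0.977
```

A value close to −1 indicates a strong negative linear relationship in this small sample: more hours of TV go with lower scores. It does not by itself establish cause.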
• Regression analysis: estimated from an experimental data set
  • General form: ŷ = f(x1, x2, x3, …, a0, a1, a2, …)
    • xi (i = 1:k): independent variables
    • y: dependent variable
    • a0, a1, a2, …: unknown regression parameters
• Calculating the regression parameters requires choosing the most suitable model:
  • Linear regression
  • Polynomial regression
  • Power regression
  • Exponential regression
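Power and exponential models are often fitted by taking logarithms so that a straight-line fit can be reused. The sketch below assumes that approach; the function names and the noise-free data are hypothetical. Note that fitting in log space minimizes squared errors of ln y, not of y itself.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept a0 and slope a1 of y = a0 + a1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    return mean_y - a1 * mean_x, a1


def fit_exponential(xs, ys):
    """Fit y = c * exp(k * x) via a line through (x, ln y); requires y > 0."""
    ln_c, k = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(ln_c), k


def fit_power(xs, ys):
    """Fit y = c * x**p via a line through (ln x, ln y); requires x, y > 0."""
    ln_c, p = fit_line([math.log(x) for x in xs], [math.log(y) for y in ys])
    return math.exp(ln_c), p


# Hypothetical noise-free data generated from known models.
xs = [1.0, 2.0, 3.0, 4.0]
ys_exp = [2.0 * math.exp(0.5 * x) for x in xs]  # c = 2, k = 0.5
ys_pow = [3.0 * x ** 2 for x in xs]             # c = 3, p = 2

exp_params = fit_exponential(xs, ys_exp)
pow_params = fit_power(xs, ys_pow)
print(exp_params, pow_params)
```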
• Objective: modelling a relationship between one or multiple independent variables Xi and one dependent variable Y
  • One independent variable: Simple linear regression
  • Multiple independent variables: Multiple linear regression
• General form of multiple linear regression:
  ŷ = a0 + a1x1 + a2x2 + … + amxm
  Example: x1 is Age, x2 is Weight, x3 is Height, y is Cholesterol concentration in blood
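Once the parameters are known, evaluating the multiple linear model is a weighted sum plus the intercept. A minimal sketch follows; the function name `predict` and the numbers are placeholders with no physical meaning.

```python
def predict(params, x):
    """Evaluate y_hat = a0 + a1*x1 + ... + am*xm for one observation x."""
    a0, slopes = params[0], params[1:]
    if len(slopes) != len(x):
        raise ValueError("need one coefficient per independent variable")
    return a0 + sum(a * xi for a, xi in zip(slopes, x))


# Placeholder coefficients [a0, a1, a2] and one observation (x1, x2).
params = [1.0, 2.0, 3.0]
observation = [10.0, 100.0]
y_hat = predict(params, observation)
print(y_hat)  # 1 + 2*10 + 3*100 = 321.0
```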
• Least squares means that the overall solution minimizes the sum of the squares of the errors made in the results of every single equation.
• Least squares formulation:

$$
\Phi = \sum_{j=1}^{n} \left[ y_j - \hat{y}_j \right]^2
     = \sum_{j=1}^{n} \left[ y_j - f(x_1, x_2, \ldots, a_0, a_1, a_2, \ldots)_j \right]^2 \to \min
$$

- yj (j = 1:n): experimental dependent data
- ŷj: predicted data
- n: number of data points
- xi (i = 1:m): experimental independent data
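The objective can be evaluated directly for any candidate parameter set, which makes "minimize the sum of squared errors" concrete. This sketch assumes a model passed in as a function of one x value and a parameter tuple; all names and data are illustrative.

```python
def sum_squared_errors(model, params, xs, ys):
    """Sum over j of [y_j - model(x_j, params)]**2."""
    return sum((y - model(x, params)) ** 2 for x, y in zip(xs, ys))


def line(x, params):
    """Straight-line model y_hat = a0 + a1 * x."""
    a0, a1 = params
    return a0 + a1 * x


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

phi_exact = sum_squared_errors(line, (1.0, 2.0), xs, ys)  # true parameters
phi_off = sum_squared_errors(line, (0.0, 2.0), xs, ys)    # intercept off by 1
print(phi_exact, phi_off)  # 0.0 3.0
```

The least squares solution is the parameter set with the smallest such value; here no other choice can beat 0.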
• Multiple linear regression:
  ŷ = a0 + a1x1 + a2x2 + … + amxm
• Workflow:
  • Determine Φ

$$
\Phi = \sum_{j=1}^{n} \left[ y_j - \hat{y}_j \right]^2
     = \sum_{j=1}^{n} \left[ y_j - (a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m)_j \right]^2
$$

  • Using least squares method:
    • System of equations with (m+1) unknown parameters a0, a1, a2, …, am
    • Solve: AX = B with X = [a0, a1, a2, …, am]^T
• We have:
  - Vector B:

$$
B = \begin{bmatrix}
\sum_j y_j \\
\sum_j y_j x_{1j} \\
\sum_j y_j x_{2j} \\
\vdots \\
\sum_j y_j x_{mj}
\end{bmatrix}
$$

  - Matrix A (m+1, m+1), where N = n is the number of data points:

$$
A = \begin{bmatrix}
N & \sum_j x_{1j} & \sum_j x_{2j} & \cdots & \sum_j x_{mj} \\
\sum_j x_{1j} & \sum_j x_{1j}^2 & \sum_j x_{2j} x_{1j} & \cdots & \sum_j x_{mj} x_{1j} \\
\sum_j x_{2j} & \sum_j x_{1j} x_{2j} & \sum_j x_{2j}^2 & \cdots & \sum_j x_{mj} x_{2j} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum_j x_{mj} & \sum_j x_{1j} x_{mj} & \sum_j x_{2j} x_{mj} & \cdots & \sum_j x_{mj}^2
\end{bmatrix}
$$

  - Substitute (a0, a1, a2, …, am) into the model to obtain a graphical presentation for the experimental data set {(yj, x1j, x2j, …, xmj), j = 1:n}
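A sketch of how the system AX = B could be assembled and solved, assuming a plain Gaussian elimination with partial pivoting (the lecture does not say which solver to use). The helper names and the data are hypothetical; the data were generated from y = 1 + 2·x1 + 3·x2 so the recovered parameters can be checked.

```python
def normal_equations(X, y):
    """Build A ((m+1) x (m+1)) and B (length m+1) from n rows of m predictors."""
    rows = [[1.0] + list(r) for r in X]  # leading 1 carries the intercept a0
    k = len(rows[0])
    A = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    B = [sum(r[p] * yj for r, yj in zip(rows, y)) for p in range(k)]
    return A, B


def solve(A, B):
    """Solve A X = B by Gaussian elimination with partial pivoting."""
    k = len(B)
    M = [row[:] + [b] for row, b in zip(A, B)]  # augmented matrix [A | B]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]
    X = [0.0] * k
    for r in range(k - 1, -1, -1):  # back substitution
        known = sum(M[r][c] * X[c] for c in range(r + 1, k))
        X[r] = (M[r][k] - known) / M[r][r]
    return X


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2.
X_data = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y_data = [1, 3, 4, 6, 8]

A, B = normal_equations(X_data, y_data)
coeffs = solve(A, B)  # [a0, a1, a2]
print(A[0][0], coeffs)
```

The top-left entry of `A` equals the number of data points, and the first row and column hold the plain sums of each predictor, as in the matrix above.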
• In case of one independent variable: ŷ = a0 + a1x
• Φ for an experimental data set {yj, xj, j = 1:n} is:

$$
\Phi = \sum_{j=1}^{n} \left[ y_j - \hat{y}_j \right]^2 = \sum_{j=1}^{n} \left[ y_j - (a_0 + a_1 x_j) \right]^2
$$

• Minimizing Φ gives:

$$
\frac{\partial \Phi}{\partial a_0} = -\sum_{j=1}^{n} 2 \left[ y_j - (a_0 + a_1 x_j) \right] = 0
$$

$$
\frac{\partial \Phi}{\partial a_1} = -\sum_{j=1}^{n} 2 \left[ y_j - (a_0 + a_1 x_j) \right] x_j = 0
$$
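Dividing both conditions by −2 and collecting terms gives two linear equations in a0 and a1, which can be solved in closed form. The algebra below uses only the symbols already defined on this slide and is valid when the xj are not all equal (otherwise the denominator vanishes).

```latex
\begin{aligned}
n\,a_0 + a_1 \sum_{j=1}^{n} x_j &= \sum_{j=1}^{n} y_j \\
a_0 \sum_{j=1}^{n} x_j + a_1 \sum_{j=1}^{n} x_j^2 &= \sum_{j=1}^{n} x_j y_j \\[4pt]
a_1 &= \frac{n \sum_j x_j y_j - \sum_j x_j \sum_j y_j}{n \sum_j x_j^2 - \left( \sum_j x_j \right)^2},
\qquad
a_0 = \frac{\sum_j y_j - a_1 \sum_j x_j}{n}
\end{aligned}
```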
• Exercise 4: J (hydraulic gradient) and V (cm/d) permeability:

  J          7     9     12    15    18
  V (cm/d)   0.5   2     6.5   9.5   14

[Figure: scatter plot of V (cm/d) against J with the fitted line V = 1.2373J - 8.5952]
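The fitted line quoted with the figure can be checked from the table with the closed-form least squares solution for one independent variable. This is a sketch of that check; the variable names are illustrative.

```python
# Data from the Exercise 4 table.
J = [7, 9, 12, 15, 18]
V = [0.5, 2, 6.5, 9.5, 14]

n = len(J)
sum_x = sum(J)
sum_y = sum(V)
sum_xx = sum(x * x for x in J)
sum_xy = sum(x * y for x, y in zip(J, V))

# Closed-form solution of the two normal equations for y_hat = a0 + a1 * x.
a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
a0 = (sum_y - a1 * sum_x) / n

print(f"V = {a1:.4f}J + ({a0:.4f})")  # V = 1.2373J + (-8.5952)
```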
