
# Class Note PETE 630 Geostatistics

• In general, there are as many principal components as variables. In most cases, however, the first few principal components explain most of the variance in the data, so it is enough to consider only those.

# been normalized)

# The principal components are given by the following:
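A common way to write this, assuming $p$ measured variables $X_1, \dots, X_p$ (the symbols $p$ and $X_j$ are assumptions here) and coefficients $a_{ij}$ for factor $i$ and variable $j$, is as a linear combination of the variables:

$$PC_i = a_{i1} X_1 + a_{i2} X_2 + \cdots + a_{ip} X_p = \sum_{j=1}^{p} a_{ij} X_j, \qquad i = 1, \dots, p$$

Each principal component is thus a weighted sum of the original variables, with one set of weights per component.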

# PCA Example: Well Log Data Analysis

• The variance of each log, the covariance between each pair of logs, and the covariance matrix are computed from the data.
• The data are reduced to a 6 x 6 covariance matrix instead of multiple well logs.
• Eigenvalues of a matrix A are given by: $A x = \lambda x$
• The sum of the eigenvalues yields the overall variance of the data set.
• The $a_{ij}$'s are the coefficients for factor $i$, multiplied by the measured value for variable $j$.
• PC 1 is simultaneously the direction of maximum variance and a least-squares "line of best fit" (squared distances of points away from PC 1 are minimized).

[Figure: scatter plot of Variable X 2 against Variable X 1, with the PC 1 and PC 2 axes drawn through the data.]
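The covariance-matrix and eigenvalue steps can be sketched in Python with NumPy. This is a minimal illustration, not the procedure used for the course data: the function name `pca_from_covariance` and the synthetic six-column array standing in for six well logs are assumptions, and the sample covariance is assumed to use the usual $n-1$ divisor.

```python
import numpy as np

def pca_from_covariance(data):
    """PCA of an (n_samples, n_variables) array via its covariance matrix."""
    centered = data - data.mean(axis=0)            # remove the mean of each variable
    cov = np.cov(centered, rowvar=False)           # sample covariance matrix (n - 1 divisor)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: cov is symmetric
    order = np.argsort(eigenvalues)[::-1]          # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]          # column i holds the coefficients of PC i+1
    contribution = 100.0 * eigenvalues / eigenvalues.sum()  # percent of total variance
    scores = centered @ eigenvectors               # data projected onto the components
    return cov, eigenvalues, eigenvectors, contribution, scores

# Synthetic stand-in for six well logs (hypothetical data, not from the notes)
rng = np.random.default_rng(0)
logs = rng.normal(size=(200, 6))
logs[:, 1] += 0.8 * logs[:, 0]                     # make two "logs" correlated

cov, eigenvalues, eigenvectors, contribution, scores = pca_from_covariance(logs)
print(cov.shape)                                   # (6, 6)
print(np.isclose(eigenvalues.sum(), np.trace(cov)))  # sum of eigenvalues equals total variance
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because a covariance matrix is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.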

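# PCA

| Log | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 |
|---|---|---|---|---|---|---|
| GR | -0.16 | 0.43 | -0.09 | 0.06 | -0.31 | -0.02 |
| NPHI | -0.42 | -0.16 | 0.19 | 0.86 | -0.15 | 0.08 |
| RHOB | 0.41 | 0.06 | -0.76 | 0.42 | 0.27 | -0.01 |
| DT | -0.46 | 0.21 | 0.05 | -0.04 | 0.86 | -0.09 |
| log(LLD) | 0.46 | 0.13 | 0.45 | 0.24 | 0.13 | -0.71 |
| log(MSFL) | 0.46 | 0.20 | 0.42 | 0.15 | 0.25 | 0.70 |
| Contribution, % | 64.5 | 13.4 | 8.1 | 7.4 | 4.1 | 2.5 |
| Cum. Contribution, % | 64.5 | 77.9 | 86.0 | 93.4 | 97.5 | 100 |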
PC2 = 0.43(GR) - 0.16(NPHI) + 0.06(RHOB) + 0.21(DT) + 0.13 log(LLD) + 0.20 log(MSFL)
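As a small worked illustration of the PC2 expression, the score at one depth is the dot product of the PC2 coefficients with the log readings. The sketch below assumes the readings have already been standardized; the `sample` values and the helper name `pc_score` are hypothetical and not taken from the notes.

```python
import numpy as np

log_names = ["GR", "NPHI", "RHOB", "DT", "log(LLD)", "log(MSFL)"]
# PC2 coefficients as stated in the PC2 expression
pc2_loadings = np.array([0.43, -0.16, 0.06, 0.21, 0.13, 0.20])

def pc_score(loadings, sample):
    """Weighted sum of the variables: one principal component score."""
    return float(np.dot(loadings, sample))

# Hypothetical standardized readings at a single depth (made-up values)
sample = np.array([1.0, 0.0, 0.0, -1.0, 0.5, 0.5])

score = pc_score(pc2_loadings, sample)
print(round(score, 3))  # 0.43 - 0.21 + 0.065 + 0.10 = 0.385
```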
# PCA

[Figure: crossplot of PC 2 against PC 1.]


## Tightness

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^2$$

where $\mu_i$ is the mean of the points in $S_i$.

Given an initial set of k means $m_1, m_2, \dots, m_k$, the algorithm proceeds by alternating between two steps:

• Assignment step: Assign each observation to the cluster with the closest mean:

$$S_i^{(t)} = \left\{ x_p : \lVert x_p - m_i^{(t)} \rVert \le \lVert x_p - m_j^{(t)} \rVert \ \ \forall\, 1 \le j \le k \right\}$$

• Updating step: Calculate the new means to be the centroid of the observations in the cluster:

$$m_i^{(t+1)} = \frac{1}{\lvert S_i^{(t)} \rvert} \sum_{x_j \in S_i^{(t)}} x_j$$

# Step 4: Steps 2 and 3 are repeated until convergence has been reached.
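The assignment and updating steps, repeated until the means stop moving, can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the code used in the course: the function name `k_means`, the `max_iter` and `seed` parameters, the rule for empty clusters, and the synthetic test points are all hypothetical.

```python
import numpy as np

def k_means(points, k, max_iter=100, seed=0):
    """Plain k-means on an (n_points, n_dims) array using Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct observations as the initial means
    means = points[rng.choice(len(points), size=k, replace=False)].astype(float)

    def assign(current_means):
        # Distance from every point to every mean, then index of the nearest mean
        distances = np.linalg.norm(points[:, None, :] - current_means[None, :, :], axis=2)
        return distances.argmin(axis=1)

    for _ in range(max_iter):
        labels = assign(means)                      # Step 2: assignment
        new_means = means.copy()
        for i in range(k):                          # Step 3: update to cluster centroids
            members = points[labels == i]
            if len(members) > 0:                    # assumption: keep old mean if cluster is empty
                new_means[i] = members.mean(axis=0)
        if np.allclose(new_means, means):           # Step 4: stop once the means no longer move
            break
        means = new_means

    labels = assign(means)
    # Tightness: within-cluster sum of squared distances to the cluster mean
    tightness = float(((points - means[labels]) ** 2).sum())
    return means, labels, tightness

# Synthetic two-dimensional points around three made-up centers
rng = np.random.default_rng(1)
centers = np.array([[1.0, 1.0], [4.0, 1.0], [2.5, 4.0]])
points = np.vstack([c + 0.2 * rng.normal(size=(30, 2)) for c in centers])

means, labels, tightness = k_means(points, k=3)
print(means.shape, labels.shape, tightness >= 0.0)
```

Because the initial means are chosen at random, different seeds can converge to different clusterings; running the sketch with several seeds and keeping the result with the smallest tightness is a common precaution.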

# STEP 1

## K initial “means” (in this case k = 3) are randomly selected from the data set.

[Figure: scatter plot showing the three initial means k 1, k 2 and k 3; both axes run from 0 to 5.]

# STEP 2

## K clusters are created by associating every observation with the nearest mean.

[Figure: scatter plot showing the observations grouped around the means k 1, k 2 and k 3; both axes run from 0 to 5.]

# STEP 3

## The centroid of each of the k clusters becomes the new mean.

[Figure: scatter plot showing the updated means k 1, k 2 and k 3; both axes run from 0 to 5.]

# K-means Clustering: Step 4

## Steps 2 and 3 are repeated until convergence has been reached.

[Figure: scatter plot showing the converged means k 1, k 2 and k 3; both axes run from 0 to 5.]

# Use of K-Means Clustering

• The resulting clusters tend to be roughly circular in shape because assignment is based on distance.

# Discriminant Analysis

• If the density function $f_c(x)$ of each group and the prior probability $p_c$ of the group are known, then by Bayes’ theorem the posterior distribution of the classes given the observation $x$ is

$$P(c \mid x) = \frac{p_c\, f_c(x)}{f(x)}, \qquad f(x) = \sum_{c'} p_{c'}\, f_{c'}(x)$$
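The posterior calculation can be illustrated with a short Python sketch. Everything specific in it is an assumption made for the example: the two group names, the equal priors, the one-dimensional normal densities and their parameters, and the helper names `normal_pdf` and `posterior`.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a one-dimensional normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(x, priors, densities):
    """Posterior probability of each group given x: p_c * f_c(x) / f(x)."""
    joint = {c: priors[c] * densities[c](x) for c in priors}
    total = sum(joint.values())          # f(x), the sum over all groups
    return {c: value / total for c, value in joint.items()}

# Hypothetical two-group setup (names, priors and parameters are made up)
priors = {"sand": 0.5, "shale": 0.5}
densities = {
    "sand": lambda x: normal_pdf(x, mean=40.0, std=10.0),
    "shale": lambda x: normal_pdf(x, mean=100.0, std=15.0),
}

post = posterior(60.0, priors, densities)
best = max(post, key=post.get)           # group with the largest posterior
print(best)                              # sand
```

With equal priors, the group with the largest posterior is simply the one whose density is highest at the observation; unequal priors shift the decision toward the more common group.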

# By Maximum Likelihood rule, 