Rick Hathaway
Jim Bezdek
geFFCM
geFEM
3/15/06 geFFCM 1
The geFFCM Algorithm
Compute the initial number of samples n = ⌈pN/100⌉, and n/b = initial # of samples per bin.
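As a quick sketch in Python (a hypothetical helper, not the authors' code; p is the starting percentage and b the number of equal-content bins per feature, as used in the later slides):

```python
import math

def initial_sample_sizes(N, p, b):
    """Initial sample size n = ceil(p*N/100), and n/b samples per bin.

    N: size of the full data set XN
    p: starting percentage of XN to draw
    b: number of equal-content (EC) bins per feature
    """
    n = math.ceil(p * N / 100)   # initial # of samples
    return n, n / b              # (n, initial # of samples per bin)

# e.g. start with p = 1% of N = 100,000 points, binned into b = 25 EC bins
n, per_bin = initial_sample_sizes(100_000, 1, 25)   # -> (1000, 40.0)
```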
PS1 Randomly choose (without replacement)* n vectors Xn ⊂ XN
S1 = A  = Single Accept
S2 = FA = First Accept
S3 = CA = Cumulative Accept (each feature in a specified set either did or does pass)
S4 = SA = Simultaneous Accept
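The four strategies can be sketched as a single decision function (a minimal sketch, not the authors' code; `passed_now` / `passed_ever` are assumed per-feature test outcomes):

```python
def accept(strategy, passed_now, passed_ever):
    """Decide whether to accept the current sample Xn.

    passed_now  : {feature: bool} - did the feature pass on THIS test?
    passed_ever : {feature: bool} - did the feature pass on this or any
                  earlier test (used by cumulative accept)?
    """
    if strategy == "FA":            # S2: any one feature passes now
        return any(passed_now.values())
    if strategy == "CA":            # S3: each feature did pass, or does pass
        return all(passed_ever.values())
    if strategy in ("A", "SA"):     # S1 (single feature) / S4: all pass at once
        return all(passed_now.values())
    raise ValueError(f"unknown strategy {strategy!r}")

# j failed this round but passed earlier; k passes now:
now, ever = {"j": False, "k": True}, {"j": True, "k": True}
# FA accepts, CA accepts, SA does not: stricter strategies keep sampling longer
```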
[Flowchart: the single accept (S1=A) selection strategy for a single feature k. Draw the 1st Xn from the VL data x1…xN; get EC bins with {x_i^k}; test {x_i^k}. Test fails ⇒ get ΔX and test the augmented set {x_i^k + Δx_i^k}; test ok ⇒ run LFCM on all p features of the accepted sample.]
The first accept (S2=FA) selection strategy for any single feature
[Flowchart: draw the 1st Xn from the VL data x1…xN; for each feature k, get EC bins with {x_i^k} and test {x_i^k}. Any one feature passing (here feature 2) ⇒ accept the current sample; all k tests fail ⇒ reject the current sample, get ΔX, and test again.]
The cumulative accept (S3=CA) selection strategy for features (j,k)
[Flowchart: draw the 1st Xn from the VL data; get EC bins for {x_i^j} and {x_i^k}; test {x_i^j} and {x_i^k}. If j fails and k passes (or j passes and k fails), get ΔX and test the augmented sets {x_i^j + Δx_i^j}, {x_i^k + Δx_i^k}. Each feature did pass, or does pass ⇒ accept the current sample.]
The simultaneous accept (S4=SA) selection strategy for features (j,k)
[Flowchart: as above, but the sample is kept only when both features pass the same test. j fails, k passes (or j passes, k fails) ⇒ get ΔX and retest {x_i^j + Δx_i^j}, {x_i^k + Δx_i^k}; j,k both pass at the same time ⇒ accept the current sample.]
Run LFCM on all p features of the accepted sample
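Putting the flowcharts together, the whole progressive-sampling loop has roughly this shape (a sketch only; `test_feature`, `draw_increment`, `accept`, and `lfcm` are hypothetical stand-ins for the PS steps and the literal FCM run):

```python
import random

def geffcm_sketch(XN, n0, strategy, test_feature, draw_increment, accept, lfcm):
    """Grow a sample Xn of XN until the acceptance test passes (PS1-PS4),
    then run literal FCM on all p features of the accepted sample."""
    p = len(XN[0])
    Xn = random.sample(range(len(XN)), n0)        # PS1: without replacement
    passed_ever = {}
    while True:
        passed_now = {k: test_feature(XN, Xn, k) for k in range(p)}
        for k, ok in passed_now.items():
            passed_ever[k] = passed_ever.get(k, False) or ok
        if accept(strategy, passed_now, passed_ever):
            break                                  # sample accepted
        Xn += draw_increment(XN, Xn)               # get ΔX, augment the sample
    return lfcm([XN[i] for i in Xn])
```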
PS3 For each active feature k, augment Xn in increments of %n = 0.2 of |XN|, drawn from XN.
PS4 For each active feature (k = 1 to IA):
    For each bin (i = 1 to b_k):
Compute the divergence*   div_k = n · Σ_{i=1}^{b_k} ( N_i^k/N − n_i^k/n ) · log( n·N_i^k / ( N·n_i^k ) )
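The divergence div_k compares feature k's sample histogram {n_i^k} against the population histogram {N_i^k}; as a sketch (the empty-bin handling is my assumption, not from the slides):

```python
import math

def divergence(N_bins, n_bins):
    """div_k = n * sum_i (N_i/N - n_i/n) * log(n*N_i / (N*n_i)),
    comparing feature k's sample histogram against the population's."""
    N, n = sum(N_bins), sum(n_bins)
    total = 0.0
    for Ni, ni in zip(N_bins, n_bins):
        if Ni == 0 or ni == 0:      # skip empty bins (assumption)
            continue
        total += (Ni / N - ni / n) * math.log(n * Ni / (N * ni))
    return n * total

divergence([50, 50], [5, 5])   # perfectly proportional sample -> 0.0
```

Each summand is nonnegative (both factors share the same sign), so div_k ≥ 0, with equality for a perfectly proportional sample.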
Data XL (loadable), N = 100,000 draws from a mixture of c = 2 2D normals
Termination:   ‖U^{(k+1)} − U^{(k)}‖ = max_{i,j} | U_{i,j}^{(k+1)} − U_{i,j}^{(k)} | < ε
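The termination test is just a sup-norm check on successive membership matrices; as a sketch:

```python
def terminated(U_new, U_old, eps=1e-3):
    """True when max_{i,j} |U_new[i][j] - U_old[i][j]| < eps."""
    return max(abs(a - b)
               for row_n, row_o in zip(U_new, U_old)
               for a, b in zip(row_n, row_o)) < eps

U_old = [[0.6, 0.4], [0.3, 0.7]]
U_new = [[0.6004, 0.3996], [0.3001, 0.6999]]
terminated(U_new, U_old)   # True: largest membership change is ~0.0004 < 1e-3
```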
Divergence vs χ²: why we use either one (because they are basically identical!)
[Plot: both test statistics vs. |Xn| as % of |XN| (sampling without replacement, Feature 1 only), with thresholds F⁻¹(0.05) and F⁻¹(0.10) marking acceptance at α = 0.95 and α = 0.90; the two curves essentially coincide.]
Acceptance Strategies
[Plot: the test statistic tdiv(Xn) vs. |Xn| as % of |XN| (sampling without replacement) for features F1 and F2, with thresholds F⁻¹(0.05) and F⁻¹(0.10). F1 first signals at |Xn| = 3%; F2 first signals at |Xn| = 6%; both first signal at |Xn| = 20%. Hence FA ≤ CA ≤ SA (always!).]
Terminal Prototypes
[Plot: ‖V_XN − V_Xn‖ vs. |Xn| as % of |XN| (sampling with replacement), where LFCM(XN) ⇒ V_XN and LFCM(Xn) ⇒ V_Xn; vertical axis roughly 0.06 to 0.14.]
Data XL (loadable), N=100,000 draws
from a mixture of c=4 5D normals
Termination Same
Study parameters
"good" separation
b= 25 50 100 200
α= .90 .95 .90 .95 .90 .95 .90 .95 σ2=0.1
x1 |XS1| 19 21 15 19 12 14 7 10 Typical Result
x2 |XS1| 17 27 14 19 9 14 8 12
25 trials ave.
x3 |XS1| 17 26 16 21 13 15 9 13
x4 |XS1| 13 22 12 17 13 19 10 12 |Xn| of %|XN|
x5 |XS1| 19 27 18 23 13 18 11 13
FA Trend Studies
3 6 4 6 3 5 3 4
|XS2|
time, secs .14 .15 .16 .17 .18 .19 .20 .21 %|XN| vs b
CA
35 47 30 35 24 29 17 22 %|XN| vs SS
|XS3|
time, secs .17 .19 .20 .21 .22 .25 .26 .29 %|XN| vs α
SA
44 54 35 40 27 33 21 26 %|XN| vs cpu time
|XS4|
time, secs .21 .23 .22 .24 .25 .26 .29 .31
SS vs σ2
3/15/06 geFFCM 17
Average Trends
Sample size vs b: raising b from 25 to 200 cuts the accepted %|XN| by more than 1/2

 b        25    50    100   200
 % |XN|   25.5  19.8  16.4  12.3
Average Trends
Approximation and Acceleration Measures
DECREASING separation
σ2=0.1 σ2=0.5 σ2=1.0
Average Trends in Acceleration
Acceleration vs σ2
Acceleration vs SS
Average Trends in Approximation
Eapp grows as separation decreases
Etr grows as separation decreases
Probabilistic clustering with geFEM: typical result (10 trials ave.) for "good" separation (σ² = 0.1)
Elastic control of n = |Xn| to reduce sample size
Recall that for each active feature k, we compute

  D_k = n · [ Σ_{i=1}^{b_k} ( N_i^k/N − n_i^k/n ) · log( n·N_i^k / ( N·n_i^k ) ) ]

Choose a target sample size n* and define

  D*_k = min{n, n*} · [ Σ_{i=1}^{b_k} ( N_i^k/N − n_i^k/n ) · log( n·N_i^k / ( N·n_i^k ) ) ]
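So the elastic statistic D*_k only changes the leading multiplier of D_k from n to min{n, n*}; as a sketch (the zero-bin handling is my assumption, as before):

```python
import math

def elastic_divergence(N_bins, n_bins, n_star):
    """D*_k: as D_k, but the leading n is capped at min(n, n*), so the
    statistic stops growing with n once the target size n* is reached."""
    N, n = sum(N_bins), sum(n_bins)
    J = sum((Ni / N - ni / n) * math.log(n * Ni / (N * ni))
            for Ni, ni in zip(N_bins, n_bins) if Ni and ni)
    return min(n, n_star) * J

# once n > n*, D* is smaller than D by the factor n*/n, so the
# acceptance test can pass at the target size instead of growing further
```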
|XN| = 1,600,000;  25 trials ave.;  α = 0.95;  σ² = 0.50;  n* = 20,000 = 1.25% |XN|

 b =             25         50         100        200
 X =            D    D*    D    D*    D    D*    D    D*
 x1 |XS1|%      15    2    12    2     8    1     9    1
 x2 |XS1|%      12    2    10    1     9    1     5    1
 x3 |XS1|%      17    2    11    1     9    1     6    1
 x4 |XS1|%      15    2    13    1     8    1     8    1
 x5 |XS1|%      13    2    14    1    11    1     9    1
 FA |XS2|%       1    1     1    0     1    1     1    1
    time, secs 1.95 1.94  2.19 2.18  2.42 2.40  2.67 2.63
 CA |XS3|%      35    3    31    2    21    2    17    2
    time, secs 2.44 1.97  2.79 2.22  3.10 2.45  3.55 2.72
 SA |XS4|%      52    3    40    2    29    2    23    2
    time, secs 3.28 1.99  3.53 2.23  3.63 2.44  4.07 2.65

52% of N = 832,000 samples; 3% of N = 48,000 samples. Elastic when D* > 1.25%.
832,000 samples → 48,000 samples: 5 times faster, same accuracy
b = 100 bins;  25 trials aves.;  α = 0.95;  SS3 = CA;  n* = 20,000;  σ² = 0.50

                       D     D*      D     D*      D     D*      D     D*
 |Xn| as % of |XN|    28     23     26     13     24      7     21      4
 geFFCM time, secs   3.28   2.81   6.55   3.79  12.75   5.19  23.29   7.93
 LFCM time, secs    10.87  10.84  21.58  21.54  42.93  42.72  85.91  85.98
 Acceleration, tacc  3.40   3.89   3.58   5.69   3.62   8.24   4.02  10.84
Empirical Conclusions
tacc : acceleration is highest when separation is highest and the SS is minimal (FA)
Eapp : smallest approximation errors for the stringent SS's (CA and SA)
Yet to do: geFFCM
Process VL data: geFFCM was designed for VL data, but so far no real tests have been made. Should work ok!
Thanks mates !
G’Day