
Masters Defense
Kelli Crews Baumgartner
Laboratory for Intelligent Systems & Control
Mechanical Engineering, Duke University

10/18/2007

Outline

• Background to Criminal Profiling
• Motivation
• Bayesian Networks (BNs)
• Methodology
• Results
• Future Work
• Acknowledgments


The Case of the Mad Bomber

• Profile prediction:
  – Male
  – Eastern European
  – Late 40s to early 50s
  – Lived in Connecticut with a maiden aunt or sister
  – Paranoia
  – Wearing a double-breasted suit, buttoned
• Actual offender, George Metesky:
  – Slavic descent
  – 54 years old
  – Lived in Connecticut with two maiden sisters
  – Acute paranoia
  – When apprehended, was wearing a double-breasted suit, buttoned


Background

• Goals of criminal profiling:

– Concentrate criminal investigations
– Interrogation strategies

• Collaboration with law enforcement


Previous Research

• Early 1980s: the FBI developed the organized/disorganized dichotomy
• 1985-1994: Dr. David Canter expanded the FBI model: interpersonal coherence, time and place, criminal characteristics, criminal behavior, and forensic awareness
• 1999-present: G. Salfati and D. Canter explore the expressive/instrumental dichotomy


Expressive vs. Instrumental

• Uses multidimensional scaling (MDS)

– Non-metric multidimensional scaling procedure
– Plots association coefficients for each crime scene behavior
– Similarly themed actions will co-occur in the same region of the plot, and vice versa

• Single offender-single victim homicides recorded by the British police (82 cases and 247 cases)


Results of MDS Research

• 62% of the cases exhibited a majority of the crime scene characteristics in a single theme
• 74% of all offenders could also be classified as either expressive or instrumental
• 55% of the cases exhibited the same theme in both their crime scene actions and in the offender background characteristics
• High-frequency crime scene behaviors can be eliminated


Motivation

• Develop a network model linking the profile of the offender to his/her decisions and behaviors at the crime scene
• Determine correlations between input and output variables based on data from real cases
  – Input variables: Crime Scene Analysis (CSA), victimology assessment, and medical examination
  – Output variables, the offender profile: sex of offender, prior convictions, relationship with victim
  – Training data: solved cases (inputs/outputs known)
• Apply software to produce the offender profile for unsolved cases, given the input variables


Criminal Profiling Software Development

• Use expert knowledge to initialize the BN variables
• Train the network model (NM) with solved cases
• Test the NM with validation cases by inputting the crime scene variables and comparing the predicted offender variables to the observed offender variables

(Diagram: Data → Initial Model → Trained Model; Inputs → Trained Model → Outputs)


Belief Networks

• Bayesian Networks, B = (S, Θ)

  – Probabilistic network over nodes X1, X2, X3 with arcs X1 → X2 and X1 → X3
  – xi,j denotes variable Xi in state j

P(X1):
  P(x1,1) = 0.4    P(x1,2) = 0.6

P(X2 | X1):
                 X1 = x1,1    X1 = x1,2
  P(x2,1 | X1)      0.9          0.5
  P(x2,2 | X1)      0.1          0.5

P(X3 | X1):
                 X1 = x1,1    X1 = x1,2
  P(x3,1 | X1)      0.8          0.3
  P(x3,2 | X1)      0.2          0.7

Example of Inference

• X2 and X3 are observed • X1 is unknown, to be inferred

(Network: X1 → X2 and X1 → X3, with X1 unobserved)

• Possible states:

X1 = {x1,1, …, x1,r1}
X2 = {x2,1, …, x2,r2}
X3 = {x3,1, …, x3,r3}


Bayes Theorem

• Bayes Rule* to infer X1 when X2 = x2,r2 and X3 = x3,r3:

$$P(X_1 \mid x_{2,r_2}, x_{3,r_3}) = \frac{P(x_{2,r_2}, x_{3,r_3} \mid X_1)\, P(X_1)}{P(x_{2,r_2}, x_{3,r_3})}$$

• Marginalization of the observed variables:

$$P(x_{2,r_2}, x_{3,r_3}) = \sum_{k=1}^{r_1} P(X_1 = x_{1,k}) \prod_{j=2}^{3} P(x_{j,r_j} \mid x_{1,k})$$

• The posterior probability distribution sums to one:

$$\sum_{h=1}^{r_1} P(X_1 = x_{1,h} \mid x_{2,r_2}, x_{3,r_3}) = 1$$

*F. Jensen, Bayesian Networks and Decision Graphs, 2001
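The toy network above is small enough to invert by brute force. The following minimal Python sketch applies Bayes' rule and the marginalization step to the CPT values from the Belief Networks slide (the function name `posterior_X1` and the 0-based state indexing are my own, not from the thesis):

```python
# Exact inference in the toy BN with arcs X1 -> X2 and X1 -> X3.
# CPT values are the ones given on the Belief Networks slide.
P_X1 = [0.4, 0.6]                 # P(x1,1), P(x1,2)
P_X2_given_X1 = [[0.9, 0.1],      # row k: P(x2,1 | x1,k), P(x2,2 | x1,k)
                 [0.5, 0.5]]
P_X3_given_X1 = [[0.8, 0.2],      # row k: P(x3,1 | x1,k), P(x3,2 | x1,k)
                 [0.3, 0.7]]

def posterior_X1(x2, x3):
    """P(X1 | X2=x2, X3=x3); x2 and x3 are 0-based state indices."""
    # Numerator of Bayes' rule for each state of X1.
    joint = [P_X1[k] * P_X2_given_X1[k][x2] * P_X3_given_X1[k][x3]
             for k in range(len(P_X1))]
    evidence = sum(joint)         # marginalization over X1
    return [p / evidence for p in joint]

print(posterior_X1(0, 0))         # posterior over X1 given x2,1 and x3,1
```

The posterior always normalizes to one, matching the last equation on the slide.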

BN Training

• Set of training data, T, to learn BN, B=(S,Θ) • Use structural learning algorithm to find Soptimal

– Search method needed to decrease the search space
– Scoring metric

• Maximum Likelihood Estimation (MLE) algorithm to learn CPTs, P(Θoptimal |Soptimal, T )


Joint Probability Scoring Metric

• Relative measure of the level of compatibility of Sh given the training data T: P(Sh | T)
• P(Sh | T) ∝ P(Sh, T):

$$P(S^h, T) = P(S^h \mid T)\,P(T) \;\Rightarrow\; \frac{P(S_i^h \mid T)}{P(S_j^h \mid T)} = \frac{P(S_i^h, T)/P(T)}{P(S_j^h, T)/P(T)} = \frac{P(S_i^h, T)}{P(S_j^h, T)}$$

$$\therefore\; P(S_i^h \mid T) < P(S_j^h \mid T) \;\Leftrightarrow\; P(S_i^h, T) < P(S_j^h, T)$$


Scoring Metric Assumptions

1. All variables are discrete
2. All structures are equally likely
3. All variables are known, with no missing values
4. All cases in T occur independently given a BN model
5. No prior knowledge of the numerical properties to assign to Bh with structure Sh before observing T

$$B^h = (S^h, \Theta^h) \in \mathbf{B}$$

$$P(S^h, T) = \int_{\Theta^h} f(T \mid S^h, \Theta^h)\, f(\Theta^h \mid S^h)\, P(S^h)\, d\Theta^h \;\Rightarrow$$

$$P(S^h, T) = P(S^h) \cdot \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!$$

G.F. Cooper et al., Machine Learning, 1992.

Variable Definition for Scoring Metric

$$P(S^h, T) = P(S^h) \cdot \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!$$

• n: number of model variables
• qi: number of unique instantiations of πi, where πi = pa(Xi)
• ri: number of possible states for Xi
• Nijk: number of cases in T in which Xi = xi,k and πi is instantiated as wij, with k = 1, …, ri
• $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$
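Because the score factors over variables, it can be evaluated one family (a variable plus its parents) at a time. A minimal Python sketch, working in log space to keep the factorials from overflowing (the helper name `family_score` and the count layout `N[j][k] = N_ijk` are my assumptions):

```python
from math import lgamma

def family_score(N):
    """Log of prod_j (r-1)!/(N_ij + r - 1)! * prod_k N_ijk! for one variable.

    N[j][k] = N_ijk: number of cases with parent instantiation j and state k.
    Uses lgamma(m + 1) = log(m!) to stay in log space.
    """
    score = 0.0
    for row in N:
        r = len(row)
        n_ij = sum(row)
        score += lgamma(r) - lgamma(n_ij + r)     # log (r-1)! / (N_ij + r - 1)!
        score += sum(lgamma(n + 1) for n in row)  # log prod_k N_ijk!
    return score
```

For example, a single parent instantiation with counts [2, 1] gives (r-1)!/(N_ij+r-1)! * prod N_ijk! = 1/4! * 2! * 1! = 1/12, and `family_score` returns log(1/12).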


K2 Algorithm

• Maximizes the scoring metric
• Node ordering: X1, X2, X3
• Limit on the number of parents

$$P(S_{optimal}, T) = \max_h \left[ P(S^h, T) \right] \;\Rightarrow$$

$$\max_{B^h} \left\{ P(S^h) \cdot \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}! \right\} \xrightarrow{K2} \max_{B^h} \left\{ g = \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}! \right\}$$

K2’ Algorithm

(Example network: X4 → X1; X1, X4 → X2; X1 → X3, with CPTs P(X4), P(X1 | X4), P(X2 | X1, X4), and P(X3 | X1))

K2’ Algorithm

• Inhibit nodal connections between input nodes
  – d-separation: X3 ⊥ X2, X4 iff X1 is known
  – X2 and X3 are not affected by the X4 → X1 relationship if the parents are known
• Everything else is the same as K2


The Learned Structure

• Initial structure is an empty graph
• Final structure is learned from T

(Diagram: output nodes O1, O2, …, Om and input nodes I1, I2, …, Ik; the legend marks arcs between input nodes as present only for K2)


Parameter Learning

• Maximum Likelihood Estimation (MLE) is used for parameter learning because there are no missing values (the EM algorithm otherwise)
• MLE determines the parameters that maximize the probability (likelihood) of T

$$f(T \mid \Theta^h, S^h) = \prod_{i=1}^{t} f(C_i \mid \Theta^h, S^h) = L(\Theta^h \mid T, S^h)$$

$$\Lambda = \ln L = \sum_{i=1}^{t} \ln f(C_i \mid \Theta^h, S^h) \;\Rightarrow\; \frac{\partial \Lambda}{\partial \Theta^h} = 0$$
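For discrete CPTs with complete data, this maximization has the familiar closed-form solution θijk = Nijk / Nij, i.e. the relative frequencies of the training counts. A minimal sketch (the name `mle_cpt` is mine; cases are assumed to be tuples of 0-based state indices):

```python
def mle_cpt(data, child, parents, r_child):
    """MLE of P(child | parents) from complete data.

    Returns, for each observed parent instantiation, the relative
    frequencies theta_ijk = N_ijk / N_ij over the child's states.
    """
    counts = {}                          # parent instantiation -> [N_ijk]
    for case in data:
        key = tuple(case[p] for p in parents)
        counts.setdefault(key, [0] * r_child)[case[child]] += 1
    return {key: [n / sum(row) for n in row] for key, row in counts.items()}
```

With eight cases split 3:1 when the parent is 0 and 4:0 when it is 1, the estimated CPT rows are [0.75, 0.25] and [0.0, 1.0].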


Modeling Example

• Train a network model for a simple problem with two inputs and two outputs:

  – Inputs, CSA:
    (1) Place of aggression characteristics
    (2) Amount of disorder provoked from fight/struggle
  – Outputs, criminal profile:
    (1) Gender of the offender
    (2) Presence of sexual relations between victim and offender

• Train the model with simulated cases using Matlab
• Use evidence of the inputs to infer the outputs in new cases


Example: Variable Definition

Inputs from the Crime Scene Analysis (CSA) checklist (evidence):

• Characteristics of the Place of Aggression, node PA:
  (1) Not crowded external place (remote)
  (2) Semi-crowded external place (semi-remote)
  (3) Crowded external place
  (4) Inner place (room, building, office)

• DiSorder Provoked by fight/struggle, node DS:
  (1) In room/area where corpse is found
  (2) On all the area/room/study/office/store
  (3) In the vicinities of the area/room/study/office/store
  (4) No disorder provoked

Outputs from the network model, the criminal profile:

• Gender of offender, node G: (1) Male, (2) Female
• Presence of Sexual Relations between victim and offender, node SR: (1) Yes, (2) No


Network Model

One solved case: PA = 2 (semi-remote), DS = 4 (no disorder), G = 1 (male), SR = 2 (no)

Network variables: PA (Place of Aggression), DS (DiSorder provoked by fight), G (Gender of offender), SR (Sexual Relations between victim and offender).


Results: Percent Binary Error for Validation Set

• Error metric: if x = x*, then error = 0; if x ≠ x*, then error = 1

(Figure: percentage of each output node inferred incorrectly vs. number of training cases)


Full CP Model

• 247 cases:

  – 200 training cases (T)
  – 47 validation cases (V)

• 57 total variables:

  – 36 CS variables (inputs)
  – 21 CP variables (outputs)


Internal Stability

• Internal stability refers to the consistency of predictions


Overall Performance Summary

• Overall Predictive Accuracy (OPA):

$$OPA(\%) = \frac{K_{C,CL(\ge 50\%)}}{K_t} \times 100$$

  – Kt: total predictions
  – KC,CL(≥50%): total correct predictions with CL ≥ 50%

Algorithm                              K2      K2’
Accuracy (%)                           64.1    79
Correct predictions (number of nodes)  633     780
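The OPA computation is a straightforward filter-and-count over the model's predictions. A minimal sketch (the helper name `opa` and the `(correct, cl)` pair layout are my assumptions, not from the thesis):

```python
def opa(predictions):
    """Overall Predictive Accuracy.

    Counts predictions that are both correct and made with confidence
    level CL >= 0.5, as a percentage of all predictions K_t.
    `predictions` is a list of (correct: bool, cl: float) pairs.
    """
    k_t = len(predictions)
    k_c = sum(1 for correct, cl in predictions if correct and cl >= 0.5)
    return 100.0 * k_c / k_t
```

A correct prediction made with CL below 0.5 still counts toward the denominator but not the numerator, which is why low-confidence models score poorly on OPA.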

Confidence Level of Predictions

• Confidence Level Accuracy (CLA):

$$CLA(\%) = \frac{K_{C,CL}}{K_{CL}} \times 100$$

Algorithm (K2 || K2’):

CL range         KCL         KC,CL       CLA (%)       Δ(KC,CL)
0.5 ≤ CL < 0.7   225 || 262  140 || 162  62.2 || 61.8  22
0.7 ≤ CL < 0.9   405 || 470  334 || 386  82.5 || 82.1  52
0.9 ≤ CL         168 || 255  159 || 232  94.6 || 91    73

Zero Marginal Probability Variables

Algorithm       K2     K2’
KCL (CL ≥ 50%)  798    987

(Figure: ZMP variables)

$$PA(\%) = \frac{K_t - (K_w + K_{ZMP})}{K_t} \times 100$$

Ways to reduce ZMP variables:
1. Add more training cases
2. Declare variable independencies (K2’)
3. Decrease the number of system variables


High Frequency Variables

3. Decrease the number of system variables:

  – High-frequency variables are present in more than 50% of cases
  – High Frequency Model (HFM): the CP model with HF variables removed

CS Behavior                         Frequency
Face not hidden                     88.4%
Victim found at scene where killed  78.9%
Victim found face up                61.1%
Multiple wounds to the body         52.2%

HFM Overall Predictive Accuracy

• Negligible accuracy increase for K2’
• Decrease in the number of ZMP variables for K2

Algorithm  K2HFM  K2’HFM  K2     K2’
OPA (%)    66     79.6    64.1   79
KC,≥50%    652    786     633    780
KZMP       168    0       189    0

Frequency of Occurrence

• Frequency of occurrence is the number of times a variable state was present in the dataset
• The frequency method (F) predicts the states of V using the most frequent state in T
• The CP produced by F is therefore the same over all of V
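The frequency baseline reduces to a per-variable majority vote over the training set. A minimal sketch (the name `frequency_profile` and the dict-of-lists input layout are my assumptions):

```python
from collections import Counter

def frequency_profile(training_outputs):
    """Naive frequency baseline (method F).

    For each output variable, predict the majority state seen in training;
    the reported confidence is that state's relative frequency. The same
    profile is issued for every new case, regardless of the evidence.
    `training_outputs` maps variable name -> list of observed states.
    """
    profile = {}
    for var, values in training_outputs.items():
        state, count = Counter(values).most_common(1)[0]
        profile[var] = (state, count / len(values))
    return profile
```

This is exactly why F cannot adapt its confidence to the crime scene evidence, unlike the BN model.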


OPA for K2’ and F

• Overall Predictive Accuracy:

Algorithm                              K2’    F
Accuracy (%)                           79     79.3
Correct predictions (number of nodes)  780    784


Confidence Level of Predictions

• Confidence Level Accuracy (CLA):

Algorithm (K2’ || F):

CL range          KCL         KC,CL       CLA (%)       Δ(KC,CL)
0.5 ≤ CL < 0.7    262 || 329  162 || 216  61.8 || 65.7  -54
0.7 ≤ CL < 0.9    470 || 470  386 || 396  82.1 || 84.3  -10
0.9 ≤ CL < 0.95   139 || 141  125 || 126  90 || 89.3    -1
0.95 ≤ CL         116 || 47   107 || 46   92.2 || 98    61


Information Entropy

• Information Entropy (H) quantifies the certainty/uncertainty of a model
• The amount of information is related to the confidence of the prediction: less entropy means more confidence in the long run

$$H = -\sum_{k=1}^{r} p_k \log(p_k)$$
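The entropy of a single predicted variable is a one-line computation over its posterior distribution. A minimal sketch (the name `entropy` and the base-2 default are my choices; the thesis does not specify the logarithm base):

```python
from math import log

def entropy(p, base=2):
    """Shannon entropy H = -sum(p_i * log p_i) of a discrete distribution.

    Lower H means the model is more certain about its prediction;
    zero-probability states contribute nothing (0 * log 0 = 0 by convention).
    """
    return -sum(pi * log(pi, base) for pi in p if pi > 0)
```

A uniform binary posterior gives the maximum H of 1 bit, while a deterministic prediction gives H = 0, matching the "less entropy means more confidence" reading above.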


H for K2’ and F

• Computing H of the full joint K2’ model is infeasible
• An average of each variable's H is a suitable measure, since

$$H(X_1, \ldots, X_k) = \sum_{i=1}^{k} H(X_i \mid X_{i-1}, \ldots, X_1) \;\Rightarrow\; H(X_1, \ldots, X_k) \le \sum_{i=1}^{k} H(X_i)$$

• H(K2’)=0.43 vs. H(F)=0.48


CL Ranges for Predictions


Crime Scene Variables

• Input variables from the crime scene (evidence):

Input Variable  Definition
I1, pen         Foreign object penetration
I2, hid         Face hidden
I3, blnd        Blindfolded
I4, blnt        Wounds caused by blunt instrument
I5, suff        Suffocation (other than strangulation)

Criminal Profile Variables

• Output variables comprising the criminal profile:

Output Variable  Definition
O1, yoff         Young offender (17-21 years old)
O2, thft         Criminal record of theft
O3, frd          Criminal record of fraud
O4, brg          Criminal record of burglary
O5, rlt          Relationship with victim
O6, unem         Unemployed at time of offense
O7, male         Male
O8, famr         Familiar with area of offense occurrence

Predicted Case vs. Actual Case

Variable          Actual  K2’ Profile  Frequency Profile
Young             A       A (0.813)    A (0.805)
Theft             A       A (0.75)     A (0.54)
Fraud             P       A (0.76)     A (0.67)
Burglary          A       A (1)        A (0.67)
Relationship      A       A (1)        A (0.64)
Unemployed        P       P (0.79)     P (0.52)
Male              A       A (1)        P (0.9)
Familiar w/ area  P       P (0.91)     P (0.86)

Predicted Case vs. Actual Case

Variable          Actual  K2’ Profile  Frequency Profile
Young             A       A (0.83)     A (0.81)
Theft             P       P (0.52)     A (0.54)
Fraud             A       A (0.67)     A (0.67)
Burglary          P       A (0.54)     A (0.67)
Relationship      A       A (0.57)     A (0.64)
Unemployed        P       P (0.53)     P (0.52)
Male              P       P (0.82)     P (0.9)
Familiar w/ area  P       P (0.86)     P (0.86)

Predicted Case vs. Actual Case

Variable          Actual  K2’ Profile  Frequency Profile
Young             A       A (0.83)     A (0.81)
Theft             A       P (0.97)     A (0.54)
Fraud             P       A (0.67)     A (0.67)
Burglary          A       P (0.60)     A (0.67)
Relationship      P       A (0.59)     A (0.64)
Unemployed        A       P (0.61)     P (0.52)
Male              A       P (1)        P (0.9)
Familiar w/ area  P       P (0.87)     P (0.86)

Predicted Case vs. Actual Case

Variable          Actual  K2’ Profile  Frequency Profile
Young             A       A (0.93)     A (0.81)
Theft             A       A (0.75)     A (0.54)
Fraud             P       P (0.73)     A (0.67)
Burglary          A       A (0.82)     A (0.67)
Relationship      P       P (0.56)     A (0.64)
Unemployed        A       A (0.87)     P (0.52)
Male              P       P (0.89)     P (0.9)
Familiar w/ area  P       P (0.78)     P (0.86)

Evidence: pen (penetration), hid (face hidden), blnd (blindfolded), blnt (blunt instrument), suff (suffocation)
CP: yoff (young offender), thft (theft), frd (fraud), brg (burglary), rlt (relationship w/ victim), unmp (unemployed), famr (familiar w/ area)


Slice of K2’ DAG

(Figure: slice of the learned K2’ DAG, showing CP nodes yoff, thft, frd, brg, rlt, male, famr, and unmp connected to evidence nodes pen, suff, blnd, hid, and blnt)

Conclusions

• Due to the absence of ZMP variables, the K2’ structural learning algorithm requires fewer cases than K2
• A benefit of a BN model over the naïve frequency approach is the range of confidence levels the BN model produces in response to the evidence
• Because all of the variables are binary, the frequency approach performs better than it would if the variables had many states


Further Research

• Develop a search algorithm that increases performance for BNs
• Incorporate Salfati's expressive/instrumental dichotomy to supervise training of a BN model
• Apply the method to other fields
• Combine NN and BN methods to improve model performance


Neural Network Research

• Neural networks implemented similarly to BNs:

Algorithm                  K2’   NN
Nodes predicted correctly  780   739
Overall PA (%)             79    75


Acknowledgments

• Special thanks to

– Dr. Silvia Ferrari, advisor
– Dr. Gabrielle Salfati, John Jay College of Criminal Justice
– Dr. Marco Strano, President of the International Crime Analysis Association (ICAA)
– My Masters Committee


April Fools' Day Origin

• April 1 was originally New Year's Day in France
• In 1582, Pope Gregory decreed that the Gregorian calendar was to replace the Julian calendar
• January 1 is New Year's Day according to the new calendar
• Those who refused to accept the new calendar were "April fools"


Questions?


Laboratory for Intelligent Systems & Control Mechanical Engineering Duke University

