Gene Regulation
x_i, w_{i,j} ∈ R

Or in matrix notation:

x_{t+1}^{(n×1)} = W^{(n×n)} · x_t^{(n×1)}
Let A be the (k × n) matrix with rows equal to the first k vectors, i.e. A = [y_1; y_2; ⋯; y_k], and B be the (k × n) matrix with rows equal to the last k vectors, i.e. B = [y_2; y_3; ⋯; y_{k+1}].

Then, the linear system becomes:

W = A* B,  A* = (A^T A)^{-1} A^T   (k ≥ n)
W = A* B,  A* = A^T (A A^T)^{-1}   (k < n)

(the two forms of the pseudoinverse of A, for the overdetermined and underdetermined cases)
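A minimal NumPy sketch of the pseudoinverse solution above. For illustration the k observed transitions are generated independently (rather than as one long trajectory), and all names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 3, 20                       # n genes, k >= n observed transitions
W_true = rng.normal(size=(n, n))   # hypothetical "true" regulation weights

A = rng.normal(size=(k, n))        # rows: states y_t
B = A @ W_true.T                   # rows: successor states y_{t+1} = W y_t

# B = A W^T  =>  W^T = A* B, with A* = (A^T A)^{-1} A^T  (k >= n case)
A_star = np.linalg.inv(A.T @ A) @ A.T
W_est = (A_star @ B).T

print(np.allclose(W_est, W_true))  # prints True (noise-free data)
```

With noisy microarray data the same formula gives the least-squares fit rather than an exact recovery.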
[Figure: example expression profiles over time, with activator (A) and inhibitor (I) annotations]

ECS289A, UCD SQ’05, Filkov
A system for inferring gene-regulation networks:
• Filter (thresholding)
• Cluster (average link clustering)
• Curve Smoothing (peak identification)
• Inferring Regulation Scores
• Optimizing regulation assignment
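The first step of the pipeline, filtering by thresholding, can be sketched as follows. This is a hypothetical illustration (the actual threshold and criterion used by the system are not given in the slides): genes whose expression never varies beyond a cutoff are discarded.

```python
import numpy as np

# Hypothetical sketch of the "Filter (thresholding)" step:
# keep only genes whose expression varies enough across time points.
rng = np.random.default_rng(1)
expr = rng.normal(size=(100, 17))   # 100 genes x 17 time points (synthetic)
threshold = 1.0                     # assumed cutoff, not from the slides

variation = expr.max(axis=1) - expr.min(axis=1)
kept = expr[variation > threshold]
print(kept.shape[0], "genes pass the filter")
```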
Yeast genome microarray data, Cho et al. (1998)
Repeated runs (random starts) produce different A/I networks
Observations:
• an activator/inhibitor pair that regulates 15 other clusters
• cell cycle regulator
• DNA replication
• amino acid synthesis
[Diagram: A ⇒ C, an indirect relationship]
• How can we distinguish between direct and indirect
relationships in a network based on microarray data?
• Additional assumptions needed
• In the previous model: optimize f(grade,#regulators)
• Next: minimize # relationships
• Trajectories: series of state transitions
• Attractors: repeating trajectories
• Basin of Attraction: all states leading to an attractor
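The three concepts above can be seen in a toy Boolean network. The update rules below are invented for illustration (not from the slides); the code follows each of the 2^3 states until a state repeats, yielding its trajectory, its attractor, and the basins.

```python
from itertools import product

# Toy 3-node Boolean network (rules are illustrative, not from the slides):
# x1' = x2 AND x3,  x2' = x1,  x3' = NOT x1
def step(state):
    x1, x2, x3 = state
    return (x2 and x3, x1, not x1)

def trajectory(state):
    """Follow state transitions until a state repeats (an attractor is hit)."""
    seen, path = {}, []
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = step(state)
    return path, path[seen[state]:]   # full trajectory, attractor cycle

# Basin of attraction: group all 2^3 states by the attractor they reach
basins = {}
for s in product([False, True], repeat=3):
    _, cycle = trajectory(s)
    basins.setdefault(min(cycle), []).append(s)
print({k: len(v) for k, v in basins.items()})  # → {(False, False, True): 8}
```

Here every state flows into the single fixed point (False, False, True), so its basin is the whole state space.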
• Available algorithms:
– Akutsu et al.
– Liang et al. (REVEAL)
• Correctness: Exhaustive
• Time: examine all 2^{2^2} = 16 Boolean functions of 2 inputs, for all node triplets, and all m examples: O(2^{2^2} · n^3 · m)
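The exhaustive step can be sketched as follows: for one target gene, every input pair and each of the 2^{2^2} = 16 two-input truth tables is checked for consistency with all m examples. The data below are synthetic and the function names are hypothetical.

```python
from itertools import combinations, product

def find_rules(states, target_data):
    """states: m state tuples; target_data: the target gene's next value."""
    n = len(states[0])
    rules = []
    for i, j in combinations(range(n), 2):          # every input pair
        for table in product([0, 1], repeat=4):     # all 16 truth tables
            f = lambda a, b: table[2 * a + b]
            if all(f(s[i], s[j]) == t for s, t in zip(states, target_data)):
                rules.append(((i, j), table))
    return rules

# m = 4 example transitions for a 3-gene network; the target happens to
# follow x_target' = x0 AND x1 (synthetic data for illustration)
states = [(0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
target = [s[0] & s[1] for s in states]
rules = find_rules(states, target)
print(len(rules), "consistent (pair, truth-table) rules found")  # 2
```

With few examples several rules remain consistent, which is why more data (larger m) is needed to pin down the true function.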
P(X_1, X_2, ..., X_n) = ∏_{i=1}^{n} P(X_i | Parents(X_i))
[Diagram: Bayesian network over C, S, R, W with CPTs P(S|C), P(R|C), P(W|S,R)]

Joint: P(C,R,S,W) = P(C)·P(S|C)·P(R|C)·P(W|S,R)
Independencies: I(S;R|C), I(R;S|C)
Dependencies: P(S|C), P(R|C), P(W|S,R)
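The factored joint can be computed directly from the CPTs. Reading C, S, R, W as the classic cloudy/sprinkler/rain/wet-grass variables, the CPT numbers below are assumed for illustration only:

```python
from itertools import product

# CPT numbers are assumed for illustration, not taken from the slides.
P_C = {1: 0.5, 0: 0.5}
P_S_given_C = {1: {1: 0.1, 0: 0.9}, 0: {1: 0.5, 0: 0.5}}   # P(S=s | C=c)
P_R_given_C = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.2, 0: 0.8}}
P_W_given_SR = {(1, 1): 0.99, (1, 0): 0.9, (0, 1): 0.9, (0, 0): 0.0}

def joint(c, s, r, w):
    """P(C,R,S,W) = P(C) * P(S|C) * P(R|C) * P(W|S,R)"""
    pw = P_W_given_SR[(s, r)]
    return P_C[c] * P_S_given_C[c][s] * P_R_given_C[c][r] * (pw if w else 1 - pw)

# Sanity check: the factored joint sums to 1 over all 2^4 assignments
total = sum(joint(c, s, r, w) for c, s, r, w in product([0, 1], repeat=4))
print(round(total, 10))  # → 1.0
```

The factorization stores 1 + 2 + 2 + 4 = 9 parameters instead of the 2^4 − 1 = 15 a full joint table would need.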
1. Bayes Logic
2. Graphs
3. Bayes Nets
4. Learning Bayes Nets
5. BN and Causal Nets
6. BN and Regulatory Networks
P(x_1, x_2, ..., x_n) = ∏_{i=1}^{n} P(x_i | x_1, x_2, ..., x_{i-1}) = ∏_{i=1}^{n} P(x_i | Pa(x_i))
Example for maximizing P(G) (from David Danks, IHMC)
P(D | G) = ∫ P(D | G, θ) P(θ | G) dθ
• graphs that capture the exact properties of the network (i.e. all dependencies in the distribution) very likely score higher than ones that do not (given a large # of samples)
• Score is decomposable:

S(G : D) = ∑_{i=1}^{n} ScoreContribution(X_i, Pa(X_i) : D)
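Decomposability can be sketched with a simplified per-family score. The contribution below is the maximum-likelihood log-likelihood of each family computed from counts; real scores like BDe or BIC add priors or complexity penalties, and all data and names here are illustrative.

```python
import math
from collections import Counter

# Simplified decomposable score: per-family log-likelihood contributions.
# (Real scores such as BDe or BIC add priors / complexity penalties.)
def family_contribution(data, child, parents):
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    return sum(n * math.log(n / parent_counts[pa]) for (pa, _), n in joint.items())

def score(data, graph):
    """graph: dict child -> list of parents; the score sums over families."""
    return sum(family_contribution(data, x, pa) for x, pa in graph.items())

# Tiny binary dataset over (X0, X1, X2); values are illustrative only.
data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 0), (1, 1, 1)]
g_empty = {0: [], 1: [], 2: []}
g_chain = {0: [], 1: [0], 2: [0, 1]}
print(score(data, g_chain) >= score(data, g_empty))  # richer graph fits no worse
```

Because the score is a sum over families, a local search move that changes one node's parent set only requires recomputing that node's contribution.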
• Which metric:
– Bayesian Dirichlet equivalent (BDe): captures P(G|D)
– Bayesian Information Criterion (BIC): an approximation
• Which data discretization? Hard thresholds into 2, 3, or 4 levels?
• Which Priors?
• Which heuristics?
– simulated annealing
– hill climbing
– GA, etc.
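The hill-climbing heuristic can be sketched as follows. The score here is a stand-in objective invented for the example (in practice it would be BDe or BIC, and decomposability means only the changed family's contribution is recomputed); the search greedily accepts any single-edge change that improves it.

```python
import random

# Hedged sketch of structure search by hill climbing: repeatedly propose a
# random single-edge change and keep it if the (stand-in) score improves.
random.seed(0)
n = 4
edges = set()

def score(edges):
    # Stand-in objective for illustration: prefer one particular structure.
    target = {(0, 1), (1, 2), (2, 3)}
    return -len(edges ^ target)

best = score(edges)
for _ in range(500):
    i, j = random.sample(range(n), 2)
    candidate = set(edges)
    candidate.symmetric_difference_update({(i, j)})   # add or remove one edge
    s = score(candidate)
    if s > best:                                      # greedy uphill step
        edges, best = candidate, s
print(edges == {(0, 1), (1, 2), (2, 3)})
```

Greedy moves can stall in local maxima on realistic scores, which is why the slides also list simulated annealing and genetic algorithms.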