Professional Documents
Culture Documents
AMT2006
AMT2006
2 2 2 1.5
1
Data set 1 Data set 2
Gene P
1.5 1.5
1.5 0.5 YCR042C,YHL025W YCR040W,YIL115C
0 YCR042C,YHR202W YCR042C,YKR051W
1 1
1 −0.5
−1
YCR042W,YIL049W YCR042C,YML107C YDL006W, YER042W 0.54%
YDL002C,YBR092C YCR042C,YHR202W
expression
expression
expression
0.5 −1.5
0 5 10
time
15 20 YDL002C, YCR042C YCR091W,YDR540C YDL058W,YHR004C 0.37%
0.5 0.5
YDL002C,YEL052W YDL006W,YER042W
0 1.5
YDL006W,YER042W YDL26W,YNL159C YCR042C,YHR202W 0.23%
1
YDL034W,YER054W 0.11%
0 0
YDL034W,YBR054W YDL034W,YBR054W
Gene Q
−0.5 0.5
0 YDL054C,YAL019W YDL036C,YIL065W
−0.5 −0.5
−1 −0.5 YDL058W,YFL048C YDL058W,YJL037W
−1 YDL058W,YHR066W YDL058W,YHR004C
−1
0 5 10 15 20
−1
0 5 10 15 20
−1.5
0 5 10 15 20
−1.5
0 5 10 15 20 YDL076C,YCR066W YDL058W,YLR241W
time
time time time
Abstract 0.2
2. Results
0.1
We use local extrema in microarray time series 0 We applied the method to two yeast datasets of Spell-
data as the basis for discretisation. By com- −0.1 man et. al. and validated the resulting predictions by
paring discretised gene expressions using simi- negative−positive using a public Gold standard protein-to-protein interaction
expression
−0.2
zero transition,
larity functions, we discover putative protein-to- −0.3 positive−negative indicating a
local
database. We compared the results to a similar approach
zero transition,
protein interactions. We validate the results by −0.4
indicating a minimum where discretisation was based on per gene thresholding
use of public true positive and true negative −0.5
local
maximum
as shown in Figure 4.
databases for protein-to-protein interactions and swi4 epxression
−0.6
demonstrate the high predictiveness of these lo- swi4 first−order derivative
zero axis 2
1.5
up regulated
expression
change
extrema. 0.5
1.1 Smoothing
0
We create a smoothed version of our original signal h, Discretisation is now straightforward,
which removes noise and effectively captures the proper- • a positive-negative transition becomes a maximum −0.5
down regulated
−0.2
dependent data sets using a logical AND (Figure 1e). Table 1: # pred. is the number of predictions % TP is
−0.4 the percentage true-positives, % TN is the percentage true-
negatives.
−0.6 1.5 Confidence intervals
For each detected local extrema tl in a gene pattern we Results are shown in Table 1. Above: the results of
−0.8
calculate the probability of a type I error, P (tl ). The total the method with varying options: no options; increased
−1
confidence for a gene Q with n extrema is then calculated smoothing factor; comparison using the lagfunction; ignor-
0 5 10 15 20
by the geometric mean: ing extrema at the first and last samples of the series. Be-
time
low: results from discretisation based on thresholding.
1
n
Figure 2: Smoothing of swi4 expression, across 19 time n
Y
samples. The original signal in blue; the smooth signal in CQ = (1 − P (tl )) (4)
3. Conclusion
green. l=1
We rank predictions according to their confidence (Fig- Local extrema are a feature in time-series gene
1.2 Discretisation ure 1f). For any relation consisting of genes Q and R the expression data that can be used for finding bi-
We compute the regularized (smoothed) derivative of the confidence measure becomes: ologically relevant interactions (e.g., protein-to-
gene expression time signal by convolution with the first-
order derivative of the Gaussian function as: CQR = CQ · CR (5) protein interactions). The applied method is in-
variant under scaling and shifting and can be ad-
D(h ∗ k)(t) = (h ∗ Dk)(t) = (Dh ∗ k)(t) (3) justed for the amount of experimental noise.
2 2 1 1
rad51 pol32 pob1 pol2
msh2 rad51 ase1 msh2
1.5
1.5 0.5 0.5
1 0 0
0.5
expression
expression
expression
expression
−0.5
0 −1 −1
−1
−2 −1 −2 −2
0 5 10 15 20 0 5 10 15 20 0 5 10 15 20 0 5 10 15 20
time time time time
AMT 2006, International Conference on Advances in Microarray Technology, 31 October - 2 November, Amsterdam, The Netherlands