
Handy tools for a modern lab

Stefanie Lück

Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)


Gatersleben
Germany

EuroSciPy 2009
Topic
Software developed for the molecular biology lab
Inspired by routine lab work
Adapted to the specific needs of the scientists
Existing tools

Mostly for single samples
Rarely specific
Seldom with both a GUI and batch mode capability (multiple input/output)
Usually run on Unix/Linux servers and require assistance from server administrators etc.
Software in this talk

Primer Factory: a primer design and BLAST tool
si-Fi: an RNAi off-target prediction and RNAi design tool
Software features

Fully implemented in Python


Heavy use of Biopython, a computational molecular biology library
GUI in wxPython
Plotting with matplotlib and wx.lib.plot
Primer Factory

A primer design and BLAST tool


Central dogma of molecular biology

DNA

RNA

Protein
Polymerase Chain Reaction
(PCR)

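[Figure: PCR temperature profile. Denaturation at 95 °C, annealing at 55 °C, extension at 72 °C, repeated 30 x; 25 °C is also marked on the temperature axis.]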

Generating millions of copies of the original DNA molecule!
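The "millions of copies" figure follows from simple doubling arithmetic. A minimal sketch, assuming idealised amplification; the function name `pcr_copies` and the `efficiency` parameter are illustrative and not part of the talk:

```python
def pcr_copies(templates, cycles, efficiency=1.0):
    """Theoretical copy number after a number of PCR cycles.

    efficiency = 1.0 means every molecule is duplicated in every cycle
    (perfect doubling); real reactions are less efficient.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return templates * (1.0 + efficiency) ** cycles


# One template molecule, 30 cycles as on the slide, perfect doubling.
ideal = pcr_copies(1, 30)
# The same run with an assumed 80 % efficiency per cycle.
realistic = pcr_copies(1, 30, efficiency=0.8)
print(f"{ideal:.3g} copies (ideal), {realistic:.3g} copies (80 % efficiency)")
```

Even with a clearly imperfect efficiency, 30 cycles still yield far more than a million copies per starting molecule.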


Easily several hundred PCRs per day!
Primers must be designed
Primer stocks must be maintained
Primer search on sequences required
[Figure: a large pile of primers, with a question mark over which of them fit the sequence below]
GTCCGGGATCACGCTGCACGAGTGGTGGCGCAACGAGCAGTTCTGGGTGATCGGCGGCACGA
GCGCGCACCCGGCGGCGGTGCTGCAGGGCCTCCTCAAGGTGATCGCCGGCGTGGACATCTCC

Primer Factory
TTCACGCTCACGTCCAAGCCCGGCGGCGCAGACGACGGCGAGGAGGACACGTTCGCGGAGCT
GTACGAGGTGCGGTGGAGCTTCCTGATGGTGCCACCCGTGACCATTATGATGCTGAACGCGG
TGGCGCTGGCGGTGGGGACGGCGAGGACGCTATACAGCGAGTTCCCGCAATGGAGCAAGCTG
CTGGGCGGCGCCTTCTTCAGCTTCTGAGTGCTGTGCCACCTCTACCCCTTCGCCAAAGGCCT
CCTGGGATGATCGGCGGCACGAGCGCGCACCCGGCGGCGGTGCTGCAGGGCCTCCTCAAGGT
GATCGCCGGCGTGGACATCTCCTTCACGCTCACGTCCAAGCCCGGCGGCGCAGACGACGGCG
AGGAGGACACGTTCGCGGAGCTGTACGAGGTGCGGTGGAGCTTCCTGATGGTGCCACCCGTG
ACCATTATGATGCTGAACGCGGTGGCGCTGGCGGTGGGGACGGCGAGGACGCTATACAGCGA
GTTCCCGCAATGGGTGGAGCTTCCTGATGGTGCCACCCGTGACCATTATGATGCTGAACGCG
GTGGCGCTGGCGGTGGGGACGGCGAGGACGCTATACAGCGAGTTCCCGCAATGGAGCAAGCT
GCTGGGCGGCGCCTTCTTCAGCTTCTGAGTGCTGTGCCACCTCTACCCCTTCGCCAAAGGCC
TCCTGGGATGATCGGCGGCACGAGCGCGCACCCGGCGGCGGTGCTGCAGGGCCTCCTCAAGG
TGATCGCCGGCGTGGACATCTCCTTCACGCTCACGTCCAAGCCCGGCGGCGCAGACGACG
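Searching a sequence for primers that are already in stock can be sketched as follows. This is a simplified illustration that uses exact string matching on both strands rather than BLAST; the primer names and primer sequences are hypothetical, and only the template (the first line of the sequence above) comes from the slide.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_primer_sites(primers, template):
    """Return (name, strand, start) for every exact primer site.

    '+' means the primer itself occurs in the template,
    '-' means its reverse complement occurs (primer binds the other strand).
    """
    template = template.upper()
    sites = []
    for name, primer in primers.items():
        primer = primer.upper()
        for strand, probe in (("+", primer), ("-", reverse_complement(primer))):
            start = template.find(probe)
            while start != -1:
                sites.append((name, strand, start))
                start = template.find(probe, start + 1)
    return sorted(sites, key=lambda site: site[2])


template = "GTCCGGGATCACGCTGCACGAGTGGTGGCGCAACGAGCAGTTCTGGGTGATCGGCGGCACGA"
# Hypothetical primer stock: one forward primer, one reverse primer.
stock = {
    "fwd_1": "GGGATCACGCTGCACG",
    "rev_1": reverse_complement("GTGATCGGCGGCACGA"),
}
sites = find_primer_sites(stock, template)
for name, strand, start in sites:
    print(name, strand, start)
```

Exact matching misses primers with mismatches, which is one reason a similarity search such as BLAST is the more practical choice for a real primer stock.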
Main algorithm

Basic Local Alignment Search Tool (BLAST)

Most frequently used tool for calculating sequence similarity

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). "Basic local alignment search tool". J Mol Biol 215 (3): 403-410. doi:10.1006/jmbi.1990.9999.
Global Alignment
Align two sequences over their entire lengths

Q = GAC    T = GC
Which is the optimal alignment?

Match: score +1
Mismatch: score -1
Indel: score -2

GAC            GAC--                       GAC
GC-            ---GC                       G-C

1+(-1)+(-2)    (-2)+(-2)+(-2)+(-2)+(-2)    1+(-2)+1
score = -2     score = -10                 score = 0

Very slow: O(Q*T)
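The optimal global score can be computed with dynamic programming. The sketch below assumes the standard Needleman-Wunsch recurrence, which the slide does not name, and uses the slide's scoring scheme (+1, -1, -2); the function name is illustrative. The nested loop over both sequences is where the O(Q*T) cost comes from.

```python
def global_alignment_score(q, t, match=1, mismatch=-1, indel=-2):
    """Best global alignment score of q and t (Needleman-Wunsch)."""
    rows, cols = len(q) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one indel per base.
    for i in range(1, rows):
        score[i][0] = i * indel
    for j in range(1, cols):
        score[0][j] = j * indel
    # Fill the table: every cell looks at its three neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if q[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,   # align q[i-1] with t[j-1]
                score[i - 1][j] + indel,      # gap in t
                score[i][j - 1] + indel,      # gap in q
            )
    return score[-1][-1]


best = global_alignment_score("GAC", "GC")
print(best)  # 0, the score of the alignment GAC / G-C
```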


BLAST: a local alignment

Compares a query sequence with a database of sequences for similarity
Uses a heuristic approach that approximates the Smith-Waterman algorithm
Less accurate than a global alignment but over 50 times faster (O(Q))
How does BLAST work?
List the substrings (words) of the query sequence
Does the query substring match the target? Yes, at position XY on the target sequence
Connect hits which are close to each other
Assign a score to each hit (for example 0.27 or 0.0005)
Report the matches whose expect score is lower than a threshold
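The word lookup and the connection of neighbouring hits can be illustrated with a toy example. This is a heavily simplified sketch, not the BLAST implementation: it only handles exact words, merges hits on the same diagonal, and ranks segments by length instead of computing an expect score. The function names, the word size of 4, and the two short sequences are assumptions for illustration.

```python
from collections import defaultdict


def seed_hits(query, target, word_size=4):
    """Find every exact word match as a (query_pos, target_pos) pair."""
    index = defaultdict(list)
    for t_pos in range(len(target) - word_size + 1):
        index[target[t_pos:t_pos + word_size]].append(t_pos)
    hits = []
    for q_pos in range(len(query) - word_size + 1):
        for t_pos in index.get(query[q_pos:q_pos + word_size], []):
            hits.append((q_pos, t_pos))
    return hits


def connect_hits(hits, word_size=4):
    """Merge overlapping or touching hits on the same diagonal.

    Returns (query_start, target_start, length) tuples, longest first.
    """
    by_diagonal = defaultdict(list)
    for q_pos, t_pos in hits:
        by_diagonal[t_pos - q_pos].append(q_pos)
    segments = []
    for diagonal, starts in by_diagonal.items():
        starts.sort()
        seg_start = prev = starts[0]
        for q_pos in starts[1:]:
            if q_pos > prev + word_size:  # words no longer touch: close segment
                segments.append((seg_start, seg_start + diagonal,
                                 prev + word_size - seg_start))
                seg_start = q_pos
            prev = q_pos
        segments.append((seg_start, seg_start + diagonal,
                         prev + word_size - seg_start))
    return sorted(segments, key=lambda seg: -seg[2])


query = "ACGTACGG"
target = "TTACGTACGGTT"
segments = connect_hits(seed_hits(query, target))
print(segments)
```

The longest segment covers the whole query at target position 2; the short second segment is the kind of chance hit that a score threshold would filter out.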


GTACGAGGTGCGGTGGAGCTTCCTGATGGTGCCACCCGTGACCATTATGATGCTGAACGCGG
TGGCGCTGGCGGTGGGGACGGCGAGGACGCTATACAGCGAGTTCCCGCAATGGAGCAAGCTG
CTGGGCGGCGCCTTCTTCAGCTTCTGAGTGCTGTGCCACCTCTACCCCTTCGCCAAAGGCCT
CCTGGGATGATCGGCGGCACGAGCGCGCACCCGGCGGCGGTGCTGCAGGGCCTCCTCAAGGT
GATCGCCGGCGTGGACATCTCCTTCACGCTCACGTCCAAGCCCGGCGGCGCAGACGACGGCG
AGGAGGACACGTTCGCGGAGCTGTACGAGGTGCGGTGGAGCTTCCTGATGGTGCCACCCGTG
ACCATTATGATGCTGAACGCGGTGGCGCTGGCGGTGGGGACGGCGAGGACGCTATACAGCGA
GTTCCCGCAATGGGTGGAGCTTCCTGATGGTGCCACCCGTGACCATTATGATGCTGAACGCG
GTGGCGCTGGCGGTGGGGACGGCGAGGACGCTATACAGCGAGTTCCCGCAATGGAGCAAGCT
GCTGGGCGGCGCCTTCTTCAGCTTCTGAGTGCTGTGCCACCTCTACCCCTTCGCCAAAGGCC
TCCTGGGATGATCGGCGGCACGAGCGCGCACCCGGCGGCGGTGCTGCAGGGCCTC
Screenshots

www.snowflake-sl.info/PrimerFactory
Advantages of Primer Factory

Reuse of primers
Easy primer maintenance
Standard input format (FASTA)
Standard output format (GenBank)
Graphical presentation
Independent: users can build their own database
si-Fi

RNAi off-target prediction and RNAi design tool
RNA interference
(RNAi)

Add colour enhancing gene

Napoli C, Lemieux C, Jorgensen R (1990). "Introduction of a chalcone synthase gene into Petunia results in reversible co-suppression of homologous genes in trans". Plant Cell 2 (4): 279-289.
dsRNA -> Dicer -> siRNA -> RISC complex -> mRNA -> mRNA degradation -> no protein!
What is an off-target?

Target GTCGATGCATGCTAGCTAGCTGCTAGCTAGCTAGCTAGCT

siRNA TCGATCGACGAT

Off-target CGTGACGACGAGCTAGCTGCTACAGTCTGACGAGC
Problems
Mistargeting of unrelated genes
Which part of the sequence is best for silencing?
How does si-Fi work?
My RNAi sequence:

GTCCGGGATCACGCTGCACGAGTGGTGGCGCAACGAGCAGTTCTGGG
TGATCGGCGACGAGCGCGCACCCGGCGGCGGTGCTGCAGGGCCTCCT
CAAGGTGATCGCCGGCGTGACATCTCCTTCACGCTCACGTCCAAGCC
CGGCGGCGCAGACGACGGCGAGGAGGACACGTTCGCGGAGCGTACGA
GGTGCGGTGGAGCTTCCTGATGGTGCCACCCGTGA

Split the sequence into windows of size 21, shifting the position by +1 each time

GTCCGGGATCACGCTGCACGA
TCCGGGATCACGCTGCACGAG
CCGGGATCACGCTGCACGAGT
CGGGATCACGCTGCACGAGTG
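The splitting step is a plain sliding window. A minimal sketch, assuming 21-nt siRNAs and a shift of one position as stated on the slide; the function name is illustrative:

```python
def sirna_windows(sequence, size=21, step=1):
    """Cut a sequence into all overlapping windows of the given size."""
    sequence = sequence.upper()
    return [sequence[i:i + size]
            for i in range(0, len(sequence) - size + 1, step)]


# First 25 bases of the RNAi sequence on the slide.
windows = sirna_windows("GTCCGGGATCACGCTGCACGAGTGG")
for window in windows:
    print(window)
```

A sequence of length n yields n - 20 candidate siRNAs, so even a short RNAi construct produces a few hundred BLAST queries, which is why batch processing matters here.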

BLAST each siRNA against your database and extract only the 100 % matches

TAGGCTCGCGCGCGTCCGGGATCACGCTGCACGACTGCCGGATAGGA
             GTCCGGGATCACGCTGCACGA                 (100 % match: kept)

TAGGCTCGCGCGCGTCTGGGA-CACGCTGCACGACTGCCGGATAGGA
             GTCCGGGATCACGCTGCACGA                 (mismatch and gap: discarded)

siRNA list with 100 % matches:
GTCCGGGATCACGCTGCACGA
TCCGGGATCACGCTGCACGAG
CCGGGATCACGCTGCACGAGT
CGGGATCACGCTGCACGAGTG
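Keeping only perfect matches can be sketched with exact substring search. This stands in for the BLAST step and is not how si-Fi itself is implemented; the database is a hypothetical dictionary whose two entries are the target sequences from the slide (the second with its alignment gap removed), and the identifiers `gene_a` and `gene_b` are made up.

```python
def perfect_matches(sirnas, database):
    """Map each siRNA to the (sequence_id, position) pairs it matches exactly."""
    hits = {}
    for sirna in sirnas:
        for seq_id, seq in database.items():
            pos = seq.find(sirna)
            while pos != -1:
                hits.setdefault(sirna, []).append((seq_id, pos))
                pos = seq.find(sirna, pos + 1)
    return hits


# Hypothetical two-entry database built from the sequences on the slide.
database = {
    "gene_a": "TAGGCTCGCGCGCGTCCGGGATCACGCTGCACGACTGCCGGATAGGA",
    "gene_b": "TAGGCTCGCGCGCGTCTGGGACACGCTGCACGACTGCCGGATAGGA",
}
hits = perfect_matches(["GTCCGGGATCACGCTGCACGA"], database)
print(hits)
```

Only `gene_a` is reported; the near match in `gene_b` is dropped because it is not identical over the full 21 nt. A real search would also have to consider the reverse strand.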

Count siRNAs per sequence position

TAGGCTCGCGCGCGTCCGGGATCACGCTGCACGACTGCCGGATAGGAATCGCGTCGCTAGGATCGCGCTCGCTCTGAGAGATGCGCTCGCG
             GTCCGGGATCACGCTGCACGA
                        CGCTGCACGACTGCCGGATAG
                                           AGGAATCGCGTCGCTAGGATC
                                                                     CGCTCTGAGAGATGCGCTCGC
0000000000000111111111112222222222111111111221111111111111111111000001111111111111111111110
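Counting siRNAs per position amounts to adding one to every base that a matching siRNA covers. A minimal sketch using the target and the four siRNAs from the slide; the function name is illustrative and exact matching again stands in for BLAST:

```python
def sirna_coverage(target, sirnas):
    """Count, for every target position, how many siRNAs cover it."""
    coverage = [0] * len(target)
    for sirna in sirnas:
        start = target.find(sirna)
        while start != -1:
            for pos in range(start, start + len(sirna)):
                coverage[pos] += 1
            start = target.find(sirna, start + 1)
    return coverage


target = ("TAGGCTCGCGCGCGTCCGGGATCACGCTGCACGACTGCCGGATAGGAATCGCG"
          "TCGCTAGGATCGCGCTCGCTCTGAGAGATGCGCTCGCG")
sirnas = [
    "GTCCGGGATCACGCTGCACGA",
    "CGCTGCACGACTGCCGGATAG",
    "AGGAATCGCGTCGCTAGGATC",
    "CGCTCTGAGAGATGCGCTCGC",
]
coverage = sirna_coverage(target, sirnas)
print(target)
print("".join(str(count) for count in coverage))
```

Plotting such a profile for the intended target and for every other hit sequence shows which regions of a construct are specific and which are likely to cause off-target effects.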
Example

Off-Target?
Example of an existing tool

[Screenshot: year column listing 2004, 2003 and 2002. Out of date!]
si-Fi Advantages

Can be very specific and up to date (custom database)
Standard input format (FASTA)
Graphical presentation
Printable graphic and table output
Screenshots

www.snowflake-sl.info/si-Fi
Conclusion

Software adapted to the specific needs of the scientists
Makes life easier in the lab
Convenient for jobs with multiple inputs
Easy to use for everybody
Acknowledgment

Dr. Dimitar Douchkov


Dr. Patrick Schweizer

Biopython contributors & developers


wxPython contributors & developers
The German Python Forum
