Slides from ELIXIR Webinar on BILS-ProteomeXchange Integration - May 2015
• Fredrik Levander
• Samuel Lampa
• Janos Nagy
• Mikael Borg
• Jani Heikkinen
• Main objective: Make life easier for researchers
http://www.proteomexchange.org
Vizcaíno et al., Nat Biotechnol, 2014
Juan A. Vizcaíno ELIXIR Webinar
juan@ebi.ac.uk 20 May 2015
ProteomeXchange data workflow
[Figure: ProteomeXchange data workflow diagram]
• Researcher input: Results, Raw Data*, Metadata / Manuscript
• Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data)
• ProteomeCentral
• Other resources shown: PeptideAtlas, UniProt/neXtProt, GPMDB, Other DBs, Journals
• Legend: Researcher’s results, Reprocessed results, Raw data*
2. Result files:
mzIdentML
PRIDE XML
PRIDE Converter 2
ProteomeXchange: 1,963 datasets up until 1st April, 2015

Origin:
396 USA
224 Germany
191 United Kingdom
106 Netherlands
105 China
104 France
94 Switzerland
75 Canada
55 Japan
55 Spain
54 Denmark
52 Sweden
50 Belgium
48 Australia
34 Austria
25 Norway
23 Taiwan
22 India
21 Finland
20 Ireland
20 Italy
16 Brazil
15 Russia
14 Republic of Korea
10 Israel
10 Singapore
…

Type:
613 PRIDE complete
1177 PRIDE partial
79 PeptideAtlas/PASSEL complete
69 MassIVE
25 reprocessed

Top species studied by at least 20 datasets:
839 Homo sapiens
232 Mus musculus
79 Arabidopsis thaliana
77 Saccharomyces cerevisiae
44 Rattus norvegicus
35 Escherichia coli
21 Bos taurus
21 Glycine max
~ 460 species in total

Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 371

Publicly accessible:
959 datasets, 49% of all
88% PRIDE
9% PASSEL
3% MassIVE

Data volume:
Total: ~102 TB
Number of all files: ~250,000
PXD000320-324: ~5 TB
PXD000065: ~1.4 TB
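As a quick sanity check on the figures from this slide, the sketch below adds up the per-type and per-year counts and recomputes the publicly accessible share. All numbers are copied from the slide; the variable and function names are illustrative only, and nothing is queried from ProteomeXchange itself.

```python
# Figures copied from the slide (datasets up until 1st April, 2015).
TOTAL_DATASETS = 1963

by_type = {
    "PRIDE complete": 613,
    "PRIDE partial": 1177,
    "PeptideAtlas/PASSEL complete": 79,
    "MassIVE": 69,
    "reprocessed": 25,
}

by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 371}

PUBLIC_DATASETS = 959


def public_share_percent(public: int, total: int) -> int:
    """Return the public fraction as a percentage rounded to the nearest integer."""
    return round(100 * public / total)


# Both breakdowns should account for every dataset in the headline total.
type_total = sum(by_type.values())
year_total = sum(by_year.values())
share = public_share_percent(PUBLIC_DATASETS, TOTAL_DATASETS)

print(f"sum by type: {type_total}")
print(f"sum by year: {year_total}")
print(f"public share: {share}%")
```

Both breakdowns add up to the headline total of 1,963 datasets, and 959 of 1,963 rounds to the 49% quoted on the slide.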
• BILS provides:
• Bioinformatics support (consultancy)
• Bioinformatics infrastructure (data and tools)
Computing and storage are provided in collaboration with SNIC
• Bioinformatics network
• Nodes at each of the 6 large university cities
• Annual workshop
• Training
• Coordination with other bioinformatics activities
• Swedish node in ELIXIR
• Data processing:
• Accessible data processing workflows
[Figure: BILS scripts; public access to released raw data]
Häkkinen et al. (2009) J Proteome Res
http://www.eudat.eu
B2SAFE
• At this point (March 2015) the pilot had overrun its expected duration of 6 months, with more work required to integrate the B2SAFE replication process with the PRIDE submission pipeline.
• A detailed report has been written and sent to all the parties involved.
• Rafael Jimenez
• Bengt Persson
• EUDAT management & developers