Professional Documents
Culture Documents
Received September 15, 2011; Revised October 19, 2011; Accepted October 21, 2011
*To whom correspondence should be addressed. Tel: +1 650 723 7541; Email: cherry@stanford.edu
peer-reviewed literature. At SGD, curation has been, and annotations comprehensive and current, we explored
will continue to be, a core responsibility that we conduct whether the large set of computational predictions could
supplying a valued service to biomedical research be leveraged to find inaccuracies or omissions in our
communities. Providing comprehensive information for manual literature-based GO annotations. In an initial
gene products, derived via literature curation, is of feasibility study (6), we examined a subset of genes that
special value because it aids experimental design, facili- had a manual GO annotation that was less specific than
tates gene function discovery and enhances gene product the computational prediction. Specifically we selected
annotations provided by other genomic resources. genes that lacked published functional information
Structured controlled vocabularies (i.e. ontologies) (manual ‘unknown’ annotations) but had InterPro com-
include precise definitions representing conserved putational predictions. Review of the publications for
common biological concepts, thereby ensuring a consist- these genes found that a significant proportion of the
ent interpretation of results being represented. Controlled manual annotations could be updated to a defined, experi-
vocabularies facilitate the integration of diverse data types mentally supported function. In contrast, review of the
controlled vocabulary called the ChEBI ontology (11). strain AB972, a direct descendant of S288C and one of the
Phenotypes representing over 17 000 classical and strains used in the original 1996 published sequence (14).
high-throughput phenotype assays have been described The new reference sequence (version 64) was determined
and are available from SGD (12). by mapping next-generation sequence reads to the
previous reference (version 63) then manually inspecting
Biochemical pathways any disagreements between the new sequence and the old
Biochemical pathways are manually curated by SGD and reference. The AB972 sequence was released in February
provided using the Pathway Tools browser version 15.0 2011 as the new reference genome for S. cerevisiae strain
(13). The SGD biochemical pathways data set for S288C, providing even higher quality sequence from a
S. cerevisiae, one of the most highly curated data sets single consistent strain background for all 16 nuclear
among all Pathway Tools data sets available, is the gold chromosomes. Concurrent with the update of the refer-
standard for budding yeast; SGD supports an ongoing ence sequence we have adopted a reference versioning
system; the current version of the reference sequence
S. cerevisiae. SGD has long provided so called ‘not in sets and co-expressed genes based on patterns of expres-
S288C’ genes beginning with those that were genetically sion shared with the query gene(s) from a comprehensive
defined. We are now expanding this to include genes that collection of almost 400 data sets. The expression analysis
have been observed within the genome of any S. cerevisiae tool can be accessed from the Locus Summary via the
stain. Genes can be lost as isolated populations adapt to Expression tab and the Expression Summary histogram.
their environment, and the pan-genomic representation of Implementation of a new data pipeline and the SPELL
S. cerevisiae provides ready access to these genes, in interface was created through collaboration with the
context. For example, the strain S288C, source of the ref- SGD Colony at Princeton University.
erence sequence, is known to be missing several well Budding yeast has long been used as the initial system
studied genes involved in sugar utilization such as SUC1 for developing new high-throughput methodologies; as a
(encoding one of several invertases, sucrose hydrolyzing result, we have a wealth of data types to integrate with
enzymes), MEL1 (encoding an alpha-galactosidase specific gene products. SGD uses the GBrowse genome
required for catabolic conversion of melibiose to browser (24) to display diverse types of genomic informa-
multi-purpose tool-kit at SGD. submitted to Database for and tool development at SGD. We also provide significant
Biocuration 2012 virtual issue) is a multifaceted search and community news such as future scientific meetings and
retrieval tool that provides access to diverse types of new laboratory resources. Blog-style news announcements
curated and systematic data. Searches can be initiated will be available on the SGD home page providing news
with a list of genes, a list of GO terms, or results from noteworthy for the fungal genetics researcher. We are
multiple searches of a variety of data types. At each step, updating our software to allow reporting of news about
the results can be combined for further analysis. Queries yeast genomics, notable awards to community members,
can be customized by modifying predefined templates, or publication of highly significant results and new methods.
by creating completely new searches via the QueryBuilder. All posts are also distributed via RSS feed.
Customized queries facilitate access and retrieval of
specific types of data, without requiring additional Social media
programming support at SGD. The search and its
results can be saved or downloaded in customizable file SGD is now on Facebook and Twitter. Users can ‘Like’ us
2. Goeckeler,J.L. and Brodsky,J.L. (2010) Molecular chaperones and 15. Mortimer,R.K. and Johnston,J.R. (1986) Genealogy of principal
substrate ubiquitination control the efficiency of endoplasmic strains of the yeast genetic stock center. Genetics, 113, 35–43.
reticulum-associated degradation. Diabetes Obes. Metab., 16. Fan,H.Y., Cheng,K.K. and Klein,H.L. (1996) Mutations in the
12(Suppl.), 32–38. RNA polymerase II transcription machinery suppress the
3. Rinaldi,T., Dallabona,C., Ferrero,I., Frontali,L. and Bolotin- hyper-recombination mutant hpr1 delta of Saccharomyces
Fukuhara,M. (2010) Mitochondrial diseases and the role of the cerevisiae. Genetics, 142, 749–759.
yeast models. FEMS Yeast Res., 10, 1006–1022. 17. Kane,S.M. and Roth,R. (1974) Carbohydrate metabolism during
4. Bharadwaj,P., Martins,R. and Macreadie,I. (2010) Yeast as a ascospore development in yeast. J. Bacteriol., 118, 8–14.
model for studying Alzheimer’s disease. FEMS Yeast Res., 10, 18. Carlson,M. and Botstein,D. (1983) Organization of the SUC gene
961–969. family in Saccharomyces. Mol. Cell Biol., 3, 351–359.
5. Barros,M.H., da Cunha,F.M., Oliveira,G.A., Tahara,E.B. and 19. Naumov,G., Turakainen,H., Naumova,E., Aho,S. and
Kowaltowski,A.J. (2010) Yeast as a model to study mitochondrial Korhola,M. (1990) A new family of polymorphic genes in
mechanisms in ageing. Mech. Ageing Dev., 131, 494–502. Saccharomyces cerevisiae: alpha-galactosidase genes MEL1-MEL7.
6. Costanzo,M.C., Park,J., Balakrishnan,R., Cherry,J.M. and Mol. Gen. Genet., 224, 119–128.
Hong,E.L. (2011) Using computational predictions to improve 20. Hall,C. and Dietrich,F.S. (2007) The reacquisition of biotin
literature-based Gene Ontology annotations: a feasibility study.