
EXPERIMENT No.

DATE :

OBJECTIVE
To study the various databases of NCBI.
URL: www.ncbi.nlm.nih.gov

Theory
A database is a computerized archive used to store and organize data so that
information can be retrieved easily using a variety of search criteria. Besides data
retrieval, the main purpose of biological databases is knowledge discovery. Examples of
biological databases are GenBank and PubMed, both of which are hosted by NCBI.

NCBI stands for National Center for Biotechnology Information. It is a resource for
information about all aspects of biotechnology, and it indexes and catalogs large amounts
of data on research in biology, genetics, and other fields related to biotechnology.

NCBI was created at the National Institutes of Health in 1988 to develop information
systems for molecular biology. It is part of the United States National Library of
Medicine (NLM), a branch of the National Institutes of Health (NIH), and it is approved
and funded by the government of the United States. NCBI is located in Bethesda, Maryland.
It creates and maintains over 40 integrated databases for the medical and scientific
communities as well as the general public. Its website receives over 3 million visitors
daily, approximately 27 terabytes of data are downloaded per day, and the number of users
and downloads increases dramatically each year. With more than 15 million indexed
articles and growing, the NCBI database is one of the largest databases in existence
today. It is searchable and holds information about different scientific research
projects, such as genome sequencing data and other genetic information from experiments.

The NCBI houses a series of databases relevant to biotechnology and biomedicine and is an
important resource for bioinformatics tools and services. Major databases include
GenBank for DNA sequences and PubMed, a bibliographic database for biomedical literature.
Other databases include the NCBI Epigenomics database. All these databases are available
online through the Entrez search engine.

The major functions of NCBI are:

1. Create public databases for storing, retrieving, and analyzing knowledge about
   molecular biology, biochemistry, and genetics.
2. Conduct research in computational biology, for analyzing the structure and
   function of biological molecules.
3. Develop software tools for analyzing genomic data.
4. Disseminate biomedical information.
5. Gather biotechnology information worldwide.

Various databases of NCBI are:

A. Literature Database: Literature databases are used to identify articles from
peer-reviewed journals and other types of periodicals. NCBI's literature resources form
the world's largest repository of medical and scientific abstracts, full-text articles,
books and reports, as well as supporting resources for cataloging and indexing the
materials.

Some of the literature databases are listed below:

I. Bookshelf: The URL for Bookshelf is https://www.ncbi.nlm.nih.gov/books/

A collection of biomedical books that can be searched directly or from linked data
in other NCBI databases. The collection includes biomedical textbooks, other
scientific titles, genetic resources such as GeneReviews, and NCBI help manuals.

II. PubMed: The URL for PubMed is https://pubmed.ncbi.nlm.nih.gov/

A database of citations and abstracts for biomedical literature from MEDLINE and
additional life science journals. Links are provided when full text versions of the
articles are available via PubMed Central or other websites.

III. PubMed Central (PMC): The URL is https://www.ncbi.nlm.nih.gov/pmc/

A digital archive of full-text biomedical and life sciences journal literature,
including clinical medicine and public health.

B. Genomes Database: Genome resources include information on large-scale genomics
projects, genome sequences and assemblies, and mapped annotations, such as
variations, markers and data from epigenomics studies. Assembly, BioSample and SRA are
also part of the Genomes databases.
I. Nucleotide: The Nucleotide database is a collection of sequences from several
sources, including GenBank, RefSeq, TPA and PDB. INSDC (International
Nucleotide Sequence Database Collaboration) is a consortium comprising
GenBank, DDBJ (DNA Data Bank of Japan) and the EMBL (European
Molecular Biology Laboratory) nucleotide sequence database. GenBank is a
primary nucleotide sequence repository.
II. Taxonomy: The Taxonomy Database is a curated classification and
nomenclature for all of the organisms in the public sequence databases. This
currently represents about 10% of the described species of life on the planet.
III. Genome: Contains sequence and map data from the whole genomes of over 1000
organisms. The genomes represent both completely sequenced organisms and
those for which sequencing is in progress. All three main domains of life (bacteria,
archaea, and eukaryota) are represented, as well as many viruses, phages,
viroids, plasmids, and organelles. It organizes information on genomes including
sequences, maps, chromosomes, assemblies, and annotations.

C. Protein Databases: NCBI's Protein resources include protein sequences and
structures and related comparison and visualization tools, as well as databases and tools
to predict and analyze functional domains. The Protein database is a collection of
sequences from several sources, including translations from annotated coding regions in
GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB. Protein
sequences are the fundamental determinants of biological structure and function.
I. PDB: The URL for PDB is https://www.rcsb.org/

PDB stands for Protein Data Bank. It is the single worldwide archive of structural
data of biological macromolecules. It includes data obtained by X-ray
crystallography and nuclear magnetic resonance (NMR) spectroscopy, submitted by
biologists and biochemists from all over the world.

II. UniProt: The URL is https://www.uniprot.org/

The Universal Protein Resource (UniProt) is a comprehensive resource for protein
sequence and annotation data. The UniProt databases are the UniProt Knowledgebase
(UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc).

Procedure:
Steps to access a database on NCBI

Step 1: Open the URL: http://www.ncbi.nlm.nih.gov

Step 2: In the Entrez search bar, enter the search term, e.g. Lipase.

Step 3: The search results display the different databases maintained by NCBI, along
with the number of hits in each database. For example, click on Nucleotide to view
nucleotide sequence data.

Step 4: Access the database and view your data.
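
The same search can also be run programmatically. The following is a minimal sketch, not part of the original procedure, that assumes the NCBI E-utilities ESearch service and its JSON output. It builds a query URL for one Entrez database and reads the hit count from the response. The function names (build_esearch_url, parse_hit_count, fetch_hit_count, print_hit_counts) and the choice of databases are illustrative assumptions.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Entrez names for some of the databases described in the Theory section
# (an illustrative selection, not a complete list).
DATABASES = ["pubmed", "pmc", "books", "nucleotide", "protein", "taxonomy", "genome"]


def build_esearch_url(db, term):
    """Return an ESearch URL that asks only for the hit count of `term` in `db`."""
    # retmax=0 requests no record IDs, only the summary fields such as the count.
    params = {"db": db, "term": term, "retmode": "json", "retmax": 0}
    return ESEARCH_URL + "?" + urlencode(params)


def parse_hit_count(response_text):
    """Extract the number of hits from an ESearch JSON response."""
    return int(json.loads(response_text)["esearchresult"]["count"])


def fetch_hit_count(db, term, timeout=30):
    """Query NCBI over the network and return the number of hits."""
    with urlopen(build_esearch_url(db, term), timeout=timeout) as response:
        return parse_hit_count(response.read().decode("utf-8"))


def print_hit_counts(term):
    """Print the hit count for `term` in each database, as in Step 3."""
    for db in DATABASES:
        print(f"{db}: {fetch_hit_count(db, term)}")
        # Pause between requests to stay within NCBI's usage guidelines.
        time.sleep(0.5)
```

Calling print_hit_counts("lipase") requires network access and prints one line per database. The counts change as the databases grow, so no expected output is shown here.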

Results:

Fig 1: Homepage of NCBI


Fig 2: Search results for Lipase in different databases of NCBI.

Fig 3: Search results in Nucleotide database
