
RETRIEVAL TOOLS

ENTREZ:
The Entrez Global Query Cross-Database Search System is a search engine,
or web portal, that allows users to search many discrete health sciences
databases at the National Center for Biotechnology Information (NCBI) website.
•The key feature of Entrez is its ability to integrate information. This
comes from cross-referencing between NCBI databases based on pre-existing,
logical relationships between individual entries.

•This is highly convenient: users do not have to visit multiple
databases located in disparate places. For example, a nucleotide
sequence page may offer cross-referencing links to the translated
protein sequence, genome mapping data, the related PubMed
literature and, if available, protein structures.
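
As a rough illustration of these cross-database links, the sketch below builds request URLs for ELink, part of the NCBI E-utilities (the programmatic interface to Entrez, which the notes above do not cover). The record ID is a hypothetical placeholder, and the database names passed in are assumptions about the E-utilities naming. The snippet only constructs URLs and performs no network access.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (programmatic access to Entrez).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_elink_url(source_db: str, target_db: str, record_id: str) -> str:
    """Return an ELink URL asking for target_db records linked to record_id."""
    params = {"dbfrom": source_db, "db": target_db, "id": record_id}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# "12345" is a hypothetical placeholder ID, not a real record.
for target in ("protein", "pubmed", "structure"):
    print(build_elink_url("nuccore", target, "12345"))
```

Each URL corresponds to one kind of link mentioned above: from a nucleotide record to its protein, literature, or structure neighbours.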

Entrez is the search tool on the NCBI website. It covers a variety of databases:

-Nucleotide sequence

-Protein sequence

-Molecular structure

-SNPs

-Expression data

-Journal literature

•Each “database” contains “records”, and each “record” contains “fields”.
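
The database, record, and field hierarchy can be pictured with a small data model. This is only a toy sketch: the class names, the `search` helper, and the record values are all made up for illustration and do not reflect how Entrez stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One entry in a database: a mapping of field name to value."""
    fields: dict[str, str] = field(default_factory=dict)


@dataclass
class Database:
    """A named collection of records."""
    name: str
    records: list[Record] = field(default_factory=list)

    def search(self, field_name: str, term: str) -> list[Record]:
        """Return records whose given field contains term (case-insensitive)."""
        term = term.lower()
        return [r for r in self.records
                if term in r.fields.get(field_name, "").lower()]


# Toy records with made-up values, for illustration only.
nucleotide = Database("nucleotide", [
    Record({"title": "example gene A mRNA", "organism": "Homo sapiens"}),
    Record({"title": "example gene B mRNA", "organism": "Mus musculus"}),
])
hits = nucleotide.search("organism", "homo sapiens")
print(len(hits))  # 1
```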

•Entrez is a system of more than 40 linked databases

•It is a text search engine

•A tool for finding biologically linked data

•A retrieval engine

•It is a virtual workspace for manipulating large datasets

SEARCHING IN ENTREZ:

The Entrez retrieval system uses an intuitive user interface for
rapidly searching sequence and bibliographic data.
A unique feature of the system is its use of precomputed
similarity searches for each record to create links to "neighbours", or related
records, in other Entrez databases.

SEARCH OPTIONS:

Entrez queries can be single words, short phrases, sentences, database
identifiers, gene symbols, or names: just about anything.

Simple searches often return an overwhelming number of results, or
no results at all. A number of built-in Entrez features
can help in creating more effective queries.

These include Boolean operators, query translation, and fielded searching
using any of the indexed fields available for the database.

ENTREZ CONVENTIONS:

Using Boolean operators:

Boolean operators provide a way of generating precise queries that produce
well-defined sets of results. The Boolean operators used in Entrez, and how
they work, are as follows.
AND: Finds documents that contain the terms on both sides of the operator:
the intersection of both searches.
OR: Finds documents that contain either term: the union of both searches.
NOT: Finds documents that contain the term on the left but not the term on
the right of the operator: the subtraction of the right-hand search from the
one on the left.
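
The three operators map directly onto set operations. The minimal sketch below uses two hypothetical sets of record IDs to stand in for the results of two separate searches.

```python
# Hypothetical sets of record IDs matched by two separate searches.
hits_a = {1, 2, 3, 4}
hits_b = {3, 4, 5}

and_result = hits_a & hits_b  # AND: intersection of both searches
or_result = hits_a | hits_b   # OR: union of both searches
not_result = hits_a - hits_b  # NOT: left-hand search minus right-hand search

print(sorted(and_result))  # [3, 4]
print(sorted(or_result))   # [1, 2, 3, 4, 5]
print(sorted(not_result))  # [1, 2]
```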

Three ways to search:

•Basic: just enter your search terms
•Advanced: a more controlled search that uses Limits, Preview/Index and History
•Complex Boolean: a command language with qualifiers in brackets; syntax:
term[field] AND term[field], etc.
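
The bracket syntax above can be assembled programmatically. The helper names `fielded` and `combine` below are hypothetical, chosen only for this sketch; the example terms are taken from these notes.

```python
def fielded(term: str, field_name: str) -> str:
    """Format one qualified term as term[field]."""
    return f"{term}[{field_name}]"


def combine(operator: str, *terms: str) -> str:
    """Join already formatted terms with AND, OR, or NOT."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return f" {operator} ".join(terms)


query = combine(
    "AND",
    fielded("thyroid peroxidase", "title"),
    fielded("Homo sapiens", "organism"),
)
print(query)  # thyroid peroxidase[title] AND Homo sapiens[organism]
```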

ENTREZ TABS:
Limits: Provides a simple form for applying commonly used Entrez limits
and helps to restrict the search to a subset of a particular database.
It can also be set to restrict a search to a particular field
(e.g., author or publication date) or a particular type
of data (e.g., chloroplast DNA/RNA).

Preview/Index: Allows access to the full indexing of each Entrez database
and aids in constructing complex queries.

History: Provides access to previous searches in the current Entrez
database.

Clipboard: A temporary storage area for selected records.

Details: Displays the detailed parsing of the current Entrez query, and
lists errors and terms without matches.

LIMITS:

Limits vary by database. In Entrez Nucleotides, for example, you can limit
by:
•search field
•source database (subset)
•molecule type
•gene location
•modification date
•exclude certain categories of records

SEARCH FIELD:

Nearly all search boxes that appear on the NCBI site access the Entrez
system. The search box at the top of the NCBI homepage is a convenient
place to begin Entrez searches.
The search box on the NCBI homepage also has a pull-down list that allows
selection of any of the individual databases.

Entrez nucleotide:

Example:
Title: HFE
Organism: Human / Homo sapiens
After the required filters are entered, all available records that match
them are displayed.
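
The HFE example can also be written as a single fielded query and sent through ESearch, the E-utilities search call. This is a sketch under assumptions: the notes describe only the web form, `nuccore` is assumed here as the E-utilities name for the nucleotide database, and the snippet builds the URL without contacting NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (programmatic access to Entrez).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for a query term against one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Title and organism limits expressed as one Boolean query.
url = build_esearch_url("nuccore", "HFE[Title] AND Homo sapiens[Organism]")
print(url)
```

`urlencode` percent-encodes the brackets and turns spaces into `+`, so the query survives transport in a URL.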
PROCEDURE:

1. Open the NCBI home page.

2. Type any query word in the query box. This displays the list of
results associated with that query from all the databases in
Entrez. In this example, the user searches for “insulin”.

3. The results are displayed in rectangular boxes, one per database,
each with a short description of the database and, on the left side,
its number of records (hits).

4. Select any database that holds sequence records, for
example, Nucleotide.

5. The result page displays the nucleotide database records for different
organisms with different features. A message at the top of the page
gives the name of the selected database and its
number of records. Along with nucleotide records, the page
also gives information on EST (Expressed Sequence Tag) and GSS
(Genome Survey Sequence) records.

6. Each result is displayed with its unique accession number and
GI number, and a short description that includes the number of
base pairs. Users can filter the results according to their preference by
clicking the “Filter your results” option.

7. Select a specific organism under “Top Organisms”, which is
displayed on the right side of the result page with the
corresponding number of records.

8. A selected record is displayed in GenBank format. Switch to FASTA
format by clicking the FASTA option displayed at the top.
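
The same choice between GenBank and FASTA output exists when records are fetched programmatically. The sketch below assumes the E-utilities EFetch call with `rettype` values `gb` and `fasta`, and uses the accession M17755 that appears later in these notes. The `detect_format` helper is hypothetical; it relies only on the fact that FASTA records start with `>` and GenBank flat files start with `LOCUS`. No network request is made.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (programmatic access to Entrez).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_efetch_url(accession: str, fmt: str) -> str:
    """Return an EFetch URL for one nucleotide record in 'gb' or 'fasta' format."""
    if fmt not in {"gb", "fasta"}:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"db": "nuccore", "id": accession, "rettype": fmt, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def detect_format(record_text: str) -> str:
    """Guess the flat-file format from the first line of a record."""
    first_line = record_text.lstrip().split("\n", 1)[0]
    if first_line.startswith(">"):
        return "fasta"
    if first_line.startswith("LOCUS"):
        return "gb"
    return "unknown"


print(build_efetch_url("M17755", "fasta"))
# Placeholder record text, not real sequence data.
print(detect_format(">placeholder description\nACGT\n"))  # fasta
```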

LIMIT BY TITLE AND DATABASE:

The source database can be selected according to one's own interest, e.g.,
DDBJ, RefSeq, GenBank, EMBL.

LIMIT BY BIOMOLECULE TYPE:

The molecule of interest is then chosen from the “molecule type” filter.
Only sequence records of the selected molecule type are shown,

e.g., mRNA, cDNA, rRNA, genomic DNA/RNA.

LIMIT BY PROTEIN NAME:

The required protein name is entered in the query box, for example, thyroid
peroxidase. The available records are displayed below.

PREVIEW / INDEX
Preview function
•rather than displaying the actual documents retrieved by your search,
Preview just displays your search statement and the number of hits it found
•useful for quickly trying various searches for a topic before actually
displaying the results
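
A count-only search of this kind can be approximated with ESearch by asking for the hit count alone. The sketch assumes the E-utilities parameter `rettype=count` and an XML reply that carries the number in a `Count` element. The sample reply is illustrative and its count is made up, not a real result.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (programmatic access to Entrez).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_count_url(db: str, term: str) -> str:
    """Return an ESearch URL that requests only the number of hits."""
    params = {"db": db, "term": term, "rettype": "count"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_count(xml_text: str) -> int:
    """Extract the hit count from an ESearch XML reply."""
    root = ET.fromstring(xml_text)
    count_text = root.findtext("Count")
    if count_text is None:
        raise ValueError("no Count element in reply")
    return int(count_text)


# Illustrative reply body; the count 42 is made up, not a real result.
sample_reply = "<eSearchResult><Count>42</Count></eSearchResult>"
print(build_count_url("nuccore", "thyroid peroxidase[title]"))
print(parse_count(sample_reply))  # 42
```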

Index function
•browse the index for a specific search field
•view:
–broad and narrow terms
–correct spellings and misspellings
–syntax for entering specific terms (properties, dates, etc.)
•select one or more terms from the index
•add selected term(s) to your query using the AND, OR, NOT buttons

VIEW INDEX:
Enter a term in the text box and use the pull-down menu to specify a search
field.
Click Preview to add terms to the query box and see the number of search
results.
The most commonly used fields in complex Entrez queries are the following:

•M17755[primary accession]
•TPO[gene name]
•thyroid peroxidase[title]
•thyroiditis[text word]
•Homo sapiens[organism]
•thyroid peroxidase[protein name]
•3060[sequence length]
•1999/04/26[modification date]
•biomol_mrna[properties]
•gbdiv_pri[properties]
•srcdb_genbank[properties]
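
Each entry in the list follows the same pattern: a search term followed by a field name in square brackets. The sketch below splits one such entry into its two parts. The helper name `split_term` and the regular expression are assumptions made for illustration, and only single terms are handled, not full Boolean queries.

```python
import re

# One fielded term: free text, then a field name in square brackets.
FIELDED_TERM = re.compile(r"\s*(.+?)\s*\[\s*([^\[\]]+?)\s*\]\s*")


def split_term(text: str) -> tuple[str, str]:
    """Split 'term[field]' into (term, field), ignoring surrounding spaces."""
    match = FIELDED_TERM.fullmatch(text)
    if match is None:
        raise ValueError(f"not a fielded term: {text!r}")
    return match.group(1), match.group(2)


for entry in ("M17755[primary accession]",
              "TPO [gene name]",
              "1999/04/26 [modification date]"):
    print(split_term(entry))
```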
