Rules and Guidelines For Genetic Nomenclature in Mice: Excerpted Version
COMMITTEE ON STANDARDIZED GENETIC NOMENCLATURE FOR MICE
Chairperson: MURIEL T. DAVISSON
The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine 04609, USA
Unique names and symbols for genes and mouse strains are critical to identifying them in research and in the
scientific literature. Rules for genetic nomenclature in mice have existed since the 1940s. The latest complete
revision of the rules was approved by the International Committee on Standardized Genetic Nomenclature for Mice
in November, 1993. Minor revisions have occurred since. The complete, current rules are available on-line from
The Jackson Laboratory's Mouse Genome Database (MGD), URL: http://www.informatics.jax.org. A printed
version appeared in Mouse Genome 92 (2), June, 1994, and in Genetic Variants and Strains of the Laboratory
Mouse, 3rd edition (Committee on Standardized Genetic Nomenclature for Mice, 1996a,b,c). The excerpted version
below gives the general guidelines for naming and symbolizing mouse genes, transgenes and transgenic strains,
targeted mutations and DNA markers. More detailed guidelines, including revisions as they are made, may be
found at the MGD Web site given above. Subparagraphs are numbered as in the complete guidelines so that the
user can refer easily from this excerpted version to the full text. Textual references to paragraphs not included here
may be found in the full text as well.
0962-8819 © 1997 Chapman & Hall

1. General rules for gene nomenclature

Introduction
Gene nomenclature guidelines are based upon the premise that the primary purpose of a gene or locus symbol is to provide a brief and universally acceptable symbol that uniquely identifies a specific gene or locus; all other purposes of a symbol are secondary and should not interfere with this primary purpose.

1.1.1. Names of genes or loci
Names of genes and loci should be brief and chosen to convey as accurately as possible the character by which the gene is usually recognized. Genes are functional units, whereas a locus can be any distinct, recognizable DNA sequence (see 1.1.3).

1.1.2. Symbols for genes
1. Symbols for genes should typically be two-, three-, or four-letter abbreviations of the name. Hyphens are used in gene symbols only for clarity, primarily to separate characters that together might be confusing, e.g.:
(1) two numbers that would be in adjacent positions, such as Lamb1-2,
(2) -rs and -ps (related sequence and pseudogene, respectively, see 1.1.4) from gene symbols,
(3) as required, characters for loci in a complex from the complex symbol (see 1.1.12 and 1.2.1),
(4) components of mutant allele symbols, such as Mod1<a-m1Lws> (see 1.1.7.7).
2. The total number of characters in a locus symbol should not exceed 10 unless this maximum limit would cause violation of some other rule.
3. Except in the case of loci first discovered because of a recessive mutation (see Section 1.1.7), the initial letter of the locus symbol should be capital, and all others lower case.
4. In published articles gene symbols should be set in italics, e.g. dw, dwarf; Hbb, haemoglobin β-chain.
5. Identification of new loci should not be assumed
from the discovery of variation, whether morphological, biochemical, quantitative or antigenic.
6. A proposed new symbol must never duplicate one already used for another locus. Date of publication in a refereed journal, provided the symbol is acceptable, establishes priority. Symbols for new loci can be reserved through the MGD at The Jackson Laboratory, Bar Harbor, Maine, USA.
7. When a well-known locus has been recognized initially by a mutation and later the structural locus is identified, the locus is identified by the symbol for the structural locus and the mutant allele symbol is designated as a superscript to the structural locus symbol, e.g. W, which is a mutation in Kit, becomes Kit<W>.
8. Symbols for quantitative trait loci (QTL) genes shall follow the rules for other types of genes (see 1.1.1). They should be symbolized with 3-4 character symbols that are acronyms of the name and begin with a capital letter. Those affecting the same complex trait, i.e. in a series like Idd loci, shall be given the same stem symbol and serially numbered. 'q' may be used as the final letter preceding the serial number but is not required.
9. Expressed sequence tagged (EST) loci, when mapped to chromosomes, may be given either D symbols with a final 'e' for expressed or the symbol ESTM###, where M identifies mouse as opposed to human EST loci and ### a serial number assigned from the Mouse Genome Database (MGD).
10. Genes encoded by the opposite (antisense) strand of a known gene shall be given their own symbols.
11. Alternative transcripts from the same gene should not be given different 'locus' symbols.
12. For homologous genes, to facilitate comparative mapping one should use the mouse equivalent of the symbol already adopted for the conserved gene in other species. For example, the gene symbol for amyloid beta precursor protein is APP in the human and cow genomes and App in the mouse. If the exact symbol has already been assigned to another gene in the mouse, the symbol should be modified to one resembling as closely as possible that used in the other species; e.g. by adding a single letter such as 'h' for homologue. One should also try not to duplicate a symbol used for a different gene in another species. Do not insert the letter 'm' or 'M' (for mouse) as the first letter of the symbol for a locus with homologues in other species.

1.1.3. Genes and loci recognized by DNA sequence
A gene, including both transcribed DNA sequences and associated regulatory elements, can span a considerable length of chromosomal DNA. Sequences at many places in a gene might, therefore, vary from strain to strain of mouse, and each variation could be any of several kinds: point substitutions, insertions or deletions (e.g. of retroviral elements), and variations in simple sequence repeat numbers. Each such variant has the potential to define a distinct genetic locus or DNA segment, given a specific assay and appropriately recombinant mice, and the full set of variants possessed by any given form of gene or length of chromosomal DNA defines a haplotype. The term haplotype is also used to define a complement of alleles at multiple loci within a complex (see Sections 1.2.1 and 1.2.3).
To describe the results of a DNA typing assay unambiguously, investigators should, therefore, give the gene name and symbol, the assay (e.g. probe or PCR primers, and restriction enzyme if any), and, if a specific DNA segment within the gene is being assayed, the D locus symbol. An important function of chromosome committees and of databases is to assemble such data to define haplotypes for genes. To avoid ambiguity, assay names and probe names should not resemble D symbols; use of the letter 'D' should be reserved for DNA loci.
D symbols are used in two ways:
1. Loci recognized by anonymous DNA probes should be given D symbols. This use of D symbols as gene or locus symbols should be reserved for anonymous DNA loci distinct from known genes.
2. Intragenic loci may be given D symbols to distinguish individual sites within a gene. Remember that genes are functional units whereas loci can be any distinct DNA segment, and that several variant D loci may lie within, and be used to assay, a gene. Variation in DNA sequence of a known gene or tightly linked sequences should be described using the gene symbol and current rules for nomenclature of genes and alleles. Intragenic D locus symbols should only be used (a) when describing intragenic mapping analysis or (b) when stating that a gene was typed using an intragenic D locus (see next paragraph for the most common example of this).
Note: While D symbols could be given to all variants within a gene, the Committee discourages the use of D symbols for intragenic DNA segments except when a D locus symbol is needed to relate a segment to other objects within or near the gene. YAC ends may be given D symbols when they are used for genetic mapping. (Note approved November 1995).
Loci recognized by variation in copy number of mini- or microsatellites should be given D symbols. If such microsatellite (D symbol) loci are within or very near known genes and, thus, can be used to detect those genes, then the gene symbol should always be used to refer to the gene, e.g. the gene's location on the chromosome, and the D symbols should only be used to refer to specific sites within the gene, e.g. to convey
intragenic mapping information. When D symbol loci fall within known genes, on general maps the gene will be identified by the gene symbol and the locus (D) symbols will appear in locus lists and databases cross-referenced with the gene. Locus symbols would, of course, be used on fine structure, high resolution maps around and within genes.
D symbols are composed of four parts:
(1) D for DNA.
(2) 1 . . . 19, X and Y for the chromosomal assignment, and 0 for unmapped loci.
(3) A 2-3 letter laboratory registration code indicating the laboratory or scientist describing the locus. The same symbol should be used for a laboratory's loci, chromosome aberrations, and inbred substrains, e.g. Pas for Pasteur Institute. A code can be obtained from the central registry maintained by the Institute of Laboratory Animal Resources (see section on Laboratory Registration Codes below).
(4) A unique serial number. It is preferred that numbers be assigned to loci in the order the loci are described on each chromosome for a particular laboratory, e.g. D1Pas5, the fifth D locus developed and mapped on Chromosome 1 at the Pasteur Institute. The use of a number from the probe that detects the locus, e.g. D17Leh48, a Chromosome 17 D locus identified by probe number Tu48 of Lehrach, is discouraged.
Recognizing that allelic variation detected by DNA probes can be complex, the Committee proposes the following. In published papers, the allele type of a specific strain should be given by fragment(s) size with a description of the assay used. When simple allele symbols are required, as for display of linkage data or listing in databases or locus lists, a single letter abbreviation for the strain should be used (see Alleles, section 1.1.7.9). Allele symbols should be written as a lowercase superscript when appended to the locus symbol; in tables of linkage data in publications a single uppercase letter denoting the strain may be used. The assay or restriction enzyme must be specified for each allele used in publication or entered into a database, because the same two strains may differ with one assay but be indistinguishable with another. That is, a strain may have [...] of as designating the haplotype for that gene or segment of DNA (see Section 1.1.3, para. 1 and Section 1.1.7, general rules for allelic designation).
If an anonymous DNA locus identified only by the D symbol is later identified to be a known locus or the function of the gene is determined, the D symbol should either be replaced by the known gene's symbol or changed to a new gene symbol that is an acronym for the new gene's name. If more than one mutation is identified within the gene, the original D symbol should be retained for the mutated site it originally designated (see Section 1.1.3, para. 3). If a D locus is shown to be expressed but its function remains unknown, a lower case 'e' for expressed should be added to the D symbol, e.g. D1Pas5e. Newly identified expressed genes for which the gene product is unknown should also be given D symbols followed by 'e'.
Anonymous DNA loci from the human genome that cross-hybridize with mouse DNA and are mapped to a mouse chromosome retain their human symbol in all uppercase and the mouse chromosome number and a capital H, for human, are inserted after the D, e.g. D16H21S56 is a D locus on mouse Chromosome 16 that cross-hybridizes with the probe for the 56th simple sequence DNA locus from human Chromosome 21. This same convention may be used for D loci derived from other species. Single-letter abbreviations for other species will be assigned by the International Nomenclature Committee to assure that each letter unambiguously indicates the species of origin. If a unique locus in another species identifies more than one mouse locus, the related sequence convention should be used (see 1.1.4).
Alternative or new loci detected with primers for a known locus belong to the lab discovering them and shall be symbolized by that lab with its lab code and next serial number on the chromosome. 'Mismapped' loci should only be resymbolized if it can be proved that the original locus does not exist in the original location.
B1 (and other types of) repeat 'loci' should only be given symbols if there is evidence of a unique locus (e.g. genetic proof or polymorphism).
Xrf shall be used as the 'lab code' in D symbols that designate cross-referenced genes in mice and yeast or other species. Symbols will be DChr#Xrf###, where ### is a serial number assigned by the Johns Hopkins database. Typically, a clone derived from yeast may hit 2 (or more) mouse loci. These will not require the -rs nomenclature because the clone number in the symbol will identify relatedness between loci on different chromosomes. A hyphenated serial number shall be used when two loci detected by the same clone are on the same chromosome.

Chromosomal regions detected cytologically and by RFLPs
At least five chromosomal regions can be detected by a specific cytological staining method that reveals the whole region or loci within the region can be detected with DNA probes: centromeres, pericentromeric heterochromatin (C-bands), nucleolus organizer regions (NORs), homogeneously staining regions (HSRs) and telomeres. Chromosomal nomenclature guidelines should be followed for cytologically detected chromosomal regions and gene nomenclature for genetic loci. While we recognize the distinction blurs, please try to follow these guidelines; contact a member of the International Nomenclature Committee if in doubt. Hc# is the symbol for a
cytologically detected C-band; loci recognized by DNA polymorphisms in heterochromatin should be given D locus symbols. Centromeres are designated by the symbol Cen when detected cytologically or referred to as a unit. If it becomes possible and necessary to distinguish loci or segments within the centromere, these should be given D locus symbols. Polymorphic loci within NORs are symbolized Rnr#, where Rnr = ribosomal RNA and # = the chromosome on which the locus is located. If more than one Rnr locus on the same chromosome is distinguished by genetic means, the loci should be serially numbered in order of identification, e.g. Rnr1-1, Rnr1-2. For homogeneously staining regions, HSR is incorporated into a chromosome aberration symbol (see chromosome guidelines) or a D symbol is assigned to the locus amplified and Hsr is added when it is amplified, e.g. D1Lub1Hsr. Related sequence nomenclature (rs#) will be used for telomere sequence loci recognized by polymorphic variants with telomere sequences. Telomere sequences detected by cytogenetic methods shall be symbolized Tel# like other chromosomal 'anomalies or variants'.

Transgenes (Section 1.3 in the complete guidelines; MGD; Committee 1996)
All DNA sequences that are experimentally and stably introduced into the germline of animals are considered transgenes. They are named according to the following conventions, which were developed by an interspecies committee sponsored by the Institute of Laboratory Animal Resources (ILAR, 1992). Transgenic symbols can be registered with TBASE at the Human Genome Database (GDB) at Johns Hopkins (URL: http://www.gdb.org/Dan/tbse/tbase.html).
1. A transgene symbol consists of three parts, all in roman typeface, as shown below:

TgX(YYYYYY)#####Zzz

Where: TgX = mode,
(YYYYYY) = insert designation,
##### = laboratory assigned number and
Zzz = laboratory registration code

A. The mode designates the transgene and always consists of the letters 'Tg' followed by a letter designating the mode of insertion of the DNA: H for homologous recombination, R for insertion via infection with a retroviral vector, and N for nonhomologous insertion. 'Knockout' or directed mutation of a specific known gene should be designated using standard allele symbol conventions (see below). Transgenic nomenclature is used for homologous recombination insertions when homologous recombination is used as a mechanism to insert a transgene and it is the transgene itself that is of primary interest. The purpose of this designation is to enable the user to identify it as a symbol for a transgene and to distinguish between the three fundamentally different organizations of the introduced sequence relative to the host genome, not simply to indicate the method of insertion or nature of the vector. To illustrate these distinctions, examples are given below.
· Mice derived by infection of embryos with MuLV vectors are designated TgR; mice derived by microinjection of MuLV DNA into zygotes are designated TgN.
· Mice derived from ES cells by introduction of DNA followed by recombination with the homologous genomic sequence are designated TgH; mice derived by random insertion of the same sequence by nonhomologous crossing over events are designated TgN.
B. The insert designation is a symbol for the salient features of the transgene, as determined by the investigator. It is always contained within parentheses and consists of no more than 6 characters: letters (capitals, or capitals and lower-case letters) or a combination of letters and numbers. Insert designations longer than 6 characters may be used only if the insert designation and the laboratory assigned number (C. a. below) together are 11 characters or less. Italics, super- or subscripts, internal spaces, and punctuation should not be used. While the choice of the insert designation is up to the investigator, the following guidelines should be followed:
a. The insert designation should identify the inserted sequence and indicate important features. Where the insertion uses sequences from a named gene, it should contain the standard symbol for that gene. Hyphens are omitted when using hyphenated gene symbols. If the gene symbol exceeds the spaces available, use the beginning letters of the symbol. For example, Ins1 should be used within the symbols of transgenes containing either coding or regulatory sequences from the mouse insulin gene (Ins1) as an important part of the insert designation. Gene symbols are not italicized when incorporated into transgene symbols.
b. Avoid using symbols that are identical to other named genes in the same species. For example, the use of 'Ins' to designate 'insertion' would be incorrect.
c. Ideally two different gene constructs should not be identified by identical insert designations.
d. To aid communication, standard abbreviations can be used as part of the insert designation.
These presently include:
An    anonymous sequence
Gen   genomic
Im    insertional mutation
Nc    noncoding sequence
Rp    reporter sequence
Sn    synthetic sequence
ET    enhancer trap construct
Pt    promoter trap construct
This list will be expanded as needed and maintained by the Nomenclature Committee.
e. The insert designation should identify the inserted sequence, not its location or phenotype.
C. Laboratory Assigned Number and Laboratory Registration Code is a number and letter combination that must uniquely identify each independently inserted sequence. It is formed of two parts:
a. The Laboratory Assigned Number is a number from 1 to 99 999 that is uniquely assigned by the laboratory to each stably transmitted insertion. This assignment should be done at the time germline transmission is confirmed. The number can have some intra-laboratory meaning or simply be a number in a series of transgenes produced by the laboratory. The same number cannot be used more than once by each lab.
b. The Laboratory Registration Code is uniquely assigned to all laboratories originating transgenic animals, DNA loci or inbred strains (see Laboratory Registration Code section below).
2. The complete designation identifies the inserted site, and provides a symbol for unique identification. When a mutation that produces an observable phenotype is caused by the insertion, the locus so identified must be named according to standard procedures for the species involved. The allele of the locus identified by the insertion can then be identified by the abbreviated transgene symbol according to the conventions adopted for communication, and supplies a unique identifier to distinguish it from all other insertions. Each insertion retains the same symbol even if it is placed on a different genetic background. Specific lines of animals carrying the insertion should be additionally distinguished by a stock designator preceding the transgene symbol. In general, this designator will follow the established conventions for the naming of strains or stocks of the particular animal used. In cases where the background is a mixture of several strains, stocks, or both, the transgene symbol should be used without a strain or stock name. For rules on how to designate strains derived from such mixed genetic backgrounds or congenic strains, see the section below on targeted mutations and Sections 3.1.2 and 3.3.

Examples of transgenic strain designations
· C57BL/6J-TgN(CD8GEN)23Jwg The human CD8 genomic clone inserted into C57BL/6 mice from The Jackson Laboratory (J). The 23rd mouse screened in a series of microinjections done in the laboratory of Jon W. Gordon (Jwg).
· Crl:ICR-TgN(SVDhfr)432Jwg The SV40 early promoter driving a mouse dihydrofolate reductase (Dhfr) gene. This was a 4 kb plasmid, and this animal was the 32nd animal screened in the laboratory of Jon W. Gordon (Jwg). The ICR outbred mice were obtained from Charles River Laboratories (Crl).
· TgN(GPDHim)1Bir The human glycerol phosphate dehydrogenase, which caused an induced mutation (im); the first transgenic line produced by Birkenmeier.

Examples of insertional mutation designations
· ho<TgN447Jwg> The insertion of a transgene into the hotfoot locus (ho).
· xxx<TgN21Jwg> The insertion of a transgene that leads to a recessive mutation in a previously unidentified gene. A gene symbol for xxx must be obtained from MGD.

Targeted mutations
Rules for symbolizing targeted mutations are given in Section 1.1.7.7 below. Currently, targeted mutations are often maintained on a mixed genetic background derived from the embryonic stem (ES) cell line and the host strain used. Standardized nomenclature for naming such strains or for congenic strains made by backcrossing the targeted mutation onto a standard inbred background is given in Sections 3.1.2 and 3.3.

1.1.7. Alleles
Alleles are usually designated by the locus symbol with an added superscript (in italics when printed). In computerized symbols the superscript may be denoted by prefixing an asterisk or enclosing the allele symbol in angle brackets, e.g. Gpi1a (with the a set as a superscript) or Gpi1*a or Gpi1<a>. For D symbol locus alleles see section 1.1.3. When a spontaneous mutation is cloned or shown to occur in a previously named candidate gene, the mutation's symbol is changed to become an allele at the cloned locus by turning the mutation symbol into an allele symbol, e.g. the shi (shiverer) mutation in the Mbp (myelin basic protein) gene becomes Mbp<shi>. If the original mutation symbol already has a superscript, the mutation and allele symbols are placed on one line in the new superscript and hyphenated, e.g. the shi<mld> (myelin deficient) mutation becomes Mbp<shi-mld> (see also #2 below).
1. In the case of mutant genes for which there is clearly a wild-type, the symbol for the first discovered mutant allele becomes both the gene symbol and the symbol for that allele. No superscript is then used, e.g. Ca, caracul. When a new allele is discovered, it is
symbolized by adding a superscript to the original symbol.
2. Recessive alleles should be indicated by the use of a lower case initial letter for a mutant gene, e.g. a, non-agouti. All other alleles, whether dominant, codominant or having dominance relationships that vary with method of assessment, should be indicated by the use of a capital initial letter followed by lower case letters, as in the locus symbol, e.g. Ta, tabby.
Two exceptions to this rule are allowed for targeted and cloned mutant genes when the original cloned gene symbol starts with an upper case letter:
  1. If the phenotype of mutant alleles may be recessive or codominant depending on the method of determination, the use of upper or lower case will depend upon what the naming investigator considers the defining phenotype. For example, a targeted mutant allele of Tcra created by Mombaerts can be symbolized Tcra<tm1Mom>, even though heterozygotes are not visibly different from wild-type mice, if heterozygotes can be distinguished at the DNA or protein level.
  2. When a mutation is shown to occur in a cloned candidate gene and its symbol is changed to become an allele of the cloned gene (see first paragraph of 1.1.7 above), the first letter of the gene symbol may remain uppercase and the inheritance pattern may be conveyed in the allele symbol, e.g. the e (recessive yellow) and E<so> (somber) alleles at the Mc1r (melanocortin 1 receptor) gene become Mc1r<e> and Mc1r<E-so>.
3. Allele superscripts should typically be one or two lower case letters and, if possible, should convey additional information about the allele, e.g. p<un>, pink-eyed unstable allele of p of pink-eyed dilution. If information is too complex to be conveyed conveniently in the symbol, the alleles are given single letter superscripts and the information concerning the allelic properties is shown in catalogs or tables, e.g. Pgm1<a>, Pgm1<b>; H2<a>, H2<b>, etc.
4. For alleles where the allele designates presence or absence of a virus or immune response, the allele for the presence of the virus or immune response is designated by a superscript 'a' and the allele for absence of the trait by a superscript 'b'. For loci governing resistance and susceptibility to infectious organisms or other agents, resistance is designated by a superscript 'r' and susceptibility by a superscript 's'.
5. Wild-type should be designated by a + sign, with the locus symbol as superscript, e.g. +<p>. Reversions from a mutant allele to wild-type should be distinguished from the original wild-type allele by designating them by the locus symbol, with a + sign as superscript e.g. punJ. A + sign alone may be used when the context leaves no doubt as to the locus represented, e.g. in genetic formulae.
6. Indistinguishable alleles of independent origin (e.g. reoccurrences, reversions to wild-type) should be designated by the existing gene symbol with a series symbol (see below) appended as a superscript in italics. If the gene symbol already has a superscript, this should be separated from the series symbol by a hyphen. The series symbol should consist of an arabic numeral corresponding to the serial number of the variant in any given laboratory, plus the laboratory registration code. To avoid the confusion of the numeral 1 and the letter l, a first-discovered variant may be left unnumbered, and the second variant numbered 2. When two named mutant genes are found to be alleles at the same locus, the symbol published or assigned first remains the locus symbol and the symbol of the second gene is superscripted as an allele symbol for that mutation, e.g. hr<rh>, the rhino allele at hairless.
7. Mutations or other variations occurring in known alleles may be denoted by a superscript m followed by an appropriate series symbol (as above) and separated from the original allele symbol, if one exists, by a hyphen, e.g. Mod1<a-m1Lws>, the first mutant allele of Mod1<a> found by Lewis. For known deletions of all or part of an allele, the superscript m may be replaced with dl. Information on the allele of origin of mutations may be valuable in elucidating changes in DNA sequence. Mutant alleles created by targeted mutagenesis should have a t preceding the m to denote targeted, e.g. Cftr<tm1Unc>, a targeted mutation of the cystic fibrosis transmembrane regulator gene created at the University of North Carolina.
8. Mutant alleles that turn out subsequently to be deletions retain their allelic designation, e.g. Ta<25H> and the various c-locus deletions retain their original symbols even though they are now known to be deletions that encompass more than the Ta or c loci. If the deletion deletes more than one gene and is cytologically visible, the deletion should be given a chromosome anomaly designation containing the original allele designation and the allele symbol is used as the abbreviation. For example, Del(10)Sl<12>1H is the first deletion discovered at Harwell; it is located on Chromosome 10 and was originally detected as the 12th steel allele at Harwell (Sl<12H>). Once referred to in a publication by the full designation, it may be abbreviated Sl<12H>. Information on the genes deleted becomes part of the description. If the deletion deletes more than one gene, but is not cytologically detectable, the above nomenclature is discouraged, although a cytological designation may be given in the future if improved techniques reveal the deletion cytologically. The term 'cytological' refers to cases where the deletion can be detected by simple staining methods and visual examination of chromosomes. For a deletion of multiple genes detected only by methods such as in situ hybridization of gene-specific
probes, it is left to the investigator to determine the most useful terminology.
9. To display polymorphic data from a multipoint cross in a table for publications, it is acceptable to use single letter abbreviations for the strains involved in the cross to designate the strain origin of the alleles. A footnote to the table should point out that these designate strain of origin rather than allele symbols, e.g. B for C57BL/6 vs S for Mus spretus. In databases where single letters are needed to make comparisons among strains, the letter used will refer to the 'haplotype' or constellation of different variants revealed by different assays that make up the phenotype of a particular locus for a particular strain. The first strain in which the 'phenotype' is described is the prototype and determines the allele symbol. If two strains thought to have the same allele are distinguished from each other by a different assay, one of them is given a new simple allele designation. Other strains with the original allele designation retain the original until they are typed with the new assay to determine their appropriate allele designation.

1.1.8. Lethals
Appropriate locus symbols for recessive lethals with no known heterozygous effect and unidentified function consist of a lower case letter l followed by the number of the chromosome on which the locus is located in parentheses, and series symbol indicating the serial number of the lethal in the laboratory of origin, e.g. l(17)2Pas, the second lethal on Chr. 17 found at the Institut Pasteur.

1.1.9. Viruses
Nomenclature for genes related to the expression of viral antigens, or to sensitivity or resistance to viruses, should follow the standard rules for gene nomenclature. Where possible and appropriate, the letters of the symbol should be those by which the virus is usually known; e.g. Mtv1, a locus concerned in induction of mammary tumour virus, MTV. Successive loci concerned with the same virus should be distinguished by appending serial numbers; e.g. Fv1, Fv2.

1.1.10. Oncogenes
Nomenclature for mouse cellular oncogene sequences should follow the standard nomenclature for oncogenes. When referring specifically to the mouse locus, however, in lists of symbols and maps, the prefix c- denoting cellular sequence should be omitted and the initial letter of the symbol should be capitalized; e.g. c-myc becomes Myc. The names and symbols of oncogenes should be regarded as provisional until the true functions of the genes become known, when they should be renamed, e.g. Erbb became epidermal growth factor receptor, Egfr.

1.1.11. Phenotype symbols
Phenotype symbols, where these are necessary (e.g. antigen loci, enzyme loci), should be the same as genotype symbols except that symbols for phenotypes should be in capitals, not italicized, and with superscripts lowered to the line. The phenotypes of heterozygotes should be written as in the following example: GPI1A, GPI1B, and GPI1AB are phenotypes associated with the Gpi1 locus a and b alleles.

1.1.12. Gene complexes
Gene complexes are considered to exist when a number of apparently functionally or evolutionarily related loci are genetically closely linked. Alternative states of complexes are referred to as haplotypes rather than alleles. Known complexes are of two main types: (a) less extensive complexes involving duplicate loci or in which operators or cis-acting regulators of structural genes for protein show little or no recombination with the loci on which they act; and (b) very extensive complexes, possibly involving hundreds of related loci, for which special rules may be necessary. The H2 and the immunoglobulin complexes are in category (b). The complete nomenclature rules contain guidelines for less extensive complexes involving operators, cis-acting regulators, or duplicate loci, and for very extensive complexes with special rules, including the H2 complex (see 1.2.3), immunoglobulin complexes (see 1.2.5), globin gene complexes (see 1.2.4), homeobox-containing gene complexes (see 1.2.6), and the t-complex.

1.1.13. Mitochondrial genome
Loci in the mitochondrial genome should be denoted by the prefix mt-, set off from the main symbol by a hyphen.

1.1.14. Antigenic variants
Symbols adopted for loci concerned in cell-membrane alloantigens should be based on the method of demonstrating such loci. Brief examples of the different types are listed below with reference to the sections in which detailed rules, if they exist, can be found.
1. Loci primarily demonstrable by transplantation techniques should be designated by an initial H; e.g. H1, H2, etc. (see 1.2.3).
2. Loci demonstrable by red-cell agglutination should be designated by the letters Ea; e.g. Ea1, Ea2.
3. Loci coding for a cell surface molecule on lymphocytes, or shared by lymphocytes and other cell types, and detected by serological or biochemical methods, and for which there is a demonstrable polymorphism, should be designated by the letters Cd, if the CD antigen is known, and Ly, if not.
4. Similarly, other antigen loci involving other cell types should be denoted by symbols indicating the cell type, e.g. Pca, plasma cell antigen; Tla, thymus leukaemia antigen.

[...] designations that do not conform. For example, see Section 3.1.6. Any strains with a common origin separated before F20 shall be regarded as related inbred strains and be given symbols that indicate relationship, e.g. NZB, NZC, NZO.
Inbred targeted mutation strains derived from only two strains (including the ES cell line as one) but not exactly recombinant inbred strains (see below) may be designated using abbreviations for the two strains separated by a comma, e.g. B6, 129-gene symbol.
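
The two-strain designation just described can be sketched in Python, combined with the targeted allele form of 1.1.7.7. This is a minimal illustration, not part of the rules: the function name, the regular expression, the angle-bracket rendering of the superscript (the computerized convention mentioned in 1.1.7), the shape assumed for the laboratory code, and the omission of a space after the comma are all assumptions.

```python
import re

# Assumed plain-text form of a targeted allele (see 1.1.7.7): a gene symbol,
# then "tm", a serial number and a laboratory code inside angle brackets.
# The 1-3 letter shape of the laboratory code is an illustrative guess.
TARGETED_ALLELE = re.compile(r"^[A-Za-z][A-Za-z0-9]*<tm\d+[A-Z][a-z]{0,2}>$")


def two_strain_designation(strain_a: str, strain_b: str, allele: str) -> str:
    """Join two strain abbreviations with a comma and append the allele symbol."""
    if not TARGETED_ALLELE.match(allele):
        raise ValueError(f"not a targeted allele symbol: {allele!r}")
    return f"{strain_a},{strain_b}-{allele}"


print(two_strain_designation("B6", "129", "Cftr<tm1Unc>"))  # B6,129-Cftr<tm1Unc>
```

The check on the allele symbol is deliberately strict so that a bare gene symbol such as Cftr is rejected; a real registry lookup would be needed to confirm that the laboratory code itself is valid.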