2023
HUMAN DISEASE
KIVANÇ BİLECEN
PROF. OF MOLECULAR BIOLOGY & GENETICS
Human Disease
At the level of the individual within a species, some mutations improve fitness, most mutations have no effect on fitness,
and some are maladaptive (relative to some norm).
Disease may be defined as maladaptive changes that afflict individuals within a population.
Disease is also defined as an abnormal condition in which physiological function is impaired.
Main approach >>> to describe genes and gene products that cause disease
Main challenge >>> to connect the genotype to the phenotype
Bioinformatics can have an impact on our knowledge of diseases:
DNA databases to compare DNA sequences (GenBank/EMBL/DDBJ/SRA; Online Mendelian Inheritance in Man
(OMIM); locus‐specific databases)
Linkage studies, association studies >>> physical and genetic maps for the identification of mutant genes
Mutations affecting proteins’ 3D structure – structural bioinformatics
Functional studies to understand the effect of a mutation
Gene function prediction – through identification of orthologs in simpler organisms
28.12.2023
Classification of Disease
[Figure: classification tree of genetic disorders/diseases. Node labels recoverable from the slide: Hereditary; Sporadic; Disease Predisposition/Susceptibility; Somatic; Chromosomal; Late Onset; Somatic Mosaicism; Mendelian; Non‐Mendelian; Common Diseases; Cancer; Drug Response; CNV; Structural; Other; Autosomal; Sex‐Linked; Polygenic; Mitochondrial; Imprinting; TNR; Ploidy; Aneuploidy; Microdeletion Syndromes; Structural Anomalies]
Classification of Disease
Rare diseases
Diseases affecting <200,000 people (the threshold used in the United States)
Why are they important?
Categories of Disease
https://www.ncbi.nlm.nih.gov/snp/
Categories of Disease
Effect size
Can be quantitated as an odds ratio (OR).
An OR is a measure of association between an exposure (in our case a genetic variant) and an outcome (expression of a disease).
An OR of 1 implies that the presence of a variant does not affect the odds of a disease outcome.
An OR > 1 implies an association with higher odds of disease occurrence; an OR < 1 implies an association with lower odds.
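The odds ratio can be sketched from a 2x2 table of carrier status versus disease status. This is a minimal illustration; the function name and all counts are hypothetical and do not come from any study.

```python
def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds ratio for a 2x2 table (variant carrier status vs. disease status)."""
    # Cross-product form: (odds of carrying the variant in cases)
    # divided by (odds of carrying the variant in controls).
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


# Hypothetical counts, chosen only for illustration.
or_risk = odds_ratio(30, 70, 10, 90)     # variant enriched in cases, OR > 1
or_neutral = odds_ratio(20, 80, 20, 80)  # same carrier frequency in both groups, OR = 1

print(round(or_risk, 2), or_neutral)
```

With these made-up counts the first variant is associated with higher odds of disease, while the second shows no association.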
Categories of Disease
A major goal of human genetics and genomics is to identify variants that cause disease (or confer risk for disease)
Categories of Disease
Monogenic Disorders
Categories of Disease
Complex Disorders
Complex disorders such as Alzheimer’s disease and cardiovascular disease are caused by defects in multiple genes.
These disorders are also called multifactorial, reflecting that they are expressed as a function of both genetic and environmental factors.
These traits do not segregate in a simple, discrete, Mendelian manner.
Other examples: asthma, autism, depression, diabetes, high blood pressure, obesity, osteoporosis
Categories of Disease
Genomic Disorders
Genomic disorders >>> changes in the structure of the genome that cause disease
Large‐scale chromosomal abnormalities are extremely common causes of disease in humans.
Many developmental abnormalities involve a portion of a chromosome.
Some involve cytogenetically detectable changes and span millions of base pairs
If they are too small to be cytogenetically visible (e.g., smaller than about 3 Mb), they are usually referred to as cryptic changes.
Microdeletion syndromes include Cri‐du‐chat syndrome, Angelman syndrome, Prader‐Willi syndrome, Smith‐Magenis syndrome, and various forms of intellectual disability that result from the gain (microduplication) or loss (microdeletion) of chromosomal regions.
Categories of Disease
Genomic Disorders
Callout: genomic disorders that are inherited in a Mendelian fashion and that involve only one or several genes
Categories of Disease
Modifier genes are likely to be involved, and environmental factors are certain to have large roles in genetic diseases.
A unifying concept: monogenic disorders may be caused primarily by the abnormal function of a single gene, yet they always involve multiple genes.
Particular ethnic groups or other discrete groups have high susceptibility to some genetic diseases.
Categories of Disease
Mitochondrial Disease
Today, over 100 disease‐causing point mutations have been described.
The mitochondrial genome contains 37 genes, any of which can be associated with disease.
[Figure: Morbidity map of the human mitochondrial genome]
Mitochondrial genetics differs from Mendelian genetics in three main ways:
(1) Mitochondrial DNA is maternally inherited. A woman having a mitochondrial DNA mutation may therefore transmit it to her children, but only her daughters will further transmit the mutation to their children.
(2) While nuclear genes exist with two alleles (one maternal and one paternal), mitochondrial genes exist in hundreds or thousands of haploid copies per cell. Some critical threshold of mutated mitochondrial genomes is required before a disease is manifested.
(3) As cells divide, the proportion of mitochondria having mutated genomes can change, affecting the phenotypic expression of mitochondrial disorders. Clinically, mitochondrial disorders present at different times and in different regions of the body.
An extremely broad variety of diseases are associated with mutations in mitochondrial DNA.
MITOMAP is a useful mitochondrial genome database.
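The threshold and segregation points above can be illustrated with a toy simulation. This sketch makes several assumptions: each daughter cell draws its mitochondrial genomes at random from the parent pool, and the 60% disease threshold is a purely hypothetical value, not one stated in the lecture.

```python
import random


def simulate_heteroplasmy(start_fraction, copies_per_cell, divisions, rng):
    """Follow one cell lineage; return the mutant fraction after each division."""
    mutant = round(start_fraction * copies_per_cell)
    history = [mutant / copies_per_cell]
    for _division in range(divisions):
        fraction = mutant / copies_per_cell
        # Each genome copy in the daughter cell is mutant with probability
        # equal to the mutant fraction in the parent cell (random sampling).
        mutant = sum(1 for _copy in range(copies_per_cell) if rng.random() < fraction)
        history.append(mutant / copies_per_cell)
    return history


def exceeds_threshold(fraction, threshold=0.6):
    """Hypothetical threshold: the phenotype appears once enough genomes are mutated."""
    return fraction >= threshold


rng = random.Random(0)  # fixed seed so the run is repeatable
lineage = simulate_heteroplasmy(0.4, copies_per_cell=500, divisions=50, rng=rng)
print(lineage[0], lineage[-1], exceeds_threshold(lineage[-1]))
```

Different seeds give different lineages, which is the point of the sketch: the same starting proportion of mutated genomes can drift above or below the threshold in different cells.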
Categories of Disease
Cancer is a somatic mosaic disease, arising from a clone having somatic mutations and leading to malignant transformation.
Cancer occurs when DNA mutations confer selective advantage to cells that proliferate, often uncontrollably.
There are six core hallmarks of cancer, as reviewed by Hanahan and Weinberg (2011):
sustaining proliferative signaling
evading growth suppressors
resisting cell death
enabling replicative immortality
inducing angiogenesis
activating invasion and metastasis
There are >200 types of cancer and many disease mechanisms, and a growing number of key tumor suppressor genes and oncogenes have been identified.
A human cancer genome project has been launched to catalog the DNA sequence of a variety of cancer genomes.
Disease Databases
OMIM is a comprehensive database for human genes and genetic disorders, particularly rare (often monogenic) disorders having a genetic basis.
The OMIM database contains bibliographic entries for over 25,000 human diseases and relevant genes.
A focus of OMIM is inherited genetic diseases; the database is concerned with Mendelian genetics.
It contains little information about genetic mutations in complex disorders or chromosomal disorders.
Disease Databases
The Human Gene Mutation Database (HGMD) is another major source of information on disease‐associated
mutations.
The database is partly commercial (requiring payment for full access). [purchased by Qiagen]
HGMD emphasizes more comprehensive cataloguing of mutations, compared to OMIM
In sequencing human genomes and exomes, it is common to filter variants based on whether they have been
previously associated with disease; HGMD has emerged as a basic resource in many analysis pipelines.
Disease Databases
https://varsome.com/variant/hg19/MEFV%3AM694V?annotation‐mode=germline
Disease Databases
The ClinVar database provides data on human variants and their relationship to disease.
It further provides links to the NIH Genetic Testing Registry (GTR), MedGen, Gene, OMIM, and PubMed
There are five categories of content in ClinVar:
(1) Submitter: Submissions are from organizations and individuals.
(2) Variation: Includes sequences at one location (single allele) or multiple alleles (e.g., compound heterozygotes in which two parents transmit different alleles at a single locus, sometimes causing a phenotypic change). Variants are cross‐referenced to dbSNP and dbVar.
(3) Phenotype: May represent one concept or more and is annotated by MeSH term, OMIM number, MedGen identifier, or Human Phenotype Ontology (HPO).
(4) Interpretation: Submitter‐driven and uses terms recommended by the American College of Medical Genetics and Genomics (ACMG).
(5) Evidence: Typically consists of the number of individuals in which a given mutation was observed.
Disease Databases
GeneCards
Disease Databases
Central databases such as OMIM and HGMD attempt to comprehensively describe all disease‐related genes
without necessarily cataloguing every known allelic variant.
In contrast, locus‐specific mutation databases describe variations in a single gene (or sometimes in several genes) in depth.
The coverage of known mutations also tends to be far deeper in locus‐specific databases as a group than in central databases.
A locus‐specific mutation database is a repository for allelic variations.
The essential components of a locus‐specific database include the following:
a unique identifier for each allele;
information on the source of the data;
the context of the allele;
information on the allele (e.g., its name, type, and nucleotide variation).
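The four essential components listed above could be modeled as a simple record type. This is a hypothetical sketch: the class, field names, and example values are invented for illustration and do not reflect the schema of any real locus‐specific database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleRecord:
    allele_id: str          # unique identifier for each allele
    data_source: str        # information on the source of the data
    context: str            # the context of the allele
    name: str               # allele name
    variant_type: str       # allele type, e.g. "substitution"
    nucleotide_change: str  # nucleotide variation


def add_record(database, record):
    """Store a record, refusing duplicate identifiers so each allele stays unique."""
    if record.allele_id in database:
        raise ValueError(f"duplicate allele identifier: {record.allele_id}")
    database[record.allele_id] = record


# All values below are placeholders, not real database entries.
database = {}
add_record(database, AlleleRecord(
    allele_id="GENE1_000001",
    data_source="hypothetical submitting laboratory",
    context="exon 2, hypothetical reference sequence",
    name="example allele",
    variant_type="substitution",
    nucleotide_change="c.100A>G",
))
print(len(database))
```

Keying the store on the allele identifier is one simple way to enforce the uniqueness requirement.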
Disease Databases
A main point of entry to locus‐specific databases is the Human Genome Variation Society (HGVS).
HGVS nomenclature is used to report and exchange information regarding variants found in DNA, RNA, and protein sequences, and it serves as an international standard.
The HGVS site provides access to 1,600 locus‐specific mutation databases.
Its major categories include:
(1) locus‐specific mutation databases, organized by HUGO approved gene symbols;
(2) disease‐centered central mutation databases, such as the Asthma Gene Database;
(3) central mutation and SNP databases, such as OMIM, dbSNP, HGMD, and PharmGKB;
(4) national and ethnic mutation databases, such as databases for diseases affecting Finns or Turks;
(5) mitochondrial mutation databases, such as MITOMAP;
(6) chromosomal variation databases, such as the Mitelman database of chromosome aberrations in cancer;
(7) nonhuman mutation databases, such as OMIA (Online Mendelian Inheritance in Animals);
(8) clinical databases such as those of the National Organization for Rare Disorders (NORD)
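To make the HGVS nomenclature mentioned above more concrete, here is a minimal sketch that parses only the simplest case, a single‐nucleotide substitution at a coding DNA position (the `c.` prefix). The function name and regular expression are illustrative assumptions; the real HGVS grammar covers many more variant types (deletions, duplications, intronic positions, protein‐level descriptions) that this sketch deliberately rejects.

```python
import re

# Optional reference sequence, then "c.", a position, and REF>ALT.
SIMPLE_SUBSTITUTION = re.compile(
    r"^(?:(?P<reference>[A-Za-z0-9_.]+):)?c\.(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_simple_substitution(description):
    """Return the parts of a simple coding substitution, or None if unsupported."""
    match = SIMPLE_SUBSTITUTION.match(description)
    if match is None:
        return None
    return {
        "reference": match.group("reference"),
        "position": int(match.group("position")),
        "ref": match.group("ref"),
        "alt": match.group("alt"),
    }


parsed = parse_simple_substitution("NM_004006.2:c.4375C>T")
unsupported = parse_simple_substitution("c.76_78del")  # deletions are outside this sketch
print(parsed, unsupported)
```

For real work, a maintained HGVS parsing library would be a safer choice than a hand‐written pattern.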
Disease Databases
Databases reporting which alleles are associated with human disease have critical roles in the interpretation of the clinical significance
of genomic variants.
Data analysis pipelines for next‐generation sequencing studies commonly apply two filters:
filter‐exclude: remove variants that are likely to be benign (neutral) because they appear in databases of apparently normal individuals
filter‐include: retain variants that are likely to be pathogenic because they have been annotated as disease‐associated
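The two filters can be sketched with plain set membership. Everything here is a simplifying assumption: variants are represented as opaque strings, the two "databases" are small in‐memory sets, and the identifiers are made up. Real pipelines work on annotated variant files and usually apply allele‐frequency cutoffs rather than simple presence or absence.

```python
def filter_variants(candidates, seen_in_controls, disease_annotated):
    """Apply filter-exclude, then filter-include, to a list of variant identifiers."""
    # filter-exclude: drop variants seen in apparently normal individuals
    remaining = [v for v in candidates if v not in seen_in_controls]
    # filter-include: among the rest, flag variants annotated as disease-associated
    prioritized = [v for v in remaining if v in disease_annotated]
    return remaining, prioritized


# Hypothetical identifiers, for illustration only.
candidates = ["var_A", "var_B", "var_C", "var_D"]
seen_in_controls = {"var_A", "var_D"}
disease_annotated = {"var_C", "var_D"}

remaining, prioritized = filter_variants(candidates, seen_in_controls, disease_annotated)
print(remaining, prioritized)
```

In this sketch a variant found in both sources (`var_D`) is excluded, which is one possible design choice and not necessarily the right one for a given study.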
Some of the challenges faced in assessing variants include the following:
For monogenic disorders, some variants in a disease‐associated gene occur relatively frequently and their pathogenicity is
established. For other rare variants, the clinical significance is unknown.
For multigenic disorders, allelic heterogeneity makes the interpretation of the clinical significance of variants even more difficult.
There is a large “interpretive gap” as increasing numbers of variants are identified, but their significance has not yet been
assessed.
o Locus‐specific databases are excellent repositories for the cataloguing of variants, but they also need associated clinical or
phenotypic data.
Variant databases such as that of the 1000 Genomes Project are currently used to define neutral variants, but clinical and phenotypic data are not available for those individuals.
o Even if they are defined as “apparently normal,” all are susceptible to disease.
Disease Databases
https://doi.org/10.1038/nmeth.2890
Linkage Analysis
A genetic linkage map displays genetic information in reference to linkage groups (chromosomes) in a genome.
The mapping units are centiMorgans (cM).
Map distances are based on recombination frequency between polymorphic markers such as SNPs or microsatellites.
One cM corresponds to one recombination event in 100 meioses (for the human genome, the recombination rate is typically 1–2 cM/Mb).
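The cM definition above translates directly into arithmetic. The sketch below assumes short distances, where the observed recombination frequency approximates map distance (over longer intervals, multiple crossovers make the relationship non‐linear). The function names and the counts are hypothetical.

```python
def genetic_distance_cm(recombinants, meioses):
    """Map distance in cM: recombination events per 100 meioses."""
    return 100.0 * recombinants / meioses


def physical_span_mb(distance_cm, low_rate=1.0, high_rate=2.0):
    """Rough physical range in Mb, using the 1-2 cM/Mb rate quoted above."""
    # A higher recombination rate means fewer Mb per cM, so it gives the lower bound.
    return distance_cm / high_rate, distance_cm / low_rate


# Hypothetical observation: 3 recombinants between two markers in 100 meioses.
distance = genetic_distance_cm(3, 100)
span = physical_span_mb(distance)
print(distance, span)
```

Under these assumptions the two markers lie about 3 cM apart, corresponding to very roughly 1.5 to 3 Mb of DNA.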
In linkage studies, genetic markers are used to search for coinheritance of chromosomal regions within families
Two genes that are in proximity on a chromosome will usually cosegregate during meiosis.
By following the pattern of transmission of a large set of markers in a large pedigree,
linkage analysis can be used to localize a disease gene based on its linkage to a genetic marker locus.
Huntington’s disease (OMIM#143100) was the first autosomal disorder for which linkage analysis was used to identify the disease
locus
Linkage is usually successful for single‐gene disease models rather than for complex traits.
It also typically involves studies of large pedigrees.
While the genetic basis of over a thousand single‐gene disorders has been found, it is far more difficult to identify the genetic causes of common human diseases that involve multiple genes.
A large number of genes may each make only a small contribution to the disease risk.
Genome‐wide association studies (GWAS) rely on SNP microarrays having several hundred thousand to more than a million SNPs represented on a single array.
There are two main experimental designs:
Family‐based design: markers are measured in affected individuals (probands) and unaffected individuals to identify differences in the frequency of variants.
Population‐based design: a large number of unrelated cases and controls are studied, with hundreds or thousands in each group (the larger the better in terms of statistical power).
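For the population‐based design, a single‐SNP test can be sketched as a chi‐square test on allele counts in cases versus controls. This is an illustrative assumption about the analysis, not the method of any particular study: the counts are made up, and the Bonferroni correction for one million tests is a common convention that the lecture does not itself specify.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square statistic (1 degree of freedom) and p-value for a 2x2 allele table."""
    n = case_alt + case_ref + control_alt + control_ref
    numerator = n * (case_alt * control_ref - case_ref * control_alt) ** 2
    denominator = (
        (case_alt + case_ref) * (control_alt + control_ref)
        * (case_alt + control_alt) * (case_ref + control_ref)
    )
    chi2 = numerator / denominator
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after correcting for n_tests comparisons."""
    return alpha / n_tests


# Hypothetical allele counts for one SNP.
chi2, p_value = allelic_chi_square(case_alt=60, case_ref=140, control_alt=40, control_ref=160)
threshold = bonferroni_threshold(0.05, 1_000_000)
print(chi2, p_value, p_value < threshold)
```

The example SNP would look nominally significant in isolation, yet it falls far short of the corrected threshold. This illustrates why large sample sizes matter when up to a million SNPs are tested at once.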
Note: GWAS that succeed in identifying strong evidence of association often implicate intergenic regions far removed from protein‐coding genes.
A key aspect of genome‐wide association studies is that replication studies are required to confirm that positive signals are authentic.
The NCI‐NHGRI Working Group on Replication in Association Studies addressed replication as a way to eliminate the false positive results that often occur.
Repositories of GWAS data
Catalog of Published Genome‐Wide Association Studies >>> at the National Human Genome Research Institute (NHGRI)
Database of Genotypes and Phenotypes (dbGaP) >>> at the National Library of Medicine (NLM)
o Contents: study documentation (e.g., protocols and data collection instruments); phenotypic data (of individuals and as a summary); genetic data (genotypes, pedigrees, mapping results); statistical results (e.g., linkage and association results)
The most common chromosomal aberrations in early development include the gain or loss of whole chromosomes.
Other common phenomena include large‐scale duplications, deletions, or rearrangements involving many millions of base pairs.
These can be detected using >>> standard cytogenetic approaches: karyotype analysis and fluorescence in situ hybridization (FISH)
Newer or improved techniques
Spectral karyotyping/multiplex‐FISH (SKY/M‐FISH)
o Uses a different fluorescent label for each chromosome, facilitating the identification of abnormal karyotypes
Array comparative genomic hybridization (aCGH)
o Provides high‐resolution, high‐throughput detection of unbalanced structural and numerical chromosomal alterations
Both genomic microarrays (aCGH) and SNP microarrays are used routinely to identify
disease‐associated chromosomal abnormalities.
Research studies must be approved by an Institutional Review Board (IRB) to confirm that appropriate procedures are in place.
In Turkey, the corresponding body is the Ethics Committee (or Board).
Informed consent must be obtained from the research participants or from patients.
The informed consent document explains the risks and benefits of a study.
o One risk of an exome study is the potential loss of sequence data by the research team.
o Another is the possible negative impact of learning that a family member has a disease‐causing mutation.
Consider a research study involving whole‐exome sequencing of a child with autism and his/her parents. The inclusion of the
parents’ exomes is critical because it allows inherited variants to be distinguished from de novo variants.
What procedure will be followed if a parent or child has a mutation in a cancer‐causing gene? This possibility should be
addressed as part of the informed consent process, and the IRB should review this procedure.
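The role of the parents' exomes in the trio example above can be sketched with set operations. This is a deliberately simplified illustration: variants are opaque, made‐up identifiers, and presence in a parent is treated as evidence of inheritance. Real trio analyses work with genotypes and quality metrics, and an apparent de novo call can also be a sequencing artifact.

```python
def classify_child_variants(child, mother, father):
    """Split the child's variants into inherited and candidate de novo sets."""
    inherited = {v for v in child if v in mother or v in father}
    de_novo_candidates = child - inherited
    return inherited, de_novo_candidates


# Hypothetical variant identifiers for one trio.
child = {"var_1", "var_2", "var_3", "var_4"}
mother = {"var_1", "var_5"}
father = {"var_2", "var_3", "var_6"}

inherited, de_novo_candidates = classify_child_variants(child, mother, father)
print(sorted(inherited), sorted(de_novo_candidates))
```

Without the parental data, all four of the child's variants would be indistinguishable; with it, only `var_4` remains as a de novo candidate.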
For clinical sequencing
The American College of Medical Genetics and Genomics (ACMG) issued recommendations for reporting incidental findings in exome and genome sequencing.
A primary finding is defined as “pathogenic alterations in a gene or genes that are relevant to the diagnostic indication for which the sequencing was ordered.”
Incidental findings are unexpected positive findings.
o The ACMG defines them as the results of a deliberate search for pathogenic or likely pathogenic alterations in genes that are not apparently relevant to a diagnostic indication for which the sequencing test was ordered.
The study of human disease genes and gene products in other organisms is of fundamental importance in our efforts to understand the pathophysiology of human disease.
While mutations in genes cause many diseases, it is the aberrant protein product that has the proximal functional consequence on the cell and ultimately on the organism.
Once the counterpart (ortholog) of a human disease gene is identified in a model organism, it can often be knocked out or otherwise manipulated.
This allows the phenotypic consequences of specific mutations to be assessed.