Clinical and Experimental Allergy, 1998, Volume 28, Supplement 1, pages 84–87

Analytic options for asthma genetics


S. S. RICH
Department of Public Health Sciences, Bowman Gray School of Medicine, Winston-Salem, North Carolina, USA
Summary
The analytic methods currently being used for the study of the genetics of asthma have
primarily focused on the evaluation of linkage by non-parametric methods as applied to
genome screen data in affected sibling pairs. Complexity in the analysis of asthma genetics
has been shown to occur at several levels, including phenotypic definition (wide vs narrow
criteria for asthma, including restriction based upon multiple phenotypes) and joint analysis
of asthma with associated phenotypes. Alternative approaches that purport to treat asthma
as a quantitative trait (a score or index) rather than as a qualitative (asthma, yes or no)
trait were presented, including the development of a Framingham Risk Score for asthma, a
selection index, or a propensity score. While each of these alternatives has interesting
features, issues relating to estimation and incorporation in a family structure have yet to be
resolved. Nonetheless, collection of a standard set of clinical data from multiple studies
could be used in a score to increase the power of genetic mapping studies for asthma.

Introduction
Asthma is a complex genetic disorder that is highly familial
yet does not follow a simple pattern of inheritance [1,2]. In
order to successfully map asthma susceptibility genes,
several analytic methods will need to be employed and,
ultimately, new methods will need to be developed. Prior to
application of analytic options for detection of linkage from
either a candidate gene or anonymous marker screen,
however, the definition of the phenotype (asthma) has to
be established. While not the focus of this discussion,
inconsistent definition of asthma within a study will introduce misclassification (with accompanying loss of power)
and decrease the likelihood of replication of linkage results
across studies.
The prevalence of asthma appears to be increasing in
most parts of the industrialized world and, with increasing
diagnoses, the number of families with multiple cases of
asthma is also increasing [3,4]. Since the increase in
asthma cases has occurred over a short period of (evolutionary) time, it is likely due
to either increased diagnosis or increased exposure to
environmental triggers that allow expression of asthma
susceptibility genes. While the gold standard for diagnosis
of asthma is likely a combination of clinical history of
cough, wheeze and shortness of breath in the absence of
infection, and evidence of bronchial hyperresponsiveness,
this latter clinical evaluation may not be practical for most
genetic studies [5]. Nonetheless, the reliance on a verbal
clinical history may not be sufficiently precise to permit
confidence in the genetic analysis. In the context of a
genetic study of asthma, the phenotype should be defined
in a way that is practical for the study design and
reproducible within and across studies.

Correspondence: Dr S. S. Rich, Department of Public Health Sciences,
Bowman Gray School of Medicine, Winston-Salem, North Carolina, USA.
© 1998 Blackwell Science Ltd

Asthma as a qualitative trait
The standard definition of asthma is one based upon a
medical history or evidence of airways reactivity, resulting
in a binary (yes/no) response. Although many candidate
genes have been evaluated in the context of a case-control
(association) study, recent studies in asthma have focused
on the affected sib-pair family design [6,7], a design shown
to have been effective in the identification of regions of
interest for multifactorial disorders (insulin-dependent diabetes mellitus (IDDM), non-insulin dependent diabetes
mellitus (NIDDM), multiple sclerosis (MS), coeliac disease).
Although technically a simple study design, the collection
of affected sibling pairs (with parents) and analysis of mean
proportion of alleles shared identical by descent (or identical by state) using a highly polymorphic panel of genetic
markers has developed into a standard protocol for detecting
linkage. As the sib pair analytic approach does not require
the specification of a genetic model, or the calculation of
complex pedigree likelihoods, the approach is relatively
robust with large sample sizes [8].
For certain combinations of genetic parameters, the
collection of discordant sibling pairs can serve as a powerful
complementary approach for gene mapping. As demonstrated for diabetic nephropathy [9], diseases with high
sibling risk (≈ 70%) and high prevalence (≈ 30%) may be
more easily mapped using the discordant sibling pair (DSP)
approach than using affected sibling pairs (ASP). As the
sibling risk and the prevalence of the disorder differentially
decrease (resulting in an increasing λS, the ratio of sibling risk to population prevalence), the gain of DSP
over ASP diminishes, so that for traits with larger λS values,
the ASP method becomes more powerful. Since there are
few data concerning λS for asthma (although some estimates range from 2 to 6), both the ASP and the DSP
approaches could be employed to map asthma susceptibility
loci.
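As a rough numerical illustration of the quantities discussed above, the following sketch computes λS as the ratio of sibling recurrence risk to population prevalence. That definition is the conventional one and is assumed here; the function name and the input values are illustrative and do not come from the studies cited.

```python
def sibling_risk_ratio(sibling_risk: float, prevalence: float) -> float:
    """Return lambda_S = (recurrence risk in siblings) / (population prevalence)."""
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0.0 <= sibling_risk <= 1.0:
        raise ValueError("sibling_risk must be in [0, 1]")
    return sibling_risk / prevalence


# Illustrative values only: a common disorder with high sibling risk
# gives a small lambda_S (the setting in which DSP may be preferred) ...
common = sibling_risk_ratio(0.70, 0.30)
# ... whereas a rarer disorder with lower sibling risk gives a larger lambda_S.
rare = sibling_risk_ratio(0.06, 0.004)
```

With these made-up inputs the common disorder has a λS a little above 2, while the rarer disorder has a much larger value, which is the direction in which the ASP design gains power.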
In the affected sib pair design, the formal test for
significance of the deviations from expectation in sibling
pairs is the maximum likelihood statistic (MLS) proposed
by Risch [10]. In this formulation, the likelihood retains
the same form whether ASP or DSP are used; only the
components of the likelihood change. The likelihood
for N sib pairs can be written as
\[ L = \prod_{j=1}^{N} \Big\{ \sum_{i=0}^{2} w_{ij}\, y_i \Big\}, \qquad i = 0, 1, 2; \; j = 1, \ldots, N, \]
where $w_{ij} = P(\text{marker phenotype} \mid i \text{ marker alleles IBD for sib pair } j)$
and $y_i = P(\text{sibs share } i \text{ marker alleles IBD} \mid \text{sib pair type})$,
and where sib pair type can be either ASP or DSP.
The values of $y_i$ that maximize L represent the maximum
likelihood estimates of $y_i$, and the ratio of the maximized
likelihood to the likelihood expected under no
linkage (usually reported as its base-10 logarithm) represents the MLS, where large values of MLS are
indicative of linkage. Since deviations in sharing under
linkage for ASP are in the direction of an excess of sharing
two alleles, and deviations in sharing under linkage for DSP
are in the direction of sharing 0 alleles, the collection of
ASP and DSP families will require separate analyses.
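The likelihood described above can be maximized numerically. The following is a minimal sketch, not the procedure used in any of the cited studies: it assumes a simple EM iteration started from the null sharing probabilities (1/4, 1/2, 1/4), places no constraints on the estimated sharing probabilities, and reports the MLS on the log10 scale. The function names and the toy data are hypothetical.

```python
import math

NULL_SHARING = (0.25, 0.50, 0.25)  # expected IBD sharing (0, 1, 2 alleles) under no linkage


def log10_likelihood(w, y):
    """log10 of L = prod_j sum_i w[j][i] * y[i], where w[j][i] is w_ij for pair j."""
    return sum(math.log10(sum(wji * yi for wji, yi in zip(wj, y))) for wj in w)


def maximize_sharing(w, n_iter=200):
    """EM estimate of y = (y0, y1, y2), started from the null sharing values."""
    y = list(NULL_SHARING)
    for _ in range(n_iter):
        totals = [0.0, 0.0, 0.0]
        for wj in w:
            denom = sum(wji * yi for wji, yi in zip(wj, y))
            for i in range(3):
                # posterior probability that pair j shares i alleles IBD
                totals[i] += wj[i] * y[i] / denom
        y = [t / len(w) for t in totals]
    return y


def mls(w):
    """Return (log10 likelihood ratio vs. no linkage, estimated sharing)."""
    y_hat = maximize_sharing(w)
    return log10_likelihood(w, y_hat) - log10_likelihood(w, NULL_SHARING), y_hat


# Hypothetical, fully informative data: each pair's IBD state is known,
# so w_ij is 1 for the observed state and 0 otherwise.
pairs = [(1, 0, 0)] * 10 + [(0, 1, 0)] * 40 + [(0, 0, 1)] * 50
score, y_hat = mls(pairs)
```

For fully informative pairs the estimates reduce to the observed proportions in each IBD class; with partially informative markers the same iteration weights each pair by its `w` values.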
With available parents, $w_{ij}$ can be easily calculated and is
then independent of marker allele frequencies. When data
on sib pairs are collected, some families will be ascertained
that have more than two affected siblings. When the
performances of several sib pair linkage tests were compared (the two-allele test, based on the proportion of sib pairs with two
marker alleles IBD; the mean test, based on the mean number of
marker alleles IBD; and a chi-square goodness-of-fit test
comparing observed and expected IBD), it was shown
that the two-allele test had significance levels most widely
dispersed from theoretical values, often less significant than
expected [8]. Depending on the true underlying genetic
model, either the two-allele or the mean test would have
superior power. The mean test statistic was also shown to be usually
unaffected by including affected sibships of size three or
larger, with all possible pairs treated as if independent. Thus, as long as the total number of sib pairs is large
(greater than 100), all affected pairs can be used in the
analyses.
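The three tests compared above can be sketched for the simplest case in which the IBD state of every pair is known. The formulas below are the usual large-sample versions under the null sharing probabilities (1/4, 1/2, 1/4); they are assumed here for illustration and are not taken from reference [8]. The counts are hypothetical.

```python
import math


def sib_pair_tests(n0: int, n1: int, n2: int) -> dict:
    """Compare observed IBD counts (0, 1, 2 alleles shared) with the null 1/4, 1/2, 1/4."""
    n = n0 + n1 + n2
    # Two-allele test: proportion of pairs sharing 2 alleles IBD vs 1/4.
    p2 = n2 / n
    z_two_allele = (p2 - 0.25) / math.sqrt(0.25 * 0.75 / n)
    # Mean test: mean proportion of alleles shared IBD vs 1/2.
    # Per pair the proportion is 0, 1/2 or 1, with null variance 1/8.
    mean_share = (0.5 * n1 + n2) / n
    z_mean = (mean_share - 0.5) / math.sqrt(1.0 / (8.0 * n))
    # Chi-square goodness of fit on the three IBD classes (2 df).
    expected = (0.25 * n, 0.50 * n, 0.25 * n)
    chi_square = sum((o - e) ** 2 / e for o, e in zip((n0, n1, n2), expected))
    return {"two_allele_z": z_two_allele, "mean_z": z_mean, "chi_square": chi_square}


# Hypothetical counts for 100 affected sib pairs
result = sib_pair_tests(10, 40, 50)
```

The mean test is written in terms of the proportion of alleles shared, which is the mean number of alleles shared divided by two, so the two scalings give the same test.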
Although sib pair analysis can take place with either
candidate gene or anonymous marker data, the use of
genome screen methods provides increased information in
a multipoint analysis. This approach usually assumes that
the recombination fraction (θ) between an asthma susceptibility locus and the nearest marker is 0.05 (≈ 5 cM), corresponding to a locus
midway between two markers in a 10 cM genome-wide screen. For
genetic markers in a typical screen, the polymorphic information content (PIC) ranges from 0.70 to 0.90; even at
the low end of PIC values, multipoint analysis [11] would
typically result in a significant gain in information. The
critical value for detecting linkage in a genome screen is
usually set at a criterion of 3.6, corresponding to a significance level of 2 × 10⁻⁵, in order to assure a global type I error
rate of 5% [12].
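The correspondence between a critical value of 3.6 and a pointwise significance level near 2 × 10⁻⁵ can be checked with a short calculation. The sketch below assumes that the critical value is a lod score and that 2 ln(10) × lod behaves as a one-sided chi-square statistic with one degree of freedom; it also uses the Haldane map function (an assumption, since no map function is named above) to relate 5 cM to a recombination fraction. Function names are illustrative.

```python
import math


def lod_to_pvalue(lod: float) -> float:
    """One-sided pointwise p-value for a lod score, assuming 2*ln(10)*lod
    follows a chi-square distribution with 1 df under no linkage."""
    chi_square = 2.0 * math.log(10.0) * lod
    z = math.sqrt(chi_square)
    # Upper-tail probability of a standard normal deviate
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def haldane_theta(distance_cm: float) -> float:
    """Recombination fraction for a map distance in cM (Haldane map function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


p_36 = lod_to_pvalue(3.6)     # pointwise significance at the genome-wide threshold
theta_5cm = haldane_theta(5)  # marker 5 cM from the locus
```

Under these assumptions the p-value comes out at roughly 2 × 10⁻⁵ and the recombination fraction at slightly under 0.05, in line with the values quoted in the text.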
Associated phenotypes
Asthma, as defined by history and bronchial hyperreactivity
(BHR), lends itself to decomposition into component (intermediate) phenotypes, including BHR, atopy, and skin-test
reactivity [13]. Since the individual components of symptoms that comprise asthma may be more common than
asthma, per se, the families ascertained for occurrence of
asthma may contain more members affected with BHR,
atopy, or skin-test reactivity. As power for mapping genes
depends on the number of affected sibling pairs, analysis of
BHR or atopy in families ascertained on the basis of two or
more affected siblings with asthma may have more power
than that associated with asthma.
When families are analysed for asthma and its associated
phenotypes (e.g. BHR and atopy), several outcomes are
possible with respect to regions of interest (Fig. 1). First,
there could be complete concordance for regions of interest
of all phenotypes (asthma, BHR and atopy). While this
event may provide confirmation that the region is of
interest for disease, it is not clear which disease(s) the
region controls. On the other hand, discordance within a
region (significant for asthma but not for BHR and atopy)
suggests importance for the disease, yet raises the concern that there is
no supporting evidence from the clearly associated phenotypes.
An alternative to analysis of asthma separately from its
associated phenotypes is the restriction of the asthma
phenotype to those who exhibit asthma, BHR and atopy.
While this approach may provide increased homogeneity of
phenotype, there will also be a decreased number of individuals meeting the more restricted criteria and therefore a
reduction in the number of affected sibling pairs (and
reduced power).
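The trade-off between phenotype homogeneity and the number of affected sibling pairs can be made concrete by counting pairs under each definition. The sibships below are invented purely for illustration, and the helper name is hypothetical.

```python
from itertools import combinations

# Hypothetical sibships: each sibling is a dict of phenotype indicators.
sibships = [
    [{"asthma": True, "bhr": True, "atopy": True},
     {"asthma": True, "bhr": True, "atopy": False},
     {"asthma": False, "bhr": True, "atopy": True}],
    [{"asthma": True, "bhr": True, "atopy": True},
     {"asthma": True, "bhr": True, "atopy": True},
     {"asthma": True, "bhr": False, "atopy": True}],
]


def count_affected_pairs(sibships, is_affected):
    """Count all sibling pairs in which both members meet the definition."""
    total = 0
    for sibs in sibships:
        affected = [s for s in sibs if is_affected(s)]
        total += sum(1 for _ in combinations(affected, 2))
    return total


# Wide definition: asthma only
asthma_pairs = count_affected_pairs(sibships, lambda s: s["asthma"])
# Restricted definition: asthma together with BHR and atopy
strict_pairs = count_affected_pairs(
    sibships, lambda s: s["asthma"] and s["bhr"] and s["atopy"]
)
```

In this toy data set the restricted definition retains only one of the four pairs available under the wide definition, which is the loss of sample size described above.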

Fig. 1. Multipoint linkage mapping of asthma, atopy and BHR in Minnesota kindreds.

Asthma as a quantitative trait
The power associated with mapping a quantitative trait with
modest heritability is often greater than that for a qualitative
trait. The developments made in analysis of quantitative
trait loci (QTL) have allowed both candidate gene and
genome screening to be employed. Further, use of discordant pairs (extreme quintiles or deciles) could provide an
additional increment in power [14,15]. In order to transform
the binary response (asthma, yes/no) to a continuous variable, several possible approaches have been developed, all
related to the establishment of scores.
An approach popularized by investigators in cardiovascular disease has been the development of health risk
appraisal functions, using a defined population of subjects
(both with and without disease), measured risk factors and
demographic data. The Framingham Risk Score (FRS) was
constructed from the events observed in the Framingham
Study cohort, using data collected on participants from Framingham, Massachusetts, and at follow-up examinations over
an extended period [16,17]. On the basis of the risk factors
identified by epidemiologic analyses, a logistic regression
(or discriminant analysis) approach was used to generate a
simple function that allows for the estimation of probabilities of disease by specific level of risk factor. Two issues
related to the creation of an asthma index, similar to that of
the Framingham Risk Score, need to be resolved. The first
issue is the identification of risk factors. Even though some
may be known with some certainty (age, cigarette smoking,
atopy, etc.), there have been few population-based studies
that allow for construction of an asthma risk profile that
could be used in multiple populations.
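A logistic risk function of the kind described above turns a set of risk factor values into a predicted probability that could serve as a quantitative score. The sketch below shows only the form of such a function; the intercept, the coefficients and the choice of risk factors are hypothetical and have not been estimated from any data.

```python
import math

# Hypothetical coefficients (log odds ratios); NOT estimated from any data.
INTERCEPT = -3.0
COEFFICIENTS = {"age_decades": 0.10, "smoker": 0.50, "atopy": 1.20}


def risk_score(subject: dict) -> float:
    """Predicted probability of disease from a logistic risk function."""
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * subject[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


low = risk_score({"age_decades": 2, "smoker": 0, "atopy": 0})
high = risk_score({"age_decades": 2, "smoker": 1, "atopy": 1})
```

The score is continuous on the interval (0, 1), so two siblings who are both unaffected, or both affected, can still differ in their values.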

A second issue is that asthma is a disease of temporality
and environment, such that symptoms may be temporary
and occur only with specific exposures. Since amount of exposure, as
well as time of exposure, may be of importance, the Cox
proportional hazards regression model could be employed
to model an asthma risk profile. An advantage of the Cox
model is that it provides a procedure to estimate asthma risk
for specific levels of risk factors for variable lengths of
exposure.
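Under a proportional hazards model, the risk of disease by time t for a subject with covariates x can be written as 1 − S0(t)^exp(βx), where S0(t) is the baseline survival function. The sketch below applies that standard relationship; the baseline survival values, the coefficients and the covariates are hypothetical and serve only to show how the risk varies with length of exposure.

```python
import math


def cox_risk(baseline_survival: float, log_hazard_ratios: dict, subject: dict) -> float:
    """Risk by time t: 1 - S0(t) ** exp(sum(beta_k * x_k)) under proportional hazards."""
    linear_predictor = sum(log_hazard_ratios[k] * subject[k] for k in log_hazard_ratios)
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical values: baseline 5-year and 10-year asthma-free survival
S0 = {5: 0.95, 10: 0.90}
BETAS = {"smoker": 0.4, "atopy": 0.9}

subject = {"smoker": 1, "atopy": 1}
risk_5y = cox_risk(S0[5], BETAS, subject)
risk_10y = cox_risk(S0[10], BETAS, subject)
reference_5y = cox_risk(S0[5], BETAS, {"smoker": 0, "atopy": 0})
```

The same coefficients give a different risk at 5 and at 10 years, which is the feature that distinguishes this approach from a single logistic risk function.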
A third approach to construction of a quantitative score
for asthma returns to those efforts made in the animal
breeding literature to combine multiple traits for simultaneous improvement by artificial selection. Animal breeders
had noted that the most rapid improvement in multiple traits
under selection was made not by selecting traits sequentially, but by applying selection simultaneously [18].
This meant that, for example, improvement in bodyweight,
egg production and meat composition in poultry was most
effective when the three phenotypes were improved simultaneously, although not at the maximum rate for any one trait. The
method of simultaneous selection was implemented by a
function termed a selection index, a linear
function of the traits whose weights depend on the heritabilities and the genetic/phenotypic correlations between traits. When traits are
independent, the weights are, in effect, the heritabilities of
the individual traits. In the case of asthma, a selection index
could be constructed with respect to estimated heritabilities
of asthma, BHR and atopy, and estimated genetic correlations between asthma and BHR, asthma and atopy, and BHR
and atopy.
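One common way to write a selection index is the Smith-Hazel form, in which the vector of index weights is b = P⁻¹Ga, with P the phenotypic covariance matrix, G the genetic covariance matrix and a the relative importance of each trait. That form is assumed here for illustration rather than taken from the text, and the heritabilities below are invented. The sketch also checks the statement that, for independent traits, the weights reduce to the heritabilities.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def index_weights(P, G, a):
    """Index weights b = P^{-1} G a (P: phenotypic, G: genetic covariance; a: trait values)."""
    Ga = [sum(g * ai for g, ai in zip(row, a)) for row in G]
    return solve(P, Ga)


# Hypothetical standardized traits (asthma, BHR, atopy) that are independent,
# with heritabilities 0.3, 0.5 and 0.4 and equal importance.
P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
G = [[0.3, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.4]]
weights = index_weights(P, G, [1.0, 1.0, 1.0])
```

With correlated traits the off-diagonal entries of P and G would be filled in from the estimated phenotypic and genetic correlations, and the weights would no longer equal the heritabilities.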
Issues related to the analysis of observational studies have
focused on the fact that, in many situations, there is little
control over assignment of treatment group. That is, data are
collected on treated (affected) and untreated (unaffected)
subjects who may differ greatly with respect to their
observed covariates (measures of exposure). One recent
development has been to define the conditional probability
of being treated (affected) given the covariate structure,
providing the likelihood that an individual would have been
treated (affected) given only the covariates. The resulting
propensity score [19] can be used for matching, stratification, or as an individual score in a subsequent analysis. Two
substantial problems remain, however, prior to implementation of the propensity score in a genetic analysis of asthma.
When covariates contain no missing data, the propensity
score can be estimated using discriminant analysis or
logistic regression; however, when missing covariate data
are present, complexities in estimation (depending on the
pattern of missing data) can arise. A second problem is that
applications of propensity scores (and other scores as well)
have centred on the individual in a case-control design. Use
in gene mapping studies adds the complexity of correlations
between individuals within a family due to shared genes
(and environments); thus, a generalized estimating equation
(GEE) approach may need to be combined with the propensity score to appropriately develop a score for each
member of a family.
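For complete covariate data, estimating a propensity score by logistic regression can be sketched as follows. This is a minimal illustration that treats individuals as independent: it fits a one-covariate logistic model by plain gradient ascent and does not address within-family correlation or the GEE extension mentioned above. The data, the function names and the tuning values are all hypothetical.

```python
import math


def fit_logistic(x, y, learning_rate=0.05, n_iter=5000):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        residuals = [yi - 1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi, yi in zip(x, y)]
        b0 += learning_rate * sum(residuals) / n
        b1 += learning_rate * sum(r * xi for r, xi in zip(residuals, x)) / n
    return b0, b1


def propensity(b0, b1, xi):
    """Estimated probability of being affected given the covariate value xi."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))


# Hypothetical data: one exposure covariate and affection status (1 = affected)
exposure = [0, 1, 2, 3, 4, 5, 6, 7]
affected = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(exposure, affected)
scores = [propensity(b0, b1, xi) for xi in exposure]
```

Each subject, affected or not, receives a score between 0 and 1 that could then be used for matching, stratification, or as a quantitative trait.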

Future prospects
Given the complexity of asthma and its associated phenotypes, the diverse populations under study, and the many
groups of investigators who are leading these studies, it
would appear on the surface that collaboration and combined analyses are unlikely. It should be noted,
however, that collaboration can take place at several levels:
using a core set of variables for phenotype definition
(either as a qualitative or quantitative trait) in a joint genetic
analysis, using a core set of genetic markers, or providing a
structure to allow rapid data exchange for replication
studies. The first of these approaches (core variables to
allow joint analysis) would be most flexible, in that differences in results from different populations could not be
attributed to differences in phenotype definition (one group
using a clinical history while another uses history and
BHR, for example).
The ultimate success of each investigator's endeavours
will be the identification of asthma susceptibility genes in
the target population. It is not clear, given the complexity
of genetic susceptibility and environmental exposure, that replication across divergent populations can be expected. It is
also not clear that replication would be likely within an
ethnic group across a divergent geographical region, or
even within the same region, depending upon time of
sampling (seasonality and environmental load) and microexposures that may have macro-effects (such as differences
between dog/cat/house dust mite allergen loads that are
home-specific). In many respects, therefore, failure to replicate an initial linkage should not necessarily suggest a
false-positive result. Rather, the region should be treated with caution, yet
it should remain a region of interest.
Standardization of phenotype definition would facilitate
the evaluation of these numerous interesting regions for
asthma susceptibility genes.
References
1 National Institutes of Health. Global strategy for asthma
management and prevention. NHLBI/WHO workshop report.
NIH, Bethesda, MD. Publication no. 95-3659, 1995.
2 Sibbald B, Horn ME, Gregg I. A family study of the genetic
basis of asthma and wheezy bronchitis. Arch Dis Child 1980;
55:354–7.
3 Ninan TK, Russell G. Respiratory symptoms and atopy in
Aberdeen schoolchildren: Evidence from two surveys 25 years
apart [published erratum appears in Br Med J. May 2; 304
(6835):1157]. Br Med J 1992; 304:873–5.
4 Peat JK, van den Berg RH, Green WF et al. Changing
prevalence of asthma in Australian children. Br Med J 1994;
308:1591–6.
5 Sears MR, Burrows B, Flannery EM et al. Relation between
airway responsiveness and serum IgE in children with asthma
and in apparently normal children. N Engl J Med 1991;
325:1067–71.
6 Daniels SE, Bhattacharrya S, James A et al. A genome-wide
search for quantitative trait loci underlying asthma. Nature
1996; 383:247–50.
7 The Collaborative Study on the Genetics of Asthma (CSGA). A
genome-wide search for asthma susceptibility loci in ethnically
diverse populations. Nature Genet 1997; 15:389–92.
8 Blackwelder WC, Elston RC. A comparison of sib-pair linkage
tests for disease susceptibility loci. Genet Epidemiol 1985;
2:85–97.
9 Rogus JJ, Krolewski AS. Using discordant sib pairs to map loci
for qualitative traits with high sibling recurrence risk. Am J
Hum Genet 1996; 59:1376–81.
10 Risch N. Linkage strategies for genetically complex traits. III.
The effect of marker polymorphism on analysis of affected
relative pairs. Am J Hum Genet 1990; 46:242–53.
11 Kruglyak L, Lander ES. Parametric and nonparametric linkage
analysis: A unified multipoint approach. Am J Hum Genet
1996; 58:1347–63.
12 Lander ES, Kruglyak L. Genetic dissection of complex traits:
Guidelines for interpreting and reporting linkage results.
Nature Genet 1995; 11:241–7.
13 Burrows B, Sears MR, Flannery EM, Herbison GP, Holdaway
MD. Relations of bronchial responsiveness to allergy skin test
reactivity, lung function, respiratory symptoms, and diagnoses
in thirteen-year-old New Zealand children. J Allergy Clin
Immunol 1995; 95:548–56.
14 Risch N, Merikangas K. The future of genetic studies of
complex human diseases. Science 1996; 273:1516–7.
15 Gu C, Todorov A, Rao DC. Combining extremely concordant
sibpairs with extremely discordant sibpairs provides a cost
effective way to linkage analysis of quantitative trait loci.
Genet Epidemiol 1996; 13:513–33.
16 Gordon T, Sorlie P, Kannel WB. Coronary Heart Disease,
Atherothrombotic Brain Infarction, Intermittent Claudication -
A Multivariate Analysis of Some Factors Related to Their
Incidence: The Framingham Study, 16-Year Follow-up. Section 27. NHLBI, Bethesda, MD, US Government Printing
Office.
17 Abbott RD, McGee D. The Probability of Developing Certain
Cardiovascular Disease in Eight Years at Specified Values of
Some Characteristics: The Framingham Study; NHLBI Publication no. 37, Bethesda, MD, 1987.
18 Hazel LN, Lush JL. The efficiency of three methods of
selection. J Heredity 1943; 33:393–9.
19 Rosenbaum PR, Rubin DB. The central role of the propensity
score in observational studies for causal effects. Biometrika
1983; 70:41–55.
