
Statistical Methods for Hazards and Health Author(s): Yvonne M. M. Bishop Source: Environmental Health Perspectives, Vol. 20 (Oct., 1977), pp. 149-157 Published by: Brogan & Partners Stable URL: http://www.jstor.org/stable/3428653 . Accessed: 01/10/2013 15:59


The National Institute of Environmental Health Sciences (NIEHS) and Brogan & Partners are collaborating with JSTOR to digitize, preserve and extend access to Environmental Health Perspectives.

http://www.jstor.org

This content downloaded from 186.18.32.91 on Tue, 1 Oct 2013 15:59:06 PM All use subject to JSTOR Terms and Conditions

Statistical Methods for Hazards and Health

by Yvonne M. M. Bishop*

Environmental Health Perspectives, Vol. 20, pp. 149-157, 1977

The objective of this article is to document the need for further development of statistical methodology, training of more statisticians and improved communication between statisticians and the many other disciplines engaged in environmental research. Discussion of adequacy of the current statistical methodology requires the use of examples, which will hopefully not be offensive to the authors. Reference is made to recent developments and areas of unsolved problems delineated in three broad areas: enumeration data and adjusted rates, time series, and multiple regression. A brief outline of the ideas behind current methods of analyzing discrete data is followed by a demonstration of their utility using an example of the effects of exposure, sex, and education on bronchitis rates. Examples are listed of the ubiquity of the time component when relating pollution effects to each other and to health effects. An artificial example is used to emphasize the effects of time-dependent autocorrelations, trends, and cycles. References are given to a variety of new developments in time-series analysis. Discussion of the pitfalls in multiple regression and possible alternative approaches is largely based on two recent reviews and includes references to recent developments of robust techniques.

Introduction

Dramatic episodes of fog or smog accompanied by notably increased mortality and morbidity have convinced us that polluted air affects health (1-3). Now we must determine more precisely how much pollution and what type of pollution causes disability. Both the exposure variable "air quality" and the outcome variable "health effects" are hard to define and measure. Much discussion centers on the reliability and validity of specific measures and on design requirements; increasingly, attention is being paid to numerous ancillary factors or covariates that influence postulated relationships. All these issues are of crucial importance in designing good studies and point to the need for interdisciplinary input when studies are being designed.

If a study is poorly designed no amount of subsequent statistical legerdemain will produce meaningful results. Conversely, even the best designed studies can lead to misleading conclusions if the data are inadequately analyzed. We need both good design and good analysis. This paper addresses only the issue of data analysis and ignores study design, except insofar as improvements of analytic techniques will reflect on methodology.

As the need for better methodology cannot be appreciated unless the deficiencies of the present state-of-the-art are considered, examples will be given where the information obtained from the available data is not optimum. Examples for this purpose have been taken from a Chess monograph (4). In some instances the state of the art has improved since this work was done; in other areas many deficiencies still exist. The purpose of using these examples is not to criticize but to demonstrate the importance of improving our analytic techniques.

The introductory overview to the Chess monograph cites two statistical methodologies, general linear regression for quantitative variables and general linear models for categorical responses (4). The similarity of the two methods is stressed. Below we show how the emphasis on this similarity has led the authors to report their analyses of categorical models inappropriately and generally to exploit the strengths of the analytic technique inadequately. We discuss the problems of time series and why linear regression techniques are inappropriate for their analysis. Some of the modern advances in fitting linear and nonlinear models to quantitative variables are mentioned briefly. We conclude that the 1970 task force recommendations should be stressed once again.

*Harvard School of Public Health, Boston, Massachusetts 02115.

Enumeration Data and Adjusted Rates

What Is a Log-Linear Model?

In recent years there has been much development in the handling of discrete data that have many categorical variables. As an example of the type of situation where they are of value we include Tables 1-3, which are taken from the Rocky Mountain studies (4). Inspection of Tables 1 and 2 indicates that we have the following five variables: bronchitis, two categories, yes or no; exposure area, two categories; sex, two categories; education, three categories; age, four categories. Multiplying together the number of categories tells us that each person is distributed into one of 96 cells.

Most authors agree that the interactions between the variables can best be determined by fitting models that are linear in the logarithmic scale. Each main or interaction effect is represented by a term in the log-linear model. The models can be extended to include many variables.

Suppose we are interested in the effect of the three variables sex, age, and exposure area on the prevalence of bronchitis. The most complex model states that each of the three variables has a proportional effect on the bronchitis rate, that each pair of variables may modify the effect of the other, and indeed that all three variables may have a joint effect. This is equivalent to saying that the effect of age on the bronchitis rate is not the same for each sex, and that the magnitude of this interaction varies between exposure areas. We say that this model includes the four-factor interaction bronchitis-age-sex-area. At the other extreme, the simplest model states that the bronchitis rate is constant for every sex-age-area combination. Between the most complex and the simplest model we can choose from a large variety of intermediate models, each postulating different combinations of simple proportional main effects and interaction effects. Analysis consists of determining which intermediate model fits the data well and is not appreciably improved by adding more terms.

How Do We Choose a Model?

Although most authors are agreed upon the general utility of the log-linear model approach, there is some disagreement over the methods of obtaining estimates under a specific model and determining how well these estimates fit the observed data. Most of the proposed methods such as maximum likelihood, least squares, or minimum chi-square usually yield comparable if not identical estimates, and the probability levels associated with the goodness-of-fit statistics are in general very close. The most commonly used measures are asymptotically distributed according to the chi-square distribution and so the probability of observing a value as large or larger than the value tabulated may be readily obtained. The degrees of freedom associated with these measure-of-fit statistics are determined from the number of categories in the relevant variables. Thus although we can choose from a variety of techniques for fitting models to a particular data set, the final selection of a suitable model is not dependent on the choice of technique. Further discussion of comparisons between techniques has been given elsewhere (7, 8).

A well-fitting model is selected by a process of trial and error, and it includes those main effects and interactions which are large. The main effects and interactions that do not improve the goodness-of-fit are discarded. We often declare that the effects that are included are "significant" and those that are discarded are "not significant." Indeed, we may finish up with a table resembling an analysis of variance table. Such a table will list effects of importance, each with their associated degrees of freedom, one for each parameter, and give an indication of how the overall goodness-of-fit would be changed if each effect is excluded from the model.

How Does This Help Us?

Fitting models may be helpful in two ways: (a) we can determine which effects are of importance, and (b) we can use the fitted estimates obtained under the model in order to obtain meaningful summary statistics. In our example above, meaningful summary statistics might be bronchitis rates for each exposure area adjusted for differences in the sex and age distributions in the areas.

It is difficult to interpret Table 3 because sufficient information on which model was fitted is not given. If we assume (a) that sex, education and age are related to bronchitis rates, (b) exposure area has no effect on bronchitis rates, (c) the numbers of persons in each sex-education-age category differ by exposure area, and (d) that no multifactor effects are present, then the model fitted would have the terms shown in Table 4.
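The idea of judging how well a model's fitted values agree with the observed counts can be sketched in a few lines. The example below is only an illustration under stated assumptions: it fits the simplest model described above (one bronchitis rate for every group) and computes the Pearson chi-square and the likelihood-ratio statistic. The function names and the counts are hypothetical and are not taken from the Chess monograph.

```python
import math

def fit_constant_rate(cases, totals):
    """Fitted (case, non-case) counts when a single rate applies to every group."""
    rate = sum(cases) / sum(totals)
    return [(n * rate, n * (1.0 - rate)) for n in totals]

def goodness_of_fit(cases, totals):
    """Return (Pearson chi-square, likelihood-ratio statistic, degrees of freedom)."""
    fitted = fit_constant_rate(cases, totals)
    pearson = 0.0
    deviance = 0.0
    for c, n, (fit_case, fit_non) in zip(cases, totals, fitted):
        for obs, exp in ((c, fit_case), (n - c, fit_non)):
            pearson += (obs - exp) ** 2 / exp
            if obs > 0:  # a zero cell contributes nothing to the likelihood-ratio sum
                deviance += 2.0 * obs * math.log(obs / exp)
    # One rate is estimated, so the groups contribute (number of groups - 1) df.
    df = len(totals) - 1
    return pearson, deviance, df

# Hypothetical counts: bronchitis cases and persons examined in four groups.
cases = [10, 20, 30, 40]
totals = [200, 200, 200, 200]
pearson, deviance, df = goodness_of_fit(cases, totals)
print(f"Pearson = {pearson:.2f}, likelihood ratio = {deviance:.2f}, df = {df}")
```

Both statistics would be referred to a chi-square distribution with the stated degrees of freedom. Richer models are compared in the same way, using their own fitted counts.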

Table 1. Prevalence of chronic bronchitis in nonindustrially exposed parents: individual and pooled community rates (percent) by sex and smoking status.

Columns: community; mothers and fathers within nonsmokers, ex-smokers, and smokers; sex- and age-adjusted rates for nonsmokers, ex-smokers, and smokers.
Rows: Pooled low, Low I, Low II, Low III, Pooled high, High I, High II.
[Cell values could not be recovered from the source text.]

a Data from Chess monograph (4).
b Chronic bronchitis rates are equivalent to crude rates for symptom severities 6 and 7.

Table 2. Smoking- and sex-specific prevalence rates (percent) for chronic bronchitis by education and age.

Columns: category; mothers and fathers within nonsmokers, ex-smokers, and smokers.
Rows: Education: <High school, High school, >High school. Age: ≤29, 30-39, 40-49, ≥50.
[Cell values could not be recovered from the source text.]

a Data from Chess monograph (4).

Table 3. Analysis of variance for health observations in smokers and nonsmokers, chronic bronchitis.

Factor          Degrees of freedom
Sex             1
Education       1
Age             1
Exposure        1
Fit of model    11

The table also gives χ² and probability (p) for nonsmokers and for smokers. [These values could not be recovered from the source text.]

a Data from Chess monograph (4).
b Ex-smokers and lifetime nonsmokers were combined for this analysis to obtain a larger sample size.

Table 4.

Effects                                                      Parameters fitted
Overall mean and adjustment for distribution of
  persons between areas                                      49
Bronchitis x sex effect                                      1
Bronchitis x education effect                                2
Bronchitis x age effect                                      3
Total                                                        55

If we fit a model with these 55 parameters to the 96 cells we have 96 - 55 = 41 degrees of freedom for assessing the goodness-of-fit of our model. By fitting models with each of the interaction effects removed in turn, we would have one, two and three degrees of freedom associated with the differences in goodness-of-fit. Thus our table would not resemble Table 3 very closely. We would of course have only one degree of freedom for each effect if we reduced the number of categories in each variable to two.
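The degrees-of-freedom arithmetic above can be checked with a short count. The sketch below assumes one particular reading of the 49 parameters in Table 4: 48 parameters reproduce the numbers of persons in each sex-education-age-area combination (the overall mean is one of them), and one more sets the overall level of bronchitis. The function names are illustrative only.

```python
from math import prod

def count_cells(levels):
    """Number of cells in the full cross-classification."""
    return prod(levels.values())

def count_parameters(levels, response, effects):
    """Parameters in a model that fits the explanatory margin exactly and adds
    a response main effect plus response-by-variable interactions.

    Assumption: this decomposition is one plausible reading of Table 4,
    not a statement of how the original analysis was parameterized.
    """
    margin = prod(n for name, n in levels.items() if name != response)
    main = levels[response] - 1
    interactions = sum((levels[response] - 1) * (levels[v] - 1) for v in effects)
    return margin + main + interactions

full = {"bronchitis": 2, "sex": 2, "education": 3, "age": 4, "area": 2}
params = count_parameters(full, "bronchitis", ["sex", "education", "age"])
print(count_cells(full), params, count_cells(full) - params)

# Every variable collapsed to two categories.
reduced = {name: 2 for name in full}
params_reduced = count_parameters(reduced, "bronchitis", ["sex", "education", "age"])
print(count_cells(reduced), params_reduced, count_cells(reduced) - params_reduced)
```

Under this reading the full classification gives 96 cells, 55 parameters and 41 degrees of freedom, and the two-category version gives 32 cells, 20 parameters and 12 degrees of freedom.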

With this reduction we would have 32 cells and be fitting 20 parameters, giving 12 degrees of freedom. The addition of the effect of exposure on bronchitis would bring us to 11 degrees of freedom as given in Table 3. This example has been cited laboriously to illustrate the importance of specifying which model was fitted.

There were further problems in understanding Table 3. Apparently two separate models were fitted, one to smokers and the other to nonsmokers. If we look at the first line of the table we see χ² values for sex and education are larger for smokers than for nonsmokers. We might suspect that smoking had a synergistic influence and enhanced the effects of age and education. Such a suspicion would be unjustified if the sample of smokers was larger than the sample of nonsmokers. We cannot make the assumption because χ² values increase with larger sample sizes, even when the interaction effect they reflect remains constant. We could readily evaluate the possibility of smoking affecting other interactions by the simple procedure of adding smoking as a sixth variable to the other five variables already in the model. Then we could determine the magnitude of possible three-factor effects, one relating smoking-sex-bronchitis and the other relating smoking-education-bronchitis.

If we turn to the second purpose of model fitting, to enable us to adjust rates for several underlying variables simultaneously, we find that this strength of the procedure has been ignored. All the rates given are either crude rates, or adjusted for at most two variables using crude specific rates.

What Improvements Are Needed?

In conclusion, the full strengths of the methodology were not used: (1) variables were reduced to two categories, thus losing information; (2) smoking was not included as a variable, thus its effect cannot be assessed from the results given; (3) the particular model fitted could only be inferred, thus its goodness-of-fit statistics are of no value; (4) the fitted values were not used to compute adjusted rates. Thus the inadequacies were largely due to a lack of understanding of the methodology. This indicates a need for better training and communication.

Some of the difficulties noted above stem from the attempt to present the results in a table format that resembles analysis of variance for continuous data. Although there are similarities in that models are being fitted, it is important to distinguish between the strengths of the different methodologies appropriate for different types of data (9).

Since 1970, further advances in technology have been made, notably methods for dealing with ordered categories (10-14) and methods for computing variances for certain types of estimates. There is still need for further development of methods suitable for a mixture of discrete and continuous variables.

Time Series

Why Do We Need to Look at Them?

The following are examples of situations where the relationships between two or more series of data collected over time are of current interest: (1) assessing the performance of a new pollution measuring device compared with that of a standard device in the field; (2) determining whether adjacent stations monitoring the air in a city are giving comparable data or whether there are real differences in air quality in neighboring regions; (3) determining whether central monitoring stations give a true picture of individual exposure by comparing their readings with personal dosimeter readings; (4) relating fluctuations in indices of disease such as deaths, hospital visits or exacerbation of symptoms to measures of air quality; (5) assessing the extent to which different pollutants increase and decrease simultaneously or with a consistent lag between peaks; (6) prediction of the future levels of a given series so that the effects of intervention may be assessed. Thus the relationship of various time series is central to relating environmental and health effects.

Why Is a Simple Correlation Not Informative?

In each of the situations cited above attempts have been made to use simple correlations as measures of the association between two time series. This approach can be criticized on several levels.

Range of Observations. If each serial measurement could be regarded as independent of all preceding measurements (which is usually untrue) and was taken from a normal distribution then correlation would be a reasonable approach. However, when observing natural phenomena the strength of the association will depend on the range of values that occurred during the observation period. As an illustration, consider Figures 1a and 1b. In Figure 1a, two lines, marked A and B, are connecting a series of points. The points were obtained from a table of random normal deviates (15). Thus the points are independent observations from a normal distribution with mean of zero and variance of one unit.
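The dependence of a correlation on the range of values observed can be imitated with a small simulation in the spirit of Figure 1. This is a sketch, not the construction used for the figure: the deviates come from a seeded pseudo-random generator instead of a printed table, the series length of 15 and the per-step trend increments of 0.2 and 0.1 are illustrative choices, and all function names are hypothetical.

```python
import random

def pearson_r(x, y):
    """Ordinary product-moment correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def normal_series(n, rng):
    """n independent deviates with mean zero and unit variance."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def add_trend(series, step):
    """Add a linear trend that rises by `step` at each successive time point."""
    return [value + step * t for t, value in enumerate(series, start=1)]

rng = random.Random(1977)      # fixed seed so the run is repeatable
line_a = normal_series(15, rng)
line_b = normal_series(15, rng)

print("independent series:      r =", round(pearson_r(line_a, line_b), 2))
print("mild trends added:       r =",
      round(pearson_r(add_trend(line_a, 0.2), add_trend(line_b, 0.1)), 2))
print("much steeper trends:     r =",
      round(pearson_r(add_trend(line_a, 5.0), add_trend(line_b, 5.0)), 2))
```

The two underlying series stay unrelated throughout. Only the shared drift, which widens the range of values covered, drives the correlation toward one as the trends become steeper.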

Theoretically the two series of independent observations have a correlation of zero. By chance we have an observed value of r = 0.37. In Figure 1b we have introduced linear trends by adding to these random deviates a difference of 0.2 between successive measurements on line A, and differences of 0.1 for line B. The correlation we now compute is increased to r = 0.66. If we were to introduce steeper trends by adding larger constants, we would get larger values of r.

Clearly, in periods of relative stability of the underlying phenomena the values we obtain represent noise about the constant true value, and the correlation between the two series will not differ significantly from zero. If we measure both phenomena during a period when both are subject to a seasonal trend, as in Figure 1b, we will increase our apparent correlation. If we measure during a period when there is a period of stability and a period when both phenomena have a trend we will obtain an intermediate value for r. Before computing a correlation coefficient, it is necessary to consider whether the series have common large shifts, whether we need to distinguish short-term association from general seasonal trends, and in fact to consider carefully the hypothetical model we are evaluating.

Successive Values Not Independent. Most of the time series data of interest cannot be regarded as independent observations as we did in the preceding section. We have only to consider a familiar measure such as minimum 24-hr temperature to appreciate that the possible values for a particular day fall within a range determined by knowing the time of year and can be defined even more closely by knowing the values for immediately preceding days.

The effect of autocorrelation is shown in Figure 1c. The random value for each day in Figure 1a has been added to the value for the previous day, to provide a new series. We note that the new series looks smoother, as each day's values in a given series are related. Thus the series are autocorrelated: the values for day t are related to those for day t - 1, and so on. The two series are, however, still unrelated to each other, except insofar as they have the same internal relationship. The observed correlation has however changed to r = 0.43. This autocorrelation invalidates the use of regression or multiple regression techniques designed for independent observations.

FIGURE 1. (a) Two series of independent normal deviates, r = 0.37; (b) same series as Fig. 1a with different trends added to each series, r = 0.66; (c) same series as Fig. 1a with autocorrelation within each series, r = 0.43. Horizontal axis: days.

Advances and Needs in Time Series Analysis

The foregoing simple examples illustrate some of the characteristics of time series that must be handled. Almost any series will exhibit noise and autocorrelation, and most will have cyclic patterns of varying length.

Bloomfield (16, 17) has investigated the use of spectrum analysis as a tool for determining whether the aggravation of asthma symptoms is related to daily minimum temperature or to atmospheric SOx levels. He explains: "The spectrum may be regarded as a decomposition of the variance of the data into components associated with different frequencies." Frequencies in this context means number of cycles per day; thus an annual effect would theoretically be at the frequency of 1/365 cycle per day, but in fact the smoothing of the data (which was a necessary preliminary step) spreads the effect over a wider band. Bloomfield also computes the coherence between series, which he explains as "the frequency-dependent measure of correlation between series." Thus he has a series of correlations that show the extent to which the cyclic patterns of the series correspond. He concludes, "the series are essentially unrelated at frequencies above 0.25 cycles per day, which correspond to a period of four days. However, at lower frequencies, which correspond to longer periods, there is substantial coherence. This is a warning that the impact of these two series on the health series may be complex and hard to disentangle." He also investigates partial coherence, namely the frequency-dependent partial correlation between asthma and sulfur oxide after correction for the effect of minimum temperature.

Throughout his paper he warns us about assumptions underlying the analysis, namely that the series are "stationary" in the sense that the covariances between time periods are constant throughout the series, and that the relationships between the variables are linear, and finally that the tentative conclusions reached may be reversed following subsequent analysis. Thus we conclude that this is a very promising approach but that care must be taken to recognize the importance of the underlying assumptions. Stressing the limitations of a particular model is not intended to indicate that the approach is poor; rather it is to stress that analysis of time series is not simply a matter of running the data through a computer program.

The situation is described by Box et al. (18): "The obtaining of sample estimates of the autocorrelation function and the spectrum are non-structural approaches, analogous to the representation of an empirical distribution function by a histogram. ... They provide a first step ... pointing the way to some parametric model on which subsequent analyses will be based." Box and other authors (19-21) have been developing such specific models for carbon monoxide in Los Angeles to study the effect of changes in methods of instrument calibration and the effect of various control measures. Increasingly these methods are being applied to analysis of environmental data but are apparently not well known to all investigators.

The noise inherent in any system, together with the limitations of the lengths of the series, usually requires that some form of smoothing is carried out during the analysis. Researchers at Princeton have been making rapid advances in development of these techniques and are conducting Monte Carlo simulations to evaluate different approaches. Thus again the research is in progress but much needs to be done before the relative advantages of different strategies are fully understood (22-24).

Multiple Regression

When Are Least-Squares Fits a Poor Choice?

Pitfalls in the interpretation of linear least-squares regression relating to two variables are well known; they include nonnormality of the distribution of variables, nonlinearity of the relationship between the variables, lack of independence between observations and the presence of outliers. When the number of variables increases so do the problems: the list must be enlarged to include multicollinearity of the variables, and it is no longer possible to detect these problems by simple plots of the data. Even when the problems are detected, the optimum method of analyzing data with one or more types of departure from the assumptions underlying least-squares regression is not readily apparent. Recent developments deal with both methods of detecting particular types of departure and with data analysis in the presence of such departures.

Directions of Current Development

In a recent review, Hocking (25) suggests that "the role of the developers of regression methodology is to provide the less skilled user with techniques that are robust while easy to use and understand." Much effort has gone into the development of techniques that are "robust," or, in other words, are relatively insensitive to departures from the usual assumptions underlying least-squares regression. Gnanadesikan and Kettenring (26) review many of these.

Gnanadesikan et al. (26) have been particularly concerned with the detection of outliers. Andrews (27, 28) has re-analyzed data originally analyzed by Daniel and Wood (29), using newer techniques that he believes are resistant to a small number of gross outliers. Andrews reaches the same conclusions regarding this sample data set as Daniel and Wood, and this has led Hocking (25) to observe that these skilled analysts using repeated inspection of residual plots were in fact using a robust procedure. He warns that his iterative technique is more expensive than least-squares but in addition to producing stable estimates it will detect outliers.

Diaconis (30) has applied resistant analysis of variance techniques to air pollution data. Brown et al. (31) observed reduction in mortality rates in two California counties and suggested that this might be a reflection of reduced air pollution consequent upon the 1974 fuel crisis. Diaconis was unable to find parallel reduction in CO or NO2. Thus the question remains open whether the observed reduction in mortality was due to other causes, or to chance fluctuations, or to interactions among air pollutants that have not yet been investigated.

The problem of multicollinearity has been tackled by a variety of approaches. Schwing and McDonald (32) have compared least-squares and ridge regression, and have applied both ridge regression and a sign-restricted least-squares method to the analysis of the association between mortality rates, natural ionizing radiation, and some air pollutants. They show that the two latter approaches yield comparable results that differ from those obtained by using least-squares (32, 33). The implications of order restrictions have also been investigated (34). In the conclusion of his review Hocking (25) states that "the multicollinearity problem seems to have been given too little attention in the statistics literature." He recommends that eigenvalues should always be inspected to determine possible redundancies, but that when near-singularities exist the method of handling them is not clear.

The problem of more complex relationships between variables has received much attention. Gallant (35) concentrates on methods of fitting nonlinear functions rather than on the detection of such functional relationships in the data. Other authors such as Anscombe (36), Gnanadesikan and Wilk (37), and Cleveland and Kleiner (38) have developed sophisticated plotting techniques for detection of characteristics of the data.

All of these endeavors point to the complexities that may be encountered in multivariate data. In view of these complexities, it is unlikely that a least-squares fit of a simple "hockey stick" function will prove to be an adequate method of determining "threshold" levels of pollutants as has been done (Fig. 2). This method may be useful in an experimental situation such as that described by McNeil (39), because other sources of variation are controlled. Certainly it is misleading to present point estimates obtained by this method without indicating their variability, and without reporting any attempt to investigate alternate models.

FIGURE 2. Examples of the use of a hockey-stick function where no attempt is made to indicate reliability or to assess the interaction effects of different pollutants (4). The plots show temperature-specific threshold estimates for symptom aggravation by sulfur dioxide, total suspended particulates (TSP), and suspended sulfates (SS).
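The point about reporting variability alongside a hockey-stick threshold can be made concrete with a small sketch. The code below is not the procedure used in the Chess monograph. It assumes a response that is flat up to a threshold and linear above it, finds the threshold by least squares over a grid of candidate values, and uses bootstrap resampling as one simple way to show how much the estimate moves. The data, the grid, and all function names are hypothetical.

```python
import random

def fit_line(u, y):
    """Least-squares fit of y = a + b*u; returns (a, b, residual sum of squares)."""
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    suu = sum((ui - mean_u) ** 2 for ui in u)
    if suu == 0:                      # no points above the candidate threshold
        b = 0.0
    else:
        b = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y)) / suu
    a = mean_y - b * mean_u
    rss = sum((yi - a - b * ui) ** 2 for ui, yi in zip(u, y))
    return a, b, rss

def fit_hockey_stick(x, y, candidates):
    """Grid search for the threshold c in y = a + b*max(0, x - c)."""
    best = None
    for c in candidates:
        u = [max(0.0, xi - c) for xi in x]
        a, b, rss = fit_line(u, y)
        if best is None or rss < best[3]:
            best = (c, a, b, rss)
    return best                        # (threshold, baseline, slope, rss)

def bootstrap_thresholds(x, y, candidates, n_boot=200, seed=0):
    """Refit the threshold on resampled (x, y) pairs to show its spread."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c, _, _, _ = fit_hockey_stick([x[i] for i in idx], [y[i] for i in idx], candidates)
        estimates.append(c)
    return sorted(estimates)

# Hypothetical data: flat response up to x = 8, rising above it, plus noise.
rng = random.Random(42)
x = [float(v) for v in range(21)]
y = [2.0 + 0.5 * max(0.0, xi - 8.0) + rng.gauss(0.0, 1.0) for xi in x]
grid = [float(c) for c in range(21)]

threshold, baseline, slope, rss = fit_hockey_stick(x, y, grid)
spread = bootstrap_thresholds(x, y, grid)
print("point estimate of threshold:", threshold)
print("middle 90% of bootstrap estimates:", spread[10], "to", spread[189])
```

A wide bootstrap interval would signal that the point estimate of the threshold deserves little weight by itself. The sketch does nothing to address the separate concern of pollutants acting in combination.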

Williams. Van Belle noted the dangersthat '. MIT Press. P.. Public Health Service. 8. 21. J. J. 1976. and Tiao. Intervention analysis and applications to economic and environmental problems.. Analysis of contingency tables having ordered response categories. and Hamming. partly becatlse of lack of communication. W. 6." He also cautions about the indiscriminant accumulation of largebodies of data and on the tendency to place too much faith in "indices. 1 Oct 2013 15:59:06 PM All use subject to JSTOR Terms and Conditions . Assoc. 10. G. E. Glaser. G. D. but three different polllltantswere each treatedseparatelywith no attempt being made to consider how they would affect symptomaggravationwhen present in different combinations. Fed. J. Bulletin 306. R. Multivariate analysis of qualitative data.C. producersn n of statisticalanalyseswill base their product on argumentsof dubious validity. Control Assoc. Biometrics 25: 489 (1969). Wiley. S. 25: 260 (1975). San Francisco. and Field. 17. Simon. HEW. Analysis of Los Angeles photochemical smog data: a statistical overview. and Koch. Assoc. W. Analysis of categorical data by linear models. 5. Paper presented at Fourth Symposium on Statistics and the Environment. Much needs to be done. 16.. G. C. 11. In at least threeof the five areas of concern(contingency tables. Dept. C. This material is drawn from a Background Document prepared by the author for the NIEHS Second Task Force for Research Planning in Environmental Health Science.. Springfield. E.000 Normal Deviates.' These problemsare still with us. H. H. Conclusions The reportof the task force on researchplanning in Environmental Health Sciences (41) recommended in 1970 that further development of efficient statistical techniques be undertaken. Virginia 22161. Agency. Bloomfield. G. 14. He cites four areas: the first two were: (l) . Med. Ferris and F.. Many thanks go to Drs. L. W. 20. 13. 
A ssltellitesymposiumwas sponsoredby IASPS on StatisticalAspects of pollutionproblemsin 1971 (42).Thus the need for trainingrecommendedin 1970still exists. E. In some areas these advances have been well documented.. Greenberg. Tiao. theoretical advances have been made."rhe use of elasticity coefficients is misleading when the variables are measured in arbitrary units. Graybill. 2. Johnson. 1. N. Partly this is because the stage of development is such that they are not readily available. P.67: 55 (1972). J. D. Environmental Protection Agency. 69: 971 (1974). Air Pollut. New York. A statistical analysis of the Los Angeles ambient carbon monoxide data 1955-1972. Assoc. Amer. C. 3. W.. E. 1974. P. C. and Hamming. Health Consequences of Sulfur Oxides: A Report from Chess.. 25: 1129 (1975). in others progress has only reached the stage of verbal reporting and unpublished manuscripts. The Report of the Task Force is an independent and collective report which has been published by the Government Printing Office under the title. Glencoe. P. E. Discrete Multivariate Analysis: Theory and Practice. J.'Analogies between multiplicative models in contingency tables and covariance selection.18. Y.iThe use of a linear regression model to approximatea causewffect link is questionable"and (2) .Similarobservationswere made by the discussantsof a paperby Nelson et al. J. Sci. P. A. Biometrics 32: 95 (1976).. 1976.S. Alternative Analyses for the singly-ordered contingency table.. Time Series Analysis Forecasting and Control. and Koch. S. Div. J. Schronk. Tiao. Department of Commerce. C. and Jenkins. G. and Grizzle. G. H. IJ. A Million Random Digits with 100. McGraw-Hill. Scott. Nov. Washington. U. 19. review of recent literature reveals relatively few instances where the newer techniques are being employed. D. E. An Introduction to Linear Statistical Models. 1970-1971. 9. Technometrics 13: 438 (1971). Log-linear models for frequency tables with ordered classifications. 
time seriesS and multivariate methods). Off1ce. G. 1961. and Holland.S.. Biometrika 61: 525 (1974). 5285 Port Royal Road. Clayton. 109: 250 ( 1966).. G. J. Bishop.: Epidemiology of an unusual smog episode of October. D. (40). Washington.In the example reproducedin Figure 2 the effect of temperaturewas held constant. Air pollution in Donora.. Amer. In the publishedreport. Bock. Spectrum analysis of epidemiological data. In spite of this developmentalactivity. P. 1948... 4. J. Starmer. M. Hyg. F. J. Rand Corporation.C. New York. In: Multivariate Statistical Methods in Behavioral Research. Air Pollut." Copies of the original material for this Background Document. C. Fienberg. F. 156 Environmental Health Perspectives This content downloaded from 186. Biometrics 30: 589 (1974). Vol. Speizer for introduction to these problems.32. Statist. U S.91 on Tue. G. Holden-Day.. Free Press. Research Triangle Park N. PHS. M. M. Box. Bloomfield. Haberman. E. 18. 1975. Boxs G. P. et al. both in terms of development of theory and makingreadilyaccessible computer programswith adequate documentationfor carryingout the techniques proposed. "Human Health and Environment Some Re- REFERENCES 1. 70: 70 (1975). S. F. Amer. search Needs. New York. Fourier Analysis of Time Series: An Introduction. B. The London fog of December 1966. The author was supported in part by grant ES 01108 from the U. 12. Box. 15. G. Control Assoc. D. Health 15: 684 (1967). 7. P. O. 23-25. Ind. Statist. Arch. G. 1975. .. Pa. J. A note on the weighted least-squares analysis of the Ries-Smith contingency table data. Mortality and morbidity during a period of high levels of air pollution. as well as others prepared for the report can be secured from the National Technical Information Service. Wermuth. Some Odds Ratio Statistics for the Analysis of Ordered Categorical Data.. G. Environ.. 1970. M. 1949. Grizzle. Statist. Box G. McGraw-Hill. 1966. 1975. Cambridge.

Gnanadesikan, R., and Kettenring, J. R. Robust estimates, residuals and outlier detection with multiresponse data. Biometrics 28: 81 (1972).
Hocking, R. R. The analysis and selection of variables in linear regression. Biometrics 32: 1 (1976).
Daniel, C., and Wood, F. S. Fitting Equations to Data. Wiley, New York, 1971.
Gallant, A. R. Nonlinear regression. Amer. Statistician 29: 73 (1975).
Barlow, R. E., et al. Statistical Inference under Order Restrictions. Wiley, New York, 1972.
Andrews, D. F., et al. Robust estimates of location: Survey and Advances. Princeton Univ. Press, Princeton, 1972.
Andrews, D. F. A robust method for multiple linear regression. Technometrics 16: 523 (1974).
Beaton, A. E., and Tukey, J. W. The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics 16: 147 (1974).
Velleman, P. F. Robust Non-Linear Data Smoothing. Tech. Report No. 89, Series 2, Dept. of Statistics, Princeton Univ., 1975.
Rabiner, L. R., Sambur, M. R., and Schmidt, C. E. Applications of a nonlinear smoothing algorithm to speech processing. IEEE Trans. Acoustics, Speech Signal Proc. 23: No. 6, 552 (1975).
Anscombe, F. J. Graphs in statistical analysis. Amer. Statistician 27: 17 (1973).
Wilk, M. B., and Gnanadesikan, R. Probability plotting methods for the analysis of data. Biometrika 55: 1 (1968).
Cleveland, W. S., and Kleiner, B. Enhancing scatter plots with curves of moving statistics. Technometrics 17: 447 (1975).
McDonald, G. C., and Schwing, R. C. Instabilities of regression estimates relating air pollution to mortality. Technometrics 15: 463 (1973).
Schwing, R. C., and McDonald, G. C. Measures of association of some air pollutants, natural ionizing radiation and cigarette smoking with mortality rates. Paper presented at International Symposium on Recent Advances in the Assessment of the Health Effects of Environmental Pollution, Paris, France, 1974.
Brown, S. M., et al. Effect on mortality of the 1974 fuel crisis. Nature 257: 306 (1975).
Diaconis, P. Measuring the effect of the fuel crisis. Paper given at AAAS meeting, Boston, 1976.
McNeil, D. R. Statistical models and stochastic models. Proceedings of a SIMS Conference on Epidemiology, Alta, Utah, 1974, p. 384.
40. Nelson, W. C., Hasselblad, V., and Lowrimore, G. R. Statistical aspects of a community health and environmental surveillance system. VIth Berkeley Symposium, Vol. 6, 1972, p. 125.
41. First Task Force on Research Planning in Environmental Health Science. Man's health and the environment: some research needs. NIEHS, DHEW, Washington, 1970.
42. Pratt, J. W., Ed. Statistical and Mathematical Aspects of Pollution Problems. (Statistics Textbooks and Monographs, Vol. 6). Dekker, New York, 1974.

