0:02 in environmental epidemiology. 0:06 So the term biomarker can be used 0:08 to describe a marker of exposure 0:11 or the marker of effect among other things, 0:14 but we're really gonna be focusing on 0:17 how to utilize these biomarkers 0:20 to assess exposure in humans to environmental chemicals. 0:22 So the way the lecture is structured first, 0:30 we're gonna talk about the different types of biomarkers, 0:32 how we go about selecting them, 0:35 then how to deploy these measurements, 0:38 and then once we've measured these concentrations 0:41 or levels in individuals, how then to analyze the data. 0:44 And then finally some of the pitfalls 0:49 that we need to think about. 0:51 So considerations that we need to think about 0:53 when we select these biomarkers, first of course 0:56 science should drive our choice of biomarker. 0:59 We need to think about what the suspected mechanism is 1:03 and what the target system or organ is. 1:06 So for example, if we're interested 1:09 in how methylmercury affects the brain, 1:11 the ideal biomarker might be levels 1:16 of methylmercury actually in the brain. 1:20 Now, that's really not feasible unless we're dealing 1:23 with deceased individuals. 1:28 So, what's the next best thing? 1:30 Well, often we rely on circulating concentrations in blood. 1:32 We might rely on hair for example 1:38 as a biomarker of that exposure. 1:40 But we should be thinking about how well 1:42 that exposure represents the true level of methylmercury 1:45 in that suspected target system or organ. 1:50 Secondarily, hopefully to that is feasibility. 1:55 So I mentioned that we can't assess methylmercury 1:59 in the brain. 2:03 So what's the next and most feasible option? 2:04 Collecting it in hair, collecting it in blood. 2:09 Again, maybe we would want to collect a biomarker in blood, 2:12 but perhaps we're dealing with infants. 2:17 We don't wanna draw blood 2:19 so we might use another biomarker such as hair. 
2:20 So in relation to that invasiveness for example, 2:24 that I mentioned like venipuncture 2:28 in infants or children is cost. 2:30 So some of these persistent organic pollutants like PCBs, 2:33 perfluorinated compounds, DDT, DDE are quite expensive. 2:39 So if you send a single sample, 2:43 plasma sample or serum sample 2:47 to a lab, it might cost about $750 2:49 to ascertain all of those various concentrations 2:53 so it can get expensive 2:57 and that's something that we also need to think about. 2:59 And again related to invasiveness, 3:03 for some of these concentrations like furans and dioxins, 3:05 if the concentrations are very low, 3:12 it might take a lot of blood 3:14 in order for us to determine the actual concentration 3:16 so seven or eight, 10ml tubes. 3:20 So a 10ml tube, typically if you got your blood drawn, 3:24 this is the small tube that is used 3:26 to collect whole blood, imagine seven or eight of those. 3:30 I've already mentioned blood several times. 3:37 So blood itself is not a biomarker, 3:40 but it is a matrix in which the biomarker exists. 3:43 So really what we're gonna be talking about 3:48 are these different matrices 3:50 that you can assess these biomarkers in. 3:52 So blood, we think about whole blood, plasma, serum. 3:56 So the difference 4:02 is that let's say you have a test tube 4:03 and you might draw an entire tube of whole blood 4:08 which contains clotting factors, red blood cells, et cetera. 4:16 If we separate off the red blood cells, 4:22 what we're left with 4:31 is rather plasma. 4:37 Serum is plasma minus the clotting factors. 4:42 Why does this matter in terms 4:53 of assessing different environmental exposures? 4:54 For some environmental exposures like lead, 4:57 we're gonna use whole blood. 5:02 For something like persistent chemicals 5:05 such as perfluorinated compounds and PCBs, 5:08 we're gonna use either plasma or serum. 
5:11 It doesn't really matter, 5:15 these clotting factors that get removed as you go 5:17 from plasma to serum don't really affect the determination 5:21 of these concentrations very much. 5:24 So, usually either of these matrices is fine. 5:26 So again, whole blood is used for things like blood lead 5:31 and other metals. 5:36 It's easily obtained. 5:37 Nobody likes getting a skin prick, 5:40 but it's generally easy compared 5:41 to other more invasive measures. 5:44 In some cases, we can do a fingerstick option. 5:47 So for determining blood lead, often in the field, 5:52 you can do a little finger prick to draw that small amount 5:55 of whole blood. 5:59 Often an issue there is surface contamination. 6:01 If there's a lot of handling 6:04 of lead materials for example, 6:09 people who are taking apart batteries, 6:12 there may be quite a bit of surface contamination 6:15 of lead on their hands. 6:17 So a finger stick would not be the ideal option. 6:19 And as I mentioned before, plasma or serum, 6:24 these persistent often lipid soluble compounds 6:27 so things like PCBs that are bound 6:31 to lipids are more amenable to determination 6:35 in plasma or serum. 6:40 The nice thing about plasma and serum is that they can be stored 6:42 for decades without much loss in terms of these chemicals. 6:45 If you take something like whole blood and you store that 6:52 for a while, there are lots of problems. 6:55 The red blood cells burst and it can't be stored. 6:57 So typically if you're gonna measure something like lead, 7:02 you need to do it right away 7:06 whereas if you're measuring other compounds, 7:07 you can collect blood 7:09 and then analyze it later. 7:11 Urine again is another matrix in which we can determine 7:17 these different exposure biomarkers. 7:22 Urine is typically used for readily metabolized compounds. 7:25 Compounds like Bisphenol A (BPA) 7:29 which I'm sure you've all heard of.
7:32 This is in water bottles and lots of other consumer 7:34 and personal care products, perfumes. 7:38 Things like phthalates also again are plasticizers 7:41 in personal care products and perfumes. 7:46 And organophosphate pesticides that are used on crops. 7:50 These compounds typically have short half lives. 7:55 They're not typically lipid soluble. 7:58 They're ingested, they get glucuronidated 8:01 in the liver and then they are excreted. 8:05 So for example, Bisphenol A has a half life 8:08 of about six hours in urine 8:11 so it's ingested and it's readily excreted. 8:15 The benefit of using urine as a matrix 8:19 for determining exposure 8:22 is that it's easy to obtain in adults. 8:25 I say in adults, 8:29 in infants obviously who aren't toilet trained, 8:31 it's a lot more complicated. 8:35 There are different ways to assess it 8:38 and you'll see these mentioned 8:40 in the epidemiologic literature, 8:41 the exposure assessment literature. 8:44 First morning void. 8:47 This is what it sounds like, 8:48 it's the first urination of the day. 8:50 And that's thought to maybe better represent exposure 8:53 in the last 24 hours 8:57 because you haven't been ingesting anything 8:59 for the last eight hours or so. 9:02 A spot urine is just a collection at any time. 9:05 So if for example, 9:09 you go to the doctor and they ask for a urine specimen, 9:11 what you're giving them is a spot urine specimen 9:14 so just one point in time. 9:17 There is, in more complicated exposure assessment studies, 9:21 this idea of 24 hour urine 9:26 and this is collection 9:29 of every urination throughout the day. 9:30 So for example, EPA does some studies 9:33 where they measure 24 hour urine in pregnant women 9:36 and they do so for several weeks 9:40 in order to determine what the half life 9:43 is of a lot of compounds.
9:46 Obviously, if you have an epidemiologic study 9:48 and you're asking people 9:51 to collect urine each time they go to the bathroom, 9:52 it gets a little complicated. 9:58 The problem with urine, because it involves self-collection 10:00 is this non-trivial contamination potential 10:04 which we'll talk about a little bit later in the lecture. 10:07 But again, because we're typically measuring compounds 10:10 like Bisphenol A and phthalates, 10:14 these are compounds that are in house dust 10:17 and in many other products around. 10:20 And so, when an individual is donating the specimen 10:23 and they open up the jar and they open up the cap, 10:26 there's the possibility to introduce quite a bit 10:30 of contamination that way. 10:33 I'm gonna spend a little bit 10:38 of time talking about less common matrices 10:40 in which we assess biomarkers. 10:42 One example is meconium. 10:46 Those of you who don't know, 10:49 meconium is the first bowel movement 10:50 that the infant has after birth or right around birth. 10:52 Obviously there are some collection problems 10:57 because first, sometimes this occurs very early. 11:00 Second of all, if you're collecting this 11:05 for the purposes of ascertaining exposure concentrations, 11:08 there's again the potential 11:15 for quite a bit of contamination. 11:17 Why? 11:19 If it's collected in a diaper, 11:20 there are already contaminants in that diaper. 11:22 The question though is why is this interesting? 11:25 It's interesting because it captures 11:28 what that infant was exposed to throughout gestation. 11:31 Think about all of the products that 11:36 that infant has metabolized during that process 11:38 and been exposed to in utero, 11:41 that meconium is a representation of all that exposure. 11:44 So sometimes this is used in studies as a way 11:50 of assessing exposure during pregnancy. 11:53 Another less common biomarker 11:58 that's used are dried blood spots. 
12:05 So in New York State, from each infant, 12:10 a dried blood spot is collected. 12:14 A few drops are spotted onto a card. 12:17 It gets sent to Albany for genetic testing. 12:20 And those cards are then stored 12:23 for I believe 20 or 25 years in Albany. 12:28 One could go back with proper approval from human subjects 12:32 and measure concentrations of some pollutants 12:36 on those cards. 12:38 So for example, there is an individual 12:41 at Wadsworth lab in Albany, Dr. Kannan 12:45 that I collaborate with, 12:46 who has pioneered using these dried blood spots 12:50 to measure some persistent organic pollutants 12:53 and also some non persistent organic pollutants. 12:57 Not typically well measured, 12:59 the non persistent pollutants in these dried blood spots, 13:04 but certainly a good measurement 13:06 for things like perfluorinated compounds and PCBs. 13:10 Something like this might be particularly useful 13:14 when you have a large population with maybe a rare outcome. 13:20 So let's say we're looking at brain cancer 13:25 and we're doing a case control study. 13:27 And so we identify cases of individuals 13:31 who develop this brain cancer 13:33 or some other cancer in their early twenties 13:36 or maybe late teens, 13:39 but we're interested in the effect 13:41 of an early life exposure. 13:43 So, there's not a way to go back and easily ascertain what 13:50 that person was exposed to say, 15 or 20 years prior. 13:55 Dried blood spots are useful for this 13:57 because you have an existing record potentially 14:01 of exposure during that early life period 14:04 even what that person was exposed to during gestation. 14:09 So sometimes we can think about if you had a rare disease 14:13 and you had a large population 14:16 from which those cases accrued, 14:19 using these stored dried blood spots might be a useful way 14:24 of going about assessing exposure. 14:27 Another less common matrix here is amniotic fluid 14:34 and this is very similar to meconium.
14:34 So again, this is a representation 14:37 of what that exposure was like 14:40 in the fetal compartment, in utero during gestation. 14:43 Breast milk is more commonly used 14:49 than these other biomarkers mostly 14:51 because it's readily available. 14:54 Most women try at least to breastfeed 14:58 even if they don't continue for very long, 15:01 they do at least try and so early in life it's possible 15:04 to collect breast milk. 15:08 So colostrum refers to milk very early in life, 15:11 transitional milk is then a little bit later 15:16 and then mature milk after a couple of weeks after birth. 15:20 Breast milk is useful because it's a very lipophilic. 15:25 There's a lot of fat, cholesterol in breast milk. 15:28 And so compounds that are bound 15:31 to these lipids like PCBs, other organochlorines, 15:35 breast milk is a useful way 15:41 for measuring those concentrations. 15:43 And also, if breast milk is really the only dietary source 15:45 for an infant, it's also gonna be just about one 15:51 of the only sources of exposure for many of these compounds. 15:55 Bone is much less used, 16:01 but it is used quite frequently in studies of aging, 16:04 particularly for metals like lead. 16:07 There's been some studies 16:11 in the normative aging study out of Boston 16:12 that have measured lead levels in bone 16:16 and that can be done through x-ray. 16:20 Typically by x-raying bone in the long part 16:22 of the leg and in the patella in the knee. 16:28 That's a useful way to measure lead 16:34 because again, it's not invasive 16:36 and the nice thing about that 16:39 is that concentrations of metals like lead 16:41 in bone represent a cumulative measurement 16:45 of lead throughout the life period. 16:50 So, if you're thinking about the exposure 16:52 as being a cumulative process, 16:56 bone might be a good way of going about measuring them. 16:59 I mentioned before that for organochlorines like PCBs, 17:06 they're lipid bound. 
17:14 As a result, 17:16 using something like adipose tissue may be a great way 17:18 to assess body burden. 17:22 So this is a title from an actual paper, 17:24 "A Prospective Study of Organochlorines 17:26 "in Adipose Tissue 17:29 "and risk of Non-Hodgkin Lymphoma." 17:30 So concentrations of PCBs were measured directly 17:33 in adipose tissue. 17:37 This is not commonly used much anymore. 17:40 It was thought 17:43 that maybe adipose tissue better represents body burden, 17:44 but as you can imagine, it is somewhat invasive. 17:49 I do have a collaborator again in Albany, New York 17:53 who has used adipose tissue to do some bio monitoring 17:57 and they have a noninvasive way of collecting that. 18:03 They actually get adipose tissue from plastic surgeons 18:07 who have been doing liposuction 18:11 and who would otherwise discard that medical waste. 18:13 And they've used it 18:16 to ascertain different chemical concentrations. 18:17 Nail clippings, these are useful for a lot of trace elements 18:23 and metals that sequester in areas of the body like hair, 18:27 teeth and nails. 18:34 This is mostly used again for trace element analysis. 18:36 And those of you who are clinically aware, 18:40 you'll know that sometimes changes in color 18:44 of the nails may represent different types of poisonings. 18:48 The teeth, we'll talk about in a minute. 18:52 And I also mentioned that hair is a useful biomarker. 18:57 Hair is useful for a couple of reasons. 19:01 It's first of all, noninvasive. 19:03 Second of all, there's not much contamination 19:05 if you treat it and handle it properly. 19:08 And thirdly, hair grows at a pretty predictable rate. 19:11 So if you think about an individual's scalp 19:15 and you think about the hair growing out of their scalp, 19:18 and this is the longest most distal date, 19:23 and this would be obviously the most recent date, 19:30 you can count up the amount of time that has elapsed. 
19:44 So let's say that each of these is one centimeter 19:49 and you know that hair grows one centimeter, 19:55 let's say every two weeks. 20:00 You can say something about exposure here that, 20:04 that was say 20 weeks ago, 18 weeks ago, 16, et cetera. 20:12 So if you think about measuring that during pregnancy, 20:16 it may be that if this is hair collected 20:20 from a woman right after pregnancy, 20:23 and let's say this is three months, 20:27 and this is six months, 20:29 capturing that hair in this period 20:32 is gonna give you the concentrations of mercury 20:35 that she was exposed to during pregnancy. 20:38 So hair is also a very useful biomarker. 20:43 I previously mentioned teeth, 20:45 and they're a pretty interesting biomarker. 20:50 Teeth grow a lot like concentric rings on a tree. 20:55 So on a tree, 20:57 this is when the tree starts beginning growth, 21:03 and then each year, approximately, a new ring is added. 21:11 And teeth grow approximately the same way. 21:15 So if we think about a tooth as a matrix 21:19 in which to measure exposure to environmental contaminants, 21:25 we can imagine that varying the depth 21:27 at which we assess the concentration 21:31 will give us different information about when 21:35 that particular exposure occurred. 21:37 So a less deep measurement would tell us something 21:42 about more recent exposure and a deeper assessment 21:47 in that tooth would tell us more about earlier exposure 21:52 or exposure that occurred in the past. 21:55 That's why I'm calling this retrospective exposure. 21:58 So it can give us information about the past. 22:05 These are data from work by Manish Arora 22:10 who is the primary investigator on this 22:14 at Mount Sinai, New York City and his collaborators. 22:17 This was published in "Nature" in 2013. 22:20 What this is looking at are barium/calcium ratios. 22:26 The reason that they look at this ratio 22:29 of barium to calcium 22:31 is that the placenta restricts barium.
22:35 So if we think about, 22:38 say these two different ratios for example, 22:43 and we have a ratio that is in utero 22:49 and then after birth so postnatally, 22:57 the difference here is that there's no longer a placenta 23:00 so the placenta is gone after birth. 23:03 So what we would expect if we look 23:05 at the barium to calcium ratio 23:14 is if barium is being restricted by the placenta, 23:19 we would expect a smaller fraction 23:27 in the in utero period and after birth 23:30 because barium is no longer being restricted, 23:36 we would expect a larger fraction. 23:41 So if we take that information and apply it to these figures 23:44 on the left, 23:46 what we see here are measurements in two different parts 23:52 of the tooth, one in the dentin and one in the enamel 23:57 and that's not particularly important. 24:00 On the Y axis, we have the barium to calcium ratio. 24:05 And on the X axis here we have time, 24:11 and this is in days. 24:16 What the vertical lines represent in each figure is birth, 24:22 and so this would be the postnatal period, 24:26 and this would be the prenatal period. 24:33 So again, what I mentioned before is if this ratio 24:41 is smaller during the in utero period 24:44 which we would expect because of the restriction of barium, 24:48 we would have a smaller fraction, 24:50 which is exactly the case here in both of these, 24:56 the fraction is smaller. 24:58 Then after birth, 25:01 because barium is no longer being restricted 25:06 by the placenta, we see larger ratios of barium to calcium. 25:14 This was simply a demonstration 25:17 that by measuring the barium/calcium ratio in a tooth, 25:23 we can document exactly when birth occurred relative 25:27 to the development of the tooth. 25:30 So they took this knowledge and they applied it 25:33 to environmental exposures. 25:37 And basically the data look something like this. 25:46 So there's a lot of information contained in this figure.
25:46 We have data that we can use to ascertain 25:53 when the peak exposure occurred and how high it was. 25:59 We can think about a cumulative exposure metric. 26:04 We can think about average. 26:09 And we can think about these for the postnatal period 26:13 as well as the prenatal period. 26:19 So for example, 26:24 if we wanna think about the peak concentration, 26:26 this is obviously the peak level, 26:31 we can talk about when it occurred. 26:35 We can also measure cumulative exposure. 26:39 For example, by taking area under this curve. 26:44 We could think about cumulative exposure perhaps only 26:49 in the prenatal period. 26:56 We could think about maybe average exposure 26:59 which would be something like this. 27:04 And again, we can apply this idea 27:09 of peak and cumulative average specifically 27:13 to the postnatal and prenatal period. 27:17 So as I showed you here, 27:20 this is cumulative just for the prenatal period. 27:23 And we could think about perhaps an average 27:26 for just the postnatal period. 27:29 So again, this one tooth 27:32 is providing lots of different exposure data 27:35 for this individual. 27:40 Peak exposure, cumulative, average, time specific. 27:42 So let's say that the outcome of interest you thought 27:46 was a neurodevelopmental outcome, 27:51 and let's say that based on animal studies 27:54 and other mechanistic studies, 27:57 you believe that the critical period 28:00 for brain development occurs at 15 days before birth 28:03 so somewhere in here. 28:09 So using this metric we can say, 28:11 "I know exactly what the concentration was on day 15." 28:15 And so that might be the relevant measurement 28:20 that you're gonna use in your study. 28:22 So this is pretty incredible if you think about 28:25 how we're going to assess exposure in an individual. 28:28 This is extremely costly. 28:32 It takes a lot of analytic expertise to do this. 28:34 But again, really interesting information. 
28:42 And the example here is for lead. 28:45 Again, they have it adjusted by the calcium concentration 28:49 and that's for analytic reasons, 28:50 but you can just think about this 28:52 as a relative measurement of the amount of lead 28:55 that this particular infant is exposed to. 28:58 So really, really some nice work here 29:01 by Manish Arora at Mount Sinai. 29:07 Here's an example for methylmercury exposure 29:10 and fish consumption. 29:12 And there's a lot of information here. 29:15 This is from a paper published in 1983, 29:20 and this is a single individual. 29:23 There are a couple of messages here in this figure 29:27 that I wanna discuss. 29:29 So what happened here is this individual ate fish 29:34 for some amount of time, about 100 days. 29:41 Then methylmercury was measured in three different matrices, 29:46 blood, hair stubble and head hair. 29:51 So here are the blood measurements in blue. 30:04 We have the hair stubble here, these small dots. 30:13 And then we have the head hair. 30:15 And I talked about this example 30:18 of measuring hair during pregnancy and the ability 30:23 to ascertain discrete time windows of exposure. 30:30 So on the X axis, we have time, we have days. 30:35 On this Y, we have mercury concentration in blood. 30:40 And on the other Y axis, 30:41 we have the measurement of mercury in hair. 30:47 And they're just different concentrations. 30:51 One is parts per billion and the other is parts per million. 30:57 So there are several takeaway points. 30:59 First of all, we can say that fish intake 31:02 is positively associated with methylmercury exposure. 31:06 Why? 31:07 As fish intake goes up over time, 31:11 so does methylmercury concentration. 31:15 We see that here in blood, 31:19 we see that in the stubble 31:21 and we see that in the head hair. 31:24 It's the first takeaway. 31:26 The second piece is that once fish intake 31:30 is discontinued, mercury concentrations decrease.
31:30 So again, that supports causality 31:34 in terms of the methylmercury exposure is caused 31:38 by the increased fish consumption. 31:42 Another point to take away from this 31:45 is that even though these are different biomarkers 31:47 of exposure, we're using hair from the head, 31:53 stubble from the face and blood, 31:56 they're all correlated. 31:59 There's a little bit of a lag here with head hair. 32:02 It seems to peak a little bit later than the others 32:07 so it's kind of slid this way, 32:10 but it is this kind of 20 day mean. 32:12 But in general, these are highly correlated. 32:15 Concentrations say at day 40 look very similar. 32:18 Concentrations at about day 100 or 120 32:24 and concentrations at day 200. 32:28 The level may not all be the same but relatively speaking, 32:31 they track and would thus be correlated. 32:36 So we can think about these different matrices, 32:39 but at the end of the day, as an investigator, you may say, 32:44 "Which of these am I going to use?" 32:47 Well, certainly getting head hair is probably the easiest. 32:49 So maybe you're gonna use head hair, right? 32:56 Stubble, that's gonna be problematic 32:59 so you probably don't wanna use that. 33:02 Maybe you wanna use blood. 33:04 So even though we're talking about these different methods 33:07 to assess exposure, 33:12 these different matrices at the end of the day, 33:13 they may be highly correlated 33:16 and that's exemplified on the next slide. 33:19 So this isn't methylmercury, 33:24 this is looking at PCBs and they're measured 33:26 in several matrices. 33:32 These are mother-infant pairs. 33:33 And we have maternal milk, placenta, cord tissue, 33:36 and cord serum so these four different matrices. 33:40 And we also have measured maternal serum so five in all. 33:46 So these four are on the Y axis. 33:54 And on the X axis, we have maternal serum. 33:59 So maternal serum is being compared with concentrations 34:01 in these other different matrices. 
34:11 So what do we know about PCBs? 34:13 Well, PCBs are lipophilic. 34:15 These are lipid based concentrations. 34:18 Because they're lipophilic, 34:20 they're persistent and they tend to stick around for a while 34:25 and they're readily available in these different matrices. 34:33 So overall, what do you see in this figure? 34:37 Well, overall you see this positive trend. 34:41 In other words, 34:42 maternal serum concentrations are well represented 34:46 by these concentrations in maternal milk and fetal tissues 34:51 so as one goes up, so does the other. 34:54 So again, as an investigator, as an epidemiologist, 34:58 you may look at this and say, 34:59 "Well, do I really need 35:01 "to collect five different exposure matrices? 35:04 "Do I need to collect serum from mom, and milk from mom, 35:08 "and the placenta at birth, and cord tissue, 35:11 "and cord serum and analyze these?" 35:14 This is going to get very expensive. 35:17 Rather than do that, you may consider this and say, 35:20 "Well, which of these 35:22 "do I think best represents the underlying 35:25 "and most relevant process that I'm after?" 35:30 So again, we can measure it in all these different matrices, 35:34 but we should in some cases just pick one. 35:40 So now that we've decided on a biomarker for our study, 35:44 now what do we do? 35:48 Well, we have to decide how often we're gonna collect this. 35:52 It may not be relevant just to collect the biomarker 35:55 at a single time point. 35:57 That's dependent on the scientific question, 36:00 feasibility and the reliability of the measurement. 36:03 So for example, 36:05 let's say that you're interested in Bisphenol A exposure. 36:09 This is a non-persistent compound. 36:13 It's in plasticizers and things like that. 36:16 It's readily excreted. 36:18 The half life in urine is about six hours. 36:21 And you're interested in kind of the total exposure 36:25 to BPA during pregnancy.
36:28 You don't have a particular time period in mind, 36:31 but you're interested in exposure throughout 36:33 that entire pregnancy, 36:34 say an assessment of average or cumulative exposure. 36:40 Because of that, 36:41 your question is about all of pregnancy. 36:43 So measuring it at one time during pregnancy 36:47 is probably not going to represent average 36:51 or cumulative exposure well. 36:52 So then you have to think about, 36:54 "How often am I gonna measure it? 36:56 "Am I gonna measure it once per trimester 36:59 "so three measurements?" 37:01 Ideally, maybe you'd measure it every day, 37:04 several times a day. 37:05 Well, that isn't really feasible. 37:08 The third issue that is related 37:11 to some of these other issues is the reliability 37:15 of the measurement and that's dependent 37:18 on how long this compound sticks around. 37:23 This is an example of BPA. 37:26 So we talked a little bit about BPA previously. 37:30 And in this study, there were thousands of women, 37:33 but we selected 80. 37:37 And for each woman, 37:38 three urine specimens were collected during pregnancy 37:42 about one in each trimester. 37:45 And we had 240 specimens total. 37:53 So a little more background on BPA. 37:55 It's a weak estrogenic chemical. 37:58 It's largely a dietary exposure 38:01 so we're exposed through containers that we eat food from 38:07 and from the food itself which has already been contaminated 38:12 with BPA through its handling. 38:16 As I mentioned before, it has a short half life, 38:20 six or less hours in urine and low ICCs have been reported 38:25 in the peer-reviewed literature. 38:28 So an ICC is an intraclass correlation coefficient 38:33 and we'll talk a bit about that on the next slide. 38:36 And in the peer-reviewed literature, 38:38 this ICC has ranged from 0.09 to 0.43. 38:43 So this is the correlation for three or more measurements.
38:43 And in pregnant women, 38:48 three studies have reported ICCs of 0.10 to 0.32 38:51 so not a strong correlation between these measures 38:56 of BPA in urine. 39:01 By contrast, if you look at the ICC 39:03 for a persistent compound like DDT, DDE, or PCBs, 39:05 and if you measured that at three time points in pregnancy, 39:12 you would have an ICC of at least 0.9. 39:14 So what is the ICC? 39:20 It's the intraclass correlation coefficient. 39:22 You can think about it as a correlation coefficient 39:25 when you have more than two measurements. 39:28 And again, just like a correlation coefficient, 39:31 it ranges from a zero to one. 39:34 There aren't technically negative ICCs 39:42 so any of the ICC that you see 39:45 because of the way it's calculated are going to be positive. 39:48 So in that sense, it's slightly different 39:52 from a typical correlation coefficient, 39:54 but the range of measurements are still the same. 39:57 So why am I introducing this concept of the ICC? 40:02 There are some simple methods that take into account the ICC 40:07 when thinking about sample size needed. 40:13 So let's just walk back from this for a minute. 40:18 Let's say that you have three measures of BPA 40:20 in urine during pregnancy, one in each trimester. 40:26 So you develop a sample size calculation and in it, 40:32 you determined that you have four different outcomes. 40:40 And for each of these outcomes, let's say, hypothetically, 40:44 you need 50 participants for one outcome, 40:48 100 for another, 200 and 500. 40:53 So basically you'd need a study with a sample size of 500 40:57 in order to accomplish all these aims 41:03 and to analyze all these outcomes. 41:07 The assumption there with that power calculation 41:11 is that you have perfect measurement of your exposure. 41:14 But the ICC is a measurement of exposure reproducibility 41:21 and that ICC is related to the sample size 41:26 that you need to adequately power your study. 
41:33 And it's not important how this is exactly calculated, 41:38 but really what you're doing is inflating the sample size 41:41 by dividing the sample size by the ICC. 41:46 So let's say you assume perfect measurement of the exposure, 41:51 the ICC is one. 41:57 So it's perfectly correlated 41:58 across those three measurements. 42:01 Well, what if it's less correlated? 42:04 What if the ICC is 0.50? 42:06 Well, using this formula of N over the ICC, 42:10 we would end up with 50 over 0.50, 42:18 and so we would require 100 individuals. 42:24 So that's forcing us to inflate by a factor of two. 42:29 The issue then becomes 42:36 when you have these substantially lower ICCs. 42:37 And this is the territory we're in for exposures like BPA 42:42 that are measured in urine. 42:49 The ICC is fairly low. 42:51 As I mentioned on the previous slides, 42:52 the ICCs are somewhere between 0.10 and 0.3 42:55 for measurements taken during pregnancy. 43:02 So initially we're thinking about a sample size of say 500. 43:05 But, if the ICC is really 0.10, 43:12 we have to think about inflating our sample size 43:16 by a factor of 10. 43:18 So this really affects the statistical power 43:20 and it really makes us think about how many people we need 43:25 to adequately power our study 43:29 when there is such a degree of measurement error, 43:31 when there is so much misclassification of exposure. 43:35 So I'm showing you this because I want you to recognize 43:39 that yes, these measurements are not very reproducible. 43:43 What does that mean for epidemiologic study design? 43:50 Well, what that means is we're gonna need more individuals. 43:53 We've touched on this a little bit already 44:00 when we talked about measuring say lead in teeth 44:02 that we're able to ascertain these measures of cumulative, 44:08 and peak and average exposure. 44:12 What do these really mean?
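As a rough illustration of the N over ICC inflation described above, here is a minimal Python sketch. The function name `inflated_sample_size` and the rounding-up step are assumptions made for this example; the lecture only gives the ratio itself, not a full power calculation.

```python
import math


def inflated_sample_size(n_perfect: int, icc: float) -> int:
    """Inflate a sample size computed under perfect exposure measurement.

    n_perfect: sample size from a power calculation that assumes ICC = 1.
    icc: intraclass correlation coefficient of the exposure measure, in (0, 1].
    Returns n_perfect / icc, rounded up to a whole participant.
    """
    if not 0 < icc <= 1:
        raise ValueError("icc must be greater than 0 and at most 1")
    # Small tolerance so floating-point noise does not push an exact
    # quotient (e.g. 50 / 0.5) up to the next integer.
    return math.ceil(n_perfect / icc - 1e-9)


if __name__ == "__main__":
    # The lecture's hypothetical outcome-specific sample sizes.
    for n in (50, 100, 200, 500):
        for icc in (1.0, 0.5, 0.1):
            print(f"N={n:>3}  ICC={icc:.2f}  required={inflated_sample_size(n, icc)}")
```

With an ICC of 0.50 the 50-person outcome needs 100 participants, and with an ICC of 0.10 the 500-person study grows tenfold, which is the point being made about poorly reproducible urinary biomarkers.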
44:15 So we can measure cumulative exposure 44:17 through something like area under the curve 44:21 and I showed you that example. 44:23 But what does this really measure? 44:27 Typically, it's a cumulative or irreversible effect. 44:29 Think about cigarette smoking. 44:34 So, in contrasting these two 44:36 or these three rather. 44:42 If you smoke say two packs a day for 20 years, 44:45 three packs a day for 10 years 44:56 and maybe one pack a day for 20 years, 45:03 if it's the cumulative effect 45:13 of the cigarette smoking we're after, 45:15 it would be essentially if we had 50 years. 45:19 So 10, 20, 30, 40, 50, 45:27 and one, two and three packs a day. 45:36 We can think about for 20 years, smoking two packs a day, 45:40 for 10 years, smoking three packs a day 45:47 and then for 20 years, one pack a day. 45:51 So this cumulative process 45:56 is really the area under this entire curve. 45:59 So maybe that's really what matters 46:08 for our outcome of interest 46:10 and let's say it's lung cancer. 46:11 Maybe it's that total cumulative effect. 46:13 On the other hand, it may just be the peak exposure. 46:17 And its peak is an assessment of it's the greatest exposure 46:21 as the causal agent. 46:25 So the two pack a day smoking isn't enough to kind of cross 46:27 that exposure threshold to start 46:31 that underlying physiological process 46:35 which moves us toward lung cancer. 46:38 Maybe what we really need to do is hit this threshold 46:41 of three packs a day. 46:44 We can also think about the average. 46:48 And maybe the average is somewhere here. 46:51 So we can think about that as slow 47:01 or partially reversible effects. 47:04 It's related to cumulative exposure 47:06 in the case of persistent exposures. 47:08 Cumulative and average are essentially the same. 47:12 But instead of cumulative exposure 47:16 which encompasses more of the time element, 47:21 the average exposure 47:25 is really only encompassing kind of the average dose. 
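The three metrics for the smoking example can be sketched as follows. This assumes each period has a constant number of packs per day, so the area under the curve is simply years times packs (pack-years). The variable names are illustrative.

```python
# (years, packs per day) for each smoking period in the example above
periods = [(20, 2), (10, 3), (20, 1)]

total_years = sum(years for years, _ in periods)

# Cumulative exposure: area under the piecewise-constant curve, in pack-years.
cumulative = sum(years * packs for years, packs in periods)

# Peak exposure: the highest level reached at any point.
peak = max(packs for _, packs in periods)

# Average exposure: cumulative exposure with the time element divided out.
average = cumulative / total_years

print(cumulative, peak, average)  # 90 pack-years, 3 packs/day, 1.8 packs/day
```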
47:26 What does that look like? 47:34 Well, these are actual data. 47:35 The Y axis didn't come out very well, 47:38 but these are mean blood lead concentrations 47:40 in micrograms per deciliter. 47:43 So I'm gonna say mean blood lead levels 47:44 in micrograms per deciliter. 47:49 And this is an individual at different ages of assessment 47:54 in months, six months, 12, et cetera, 47:57 all the way up to 72 months or six years. 48:01 And what you can see is for this individual, 48:10 their concentrations start 48:13 around three micrograms per deciliter. 48:15 They peak between 24 and 36 months 48:17 and subsequently decline. 48:22 So again, like the tooth lead example, 48:26 we can ascertain a lot of information here. 48:29 So we can calculate the peak concentration 48:31 which is about nine micrograms per deciliter. 48:34 We can calculate an AUC or cumulative effect 48:39 and that would be this total area under the curve. 48:44 And we can also calculate a mean concentration 48:50 which is gonna be related to the AUC. 48:54 But instead of it being a total measurement, 49:03 the mean is going to be something like, 49:10 say five micrograms per deciliter 49:13 which isn't necessarily gonna take 49:17 into account the time element. 49:21 And you can see this on the next slide. 49:23 Again, we have the same individual 49:29 and we would calculate the AUC 49:34 by taking the total area under the curve. 49:38 So now that we've estimated these different exposure metrics 49:43 of cumulative and peak 49:48 and average exposure levels, 49:51 we can incorporate that into a data analysis. 49:55 So in this case, you can see we have peak, 50:00 we have concurrent 50:04 which is an assessment occurring at the same time 50:06 as the outcome assessment 50:10 and we have infancy average. 50:13 So this is from a table from a paper published in 2003 50:15 and it's looking at blood lead levels 50:21 and how they relate to child IQ.
50:27 And these are data from a cohort study 50:34 in Rochester, New York. 50:36 Same data from the slides I showed you just previously 50:37 where we constructed the blood lead concentration. 50:43 And so if we just take a look at these estimates here 50:47 in the far right column, this is the overall estimate. 50:51 IQ was assessed at two time points, at three and five, 50:54 but we'll just look at the overall for simplicity. 50:57 We can look at different estimates, 51:00 one for lifetime average, one for peak, 51:02 one for this concurrent measurement 51:08 and one for infancy average. 51:11 Well, what you see is that there's not much difference 51:15 between the lifetime average and the infancy average. 51:17 Also, not too much difference between lifetime average 51:21 and infancy average and the concurrent measurement. 51:25 But really peak, not as strong in association. 51:29 So for example, 51:33 for every 10 microgram per deciliter increase, 51:34 you have a decline in IQ of 4.4 points 51:37 for the peak exposure. 51:41 And for these other metrics of exposure, 51:44 you have about an eight point decline in child IQ overall 51:48 for each 10 micrograms per deciliter. 51:54 So this is useful information for a variety of reasons. 51:57 First of all, 52:02 it's informative from a public health standpoint. 52:03 So if we think about how do we regulate blood lead? 52:07 Well, this says it's not so much the peak concentration 52:11 that matters, it's the average exposure during a period. 52:14 It's also not just the exposure that occurs during infancy. 52:19 It seems to be exposure that occurs later as well. 52:24 So maybe we don't regulate 52:29 the child's highest blood lead concentration, 52:31 but what we regulate 52:34 are average blood lead concentrations or something. 52:35 It's a little hard to do, 52:41 but this does provide useful public health information. 
52:43 It also provides useful information to kind of go back 52:46 and think about what the mechanism may be. 52:51 Also what the critical period 52:53 for brain development might be. 52:55 You know, you might assume 52:57 that concentrations higher during infancy 52:59 are really what matter, 53:01 but this would suggest 53:02 that it doesn't really matter any more 53:03 or any less than average exposure 53:06 throughout the whole time period we're considering. 53:09 So we've talked about how many times 53:16 to measure this exposure. 53:19 We've talked about after you've measured it, 53:21 how to construct these different variables. 53:25 A related topic is this idea of pharmacokinetic modeling 53:28 or physiologically based pharmacokinetic models. 53:35 What these are, is using data 53:40 so measured exposure data. 53:44 Other information that's known about the kinetics 53:48 of the exposure. 53:52 Also information about in this case, 53:54 we're thinking about PCBs, PCB exposures that occur 53:58 in a child in utero and postnatally. 54:08 We also can incorporate child variables 54:11 like gestation weight, birth weight 54:14 in these exposure variables. 54:15 So the actual measured concentrations, when it was sampled 54:19 and breastfeeding, length of breastfeeding, et cetera. 54:24 What this allows us to do is through mathematical models are 54:30 to derive additional data about exposure. 54:35 So let me show you what I mean. 54:40 In this figure here, 54:43 we have measured levels identified by the open circle 54:45 which I'm gonna make red. 54:50 So in this example, 54:54 we have exposure measured at six months in the child. 54:56 We have it measured again at 16 months. 55:03 This is the second dot here. 55:08 And then again at say about 45 months. 55:11 So this is really all we have 55:19 are these measured concentrations. 
55:21 By utilizing the information here 55:27 and knowledge about the toxicokinetics of PCBs, 55:31 we can actually simulate kind of a time course 55:36 of what exposure would look like. 55:40 So for this individual, 55:43 they start at some level, say in utero 55:45 and that's based on a maternal specimen or a cord specimen. 55:49 So I should say we actually 55:53 have four concentrations measured here. 55:54 So we can using this model, estimate these profiles, 56:01 these exposure profiles or curves for an individual. 56:06 So based on the measured concentrations, 56:13 we know that it peaks and we know that it declines. 56:15 Why? 56:18 Because we've got a measurement up here 56:19 and measurements down here. 56:20 We also might know that for this individual, 56:23 that breastfeeding occurred for let's say eight months. 56:26 Because of that, 56:33 concentrations are peaking during that eight month period. 56:36 Subsequent to that, exposure may only be occurring 56:40 through normal dietary exposure, 56:44 not through breastfeeding 56:46 because breast milk is lipid-rich 56:48 and the exposure dose for persistent compounds 56:50 like PCBs matters a lot during that period. 56:54 So if I go back and clean this up a little bit. 57:02 So I know that this individual breastfed for eight months. 57:14 What I can do is derive information about peak exposure. 57:21 So this measurement was done at six months and we know that, 57:27 but it looks like exposure peaked again if I clean this up. 57:34 It looks like exposure peaked just before that. 57:40 Why might it have peaked just before that? 57:46 Well, maybe exclusive breastfeeding peaked 57:48 so let me just say at five months. 57:56 And so the peak exposure occurred at five months.
58:03 We measured concentrations in the infant at six months, 58:06 but because we have this model 58:10 and we have information about breastfeeding, 58:12 we know that the peak occurred slightly before 58:15 that measurement and then tapered down. 58:18 Now, going back, the only measurements that we had 58:22 on this individual are these maternal, six month, 58:26 16 month and 45 month measurements. 58:32 But using this model, 58:35 what we're able to do 58:37 is ascertain information about peak exposure. 58:39 And if you recall, 58:42 we can also say something about an area under the curve 58:44 for this individual. 58:49 Now you might imagine a case 58:52 where we have two different individuals. 58:56 And let's say that this is a fictitious Y axis. 59:01 We have time on the X axis in both cases. 59:09 And this is person one 59:13 and person number two. 59:18 Maybe individual two was never breastfed 59:23 and they had maternal concentrations 59:29 that were starting at one and maybe stayed at one. 59:31 And during that time period didn't really change much. 59:40 So their AUC, let's just say that this time was one year. 59:45 Their AUC was say 1.0 over that time period. 59:52 On the other hand, you may have an individual 1:00:01 who for six months was breastfed. 1:00:03 So they had 1.5 for six months. 1:00:15 And then for the last six months had minimal exposure, 1:00:20 but their average AUC is still gonna be 1.0. 1:00:27 What this pharmacokinetic model does is allow us 1:00:34 to say something about these exposure profiles. 1:00:37 So even though the AUC is the same, 1:00:41 the peak concentration is different. 1:00:45 This information can be incorporated again 1:00:48 into our analysis. 1:00:51 And rather than say just using AUC or something else, 1:00:53 we can say something more specific about peak exposure 1:00:57 or exposure at a specific time point, 1:01:01 say at one month, or two months, or three months, et cetera.
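The two-person comparison can be sketched with piecewise-constant exposure profiles. This is an illustration on the same fictitious scale, not the pharmacokinetic model itself. The level of 0.5 for person one's last six months is an assumed value, chosen so that the one-year AUC works out to 1.0 as described.

```python
def auc(segments):
    """Area under a piecewise-constant curve; segments are (duration_years, level)."""
    return sum(duration * level for duration, level in segments)

def peak(segments):
    """Highest exposure level reached in any segment."""
    return max(level for _, level in segments)

# Person one: breastfed for six months at 1.5, then a lower level (0.5 is assumed).
person_1 = [(0.5, 1.5), (0.5, 0.5)]
# Person two: never breastfed, steady at 1.0 for the whole year.
person_2 = [(1.0, 1.0)]

for name, profile in (("person 1", person_1), ("person 2", person_2)):
    print(name, "AUC:", auc(profile), "peak:", peak(profile))
```

Both profiles give the same AUC, but the peaks differ, which is the distinction a modeled exposure profile lets an analysis pick up.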
1:01:04 So that goes to the next slide. 1:01:10 So these are actual data from a study I did. 1:01:17 And the exposure of interest in this study was PCB exposure. 1:01:21 And we were looking at anti BCG levels. 1:01:26 So anti BCG levels are essentially levels 1:01:31 of antibody specific to the BCG vaccine. 1:01:35 And we measured this outcome at six months. 1:01:43 And we measured PCBs at birth 1:01:50 and at six months. 1:01:57 What we did is we used a pharmacokinetic model. 1:02:02 And from those exposures, we were able 1:02:07 to derive several different metrics of exposure. 1:02:11 So we were able to do month specific estimates 1:02:15 and we were also able to calculate an AUC 1:02:22 for that six month time period 1:02:25 and also peak exposure for that time period. 1:02:28 So let's ignore this side of the figure for now. 1:02:34 But you can see here, 1:02:39 these are the different month specific exposures, 1:02:40 one, two, three, four, five and six. 1:02:43 And we also have an AUC measure and a peak measure. 1:02:47 Each of these estimates here represents the percent change 1:02:53 in the outcome for a change 1:02:58 in exposure across the interquartile range. 1:03:01 And don't worry too much about this, 1:03:08 but these are two different outcomes specific 1:03:10 to the anti BCG. 1:03:12 But essentially what you see 1:03:15 is the closer the exposure assessment 1:03:17 is to the outcome assessment which is again, 1:03:20 the outcome is assessed at six months, 1:03:23 the stronger the association. 1:03:26 So as we go further back in time, 1:03:28 we see that the association is less strong. 1:03:32 So this tells us something about the nature 1:03:35 of this association between PCBs and anti BCG. 1:03:38 It suggests that concurrent exposure 1:03:43 is most strongly related to the outcome. 1:03:47 It also suggests that exposure around six months 1:03:50 might matter most.
1:03:55 When we also looked at AUC, 1:03:58 AUC was not as strongly associated 1:04:00 as the six month measurement 1:04:03 which again, provides information 1:04:05 that it's not just the dose, the cumulative dose, 1:04:07 it seems to be timing specific. 1:04:11 And same thing with peak. 1:04:14 We did observe associations with peak, 1:04:15 but again, not as strong as we did 1:04:18 with the six month time point. 1:04:20 So by analyzing data using these models, 1:04:22 we're able to derive these additional metrics of exposure 1:04:27 and capitalize on these when we analyze our data. 1:04:32 So this would suggest that PCBs matter in terms of timing 1:04:35 and not just dose. 1:04:42 So it's not just the dose, 1:04:44 it's not the AUC or the peak. 1:04:46 It seems to mean that timing really matters. 1:04:48 Let's look at another issue in environmental epidemiology 1:04:54 and for measuring biomarkers of exposure. 1:05:00 Let's assume that you have a case control study 1:05:06 of breast cancer and serum pesticide concentrations. 1:05:08 So pesticides are your exposure of interest 1:05:12 and your outcome here is breast cancer. 1:05:20 So what we really want to know, 1:05:25 we have 500 cases of breast cancer and 500 controls. 1:05:28 So my question to you is what is the measure 1:05:34 of association you want to know? 1:05:36 Well, what we really want to know 1:05:38 to make inferences about causality 1:05:41 is whether the exposure concentrations 1:05:45 to these pesticides are higher 1:05:48 among breast cancer cases relative to non-case controls. 1:05:51 So, is the concentration higher 1:05:59 in these 500 cases versus the 500 controls? 1:06:03 That really is at the end of the day, 1:06:14 what we want to know. 1:06:16 Now, typically how we would conduct this study 1:06:18 is we would determine concentrations 1:06:21 in each of the cases, in each of the controls. 1:06:23 And we would arrange people into a two by two table.
1:06:27 So what does that look like? 1:06:34 Here's our two groups, 1:06:35 the case group, the control group, 1:06:40 and we can divide people 1:06:43 into high and low pesticide concentrations. 1:06:45 So we've analyzed the 500 cases, 500 controls. 1:06:51 So we've analyzed 1000. 1:07:01 And I told you earlier that this might cost around $750 1:07:03 so this is $750,000 to do. 1:07:08 This is quite expensive. 1:07:13 At the end of the day, you could say, 1:07:16 "Well, why don't I just take blood from all of the cases, 1:07:18 "pool them into a single aliquot and analyze one aliquot? 1:07:25 "I could do the same for controls, 1:07:34 "aliquot out a sample from each of the controls 1:07:36 "and pool them and analyze that single pool." 1:07:43 So this should say one pool 1:07:47 and this really should say one pool. 1:07:52 So an aliquot from each individual. 1:07:53 Then what have I done? 1:07:57 Well, I've analyzed two specimens. 1:07:58 It's gonna cost me $1,500, right? 1:08:02 You could think about doing that, 1:08:08 only analyzing two specimens. 1:08:10 There are lots of issues with that 1:08:13 and that's to the extreme. 1:08:17 But what if we look at this a little bit differently 1:08:20 and let's say that we have pools of 10. 1:08:24 Pools of 10 specimens. 1:08:33 That means for cases, 1:08:37 we would have 50 sets. 1:08:43 And for controls, we would have 50 sets. 1:08:46 Remember there's 10 in each. 1:08:49 So, we pool 10 people together, 1:08:53 another 10, another 10 and we do that for 50 sets. 1:08:57 We then distribute each set into a high or low group. 1:09:01 So maybe set one goes here, set two, set three, 1:09:06 set four, set five, six, seven, et cetera. 1:09:10 Then we get to the end of those 50 sets. 1:09:14 And let's say it turns out that we've evenly divided them 1:09:17 into high and low groups based 1:09:22 on their average concentration. 1:09:24 Same thing with the controls.
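The cost arithmetic behind pooling can be sketched as follows, assuming a flat $750 per assay (the figure quoted in the lecture) and equal-sized pools within each group. The function and constant names are illustrative.

```python
COST_PER_ASSAY = 750  # dollars per specimen analyzed, as quoted in the lecture

def n_assays(n_cases: int, n_controls: int, pool_size: int) -> int:
    """Number of laboratory assays when each group is split into pools of pool_size."""
    # -(-a // b) is ceiling division, so a final partial pool still counts as one assay.
    return -(-n_cases // pool_size) + -(-n_controls // pool_size)

# pool_size 1 = individual analysis; 500 = one pool per group (the extreme case)
for pool_size in (1, 10, 500):
    assays = n_assays(500, 500, pool_size)
    print(pool_size, assays, assays * COST_PER_ASSAY)
```

Pools of 10 bring the design from 1000 assays down to 100. Each pooled result stands in for the average concentration of the 10 people in that set.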
1:09:27 We have 50 sets and we analyze set number one 1:09:29 with the 10 specimens. 1:09:33 And let's say we put it in this group, 1:09:35 set number two, and three, four, five, et cetera. 1:09:37 So, rather than analyzing individual concentrations, 1:09:43 we're now analyzing pools of 10 specimens 1:09:47 and then distributing them 1:09:51 into the two by two table dependent 1:09:53 on where that mean concentration lies. 1:09:56 So again, instead of taking for example, 1:10:00 we have five specimens, 1:10:05 rather than analyzing each of those five individuals, 1:10:07 we take this whole group, 1:10:11 we mix them together simplistically 1:10:13 and we analyze a single specimen 1:10:16 that might be a pool of five. 1:10:19 What this does is this reduces the cost. 1:10:22 So now we're analyzing a total of 100 sets rather than 1000 1:10:24 so we've decreased our costs substantially. 1:10:31 Now, there are some issues with this. 1:10:34 You can't look at effect modification 1:10:36 and there are some methods 1:10:39 that you need to apply in the data analysis. 1:10:40 But in this case, 1:10:43 you can certainly adopt the idea of pooling 1:10:45 and it is done on occasion, 1:10:49 something that you should be aware of. 1:10:51 So you've decided on how to collect them. 1:10:55 You've decided on how often to collect them. 1:10:59 You've decided on the matrices 1:11:01 in which you're going to measure them in. 1:11:06 Now you have to actually analyze the specimens 1:11:09 or probably send them off to an analytic chemist 1:11:12 who can do it for you. 1:11:17 So what are the issues there? 1:11:19 This is a picture from the cleanup of serum. 1:11:23 So serum cleanup for several individuals. 1:11:30 We have this person here, here, et cetera. 1:11:38 So you can see the color of the serum. 1:11:49 So we have one, two, three, four, five, six, 1:11:54 seven, eight, nine, 10. 
1:11:58 We have one that is a standard 1:12:04 which contains known amounts 1:12:06 of the different PCBs we're gonna analyze. 1:12:07 And we have a blank concentration. 1:12:10 So you've collected, let's say serum from this individual. 1:12:15 You have it in the test tube. 1:12:21 Let's say you collect whole blood, 1:12:25 you spin it down and you remove clotting factors. 1:12:27 And then you're left with serum, you thaw it, 1:12:31 and you weigh it and you add it to this extraction process. 1:12:36 So all those steps occur. 1:12:43 You then have to do "cleanup." 1:12:44 So in here, 1:12:48 there's essentially kind of like a cotton material that acts 1:12:50 as a filter and to clean up anything in this matrix. 1:12:54 The next step here is to concentrate these samples. 1:13:02 So again, after the cleanup process, 1:13:07 you're left with considerably less. 1:13:09 And they further get concentrated under a nitrogen stream 1:13:16 and you're left with very small amounts. 1:13:20 These are each separate steps with different columns. 1:13:23 After that, this is a GC-ECD machine. 1:13:29 This is gas chromatography with electron capture detection. 1:13:37 This is an older technology 1:13:42 for measuring substances like PCBs 1:13:44 in different biological matrices. 1:13:49 So this is an oven, it heats up. 1:13:53 The sample is injected from the top. 1:13:55 And based on when that sample comes out 1:13:57 of the gas chromatograph, 1:14:00 it provides information about the molecular weight 1:14:01 and the relative concentration. 1:14:04 So after the cleanup, 1:14:08 you have to inject these into the machine 1:14:10 and you get a read out on the machine. 1:14:13 This for example is the peak for p,p'-DDE 1:14:16 which is probably the most prevalent congener related to DDT. 1:14:22 And then looking at one of these curves, 1:14:31 you have to then figure out what the concentration is 1:14:34 by determining information on the peak 1:14:37 and the length of the peak.
1:14:42 So the point in those previous slides 1:14:47 is that it takes quite a bit of time. 1:14:51 So to do 10 samples, it might take a person a day 1:14:54 to extract them, to clean up, to concentrate them, 1:14:58 to put them through the GC-ECD, 1:15:02 and then to integrate the curves. 1:15:06 So when you think about measuring these specimens 1:15:09 say on 1000 people at three time points, 1:15:13 you think about 3000 specimens. 1:15:17 Well, that's a person working every day for a year 1:15:19 so it takes quite a bit of time to do this work. 1:15:23 And thus, when I say it's $750 a specimen, 1:15:26 you can easily see how that cost quickly adds up. 1:15:30 So let's discuss some of the potential issues 1:15:36 that occur when analyzing biomarker data. 1:15:39 The first and probably the most common issue 1:15:44 is that biomarker concentrations are subject 1:15:48 to a limit of detection. 1:15:52 When you're measuring compounds 1:15:55 that occur at the parts per million 1:15:57 or parts per billion level, 1:15:59 often some concentrations won't be detected within a sample 1:16:02 of say serum, or plasma, or breast milk 1:16:09 because they're just not a prevalent compound. 1:16:13 So one example is TCDD, commonly referred to as dioxin. 1:16:16 And in this case, 1:16:23 you can see that these are data from NHANES 1:16:24 in multiple survey years. 1:16:28 So 1999 and 2000, et cetera. 1:16:30 And if you look at these percentiles, so the median here, 1:16:33 the 50th percentile, 1:16:37 you can see that the median concentration for all years 1:16:39 for the entire group of individuals 1:16:45 is below the limit of detection. 1:16:46 So what do you do with concentrations 1:16:51 that are subject to a limit of detection? 1:16:55 Well, there are a few ways to deal with it. 1:16:59 The first is what we call simple substitution. 1:17:02 And this is useful when some of the concentrations 1:17:06 are below a limit of detection, say less than 10%.
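A minimal sketch of simple substitution, assuming non-detects are recorded as None and that one fill value is used for every non-detect. The three fill rules shown (half the LOD, the LOD over the square root of two, and the LOD itself) are common conventions. The function name and the example values are hypothetical.

```python
import math

def substitute_below_lod(values, lod, method="half_lod"):
    """Replace non-detects (None or values below the LOD) with a single fill value."""
    fill_values = {
        "half_lod": lod / 2,
        "lod_over_sqrt2": lod / math.sqrt(2),
        "lod": lod,
    }
    fill = fill_values[method]
    return [fill if v is None or v < lod else v for v in values]

def fraction_below_lod(values, lod):
    """Share of measurements that are non-detects, to judge whether substitution is reasonable."""
    below = sum(1 for v in values if v is None or v < lod)
    return below / len(values)

# Hypothetical serum concentrations with an LOD of 1.0 (arbitrary units)
measured = [None, 2.0, 3.5, 1.2, 4.1, 2.8, 6.0, 1.9, 2.2, 3.3]
print(fraction_below_lod(measured, lod=1.0))
print(substitute_below_lod(measured, lod=1.0, method="half_lod"))
```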
1:17:10 So there are methods such as using one-half the LOD value, 1:17:14 or the LOD divided by the square root of two, 1:17:20 or just substituting the actual limit of detection. 1:17:25 These are fine 1:17:30 when the amount of essentially missing data is trivial. 1:17:32 There are other methods when the amount of missing data 1:17:38 is a bit more substantial, more than 10% to about 50%. 1:17:42 Multiple imputation is one example 1:17:48 and there's a whole body of literature on this 1:17:50 in the epidemiology field and in epi methods. 1:17:53 And essentially this involves simulating multiple data sets 1:17:59 and then imputing values based 1:18:03 on those simulated data sets. 1:18:06 It performs well, 1:18:08 multiple imputation does, when the range of missingness 1:18:10 is in this 10 to 50%. 1:18:14 And in fact, it's really the ideal method 1:18:18 when you have less than 50% of values 1:18:21 that are censored or are missing. 1:18:25 There are some other techniques out there 1:18:27 that can be used like Cox regression 1:18:30 where you're really treating exposure 1:18:33 as a censored measurement where it's really left censored. 1:18:37 I was involved with a group that published a paper 1:18:45 where they treated exposure as outcomes. 1:18:48 So they flipped X and Y in the regression equation 1:18:52 and treated the exposure as a Y value which was censored. 1:18:56 This gets complicated very quickly. 1:19:02 It limits the analysis that you can do. 1:19:04 And obviously if you anticipate having lots 1:19:07 of values below the LOD, 1:19:11 you need to rethink your method of analysis 1:19:12 or the matrix you're using, 1:19:16 or even if it's relevant to measure that compound. 1:19:19 So a second problem or really debate in this field 1:19:26 is how to adjust for the solvent. 1:19:30 So chemical concentrations in urine for example, 1:19:33 are a function of urine output.
1:19:37 So in other words, if your urine is very dilute, 1:19:40 say you just drank a lot of water, 1:19:44 the amount of chemical concentration in there 1:19:48 is gonna be a function of how dilute that urine is. 1:19:52 And so there are ways to adjust for urine output. 1:19:56 One is called creatinine concentration 1:20:00 and creatinine is a breakdown product of muscle metabolism that you can measure in urine. 1:20:03 And so it's a way of adjusting for the dilution 1:20:08 of the urine. 1:20:13 And a second method that you'll see is specific gravity. 1:20:14 So if you encounter publications 1:20:17 that are using non-persistent chemicals measured 1:20:21 in urine such as BPA and phthalates, 1:20:24 you'll often see some correction made 1:20:30 for the concentration based 1:20:34 on either creatinine concentration 1:20:36 or specific gravity. 1:20:38 Same goes for lipid-soluble compounds like PCBs, DDT. 1:20:42 And the most typical method is to adjust 1:20:54 for lipid concentration 1:20:57 so measuring total cholesterol 1:20:59 and some other specific types of lipids. 1:21:03 How you do that adjustment is debatable. 1:21:06 So one method is to add the lipid concentration 1:21:11 as a separate term, 1:21:16 as a covariate in a statistical model 1:21:18 and adjust for it that way. 1:21:20 Another method is just to standardize concentration. 1:21:22 So if you just measure PCBs in blood, 1:21:26 you'll see something like 5 nanograms per ml. 1:21:31 But if it's standardized per lipid, 1:21:37 you might see it as 5 nanograms per gram lipid. 1:21:41 So in that case, 1:21:47 the concentration is actually standardized by the amount 1:21:48 of lipid in the blood. 1:21:53 The best way of adjusting for this really is debatable. 1:21:55 And there are some methods papers dealing with this 1:21:59 by a group at NIEHS, also Enrique Schisterman 1:22:02 who's also at NIH.
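The standardization approach can be sketched as simple unit conversions. These are illustrative helpers, not the method of any particular paper. The function names, the reference specific gravity of 1.020, and the example values are assumptions, and the covariate-adjustment alternative is not shown.

```python
def creatinine_corrected(conc_ug_per_l: float, creatinine_g_per_l: float) -> float:
    """Urinary concentration per gram of creatinine (ug/g creatinine)."""
    return conc_ug_per_l / creatinine_g_per_l

def specific_gravity_corrected(conc: float, sg: float, sg_reference: float = 1.020) -> float:
    """Scale a urinary concentration to a reference specific gravity (1.020 is assumed)."""
    return conc * (sg_reference - 1) / (sg - 1)

def lipid_standardized(conc_ng_per_ml: float, total_lipids_g_per_l: float) -> float:
    """Serum concentration per gram of lipid (ng/g lipid)."""
    lipids_g_per_ml = total_lipids_g_per_l / 1000  # g/L -> g/mL
    return conc_ng_per_ml / lipids_g_per_ml

# Hypothetical values: dilute urine gets scaled up, and the serum value is re-expressed per gram lipid.
print(creatinine_corrected(2.0, 0.5))
print(specific_gravity_corrected(2.0, sg=1.010))
print(lipid_standardized(3.0, total_lipids_g_per_l=6.0))
```

A sensitivity analysis would rerun the same model with a different one of these adjustments and check whether the conclusions hold.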
1:22:08 This is a pretty typical sensitivity analysis in papers 1:22:12 that adjust for creatinine concentrations, specific gravity 1:22:18 or lipid adjustment is to do it one way 1:22:23 in the primary analysis and in a secondary analysis 1:22:26 or in a sensitivity analysis to say, 1:22:29 "We're doing it another way 1:22:32 "and do our results really differ depending 1:22:34 "on the method of adjustment." 1:22:39 So you have your results from your assays 1:22:44 and you need to interpret those results 1:22:47 and you need to do so carefully. 1:22:51 So I'm gonna walk you through an example 1:22:55 of how you can go wrong. 1:22:58 And the authors in this study didn't go wrong, 1:23:01 but I think if they hadn't thought about this 1:23:05 or hadn't known as much as they did about the pharmacokinetics 1:23:08 of perfluorinated compounds, they might've gone wrong. 1:23:14 So their question of interest involved perfluorinated compounds 1:23:18 so these are the PFAS. 1:23:22 So things like PFOS and PFOA, 1:23:26 and these are chemicals that are used as stain repellents. 1:23:32 They're used in products that carry food. 1:23:36 They were used in a lot of consumer products like Scotchgard 1:23:40 and things like that. 1:23:45 They're very prevalent. 1:23:46 And there's decent experimental evidence suggesting 1:23:48 that they're developmental toxicants. 1:23:52 So their study was looking 1:23:56 at these perfluorinated compounds and fecundability. 1:23:58 And fecundability is the probability of conception 1:24:03 in a given menstrual cycle. 1:24:07 So it's a reproductive outcome. 1:24:09 It's focused on the female partner in this case. 1:24:11 And the outcome really is time to pregnancy. 1:24:16 This reflects fecundability. 1:24:20 So time to pregnancy in this case is time from initiation 1:24:23 of unprotected intercourse to conception. 1:24:27 And it's typically self-reported as in the case 1:24:31 of this study.
1:24:35 So the construct here is this fecundability or probability 1:24:36 of conception and how they operationalize that construct is 1:24:40 through this assessment of time to pregnancy 1:24:44 which is self-reported. 1:24:47 So they're including 932 subjects. 1:24:51 They have a case group which includes subfecund women 1:24:56 or it is a group of subfecund women. 1:25:01 And these are women 1:25:04 whose time to pregnancy was longer than 12 months. 1:25:05 And then a control group 1:25:10 where time to pregnancy was less than or equal to 12 months 1:25:11 and the women had no reported fertility treatments. 1:25:15 The exposure of interest here 1:25:19 were maternal perfluorinated compounds 1:25:23 and they're measured at 17 weeks of gestation. 1:25:26 So they were measured after the mother got pregnant. 1:25:29 So logistic regression analyses were used. 1:25:36 Results were stratified by parity. 1:25:39 It's a bit cut off in the figure here, 1:25:42 but these come from a paper by Christy Whitworth 1:25:45 and they were published in Epidemiology. 1:25:48 So what did the authors find? 1:25:56 Let's just focus here on the adjusted estimates. 1:25:58 And so these are odds ratios and confidence limits here. 1:26:02 We have those for PFOS and those for PFOA. 1:26:07 So it looks like for PFOS 1:26:18 that as the concentrations increase, 1:26:22 that the odds of subfecundity increase 1:26:26 and that's also the same for PFOA. 1:26:34 And you can see here, 1:26:38 the tests for trend indicate that there's dose response 1:26:39 and you can see that here. 1:26:44 So in the previous table, you saw the results collapsed, 1:26:50 not stratified by parity. 1:26:56 These results are shown in graphical form, 1:26:59 and they are stratified by parity. 1:27:02 So these are the results for PFOS. 1:27:04 And the blue is for nulliparous women. 1:27:10 So these are women who have not had a previous pregnancy. 1:27:15 And the green is for parous women.
1:27:20 So they've had a previous pregnancy. 1:27:23 So I'll say previous pregnancy positive, 1:27:28 and I'll say previous pregnancy negative. 1:27:31 And what's modeled are the odds of subfecundity. 1:27:38 And we have four exposure categories, quartiles. 1:27:51 This is the reference category, second, third and fourth. 1:27:56 And generally what we see are elevated odds 1:28:02 for the parous women. 1:28:07 And sort of this protective association 1:28:11 for nulliparous women. 1:28:19 So in other words, 1:28:23 parous women are at an increased odds for subfecundity 1:28:25 whereas nulliparous women, 1:28:31 women who have never had a previous pregnancy don't seem 1:28:33 to have that risk or it is protective. 1:28:38 So it almost looks generally these confidence limits 1:28:41 are fairly wide. 1:28:44 They're fairly wide for most of the estimates, 1:28:45 but it generally looks like no association 1:28:49 for the nulliparous women. 1:28:52 But for parous women, 1:28:53 women who have had a previous pregnancy, 1:28:54 it looks like the perfluorinated compound, PFOS 1:28:59 in this case, is associated with an increased odds 1:29:02 of having this delayed time to pregnancy. 1:29:07 If we look at the results for PFOA now, 1:29:13 we see a similar trend. 1:29:18 Essentially no association for the nulliparous women, 1:29:20 but a positive association for the parous women. 1:29:29 So again, it looks like only 1:29:35 for the parous women 1:29:37 are these perfluorinated compounds associated 1:29:40 with an increased odds of subfecundity. 1:29:43 So one explanation is that it's causal 1:29:51 that these perfluorinated compounds are causing women 1:29:56 to have a delayed or a longer time to pregnancy. 1:30:01 So these compounds somehow interfere with reproduction. 1:30:06 So that is the causal explanation. 1:30:12 Another way of looking at this 1:30:17 is to think about the pharmacokinetics 1:30:18 of these compounds.
1:30:21 So this is from a publication by Olsen in 2009 1:30:22 and it's looking at a hypothetical woman 1:30:30 who's trying to get pregnant in the late 1990s. 1:30:32 So she has some body burden 1:30:42 of these perfluorinated chemicals. 1:30:45 So as we go up on the Y axis, the body burden increases. 1:30:48 So this woman is trying to get pregnant. 1:30:58 Then she's kind of chugging along in time. 1:31:01 She isn't pregnant yet. 1:31:05 So there aren't changes, dramatic changes 1:31:07 in this body burden of perfluorinated compounds. 1:31:10 Well, she gets pregnant. 1:31:14 And then during pregnancy, the body burden decreases. 1:31:17 Why does it decrease? 1:31:21 Well, her body burden decreases 1:31:23 because she's offloading these perfluorinated compounds 1:31:25 onto the fetus. 1:31:29 After she gives birth, 1:31:32 her body burden continues to decline during lactation 1:31:36 because she's breastfeeding the infant 1:31:41 and breast milk is rich in perfluorinated compounds. 1:31:43 And so again, she continues 1:31:47 to offload these compounds, this time onto the neonate. 1:31:49 So after she's finished with pregnancy, 1:31:54 she's finished with breastfeeding, 1:31:56 what happens is she then begins to go back up 1:31:59 to her normal baseline which is up here. 1:32:03 This is the process of re-equilibration. 1:32:08 So she'll continue to ascend to her typical body burden. 1:32:12 Why? 1:32:18 Because she'll resume her normal diet and typical exposures. 1:32:19 So this nicely describes what happens 1:32:24 to the body burden during pregnancy and lactation. 1:32:27 So now this is where things get a little bit complicated. 1:32:32 And so again, 1:32:36 we have the woman who pre-pregnancy has a steady state, 1:32:37 drops during pregnancy, drops during lactation. 1:32:41 Then when she's finished, she begins to re-equilibrate. 1:32:46 Now let's say that we're dealing with a parous woman, 1:32:50 so that woman will have had a pregnancy here.
1:32:57 So this is pregnancy number one 1:33:02 and this would be pregnancy number two, 1:33:06 or this could be pregnancy three and four et cetera. 1:33:11 So she, during this re-equilibration process begins trying 1:33:17 to get pregnant again. 1:33:24 Now let's say there are two different women here 1:33:27 and one woman has a short time to pregnancy here, 1:33:31 and woman two has a long time to pregnancy. 1:33:38 Now, recall that in the previous slide, 1:33:48 what we saw was 1:33:52 that this adverse association was only present 1:33:53 in parous women. 1:33:59 So women like those in this diagram, right? 1:34:00 Because parous women 1:34:04 are going through this re-equilibration process. 1:34:05 Nulliparous women are always at a steady state. 1:34:09 So what I'm about to show you does not apply 1:34:14 to the nulliparous women. 1:34:19 So let's say these two women both begin trying 1:34:25 to get pregnant. 1:34:31 And for whatever reason, 1:34:33 woman one has a short time to pregnancy 1:34:34 and woman two has a longer time to pregnancy. 1:34:37 Well, what have we done? 1:34:41 We have artificially fixed the association 1:34:43 between these perfluorinated compounds 1:34:48 and time to pregnancy. 1:34:51 How? 1:34:53 Well, a woman with a short time to pregnancy 1:34:55 is guaranteed a lower body burden 1:34:59 and a woman with a longer time to pregnancy 1:35:05 is guaranteed a higher body burden of these compounds. 1:35:09 So without any kind of causal mechanism at play, 1:35:15 we have guaranteed an association 1:35:21 between perfluorinated compounds and time to pregnancy. 1:35:25 So the pharmacokinetics of this exposure make it look like 1:35:31 for parous women who've had a short time to pregnancy, 1:35:36 they're gonna have a lower body burden. 1:35:41 So a short time to pregnancy is a good thing, 1:35:43 and they're gonna have lower concentrations on average. 1:35:47 So what's that gonna do? 
1:35:51 It's gonna introduce an association 1:35:53 whereby higher concentrations are associated 1:35:58 with longer time to pregnancy. 1:36:00 And again, we would only see this in parous women. 1:36:02 Why? 1:36:11 Because only parous women go 1:36:12 through this re-equilibration process. 1:36:14 Now, if you go back to the previous slides, 1:36:19 recall that Whitworth only observed the association 1:36:23 among parous women and not among nulliparous women. 1:36:29 Based on that, you might start to think that, 1:36:35 well, maybe the association they've seen 1:36:38 is really due to pharmacokinetics 1:36:41 and has nothing to do with an actual causal mechanism 1:36:43 that has something to do with the developmental toxicity 1:36:51 of these compounds. 1:36:55 So it's completely artificial. 1:36:56 This association is completely artificial 1:36:58 and it is due to the pharmacokinetics of these compounds. 1:37:01 So again, it may not be causal, but possibly due 1:37:07 to pharmacokinetics. 1:37:11 So longer time to pregnancy allows 1:37:12 for greater amounts of time 1:37:14 for these perfluorinated concentrations to rise back 1:37:17 to baseline following birth. 1:37:20 So wrapping up some conclusions about biomarkers, 1:37:24 exposure measurement needs 1:37:28 to be considered during all phases of the study. 1:37:30 So I started this lecture talking 1:37:33 about selecting your biomarker. 1:37:34 Which matrix are you gonna choose? 1:37:37 Is it gonna be urine? 1:37:40 Is it gonna be nails? 1:37:41 Is it gonna be hair, et cetera? 1:37:43 How many times do you need to sample it? 1:37:46 How many times do you need to collect it? 1:37:48 Then at the analysis phase, how are you gonna analyze it? 1:37:51 Are you gonna pool the data, perhaps? 1:37:56 Are you gonna pool the exposure measurements together? 1:37:59 That's one technique. 1:38:04 And then further on in the data analysis phase, 1:38:05 how are you gonna handle values below a limit of detection?
1:38:09 How are you gonna handle potential contamination? 1:38:13 How do you interpret the results? 1:38:16 So you need 1:38:17 to be thinking about all of these issues about biomarkers. 1:38:18 You don't just sort of select it and be done with it. 1:38:21 There are issues intrinsic to biomarkers 1:38:24 that carry throughout the entire study. 1:38:26 Biomarkers can be great 1:38:30 because they can reduce misclassification, 1:38:31 exposure misclassification compared 1:38:33 with other methods of exposure assessment. 1:38:36 So in environmental epi, 1:38:38 the biggest issue that we face 1:38:40 is really exposure assessment and misclassification. 1:38:43 Sometimes biomarkers are a lot better at assessing exposure. 1:38:47 So if I ask you to tell me about your fish consumption 1:38:51 during pregnancy and before, 1:38:55 if I want to assess methylmercury, 1:38:57 I would say that, 1:39:00 "Why don't you just take a sample of my hair 1:39:02 "and get an objective measurement of it 1:39:04 "and not base the assessment 1:39:07 "of exposure on my recall?" 1:39:09 So that's a great example 1:39:12 where a biomarker is clearly better than asking someone 1:39:14 to recall their diet. 1:39:17 Nevertheless, you need to think carefully 1:39:20 about these biomarkers. 1:39:23 You need to think about potential contamination. 1:39:25 You need to think 1:39:28 about whether you're gonna have sufficient power 1:39:29 if you use these biomarkers. 1:39:32 We talked about the issue with BPA. 1:39:34 You also need to think about how you interpret your results. 1:39:37 It is a biomarker. 1:39:40 It is subject to a whole bunch of other processes, 1:39:41 physiological processes in the body. 1:39:45
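As an illustration of how physiological processes can shape a biomarker result, the re-equilibration argument from the perfluorinated compound example can be sketched as a small simulation. Everything numeric here is an assumption made for illustration: the monthly conception probability, the fraction of body burden lost during pregnancy and lactation, the recovery rate, and the simplification that concentration is measured at conception. Time to pregnancy is drawn independently of exposure, so there is no causal effect in the simulated data by construction.

```python
import math
import random
from statistics import mean

# All parameter values below are ASSUMPTIONS for illustration only.
MONTHLY_CONCEPTION_PROB = 0.15   # same for every woman, unrelated to exposure
DEPLETION_FRACTION = 0.5         # share of baseline lost by the end of lactation
RECOVERY_RATE = 0.05             # per-month rate of return toward baseline

def simulate_difference(parous, n=20000, seed=0):
    """Mean concentration in subfecund women (TTP > 12 months) minus the mean
    in women with TTP <= 12 months, with no causal effect of exposure."""
    rng = random.Random(seed)
    conc_long, conc_short = [], []
    for _ in range(n):
        baseline = rng.lognormvariate(0.0, 0.3)   # steady-state body burden
        # Time to pregnancy in months: geometric, independent of exposure
        ttp = 1
        while rng.random() >= MONTHLY_CONCEPTION_PROB:
            ttp += 1
        if parous:
            # Still climbing back to baseline after the previous pregnancy;
            # the longer she waits, the closer she gets to baseline.
            deficit = DEPLETION_FRACTION * math.exp(-RECOVERY_RATE * ttp)
            conc = baseline * (1.0 - deficit)
        else:
            conc = baseline                        # always at steady state
        (conc_long if ttp > 12 else conc_short).append(conc)
    return mean(conc_long) - mean(conc_short)

parous_diff = simulate_difference(parous=True)
nulliparous_diff = simulate_difference(parous=False)
print(f"parous:      {parous_diff:+.3f}")
print(f"nulliparous: {nulliparous_diff:+.3f}")
```

Under these assumptions the parous group shows higher concentrations among women with longer time to pregnancy, while the nulliparous group shows essentially no difference, which is the pattern the pharmacokinetic explanation predicts without any toxic effect on reproduction.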