
This discussion explains hypothesis testing with clear examples, including what a null hypothesis is and what significance means.

Question

How would you explain hypothesis testing and significance to your parents (*)?
The shorter, the better! (*) Assuming that your parents don't have formal statistical training. If they do, go up your family tree as many levels as needed.
HTTP://WWW.RESEARCHGATE.NET/PROFILE/ANDREAS_BRANDMAIER/
HTTP://WWW.RESEARCHGATE.NET/POST/HOW_WOULD_YOU_EXPLAIN_TO_YOUR_PARENTS_HYPOTHESIS_TESTING_AND_SIGNIFICANCE

POPULAR ANSWERS

Jochen Wilhelm, Justus-Liebig-Universität Gießen
OK, I will explain it to my dad (my mom won't be much interested in this stuff): "After my friend and I visited you last time, one of us left a pile of books in your living room. No need to bother if these were my books, but you should say something if these were my friend's books, since he will probably miss them and search for them. Consider that the phone call you would have to make in this case would be VERY EXPENSIVE (you would not like to call for no reason). So, whose books do you think these are? Should you call or not? You know most of the books I have, but you don't know which books my friend has. You can check if these books fit into my "portfolio". If this is the case, you can't say much; they could well be my books, but it's also still possible that they belong to my friend, since he could have a similar taste (which you don't know). To avoid an expensive call that might be unnecessary, you wouldn't call in this case. But if these books do not fit into my library at all, then you will strongly suppose that these must be my friend's books, or else I would have completely changed my taste, which you consider really unlikely. In this case you feel you should inform us about the books (my friend would surely be very happy and pay for the call).

Statistically speaking: this scenario, that I left the books, is called a "hypothesis". There is an alternative scenario or alternative hypothesis, namely that my friend left the books. You know what books to expect when *I* leave books in your room. Therefore, this scenario is entitled the "null hypothesis" (just to give it a name). Given the particular books lying on your table, one can then use this knowledge and calculate a probability value that *I* would leave *these* books (or, better, "books like those"). This may be more or less likely. If it is quite likely, these books do not provide evidence against the null hypothesis: no call. But if it is quite unlikely, you have two options: either you believe that the null hypothesis (the scenario in which I left the books) is not true (call!), or you believe that in this stupid case it was still me who left the books, but it was a very unlucky incident that I left just those books you wouldn't expect to be part of my library (strange...) (no call). Statisticians don't like to believe in unlucky or strange incidents. They would much rather believe that the scenario was not the true situation. And if they conclude that it wasn't me who left the books, it could only have been my friend who forgot them. In this case they call their finding statistically significant, which means "this is considered too unlikely if the null hypothesis were correct". Consequently, they "reject" the null hypothesis, and they would phone me and tell me: "Hey, there is a pile of books lying around here, and these books are statistically significantly *not* your books. Hence I suppose your friend left them here"."
Sep 25, 2012

ALL ANSWERS (27)

Ivan Sucharski, Independent Researcher
Hypothesis testing: A hypothesis is an expected answer to a question, usually based on some theory or prior research. Hypothesis testing is a method of determining the viability of a hypothesis. Let's use the US presidential election as an example. There are many hypotheses (questions) about how people feel or whom they will vote for. The only way to know the 100% truth about the answers to these questions is to ask every single possible voter, something that is impossible, expensive, time consuming, etc. Instead, we take samples and use statistics to make best guesses about the answer to a particular hypothesis. We never know for sure if our conclusion is right at the time we accept/reject it, but these techniques, when used appropriately, can get us close to the truth most of the time (which is a lot better than guessing, flipping coins or throwing darts to get answers).

The tricky part about hypothesis testing is that we have to think somewhat backwards about it: we have to test what is known as the null hypothesis. As an example, the hypothesis might be "Obama is more popular than Romney" while the null hypothesis is "There is no difference in popularity between Obama and Romney". We focus on the null hypothesis because it tells us exactly what the expected value of our tests should be (Obama popularity = Romney popularity) and we can collect data and see if it comes out that way (see next section on significance). If it does, then we accept the null hypothesis and reject our original hypothesis. If the null hypothesis is disproven, we look at the data to see if our original hypothesis holds (Obama > Romney).

Significance: Significance in statistical tests means the degree to which we are confident that the answer we found regarding the null hypothesis was not due to chance (remember we can always be wrong, even after collecting a sample of data). Say you want to know if a coin is fair and you flip it 3 times: heads, heads, heads. Is that suspect? Probably not, because we know that this occurs with a probability of 12.5%, which isn't all that crazy. Now if we flipped it an additional 7 times and it was heads each time, then things are fishy (0.098% chance for 10 heads in a row). Note that the outcome is still possible on a fair coin, just not probable. This is important: you might say "this coin is totally fishy!" and based on your tests, you would be right. But the next person to come along and flip it might find a much more probable outcome, so be careful how you talk about significant findings.

Let's get back to our election null hypothesis. We know that if our data shows the same value for Obama and Romney on popularity then we accept the null hypothesis and our original hypothesis appears wrong, but what if the numbers are close? How uneven do they have to be before we are confident that the null hypothesis is wrong? If our question is something silly like "On a scale of 1-100, tell me how much you like each candidate" and the data shows Obama on average has 62 and Romney has 59, is that enough to say Obama is definitively more popular? How about 63 vs 58? The answer lies in the statistical tests and is a function of several parts of those tests, including how many people were asked and how variable the answers were. The math involved allows us to estimate the truth from our sample; this is why on survey results you often see a specific value and then "± X%". This is a way of describing the data collected (the actual number reported) along with the margin of error around it. We know that if we ask 100 random people a yes/no question the percentage of yes will probably differ from if we asked 1,000 people. As such, any given value derived from sampling actually symbolizes a range of values, and the width of this range depends mathematically on the number of people asked.

In the 62 vs 59 example above, if the margin of error is 3 points it means our data suggests the truth is somewhere between 59-65 for Obama and 56-62 for Romney. Since the two ranges overlap, we retain the null hypothesis and say our test shows they are equally popular. It's only when the ranges of scores do not overlap* that we are confident that there is a real, statistically significant difference in the scores. In other words, even though the numbers are different (62 vs 59), they do not significantly differ from one another and as such we can't confidently say that one is truly larger/smaller than the other.

*This isn't exactly true, especially for parametric tests. Since most distributions are infinite they will overlap, but for talking to mom about significance, it should do. I also avoided getting into distributions etc. because mom doesn't want to hear about that either.
Sep 25, 2012
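
The coin arithmetic and the "margin of error shrinks with more people" remark in the answer above can be checked with a small sketch. The function names are made up for illustration, and the margin-of-error formula is the usual normal approximation for a yes/no proportion with a 95% multiplier of 1.96; neither appears in the thread, so treat both as assumptions.

```python
import math


def prob_all_heads(n_flips: int, p_heads: float = 0.5) -> float:
    """Probability that n_flips independent flips all land heads."""
    return p_heads ** n_flips


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion.

    Normal approximation z * sqrt(p(1-p)/n); z = 1.96 is the conventional
    95% multiplier (an assumption, not taken from the thread).
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


three_heads = prob_all_heads(3)     # 0.5**3, the 12.5% in the answer
ten_heads = prob_all_heads(10)      # 0.5**10, the roughly 0.098% in the answer
moe_100 = margin_of_error(0.5, 100)     # 100 people asked a yes/no question
moe_1000 = margin_of_error(0.5, 1000)   # 1,000 people asked the same question

print(f"3 heads in a row:  {three_heads:.3%}")
print(f"10 heads in a row: {ten_heads:.3%}")
print(f"margin of error, n=100:  +/- {moe_100:.1%}")
print(f"margin of error, n=1000: +/- {moe_1000:.1%}")
```

The second pair of numbers illustrates why the same reported percentage stands for a narrower range when more people are asked.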

Emmanuel Curis, Université René Descartes - Paris 5
> Ivan: "If it does, then we accept the null hypothesis and reject our original hypothesis": no. The null hypothesis is not chosen for its convenience (at least, it should not be), but because it is the opposite of what you want to prove, and you then try to show it is absurd. Because of that, not rejecting H0 is NOT accepting it.

So, my proposition would be something like: OK. You think something is true, for instance "this drug cures the disease". But how to prove it? The hypothesis approach is based on the idea that it is easier to prove that something is false, because a contradiction exists. Hence, you assume that your idea is FALSE: this is the null hypothesis, here "this drug does not cure the disease". Then, using this (hopefully) false hypothesis, you derive the results you expect from the experiment, here "there will be as many cured patients when the drug is given as when it is not, or even fewer". Then, you do the experiment and compare the results to your prediction. If they disagree: you win, your false hypothesis is contradicted by the experiment, hence it is false [you assume the experiment is always right, this is another debate ;)], hence your real hypothesis is true, since it is the opposite of the null hypothesis by definition. If they do not disagree... you lose: you cannot rule out the null, because the results are in agreement with it, but you cannot rule out your real hypothesis either, because there are always many other ways than H0 to explain the experimental results, hence you cannot be sure that H0 is true. To quantify this disagreement between your prediction using H0 and reality, you use the probability of obtaining these results assuming H0 is true; this is the so-called significance. If it is small enough, you will assume that you cannot be so unlucky, so that H0 is contradicted by the experiment.
Sep 26, 2012
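
One way to put a number on "the probability of obtaining these results assuming H0 is true" for the drug example is a one-sided Fisher exact test, sketched below. The choice of test, the function name and the patient counts are all illustrative assumptions; the thread names no specific test and gives no data.

```python
from math import comb


def fisher_one_sided(cured_drug: int, n_drug: int,
                     cured_ctrl: int, n_ctrl: int) -> float:
    """P(at least cured_drug cures in the drug group | H0: drug has no effect).

    Under H0 the cured patients are spread over the two groups by chance
    alone, so the number of cures in the drug group is hypergeometric.
    """
    total = n_drug + n_ctrl
    total_cured = cured_drug + cured_ctrl
    denominator = comb(total, n_drug)
    p_value = 0.0
    for k in range(cured_drug, min(n_drug, total_cured) + 1):
        # k cures land in the drug group, the remaining cures in the control group
        p_value += comb(total_cured, k) * comb(total - total_cured, n_drug - k) / denominator
    return p_value


# Hypothetical experiment: 15 of 20 cured with the drug, 8 of 20 cured without.
p_drug = fisher_one_sided(15, 20, 8, 20)
# Hypothetical experiment in which both groups do equally well.
p_same = fisher_one_sided(10, 20, 10, 20)

print(f"p-value, drug group clearly better: {p_drug:.4f}")
print(f"p-value, no visible difference:     {p_same:.4f}")
```

A small first value says the observed results would be surprising if H0 were true; the large second value does not show that H0 is true, only that the experiment gives no reason to reject it.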

Erik Stengler, University of the West of England, Bristol
Dear Mum and Dad, when I was a child, I often disobeyed you and behaved badly. But then even more often I came back to you, gave you a kiss and showed you that I love you. So, by performing many more actions that showed I loved you than acts of bad behaviour, I instinctively tried to make the latter less significant than the former. If you ever considered the hypothesis that I didn't love you, you would have instinctively rejected it because there were significantly more data showing you the contrary. Well, that's what we scientists do when we test theories with experiment, only we don't always get the desired answer. That's why I always feel best at home!
Sep 26, 2012

Jean Maccario, Institut national de la santé et de la recherche médicale
Mum, s'pose you came back home and found Dad in bed with the maid. Dad jumped out of the bed saying he doesn't even know the lady. How much credit would you give to his statement? That's the p-value. (Dear Andreas, make it with Mum and the plumber at your convenience.)
Sep 27, 2012

Heydar Ali MardaniFard, Yasouj University
Mom, suppose Dad says, "I got married to another woman". By default you think that he is joking. This is the null hypothesis. But if he leaves you for 2-3 weeks and does not answer your calls, then you may be convinced that he got married!!! So, this is a significant reason to accept the hypothesis "your husband got married again!!!!"
Oct 4, 2012

Leah Jarlsberg, University of California, San Francisco
A hypothesis is a theory you have about something that might or might not be true. For instance, your theory might be that lemon & honey stew makes colds go away faster. To test your theory, you give lemon & honey stew to all your children when they get colds & write down how many days they are sick; and you write down how many days your sister's kids are sick with colds when she gives them nothing. You compare the two groups to see which one is sick for the fewest days. Your results are true for this year & these kids, but how likely is it that they will always be true every year and for all the kids in the universe? To answer, we put your results into a big fat mathematical equation to find the probability that you would see no difference between the stew-eaters and the nothing-eaters if your neighbor performed the test in her family. If the probability of no-difference is very low (<5%), then you can be confident that most of the time, lemon & honey stew will make colds go away faster.
Oct 4, 2012

Emmanuel Curis, Université René Descartes - Paris 5
> Leah: your computation of the probability is not correct (but the error seems to be quite common...). Using your example it would be [the differences are in CAPITALS, sorry, apparently no other formatting is available on ResearchGate]: "To answer, we put your results into a big fat mathematical equation to find the probability that you would see THE DIFFERENCE YOU OBTAINED (*) between the stew-eaters and the nothing-eaters ASSUMING THAT LEMON AND HONEY STEW HAS NO EFFECT ON COLDS. If THIS probability is very low (<5%), then you can be confident that most of the time, lemon & honey stew will make colds go away faster." Note that this is the only way to justify why, if it is small, one may assume lemon & honey stew has an effect. (*): to be more precise, the probability of observing this difference or an even bigger one, but that does not really change the idea, I think.
Oct 4, 2012
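
The corrected definition ("the probability of observing this difference or an even bigger one, assuming the stew has no effect") can be computed directly for a small family experiment with an exact permutation test: if the stew does nothing, the labels "stew" and "nothing" are arbitrary, so every way of relabelling the children is equally likely. The sick-day numbers and function names below are invented for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def mean(values):
    """Exact mean, kept as a fraction to avoid rounding in comparisons."""
    return Fraction(sum(values), len(values))


def permutation_p_value(treated, control):
    """One-sided exact permutation p-value for mean(control) - mean(treated)."""
    observed = mean(control) - mean(treated)
    pooled = list(treated) + list(control)
    indices = range(len(pooled))
    at_least_as_extreme = 0
    total = 0
    # Enumerate every way of choosing which children count as "treated".
    for treated_idx in combinations(indices, len(treated)):
        chosen = set(treated_idx)
        relabelled_treated = [pooled[i] for i in chosen]
        relabelled_control = [pooled[i] for i in indices if i not in chosen]
        diff = mean(relabelled_control) - mean(relabelled_treated)
        if diff >= observed:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Hypothetical sick days per child (not data from the thread).
stew_days = [4, 5, 3, 4, 5]
nothing_days = [6, 7, 5, 8, 6]

p_value = permutation_p_value(stew_days, nothing_days)
print(f"p-value under 'stew has no effect': {p_value:.4f}")
```

The p-value is the share of relabellings that produce a difference at least as large as the one actually observed, which is exactly the capitalised wording in the correction above.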

Romann Weber, Rensselaer Polytechnic Institute
I'd put it this way: In science, we often wish to make conclusions about the world by watching it closely and taking measurements of various things. If those measurements are numbers, we can often make informed decisions about the things we're measuring. When we use statistics, those decisions are bets on what are essentially guesses (educated or otherwise) that we call hypotheses. Statistics typically forces at least one guess on us, the "null hypothesis," which essentially says that nothing interesting is happening in our data. So, if we are looking for an effect of, say, a nutritional supplement on muscle growth, our null hypothesis says that our supplement isn't producing an effect; that is, it doesn't work. Now, maybe we'd like to show that our supplement does work. The truth is that we never really can. The best we can ever do is find enough evidence to bet against our null hypothesis. If our experiment was designed carefully, then the "alternative hypothesis," namely that an effect does exist, is closely aligned with what we'd like to "prove." Our bet against the null hypothesis is essentially the argument that "it's probably not the case that nothing is going on." This is basically the same as saying that something is probably going on, but we're never quite sure exactly what. Maybe our alternative hypothesis is correct. Maybe it's something else. Or maybe we're wrong altogether and nothing is going on after all.

We make our bet as safe as possible by phrasing it in terms of probabilities. To do so, we need to know what rules of probability govern whatever it is we're measuring. A great deal of the theory behind statistics is concerned with figuring those details out. But the intuition behind it is easy. For instance, you'd know something was fishy if I kept flipping a coin and getting heads all the time. Specifically, it would be fishy if I got that result while using a fair coin. The important point is that the "fishy" factor is based on an assumption, namely that I'm using a fair coin. (The result isn't fishy if you know I'm using a trick coin.) Probability theory allows us to put a number on exactly how fishy it is. So, when we test data and make a bet based on it, we base our probabilities on the assumption that the null hypothesis is true. Then, if our data seem really fishy, we bet against the null hypothesis. In doing so, we know it's possible that we're going to be wrong. But we try to make that possibility rare. It's accepted by a large number of people that a 1/20 chance of being wrong is an acceptable risk, and this is usually the threshold at which we would consider a fishy result "significantly" fishy. It's these fishy results that we're most often interested in.
Oct 8, 2012
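
The "1/20 chance of being wrong" can be made concrete for the coin example. The sketch below assumes an exact two-sided binomial test (a choice made here for illustration; the answer names no particular test) and computes how often a genuinely fair coin would be declared "significantly fishy" in 100 flips.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(k: int, n: int, p: float = 0.5) -> float:
    """Total probability of all outcomes no more likely than the observed one."""
    observed = binom_pmf(k, n, p)
    return sum(
        binom_pmf(j, n, p)
        for j in range(n + 1)
        if binom_pmf(j, n, p) <= observed * (1 + 1e-9)  # tolerance for float ties
    )


def false_alarm_rate(n: int, alpha: float = 0.05) -> float:
    """Chance that a fair coin lands in the 'bet against the null' region."""
    return sum(
        binom_pmf(k, n)
        for k in range(n + 1)
        if two_sided_p_value(k, n) < alpha
    )


all_heads_p = two_sided_p_value(10, 10)   # ten heads out of ten flips
rate = false_alarm_rate(100)              # 100 flips of a truly fair coin

print(f"p-value for 10 heads in 10 flips: {all_heads_p:.5f}")
print(f"false-alarm rate at the 1/20 threshold: {rate:.4f}")
```

Because the number of heads is a whole number, the actual false-alarm rate cannot hit the threshold exactly; it stays at or below it.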

Dimitrios Karypidis, Cardiff University
I would simply say that a hypothesis is usually a theoretical explanation to a question; the more true it is found to be following statistical testing (the lower the possibility that the result happened by chance), the more reliable its content and suggestion are. And I would use an example with... let's say gender and three-colored cats: "Three-colored cats are always female and never male" (hypothesis). I would then take a large number of randomly selected three-colored cats and investigate their sex. If three-colored cats are, with no or very few exceptions, female, which means their percentage is far different from the 50% female and 50% male expected under pure chance, then the hypothesis that three-colored cats are female is valid and reliable, with a significance as high as the possibility of not being male or female by chance.
Oct 9, 2012

Emmanuel Curis, Université René Descartes - Paris 5
> Dimitrios: if your hypothesis is really "3-colored cats are never male", then if you observe even a single male with 3 colours you will reject it... To use the null that you give of 50%-50%, the hypothesis you want to prove should be "3-colored cats are more often female than male".
Oct 10, 2012
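
The two versions of the cat hypothesis behave differently, and a short sketch makes the contrast visible. "Never male" is a universal claim that a single counterexample refutes, with no statistics involved; "more often female than male" is tested against the 50%-50% null with a one-sided binomial tail. The cat counts and function names are hypothetical.

```python
from math import comb


def upper_tail(k_female: int, n_cats: int, p_female: float = 0.5) -> float:
    """P(X >= k_female) when X ~ Binomial(n_cats, p_female), the 50-50 null."""
    return sum(
        comb(n_cats, k) * p_female**k * (1 - p_female) ** (n_cats - k)
        for k in range(k_female, n_cats + 1)
    )


def never_male_is_refuted(n_male: int) -> bool:
    """The universal claim 'never male' falls to a single counterexample."""
    return n_male >= 1


# Hypothetical sample: 100 three-colored cats, 97 female and 3 male.
n_cats, n_female = 100, 97
n_male = n_cats - n_female

p_value = upper_tail(n_female, n_cats)

print(f"'never male' refuted: {never_male_is_refuted(n_male)}")
print(f"p-value against the 50-50 null: {p_value:.3e}")
```

With these made-up counts the strict claim is rejected outright, while the weaker claim "more often female than male" is strongly supported against the 50%-50% null.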

Jochen Wilhelm, Justus-Liebig-Universität Gießen
This highlights the unfortunate philosophical origin of hypothesis testing (Popper). You cannot prove that something is *always* as you observed in the past. Having observed one million white swans and not a single black one does not prove that there are no black swans. However, the observation of just one single black swan does prove that there *are* black swans. This is like "disproving the absence of black swans". Following the Popperian view, the null hypotheses are our "black swans" we'd like to disprove. Unfortunately, this nice philosophical picture crumbles into dust when we consider that we are not able to really observe black swans: we can just say that the last observation was more or less *likely* a black swan, but we cannot be sure anyway. Hence a rigid disproof is as impossible as a positive proof (if you think what you saw was likely to be a black swan, then it's just unlikely/unexpected that black swans do not exist). In my opinion, a "formal disproof" based on uncertain knowledge is worse than an "uncertain proof".

Statistical testing distracts from looking at the estimated effects and their uncertainties. It can be a valuable tool in some circumstances, but it is not to be taken as the sole aspect in making decisions. For instance, such behaviour has been shown to justify new drugs that are more expensive but no better than established drugs. Or they are better in one respect and worse in many others. Decisions are always actions taken by humans. They must consider our aims, expected benefits and costs. These points are not objective, although on some points there may exist some social consensus. A decision cannot be objectified. It may be more or less reasonable, depending on the considered circumstances and ancillary conditions.

"The Guide is definitive. Reality is frequently inaccurate." Douglas Adams in "The Restaurant at the End of the Universe" (1980)
Oct 10, 2012

Steven D'Alessandro, Macquarie University
Here is my simple answer. In a court of law, the prosecution must prove its case beyond "reasonable doubt". That is the hypothesis: someone is guilty. If the hypothesis is not proved, given the evidence, then the defendant is assumed to be innocent (since the prosecution did not prove its case). This is the null hypothesis, which must be disproven (the defendant is not innocent, and the probability that this occurred by happenstance is low, e.g. the p-value and the idea of "reasonable doubt"). Science, like the courts, seeks only to change people's lives when the evidence is there beyond a reasonable doubt; otherwise the status quo holds. It is by showing evidence at a point in time that the status quo (innocence in courts) cannot be supported that we advance knowledge.
Oct 17, 2012

Dimitrios Karypidis, Cardiff University
@Emmanuel Curis: although you are generally correct, a hypothesis can stand since it is usually built upon general (qualitative or empirical or other) understanding which requires (for certain functional and utilitarian reasons) quantification and/or further investigation and/or further examination. Therefore, one may have 'expected' results (basically based on the previous qualitative knowledge or observations). After all, to mention another practical fact, NO institution would ever approve a research proposal or fund WITHOUT a crystal clear list of expected results and uses. However, these 'expectations' (which inevitably bias the entire methodology from the very beginning, but that's another story) cannot fully configure the form of the hypothesis. This is the reason why hypotheses are 'often' set into a yes-no, all-nothing, never-always format. The reason is not to combat the established belief that such absolutisms do not exist but just to make the process of inference simplistically easy. Consequently, it is not, generally speaking, wrong to begin with the hypothesis as stated and then reject the initial hypothesis, in case e.g. one finds even one male cat with three colors, and modify it in the way you suggested, rather than incorporating from the beginning the element of which is more often.

Null hypotheses in clinical research have a very clear and straightforward content and aim. Both of these components are strictly based on previously conducted observational research, followed by exploratory or confirmatory factor analysis, followed by a clearly stated, often dichotomous, approach. It is extremely difficult to maintain an aim to 'prove' anything in a single approach even when sound experimental methodology is applied. It wouldn't even be possible after multiple RCTs to definitively and fully 'prove' the entire structure of ANY causal relationship; rather, they would provide sufficient confidence levels for a specifically termed use (of a substance or even a conception) under a very specific range of conditions and on a specific range of subjects or infrastructure (theoretical, organic, or else). I also disagree with your approach concerning the causality 'proof' you tried to explain to Ivan, since this is definitely light years away from how clinical research works (and excuse my straightforwardness, but it is due to the fact that you were trying to explain it in a rather 'as-it-should-be' way, which it's certainly not). But that's a totally different story. On the other hand, I fully acknowledge your points when made for purely mathematical models.

And at this point I couldn't agree more with Jochen Wilhelm in his effort to demonstrate how inherently imperfect and even (in the long run) inefficient statistical inferences (and the approach in general) may sometimes be (since the entire hermeneutical structure is subjective). Finally, I would be extremely glad to read Emmanuel's example (to his parents) apart from his very knowle... [more]
Oct 17, 2012

Emmanuel Curis, Université René Descartes - Paris 5
@ Dimitrios: 1) Read the fourth answer of the post, 13 days ago; it contains my proposition. If you have comments on it, they are welcome. It is certainly not the best, I hope it does not contain methodological errors, and I would be happy to discuss it, just as I think other contributions are also made to be discussed, so that every participant (me included) will end up with a better comprehension of all these delicate concepts.

2) As for the answer to Ivan: I never mentioned anything about causality, so I do not see what you mean exactly. Besides, the fact that common practice is to equate "not rejecting H0" with "accepting H0" does not mean it is correct. However, I agree that probably quite often it does not matter if the experiment was carefully designed to ensure enough power (which is not discussed in these simple presentations of the hypothesis test procedure). My opinion is that it is much better to do something knowing it is not correct ("accepting H0"), but with a solid background of arguments ("I had enough power, approximations are fairly good..."), so doing it carefully and cautiously, than doing it just because everyone does it, possibly without even thinking about its correctness. Like writers and poets, who are "allowed" to use incorrect language constructions "for good purpose" but learned the correct way before, unlike daily users who make mistakes without even knowing it.

3) I agree in general with your comments, and with your presentation of the test also. However, your example could be slightly improved; that was the aim of my answer. I didn't mean to hurt you, and I apologise if I did. I meant to participate in a scientific discussion, in which every comment can be discussed and improved to move towards a better comprehension of the subject.
Oct 18, 2012

Dimitrios Karypidis, Cardiff University
@ Emmanuel: 1) Thank you for making it clear to me that it was actually your example. It was far too well hidden (for me) in the quotation meant to correct Leah's computation, so I must have missed it.

2) On the comment about Ivan, my objection began with the way a hypothesis is chosen and with the "convenience" that is often entailed or inherently implied (and it was certainly not about the unquestionable facts you stated about rejecting H0 etc.). The rest of the point concerned causality. In your example, causality lies within: drug (let's say A) causes/is the reason for cure (let's say B). Then the hypothesis took the form "A does not cause B" and the statistics would either prove or disprove it. Now, in mathematical/purely statistical terms there was no objection on my part (and I made it clear) regarding the word 'prove', as it means proving the logical coherence of a prospective acceptance or rejection of a statement (hypothesis). My point was about the clinical component, which of course starts with a completely different approach to such hypotheses when it comes to treatments and cures, other than just testing whether they are true or false. And, as you may know better than most, the rest of the point (about the entire procedure of observing and investigating data which then become factors, which are then confirmed and explained (EFA, CFA), which then enter the realm of experimental research and are ULTIMATELY tested in controlled environments etc.) was meant as a prompt for another example of yours: an example that wouldn't only contain raw single hypothesis testing (about which there couldn't be any objection), but also hypothesis forming (since you briefly but clearly mentioned the association between "prediction using H0 and the reality").

3) Likewise, I didn't mean to sound offensive or judgmental, as my point was simply to differentiate myself from the use of words such as "your hypothesis SHOULD be", "your computation is not correct", "no" etc., which seem to have been included in some of your answers. As I am personally used to 'reading' suggestions written in a slightly less directive and outspoken manner, much like your last reply to me, I probably misunderstood you and as a result urged you to elaborate with another example. Thank you for your concern, but you didn't hurt anyone. As for the motives for participating in such discussions that you included at the end of the reply, I am confident that we all share them. Have a good one!
Oct 18, 2012

Jochen Wilhelm, Justus-Liebig-Universität Gießen
@Dimitrios, #2) This posting is also not meant to be offensive in any way. I just hope to make a useful contribution. "Then the hypothesis took the form A does not cause B and the statistics would either prove or disprove it." This is just not correct. I make this point because this mistake is a frequent cause of misinterpretations.

First, the "hypothesis". The null hypothesis is "the values in B are not correlated with the values in A. Whatever correlation we see happened just by chance". A statistical hypothesis is always just about correlation, never about "causes" or anything else. The reason that A may cause B is not inside the statistical framework. It is outside, in the design of the experiment. If the presence of the drug (the value of A) is the sole and only thing that varies, then, and only then, any variation in B can logically be attributed only to the variation in A (so "A must have caused B"). The line of argument starts becoming slippery when we face the fact that there is always some variation, even in the best lab experiments. Nevertheless, a sufficiently strong correlation between A and B is a reasonable justification of our belief that A causes B (again: not because of the statistics but because of the way the data was generated). Statistics can tell us how likely the observed correlation would be by chance, and not, as you said, how likely a "cause-response relationship" would be by chance.

The second point refers to the "proof". Statistics can't prove anything. From the viewpoint of inference, a general principle can logically never be proven, it can only be disproven. This idea stems from the philosophy of Boolean logic, where premises are either clearly right or wrong. In statistics we deal with uncertainties. Instead of getting definitive answers (e.g. "A and B are correlated") the best we can get is a quantified expectation for this case, given the data we have (e.g. the chance of correlation is 98%). Unfortunately, this can only be calculated from a prior belief, which is usually not done. Most scientists stay with just calculating the inverse belief, i.e. the likelihood of getting the observed data given a specific hypothesis (usually the null). This is reported as the p-value. Low p-values are a kind of "evidence" against the null. However: even overwhelming evidence against the null does not (logically) disprove it; it just says that it is reeeaallly unlikely to obtain the data if the null were true. The "test" is about the *data*, not about the *hypothesis*.

An additional problem arises when people talk about "significance", when a p-value is below a "level of significance". This refers to the Neyman-Pearson theory and this is no longer "null hypothesis testing" but "significance testing". This procedure focuses on the actions taken. It controls the long-term error rates and makes NO statements about the correctness, falseness or expectation of any hypothesis. Hence, this procedure is not suited for infe... [more]
Oct 18, 2012
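
The idea in the post above, that a p-value describes how often an association at least as strong as the observed one would turn up by chance alone, can be illustrated with a permutation test. The following is only a minimal sketch and not a method proposed by anyone in this thread: the function name `permutation_p_value`, the toy data, and the choice of a difference in group means as the measure of association are all assumptions made for illustration.

```python
import random


def permutation_p_value(a, b, n_perm=10_000, seed=0):
    """Estimate how often shuffling the labels in `a` gives a difference in
    group means of `b` at least as large as the one actually observed.

    `a` holds 0/1 labels (hypothetical: 1 = drug given), `b` holds outcomes.
    """
    rng = random.Random(seed)

    def mean_diff(labels):
        treated = [y for x, y in zip(labels, b) if x == 1]
        control = [y for x, y in zip(labels, b) if x == 0]
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed = abs(mean_diff(a))
    labels = list(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any real link between A and B
        if abs(mean_diff(labels)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 8 treated and 8 untreated subjects.
drug = [1] * 8 + [0] * 8
outcome = [7, 9, 6, 8, 9, 7, 8, 6, 5, 6, 7, 4, 6, 5, 7, 5]
print("estimated p-value:", permutation_p_value(drug, outcome))
```

A small value here only says that such data would be unusual if the labels were unrelated to the outcomes. As the post stresses, whether the association can be read as "A causes B" depends on how the data were generated, not on this number.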

Emmanuel Curis, Université René Descartes - Paris 5
> Dimitrios: just a short answer; I'll take time to read carefully and think about your message later. I think you may not have read the right post; my example is not embedded as a quotation of Leah's example, but is a fully written example after the comment on Ivan's post. But it is my fault: I incorrectly said "13 days ago", which is the answer to Leah's answer, when the correct comment is in fact from 22 days ago. Sorry. It is also about the use of drugs, so I think your remarks are still valid. A missing step is indeed that "You think the drug cures the disease" (proposition A) implies "one should expect more cures when giving the drug", which can be rewritten as "there is a correlation/association between giving or not giving the drug and observing cures" or "the probability of cure changes with taking or not taking the drug" (proposition B). So, conversely, "there is no association between taking or not taking the drug and the probability of cure" (not B) implies that the drug has no effect (not A). The test tries to prove that (not B) is false, hence that B is true. But A => B does not mean that B => A, so you cannot be sure about the drug itself --- the correlation may be due to something else. If you fail to prove that (not B) is false, the test procedure does not really say anything [at least in my opinion]. But if you assume that (not B) is true, then you can assume that (not A) is true, since (A => B) <=> ((not B) => (not A)). I guess this is what is done quite often. But there are two concerns here: 1) Is there really A => B? Everything that follows is based on that. 2) Is it really sound to accept H0 when you cannot disprove it? Experimental design is meant to address these two concerns, and also the fact that accepting B does not mean A is true. In fact, I guess the experimental design is meant to give A <==> B and not only A => B. And I think Jochen explained all of this very well in his post, so I will stop here. (Clearly, the above is not for my parents --- well, perhaps it could be in fact, but not as meant in the initial question, I guess.) As for the style... Apart from the fact that when something is wrong there is not really any other way to say so clearly, I would answer "let's talk in French and not in a [for me] foreign language, and I'll add all kinds of subtle distinctions..." But once again, sorry in advance if this somewhat crude style, caused by my limited English, hurt some people.
Oct 18, 2012

Dimitrios Karypidis, Cardiff University
@Jochen: I'm sorry, but you must have misunderstood, as I didn't originate the statement ''Then the hypothesis took the form A does not cause B and the statistics would either prove or disprove it.'' which you say is not correct. With this line I was trying to restate Emmanuel's: ''Hence, you assume that your idea is FALSE: this is the null hypothesis, here this drug does not cure the disease''. So your point could be directed at his comment as well. As for what you wrote about the 'significance of the data' rather than the truth or falsity of the hypothesis, I believe there couldn't be any objection. However, I'd like to add that data is often defined, identified and ultimately collected according to a model and/or using a tool (clinimetric/psychometric/scientific) which serves, follows or is based on (as a measure of its construct validity) the hypothesis itself, or what leads to it. For example: ACE inhibitors provide better renal function prophylaxis in diabetics. To test such a hypothesis one would need data indicative of renal function, renal artery hypertension, blood sugar levels, drug pharmacokinetic/pharmacodynamic parameters, etc. In other words, the 'suspected' validity of a question or hypothesis often dictates the choice of the appropriate data and data collection tools, so the hypothesis itself recruits inherently related means of investigation. It is the metric, value or outcome measure which, if significant, would confidently support or reject the hypothesis. So it is not only the significance of the data but also the validity of the methods used to generate it (= define + identify + collect) that makes a hypothesis true or false overall. @Emmanuel: Well then, I'm utterly sorry, as I've been reading the wrong post all this time.
I agree with the safer way of handling a hypothesis, namely disproving the 'false' rather than proving the 'true', but I believe there are certain prerequisites, similar to what you said. In the example: A = drug, B = cure, not B = no cure, not A = no use of the drug. You mentioned that ''if you assume that (not B) is true, then you can assume that (not A) is true, since (A => B) <=> ((not B) => (not A)).'' This sounds like: if using the drug results in the cure, then not using the drug does not result in the cure (and vice versa). But how can we accept that no use of the drug results in no cure?

As far as I can tell, the disease could be self-limiting, at least in some cases. So, as you said, it may be safer to approach an issue by investigating the 'truth' of what it is not, BUT even if it is true that the absence of the assumed cause of a result does not lead to that result, it still doesn't mean much, as there could be more than one cause for the same result (causes not present in the experiment). So in such models one could have e.g. subjects n1 = took the drug and got cured, n2 = took the drug and didn't get cured, n3 = didn't take the drug and got cured and n4 = didn't take the...
Oct 18, 2012
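
The four counts n1 to n4 listed in the post above form a 2x2 table, and one common way to analyse such a table is Fisher's exact test. The sketch below is an illustration only, under stated assumptions: the post is cut off before n4 is defined, so n4 is assumed here to be the remaining combination (did not take the drug, not cured); the function name `fisher_one_sided` and all counts are made up.

```python
from math import comb


def fisher_one_sided(n1, n2, n3, n4):
    """One-sided Fisher exact p-value for a 2x2 table.

    n1: took the drug, cured        n2: took the drug, not cured
    n3: no drug, cured              n4: no drug, not cured (assumed meaning)

    Returns the probability of seeing at least n1 cures among the treated
    if being cured were unrelated to taking the drug (margins held fixed).
    """
    treated = n1 + n2
    cured = n1 + n3
    total = n1 + n2 + n3 + n4
    upper = min(treated, cured)
    ways = sum(
        comb(cured, k) * comb(total - cured, treated - k)
        for k in range(n1, upper + 1)
    )
    return ways / comb(total, treated)


# Hypothetical counts: 8 of 10 treated patients cured, 3 of 10 untreated cured.
# The 3 untreated cures are the "self-limiting disease" cases raised in the post.
print("one-sided p-value:", fisher_one_sided(8, 2, 3, 7))
```

Cures without the drug (n3 > 0) do not break the analysis. The test only asks whether cures are more frequent among the treated than chance alone would readily produce.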

Jose Kitahara, Universidade de São Paulo
Andreas, I like this example: Your mom finds a box of white powder in the kitchen and she's not sure what it is, but she supposes it's SUGAR. Tell her that this is her hypothesis (name it H0). Otherwise it could be anything, or simply NOT SUGAR (name it the alternative hypothesis, Ha). Since there's nobody to tell her what it is, she decides to test it herself. She puts a spoonful of the powder in a dish of water, since she knows that sugar dissolves in water. Scenario #1: It doesn't dissolve. Then she can say: it's not sugar. She rejects H0 and accepts Ha as true. The test performs well in this case, and the probability of a false negative (rejecting the hypothesis when it is true) is small. Scenario #2: It dissolves. Then she can say that there's a probability that it could be sugar; I mean, she cannot reject H0, but this test doesn't have enough POWER to discriminate in this situation, since many other powders also dissolve in water. She could use another test, such as putting the dish in the garden and seeing whether flies go to it. Same hypothesis and two scenarios. Scenario #1: No flies at all. She can say with high probability that it's not SUGAR and reject H0, since there are flies around but none in that soup. Scenario #2: Lots of flies in that soup. She can accept, with high probability, that it's sugar and reject Ha. This test has some power to discriminate, but we cannot be 100% sure. She has to accept the risk of a false positive (accepting the hypothesis H0 when it is false). Hope this helps!
Oct 23, 2012

Jochen Wilhelm, Justus-Liebig-Universität Gießen
Jose, I like your example, but I'm afraid my parents would insist on just tasting a little bit of this powder (-> 100% specificity, 100% power, 100% faster) ;-) PS: If there were a chance that this powder might be harmful, they would probably ask the mother-in-law to taste it...
Oct 24, 2012

Emmanuel Curis, Université René Descartes - Paris 5
> Dimitrios: There is a difference between your A and B and mine, which may explain the difficulty. In your version, A "treatment" (yes/no) and B "cure" (yes/no) are more like events that are observed or not, and they would be the basis for the model of the experiment, here leading to 2 Bernoulli variables per patient, and after that binomial variables and so on. I agree with you that defining a logical relationship between events does not seem pertinent --- as you said, there is no reason to assume implication or inference between the two. However, defining independence between these events makes sense, and is the basis of the statistical approach. In my version, A "the drug cures the disease" and B "giving the treatment is associated with an increased probability of a cured patient" were intended to be logical propositions, allowing implications and equivalences. The first is the one we are interested in, and the second is the one used to make the test [well, in fact the real B is even more restrictive: "being in the so-called treated group leads to an increase in the probability of the so-called being-cured event"]. And your A and B are the events defined when "my" proposition B is used to build the test. The whole question is: does giving an answer to proposition B allow us to give an answer to question A? The experimental design provides elements to answer this question. Sorry if I was not clear; I hope this post clarifies the idea.
Oct 25, 2012

Andy Field, University of Sussex
Short answer: I wouldn't. I'd explain effect sizes and confidence intervals instead ;)
Oct 29, 2012

Chandrika B-Rao, Piramal Life Sciences Limited
This refers mainly to Jose's and Jochen's postings. A cardinal rule I learnt for dealing with real data: "When data speaks for itself, please don't interrupt [it with statistics]". Use statistical testing of hypotheses only in situations of uncertainty. When an experiment can be done to get a conclusive answer (e.g. taste the white powder and observe n other definitive factors), one does not need statistical decision making. In some rare cases, even a stochastic experiment can result in a deterministic outcome, in which case one doesn't need statistical hypothesis testing. E.g. if drug A gave exactly the same outcome when tested on a sufficiently large sample of individuals, you don't need statistics to tell you what the decision should be.
Jan 6, 2013

Edwin Huff, Centers for Medicare & Medicaid Services
The responses to this wonderful question have been fun to read! However, some reveal more about family relationships, and perhaps culture, than they show ways of explaining hypothesis testing and statistical significance. My mother is a serious bridge player, and I would explain these things using examples with cards. I would introduce hypothesis testing as a way of picking between alternative explanations for things: one, called the null hypothesis, uses randomness as the key idea for explaining things, and the other, the alternative hypothesis, typically uses some non-random explanation for the same thing. She understands randomness in a concrete way, through the differences between card distributions dealt from shuffled versus unshuffled decks. I would explain significance as the relative confidence she would feel in her choice of the best explanation for any distribution of cards: whether it is best explained by a random, well-shuffled deck, or by a less random or non-random explanation such as an unshuffled or poorly shuffled deck of cards.
28 days ago
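
The bridge analogy above can be made concrete by asking how likely a lopsided hand is if the deck really was well shuffled (the "random" explanation). The sketch below is an illustrative calculation, not something taken from the post: the function name `prob_at_least` and the choice of "number of spades in a 13-card hand" as the thing to look at are assumptions.

```python
from math import comb

DECK = 52  # cards in the deck
SUIT = 13  # cards per suit
HAND = 13  # cards in a bridge hand


def prob_at_least(k: int) -> float:
    """P(a 13-card hand holds at least k spades) for a well-shuffled deck."""
    ways = sum(
        comb(SUIT, j) * comb(DECK - SUIT, HAND - j)
        for j in range(k, HAND + 1)
    )
    return ways / comb(DECK, HAND)


# The more lopsided the hand, the harder it is to explain by shuffling alone.
for k in (5, 8, 11, 13):
    print(f"at least {k} spades: {prob_at_least(k):.3g}")
```

A hand that would almost never come from a well-shuffled deck makes the "unshuffled or poorly shuffled" explanation more attractive, which is the kind of relative confidence the post calls significance.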

Simon Moon, La Salle University
I would use a boyfriend/girlfriend analogy. Imagine that you are going out with a girl/boy. You have had a long relationship with this girl/boy and are seriously thinking about marrying this person. This girl/boy might or might not be the right person for your life, but you probably will not find that out for a while. Anyway, you will decide either to marry this person or not. You will be happy if you chose to marry the right girl/boy. You will also be grateful if you chose not to marry the wrong girl/boy. If you chose to marry the wrong girl/boy, you will regret it for a long time. If you missed the right girl/boy, you may despair when you find out about this person's great life on Facebook. Your mom might start talking about her own history or asking you about your situation. I guess you can match this scenario with the 4 quadrants of the 'confusion matrix'. My students can easily connect this to the logic of hypothesis testing. Anyway, hypothesis testing is a decision-making process. Good luck!
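
The four outcomes in the analogy above can be laid out as the quadrants of a confusion matrix. The mapping below is a sketch under an assumption the post does not state: the null hypothesis is taken to be "this is not the right person", so that marrying corresponds to rejecting H0. The function name `quadrant` and the labels are hypothetical.

```python
def quadrant(decision: str, truth: str) -> str:
    """Map a (decision, truth) pair onto a confusion-matrix quadrant.

    Assumption: H0 = "this is not the right person";
    deciding to marry = rejecting H0, passing = not rejecting H0.
    """
    table = {
        ("marry", "right"): "true positive (H0 correctly rejected)",
        ("marry", "wrong"): "false positive (Type I error)",
        ("pass", "right"): "false negative (Type II error)",
        ("pass", "wrong"): "true negative (H0 correctly kept)",
    }
    return table[(decision, truth)]


for decision in ("marry", "pass"):
    for truth in ("right", "wrong"):
        print(f"{decision:5s} / {truth:5s} person -> {quadrant(decision, truth)}")
```

With the opposite choice of null hypothesis, the two error labels would swap, but the four-way structure of the decision stays the same.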