Statistics Step2 PDF
** Descriptive epidemiology deals with rates, ratios and distributions; it explains the
determinants of the disease in terms of time, place and person.
- The researcher begins with a population with a certain outcome, and subjects are
classified as either "cases" or "controls" based on the outcome status.
- The cases and controls are assessed retrospectively for the presence of risk factors
(information is collected about exposure to risk factors).
- Selection of control subjects based on exposure status (exposed diseased or even non
exposed non diseased) is inappropriate ,
because comparing the frequency of exposure between the case and control groups is an
important part of case-control study.
- Independent variables (age, sex,...) are often selected to be the same (matched) between
the case and control groups to decrease the effect of confounding.
- Subjects with the disease of interest (case group) are compared with an otherwise similar
group that is disease free (control group).
- It is a retrospective study aiming at determining the association between risk factors and
disease occurrence.
- The main measure of association is the exposure odds ratio. The odds ratio can be calculated
in a case-control study, but the incidence of the disease cannot.
- One of the drawbacks of a case-control study is that the risk cannot be derived directly from
its results.
N.B.:
- Incidence measures (e.g. relative risk or relative rate) can't be directly measured in a
case-control study,
because the people being studied are those who have already developed the disease.
- Relative risk and relative rate are calculated in cohort studies, where people are
followed over time for the occurrence of the disease.
------------------------------------------------------------------------
- Divides the study group into "exposed" and "non-exposed" to the risk factors.
- Each subject is then followed prospectively until the disease develops.
- Is a prospective observational study in which groups are chosen based upon the presence
or absence of one or more risk factors.
- All subjects are then observed over time for the development of the disease of interest,
thus allowing estimation of the incidence within the total population and comparison of
incidences between subgroups.
- It is best for determining the incidence of the disease. Comparing the incidence of the
disease in 2 populations (one with and one without a given risk factor) allows for
calculation of a relative risk.
e.g. if a substantial number of subjects are lost to follow-up in the exposed and/or unexposed
groups:
- It is possible that the lost subjects differ in their risk of developing the outcome from the
remaining subjects.
Example: if 30% of subjects were lost to follow-up in a prospective study of the relation
between alcohol and breast cancer:
- There is no information available on whether these subjects developed breast cancer or not.
- The number (30%) is substantial and will influence the outcome if heterogeneity in
developing breast cancer exists between the lost subjects and the remaining subjects,
for example if the subjects lost in the exposed group experienced more breast cancer than
those with follow-up (selective loss of high-risk subjects).
# To reduce the potential for selection bias in prospective studies, investigators try to
achieve high rates of follow-up.
N.B.:
-- Median survival: used to compare the median survival times in two or more groups of
patients (e.g. receiving new treatment or placebo).
N.B.:
- Prevalence is the measure of those with the disease in the population at a particular point in time.
- The relation between incidence and prevalence in a stable population (little migration) can be demonstrated by:
Prevalence = Incidence x Average duration of the disease
# So if the incidence is fixed in a stable population, the prevalence is increased if there are
factors that prolong survival (i.e. disease duration), e.g. improved quality of care.
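A minimal sketch of how prevalence responds to disease duration when incidence is fixed. It assumes the usual steady-state approximation (prevalence = incidence x average disease duration) for a stable population; the function name and all numbers are illustrative and not taken from the notes.

```python
def prevalence(incidence_per_year: float, mean_duration_years: float) -> float:
    """Approximate prevalence in a stable population (assumed steady state)."""
    return incidence_per_year * mean_duration_years


# Hypothetical numbers: incidence stays at 1% per year, but improved care
# doubles the average disease duration from 5 to 10 years.
before = prevalence(0.01, 5)
after = prevalence(0.01, 10)
print(f"prevalence before: {before:.2%}, after: {after:.2%}")
```

With incidence unchanged, the longer duration alone raises the prevalence.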
- The researcher reviews past records, classifies subjects into "exposed" and "non-exposed"
and then follows them until the outcome.
- In a cohort study, the study subjects are free of the outcome at the time the study begins.
- Both the exposure and the outcome are studied at one point in time
(at one cross section of time).
- Since both exposure and outcome have been present for some time before the study,
it is not possible to determine the temporal association between the exposure and
outcome from a cross-sectional study.
- Usually subjects are randomly assigned to exposed (treatment) and placebo groups and
then followed to detect the development of the outcome of interest.
- This type of study has the least bias and helps to show a strong causal relationship.
- Usually involves randomization at the level of groups rather than at the level of individuals.
- In which one group of participants is randomized to one treatment for a period of time
and the other group is given an alternate treatment for the same period of time
(interchanging the treatments),
with a washout (no treatment) period in between the treatment intervals to limit the confounding
effect of the prior treatment.
** At the end of the time period, the two groups then switch treatments for another set
period of time.
- Randomizes one treatment to one group and another treatment to the other group,
such as the treatment drug to one group versus a placebo to the other group.
- Occurs when the effect of a main exposure on an outcome is modified by another variable,
e.g. women with +ve family history have an increased risk, while women without +ve
family history don't have an increased risk.
** Other examples:
-- Studying the effect of estrogen on the risk of venous thrombosis (modified by smoking).
-- Also studying the risk of lung cancer in people exposed to asbestos (greatly depends
on / modified by smoking).
For example:
the effect of a new estrogen receptor agonist drug on the incidence of DVT is modified by
smoking status:
- Smokers taking the drug have an increased risk of developing DVT, while nonsmokers
taking the drug don't.
- It may be confused with confounding; the two can be differentiated by dividing the whole
cohort into subgroups (stratified analysis).
Imagine that smoking is a confounder that by itself is associated with a higher risk of DVT,
so if more smokers are taking the drug,
- it might appear that the drug causes DVT, but when stratified analysis is performed by
analyzing smokers and nonsmokers separately,
- it will appear that the drug is no longer associated with DVT.
- Is the time period required for an exposure to start having an effect, i.e. the time required
from exposure to outcome.
- In infectious diseases it is relatively short, while in chronic diseases (e.g. cancer or CAD)
it may be very long, and an extended period of exposure may be required to affect the
outcome.
-- The latent period can also be applied to exposure to a risk modifier, as it may need to be
continuous over a certain period of time before influencing the outcome.
- It may be the result of a recording error, a measurement error or a natural phenomenon.
# The mean is extremely sensitive to outliers and easily shifts towards them
within the data set, and outliers significantly increase the dispersion
(SD = deviation of values around the mean).
## The median is much more resistant to outliers as it is located in the middle of the
dataset, where the observations usually don't differ much from each other.
-- It is the ratio of the risk in an exposed group to that in the unexposed group.
** A RR of 1 means that there is no association between the risk factor and the disease.
A relative risk > 1 means that there is a positive association between the risk factor and
the outcome.
A relative risk < 1 means that there is a negative association between the risk factor and
the outcome.
-- The farther the value of the RR from 1, the stronger the association.
** The classification into two or more ordinal categories enables the risk to be assessed as a
function of exposure.
-- And the DOSE-RESPONSE EFFECT can be calculated from the exposure and the outcome.
** The present example illustrates a dose-response relationship between smoking and
bronchogenic cancer
(the RR of bronchogenic lung cancer increases as the number of smoked PPD increases).
-- One weakness of the RR is that it gives no clue whether such a finding can be explained by
chance alone.
The confidence interval and the "P" value can help strengthen the finding of the study.
For the study to be statistically significant:
2- The "p" value should be less than 0.05 (i.e. < 5% chance that the results obtained were due to
chance alone).
## The "p" value is used to strengthen the results of the study; it is defined as the
probability of obtaining the result by chance alone.
** The commonly accepted upper limit (cut-off point) of the "P" value for the study
to be statistically significant is 0.05.
** If the "P" value is less than 0.05 (i.e. the study is statistically significant), the 95%
confidence interval doesn't contain 1.0 (the null value for RR).
@@ A relative risk of 0.71 shows that the drug decreased the risk of mortality by 29% (the
null value for RR is 1).
e.g.: A case of RR 1.6 (greater than 1) & confidence interval 1.02-2.15
(doesn't contain the null value 1).
** So for the study to be statistically significant, the "P" value must be less than 0.05.
N.B.: Very important to know how to calculate relative risk from the 2x2 table.
Event rate (risk) for the drug or test = +ve cases / total number examined by the test or
drug; RR = event rate in the treated group / event rate in the control group.
** In a study of 2 drugs or interventions, one drug may reduce the risk more than the
other.
-- Absolute risk reduction (ARR) = event rate with the first drug (placebo) - event rate with the second drug (under test).
** Number needed to treat (NNT): is the number of people that should receive
a treatment to prevent one defined event.
** NNT = 1/ARR **
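The chain from event rates to RR, ARR and NNT can be sketched in a few lines. The trial counts below are hypothetical, the function names are illustrative, and ARR is taken as the control event rate minus the treatment event rate.

```python
def event_rate(events: int, total: int) -> float:
    """Risk (event rate) in one group: events / total subjects."""
    return events / total


def relative_risk(rate_treated: float, rate_control: float) -> float:
    """RR = event rate in the treated group / event rate in the control group."""
    return rate_treated / rate_control


def absolute_risk_reduction(rate_control: float, rate_treated: float) -> float:
    """ARR = control event rate - treatment event rate."""
    return rate_control - rate_treated


def number_needed_to_treat(arr: float) -> float:
    """NNT = 1 / ARR."""
    return 1.0 / arr


# Hypothetical trial: 20 of 100 placebo patients and 10 of 100 treated patients have the event.
control = event_rate(20, 100)
treated = event_rate(10, 100)
rr = relative_risk(treated, control)
arr = absolute_risk_reduction(control, treated)
nnt = number_needed_to_treat(arr)
print(f"RR = {rr:.2f}, ARR = {arr:.2f}, NNT = {nnt:.0f}")
```

In practice the NNT is usually rounded up to the next whole person.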
N.B.:
- The power of a study is the ability to detect a difference between two groups (treated
versus non-treated, exposed versus non-exposed).
- Increasing the sample size --> increases the power of the study and consequently makes
the confidence interval of the point estimate (e.g. relative risk) tighter.
** If the sample size is small --> low power of the study to detect the difference between
exposed and non-exposed subjects;
this makes the confidence interval of the study wide (e.g. 0.8-3.1) and makes the study
statistically insignificant.
And if we increase the sample size --> the confidence interval will be tighter and the study
is more likely to be statistically significant.
- It is the number of people that must be treated for one adverse event to occur
(similar to number needed to treat).
Attributable risk =
Adverse event rate (treatment group) - Adverse event rate (control group).
NNH = 1 / attributable risk, e.g. NNH = 1/0.25 = 4.
- Results from the manner in which the subjects are selected for the study, or from
selective losses to follow-up.
It is a selection bias that can be created by selecting hospitalized patients as the control
group.
- It can lead to a study population having characteristics that differ from the target
population.
* A common example is that severely ill patients are most likely to enroll in cancer trials,
leading to
results that are not applicable to patients with less advanced cancer,
i.e. the study sample isn't representative of the target population with respect to the joint
distribution of exposure and outcome.
- Occurs due to imperfect assessment of the association between the exposure and
outcome .
and one of them diagnose the disease earlier than the other without an effect on the
outcome (survival).
## What actually happens is that detection of the disease was made at an earlier point in
time,
- but the disease course itself or the prognosis did not change.
>> So the screened patients appeared to live longer from the time of diagnosis until the time
of death.
N.B.: IN USMLE
Think of LEAD-TIME BIAS when you see "a new screening test" for poor-prognosis diseases like
lung cancer or pancreatic cancer.
- Occurs when the observer may be influenced by prior knowledge or details of the study that can
affect the results.
* Blinded studies usually avoid this bias by preventing the observer from knowing which
treatment or intervention the participants are receiving.
* Blinding can involve patients exclusively or both patients and physicians (double blinding)
and is related to the design of the study (the scenario will describe how the study was
designed).
* Results from inaccurate recall of past exposure by people in the study and applies mostly
to retrospective studies such as case-control studies.
# People who have suffered an adverse event (such as having a child with congenital
anomalies) are more likely to recall previous risk factors than those who have not.
- Occurs when the case and control populations differ due to admission or referral
practices.
For example: a study involving cancer risk factors performed at a hospital specialized in
cancer research
* may enroll cases referred from all over the nation, whereas hospitalized control subjects
without cancer may come from only the local area.
- Refers to the fact that a risk factor itself may lead to extensive diagnostic investigations and
increase the probability that a disease is identified.
For example: patients who smoke may undergo increased imaging surveillance due to
their smoking status, which would detect more cases of cancer in general .
- Occurs when the outcome of the test is obtained from the patient's response, not by
objective diagnostic methods (e.g. migraine headache).
- Is a type of selection bias where a treatment regimen is selected for a patient based on the
severity of their condition.
.Offline case 20 .
- It may result from the way that treatment and control groups are assembled.
- It may occur if the subjects are assigned to the study groups of a clinical trial in
a non-random fashion.
For example, in a study comparing oral NSAIDs and intra-articular corticosteroid
injections for the treatment of osteoarthritis,
* obese patients may be preferentially assigned to the corticosteroid group (affecting the
outcome).
- Refers to a conclusion that there is no difference between the groups studied when
a difference truly exists.
>> Due to presence of one or more variables associated independently with both the
exposure and the outcome .
For example: cigarette smoking can be a confounding factor in studying the association
between maternal alcohol drinking and low birth weight babies,
as cigarette smoking is independently associated with both alcohol consumption and low birth
weight babies.
- It is the tendency of a study population to affect the outcome because these people are
aware that they are being studied .
- This awareness leads to consequent change in behaviour while under observation -->
seriously affecting the validity of the study .
- It is usually seen in studies that concern behavioral outcomes or outcomes that can be
influenced by behavioral changes .
** In order to minimize the Hawthorne effect, the studied subjects can be kept unaware
that they are being studied .
- It describes the researcher's beliefs in the efficacy of treatment that can potentially affect the
outcome.
4- Confounders: can be avoided by 3 methods in the design stage of the study: matching,
restriction and randomization.
** Matching is used in case-control studies, in which variables that could be
confounders (age, race, ...) are selected, then
factors (confounders) that can influence the estimate of association between the
treatment and placebo groups so that the unconfounded effect of the exposure can be
isolated.
the known risk factors (such as age, severity of the disease) as well as unknown & difficult to
measure confounders (such as level of stress, socioeconomic status) and make all confounders
evenly distributed between the treatment group and the placebo group,
i.e. the confounders are evenly distributed between the treatment and the placebo
groups.
- It is the ratio of the chance of an event occurring in the treatment arm (drug or group of
interest)
compared to the chance of that event occurring in the control arm (the other drug or
group) during a set period of time.
Hazard ratio = chance of the event occurring in the test group / chance of the event occurring in the control group.
So, the lower the hazard ratio, the less likely the event will occur in the treatment arm.
- The higher the ratio, the more likely the event will occur in the treatment arm.
** Hazard ratio for major bleeding = 0.93, i.e. close to 1, means that both groups are similar
to each other in this event.
** Hazard ratio for intracranial bleeding = 0.41 (indicates a lower chance of drug "A"
causing intracranial bleeding than drug "B").
** Hazard ratio for GIT bleeding = 1.50 (indicates that drug "A" has a higher chance of
causing GIT bleeding than drug "B").
** Hazard ratio for life-threatening bleeding = 0.80 (indicates a lower chance of drug "A"
causing life-threatening bleeding than drug "B").
** Hazard ratio for total bleeding = 0.91 (indicates a slightly lower chance of drug "A"
causing bleeding overall than drug "B").
## In case number (11 offline) you should focus on the baseline value in the case and take
the corresponding hazard ratio in the study.
then
2- Blind the investigators to the identity of the patients who receive the treatment arm.
-- A listing of the baseline characteristics of the patients in each arm would demonstrate
whether the two arms had patients with similar characteristics and would ensure that proper
randomization occurred in the study.
- the two mean values - the sample variances - the sample size
>> If the "P" value is less than 0.05 --> the null hypothesis
(that there is no difference between the two groups) is rejected.
- Because the population variances are not usually known --> this test has limited
applicability.
* Used to compare two or more means (determines whether there are significant
differences between the means of 2 or more independent groups).
e.g. ANOVA can be used to assess for a difference in mean blood pressure among three
samples of populations
grouped by exercise status (never exercise, exercise occasionally and exercise frequently).
- Used to test the association between two categorical variables
- by comparing proportions (of a categorized outcome, e.g. high or low) according to
the exposure (present or not present).
A 2x2 table may be used (high or low outcome) and (exposed & non-exposed) to compare
the observed values to the expected values.
** If the difference between the observed and expected values is large, this means there is an
association between the exposure and the outcome.
For example: it is used to determine if the distribution of gender and smoking status is
- Is an epidemiologic method for pooling the data from several studies to do an analysis
having relatively high statistical power.
For example: individual studies assessing the effects of aspirin on certain cardiovascular
events may be inconclusive;
however, analysis of data compiled from multiple clinical trials may reveal a significant
benefit.
- Is a method used to model the linear relationship between a dependent variable and 2 or
more independent variables.
For example, this test could be used to quantify the effects of alcohol use, tobacco
smoking and charred food consumption on the incidence of gastric ulcer.
- It is a measure of the strength and direction of a linear relationship between 2 variables.
For example, a study may report a correlation coefficient describing the association
between hemoglobin A1c level and average blood glucose level.
- Involves two or more experimental interventions, each with two or more variables that are
studied independently.
For example:
with two different blood pressure endpoints (102-107 mmHg or < 92 mmHg).
1- ACEIs
2- Beta blocker
3- Ca channel blocker
- Lower BP goal - Higher BP goal
- All measures of central tendency are equal, i.e. mean = median = mode.
# The degree of dispersion from the mean is determined by the standard deviation:
* 68% of data --> within 1 standard deviation from the mean (mean +/- 1 SD).
* 95% of data --> within 2 standard deviations from the mean (mean +/- 2 SD).
* 99.7% of data --> within 3 standard deviations from the mean (mean +/- 3 SD).
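The 68 / 95 / 99.7 figures can be checked numerically. This is a small sketch using the error function from Python's standard library; the function name is illustrative. For a normal distribution, the fraction of data within k standard deviations of the mean is erf(k / sqrt(2)).

```python
import math


def fraction_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd standard deviations of the mean."""
    return math.erf(k_sd / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {fraction_within(k):.4f}")
```

The values come out at roughly 0.68, 0.95 and 0.997, which are the rounded percentages quoted above.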
N.B.:
- The long slope of the curve (the "tail") extends in the positive direction.
- The mean is the most shifted in the positive direction, followed by the median, then the
mode.
In strongly skewed distributions, the median is a better measure of central tendency
than the mean.
- The long slope of the curve (the "tail") extends in the negative direction.
- The mean is the most shifted in the negative direction, followed by the median, then the
mode.
** So the mode > the median > the mean (i.e. the mean is the smallest).
** In strongly skewed distributions, the median is a better measure of central tendency
than the mean.
- Sensitivity --> the proportion of true +ve cases among all diseased cases.
(Sensitivity = true +ve by the test / all patients that are actually diseased).
- The higher the sensitivity --> the better the test detects patients with the disease --> fewer
false negatives.
- Screening tests (especially for diseases with severe sequelae) should have a high sensitivity.
- Specificity --> the proportion of true -ve cases among all non-diseased cases.
(Specificity = true -ve by the test / all patients that are actually disease free).
- Is a measure of the true negative rate and indicates how well a test can rule out a given
condition (exclude those without the disease).
- The higher the specificity, the more likely that most healthy patients will have -ve test
results.
** The higher the specificity --> the less likely the false +ves.
-- They are fixed values that do not vary with the pre-test probability of a disease or with
the prevalence of the disease.
-- The ideal diagnostic test should have high sensitivity and specificity.
N.B.:
- Raising the cutoff point of a diagnostic test --> decreases its sensitivity but increases its
specificity.
- Lowering the cutoff point of a diagnostic test --> increases its sensitivity but decreases its
specificity.
## Draw the 2x2 table (a, b, c, d). ##
OR = (ad)/(bc)
RR = [a/(a+b)] / [c/(c+d)]
- Direct calculation of RR in a case-control study is not possible, because the study design
doesn't include following people over time.
- If the prevalence of the disease is low --> the odds ratio approximates the relative risk.
- Increasing the sample size will decrease the "P" value of the odds ratio and make the
confidence interval tighter.
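A short sketch of the two 2x2-table formulas, showing why the odds ratio approximates the relative risk only when the disease is rare. The cell layout is an assumption (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without), and both tables are hypothetical.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (a*d) / (b*c)."""
    return (a * d) / (b * c)


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = [a/(a+b)] / [c/(c+d)]."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical rare disease: 10 of 1000 exposed and 5 of 1000 unexposed are affected.
rare = (10, 990, 5, 995)
# Hypothetical common disease: 400 of 1000 exposed and 200 of 1000 unexposed are affected.
common = (400, 600, 200, 800)

for label, table in (("rare", rare), ("common", common)):
    print(f"{label}: OR = {odds_ratio(*table):.2f}, RR = {relative_risk(*table):.2f}")
```

Both tables have the same relative risk, but the odds ratio stays close to it only in the rare-disease table.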
N.B.:
- Attributable risk percent (ARP): represents the excess risk in a population that can be
attributed to exposure to a particular risk factor.
- It can be calculated by subtracting the risk in the unexposed population (baseline risk)
from the risk in the exposed population and dividing by the risk in the exposed population,
or
ARP = (RR-1)/RR.
Pre- and post-test probabilities: +ve predictive value (PPV) & -ve predictive value (NPV)
- PPV describes the probability of having the disease if the test result is +ve
(if the patient has a +ve test result, what is the likelihood that he actually has the disease?).
- The post-test probability of having the disease is directly related to the PPV.
- If the PPV is 25% (i.e. low), then if the test result is positive, the post-test
probability of having the disease is low.
- The post-test probability is also dependent on the sensitivity, specificity and pre-test
probability of having the disease.
- NPV describes the probability of not having the disease if the test result is -ve.
- A patient with a low probability of having a disease will have a high NPV.
** If the NPV is 96%, this means that if the test result is -ve, the chance that the patient does
not have the disease is high (96%),
and the chance that the patient has the disease is low (100 - 96 = 4%).
## Example ##
-- A patient with a high pre-test probability of having the disease (1st degree relative having
breast cancer or age > 40 yrs) has a low NPV.
-- A patient with a low pre-test probability of having breast cancer (less than 40 yrs old) has a
high NPV.
- A patient who belongs to a high-risk group (e.g. multiple sexual partners, no
condom use, IV drug abuse)
has a high pre-test probability of having AIDS --> so he will have a low NPV.
-- On the other hand, a patient who belongs to a low-risk group (one sexual partner, using
condoms and no IV drug abuse)
has a low pre-test probability of having AIDS --> so has a high NPV.
NOTE
- The prevalence of the disease is directly related to the pre-test probability of having the
disease and hence to the PPV, & inversely related to the NPV,
so increased prevalence --> low NPV but high PPV and vice versa.
-- Sensitivity and specificity are not affected by the prevalence of the disease, and neither is the
positive likelihood ratio, i.e. sensitivity / (1 - specificity).
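The effect of prevalence on the predictive values can be illustrated with Bayes' rule. This is a sketch under assumed numbers: a hypothetical test with 90% sensitivity and 90% specificity, applied at two hypothetical prevalences. The function names are illustrative.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value = TP / (TP + FP), expressed as population fractions."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value = TN / (TN + FN), expressed as population fractions."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Same hypothetical test (sensitivity 0.9, specificity 0.9) at low and high prevalence.
for prevalence in (0.01, 0.50):
    print(
        f"prevalence {prevalence:.0%}: "
        f"PPV = {ppv(0.9, 0.9, prevalence):.3f}, NPV = {npv(0.9, 0.9, prevalence):.3f}"
    )
```

Sensitivity and specificity are held fixed; only the prevalence changes, and the PPV rises while the NPV falls.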
** Cases and diagnostic tests that are high-yield USMLE questions in probabilities:
- Represents the appropriateness of the test (i.e. the test's ability to measure what it is
supposed to measure).
- In order to determine the validity of a test, the results are compared to those obtained
from the gold standard test.
N.B.: The sensitivity and specificity of a test are also determined by comparing its results to
the results obtained by the gold standard test.
Test-retest reliability
- A reliable test is reproducible; it gives similar or very close results on repeat measurements.
- Sensitivity (positivity in disease) --> is the proportion of subjects who have the target
condition and give positive results.
- Specificity (negativity in health) --> is the proportion of subjects without the target
condition who give negative results.
Sensitivity --> ++ true +ve & -- false -ve (diagnosed as normal but actually diseased).
Sensitivity --> allows not to miss any diseased patient (not to miss any true +ve).
Specificity --> ++ true -ve & -- false +ve (diagnosed as diseased but actually normal).
** ROC --> aims at decreasing false -ve and false +ve results
(i.e. increasing sensitivity and specificity).
- Positive predictive value (PPV) --> is the probability of having the disease if the test result
is +ve.
PPV = TP/(TP + FP).
- Negative predictive value (NPV) --> is the probability of not having the disease if the test
result is -ve.
NPV = TN/(TN + FN).
Positive likelihood ratio (LR+) = sensitivity/(1 - specificity)
(LR+) --> is the ratio of the proportion of patients who have the target condition & test
positive to
the proportion of patients without the target condition who also test positive.
Negative likelihood ratio (LR-) = (1 - sensitivity)/specificity
(LR-) --> is the ratio of the proportion of patients who have the target condition who test
negative to
the proportion of patients without the target condition who also test negative.
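Both likelihood ratios follow directly from sensitivity and specificity. A minimal sketch, using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity; the test characteristics below are hypothetical.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = P(test +ve | disease) / P(test +ve | no disease)."""
    return sensitivity / (1 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = P(test -ve | disease) / P(test -ve | no disease)."""
    return (1 - sensitivity) / specificity


# Hypothetical test with 90% sensitivity and 80% specificity.
lr_pos = positive_likelihood_ratio(0.9, 0.8)
lr_neg = negative_likelihood_ratio(0.9, 0.8)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```

Neither ratio uses the prevalence, which is why likelihood ratios are independent of it.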
The ROC curve has 2 axes: the vertical axis (Y) for sensitivity and the horizontal axis (X) for 1 - specificity.
Low cutoff --> increased sensitivity (better ability to identify patients with the disease, i.e.
increased true positives),
although this causes decreased specificity (the test falsely identifies more subjects as
diseased although they are not) and vice versa.
Low cutoff --> high sensitivity --> higher negative predictive value (NPV) --> fewer false
-ve results (ruling-out probability).
High cutoff --> higher specificity --> higher positive predictive value (PPV) --> fewer false
+ve results (ruling-in probability).
-- A shift of the ROC curve upwards for a given cutoff indicates increased sensitivity and
vice versa.
-- A shift of the curve to the right for a given cutoff (higher value)indicates decreased
sensitivity and vice versa .
--The curve usually shows that an increase in sensitivity is offset by decrease in specificity
As mentioned before .
sensitivity = TP/(TP+FN) & specificity = TN/(TN+FP), so decreased overlap between the
healthy and diseased population curves
>> decreases both the number of FP & FN (i.e. decreases the denominator) --> thus increases
both sensitivity and specificity
(i.e. allows for a test with both higher sensitivity and specificity).
-- In overlapping curves: moving the cutoff value to the right (higher value) would increase
specificity at the expense of sensitivity, while
moving the cutoff to the left (lower value) would increase sensitivity at the expense of
specificity.
A cutoff value just outside the overlapping portion would maximize the sensitivity
(if to the left) or specificity (if to the right) at 100%.
Both sensitivity and specificity depend on the cutoff value of a given test, for example:
- Raising the cutoff value makes it more difficult to diagnose the condition,
i.e. it makes it harder to obtain +ve results and easier to obtain -ve results --> this will increase
specificity but decrease sensitivity.
-- Lowering the cutoff value makes it easier to obtain +ve results and harder to obtain -ve
results.
- Is the proportion of true +ve results out of the total number of +ve results of the
test (-ve results are not taken into account).
- The study is precise if the results are not scattered widely; this is reflected by a tight
confidence interval.
So, if the first study has a wider confidence interval than the second study --> the second
study is more precise.
- Is the proportion of true results (true +ve and true -ve) out of all results that are
predicted by the test.
- The closer the plotted curve approaches the left and top borders of the ROC plot, the
more accurate the test.
** Accuracy can also be measured by the total area under the plotted ROC curve:
an increase in the total area under the curve --> increases the accuracy of the test.
N.B.:
-- Both accuracy and precision depend upon the sensitivity and specificity of the test as well as
the prevalence of the condition in the population tested.
** Accuracy is reduced if the sample doesn't reflect the true value of the parameter
measured.
Increasing the sample size --> increases the precision of the study, but doesn't affect the
accuracy.
The closer the value to its margins (-1 or 1), the stronger the association.
-- The correlation coefficient shows the strength of association but does not necessarily
imply causality.
- It is calculated by dividing the number of diseased subjects by the number of people at risk
or of interest.
- Median --> is the middle observation in a series of observations after arranging them in
ascending or descending order.
EXAMPLE: 5, 6, 7, 5, 10, 3
Mean = (5+6+7+5+10+3)/6 = 36/6 = 6
Mode --> 5
EX2: 5, 6, 8, 9, 11 --> median = 8 (the middle value)
EX3: 5, 6, 8, 9 --> median = (6+8)/2 = 7 (the average of the two middle values)
Range: is the difference between the largest and the smallest values,
e.g. Range = 9-5 = 4
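The same measures can be computed with Python's standard `statistics` module. The `summarize` helper below is illustrative; it is applied to the example data sets from the notes.

```python
import statistics


def summarize(values: list) -> dict:
    """Return the mean, median, mode and range of a list of observations."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),  # averages the two middle values if the count is even
        "mode": statistics.mode(values),      # most frequent observation
        "range": max(values) - min(values),
    }


print(summarize([5, 6, 7, 5, 10, 3]))
print(statistics.median([5, 6, 8, 9, 11]))  # odd number of observations
print(statistics.median([5, 6, 8, 9]))      # even number of observations
```

Note that the observations do not need to be sorted first; `statistics.median` orders them internally.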
N.B.:
- Average: it is the sum of all observations divided by the
sample size,
e.g. in a random sample of children the numbers of episodes of UTI are as follows: 50 children (0),
30 children (1), 10 children (2), 10 children (3).
** The average number of UTI episodes per year in a child = 80/100 = 0.8 (between 0 and 1),
i.e. the child experiences less than one attack of UTI per year on average.
- A 95% confidence interval is the range of values within which we can be 95% confident that the
true mean of the underlying population falls.
- In order to calculate the confidence interval we need to know the mean, SD, Z-score and
sample size.
* The first step is to calculate the standard error of the mean: SEM = SD / sqrt(sample size).
* Thus the confidence interval (CI) will tighten as the sample size increases.
* The next step is to multiply the SEM by the corresponding Z-score: CI = mean +/- (Z-score x SEM).
For a 95% CI, the Z-score is 1.96 (for a 99% CI the Z-score is 2.58).
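A minimal sketch of the confidence interval calculation, assuming the standard error of the mean is SD / sqrt(n) and the interval is mean +/- Z x SEM. The sample values are hypothetical and the function name is illustrative.

```python
import math


def confidence_interval(mean: float, sd: float, n: int, z: float = 1.96) -> tuple:
    """Return (lower, upper) bounds of the CI: mean +/- z * SD / sqrt(n)."""
    sem = sd / math.sqrt(n)   # standard error of the mean
    margin = z * sem
    return mean - margin, mean + margin


# Hypothetical sample: mean 100, SD 15. Compare a small and a larger sample size.
for n in (36, 144):
    low, high = confidence_interval(100, 15, n)
    print(f"n = {n}: 95% CI = {low:.2f} to {high:.2f}")

# Same small sample with the 99% Z-score gives a wider interval.
low99, high99 = confidence_interval(100, 15, 36, z=2.58)
print(f"n = 36: 99% CI = {low99:.2f} to {high99:.2f}")
```

Quadrupling the sample size halves the SEM, so the interval becomes half as wide.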
The association is positive (if the outcome increases with the increase in the exposure) ->
+ve correlation coefficient while
the association is negative (if outcome decreases with the increase in exposure) ->
-ve correlation coefficient .
-- Crude analysis of association using the scatter plots doesn't account for possible
confounders.
N.B.:
1- It is very important to consider the natural history of a disease when evaluating the
effectiveness of a drug in a trial,
e.g. common cold --> natural resolution within one week should be taken into consideration
control group and statistical significance is made to know the power of the study
- It is always the statement of NO relationship between the exposure and the outcome.
- To state the null hypothesis correctly you should recognize the study design first.
- In a cross-sectional study: the 2 variables (CRP & colon cancer) are studied at the same
point of time,
so you can't measure the relationship between the 2 variables --> the null hypothesis is better
considered.
- It opposes the null hypothesis.
- It states that there is a relationship between the exposure and the outcome.
- It is the applicability of the obtained results beyond the cohort that was studied.
- External validity answers the question "how generalizable are the results of a study to
other populations?"
For example: if the cohort is restricted to middle-aged women, the results of the study are
applicable only to middle-aged women & not applicable to elderly men.
================================================================
1- Smoking cessation is the single most effective preventive intervention in almost every
patient (the most effective modifier of mortality, ahead of measures including aspirin and
tight glucose control) in nearly every disease.
2- How to calculate:
i.e. it is the probability of finding a true relationship (the probability of seeing a difference
when one truly exists).
-- So if the researchers need to find a difference between a tested drug and the standard
of care, if one exists, they need to maximize the power (1-B).
** Power depends on the sample size and the difference in outcome between the 2 groups
being tested.
- Occurs when the researchers fail to reject the null hypothesis when the null hypothesis
is really false.
- An example: a study finding that a drug doesn't affect platelet function when, in fact, it does.
If (B) is set at 0.2 (20%), i.e. there will be a 20% chance of accepting the null hypothesis when
it is false,
the power (1-B) will be 0.8 (80%), i.e. there will be an 80% chance of rejecting the null
hypothesis when it is truly false.
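The relationship between (B) and power can be sketched in a couple of lines. The function name is hypothetical; the value 0.2 is the one used in the example above.

```python
def power_from_beta(beta):
    """Power is the probability of rejecting a false null hypothesis: 1 - beta."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be a probability between 0 and 1")
    return 1.0 - beta

power = power_from_beta(0.2)
print(f"power = {power:.0%}")
```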
- Occurs when the researchers reject the null hypothesis when the null hypothesis is really
true,
i.e. the study finds a statistically significant difference between 2 groups when it actually
does not exist.
An example: if a study concluded that hard candy improves heart failure mortality when
it doesn't.
Alpha (a): is the maximum probability of making a type I error that a researcher is willing to
accept.
- It corresponds with the "P" value, or the probability of making a type I error.
- The (a) is typically set at P = 0.05, meaning that the researchers accept a 5% possibility
that the difference perceived as true is actually due to chance.
N.B.:
There are 4 basic payment methods that exist between health insurance and physicians:
1- Capitation
- Physicians are paid a fixed amount of money per enrollee, not per service
(i.e. paid by capitation).
So they have incentives to contain (decrease) costs per enrollee due to the fixed budget
allocated for them.
- If many enrollees seek care or some enrollees need extensive care, physicians' costs
may be greater than their payments.
So physicians are motivated to provide more preventive care to catch illness early, so
patients stay healthier and need fewer tests and procedures as they age.
2- Fee for service (FFS)
- Physicians are paid a fixed amount of money for every service and diagnostic test they
provide.
- They face little financial risk and are enticed (tend) to increase the number of services
they provide on each visit.
3- Discounted FFS
- Discounted FFS works similarly to FFS except that physicians are reimbursed (repaid)
a discounted amount.
So physicians paid under this model may be more conservative when ordering tests and
providing services compared to those paid by FFS.
4- Salary
- Physicians are paid a fixed amount and their pay is not tied to the number of enrollees or
services rendered (provided).
- Unless their contracts include withholds or bonuses, salaried physicians face no financial
risk.
So they have no financial incentive to change their treatment patterns, either in services
provided or number of follow-up visits.
** FFS and discounted FFS are commonly used in preferred provider organization insurance
plans.
N.B.:
- A state with a population of 4,000,000 contains 20,000 people who have
disease A, a fatal neurodegenerative condition. There are 7,000 new cases of the disease
a year and 1,000 deaths attributable to disease A. There are 40,000 deaths per year from all
causes. What is the,,,,,,,, ?
1- An increase in lung cancer incidence and mortality has been observed in women over
the last four decades due to increased cigarette smoking.
2- Breast cancer is the most common non-skin cancer among women in the USA, but breast
cancer mortality is comparatively low.
3- Mortality from breast cancer has stayed relatively stable over time, whereas colon cancer
mortality has decreased somewhat over the last decades.
4- Stomach cancer is now uncommon; its incidence and mortality have decreased drastically
in the last decades.
6- Apart from skin cancer, the most common cancers in women, in descending order of
incidence, are: breast cancer, lung cancer, then colon cancer.
7- In order of mortality: lung cancer, followed by breast cancer, then colon cancer.
N.B.:
- Case-fatality rate: is calculated by dividing the number of fatal cases by the total number
of people with the disease.
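As an illustration, the sketch below applies this definition to the disease A figures given earlier in these notes. The other three measures use standard textbook definitions that this section does not spell out, so they should be read as assumptions; the notes leave the actual question for that example blank.

```python
# Figures from the disease A example in these notes.
population = 4_000_000
existing_cases = 20_000
deaths_from_disease = 1_000
deaths_all_causes = 40_000

# Definition given in the notes: fatal cases / people with the disease.
case_fatality = deaths_from_disease / existing_cases

# Assumed standard definitions (not stated in this section).
prevalence = existing_cases / population
cause_specific_mortality = deaths_from_disease / population
crude_mortality = deaths_all_causes / population

print(f"case fatality:            {case_fatality:.1%}")
print(f"prevalence:               {prevalence:.2%}")
print(f"cause-specific mortality: {cause_specific_mortality * 100_000:.0f} per 100,000")
print(f"crude mortality:          {crude_mortality:.1%}")
```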
N.B.:
** If events are independent, the probability that all events will turn out the same
(e.g. -ve) is the product of the separate probabilities for each event.
** The probability of at least 1 event turning out differently is given as: 1 - (the probability
of all events being the same).
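The two rules above can be sketched as a small helper. The function name is hypothetical, and the inputs (0.95 and 8) are the values from the prostate cancer test example in these notes; the helper assumes every event has the same probability and that the events are independent.

```python
def prob_at_least_one_different(p_same, n_events):
    """1 - P(all n independent events turn out the same way)."""
    p_all_same = p_same ** n_events   # product of n identical probabilities
    return 1.0 - p_all_same

p_all_negative = 0.95 ** 8
p_any_positive = prob_at_least_one_different(0.95, 8)

print(round(p_all_negative, 3), round(p_any_positive, 3))
```

So even with a test that is negative in 95% of disease-free patients, roughly one run of 8 samples in three would be expected to contain at least one false positive.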
For example:
## A new serological test for detecting prostate cancer is negative in 95% of patients who
don't have the disease. If the test is used on 8 blood samples
taken from patients without prostate cancer, what is the probability of getting
at least 1 positive test?
- In this case there is a 0.95 (95%) probability of giving a true negative result and a 0.05 (5%)
probability of giving a false positive result.
- To calculate the chance of all 8 tests being negative: probability (all negative) = (0.95)^8.
You have to know that the total probability is always equal to 1.0 (100%).
>> so