
Basics of Biostatistics for Understanding Research Findings

Article  in  MAMC Journal of Medical Sciences · September 2015


DOI: 10.4103/2394-7438.166310

Satyanarayana Labani, Smita Asthana


Division of Epidemiology and Biostatistics, Institute of Cytology and Preventive Oncology, Indian Council of Medical Research, Noida, Uttar Pradesh, India

Review Article

Abstract
The aim of this communication is to give an overview of basic biostatistics procedures that are helpful in understanding medical research findings. Several books now cover this topic, and several articles have been written on it in reputed journals, either on individual topics of interest or as a series of chapter articles. In contrast, this article attempts to summarize basic biostatistics in a descriptive manner, with the aim of giving the reader the essential basis of research methodology. It may also be useful for medical UG/PG students preparing for viva and for junior biomedical faculty in understanding advances in knowledge in their area of specialization. The article is not a complete account of the basics of the subject; the reader is advised to consult reference books for more details.

Key words: Basics, biostatistics, research methodology
Introduction

In medical practice, a few patients out of a group with a similar condition may not be cured by, or may not respond to, the chosen treatment regimen. This failure to respond, which leads to uncertainty about the cure of patients, might be due to inherent natural variation or to sampling variation among the patients included in the assessment. Research is conducted in every discipline, including medicine, to investigate unknown facts through the collection of qualitative and quantitative information called "data." In research, data are usually collected from a fraction of subjects, called a "sample," drawn from a larger target group called the "population." The sample is required to be a random sample from the target population so that it is representative. A random or probability sample allows findings obtained from the sample to be generalized to the entire population. In random sampling, the inclusion of a particular subject in the sample cannot be predicted; this is similar to a lottery.

Research findings obtained from collected data are communicated in scientific journals, and the evidence that emerges from research is subsequently included in medical textbooks. The findings, mostly in numerical form, are referred to by the commonly used plural term "statistics." In its singular sense, statistics is a science involving planning, data collection, analysis of data, and interpretation of findings to reach valid conclusions. The application of statistics to managing the uncertainties in diagnosis or prognosis that emerge from biomedical data is called biostatistics. This communication is an overview of biostatistics intended to help in understanding medical research findings and in appreciating research methodology as a tool in medical advancement. A broad view of the research methodology and the biostatistical and epidemiological tools required to understand research findings is presented.

Address for correspondence: Dr. Satyanarayana Labani, Division of Epidemiology and Biostatistics, Institute of Cytology and Preventive Oncology, Indian Council of Medical Research, I-7, Sector-39, Noida, Uttar Pradesh, India. E-Mail: satyanarayanalabani@yahoo.com

This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms.

For reprints contact: reprints@medknow.com

Website: www.mamcjms.in

© 2015 MAMC Journal of Medical Sciences | Published by Wolters Kluwer - Medknow


Labani and Asthana: Basic biostatistics

Research Questions and Designs of Studies

Research is done on hot topics or novel research questions that interest the research community. A research question can take the form of (i) a confusing phenomenon that requires a solution, (ii) an unsolved mystery, (iii) the development of a new technology, or (iv) an alternative or better solution to an existing problem. Before a study is designed, two things are required: (i) framing a research question or hypothesis and (ii) detailing the methods needed to answer it. Research questions arise from accumulated experience in the area or from a literature search of related fields to find research gaps. A study design is a structured approach to addressing a specific research question. Studies that examine patterns of disease use a descriptive design. Studies that seek to determine suspected causes of disease use an analytical design. Both designs are observational in nature. Experimental studies or clinical trials that compare treatment modalities, on the other hand, use an intervention approach. The formats of these designs are illustrated in Tables 1 and 2. The impact of natural variation on medical decisions can be controlled by using a design. A design is a scientific plan to collect and compile evidence. The main thrust of a design is that the sample of observations is sufficient and free from bias. Recruitment of subjects into a sample depends on the study design chosen to answer the research question.

Table 1: Study designs-1

Sampling of Subjects in Various Designs

In observational studies, sampling from the target population is done to ensure randomness in the selection of subjects, whereas in clinical trials subjects are randomly allocated (randomized) to the intervention groups under comparison. The sampling method varies with the type of observational study: descriptive, cross-sectional, cohort, or case-control.

Sampling in descriptive design
For a descriptive or cross-sectional study, the sampling method could be simple, systematic, stratified, cluster, or multistage, or a combination of these. Simple random sampling gives every unit or individual an equal chance of being included in the sample, on the basis of random numbers generated by a computer. In systematic sampling, individuals are selected at regular intervals determined by the sampling fraction. Suppose we want to select a sample of 50 from 500 units; the sampling fraction is then 50/500 or 1/10, that is, one unit out of every 10 is to be selected. One number (the random start) is randomly selected from the first 10, and 10 is then added systematically each time; for example, if the first number is 5, the others are 15, 25, 35, etc. Suppose there are different categories or strata for which data need to be collected separately. Performing simple random sampling within each stratum is called stratified random sampling. This method is used in surveys when adequate representation of categories is needed, such as rural and urban areas or low, middle, and higher income groups of a community. For random numbers to use in selecting subjects, a website offering a random number generator may be consulted.[1]
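The systematic and stratified schemes described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function names, the 1-based numbering of units, and the rural/urban stratum sizes are assumptions made for the example, and the standard-library random module stands in for any random number generator.

```python
import random


def systematic_sample(population_size, sample_size, rng):
    """Select every k-th unit after a random start, where k = N / n."""
    if population_size % sample_size != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    interval = population_size // sample_size   # 500 / 50 -> every 10th unit
    start = rng.randint(1, interval)            # random start among the first k units
    return list(range(start, population_size + 1, interval))


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size separately within each stratum."""
    return {name: sorted(rng.sample(units, n_per_stratum))
            for name, units in strata.items()}


rng = random.Random(2015)  # fixed seed only so the illustration is repeatable

selected = systematic_sample(500, 50, rng)
print(selected[:4], "...", selected[-1])

# Hypothetical strata: units 1-300 are rural, units 301-500 are urban.
strata = {"rural": list(range(1, 301)), "urban": list(range(301, 501))}
by_stratum = stratified_sample(strata, 5, rng)
print(by_stratum)
```

With a random start of 5, systematic_sample returns units 5, 15, 25, 35, and so on up to 495, which matches the worked example in the text.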
Sampling in prospective design
A group of subjects, or cohort, exposed and/or unexposed to a particular characteristic is followed up over a period of time and observed to see whether the outcome of interest develops. For example, groups of elderly women exposed or unexposed to human papillomavirus (HPV) are cohorts. For sampling in this setting, a cross-sectional study of the prevalence of exposure can serve as a basis when the exposure is not rare. When the exposure is rare, the sampling methods or approaches used in the case-control setting may be useful.

Sampling in case-control design
A case-control study enrols cases with the disease and controls without the disease, whereas a prospective study follows up subjects with and without the exposure for development of the outcome of interest. In contrast to the prospective nature of a cohort study, a case-control study looks back retrospectively at how frequently the exposure is present in cases and in controls to reach a conclusion.

Table 2: Study designs-2

MAMC Journal of Medical Sciences  ¦  Sep-Dec 2015  ¦  Volume 1  ¦  Issue 3


Selection of Cases

Two important aspects in the selection of cases are representativeness and the method of selection. Cases could be prevalent cases obtained from a cross-sectional study or incident cases obtained from a cohort study; other sources include a medical care facility or a disease registry.

Selection of Controls

The essential requirements in selecting a control are: (i) the control must be at risk of getting the disease and (ii) the control should resemble the case in all respects except for the presence of disease. This means that comparability is more important than representativeness in the selection of controls. Ideally, there should be one control for each case in a case-control study. In general, there is no further gain in statistical power if the number of controls per case is more than four. Controls are usually selected through individual matching or group matching with cases. An example is matching on age, either on the exact age or within a 5-year age group, between cases and controls.

Sampling in randomized control trial
Randomized control trials (RCTs) are experiments on humans, conducted in four phases. Phase I trials are done on normal volunteers to evaluate the maximum tolerated dose. Phase II trials evaluate primary effects and side effects in patients. Phase III trials are the actual randomized control trials for evaluating the efficacy of an intervention. Phase IV trials constitute postmarketing surveillance. These trials evaluate the impact of a specific intervention in improving health outcomes and involve time, complexity, and cost. In clinical research, RCTs are considered the gold standard.

The sampling of subjects in RCTs involves two important techniques, namely blinding and randomization. Blinding is the concealment of knowledge of treatment allocation from patients, care providers, or data analysts. Randomization, on the other hand, is the allocation of patients in a clinical trial by introducing a deliberate element of chance into the assignment. Its advantages are: (i) to ensure that each subject has an equal chance of assignment to any intervention under study, (ii) to produce comparable groups and attain validity for statistical tests, and (iii) to ensure that the groups are alike in all important aspects and differ only in the intervention each group receives. A confounder is a nuisance factor that gets in the way when studying the association between a risk factor and a disease. Ethical considerations, such as ensuring and optimizing the potential benefits while minimizing the potential harms to participants, are essential. For allocating subjects to different groups in a trial, free websites are available for various allocation designs.[2]

Estimation and Test of Hypothesis

An understanding of uncertainty begins with the estimation of central and interval values along with measures of variability. The basis for estimation depends on whether the data are qualitative or quantitative. Data are obtained by interviewing and examining patients and by recording the reports of investigations in a uniform format, such as a questionnaire, schedule, or proforma; these are called tools for data collection. Quantitative measures such as the mean and standard deviation (SD), and qualitative measures such as a frequency, proportion, ratio, or rate in percent, are computed to summarize data. These give an initial understanding of the hidden uncertainty. Such measures obtained from a sample are called statistics, and the same measures computed on the entire population are called parameters. Statistics are point estimates of population parameters. Statistical inference [Figure 1], which deals with the estimation of population parameters and statistical tests of significance, is drawn on the basis of sample statistics, and the findings are expected to apply to the entire target population. The entire population is almost never studied in a research setting; only samples are drawn from the target population. For estimation and tests of hypothesis, knowledge of some theoretical distributions is necessary.

Figure 1: Statistical methods

Use of normal or Student's t-distributions in medical decisions
Most biological variables have a nearly symmetric, bell-shaped distribution. In practice, as a rule of thumb, data are checked for gross nonnormality by using a histogram to look for an asymmetrical shape, by checking whether the central values (mean, median, and mode) differ too much rather than being approximately equal, and by checking whether the SD is large, to the extent of being greater than or equal to the mean. These are useful approaches for detecting gross nonnormality in the data, if any. An essential property of the normal distribution is that the range from mean − 2SD to mean + 2SD includes about 95% of observations [Figure 2]. This property is very useful in the estimation and test of hypothesis components of statistical inference on medical data. Another important property relating to the normal or Gaussian distribution is that even if the observations are far from normal or symmetrical, the sample means will have an approximately normal distribution for a large sample size. The distribution is called the standard normal distribution when the variable x under
study is transformed into Z (Z = [x − µ]/σ); the transformed variable Z has mean zero and variance one. When the population SD is unknown and is replaced by the sample SD, the distribution is similar but has a different name, Student's t-distribution [Figure 3]. For large samples, t-tests give virtually identical results to normal tests. Unlike the normal distribution, the t-distribution uses a quantity called degrees of freedom (df), which depends on the sample size. In general, df is the sample size minus the number of parameters being estimated. For estimating the mean from a sample of n observations, the df is n − 1.

Figure 2: Standard normal distribution (source: http://www.regentsprep.org/regents/math/algtrig/ats2/normallesson.htm. [Accessed 22 Jul 2015])

Figure 3: Standard normal and t-distribution (source: http://www.sjsu.edu/faculty/gerstman/StatPrimer/probability. [Accessed 22 Jul 2015])

Estimation
Sample estimates such as a mean or proportion tend to vary from sample to sample because of sampling variability or sampling fluctuation. It is important to understand how much uncertainty attaches to point estimates such as a mean or proportion. The measure of sampling variability, called the standard error (SE), can be estimated for a mean, a proportion, or any other estimate of interest. If an estimate is given as an interval rather than a single value, the interval is called a confidence interval (CI). The computation of the SE of a mean and the 95% CI of a mean is illustrated below. Replacing SD with SE in the expression mean ± 2SD gives the 95% CI for the mean (i.e., mean ± 2SE).

Suppose we have data on the cholesterol level of 300 children aged 3–12 years; what is the 95% CI of the mean? The computed mean and SD of the cholesterol level are 130 and 25, respectively. The SE of the mean for a sample of size n = 300 is SD/√n = 25/√300 = 1.44. The 95% CI for the mean cholesterol level is (mean − 2SE) to (mean + 2SE), or (130 − 2 × 1.44) to (130 + 2 × 1.44), or approximately 127–133. This is interpreted as 95% confidence that the population mean cholesterol level in children of 3–12 years of age lies in the interval (127–133). By contrast, the interval mean ± 2SD, that is, 130 ± 2 × 25 or (130 − 50) to (130 + 50), or 80–180, is the 95% range of observations in the sample under study. This shows the clear distinction between the interval of observations and the CI.

Concept of statistical significance and P value
Statistical significance is closely related to a confidence statement such as a 95% CI. A threshold of 95% confidence indicates that an uncertainty of 5% remains, which defines a critical region that becomes the basis for hypothesis testing. In statistical inference, hypotheses are formulated so that they can be refuted; such a hypothesis is called a null hypothesis or statistical hypothesis. "Null" indicates zero: in the null hypothesis, no difference (a zero difference) is assumed. For any null hypothesis there can be one-sided or two-sided alternatives. Suppose our interest is to examine whether the hemoglobin (Hb) level of children with chronic diarrhea is the same as that of healthy children. This is an example of a one-sided test because the Hb level in chronic diarrhea is not expected to be higher than the normal Hb level among healthy children. An example of a two-sided hypothesis is a comparison of Hb level in children undergoing two types of feeding practices. Figures 4-6 depict errors in decision making in the context of marketing a new drug, a diagnostic test, and a statistical test. The probability of wrongly rejecting a true null hypothesis is the probability of an error (type I) in statistical decision-making.

Figure 4: Errors in marketing a new drug setup

                    Marketed            Not marketed
Drug effective      Correct decision    Error
Drug ineffective    Error               Correct decision

Figure 5: Errors in diagnostic test setup

Disease status/gold standard    Test positive               Test negative
Disease positive                No error (true positive)    Error (false negative)
Disease negative                Error (false positive)      No error (true negative)
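The cholesterol example in the Estimation section can be checked numerically. The short sketch below uses only the figures given in the text (mean 130, SD 25, n = 300) and the rule-of-thumb multiplier of 2; the function and variable names are assumptions introduced for illustration.

```python
import math


def standard_error(sd, n):
    """Standard error of a mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


def interval(center, spread, multiplier=2.0):
    """Return (center - multiplier * spread, center + multiplier * spread)."""
    half_width = multiplier * spread
    return center - half_width, center + half_width


mean, sd, n = 130.0, 25.0, 300

se = standard_error(sd, n)
ci_low, ci_high = interval(mean, se)     # mean +/- 2SE: 95% CI of the mean
obs_low, obs_high = interval(mean, sd)   # mean +/- 2SD: 95% range of observations

print(f"SE of mean = {se:.2f}")                                        # 1.44
print(f"95% CI of mean = {ci_low:.1f} to {ci_high:.1f}")               # 127.1 to 132.9
print(f"95% range of observations = {obs_low:.0f} to {obs_high:.0f}")  # 80 to 180
```

The CI is narrow because it describes the uncertainty in the mean, whereas the much wider mean ± 2SD range describes the spread of individual observations.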

This probability of a type I error is also referred to as the P value. Its maximum acceptable value is generally kept at 0.05. This threshold of 5% is also called the level of significance. A result is called "statistically significant" when P < 0.05. The other important concept is not rejecting a false null hypothesis, which is another error (type II) in statistical decision-making. Type I and type II errors can be viewed as false positives and false negatives, respectively, in the setting of diagnostic accuracy testing. The probability of rejecting a false null hypothesis is called the power of the test. Power is thus the probability of getting a statistically significant result when the null hypothesis is false.

Figure 6: Errors in hypothesis testing setup

Actual position          Statistical decision on the basis of data
                         Reject null hypothesis    Do not reject null hypothesis
Null hypothesis false    Correct decision          Error (type II)
Null hypothesis true     Error (type I)            Correct decision

General Significance Test Procedure

The basis of any test procedure is to judge a sample statistic against a hypothesized value in relation to its SE. In the one-sample context, the sample mean (x̄) is compared with a hypothesized value (µ) in relation to the SE of the mean. The test criterion (based on Student's t-test) is:

$$t = \frac{\bar{x} - \mu}{\mathrm{SE}(\bar{x})}$$

with (n − 1) df. This ratio is used to reject or not reject the null hypothesis, depending on the computed value of t. The null hypothesis is rejected if the calculated value of t is more than the critical value of t given in the t-distribution table for a prefixed level of significance, for either a one-tailed or a two-tailed test. A calculated value of t above the critical value (1.96 for very large n in a two-tailed test at the 5% level) indicates that P is less than the threshold level of significance, such as 0.05 (5%). A value of P below the threshold probability (0.05) is interpreted as significant (P < 0.05). This basic procedure is the same in any test of significance; only the test statistic differs from one comparison to another. Testing a sample mean against a hypothetical population mean and testing the difference between two means are the common situations in tests of hypothesis using the t-test for quantitative data. The two-means comparison has two settings: one between two independent groups and the other in paired data. For assessing significance in qualitative data, tests such as the Chi-square, Fisher's exact, and McNemar tests are used. Where quantitative data are not normally distributed, nonparametric tests such as rank tests are used to assess statistical significance. Where more than two means are to be compared, a procedure called analysis of variance (ANOVA) is used, with the F-test for assessing overall significance and several other choices for pairwise comparisons; critical values are available for these distributions for use in decision making. For details of all these tests, other references may be consulted.[3,4]

Assessment of strength of association
There are more profound uncertainties in the assessment of the relationship between disease and exposure. For categorical variables, the association between disease and exposure is measured as the relative risk (RR), the risk difference, or the odds ratio (OR). For continuous variables, the strength of association is measured by computing the correlation coefficient (R) and the coefficient of determination (R-square).

Relative Risk

The RR is the ratio of the incidence rate among the exposed to that among the unexposed. The RR, or rate ratio, for an outcome event such as disease is calculated from the exposed and unexposed categories. Consider a prospective study following up women with and without HPV to observe the outcome of the cervical precancerous state, cervical intraepithelial neoplasia (CIN). Hypothetical data on 600 women are shown in Table 3. The incidence of CIN is 77/300 among women with HPV and 4/300 among women without HPV, so RR = (77/300)/(4/300) = 19.25, with 95% CI = 7.1–51.9, Chi-square = 76.1, and P < 0.001.

The exact P value obtained from the statistical package is very low. The RR of 19.25 is interpreted as follows: women with HPV have about 19 times the risk of developing CIN compared with women without HPV. That the null value RR = 1 is not included in the 95% CI also indicates the significance of RR = 19.25. The risk difference is the difference in incidence or risk between the exposed and the unexposed.

Odds Ratio

Case-control studies assess the frequency of exposure in cases with the disease and in controls without the disease. These frequencies are expressed as odds, and their ratio is the OR. The OR is approximately the same as the RR when the disease is rare. Consider a case-control study that evaluates the role of low birth weight in early neonatal mortality. Hypothetical data tabulated in a 2 × 2 contingency table are shown in Table 4.

The OR, its 95% CI, and the Chi-square test are as follows: OR = (47 × 55)/(15 × 11) ≈ 15.7, and the 95% CI of the OR is 6.5–37.4. Chi-square = 45.1 at 1 df; the P value, displayed as 0.000 by the package, is very low and can be reported as P < 0.001. The 95% CI is computed using a statistical package such as SPSS or Epi-info.[5] An OR of 15.7 indicates that the odds of death in neonates with low birth weight (<2000 g) are 15.7 times the odds of death in neonates without low birth weight. This is not the same as the RR.

Correlation coefficient
Scatter diagrams[4] are important for initial exploration of the relationship between two quantitative variables. The measure that assesses the strength of the linear (straight-line) relationship between two quantitative variables is called the correlation or Pearson's correlation
coefficient. The value of the correlation coefficient lies between −1 and +1, with the sign indicating negative or positive correlation. On the other hand, the relationship between two or more quantitative variables expressed in a structural form, for predicting one variable when the other is given, is called regression. The square of the correlation coefficient is called the coefficient of determination and is interpreted as the percentage of variation in one variable (dependent) explained by the other variable (independent) on which it is regressed.

Table 3: HPV status in relation to CIN

Exposure       CIN    Non-CIN    Total
HPV present     77        223      300
HPV absent       4        296      300
Total           81        519      600

HPV: Human papillomavirus, CIN: Cervical intraepithelial neoplasia

Table 4: Relationship between low birth weight and early neonatal mortality

Birth weight    Neonatal outcome     Total
                Death     Alive
<2000 g            47        11         58
≥2000 g            15        55         70
Total              62        66        128

Evaluation of Diagnostic Test Performance

Statistical measures for assessing the performance of a clinical (screening/diagnostic) test are sensitivity, specificity, and positive and negative predictive values. Sensitivity and specificity are useful for identifying or ruling out the disease and indicate the inherent quality of the test. These indicators do not depend on the prevalence of the disease in a population. Predictive values, in contrast, depend on the prevalence of the disease in the community in which the test is applied. Predictive values describe the probability that the test will give the correct diagnosis.

Sensitivity, specificity, and predictive values: The sensitivity of a screening/diagnostic test is the ability of the test to correctly identify those patients who have the disease. The specificity of a screening/diagnostic test is the ability of the test to correctly rule out persons without the disease. Positive and negative predictive values: The positive predictive value is the proportion of patients with positive test results who are correctly diagnosed. The negative predictive value is the proportion of patients with negative test results who are correctly diagnosed.

Positive and negative likelihood ratios
The positive likelihood ratio is the probability of a person who has the disease testing positive (sensitivity) divided by the probability of a person who does not have the disease testing positive (1 − specificity). The negative likelihood ratio is the probability of a person who has the disease testing negative (1 − sensitivity) divided by the probability of a person who does not have the disease testing negative (specificity).

Sample size determination
The number of subjects to be included in the sample of a research investigation is called the sample size. Sample size plays an important role in estimation and tests of hypothesis. The sample size of a proposed investigation should be calculated with the help of essential information based on scientific knowledge. Its determination depends on a variety of considerations. For estimation, these are: (i) the proposed method of sampling through which subjects are to be enrolled, since sample size formulas are generally based on simple random sampling; (ii) the level of precision within which the estimate is desired to fall; (iii) knowledge of variability, through the SD, when a mean is being estimated; and (iv) the desired confidence level, such as 95% or 99%. For tests of hypothesis, the required considerations are: (i) the magnitude of difference that is considered clinically significant; (ii) the assumption of a normal distribution for the data and the extent of variability, through the SD, when the variable of interest is quantitative; (iii) the level of significance, or maximum tolerable type I error, and the required statistical power for the specified clinically important difference; and (iv) whether the alternative hypothesis is one-tailed or two-tailed. Free statistical packages such as Epi-info and other available websites can be used to determine the sample size after the required inputs are provided.[6]

Conclusions

Findings from medical research need to be understood in order to put the emerging evidence into medical practice. Whatever the area of medicine in which investigations are done, knowledge of biostatistics as part and parcel of research methodology is essential. This article has outlined the steps from the research question through the design of the study, the choice of the sample of observations, and data analysis to the interpretation that provides the final conclusions of a research investigation; this outline is helpful in understanding the advances taking place in a particular area of medicine. This brief overview of the subject should serve as a quick summary of methods for UG and PG medical students in their short projects and thesis work.

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

References

1. Random Number Generator. Available from: http://www.stattrek.com/statistics/random-number-generator.aspx. [Last accessed on 2015 Jul 22].
2. Sealed Envelope Ltd. Simple Randomisation Service; 2015. Available from: https://www.sealedenvelope.com/simple-randomiser/v1/. [Last accessed on 2015 Jul 22].
3. Satyanarayana L, Asthana S. Relevance of statistical significance in medical research. Ganga Ram J 2014:3;107-213.
4. Indrayan A, Satyanarayana L. Simple Biostatistics for MBBS, PG Entrance and USMLE. 4th ed. Delhi: Academa Publishers; 2013.
5. Epi Info™ 7.1.5. Available from: http://www.cdc.gov/Epiinfo/7/index.htm. [Last accessed on 2015 Jul 22].
6. Web-based Sample Size/Power Calculations. Available from: http://www.stat.ubc.ca/~rollin/stats/ssize/. [Last accessed on 2015 Jul 22].
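The relative risk and odds ratio quoted for Tables 3 and 4 can be reproduced from the cell counts. The sketch below is a minimal illustration, assuming the usual log-scale 95% intervals with a multiplier of 1.96 and an uncorrected Pearson Chi-square; the article does not state which method its statistical package used, and the function names are hypothetical.

```python
import math


def relative_risk(a, b, c, d, z=1.96):
    """a, b: exposed with/without outcome; c, d: unexposed with/without outcome."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, math.exp(math.log(rr) - z * se_log), math.exp(math.log(rr) + z * se_log)


def odds_ratio(a, b, c, d, z=1.96):
    """Cross-product ratio ad/bc with a log-scale confidence interval."""
    odds = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds, math.exp(math.log(odds) - z * se_log), math.exp(math.log(odds) + z * se_log)


def chi_square(a, b, c, d):
    """Pearson Chi-square for a 2 x 2 table, without continuity correction (1 df)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Table 3: HPV present (CIN 77, non-CIN 223); HPV absent (CIN 4, non-CIN 296)
rr, rr_low, rr_high = relative_risk(77, 223, 4, 296)
chi_rr = chi_square(77, 223, 4, 296)
print(f"RR = {rr:.2f}, 95% CI {rr_low:.1f} to {rr_high:.1f}, Chi-square = {chi_rr:.1f}")

# Table 4: <2000 g (death 47, alive 11); >=2000 g (death 15, alive 55)
or_, or_low, or_high = odds_ratio(47, 11, 15, 55)
chi_or = chi_square(47, 11, 15, 55)
print(f"OR = {or_:.2f}, 95% CI {or_low:.1f} to {or_high:.1f}, Chi-square = {chi_or:.1f}")
```

Under these assumptions the RR, its interval, and both Chi-square values agree with the figures quoted in the text, and the OR interval agrees closely (the lower limit comes out near 6.56). A package that uses a different interval method or a continuity correction would give slightly different values.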