Biostatistics

National Institute of Veterinary Epidemiology and Disease Informatics (NIVEDI), Hebbal, Bangalore 560024

A statistician with his head in an oven and his feet in a bucket of ice water, when asked how he feels, responds: "On the average, I feel fine."

What is Research?

Research is the systematic process of collecting and analyzing information to increase our understanding of the phenomenon under study. It is the function of the researcher to contribute to the understanding of the phenomenon and to communicate that understanding to others.

Invention: Invest money to generate knowledge

Innovation: Invest knowledge to generate money

reasoning

science is to uncover the

truth.

We have our senses, through which we experience the world and make observations.

We have reason, which enables us to make logical inferences.

In science we impose logic on those observations.

Inductive Inference:

Statistics as the Technology of the Scientific Method

Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals.

Summarizing data: averages, percentages, presentation of tables and charts.

A major part of statistics involves the drawing of inferences from samples to a population in regard to some characteristic of interest.

In statistical reasoning, then, we make inductive inferences, from the particular (sample) to the general (population). Thus, statistics may be said to be the technology of the scientific method.

Pre-clinical testing

Investigational New Drug Application (IND)

Phase I (assess safety)

Phase II (test for effectiveness)

Phase III (large-scale testing)

Licensing (approval to use)

Approval (available for prescription)

Post-marketing studies (special studies and

long-term effectiveness/use)

Scientific enterprise

Values, Ethics and Standards in Scientific

Research

Research is based on the same ethical values that apply in everyday life, including honesty, fairness, objectivity, openness, trustworthiness, and respect for others.

A scientific standard refers to the application of these values in the context of research. Examples are openness in sharing research materials, fairness in reviewing grant proposals, respect for one's colleagues and students, and honesty in reporting research results.

Scientific misconduct

The most serious violations of standards have come to be known as scientific misconduct. The U.S. government defines misconduct as fabrication, falsification, or plagiarism (FFP) in proposing, performing, or reviewing research, or in reporting research results.

Scientists who violate standards other than FFP are said to engage in questionable research practices (QRPs). Scientists and their institutions should act to discourage questionable research practices through a broad range of formal and informal methods in the research environment.

Fabrication is making up data or results.

Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.

Plagiarism is the appropriation of another person's ideas, processes, results, or words without giving appropriate credit.

Questionable research practices, for example: deliberately dividing research results into the least publishable units to increase the count of one's publications.

Research

Discoveries made through scientific research can have great

value to researchers in advancing knowledge, to

governments in setting public policy, and to industry in

developing new products.

Researchers should be aware of this potential value and of

the interest of their laboratories and institutions in it, know

how to protect their own interests, and be familiar with the

rules governing the fair and proper use of ideas.

Intellectual property rights:

Benefiting from a new idea may require establishing intellectual property rights through patents and copyrights, or by treating the idea as a trade secret. Intellectual property is a legal right to control the application of an idea in a specific context.

Patent: control the application of ideas

Copyright: control the expression of ideas

Research methods vs Research methodology

Research methods: usually refers to specific activities designed to generate data (questionnaires, interviews, focus groups, observation, experiments)

Research methodology: is more about your attitude to and your understanding of research and the strategy you choose to correctly

Why do research?

Research allows you to gain appreciation for the practical

applications of knowledge, and to step outside your

classroom and learn about the theories, tools, resources, and

ethical issues that scholars and professionals encounter on a

daily basis.

1. Fascination

2. Answer to unsolved problems

3. Gain insight into a particular issue

4. Develop many transferable skills for the benefit of the community

5. Personal satisfaction and achievement

Research is a method of generating new ideas to be implemented in practice and checking whether they work.

Increasing Mind Power

NEGATIVE: Anger, Irritability, Jealousy, Blame, Complain, Anxiety, Inferiority, Ego personality

POSITIVE: Preoccupation/Work, Laughing therapy, Yoga, Music therapy, Read, Play

How to be Happy

Keep your heart free from hate, your mind from worry. Live simply, expect little, give much, sing often, pray always. Fill your life with love, scatter sunshine, forget

1. Explorative research

Undertaken when few or no previous studies exist. The aim is to look for patterns or hypotheses that can be tested and will form the basis for further research (e.g. meta-analysis).

2. Descriptive research

We describe the data-generating process. Description would answer questions such as: (1) What is the range of prostate volumes (ml) for a sample of urology patients? (2) What is the difference in average volume between patients with negative biopsy results and those with positive results?

3. Inferential research

We seek to infer characteristics of the data-generating process. Inference would answer questions such as: for a sample of patients with prostate problems, can we expect the average volume of patients with positive biopsy results to be less than that of patients with negative biopsy results?

4. Predictive research

We seek to make predictions about a characteristic of the data-generating process. Such prediction would answer questions such as: on the basis of a patient's negative digital rectal examination, prostate-specific antigen

Internal Validity

Internal validity is a crucial measure in quantitative studies, where it ensures that a researcher's experimental design closely follows the principle of cause and effect.

The researcher can eliminate almost all of the potential confounding variables and set up strong controls to isolate other factors.

External Validity

External validity is one of the most difficult of the validity types to achieve, and is at the foundation of every good experimental design.

External validity asks the question of generalizability: to what populations, settings, treatment variables and measurement variables can this effect be generalized?

effect of one risk factor by the presence of another.

Confounding occurs when another risk factor for a

disease is also associated with the risk factor being

studied but acts separately

Confounding can be controlled by

restriction, by matching on the

confounding variable or by including it in

the statistical analysis.

Bias (Systematic Error): Any process or

effect at any stage of a study from its design to

its execution to the application of information

from the study, that produces results or

conclusions that differ systematically from the

truth.

Extraneous variables

Extraneous variables are those having a relationship with the main variables (X and Y).

[Diagrams: roles a third variable Z can play relative to X and Y: confounder, mediator, moderator, suppressor, covariate]

Moderator: a moderator is a variable (Z) where X and Y have a different relationship at different levels of Z.

Decision theory is the theory of decisions. The subject is not a unified one. Decision theory is concerned with goal-directed behaviour in the presence of options.

making

Data analysis

Statistical tests

Predictions

Simulations

Accuracy is the degree of veracity (adherence to truthfulness).

Precision is the degree of reproducibility.

knowing which decision to take can

sometimes be the most painful...

Decision based on

sample data

Decision

making

system

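Criteria (surveillance system) versus actual condition (in population):

                               Actual: infection present       Actual: infection absent
Criteria: infection present    True Positive                   False Positive (Type II error)
Criteria: infection absent     False Negative (Type I error)   True Negative

The error labels follow the convention used in this deck, in which the null hypothesis (H0) is "infection present".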

Conclusion versus reality:

                                     Reality: effect does not exist   Reality: effect exists
Conclusion: effect does not exist    Correct decision                 Type 2 error
Conclusion: effect exists            Type 1 error                     Correct decision

Definition of population

Study design

Research hypothesis

Null hypothesis

Alternative hypothesis

One tailed/two tailed study

Sample size

Sample selection

Randomization

Blinding

Data collection procedures

Data management procedures

Suitable statistical method/s

Interpretations

Criteria (surveillance system) versus actual condition (in population):

                               Actual: infection present (H0)   Actual: infection absent (H1)
Criteria: infection present    True Positive (TP)               False Positive (Type II error)
Criteria: infection absent     False Negative (Type I error)    True Negative (TN)

Sensitivity = TP/(TP+FN)

Specificity = TN/(TN+FP)

PPV = TP/(TP+FP)

NPV = TN/(TN+FN)

Accuracy = (TP+TN)/N
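The formulas above can be checked with a few lines of code. The sketch below is only an illustration: the function name `diagnostic_metrics` and the counts are hypothetical and do not come from any real survey.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the five summary measures for a 2x2 table of counts."""
    n = tp + fp + fn + tn  # N in the accuracy formula
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / n,
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the test looks sensitive and specific, yet one in four positive results is false, which is why PPV is reported alongside sensitivity.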

so that control action can be quickly instituted

Monitoring: systematic collection, analysis and dissemination of information about the level (occurrence, incidence and prevalence) of infections or disease that are known to occur in a specified population.

Surveys: are the tools for data collection. An investigation using the systematic collection of information from a population that is not under the control of the investigator.

Passive surveillance: passive or general surveillance typically takes the form of a disease reporting system. If a producer notices a disease problem, this is reported and recorded in a systematic manner.

Active surveillance: uses structured disease surveys to collect high-quality disease information quickly and cheaply.

Representative Sample: one that is Similar to Population.

Inference is valid only when a representative sample is chosen

Random Sample: Every element in the population has the same

probability of being selected in the sample.
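A simple random sample of this kind can be drawn with Python's standard library. This is a minimal sketch: the sampling frame of 500 animal IDs, the sample size and the helper name `simple_random_sample` are all assumptions made for illustration.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw repeatable for auditing
    return rng.sample(population, k)


herd_ids = list(range(1, 501))  # hypothetical sampling frame of 500 animal IDs
sample = simple_random_sample(herd_ids, k=50, seed=42)
print(sorted(sample)[:10])
```

Sampling is without replacement, so no animal can appear twice in the sample.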

sun rises in the east

Fact:sun neither rises nor sets, only

earth rotates

Moral:

Education spoils our

Exclusion criteria

for defining study population

All clinical trials have guidelines about who can participate; these are specified in the inclusion/exclusion criteria:

Factors that allow someone to participate in a clinical trial are

"inclusion criteria"

Factors that exclude or do not allow participation in a clinical

trial are "exclusion criteria"

These factors may include: Age, Gender, The type and stage of

disease, Previous treatment history, Specific lab values, Other

medical conditions..

Inclusion and exclusion criteria are not used to reject people

personally. The criteria are used to:

1. Identify appropriate participants

2. Keep them safe

3. Help ensure that researchers can answer the questions they

want answered

One of the crucial components of successful research or trial is the

Testing as a Scientific

Knowledge

1.

you are asking

variable(s) to answer that question

the appropriateness of generalization

proposed statistical power


Study designs

The design of a study is more important than the analysis. A badly designed study can never be retrieved, whereas a poorly analysed one can usually be re-analysed.

because the design of a study will govern how the

data are to be analysed.

Study Designs

Observational Studies:

exposed to factor of interest and who is not

Cross-Sectional studies

Case-Control studies

Prospective

Experimental Study Designs:

Investigator

Feasibility study:

In vitro studies, case series studies, pilot

studies

Case series

Descriptive account of an interesting

characteristic

In one patient

In a small group of patients

Usually involves patients seen over a short

period of time

Does not involve controls

No research hypothesis

Leads to formulation of hypotheses, other

types of studies

Cross-sectional studies

Analyze data collected at a single point in time

Provide information on the status quo (e.g. prevalence of a condition, or disease characteristics)

Quick to complete, cheap

Cannot examine outcomes

May lead to biased conclusions about disease progression

Longitudinal, retrospective design

Starts with the outcome

Cases: those with the outcome

Controls: those without the outcome

Case-control advantages

Shorter, cheaper, and useful for studying rare diseases or diseases that take a long time to manifest, or for exploring preliminary hypotheses

Case-control disadvantages

Difficult to control for bias

May depend entirely on quality of existing records

Can be difficult to designate an appropriate control group

Advantages

Disadvantages

Quick

disease

numbers

Reasonably economical

influenced by exposure)

disease

incidence

No loss to follow-up

reliable than either randomized

controlled trials or cohort

studies.

Consistency of measurement

easily maintained

RETROSPECTIVE STUDY

Experimental Design

where the investigator determines who is exposed. These

may prove causation

Determine causes:

True experimental design is regarded as the most accurate form of

experimental research, in that it tries to prove or disprove a

hypothesis mathematically, with statistical analysis.

A double blind experiment is an experimental method used

to ensure impartiality, and avoid errors arising from bias.

Quasi Experimental Design: where lack of randomization

Prospective Study

designs

The most powerful studies are prospective studies, and the paradigm for these is the randomised controlled trial. In this design, subjects with a disease are randomised to one of two (or more) treatments, one of which may be a control treatment.

A parallel group design is one in which treatment and control are allocated to different individuals. To allow for the therapeutic effect of simply being given treatment, the control may consist of a placebo, an inert substance that is physically identical to the active compound.
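As a rough illustration of allocation in a parallel group design, the sketch below shuffles the subject list and deals subjects to arms in turn. The helper name `randomise_parallel`, the arm labels and the 20 subject IDs are hypothetical; real trials usually use a pre-specified scheme such as blocked or stratified randomisation.

```python
import random


def randomise_parallel(subject_ids, arms=("treatment", "placebo"), seed=None):
    """Shuffle subjects, then deal them to arms in turn so group sizes stay balanced."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: arms[i % len(arms)] for i, sid in enumerate(shuffled)}


allocation = randomise_parallel(range(1, 21), seed=7)  # 20 hypothetical subjects
for sid in sorted(allocation):
    print(sid, allocation[sid])
```

Each subject appears in exactly one arm, which is what distinguishes this from a crossover design.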

Randomized Controlled

Study

groups for purposes of comparison.

2. The selection of control groups depends on the objectives of the study. In the evaluation of traditional medicine, a concurrent control group should be used.

The control groups may involve (not in order of priority):

well established treatment

non-treatment

different doses of the same treatment

sham or placebo treatment

full-scale treatment

minimal treatment

alternative treatment

A crossover study is one in which two or more treatments are applied sequentially to the same subject.

The advantages are that each subject then acts as their own control and so fewer subjects may be required.

The main disadvantage is that there may be a carry-over effect, in that the action of the second treatment is affected by the first treatment.

Cohort studies

Cohort: a group of people who have something in common and who

remain part of a group over an extended period of time

Outcomes determined after follow-up: longitudinal, prospective

studies

COHORT STUDY

Advantages/Disadvantages of Cohort

study

Advantages

Disadvantages

Can collect exposure

information as exposure

happens

decades to complete

different exposures

over time

be relatively reliable

the study

outcome happens

Very expensive

Can you afford to wait

decades for your answer

Comparison of Case-Control and Cohort Study Design

treatment (or exposure),

Cohort works from treatment (or exposure) to outcome (or

presence of disease).

defined cohort are identified and, for

each, a specified number of matched

controls is selected from among those

in the cohort who have not developed

the disease by the time of disease

occurrence in the case

in costs and efforts of data collection and

analysis compared with the full cohort

approach, with relatively minor loss in

statistical efficiency

Case-Cohort

studies

The case-cohort design is most useful in analyzing time to

failure in a large cohort in which failure is rare.

Covariate information is collected from all failures and a

representative sample of censored observations.

Sampling is done without respect to time or disease status,

and, therefore, the design is more flexible than a nested

case-control design.

Despite the efficiency of the methods, case-cohort designs

are not often used because of perceived analytic

complexity.

IMPROVED EXPERIMENTAL DESIGN

[Figure: withdrawal design. Performance (better upward) plotted against time for the active and placebo groups, illustrating the slope effect / disease-modifying effect.]

Blinding

To further eliminate bias, randomized trials are sometimes "blinded" (also called masked).

Single-blinded trials are those in which participants do not know which group they are in, and therefore which intervention they are receiving, until the conclusion of the study.

Double-blinded trials are those in which neither the participant nor the investigators know to which group the participant has been assigned until the conclusion of the study.

Triple-blinded trials are those in which the statistician is also blinded.

between a mother's education and risk of a

congenital heart defect in her offspring. The

investigator enrolls a group of mothers of babies

with birth defects and a group of mothers of

babies without birth defects. The mothers are

then asked a series of questions about their

education

1.Case series

2.Case-control study

3.Nested case-control study,

4.Prospective cohort study

5. Retrospective cohort study,

6. Randomized clinical trial,

consumption and performance on a

memory test randomly assigns half of the

enrolled subjects to drink coffee one hour

before taking the memory test and the

other half to not drink coffee one hour

before taking the memory test.

1. Case series

2. Case-control study

3. Nested case-control study,

4. Prospective cohort study

5. Retrospective cohort study,

6. Randomized clinical trial,

7.Cross-sectional study,

8.Ecological study and

between meat consumption and heart

disease compares the average number of

kilograms of meat consumed per person

for 50 different countries to the incidence

rate of heart disease in the same 50

countries

1. Case series

2. Case-control study

3. Nested case-control study,

4. Prospective cohort study

5. Retrospective cohort study,

6. Randomized clinical trial,

7.Cross-sectional study,

8.Ecological study and

whom suffer from migraine with aura and experienced an

ischemic stroke.

1. Case series

2. Case-control study

3. Nested case-control study,

4. Prospective cohort study

5. Retrospective cohort study,

6. Randomized clinical trial,

7.Cross-sectional study,

8.Ecological study and

9.Case cohort study

individuals and distributes questionnaires to

collect information on sex and blood type. The

investigator then examines the association

between sex and blood type.

1. Case series

2. Case-control study

3. Nested case-control study,

4. Prospective cohort study

5. Retrospective cohort study,

6. Randomized clinical trial,

7.Cross-sectional study,

8.Ecological study and

9.Case cohort study

records to identify a group of retired factory

workers. He reviews each person's medical

records to follow their factory exposures over

time and see which of these subjects has

developed skin cancer in the past 25 years.

1. Case series

2. Case-control study

3. Nested case-control study,

4. Prospective cohort study

5. Retrospective cohort study,

6. Randomized clinical trial,

7.Cross-sectional study,

8.Ecological study and

9.Case cohort study

Girl: For Suicide..

Boy: Then, Why so Much Make-Up?

Girl: You Idiot..!! Tomorrow My Photo

will Come In Newspaper..........

Superiority study:

A trial or research with the objective of showing that the response to the investigational product is superior to a comparative agent (active or placebo control).

Equivalence study:

Research with the objective of showing that the difference between control and study treatments is not large in either direction. The investigational product is compared to a reference treatment without the objective of showing superiority.

Non-significant/significant results are good outcome

Non-inferiority study:

If the study objective is to demonstrate that

Measurement error

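If more than one operator is used in the study, then measurement (gauge) error has two components of variance:

Repeatability: $\sigma^2_{\text{repeatability}}$ (instrument)

Reproducibility: $\sigma^2_{\text{reproducibility}}$ (operators)

$$\sigma^2_{\text{total}} = \sigma^2_{\text{product}} + \sigma^2_{\text{gauge}}$$

$$\sigma^2_{\text{gauge}} = \sigma^2_{\text{repeatability}} + \sigma^2_{\text{reproducibility}}$$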

What is Hypothesis

testing?

parameter. This assumption may or may not be true. Hypothesis

testing refers to the formal procedures used by statisticians to

accept or reject statistical hypotheses.

Statistical Hypothesis

The best way to determine whether a statistical hypothesis is true

would be to examine the entire population. Since that is often

impractical, researchers typically examine a random sample from

the population. If sample data are not consistent with the

statistical hypothesis, the hypothesis is rejected.

Null hypothesis. The null hypothesis, denoted by H0, is usually the

hypothesis that sample observations result purely from chance.

Alternative hypothesis. The alternative hypothesis, denoted by H1

or Ha, is the hypothesis that sample observations are influenced by

some non-random cause.

Example

For example, suppose we wanted to determine whether a coin was fair

and balanced. A null hypothesis might be that half the flips would result in

Heads and half, in Tails. The alternative hypothesis might be that the

number of Heads and Tails would be very different. Symbolically, these

hypotheses would be expressed as

H0: P = 0.5

Ha: P ≠ 0.5

Suppose we flipped the coin 50 times, resulting in 40 Heads and 10 Tails.

Given this result, we would be inclined to reject the null hypothesis. We

would conclude, based on the evidence, that the coin was probably not

fair and balanced.
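The coin example can be made concrete with an exact binomial calculation. The sketch below is one possible way to do it, using only the standard library; the function name `binomial_two_sided_p` is an assumption, and the two-sided p-value is defined here as the total probability of all outcomes no more likely than the observed one under H0.

```python
from math import comb


def binomial_two_sided_p(heads, flips, p0=0.5):
    """Exact two-sided p-value for H0: P = p0 against Ha: P != p0."""
    pmf = [comb(flips, k) * p0**k * (1 - p0) ** (flips - k) for k in range(flips + 1)]
    observed = pmf[heads]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))


p_value = binomial_two_sided_p(heads=40, flips=50)
print(f"p-value = {p_value:.2e}")
```

For 40 heads in 50 flips the p-value is far below any conventional threshold, which matches the informal conclusion that the coin is probably not fair.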

A hypothesis test can have one of two outcomes: rejection of the null hypothesis or failure to reject the null hypothesis.

Why the distinction between "acceptance" and "failure to reject"?

Acceptance implies that the null hypothesis is true.

Failure to reject implies that the data are not sufficiently persuasive for us to prefer the alternative hypothesis over the null hypothesis.

The null hypothesis, H0, represents a theory that has been put

forward, either because it is believed to be true or because it

is to be used as a basis for argument, but has not been

proved. For example, in a clinical trial of a new drug, the null

hypothesis might be that the new drug is no better, on

average, than the current drug. We would write

H0: there is no difference between the two drugs on average.

We give special consideration to the null hypothesis. This is due

to the fact that the null hypothesis relates to the statement

being tested, whereas the alternative hypothesis relates to

the statement to be accepted if / when the null is rejected.

The final conclusion once the test has been carried out is always

given in terms of the null hypothesis. We either "Reject H0 in

favour of H1" or "Do not reject H0"; we never conclude

"Reject H1", or even "Accept H1".

If we conclude "Do not reject H0", this does not necessarily

mean that the null hypothesis is true, it only suggests that

there is not sufficient evidence against H0 in favour of H1.

Null Hypothesis:

Suppose we are testing

the efficacy of a new drug on

Example

We divide the patients into two groups, drug and no drug, according to good design procedures, and use as our criterion measure mortality in the two groups.

It is our hope that the drug lowers mortality, but to test the hypothesis statistically, we have to set it up in a sort of backward way.

We say our hypothesis is that the drug makes no difference, and what we hope to do is to reject the "no difference" hypothesis, based on evidence from our sample of patients.

This is known as the null hypothesis.

Alternative Hypothesis

We test this against an alternate hypothesis, known as HA , that the

difference in death rates between the two groups does not equal 0.

We then gather data and note the observed difference in

mortality between group A and group B.

If this observed difference is sufficiently greater than zero, we

reject the null hypothesis.

If we reject the null hypothesis of no difference, we accept the alternate

hypothesis, which is that the drug does make a difference.

1. I will assume the hypothesis that there is no difference is true;

2. I will then collect the data and observe the difference between the two groups;

3. If the null hypothesis is true, how likely is it that by chance alone I would get results such as these?

4. If it is not likely that these results could arise by chance under the assumption that the null hypothesis is true, then I will conclude it is false, and I will accept the alternate hypothesis.
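The four steps above map naturally onto a permutation test, sketched below. This is only one way to answer step 3, not necessarily the method the lecture had in mind. The death counts (10 of 100 on drug, 25 of 100 without) are invented for illustration, and `permutation_p_value` is a hypothetical helper.

```python
import random


def permutation_p_value(deaths_a, n_a, deaths_b, n_b, n_perm=5000, seed=0):
    """Estimate how often chance alone gives a mortality gap at least as large as observed."""
    rng = random.Random(seed)
    total_deaths = deaths_a + deaths_b
    # Step 1: under H0 the group labels are arbitrary, so pool all outcomes.
    outcomes = [1] * total_deaths + [0] * (n_a + n_b - total_deaths)
    # Step 2: the observed difference in death rates.
    observed = abs(deaths_a / n_a - deaths_b / n_b)
    hits = 0
    for _ in range(n_perm):
        # Step 3: reassign outcomes to groups at random and recompute the gap.
        rng.shuffle(outcomes)
        perm_a = sum(outcomes[:n_a])
        perm_b = total_deaths - perm_a
        if abs(perm_a / n_a - perm_b / n_b) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed arrangement itself, so the estimate is never 0.
    return (hits + 1) / (n_perm + 1)


p = permutation_p_value(deaths_a=10, n_a=100, deaths_b=25, n_b=100)
print(f"estimated p-value = {p:.4f}")
```

Step 4 is then the comparison of this estimate with the chosen significance level.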

Suppose we believe that drug A is better than drug B in preventing death

from a heart attack.

Why don't we test that belief directly and see which drug is better, rather

than testing the hypothesis that drug A is equal to drug B?

The reason is that there is an infinite number of ways in which

drug A can be better than drug B, so we would have to test an

infinite number of hypotheses.

If drug A causes 10% fewer deaths than drug B, it is better. So first

we would have to see if drug A causes 10% fewer deaths.

If it doesn't cause 10% fewer deaths, but if it causes 9% fewer deaths, it is

also better.

Then we would have to test whether our observations are

consistent with a 9% difference in mortality between the two

drugs.

A one-tailed test looks for an increase or

decrease in the parameter whereas

a two-tailed test looks for any change in

the parameter (which can be any

change- increase or decrease).

When is a one-tailed test appropriate?

Because the one-tailed test provides more power to detect an effect, you

may be tempted to use a one-tailed test whenever you have a hypothesis

about the direction of an effect. Before doing so, consider the

consequences of missing an effect in the other direction. Imagine you

have developed a new drug that you believe is an improvement over an

existing drug. You wish to maximize your ability to detect the

improvement, so you opt for a one-tailed test. In doing so, you

fail to test for the possibility that the new drug is less effective

than the existing drug. The consequences in this example are

extreme, but they illustrate a danger of inappropriate use of a

one-tailed test.

So when is a one-tailed test appropriate? If you consider the

consequences of missing an effect in the untested direction and conclude

that they are negligible and in no way irresponsible or unethical,

then you can proceed with a one-tailed test.

When is a one-tailed test NOT appropriate?

Choosing a one-tailed test for the sole purpose of attaining significance is not appropriate. Choosing a one-tailed test after running a two-tailed test that failed to reject the null hypothesis is not appropriate, no matter how "close" to significant the two-tailed test was.

examples

We could use a one-tailed test to see if the stream has a higher pH than one year ago, for which we would use the alternate hypothesis HA: prev < current.

However, we may want a more rigorous test, for the hypothesis HA: prev ≠ current. This covers both directions, prev < current and prev > current, so rejecting the null hypothesis tells us that there is a significant difference between the means in one direction or the other.
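The difference between the two kinds of test shows up directly in the p-value. The sketch below assumes the test statistic is a standard normal z (current minus previous, standardised); the value z = 1.80 and the function names are hypothetical.

```python
from math import erfc, sqrt


def upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


def one_tailed_p(z):
    """HA: prev < current, so only a large positive z counts as evidence."""
    return upper_tail(z)


def two_tailed_p(z):
    """HA: prev != current, so a large |z| in either direction counts."""
    return 2 * upper_tail(abs(z))


z = 1.80  # hypothetical standardised difference
print(f"one-tailed p = {one_tailed_p(z):.4f}")
print(f"two-tailed p = {two_tailed_p(z):.4f}")
```

With this z the one-tailed test falls below 0.05 and the two-tailed test does not, which is exactly why the choice must be made before looking at the data.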

Factorial Designs

In this experiment, the process engineer's goal is to determine how the yield of an adhesive application process can be improved by adjusting three (3) process parameters: mixture ratio, curing temperature, and curing time. For each of these input parameters, two levels will be defined for use in this 2-level experiment. For the mix ratio, the high level is set at 55%, while the low level is set at 45%. For the curing temperature, the high level is set at 150 deg C while the low level is set at 100 deg C. For the curing time, the high level is set at 90 minutes, while the low level is set at 30 minutes. As mentioned, the output response monitored is process yield. Assume further that the data were gathered by performing just a single replicate (n=1) per combination treatment.
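The run list for this 2-level, 3-factor experiment can be enumerated mechanically. The sketch below uses the levels stated above; the dictionary keys are names chosen here for illustration, and the run order is not randomised as it normally would be in practice.

```python
from itertools import product

# Low and high level for each factor, as given in the example.
factors = {
    "mix_ratio_pct": (45, 55),
    "cure_temp_c": (100, 150),
    "cure_time_min": (30, 90),
}

# Full factorial: every combination of levels, 2 * 2 * 2 = 8 runs.
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]
for i, run in enumerate(runs, start=1):
    print(i, run)
```

With a single replicate per combination, these eight runs are the whole experiment.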

ADVANTAGES

Factorial designs are the ultimate designs of choice whenever we are interested in examining treatment variations.

Factorial designs are efficient. Instead of conducting a series of independent studies, we are effectively able to combine these studies into

P value

The probability value (p-value) of a statistical hypothesis test is the probability of getting a value of the test statistic as extreme as or more extreme than that observed by chance alone, if the null hypothesis H0 is true.

It is compared with the significance level, which is the probability of wrongly rejecting the null hypothesis if it is in fact true.

1. Significant figures

+ Suggestive significance (P value: 0.05 < P < 0.10)

* Moderately significant (P value: 0.01 < P ≤ 0.05)

** Strongly significant (P value: P ≤ 0.01)

The confidence level is the statistical measure of the number of times out of 100 that results can be expected to be within a specified range.

For example, a confidence level of 90% means that results of an action will probably meet expectations 90% of the time.

The basic idea described in the Central Limit Theorem is that when a population is repeatedly sampled, the average of the sample values of an attribute approaches the true population value, and the sample means are approximately normally distributed around it. In other words, if the confidence level is 95%, it means about 95 out of 100 samples will have the true population value within the range of precision.
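The "95 out of 100 samples" reading can be checked by simulation. The sketch below is an illustration under stated assumptions: a normal population with mean 50 and standard deviation 10, samples of size 30, and the usual large-sample interval of the mean plus or minus 1.96 standard errors. None of these numbers come from the lecture.

```python
import random
from statistics import mean, stdev


def ci_coverage(true_mean=50.0, sd=10.0, n=30, n_samples=2000, z=1.96, seed=1):
    """Fraction of repeated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        half_width = z * stdev(sample) / n**0.5  # z times the standard error
        if abs(mean(sample) - true_mean) <= half_width:
            covered += 1
    return covered / n_samples


coverage = ci_coverage()
print(f"intervals containing the true mean: {coverage:.1%}")
```

The observed fraction should land close to 95%, typically slightly below it for small samples because 1.96 is used in place of the t critical value.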

Degree of Variability.

Depending upon the target population and attributes under

consideration, the degree of variability varies considerably. The

more heterogeneous a population is, the larger the sample size is

required to get an optimum level of precision. Note that a

Measurements

Identity : each value on the measurement scale

has unique meaning

Magnitude: Values on the measurement scale

have an ordered relationship to one another. That

is, some values are larger and some are smaller.

Equal intervals. Scale units along the scale are

equal to one another. This means, for example,

that the difference between 1 and 2 would be

equal to the difference between 19 and 20.

A minimum value of zero. The scale has a true

zero point, below which no values exist.

Types of Measurements

1. Nominal Scale of Measurement

The nominal scale of measurement only satisfies the identity property

of measurement.

2. Ordinal Scale of Measurement

The ordinal scale has the property of both identity and magnitude

3. Interval Scale of Measurement

The interval scale of measurement has the properties of identity,

magnitude, and equal intervals.

4. Ratio Scale of Measurement

The ratio scale of measurement satisfies all four of the properties of

measurement: identity, magnitude, equal intervals, and a minimum

value of zero.

Standard Deviation

No effect: d < 0.20

Mild effect: 0.20 < d < 0.50

Moderate effect: 0.50 < d < 0.80

Large effect: 0.80 < d < 1.20

Very large effect: d > 1.20

Effect size is a measure of the strength of the relationship between two variables in a statistical population, or a sample-based estimate of that quantity. An effect size calculated from data is a descriptive statistic that conveys the estimated magnitude of a relationship without making any statement about whether the apparent relationship in the data reflects a true relationship in the population.

from before to after

treatment, divided by the standard deviation at

baseline.
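The before-and-after effect size and the d thresholds listed above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function names and the scores are hypothetical, and the handling of values that fall exactly on a boundary is an assumption.

```python
from statistics import mean, stdev

def effect_size_pre_post(before, after):
    """Mean change from before to after, divided by the baseline SD."""
    return (mean(after) - mean(before)) / stdev(before)

def label_effect(d):
    """Map |d| to the labels listed above (boundary handling is an assumption)."""
    size = abs(d)
    if size < 0.20:
        return "No effect"
    if size < 0.50:
        return "Mild effect"
    if size < 0.80:
        return "Moderate effect"
    if size <= 1.20:
        return "Large effect"
    return "Very large effect"

before = [10, 12, 14, 16, 18]   # hypothetical baseline scores
after = [12, 14, 16, 18, 20]    # hypothetical post-treatment scores
d = effect_size_pre_post(before, after)
print(round(d, 3), label_effect(d))
```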

Marks!!

Student:

Mix Chilli Powder with Sugar, & keep It Outside the Ant's

Hole..!

After eating, Ant will Search for some Water near a Water

tank.

Push ant in to it.. =!!

When it Reaches fire, Put a Bomb into D fire..!!

Then Admit Wounded Ant in ICU..!! =O

And Then Remove Oxygen Mask from it's Mouth and Kill the

Ant.. !! =|

MORAL:

Don't Play with Students.. !!

Sample Size

Determining the sample size to be selected is an important step in any

research study. For example let us suppose that some researcher wants to

determine prevalence of eye problems in school children and wants to

conduct a survey.

The important question that should be answered in all sample

surveys is "How many participants should be chosen for a survey"?

However, the answer cannot be given without considering the

objectives and circumstances of investigations.

The choosing of sample size depends on non-statistical

considerations and statistical considerations. The non-statistical

considerations may include availability of resources, manpower,

budget, ethics and sampling frame.

The statistical considerations will include the desired precision of

the estimate of prevalence and the expected prevalence of eye

problems in school children.

The Level of Precision: Also called sampling error, the level of precision is the range in which the true value of the population is estimated to be.

This range is expressed in percentage points. Thus, if a researcher finds

Sample size

Estimation of Optimum Sample Size

Too few samples give insignificant results

Too many samples make unnecessary significant results and a costly experiment

Criteria for Estimation of sample

size

1.Type I error (0.05 or 5%)

2.Statistical Power( 0.80 or 80%)

3.Expected difference

4.Standard deviation

Sample size :

Dose Escalation

Dose limiting toxicity (DLT) must

be defined

Decide a few dose levels (e.g. 4)

At least three patients will be

treated on each dose level

(cohort)

Not a power or sample size

calculation issue

Dose Escalation

Enroll 3 patients

If 0/3 patients develop DLT

Escalate to new dose

If DLT is observed in 1 of 3

patients

Expand cohort to 6

Escalate if 3/3 new patients do not

develop DLT (i.e. 1/6 develop DLT)

Dose Escalation

Maximum Tolerated Dose (MTD)

Dose level immediately below the

level at which 2 patients in a

cohort of 3 to 6 patients

experienced a DLT

MTD or a maximum dosage that is

pre-specified in the protocol
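The escalation rules above can be written as a small decision function for a single dose level. This is only a sketch: the function name and the returned strings are hypothetical, and it assumes that any cohort with two or more DLTs stops escalation, with the MTD taken as the dose level immediately below.

```python
def three_plus_three(dlt_first3, dlt_next3=None):
    """Decision at one dose level.

    dlt_first3: number of DLTs among the first 3 patients.
    dlt_next3: number of DLTs among the 3 added patients, or None if
               the cohort has not been expanded yet.
    """
    if dlt_first3 == 0:
        return "escalate"                      # 0/3 with DLT
    if dlt_first3 == 1:
        if dlt_next3 is None:
            return "expand cohort to 6"        # 1/3 with DLT
        if dlt_next3 == 0:
            return "escalate"                  # 1/6 with DLT
    # two or more DLTs in a cohort of 3 to 6 patients
    return "stop: MTD is the dose level immediately below"

print(three_plus_three(0))
print(three_plus_three(1))
print(three_plus_three(1, 0))
print(three_plus_three(1, 1))
```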

Patients to Enroll?

Ratio of two arms: 1:1, 1:1.5 or 1:2

Power of study: minimum 80.0%, or 1 - β = 0.80

Difference of outcome

Standard deviation

One tailed/Two tailed

Type I error, α = 0.05/0.01

$$N = \frac{Z_{\alpha/2}^{2} \, P \, (1 - P) \, D}{E^{2}}$$

where P is the prevalence or proportion of the event of interest for the study, and E is the precision (or margin of error) with which a researcher wants to measure something. Generally E will be 10% of P. $Z_{\alpha/2}$ is the normal deviate for a two-tailed alternative hypothesis at the α level of significance; for example, for a 5% level of significance $Z_{\alpha/2}$ is 1.96 and for a 1% level of significance it is 2.58, as shown in table 2.

D is the design effect, which reflects the sampling design used in survey-type studies. It is 1 for simple random sampling and takes higher values (usually 1 to 2) for other designs such as stratified, systematic or cluster random sampling, estimated to compensate for the deviation from the simple random sampling procedure. The design effect for cluster random sampling is taken as 1.5 to 2.

For the purposive sampling, convenience or judgment sampling,

survey type of studies

Example: A researcher wants to know the sample size for conducting a survey to measure the prevalence of obesity in a certain community. Previous literature gives the estimate of obesity at 20% in the population to be surveyed. Assuming a 95% confidence interval (5% level of significance) and a 10% margin of error, the sample size can be calculated as follows:

$$N = \frac{(1.96)^{2} \times 0.20 \times (1 - 0.20)}{(0.02)^{2}} = 1536.6 \approx 1537$$

A sample of 1537 is required to conduct a community-based survey to estimate the prevalence of obesity. Note: E is the margin of error; in the present example it is 10% x 0.20 = 0.02.

To find the final adjusted sample size, allowing a non-response rate of 10% in the above example, the adjusted sample size will be 1537/(1-0.10) = 1537/0.90 = 1708.
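The obesity example can be checked with a short script. This is a minimal sketch under the same assumptions as the example (simple random sampling, so the design effect is 1, and E taken as 10% of P); the function names are hypothetical.

```python
import math

def sample_size_prevalence(p, rel_error=0.10, z=1.96, design_effect=1.0):
    """N = Z^2 * P * (1 - P) * D / E^2, with E = rel_error * P."""
    e = rel_error * p
    n = (z ** 2) * p * (1 - p) * design_effect / (e ** 2)
    return math.ceil(n)

def adjust_for_nonresponse(n, rate):
    """Inflate the sample size for an expected non-response rate."""
    return math.ceil(n / (1 - rate))

n = sample_size_prevalence(0.20)            # obesity prevalence of 20%
n_adjusted = adjust_for_nonresponse(n, 0.10)
print(n, n_adjusted)
```

Passing a design effect of 1.5 or 2 would give the corresponding figure for a cluster design.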

group mean

$$N = \frac{Z_{\alpha/2}^{2} \, s^{2}}{d^{2}}$$

where s is the standard deviation obtained from a previous study or pilot study, d is the accuracy of the estimate (how close it should be to the true mean), and $Z_{\alpha/2}$ is the normal deviate for a two-tailed alternative hypothesis at the α level of significance.

Example: In a study estimating the weight of a population, the researcher wants the error of estimation to be less than 2 kg of the true mean (that is, the expected difference of weight is 2 kg). The sample standard deviation was 5, and with a probability of 95% (that is, at an error rate of 5%) the sample size is estimated as N = (1.96)² (5)² / 2², which gives a sample of 24 subjects, if the allowance of 10% for missing, losses to follow-up,

$$N = \frac{(r + 1)\,(Z_{\alpha/2} + Z_{1-\beta})^{2}\,\sigma^{2}}{r\,d^{2}}$$

where $Z_{\alpha/2}$ is the normal deviate at the α level of significance ($Z_{\alpha/2}$ is 1.96 for a 5% level of significance and 2.58 for a 1% level of significance) and $Z_{1-\beta}$ is the normal deviate at 1-β power with β the type II error (0.84 at 80% power and 1.28 at 90% statistical power). σ is the common standard deviation and d is the difference between the two means. r = n1/n2 is the ratio of the sample sizes required for the two groups; generally it is one, to keep equal sample sizes for the two groups. r = 0.5 gives a sample size distribution of 1:2 for the two groups.

Let us say a clinical researcher wants to compare the effect of two drugs, A and B, on systolic blood pressure (SBP). On literature search the researcher found the mean SBP in the two groups to be 120 and 132, with a common standard deviation of 15. The total sample size for the study with r = 1 (equal sample size), α = 5% and power of 80% was computed as 24, and for 90% statistical power the sample size will be 32. In unequal sample size of 1:2 (r = 0.5) with statistical power of 90% at 5% level
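The two-means formula can be evaluated directly for the blood pressure example. This is a sketch only: the function name is hypothetical, the normal deviates are the rounded values quoted in this section, and the function returns the unrounded value of the formula. For r = 1 it gives about 24.5 and 32.8, which the example reports as 24 and 32.

```python
def sample_size_two_means(mean1, mean2, sd, r=1.0, z_alpha=1.96, z_beta=0.84):
    """N = (r + 1) * (Z_alpha/2 + Z_1-beta)^2 * sd^2 / (r * d^2)."""
    d = mean1 - mean2
    return (r + 1) * (z_alpha + z_beta) ** 2 * sd ** 2 / (r * d ** 2)

n_80 = sample_size_two_means(120, 132, 15, z_beta=0.84)   # 80% power
n_90 = sample_size_two_means(120, 132, 15, z_beta=1.28)   # 90% power
print(round(n_80, 1), round(n_90, 1))
```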

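$$N = \frac{\left[ Z_{\alpha/2} \sqrt{2\,\bar{p}\,(1 - \bar{p})} + Z_{1-\beta} \sqrt{p_{1}(1 - p_{1}) + p_{2}(1 - p_{2})} \right]^{2}}{(p_{1} - p_{2})^{2}}$$

where $p_{1}$ and $p_{2}$ are the proportions of the event of interest (outcome) for group I and group II, $\bar{p}$ is $(p_{1} + p_{2})/2$, $Z_{\alpha/2}$ is the normal deviate at the α level of significance and $Z_{1-\beta}$ is the normal deviate at 1-β power with β the type II error; normally the type II error is considered 20% or less.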

If the researcher is planning to conduct a study with unequal groups, he or she must calculate N as if equal groups were being used, and then calculate the modified sample size. If r=n1/n2 is the ratio of sample size in two groups, then the required

Example: It is believed that the proportion of patients

who develop complications after undergoing one type of

surgery is 5% while the proportion of patients who

develop complications after a second type of surgery is

15%. How large should the sample be in each of the two

groups of patients if an investigator wishes to detect,

with a power of 90% , whether the second procedure has

a complications rate significantly higher than the first at

the 5% level of significance?
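One way to work this example is to evaluate the standard two-proportion formula, with the pooled proportion taken as the average of the two proportions. The sketch below makes two assumptions that the question does not settle: the function name is hypothetical, and the test is treated as one-sided because the question asks whether the second rate is higher. The `two_sided` flag switches to the two-tailed deviate.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.90, two_sided=True):
    """Per-group sample size for comparing two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

n_one_sided = sample_size_two_proportions(0.05, 0.15, two_sided=False)
n_two_sided = sample_size_two_proportions(0.05, 0.15, two_sided=True)
print(n_one_sided, n_two_sided)
```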

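Sample size for a correlation coefficient

$$N = \frac{(Z_{\alpha/2} + Z_{1-\beta})^{2}}{\frac{1}{4}\left[\log_{e}\left(\frac{1 + r}{1 - r}\right)\right]^{2}} + 3$$

Example: The correlation between salt intake and systolic blood pressure is around 0.30. A study is conducted to test this correlation in a population, with a significance level of 1% and power of 90%. The sample size for such a study can be estimated as follows:

$$N = \frac{(2.58 + 1.28)^{2}}{\frac{1}{4}\left[\log_{e}\left(\frac{1 + 0.3}{1 - 0.3}\right)\right]^{2}} + 3 = \frac{14.90}{0.0958} + 3 \approx 159$$

The sample size for 90% power at the 1% level of significance is about 159 for a two-tailed alternative test and about 139 for a one-tailed test.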
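A minimal sketch of this calculation, using the Fisher z-transformation in its usual textbook form, N = ((Zα/2 + Z1-β) / C)² + 3 with C = ½ ln((1 + r)/(1 - r)). The function name is hypothetical and the normal deviates are the rounded values quoted in this section (2.58 two-tailed and 2.33 one-tailed at the 1% level, 1.28 for 90% power).

```python
import math

def sample_size_correlation(r, z_alpha, z_beta):
    """Fisher z based sample size for detecting a correlation r."""
    c = math.atanh(r)              # equals 0.5 * ln((1 + r) / (1 - r))
    return ((z_alpha + z_beta) / c) ** 2 + 3

n_two_tailed = sample_size_correlation(0.30, z_alpha=2.58, z_beta=1.28)
n_one_tailed = sample_size_correlation(0.30, z_alpha=2.33, z_beta=1.28)
print(round(n_two_tailed, 1), round(n_one_tailed, 1))
```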

Rule of thumb

Two groups: n = 16 s²/d²

Three groups: n = 22 s²/d²

Four groups: n = 26 s²/d²

Pre-post design: 50% less

Cross-over design: 25% of two tailed sample size

where s is the within standard deviation and d is the smallest difference of means

Unequal allocation ratio (k:1)

Increase total sample size by (k+1)²/4k

Total sample size for two groups is 26

Want in 2:1

Then 26 x 9/8 ≈ 30 (20:10)

Factors affecting the sample size

1. The sample size increases as the SD increases

2. The sample size increases as the significance level is made smaller (<0.05)

3. The sample size increases as the required power increases (>0.80)

4. The sample size increases as the difference to be detected decreases
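The (k+1)²/4k inflation for an unequal allocation ratio, quoted above with the 26-subject example, can be sketched as follows. The function name is hypothetical, and rounding the inflated total up to a whole number of subjects is an assumption.

```python
import math

def inflate_for_allocation(total_equal, k):
    """Inflate a 1:1 total sample size for a k:1 allocation ratio."""
    factor = (k + 1) ** 2 / (4 * k)
    total = math.ceil(total_equal * factor)
    smaller = round(total / (k + 1))      # size of the smaller arm
    larger = total - smaller
    return total, larger, smaller

total, larger, smaller = inflate_for_allocation(26, 2)   # 2:1 allocation
print(total, larger, smaller)
```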

n=111

n=141

n=69

for =0.70

yoga any effect over your husbands

drinking habit?

Woman: Yes, An Amazing Funny

Effect !! Now he drinks the whole

bottle standing upside down over his

head.

Selection of Controls

Generally the control : case ratio is 1:1

1. Historical controls: comparison with a group treated in an earlier period using another form of therapy/intervention

2. Geographical control: comparison with a group treated elsewhere with a different form of therapy/intervention

3. Volunteer control: may not be a matched group

4. Concurrent control: control group observed simultaneously with the treated group

Placebo control vs active control: A placebo is an inactive pill, liquid or powder that has no treatment value. A control is the standard by which the experimental observations are evaluated. In many clinical trials an

Case-Control Study

Sample size calculation says n = 13 cases and controls are needed

Only have 11 cases!

Want the same precision

n0 = 11 cases

k n0 = number of controls

$$k = \frac{n}{2 n_{0} - n}$$

k = 13 / (2 x 11 - 13) = 13 / 9 = 1.44

k n0 = 1.44 x 11 ≈ 16 controls (and 11 cases)

Same precision as 13 controls and 13 cases
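The control-to-case ratio k = n / (2 n0 - n) can be computed directly. This is a small sketch with a hypothetical function name; rounding the number of controls up to a whole subject is an assumption, and the formula only applies when 2 n0 exceeds n.

```python
import math

def controls_needed(n, n0):
    """Controls per case (k) and total controls when only n0 cases exist."""
    if 2 * n0 <= n:
        raise ValueError("too few cases: need 2 * n0 > n")
    k = n / (2 * n0 - n)
    return k, math.ceil(k * n0)

k, n_controls = controls_needed(n=13, n0=11)
print(round(k, 2), n_controls)
```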

Question by a student !!

If a single teacher can't

teach us all the subjects,

Then...

How could you expect a single

student

to learn all subjects ?

Randomization

An important aspect of any research, which should be clearly stated in the final report, is the method used to assign treatments (or other interventions) to participants.

Random assignment has been used for more than 50 years and is the preferred method of assignment.

Randomization eliminates the source of bias in treatment assignment.

Subjects in the various groups should not differ in any systematic way. If treatment groups are systematically different, research results will be biased.

Inadequate randomization can overestimate the treatment effect by 40%.
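A simple way to assign treatments at random is to shuffle the participant list and deal participants out to the arms in turn. The sketch below is one possible scheme with equal group sizes, not the method of any particular trial; the function name, the arm labels and the participant IDs are hypothetical, and the seed is there only so that the allocation list can be reproduced for the final report.

```python
import random

def randomize(participants, arms=("A", "B"), seed=None):
    """Shuffle participants, then assign arms in rotation."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {pid: arms[i % len(arms)] for i, pid in enumerate(shuffled)}

allocation = randomize(range(1, 21), seed=42)   # 20 hypothetical participant IDs
group_sizes = {arm: list(allocation.values()).count(arm) for arm in ("A", "B")}
print(group_sizes)
```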

research

Protocol

A protocol is the study plan on which all clinical trials are based. The plan is carefully

designed to safeguard the health of the

participants as well as answer specific

research questions. A protocol describes

what types of people may participate in the

trial; the schedule of tests, procedures,

medications, and dosages; and the length of

the study. While in a clinical trial,

participants following a protocol are seen

regularly by the research staff to monitor

their health and to determine the safety and

effectiveness of their treatment.

Informed Consent An agreement signed by all volunteers

participating in a clinical research study, indicating their

understanding of:

(2)what researchers hope to learn;

(3)what will be done during the trial and for how

long;

(4)what risks are involved;

(5)what, if any, benefits can be expected from

the trial;

(6)what other interventions are available;

(7)the participant's right to leave the trial at any

time.

analysis

Intent-to-treat Analysis (ITT): Full Analysis data Set

Patients in a trial may be assigned to one treatment group but, for a variety of reasons, receive another treatment (withdrawal or failure to comply). If this occurs, subjects should be analyzed as if they had completed the study in their assigned treatment group. If the composition of the treatment groups is altered, one negates the intention of a randomized trial: to have a random distribution of unmeasured characteristics that may affect the outcome (confounders). Regardless of protocol deviations, subject compliance or withdrawal, analysis is performed according to the assigned treatment group. ITT admits non-compliance and protocol deviations.

Per Protocol Analysis (PP): Per Protocol Analysis data Set, or efficacy sample, or evaluable sample

Only patients who sufficiently complied with the trial protocol are considered in the analysis. Compliance covers exposure to treatment, availability of measurements and absence of major protocol violations, also called as

Treatment trials

Prevention trials

Vaccines, medicines, minerals etc

Diagnostic trials

diseases

Screening trials

disease or health conditions

Quality of Life

trials

comfort and quality of life for

individuals with Chronic illness

Phase I

small group of healthy people (20-80) for the first time to

evaluate its safety, determine a safe dosage range, and

identify side effects

Phase II

larger group of patients (100-300) to see if it is effective

and to further evaluate its safety.

Phase III

groups of Patients (1,000-3,000) to confirm its

effectiveness, monitor side effects, compare it to

commonly used treatments, and collect information that

will allow the experimental drug or treatment to be used

safely.

Phase IV

including the drug's risks, benefits, and optimal use.

to him,

Little boy: "Teacher are you sleeping in class?"

Teacher: "No I am not sleeping in class."

Little boy: "What were you doing sir ?"

Teacher: "I was talking to God."

The next day the naughty boy fell asleep in class and the same

teacher walks up to him...

Teacher: "young man, you are sleeping in my class."

Little boy: "No not me sir, I am not sleeping."

Angry teacher: "What were you doing.??"

Little boy: "I was talking to God."

Angry teacher: "What did He say??"

Data management

Involves screening for missing data

Is the missing data due to incomplete data collection?

Is the missing data due to non-response?

Is the pattern of missing data random?

Data validity: Screening for data validity; wrong

entry etc

Outliers: Screening for outliers

Normality test:

test

Shapiro-Wilk W test

Since 1960, the principle of intent-to-treat (ITT) has become widely accepted for the analysis of controlled clinical trials. In this context, the question of how to perform such an analysis in the presence of missing information about a main endpoint is of major importance. 10-20% additional samples are required to adjust for the drop-out rate.

Missing data can lead to a drastic increase in the Type I error and a substantial decrease in the power of the study.

Type of Missing:

1. Missing completely at random (MCAR): when missing values are randomly distributed across all observations. It can be confirmed by dividing respondents into those with and without missing data and using the t-test or chi-square test to establish that the two groups do not differ significantly.

2. Missing at random (MAR): a condition which exists when missing values are not randomly distributed across all the observations but are randomly distributed within one or more subsamples.

Methods of estimating missing values

1. LOCF (Last observation carry forward)

2. Mean value

3. By Regression method
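The first two methods listed above can be sketched in plain Python, with `None` standing for a missing value. The function names and the visit values are hypothetical; the regression method is not shown because it depends on which covariates are available.

```python
from statistics import mean

def locf(values):
    """Last observation carried forward; leading gaps stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def mean_impute(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]

visits = [120, 118, None, 116, None]   # hypothetical repeated measurements
print(locf(visits))
print(mean_impute(visits))
```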

SAMPLING

It is a scientific method of data collection.

The main principle behind sampling is that we seek knowledge about the total units(called

population) by observing a few units(called sample) and extend our inference about the

sample to the entire population.

SAMPLING AND CENSUS

IN CENSUS METHODOLOGY EACH AND EVERY ELEMENT OF THE UNIVERSE IS CONTACTED,

WHEREAS IN SAMPLING METHODOLOGY A FEW ELEMENTS ARE SELECTED FROM THE UNIVERSE FOR THE RESEARCH.

NEED OF SAMPLING

1. POPULATION IN MANY CASES MAY BE SO LARGE AND SCATTERED THAT A

COMPLETE COVERAGE MAY NOT BE POSSIBLE.

2. IT OFFERS A HIGH DEGREE OF ACCURACY BECAUSE IT DEALS WITH A SMALL

NUMBER OF PERSONS.

3. IN SHORT PERIOD OF TIME VALID AND COMPARABLE RESULTS CAN BE

OBTAINED

4. SAMPLING IS LESS DEMANDING IN TERMS OF REQUIREMENTS OF

INVESTIGATORS.

5. IT IS ECONOMICAL SINCE IT CONTAINS FEWER PEOPLE.

For the purpose of disease control, improving the health and productivity of animals or aquatic animals and thereby the well-being of the people, information is needed to:

1. Determine the level and location of diseases

2. Determine the importance of different diseases

3. Set priorities for the use of resources for disease control activities

4. Plan, implement, and monitor disease control programs

5. Respond to disease outbreaks

6. Meet reporting requirements of international organisations (OIE, WHO, etc.)

7. Demonstrate disease status for trading partners

Who is the target group for the study? This is called the study population.

Who in the target group should be surveyed? This is called the sample.

How many people should be surveyed? This is called the sample size.

How are the people to be surveyed selected? This is decided by the sampling method.

PRINCIPLES OF SAMPLING

1. SAMPLE UNITS MUST BE CHOSEN IN A SYSTEMATIC AND OBJECTIVE

MANNER.

2. SAMPLE UNITS MUST BE CLEARLY DEFINED AND EASILY IDENTIFIABLE

3. SAMPLE UNITS MUST BE INDEPENDENT OF EACH OTHER.

4. THE SAME UNITS OF SAMPLE SHOULD BE USED THROUGHOUT THE

STUDY.

5. THE SELECTION PROCESS SHOULD BE BASED ON SOUND CRITERIA

AND SHOULD AVOID

ERRORS, BIAS AND DISTORTIONS.

[Figure: Population and Sample]

[Figure: Convenience Sampling. The researcher selects subjects on ease of obtaining them or simple availability]

[Figure: Volunteer Sampling. Subjects volunteer ("I'll do it!") to take part in the research]

[Figure: Network/Snowball Sampling. A subject suggests others who may be willing to participate]

[Figure: Random sampling. Population size: 20; selected members: 2, 6, 7, 12, 18. A sample is drawn by randomly selecting members of the population]

[Figure: Systematic sampling. A random start in the sequence is selected, and the sample is selected at a fixed increment. Increment: 20/5 = 4]
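Systematic selection with a random start and a fixed increment can be sketched as follows, using a population of 20 and a sample of 5 so that the increment is 20/5 = 4. The function name is hypothetical and the population size is assumed to be a multiple of the sample size.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    interval = len(population) // sample_size     # the increment k
    rng = random.Random(seed)
    start = rng.randrange(interval)               # random start: 0 .. k-1
    return [population[start + i * interval] for i in range(sample_size)]

population = list(range(1, 21))                   # units numbered 1 to 20
sample = systematic_sample(population, 5, seed=1)
print(sample)
```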

Cluster Sampling

Probability Sampling

1. Random Sampling

Simple random sampling is one of the most popular types of random or probability sampling.

2. Stratified Sampling

Stratified sampling is a probability sampling technique wherein the researcher divides the entire population into different subgroups or strata, then randomly selects the final subjects proportionally from the different strata.

3. Systematic Sampling

Systematic sampling is a random sampling technique which is

frequently chosen by researchers for its simplicity and its

periodic quality.

than the total number of individuals in the population. This integer will

correspond to the first subject.

Interval: The researcher picks another integer which will serve as the

constant difference between any two consecutive numbers in the

4. Cluster Sampling

In cluster sampling, instead of selecting all the subjects from

the entire population right off, the researcher takes several

steps in gathering his sample population.

5. Area Sampling

An area probability sample is one in which geographic areas are

sampled with known probability. While an area probability sample

design could conceivably provide for selecting areas that are

themselves the units being studied,

In survey research an area probability sample is usually one in

which areas are selected as part of a clustered or multi-stage

design. In such designs, households, individuals, businesses, or

other organizations are studied, and they are sampled within the

geographical areas selected for the sample.

Area sampling is basically multistage sampling in which maps,

rather than lists or registers, serve as the sampling frame. This is

the main method of sampling in developing countries where

adequate population lists are rare. The area to be covered is

divided into a number of smaller sub-areas from which a sample is

selected at random within these areas; either a complete

enumeration is taken or a further sub-sample.

6. Multi-stage Sampling

A multi-stage sample is one in which sampling is done

sequentially across two or more hierarchical levels, such as

first at the county level, second at the census tract level, third

at the block level, fourth at the household level, and ultimately

at the within-household level.

Single-stage samples include simple random sampling, systematic

random sampling, and stratified random sampling. In single-stage

samples, the elements in the target population are assembled into a

sampling frame; one of these techniques is used to directly select a

sample of elements.

In contrast, in multi-stage sampling, the sample is selected in

stages, often taking into account the hierarchical (nested)

structure of the population. The target population of elements

is divided into first-stage units, often referred to as primary

sampling units (PSUs), which are the ones sampled first. The

selected first-stage secondary .

IN THIS METHOD , SAMPLING IS SELECTED IN VARIOUS STAGES

7. MULTI-PHASE SAMPLING

IN THIS TYPE OF SAMPLING THE PROCESS IS SAME AS IN

MULTI-STAGE SAMPLING . IN THIS EACH SAMPLE IS

ADEQUATELY STUDIED BEFORE ANOTHER SAMPLE IS DRAWN

FROM IT.

IN MULTISTAGE SAMPLING ONLY THE FINAL SAMPLE IS STUDIED

WHEREAS

IN MULTI-PHASE SAMPLING ALL SAMPLES ARE RESEARCHED.

Summing up.

Hypothesis Tests

Statisticians follow a formal process to determine whether to

reject a null hypothesis, based on sample data. This process, called

hypothesis testing, consists of four steps.

1.State the hypotheses. This involves stating the null and

alternative hypotheses. The hypotheses are stated in such a way

that they are mutually exclusive. That is, if one is true, the other

must be false.

2.Formulate an analysis plan. The analysis plan describes how to

use sample data to evaluate the null hypothesis. The evaluation

often focuses around a single test statistic.

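3.Analyze sample data. Find the value of the test statistic (mean score, proportion, t-score, z-score, etc.) described in the analysis plan.

4.Interpret results. Apply the decision rule described in the analysis plan; if the value of the test statistic is unlikely under the null hypothesis, reject the null hypothesis.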

Non-Probability Sampling:

1.Convenience Sampling

Convenience sampling is a non-probability sampling technique where subjects are selected because of their convenient accessibility and proximity to the researcher.

2. Sequential Sampling

Sequential sampling is a non-probability sampling technique wherein the researcher picks a single subject or a group of subjects in a given time interval, conducts the study, analyzes the results, then picks another group of subjects if needed, and so on.

3. Quota Sampling

Quota sampling is a non-probability sampling technique wherein

the assembled sample has the same proportions of individuals

as the entire population with respect to known characteristics,

traits or focused phenomenon.

4. Judgmental Sampling

Judgmental sampling is a non-probability sampling technique where the

researcher selects units to be sampled based on their knowledge and

professional judgment.

5. Snowball sampling

Snowball sampling is a non-probability sampling technique that is used

by researchers to identify potential subjects in studies where subjects

are hard to locate.

Decision Errors

Two types of errors can result from a hypothesis test.

Type I error. A Type I error occurs when the researcher rejects a null hypothesis when it is true. The probability of committing a Type I error is called the significance level. This probability is also called alpha, and is often denoted by α.

Type II error. A Type II error occurs when the researcher fails to reject a null hypothesis that is false. The probability of committing a Type II error is called beta, and is often denoted by β. The probability of not committing a Type II error is called the power of the test.

Decision Rules

The analysis plan includes decision rules for rejecting the null hypothesis.

P-value. The strength of evidence in support of a null hypothesis is measured by the P-value. Suppose the test statistic is equal to S. The P-value is the probability of observing a test statistic as extreme as S, assuming the null hypothesis is true. If the P-value is less than the significance level, we reject the null hypothesis.

Region of acceptance. The region of acceptance is a range of

values. If the test statistic falls within the region of acceptance, the null

hypothesis is not rejected. The region of acceptance is defined so that

the chance of making a Type I error is equal to the significance level.
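The P-value rule can be illustrated for a test statistic that follows the standard normal distribution under the null hypothesis. This is a sketch under that assumption only; the function names are hypothetical and the z values are made up for illustration.

```python
from statistics import NormalDist

def p_value_z(z_stat, two_sided=True):
    """Probability of a statistic at least as extreme as z_stat under H0."""
    tail = 1 - NormalDist().cdf(abs(z_stat))
    return 2 * tail if two_sided else tail

def decide(p_value, alpha=0.05):
    """Reject H0 when the P-value is below the significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"

for z_stat in (1.5, 2.1):
    p = p_value_z(z_stat)
    print(z_stat, round(p, 4), decide(p))
```

The equivalent region of acceptance for a two-tailed test at the 5% level is the range of z from -1.96 to 1.96.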

One tailed or two tailed

radio.

Shopkeeper: No, I sold a good radio to you.

Car driver: Radio label shows "Made in Japan" but

radio says: This is all India Radio.

sheet

CONSTRUCTION OF

SCALES

Construction of

questionnaire or scale

What a person knows (knowledge of

information)

What a person likes and dislikes (values and

preferences)

What a person thinks (opinions, attitudes,

beliefs, perceptions)

What experiences have taken place

(biography)

What is occurring at present (facts)

Likert scales were developed in 1932 as the familiar five-point bipolar

response format most people are familiar with today. These scales always

ask people to indicate how much they agree or disagree, approve or

disapprove, believe to be true or false.

Frequency

Important

Quantity

Anonymity and Confidentiality

An anonymous study is one in which nobody (not even the study directors)

can identify who provided data on completed questionnaires.

The Length of a Questionnaire: As a general rule, long questionnaires

get less response than short questionnaires (Brown, 1965; Leslie, 1970).

Color of the Paper : Berdie, Anderson and Neibuhr (1986) suggest that

color might make the survey more appealing.

Incentives: Many researchers have examined the effect of providing a

variety of nonmonetary incentives to subjects. These include token gifts

such as small packages of coffee, ball-point pens, postage stamps, or key

rings .

The "Don't Know", "Undecided", and "Neutral" Response Options

Response categories are developed for questions in order to facilitate the

process of coding and analysis. Respondents were more likely to choose the

"undecided" category when it was off to the side of the scale.

Question Wording

Phase I: Item generation

Face Validity

Content Validity

Phase II: Scale Development (pilot

study)

Construct validity

Criterion related validity

Reliability

Internal Consistency

Phase III: Scale Evaluation (Large scale

data collection)

Reliability

Internal Consistency

Discriminant validity

validity

Sources of statements and experts for developing the Entrepreneurial skills questionnaire for graduate students

Sources                              Number of statements      %
Previous literature
  Thesis                                      40              8.0
  Peer reviewed papers                        40              8.0
  Bulletins/Manuals/Annual report             30              6.0
Experts consulted
  Professors                                  75             15.0
  Student Entrepreneurs                       25              5.0
  Entrepreneurs                               50             10.0
Internet source
  Entrepreneurship websites                  100             20.0
  Online library                              15              3.0
  Online journal                              20              4.0
  Discussion forum                             5              1.0
Others                                       100             20.0
Total                                        500            100.0

Description                                                       Number of statements      %
Number of statements screened at face validity phase                      250            100.0
Number of statements evaluated by experts                                 250            100.0
Number of statements that satisfied Aiken's Index >0.70                   110             44.0
Number of statements that did not satisfy Aiken's Index                   140             56.0
Number of statements considered for pilot study                           110             44.0

Entrepreneurial skills    scale             scale          Correlation    Significance
General                   11.17±42.85       33.20±5.99     0.528          <0.001**
Managerial                108.68±41.25      33.20±5.99     0.591          <0.001**
Manufacturing             93.68±31.56       33.20±5.99     0.522          <0.001**
Marketing                 85.99±33.02       33.20±5.99     0.599          <0.001**
Total                     399.72±132.38     33.20±5.99     0.604          <0.001**

Aiken's Index, item-total correlation and factor loadings

Statements

1. I am a person who is ready to take responsibility
2. I want to be economically independent
3. I feel responsible for my mistakes and take corrections
4. I persevere till I can achieve my dream
5. I have a supportive network of friends, family and advisers
6. I'm flexible and able to take advice
7. I am a person who makes decision within a reasonable time frame
8. I set goals & articulate a vision
9. It is important to me to make a mark in this life
10. I have self confidence and self esteem
11. I have a strong need to work independently
12. I don't start something without a clear vision and plan of action
13. Once I start a project I pursue it in spite of challenges

Statement   Aiken's Index   Item-Total correlation   Factor 1   Factor 2   Factor 3   Factor 4
1           0.850           0.847                    0.848      0.291      0.265      0.243
2           0.850           0.858                    0.821      0.298      0.291      0.263
3           0.850           0.856                    0.833      0.312      0.263      0.256
4           0.750           0.850                    0.832      0.278      0.293      0.251
5           0.750           0.863                    0.827      0.317      0.261      0.277
6           0.750           0.858                    0.828      0.297      0.283      0.264
7           0.750           0.848                    0.824      0.290      0.283      0.255
8           0.750           0.855                    0.802      0.312      0.276      0.280
9           0.850           0.857                    0.823      0.312      0.261      0.273
10          0.850           0.850                    0.844      0.291      0.253      0.266
11          0.900           0.852                    0.831      0.274      0.266      0.291
12          0.900           0.846                    0.823      0.305      0.262      0.256
13          0.850           0.839                    0.843      0.277      0.263      0.248

Table 6: Exploratory Factor analysis: Extraction and Rotation Sums of Squared Loadings

             Initial eigenvalues               Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings (Varimax)
Component    Total   % of Variance   Cum. %    Total   % of Variance   Cum. %        Total   % of Variance   Cum. %
1            74.83   68.03           68.03     74.83   68.03           68.03         27.49   24.99           24.99
2            7.95    7.23            75.26     7.95    7.23            75.26         23.83   21.66           46.65
3            6.58    5.98            81.24     6.58    5.98            81.24         22.15   20.13           66.78
4            5.27    4.79            86.03     5.27    4.79            86.03         21.18   19.25           86.03

Cronbach Alpha (Consistency) Co-efficient based on Pilot study

Entrepreneurial skills   Number of items   Max score   Cronbach alpha   Correlation   Reliability Index   P value     Remark
General                  30                150         0.996            0.7381        0.8493              <0.001**    Highly reliable
Managerial               30                150         0.994            0.7610        0.8643              <0.001**    Highly reliable
Manufacturing            25                125         0.993            0.6652        0.7990              <0.001**    Very reliable
Marketing                25                125         0.990            0.7149        0.8337              <0.001**    Highly reliable
Total                    110               550         0.996            0.7707        0.8705              <0.001**    Highly reliable

VALIDITY AND

RELIABILITY

Validity refers to the accuracy or truthfulness of a measurement. Are we measuring what we think we are? "Validity itself is a simple concept, but the determination of the validity of a measure is elusive."

Face validity is based solely on the judgment of the researcher.

Content validity: Expert opinions, literature searches, and pretest open-ended questions help to establish content validity.

Criterion-related validity can be either predictive or concurrent. Predictive validity refers to the ability of an independent variable (or group of variables) to predict a future value of the dependent variable. Concurrent validity is concerned with the relationship between two or more variables at the same point in time.

Construct validity looks at the underlying theories or constructs that explain a phenomenon.

Reliability is synonymous with repeatability. A measurement that yields consistent results over time is said to be reliable.

Test-retest reliability: administering the same instrument to the same group of people at two different points in time.

Equivalent-form technique: the researcher creates two different instruments designed to measure identical constructs.

Split-half reliability: a measure of internal consistency.
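A split-half correlation is commonly stepped up to a full-length reliability with the Spearman-Brown formula, 2r / (1 + r). Whether the Reliability Index column of the pilot-study table was computed this way is an assumption, but applying the formula to the Correlation column agrees with the tabulated values to about three decimals. The function name is hypothetical.

```python
def spearman_brown(r_half):
    """Full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)

# Correlation column of the pilot-study table, by sub-scale
half_correlations = {
    "General": 0.7381,
    "Managerial": 0.7610,
    "Manufacturing": 0.6652,
    "Marketing": 0.7149,
    "Total": 0.7707,
}
reliability = {name: spearman_brown(r) for name, r in half_correlations.items()}
for name, value in reliability.items():
    print(name, round(value, 4))
```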

Statistical Methods

OUTCOME        COMPARISON                PARAMETRIC                         NON-PARAMETRIC
MEAN/SD        ONE GROUP                 -                                  RUN TEST
NO(%)          ONE GROUP                 -                                  EXACT TESTS, CHI-SQUARE TEST, FISHER EXACT TEST
MEAN/SD        TWO GROUP                 STUDENT T TEST                     -
MEAN/SD        TWO GROUP (PRE-POST)      STUDENT T TEST (PAIRED)            WILCOXON SIGNED RANK TEST
NO(%)          TWO GROUP                 -                                  CHI-SQUARE TEST, FISHER EXACT TEST
MEAN/SD        THREE OR MORE GROUPS      ANOVA                              KRUSKAL-WALLIS TEST
NO(%)          THREE OR MORE GROUPS      -                                  CHI-SQUARE TEST, FISHER EXACT TEST
RELATIONSHIP                             PEARSON CORRELATION, REGRESSION    SPEARMAN CORRELATION

The Beck Depression Inventory. The 1970 version of the Beck Depression Inventory is a 21-item self-report inventory. Each item consists of four alternative statements that represent gradations of a given symptom rated in severity from 0 to 3. The scale is scored by summing the item ratings; the total scores can range from 0 to 63. The instrument was either self-administered by the patients or read aloud by one of the research assistants.

The Hopelessness Scale. The Hopelessness Scale consists of 20 true-false statements that assess the extent of pessimism. Each of the 20 items is scored 1 or 0; the total score is the sum of the individual item scores. The possible range of scores is from 0 to 20. The method of administration was similar to the procedure used for the Beck scale.

The Scale for Suicide Ideation. This scale quantifies the severity of

current suicidal ideas and wishes. It was developed on the basis of

systematic clinical observations and interviews with suicidal patients. The

scale includes 19 items; each of these is composed of three choices that

range from 0 (least severe) to 2 (most severe) of the given construct. The

total score is computed by summing the item ratings; the scores can range

from 0 to 38. The items quantify the frequency and duration of suicidal

thoughts as well as the patients' attitudes toward them.
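The scoring rule shared by the three scales (sum the item ratings, with the maximum total equal to the number of items times the highest rating) can be sketched in Python as follows. The item counts and rating ranges come from the descriptions above. The names `SCALES`, `max_total` and `total_score` are illustrative, and the validation behaviour is an assumption.

```python
# (number of items, highest rating per item), as described in the text.
SCALES = {
    "Beck Depression Inventory": (21, 3),
    "Hopelessness Scale": (20, 1),
    "Scale for Suicide Ideation": (19, 2),
}

def max_total(scale_name):
    """Largest possible total score for a scale."""
    n_items, max_rating = SCALES[scale_name]
    return n_items * max_rating

def total_score(scale_name, item_ratings):
    """Sum the item ratings after checking their count and range."""
    n_items, max_rating = SCALES[scale_name]
    if len(item_ratings) != n_items:
        raise ValueError(f"expected {n_items} item ratings")
    if any(r < 0 or r > max_rating for r in item_ratings):
        raise ValueError(f"ratings must lie between 0 and {max_rating}")
    return sum(item_ratings)

# The maximum totals reproduce the ranges quoted above: 63, 20 and 38.
maxima = {name: max_total(name) for name in SCALES}
```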

Give me Bihar for 3 years, we

will turn it into Japan.

Laloo: Give me Japan for 3

months, I will turn it into Bihar.

Fundamental Technique

of Life table or Survival

analysis

time to event

Time to death, time to relapse,

recovery, pregnancy, receiving organ

transplant, failure to treatment

Uses the time of entry

Answers the question: what is the chance of survival after being diagnosed with a disease or after beginning treatment?

Example

A group of 200 subjects followed for

three years

Deaths (Events) occurred throughout

three years

What is the chance of surviving to the end of three years?

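| Interval | It | dt | qt | pt | Pt |
|---|---|---|---|---|---|
| 1 | 200 | 20 | 0.1 | 0.9 | 1.0 |
| 2 | 180 | 30 | 0.17 | 0.83 | 0.9 |
| 3 | 150 | 40 | 0.27 | 0.73 | 0.747 |

Cumulative survival at the end of interval 3: 0.73 x 0.747 = 0.545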

It: number of subjects alive at the beginning of the time interval

dt: number of deaths during the time interval

qt = dt/It: probability of dying during the time interval

pt = 1 - qt: probability of surviving the time interval

Pt: cumulative probability of survival at the beginning of the time interval (that is, at the end of the previous time interval)
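The life-table arithmetic defined above can be sketched in Python as follows. This is a minimal illustration that assumes no censoring or withdrawals, so the only way a subject leaves the study is by death. The function name `life_table` and the dictionary keys are illustrative choices. Carried at full precision, the three-year survival comes out at 0.55. The 0.545 shown in the example arises from rounding the intermediate values.

```python
def life_table(n_start, deaths_per_interval):
    """Build a simple life table (no censoring assumed).

    Returns the table rows and the cumulative probability of
    surviving to the end of the last interval.
    """
    rows = []
    at_risk = n_start        # It: alive at the start of the interval
    cum_survival = 1.0       # Pt: survival up to the start of the interval
    for interval, deaths in enumerate(deaths_per_interval, start=1):
        q = deaths / at_risk     # qt: probability of dying in the interval
        p = 1.0 - q              # pt: probability of surviving the interval
        rows.append({
            "interval": interval,
            "I_t": at_risk,
            "d_t": deaths,
            "q_t": q,
            "p_t": p,
            "P_t": cum_survival,
        })
        cum_survival *= p        # carry survival forward to the next interval
        at_risk -= deaths        # survivors enter the next interval
    return rows, cum_survival

# The example from the text: 200 subjects, deaths of 20, 30 and 40.
rows, survival_3yr = life_table(200, [20, 30, 40])
for row in rows:
    print(row["interval"], row["I_t"], row["d_t"],
          round(row["q_t"], 2), round(row["p_t"], 2), round(row["P_t"], 3))
print("Survival at end of follow-up:", round(survival_3yr, 3))
```

Without censoring, the result is simply the fraction still alive at the end (110 of 200). The interval-by-interval product becomes essential once subjects can drop out, because the number at risk then changes for reasons other than death.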

Meta-analysis

About 40,000 journals serve the sciences, and researchers are filling those journals at the rate of one article every 30 seconds.

As results accumulate, it becomes increasingly difficult to

understand what they tell us and becomes difficult to find the

knowledge in this flood of information.

Meta-analysis is a rapidly expanding area of research that has been relatively underutilized in Animal Nutrition and Physiology.

Meta-analysis is also a form of evidence-based research.

Meta-analysis is an analysis of analyses: the statistical analysis of a large collection of results from individual studies for the purpose of integrating the findings.

results of different studies

The pooled estimate should be a weighted average of all studies included in the meta-analysis.

Increases the statistical power
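One common way to form such a weighted average is inverse-variance weighting under a fixed-effect model, sketched below in Python. The slides say only that the pooled estimate is a weighted average, so the choice of weights here is an assumption. The function name `pooled_estimate` and the study values are hypothetical.

```python
import math

def pooled_estimate(effects, std_errors):
    """Fixed-effect, inverse-variance weighted average of study effects.

    Each study is weighted by 1 / SE^2, so more precise studies count
    for more. Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical effect sizes and standard errors from two studies.
effects = [0.2, 0.5]
std_errors = [0.1, 0.2]
pooled, pooled_se = pooled_estimate(effects, std_errors)
print(round(pooled, 3), round(pooled_se, 4))
```

The pooled standard error is smaller than that of either study alone, which is the sense in which combining studies increases statistical power.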

Resources

patients needed for a clinical trial with survival as an endpoint.

Biometrics. 38(1):163-170, 1982.

Bland JM and Altman DG. One and two sided tests of significance.

British Medical Journal 309: 248, 1994.

Pepe, Longton, Anderson, Schummer. Selecting differentially expressed genes from microarray experiments. Biometrics. 59(1):133-142, 2003.

http://faculty.vassar.edu/lowry/VassarStats.html

Statistics guide for research grant applicants, St. Georges Hospital

Medical School (http://www.sghms.ac.uk/depts/phs/guide/size.htm)

http://www.physics.csbsju.edu/stats/contingency_NROW_NCOLUMN_form.html

http://www.physics.csbsju.edu/stats/exact_NROW_NCOLUMN_form.html

http://www.randomization.com/

http://www.graphpad.com/quickcalcs/index.cfm

Administrative Structure

Sponsor Signature page

Investigator Signature Page

Facilities

1.0 List of abbreviations and definitions of terms

2.0 Introduction

2.1 Background

2.1.1 Non clinical Summary

2.1.2 Clinical Summary

2.2 Benefits and Risks

2.2.1 Benefits and Risks of the study

2.2.2 Benefits and Risks of the study Drug

2.3 Rationale for the study

3.1 Primary objectives

3.2 Secondary objectives

3.3 Exploratory Objectives

4.0 Investigational Plan

4.1 Study design

4.2 Blinding

4.3 Randomization/ treatment Groups

4.4 Study endpoints

4.4.1 Pharmacokinetic endpoints

4.4.2 Pharmacodynamic endpoints

4.4.3 Efficacy endpoints

4.4.4 Safety Endpoints

4.4.5 Immunogenicity Endpoints

4.5 Duration of the Study

4.5.1 Part 1 treatment and Evaluation

4.5.2 Part 2 Recovery and follow-up

4.6.1 Inclusion Criteria

4.6.2 Exclusion Criteria

4.6.3 Protocol Required Restrictions

4.6.4 Patient Withdrawal

4.6.5 Screen failure or Rescreening

4.7 Study Assessments

4.7.1 Clinical laboratory Evaluations

4.7.2 Vital Signs

4.7.3 Physical Examination

4.7.4 Electrocardiogram

4.7.5 Chest X-ray and QuantiFERON-TB Gold Test

4.8 Study Periods

4.8.1 Part 1 Treatment and evaluation

4.8.1.1 Screening

4.8.1.2 Randomization Visit 1(week 0), Day 1

4.8.1.3 Evaluation Visit 2, Visit 3, Visit 4

4.8.3 End-of-study

5.1 Treatment administered

5.2 Investigational products

5.3 Packaging and Labelling

5.4 Storage

5.5 Preparation

5.6 Study Medication Accountability

5.7 Prior and concomitant Treatments

5.8 Rescue Therapy

5.8.1 Prohibited Concomitant Medications

5.8.2 Allowed Medications

5.9 Treatment Compliance

6.0 Pharmacokinetic, Pharmacodynamic, Efficacy Outcome and Immunogenicity Assessment

6.1 Pharmacokinetics and Pharmacodynamics

6.1.1 Pharmacokinetics

6.1.2 Pharmacodynamics

6.2 Efficacy Outcomes

6.2.1 Health assessment questionnaire Disability index

6.3 Immunogenicity

6.4 Appropriateness of Measurements

7.0 Safety

7.1 Drug-associated adverse events: warnings, precautions and treatment recommendations

7.2 Adverse Events

7.2.1 Definitions and general guidelines

7.2.2 Clinical laboratory abnormalities and other abnormal assessments

7.2.3 Pregnancy

7.2.4 Safety monitoring

8.0 Statistics

8.1 Determination of sample size

8.2 Statistical Methods

8.2.1 Interim Analysis

8.2.2 Efficacy Analysis

8.2.3 Safety Analysis

8.2.4 Analysis Populations

8.3 Safety

8.3.1 Adverse Events

8.3.2 Clinical Laboratory Evaluations

8.3.3 Vital signs Measurements, physical findings and other safety Evaluations

8.3.4 Immunogenicity

8.4 Pharmacokinetic Analysis

8.5 Pharmacodynamic analysis

9.0 Ethics

9.1 Institutional Review Board or Independent Ethics Committee

10.1 Monitoring

10.2 Data Management/ Coding

10.3 Quality Assurance Audit

10.4 Ethical Conduct of the study

10.5 Patient Information and informed consent

10.6 Patient Data Protection

11.0 Study Administration

11.1 Administrative Structure

11.2 Study and study Centre closure

11.3 Study discontinuation

11.4 Data handling and Record Keeping

11.5 Direct Access to source Data/Documentation

11.6 Investigator Information

11.6.2 Protocol Signatures

11.6.3 Publication Policy

11.7 Financing and Insurance

11.8 Interpretation of the Protocol/Protocol Amendment(s)

11.9 Protocol Deviations/Violations

12.0 Appendices

12.1 Appendix 1:Schedule of Assessments

( and so on.... list of appendices)

13.0 References

List Of Tables

sureshkp97@gmail.com

The winners in life think constantly in terms of I can, I will, and I am. Losers, on

the other hand, concentrate their waking thoughts on what they should have or

would have done, or what they can't do.

Thank you

