Research 3 markers (2012-2021) Solutions, Compiled by ROSHAN SAPKOTA, Final year BPT

sapkotaroshan4@gmail.com

July 2021
1.Descriptive and analytical research
Descriptive research includes surveys and fact finding enquiries of different kinds.
The major purpose of descriptive research is to describe the state of affairs as it exists at present.
The term ex post facto research is often used for descriptive research studies.
The main characteristic of this method is that the researcher has no control over the variables; he can only
report what has happened or what is happening.
The researcher seeks to measure items as, for example, frequency of clinic visit, preferences of people etc.
It also includes attempts by researchers to discover causes even when they cannot control the variables.

In analytical research, the researcher has to use facts or information already available, and analyze these to
make a critical evaluation of the material. Analytical research asks “why?” so we try to find out how
something came to be.
For example, Why do so many siblings of people with Down syndrome have positive experiences?
Analytical research brings together subtle details to build more defensible conclusions. People might use
analytical research to find the missing link in a study (Valcarcel, 2017).

2. Necessity of defining a research problem.


A problem clearly stated is a problem half solved. This statement signifies the need of defining a
research problem. It is necessary to define a research problem for following purposes:
It will help to discriminate relevant data from the irrelevant ones.
It will enable the researcher to stay on track (an ill-defined problem may create hurdles).
A well defined research problem helps the researcher to answer the following questions:
What data is to be collected?
What characteristics of data are relevant and need to be studied?
What relations are to be explored?
What techniques are to be used for the purpose?
Thus, defining a research problem helps to work out the research design and smoothly carry all the
consequential steps involved while doing research.

3. Need for sampling


Sampling is a technique of selecting individual members or a subset of the population to make statistical
inferences from them and estimate characteristics of the whole population. The sample should be
representative of the population to ensure that we can generalize the findings from the research sample to
the population as a whole.
Sampling is needed because:
It is impractical to survey every member of a population.
Sampling makes the research more manageable, time efficient, less costly.
It helps to learn about the characteristics of a group of people or objects without having to collect
information about all of the people or objects of interest.
When a person handles less amount of work of fewer number of people, then it is easier to ensure the
quality of the outcome.

4. Define source list and give example.

5. Kurtosis
Kurtosis is a measure of the relative peakedness of a distribution. It is a measure of whether the
data are heavy-tailed or light-tailed relative to a normal distribution. It is a shape parameter
that characterizes the degree of peakedness.
Three forms are distinguished: a distribution with the smallest or flattest peak is platykurtic, one
with a medium peak is mesokurtic (as in the normal distribution), and one with the highest peak is
leptokurtic.
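
A minimal sketch of how kurtosis can be checked numerically, assuming SciPy is available (scipy.stats.kurtosis reports excess kurtosis, with the normal distribution at 0); the use of uniform and Laplace samples as platykurtic and leptokurtic examples is illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
normal = rng.normal(size=10_000)            # mesokurtic  (excess kurtosis ~ 0)
uniform = rng.uniform(-1, 1, size=10_000)   # platykurtic (excess kurtosis < 0)
laplace = rng.laplace(size=10_000)          # leptokurtic (excess kurtosis > 0)

for name, data in [("normal", normal), ("uniform", uniform), ("laplace", laplace)]:
    # scipy.stats.kurtosis uses Fisher's definition by default (normal = 0)
    print(name, round(stats.kurtosis(data), 2))
```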

6. Nominal and ordinal data


Nominal and ordinal are two of the four levels of measurement.
Nominal data has two or more categories without having any kind of natural order, they are data
with no numerical value i.e. they can’t be quantified.
For example, hair color: brown, black, red, white; religious preference: Hindu, Muslim, Jewish,
Christian.
Ordinal data is categorical data in which the variables exist in naturally occurring ordered
categories. The distance between two categories cannot be established.
For example, the Likert scale: strongly disagree, disagree, neutral, agree, strongly agree;
socioeconomic status: poor, middle class, rich.
Nominal level data can only be classified, while ordinal level data can be classified and ordered.

7. Level of significance
The level of significance is the probability of rejecting the null hypothesis when it is in fact
true (the probability of a Type I error). It is fixed in advance by the researcher and is symbolized
by alpha (α). It should not be confused with the p-value, which is a function of the observed sample
results: the p-value is the probability of obtaining results at least as extreme as those observed,
assuming the null hypothesis is true. We reject the null hypothesis when the p-value falls below the
chosen level of significance; practically, p < 0.05 (5%) is considered significant.

Alpha = 0.05 implies,
We may go wrong 5 out of 100 times by rejecting the null hypothesis.
Or, we can attribute significance with 95% confidence.
The choice of alpha is affected by the sample size and the nature of the experiment.
Common levels of significance are 0.05, 0.01, 0.001.

8. Histogram
A histogram is a representation of a frequency distribution by means of rectangles whose
widths represent class intervals and whose areas are proportional to the corresponding
frequencies.
A histogram can be used:
to display large amount of data values in a relatively simple chart form.
to tell relative frequency of occurrence.
to easily see the distribution of data.
to see if there is variation in the data.
to make future predictions based on the data.

9. Sampling distribution
A sampling distribution refers to the probability distribution of a statistic obtained from
repeated random samples drawn from a given population. Also known as the finite-sample distribution,
it represents the distribution of frequencies of how spread apart various outcomes will be for a
specific population.
The sampling distribution depends on multiple factors: the statistic, the sample size, the sampling
process, and the overall population. It is used to calculate statistics such as means, ranges, variances,
and standard deviations for the given sample.
Types:
1. Sampling distribution of the mean
2. Sampling distribution of proportion
3. t-distribution (used when the sample size is very small)
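
A small simulation sketch of the idea, using only the Python standard library and an invented population of heights: repeated samples of size n are drawn and their means collected, approximating the sampling distribution of the mean; the spread of those means is the standard error.

```python
import random
import statistics

random.seed(42)
# Invented population: 100,000 heights in cm
population = [random.gauss(170, 10) for _ in range(100_000)]

n = 30  # sample size
# Collect the mean of each of 2,000 random samples of size n
means = [statistics.mean(random.sample(population, n)) for _ in range(2_000)]

print("population mean:     ", round(statistics.mean(population), 2))
print("mean of sample means:", round(statistics.mean(means), 2))
print("spread of sample means (standard error):", round(statistics.stdev(means), 2))
```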

10. One tailed and two tailed test


A one-tailed test is a statistical test in which the values for which we can reject the null
hypothesis are located entirely in one tail of the probability distribution. The alternative is stated
in such a way that the probability of making a Type I error is entirely in one tail of the sampling
distribution. It can be either right-sided or left-sided. A one-tailed test looks for an “increase” or
“decrease” in the parameter. For example, People with education are more interested in physical
activity than people with no education.

A two tailed test is a statistical test where the region of rejection is on both sides of the sampling
distribution. A two-tailed test looks for a “change” in the parameter. For example, eating will
change the weight of people.
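
A hedged sketch of the two choices, assuming a recent SciPy (1.6+), where the `alternative` argument of `ttest_ind` selects the tail; the two groups' activity scores below are invented for illustration:

```python
from scipy import stats

educated = [5.1, 4.8, 5.6, 6.0, 5.3, 5.9]      # hypothetical activity scores
not_educated = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6]

# One-tailed: H1 says the first group's mean is GREATER (rejection region in one tail)
t, p_one = stats.ttest_ind(educated, not_educated, alternative="greater")
# Two-tailed: H1 says the means simply DIFFER (rejection region in both tails)
t, p_two = stats.ttest_ind(educated, not_educated, alternative="two-sided")

print(round(p_one, 4), round(p_two, 4))  # here the two-tailed p is twice the one-tailed p
```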

March 2021

1.Editing
Editing of data is a process of examining the collected raw data (especially in surveys) to
detect errors and omissions and to correct these when possible. It involves a careful scrutiny of
the completed questionnaires and/or schedules. It is done to assure that the data are accurate,
consistent with other facts gathered, uniformly entered, as complete as possible and have been
well arranged to facilitate coding and tabulation.
With regard to the points or stages at which editing should be done, one can talk about field
editing and central editing. Field editing is the reviewing of the reporting forms by the investigator
for completing (translating and rewriting) what he has written in abbreviated and/or illegible
form at the time of recording the respondents’ responses.
Central editing implies that all forms should get a thorough editing by a single editor in a
small study and by a team of editors in case of a large enquiry.

2. Kurtosis
Done already
3. Ex post facto research
see Descriptive research

4. Research problem
A research problem refers to some difficulty which a researcher experiences in the context of
either a theoretical or practical situation and wants to obtain a solution for the same. A research
problem exists if the following conditions are met with:
There must be an individual ( or a group or an organization) to whom the problem can be
attributed.
There must be at least two courses of action to be pursued.
There must be at least two possible outcomes of a course of action, of which one should be
preferable to the other.
The courses of action available must provide some chance of obtaining the objective, but they
cannot provide the same chance, otherwise the choice would not matter.

5. Histogram
done already

6. Research approaches
Research approaches are plans and procedures for research that span the steps from
broad assumptions to detailed methods of data collection, analysis, and interpretation.
There are two types of approaches to research, namely quantitative approach and qualitative
approach.
The quantitative approach involves the generation of data in quantitative form which can be
subjected to rigorous quantitative analysis in a formal and rigid fashion. This approach can be
sub-classified as:
Inferential,
Experimental, and
Simulation approaches
The inferential approach builds a database from which to infer characteristics or relationships of a population.
In experimental approach, some variables are manipulated to observe their effects on other
variables.
Simulation approach involves the construction of an artificial environment within which relevant
information and data can be generated.

The qualitative approach to research is concerned with subjective assessment of attitudes, opinions
and behavior. Research in such a situation is a function of the researcher’s insights and impressions.

7. One tailed and two tailed test


Done already

8. Ordinal scale
Ordinal scale is the 2nd level of measurement that reports the ranking and ordering of the data without
actually establishing the degree of variation between them. Ordinal level of measurement is the second of
the four measurement scales.
“Ordinal” indicates “order”. Ordinal data has naturally occurring orders, but the differences
between values are unknown. It can be named, grouped and also ranked.

Ordinal Characteristics

 Along with identifying and describing the magnitude, the ordinal scale shows the relative rank of
variables.
 The properties of the interval are not known.
 Measurement of non-numeric attributes such as frequency, satisfaction, happiness etc.
 In addition to the information provided by nominal scale, ordinal scale identifies the rank of variables.
 Using this scale, survey makers can analyze the degree of agreement among respondents with respect to
the identified order of the variables.
For example: the Likert scale.

9. Probability
Probability is a branch of mathematics that deals with the occurrence of a random event. For example,
when a coin is tossed in the air, the possible outcomes are head and tail.
The probability formula provides the ratio of the number of favorable outcomes to the total number of
possible outcomes.
The probability of an Event = (Number of favorable outcomes) / (Total number of possible outcomes)
Whenever we’re unsure about the outcome of an event, we can talk about the probabilities of certain
outcomes—how likely they are.

10. What is binomial distribution?


The binomial is a type of distribution that has two possible outcomes. It can be thought of as simply the
probability of a success or failure of outcome in an experiment or survey that is repeated multiple times.
For example, a coin has only two possible outcomes, either head or tails.
Must meet following 3 criteria :
1. The number of observations or trials is fixed.
One can only figure out the probability of something happening if it is done a certain number of times.
If a coin is tossed once, the probability of getting a tail is 50%. If the coin is tossed 20 times, the
probability of getting at least one tail is very, very close to 100% (see the sketch below).
2. Each observation or trial is independent. In other words, none of the trials have an effect on the
probability of the next trial.
3. The probability of success is exactly the same from one trial to another.
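
A minimal sketch of the coin example with scipy.stats.binom (assumed available), checking the "very close to 100%" claim for at least one tail in 20 tosses:

```python
from scipy import stats

n, p = 20, 0.5                          # fixed number of independent trials, constant p
p_no_tails = stats.binom.pmf(0, n, p)   # P(exactly 0 tails) = 0.5 ** 20
print(1 - p_no_tails)                   # P(at least one tail), about 0.999999
```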

September 2020

1. Mode
The most frequently occurring observation in a data-set is mode. It is particularly useful in the
study of popular sizes. For example, a manufacturer of shoes is usually interested in finding out
the size most in demand so that he may manufacture a larger quantity of that size.
Like median, mode is also a positional average and is not affected by extreme values. Mode is
not amenable to algebraic treatment.
A data-set may not have any mode, or there may be more than one mode in a data-set.

2. Cluster Sampling
It is a type of probability sampling. Cluster sampling also involves dividing the population into
subgroups, but each subgroup should have characteristics similar to those of the whole population. Instead
of sampling individuals from each subgroup, we randomly select entire subgroups.

If it is practically possible, we might include every individual from each sampled cluster. If the clusters
themselves are large, we can also sample individuals from within each cluster using another probability
sampling technique, such as simple random sampling.
This method is good for dealing with large and dispersed populations, but there is more risk of error in the
sample, as there could be substantial differences between clusters. It’s difficult to guarantee that the
sampled clusters are really representative of the whole population.

Example
Suppose a company has offices in 10 cities across the country (all with roughly the same number of
employees in similar roles). We don’t have the capacity to travel to every office to collect data, so we
use random sampling to select 3 offices; these are our clusters.

3. Sampling Unit
A Sampling unit is one of the units selected for the purpose of sampling. It is a basic unit
containing the elements of target population. A Sampling unit is an element, or a unit containing
the element, that is available for selection at some stage of the sampling process. It may be a
geographical one such as state, district, village, etc. or a construction unit such as house, flat, etc.
or it may be a social unit such as family, club, school, etc., or it may be an individual. The
researcher will have to decide one or more of such units that he has to select for his study.
A sampling unit is the building block of a data set.

4. Research approaches
Done already
5. One tailed and two tailed test
Done already
6. Ordinal scale
Done already

7. Define population and sample


The population includes all objects of interest whereas the sample is only a portion of the
population. There are several reasons why we don't work with populations. They are usually large,
and it is often impossible to get data for every object we're studying.
Sample is representative of the population. A sample is a group of people, objects, or items that
are taken from a larger population for measurement. The sample should be representative of the
population to ensure that we can generalize the findings from the research sample to the
population as a whole.
For example, all the students in a class form the population, whereas a randomly selected group of 10
students from the class is a sample.

8. Continuous Vs discrete variables


A discrete variable is a variable whose value is obtained by counting. It can take only certain
values.
A continuous variable is a variable whose value is obtained by measuring. It can take any value
within a given range.
Examples of discrete variables: number of planets around the Sun; number of students in a class.
Examples of continuous variables: height and weight of students in a particular class.

9. Define scaling.
Scaling describes the procedures of assigning numbers to various degrees of opinion, attitude
and other concepts. This can be done in two ways:
i) making a judgement about some characteristic of an individual and then placing him directly
on a scale that has been defined in terms of that characteristic and
ii) constructing questionnaires in such a way that the score of individual’s responses assigns him
a place on a scale.
A scale is a continuum, consisting of the highest point and the lowest point along with several
intermediate points between these two extreme points.
The term ‘scaling’ is applied to the procedures for attempting to determine quantitative measures
of subjective abstract concepts.

10. Probability
Done already

October 2019

1. Histograms
Done already
2. Sampling units
Done already
3. Randomized controlled trials
A randomized controlled trial (RCT) is an experimental form of impact evaluation in
which the population receiving the programme or policy intervention is chosen at random
from the eligible population, and a control group is also chosen at random from the same
eligible population. It is a study design that randomly assigns participants into an experimental group
or a control group. It is used to measure the effectiveness of a new intervention or
treatment. Randomization reduces bias and provides a rigorous tool to examine cause-effect
relationships between an intervention and outcome.

4. Convenience sampling
It is a type of non probability sampling. A convenience sample simply includes the individuals
who happen to be most accessible to the researcher.
This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is
representative of the population, so it can’t produce generalizable results.

Example
We are researching opinions about student support services in our university, so after each of our classes,
we ask our fellow students to complete a survey on the topic. This is a convenient way to gather data, but
as we only surveyed students taking the same classes as us at the same level, the sample is not
representative of all the students at our university.

5. Artificial research

6. One tailed and two tailed test


Done already

7. Any two objectives of research


The purpose of research is to discover answers to questions through the application of scientific
procedures. The objectives of research are given below:
To gain familiarity with a phenomenon or to achieve new insights into it.
To portray accurately the characteristics of a particular individual, situation or a group.
To determine the frequency with which something occurs or with which it is associated with
something else.
To test a hypothesis of a causal relationship between variables.
8. Necessity of defining research problem
Done already

9. Frequency polygon
A frequency polygon is a line graph of class frequency plotted against class midpoint. It can be
obtained by joining the midpoints of the tops of the rectangles in the histogram. It is a graphical device
for understanding the shapes of distributions. They serve the same purpose as histograms, but are
especially helpful for comparing sets of data. Frequency polygons provide us with an understanding of the
shape of the data and its trends.

10.Level of significance
Done already

11. Define mode


done already

12. Ranking scale


A ranking scale is a close-ended scale survey question tool that measures people's preferences by
asking them to rank their views on a list of related items. The respondents are asked to compare
items to one another, rather than rating them on a common scale.
On a ranking scale, the question may be in terms of product features, needs, wants, etc. It can be used for
both online and offline surveys.
For an existing product, a ranking scale gives a more detailed answer to whether customers like the
product in question or not.

13. Coding
Coding refers to the process of assigning numerals or other symbols to answers so that responses can
be put into a limited number of categories or classes. Such classes should be appropriate to the research
problem under consideration.
Coding is necessary for efficient analysis; through coding, the several replies may be reduced to a small
number of classes which contain the critical information required for analysis. One method of coding is to
code in the margin with a colored pencil. Another method can be to transcribe the data from the
questionnaire to a coding sheet.
14. Distinguish between primary and secondary data

15. Frequency polygon


Done already

16. Define statistics


Statistics is the science of acquiring, classifying, organizing, analyzing, interpreting and
presenting the numerical data, so as to make inferences about the population from the sample drawn.

Types
Descriptive statistics
That part of statistics which quantitatively describes the characteristics of a particular data set
under study, with the help of a brief summary about the sample.
Inferential statistics
Type of statistics in which a random sample is drawn from the large population, to make
deductions about the whole population, from which the sample is taken.

17. Standard deviation


It is the most widely used measure of dispersion of a series and is commonly denoted by the symbol
sigma (σ).
It is defined as the square root of the average of the squared deviations of the values of individual
items in a series from the arithmetic average of the series.
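
A from-scratch sketch of this definition (population form) on an invented series, using only the standard library:

```python
import math

values = [4, 8, 6, 5, 3, 7]  # made-up series
mean = sum(values) / len(values)
# Average of squared deviations from the arithmetic mean
variance = sum((x - mean) ** 2 for x in values) / len(values)
sigma = math.sqrt(variance)  # standard deviation
print(round(sigma, 3))       # about 1.708
```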

18. Type I and type II error


Type 1 error, in statistical hypothesis testing, is the error caused by rejecting a null
hypothesis when it is true.
 Type 1 error is caused when the hypothesis that should have been accepted is rejected.
 Type I error is denoted by α (alpha) known as an error, also called the level of significance of the test.
 This type of error is a false positive error, where the null hypothesis is rejected based on some error
during the testing.
 The null hypothesis is set to state that there is no relationship between two variables and the cause-
effect relationship between two variables, if present, is caused by chance.
 Type 1 error occurs when the null hypothesis is rejected even when there is no relationship between
the variables.
 As a result of this error, the researcher might end up believing that the hypothesis works even when it
doesn’t.

Type II error is the error that occurs when the null hypothesis is accepted when it is not true.
 In simple words, Type II error means accepting the hypothesis when it should not have been accepted.
 The type II error results in a false negative result.
 In other words, type II is the error of failing to reject the null hypothesis when the alternative
hypothesis is true, often because the researcher doesn’t have adequate power.
 The Type II error is denoted by β (beta) and is also termed as the beta error.
 The null hypothesis is set to state that there is no relationship between two variables and the cause-
effect relationship between two variables, if present, is caused by chance.
 Type II error occurs when the null hypothesis is accepted on the assumption that the relationship
between the variables is due to chance or luck, even when there is a real relationship between the variables.
 As a result of this error, the researcher might end up believing that the hypothesis doesn’t work even
when it should.

19. Case control study


It is a comparative study between two groups: one group of persons having a particular disease under study,
called “cases”, and another group of persons, “controls”, who are all comparable with the cases in respect of
age, sex, literacy level, occupation, marital status and socioeconomic status, but free from the disease under study.
The control group is taken for the purpose of comparisons of observations.
Study is made by obtaining information from each member of both the groups, about the exposure to the
suspected factor made in the hypothesis. The information may be obtained either by interview method, or
questionnaire method or by E-mail, etc.
Suppose the hypothesis is ‘Smoking 20 cigarettes per day over 20 years results in lung cancer’; the study
proceeds backwards to trace the history of exposure to the suspected factor under study.

20. Define research design


A research design is a framework or blueprint for conducting the research project. It details the
procedures necessary for obtaining the information needed to structure or solve research problems.
It is the arrangement of conditions for collection and analysis of data in a manner that aims to combine
relevance to the research purpose with economy in procedure.
It involves blueprint of data collection, measurement and analysis.
Exploratory Research Design
To formulate a research problem for an in-depth or more precise investigation
To discover new ideas and insights
Three methods considered for such Research Design
a) A Survey of related literature
b) Experience survey
c) Analysis of insight-stimulating instances
Descriptive and Diagnostic Research Design
Descriptive Research Design is concerned with describing the characteristics of a particular individual
or a group.
Diagnostic research design determines the frequency with which a variable occurs or its relationship with
another variable.
Both Descriptive and Diagnostic Research design have common requirements
Hypothesis- Testing Research Design
The researcher tests the hypothesis of causal relationship between two or more variables.
These studies require unbiased attitude of the researcher.

September 2018
1. Skewness
Skewness is lack of symmetry. Skewness gives us the idea of the shape of the distribution of the data. A
data-set has a skewed distribution when mean, median and mode are not the same. In such a case, the plot
of the distribution is stretched more to one side than to the other. When the curve is stretched more
towards the right side, we have positive skewness, and when the curve is stretched more towards the left
side, we have negative skewness.
In case of positive skewness, we have Mode < Median < Mean, and in case of negative skewness, we have
Mean < Median < Mode. Absolute skewness is measured by (Mean − Mode).
In case mode is ill-defined, it can be estimated from the mean and median, for a moderately asymmetrical
distribution, using the formula Mode = 3 Median − 2 Mean.
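
A small sketch of these relations on invented right-skewed data, comparing the mean, the median and the mode estimated from the empirical formula above:

```python
import statistics

data = [2, 3, 3, 4, 4, 4, 5, 6, 8, 11]    # a few large values pull the tail right
mean = statistics.mean(data)               # 5.0
median = statistics.median(data)           # 4.0
mode_est = 3 * median - 2 * mean           # empirical relation for moderate skew

print(mean, median, mode_est)
# mean > median > estimated mode  ->  positive (right) skewness, as described above
```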

2. Normal probability curve


The normal probability curve is symmetrical about the ordinate of the central point of the curve. It
implies that the size, shape and slope of the curve on one side of the curve is identical to that of the other.
That is, the normal curve is bilaterally symmetrical.

Properties of the curve


1. The mean, mode and median are all equal.
2. The curve is symmetrical at the center (i.e. around the mean).
3. Exactly half of the values are to the left of center and exactly half the values are to right.
4. The total area under the curve is 1.
5. A standard normal model is a normal distribution with a mean of 0 and a standard deviation of 1.

3. One tailed and two tailed hypothesis


Done already

4. Classification
Classification is the process of arranging a large volume of raw data in groups or classes on the basis of
common characteristics. Data having a common characteristic are placed in one class and in this way the
entire data get divided into a number of groups or classes.
Classification is of two types, based on the nature of the phenomenon involved:
Classification according to attributes: data are classified on the basis of common characteristics, which
can either be descriptive (such as literacy, sex, honesty, etc.) or numerical (such as height, weight,
income, etc.).
Classification according to class-intervals: The numerical characteristics refer to quantitative phenomenon
which can be measured through some statistical units. Data relating to income, age, weight, etc. come
under this category. Such data are known as Statistics of variables and are classified on the basis of class
intervals.
5. What are the errors in sampling?
Sampling errors arise due to the fact that only a part of the population has been used to estimate
population parameters and to draw inferences about the population. Sampling errors are absent in census
survey.
Sampling error can be measured for a given sample design and size. The measurement of sampling error is
usually called the ‘precision of the sampling plan’. If we increase the sample size, the precision can be
improved. An effective way to increase precision is usually to select a better sampling design which has a
smaller sampling error for a given sample size at a given cost. Thus, while selecting a sampling procedure,
researcher must ensure that the procedure causes a relatively small sampling error and helps to control the
systematic bias in a better way.

6. Prospective study
A prospective study (sometimes called a prospective cohort study) is a type of cohort study, or group
study, where participants are enrolled into the study before they develop the disease or outcome in
question. The opposite is a retrospective study, where researchers enroll people who already have the
disease/condition. Prospective studies typically last a few years, with some (like the Framingham Heart
Study) lasting for decades.

Study participants typically have to meet certain criteria to be involved in the study. For example, they
may have to be of a certain age, profession, or race. Once the participants are enrolled, they are followed
for a period of time to see who gets the outcome in question (and who doesn’t). Usually, the research is
conducted with a goal in mind and participants are periodically checked for progress, using the same data
collection methods and questions for each person in the study. Follow ups might include:
 Email questionnaires,
 Phone, internet, or in-person interviews,
 Physical exams,
 Imaging or laboratory tests.

7. Median
When the data-set has outliers, the mean becomes flawed as a representative of the data-set. In
such a case, median is used as a measure of central tendency. Median divides the data-set into two
equal parts. Half of the items are less than the median and remaining half of the items are larger
than the median.
In order to obtain the median, we first arrange the data-set in ascending or descending order. If the
number of observations in the data set is n, then the median is the ((n+1)/2)th item when n is odd, and
the average of the (n/2)th and ((n/2)+1)th items when n is even.
The median is a positional average and is particularly useful in the context of qualitative phenomena,
for example in estimating intelligence, which are often encountered in sociological fields.
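
A short sketch of this positional rule in plain Python (the function name is only illustrative):

```python
def median(values):
    s = sorted(values)          # arrange the data in ascending order
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]           # the ((n+1)/2)-th item, counting from 1
    return (s[mid - 1] + s[mid]) / 2   # average of the two middle items

print(median([7, 1, 3, 9, 5]))       # 5
print(median([7, 1, 3, 9, 5, 11]))   # (5 + 7) / 2 = 6.0
```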

8. Ordinal scale
Done already

9. Qualitative and quantitative data

10. Mean deviation


Mean deviation is the average of the differences of the values of items from some average of the
series (such as the mean or median). Such a difference is technically described as a deviation.
Ignoring signs, mean deviation is obtained as follows:
Mean deviation = Σ|Xi − A| / n, where A is the average used and n is the number of items.
When mean deviation is divided by the average used in finding out the mean deviation itself, the
resulting quantity is described as the coefficient of mean deviation.

September 2017

1. Define ratio scale


A ratio scale is a quantitative scale where there is a true zero and equal intervals between
neighboring points. A ratio scale is the most informative scale, as it conveys both the order of
values and the exact differences between them. The most common examples of this
scale are height, money, age, weight etc. Ratio scale is the 4th level of measurement. This scale is
used to calculate all the scientific variables. In fact, in the absence of a ratio scale, scientific
variables cannot be measured.
2. Define mode
Done already
3. Coding
Done already
4. Distinguish between primary and secondary data
done already
5. Frequency polygon
Done already
6. Case-control study
Done already
7. Sampling errors
Done already
8. Standard deviation
Done already
9. Binomial distribution
Done already
10. Skewness
Done already

September 2016

1. Define mean
Mean is the most common measure of central tendency and may be defined as the value
which we get by dividing the total of the values of various given items in a series by the total
number of items.

Mean is the simplest and most widely used measure of central tendency. However, the mean
suffers from some limitations. When the data-set has one or more extreme values, the magnitude
of mean is affected and it provides a wrong impression of the other values in the data-set, when
used to represent the whole data-set.

2. Nominal scale
Done already

3. Retrospective study
A retrospective study is an observational study that enrolls participants who already have a
disease or condition. In other words, all cases have already happened before the study
begins. Researchers then look back in time, using questionnaires, medical records and other
methods. The goal is to find out what potential risk factors or other associations and relationships
the group has in common.
Advantages:
 Useful for rare diseases or unusual exposures.
 Smaller sample sizes.
 Studies take less time, because the data is readily available (it just has to be collected and analyzed).
 Costs are generally lower.
Disadvantages:
 Missing data: Exposure status may not be clear, because important data may not have been collected
in the first place. For example, if the study is investigating occupational lung cancer rates,
information about worker’s smoking habits may not be available.
 Recall bias: Participants may not be able to remember if they were exposed or not.
 Confounding variables are difficult or impossible to measure.
 Retrospective studies are generally considered weaker evidence than prospective studies, so
prospective studies are preferred when there is a choice.

4. Binomial Distribution
Done already

5. Cluster Sampling
Done already
6. Dependent and independent variable
The independent variable is the variable the experimenter manipulates or changes, and is assumed to
have a direct effect on the dependent variable. For example, allocating participants to either drug or
placebo conditions (independent variable) in order to measure any changes in the intensity of their anxiety
(dependent variable).

The dependent variable is the variable being tested and measured in an experiment, and is 'dependent' on
the independent variable. An example of a dependent variable is depression symptoms, which depends on
the independent variable (type of therapy).
In an experiment, the researcher is looking for the possible effect on the dependent variable that might be
caused by changing the independent variable.

7. Cumulative frequency Curve


A curve that represents the cumulative frequency distribution of grouped data on a graph is
called a Cumulative Frequency Curve or an Ogive. Representing cumulative frequency data on a graph is
the most efficient way to understand the data and derive results. The cumulative frequency is calculated
by adding each frequency from a frequency distribution table to the sum of its predecessors.

8. Editing in processing of data


Done already

9. Chi-square test
A chi-square test is a statistical test used to compare observed results with expected results. The purpose
of this test is to determine if a difference between observed data and expected data is due to chance, or if it
is due to a relationship between the variables under study.

There are three types of Chi-square tests, tests of goodness of fit, independence and homogeneity. All
three tests also rely on the same formula to compute a test statistic.
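All three variants compute the statistic χ² = Σ (O − E)² / E over observed (O) and expected (E) counts. A minimal goodness-of-fit sketch, assuming SciPy is available; the die-roll counts are invented:

```python
from scipy import stats

observed = [18, 24, 16, 22, 25, 15]   # invented counts for die faces 1..6 (120 rolls)
expected = [20] * 6                    # fair-die expectation: 120 / 6 per face

stat, p = stats.chisquare(observed, expected)
print(round(stat, 2), round(p, 3))     # a large p -> difference plausibly due to chance
```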

10. Skewness
Done already

April 2015
1. What is hypothesis?
A research hypothesis is a specific, clear, and testable proposition or predictive statement about
the possible outcome of a scientific research study based on a particular property of a population,
such as presumed differences between groups on a particular variable or relationships between variables.
“A hypothesis is a conjectural statement of the relation between two or more variables.” (Kerlinger, 1956)
“Hypotheses are single tentative guesses, good hunches - assumed for use in devising theory or planning
experiments intended to be given a direct experimental test when possible.” (Eric Rogers, 1966)

A hypothesis gives direction to the study/investigation. It defines facts that are relevant and not relevant.
Furthermore, it also suggests which form of research design is likely to be most appropriate.

2. When do you use Paired ‘t’-test?


A paired t-test is used when we are interested in the difference between two variables for the same
subject. Often the two variables are separated by time.
For example, cholesterol levels in a person before starting statins and after 10 days of taking
statins. We may be interested in the difference in cholesterol levels between these two time points.

However, sometimes the two variables are separated by something other than time. For example, subjects
with ACL tears may be asked to balance on their leg with the torn ACL and then to balance again on
their leg without the torn ACL. Then, for each subject, we can then calculate the difference in balancing
time between the two legs.
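
A minimal sketch of the cholesterol example with scipy.stats.ttest_rel (SciPy's paired t-test); the values below are invented, with each position representing the same subject measured twice:

```python
from scipy import stats

before = [210, 198, 225, 240, 215, 230]   # hypothetical cholesterol before statins
after  = [195, 190, 215, 228, 210, 222]   # same subjects after 10 days on statins

t, p = stats.ttest_rel(before, after)
print(round(t, 2), round(p, 4))   # a small p suggests a real before/after difference
```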

3. Describe quartiles
A quartile is a statistical term that describes a division of observations into four defined intervals
based on the values of the data and how they compare to the entire set of observations. So, there are
three quartiles, first, second and third represented by Q1, Q2 and Q3, respectively. Q2 is nothing but the
median, since it indicates the position of the item in the list and thus, is a positional average. To find
quartiles of a group of data, we have to arrange the data in ascending order.

Around the median, the spread of the distribution can be measured with the help of the lower and upper
quartiles. Apart from mean and median, there are other measures in statistics which can divide the data
into specific equal parts. A median divides a series into two equal parts.
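
A short sketch using NumPy, assuming it is available; Q1, Q2 and Q3 are simply the 25th, 50th and 75th percentiles (interpolation details vary between libraries):

```python
import numpy as np

data = [3, 7, 8, 5, 12, 14, 21, 13, 18]
q1, q2, q3 = np.percentile(data, [25, 50, 75])
print(q1, q2, q3)   # q2 equals the median of the data
```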
4. Analysis of Co-Variance (ANCOVA)?
ANCOVA is similar to traditional ANOVA but is used to detect a difference in means of 3 or more
independent groups, whilst controlling for scale covariates. A covariate is not usually part of the main
research question but could influence the dependent variable and therefore needs to be controlled for.
ANCOVA is a blend of analysis of variance (ANOVA) and regression.
The ANCOVA technique requires one to assume that there is some sort of relationship between the
dependent variable and the uncontrolled variable. We also assume that this form of relationship is the same
in the various treatment groups. Other assumptions are:
Various treatment groups are selected at random from the population.
The groups are homogeneous in variability.
The relationship is linear and is similar from group to group.

5. Define Simple random sampling.


It is a type of probability sampling. In a simple random sample, every member of the population has an
equal chance of being selected. The sampling frame should include the whole population.
To conduct this type of sampling, we can use tools like random number generators or other techniques that
are based entirely on chance.
For example, suppose we want to select a simple random sample of 100 employees of Company X. We assign a
number from 1 to 1000 to every employee in the company database, and use a random number generator to
select 100 numbers.
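
A minimal sketch of the Company X example using only the Python standard library; the employee IDs are the sampling frame:

```python
import random

random.seed(1)
employee_ids = range(1, 1001)              # IDs 1..1000 (the sampling frame)
sample = random.sample(employee_ids, 100)  # each ID has an equal chance of selection
print(sorted(sample)[:10])
```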

6. Define co-efficient of variation


The coefficient of variation (CV) is a measure of relative variability. It is the ratio of the standard
deviation to the mean (average). For example, the expression “The standard deviation is 15% of the mean”
is a CV.

The CV is particularly useful when one wants to compare results from two different surveys or tests that
have different measures or values. For example, if one is comparing the results from two tests that have
different scoring mechanisms. If sample A has a CV of 12% and sample B has a CV of 25%, one would
say that sample B has more variation, relative to its mean.
Coefficient of Variation = (Standard Deviation / Mean) * 100

7. Differences between Bar diagram and Histogram

Bar diagram: a representation of ungrouped (categorical) data; the bars are of equal width and are
separated by gaps. For example, a bar graph may show the number of students who like a particular sport.
Histogram: a representation of grouped (continuous) data; the rectangles are adjacent, with widths
representing class intervals. For example, a histogram can show height measured in cm with the intervals
140-150, 150-160 and so on.
8. Mention Various types of Mean.
9. Define probability of an event
Probability means possibility. It is a branch of mathematics that deals with the occurrence of a random
event. Whenever we are unsure about the outcome of an event, we can talk about the probabilities of
certain outcomes - how likely they are. For example, when a coin is tossed in the air, the possible
outcomes are head and tail. It is the number of favorable outcomes divided by the total number of outcomes
possible.

P(A) = n(A)/n(S)

Where, P(A) is the probability of an event “A” n(A) is the number of favorable outcomes. n(S) is the total
number of events in the sample space.

10. Sampling variation

September 2014
1. Qualitative research
Qualitative research is concerned with qualitative phenomena, i.e., phenomena relating to or involving
quality or kind. For instance, when we are interested in investigating the reasons for human behavior (i.e.,
why people think or do certain things), we quite often talk of “motivation research”, an important type of
qualitative research. This type of research aims at discovering the underlying motives and desires, using
in-depth interviews for the purpose.
Attitude or opinion research i.e. , research designed to find out how people feel or what they think about a
particular subject or institution, is also qualitative research. Qualitative research is especially important
in the behavioral sciences, where the aim is to discover the underlying motives of human behavior.

2. Nominal and ordinal data


Done already
3. Statement of the research problem
Done already

4. Pie diagram
A pie chart is a circular statistical graphic, which is divided into slices to illustrate numerical
proportion. In a pie chart, the arc length of each slice, is proportional to the quantity it represents.
It is a type of pictorial representation of data. A pie chart requires a list of categorical variables and
the numerical variables. Here, the term “pie” represents the whole, and the “slices” represent the
parts of the whole.
The pie chart is an important type of data representation. It contains different segments and sectors, in
which each segment and sector of a pie chart forms a certain portion of the total (percentage). The angles
of all the slices together total 360°.
Advantages

 The picture is simple and easy-to-understand

 Data can be represented visually as a fractional part of a whole

Disadvantages
 It becomes less effective if there are too many pieces of data to use.

 If there are too many pieces of data, even adding data labels and numbers may not help; the labels
themselves may become crowded and hard to read.

5. Skewness
Done already
6. Kurtosis
Done already
7. Probability
Done already
8. Median
Done already
9. Need for sampling
Done already
10. Computer in research
Computers are used immensely in scientific research and are an important tool. The research process can
also be carried out through computers. Computers are particularly useful and important when a large sample is used.
Computer and phases of Research
Research process has five major phases. Computer can be used in these following phases.
Conceptual Phase and Computer
In this phase come the formulation of the research problem, review of literature, theoretical framework and
formulation of the hypothesis. The computer helps in searching the existing literature. It helps in finding
the relevant existing research papers so that the researcher can find the gap in the existing literature.
Bibliographic references can also be stored and retrieved through the World Wide Web. In the latest software,
references can be written automatically in different styles like APA, MLA etc. This saves the researcher's
time; he need not visit libraries. It also helps researchers to understand how a theoretical framework can
be built.

Design and Planning Phase and Computer


The computer can be used for deciding the population sample, questionnaire design and data collection.
There are various internet sites which help to design questionnaires. Several software packages can be used
to calculate the sample size. The computer also makes a pilot study of the research possible: in a pilot
study, sample size calculation and standard deviation are required, and the computer helps in doing all
these activities.

Empirical Phase and Computer


The computer helps in collecting data and in data analysis. After collecting data, it is stored on computers
in word-processor files or Excel sheets. The important applications used in scientific research are data
storage, data analysis, scientific simulations, instrumentation control and knowledge sharing. Necessary
corrections are made or edited whenever required; otherwise it would be a time-consuming process. Computers
help in referring to, editing and managing data. They allow for greater flexibility in recording the data
and make the analysis of data easy.
In research, preparing and inputting data is the most labor-intensive step and consumes much time. Data are
converted into a form which is suitable for the computer; they can be coded on an Excel sheet. These
spreadsheets can be directly opened with statistics software for analysis.

Data Analysis and Computer


Data analysis and interpretation can be done with the help of computers. For data analysis, software
packages are available. These packages help in applying techniques for analysis such as averages,
percentages, correlations, etc. Such packages include SPSS, STATA, Systat, etc.
They can also be used for checking the reliability of data, establishing and testing hypotheses, etc.
Computers are used in interpretation also.
They can check the accuracy and authenticity of data, and help in drafting tables by which a researcher
can interpret the results easily. These tables give clear support for the interpretation made by the researcher.

Research Dissemination and Computer


After interpretation, the computer helps in converting the results into a research article or report which
can be published. It can be written in Word format or PDF format. The article can be stored or published on
a website.

References and computer


After completing the document, a researcher needs to give the sources of the literature studied and
discussed in the references. Computers help in preparing references, which can be written in different
styles like APA, MLA etc. All the details of authors, journals, publication volumes and books can be filled
into the ‘references’ option, and the software automatically converts the information into the required
style. Reference-management software such as Mendeley is also used to manage references. A researcher need
not worry about remembering all the articles from which literature was taken; this can be easily managed
with such software.

April 2014
1. Distinguish between inclusive and exclusive method of class intervals in frequency distribution.
(1) Inclusive method: - It is a method of classification of given data in such a manner that the upper
limit of the previous class intervals does not repeat in the lower limit of the next class interval. In this
classification we include both the values of upper and lower limit in the distribution table. For example: -
0 – 10, 11 – 20, 21 – 30, and so on.
(2) Exclusive method: - It is a method of classification of given data in such a manner that the upper limit
of the previous class intervals gets repeated in the lower limit of the next class interval. In this
classification we include only the value lower limit and do not include the value of upper limit in the
distribution table. For example: - 0 – 10, 10 – 20, 20 – 30, and so on.
Now, let us take an example where we will draw the frequency distribution table of each kind using some
assumed data. Assume that the marks (out of 40) obtained by 15 students of a class are given as: 5, 10,
21, 29, 9, 8, 20, 16, 18, 25, 30, 25, 35, 37, 40, and we have to draw the frequency distribution table.
(i) Considering the Inclusive method of distribution we can draw the table as shown below: -

Marks No. of students (frequency)


0 – 10 4
11 – 20 3
21 – 30 5
31 – 40 3

(ii) Considering the Exclusive method of distribution we can draw the table as shown below: -

Marks No. of students (frequency)


0 – 10 3
10 – 20 3
20 – 30 5
30 – 40 3
40 – 50 1
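
A small sketch reproducing both tables programmatically, assuming pandas is available; with whole-number marks, right=True matches the inclusive grouping (upper limit included, so (0, 10] counts 1-10) and right=False matches the exclusive method (upper limit excluded):

```python
import pandas as pd

marks = pd.Series([5, 10, 21, 29, 9, 8, 20, 16, 18, 25, 30, 25, 35, 37, 40])

# Inclusive method: intervals (0,10], (10,20], (20,30], (30,40]
inclusive = pd.cut(marks, bins=[0, 10, 20, 30, 40], right=True)
# Exclusive method: intervals [0,10), [10,20), [20,30), [30,40), [40,50)
exclusive = pd.cut(marks, bins=[0, 10, 20, 30, 40, 50], right=False)

print(inclusive.value_counts(sort=False))   # 4, 3, 5, 3  (matches the first table)
print(exclusive.value_counts(sort=False))   # 3, 3, 5, 3, 1  (matches the second table)
```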
2. Distinguish between standard deviation and standard error

3. What is sampling frame?


Done already
4. Distinguish between qualitative and quantitative variables. Give one example of each.
5. Define level of significance and confidence level.
Level of significance is the probability of Type I error. It is always some percentage (usually 5%) which
should be chosen with great care, thought and reason. In case we take the significance level at 5 per cent,
then this implies that the null hypothesis will be rejected when the sampling result (i.e., observed evidence)
has a less than 0.05 probability of occurring if the null hypothesis is true. In other words, the 5 per cent level of
significance means that researcher is willing to take as much as a 5 per cent risk of rejecting the null
hypothesis when it happens to be true. Thus the significance level is the maximum value of the probability
of rejecting null hypothesis when it is true and is usually determined in advance before testing the
hypothesis.
A confidence interval is a range of values constructed from sample data so that the population parameter is
likely to occur within that range at a specified probability. The specified probability is called the level
of confidence.
For example, we are 90 % sure that the mean distance covered by students in 6MWT of BPT final year of
Ramaiah physiotherapy college is 450 m.

6. Define the probability of an event.


Done already

7. What is target population? Give an example.


The target population is the group of individuals that the researcher intends to study and draw
conclusions about. Also known as the target audience, this term refers to a group of people who
possess certain attributes that distinguish them from the entire population. The purpose of defining
it is to understand and evaluate their preferences and behaviors.
For example, final-year BPT students of Ramaiah Physiotherapy College, for assessing the teaching
quality of staff over the recent 3 years.
8. What is randomization?
Randomization is the process of making something random. For example: generating a random
permutation of a sequence (such as when shuffling cards); selecting a random sample of a population
(important in statistical sampling).
It eliminates selection bias, balances the groups with respect to many known and unknown
confounding or prognostic variables, and forms the basis for statistical tests, including
assumption-free tests of the equality of treatments.

9. What is the necessity of defining research problem?


Done already
10. What is sampling distribution?
A sampling distribution is a probability distribution of a statistic obtained from a larger number of
samples drawn from a specific population. It shows every possible result a statistic can take in every
possible sample from a population and how often each result happens.
Example

Suppose a researcher wishes to identify the average age of babies when they begin to walk in India.
Instead of keeping a track of all the babies in India, the researcher will select a total of 500 babies.

The number of babies constitutes the population for this particular research. Now, the researcher will
identify the age of babies when they begin to walk. Let us assume that 25% of the babies began to walk at
the age of 1.5 years old. Another 30% of the babies began to walk at the age of 2 years old.

This way, the researcher will calculate the actual mean of the sampling distribution of babies by picking a
handful of samples. The sample mean (average of a sample) will be further calculated along with other
sample means obtained from the same population.

August 2013
1. Mode
Done already
2. Standard deviation
Done already
3. Ordinal data
done already
4. Simple random Sampling
Done already
5. Questionnaire and Schedule

6. Null hypothesis
The null hypothesis is a statistical proposition suggesting that no relationship or significant
difference exists between two sets of observed data or measured phenomena. H0 symbolizes the null
hypothesis of no difference.
The null hypothesis, also known as the conjecture, assumes that any kind of difference between the
chosen characteristics that is seen in a set of data is due to chance.
7. Assumptions in ANOVA
note: written as a 5m
The one-way analysis of variance(ANOVA) is used to determine whether there are any statistically
significant differences between the means of two or more independent(unrelated) groups.
For example, one-way ANOVA can be used to understand whether exam performance differed based on
test anxiety level amongst students, dividing students into 3 independent groups (e.g. low, medium and
highly stressed students).
The assumptions are as follows:
1) The dependent variable should be measured at the interval or ratio level (i.e. they are continuous)
E.g. revision time(measured in hours)
Intelligence( measured using IQ score)
2) The independent variable should consist of two or more categorical, independent groups.
E.g. ethnicity groups( Caucasian, Hispanic, African American), physical activity level( sedentary, low,
moderate and high)
3) There should be independence of observations, which means there must be different participants in each
group, with no participant being in more than one group.
4) There should be no significant outliers. Outliers are single data points within the data that do not follow
the usual pattern. (e.g. in a study of 100 students IQ scores where the mean score was 108 with only a small
variation between students, one student had a score of 156, which is very unusual).
5) The dependent variable should be approximately normally distributed for each category of the
independent variable.
6) There needs to be homogeneity of variances.
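
A minimal sketch of a one-way ANOVA with scipy.stats.f_oneway (assumed available), using invented exam scores for three anxiety groups:

```python
from scipy import stats

low    = [78, 85, 80, 90, 88]   # hypothetical exam scores, low-anxiety group
medium = [72, 75, 70, 79, 74]
high   = [60, 65, 58, 70, 62]

f, p = stats.f_oneway(low, medium, high)
print(round(f, 2), round(p, 4))  # a small p -> at least one group mean differs
```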

8. Qualitative research
done already
9. Simulation research
Simulation is a methodological approach for organizational researchers.
While other methods answer the questions “What happened, and how and why?”, simulation helps answer the
question “What if?”. Simulation enables studies of more complex systems because it creates observations
by “moving forward” into the future, whereas other methods attempt to look backwards across history to
determine what happened, and how.
Simulation takes a model, composed of a structure and rules that govern that structure and produces
output(observed behavior). By comparing different output obtained via different structures and governing
rules, researchers can infer what might happen in the real situation if such interventions were to occur.

10. Histogram
Done already

September 2012
1. Nominal scale
Done already
2. Median
done already
3. Need for sampling
Done already
4. Descriptive vs Analytical research
Done already
5. Stratified Sampling

Stratified sampling involves dividing the population into subpopulations that may differ in important ways.
It allows us to draw more precise conclusions by ensuring that every subgroup is properly represented in
the sample.

To use this sampling method, we divide the population into subgroups (called strata) based on the relevant
characteristic (e.g. gender, age range, income bracket, job role).

Based on the overall proportions of the population, we calculate how many people should be sampled from
each subgroup. Then we use random or systematic sampling to select a sample from each subgroup.

Example

Suppose a company has 800 female employees and 200 male employees. We want to ensure that the sample
reflects the gender balance of the company, so we sort the population into two strata based on gender.
Then we use random sampling on each group, selecting 80 women and 20 men, which gives us a
representative sample of 100 people.
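
A sketch of this 800/200 example using the standard library, with invented employee ID ranges: proportional allocation per stratum, then simple random sampling within each stratum:

```python
import random

random.seed(7)
strata = {"female": list(range(1, 801)), "male": list(range(801, 1001))}
total = sum(len(members) for members in strata.values())
sample_size = 100

sample = []
for name, members in strata.items():
    # Proportional allocation: 80 from the female stratum, 20 from the male stratum
    k = round(sample_size * len(members) / total)
    sample.extend(random.sample(members, k))

print(len(sample))   # 100
```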

6. Define research
Research refers to the systematic method consisting of enunciating the problem, formulating a
hypothesis, collecting the facts or data, analyzing the facts and reaching certain conclusions either in the
form of solution(s) towards the concerned problem or in certain generalizations for some theoretical
formulation.
It is an original contribution to the existing stock of knowledge making for its advancement. It is the
pursuit of truth with the help of study, observation, comparison and experiment.
“Research comprises defining and redefining problems, formulating hypothesis or suggested solutions,
collecting, organizing and evaluating data, making deductions and reaching conclusions and at last careful
testing the conclusions to determine whether they fit the formulated hypothesis.” – Clifford Woody
7. Dependent and independent variable
Done already
8. Research problem
Done already

9. Experimental research designs


Experimental research designs are concerned with examination of the effect of independent variable on
the dependent variable, where the independent variable is manipulated through treatment or interventions,
and the effect of these interventions is observed in the dependent variable.
According to Riley, experimental research design is a powerful design for testing hypotheses of causal
relationship among variables.
Post-test-only control design, Solomon four-group design, Nonrandomized control group design etc. are
examples of experimental research design.

10. What is hypothesis?


Done already
