BY WAMBUGU, Lydiah N.

Data collected from the field must be analyzed, presented and interpreted to make it meaningful to the audiences of the research work. The process of reducing research data to manageable summaries is what is called data analysis. Data is analyzed using statistics.
Wambugu L.N. 4/30/2013


Statistics are tools for summarizing data, measuring relationships between sets of data or making inferences about a set of data. Data can be analyzed manually or by using a software package such as the Statistical Package for the Social Sciences (SPSS). Data analysis is done in chapter four.


The method of data analysis depends on the type of data collected, be it nominal, ordinal, interval or ratio data.

Nominal data is used for identification only, e.g. gender, marital status, political affiliation, type of housing, departments in a university, type of drink, subjects in a school, etc.


Ordinal data is ordered according to quantity, though the difference between one level and the next is not equal, e.g. educational level, income levels, position in class, grading system, mean grade, age groups.


Interval data consists of values with equal intervals between them but no absolute zero, e.g. temperature levels. 0°C does not mean absence of temperature.

Ratio data has an absolute zero value, e.g. weight, height, number of children.

NB: Most social scientists combine the first two (nominal and ordinal) into categorical data and the last two (interval and ratio) into continuous data.


There are two types of statistics that a researcher may use: descriptive and inferential statistics.


Descriptive statistics is a way of summarizing data, letting one number stand for a group of numbers. There are three ways of describing data:

Tabular representation of data
Graphical representation of data
Numerical representation of data


Tabular representation of data: data can be summarized by making a table of the data. In statistics these tables are referred to as frequency distribution tables, e.g.


Gender    Frequency    Percentage
Male      8            44.4
Female    10           55.6
TOTAL     18           100.0
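A frequency distribution table like the one above can be produced from raw responses with a few lines of code. The sketch below is illustrative only: the `responses` list is hypothetical data chosen to reproduce the 8 males and 10 females in the table, and `frequency_table` is a helper name introduced here, not part of any package.

```python
from collections import Counter

# Hypothetical raw responses, chosen to match the table: 8 males, 10 females.
responses = ["Male"] * 8 + ["Female"] * 10


def frequency_table(values):
    """Return (category, frequency, percentage) rows followed by a TOTAL row."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [(category, n, round(100 * n / total, 1)) for category, n in counts.items()]
    rows.append(("TOTAL", total, 100.0))
    return rows


table = frequency_table(responses)
for category, frequency, percentage in table:
    # Left-align the category, right-align the numeric columns.
    print(f"{category:<8}{frequency:>10}{percentage:>12.1f}")
```

The percentage column is each frequency divided by the total and multiplied by 100, rounded to one decimal place.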


Graphical representation of data: one can summarize data using:
A bar graph for categorical data
A histogram for continuous data
A frequency polygon for continuous data
A scatter diagram to show relationships for continuous data
Pie charts for categorical data


Numerical representation of data: this is where you use a single number to represent many numbers. This can be done by use of:
Measures of central tendency: mean (continuous), mode (categorical), median (continuous)
Measures of variability: range, standard deviation, variance (all for continuous data)


Measures of association/relationship: correlation (Spearman rank order for ordinal data, Pearson product moment for continuous data), regression (continuous)
Z-scores for continuous data: a z-score shows the standing of a score in a distribution
Cross tabulation: shows the relationship between categorical variables/data, e.g. gender and job category
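The measures of central tendency and variability listed above, together with z-scores, can be sketched with Python's standard `statistics` module. The `scores` list is hypothetical, and the sketch assumes the scores form a whole population, so it uses the population formulas (`pvariance`, `pstdev`); for a sample, `statistics.variance` and `statistics.stdev` would be used instead.

```python
import statistics

# Hypothetical scores out of 10, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability (population formulas, see the assumption above)
value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)


def z_score(x, mu, sigma):
    """Standing of score x in a distribution with mean mu and standard deviation sigma."""
    return (x - mu) / sigma


# How many standard deviations the highest score lies above the mean.
z_of_highest = z_score(9, mean, std_dev)
print(mean, median, mode, value_range, variance, std_dev, z_of_highest)
```

A positive z-score means the score is above the mean and a negative one means it is below.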


NB: Correlation relates two continuous variables, e.g. performance in history and geography. It shows both the strength and the direction of the relationship. Correlation does not imply causality, i.e. you cannot predict the value of one variable from the other because correlation has no independent variable (IV) and dependent variable (DV). Direction is shown by a negative or positive sign, while the strength is shown by the size of the r value, which ranges from -1.00 to +1.00. The table below shows correlation:


                                        Current Salary    Beginning Salary
Current Salary      Pearson Correlation 1                 .880(**)
                    Sig. (2-tailed)     .                 .000
                    N                   474               474
Beginning Salary    Pearson Correlation .880(**)          1
                    Sig. (2-tailed)     .000              .
                    N                   474               474
** Correlation is significant at the 0.01 level (2-tailed).
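The Pearson r reported in a table like the one above can be computed directly from its definition. The sketch below is a minimal illustration, not the SPSS procedure: `pearson_r` is a helper written for this example, and the history and geography marks are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical marks for five students in two subjects.
history = [50, 60, 70, 80, 90]
geography = [55, 65, 60, 85, 95]

r = pearson_r(history, geography)
print(f"r = {r:.2f}")
```

The sign of `r` gives the direction of the relationship and its size gives the strength.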


An r of 0.8 to 0.9 is a very strong positive linear correlation, 0.6 to 0.7 is a strong positive linear correlation, 0.5 is moderate, while below 0.4 is a weak positive linear correlation. A value of 1.00 is a perfect positive linear correlation.

With regression, you use a regression model (Y = a + bX + ε), where Y is the DV, a is the constant (the autonomous value of Y without X), b is the gradient (the marginal effect of X on Y), X is the IV and ε is the error term. Regression treats the IV as the cause of the DV, so you can also predict the value of the DV given the IV.

Plot    Amount of fertilizer used (in tonnes), IV    Yield of maize (in tonnes), DV
A       2                                            8
B       7                                            10
C       3                                            7


Assume that a = 1.5 and b = 0.2. How many tonnes of maize will a farmer harvest by using 17 tonnes of fertilizer?

NB: With or without fertilizer, the farmer will harvest 1.5 tonnes of maize. 1 tonne of fertilizer causes an increase of 0.2 tonnes in the yield.
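The question above can be worked through by substituting into Y = a + bX. The sketch below uses the assumed values a = 1.5 and b = 0.2 given in the question (they are not estimated from the three plots), and `predict_yield` is a name introduced here for illustration.

```python
def predict_yield(fertilizer_tonnes, a=1.5, b=0.2):
    """Predicted maize yield in tonnes from the regression line Y = a + bX."""
    return a + b * fertilizer_tonnes


# 17 tonnes of fertilizer: Y = 1.5 + 0.2 * 17
predicted = predict_yield(17)
print(f"Predicted yield: {predicted:.1f} tonnes")
```

Under these assumptions the farmer would harvest about 4.9 tonnes of maize: 1.5 tonnes from the constant plus 3.4 tonnes attributed to the fertilizer.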


Inferential statistics are the statistics we use to test hypotheses. They are a measure of the confidence we have in our descriptive statistics. Inferential statistics work on the assumption that samples were drawn from a normally distributed population. There are two types of tests: parametric tests for continuous data and non-parametric tests for categorical data.

NB: Remember that the researcher sets the alternative hypothesis (in chapter one) but tests the null hypothesis (in chapter four). The alternative hypothesis can be either directional or non-directional.

Example 1:
Ho: There is no significant difference in the mean performance of boys and girls in Mathematics
H1: There is a significant difference in the mean performance of boys and girls in Mathematics (non-directional)
A non-directional hypothesis leads to a two-tailed test.


Example 2:
Ho: Viceroy does not increase mood
H1: Viceroy increases mood (directional)
A directional hypothesis leads to a one-tailed test.


Examples of non-parametric tests: chi-square, Spearman rank order

Examples of parametric tests: t test, Z test, Pearson, ANOVA, regression model


State the null and alternative hypotheses based on your research question:
Ho: μ1 = μ2
H1: μ1 ≠ μ2 (non-directional), or μ1 < μ2 or μ1 > μ2 (directional)
Set the alpha level: α = 0.05. This means that we have 5 chances in 100 of making a type 1 error (rejecting a null hypothesis when it is true).


Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary, e.g. χ², t test, F test.
Write the decision rule for rejecting the null hypothesis.


Write a summary statement based on the decision

Write a statement of results in standard English


Chi-square test: this is used in situations where you have two categorical variables, e.g. gender and job category. Chi-square tests the null hypothesis that there is no relationship between the two variables. The chi-square statistic is obtained after cross tabulation.

T tests: this compares the means of two sets of numbers. It can be used to compare two independent groups (independent-samples t test), e.g. chemistry performance of boys and girls, or to compare observations from two measurement occasions for the same group (paired-samples t test), e.g. starting salaries and current salaries among teachers in school A.

Correlation: the statistics used are Pearson and Spearman (refer to the table on correlation).

Regression: this allows you to make statements about how well one or more independent variables will predict the value of a dependent variable (refer to slide 21).

Regression output includes an ANOVA table that describes the overall variance accounted for in the model. The F statistic represents a test of the null hypothesis that the regression coefficients are equal to zero.

One-way ANOVA is used to predict the dependent variable given one independent variable.
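As an illustration of one of the tests described above, the independent-samples t test for two groups can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the pooled-variance form of the t statistic (which assumes roughly equal variances in the two groups), the chemistry marks are hypothetical, `independent_t` is a helper written for this example, and the critical value is the tabled two-tailed t for α = 0.05 with 4 degrees of freedom.

```python
import math
import statistics


def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    # Sample variances (n - 1 in the denominator).
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical chemistry marks, for illustration only.
boys = [60, 70, 80]
girls = [50, 60, 70]

t_value, df = independent_t(boys, girls)

# Two-tailed critical t at alpha = 0.05 with df = 4, read from statistical tables.
CRITICAL_T = 2.776
reject_null = abs(t_value) >= CRITICAL_T
print(f"t = {t_value:.3f}, df = {df}, reject Ho: {reject_null}")
```

With these made-up marks the calculated t is smaller than the critical value, so the null hypothesis of no difference in mean performance would not be rejected.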


Assume that 100 M.A students have a choice of buying an IBM, HP or Dell Computer. The table below shows the observed frequencies of the said students:


Computer Model    Frequency
IBM               47
HP                36
DELL              17
TOTAL             100


Using the appropriate statistic, test the hypothesis that there is no significant difference among the frequencies with which students purchased the three different models of computers. The appropriate statistic is χ². χ² is used in situations where you have categorical data. Assume that after calculating using the formula, the χ² value is 13.820.


Steps in Testing the Hypothesis
State the null and the alternative hypothesis based on your research question:
Ho: O = E (There is no significant difference between the observed and the expected frequencies)
H1: O ≠ E (There is a significant difference between the observed and the expected frequencies)


Set the alpha level: α = 0.05
Calculate the value of the appropriate statistic: χ² = 13.820, df = 2
Write the decision rule for rejecting the null hypothesis: Reject Ho if χ² ≥ 5.991 (the critical value of χ² at 0.05). This value is obtained from statistical tables.


Write a summary statement based on the decision: Reject Ho, p < 0.05
NB: Since our calculated value of χ² (13.820) is greater than 5.991, we reject the null hypothesis and accept the alternative hypothesis.


Write a statement of results in standard English: There is a significant difference among the frequencies with which students purchased three models of computers.
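The calculation behind this worked example can be sketched in a few lines. The sketch assumes that, under the null hypothesis, the expected frequencies are equal across the three models (100 / 3 each); with that assumption the formula χ² = Σ (O - E)² / E reproduces the value of 13.820 quoted above. The function name is introduced here for illustration, and the critical value is the tabled one from the decision rule.

```python
# Observed frequencies from the table of computer purchases.
observed = {"IBM": 47, "HP": 36, "DELL": 17}


def chi_square_goodness_of_fit(observed_counts):
    """Chi-square statistic, assuming equal expected frequencies in every category."""
    total = sum(observed_counts)
    expected = total / len(observed_counts)
    return sum((o - expected) ** 2 / expected for o in observed_counts)


chi_square = chi_square_goodness_of_fit(list(observed.values()))
df = len(observed) - 1  # number of categories minus one

# Critical value of chi-square at alpha = 0.05 with df = 2, from statistical tables.
CRITICAL_VALUE = 5.991
reject_null = chi_square >= CRITICAL_VALUE
print(f"chi-square = {chi_square:.3f}, df = {df}, reject Ho: {reject_null}")
```

Because the calculated value exceeds the critical value, the decision is the same as in the steps above: reject Ho.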


Qualitative data is information gathered in a non-numeric form and consists of detailed descriptions of situations, events, people, interactions and observed behaviors. It also consists of direct quotations from people about their experiences, attitudes, beliefs and thoughts. Common examples of such data are interview transcripts, documents (reports, meetings, minutes, e-mails), field notes, video, audio recordings and images. Qualitative data consists of words and observations, not numbers.


Whereas in quantitative analysis data analysis begins after data collection, in qualitative research analysis is tied to data collection and occurs throughout the data collection as well as at the end of the study. The purpose of analyzing data in the field is to discover categories and underlying themes and to develop grounded theories.


Processing Condensed Data - This step is also referred to as pre-analysis of data. It involves transcribing interviews (changing from audio to print) and translating observed events and behaviors into words.

Condensing Data - This is done by:

a) Editing data: editing ensures removal of any grammatical errors so as to give a precise explanation in a concise form and also increase clarity. It is important to note that while editing, the critical meaning of the data is not changed.

For instance, assume this interview between John (interviewer) and Jose and Tom (interviewees):
John: Are you gay?
Jose: I am aah not gay
This should not be edited because it shows emphasis or sends out a clear message about one's feelings or stand on a particular issue, as opposed to having a full stop. Contrast this with Tom's response below:
Tom: I am gay

b) Removing ambiguity: this involves clarifying the meanings presented by the data. Where a phrase is repeated and monotonously used by a participant, it is referred to as ambiguity. In this case the repetitive phrase/statement needs to be edited. For instance:
John: Have you visited England?
Jose: I have ah of course I have visited England
This needs to be edited.

c) Creating data categories: a data category is a theme or class of data established to represent related or similar forms of data. This is a complex process and requires that the researcher be very familiar with the data. He/she must be able to detect various categories in the data, which should be distinct from each other. The researcher should then establish the relationship among these categories. Themes are identified from the literature review and from unexpected theories.

d) Selecting and assigning data to established categories using codes

e) Summarizing the data in each category: the researcher should condense and report in his/her own words and should not keep repeating the same quote if it was said by two or more people.
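Steps c) to e) above can be pictured as a simple data structure: each excerpt carries a code, each code belongs to a category, and the excerpts are then grouped and summarized per category. The sketch below is purely illustrative; the codes, categories, participants and excerpts are all hypothetical and do not come from any study.

```python
from collections import defaultdict

# Hypothetical codebook mapping each code to a category (theme).
codebook = {"FEE": "Cost of schooling", "DIST": "Distance to school"}

# Hypothetical coded excerpts: (participant, code, excerpt).
coded_excerpts = [
    ("P1", "FEE", "The fees are too high for us."),
    ("P2", "DIST", "My children walk a long way every morning."),
    ("P3", "FEE", "We cannot afford the fees."),
]


def group_by_category(excerpts, codes):
    """Assign each coded excerpt to its category using the codebook."""
    categories = defaultdict(list)
    for participant, code, text in excerpts:
        categories[codes[code]].append((participant, text))
    return dict(categories)


grouped = group_by_category(coded_excerpts, codebook)

# How many excerpts fall under each category; similar quotes are counted, not repeated.
summary = {category: len(items) for category, items in grouped.items()}
print(summary)
```

Grouping in this way makes it easier to report each category once in the researcher's own words instead of repeating the same quote for every participant.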


Presentation of Findings - This is used to display analyzed data. This is done using strategies such as narratives, direct quotes, matrices, tables and diagrams. The researcher should think critically about the techniques s/he should use to present data and justify the selected strategy. In qualitative research, tables are not only statistical but are also interpretive (analytic) frames. An interpretive frame has a Question (Theme), Response and Remarks.

Making Sense of the Findings - Making sense of the findings involves interpreting the results and drawing parallels and disparities with existing theories, e.g. "these results concur with the study done in ... which ...". The researcher quotes verbatim when a whole response, or a section of it, appears to summarize a wide range of the aspects being investigated. Finally, the researcher should make conclusions and recommendations of the study.

