Data Analysis Tools.

BY WAMBUGU, Lydiah N.
Data collected from the field must be analyzed, presented and interpreted to make it meaningful to the audiences of the research work. The process of reducing research data to manageable summaries is what is called data analysis. Data is analyzed using statistics.
Wambugu L.N. 4/30/2013
Statistics are tools for summarizing data, measuring relationships between sets of data, or making inferences about a set of data. Data can be analyzed manually or by using a software package such as the Statistical Package for the Social Sciences (SPSS). Data analysis is done in chapter four.
Methods of data analysis depend on the type of data collected, be it nominal, ordinal, interval or ratio data.
Nominal data is used for identification only, e.g. gender, marital status, political affiliation, type of housing, departments in a university, type of drink, subjects in a school, etc.
Ordinal data is ordered according to quantity, though the difference between one level and the next is not equal, e.g. educational level, income levels, position in class, grading system, mean grade, age groups.
Interval data has values whose intervals are equal but which have no absolute zero, e.g. temperature levels: 0 °C does not mean absence of temperature.
Ratio data has an absolute zero value, e.g. weight, height, number of children.
NB: Most social scientists combine the first two into categorical data and the last two into continuous data.
There are two types of statistics that a researcher may use: descriptive and inferential statistics.
Descriptive statistics is a way of summarizing data, letting one number stand for a group of numbers. There are three ways of describing data:

Tabular Representation of Data
Graphical Representation of Data
Numerical Representation of Data
Tabular representation of data: data can be summarized by making a table of the data. In statistics these tables are referred to as frequency distribution tables, e.g.
Gender     Frequency   Percentage
Male       8           44.4
Female     10          55.6
TOTAL      18          100.0
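A frequency distribution table like the one above can be built from raw responses. The sketch below is illustrative only: the list `responses` and the helper `frequency_table` are hypothetical names, and the raw data is made up to match the counts in the table (8 males, 10 females).

```python
from collections import Counter

# Hypothetical raw responses chosen to match the table: 8 males, 10 females.
responses = ["Male"] * 8 + ["Female"] * 10


def frequency_table(values):
    """Return rows of (category, frequency, percentage) and the total count."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [(cat, n, round(100 * n / total, 1)) for cat, n in counts.items()]
    return rows, total


rows, total = frequency_table(responses)
for category, freq, pct in rows:
    print(f"{category:<8}{freq:>5}{pct:>8.1f}")
print(f"{'TOTAL':<8}{total:>5}{100.0:>8.1f}")
```

The percentages are simply each frequency divided by the total and multiplied by 100, rounded to one decimal place.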
Graphical representation of data: one can summarize data using:
A bar graph for categorical data
A histogram for continuous data
A frequency polygon for continuous data
A scatter diagram to show relationships for continuous data
Pie charts for categorical data
Numerical representation of data: this is where you use a single number to represent many numbers. This can be done by use of:
Measures of Central Tendency: mean (continuous), mode (categorical), median (continuous)
Measures of Variability: range, standard deviation, variance (all for continuous data)
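As a rough illustration of these measures, the sketch below uses Python's standard `statistics` module on a small set of made-up scores. The data is hypothetical, and the sample (n - 1) formulas are an assumption, since the slides do not say whether sample or population variance is meant.

```python
import statistics

# Hypothetical scores (continuous data); not taken from the slides.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability (sample formulas, dividing by n - 1)
data_range = max(scores) - min(scores)
variance = statistics.variance(scores)
std_dev = statistics.stdev(scores)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.3f}, sd={std_dev:.3f}")
```

Each of these is one number standing in for the whole list, which is the idea behind numerical representation.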
Measures of Association / Relationship: correlation (Spearman rank order for ordinal data, Pearson product-moment for continuous data) and regression (continuous data)
Z-scores for continuous data: a z-score shows the standing of a score in a distribution
Cross tabulation: shows the relationship between categorical variables/data, e.g. gender and job category
NB: Correlation relates two variables which are continuous, e.g. performance in history and geography. It shows both the strength and the direction of the relationship. Correlation does not imply causality, i.e. you cannot predict the value of the DV from the IV, because correlation has no IV and DV. Direction is shown by a negative or positive sign, while strength is shown by the size of the r value, which ranges from -1.00 to +1.00. The table below shows correlation:
                                        Current Salary   Beginning Salary
Current Salary     Pearson Correlation  1                .880(**)
                   Sig. (2-tailed)      .                .000
                   N                    474              474
Beginning Salary   Pearson Correlation  .880(**)         1
                   Sig. (2-tailed)      .000             .
                   N                    474              474
** Correlation is significant at the 0.01 level (2-tailed).
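To show where an r value such as the one in the correlation table comes from, here is a minimal sketch of the Pearson product-moment formula. The function name `pearson_r` and the history and geography marks are hypothetical; the salary data behind the table is not available, so the sketch does not reproduce the .880 figure.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical marks for five students (illustration only)
history = [50, 60, 70, 80, 90]
geography = [55, 60, 75, 70, 90]

r = pearson_r(history, geography)
print(f"r = {r:.3f}")
```

The sign of `r` gives the direction of the relationship and its size gives the strength.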
An r of 0.8 to 0.9 is a very strong positive linear correlation, 0.6 to 0.7 is a strong positive linear correlation, 0.5 is moderate, while below 0.4 is a weak positive linear correlation. A value of 1.00 is a perfect positive linear correlation.
With regression, you use a regression model (Y = a + bX + ε), where Y is the DV, a is the constant (the autonomous value of Y without X), b is the gradient (the marginal effect of X on Y), X is the IV and ε is the error term. Regression assumes a causal direction from the IV to the DV, so you can also predict the value of the DV given the IV.
Plot   Amount of fertilizer used (in tonnes), IV   Yield of maize (in tonnes), DV
A      2                                           8
B      7                                           10
C      3                                           7
Assume that a = 1.5 and b = 0.2. How many tonnes of maize will a farmer harvest by using 17 tonnes of fertilizer? NB: With or without fertilizer, the farmer will harvest 1.5 tonnes of maize. One tonne of fertilizer causes an increase of 0.2 tonnes in the yield.
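The question can be worked through by substituting into Y = a + bX with a = 1.5 and b = 0.2. The short sketch below does that arithmetic; the function name `predict_yield` is hypothetical, and the error term is ignored because only the predicted value is wanted.

```python
def predict_yield(fertilizer_tonnes, a=1.5, b=0.2):
    """Predicted maize yield in tonnes from the model Y = a + bX."""
    return a + b * fertilizer_tonnes


# 1.5 + 0.2 * 17 = 1.5 + 3.4 = 4.9
print(f"{predict_yield(17):.1f} tonnes")  # prints "4.9 tonnes"
```

With no fertilizer (X = 0) the same function returns the constant, 1.5 tonnes.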
Inferential statistics is the statistics we use to test hypotheses. It is a measure of the confidence we have in our descriptive statistics. Inferential statistics works on the assumption that samples were drawn from a normally distributed population. There are two types of tests: parametric tests for continuous data and non-parametric tests for categorical data.
NB: Remember that the researcher sets the alternative hypothesis (in chapter one) but tests the null hypothesis (in chapter four). The alternative hypothesis can be either directional or non-directional.
Example 1:
Ho: There is no significant difference in the mean performance of boys and girls in Mathematics
H1: There is a significant difference in the mean performance of boys and girls in Mathematics (non-directional)
A non-directional hypothesis leads to a two-tailed test.
Example 2:
Ho: Viceroy does not increase mood
H1: Viceroy increases mood (directional)
A directional hypothesis leads to a one-tailed test.
Examples of non-parametric tests: Chi-square, Spearman rank order
Examples of parametric tests: t-test, Z-test, Pearson, ANOVA, regression model
State the null and alternative hypotheses based on your research question:
Ho: μ1 = μ2
H1: μ1 ≠ μ2 (non-directional), or μ1 < μ2 or μ1 > μ2 (directional)
Set the alpha level: α = 0.05. This means that we have 5 chances in 100 of making a Type 1 error (rejecting a null hypothesis when it is true).
Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary, e.g. χ², t-test, F-test.
Write the decision rule for rejecting the null hypothesis.
Write a summary statement based on the decision
Write a statement of results in standard English
Chi-square test: this is used in situations where you have two categorical variables, e.g. gender and job category. Chi-square tests the null hypothesis that there is no relationship between the two variables. The chi-square statistic is obtained after cross tabulation.
t tests: these compare the means of two sets of numbers. A t test can be used to compare two independent groups (independent-samples t test), e.g. chemistry performance of boys and girls, or to compare observations from two measurement occasions for the same group (paired-samples t test), e.g. starting salaries and current salaries among teachers in school A.
Correlation: the statistics used are Pearson and Spearman (refer to the table on correlation).
Regression: this allows you to make statements about how well one or more independent variables will predict the value of a dependent variable (refer to slide 21). The regression output includes an ANOVA table that describes the overall variance accounted for in the model. The F statistic represents a test of the null hypothesis.
One-way ANOVA is used to predict the dependent variable given one independent variable.
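As one illustration of calculating a test statistic, the sketch below computes an independent-samples t statistic and its degrees of freedom. It assumes the equal-variance (pooled) form of the test, which the slides do not specify, and the chemistry scores and the name `independent_t` are hypothetical. The critical value would still be read from statistical tables.

```python
import math
import statistics


def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # Pooled variance: the two sample variances weighted by their df
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / std_error, n1 + n2 - 2


# Hypothetical chemistry scores (illustration only)
boys = [52, 54, 56]
girls = [51, 52, 53]

t_value, df = independent_t(boys, girls)
print(f"t = {t_value:.3f}, df = {df}")
```

The calculated t is then compared with the tabled critical value at the chosen alpha level and degrees of freedom.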
Assume that 100 M.A. students have a choice of buying an IBM, HP or Dell computer. The table below shows the observed frequencies for these students:
Computer Model   IBM   HP   DELL   TOTAL
Frequency        47    36   17     100
Using the appropriate statistic, test the hypothesis that there is no significant difference among the frequencies with which students purchased the three different models of computers. The appropriate statistic is χ². χ² is used in situations where you have categorical variables. Assume that after calculating using the formula, the χ² value is 13.820.
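The value 13.820 can be checked with a short calculation. The sketch below assumes that the expected frequency is the same for every model (100 / 3 for each), which the slides do not state explicitly but which reproduces the quoted figure. The function name `chi_square_equal_expected` is hypothetical.

```python
def chi_square_equal_expected(observed):
    """Chi-square statistic when every category has the same expected count."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Observed frequencies from the table
observed = {"IBM": 47, "HP": 36, "DELL": 17}

chi2 = chi_square_equal_expected(list(observed.values()))
df = len(observed) - 1  # number of categories minus one
print(f"chi-square = {chi2:.3f}, df = {df}")  # prints "chi-square = 13.820, df = 2"
```

Each term is (O - E)² / E for one model, and the three terms are summed.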
Steps in Testing the Hypothesis
State the null and the alternative hypotheses based on your research question:
Ho: O = E (There is no significant difference between the observed and the expected frequencies)
H1: O ≠ E (There is a significant difference between the observed and the expected frequencies)
Set the alpha level: α = 0.05
Calculate the value of the appropriate statistic: χ² = 13.820, df = 2
Write the decision rule for rejecting the null hypothesis: Reject Ho if χ² > 5.991 (the critical value of χ² at 0.05). This value is obtained from statistical tables.
Write a summary statement based on the decision: Reject Ho, p < 0.05
NB: Since our calculated value of χ² (13.820) is greater than 5.991, we reject the null hypothesis and accept the alternative hypothesis.
Write a statement of results in standard English: There is a significant difference among the frequencies with which students purchased three models of computers.
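The decision step in this example can also be sketched in code. For the special case of 2 degrees of freedom, the chi-square critical value and p-value have simple closed forms (critical value = -2 ln α, p = exp(-χ²/2)), so no statistical table is needed. The function name `chi_square_decision_df2` is hypothetical, and the shortcut applies only when df = 2.

```python
import math


def chi_square_decision_df2(chi2, alpha=0.05):
    """Critical value, p-value and reject decision for a chi-square test with df = 2."""
    critical = -2 * math.log(alpha)   # about 5.991 when alpha = 0.05
    p_value = math.exp(-chi2 / 2)     # upper-tail probability for df = 2
    return critical, p_value, chi2 > critical


critical, p_value, reject = chi_square_decision_df2(13.820)
print(f"critical value = {critical:.3f}")
print(f"p < 0.05: {p_value < 0.05}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

For other degrees of freedom the critical value would still be read from statistical tables, as in the worked example.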
Qualitative data is information gathered in a non-numeric form and consists of detailed descriptions of situations, events, people, interactions and observed behaviors. It also consists of direct quotations from people about their experiences, attitudes, beliefs and thoughts. Common examples of such data are interview transcripts, documents (reports, meetings, minutes, e-mails), field notes, video, audio recordings and images. Qualitative data consists of words and observations, not numbers.
Whereas in quantitative analysis data analysis begins after data collection, in qualitative research analysis is tied to the data collection and occurs throughout the data collection, as well as at the end of the study. The purpose of analyzing data in the field is to discover categories and underlying themes and to develop grounded theories.
Processing Condensed Data - This step is also referred to as pre-analysis of data. It involves transcribing interviews (changing from audio to print) and translating observed events and behaviors into words.
Condensing Data - This is done by:
a) Editing data: editing ensures removal of any grammatical errors so as to give a precise explanation in a concise form and to increase clarity. It is important to note that while editing, the critical meaning of the data is not changed.
For instance, assume this interview between John (interviewer) and Jose and Tom (interviewees):
John: Are you gay?
Jose: I am aah not gay
This should not be edited, because it shows emphasis or sends out a clear message about one's feelings or stand on a particular issue, as opposed to having a full stop. Contrast this with Tom's response below:
Tom: I am gay
b) Removing ambiguity: this involves clarifying the meanings presented by the data. Where a phrase is repeated and monotonously used by a participant, it is referred to as ambiguity. In this case the repetitive phrase/statement needs to be edited. For instance:
John: Have you visited England?
Jose: I have ah of course I have visited England
This needs to be edited.
c) Creating data categories: a data category is a theme or class of data established to represent related or similar forms of data. This is a complex process and requires that the researcher be very familiar with the data. He/she must be able to detect various categories in the data, which should be distinct from each other. The researcher should then establish the relationship among these categories. Themes are identified from the literature review and unexpected theories.
d) Selecting and assigning data to established categories using codes
e) Summarizing the data in each category: the researcher should condense and report in his/her own words and should not keep on repeating the same quote if said by two or more people.
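Steps d) and e) can be pictured as grouping coded excerpts by category and collapsing repeated quotes. The sketch below is only an illustration of that bookkeeping, not a substitute for the researcher's judgment. The codes, respondents and quotes are all invented, and `group_by_code` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical coded excerpts: (respondent, code, quote). Invented for illustration.
coded_excerpts = [
    ("R1", "COST", "The fees are too high for most parents."),
    ("R2", "DISTANCE", "The school is very far from our village."),
    ("R3", "COST", "We cannot afford the uniform."),
    ("R4", "COST", "The fees are too high for most parents."),
]


def group_by_code(excerpts):
    """Map each code to its distinct quotes and the respondents who gave them."""
    categories = defaultdict(lambda: defaultdict(list))
    for respondent, code, quote in excerpts:
        categories[code][quote].append(respondent)
    return categories


categories = group_by_code(coded_excerpts)
for code, quotes in categories.items():
    print(code)
    for quote, respondents in quotes.items():
        # A quote given by several people is listed once, with a count
        print(f"  {quote} ({len(respondents)} respondent(s))")
```

Grouping this way makes it easy to see which quotes were repeated, so that each one is reported once in the summary.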
Presentation of Findings - This is used to display analyzed data. This is done using strategies such as narratives, direct quotes, matrices, tables and diagrams. The researcher should think critically about the techniques s/he should use to present data and justify the selected strategy. In qualitative research, tables are not only statistical but are also interpretive/analytic frames. An interpretive frame has a Question (Theme), Response and Remarks.
Making Sense of the Findings - Making sense of the findings involves interpreting the results and drawing parallels and disparities from existing theories, e.g. "these results concur with the study done in ... which ...". The researcher quotes verbatim when all the given responses or a section appears to summarize a wide range of aspects being investigated. Finally, the researcher should draw conclusions and make recommendations from the study.