Uploaded by Adika Eka

This document introduces the interpretation of several statistical procedures, such as crosstabs, chi-square, and Pearson's correlation.

© All Rights Reserved


Below I provide some basic notes on statistical interpretation for some selected procedures.

The information provided here is not exhaustive. There is more to learn about

assumptions, applications, and interpretation of these procedures. Further information

can be obtained in statistics textbooks and statistics courses.

Crosstabs:

Crosstab is short for cross-tabulation or cross-classification table. In its basic form it is a

bivariate table. Usually the independent variable is represented by the columns and the

dependent variable is represented by the rows.

One can use any variables with any level of measurement in a crosstab but usually they are

constructed using nominal or ordinal variables. Because interval/ratio variables tend to have

many potential values, crosstabs are usually impractical for these levels of measurement.

More complex multivariate crosstabs can also be constructed (e.g., where a third variable is

controlled).

The data in crosstabs are usually presented as either percentages or frequencies. Percentages can

pertain to the cell as a function of either: 1) the column, 2) the row, or 3) the total. In constructing a

crosstabulation for a report you should make clear which of these types of percentage is being

calculated. (This can often be done easily by providing a total percentage at the end of the row or

column.)
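
The three types of percentage can be illustrated with a small sketch. The cases below are hypothetical (they are not taken from any table in these notes), and the function name percentages is only for illustration.

```python
from collections import Counter

# Hypothetical (gender, response) pairs; gender is the column variable.
cases = (
    [("women", "agree")] * 3 + [("women", "disagree")] * 1
    + [("men", "agree")] * 3 + [("men", "disagree")] * 3
)

cell = Counter(cases)                        # frequency in each cell
col_total = Counter(g for g, _ in cases)     # totals for the column variable
row_total = Counter(r for _, r in cases)     # totals for the row variable
n = len(cases)                               # total number of cases


def percentages(gender, response):
    """Return the column, row, and total percentage for one cell."""
    f = cell[(gender, response)]
    return {
        "column": 100 * f / col_total[gender],
        "row": 100 * f / row_total[response],
        "total": 100 * f / n,
    }


print(percentages("women", "agree"))
# {'column': 75.0, 'row': 50.0, 'total': 30.0}
```

The same cell (3 women who agree) gives three different percentages, which is why a report should state which one is being shown.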

In providing descriptive interpretation of results one can discuss the relative frequency or

percentage of cases falling in particular cells. Usually this is done in reference to the column

variable. E.g., 35% of women strongly agreed with statement X, while only 15% of men strongly

agreed with statement X.


Chi-Square:

Technically this is a test of statistical independence. That is, if two variables are unrelated then

they are independent of one another. If not, they are dependent. Another way of thinking about

this is that they are associated. Chi-square can be used with nominal and ordinal variables. If

the significance value corresponding to the chi-square test is less than or equal to .05, then the

test is deemed to be statistically significant and you can interpret the two variables in the test as

being dependent or associated.
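
As a rough sketch of what the test computes, the chi-square statistic compares the observed counts with the counts expected if the two variables were independent. The 2x2 table below is hypothetical, and the calculation is the plain Pearson chi-square without a continuity correction, so statistical software may report a slightly different value for small tables.

```python
import math

# Hypothetical counts (rows: full-time, not full-time; columns: men, women).
observed = [[30, 20],
            [10, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / N.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom, where the p-value has a closed form.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 2), p_value <= 0.05)
# 16.67 True
```

The closed-form p-value only applies with 1 degree of freedom; larger tables need the chi-square distribution from a statistics package.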

There are several limitations to the chi-square test. Two of these are: 1) the test does not tell you

about the direction of an association (e.g., positive or negative), 2) the test does not tell you about

the strength of an association.

From the chi-square statistic (and its related level of significance) all you can say is that the

variables are statistically associated or not.

You can, however, try to interpret the percentages in the related crosstabulation.

In Table 1, the chi-square is significant. This means that employment status and gender are

statistically associated. The results in the crosstabulation suggest that men are more likely to be

employed full-time.


Pearson's Correlation:

Pearson's correlation is a bivariate measure of association for interval/ratio level variables.

Pearson's correlation ranges from -1 to 1; its absolute value ranges from 0 to 1.

A correlation of 0 means that there is no linear statistical association between two variables. A

correlation of 1 means that there is a perfect positive correlation (or linear association) between

two variables. A correlation of -1 means that there is a perfect negative correlation between two

variables. A correlation of .50 means that there is a moderately strong positive correlation

between two variables.
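
For readers who want to see where the number comes from, here is a minimal sketch of the calculation. The data are hypothetical, and pearson_r is an illustrative helper rather than a function from any particular statistics package.

```python
import math

# Hypothetical paired observations (years of education, income in $1,000s).
education = [10, 12, 12, 14, 16, 16, 18, 20]
income = [22, 30, 25, 41, 38, 52, 49, 60]


def pearson_r(x, y):
    """Pearson's correlation: co-variation divided by the product of the spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


r = pearson_r(education, income)
print(round(r, 2))
```

Correlating a variable with itself gives 1, and correlating it with its own negative gives -1, which are the two ends of the range.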

There is also an associated test of significance. If the significance value (p.) is ≤ .05, then the

correlation is deemed to be statistically significant.

In Table 2 the correlation between years of education and personal income is .42, and p. is < .01.

Thus there is a significant, moderately strong positive correlation between education and income.

(Another way of saying this is that there is a significant, moderately strong positive linear

association between education and income.)

In other words, people with higher levels of education tend to earn higher levels of income, and

people with lower levels of education tend to earn lower levels of income.


Multiple Regression Analysis:

Multiple regression analysis examines the strength of the linear relationship between a set of

independent variables and a single dependent variable (measured at the interval/ratio level).

The R² provides the proportion of variation in the dependent variable that is explained by the

independent variables in the model. For example, the independent variables in Model 5 of Table

7 explain .20 of the variation in environmentally friendly behaviour, or, converted into a

percentage, they explain 20% of the variation in environmentally friendly behaviour.
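
The idea of "proportion of variation explained" can be sketched with a few numbers. The observed and predicted scores below are hypothetical; the predictions are simply the means of two groups, which is what a least-squares regression on a single 0/1 variable would produce.

```python
# Hypothetical observed scores and the scores predicted by a fitted model.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]

mean_y = sum(observed) / len(observed)

# Total variation around the mean, and variation left unexplained by the model.
ss_total = sum((y - mean_y) ** 2 for y in observed)
ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))

r_squared = 1 - ss_residual / ss_total
print(round(r_squared, 2))
# 0.8
```

Here the model explains .80, or 80%, of the variation in the observed scores.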

There are two types of coefficients that are typically displayed in a multiple regression table:

unstandardized coefficients, and standardized coefficients.

To interpret an unstandardized regression coefficient: for every metric unit change in the

independent variable, the dependent variable changes by X units. For instance, if income is the

dependent variable, and years of education is one of the independent variables, and the

unstandardized regression coefficient for education is 3,000, then this would mean that for every

additional year of education a respondent has, their income increases by $3,000.00 (controlling

for the other independent variables in the equation).

In multiple regression, the effects of the independent variables are always net effects

controlling simultaneously for the effects of the other variables in the equation.

One advantage of using unstandardized coefficients is that they have readily interpretable

substantive meaning (such as in the example of education and income given above).

One disadvantage is that the independent variables usually have different metrics (e.g. income in

dollars, age in years, attitudes on a rating scale, etc.). This makes it difficult to compare the

relative influence of different independent variables upon the dependent variable.

Standardized regression coefficients are based on changes in standard deviation units. For

example, in Model 5 of Table 7, for every standard deviation unit increase in activism, the

respondent's score on the environmentally friendly behaviour index increases by .18 standard

deviation units.
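
The link between the two types of coefficient can be shown with a short sketch: a standardized coefficient is the unstandardized coefficient rescaled by the standard deviations of the two variables. The coefficient of 3,000 comes from the education and income example above; the two standard deviations are made-up values for illustration only.

```python
def standardized_coefficient(b, sd_x, sd_y):
    """Rescale an unstandardized coefficient b into standard deviation units."""
    return b * sd_x / sd_y


# b = 3,000 dollars per year of education (from the example above).
# sd_x and sd_y are hypothetical standard deviations of education and income.
beta_education = standardized_coefficient(b=3000, sd_x=2.0, sd_y=20000.0)
print(round(beta_education, 2))
# 0.3
```

With these assumed values, a one standard deviation increase in education goes with a .30 standard deviation increase in income.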


One advantage of using standardized regression coefficients is that you can compare the relative

strength of the coefficients. Generally, the closer to the absolute value of 1 the coefficient is, the

stronger the effect of that independent variable on the dependent variable (controlling for other

variables in the equation). The closer the coefficient is to 0, the weaker the effect of that

independent variable.

For example, in Model 1 of Table 7, Age has the strongest effect on environmentally friendly

behaviour (-.23), while income (log) has the smallest effect (-.08).

(0 means no net effect; under unusual circumstances in multiple regression, standardized

regression coefficients can be greater than the absolute value of 1; in bivariate regression the

standardized regression coefficient, also known as Pearson's correlation coefficient, has a

maximum value of the absolute value of 1.)

While it is technically not supposed to be done, sometimes ordinal variables (measured in Likert-type scales) are treated as interval/ratio level variables and used as independent variables.

It is also possible to include categorical variables as independent variables but they have to be

binarized, and coded as 0 or 1. Also, at least one category has to be left out to serve as a

reference category. Variables coded in this way are referred to as dummy variables.
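
A minimal sketch of this coding, using a hypothetical three-category variable (region is not a variable from Table 7, and dummy_code is an illustrative helper):

```python
def dummy_code(values, reference):
    """Return one 0/1 column per category, leaving out the reference category."""
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical categorical variable with three categories.
region = ["east", "west", "north", "west", "east"]
dummies = dummy_code(region, reference="east")

print(dummies)
# {'north': [0, 0, 1, 0, 0], 'west': [0, 1, 0, 1, 0]}
```

Cases in the reference category ("east" here) are coded 0 on every dummy variable, so each coefficient is read as a difference from that category.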

For example, in Table 7 gender is coded as male = 1, and female = 0.

If one had income as a dependent variable in a multiple regression, and the unstandardized

regression coefficient for gender was 10,000 then (assuming the previous coding scheme) men

would make $10,000 more than women, controlling for other variables in the equation.

Another example in Table 7 is Gendpar where female parents are coded as 1, and everyone else

is coded as 0.

It is somewhat more difficult to interpret standardized regression coefficients for dummy

variables because standard deviation unit changes are somewhat meaningless when there are only

two categories. In Model 1 of Table 7, it can be said that there is a significant effect for gender:

females have higher scores for environmentally friendly behaviour.

In multiple regression analysis, significance levels are usually reported for each of the

individual regression coefficients, and a separate significance level is reported for

the equation as a whole, associated with the R².


Usually .05 is the minimum criterion for indicating that a result is significant (though in Table 7, the

level of .10 is also reported).

For example, in Model 2 of Table 7, the following independent variables are significant at the .05

level: gender, age, and education (squared).

The following variables are not significant at the .05 level: income (log), parent.

In Model 2 of Table 7 the equation as a whole is significant. (See the asterisk next to the R².)

There are a variety of different ways of displaying information in a multiple regression table.

Sometimes a series of models is presented (such as in Table 7) where conceptually similar

variables are grouped together and added in a block, and then different blocks are added in

sequence usually associated with theoretical arguments. This is often referred to as hierarchical

regression analysis.

Sometimes only the results associated with a single model are presented.

Sometimes only the unstandardized coefficients are provided.

Sometimes only the standardized coefficients are provided (this is the case in Table 7).

Sometimes the standard error associated with the coefficient is provided.

Sometimes R² changes are provided in association with different models. (This could have been

done in Table 7).

Also, the number of cases used to create the regression model is usually indicated (N).

These are just some of the basics. There is a good deal of additional information to know

associated with assumptions underlying the variables, regression diagnostics, and interpreting

regression equations.

There are also a variety of specialized types of regression equations (e.g. for non-linear effects,

for interaction effects, etc.)


Difference in Means and t-test:

When you wish to examine the relationship between a nominal (or ordinal) variable with two

categories that is an independent variable, and a dependent variable that is measured at the

interval/ratio level, then an appropriate procedure and test is to examine the

difference in means, and calculate a t-test.

To see the direction of the difference in means just examine the respective means for the two

groups. For the t-test there is an associated significance level. If the significance level is ≤ .05,

then the difference in means is statistically significant.
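
The following sketch shows the two steps on hypothetical data: first the difference in means, then a t statistic. It assumes two independent groups with equal variances (the pooled, or Student's, version of the test); statistical software may also report a version that does not assume equal variances. The significance level itself comes from the t distribution and is not computed here.

```python
import math
from statistics import mean, variance

# Hypothetical incomes (in $1,000s) for two independent groups.
group_a = [52, 47, 39, 58, 44, 50]
group_b = [31, 25, 28, 36, 22, 26]


def pooled_t(x, y):
    """Student's t statistic for two independent groups, assuming equal variances."""
    n_x, n_y = len(x), len(y)
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (mean(x) - mean(y)) / standard_error


difference = mean(group_a) - mean(group_b)      # direction and size of the gap
t = pooled_t(group_a, group_b)                  # gap relative to its standard error
degrees_of_freedom = len(group_a) + len(group_b) - 2

print(round(difference, 2), round(t, 2), degrees_of_freedom)
```

A positive difference means the first group has the higher mean; swapping the groups only flips the sign of t.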

For example, examine the third row of Table 3. This displays the mean personal income for

women and men. Men made an average of $46,968 while women made an average of $24,268.

This difference is statistically significant (p. ≤ .01). Thus you can conclude that (for this sample)

men make more than women.


Univariate Statistics: Frequencies and Percentages:

Often it is useful to provide basic univariate statistics describing key variables. For nominal and

ordinal variables this can be done by providing frequencies and percentages. (There are also a

variety of other useful statistics that will not be discussed here.) Technically, you can also

provide frequencies and percentages for interval/ratio variables but it is usually not practical to

do so because there are so many potential values. (Instead, such data are sometimes portrayed in

graphs.)

When you provide tables of frequencies and percentages you should provide totals.

Also, if there is missing data you should indicate this in the table.

In Table 4, the response category with the largest number of cases is "strongly agree." Seven out of

20 people or 35% of the sample selected this response.
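
A frequency table of this kind, with a total and a count of missing data, might be produced as in the sketch below. The responses are hypothetical (they are not the data behind Table 4), and percentages are based on valid cases only, which is one common choice.

```python
from collections import Counter

# Hypothetical responses; None marks a missing answer.
responses = ["agree", "strongly agree", None, "disagree", "strongly agree",
             "agree", "strongly agree", None, "disagree", "strongly agree"]

valid = [r for r in responses if r is not None]
frequencies = Counter(valid)
missing = len(responses) - len(valid)

print(f"{'Response':<16}{'n':>4}{'%':>8}")
for category, count in frequencies.most_common():
    print(f"{category:<16}{count:>4}{100 * count / len(valid):>8.1f}")
print(f"{'Total (valid)':<16}{len(valid):>4}{100.0:>8.1f}")
print(f"{'Missing':<16}{missing:>4}")
```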


Univariate Statistics: Means, Standard Deviations, and N:

For interval/ratio level variables, one way of summarizing data is to provide means, standard

deviations, and N.

The mean is the arithmetic average of the data. The standard deviation is a measure of how

dispersed the data are. The N is the number of (valid) cases that were used to calculate these

statistics.

In row 2 of Table 5 we see that for this sample the mean years of education were 15.36, and the

standard deviation was 2.17. These statistics were calculated from 183 cases.

If the data are approximately normally distributed, the standard deviation implies that about 68%

of the cases fell between 13.19 and 17.53, and about 95% of all the cases fell between 11.02 and 19.70.
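
These ranges are simply the mean plus and minus one and two standard deviations, as the short calculation below shows. The mean and standard deviation are the values reported for row 2 of Table 5; treating one and two standard deviations as covering about 68% and 95% of cases assumes the data are roughly normally distributed.

```python
mean_education = 15.36   # mean years of education, row 2 of Table 5
sd_education = 2.17      # standard deviation, row 2 of Table 5


def interval(mean, sd, k):
    """Range covering k standard deviations on either side of the mean."""
    return round(mean - k * sd, 2), round(mean + k * sd, 2)


print(interval(mean_education, sd_education, 1))   # (13.19, 17.53)
print(interval(mean_education, sd_education, 2))   # (11.02, 19.7)
```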


Percentage Tables for Multiple Items:

Sometimes it is useful to provide tables that summarize multiple variables at the same time.

Table 2 does this for some correlations. Table 5 does this for means, standard deviations, and

Ns.

When you have Likert-type scales it is sometimes useful to present data in the form of a matrix

with the categories across the top (or columns) and the different questionnaire items down the

side (or rows).

Table 6 does this for the political efficacy items.

For example, for item #4, 35% strongly disagreed, 15% disagreed, 0% had no opinion, 20%

agreed, and 30% strongly agreed.

When the data are displayed this way we can try to discern patterns by comparing across the

items.

In this particular instance the responses look pretty similar across items with lots of responses

in the extreme categories and fewer responses in the middle of the scale (especially for no

opinion).
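
A matrix of this kind can be sketched as one row of percentages per item, with the response categories in scale order. The items and responses below are hypothetical and are not the political efficacy items from Table 6.

```python
from collections import Counter

CATEGORIES = ["strongly disagree", "disagree", "no opinion", "agree", "strongly agree"]

# Hypothetical responses to two questionnaire items.
items = {
    "item 1": ["strongly agree", "agree", "strongly disagree", "strongly agree"],
    "item 2": ["disagree", "strongly disagree", "strongly disagree", "agree"],
}


def percentage_row(responses):
    """Percentage of responses in each category, in scale order."""
    counts = Counter(responses)
    return [100 * counts[c] / len(responses) for c in CATEGORIES]


table = {name: percentage_row(responses) for name, responses in items.items()}

for name, row in table.items():
    print(name, row)
# item 1 [25.0, 0.0, 0.0, 25.0, 50.0]
# item 2 [50.0, 25.0, 0.0, 25.0, 0.0]
```

Each row sums to 100%, so the items can be compared directly by reading down a column.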
