AJMR Vol6 No1 1998 PDF

AUSTRALASIAN JOURNAL OF MARKET RESEARCH
JANUARY 1998
VOLUME 6, NUMBER 1

CONTENTS

Rank Order Correlations and Interpretability of Cluster-Based Segmentations
David E Hansen

Increased Support for Aboriginal Reconciliation: Fact or Artefact? The Validity of Comparisons Between Questions With Different Wording
Robert J Donovan

Structural Equation Modelling in Market Research
Scott MacLean and Keven Gray

The Market Research Society of Australia Limited, A.C.N. 002 882 635
Market Research Society of New Zealand

The AJMR is the official journal of the Market Research Society of Australia (MRSA), and the Market Research Society of New Zealand (MRSNZ). It is published twice yearly. All members of the MRSA receive each copy free and all full members of the New Zealand Society receive complimentary copies. Subscribers may obtain copies for $30.00 annually, or $35.00 for overseas subscribers ($5.00 extra for Air Mail per issue).

Original papers are invited. There is an editorial policy to ensure that a mix of theoretical and practical papers is presented. Papers should be of some relevance to at least a segment of the population of practicing market researchers. Purely academic marketing papers are probably more suitable elsewhere. Contributors of papers will receive 5 copies of the issue in which their article appears.

Manuscripts are to be typed and double spaced. Diagrams and figures should be of professional quality. In writing articles, authors should consider the style of articles in previous issues. On acceptance of an article, authors will be asked to supply their article on a 3.5" or 5" disk. The word processor of preference is Microsoft Word for Windows. However, conversions can be made from other packages.

The AJMR is an open publication.
All expressions of opinion are published on the basis that they are not to be regarded as expressing the official opinion of the Market Research Society of Australia Limited or the Market Research Society of New Zealand. The MRSA/MRSNZ accepts no responsibility for the accuracy of any of the opinions or information contained in this publication and readers should rely upon their own enquiries in making decisions.

All papers for submission should be sent to
Professor Lester W. Johnson
Editor, AJMR
Monash Mt. Eliza Business School
P.O. Box 2224
Caulfield Junction, Vic 3161
Australia

All business enquiries and requests for additional copies should be sent to
The Market Research Society of Australia
P.O. Box 697
North Sydney, NSW 2059
Phone 02 9955 4830
Fax 02 9955 5746

Copies of articles in the AJMR may be made for personal or classroom use, without charge and with the publisher's consent. Copying for any other purpose must first be approved by the publisher.

The Society recognises the contribution of Lester Johnson (Editor) and the Monash Mt. Eliza Business School, Monash University.

Published by the Market Research Society of Australia Limited. Copyright 1998.
Editor: Lester Johnson
Production: Jintana Kurosawa

Editor
Professor Lester W. Johnson, Monash Mt. Eliza Business School

Production Editor
Jintana Kurosawa, Monash Mt. Eliza Business School

EDITORIAL BOARD

R. Susan Ellis, University of Sydney
Mark Jessop, The Boshe Group
V. Kumar, University of Houston
Peter Oppenheim, University of Ballarat
Pascale Quester, University of Adelaide
Liane Ringham, Sutherland Smith Ringham
John Rossiter, Australian Graduate School of Management
Geoffrey Soutar, Edith Cowan University
Don Stem, Washington State University
Jill Sweeney, University of Western Australia

RANK ORDER CORRELATIONS AND INTERPRETABILITY OF CLUSTER-BASED SEGMENTATIONS

David E Hansen
Department of Marketing
University of Auckland
New Zealand

ABSTRACT

It is often difficult to use the results of cluster-based market segmentation studies for purposes of market definition or product differentiation because of problems with interpreting the results. For example, one may find that some of the items receiving high ratings in a given segment have little to do with the definition of that segment or with each other, making interpretation a problem. Such problems are likely to occur with the Tandem approach to segmentation analysis, often used for its ease of interpretation, but driven by assumptions of nicely behaved data. In this paper we show that use of rank order data provides a specific improvement for cluster-based segmentation analysis. Namely, it improves the interpretability of cluster-based market segmentations with the irregular data found in a typical segmentation study. We demonstrate this empirically with simulations and data from four commercial segmentation studies each containing a variety of data distributions. We show a statistically significant improvement in interpretability using rank order data in place of scaled data with the Tandem approach.

RANK ORDER CORRELATIONS AND INTERPRETABILITY OF CLUSTER-BASED SEGMENTATIONS

Much attention has been paid to the accuracy of cluster-based segmentations as measured by reclassification of segment members using discriminant analysis (Punj and Stewart 1983; Green and Krieger 1995).
Accuracy is of importance to managers with interests in direct marketing and database marketing as a more accurate segmentation assures more efficient use of marketing resources. However, it may be of little importance to managers interested in defining markets or developing differentiated product offerings based on the core benefits sought by target segments. For those managers, interpretability and not accuracy is the main issue, as they must be able to place a descriptive tag on a segment that unambiguously represents the main interests of that segment. For example, if they find that the items receiving high ratings in a given segment have little to do with the description of that segment or with each other, this makes the segmentation uninterpretable. Interpretability is a relatively qualitative issue compared to accuracy and the link between measures of reclassification and the qualitative aspects of segmentation is less well understood. Our focus in this paper is on the relationship between statistical properties of the data and the qualitative issue of interpretability with a view to improving interpretability.

The interpretability of segmentation results is influenced by the quality of the data. Poor data quality impacts interpretability because it obscures the underlying correlation structure of the data used to determine the composition of the segments. The data found in typical commercial segmentation studies often have a variety of skewed and non-normal distributions that are known to impact the correlation structure (Green and Krieger, 1995; Stewart 1981). These data violate the assumptions of most parametric statistical procedures, making it difficult for these procedures to produce an unambiguous and thus interpretable result.
Such problems are likely to impact the well known two stage (Tandem) approach to segmentation analysis, used for its advantages in interpretability, but driven by assumptions of nicely behaved data. Rank order statistics, well known for their robust properties with poorly behaved data, are an obvious candidate for solving this problem. Yet despite much research demonstrating these properties in situations requiring statistical inference (Siegel 1956), there has been little research indicating their effects in a complex technique like cluster-based segmentation. In particular the effect of rank order statistics on qualitative, but important, issues like interpretability is relatively unknown. Thus, these statistics may often be overlooked because of uncertainty regarding their impact on the managerial aspects of a segmentation.

The goal and main contribution of this paper is to demonstrate that the robust properties of the rank order correlation, Spearman's rho, can improve the interpretability of a segmentation regardless of the quality of the data by improving the correlation structure of the data. A second contribution is that we develop a measure of interpretability based on the face validity of a cluster-based segmentation and the consistency of important characteristics.

We begin the paper with a review of data quality issues for the Tandem segmentation approach. We then proceed to an analysis of correlations with rank order data and how they differ from correlations based on scaled data. Then we empirically compare the results for rank order and scaled data with simulations and with the Tandem approach using data from four commercial market segmentation studies.
CONCEPTUAL BACKGROUND

Data Quality and the Tandem Approach to Segmentation

The nature of marketing segmentation data has led research on segmentation analysis in the direction of using a series of data transformations prior to the actual clustering of data into segments. For examples, see Gleason and Staelin (1973), Furse, Punj, and Stewart (1984), and Milligan and Cooper (1988). The most popular method is the Tandem approach (Punj and Stewart, 1983). Its popularity rests in large part on the interpretability of results as it helps make sense of the large number of variables going into a segmentation analysis. It consists of a specific sequence of statistical operations: Pearson product-moment correlations, principal components factor analysis, and non-hierarchical cluster analysis. In this approach the standardised data from a Pearson correlation matrix serve as input to the factor analysis, the factor scores of which serve as input to a non-hierarchical clustering algorithm such as K-means (Furse, Punj, and Stewart, 1984). The means from a hierarchical clustering algorithm (e.g., Ward's clustering) are often used as seeds for the non-hierarchical algorithm.

Even though Pearson's r and principal components factor analysis are intended to rescale data before clustering, they are themselves sensitive to data scaling and distributional shape. When using typical market research data with scaled data and Pearson's r, one of the main threats to interpretability with the Tandem approach comes from data that have different shaped distributions, in particular skewed distributions (Stewart, 1981). The basis of segmentation with the Tandem approach is the factor structure given to the clustering algorithm. Slight differences in the correlations used in the factor analysis can alter this structure drastically.
This threat is greater the more variety in the skew of distributions and the fewer the response categories, i.e., the more non-metric the data. The implication is that the type of data found in commercial market research studies and databases should be problematic for use with Pearson's correlations.

As has been shown (Kendall and Stuart 1991), the impact of skewed distributions on the attenuation of correlations may be moderated by using rank ordered data in place of Pearson's r. However, while we are aware of the properties of rank order statistics, we are not as familiar with their impact on the qualitative aspects of segmentation analysis. This is likely to be because of the complex relationships that exist between the stages of analysis in the Tandem approach and how the use of rank order statistics may impact other techniques such as factor analysis. There is thus a need for research demonstrating the impact of rank order statistics on the qualitative outcomes of segmentation analysis such as interpretability. In the next section we compare rank order data to ratings data to see how the product-moment correlations used in principal components factor analysis, an essential part of the Tandem approach, can be improved with rank order data.

Correlations with Rank Ordered Data

To understand how the use of ranks can provide more valid correlations we need to look at the similarities and differences between rank order and scaled data. The only computational difference when using these data is that respondent ratings are rank ordered before a product-moment correlation is performed on the rank orders. However, some useful statistical differences exist when using the ranks of multichotomous rating scale (MRS) data. The distribution of ranks is actually the same as ratings in terms of frequencies, but is very different in terms of scale.
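The computational point made above, that Spearman's rho is a product-moment correlation performed on rank orders, can be illustrated with a minimal sketch. The function names and the eight hypothetical respondents below are assumptions made for illustration and are not taken from the studies reported in this paper. Tied ratings receive averaged ranks.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the same product-moment formula applied to averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical five-point ratings from eight respondents on two items.
item_a = [5, 5, 5, 4, 5, 3, 5, 1]
item_b = [4, 5, 5, 3, 4, 3, 5, 2]
r = pearson(item_a, item_b)
rho = spearman(item_a, item_b)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because the only change is the ranking step, the same `pearson` routine serves both statistics; any difference between the two values comes entirely from how ranking rescales the ratings.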
This is because the outcome scale for rank ordered data is dependent on the number of observations. For small scaled (few scale points) MRS data this means the ranks are almost exclusively dependent on ties, making the assignment of ranks to ties an important issue. For example, if there are more than two observations with a two point scale (dichotomous) variable, there will be ties. Although several types of ranking schemes for ties are possible, averaged ranks have the most desirable properties, as the mean rank is the midpoint of the number of observations, (n + 1)/2, so that the mean is a constant regardless of the symmetry of the distribution. These means are also the median of the range of the ranks assigned to ties when the marginal distributions of variables are symmetric. This yields the smallest squared deviations, the greatest variance, and makes rank order equal to scaled data in this case. Because the rank assigned to a group of tied observations depends on the size of that group and on the size of the preceding groups, ties impact the size of the scale, the location of the points on the scale (the number of points will be equal to the number of possible ratings, for n>r), and ultimately the sensitivity of the correlation to asymmetry.

To understand this we must first look at how asymmetry impacts the correlations of ratings, Pearson's correlations. As the distribution of ratings becomes more asymmetric, the mean of ratings changes in the direction of the frequency. Thus, if more ratings are high than low, the mean of ratings increases, but because the scale of the range and the scale points are constant for ratings, the deviations are now asymmetric. As variance is largest under symmetry, asymmetry reduces this quantity as well as changing the cross product of deviations, but in different ways for ratings and ranks. Rank orders work the opposite of ratings with MRS data.
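A small numerical illustration may help here. The sketch below uses two hypothetical sets of twenty five-point ratings (both invented for this example) to show that the mean of averaged ranks stays at (n + 1)/2 when the ratings become skewed, while the mean of the ratings moves and the rank scale points relocate.

```python
def average_ranks(values):
    """Averaged ranks: each tied group gets the mean of the ranks it occupies."""
    sorted_vals = sorted(values)
    # A value first appearing at 0-based position i with c copies occupies
    # ranks i+1 .. i+c, whose mean is i + (c + 1) / 2.
    return [sorted_vals.index(v) + (sorted_vals.count(v) + 1) / 2 for v in values]


def mean(values):
    return sum(values) / len(values)


# Hypothetical marginal distributions of twenty ratings on a five-point scale.
symmetric = [1] * 2 + [2] * 4 + [3] * 8 + [4] * 4 + [5] * 2
skewed = [1] * 1 + [2] * 1 + [3] * 2 + [4] * 6 + [5] * 10

for label, ratings in (("symmetric", symmetric), ("skewed", skewed)):
    ranks = average_ranks(ratings)
    print(label)
    print("  mean rating :", mean(ratings))
    print("  mean rank   :", mean(ranks))          # stays at (n + 1) / 2
    print("  rank points :", sorted(set(ranks)))   # scale points move with the ties
```

The five rating points stay fixed at 1 to 5 in both cases, whereas the five rank points shift with the size of each tied group, which is the sense in which ranks and ratings respond to asymmetry in opposite ways.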
The mean is constant and the scale range and scale points adjust as the symmetry of the distribution changes. With both ranks and ratings, the deviations under asymmetry are unequal so that the sums of squared deviations and cross products are smaller than with symmetry. However, it can be shown that the deviations for ranks change at a slower rate than those of ratings because of the usually larger scale size for the distribution of ranks, meaning that the distribution of ranks is relatively more symmetric than that of ratings. Product-moment correlations with rank ordered data should thus be larger than those with scaled data under asymmetry even though the marginal frequency distributions are the same, a fact that can be used to advantage with typical marketing data which are noticeably asymmetric.

As most cases of asymmetry should result in rank order correlations being larger than Pearson's correlations, and because symmetric distributions produce the largest correlations, use of rank order data effectively creates a bias in favour of the symmetric distribution. This means that if the true distribution is in fact symmetric then correlations with rank order data (Spearman's rho) will actually be a more valid measure of correlation. The assumption of symmetry is not naive because most MRS scales are designed so that the scale midpoint is neutral and thus separates upper and lower halves, implying the expectation of symmetry in ratings.

This is a key point, as the input to factor analysis is correlations and the factor scores which determine the segments depend on the size of the correlations between variables. Interpretation of segments depends on this as well, because factor analysis fits factors according to the similarity of correlations between variables, and even slight differences can alter the factor structure and the resulting segments.
Which respondents end up in a given segment also influences which items in a segment received the highest ratings, the final determinant of interpretability. With the natural bias of Spearman's rho in the direction of the correlation produced by symmetric data, Spearman's effectively reduces some of the problems due to data irregularities which may impact Pearson's r.

We conduct a test of these ideas by 1) comparing Pearson's and Spearman's correlations using simulated distributions with varying degrees of skew for one skewed variable and one symmetric variable, and 2) comparing the performance of Pearson's and Spearman's on segmentations involving four commercial data sets. The results are discussed along with limitations and future directions.

METHODOLOGY

Test of Interpretability

Interpretability depends on whether the underlying themes attributed to a segment in the factor analysis, and thus used to define the segment, are consistent with the questionnaire items to which segment members assign the highest ratings. Inconsistency exists, for example, when the members of a segment labelled "The Health Conscious" have high ratings on statements like "I drink a six pack of Coke every day" but also on "I read all the health food advertisements". In the Tandem approach segments are identified by clustering the factor scores of each respondent and using the names of the factors with the highest or lowest mean scores in a segment to define that segment. This makes the development of a measure of interpretability relatively straightforward. For example, if a given cluster has one high mean score on the factor named "health consciousness" this cluster represents "The Health Conscious" segment. We can then use the cross-tabulation of segments by questionnaire items to see if the items having the highest ratings in a segment also load on the factor that defines the segment.
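The consistency check described above can be sketched as follows. This is one possible reading, not the exact procedure used in the studies: it assumes the segment is the unit of counting, that a segment matches only when both of its two highest-rated items load on a factor defining it, and that the segment means and factor loadings have already been summarised into simple lookup tables. The function name `match_percentage`, the segment and factor names, and all item ratings are hypothetical.

```python
def match_percentage(segment_item_means, segment_factors, factor_items, top_n=2):
    """Percentage of segments whose top_n highest-rated items all load on a
    factor that defines the segment (assumed counting rule, see lead-in)."""
    matches = 0
    for segment, item_means in segment_item_means.items():
        # Items with the highest mean rating inside this segment.
        top_items = sorted(item_means, key=item_means.get, reverse=True)[:top_n]
        # Every item loading on any factor that defines the segment.
        defining_items = set()
        for factor in segment_factors[segment]:
            defining_items.update(factor_items[factor])
        if all(item in defining_items for item in top_items):
            matches += 1
    return 100.0 * matches / len(segment_item_means)


# Hypothetical factor solution: which items load on which named factor.
factor_items = {
    "health consciousness": {"reads health food ads", "buys low-fat products"},
    "convenience": {"buys ready meals", "shops online"},
}
# Hypothetical segment definitions taken from the cluster centroids.
segment_factors = {
    "The Health Conscious": ["health consciousness"],
    "The Time Poor": ["convenience"],
}
# Hypothetical mean item ratings within each segment.
segment_item_means = {
    "The Health Conscious": {
        "reads health food ads": 4.6,
        "drinks a six pack of Coke daily": 4.4,  # inconsistent with the label
        "buys low-fat products": 3.9,
        "buys ready meals": 2.1,
    },
    "The Time Poor": {
        "buys ready meals": 4.7,
        "shops online": 4.5,
        "reads health food ads": 2.2,
        "buys low-fat products": 2.0,
    },
}

score = match_percentage(segment_item_means, segment_factors, factor_items)
print(f"MATCH = {score:.1f}%")
```

In this invented example the first segment fails the check because one of its two highest-rated items has nothing to do with the factor that names it, which is the kind of inconsistency the measure is meant to expose.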
The exact measure of interpretability (MATCH) is the percentage of matches, where a match exists if the two highest rated items in a segment also load on the factor(s) defining that segment.

Tests of Structure

The structure of the correlations of test items impacts the factor structure and the eventual segments. This impacts both interpretability and accuracy. If a segment is defined by a particular factor from the factor analysis in the Tandem approach, and that factor is based on the correlations of ratings on a set of items, then people scoring high on that factor should be in that segment and not some other. Their ratings should be similar as well. By reclassifying respondents into their respective segments using discriminant functions we can test for correlation structure.

We test the reclassification of respondents using the hit ratios from a discriminant analysis confusion matrix with two kinds of independent variables: exogenous and endogenous. Reclassification with endogenous variables is also the traditional way of testing the accuracy of a segmentation (Green and Krieger 1995). The first type of hit rate, DA_EX, uses exogenous (e.g., demographic) variables as the predictor variables and has been recommended by Green and Krieger (1995) and Punj and Stewart (1983) as the best test of the underlying structure of the data because it links the test items to non-test or exogenous items like demographic variables. The second and more well known hit rate, DA_EN, is based on a hold out sample of the endogenous (i.e., attitudinal) variables that were used in the actual segmentation analysis, and is a test of segmentation accuracy. The segment ID numbers of each segment member were used as the dependent variables in both discriminant analyses.

Segmentation Analysis Method

The analysis followed the Tandem approach of Furse, Punj, and Stewart (1984).
The data sets were submitted to a Spearman (or Pearson) correlation analysis. The correlation matrix was passed to a principal components factor analysis, where the number of factors retained depended on the minimum eigenvalue (mineigen) criterion of one. After varimax rotation, the factor scores were analysed in a Ward's cluster analysis. A five cluster solution was used for all four data sets (Green and Krieger 1995). This number was determined from the formation rate of linkages, the cubic clustering criterion (Sarle, 1983), and the pseudo-F ratio (Symons, 1981). The five cluster centroids from the Ward's clustering procedure were used as seeds for the K-means non-hierarchical clustering procedure. Using the clusters produced by K-means, each respondent was given a cluster ID number. These ID numbers served as the dependent variable for the discriminant analysis. We defined segments according to how much weight each cluster received from each factor (via the cluster centroid of factor score means) and the names of the factors.

Data Sets

The four commercial data sets came from questionnaires mailed to existing or potential customers of large organisations. The attitudinal (endogenous) variables used for the analysis all had five point scales, while the behavioural (exogenous) variables (e.g., gender, age, occupation) had scales ranging from binary to fifteen points. Descriptions of the distributions of the endogenous variables for these studies are given in Exhibit 1. The first study had mostly skewed distributions, the second had skewed and symmetric distributions, the third had mostly skewed and the fourth contained skewed and symmetric distributions. These data sets were chosen to test for the generalisability of any potential advantage in interpretation due to Spearman's across data with different distributional shapes.

Regarding missing values, the imputation method is critical to the success of any multivariate technique.
The data in all four data sets were cleaned by finding the average response level within a respondent, and substituting the integer value closest to that average level into the variable with the missing response.

EXHIBIT 1
Description of the distributions of the data sets

Study                      One    Two    Three   Four
Description                 %      %       %       %
Skewed
  Left triangular           58     20      34      35
  Right triangular          25      0      34      29
Symmetric
  Flat                       4     20       0       0
  Single peaked             13     60       2      36
Number of variables         24     35      28      26
Number of observations     701    300     380     401

ANALYSIS AND RESULTS

Results of the Simulations

First, to confirm that Spearman's rho is greater than Pearson's r when one of the variables has an asymmetric distribution we ran several simulations using 3, 4, and 5 point scales where the correlations with symmetry ranged from +0.60 to +0.85. The same number of scale points was used for both correlated variables, X and Y. The dependent variable was the difference between the absolute value of Pearson r and Spearman rho correlations. As the starting point for all simulations the marginal distributions of the two variables, X and Y, were equal. The simulation of asymmetry consisted of systematically increasing the skew of the marginal distribution of one of the variables, X, one hypothetical respondent at a time. Although this changes the correlation (Stewart, 1981), it isolates the effect of asymmetry on correlation for the following simple reason. If more than one response at a time can vary for each calculation of correlations, it is difficult to assess the exact nature of the distributions prior to each calculation.

As shown in Table 1, on average, Spearman's was greater than Pearson's in about 84% of all cases of asymmetry across all three sizes of scales (i.e., 3 point, 4 point, and 5 point). Both were of course lower than the correlation obtained with marginally symmetric distributions (for which they were equal).
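The simulation design described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the code used to produce Table 1: the base joint distribution `BASE_COUNTS` is invented (forty respondents, five-point scales, equal marginals for X and Y), and the skewing rule, which raises the lowest remaining X rating by one scale point at each step, is only one of several ways to increase skew one respondent at a time.

```python
from math import sqrt


def average_ranks(values):
    """Averaged ranks: each tied group gets the mean of the ranks it occupies."""
    sorted_vals = sorted(values)
    return [sorted_vals.index(v) + (sorted_vals.count(v) + 1) / 2 for v in values]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical base distribution: counts of respondents giving each (X, Y) pair.
# Both marginals are 4, 8, 16, 8, 4 across the five scale points.
BASE_COUNTS = {
    (1, 1): 3, (1, 2): 1, (2, 1): 1, (2, 2): 5, (2, 3): 2,
    (3, 2): 2, (3, 3): 12, (3, 4): 2, (4, 3): 2, (4, 4): 5,
    (4, 5): 1, (5, 4): 1, (5, 5): 3,
}


def expand(counts):
    """Turn pair counts into two parallel lists of respondent ratings."""
    x, y = [], []
    for (xv, yv), n in counts.items():
        x += [xv] * n
        y += [yv] * n
    return x, y


def simulate_skew(counts, steps, top=5):
    """Skew X one respondent at a time and record both correlations."""
    x, y = expand(counts)
    results = []
    for step in range(1, steps + 1):
        i = x.index(min(x))   # first respondent holding the lowest X rating
        if x[i] < top:
            x[i] += 1         # assumed skewing rule: move up one scale point
        r, rho = pearson(x, y), spearman(x, y)
        results.append((step, r, rho, abs(rho) - abs(r)))
    return results


base_x, base_y = expand(BASE_COUNTS)
base_r = pearson(base_x, base_y)
results = simulate_skew(BASE_COUNTS, steps=20)
share = sum(1 for *_, diff in results if diff > 0) / len(results)
print(f"base Pearson r = {base_r:.3f}")
print(f"share of steps with |rho| > |r| = {share:.0%}")
```

Changing `BASE_COUNTS`, the number of scale points, or the skewing rule gives other simulated distributions; the figure printed by this sketch depends on those choices and is not expected to reproduce the percentages in Table 1.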
TABLE 1
Simulation Results: Comparing Correlations under Asymmetry (a)

Scale points        3        4        5
Difference (b)    88.0%    82.3%    79.0%
N (c)               40       30       40

a. Number of simulated distributions = 50
b. Proportion of times (out of 5) that Spearman's rho was greater than Pearson's r
c. Number of observations in the base distribution

Although not reported in Table 1, a more micro-level analysis was conducted in which the deviations, variances, and cross-products were observed with each change in distribution for both ranks and ratings. It was found that in general a change in marginal distribution that decreased the frequency of higher response levels (e.g., increased right skew) increased the difference between Spearman's and Pearson's with the difference always being in favour of Spearman's. Changes which increased the number of low responses (e.g., increased left skew) relative to the symmetry condition resulted in Pearson's being higher for the first few steps away from symmetry, but as this number kept increasing, Spearman's eventually became greater than Pearson's. Regardless of the direction of skew, the variance of ranks was almost always smaller than that of ratings. Also, the deviations of ranks were more symmetric than were those of ratings.

These findings indicate that the distribution of ranks is probably more symmetric than that of ratings even though they have the same marginal distribution of frequencies at each scale point. Given that the largest potential correlations occur with a symmetric distribution, this also makes Spearman's likely to yield more valid correlations if the theoretical distributions are in fact thought to be symmetric even though the empirical shapes of the distributions are mixed.
Results of the Commercial Data Sets

Next we review the results of four segmentation analyses with Spearman and Pearson correlations as input to the Tandem approach with the four commercial data sets. We tested for significant differences using a conservative two-tailed t-test of independent samples. In Table 2, we see the results for each data set. Regarding the discriminant test of accuracy (reclassification with attitude variables), on average there was no statistical difference between Spearman's and Pearson's on DA_EN. The average hit ratio for DA_EN using endogenous variables was 80.2% for Spearman's and 81.5% for Pearson's. For the discriminant test of structure (reclassification with behavioural variables), Spearman's showed a marginally significant advantage over Pearson's for DA_EX (M=36.9%, Spearman's versus M=32.9%, Pearson's, t=2.21, p=0.07). There was, however, a fairly obvious difference on the last measure, MATCH, which is the qualitative measure of interpretability. The percentage of highest rated variables matching their segment definitions (MATCH) was consistently greater across all four studies (M=80.5%, Spearman's versus M=67.9%, Pearson's, t=6.61, p=0.001).

TABLE 2
Segmentation Results: Commercial Data Sets

Data Set                 DA_EN (1)   DA_EX (1)   MATCH (2)
Study One    Pearson       78.0%       32.5%       65.4%
             Spearman      82.0%       34.0%       82.1%
Study Two    Pearson       84.0%       32.4%       102%
             Spearman      13%         36.7%       80.5%
Study Three  Pearson       82.0%       31.1%       61.7%
             Spearman      113%        35.4%       80.4%
Study Four   Pearson       81.9%       35.6%       85%
             Spearman      82.3%       41.6%       78.9%
Averages     Pearson       81.5%       32.9% *     67.9% **
             Spearman      80.2%       36.9%       80.5%

1. Percentage of respondents correctly reclassified
2. Percentage of items matching the segment definition
*  Marginally significant difference between Pearson's and Spearman's at p=0.07
** Statistically significant difference between Pearson's and Spearman's at p=0.001

The first important thing to note is that there was no difference in accuracy between Pearson's and Spearman's. The second is that Spearman's indicates a slight but consistent improvement in structure over Pearson's. The most important result concerns the measure of interpretability. These data showed a strong improvement in interpretability for each data set resulting in a reasonably large difference between Spearman's and Pearson's on this measure. This finding is supported by the results for the test of structure, DA_EX. Although it is hard to generalise from only four examples, a pattern is formed across the four studies that agrees with the notion that Spearman's seems to produce a more interpretable segmentation than Pearson's in conditions of asymmetry for multichotomous rating scale data.

DISCUSSION

There have been ample demonstrations of improvements in research methods due to the statistical properties afforded by non-parametric statistics when poorly behaved data are involved. What is not clear is how the properties of non-parametric statistics relate to more qualitative managerial benefits. In this study we specifically wanted to test the impact of the use of the non-parametric statistic, Spearman's rho, with poor quality data on one of the most managerially sought after benefits of segmentation analysis: interpretability.

Our main assumption is that interpretability is influenced by the quality of the data, which in turn impacts the correlation structure of the data. By starting the analysis with more valid correlations we hoped to produce a better correlation structure and thus a more interpretable segmentation. In this study we compared the effects of Pearson's parametric correlations with those of Spearman's non-parametric rank order correlations on segmentation interpretability. The correlations were used as input to the Tandem approach to segmentation analysis.
We first demonstrated the ability of Spearman's to limit the attenuation of correlations exhibited by Pearson's when skewed data are involved, using simulated distributions varying in the degree of skew. As seen in Table 1, Spearman's showed larger correlations than Pearson's, although this advantage decreased as the number of scale points increased. This indicates that the problem of attenuation found with Pearson's r can in fact be handled by Spearman's when the scales are fairly non-metric and should result in an improvement in the correlation structure of the data.

As measurement of interpretability depends on the definition of interpretability, we defined interpretability as the degree of correspondence between the most highly rated items in a segment and the definition of that segment. Using this definition we developed a measure specifically for use with the Tandem approach. This measure consisted of the percentage of times the two most highly rated items in a segment also loaded on one of the segment defining factors. It is also a measure of the face validity of the segmentation and is often used as an ad hoc test by marketing managers to ascertain whether the results make sense.

A segmentation analysis using the Tandem approach was performed on each of four commercial data sets with data distributions of different shapes. The results showed that Spearman's provides a statistically significant improvement in interpretability compared to Pearson's. Specifically, Spearman's showed on average more correspondence between the highest rated items in a segment and the items defining the segment, thus making the meaning of these segments less ambiguous and more interpretable.
To show the effect of non-parametric correlations on correlation structure, a measure of the structure of the segmentation was taken based on a discriminant test of reclassification using data exogenous to the segmentation analysis (Green and Krieger 1995; Punj and Stewart 1983). The hit rates produced using exogenous variables were slightly better for Spearman's rho than Pearson's r. Based on this, Spearman's seems to improve the correlation structure of the data and in doing so improves interpretability as we have defined it. It is interesting to note that there was no difference in accuracy between Spearman's and Pearson's using reclassification of segment members with endogenous data as the test of accuracy.

We have found in this study that when interpretability is the research goal, as is the case with market definition studies, there seems to be an advantage in using Spearman's correlations. The overall improvement in interpretability found in this study indicates a specific reason for using Spearman's as a regular part of the segmentation analysis for needs based segmentation or market definition with any kind of data.

Limitations and Future Directions

Because the evidence for improvements in interpretability rests on tests with only four data sets, we should use caution when generalising these results to other data sets. For any segmentation analysis a variety of approaches should be used with the hope of observing some convergence. But among these the use of Spearman's should be considered as a normal part of the analysis regardless of the quality of the data.
Future directions along these lines include refining the measure of interpretability and continuing to develop other measures of managerial usefulness, perhaps by focusing on other key benefits of segmentation. Another line of research involves investigating the impact of different types of latent structures on the ability of various techniques such as correlations, principal components, or cluster analysis, to recover that structure. This could be done using simulation analysis so that the structure of the data is known a priori and the accuracy of the various methods can be assessed under different conditions of data quality.

REFERENCES

Furse, D. H., G. N. Punj, and D. W. Stewart, 1984. “A Typology Of Individual Search Strategies Among Purchasers Of New Automobiles.” Journal of Consumer Research 10, 417-431.

Gleason, T. C. and R. Staelin, 1973. “Improving The Metric Quality Of Questionnaire Data.” Psychometrika 38, 393-410.

Green, P. E. and A. M. Krieger, 1995. “Alternative Approaches To Cluster Based Market Segmentation.” Journal of the Market Research Society 37, 221-239.

Kendall, M. G. and A. Stuart, 1991. The Advanced Theory of Statistics, 6th Ed. New York: MacMillan.

Milligan, G. W. and M. C. Cooper, 1988. “A Study Of Standardization Of Variables In Cluster Analysis.” Journal of Classification 5, 181-204.

Punj, G. and D. W. Stewart, 1983. “Cluster Analysis In Marketing Research: Review And Suggestions For Application.” Journal of Marketing Research 20, 134-148.

Sarle, W. S., 1983. “Cubic Clustering Criterion.” SAS Technical Report A-10, Cary NC: SAS Institute, Inc.

Siegel, S., 1956. Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.

Stewart, D. W., 1981. “The Application And Misapplication Of Factor Analysis In Marketing Research.” Journal of Marketing Research 18, 51-62.

Symons, M. J., 1981.
“Clustering Criteria And Multivariate Normal Mixture.” Biometrics 37, 35-43.

INCREASED SUPPORT FOR ABORIGINAL RECONCILIATION: FACT OR ARTEFACT?
The Validity of Comparisons Between Questions With Different Wording

Robert J Donovan
Associate Professor of Marketing, Graduate School of Management, University of Western Australia
& Chairman, Donovan Research

It has long been known by public opinion pollsters that changes in the context in which questions are asked, changes in given response categories, and even changes in question wording can yield vastly different findings in response to the same or similar questions. In spite of substantial documentation of these effects, market researchers and public opinion pollsters occasionally ignore or discount these effects in interpreting their data. This paper discusses and illustrates these issues using recently published findings with respect to a survey of attitudes toward Aboriginal Reconciliation carried out in May 1996, and the comparison of these findings with a previous survey in May 1995. It is argued that the comparison made between the findings of the two surveys is invalid because of a number of differences in wording, context and response categories of the questions used in the two surveys.

INTRODUCTION

There is an extensive literature with respect to how differences in question wording (e.g., Rasinski 1989), changes in response categories (e.g., Bishop 1987), and the context in which questions are asked (e.g., Foddy 1994), can yield substantially different results. These findings suggest caution should be exercised when attempting to compare the results of questions purporting to measure the same thing, but differing on one or more of the above.
For example, in their list of ten major points for formulating attitude questions, Sudman and Bradburn (1982, p 181) state that where changes in attitude are to be assessed over time, public opinion pollsters should “ask exactly the same questions in all time periods, if at all possible”. Similarly, since “Scholars of public opinion polling methods have long known that seemingly minor modifications in question ... format could alter the pattern of responses” (Petty, Rennier and Cacioppo 1987, p 481), pollsters and market researchers have attempted to derive question formats that minimise response bias effects in order to more accurately measure the attitudes under study.

It is also well documented that “responses to questions measuring beliefs and attitudes may be significantly altered by apparently trivial changes ... in the context in which they are asked” (Krosnick and Alwin 1987, p 201). In fact, Schuman and Presser (1981, p 24) claim that other than sampling error, “question-order effects are probably the most frequently offered explanation for an unexpected survey finding”.

In spite of extensive documentation of these effects, market researchers and public opinion pollsters occasionally ignore or discount these effects in interpreting their data. This paper discusses and illustrates these issues using recently published results of a survey of attitude toward Aboriginal Reconciliation carried out in May 1996, and the comparison of these results with a previous survey in May 1995. It is argued that the comparison between the findings of the two surveys is invalid because the questions used in the two surveys differed not only in wording, but also in format, context and presented response categories.

CASE STUDY: INCREASED SUPPORT FOR ABORIGINAL RECONCILIATION: FACT OR ARTEFACT?
In a recent publication by the Council for Aboriginal Reconciliation intended for wide distribution, it is claimed that “...‘strong’ support for reconciliation has more than doubled over the last twelve months” (Johnson 1996, p 4). This claim is taken from a report to the Council of a survey conducted in May 1996 which compared the May 1996 results with those of a survey conducted in May 1995. The Johnson (1996) publication contains no information about the questions asked in the two surveys, but the wording of the text would lead the reader to believe that the same question was asked in both surveys. This paper argues that, for a number of reasons, the claimed change in community attitude toward the Reconciliation Process is at least open to question and is probably erroneous.

The Issues

This paper deals with two major issues:

• firstly, the validity of the comparison between the May 1995 and May 1996 surveys' findings; and
• secondly, regardless of any comparison, the relative accuracy of the two question formats in measuring attitude toward the Reconciliation Process.

The 1995 and 1996 questions and their respective results are presented first, followed by a comparison of the two questions in the context of findings in the market research and public opinion measurement literature with respect to question wording, response alternatives, context and question format. The issue of which question better reflects community attitudes to the process of Aboriginal Reconciliation is then discussed in light of other evidence and the Council's activities (through the Aboriginal Reconciliation Branch) in the period between the May 1995 survey and the May 1996 survey.

The May 1995 Question

The May 1995 questions with respect to Aboriginal Reconciliation were placed by Donovan Research as part of Newspoll's regular omnibus in a series of surveys from November 1991 to May 1995.
Prior to asking the attitude question in May 1995, respondents' spontaneous responses to the words “Aboriginal Reconciliation” were elicited, followed by a question as to whether they had “heard of the topic Aboriginal Reconciliation before today”. This order and content of questions facilitated consideration of the topic prior to measuring the respondent's attitude, but ensured that the respondent was not unduly cued or primed by specific information in the questionnaire. This sequencing is consistent with recommendations by George Gallup and adopted by the American Institute of Public Opinion (Foddy 1994).

The May 1995 question assessing attitude towards the Reconciliation Process was: “From what you expect, know or have heard, are you in favour of, against, or have no feelings either way, about a process for Aboriginal Reconciliation?” Those responding ‘in favour’ or ‘against’ were asked: “Is that strongly in favour/against or somewhat in favour/against?”

The results for November 1991 to May 1995 are shown in Table 1 (reproduced from Donovan 1995).

TABLE 1: OVERALL ATTITUDE TOWARD A PROCESS FOR ABORIGINAL RECONCILIATION (DONOVAN 1995)

BASE N=1300 FOR EACH SAMPLE   Nov 1991   Feb 1993   Sept 1993   March 1994   August 1994   May 1995
                                  %          %          %           %             %            %
Strongly in favour              22.3       27.0       28.3        27.8          26.8         21.5
Somewhat in favour              25.9       25.7       26.9        24.3          25.0         27.7
TOTAL IN FAVOUR                (48.2)     (52.7)     (55.2)      (52.1)        (51.7)       (49.2)
No feelings either way          28.3       29.8       25.3        26.5          27.9         28.2
Somewhat against                 6.4        3.9        5.4         5.2           3.5          4.7
Strongly against                 3.9        4.7        4.3         4.7           4.5          3.0
Don't know                      13.2        9.0        9.9        11.4          12.3         14.9
TOTAL                          100.0      100.0      100.0       100.0         100.0        100.0

The May 1996 Question

Prior to being asked the attitude to Reconciliation question, 1996 respondents were asked a series of questions that were not asked in the May 1995 survey. The 1996 questionnaire began with a question with respect to how well respondents felt that various groups in society were ‘looked after’. ‘Aboriginal people’ were one group in a general context of ‘disadvantage’ (single mothers, elderly people, families, youth and disabled people). Respondents then were presented with 12 areas and asked to indicate the extent to which they thought indigenous people “are suffering disadvantage” in each of the 12 areas (e.g., jobs, health, cultural heritage, housing, etc). They were then asked a series of questions regarding “relationships between Aboriginal people and other Australians”. Specifically, the 1996 respondents were asked to:

(a) rate the state of “general relationships between Aboriginal people and other Australians”;
(b) give the reasons for their rating;
(c) give a prompted response with respect to perceived responsibility for “improving relationships between Aboriginal people and other Australians” (seven ‘groups’ listed, including ‘all individuals’); and
(d) rate their personal concern “about improving relationships between Aboriginal people and other Australians”.

The inclusion of these prior specific questions is not consistent with the procedures adopted by the American Institute of Public Opinion Research as noted above, and clearly constitutes a very different context to that of the 1995 question.

Using the same questions as in the 1995 questionnaire, 1996 respondents then were asked for spontaneous responses to the words “Aboriginal Reconciliation” and whether they had “heard of the topic Aboriginal Reconciliation before today”. In addition, if they had heard of the topic, they were asked where they had heard about the topic.
The question assessing attitude toward the Reconciliation process in May 1996 was: “Commonwealth Parliament voted to establish the Council for Aboriginal Reconciliation in 1991. The Parliament asked Council to promote a process of reconciliation between Aboriginal and Torres Strait Islander people and the wider community. Do you support this concept of reconciliation...?” Interviewers then read the response categories: “Strongly, A little, Not very much, Not at all”.

The May 1996 results are shown in Table 2 (reproduced from Sweeney 1996). The publication produced by Sweeney and Associates for the Council for Aboriginal Reconciliation presented the May 1995 and May 1996 results, along with results from previous surveys using the 1995 question, as in Figure 1 (reproduced from Johnson 1996).

TABLE 2: OVERALL ATTITUDE TO RECONCILIATION MAY 1996 RESULTS (SWEENEY 1996)

                  TOTAL %
Strongly             48
A little             34
Not very much         6
Not at all            5
Don't know            6
TOTAL               100

FIGURE 1: SUPPORT FOR ‘THE CONCEPT OF RECONCILIATION’ - TRENDS (JOHNSON 1996)

“Strongly in favour”
November 1991     20%
February 1993     21%
September 1993    28%
March 1994        28%
August 1994       27%
May 1995          22%

“Strongly support”
May 1996          48%

Comparison Between The 1995 And 1996 Questions On Attitude To The Reconciliation Process

It is clear that the 1995 and 1996 questions with respect to attitude toward Aboriginal Reconciliation and their response categories are quite different. The specific differences are delineated below.

Differences in wording

• The 1995 question used the terms ‘in favour’ and ‘against’. The 1996 question used the term ‘support’. Responses have been shown to differ for these two terms (e.g., Krosnick 1989). Furthermore, contrary to accepted practice (Malhotra 1993; Bearden, Netemeyer and Mobley 1993; Foddy 1994; Sudman and Bradburn 1982), BSA (Brian Sweeney & Associates) did not include an ‘oppose’ alternative (see below).
• The 1995 question asked about “a process for Aboriginal Reconciliation” - a more generic concept, whereas respondents in 1996 were presented with a specific item of information and then asked about “this concept of reconciliation” (underlining and italics added in each case), thus calling for a response specific to the information presented. In this sense, it is argued that the 1995 results are generalisable to a wider concept of Reconciliation and to existing attitudes in the wider population, whereas the 1996 results are generalisable only to people given that specific preamble.

• In 1996, the word ‘Aboriginal’ was omitted in reference to Reconciliation in the attitudinal question. The 1995 question referred to a process for Aboriginal Reconciliation whereas the 1996 question asked about this concept of Reconciliation. The 1996 question might have been confusing since “this concept” implies a distinction from other concepts of Reconciliation. Given the questions relating to ‘relationships’ immediately preceding the attitude question, a reasonable assumption might be that respondents were responding to the concept of improving relationships between indigenous and non-indigenous Australians rather than to a formal Reconciliation Process.

Differences in response categories

• The 1995 response scale used the qualifiers ‘strongly’ and ‘somewhat’; the 1996 response scale used ‘strongly’ and ‘a little’. The 1995 qualifiers are more consistent with the literature (Bearden et al 1993; Krosnick 1989).

• The 1996 response categories “not very much” and “not at all” are potentially ambiguous. “Not very much” may imply rejection or limited acceptance. Similarly, “not at all”, depending on voice expression, could reflect a neutral response or rejection.

• The 1995 question and response scale were balanced (i.e., presented both the positive and negative alternatives) with a neutral alternative.
The 1996 response scale was “unbalanced” (Malhotra 1993, p 301) because it did not offer a negative alternative. By definition, the measurement of attitude (“whether the respondent likes or dislikes the object, favours or disfavours the object ... pro or con ... the object”, Sudman and Bradburn 1982, p 123; see also Bearden et al 1993) requires the use of ‘bipolar’ response alternatives, and, as noted below, the absence of the ‘against’ or ‘oppose’ alternative yields a higher positive response than when it is included. Sudman and Bradburn (1982, p 180) note that “there is no simple way to provide balance (in the broad sense of the term) to most survey items, but one elementary step is to make formally explicit the fact that a negative response is as legitimate as an affirmative one”. It is noted that, in spite of not including negative options on their response scale, Sweeney (1996, p 19) refers to those “against” Reconciliation and Johnson (1996, p 4) refers to the level of “opposition” to Reconciliation. Both of these statements are misleading as the 1996 response scale included neither of these negative positions.

Differences in question format

• The 1995 approach is in effect a ‘filter’ approach (Schuman and Presser 1981), in that respondents are given a ‘no opinion’ alternative in the statement of the question. Filters reduce the likelihood of response sets and of respondents without an opinion volunteering a response because they feel they are expected to, and hence “contributing systematic error to survey data” (Schuman and Presser 1981, p 114).
• The 1995 procedure is also a ‘branching’ procedure in that intensity of attitude is measured after first establishing the direction of attitude. Krosnick and Berent (1990, p 24), after a series of experiments, concluded that “survey researchers should use branching formats whenever possible to maximise the reliability of their measurements”.

• The 1995 question presented people with three alternatives as part of the question - ‘in favour’, ‘against’, and a neutral alternative - before asking for a response. The 1996 question presented only the positive ‘support’ alternative before reading the response alternatives. Such questions are known to yield a higher degree of positive response (i.e., acquiescence) than questions offering both positive and negative alternatives (Lehman et al in Sterngold, Warland and Herrmann 1994; Schuman and Presser 1981; Petty et al 1987; Sudman and Bradburn 1982; Ayidiya and McClendon 1990). Furthermore, a presented neutral category attracts far greater responses than when not offered (Bishop 1987; Schuman and Presser 1981).

Context Effects of Preceding Question Topics

The 1996 question was preceded by several questions that might have served to prime respondents in a particular way such that the respondent's expressed attitude was a more transient response generated by information presented in the interview situation rather than reflecting a more enduring attitude. The readiness with which issues come to mind and the topics discussed prior to measuring of attitudes are primary influences on expressed attitudes (Tourangeau and Rasinski 1988). If we wish to measure pre-existing attitudes in response to the topic per se, it is crucial to avoid priming effects - unless of course, the prime is deliberately used for some reason.
The above mentioned preceding questions in 1996 would increase the likelihood that people would take these issues into account when expressing an opinion on “this concept of reconciliation”. In particular it is noted that the three immediately preceding questions related specifically to relationships between indigenous and non-indigenous Australians, two of which related to improving such relationships. Given that 73% of 1996 respondents claimed to be ‘quite’ or ‘extremely’ concerned about improving relationships and 84% claimed that ‘all individuals’ were responsible for improving relationships (Sweeney 1996), it is likely that these questions had a ‘priming’ effect to increase expressed support for a concept relating to bringing about improved relationships between indigenous and non-indigenous Australians (Foddy 1994; Tourangeau, Rasinski, Bradburn and D'Andrade 1989). The commitment and consistency principle (Cialdini 1984) also would operate in such a situation: “Order effects also may appear when questions have a close substantive relationship to one another, so that the answers to one question have logical implications for others. We mentioned earlier that there is a general strain toward consistency in attitude” (Sudman and Bradburn 1982, p 144).

WHICH QUESTION IS MORE LIKELY TO ACCURATELY REFLECT PUBLIC OPINION?

As noted above, acquiescence effects are likely to yield a higher positive response to the 1996 question than to the 1995 question. However, the question format and context effects also point to the 1995 question yielding a more accurate result. It might be argued that the 1996 question is in fact measuring a different attitude and that this question is a more valid indicator of opinion about Reconciliation.
However, such an argument must then accept that the original comparison between the results is invalid. For the present then, the discussion below proceeds on the assumption that the two questions purport to measure the same thing as implicitly claimed in Sweeney (1996) and Johnson (1996).

Consistency With Other Findings

Figure 2 shows levels of awareness of ‘the topic of Aboriginal Reconciliation’ for the 1991 - 1995 surveys and the 1996 survey. All surveys used precisely the same question wording for this variable (Johnson 1996). Figure 2 shows that the 1996 level is very similar to that of the past four surveys, although slightly higher than in 1995. This small increase in awareness is inconsistent with a ‘doubling in strong support’ for Aboriginal Reconciliation. This level of awareness finding supports the interpretation that 1996 respondents were responding to the immediacy of the question situation - not expressing an underlying more enduring attitude toward a process of Reconciliation. In the 1995 survey, 28% nominated the ‘no feelings either way’ category and 15% volunteered a ‘don't know’ response to the attitude to Reconciliation question (Donovan 1995) - a result consistent with the level of awareness of the topic and with an enduring rather than situation-derived response.
This conclusion is further supported by Donovan's (1994) analysis of the relationship between awareness and attitudes (Table 3).

FIGURE 2: AWARENESS OF ‘ABORIGINAL RECONCILIATION’ (JOHNSON 1996)

[Chart of awareness of the topic by survey, November 1991 to May 1996. Values legible in the source: February 1993, 31%; September 1993, 48%; March 1994, 50%; August 1994, 41%; May 1995, 46%.]

TABLE 3: ATTITUDE TO ABORIGINAL RECONCILIATION AS A FUNCTION OF AWARENESS OF THE TOPIC (DONOVAN 1994)

                            Feb 1993                  Sept 1993                 March 1994
Attitude to Process    % Aware     % Not Aware    % Aware    % Not Aware    % Aware    % Not Aware
In favour                65.7         46.8          67.4        43.5          64.9        44.3
Neutral              [illegible]      44.8          24.0        47.0          22.4        46.5
Against              [illegible]       8.4           8.6         9.5          12.7         9.2
Total                   100.0        100.0         100.0       100.0         100.0       100.0

The claimed high level of support for the Reconciliation Process is also inconsistent with the results shown in Table 4 (reproduced from Sweeney 1996). When 1996 respondents were presented with four alternative statements and asked to nominate the one that best represented their view, 46% chose a ‘forget the past’ alternative and a further 20% chose a ‘don't owe them anything in spite of past mistreatment’ alternative. These data appear to contradict the claimed high level of support for a Reconciliation Process and again support the conclusion that 1996 respondents were responding to the immediacy of the attitude to Reconciliation question situation - not expressing an underlying more enduring attitude toward a process of Reconciliation.

TABLE 4: MAIN BELIEF AND FEELINGS ABOUT INDIGENOUS ISSUES (SWEENEY 1996)

MAIN BELIEF                                                                TOTAL (1255) %
I believe all Australians have a debt to Aboriginal people because
of the way they have been treated since 1788                                    22
I believe Aboriginal people were mistreated in the past but I don't
think we owe them anything today                                                20
I believe we should forget about the past and all Australians should
just get on with their ...                                                      46
I don't care                                                                     1
Other                                                                           11

Consistency With Council for Aboriginal Reconciliation Activities

To this author's knowledge there were no large scale systematic campaigns by the Council for Aboriginal Reconciliation (or other groups) in the period May 1995 to May 1996 that could account for a doubling in ‘strong’ support for Reconciliation, but at the same time yield only a small increase in awareness of the topic.

Overall then, question wording effects and other evidence all support the conclusion that the 1995 survey results would be more accurate readings of community opinion with respect to a process for Aboriginal Reconciliation than the result reported for May 1996. The above inconsistencies also support the proposition that the ‘doubling in ‘strong’ support’ for Reconciliation is an artefact.

DISCUSSION

The Comparison

Public opinion pollsters and survey researchers continually face questions about the validity and reliability of the questions they ask in their questionnaires. While few question formats will ever be beyond some criticism, there is nevertheless a set of principles and practices that at least minimise actual and potential bias.
It is argued here that violation of a number of these principles and practices renders invalid the conclusion that “...‘strong’ support for reconciliation has more than doubled over the last twelve months” (Johnson 1996, p 4; Sweeney 1996), and that publication of this conclusion is misleading to the audiences exposed to this interpretation of the data.

The questions asked in the 1995 survey and the 1996 survey with respect to attitude towards the Process of Aboriginal Reconciliation were clearly quite different, as were the response categories used and the number and nature of preceding questions. A number of these factors, along with other data such as the level of awareness of the ‘topic of Aboriginal Reconciliation’, suggest that the 1995 question yields a more accurate reading of community opinion toward a formal process of Aboriginal Reconciliation than the 1996 question.

As argued above, the 1996 question appears to change the object of the attitude from a formal process of Reconciliation to some supposedly specific, yet undefined, concept of Reconciliation constructed by the interviewer in the interview situation. Given the preceding questions in 1996 about improving relationships between indigenous and non-indigenous Australians, the 1996 question is probably measuring attitude toward the principle of improving such relationships - not a process of Reconciliation as measured in previous surveys, and not as measured in the word association and awareness of the topic of Aboriginal Reconciliation questions asked in both surveys. That is, the object of measurement seems to be different to that asked about in these preceding questions. Given that a formal Reconciliation process does exist, that approximately half the sample were unaware of the topic, and that the 1995 question included a clear neutral alternative, the 1995 results appear to have
far more real world applicability than the 1996 results.

Sweeney (1996, p 18) states that their “questioning method was applied as a means of forcing respondents to take a stance rather than ‘fence sitting’”. This view of attitude measurement is inconsistent with both theory and reality (i.e., many individuals are neutral to many issues; Schuman and Presser 1981). In fact it is arguable that the problem of respondents expressing an opinion when they have none is a greater problem to be avoided than ‘forcing respondents off the fence’ (Schuman and Presser 1981). The absence of negative response categories can suggest to respondents that opposition to the presented proposition is not expected and hence inhibit negative responses and encourage positive responses (via the principle of social proof; Cialdini 1984; Foddy 1994). In areas subject to socially desirable response influences (Sudman and Bradburn 1982), and in the absence of an ‘opposition’ response category, it can be argued in fact that the 1996 questioning method is a means of inducing respondents to take a positive stance.

Professional Practice Issues

While the difference in wording is noted in Sweeney (1996), and that report initially concedes that this might have contributed to ‘some’ increase in support for Reconciliation, the authors nevertheless dismiss this possibility as insignificant (pages 18, 19) and proceed to present the ‘doubling of ‘strong’ support in the past twelve months’ as a valid finding. Regardless of any differences in question wording and context, such a statistically and socio-politically significant difference should have triggered some questioning of the validity of the difference and a search for artefacts that might explain - or factors that might validate - the finding, and especially before widespread publicising of the claimed increase. Neither Johnson (1996) nor Sweeney (1996) offers any
Neither (Johnson 1996) nor Sweeney (1996) offer any‘Australasian Journal of Market Research explanation as to why such a significant increase in support for the Reconciliation process occurred in the period May 1995 to May 1996. Given the amount of literature available on the issues raised above, it is legitimate to ask why the differences in question wording were dismissed as @ possible artefactual source of increase in ‘strong’ support for Reconciliation and why the difference in the questions was not mentioned in the Johnson (1996) publication clearly intended for public distribution, It may be that courses in market research (as distinct from social survey research) place less emphasis on question wording, response categories and context than they should. ‘The Market Research Society of ‘Australia’s Code of Professional Behaviour (Revised 1995) contains clauses with respect to the accurate reporting of results ‘and to the dissemination of conclusions which are not adequately supported by the data and which may be misleading, and calls for the researcher to take action where this has occurred. If this paper's arguments are accepted, there appears to be an obligation to supply correcting information to all past and future recipients of the Sweeney (1996) report, and, in particular, given its intended wider distribution, the Tohnson (1996) publication. Ik is perhaps appropriate also to note that where organisations seek or consent to mass media publicity for their survey findings, they not only have a “duty of care’ as outlined above, but must then accept public scrutiny of the validity of the publicised findings and the methods used to obtain these findings. 
Implications For Reconciliation Communication Strategies

The disparity between the 1995 and the 1996 attitudinal results is of some significance to the Council for Aboriginal Reconciliation because the implications for the Council's communication strategies are quite different.

First, whereas the 1995 result led to a recommendation for targeting those expressing a neutral position with respect to Reconciliation as a priority (Donovan 1995), the 1996 result implies that strategies to create favourable attitudes are not a priority as, by the 1996 measure, almost all respondents are already positive (83%; Johnson 1996). The 1995 recommended strategy, consistent with accepted communication strategies (Donovan and Leivers 1993), was based on winning over the ‘undecideds’ before they become exposed to information that might lead to them adopting negative attitudes. The 1995 recommended strategy was also consistent with the data with respect to the level of awareness of the topic of Aboriginal Reconciliation and the relationship between this awareness and attitude toward the Reconciliation Process.

Second, whereas the 1995 strategy recommended increased efforts to increase awareness of the Reconciliation Process (especially as awareness was positively related to attitude to Reconciliation), the 1996 results imply that this is unnecessary given the high levels of support anyway. In fact, even measuring awareness of the topic seems to be rendered somewhat irrelevant given the 1996 attitudinal result. In short, the 1996 attitude result poses something of a conundrum for the Council in terms of what should be their topic awareness objective. Given the 1996 result showing little change in awareness of the topic of Reconciliation at around 50% of the population, and treating the increase in ‘strong’ support as artefactual, it is likely that the previously recommended strategy of targeting ‘undecideds’ is still valid.
Concluding Comment

The above discussion highlights two aspects of attitude measurement: the importance of clearly defining the object of measurement; and the importance of question format and wording, response alternatives, and question order in influencing responses. These two aspects are particularly important when attempting to measure attitude change over time. While such matters are important in all areas of research, they are arguably more important in social research where errors in measurement and interpretation may have far reaching consequences for society as a whole.

Finally, this paper's discussion of the comparison between the two surveys' findings suggests that where researchers deal with issues that have socio-political ramifications, extra vigilance may be required with respect to objectivity and accuracy in reporting and disseminating survey findings. Given that findings from public opinion polling are part of today's news, either via contracts between research agencies and media vehicles or via the polling agency's self-promotion, and that market researchers are often commissioned to provide data to support particular organisations' views in the media and elsewhere, it may be timely to note Kornhauser's statement of some 50 years ago with respect to US polls being biased against organised labor: "the technical job of avoiding bias can be reasonably well handled but more formidable hurdles are the social-economic pressures upon the opinion polling agencies and, closely associated, the personal social outlook of the agency staffs" (Kornhauser 1946-1947, p 499). Perhaps the biggest 'hurdles' are that we are often unaware that various biases might be operating, or convince ourselves that they are not.

REFERENCES

Ayidiya, S.A. & McClendon, M.J. (1990). Response Effects in Mail Surveys. Public Opinion Quarterly 54: 229-247.
Bearden, W.O., Netemeyer, R.G. & Mobley, M.F. (1993).
Handbook of Marketing Scales: Multi-Item Measures for Marketing and Consumer Behavior Research. London: Sage.
Bishop, G.F. (1987). Experiments with the Middle Response Alternative in Survey Questions. Public Opinion Quarterly 51: 220-232.
Cialdini, R.B. (1984). Influence: The New Psychology of Modern Persuasion. New York: Quill.
Donovan, R.J. & Leivers, S. (1993). Using Mass Media to Change Racial Stereotype Beliefs. Public Opinion Quarterly 57: 205-218.
Donovan Research (1994). Aboriginal Reconciliation Tracking Research. Summary Report to the Aboriginal Reconciliation Unit, Department of the Prime Minister & Cabinet. Donovan Research, Perth.
Donovan Research (1995). Aboriginal Reconciliation Tracking Research. Summary Report to the Aboriginal Reconciliation Unit, Department of the Prime Minister & Cabinet. Donovan Research, Perth.
Foddy, W. (1994). Constructing Questions for Interviews and Questionnaires: Theory and Practice in Social Research. New York: Cambridge University Press.
Johnson, J. (1996). Unfinished Business: Australians and Reconciliation. Produced by Brian Sweeney & Associates for the Council for Aboriginal Reconciliation. Australian Government Publishing Service, Canberra.
Kornhauser, A. (1946-1947). Are Public Opinion Polls Fair to Organized Labor? Public Opinion Quarterly 484-500.
Krosnick, J.A. (1989). Review: Question Wording and Reports of Survey Results. Public Opinion Quarterly 53: 107-113.
Krosnick, J.A. & Alwin, D.F. (1987). An Evaluation of a Cognitive Theory of Response-Order Effects in Survey Measurement. Public Opinion Quarterly 51: 201-219.
Krosnick, J.A. & Berent, M.K. (1990). The Impact of Verbal Labeling of Response Alternatives and Branching on Attitude Measurement Reliability in Surveys. The Ohio State University, Department of Psychology.
Lehman, D.R., Krosnick, J.A., West, R.L. & Li, F. (1991).
The Focus of Judgement Effect: A Question Wording Effect Due to Hypothesis Confirmation Bias. In: Sterngold, A., Warland, R.H. & Herrmann, R.O. Do Surveys Overstate Public Concerns? Public Opinion Quarterly 58: 255-263.
Malhotra, N.K. (1993). Marketing Research: An Applied Orientation. New Jersey: Prentice-Hall.
Petty, R.E., Rennier, G.A. & Cacioppo, J.T. (1987). Assertion versus Interrogation Format in Opinion Surveys: Questions Enhance Thoughtful Responding. Public Opinion Quarterly 51: 481-494.
Rasinski, K.A. (1989). Question Wording and Support for Government Spending. Public Opinion Quarterly 53: 388-394.
Schuman, H. & Presser, S. (1981). Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Content. New York: Academic Press.
Sudman, S. & Bradburn, N.M. (1982). Asking Questions: A Practical Guide to Questionnaire Design. London: Jossey-Bass.
Sweeney, Brian & Associates (1996). A Report on a National Quantitative Survey of Community Attitudes Towards Aboriginal Reconciliation. Report to the Department of the Prime Minister & Cabinet for the Council for Aboriginal Reconciliation. Brian Sweeney & Associates, Melbourne.
Tourangeau, R. & Rasinski, K. (1988). Cognitive Processes Underlying Context Effects in Attitude Measurement. Psychological Bulletin 103: 299-314.
Tourangeau, R., Rasinski, K., Bradburn, N. & D'Andrade, R. (1989). Carryover Effects in Attitude Surveys. Public Opinion Quarterly 53: 495-524.

STRUCTURAL EQUATION MODELLING IN MARKET RESEARCH

Scott MacLean
DBM Consultants Pty Ltd
Melbourne Australia

Keven Gray
ACNielsen. SRG
Tokyo Japan

ABSTRACT

Structural Equation Modelling (SEM) is a technique which effectively subsumes a whole range of standard multivariate analysis methods, including regression, factor analysis and analysis of variance.
Whilst being a sophisticated theoretical tool, and certainly not easy to implement, SEM actually underlies much of what practising market researchers do on a daily basis. That is, on the basis of things we can measure, we attempt to make predictions of things we cannot measure. For market research, SEM provides an opportunity (in fact, a requirement) to hypothesise models of market behaviour, and to test or confirm these models statistically. In the paper, some examples are presented to show some of the benefits of this modelling approach.

Technically, SEM estimates the unknown coefficients in a set of linear structural equations. Variables in the equation system are usually directly observed variables, and unmeasured latent variables that are not observed but relate to observed variables. SEM assumes there is a causal structure among a set of latent variables, and that the observed variables are indicators of the latent variables. The latent variables may appear as linear combinations of observed variables, or they may be intervening variables in a causal chain.

One of the findings deriving from the examples presented in the paper is that conclusions drawn from techniques such as Exploratory Factor Analysis and regression (eg as used in many customer satisfaction approaches) may be unsustainable in terms of their statistical integrity.

INTRODUCTION

To paraphrase Byrne (1994), Structural Equation Modelling (SEM) is a statistical methodology that takes an hypothesis-testing (ie confirmatory) approach to multivariate analysis. By contrast, multivariate procedures commonly used in market research are essentially descriptive or exploratory in nature (eg principal components analysis, cluster analysis), so that hypothesis testing is difficult, if not impossible.
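For readers who prefer symbols, the "set of linear structural equations" mentioned in the abstract is conventionally written along the following lines. The notation is the customary LISREL one and is supplied here purely as an illustration; the symbols are not used elsewhere in this paper.

```latex
\begin{aligned}
\eta &= B\,\eta + \Gamma\,\xi + \zeta
  && \text{(structural model linking the latent variables)} \\
y &= \Lambda_y\,\eta + \varepsilon
  && \text{(measurement model for the indicators of } \eta\text{)} \\
x &= \Lambda_x\,\xi + \delta
  && \text{(measurement model for the indicators of } \xi\text{)}
\end{aligned}
```

Here \eta and \xi denote the endogenous and exogenous latent variables, y and x their observed indicators, B, \Gamma, \Lambda_y and \Lambda_x the coefficient matrices to be estimated, and \zeta, \varepsilon and \delta the residual and error terms.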
SEM generally involves the specification of an underpinning linear regression-type model (incorporating the structural relationships or equations between unobserved or latent variables) together with a number of observed or measured indicator variables. By examining the co-variation between the observed variables, it is possible to:

• estimate the values of the coefficients in the underpinning linear model;
• statistically test the adequacy of the model to represent the process(es) being studied; and
• if the model is adequate, conclude that the postulated relationships are plausible (or, more correctly, that they are not inconsistent with the data).

For market research, SEM provides an opportunity to hypothesise models of market behaviour, and to test these models statistically. In this paper, examples and case studies will be presented which show, in part, that conclusions drawn from what are now fairly standard applications of techniques such as Exploratory Factor Analysis and regression (eg as used in many customer satisfaction approaches) may be unsustainable in terms of their statistical integrity.

SOME BASIC CONCEPTS

A Structural Equation Model in its most general form involves the specification of a number of components which, when pictured in full detail, can be more than daunting to the tyro modeller. Anyone who has perused the LISREL® (1989) documentation will surely agree with this!¹ It is therefore instructive to examine the various elements of SEM, one by one. First, however, a small parable may be of assistance.

Let us take a brief sojourn to Omote-sando, one of Tokyo's chic fashion districts. Here in Omote-sando we observe a young woman - let us call her Yumi - emerging from one of the trendy and very expensive boutiques which abound in this area. She is elegantly and expensively dressed and coiffured, and it is apparent to us that Yumi pays a great deal of attention to her appearance.
In market research jargon, we might also say she appears very "fashion-conscious."

Though we often use terms such as "fashion-conscious" casually, it is important to recognise that fashion-consciousness is in reality a theoretical construct; we cannot actually see it but can only infer its presence from what we can observe. In other words, it is a latent or unobserved variable. In our example, we can observe Yumi's dress and manner and the Omote-sando boutique at which she's been shopping and make the inference that she is fashion-conscious. One may object and conclude instead that Yumi is simply materialistic. Materialism is another example of a latent or unobserved variable. Or, one may determine she is both fashion-conscious and materialistic. In this case we would, in effect, be saying that these two latent variables are correlated.

¹ See also Long (1983) for a simpler treatment of LISREL®.

Measurable and Unmeasurable Variables

Naturally, in Market Research we would not normally venture into Omote-sando, observe young women like Yumi and speculate about latent variables. We often do, however, administer questionnaires to consumers which probe for concepts such as "fashion-consciousness", "materialism", etc. By asking them to make self-assessments on items such as "I usually have one or more outfits that are of the very latest style," we are attempting to measure the extent of their fashion-consciousness, etc., though we recognise that we cannot do so perfectly. (That is, we can measure but only with error.) The statement "I usually have one or more outfits that are of the very latest style" is an example of a measurable variable and, similarly, "fashion-consciousness" is an example of an unmeasurable, latent variable.
To relate this to our earlier discussion, by asking Yumi to make self-assessments such as this, we are attempting to indirectly measure a latent variable which is, in fact, a theoretical construct which cannot be measured directly.

Latent Variables

Thus unobserved or unmeasured (latent) variables are those which represent abstract concepts or theoretical constructs which cannot be directly measured. Such variables are often referred to as 'factors' or 'common factors'. That is, they are presumed to underlie what can be observed, in the sense that the latent variables directly influence the outcome or values taken by the observed variables.

In pictorial form, latent variables can be represented as ellipses, as shown in Figure 1.

FIGURE 1
[Diagram: a single ellipse labelled FASHION CONSCIOUSNESS]

Latent variables can be correlated with each other, as represented by the double-headed arrow in Figure 2.

FIGURE 2
[Diagram: ellipses labelled FASHION CONSCIOUSNESS and MATERIALISM, joined by a double-headed arrow]

Latent variables can also influence other latent variables directly, via a regression-type relationship, as represented by the single-headed arrows below:

FIGURE 3
[Diagram: ellipses labelled FASHION CONSCIOUSNESS, MATERIALISM and INCLINATION TO PURCHASE, linked by single-headed arrows]

Observed Variables

Because latent variables are, by definition, unobservable, their measurement must be obtained indirectly. This is done by linking one or more observed variables to each unobserved variable. In fact, whilst this may sound an overly-fussy process, as shown in the case of our Japanese shopper Yumi, it is effectively what most of us do on a day-to-day basis as we prepare questionnaires. The difference, however, lies in how we analyse the information we collect.

With SEM, the linking of observed (or indicator) variables with latent (or unobserved) variables is the first step in a formal statistically valid procedure. In contrast, with our day-to-day work the linking procedure is oftentimes implicit - in other words, if we feel that a particular measured variable makes a good indicator of some underlying construct, then we simply use it!

In pictorial form, observed or indicator variables can be represented as rectangles, as shown in Figure 4.

FIGURE 4
[Path diagram: the latent variables (ellipses) with single-headed arrows to their indicator variables (rectangles). Rectangle labels: "Fashion is an important means of self-expression"; "I like high-class items"; "I'm usually the first to learn about a new brand or product"; "I am extravagant about my clothes and food"; "I'm the type to buy something I want immediately"]

In this diagram, the single-headed arrows connecting the latent and observed variables indicate that the latent variables directly influence the outcome or values taken by the observed variables, again through a regression-type relationship.

We can go still further, in terms of identifying observed variables for the completely endogenous latent variable labelled as "Inclination to purchase", as illustrated in Figure 5.

FIGURE 5
[Path diagram: as Figure 4, with two further rectangles: "The chances of me buying something today are high" and "I'm not really interested in buying anything today"]

Still More Variables

Apart from the latent and observed variables, there are residual and error terms associated with each of these which also form a key part of the overall model. For simplicity, however, we omit these from the discussion, and refer the interested reader to the references. Suffice it to say that a fully specified Structural Equation Model is potentially a complex interplay between a large number of observed and unobserved variables, and residual and error terms.

Example I - Japanese Single Women

In order to illustrate the concepts of observed/measured and unmeasured/latent variables we have already introduced you to a fictitious young woman whom we called Yumi. Fashion Consciousness and Materialism were used as examples of unmeasured/latent variables, and it was hypothesised that these two latent variables might be inter-correlated. We would now like to proceed beyond allegory and share some results of an actual study conducted among consumers who, in many aspects, are very much like Yumi.

Recently, SRG Japan conducted a U&A study on overseas travel among young Japanese single women - OLs in the local vernacular. The term OL is an abbreviation of "Office Lady" and is widely used in Japan to refer to single women working in non-management and non-technical occupations, most often clerical work. Although their earned incomes are typically not high, OLs are one of the most important consumer groups in Japan because they often live with their parents rent-free and tend to have significant disposable incomes - incomes which they frequently spend quite freely. Another distinction OLs have which is important to the travel industry is that they often have more freedom to travel during any time of the year than other consumer groups.

A key objective of this research was to explore personality factors underlying OLs' preferences for overseas destinations and travel arrangements. Consequently, during the interviews, respondents rated themselves on a battery of psychographic items which had been developed through preliminary qualitative research. The qualitative phase of the research had suggested five principal psychographic factors of relevance to overseas travel experience and tastes.
These were:

• Fashion Consciousness
• Materialism
• Assertiveness
• Conservatism
• Hedonism.

Each of these latent constructs was measured by three to four measured variables (items). These are shown in Figure 6.

FIGURE 6
MEASURED VARIABLES FOR EXAMPLE I - JAPANESE SINGLE WOMEN

Fashion Consciousness
V13  Fashion is an important means of self-expression
V20  I like high-class items
V??  I'm usually the first among my friends to learn about a new brand or product

Materialism (or "Extravagance")
V??  I am extravagant about my clothes and food
V??  I'm the type to buy something I want immediately even if I have to borrow money
V??  I'm the type that doesn't hesitate to buy necessary things even if they are somewhat expensive

Assertiveness
V??  I make friends quickly even with people I've just met
V17  I challenge anything without fear of failure
V33  I socialise with many different types of people
V39  I am the type to clearly state my opinions to others

Conservatism (or "Deliberateness")
V3   I tend to achieve my goals one step at a time
V6   I am the type to deliberate things
V??  I gather various information and study well when deciding to buy a specific item

Hedonism
V1   I want to enjoy the present rather than think about the future
V9   I like to go out to night-time entertainment spots
V12  I want to lead a life with lots of ups and downs

Based on the qualitative research and Exploratory Factor Analysis (EFA) of the quantitative results, a number of Structural Equation Models were developed and tested, each of which hypothesised different inter-relationships among the five latent constructs listed above. The path diagram representing the model we consider most meaningful in light of the overall findings of the research is shown in Figure 7.

To build upon our earlier discussion, in the path diagram, latent constructs (unmeasured variables) are shown in ellipses, and the questionnaire items used to measure these latent constructs, ie measured variables, are shown in rectangles. Arrows pointing from the ellipses to the rectangles are equivalent to factor loadings in factor analysis. (With two exceptions, all loadings were above 0.50.) Arrows between the unobserved variables represent correlations among these factors (since correlations are two-way associations, all arrows between the unobserved variables are two-headed).

FIGURE 7
[Path diagram: the five latent constructs shown as ellipses, each linked by single-headed arrows to its measured variables (rectangles), with two-headed arrows between correlated constructs]

To many readers of a Western background, the overall results may seem surprisingly intuitive. Fashion Consciousness and Materialism were, indeed, found to be highly associated and in a positive direction. Indeed, this correlation (0.84) is so strong as to suggest these two factors themselves may really be functions of a second-order factor, though the confirmation of this would need further research. Materialism and Assertiveness are also found to be positively related but more weakly. The correlation between Assertiveness and Fashion Consciousness is weakly positive (0.26) but, nonetheless, statistically significant.

Looking to the right side of the diagram, we see that Conservatism and Hedonism are negatively associated. This relationship also is weak but significant. We also note that Fashion Consciousness, Materialism and Assertiveness are all moderately positively associated with Hedonism. And, as had been anticipated, Materialism and Conservatism are negatively related in this research, though this relationship is not strong. In an earlier preliminary model, Fashion Consciousness and Assertiveness were not found to be associated with Conservatism, and these paths were deleted prior to testing the present model.

To recapitulate, fashion conscious OLs are also inclined to be on the spendthrift side and to have a hedonistic streak, though they are not necessarily exceptionally assertive or extroverted.
Given these patterns, when vacationing abroad one might expect they would tend to look for an abundance of places to shop, especially for high-priced/fashion goods. Choice restaurants and perhaps nightspots would probably also be considerations for many of them when choosing a travel destination and/or travel package. More conservative or methodical types, on the other hand, would be expected to be less extravagant, fashion conscious and assertive and also less hedonistic. Other results for this survey suggested these young women might, instead, be more inclined to enjoy the local flavour of their destination or simply relax.

Calibration and Hypothesis Testing

So far, so good. We have a nice looking picture which (for our example) makes a certain amount of sense in terms of describing the key relationships in a model of market behaviour. In fact, what we have is more than that.

Firstly, the diagram indicates that there is an hypothesised relationship between a number of latent variables which forms the underpinning causal structure of behaviour in this market. This is the so-called structural model.

Secondly, the diagram indicates that there are a number of variables which we can directly observe, the statistical relationships between which we may be able to use to calibrate the underlying structural model. This set of statistical relationships is the so-called measurement model. [Recall that the latent variables are linked to each other via regression-type relationships, so that calibration in this context simply means estimating values for the relevant regression coefficients.]
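To make the link between the two parts of the model a little more concrete, the sketch below shows how a set of factor loadings and factor correlations implies a correlation for every pair of observed variables. It is a minimal illustration rather than the estimation procedure itself: the loadings and the function name are hypothetical, and only the factor correlation of 0.84 is taken from Example I. Calibration works in the opposite direction, choosing the coefficients so that the implied values match the observed ones as closely as possible.

```python
def implied_correlations(loadings, factor_corr):
    """Model-implied correlations between standardised indicators.

    loadings[i][a] is the loading of indicator i on factor a;
    factor_corr[a][b] is the correlation between factors a and b.
    Off-diagonal entries are sum_a sum_b l_ia * phi_ab * l_jb.
    The diagonal is set to 1 because the indicators are standardised
    (communality plus unique variance equals one).
    """
    n_items = len(loadings)
    n_factors = len(factor_corr)
    implied = [[1.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i != j:
                implied[i][j] = sum(
                    loadings[i][a] * factor_corr[a][b] * loadings[j][b]
                    for a in range(n_factors)
                    for b in range(n_factors)
                )
    return implied


# Hypothetical loadings: two indicators for each of two factors.
loadings = [
    [0.7, 0.0],  # indicator 1 loads on factor 1 only
    [0.6, 0.0],  # indicator 2 loads on factor 1 only
    [0.0, 0.8],  # indicator 3 loads on factor 2 only
    [0.0, 0.5],  # indicator 4 loads on factor 2 only
]
# 0.84 is the Fashion Consciousness / Materialism correlation from Example I.
factor_corr = [[1.0, 0.84], [0.84, 1.0]]

implied = implied_correlations(loadings, factor_corr)
print(round(implied[0][1], 4))  # same factor: 0.7 * 0.6
print(round(implied[0][2], 4))  # different factors: 0.7 * 0.84 * 0.8
```

Two indicators of the same factor are tied together by the product of their loadings, while indicators of different factors are tied together only through the correlation between the factors.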
The central thesis of SEM is then twofold:

• the statistical relationship between the observed variables (in fact, the estimated covariances between them) can be used to provide estimates of the regression coefficients which link the unobserved, latent variables; and
• the adequacy, or goodness-of-fit, of the hypothesised structural model can be statistically tested using methods closely aligned with conventional chi-square goodness-of-fit approaches.

Let's have a look at one more illustrative case study to try to make things clearer, before summarising our conclusions.

Example II - Australian Employee Satisfaction

In 1996, AGB McNair (now A C Nielsen) undertook an Employee Opinion Survey, the objective of which was to obtain benchmark information regarding the current attitudes of Australian employees to their work environment. A drop-off and mail-back methodology was used. Questionnaires were placed with employed respondents aged 16 years and over by AGB McNair Face to Face Omnibus interviewers. A total of 740 completed questionnaires were returned.

Information was collected on the following seven categories:

• Employee satisfaction;
• Leadership and support;
• Customers;
• Communication;
• Feedback, recognition and training;
• Quality and safety; and
• Personal life.

Respondents were asked to rate whether they agreed or disagreed with a number of statements under each of the above categories using the following scale:

1. Disagree strongly
2. Disagree
3. Neither agree nor disagree
4. Agree
5. Agree strongly

The categories themselves, and the various statements which underpin them, were selected so as to be broadly consistent with the criteria laid down by the Australian Quality Council, in relation to the Australian Quality Awards Assessment Framework. Details may be found in AQC (1996).
In the AGB McNair (1996) report, analysis of the information was carried out for a number of key demographics, including:

• the size of the organisation in which the respondent was employed;
• the type of industry in which the respondent was employed;
• the position in the organisation the respondent held; and
• the State in which the respondent resided.

In the present context, however, it is instructive to take a leaf out of the SEM book and treat the seven categories as subsuming a number of latent, unobserved variables, and the various underpinning statements as comprising the measured variables to be used as indicators of the latent constructs. Whilst the Employee Opinion Survey was not designed with this type of analysis in mind (indeed, nor was the earlier Yumi example), it is certainly possible to hypothesise the nature and direction of the relationships which exist amongst the latent variables, and thereby test the statistical and practical significance of the associated structural model.

Factor Analysis and Regression

These types of data are normally analysed by means of an Exploratory Factor Analysis (EFA), usually implemented in the form of Principal Components Analysis (see, for example, Johnson and Wichern (1992)). Regression of the resultant factor scores (see Pedhazur (1982)) against some overall criterion measure (eg Overall Satisfaction) gives rise to standardised regression coefficients, which can be normalised (ie re-scaled so as to add to 100) and, it is claimed, thereby give an indication of the relative importances of the different factors.

As an illustration, consider the category Quality and Safety. In the Employee Opinion Survey, nine statements were used as the measured variables for this construct, as shown in Figure 8.

FIGURE 8
MEASURED VARIABLES FOR EXAMPLE II - AUSTRALIAN EMPLOYEE SATISFACTION

• Managers/supervisors talk with people about safety issues;
• Unsafe acts and conditions are never ignored and are reported by all personnel;
• Safety is never overridden by work/production issues;
• There is a positive link between quality and safety;
• When an employee here is off sick or injured, he/she tries to get back to work as soon as possible;
• A clean and tidy workplace is encouraged;
• Our team is continually looking for more ways to reduce waste (time and resources);
• If I am injured at work, I feel I would be well looked after; and
• Quality is never sacrificed by work pressures.

A casual inspection of the statements themselves suggests that there may be two factors (latent variables) which underlie them:

• Safety; and
• Efficiency/Quality.

Sure enough, if we carry out a Principal Components Analysis, two factors emerge which we may call 'safety' and 'efficiency/quality' with (rotated) loadings as shown in Figure 9 (total variance explained is 59%).

Regression of the individual factor scores against Overall Satisfaction (R-square = .30; Std error = .87), and inspection of the standardised regression coefficients, yields the following result concerning relative importances:

Safety               52%
Efficiency/Quality   48%

On the face of it, this is a clear and simple conclusion. Based on our data, we infer that in terms of the impact on Overall Employee Satisfaction, the factors of Efficiency/Quality and Safety are of almost equal importance. And this is the type of result that is presented to management and clients every day in the market research world.

FIGURE 9
Factor 1: Safety. Factor 2: Efficiency/Quality. Where a statement loads on both factors, both loadings are given.

STATEMENT | LOADING(S)
Managers/supervisors talk with people about safety issues | .744
Unsafe acts and conditions are never ignored and are reported by all personnel | .809
Safety is never overridden by work/production issues | .826
There is a positive link between quality and safety | .753
A clean and tidy workplace is encouraged | .556 (Factor 1), .408 (Factor 2)
Our team is continually looking for more ways to reduce waste (time and resources) | .663
When an employee here is off sick or injured, he/she tries to get back to work as soon as possible | .42
If I am injured at work, I feel I would be well looked after | .575
Quality is never sacrificed by work pressures | .446 (Factor 1), .565 (Factor 2)

Note: for clarity only loadings greater than 0.4 are shown.

Consider, however, that:

• we have assumed that the Exploratory Factor Analysis solution is a 'good' solution - it has not been subject to any form of statistical testing (in fact, it cannot be); and
• we have put forward conclusions regarding relative importance which must at best be regarded with some degree of circumspection, being based on a linear regression with such a low R-square and large standard error.²

² There are also other objections to this approach, which are concerned with the essentially retrospective nature of the analysis; that is, we are looking only at what has happened in the past, not at what might happen in the future.

The alternative is to do all of these things on the one pass, using the techniques of Structural Equation Modelling.

Structural Model - Stage I

Using the SEM approach, a factor structure would normally be hypothesised based on a variety of considerations (eg the results of qualitative research), the necessary model defined, and its adequacy tested statistically. Instead, as a short cut (although not generally advisable - see Bentler (1995)), in many cases the results of an Exploratory Factor Analysis may themselves be used to define a factor structure.
Thus, with our present example we might postulate that the factor structure which underlies the Safety/Quality questions is as shown in Figure 10.

FIGURE 10

Here, the 'z' variables correspond to the statements listed earlier, administered in the questionnaire via an 'agree/disagree' scale. The arrows show that the 'z' variables are indicators of the two underlying latent constructs, or factors, namely Safety and Efficiency/Quality. In other words, the diagram simply represents the factor analysis solution shown in the table earlier (Figure 9). It should also be noted that two of the measured variables are assumed to be associated with more than one factor.

When the necessary calculations have been completed, we obtain the results shown in Figure 11 (standardised coefficients shown):

FIGURE 11

Several items of information are contained in the above picture, and the implications are immediately discernible.

Firstly, the standardised regression coefficients from the Structural Equation Model are roughly the same as the factor coefficients arising from the Exploratory Factor Analysis (as we hope they would be). In Figure 12 the two sets of results are compared. The reason they are different, of course, is that the EFA solution assumes that every indicator variable is statistically related (correlated) with every factor. In contrast, the SEM solution assumes that the only indicator variables correlated with the factors are those shown linked by an arrow (in fact a much more flexible arrangement for model specification).
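The difference between the two specifications can be made explicit by writing down which loadings each approach leaves free to be estimated. The sketch below is illustrative only: the paper states that there are nine indicators, two factors and two indicators loading on both factors, but the assignment of particular 'z' variables to factors shown here is hypothetical, as are the names used.

```python
FACTORS = ("safety", "efficiency_quality")

# Factors each indicator is allowed to load on in the SEM.
# Hypothetical assignment: only the totals (nine indicators, two of them
# loading on both factors) are taken from the text.
sem_pattern = {
    "z1": {"safety"},
    "z2": {"safety"},
    "z3": {"safety"},
    "z4": {"safety"},
    "z5": {"safety", "efficiency_quality"},
    "z6": {"efficiency_quality"},
    "z7": {"efficiency_quality"},
    "z8": {"safety"},
    "z9": {"safety", "efficiency_quality"},
}

# In the exploratory solution every indicator loads on every factor.
efa_pattern = {item: set(FACTORS) for item in sem_pattern}


def count_free_loadings(pattern):
    """Number of loadings the model is free to estimate."""
    return sum(len(allowed) for allowed in pattern.values())


def loadings_fixed_to_zero(pattern):
    """(indicator, factor) pairs that the model constrains to zero."""
    return sorted(
        (item, factor)
        for item, allowed in pattern.items()
        for factor in FACTORS
        if factor not in allowed
    )


print("EFA free loadings:", count_free_loadings(efa_pattern))
print("SEM free loadings:", count_free_loadings(sem_pattern))
print("SEM loadings fixed to zero:", len(loadings_fixed_to_zero(sem_pattern)))
```

Each loading fixed to zero is a statement the analyst is prepared to defend, and it is these restrictions that give the SEM something to test.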
Secondly, and more importantly, from the SEM we have a lot more information, specifically in relation to the 'goodness' of the solution.

FIGURE 12
Factor 1: Safety. Factor 2: Efficiency/Quality. Where a statement loads on both factors, both loadings are given.

STATEMENT | EFA | SEM
Managers/supervisors talk with people about safety issues | .74 | .86
Unsafe acts and conditions are never ignored and are reported by all personnel | .81 | .89
Safety is never overridden by work/production issues | .83 | .85
There is a positive link between quality and safety | .75 | .69
A clean and tidy workplace is encouraged | .56 (Factor 1), .41 (Factor 2) | .46 (Factor 1), .43 (Factor 2)
Our team is continually looking for more ways to reduce waste (time and resources) | .66 | .62
When an employee here is off sick or injured, he/she tries to get back to work as soon as possible | .42 | .60
If I am injured at work, I feel I would be well looked after | .58 | .59
Quality is never sacrificed by work pressures | .45 (Factor 1), .57 (Factor 2) | .42 (Factor 1), .33 (Factor 2)

³ Remember that the table of factor loadings shown earlier excluded any loadings less than 0.4.

One measure of the goodness-of-fit of the SEM solution is given by the chi-square value of 335.54, which is highly significant (p = .000). Note, however, that this does not mean our model is 'good'. In fact the opposite, from the point of view of statistical significance.⁴ In fact, one may say that what we are actually testing is "badness-of-fit".

However, whilst the chi-square value is too large (ie the p-value is too small) to be able to accept our model on strict statistical grounds, the other goodness-of-fit measures quoted⁵ are not too bad. The so-called Goodness-of-Fit Index (GFI) is .916, and the Adjusted Goodness-of-Fit Index (AGFI) is .849. The best you can get is unity with these two measures, so on the basis of the results obtained, we would probably say that the model is 'good enough'.

⁴ The reason why a low p-value implies a 'bad' model is that the null hypothesis for this test is that the model is a 'good' model. So a low p-value (that is, one close to zero) means that we reject the null hypothesis, with a low probability of being wrong in reaching that conclusion. Conversely, a high p-value (ie a value well above zero) would mean that if we did reject the null hypothesis (ie conclude that the model is bad) then there would be a high probability that we would be wrong in doing so.

⁵ There is a huge literature on testing the goodness-of-fit of SEM solutions. The principal consensus seems to be that there is no consensus on which is the best approach. Bollen and Long (1993) provide probably the best discussion of the many issues involved.

We can improve things further by allowing for a degree of co-variation between the factors, with results as shown in Figure 13.

FIGURE 13
[Path diagram: the two-factor model with a double-headed arrow between SAFETY and EFFICIENCY/QUALITY. Chi-square = 97.90; p = .000]

This second model allows for a degree of correlation (in fact, an r-value of .74) between the two factors. Further, by including this extra parameter, we have in fact improved the fit, as measured by the GFI and the AGFI. The chi-square value is still too large (ie we still have p = .000) although it has certainly improved. On this basis, we would say that our second model is better.⁶

⁶ It is actually possible to statistically test which of several competing SEMs is the best, but we do not pursue this in this paper. The interested reader is invited to consult some of the references.

Next we look at the impact of Efficiency/Quality and Safety on Overall Employee Satisfaction. Using SEM we can make a simple addition to the model, as shown in Figure 14.
FIGURE 14

In other words, we have the same model as before, but now we are specifically providing for an underlying structural model which relates the latent constructs Safety and Efficiency/Quality to the latent construct Overall Satisfaction via a regression-type relationship. The latent variable Overall Satisfaction is measured by just one indicator variable, labelled as 'a1' in the diagram and represented in the original questionnaire by the question "How satisfied are you overall with working at your organisation?" The results are shown in Figure 15.

FIGURE 15 (Chi-square = 112.30; df = 31; p = .000; gfi = .971; agfi = .948)

We have a direct parallel to the results obtained from the EFA approach described earlier. But in contrast to the EFA approach, where Safety and Efficiency/Quality were seen to have almost equal impact on Overall Satisfaction, the SEM results show that Safety is of considerably lesser significance in its impact on Overall Satisfaction than the EFA approach would suggest. In addition, we have simultaneously provided an estimate of the correlation which undoubtedly exists between the two factors, to show the close relationship between them.

Structural Model - Stage II

In fact, SEM allows models to be developed which are of almost unbelievable complexity, and which allow the direct calibration of sophisticated interplays of latent and measured variables.

As an indication of what we mean, we have used the employee data to calibrate quite a complicated model, which takes into account the likelihood that several of the hypothesised underlying latent variables will in fact be correlated with one another.
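Returning briefly to the fit statistics reported with Figure 15, the following sketch shows how a chi-square value and its degrees of freedom translate into the p-value that is being interpreted. It uses only the Python standard library and the standard finite-series formulas for integer degrees of freedom. The function name chi2_sf and the 0.05 threshold are illustrative assumptions, not part of the original analysis.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term = 1.0
        total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: 2*Q(chi) + 2*phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1)),
    # where chi = sqrt(x), Q is the normal upper tail and phi the normal density.
    chi = math.sqrt(x)
    two_q = math.erfc(chi / math.sqrt(2))  # equals 2*Q(chi)
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term = chi  # r = 1 term
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return two_q + 2 * phi * total


def reject_good_fit(p_value: float, alpha: float = 0.05) -> bool:
    """The null hypothesis is that the model fits, so a small p-value counts against the model."""
    return p_value < alpha


if __name__ == "__main__":
    p = chi2_sf(112.30, 31)  # values reported with Figure 15
    print(f"p-value: {p:.3g}; reported to three decimals: {p:.3f}")
    print("Reject the hypothesis of good fit:", reject_good_fit(p))
```

The p-value here is far below .001, which is why it is reported as .000, and on the strict test the hypothesis of good fit is rejected even though the GFI and AGFI are close to unity.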
The results from one such model are shown in Figure 16, with the model shown in diagrammatic form [7] in Figure 17. Whilst this model is presented here primarily for illustrative purposes, for those who may be interested in some conclusions, we make a few comments.

[7] For clarity, the measured variables (over 60 of them!) have not been shown.

Firstly, whilst the GFI and AGFI indices are somewhat too low for comfort, the chi-square results in relation to the degrees of freedom are not too bad. On this basis, we would probably accept the model on a preliminary basis, with an intention of refining it further.

Secondly, the correlations shown in Figure 17 (indicated by the double-headed arrows) are almost all greater than zero and statistically significant.

Thirdly, the relativities between the standardised regression coefficients (indicated in Figure 17 by the single-headed arrows) are interesting - in terms of the impact on Overall Employee Satisfaction they are as shown in Figure 16 for the total sample, and also for just two of the 13 employee categories represented in the survey. It can be seen that:

• For the total sample, the commitment of the respondents and the career opportunities they feel are open to them play a substantially greater role in the overall satisfaction they have with their jobs than do any of the other constructs.

• This is, however, not at all the case for the retail sector respondents and health and community services sector respondents. For the retail respondents, lack of stress is the dominant influence, with an emphasis on efficiency and quality appearing to play a negative part (at this stage of the analysis it is unclear why this might be the case). For the health and community services respondents, in contrast, working conditions and team spirit are the dominant influences, with a focus on the external customer playing a negative part. This would not appear to be inconsistent with our perceptions of the nature of the jobs undertaken by these types of people although, again, it is stressed that these results should be seen as preliminary.

FIGURE 16

Standardised Regression Coefficients

Latent Construct | TOTAL (incl. 13 employment categories) | RETAIL (category 7) | HEALTH AND COMMUNITY SERVICES (category 11)
Commitment and Career Opportunities | .835 | .294 | .487
Working Conditions and Team Spirit | .403 | * | .960
Management and Supervision | [illegible] | * | *
Lack of Stress at Home and Work | .308 | [illegible] | .847
Equality of Treatment | .304 | [illegible] | [illegible]
Remuneration | [illegible] | .246 | .322
Safety | [illegible] | [illegible] | [illegible]
Positive Organisational Change | .158 | .251 | *
Internal Customer Focus | * | * | *
Communication | * | * | *
Efficiency/Quality | * | -.330 | [illegible]
Freedom to Perform | * | * | *
External Customer Focus | * | * | -.512
Feedback and Recognition | * | * | *
Chi-square; df | 9967.3; 1805 | 2857.4; 1805 | 4216.9; 1605
gfi | .629 | .417 | .520
agfi | .599 | .369 | [illegible]

Note: Non-significant coefficients are shown as *

FIGURE 17 (path diagram linking the latent constructs listed in Figure 16 to Overall Employee Satisfaction)

CONCLUSION

In this brief paper, we hope we have been able to demonstrate the following:

• SEM makes explicit the implicit assumptions which apply when we include batteries of attitudinal statements in questionnaires.
• SEM provides a statistically valid means of using the information we obtain through measurement to calibrate the relationships we hypothesise to exist between the underlying (latent) non-measurable variables.

• Whilst being a sophisticated theoretical tool, and certainly not easy to implement, SEM actually underlies much of what we do on a daily basis, as practising market researchers. That is, on the basis of things we can measure, we attempt to make predictions of things we cannot measure.

• SEM also allows us to statistically compare the models which underlie different groups in the population we are studying.

• SEM provides an opportunity to hypothesise models of market behaviour, and to test or confirm these models statistically. [Further, conclusions drawn from what are now fairly standard applications of techniques such as Exploratory Factor Analysis and regression (e.g. as used in many customer satisfaction approaches) may be unsustainable in terms of their statistical integrity.]

• Opportunities for use of the SEM approach are numerous, and include:
=> (as mentioned) customer satisfaction studies
=> product and service preferences and buying behaviour research
=> exploration of behavioural and attitudinal motivations
=> lifestyle studies
plus many others.

We would therefore urge the reader to consult some of the references [8] given, and develop an appreciation of a technique which we are convinced is deserving of, and which will receive, more widespread application in market research in the near future.

[8] Bollen (1989) is one of the seminal texts and, whilst technically complex, is well worth reading. Bentler and Wu (1995) and AMOS (1996) provide details of two of the more popular SEM software packages. The SEMNET FAQ (1997) is one of many web-based resources for SEM.

REFERENCES

AGB McNair (1996) "People ... An Organisation's Most Important Resource ... But Are They Empowered To Meet The Challenges Facing Australian Organisations?" Syndicated Research Report prepared by Sarah Wrigley.

AMOS (1996) AMOS Users' Guide Version 3.6, Smallwaters Corporation.

AQC (1996) Australian Quality Awards Assessment Criteria, Australian Quality Council.

Bentler, Peter M. (1995) EQS Structural Equations Program Manual, Encino, CA: Multivariate Software Inc.

Bentler, Peter M. and Wu, Eric J. C. (1995) EQS for Windows User's Guide, Encino, CA: Multivariate Software Inc.

Bollen, Kenneth A. (1989) Structural Equations with Latent Variables, John Wiley and Sons.

Bollen, Kenneth A. and Long, J. Scott (1993) Testing Structural Equation Models, Sage Publications.

Byrne, Barbara M. (1994) Structural Equation Modelling with EQS and EQS/Windows - Basic Concepts, Applications and Programming, SAGE Publications.

Johnson, Richard A. and Wichern, Dean W. (1992) Applied Multivariate Statistical Analysis, 3rd edn, pp. 340-347; 356-458, Prentice Hall.

LISREL® (1989) LISREL® 7 - A Guide to the Program and its Applications, 2nd edn, Jöreskog and Sörbom/SPSS Inc.

Long, J. Scott (1983) Covariance Structure Models - An Introduction to LISREL®, SAGE Publications.

Pedhazur, Elazar J. (1982) Multiple Regression in Behavioral Research - Explanation and Prediction, 2nd edn, pp. 575-681, Holt, Rinehart and Winston.

SEMNET FAQ (1997) http://www.gsu.edu/~mkteer/semfaq.html
