
Francis Ian C. Albaracin II
BSA_3

Finite-Mixture Structural Equation Models for Response-Based Segmentation and Unobserved Heterogeneity

Abstract
Two endemic problems face researchers in the social sciences (e.g., Marketing, Economics,
Psychology, and Finance): unobserved heterogeneity and measurement error in data. Structural
equation modeling is a powerful tool for dealing with these difficulties using a simultaneous
equation framework with unobserved constructs and manifest indicators which are error-
prone. When estimating structural equation models, however, researchers frequently treat the
data as if they were collected from a single population (Muthén [Muthén, Bengt O. 1989. Latent
variable modeling in heterogeneous populations. Psychometrika 54 557–585.]). This assumption
of homogeneity is often unrealistic. For example, in multidimensional expectancy value models,
consumers from different market segments can have different belief structures (Bagozzi
[Bagozzi, Richard P. 1982. A field investigation of causal relations among cognitions, affect,
intentions, and behavior. J. Marketing Res. 19 562–584.]). Research in satisfaction suggests that
consumer decision processes vary across segments (Day [Day, Ralph L. 1977. Extending the
concept of consumer satisfaction. W. D. Perreault, ed. Advances in Consumer Research, Vol. 4.
Association for Consumer Research, Atlanta, 149–154.]).

This paper shows that aggregate analysis which ignores heterogeneity in structural equation
models produces misleading results and that traditional fit statistics are not useful for detecting
unobserved heterogeneity in the data. Furthermore, sequential analyses that first form groups
using cluster analysis and then apply multigroup structural equation modeling are not
satisfactory.

We develop a general finite mixture structural equation model that simultaneously treats
heterogeneity and forms market segments in the context of a specified model structure where
all the observed variables are measured with error. The model is considerably more general
than cluster analysis, multigroup confirmatory factor analysis, and multigroup structural
equation modeling. In particular, the model subsumes several specialized models including
finite mixture simultaneous equation models, finite mixture confirmatory factor analysis, and
finite mixture second-order factor analysis.

The finite mixture structural equation model should be of interest to academics in a wide range
of disciplines (e.g., Consumer Behavior, Marketing, Economics, Finance, Psychology, and
Sociology) where unobserved heterogeneity and measurement error are problematic. In
addition, the model should be of interest to market researchers and product managers for two
reasons. First, the model allows the manager to perform response-based segmentation using a
consumer decision process model, while explicitly allowing for both measurement and
structural error. Second, the model allows managers to detect unobserved moderating factors
which account for heterogeneity. Once managers have identified the moderating factors, they
can link segment membership to observable individual-level characteristics (e.g., socioeconomic
and demographic variables) and improve marketing policy.

We applied the finite mixture structural equation model to a direct marketing study of
customer satisfaction and estimated a large model with 8 unobserved constructs and 23
manifest indicators. The results show that there are three consumer segments that vary
considerably in terms of the importance they attach to the various dimensions of satisfaction. In
contrast, aggregate analysis is misleading because it incorrectly suggests that except for price
all dimensions of satisfaction are significant for all consumers. Methodologically, the finite
mixture model is robust; that is, the parameter estimates are stable under double cross-
validation and the method can be used to test large models. Furthermore, the double cross-
validation results show that the finite mixture model is superior to sequential data analysis
strategies in terms of goodness-of-fit and interpretability.

We performed four simulation experiments to test the robustness of the algorithm using both
recursive and nonrecursive model specifications. Specifically, we examined the robustness of
different model selection criteria (e.g., CAIC, BIC, and GFI) in choosing the correct number of
clusters for exactly identified and overidentified models assuming that the distributional form is
correctly specified. We also examined the effect of distributional misspecification (i.e.,
departures from multivariate normality) on model performance. The results show that when
the data are heterogeneous, the standard goodness-of-fit statistics for the aggregate model are
not useful for detecting heterogeneity. Furthermore, parameter recovery is poor. For the finite
mixture model, however, the BIC and CAIC criteria perform well in detecting heterogeneity and
in identifying the true number of segments. In particular, parameter recovery for both the
measurement and structural models is highly satisfactory. The finite mixture method is robust
to distributional misspecification; in addition, the method significantly outperforms aggregate
and sequential data analysis methods when the form of heterogeneity is misspecified (i.e., the
true model has random coefficients).

Researchers and practitioners should only use the mixture methodology when substantive
theory supports the structural equation model, a priori segmentation is infeasible, and theory
suggests that the data are heterogeneous and belong to a finite number of unobserved groups.
We expect these conditions to hold in many social science applications and, in particular,
market segmentation studies.

Future research should focus on large-scale simulation studies to test the structural equation
mixture model using a wide range of models and statistical distributions. Theoretical research
should extend the model by allowing the mixing proportions to depend on prior information
and/or subject-specific variables. Finally, in order to provide a fuller treatment of
heterogeneity, we need to develop a general random coefficient structural equation model.
Such a model is presently unavailable in the statistical and psychometric literatures.

Critique:
Structural equation modeling is a useful technique for observing, testing, and evaluating causal
relationships among several varieties of factors. This particular article questions the true
usefulness of several testing methods used by market researchers and those involved in the
social sciences. Most such methods assume that the sample taken from a population is of similar
beliefs and characteristics, that it is an entirely homogeneous mixture of subjects, and they
ignore the heterogeneity and the range of error that can occur. The model proposed here takes
all of these into account when evaluating the data taken from experiments and analysis, even
from a limited sample, and generalizes to the entire population while allowing for various levels
of error.

Furthermore, it is because the SEM method takes into account the various levels of error and
other variables that the results shown in the article are genuine and not misleading. The feature
most emphasized in the article is the set of measurement equations within the model, which
assess the relationships between the latent variables and their indicators. Moreover, the model
allows the researcher to observe and test for unobserved heterogeneity, which is ignored in
other types of models and methods. Accounting for this in turn lets the researcher avoid
misleading results.


1. What is a Type 1 Error?
Type 1 error is a term statisticians use to describe a false positive—a test result that
incorrectly affirms a false statement about the nature of reality.
In A/B testing, type 1 errors occur when experimenters falsely conclude that any
variation of an A/B or multivariate test outperformed the other(s) due to something
more than random chance. Type 1 errors can hurt conversions when companies make
website changes based on incorrect information.
Type 1 errors can result from two sources: random chance and improper research
techniques.
Random chance: no random sample, whether it’s a pre-election poll or an A/B test, can
ever perfectly represent the population it intends to describe. Since researchers sample
a small portion of the total population, it’s possible that the results don’t accurately
predict or represent reality—that the conclusions are the product of random chance.
Statistical significance measures the odds that the results of an A/B test were produced
by random chance. For example, let’s say you’ve run an A/B test that shows Version B
outperforming Version A with a statistical significance of 95%. That means that, if there were no
real difference between the versions, there would be only a 5% chance of seeing results this
extreme by random chance. You can raise your level of
statistical significance by increasing the sample size, but this requires more traffic and
therefore takes more time. In the end, you have to strike a balance between your
desired level of accuracy and the resources you have available.
Improper research techniques: when running an A/B test, it’s important to gather
enough data to reach your desired level of statistical significance. Sloppy researchers
might start running a test and pull the plug when they feel there’s a ‘clear winner’—long
before they’ve gathered enough data to reach their desired level of statistical
significance. There’s really no excuse for a type 1 error like this.

2. What is a Type II Error?
A type II error is a statistical term used within the context of hypothesis testing that
describes the error that occurs when one fails to reject a null hypothesis that is actually
false. A type II error produces a false negative, also known as an error of omission. For
example, a test for a disease may report a negative result when the patient is infected.
This is a type II error because we accept the conclusion of the test as negative, even
though it is incorrect.
A type II error, also known as an error of the second kind or a beta error, confirms an
idea that should have been rejected, such as claiming that two observations are the
same despite their being different. A type II error fails to reject
the null hypothesis, even though the alternative hypothesis is the true state of nature.
In other words, a false finding is accepted as true.
A type II error is commonly caused by the statistical power of a test being too low. The
higher the statistical power, the greater the chance of avoiding a type II error. It's often
recommended that the statistical power should be set to at least 80% prior to
conducting any testing. A Type II error can occur if there is not enough power in
statistical tests, often resulting from sample sizes that are too small. Increasing the
sample size can help reduce the chances of committing a Type II error.
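The link between sample size and type II errors can also be shown by simulation. The hedged sketch below uses hypothetical numbers (not from the article): Version B genuinely outperforms Version A, yet with only a few hundred visitors per arm the test usually misses the lift, while a larger sample rarely does.

```python
import math
import random

random.seed(7)

def miss_rate(n_visitors, n_tests=600, p_a=0.10, p_b=0.13):
    """Fraction of simulated A/B tests that FAIL to detect a real lift
    in conversion rate from p_a to p_b: an estimate of the type II
    error rate (beta) for a two-proportion z-test at the 95% level."""
    misses = 0
    for _ in range(n_tests):
        conv_a = sum(random.random() < p_a for _ in range(n_visitors))
        conv_b = sum(random.random() < p_b for _ in range(n_visitors))
        rate_a, rate_b = conv_a / n_visitors, conv_b / n_visitors
        pooled = (conv_a + conv_b) / (2 * n_visitors)
        se = math.sqrt(2 * pooled * (1 - pooled) / n_visitors)
        if se == 0 or abs(rate_b - rate_a) / se <= 1.96:   # not significant
            misses += 1
    return misses / n_tests

small_sample_beta = miss_rate(200)    # underpowered: misses the lift most of the time
large_sample_beta = miss_rate(2500)   # better powered: rarely misses it
print(small_sample_beta, large_sample_beta)
```

The specific rates (10% vs. 13%) are assumptions chosen so the effect is real but modest; increasing the per-arm sample size is what drives the miss rate down.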

3. What is the Power of a Test? Examples.


In statistics, the power of a binary hypothesis test is the probability that the test
correctly rejects the null hypothesis when a specific alternative hypothesis is true. It
commonly represents the chances of a true positive detection conditional on the actual
existence of an effect to detect. Statistical power ranges from 0 to 1, and as the power
of a test increases, the probability of making a type II error by wrongly failing to reject
the null hypothesis decreases.
For a type II error probability of β, the corresponding statistical power is 1 − β. For
example, if experiment E has a statistical power of 0.7 and experiment F has a statistical
power of 0.95, then experiment E is more likely than experiment F to commit a type II
error, which reduces experiment E's sensitivity to detect real effects. Note, however,
that for a fixed sample size, lowering the type I error rate α tends to lower power as
well, so the two error types must be balanced against each other. Power can
equivalently be thought of as the probability of rejecting the null hypothesis when the
alternative hypothesis is true, that is, the ability of a test to detect a specific effect, if
that specific effect actually exists.
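The relationship power = 1 − β can be made concrete with a small calculation. The sketch below is an illustrative one-sample z-test (not taken from the article): it computes power analytically from an assumed effect size, standard deviation, and sample size, and shows that power rises as n grows.

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z_test_power(effect, sd, n, z_crit=1.96):
    """Power of a two-sided one-sample z-test of H0: mean = 0, when the
    true mean equals `effect`. Returns 1 - beta."""
    shift = effect / (sd / math.sqrt(n))          # noncentrality parameter
    # Reject when |Z| > z_crit; under the alternative, Z is shifted by `shift`.
    return (1.0 - normal_cdf(z_crit - shift)) + normal_cdf(-z_crit - shift)

power = z_test_power(effect=0.5, sd=1.0, n=32)   # roughly 0.81 for these inputs
beta = 1.0 - power                               # type II error probability
print(round(power, 3), round(beta, 3))
```

The effect size of 0.5 and sample size of 32 are hypothetical; doubling n to 64 pushes power well above 0.9, matching the recommendation above to plan for at least 80% power.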

Descriptive Statistics in Business Research


ABSTRACT

The analysis of data is among the most skilled tasks in the business research process, and it
requires the researcher's own judgment and skill. Different statistical techniques are available to
support the researcher's decisions. The choice of appropriate statistical techniques is
determined to a great extent by the research design, the hypotheses, and the kind of data
collected. These techniques are categorized into descriptive and inferential statistics. This paper
focuses only on descriptive statistics, which summarize a large mass of data into an
understandable and meaningful form.
INTRODUCTION
Statistics is concerned with the scientific method by which information is collected, organized,
analyzed and interpreted for the purposes of description and decision making. A wider and
more comprehensive definition was framed by the two well-known statisticians Croxton and
Cowden: "Statistics may be defined as the science of collection, presentation, analysis and
interpretation of numerical data." Statistics thus offers sound techniques for handling collected
data, analyzing them, and drawing valid inferences from them.
There are different statistical approaches available to a researcher. The choice of appropriate
statistical techniques is determined to a great extent by the research design, the hypotheses,
and the kind of data that will be collected. Once the data are collected, edited, classified and
tabulated, they are analyzed and interpreted with the help of various statistical tools based on
the nature of the investigation. Thus, the researcher is expected to have a basic knowledge of
statistics in order to carry out systematic analysis and to provide accurate and precise
interpretation of the data.
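As a minimal illustration of the paper's point about summarizing a mass of data, the sketch below uses hypothetical sales figures (invented for this example) and Python's standard library to compute the usual descriptive statistics.

```python
import statistics

# Hypothetical monthly sales figures (in thousands); invented for illustration
sales = [12, 15, 11, 18, 20, 14, 13, 17, 16, 19]

summary = {
    "mean": statistics.mean(sales),        # average value
    "median": statistics.median(sales),    # middle value of the sorted data
    "stdev": statistics.stdev(sales),      # sample standard deviation
    "minimum": min(sales),
    "maximum": max(sales),
    "range": max(sales) - min(sales),      # simplest measure of spread
}
print(summary)
```

Ten numbers are already hard to read at a glance; the six summary figures describe them in an immediately understandable form, which is exactly the compression the abstract describes.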

OBJECTIVE OF THE STUDY

To study the different types of descriptive statistics which are used for describing the data in
business research.

METHODOLOGY

The study was entirely based on secondary data. The required data for the present study were
collected from secondary sources such as books.

Critique:

Handling data and information is the hallmark of a good researcher, especially one specializing
in business research, since such researchers are expected to handle large amounts of data and
information. As such, in my opinion, it is an excellent idea to use descriptive statistics to
summarize those large amounts of data, compressing the raw information into a usable and
understandable form. This paper uses the statistical method appropriately and thus manages to
gather data and make something meaningful out of it.
