The important step of the modeling is that the joint density p(y, x) is assumed to take the following form, as a mixture model:

p(y, x) = w1 p1(y, x) + w2 p2(y, x) + … + wn pn(y, x)

where n is the number of clusters and {wj} are weights that sum to one. The functions pj(y, x) are joint
probability density functions that relate to each of the n clusters. These functions are modeled using a
decomposition into a conditional and a marginal density:

pj(y, x) = pj(y | x) pj(x)

where:
- pj(y | x) is a model for predicting y given x, given that the input-output pair is associated with cluster j on the basis of the value of x. In the simplest cases this might be a regression model.
- pj(x) is formally a density for values of x, given that the input-output pair is associated with cluster j. The relative sizes of these densities across the clusters determine whether a particular value of x is associated with any given cluster center. This density might be a Gaussian function centered at a parameter representing the cluster center.
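The decomposition above can be sketched numerically. The following is a minimal illustration, not a fitting procedure: all parameters (weights, cluster centers, local linear models) are hypothetical hand-picked values, with Gaussian cluster-weighting densities pj(x) and Gaussian linear-regression conditionals pj(y | x) as suggested in the text.

```python
import numpy as np

# Hypothetical parameters for a two-cluster CWM with scalar x and y.
# Each cluster j has: weight w[j], input center mu[j] with spread sigma_x[j],
# a local linear model y ~ a[j]*x + b[j], and output noise sigma_y[j].
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sigma_x = np.array([0.7, 0.7])
a = np.array([2.0, -1.0])
b = np.array([0.0, 3.0])
sigma_y = np.array([0.3, 0.3])

def gauss(z, mean, sd):
    """Univariate Gaussian density, vectorized over clusters."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def joint_density(y, x):
    """p(y, x) = sum_j w_j * p_j(y | x) * p_j(x)."""
    px = gauss(x, mu, sigma_x)           # p_j(x): cluster-weighting densities
    py_x = gauss(y, a * x + b, sigma_y)  # p_j(y | x): local regression models
    return np.sum(w * py_x * px)

def predict(x):
    """Predictive mean: local linear models blended by cluster responsibilities."""
    px = gauss(x, mu, sigma_x)
    resp = w * px / np.sum(w * px)       # posterior cluster probabilities given x
    return np.sum(resp * (a * x + b))

# Near mu[0] the first local model dominates the blend; near mu[1] the second does.
print(predict(-1.0), predict(1.0))
```

Because pj(x) is largest near its cluster center, the responsibilities smoothly switch the prediction between the local models as x moves across input space.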
As in regression analysis, it will be important to consider preliminary data transformations as part of the overall modeling strategy if the core components of the model are to be simple regression models for the cluster-wise conditional densities pj(y | x), and normal distributions for the cluster-weighting densities pj(x).
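One common preliminary transformation is standardization of the inputs, so that Gaussian cluster-weighting densities with O(1) centers and spreads are a reasonable default. A minimal sketch on synthetic, illustrative data:

```python
import numpy as np

# Synthetic, poorly scaled inputs (illustration only).
rng = np.random.default_rng(0)
x_raw = rng.normal(loc=500.0, scale=120.0, size=1000)

# Standardize: zero mean, unit variance.
x_mean, x_std = x_raw.mean(), x_raw.std()
x = (x_raw - x_mean) / x_std

# After the transform, cluster centers and spreads for the Gaussian p_j(x)
# live on a unit scale, which keeps their fitting numerically well-behaved.
print(round(x.mean(), 6), round(x.std(), 6))
```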
General versions
The basic CWM algorithm gives a single output cluster for each input cluster. However, CWM can be
extended to multiple clusters which are still associated with the same input cluster.[3] Each cluster in CWM
is localized to a Gaussian input region, and this contains its own trainable local model.[4] It is recognized as
a versatile inference algorithm which provides simplicity, generality, and flexibility; even when a
feedforward layered network might be preferred, it is sometimes used as a "second opinion" on the nature
of the training problem.[5]
CWM can be used to classify media in printer applications, using at least two parameters to generate an
output that has a joint dependency on the input parameters.[6]
References

1. Gershenfeld, N. (1997). "Nonlinear Inference and Cluster-Weighted Modeling". Annals of the New York Academy of Sciences. 808: 18–24. doi:10.1111/j.1749-6632.1997.tb51651.x.
2. Gershenfeld, N.; Schoner; Metois, E. (1999). "Cluster-weighted modelling for time-series analysis". Nature. 397 (6717): 329–332. doi:10.1038/16873.
3. Feldkamp, L. A.; Prokhorov, D. V.; Feldkamp, T. M. (2001). "Cluster-weighted modeling with multiclusters". International Joint Conference on Neural Networks. 3: 1710–1714. doi:10.1109/IJCNN.2001.938419.
4. Boyden, Edward S. "Tree-based Cluster Weighted Modeling: Towards A Massively Parallel Real-Time Digital Stradivarius" (PDF). Cambridge, MA: MIT Media Lab. http://edboyden.org/violin.pdf
5. Prokhorov, Danil V.; Feldkamp, Lee A.; Feldkamp, Timothy M. "A New Approach to Cluster-Weighted Modeling" (PDF). Dearborn, MI: Ford Research Laboratory. http://home.comcast.net/~dvp/cwm.pdf
6. Gao, Jun; Allen, Ross R. (2003-07-24). "Cluster-Weighted Modeling for Media Classification". Palo Alto, CA: World Intellectual Property Organization. Archived from the original (http://www.wipo.int/pctdb/en/wo.jsp?wo=2003059630) on 2012-12-12 (https://archive.today/20121212003528/http://www.wipo.int/pctdb/en/wo.jsp?wo=2003059630).