
Quick & Simple Introduction to Multidimensional Scaling

Professor Tony Coxon




Hon. Professorial Research Fellow, University of Edinburgh ( apm.coxon@ed.ac.uk )


See www.tonycoxon.com for information on me.
See www.newmdsx.com for an information resource on MDS and NewMDSX programs/doc.

See: The User's Guide to MDS and Key Texts in MDS (readings), Heinemann 1982. Available as pdf at 15 from newmdsx

What is Multidimensional Scaling?

A student's definition: "If you are interested in how certain objects relate to each other and if you would like to present these relationships in the form of a map then MDS is the technique you need" (Mr Gawels, KUB). A good start!




MDS is a family of models structured by D-T-M:

(DATA) the empirical information on inter-relationships between a set of objects/variables is given in a set of dis/similarity data
(TRANSFORMATION) which are then re-scaled (according to permissible transformations for the data / level of measurement), in terms of
(MODEL) the assumptions of the model chosen to represent the data

MDS Solution

to produce a SOLUTION, consisting of a CONFIGURATION, which is
i. a pattern of points representing the objects
ii. located in a space of a small number of dimensions (hence SSA, Smallest-Space Analysis)
iii. where the distances between the points represent the dis/similarities between the data-points
iv. as perfectly as possible (the imperfection/badness of fit is measured by Stress)

Low stress is desirable; no stress is perfection

Distances & Maps




Given a map, it's easy to calculate the (Euclidean) distances between the points:

$$ d_{jk} = \sqrt{\sum_{a} (x_{ja} - x_{ka})^2} $$
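As a small illustration of the "map to distances" direction, the sketch below computes the full Euclidean distance matrix for a made-up three-point map. The point labels and coordinates are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
import math

def distance_matrix(coords):
    """Return the matrix of Euclidean distances d[j][k] between all points."""
    return [[math.dist(p, q) for q in coords] for p in coords]

# Hypothetical map: three points in two dimensions.
points = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 0.0)}
D = distance_matrix(list(points.values()))

for label, row in zip(points, D):
    print(label, [round(d, 3) for d in row])
```

The matrix is symmetric with a zero diagonal, which is why only a lower or upper triangle is usually needed as input.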

MDS operates the other way round: given the distances [data], find the map [configuration] which generated them,

and MDS can do so
- when all but ordinal information has been jettisoned (fruit of the "non-metric revolution"),
- even when there are missing data, and
- in the presence of considerable noise/error (MDS is robust).

MDS thus provides at least an [exploratory], useful and easily-assimilable graphic visualization of a complex data set (Tukey: "A picture is worth a thousand words").
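To make the "distances to map" direction concrete, here is a minimal sketch using classical (Torgerson) metric scaling. This is a swapped-in technique, not the non-metric procedure described in these slides: it assumes the input is a complete, error-free matrix of Euclidean distances, and it finds the leading eigenvectors by simple power iteration so that no external library is needed. The function name, parameters and test points are illustrative assumptions.

```python
import math
import random

def classical_mds(dist, n_dims=2, iters=500, seed=0):
    """Classical (Torgerson) scaling: recover coordinates from Euclidean distances.

    Assumes `dist` is a full symmetric matrix of exact Euclidean distances.
    The result is determined only up to rotation, reflection and translation.
    """
    n = len(dist)
    sq = [[dist[j][k] ** 2 for k in range(n)] for j in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances to obtain scalar products.
    b = [[-0.5 * (sq[j][k] - row_mean[j] - row_mean[k] + grand_mean)
          for k in range(n)] for j in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * n_dims for _ in range(n)]
    for a in range(n_dims):
        # Power iteration for the current largest eigenvalue of b.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[j][k] * v[k] for k in range(n)) for j in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # nothing left to extract
            v = [x / norm for x in w]
        eigval = sum(v[j] * sum(b[j][k] * v[k] for k in range(n)) for j in range(n))
        if eigval > 1e-12:
            scale = math.sqrt(eigval)
            for j in range(n):
                coords[j][a] = scale * v[j]
            # Deflate: remove the extracted dimension from b.
            for j in range(n):
                for k in range(n):
                    b[j][k] -= eigval * v[j] * v[k]
    return coords

# Hypothetical "true" map: the corners of a 3 x 4 rectangle.
true_map = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
D = [[math.dist(p, q) for q in true_map] for p in true_map]
X = classical_mds(D)
```

The recovered points in `X` will generally be a rotated, reflected and shifted copy of `true_map`, but the distances between them reproduce `D`. Non-metric MDS replaces the exact distances with a monotone re-scaling of the data and minimises Stress iteratively instead.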




What is like MDS?


Related and Special-case Models:

Metric Scalar Products Models:
  *PRINCIPAL COMPONENTS ANALYSIS
  FACTOR ANALYSIS (+ communalities)

Metric and Non-Metric Ultrametric Distance, Discrete models:
  *Hierarchical Clustering
  *Partition Clustering (CONPAR)
  Additive Clustering (2 and 3-way)

Metric Chi-squared Distance Model for 2W2M and 3W data / Tables:
  *Simple (2W2M) and Multiple (3W) Correspondence Analysis

BECAUSE OF NON-METRIC (MONOTONE) REGRESSION, MDS ALSO OFFERS ORDINAL EQUIVALENTS OF:
  *ANOVA, other simple composition models
  *UNICON

(All models with asterisk * exist as programs within NewMDSX)

How does MDS differ from other Multivariate Methods?


Compared to other multivariate methods, MDS models usually:

- are distribution-free (though MLE models do exist: Ramsay's MULTISCALE),
- make conservative (non-metric) demands on the structure of the data,
- are relatively unaffected by non-systematic missing data,
- can be used with a very wide variety of types of data:
    - direct data (pair comparisons, ratings, rankings, triads, sortings)
    - derived data (profiles, co-occurrence matrices, textual data, aggregated data)
    - measures of association/correlation etc. derived from simpler data, and
    - tables of data,
- allow a range of transformations: monotonic (ordinal), linear/metric (interval), but also log-interval, power, "smoothness", even "maximum variance non-dimensional scaling" (Shepard).


How does MDS differ from other Multivariate Methods (2)?


Compared to other multivariate methods, MDS models also offer:

- a range of models: chiefly distance (Euclidean, but also City-block), factor/vector (scalar-products), simple composition (additive). Also there are hierarchies of models:
    - Similarity models: 2W1M METRIC - 3W2M INDSCAL - IDIOSCAL (honest!)
    - Preference models: Vector - distance - weighted distance - rotated, weighted (PREFMAP)
    - Procrustes rotation for putting configurations into maximum conformity, and then increasingly complex transformations: PINDIS
- solutions that are visually assimilable & readily interpretable
- structure that is not limited to dimensional information, but also covers other simple structures ("horseshoes", radex/circumplex, clusters, directions).
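The idea of Procrustes rotation mentioned above can be sketched for the simplest case. The example below is a deliberately restricted illustration, not the PINDIS procedure: it assumes two 2-dimensional configurations of the same points in the same order, and it fits only a translation and a rotation (no reflection, no rescaling). All names and the example configuration are hypothetical.

```python
import math

def centre(points):
    """Shift a 2-D configuration so that its centroid is at the origin."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    return [(p[0] - mx, p[1] - my) for p in points], (mx, my)

def procrustes_rotate_2d(source, target):
    """Rotate and translate `source` to fit `target` in the least-squares sense."""
    src, _ = centre(source)
    tgt, (tx, ty) = centre(target)
    # Closed-form optimal rotation angle for the two-dimensional case.
    num = sum(a[0] * b[1] - a[1] * b[0] for a, b in zip(src, tgt))
    den = sum(a[0] * b[0] + a[1] * b[1] for a, b in zip(src, tgt))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    fitted = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in src]
    return fitted, theta

# Hypothetical check: the target is the source turned by 30 degrees and shifted.
source = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
angle = math.pi / 6
target = [(math.cos(angle) * x - math.sin(angle) * y + 5.0,
           math.sin(angle) * x + math.cos(angle) * y - 2.0) for x, y in source]
fitted, theta = procrustes_rotate_2d(source, target)
```

Because MDS solutions are only defined up to such rigid motions, a fit of this kind is needed before two configurations can be compared point by point.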

Weaknesses in MDS
 


There ARE any??!

- relative ignorance of the sampling properties of stress
- prone-ness to local minima solutions (but less so, and interactive programs like PERMAP allow thousands of runs to check)
- a few forms of data/models are prone to degeneracies (especially MD Unfolding, but see new PREFSCAL in SPSS)
- difficulty in representing the asymmetry of causal models (though external analysis is very akin to dependent-independent modelling, and there are convergences with GLM in hybrid models such as CLASCAL: INDSCAL with parameterization of latent classes)

CHARACTERIZATION OF BASIC MDS & TERMINOLOGY


Structure of MDS specifiable in terms of D-T-M

DATA (specifies input data shape and content)

DATA MATRIX INPUT:
  WAY: dimensionality of array (2, 3, 4 ...)
  MODALITY: number of distinct sets (to be represented) (1, 2, 3 ...)
  NB: Modality <= Way

Common examples:
  2W1M  basic models (LTM, UTM, FSM)
  2W2M  rectangular, joint (conditional) mapping
  3W2M  (stack of 2W1M) Individual Differences Scaling

CHARACTERIZATION OF BASIC MDS (2)

TRANSFORMATION (form or type of rescaling performed on data)

- Non-Metric / Ordinal: H = M(d)
    Monotonic Increasing (sims) or Decreasing (dissims)
    - Order / inequality
        - Strong / Guttman: (j,k) > (l,m) -> d(j,k) > d(l,m)
        - Weak / Kruskal: (j,k) > (l,m) -> d(j,k) >= d(l,m)
    - Equality / ties
        - Primary: (j,k) = (l,m) -> d(j,k) = OR != d(l,m)
        - Secondary: (j,k) = (l,m) -> d(j,k) = d(l,m)

- Metric / Linear
    Linear: H = L(d); H = a + b(d)
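The weak (Kruskal) monotone transformation is usually obtained by monotone regression. The sketch below uses the pool-adjacent-violators approach: it assumes the distances have already been listed in the order of increasing dissimilarity in the data, and that the data contain no ties, so neither the primary nor the secondary approach to ties is implemented. The function name and the example values are illustrative.

```python
def monotone_regression(distances):
    """Least-squares non-decreasing fit to `distances` (pool adjacent violators).

    `distances` must be ordered by increasing data dissimilarity.
    Returns the fitted values, which satisfy weak monotonicity.
    """
    blocks = []  # each block holds [sum of values, number of values]
    for value in distances:
        blocks.append([value, 1])
        # Merge backwards while a block's mean exceeds the mean of the next one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Hypothetical distances, listed in the rank order of the data.
d = [1.0, 3.0, 2.0, 4.0, 3.5]
fitted = monotone_regression(d)
print(fitted)  # → [1.0, 2.5, 2.5, 3.75, 3.75]
```

Wherever the distances are out of order with the data, the offending values are replaced by their block mean, which is the smallest change that restores a non-decreasing sequence.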

CHARACTERIZATION OF BASIC MDS (3)




MODEL: Euclidean Distance

$$ d_{jk} = \sqrt{\sum_{a} (x_{ja} - x_{ka})^2} $$

where x(i,a) is the co-ordinate of point i on dimension a in the solution configuration X of low dimension.

The basic model is Euclidean distance, but other Minkowski metrics are available, including:

City Block Model
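A short sketch of the Minkowski family may help here. With exponent r the distance is the r-th root of the sum of absolute co-ordinate differences raised to the power r, so r = 2 gives the Euclidean model and r = 1 gives the city-block model (the sum of absolute differences). The function name and the parameter `r` are illustrative choices, not taken from any particular MDS program.

```python
def minkowski_distance(p, q, r=2.0):
    """Minkowski distance between points p and q with exponent r (r >= 1)."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)

p, q = (0.0, 0.0), (3.0, 4.0)
euclidean = minkowski_distance(p, q, r=2.0)   # straight-line distance
city_block = minkowski_distance(p, q, r=1.0)  # distance along the axes
print(euclidean, city_block)
```

For the same pair of points the city-block distance is never smaller than the Euclidean one, since it measures the route along the axes.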

(Badness of) FIT: Stress

DEFINITIONS OF STRESS

$$ \text{Raw Stress} = \sum_{j,k} (d_{jk} - d^{o}_{jk})^2 $$
(sum of squared residuals from monotone regression)

Normalising Factors:

$$ NF_1 = \sum_{j,k} d_{jk}^2 $$
(sum of squared distances)

$$ NF_2 = \sum_{j,k} (d_{jk} - \bar{d})^2 $$
(sum of squared deviations from mean distance)

STRESS - FORMULAE

$$ S_1 = \sqrt{\frac{\text{raw stress}}{NF_1}}, \qquad S_2 = \sqrt{\frac{\text{raw stress}}{NF_2}} $$
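The stress definitions can be checked on a few numbers. The sketch below assumes the usual square-root form of both coefficients (raw stress divided by the normalising factor, then square-rooted) and takes the fitted values from monotone regression as given. The function name and the example values are illustrative.

```python
import math

def stress(distances, fitted):
    """Return (S1, S2) for solution distances and their monotone-regression fit."""
    raw = sum((d - f) ** 2 for d, f in zip(distances, fitted))
    nf1 = sum(d * d for d in distances)                # sum of squared distances
    mean_d = sum(distances) / len(distances)
    nf2 = sum((d - mean_d) ** 2 for d in distances)    # squared deviations from mean
    return math.sqrt(raw / nf1), math.sqrt(raw / nf2)

# Hypothetical distances (in data rank order) and their fitted values.
d = [1.0, 3.0, 2.0, 4.0, 3.5]
d_fit = [1.0, 2.5, 2.5, 3.75, 3.75]
s1, s2 = stress(d, d_fit)
print(round(s1, 4), round(s2, 4))
```

Because NF2 is never larger than NF1, S2 is never smaller than S1 for the same solution, so values of the two coefficients should not be compared with each other.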

Types of Analysis
INTERNAL: If the analysis depends solely on the input data, it is termed "internal", but

EXTERNAL: If the analysis uses, in addition to the input data / solution, information relating to the same points (but from another source), it is termed "external".

