Non-metric Multidimensional Scaling (MDS)

Steven M. Holland

Department of Geology, University of Georgia, Athens, GA 30602-2501

May 2008

Introduction

Nonmetric multidimensional scaling (MDS, also NMDS and NMS) is an ordination technique that differs in several ways from nearly all other ordination methods. In most ordination methods, many axes are calculated, but only a few are viewed, owing to graphical limitations. In MDS, a small number of axes are explicitly chosen prior to the analysis and the data are fitted to those dimensions; there are no hidden axes of variation. Second, most other ordination methods are analytical and therefore result in a single unique solution to a set of data. In contrast, MDS is a numerical technique that iteratively seeks a solution and stops computation when an acceptable solution has been found, or it stops after some pre-specified number of attempts. As a result, an MDS ordination is not a unique solution and a subsequent MDS analysis on the same set of data and following the same methodology will likely result in a somewhat different ordination. Third, MDS is not an eigenvalue-eigenvector technique like principal components analysis or correspondence analysis that ordinates the data such that axis 1 explains the greatest amount of variance, axis 2 explains the next greatest amount of variance, and so on. As a result, an MDS ordination can be rotated, inverted, or centered to any desired configuration.

Unlike other ordination methods, MDS makes few assumptions about the nature of the data. For example, principal components analysis assumes linear relationships and reciprocal averaging assumes modal relationships. MDS makes neither of these assumptions, so is well suited for a wide variety of data. MDS also allows the use of any distance measure of the samples, unlike other methods which specify particular measures, such as covariance or correlation in PCA or the implied chi-squared measure in detrended correspondence analysis.

MDS does suffer from two principal drawbacks, although these are becoming less important as computational power increases. First, MDS is slow, particularly for large data sets, and the reasons for this will become apparent under Computation below. Second, because MDS is a numerical optimization technique, it can fail to find the true best solution because it can become stuck on local minima, solutions that are not the best solution but that are better than all nearby solutions. Increased computational speed now allows MDS ordinations even of large data sets and allows multiple ordinations to be run, such that the chance of being stuck on a local minimum is greatly decreased.

Computation

The method underlying MDS is straightforward in approach, but computationally demanding to execute. First, one starts with a matrix of data consisting of n rows of samples and p columns of variables, such as taxa for ecological data. From this, an n x n symmetrical matrix of all pairwise distances among samples is calculated with an appropriate distance measure, such as Euclidean distance, Manhattan distance (city block distance), or Bray distance. The MDS ordination will be performed on this distance matrix. Next, a desired number of m dimensions is chosen for the ordination. Note that an m-dimensional ordination is not equivalent to the first m dimensions of an (m+1)-dimensional ordination; the two ordinations would have to be run separately.

The MDS software begins by constructing an initial configuration of the samples in the m dimensions. This initial configuration could be based on another ordination or it could consist of an entirely random arrangement of the samples. The final ordination is partly dependent on this initial configuration, so a variety of approaches are used to avoid the problem of local minima. One approach is to perform several ordinations, each starting from a different random arrangement of points, and to select the ordination with the best fit. Another approach is to perform a different type of ordination, such as a principal components analysis or a higher-order MDS, and to use m axes from that ordination as the initial configuration. A third approach, useful for data thought to be geographically arrayed, is to use the geographic locations of samples as a starting configuration.

Distances among samples in this starting configuration are calculated, typically with a Euclidean metric. These distances are regressed against the original distance matrix and the predicted ordination distances for each pair of samples are calculated. A variety of regression methods can be used, including linear, polynomial, and non-parametric approaches, the last of which stipulates only that the regression consistently increases from left to right. In any case, the regression is fitted by least-squares. In a perfect ordination, all ordinated distances would fall exactly on the regression, that is, they would match the rank-order of distances in the original distance matrix. The goodness of fit of the regression is measured based on the sum of squared differences between ordination-based distances and the distances predicted by the regression. This goodness of fit is called stress and can be calculated in several ways, with one of the most common being Kruskal's Stress (formula 1)

$$\mathrm{Stress}(1) = \sqrt{\frac{\sum_{h,i} \left(d_{hi} - \hat{d}_{hi}\right)^2}{\sum_{h,i} d_{hi}^2}}$$

where $d_{hi}$ is the ordinated distance between samples h and i, and $\hat{d}_{hi}$ is the distance predicted from the regression.

This configuration is then improved by moving the positions of samples in ordination space by a small amount in the direction of steepest descent, the direction in which stress changes most rapidly. The ordination distance matrix is recalculated, the regression performed again, and stress recalculated. This entire procedure of nudging samples and recalculating stress is repeated until some small specified tolerance value is achieved or until the procedure converges by failing to achieve any lower values of stress, which indicates that a minimum (perhaps local) has been found.
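The stress calculation described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: kruskal_stress() is a hypothetical helper written for this example (it is not part of MASS or vegan), the regression is the non-parametric (monotone) kind fitted with isoreg() from the stats package, and the data are made-up random numbers.

```r
# kruskal_stress() is a hypothetical helper written for this sketch;
# it is not a function from MASS or vegan.
kruskal_stress <- function(orig_dist, config) {
  d_orig <- as.vector(orig_dist)      # original distance for each sample pair
  d_ord  <- as.vector(dist(config))   # Euclidean distance in ordination space
  ord    <- order(d_orig)             # rank the pairs by original distance
  # Monotone least-squares regression of ordination distance on the
  # rank order of original distance; $yf holds the fitted values (d-hat)
  d_hat  <- isoreg(d_ord[ord])$yf
  sqrt(sum((d_ord[ord] - d_hat)^2) / sum(d_ord^2))
}

set.seed(1)                           # arbitrary seed, for repeatability only
X <- matrix(rnorm(40), nrow = 20)     # 20 made-up samples, 2 variables
D <- dist(X)                          # original distance matrix

# A configuration that reproduces D exactly has essentially zero stress
stress_exact <- kruskal_stress(D, X)

# Entirely random starting configurations fit far worse; this is the
# stress that the iterative nudging of samples then tries to reduce
stress_random <- replicate(5, kruskal_stress(D, matrix(rnorm(40), nrow = 20)))
```

Comparing stress_exact with the values in stress_random shows why the choice of starting configuration matters: each random start begins the descent from a different, and differently poor, fit.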

Considerations

The ordination will be sensitive to the number of dimensions that is chosen, so this choice must be made with care. Choosing too few dimensions will force multiple axes of variation to be expressed on a single ordination dimension. Choosing too many dimensions is no better in that it can cause a single source of variation to be expressed on more than one dimension. One way to choose an appropriate number of dimensions is to perform ordinations of progressively higher numbers of dimensions. A scree diagram (stress versus number of dimensions) can then be plotted, on which one can identify the point beyond which additional dimensions do not substantially lower the stress value. A second criterion for the appropriate number of dimensions is the interpretability of the ordination, that is, whether the results make sense.

The stress value reflects how well the ordination summarizes the observed distances among the samples. Several "rules of thumb" for stress have been proposed, but have been criticized for being over-simplistic. Stress increases both with the number of samples and with the number of variables. For the same underlying data structure, a larger data set will necessarily result in a higher stress value, so use caution when comparing stress among data sets. Stress can also be highly influenced by one or a few poorly fit samples, so it is important to check the contributions to stress among samples in an ordination.

Although MDS seeks to preserve the distance relationships among the samples, it is still necessary to perform any transformations needed to obtain a meaningful ordination. For example, in ecological data, samples should be standardized by sample size to avoid ordinations that reflect primarily sample size, which is generally not of interest.

MDS in R

R has two main MDS functions available, isoMDS, which is part of the MASS library, and metaMDS, which is part of the vegan library. The metaMDS routine allows greater automation of the ordination process, so is usually the preferred method. The metaMDS function uses isoMDS in its calculations as well as several helper functions (more recent versions of vegan default to the package's own monoMDS engine instead). The metaMDS routine also has the useful default behavior of following the ordination with a rotation via principal components analysis such that MDS axis 1 reflects the principal source of variation, and so on, as is characteristic of eigenvalue methods. Other commercial packages of MDS may not perform such a rotation, so use caution in interpreting their results. The vegan package is designed for ecological data, so the metaMDS default settings are set with this in mind. For example, the distance metric defaults to Bray and common ecological data transformations are turned on by default. For non-ecological data, these settings may distort the ordination.

1) Download and install the vegan library, necessary for running the metaMDS() command, then load it.

> library(vegan)

2) Run an MDS on a data set, with columns of variables and rows of samples.

> mydata <- read.table("mydata.txt", header=TRUE, row.names=1, sep=",")
> mydata.mds <- metaMDS(mydata)
# the default MDS ordination

> mydata.mds.ALT <- metaMDS(mydata, distance="euclidean", k=3, trymax=50, autotransform=FALSE)
# Shows how an MDS could be performed on non-ecological data,
# where a euclidean distance metric would be appropriate. This
# MDS will be 3-dimensional (k=3), and will use 50 starts from
# random configurations to avoid local minima. The
# transformations appropriate for ecological data are also
# turned off, so one would need to make any necessary
# transformations prior to calling the metaMDS function.

3) View items in the list produced by metaMDS.

> names(mydata.mds)
# mydata.mds$points: sample scores
# mydata.mds$dims: number of MDS axes or dimensions
# mydata.mds$stress: stress value of final solution
# mydata.mds$data: what was ordinated, including any transformations
# mydata.mds$distance: distance metric used
# mydata.mds$converged: whether solution converged or not (T/F)
# mydata.mds$tries: number of random initial configurations tried
# mydata.mds$species: scores of variables (species / taxa in ecology)
# mydata.mds$call: restates how the function was called

4) View the results of the MDS, which will display several elements of the list detailed above, including how the function was called (call), the data set and any transformations used (data), the distance measure (distance), the number of dimensions (dims), the final stress value (stress), whether any convergent solution was achieved (converged), how many random initial configurations were tried (tries), plus whether scores have been scaled, centered, or rotated.

> mydata.mds

5) Extract sample and variable scores. The column numbers correspond to the MDS axes, so this will return as many columns as was specified with the k parameter in the call to metaMDS.

> variableScores <- mydata.mds$species
> sampleScores <- mydata.mds$points

6) Plot sample and variable scores in same space. Open black circles correspond to samples and red crosses indicate taxa.

> plot(mydata.mds)

[Figure: MDS plot of NMDS1 versus NMDS2, with open black circles for samples and red crosses for taxa]

7) MDS plots can be customized by selecting either sites or species. Also, labels may be displayed instead of symbols by specifying type="t".

> plot(mydata.mds, type="t", display=c("species"))
# text labels instead of symbols
# Crowding of text labels can be alleviated by plotting
# to a larger window

[Figure: MDS plot of NMDS1 versus NMDS2, with taxa shown as text labels]
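Before refining the plots further, the number of dimensions chosen in step 2 can be checked with the scree diagram described under Considerations. The sketch below is one possible way to build it, not the author's procedure: it uses isoMDS from the MASS library rather than metaMDS to keep the example short, and it substitutes R's built-in swiss data set (standardized with scale()) for mydata, which is purely an assumption for illustration. Note that isoMDS reports stress in percent.

```r
library(MASS)   # provides isoMDS()

# Stand-in data: the built-in swiss data set, standardized so that all
# variables contribute equally to the Euclidean distances (an assumption
# made for this example; substitute your own distance matrix)
d <- dist(scale(swiss))

# Fit ordinations of progressively higher dimension and record the
# final stress of each
dims   <- 1:5
stress <- sapply(dims, function(k) isoMDS(d, k = k, trace = FALSE)$stress)

# Scree diagram: stress versus number of dimensions
plot(dims, stress, type = "b",
     xlab = "Number of dimensions", ylab = "Stress (percent)")
```

The point on this plot beyond which stress stops dropping substantially suggests a reasonable value for k. The same loop should work with metaMDS by collecting the stress element of each result instead.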

8) MDS plots can be further customized. By specifying type="n", no sample scores or variable scores will be plotted. These can then be plotted with the points() and text() commands. For crowded plots, congestion of points and labels can be alleviated by plotting to a larger window and by using cex to reduce the size of symbols and text.

> plot(mydata.mds, type="n")
# plots axes, but no symbols or labels

> points(mydata.mds, display=c("sites"), choices=c(1,2), pch=3, col="red")
# plots points for all samples (specified by "sites") for MDS
# axes 1 & 2 (specified by choices). Note that symbols can be
# modified with typical adjustments, such as pch and col. See
# the select parameter on help page for metaMDS for how to use
# a vector to specify which points are plotted.

> text(mydata.mds, display=c("species"), choices=c(1,2), col="blue", cex=0.7)
# plots labels for all variable scores (specified by
# "species") for MDS axes 1 & 2. Typical plotting parameters
# can also be set, such as using cex to plot smaller labels.
# Note that there is also a select parameter for text, as
# there is for points, mentioned above.
# See labels parameter on help page for metaMDS for how to
# specify an alternative vector of labels, as opposed to the
# row or column names, which are used by default.

[Figure: MDS plot of NMDS1 versus NMDS2, with samples as red crosses and taxa as blue text labels]
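The same pattern of drawing empty axes and then adding symbols or labels also works on a plain matrix of scores, such as the sampleScores extracted in step 5. The following is a minimal sketch using base graphics only; it assumes an isoMDS fit on R's built-in swiss data set as a stand-in for mydata.mds, so the object names here are illustrative rather than taken from the steps above.

```r
library(MASS)   # provides isoMDS()

# Stand-in ordination: two-dimensional isoMDS of the built-in swiss data
d      <- dist(scale(swiss))
fit    <- isoMDS(d, k = 2, trace = FALSE)
scores <- fit$points                 # one row per sample, one column per axis

# Draw empty axes first, then add small text labels for each sample
plot(scores, type = "n", xlab = "NMDS1", ylab = "NMDS2")
text(scores, labels = attr(d, "Labels"), cex = 0.7, col = "blue")
```

Working from the score matrix directly gives full control over symbols, colors, and labels, at the cost of losing the conveniences (such as display and select) of the vegan plotting methods.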

9) For congested plots, this will display a plot with symbols for both samples and variables. You can then click on individual points you would like identified. Press "escape" when you are done selecting your points to see their identities.

> fig <- ordiplot(mydata.mds)
> identify(fig, "spec")

References

Legendre, P., and L. Legendre, 1998. Numerical Ecology. Elsevier: Amsterdam, 853 p.

McCune, B., and J.B. Grace, 2002. Analysis of Ecological Communities. MjM Software Design: Gleneden Beach, Oregon, 300 p.
