Performance Characteristics of VMFS and RDM: VMware ESX Server 3.5
Technology Overview
VMFS is a special high-performance file system offered by VMware to store ESX Server virtual machines. Available as part of ESX Server, it is a clustered file system that allows concurrent access by multiple hosts to files on a shared VMFS volume. VMFS offers high I/O capabilities for virtual machines. It is optimized for storing and accessing large files such as virtual disks and the memory images of suspended virtual machines.

RDM is a mapping file in a VMFS volume that acts as a proxy for a raw physical device. The RDM file contains metadata used to manage and redirect disk accesses to the physical device. This technique provides the advantages of direct access to the physical device in addition to some of the advantages of a virtual disk on VMFS storage. In brief, it offers VMFS manageability with the raw device access required by certain applications. You can configure RDM in two ways:

- Virtual compatibility mode, in which the RDM behaves much like a virtual disk file and supports features such as snapshots.
- Physical compatibility mode, in which the RDM passes most SCSI commands directly to the device, allowing lower-level control of the hardware.
Both VMFS and RDM provide such clustered file system features as file locking, permissions, persistent naming, and VMotion capabilities. VMFS is the preferred option for most enterprise applications, including databases, ERP, CRM, VMware Consolidated Backup, Web servers, and file servers. Although VMFS is recommended for most virtual disk storage, raw disk access is needed in a few cases. RDM is recommended for those cases. Some of the common uses of RDM are in cluster data and quorum disks for configurations using clustering between virtual machines or between physical and virtual machines, or for running SAN snapshot or other layered applications in a virtual machine.

For more information on VMFS and RDM, see the Server Configuration Guide mentioned in Resources on page 11.
Executive Summary
The main conclusions that can be drawn from the tests described in this study are:

- For random workloads, VMFS and RDM produce similar I/O performance.
- For sequential workloads with small I/O block sizes, RDM provides a small increase in throughput compared to VMFS; the performance gap decreases as the I/O block size increases.
- For all workloads, RDM has slightly better CPU cost.
Test Environment
The tests described in this study characterize the performance of VMFS and RDM in ESX Server 3.5. We ran the tests with a uniprocessor virtual machine using Windows Server 2003 Enterprise Edition with SP2 as the guest operating system. The virtual machine ran on an ESX Server system installed on a local SCSI disk. We attached two disks to the virtual machine: one virtual disk for the operating system and a separate test disk, which was the target for the I/O operations. We generated I/O load using Iometer, a very popular tool for evaluating I/O performance (see Resources on page 11 for a link to more information). See Configuration on page 10 for a detailed list of the hardware and software configuration we used for the tests.

For all workloads except 4K sequential read, we used the default cache page size (8K) in the storage. However, for 4K sequential read workloads, the default cache page setting resulted in lower performance for both VMFS and RDM. EMC recommends setting the cache page size in storage to the application block size (4K in this case).
Disk Layout
In this study, disk layout refers to the configuration, location, and type of disk used for tests. We used a single server attached to a Fibre Channel array through a host bus adapter for the tests.

We created a logical drive on a single physical disk. We installed ESX Server on this drive. On the same drive, we also created a VMFS partition and used it to store the virtual disk on which we installed the Windows Server 2003 guest operating system.

The test disk was located on a metaLUN on the CLARiiON CX3-40, which was created as follows: We created two RAID 0 groups on the CLARiiON CX3-40, one containing 15 Fibre Channel disks and the other containing 10 Fibre Channel disks. We configured a 10GB LUN on each RAID group. We then created a 20GB metaLUN using the two 10GB LUNs in a striped configuration (see Resources on page 11 for a link to the EMC Navisphere Manager Administrator's Guide for more information). We used a metaLUN for the test disk because it allowed us to use more than 16 spindles to stripe a LUN in a RAID 0 configuration. We used this test disk only for the I/O stress test. We created a virtual disk on the test disk and attached that virtual disk to the Windows virtual machine. To the guest operating system, the virtual disk appears as a physical drive.

Figure 1 shows the disk configuration used in our tests. In the VMFS tests, we implemented the virtual disk (seen as a physical disk by the guest operating system) as a .vmdk file stored on a VMFS partition created on the test disk. In the RDM tests, we created an RDM file on the VMFS volume (the volume that held the virtual machine configuration files and the virtual disk where we installed the guest operating system) and mapped the RDM file to the test disk. We configured the test virtual disk so it was connected to an LSI SCSI host bus adapter.

Figure 1. Disk layout for VMFS and RDM tests
[Figure 1 shows two ESX Server configurations. VMFS test: the virtual machine's operating system disk is a .vmdk file on a VMFS volume, and the test disk is a second .vmdk file on a VMFS volume created on the test LUN. RDM test: the operating system disk is a .vmdk file on the VMFS volume, and an RDM mapping file on that volume points to the raw device used as the test disk.]
Software Configuration
We configured the guest operating system to use the LSI Logic SCSI driver. On VMFS volumes, we created virtual disks with the thick option. This option provides the best-performing disk allocation scheme for a virtual disk. All the space allocated during disk creation is available for the guest operating system immediately after the creation. Any old data that might be present on the allocated space is not zeroed out during virtual machine write operations.

Unless stated otherwise, we left all ESX Server and guest operating system parameters at their default settings. In each test case, we zeroed the virtual disks before starting the experiment using the command-line program vmkfstools (with the -w option).

NOTE: ESX Server 3.5 offers four options for creating virtual disks: zeroedthick, eagerzeroedthick, thick, and thin. When a virtual disk is created using the VI Client, the zeroedthick option is used by default. Virtual disks with eagerzeroedthick, thick, or thin formats can be created only with vmkfstools, a command-line program. The zeroedthick and thin formats have characteristics similar to the thick format after the initial write operation to the disk. In our tests, we used the thick option to prevent warm-up anomalies. For details on the supported virtual disk formats, refer to Chapter 5 of the VMware Infrastructure 3 Server Configuration Guide. For details on using vmkfstools, see the appendixes of the VMware Infrastructure 3 Server Configuration Guide.
Test Cases
In this study, we characterize the performance of VMFS and RDM for a range of data transfer sizes across various access patterns. The data transfer sizes we selected were 4KB, 8KB, 16KB, 32KB, and 64KB. The access patterns we chose were random reads, random writes, sequential reads, sequential writes, or a mix of random reads and writes. The test cases are summarized in Table 1.

Table 1. Test cases
                 100% Read                   100% Write                  50% Read + 50% Write
100% Sequential  4KB, 8KB, 16KB, 32KB, 64KB  4KB, 8KB, 16KB, 32KB, 64KB  (not tested)
100% Random      4KB, 8KB, 16KB, 32KB, 64KB  4KB, 8KB, 16KB, 32KB, 64KB  4KB, 8KB, 16KB, 32KB, 64KB
Load Generation
We used the Iometer benchmarking tool, originally developed at Intel and widely used in I/O subsystem performance testing, to generate I/O load and measure the I/O performance. For a link to more information, see Resources on page 11. Iometer provides options to create and execute a well-designed set of I/O workloads. Because we designed our tests to characterize the relative performance of virtual disks on raw devices and VMFS, we used only basic load emulation features in the tests.

Iometer configuration options used as variables in the tests:
Performance Results
This section presents data and analysis of storage subsystem performance in a uniprocessor virtual machine.
Metrics
The metrics we used to compare the performance of VMFS and RDM are I/O rate (measured as number of I/O operations per second), throughput rate (measured as MB per second), and CPU cost (measured in terms of MHz per I/Ops).

In this study, we report the I/O rate and throughput rate as measured by Iometer. We use a cost metric measured in terms of MHz per I/Ops to compare the efficiencies of VMFS and RDM. This metric is defined as the CPU cost (in processor cycles) per unit I/O and is calculated as follows:
MHz per I/Ops = (Average CPU utilization x CPU rating in MHz x Number of cores) / Number of I/O operations per second
We collected I/O and CPU utilization statistics from Iometer and esxtop as follows:
- Iometer collected I/O operations per second and throughput in MBps.
- esxtop collected the average CPU utilization of physical CPUs.
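As a concrete illustration, the cost metric defined above can be computed as a one-line function. The sample figures below are hypothetical, not measurements from this study.

```python
def mhz_per_iops(avg_cpu_util, cpu_mhz, num_cores, iops):
    """CPU cost per I/O: cycles consumed (expressed in MHz) divided by the I/O rate.

    avg_cpu_util: average utilization of all physical CPUs (0.0-1.0), from esxtop
    cpu_mhz:      clock rating of one core in MHz
    num_cores:    total physical cores seen by ESX Server (four in this study's server)
    iops:         I/O operations per second, from Iometer
    """
    return (avg_cpu_util * cpu_mhz * num_cores) / iops

# Hypothetical sample: 25% utilization on four 3,000 MHz cores while sustaining 8,000 IOPS
print(round(mhz_per_iops(0.25, 3000.0, 4, 8000.0), 3))  # 0.375 MHz per I/Ops
```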
Performance
This section compares the performance characteristics of each type of disk access management. The metrics used are I/O rate, throughput, and CPU cost.
Random Workload
In our tests for random workloads, VMFS and RDM produced similar I/O performance, as is evident from Figure 2, Figure 3, and Figure 4.
Figure 2. Random mixed (50 percent read access/50 percent write access) I/O operations per second (higher is better)
[Figure 2 plots I/O operations per second against data size (4KB to 64KB) for VMFS, RDM (virtual), and RDM (physical).]
Sequential Workload
For 4K sequential read, we changed the cache page size on the CLARiiON CX3-40 to 4K. We used the default size of 8K for all other workloads.

In ESX Server 3.5, for sequential workloads, the performance of VMFS is very close to that of RDM for all I/O block sizes except 4K sequential read. Most applications with a sequential read I/O pattern use a block size greater than 4K. Both VMFS and RDM provide similar performance in those cases, as shown in Figure 5 and Figure 6. Both VMFS and RDM deliver very high throughput (in excess of 300 megabytes per second, depending on the I/O block size).

Figure 5. Sequential read I/O operations per second (higher is better)
[Figure 5 plots sequential read I/O operations per second against data size (4KB to 64KB). Figure 6, with the same axes, plots the corresponding sequential write results.]
Table 2 and Table 3 show the throughput rates in megabytes per second corresponding to the above I/O operations per second for VMFS and RDM. The throughput rates (I/O operations per second x data size) are consistent with the I/O rates shown above and display behavior similar to that explained for the I/O rates.

Table 2. Throughput rates for random workloads in megabytes per second
Data Size  Random Mix                   Random Read                  Random Write
(KB)       VMFS    RDM (V)  RDM (P)    VMFS    RDM (V)  RDM (P)    VMFS    RDM (V)  RDM (P)
4          30.96   30.98    31.15      23.41   23.27    23.27      46.38   46.1     45.87
8          58.8    58.68    58.68      44.67   44.43    44.35      90.24   89.22    88.74
16         105.22  106.03   105.09     80.14   80.72    80.58      161.2   158.25   159.42
32         173.72  176.1    176.69     135.91  138.53   138.11     241.4   241.12   242.05
64         256.14  262.92   263.98     215.1   215.49   214.59     309.6   311.13   310.15
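The relationship between the I/O rates and the throughput rates in Table 2 can be sketched as a short back-calculation (assuming Iometer's convention of 1 MB = 1024 KB; the IOPS figure below is an illustrative value, not a number quoted in this study):

```python
def throughput_mbps(iops, block_kb):
    """Throughput in MB/s implied by an I/O rate and a block size (1 MB = 1024 KB)."""
    return iops * block_kb / 1024.0

# About 7,926 I/Os per second at a 4 KB block size works out to roughly
# 30.96 MB/s, the order of magnitude seen in the random-mix column above.
print(round(throughput_mbps(7926, 4), 2))  # 30.96
```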
CPU Cost
CPU cost can be computed in terms of CPU cycles required per unit of I/O or unit of throughput (byte). We obtained a figure for CPU cycles used by the virtual machine for managing the workload, including the virtualization overhead, by multiplying the average CPU utilization of all the processors seen by ESX Server, the CPU rating in MHz, and the total number of cores in the system (four in our test server). In this study, we measured CPU cost as CPU cycles per unit of I/O operations per second.

Normalized CPU cost for various workloads is shown in Figures 7 through 11. We used the CPU cost for RDM (physical mapping) as the baseline for each workload and plotted the CPU costs of VMFS and RDM (virtual mapping) as a fraction of the baseline value. For random workloads, the CPU cost of VMFS is on average 5 percent more than the CPU cost of RDM. For sequential workloads, the CPU cost of VMFS is 8 percent more than the CPU cost of RDM.

As with any file system, VMFS maintains data structures that map file names to physical blocks on the disk. Each file I/O requires accessing the metadata to resolve file names to actual data blocks before reading data from or writing data to a file. The address resolution requires a few extra CPU cycles every time there is an I/O access. In addition, maintaining the metadata also requires additional CPU cycles. RDM does not require any underlying file system to manage its data. Data is accessed directly from the disk, without any file system overhead, resulting in lower CPU cycle consumption and better CPU cost for RDM access.
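The normalization used in Figures 7 through 11 can be sketched as follows. The MHz/IOps values here are hypothetical placeholders, not measurements from this study; only the baseline choice (RDM physical) follows the text above.

```python
def normalize_costs(costs, baseline_key="RDM (physical)"):
    """Express each disk option's CPU cost as a fraction of the baseline option's cost."""
    base = costs[baseline_key]
    return {name: cost / base for name, cost in costs.items()}

# Hypothetical MHz/IOps values for one workload and block size
sample = {"VMFS": 0.42, "RDM (virtual)": 0.41, "RDM (physical)": 0.40}
normalized = normalize_costs(sample)
# VMFS comes out at about 1.05, i.e., 5 percent more than the baseline
print({name: round(value, 3) for name, value in normalized.items()})
```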
[Figures 7 through 9 plot normalized CPU cost (MHz/IOps) against data size (4KB to 64KB) for the random workloads, with VMFS and RDM (virtual) shown as a fraction of the RDM (physical) baseline.]
Figure 10. Normalized CPU cost for sequential read (lower is better)
[Figure 10 plots normalized CPU cost (MHz/IOps) against data size (4KB to 64KB) for VMFS, RDM (virtual), and RDM (physical).]
Figure 11. Normalized CPU cost for sequential write (lower is better)
[Figure 11 plots normalized CPU cost (MHz/IOps) against data size (4KB to 64KB) for each disk option, normalized to the RDM (physical) baseline.]
Conclusion
VMware ESX Server offers two options for disk access management: VMFS and RDM. Both options provide clustered file system features such as user-friendly persistent names, distributed file locking, and file permissions. Both VMFS and RDM allow you to migrate a virtual machine using VMotion. This study compares the performance characteristics of both options and finds only minor differences in performance.

For random workloads, VMFS and RDM produce similar I/O throughput. For sequential workloads with small I/O block sizes, RDM provides a small increase in throughput compared to VMFS. However, the performance gap decreases as the I/O block size increases. For all workloads, RDM has slightly better CPU cost.

The test results described in this study show that VMFS and RDM provide similar I/O throughput for most of the workloads we tested. The small differences in I/O performance we observed were with the virtual machine running CPU saturated. The differences seen in these studies would therefore be minimized in real-life workloads because most applications do not usually drive virtual machines to their full capacity. Most enterprise applications can, therefore, use either VMFS or RDM for configuring virtual disks when run in a virtual machine.

However, there are a few cases that require the use of raw disks. Backup applications that use such inherent SAN features as snapshots, or clustering applications (for both data and quorum disks), require raw disks. RDM is recommended for these cases. We recommend use of RDM for these cases not for performance reasons but because these applications require lower-level disk control.
Configuration
This section describes the hardware and software configurations we used in the tests described in this study.
Server Hardware
Storage Hardware
Software
Virtualization software: ESX Server 3.5 (build 64607)
Storage Configuration
Iometer Configuration
Resources
Performance Characteristics of VMFS and RDM: VMware ESX Server 3.0.1
http://www.vmware.com/resources/techresources/1019

To obtain Iometer, go to
http://www.iometer.org/

For more information on how to gather I/O statistics using Iometer, see the Iometer user's guide at
http://www.iometer.org/doc/documents.html

To learn more about how to collect CPU statistics using esxtop, see "Using the esxtop Utility" in the VMware Infrastructure 3 Resource Management Guide at
http://www.vmware.com/pdf/vi3_35/esx_3/r35/vi3_35_25_resource_mgmt.pdf

For a detailed description of VMFS and RDM and how to configure them, see chapters 5 and 8 of the VMware Infrastructure 3 Server Configuration Guide at
http://www.vmware.com/pdf/vi3_35/esx_3/r35/vi3_35_25_3_server_config.pdf

EMC CLARiiON Best Practices for Fibre Channel Storage
http://powerlink.emc.com

EMC Navisphere Manager Administrator's Guide
http://powerlink.emc.com
Appendix: Effect of Cache Page Size on Sequential Read I/O Patterns with I/O Block Size Less than Cache Page Size
The default cache page setting on a CLARiiON CX3-40 is 8K. This setting affects sequential reads with a block size smaller than the cache page size (refer to EMC CLARiiON Best Practices for Fibre Channel Storage mentioned in Resources on page 11). Following the EMC recommendation, we changed the cache page setting to 4K, which was the same as the I/O block size generated by Iometer. This resulted in a 226 percent increase in the number of I/O operations per second, as shown in Figure 12.

Figure 12. Sequential read I/O operations per second for 4K sequential read with different cache page sizes (higher is better)
[Figure 12 plots I/O operations per second for the 4K sequential read workload with 4K and 8K cache page sizes.]
Figure 13 shows the normalized CPU costs per I/O operation for VMFS and RDM with 4K and 8K cache page sizes. We used the CPU cost of RDM (physical mapping) with a 4K cache page size as the baseline value and plotted the CPU costs of VMFS and RDM (both physical and virtual mapping) with both 4K and 8K cache page sizes as a fraction of the baseline value. The CPU cost improved by an average value of 70 percent with the 4K cache page size for both VMFS and RDM.

Figure 13. Normalized CPU cost for 4K sequential read with different cache page sizes (lower is better)
[Figure 13 plots normalized CPU cost (MHz/IOps) for VMFS and RDM with 4K and 8K cache page sizes.]
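Percentage changes like the 226 percent I/O improvement quoted above follow the usual relative-change formula. The IOPS pair below is purely illustrative, chosen only to show a 226 percent increase; it is not data from this study.

```python
def pct_increase(before, after):
    """Percentage increase from 'before' to 'after'."""
    return (after - before) / before * 100.0

# Hypothetical example: going from 12,000 to 39,120 I/Os per second
# after reducing the cache page size is a 226 percent increase.
print(round(pct_increase(12000, 39120), 1))  # 226.0
```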
VMware, Inc. 3401 Hillview Ave., Palo Alto, CA 94304 www.vmware.com Copyright 2008 VMware, Inc. All rights reserved. Protected by one or more of U.S. Patent Nos. 6,397,242, 6,496,847, 6,704,925, 6,711,672, 6,725,289, 6,735,601, 6,785,886, 6,789,156, 6,795,966, 6,880,022, 6,944,699, 6,961,806, 6,961,941, 7,069,413, 7,082,598, 7,089,377, 7,111,086, 7,111,145, 7,117,481, 7,149, 843, 7,155,558, 7,222,221, 7,260,815, 7,260,820, 7,269,683, 7,275,136, 7,277,998, 7,277,999, 7,278,030, 7,281,102, and 7,290,253; patents pending. VMware, the VMware boxes logo and design, Virtual SMP and VMotion are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation. Linux is a registered trademark of Linus Torvalds. All other marks and names mentioned herein may be trademarks of their respective companies. Revision 20080131 Item: PS-051-PRD-01-01