
VMware vSphere:

Optimize and Scale


Lecture Manual – Volume 2
ESXi 5.1 and vCenter Server 5.1

VMware® Education Services


VMware, Inc.
www.vmware.com/education
VMware vSphere:
Optimize and Scale
ESXi 5.1 and vCenter Server 5.1
Part Number EDU-EN-OS51-LECT2
Lecture Manual – Volume 2
Revision A

Copyright/Trademark
Copyright © 2012 VMware, Inc. All rights reserved. This manual and its accompanying
materials are protected by U.S. and international copyright and intellectual property laws.
VMware products are covered by one or more patents listed at http://www.vmware.com/go/
patents. VMware is a registered trademark or trademark of VMware, Inc. in the United States
and/or other jurisdictions. All other marks and names mentioned herein may be trademarks
of their respective companies.
The training material is provided “as is,” and all express or implied conditions,
representations, and warranties, including any implied warranty of merchantability, fitness for
a particular purpose or noninfringement, are disclaimed, even if VMware, Inc., has been
advised of the possibility of such claims. This training material is designed to support an
instructor-led training course and is intended to be used for reference purposes in
conjunction with the instructor-led training course. The training material is not a standalone
training tool. Use of the training material for self-study without class attendance is not
recommended.
These materials and the computer programs to which they relate are the property of, and
embody trade secrets and confidential information proprietary to, VMware, Inc., and may not
be reproduced, copied, disclosed, transferred, adapted or modified without the express
written approval of VMware, Inc.
Course development: Mike Sutton
Technical review: Johnathan Loux, Carla Gavalakis
Technical editing: Jeffrey Gardiner
Production and publishing: Ron Morton, Regina Aboud

TABLE OF CONTENTS
MODULE 7 Storage Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .351
You Are Here . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .352
Importance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .353
Module Lessons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .354
Lesson 1: Storage Virtualization Concepts . . . . . . . . . . . . . . . . . . . . . .355
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .356
Storage Performance Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .357
Storage Protocol Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .358
SAN Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .359
Storage Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .361
Performance Impact of Queuing on the Storage Array . . . . . . . . . . . . .362
LUN Queue Depth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .363
Network Storage: iSCSI and NFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . .364
What Affects VMFS Performance?. . . . . . . . . . . . . . . . . . . . . . . . . . . .365
SCSI Reservations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .367
VMFS Versus RDMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .368
Virtual Disk Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .369
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .370
Lesson 2: Monitoring Storage Activity . . . . . . . . . . . . . . . . . . . . . . . . .371
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .372
Applying Space Utilization Data to Manage Storage Resources . . . . .373
Disk Capacity Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .374
Monitoring Disk Throughput with vSphere Client . . . . . . . . . . . . . . . .375
Monitoring Disk Throughput with resxtop . . . . . . . . . . . . . . . . . . . . . .376
Disk Throughput Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .377
Monitoring Disk Latency with vSphere Client . . . . . . . . . . . . . . . . . . .379
Monitoring Disk Latency with resxtop . . . . . . . . . . . . . . . . . . . . . . . . .381
Monitoring Commands and Command Queuing . . . . . . . . . . . . . . . . .382
Disk Latency and Queuing Example . . . . . . . . . . . . . . . . . . . . . . . . . . .383
Monitoring Severely Overloaded Storage . . . . . . . . . . . . . . . . . . . . . . .384
Configuring Datastore Alarms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .385
Analyzing Datastore Alarms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .387
Lab 10 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .388
Lab 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .389
Lab 10 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .390
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .391
Lesson 3: Command-Line Storage Management . . . . . . . . . . . . . . . . .392
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .393
Managing Storage with vMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .394
Examining LUNs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .395
Managing Storage Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .396
Managing NAS Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .397
Managing iSCSI Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .398



Masking LUNs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .399
Managing PSA Plug-Ins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .400
Migrating Virtual Machine Files to a Different Datastore . . . . . . . . . .402
vmkfstools Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .403
vmkfstools Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .404
vmkfstools General Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .405
vmkfstools Command Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .406
vmkfstools File System Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .407
vmkfstools Virtual Disk Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .408
vmkfstools Virtual Disk Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . .410
vscsiStats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Why Use vscsiStats? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .412
Running vscsiStats. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .413
Lab 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .415
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .416
Lesson 4: Troubleshooting Storage Performance Problems . . . . . . . . .417
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .418
Review: Basic Troubleshooting Flow for ESXi Hosts . . . . . . . . . . . . .419
Overloaded Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .420
Causes of Overloaded Storage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .421
Slow Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .422
Factors Affecting Storage Response Time . . . . . . . . . . . . . . . . . . . . . .423
Random Increase in I/O Latency on Shared Storage. . . . . . . . . . . . . . .424
Example 1: Bad Disk Throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . . .426
Example 2: Virtual Machine Power On Is Slow . . . . . . . . . . . . . . . . . .427
Monitoring Disk Latency with the vSphere Client . . . . . . . . . . . . . . . .428
Monitoring Disk Latency With resxtop . . . . . . . . . . . . . . . . . . . . . . . . .429
Solving the Problem of Slow Virtual Machine Power On . . . . . . . . . .430
Example 3: Logging In to Virtual Machines Is Slow . . . . . . . . . . . . . .431
Monitoring Host CPU Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .432
Monitoring Host Disk Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .433
Monitoring Disk Throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .434
Solving the Problem of Slow Virtual Machine Login . . . . . . . . . . . . . .435
Resolving Storage Performance Problems . . . . . . . . . . . . . . . . . . . . . .436
Checking Storage Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .437
Reducing the Need for Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .438
Balancing the Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .439
Understanding Load Placed on Storage Devices. . . . . . . . . . . . . . . . . .440
Storage Performance Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . .441
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .442
Key Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .443



MODULE 8 CPU Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .445
You Are Here . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .446
Importance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .447
Module Lessons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .448
Lesson 1: CPU Virtualization Concepts . . . . . . . . . . . . . . . . . . . . . . . .449
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .450
CPU Scheduler Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .451
What Is a World? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .452
CPU Scheduler Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .453
CPU Scheduler Features: Support for SMP VMs . . . . . . . . . . . . . . . . .454
CPU Scheduler Feature: Relaxed Co-Scheduling . . . . . . . . . . . . . . . . .456
CPU Scheduler Feature: Processor Topology/Cache Aware . . . . . . . .457
CPU Scheduler Feature: NUMA-Aware . . . . . . . . . . . . . . . . . . . . . . . .458
Wide-VM NUMA Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .460
Performance Impact with Wide-VM NUMA Support . . . . . . . . . . . . .461
What Affects CPU Performance? . . . . . . . . . . . . . . . . . . . . . . . . . . . . .462
Warning Sign: Ready Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .463
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .464
Lesson 2: Monitoring CPU Activity . . . . . . . . . . . . . . . . . . . . . . . . . . .465
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .466
CPU Metrics to Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .467
Viewing CPU Metrics in the vSphere Client . . . . . . . . . . . . . . . . . . . .468
CPU Performance Analysis Using resxtop . . . . . . . . . . . . . . . . . . . . . .469
Using resxtop to View CPU Metrics per Virtual Machine . . . . . . . . . .471
Using resxtop to View Single CPU Statistics . . . . . . . . . . . . . . . . . . . .472
What Is Most Important to Monitor?. . . . . . . . . . . . . . . . . . . . . . . . . . .473
Lab 12 Introduction (1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .474
Lab 12 Introduction (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .475
Lab 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .476
Lab 12 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .477
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .478
Lesson 3: Troubleshooting CPU Performance Problems . . . . . . . . . . .479
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .480
Review: Basic Troubleshooting Flow for ESXi Hosts . . . . . . . . . . . . .481
Resource Pool CPU Saturation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .482
Host CPU Saturation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .483
Causes of Host CPU Saturation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .484
Resolving Host CPU Saturation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .485
Reducing the Number of Virtual Machines on the Host . . . . . . . . . . . .486
Increasing CPU Resources with DRS Clusters . . . . . . . . . . . . . . . . . . .487
Increasing Efficiency of a Virtual Machine's CPU Usage . . . . . . . . . .488
Controlling Resources Using Resource Settings . . . . . . . . . . . . . . . . . .490
When Ready Time Might Not Indicate a Problem . . . . . . . . . . . . . . . .491

Example: Spotting CPU Overcommitment . . . . . . . . . . . . . . . . . . . . . .492
Guest CPU Saturation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .493
Using One vCPU in SMP Virtual Machine . . . . . . . . . . . . . . . . . . . . . .494
Low Guest CPU Utilization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .496
CPU Performance Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . .497
Lab 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .498
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .499
Key Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .500

MODULE 9 Memory Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .501


You Are Here . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .502
Importance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .503
Module Lessons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .504
Lesson 1: Memory Virtualization Concepts . . . . . . . . . . . . . . . . . . . . .505
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .506
Virtual Memory Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .507
Application and Guest OS Memory Management . . . . . . . . . . . . . . . .508
Virtual Machine Memory Management . . . . . . . . . . . . . . . . . . . . . . . .509
Memory Reclamation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .510
Virtual Machine Memory-Reclamation Techniques . . . . . . . . . . . . . . . 511
Guest Operating System Memory Terminology . . . . . . . . . . . . . . . . . .512
Reclaiming Memory with Ballooning . . . . . . . . . . . . . . . . . . . . . . . . . .513
Memory Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .514
Host Cache. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .515
Reclaiming Memory with Host Swapping . . . . . . . . . . . . . . . . . . . . . .516
Why Does the Hypervisor Reclaim Memory? . . . . . . . . . . . . . . . . . . .517
When to Reclaim Host Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .518
Sliding Scale Mem.MinFreePct. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .520
Memory Reclamation Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .522
Memory Space Overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .524
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .525
Lesson 2: Monitoring Memory Activity . . . . . . . . . . . . . . . . . . . . . . . .526
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .527
Monitoring Virtual Machine Memory Usage . . . . . . . . . . . . . . . . . . . .528
Memory Usage Metrics Inside the Guest Operating System . . . . . . . .529
Consumed Host Memory and Active Guest Memory . . . . . . . . . . . . . .530
Monitoring Memory Usage Using resxtop/esxtop . . . . . . . . . . . . . . . .531
Monitoring Host Swapping in the vSphere Client . . . . . . . . . . . . . . . .533
Host Swapping Activity in resxtop/esxtop: Memory Screen . . . . . . . .534
Host Swapping Activity in resxtop/esxtop: CPU Screen . . . . . . . . . . .535
Host Ballooning Activity in the vSphere Client . . . . . . . . . . . . . . . . . .536
Host Ballooning Activity in resxtop . . . . . . . . . . . . . . . . . . . . . . . . . . .537
Lab 14 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .538



Lab 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .539
Lab 14 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .540
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .541
Lesson 3: Troubleshooting Memory Performance Problems . . . . . . . .542
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .543
Review: Basic Troubleshooting Flow for ESXi Hosts . . . . . . . . . . . . .544
Active Host-Level Swapping (1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .545
Active Host-Level Swapping (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .546
Causes of Active Host-Level Swapping . . . . . . . . . . . . . . . . . . . . . . . .547
Resolving Host-Level Swapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .548
Reducing Memory Overcommitment . . . . . . . . . . . . . . . . . . . . . . . . . .549
Enabling Balloon Driver in Virtual Machines. . . . . . . . . . . . . . . . . . . .550
Memory Hot Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .551
Memory Hot Add Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .552
Reducing a Virtual Machine's Memory Reservations . . . . . . . . . . . . . .553
Dedicating Memory to Critical Virtual Machines . . . . . . . . . . . . . . . . .554
Guest Operating System Paging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .555
Example: Ballooning Versus Swapping . . . . . . . . . . . . . . . . . . . . . . . .557
High Guest Memory Demand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .558
When Swapping Occurs Before Ballooning . . . . . . . . . . . . . . . . . . . . .559
Memory Performance Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . .560
Lab 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .561
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .562
Key Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .563

MODULE 10 Virtual Machine and Cluster Optimization . . . . . . . . . . . . . . . . . . . . . .565


You Are Here . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .566
Importance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .567
Module Lessons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .568
Lesson 1: Virtual Machine Optimization . . . . . . . . . . . . . . . . . . . . . . .569
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .570
Virtual Machine Performance Overview. . . . . . . . . . . . . . . . . . . . . . . .571
Selecting the Right Guest Operating System . . . . . . . . . . . . . . . . . . . .572
Timekeeping in the Guest Operating System . . . . . . . . . . . . . . . . . . . .573
VMware Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .575
Virtual Hardware Version 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .577
Virtual Hardware Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .578
CPU Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .579
Using vNUMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .580
Memory Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .582
Storage Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .584
Network Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .586
Virtual Machine Power-On Requirements . . . . . . . . . . . . . . . . . . . . . .587

Power-On Requirements: CPU and Memory Reservations . . . . . . . . .588
Power-On Requirements: Swap File Space . . . . . . . . . . . . . . . . . . . . . .589
Power-On Requirements: Virtual SCSI HBA Selection . . . . . . . . . . . .590
Virtual Machine Performance Best Practices . . . . . . . . . . . . . . . . . . . .591
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .592
Lesson 2: vSphere Cluster Optimization . . . . . . . . . . . . . . . . . . . . . . . .593
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .594
Guidelines for Resource Allocation Settings . . . . . . . . . . . . . . . . . . . .595
Resource Pool and vApp Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . .597
DRS Cluster: Configuration Guidelines . . . . . . . . . . . . . . . . . . . . . . . .598
DRS Cluster: vMotion Guidelines. . . . . . . . . . . . . . . . . . . . . . . . . . . . .600
DRS Cluster: Cluster Setting Guidelines . . . . . . . . . . . . . . . . . . . . . . .602
vSphere HA Cluster: Admission Control Guidelines . . . . . . . . . . . . . .603
Example of Calculating Slot Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . .605
Applying Slot Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .606
Reserving a Percentage of Cluster Resources . . . . . . . . . . . . . . . . . . . .607
Calculating Current Failover Capacity . . . . . . . . . . . . . . . . . . . . . . . . .608
Virtual Machine Unable to Power On . . . . . . . . . . . . . . . . . . . . . . . . . .609
Advanced Options to Control Slot Size. . . . . . . . . . . . . . . . . . . . . . . . . 611
Setting vSphere HA Advanced Parameters . . . . . . . . . . . . . . . . . . . . . .612
Fewer Available Slots Shown Than Expected . . . . . . . . . . . . . . . . . . .613
Lab 16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .614
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .615
Key Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .616

MODULE 11 Host and Management Scalability . . . . . . . . . . . . . . . . . . . . . . . . . . . . .617


You Are Here . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .618
Importance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .619
Module Lessons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .620
Lesson 1: vCenter Linked Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .621
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .622
vCenter Linked Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .623
vCenter Linked Mode Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . .624
Searching Across vCenter Server Instances . . . . . . . . . . . . . . . . . . . . .625
Basic Requirements for vCenter Linked Mode . . . . . . . . . . . . . . . . . . .627
Joining a Linked Mode Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .629
vCenter Service Monitoring: Linked Mode Groups . . . . . . . . . . . . . . .630
Resolving Role Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .632
Isolating a vCenter Server Instance . . . . . . . . . . . . . . . . . . . . . . . . . . . .633
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .634
Lesson 2: vSphere DPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .635
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .636
vSphere DPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .637



How Does vSphere DPM Work? . . . . . . . . . . . . . . . . . . . . . . . . . . . . .639
vSphere DPM Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .640
Power Management Comparison: vSphere DPM . . . . . . . . . . . . . . . . .642
Enabling vSphere DPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .643
vSphere DPM Host Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .645
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .646
Lesson 3: Host Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .647
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .648
Host Configuration Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .649
Host Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .650
Basic Workflow to Implement Host Profiles . . . . . . . . . . . . . . . . . . . .651
Monitoring for Compliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .652
Applying Host Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .654
Customizing Hosts with Answer Files . . . . . . . . . . . . . . . . . . . . . . . . .655
Standardization Across Multiple vCenter Server Instances . . . . . . . . .656
Lab 17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .657
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .658
Lesson 4: vSphere PowerCLI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .659
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .660
vSphere PowerCLI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .661
vSphere PowerCLI Cmdlets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .662
Windows PowerShell and vSphere PowerCLI . . . . . . . . . . . . . . . . . . .663
Advantages of Using vSphere PowerCLI . . . . . . . . . . . . . . . . . . . . . . .664
Common Tasks Performed with vSphere PowerCLI . . . . . . . . . . . . . .665
Other Tasks Performed with vSphere PowerCLI . . . . . . . . . . . . . . . . .666
vSphere PowerCLI Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .667
Returning Object Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .668
Connecting and Disconnecting to an ESXi Host . . . . . . . . . . . . . . . . . .669
Certificate Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .670
Types of vSphere PowerCLI Cmdlets . . . . . . . . . . . . . . . . . . . . . . . . . .671
Using Basic vSphere PowerCLI Cmdlets . . . . . . . . . . . . . . . . . . . . . . .672
vSphere PowerCLI Snap-Ins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .673
Lab 18 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .675
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .676
Lesson 5: Image Builder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .677
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .678
What Is an ESXi Image? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .679
VMware Infrastructure Bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .680
ESXi Image Deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .681
What Is Image Builder? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .682
Image Builder Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .683
Building an ESXi Image: Step 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .684
Building an ESXi Image: Step 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .685

Building an ESXi Image: Step 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .686
Using Image Builder to Build an Image: Step 4 . . . . . . . . . . . . . . . . . .687
Lab 19 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .688
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .689
Lesson 6: Auto Deploy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .690
Learner Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .691
What Is Auto Deploy? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .692
Where Are the Configuration and State Information Stored? . . . . . . . .693
Auto Deploy Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .694
Rules Engine Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .695
Software Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .696
PXE Boot Infrastructure Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .697
Initial Boot of an Autodeployed ESXi Host: Step 1 . . . . . . . . . . . . . . .698
Initial Boot of an Autodeployed ESXi Host: Step 2 . . . . . . . . . . . . . . .699
Initial Boot of an Autodeployed ESXi Host: Step 3 . . . . . . . . . . . . . . .700
Initial Boot of an Autodeployed ESXi Host: Step 4 . . . . . . . . . . . . . . .701
Initial Boot of an Autodeployed ESXi Host: Step 5 . . . . . . . . . . . . . . .702
Subsequent Boot of an Autodeployed ESXi Host: Step 1 . . . . . . . . . . .703
Subsequent Boot of an Autodeployed ESXi Host: Step 2 . . . . . . . . . . .704
Subsequent Boot of an Autodeployed ESXi Host: Step 3 . . . . . . . . . . .705
Subsequent Boot of an Autodeployed ESXi Host: Step 4 . . . . . . . . . . .706
Auto Deploy Stateless Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .707
Stateless Caching Host Profile Configuration . . . . . . . . . . . . . . . . . . . .708
Auto Deploy Stateless Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .709
Auto Deploy Stateful Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .710
Stateful Installation Host Profile Configuration . . . . . . . . . . . . . . . . . . 711
Auto Deploy Stateful Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .712
Managing the Auto Deploy Environment . . . . . . . . . . . . . . . . . . . . . . .713
Using Auto Deploy with Update Manager to Upgrade Hosts . . . . . . . .714
Lab 20 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .715
Review of Learner Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .716
Key Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .717



MODULE 7

Storage Optimization
Slide 7-1

Storage Optimization
Module 7



You Are Here
Slide 7-2

(Course map, with Storage Optimization highlighted: Course Introduction, VMware Management
Resources, Performance in a Virtualized Environment, Network Scalability, Network Optimization,
Storage Scalability, Storage Optimization, CPU Optimization, Memory Performance, VM and
Cluster Optimization, Host and Management Scalability)



Importance

Slide 7-3
Storage can limit the performance of enterprise workloads. You
should know how to monitor a host’s storage throughput and
troubleshoot problems that result in overloaded storage and slow
storage performance.



Module Lessons
Slide 7-4

Lesson 1: Storage Virtualization Concepts


Lesson 2: Monitoring Storage Activity
Lesson 3: Command-Line Storage Management
Lesson 4: Troubleshooting Storage Performance Problems



Lesson 1: Storage Virtualization Concepts

Slide 7-5
Lesson 1:
Storage Virtualization Concepts



Learner Objectives
Slide 7-6

After this lesson, you should be able to do the following:


ƒ Describe factors that affect storage performance.



Storage Performance Overview

Slide 7-7
What affects storage performance?
ƒ Storage protocols:
• Fibre Channel, Fibre Channel over Ethernet, hardware iSCSI, software iSCSI, NFS
ƒ Proper storage configuration
ƒ Load balancing
ƒ Queuing and LUN queue depth
ƒ VMware vSphere® VMFS (VMFS) configuration:
• Choosing between VMFS and RDMs
• SCSI reservations
ƒ Virtual disk types

VMware vSphere® ESXi™ enables multiple hosts to share the same physical storage reliably
through its optimized storage stack and VMware vSphere® VMFS. Centralized storage of virtual
machines can be accomplished by using VMFS as well as NFS. Centralized storage enables
virtualization capabilities such as VMware vSphere® vMotion®, VMware vSphere® Distributed
Resource Scheduler™ (DRS), and VMware vSphere® High Availability (vSphere HA). To gain the
greatest advantage from shared storage, you must understand the storage performance limits of a
given physical environment to ensure that you do not overcommit resources.
Several factors have an effect on storage performance:
• Storage protocols
• Proper configuration of your storage devices
• Load balancing across available storage
• Storage queues
• Proper use and configuration of your VMFS volumes
Each of these factors is discussed in this lesson.



Storage Protocol Performance
Slide 7-8

VMware vSphere® ESXi™ supports Fibre Channel, hardware iSCSI, software iSCSI, and NFS.
All storage protocols are capable of delivering high throughput performance.
ƒ When CPU is not a bottleneck, software iSCSI and NFS can be part of a high-performance
solution.
Hardware performance features:
ƒ 16Gb Fibre Channel
ƒ Software and hardware iSCSI and NFS support for jumbo frames:
• Using Gigabit, 10Gb Ethernet NICs, or 10Gb iSCSI hardware initiators.

For Fibre Channel and hardware iSCSI, a major part of the protocol processing is off-loaded to the
HBA. Consequently, the cost of each I/O is very low.
For software iSCSI and NFS, host CPUs are used for protocol processing, which increases cost.
Furthermore, the cost of NFS and software iSCSI is higher with larger block sizes, such as 64KB.
This cost is due to the additional CPU cycles needed for each block for tasks like checksumming
and blocking. Software iSCSI and NFS are more efficient at smaller block sizes, and both are capable of
delivering high throughput performance when CPU resource is not a bottleneck.
ESXi hosts provide support for high-performance hardware features. For greater throughput, ESXi
hosts support 16 Gb Fibre Channel adapters and 10Gb Ethernet adapters for hardware iSCSI and
NFS storage. ESXi hosts also support the use of jumbo frames for software iSCSI and NFS. This
support is available on both Gigabit and 10Gb Ethernet NICs.
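As a hedged illustration only (the vSwitch and VMkernel interface names are placeholders, and the
options should be verified against your ESXi 5.1 build), jumbo frames can be enabled from the
command line on a standard switch and on the storage VMkernel interface:

# set an MTU of 9000 on the standard switch that carries iSCSI or NFS traffic
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
# set the same MTU on the VMkernel interface used for storage
esxcli network ip interface set --interface-name=vmk1 --mtu=9000

For jumbo frames to help, the physical switch ports and the storage array ports must also be
configured for the larger MTU end to end.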



SAN Configuration

Slide 7-9
Proper SAN configuration can help to eliminate performance issues.
Each LUN should have the right RAID level and storage
characteristics for the applications in virtual machines that will use it.
Use the right path selection policy:
ƒ Most Recently Used (MRU)
ƒ Fixed (Fixed)
ƒ Round Robin (RR)

Storage performance is a vast topic that depends on workload, hardware, vendor, RAID level, cache
size, stripe size, and so on. Consult the appropriate VMware® documentation as well as the storage
vendor documentation on how to configure your storage devices appropriately.
Because each application running in your VMware vSphere® environment has different
requirements, you can achieve high throughput and minimal latency by choosing the appropriate
RAID level for applications running in the virtual machines.
By default, active-passive storage arrays use the Most Recently Used path policy. To avoid LUN
thrashing, do not use the Fixed path policy for active-passive storage arrays.
By default, active-active storage arrays use the Fixed path policy. When using this policy, you can
maximize the use of your bandwidth to the storage array by designating preferred paths to each
LUN through different storage controllers.



Round Robin uses automatic path selection, rotating through all available paths and enabling the
distribution of load across the configured paths.
• For Active/Passive storage arrays, only the paths to the active controller will be used in the
Round Robin policy.
• For Active/Active storage arrays, all paths will be used in the Round Robin policy.
For detailed information on SAN configuration, see vSphere Storage Guide at
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.
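As a hedged example of working with path selection policies from the command line (the device
identifier below is a placeholder for one of your LUNs):

# list devices claimed by the Native Multipathing Plug-in and show their current path selection policy
esxcli storage nmp device list
# change the path selection policy for a single device to Round Robin
esxcli storage nmp device set --device naa.60060160a1b11e00c2ed1e6ede9ae011 --psp VMW_PSP_RR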



Storage Queues

Slide 7-10
Queuing at the host:
ƒ Device queue controls the number of active commands on a LUN at any time.
• Depth of queue is 32 (default).
ƒ VMkernel queue is an overflow queue for the device driver queue.
Queuing at the storage array:
ƒ Queuing occurs when the number of active commands to a LUN is too high for the storage array
to handle.
Latency increases with excessive queuing at host or storage array.

There are several storage queues that you should be aware of: device driver queue, kernel queue,
and storage array queue.
The device driver queue is used for low-level interaction with the storage device. This queue
controls how many active commands can be on a LUN at the same time. This number is effectively
the concurrency of the storage stack. Set the device queue to 1 and each storage command becomes
sequential: each command must complete before the next starts.
The kernel queue can be thought of as an overflow queue for the device driver queues. A kernel
queue includes features that optimize storage. These features include multipathing for failover and
load balancing, prioritization of storage activities based on virtual machine and cluster shares, and
optimizations to improve efficiency for long sequential operations.
SCSI device drivers have a configurable parameter called the LUN queue depth that determines how
many commands can be active at one time to a given LUN. The default value is 32. If the total
number of outstanding commands from all virtual machines exceeds the LUN queue depth, the
excess commands are queued in the ESXi kernel, which increases latency.
In addition to queuing at the ESXi host, command queuing can also occur at the storage array.
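A minimal sketch of how to observe these queues with resxtop (the host name is a placeholder, and
the column names are those used on the vSphere 5.x disk screens; verify them in your build):

# connect to a host and open the interactive disk views
resxtop --server esxi01
# press u for the device view, then watch these columns:
#   DQLEN    - configured LUN (device) queue depth
#   ACTV     - commands currently active on the device
#   QUED     - commands waiting in the VMkernel queue
#   KAVG/cmd - average time a command spends queued in the VMkernel
#   DAVG/cmd - average time a command spends at the device and storage array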



Performance Impact of Queuing on the Storage Array
Slide 7-11

Sequential workloads generate random access at the storage array.

(Chart: aggregate throughput in MBps versus number of hosts (1, 2, 4, 8, 16, 32, 64) for sequential
read, sequential write, random read, and random write workloads)

To understand the effect of queuing on the storage array, consider a case in which the virtual
machines on an ESXi host can generate a constant number of SCSI commands equal to the LUN
queue depth. If multiple ESXi hosts share the same LUN, SCSI commands to that LUN from all
hosts are processed by the storage processor on the storage array. Strictly speaking, an array storage
processor running in target mode might not have a per-LUN queue depth, so it might issue the
commands directly to the disks. But if the number of active commands to the shared LUN is too
high, multiple commands begin queuing up at the disks, resulting in high latencies.
VMware tested the effects of running an I/O intensive workload on 64 ESXi hosts sharing a VMFS
volume on a single LUN. As shown in the chart above, except for sequential reads, there is no drop
in aggregate throughput as the number of hosts increases. The reason sequential read throughput
drops is that the sequential streams coming in from the different ESXi hosts are intermixed at the
storage array, thus losing their sequential nature. Writes generally show better performance than
reads because they are absorbed by the write cache and flushed to disks in the background.
For more details on this test, see “Scalable Storage Performance” at http://www.vmware.com/
resources/techresources/1059.



LUN Queue Depth

Slide 7-12
LUN queue depth determines how many commands to a given LUN can be active at one time.
Set LUN queue depth size properly to decrease disk latency:
ƒ Depth of queue is 32 (default).
ƒ Maximum recommended queue depth is 64.
Set Disk.SchedNumReqOutstanding to the same value as the queue depth.
(Callout: Set LUN queue depth to its maximum: 64.)

SCSI device drivers have a configurable parameter called the LUN queue depth that determines how
many commands can be active at one time to a given LUN. QLogic Fibre Channel HBAs support up
to 255 outstanding commands per LUN, and Emulex HBAs support up to 128. However, the default
value for both drivers is set to 32. If an ESXi host generates more commands to a LUN than the
LUN queue depth, the excess commands are queued in the VMkernel, which increases latency.
When virtual machines share a LUN, the total number of outstanding commands permitted from all
virtual machines to that LUN is governed by the Disk.SchedNumReqOutstanding configuration
parameter, which can be set in VMware® vCenter Server™. If the total number of outstanding
commands from all virtual machines exceeds this parameter, the excess commands are queued in the
VMkernel.
To reduce latency, ensure that the sum of active commands from all virtual machines does not
consistently exceed the LUN queue depth. Either increase the queue depth (the maximum
recommended queue depth is 64) or move the virtual disks of some virtual machines to a different
VMFS volume.
For details on how to increase the queue depth of your storage adapter, see vSphere Storage Guide at
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.
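A hedged sketch of both changes from the command line (the module name and parameter shown
are for a QLogic Fibre Channel HBA and differ for other drivers; the value 64 is only an example):

# raise the LUN queue depth for a QLogic HBA (takes effect after a host reboot)
esxcli system module parameters set -m qla2xxx -p ql2xmaxqdepth=64
# set Disk.SchedNumReqOutstanding to the same value as the LUN queue depth
esxcli system settings advanced set --option /Disk/SchedNumReqOutstanding --int-value 64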



Network Storage: iSCSI and NFS
Slide 7-13

Avoid oversubscribing your links.


ƒ Using VLANs does not solve the problem of oversubscription.
Isolate iSCSI traffic and NFS traffic.
Applications that write a lot of data to storage should not share
Ethernet links to a storage device.
For software iSCSI and NFS, protocol processing uses CPU
resources on the host.

Applications or systems that write large amounts of data to storage, such as data acquisition or
transaction logging systems, should not share Ethernet links to a storage device. These types of
applications perform best with multiple connections to storage devices.
For iSCSI and NFS, make sure that your network topology does not contain Ethernet bottlenecks.
Bottlenecks where multiple links are routed through fewer links can result in oversubscription and
dropped network packets. Any time a number of links transmitting near capacity are switched to a
smaller number of links, such oversubscription becomes possible. Recovering from dropped
network packets results in large performance degradation.
Using VLANs or VPNs does not provide a suitable solution to the problem of link oversubscription
in shared configurations. However, creating separate VLANs for NFS and iSCSI is beneficial. This
separation minimizes network interference from other packet sources.
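For example, assuming a standard switch port group named IPStorage and VLAN ID 100 (both
placeholders), the storage port group can be tagged with its own VLAN:

# keep iSCSI/NFS traffic on a dedicated VLAN to minimize interference from other packet sources
esxcli network vswitch standard portgroup set --portgroup-name=IPStorage --vlan-id=100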
Finally, with software-initiated iSCSI and NFS, the network protocol processing takes place on the
host system and thus can require more CPU resources than other storage options.



What Affects VMFS Performance?

Slide 7-14
VMFS partition alignment:
ƒ The VMware vSphere® Client™ properly aligns a VMFS partition along
the 1MB boundary.
ƒ Performance improvement is dependent on workloads and array types.
Spanning VMFS volumes:
ƒ This feature is effective for increasing VMFS size dynamically.
ƒ Predicting performance is not straightforward.

VMFS Partition Alignment


The alignment of your VMFS partitions can affect performance. Like other disk-based file systems,
VMFS suffers a penalty when the partition is unaligned. Using the VMware vSphere® Client™ to
create VMFS datastores avoids this problem because it automatically aligns the datastores along the
1MB boundary.
To manually align your VMFS partitions, check your storage vendor’s recommendations for the
partition starting block. If your storage vendor makes no specific recommendation, use a starting
block that is a multiple of 8KB.
Before performing an alignment, carefully evaluate the performance effect of the unaligned VMFS
partition on your particular workload. The degree of improvement from alignment is highly
dependent on workloads and array types. You might want to refer to the alignment recommendations
from your array vendor for further information.

NOTE
If a VMFS3 partition was created using an earlier version of ESXi that aligned along the 64KB
boundary, and that file system is then upgraded to VMFS5, it will retain its 64KB alignment. 1MB
alignment can be obtained by deleting the partition and recreating it using the vSphere Client and an
ESXi host.
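As a hedged way to check the alignment of an existing VMFS partition directly on the host (the
device identifier is a placeholder), dump the partition table and look at the starting sector:

# show the partition table of the LUN that backs the datastore
partedUtil getptbl /vmfs/devices/disks/naa.60060160a1b11e00c2ed1e6ede9ae011
# a VMFS partition that starts at sector 2048 on a 512-byte-sector LUN is aligned on the 1MB
# boundary (2048 sectors x 512 bytes = 1MB); a start sector of 128 indicates 64KB alignment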



Spanned VMFS Volumes
A spanned VMFS volume includes multiple extents (part or an entire LUN). Spanning is a good
feature to use if you need to add more storage to a VMFS volume while it is in use. Predicting
performance with spanned volumes is not straightforward, because the user does not have control
over how the data from various virtual machines is laid out on the different LUNs that form the
spanned VMFS volume.
For example, consider a spanned VMFS volume with two 100GB LUNs. Two virtual machines are
on this spanned VMFS, and the sum total of their sizes is 150GB. The user cannot determine the
contents of each LUN in the spanned volume directly. Hence, determining the performance
properties of this configuration is not straightforward.
Mixing storage devices of different performance characteristics on the same spanned volume could
cause an imbalance in virtual machine performance. This imbalance might occur if a virtual
machine’s blocks were allocated across device boundaries, and each device might have a different
queue depth.



SCSI Reservations

Slide 7-15
A SCSI reservation:
ƒ Causes a LUN to be used exclusively by a single host for a brief period
ƒ Is used by a VMFS instance to lock the file system while the VMFS
metadata is updated
Operations that result in metadata updates:
ƒ Creating or deleting a virtual disk
ƒ Increasing the size of a VMFS volume
ƒ Creating or deleting snapshots
ƒ Increasing the size of a VMDK file
To minimize the impact on virtual machine performance:
ƒ Postpone major maintenance/configuration until off-peak hours.

VMFS is a clustered file system and uses SCSI reservations as part of its distributed locking
algorithms. Administrative operations—such as creating or deleting a virtual disk, extending a
VMFS volume, or creating or deleting snapshots—result in metadata updates to the file system
using locks and thus result in SCSI reservations. A reservation causes the LUN to be available
exclusively to a single ESXi host for a brief period of time. Although performing a limited number
of administrative tasks during peak hours is acceptable, it is better to postpone major maintenance
to off-peak hours to minimize the effect on virtual machine performance.
The impact of SCSI reservations depends on the number and nature of storage or VMFS
administrative tasks being performed:
• The longer an administrative task runs (for example, creating a virtual machine with a larger
disk or cloning from a template that resides on a slow NFS share), the longer the virtual
machines are affected. Also, the time to reserve and release a LUN is highly hardware-
dependent and vendor-dependent.
• Running administrative tasks from a particular ESXi host does not have much effect on the I/O-
intensive virtual machines running on the same ESXi host.
You usually do not have to worry about SCSI reservations if no VMFS administrative tasks are
being performed or if VMFS volumes are not being shared by multiple hosts.



VMFS Versus RDMs
Slide 7-16

VMFS is the preferred option for most enterprise applications.


Examples:
ƒ Databases, ERP, CRM, Web servers, and file servers
RDM is preferred when raw disk access is necessary.

I/O characteristic                                  Which yields better performance?
Random reads/writes                                 VMFS and RDM yield similar I/O operations/second
Sequential reads/writes at small I/O block sizes    VMFS and RDM yield similar performance
Sequential reads/writes at larger I/O block sizes   VMFS

There are three choices on ESXi hosts for managing disk access in a virtual machine: virtual disk in
a VMFS, virtual raw device mapping (RDM), and physical raw device mapping. Choosing the right
disk-access method can be a key factor in achieving high performance for enterprise applications.
VMFS is the preferred option for most virtual disk storage and most enterprise applications,
including databases, ERP, CRM, VMware Consolidated Backup, Web servers, and file servers.
Common uses of RDM are in cluster data and quorum disks for virtual-to-virtual clustering, or
physical-to-virtual clustering; or for running SAN snapshot applications in a virtual machine.
For random workloads, VMFS and RDM produce similar I/O throughput. For sequential workloads
with small I/O block sizes, RDM provides a small increase in throughput compared to VMFS.
However, the performance gap decreases as the I/O block size increases. For all workloads, RDM
has slightly better CPU cost.
For a detailed study comparing VMFS and RDM performance, see “Performance Characterization
of VMFS and RDM Using a SAN” at http://www.vmware.com/resources/techresources/1040.



Virtual Disk Types

Slide 7-17
Disk type: Eager-zeroed thick
Description: Space allocated and zeroed out at time of creation
How to create: vSphere Client or vmkfstools
Performance impact: Extended creation time, but best performance from first write on
Use case: Quorum drive in an MSCS cluster

Disk type: Lazy-zeroed thick
Description: Space allocated at time of creation, but zeroed on first write
How to create: vSphere Client (default disk type) or vmkfstools
Performance impact: Shorter creation time, but reduced performance on first write to block
Use case: Good for most cases

Disk type: Thin
Description: Space allocated and zeroed upon demand
How to create: vSphere Client or vmkfstools
Performance impact: Shorter creation time, but reduced performance on first write to block
Use case: Disk space utilization is the main concern

The type of virtual disk used by your virtual machine can have an effect on I/O performance. ESXi
supports the following virtual disk types:
• Eager-zeroed thick – Disk space is allocated and zeroed out at the time of creation. Although
this extends the time it takes to create the disk, using this disk type results in the best
performance, even on the first write to each block. The primary use of this disk type is for
quorum drives in an MSCS cluster. You can create eager-zeroed thick disks at the command
prompt with vmkfstools.
• Lazy-zeroed thick – Disk space is allocated at the time of creation, but each block is zeroed
only on the first write. Using this disk type results in a shorter creation time but reduced
performance the first time a block is written to. Subsequent writes, however, have the same
performance as an eager-zeroed thick disk. This disk type is the default type used to create
virtual disks using the vSphere Client and is good for most cases.
• Thin – Disk space is allocated and zeroed upon demand, instead of upon creation. Using this
disk type results in a shorter creation time but reduced performance the first time a block is
written to. Subsequent writes have the same performance as an eager-zeroed thick disk. Use this
disk type when space use is the main concern for all types of applications. You can create thin
disks (also known as thin-provisioned disks) through the vSphere Client.
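As mentioned above for the eager-zeroed thick type, vmkfstools can create any of these disk types
from the command line. A minimal sketch, run in the ESXi Shell (the datastore path and size are
placeholders):

# create a 20GB eager-zeroed thick virtual disk
vmkfstools -c 20G -d eagerzeroedthick /vmfs/volumes/datastore1/testvm/testvm_1.vmdk
# other values for -d include zeroedthick (lazy-zeroed, the default) and thin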



Review of Learner Objectives
Slide 7-18

You should be able to do the following:


ƒ Describe factors that affect storage performance.



Lesson 2: Monitoring Storage Activity

Slide 7-19
Lesson 2:
Monitoring Storage Activity



Learner Objectives
Slide 7-20

After this lesson, you should be able to do the following:


ƒ Determine which disk metrics to monitor.
ƒ Identify metrics in VMware® vCenter Server™ and resxtop.
ƒ Demonstrate how to monitor disk throughput.



Applying Space Utilization Data to Manage Storage Resources

Slide 7-21
The overview charts on the Performance tab display usage details of the datastore.
By default, the displayed charts include:
ƒ Space Utilization
ƒ By Virtual Machines (Top 5)
ƒ 1 Day Summary

The overview charts on the Performance tab for a datastore display information on how the space
on a datastore is utilized. To view the charts:
1. From the Datastores and Datastore Clusters view, select the datastore.

2. Select the Performance tab.

3. On the View pull-down menu, ensure Space is selected.


Two pie charts provide information on space use for the datastore.
• By File Type - The chart displays the portion of space utilized by swap files, other virtual
machine files, other non-virtual machine files and free space.
• By Virtual Machines (Top 5) - Shows a breakdown of virtual machine space use of the top
five virtual machines residing on the datastore.
A summary graph reveals how space is utilized over time:
• From the Time Range pull-down menu, select one day, one week, one month, one year, or
custom. When custom is selected, you can specify the exact period to view by entering a date
and time.
• The graph displays used space, storage capacity, and allocated storage over the time selected.



Disk Capacity Metrics
Slide 7-22

Identify disk problems.


ƒ Determine available bandwidth and compare with expectations.
What do I do?
ƒ Check key metrics. In a VMware vSphere® environment, the most
significant statistics are:
• Disk throughput
• Latency (device, kernel)
• Number of aborted disk commands
• Number of active disk commands
• Number of active commands queued

To identify disk-related performance problems, start by determining the available bandwidth on
your host and comparing it with your expectations. Are you getting the expected IOPS? Are you
getting the expected bandwidth (read/write)? The same thing applies to disk latency. Check disk
latency and compare it with your expectations. Are the latencies higher than you expected?
Compare performance with the hardware specifications of the particular storage subsystem.
Disk bandwidth and latency help determine whether storage is overloaded or slow. In your vSphere
environment, the most significant metrics to monitor for disk performance are the following:
• Disk throughput
• Disk latency
• Number of commands queued
• Number of active disk commands
• Number of aborted disk commands

374 VMware vSphere: Optimize and Scale


Monitoring Disk Throughput with vSphere Client

Slide 7-23

The vSphere Client counters:
ƒ Disk Read Rate, Disk Write Rate, Disk Usage

The vSphere Client allows you to monitor disk throughput using the advanced disk performance
charts. Metrics of interest:
• Disk Read Rate – Number of disk reads (in KBps) over the sample period
• Disk Write Rate – Number of disk writes (in KBps) over the sample period
• Disk Usage – Number of combined disk reads and disk writes (in KBps) over the sample period
These metrics give you a good indication of how well your storage is being utilized. You can also
use these metrics to monitor whether or not your storage workload is balanced across LUNs.
Disk Read Rate and Disk Write Rate can be monitored per LUN. Disk Read Rate, Disk Write Rate,
and Disk Usage can be monitored per host.

Module 7 Storage Optimization 375


Monitoring Disk Throughput with resxtop
Slide 7-24

Measure disk throughput with the following:


ƒ READs/s and WRITEs/s
ƒ READs/s + WRITEs/s = IOPS
Or you can use:
ƒ MBREAD/s and MBWRTN/s

Disk throughput can also be monitored using resxtop:


• READs/s – Number of disk reads per second
• WRITES/s – Number of disk writes per second
The sum of reads/second and writes/second equals I/O operations/second (IOPS). IOPS is a
common benchmark for storage subsystems and can be measured with tools like Iometer.
If you prefer, you can also monitor throughput using the following metrics instead:
• MBREAD/s – Number of megabytes read per second
• MBWRTN/s – Number of megabytes written per second
All of these metrics can be monitored per HBA (vmhba#).
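If these counters need to be captured for later analysis, resxtop can also be run noninteractively
in batch mode. A minimal sketch, assuming a host named esxi02 and placeholder values for the
sampling interval, iteration count, and output file:
resxtop --server esxi02 -b -d 10 -n 6 > disk-stats.csv
The resulting CSV file contains all counters, including the disk throughput metrics above, and can
be opened in a spreadsheet or in Windows Performance Monitor.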

376 VMware vSphere: Optimize and Scale


Disk Throughput Example

Slide 7-25

Adapter view: Type d.
Device view: Type u.

Virtual machine view: Type v.

One of the challenges when monitoring these values is to determine whether or not the values
indicate a potential disk bottleneck. The screenshots above were taken while a database was being
restored. A database restore can generate a lot of sequential I/O and is very disk-write intensive.
The first resxtop screen shows disk activity per disk adapter (type d in the window.) All I/O is
going through the same adapter, vmhba2. To display the set of fields, type f. Ensure that the
following fields are selected: A (adapter name), B (LUN ID), G (I/O statistics), and H (overall
latency statistics). If necessary, select fields to display by typing a, b, g, or h.
The second resxtop screen shows disk activity per device (type u in the window.) Most of the I/O
is going to the same LUN. To display the set of fields, type f. Ensure that the following fields are
selected: A (device name), G (I/O stats), and H (overall latency stats). If necessary, select fields to
display by typing a, g, or h.

Module 7 Storage Optimization 377


The third resxtop screen shows disk activity per virtual machine (type v in the window.) The
virtual machine is generating about 161MB per second (MBWRTN/s). To display the set of fields,
type f. Ensure that the following fields are selected: C (group/world name), I (I/O stats) and J
(overall latency stats). If necessary, select fields to display by typing c, i, or j.
The data shown potentially indicate a problem. In this example, vmhba2 is a 2Gbps Fibre Channel
adapter. The statistics indicate that the adapter is close to its limit: 161MBps of writes is roughly
1.3Gbps of payload, and with Fibre Channel protocol overhead added, the traffic approaches the
2Gbps line rate. If this becomes a problem, consider using a 4Gbps or 8Gbps Fibre Channel adapter.

378 VMware vSphere: Optimize and Scale


Monitoring Disk Latency with vSphere Client

Slide 7-26
The vSphere Client counters:
ƒ Physical device latency counters and kernel latency counters

Disk latency is the time taken to complete an I/O request and is most commonly measured in
milliseconds. Because multiple layers are involved in a storage system, each layer through which the
I/O request passes might add its own delay. In the vSphere environment, the I/O request passes from
the VMkernel to the physical storage device. Latency can be caused when excessive commands are
being queued either at the VMkernel or the storage subsystem. When the latency numbers are high,
the storage subsystem might be overworked by too many hosts.
In the vSphere Client, there are a number of latency counters that you can monitor. However, a
performance problem can usually be identified and corrected by monitoring the following latency
metrics:
• Physical device command latency – This is the average amount of time for the physical device
to complete a SCSI command. Depending on your hardware, a number greater than
15 milliseconds indicates that the storage array might be slow or overworked.
• Kernel command latency – This is the average amount of time that the VMkernel spends
processing each SCSI command. For best performance, the value should be 0–1 milliseconds. If
the value is greater than 4 milliseconds, the virtual machines on the ESXi host are trying to send
more throughput to the storage system than the configuration supports.

Module 7 Storage Optimization 379


It might help to know whether a performance problem is occurring specifically during reads or
writes. The following metrics provide this information:
• Physical device read latency and Physical device write latency – These metrics break down
physical device command latency into read latency and write latency. These metrics define the
time it takes the physical device, from the HBA to the platter, to service a read or write request,
respectively.
• Kernel read latency and Kernel write latency – These metrics are a breakdown of the kernel
command latency metric. These metrics define the time it takes the VMkernel to service a read or
write request, respectively. This is the time between the guest operating system and the virtual
device.

380 VMware vSphere: Optimize and Scale


Monitoring Disk Latency with resxtop

Slide 7-27

Host bus adapters (HBAs) include SCSI, iSCSI, RAID, and FC-HBA adapters.
The slide shows latency stats from the device, the kernel, and the guest:
DAVG/cmd: Average latency (ms) of the device (LUN)
KAVG/cmd: Average latency (ms) in the VMkernel, also known as “queuing time”
GAVG/cmd: Average latency (ms) in the guest. GAVG = DAVG + KAVG.

In addition to disk throughput, the disk adapter screen (type d in the window) lets you monitor disk
latency as well:
• ADAPTR – The name of the host bus adapter (vmhba#), which includes SCSI, iSCSI, RAID
and Fibre Channel adapters.
• DAVG/cmd – The average amount of time it takes a device (which includes the HBA, the
storage array, and everything in between) to service a single I/O request (read or write).
• If the value < 10, the system is healthy. If the value is 11–20 (inclusive), be aware of the
situation by monitoring the value more frequently. If the value is > 20, this most likely
indicates a problem.
• KAVG/cmd – The average amount of time it takes the VMkernel to service a disk operation.
This number represents time spent by the CPU to manage I/O. Because processors are much
faster than disks, this value should be close to zero. A value of 1 or 2 is considered high for this
metric.
• GAVG/cmd – The total latency seen from the virtual machine when performing an I/O request.
GAVG is the sum of DAVG and KAVG.

Module 7 Storage Optimization 381


Monitoring Commands and Command Queuing
Slide 7-28

Performance metric            Name in vSphere Client    Name in resxtop/esxtop
Number of active commands     Commands issued           ACTV
Number of commands queued     Queue command latency     QUED

There are metrics for monitoring the number of active disk commands and the number of disk
commands that are queued. These metrics provide information about your disk performance. They
are often used to further interpret the latency values that you might be observing.
• Number of active commands – This metric represents the number of I/O operations that are
currently active. This includes operations for which the host is processing. This metric can
serve as a quick view of storage activity. If the value of this metric is close to or at zero, the
storage subsystem is not being used. If the value is a nonzero number, sustained over time, then
constant interaction with the storage subsystem is occurring.
• Number of commands queued – This metric represents the number of I/O operations that require
processing but have not yet been addressed. Commands are queued and awaiting management by
the kernel when the driver’s active command buffer is full. Occasionally, a queue will form and
result in a small, nonzero value for QUED. However, any significant (double-digit) average of
queued commands means that the storage hardware is unable to keep up with the host’s needs.

382 VMware vSphere: Optimize and Scale


Disk Latency and Queuing Example

Slide 7-29

The first resxtop output shows normal VMkernel latency; the second shows queuing at the device.

Here is an example of monitoring the kernel latency value, KAVG/cmd. This value is being
monitored for the device vmhba0. In the first resxtop screen (type d in the window), the kernel
latency value is 0.01 milliseconds. This is a good value because it is nearly zero.
In the second resxtop screen (type u in the window), there are 32 active I/Os (ACTV) and 2 I/Os
being queued (QUED). This means that there is some queuing happening at the VMkernel level.
Queuing happens if there is excessive I/O to the device and the LUN queue depth setting is not
sufficient. The default LUN queue depth is 32. However, if there are too many I/Os (more than 32)
to handle simultaneously, the device will get bottlenecked to only 32 outstanding I/Os at a time. To
resolve this, you would change the queue depth of the device driver.
For details on changing the queue depth of the device driver, see Fibre Channel SAN Configuration
Guide at http://www.vmware.com/support/pubs.
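As an illustration only, on ESXi 5.x the LUN queue depth is typically changed by setting a parameter
on the HBA driver module and rebooting the host. The module and parameter names below apply to one
QLogic Fibre Channel driver and are shown as an assumption; consult the guide above for the values
that match your HBA:
esxcli system module parameters set -m qla2xxx -p ql2xmaxqdepth=64
esxcli system module parameters list -m qla2xxx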

Module 7 Storage Optimization 383


Monitoring Severely Overloaded Storage
Slide 7-30

The vSphere Client counter:
ƒ Disk Command Aborts
The resxtop counter:
ƒ ABRTS/s

When storage is severely overloaded, commands are aborted because the storage subsystem is
taking far too long to respond to the commands. The storage subsystem has not responded within an
acceptable amount of time, as defined by the guest operating system or application. Aborted
commands are a sign that the storage hardware is overloaded and unable to handle the requests in
line with the host’s expectations.
The number of aborted commands can be monitored by using either the vSphere Client or resxtop:
• From the vSphere Client, monitor Disk Command Aborts.
• From resxtop, monitor ABRTS/s.

384 VMware vSphere: Optimize and Scale


Configuring Datastore Alarms

Slide 7-31
To configure datastore alarms:
ƒ Right-click on the datastore and select Add Alarm.
ƒ Enter the condition or event you want to monitor and what action to
occur as a result.

Alarms can be configured to detect when specific conditions or events occur against a selected
datastore.
Monitoring for conditions is limited to only three options:
• Datastore Disk Provisioned (%)
• Datastore Disk Usage (%)
• Datastore State to All Hosts
Monitoring for events offers more options. The events that can be monitored include the following:
• Datastore capacity increased
• Datastore discovered
• File or directory copied to datastore
• File or directory deleted
• File or directory moved to datastore
• Local datastore created

Module 7 Storage Optimization 385


• NAS datastore created
• SIOC: pre-4.1 host {host} connected to SIOC-enabled datastore {objectName}
• Storage provider system capability requirements met
• Storage provider system capability requirements not met
• Storage provider system default capability event
• Storage provider thin provisioning capacity threshold crossed
• Storage provider thin provisioning capacity threshold reached
• Storage provider thin provisioning default capacity event
• Unmanaged workload detected on SIOC-enabled datastore
• VMFS datastore created
• VMFS datastore expanded
• VMFS datastore extended

386 VMware vSphere: Optimize and Scale


Analyzing Datastore Alarms

Slide 7-32
By selecting the Triggered Alarms button, the Alarms tab displays all
triggered alarms for the highlighted datastores.

By default, alarms are not configured to perform any actions. An alarm appears on the Alarms tab
when the Triggered Alarms button is selected, and a red or yellow indicator appears on the
datastore. Without a configured action, however, the only way to know that an alarm has triggered
is to actively look for it.
Actions that can be configured include the following:
• Send a notification email
• Send a notification trap
• Run a command
Configuring actions enables the vCenter Server to notify you if an alarm is triggered.
Once a triggered alarm is discovered, you must gather as much information as possible concerning
the reason for the alarm. The Tasks & Events tab can provide more detail as to why the alarm
occurred. Searching the system logs might also provide some clues.

Module 7 Storage Optimization 387


Lab 10 Introduction
Slide 7-33

The slide diagram shows the test virtual machine TestVM01 with its system disk (TestVM01.vmdk)
and a local data disk (TestVM01_1.vmdk) on the local VMFS datastore, plus a remote data disk on
the VMFS datastore on your assigned LUN in shared storage. The scripts fileserver1.sh,
fileserver2.sh, datawrite.sh, and logwrite.sh run against these disks.

In this lab, you generate various types of disk I/O and compare performance. Your virtual machine is
currently configured with one virtual disk (the system disk). For this lab, you add two additional
virtual disks (a local data disk and a remote data disk) to your test virtual machine: one on your local
VMFS file system and the other on your assigned VMFS file system on shared storage. Your local
VMFS file system is located on a single physical drive and your assigned VMFS file system on
shared storage is located on a LUN consisting of two physical drives.
After adding the two virtual disks to your virtual machine, you run the following scripts:
• fileserver1.sh – This script generates random reads to the local data disk.
• fileserver2.sh – This script generates random reads to the remote data disk.
• datawrite.sh – This script generates random writes to the remote data disk.
• logwrite.sh – This script generates sequential writes to the remote data disk.
Each script starts a program called aio-stress. aio-stress is a simple command-line program
that measures the performance of a disk subsystem. For more information on the aio-stress
command, see the file readme.txt on the test virtual machine, in the same directory as the scripts.

388 VMware vSphere: Optimize and Scale


Lab 10

Slide 7-34
In this lab, you will use performance charts to monitor disk performance:
1. Identify the vmhbas used for local storage and shared storage.
2. Add disks to your test virtual machine.
3. Display the real-time disk performance charts of the vSphere Client.
4. Perform test case 1: Measure sequential write activity to your remote virtual disk.
5. Perform test case 2: Measure random write activity to your remote virtual disk.
6. Perform test case 3: Measure random read activity to your remote virtual disk.
7. Perform test case 4: Measure random read activity to your local virtual disk.
8. Summarize your findings.
9. Clean up for the next lab.

Module 7 Storage Optimization 389


Lab 10 Review
Slide 7-35

vSphere Client   Case 1:             Case 2:             Case 3:             Case 4:
Counters         Sequential writes   Random writes       Random reads        Random reads
                 to remote disk      to remote disk      to remote disk      to local disk

Read Rate

Write Rate

390 VMware vSphere: Optimize and Scale


Review of Learner Objectives

Slide 7-36
You should be able to do the following:
ƒ Determine which disk metrics to monitor.
ƒ Identify metrics in vCenter Server and resxtop.
ƒ Demonstrate how to monitor disk throughput.

Module 7 Storage Optimization 391


Lesson 3: Command-Line Storage Management
Slide 7-37

Lesson 3:
Command-Line Storage Management

392 VMware vSphere: Optimize and Scale


Learner Objectives

Slide 7-38
After this lesson, you should be able to do the following:
ƒ Use VMware vSphere® Management Assistant (vMA) to manage
vSphere virtual storage.
ƒ Use vmkfstools for VMFS operations.
ƒ Use the vscsiStats command.

Module 7 Storage Optimization 393


Managing Storage with vMA
Slide 7-39

Storage management task vMA command


Examine LUNs. esxcli

Manage storage paths. esxcli

Manage NAS storage. esxcli

Manage iSCSI storage. esxcli

Mask LUNs. esxcli

Manage PSA plug-ins. esxcli

Migrating virtual machines to a different datastore. svmotion

The VMware vSphere® Command-Line Interface (vCLI) storage commands enable you to manage
ESXi storage. ESXi storage is storage space on various physical storage systems (local or
networked) that a host uses to store virtual machine disks.

394 VMware vSphere: Optimize and Scale


Examining LUNs

Slide 7-40
Use the esxcli command with the storage namespace:
ƒ esxcli <conn_options> storage core|filesystem <cmd_options>

Examples:
ƒ To list all logical devices known to a host:
• esxcli --server esxi02 storage core device list
ƒ To list a specific device:
• esxcli --server esxi02 storage core device
list -d mpx.vmhba32:C0:T0:L0
ƒ To display host bus adapter (HBA) information:
• esxcli --server esxi02 storage core adapter list
ƒ To print mappings of datastores to their mount points and UUIDs:
• esxcli --server esxi02 storage filesystem list

The esxcli storage core command is used to display information about available logical unit
numbers (LUNs) on the ESXi hosts.
The esxcli storage filesystem command is used for listing, mounting, and unmounting
VMFS volumes.
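For example, a volume can be taken offline and brought back by label. This is a hedged sketch; the
volume label MyVMFS is a placeholder, and the volume must not be in use when it is unmounted:
esxcli --server esxi02 storage filesystem unmount --volume-label=MyVMFS
esxcli --server esxi02 storage filesystem mount --volume-label=MyVMFS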

Module 7 Storage Optimization 395


Managing Storage Paths
Slide 7-41

Use the esxcli command with the storage core path|adapter namespace:
ƒ esxcli <conn_options> storage core path|adapter <cmd_options>

Examples:
ƒ To display mappings between HBAs and devices:
• esxcli --server esxi02 storage core path list
ƒ To list the statistics for a specific device path:
• esxcli --server esxi02 storage core path stats get
--path vmhba33:C0:T2:L0
ƒ To rescan all adapters:
• esxcli --server esxi02 storage core adapter rescan

You can display information about paths by running the esxcli storage core path command.
To list information about a specific device path, use the --path <path_name> option:
• esxcli --server esxi02 storage core path list --path vmhba33:C0:T2:L0
where <path_name> is the runtime name of the device.
To rescan a specific storage adapter, use the --adapter <adapter_name> option:
esxcli --server esxi02 storage core adapter rescan --adapter vmhba33
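A common follow-up, sketched here under the same connection assumptions, is to rescan every adapter
and then list the file systems to confirm that a newly presented LUN or datastore is visible:
esxcli --server esxi02 storage core adapter rescan --all
esxcli --server esxi02 storage filesystem list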

396 VMware vSphere: Optimize and Scale


Managing NAS Storage

Slide 7-42
Use the esxcli command with the storage nfs namespace:
ƒ esxcli <conn_options> storage nfs <cmd_options>
Examples:
ƒ To list NFS file systems:
• esxcli --server esxi02 storage nfs list
ƒ To add an NFS file system to an ESXi host:
• esxcli --server esxi02 storage nfs
add --host=nfs.vclass.local --share=/lun2
--volume-name=MyNFS
ƒ To delete an NFS file system:
• esxcli --server esxi02 storage nfs remove --volume-name=MyNFS

At the command prompt, you can add, delete, and list attributes concerning NAS datastores attached
to the ESXi host.
When adding an NFS file system to an ESXi host, use the following options:
• --host – Specifies the NAS/NFS server
• --share – Specifies the folder being shared
• --volume-name – Specifies the NFS datastore name
Another useful command is the showmount command:
• showmount -e displays the file systems that are exported on the server where the command is
being run.
showmount -e <host_name> displays the file systems that are exported on the specified host.

Module 7 Storage Optimization 397


Managing iSCSI Storage
Slide 7-43

Use the esxcli command with the iscsi namespace:


ƒ esxcli <conn_options> iscsi <cmd_options>
Examples:
ƒ To enable software iSCSI:
• esxcli --server esxi02 iscsi software set --enabled=true
ƒ To add an iSCSI software adapter to an ESXi host:
• esxcli --server esxi02 iscsi networkportal add
-n vmk2 -A vmhba33
ƒ To check the software iSCSI adapter status:
• esxcli --server esxi02 iscsi software get
- If the command returns the value true, the adapter is enabled.

The esxcli iscsi command can be used to configure both hardware and software iSCSI storage
for the ESXi hosts.
When you add an iSCSI software adapter to an ESXi host, you are essentially binding a VMkernel
interface to an iSCSI software adapter. The –n option enables you to specify the VMkernel interface
that the iSCSI software adapter binds to. The -A option enables you to specify the vmhba name for
the iSCSI software adapter.
To find the VMkernel interface name (vmk#), use the command:
• esxcli --server <server_name> network ip interface list
To find the vmhba, use the command:
• esxcli --server <server_name> storage core adapter list
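After the binding is in place, a typical next step is to point the software adapter at an iSCSI
target and rescan the adapter. This is a sketch only; the target address 10.10.10.1:3260 is a
placeholder, not a value from this course:
esxcli --server esxi02 iscsi adapter discovery sendtarget add --adapter vmhba33 --address 10.10.10.1:3260
esxcli --server esxi02 storage core adapter rescan --adapter vmhba33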

398 VMware vSphere: Optimize and Scale


Masking LUNs

Slide 7-44
You can prevent an ESXi host from accessing the following:
ƒ A storage array
ƒ A LUN
ƒ An individual path to a LUN
Use the esxcli command to mask access to storage.
ƒ esxcli --server esxi02 storage core claimrule add -r
<claimrule_ID> -t <type> <required_option> -P
MASK_PATH

Storage path masking has a variety of purposes. In troubleshooting, you can force a system to stop
using a path that you either know or suspect is a bad path. If the path was bad, then errors should be
reduced or eliminated in the log files as soon as the path is masked.
You can also use path masking to protect a particular storage array, LUN, or path. Path masking is
useful for test and engineering environments.
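As a hedged example of the full masking workflow (the rule number, the path coordinates, and the
device ID are placeholders), a path is masked by adding and loading a claim rule and then
reclaiming the device:
esxcli --server esxi02 storage core claimrule add -r 500 -t location -A vmhba33 -C 0 -T 2 -L 0 -P MASK_PATH
esxcli --server esxi02 storage core claimrule load
esxcli --server esxi02 storage core claiming reclaim -d <device_ID>
esxcli --server esxi02 storage core claimrule run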

Module 7 Storage Optimization 399


Managing PSA Plug-Ins
Slide 7-45

Use the esxcli command to manage the NMP.


ƒ List the devices controlled by the NMP
• esxcli storage nmp path list
ƒ Set the PSP for a device
• esxcli storage nmp device set --device naa.xxx
--psp VMW_PSP_FIXED
ƒ View paths claimed by NMP
• esxcli storage nmp path list
ƒ Retrieve PSP information
• esxcli storage nmp psp generic deviceconfig get
--device=<device>
• esxcli storage nmp psp fixed deviceconfig get
--device=<device>
• esxcli storage nmp psp roundrobin deviceconfig get
--device=<device>

The NMP is an extensible multipathing module that ESXi supports by default. You can use esxcli
storage nmp to manage devices associated with NMP and to set path policies.
The NMP supports all storage arrays listed on the VMware storage Hardware Compatibility List
(HCL) and provides a path selection algorithm based on the array type. The NMP associates a set of
physical paths with a storage device. A SATP determines how path failover is handled for a specific
storage array. A PSP determines which physical path is used to issue an I/O request to a storage
device. SATPs and PSPs are sub plug-ins within the NMP module.
The list command lists the devices controlled by VMware NMP and shows the SATP and PSP
information associated with each device. To show the paths claimed by NMP, run esxcli
storage nmp path list to list information for all devices, or for just one device with the --
device option.

The set command sets the PSP for a device to one of the policies loaded on the system.
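For example, a hedged sketch of switching a device to round robin and then verifying the change
(the device identifier naa.xxx is a placeholder):
esxcli --server esxi02 storage nmp device set --device naa.xxx --psp VMW_PSP_RR
esxcli --server esxi02 storage nmp device list --device naa.xxx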

400 VMware vSphere: Optimize and Scale


Use the path option to list paths claimed by NMP. By default, the command displays information
about all paths on all devices. You can filter the command by doing the following:

• Only show paths to a single device (esxcli storage nmp path list --device
<device>).
• Only show information for a single path (esxcli storage nmp path list
--path=<path>).

The esxcli storage nmp psp generic deviceconfig get and esxcli storage nmp
psp generic pathconfig get commands retrieve PSP configuration parameters. The type of
PSP determines which command to use:
• Use nmp psp generic deviceconfig get for PSPs that are set to VMW_PSP_RR,
VMW_PSP_FIXED or VMW_PSP_MRU.
• Use nmp psp generic pathconfig get for PSPs that are set to VMW_PSP_FIXED or
VMW_PSP_MRU. No path configuration information is available for VMW_PSP_RR.

Module 7 Storage Optimization 401


Migrating Virtual Machine Files to a Different Datastore
Slide 7-46

Use the svmotion command to perform VMware vSphere® Storage vMotion® migrations:
ƒ svmotion <conn_options>
--datacenter=<datacenter_name>
--vm=<VM_config_datastore_path>:<new_datastore>
[--disks=<virtual_disk_datastore_path>:<new_datastore>]
Example:
ƒ svmotion --server vc01 --username administrator
--password vmware1! --datacenter=Training
--vm="[Local01] LabVM1/LabVM1.vmx:SharedVMs"
To run svmotion in interactive mode:
ƒ svmotion --server vc01 --interactive
- After the command is entered, the user is asked a series of questions
concerning the migration.

The ability to perform VMware vSphere® Storage vMotion® migrations at the command prompt
can be useful when the user wants to run a script to clear datastores. Sometimes, a script is the
preferred alternative over migrating with the vSphere Client. Using a script is especially desirable in
cases where a lot of virtual machines must be migrated.
The svmotion command can be run in interactive or noninteractive mode.
In noninteractive mode, all the options must be listed as part of the command. If an option is
missing, the command fails.
Running the command in interactive mode starts with the --interactive switch. In interactive
mode, the user is prompted for the information required to complete the migration. The user is
prompted for the following:
• The datacenter
• The virtual machine to migrate
• The new datastore on which to place the virtual machine
• Whether the virtual disks will reside on the same datastore or on different datastores
When the command is run, the --server option must refer to a vCenter Server system.
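For example, to place one virtual disk on a different datastore from the rest of the virtual
machine, the --disks option can be added to the noninteractive form. This is a sketch only; the
disk path LabVM1_1.vmdk and the OtherDatastore name are placeholders:
svmotion --server vc01 --username administrator --password vmware1! --datacenter=Training
--vm="[Local01] LabVM1/LabVM1.vmx:SharedVMs"
--disks="[Local01] LabVM1/LabVM1_1.vmdk:OtherDatastore"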

402 VMware vSphere: Optimize and Scale


vmkfstools Overview

Slide 7-47
The vmkfstools command is a VMware vSphere® ESXi™ Shell
command for managing VMFS volumes and virtual disks.
ƒ vmkfstools performs many storage operations from the command
line.
ƒ Examples of vmkfstools operations:
• Create a VMFS datastore
• Manage a VMFS datastore
• Manipulate virtual disk files
ƒ Common options, including connection options, can be displayed with
the following command:
vmkfstools --help

vmkfstools is one of the VMware vSphere® ESXi™ Shell commands for managing VMFS
volumes and virtual disks. The vmkfstools command is part of the vCLI and vMA.
You can perform many storage operations using the vmkfstools command. For example, you can
create and manage VMFS datastores on a physical partition, or you can manipulate virtual disk files
that are stored on VMFS or NFS datastores.
After you make a change using vmkfstools, the vSphere Client might not be updated immediately.
You must use a refresh or rescan operation from the vSphere Client.

Module 7 Storage Optimization 403


vmkfstools Commands
Slide 7-48

vmkfstools can perform operations on the following:


ƒ Virtual disks
ƒ File systems
ƒ Logical volumes
ƒ Physical storage devices

You use the vmkfstools command to create and manipulate virtual disks, file systems, logical
volumes, and physical storage devices on an ESXi host. You can use vmkfstools to create and
manage a virtual machine file system on a physical partition of a disk and to manipulate files, such
as virtual disks, stored on VMFS-3 and NFS. You can also use vmkfstools to set up and manage
raw device mappings (RDMs).

404 VMware vSphere: Optimize and Scale


vmkfstools General Options

Slide 7-49
General options for vmkfstools :
ƒ connection_options specifies the target server and authentication
information.
ƒ --help prints a help message for each command specific option.
ƒ --vihost specifies the ESXi host to run the command against.
ƒ -v specifies the verbosity level of the command output.

You can use one or more command-line options and associated arguments to specify the activity for
vmkfstools to perform, for example, choosing the disk format when creating a new virtual disk.
After entering the option, specify a target on which to perform the operation. The target can indicate
a partition, device, or path.
With vmkfstools you can use the following general options:
• connection_options: Specifies the target server and authentication information if required.
The connection options for vmkfstools are the same options that vicfg- commands use.
Run vmkfstools --help for a list of all connection options.
• --help: Prints a help message for each command-specific and each connection option. Calling
the script with no arguments or with --help has the same effect.
• --vihost | -h <esx_host>: When you execute vmkfstools with the --server option
pointing to a vCenter Server system, use --vihost to specify the ESXi host to run the
command against.
• -v: The -v suboption indicates the verbosity level of the command output.

Module 7 Storage Optimization 405


vmkfstools Command Syntax
Slide 7-50

Generally you do not need to log in as the root user to run vmkfstools commands. Some
commands might require root user login.
ƒ vmkfstools performs many storage operations from the command
line.
ƒ vmkfstools supports the following command syntax:
vmkfstools conn_options target
• You have one or more command-line options to specify the activity for
vmkfstools to perform.
• Target is the location where the operation is performed.
ƒ Example:
vmkfstools --createfs vmfs5 --blocksize 1m disk_ID:P
• Creates a VMFS5 file system on the specified device.

Generally, you do not need to log in as the root user to run the vmkfstools command. However,
some commands, such as the file system commands, might require the root user login.
The vmkfstools command supports the following command syntax:
vmkfstools conn_options options target
Target specifies a partition, device, or path to which to apply the command option.

406 VMware vSphere: Optimize and Scale


vmkfstools File System Options

Slide 7-51
vmkfstools is used to create and manage VMFS file systems:
ƒ --createfs creates a VMFS5 file system of a specified partition.
ƒ --blocksize specifies the block size of the VMFS files system.
ƒ --setfsname sets name of the VMFS file system to create.
ƒ --spanfs extends the VMFS file system with the specified partition.
ƒ --rescanvmfs rescans the host for new VMFS volumes
ƒ --queryfs lists attributes of a file or directory on a VMFS volumes.

You use the vmkfstools command to create and manipulate file systems, logical volumes, and
physical storage devices on an ESXi host.
--createfs | -C vmfs5 -b | --blocksize -S | --setfsname <fsName>
<partition> – Creates a VMFS5 file system on a specified partition, such as naa.<naa_ID>:1.
The specified partition becomes the file system's head partition.
--blocksize | -b – Specifies the block size of the VMFS file system to create. When omitted,
defaults to using 1MB.
--setfsname | -S – Name of the VMFS file system to create.

--spanfs | -Z <span_partition> <head_partition> – Extends the VMFS file system with the
specified head partition by spanning it across the partition specified by <span_partition>.
--rescanvmfs | -V – Rescans the host for new VMFS volumes.
--queryfs | -P <directory> – Lists attributes of a file or directory on a VMFS volume.
Displays VMFS version number, the VMFS file system partitions, the capacity and the
available space.
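A minimal sketch combining these file system options (the device name naa.xxx and the datastore
label NewVMFS are placeholders):
vmkfstools --createfs vmfs5 --blocksize 1m --setfsname NewVMFS /vmfs/devices/disks/naa.xxx:1
vmkfstools --queryfs /vmfs/volumes/NewVMFS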

Module 7 Storage Optimization 407


vmkfstools Virtual Disk Options
Slide 7-52

vmkfstools is used to create and manage virtual machine disks:


ƒ --createvirtualdisk
ƒ --adapterType
ƒ --diskformat
ƒ --clonevirtualdisk
ƒ --deletevirtualdisk
ƒ --renamevirtualdisk
ƒ --extendvirtualdisk
ƒ --createrdm
ƒ --createrdmpassthru
ƒ --querydm
ƒ --geometry
ƒ --writezeros
ƒ --inflatedisk

You can manage virtual disks with vmkfstools. The following options can be used to manage
virtual disk files:
• --createvirtualdisk | -c <size> -a | --adaptertype <srcfile> -d | --
diskformat <location> Creates a virtual disk at the specified location on a VMFS volume.
With <size>, you can specify k|K, m|M, or g|G. Default size is 1MB, default adapter type
is buslogic, and default disk format is zeroedthick.
• --adapterType | -a [buslogic|lsilogic|ide] Adapter type of a disk to be created.
Accepts buslogic, lsilogic, or ide.
• --diskformat | -d Specifies the target disk format when used with -c, -i, or -X. For -c,
accepts zeroedthick, eagerzeroedthick, or thin. For -i, accepts zeroedthick, eagerzeroedthick,
thin, rdm:dev, rdmp:dev, or 2gbsparse. For -X, accepts eagerzeroedthick.
• --clonevirtualdisk | -i <src_file> <dest_file> --diskformat | -d
<format> --adaptertype | -a <type> Creates a copy of a virtual disk or raw disk. The
copy is in the specified disk format. Takes source disk and destination disk as arguments.
• --deletevirtualdisk | -U <disk> Deletes files associated with the specified
virtual disk.

408 VMware vSphere: Optimize and Scale


• --renamevirtualdisk | -E <old_name> <new_name> Renames a specified virtual disk.

• --extendvirtualdisk | -X [-d eagerzeroedthick] Extends the specified VMFS
virtual disk to the specified length. This command is useful for extending the size of a virtual
disk allocated to a virtual machine after the virtual machine has been created. However, this
command requires that the guest operating system has some capability for recognizing the new
size of the virtual disk and taking advantage of this new size. For example, the guest operating
system can take advantage of this new size by updating the file system on the virtual disk to
take advantage of the extra space.
• --createrdm | -r <rdm_file> Creates a raw disk mapping, that is, maps a raw disk to a
file on a VMFS file system. Once the mapping is established, the mapping file can be used to
access the raw disk like a normal VMFS virtual disk. The file length of the mapping is the same
as the size of the raw disk that it points to.
• --createrdmpassthru | -z <device> <map_file> Creates a passthrough raw disk
mapping. Once the mapping is established, it can be used to access the raw disk like a normal
VMFS virtual disk. The file length of the mapping is the same as the size of the raw disk that it
points to.
• --querydm | -q This option is not currently supported.
• --geometry | -g Returns the geometry information (cylinders, heads, sectors) of a
virtual disk.
• --writezeros | -w Initializes the virtual disk with zeros. Any existing data on the virtual
disk is lost.
• --inflatedisk | -j Converts a thin virtual disk to eagerzeroedthick. Any data on the thin
disk is preserved. Any blocks that were not allocated are allocated and zeroed out.

Module 7 Storage Optimization 409


vmkfstools Virtual Disk Example
Slide 7-53

The following example assumes that you are specifying connection options.
Example:
ƒ Create a virtual disk 2GB in size:
• vmkfstools -c 2048m '[Storage1] rh6-2.vmdk'

This example illustrates creating a two-gigabyte virtual disk file named rh6-2.vmdk on the
VMFS file system named Storage1. This file represents an empty virtual disk that virtual machines
can access.
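Building on the same example, two hedged follow-on operations (the new size is arbitrary): the disk
can later be grown and its geometry inspected.
vmkfstools -X 4g '[Storage1] rh6-2.vmdk'
vmkfstools -g '[Storage1] rh6-2.vmdk'
Remember that after extending the disk, the guest operating system must still be made aware of the
new size, as noted earlier in this lesson.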

410 VMware vSphere: Optimize and Scale


vscsiStats

Slide 7-54
vscsiStats collects and reports counters on storage activity:
ƒ Data is collected at the virtual SCSI device level in the kernel.
ƒ Results are reported per VMDK, regardless of the underlying storage
protocol.
Reports data in histogram form, which includes:
ƒ I/O size
ƒ Seek distance
ƒ Outstanding I/Os
ƒ Latency in microseconds

The vscsiStats command collects and reports counters on storage activity.


vscsiStats creates the ESXi disk I/O workload characterization per virtual disk. This
characterization allows you to separate each type of workload into its own container and to
observe trends.
In many cases, the administrator is hampered by not knowing the right question to ask when trying
to troubleshoot disk performance problems. vscsiStats provides many different levels of
information to help you provide as much information as possible to your storage team. vscsiStats
reports the following data in histogram form: I/O size, seek distance, outstanding I/Os, and latency.

Module 7 Storage Optimization 411


Why Use vscsiStats?
Slide 7-55

Provides virtual disk latency statistics, with microsecond precision, for both VMFS and NFS datastores
Reports data in histogram form instead of averages:
ƒ Histograms of observed data values can be much more informative
than single numbers like mean, median, and standard deviations from
the mean.
Exposes sequentiality of I/O:
ƒ This tool shows sequential versus random access patterns.
ƒ Sequentiality can help with storage sizing, LUN architecture, and
identification of application-specific behavior.
Provides a breakdown of I/O sizes:
ƒ Knowledge of expected I/O sizes can be used to optimize performance
of the storage architecture.

Because vscsiStats collects its results where the guest interacts with the hypervisor, it is unaware
of the storage implementation. Latency statistics can be collected for all storage configurations with
this tool.
Instead of computing averages over large intervals, histograms are kept so that a full distribution is
available for later analysis. For example, knowing the average I/O size of 23KB for an application is
not as useful as knowing that the application is doing 8 and 64KB I/Os in certain proportions.
Sequentiality and I/O size are important characteristics of storage profiles. A variety of best
practices have been provided by storage vendors to enable customers to tune their storage to a
particular I/O size and access type. As an example, it might make sense to optimize an array’s stripe
size to its average I/O size. vscsiStats can provide a histogram of I/O sizes and block distribution
to help this process.

412 VMware vSphere: Optimize and Scale


Running vscsiStats

Slide 7-56
vscsiStats <options>
ƒ -l List running virtual machines and their world IDs
(worldGroupID).
ƒ -s Start vscsiStats data collection.
ƒ -x Stop vscsiStats data collection.
ƒ -p Print histograms, specifying histogram type.
ƒ -c Produce results in a comma-delimited list.
ƒ -h Display help menu for more information about
command-line parameters.

Starting the vscsiStats data collection requires the world ID of the virtual machine being
monitored. You can determine this identifier by running the -l option to generate a list of all
running virtual machines.
To start the data collection, use this command:
vscsiStats -s -w <worldGroup_ID>

Once vscsiStats data collection has been started, it will automatically stop after about 30
minutes. If the analysis is needed for a longer period, reissue the start command (-s option) in the
window where the command was run originally, before the timeout expires. Doing so defers the timeout
and termination by another 30 minutes. The -p option works only while vscsiStats is running.
When you are viewing data counters, the histogram type is used to specify either all of the statistics
or one group of them. Options include all, ioLength, seekDistance, outstandingIOs,
latency, and interarrival. These options are case-sensitive.

Module 7 Storage Optimization 413


Results can be produced in a more compact, comma-delimited list by adding the -c option. For
example, to show all counters in a comma-delimited form, use the command:
/usr/lib/vmware/bin/vscsiStats -p all -c

To manually stop the data collection and reset the counters, use the -x option:
vscsiStats -x -w <worldGroup_ID>

Because results are accrued and reported out in summary, the histograms will include data since
collection was started. To reset all counters to zero without restarting vscsiStats, run:
/usr/lib/vmware/bin/vscsiStats -r
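Putting the options together, a typical session looks like the following sketch, where 12345 stands
in for the worldGroupID returned by the -l option:
vscsiStats -l
vscsiStats -s -w 12345
vscsiStats -p latency -w 12345 -c
vscsiStats -x -w 12345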

414 VMware vSphere: Optimize and Scale


Lab 11

Slide 7-57
In this lab, you will use various commands to view storage
information and monitor storage activity.
1. Use esxcli to view storage configuration information.
2. Use vmkfstools to manage volumes and virtual disks.
3. Generate disk activity in your test virtual machine.
4. Start vscsiStats.
5. Use vscsiStats to generate a histogram for disk latency.
6. Use vscsiStats to determine type of I/O disk activity.
7. Generate disk activity in your test virtual machine.
8. Use vscsiStats to determine type of I/O disk activity.
9. Clean up for the next lab.

Module 7 Storage Optimization 415


Review of Learner Objectives
Slide 7-58

You should be able to do the following:


ƒ Use vMA to manage vSphere virtual storage.
ƒ Use vmkfstools for VMFS operations.
ƒ Use the vscsiStats command.

416 VMware vSphere: Optimize and Scale


Lesson 4: Troubleshooting Storage Performance Problems

Slide 7-59
Lesson 4:
Troubleshooting Storage Performance
Problems

Module 7 Storage Optimization 417


Learner Objectives
Slide 7-60

After this lesson, you should be able to do the following:


ƒ Describe various storage performance problems.
ƒ Discuss the causes of storage performance problems.
ƒ Propose solutions to correct storage performance problems.
ƒ Discuss examples of troubleshooting storage performance problems.

418 VMware vSphere: Optimize and Scale


Review: Basic Troubleshooting Flow for ESXi Hosts

Slide 7-61

The slide shows the basic troubleshooting flow as three columns of checks. The legend labels the
columns definite problems, likely problems, and possible problems:
Definite problems:
1. Check for VMware® Tools™ status.
2. Check for resource pool CPU saturation.
3. Check for host CPU saturation.
4. Check for guest CPU saturation.
5. Check for active VM memory swapping.
6. Check for VM swap wait.
7. Check for active VM memory compression.
8. Check for an overloaded storage device.
9. Check for dropped receive packets.
10. Check for dropped transmit packets.
Likely problems:
11. Check for using only one vCPU in an SMP VM.
12. Check for high CPU ready time on VMs running in under-utilized hosts.
13. Check for slow storage device.
14. Check for random increase in I/O latency on a shared storage device.
15. Check for random increase in data transfer rate on network controllers.
Possible problems:
16. Check for low guest CPU utilization.
17. Check for past VM memory swapping.
18. Check for high memory demand in a resource pool.
19. Check for high memory demand in a host.
20. Check for high guest memory demand.

The performance of the storage infrastructure can have a significant impact on the performance of
applications, whether running in a virtualized or nonvirtualized environment. In fact, problems with
storage performance are the most common causes of application-level performance problems.
Slow storage is the most common cause of performance problems in the vSphere environment.
Slow storage response times might be an indication that your storage is being overloaded. Slow
storage or overloaded storage are related problems, so the solutions presented in this lesson address
both problems.

Module 7 Storage Optimization 419


Overloaded Storage
Slide 7-62

Monitor the number of disk commands aborted on the host:


ƒ If Commands Terminated > 0 for any LUN, then storage is overloaded
on that LUN.

Severely overloaded storage can be the result of a large number of different issues in the underlying
storage layout or infrastructure. In turn, overloaded storage can manifest itself in many different
ways, depending on the applications running in the virtual machines. When storage is severely
overloaded, operation time-outs can cause commands already issued to disk to be terminated (or
aborted). If a storage device is experiencing command aborts, the cause of these aborts must be
identified and corrected.
To check for command aborts

1. In the vSphere Client, select the host and display an advanced disk performance chart.

For all LUN objects, look at the counter named Commands Terminated. If the value is greater than 0
on any LUN, then storage is overloaded on that LUN. Otherwise, storage is not severely overloaded.

420 VMware vSphere: Optimize and Scale


Causes of Overloaded Storage

Slide 7-63
Excessive demand is placed on the storage device.
Storage is misconfigured. Check the following:
ƒ Number of disks per LUN
ƒ RAID level of a LUN
ƒ Assignment of array cache to a LUN

The following are the two main causes of overloaded storage:


• Excessive demand being placed on the storage device – The storage load being placed on the
device exceeds the ability of the device to respond.
• Misconfigured storage – Storage devices have a large number of configuration parameters.
These parameters include the assignment of an array cache to a LUN, the number of disks per
LUN, and the RAID level of a LUN. Choices made for any of these variables can affect the
ability of the device to handle the I/O load.
For example, if you have applications that primarily perform sequential reads, such as a data
warehousing application or a database management system, set the RAID level to favor
sequential reads (this level is typically RAID level 5). For transactional applications, which
perform random I/O, set the RAID level to favor that I/O pattern (such as RAID 1+0). Do not
mix these two different types of I/O on the same RAID-configured LUN. However, keep in
mind that today’s array technology can also automatically adjust for sequential and random I/O.
Storage vendors typically provide guidelines on proper configuration and expected performance for
different RAID levels and types of I/O load.

Module 7 Storage Optimization 421


Slow Storage
Slide 7-64

For a host’s LUNs, monitor Physical Device Read Latency and Physical Device Write Latency counters.
ƒ If average > 10ms or peak > 20ms for any LUN, then storage might be
slow on that LUN.
Or monitor the device latency (DAVG/cmd) in resxtop.
ƒ If value > 10, a problem might exist.
ƒ If value > 20, a problem exists.

Slow storage is the most common cause of performance problems in a vSphere environment. The
disk latency metrics provided by the vSphere Client are Physical Device Read Latency and Physical
Device Write Latency. These metrics show the average time for an I/O operation to complete. The
I/O operation is measured from the time the operation is submitted to the hardware adapter to the
time the completion notification is provided back to ESXi. As a result, these values represent
performance achieved by the physical storage infrastructure. Storage is slow if the disk latencies
are high.
The advanced disk performance chart in the vSphere Client provides you with latency metrics. For
all LUN objects, look at the counters named Physical Device Read Latency and Physical Device
Write Latency. For these measurements, if the average is above 10 milliseconds, or if the peaks are
above 20 milliseconds, for any LUN, then storage might be slow or overloaded on that LUN.
Otherwise, storage does not appear to be slow or overloaded.
As an alternative to using the vSphere Client, you monitor the latency metrics like DAVG/cmd with
resxtop/esxtop. To display this information, type d in the resxtop/esxtop window.
These latency values are provided only as points at which further investigation is justified. Expected
disk latencies depend on the nature of the storage workload (for example, the read/write mix,
randomness, and I/O size) and the capabilities of the storage subsystems.

422 VMware vSphere: Optimize and Scale


Factors Affecting Storage Response Time

Slide 7-65
Three main workload factors that affect storage response time:
ƒ I/O arrival rate
ƒ I/O size
ƒ I/O locality
Use the storage device’s monitoring tools to collect data to
characterize the workload.

To determine whether high disk latency values represent an actual problem, you must understand the
storage workload. Three main workload factors affect the response time of a storage subsystem:
• I/O arrival rate – A given configuration of a storage device has a maximum rate at which it can
handle specific mixes of I/O requests. When bursts of requests exceed this rate, they might need
to be queued in buffers along the I/O path. This queuing can add to the overall response time.
• I/O size – The transmission rate of storage interconnects, and the speed at which an I/O can be
read from or written to disk, are fixed quantities. As a result, large I/O operations naturally take
longer to complete. A response-time that is slow for small transfers might be expected for larger
operations.
• I/O locality – Successive I/O requests to data that is stored sequentially on disk can be
completed more quickly than to data that is spread randomly. In addition, read requests to
sequential data are more likely to be completed out of high-speed caches in the disks or arrays.
Storage devices typically provide monitoring tools for collecting data that can help characterize
storage workloads. In addition, storage vendors typically provide data on recommended
configurations and expected performance of their devices based on these workload characteristics.
The storage response times should be compared to expectations for that workload. If investigation
determines that storage response times are unexpectedly high, corrective action should be taken.

Module 7 Storage Optimization 423


Random Increase in I/O Latency on Shared Storage
Slide 7-66

Applications, when consolidated, share expensive physical resources such as storage.
Situations might occur when high I/O activity on shared storage
could impact the performance of latency-sensitive applications.
ƒ Very high number of I/O requests are issued concurrently.
ƒ Operations, such as a backup operation in a virtual machine, hog the
I/O bandwidth of a shared storage device.
To resolve these situations, use Storage I/O Control to control each
virtual machine’s access to I/O resources of a shared datastore.

Applications, when consolidated, can share expensive physical resources such as storage. Sharing
storage resources offers many advantages: ease of management, power savings and higher resource
use. However, situations might occur when high I/O activity on the shared storage could impact the
performance of certain latency-sensitive applications:
• A higher than expected number of I/O-intensive applications in virtual machines sharing a
storage device become active at the same time. As a result, these virtual machines issue a very
high number of I/O requests concurrently. Increased I/O requests increases the load on the
storage device and the response time of I/O requests also increases. The performance of virtual
machines running critical workloads might be impacted because these workloads now must
compete for shared storage resources.
• Operations, such as Storage vMotion or a backup operation running in a virtual machine,
completely hog the I/O bandwidth of a shared storage device. These operations cause other
virtual machines sharing the device to suffer from resource starvation.

424 VMware vSphere: Optimize and Scale


Storage I/O Control can be used to overcome these problems. Storage I/O Control continuously
monitors the aggregate normalized I/O latency across the shared storage devices (VMFS datastores).

Storage I/O Control can detect the existence of I/O congestion based on a preset congestion
threshold. Once the condition is detected, Storage I/O Control uses the resource controls set on the
virtual machines to control each virtual machine’s access to the shared datastore.
In addition, VMware vSphere® Storage DRS™ provides automatic or manual datastore cluster I/O
capacity and performance balancing. Storage DRS can be configured to perform automatic virtual
machine storage vMotion migrations when space use or I/O response time thresholds of a datastore
have been exceeded.

Module 7 Storage Optimization 425


Example 1: Bad Disk Throughput
Slide 7-67

resxtop output 1: good throughput, low device latency
resxtop output 2: bad throughput, high device latency (due to disabled cache)

One of the challenges of performance monitoring is determining whether a performance problem
truly exists. You can get clues from viewing the latency values.
In the example above, the first resxtop screen (type d in the window) shows the disk activity of a
host. Looking at the vmhba1 adapter, you see that very good throughput exists (135.90MB written
per second). The device latency per command is 7.22, which is low.
The second resxtop screen, however, shows low throughput (3.95MB written per second) on
vmhba1 but very high device latency (252.04). In this particular configuration, the SAN cache was
disabled. The disk cache is very important, especially for writes.
The use of caches applies not only to SAN storage arrays but also to RAID controllers. A RAID
controller typically has a cache with a battery backup. If the battery dies, the cache is disabled to
prevent any further loss of data. So a problem could occur if, one day, you are experiencing very
good performance and then the performance suddenly degrades. In this case, check the battery status
on the RAID controller to see if the battery is still good.

426 VMware vSphere: Optimize and Scale


Example 2: Virtual Machine Power On Is Slow

Slide 7-68
User complaint: Powering on a virtual machine takes longer than
usual.
ƒ Sometimes powering on a virtual machine takes 5 seconds.
ƒ Other times powering on a virtual machine takes 5 minutes!
What do you check?
ƒ Because powering on a virtual machine requires disk activity on the
host, check the disk metrics for the host.

In this example, users are complaining that, at times, powering on their virtual machines is very
slow. Sometimes it takes 5 seconds to power on a virtual machine, and that is OK. However, at other
times, a virtual machine takes 5 minutes to power on, which is unacceptable.
Because powering on a virtual machine requires disk activity on the virtual machine’s host, you
might want to check the host’s disk metrics, such as disk latency.

Module 7 Storage Optimization 427


Monitoring Disk Latency with the vSphere Client
Slide 7-69

Maximum disk latencies range from 100ms to 1100ms. This latency is very high.

If you use the vSphere Client to monitor performance, the overview chart for your host’s disk gives
you a quick look at the highest disk latency experienced by your host.
Storage vendors provide latency statistics for their hardware that can be checked against the latency
statistics in either the vSphere Client or resxtop/esxtop. When the latency numbers are high, the
hardware might be overworked by too many servers. Some examples of latency numbers include the
following:
• When reading data from the array cache, latencies of 2–5 milliseconds usually reflect a healthy
storage system.
• When data is randomly being read across the disk, latencies of 5–12 milliseconds reflect a
healthy storage architecture.
• Latencies of 15 milliseconds or greater might reflect an overutilized storage array or a storage
array that is not operating properly.
In the example above, maximum disk latencies of 100–1,100 milliseconds are being experienced.
This is very high.

428 VMware vSphere: Optimize and Scale


Monitoring Disk Latency With resxtop

Slide 7-70

The resxtop output shows very large values for DAVG/cmd and GAVG/cmd.
Rule of thumb:
ƒ GAVG/cmd > 20ms = high latency!
What does this mean?
ƒ Latency when command reaches device is high.
ƒ Latency as seen by the guest is high.
ƒ Low KAVG/cmd means command is not queuing in VMkernel.

To further investigate the problem, you can view latency information using resxtop/esxtop. As a
general rule, if the value of GAVG/cmd is greater than 20 milliseconds, you have high latency.
This example shows relatively large values for DAVG/cmd and GAVG/cmd. Therefore, this host is
experiencing high latency, specifically on vmhba1. The value for KAVG/cmd is zero, which means
that the commands are not queuing in the VMkernel.

Module 7 Storage Optimization 429


Solving the Problem of Slow Virtual Machine Power On
Slide 7-71

Host events show that a disk has connectivity issues. This leads to high latencies.

Monitor disk latencies if there is slow access to storage.


The cause of the problem might not be related to virtualization.

Solving the problem might involve not only looking at performance statistics but also looking to
see if there are any errors encountered on the host. In this example, the host’s events list shows
that a disk has connectivity issues. A storage device that is not working correctly can lead to high
latency values.

430 VMware vSphere: Optimize and Scale


Example 3: Logging In to Virtual Machines Is Slow

Slide 7-72

Scenario:
ƒ A group of virtual machines runs on a host.
ƒ Each virtual machine’s application reads from and writes to the same
NAS device.
• The NAS device is also a virtual machine.
Problem:
ƒ Users suddenly cannot log in to any of the virtual machines.
• The login process is very slow.
Initial speculation:
ƒ Virtual machines are saturating a resource.

In this example, a group of virtual machines runs on a host. An application runs in each of these
virtual machines. Each application reads from and writes to a NAS device shared by all the
applications. The NAS device is implemented as a virtual machine.
The problem is that, after some time, the users are unable to log in to any of the virtual machines
because the login process is very slow to complete. This behavior might indicate that a resource
used by the virtual machine is being saturated.
The next few slides take you through a possible sequence of steps to take to identify the problem.

Module 7 Storage Optimization 431


Monitoring Host CPU Usage
Slide 7-73

Chaotic CPU usage: host is saturated. Predictable CPU usage: host is not saturated.

Because the initial speculation is that the virtual machines are saturating a resource, you might want
to check the host’s CPU usage. In the example above, a stacked chart of CPU usage is displayed for
the virtual machines on the host. During the first interval of 4:00 a.m. to 4:00 p.m., CPU usage is
predictable and the host is not saturated. However, during the second interval, starting at 10:00 p.m.,
CPU usage has increased dramatically and indicates that the host’s CPU resources are saturated.
But what about the other resources, such as disk usage?

432 VMware vSphere: Optimize and Scale


Monitoring Host Disk Usage

Slide 7-74

Predictable, balanced disk usage. Uneven, reduced disk usage.

On checking the host’s disk usage, a different pattern exists. The stacked chart of virtual machine
disk usage shows that between 4:00 a.m. and 4:00 p.m., disk usage is balanced and predictable.
However, during the interval starting at 10:00 p.m., disk usage is reduced and somewhat uneven.
Because overall disk throughput is reduced, the next task is to break down the disk throughput into
the read rate and write rate.

Module 7 Storage Optimization 433


Monitoring Disk Throughput
Slide 7-75

Steady read and write traffic. Increased write traffic; zero read traffic.

Disk throughput can be monitored in terms of the disk read rate and disk write rate. The line chart
above shows the disk read rate and the disk write rate for the HBA being used on the host. During
the interval of 4:00 a.m. and 4:00 p.m., read and write activity is steady. However, during the
interval starting at 10:00 p.m., the write traffic increased and the read traffic dropped to zero.
Figuring out why the read traffic dropped to zero can help pinpoint the problem.

434 VMware vSphere: Optimize and Scale


Solving the Problem of Slow Virtual Machine Login

Slide 7-76

Problem diagnosis:
ƒ CPU usage increased per virtual machine.
ƒ Write traffic increased per virtual machine.
ƒ Write traffic to the NAS virtual machine significantly increased.
One possible problem:
ƒ An application bug in the virtual machine caused the error condition:
• The error condition caused excessive writes to the NAS virtual machine.
• Each virtual machine is so busy writing that it never reads.

To summarize, something caused excessive writes to occur from each virtual machine. At
10:00 p.m., an increase in CPU usage occurred for each virtual machine. Also at the same time, a
big increase in write traffic to the NAS virtual machine happened.
A possible problem, which has nothing to do with the storage hardware or virtual machine
configuration, could be that the application running in the virtual machine has a bug. An error
condition in the application caused excessive writes by each virtual machine running the application
to the same NAS device. Each virtual machine is so busy writing data that it never reads.
At this point, you might want to monitor the application itself to see if the application is indeed
the problem.

Module 7 Storage Optimization 435


Resolving Storage Performance Problems
Slide 7-77

Consider the following when resolving storage performance


problems:
ƒ Check your hardware for proper operation and optimal configuration.
ƒ Reduce the need for storage by your hosts and virtual machines.
ƒ Balance the load across available storage.
ƒ Understand the load being placed on storage devices.

Due to the complexity and variety of storage infrastructures, providing specific solutions for slow or
overloaded storage is impossible. In a vSphere environment, beyond rebalancing the load, there is
typically little that can be done from within vSphere to solve problems related to slow or
overloaded storage. Use the tools provided by your storage vendor to monitor the workload placed
on your storage device, and follow your storage vendor’s configuration recommendations to
configure the storage device to meet your demands.
To help guide the investigation of performance issues in the storage infrastructure, follow these
general considerations:
• Check that your hardware is operating properly and is using an optimal configuration.
• Reduce the storage needs of your hosts and virtual machines.
• Balance the I/O workload across all available LUNs.
• Understand the workload being placed on your storage devices.

436 VMware vSphere: Optimize and Scale


Checking Storage Hardware

Slide 7-78

To resolve the problems of slow or overloaded storage, solutions can
include the following:
ƒ Ensure that hardware is working properly.
ƒ Configure the HBAs and RAID controllers for optimal use.
ƒ Upgrade your hardware, if possible.

Excessive disk throughput can lead to overloaded storage. Throughput in a vSphere environment
depends on many factors related to both hardware and software. Among the important factors are
Fibre Channel link speed, the number of outstanding I/O requests, the number of disk spindles,
RAID type, SCSI reservations, and caching or prefetching algorithms.
High latency is an indicator of slow storage. Latency depends on many factors, including the
following:
• Queue depth or capacity at various levels
• I/O request size
• Disk properties such as rotational, seek, and access delays
• SCSI reservations
• Caching or prefetching algorithms
Make sure that your storage hardware is working properly and is configured according to your
storage vendor’s recommendations. Periodically check the host’s event logs to make sure that
nothing unusual is occurring.

Module 7 Storage Optimization 437


Reducing the Need for Storage
Slide 7-79

Consider the trade-off between memory capacity and storage


demand.
ƒ Some applications, such as databases, cache frequently used data in
memory, thus reducing storage loads.
Eliminate all possible swapping to reduce the burden on the storage
subsystem.

Consider the trade-off between memory capacity and storage demand. Some applications can take
advantage of additional memory capacity to cache frequently used data, thus reducing storage loads.
For example, databases can use system memory to cache data and avoid disk access. Check the
applications running in your virtual machines to see if they might benefit from increased caches.
Provide more memory to those virtual machines if resources permit, which might reduce the burden
on the storage subsystem and also have the added benefit of improving application performance.
Refer to the application-specific documentation to determine whether an application can take
advantage of additional memory. Some applications require changes to configuration parameters in
order to make use of additional memory.
Also, eliminate all possible swapping to reduce the burden on the storage subsystem. First, verify
that the virtual machines have the memory they need by checking the swap statistics in the guest.
Provide memory if resources permit, then try to eliminate host-level swapping.

438 VMware vSphere: Optimize and Scale


Balancing the Load

Slide 7-80

Spread I/O loads over the
available paths to the storage.
Configure VMware vSphere®
Storage DRS to balance
capacity and IOPS.
For disk-intensive workloads:
ƒ Use enough HBAs to handle
the load.
ƒ If necessary, separate storage
processors to separate
systems.

To optimize storage array performance, spread I/O loads across multiple HBAs and storage
processors. Spreading the I/O load provides higher throughput and less latency for each application.
Make sure that each server has a sufficient number of HBAs to allow maximum throughput for all
the applications hosted on the server for the peak period.
vSphere Storage DRS helps balance I/O loads by migrating a virtual machine’s files across multiple
datastores when latency reaches a threshold.
If you have heavy disk I/O loads, you might need to assign separate storage processors to separate
systems to handle the amount of traffic bound for storage.

Module 7 Storage Optimization 439


Understanding Load Placed on Storage Devices
Slide 7-81

Understand the workload:


ƒ Use storage array tools
to capture workload statistics.
Strive for complementary
workloads:
ƒ Mix disk-intensive with non-
disk-intensive virtual
machines on a datastore.
ƒ Mix virtual machines with
different peak access times.
ƒ Configure VMware vSphere®
Storage I/O Control to
prioritize IOPS when latency
thresholds are reached.

Understand the load being placed on the storage device. To do this, many storage arrays have tools
that allow workload statistics (such as IOPS, I/O sizes, and queuing data) to be captured. Refer to
the vendor’s documentation for details. High-level data provided by the vSphere Client and
resxtop can be useful. The vscsiStats tool can also provide detailed information on the I/O
workload generated by a virtual machine’s virtual SCSI device.
Strive for complementary workloads. The idea is for the virtual machines not to overtax any of the
four resources (CPU, memory, networking, and storage). For example, the workload of a Web server
virtual machine is mainly memory-intensive and occasionally disk-intensive. The workload of a
Citrix server virtual machine is mainly CPU- and memory-intensive. Therefore, place the Web
server and Citrix server virtual machines on the same datastore. Or place a firewall virtual machine
(which is mainly CPU-, memory-, and network-intensive) on the same datastore as a database server
(which is CPU-, memory-, and disk-intensive).
Also consider peak usage time to determine virtual machine placement. For example, place
Microsoft Exchange Server virtual machines on the same datastore if the peak usage of each virtual
machine occurs at a different time of day. In the example above, peak usage time for the users in
Asia will most likely differ from peak usage time for the users in Europe.
As you capture long-term performance data, consider using cold migrations or Storage vMotion to
balance and optimize your datastore workloads.

440 VMware vSphere: Optimize and Scale


Storage Performance Best Practices

Slide 7-82

Configure each LUN with the right RAID level and storage
characteristics for the applications in virtual machines that will use it.
ƒ Use VMFS file systems for your virtual machines.
ƒ Avoid oversubscribing paths (SAN) and links (iSCSI and NFS).
ƒ Isolate iSCSI and NFS traffic.
ƒ Applications that write a lot of data to storage should not share Ethernet
links to a storage device.
ƒ Postpone major storage maintenance until off-peak hours.
ƒ Eliminate all possible swapping to reduce the burden on the storage
subsystem.
ƒ In SAN configurations, spread I/O loads over the available paths to the
storage devices.
ƒ Strive for complementary workloads.

Best practices are recommendations for the configuration, operation, and management of the
architecture. Best practices are developed in the industry over time. As technology evolves, so do
best practices. Always balance best practices against an organization’s objectives.

Module 7 Storage Optimization 441


Review of Learner Objectives
Slide 7-83

You should be able to do the following:


ƒ Describe various storage performance problems.
ƒ Discuss the causes of storage performance problems.
ƒ Propose solutions to correct storage performance problems.
ƒ Discuss examples of troubleshooting storage performance problems.

442 VMware vSphere: Optimize and Scale


Key Points

Slide 7-84

Factors that affect storage performance include storage protocols,
storage configuration, queuing, and VMFS configuration.
ƒ Disk throughput and latency are two key metrics to monitor storage
performance.
ƒ Overloaded storage and slow storage are common storage
performance problems.

Questions?

Module 7 Storage Optimization 443


444 VMware vSphere: Optimize and Scale
MODULE 8

CPU Optimization
Slide 8-1

Module 8
CPU Optimization

VMware vSphere: Optimize and Scale 445


You Are Here
Slide 8-2

Course map (this module: CPU Optimization): Course Introduction, VMware Management Resources,
Performance in a Virtualized Environment, Network Scalability, Network Optimization, Storage
Scalability, Storage Optimization, CPU Optimization, Memory Performance, VM and Cluster
Optimization, Host and Management Scalability.

446 VMware vSphere: Optimize and Scale


Importance
Slide 8-3

In a VMware vSphere® environment, multiple virtual machines run on


the same host. To prevent a situation where one or more virtual

machines are allocated insufficient CPU resources, you should know
how to monitor both host and virtual machine CPU usage.

Likewise, to resolve a situation where the CPU of a host or virtual
machine becomes saturated, you should know how to troubleshoot
such a problem.

Module 8 CPU Optimization 447


Module Lessons
Slide 8-4

Lesson 1: CPU Virtualization Concepts


Lesson 2: Monitoring CPU Activity
Lesson 3: Troubleshooting CPU Performance Problems

448 VMware vSphere: Optimize and Scale


Lesson 1: CPU Virtualization Concepts
Slide 8-5

Lesson 1:
CPU Virtualization Concepts

Module 8 CPU Optimization 449


Learner Objectives
Slide 8-6

After this lesson, you should be able to do the following:


ƒ Discuss the CPU scheduler features.
ƒ Discuss what affects CPU performance.

450 VMware vSphere: Optimize and Scale


CPU Scheduler Overview
Slide 8-7

The CPU scheduler is crucial to providing good performance in a


consolidated environment.

The CPU scheduler has the following features:
ƒ Schedules virtual CPUs (vCPUs) on physical CPUs
ƒ Enforces the proportional-share algorithm for CPU usage
ƒ Supports SMP virtual machines
ƒ Uses relaxed co-scheduling for SMP virtual machines
ƒ Is NUMA/processor/cache topology–aware

The VMkernel CPU scheduler is crucial to providing good performance in a consolidated


environment. Because most modern processors are equipped with multiple cores per processor, it is
easy to build a system with tens of cores running hundreds of virtual machines. In such a large
system, allocating CPU resource efficiently and fairly is critical.
The CPU scheduler has the features noted above. These features are discussed in the next
several slides.

Module 8 CPU Optimization 451


What Is a World?
Slide 8-8

A world is an execution context, scheduled on a processor.
ƒ A world is like a process in conventional operating systems.
A virtual machine is a collection, or group, of worlds:
ƒ One for each virtual machine’s vCPU
ƒ One for the virtual machine’s mouse, keyboard and screen (MKS)
ƒ One for the virtual machine monitor (VMM)
The CPU scheduler chooses which world to schedule on a processor.

The role of the CPU scheduler is to assign execution contexts to processors in a way that meets
system objectives such as responsiveness, throughput, and utilization. On conventional operating
systems, the execution context corresponds to a process or a thread. On VMware vSphere® ESXi™
hosts, the execution context corresponds to a world.
A virtual machine is a collection of worlds, with some worlds being virtual CPUs (vCPUs) and
other threads doing additional work. For example, a virtual machine consists of a world that controls
the mouse, keyboard, and screen (MKS). The virtual machine also has a world for its virtual
machine monitor (VMM).
There are nonvirtual machine worlds as well. These nonvirtual machine worlds are VMkernel
worlds and are used to perform various system tasks. Examples of these nonvirtual machine worlds
include the idle, driver, and vmotion worlds.

452 VMware vSphere: Optimize and Scale


CPU Scheduler Features
Slide 8-9

Allocates CPU resources dynamically and transparently:


ƒ Schedules vCPUs on physical CPUs
ƒ Checks physical CPU utilization every 2 to 40 milliseconds and migrates vCPUs as necessary
Enforces the proportional-share algorithm for CPU usage:
ƒ When CPU resources are overcommitted, hosts time-slice physical
CPUs across all virtual machines.
ƒ Each vCPU is prioritized by resource allocation settings (shares,
reservations, limits).

One of the main tasks of the CPU scheduler is to choose which world is to be scheduled to a
processor. If the target processor is already occupied, the scheduler must decide whether to preempt
the currently running world on behalf of the chosen one.
The CPU scheduler checks physical CPU utilization every 2 to 40 milliseconds, depending on the
CPU type, and migrates vCPUs as necessary. A world migrates from a busy processor to an idle
processor. A world migration can be initiated either by a physical CPU that becomes idle or by a
world that becomes ready to be scheduled.
An ESXi host implements the proportional-share-based algorithm. When CPU resources are
overcommitted, the ESXi host time-slices the physical CPUs across all virtual machines so that each
virtual machine runs as if it had its specified number of virtual processors. It associates each world
with a share of CPU resource.
This is called entitlement. Entitlement is calculated from user-provided resource specifications like
shares, reservation, and limits. When making scheduling decisions, the ratio of the consumed CPU
resource to the entitlement is used as the priority of the world. If there is a world that has consumed
less than its entitlement, the world is considered high priority and will likely be chosen to run next.
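
A minimal sketch of this idea, not the actual VMkernel implementation: the world whose consumed CPU is lowest relative to its entitlement is the most likely candidate to be scheduled next. The world names and millisecond values below are invented for illustration.

# Conceptual illustration of proportional-share scheduling; not real VMkernel code.
worlds = {
    "vm-web":  {"consumed_ms": 400.0, "entitlement_ms": 1000.0},
    "vm-db":   {"consumed_ms": 900.0, "entitlement_ms": 1000.0},
    "vm-test": {"consumed_ms": 300.0, "entitlement_ms": 250.0},
}

def priority_ratio(stats):
    # A lower consumed/entitlement ratio means the world is further below its
    # entitlement and is therefore a higher-priority candidate to run next.
    return stats["consumed_ms"] / stats["entitlement_ms"]

next_world = min(worlds, key=lambda name: priority_ratio(worlds[name]))
print("Next world to schedule:", next_world)  # vm-web has consumed the least relative to its entitlement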

Module 8 CPU Optimization 453


CPU Scheduler Features: Support for SMP VMs
Slide 8-10

VMware vSphere® ESXi™ uses a form of co-scheduling that is optimized to run


SMP virtual machines efficiently.

Co-scheduling refers to a technique for scheduling related processes to run on


different processors at the same time.
ƒ At any time, each vCPU might be scheduled, descheduled, preempted, or blocked
while waiting for some event.
The CPU scheduler takes “skew” into account when scheduling vCPUs in an
SMP virtual machine.
ƒ Skew is the difference in execution rates between two or more vCPUs in an SMP
virtual machine.
ƒ A vCPU’s skew increases when it is not making progress but one of its sibling
vCPUs is.
ƒ A vCPU is considered to be skewed if its cumulative skew exceeds a threshold.

A symmetric multiprocessing (SMP) virtual machine presents the guest operating system and
applications with the illusion that they are running on a dedicated physical multiprocessor. An
ESXi host implements this illusion by supporting co-scheduling of the vCPUs within an SMP
virtual machine.
Co-scheduling refers to a technique used for scheduling related processes to run on different
processors at the same time. An ESXi host uses a form of co-scheduling that is optimized for
running SMP virtual machines efficiently.

454 VMware vSphere: Optimize and Scale


At any particular time, each vCPU might be scheduled, descheduled, preempted, or blocked while
waiting for some event. Without co-scheduling, the vCPUs associated with an SMP virtual machine
would be scheduled independently, breaking the guest's assumptions regarding uniform progress.
The word “skew” is used to refer to the difference in execution rates between two or more vCPUs
associated with an SMP virtual machine.
The ESXi scheduler maintains a fine-grained cumulative skew value for each vCPU within an SMP

virtual machine. A vCPU is considered to be making progress if it consumes CPU in the guest level
or halts. The time spent in the hypervisor is excluded from the progress. This means that the

hypervisor execution might not always be co-scheduled. This is acceptable because not all
operations in the hypervisor benefit from being co-scheduled. When it is beneficial, the hypervisor
makes explicit co-scheduling requests to achieve good performance.
The progress of each vCPU in an SMP virtual machine is tracked individually. The skew is
measured as the difference in progress between the slowest vCPU and each of the other vCPUs. A
vCPU is considered to be skewed if its cumulative skew value exceeds a configurable threshold,
typically a few milliseconds.
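
The skew bookkeeping described above can be sketched as follows. This is an illustration only: the progress values are invented, and the 5 ms threshold is an assumption standing in for the "few milliseconds" mentioned in the text.

# Illustrative sketch of skew measurement for an SMP virtual machine; not VMkernel code.
progress_ms = {"vcpu0": 120.0, "vcpu1": 118.5, "vcpu2": 110.0}  # guest-level progress per vCPU
SKEW_THRESHOLD_MS = 5.0  # assumed value for this example

slowest = min(progress_ms.values())
# Skew: the difference in progress between the slowest vCPU and each of the others.
skew = {vcpu: p - slowest for vcpu, p in progress_ms.items()}

if max(skew.values()) > SKEW_THRESHOLD_MS:
    lagging = [v for v, p in progress_ms.items() if p == slowest]
    too_far_ahead = [v for v, s in skew.items() if s > SKEW_THRESHOLD_MS]
    # With relaxed co-scheduling (next slide), the vCPUs that advanced too far are stopped
    # individually until the lagging vCPUs are co-started and catch up.
    print("lagging vCPUs:", lagging, "| stopped until they catch up:", too_far_ahead)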

Module 8 CPU Optimization 455


CPU Scheduler Feature: Relaxed Co-Scheduling
Slide 8-11

Relaxed co-scheduling refers to a technique for scheduling a subset


of a virtual machine’s vCPUs simultaneously after skew has been
detected:
ƒ Lowers the number of required physical CPUs for the virtual machine to
co-start
ƒ Increases CPU utilization
There is no co-scheduling overhead for an idle vCPU.

With relaxed co-scheduling, only those vCPUs that are skewed must be co-started. This ensures that
when any vCPU is scheduled, all other vCPUs that are “behind” will also be scheduled, reducing
skew. This approach is called relaxed co-scheduling because only a subset of a virtual machine’s
vCPUs must be scheduled simultaneously after skew has been detected.
The vCPUs that advanced too much are individually stopped. After the lagging vCPUs catch up, the
stopped vCPUs can start individually. Co-scheduling all vCPUs is still attempted to maximize the
performance benefit of co-scheduling.
There is no co-scheduling overhead for an idle vCPU, because the skew does not grow when a
vCPU halts. An idle vCPU does not accumulate skew and is treated as if it were running for co-
scheduling purposes. This optimization ensures that idle guest vCPUs do not waste physical
processor resources, which can instead be allocated to other virtual machines.
For example, an ESXi host with two physical cores might be running one vCPU each from two
different virtual machines, if their sibling vCPUs are idling, without incurring any co-scheduling
overhead.

456 VMware vSphere: Optimize and Scale


CPU Scheduler Feature: Processor Topology/Cache Aware
Slide 8-12

The CPU scheduler uses processor topology information to optimize


the placement of vCPUs onto different sockets.

The CPU scheduler spreads load across all sockets to maximize the
aggregate amount of cache available.

ƒ Cores within a single socket typically use a shared last-level cache.
ƒ Use of a shared last-level cache can improve vCPU performance if
running memory-intensive workloads.

The CPU scheduler can interpret processor topology, including the relationship between sockets,
cores, and logical processors. The scheduler uses topology information to optimize the placement of
vCPUs onto different sockets to maximize overall cache utilization and to improve cache affinity by
minimizing vCPU migrations.
A dual-core processor can provide almost double the performance of a single-core processor by
allowing two vCPUs to execute at the same time. Cores within the same processor are typically
configured with a shared last-level cache used by all cores, potentially reducing the need to access
slower, main memory. Last-level cache is a memory cache that has a dedicated channel to a CPU
socket (bypassing the main memory bus), enabling it to run at the full speed of the CPU.
By default, in undercommitted systems, the CPU scheduler spreads the load across all sockets. This
improves performance by maximizing the aggregate amount of cache available to the running
vCPUs. As a result, the vCPUs of a single SMP virtual machine are spread across multiple sockets.
In some cases, such as when an SMP virtual machine exhibits significant data sharing between its
vCPUs, this default behavior might be suboptimal. For such workloads, it can be beneficial to
schedule all of the vCPUs on the same socket, with a shared last-level cache, even when the ESXi
host is undercommitted. In such scenarios, you can override the default behavior of spreading
vCPUs across packages by including the following configuration option in the virtual machine’s
VMX configuration file: sched.cpu.vsmpConsolidate="TRUE".

Module 8 CPU Optimization 457


CPU Scheduler Feature: NUMA-Aware
Slide 8-13

Each CPU on a non-uniform memory access (NUMA) server has local


memory directly connected by one or more local memory controllers.
ƒ Processes running on a CPU can access this local memory faster than
memory on a remote CPU in the same server.
ƒ Poor NUMA locality refers to the situation where a high percentage of a
virtual machine’s memory is not local.
The NUMA scheduler restricts a virtual machine’s vCPUs to a single
socket to take advantage of the cache.
ƒ This can be manually overridden.

In a nonuniform memory access (NUMA) server, the delay incurred when accessing memory varies
for different memory locations. Each processor chip on a NUMA server has local memory directly
connected by one or more local memory controllers. For processes running on that chip, this
memory can be accessed more rapidly than memory connected to other, remote processor chips.
When a high percentage of a virtual machine’s memory is located on a remote processor, this means
that the virtual machine has poor NUMA locality. When this happens, the virtual machine’s
performance might be less than if its memory were all local. All AMD Opteron processors and some
recent Intel Xeon processors use NUMA architecture.
When running on a NUMA server, the CPU scheduler assigns each virtual machine to a home node.
In selecting a home node for a virtual machine, the scheduler attempts to keep both the virtual
machine and its memory located on the same node, thus maintaining good NUMA locality.
However, there are some instances where a virtual machine will have poor NUMA locality despite
the efforts of the scheduler:

458 VMware vSphere: Optimize and Scale


• Virtual machine with more vCPUs than available per processor chip – The number of CPU
cores in a home node is equivalent to the number of cores in each physical processor chip.
When a virtual machine has more vCPUs than there are logical threads in a home node, the
CPU scheduler does not attempt to use NUMA optimizations for that virtual machine. This
means that the virtual machine’s memory is not migrated to be local to the processors on which
its vCPUs are running, leading to poor NUMA locality.

• Virtual machine memory size is greater than memory per NUMA node – Each physical
processor in a NUMA server is usually configured with an equal amount of local memory. For

example, a four-socket NUMA server with a total of 64GB of memory typically has 16GB
locally on each processor. If a virtual machine has a memory size greater than the amount of
memory local to each processor, the CPU scheduler does not attempt to use NUMA
optimizations for that virtual machine. This means that the virtual machine’s memory is not
migrated to be local to the processors on which its vCPUs are running, leading to poor NUMA
locality.
• The ESXi host is experiencing moderate to high CPU loads – When multiple virtual machines
are running on an ESXi host, some of the virtual machines end up sharing the same home node.
When a virtual machine is ready to run, the scheduler attempts to schedule the virtual machine
on its home node. However, if the load on the home node is high, and other home nodes are less
heavily loaded, the scheduler might move the virtual machine to a different home node.
Assuming that most of the virtual machine’s memory was located on the original home node,
this causes the virtual machine that is moved to have poor NUMA locality.
However, in return for the poor locality, the virtual machines on the home node experience a
temporary increase in available CPU resources. The scheduler eventually migrates the virtual
machine’s memory to the new home node, thus restoring NUMA locality. The poor NUMA locality
experienced by the virtual machine that was moved is a temporary situation and is seldom the source
of performance problems. However, frequent rebalancing of virtual machines between local and
remote nodes might be a symptom of other problems, such as host CPU saturation.
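
The first two conditions are static and easy to check ahead of time. The following sketch is illustrative only: the function, the core count, and the virtual machine values are assumptions, and the per-node memory figure reuses the four-socket, 64GB example from the text.

def numa_optimization_expected(vm_vcpus, vm_memory_gb, cores_per_node, memory_per_node_gb):
    """Rough check of the two static NUMA-locality conditions described above (illustrative only)."""
    reasons = []
    if vm_vcpus > cores_per_node:
        reasons.append("the VM has more vCPUs than there are cores in a home node")
    if vm_memory_gb > memory_per_node_gb:
        reasons.append("the VM memory size exceeds the memory local to one node")
    return len(reasons) == 0, reasons

# Four-socket NUMA server with 64GB total memory: 16GB local to each node.
# Assumed six cores per node and an example VM with 4 vCPUs and 24GB of memory.
print(numa_optimization_expected(vm_vcpus=4, vm_memory_gb=24,
                                 cores_per_node=6, memory_per_node_gb=64 / 4))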

Module 8 CPU Optimization 459


Wide-VM NUMA Support
Slide 8-14

A Wide-VM has more vCPUs than the available cores on a NUMA node.
ƒ For example, a 4-vCPU SMP virtual machine is considered “wide” on a two-socket
(dual core) system.
ƒ Only the cores count (hyperthreading threads do not).
• An 8-vCPU SMP virtual machine is considered wide on a two-socket (quad core) system
with hyperthreading enabled because the processor has only four cores per NUMA node.

Wide-VM NUMA support splits a wide-VM into smaller NUMA clients.


ƒ Wide-VM NUMA support assigns a home node to each NUMA client.
ƒ For example, a 4-vCPU SMP virtual machine on a two-socket (dual core) system
has two 2-vCPU NUMA clients.
ƒ An 8-vCPU SMP virtual machine on a two-socket (quad core) system has two 4-
vCPU NUMA clients.
ƒ The wide-VM has multiple home nodes because it consists of multiple clients, each
with its own home node.

Wide virtual machines have more vCPUs than the number of cores in each physical NUMA node.
These virtual machines are assigned two or more NUMA nodes and are preferentially allocated
memory local to those NUMA nodes.
Because vCPUs in these wide virtual machines might sometimes need to access memory outside
their own NUMA node, the virtual machine might experience higher average memory access
latencies than virtual machines that fit entirely within a NUMA node.
Memory latency can be mitigated by appropriately configuring Virtual NUMA on the virtual
machines. Virtual NUMA allows the guest operating system to take on part of the memory-locality
management task.
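
The split described on the slide can be estimated with simple arithmetic. This sketch is illustrative; the function name is an assumption, and only cores are counted, as the slide notes.

import math

def numa_clients(vcpus, cores_per_node):
    """Estimate how a wide-VM is split into NUMA clients (cores only; hyperthreads do not count)."""
    clients = math.ceil(vcpus / cores_per_node)
    vcpus_per_client = math.ceil(vcpus / clients)
    return clients, vcpus_per_client

# Examples from the slide:
print(numa_clients(4, 2))  # 4-vCPU VM, two-socket dual-core system  -> (2, 2): two 2-vCPU clients
print(numa_clients(8, 4))  # 8-vCPU VM, two-socket quad-core system -> (2, 4): two 4-vCPU clients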

460 VMware vSphere: Optimize and Scale


Performance Impact with Wide-VM NUMA Support
Slide 8-15

Consider an 8-vCPU VM on a two-socket (quad core) system.


ƒ Assuming uniform memory access pattern, about 50% of memory accesses are local, because there
are two NUMA nodes without wide-VM NUMA support.
ƒ In this case, wide-VM NUMA support won’t make much performance difference.
On a four-socket system, only about 25 percent of memory accesses
are local.
ƒ Wide-VM NUMA support likely improves it to 50 percent local accesses
due to memory interleaving allocations.
ƒ The performance benefit is even greater on bigger systems.

A slight performance advantage can occur in some environments that have virtual machines
configured with no more vCPUs than the number of cores in each physical NUMA node.
Some memory bandwidth bottlenecked workloads can benefit from the increased aggregate memory
bandwidth available when a virtual machine that would fit within one NUMA node is nevertheless
split across multiple NUMA nodes. This split can be accomplished by using the
numa.vcpu.maxPerMachineNode option.
On hyperthreaded systems, virtual machines with a number of vCPUs greater than the number of
cores in a NUMA node, but lower than the number of logical processors in each physical NUMA
node, might benefit from using logical processors with local memory instead of full cores with
remote memory. This behavior can be configured for a specific virtual machine with the
numa.vcpu.preferHT flag.
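
The local-access percentages quoted on the slide follow from a uniform-access approximation: roughly 1 divided by the number of NUMA nodes that the virtual machine's memory spans. The sketch below simply restates that arithmetic and is illustrative only.

def local_access_fraction(nodes_spanned):
    """Uniform-access approximation: the local fraction is about 1 / nodes spanned."""
    return 1.0 / nodes_spanned

# Two-socket system: the memory spans two nodes either way, so about 50% local.
print(local_access_fraction(2))
# Four-socket system without wide-VM NUMA support: about 25% local.
# With wide-VM NUMA support the VM is split into two clients, so about 50% local again.
print(local_access_fraction(4), local_access_fraction(2))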

Module 8 CPU Optimization 461


What Affects CPU Performance?
Slide 8-16

Idling virtual machines:


ƒ Consider the overhead of delivering guest timer interrupts.
CPU affinity:
ƒ CPU affinity constrains the scheduler and can cause an imbalanced
load.
SMP virtual machines:
ƒ Some co-scheduling overhead is incurred.
Insufficient CPU resources to satisfy demand:
ƒ If CPU contention exists, the scheduler forces vCPUs of lower-priority
virtual machines to queue their CPU requests in deference to higher-
priority virtual machines.

You might think that idle virtual machines do not cost anything in terms of performance. However,
timer interrupts still need to be delivered to these virtual machines. Lower timer interrupt rates can
help a guest operating system’s performance.
Using CPU affinity can have a positive effect for the virtual machine that is pinned to specific
physical CPUs. However, for the system as a whole, CPU affinity constrains the scheduler and can
cause an improperly balanced load. VMware® strongly recommends against using CPU affinity.
Try to use as few vCPUs in your virtual machines as possible, to reduce both the number of timer
interrupts required and any co-scheduling overhead that might be incurred. Also, use
SMP kernels (and not uniprocessor kernels) in SMP virtual machines.
CPU capacity is a finite resource. Even on a server that allows additional processors to be
configured, there is always a maximum number of processors that can be installed. As a result,
performance problems might occur when there are insufficient CPU resources to satisfy demand.

462 VMware vSphere: Optimize and Scale


Warning Sign: Ready Time
Slide 8-17

vCPUs are allocated CPU cycles on an assigned physical CPU based


on the proportional-share algorithm enforced by the CPU scheduler.

ƒ If a vCPU tries to execute a CPU instruction while no cycles are
available on the physical CPU, the request is queued.

ƒ A physical CPU with no cycles could be due to high load on the
physical CPU or a higher-priority vCPU receiving preference.
The amount of time that the vCPU waits for the physical CPU to
become available is called ready time.
ƒ This latency can affect performance of the guest operating system and
its applications within a virtual machine.

To achieve best performance in a consolidated environment, you must consider ready time. Ready
time is the time a virtual machine must wait in the queue in a ready-to-run state before it can be
scheduled on a CPU.
When multiple processes are trying to use the same physical CPU, that CPU might not be
immediately available, and a process must wait before the ESXi host can allocate a CPU to it. The
CPU scheduler manages access to the physical CPUs on the host system.

Module 8 CPU Optimization 463


Review of Learner Objectives
Slide 8-18

You should be able to do the following:


ƒ Discuss the CPU scheduler features.
ƒ Discuss what affects CPU performance.

464 VMware vSphere: Optimize and Scale


Lesson 2: Monitoring CPU Activity
Slide 8-19

Lesson 2:
Monitoring CPU Activity

Module 8 CPU Optimization 465


Learner Objectives
Slide 8-20

After this lesson, you should be able to do the following:


ƒ Identify key CPU metrics to monitor.
ƒ Use metrics in VMware® vCenter Server™ and resxtop.
ƒ Monitor CPU usage.

466 VMware vSphere: Optimize and Scale


CPU Metrics to Monitor
Slide 8-21

Key metrics:
ƒ Host CPU used
ƒ Virtual machine CPU used
ƒ Virtual machine CPU ready time
The VMware vSphere® Client™ displays information in units of time
(milliseconds).
resxtop displays information in percentages (of sample time).

To accurately determine how well the CPU is performing, you need to look at specific information
available from the monitoring tools.
The values obtained from the VMware vSphere® Client™ and resxtop are gathered based on
different sampling intervals. By default, the vSphere Client uses a sampling interval of 20 seconds
and resxtop uses a sampling interval of 5 seconds. Be sure to normalize the values before
comparing output from both monitoring tools:
• Used (CPU – host). Amount of time that the host’s CPU (physical CPU) was used during the
sampling period.
• Usage (CPU – virtual machine). Amount of time that the virtual machine’s CPU (vCPU) was
actively using the physical CPU. For virtual SMP virtual machines, this can be displayed as an
aggregate of all vCPUs in the virtual machine or per vCPU.
• Ready (CPU – virtual machine). Amount of time the virtual machine’s CPU (vCPU) was ready
but could not get scheduled to run on the physical CPU. CPU ready time is dependent on the
number of virtual machines on the host and their CPU loads. For virtual SMP virtual machines,
ready can be displayed as an aggregate of all vCPUs in the virtual machine or per vCPU.

Module 8 CPU Optimization 467


Viewing CPU Metrics in the vSphere Client
Slide 8-22

Use the vSphere Client CPU performance charts to monitor CPU usage for hosts, clusters, resource
pools, virtual machines, and VMware vSphere® vApps™.
There are two important caveats about the vSphere Client performance charts:
• Only two counters at a time can be displayed in the chart.
• The virtual machine object shows aggregated data for all the CPUs in a virtual SMP virtual
machine. The numbered objects (in the example above, 0 and 1) track the performance of the
individual vCPUs.
A short spike in CPU used or CPU ready indicates that you are making the best use of the host
resources. However, if both values are constantly high, the hosts are probably overcommitted.
Generally, if the CPU used value for a virtual machine is above 90 percent and the CPU Ready
value is above 20 percent, performance is negatively affected.
VMware documentation provides performance thresholds in time values and percentages. To
convert time values to percentages, divide the time value by the sample interval. For the vSphere
Client, the default sampling interval is 20 seconds (or 20,000 milliseconds). In the example above,
the CPU ready value for vCPU0 is 14,996 milliseconds or 75 percent (14,996/20,000).
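
That conversion is easy to script. This sketch is illustrative only and simply reuses the numbers from the example above.

def ready_ms_to_percent(ready_ms, sample_interval_ms=20000):
    """Convert a vSphere Client ready-time value (milliseconds) to a percentage of the sample interval.

    20,000 ms is the default real-time sampling interval in the vSphere Client.
    """
    return 100.0 * ready_ms / sample_interval_ms

# The example above: 14,996 ms of ready time in a 20-second sample is about 75 percent.
print(round(ready_ms_to_percent(14996)))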

468 VMware vSphere: Optimize and Scale


CPU Performance Analysis Using resxtop
Slide 8-23

PCPU USED(%) – CPU utilization per physical CPU

Per-group statistics:
ƒ %USED – Utilization (includes %SYS)
ƒ %SYS – VMkernel system activity time
ƒ %RDY – Ready time
ƒ %WAIT – Wait and idling time
ƒ %CSTP – Pending co-schedule
ƒ %MLMTD – Time not running because of a CPU limit
ƒ NWLD – Number of worlds associated with a given group

The resxtop command gives much of the same information as the vSphere Client but supplements
the displays with more detailed information.
In resxtop, a group refers to a resource pool, a running virtual machine, or a nonvirtual machine
world. For worlds belonging to a virtual machine, statistics for the running virtual machine are
displayed. The CPU statistics displayed for a group are the following:
• PCPU USED(%) – CPU utilization per physical CPU (includes logical CPU).
• %USED – CPU utilization. This is the percentage of physical CPU core cycles used by a group
of worlds (resource pools, running virtual machines, or other worlds). This includes %SYS.
• %SYS – Percentage of time spent in the ESXi VMkernel on behalf of the world/resource pool
to process interrupts and to perform other system activities.
• %RDY – Percentage of time the group was ready to run but was not provided CPU resources on
which to execute.
• %WAIT – Percentage of time the group spent in the blocked or busy wait state. This includes
the percentage of time the group was idle.

Module 8 CPU Optimization 469


• %CSTP – Percentage of time the vCPUs of a virtual machine spent in the co-stopped state,
waiting to be co-started. This gives an indication of the co-scheduling overhead incurred by the
virtual machine. If this value is low, then any performance problems should be attributed to
other issues and not to the co-scheduling of the virtual machine’s vCPUs.
• %MLMTD – Percentage of time the VMkernel did not run the resource pool/world because that
would violate the resource pool/world’s limit setting.
• NWLD – Number of worlds associated with a given group. Each world can consume 100
percent of a physical CPU, which is why you might see some unexpanded groups with %USED
above 100 percent.
You can convert the %USED, %RDY, and %WAIT values to absolute time by multiplying the value
by the sample time.
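
For example, the conversion to absolute time might look like the following sketch (illustrative only; the 5-second default sampling interval is noted earlier in this lesson).

def percent_to_ms(value_percent, sample_interval_s=5):
    """Convert a resxtop percentage (%USED, %RDY, or %WAIT) to absolute time in milliseconds."""
    return value_percent / 100.0 * sample_interval_s * 1000

# 10 percent ready time over a 5-second sample corresponds to 500 ms spent waiting for a CPU.
print(percent_to_ms(10))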

470 VMware vSphere: Optimize and Scale


Using resxtop to View CPU Metrics per Virtual Machine
Slide 8-24

Type V to filter the output to show just virtual machines.

resxtop often shows more information than you need for the specific performance problem that
you are troubleshooting. There are commands that can be issued in interactive mode to customize
the display. These commands are the most useful when monitoring CPU performance:
• Press spacebar – Immediately updates the current screen.
• Type s – Prompts you for the delay between updates, in seconds. The default value is 5
seconds. The minimum value is 2 seconds.
• Type V – Displays virtual machine instances only. Do not confuse this with lowercase v.
Lowercase v shifts the view to the storage virtual machine resource utilization screen. If you
accidentally use this option, type c to switch back to the CPU resource utilization panel.
• Type e – Toggles whether CPU statistics are displayed expanded or unexpanded. The expanded
display includes CPU resource utilization statistics broken down by individual worlds
belonging to a virtual machine. All percentages of the individual worlds are percentage of a
single physical CPU.

Module 8 CPU Optimization 471


Using resxtop to View Single CPU Statistics
Slide 8-25

Type e to show all the worlds associated with a single virtual machine.

The default resxtop screen shows aggregate information for all vCPUs in a virtual machine. Type
e and enter the group ID (GID) to expand the group and display all the worlds associated with that
group. In the example above, the virtual machine group with GID 24 is expanded to show all the
worlds associated with that group.
To view the performance of individual CPUs, you must expand the output of resxtop for an
individual virtual machine. After it has been expanded, the utilization and ready time for each vCPU
can be tracked to further isolate performance problems.

472 VMware vSphere: Optimize and Scale


What Is Most Important to Monitor?
Slide 8-26

High used values:


ƒ Often an indicator of high utilization, a goal of many organizations
Ready time:
ƒ The best indicator of possible CPU performance problems
ƒ Ready time occurs when more CPU resources are being requested by
virtual machines than can be scheduled on the given physical CPUs.

High usage is not necessarily an indication of poor CPU performance, though it is often mistakenly
understood as such. Administrators in the physical space view high utilization as a warning sign,
due to the limitations of physical architectures. However, one of the goals of virtualization is to
efficiently utilize the available resources. When high CPU utilization is detected along with queuing,
this might indicate a performance problem.
As an analogy, consider a freeway. If every lane in the freeway is packed with cars but traffic is
moving at a reasonable speed, we have high utilization but good performance. The freeway is doing
what it was designed to do: moving a large number of vehicles at its maximum capacity.
Now imagine the same freeway at rush hour. More cars than the maximum capacity of the freeway
are trying to share the limited space. Traffic speed reduces to the point where cars line up (queue) at
the on-ramps while they wait for shared space to open up so that they can merge onto the freeway.
This is high utilization with queuing, and a performance problem exists.
Ready time reflects the idea of queuing within the CPU scheduler. Ready time is the amount of time
that a vCPU was ready to run but was asked to wait due to an overcommitted physical CPU.

Module 8 CPU Optimization 473


Lab 12 Introduction (1)
Slide 8-27

Diagram: On your ESXi host, TestVM01 runs # ./starttest1 while the CPU-Hog vApp (workload
virtual machines 01 through 10) creates CPU contention.
Case 1: 1 thread, 1 vCPU.
Case 2: 1 thread, 2 vCPUs.

In this lab, you will compare the performance of a MySQL database running on a single-vCPU
virtual machine, a dual-vCPU virtual machine, and a four-vCPU virtual machine. You will run four
tests, measuring CPU usage and ready time and recording the data after each test. There are four test
cases that you will perform. Two of them are described here:
• Case 1 – On a single-vCPU virtual machine, you will run the program starttest1.
starttest1 simulates a single-threaded application accessing a MySQL database. First, you
gather baseline data. Then you generate CPU contention by using a vApp named CPU-HOG.
Then you re-run starttest1 and monitor CPU performance.
• Case 2 – On a dual-vCPU virtual machine, you will run the program starttest1.
Cases 3 and 4 are described on the next page.

474 VMware vSphere: Optimize and Scale


Lab 12 Introduction (2)
Slide 8-28

Diagram: On your ESXi host, TestVM02 runs # ./starttest2 and TestVM01 runs # ./starttest4 while
the CPU-Hog vApp creates CPU contention.
Case 3: 2 threads, 2 vCPUs.
Case 4: 4 threads, 4 vCPUs.

The remaining two test cases that you will perform are the following:
• Case 3– On a dual-vCPU virtual machine, you will run the program starttest2.
starttest2 simulates a dual-threaded application accessing a MySQL database.
• Case 4 – On a quad-vCPU virtual machine, you will run the program starttest4.
starttest4 simulates a quad-threaded application accessing a MySQL database.
In both test cases, CPU-HOG continues to run to generate CPU contention.

Module 8 CPU Optimization 475


Lab 12
Slide 8-29

In this lab, you will use performance charts and resxtop to monitor CPU performance.
1. Run a single-threaded program in a single-vCPU virtual machine.
2. Use esxtop to observe baseline data.
3. Use the vSphere Client to observe baseline data.
4. Import the CPU-HOG vApp.
5. Generate CPU contention.
6. Observe the performance of the test virtual machine.
7. Run a single-threaded program in a dual-vCPU virtual machine.
8. Observe the performance of the test virtual machine.
9. Run a dual-threaded program in a dual-vCPU virtual machine.
10. Observe the performance of the test virtual machine.
11. Run a multithreaded program in a quad-vCPU virtual machine.
12. Observe the performance of the test virtual machine.
13. Summarize your findings.
14. Clean up for the next lab.

476 VMware vSphere: Optimize and Scale


Lab 12 Review
Slide 8-30

Counters   Case 1 (1 thread, 1 vCPU)   Case 2 (1 thread, 2 vCPUs)   Case 3 (2 threads, 2 vCPUs)   Case 4 (4 threads, 4 vCPUs)
opm
%USED
%RDY

Compare the results of case 1 and case 2. How does the performance differ between the two cases?
Compare the results of case 1 and case 3. How does the performance differ between the two cases?
Compare the results of case 3 and case 4. How does the performance differ between the two cases?

Module 8 CPU Optimization 477


Review of Learner Objectives
Slide 8-31

You should be able to do the following:


ƒ Identify key CPU metrics to monitor.
ƒ Use metrics in vCenter Server and resxtop.
ƒ Monitor CPU usage.

478 VMware vSphere: Optimize and Scale


Lesson 3: Troubleshooting CPU Performance Problems
Slide 8-32

Lesson 3:
Troubleshooting CPU Performance
Problems

Module 8 CPU Optimization 479


Learner Objectives
Slide 8-33

After this lesson, you should be able to do the following:


ƒ Describe various CPU performance problems.
ƒ Discuss the causes of CPU performance problems.
ƒ Propose solutions to correct CPU performance problems.

480 VMware vSphere: Optimize and Scale


Review: Basic Troubleshooting Flow for ESXi Hosts
Slide 8-34

1. Check for VMware® Tools™ status.
2. Check for resource pool CPU saturation.
3. Check for host CPU saturation.
4. Check for guest CPU saturation.
5. Check for active VM memory swapping.
6. Check for VM swap wait.
7. Check for active VM memory compression.
8. Check for an overloaded storage device.
9. Check for dropped receive packets.
10. Check for dropped transmit packets.
11. Check for using only one vCPU in an SMP VM.
12. Check for high CPU ready time on VMs running in under-utilized hosts.
13. Check for slow storage device.
14. Check for random increase in I/O latency on a shared storage device.
15. Check for random increase in data transfer rate on network controllers.
16. Check for low guest CPU utilization.
17. Check for past VM memory swapping.
18. Check for high memory demand in a resource pool.
19. Check for high memory demand in a host.
20. Check for high guest memory demand.

In the original flowchart, checks 1–10 are grouped as definite problems, checks 11–15 as likely
problems, and checks 16–20 as possible problems.

When troubleshooting CPU issues in a VMware vSphere™ environment, you can perform the
problem checks in the basic troubleshooting flow for an ESXi host. For CPU-related problems, this
means checking host CPU saturation and guest CPU saturation. The problem checks assume that
you are using the vSphere Client and that you are using real-time statistics while the problem is
occurring.
Performing these problem checks requires looking at the values of specific performance metrics
using the performance monitoring capabilities of the vSphere Client. The threshold values specified
in these checks are only guidelines based on past experience. The best values for these thresholds
might vary, depending on the sensitivity of a particular configuration or set of applications to
variations in system load.

Module 8 CPU Optimization 481


Resource Pool CPU Saturation
Slide 8-35

If resource pools are part of the virtual datacenter, the pools need to be checked for CPU saturation
to ensure the pools are not a performance bottleneck. An incorrectly configured resource pool can
inject CPU contention for the virtual machines that reside in the pool.
If virtual machine performance is degraded among pool members, two metrics should be checked:
• High CPU usage on the resource pool by itself does not indicate a performance bottleneck.
Compare CPU usage with the CPU limit setting on the resource pool if it is set. If the CPU
usage metric and the CPU limit value on the resource pool are close, the resource pool is
saturated. If this is not the case, the issue might be specific to the virtual machine.
• For each virtual machine in the pool, high ready time should be checked. If the average or latest
value for ready time is greater than 2000ms for any vCPU object, a contention issue exists.
However, the issue might reside on the ESXi host or the guest operating system. The problem
does not involve the resource pool.
For more details on performance troubleshooting, see vSphere Performance Troubleshooting Guide
at http://www.vmware.com/resources/techresources/.
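
The two checks can be summarized in a short sketch. This is illustrative only: the input values would come from the performance charts, the 2,000 ms ready-time threshold comes from the text, and treating usage within 95 percent of the limit as "close" is an assumption of this example.

def check_resource_pool(pool_cpu_usage_mhz, pool_cpu_limit_mhz, ready_ms_by_vcpu,
                        limit_margin=0.95, ready_threshold_ms=2000):
    """Apply the two resource pool checks described above (illustrative only)."""
    findings = []
    if pool_cpu_limit_mhz is not None and pool_cpu_usage_mhz >= limit_margin * pool_cpu_limit_mhz:
        findings.append("resource pool is saturated: CPU usage is close to the pool's CPU limit")
    for vcpu, ready_ms in ready_ms_by_vcpu.items():
        if ready_ms > ready_threshold_ms:
            findings.append(f"{vcpu}: ready time of {ready_ms} ms indicates CPU contention")
    return findings

# Invented sample data for illustration.
print(check_resource_pool(9800, 10000, {"app01/vcpu0": 2500, "app02/vcpu0": 300}))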

482 VMware vSphere: Optimize and Scale


Host CPU Saturation
Slide 8-36


The root cause of host CPU saturation is simple: The virtual machines running on the host are
demanding more CPU resource than the host has available.
Remember that ready time is reported in two different formats by resxtop and VMware®
vCenter Server™. In resxtop, it is reported in an easily consumed percentage format. A figure of
5 percent means that the virtual machine spent 5 percent of its last sample period waiting for
available CPU resources. In vCenter Server, ready time is reported as a time measurement. For
example, in vCenter Server’s real-time data, which produces sample values every 20,000
milliseconds, a figure of 1,000 milliseconds is reported for a 5 percent ready time. A figure of 2,000
milliseconds is reported for a 10 percent ready time.
Using resxtop values for ready time, here is how to interpret the value:
• If ready time <= 5 percent, this is normal. Very small single-digit numbers result in minimal
impact to users.
• If ready time is between 5 and 10 percent, ready time is starting to be worth watching.
• If ready time is > 10 percent, though some systems continue to meet expectations, double-digit
ready time percentages often mean action is required to address performance issues.
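
Those thresholds can be expressed as a small classification helper, sketched here for illustration only.

def interpret_ready_time(ready_percent):
    """Classify a resxtop %RDY value per the guidance above."""
    if ready_percent <= 5:
        return "normal: minimal impact to users"
    if ready_percent <= 10:
        return "starting to be worth watching"
    return "double digits: action is often required to address performance issues"

for rdy in (3, 8, 15):
    print(f"%RDY {rdy} -> {interpret_ready_time(rdy)}")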

Module 8 CPU Optimization 483


Causes of Host CPU Saturation
Slide 8-37

Root cause is simple:


ƒ Virtual machines running on the host are demanding more CPU
resource than the host has available.
Main scenarios for this problem:
ƒ Host has small number of virtual machines, all with high CPU demand.
ƒ Host has large number of virtual machines, all with low to moderate
CPU demand.
ƒ Host has a mix of virtual machines with high and low CPU demand.

Populating an ESXi host with too many virtual machines running compute-intensive applications
can make it impossible to supply sufficient resources to any individual virtual machine.
Three main scenarios in which this problem can occur are when the host has one of the following:
• A small number of virtual machines with high CPU demand
• A large number of virtual machines with moderate CPU demands
• A mix of virtual machines with high and low CPU demand

484 VMware vSphere: Optimize and Scale


Resolving Host CPU Saturation
Slide 8-38

Four possible approaches:


ƒ Reduce the number of virtual machines running on the host.
ƒ Increase the available CPU resources by adding the host to a VMware
vSphere® Distributed Resource Scheduler (DRS) cluster.
ƒ Increase the efficiency with which virtual machines use CPU resources.
ƒ Use resource controls to direct available resources to critical virtual
machines.

Four possible approaches exist to resolving the issue of a CPU-saturated host. These approaches are
discussed in the next few slides. You can apply these solutions to the three scenarios mentioned on
the previous page:
• When the host has a small number of virtual machines with high CPU demand, you might solve
the problem by first increasing the efficiency of CPU usage in the virtual machines. Virtual
machines with high CPU demands are those most likely to benefit from simple performance
tuning. In addition, unless care is taken with virtual machine placement, moving virtual
machines with high CPU demands to a new host might simply move the problem to that host.
• When the host has a large number of virtual machines with moderate CPU demands, reducing
the number of virtual machines on the host by rebalancing the load is the simplest solution.
Performance tuning on virtual machines with low CPU demands yields smaller benefits and is
time-consuming if the virtual machines are running different applications.
• When the host has a mix of virtual machines with high and low CPU demand, the appropriate
solution depends on available resources and skill sets. If additional ESXi hosts are available,
rebalancing the load might be the best approach. If additional hosts are not available, or if expertise
exists for tuning the high-demand virtual machines, then increasing efficiency might be the best
approach.

Module 8 CPU Optimization 485


Reducing the Number of Virtual Machines on the Host
Slide 8-39

Use VMware vSphere® vMotion® to migrate virtual machines to a


server that is less loaded:
ƒ Use performance charts to verify that the virtual machine is migrating to
a host with available CPU resources.
ƒ Consider peak usage to prevent future problems.
Power off noncritical virtual machines.

The most straightforward solution to the problem of host CPU saturation is to reduce the demand for
CPU resources by migrating virtual machines to ESXi hosts with available CPU resources. In order
to do this successfully, use the performance charts available in the vSphere Client to determine the
CPU usage of each virtual machine on the host and the available CPU resources on the target hosts.
Ensure that the migrated virtual machines will not cause CPU saturation on the target hosts. When
manually rebalancing load in this manner, consider peak usage periods as well as average CPU
usage. This data is available in the historical performance charts when using vCenter Server to
manage ESXi hosts. The load can be rebalanced with no downtime for the affected virtual machines
if VMware vSphere® vMotion® is configured properly in your vSphere environment.
If additional ESXi hosts are not available, you can eliminate host CPU saturation by powering off
noncritical virtual machines. This action makes additional CPU resources available for critical
applications. Even idle virtual machines consume CPU resources. If powering off virtual machines
is not an option, you have to increase efficiency or explore the use of resource controls, such as
shares, limits, and reservations.
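Before migrating virtual machines, it can help to script a quick inventory of current CPU load. The following is a minimal pyVmomi sketch; the vCenter Server address, credentials, and output format are hypothetical and not part of the course. It prints each host's CPU utilization and the CPU usage of its powered-on virtual machines from quickStats, which reflect recent usage only, so still review the historical charts for peak periods before rebalancing.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab use only; use proper certificates in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    hosts = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
    for host in hosts.view:
        hw = host.summary.hardware
        capacity_mhz = hw.cpuMhz * hw.numCpuCores                 # total host CPU capacity
        used_mhz = host.summary.quickStats.overallCpuUsage        # current host CPU usage
        print("%s: %d of %d MHz used (%.0f%%)"
              % (host.name, used_mhz, capacity_mhz, 100.0 * used_mhz / capacity_mhz))
        for vm in host.vm:
            if vm.runtime.powerState == vim.VirtualMachinePowerState.poweredOn:
                print("    %-30s %6d MHz" % (vm.name, vm.summary.quickStats.overallCpuUsage))
finally:
    Disconnect(si)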

486 VMware vSphere: Optimize and Scale


Increasing CPU Resources with DRS Clusters
Slide 8-40

Adding hosts to a DRS cluster increases the number of CPU resources.
• DRS can use vMotion to automatically balance the CPU load across ESXi hosts.
• Manual calculations and administration during peak versus off-peak periods are eliminated.

Using VMware vSphere® Distributed Resource Scheduler™ (DRS) clusters to increase available
CPU resources is similar to the previous solution: additional CPU resources are made available to
the set of virtual machines that have been running on the affected host. However, in a DRS cluster,
load rebalancing is performed automatically. Thus, you do not have to manually compute the load
compatibility of specific virtual machines and hosts or to account for peak usage periods.
For more about DRS clusters, see vSphere Resource Management Guide at
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.

Module 8 CPU Optimization 487


Increasing Efficiency of a Virtual Machine's CPU Usage
Slide 8-41

Leverage application-tuning guides.
• Vendor guides typically include best practices and operating system–tuning instructions.
• Technical books and white papers provide general principles.
Guest operating system and application tuning should include:
• Use of large memory pages
• Reducing the timer interrupt rate for the guest operating system
Virtual machine optimizations can include:
• Allocating more memory to the virtual machine
• Choosing the right number of vCPUs to fit the application

To increase the amount of work that can be performed on a saturated host, you must increase the
efficiency with which applications and virtual machines use those resources. CPU resources might
be wasted due to suboptimal tuning of applications and operating systems in virtual machines or
inefficient assignment of host resources to virtual machines.
The efficiency with which an application–operating system combination uses CPU resources
depends on many factors specific to the application, operating system, and hardware. As a result, a
discussion of tuning applications and operating systems for efficient use of CPU resources is beyond
the scope of this course.
Most application vendors provide performance-tuning guides that document best practices and
procedures for their application. These guides often include operating system–level tuning advice
and best practices. Operating system vendors also provide guides with performance-tuning
recommendations. In addition, many books and white papers discuss the general principles of
performance tuning as well as the application of those principles to specific applications and
operating systems. These procedures, best practices, and recommendations apply equally well to
virtualized and nonvirtualized environments. However, two application-level and operating system–
level tunings are particularly effective in a virtualized environment:

488 VMware vSphere: Optimize and Scale


• Configure the application and guest operating system to use large pages when allocating memory.
• Reduce the timer interrupt rate for the guest operating system.
A virtualized environment provides more direct control over the assignment of CPU and memory
resources to applications than a nonvirtualized environment. Two key adjustments can be made in
virtual machine configurations that might improve overall CPU efficiency:

• Allocating more memory to a virtual machine – This approach might enable the application running in the virtual machine to operate more efficiently. Additional memory might enable the application to reduce I/O overhead or allocate more space for critical resources. Check the performance-tuning information for the specific application to see if additional memory will improve efficiency. Some applications must be explicitly configured to use additional memory.
• Reducing the number of vCPUs allocated to virtual machines – Virtual machines that are not using all of their CPU resources will make more resources available to other virtual machines. Even if the extra vCPUs are idle, they still incur a cost in CPU resources, both in the CPU scheduler and in the guest operating system overhead involved in managing the extra vCPU.
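Reducing the vCPU count can also be scripted. Below is a minimal pyVmomi sketch; the vCenter Server address, credentials, and the virtual machine name TestVM are hypothetical, and the virtual machine must normally be powered off before its vCPU count can be lowered.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab use only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "TestVM")     # hypothetical VM name
    # Lower the vCPU count to match what the workload actually uses.
    spec = vim.vm.ConfigSpec(numCPUs=1)
    WaitForTask(vm.ReconfigVM_Task(spec=spec))
    print("Reconfigured %s to 1 vCPU" % vm.name)
finally:
    Disconnect(si)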

Module 8 CPU Optimization 489


Controlling Resources Using Resource Settings
Slide 8-42

(Screenshot: a performance chart of a virtual machine configured with a limit on CPU, showing high ready time.)

If you cannot rebalance CPU load or increase efficiency, or if all possible steps have been taken and
host CPU saturation still exists, then you might be able to reduce performance problems by
modifying resource controls. Many applications, such as batch jobs or long-running numerical
calculations, respond to a lack of CPU resources by taking longer to complete but still produce
correct and useful results. Other applications might experience failures or might be unable to meet
critical business requirements when denied sufficient CPU resources. The resource controls
available in vSphere can be used to ensure that resource-sensitive applications always get sufficient
CPU resources, even when host CPU saturation exists. For more about resource controls, see
vSphere Resource Management Guide at http://www.vmware.com/support/pubs/vsphere-esxi-
vcenter-server-pubs.html.
In this example, a lower-priority virtual machine was configured with a CPU limit, which is an
upper bound on the amount of CPU this virtual machine can consume. This artificial ceiling has
resulted in ready time for this specific virtual machine, most likely resulting in decreased
performance.
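Resource-control changes such as these can be scripted as well. The following is a minimal pyVmomi sketch; the vCenter Server address, credentials, virtual machine name, and reservation value are all hypothetical. It removes a CPU limit and gives a critical virtual machine a reservation and high shares.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab use only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "CriticalVM")      # hypothetical VM name
    alloc = vim.ResourceAllocationInfo()
    alloc.limit = -1                                               # -1 removes any CPU limit
    alloc.reservation = 1000                                       # guarantee 1000 MHz of CPU
    alloc.shares = vim.SharesInfo(level="high", shares=0)          # high proportional shares
    WaitForTask(vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(cpuAllocation=alloc)))
    print("Updated CPU allocation for %s" % vm.name)
finally:
    Disconnect(si)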

490 VMware vSphere: Optimize and Scale


When Ready Time Might Not Indicate a Problem
Slide 8-43

Used time ~ ready time:
• This might signal contention.
• However, the host might not be overcommitted, due to workload availability.
Example:
• There might be periods of activity and periods that are idle.
• The CPU is not overcommitted all of the time.
(Chart: used time plotted against ready time; at some intervals ready time ~ used time, at others ready time < used time.)

Although high ready time typically signifies CPU contention, the condition does not always warrant
corrective action. If the value for ready time is close in value to the amount of time used on the
CPU, and if the increased ready time occurs with occasional spikes in CPU activity but does not
persist for extended periods of time, this value might not indicate a performance problem. The brief
performance hit is often within the accepted performance variance and does not require any action
on the part of the administrator.

Module 8 CPU Optimization 491


Example: Spotting CPU Overcommitment
Slide 8-44

• Dual-CPU host with three active virtual machines (high %USED)
• High %RDY + high %USED can imply CPU overcommitment.

This example uses resxtop to detect CPU overcommitment. Looking at the PCPU line near the top
of the screen, you can determine that the host has two CPUs, both 100 percent utilized. Three active
virtual machines are shown: memhog-linux-sm, kernelcompile-u, and memhog-linux. These virtual
machines are active because they have relatively high values in the %USED column. The values in
the %USED column alone do not necessarily indicate that the CPUs are overcommitted. In the
%RDY column, you see that the three active virtual machines have relatively high values. High
%RDY values, plus high %USED values, are a sure indicator that your CPU resources are
overcommitted.
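The same check can be automated against resxtop/esxtop batch-mode output. The sketch below assumes data was captured with something like resxtop -b -n 10 > cpu.csv; the thresholds are illustrative, and the exact column headers ("% Used" and "% Ready" under "Group Cpu") should be verified against the output of your version.

import csv

HIGH_USED = 80.0   # illustrative thresholds, in percent
HIGH_RDY = 10.0

with open("cpu.csv") as f:
    for row in csv.DictReader(f):
        for col, value in row.items():
            if "Group Cpu" in col and col.endswith("\\% Used"):
                ready_col = col.replace("\\% Used", "\\% Ready")
                try:
                    used = float(value)
                    ready = float(row.get(ready_col, 0) or 0)
                except ValueError:
                    continue
                if used > HIGH_USED and ready > HIGH_RDY:
                    print("Possible CPU overcommitment: %s used=%.1f%% ready=%.1f%%"
                          % (col, used, ready))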

492 VMware vSphere: Optimize and Scale


Guest CPU Saturation
Slide 8-45

Monitor virtual machine CPU usage.
• If the CPU average usage is > 75 percent, or if there are peaks > 90 percent, then guest CPU saturation exists.

Cause: The guest operating system and applications use all the CPU resources.
Solution: Increase the CPU resources provided to the application. Increase the efficiency with which the virtual machine uses CPU resources.

Guest CPU saturation occurs when the application and operating system running in a virtual
machine use all of the CPU resources that the ESXi host is providing to that virtual machine. The
occurrence of guest CPU saturation does not necessarily indicate that a performance problem exists.
Compute-intensive applications commonly use all available CPU resources, and even less-intensive
applications might experience periods of high CPU demand without experiencing performance
problems. However, if a performance problem exists when guest CPU saturation is occurring, steps
should be taken to eliminate the condition.
You have two possible approaches to solving performance problems related to guest CPU saturation:
• Increase the CPU resources provided to the application.
• Increase the efficiency with which the virtual machine uses CPU resources.
Adding CPU resources is often the easiest choice, particularly in a virtualized environment.
However, this approach misses inefficient behavior in the guest application and operating system
that might be wasting CPU resources or, in the worst case, that might be an indication of some
erroneous condition in the guest. If a virtual machine continues to experience CPU saturation even
after adding CPU resources, the tuning and behavior of the application and operating system should
be investigated.

Module 8 CPU Optimization 493


Using One vCPU in SMP Virtual Machine
Slide 8-46

Monitor virtual machine usage for all vCPU objects.
• If usage for all vCPUs except one is close to zero, the SMP virtual machine is using only one vCPU.

Cause: The guest operating system is configured with a uniprocessor kernel. Solution: Upgrade to an SMP kernel or HAL.
Cause: The application is pinned to a single core in the guest operating system. Solution: Remove OS-level controls or reduce the number of vCPUs.
Cause: The application is single-threaded. Solution: Reduce the number of vCPUs to one.

When a virtual machine configured with more than one vCPU actively uses only one of those
vCPUs, resources that could be used to perform useful work are being wasted. Possible causes and
solutions for this problem include the following:
• The guest operating system is configured with a uniprocessor kernel (Linux) or hardware
abstraction layer (HAL) (Windows).
In order for a virtual machine to take advantage of multiple vCPUs, the guest operating system
running in the virtual machine must be able to recognize and use multiple processor cores. If the
virtual machine is doing all of its work on vCPU0, the guest operating system might be
configured with a kernel or a HAL that can recognize only a single processor core. Follow the
documentation provided by your operating system vendor to check for the type of kernel or
HAL that is being used by the guest operating system. If the guest operating system is using a
uniprocessor kernel or HAL, follow the operating system vendor’s documentation for upgrading
to an SMP kernel or HAL.
• The application is pinned to a single core in the guest operating system.
Many modern operating systems provide controls for restricting applications to run on only a
subset of available processor cores.

494 VMware vSphere: Optimize and Scale


If these controls have been applied in the guest operating system, the application might run on only vCPU0. To determine whether this is the case, inspect the commands used to launch the application, or other operating system–level tunings applied to the application. This must be accompanied by an understanding of the operating system vendor’s documentation regarding restricting CPU resources. This issue can be resolved by removing the operating system–level controls or, if the controls are determined to be appropriate for the application, by reducing the number of vCPUs to match the number actually used by the virtual machine.
• The application is single-threaded.
Many applications, particularly older applications, are written with only a single thread of control. These applications cannot take advantage of more than one processor core. However, many guest operating systems spread the runtime of even a single-threaded application over multiple processors. Thus, running a single-threaded application in an SMP virtual machine might lead to low overall guest CPU utilization, rather than to utilization concentrated on vCPU0. Without knowledge of the application design, it might be difficult to determine whether an application is single-threaded other than by observing its behavior in an SMP virtual machine. If, even when driven at peak load, the application is unable to achieve CPU utilization of more than 100 percent of a single CPU, the application is likely to be single-threaded. Poor application and guest operating system tuning, or high I/O response times, can also impose artificial limits on the achievable utilization. The solution is to reduce the number of vCPUs allocated to this virtual machine to one.

Module 8 CPU Optimization 495


Low Guest CPU Utilization
Slide 8-47

Monitor virtual machines’ CPU usage.
• If average usage < 75 percent, guest CPU utilization is low.

Cause: High storage response times. Solution: Troubleshoot slow or overloaded storage.
Cause: High response times from external systems. Solution: Application- and environment-dependent.
Cause: Poor application or operating system tuning. Solution: Application- and environment-dependent.
Cause: Application pinned to cores in the guest operating system. Solution: Remove OS-level controls or reduce the number of vCPUs.
Cause: Too many configured vCPUs. Solution: Reduce the number of vCPUs.
Cause: Restrictive resource allocations. Solution: Modify VM resource settings.

To determine if the problem is low guest CPU utilization, use the vSphere Client to check the CPU
use of the virtual machines on the host.
To determine low guest CPU use

1. In the vSphere Client, select the host.


2. Click the Virtual Machines tab.
3. View the Host CPU – MHz column and note the name of the virtual machine with the highest
CPU usage.
4. Select the virtual machine in the inventory.
5. View the CPU advanced performance chart. Look at the usage metric for this virtual machine. If
the average value is below 75 percent, then guest CPU utilization is low.

496 VMware vSphere: Optimize and Scale


CPU Performance Best Practices
Slide 8-48

Avoid using SMP unless specifically required by the application running in the guest operating system.
Prioritize virtual machine CPU usage with the proportional-share algorithm.
Use vMotion and DRS to redistribute virtual machines and reduce contention.
Increase the efficiency of virtual machine usage by:
• Leveraging application-tuning guides
• Tuning the guest operating system
• Optimizing the virtual hardware

Module 8 CPU Optimization 497


Lab 13
Slide 8-49

In this lab, you will use the performance troubleshooting checklist to detect possible CPU performance problems.
1. Generate CPU activity.
2. Display pop-up charts for your ESXi host and your test virtual machine.
3. Check for host CPU saturation.
4. Check for guest CPU saturation.
5. Check for using only one vCPU in an SMP virtual machine.
6. Check for low guest CPU utilization.
7. Summarize your findings.
8. Clean up for the next lab.

498 VMware vSphere: Optimize and Scale


Review of Learner Objectives
Slide 8-50

You should be able to do the following:
• Describe various CPU performance problems.
• Discuss the causes of CPU performance problems.
• Propose solutions to correct CPU performance problems.

Module 8 CPU Optimization 499


Key Points
Slide 8-51

• Most often, CPU issues are caused when there are insufficient CPU resources to satisfy demand.
• High CPU usage values along with high ready time often indicate CPU performance problems.
• For CPU-related problems, you should check host CPU saturation and guest CPU saturation.

Questions?

500 VMware vSphere: Optimize and Scale


MODULE 9

Memory Optimization
Slide 9-1

Module 9
Memory Optimization

VMware vSphere: Optimize and Scale 501


You Are Here
Slide 9-2

(Course map listing all modules: Course Introduction, VMware Management Resources, Performance in a Virtualized Environment, Network Scalability, Network Optimization, Storage Scalability, Storage Optimization, CPU Optimization, Memory Optimization, VM and Cluster Optimization, and Host and Management Scalability. You are here: Memory Optimization.)

502 VMware vSphere: Optimize and Scale


Importance
Slide 9-3

While VMware vSphere™ employs various mechanisms to efficiently allocate memory, you might still encounter a situation where virtual machines are allocated insufficient physical memory.
You should know how to monitor memory usage of both the host and virtual machines. You should also know how to troubleshoot common memory performance problems, such as those involving an excessive demand for memory.

Module 9 Memory Optimization 503


Module Lessons
Slide 9-4

Lesson 1: Memory Virtualization Concepts


Lesson 2: Monitoring Memory Activity
Lesson 3: Troubleshooting Memory Performance Problems

504 VMware vSphere: Optimize and Scale


Lesson 1: Memory Virtualization Concepts
Slide 9-5

Lesson 1:
Memory Virtualization Concepts

Module 9 Memory Optimization 505


Learner Objectives
Slide 9-6

After this lesson, you should be able to do the following:


• Describe how memory is used in a virtualized environment.
• Explain each memory-reclamation technique.
• Explain how memory overcommitment impacts performance.

506 VMware vSphere: Optimize and Scale


Virtual Memory Overview
Slide 9-7

(Diagram: the three levels of memory when running a virtual machine – virtual memory used by applications, guest physical memory seen by the guest operating system, and host physical memory seen by the hypervisor.)

Three levels of memory exist when running a virtual machine:


• Virtual memory – Memory visible to the applications running inside the virtual machine. This is
a continuous virtual address space presented by the guest operating system to applications.
• Guest physical memory – Memory visible to the guest operating system running in the virtual
machine.
• Host physical memory – Also called machine memory, this is memory visible to the hypervisor
as available on the system.

Module 9 Memory Optimization 507


Application and Guest OS Memory Management
Slide 9-8

(Diagram: inside the virtual machine, the application allocates virtual memory from the guest operating system, which tracks guest physical memory on allocated and free lists.)

Applications start with no memory. Then, during their execution, they use the interfaces provided by
the operating system to explicitly allocate or deallocate virtual memory.
In a nonvirtual environment, the operating system assumes it owns all host physical memory in the
system. The hardware does not provide interfaces for the operating system to explicitly allocate or
free host physical memory. The operating system establishes the definitions of allocated or free
physical memory. Different operating systems have different implementations to realize this
abstraction. So, whether or not a physical page is free depends on which list the page resides in.

508 VMware vSphere: Optimize and Scale


Virtual Machine Memory Management
Slide 9-9

A virtual machine starts with no host physical memory allocated to it.
Host physical memory is allocated on demand.
• The guest operating system will not explicitly allocate memory.
• Memory is allocated on the virtual machine’s first access to memory (read or write).
(Diagram: application, guest OS, and hypervisor layers, with host physical memory assigned by the hypervisor on first access.)

Because a virtual machine runs an operating system and one or more applications, the virtual
machine memory management properties combine both application and operating system memory
management properties. Like an application, when a virtual machine first starts, it has no
preallocated host physical memory. Like an operating system, the virtual machine cannot explicitly
allocate host physical memory through any standard interfaces.
The hypervisor intercepts the virtual machine’s memory accesses and allocates host physical
memory for the virtual machine on its first access to the memory. The hypervisor always writes
zeroes to the host physical memory before assigning it to a virtual machine.

Module 9 Memory Optimization 509


Memory Reclamation
Slide 9-10

Guest physical memory is not “freed” in the typical sense.
• Memory is moved to the “free” list.
The hypervisor is not aware of when the guest frees memory.
• It has no access to the guest’s “free” list.
• The virtual machine can accrue lots of host physical memory.
The hypervisor cannot “reclaim” the host physical memory freed up by the guest.

Virtual machine memory deallocation acts just like an operating system, such that the guest
operating system frees a piece of guest physical memory by adding these memory page numbers to
the guest free list. But the data of the “freed” memory might not be modified at all. As a result,
when a particular piece of guest physical memory is freed, the mapped host physical memory does
not usually change its state and only the guest free list is changed.
It is difficult for the hypervisor to know when to free host physical memory when guest physical
memory is deallocated, or freed, because the guest operating system free list is not accessible to the
hypervisor. The hypervisor is completely unaware of which pages are free or allocated in the guest
operating system. As a result, the hypervisor cannot reclaim host physical memory when the guest
operating system frees guest physical memory.

510 VMware vSphere: Optimize and Scale


Virtual Machine Memory-Reclamation Techniques
Slide 9-11

The hypervisor relies on the following memory-reclamation techniques to free host physical memory:
• Transparent page sharing (enabled by default)
• Ballooning
• Memory compression
• Host-level, or hypervisor, swapping
The hypervisor must rely on memory-reclamation techniques to reclaim the host physical memory
freed up by the guest operating system. The memory-reclamation techniques are transparent page
sharing, ballooning, and host-level (or hypervisor) swapping.
When multiple virtual machines are running, some of them might have identical sets of memory
content. This presents opportunities for sharing memory across virtual machines (as well as sharing
within a single virtual machine). With transparent page sharing, the hypervisor can reclaim the
redundant copies and keep only one copy, which is shared by multiple virtual machines, in host
physical memory.
Ballooning and host-level swapping are the other two memory-reclamation techniques. These
techniques are discussed after a review of guest operating system memory terminology.

NOTE
In hardware-assisted memory virtualization systems, page sharing might not occur until memory
overcommitment is high enough to require the large pages to be broken into small pages. For further
information, see VMware® KB articles 1021095 and 1021896.

Module 9 Memory Optimization 511


Guest Operating System Memory Terminology
Slide 9-12

• Memory size – the total amount of memory presented to a guest
• Allocated memory – memory assigned to applications; free memory – memory not assigned
• Active memory – allocated memory recently accessed or used by applications; idle memory – allocated memory not recently accessed or used

The memory used by the guest operating system in the virtual machine can be described as follows:
First, there is the memory size. This is the total amount of memory that is presented to the guest. By
default, the memory size corresponds to the maximum memory configured when the virtual machine
was created.
The total amount of memory (memory size) can be split into two parts: free memory and allocated
memory. Free memory is memory that has not been assigned to the guest operating system or to
applications. Allocated memory is memory that has been assigned to the guest operating system or
to applications.
Allocated memory can be further subdivided into two types: active memory and idle memory.
Active memory is memory that has “recently” been used by applications. Idle memory is memory
that has not been “recently” used by applications. To maximize virtual machine performance, it is
vital to keep a virtual machine’s active memory in host physical memory.
These terms are used specifically to describe the guest operating system’s memory. The VMware
vSphere® ESXi™ host does not know or care about a guest’s free, active, or idle memory.

512 VMware vSphere: Optimize and Scale


Reclaiming Memory with Ballooning
Slide 9-13

Ballooning preferentially selects free or idle virtual machine memory.
• This is because the guest operating system allocates from free memory.
But if asked to reclaim too much, ballooning will eventually start reclaiming active memory.
(Diagram: free, idle, and active memory in the guest operating system, with the hypervisor reclaiming through the balloon driver.)

Due to the virtual machine’s isolation, the guest operating system is not aware that it is running
inside a virtual machine and is not aware of the states of other virtual machines on the same host.
When the hypervisor runs multiple virtual machines and the total amount of free host physical
memory becomes low, none of the virtual machines will free guest physical memory, because the
guest operating system cannot detect the host physical memory shortage.
The goal of ballooning is to make the guest operating system aware of the low memory status of the
host so that the guest operating system can free up some of its memory. The guest operating system
determines whether it needs to page out guest physical memory to satisfy the balloon driver’s
allocation requests. If the virtual machine has plenty of free or idle guest physical memory, inflating
the balloon will induce no guest-level paging and will not affect guest performance. However, if the
guest is already under memory pressure, the guest operating system decides which guest physical
pages are to be paged out to the virtual swap device in order to satisfy the balloon driver’s allocation
requests.

Module 9 Memory Optimization 513


Memory Compression
Slide 9-14

The VMware vSphere® ESXi™ host stores pages that would otherwise be swapped out to disk in a compression cache.
(Diagram: without the compression cache, pages A and B are swapped to disk; with the compression cache, the pages are compressed and kept in main memory.)

With memory compression, ESXi stores pages in memory. These pages would otherwise be
swapped out to disk, through host swapping, in a compression cache located in the main memory.
Memory compression outperforms host swapping because the next access to the compressed page
only causes a page decompression, which can be an order of magnitude faster than the disk access.
ESXi determines if a page can be compressed by checking the compression ratio for the page.
Memory compression occurs when the page’s compression ratio is greater than 50%. Otherwise, the
page is swapped out. Only pages that would otherwise be swapped out to disk are chosen as
candidates for memory compression. This specification means that ESXi does not actively compress
guest pages when host swapping is not necessary. In other words, memory compression does not
affect workload performance when host memory is not overcommitted.
The slide illustrates how memory compression reclaims host memory compared to host swapping.
Assume ESXi needs to reclaim 8KB physical memory (that is, two 4KB pages) from a virtual machine.
With host swapping, two swap candidate pages, pages A and B, are directly swapped to disk. With
memory compression, a swap candidate page is compressed and stored using 2KB of space in a per-
virtual machine compression cache. Hence, each compressed page yields 2KB memory space for ESXi
to reclaim. In order to reclaim 8KB physical memory, four swap candidate pages must be compressed.
The page compression is much faster than the normal page swap-out operation, which involves a
disk I/O.
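The arithmetic in this example is easy to verify. A small sketch, using the 4KB page size and 2KB compressed page size from the example above:

PAGE_SIZE_KB = 4
COMPRESSED_SIZE_KB = 2     # a page is compressed only if it shrinks to 2KB or less

def pages_to_swap(target_reclaim_kb):
    # Each page swapped to disk frees a full 4KB but costs a disk I/O.
    return target_reclaim_kb // PAGE_SIZE_KB

def pages_to_compress(target_reclaim_kb):
    # Each compressed page frees only the 2KB it no longer occupies, but avoids disk I/O.
    return target_reclaim_kb // (PAGE_SIZE_KB - COMPRESSED_SIZE_KB)

print(pages_to_swap(8))       # 2 pages swapped out
print(pages_to_compress(8))   # 4 pages compressed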

514 VMware vSphere: Optimize and Scale


Host Cache
Slide 9-15

Swap to host cache allows users to configure a special swap cache on Solid State Disk (SSD) storage.
(Diagram: a page table mapping pages between main memory and flash-based host cache.)

If memory compression doesn’t keep the virtual machine’s memory usage low enough, the ESXi
host will next forcibly reclaim memory using host-level swapping to a host cache if one has been
configured.
Swap to host cache allows users to configure a special swap cache on Solid State Disk (SSD)
storage. In most cases, swapping to the swap cache on SSD is much faster than swapping to regular
swap files on hard disk storage. Swapping to the SSD swap cache significantly reduces latency.
Thus, although some of the pages that the ESXi host swaps out might be active, swap to host cache
has a far lower performance impact than regular host-level swapping.
Placing swap files on SSD storage could potentially reduce vMotion performance. This is because if
a virtual machine has memory pages in a local swap file, they must be swapped in to memory before
a vMotion operation on that virtual machine can proceed.

Module 9 Memory Optimization 515


Reclaiming Memory with Host Swapping
Slide 9-16

Host-level swapping randomly selects guest physical memory to reclaim, potentially including a virtual machine’s active memory.
(Diagram: the hypervisor swaps out guest physical memory regardless of whether the guest operating system considers it free, idle, or active.)

In cases where transparent page sharing and ballooning are not sufficient to reclaim memory, ESXi
employs host-level swapping. To support this, when starting a virtual machine, the hypervisor
creates a separate swap file for the virtual machine. Then, if necessary, the hypervisor can directly
swap out guest physical memory to the swap file, which frees host physical memory for other virtual
machines.
Host-level, or hypervisor, swapping is a guaranteed technique to reclaim a specific amount of
memory within a specific amount of time. However, host-level swapping might severely penalize
guest performance. This occurs when the hypervisor has no knowledge of which guest physical
pages should be swapped out and the swapping might cause unintended interactions with the native
memory management policies in the guest operating system. For example, the guest operating
system will never page out its kernel pages, because those pages are critical to ensure guest kernel
performance. The hypervisor, however, cannot identify those guest kernel pages, so it might swap
out those physical pages.

NOTE
ESXi swapping is distinct from swapping performed by a guest OS due to memory pressure within a
virtual machine. Guest-OS level swapping may occur even when the ESXi host has ample resources.

516 VMware vSphere: Optimize and Scale


Why Does the Hypervisor Reclaim Memory?
Slide 9-17

The hypervisor reclaims memory to support ESXi memory overcommitment.
A host’s memory is overcommitted when the total amount of guest physical memory is greater than the amount of host physical memory.
(Diagram: VM1, VM2, and VM3, each configured with 2GB, running on a hypervisor with 4GB of host physical memory.)

The hypervisor reclaims memory to support ESXi memory overcommitment. Memory overcommitment provides two important benefits:
• Higher memory utilization – With memory overcommitment, ESXi ensures that the host
physical memory is consumed by active guest memory as much as possible. Typically, some
virtual machines might be lightly loaded compared to others, so for much of the time their
memory will sit idle. Memory overcommitment allows the hypervisor to use memory-
reclamation techniques to take the inactive or unused host physical memory away from the idle
virtual machines and give it to other virtual machines that will actively use it.
• Higher consolidation ratio – With memory overcommitment, each virtual machine has a smaller
footprint in host physical memory, making it possible to fit more virtual machines on the host
while still achieving good performance for all virtual machines. In the example above, you can
enable a host with 4GB of host physical memory to run three virtual machines with 2GB of
guest physical memory each. This assumes that all virtual machines are using the default setting
for memory reservation, which is 0. In that case, all the virtual machines would power on.
Without memory overcommitment, if all the virtual machines had a memory reservation of 2GB
each, then only one virtual machine could be run because the hypervisor cannot reserve host
physical memory for more than one virtual machine, given that each virtual machine has
overhead memory as well.
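For the example above (three 2GB virtual machines on a 4GB host), a quick overcommitment check is simple arithmetic; overhead memory is ignored here for simplicity:

host_physical_gb = 4
vm_configured_gb = [2, 2, 2]          # configured memory of each virtual machine

total_guest_gb = sum(vm_configured_gb)
ratio = total_guest_gb / float(host_physical_gb)

print("Configured guest memory: %dGB on a %dGB host" % (total_guest_gb, host_physical_gb))
print("Overcommitment ratio: %.2f" % ratio)      # 1.50 - memory is overcommitted
if total_guest_gb > host_physical_gb:
    print("Memory is overcommitted; reclamation techniques are used only under pressure.")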

Module 9 Memory Optimization 517


When to Reclaim Host Memory
Slide 9-18

Host physical memory is reclaimed based on four host free memory states, reflected by four thresholds.
If host free memory drops toward the following threshold, the following reclamation technique is used:
• High – None
• Soft – Ballooning
• Hard – Swapping and compression
• Low – Ballooning, compression, and swapping

ESXi maintains four host free memory states: high, soft, hard, and low. When to use ballooning or
host-level swapping to reclaim host physical memory is largely determined by the current host free
memory state. (ESXi enables page sharing by default and reclaims host physical memory with little
overhead.)
In the high state, the aggregate virtual machine guest memory usage is smaller than the host physical
memory size. Whether or not host physical memory is overcommitted, the hypervisor will not
reclaim memory through ballooning or host-level swapping. (This is true only when the virtual
machine memory limit is not set.)

518 VMware vSphere: Optimize and Scale


If host free memory drops toward the soft threshold, the hypervisor starts to reclaim memory by
using ballooning. Ballooning happens before free memory actually reaches the soft threshold
because it takes time for the balloon driver to allocate guest physical memory. Usually, the balloon
driver is able to reclaim memory in a timely fashion so that the host free memory stays above the
soft threshold.
If ballooning is not sufficient to reclaim memory or the host free memory drops toward the hard
threshold, the hypervisor starts to use memory compression in addition to using ballooning. Through
memory compression, the hypervisor should be able to quickly reclaim memory and bring the host
memory state back to the soft state.
In the rare case where host free memory drops below the low threshold, the hypervisor reclaims memory through swapping and blocks the execution of all virtual machines that consume more memory than their target memory allocations.

Memory Optimization

Module 9 Memory Optimization 519


Sliding Scale Mem.MinFreePct
Slide 9-19

Mem.MinFreePct is the amount of memory the VMkernel should keep free.
• The VMkernel uses a sliding scale to determine the Mem.MinFreePct threshold based on the amount of RAM installed in the host.

Memory installed (range) – free state threshold:
• 0-4GB – 6%
• 4-12GB – 4%
• 12-28GB – 2%
• Remaining memory – 1%

MinFreePct determines the amount of memory the VMkernel should keep free. vSphere 4.1 allowed the user to change the default MinFreePct value of 6% to a different value, and it introduced a dynamic threshold for the soft, hard, and low states to set appropriate thresholds and prevent virtual machine performance issues while protecting the VMkernel.
Using a default MinFreePct value of 6% can be inefficient now that 256GB and 512GB systems are becoming mainstream: a 6% threshold on a 512GB host leaves roughly 30GB of memory idle most of the time. However, not all customers use large systems; some prefer to scale out rather than scale up, and in that scenario a 6% MinFreePct might be suitable. To have the best of both worlds, ESXi 5
uses a sliding scale for determining its MinFreePct threshold.

520 VMware vSphere: Optimize and Scale


For example, assume a server is configured with 96GB of RAM. The sliding scale technique allocates 245.76MB of RAM, or 6% of the first 4GB of memory, toward the total high threshold. The scale then allocates 327.68MB (4%) of the 4-12GB range, 327.68MB (2%) of the 12-28GB range, and finally 696.32MB (1%) of the remaining memory, for a total of 1,597.44MB as the high threshold.
After the high threshold is calculated, the soft memory state threshold is calculated as 64% of MinFreePct, or 1,022.36MB in this example. When free memory drops into the soft state, the system experiences ballooning.
When free memory reaches 32% of MinFreePct (511.18MB in this example), the memory free state changes to hard and the system experiences ballooning and memory compression.
When free memory reaches 16% of MinFreePct (255.59MB in this example), the memory free state changes to low and the system experiences ballooning, memory compression, and swapping.
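The same calculation can be written as a few lines of code. A minimal sketch of the sliding scale, using the bands and percentages from the slide; the 96GB example reproduces the thresholds above:

# Upper bound of each band in MB, and the percentage applied to memory in that band.
BANDS = [(4 * 1024, 0.06), (12 * 1024, 0.04), (28 * 1024, 0.02)]
REMAINDER_PCT = 0.01                      # 1% of everything above 28GB

def min_free_mb(installed_mb):
    """Return the high (MinFree) threshold in MB for a host with installed_mb of RAM."""
    total, lower = 0.0, 0
    for upper, pct in BANDS:
        portion = min(installed_mb, upper) - lower
        if portion <= 0:
            break
        total += portion * pct
        lower = upper
    if installed_mb > lower:
        total += (installed_mb - lower) * REMAINDER_PCT
    return total

high = min_free_mb(96 * 1024)
print("high threshold: %.2f MB" % high)            # ~1597.44 MB
print("soft threshold: %.2f MB" % (high * 0.64))   # ballooning
print("hard threshold: %.2f MB" % (high * 0.32))   # ballooning + compression
print("low  threshold: %.2f MB" % (high * 0.16))   # ballooning + compression + swapping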

Module 9 Memory Optimization 521


Memory Reclamation Review
Slide 9-20

The hypervisor uses memory-reclamation techniques to reclaim host physical memory.
• Transparent page sharing is done by default.
  • This is a low-overhead task.
• Ballooning (which can cause guest paging) is more efficient than host-level swapping.
  • Both techniques may result in memory pages being written to disk.
  • Because the guest operating system is aware of its memory usage, the guest operating system pages more efficiently than the host will swap.
• Memory compression
  • Compresses pages in memory rather than writing them to disk.
• Host-level swapping:
  • This technique quickly reclaims memory.
  • There is higher performance overhead than the other techniques.
Memory reclamation supports memory overcommitment.

To review, the hypervisor uses a variety of memory-reclamation techniques to reclaim host physical
(machine) memory.
Transparent page sharing and ballooning are the preferred techniques because they have a smaller
performance penalty for the virtual machine than host-level swapping. However, both techniques
take time to work. Transparent page sharing might or might not find memory to share. Ballooning
relies on the guest operating system to take actions to reclaim memory.

522 VMware vSphere: Optimize and Scale


Both ballooning and host-level swapping result in memory pages being written to disk, either by the
guest operating system or by the host. But because the guest operating system is more aware than
the host about its memory usage, it will page much more efficiently than the host can swap.
Therefore, ballooning (which can result in guest paging) is more efficient than host-level swapping.
Memory compression stores pages in memory, which would otherwise be swapped out to disk
through host swapping, in a compression cache located in the main memory.
Host-level swapping is a guaranteed method of reclaiming a specific amount of memory in a
specific amount of time. However, because host-level swapping has a greater effect on performance
than ballooning, this reclamation technique is used by the hypervisor as a last resort.

Transparent page sharing is always performed because it is enabled by default. Ballooning and host-level swapping are performed when a host’s physical memory is overcommitted. If a host’s physical memory is not overcommitted, there is no reason to reclaim memory through ballooning or host-level swapping.

Module 9 Memory Optimization 523


Memory Space Overhead
Slide 9-21

Memory space overhead has the following components:
• VMkernel: 100MB+
  • Depends on the number of host devices
• Virtual machine memory overhead:
  • Varies, based on the size of guest memory and the number of vCPUs

Example: a virtual machine with 1,024MB of RAM and 2 vCPUs incurs roughly 30MB of overhead; a virtual machine with 16GB of RAM and 4 vCPUs incurs roughly 151MB of overhead.

Virtualization of memory resources has some associated overhead. The memory space overhead has
the following components:
• The VMkernel itself uses a small amount of memory. The amount depends on the number and size of the device drivers that are being used.
• Each virtual machine that is powered on has overhead. Overhead memory includes space
reserved for the virtual machine frame buffer and various virtualization data structures, such as
shadow page tables. Overhead memory depends on the number of virtual CPUs and the
configured memory for the guest operating system.
For a table of overhead memory on virtual machines, see vSphere Resource Management Guide at
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.

524 VMware vSphere: Optimize and Scale


Review of Learner Objectives
Slide 9-22

You should be able to do the following:


• Describe how memory is used in a virtualized environment.
• Explain each memory-reclamation technique.
• Explain how memory overcommitment impacts performance.

Module 9 Memory Optimization 525


Lesson 2: Monitoring Memory Activity
Slide 9-23

Lesson 2:
Monitoring Memory Activity

526 VMware vSphere: Optimize and Scale


Learner Objectives
Slide 9-24

After this lesson, you should be able to do the following:


• Monitor host memory usage and virtual machine memory usage.
• Monitor host swapping activity.
• Monitor host ballooning activity.

Module 9 Memory Optimization 527


Monitoring Virtual Machine Memory Usage
Slide 9-25

Consumed host memory:
• The amount of host physical memory allocated to a guest
Active guest memory:
• The amount of guest physical memory actively used by the guest operating system and its applications
Monitoring the memory usage is useful for quickly analyzing a virtual machine’s status.

In order to quickly monitor virtual machine memory usage, the VMware vSphere® Client™
exposes two memory statistics in the Summary tab of a virtual machine: consumed host memory
and active guest memory.
Consumed host memory is the amount of host physical memory that is allocated to the virtual
machine. This value includes virtualization overhead.
Active guest memory is defined as the amount of guest physical memory that is currently being used
by the guest operating system and its applications.
These two statistics are quite useful for analyzing the memory status of the virtual machine and
providing hints to address potential performance issues.

528 VMware vSphere: Optimize and Scale


Memory Usage Metrics Inside the Guest Operating System
Slide 9-26

Why is guest/host memory usage different than what I see inside the guest operating system?
• Guest physical memory:
  • Guest has better visibility while estimating active memory.
  • ESXi active memory estimate technique can take time to converge.
• Host physical memory:
  • Host memory usage does not correspond to any memory metric within the guest.
  • Host memory usage size is based on a virtual machine’s relative priority on the physical host and memory usage by the guest.

As you monitor your system, you might notice that the host and guest memory usage values
reported in the vSphere Client are different than the memory usage metrics reported by the guest
operating system tools.
For guest memory usage, there are a couple of reasons why the values reported in the vSphere Client
might be different than the active memory usage reported by the guest operating system. The first
reason is that the guest operating system generally has a better idea (than the hypervisor) of what
memory is “active.” The guest operating system knows what applications are running and how it has
allocated memory. The second reason is that the method employed by the hypervisor to estimate
active memory usage takes time to converge. Therefore, the guest operating system’s estimation
might be more accurate than the ESXi host if the memory workload is fluctuating.
For host memory usage, the host memory usage metric has absolutely no meaning inside the guest
operating system. This is because the guest operating system does not know that it is running within
a virtual machine or that other virtual machines exist on the same physical host.

Module 9 Memory Optimization 529


Consumed Host Memory and Active Guest Memory
Slide 9-27

If consumed host memory > active guest memory:
• If memory is not overcommitted, this is a good thing – there is nothing to worry about.
• This value represents the highest amount of memory used by the guest.
If consumed host memory <= active guest memory:
• Active guest memory might not completely reside in host physical memory.
• This might point to potential performance degradation.

When might active guest memory not completely reside in host physical memory?

If you monitor memory usage, you might wonder why consumed host memory is greater than active
guest memory. The reason is that for physical hosts that are not overcommitted on memory,
consumed host memory represents the highest amount of memory usage by a virtual machine. It is
possible that in the past this virtual machine was actively using a very large amount of memory.
Because the host physical memory is not overcommitted, there is no reason for the hypervisor to
invoke ballooning or host-level swapping to reclaim memory. Therefore, you can find the virtual
machine in a situation where its active guest memory use is low but the amount of host physical
memory assigned to it is high. This is a perfectly normal situation, so there is nothing to be
concerned about.
If consumed host memory is less than or equal to active guest memory, this might be because the
active guest memory of a virtual machine might not completely reside in host physical memory.
This might occur if a guest’s active memory has been reclaimed by the balloon driver or if the
virtual machine has been swapped out by the hypervisor. In both cases, this is probably due to high
memory overcommitment.
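These two counters can also be pulled programmatically. A minimal pyVmomi sketch (hypothetical vCenter Server address and credentials) that reads the same values the Summary tab shows – consumed host memory (quickStats.hostMemoryUsage) and active guest memory (quickStats.guestMemoryUsage), both in MB – and flags virtual machines whose active memory might not fully reside in host physical memory:

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab use only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    vms = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
    for vm in vms.view:
        if vm.runtime.powerState != vim.VirtualMachinePowerState.poweredOn:
            continue
        qs = vm.summary.quickStats
        consumed = qs.hostMemoryUsage      # MB of host physical memory allocated to the VM
        active = qs.guestMemoryUsage       # MB of guest physical memory estimated as active
        flag = "check for ballooning/swapping" if consumed <= active else "ok"
        print("%-30s consumed=%5d MB  active=%5d MB  %s" % (vm.name, consumed, active, flag))
finally:
    Disconnect(si)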

530 VMware vSphere: Optimize and Scale


Monitoring Memory Usage Using resxtop/esxtop
Slide 9-28

(Screenshot callouts: memory state – possible states are high, soft, hard, and low; PCI hole; VMKMEM – memory managed by the VMkernel; host physical memory (PMEM).)

There are a lot of memory statistics to look at in resxtop. To display the memory screen, type m.
Type V to list only the virtual machines. Here are a few metrics of interest:
• PMEM/MB – The total amount of physical memory on your host
• VMKMEM/MB – The amount of physical memory currently in use by the VMkernel
• PSHARE - Savings from transparent page sharing – The saving field shows you how much
memory you have saved through transparent page sharing. In the example above, two virtual
machines are running, with a memory savings of 578 MB.

Module 9 Memory Optimization 531


• Memory state – This value can be high, low, soft, or hard. It refers to the current state of the
VMkernel’s memory. It tells you whether the VMkernel has enough memory for performing its
critical operations.
If the state is high, the VMkernel has sufficient memory to perform its critical tasks. However,
if the state is soft, hard, or low, the VMkernel is having trouble getting the memory it needs. For
all practical purposes, you should see only a state of high, even if you are overcommitting your
memory. (ESXi employs good mechanisms for reserving memory for the VMkernel.) However,
if your host is severely overcommitted, you might see a state other than high.
This state is not related to host-level swapping, either. Even though the state is high, there could
still be swapping on the host.
The PCI hole refers to the memory reserved by the PCI devices on the host. This is memory used
solely by the PCI devices for I/O purposes. It cannot be used by the VMkernel. Reserving memory
by PCI devices is common in other operating systems as well, such as Windows and Linux. The
amount of memory used for PCI devices is part of the “other” value, located in the PMEM/MB line.

532 VMware vSphere: Optimize and Scale


Monitoring Host Swapping in the vSphere Client
Slide 9-29

Host-level swapping severely affects the performance of the virtual machine being swapped, as well as other virtual machines. (The VMware vSphere® Client™ provides metrics for monitoring swapping activity.)
Excessive memory demand can cause severe performance problems for one or more virtual
machines on an ESXi host. When ESXi is actively swapping the memory of a virtual machine in and
out from disk, the performance of that virtual machine will degrade. The overhead of swapping a
virtual machine's memory in and out from disk can also degrade the performance of other virtual
machines.
The metrics in the vSphere Client for monitoring swapping activity are the following:
• Memory Swap In Rate – The rate at which memory is being swapped in from disk
• Memory Swap Out Rate – The rate at which memory is being swapped out to disk
High values for either metric indicate a lack of memory and that performance is suffering as a result.

Module 9 Memory Optimization 533


Host Swapping Activity in resxtop/esxtop: Memory Screen
Slide 9-30

Memory screen (m) for virtual machine worlds
(Screenshot callouts: total memory swapped for all virtual machines on the host; swap reads/second; swap writes/second.)

To monitor a host’s swapping activity in resxtop, view the memory screen. A useful metric is
SWAP/MB. This metric represents total swapping for all the virtual machines on the host. In the
example above, the value of curr is 198MB, which means that currently 198MB of swap space is
used. The value of target, which is 0 in the example above, is for where ESXi expects the swap
usage to be.
To monitor host-level swapping activity per virtual machine, type V in the resxtop/esxtop
window to display just the virtual machines. Here are some useful metrics:
• SWR/s and SWW/s – Measured in megabytes, these counters represent the rate at which the
ESXi host is swapping memory in from disk (SWR/s) and swapping memory out to disk
(SWW/s).
• SWCUR – This is the amount of swap space currently used by the virtual machine.
• SWTGT – This is the amount of swap space that the host expects the virtual machine to use.

534 VMware vSphere: Optimize and Scale


Host Swapping Activity in resxtop/esxtop: CPU Screen
Slide 9-31

CPU screen (c) for virtual machine worlds
(Screenshot callout: %SWPWT – the percentage of time the virtual machine has waited for swap activity.)
In addition, the CPU screen (type c), contains a metric named %SWPWT. This is the best indicator
of a performance problem due to wait time experienced by the virtual machine. This metric
represents the percentage of time that the virtual machine is waiting for memory to be swapped in.

Module 9 Memory Optimization 535


Host Ballooning Activity in the vSphere Client
Slide 9-32

Ballooning has a lower performance penalty than swapping.
• However, high ballooning activity can affect performance (high guest operating system swapping).

Ballooning is part of normal operations when memory is overcommitted. The fact that ballooning is
occurring is not necessarily an indication of a performance problem. The use of the balloon driver
enables the guest to give up physical memory pages that are not being used. However, if ballooning
causes the guest to give up memory that it actually needs, performance problems can occur due to
guest operating system paging.
In the vSphere Client, use the Memory Balloon metric to monitor a host’s ballooning activity. This
metric represents the total amount of memory claimed by the balloon drivers of the virtual machines
on the host. The memory claimed by the balloon drivers is for use by other virtual machines.
Again, this is not a performance problem per se, but it represents the host starting to take memory
from less needful virtual machines for those with large amounts of active memory. If the host is
ballooning, check swap rate counters (Memory Swap In Rate and Memory Swap Out Rate), which
might indicate performance problems.

536 VMware vSphere: Optimize and Scale


Host Ballooning Activity in resxtop
Slide 9-33

Balloon activity can also be monitored in resxtop. Here are some useful metrics:
• MEMCTL/MB – This line displays the memory balloon statistics for the entire host. All
numbers are in megabytes. “curr” is the total amount of physical memory reclaimed using the
balloon driver (also known as the vmmemctl module). “target” is the total amount of physical
memory ESXi wants to reclaim with the balloon driver. “max” is the maximum amount of
physical memory that the ESXi host can reclaim with the balloon driver.
• MCTL? – This value, which is either Y (for yes) or N (for no), indicates whether the balloon
driver is installed in the virtual machine.
• MCTLSZ – This value is reported for each virtual machine and represents the amount of
physical memory that the balloon driver is holding for use by other virtual machines. In the
example above, the balloon driver for VMTest01 is holding approximately 634MB of memory
for use by other virtual machines.
• MCTLTGT – This value is the amount of physical memory that the host wants to reclaim from
the virtual machine through ballooning. In the example above, this value is approximately
634MB.
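A small sketch of how to interpret these current-versus-target pairs; the sample numbers are taken from the examples in this lesson (634MB ballooned, 198MB swapped with a 0MB target), and in practice they come from the resxtop memory screen:

def interpret(name, current_mb, target_mb):
    # Target above current: the host wants to reclaim more through this technique.
    if target_mb > current_mb:
        return "%s: target %dMB > current %dMB - more reclamation expected" % (name, target_mb, current_mb)
    # Target below current: pressure has eased and reclaimed memory is being given back.
    if target_mb < current_mb:
        return "%s: target %dMB < current %dMB - reclamation winding down" % (name, target_mb, current_mb)
    return "%s: current and target have converged at %dMB" % (name, current_mb)

print(interpret("Balloon (MCTLSZ vs. MCTLTGT)", 634, 634))
print(interpret("Swap (SWCUR vs. SWTGT)", 198, 0))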

Module 9 Memory Optimization 537


Lab 14 Introduction
Slide 9-34

(Diagram – Case 1: baseline data; # ./starttest4 runs in TestVM (1GB RAM) on your ESXi host. Case 2: data after workloads started; TestVM is added to the RAM-HOG vApp (limit 1,000MB) together with the Workload01 … Workload06 virtual machines (128MB RAM each).)

In this lab, you perform two tests:


• Test case 1 – Run starttest4 in your test virtual machine and record baseline performance
data.
• Test case 2 – Start the RAM-HOG vApp to generate memory contention. This vApp contains six
memory-intensive virtual machines named Workload01 through Workload06. Add your test
virtual machine to RAM-HOG. After the workloads are started, record performance data.
Answer questions posed in the lab and compare the baseline data with the performance data after the
workload has started.

538 VMware vSphere: Optimize and Scale


Lab 14
Slide 9-35

In this lab, you will use performance charts and resxtop to monitor memory performance:
1. Generate database activity.
2. Display the memory screen in resxtop.
3. Display a memory performance chart for your test virtual machine.
4. Observe your test virtual machine’s memory usage.
5. Import the RAM-HOG vApp.
6. Overcommit memory resources.
7. Observe memory statistics.
8. Clean up for the next lab.

Module 9 Memory Optimization 539


Lab 14 Review
Slide 9-36

Review the lab questions:
• Did the values for MCTLSZ and MCTLTGT converge?
• Did the values for SWCUR and SWTGT converge?
• Which virtual machines do not have the balloon driver installed?
• Which virtual machines experienced swapping activity?
  • Did the %SWPWT value exceed 5 percent for any of these virtual machines?
• Based on the opm value that you recorded, did the performance of the starttest4 script degrade?

Your instructor will review these lab questions with you.

540 VMware vSphere: Optimize and Scale


Review of Learner Objectives
Slide 9-37

You should be able to do the following:


• Monitor host memory usage and virtual machine memory usage.
• Monitor host swapping activity.
• Monitor host ballooning activity.

Module 9 Memory Optimization 541


Lesson 3: Troubleshooting Memory Performance Problems
Slide 9-38

Lesson 3:
Troubleshooting Memory Performance
Problems

542 VMware vSphere: Optimize and Scale


Learner Objectives
Slide 9-39

After this lesson, you should be able to do the following:


• Describe various memory performance problems.
• Discuss the causes of memory performance problems.
• Propose solutions to correct memory performance problems.

Module 9 Memory Optimization 543


Review: Basic Troubleshooting Flow for ESXi Hosts
Slide 9-40

(Flowchart: basic troubleshooting flow for ESXi hosts.)
Definite problems:
1. Check for VMware® Tools™ status.
2. Check for resource pool CPU saturation.
3. Check for host CPU saturation.
4. Check for guest CPU saturation.
5. Check for active VM memory swapping.
6. Check for VM swap wait.
7. Check for active VM memory compression.
8. Check for an overloaded storage device.
9. Check for dropped receive packets.
10. Check for dropped transmit packets.
Likely problems:
11. Check for using only one vCPU in an SMP VM.
12. Check for high CPU ready time on VMs running in under-utilized hosts.
13. Check for slow storage device.
14. Check for random increase in I/O latency on a shared storage device.
15. Check for random increase in data transfer rate on network controllers.
Possible problems:
16. Check for low guest CPU utilization.
17. Check for past VM memory swapping.
18. Check for high memory demand in a resource pool.
19. Check for high memory demand in a host.
20. Check for high guest memory demand.

The slide shows the basic troubleshooting flow discussed in an earlier module. This lesson focuses
on memory-related problems. The problems you will look at are active (and past) host-level
swapping, guest operating system paging, and high guest memory demand.



Active Host-Level Swapping (1)
Slide 9-41


Excessive memory demand can cause severe performance problems for one or more virtual
machines on an ESXi host. When ESXi is actively swapping the memory of a virtual machine in and
out from disk, the performance of that virtual machine will degrade. The overhead of swapping a
virtual machine's memory in and out from disk can also degrade the performance of other virtual
machines.
Check for active host-level swapping. To do this, use the vSphere Client to display an advanced
performance chart for memory. Look at the Memory Swap In Rate and Memory Swap Out Rate
counters for the host.
If either of these measurements is greater than zero at any time during the displayed period, then the
ESXi host is actively swapping virtual machine memory. This is a performance problem. If neither of
these measurements is greater than zero, return to the basic troubleshooting flow and look at the
next problem in the list.
In addition, identify the virtual machines affected by host-level swapping. To do this, look at the
Memory Swap In Rate and Memory Swap Out Rate counters for all the virtual machines on the host.
A stacked graph using the vSphere Client works well for this information. If the values for any of
the virtual machines are greater than zero, then host-level swapping is directly affecting those virtual
machines. Otherwise, host-level swapping is not directly affecting the other virtual machines.
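This check can also be scripted against vCenter Server. The following is a minimal sketch, not part of the course labs, assuming the pyVmomi Python bindings, a vCenter Server named vcsa.example.com, and a host named esxi01.example.com (hypothetical names); the counter names mem.swapinRate.average and mem.swapoutRate.average are assumed to be the API equivalents of the Memory Swap In Rate and Memory Swap Out Rate chart counters.

# Minimal sketch (not a course lab tool): query host swap-in/swap-out rates
# with pyVmomi. Host name, credentials, and counter names are assumptions.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab-only: skip certificate checks
si = SmartConnect(host='vcsa.example.com', user='administrator@vsphere.local',
                  pwd='password', sslContext=ctx)
content = si.RetrieveContent()
perf = content.perfManager

# Build a lookup of "group.name.rollup" -> counter ID.
counters = {f"{c.groupInfo.key}.{c.nameInfo.key}.{c.rollupType}": c.key
            for c in perf.perfCounter}

host = content.searchIndex.FindByDnsName(dnsName='esxi01.example.com',
                                         vmSearch=False)

names = ('mem.swapinRate.average', 'mem.swapoutRate.average')
metric_ids = [vim.PerformanceManager.MetricId(counterId=counters[n], instance="")
              for n in names if n in counters]

# Real-time (20-second) samples; 15 samples covers the last five minutes.
spec = vim.PerformanceManager.QuerySpec(entity=host, metricId=metric_ids,
                                        intervalId=20, maxSample=15)
for result in perf.QueryPerf(querySpec=[spec]):
    for series in result.value:
        if any(v > 0 for v in series.value):
            print("Host is actively swapping (counter id", series.id.counterId, ")")

Disconnect(si)

The same query can be pointed at each virtual machine object instead of the host to identify which virtual machines the swapping is directly affecting.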



Active Host-Level Swapping (2)
Slide 9-42


If you determine that host-level swapping is affecting the performance of your virtual machines, the
next step is to use resxtop/esxtop to measure the %SWPWT counter for the affected virtual
machines. %SWPWT tracks the percentage of time that the virtual machine is waiting for its pages
to be swapped in. This counter is the best indicator of a performance problem because the longer a
virtual machine waits for its pages, the slower will be the performance of the guest operating system
and applications running in it. Any value greater than 0 warrants attention. If the value is greater
than 5, the problem should be immediately addressed.
The %SWPWT counter is found on the CPU screen. (Type c in the resxtop window.)
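When many virtual machines must be checked at once, resxtop batch output can be post-processed instead of read interactively. The sketch below is illustrative only; it assumes a capture produced with resxtop in batch mode (redirected to stats.csv) and that the %SWPWT columns contain the text '% Swap Wait' in their headers, which should be verified against your own capture.

# Illustrative sketch: flag entities whose %SWPWT exceeds 5% in a resxtop
# batch capture. The file name and header text are assumptions.
import csv

THRESHOLD = 5.0
with open('stats.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)
    # Column headers typically look like "\\host\Group Cpu(1234:vmname)\% Swap Wait".
    swpwt_cols = [i for i, name in enumerate(header) if '% Swap Wait' in name]
    worst = {header[i]: 0.0 for i in swpwt_cols}
    for row in reader:
        for i in swpwt_cols:
            try:
                worst[header[i]] = max(worst[header[i]], float(row[i]))
            except (ValueError, IndexError):
                continue

for col, value in worst.items():
    if value > THRESHOLD:
        print(f"Needs attention ({value:.1f}% swap wait): {col}")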



Causes of Active Host-Level Swapping
Slide 9-43

The basic cause of host-level swapping:


• Memory overcommitment from using memory-intensive virtual
  machines whose combined configured memory is greater than the
  amount of host physical memory available
Causes of active host-level swapping:
• Excessive memory overcommitment
• Memory overcommitment with memory reservations
• Balloon drivers in virtual machines not running or disabled

The basic cause of host-level swapping is memory overcommitment from using memory-intensive
virtual machines whose combined configured memory is greater than the amount of host physical
memory available. The following conditions can cause host-level swapping:
• Excessive memory overcommitment – In this condition, the level of memory overcommitment
is too high for the combination of memory capacity, virtual machine allocations, and virtual
machine workload. In addition, the amount of overhead memory must be taken into
consideration when using memory overcommitment.
• Memory overcommitment with memory reservations – In this condition, the total amount of
balloonable memory might be reduced by memory reservations. If a large portion of a host’s
memory is reserved for some virtual machines, the host might be forced to swap the memory of
other virtual machines. This might happen even though virtual machines are not actively using
all of their reserved memory.
• Balloon drivers in virtual machines not running or disabled – In this condition, if the balloon
driver is not running, or has been deliberately disabled in some virtual machines, the amount of
balloonable memory on the host will be decreased. As a result, the host might be forced to swap
even though there is idle memory that could have been reclaimed through ballooning.



Resolving Host-Level Swapping
Slide 9-44

To resolve this problem:


• Reduce the level of memory overcommitment.
• Enable the balloon driver in all virtual machines.
• Add memory to the host.
• Reduce memory reservations.
• Use resource controls to dedicate memory to critical virtual machines.

There are several possible solutions to performance problems caused by active host-level swapping
(listed on the slide).
In most situations, reducing memory overcommitment levels is the proper approach for eliminating
swapping on an ESXi host. However, be sure to consider all of the factors discussed in this lesson to
ensure that the reduction is adequate to eliminate swapping. If other approaches are used, be sure to
monitor the host to ensure that swapping has been eliminated.
Each solution is discussed in the following pages.



Reducing Memory Overcommitment
Slide 9-45

Ways to reduce the level of memory overcommitment:


• Add physical memory to the ESXi host.
• Reduce the number of virtual machines running on the ESXi host.
• Increase available memory resources by adding the host to a VMware
  Distributed Resource Scheduler cluster.

To reduce the level of memory overcommitment, do one or more of the following:
• Add physical memory to the ESXi host. Adding physical memory to the host will reduce the
level of memory overcommitment and it might eliminate the memory pressure that caused
swapping to occur.
• Reduce the number of virtual machines running on the ESXi host. One way to do this is to use
VMware vSphere® vMotion® to migrate virtual machines to hosts with available memory
resources. In order to do this successfully, the vSphere Client performance charts should be
used to determine the memory usage of each virtual machine on the host and the available
memory resources on the target hosts. You must ensure that the migrated virtual machines will
not cause swapping to occur on the target hosts.
If additional ESXi hosts are not available, you can reduce the number of virtual machines by
powering off noncritical virtual machines. This will make additional memory resources
available for critical applications. Even idle virtual machines consume some memory resources.
• Increase available memory resources by adding the host to a VMware vSphere® Distributed
Resource Scheduler™ (DRS) cluster. This is similar to the previous solution. However, in a DRS
cluster, load rebalancing can be performed automatically. It is not necessary to manually compute
the compatibility of specific virtual machines and hosts, or to account for peak usage periods.



Enabling Balloon Driver in Virtual Machines
Slide 9-46

To enable the balloon driver in a virtual machine, install VMware Tools.
If a virtual machine has critical memory needs, use resource controls
to satisfy those needs.
Never disable the balloon driver. Always keep it enabled.

In order to maximize the ability of ESXi to recover idle memory from virtual machines, the balloon
driver should be enabled in all virtual machines. If a virtual machine has critical memory needs,
reservations and other resource controls should be used to ensure that those needs are satisfied. The
balloon driver is enabled when VMware® Tools™ is installed into the virtual machine.
If there is excessive memory overcommitment, however, ballooning will only delay the onset of
swapping. Therefore, it is not a sufficient solution. The level of memory overcommitment should be
reduced as well.
The balloon driver should never be deliberately disabled in a virtual machine. Disabling the balloon
driver might cause unintended performance problems such as host-level swapping. It also makes
tracking down memory-related problems more difficult. VMware vSphere® provides other
mechanisms, such as memory reservations, for controlling the amount of memory available to a
virtual machine.



Memory Hot Add
Slide 9-47

Add memory resources to a powered-on virtual machine that meets
the following conditions:
• The virtual machine has a guest operating system that supports
  memory hot add functionality.
• The virtual machine is using at least hardware version 7.
• VMware Tools is installed.
Memory hot add lets you add memory resources for a virtual machine while the machine is
powered on.
Hot-add memory allows ranges of physical memory to be added to a running operating system
without requiring the system to be shut down. Hot-add allows an administrator to add memory to a
running system to address existing or future needs for memory without incurring downtime for the
system or applications.



Memory Hot Add Procedure
Slide 9-48

To enable Hot Plug Memory for a virtual machine:


1. In the vSphere Client inventory, right-click the virtual machine and select Edit Settings.

2. Click the Options tab and under Advanced, select Memory/CPU Hotplug.

3. Select whether to enable or disable memory hot add for this virtual machine.

4. Click OK to save your changes and close the dialog box.
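The same setting can be applied programmatically. The following is a minimal sketch, assuming the pyVmomi Python bindings and a vim.VirtualMachine object named vm that has already been retrieved from the inventory; it is one possible automation of the procedure above, not the documented method.

# Minimal sketch, assuming pyVmomi and an existing connection; 'vm' is a
# vim.VirtualMachine already looked up in the inventory (lookup not shown).
from pyVmomi import vim

def enable_memory_hot_add(vm):
    # The virtual machine must be powered off for this setting to change.
    spec = vim.vm.ConfigSpec(memoryHotAddEnabled=True)
    return vm.ReconfigVM_Task(spec=spec)

def hot_add_memory(vm, new_memory_mb):
    # Works on a powered-on VM only if memory hot add was enabled beforehand
    # and the guest operating system supports hot-added memory.
    spec = vim.vm.ConfigSpec(memoryMB=new_memory_mb)
    return vm.ReconfigVM_Task(spec=spec)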



Reducing a Virtual Machine's Memory Reservations
Slide 9-49

Reevaluate a virtual machine’s memory reservations if:
• That virtual machine’s reservations cause the host to swap virtual
  machines without reservations.
Reduce a virtual machine’s memory reservation if:
• The virtual machine is not using its full reservation.
If the reservation cannot be reduced, memory overcommitment must
be reduced.
If virtual machines configured with large memory reservations are causing the ESXi host to swap
memory from virtual machines without reservations, the need for those reservations should be
reevaluated.
If a virtual machine with reservations is not using its full reservation, even at peak periods, then the
reservation should be reduced.
If the reservations cannot be reduced, either because the memory is needed or for business reasons,
then the level of memory overcommitment must be reduced.



Dedicating Memory to Critical Virtual Machines
Slide 9-50

Use memory reservations to prevent critical virtual machines from
swapping.
• Use as a last resort.
• This might move the swapping problem to other virtual machines.

As a last resort, when it is not possible to reduce the level of memory overcommitment, configure
your performance-critical virtual machines with a sufficient memory reservation to prevent them
from swapping. However, doing this only moves the swapping problem to other virtual machines,
whose performance will be severely degraded. In addition, swapping of other virtual machines
might still affect the performance of virtual machines with reservations, due to added disk traffic
and memory management overhead.



Guest Operating System Paging
Slide 9-51

Monitor the host’s ballooning activity.
• If ballooning > 0 for the host, look at ballooning activity for the virtual
  machines with performance problems.
  • If ballooning > 0 for those virtual machines, check for high paging activity
    within the guest operating systems.

Cause: If overall memory demand is high, the ballooning mechanism might
reclaim active memory from the application or guest OS.
Solution: Use resource controls to direct resources to critical VMs. Reduce
memory overcommitment on the host.

High demand for host memory will not necessarily cause performance problems. Host physical
memory can be used without an effect on virtual machine performance as long as memory is not
overcommitted and sufficient memory capacity is available for virtual machine overhead and for the
internal operations of the ESXi host. Even if memory is overcommitted, ESXi can use ballooning to
reclaim unused memory from nonbusy virtual machines to meet the needs of more demanding
virtual machines. Ballooning can maintain good performance in most cases.
However, if overall demand for host physical memory is high, the ballooning mechanism might
reclaim memory that is needed by an application or guest operating system. In addition, some
applications are highly sensitive to having any memory reclaimed by the balloon driver.
To determine whether ballooning is affecting performance of a virtual machine, you must
investigate the use of memory within the virtual machine. If a virtual machine with a balloon value
of greater than zero is experiencing performance problems, the next step is to examine paging
statistics from within the guest operating system. In Windows guests, these statistics are available
through Perfmon. In Linux guests, they are provided by such tools as vmstat and sar. The values
provided by these monitoring tools within the guest operating system are not strictly accurate. But
they will provide enough information to determine whether the virtual machine is experiencing
memory pressure.
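Inside a Linux guest, a rough scripted check is to sample the si (swap-in) and so (swap-out) columns reported by vmstat. The sketch below is illustrative and assumes the standard vmstat column layout; as noted above, in-guest statistics are approximate but sufficient to detect memory pressure.

# Rough illustration, run inside a Linux guest: sample vmstat and report
# swap-in/swap-out activity. Assumes the standard vmstat column layout.
import subprocess

def guest_is_paging(samples=5, interval=2):
    out = subprocess.run(['vmstat', str(interval), str(samples)],
                         capture_output=True, text=True, check=True).stdout
    lines = [line.split() for line in out.strip().splitlines()]
    header = lines[1]                 # second line holds the column names
    si_col, so_col = header.index('si'), header.index('so')
    # Skip the first data row; it reports averages since boot.
    for row in lines[3:]:
        if int(row[si_col]) > 0 or int(row[so_col]) > 0:
            return True
    return False

if __name__ == '__main__':
    print('Guest paging detected:', guest_is_paging())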



If ballooning activity is causing performance problems in the guest operating system, there are two
possible approaches to addressing this problem:
• Use resource controls to direct available resources to critical virtual machines. In order to
prevent the host from reclaiming memory via ballooning from the virtual machine, memory
reservations can be set for the virtual machine to ensure that it has adequate memory available.
Monitor the actual amount of memory used by the virtual machine to determine the appropriate
reservations value. Increasing the reservation for the virtual machine might cause the ESXi host
to reclaim memory through ballooning from other virtual machines. In some cases, it might also
lead to host-level swapping. Hosts with memory overcommitment should be monitored for
host-level swapping.
• Reduce memory overcommitment on the host. Ballooning can be eliminated on the host by
reducing memory overcommitment, as discussed earlier. However, this approach will prevent
ESXi from taking advantage of otherwise idle memory. The overall level of memory usage
should be considered before using this approach to solve the problem of guest operating system
paging in one virtual machine.



Example: Ballooning Versus Swapping
Slide 9-52

The slide shows a resxtop memory screen for a host running several memory-hog
virtual machines, with the following callouts:
• The swap target is higher for the virtual machines without the balloon driver.
• MCTL: N means that the balloon driver is not active (VMware Tools is probably
  not installed).
• The virtual machine with the balloon driver swaps less.

This example shows why ballooning is preferable to swapping. Multiple virtual machines are
running, and all of them are running a memory-intensive application. The MCTL? field
shows that the virtual machines named Workload## do not have the balloon driver installed, whereas
the virtual machine named VMtest02 does. If the balloon driver is not installed, this typically means
that VMware Tools has not been installed in the virtual machine.
The Workload virtual machines have a higher swap target (SWTGT) than the other virtual machines
because these virtual machines do not have the balloon driver installed. The virtual machine that has
the balloon driver installed (VMTest02) can reclaim memory by using the balloon driver. If ESXi
cannot reclaim memory through ballooning, then it will have to resort to other methods, such as
host-level swapping.



High Guest Memory Demand
Slide 9-53

Monitor the virtual machine’s memory usage:
• If average > 80% or peaks > 90%, high guest memory demand might
  be causing problems within the guest operating system.

Cause: The application and guest operating system are using a high
percentage of the memory allocated to them.
Solution: Configure additional memory for the virtual machine. Tune the
application to reduce its memory demand.

Performance problems can occur when the application and the guest operating system running in a
virtual machine are using a high percentage of the memory allocated to them. High guest memory
can lead to guest operating system paging as well as other, application-specific performance
problems. The causes of high memory demand will vary depending on the application and guest
operating system being used. It is necessary to use application-specific and operating system–
specific monitoring and tuning documentation in order to determine why the guest is using most of
its allocated memory and whether this is causing performance problems.
If high demand for guest memory is causing performance problems, the following actions can be
taken to address the performance problems:
• Configure additional memory for the virtual machine. In a vSphere environment, it is much
easier to add memory to a guest than would be possible in a physical environment. Ensure that
the guest operating system and application used can take advantage of the added memory. For
example, some 32-bit operating systems and most 32-bit applications can access only 4GB of
memory, regardless of how much is configured. Also, check that the new allocation will not
cause excessive overcommitment on the host, because this will lead to host-level swapping.
• Tune the application to reduce its memory demand. The methods for doing this are application-
specific.
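As a rough automated screen for this condition, the active guest memory reported by vCenter Server can be compared with the configured memory size. A minimal sketch, assuming pyVmomi and an already retrieved vim.VirtualMachine object vm; note that quickStats.guestMemoryUsage is the host's estimate of active memory and only approximates the demand seen inside the guest.

# Minimal sketch, assuming pyVmomi and a vim.VirtualMachine object 'vm'.
# guestMemoryUsage (MB) is the host's estimate of active guest memory; it
# only approximates the demand observed inside the guest itself.
def guest_memory_pressure(vm, warn_pct=80.0):
    summary = vm.summary
    configured_mb = summary.config.memorySizeMB
    active_mb = summary.quickStats.guestMemoryUsage
    pct = 100.0 * active_mb / configured_mb
    if pct > warn_pct:
        print(f"{summary.config.name}: {pct:.0f}% of configured memory active "
              f"({active_mb} MB of {configured_mb} MB) - investigate in-guest")
    return pct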



When Swapping Occurs Before Ballooning
Slide 9-54

Many virtual machines are powered on at the same time.
• Virtual machines might access a large portion of their allocated
  memory.
• At the same time, balloon drivers have not yet started.
• This causes the host to swap virtual machines.
Host-level swapping slows the boot process but is not a problem
after the guest is booted up.
A virtual machine’s memory that is currently swapped out to disk will
not cause a performance problem if the memory is never accessed.

In some cases, virtual machine memory can remain swapped to disk even though the ESXi host is
not actively swapping. This will occur when high memory activity caused some virtual machine
memory to be swapped to disk and the virtual machine whose memory was swapped has not yet
attempted to access the swapped memory. As soon as the virtual machine does access this memory,
the host will swap it back in from disk.
Having virtual machine memory that is currently swapped out to disk will not cause performance
problems. However, it might indicate the cause of performance problems that were observed in the
past, and it might indicate problems that might recur at some point in the future.
There is one common situation that can lead to virtual machine memory being left swapped out even
though there was no observed performance problem. When a virtual machine’s guest operating
system first starts, there will be a period of time before the memory balloon driver begins running.
In that time, the virtual machine might access a large portion of its allocated memory. If many
virtual machines are powered on at the same time, the spike in memory demand, together with the
lack of running balloon drivers, might force the ESXi host to resort to host-level swapping. Some of
the memory that is swapped might never again be accessed by a virtual machine, causing it to
remain swapped to disk even though there is adequate free memory. This type of swapping might
slow the boot process but will not cause problems when the operating systems have finished
booting.



Memory Performance Best Practices
Slide 9-55

• Allocate enough memory to hold the working set of applications you will
  run in the virtual machine, thus minimizing swapping.
• Never disable the balloon driver. Always keep it enabled.
• Keep transparent page sharing enabled.
• Avoid overcommitting memory to the point that it results in heavy
  memory reclamation.



Lab 15
Slide 9-56

In this lab, you will use the performance troubleshooting checklist to


detect possible memory performance problems:
1. Generate memory activity.
2. Display the memory performance charts for your ESXi host and virtual
machines.

3. Check for active host-level swapping.
4. Check for past host-level swapping.

5. Check for guest operating system paging.
6. Check for high guest memory demand.
7. Summarize your findings.
8. Clean up for the next lab.



Review of Learner Objectives
Slide 9-57

You should be able to do the following:


• Describe various memory performance problems.
• Discuss the causes of memory performance problems.
• Propose solutions to correct memory performance problems.



Key Points
Slide 9-58

• The hypervisor uses memory-reclamation techniques – such as
  transparent page sharing, ballooning, and host-level swapping – to
  reclaim host physical memory.
• A host’s swap rates and ballooning activity are key memory
  performance metrics.
• The basic cause of memory swapping is memory overcommitment of
  memory-intensive virtual machines.

Questions?

MODULE 10

Virtual Machine and Cluster Optimization
Slide 10-1

Module 10
Virtual Machine and Cluster Optimization



You Are Here
Slide 10-2

The course map lists the following modules: Course Introduction, VMware
Management Resources, Performance in a Virtualized Environment, Network
Scalability, Network Optimization, Storage Scalability, Storage Optimization,
CPU Optimization, Memory Performance, VM and Cluster Optimization, and
Host and Management Scalability. You are here: VM and Cluster Optimization.



Importance
Slide 10-3

Proper configuration of the virtual machine can reduce the possibility
of performance problems occurring during normal operation.
To get the most out of VMware vSphere® Distributed Resource
Scheduler™ (DRS) and VMware vSphere® High Availability (vSphere
HA), you should know the guidelines for configuring a vSphere
HA/DRS cluster.
You should also know the guidelines for using resource allocation
settings (shares, limits, reservations, resource pools) to gain optimal
performance for your virtual machines.



Module Lessons
Slide 10-4

Lesson 1: Virtual Machine Optimization


Lesson 2: vSphere Cluster Optimization



Lesson 1: Virtual Machine Optimization
Slide 10-5

Lesson 1:
Virtual Machine Optimization




Learner Objectives
Slide 10-6

After this lesson, you should be able to do the following:


• Discuss guidelines for creating an optimal virtual machine configuration.



Virtual Machine Performance Overview
Slide 10-7

A well-tuned virtual machine provides an optimal environment for
applications to run in.
Performance considerations exist for the following:
• Guest operating system
• VMware® Tools™
• CPU
• Memory
• Storage
• Network


You can maintain and even improve the performance of your applications by ensuring that your
virtual machine is optimally configured.
Follow the guidelines when configuring your virtual machine for optimal performance. These
guidelines pertain to the guest operating system, VMware® Tools™, and the virtual hardware
resources (CPU, memory, storage, and networking).



Selecting the Right Guest Operating System
Slide 10-8

During virtual machine creation, select the right guest operating
system.
The guest operating system type determines default optimal devices
and their settings.

When you create a virtual machine, select the guest operating system type that you intend to install
in that virtual machine. The guest operating system type determines the default optimal devices for
the guest operating system, such as the SCSI controller and the network adapter to use.
Virtual machines support both 32-bit and 64-bit guest operating systems. For information about
guest operating system support, see VMware Compatibility Guide at http://www.vmware.com/
resources/compatibility.



Timekeeping in the Guest Operating System
Slide 10-9

Time measurements within a virtual machine can be inaccurate due
to difficulties with the guest operating system keeping the exact time.
To reduce the problem:
• Use guest operating systems that require fewer timer interrupts:
  • Most Windows, Linux 2.4: 100Hz (100 interrupts per second)
  • Most Linux 2.6: 1,000Hz
  • Recent Linux: 250Hz
• Use clock synchronization software in the guest:
  • As of VMware vSphere® ESXi™ 5.0, VMware Tools is a suitable choice.
• For Linux guests, use NTP instead of VMware Tools.
Within any virtual machine, use either VMware Tools time
synchronization or another timekeeping utility, but not both.

Because virtual machines work by time-sharing host physical hardware, virtual machines cannot
exactly duplicate the timing activity of physical machines. Virtual machines use several techniques
to minimize and conceal differences in timing performance. However, the differences can still
sometimes cause timekeeping inaccuracies and other problems in software running in a virtual
machine.
Several things can be done to reduce the problem of timing inaccuracies:
• If you have a choice, use guest operating systems that require fewer timer interrupts. The timer
interrupt rates vary between different operating systems and versions. For example:
• Windows systems typically use a base timer interrupt rate of 64 Hz or 100 Hz.
• 2.6 Linux kernels have used a variety of timer interrupt rates (100 Hz, 250 Hz, and 1000
Hz).
• The most recent 2.6 Linux kernels introduce the NO_HZ kernel configuration option
(sometimes called “tickless timer”) that uses a variable timer interrupt rate.



The total number of timer interrupts delivered to the virtual machine depends on several factors.
For example, in virtual machines running symmetric multiprocessing (SMP) hardware
abstraction layers (HALs) require more timer interrupts than those running uniprocessor HALs.
Also, the more vCPUs a virtual machine has, the more interrupts the virtual machine requires.
Delivering many virtual timer interrupts negatively impacts guest performance and increases
host CPU consumption.
• Use clock synchronization software in the guest. A few options are available for guest operating
system clock synchronization:
• NTP
• Windows Time Service
• VMware Tools time-synchronization option
• Another timekeeping utility suitable for your operating system
• For Linux guests, VMware® recommends that NTP be used instead of VMware Tools for time
synchronization. For complete details, see VMware knowledge base article 1006427 at
http://kb.vmware.com/kb/1006427.
For a detailed discussion on timekeeping, see “Timekeeping in VMware Virtual Machines” at
http://www.vmware.com/resources/techresources/238.
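To avoid running two synchronization mechanisms at once, you can check from inside the guest whether VMware Tools time synchronization is enabled. The sketch below is illustrative only; it assumes VMware Tools is installed, that vmware-toolbox-cmd is on the path, and that its status output contains the word Enabled, which should be verified on your guest.

# Illustrative check, run inside a guest with VMware Tools installed: report
# whether VMware Tools time synchronization is enabled, so it is not combined
# with another timekeeping utility such as NTP.
import subprocess

def tools_timesync_enabled():
    result = subprocess.run(['vmware-toolbox-cmd', 'timesync', 'status'],
                            capture_output=True, text=True)
    return 'Enabled' in result.stdout

if __name__ == '__main__':
    if tools_timesync_enabled():
        print('VMware Tools time sync is enabled; do not also run NTP sync.')
    else:
        print('VMware Tools time sync is disabled.')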



VMware Tools
Slide 10-10

The slide shows the VMware Tools–related portion of the basic performance
troubleshooting flow for vSphere ESXi hosts (check for VMware Tools status,
check for host CPU saturation), alongside the following points:
• VMware Tools improves the performance and management of the virtual machine.
• Install the latest version of VMware Tools.
• Keep VMware Tools up to date.
• If VMware Tools does not start properly, re-enable it.
• VMware Tools should always be running.

Install the latest version of VMware Tools in the guest operating system. Installing VMware Tools
in Windows guests updates the BusLogic SCSI driver included with the guest operating system to
the driver supplied by VMware. The VMware driver has optimizations that guest-supplied Windows
drivers do not.
VMware Tools also includes the balloon driver used for memory reclamation in VMware vSphere®
ESXi™. Ballooning does not work if VMware Tools is not installed. VMware Tools allows you to
synchronize the virtual machine time with that of the ESXi host.
VMware Tools is running when the guest operating system has booted. However, a couple of
situations exist that can prevent VMware Tools from starting properly after it has been installed:
• The operating system kernel in a Linux virtual machine has changed, for example, you updated
the Linux kernel to a newer version. Changing the Linux kernel can lead to conditions that
prevent the drivers that comprise VMware Tools from loading properly. To reenable VMware
Tools, rerun the script vmware-config-tools.pl in the Linux virtual machine.



• VMware Tools is disabled in the guest operating system. VMware Tools is installed as a set of
services in a Windows virtual machine or as a set of device drivers in a Linux virtual machine.
These services/drivers might be deliberately disabled, or unintentionally disabled when
performing other operating system–level tasks. In this case, reenable VMware Tools.
Make sure to update VMware Tools after each ESXi upgrade. Otherwise, the VMware vSphere®
Client™ reports that the tools are out of date in that virtual machine.



Virtual Hardware Version 9
Slide 10-11

For virtual machines running on an ESXi 5.x host, make sure that
their virtual hardware is at least version 9.
Virtual hardware version 9 includes support for the following:
• Up to 64 vCPUs
• Up to 1TB of RAM
• Virtual GPU
• Support for virtual Non-Uniform Memory Access (vNUMA)
• Support for 3D graphics
• Guest operating system storage reclamation
• Enhanced CPU virtualization


ESXi 5.0 introduces a new generation of virtual hardware – virtual hardware version 8. Version 8
supports a larger virtual machine as well as new features including:
• Support for virtual Non-Uniform Memory Access (vNUMA)
• Support for 3D graphics
• Virtual graphical processing unit (GPU) support
Hardware version 8 is the default for virtual machines created on an ESXi 5.0 host.
For virtual machines that are running on ESXi 5.1, VMware recommends that you upgrade the
virtual hardware to version 9. Virtual machines with virtual hardware version 9 are not supported on
pre-5.0 hosts.
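The upgrade can also be performed through the API. A minimal sketch, assuming pyVmomi and a powered-off vim.VirtualMachine object vm; vmx-09 is the identifier the vSphere API uses for virtual hardware version 9.

# Minimal sketch, assuming pyVmomi and a powered-off vim.VirtualMachine 'vm'.
# 'vmx-09' is the vSphere API identifier for virtual hardware version 9.
def upgrade_virtual_hardware(vm, target='vmx-09'):
    if vm.config.version == target:
        print(f"{vm.name} is already at {target}")
        return None
    # Verify guest operating system support and take a backup first; the
    # hardware version cannot be downgraded afterwards.
    return vm.UpgradeVM_Task(version=target)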



Virtual Hardware Compatibility
Slide 10-12

Virtual Hardware Compatibility allows the user to set the virtual
hardware of a virtual machine based on the version of the ESXi host
the virtual machine will run on.
• Defines virtual machine capabilities.
• Virtual hardware can only be upgraded.
• Older compatibility levels are supported on newer vSphere hosts.

Virtual machine hardware compatibility assigns a compatibility level that directly correlates to the
vSphere version feature set for the virtual machine. Hardware compatibility enables administrators
to identify the hardware features and capabilities of a virtual machine without having to remember
the feature set of a version of virtual hardware.
A virtual machine’s hardware compatibility is defined when it is created. The hardware can be
upgraded to a higher level, but not down to a lower level. Changing a virtual machine’s
compatibility requires it to be powered off. Virtual machines with older compatibility levels will be
fully supported on newer ESXi host versions.
Compatibility can be defined on a host or a cluster as a default setting. Making the setting a default
version ensures all newly created virtual machines are set to the correct version.
Hardware compatibility can be configured using the vSphere Web Client only.



CPU Considerations
Slide 10-13

Avoid using SMP unless specifically required by the application
running in the guest operating system.
• With SMP, processes can migrate across vCPUs, incurring a small
  overhead.
If you have a choice, use guest operating systems that require fewer
timer interrupts.
• Delivering many virtual timer interrupts increases host CPU
  consumption.
In SMP guests, the guest operating system can migrate processes from one vCPU to another. This
migration can incur a small CPU overhead. If the migration is very frequent, pinning guest threads
or processes to specific vCPUs might prove helpful.
If you have a choice, use guest operating systems that require fewer timer interrupts. For example:
• If you have a uniprocessor virtual machine, use a uniprocessor HAL.
• In some Linux versions, such as RHEL 5.1 and later, the “divider=10” kernel boot parameter
reduces the timer interrupt rate to one tenth its default rate. For timekeeping best practices for
Linux guests see VMware knowledge base article 1006427 at http://kb.vmware.com/kb/
1006427.
• Kernels with tickless-timer support (NO_HZ kernels) do not schedule periodic timers to
maintain system time. As a result, these kernels reduce the overall average rate of virtual timer
interrupts. This reduction improves system performance and scalability on hosts running large
numbers of virtual machines.



Using vNUMA
Slide 10-14

vNUMA allows NUMA-aware guest operating systems and
applications to make efficient use of the underlying hardware’s
NUMA architecture.
vNUMA:
• Requires virtual hardware version 8
• Is enabled by default when the number of vCPUs > 8
• Is enabled or disabled in a virtual machine with the VMware
  vSphere® Client™
The slide shows a NUMA server with two nodes (each with 6 cores and
64GB RAM) running VM 1 (6 vCPUs, 32GB RAM) and VM 2 (12 vCPUs,
64GB RAM).

vNUMA exposes NUMA topology to the guest operating system. vNUMA can provide significant
performance benefits, though the benefits depend heavily on the level of NUMA optimization in the
guest operating system and applications.
You can obtain the maximum performance benefits from vNUMA if your clusters are composed
entirely of hosts with matching NUMA architecture. When a vNUMA-enabled virtual machine is
powered on, its vNUMA topology is set based in part on the NUMA topology of the underlying
physical host. Once a virtual machine’s vNUMA topology is initialized, it does not change unless
the number of vCPUs in that virtual machine is changed. Therefore, if a vNUMA virtual machine is
moved to a host with a different NUMA topology, the virtual machine’s vNUMA topology might no
longer be optimal. This move could potentially result in reduced performance.
Size your virtual machines so they align with physical NUMA boundaries. For example, you have a
host system with six cores per NUMA node. In this case, size your virtual machines with a multiple
of six vCPUs (that is, 6 vCPUs, 12 vCPUs, 18 vCPUs, 24 vCPUs, and so on).
When creating a virtual machine you have the option to specify the number of virtual sockets and
the number of cores per virtual socket. If the number of cores per virtual socket on a vNUMA-
enabled virtual machine is set to any value other than the default of 1 and that value doesn't align
with the underlying physical host topology, performance might be slightly reduced.



Therefore for best performance, if a virtual machine is to be configured with a non-default number
of cores per virtual socket, that number should be an integer multiple or integer divisor of the
physical NUMA node size.
By default, vNUMA is enabled only for virtual machines with more than eight vCPUs. This feature
can be enabled for smaller virtual machines, however, by adding to the .vmx file the following line:
numa.vcpu.maxPerVirtualNode = X (where X is the number of vCPUs per vNUMA node).
To make this change with the vSphere Client, perform the following steps:

1. Select the virtual machine that you wish to change, then click Edit virtual machine settings.

2. Under the Options tab, select General and then click Configuration Parameters.

3. Look for numa.vcpu.maxPerVirtualNode. If it is not present, click Add Row and enter the
new variable.
4. Click on the value to be changed and configure it as you wish.

For information on NUMA and vNUMA, see Performance Best Practices for VMware vSphere 5.0
at http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf.
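The same advanced setting can be added through the API instead of editing the .vmx file directly. A minimal sketch, assuming pyVmomi and a powered-off vim.VirtualMachine object vm (connection and inventory lookup are omitted):

# Minimal sketch, assuming pyVmomi and a vim.VirtualMachine object 'vm'
# (powered off). Adds the advanced setting discussed above to the VM's
# configuration, equivalent to adding the line to the .vmx file.
from pyVmomi import vim

def set_vnuma_node_size(vm, vcpus_per_node):
    option = vim.option.OptionValue(key='numa.vcpu.maxPerVirtualNode',
                                    value=str(vcpus_per_node))
    spec = vim.vm.ConfigSpec(extraConfig=[option])
    return vm.ReconfigVM_Task(spec=spec)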




Memory Considerations
Slide 10-15

Using large pages:
• ESXi provides 2MB memory pages to guest operating systems that
  request them.
• The use of large pages results in reduced memory management
  overhead and can therefore increase hypervisor performance.
• Use large memory pages if your guest operating system or application
  can take advantage of them.
Using an alternate virtual machine swap location:
• Configure a special host cache on an SSD to be used for the swap-to-
  host cache feature.
• If a host does not use the swap-to-host cache feature, place a virtual
  machine’s swap file on either local or remote SSD.
• Do not place swap files on thin-provisioned LUNs.

In addition to the usual 4KB memory pages, ESXi also provides 2MB memory pages, commonly
referred to as “large pages”. By default, ESXi assigns these 2MB machine memory pages to guest
operating systems that request them, giving the guest operating system the full advantage of using
large pages.
An operating system or application could benefit from large pages on a native system. In this case,
that operating system or application can potentially achieve a similar performance improvement on
a virtual machine backed with 2MB machine memory pages. Consult the documentation for your
operating system and application to determine how to configure them to use large memory pages.
Use of large pages can also change page sharing behavior. While ESXi ordinarily uses page sharing
regardless of memory demands, ESXi does not share large pages. Therefore with large pages, page
sharing might not occur until memory overcommitment is high enough to require the large pages to
be broken into small pages.
More information about large page support can be found in the performance study entitled Large
Page Performance (available at http://www.vmware.com/resources/techresources/1039).
The creation of a virtual machine’s swap file is automatic. By default, this file is created in the
virtual machine’s working directory, but a different location can be set.



You can optionally configure a special host cache on an SSD (if one is installed) to be used for the
swap to host cache feature. This swap cache is shared by all the virtual machines running on the
host. In addition, host-level swapping of the virtual machines’ most active pages benefit from the
low latency of SSD. The swap to host cache feature makes the best use of potentially limited SSD
space. This feature is also optimized for the large block sizes at which some SSDs work best.
If a host does not use the swap to host cache feature, place the virtual machine swap files on low
latency, high bandwidth storage systems. The best choice is usually local SSD. If the host does not
have local SSD, the second choice would be remote SSD. Remote SSD would still provide the low
latencies of SSD, though with the added latency of remote access.
Other than SSD storage, place the virtual machine’s swap file on the fastest available datastore. This
datastore might be on a Fibre Channel SAN array or a fast local disk.
Do not store swap files on thin-provisioned LUNs. Running a virtual machine with a swap file that
is stored on a thin-provisioned LUN can cause swap file growth to fail if no space on the LUN
remains. This failure can lead to the termination of the virtual machine.

You can define an alternate swap file location at the cluster level and at the virtual machine level.



To define an alternate swap file location at the cluster level:

1. Right-click the cluster in the inventory and select Edit Settings.

2. In the cluster settings dialog box, select Swapfile Location in the left pane.

3. Choose the option to store the swap file in the datastore specified by the host and click OK.

4. Select the ESXi host in the inventory and click the Configuration tab.

5. In the Software panel, click the Virtual Machine Swapfile Location link.

6. Click the Edit link and select the datastore to use as that swap file location. Click OK.
To define an alternate swap file location at the virtual machine level:

1. Right-click the virtual machine in the inventory and select Edit Settings.

2. In the virtual machine properties dialog box, click the Options tab.

3. Select Swapfile Location.

4. Select Store in the host’s swapfile datastore and click OK.



Storage Considerations
Slide 10-16

Use the correct virtual hardware:
• BusLogic or LSI Logic
  • Use the custom BusLogic driver in VMware Tools.
• VMware Paravirtual SCSI (PVSCSI) adapter
  • Use for I/O-intensive guest applications.
  • PVSCSI is functionally similar to BusLogic and LSI Logic, but has low CPU
    overhead, higher throughput, lower latency, and better scalability.
Size the guest operating system queue depth appropriately.
Be aware that large I/O requests split up by the guest might affect I/O
performance.
Align the guest operating system partitions.

The default virtual storage adapter is either BusLogic Parallel, LSI Logic Parallel, or LSI Logic
SAS. The adapter used depends on the guest operating system and the virtual hardware version. If
you choose to use the BusLogic Parallel virtual SCSI adapter and you are using a Windows guest
operating system, use the custom BusLogic driver included in the VMware Tools package.
ESXi also includes a paravirtualized SCSI storage adapter, PVSCSI (also called VMware
Paravirtual). The PVSCSI adapter offers a significant reduction in CPU utilization as well as
potentially increased throughput compared to the default virtual storage adapters.
The PVSCSI adapter is the best choice for environments with very I/O-intensive guest
applications. In order to use PVSCSI, your virtual machine must be using virtual hardware
version 7 or later. For information on PVSCSI, see VMware knowledge base article 1010398 at
http://kb.vmware.com/kb/1010398.



The depth of the queue of outstanding commands in the guest operating system SCSI driver can
significantly affect disk performance. A queue depth that is too small, for example, limits the disk
bandwidth that can be pushed through the virtual machine. See the driver-specific documentation for
more information on how to adjust these settings.
In some cases large I/O requests issued by applications in a virtual machine can be split by the guest
storage driver. Changing the guest operating system’s registry settings to issue larger block sizes can
eliminate this splitting, thus enhancing performance. For additional information see VMware
knowledge base article 9645697 at http://kb.vmware.com/kb/9645697.
Make sure that the system partitions within the guest are aligned. For further information, see the
documentation from the operating system vendor regarding appropriate tools to use as well as
recommendations from the array vendor.




Network Considerations
Slide 10-17

Use the vmxnet3 network adapter for guest operating systems in
which it is supported.
• If vmxnet3 is not supported, use Enhanced vmxnet.
• If Enhanced vmxnet is not supported, use the flexible device type.
Use a network adapter that supports high-performance features such
as TCP checksum offload, TSO, and jumbo frames.
Ensure that network adapters are running with full duplex and the
highest supported speed.
• For network adapters, Gigabit Ethernet or higher, set to autonegotiate.

For the best performance, use the vmxnet3 network adapter for operating systems in which it is
supported. The virtual machine must use virtual hardware version 7 or later, and VMware Tools
must be installed in the guest operating system.
• If vmxnet3 is not supported in your guest operating system, use Enhanced vmxnet (which
requires VMware Tools). Both Enhanced vmxnet (vmxnet2) and vmxnet3 support jumbo
frames for better performance.
• If vmxnet2 is not supported in your guest operating system, use the flexible device type. This
device type automatically converts each vlance adapter to a vmxnet adapter if VMware Tools is
installed.
For the best networking performance, use network adapters that support hardware features such as
TCP checksum offload, TCP segmentation offload, and jumbo frames.
Ensure that network adapters have the proper speed and duplex settings. Typically, for 10/100 NICs,
set the speed and duplex. Make sure that the duplex is set to full duplex. For NICs, Gigabit Ethernet
or higher, set the speed and duplex to autonegotiate.
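Adding a vmxnet3 adapter can also be automated. The following is a minimal sketch, assuming pyVmomi, a vim.VirtualMachine object vm, and a standard switch port group named VM Network (a hypothetical name); the guest still needs VMware Tools for the vmxnet3 driver.

# Minimal sketch, assuming pyVmomi, a vim.VirtualMachine 'vm', and a standard
# port group named 'VM Network' (hypothetical). Adds a vmxnet3 adapter.
from pyVmomi import vim

def add_vmxnet3_nic(vm, portgroup_name='VM Network'):
    nic = vim.vm.device.VirtualVmxnet3()
    nic.backing = vim.vm.device.VirtualEthernetCard.NetworkBackingInfo(
        deviceName=portgroup_name)
    nic.connectable = vim.vm.device.VirtualDevice.ConnectInfo(
        startConnected=True, connected=True)
    nic_spec = vim.vm.device.VirtualDeviceSpec(
        operation=vim.vm.device.VirtualDeviceSpec.Operation.add, device=nic)
    spec = vim.vm.ConfigSpec(deviceChange=[nic_spec])
    return vm.ReconfigVM_Task(spec=spec)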



Virtual Machine Power-On Requirements
Slide 10-18

A number of resource requirements must be met in order for a virtual
machine to power on and boot successfully:
• CPU and memory reservations must be satisfied.
• Sufficient disk space must exist for the virtual machine swap file.
• The correct virtual SCSI HBA must be selected.
A virtual machine is highly configurable. Different configuration options affect the virtual
machine’s ability to power on and boot correctly.
A virtual machine does not power on unless certain requirements are met. A virtual machine does
not power on unless the host can meet the virtual machine’s CPU and memory reservations. A
virtual machine does not power on unless enough storage space exists to create the virtual machine’s
swap file.
A virtual machine powers on but does not boot if the guest operating system does not include a driver
for the type of SCSI HBA configured in the virtual hardware.



Power-On Requirements: CPU and Memory Reservations
Slide 10-19

In order to power on a virtual machine, the target host must have
enough unreserved CPU and memory capacity to satisfy the virtual
machine’s reservation settings.
• Account for the overhead memory reservation as well.
The slide shows a virtual machine with reservations of 2GHz CPU and
2GB memory. A host with only 1.5GHz CPU and 6GB memory of
unreserved capacity cannot power it on; a host with 8GHz CPU and
6GB memory of unreserved capacity can.

A virtual machine does not power on unless the host has sufficient unreserved CPU and memory
resources to meet the virtual machine’s reservations. The vSphere Client reports an insufficient
resources error.
In order to power on the virtual machine, you must either make more CPU and memory resources
available or reduce the virtual machine’s reservation settings.
To make more resources available, perform one or more of the following actions:
• Reduce the reservations of one or more existing virtual machines.
• Migrate one or more existing virtual machines to another host.
• Power off one or more existing virtual machines.
• Add additional CPU or memory resources.



Power-On Requirements: Swap File Space
Slide 10-20

In order to power on a virtual machine, the datastore defined as the
swap file location must have enough space to create the swap file.
• Swap file size is determined by the difference between the virtual
  machine’s configured memory and its memory reservation.
The slide shows a virtual machine datastore with 2GB available. A
virtual machine with 4GB of configured memory and a 1GB memory
reservation (a 3GB swap file) cannot power on; a virtual machine with
2GB of configured memory and a 1GB reservation (a 1GB swap file)
can.

Virtual machine swap files are created at power-on and deleted at power-off. A virtual machine does
not power on unless enough storage space exists to create the swap file. Swap file size is determined
by subtracting the virtual machine’s memory reservation from its configured memory.
To make more storage space available, perform one or more of the following actions:
• Increase the reservations of one or more virtual machines.
• Decrease the configured memory for one or more virtual machines.
• Use VMware vSphere® Storage vMotion® to migrate files to another datastore.
• Increase the size of the datastore.
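The swap file sizing rule is easy to check numerically. A small illustration using the values from the slide (the function names are ours, not a vSphere API):

# Small illustration of the rule above: swap file size equals configured
# memory minus the memory reservation, and the swap file datastore must have
# at least that much free space for the virtual machine to power on.
def swap_file_size_gb(configured_gb, reservation_gb):
    return configured_gb - reservation_gb

def can_power_on(configured_gb, reservation_gb, datastore_free_gb):
    return datastore_free_gb >= swap_file_size_gb(configured_gb, reservation_gb)

# Values from the slide: a virtual machine datastore with 2GB available.
print(can_power_on(4, 1, 2))   # False: a 3GB swap file is required
print(can_power_on(2, 1, 2))   # True: only a 1GB swap file is required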



Power-On Requirements: Virtual SCSI HBA Selection
Slide 10-21

In order to successfully boot a virtual machine, the guest operating
system must support the type of SCSI HBA configured for the virtual
machine hardware.
SCSI HBA selection can be made at the following times:
• During virtual machine creation
• After creation, by using Edit Settings
The Create Virtual Machine wizard provides a default choice based on
the selected operating system choice.

Once a virtual machine has been powered on, the virtual machine must still boot its operating
system. A virtual machine does not boot if the guest operating system does not include a driver for
the configured virtual SCSI HBA.
The virtual SCSI HBA is automatically selected during virtual machine creation, based on the
selected guest operating system. The default SCSI HBA selection can be overridden at creation time
by using the Create Virtual Machine wizard.
The SCSI HBA selection can also be changed by using the vSphere Client Edit Settings menu item
(available by right-clicking the virtual machine in the inventory panel).



Virtual Machine Performance Best Practices
Slide 10-22

• Select the right guest operating system type during virtual machine
  creation.
• Disable unused devices, for example, USB, CD-ROM, and floppy devices.
• Do not deploy single-threaded applications in SMP virtual machines.
• Configure proper guest time synchronization.
• Install VMware Tools in the guest operating system and keep it up to date.
• Use the correct virtual hardware.
• Size the guest operating system queue depth appropriately.
• Be aware that large I/O requests split up by the guest might affect I/O
  performance.
• Align the guest operating system partitions.
• Use vmxnet3 where possible.



Review of Learner Objectives
Slide 10-23

You should be able to do the following:


• Discuss guidelines for creating an optimal virtual machine configuration.



Lesson 2: vSphere Cluster Optimization
Slide 10-24

Lesson 2:
vSphere Cluster Optimization




Learner Objectives
Slide 10-25

After this lesson, you should be able to do the following:


• Discuss guidelines for using resource allocation settings:
  • Shares
  • Reservations
  • Limits
• Discuss guidelines for using resource pools.
• Discuss guidelines for using DRS clusters:
  • Configuration guidelines
  • VMware vSphere® vMotion® guidelines
  • Cluster setting guidelines
• Discuss guidelines for setting vSphere HA admission control policies.
• Troubleshoot common VMware vSphere® cluster problems.



Guidelines for Resource Allocation Settings
Slide 10-26

Guidelines for setting shares, reservations, and limits:
• Use resource settings (reservation, shares, and limits) only when
  necessary.
• If you expect frequent changes to the total available resources, use
  shares, not reservations.
• When memory is overcommitted, use shares to adjust relative priorities.
• Use a reservation to specify the minimum acceptable amount of
  resource, not the amount you would like to have available.
• Do not set your reservations too high because high-reservation values
  might limit DRS balancing.
• Ensure that shares and limits do not limit the virtual machine’s storage
  and network performance.
• Carefully specify the memory limit and memory reservation.

ESXi provides several mechanisms to configure and adjust the allocation of CPU and memory
resources for virtual machines running within it. Resource management configurations can have a
significant impact on virtual machine performance.
If you expect frequent changes to the total available resources, use shares, not reservations, to
allocate resources fairly across virtual machines. If you use shares and you subsequently upgrade the
hardware, each virtual machine stays at the same relative priority (keeps the same number of
shares). The relative priority remains the same even though each share represents a larger amount of
memory or CPU.
Keep in mind that shares take effect only when there is resource contention. For example, if you
give your virtual machine a high number of CPU shares and expect to see an immediate
difference in performance, you might not see a difference because there might not be contention
for CPU.
A situation might occur when the host’s memory is overcommitted and the virtual machine’s
allocated host memory is too small to achieve a reasonable performance. In this situation, the user
can increase the virtual machine’s shares to escalate the relative priority of the virtual machine. The
shares are increased so that the hypervisor can allocate more host memory for that virtual machine.



Use reservations to specify the minimum acceptable amount of CPU or memory, not the amount that
you want to have available. After all resource reservations have been met, the VMkernel allocates
the remaining resources based on the number of shares and the limits configured for the virtual
machine.
High reservation values might limit the ability of VMware vSphere® Distributed Resource
Scheduler™ (DRS) to balance the load. DRS might not have excess resources to move virtual
machines around.
Ensure that shares and limits do not limit the virtual machine’s storage and networking performance.
For example, if the virtual machine has insufficient CPU time, it can affect network performance,
especially with Gigabit Ethernet interrupt processing. If the virtual machine has insufficient reserved
memory, it might experience excessive guest operating system paging and possibly, VMkernel
paging. Paging affects storage performance. Paging also affects networking performance if the
virtual machine is located on IP storage.
Carefully specify the memory limit and memory reservation. If these two parameters are
misconfigured, users might observe ballooning or swapping even when the host has plenty of free
memory. For example, a virtual machine’s memory might be reclaimed when the specified limit is
too small or when other virtual machines reserve too much host memory. If a performance-critical
virtual machine needs a guaranteed memory allocation, the reservation must be specified carefully
because it might affect other virtual machines.
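These allocation settings can be applied through the API as well as the vSphere Client. A minimal sketch, assuming pyVmomi and a vim.VirtualMachine object vm; the values shown are placeholders, not recommendations.

# Minimal sketch, assuming pyVmomi and a vim.VirtualMachine 'vm'. The values
# are placeholders only; follow the guidelines above when choosing real ones.
from pyVmomi import vim

def set_memory_allocation(vm, reservation_mb=0, limit_mb=-1, shares_level='normal'):
    alloc = vim.ResourceAllocationInfo(
        reservation=reservation_mb,                 # MB; 0 means no reservation
        limit=limit_mb,                             # MB; -1 means unlimited
        shares=vim.SharesInfo(level=shares_level, shares=0))
    spec = vim.vm.ConfigSpec(memoryAllocation=alloc)
    return vm.ReconfigVM_Task(spec=spec)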



Resource Pool and vApp Guidelines
Slide 10-27

Use resource pools for better manageability.


Use resource pools to isolate performance.
Group virtual machines for a multitier service into a resource pool.
In a cluster, make sure resource pools and virtual machines are not
siblings in a hierarchy.
A vApp is a container of virtual machines and a resource pool for its
virtual machines.

Use resource pools to hierarchically partition available CPU and memory resources. With separate
resource pools, you have more options when you need to choose between high availability and load
balancing. Resource pools can also be easier to manage and troubleshoot.
Use resource pools to isolate performance. Isolation can prevent one virtual machine’s performance
from affecting the performance of other virtual machines. For example, you might want to limit a
department or a project’s virtual machines to a certain number of resources. Defined limits provide
more predictable service levels and less resource contention. Setting defined limits also protects the
resource pool from a virtual machine going into a runaway condition, for example, a virtual machine
that, because of an application problem, takes up 100 percent of its allocated processor time. Setting
a firm limit on a resource pool can reduce the negative effect that the virtual machine might have on
the resource pool. You can fully isolate a resource pool by giving it the type Fixed and using
reservations and limits to control resources.
Group virtual machines for a multitier service into a resource pool. Using the same resource pool
allows resources to be assigned for the service as a whole.
VMware recommends that resource pools and virtual machines not be made siblings in a hierarchy.
Instead, each hierarchy level should contain only resource pools or only virtual machines. By
default, resource pools are assigned share values that might not compare appropriately with those
assigned to virtual machines, potentially resulting in unexpected performance.

Module 10 Virtual Machine and Cluster Optimization 597


DRS Cluster: Configuration Guidelines
Slide 10-28

Ensure that cluster maximums are not exceeded.


Be aware that differences in memory cache architecture can create
inconsistencies in performance.
When sizing your cluster, consider the following:
ƒ Higher number of hosts in the cluster means more DRS balancing
opportunities.
ƒ Larger host CPU and memory sizes are preferred for virtual machine
placement.
ƒ Virtual machines configured with the minimum necessary resources
give DRS more flexibility.
ƒ Account for virtual machine memory overhead.

Exceeding the maximum number of hosts, virtual machines, or resource pools for each DRS cluster
is not supported. Even if the cluster seems to be operating properly, exceeding the maximums could
adversely affect VMware® vCenter Server™ or DRS performance. For cluster maximums, see
Configuration Maximums for VMware vSphere 5.0 at http://www.vmware.com/pdf/vsphere5/r50/
vsphere-50-configuration-maximums.pdf.
Differences can exist in the memory cache architecture of hosts in your DRS cluster. For example,
some of the hosts in your cluster might have NUMA architecture and others might not. This
difference can result in differences in performance numbers across your hosts. DRS does not
consider memory cache architecture because accounting for these differences results in minimal (if
any) benefits at the DRS granularity of scheduling.

598 VMware vSphere: Optimize and Scale


The more hosts that you have in your cluster, the more opportunities DRS has to place your virtual
machines. The maximum number of hosts in a cluster is 32. The number of hosts that you place in
your cluster depends on your vCenter Server configuration, what kinds of workloads are running,
and how many virtual machines you have per host.
If your DRS cluster consists of hosts that have different numbers of CPUs and memory sizes, DRS
tries to place your virtual machines on the largest host first. The idea here is that DRS wants the
virtual machine to run on a host with excess capacity to accommodate for any changes in resource
demands of the virtual machine.
Virtual machines with smaller memory sizes and/or fewer vCPUs provide more opportunities for
DRS to migrate them in order to improve balance across the cluster. Virtual machines with larger
memory sizes and/or more vCPUs add more constraints in migrating the virtual machines.
Therefore, configure virtual machines with only as many vCPUs and only as much virtual memory
as the virtual machines need.
Every virtual machine that is powered on incurs some amount of memory overhead. This overhead
is not factored in as part of the virtual machine’s memory usage. You can very easily overlook the
overhead when sizing your resource pools and when estimating how many virtual machines to place
on each host.
To display a virtual machine’s memory overhead:

1. Select the virtual machine in the inventory and click the Summary tab.

2. View the Memory Overhead field in the General panel.

Module 10 Virtual Machine and Cluster Optimization 599


DRS Cluster: vMotion Guidelines
Slide 10-29

In terms of CPU and memory, choose hosts that are as


homogeneous as possible:
ƒ For example, use all Intel hosts or all AMD hosts.
ƒ If virtual machines use hardware version 9, use all ESXi 5.1 hosts.
CPU incompatibility leads to limited DRS migration options:
ƒ Enhanced vMotion Compatibility (EVC) eases this problem.
Provision 10 Gigabit Ethernet vMotion network interfaces for
maximum vMotion performance, or use multiple VMkernel ports
enabled for vMotion.
Leave some unreserved CPU capacity in a cluster for vMotion tasks.
Place virtual machine (host-level) swap files on local storage.

DRS can make migration recommendations only for virtual machines that can be migrated with
vSphere vMotion.
Ensure that the hosts in the DRS cluster have compatible CPUs. Within the same hardware platform
(Intel or AMD), differences in CPU family might exist, which means different CPU feature sets. As
a result, virtual machines might not be able to migrate across hosts. However, Enhanced vMotion
Compatibility (EVC) automatically configures server CPUs with Intel FlexMigration or AMD-V
Extended Migration technologies to be compatible with older servers. EVC prevents migrations
with vMotion from failing due to incompatible CPUs.
Virtual machines running on hardware version 9 cannot run on prior versions of ESXi. Therefore,
such virtual machines can be moved using vMotion only to other ESXi 5.1 hosts.

600 VMware vSphere: Optimize and Scale


vMotion performance increases as additional network bandwidth is made available to the vMotion
network. Consider provisioning 10 Gigabit Ethernet vMotion network interfaces for maximum
vMotion performance.
Multiple VMkernel ports enabled for vMotion can provide a further increase in network bandwidth
available to vMotion. All VMkernel ports enabled for vMotion on a host should share a single
virtual switch. Each port group should be configured to use a different physical NIC as its active
virtual NIC. In addition, all VMkernel ports enabled for vMotion should be on the same vMotion
network.
Leave some unused CPU capacity in your cluster for vMotion operations. When a vMotion
operation is in progress, ESXi opportunistically reserves CPU resources on both the source and
destination hosts. CPU resources are reserved to ensure the ability of vMotion operations to fully
utilize the network bandwidth. The amount of CPU reserved thus depends on the number of
vMotion NICs and their speeds: 10% of a processor core is reserved for each Gigabit Ethernet
network interface, 100% of a processor core is reserved for each 10 Gigabit Ethernet network
interface, and a minimum total reservation of 30% of a processor core applies.
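The reservation figures above can be combined in a short calculation. This is an illustrative sketch (the function name and inputs are not part of any VMware tool) based on the stated values: 10% of a core per Gigabit Ethernet NIC, 100% per 10 Gigabit Ethernet NIC, and a 30% minimum.

def vmotion_cpu_reservation(gbe_nics, ten_gbe_nics):
    """Estimate the fraction of a processor core reserved during a vMotion operation."""
    reservation = 0.10 * gbe_nics + 1.00 * ten_gbe_nics
    return max(reservation, 0.30)  # minimum total reservation of 30% of a core

# One 10GbE vMotion NIC reserves a full core; two 1GbE NICs fall back to the 30% minimum.
print(vmotion_cpu_reservation(0, 1), vmotion_cpu_reservation(2, 0))  # 1.0 0.3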
Placement of a virtual machine’s swap file can affect the performance of vMotion. If the swap file is
on shared storage, then vMotion performance is good because the swap file does not need to be
copied. If the swap file is on the host’s local storage, then vMotion performance is slightly degraded
(usually negligible) because the swap file has to be copied to the destination host.

Module 10 Virtual Machine and Cluster Optimization 601


DRS Cluster: Cluster Setting Guidelines
Slide 10-30

When choosing the migration threshold:


ƒ Moderate threshold (default) works well for most cases.
ƒ Aggressive thresholds are recommended for the following cases:
• Homogeneous clusters
• Relatively constant virtual machine demand for resources
• Few affinity/anti-affinity rules
Use affinity/anti-affinity rules only when needed.
Fully automated mode is recommended (cluster-wide):
ƒ Use a mode of manual or partially automated for location-critical virtual
machines.

In most cases, keep the migration threshold at its default, which is moderate. Set the migration
threshold to more aggressive levels when the following conditions are satisfied:
• The hosts in the cluster are relatively homogeneous.
• The virtual machines in the cluster are running workloads that are fairly constant.
• Few to no constraints (affinity and anti-affinity rules) are defined for the virtual machines.
Use affinity and anti-affinity rules only when necessary. The more rules that are used, the less
flexibility DRS has when determining on what hosts to place virtual machines. In most cases,
leaving the affinity settings unchanged provides the best results. In rare cases, however, specifying
affinity rules can help improve performance.
Finally, make sure that DRS is in fully-automated mode cluster-wide. For more control over your
critical virtual machines, override the cluster-wide setting by setting manual or partially-automated
mode on selected virtual machines.

602 VMware vSphere: Optimize and Scale


vSphere HA Cluster: Admission Control Guidelines
Slide 10-31

Ensure that you have enough resources in the vSphere HA cluster to accommodate your admission
control policy.

Admission Control Policy – Space Calculations
ƒ Host Failures The Cluster Tolerates – Calculate slot size.
ƒ Percentage of Cluster Resources Reserved as Failover Spare Capacity – Calculate spare capacity
based on total resource requirements for powered-on virtual machines.
ƒ Specify Failover Hosts – Calculate number of failover hosts needed to hold your virtual machines.

vCenter Server uses admission control to ensure that sufficient resources are available in a cluster to
provide failover protection. Admission control is also used to ensure that virtual machine resource
reservations are respected.
Choose a VMware vSphere® High Availability admission control policy based on your availability
needs and the characteristics of your cluster. When choosing an admission control policy, you
should consider a number of factors:
• Avoiding resource fragmentation – Resource fragmentation occurs when enough resources in
aggregate exist for a virtual machine to be failed over. However, those resources are located on
multiple hosts and are unusable because a virtual machine can run on one ESXi host at a time.
The Host Failures The Cluster Tolerates policy avoids resource fragmentation by defining a slot
as the maximum virtual machine reservation. The Percentage of Cluster Resources policy does
not address the problem of resource fragmentation. With the Specify Failover Hosts policy,
resources are not fragmented because hosts are reserved for failover.

Module 10 Virtual Machine and Cluster Optimization 603


• Flexibility of failover resource reservation – Admission control policies differ in the granularity
of control they give you when reserving cluster resources for failover protection. The Host
Failures The Cluster Tolerates policy allows you to set the failover level as a number of hosts.
The Percentage of Cluster Resources policy allows you to designate up to 100% of cluster CPU
or memory resources for failover. The Specify Failover Hosts policy allows you to specify a set
of failover hosts.
• Heterogeneity of cluster – Clusters can be heterogeneous in terms of virtual machine resource
reservations and host total resource capacities. In a heterogeneous cluster, the Host Failures The
Cluster Tolerates policy can be too conservative because this policy considers only the largest
virtual machine reservation when defining slot size and assumes that the largest hosts fail when
computing the current failover capacity. The other two admission
control policies are not affected by cluster heterogeneity.

604 VMware vSphere: Optimize and Scale


Example of Calculating Slot Size
Slide 10-32

The vSphere HA cluster contains five powered-on virtual machines with differing CPU and memory
reservation requirements. The Host Failures The Cluster Tolerates field is set to one.

Virtual machine reservations: 2GHz/1GB, 2GHz/1GB, 1GHz/2GB, 1GHz/1GB, 1GHz/1GB
Slot size: 2GHz, 2GB (plus memory overhead)
ESXi host 1 (CPU: 9GHz, RAM: 9GB) – 4 slots available
ESXi host 2 (CPU: 9GHz, RAM: 7GB) – 3 slots available
ESXi host 3 (CPU: 6GHz, RAM: 7GB) – 3 slots available

A slot is a logical representation of memory and CPU resources. By default, a slot is sized to satisfy
the requirements for any powered-on virtual machine in the cluster. Slot size is calculated by
comparing both the CPU and memory requirements of the virtual machines and selecting the largest.
In the example above, the largest CPU requirement, 2GHz, is shared by the first and second virtual
machines. The largest memory requirement, 2GB, is held by the third virtual machine. Based on this
information, the slot size is 2GHz CPU and 2GB memory.
Of the three ESXi hosts in the example, ESXi host 1 can support four slots. ESXi hosts 2 and 3 can
support three slots each.
A virtual machine requires a certain amount of available overhead memory to power on. The
amount of overhead required varies, and depends on the amount of configured memory. The amount
of this overhead is added to the memory slot size value. For simplicity, the calculations do not
include the memory overhead. However, just be aware that such overhead exists.
vSphere HA admission control does not consider the number of vCPUs that a virtual machine has.
Admission control looks only at the virtual machine’s reservation values. Therefore, a 4GHz CPU
reservation means the same thing for both a 4-way SMP virtual machine and a single-vCPU virtual
machine.
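The slot-size arithmetic in this example can be reproduced with a short calculation. The following sketch is illustrative only (the names and structure are not part of the course material) and, like the example, it ignores the per-virtual-machine memory overhead.

# Per-VM reservations (CPU GHz, memory GB) and host capacities from the example.
vm_reservations = [(2, 1), (2, 1), (1, 2), (1, 1), (1, 1)]
hosts = {"ESXi host 1": (9, 9), "ESXi host 2": (9, 7), "ESXi host 3": (6, 7)}

# Slot size: the largest CPU reservation and the largest memory reservation.
slot_cpu = max(cpu for cpu, mem in vm_reservations)  # 2 GHz
slot_mem = max(mem for cpu, mem in vm_reservations)  # 2 GB

# Slots per host: limited by whichever resource runs out first.
slots = {name: min(cpu // slot_cpu, mem // slot_mem) for name, (cpu, mem) in hosts.items()}
print(slots)  # {'ESXi host 1': 4, 'ESXi host 2': 3, 'ESXi host 3': 3}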

Module 10 Virtual Machine and Cluster Optimization 605


Applying Slot Size
Slide 10-33

After determining the maximum number of slots that each host can
support, the current failover capacity can be computed.
In this example, current failover capacity is one.
Virtual machine reservations: 2GHz/1GB, 2GHz/1GB, 1GHz/2GB, 1GHz/1GB, 1GHz/1GB
Slot size: 2GHz, 2GB (plus memory overhead)
ESXi host 1 (CPU: 9GHz, RAM: 9GB) – 4 slots available
ESXi host 2 (CPU: 9GHz, RAM: 7GB) – 3 slots available
ESXi host 3 (CPU: 6GHz, RAM: 7GB) – 3 slots available

Continuing with this example, the largest host in the cluster is host 1. If host 1 fails, then six slots
remain in the cluster, which is sufficient for all five of the powered-on virtual machines. The cluster
has one available slot left.
If both hosts 1 and 2 fail, then only three slots remain, which is insufficient for the number of
powered-on virtual machines to fail over. Therefore, the current failover capacity is one.
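Continuing the illustrative sketch above, the current failover capacity follows from the per-host slot counts: assume the largest hosts fail first and count how many failures still leave enough slots for all powered-on virtual machines. Again, this is only a simplified illustration of the calculation described in the text.

def current_failover_capacity(slots_per_host, powered_on_vms):
    """Largest number of host failures that still leaves enough slots for all VMs."""
    remaining = sorted(slots_per_host, reverse=True)  # assume the largest hosts fail first
    failures = 0
    while remaining and sum(remaining[1:]) >= powered_on_vms:
        remaining.pop(0)  # the largest remaining host fails
        failures += 1
    return failures

print(current_failover_capacity([4, 3, 3], powered_on_vms=5))  # 1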

606 VMware vSphere: Optimize and Scale


Reserving a Percentage of Cluster Resources
Slide 10-34

With the Percentage of Cluster Resources Reserved as Failover


Spare Capacity policy, vSphere HA ensures that a specified
percentage of aggregate CPU and memory resources is reserved for
failover.
Virtual machine reservations: 2GHz/1GB, 2GHz/1GB, 1GHz/2GB, 1GHz/1GB, 1GHz/1GB
Total virtual machine host requirements: 7GHz, 6GB
Total host resources: 24GHz, 21GB
ESXi host 1: CPU 9GHz, RAM 9GB
ESXi host 2: CPU 9GHz, RAM 7GB
ESXi host 3: CPU 6GHz, RAM 7GB

With the Percentage of Cluster Resources Reserved admission control policy, vSphere HA ensures
that a specified percentage of aggregate CPU and memory resources are reserved for failover.
With this policy, vSphere HA enforces admission control as follows:
1. Calculates the total resource requirements for all powered-on virtual machines in the cluster

2. Calculates the total host resources available for virtual machines

3. Calculates the current CPU failover capacity and current memory failover capacity for the
cluster
4. Determines if either the current CPU failover capacity or current memory failover capacity is
less than the corresponding configured failover capacity (provided by the user)
If so, admission control disallows the operation.
vSphere HA uses the actual reservations of the virtual machines. If a virtual machine does not have
reservations (reservation is 0), a default of 0MB memory and 256MHz CPU is applied.

Module 10 Virtual Machine and Cluster Optimization 607


Calculating Current Failover Capacity
Slide 10-35

Virtual machine reservations: 2GHz/1GB, 2GHz/1GB, 1GHz/2GB, 1GHz/1GB, 1GHz/1GB
Total virtual machine host requirements: 7GHz, 6GB
Total host resources: 24GHz, 21GB
ESXi host 1: CPU 9GHz, RAM 9GB
ESXi host 2: CPU 9GHz, RAM 7GB
ESXi host 3: CPU 6GHz, RAM 7GB

Current CPU failover capacity = (24GHz - 7GHz) / 24GHz = 70%
Current memory failover capacity = (21GB - 6GB) / 21GB = 71%
ƒ If the cluster's configured failover capacity is set to 25%, 45% of the cluster's total CPU resources
and 46% of the cluster's memory resources are still available to power on additional virtual machines.

The current CPU failover capacity is computed by subtracting the total CPU resource requirements
from the total host CPU resources and dividing the result by the total host CPU resources. The
current memory failover capacity is calculated similarly.
In the example above, the total resource requirements for the powered-on virtual machines is 7GHz
and 6GB. The total host resources available for virtual machines is 24GHz and 21GB. Based on
these values, the current CPU failover capacity is 70%. Similarly, the current memory failover
capacity is 71%.
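The percentages in this example follow directly from the formula above. The short sketch below only restates that arithmetic; the helper name is illustrative.

def failover_capacity_pct(total_host, total_vm_required):
    """Current failover capacity as a whole percentage of total host resources."""
    return int((total_host - total_vm_required) / total_host * 100)

cpu_pct = failover_capacity_pct(24, 7)  # 70 (% of cluster CPU)
mem_pct = failover_capacity_pct(21, 6)  # 71 (% of cluster memory)

# With a configured failover capacity of 25%, the remainder can power on additional VMs.
print(cpu_pct - 25, mem_pct - 25)  # 45 46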

608 VMware vSphere: Optimize and Scale


Virtual Machine Unable to Power On
Slide 10-36

Problem:
ƒ You cannot power on a virtual machine in a vSphere HA cluster due to insufficient resources.
Cause:
ƒ This problem can have several causes:
• Hosts in the cluster are disconnected, in maintenance mode, not responding, or have a vSphere HA error.
• Cluster contains virtual machines that have much larger memory or CPU reservations than the others.
• No free slots exist in the cluster.
Solution:
ƒ If virtual machines exist with higher reservations than the others, use a different vSphere HA
admission control policy, or use the advanced options to place an absolute cap on the slot size.

If you select the Host Failures The Cluster Tolerates admission control policy and certain problems
arise, you might be prevented from powering on a virtual machine due to insufficient resources.
This problem can have several causes:
• Hosts in the cluster are disconnected, in maintenance mode, not responding, or have a vSphere
HA error.
Disconnected and maintenance mode hosts are typically caused by user action. Unresponsive hosts
or hosts with vSphere HA errors usually result from a more serious problem, for example, failed
hosts or agents or a networking problem.
• Cluster contains virtual machines that have much larger memory or CPU reservations than the
others.
The Host Failures the Cluster Tolerates admission control policy is based on the calculation of a
slot size composed of two components: the CPU and memory reservations of a virtual
machine. If the calculation of this slot size is skewed by outlier virtual machines, the admission
control policy can become too restrictive and result in the inability to power on virtual
machines.

Module 10 Virtual Machine and Cluster Optimization 609


• No free slots in the cluster.
Problems occur if no free slots exist in the cluster. Problems can also occur if powering on a
virtual machine causes the slot size to increase because it has a larger reservation than existing
virtual machines. In either case, you should use the vSphere HA advanced options to reduce the
slot size, use a different admission control policy, or modify the policy to tolerate fewer host
failures.
To display slot size information:

1. In the vSphere HA panel of the cluster’s Summary tab, click on the Advanced Runtime Info
link. This dialog box shows the slot size and how many available slots exist in the cluster.
2. If the slot size appears too high, click on the Resource Allocation tab of the cluster. Sort the
virtual machines by reservation to determine which have the largest CPU and memory
reservations.
3. If there are outlier virtual machines with much higher reservations than the others, consider
using a different vSphere HA admission control policy (such as the Percentage of Cluster
Resources Reserved admission control policy). Or use the vSphere HA advanced options to
place an absolute cap on the slot size. Both of these options, however, increase the risk of
resource fragmentation.

610 VMware vSphere: Optimize and Scale


Advanced Options to Control Slot Size
Slide 10-37

Set default (minimum) slot size:
ƒ das.vmCpuMinMHz
ƒ das.vmMemoryMinMB
Set maximum slot size:
ƒ das.slotCpuInMHz
ƒ das.slotMemInMB

The Host Failures the Cluster Tolerates admission control policy uses slot size to determine whether
enough capacity exists in the cluster to power on a virtual machine.
The advanced options das.vmCpuMinMHz and das.vmMemoryMinMB set the default slot size if all
virtual machines are configured with reservations set at 0. These options are used:
• Specifically for vSphere HA admission control
• Only if CPU or memory reservations are not specified for the virtual machine
• In calculating the current failover level
das.vmMemoryMinMB specifies the minimum amount of memory (in megabytes) sufficient for any
virtual machine in the cluster to be usable. If no value is specified, the default is 256MB. This value
effectively defines a minimum memory bound on the slot size.
das.vmCpuMinMHz specifies the minimum amount of CPU (in megahertz) sufficient for any virtual
machine in the cluster to be usable. If no value is specified, the default is 256MHz. This value
effectively defines a minimum processor bound on the slot size.
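As a rough sketch of how these options can interact with the computed slot size for one resource, the following example substitutes the minimum for unset reservations and applies the maximum as a cap. This is a simplified assumption about the behavior, with hypothetical values; it is not the exact vSphere HA algorithm.

def effective_slot_size(reservations, minimum, maximum=None):
    """Illustrative slot-size calculation for one resource (CPU MHz or memory MB).

    reservations -- per-VM reservation values, where 0 means no reservation is set
    minimum      -- das.vmCpuMinMHz or das.vmMemoryMinMB (used when reservation is 0)
    maximum      -- das.slotCpuInMHz or das.slotMemInMB (absolute cap), if configured
    """
    largest = max(r if r > 0 else minimum for r in reservations)
    return min(largest, maximum) if maximum is not None else largest

# One outlier VM reserving 8000 MHz; capping the slot at 2000 MHz keeps it from
# dominating the slot-size calculation.
print(effective_slot_size([0, 1000, 8000], minimum=256, maximum=2000))  # 2000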

Module 10 Virtual Machine and Cluster Optimization 611


Setting vSphere HA Advanced Parameters
Slide 10-38

Set advanced
parameters by
editing vSphere HA
cluster settings.

After you have established a cluster, you have the option of setting attributes that affect how
vSphere HA behaves. These attributes are called advanced options.
To set an advanced option:

1. Click the Hosts and Clusters tab.


2. Right-click the cluster in the inventory and select Edit Settings.
3. In the cluster’s Settings dialog box, select vSphere HA in the left pane.
4. Click the Advanced Options button in the right pane to open the Advanced Options (vSphere
HA) dialog box.
Enter each advanced attribute that you want to change in a text box in the Option column. Enter the
value of the attribute in the Value column.

612 VMware vSphere: Optimize and Scale


Fewer Available Slots Shown Than Expected
Slide 10-39

Problem:
ƒ The number of slots available in the cluster is smaller than expected.
Cause:
ƒ Slot size is calculated using the largest reservations plus the memory overhead of any powered-on
virtual machines in the cluster.
ƒ vSphere HA admission control considers only the resources on a host that are available for virtual
machines.
• This amount is less than the total amount of physical resources on the host because some overhead exists.
Solution:
ƒ Reduce the virtual machine reservations if possible.
ƒ Use vSphere HA advanced options to reduce the slot size.
ƒ Use a different admission control policy.

Another problem that you might encounter is that the Advanced Runtime Info dialog box might
display a smaller number of available slots in the cluster than you expect.
When you select the Host Failures the Cluster Tolerates admission control policy, the Advanced
Runtime Info link appears in the vSphere HA panel of the cluster's Summary tab. Clicking this
link displays information about the cluster, including the number of slots available to power on
additional virtual machines in the cluster.
Remember that slot size is calculated using the largest reservations plus the memory overhead of
any powered on virtual machines in the cluster. However, vSphere HA admission control considers
only the resources on a host that are available for virtual machines.
Therefore, the number of slots available might be smaller than expected because you did not account
for the memory overhead.
A few solutions exist, which are noted in the slide.

Module 10 Virtual Machine and Cluster Optimization 613


Lab 16
Slide 10-40

In this lab, you will detect possible virtual machine and vSphere HA
cluster problems.
1. Move your ESXi host into the first problem cluster.
2. Troubleshoot the first cluster problem.
3. Troubleshoot the second cluster problem.
4. Prepare for the next lab.

614 VMware vSphere: Optimize and Scale


Review of Learner Objectives
Slide 10-41

You should be able to do the following:


ƒ Discuss guidelines for using resource allocation settings:
• Shares
• Reservations
• Limits
ƒ Discuss guidelines for using resource pools.
ƒ Discuss guidelines for using DRS clusters:
• Configuration guidelines
• vMotion guidelines
• Cluster setting guidelines
ƒ Discuss guidelines for setting vSphere HA admission control policies.
ƒ Troubleshoot common cluster problems.

Module 10 Virtual Machine and Cluster Optimization 615


Key Points
Slide 10-42

ƒ Always install the latest version of VMware Tools into all virtual
machines.
ƒ A well-tuned virtual machine provides an optimal environment for
applications to run in.
ƒ Use shares, limits, and reservations carefully because they can
significantly affect performance.
ƒ Resource pools are useful for limiting virtual machines to a specific
amount of resources.
ƒ DRS can make migration recommendations only for virtual machines
that can be migrated with vMotion.

Questions?

616 VMware vSphere: Optimize and Scale


M O D U L E 11

Host and Management Scalability 11


Slide 11-1

Module 11

11
Host and Management Scalability

VMware vSphere: Optimize and Scale 617


You Are Here
Slide 11-2

Course Introduction Storage Optimization

VMware Management Resources


CPU Optimization
Performance in a Virtualized
Environment
Memory Performance
Network Scalability
VM and Cluster Optimization
Network Optimization

Storage Scalability Host and Management Scalability

618 VMware vSphere: Optimize and Scale


Importance
Slide 11-3

As you scale your VMware vSphere® environment, you must be aware of the vSphere features and
functions that will help you manage the hosts in your environment.

Module 11 Host and Management Scalability 619


Module Lessons
Slide 11-4

Lesson 1: vCenter Linked Mode


Lesson 2: vSphere DPM
Lesson 3: Host Profiles
Lesson 4: vSphere PowerCLI
Lesson 5: Image Builder
Lesson 6: Auto Deploy

620 VMware vSphere: Optimize and Scale


Lesson 1: vCenter Linked Mode
Slide 11-5

Lesson 1:
vCenter Linked Mode


Module 11 Host and Management Scalability 621


Learner Objectives
Slide 11-6

After this lesson, you should be able to do the following:


ƒ Describe vCenter Linked Mode.
ƒ List vCenter Linked Mode benefits.
ƒ List vCenter Linked Mode requirements.
ƒ Join a VMware® vCenter Server™ system to a Linked Mode group.
ƒ View and manage vCenter Server inventories in a Linked Mode group.
ƒ Isolate a vCenter Server system from a Linked Mode group.

622 VMware vSphere: Optimize and Scale


vCenter Linked Mode
Slide 11-7

vCenter Linked Mode helps you manage multiple vCenter Server instances.
vCenter Linked Mode enables you to do the following:
ƒ Log in simultaneously to all vCenter Server systems
ƒ View and search the inventories of all vCenter Server systems
You cannot migrate hosts or virtual machines between vCenter Server systems in vCenter Linked Mode.
vCenter Linked Mode enables VMware vSphere® administrators to search across multiple
VMware® vCenter Server™ system inventories from a single VMware vSphere® Client™ session.
For example, you might want to simplify management of inventories associated with remote offices
or multiple datacenters. Likewise, you might use vCenter Linked Mode to configure a recovery site
for disaster recovery purposes.
vCenter Linked Mode enables you to do the following:
• Log in simultaneously to all vCenter Server systems for which you have valid credentials
• Search the inventories of all vCenter Server systems in the group
• Search for user roles on all the vCenter Server systems in the group
• View the inventories of all vCenter Server systems in the group in a single inventory view
With vCenter Linked Mode, you can have up to ten linked vCenter Server systems and up to
3,000 hosts across the linked vCenter Server systems. For example, you can have ten linked
vCenter Server systems, each with 300 hosts, or five vCenter Server systems with 600 hosts
each. But you cannot have two vCenter Server systems with 1,500 hosts each, because that
exceeds the limit of 1,000 hosts per vCenter Server system. vCenter Linked Mode supports
30,000 powered-on virtual machines (and 50,000 registered virtual machines) across linked
vCenter Server systems.
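The sizing examples above can be expressed as a quick check. The following sketch simply encodes the limits quoted in this paragraph (ten linked systems, 1,000 hosts per system, 3,000 hosts total); the function itself is illustrative.

def linked_mode_hosts_ok(hosts_per_vcenter):
    """Check a proposed Linked Mode group against the host-count limits above."""
    return (len(hosts_per_vcenter) <= 10
            and all(h <= 1000 for h in hosts_per_vcenter)
            and sum(hosts_per_vcenter) <= 3000)

print(linked_mode_hosts_ok([300] * 10))    # True: ten systems with 300 hosts each
print(linked_mode_hosts_ok([600] * 5))     # True: five systems with 600 hosts each
print(linked_mode_hosts_ok([1500, 1500]))  # False: 1,500 exceeds 1,000 hosts per system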

Module 11 Host and Management Scalability 623


vCenter Linked Mode Architecture
Slide 11-8

Diagram: the vSphere Client connects to multiple vCenter Server instances; each instance runs
vCenter Server, a Tomcat Web service, and an AD LDS instance, and the instances replicate the
following global data:

ƒ Connection information
ƒ Certificates and thumbprints
ƒ Licensing information
ƒ User roles

vCenter Linked Mode uses Microsoft Active Directory Lightweight Directory Services (AD LDS) to
store and synchronize data across multiple vCenter Server instances. AD LDS is an implementation
of Lightweight Directory Access Protocol (LDAP). AD LDS is installed as part of the vCenter
Server installation. Each AD LDS instance stores data from all the vCenter Server instances in the
group, including information about roles and licenses. This information is regularly replicated across
all the AD LDS instances in the connected group to keep them in sync. After you have configured
multiple vCenter Server instances in a group, the group is called a Linked Mode group.
Using peer-to-peer networking, the vCenter Server instances in a Linked Mode group replicate
shared global data to the LDAP directory. The global data for each vCenter Server instance includes:
• Connection information (IP addresses and ports)
• Certificates and thumbprints
• Licensing information
• User roles
• The vSphere Client can connect to other vCenter Server instances by using the connection
information retrieved from AD LDS. The Apache Tomcat Web service running on vCenter
Server enables the search capability across multiple vCenter Server instances. All vCenter
Server instances in a Linked Mode group can access a common view of the global data.

624 VMware vSphere: Optimize and Scale


Searching Across vCenter Server Instances
Slide 11-9

Diagram: the vSphere Client logs in to one vCenter Server instance, and that instance's query
service (running in the Tomcat Web service) contacts the query services on the other linked
instances; the numbered callouts correspond to the steps described below.
For inventory searches, vCenter Linked Mode relies on a Java-based Web application called the
query service, which runs in Tomcat Web services.
When you search for an object, the following operations take place:
1. The vSphere Client logs in to a vCenter Server system.

2. The vSphere Client obtains a ticket to connect to the local query service.

3. The local query service connects to the query services on other vCenter Server instances to do a
distributed search.
4. Before returning the results, vCenter Server filters the search results according to permissions.
The search service queries Active Directory (AD) for information about user permissions. So you
must be logged in to a domain account to search all vCenter Server systems in vCenter Linked
Mode. If you log in with a local account, searches return results only for the local vCenter Server
system, even if it is joined to other systems in a Linked Mode group.

Module 11 Host and Management Scalability 625


The vSphere Client provides a search-based interface that enables an administrator to search through
the entire inventory and its attributes. For example, an administrator might search for virtual
machines that match a search string and that reside on hosts whose names match another search
string.
1. Select Home > Inventory > Search to display the search page.

2. Click the icon in the Search Inventory field at the top right of the vSphere Client window.

3. Select the type of inventory item to search for:


• Virtual Machines
• Folders
• Hosts
• Datastores
• Networks
• Inventory, which finds matches to the search criteria in any of the available managed object
types

626 VMware vSphere: Optimize and Scale


Basic Requirements for vCenter Linked Mode
Slide 11-10

vCenter Server version:


ƒ Linked Mode groups that contain both vCenter Server 5.0 and earlier
versions of vCenter Server are not supported.
Domain controller:
ƒ Domain user account requires the following privileges:
• Member of Admin group
• Act as part of operating system
• Log in as a service
ƒ vCenter Server instances can be in different domains if the domains
trust one another.
DNS server:
ƒ DNS name must match machine name.
Clocks synchronized across instances


vCenter Linked Mode is implemented through the vCenter Server installer. You can create a Linked
Mode group during vCenter Server installation. Or you can use the installer to add or remove
instances.
When adding a vCenter Server instance to a Linked Mode group, the user running the installer must
be an administrator. Specifically, the user must be a local administrator on the machine where
vCenter Server is being installed and on the target machine of the Linked Mode group. Generally,
the installer must be run by a domain user who is an administrator on both systems.
The following requirements apply to each vCenter Server system that is a member of a Linked
Mode group:
• Do not join a version 5.0 vCenter Server to earlier versions of vCenter Server, or an earlier
version of vCenter Server to a version 5.0 vCenter Server. Upgrade any vCenter Server instance
to version 5.0 before joining it to a version 5.0 vCenter Server. The vSphere Client does not
function correctly with vCenter Server systems in groups that have both version 5.0 and earlier
versions of vCenter Server.
• DNS must be operational for Linked Mode replication to work.

Module 11 Host and Management Scalability 627


• The vCenter Server instances in a Linked Mode group can be in different domains if the
domains have a two-way trust relationship. Each domain must trust the other domains on which
vCenter Server instances are installed.
• All vCenter Server instances must have network time synchronization. The vCenter Server
installer validates that the machine clocks are no more than 5 minutes apart.
Consider the following before you configure a Linked Mode group:
• vCenter Server users see the vCenter Server instances on which they have valid permissions.
• You can join a vCenter Server instance to a standalone instance that is not part of a domain. If
you do so, add the standalone instance to a domain and add a domain user as an administrator.
• The vCenter Server instances in a Linked Mode group do not have to have the same domain
user login. The instances can run under different domain accounts. By default, they run as the
LocalSystem account of the machine on which they are running, which means that they are
different accounts.
• During vCenter Server installation, if you enter an IP address for the remote instance of vCenter
Server, the installer converts it into a fully qualified domain name.
For a complete list of the requirements and considerations for implementing vCenter Linked Mode,
see vCenter Server and Host Management Guide at
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.

628 VMware vSphere: Optimize and Scale


Joining a Linked Mode Group
Slide 11-11

Join a system to a Linked Mode group:
ƒ During installation of vCenter Server
ƒ After installing vCenter Server: Start > Programs > VMware > vCenter Server Linked Mode Configuration
During the installation of vCenter Server, you have the option of joining a Linked Mode group. If
you do not join during installation, you can join a Linked Mode group after vCenter Server has
been installed.
To join a vCenter Server system to a Linked Mode group:

1. Select Start > Programs > VMware > vCenter Server Linked Mode Configuration. Click
Next.
2. Select Modify linked mode configuration. Click Next.

3. Click Join this vCenter Server instance to an existing linked mode group or another
instance. Click Next.
4. When prompted, enter the remaining networking information. Click Finish. The vCenter Server
instance is now part of a Linked Mode group.
After you form a Linked Mode group, you can log in to any single instance of vCenter Server. From
that single instance, you can view and manage the inventories of all the vCenter Server instances in
the group. The delay is usually less than 15 seconds. A new vCenter Server instance might take a
few minutes to be recognized and published by the existing instances because group members do
not read the global data often.

Module 11 Host and Management Scalability 629


vCenter Service Monitoring: Linked Mode Groups
Slide 11-12

Use the vCenter Service Status window to quickly identify and correct failures.

When logged in to a vCenter Server system that is part of a Linked Mode group, you can monitor
the health of services running on each server in the group.
To display the status of vCenter services:

• On the vSphere Client Home page, click vCenter Service Status. The vCenter Service Status
page enables you to view information that includes:
• A list of all vCenter Server systems and their services
• A list of all vCenter Server plug-ins
• The status of all listed items
• The date and time of the last change in status
• Messages associated with the change in status.

630 VMware vSphere: Optimize and Scale


The vCenter Service Status plug-in is used to track multiple vCenter Server extensions and view
the overall health of the vCenter Server system. The plug-in is also useful for confirming that
communications are valid when a Linked Mode configuration is enabled. In this way, an
administrator can, at a glance, determine the service status for each member of a Linked
Mode group:
• VirtualCenter Management Service (the main vCenter Server service)
• vCenter Management Webservices
• Query Service
• Ldap health monitors
The Ldap health monitor is the component that represents AD LDS (VMwareVCMSDS in Windows
Services). It is the LDAP directory service that vCenter Server uses. This health monitor can be
helpful in troubleshooting communication issues among vCenter Server instances that have been
configured in a Linked Mode group.


Module 11 Host and Management Scalability 631


Resolving Role Conflicts
Slide 11-13

Roles are replicated when a vCenter Server system is joined to a


Linked Mode group.
ƒ If role names differ on vCenter Server systems, they are combined into
a single common list and each system will have all the user roles.
ƒ If role names are identical, they are combined into a single role (if they
contain the same privileges).
ƒ If role names are identical, and the roles contain different privileges,
these roles must be reconciled.
• One of the roles must be renamed.

When you join a vCenter Server system to a Linked Mode group, the roles defined on each vCenter
Server system in the group are replicated to the other systems in the group.
If the roles defined on each vCenter Server system are different, the role lists of the systems are
combined into a single common list. For example, vCenter Server 1 has a role named Role A and
vCenter Server 2 has a role named Role B. Both systems will have both Role A and Role B after the
systems are joined in a Linked Mode group.
If two vCenter Server systems have roles with the same name, the roles are combined into a single
role if they contain the same privileges on each vCenter Server system. If two vCenter Server
systems have roles with the same name that contain different privileges, this conflict must be
resolved by renaming at least one of the roles. You can choose to resolve the conflicting roles either
automatically or manually.
If you choose to reconcile the roles automatically, the role on the joining system is renamed to
<vCenter_Server_system_name> <role_name>. <vCenter_Server_system_name> is the name of the
vCenter Server system that is joining the Linked Mode group, and <role_name> is the name of the
original role. To reconcile the roles manually, connect to one of the vCenter Server systems with the
vSphere Client. Then rename one instance of the role before joining the vCenter Server system to
the Linked Mode group. If you remove a vCenter Server system from a Linked Mode group, the
vCenter Server system retains all the roles that it had as part of the group.
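The automatic rename format described above amounts to prefixing the role name with the joining system's name. A trivial sketch, with hypothetical names:

def reconciled_role_name(joining_vcenter_name, role_name):
    """Name given to a conflicting role when reconciliation is automatic."""
    return f"{joining_vcenter_name} {role_name}"

print(reconciled_role_name("vc02.vclass.local", "VM Operator"))  # "vc02.vclass.local VM Operator"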

632 VMware vSphere: Optimize and Scale


Isolating a vCenter Server Instance
Slide 11-14

You can isolate a vCenter Server instance from a Linked Mode group in two ways:
ƒ Use the vCenter Server Linked Mode Configuration wizard.
ƒ Use Windows Add/Remove Programs.


You can also isolate (remove) a vCenter Server instance from a Linked Mode group, for example, to
manage the vCenter Server instance as a standalone instance. One way to isolate the instance is to
start the vCenter Server Linked Mode Configuration wizard. Another way is to use the Windows
Add/Remove Programs (click Change) in the vCenter Server system operating system. Either
method starts the vCenter Server wizard. Modify the vCenter Server configuration as shown on the
slide.
To use the vCenter Server Linked Mode Configuration wizard to isolate a vCenter Server
instance from a Linked Mode group:

1. Select Start > All Programs > VMware > vCenter Server Linked Mode Configuration.

2. Click Modify linked mode configuration. Click Next.

3. Click Isolate this vCenter Server instance from linked mode group. Click Next.

4. Click Continue.

5. Click Finish. The vCenter Server instance is no longer part of the Linked Mode group.

Module 11 Host and Management Scalability 633


Review of Learner Objectives
Slide 11-15

You should be able to do the following:


ƒ Describe vCenter Linked Mode.
ƒ List vCenter Linked Mode benefits.
ƒ List vCenter Linked Mode requirements.
ƒ Join a vCenter Server system to a Linked Mode group.
ƒ View and manage vCenter Server inventories in a Linked Mode group.
ƒ Isolate a vCenter Server system from a Linked Mode group.

634 VMware vSphere: Optimize and Scale


Lesson 2: vSphere DPM
Slide 11-16

Lesson 2:
vSphere DPM


Module 11 Host and Management Scalability 635


Learner Objectives
Slide 11-17

After this lesson, you should be able to do the following:


ƒ Explain how VMware vSphere® Distributed Power Management™
(vSphere DPM) operates.
ƒ Configure vSphere DPM.

636 VMware vSphere: Optimize and Scale


vSphere DPM
Slide 11-18

vSphere DPM consolidates workloads to reduce power consumption:


ƒ Cuts power and cooling costs
ƒ Automates management of energy efficiency
vSphere DPM supports three power management protocols:
ƒ Intelligent Platform Management Interface (IPMI)
ƒ Integrated Lights-Out (iLO)
ƒ Wake on LAN (WOL)
VMware vSphere® Distributed Power Management™ (vSphere DPM) continuously monitors
resource requirements and power consumption across a VMware vSphere® Distributed Resource
Scheduler™ (DRS) cluster. When the cluster needs fewer resources, it consolidates workloads and
powers off unused VMware vSphere® ESXi™ hosts to reduce power consumption. Virtual
machines are not affected by disruption or downtime. Evacuated hosts are kept powered off during
periods of low resource use. When resource requirements of workloads increase, vSphere DPM
powers hosts on for virtual machines to use.
vSphere DPM can use one of three power management protocols to bring a host out of standby mode:
• Intelligent Platform Management Interface (IPMI)
• Hewlett-Packard Integrated Lights-Out (iLO)
• Wake on LAN (WOL)

Module 11 Host and Management Scalability 637


If a host does not support any of these protocols, it cannot be put into standby mode by vSphere
DPM. If a host supports multiple protocols, they are used in the following order:
1. IPMI

2. iLO
3. WOL
A host that supports both WOL and IPMI defaults to using IPMI if IPMI is configured. A host that
supports both WOL and iLO defaults to using iLO if iLO is configured.
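A minimal sketch of that selection order (purely illustrative; the function and its inputs are not part of any VMware interface):

def preferred_wake_protocol(configured_protocols):
    """Return the standby-exit protocol vSphere DPM prefers, per the documented order."""
    for protocol in ("IPMI", "iLO", "WOL"):
        if protocol in configured_protocols:
            return protocol
    return None  # the host cannot be placed in standby mode by vSphere DPM

print(preferred_wake_protocol({"WOL", "IPMI"}))  # IPMI
print(preferred_wake_protocol({"WOL", "iLO"}))   # iLO
print(preferred_wake_protocol(set()))            # None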
For each of these wake protocols, you must perform specific configuration and testing steps on each
host before you can enable vSphere DPM for the cluster.
For information about configuration and testing guidelines, see vSphere Resource Management
Guide at http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.

638 VMware vSphere: Optimize and Scale


How Does vSphere DPM Work?
Slide 11-19

vSphere DPM:
ƒ Operates on VMware vSphere® ESXi™ hosts that can be awakened
from a standby state through WOL packets or IPMI-based remote
power-on:
• WOL packets are sent over the VMware vSphere® vMotion® network by
another ESXi host in the cluster.
• vSphere DPM keeps at least one such host powered on.
ƒ Can be enabled or disabled at the cluster level:
• vSphere DPM is disabled by default.

Hosts powered off by vSphere DPM are marked by vCenter Server as being in standby mode.
Standby mode indicates that the hosts are available to be powered on whenever they are needed.
vSphere DPM operates by awakening ESXi hosts from a powered-off state through WOL packets.
These packets are sent over the vMotion networking interface by another host in the cluster, so
vSphere DPM keeps at least one host powered on at all times. Before you enable vSphere DPM on a
host, manually test the “exit standby” procedure for that host to ensure that it can be powered on
successfully through WOL. You can do this test with the vSphere Client.
For WOL compatibility requirements, see VMware® knowledge base article 1003373 at
http://kb.vmware.com/kb/1003373.

Module 11 Host and Management Scalability 639


vSphere DPM Operation
Slide 11-20

vSphere DPM powers off the host when the cluster load is low.
ƒ vSphere DPM considers a 40-minute load history.
ƒ All virtual machines on the selected host are migrated to other hosts.
vSphere DPM powers on a host when the cluster load is high.
ƒ It considers a 5-minute load history.
ƒ The WOL packet is sent to the selected host, which boots up.
ƒ DRS load balancing initiates, and some virtual machines are migrated
to this host.

The vSphere DPM algorithm does not frequently power servers on and off in response to transient
load variations. Rather, it powers off a server only when it is very likely that it will stay powered off
for some time. It does this by taking into account the history of the cluster workload. When a server
is powered back on, vSphere DPM ensures that it stays powered on for a nontrivial amount of time
before being again considered for power-off. To minimize the power-cycling frequency across all of
the hosts in the cluster, vSphere DPM cycles through all comparable vSphere DPM-enabled hosts
when trying to find a power-off candidate.
When vSphere HA admission control is disabled, failover resource constraints are not passed on to
DRS and vSphere DPM. The constraints are not enforced. DRS does evacuate virtual machines from
hosts. It places the hosts in maintenance mode or standby mode, regardless of the effect that this
action might have on failover requirements.

640 VMware vSphere: Optimize and Scale


vSphere DPM does power off hosts (place them in standby mode), even if doing so violates failover
requirements. (When a host machine is placed in standby mode, it is powered off.) Normally,
vSphere DPM places hosts in standby mode to optimize power usage. You can also manually place a
host in standby mode. But DRS might undo (or recommend undoing) your change the next time that
it runs. To force a host to remain powered off, place it in maintenance mode and power it off.
With cluster configuration settings, users can define the reserve capacity to always be available.
Users can also define the time for which load history can be monitored before the power-off
decision is made. Power-on is also triggered when not enough resources are available to power on a
virtual machine or when more spare capacity is required for vSphere HA.
For more about vSphere DPM, see vSphere Resource Management Guide at
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.


Module 11 Host and Management Scalability 641


Power Management Comparison: vSphere DPM
Slide 11-21

How does vSphere DPM compare to power management features in


server hardware?
ƒ vSphere DPM = cluster power management
ƒ Enhanced Intel SpeedStep and AMD PowerNow! = per-host power
management
Dynamic voltage and frequency scaling (DVFS)
Implement both vSphere DPM and DVFS to maximize power savings.

vSphere DPM is a cluster power management feature. Enhanced Intel SpeedStep and AMD
PowerNow! are CPU power management technologies.
ESXi supports Enhanced Intel SpeedStep and AMD PowerNow! The dynamic voltage and
frequency scaling (DVFS) enabled by these hardware features enables power savings on ESXi hosts
that are not operating at maximum capacity.
DVFS is useful. But the power savings that it provides, even for an idle host, are not necessarily
much lower than powering off the host. So DVFS is complementary to vSphere DPM, and VMware
recommends that the two be used together to maximize power savings.
In other words, vSphere DPM achieves a great power savings by completely powering off some
hosts. DVFS provides more savings on the remaining hosts if they are at least somewhat
underutilized. Using DVFS means that power can be saved even when the load is high enough to
keep vSphere DPM from powering off any hosts.
Implementing vSphere DPM at the cluster level, while using per-host DVFS power management
technologies, enables vSphere to provide effective power management at the level of the virtual
machine, the host, and the datacenter.

642 VMware vSphere: Optimize and Scale


Enabling vSphere DPM
Slide 11-22

Automation level – Function
Off – Disables features
Manual – Recommends only
Automatic – Executes recommendations


After you have done the configuration or testing steps required by the wake protocol that you are
using on each host, you can enable vSphere DPM. You enable vSphere DPM by configuring the
power management automation level, threshold, and host-level overrides. These settings are
configured on the Power Management page of the cluster Settings dialog box.
The power management automation levels are different from the DRS automation levels:
• Off disables the feature. vSphere DPM makes no recommendations.
• Manual sets vSphere DPM to make recommendations for host power operation and related
virtual machine migration, but recommendations are not automatically executed. These
recommendations are displayed on the cluster’s DRS tab in the vSphere Client.
• Automatic sets vSphere DPM to execute host power operations if all related virtual machine
migrations can be executed automatically.
Priority ratings are based on the amount of overutilization or underutilization found in the DRS
cluster and the improvement that is expected from the intended host power state change. A priority 1
recommendation is mandatory. A priority 5 recommendation brings only slight improvement.

Module 11 Host and Management Scalability 643


When you disable vSphere DPM, hosts are taken out of standby mode. Taking hosts out of standby
mode ensures that when vSphere DPM is disabled all hosts are powered on and ready to
accommodate load increases.
You can use the Schedule Task: Change Cluster Power Settings wizard to create scheduled tasks to
enable and disable vSphere DPM for a cluster.

644 VMware vSphere: Optimize and Scale


vSphere DPM Host Options
Slide 11-23

Cluster > Edit Settings
Disable hosts with a status of Never. Verify that wake is functioning properly.
When you enable vSphere DPM, hosts in the DRS cluster inherit the power management automation
level of the cluster by default. You can override this default for an individual host so that its
automation level differs from that of the cluster.
After enabling and running vSphere DPM, you can verify that it is functioning properly by viewing
each host’s information in the Last Time Exited Standby column. This information is on the Host
Options page in the cluster Settings dialog box and on the Hosts tab for each cluster.
A time stamp is displayed as well as whether vCenter Server succeeded or failed the last time that it
attempted to bring the host out of standby mode. vSphere DPM derives times for the Last Time
Exited Standby column from the vCenter Server event log. If no such attempt has been made, the
column displays Never (on the slide). If the event log is cleared, the Last Time Exited Standby
column is reset to Never.
Use this page to disable power management for any host that fails to exit standby mode. When a
host is disabled, vSphere DPM does not consider that host a candidate for being powered off.
VMware strongly recommends that you disable any host with a Last Time Exited Standby status
of Never until you ensure that WOL is configured properly. You do not want vSphere DPM to
power off a host for which wake has not been successfully tested.

Module 11 Host and Management Scalability 645


Review of Learner Objectives
Slide 11-24

You should be able to do the following:


ƒ Explain how vSphere DPM operates.
ƒ Configure vSphere DPM.

646 VMware vSphere: Optimize and Scale


Lesson 3: Host Profiles
Slide 11-25

Lesson 3:
Host Profiles


Module 11 Host and Management Scalability 647


Learner Objectives
Slide 11-26

After this lesson, you should be able to do the following:


ƒ Describe the components of Host Profiles.
ƒ Describe the benefits of Host Profiles.
ƒ Explain how Host Profiles operates.
ƒ Use a host profile to manage ESXi configuration compliance.
ƒ Describe the benefits of exporting and importing Host Profiles.

648 VMware vSphere: Optimize and Scale


Host Configuration Overview
Slide 11-27

ESXi host configurations have a wide scope, which includes:
ƒ CPU
ƒ Memory
ƒ Storage
ƒ Networking
ƒ Licensing
ƒ DNS and routing
The Host Profiles feature enables you to export configuration settings from a reference host and
save them as a portable set of policies, called the host profile.
Existing processes for modifying a single host’s configuration are either manual or automated
through scripts.
Over time, your vSphere environment expands and changes. New hosts are added to the datacenter
and must be configured to coexist with other hosts in the cluster. Applications and guest operating
systems must be updated with the latest software versions and security patches. To review, many
components are configured on an ESXi host:
• Storage – Datastores (VMware vSphere® VMFS and NFS), iSCSI initiators and targets,
multipathing
• Networking – Virtual switches, port groups, physical network interface card (NIC) speed,
security, NIC teaming policies
• Licensing – Edition selection, license key input
• DNS and routing – DNS server, IP default gateway
• Processor – Use of hyperthreads as logical CPUs
• Memory – Amount of memory on an ESXi host
Processes already exist to modify the configuration of a single host:
• Manually, using the vSphere Client or the command prompt
• Automatically, using scripts

Module 11 Host and Management Scalability 649


Host Profiles
Slide 11-28

Host profiles eliminate per-host configurations and maintain configuration consistency and
correctness across the entire datacenter.
Benefits:
ƒ Simplified setup and change management for ESXi hosts
ƒ Easy detection of noncompliance with standard configurations
ƒ Automated remediation

The Host Profiles feature enables you to export configuration settings from a master reference host
and save them as a portable set of policies, called a host profile. You can use this profile to quickly
configure other hosts in the datacenter. Configuring hosts with this method drastically reduces the
setup time of new hosts: multiple steps are reduced to a single click. Host profiles also eliminate the
need for specialized scripts to configure hosts.
vCenter Server uses host profiles as host configuration baselines so that you can monitor changes to
the host configuration, detect discrepancies, and fix them.
As a licensed feature of vSphere, host profiles are available only with the appropriate licensing. If
you see errors, make sure that you have the appropriate vSphere licensing for your hosts.

650 VMware vSphere: Optimize and Scale


Basic Workflow to Implement Host Profiles
Slide 11-29

The slide diagram illustrates the workflow: a host profile (covering memory reservation, storage, networking, date and time, firewall, security, services, and users and user groups) is created from a reference host (steps 1 and 2), attached to a cluster (step 3), checked for compliance (step 4), and used to check a new host (step 5).


Basic workflow to implement host profiles:
1. Set up and configure a host that will be used as the reference host. A reference host is the host
from which the profile is created.
2. Use the Create Profile wizard to create a profile from the designated reference host. vCenter
Server retrieves the configuration settings of the reference host, including IP addressing, NFS
mounts, and VMware vSphere® vMotion® ports. You can edit the profile to further customize
the policies in the profile.
3. Attach the host or cluster to the profile. You can attach a profile to standalone hosts or to one or
more clusters of hosts.
4. Check the host’s compliance to the reference host’s profile. If all hosts are compliant with the
reference host, the hosts are correctly configured.
5. Check new hosts for compliance against the reference host’s profile. You can easily apply the
host profile of the reference host to other hosts or clusters of hosts that are not in compliance.
You can also export a host profile to a file in the VMware® profile format (.vpf) and import that file on another vCenter Server system. This feature is useful for sharing host profiles across multiple vCenter Server instances.
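For administrators who script their environments, vSphere PowerCLI (covered later in this module) also includes host profile cmdlets. The following is a minimal sketch of the first steps of the workflow; the host and profile names are hypothetical, and parameter details should be verified in the vSphere PowerCLI Cmdlets Reference for your release.
# Create a host profile from a reference host.
New-VMHostProfile -Name "ClusterBaseline" -ReferenceHost (Get-VMHost "esxi01.vclass.local")
# Attach (associate) the profile with another host without applying it yet.
$hp = Get-VMHostProfile -Name "ClusterBaseline"
Apply-VMHostProfile -Entity (Get-VMHost "esxi02.vclass.local") -Profile $hp -AssociateOnly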

Module 11 Host and Management Scalability 651


Monitoring for Compliance
Slide 11-30

After the host profile is created and associated with a set of hosts or clusters, you can check the
compliance status from various places in the vSphere Client:
• Host Profiles main view – Displays the compliance status of hosts and clusters, listed by profile
• Host Summary tab – Displays the compliance status of the selected host
• Cluster Profile Compliance tab – Displays the compliance status of the selected cluster and all
the hosts in the selected cluster (shown on the slide)

652 VMware vSphere: Optimize and Scale


A cluster can be checked for compliance against a host profile or for specific cluster requirements
and settings. For example, compliance can be checked to ensure that:
• The vMotion NIC speed is at least 1,000Mbps
• vMotion is enabled
• Power management is enabled on the host
These built-in cluster compliance checks are useful for DRS and vSphere DPM.
Whenever a new host is added to a cluster, the host is checked for compliance against the host
profile that has been applied.
On the slide, the host esxi02.vclass.local does not comply with the host profile, but the host
esxi01.vclass.local does comply.
To determine why a host is noncompliant:

1. Select the host in the Host Compliance Against Cluster Requirements And Host Profile(s)
panel.
2. Review the reasons that are listed in the Host Compliance Failures pane at the bottom of the
window.

TIP
You can schedule tasks in vSphere to help automate compliance checking. To create and manage scheduled tasks, the vSphere Client must be connected to a vCenter Server system. You can then schedule the Check Compliance of a Profile task to verify that a host's configuration matches the configuration that is specified in a host profile.
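Compliance checks can also be scripted. A short vSphere PowerCLI sketch, assuming the hypothetical cluster name Lab Cluster and hosts that already have a profile attached:
# Check every host in the cluster against its attached host profile.
Test-VMHostProfileCompliance -VMHost (Get-Cluster "Lab Cluster" | Get-VMHost)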

Module 11 Host and Management Scalability 653


Applying Host Profiles
Slide 11-31

To bring a host to the required state of configuration:


1. Select Apply Profile to apply the configuration encapsulated in the
attached host profile.
2. Review the configuration changes to be applied. The host is configured
automatically.

For you to apply the host profile, a host must be in maintenance mode.

After attaching the profile to an entity (a host or a cluster of hosts), configure the entity by applying
the configuration as contained in the profile. Before you apply the changes, notice that vCenter
Server displays a detailed list of actions that will be performed on the host to configure it and bring
it into compliance. This list of actions enables the administrator to cancel applying the host profile if
the parameters shown are not wanted.
For remediation of the host to occur from the application of a host profile, the host must be placed in
maintenance mode.
The Host Profile Compliance column displays the current status. On the slide, the host is not
compliant and so the host profile will be applied.
To apply a host profile:

1. Select Home > Management > Host Profiles.

2. Select a host profile in the inventory.

3. Click the Hosts and Clusters tab.


4. Right-click the host and select Apply Profile.
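The same remediation can be performed from vSphere PowerCLI. A sketch, assuming a hypothetical host name and the profile created earlier; cmdlet parameters may vary by PowerCLI release:
# Place the host in maintenance mode, apply the attached profile, then return it to service.
$vmhost = Get-VMHost "esxi02.vclass.local"
Set-VMHost -VMHost $vmhost -State "Maintenance"
Apply-VMHostProfile -Entity $vmhost -Profile (Get-VMHostProfile -Name "ClusterBaseline") -Confirm:$false
Set-VMHost -VMHost $vmhost -State "Connected"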

654 VMware vSphere: Optimize and Scale


Customizing Hosts with Answer Files
Slide 11-32

The answer file:


ƒ Contains the user input policies for a host profile
ƒ Is created when the profile is initially applied to a particular host
ƒ Is useful when provisioning hosts with VMware vSphere® Auto Deploy

After a host profile is first applied to a host, an answer file is created for that host.
For hosts provisioned with VMware vSphere® Auto Deploy™, vCenter Server owns the entire host
configuration, which is captured in a host profile. In most cases, the host profile information is
sufficient to store all configuration information. Sometimes the user is prompted for input when the
host provisioned with Auto Deploy boots. The answer file mechanism manages those cases. After
the user has specified the information, the system generates a host-specific answer file and stores it
with the Auto Deploy cache and the vCenter Server host object.
You can check the status of an answer file. If the answer file has all of the user input answers
needed, then the status is Complete. If the answer file is missing some of the required user input
answers, then the status is Incomplete. You can also update or change the user input parameters for
the host profile’s policies in the answer file.

Module 11 Host and Management Scalability 655


Standardization Across Multiple vCenter Server Instances
Slide 11-33

Some organizations could span multiple vCenter Server instances:
ƒ Export the host profile out of vCenter Server in .vpf format.
ƒ Other vCenter Server instances can import the host profile for use in their environment.
ƒ Host profiles are not replicated in Linked Mode.

Some organizations might have large environments that span many vCenter Server instances. If you
want to standardize on a single host configuration across multiple vCenter Server environments, you
can export the host profile out of vCenter Server and have other vCenter Servers import the profile
for use in their environments. Host profiles are not replicated, shared, or kept in sync across
VMware vCenter Server instances joined in Linked Mode.
You can export a profile to a file that is in the VMware profile format (.vpf).

NOTE
When a host profile is exported, administrator and user profile passwords are not exported. This
restriction is a security measure and stops passwords from being exported in plain text when the
profile is exported. You will be prompted to re-enter the values for the password after the profile is
imported and the password is applied to a host.
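The export and import can be scripted with vSphere PowerCLI. A sketch with hypothetical file, profile, and host names:
# Export a host profile to a .vpf file.
Export-VMHostProfile -FilePath "C:\profiles\ClusterBaseline.vpf" -Profile (Get-VMHostProfile -Name "ClusterBaseline")
# After connecting to the second vCenter Server system, import the file as a new host profile.
Import-VMHostProfile -FilePath "C:\profiles\ClusterBaseline.vpf" -Name "ClusterBaseline" -ReferenceHost (Get-VMHost "esxi03.vclass.local")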

656 VMware vSphere: Optimize and Scale


Lab 17
Slide 11-34

In this lab, you will synchronize ESXi host configurations.


1. Create a host profile.
2. Attach a host profile to an ESXi host and check compliance.
3. Purposely introduce configuration drift.
4. Check compliance again.
5. Remediate the managed host.
6. Detach the managed host from the host profile.


Module 11 Host and Management Scalability 657


Review of Learner Objectives
Slide 11-35

You should be able to do the following:


ƒ Describe the components of Host Profiles.
ƒ Describe the benefits of Host Profiles.
ƒ Explain how Host Profiles operates.
ƒ Use a host profile to manage ESXi configuration compliance.
ƒ Describe the benefits of exporting and importing Host Profiles.

658 VMware vSphere: Optimize and Scale


Lesson 4: vSphere PowerCLI
Slide 11-36

Lesson 4:
vSphere PowerCLI


Module 11 Host and Management Scalability 659


Learner Objectives
Slide 11-37

After this lesson, you should be able to do the following:


ƒ Describe the features of VMware vSphere® PowerCLI™.
ƒ List common tasks to perform with vSphere PowerCLI.
ƒ Use basic vSphere PowerCLI cmdlets.

660 VMware vSphere: Optimize and Scale


vSphere PowerCLI
Slide 11-38

vSphere PowerCLI provides a Windows PowerShell interface for


command-line access to perform administrative tasks or to create
executable scripts.

VMware vSphere® PowerCLI™ provides a Windows PowerShell interface to the VMware vSphere® API. The vSphere PowerCLI command-line syntax is the same as the generic Windows PowerShell syntax.

Module 11 Host and Management Scalability 661


vSphere PowerCLI Cmdlets
Slide 11-39

vSphere PowerCLI cmdlets enable you to manage, monitor, and


automate vSphere components, such as virtual machines,
datacenters, storage, and networks. Using vSphere PowerCLI
cmdlets, you can create simple and robust scripts to automate
administrative tasks.

vSphere PowerCLI is a powerful automation and administration tool that adds about 300 commands,
called cmdlets, on top of Windows PowerShell. These cmdlets ease the management and automation
of vSphere infrastructure. System administrators can perform useful actions with vSphere PowerCLI
without having to learn a new language.

662 VMware vSphere: Optimize and Scale


Windows PowerShell and vSphere PowerCLI
Slide 11-40

Windows PowerShell:
ƒ A command-line and scripting environment
ƒ Cmdlets using a consistent verb-noun structure
ƒ Uses the .NET object model
ƒ Provides administrators with management and automation capabilities

vSphere PowerCLI:
ƒ A set of snap-ins based on Windows PowerShell
ƒ Same syntax as generic Windows PowerShell
ƒ Uses the same object model as Windows PowerShell
ƒ Enables management, monitoring, and automation for vSphere components

Module 11 Host and Management Scalability 663


Advantages of Using vSphere PowerCLI
Slide 11-41

vSphere PowerCLI is used by basic administrators and advanced administrators.
The advantages of using vSphere PowerCLI include:
ƒ Reduces operating expenses, enabling you to focus on strategic projects
ƒ Maintains consistency and ensures compliance
ƒ Reduces time spent on performing manual, repetitive tasks

vSphere PowerCLI is used by both basic administrators and advanced administrators. Basic
administrators use vSphere PowerCLI to manage their VMware infrastructure from the command
line. Advanced administrators use vSphere PowerCLI to develop scripts that can be reused by other
administrators or integrated into other applications.
Some of the advantages of using vSphere PowerCLI are:
• Reduces operating expenses, allowing you to focus on strategic projects: Automating common
virtualization tasks helps you to reduce your operation expenses and allows your team to focus
on strategic projects. Automating the management of your vSphere environment can help
increase your virtual machine–to–administrator ratio.
• Maintains consistency and ensures compliance: Script automation allows your team to deploy
servers and virtual machines in a structured and automated model. Using scripts removes the
risk of human error involved when performing repetitive tasks.
• Reduces time spent on performing manual, repetitive tasks: Automating manual, repetitive tasks, such as finding unused virtual machines, rebalancing storage paths, and clearing space from datastores, saves you time.

664 VMware vSphere: Optimize and Scale


Common Tasks Performed with vSphere PowerCLI
Slide 11-42

Using vSphere PowerCLI, you can automate common tasks like:


ƒ Managing and automating tasks on ESXi hosts
ƒ Managing virtual machines
ƒ Automating reporting

• ESXi host setup and management can be time-intensive without automation. Automating tasks that you must perform several times a week saves you time.
• Reporting is critical to management of any size environment, because it allows your team to
measure trends and provide internal chargeback in your organization. You can automate
reporting to avoid human error.
• vSphere PowerCLI can help increase your organization’s virtual machine-to-administrator ratio
by automating common tasks and letting your team focus on strategic projects. The following are some of the tasks that you can automate:
• Set virtual machine power-on options, such as configuring tools to automatically upgrade
• Set resources, such as limits and reservations, on a virtual machine
• Determine the version of VMware® Tools™ that the virtual machines are using

Module 11 Host and Management Scalability 665


Other Tasks Performed with vSphere PowerCLI
Slide 11-43

vSphere PowerCLI provides additional features to enable


automation:
ƒ vSphere Update Manager PowerCLI cmdlets
ƒ vSphere Image Builder PowerCLI cmdlets
ƒ vSphere Auto Deploy PowerCLI cmdlets

666 VMware vSphere: Optimize and Scale


vSphere PowerCLI Objects
Slide 11-44

vSphere PowerCLI uses objects to represent the vSphere


infrastructure:
ƒ Virtual machines
ƒ ESXi hosts
ƒ Datastores
ƒ Virtual switches
ƒ Clusters

vSphere PowerCLI uses objects to represent the vSphere infrastructure. Over 60 objects in vSphere
PowerCLI are defined. These objects include virtual machines, ESXi hosts, datastores, virtual
networks, and clusters. Objects in vSphere PowerCLI have properties and methods. Properties
describe the object. Methods perform an action on the object.

Module 11 Host and Management Scalability 667


Returning Object Properties
Slide 11-45

Get-Member can return the properties and methods of an object.


ƒ Example:
Get-VM $vm | Get-Member

The Get-Member cmdlet returns information about objects. The cmdlet exposes the properties
and methods of an object. Use this cmdlet to inspect an object to see more information than what
vSphere PowerCLI returns by default.
The example uses the Get-Member cmdlet to view the properties and methods of the virtual
machine object. The cmdlet returns a list of all of the properties and methods that are associated
with the virtual machine object.

668 VMware vSphere: Optimize and Scale


Connecting and Disconnecting to an ESXi Host
Slide 11-46

Use the Connect-VIServer cmdlet to connect to an ESXi host or a


vCenter Server system:
ƒ Connect-VIServer -Server <server_address>
ƒ Connect-VIServer -Server <server_address> -Protocol <protocol> -User <user_name> -Password <password>
The Disconnect-VIServer cmdlet closes the last connection to the specified server.

vSphere PowerCLI supports working with multiple default servers. If you select this option, every
time you connect to a different server with Connect-VIServer, the new server connection is
stored in an array variable. The new server connection is stored together with the previously
connected servers, unless the -NotDefault parameter is set. This variable is named
$DefaultVIServers and its initial value is an empty array. When you run a cmdlet and the target
servers cannot be determined from the specified parameters, the cmdlet runs against all servers
stored in the array variable. To remove a server from the $DefaultVIServers variable, you can
either use Disconnect-VIServer to close all active connections to the server, or modify the value
of $DefaultVIServers manually.
If you choose to work with a single default server, when you run a cmdlet and the target servers
cannot be determined from the specified parameters, the cmdlet runs against the last connected
server. This server is stored in the $defaultVIServer variable, which is updated every time you
establish a new connection. To switch between single and multiple default servers working mode,
use the Set-PowerCLIConfiguration cmdlet.
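A short sketch of the two modes, using hypothetical server names and credentials:
# Allow connections to accumulate in the $DefaultVIServers array.
Set-PowerCLIConfiguration -DefaultVIServerMode Multiple -Confirm:$false
Connect-VIServer -Server vc01.vclass.local -User "administrator" -Password "vmware1!"
Connect-VIServer -Server vc02.vclass.local -User "administrator" -Password "vmware1!"
$DefaultVIServers                                    # lists both connections
# Close all connections to one server and remove it from the default list.
Disconnect-VIServer -Server vc02.vclass.local -Confirm:$false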

Module 11 Host and Management Scalability 669


Certificate Warnings
Slide 11-47

Certificate warnings appear when the vCenter Server system or ESXi


host uses unsigned certificates.

When Connect-VIServer is run, the title bar of the vSphere PowerCLI console indicates that a
connection has been made. A warning message might be generated if the ESXi host or vCenter
Server system is using the self-signed certificate. This warning can be ignored unless the server has
been issued a signed certificate.

670 VMware vSphere: Optimize and Scale


Types of vSphere PowerCLI Cmdlets
Slide 11-48

“Get” cmdlets retrieve information about objects.


ƒ Example: Get-VM
“New” cmdlets create objects.
ƒ Example: New-VM
“Set” cmdlets change objects.
ƒ Example: Set-VM
“Move” cmdlets relocate objects.
ƒ Example: Move-VM
“Remove” cmdlets delete objects.
ƒ Example: Remove-VM

Five kinds of cmdlets are commonly used in vSphere PowerCLI:
• “Get” cmdlets retrieve information about objects like virtual machines, virtual machine hosts,
and datastores.
• “New” cmdlets create objects like virtual machines, virtual switches, and datacenters.
• “Set” cmdlets change existing objects, for example, a hardware change to a virtual machine.
• “Move” cmdlets relocate objects, for example, moving a virtual machine from one ESXi host to
another.
• “Remove” cmdlets remove the specified object, for example, removing a virtual machine from the ESXi host.
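A few illustrative one-liners, one for each verb; the virtual machine and host names are hypothetical:
Get-VM -Name "Web01"                                                 # retrieve a virtual machine object
New-VM -Name "Web02" -VMHost (Get-VMHost "esxi01.vclass.local")      # create a virtual machine
Set-VM -VM "Web02" -MemoryMB 2048 -Confirm:$false                    # change its memory allocation
Move-VM -VM "Web02" -Destination (Get-VMHost "esxi02.vclass.local")  # relocate it to another host
Remove-VM -VM "Web02" -Confirm:$false                                # remove it from the inventory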

Module 11 Host and Management Scalability 671


Using Basic vSphere PowerCLI Cmdlets
Slide 11-49

After connecting to the ESXi host or vCenter Server, you can manage
and monitor the vSphere infrastructure from the command prompt.
Examples:
ƒ The version of vSphere PowerCLI: Get-PowerCLIVersion
ƒ All the cmdlets that vSphere PowerCLI provides: Get-VICommand
ƒ The version of the ESXi host: Get-VMHost | Select Name,
@{Name="Product";Expression={$_.ExtensionData.Config
.Product.Name}}, Version
ƒ The total CPU capacity, in MHz, of the ESXi host: Get-VMHost | Select CpuTotalMhz
ƒ The total memory, in MB, of the ESXi host: Get-VMHost | Select MemoryTotalMB
ƒ Disconnect a CD-ROM drive: Get-VM | Get-CDDrive | Set-CDDrive -Connected $false

672 VMware vSphere: Optimize and Scale


vSphere PowerCLI Snap-Ins
Slide 11-50

To view all the vSphere PowerCLI snap-ins, use the Get-PSSnapin cmdlet with the -Name parameter:
Get-PSSnapin -Name *vmware*

The vSphere PowerCLI package includes three additional snap-ins for managing specific
vSphere features:
• VMware.VimAutomation.License - The Licensing snap-in contains the Get-
LicenseDataManager cmdlet for managing VMware License components.
• VMware.ImageBuilder - The Image Builder snap-in contains cmdlets for managing depots,
image profiles, and VIBs.
• VMware.DeployAutomation - The Auto Deploy snap-in contains cmdlets that provide an
interface to Auto Deploy for provisioning physical hosts with ESXi software.
To use these additional snap-ins, you must first add them using the Add-PSSnapin cmdlet. For
example, to add the Auto Deploy snap-in, in the PowerCLI console type Add-PSSnapin
VMware.DeployAutomation.
Listing the Cmdlets in a vSphere PowerCLI Snap-in: To view the installed PowerShell snap-ins, use
the Get-PSSnapin command. You can filter the vSphere PowerCLI snap-ins by using the -Name
parameter and a wildcard:
Get-PSSnapin -Name *vmware*

Module 11 Host and Management Scalability 673


To get a list of all cmdlets in a snap-in, use the Get-Command cmdlet with the -PSSnapin
parameter. For example, run:
Get-Command -PSSnapin VMware.VimAutomation.Core
The cmdlets in the specified vSphere PowerCLI snap-in are listed in the console window as one
long, scrolling topic. You can view them a single page at a time by piping the results of the Get-
Command cmdlet to the more option in the following way:
Get-Command -PSSnapin VMware.VimAutomation.Core | more
In vSphere PowerCLI, these snap-ins are automatically loaded.

674 VMware vSphere: Optimize and Scale


Lab 18
Slide 11-51

In this lab, you will use vSphere PowerCLI to run basic Windows PowerShell cmdlets.
1. Start vSphere PowerCLI and find the vSphere PowerCLI version.
2. Retrieve the Windows PowerShell security policy.
3. Create a simple script.
4. Display the vSphere PowerCLI help files.
5. Connect to and disconnect from your ESXi host.
6. Define a variable to store the ESXi host and vCenter Server system names.
7. Connect to the vCenter Server system.
8. Run cmdlets to find ESXi host-related information.
9. List the datastores on an ESXi host.
10. List the properties of a virtual switch.
11. Use Get-VM to retrieve virtual machine properties.
12. Use Get-Stat to retrieve performance information.

13. Migrate a virtual machine’s disk to shared storage.
14. Use vMotion to migrate a virtual machine.


Module 11 Host and Management Scalability 675


Review of Learner Objectives
Slide 11-52

You should be able to do the following:


ƒ Describe the features of vSphere PowerCLI.
ƒ List common tasks to perform with vSphere PowerCLI.
ƒ Use basic vSphere PowerCLI cmdlets.

676 VMware vSphere: Optimize and Scale


Lesson 5: Image Builder
Slide 11-53

Lesson 5:
Image Builder


Module 11 Host and Management Scalability 677


Learner Objectives
Slide 11-54

After this lesson, you should be able to do the following:


ƒ Use Image Builder to create an ESXi image.

678 VMware vSphere: Optimize and Scale


What Is an ESXi Image?
Slide 11-55

An ESXi image is a software bundle that consists of four main


components.

ƒ core hypervisor
ƒ CIM providers
ƒ drivers
ƒ plug-in components


An ESXi image is a customizable software bundle that contains all the software necessary to run on
an ESXi host.
An ESXi image includes the following:
• The base ESXi software, also called the core hypervisor
• Specific hardware drivers
• Common Information Model (CIM) providers
• Specific applications or plug-in components
ESXi images can be installed on a hard disk, or the ESXi image can run entirely in memory.

Module 11 Host and Management Scalability 679


VMware Infrastructure Bundles
Slide 11-56

VMware® infrastructure bundles (VIBs) are software packages that


are added to an ESXi image.
A VIB is used to package any of the following ESXi image
components:
ƒ ESXi base image
ƒ Drivers
ƒ CIM providers
ƒ Plug-ins and other components
A VIB specifies relationships with other VIBs:
ƒ VIBs that the VIB depends on
ƒ VIBs that the VIB conflicts with

The ESXi image includes one or more VMware installation bundles (VIBs).
A VIB is an ESXi software package. In VIBs, VMware and its partners package solutions, drivers,
Common Information Model (CIM) providers, and applications that extend the ESXi platform. An
ESXi image should always contain one base VIB. Other VIBs can be added to include additional
drivers, CIM providers, updates, patches, and applications.

680 VMware vSphere: Optimize and Scale


ESXi Image Deployment
Slide 11-57

The challenge of using a standard ESXi image is that the image might
be missing desired functionality.

standard ESXi ISO image:
ƒ Base providers
ƒ Base drivers
The standard image might be missing a CIM provider, a driver, or a vendor plug-in.


Standard ESXi images are provided by VMware and are available on the VMware Web site. ESXi
images can also be provided by VMware partners.
The challenge that administrators face when using the standard ESXi image provided by VMware is
that the standard image is sometimes limited in functionality. For example, the standard ESXi image
might not contain all the drivers or CIM providers for a specific set of hardware. Or the standard
image might not contain vendor-specific plug-in components.
To create an ESXi image that contains custom components, use Image Builder.

Module 11 Host and Management Scalability 681


What Is Image Builder?
Slide 11-58

Image Builder is a set of command-line utilities that are used to


create and manage image profiles.
ƒ An image profile is a group of VIBs that are used to create an ESXi
image.
Image Builder enables the administrator to build customized ESXi
boot images:
ƒ Used for booting disk-based ESXi installations
ƒ Used by Auto Deploy to boot an ESXi host in memory
Image Builder is based on vSphere PowerCLI.
ƒ The Image Builder cmdlets are included with the vSphere PowerCLI
tools.

Image Builder is a utility for customizing ESXi images. Image Builder consists of a server and
vSphere PowerCLI cmdlets. These cmdlets are used to create and manage VIBs, image profiles,
software depots, and software channels. Image Builder cmdlets are implemented as Microsoft
PowerShell cmdlets and are included in vSphere PowerCLI. Users of Image Builder cmdlets can use
all vSphere PowerCLI features.

682 VMware vSphere: Optimize and Scale


Image Builder Architecture
Slide 11-59

VIB:
ƒ ESXi software package
• Provided by VMware and its partners
Image profile:
ƒ Defines an ESXi image
ƒ Consists of one or more VIBs
Software depot:
ƒ Logical grouping of VIBs and image profiles
ƒ Can be online or offline
Software channel:
ƒ Used to group different types of VIBs at a software depot


The Image Builder architecture consists of the following components:
• VIB – VIBs are software packages that consist of packaged solutions, drivers, CIM providers,
and applications that extend the ESXi platform. VIBs are available in software depots.
• Image profile – Image profiles define ESXi images. Image profiles consist of one or more
VIBs. An image profile always includes a base VIB and might include other VIBs. You use
vSphere PowerCLI to examine and define an image profile.
• Software depot – A software depot is a logical grouping of VIBs and image profiles. The
software depot is a hierarchy of files and folders and can be available through an HTTP URL
(online depot) or a ZIP file (offline depot). VMware and its partners make software depots
available to users.
• Software channel – VMware and its partners that are hosting software depots use software
channels to group different types of VIBs. For example, a software channel can be created to
group all security updates. A VIB can be in multiple software channels. You do not have to
connect to a software channel to use the associated VIBs. Software channels are available to
facilitate filtering.

Module 11 Host and Management Scalability 683


Building an ESXi Image: Step 1
Slide 11-60

Start the vSphere PowerCLI session.
1. Verify that the execution policy is set to unrestricted.
• Cmdlet: Get-ExecutionPolicy
2. Connect to your vCenter Server system.
• Cmdlet: Connect-VIServer

To use Image Builder, the first step is to install vSphere PowerCLI and all prerequisite software. The
Image Builder snap-in is included with the vSphere PowerCLI installation.
To install Image Builder, you must install:
• Microsoft .NET 2.0
• Microsoft PowerShell 1.0 or 2.0
• vSphere PowerCLI, which includes the Image Builder cmdlets
After you start the vSphere PowerCLI session, the first task is to verify that the execution policy is set
to Unrestricted. For security reasons, Windows PowerShell supports an execution policy feature. It
determines whether scripts are allowed to run and whether they must be digitally signed. By default,
the execution policy is set to Restricted, which is the most secure policy. If you want to run scripts or
load configuration files, you can change the execution policy by using the Set-ExecutionPolicy
cmdlet. To view the current execution policy, use the Get-ExecutionPolicy cmdlet.
The next task is to connect to your vCenter Server system. The Connect-VIServer cmdlet enables
you to start a new session or reestablish a previous session with a vSphere server.
For more about installing Image Builder and its prerequisite software, see vSphere Installation and
Setup Guide. For more about vSphere PowerCLI, see vSphere PowerCLI Installation Guide. Both
books are at http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.
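A minimal sketch of step 1, using a hypothetical vCenter Server name and credentials:
# Check the current execution policy and, if necessary, relax it so that scripts can run.
Get-ExecutionPolicy
Set-ExecutionPolicy Unrestricted
# Connect to the vCenter Server system.
Connect-VIServer -Server vc01.vclass.local -User "administrator" -Password "vmware1!"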

684 VMware vSphere: Optimize and Scale


Building an ESXi Image: Step 2
Slide 11-61

Connect to a software depot.
1. Add a software depot to Image Builder.
• Cmdlet: Add-EsxSoftwareDepot
2. Verify that the software depot can be read.
• Cmdlet: Get-EsxImageProfile


Before you create or customize an ESXi image, Image Builder must be able to access one or more
software depots.
The Add-EsxSoftwareDepot cmdlet enables you to add software depots or offline bundle ZIP
files to Image Builder. A software depot consists of one or more software channels. By default, this
cmdlet adds all software channels in the depot to Image Builder. The Get-EsxSoftwareChannel
cmdlet retrieves a list of currently connected software channels. The Remove-EsxSoftwareDepot
cmdlet enables you to remove software depots from Image Builder.
After adding the software depot to Image Builder, verify that you can read the software depot.
The Get-EsxImageProfile cmdlet retrieves a list of all published image profiles in the
software depot.
Other Image Builder cmdlets that might be useful include Set-EsxImageProfile and Compare-
EsxImageProfile. The Set-EsxImageProfile cmdlet modifies a local image profile and
performs validation tests on the modified profile. The cmdlet returns the modified object but does
not persist it. The Compare-EsxImageProfile cmdlet shows whether two image profiles have the
same VIB list and acceptance levels.
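A sketch of step 2; the path to the offline depot ZIP file is hypothetical, and the depot itself comes from VMware or a partner:
# Add an offline software depot (ZIP file) to Image Builder.
Add-EsxSoftwareDepot -DepotUrl "C:\depots\ESXi-5.1-depot.zip"
# Verify that the depot can be read by listing its published image profiles.
Get-EsxImageProfile | Select Name, Vendor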

Module 11 Host and Management Scalability 685


Building an ESXi Image: Step 3
Slide 11-62

Clone and modify an image profile.
1. Clone an image profile.
• Cmdlet: New-EsxImageProfile
2. Modify an image profile.
• Cmdlets: Add-EsxSoftwarePackage, Remove-EsxSoftwarePackage
Clone the default ESXi image provided by VMware and then customize it.

Cloning a published profile is the easiest way to create a custom image profile. Cloning a profile is
useful when you want to remove a few VIBs from a profile. Cloning is also useful when you want to
use hosts from different vendors and want to use the same basic profile, with the addition of vendor-
specific VIBs. VMware partners or large installations might consider creating a profile from the
beginning.
The New-EsxImageProfile cmdlet enables you to create an image profile or clone an
image profile.
To add one or more software packages (VIBs) to an image profile, use the Add-
EsxSoftwarePackage cmdlet. Likewise, the Remove-EsxSoftwarePackage cmdlet enables you
to remove software packages from an image profile. The Get-EsxSoftwarePackage cmdlet
retrieves a list of VIBs in an image profile.
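A sketch of step 3; the profile, vendor, and VIB names are hypothetical:
# Clone a published image profile under a new name.
New-EsxImageProfile -CloneProfile "ESXi-5.1.0-standard" -Name "ESXi-5.1.0-custom" -Vendor "vclass"
# Add a driver VIB to the clone and remove a package that is not needed.
Add-EsxSoftwarePackage -ImageProfile "ESXi-5.1.0-custom" -SoftwarePackage "net-exampledriver"
Remove-EsxSoftwarePackage -ImageProfile "ESXi-5.1.0-custom" -SoftwarePackage "example-plugin"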

686 VMware vSphere: Optimize and Scale


Using Image Builder to Build an Image: Step 4
Slide 11-63

Generate a new ESXi image.
ƒ Cmdlet: Export-EsxImageProfile
The exported image can be an ISO image or a PXE-bootable image.


Finally, after creating an image profile, you can generate an ESXi image. The Export-
EsxImageProfile cmdlet exports an image profile as an ISO image or ZIP file. An ISO image can
be used to boot an ESXi host. A ZIP file can be used by VMware vSphere® Update Manager™ for
remediating ESXi hosts. The exported image profile can also be used with Auto Deploy to boot
ESXi hosts.
For the complete list of Image Builder cmdlets, see vSphere Image Builder PowerCLI Reference at
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.
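A sketch of step 4, continuing the hypothetical custom profile from step 3:
# Export the custom image profile as a bootable ISO image and as an offline bundle (ZIP file).
Export-EsxImageProfile -ImageProfile "ESXi-5.1.0-custom" -ExportToIso -FilePath "C:\images\ESXi-5.1.0-custom.iso"
Export-EsxImageProfile -ImageProfile "ESXi-5.1.0-custom" -ExportToBundle -FilePath "C:\images\ESXi-5.1.0-custom.zip"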

Module 11 Host and Management Scalability 687


Lab 19
Slide 11-64

In this lab, you will use Image Builder to export an image profile.
1. Export an image profile to an ISO image.

688 VMware vSphere: Optimize and Scale


Review of Learner Objectives
Slide 11-65

You should be able to do the following:


ƒ Use Image Builder to create an ESXi image.


Module 11 Host and Management Scalability 689


Lesson 6: Auto Deploy
Slide 11-66

Lesson 6:
Auto Deploy

690 VMware vSphere: Optimize and Scale


Learner Objectives
Slide 11-67

After this lesson, you should be able to do the following:


ƒ Understand the purpose of Auto Deploy.
ƒ Configure Auto Deploy.
ƒ Use Auto Deploy to deploy stateless ESXi.


Module 11 Host and Management Scalability 691


What Is Auto Deploy?
Slide 11-68

Auto Deploy is a method introduced in vSphere 5.0 for deploying


ESXi hosts:
ƒ The ESXi host’s state and configuration run entirely in memory.
ƒ When the ESXi host is shut down, the state information is cleared from
memory.
ƒ Based on preboot execution environment (PXE) boot
ƒ Works with Image Builder, vCenter Server, and Host Profiles
Benefits of Auto Deploy:
ƒ Large numbers of ESXi hosts can be deployed quickly and easily.
ƒ A standard ESXi image can be shared across many hosts.
ƒ The host image is decoupled from the physical server:
• Simplifies host recovery
ƒ A boot disk is not necessary.

Auto Deploy is a new method for provisioning ESXi hosts in vSphere 5.0. With Auto Deploy,
vCenter Server loads the ESXi image directly into the host memory. When the host boots, the
software image and optional configuration are streamed into memory. All changes made to the state
of the host are stored in RAM. When the host is shut down, the state of the host is lost but can be
streamed into memory again when the host is powered back on.
Unlike the other installation options, Auto Deploy does not store the ESXi state on the host disk.
vCenter Server stores and manages ESXi updates and patching through an image profile and,
optionally, the host configuration through a host profile.
Auto Deploy enables the rapid deployment of many hosts. Auto Deploy simplifies ESXi host
management by eliminating the necessity to maintain a separate boot image for each host. A
standard ESXi image can be shared across many hosts. When a host is provisioned using Auto
Deploy, the host image is decoupled from the physical server. The host can be recovered without
having to recover the hardware or restore from a backup. Finally, Auto Deploy eliminates the need
for a dedicated boot device, thus freeing the boot device for other uses.

692 VMware vSphere: Optimize and Scale


Where Are the Configuration and State Information Stored?
Slide 11-69

Because Auto Deploy can be configured without a boot disk, all information on the state of the host is stored in or managed by vCenter Server:
ƒ Image state (ESXi base, drivers, CIM providers …) – image profile
ƒ Configuration state (networking, storage, date and time, firewall, admin password …) – host profile
ƒ Running state (VM inventory, vSphere HA state, license, vSphere DPM configuration) – vCenter Server
ƒ Event recording (log files, core dump) – add-in components


Without the use of Auto Deploy, the ESXi host’s image (binaries, VIBs), configuration, state, and
log files are stored on the boot device.
With Auto Deploy, a boot device no longer holds the host’s information. Instead, the information is
stored off the host and managed by vCenter Server:
• Image state – Executable software to run on the ESXi host. The information is part of the image
profile, which can be created and customized with the Image Builder snap-in in vSphere
PowerCLI.
• Configuration state – Configurable settings that determine how the host is configured.
Examples include virtual switch settings, boot parameters, and driver settings. Host profiles are
created using the host profile user interface in the vSphere Client.
• Running state – Settings that apply while the ESXi host is up and running. This state also
includes the location of the virtual machine in the inventory and the virtual machine autostart
information. This state information is managed by the vCenter Server instance.
• Event recording – Information found in log files and core dumps. This information can be managed
by vCenter Server, using add-in components like the ESXi Dump Collector and the Syslog
Collector.

Module 11 Host and Management Scalability 693


Auto Deploy Architecture
Slide 11-70

The slide diagram shows the Auto Deploy architecture: the Auto Deploy server with its rules engine, the Auto Deploy PowerCLI, the Image Builder PowerCLI, host profiles and answer files (managed through the host profile engine and UI), and a public depot of image profiles and VIBs. The Auto Deploy server fetches predefined image profiles and VIBs from the depot, and the ESXi host performs an HTTP fetch of images or VIBs, along with host and image profiles, when it boots.

The Auto Deploy infrastructure consists of several components:


• Auto Deploy server – Serves images and host profiles to ESXi hosts. The Auto Deploy server is
at the heart of the Auto Deploy infrastructure.
• Auto Deploy rules engine – Tells the Auto Deploy server which image and which host profiles
to serve to which host. You use Auto Deploy PowerCLI to define the rules that assign image
profiles and host profiles to hosts.
• Image profiles – Define the set of VIBs with which to boot ESXi hosts. VMware and its
partners make image profiles and VIBs available in public software depots. Use Image Builder
PowerCLI to examine the depot and the Auto Deploy rules engine to specify which image
profile to assign to which host. You can create a custom image profile based on the public
image profiles and VIBs in the depot and apply that image profile to the host.
• Host profiles – Templates that define an ESXi host’s configuration, such as networking or
storage setup. You can save the host profile for an individual host and reuse it to reprovision
that host. You can save the host profile of a template host and use that profile for other hosts.
• Answer files – Store information that the user provides during the boot process. Only one
answer file exists for each host.

694 VMware vSphere: Optimize and Scale


Rules Engine Basics
Slide 11-71

Auto Deploy has a rules engine that determines which ESXi image
and host profiles can be used on selected hosts.
The rules engine maps software images and host profiles to hosts,
based on the attributes of the host:
ƒ For example, rules can be based on IP or MAC address.
ƒ The -AllHosts option can be used for every host.
For new hosts, the Auto Deploy server checks with the rules engine
before serving image and host profiles to a host.
vSphere PowerCLI cmdlets are used to set, evaluate, and update
image profile and host profile rules.
The rules engine includes rules and rule sets.

The rules engine includes the following:
• Rules – Rules can assign image profiles and host profiles to a set of hosts or specify the
inventory location (folder or cluster) of a host on the target vCenter Server system. A rule can
identify target hosts by boot MAC address, SMBIOS asset tag, BIOS UUID, or fixed DHCP IP
address. In most cases, rules apply to multiple hosts. You use vSphere PowerCLI to create rules.
After you create a rule, you must add it to a rule set. After you add a rule to a rule set, you
cannot edit it.
• Active rule set – When a newly started host contacts the Auto Deploy server with a request for
an image, the Auto Deploy server checks the active rule set for matching rules. The image
profile, host profile, and vCenter Server inventory location that are mapped by matching rules
are then used to boot the host. If more than one item is mapped by the rules, the Auto Deploy
server uses the item that is first in the rule set.
• Working rule set – The working rule set enables you to test changes to rules before making
them active. For example, you can use vSphere PowerCLI commands for testing compliance
with the working rule set.
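A sketch of how rules might be defined with the Auto Deploy cmdlets; the image profile name, host profile name, cluster name, and IP range are hypothetical:
# Create a rule that assigns an image profile, a host profile, and a cluster to hosts
# that boot from a given IP range, then add the rule to the active rule set.
New-DeployRule -Name "ProdHosts" -Item "ESXi-5.1.0-custom", (Get-VMHostProfile "ClusterBaseline"), (Get-Cluster "Lab Cluster") -Pattern "ipv4=172.20.10.100-172.20.10.199"
Add-DeployRule "ProdHosts"
# The -AllHosts option creates a rule that applies to every host.
New-DeployRule -Name "DefaultImage" -Item "ESXi-5.1.0-custom" -AllHosts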

Module 11 Host and Management Scalability 695


Software Configuration
Slide 11-72

Install Auto Deploy and register it with a vCenter Server instance:


ƒ Can be installed on the vCenter Server system
ƒ Included with the vCenter Server Appliance
The installation binary is included with the vCenter Server installation
media.
Install vSphere PowerCLI on a server that can reach the vCenter
Server system and the Auto Deploy server.
The user can set up an online or offline software depot:
ƒ An online depot is a URL where the image is stored.
ƒ An offline depot is a local zipped file that contains the image.
ƒ Both are configured and maintained using Image Builder.

When installing Auto Deploy, the software can be on its own server or it can be installed on the
same host as the vCenter Server instance. Setting up Auto Deploy includes installing the software
and registering Auto Deploy with a vCenter Server system.
The vCenter Server Appliance has the Auto Deploy software installed by default.
vSphere PowerCLI must be installed on a system that can be reached by the vCenter Server system
and the Auto Deploy server.
vSphere PowerCLI requires Microsoft Windows PowerShell. For Windows 7 and
Windows Server 2008, Windows PowerShell is installed by default. For Windows XP or Windows
2003, Windows PowerShell must be installed before installing vSphere PowerCLI.
The image profile can come from a public depot or it can be a zipped file stored locally. The local
image profile can be created and customized using vSphere PowerCLI. However, the base ESXi
image must be part of the image profile.
If you are using a host profile, save a copy of the host profile to a location that can be reached by the
Auto Deploy server.
For more about preparing your system and installing the Auto Deploy server, see vSphere Installation
and Setup Guide at http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html.

696 VMware vSphere: Optimize and Scale


PXE Boot Infrastructure Setup
Slide 11-73

Auto Deploy requires a PXE boot infrastructure.


ƒ The ESXi host must get its IP address from a DHCP server.
ƒ The DHCP server must be configured to direct the ESXi host to a Trivial
File Transfer Protocol (TFTP) server to download the PXE file.
ƒ The following DHCP services can be used:
• The organization’s current DHCP server
• A new DHCP server installed for Auto Deploy use
• The DHCP service included with the vCenter Server Appliance
ƒ A TFTP server must be accessible from the DHCP server and the
vCenter Server instance.
• TFTP services are included with VMware® vCenter™ Server Appliance™.

Autodeployed hosts perform a preboot execution environment (PXE) boot. PXE uses DHCP and
Trivial File Transfer Protocol (TFTP) to boot an operating system over a network.
A DHCP server and a TFTP server must be configured. The DHCP server assigns IP addresses to
each autodeployed host on startup and points the host to a TFTP server to download the gPXE
configuration files. The ESXi hosts can use the infrastructure’s existing DHCP and TFTP servers, or
new DHCP and TFTP servers can be created for use with Auto Deploy. Any DHCP server that
supports the next-server and filename options can be used.
The vCenter Server Appliance can be used as the Auto Deploy server, DHCP server, and TFTP
server. The Auto Deploy service, DHCP service, and TFTP service are included in the appliance.

Module 11 Host and Management Scalability 697


Initial Boot of an Autodeployed ESXi Host: Step 1
Slide 11-74

The ESXi host boots from the PXE boot server.



When an autodeployed host boots up for the first time, a certain sequence of events occurs:
1. When the ESXi host is powered on, the ESXi host starts a PXE boot sequence.

2. The DHCP server assigns an IP address to the ESXi host and instructs the host to contact the
TFTP server.
3. The ESXi host downloads the gPXE image file and the gPXE configuration file from the
TFTP server.

698 VMware vSphere: Optimize and Scale


Initial Boot of an Autodeployed ESXi Host: Step 2
Slide 11-75

The ESXi host contacts the Auto Deploy server.




The gPXE configuration file instructs the host to make an HTTP boot request to the Auto Deploy
server.

Module 11 Host and Management Scalability 699


Initial Boot of an Autodeployed ESXi Host: Step 3
Slide 11-76

The host’s image profile, host profile, and cluster are determined.

The Auto Deploy server queries the rules engine for the following information about the host:
• The image profile to use
• The host profile to use
• Which cluster the host belongs to (if any)
The rules engine maps software and configuration settings to hosts, based on the attributes of the
host. For example, you can deploy image profiles or host profiles to two clusters of hosts by writing
two rules, each matching on the network address of one cluster.
For hosts that have not yet been added to vCenter Server, Auto Deploy checks with the rule engine
before serving image profiles and host profiles to hosts.

700 VMware vSphere: Optimize and Scale


Initial Boot of an Autodeployed ESXi Host: Step 4
Slide 11-77

The image is pushed to the host, and the host profile is applied.


The Auto Deploy server streams the VIBs specified in the image profile to the host. Optionally, a
host profile can also be streamed to the host.
The host boots based on the image profile and the host profile received by Auto Deploy. Auto
Deploy also assigns the host to the vCenter Server instance with which it is registered.

Module 11 Host and Management Scalability 701


Initial Boot of an Autodeployed ESXi Host: Step 5
Slide 11-78

The host is placed in the appropriate cluster, if specified by a rule. The ESXi
image and information on the image profile, host profile, and cluster to use
are held on the Auto Deploy server.

If a rule specifies a target folder or a cluster on the vCenter Server instance, the host is added to that
location. If no rule exists, Auto Deploy adds the host to the first datacenter.
If a host profile was used without an answer file and the profile requires specific information from
the user, the host is placed in maintenance mode. The host remains in maintenance mode until the
user reapplies the host profile and answers any outstanding questions.
If the host is part of a fully automated DRS cluster, the cluster rebalances itself based on the new
host, moving virtual machines onto the new host.
To make subsequent boots quicker, the Auto Deploy server stores the ESXi image as well as the
following information:
• The image profile to use
• The host profile to use
• The location of the host in the vCenter Server inventory

702 VMware vSphere: Optimize and Scale


Subsequent Boot of an Autodeployed ESXi Host: Step 1
Slide 11-79

The autodeployed host is rebooted, and the PXE boot sequence starts.


When an autodeployed ESXi host is rebooted, a slightly different sequence of events takes place:
1. The ESXi host goes through the PXE boot sequence, as it does in the initial boot sequence.

2. The DHCP server assigns an IP address to the ESXi host and instructs the host to contact the
TFTP server.
3. The host downloads the gPXE image file and the gPXE configuration file from the
TFTP server.

Module 11 Host and Management Scalability 703


Subsequent Boot of an Autodeployed ESXi Host: Step 2
Slide 11-80

The Auto Deploy server is contacted by the ESXi host.


As in the initial boot sequence, the gPXE configuration file instructs the host to make an HTTP boot
request to the Auto Deploy server.

704 VMware vSphere: Optimize and Scale


Subsequent Boot of an Autodeployed ESXi Host: Step 3
Slide 11-81

The ESXi image is downloaded from the Auto Deploy server to the
host. The host profile is downloaded from vCenter Server to the host.


In this step, the subsequent boot sequence differs from the initial boot sequence.
When an ESXi host is booted for the first time, Auto Deploy queries the rules engine for
information about the host. The information about the host’s image profile, host profile, and cluster
is stored on the Auto Deploy server.
On subsequent reboots, the Auto Deploy server uses the saved information instead of using the rules
engine to determine this information. Using the saved information saves time during subsequent
boots because the host does not have to be checked against the rules in the active rule set. Auto
Deploy checks the host against the active rule set only once, during the initial boot.

Module 11 Host and Management Scalability 705


Subsequent Boot of an Autodeployed ESXi Host: Step 4
Slide 11-82

The host is placed into the appropriate cluster.


Finally, the ESXi host is placed in its assigned cluster on the vCenter Server instance.

706 VMware vSphere: Optimize and Scale


Auto Deploy Stateless Caching
Slide 11-83

Auto Deploy stateless caching saves the image and configuration to


a local disk, but the host continues to perform stateless reboots.
Requirements include a dedicated boot device.
If the host is unable to reach the PXE host or the Auto Deploy server,
the host boots using the local image:
ƒ The image on the local disk can be used as a backup.
ƒ Stateless caching can be configured to overwrite or preserve existing
VMFS.

With Auto Deploy stateless caching, the ESXi host PXE boots and loads the image into memory, just like a stateless ESXi host. However, when the host profile is applied to the ESXi host, the image running in memory is copied to a boot device. The saved image acts as a backup in case the PXE infrastructure or the Auto Deploy server is unavailable. If the host needs to reboot and cannot contact the DHCP, TFTP, or Auto Deploy server, the network boot times out and the host reboots using the cached disk image.
Although stateless caching can help ensure the availability of an autodeployed ESXi host by enabling the host to boot during an outage that affects the DHCP, TFTP, or Auto Deploy server, stateless caching does not guarantee that the image is current or that the vCenter Server system is available after the host boots. The primary benefit of stateless caching is that it enables you to boot the host so that you can troubleshoot and resolve the problems that prevent a successful PXE boot.
Also, unlike stateless ESXi hosts, stateless caching requires that a dedicated boot device be assigned to the host.

Module 11 Host and Management Scalability 707


Stateless Caching Host Profile Configuration
Slide 11-84

Configuring an Auto Deploy enabled server for stateless caching


includes doing the following:
ƒ Creating a host profile with stateless caching configured.
ƒ Booting an ESXi host using Auto Deploy:
• The host profile is applied.
• The ESXi image is cached to disk.
The host runs stateless under normal operations.


Stateless caching is configured using host profiles. When configuring stateless caching, you can
choose to save the image to a local boot device or a USB disk. You also have the option of leaving
the local VMFS intact or overwriting it.
To configure stateless caching:

1. From the vCenter server home window click on Host Profiles.

2. Select the host profile you want to configure, and click on Edit Profile from the tool bar.

3. Expand the System Image Cache Configuration tree and highlight System Image Cache
Profile Settings.
4. From the pull-down menu on the Configuration Details tab, select Enable stateless caching on
the host.
5. Enter the first disk arguments if needed, specify whether to overwrite the local VMFS, and select OK.

708 VMware vSphere: Optimize and Scale


Auto Deploy Stateless Caching
Slide 11-85

The host copies the state locally. Reboots are stateless only if the
PXE and Auto Deploy servers are reached.


All subsequent reboots boot from the Auto Deploy server unless the DHCP, TFTP, or Auto Deploy server is unavailable. If any of these infrastructure servers is unavailable, the host boots from the locally saved image.
No attempt is made to update the cached image. Even after the host regains the ability to boot stateless, the local image does not change when the stateless image changes, so the local image can become stale.

Module 11 Host and Management Scalability 709


Auto Deploy Stateful Installation
Slide 11-86

The ESXi host initially boots using Auto Deploy. All subsequent
reboots use local disks.
The benefits of Auto Deploy stateful installations include the
following:
ƒ Quickly and efficiently provision hosts
ƒ After a host is provisioned, there is no further dependency on the PXE infrastructure or the Auto Deploy server.
Some of the disadvantages of using stateful installations include the following:
ƒ Over time, the configuration might become out of sync with the Auto Deploy image.
ƒ Patching and updating ESXi hosts must be done using traditional methods.

Setting up stateful installation is similar to configuring stateless caching. The difference is that instead of using Auto Deploy to boot the host on every startup, the host performs a one-time PXE boot to install ESXi. After the image is written to disk, the host boots from the disk image on all subsequent reboots.

710 VMware vSphere: Optimize and Scale


Stateful Installation Host Profile Configuration
Slide 11-87

Stateful installation host profile configuration includes the following:


ƒ Create the host profile with stateful install configured.
ƒ Boot the ESXi host using Auto Deploy:
• The ESXi image is saved to disk.
• The host profile is applied.
ƒ Future boots are directed to the local disk:
• No future attempts are made to contact the PXE or Auto Deploy servers.

Stateful installation is configured using host profiles. When configuring stateful installation, you can
choose to save the image to a local boot device or a USB disk. You also have the option of leaving
the local VMFS intact or overwriting it.
To configure stateful installation:

1. From the vCenter Server home window, click Host Profiles.
2. Select the host profile that you want to configure, and click Edit Profile on the toolbar.
3. Expand the System Image Cache Configuration tree and highlight System Image Cache Profile Settings.
4. From the pull-down menu on the Configuration Details tab, select Enable stateful installs on the host.
5. Enter the first disk arguments if needed, specify whether to overwrite the local VMFS, and click OK.



Auto Deploy Stateful Installation
Slide 11-88

The initial boot uses Auto Deploy to install the image on the server.
Subsequent reboots are performed from local storage.
(The slide diagram shows the same Auto Deploy components: the rules engine on the vCenter Server system selects an image profile and a host profile, the Auto Deploy server supplies the ESXi, driver, and OEM VIBs, and the host saves the image and configuration to its local disk for subsequent boots.)

With stateful installation, the Auto Deploy server is used to provision new ESXi hosts. The first
time the host boots, it uses the DHCP server, TFTP server, and Auto Deploy server just like a stateless
host. All subsequent reboots use the image saved to the local disk.
After the image is written to the host’s local storage, that image is used for all future reboots. The
initial boot of a system is the equivalent of installing ESXi to a local disk. No attempt is made to
update the image, which means that over time the image can become stale. Manual processes must
be implemented to manage configuration updates and patches, which sacrifices most of the key
benefits of Auto Deploy.



Managing the Auto Deploy Environment
Slide 11-89

If you change a rule set:


ƒ Unprovisioned hosts boot automatically according to the new rules.
ƒ For all other hosts, Auto Deploy applies new rules only when you test a
host’s rule compliance and perform remediation.
If vCenter Server becomes unavailable:
ƒ Host deployment continues to work because the Auto Deploy server
retains the state information:
• Hosts must be part of a VMware vSphere® High Availability cluster.
ƒ The host contacts the Auto Deploy server to determine which image
and host profile to use.
If Auto Deploy becomes unavailable:
ƒ You will be unable to boot or reboot a host.

The vCenter Server and Auto Deploy servers do not need to be available
for hosts configured with stateless caching or stateful installation.


You can change a rule set, for example, to require a host to boot from a different image profile. You
can also require a host to use a different host profile. Unprovisioned hosts that you boot are
automatically provisioned according to these modified rules. For all other hosts, Auto Deploy
applies the new rules only when you test their rule compliance and perform remediation.
If the vCenter Server instance is unavailable, the stateless host contacts the Auto Deploy server to
determine which image and host profile to use. If a host is in a VMware vSphere® High Availability
cluster, Auto Deploy retains the state information so deployment works for the stateless host even if
the vCenter Server instance is not available. If the host is not in a vSphere HA cluster, the vCenter
Server system must be available to supply information to the Auto Deploy server.
If the Auto Deploy server becomes unavailable, stateless hosts that are already autodeployed remain
up and running. However, you will be unable to boot or reboot hosts. VMware recommends
installing Auto Deploy in a virtual machine and placing the virtual machine in a vSphere HA cluster
to keep it available.
For details about the procedure and the commands used for testing and repairing rule compliance,
see vSphere Installation and Setup Guide at http://www.vmware.com/support/pubs/vsphere-esxi-
vcenter-server-pubs.html.
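
As an illustration of how a rule change and the compliance test might look in vSphere PowerCLI, the following sketch replaces the image profile referenced by an existing rule and then remediates a single host. The rule name, image profile name, and host name are placeholders; see the guide above for the authoritative procedure.

# Clone the existing rule and swap in a new image profile; the copy replaces the original rule in the active rule set.
$newProfile = Get-EsxImageProfile -Name "ESXi-5.1.0-914609-standard"
Copy-DeployRule -DeployRule "StatelessCacheRule" -ReplaceItem $newProfile
# Hosts that are already provisioned keep their old items until you test and repair rule set compliance.
$vmhost = Get-VMHost -Name "esxi01.corp.local"
$result = Test-DeployRuleSetCompliance $vmhost
$result.ItemList          # lists the current items against the items that the rule set now requires
Repair-DeployRuleSetCompliance $result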



Using Auto Deploy with Update Manager to Upgrade Hosts
Slide 11-90

VMware vSphere® Update Manager™ supports ESXi 5.0 and 5.1 hosts that use Auto Deploy to boot.
Update Manager can patch hosts but cannot update the ESXi image used to boot the host.
Update Manager can remediate only patches that do not require a reboot (live-install).
ƒ Patches requiring a reboot cannot be installed.
The workflow for patching includes the following steps:
1. Manually update the image that Auto Deploy uses with the patches. If a reboot is possible,
rebooting is all that is required to update the host.
2. If a reboot cannot be performed, create a baseline in Update Manager and remediate the host.

Update Manager now supports PXE-booted ESXi hosts. Update Manager updates the hosts but does not
update the image on the PXE server.
Only live-install patches can be remediated with Update Manager. Patches that require a reboot
cannot be installed on a PXE-booted host. The live-install patches can be from VMware or a third party.
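
As a rough sketch of the manual image update in step 1 of the workflow above, the following vSphere PowerCLI (Image Builder) commands clone the image profile that Auto Deploy uses, add a patched VIB to the clone, and point the deploy rule at the new profile. The depot paths, profile names, VIB name, and rule name are placeholders.

# Load the base image depot and the patch depot (hypothetical paths).
Add-EsxSoftwareDepot C:\depot\ESXi510-offline-bundle.zip
Add-EsxSoftwareDepot C:\depot\ESXi510-patch-bundle.zip
# Clone the profile currently referenced by the deploy rule and add the updated VIB to the clone.
New-EsxImageProfile -CloneProfile "ESXi-5.1.0-799733-standard" -Name "ESXi-5.1.0-patched" -Vendor "Custom"
Add-EsxSoftwarePackage -ImageProfile "ESXi-5.1.0-patched" -SoftwarePackage "esx-base"
# Point the existing deploy rule at the patched profile; stateless hosts pick it up on their next reboot,
# and hosts that cannot be rebooted are remediated with an Update Manager baseline instead.
Copy-DeployRule -DeployRule "StatelessCacheRule" -ReplaceItem "ESXi-5.1.0-patched"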



Lab 20
Slide 11-91

In this lab, you will configure Auto Deploy to boot ESXi hosts.
1. Install Auto Deploy.
2. Configure the DHCP server and TFTP server for Auto Deploy.
3. Use vSphere PowerCLI to configure Auto Deploy.
4. (For vClass users only) Configure the ESXi host to boot from the
network.
5. (For non-vClass users) Configure the ESXi host to boot from the
network.
6. View the autodeployed host in the vCenter Server inventory.
7. (Optional) Apply a host profile to the autodeployed host.
8. (Optional) Define a rule to apply the host profile to an autodeployed
host when it boots.




Review of Learner Objectives
Slide 11-92

You should be able to do the following:


ƒ Understand the purpose of Auto Deploy.
ƒ Configure Auto Deploy.
ƒ Use Auto Deploy to deploy stateless ESXi.



Key Points
Slide 11-93

ƒ The Host Profiles feature allows you to export configuration settings from a master reference host
and save them as a host profile. You can use the host profile to quickly configure other hosts in the
datacenter.
ƒ DRS clusters provide automated resource management for multiple
ESXi hosts.
ƒ vSphere DPM consolidates workloads and powers down unused ESXi
hosts to reduce power consumption.
ƒ vSphere PowerCLI provides a Windows PowerShell interface for
command-line access to perform administrative tasks.
ƒ Auto Deploy is a method for deploying large numbers of ESXi hosts
quickly and easily.

Questions?




