Professional Documents
Culture Documents
Conceptual Design
Solution Scope
Logical Design
2
Introductions and Workshop Objectives
Participant introductions
• What experience do you have?
• Why are you attending this workshop?
• What do you expect to achieve?
PS Consultant: Please fill out these bullet points to provide scope and expectations of this workshop.
3
VMware Solution Conceptual Design
[Conceptual design diagram: applications are consumed through GUI, API, CLI, SaaS, and 3rd-platform channels, drawing on capabilities such as client data, financial scaling, server availability, compliance, analytics, governance, data services, and isolation. Resource catalogs and active workloads span on-premises and cloud, running applications in virtual machines and containers. The infrastructure layer provides service-level reclamation; abstraction, pooling, and tenancy; development lifecycle; threat/event containment; application provisioning; and capacity over compute, network, and storage pools. Physical resources cover infrastructure compute, storage, network, and data, with performance, provisioning, and encryption.]
4
IT Value Model
5
6
VMware Solution
Logical Diagram
7
Solution Scope
The table below lists the scope of the engagement in the context of the VMware journey model. These are the IT capabilities that have been determined as the focus for this engagement.
IT Capabilities in Scope
Automatically recover from hardware failures
Abstract and pool compute and storage resources
8
Solution Scope
The table below lists the scope of the engagement in the context of the VMware journey model. These are the IT problems that have been determined as the focus for this engagement.
IT Problems in Scope
High CAPEX for dedicated infrastructure
Single point of failure
Long wait time for hardware purchases
Unexpected infrastructure outages
Performance bottlenecks
Not enough data center resources or space
9
VMware Solution
Logical Diagram
10
Engagement Scope - Technical
This specific engagement by VMware Professional Services included the following components of the VMware
Solution. This Solution Design content will only refer to these components.
11
VMware vSphere 7.0.x
13
What’s New
vSphere 7.0
14
Product Overview
vSphere 7.0.x
16
Virtualization Overview
Virtualization
• Abstracts traditional physical machine resources and runs workloads as virtual machines
• Each virtual machine runs a guest operating system and applications
• The operating system and applications don’t know that they are virtualized
17
Virtualization Overview (cont.)
Hypervisors
• Partition computing resources of a server
for multiple virtual machines
• Hypervisors alone lack coordination for
higher availability and efficiency
• The VMware vSphere Hypervisor is ESXi
VMware vSphere
• A suite of software that extends beyond basic host
partitioning by aggregating infrastructure
resources and providing additional services such as
dynamic resource scheduling
• Serves as the foundation of the software-defined
data center (SDDC)
18
Cloud Computing and the SDDC
IT as a Service (ITaaS)
• Abstracts complexity in the enterprise data center
• Achieves economies of scale
• Renews focus on application services
– Availability
– Security
– Scalability
[Diagram: a management layer on a cloud OS delivering the enterprise cloud.]
19
vSphere – Use Case Examples
20
Foundation for the Software-Defined Enterprise
[Diagram: the software-defined enterprise stacks End-User Computing (desktop and mobile through a virtual workspace) on virtualized infrastructure that abstracts and pools compute (server virtualization), network (virtual networking), and storage (software-defined storage) from the physical hardware, and extends to hybrid cloud (private and public clouds through VMware and vCloud Data Center partners).]
21
vSphere 7 Adds Kubernetes to the VMkernel
vSphere 7 and VCF 4
[Diagram: vCenter with the Tanzu Kubernetes Grid Service; namespaces host DB & analytics, AI/ML, business-critical, and time-critical workloads; developers consume Kubernetes services while IT operators manage compute, network, and storage services.]
22
In April 2020 vSphere 7 Changed the SDDC
Bringing new features & vSphere with Kubernetes
23
vSphere 7 Update 1
Major Focus Areas
24
Architecture
vSphere 7.0.x
25
Agenda
Architecture Overview
vSphere with Tanzu
Content Library
Storage
Networking
26
Architecture Overview
27
High-Level VMware vSphere Architectural Overview
VMware vSphere management, application services, and infrastructure services run across clusters of ESXi hosts on physical resources.
Availability
• VMware vSphere vMotion and VMware vSphere Storage vMotion
• VMware vSphere High Availability
• VMware vSphere FT
• VMware Data Recovery
Scalability
• DRS and DPM
• Hot Add
• Overcommitment
• Content Library
Storage
• vSphere VMFS
• VMware Virtual Volumes
• VMware vSAN
• Thin provisioning
• vSphere Storage I/O Control
Network
• Standard vSwitch
• Distributed vSwitch
• VMware NSX
• VMware vSphere Network I/O Control
28
Introducing vSphere with Kubernetes
Transform your infrastructure to build, run and manage modern applications everywhere
[Diagram: vSphere with Kubernetes exposes developer services — the TKG service, a container service, and network, registry, and volume services. The developer gets self-service development; the VI Admin gets application-focused management and agile operations.]
29
Enable vSphere with Kubernetes Supervisor Clusters
VI Admin
vCenter Server
vSphere Cluster
30
VMware Tanzu Kubernetes Grid Service for vSphere
Developer Self Service
Networking: Calico
[Diagram: Tanzu Kubernetes clusters provisioned by developer self-service on top of the Supervisor Cluster.]
31
Simplified Deployment and Consumption
vSphere with Tanzu
[Diagram: vSphere with Kubernetes evolves into vSphere with Tanzu and VCF with Tanzu. Namespaces host DB and analytics, AI/ML, business-critical, and time-critical workloads. The Tanzu Kubernetes Grid Service and the vSphere Pod, Registry, Network, and Storage services serve developers and IT operators. Tanzu Kubernetes Grid clusters, vSphere Pods, networks, volumes, and the registry run over management, frontend, and workload port groups, with the Kubernetes control plane behind an HA proxy and TKG cluster nodes on top.]
33
VMware ESXi
34
ESXi 7.0
ESXi is the bare-metal VMware vSphere Hypervisor
• ESXi is in control of all CPU, memory, network, and storage resources
• Allows virtual machines to run at near-native performance, unlike hosted hypervisors
ESXi 7.0 allows
• Utilization of up to 768 logical CPUs per host
• Utilization of up to 16 TB of RAM per host
• Deployment of up to 1,024 virtual machines per host
35
ESXi Architecture
[Diagram: the ESXi host runs the VMkernel, managed through CLI commands for configuration and support, agentless systems management, and agentless hardware monitoring.]
37
Virtual Machines
The software computer and consumer of resources that ESXi is in charge of
VMs are containers that can run almost any operating system and application
Each VM has access to its own virtual devices (keyboard, mouse, SCSI controller, CD/DVD, and so on)
38
Virtual Machine Architecture
Virtual machines consist of files stored on a vSphere VMFS or NFS datastore
• Configuration file (.vmx)
• Swap files (.vswp)
• BIOS files (.nvram)
• Log files (.log)
• Template file (.vmtx)
• Raw device map file (<VM_name>-rdm.vmdk)
• Disk descriptor file (.vmdk)
• Disk data file (<VM_name>-flat.vmdk)
• Suspend state file (.vmss)
• Snapshot data file (.vmsd)
• Snapshot state file (.vmsn)
• Snapshot disk file (<VM_name>-delta.vmdk)
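To make those file roles easier to see at a glance, here is a small illustrative Python helper; the directory name and the suffix-to-role mapping are hypothetical, assembled from the list above for demonstration only:

```python
# Illustrative only: map the suffixes listed above to their roles.
# Assumption: a local copy of a VM directory named ./MyVM (placeholder).
from pathlib import Path

SUFFIX_ROLES = {
    ".vmx": "configuration", ".vswp": "swap", ".nvram": "BIOS",
    ".log": "log", ".vmtx": "template",
    ".vmdk": "disk descriptor/data (or RDM/delta, by name)",
    ".vmss": "suspend state", ".vmsd": "snapshot data", ".vmsn": "snapshot state",
}

for f in sorted(Path("./MyVM").iterdir()):
    print(f"{f.name:<40} {SUFFIX_ROLES.get(f.suffix, 'other')}")
```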
39
VMware vCenter Server
40
VMware vCenter™ 7.0
vCenter is the management platform for vSphere
environments
Provides much of the feature set that comes with
vSphere, such as vSphere High Availability
Also provides SDK access into the environment
for solutions such as VMware vRealize™
Automation™
vCenter Server is available as an appliance only
in vSphere 7.0 and beyond
A single vCenter Server running version 7.0 can
manage:
• 2000 hosts
• 25,000 virtual machines
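As a minimal sketch of that SDK access, the following pyVmomi snippet connects to vCenter and counts the managed inventory against the limits above; the hostname and credentials are placeholders, and it assumes the pyvmomi package is installed:

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; trust real certs in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    # Walk the whole inventory tree for hosts and VMs
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem, vim.VirtualMachine], True)
    hosts = sum(isinstance(o, vim.HostSystem) for o in view.view)
    vms = sum(isinstance(o, vim.VirtualMachine) for o in view.view)
    print(f"{hosts} hosts, {vms} VMs (vCenter 7.0 limits: 2000 hosts / 25000 VMs)")
finally:
    Disconnect(si)
```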
41
vCenter 7.0 Architecture
In vCenter 7.0, the architecture is simplified dramatically: only one architecture is permitted, the Embedded Architecture
All services are provided from the vCenter Server instance, including:
• VMware vCenter Single Sign-On™
• License service
• Lookup service
• VMware Directory Services
• VMware Certificate Authority
• vCenter Server
• VMware vSphere Client (HTML5)
• VMware vSphere Auto Deploy™
• VMware vSphere ESXi Dump Collector
• vSphere Syslog Service
• vSphere Update Manager
42
vCenter 7.0 Architecture (cont.)
One architecture is supported as a result of this change; there is no longer a choice of deployment mode based on the size and feature requirements of the environment
[Diagram: the Platform Services Controller services now run inside the vCenter Server virtual machine or server, rather than on separate external Platform Services Controllers.]
43
vCenter 7.0 Architecture (cont.)
One architecture is supported in vSphere 7.0: the Embedded architecture from previous releases. Platform Services Controllers can no longer be external to vCenter Server
vCenter is also only available in the appliance form factor; vCenter for Windows is no longer available
44
vCenter 7.0 Architecture (cont.)
Enhanced Linked Mode has the following maximums
[Table of Enhanced Linked Mode maximums.]
*Note: These are a point-in-time snapshot of the maximums taken in April 2020. For the most up-to-date data, see http://configmax.vmware.com/
45
vCenter Architecture – ESXi and vCenter Server Communication
How vCenter Server components and ESXi hosts communicate
vpxd on vCenter Server (with the Platform Services Controller) communicates with the vpxa agent on each ESXi host over TCP 443/9443 and TCP/UDP 902; on the host, vpxa relays requests to hostd.
46
VMware vSphere vMotion
47
vSphere vMotion
vSphere vMotion allows for live migration of
virtual machines between compatible ESXi
hosts
• Compatibility determined by CPU, network,
and storage access
48
vSphere vMotion Architecture
Long-Distance vSphere vMotion – cross-continental
• Targeting cross-continental USA
• Up to 150 ms RTT performance
49
vSphere vMotion Architecture
vSphere vMotion involves transferring the entire
execution state of the virtual machine from the source
host to the destination
Primarily happens over a high-speed network
50
vMotion Improvements
Basic vMotion Workflow
1. Create VM on destination
2. Copy memory
3. Suspend VM on source
5. Resume VM on destination (switch-over time of 1 sec)
6. Power off VM on source
vMotion Improvements
Memory Copy
• Prior to vSphere 7, we
installed the page tracer on
all the vCPUs in a VM
52
vMotion Improvements
Memory Copy Optimizations
• In vSphere 7, we claim
one vCPU to do all the
page tracing work during
a vMotion operation.
• Greatly reduced
performance impact on
workload!
53
vMotion Improvements
Execution Switch-over Process (switch-over time < 1 sec)
1. Quiesce VM on source
2. Transfer checkpoint
3. Transfer changed bitmap – a function of the VM's memory size (1 GB memory => 32 KB bitmap; 24 TB memory => 768 MB bitmap; see the worked calculation below)
4. Transfer swap bitmap
5. Transfer remaining pages
6. Resume VM on destination
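A worked check of those bitmap sizes, assuming one bit per 4 KB memory page:

```latex
\[
\text{bitmap bytes} = \frac{\text{memory} / 4\,\text{KB pages}}{8\ \text{bits per byte}}
\]
\[
1\,\text{GB}: \frac{2^{30}/2^{12}}{8} = 2^{15}\,\text{B} = 32\,\text{KB}
\qquad
24\,\text{TB}: \frac{24\cdot 2^{40}/2^{12}}{8} = 3\cdot 2^{28}\,\text{B} = 768\,\text{MB}
\]
```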
54
vMotion Improvements
Memory Copy Optimizations
55
VMware vSphere Storage vMotion Architecture
vSphere Storage vMotion works in very much the same way as vSphere vMotion, only the disks are migrated instead; read/write I/O to the virtual disk continues during the migration
It works as follows
1. Initiate storage migration
2. Use the VMkernel data mover or the VMware vSphere Mirror Driver
56
vSphere Storage vMotion Architecture
Simultaneously Change Host and Storage
vSphere vMotion also allows both storage and host to be changed at the same time
Since vSphere 6.x, the VM can also be migrated between networks and between vCenter Servers
57
Availability
VMware vSphere High Availability
VMware vSphere Fault Tolerance
VMware vSphere Distributed Resource Scheduler
58
Availability
VMware vSphere High Availability
59
vSphere High Availability
vSphere High Availability is an availability solution that monitors hosts and restarts virtual machines in the case of a host failure
• OS- and application-independent, requiring no complex configuration changes
• Agents on the ESXi hosts monitor for the types of failures covered on the following slides (host failures, network partition, host isolation, and VM and component failures)
60
vSphere High Availability Architecture – Overview
A cluster of up to 64 ESXi hosts is created
• One of the hosts is elected as the master when HA is enabled
• Network heartbeats and storage heartbeats are exchanged between the master and the slave hosts
61
vSphere High Availability Architecture – Host Failures
Master
62
vSphere High Availability Architecture – Host Failures
[Diagram: the master declares the slave host dead.]
63
vSphere High Availability Architecture – Host Failures
64
vSphere High Availability Architecture – Network Partition
[Diagram: a management network failure splits the cluster into partitions A and B; each partition elects its own master.]
65
vSphere High Availability Architecture – Host Isolation
Master
66
vSphere High Availability Architecture – VM Monitoring
Master
67
vSphere High Availability Architecture – VM Component Protection
Master
68
Availability
VMware vSphere Fault Tolerance
69
vSphere FT
vSphere FT is an availability solution that provides
continuous availability for virtual machines
• Zero downtime
• Zero data loss
70
vSphere FT Architecture
vSphere FT creates two complete virtual machines when enabled with vSphere 7.x
Primary VM Secondary VM
71
vSphere FT Architecture – Memory Checkpoint
vSphere FT in vSphere 6.7 uses fast checkpoint technology
• This is similar to how vSphere vMotion works, but it is done continuously (rather than once)
• The fast checkpoint is a snapshot of all data, not just memory (memory, disks, devices, and so on)
• The vSphere FT logging network has a minimum requirement of a 10 Gbps NIC
[Diagram: primary and secondary VMs (VM A) kept in sync via a memory bitmap, with the end user connected over the production network.]
72
Availability
VMware vSphere Distributed Resource Scheduler
(DRS)
73
DRS
DRS is a technology that monitors load and resource usage and uses vSphere vMotion to balance virtual machines across hosts in a cluster
• DRS also includes VMware Distributed Power Management (DPM), which allows hosts to be evacuated and powered off during periods of low utilization
DRS uses vSphere vMotion functionality to migrate VMs
It can be used in three ways (see the sketch after this list)
• Fully automated – DRS acts on recommendations automatically
• Partially automated – DRS acts only for initial VM power-on placement; an administrator has to approve migration recommendations
• Manual – administrator approval is required for all movements
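As a sketch of how the automation level might be set programmatically — assuming pyVmomi and an already-resolved cluster object; the behavior strings follow the vSphere API's DrsBehavior values:

```python
from pyVmomi import vim

def set_drs_mode(cluster, behavior: str = "fullyAutomated"):
    """Reconfigure a vim.ClusterComputeResource's default DRS automation level.

    Valid DrsBehavior values: "manual", "partiallyAutomated", "fullyAutomated".
    """
    spec = vim.cluster.ConfigSpecEx(
        drsConfig=vim.cluster.DrsConfigInfo(enabled=True,
                                            defaultVmBehavior=behavior))
    # Returns a task; wait on it with your preferred task helper
    return cluster.ReconfigureComputeResource_Task(spec, modify=True)
```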
74
DRS Architecture
DRS generates migration recommendations based on how aggressively it has been configured
For example
• The three hosts on the left side of the figure are unbalanced
• Host 1 has six virtual machines; its resources might be overused while ample resources are available on Host 2 and Host 3
• DRS migrates (or recommends the migration of) virtual machines from Host 1 to Host 2 and Host 3
• The right side of the diagram shows the properly load-balanced configuration of the hosts that results
75
Distributed Power Management Architecture
DPM generates migration recommendations similarly to DRS, but in terms of achieving power savings
• It can be configured for how aggressively you want to save power
For example
• The three hosts on the left side of the figure have virtual machines running, but they are mostly idle
• DPM determines that, given the load of the environment, shutting down Host 3 will not impact the level of performance for the VMs
• DPM migrates (or recommends the migration of) virtual machines from Host 3 to Host 2 and Host 1 and puts Host 3 into standby mode
• The right side of the diagram shows the resulting power-managed configuration, with Host 3 in standby
76
Improved DRS in vSphere 7.0
Why?
77
Improved DRS
Compared with Previous Releases
Original DRS
• Cluster centric
• Runs every 5 min
• Uses cluster-wide standard
deviation model
Improved DRS
• Workload centric
• Runs every 1 min
• Uses the VM DRS Score
• Based on granted memory
78
Improved DRS
VM DRS Score
79
Improved DRS
Scalable Shares
80
Improved DRS
Scalable Shares
81
Improved DRS
Scalable Shares
• Scalable Shares are configured at the cluster level and/or the resource pool level
• Not enabled by default
• Scalable Shares are used by default for vSphere with Kubernetes, where a Namespace = Resource Pool (see the sketch below)
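A simplified illustration of the scalable-shares idea — an assumption-laden model, not VMware's exact algorithm: the pool's share value grows with what its member VMs would claim, so a pool with more VMs is no longer starved relative to a small one.

```python
# Per-vCPU values mirror the classic VM share levels (low/normal/high).
SHARES_PER_VCPU = {"low": 500, "normal": 1000, "high": 2000}

def vm_shares(level: str, vcpus: int) -> int:
    return SHARES_PER_VCPU[level] * vcpus

def pool_shares_scalable(vms) -> int:
    """Pool entitlement scales with the shares of the VMs inside it."""
    return sum(vm_shares(level, vcpus) for level, vcpus in vms)

small = pool_shares_scalable([("normal", 2)] * 4)   # -> 8000
large = pool_shares_scalable([("normal", 2)] * 12)  # -> 24000
print(small, large)  # the fuller pool now receives proportionally more shares
```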
82
Content Library
83
Content Library
The Content Library is a distributed template, media, and script library for vCenter Server
[Diagram: a publisher Content Library on one vCenter and a subscriber Content Library on another; the subscriber subscribes to the publisher and syncs its items.]
84
Content Library Architecture – Publication and Subscription
Publication and subscription allow libraries to be shared between vCenter Servers
Provides a single source for content that can be configured to download and sync according to schedules or timeframes
[Diagram: one vCenter publishes the library; the other subscribes, optionally with a password.]
85
Content Library Architecture – Content Synchronization
Content synchronization occurs when content changes
Simple versioning is used to denote the modification, and the changed item is transferred from publisher to subscriber via HTTP GET
86
Content Library with vSphere 7.0 Improvements
VM Template Check-In/Check-Out & Versioning
• Quickly find VM template versions
• Check out templates for edits
• Check in templates to save changes made
• Revert to previous versions
87
Content Library
VM Template Versioning Tab
89
Content Library
Advanced Configuration and Optimization
• Advanced configurations in the Content Library
• Edit auto-sync frequency and performance optimization
91
Content Library
Developer Center >> API Explorer
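For orientation, a minimal Python sketch of the Content Library REST endpoints that API Explorer documents; the vCenter hostname and credentials are placeholders, and the paths follow the vSphere 7.0 /rest API:

```python
import requests

VC = "https://vcenter.example.com"
s = requests.Session()
s.verify = False  # lab use only

# Create an API session, then send its token with every request
token = s.post(f"{VC}/rest/com/vmware/cis/session",
               auth=("administrator@vsphere.local", "changeme")).json()["value"]
s.headers["vmware-api-session-id"] = token

# List library IDs, then fetch each library's summary
for lib_id in s.get(f"{VC}/rest/com/vmware/content/library").json()["value"]:
    lib = s.get(f"{VC}/rest/com/vmware/content/library/id:{lib_id}").json()["value"]
    print(lib["name"], lib["type"])
```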
92
VMware Certificate Authority
93
VMware Certificate Authority
In vSphere 7.x, vCenter ships with an internal Certificate Authority (CA) called the VMware Certificate
Authority
Issues certificates for VMware components under its own authority in the vSphere ecosystem
VMware CA issues certificates only to clients that present credentials from VMDirectory in its own
identity domain
• It also posts its root certificate to its own server node in VMware Directory Services
94
How is the VMware Certificate Authority Used?
Machine’s SSL certificate
• Used by reverse proxy on every vSphere node
• Used by the VMware Directory Service on Platform Services Controller and Embedded nodes
• Used by VPXD on Management and Embedded nodes
95
Simplified Certificate Management
vSphere 6.x: Lots of Certificates
96
Simplified Certificate Management
vSphere 7: Much Simpler
97
Simplified Certificate Management
New Wizard for Certificate Import
98
Storage
iSCSI Storage Architecture
NFS Storage Architecture
Fibre Channel Architecture
Other Storage Architectural Concepts
99
Storage
Local and/or shared storage is a core requirement for full utilization of ESXi features
Many kinds of storage can be used with vSphere ESXi hosts
• Local disks
• Fibre Channel (FC) SANs
• iSCSI SANs
• NAS
• Virtual SAN
• Virtual Volumes (VVOLs)
Datastores are generally formatted with either the VMware vSphere VMFS file system or NFS
100
Storage – Protocol Features
Each protocol has its own set of supported features
[Table: feature-support matrix — Fibre Channel, FCoE, and iSCSI support all five listed features; NFS and vSAN support three.]
101
Storage
iSCSI Storage Architecture
102
Storage Architecture – iSCSI
iSCSI storage utilizes regular IP traffic over a standard network to transport iSCSI commands
The ESXi host connects through one of several types of iSCSI initiator
103
Storage Architecture – iSCSI Components
All iSCSI systems share a common set of components that are used to provide the storage access
104
Storage Architecture – iSCSI Addressing
Other than the standard IP addresses, iSCSI targets are identified by names as well
105
Storage
NFS Storage Architecture
106
Storage Architecture – NFS Components
Much like iSCSI, NFS accesses storage over the
network
107
Storage Architecture – Addressing and Access Control with NFS
ESXi accesses NFS through the NFS server address/name through a VMkernel port
NFS version 4.1 and NFS version 3 are both available with vSphere 7.x
Different features are supported with different versions of the protocol
• NFS 4.1 supports multipathing, unlike NFS 3
• NFS 3 supports all vSphere features; NFS 4.1 does not support Storage DRS, VMware vSphere Storage I/O Control, VMware vCenter Site Recovery Manager™, or Virtual Volumes
108
Storage
Fibre Channel Architecture
109
Storage Architecture – Fibre Channel
Unlike network storage such as NFS or iSCSI, Fibre Channel does not generally use an IP network for storage access
• The exception is Fibre Channel over Ethernet (FCoE)
110
Storage Architecture – Fibre Channel Addressing and Access Control
Zoning and LUN masking are used for access control to storage LUNs
111
Storage Architecture – FCoE Adapters
FCoE adapters allow access to Fibre Channel storage over Ethernet connections
Enables expansion to Fibre Channel SANs in many cases where no Fibre Channel infrastructure exists
[Diagram: a hardware FCoE adapter in an ESXi host vs. software FCoE in an ESXi 5.x host, each bridging the LAN and the FC SAN through network and FC drivers.]
112
Storage
Other Storage Architectural Concepts
113
Multipathing
114
vSphere Storage I/O Control
vSphere Storage I/O Control allows storage traffic to be prioritized during periods of contention
[Diagram: the same mix of workloads with and without Storage I/O Control; with it, datastore I/O is shared according to priority.]
115
Datastore Clusters
A collection of datastores with shared resources similar to ESXi host clusters
Storage DRS can be used to manage the resources and ensure they are balanced
116
Software-Defined Storage
Software-defined storage is a software construct used by
• Virtual Volumes
• vSAN
117
Networking
118
Networking
Networking is also a core resource for vSphere
119
Networking Architecture
[Diagram: the management network connects to the VMkernel through a VMkernel port.]
121
Network Architecture – NIC Teaming and Load Balancing
NIC teaming enables multiple NICs to be connected to a single virtual switch for continued access to networks if hardware fails
• This can also enable load balancing (if appropriate)
122
VMware vSphere Network I/O Control
vSphere Network I/O Control allows network traffic to be prioritized during periods of contention
• Brings the compute-style shares/limits model to the networking infrastructure
[Diagram: traffic types sharing a 10 GigE uplink.]
123
Software-Defined Networking
Software-defined networking is a software construct that allows your physical network to be treated as a pool of transport capacity, with network and security services attached to VMs through a policy-driven approach
Decouples the network configuration from the physical infrastructure
Allows for security and micro-segmentation of traffic
A key tenet of the software-defined data center (SDDC)
124
vSphere with Tanzu
vSphere 7 Update 1
125
Simplified Deployment and Consumption
vSphere with Tanzu
[Diagram: the same vSphere with Tanzu / VCF with Tanzu stack shown earlier — namespaces for DB and analytics, AI/ML, business-critical, and time-critical workloads, served by the Tanzu Kubernetes Grid Service and the vSphere Pod, Registry, Network, and Storage services for developers and IT operators.]
Benefit
Configure an enterprise-grade Kubernetes infrastructure on your choice of networking, storage, and load-balancing solutions
127
Building on the Best
vSphere with Tanzu Architecture
[Diagram: Tanzu Kubernetes clusters of pods and VMs run on Supervisor Clusters via the VM Operator, Cluster API, and Tanzu Kubernetes Grid, over SDDC services — Tanzu Kubernetes Grid, vSphere Pods, networks, volumes, and registry — with management, frontend, and workload port groups, the Kubernetes control plane behind an HA proxy, and TKG cluster nodes underneath.]
129
Announcing vSphere With Tanzu
The simplest implementation of Kubernetes, brought to the fingertips of millions of IT admins
Available today
131
vSphere Clustering Services (vCLS)
Distributing the vSphere Cluster Services
132
AMD SEV-ES
vSphere 7 Update 1
133
Wider Interest in Infrastructure Security
High-profile hardware issues mean people are asking the right questions
vSphere Admin: “We put a lot of trust in the infrastructure. I want them to know I cannot see inside their workloads!”
Workload Admins: “How do we limit our exposure and risk? We'd like to add defense-in-depth.”
CISO, Risk and Compliance Auditors: “How do we assure our customers of privacy? Can we know vSphere Admin isn't watching us?”
134
vSphere Isolation Protects Workloads From Each Other
ESXi Defense-in-Depth & Least Privilege, enhanced with SEV-ES
[Diagram: on an ESXi host, each VM's CPU and memory state is protected with its own encryption key (A, B, C), isolating workloads from each other.]
135
Security is Always a Tradeoff
…but it’s very nice to have great options
AMD SEV-ES
Considerations
• Requires AMD EPYC 7xx2 CPUs
• Requires guest OS support
• vMotion, memory snapshots, hot-add, suspend/resume, Fault Tolerance, clones, and guest integrity are not supported
Benefits
• Workloads gain deep data-in-use protections without modification!
• Coexists with other workloads
• Containers & modern applications (Tanzu) make most operational considerations invisible
• Supports SEV-ES (memory encryption + encrypted register state), not just SEV
• Easy to enable & operate (PowerCLI command for the VM)
137
Establishing Trust in Hardware Can Be Troublesome
vSphere Trust Authority
138
Establishing Trust in vSphere 6.7
A great start
ESXi Hosts
139
vSphere Trust Authority Automates & Enforces The Rules
Secure Infrastructure at Scale
Gain isolation
140
Improvements to vSphere Trust Authority
Based on customer feedback
141
Virtual Disk Development Kit
(VDDK)
vSphere 7 Update 1
142
VDDK Improvements
NIOC resource pool for backup network traffic
Network I/O Control helps prioritize and resolve conflicts when network traffic competes for
resources. Now backup traffic can be prioritized as well.
144
Agenda
VMware ESXi™
Virtual Machines
Availability
Storage
Networking
Lifecycle Manager (vLCM)
If you require more detailed information, see the VMware vSphere documentation (https://docs.vmware.com/en/VMware-vSphere/index.html); VMware Global Support Services might also be of assistance
146
ESXi
147
Components of ESXi
The ESXi architecture comprises the underlying operating system, called the VMkernel, and processes that
run on top of it
VMkernel provides a means for running all processes on the system, including management applications
and agents as well as virtual machines
It has control of all hardware devices on the server and manages resources for the applications
148
Components of ESXi (cont.)
Direct Console User Interface
• Low-level configuration and management interface, accessible through the console of the server, used primarily
for initial basic configuration
149
ESXi Deep Dive
VMkernel
• A POSIX-like operating system developed by VMware, which provides certain functionality similar to that found
in other operating systems, such as process creation and control, signals, file system, and process threads
• Designed specifically to support running multiple virtual machines and provides core functionality such as
– Resource scheduling
– I/O stacks
– Device drivers
• Some of the more pertinent aspects of the VMkernel are presented in the following sections
150
ESXi Deep Dive (cont.)
File System
• VMkernel uses a simple in-memory file system to hold the ESXi Server configuration files, log files, and staged
patches
• The file system structure is designed to be the same as that used in the service console of traditional ESX Server.
For example
– ESX Server configuration files are found in /etc/vmware
– Log files are found in /var/log/vmware
– Staged patches are uploaded to /tmp
• This file system is independent of the VMware vSphere VMFS file system used to store virtual machines
• The in-memory file system does not persist when the power is shut down. Therefore, log files do not survive a
reboot if no scratch partition is configured
• ESXi has the ability to configure a remote syslog server and remote dump server, enabling you to save all log
information on an external system
151
ESXi Deep Dive (cont.)
User Worlds
• The term user world refers to a process running in the VMkernel operating system. The environment in which a user world runs is limited compared to what is found in a general-purpose POSIX-compliant operating system such as Linux
– The set of available signals is limited
– The system API is a subset of POSIX
– The /proc file system is very limited
• A single swap file is available for all user world processes. If a local disk exists, the swap file is created
automatically in a small VFAT partition. Otherwise, the user is free to set up a swap file on one of the attached
VMFS datastores
• Several important processes run in user worlds. Think of these as native VMkernel applications. They are
described in the following sections
152
ESXi Deep Dive (cont.)
Direct Console User Interface (DCUI)
• DCUI is the local user interface that is displayed only on the console of an ESXi system
• It provides a BIOS-like, menu-driven interface for interacting with the system. Its main purpose is initial
configuration and troubleshooting
• The DCUI configuration tasks include
– Set administrative password
– Set Lockdown mode (if attached to VMware vCenter™)
– Configure and revert networking tasks
• Troubleshooting tasks include
– Perform simple network tests
– View logs
– Restart agents
– Restore defaults
153
ESXi Deep Dive (cont.)
Other User World Processes
• Agents used by VMware to implement certain management capabilities have been ported from running in the
service console to running in user worlds
– The hostd process provides a programmatic interface to VMkernel, and it is used by direct VMware vSphere Client™
connections as well as APIs. It is the process that authenticates users and keeps track of which users and groups have which
privileges
– The vpxa process is the agent used to connect to vCenter. It runs as a special system user called vpxuser. It acts as the
intermediary between the hostd agent and vCenter Server
– The FDM agent used to provide vSphere High Availability capabilities has also been ported from running in the service
console to running in its own user world
– A syslog daemon runs as a user world. If you enable remote logging, that daemon forwards all log files to the remote target in
addition to putting them in local files
– A process that handles initial discovery of an iSCSI target, after which point all iSCSI traffic is handled by the VMkernel, just
as it handles any other device driver
154
ESXi Deep Dive (cont.)
Open Network Ports – A limited number of network ports are open on ESXi. The most important ports and
services are
• 80 – This port serves a reverse proxy that is open only to display a static Web page that you see when browsing to
the server. Otherwise, this port redirects all traffic to port 443 to provide SSL-encrypted communications to the
ESXi Server
• 443 (reverse proxy) – This port also acts as a reverse proxy to a number of services to provide SSL-encrypted
communication to these services. The services include API access to the host, which provides access to the
RCLIs, the vSphere Client, vCenter Server, and the SDK
• 5989 – This port is open for the CIM server, which is an interface for third-party management tools
• 902 – This port is open to support the older VIM API, specifically the older versions of the vSphere Client and
vCenter
• Many other services (vSphere High Availability, vSphere vMotion, and so on) have their own port requirements, but these ports are opened only if the corresponding services are configured (see the probe sketch below)
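A quick way to verify reachability of these ports from a management station is a simple TCP probe; the hostname below is a placeholder, and a successful TCP connect only proves the port answers, not that the service is healthy:

```python
import socket

ESXI_PORTS = {80: "HTTP redirect", 443: "reverse proxy / API",
              902: "legacy VIM API", 5989: "CIM server"}

def is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port, desc in ESXI_PORTS.items():
    state = "open" if is_open("esxi01.example.com", port) else "closed/filtered"
    print(f"{port:>5} ({desc}): {state}")
```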
155
ESXi Troubleshooting
Troubleshooting ESXi is very much the same as any operating system
156
ESXi Best Practices
For in-depth ESXi and other component practices, read the Performance Best Practices Guide (https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/performance/vsphere-esxi-vcenter-server-67-performance-best-practices.pdf)
Always set up the VMware vSphere Syslog Collector (Windows) / VMware Syslog Service (Appliance) to remotely
collect and store the ESXi log files
Always set up the VMware vSphere ESXi Dump Collector Service to allow dumps to be remotely collected in the case
of a VMkernel failure
Ensure that only the firewall ports required by running services are enabled in the Security profile
Ensure the management network is isolated from the general network (VLAN) to decrease the attack surface of the
hosts
Ensure the management network has redundancy through NIC Teaming or by having multiple management interfaces
Ensure that the ESXi Shell and SSH connectivity are not permanently enabled
157
VM Troubleshooting and Best
Practices
158
Virtual Machine Troubleshooting
Virtual machines run as processes on the ESXi host
ESXi host virtual machine log files are located, by default, in the directory in which the virtual machine runs, and are named vmware.log
Generally issues occur as a result of a problem in the guest OS
• Host level crashes of the VM processes are relatively rare and are normally a result of hardware errors or
compatibility of hardware between hosts
159
Virtual Machine Best Practices
Virtual machines should always run VMware Tools™ to ensure that the correct drivers are installed for virtual hardware
Right-size VMs to ensure that they use only required hardware. If VMs are provisioned with an over-allocation of
resources that are not used, ESXi host performance and capacity is reduced
Any devices not being used should be disconnected from VMs (CD-ROM/DVD, floppy, and so on)
If NUMA is used on ESXi, VMs should be right-sized to the size of the NUMA nodes on the host to avoid performance
loss
VMs should be stored on shared storage to allow for the maximum vSphere vMotion compatibility and vSphere High
Availability configurations in a cluster
Memory/CPU reservations should not be used routinely because they reserve the resource and can prevent the VMware vSphere Hypervisor from taking advantage of overcommitment technologies
VM partitions should be aligned to the storage array partition alignment
Storage and Network I/O Control can dramatically help VM performance in times of contention
160
vCenter Server
161
vCenter Server 7.0
162
vCenter Appliance Deployment
• Changed significantly!
• Installer support for Windows, Mac, and Linux
• Updated menu: Install, Upgrade, Migrate, Restore
• No longer supports external databases!
• VMware vSphere Update Manager included
• vCenter Appliance (incl. PSC install) is a two-stage process
  – Stage 1 – Deploy OVF
  – Stage 2 – Configuration
163
vCenter Appliance Migration – 7.0
Adds support for migrating Windows vCenter 6.x to the vCenter 7.0 Appliance
• Windows is no longer a supported architecture; therefore migration is required.
164
vCenter Best Practices
Verify that vCenter, the Platform Services Controller, and any database have adequate CPU, memory, and
disk resources available
Verify that the proper inventory size is configured during the installation
Minimize latency between components (vCenter and Platform Services Controller) by minimizing network
hops between components
External databases should be used for large deployments (applies to vCenter versions before 7.0, which supports only the embedded database)
If using Enhanced Linked Mode on vSphere 6.x, VMware recommended external Platform Services Controllers (not applicable to 7.0, where only the embedded architecture exists)
Verify that time is correct on vCenter and all other components in the environment
VMware vSphere Update Manager™ for Windows should be installed on a separate system if inventory is
large
165
vSphere vMotion
166
vSphere vMotion
Troubleshooting and Best Practices
167
vSphere vMotion and vSphere Storage vMotion Troubleshooting
vSphere vMotion and vSphere Storage vMotion are some of the best logged features in vSphere
Each migration that occurs has a unique Migration ID (MID) that can be used to search logs for the vSphere
vMotion and vSphere Storage vMotion
• MIDs look as follows: 1295599672867508
Each time a vSphere vMotion and vSphere Storage vMotion is attempted, all logs can be reviewed to find the
error using grep and searching for the term Migrate
Both the source and the destination logs should be reviewed
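That grep step can be scripted; the sketch below scans a locally gathered log bundle for a Migration ID or the term Migrate — the paths are examples, and the MID is the sample from this slide:

```python
import re
from pathlib import Path

def find_migration_lines(log_dir: str, mid: str):
    """Yield (file, line) pairs mentioning the Migration ID or 'Migrate'."""
    pattern = re.compile(rf"{re.escape(mid)}|Migrate")
    for log in Path(log_dir).glob("**/*.log"):
        for line in log.read_text(errors="replace").splitlines():
            if pattern.search(line):
                yield log.name, line

for name, line in find_migration_lines("./support-bundle/var/log", "1295599672867508"):
    print(f"{name}: {line}")
```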
168
vSphere vMotion Troubleshooting – Example vmkernel.log – Source
170
vSphere vMotion Best Practices
ESXi host hardware should be as similar as possible to avoid failures
VMware Virtual Machine Hardware compatibility is important to avoid failures as newer hardware
revisions cannot be run on older ESXi hosts
10 Gb networking will improve vSphere vMotion performance
vSphere vMotion networking should be segregated from other traffic to prevent saturation of network links
Multiple network cards can be configured for vSphere vMotion VMkernel networking to improve
performance of migrations
171
vSphere Storage vMotion Best Practices
If vSphere Storage vMotion traffic takes place on storage that might also have other I/O loads (from other VMs on the
same ESXi host or from other hosts), it can further reduce the available bandwidth, so it should be done during times
when there will be less impact
vSphere Storage vMotion will have the highest performance during times of low storage activity (when available
storage bandwidth is highest) and when the workload in the VM being moved is least active
vSphere Storage vMotion can perform up to four simultaneous disk copies per vSphere Storage vMotion operation.
However, vSphere Storage vMotion will involve each datastore in no more than one disk copy at any one time. This
means, for example, that moving four VMDK files from datastore A to datastore B will occur serially, but moving four
VMDK files from datastores A, B, C, and D to datastores E, F, G, and H will occur in parallel
For performance-critical vSphere Storage vMotion operations involving VMs with multiple VMDK files, you can use
anti-affinity rules to spread the VMDK files across multiple datastores, thus ensuring simultaneous disk copies
vSphere Storage vMotion will often have significantly better performance on vStorage APIs for Array Integration
(VAAI)-Capable storage arrays
172
Availability
vSphere High Availability
173
vSphere High Availability Deep Dive
In the vSphere High Availability architecture, each host in the cluster runs an FDM agent
The FDM agents do not use vpxa and are completely decoupled from it
The agent (or FDM) on one host is the master, and the agents on all other hosts are its slaves
When vSphere High Availability is enabled, all FDM agents participate in an election to choose the master
If the host that is serving as the master should subsequently fail, be shut down, or need to abdicate its role, a new master election is held
174
vSphere High Availability Deep Dive – Role of the Master
A master monitors ESXi hosts and VM availability
A master will monitor slave hosts and it will restart VMs in the event of a slave host failure
It manages the list of hosts that are members of the cluster and manages adding and removing hosts from
the cluster
It monitors the power state of all the protected VMs, and if one should fail, it will restart the VM
It manages the list of protected VMs and updates this list after each user-initiated power on or power off
It sends heartbeats to the slaves so the slaves know the master is alive
It caches the cluster configuration and informs the slaves of changes in configuration
175
vSphere High Availability Deep Dive – Role of the Slave
A slave monitors the runtime state of the VMs running locally and forwards significant state changes to the
master
It implements vSphere High Availability features that do not require central coordination, most notably VM
health monitoring
It monitors the health of the master, and if the master should fail, it participates in a new master election
176
vSphere High Availability Deep Dive – Master and Slave Summary Views
Master Slave
View View
177
vSphere High Availability Deep Dive – Master Election
A master is elected when the following conditions occur
• vSphere High Availability is enabled
• A master host fails
• A management network partition occurs
After a master is elected and contacts vCenter, vCenter sends a compatibility list to the master which saves
it on its local disk, and then pushes it out to the slave hosts in the cluster
vCenter normally talks only to the master. It will sometimes talk to FDM agents on other hosts, especially if the master states that it cannot reach a slave agent; vCenter will then try to contact the other host to figure out why
178
vSphere High Availability Deep Dive – Partitioning
Under normal operating conditions, there is only one master
However, if a management network failure occurs, a subset of the hosts might become isolated. This means
that they cannot communicate with the other hosts in the cluster over the management network
In such a situation, when hosts can continue to ping the isolation response IP address but not the other hosts, FDM calls them network partitioned
Each partition without an existing master will elect a new one
Thus, a partitioned cluster state will have multiple masters, one per partition
However, vCenter cannot report back on more than one master, so you might get details for only one partition – the master that vCenter finds first
When a network partition is corrected, one of the masters will take over from the others, thus reverting
back to a single master
179
vSphere High Availability Deep Dive – Isolation
In some ways this is similar to a network partition state, except that a host can no longer ping the default
gateway/isolation IP address
In this case, a host is called network isolated
The host has the ability to inform the master that it is in this isolation state, through files on the heartbeat
datastores, which will be discussed shortly
Then the Host Isolation Response is checked to see when the VMs on this host should be shut down or left
powered on
If they are powered off, they can be restarted on other hosts in the cluster
180
vSphere High Availability Deep Dive – Virtual Machine Protection
The master is responsible for restarting any protected VMs that fail
The trigger to protect a VM is the master observing that the power state of the VM changes from powered
off to powered on
The trigger to unprotect a VM is the master observing the VM's power state changing from powered on to powered off
After the master protects the VM, the master will inform vCenter that the VM has been protected, and
vCenter will report this fact through the vSphere High Availability Protection runtime property of the VM
181
HA Troubleshooting and Best
Practices
182
vSphere High Availability Troubleshooting
Troubleshooting vSphere High Availability since vSphere 5.x is greatly simplified
• Agents were upgraded from using a third-party component to using a component built by VMware called Fault
Domain Manager (FDM)
A single log file, fdm.log, now exists for communication of all events related to vSphere High
Availability
When troubleshooting a vSphere High Availability failure, be sure to collect logs from all hosts in the
cluster
• This is because when a vSphere High Availability event occurs, VMs might be moved to any host in the cluster.
To track all events, the FDM log for each host (including the master host) is required
184
vSphere High Availability Best Practices (cont.)
Interoperability
• Do not mix versions of ESXi in the same cluster
• vSAN uses its network for vSphere High Availability, rather than the default
• When enabling vSAN, vSphere High Availability should be disabled first and then enabled
Admission Control
• Select the policy that best matches the need in the environment
• Do not disable admission control or VMs might not all be able to fail over if an event occurs
• Size hosts equally to prevent imbalances
185
Availability
vSphere FT
186
vSphere FT Troubleshooting
vSphere FT has been completely rewritten in vSphere 6.x and beyond
Now, CPU compatibility is the same as vSphere vMotion compatibility because the same technology is
used to ship memory, CPU, storage, and network states across to the secondary virtual machine
When troubleshooting
• Get logs for both primary and secondary VMs and hosts
• Grab logs before log rotation
• Ensure time is synchronized on all hosts
When reviewing the configuration, you should find both primary and secondary VMX logs in the primary
VMs directory
• They will be named vmware.log and vmware-snd.log
Also, be sure to review vmkernel.log and hostd.log from both the primary and secondary hosts for
errors
187
vSphere FT Troubleshooting – General Things To Look For (vmkernel, vmx)
2016-10-17T18:12:25.892Z cpu3:35660)FTCpt: 2401: (1389982345707340120 pri) Primary init: nonce
2791343341
2016-10-17T18:12:25.892Z cpu3:35660)FTCpt: 2440: (1389982345707340120 pri) Setting
allowedDiffCount = 64
2016-10-17T18:12:25.892Z cpu3:35660)FTCpt: 1217: Queued accept request for ftPairID
1389982345707340120
2016-10-17T18:12:25.892Z cpu3:35660)FTCpt: 2531: (1389982345707340120 pri) vmx 35660 vmm
35662
2016-10-17T18:12:25.892Z cpu1:32805)FTCpt: 1262: (1389982345707340120 pri) Waiting for
connection
vSphere FT messages are prefixed with “FTCpt:”
188
vSphere FT Troubleshooting – Legacy vSphere FT or vSphere FT?
vmware.log file
189
vSphere FT Troubleshooting – Has vSphere FT Started?
vmkernel.log
• 2016-10-17T14:32:13.607Z cpu5:89619)FTCpt: 3831:
(1389969072618873992 pri) Start stamp: 2016-10-
17T14:32:13.607Z nonce 409806199
•…
• 2016-10-17T14:46:23.860Z cpu2:89657)FTCpt: 9821:
(1389969072618873992 pri) Last ack stamp: 2016-10-
17T14:46:15.639Z nonce 409806199
vmware.log
• 2016-10-21T22:56:01.635Z| vcpu-0| I120: FTCpt:
Activated ftcpt in VMM.
If you do not see these, vSphere FT may not have started
190
vSphere FT Best Practices
Hosts running primary and secondary VMs should run at approximately the same processor frequency to
avoid errors
• Homogeneous clusters work best for vSphere FT
All hosts should have
• Common access to datastores used by VMs
• The same virtual network configuration
• The same BIOS settings (power management, hyper threading, and so on)
FT Logging networks should be configured with 10 Gb networking connections
Jumbo frames can also help performance of vSphere FT
Network configuration should
• Distribute each NIC team over two physical switches
• Use deterministic teaming policies to ensure network traffic affinity
ISOs should be stored on shared storage
191
Availability
vSphere Distributed Resource Scheduler
192
DRS
Troubleshooting and Best Practices
193
DRS Troubleshooting
DRS uses a proprietary algorithm to assess resource usage and determine which hosts to balance VMs onto
DRS primarily uses vMotion to facilitate movements
• Troubleshooting failures generally consists of figuring out why vMotion failed, not DRS itself, as the algorithm just follows resource utilization
To test DRS, from the vSphere Web Client, select the Run DRS option, which will initiate
recommendations
Failures can be assessed and corrected at that time
194
DRS Best Practices
Hosts should be as homogeneous as possible to ensure predictability of DRS placements
vSphere vMotion should be compatible for all hosts or DRS will not function
The more hosts available, the better DRS functions because there are more options for available placement
of VMs
VMs that have a smaller CPU/RAM footprint provide more opportunities for placement across hosts
DRS anti-affinity rules should be used to keep VMs apart, such as in the case of a load-balanced configuration providing high availability
195
VMware Certificate Authority
196
VMware CA – Management Tools
A set of CLIs is available for managing VMware CA, the VMware Endpoint Certificate Store, and the VMware Directory Service
certool
• Use to generate private keys, public keys
• Use to request a certificate
• Used to promote a plain Certificate Server to a Root CA
dir-cli
• Use to create/delete/list/manage solution users in VMDirectory
vecs-cli
• Use to create/delete/list/manage key stores in VMware Endpoint Certificate Store
• Use to create/delete/list/manage private keys and certificates in the key stores
• Use to manage the permissions on the key stores
197
VMware CA – Management Tools (cont.)
By default, the tools are in the following locations
Platform Location
Linux /usr/lib/vmware-vmafd/bin/vecs-cli
/usr/lib/vmware-vmafd/bin/dir-cli
/usr/lib/vmware-vmca/bin/certool
198
certool Configuration File
certool uses a configuration file called certool.cfg
• Override it with --config=<file name>, or override individual values on the command line, for example --Locality=“Cork”
OS Location
VCSA /usr/lib/vmware-vmca/share/config
certool.cfg
Country = US
Name= cert
Organization = VMware
OrgUnit = Support
State = California
Locality = Palo Alto
IPAddress = 127.0.0.1
Email = ca@vmware.com
Hostname = machine.vmware.com
199
Machine SSL Certificates
The SSL certificates for each node, also called machine certificates, are used to establish a socket that
allows secure communication. For example, using HTTPS or LDAPS
During installation, VMware CA provisions each machine (vCenter / ESXi) with an SSL certificate
• Used for secure connections to other services and for other HTTPS traffic
200
Solution User Certificates
Solution user certificates are used for authentication to vCenter Single Sign-On
• vCenter Single Sign-On issues the SAML tokens that allow services and other users to authenticate
The Security Assertion Markup Language (SAML) token contains group membership information so that
the SAML token could be used for authorization operations
Solution user certificates enable the solution user to use any other vCenter service that vCenter Single
Sign-On supports without authenticating
201
Certificate Deployment Options
VMware CA Certificates
• You can use the certificates that VMware CA assigned to vSphere components as is
– These certificates are stored in the VMware Endpoint Certificate Store on each machine
– VMware CA is a Certificate Authority, but because all certificates are signed by VMware CA itself, the certificates do not
include a certificate chain
202
VMware CA Best Practices
Replacement of the certificates is not required to have trusted connections
• VMware CA is a CA, and therefore, all certificates used by vSphere components are fully valid and trusted
certificates
• Addition of the VMware CA as a trusted root certificate will allow the SSL warnings to be eliminated
203
Storage
204
Storage Troubleshooting
Troubleshooting storage is a broad topic that very much depends on the type of storage in use
Consult the vendor to determine what is normal and expected for storage
In general, the following are problems that are frequently seen
• Overloaded storage
• Slow storage
205
Problem 1 – Overloaded Storage
Monitor the number of disk commands aborted on the host
• If Disk Command Aborts > 0 for any LUN, then storage is overloaded on that LUN
206
Problem 2 – Slow Storage
For a host’s LUNs, monitor Physical Device Read Latency and Physical Device Write Latency counters
• If average > 10ms or peak > 20ms for any LUN, then storage might be slow on that LUN
Use the storage device’s monitoring tools to collect data to characterize the workload
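The thresholds from these two slides can be encoded in a simple check; the sample data layout below is invented for illustration — feed it from whatever monitoring export you use:

```python
# LUN stats: aborted commands plus average/peak device latency in ms.
luns = {
    "naa.600a0b80001": {"aborts": 0, "avg_ms": 4.2, "peak_ms": 11.0},
    "naa.600a0b80002": {"aborts": 3, "avg_ms": 14.8, "peak_ms": 42.5},
}

for lun, s in luns.items():
    issues = []
    if s["aborts"] > 0:                        # overloaded storage
        issues.append("overloaded (aborted commands)")
    if s["avg_ms"] > 10 or s["peak_ms"] > 20:  # slow storage
        issues.append("slow (device latency)")
    print(lun, "->", ", ".join(issues) or "OK")
```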
207
Example 1 – Bad Disk Throughput
[Chart: good throughput coincides with low device latency; bad throughput with high latency.]
208
Example 2 – Virtual Machine Power On Is Slow
User complaint – Powering on a virtual machine takes longer than usual
• Sometimes, powering on a virtual machine takes 5 seconds
• Other times, powering on a virtual machine takes 5 minutes!
209
Monitoring Disk Latency Using the vSphere Web Client
Maximum disk latencies range from 100 ms to 1,100 ms
210
Using esxtop to Examine Slow VM Power On
Rule of thumb
• GAVG/cmd > 20 ms = high latency! (see the parsing sketch below)
To resolve the problems of slow or overloaded storage, solutions can include the following
• Verify that hardware is working properly
• Configure the HBAs and RAID controllers for optimal use
• Upgrade your hardware, if possible
• Eliminate all possible swapping to reduce the burden on the storage subsystem
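esxtop batch mode (esxtop -b > out.csv) exports these counters to CSV; a rough parsing sketch follows. Note the GAVG counter appears in batch exports under a perfmon-style name such as "Average Guest MilliSec/Command" — verify the exact header in your export, as this substring match is an assumption:

```python
import csv

def high_gavg(csv_path: str, threshold_ms: float = 20.0):
    """Yield (column, value) for GAVG-style counters above the threshold."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = [i for i, n in enumerate(header) if "Guest MilliSec" in n]
        for row in reader:
            for i in cols:
                if row[i] and float(row[i]) > threshold_ms:
                    yield header[i], float(row[i])

for col, val in high_gavg("esxtop-batch.csv"):
    print(f"high latency: {col} = {val:.1f} ms")
```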
212
Storage Troubleshooting – Balancing the Load
Spread I/O loads over the available paths to the
storage
For disk-intensive workloads
• Use enough HBAs to handle the load
• If necessary, separate storage processors to
separate systems
213
Storage Troubleshooting – Understanding Load
Understand the workload
• Use storage array tools
to capture workload statistics
214
Storage Best Practices – Fibre Channel
Best practices for Fibre Channel arrays
• Place only one VMFS datastore on each LUN
• Do not change the path policy the system sets for you unless you understand the implications of making such a
change
• Document everything. Include information about zoning, access control, storage, switch, server and FC HBA
configuration, software and firmware versions, and storage cable plan
• Plan for failure
– Make several copies of your topology maps. For each element, consider what happens to your SAN if the element fails
– Cross off different links, switches, HBAs and other elements to ensure you did not miss a critical failure point in your design
• Ensure that the Fibre Channel HBAs are installed in the correct slots in the host, based on slot and bus speed.
Balance PCI bus load among the available busses in the server
• Become familiar with the various monitor points in your storage network, at all visibility points, including host's
performance charts, FC switch statistics, and storage performance statistics
• Be cautious when changing IDs of the LUNs that have VMFS datastores being used by your ESXi host. If you
change the ID, the datastore becomes inactive and its virtual machines fail
215
Storage Best Practices – iSCSI
Best practices for iSCSI arrays
• Place only one VMFS datastore on each LUN. Multiple VMFS datastores on one LUN is not recommended
• Do not change the path policy the system sets for you unless you understand the implications of making such a
change
• Document everything. Include information about configuration, access control, storage, switch, server and iSCSI
HBA configuration, software and firmware versions, and storage cable plan
• Plan for failure
– Make several copies of your topology maps. For each element, consider what happens to your SAN if the element fails
– Cross off different links, switches, HBAs, and other elements to ensure you did not miss a critical failure point in your design
• Ensure that the iSCSI HBAs are installed in the correct slots in the ESXi host, based on slot and bus speed.
Balance PCI bus load among the available busses in the server
• If you need to change the default iSCSI name of your iSCSI adapter, make sure the name you enter is worldwide
unique and properly formatted. To avoid storage access problems, never assign the same iSCSI name to different
adapters, even on different hosts
216
Storage Best Practices – NFS
Best practices for NFS arrays
• Make sure that NFS servers you use are listed in the VMware Hardware Compatibility List. Use the correct
version for the server firmware
• When configuring NFS storage, follow the recommendations from your storage vendor
• Verify that the NFS volume is exported using NFS over TCP
• Verify that the NFS server exports a particular share as either NFS 3 or NFS 4.1, but does not provide both
protocol versions for the same share. This policy needs to be enforced by the server because ESXi does not
prevent mounting the same share through different NFS versions
• NFS 3 and non-Kerberos NFS 4.1 do not support the delegate user functionality that enables access to NFS
volumes using nonroot credentials. Typically, this is done on the NAS servers by using the no_root_squash option
• If the underlying NFS volume, on which files are stored, is read-only, make sure that the volume is exported as a
read-only share by the NFS server, or configure it as a read-only datastore on the ESXi host. Otherwise, the host
considers the datastore to be read-write and might not be able to open the files
217
Networking
218
Networking Troubleshooting
Troubleshooting networking is very similar to physical network troubleshooting
Is the issue limited to the virtual environment, or is it seen in the physical environment too?
One of the biggest issues that VMware has observed is dropped network packets (discussed next)
219
Network Troubleshooting – Dropped Network Packets
Network packets are queued in buffers if the
• Destination is not ready to receive them (Rx)
• Network is too busy to send them (Tx)
220
Example Problem 1 – Dropped Receive Packets
If a host’s droppedRx value > 0, there is a network throughput issue
Cause Solution
221
Example Problem 2 – Dropped Transmit Packets
If a host's droppedTx value > 0, there is a network throughput issue
Cause: Traffic from the set of virtual machines sharing a virtual switch exceeds the physical capabilities of the uplink NICs or the networking infrastructure
Solutions: Move some virtual machines with high network demand to a different virtual switch; enhance the networking infrastructure
222
Networking Best Practices
CPU plays a large role in performance of virtual networking. More CPUs, therefore, will generally result in
better network performance
Sharing physical NICs is good for redundancy, but it can impact other consumers if the link is overutilized.
Carefully choose the policies and how items are shared
Traffic between virtual machines on the same system does not need to go external to the host if they are on
the same virtual switch. Consider this when designing the network
Distributed vSwitches should be used whenever possible because they offer greater granularity on traffic
flow than standard vSwitches
vSphere Network and Storage I/O Control can dramatically help with contention on systems. This should
be used whenever possible
VMware Tools, and subsequently VMXNET3 drivers, should be used in all virtual machines to allow for
enhanced network capabilities
223
Lifecycle Manager (vLCM)
vSphere 7 Update 1
224
vLCM and NSX-T Integration
Manage NSX-T Lifecycle with vLCM
Installation of NSX-T
Upgrade of NSX-T
Uninstall of NSX-T
Add/Remove/Move a host in & out of vLCM-enabled clusters
[Diagram: a cluster of ESXi 7 Update 1 hosts.]
225
Install NSX-T on a vLCM Enabled Cluster
vLCM Image Manager will add NSX-T components to vLCM cluster image
226
Configure NSX-T on a vLCM Enabled Cluster
Configuration management aspects of NSX-T using vLCM
Add host
• NSX Manager will update the TNP (transport node profile)
• vLCM automatically starts the remediation and installs the NSX-T components on the newly added ESXi host
Remove host
• NSX Manager will remove the TNP (transport node profile)
• vLCM will uninstall the NSX-T components
227
Upgrade NSX-T on a vLCM Enabled Cluster
The NSX-T upgrade workflow updates the vLCM image and triggers vLCM remediation
Sequential process: Set Solution → Stage → Remediate
228
EVC for Graphics
vSphere 7 Update 1
229
Graphics-enabled VMs can now consume a consistent set of
features seamlessly across varied hardware.
231
Paravirtual RDMA
vSphere 7 Update 1
232
Virtual machines can now use vRDMA devices
✓ to communicate with other endpoints that are RDMA-enabled, but not virtualized
Paravirtual RDMA endpoints establish a connection between each other and send mappings of the virtual to physical resource numbers
Native endpoints: the underlying hardware communicates using the physical resource numbers
[Diagram: application queue pairs in VM1 (QPx1) are mapped through a control channel on ESX1 to hardware HCA queue pairs (QPx2/QPy2), which carry the traffic to native endpoints.]
Prerequisites
236
Consumption Activities
vSphere 7.0.x
237
Agenda Introduction
238
Introduction
239
Value - Adopt
Delivery Overview
• Provide technical guidance and assistance on applying and customizing VMware solutions for their specific use cases
• Implement prescribed process workflows (manual or automated) and runbooks for the specific use case
• Mentor appropriately trained customer individuals to be responsible for the use case, utilizing the defined workflows to apply and customize the VMware solution
240
Updating Hosts with vSphere
Update Manager
Explanation and Demo
241
vSphere Update Manager
• Updating a vSphere environment is imperative to having a secure and stable environment
• Can update a single host, or a cluster of hosts based on availability or other business requirements
• Demo shows an example of updating a host.
(Video automatically plays on the next slide, you can load directly from here: https://youtu.be/6eYfh18dqBc)
242
243
Intelligent vLCM Updates for vSAN Deployments
vSAN Fault Domain and Stretched Cluster awareness
244
Performing a vSphere HA failover
Explanation and Demo
245
Performing a “Controlled” vSphere HA failover
• vSphere HA allows for downtime to be minimized by restarting failed Virtual Machines on other hosts
• Seeing a failure occur, and knowing the expected behavior during a failure, is imperative to operating the environment successfully
• The demo shows an example of a host failover due to a failure.
(Video automatically plays on the next slide, you can load directly from here: https://youtu.be/oawYgjLgIUk)
246
247
Perform vMotion Operations
Explanation and Demo
248
Performing vMotion Operations
• vSphere vMotion allows live, running VMs to be moved to a different host without any downtime
• Being able to perform a vMotion will allow for maintenance to be performed
• The demo shows an example of how to perform a vMotion
(Video automatically plays on the next slide, you can load directly from here: https://youtu.be/ctWOxZwm8C0)
249
250
Create a virtual machine and
explain VMware Tools
Explanation and Demo
251
Create a virtual machine and explain VMware Tools
• Virtual machines are a core tenet of vSphere.
• Creating and operating VMs can be slightly different than a physical environment
• The demo shows an example of creating a VM and installing VMware Tools
(Video automatically plays on the next slide, you can load directly from here: https://youtu.be/7AGZjcZ7p8I)
252
253
Deploy Tanzu
vSphere 7 Update 1
254
Less Than 1 Hour to Deploy Tanzu
255
Thank You