
Front cover

Microsoft Storage Spaces Direct (S2D) Deployment Guide

Last Update: May 2018

Includes detailed steps for deploying a Microsoft Software-Defined Storage solution based on Windows Server 2016

Updated for the latest Lenovo ThinkSystem rack server and network switch hardware

Describes in detail the hardware used in our labs to build this solution

Provides validation steps along the way to ensure successful deployment

Dave Feisthammel
Mike Miller
David Ye

Click here to check for updates


Abstract

As enterprise demand for storage continues to accelerate, Lenovo® and Microsoft have
teamed up to craft a software-defined storage solution that leverages the advanced feature
set of Windows Server 2016 and the flexibility of the Lenovo ThinkSystem™ SR650 rack
server and ThinkSystem NE2572 RackSwitch™ network switch.

This solution provides a solid foundation for customers looking to consolidate both storage
and compute capabilities on a single hardware platform, or for those enterprises that wish to
have distinct storage and compute environments. In both situations, this solution provides
outstanding performance, high availability protection and effortless scale out growth potential
to accommodate evolving business needs.

This deployment guide provides insight into the setup of this environment and guides the
reader through a set of well-proven procedures that prepare the solution for production use.
This guide is based on Storage Spaces Direct as implemented in Windows Server 2016.

Do you have the latest version? Check whether you have the latest version of this
document by clicking the Check for Updates button on the front page of the PDF.
Pressing this button will take you to a web page that will tell you if you are reading the
latest version of the document and give you a link to the latest if needed. While you’re
there, you can also sign up to get notified via email whenever we make an update.

Contents

Storage Spaces Direct Solution Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3


Solution configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Overview of the installation tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Configure the physical network switches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Prepare the servers and storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Install Windows Server 2016 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Install Windows Server roles and features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Configure the operating system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Configure networking parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Create the Failover Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Enable and configure Storage Spaces Direct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Lenovo Professional Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Change history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36



Storage Spaces Direct Solution Overview
Microsoft Storage Spaces Direct (S2D) has become extremely popular with customers all
over the world since its introduction with the release of Microsoft Windows Server 2016. This
software-defined storage (SDS) technology leverages the concept of collecting a pool of
affordable drives to form a large usable and shareable storage repository.

Lenovo continues to work closely with Microsoft to deliver the latest capabilities in Windows
Server 2016, including S2D. This document focuses on S2D deployment on Lenovo’s latest
generation of rack servers and network switches.

Figure 1 shows an overview of the Storage Spaces Direct stack.

Scale-Out File Server (\\fileserver\share)
Storage Spaces virtual disks
Cluster Shared Volumes (ReFS file system)
Storage pools
Software storage bus
Physical drives (HDD and SSD) in each node

Figure 1 Storage Spaces Direct stack

When discussing high performance and shareable storage pools, many IT professionals think
of expensive SAN infrastructure. Thanks to the evolution of disk and virtualization technology,
as well as ongoing advancements in network throughput, an economical, highly redundant,
high-performance storage subsystem is now within reach.

Key considerations of S2D are as follows:


򐂰 S2D capacity and storage growth
Leveraging the 14x 3.5” drive bays of the Lenovo ThinkSystem SR650 and high-capacity
drives such as the 4-10TB hard disk drives (HDDs) that can be used in this solution, each
server node is itself a JBOD (just a bunch of disks) repository. As demand for storage
and/or compute resources grows, additional SR650 rack servers are added into the
environment to provide the necessary storage expansion.
򐂰 S2D performance
Using a combination of solid-state drives (SSD and NVMe) and regular HDDs as the
building blocks of the storage volume, an effective method for storage tiering is available.
Faster-performing SSD or NVMe devices act as a cache repository to the capacity tier,
which is usually placed on traditional HDDs in this solution. Data is striped across multiple
drives, thus allowing for very fast retrieval from multiple read points.



򐂰 S2D networking
At the physical network layer, 10GbE or 25GbE links are used today. However, additional
bandwidth requirements can be met by using higher-throughput adapters. In most cases,
dual 10/25GbE network paths carrying both Windows Server operating system traffic and
storage replication traffic are more than sufficient to support the workloads and show no
signs of bandwidth saturation.

򐂰 S2D resilience
Traditional disk subsystem protection relies on RAID storage controllers. In S2D, high
availability of the data is achieved using a non-RAID adapter and adopting redundancy
measures provided by Windows Server 2016 itself. The storage can be configured as
simple spaces, mirror spaces, or parity spaces.
– Simple spaces: Stripes data across a set of pool disks, and is not resilient to any disk
failures. Suitable for high performance workloads where resiliency is either not
necessary, or is provided by the application.
– Mirror spaces: Stripes and mirrors data across a set of pool disks, supporting a
two-way or three-way mirror, which are respectively resilient to single disk, or double
disk failures. Suitable for the majority of workloads, in both clustered and non-clustered
deployments.
– Parity spaces: Stripes data across a set of pool disks, with a single disk write block
used to store parity information, and is resilient to a single disk failure. Suitable for
large block append-style workloads, such as archiving, in non-clustered deployments.
򐂰 S2D use cases
The importance of having a SAN in the enterprise space as the high-performance and
high-resilience storage platform is changing. The S2D solution is a direct replacement for
this role. Whether the primary function of the environment is to provide Windows
applications or a Hyper-V virtual machine farm, S2D can be configured as the principal
storage provider to these environments. Another use for S2D is as a repository for backup
or archival of VHD(X) files. Wherever a shared volume is applicable for use, S2D can be
the new solution to support this function.

S2D supports two general deployment scenarios, which have been called disaggregated and
hyperconverged. Microsoft sometimes uses the term “converged” to describe the
disaggregated deployment scenario. Both scenarios provide storage for Hyper-V, specifically
focusing on Hyper-V Infrastructure as a Service (IaaS) for service providers and enterprises.

In the disaggregated approach, the environment is separated into compute and storage
components. An independent pool of servers running Hyper-V acts to provide the CPU and
memory resources (the “compute” component) for the running of VMs that reside on the
storage environment. The “storage” component is built using S2D and Scale-Out File Server
(SOFS) to provide an independently scalable storage repository for the running of VMs and
applications. This method, as illustrated in Figure 2 on page 5, allows for the independent
scaling and expanding of the compute farm (Hyper-V) and the storage farm (S2D).



(The diagram shows the S2D storage stack only: Storage Spaces virtual disks, Cluster Shared
Volumes with ReFS, storage pools, the software storage bus, and the physical drives in each node.)

Figure 2 Disaggregated configuration - nodes do not host VMs

For the hyperconverged approach, there is no separation between the resource pools for
compute and storage. Instead, each server node provides hardware resources to support the
running of VMs under Hyper-V, as well as the allocation of its internal storage to contribute to
the S2D storage repository.

Figure 3 demonstrates this all-in-one configuration for a four-node hyperconverged solution.


When it comes to growth, each node added to the environment increases both compute and
storage resources together. Even if workload metrics indicate that increasing only one
resource (for example, CPU) would be enough to relieve a bottleneck, any scaling adds both
compute and storage resources. This is a fundamental limitation of all hyperconverged
solutions.

(The diagram shows the same storage stack with a layer of Hyper-V virtual machines running
on top of the cluster nodes.)

Figure 3 Hyperconverged configuration - nodes provide shared storage and Hyper-V hosting

Solution configuration
Configuration of the two deployment scenarios is essentially identical. The following
components and information are relevant to the test environment used to develop this guide.
This solution consists of two key components: a high-throughput network infrastructure and a
storage-dense, high-performance server farm.

For details regarding Lenovo systems and components that have been certified for use with
S2D, please see the Certified Configurations for Microsoft Storage Spaces Direct (S2D)
document available at this URL:

https://lenovopress.com/lp0866

That document provides the latest details related to certification of Lenovo systems and
components under the Microsoft Windows Server Software-Defined (WSSD) program.
Deploying WSSD certified configurations for S2D takes the guesswork out of system
configuration. Whether you intend to build a disaggregated or hyperconverged S2D
environment, a WSSD certified configuration provides a rock-solid foundation with minimal
obstacles along the way. These node configurations are certified by Lenovo and validated by
Microsoft for out-of-the-box optimization.

Note: It is strongly recommended to build S2D solutions based on WSSD certified
configurations and components. Deploying WSSD certified configurations ensures the
highest levels of support from both Lenovo and Microsoft.

For more information about the Microsoft WSSD program, see the following URL:

https://docs.microsoft.com/en-us/windows-server/sddc

Network infrastructure
To build the S2D solution described in this document, we used a pair of Lenovo ThinkSystem
NE2572 RackSwitch network switches, which are connected to each node via 25GbE Direct
Attach Copper (DAC) cables.

In addition to the NE2572 network switch, Lenovo offers multiple other switches that are
suitable for building an S2D solution, including:
򐂰 RackSwitch G8272
This is the network switch upon which the previous edition of this document was based. It
is a 1U rack-mount enterprise class Layer 2 and Layer 3 full featured switch that delivers
line-rate, high bandwidth switching, filtering, and traffic queuing. It has 48 SFP+ (10GbE)
ports for server connectivity and 6 QSFP+ (40GbE) ports for data center uplink.
򐂰 ThinkSystem NE1032 RackSwitch
This network switch is a 1U rack-mount 10 GbE switch that delivers lossless, low-latency
performance with a feature-rich design that supports virtualization, Converged Enhanced
Ethernet (CEE), high availability, and enterprise class Layer 2 and Layer 3 functionality. It
has 32 SFP+ ports that support 1 GbE and 10 GbE optical transceivers, active optical
cables (AOCs), and DAC cables.



򐂰 ThinkSystem NE10032 RackSwitch
This network switch is a 1U rack-mount 100GbE switch that uses 100Gb QSFP28 and
40Gb QSFP+ Ethernet technology and is specifically designed for the data center. It is an
enterprise class Layer 2 and Layer 3 full featured switch that delivers line-rate,
high-bandwidth switching, filtering, and traffic queuing without delaying data. It has 32
QSFP+/QSFP28 ports that support 40 GbE and 100 GbE optical transceivers, AOCs, and
DAC cables. These ports can also be split out into four 10 GbE (for 40 GbE ports) or 25
GbE (for 100 GbE ports) connections by using breakout cables

Server farm
To build the S2D solution used to write this document, we used four Lenovo ThinkSystem
SR650 rack servers equipped with multiple storage devices. Supported storage devices
include HDD, SSD, and NVMe media types. A four-node cluster is the minimum configuration
required to tolerate the failure of any two nodes.

Use of RAID controllers: Microsoft does not support any RAID controller attached to the
storage devices used by S2D, regardless of a controller’s ability to support “pass-through”
or JBOD mode. As a result, the ThinkSystem 430-16i SAS/SATA HBAs are used in this
solution. The ThinkSystem M.2 Mirroring Enablement Kit is used only for dual M.2 boot
drives and has nothing to do with S2D.

Lenovo has worked closely with Microsoft for many years to ensure our products perform
smoothly and reliably with Microsoft operating systems and software. Our customers can
leverage the benefits of our partnership with Microsoft by deploying Lenovo certified
configurations for Microsoft S2D, which have been certified under the Microsoft WSSD
program.

Deploying WSSD certified configurations for S2D solutions takes the guesswork out of
system configuration. Whether you intend to build a disaggregated or hyper-converged S2D
environment, you can rest assured that purchasing WSSD certified configurations will provide
a rock solid foundation with minimal obstacles along the way. For details regarding WSSD
certified configurations for S2D, refer to the following Lenovo Press document:

https://lenovopress.com/lp0866.pdf

Rack configuration
Figure 4 shows high-level details of the configuration. The four server/storage nodes and two
switches take up a combined total of 10 rack units of space.

Networking: Two Lenovo ThinkSystem NE2572 RackSwitch network switches, each containing:
򐂰 48 ports at 25Gbps SFP28
򐂰 4 ports at 100Gbps QSFP28

Compute: Four Lenovo ThinkSystem SR650 servers, each containing:
򐂰 Two Intel Gold or Platinum family processors
򐂰 384GB memory (balanced configuration, see Note below)
򐂰 One dual-port 10/25GbE Mellanox ConnectX-4 Lx PCIe adapter with RoCE support

Storage in each SR650 server:
򐂰 Eight 3.5” hot swap HDDs and four SSDs at front
򐂰 Two 3.5” hot swap HDDs at rear
򐂰 ThinkSystem 430-16i SAS/SATA 12Gb HBA
򐂰 M.2 Mirroring Kit with dual 480GB M.2 SSD for OS boot

Figure 4 Solution configuration using ThinkSystem SR650 rack servers

Note: Although other memory configurations are possible, we highly recommend that you
choose a balanced memory configuration. For more information, see the following URL:

https://lenovopress.com/lp0742.pdf

Figure 5 shows the layout of the drives. There are 14x 3.5” drives in the SR650, 12 at the
front of the server and two at the rear of the server. Four are 800 GB SSD devices, while the
remaining ten drives are 4 TB SATA HDDs. These 14 drives form the tiered storage pool of
S2D and are connected to the ThinkSystem 430-16i SAS/SATA 12Gb HBA. In addition to the
storage devices that will be used by S2D, a dual 480GB M.2 SSD, residing inside the server,
is configured as a mirrored (RAID-1) OS boot volume.

Front: 12 x 3.5” drives (4 SSDs and 8 HDDs)
Rear: 2 x 3.5” HDDs

Figure 5 Lenovo ThinkSystem SR650 storage subsystem



Network wiring of this solution is straight-forward, with each server being connected to each
switch to enhance availability. Each system contains a dual-port 10/25GbE Mellanox
ConnectX-4 Lx adapter to handle operating system traffic and storage communications.

(The diagram shows each of the four SR650 nodes connected to both NE2572 switches.)

Figure 6 Switch to node network connectivity

To allow for redundant network links in the event of a network port or external switch failure,
the recommendation calls for the connection from Port 1 on the Mellanox adapter to be joined
to a port on the first NE2572 switch (“Switch 1”), plus a connection from Port 2 on the same
Mellanox adapter to be linked to an available port on the second NE2572 switch (“Switch 2”).
This cabling construct is illustrated in Figure 6 on page 9 and Figure 7 on page 11. Defining
an Inter-Switch Link (ISL) and Virtual Link Aggregation Group (vLAG) ensures failover
capabilities on the switches.

The final network configuration step is to leverage the virtual networking capabilities of
Hyper-V on each host to create a SET-enabled team from both 25GbE ports on the Mellanox
adapter. From this team, a virtual switch (vSwitch) is defined and logical network adapters
(vNICs) are created to carry operating system and storage traffic. Note that for the
disaggregated solution, the SET team, vSwitch, and vNICs do not need to be created, but we
generally create them anyway in case we want to run a VM or two from the storage cluster
occasionally.

Also, for the disaggregated solution, the servers are configured with 192 GB of memory,
rather than 384 GB, and the CPU has 8 cores instead of 14 cores. The higher-end
specifications of the hyperconverged solution are to account for the dual functions of compute
and storage that each server node will take on, whereas in the disaggregated solution, there
is a separation of duties, with one server farm dedicated to S2D and a second devoted to
Hyper-V hosting.

Overview of the installation tasks
This document specifically addresses the deployment of a Storage Spaces Direct
hyperconverged solution. Although nearly all configuration steps presented apply to the
disaggregated solution as well, there are a few differences between these two solutions. We
have included notes regarding steps that do not apply to the disaggregated solution. These
notes are also included as comments in PowerShell scripts.

A number of tasks need to be performed in order to configure this solution. If completed in a
stepwise fashion, this is not a difficult endeavor. The high-level steps described in the
remaining sections of the paper are as follows:
1. “Configure the physical network switches” on page 10
2. “Prepare the servers and storage” on page 14
3. “Install Windows Server 2016” on page 17
4. “Install Windows Server roles and features” on page 18
5. “Configure the operating system” on page 18
6. “Configure networking parameters” on page 19
7. “Create the Failover Cluster” on page 23
8. “Enable and configure Storage Spaces Direct” on page 27

Configure the physical network switches


Windows Server 2016 includes a feature called SMB Direct, which supports the use of
network adapters that have the Remote Direct Memory Access (RDMA) capability. Network
adapters that support RDMA can function at full speed with very low latency, while using very
little CPU. For workloads such as Hyper-V or Microsoft SQL Server, this enables a remote file
server to resemble local storage.

SMB Direct provides the following benefits:


򐂰 Increased throughput: Leverages the full throughput of high speed networks where the
network adapters coordinate the transfer of large amounts of data at line speed.
򐂰 Low latency: Provides extremely fast responses to network requests and, as a result,
makes remote file storage feel as if it is directly attached block storage.
򐂰 Low CPU utilization: Uses fewer CPU cycles when transferring data over the network,
which leaves more power available to server applications, including Hyper-V.

Leveraging the benefits of SMB Direct comes down to a few simple principles. First, using
hardware that supports SMB Direct and RDMA is critical. This solution utilizes a pair of
Lenovo ThinkSystem NE2572 RackSwitch Ethernet switches and a dual-port 10/25GbE
Mellanox ConnectX-4 Lx PCIe adapter for each node.

Redundant physical network connections are a best practice for resiliency as well as
bandwidth aggregation. This is a simple matter of connecting each node to each switch. In
our solution, Port 1 of each Mellanox adapter is connected to the Switch 1 and Port 2 of each
Mellanox adapter is connected to Switch 2, as shown in Figure 7 on page 11.



(The diagram shows Port 1 of each node connected to Switch 1 and Port 2 of each node
connected to Switch 2.)

Figure 7 Switch to node connectivity using 10GbE or 25GbE AOC or DAC cables

As a final bit of network cabling, we configure an ISL between our pair of switches to support
the redundant node-to-switch cabling described above. To do this, we need redundant
high-throughput connectivity between the switches, so we connect Ports 49 and 50 on each
switch to each other using a pair of 100Gbps QSFP28 cables.

In order to leverage the SMB Direct benefits listed above, a set of cascading requirements
must be met. Using RDMA over Converged Ethernet (RoCE) requires a lossless fabric, which
is typically not provided by standard TCP/IP Ethernet network infrastructure, since the TCP
protocol is designed as a “best-effort” transport protocol. Datacenter Bridging (DCB) is a set
of enhancements to IP Ethernet, which is designed to eliminate loss due to queue overflow,
as well as to allocate bandwidth between various traffic types.

To sort out priorities and provide lossless performance for certain traffic types, DCB relies on
Priority Flow Control (PFC). Rather than using the typical Global Pause method of standard
Ethernet, PFC specifies individual pause parameters for eight separate priority classes. Since
the priority class data is contained within the VLAN tag of any given traffic, VLAN tagging is
also a requirement for RoCE and, therefore SMB Direct.

Once the network cabling is done, it's time to begin configuring the switches. These
configuration commands need to be executed on both switches. We start by enabling
Converged Enhanced Ethernet (CEE), which automatically enables Priority-Based Flow
Control (PFC) for all Priority 3 traffic on all ports. Enabling CEE also automatically configures
Enhanced Transmission Selection (ETS) so that at least 50% of the total bandwidth is always
available for our storage (PGID 1) traffic. These automatic default configurations are suitable
for our solution. The commands are listed in Example 1.

Example 1 Enable CEE on the switch


enable
configure
cee enable

After enabling CEE, we configure the VLANs. Although we could use multiple VLANs for
different types of network traffic (storage, client, management, cluster heartbeat, Live
Migration, etc.), the simplest choice is to use a single VLAN (12) to carry all our SMB Direct
solution traffic. Employing 25GbE links makes this a viable scenario. Enabling VLAN tagging
is important in this solution, since RDMA requires it.

Example 2 Establish VLAN for all solution traffic


vlan 12
name SMB
exit

interface ethernet 1/1-4


bridge-port mode trunk
bridge-port trunk allowed vlan 12
mtu 9100
exit

For redundancy, we configure an ISL between a pair of 100GbE ports on each switch. We
use the first two 100GbE ports, 49 and 50, for this purpose. Physically, each port is connected
to the same port on the other switch using a 100Gbps QSFP28 cable. Configuring the ISL is a
simple matter of joining the two ports into a port trunk group. We establish a vLAG across this
ISL, which extends network resiliency all the way to the S2D cluster nodes and their NIC
teams using vLAG Instances. See Example 3.



Example 3 Configure an ISL between switches for resiliency
interface ethernet 1/49-50
bridge-port mode trunk
bridge-port trunk allowed vlan 12
aggregation-group 100 mode active
exit

interface port-aggregation 100


bridge-port mode trunk
bridge-port trunk allowed vlan 12
mtu 9100
exit

vlag isl port-aggregation 100


vlag tier id 100
exit

Establishing a vLAG across the ISL offers the following benefits:


򐂰 Enables single S2D Node to use a Link Aggregation Group (LAG) across two switches
򐂰 Spanning Tree Protocol (STP) blocked interfaces are eliminated
򐂰 Topology loops are also eliminated
򐂰 Enables the use of all available uplink bandwidth
򐂰 Allows fast convergence times in case of link or device failure
򐂰 Allows link-level resilience
򐂰 Enables high availability

To verify the completed vLAG configuration, use the display vlag information command. A
portion of the output of this command is shown in Example 4. Run this command on both
switches and compare the outputs. There should be no differences between the Local and
Peer switches in the “Mis-Match Information” section. Also, in the “Role Information” section,
one switch should indicate that it has the Primary role and its Peer has the Secondary role.
The other switch should indicate the opposite (i.e. it has the Secondary role and its Peer has
the Primary role).

Example 4 Verification of completed vLAG configuration


display vlag information
Global State : enabled
VRRP active : enabled
vLAG system MAC : 08:17:f4:c3:dd:63
ISL Information:
PAG Ifindex State Previous State
-------+-----------+-----------+---------------------------------
100 100100 Active Inactive
Mis-Match Information:
Local Peer
-------------+---------------------------+-----------------------
Match Result : Match Match
Tier ID : 100 100
System Type : NE2572 NE2572
OS Version : 10.6.x.x 10.6.x.x
Role Information:
Local Peer
-------------+---------------------------+-----------------------
Admin Role : Primary Secondary
Oper Role : Primary Secondary
Priority : 0 0
System MAC : a4:8c:db:bb:7f:01 a4:8c:db:bb:88:01

Consistency Checking Information:
State : enabled
Strict Mode : disabled
Final Result : pass

Once the configuration is complete on the switch, we need to copy the running configuration
to the startup configuration. Otherwise, our configuration changes would be lost when the
switch is reset or rebooted. This is achieved using the save command, as shown in Example 5.

Example 5 Use the save command to copy the running configuration to startup
save

Repeat the entire set of commands above (Example 1 on page 12 through Example 5) on the
other switch, defining the same VLAN and port trunk on that switch. Since we are using the
same ports on both switches for identical purposes, the commands that are run on each
switch are identical. Remember to commit the configuration changes on both switches using
the save command.

Note: If the solution uses a switch model or vendor other than the NE2572, it is essential to
apply equivalent command sets on those switches. The commands themselves may differ
from those shown above, but the same functions must be configured to ensure proper
operation of this solution.

Prepare the servers and storage


In this section, we describe updating firmware and drivers, and configuring the RAID
subsystem for the boot drive in the server nodes.

Firmware and drivers


Best practices dictate that with a new server deployment, the first task is to review the system
firmware and drivers relevant to the incoming operating system. If the system has the latest
firmware and drivers installed, it will expedite any tech support calls and may reduce the need
for such calls. Lenovo offers a useful tool for this important task, Lenovo XClarity™
Essentials UpdateXpress, which is available at the following URL:
https://support.lenovo.com/us/en/documents/lnvo-xpress

UpdateXpress can be utilized in two ways:


򐂰 The first option allows the system administrator to download and install the tool on the
target server, perform a verification to identify any firmware and drivers that need
attention, download the update packages from the Lenovo web site, and then proceed
with the updates.
򐂰 The second method lets the server owner download the new packages to a local network
share or repository and then install the updates during a maintenance window.

This flexibility in the tool grants full control to the server owner and ensures that these
important updates are performed at a convenient time.



Physical storage subsystem
We recommend using a dual M.2 boot drive configuration for OS boot, since this allows all
other storage devices to become part of the S2D shared storage pool. Alternatively, you can
use a pair of devices attached to a RAID adapter for OS boot. If doing so, make sure to create
a RAID-1 mirror using the correct two devices. Follow these steps to configure a RAID-1 array
for the operating system on the M.2 devices via the ThinkSystem M.2 Mirroring Enablement
Kit:
1. Power on the server to review the drive subsystem in preparation for the installation of the
operating system.
2. During the system boot process, press the F1 key to enter the UEFI menu and then
navigate to System Settings → Storage.
3. Highlight “Slot 8 M.2 + Mirroring Kit Configuration Utility” as shown in Figure 8 and then
press Enter.

Figure 8 UEFI Storage menu

4. Highlight “Configuration Management” as shown in Figure 9 and then press Enter.

Figure 9 M.2 + Mirroring Kit Configuration Utility menu

5. Highlight “[Create RAID Configuration]” as shown in Figure 10 and then press Enter.

Figure 10 Create RAID Configuration option

6. Highlight “RAID Level” and then press Enter.


7. In the RAID Level overlay, highlight “RAID1” as shown in Figure 11 and then press Enter.

Figure 11 RAID Level selection overlay

8. Back in the Create RAID Configuration screen, highlight “Name” and then press Enter.
9. In the Name overlay, enter a name for the boot volume (such as “Boot”) as shown in
Figure 12 and then press Enter.

Figure 12 Naming the boot volume

10.Back in the Create RAID Configuration screen, highlight “Create” and then press Enter.
11. A blue overlay is displayed as shown in Figure 13 on page 17. Press the “Y” key to create
the virtual disk that will be used for OS boot.



Figure 13 Confirmation overlay for virtual disk creation

12.Press the Esc key multiple times to return to the main UEFI menu screen and then press
the Esc key once more to exit. Make sure to save your changes.

Leave the remaining drives, which are connected to the 430-16i SAS/SATA HBA,
unconfigured. They will be managed directly by the operating system when the time comes to
create the storage pool.

Install Windows Server 2016


ThinkSystem servers, including the SR650, feature an advanced Baseboard Management
Controller (BMC) called the “XClarity Controller” (XCC) to provide remote out-of-band
management, including remote control and remote virtual media. You can install Windows
from a variety of sources:
򐂰 Remote ISO media mount via the XCC
򐂰 Bootable USB media with the installation content
򐂰 Installation DVD

Select the source that is appropriate for your situation. The following steps describe the
installation:
1. With the method of Windows deployment selected, power the server on to begin the
installation process.
2. Select the appropriate language pack, correct input device, and the geography, then
select the desired OS edition (GUI or Core components only).
3. Select the virtual disk connected to the ThinkSystem M.2 Mirroring Enablement Kit as the
target to install Windows.
4. Follow the prompts to complete installation of the OS.

Most of the drivers contained inside Windows Server 2016 are suitable for an S2D node, but
we need to update the Mellanox ConnectX-4 driver. To obtain the latest ConnectX-4 driver at
the time of this writing, visit:

https://datacentersupport.lenovo.com/us/en/downloads/DS501851
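
After installing the updated driver package, you can confirm which driver version Windows is
using. The following check is a sketch only; it assumes the adapters report an
InterfaceDescription containing “ConnectX-4”, and property names may vary slightly by driver
release.

# List the ConnectX-4 adapters and the driver version currently in use
Get-NetAdapter | Where-Object InterfaceDescription -Like "*ConnectX-4*" |
    Format-List Name, InterfaceDescription, DriverVersionString, DriverDate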

Install Windows Server roles and features
Several Windows Server roles and features are used by this solution. It makes sense to
install them all at the same time, then perform specific configuration tasks later. To make this
installation quick and easy, use the following PowerShell script, Example 6 on page 18.

Example 6 PowerShell script to install necessary server roles and features


Install-WindowsFeature -Name File-Services
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
Install-WindowsFeature -Name Hyper-V -IncludeManagementTools -Restart

Note that it is a good idea to install the Hyper-V role on all nodes even if you plan to
implement the disaggregated solution. Although you may not regularly use the storage cluster
to host VMs, if the Hyper-V role is installed, you will have the option to deploy an occasional
VM if the need arises.

Once the roles and features have been installed and the nodes are back online, operating
system configuration can begin.

Configure the operating system


Next, we configure the operating system, including Windows Update, AD Domain join, and
internal drive verification.

To ensure that the latest fixes and patches are applied to the operating system, update the
Windows Server components via Windows Update. It is a good idea to reboot each node after
the final update is applied to ensure that all updates have been fully installed, regardless of
what Windows Update indicates.

Upon completing the Windows Update process, join each server node to the Windows Active
Directory Domain. The following PowerShell command can be used to accomplish this task.

Example 7 PowerShell command to add system to an Active Directory Domain


Add-Computer -DomainName <DomainName> -Reboot

From this point onward, when working with cluster services, be sure to log on to the systems
with a Domain account and not the local Administrator account. Ensure that the Domain
account is a member of the local Administrators security group, as shown in Figure 14.



Figure 14 Group membership of the Administrator account
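
If you prefer to script this step, the group membership can be granted with PowerShell on
each node. This is a minimal sketch; the domain and account name (CONTOSO\S2DAdmin)
are placeholders that must be replaced with your own values.

# Add a domain account to the local Administrators group (run on each node)
Add-LocalGroupMember -Group "Administrators" -Member "CONTOSO\S2DAdmin"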

Verify that the internal drives are online, by going to Server Manager > Tools > Computer
Management > Disk Management. If any are offline, select the drive, right-click it, and click
Online. Alternatively, PowerShell can be used to bring all 14 drives in each host online with a
single command.

Example 8 PowerShell command to bring all 14 drives online


Get-Disk | ? FriendlyName -Like *ATA* | Set-Disk -IsOffline $False

Since all systems have been joined to the domain, we can execute the PowerShell command
remotely on the other hosts while logged in as a Domain Administrator. To do this, use the
command shown in Example 9.

Example 9 PowerShell command to bring drives online in remote systems


Invoke-Command -ComputerName S2D02, S2D03, S2D04 -ScriptBlock {
Get-Disk | ? FriendlyName -Like *ATA* | Set-Disk -IsOffline $False}

Configure networking parameters


Now that the required Windows Server roles and features have been installed, we turn our
attention to some network configuration details.

For the Mellanox NICs used in this solution, we need to enable Data Center Bridging (DCB),
which is required for RDMA. Then we create a policy to establish network Quality of Service
(QoS) to ensure that the Software Defined Storage system has enough bandwidth to
communicate between the nodes, ensuring resiliency and performance. We also need to
disable regular Flow Control (Global Pause) on the Mellanox adapters, since Priority Flow
Control (PFC) and Global Pause cannot operate together on the same interface.

To make all these changes quickly and consistently, we again use a PowerShell script, as
shown in Example 10 on page 20.

Example 10 PowerShell script to configure required network parameters on servers
# Enable Data Center Bridging (required for RDMA)
Install-WindowsFeature -Name Data-Center-Bridging
# Configure a QoS policy for SMB-Direct
New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3
# Turn on Flow Control for SMB
Enable-NetQosFlowControl -Priority 3
# Make sure flow control is off for other traffic
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7
# Apply a Quality of Service (QoS) policy to the target adapters
Enable-NetAdapterQos -Name "Mellanox 1","Mellanox 2"
# Give SMB Direct a minimum bandwidth of 50%
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
# Disable Flow Control on physical adapters
Set-NetAdapterAdvancedProperty -Name "Mellanox 1" -RegistryKeyword "*FlowControl" -RegistryValue 0
Set-NetAdapterAdvancedProperty -Name "Mellanox 2" -RegistryKeyword "*FlowControl" -RegistryValue 0

For an S2D hyperconverged solution, we deploy a SET-enabled Hyper-V switch and add
RDMA-enabled host virtual NICs to it for use by Hyper-V. Since many switches won't pass
traffic class information on untagged VLAN traffic, we need to make sure that the vNICs using
RDMA are on VLANs.

To keep this hyperconverged solution as simple as possible and since we are using dual-port
25GbE NICs, we will pass all traffic on VLAN 12. If you need to segment your network traffic
more, for example to isolate VM Live Migration traffic, you can use additional VLANs.

As a best practice, we affinitize the vNICs to the physical ports on the Mellanox ConnectX-4
network adapter. Without this step, both vNICs could become attached to the same physical
NIC port, which would prevent bandwidth aggregation. It also makes sense to affinitize the
vNICs for troubleshooting purposes, since this makes it clear which port carries which vNIC
traffic on all cluster nodes. Note that setting an affinity will not prevent failover to the other
physical NIC port if the selected port encounters a failure. Affinity will be restored when the
selected port is restored to operation.

Example 11 shows the PowerShell commands that can be used to perform the SET
configuration, enable RDMA, assign VLANs to the vNICs, and affinitize the vNICs to the
physical NIC ports.

Example 11 PowerShell script to create a SET-enabled vSwitch and affinitize vNICs to physical NIC ports
# Create a SET-enabled vSwitch supporting multiple uplinks provided by the Mellanox adapter
New-VMSwitch -Name S2DSwitch -NetAdapterName "Mellanox 1", "Mellanox 2" -EnableEmbeddedTeaming $true -AllowManagementOS $false
# Add host vNICs to the vSwitch just created
Add-VMNetworkAdapter -SwitchName S2DSwitch -Name SMB1 -ManagementOS
Add-VMNetworkAdapter -SwitchName S2DSwitch -Name SMB2 -ManagementOS
# Enable RDMA on the vNICs just created
Enable-NetAdapterRDMA -Name "vEthernet (SMB1)","vEthernet (SMB2)"
# Assign the vNICs to a VLAN
Set-VMNetworkAdapterVlan -VMNetworkAdapterName SMB1 -VlanId 12 -Access -ManagementOS
Set-VMNetworkAdapterVlan -VMNetworkAdapterName SMB2 -VlanId 12 -Access -ManagementOS
# Affinitize vNICs to pNICs for consistency and better fault tolerance
Set-VMNetworkAdapterTeamMapping -VMNetworkAdapterName SMB1 -PhysicalNetAdapterName "Mellanox 1" -ManagementOS
Set-VMNetworkAdapterTeamMapping -VMNetworkAdapterName SMB2 -PhysicalNetAdapterName "Mellanox 2" -ManagementOS



Now that all network interfaces have been created, IP address configuration can be
completed, as follows:
1. Configure a static IP address for the operating system or public facing interface on the
SMB1 vNIC (for example, 10.10.11.x). Configure default gateway and DNS server settings
as appropriate for your environment.
2. Configure a static IP address on the SMB2 vNIC, using a different subnet if desired (for
example, 10.10.12.x). Again, configure default gateway and DNS server settings as
appropriate for your environment.
3. Perform a ping command from each interface to the corresponding server nodes in this
environment to confirm that all connections are functioning properly. Both interfaces on
each node should be able to communicate with both interfaces on all other nodes.

Of course, PowerShell can be used to make IP address assignments if desired. Example 12
shows the commands used to specify a static IP address and DNS server assignment for
Node 1 in our environment. Make sure to change the IP addresses and subnet masks (prefix
length) to appropriate values for your environment.

Example 12 PowerShell commands used to configure the SMB vNIC interfaces on Node 1
Set-NetIPInterface -InterfaceAlias "vEthernet (SMB1)" -Dhcp Disabled
New-NetIPAddress -InterfaceAlias "vEthernet (SMB1)" -IPAddress 10.10.11.11 -PrefixLength 24
Set-DnsClientServerAddress -InterfaceAlias "vEthernet (SMB1)" -ServerAddresses 10.10.11.9
Set-NetIPInterface -InterfaceAlias "vEthernet (SMB2)" -Dhcp Disabled
New-NetIPAddress -InterfaceAlias "vEthernet (SMB2)" -IPAddress 10.10.12.11 -PrefixLength 24
Set-DnsClientServerAddress -InterfaceAlias "vEthernet (SMB2)" -ServerAddresses 10.10.11.9

It's a good idea to disable any network interfaces that won't be used for the solution before
creating the Failover Cluster. This includes the Intel LAN On Motherboard (LOM) NICs. The
only interfaces that will be used in this solution are the SMB1 and SMB2 vNICs.

Figure 15 shows the network connections. The top four connections (in red box) represent
the Intel LOM NICs, which can be disabled. The next two connections (in blue box) represent
the two physical ports on the Mellanox adapter and must remain enabled. Finally, the bottom
two connections (in the green box) are the SMB Direct vNICs that will be used for all solution
network traffic. There may be additional network interfaces listed, which should be disabled
as well.

Figure 15 Windows network connections
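
Disabling the unused interfaces can also be scripted. The sketch below assumes the adapter
names used in this document (“Mellanox 1”, “Mellanox 2”, and the two SMB vNICs); review the
output of Get-NetAdapter and adjust the keep-list for your environment before disabling
anything.

# Keep the Mellanox physical ports and the SMB vNICs; disable everything else
$keep = 'Mellanox 1','Mellanox 2','vEthernet (SMB1)','vEthernet (SMB2)'
Get-NetAdapter | Where-Object { $_.Name -notin $keep } | Disable-NetAdapter -Confirm:$false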

Since RDMA is so critical to the performance of the final solution, it’s a good idea to make
sure each piece of the configuration is correct as we move through the steps. We can’t look
for RDMA traffic yet, but we can verify that the vNICs (in a hyperconverged solution) have
RDMA enabled. Example 13 on page 22 shows the PowerShell command we use for this
purpose and Figure 16 on page 22 shows the output of that command in our environment.

Example 13 PowerShell command to verify that RDMA is enabled on the vNICs just created
Get-NetAdapterRdma | ? Name -Like *SMB* | ft Name, Enabled

Figure 16 PowerShell command verifies that RDMA is enabled on a pair of vNICs

Using Virtual Machine Queue


For the 25GbE Mellanox adapters in our solution, the operating system automatically enables
dynamic VMQ and RSS, which improve network performance and throughput to the VMs.
VMQ is a scaling network technology for Hyper-V switch that improves network throughput by
distributing processing of network traffic for multiple VMs among multiple processors. When
VMQ is enabled, a dedicated queue is established on the physical NIC for each vNIC that has
requested a queue. As packets arrive for a vNIC, the physical NIC places them in that vNIC's
queue. These queues are managed by the system's processors.

Although not strictly necessary, it is a best practice to assign base and maximum processors
for VMQ queues on each server in order to ensure maximum efficiency of queue
management. Although the concept is straightforward, there are a few things to keep in mind
when determining proper processor assignment. First, only physical processors are used to
manage VMQ queues. Therefore, if Hyper-Threading (HT) Technology is enabled, only the
even-numbered processors are considered viable. Next, since processor 0 is assigned to
many internal tasks, it is best not to assign queues to this particular processor.

Before configuring VMQ queue management, execute a couple of PowerShell commands to
gather information. We need to know if HT is enabled and how many processors are
available. You can issue a WMI query for this, comparing the “NumberOfCores” field to the
“NumberOfLogicalProcessors” field. As an alternative, issue the Get-NetAdapterRSS
command to see a list of viable processors (remember not to use Processor 0:0/0) as shown
in Example 14.

Example 14 PowerShell commands used to determine processors available for VMQ queues
# Check for Hyper-Threading (if there are twice as many logical procs as number of cores, HT is enabled)
Get-WmiObject -Class win32_processor | ft -Property NumberOfCores, NumberOfLogicalProcessors -AutoSize
# Check procs available for queues (check the RssProcessorArray field)
Get-NetAdapterRSS

Once you have this information, it's a simple math problem. We have a pair of 14-core CPUs
in each host, providing 28 processors total, or 56 logical processors, including
Hyper-Threading. Excluding processor 0 and eliminating all odd-numbered processors leaves
us with 27 processors to assign. Given the dual-port Mellanox adapter, this means we can
assign 13 processors to one port and 14 processors to the other. This results in the following
processor assignment:
Mellanox 1: procs 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28



Mellanox 2: procs 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54

Use the following PowerShell script to define the base (starting) processor as well as how
many processors to use for managing VMQ queues on each physical NIC consumed by the
vSwitch (in our solution, the two Mellanox ports.)

Example 15 PowerShell script to assign processors for VMQ queue management


# Configure the base and maximum processors to use for VMQ queues
Set-NetAdapterVmq -Name "Mellanox 1" -BaseProcessorNumber 2 -MaxProcessors 14
Set-NetAdapterVmq -Name "Mellanox 2" -BaseProcessorNumber 30 -MaxProcessors 13
# Check VMQ queues
Get-NetAdapterVmq

Now that we’ve got the networking internals configured for one system, we use PowerShell
remote execution to replicate this configuration to the other three hosts. Example 16 shows
the PowerShell commands, this time without comments. These commands are for configuring
a hyperconverged solution using Mellanox NICs.

Example 16 PowerShell remote execution script to configure networking on remaining hosts


Invoke-Command -ComputerName S2D02, S2D03, S2D04 -ScriptBlock {
Install-WindowsFeature -Name Data-Center-Bridging
New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3
Enable-NetQosFlowControl -Priority 3
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7
Enable-NetAdapterQos -Name "Mellanox 1","Mellanox 2"
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
Set-NetAdapterAdvancedProperty -Name "Mellanox 1" -RegistryKeyword "*FlowControl" -RegistryValue 0
Set-NetAdapterAdvancedProperty -Name "Mellanox 2" -RegistryKeyword "*FlowControl" -RegistryValue 0
New-VMSwitch -Name S2DSwitch -NetAdapterName "Mellanox 1", "Mellanox 2" -EnableEmbeddedTeaming $true -AllowManagementOS $false
Add-VMNetworkAdapter -SwitchName S2DSwitch -Name SMB1 -ManagementOS
Add-VMNetworkAdapter -SwitchName S2DSwitch -Name SMB2 -ManagementOS
Enable-NetAdapterRDMA -Name "vEthernet (SMB1)","vEthernet (SMB2)"
Set-VMNetworkAdapterVlan -VMNetworkAdapterName SMB1 -VlanId 12 -Access -ManagementOS
Set-VMNetworkAdapterVlan -VMNetworkAdapterName SMB2 -VlanId 12 -Access -ManagementOS
Set-NetAdapterVmq -Name "Mellanox 1" -BaseProcessorNumber 2 -MaxProcessors 14
Set-NetAdapterVmq -Name "Mellanox 2" -BaseProcessorNumber 30 -MaxProcessors 13}

The final piece of preparing the infrastructure for S2D is to create the Failover Cluster.

Create the Failover Cluster


Before creating the Failover Cluster we need to validate the components that are necessary
to form the cluster. As an alternative to using the GUI, the following PowerShell commands
can be used to test and create the Failover Cluster, Example 17.

Example 17 PowerShell commands to test and create a failover cluster


Test-Cluster -Node S2D01,S2D02,S2D03,S2D04 -Include "Storage Spaces Direct",Inventory,Network,"System Configuration"
New-Cluster -Name S2DCluster -Node S2D01,S2D02,S2D03,S2D04 -NoStorage

Once the cluster is built, you can also use PowerShell to query the health status of the cluster
storage.

Example 18 PowerShell command to check the status of cluster storage
Get-StorageSubSystem S2DCluster

The default behavior of Failover Cluster creation is to set aside the non-public facing subnet
(configured on the SMB2 vNIC) as a cluster heartbeat network. When 1GbE was the
standard, this made perfect sense. However, since we are using 25GbE in this solution, we
don’t want to dedicate half our bandwidth to this important, but mundane task. We use
Failover Cluster Manager to resolve this issue as follows:
1. In Failover Cluster Manager navigate to Failover Cluster Manager → Clustername →
Networks in the left navigation panel, as shown in Figure 17.

Figure 17 Networks available for the cluster

2. Note the Cluster Use setting for each network. If this setting is Cluster Only, right-click on
the network entry and select Properties.
3. In the Properties window that opens ensure that the Allow cluster network
communication on this network radio button is selected. Also, select the Allow clients
to connect through this network checkbox, as shown in Figure 18 on page 24.
Optionally, change the network Name to one that makes sense for your installation and
click OK.

Figure 18 SMB2 network set to allow cluster and client traffic



After making this change, both networks should show “Cluster and Client” in the Cluster Use
column, as shown in Figure 19.

It is generally a good idea to use the cluster network Properties window to specify cluster
network names that make sense and will aid in troubleshooting later. To be consistent, we
name our cluster networks after the vNICs that carry the traffic for each, as shown in
Figure 19.

Figure 19 Cluster networks shown with names to match the vNICs that carry their traffic

It is also possible to accomplish the cluster network role and name changes using
PowerShell. Example 19 provides a script to do this.

Example 19 PowerShell script to change names and roles of cluster networks


# Update the cluster networks that were created by default
# First, look at what's there
Get-ClusterNetwork | ft Name, Role, Address
# Change the cluster network names so they're consistent with the individual nodes
(Get-ClusterNetwork -Name "Cluster Network 1").Name = "SMB1"
(Get-ClusterNetwork -Name "Cluster Network 2").Name = "SMB2"
# Enable Client traffic on the second cluster network
(Get-ClusterNetwork -Name "SMB2").Role = 3
# Check to make sure the cluster network names and roles are set properly
Get-ClusterNetwork | ft Name, Role, Address

Figure 20 shows output of the PowerShell commands to display the initial cluster network
parameters, modify the cluster network names, enable client traffic on the second cluster
network, and check to make sure cluster network names and roles are set properly.

Figure 20 PowerShell output showing cluster network renaming and results

You can also verify the cluster network changes by viewing them in Failover Cluster Manager
by navigating to Failover Cluster Manager → Clustername → Networks in the left
navigation panel.

Cluster file share witness


It is recommended to create a cluster file share witness. The cluster file share witness
quorum configuration enables the 4-node cluster to withstand up to two node failures.

For information on how to create a cluster file share witness, read the Microsoft article,
Configuring a File Share Witness on a Scale-Out File Server, available at:
https://blogs.msdn.microsoft.com/clustering/2014/03/31/configuring-a-file-share-witness-on-a-scale-out-file-server/
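
As an alternative to the GUI, the witness can be assigned with PowerShell once the share has
been created and permissioned. The share path below (\\fileserver\S2DWitness) is only an
example; substitute the path to your own witness share.

# Configure the cluster quorum to use a file share witness
Set-ClusterQuorum -Cluster S2DCluster -FileShareWitness \\fileserver\S2DWitness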

Note: Make sure the file share for the cluster file share witness has the proper permissions
for the cluster name object as in the example shown in Figure 21.

Figure 21 Security tab of the Permissions screen

Once the cluster is operational and the file share witness has been established, it is time to
enable and configure the Storage Spaces Direct feature.



Enable and configure Storage Spaces Direct
Once the failover cluster has been created, run the PowerShell command in Example 20 to
enable S2D on the cluster.

Example 20 PowerShell command to enable Storage Spaces Direct


Enable-ClusterStorageSpacesDirect -CimSession S2DCluster -PoolFriendlyName S2DPool

This PowerShell command will do the following automatically:


1. Create a single storage pool that has a name as specified by the -PoolFriendlyName
parameter
2. Configure S2D cache tier using the highest performance storage devices available, such
as NVMe or SSD
3. Create two storage tiers, one called “Capacity” and the other called “Performance.”

Take a moment to run a few PowerShell commands at this point to verify that all is as
expected. First, run the command shown in Example 21. The results should be similar to
those in our environment, shown in Figure 22 on page 27.

Example 21 PowerShell command to check S2D storage tiers


Get-StorageTier | ft FriendlyName, ResiliencySettingName

Figure 22 PowerShell query showing resiliency settings for storage tiers
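
As an additional check, the following sketch (using the pool name from Example 20) lists the
physical disks that S2D claimed for the pool, grouped by media type and usage; the devices
selected for the cache typically show a Usage of Journal.

# List the physical disks in the S2D pool, grouped by media type and usage
Get-StoragePool -FriendlyName S2DPool | Get-PhysicalDisk |
    Group-Object MediaType, Usage -NoElement | ft Count, Name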

At this point we can also check to make sure RDMA is working. We provide two suggested
approaches for this. First, Figure 23 shows a simple netstat command that can be used to
verify that listeners are in place on port 445 (in the yellow boxes). This is the port typically
used for SMB and the port specified when we created the network QoS policy for SMB in
Example 10 on page 20.

Figure 23 The netstat command can be used to confirm listeners configured for port 445
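
The command behind this check is along the following lines (a sketch; the output filtering
shown is our own). The -x option limits netstat output to NetworkDirect (RDMA) connections,
listeners, and shared endpoints.

# Show NetworkDirect (RDMA) listeners, filtered to the SMB port
netstat -xan | Select-String "445"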

The second method for verifying that RDMA is configured and working properly is to use
PerfMon to create an RDMA monitor. To do this, follow these steps:
1. At the PowerShell or Command prompt, type perfmon and press Enter.
2. In the Performance Monitor window that opens, select Performance Monitor in the left
pane and click the green plus sign (“+”) at the top of the right pane.

Figure 24 Initial Performance Monitor window before configuration

3. In the Add Counters window that opens, select RDMA Activity in the upper left pane. In
the Instances of selected object area in the lower left, choose the instances that represent
your vNICs (for our environment, these are “Hyper-V Virtual Ethernet Adapter” and
“Hyper-V Virtual Ethernet Adapter #2”). Once the instances are selected, click the Add
button to move them to the Added counters pane on the right. Click OK.

Figure 25 The Add counters window for Performance Monitor

4. Back in the Performance Monitor window, click the drop-down icon to the left of the green
plus sign and choose Report.



Figure 26 Choose the “Report” format

5. This should show a report of RDMA activity for your vNICs. Here you can view key
performance metrics for RDMA connections in your environment, as shown in Figure 27
on page 29.

Figure 27 Key RDMA performance metrics
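
If you prefer a scriptable check instead of the PerfMon GUI, a sketch along these lines queries
the same information from PowerShell. Counter instance names and adapter names may differ
slightly in your environment.

# Confirm RDMA is enabled on the vNICs and physical NICs
Get-NetAdapterRdma | ft Name, Enabled

# Confirm the SMB client sees RDMA-capable interfaces
Get-SmbClientNetworkInterface | ft FriendlyName, RdmaCapable

# Sample the RDMA Activity performance counters for all instances
Get-Counter "\RDMA Activity(*)\RDMA Inbound Bytes/sec","\RDMA Activity(*)\RDMA Outbound Bytes/sec"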

Create virtual disks


After the S2D cluster is created, create virtual disks or volumes based on your performance
requirements. There are three common volume types for general deployments:
• Mirror
• Parity
• Multi-Resilient

Table 1 shows the volume types supported by Storage Spaces Direct and several
characteristics of each.

Table 1 Summary of characteristics associated with common storage volume types

                      Mirror            Parity             Multi-resilient
Optimized for         Performance       Efficiency         Archival
Use case              All data is hot   All data is cold   Mix of hot and cold data
Storage efficiency    Least (33%)       Most (50+%)        Medium (~50%)
File system           ReFS or NTFS      ReFS or NTFS       ReFS only
Minimum nodes         3                 4                  4

Use the PowerShell commands in Example 22 on page 30 through Example 24 on page 30 to create
and configure the virtual disks. Choose any or all of the volume types shown, adjusting the
volume names and sizes to suit your needs. This solution yields a total pool size of about
146TB to be consumed by the volumes you create. However, the amount of pool space consumed
by each volume depends on which storage tier is used. For example, the commands below create
three volumes that consume a total of 88TB from the pool.

Create a mirror volume using the command in Example 22 on page 30.

Example 22 PowerShell command to create a new mirror volume


New-Volume -StoragePoolFriendlyName S2DPool -FriendlyName "Mirror" -FileSystem
CSVFS_ReFS -StorageTierFriendlyNames Performance -StorageTierSizes 6TB

Create a parity volume using the command in Example 23.

Example 23 PowerShell command to create a new parity volume


New-Volume -StoragePoolFriendlyName S2DPool -FriendlyName "Parity" -FileSystem
CSVFS_ReFS -StorageTierFriendlyNames Capacity -StorageTierSizes 24TB

Create a multi-resilient volume using the command in Example 24.

Example 24 PowerShell command to create a new multi-resilient volume


New-Volume -StoragePoolFriendlyName S2DPool -FriendlyName "Resilient" -FileSystem
CSVFS_ReFS -StorageTierFriendlyNames Performance, Capacity -StorageTierSizes 2TB,
8TB
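
After the volumes are created, a quick query such as the following sketch shows how much raw
pool capacity each volume actually consumes; the FootprintOnPool property includes the
resiliency overhead discussed above.

# Show the raw pool space consumed by each virtual disk
Get-VirtualDisk | ft FriendlyName, Size, FootprintOnPool, ResiliencySettingName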

Once S2D installation is complete and volumes have been created, the final step is to verify
that there is fault tolerance in this storage environment.

To query the storage pool, use the command in Example 25. The command verifies the fault
tolerance of the S2D storage pool, and Figure 28 shows the output of that command in our
environment.

Example 25 PowerShell command to determine S2D storage pool fault tolerance


Get-StoragePool -FriendlyName S2DPool | FL FriendlyName, Size,
FaultDomainAwarenessDefault



Figure 28 PowerShell query showing the fault domain awareness of the storage pool

To query the virtual disks, use the command in Example 26. The command verifies the fault
tolerance of a virtual disk (volume) in S2D, and Figure 29 shows the output of that command
in our environment.

Example 26 PowerShell command to determine S2D virtual disk (volume) fault tolerance
Get-VirtualDisk -FriendlyName <VirtualDiskName> | FL FriendlyName, Size,
FaultDomainAwareness

Figure 29 PowerShell query showing the fault domain awareness of the virtual disk

Over time, the storage pool may become unbalanced as physical disks or storage nodes are
added or removed, or as data is written to or deleted from the pool. In this case, use the
PowerShell command shown in Example 27 to improve storage efficiency and performance.

Example 27 PowerShell command to optimize the S2D storage pool


Optimize-StoragePool -FriendlyName S2DPool
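
The optimize operation runs as a background storage job; its progress can be checked with a
command like this sketch:

# Check progress of the rebalance (and any other) storage jobs
Get-StorageJob | ft Name, JobState, PercentComplete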

Summary
Windows Server 2016 introduced Storage Spaces Direct, which enables building highly
available and scalable storage systems with local storage. This is a significant step forward in
Microsoft Windows Server software-defined storage (SDS), since it simplifies the deployment
and management of SDS systems and also unlocks the use of new classes of disk devices, such
as SATA and NVMe drives, that were previously not possible with clustered Storage Spaces
and shared disks.

With Windows Server 2016 Storage Spaces Direct, you can now build highly available
storage systems using Lenovo ThinkSystem rack servers with only local storage. This not only
eliminates the need for a shared SAS fabric and its complexities, but also enables the use of
devices such as SATA SSDs, which can help further reduce cost, or NVMe SSDs, which can
improve performance.

This document has provided an organized, stepwise process for deploying a Storage Spaces
Direct solution based on Lenovo ThinkSystem servers and Ethernet switches. Once
configured, this solution provides a versatile foundation for many different types of workloads.

Lenovo Professional Services
Lenovo offers an extensive range of solutions, from simple OS-only deployments to
much more complex solutions running cluster and cloud technologies. For customers looking
for assistance with design, deployment, or migration, Lenovo Professional Services is your
go-to partner.

Our worldwide team of IT Specialists and IT Architects can help customers scope and size
the right solutions to meet their requirements, and then accelerate the implementation of the
solution with our on-site and remote services. For customers also looking to elevate their own
skill sets, our Technology Trainers can craft services that encompass solution deployment
plus skills transfer, all in a single affordable package.

To inquire about our extensive service offerings and solicit information on how we can assist
in your new Storage Spaces Direct implementation, please contact us at
x86svcs@lenovo.com.

For more information about our service portfolio, please see our website:
http://shop.lenovo.com/us/en/systems/services/?menu-id=services

Change history
Changes in the 14 May 2018 update:
• Updated to include the latest Lenovo ThinkSystem rack servers
• Updated to include the latest Lenovo ThinkSystem RackSwitch products
• Switch configuration commands updated for CNOS
• Added vLAG to ISL between switches
• Added switch configuration commands to support Jumbo Frames
• Added affinitization of virtual NICs to physical NICs

Changes in the 9 January 2017 update:
• Added detail regarding solution configuration if using Chelsio NICs
• Added PowerShell commands for IP address assignment
• Moved network interface disablement section to make more logical sense
• Updated Figure 2 on page 5 and Figure 3 on page 5
• Fixed reference to Intel v3 processors in Figure 4 on page 8
• Updated cluster network rename section and figure
• Removed Bill of Materials for disaggregated solution

Changes in the 16 September 2016 update:
• Updated process based on Windows Server 2016 RTM
• Added background detail around Microsoft S2D
• Added driver details for Mellanox ConnectX-4 Lx adapter
• Added notes specific to hyperconverged vs. disaggregated deployment
• Removed GUI-based Failover Cluster configuration steps (use PowerShell!)
• Added step to ensure both cluster networks are available for SMB traffic to clients
• Fixed issues with a couple of graphics
• Updated both BOMs: the servers now use Intel Xeon E5 2600 v4 processors

Changes in the 14 July 2016 update:
• Configuration process reordered for efficiency
• Added steps to configure VMQ queues
• Updated and added graphics
• Added various PowerShell cmdlets to aid in configuration
• Fixed various typos

Changes in the 3 June 2016 update:
• Updated to list setup instructions using Windows Server 2016 TP5
• Added DCB settings for each host
• Updated the Bills of Materials

Authors
This paper was produced by the following team of specialists:

Dave Feisthammel is a Senior Solutions Architect working at the Lenovo Center for
Microsoft Technologies in Kirkland, Washington. He has over 25 years of experience in the IT
field, including four years as an IBM client and 14 years working for IBM. His areas of
expertise include Windows Server and systems management, as well as virtualization,
storage, and cloud technologies. He is currently a key contributor to Lenovo solutions related
to Microsoft Azure Stack and Storage Spaces Direct.

Mike Miller is a Windows Engineer with the Lenovo Server Lab in Kirkland, Washington. He
has over 35 years in the IT industry, primarily in client/server support and development roles.
The last 13 years have been focused on Windows Server operating systems and server-level
hardware, particularly on operating system/hardware compatibility, advanced Windows
features, and Windows test functions.

David Ye is a Senior Solutions Architect and has been working at the Lenovo Center for
Microsoft Technologies for 17 years. He started his career at IBM as a Worldwide Windows
Level 3 Support Engineer. In this role, he helped customers solve complex problems and was
involved in many critical customer support cases. He is now a Senior Solutions Architect in
the Lenovo Data Center Group, where he works with customers on Proof of Concept designs,
solution sizing, performance optimization, and solution reviews. His areas of expertise are
Windows Server, SAN Storage, Virtualization and Cloud, and Microsoft Exchange Server. He
is currently leading the effort in Microsoft Storage Spaces Direct and Azure Stack solutions
development.

Thanks to the following Lenovo colleagues for their contributions to this project:
• Val Danciu, Lead Engineer - Microsoft Systems Management Integration
• Wayne (“Guy”) Fusman, Engineer - Microsoft OS Technology and Enablement
• Daniel Ghidali, Manager - Microsoft Technology and Enablement
• Vinay Kulkarni, Lead Architect - Microsoft Solutions and Enablement
• Turner Pham, Engineer - Microsoft OS Technology and Enablement
• Vy Phan, Technical Program Manager - Microsoft OS and Solutions
• David Tanaka, Advisory Software Engineer - Microsoft OS Technology and Enablement
• David Watts, Senior IT Consultant - Lenovo Press

At Lenovo Press, we bring together experts to produce technical publications around topics of
importance to you, providing information and best practices for using Lenovo products and
solutions to solve IT challenges.

See a list of our most recent publications at the Lenovo Press web site:
http://lenovopress.com

Notices
Lenovo may not offer the products, services, or features discussed in this document in all countries. Consult
your local Lenovo representative for information on the products and services currently available in your area.
Any reference to a Lenovo product, program, or service is not intended to state or imply that only that Lenovo
product, program, or service may be used. Any functionally equivalent product, program, or service that does
not infringe any Lenovo intellectual property right may be used instead. However, it is the user's responsibility
to evaluate and verify the operation of any other product, program, or service.

Lenovo may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:

Lenovo (United States), Inc.


1009 Think Place - Building One
Morrisville, NC 27560
U.S.A.
Attention: Lenovo Director of Licensing

LENOVO PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some
jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this
statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. Lenovo may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.

The products described in this document are not intended for use in implantation or other life support
applications where malfunction may result in injury or death to persons. The information contained in this
document does not affect or change Lenovo product specifications or warranties. Nothing in this document
shall operate as an express or implied license or indemnity under the intellectual property rights of Lenovo or
third parties. All information contained in this document was obtained in specific environments and is
presented as an illustration. The result obtained in other operating environments may vary.

Lenovo may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.

Any references in this publication to non-Lenovo Web sites are provided for convenience only and do not in
any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this Lenovo product, and use of those Web sites is at your own risk.

Any performance data contained herein was determined in a controlled environment. Therefore, the result
obtained in other operating environments may vary significantly. Some measurements may have been made
on development-level systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been estimated through
extrapolation. Actual results may vary. Users of this document should verify the applicable data for their
specific environment.

© Copyright Lenovo 2018. All rights reserved.


Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by General Services
Administration (GSA) ADP Schedule Contract
This document LP0064 was created or updated on May 14, 2018.

Send us your comments via the Rate & Provide Feedback form found at
http://lenovopress.com/lp0064

Trademarks
Lenovo, the Lenovo logo, and For Those Who Do are trademarks or registered trademarks of Lenovo in the
United States, other countries, or both. These and other Lenovo trademarked terms are marked on their first
occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law
trademarks owned by Lenovo at the time this information was published. Such trademarks may also be
registered or common law trademarks in other countries. A current list of Lenovo trademarks is available on
the Web at http://www.lenovo.com/legal/copytrade.html.

The following terms are trademarks of Lenovo in the United States, other countries, or both:
Lenovo®, Lenovo (logo)®, Lenovo XClarity™, RackSwitch™, ThinkSystem™, vNIC™

The following terms are trademarks of other companies:

Intel, Xeon, and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries
in the United States and other countries.

Active Directory, Azure, Hyper-V, Microsoft, PowerShell, SQL Server, Windows, Windows Server, and the
Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.

