Professional Documents
Culture Documents
Foreword
• Data has become an enterprise’s most important asset. How is data stored on the
cloud? How is this different from the way it is stored with traditional IT? This chapter
will answer these questions.
Course Objectives
• Upon completion of this course, you will:
• Understand Huawei’s FusionSphere virtualization solution.
• Understand the storage architecture used for virtualized environments.
• Understand different types of disks.
• Understand the differences between centralized and distributed storage.
• Understand the differences between virtualized and non-virtualized storage.
• Understand different types of VM disks.
• Understand the storage features of Huawei’s virtualization product.
Course Contents
• Storage Architecture for Virtualization
• Physical Disk Types and Related Techniques
• Centralized Storage vs. Distributed Storage
• Virtualized Storage vs. Non-Virtualized Storage
• Introduction to VM Disks
• Storage Features of Huawei’s Virtualization Product
Centralized Storage
• Centralized storage means that all storage resources are centrally deployed and are provisioned over
a unified interface. With centralized storage, all physical disks are centrally deployed in disk
enclosures and are used to provide storage services externally through the controller. Centralized
storage typically refers to disk arrays.
• Based on the technical architecture, centralized storage can be categorized as SAN and NAS. SAN
can be further categorized as FC SAN, IP SAN, and FCoE SAN. Currently, FC SAN and IP SAN
technologies are mature, and FCoE SAN still has a long way to go to reach maturity.
• A disk array combines multiple physical disks into a single logical unit. Each disk array consists of
one controller enclosure and multiple disk enclosures. This architecture delivers an intelligent
storage space featuring high availability, high performance, and large capacity.
Centralized Storage
Storage
Pool
RAID
RAID Technology
RAID
Efficient Safe
Parallel read and write on Parity check and hot backup
1
multiple disks
2
Common RAID Types
Logical
Disks D0, D1, D2, D3, D0, D1, D2 Logical
D4, D5 Disks
D5
D4
D4 D5 D2 D2 D2
D3 D1 D1 D1
D2 D2
D0
D3
D1 D0 D0 D0
D1
D0
Disk 1 Disk 2 Disk 1 Disk 2
RAID 0 RAID 1
RAID 5 RAID 6
Logical P3
Disks
D7
D6
D0, D1, D2, D3, D4, Q3
D0, D1, D2, D3, D5
D5, D6, D7 D4
D5
D4
D4, D5 Q2
P2
D3
P2
D3 Q3 D6
D7 P3
P2 D4
D5 P2 Q 2 D4
D5 Q1
P1 Q1 P1
D2 D2 P1 D3 D2 P1 D3 D20
P0 D0 D1
P0 D0
D1 P0
Q0 Q
P0
D1
D0
Disk 1 Disk 2 Disk 3 Disk 1 Disk 2 Disk 3 Disk 4 D1
D0
Centralized Storage Types
SAN NAS
Windows Unix-like
client client
IP/FC network
CIFS NFS
LAN LAN
Server Server Server Server
iSCSI FC
TCP/IP network Ethernet switch FC network FC switch
• The inception of NAS is closely related to the development of network. After the inception of the ARPANET,
modern network technologies develop rapidly, and users have increasing demand for sharing data over
network. However, sharing files on the network is confronted with many issues such as cross-platform access
and data security.
NAS System
• The advantage of NAS is that it can deliver file storage services in a fast and cost-effective manner using
existing resources in the data center. The current solution is compatible between UNIX, Linux, and Windows
OSs, and can be easily connected to users' TCP/IP networks.
NAS Architecture
File System
• CIFS is a public and open file system developed by Microsoft Server Message Block (SMB). SMB is a file
sharing protocol set by Microsoft based on NetBIOS. Users can access data on a remote computer through
CIFS. In addition, CIFS prevents read-write conflicts and write-write conflicts to support multi-user access.
• Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems in
1984, allowing a user on a client computer to access files over a computer network much like local storage is
accessed. It is designed for use among different operating systems, therefore, its communication protocol is
independent from hosts and operating systems. When users want to use remote files, they only need to use the
mount command to mount the remote file system under their own local file systems. There is no difference
between using remote files and local files.
Comparison CFIS and NFS
Item CIFS NFS
Replication
Advantages of Distributed Storage
• Excellent performance
The distributed storage system uses an innovative architecture to organize SATA/SAS HDDs into a storage pool
like SAN delivering a higher I/O than SAN devices and optimal performance. In a distributed storage system,
SSDs can replace HDDs as high-speed storage devices, and InfiniBand networks can replace GE/10GE networks
to deliver higher bandwidth, fulfilling the high performance requirements for real-time processing of large
volumes of data.
• Global load balancing
The implementation mechanism of the distributed storage system ensures that I/O operations of upper-layer
applications are evenly distributed on different hard drives of different servers, preventing partial hot spots and
implementing global load balancing. The system automatically disperses data blocks on the hard disks of
various servers. Data that is frequently or seldom accessed is evenly distributed on the servers, avoiding hot
spots. FusionStorage employs the data fragment distribution algorithm to ensure that active and standby copies
are evenly distributed to different hard disks of the servers. In this way, each hard disk contains the same number
of active and standby copies. When the system capacity is expanded or reduced due to a node failure, the data
reconstruction algorithm helps ensure load balancing among all nodes after system reconstruction.
Advantages of Distributed Storage
Distributed SSD storage
The distributed storage system uses the SSD storage system that supports high-performance applications to
deliver higher read/write performance than the traditional HDDs (SATA/SAS hard drives). PCIe SSDs deliver
higher bandwidth and I/O. The PCIe 2.0 x8 interface provides a read/write bandwidth of up to 3.0 GB. I/O
performance of SSDs delivers 4 KB random data transfer, realizing up to 600,000 continuous random read IOPS
and 220,000 continuous random write IOPS. Although SSDs deliver high read and write speeds, SSDs have a
shorter write lifespan. When SSDs are used, the distributed SSD storage system uses multiple mechanisms and
measures to improve reliability.
High-performance snapshot
The distributed storage system delivers the snapshot mechanism, which allows the system to capture the status
of the data written into a logical volume at a particular point in time. The data snapshot can then be exported and
used for restoring the volume data when required. The snapshot data of the distributed storage system is based
on the distributed hash table (DHT) mechanism. Therefore, the snapshots do not cause performance
deterioration of the original volumes. The DHT technology delivers high query efficiency. For example, to build
indexes for a 2 TB hard disk in the memory, tens of MB of memory space is required. Only one hash query
operation can determine whether a snapshot has been created for the disk. If a snapshot has been created, the
hash query can also determine the storage location of the snapshot.
Advantages of Distributed Storage
High-performance linked cloning
The distributed storage system delivers the linked cloning mechanism for incremental snapshots so
that multiple cloned volumes can be created for a snapshot. The data in the cloned volumes is the
same as that in the snapshot. Subsequent modifications to a cloned volume do not affect the snapshot
or other cloned volumes. The distributed storage system supports batch deployment of VM volumes.
Hundreds of VM volumes can be created in seconds. A cloned volume can be used for creating a
snapshot, restoring data from the snapshot, and cloning the volume as the base volume again.
High-speed InfiniBand (IB) network
To eliminate storage switching bottlenecks in a distributed storage environment, a distributed storage
system can be deployed on an IB network designed for high-bandwidth applications
Replication
Data write Data read
Distributed Distributed
Storage Pool Storage Pool
Popular Distributed Storage Products
Course Contents
• Storage Architecture for Virtualization
• Physical Disk Types and Related Techniques
• Centralized Storage vs. Distributed Storage
• Virtualized Storage vs. Non-Virtualized Storage
• Introduction to VM Disks
• Storage Features of Huawei’s Virtualization Product
Virtualized Storage Convension Path in Cloud
Computing
Shared Catalog
Virtual file
system
Logical
volume
Attach to compute cluster
and format
Physical
NFS file
volume
Format
Logical division system
RAID or Replication
Non-Virtualized Storage Convension Path in
Cloud Computing
Logical
volume Attach to compute cluster
Physical
volume Logical division
RAID or Replication
Relationship Between RAIDs and LUNs
• A RAID array can be seen as a large physical volume made up of a number of hard
disks.
• On top of a physical volume, you can create one or more logical units of specified
capacities. These logical units are called Logical Unit Numbers (LUNs) and can be
mapped to hosts as basic block devices
Segmentation
Physical volume
RAID
Hard disk
Pool & Volume & LUN
• A storage pool is a container of storage resources. The storage resources
used by application servers all come from storage pools.
• A volume is an internal object managed inside a storage system.
• A LUN is a storage unit that can be directly mapped to a host for data read
and write. It is how volumes are provisioned to application servers
Server
LUN
Pool
Volume
Common File Systems
Virtual clustered file system