UNIVERSITE INTERNATIONALE DE RABAT

Academic year: 2023-2024


Virtual Storage

Introduction

• Other chapters in the course described virtualization technologies
that run on top of the infrastructure, including virtual machines,
containers, and virtual networks.

• This chapter completes the description of virtualization technologies
by examining the virtual storage facilities used in data centers.

• The storage facilities used in data centers employ the same designs
as the storage mechanisms used on a conventional computer. In fact,
data center storage mechanisms reuse approaches that have been
around for decades.

• Therefore, to understand persistent storage in a data center, we must
start with the persistent storage systems that are used with
conventional computer systems.

Persistent Storage

• The term persistent storage refers to a data storage mechanism that
retains data after the power has been removed. We can distinguish
between two forms of persistent storage:

o Persistent storage devices: A conventional computer uses a
separate physical device to provide persistent storage. By the
1960s, the computer industry adopted electromechanical devices
called hard drives that use magnetized particles on a surface to
store data. The industry now uses Solid State Disk (SSD)
technology with no moving parts.

o Persistent storage abstractions: Users do not deal directly with
disk hardware. Instead, an operating system provides two
abstractions that users find intuitive and convenient: named files
and hierarchical directories.

Disk Interface Abstraction

• A disk device provides a block-oriented interface. That is, the
hardware can only store and retrieve fixed-sized blocks of data.

• Traditional disks define a block to consist of 512 bytes of data; to
increase performance, some newer disks offer blocks of 4096 bytes.

• The blocks on a disk are numbered starting at zero (0, 1, 2,...).

• To store data on a disk, the operating system must pass two items to
the disk device: a block of data and a block number.

• The hardware cannot write a partial block: when the operating
system instructs the disk device to store data in block i, the entire
contents of block i on the disk are replaced with the new data.

• The disk interface is surprisingly constrained: the hardware can only
transfer a complete block and does not transfer more than one block
per request.

• The disk interface supports only two operations: read a specified
block and write a specified block.
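
The constraints above can be illustrated with a minimal sketch of a block interface. The class name BlockDevice and the method names read_block and write_block are assumptions made for illustration; they do not correspond to any real driver API. The sketch stores blocks in memory and rejects any write that is not exactly one block long.

```python
BLOCK_SIZE = 512  # traditional block size in bytes


class BlockDevice:
    """Toy in-memory disk that exposes only whole-block read and write."""

    def __init__(self, num_blocks: int, block_size: int = BLOCK_SIZE):
        self.block_size = block_size
        # Blocks are numbered 0, 1, 2, ... and start out zero-filled.
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def _check(self, block_number: int) -> None:
        if not 0 <= block_number < len(self.blocks):
            raise IndexError("block number out of range")

    def read_block(self, block_number: int) -> bytes:
        # Always returns a complete block; there is no partial read.
        self._check(block_number)
        return self.blocks[block_number]

    def write_block(self, block_number: int, data: bytes) -> None:
        # The entire block is replaced, so the caller must supply
        # exactly one block of data.
        self._check(block_number)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self.blocks[block_number] = bytes(data)


disk = BlockDevice(num_blocks=8)
disk.write_block(3, b"A" * BLOCK_SIZE)
block = disk.read_block(3)
```

Each request names one block number and moves one block, which mirrors the two items the operating system passes to the disk device.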

File Interface Abstraction

• An operating system contains a software module known as a file
system that users and applications use to create and manipulate files.

• Unlike a disk interface, a file system provides a large set of
operations that the operating system maps onto the underlying disk
hardware.

• Unlike a disk device, a file system provides a byte-oriented interface.

• The file interface supports basic operations such as open, close,
read, and write.
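
To show how byte-oriented operations can be mapped onto block hardware, here is a minimal sketch. It assumes a disk modeled as a Python list of 512-byte blocks, and the helper names write_bytes and read_bytes are hypothetical. A real file system also tracks file names, directories, and block allocation, all of which are omitted here.

```python
BLOCK_SIZE = 512  # bytes per block, as on a traditional disk

# Stand-in for a disk: a list of whole blocks (an assumption for this sketch).
disk = [bytes(BLOCK_SIZE) for _ in range(4)]


def write_bytes(disk, offset, data):
    """Write data at a byte offset using only whole-block transfers."""
    written = 0
    while written < len(data):
        block_number, start = divmod(offset + written, BLOCK_SIZE)
        chunk = data[written:written + BLOCK_SIZE - start]
        block = bytearray(disk[block_number])    # read the whole block
        block[start:start + len(chunk)] = chunk  # modify part of it
        disk[block_number] = bytes(block)        # write the whole block back
        written += len(chunk)


def read_bytes(disk, offset, count):
    """Read count bytes at a byte offset by fetching whole blocks."""
    result = bytearray()
    while len(result) < count:
        block_number, start = divmod(offset + len(result), BLOCK_SIZE)
        block = disk[block_number]
        result += block[start:start + count - len(result)]
    return bytes(result)


# A 5-byte write that straddles the boundary between blocks 0 and 1.
write_bytes(disk, 510, b"hello")
text = read_bytes(disk, 510, 5)
```

Because the hardware cannot write a partial block, every small write becomes a read-modify-write of each block it touches.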

Local and Remote Storage

• We use the term local storage device to characterize a disk
connected directly to a computer.

• We use the term remote storage to characterize a persistent storage
mechanism that is not attached directly to a computer but is instead
reachable over a computer network.

• A disk device cannot connect directly to a network. Instead, the
remote disk connects to a storage server that connects to a network
and runs software that handles network communication.

Remote Storage Systems

• Commercial systems exist for each of the two remote storage
paradigms:

o Byte-oriented remote file access

o Block-oriented remote disk access

• Byte-oriented remote file access

o Before remote file storage, the members of a group collaborating
on a document had to send copies to each other. To solve the
problem, computer vendors introduced remote file storage.

o The idea is straightforward: equip a storage server with a disk
and arrange for the storage server to run an operating system
that includes a file system. Add software to each individual’s
workstation that can access and modify files on the server.

o Each time an app performs an operation on a remote file, the
user’s operating system sends the request to the server, which
performs the operation on the file.

o One of the first remote file access systems is known as the
Network File System (NFS).
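
The request forwarding described above can be sketched as follows. This is not the NFS protocol: the JSON message format and the names FileServer and RemoteFileClient are assumptions made for illustration, and a direct function call stands in for the network.

```python
import json


class FileServer:
    """Hypothetical storage server: holds the files and executes requests."""

    def __init__(self):
        self.files = {}  # file name -> bytearray of file contents

    def handle(self, message: str) -> str:
        request = json.loads(message)
        contents = self.files.setdefault(request["name"], bytearray())
        offset = request["offset"]
        if request["op"] == "write":
            data = request["data"].encode()
            contents[offset:offset + len(data)] = data
            reply = {"written": len(data)}
        elif request["op"] == "read":
            chunk = contents[offset:offset + request["count"]]
            reply = {"data": chunk.decode()}
        else:
            reply = {"error": "unknown operation"}
        return json.dumps(reply)


class RemoteFileClient:
    """Workstation-side software that forwards each operation to the server."""

    def __init__(self, send):
        self.send = send  # callable standing in for the network

    def write(self, name, offset, text):
        request = {"op": "write", "name": name, "offset": offset, "data": text}
        return json.loads(self.send(json.dumps(request)))["written"]

    def read(self, name, offset, count):
        request = {"op": "read", "name": name, "offset": offset, "count": count}
        return json.loads(self.send(json.dumps(request)))["data"]


server = FileServer()
alice = RemoteFileClient(server.handle)
bob = RemoteFileClient(server.handle)

alice.write("report.txt", 0, "Hello world")
bob.write("report.txt", 6, "W")           # Bob edits the same shared file
shared = alice.read("report.txt", 0, 11)  # Alice sees Bob's change
```

Both clients operate on the single copy held by the server, so no one has to send copies of the document around.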

o Industry has adopted the term Network Attached Storage (NAS)
to refer to specialized systems that provide scalable remote file
storage suitable for a data center.

o The hardware used in a NAS system is ruggedized to withstand
heavy use. The hardware and software in a NAS system are
both optimized for high performance.

o One technique used to help satisfy the goal of durability involves
the use of parallel, redundant disks.

o Known as a RAID array, the technology places redundant copies
of data on multiple disks. The redundancy allows the system to
continue to operate correctly if a disk fails, and allows the failed
disk to be replaced while the array continues to operate.
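
The idea of redundant copies can be illustrated with a minimal sketch of mirroring, in which every block is written to every disk. Mirroring is only one of several RAID layouts, and the class and method names here (MirroredArray, fail_disk, replace_disk) are hypothetical.

```python
class MirroredArray:
    """Toy mirrored array: every healthy disk holds a copy of each block."""

    def __init__(self, num_disks: int, num_blocks: int, block_size: int = 512):
        self.block_size = block_size
        # Each disk is a list of blocks; None marks a failed disk.
        self.disks = [[bytes(block_size)] * num_blocks for _ in range(num_disks)]

    def write_block(self, block_number: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        for disk in self.disks:
            if disk is not None:
                disk[block_number] = data  # redundant copy on each disk

    def read_block(self, block_number: int) -> bytes:
        for disk in self.disks:
            if disk is not None:
                return disk[block_number]  # any surviving copy will do
        raise IOError("all disks have failed")

    def fail_disk(self, index: int) -> None:
        self.disks[index] = None

    def replace_disk(self, index: int) -> None:
        # Rebuild the new disk by copying from a surviving disk.
        source = next((d for d in self.disks if d is not None), None)
        if source is None:
            raise IOError("no surviving disk to copy from")
        self.disks[index] = list(source)


array = MirroredArray(num_disks=2, num_blocks=4)
array.write_block(1, b"x" * 512)
array.fail_disk(0)                # the array keeps operating on disk 1
array.write_block(2, b"y" * 512)  # a write made while degraded
array.replace_disk(0)             # the new disk is rebuilt from disk 1
array.fail_disk(1)
survivor = array.read_block(2)    # served by the replacement disk
```

The data written while one disk was down is still readable after the other disk fails, because the replacement was rebuilt from the surviving copy.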

• Block-oriented remote disk access

o Industry uses the term Storage Area Network (SAN) to describe
a remote storage system that provides a block-oriented remote
disk interface.

o Early SAN technology includes two components: a server and a
network optimized for storage traffic.

o Some of the motivation for a special network arose because
early data centers used a hierarchical network architecture
optimized for web traffic (i.e., north-south traffic). In contrast,
communication between servers and remote storage facilities
produces east-west traffic.

o Thus, having a dedicated network used to access remote
storage keeps storage traffic off the main data center network.

Virtual Disk Mapping

• The SAN server has one or more local disks that it uses to store
blocks on behalf of clients.

• The server does not merely allocate one physical disk to each client.
Instead, the server provides each client with a virtual disk.

• When software creates an entity that needs disk storage (e.g., when
a VM is created), the software sends a message to the SAN server
giving a unique ID for the new entity and specifying a disk size
measured in blocks.

• When it receives a request to create a disk for a new entity, the
server uses the client’s unique ID to form a virtual disk map.

• The virtual disk map has an entry for each block of the virtual disk:
0, 1, and so on.

• For each entry, the server finds an unused block on one of the local
disks, allocates the block to the new entity, and fills in the entry with
information about how to access the block.

• In essence, the virtual disk map defines how to treat a set of blocks
on disks at the server as a single disk.
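
The mapping steps above can be sketched as follows. The class name SanServer, its method names, and the allocation policy (taking free blocks alternately from each local disk) are assumptions made for illustration; a real SAN server may allocate blocks differently.

```python
class SanServer:
    """Toy SAN server that gives each entity a virtual disk."""

    def __init__(self, num_disks: int, blocks_per_disk: int, block_size: int = 512):
        self.block_size = block_size
        self.disks = [[bytes(block_size)] * blocks_per_disk for _ in range(num_disks)]
        # Unused physical blocks as (disk index, block number) pairs,
        # ordered so that allocation alternates between disks.
        self.free = [(d, b) for b in range(blocks_per_disk) for d in range(num_disks)]
        # Entity ID -> virtual disk map; entry i locates virtual block i.
        self.maps = {}

    def create_disk(self, entity_id: str, size_in_blocks: int) -> None:
        if entity_id in self.maps:
            raise ValueError("entity already has a virtual disk")
        if size_in_blocks > len(self.free):
            raise RuntimeError("not enough unused blocks")
        self.maps[entity_id] = [self.free.pop(0) for _ in range(size_in_blocks)]

    def _locate(self, entity_id: str, virtual_block: int):
        disk_map = self.maps[entity_id]
        if not 0 <= virtual_block < len(disk_map):
            raise IndexError("virtual block number out of range")
        return disk_map[virtual_block]

    def write_block(self, entity_id: str, virtual_block: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        disk, block = self._locate(entity_id, virtual_block)
        self.disks[disk][block] = data

    def read_block(self, entity_id: str, virtual_block: int) -> bytes:
        disk, block = self._locate(entity_id, virtual_block)
        return self.disks[disk][block]


server = SanServer(num_disks=2, blocks_per_disk=4)
server.create_disk("vm-1", 3)  # virtual blocks 0..2
server.create_disk("vm-2", 2)  # virtual blocks 0..1
server.write_block("vm-1", 0, b"a" * 512)
server.write_block("vm-2", 0, b"b" * 512)
vm1_map = server.maps["vm-1"]
```

Each client addresses its own blocks starting at 0, while the map decides which physical disk and block actually hold the data.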

Hyper-Converged Infrastructure

• The specialized networks used in early SANs were expensive.

• The move to leaf-spine networks and the availability of much less
expensive high-capacity Ethernet hardware changed the economics
of SANs.

• Instead of using a special-purpose network, SAN hardware has been
redesigned to allow it to communicate over a conventional data
center network.

• To characterize a data center network that carries all types of traffic,
including SAN storage traffic, industry uses the term
Hyper-Converged Infrastructure.

Summary

• Disk hardware uses a block interface that allows a computer to read
or write one block at a time; a file system typically uses the
open-close-read-write paradigm to provide a byte-oriented interface.

• Industry uses the term Network Attached Storage (NAS) to describe a
high-performance, scalable, ruggedized remote file server, and the
term Storage Area Network (SAN) to describe a high-performance,
scalable, ruggedized remote disk server that provides block storage.

• Block storage systems allocate a virtual disk to each client (e.g., each
VM). The client uses block numbers 0 through N–1.

• The SAN server maintains a mapping between the block numbers a
client uses and blocks on physical disks; the client remains unaware
that blocks from its disk may not map onto a single physical disk at
the server.