
3.1 Storage System:
Storage systems are specialized environments that safeguard a company's most valuable information.
They store and manage data in large amounts, which requires specialized designs and greater reliability and manageability.
A storage system supports activities such as:
Processing business transactions.
Processing and sharing intellectual property.
Routing email.
Maintaining financial records.
Purpose of Storage System Architecture:
The main purpose of storage system architecture is to run the applications that handle the core business and operational data of the organization.
This data should be accessible anywhere and anytime, and it should be secure.
• Five elements are required for the basic functionality of a storage system architecture:
• Application: It is a computer program which provides the logic for computing
operations.
• Database: It provides a structured way to store data in logically organized, interrelated tables and optimizes storage and retrieval operations.
• Server and OS: A computing platform that runs applications and databases.
• Network: used for communication between client and server.
• Storage array: It stores data persistently.
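• The sketch below (Python, standard library only; file and table names are hypothetical) illustrates how these elements fit together: the script plays the role of the application, SQLite stands in for the database, the machine running it is the server and OS, and the database file stands in for the storage array. The network element is omitted, since everything here runs on one host.

# Minimal sketch of the five elements (illustrative only; names are hypothetical).
#   application   -> the logic in this script
#   database      -> SQLite, storing data in logically organized tables
#   server and OS -> the machine running the Python interpreter
#   network       -> omitted; in practice the client reaches the server over TCP/IP
#   storage array -> the file "orders.db" standing in for persistent storage
import sqlite3

conn = sqlite3.connect("orders.db")   # data is persisted to "storage"
conn.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO orders (amount) VALUES (?)", (199.99,))
conn.commit()

# Application logic: process a business transaction by reading it back.
for row in conn.execute("SELECT id, amount FROM orders"):
    print(row)
conn.close()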
• Cloud storage is a digital storage solution which utilizes multiple servers to
store data in logical pools.
• Security: The backups are located across multiple servers and are better
protected from data loss or hacking.
• The advantages of Cloud Storage include:
• File Accessibility – The files can be accessed at any time from any place so
long as you have Internet access.
• Offsite Backup – Cloud Storage provides organizations with offsite (remote)
backups of data which in turn reduces costs.
• Effective Use of Bandwidth – Cloud storage uses bandwidth effectively; instead of sending files to recipients, a web link can be sent through email (a sketch of this follows the list below).
• Security of Data – Helps in protecting the data against ransomware or malware
as it is secured and needs proper authentication to access the stored data.
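• As a sketch of the bandwidth point above: many cloud object stores can issue a time-limited download link that is shared instead of the file itself. The example below uses the AWS boto3 SDK; the bucket and object names are hypothetical, and credentials are assumed to be configured.

# Share a file by link rather than by attachment (illustrative sketch).
# Assumes boto3 is installed and AWS credentials are configured.
import boto3

s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-bucket", "Key": "reports/q3.pdf"},  # hypothetical names
    ExpiresIn=3600,  # link is valid for one hour
)
print("Email this link instead of the file:", url)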
• The disadvantages of Cloud Storage include:
• Dependency on Internet Speed – If the Internet connection is slow or unstable,
we might have problems accessing or sharing the files.
• Dependency on a Third Party – A third-party service provider is responsible for the stored data, so examining a vendor's security standards before investing is an important prerequisite when selecting one.
3.2 Virtualized Data Center in Cloud
• Data center virtualization is the process of transforming data centers hosted on
physical servers to virtual data centers using cloud computing technology.
• Before virtualization, physical servers typically operated at a fraction of their capacity, creating IT environments with huge inefficiencies and excessive operating costs.
• A virtualized data center is a logical software abstraction of a physical data
center that provides a collection of cloud infrastructure components including
servers, storage clusters, and other networking components, to business
enterprises.
• Data center virtualization is the process of creating a modern data center that is
highly scalable, available and secure.
• A virtual data center (VDC) can be defined as the unit of resource allocation for multiple tenants. A correctly designed virtualized infrastructure will optimize workloads, reducing your data center footprint through computing and converged networking technologies.
• Core components – equipment and software for IT operations and storage of
data and applications.
• These may include storage systems; servers; network infrastructure, such
as switches and routers; and various information security elements, such as
firewalls.
• A data center solution that considers and designs for the five key elements (performance, time, space, experience and sustainability) will be reliable, flexible, scalable and efficient in many ways beyond just cooling and power.
• Datacenter virtualization offers a variety of strategic and technology benefits
to businesses ranging from increased profitability to greater scalability.
• VDC environment in cloud computing
• A Virtual Data Centre (VDC) is a fully managed, self-service Infrastructure as a Service (IaaS) private cloud solution.
• Providing multiple levels of security that adhere to Cloud Security Principles
by design, it is a flexible, automated and scalable cloud computing platform.
• By consolidating IT resources using virtualization, organizations can optimize
their infrastructure utilization and reduce the total cost of owning an
infrastructure.
• Moreover, in a VDC, virtual resources are created using software that enables
faster deployment, compared to deploying physical resources in classic data
centers.
• Data center virtualization covers all the key components of a data center, including virtualization at the compute, memory, desktop, and application levels.
• Virtual Private Server 
• A VPS can be used for a variety of purposes. It serves as an off-site, third-party hosting
provider with greater security and flexibility than you’d get with something like shared
hosting. When you sign up for a virtual private server, you’re basically getting a
remote computer and all the dynamic functioning that comes with it.
• It can be used for:
• Running websites
• Hosting business files and media
• Testing new components 
• The advantage of a VPS is that it offers many of the resources modern organizations require to deliver a professional user experience. For example, with dedicated CPU and RAM, a website run from a VPS will be more responsive than one run from a shared hosting service (a minimal example follows below).
• It is also a secure option. A VPS has what is known as sandbox security: because the server is its own separate entity within the cloud environment, other users cannot go in and access your user information and data files.
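• As a minimal illustration of the "running websites" use case, the sketch below starts a small static web server of the kind one might run on a VPS. It uses only the Python standard library; the port is an arbitrary choice, and a production site would normally sit behind a hardened web server such as nginx.

# Minimal static file server one might run on a VPS (illustrative only).
# Serves the current directory on port 8080.
from http.server import HTTPServer, SimpleHTTPRequestHandler

server = HTTPServer(("0.0.0.0", 8080), SimpleHTTPRequestHandler)
print("Serving on port 8080 ...")
server.serve_forever()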
• About Storage VDC
• You use a storage virtual device context (VDC) to separate LAN and SAN traffic
on the same switch.
• A VDC allows you to maintain one physical infrastructure but separate logical data paths. To achieve this configuration, you must perform the following tasks:
• Create a dedicated storage VDC.
• Allocate physical ports to the storage VDC.
• These can be either ports dedicated to only the storage VDC or ports that are
shared between the storage VDC and one other VDC.
• Dedicated ports can be used to create either VFC E ports (VE ports) or F ports
(VF ports).
• Shared ports can only be used for VFC F ports (VF ports). Once you share the
port to the storage VDC you can create a VFC F-port on top of the shared
interface.
• You cannot modify some details of that port because it must match the underlying
shared physical port. If you move the source port to another VDC or delete the
VDC, the shared ports are deleted and you must reconfigure them.
• virtual data center networking
• Cloud networking is transforming how organizations connect users, customers, data, workloads and digital experiences.
• By providing access to virtual routers, firewalls, bandwidth and network management resources via the internet, cloud networking simplifies and automates functions while increasing uptime, improving service and reducing costs.
• For IT teams, managing cloud networking has grown more complicated as
enterprises adopt a growing number of cloud technologies and environments. 
• This complexity results in added costs, slower migrations and limits on the
innovation that enterprises need to stay competitive.  
• Users connect to cloud services via multiple devices, while applications are run
on distributed infrastructure made possible by cloud computing.
3.3 Block and File Level Storage Virtualization
• Block-level storage virtualization is a storage service that provides a flexible,
logical arrangement of storage capacity to applications and users while
abstracting its physical location.
• As a software layer, it intercepts I/O requests to that logical capacity and maps
them to the appropriate physical locations.
• The difference between block-level storage and file-level storage is how the storage is organized and accessed on the storage device and from other devices, such as servers (a short sketch contrasting the two follows this list).
• In block-level storage, a storage device such as a hard disk drive (HDD) is identified as a storage volume.
• A storage volume can be treated as an individual drive, a "block". This gives a server's operating system access to the raw storage sections.
• The storage blocks can be modified by an administrator, adding more capacity
when necessary, which makes block storage fast, flexible, and reliable.
• File-level storage is a type of storage that has a file system installed directly onto
it where the storage volumes appear as a hierarchy of files to the server, rather
than blocks.
• This is different from block type storage, which doesn't have a default file system
and needs to have an administrator create one in order for non-administrator
users to navigate and find data.
• One benefit of using file storage is that it is easier to use. Most people are
familiar with file system navigation as opposed to storage volumes found in
block-level storage, where more knowledge about partitioning is required to
create volumes.
• Partitioning is the creation of sections on a disk that are set aside for certain files
or software.
• Based on the operating system used, the partitions will be assigned a name or
letter.
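• The sketch below contrasts the two access styles in Python. The file path and block device name are hypothetical, and reading a raw block device normally requires administrator privileges; it is meant only to show that file-level access navigates a file hierarchy while block-level access addresses raw bytes by offset.

import os

# File-level access: navigate a hierarchy of files through the file system.
with open("/data/reports/summary.txt", "rb") as f:   # hypothetical path
    first_bytes = f.read(512)

# Block-level access: address raw storage by offset; there are no file
# system semantics, so the reader must interpret the bytes itself.
fd = os.open("/dev/sdb", os.O_RDONLY)                # hypothetical device
os.lseek(fd, 4096, os.SEEK_SET)                      # seek to a block offset
raw_block = os.read(fd, 512)
os.close(fd)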
• virtual provisioning
• Virtual provisioning is a strategy for efficiently managing space in a storage
area network (SAN) by allocating physical storage on an "as needed" basis.
This strategy is also called thin provisioning.
• Virtual provisioning is designed to simplify storage administration by allowing
storage administrators to meet requests for capacity on-demand.
• Virtual provisioning gives a host, application or file system the illusion that it
has more storage than is physically provided.
• Physical storage is allocated only when data is actually written, rather than when the application is initially configured (a sketch of this idea follows below).
• Virtual provisioning can reduce power and cooling costs by cutting down on
the amount of idle storage devices in the array.
• As a result, virtual provisioning has become a part of green computing and 
green data center initiatives. 
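• The sketch below illustrates the thin-provisioning idea described above: the volume reports a large virtual size, but physical blocks are tracked only once they are written. The class and sizes are hypothetical, not any vendor's implementation.

# Illustrative thin provisioning: capacity is promised up front, but
# physical blocks are allocated only on first write.
class ThinVolume:
    def __init__(self, virtual_size_blocks):
        self.virtual_size = virtual_size_blocks   # what the host "sees"
        self.blocks = {}                          # physically allocated blocks

    def write(self, block_no, data):
        if block_no >= self.virtual_size:
            raise IndexError("write beyond virtual capacity")
        self.blocks[block_no] = data              # allocate on demand

    def physical_usage(self):
        return len(self.blocks)                   # only written blocks use space

vol = ThinVolume(virtual_size_blocks=1_000_000)   # host sees about one million blocks
vol.write(42, b"payload")
print(vol.virtual_size, "virtual blocks,", vol.physical_usage(), "physically allocated")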
• automated storage tiering in cloud
• Automated storage tiering (AST) is a storage software management feature
that dynamically moves information between different disk types and RAID
levels to meet space, performance and cost requirements.
• Automated storage tiering features use policies that are set up by storage
administrators.
• For example, a data storage administrator can assign infrequently used data to slower, less expensive SATA storage but allow it to be automatically moved to higher-performing SAS or solid-state drives (SSDs) as it becomes more active (and vice versa); a sketch of such a policy follows this list.
• Automated storage tiering fully leverages the advantages of different storage
media including SSDs for high-performance I/Os and HDDs for massive data
archive.
• It allows users to flexibly assign applications to the available tiers (typically two to four), distinguished by different drive types and RAID levels. Users can thus optimize storage performance and greatly increase ROI.
• Storage Tiering prioritizes storage blocks into different categories, referred to as storage tiers, which provide various levels of performance and capacity based on price/performance considerations, performance/bandwidth demands, frequency of use, and other criteria.
• Storage Tiering enables users to flexibly assign applications to tiers with
different drive types and RAID levels.
• Infortrend's Automated Storage Tiering provides an architecture that fully consolidates the advantages of different storage media, including SSDs for high performance and near-line SAS (NL-SAS) drives for storage capacity.
• It helps users more easily accommodate and meet different service level
requirements via easy-to-use GUI-based management tools.
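• The sketch below shows the kind of policy described above: extents that become hot are promoted to SSD and cold extents are demoted to NL-SAS. The thresholds, extent names and tier names are hypothetical.

# Illustrative tiering policy (thresholds and tiers are arbitrary choices).
HOT_THRESHOLD = 100    # accesses per day that count as "hot"
COLD_THRESHOLD = 5     # accesses per day that count as "cold"

def choose_tier(accesses_per_day, current_tier):
    if accesses_per_day >= HOT_THRESHOLD:
        return "SSD"                 # promote hot data to the fast tier
    if accesses_per_day <= COLD_THRESHOLD:
        return "NL-SAS"              # demote cold data to the capacity tier
    return current_tier              # leave data in place between thresholds

extents = {"ext-1": (250, "NL-SAS"), "ext-2": (2, "SSD"), "ext-3": (40, "SAS")}
for name, (rate, tier) in extents.items():
    print(name, tier, "->", choose_tier(rate, tier))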
• virtual storage area network (VSAN)
• A virtual storage area network (VSAN) is a logical partition in a physical storage area
network (SAN).
• VSANs enable traffic to be isolated within specific portions of a storage area network,
so if a problem occurs in one logical partition, it can be handled with a minimum of
disruption to the rest of the network.
• The use of multiple, isolated VSANs can also make a storage system easier to
configure and scale out.
• Subscribers can be added or relocated without needing to change the physical layout.
• How a VSAN works
• A virtual SAN appliance enables unused storage capacity on virtual servers to be pooled and accessed by virtual servers as needed. A virtual SAN appliance is most often downloaded as a software program that runs on a virtual machine, but some storage hardware vendors are beginning to incorporate virtual SAN appliances into their firmware. Depending on the vendor, a virtual SAN appliance might also be called a software-defined storage (SDS) appliance or, simply, a virtual storage appliance.
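• Conceptually, the pooling can be pictured as in the sketch below: spare capacity reported by several hosts is combined into one pool and handed out on demand. Host names and sizes are hypothetical, and a real virtual SAN appliance also handles replication, fault domains and policies.

# Conceptual sketch of capacity pooling across hosts (not a real appliance).
free_gb = {"host-a": 400, "host-b": 250, "host-c": 600}   # hypothetical hosts

pool = {"capacity_gb": sum(free_gb.values())}
print("Pooled capacity:", pool["capacity_gb"], "GB")

def allocate(request_gb):
    """Carve a volume out of the shared pool, if enough space remains."""
    if request_gb > pool["capacity_gb"]:
        raise RuntimeError("not enough pooled capacity")
    pool["capacity_gb"] -= request_gb
    return request_gb

allocate(300)
print("Remaining pooled capacity:", pool["capacity_gb"], "GB")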
• Benefits of virtual storage area networks
• Nondisruptive data migration. A VSAN enables adopters to migrate data between
drives easily and without any downtime.
• Better information lifecycle management. Virtualization administrators can
relocate frequently accessed data to high-performance storage, pushing rarely
accessed data regions onto less expensive storage resources.
• Improved manageability. Although it's relatively easy to manage identical drives,
the task can become much more difficult if storage resources involve several
vendors or even several models from the same vendor. A VSAN isn't only easy to
set up, but straightforward to manage and provision.
• Overall simplicity. Compared to the available alternatives, a VSAN is easy to provision and manage. This is because the VSAN is embedded directly within the hypervisor, enabling installation and configuration to be handled rapidly and easily.
• Reduced total cost of ownership. A VSAN can be deployed on inexpensive x86
servers, eliminating the need for large upfront investments.
• Virtual storage area network use cases
• Server virtualization
• Cloud automation
• Demilitarized zones and any test environments
• Virtual desktop infrastructure (VDI) environments
• Support edge network sites
• Convert localized storage into virtual storage
Cloud File System: Google File System (GFS) and Hadoop Distributed File System (HDFS):
• Google File System (GFS):
• Google File System (GFS) is a scalable distributed file system (DFS) created by
Google Inc. and developed to accommodate Google’s expanding data
processing requirements.
• GFS is made up of several storage systems built from low-cost commodity
hardware components.
• It is optimized to accommodate Google's different data use and storage needs,
such as its search engine, which generates huge amounts of data that must be
stored.
• GFS provides fault tolerance, reliability, scalability, availability and
performance to large networks and connected nodes.
• The GFS node cluster is a single master with multiple chunk servers that are
continuously accessed by different client systems.
• Chunk servers store data as Linux files on local disks.
• Stored data is divided into large chunks (64 MB), which are replicated in the network a minimum of three times (a short arithmetic sketch follows this section).
• The large chunk size reduces network overhead.
• GFS is designed to accommodate Google’s large cluster requirements without
burdening applications.
• Files are stored in hierarchical directories identified by path names.
• Metadata - such as namespace, access control data, and mapping information - is
controlled by the master, which interacts with and monitors the status updates of
each chunk server through timed heartbeat messages.
• GFS features include:
o Fault tolerance
o Critical data replication
o Automatic and efficient data recovery
o High aggregate throughput
o Reduced client and master interaction because of the large chunk size
o Namespace management and locking
o High availability
• The largest GFS clusters have more than 1,000 nodes with 300 TB of disk storage capacity. This can be accessed by hundreds of clients on a continuous basis.
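• The arithmetic sketch below works through the chunking described above: with 64 MB chunks replicated three times, a file's byte offset maps to a chunk index, and the raw capacity consumed is roughly three times the file size. The 10 GB file size is an arbitrary example.

# Back-of-the-envelope arithmetic for GFS-style chunking (illustrative only).
CHUNK_SIZE = 64 * 1024 * 1024      # 64 MB chunks
REPLICATION = 3                    # each chunk is stored at least three times

file_size = 10 * 1024**3           # a hypothetical 10 GB file
num_chunks = -(-file_size // CHUNK_SIZE)    # ceiling division
raw_capacity_gb = file_size * REPLICATION / 1024**3

def chunk_index(offset):
    """Which chunk holds a given byte offset of the file."""
    return offset // CHUNK_SIZE

print(num_chunks, "chunks,", raw_capacity_gb, "GB of raw storage consumed")
print("byte offset 5 GB falls in chunk", chunk_index(5 * 1024**3))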
• Hadoop Distributed File System (HDFS):
• The Hadoop Distributed File System is an open source implementation of the
GFS architecture that is also available on the Amazon EC2 cloud platform.
• Large files are broken into chunks (GFS) or blocks (HDFS), which are themselves very large.
• These chunks are stored on commodity (Linux) servers called chunk servers (GFS) or data nodes (HDFS); further, each chunk is replicated both on a different physical rack and on a different network segment, in anticipation of failures of those components in addition to server failures.
• When a client program needs to read or write a file, it sends the full path and offset to the master (GFS) or name node (HDFS), which sends back metadata for one (in the case of a read) or all (in the case of a write) of the replicas of the chunk where the data is to be found (a toy sketch of this read path follows this section).
• The client caches such metadata so that it need not contact the master each time.
• In the case of a write, in particular an append, the client sends only the data to be appended to all the chunk servers; when they all acknowledge receiving this data, it informs a designated 'primary' chunk server, whose identity it receives from the master.
• The master maintains regular contact with each chunk server through heartbeat messages; in case it detects a failure, its metadata is updated to reflect this and, if required, it assigns a new primary for the chunks being served by the failed chunk server.
• This architecture efficiently supports multiple parallel readers and writers.
• It also supports writing (appending) to and reading from the same file by parallel sets of writers and readers while maintaining a consistent view, i.e., each reader always sees the same data regardless of the replica it happens to read from.
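• The toy sketch below models the read path just described: the client asks the master for the replica locations of the chunk containing (path, offset), then caches that metadata so later reads of the same chunk skip the master. All class and node names are hypothetical; this is not the GFS or HDFS API.

# Toy model of the read path described above (hypothetical names, not real APIs).
CHUNK_SIZE = 64 * 1024 * 1024

class Master:
    """Holds metadata: (path, chunk index) -> replica locations."""
    def __init__(self, chunk_map):
        self.chunk_map = chunk_map

    def lookup(self, path, chunk_idx):
        return self.chunk_map[(path, chunk_idx)]

class Client:
    def __init__(self, master):
        self.master = master
        self.cache = {}                       # client-side metadata cache

    def read(self, path, offset):
        chunk_idx = offset // CHUNK_SIZE
        key = (path, chunk_idx)
        if key not in self.cache:             # contact the master only on a miss
            self.cache[key] = self.master.lookup(path, chunk_idx)
        replicas = self.cache[key]
        return f"read chunk {chunk_idx} of {path} from one of {replicas}"

master = Master({("/logs/app.log", 0): ["node-1", "node-7", "node-9"]})
client = Client(master)
print(client.read("/logs/app.log", 1024))     # cache miss: asks the master
print(client.read("/logs/app.log", 2048))     # cache hit: master not contacted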
