
Advanced Technical Support, Americas

Network Attached Storage

The Basics
Norman Bogard

2012 IBM Corporation


Origins of Network Attached Storage
Data Transfer: Block versus File
Converged Storage
NAS Techniques
Additional Resources

A note of thanks for input from

Brett Cooper
Nils Haustein

Origins of Network Attached Storage (NAS)

Ethernet was invented in 1973 by Robert Metcalfe of Xerox
The first paper on Ethernet was not published until 1976
In the early 1980s Newcastle University demonstrated remote file access with UNIX systems
In 1983 Novell's NetWare Core Protocol (NCP) was released
In 1983 Barry Feigenbaum of IBM invented Server Message Block (SMB)
SMB became the foundation of the Common Internet File System (CIFS) in 1996
In 1984 Sun Microsystems released the Network File System (NFS)
The 1990s saw the beginning of dedicated NAS devices


Origins of NAS Continued

By late 1991 the HyperText Transfer Protocol (HTTP) & HyperText Markup Language (HTML) were defined
HTTP-based interfaces are now often described as Representational State Transfer (REST)
REST is the foundation of cloud-based storage services like Amazon's S3
NAS is always client/server based
Appliance (file server) is the server
End systems, like workstations or application servers, are the clients
1992 saw the beginning of Samba
Samba gets its name from SMB & grep -i '^s.*m.*b'
Samba implements SMB/CIFS on UNIX-like systems, so the same files can be shared over both NFS & CIFS


Block: leverages Small Computer System Interface (SCSI) commands to read and write specific blocks
Common SCSI access methods include Fibre Channel (FC), Internet Small Computer System Interface (iSCSI), or InfiniBand (IB)
IB is a high-speed network interconnect
NAS: reads and writes files
File Server: a storage server dedicated (primarily) to serving files
NAS Gateway: a server that provides network-based storage virtualization
Provides protocol translation from host-based CIFS/NFS to Storage Area Network (SAN) based block
Examples: IBM N series & SONAS; NetApp V Series; EMC VNX/Celerra; OnStor (LSI); HP P4000 Unified Gateway
Unified Storage: a single logical, centrally managed storage platform that serves both block (FC, iSCSI, IB) and file-based (CIFS, NFS, HTTP, etc.) access
Examples: IBM N series; NetApp V series; & IBM Storwize V7000 Unified

Comparing SAN & NAS

* Internet Protocol / User Datagram Protocol


Block vs File

Block-level storage devices / SAN (e.g. V7000, DS8000, XIV):
Provide access to equal-sized blocks of storage
Blocks are found by a number on a device
Read and write operations on data blocks
Mainly SCSI protocol
Block services segmented into LUNs or vDisks, usually a few dozen
Connections to the device in the order of 10-100
Authorization for generic access by host (10-100)
Almost no coordination of concurrent access to these LUNs (other than SCSI protocol device reservation)

NAS devices (e.g. N series, SONAS, or V7kU):
Provide access to files
Files are found by a name within a tree of names
Read, write, create, delete and many more operations
CIFS, NFS, FTP and other protocols
Device services exposed as exports, directories, and files: a few hundred, millions, billions
Connections can be in the order of 100 to 10,000s
Authorization by user ID for reads, writes, and metadata operations
Coordination of concurrent access with share modes and leases/delegations (on whole files) and byte-range locks (fragments within files), within a protocol (e.g. CIFS) and across protocols (e.g. between CIFS and NFS)
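
The difference between finding data by block number and finding it by name can be sketched in a few lines of Python. This is a toy model only: the class names, the 512-byte block size, and the example path are assumptions made for illustration and do not correspond to any product API.

```python
# Illustrative only: toy models of the two access styles, not a real storage API.
BLOCK_SIZE = 512  # assumed block size in bytes


class BlockDevice:
    """Block access: data is found by a block number on a device."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("block devices transfer whole, equal-sized blocks")
        self.blocks[lba] = data

    def read_block(self, lba):
        return self.blocks[lba]


class FileServer:
    """File access: data is found by a name within a tree of names."""

    def __init__(self):
        self.files = {}

    def create(self, path, data=b""):
        self.files[path] = data

    def read(self, path):
        return self.files[path]

    def delete(self, path):
        del self.files[path]


lun = BlockDevice(num_blocks=8)
lun.write_block(3, b"x" * BLOCK_SIZE)   # the host must know what block 3 means

nas = FileServer()
nas.create("/shares/nfs/report.txt", b"hello")  # the server knows what the name means
```

In the block case the meaning of block 3 lives in the host's file system; in the file case the server itself keeps the name tree.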


Data Transfer: Block versus File

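The key to understanding the difference between block and file data is the file system owner
With direct attached storage and a SAN the host owns the file system; with NAS the file server owns it
[Figure: Direct Attached Storage; Storage Area Network (FC, iSCSI, or IB); Network Attached Storage (IP: CIFS, NFS, etc.)]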


Converged Storage (Unified Storage)

Two fundamental approaches to intermixing block & file storage within a single system
IBM's N series uses block on file
A device file with a Logical Unit Number (LUN) assigned to it is stored within the file server's Write Anywhere File Layout (WAFL) file system and then mapped to a host
File & block data are stored within the same file system
IBM's Storwize V7000 Unified (V7kU) uses file on block
A raw device from the V7k is mapped to hosts
File data is contained within discrete devices
Host block data is contained within discrete devices
File & block data are stored independently

[Figure: Block-on-File (N series): a mapped LUN is a file in the file server's WAFL file system, above the internal SAN and disks. File-on-Block (V7kU): file shares are served by file modules on top of block devices, above the internal SAN and disks.]


Block vs File: High-Level Application Affinity

Applications/data types that typically reside in block stores:
RDBMS (Oracle, SQL Server, DB2)
Analytics (stream processing)
Metadata Layers (component of content management)
Email (MS Exchange, Notes)
Virtualization Stacks (VMware: VDI, VMDK implementations; Hyper-V; Citrix Xen)
Applications/data types that typically reside in files:
Rich Media (pictures, videos, seismic data, medical imaging, etc.)
Analytics (SAS grid)
ECM (Enterprise Content Management e.g., web stores)
Research Data Sets
User files (documents, etc.)
PLM/PDM (Product Lifecycle/Data Management)
Virtualized Environments (VMware client-driven deployment)


The Enterprise Workload Landscape: What Fits Where?

Application / Workload | Class | Comments
Oracle DBMS | B (F) | For larger (>20TB) instances, lead with XIV
eBusiness Suite | B (F) |
BWH | F (B) | Analytics
Content Mgt. (Filenet, Documentum, etc.) | F | Metadata layer may be block/RDBMS
Media Streaming (VOD, AOD, IPTV) | F | Very performance/latency sensitive; also potential for tape/LTFS
MS Exchange 2010 | B | Strong ESRP results for both V7K & XIV
MS Exchange 2003/2007 | B |
Lotus Notes | B (F) | Back end is DB2 database; predominantly block
VMware Virt. Infrastructure | B (F) | Block: mature (also XIV); File: emerging (MS Hyper-V block only)
SAS Analytics | B/F | Block: mid-range, File: grid; XIV certified & strong; V7KU TBD
PACS/EMR Imaging | F | Ex.: Cachet Database
Prod. Workflow | B/F | Front Office OLTP: block; Patient archives: file

B: Block; F: File

Data Protection: File vs Block - Concept

File (Network Attached Storage):
Generally snapshots are consistent since the file system is consistent on the NAS
Replication is supported
NDMP is leveraged to dump files
Integrated with most leading software

Block (Storage Area Network):
Snapshots require integration with the host file system and application to be consistent
Backups are done by moving blocks through a master-media server to disk/tape
Replication is supported once the file system is consistent
Acts just like Direct Attached

[Figure: stack of application, network, file system, and NAS system layers]



Integration points for NAS protocols with

NAS applications historically have required very little, if any, integration with host applications
Many NAS vendors now provide Application Programming Interfaces (APIs) to integrate with leading host applications such as VMware
Provides seamless movement and recovery of virtual machines through replication
Many third-party management solutions support NAS integration to discover and manage the storage, including our own Tivoli Storage Productivity Center (TPC)
Microsoft Windows Server applications such as Microsoft Exchange Server and Microsoft SQL Server do not support NAS shares for data placement
These applications require block storage


NAS Techniques overview

File systems
File shares
Network services
Authentication and authorization
Data availability
Data protection (snapshot, backup, NDMP, replication)
Antivirus support
Information Lifecycle Management
File cloning

File systems and file-sets

File shares are exports to the user or application (NFS, CIFS, HTTP, FTP, ...)
User files are organized and stored in file systems
File system is local to the NAS system
File-sets allow for breaking down the file system space into smaller manageable units
Certain operations can be configured for file-sets, such as replication, snapshots, and quota management
Pools allow placement and migration of files to storage devices of different cost
[Figure: NAS system stack from top to bottom: shares; file-sets (optional); file system; pools (optional); storage]

File shares
A file share is a user file system provided by the file server
Allows users and groups to share files in a common name space
Access permissions can be given based on user and group
Typical file sharing protocols are CIFS and NFS
A CIFS share is exported under a share name (usershare), which is mounted by the user
An NFS share is exported as a directory (/shares/nfs), which is mounted by the user (as /mnt/userdata)
[Figure: CIFS example: file server directory /shares/cifs exported with share name usershare. NFS example: file server directory /shares/nfs mounted as Filer:/shares/nfs on /mnt/userdata.]


File shares and TCP/IP address failover

NFS is stateless
Upon TCP/IP address failover the NFS client experiences a short interruption
I/O continues when the TCP/IP connection is available; no re-connection is required
Even though NFS v4 is stateful, durable file handles eliminate the need for a re-connection
CIFS is stateful
Upon TCP/IP address failover the CIFS client loses its connection
File share must be reconnected (mounted) before I/O can continue
Additional tools can be used to automate reconnection (e.g. DFS)
[Figure: clustered file server with two file modules; after failover to module 2 the NFS client's I/O continues, while the CIFS client requires a reconnect]


Network services
Network Time Protocol (NTP) is used to synchronize the time between components (file server, directory server)
Very important for authentication services with Kerberos and clustered
Domain Name System (DNS) is used to resolve names and IP addresses
Required for Active Directory authentication
DNS round robin can be used for load balancing
DNS round robin sequentially selects IP addresses
[Figure: clients 1 and 2 query the DNS server and are directed to file modules 1 and 2]
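
The sequential selection behind DNS round robin can be sketched as follows. This is a minimal model, not a DNS implementation: the resolver class, the host name `nas.example.com`, and the two addresses are hypothetical, and a real DNS server typically rotates the order of the whole record set rather than returning a single address.

```python
from itertools import cycle


class RoundRobinResolver:
    """Hands out the configured addresses for a name one after another."""

    def __init__(self, records):
        # records: host name -> list of IP addresses (one per file module)
        self._cycles = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        return next(self._cycles[name])


# Hypothetical cluster with two file modules behind one name.
resolver = RoundRobinResolver({"nas.example.com": ["10.0.0.1", "10.0.0.2"]})
answers = [resolver.resolve("nas.example.com") for _ in range(4)]
```

Successive clients are spread over the file modules, but the resolver knows nothing about actual load, so this balances connections rather than work.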

Multipathing Solutions for NAS

Link Aggregation
Relies on Transmission Control Protocol/Internet Protocol (TCP/IP) level integration
Handled at either the switch port level or at the network layer of the Open Systems Interconnection (OSI) model
OSI layer 3
Commonly known as: bonding, teaming or trunking
Logically bonds multiple network paths into a single path, effectively increasing the bandwidth and providing redundant physical connections in the case of a failure
Uses round-robin scheduling, or is based on hash values computed from fields in the packet header, or a combination of these two methods
Network load is balanced across all links unless active/passive mode is chosen
[Figure: two 1 GbE links bonded into one logical 2 GbE link at each end]
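
Hash-based link selection can be illustrated with a short sketch. The function name, the choice of header fields, and the use of CRC32 are assumptions for illustration; real bonding drivers and switches use their own hash policies.

```python
import zlib


def select_link(src_ip, dst_ip, src_port, dst_port, num_links):
    """Pick a physical link from a hash of packet header fields.

    All packets of one flow carry the same fields, so they hash to the
    same link and stay in order; different flows spread across links.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_links


# Hypothetical NFS flow over a bond of two 1 GbE links.
link = select_link("10.0.0.5", "10.0.0.9", 40000, 2049, num_links=2)
```

A consequence of this design is that a single flow never exceeds the speed of one physical link; the aggregate bandwidth is only reached with several concurrent flows.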


Authentication validates whether a resource is who it claims to be
A resource can be a computer, user, or group
Authentication can be done locally within the file server
Local authentication of users requires the user to be configured in the file server
This does not scale and is not manageable because every user must be configured in every file server
Only works in small environments
Authentication can be done remotely with a directory server
Directory server contains names, profile information, and machine addresses of every user and resource on the network
It is used to manage user accounts and network permissions
Upon user access the file server validates user credentials with the directory server
Directory servers are Active Directory, LDAP, etc.
[Figure: 1. the user connects to the NAS file server; 2. the file server validates credentials with the directory server; 3. access is granted]

Authentication processing

Without Kerberos (clients, file server):
1. User Auth. Request
2. Verify Auth. Request
3. Response
4. Response

With Kerberos:
1. User Auth. Request
2. Granted Kerberos Ticket
3. Kerberos Ticket
4. Response


Authentication by protocol
Authentication for NFS is based on host names or IP addresses
Allowed host names are typically configured on a per-share basis
Network Information Service (NIS) can be used to group host names
Allows single point of maintenance for host groups
NFS also supports user authentication via Kerberos
Authentication for CIFS is based on the Security Identifier (SID)
Each resource within a domain has a unique SID
SIDs are known to the directory server
Authentication process validates the SID with the directory server


Authorization takes place after authentication
Authorization validates access permissions of user/group to directories/files
Each shared file and directory has access permissions
Access permissions are also called Access Control Lists (ACLs)
Upon file access the file server matches user credentials against the file ACL and validates the level of access
In Unix (NFS) access permissions are simple: r-w-x for owner, group, others
Example for user MIA in group Users:
Name | ACL | Owner | Group
Myfile.txt | rwx r-- --- | MIA | root
In Windows the ACLs are more complex
One user can be in multiple groups
ACLs include change, append and attribute operation permissions
ACLs can be inherited from parent folders
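
The Unix owner/group/others check can be sketched using the Myfile.txt example from this slide. The function name and the extra users are hypothetical, and the sketch ignores details such as the superuser and extended ACLs.

```python
def is_allowed(mode, owner, group, user, user_groups, op):
    """Classic Unix permission check.

    mode is a 9-character string such as "rwxr-----";
    op is one of "r", "w", "x".
    Exactly one triplet applies: owner first, then group, then others.
    """
    if user == owner:
        triplet = mode[0:3]
    elif group in user_groups:
        triplet = mode[3:6]
    else:
        triplet = mode[6:9]
    return op in triplet


# Myfile.txt: rwx r-- ---, owner MIA, group root
mode, owner, group = "rwxr-----", "MIA", "root"

mia_can_write = is_allowed(mode, owner, group, "MIA", {"Users"}, "w")
admin_can_read = is_allowed(mode, owner, group, "admin", {"root"}, "r")    # hypothetical user
guest_can_read = is_allowed(mode, owner, group, "guest", {"Users"}, "r")   # hypothetical user
```

MIA matches the owner triplet, a member of group root matches the group triplet, and everyone else falls through to the empty "others" triplet.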


Quotas are used to restrict certain aspects of the file system (share) usage:
Number of files
Set by the administrator for specific user, group, file system, or file-set
Hard quota: when reached, writing files is denied
Soft quota: user or administrator may get a warning, but can continue to store files
Grace period may start during which the user can continue to store files
When a hard quota is reached, the user cannot store any more files
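
The interplay of soft quota, grace period, and hard quota can be sketched as below. The class, the field names, and the numbers are hypothetical, and the rule that writes are denied once the grace period has expired is an assumption about typical behavior, not something stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quota:
    soft_limit: int                            # warning threshold
    hard_limit: int                            # absolute ceiling
    grace_seconds: int                         # assumed grace period length
    soft_exceeded_at: Optional[float] = None   # when the grace period started


def check_write(quota, used, size, now):
    """Return "ok", "warning", or "denied" for a write of `size` units."""
    new_used = used + size
    if new_used > quota.hard_limit:
        return "denied"                        # hard quota: writing is denied
    if new_used > quota.soft_limit:
        if quota.soft_exceeded_at is None:
            quota.soft_exceeded_at = now       # grace period starts
        if now - quota.soft_exceeded_at > quota.grace_seconds:
            return "denied"                    # assumption: expired grace acts like a hard quota
        return "warning"                       # soft quota: warn, but allow
    quota.soft_exceeded_at = None              # back under the soft limit
    return "ok"


q = Quota(soft_limit=100, hard_limit=200, grace_seconds=60)
results = [
    check_write(q, used=50, size=10, now=0),     # below soft limit
    check_write(q, used=95, size=10, now=0),     # crosses soft limit
    check_write(q, used=105, size=10, now=30),   # still within grace period
    check_write(q, used=115, size=10, now=100),  # grace period expired
]
```

The same check works whether the counted unit is capacity or number of files.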


Data availability techniques

Layers of the NAS system and their availability techniques:
NAS protocol
NAS services: clustered NAS services
File system: replication, data striping
Storage network: redundant storage networks
Disk system: RAID protection on disk subsystem


Data protection

Against what must the data be protected?
Techniques to protect against operational errors (changes, deletion, virus)
Techniques to protect against disaster (complete failure of computer center)
Replication / Mirroring


Snapshots freeze the state of a file system or subset at a certain point in time
Data resides in the same file server
Snapshots are typically space efficient
At the time of the snapshot almost no capacity is consumed
Changes in the file system cause snapshots to grow
Snapshot management
Snapshots can be scheduled by date and time
Snapshots can be deleted automatically based on rules
Snapshots can be mounted and accessed by users
Read-only, no changes allowed
Snapshots are used to recover deleted or changed files
Also used for other background operations (backup, replication, etc.)
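
Why a snapshot costs almost nothing at creation and grows only with changes can be shown with a toy copy-on-write model. The class and paths are hypothetical, and the sketch preserves whole files where real file systems preserve blocks.

```python
class FileSystem:
    """Toy file system with copy-on-write snapshots (per file, not per block)."""

    def __init__(self):
        self.live = {}       # path -> current content
        self.snapshots = {}  # snapshot name -> {path: content at snapshot time}

    def snapshot(self, name):
        # Nothing is copied at creation, so almost no capacity is consumed.
        self.snapshots[name] = {}

    def _preserve(self, path):
        # Before the first change to a path, save its old content into every
        # snapshot that has not saved it yet. None means "did not exist".
        for saved in self.snapshots.values():
            if path not in saved:
                saved[path] = self.live.get(path)

    def write(self, path, data):
        self._preserve(path)
        self.live[path] = data

    def delete(self, path):
        self._preserve(path)
        self.live.pop(path, None)

    def read_snapshot(self, name, path):
        saved = self.snapshots[name]
        # Unchanged files are still read from the live file system.
        return saved[path] if path in saved else self.live.get(path)


fs = FileSystem()
fs.write("/share/a.txt", "v1")
fs.snapshot("monday")
fs.write("/share/a.txt", "v2")   # snapshot grows by one preserved file
fs.delete("/share/a.txt")        # deleted in the live file system only
recovered = fs.read_snapshot("monday", "/share/a.txt")
```

The deleted file is recovered from the snapshot even though it no longer exists in the live file system.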


Backup and Recovery Techniques

Backup from shares
Backup client runs on every user workstation or on dedicated server(s)
Backs up the files in the file system
Dedicated servers provide more scalability
Recovery can be done by the backup client
Can be done by user

Integration in file server
Backup clients run on the file server
Back up files from file systems
Leverage fast scan process
Recovery can be done by the backup client internally to the file server
Usually administrative effort

General concern: long file scan time


Backup & Recovery considerations

Typically full, differential and incremental backups are supported
Full: entire file system or share
Differential: all changes since last full backup
Incremental: all changes since last full or differential backup
Typical requirements
Backup window: depends on number of files and speed of identification process
Recovery time: depends on number of files to be recovered and backup medium
Recovery point: depends on frequency of backups
Scalability of backup and recovery depends on:
Number of parallel backup clients
Network and storage medium
Scan time
Use file level backup to recover files or subset of files
Full system recovery (disaster) may take a long time because restore is typically
on a file level
Consider other techniques for disaster recovery (replication, etc.)
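
The three backup types differ only in the reference point used to decide which files have changed. A minimal sketch, assuming modification times are the change indicator and using hypothetical file names and timestamps:

```python
def select_files(files, kind, last_full, last_full_or_diff):
    """Return the paths a backup run of the given kind would copy.

    files: dict of path -> modification time
    last_full: time of the last full backup
    last_full_or_diff: time of the most recent full or differential backup
    """
    if kind == "full":
        return sorted(files)                    # entire file system or share
    if kind == "differential":
        since = last_full                       # all changes since last full
    elif kind == "incremental":
        since = last_full_or_diff               # changes since last full or differential
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(p for p, mtime in files.items() if mtime > since)


files = {"a.txt": 10, "b.txt": 25, "c.txt": 40}   # hypothetical mtimes
full = select_files(files, "full", last_full=20, last_full_or_diff=30)
diff = select_files(files, "differential", last_full=20, last_full_or_diff=30)
incr = select_files(files, "incremental", last_full=20, last_full_or_diff=30)
```

Every variant still has to visit each file to read its modification time, which is why scan time dominates the backup window on large file systems.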

Network Data Management Protocol

Standardized protocol facilitating backup and recovery for NAS servers
Comprises three services:
Data service: performs the backup and recovery operation in the NAS server
Tape service: writes and reads data to the backup storage medium (disk or tape)
Data management service (DMA): controls backup and restore operations and
[Figure: NDMP data movement between NDMP client and file server]
(Source: Storage networks explained,
Troppens et al, John Wiley & Sons, Ltd, 2009)


NDMP Backup Functions

Files are backed up in NDMP data stream to tape service
Streaming provides higher performance
Meta information is passed to DMA
NDMP supports full and incremental backup for file systems
NDMP supports file system and file level recovery
File system recovery is fast because of streaming
File level recovery is based on direct access recovery where the
NDMP client keeps track of position of file within NDMP data stream
NDMP version 5 supports compression, encryption and
Recovery of entire file systems is faster than with file-level backup
Instead of single files, entire containers of files (streams) are recovered
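
The idea behind direct access recovery, keeping track of each file's position within the data stream, can be sketched as follows. This is not the NDMP wire format: the functions and the simple offset/length index are assumptions for illustration.

```python
import io


def backup_stream(files):
    """Concatenate files into one stream and record where each one starts.

    files: dict of name -> bytes
    Returns (stream_bytes, index) where index maps name -> (offset, length).
    """
    stream = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (stream.tell(), len(data))  # position within the stream
        stream.write(data)
    return stream.getvalue(), index


def restore_file(stream_bytes, index, name):
    """File-level recovery: seek directly to the file instead of reading everything."""
    offset, length = index[name]
    stream = io.BytesIO(stream_bytes)
    stream.seek(offset)
    return stream.read(length)


files = {"a.txt": b"alpha", "b.txt": b"bravo!"}   # hypothetical content
stream_bytes, index = backup_stream(files)
restored = restore_file(stream_bytes, index, "b.txt")
```

Full file system recovery simply replays the whole stream sequentially, which is why it benefits most from streaming.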


NDMP for data migration (copy)

NDMP can be used for data migration (copy) between file servers
One file server is the source and runs the data service (DS)
Another file server is the target and runs the tape service (TS) and data service
DMA runs externally or on either of the file servers
Source file server (DS) collects the files and attributes and streams them to the target file server (TS)
Target file server (TS) receives the stream and the DS unpacks it
General issue: data format within the NDMP stream is not standardized
[Figure: NDMP control and data stream between source file server and target file server]


Replication / Mirroring
Copy files from one NAS system to another for disaster protection
Copy can be done on storage system or file system layer
File system replication typically allows faster recovery (failover)
File system replication is more consistent because it has awareness of files
Storage system replication is typically faster and can also be fully synchronous on block level
Typically asynchronous methods are used in a NAS environment
Multi-site and multi-directional replication scenarios are possible
[Figure: replication between Site 1, Site 2, and Site 3]


Replication considerations
Data reduction techniques (compression, de-duplication) help to overcome replication bandwidth challenges
Encryption helps to provide secure data transmission between sites
Main requirements for disaster recovery
Recovery time objective: how long does it take to recover from a disaster
Recovery point objective: how much data will be lost
Recovery process typically involves administrative measures at the target site
Should be well documented and trained


Virus protection
Scanning on file shares
Scanner running on user workstation has limited scalability
Dedicated scanner(s) are more scalable
Need access to shares
File identification may become the bottleneck

Scanner integration with file server
Leverage integrated file identification techniques for bulk file scan
Enables scan on-demand
On file access
After file write
Scales with number of scan servers

Scanner in file server
Scanner runs in file server
Leverages integrated file identification techniques for bulk file scan
Enables scan on-demand
On file access
After file write
Limited scalability and scanner support

Tiered storage
Important Information Lifecycle Management technique
Initial placement of files on the most appropriate storage medium
Policy based migration during the lifetime of the files
Keep the files in the original name space to allow transparent access
Supports the idea of archiving
For data at rest which needs to be kept for long periods of time
Integration of ILM functions in file server provides cost efficiency
No extra infrastructure required
Central administration in concert with other functions
Automated tiering

[Figure: file server with storage tiers ordered by performance and cost: Gold, Silver, Bronze, Tape]
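
Policy-based migration can be sketched as a rule that maps file age to a tier while the file's path stays the same. The tier names follow the figure; the age thresholds, function names, and example paths are hypothetical.

```python
# Assumed policy: maximum days since last access for each tier, fastest first.
TIERS = [(30, "gold"), (90, "silver"), (365, "bronze")]


def choose_tier(days_since_access):
    """Return the tier a file belongs on according to the assumed policy."""
    for max_age, tier in TIERS:
        if days_since_access <= max_age:
            return tier
    return "tape"  # data at rest that is kept for long periods


def migrate(placement, ages):
    """Re-evaluate every file; paths (the name space) never change, only the tier."""
    return {path: choose_tier(ages[path]) for path in placement}


placement = {"/share/new.doc": "gold", "/share/old.doc": "gold"}
ages = {"/share/new.doc": 5, "/share/old.doc": 400}
placement = migrate(placement, ages)
```

Because the dictionary keys are unchanged after migration, users keep accessing files under the original names, which is the transparent access described above.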

Scale out
Scale out NAS systems can scale in multiple dimensions
Horizontally: performance and throughput provided by interface nodes
Vertically: storage capacity provided by storage systems
Provide single name space across multiple processing nodes (interfaces)
Workload is distributed across the components (interface and storage nodes)
Centrally managed and maintained
[Figure: one global namespace spanning interface nodes (performance) and storage systems]


Traditional NAS vs. Scale Out NAS

Traditional NAS
A few traditional NAS challenges:
Scale performance and capacity with number of disks and file servers
Adding file servers leads to fragmented data, hot spots and underutilized disks
More complex to manage multiple NAS appliances
Operational costs grow
[Figure: files 1 to 3 each served by a separate file server with its own storage island]

Scale Out NAS
Goal of Scale Out NAS:
Scale performance and capacity with interface and storage nodes
Very high aggregate performance through
Greatly simplified management because it is one system
Provides operational cost reduction
[Figure: interface nodes 1 to n form a single large virtual server on shared storage, including automated storage tiering]


Provide file space for multiple tenants
Multiple departments / organizations in one company
Multiple customers hosted by service provider
Separation of networks
Separation of interface nodes
Separation of storage
Separation of administration
Separation of authentication
Different protection concepts
Reporting and chargeback, etc.
Simple solution: provide one file server per tenant
Complex to administer and to maintain
More costly due to underutilized disk space and complexity
Combine multi-tenancy with scale-out NAS
[Figure: tenants (user, dept., client) with backup and antivirus services]


Multi-tenancy and scale-out

Scale-out system provides scalability in multiple dimensions
Scale-out system allows dynamic allocation and de-allocation of resources
Scale-out system can be centrally managed and maintained
Scale-out systems help save cost in a multi-tenant environment
[Figure: tenants (user, dept., client) sharing the interface nodes and storage systems of one scale-out system]


File cloning
A clone is a writable, space-efficient copy of an individual file
Clones point back to the parent file
Only modified blocks from clone are stored
Parent file is read-only, cannot be modified as long as a clone exists
Multiple clones of the same file can exist
Clones of clones are possible as well
Clones can only be created within the file server
Use case: Provision virtual machines from the same base image

[Figure: a source file becomes the parent file of several clone files]
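
A clone that stores only its modified blocks and points back to its parent can be sketched like this. The classes and the three-block "base image" are hypothetical, and the sketch does not enforce that a clone which itself has clones becomes read-only.

```python
class ParentFile:
    """Read-only parent: its blocks cannot be modified while clones exist."""

    def __init__(self, blocks):
        self._blocks = tuple(blocks)  # a tuple cannot be changed in place

    def read(self, i):
        return self._blocks[i]


class Clone:
    """Writable copy that stores only the blocks it has modified."""

    def __init__(self, parent):
        self.parent = parent    # clones point back to the parent file
        self.modified = {}      # block index -> new data

    def read(self, i):
        if i in self.modified:
            return self.modified[i]
        return self.parent.read(i)  # unmodified blocks come from the parent

    def write(self, i, data):
        self.modified[i] = data


# Use case from above: provision virtual machines from the same base image.
base = ParentFile([b"boot", b"os", b"apps"])
vm1 = Clone(base)
vm2 = Clone(base)
vm1.write(2, b"vm1-apps")
vm1_child = Clone(vm1)  # clones of clones are possible as well
```

Each virtual machine consumes space only for the blocks it changes, while the shared base image is stored once.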


Additional Resources for more information on NAS

Wikipedia's Definition of NAS:
InfoStor: NAS Advantages: a VAR's View: 4/news-analysis-trends/nas-advantages-a-vars-view.html
SAN vs. NAS: What's the difference?
Introduction to NAS: Network Attached Storage
FreeBSD Handbook: What is NFS?
CodeFX: CIFS Explained
Wikipedia's Definition of Samba:
SearchStorage: Using NAS NFS with VMware ESX Technology: Pros and Cons (Requires Account Creation to view)

Question and Answer


This information is provided on an "AS IS" basis without warranty of any kind, express or implied, including, but not
limited to, the implied warranties of merchantability and fitness for a particular purpose. Some jurisdictions do not allow
disclaimers of express or implied warranties in certain transactions; therefore, this statement may not apply to you.
This information is provided for information purposes only as a high level overview of possible future products.
Important notes:
IBM reserves the right to change product specifications and offerings at any time without notice. This publication could
include technical inaccuracies or typographical errors. References herein to IBM products and services do not imply that
IBM intends to make them available in all countries.
IBM makes no warranties, express or implied, regarding non-IBM products and services, including but not limited to Year
2000 readiness and any implied warranties of merchantability and fitness for a particular purpose. IBM makes no
representations or warranties with respect to non-IBM products. Warranty, service and support for non-IBM products is
provided directly to you by the third party, not IBM.
All part numbers referenced in this publication are product part numbers and not service part numbers. Other part
numbers in addition to those listed in this document may be required to support a specific device or function.
MHz / GHz only measures microprocessor internal clock speed; many factors may affect application performance. When
referring to storage capacity, GB stands for one billion bytes; accessible capacity may be less. Maximum internal hard
disk drive capacities assume the replacement of any standard hard disk drives and the population of all hard disk drive
bays with the largest currently supported drives available from IBM.
IBM Information and Trademarks
The following terms are trademarks or registered trademarks of the IBM Corporation in the United States or other
countries or both: the e-business logo, IBM, System Storage, Easy Tier, FlashCopy, and System Storage DS.
Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the
U.S. and other countries.
Intel, Pentium 4 and Xeon are trademarks or registered trademarks of Intel Corporation. Microsoft Windows is a
trademark or registered trademark of Microsoft Corporation. Linux is a registered trademark of Linus Torvalds. Other
company, product, and service names may be trademarks or service marks of others.
