
Contents

Azure Storage Documentation


Overview
Introduction
Choose Blobs, Files, or Disks
Blobs
Data Lake Storage Gen2
Files
Queues
Tables
Disks
Windows
Linux
Introduction to Azure Storage
5/28/2019 • 12 minutes to read

Azure Storage is Microsoft's cloud storage solution for modern data storage scenarios. Azure Storage offers a
massively scalable object store for data objects, a file system service for the cloud, a messaging store for reliable
messaging, and a NoSQL store. Azure Storage is:
Durable and highly available. Redundancy ensures that your data is safe in the event of transient hardware
failures. You can also opt to replicate data across datacenters or geographical regions for additional protection
from local catastrophe or natural disaster. Data replicated in this way remains highly available in the event of an
unexpected outage.
Secure. All data written to Azure Storage is encrypted by the service. Azure Storage provides you with fine-
grained control over who has access to your data.
Scalable. Azure Storage is designed to be massively scalable to meet the data storage and performance needs
of today's applications.
Managed. Microsoft Azure handles hardware maintenance, updates, and critical issues for you.
Accessible. Data in Azure Storage is accessible from anywhere in the world over HTTP or HTTPS. Microsoft
provides SDKs for Azure Storage in a variety of languages -- .NET, Java, Node.js, Python, PHP, Ruby, Go, and
others -- as well as a mature REST API. Azure Storage supports scripting in Azure PowerShell or Azure CLI.
And the Azure portal and Azure Storage Explorer offer easy visual solutions for working with your data.

Azure Storage services


Azure Storage includes these data services:
Azure Blobs: A massively scalable object store for text and binary data.
Azure Files: Managed file shares for cloud or on-premises deployments.
Azure Queues: A messaging store for reliable messaging between application components.
Azure Tables: A NoSQL store for schemaless storage of structured data.
Each service is accessed through a storage account. To get started, see Create a storage account.

Blob storage
Azure Blob storage is Microsoft's object storage solution for the cloud. Blob storage is optimized for storing
massive amounts of unstructured data, such as text or binary data.
Blob storage is ideal for:
Serving images or documents directly to a browser.
Storing files for distributed access.
Streaming video and audio.
Storing data for backup and restore, disaster recovery, and archiving.
Storing data for analysis by an on-premises or Azure-hosted service.
Objects in Blob storage can be accessed from anywhere in the world via HTTP or HTTPS. Users or client
applications can access blobs via URLs, the Azure Storage REST API, Azure PowerShell, Azure CLI, or an Azure
Storage client library. The storage client libraries are available for multiple languages, including .NET, Java, Node.js,
Python, PHP, and Ruby.
For more information about Blob storage, see Introduction to Blob storage.
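The following is a minimal sketch of that kind of programmatic access, assuming the azure-storage-blob (v12) Python client library; the connection string, container name, and blob name are placeholders rather than values from this article.

```python
# Hypothetical sketch using the azure-storage-blob (v12) Python package.
# The connection string, container, and blob names are placeholders.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<your-connection-string>")
container = service.get_container_client("images")
container.create_container()                    # one-time setup; raises if the container already exists

blob = container.get_blob_client("photo.jpg")
with open("photo.jpg", "rb") as data:
    blob.upload_blob(data, overwrite=True)      # HTTPS PUT against the blob endpoint

downloaded = blob.download_blob().readall()     # read the same object back as bytes
print(len(downloaded), "bytes read from", blob.url)
```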

Azure Files
Azure Files enables you to set up highly available network file shares that can be accessed by using the standard
Server Message Block (SMB) protocol. That means that multiple VMs can share the same files with both read and
write access. You can also read the files using the REST interface or the storage client libraries.
One thing that distinguishes Azure Files from files on a corporate file share is that you can access the files from
anywhere in the world using a URL that points to the file and includes a shared access signature (SAS) token. You
can generate SAS tokens that allow specific access to a private asset for a specific amount of time.
File shares can be used for many common scenarios:
Many on-premises applications use file shares. This feature makes it easier to migrate those applications
that share data to Azure. If you mount the file share to the same drive letter that the on-premises application
uses, the part of your application that accesses the file share should work with minimal, if any, changes.
Configuration files can be stored on a file share and accessed from multiple VMs. Tools and utilities used by
multiple developers in a group can be stored on a file share, ensuring that everybody can find them, and that
they use the same version.
Diagnostic logs, metrics, and crash dumps are just three examples of data that can be written to a file share
and processed or analyzed later.
At this time, Active Directory-based authentication and access control lists (ACLs) are not supported, but they will
be at some time in the future. The storage account credentials are used to provide authentication for access to the
file share. This means anybody with the share mounted will have full read/write access to the share.
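As a hedged illustration of reading and writing share content through the REST interface, the following sketch assumes the azure-storage-file-share (v12) Python package; the connection string, share name, and file name are placeholders.

```python
# Hypothetical sketch using the azure-storage-file-share (v12) Python package.
# The connection string, share name, and file name are placeholders.
from azure.storage.fileshare import ShareClient

share = ShareClient.from_connection_string("<your-connection-string>", share_name="configs")
share.create_share()                            # one-time setup

file_client = share.get_file_client("settings.json")
file_client.upload_file(b'{"feature_flag": true}')   # creates or overwrites the file via the File REST API

print(file_client.download_file().readall())    # the same bytes a VM sees when it mounts the share over SMB
```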
For more information about Azure Files, see Introduction to Azure Files.

Queue storage
The Azure Queue service is used to store and retrieve messages. Queue messages can be up to 64 KB in size, and a
queue can contain millions of messages. Queues are generally used to store lists of messages to be processed
asynchronously.
For example, say you want your customers to be able to upload pictures, and you want to create thumbnails for
each picture. You could have your customer wait for you to create the thumbnails while uploading the pictures. An
alternative would be to use a queue. When the customer finishes their upload, write a message to the queue. Then
have an Azure Function retrieve the message from the queue and create the thumbnails. Each of the parts of this
processing can be scaled separately, giving you more control when tuning it for your usage.
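The following is a minimal sketch of that producer/consumer pattern, assuming the azure-storage-queue (v12) Python package; the connection string, queue name, and message content are placeholders, and the consumer loop stands in for the Azure Function.

```python
# Hypothetical sketch using the azure-storage-queue (v12) Python package.
# The connection string, queue name, and message content are placeholders.
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string("<your-connection-string>", "thumbnail-requests")
queue.create_queue()

# Producer: called when a customer finishes uploading a picture.
queue.send_message("images/photo.jpg")

# Consumer (here standing in for the Azure Function): retrieve, process, then
# delete the message so it is not picked up again.
for message in queue.receive_messages():
    print("would create a thumbnail for", message.content)
    queue.delete_message(message)
```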
For more information about Azure Queues, see Introduction to Queues.

Table storage
Azure Table storage is now part of Azure Cosmos DB. To see Azure Table storage documentation, see the Azure
Table Storage Overview. In addition to the existing Azure Table storage service, there is a new Azure Cosmos DB
Table API offering that provides throughput-optimized tables, global distribution, and automatic secondary
indexes. To learn more and try out the new premium experience, please check out Azure Cosmos DB Table API.
For more information about Table storage, see Overview of Azure Table storage.

Disk storage
An Azure managed disk is a virtual hard disk (VHD). You can think of it like a physical disk in an on-premises
server, but virtualized. Azure managed disks are stored as page blobs, which are a random IO storage object in
Azure. We call a managed disk 'managed' because it is an abstraction over page blobs, blob containers, and Azure
storage accounts. With managed disks, all you have to do is provision the disk, and Azure takes care of the rest.
For more information about managed disks, see Introduction to Azure managed disks.

Types of storage accounts


Azure Storage offers several types of storage accounts. Each type supports different features and has its own
pricing model. Consider these differences before you create a storage account to determine the type of account
that is best for your applications. The types of storage accounts are:
General-purpose v2 accounts: Basic storage account type for blobs, files, queues, and tables. Recommended
for most scenarios using Azure Storage.
General-purpose v1 accounts: Legacy account type for blobs, files, queues, and tables. Use general-purpose
v2 accounts instead when possible.
Block blob storage accounts: Blob-only storage accounts with premium performance characteristics.
Recommended for scenarios with high transaction rates, using smaller objects, or requiring consistently low
storage latency.
FileStorage (preview) storage accounts: Files-only storage accounts with premium performance
characteristics. Recommended for enterprise or high-performance scale applications.
Blob storage accounts: Blob-only storage accounts. Use general-purpose v2 accounts instead when possible.
The following table describes the types of storage accounts and their capabilities:

STORAGE ACCOUNT TYPE | SUPPORTED SERVICES | SUPPORTED PERFORMANCE TIERS | SUPPORTED ACCESS TIERS | REPLICATION OPTIONS | DEPLOYMENT MODEL1 | ENCRYPTION2

General-purpose v2 | Blob, File, Queue, Table, and Disk | Standard, Premium5 | Hot, Cool, Archive3 | LRS, ZRS4, GRS, RA-GRS | Resource Manager | Encrypted
General-purpose v1 | Blob, File, Queue, Table, and Disk | Standard, Premium5 | N/A | LRS, GRS, RA-GRS | Resource Manager, Classic | Encrypted
Block blob storage | Blob (block blobs and append blobs only) | Premium | N/A | LRS | Resource Manager | Encrypted
FileStorage (preview) | Files only | Premium | N/A | LRS | Resource Manager | Encrypted
Blob storage | Blob (block blobs and append blobs only) | Standard | Hot, Cool, Archive3 | LRS, GRS, RA-GRS | Resource Manager | Encrypted

1 Using the Azure Resource Manager deployment model is recommended. Storage accounts using the classic deployment model can still be created in some locations, and existing classic accounts continue to be supported. For more information, see Azure Resource Manager vs. classic deployment: Understand deployment models and the state of your resources.
2 All storage accounts are encrypted using Storage Service Encryption (SSE) for data at rest. For more information, see Azure Storage Service Encryption for Data at Rest.
3 The Archive tier is available at the level of an individual blob only, not at the storage account level. Only block blobs and append blobs can be archived. For more information, see Azure Blob storage: Hot, Cool, and Archive storage tiers.
4 Zone-redundant storage (ZRS) is available only for standard general-purpose v2 storage accounts. For more information about ZRS, see Zone-redundant storage (ZRS): Highly available Azure Storage applications. For more information about other replication options, see Azure Storage replication.
5 Premium performance for general-purpose v2 and general-purpose v1 accounts is available for disk and page blobs only.
For more information about storage account types, see Azure storage account overview.

Securing access to storage accounts


Every request to Azure Storage must be authorized. Azure Storage supports the following authorization methods:
Azure Active Directory (Azure AD) integration for blob and queue data. Azure Storage supports authentication and authorization with Azure AD credentials for the Blob and Queue services via role-based access control (RBAC). Authorizing requests with Azure AD is recommended for superior security and ease of use. For more information, see Authenticate access to Azure blobs and queues using Azure Active Directory.
Azure AD authorization over SMB for Azure Files (preview). Azure Files supports identity-based authorization over SMB (Server Message Block) through Azure Active Directory Domain Services. Your domain-joined Windows virtual machines (VMs) can access Azure file shares using Azure AD credentials. For more information, see Overview of Azure Active Directory authorization over SMB for Azure Files (preview).
Authorization with Shared Key. The Azure Storage Blob, Queue, and Table services and Azure Files support authorization with Shared Key. A client using Shared Key authorization passes a header with every request that is signed using the storage account access key. For more information, see Authorize with Shared Key.
Authorization using shared access signatures (SAS). A shared access signature (SAS) is a string containing a security token that can be appended to the URI for a storage resource. The security token encapsulates constraints such as permissions and the interval of access (see the sketch after this list). For more information, refer to Using Shared Access Signatures (SAS).
Anonymous access to containers and blobs. A container and its blobs may be publicly available. When you specify that a container or blob is public, anyone can read it anonymously; no authentication is required. For more information, see Manage anonymous read access to containers and blobs.
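As an illustration of the SAS approach above, the following sketch generates a read-only, one-hour SAS for a single blob, assuming the azure-storage-blob (v12) Python package; the account name, account key, container, and blob names are placeholders.

```python
# Hypothetical sketch using the azure-storage-blob (v12) Python package.
# The account name, account key, container, and blob names are placeholders.
from datetime import datetime, timedelta
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

sas_token = generate_blob_sas(
    account_name="mystorageaccount",
    container_name="images",
    blob_name="photo.jpg",
    account_key="<account-key>",
    permission=BlobSasPermissions(read=True),        # constrain the permitted operations
    expiry=datetime.utcnow() + timedelta(hours=1),   # constrain the interval of access
)

# Appending the token to the resource URI grants scoped, expiring access.
print(f"https://mystorageaccount.blob.core.windows.net/images/photo.jpg?{sas_token}")
```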

Encryption
There are two basic kinds of encryption available for the Storage services. For more information about security and
encryption, see the Azure Storage security guide.
Encryption at rest
Azure Storage Service Encryption (SSE) at rest helps you protect and safeguard your data to meet your
organizational security and compliance commitments. With this feature, Azure Storage automatically encrypts your
data prior to persisting to storage and decrypts prior to retrieval. The encryption, decryption, and key management
are totally transparent to users.
SSE automatically encrypts data in all performance tiers (Standard and Premium), all deployment models (Azure
Resource Manager and Classic), and all of the Azure Storage services (Blob, Queue, Table, and File). SSE does not
affect Azure Storage performance.
For more information about SSE encryption at rest, see Azure Storage Service Encryption for Data at Rest.
Client-side encryption
The storage client libraries have methods you can call to programmatically encrypt data before sending it across
the wire from the client to Azure. It is stored encrypted, which means it also is encrypted at rest. When reading the
data back, you decrypt the information after receiving it.
For more information about client-side encryption, see Client-Side Encryption with .NET for Microsoft Azure
Storage.

Redundancy
In order to ensure that your data is durable, Azure Storage replicates multiple copies of your data. When you set up
your storage account, you select a redundancy option.
Replication options for a storage account include:
Locally-redundant storage (LRS): A simple, low-cost replication strategy. Data is replicated within a single
storage scale unit.
Zone-redundant storage (ZRS): Replication for high availability and durability. Data is replicated synchronously
across three availability zones.
Geo-redundant storage (GRS): Cross-regional replication to protect against region-wide unavailability.
Read-access geo-redundant storage (RA-GRS): Cross-regional replication with read access to the replica.
For more information about disaster recovery, see Disaster recovery and storage account failover (preview) in
Azure Storage.

Transferring data to and from Azure Storage


You have several options for moving data into or out of Azure Storage. Which option you choose depends on the
size of your dataset and your network bandwidth. For more information, see Choose an Azure solution for data
transfer.

Pricing
For detailed information about pricing for Azure Storage, see the Pricing page.

Storage APIs, libraries, and tools


Azure Storage resources can be accessed by any language that can make HTTP/HTTPS requests. Additionally,
Azure Storage offers programming libraries for several popular languages. These libraries simplify many aspects of
working with Azure Storage by handling details such as synchronous and asynchronous invocation, batching of
operations, exception management, automatic retries, operational behavior, and so forth. Libraries are currently
available for the following languages and platforms, with others in the pipeline:
Azure Storage data API and library references
Storage Services REST API
Storage Client Library for .NET
Storage Client Library for Java/Android
Storage Client Library for Node.js
Storage Client Library for Python
Storage Client Library for PHP
Storage Client Library for Ruby
Storage Client Library for C++
Azure Storage management API and library references
Storage Resource Provider REST API
Storage Resource Provider Client Library for .NET
Storage Service Management REST API (Classic)
Azure Storage data movement API and library references
Storage Import/Export Service REST API
Storage Data Movement Client Library for .NET
Tools and utilities
Azure PowerShell Cmdlets for Storage
Azure CLI Cmdlets for Storage
AzCopy Command-Line Utility
Azure Storage Explorer is a free, standalone app from Microsoft that enables you to work visually with Azure
Storage data on Windows, macOS, and Linux.
Azure Storage Client Tools
Azure Developer Tools

Next steps
To get up and running with Azure Storage, see Create a storage account.
Deciding when to use Azure Blobs, Azure Files, or Azure
Disks
5/21/2019 • 3 minutes to read

Microsoft Azure provides several features in Azure Storage for storing and accessing your data in the cloud. This article covers
Azure Files, Blobs, and Disks, and is designed to help you choose between these features.

Scenarios
The following table compares Files, Blobs, and Disks, and shows example scenarios appropriate for each.

FEATURE | DESCRIPTION | WHEN TO USE

Azure Files | Provides an SMB interface, client libraries, and a REST interface that allows access from anywhere to stored files. | You want to "lift and shift" an application to the cloud which already uses the native file system APIs to share data between it and other applications running in Azure. You want to store development and debugging tools that need to be accessed from many virtual machines.

Azure Blobs | Provides client libraries and a REST interface that allows unstructured data to be stored and accessed at a massive scale in block blobs. Also supports Azure Data Lake Storage Gen2 for enterprise big data analytics solutions. | You want your application to support streaming and random access scenarios. You want to be able to access application data from anywhere. You want to build an enterprise data lake on Azure and perform big data analytics.

Azure Disks | Provides client libraries and a REST interface that allows data to be persistently stored and accessed from an attached virtual hard disk. | You want to lift and shift applications that use native file system APIs to read and write data to persistent disks. You want to store data that is not required to be accessed from outside the virtual machine to which the disk is attached.

Comparison: Files and Blobs


The following table compares Azure Files with Azure Blobs.

Attribute | Azure Blobs | Azure Files

Durability options | LRS, ZRS, GRS, RA-GRS | LRS, ZRS, GRS
Accessibility | REST APIs | REST APIs; SMB 2.1 and SMB 3.0 (standard file system APIs)
Connectivity | REST APIs -- Worldwide | REST APIs -- Worldwide; SMB 2.1 -- Within region; SMB 3.0 -- Worldwide
Endpoints | http://myaccount.blob.core.windows.net/mycontainer/myblob | \\myaccount.file.core.windows.net\myshare\myfile.txt; http://myaccount.file.core.windows.net/myshare/myfile.txt
Directories | Flat namespace | True directory objects
Case sensitivity of names | Case sensitive | Case insensitive, but case preserving
Capacity | Up to 2 PiB account limit | 5 TiB file shares
Throughput | Up to 60 MiB/s per block blob | Up to 60 MiB/s per share
Object size | Up to about 4.75 TiB per block blob | Up to 1 TiB per file
Billed capacity | Based on bytes written | Based on file size
Client libraries | Multiple languages | Multiple languages

Comparison: Files and Disks


Azure Files complements Azure Disks. A disk can only be attached to one Azure Virtual Machine at a time. Disks are fixed-format
VHDs stored as page blobs in Azure Storage, and are used by the virtual machine to store durable data. File shares in Azure Files
can be accessed in the same way as the local disk is accessed (by using native file system APIs), and can be shared across many
virtual machines.
The following table compares Azure Files with Azure Disks.

Attribute | Azure Disks | Azure Files

Scope | Exclusive to a single virtual machine | Shared access across multiple virtual machines
Snapshots and copy | Yes | Yes
Configuration | Connected at startup of the virtual machine | Connected after the virtual machine has started
Authentication | Built-in | Set up with net use
Access using REST | Files within the VHD cannot be accessed | Files stored in a share can be accessed
Max size | 32 TiB disk | 5 TiB file share and 1 TiB file within share
Max IOPS | 20,000 IOPS | 1,000 IOPS
Throughput | Up to 900 MiB/s per disk | Target is 60 MiB/s per file share (can get higher for higher IO sizes)

Next steps
When making decisions about how your data is stored and accessed, you should also consider the costs involved. For more
information, see Azure Storage Pricing.
Some SMB features are not applicable to the cloud. For more information, see Features not supported by the Azure File service.
For more information about disks, see our Introduction to managed disks and How to Attach a Data Disk to a Windows Virtual
Machine.
What is Azure Blob storage?
4/7/2019 • 2 minutes to read

Azure Blob storage is Microsoft's object storage solution for the cloud. Blob storage is optimized for storing
massive amounts of unstructured data. Unstructured data is data that does not adhere to a particular data model or
definition, such as text or binary data.

About Blob storage


Blob storage is designed for:
Serving images or documents directly to a browser.
Storing files for distributed access.
Streaming video and audio.
Writing to log files.
Storing data for backup and restore, disaster recovery, and archiving.
Storing data for analysis by an on-premises or Azure-hosted service.
Users or client applications can access objects in Blob storage via HTTP/HTTPS, from anywhere in the world.
Objects in Blob storage are accessible via the Azure Storage REST API, Azure PowerShell, Azure CLI, or an Azure
Storage client library. Client libraries are available for a variety of languages, including .NET, Java, Node.js, Python,
Go, PHP, and Ruby.

About Azure Data Lake Storage Gen2


Blob storage supports Azure Data Lake Storage Gen2, Microsoft's enterprise big data analytics solution for the
cloud. Azure Data Lake Storage Gen2 offers a hierarchical file system as well as the advantages of Blob storage,
including low-cost, tiered storage; high availability; strong consistency; and disaster recovery capabilities.
For more information about Data Lake Storage Gen2, see Introduction to Azure Data Lake Storage Gen2.

Next steps
Introduction to Azure Blob storage
Introduction to Azure Data Lake Storage Gen2
Introduction to Azure Data Lake Storage Gen2
4/30/2019 • 4 minutes to read

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage.
Data Lake Storage Gen2 is the result of converging the capabilities of our two existing storage services, Azure
Blob storage and Azure Data Lake Storage Gen1. Features from Azure Data Lake Storage Gen1, such as file
system semantics, directory- and file-level security, and scale, are combined with the low-cost, tiered storage and
high availability/disaster recovery capabilities of Azure Blob storage.

Designed for enterprise big data analytics


Data Lake Storage Gen2 makes Azure Storage the foundation for building enterprise data lakes on Azure.
Designed from the start to service multiple petabytes of information while sustaining hundreds of gigabits of
throughput, Data Lake Storage Gen2 allows you to easily manage massive amounts of data.
A fundamental part of Data Lake Storage Gen2 is the addition of a hierarchical namespace to Blob storage. The
hierarchical namespace organizes objects/files into a hierarchy of directories for efficient data access. A common
object store naming convention uses slashes in the name to mimic a hierarchical directory structure. This structure
becomes real with Data Lake Storage Gen2. Operations such as renaming or deleting a directory become single
atomic metadata operations on the directory rather than enumerating and processing all objects that share the
name prefix of the directory.
In the past, cloud-based analytics had to compromise in areas of performance, management, and security. Data
Lake Storage Gen2 addresses each of these aspects in the following ways:
Performance is optimized because you do not need to copy or transform data as a prerequisite for
analysis. The hierarchical namespace greatly improves the performance of directory management
operations, which improves overall job performance.
Management is easier because you can organize and manipulate files through directories and
subdirectories.
Security is enforceable because you can define POSIX permissions on directories or individual files.
Cost effectiveness is made possible as Data Lake Storage Gen2 is built on top of the low-cost Azure Blob
storage. The additional features further lower the total cost of ownership for running big data analytics on
Azure.

Key features of Data Lake Storage Gen2


Hadoop compatible access: Data Lake Storage Gen2 allows you to manage and access data just as you
would with a Hadoop Distributed File System (HDFS). The new ABFS driver is available within all Apache
Hadoop environments, including Azure HDInsight, Azure Databricks, and SQL Data Warehouse, to access
data stored in Data Lake Storage Gen2.
A superset of POSIX permissions: The security model for Data Lake Storage Gen2 supports ACL and POSIX
permissions along with some extra granularity specific to Data Lake Storage Gen2. Settings may be
configured through Storage Explorer or through frameworks like Hive and Spark.
Cost effective: Data Lake Storage Gen2 offers low-cost storage capacity and transactions. As data
transitions through its complete lifecycle, billing rates change, keeping costs to a minimum via built-in
features such as Azure Blob storage lifecycle management.
Optimized driver: The ABFS driver is optimized specifically for big data analytics. The corresponding
REST APIs are surfaced through the endpoint dfs.core.windows.net (a minimal usage sketch follows this list).
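The following is a hedged sketch of working with the hierarchical namespace through the DFS endpoint, assuming the azure-storage-file-datalake (v12) Python package and a storage account with the hierarchical namespace enabled; all names and the connection string are placeholders.

```python
# Hypothetical sketch using the azure-storage-file-datalake (v12) Python package
# against a storage account with the hierarchical namespace enabled.
# The connection string, file system, directory, and file names are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient.from_connection_string("<your-connection-string>")
fs = service.get_file_system_client("analytics")
fs.create_file_system()                          # the Data Lake Storage Gen2 "file system" (container)

directory = fs.create_directory("raw/2019-05")   # a real directory, not just a name prefix
file_client = directory.create_file("events.json")
data = b'{"event": "page_view"}'
file_client.append_data(data, offset=0, length=len(data))
file_client.flush_data(len(data))                # commit the appended data

# Renaming the directory is a single atomic metadata operation, not a copy of
# every object sharing the prefix.
directory.rename_directory("analytics/processed-2019-05")
```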
Scalability
Azure Storage is scalable by design whether you access via Data Lake Storage Gen2 or Blob storage interfaces. It
is able to store and serve many exabytes of data. This amount of storage is available with throughput measured in
gigabits per second (Gbps) at high levels of input/output operations per second (IOPS). Beyond just persistence,
processing is executed at near-constant per-request latencies that are measured at the service, account, and file
levels.
Cost effectiveness
One of the many benefits of building Data Lake Storage Gen2 on top of Azure Blob storage is the low cost of
storage capacity and transactions. Unlike other cloud storage services, data stored in Data Lake Storage Gen2 is
not required to be moved or transformed prior to performing analysis. For more information about pricing, see
Azure Storage pricing.
Additionally, features such as the hierarchical namespace significantly improve the overall performance of many
analytics jobs. This improvement in performance means that you require less compute power to process the same
amount of data, resulting in a lower total cost of ownership (TCO) for the end-to-end analytics job.
One service, multiple concepts
Data Lake Storage Gen2 is an additional capability for big data analytics, built on top of Azure Blob storage. While
there are many benefits in leveraging existing platform components of Blobs to create and operate data lakes for
analytics, it does lead to multiple concepts describing the same, shared things.
The following are the equivalent entities, as described by different concepts. Unless specified otherwise these
entities are directly synonymous:

CONCEPT | TOP LEVEL ORGANIZATION | LOWER LEVEL ORGANIZATION | DATA CONTAINER

Blobs – General purpose object storage | Container | Virtual directory (SDK only – does not provide atomic manipulation) | Blob
ADLS Gen2 – Analytics storage | File system | Directory | File

Supported open source platforms


Several open source platforms support Data Lake Storage Gen2. Those platforms appear in the following table.

NOTE
Only the versions that appear in this table are supported.

PLATFORM | SUPPORTED VERSION(S) | MORE INFORMATION

HDInsight | 3.6+ | What are the Apache Hadoop components and versions available with HDInsight?
Hadoop | 3.2+ | Apache Hadoop releases archive
Cloudera | 6.1+ | Cloudera Enterprise 6.x release notes
Azure Databricks | 5.1+ | Databricks Runtime versions
Hortonworks | 3.1.x+ | Configuring cloud data access

Next steps
The following articles describe some of the main concepts of Data Lake Storage Gen2 and detail how to store,
access, manage, and gain insights from your data:
Hierarchical namespace
Create a storage account
Use a Data Lake Storage Gen2 account in Azure Databricks
What is Azure Files?
4/23/2019 • 3 minutes to read

Azure Files offers fully managed file shares in the cloud that are accessible via the industry standard Server
Message Block (SMB) protocol. Azure file shares can be mounted concurrently by cloud or on-premises
deployments of Windows, Linux, and macOS. Additionally, Azure file shares can be cached on Windows Servers
with Azure File Sync for fast access near where the data is being used.

Videos
Introducing Azure File Sync (2 m)
Azure Files with Sync (Ignite 2017) (85 m)

Why Azure Files is useful


Azure file shares can be used to:
Replace or supplement on-premises file servers:
Azure Files can be used to completely replace or supplement traditional on-premises file servers or NAS
devices. Popular operating systems such as Windows, macOS, and Linux can directly mount Azure file
shares wherever they are in the world. Azure file shares can also be replicated with Azure File Sync to
Windows Servers, either on-premises or in the cloud, for performance and distributed caching of the data
where it's being used.
"Lift and shift" applications:
Azure Files makes it easy to "lift and shift" applications to the cloud that expect a file share to store file
application or user data. Azure Files enables both the "classic" lift and shift scenario, where both the
application and its data are moved to Azure, and the "hybrid" lift and shift scenario, where the application
data is moved to Azure Files, and the application continues to run on-premises.
Simplify cloud development:
Azure Files can also be used in numerous ways to simplify new cloud development projects. For example:
Shared application settings:
A common pattern for distributed applications is to have configuration files in a centralized location
where they can be accessed from many application instances. Application instances can load their
configuration through the File REST API, and humans can access them as needed by mounting the
SMB share locally.
Diagnostic share:
An Azure file share is a convenient place for cloud applications to write their logs, metrics, and crash
dumps. Logs can be written by the application instances via the File REST API, and developers can
access them by mounting the file share on their local machine (see the sketch after this list). This enables
great flexibility, as developers can embrace cloud development without having to abandon any existing
tooling they know and love.
Dev/Test/Debug:
When developers or administrators are working on VMs in the cloud, they often need a set of tools
or utilities. Copying such utilities and tools to each VM can be a time consuming exercise. By
mounting an Azure file share locally on the VMs, a developer and administrator can quickly access
their tools and utilities, no copying required.
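As a hedged sketch of the diagnostic-share scenario above, the following lists and reads log files that application instances wrote to a share, assuming the azure-storage-file-share (v12) Python package; the connection string, share name, and directory name are placeholders.

```python
# Hypothetical sketch using the azure-storage-file-share (v12) Python package.
# The connection string, share name, and directory name are placeholders, and
# the "logs" directory is assumed to have been created by the application.
from azure.storage.fileshare import ShareDirectoryClient

logs_dir = ShareDirectoryClient.from_connection_string(
    "<your-connection-string>", share_name="diagnostics", directory_path="logs"
)

for item in logs_dir.list_directories_and_files():
    if not item["is_directory"]:                 # skip subdirectories, download only files
        content = logs_dir.get_file_client(item["name"]).download_file().readall()
        print(item["name"], len(content), "bytes")
```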

Key benefits
Shared access. Azure file shares support the industry standard SMB protocol, meaning you can seamlessly
replace your on-premises file shares with Azure file shares without worrying about application compatibility.
Being able to share a file system across multiple machines, applications/instances is a significant advantage
with Azure Files for applications that need shareability.
Fully managed. Azure file shares can be created without the need to manage hardware or an OS. This means
you don't have to deal with patching the server OS with critical security upgrades or replacing faulty hard
disks.
Scripting and tooling. PowerShell cmdlets and Azure CLI can be used to create, mount, and manage Azure
file shares as part of the administration of Azure applications. You can also create and manage Azure file shares
using the Azure portal and Azure Storage Explorer.
Resiliency. Azure Files has been built from the ground up to be always available. Replacing on-premises file
shares with Azure Files means you no longer have to wake up to deal with local power outages or network
issues.
Familiar programmability. Applications running in Azure can access data in the share via file system I/O
APIs. Developers can therefore leverage their existing code and skills to migrate existing applications. In
addition to System IO APIs, you can use Azure Storage Client Libraries or the Azure Storage REST API.

Next Steps
Create Azure file Share
Connect and mount on Windows
Connect and mount on Linux
Connect and mount on macOS
What are Azure Queues?
6/6/2019 • 2 minutes to read

Azure Queue storage is a service for storing large numbers of messages. You access messages from anywhere in
the world via authenticated calls using HTTP or HTTPS. A queue message can be up to 64 KB in size. A queue may
contain millions of messages, up to the total capacity limit of a storage account.

Common uses
Common uses of Queue storage include:
Creating a backlog of work to process asynchronously
Passing messages from an Azure web role to an Azure worker role

Queue service concepts


The Queue service contains the following components:

URL format: Queues are addressable using the following URL format:
https://<storage account>.queue.core.windows.net/<queue>

For example, the following URL addresses a queue named images-to-download:


https://myaccount.queue.core.windows.net/images-to-download

Storage account: All access to Azure Storage is done through a storage account. See Azure Storage
Scalability and Performance Targets for details about storage account capacity.
Queue: A queue contains a set of messages. The queue name must be all lowercase. For information on
naming queues, see Naming Queues and Metadata.
Message: A message, in any format, of up to 64 KB. Before version 2017-07-29, the maximum time-to-live
allowed is seven days. For version 2017-07-29 or later, the maximum time-to-live can be any positive
number, or -1 indicating that the message doesn't expire. If this parameter is omitted, the default time-to-
live is seven days (see the sketch after this list).
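The following is a minimal sketch that sends a non-expiring message to the queue addressed above, assuming the azure-storage-queue (v12) Python package and service version 2017-07-29 or later; the connection string is a placeholder.

```python
# Hypothetical sketch using the azure-storage-queue (v12) Python package.
# The connection string is a placeholder; the queue name matches the URL above.
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string("<your-connection-string>", "images-to-download")
queue.create_queue()

# time_to_live=-1 asks for a message that never expires (service version
# 2017-07-29 or later); omit it to get the default seven days.
queue.send_message("https://myaccount.blob.core.windows.net/images/photo.jpg",
                   time_to_live=-1)
```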

Next steps
Create a storage account
Getting started with Queues using .NET
Introduction to Table storage in Azure
1/30/2019 • 3 minutes to read

TIP
The content in this article applies to the original Azure Table storage. However, there is now a premium offering for table
storage, the Azure Cosmos DB Table API, which offers throughput-optimized tables, global distribution, and automatic
secondary indexes. There are some feature differences between the Table API in Azure Cosmos DB and Azure Table storage.
To learn more and try out the premium experience, please check out Azure Cosmos DB Table API.

Azure Table storage is a service that stores structured NoSQL data in the cloud, providing a key/attribute store
with a schemaless design. Because Table storage is schemaless, it's easy to adapt your data as the needs of your
application evolve. Access to Table storage data is fast and cost-effective for many types of applications, and is
typically lower in cost than traditional SQL for similar volumes of data.
You can use Table storage to store flexible datasets like user data for web applications, address books, device
information, or other types of metadata your service requires. You can store any number of entities in a table, and
a storage account may contain any number of tables, up to the capacity limit of the storage account.

What is Table storage


Azure Table storage stores large amounts of structured data. The service is a NoSQL datastore which accepts
authenticated calls from inside and outside the Azure cloud. Azure tables are ideal for storing structured, non-
relational data. Common uses of Table storage include:
Storing TBs of structured data capable of serving web scale applications
Storing datasets that don't require complex joins, foreign keys, or stored procedures and can be denormalized
for fast access
Quickly querying data using a clustered index
Accessing data using the OData protocol and LINQ queries with WCF Data Service .NET Libraries
You can use Table storage to store and query huge sets of structured, non-relational data, and your tables will
scale as demand increases.

Table storage concepts


Table storage contains the following components:

URL format: Azure Table Storage accounts use this format:


http://<storage account>.table.core.windows.net/<table>

Azure Cosmos DB Table API accounts use this format:


http://<storage account>.table.cosmosdb.azure.com/<table>

You can address Azure tables directly using this address with the OData protocol. For more information,
see OData.org.
Accounts: All access to Azure Storage is done through a storage account. See Azure Storage Scalability
and Performance Targets for details about storage account capacity.
All access to Azure Cosmos DB is done through a Table API account. See Create a Table API account for
details creating a Table API account.
Table: A table is a collection of entities. Tables don't enforce a schema on entities, which means a single
table can contain entities that have different sets of properties.
Entity: An entity is a set of properties, similar to a database row. An entity in Azure Storage can be up to
1MB in size. An entity in Azure Cosmos DB can be up to 2MB in size.
Properties: A property is a name-value pair. Each entity can include up to 252 properties to store data.
Each entity also has three system properties that specify a partition key, a row key, and a timestamp.
Entities with the same partition key can be queried more quickly, and inserted/updated in atomic
operations. An entity's row key is its unique identifier within a partition.
For details about naming tables and properties, see Understanding the Table Service Data Model.
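The following is a minimal sketch of inserting and reading back an entity, assuming the azure-data-tables Python package (a current client library that works against both Table storage and the Cosmos DB Table API); the connection string, table name, and entity values are placeholders.

```python
# Hypothetical sketch using the azure-data-tables Python package.
# The connection string, table name, and entity values are placeholders.
from azure.data.tables import TableServiceClient

service = TableServiceClient.from_connection_string("<your-connection-string>")
table = service.create_table_if_not_exists("devices")

# PartitionKey and RowKey are the required system properties described above;
# the remaining properties are schemaless name-value pairs.
table.create_entity({
    "PartitionKey": "building-1",
    "RowKey": "sensor-42",
    "model": "TH-200",
    "firmware": 3,
})

entity = table.get_entity(partition_key="building-1", row_key="sensor-42")
print(entity["model"])
```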

Next steps
Microsoft Azure Storage Explorer is a free, standalone app from Microsoft that enables you to work visually
with Azure Storage data on Windows, macOS, and Linux.
Get started with Azure Table Storage in .NET
View the Table service reference documentation for complete details about available APIs:
Storage Client Library for .NET reference
REST API reference
Introduction to Azure managed disks
4/22/2019 • 9 minutes to read

An Azure managed disk is a virtual hard disk (VHD). You can think of it like a physical disk in an on-premises
server, but virtualized. Azure managed disks are stored as page blobs, which are a random IO storage object in
Azure. We call a managed disk 'managed' because it is an abstraction over page blobs, blob containers, and Azure
storage accounts. With managed disks, all you have to do is provision the disk, and Azure takes care of the rest.
When you select to use Azure managed disks with your workloads, Azure creates and manages the disk for you.
The available types of disks are Ultra Solid State Drives (SSD) (preview), Premium SSD, Standard SSD, and
Standard Hard Disk Drives (HDD). For more information about each individual disk type, see Select a disk type for
IaaS VMs.

Benefits of managed disks


Let's go over some of the benefits you gain by using managed disks.
Highly durable and available
Managed disks are designed for 99.999% availability. Managed disks achieve this by providing you with three
replicas of your data, allowing for high durability. If one or even two replicas experience issues, the remaining
replicas help ensure persistence of your data and high tolerance against failures. This architecture has helped
Azure consistently deliver enterprise-grade durability for infrastructure as a service (IaaS) disks, with an industry-
leading 0% annualized failure rate.
Simple and scalable VM deployment
Using managed disks, you can create up to 50,000 VM disks of a type in a subscription per region, allowing you to
create thousands of VMs in a single subscription. This feature also further increases the scalability of virtual
machine scale sets by allowing you to create up to 1,000 VMs in a virtual machine scale set using a Marketplace
image.
Integration with availability sets
Managed disks are integrated with availability sets to ensure that the disks of VMs in an availability set are
sufficiently isolated from each other to avoid a single point of failure. Disks are automatically placed in different
storage scale units (stamps). If a stamp fails due to hardware or software failure, only the VM instances with disks
on those stamps fail. For example, let's say you have an application running on five VMs, and the VMs are in an
Availability Set. The disks for those VMs won't all be stored in the same stamp, so if one stamp goes down, the
other instances of the application continue to run.
Integration with Availability Zones
Managed disks support Availability Zones, which is a high-availability offering that protects your applications
from datacenter failures. Availability Zones are unique physical locations within an Azure region. Each zone is
made up of one or more datacenters equipped with independent power, cooling, and networking. To ensure
resiliency, there's a minimum of three separate zones in all enabled regions. With Availability Zones, Azure offers
an industry-best 99.99% VM uptime SLA.
Azure Backup support
To protect against regional disasters, Azure Backup can be used to create a backup job with time-based backups
and backup retention policies. This allows you to perform easy VM restorations at will. Currently, Azure Backup
supports disk sizes up to four tebibytes (TiB). Azure Backup supports backup and restore of managed disks.
Learn more about Azure VM backup support.
Granular access control
You can use Azure role-based access control (RBAC) to assign specific permissions for a managed disk to one or
more users. Managed disks expose a variety of operations, including read, write (create/update), delete, and
retrieving a shared access signature (SAS) URI for the disk. You can grant access to only the operations a person
needs to perform their job. For example, if you don't want a person to copy a managed disk to a storage account,
you can choose not to grant access to the export action for that managed disk. Similarly, if you don't want a person
to use a SAS URI to copy a managed disk, you can choose not to grant that permission to the managed disk.

Encryption
Managed disks offer two different kinds of encryption. The first is Storage Service Encryption (SSE), which is
performed by the storage service. The second one is Azure Disk Encryption, which you can enable on the OS and
data disks for your VMs.
Storage Service Encryption (SSE)
Azure Storage Service Encryption provides encryption-at-rest and safeguards your data to meet your
organizational security and compliance commitments. SSE is enabled by default for all managed disks, snapshots,
and images in all the regions where managed disks are available. Visit the Managed Disks FAQ page for more
details.
Azure Disk Encryption (ADE)
Azure Disk Encryption allows you to encrypt the OS and data disks used by an IaaS virtual machine. This
encryption includes managed disks. For Windows, the drives are encrypted using industry-standard BitLocker
encryption technology. For Linux, the disks are encrypted using the DM-Crypt technology. The encryption process
is integrated with Azure Key Vault to allow you to control and manage the disk encryption keys. For more
information, see Azure Disk Encryption for IaaS VMs.

Disk roles
There are three main disk roles in Azure: the data disk, the OS disk, and the temporary disk. These roles map to
disks that are attached to your virtual machine.

Data disk
A data disk is a managed disk that's attached to a virtual machine to store application data, or other data you need
to keep. Data disks are registered as SCSI drives and are labeled with a letter that you choose. Each data disk has a
maximum capacity of 32,767 gibibytes (GiB). The size of the virtual machine determines how many data disks you
can attach to it and the type of storage you can use to host the disks.
OS disk
Every virtual machine has one attached operating system disk. That OS disk has a pre-installed OS, which was
selected when the VM was created.
This disk has a maximum capacity of 2,048 GiB.
Temporary disk
Every VM contains a temporary disk, which is not a managed disk. The temporary disk provides short-term
storage for applications and processes, and is intended to only store data such as page or swap files. Data on the
temporary disk may be lost during a maintenance event or when you redeploy a VM. On Azure Linux VMs, the
temporary disk is /dev/sdb by default, and on Windows VMs the temporary disk is D: by default. During a
successful standard reboot of the VM, the data on the temporary disk will persist.

Managed disk snapshots


A managed disk snapshot is a read-only full copy of a managed disk that is stored as a standard managed disk by
default. With snapshots, you can back up your managed disks at any point in time. These snapshots exist
independent of the source disk and can be used to create new managed disks. They are billed based on the used
size. For example, if you create a snapshot of a managed disk with provisioned capacity of 64 GiB and actual used
data size of 10 GiB, that snapshot is billed only for the used data size of 10 GiB.
To learn more about how to create snapshots with managed disks, see the following resources:
Create copy of VHD stored as a managed disk using snapshots in Windows
Create copy of VHD stored as a managed disk using snapshots in Linux
Images
Managed disks also support creating a managed custom image. You can create an image from your custom VHD
in a storage account or directly from a generalized (sysprepped) VM. This process captures a single image. This
image contains all managed disks associated with a VM, including both the OS and data disks. This managed
custom image enables creating hundreds of VMs using your custom image without the need to copy or manage
any storage accounts.
For information on creating images, see the following articles:
How to capture a managed image of a generalized VM in Azure
How to generalize and capture a Linux virtual machine using the Azure CLI
Images versus snapshots
It's important to understand the difference between images and snapshots. With managed disks, you can take an
image of a generalized VM that has been deallocated. This image includes all of the disks attached to the VM. You
can use this image to create a VM, and it includes all of the disks.
A snapshot is a copy of a disk at the point in time the snapshot is taken. It applies only to one disk. If you have a
VM that has one disk (the OS disk), you can take a snapshot or an image of it and create a VM from either the
snapshot or the image.
A snapshot doesn't have awareness of any disk except the one it contains. This makes it problematic to use in
scenarios that require the coordination of multiple disks, such as striping. Snapshots would need to be able to
coordinate with each other and this is currently not supported.

Disk allocation and performance


Bandwidth and IOPS for disks are allocated in real time using a three-level provisioning system.
The first level of provisioning sets the per-disk IOPS and bandwidth assignment. At the second level, the compute
server host implements SSD provisioning, applying it only to data that is stored on the server's SSD, which includes
disks with caching (ReadWrite and ReadOnly) as well as local and temp disks. Finally, VM network provisioning
takes place at the third level for any I/O that the compute host sends to Azure Storage's backend. With this
scheme, the performance of a VM depends on a variety of factors, from how the VM uses the local SSD, to the
number of disks attached, as well as the performance and caching type of the disks it has attached.
As an example of these limitations, a Standard_DS1v1 VM is prevented from achieving the 5,000 IOPS potential
of a P30 disk, whether it is cached or not, because of limits at the SSD and network levels.
Azure uses a prioritized network channel for disk traffic, which takes precedence over other, lower-priority
network traffic. This helps disks maintain their expected performance during network contention. Similarly,
Azure Storage handles resource contention and other issues in the background with automatic load balancing.
Azure Storage allocates the required resources when you create a disk, and applies proactive and reactive balancing
of resources to handle the traffic level. This further ensures that disks can sustain their expected IOPS and
throughput targets. You can use the VM-level and disk-level metrics to track performance and set up alerts as needed.
Refer to our design for high performance article to learn the best practices for optimizing VM and disk
configurations so that you can achieve your desired performance.

Next steps
Learn more about the individual disk types Azure offers, which type is a good fit for your needs, and learn about
their performance targets in our article on disk types.
Select a disk type for IaaS VMs