
Dell Unstructured Data Solution
Nguyễn Thế Hùng
Thehung.nguyen@dell.com
0936 391 525
UDS Solution Engineer
Innovate using leading storage innovations

Modern storage portfolio (block; block and file)
Dell PowerVault: ✓ Block (SAN/DAS) ✓ Affordable/simple ✓ CloudIQ support
Dell Unity XT: ✓ Simple ✓ All-flash/hybrid ✓ Virtual option ✓ CloudIQ support
Dell PowerStore: ✓ Scale-up/out ✓ NVMe ✓ AppsON ✓ CloudIQ support
Dell PowerMax: ✓ Scale-up/out ✓ Cyber resiliency ✓ Highest availability ✓ CloudIQ support

Unstructured data
Dell PowerScale/Isilon (file & S3 object): ✓ Scale-out ✓ All-flash to archive ✓ Multiprotocol file ✓ CloudIQ support
Dell ECS (object): ✓ Cloud-scale ✓ Deep archive ✓ Mobile app and modern app

Hyperconverged infrastructure
Dell VxRail (VMware hypervisor): ✓ Turnkey system ✓ VMware vSAN ✓ Lifecycle mgt ✓ CloudIQ support

Software-Defined Storage (SDS)
Dell PowerFlex (multi-hypervisor): ✓ Scalable SDI ✓ Multi-hypervisor ✓ 2 layer/HCI/storage ✓ CloudIQ support
Gartner Magic Quadrant – 7th Consecutive Year Leader (Oct’22)!!
“Distributed file systems and object storage deployments are
growing faster than ever in both volume and size as the
consolidated platform for unstructured data services in
global data center.”

“By 2026, large enterprises will triple their unstructured data
capacity stored as file or object storage on-premises, at the
edge or in the public cloud, compared to 2022.”

“Dell Technologies has the broadest portfolio of products
and builds on insights gathered from the largest installed
base in the market to address the challenges of
unstructured data.”

- Gartner Magic Quadrant for Distributed File Systems and Object Storage – Oct. 2022
https://www.gartner.com/doc/reprints?id=1-2BHJ4VIQ&ct=221024&st=sb

Gartner, Inc. “Magic Quadrant™ for Distributed File Systems and Object Storage” by Julia Palmer, Jerry Rozeman, Chandra Mukhyala, Jeff Vogel, October 19, 2022.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest
ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner
disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is
available upon request from Dell Technologies.
GARTNER and MAGIC QUADRANT are registered trademarks and service marks of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with
permission. All rights reserved

Dell Internal Confidential


Object vs File vs Block

5 © Copyright 2018 Dell Inc.


What is unstructured data?
of the world’s data stored is unstructured*

• Email, Databases, Archiving, Home directories, Virtual server, Virtual desktop, Video surveillance, File shares
• Artificial intelligence, Data analytics, Automated driving assist, Internet of things, Energy, Manufacturing, Financial services, Life sciences, Media & entertainment, EDA
Dell Internal and Partner Confidential Copyright © Dell Inc. All Rights Reserved.
Unstructured data workload
Workloads suitable for

• Archiving, Home directories, Video surveillance, File shares
• Artificial intelligence, Data analytics, Automated driving assist, Internet of things, Energy, Manufacturing, Financial services, Life sciences, Media & entertainment, EDA
Differences in how data is accessed
Dell PowerScale
Scale-Out NAS

PowerScale scale-out architecture
Client applications (multi-protocol): NFS, SMB, S3, FTP, HTTP, HDFS, NDMP, REST
Ethernet layer
Intra-cluster communication

Traditional NAS
Access: NFS, SMB, GUI
• Establish pools/LUNs
• Set data protection/RAID
• Permissions/Access/Protocol
Scaling compounds complexity
• Silos of storage
• Manual tiering
(Diagram: NAS #1, NAS #2 and NAS #3, each with its own shelf and LUNs 1 to 4, divided between Marketing, HR, Finance and Engineering.)

Normal NAS Architecture

PowerScale Architecture

ARCHITECTURE REALLY MATTERS
Traditional – Scale-Up vs. Modern – Scale-Out
PowerScale Storage Nodes
ALL-FLASH (Performance): F900, F600, F200
HYBRID (Blended): H700 / H7000
ARCHIVE (Capacity): A300 / A3000
Combine in a single cluster with OneFS
Cluster Hardware platforms

PowerScale Stand-alone Node Platform:
• F900 contains 24 x 2.5” NVMe SSDs.
• F600 contains 8 x 2.5” NVMe SSDs.
• F200 contains 4 x 3.5” SAS SSDs.

PowerScale Gen6.x Modular Platform:
• Four nodes are contained in a 4RU chassis.
• Enhances the concept of disk pools and node pools.
• ‘Neighborhoods’ add another level of resilience to the failure domain concept.
• Each chassis contains four compute modules (one per node), and five drive containers, or sleds, per node.
• Each sled is a tray that slides into the front of the chassis and contains between three and six drives.

(Figure: PowerScale F900, F600 and F200 nodes; PowerScale Gen6.x chassis with Node 1 to Node 4, each with Sled 1 to Sled 5.)
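The chassis figures above imply a drive-count range per Gen6.x chassis. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative, and only the numbers (four nodes, five sleds per node, three to six drives per sled) come from the slide.

```python
# Figures taken from the slide; names are illustrative only.
NODES_PER_CHASSIS = 4            # four nodes in a 4RU chassis
SLEDS_PER_NODE = 5               # five drive sleds per node
DRIVES_PER_SLED_RANGE = (3, 6)   # three to six drives per sled


def drives_per_chassis(drives_per_sled: int) -> int:
    """Total drives in one chassis for a given sled size."""
    low, high = DRIVES_PER_SLED_RANGE
    if not low <= drives_per_sled <= high:
        raise ValueError(f"drives per sled must be between {low} and {high}")
    return NODES_PER_CHASSIS * SLEDS_PER_NODE * drives_per_sled


if __name__ == "__main__":
    for size in range(DRIVES_PER_SLED_RANGE[0], DRIVES_PER_SLED_RANGE[1] + 1):
        print(f"{size} drives/sled -> {drives_per_chassis(size)} drives/chassis")
```

Under these figures a chassis holds between 60 and 120 drives, depending on the sled size.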

Multiple pools, one filesystem
Drive lowest blended TCO: optimize data placement to lower blended TCO without compromising performance

Transparently leverage multiple pools of storage

POLICY
• 1 week: move to hybrid tier
• 1 month: move to archive tier
• 1 year: move to cloud

SMARTPOOLS: Policy-based automated tiering within one filesystem
CLOUDPOOLS: Extends filesystem to the cloud or on-prem object store
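The age thresholds in the policy above can be read as a simple lookup. This is a hedged sketch, not SmartPools or CloudPools configuration: the tier names, the `target_tier` function, the all-flash default for newer data, and the choice of 30 and 365 days for "1 month" and "1 year" are all assumptions made for illustration. The slide also does not say whether age is measured from creation or last access.

```python
from datetime import timedelta

# Thresholds from the slide, longest first. "1 month" and "1 year" are
# approximated as 30 and 365 days (assumption).
POLICY = [
    (timedelta(days=365), "cloud"),
    (timedelta(days=30), "archive"),
    (timedelta(weeks=1), "hybrid"),
]
DEFAULT_TIER = "all-flash"  # assumption: data younger than one week stays here


def target_tier(age: timedelta) -> str:
    """Return the tier a file of the given age should live on."""
    for threshold, tier in POLICY:
        if age >= threshold:
            return tier
    return DEFAULT_TIER


if __name__ == "__main__":
    for days in (2, 10, 45, 400):
        print(days, "days ->", target_tier(timedelta(days=days)))
```

Checking the longest threshold first keeps the rules independent of each other, so adding a tier is a one-line change.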

Compatibility with existing clusters
• Add new generation node to existing cluster
• Non-disruptive access continues
• Auto-balance data across the cluster
• Transparent to clients
• No impact to users

Eliminate data migrations
• Add new generation node to existing cluster
• Non-disruptive access continues
• Policy-driven job moves data to new nodes
• Seamless and transparent to clients
• No impact to users

Capacity expansion and balancing
• Automated data balancing across nodes eliminates “hot spots”
• AutoBalance automatically moves content to new storage nodes
• No administrator intervention required
• Utilization increases as cluster grows
• Performance scales linearly with capacity
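The effect of balancing after a node is added can be illustrated with a few lines of arithmetic. This is a sketch only, assuming the goal is equal utilization on every node; the `autobalance` function is hypothetical and says nothing about how OneFS AutoBalance actually moves data.

```python
from typing import List


def autobalance(used_tb: List[float], capacity_tb: List[float]) -> List[float]:
    """Return per-node used capacity after spreading data so that every
    node ends up at the same utilization (illustrative model only)."""
    if len(used_tb) != len(capacity_tb):
        raise ValueError("one used value is required per node")
    total_used = sum(used_tb)
    total_capacity = sum(capacity_tb)
    # Each node receives a share of the data proportional to its capacity.
    return [cap * total_used / total_capacity for cap in capacity_tb]


if __name__ == "__main__":
    # Three nodes at 80% full, then an empty fourth node joins the cluster.
    before = [80.0, 80.0, 80.0, 0.0]
    capacity = [100.0, 100.0, 100.0, 100.0]
    print(autobalance(before, capacity))
```

In this example every node ends at 60 TB used, so the three original nodes each give up 20 TB and no single node remains a hot spot.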

Connection and CPU balancing
• Client connections evenly distributed across cluster
• In-flight reads and writes handed off
• User and application access are uninterrupted
• Ensures CPU is evenly utilized across cluster

SmartConnect: multiple connection balancing policies
• Round Robin
• Connection Count
• Throughput
• CPU

(Diagram: Windows clients and Linux clients connecting through SmartConnect; each line represents a client connection.)
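The four balancing policies named above can be sketched as a node-selection function. This is an illustrative model, not the SmartConnect implementation: the `Node` and `ConnectionBalancer` classes, the metric fields and the policy strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    connections: int         # current client connections
    throughput_mbps: float   # current network throughput
    cpu_percent: float       # current CPU utilization


class ConnectionBalancer:
    """Picks a node for a new client connection (illustrative only)."""

    def __init__(self, nodes: List[Node]) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = list(nodes)
        self._next = 0  # round-robin cursor

    def pick(self, policy: str) -> Node:
        if policy == "round_robin":
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            return node
        # The other policies choose the node with the lowest value of a metric.
        metrics = {
            "connection_count": lambda n: n.connections,
            "throughput": lambda n: n.throughput_mbps,
            "cpu": lambda n: n.cpu_percent,
        }
        if policy not in metrics:
            raise ValueError(f"unknown policy: {policy}")
        return min(self.nodes, key=metrics[policy])


if __name__ == "__main__":
    cluster = [
        Node("node1", 10, 500.0, 70.0),
        Node("node2", 4, 900.0, 40.0),
        Node("node3", 7, 100.0, 55.0),
    ]
    balancer = ConnectionBalancer(cluster)
    for policy in ("round_robin", "connection_count", "throughput", "cpu"):
        print(policy, "->", balancer.pick(policy).name)
```

Round robin needs no live metrics, while the other three policies depend on up-to-date per-node statistics.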

Policy based data protection
Protection with no downtime
• Protection levels from N+1 to N+4
• Flexible disk and node protection with lower impact
• Faster rebuild
• On the fly at directory and file level
• Multiple levels of protection
• No impact to users

Policy based data protection
Protection with no downtime
ADVANTAGES
• Change data protection on the fly
• Directory and file level granularity: change just a single directory to a higher level
(Diagram: /ifs tree with directories protected at N+2, N+3 and N+4.)
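Directory-level granularity means a path inherits the protection level of its nearest configured ancestor. The sketch below models that idea with a plain dictionary; the `PROTECTION` table, its directory names and the `protection_for` function are hypothetical and do not reflect how OneFS stores or resolves protection settings.

```python
from pathlib import PurePosixPath

# Hypothetical policy table: a cluster-wide default on /ifs, with single
# directories raised to a higher level.
PROTECTION = {
    "/ifs": "N+2",
    "/ifs/finance": "N+3",
    "/ifs/archive/legal": "N+4",
}


def protection_for(path: str) -> str:
    """Return the level of the nearest ancestor that has a policy."""
    target = PurePosixPath(path)
    for candidate in [target, *target.parents]:
        level = PROTECTION.get(str(candidate))
        if level is not None:
            return level
    raise KeyError(f"no protection policy covers {path}")


if __name__ == "__main__":
    for p in ("/ifs/home/alice/notes.txt",
              "/ifs/finance/q1/report.pdf",
              "/ifs/archive/legal/contract.pdf"):
        print(p, "->", protection_for(p))
```

Raising one directory to a higher level is then a single entry in the table, and everything else keeps the default.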

How rebuilds work
(Diagram sequence: a cluster protected at N+3, with used and free space on every node.)
1. No need for hot spare drives.
2. A node fails: the cluster is below its protection level (protected @ N+2).
3. All of the nodes rebuild the failed units into free space (protected @ N+3 again).
4. The failed node is replaced.
5. Data is redistributed and balanced automatically.
Multi-Tenancy

OneFS In-line Data Reduction

Secure multi-protocol access
• Global permissions structure shared across ALL users with ALL protocols
• Unified Authentication and Identity Management across all protocols
• Full control of protocol stack and modular architecture to add new protocols easily
• All protocols get all data services
• Single or multiple authentication providers
(Diagram: /ifs with a Global Permissions Structure.)

OneFS features: manage what matters

• SmartPools: Policy-based automated tiering
• SyncIQ: Asynchronous replication for DR
• Data Reduction: Data deduplication reducing storage costs
• SnapshotIQ: Fast, efficient data backup and recovery
• CloudPools: Cloud tiering to a choice of providers
• SmartLock: Policy-based compliance WORM protection
• SmartQuotas: Quota management and thin provisioning
• SmartConnect: Policy-based client failover load balancing
• InsightIQ
• DataIQ

Simplicity at any scale
Provides peace of mind to storage admins

• Any scale: Start small, grow to petabyte scale
• No node left behind: Add new and remove old nodes with no downtime
• Ransomware protection: Stop ransomware real-time with API-integrated solution
• Resilient: Sustain multi-node failures; designed for 6x9s availability*
• Efficient: Inline data reduction with no hot spots and AutoBalance
• DevOps ready: Utilize new Ansible and Kubernetes integrations
Worldwide Guidance: Introduce an Isolated Backup

• “Create an isolated recovery environment: Make ransomware recovery via an IRE part of your disaster recovery plan & include it in future disaster recovery tests.”
• “Ensure that backups are not connected to the business network”
• “Secure tertiary data backup should be disconnected … so that it can withstand targeted cyber attacks … or threats from malicious insiders.”
• “It is important that the backup data is stored offline and not connected to your network.”
• “It is critical to maintain offline, encrypted backups of data”
• “Data Vault requirement: ‘Air gapped’”
• “Ensure backups are not connected to the networks they back up.”
• “Daily backups of important data, software and settings, stored disconnected, retained for at least three months.”



Dell OneFS Unstructured Data Cyber Defense
(Diagram: Production Data cluster, SyncIQ DR cluster and AirGap Cluster.)

Production Data
Large unstructured data sets don’t back up quickly. The most cost-effective and efficient way to provide a 2nd copy is snap & replicate with OneFS SyncIQ and SnapshotIQ. OneFS has cloud-native options for DR.

SyncIQ DR
Flexible replication options with encryption over the wire.

Ransomware Defender
Real-time ransomware detection and prevention. Monitors user behavior & shuts down access when ransomware activity is identified.

SnapshotIQ
OneFS snapshots are read-only. Impact of attack is limited to changes between last snapshot policy and when attack happened. OneFS has granular policy options and rapidly recovers from an attack.

AirGap
Manages SyncIQ policy replication schedule to 3rd cluster automatically and keeps the AirGap closed when active threats are detected. Longest data retention capability with block level snapshot differencing. Fastest restore speed with rapid recovery of data in hours, not days or weeks: “Get PB’s of data usable in hours”



Focused use cases
Would you like some PowerScale? ☺
Central NAS – home directories & file sharing
Data mostly created and accessed by users
• User Home Directories, User Profiles (VDI)
• Network shares
CCTV - surveillance system

• Metropolitan System
• Airports, Railways, Public transportation
• Hospitals
• Prisons
• Strategic objects – power plants etc.
• Logistics companies

Data types: Surveillance, Body worn video, Aerial video, Audio, In car video, Crime science, GIS data
CCTV – Traditional approach

LUN 1 LUN 2 LUN 3 LUN 4 LUN 5 LUN 6 LUN 7 LUN 8 LUN 9 LUN 10 LUN 11 LUN 12
(E:) (F:) (G:) (H:) (E:) (F:) (G:) (H:) (E:) (F:) (G:) (H:)
CCTV –Traditional approach

LUN 1 LUN 2 LUN 3 LUN 4 LUN 5 LUN 6 LUN 7 LUN 8 LUN 17 LUN 9 LUN 10 LUN 11 LUN 12
(E:) (F:) (G:) (H:) (E:) (F:) (G:) (H:) (I:) (E:) (F:) (G:) (H:)
CCTV – smart & modern approach

ISILON = UNIFORM AND AUTOMATIC DATA DISTRIBUTION


Healthcare
Healthcare PACS system and VNA

• Each PACS system has different storage-specific requirements
• Validated design and sizing guides
• Plenty of customers
• PACS is only one workload in healthcare – consolidation story with data lake
Media & Entertainment
Isilon came from M&E – standard in M&E

• Isilon can serve as storage for most of the M&E workflow:
  • Ingest & acquisition
  • Edit & Process & Composition
  • Visual effects
  • Archive
  • Video Content Delivery
• Broadcasters, Animation & VFX studios, Service Providers with VoD service
• We understand media applications & workflows
• Tons of customers
Sample Diagram
Movies created on Isilon by UPP Prague
Data Analytics/Data Management
with Data Lake
Data architecture evolution to Data Lakehouse

(Figure: three data architectures compared. Panels: Data Warehouse; Data Lake & Data Warehouse (two-tier architectures); Data Lakehouse. Components shown: structured data and semi-structured & unstructured data, data lake, ETL, data warehouses, and a metadata, caching and indexing layer. Consumers: BI, reports, data science, machine learning.)
http://www.cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf



Storage Efficiency
Replication is expensive – the default 3x replication scheme in HDFS has
200% overhead in storage space and other resources (e.g., network bandwidth).
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html

(Diagram: streaming data, production data and external data arrive in a raw data landing zone and are stored as multiple copies.)
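The 200% figure follows directly from the replica count. The sketch below reproduces that arithmetic and, for comparison, the overhead of a Reed-Solomon (6,3) erasure-coding layout, which the linked Hadoop page describes as 50%. The function names are illustrative.

```python
def replication_overhead(copies: int) -> float:
    """Extra storage, as a percentage of the data size, for N full copies."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return (copies - 1) * 100.0


def erasure_coding_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage, as a percentage, for a k data + m parity layout."""
    if data_units < 1 or parity_units < 0:
        raise ValueError("invalid erasure-coding layout")
    return parity_units / data_units * 100.0


if __name__ == "__main__":
    print("3x replication:", replication_overhead(3), "%")
    print("RS(6,3) erasure coding:", erasure_coding_overhead(6, 3), "%")
```

The same formula applies to any parity-based layout: the fewer parity units per data unit, the lower the overhead.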



Hadoop Architecture - Traditional
(Diagram: Kafka, Spark, Impala, Hive and HBase run on YARN and MapReduce across combined Data Node + Compute Node servers connected over Ethernet, with a NameNode and a 2nd NameNode.)



Bottleneck with NameNode
1. The user makes a request to the client.
2. The client makes a plan: cut the data into 64 MB blocks, with every block kept in three copies.
3. The client divides large files into blocks.
4. For the first block, the client asks the NameNode to help place the 64 MB block in three copies.
5. The NameNode tells the client the addresses of three DataNodes, sorted by distance to the client.
6. The client sends the data and manifest to the first DataNode.
7. The first DataNode copies the data to the second DataNode.
8. The second DataNode copies the data to the third DataNode.
9. When all the data for a block has been written, the DataNodes report back to the NameNode that it is complete.
10. The same is done for the second and later blocks. When all blocks are finished, the file is closed and the NameNode persists the metadata to disk.
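The steps above imply that the NameNode is consulted once per block, which is why it can become a bottleneck for large files. A minimal sketch of the resulting numbers, assuming the 64 MB block size and three copies from the slide; the `write_plan` function and its dictionary keys are illustrative names.

```python
import math

BLOCK_SIZE_MB = 64   # from the slide
REPLICAS = 3         # from the slide


def write_plan(file_size_mb: float) -> dict:
    """Estimate the work needed to write one file under the steps above."""
    if file_size_mb <= 0:
        raise ValueError("file size must be positive")
    blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    return {
        "blocks": blocks,
        # Steps 4 and 5 repeat for every block, so the NameNode is asked
        # for DataNode addresses once per block.
        "namenode_allocations": blocks,
        "block_replicas": blocks * REPLICAS,
        "raw_mb": file_size_mb * REPLICAS,
    }


if __name__ == "__main__":
    print(write_plan(1000))
```

For a 1000 MB file this gives 16 blocks, 16 NameNode allocations, 48 block replicas and 3000 MB of raw capacity.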



Hadoop Typical Node Architecture

• Admin Node: Provisioning, Monitoring, Cloudera Manager
• Edge Node: Hadoop clients
• Master Node 1: Active Resource Manager, Active Name Node, ZooKeeper, Journal Node
• Master Node 2: Standby Resource Manager, Standby Name Node, ZooKeeper, Journal Node
• Master Node 3: ZooKeeper, Journal Node
• Worker Nodes (x3): Node Manager, Data Node



PowerScale - Storage Separation

Compute
• Admin Node: Provisioning, Monitoring, Cloudera Manager
• Edge Node: Hadoop clients
• Master Node: Active and Standby Resource Manager, ZooKeeper
• Worker Node: Node Manager

Storage (accessed over HDFS)
• Active and Standby Name Node
• Journal Node
• Data Node



Hadoop Architecture with PowerScale
(Diagram: Kafka, Spark, Impala, Hive and HBase run on YARN and MapReduce across Compute Nodes, connected over Ethernet to PowerScale, which provides the data nodes and multiple name nodes.)



Access processes comparison
Performance/Reliability

(Diagram: the ten-step write path through a single NameNode, shown beside a shorter path, steps 1 to 3, served by multiple NameNodes.)



Leading data protection

/ifs N+3
• A copy of the data
• Better usable rate
• Change the protection level online
• Granularity at directory/file level
• Global hot-spare space improves data security and reconstruction efficiency
• Mature, 22-year-old product. Version 6.5 since 2001

Protection Level   Description
+1n                Tolerate failure of 1 drive OR 1 node
+2d:1n             Tolerate failure of 2 drives OR 1 node
+2n                Tolerate failure of 2 drives OR 2 nodes
+3d:1n             Tolerate failure of 3 drives OR 1 node
+3d:1n1d           Tolerate failure of 3 drives OR 1 node AND 1 drive
+3n                Tolerate failure of 3 drives OR 3 nodes
+4d:1n             Tolerate failure of 4 drives OR 1 node
+4d:2n             Tolerate failure of 4 drives OR 2 nodes
+4n                Tolerate failure of 4 nodes
2x to 8x           Mirrored over 2 to 8 nodes, depending on config
Flexible configuration in a cloud environment
✓ If you need different versions of Hadoop clusters (Hadoop Version 1, Version 2, Version 3)
➔ Data can be shared

✓ When upgrading an existing version of Hadoop (existing Hadoop to new Hadoop)
➔ Ensure data safety

✓ Hadoop based on on-demand containers/virtual machines (Hadoop on Container, Hadoop on Virtual Machine)
➔ Efficient HDFS zone configuration
PowerScale vs DAS HDFS

Operation: Multi-Protocol
  PowerScale: HDFS/S3/NFS/SMB/FTP/HTTP/RESTful API
  Hadoop DAS: Only HDFS; requires a gateway

Operation: Scalability
  PowerScale: Separate compute and storage resources, scaled with actual demand to avoid waste
  Hadoop DAS: Storage and compute resources must scale together, in most cases wasting resources

Operation: Upgrade
  PowerScale: Storage and compute can be upgraded independently, and PowerScale can be upgraded online, without the need to build new systems or migrate data
  Hadoop DAS: Both storage and compute need to be upgraded, new clusters need to be established, and data migration might be required

Operation: Balancing data
  PowerScale: Automatic, self-healing, with little or no impact
  Hadoop DAS: Manual, can impact performance, requires tuning

Efficiency: Capacity
  PowerScale: Data lake technology ensures multi-protocol sharing of one piece of data, supporting more than 80% storage efficiency
  Hadoop DAS: Hadoop requires 3 copies of data, which means 6 times the space consumption if disaster recovery or other replication is considered

Efficiency: Auto-tiering
  PowerScale: Use SmartPools to automatically optimize hot/warm/cold data
  Hadoop DAS: Not supported

Efficiency: Data Deduplication
  PowerScale: SmartDedupe, in-line dedupe
  Hadoop DAS: Not supported



PowerScale vs DAS HDFS (continued)

Data Protection: Replication/snapshots
  PowerScale: Asynchronous replication with SyncIQ (snapshot-based), with a minimum interval of 10 seconds
  Hadoop DAS: Uses DistCp (file copy), which is slow

Data Protection: Name Node
  PowerScale: Each Isilon node is an active name node, with no performance bottlenecks, no single points of failure and no file count issues
  Hadoop DAS: NameNode could be a bottleneck

Data Protection: Ransomware Protection
  PowerScale: AirGap technology, intelligent user behavior monitoring and auditing, and restores at file/directory granularity protect massive data sets against ransomware
  Hadoop DAS: Not supported

Security: Multi-release support
  PowerScale: Supports multiple software versions at the same time, and supports data sharing between multiple versions
  Hadoop DAS: Not supported

Security: Immutable
  PowerScale: WORM is implemented through the SmartLock function
  Hadoop DAS: Not supported

Other: HBase
  PowerScale: HBase has better performance due to the distributed architecture and enhanced design of Region servers
  Hadoop DAS: No improvement



Case Study –
A telecom operator in Vietnam



(Diagram: Windows clients (SMB), Hadoop clients (HDFS) and UNIX/Linux clients (NFS) connect over the customer’s 10GbE network to a PowerScale cluster with three tiers: 1.3 PB usable all-flash performance, 6 PB usable hybrid and 7 PB usable archive. Back-end network: 8 leaf switches and 4 spine switches, with 100GbE and 40GbE links. Also shown: VMware, a DNS server and a performance monitoring server.)
Case Study –
A bank in Vietnam



(Diagram: workloads on the bank network: ECM, EDWH, system logs, home directories, NVR, modern apps and backup/archive. An ETL tool, a Hadoop compute farm and an AI/ML compute farm connect over 40/100Gb Ethernet to the data lake: 4N – F600 and 8N – H5600. Replication runs over 40Gb Ethernet to the disaster recovery site. Also shown: 6N – EX500 and 5N – EXF900 on 25Gb and 40Gb Ethernet, with workload mobility.)
PowerScale Resources
• OneFS Manuals
• White Papers
• PowerSizer
• Blogs
• Hands-on Labs
• OneFS Simulator
• CloudIQ Simulator
