HP Gen8 Technologies For Low Latency, High Performance Trading and Exchanges
Experience matters
HP ProLiant
#1 in x86 server market share
16+ years straight: 65 consecutive quarters in both factory revenue and units
HP's leadership in the datacenter has been built over years of innovation, experience and market leadership.
Source: IDC Worldwide Quarterly Server Tracker, August 2012. Includes Compaq ProLiant from Q1'96 through Q2'02
Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Low power choices for grid computing
Open reference architecture for unstructured data
Low Latency Systems Require Optimization at every layer in the Solution Stack
Use Cases:
- Exchange Matching Engines
- Market Data Distribution
- High Frequency Algorithmic Trading
- Pre/Post Trade Analytics
- Real Time Enterprise Risk Management
Messaging Middleware
Server I/O Fabric
Definitions:
- Solution: includes messaging middleware; in-house apps; design services
- System: integrated server/networking/storage infrastructure
- Components: specific servers/OS/switches/file system in the system
Turbo Boost deserves a fresh look (e.g. +400 MHz)
Unregistered DIMMs (UDIMMs) offer a 1-clock latency advantage when only 1 DIMM per Channel (DPC) is populated
UDIMM failure rates are higher, so use these judiciously
DIMM                   Description            1DPC (DDR3-)  2DPC (DDR3-)  3DPC (DDR3-)
8GB 2Rx4 PC3-12800R    1.5V DDR3-1600 RDIMM   1600          1600          1333 [1]
16GB 2Rx4 PC3-12800R   1.5V DDR3-1600 RDIMM   1600          1600          1333 [1]
4GB 2Rx8 PC3-12800E    1.5V DDR3-1600 UDIMM   1600          1600 (New)
As of 4 June, 2012
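The speed grades in the DIMM table can be turned into peak per-channel bandwidth to see what the step down from DDR3-1600 to DDR3-1333 at 3DPC costs. The sketch below is only illustrative arithmetic, assuming a standard 64-bit (8-byte) DDR3 channel; the function names are hypothetical and the figures are theoretical peaks, not measured results.

```python
# Theoretical peak bandwidth of one DDR3 channel, assuming a 64-bit data bus.
BYTES_PER_TRANSFER = 8  # assumption: standard 64-bit DDR3 channel


def peak_mb_per_s(mega_transfers_per_s):
    """Peak MB/s for a DDR3 speed grade such as 1600 or 1333 (MT/s)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER


def percent_drop(fast, slow):
    """Relative peak-bandwidth loss when moving from `fast` to `slow`."""
    return (peak_mb_per_s(fast) - peak_mb_per_s(slow)) / peak_mb_per_s(fast) * 100.0


if __name__ == "__main__":
    for grade in (1600, 1333):
        print(f"DDR3-{grade}: {peak_mb_per_s(grade)} MB/s per channel")
    print(f"1600 -> 1333 drop: {percent_drop(1600, 1333):.1f}%")
```

The DDR3-1600 result matches the PC3-12800 module name in the table, which is a useful sanity check on the arithmetic.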
Do this with the new HPRCU, Conrep scripting tool or RBSU Advanced Menu
Conrep now available for Solaris too
See User Guide for ROM-Based Setup Utility (RBSU) for explanation of BIOS settings
Pub #347563-405 June, 2012 at: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00191707/c00191707.pdf
Latency Spikes: Time History, DL380p Gen8, E5-2643 @ 3.300 GHz, RHEL 6.2/2.6.32-220.el6.x86_64, HP-TimeTest7.2
[Figure: spike latency in cycles (0 to 25000) and in µsecs (0 to 9) vs. Elapsed Time (seconds), 0 to 600]
With prototype HP BIOS option for SNB memory power refresh, we observe spikes < 3 µsec! (to be released mid-Oct '12)
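The latency-spike plots report the same spikes in CPU cycles and in time. A minimal sketch of that conversion, assuming the nominal 3.3 GHz clock from the figure caption, is shown below. The sample data and the function names are hypothetical; this is not how HP-TimeTest itself works.

```python
# Convert spike sizes from CPU cycles to microseconds and flag the large ones.
CPU_HZ = 3.3e9  # nominal E5-2643 clock taken from the figure caption


def cycles_to_usec(cycles, cpu_hz=CPU_HZ):
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / cpu_hz * 1e6


def find_spikes(samples, threshold_usec):
    """Return (elapsed_seconds, latency_usec) for samples above the threshold.

    `samples` is an iterable of (elapsed_seconds, latency_cycles) pairs.
    """
    spikes = []
    for elapsed, cycles in samples:
        usec = cycles_to_usec(cycles)
        if usec > threshold_usec:
            spikes.append((elapsed, usec))
    return spikes


if __name__ == "__main__":
    # Hypothetical samples, not data read from the plot.
    samples = [(0, 5000), (30, 15000), (60, 25000)]
    for elapsed, usec in find_spikes(samples, threshold_usec=3.0):
        print(f"t={elapsed}s spike={usec:.2f} usec")
```

At 3.3 GHz the 25000-cycle top of the cycles axis is roughly 7.6 µsec, which is consistent with the 0 to 9 range of the time axis.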
ProLiant Gen8 servers with ConnectX-3 based adapters and VMA acceleration enable a 2 µsec trading advantage!
CX-2
MAC+IP per process in addition to Server MAC+IP
CX-3
No additional MAC+IP. Use Server's MAC+IP
Description
ConnectX-3 implements Flow Steering
Multithread support
QP per thread/socket
Not supported Not supported Not supported Single default GW is supported per process and requires per process configuration
Supported Supported (Q1'12) Supported Host stack routing table is supported ConnectX-3 Flow Steering enables utilizing the host IP stack
[Figure: round-trip latency (µsec, 0 to 10) vs. message size (16 to 1024)]
Back-to-back configuration (no switch), round trip; Netperf v2.5.0; MTU size = 1470 bytes
RHEL Development 6.1; ConnectX-3 FW 2.10.2220; Driver: OFED-VMA 1.5.3-0008; VMA 6.1.6
Command line: netperf -n 16 -H <peer ip> -c -C -P 0 -t TCP_RR -l 10 -T 2,2 -- -r <message size>
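The netperf command line above takes one message size per run, so reproducing the chart means repeating it across the sizes on the x-axis. The sketch below only builds those command lines and does not execute them. The helper names are hypothetical, and the peer address is a documentation placeholder, not a value from the deck.

```python
# Build one netperf TCP_RR command per message size, using the flags from the slide.
import shlex

MESSAGE_SIZES = [16, 32, 64, 128, 256, 512, 1024]  # x-axis values from the chart


def netperf_cmd(peer_ip, message_size):
    """Return the argument list for a single TCP_RR run."""
    return [
        "netperf", "-n", "16", "-H", peer_ip, "-c", "-C", "-P", "0",
        "-t", "TCP_RR", "-l", "10", "-T", "2,2",
        "--", "-r", str(message_size),
    ]


def sweep(peer_ip):
    """Return one command per message size, in ascending order."""
    return [netperf_cmd(peer_ip, size) for size in MESSAGE_SIZES]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address; substitute the real peer.
    for cmd in sweep("192.0.2.10"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes them easy to pass to `subprocess.run` later without shell quoting problems.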
FPGAs:
DL380p risers now support double-wide HL PCIe cards with aux power cable options at PCIe Gen3 speeds!
Rapid changes underway: FPGA vendors adding 10GbE; 10GbE vendors adding FPGAs; switches adding FPGAs
Maximize your application resources by doing the following:
1. Bind threads, interrupts and processes to cores using CPU_ID:
   /usr/bin/taskset -c 0,1 /usr/bin/numactl --localalloc ... (other command line options)
   or use Red Hat tuna to do this with a GUI (in RHEL 5.5 MRG and RHEL 6.0 standard)
2. Beginning with SandyBridge on-chip PCIe controllers, bind NICs to cores for minimum QPI latencies
3. Place communication function threads on adjacent cores
4. Use PCM to determine L3 cache misses and keep data in L3 cache: http://software.intel.com/file/41604
5. Compile with performance settings, use PGO, evaluate IPP / SSE 4.2 strings: http://software.intel.com/en-us/articles/using-avx-without-writing-avx-code/
6. Implement application-transparent multicast acceleration between nodes: link Mellanox's VMA v6 library to the application for kernel bypass over Ethernet and IB (HP now resells VMA)
THE GOLDEN TICKET: Above the noise.
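Core binding, which the list above does with taskset from the command line, can also be done from inside a process. The following is a minimal Linux-only sketch using Python's standard `os.sched_setaffinity`. The helper name is hypothetical, and unlike taskset it first intersects the request with the cores the process is currently allowed to use. It does not cover memory placement, which numactl handles.

```python
# Pin the calling process to a set of cores (Linux only).
import os


def pin_current_process(cores):
    """Pin this process to `cores` and return the resulting affinity set.

    Cores outside the currently allowed set (for example, excluded by a
    cpuset) are dropped; a ValueError is raised if none remain.
    """
    allowed = os.sched_getaffinity(0)  # pid 0 means the calling process
    target = set(cores) & allowed
    if not target:
        raise ValueError(f"none of {sorted(cores)} is in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    before = os.sched_getaffinity(0)
    print("allowed before:", sorted(before))
    # Pin to the lowest-numbered allowed core as a demonstration.
    print("pinned to:", sorted(pin_current_process([min(before)])))
    os.sched_setaffinity(0, before)  # restore the original mask
```

Threads created after the call inherit the mask, so pinning early in startup keeps the whole process on the chosen cores.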
Ultra low latency systems for High Frequency Trading
Low power choices for grid computing
- SL200s servers with GPU options
- Moonshot program for ARM, Atom, Phi
Open reference architecture for unstructured data
Quality infrastructure for IT cost reduction
SL250s
HPC optimized for efficiency and density, with balanced GPU performance
- Purpose-built for HPC performance at scale
- Up to 1 integrated I/O Accelerator
- Maximum speed FDR IB FlexibleLOM
- Multi-node 1/2U density and efficiency
- Enhanced, simple front serviceability
- Rack level power management
- Industry Leading Mgmt with Insight Control*
- Purpose-built for HPC performance at scale
- Up to 3 integrated GPUs
- Maximum speed FDR IB FlexibleLOM
- Multi-node 1U density and efficiency
- Enhanced, simple front serviceability
- Rack level power management
- Industry Leading Mgmt with Insight Control*
[Diagram labels: CPU, PCI Express 3.0, GPU, Mellanox VPI]
Availability: GPUDirect RDMA requires CUDA 5.0 and MLNX_OFED driver changes (beta 9/12 with expected GA by 12/12).
- 1 DL380 control node w/ E5-2670 8-core 2.6GHz 115W CPUs, 64 GB RAM and 2x 600 GB HDD
- 1 SL6500 enclosure
- 4 SL250s 2U server trays w/ E5-2670 8-core 2.6GHz 115W CPUs, 64 GB RAM, 600 GB HDD, 2 Nvidia M2090 GPU modules
- Mellanox IB 4x QDR 36-port managed switch
- HPN ProCurve 2910 24-port 10/100/1000 Ethernet switch
- RHEL, CMU, Linux Value Pack
- Rack and infrastructure
- Hardware/Software Integration

- Development Environment for commercial, enterprise, Higher Ed, ISVs
- CUDA Programming Environment
- Proof-of-concept environment for channel partners

- Full trade history logs and analytics
- Venue latencies
- Transaction Costs
- Risk Analytics
- Matching
- Execution
- Online Risk Management
Federated
Management, Fabric, Storage Networking, Power/Cooling
HP Redstone
Server Development Platform
HP Discovery Lab
Proof of Concept Lab
HP Pathfinder Program
Partner Collaboration
- Pooled power: 4 common slot power supplies
- Shared cooling: 8 shared fans, N+1, rear-serviceable
- Integrated, configurable network fabric with up to 16 10Gb uplinks
- Calxeda EnergyCore quad-core ARM SoCs w/ 4MB L2 cache
- Up to 4GB ECC (up to 1333 MHz) memory per server
- Integrated management
- Diskless or up to 4 SATA drives (1 drive cartridges) per server
- Up to 192 SSD or 96 2.5" SFF HDD per enclosure
$3.3M: 400 servers, 10 racks, 20 switches, 1,600 cables, 91 kilowatts
$1.2M: 89% less energy, 94% less space, 63% less cost, 97% less complexity
Select hyperscale web and data analytics applications show tremendous promise
Based on weighted average performance projections for workloads such as web serving, memcached, and Data Analytics. Cost estimates include infrastructure, space, and power and cooling costs over three years.
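The headline percentages can be cross-checked against the absolute figures on the slide. The sketch below is illustrative arithmetic only, using the rounded $3.3M, $1.2M, and 91 kilowatt values as printed; the function names are hypothetical and the results inherit the rounding of the inputs.

```python
# Cross-check the slide's cost and energy percentages from its absolute figures.

def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


def remaining(before, percent_less):
    """Quantity left after reducing `before` by `percent_less` percent."""
    return before * (1.0 - percent_less / 100.0)


if __name__ == "__main__":
    cost_cut = percent_reduction(3.3, 1.2)  # $M over three years
    print(f"cost reduction: {cost_cut:.1f}%")
    print(f"power after 89% cut: {remaining(91, 89):.1f} kW")
```

With the rounded dollar figures the cost reduction comes out slightly above 63%, in line with the "63% less cost" claim, and an 89% energy cut from 91 kilowatts leaves about 10 kilowatts.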
What is Hadoop?
Your data is going unstructured
- The digital universe will expand by almost half in 2012; 90% of that data is unstructured
- Traditional systems are not designed to analyze unstructured data
- Hadoop is designed specifically to extract business value from unstructured data
Risk Modeling
Fraud Detection
Sentiment Analysis
Customer Retention
Web Mining
Financial Services
Government
Retail
Telecom
Media
HP Insight CMU
Business Users
Vertica
Meaning Based Analytics
Autonomy IDOL
Consulting Services
HP HyperStorage Server
Address the explosion of data permeating the data center
ProLiant SL 4500
Shared SL 4500 HyperStorage chassis
- Pooled power: 4 HP common slot power supplies
- Shared cooling: 10 shared fans, N+1, rear-serviceable
- Shared management: reduced cabling with single iLO port
Dual node
- Single server model gives the most dense storage solution for massive data stores
- Triple server gives users an optimal mix of storage and compute for working inside large unstructured datasets
- Dual server provides an optimal mix of high density storage and compute
Ultra low latency systems for High Frequency Trading
Low power choices for grid computing
Open reference architecture for unstructured data
HP ProLiant Gen8:
3X admin productivity improvement
6X performance increase for the most demanding workloads
70% more compute per watt
66% faster time to problem resolution
HP Smart Storage
HP FlexNet Adapters, Insight Online, Virtual Connect, Sea of Sensors 3D, ProLiant Operating Environment, Datacenter Smart Grid
Integrated Lifecycle Automation / Dynamic Workload Acceleration / Automated Energy Optimization / ProActive Service and Support
- Address explosive data growth: 2X # of drives supported (up to 227)
- Minimize data loss: long-term data retention with Flash Backed Write Cache standard
- External model with SAS cable connectors for extending the RAID set to JBODs
- Reduce initial setup time: 95% reduction in parity initialization, from several days to 5 hours**
* 256KiB, sequential write, RAID 5 with 15K SAS drives; performance will vary based on configuration
** HP R&D; validation information TBD
Thank you
Low.latency@hp.com