at JCAHPC
Taisuke Boku
Vice Director, JCAHPC
University of Tsukuba
[Figure: peak performance (PFLOPS, log scale from 1 to 1000) of Japanese supercomputers vs. year (2008-2020): T2K (U. of Tsukuba, U. of Tokyo, Kyoto U.), TSUBAME2.0 (Tokyo Tech.), OFP by JCAHPC (U. Tsukuba and U. Tokyo) at the Kashiwa Campus of U. Tokyo, and the Post K Computer (RIKEN R-CCS, formerly AICS) targeting exascale. Tier-1 and Tier-2 supercomputers form HPCI and move forward to exascale computing like two wheels.]
• 25 PFLOPS peak
• 8208 KNL CPUs
• Full-bisection-bandwidth (FBB) Fat-Tree by OmniPath
• HPL 13.55 PFLOPS: #1 in Japan, #6 ➙ #7 in TOP500
• HPCG: #3 ➙ #5
• Green500: #6 ➙ #21
• Water cooling (wheel & pipe)
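As a quick sanity check on the figures above (a sketch; only the slide's numbers are used, and the per-node value is a derived estimate, not a quoted spec):

```python
# Numbers from the slide: 25 PFLOPS peak, 8208 KNL CPUs, HPL 13.55 PFLOPS.
peak_pflops = 25.0
hpl_pflops = 13.55
nodes = 8208

per_node_tflops = peak_pflops * 1000 / nodes  # PFLOPS -> TFLOPS per node
hpl_efficiency = hpl_pflops / peak_pflops

print(f"peak per node: {per_node_tflops:.2f} TFLOPS")  # ~3.05 TFLOPS
print(f"HPL efficiency: {hpl_efficiency:.1%}")         # ~54.2%
```

The roughly 3.05 TFLOPS per node is consistent with one Xeon Phi (KNL) processor per node.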
[Figure: Fat-Tree interconnect built from 362 48-port edge switches, each with 24 uplinks and 24 downlinks.]

Connected endpoints:
• Compute nodes: 8208
• Login nodes: 20
• Parallel FS: 64
• IME: 300
• Mgmt, etc.: 8
• Total: 8600

Firstly, to reduce switches & cables, we considered:
• connecting all the nodes within subgroups with an FBB Fat-Tree
• connecting the subgroups with each other at >20% of FBB
But the HW quantity is not so different from a globally FBB network, and a globally FBB network is preferred for flexible job management.
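The edge-switch count above follows from simple port arithmetic; a sketch, assuming half of each 48-port switch's ports face downward as in the figure:

```python
import math

ports_per_edge_switch = 48
downlinks = ports_per_edge_switch // 2  # 24 down + 24 up for full bisection bandwidth

# Endpoint counts from the slide.
endpoints = 8208 + 20 + 64 + 300 + 8    # compute + login + parallel FS + IME + mgmt
assert endpoints == 8600

min_edge_switches = math.ceil(endpoints / downlinks)
print(min_edge_switches)  # 359 is the arithmetic minimum; the slide lists 362
```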
[Figure: application fields on the system: Engineering, Earth/Space, Material, Energy/Physics, Information Sci., Education, Industry, Bio, Social Sci. & Economics, and Data; representative applications include NICAM-COCO, GHYDRA, Seism3D, SALMON/ARTED, and Lattice QCD.]
[Chart: best-case vs. worst-case node performance on several materials, normalized to the best case (e.g., 0.925 vs. 0.764, and 0.127 vs. 0.109 and 0.095).]
• A performance gap exists for every material tested
• The load imbalance is non-algorithmic:
  ➙ dynamic clock adjustment (DVFS) under turbo boost is applied individually on each processor
  ➙ it is observed even under the same conditions across nodes
  ➙ KNL is more sensitive to it than Xeon
  ➙ it causes serious performance degradation on synchronized large-scale systems
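The "normalized to best case" metric in the chart can be reproduced from raw per-node timings; a minimal sketch with hypothetical numbers (not measured OFP data):

```python
# Hypothetical runtimes (seconds) of the same kernel on four nodes under
# identical conditions; the spread stands in for per-processor DVFS/turbo effects.
times = {"node01": 10.9, "node02": 11.8, "node03": 10.5, "node04": 13.2}

best = min(times.values())
normalized = {node: best / t for node, t in times.items()}  # 1.0 = fastest node

worst = min(normalized.values())
print(f"worst node runs at {worst:.3f} of best-case speed")
```

In a bulk-synchronous run every process waits at the barrier for the slowest node, so the whole job effectively proceeds at the worst node's speed, which is why the degradation grows with synchronized scale.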
[Diagram: two compute nodes, each with an FPGA attached over PCIe; the FPGAs perform computation and collective or specialized communication, overlapping communication with computation over a direct QSFP28 interconnect, alongside an Ethernet switch.]
• CPU: Xeon E5-2660 v4
• GPU: NVIDIA P100 x2
• FPGA: Bittware A10PL4 (QSFP+: 40 Gbps x2)
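The point of the direct FPGA-to-FPGA link is to overlap communication with computation. A minimal host-side sketch of that pattern, using a background thread in place of the FPGA link (the kernel and all names are hypothetical):

```python
import threading
import queue

def compute(chunk):
    # Stand-in for the FPGA compute kernel.
    return sum(x * x for x in chunk)

def communicate(q, received):
    # Stand-in for streaming partial results over the QSFP28 link.
    while (item := q.get()) is not None:
        received.append(item)

q, received = queue.Queue(), []
sender = threading.Thread(target=communicate, args=(q, received))
sender.start()

for chunk in ([1, 2], [3, 4], [5, 6]):
    q.put(compute(chunk))  # the next chunk's compute overlaps this send

q.put(None)  # sentinel: no more data
sender.join()
print(sum(received))  # 91
```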