
Xu ZW. Disaggregated datacenters for future cloud computing. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 38(5): 947−948 Sept. 2023. DOI: 10.1007/s11390-023-0006-2

Disaggregated Datacenters for Future Cloud Computing

Zhi-Wei Xu (徐志伟), Fellow, CCF

Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China


Great Bay University, Dongguan 523000, China

E-mail: zxu@ict.ac.cn

Datacenters are warehouse-scale computers providing compute, storage and network capabilities. Datacenters
are engines of cloud computing and have seen rapid progress in the past two decades. A recent research
trend is disaggregated datacenters, as shown in Fig.1.
A conventional datacenter (Fig.1(a)) consists of N nodes linked into a whole system by a high-speed
interconnect (also called I/O fabric). Each node aggregates various resources into one physical server. The resources
of a server node include CPU chips, accelerator chips (e.g., GPUs), main memory, persistent storage, network
devices, etc. One or more server nodes are mounted on a rack. A datacenter consists of one or more racks of
nodes.
One form of disaggregated datacenter (Fig.1(b)) also consists of multiple nodes linked by an interconnect.
However, its nodes are organized quite differently. A CPU node may contain mainly CPU chips, with small
amounts of local memory, storage and network capability. Similarly, a memory node contains mainly memory
chips. Such specialization enables more efficient node design. Resources used by a cloud computing process are
disaggregated across multiple nodes and can scale to much larger sizes. For instance, a process running on a CPU core
can execute a load/store instruction to access not only the 4-GB local memory, but also the far memory on
other nodes, which could total over 1 PB.
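To make the load/store example concrete, the following is a minimal sketch of how a flat address might be split between a small local memory and far memory spread over memory nodes. It is an illustration only, not the mechanism of any particular system or of the highlighted paper. The names `Location` and `translate`, the per-node capacity of 1 TiB, and the count of 1024 memory nodes are all hypothetical; only the 4-GB local memory and the roughly 1-PB far memory come from the text. Real systems typically map memory at page granularity and rely on hardware or operating-system support.

```python
from dataclasses import dataclass

GIB = 2**30
PIB = 2**50


@dataclass(frozen=True)
class Location:
    kind: str    # "local" or "far"
    node: int    # index of the far-memory node (0 for local memory)
    offset: int  # byte offset inside that memory


def translate(addr: int, local_bytes: int, far_node_bytes: int, far_nodes: int) -> Location:
    """Map a flat byte address to local memory or to one far-memory node.

    Hypothetical layout: addresses below local_bytes are local; the rest are
    laid out contiguously, one far-memory node after another.
    """
    if addr < 0:
        raise ValueError("address must be non-negative")
    if addr < local_bytes:
        return Location("local", 0, addr)
    node, offset = divmod(addr - local_bytes, far_node_bytes)
    if node >= far_nodes:
        raise ValueError("address beyond total memory")
    return Location("far", node, offset)


LOCAL_BYTES = 4 * GIB          # the 4-GB local memory from the text
FAR_NODE_BYTES = 1024 * GIB    # assumption: 1 TiB per memory node
FAR_NODES = 1024               # assumption: 1024 memory nodes, 1 PiB in total

if __name__ == "__main__":
    # A load below 4 GiB stays on the CPU node; anything above goes to far memory.
    print(translate(0x1000, LOCAL_BYTES, FAR_NODE_BYTES, FAR_NODES))
    print(translate(LOCAL_BYTES + 5, LOCAL_BYTES, FAR_NODE_BYTES, FAR_NODES))
    # Ratio of far memory to local memory under these assumptions.
    print((FAR_NODES * FAR_NODE_BYTES) // LOCAL_BYTES)
```

Under these assumed sizes, the far memory is several orders of magnitude larger than the local memory, which is the point of the example in the text: the process sees one address space while most of the capacity lives on other nodes.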
In his Turing award lecture of 1999, Jim Gray posed 12 long-term research goals from the systems perspective
(“What next?: A dozen information-technology research goals”, JACM, 2003, 50 (1): 41-57). The first is the
scalability goal: devise a system architecture that scales up by a factor of 10^6, just by adding more resources.
Since Gray presented his vision, the aggregated datacenter approach has already achieved a scaling factor of
10^5 for supercomputing and 10^3 for big data processing. In the highlighted paper “Reinventing Cloud Computing
Systems for Resource Disaggregation” of this issue, Wang, Shan, Zuo, and Cui argue for the disaggregated
datacenter approach, supported by work from three best papers that appeared at top conferences (Shan et al., best
paper at OSDI 2018; Wang et al., best paper at OSDI 2022; Li et al., best paper at FAST 2023). They show that
with the advent of the I/O fabric, the system software stack has become the bottleneck. The authors also propose
a vision of a semantics-aware software stack for virtual disaggregated datacenters. Its architecture differs
from both architectures shown in Fig.1.

[Figure: (a) nodes that each combine CPU, GPU, memory, I/O and other resources; (b) separate CPU, GPU, memory and I/O nodes. In both panels the nodes are linked by an interconnect (I/O fabrics).]
Fig.1. (a) A traditional datacenter versus (b) a disaggregated datacenter.

Perspective
For Cover Article: Wang CX, Shan YZ, Zuo PF et al. Reinvent cloud computing systems for resource disaggregation. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 38(5): 949−969 Sept. 2023. DOI: 10.1007/s11390-023-3272-0
©Institute of Computing Technology, Chinese Academy of Sciences 2023
The future may belong to both the aggregated and the disaggregated architectures. It will be exciting to see
either win the race to realize Jim Gray’s goal of million-fold scalability.

Zhi-Wei Xu received his Ph.D. degree from the University of Southern California, Los Angeles.
He is a professor at the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, and
a chair professor at Great Bay University, Dongguan. His research areas include high-performance
computer architecture and distributed systems.
