Understanding Non-Uniform Memory Access – NUMA

Published by ANILPETWAL

Published by: ANILPETWAL on Dec 29, 2009
Copyright: Attribution Non-commercial

The microprocessor challenge: as clock speeds and the number of processors increase, it becomes increasingly difficult to reduce the memory latency required to use this additional processing power. To circumvent this, hardware vendors provide large L3 caches, but this is only a limited solution. NUMA architecture provides a scalable solution to this problem.
In NUMA systems, the nodes consist of processors and memory, and the nodes are interconnected. Each processor can access its local memory and can also access memory on the other nodes. Accessing remote memory has greater latency than accessing local memory, hence the name non-uniform memory access.
 
NUMA Concepts
The trend in hardware has been towards more than one system bus, each serving a small set of processors. Each group of processors has its own memory and possibly its own I/O channels. However, each CPU can access memory associated with the other groups in a coherent way. Each group is called a NUMA node. The number of CPUs within a NUMA node depends on the hardware vendor. It is faster to access local memory than the memory associated with other NUMA nodes. This is the reason for the name, non-uniform memory access architecture.

On NUMA hardware, some regions of memory are on physically different buses from other regions. Because NUMA uses local and foreign memory, it will take longer to access some regions of memory than others.
Local memory and foreign memory are typically used in reference to a currently running thread. Local memory is the memory that is on the same node as the CPU currently running the thread. Any memory that does not belong to the node on which the thread is currently running is foreign. Foreign memory is also known as remote memory. The ratio of the cost to access foreign memory over that for local memory is called the NUMA ratio. If the NUMA ratio is 1, it is symmetric multiprocessing (SMP). The greater the ratio, the more it costs to access the memory of other nodes. Windows applications that are not NUMA aware (including SQL Server 2000 SP3 and earlier) sometimes perform poorly on NUMA hardware.

The main benefit of NUMA is scalability. The NUMA architecture was designed to surpass the scalability limits of the SMP architecture. With SMP, all memory access is posted to the same shared memory bus. This works fine for a relatively small number of CPUs, but not when you have dozens, even hundreds, of CPUs competing for access to the shared memory bus. NUMA alleviates these bottlenecks by limiting the number of CPUs on any one memory bus and connecting the various nodes by means of a high-speed interconnection.
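The local/foreign distinction and the NUMA ratio described above can be made concrete with a small calculation. The following is a minimal sketch, not a measurement tool: the class `NumaLatency`, the helper names, and the latency figures (100 ns local, 150 ns foreign) are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaLatency:
    """Hypothetical access costs in nanoseconds (illustrative values only)."""
    local_ns: float
    foreign_ns: float

    @property
    def numa_ratio(self) -> float:
        # Cost of a foreign access divided by the cost of a local access.
        # A ratio of 1 corresponds to SMP, where all memory costs the same.
        return self.foreign_ns / self.local_ns


def memory_kind(thread_node: int, memory_node: int) -> str:
    """Classify memory relative to the node the thread currently runs on."""
    return "local" if thread_node == memory_node else "foreign"


def average_access_ns(lat: NumaLatency, local_fraction: float) -> float:
    """Mean access cost when `local_fraction` of accesses hit local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * lat.local_ns + (1.0 - local_fraction) * lat.foreign_ns


if __name__ == "__main__":
    lat = NumaLatency(local_ns=100.0, foreign_ns=150.0)
    print(lat.numa_ratio)                # 1.5
    print(memory_kind(0, 1))             # foreign
    print(average_access_ns(lat, 0.75))  # 112.5
```

Under these assumed numbers, a thread that keeps three quarters of its accesses on its own node pays 112.5 ns per access on average, which is one way to see why applications that are not NUMA aware can perform poorly: the more of their memory ends up foreign, the closer the average moves to the foreign cost.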
Hardware NUMA
 
Computers with hardware NUMA have more than one system bus, each serving a small set of processors. Each group of processors has its own memory and possibly its own I/O channels, but each CPU can access memory associated with other groups in a coherent way. Each group is called a NUMA node. The number of CPUs within a NUMA node depends on the hardware vendor. Your hardware manufacturer can tell you if your computer supports hardware NUMA.
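Besides asking the manufacturer, the operating system usually exposes the node layout. The sketch below assumes a Linux machine, where the kernel publishes one directory per node under `/sys/devices/system/node`, each with a `cpulist` file; the text above does not name any operating system interface, so this is only one possible way to inspect the topology. The function names are hypothetical, and the parser handles only plain ids and ranges such as `0-3,8-11`.

```python
from __future__ import annotations

from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a kernel CPU list such as "0-3,8-11" into individual CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(sysfs_root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map each NUMA node id to its CPUs; empty dict if the directory is absent."""
    root = Path(sysfs_root)
    nodes: dict[int, list[int]] = {}
    if not root.is_dir():
        return nodes
    for entry in root.glob("node[0-9]*"):
        cpulist = entry / "cpulist"
        if cpulist.is_file():
            # Directory names look like "node0", "node1", ...
            nodes[int(entry.name[4:])] = parse_cpulist(cpulist.read_text())
    return nodes


if __name__ == "__main__":
    for node_id, cpus in sorted(numa_nodes().items()):
        print(f"node {node_id}: {len(cpus)} CPUs -> {cpus}")
```

On a machine without hardware NUMA (or on a non-Linux system) the script prints at most a single node or nothing at all, which matches the point that the number of nodes and the CPUs per node depend on the hardware.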
The SMP and MPP Machine Architectures
This discussion requires a baseline understanding of symmetric multiprocessing (SMP) and massively parallel processing (MPP) machine architectures. SMP systems allow any processor to work on any task no matter where the data for that task are located in memory; with proper operating system support, SMP systems can easily move tasks between processors to balance the workload efficiently.
SMP – Symmetric Multiprocessing System (Max 8 Processors)
Stated simply, an SMP machine has memory and disk that are equally accessible to any processor (hence the term "symmetric"). And symmetric really means symmetric: all the processors have to be the same speed, the same stepping, and from the same manufacturer. They must be identical in every way. If you break any of these rules, you will get strange results; strange results from QueryPerformanceCounter will be the least of your problems. The physics behind a hardware bus plated into a circuit board limits how close they can be before electromagnetic interference becomes unmanageable.
Entry-Level Systems - Before about 2006, entry-level servers and workstations with two processors dominated the SMP market. With the introduction of dual-core devices, SMP is found in most new desktop machines and in many laptop machines. The most popular entry-level SMP systems use the x86 instruction set architecture and are based on Intel’s Xeon, Pentium D, Core Duo, and Core 2 Duo processors or AMD’s Athlon64 X2, Quad FX, or Opteron 200 and 2000 series processors. Servers use those processors and other readily available non-x86 processor choices, including the Sun Microsystems UltraSPARC, Fujitsu SPARC64 III and later, SGI MIPS, Intel Itanium, Hewlett-Packard PA-RISC, Hewlett-Packard (merged with Compaq, which had earlier acquired Digital Equipment Corporation) DEC Alpha, IBM POWER, and Apple Computer PowerPC (specifically G4 and G5 series, as well as earlier PowerPC 604 and 604e series) processors. In all cases, these systems are available in uniprocessor versions as well.

Mid-Range Servers - The Burroughs B5500 first implemented SMP in 1961; it was implemented later on other mainframes. Mid-level servers, using between four and eight processors, can be found using the Intel Xeon MP, AMD Opteron 800 and 8000 series, and the above-mentioned UltraSPARC, SPARC64, MIPS, Itanium, PA-RISC, Alpha, and POWER processors.
MPP – Massively Parallel Processing Architecture (Max 32 Processors)
