
INTEL NEHALEM-EX 8-CORE PROCESSORS

INTRODUCTION
Intel Corporation culminated the transition to the company’s award-winning “Nehalem” chip
design with the launch of the Intel® Xeon® 7500 processor series. In less than 90 days, Intel has
introduced all-new 2010 PC, laptop and server processors that increase energy efficiency and
computing speed and include a multitude of new features that make computers more intelligent,
flexible and reliable. Expandable from two to 256 chips per server, the new Intel Xeon
processors deliver, on average, three times the performance of Intel’s existing Xeon 7400 series
on common, leading enterprise benchmarks, and come equipped with more than 20 new reliability
features. Nehalem-EX is Intel’s fastest processor to date. Clock speeds are still a mystery,
but the company has said it will include 24 MB of cache and 2.3 billion transistors. The new
processor is targeted at high-end servers, especially cloud computing farms, and one of
the first implementers is expected to be IBM. The new architecture also has advanced
error-recovery mechanisms such as Machine Check Architecture (MCA) Recovery. As for fabrication,
it uses a 45-nanometer process; a 32 nm process is still very much a work in progress.

TWENTY OLD SERVERS – TO ONE NEW ONE


The combined scalable performance, advanced reliability and total cost of ownership advantages
of the Xeon 7500 series will further accelerate the shift from proprietary systems to industry-
standard Intel processor-based servers. These new capabilities enable IT managers to consolidate
up to 20 older single-core, 4-chip servers onto a single server using Intel Xeon 7500 series
processors while maintaining the same level of performance. In doing so, they could also see up
to a 92 percent estimated reduction in energy costs and a return on their investment estimated
within 1 year due to reductions in power, cooling and licensing costs.

“The Xeon 7500 brings mission critical capabilities to the mainstream by delivering the most
significant leap in performance, scalability and reliability ever seen from Intel,” said Kirk
Skaugen, vice president of the Intel architecture group and general manager of Intel’s data center
group. “This combination will help users push to new levels of productivity, and accelerate the
industry’s migration away from proprietary architectures.”
CODE NAMES
Each combination of a Nehalem/Westmere processor die and package has both a separate
codename and a product code. Typically, the same dies are used for uniprocessor (UP) and
dual-processor (DP) servers, with the DP variant using an extra QuickPath link for
inter-processor communication. Like the Core microarchitecture, Nehalem uses four different
processor packages, one for each market segment. The high-end chips with eight cores are only
available in the MP server segment, while all others are used in at least two different packages.

ADVANCED ERROR RECOVERY


Intel is also including new technologies such as Machine Check Architecture (MCA) Recovery,
an error-correction feature that could make servers more fault tolerant and provide greater
uptime. This capability is brand new to the Nehalem product family and will help businesses
lower their total cost of ownership. With MCA Recovery, an error can be isolated to a single
virtual machine, and the other virtual machines keep running uninterrupted while that one is
restarted. The processor can detect system errors originating in the CPU or system memory and
correct them by working with the operating system. Intel will also offer four memory channels
per processor; more channels provide more memory bandwidth, which lets programs run faster.
AMD’s 12-core Opteron server processors offer the same number of memory channels. Intel will
build large amounts of cache into Nehalem-EX, which could help the processor deliver faster
performance, but AMD’s Magny-Cours beats Nehalem-EX on the number of physical cores per chip.

45 NANOMETER
Intel started mass producing 45 nm chips in late 2007, and AMD started production of 45 nm
chips in late 2008, while IBM, Infineon, Samsung, and Chartered Semiconductor have already
completed a common 45 nm process platform. Many critical feature sizes are smaller than the
wavelength of light used for lithography (i.e., 193 nm and/or 248 nm). A variety of techniques,
such as larger lenses, are used to make sub-wavelength features. Double patterning has also been
introduced by Intel to assist in shrinking distances between features, especially if dry lithography
is used. It is expected that more layers will be patterned with 193 nm wavelength at the 45 nm
node. Moving previously loose layers (such as Metal 4 and Metal 5) from 248 nm to 193 nm
wavelength is expected to continue, which will likely further drive costs upward, due to
difficulties with 193 nm photoresists.
DOUBLE DATA RATE MEMORY
Double Data Rate (DDR) memory controllers are used to drive DDR SDRAM, where data is
transferred on both the rising and falling edges of the memory clock. DDR memory
controllers are significantly more complicated than Single Data Rate controllers, but allow
twice the data to be transferred without increasing the clock rate or the bus width to
the memory cell. Dual-channel memory controllers are memory controllers where the DRAM
devices are separated onto two different buses to allow two memory controllers to access them
in parallel. This doubles the theoretical bandwidth of the bus. In theory, more
channels can be built (a channel for every DRAM cell would be the ideal solution), but due to
wire count, line capacitance, and the need for parallel access lines to have identical lengths, more
channels are very difficult to add.
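
The doubling effects described above can be checked with a short calculation. The following is a minimal sketch only: the 400 MHz clock and 64-bit bus width are illustrative assumptions, not figures from this report, and the function name is hypothetical.

```python
def peak_bandwidth_bytes_per_s(clock_hz, bus_width_bits,
                               transfers_per_clock=2, channels=1):
    """Theoretical peak bandwidth of a memory bus, in bytes per second."""
    bytes_per_transfer = bus_width_bits // 8
    return clock_hz * transfers_per_clock * bytes_per_transfer * channels


# ASSUMPTIONS for illustration only: 400 MHz memory clock, 64-bit data bus.
CLOCK_HZ = 400_000_000
BUS_BITS = 64

# Single data rate: one transfer per clock cycle.
sdr = peak_bandwidth_bytes_per_s(CLOCK_HZ, BUS_BITS, transfers_per_clock=1)
# Double data rate: transfers on both the rising and falling clock edges.
ddr = peak_bandwidth_bytes_per_s(CLOCK_HZ, BUS_BITS, transfers_per_clock=2)
# Dual channel: two independent buses accessed in parallel.
dual = peak_bandwidth_bytes_per_s(CLOCK_HZ, BUS_BITS,
                                  transfers_per_clock=2, channels=2)

print(f"SDR, one channel:  {sdr / 1e9:.1f} GB/s")   # 3.2 GB/s
print(f"DDR, one channel:  {ddr / 1e9:.1f} GB/s")   # 6.4 GB/s
print(f"DDR, two channels: {dual / 1e9:.1f} GB/s")  # 12.8 GB/s
```

These are theoretical peaks; real throughput is lower because of refresh cycles, bus turnaround and access latency.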

FULLY BUFFERED MEMORY


Fully buffered memory systems place a memory buffer device on every memory module (called
an FB-DIMM when Fully Buffered RAM is used). Unlike traditional memory controller
devices, these buffers use a serial data link to the memory controller instead of the parallel
link used in previous RAM designs. This decreases the number of wires needed to place the memory
devices on a motherboard (allowing for a smaller number of layers to be used, meaning more
memory devices can be placed on a single board), at the expense of increasing latency (the time
necessary to access a memory location). This increase is due to the time required to convert the
parallel information read from the DRAM cell to the serial format used by the FB-DIMM
controller, and back to a parallel form in the memory controller on the motherboard. In theory,
the FB-DIMM's memory buffer device could be built to access any DRAM cells, allowing for
memory cell agnostic memory controller design, but this has not been demonstrated, as the
technology is in its infancy.

LAYER ARCHITECTURE
The physical layer of the QuickPath Interconnect (QPI) comprises the actual wiring and the
differential transmitters and receivers, plus the lowest-level logic that transmits and receives
the physical-layer unit, the 20-bit "phit." The physical layer transmits a phit using a single
clock edge on 20 lanes when all 20 lanes are available, or on 10 or 5 lanes when the QPI is
reconfigured due to a failure. The link layer is responsible for sending and receiving 80-bit
flits. Each flit is sent to
the physical layer as four 20-bit phits. Each flit contains an 8-bit CRC generated by the link layer
transmitter and a 72-bit payload. If the link layer receiver detects a CRC error, the receiver
notifies the transmitter via a flit on the return link of the pair and the transmitter resends the flit.
The link layer implements flow control using a credit/debit scheme to prevent the receiver's
buffer from overflowing. The link layer supports six different classes of message so that the
higher layers can distinguish data flits from non-data messages, which are used primarily to
maintain cache coherence. In complex implementations of the QuickPath architecture, the link
layer can be configured to maintain separate flows and flow control for the different classes.
It is not clear whether this is needed or implemented for single-processor and dual-processor
implementations.
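
The framing described above (a 72-bit payload plus an 8-bit CRC forming an 80-bit flit, sent as four 20-bit phits) can be sketched as follows. This is a minimal illustration under stated assumptions, not Intel's implementation: the CRC polynomial, the position of the CRC within the flit, the phit ordering and all function and class names are hypothetical, since the text does not specify them.

```python
PHIT_BITS = 20       # physical-layer unit (from the text)
PHITS_PER_FLIT = 4   # one 80-bit flit is sent as four phits (from the text)
PAYLOAD_BITS = 72    # link-layer payload (from the text)
CRC_BITS = 8         # link-layer CRC (from the text)
CRC_POLY = 0x07      # ASSUMPTION: a generic CRC-8 polynomial, not Intel's


def crc8(payload):
    """Bitwise CRC-8 over the 72-bit payload, most significant bit first."""
    crc = 0
    for i in range(PAYLOAD_BITS - 1, -1, -1):
        feedback = ((crc >> 7) & 1) ^ ((payload >> i) & 1)
        crc = (crc << 1) & 0xFF
        if feedback:
            crc ^= CRC_POLY
    return crc


def build_flit(payload):
    """Pack payload and CRC into one flit (CRC in the low 8 bits: assumed layout)."""
    if not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("payload must fit in 72 bits")
    return (payload << CRC_BITS) | crc8(payload)


def flit_to_phits(flit):
    """Split an 80-bit flit into four 20-bit phits, most significant first."""
    mask = (1 << PHIT_BITS) - 1
    return [(flit >> (PHIT_BITS * i)) & mask
            for i in reversed(range(PHITS_PER_FLIT))]


def phits_to_flit(phits):
    """Reassemble a flit from its phits on the receiving side."""
    flit = 0
    for phit in phits:
        flit = (flit << PHIT_BITS) | phit
    return flit


def crc_ok(flit):
    """Receiver check: recompute the CRC and compare with the one carried."""
    return crc8(flit >> CRC_BITS) == (flit & 0xFF)


class CreditCounter:
    """Sender-side view of credit/debit flow control."""

    def __init__(self, receiver_buffer_slots):
        self.credits = receiver_buffer_slots

    def try_send(self):
        if self.credits == 0:
            return False       # no free receiver slot known: hold the flit
        self.credits -= 1      # debit one credit per flit sent
        return True

    def credit_returned(self):
        self.credits += 1      # receiver freed a buffer slot


payload = 0x0123456789ABCDEF01          # arbitrary 72-bit example value
flit = build_flit(payload)
phits = flit_to_phits(flit)
print(crc_ok(phits_to_flit(phits)))     # True: intact flit passes the check

corrupted = list(phits)
corrupted[2] ^= 1                       # flip one bit in transit
print(crc_ok(phits_to_flit(corrupted))) # False: receiver would request a resend
```

A failed check corresponds to the case in which the receiver notifies the transmitter over the return link and the flit is resent; the credit counter keeps the sender from overrunning the receiver's buffer in the meantime.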

The routing layer sends a 72-bit unit consisting of an 8-bit header and a 64-bit payload. The
header contains the destination and the message type. When the routing layer receives a unit, it
examines its routing tables to determine if the unit has reached its destination. If so, it is delivered
to the next-higher layer. If not, it is sent on the correct outbound QPI. On a device with only one
QPI, the routing layer is minimal. For more complex implementations, the routing layer's routing
tables are more complex, and are modified dynamically to avoid failed QPI links.
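
The routing decision described above can be sketched in a few lines. This is an illustrative model only: the text says the 8-bit header carries the destination and message type but not how the bits are divided, so the 4-bit/4-bit split, the link names and the `Router` class are all assumptions.

```python
PAYLOAD_BITS = 64
LOCAL = "local"   # marker meaning "deliver to the next-higher layer"


def make_unit(dest, msg_type, payload):
    """Build a 72-bit routing unit: 8-bit header plus 64-bit payload.

    ASSUMPTION: the header is split into a 4-bit destination and a
    4-bit message type; the real field layout is not given in the text.
    """
    header = ((dest & 0xF) << 4) | (msg_type & 0xF)
    return (header << PAYLOAD_BITS) | (payload & ((1 << PAYLOAD_BITS) - 1))


def parse_unit(unit):
    """Return (destination, message type, payload) from a routing unit."""
    header = unit >> PAYLOAD_BITS
    return header >> 4, header & 0xF, unit & ((1 << PAYLOAD_BITS) - 1)


class Router:
    def __init__(self, node_id, routes):
        # routes maps a destination id to candidate outbound links,
        # listed in order of preference (hypothetical link names).
        self.node_id = node_id
        self.routes = routes
        self.failed = set()

    def mark_failed(self, link):
        """Record a failed link so later lookups route around it."""
        self.failed.add(link)

    def route(self, unit):
        dest, _msg_type, _payload = parse_unit(unit)
        if dest == self.node_id:
            return LOCAL
        for link in self.routes[dest]:
            if link not in self.failed:
                return link
        raise RuntimeError(f"no working link to node {dest}")


router = Router(node_id=0, routes={1: ["qpi0", "qpi1"], 2: ["qpi1", "qpi0"]})
unit = make_unit(dest=1, msg_type=3, payload=0xDEADBEEF)

print(router.route(unit))                # qpi0
router.mark_failed("qpi0")
print(router.route(unit))                # qpi1, routed around the failed link
print(router.route(make_unit(0, 3, 0)))  # local
```

On a device with a single QPI link the table would have one entry per destination, which matches the remark that the routing layer is minimal in that case.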

NEW STANDARDS IN RELIABILITY AND SCALABILITY


Customers running mission-critical workloads that simply cannot afford unscheduled downtime,
such as hospitals or stock exchanges, can take advantage of more than 20 new features that
deliver a leap forward in reliability, availability and serviceability (RAS). These reliability
capabilities are designed to improve the protection of data integrity, increase availability and
minimize planned downtime.

For example, this is the first Xeon processor to possess Machine Check Architecture (MCA)
Recovery, a feature that allows the silicon to work with the operating system and virtual machine
manager to recover from otherwise fatal system errors, a mechanism until now found only in the
company’s Intel® Itanium® processor family and RISC processors. The Intel Xeon processor
7500 series offers unique scalability through modular building blocks enabled by the Intel®
QuickPath Interconnect (QPI). With QPI, vendors can build cost-effective and highly scalable
eight-processor servers that don’t require specialized third-party node controller chips to
“glue” the system together. Intel is also working with system vendors to deliver “ultra-scale”
systems with 16 processors for the enterprise, and up to 256 processors and support for 16
terabytes of memory for high-performance computing “super nodes” running bandwidth-demanding
applications such as financial analysis, numerical weather prediction and genome
sequencing.
LARGE-SCALE VIRTUALIZATION
The Intel Xeon processor 7500 series meets the growing trend of IT organizations virtualizing
large mission-critical workloads for applications such as Enterprise Resource Planning. With up
to eight times the memory bandwidth of the Intel Xeon processor 7400 series and four times the
memory capacity, with 16 memory slots per processor, the Xeon 7500 series can support one
terabyte (or 1,000 gigabytes) of memory in a four-socket platform. Intel Virtualization
Technologies, which include new I/O virtualization capabilities and Intel® Virtualization
Technology (VT) FlexMigration, enable live VM migration across all Intel® Core™
microarchitecture-based platforms. This protects the investment of administrators who
use pools of virtualized systems to facilitate failover, disaster recovery, load balancing and
optimal server maintenance and downtime.
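
The one-terabyte figure can be sanity-checked with simple arithmetic. The sketch below takes the socket and slot counts from the paragraph above, but the 16 GB module size is an assumption, since the text does not say which DIMM capacity the figure is based on.

```python
SOCKETS = 4            # four-socket platform (from the text)
SLOTS_PER_SOCKET = 16  # memory slots per processor (from the text)
DIMM_GB = 16           # ASSUMPTION: 16 GB modules; not stated in the text


def platform_memory_gb(sockets, slots_per_socket, dimm_gb):
    """Total memory when every slot holds a module of the same size."""
    return sockets * slots_per_socket * dimm_gb


total_slots = SOCKETS * SLOTS_PER_SOCKET
total_gb = platform_memory_gb(SOCKETS, SLOTS_PER_SOCKET, DIMM_GB)
print(f"{total_slots} slots x {DIMM_GB} GB = {total_gb} GB")
# 64 slots x 16 GB = 1024 GB
```

Under that assumption the platform holds 1,024 GB, which is one terabyte in binary units and is consistent with the rounded figure of 1,000 gigabytes quoted above.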

TWO-CHIP AND COST OPTIMIZED SERVERS


New two-chip expandable class platforms with large memory capacity based on the Intel Xeon
processor 7500 series are ideal for memory intensive databases and virtualization environments.
The Intel Xeon processor 7500 series is available in four-, six- and eight-core versions with twice
the number of threads thanks to Intel Hyper-Threading Technology. The Intel Xeon processor
6500 series provides a lower cost solution for 2-chip servers with large memory requirements.

PRODUCT SPECIFICATION
The Intel Xeon processor 7500 series supports up to eight integrated cores and 16 threads per
chip, and can scale up to 32 cores and 64 threads per 4-chip platform or 64 cores and 128 threads
per 8-chip platform. It is available with frequencies up to 2.66 GHz, 24 MB of Intel® Smart
Cache memory, four Intel QPI links and Intel Turbo Boost technology. Thermal Design Point
(TDP) power levels range from 95 watts to 130 watts.
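
The core and thread totals quoted above follow directly from the per-chip figures. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
CORES_PER_CHIP = 8     # top-end Xeon 7500 part (from the text)
THREADS_PER_CORE = 2   # Hyper-Threading doubles the thread count (from the text)


def platform_totals(chips):
    """Return (cores, threads) for a platform with the given number of chips."""
    cores = chips * CORES_PER_CHIP
    return cores, cores * THREADS_PER_CORE


for chips in (1, 4, 8):
    cores, threads = platform_totals(chips)
    print(f"{chips}-chip platform: {cores} cores, {threads} threads")
# 1-chip platform: 8 cores, 16 threads
# 4-chip platform: 32 cores, 64 threads
# 8-chip platform: 64 cores, 128 threads
```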

The Intel Xeon processor X7560, with eight cores and 24MB cache size, is built for highly
parallel, data demanding and mission-critical workloads, whereas the Intel Xeon processor
X7542 is a frequency-optimized 6-core option at 2.66 GHz targeted for super node high-
performance computing applications in science and financial services.
CONCLUSION
The innovative modular scaling of the Xeon® 7500 processor works with the Intel 7500 Chipset
and Intel 7500 Scalable Memory Buffers to enable unique OEM system designs, bringing a
wide range of socket, memory, I/O, form factor and reliability feature sets never before
available to the mainstream server market. Enterprise software vendors expected to support the
high-end features of Intel Xeon processor 7500-based platforms include Citrix*, IBM*,
Microsoft*, Novell*, Oracle*, Red Hat*, SAP AG* and VMware*. System vendors are lining up
to take advantage of the high-end Intel Xeon processor 7500 series capabilities and deliver
highly innovative solutions at much lower costs than older proprietary solutions. With more than
double the number of designs compared with the previous-generation Intel Xeon processor 7400
series, system manufacturers are expected to announce systems based on the Intel Xeon processor
7500/6500 series starting today.