Multi-Core Processors: A Seminar

• Presented by: Rakesh Babu G R, 04EC45, Dept. of E&C, NITK, Surathkal

• What?
• Block diagram
• Why? Yet another brick wall
• Changes in software
• Advantages
• Disadvantages
• Summary

“Moore’s Law was over (doubling of CPU power every 18 months). Intel said, in essence, ‘Sorry, we’re not going to build you 10GHz processors - but we will let you have lots of 3GHz ones.’” [1]

Welcome to the world of “multi-core” processor chips

What are Multi-core processors?
• 2 or more execution cores within one processor
• Plugs into a single processor socket
• The operating system sees each core as a separate processor (a dual-core chip appears as two processors) [3]
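The last point can be checked from software. The following is a minimal sketch using only the Python standard library: `os.cpu_count()` reports how many logical processors the operating system exposes. On a multi-core chip this counts every core (and every hardware thread, if the chip has them), even though there is only one socket.

```python
import os

# Number of logical processors the operating system exposes.
# The call may return None if the platform cannot determine it.
logical_cpus = os.cpu_count()

if logical_cpus is None:
    print("The OS did not report a processor count")
else:
    print(f"The OS sees {logical_cpus} logical processor(s)")
```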

Block diagram [3]

“The power wall + the memory wall + the ILP wall = a brick wall for serial performance.” - David Patterson, “Father of RISC” [5]

“Power is expensive, but transistors are ‘free’. That is, we can put more transistors on a chip than we have the power to turn on.” => Power wall [5]

Power wall

Power wall [6]

Power wall
• Cooling technology can offset the increase in power
• But cooling has its limits
• Instead, build 2 processors out of the available transistors and run them at a lower frequency
• 2 processors => more throughput, despite the lower clock
• Example: Pentium 4 (single core) @ 3 GHz vs. Pentium D (dual core) @ 2.67 GHz
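The trade-off in the bullets above can be sketched numerically. This is a rough model, not a measurement: it assumes dynamic power per core follows P = C * V^2 * f, that supply voltage scales in proportion to frequency (so per-core power goes as the cube of the clock), and that the workload parallelises perfectly. The 80% clock figure and the function names are illustrative assumptions, not Pentium data.

```python
def relative_power(freq_scale: float, cores: int = 1) -> float:
    """Dynamic power relative to one core at full clock.

    Assumes P = C * V^2 * f per core with voltage proportional to
    frequency, so per-core power scales as freq_scale ** 3.
    """
    return cores * freq_scale ** 3


def relative_throughput(freq_scale: float, cores: int = 1) -> float:
    """Ideal throughput, assuming the work splits perfectly across cores."""
    return cores * freq_scale


# One core at full clock versus two cores at 80% clock (illustrative value).
single = (relative_power(1.0), relative_throughput(1.0))
dual = (relative_power(0.8, cores=2), relative_throughput(0.8, cores=2))

print(f"1 core  @ 100% clock: power {single[0]:.2f}, throughput {single[1]:.2f}")
print(f"2 cores @  80% clock: power {dual[0]:.2f}, throughput {dual[1]:.2f}")
```

Under these assumptions the two slower cores draw about 1.02 times the power of the single fast core while offering up to 1.6 times its throughput.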

“Load and store is slow, but multiply is fast. Modern microprocessors can take 200 clocks to access Dynamic Random Access Memory (DRAM), but even floating-point multiplies may take only four clock cycles.” [5]

“And increasing the size of the already great big caches traditionally used to mask memory latency aren't giving us a good return on the transistor investment anymore.” => Memory wall [5]

Memory wall
• Traditional approach: scale frequency + increase cache
• Doesn’t work now
• Increasing cache size no longer improves access time by much
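The diminishing return from bigger caches can be illustrated with the standard average memory access time formula, AMAT = hit time + miss rate * miss penalty. The sketch below uses the 200-clock DRAM penalty from the quote above; the 4-cycle hit time, the 5% base miss rate, and the rule of thumb that miss rate falls with the square root of cache size are assumptions made for illustration only.

```python
import math

HIT_TIME = 4.0          # cycles (hypothetical)
BASE_MISS_RATE = 0.05   # miss rate at the baseline cache size (hypothetical)
MISS_PENALTY = 200.0    # cycles to reach DRAM, as in the quote above


def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time in cycles."""
    return hit_time + miss_rate * miss_penalty


def amat_for_cache(size_factor: float) -> float:
    """AMAT when the cache is size_factor times the baseline size.

    Assumes miss rate is proportional to 1 / sqrt(cache size), a rough
    rule of thumb rather than a property of any particular processor.
    """
    miss_rate = BASE_MISS_RATE / math.sqrt(size_factor)
    return amat(HIT_TIME, miss_rate, MISS_PENALTY)


for factor in (1, 2, 4, 8):
    print(f"cache x{factor}: AMAT = {amat_for_cache(factor):.2f} cycles")
```

In this model each doubling of the cache saves fewer cycles than the previous doubling, which is the poor "return on the transistor investment" the quote describes.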

Memory wall [5]

Memory wall
• Transistor density is still increasing at the rate predicted by Moore
• But performance has not kept pace since 2002
• Reason: memory access time hasn’t improved
• So use the excess transistors for 2 cores
• And run them at a lower frequency, at which the memory access penalty (in clock cycles) is not that bad
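Why a lower clock softens the memory penalty: if DRAM latency is roughly fixed in nanoseconds, the number of core cycles lost per access grows in step with the clock rate. A small sketch, assuming a hypothetical 60 ns DRAM latency (not a figure from the sources):

```python
DRAM_LATENCY_NS = 60.0  # hypothetical, fixed regardless of core clock


def stall_cycles(dram_latency_ns: float, clock_ghz: float) -> float:
    """Core clock cycles that elapse during one DRAM access.

    One GHz is one cycle per nanosecond, so cycles = ns * GHz.
    """
    return dram_latency_ns * clock_ghz


for clock_ghz in (2.0, 3.0, 4.0):
    cycles = stall_cycles(DRAM_LATENCY_NS, clock_ghz)
    print(f"{clock_ghz:.1f} GHz core: {cycles:.0f} cycles lost per DRAM access")
```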

Case study: IBM POWER6 [2]
• Source: a Sun employee, so we have to take it with a pinch of salt
• What did IBM do?
• They more than doubled their clock rate (2.2 GHz to 4.7 GHz)
• They quadrupled the size of their on-chip L2 caches (1.92 MB on POWER5+, 8 MB on POWER6)

And what did they get for their efforts? [2]

Memory wall
• POWER6: diminishing performance returns
• Why? It has pinned its hopes on old, unimaginative, and out-of-date techniques that the rest of the industry has largely abandoned
• “You may ask, how did this tradition get started? I'll tell you. I don't know.”

“There are diminishing returns on finding more ILP. ... Increasing parallelism is the primary method of improving processor performance” => ILP wall [5]

What is ILP?
• ILP: Instruction-Level Parallelism
• A source of speed-up apart from frequency scaling
• “Speed-up by having duplicate hardware speculatively execute future instructions before the results of current instructions are known while providing hardware safeguards to prevent the errors that might be caused by out of order execution” [4]

ILP wall
• Hardware cost grows super-linearly
• Performance gain is not even linear
• The speed-up can’t even be predicted, because of its nondeterministic (probabilistic) nature
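One reason extra hardware gives less-than-linear gains is that data dependencies cap how many instructions can run side by side. The sketch below is a toy model, not a description of any real processor: it assumes every instruction takes one step, there are unlimited execution units, and the dependency graph has no cycles. The six-instruction program is hypothetical.

```python
from typing import Dict, List


def critical_path_length(deps: Dict[str, List[str]]) -> int:
    """Length of the longest dependency chain, counted in instructions."""
    depth: Dict[str, int] = {}

    def visit(name: str) -> int:
        # An instruction can start one step after its latest dependency.
        if name not in depth:
            depth[name] = 1 + max((visit(d) for d in deps[name]), default=0)
        return depth[name]

    return max(visit(name) for name in deps)


# Hypothetical program: each instruction lists the results it waits for.
program = {
    "a": [],
    "b": [],
    "c": ["a", "b"],
    "d": ["c"],
    "e": [],
    "f": ["d", "e"],
}

steps = critical_path_length(program)
ideal_ilp = len(program) / steps
print(f"{len(program)} instructions, critical path {steps}, ideal ILP {ideal_ilp:.2f}")
```

Here the chain a -> c -> d -> f forces at least four steps, so six instructions average at most 1.5 per step no matter how many execution units are added.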

Advantages [3]
• They scale the brick wall, i.e. the power wall + memory wall + ILP wall
• Having two execution cores on the same die lets signals between them (e.g. cache-coherency traffic) run at a higher clock rate than if they had to travel off-chip
• They occupy less board space than several separate processors

Disadvantages [3]
• The operating system has to be modified to utilize the increased resources
• The thousands of applications that we run have to be rewritten to fully utilize the improved hardware and the parallelism it offers
• Software developers have to be trained to write better software for dual-, quad- and other multi-core processors, which costs money (€€€)

Moral of the story
• Multi-core processors are the way to increase processor performance
• The OS and software must be rewritten to exploit the parallelism obtained
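Why the rewrite matters can be estimated with Amdahl's law, which is not part of the original slides but is the usual model for this point: if only a fraction p of a program can run in parallel, n cores give a speed-up of 1 / ((1 - p) + p / n). The fractions below are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speed-up predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative parallel fractions: unmodified, partly and mostly parallel code.
for fraction in (0.0, 0.5, 0.9):
    for cores in (2, 4):
        speedup = amdahl_speedup(fraction, cores)
        print(f"parallel fraction {fraction:.1f}, {cores} cores: {speedup:.2f}x")
```

Software with no parallel portion gains nothing from the extra cores, which is the point of the second bullet above.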

An interesting thing
• Greed of corporations
• Some of them count each core on a die as a separate processor when licensing software
• So, if you use a dual-core processor, you have to pay twice the licensing fee

• [1]
• art_iv [2]
• [3]
• [4]
• hRpts/2006/EECS-2006-183.html [5]
• “Multi-Core Processor Technology: Maximizing CPU Performance in a Power-Constrained World”, Paul Teich, AMD [6]