
Computer Architecture-I

Assignment 1 Solution
1. Die yield is given by the formula
Die Yield = Wafer Yield x (1 + (Defects per unit area x Die Area)/α)^(-α)
Let us assume a wafer yield of 100% and α = 4 for current technology.
a. Die yield for AMD Opteron,
Die yield = (1 + (0.75 x 1.99)/4)^(-4) = 0.281
b. Die yield for 8-core SUN Niagara,
Die yield = (1 + (0.75 x 3.80)/4)^(-4) = 0.116
c. The defect rate is the same for the AMD Opteron and the SUN
Niagara, but the Niagara die is almost twice the size of the
Opteron die (3.80 vs. 1.99). A larger die is more likely to contain
a defect, so at the same defect rate the yield of the Niagara is
much lower than that of the Opteron. The larger die also means that
significantly fewer dies fit on each wafer.
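
The die yield arithmetic in parts a and b can be checked with a short Python sketch. The function and parameter names are illustrative assumptions, not part of the assignment; the inputs (0.75 defects per unit area, die areas of 1.99 and 3.80, an exponent parameter of 4, and a wafer yield of 100%) are the values used in the solution above.

```python
def die_yield(defects_per_area, die_area, alpha=4.0, wafer_yield=1.0):
    """Die yield = wafer yield x (1 + defects x area / alpha)^(-alpha).

    defects_per_area and die_area must use the same area unit.
    """
    return wafer_yield * (1.0 + defects_per_area * die_area / alpha) ** (-alpha)


# Values taken from the solution above.
opteron_yield = die_yield(0.75, 1.99)
niagara_yield = die_yield(0.75, 3.80)

print(f"AMD Opteron die yield: {opteron_yield:.3f}")
print(f"SUN Niagara die yield: {niagara_yield:.3f}")
```

Because the exponent is negative, the yield falls quickly as the die area grows, which is the effect described in part c.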
4. Question 1.4
a. To size the server's power supply, we first calculate the power
consumed by the entire system at maximum load.
i. Sun Niagara 8-core chip:
Power consumed at max load = 79W
ii. 2 x 1GB 184-pin Kingston DRAM:
Power consumed at max load = 2 x 3.7W = 7.4W
iii. 2 x 7200rpm hard drive:
Since we are interested in the max load condition, we assume
0% idle time for the hard drives.
Power per drive = 7.9W
Total power for 2 drives = 15.8W
Thus, the total power consumed by the system = (79 + 7.4 + 15.8) =
102.2W.
Power supply efficiency = Power output / Power input
With a 70% efficient supply, Power input = 102.2/0.7 = 146W
This is the required power supply wattage for the system.

b. The hard drive is idle for 40% of the time. It draws 4W when idle
and 7.9W when active, so its average power is
Power = (0.4 x 4) + (0.6 x 7.9) = 6.34W
c. Assuming rpm is the only factor affecting how long the disk is
busy, the read/seek time is inversely proportional to the disk rpm.
The disk with 7200 rpm spends 60% of the time on read/seek. For the
same set of transactions, the 5400 rpm disk takes 7200/5400 = 4/3
times as long as the 7200 rpm disk, i.e. 0.6 x 4/3 = 80% read/seek.
Thus, the 5400 rpm disk will idle for 20% of the time.
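
The power budget in parts a to c can be reproduced with the following Python sketch. The constant and function names are hypothetical; the wattages, the 70% supply efficiency, and the 40%/60% idle/active split are the figures used in the solution above.

```python
# Component figures quoted in the solution (watts).
CPU_W = 79.0            # Sun Niagara 8-core chip at max load
DRAM_W = 3.7            # one 1GB DRAM module at max load
DISK_ACTIVE_W = 7.9     # one 7200 rpm drive during read/seek
DISK_IDLE_W = 4.0       # one 7200 rpm drive when idle
PSU_EFFICIENCY = 0.7    # power output / power input


def disk_power(idle_fraction):
    """Average power of one drive that is idle for idle_fraction of the time."""
    return idle_fraction * DISK_IDLE_W + (1.0 - idle_fraction) * DISK_ACTIVE_W


# Part a: worst case, drives never idle.
system_w = CPU_W + 2 * DRAM_W + 2 * disk_power(0.0)
psu_w = system_w / PSU_EFFICIENCY

# Part b: drive idle 40% of the time.
avg_disk_w = disk_power(0.4)

# Part c: busy time scales with 7200/5400 when rpm is the only factor.
busy_5400 = 0.6 * 7200 / 5400
idle_5400 = 1.0 - busy_5400

print(f"System power: {system_w:.1f} W, supply rating: {psu_w:.0f} W")
print(f"Average disk power at 40% idle: {avg_disk_w:.2f} W")
print(f"5400 rpm disk idle fraction: {idle_5400:.0%}")
```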
6. Question 1.6
a. The performance/power ratio for each benchmark is tabulated
below.
Benchmark    Sun Fire T2000    IBM x346
SPECjbb      212.677           91.289
SPECWeb      42.427            9.926

b. If power is the main concern, the Sun Fire T2000 is the better
choice, since it delivers more performance per watt on both
benchmarks.
c. It is true that, for database benchmarks, cheaper systems tend to
have a lower cost per database operation. Even so, some server
farms may opt for more expensive servers, which offer not only
better performance but also lower power consumption. Power
consumption is a growing concern for large server farms, which may
consist of over 10000 processors and disks. A cheaper system may
yield a lower cost per operation, which is desirable, but it may
not be power efficient, and the cost of the excess power and
cooling is significant. Thus, both factors must be weighed when
making the choice.
9. Question 1.9
a. FIT = 100
Since FIT is expressed as failures per billion (10^9) hours,
MTTF = 10^9/FIT = 10^9/100 = 10^7 hours
b. MTTR = 1 day = 24 hours
Availability = MTTF/(MTTF + MTTR) = 10^7/(10^7 + 24) ≈ 0.9999976
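
As a quick check of the reliability arithmetic, here is a minimal Python sketch. The variable names are illustrative; the FIT value of 100 and the 24-hour MTTR come from the solution above.

```python
HOURS_PER_BILLION = 1e9  # FIT counts failures per 10^9 hours

fit = 100                # failures per billion hours
mttr_hours = 24.0        # mean time to repair: 1 day

# MTTF is the reciprocal of the failure rate.
mttf_hours = HOURS_PER_BILLION / fit

# Availability is the fraction of time the system is in service.
availability = mttf_hours / (mttf_hours + mttr_hours)

print(f"MTTF = {mttf_hours:.0e} hours")
print(f"Availability = {availability:.7f}")
```

The result differs from 1 only in the sixth decimal place, so printing fewer digits hides the effect of the repair time.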

12. Question 1.12


a. Tabulated results for performance normalized to the Pentium D820:
Chip                 Memory Performance   Dhrystone Performance
Athlon64 X2 4800+    1.141                1.361235217
Pentium EE840        1.076                1.241327201
Pentium D820         1                    1
Athlon64 X2 3800+    0.980333333          1.12542707
Pentium 4            0.910333333          0.500722733
Athlon64 3000+       0.984333333          0.501182654
Pentium 4 570        1.167                0.73653088
Processor X          2.333333333          0.328515112

b. Arithmetic mean of performance for each processor, tabulated for
both original and normalized performance values:
Chip                 Arithmetic Mean for   Arithmetic Mean for
                     Original Results      Normalized Results
Athlon64 X2 4800+    12070.5               1.251117608
Pentium EE840        11060.5               1.158663601
Pentium D820         9110                  1
Athlon64 X2 3800+    10035                 1.052880201
Pentium 4            5176                  0.705528033
Athlon64 3000+       5290.5                0.742757994
Pentium 4 570        7355.5                0.95176544
Processor X          6000                  1.330924223

c. The table above leads to conflicting conclusions about the
performance of Processor X. According to the first column (original
results), the Athlon64 X2 4800+, Pentium EE840, Pentium D820,
Athlon64 X2 3800+ and Pentium 4 570 are all faster than Processor X.
According to the second column (normalized results), Processor X is
faster than all of these processors.
d. The geometric means of the normalized Dhrystone performance for
the single-core and dual-core processors are given below:
Geometric mean (Single Core) = 0.4964
Geometric mean (Dual Core) = 1.1743

e. The scatter graph of Dhrystone performance vs. memory performance
is given below:
[Figure: scatter plot of Dhrystone performance vs. memory performance]

f. The scatter graph clearly indicates that the dual-core processors
outperform their single-core counterparts in Dhrystone performance.
Dhrystone is an integer benchmark that primarily exercises the
logical/arithmetic functionality of the CPU, so the large
improvement can be explained by the fact that two cores are
available for computation instead of one. There is no comparable
improvement in memory performance, because memory latency does not
depend on the number of CPU cores: the latency of load/store
operations on a dual-core processor is similar to that on a
single-core one. The only exception is Processor X, whose memory
performance figure is artificially high.
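
The geometric means in part d can be reproduced from the normalized Dhrystone values tabulated in part a. The sketch below assumes that the Athlon64 X2 4800+, Pentium EE840, Pentium D820 and Athlon64 X2 3800+ form the dual-core group and that the remaining four chips, including Processor X, form the single-core group. This grouping is an assumption, chosen because it reproduces the quoted figures; the helper name is illustrative.

```python
import math

# Normalized Dhrystone performance from the table in part a.
DHRYSTONE = {
    "Athlon64 X2 4800+": 1.361235217,
    "Pentium EE840": 1.241327201,
    "Pentium D820": 1.0,
    "Athlon64 X2 3800+": 1.12542707,
    "Pentium 4": 0.500722733,
    "Athlon64 3000+": 0.501182654,
    "Pentium 4 570": 0.73653088,
    "Processor X": 0.328515112,
}

# Assumed grouping (not stated explicitly in the solution).
DUAL_CORE = ["Athlon64 X2 4800+", "Pentium EE840",
             "Pentium D820", "Athlon64 X2 3800+"]
SINGLE_CORE = [chip for chip in DHRYSTONE if chip not in DUAL_CORE]


def geometric_mean(values):
    """n-th root of the product, computed through logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


single_gm = geometric_mean(DHRYSTONE[c] for c in SINGLE_CORE)
dual_gm = geometric_mean(DHRYSTONE[c] for c in DUAL_CORE)

print(f"Geometric mean (Single Core) = {single_gm:.4f}")
print(f"Geometric mean (Dual Core)   = {dual_gm:.4f}")
```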

13. Question 1.13


a. It is given that 40% of operations are memory-centric and 60% are
CPU-centric. The following table gives the execution time for each
benchmark (the reciprocal of the benchmark score) and the weighted
arithmetic mean, using weights of 0.4 and 0.6.
Chip                 Execution Time       Execution Time        Weighted
                     (Memory Benchmark)   (Dhrystone Benchmark) Arithmetic Mean
Athlon64 X2 4800+    0.000292141          4.82672E-05           0.00015
Pentium EE840        0.000309789          5.29297E-05           0.00016
Pentium D820         0.000333333          6.5703E-05            0.00017
Athlon64 X2 3800+    0.00034002           5.83805E-05           0.00017
Pentium 4            0.000366166          0.000131216           0.00023
Athlon64 3000+       0.000338639          0.000131096           0.00021
Pentium 4 570        0.000285633          8.92061E-05           0.00017
Processor X          0.000142857          0.0002                0.00018

b. Since the application suite is CPU-intensive, we compare the
Dhrystone performance of the two CPUs. The speed-up from the
Pentium 4 570 to the Athlon64 X2 4800+ is the ratio of their
Dhrystone performance.
Hence, Speed-up = 20718/11210 = 1.848
c. Let a be the required fraction of memory operations, so that
(1 - a) is the fraction of processor operations. For equal
performance of the Pentium 4 570 (left side) and the Pentium D 820
(right side), we can write the following equation.
3501a + 11210(1 - a) = 3000a + 15220(1 - a)
501a = 4010(1 - a)
Thus, 4511a = 4010
i.e. a = 0.89
Thus, the performance of the Pentium 4 570 equals that of the
Pentium D 820 when there are 89% memory operations and 11%
processor operations.
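
The calculations in Question 1.13 can be sketched in Python using only the benchmark scores that appear in the solution above (3501 and 11210 for the Pentium 4 570, 3000 and 15220 for the Pentium D 820, and a Dhrystone score of 20718 for the Athlon64 X2 4800+). The function names are hypothetical, and execution time is taken to be the reciprocal of the score, which matches the tabulated values.

```python
# (memory score, Dhrystone score) as quoted in the solution.
PENTIUM_4_570 = (3501, 11210)
PENTIUM_D_820 = (3000, 15220)
ATHLON64_X2_4800_DHRYSTONE = 20718


def weighted_time(scores, memory_weight=0.4):
    """Weighted arithmetic mean of execution times (1 / score)."""
    memory_score, dhrystone_score = scores
    return (memory_weight / memory_score
            + (1.0 - memory_weight) / dhrystone_score)


def break_even_memory_fraction(chip_1, chip_2):
    """Solve m1*a + d1*(1-a) = m2*a + d2*(1-a) for a."""
    m1, d1 = chip_1
    m2, d2 = chip_2
    return (d2 - d1) / ((m1 - m2) + (d2 - d1))


# Part a: weighted execution times for two of the chips.
time_570 = weighted_time(PENTIUM_4_570)
time_820 = weighted_time(PENTIUM_D_820)

# Part b: speed-up as a ratio of Dhrystone scores.
speedup = ATHLON64_X2_4800_DHRYSTONE / PENTIUM_4_570[1]

# Part c: memory fraction at which both chips perform equally.
a = break_even_memory_fraction(PENTIUM_4_570, PENTIUM_D_820)

print(f"Weighted time, Pentium 4 570: {time_570:.5f}")
print(f"Weighted time, Pentium D 820: {time_820:.5f}")
print(f"Speed-up: {speedup:.3f}")
print(f"Break-even memory fraction: {a:.2f}")
```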

14. Question 1.14


According to Amdahl's Law, speed-up is given by
Speed-up_system = (Execution Time)_old / (Execution Time)_new
= 1 / ((1 - Fraction_enhanced) + (Fraction_enhanced / Speed-up_enhanced))
a. The first application is run in isolation and 40% of it is
parallelizable. Thus, Fraction_enhanced = 0.4.
Also, since the new processor is a dual core, Speed-up_enhanced = 2.
Then, the overall speed-up is given by the formula above.
Speed-up_system = 1/(0.6 + 0.4/2) = 1.25
b. The second application is run in isolation and 99% of it is
parallelizable. Hence, we have:
Fraction_enhanced = 0.99
Speed-up_enhanced = 2
Thus, Speed-up_system = 1/(0.01 + 0.99/2) = 1.98
c. Now both the first and the second application are running on the
system. Since the first application uses 80% of system resources,
only 40% of 80% (= 32%) will be enhanced by a factor of 2.
Thus, Speed-up_system = 1/(0.68 + 0.32/2) = 1.19
d. Similarly, the second application uses the remaining 20% of
system resources, so 99% of 20% (= 19.8%) will be enhanced by a
factor of 2.
Thus, Speed-up_system = 1/(0.802 + 0.198/2) = 1.11
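
The four cases in Question 1.14 all apply the same formula, so they can be checked with one small function. This is a minimal sketch; the function name and the dictionary of cases are illustrative, while the fractions (40% and 99% parallelizable, an 80%/20% resource split) and the factor of 2 come from the solution above.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speed-up when only part of the execution is enhanced."""
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)


DUAL_CORE_SPEEDUP = 2.0

# Fraction of total execution that benefits from the second core.
cases = {
    "a": 0.40,          # first application alone
    "b": 0.99,          # second application alone
    "c": 0.80 * 0.40,   # first application, using 80% of resources
    "d": 0.20 * 0.99,   # second application, using 20% of resources
}

speedups = {part: amdahl_speedup(fraction, DUAL_CORE_SPEEDUP)
            for part, fraction in cases.items()}

for part, value in speedups.items():
    print(f"Part {part}: fraction = {cases[part]:.3f}, speed-up = {value:.4f}")
```

Even with 99% of the work parallelizable, the speed-up stays just below 2, because the remaining serial 1% is not accelerated.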
