Run Time Efficiency of Accessor Functions

by Tim Lee, 7.13.98

Purpose: To measure the efficiency of accessor functions on a variety of computers.

Introduction: Two programs are used as benchmarks to run on nine different computers. The first program measures how long it takes to fetch data from memory using the direct access operations built into each processor. The second program measures how long it takes to fetch data indirectly, by calling a subroutine composed of direct access operations. This kind of routine is commonly called an accessor function or an accessor for short. Since there are extra steps involved in the accessor function method, it will always be slower than the direct method. The purpose of this study is to measure how much slower.

Summary of Results: Accessor functions were measured to take from three to eighteen times longer to do the same work as the direct access method, depending on the processor type. It was also found that the faster the CPU clock speed the less efficient accessor functions are. Finally, the PowerPC chip family was measured to process accessor functions four times more efficiently than the 80x86 chip family.

In this paper the term efficiency means what it normally does, namely, how much time or energy is required to do the same amount of work relative to some standard. In this case the standard is how long it takes to fetch a 16-bit unit of memory using the direct access method:

\[ \text{Efficiency} = \frac{\text{Time to fetch a 16-bit unit directly}}{\text{Time to fetch a 16-bit unit via accessor}} \]

For example, an efficiency rating of .5 means it takes twice as long for an accessor function to do the same work as using direct access, and likewise an efficiency number of .1 means it takes ten times as long.

A graphical summary of the results follows.

[Figure: Accessor Function Efficiency by Clock Speed. Efficiency (vertical axis, 0.00 to 0.40) plotted against Clock Speed in MHz (horizontal axis, 0 to 400) for the PowerPC 601, PowerPC 603ev, PowerPC 604, PowerPC 750 (G3), 386SX, 486DX2, Pentium, Pentium, and Pentium II.]

Processor Type      CPU Clock Speed (MHz)   Efficiency   How Many Times Slower Than Direct
PowerPC 601                  80               .3333                  3.00
PowerPC 603ev               180               .2870                  3.48
PowerPC 604                 200               .2300                  4.35
PowerPC 750 (G3)            266               .2620                  3.81
80386SX                      33               .0960                 10.40
80486DX2                     66               .0896                 11.10
Pentium                     133               .0769                 13.00
Pentium                     166               .0689                 14.50
Pentium II                  333               .0555                 18.00

Description of Procedure: The following general procedure was used:

1. Each benchmark was run nine times.
2. The median measurement was chosen to be representative of the other measurements for computing the efficiency rating.

On the Mac all benchmarks were run under MacOS 8 or 8.1. On 80x86 machines all benchmarks were run under DOS after it was found that running the benchmarks in a DOS command window under Windows NT skewed the results.

Description of Benchmark Programs: The benchmark programs consist of two main parts, timing functions and data access functions. The timing part is written in assembler and the data access part is written in ANSI C.

The Data Access Part: The data access part differs in the two benchmark programs so that one method of access can be compared with another. In pattern, the source code differences are as follows:

Direct Access Source Code:

    int x, y;

    x = y;

Accessor Function Source Code:

    int x, y;

    int gety() { return( y ); }

    x = gety();

The data access part of the benchmark program is designed to test the general case of data access where the data is in main memory rather than in level 1 or level 2 cache memory. To accomplish this aim many separate memory locations are read rather than reading from the same memory location many times. Also the memory addresses are separated enough so that cache line fetches can’t get more than one target value at a time. What is unknown is the extent to which the program resides in cache memory after having been loaded from disk by the OS just prior to running it. More experiments would need to be done to separate out this possible bias but the efficiency of accessor functions isn’t expected to change since both benchmarks run under the same conditions.


The Timing Part: The timing functions of the benchmark programs are machine type specific. On the Mac timing functions read time registers built into the PowerPC chip providing a very high resolution measurement of time. On 80x86 machines timing functions read the 8253 timer chip. Details can be found in the source code below.

Description of Software: On the Mac the source code for the benchmark programs was compiled using the Metrowerks Code Warrior Pro 2 compiler. On the 80x86 machine the benchmark C source code was compiled using Microsoft Quick C Version 2.5 and the assembler code for the timer functions was compiled using Microsoft Macro Assembler 5.1.

Description of Hardware:

Processor Type      Computer                    CPU Clock Speed (MHz)
PowerPC 601         PowerMac 8100/80                     80
PowerPC 603ev       PowerBook 2400c/180                 180
PowerPC 604         PowerMac 9500                       200
PowerPC 750 (G3)    PowerMac G3                         266
80386SX             A generic 386SX box                  33
80486DX2            Commax Desktop Systems               66
Pentium             Micron Home MPC Pro                 133
Pentium             Micron Millenium                    166
Pentium II          Dell Dimension XPS D333             333

Measurement Data: The following measurement data was collected to compute the accessor efficiency ratings:

PowerMac 8100/80
PowerPC 601, 80 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count      78208   78208   78976   78592   78336   79616   78976   79232   79104   78976
Accessor Tick Count   236928  235904  238336  236160  237184  236544  238336  243840  235776  236928

PowerBook 2400c/180
PowerPC 603ev, 180 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count        423     440     449     425     442     441     437     436     448     440
Accessor Tick Count     1527    1545    1530    1534    1539    1538    1533    1544    1524    1534

PowerMac 9500
PowerPC 604, 200 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count        524     482     498     492     471     484     478     483     466     483
Accessor Tick Count     2101    2308    1990    2031    2356    1696    2559    2370    1910    2101

PowerMac G3
PowerPC 750 (G3), 266 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count        389     388     385     380     381     394     408     387     388     388
Accessor Tick Count     1456    1481    1488    1492    1479    1462    1465    1479    1465    1479

A generic 386SX box
80386SX, 33 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count        194     195     196     195     195     195     195     195     194     195
Accessor Tick Count     2030    2033    2030    2033    2030    2033    2031    2031    2032    2031

Commax Desktop Systems
80486DX2, 66 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count        107     107     107     107     106     106     106     106     106     106
Accessor Tick Count     1175    1177    1177    1174    1187    1167    1174    1177    1177    1177

Micron Home MPC Pro
Pentium, 133 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count         12      11      10      12      11      11      11      12      12      11
Accessor Tick Count      143     143     144     144     144     143     143     143     144     143

Micron Millenium
Pentium, 166 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count         11      10       9      10      12      10       9       9      10      10
Accessor Tick Count      146     143     145     145     143     145     145     144     144     145

Dell Dimension XPS D333
Pentium II, 333 MHz.

Trial                      1       2       3       4       5       6       7       8       9  Median
Direct Tick Count          6       6       6       6       5       6       6       6       6       6
Accessor Tick Count      109     108     112     108     108     108     108     108     108     108

Source Code for Mac and x86 Benchmarks follows:

BEGIN MAC SOURCE CODE =====================================

BEGIN FILE ‘TimePPC.h’ ------------------------------------

/*------------------------------------------------------------
| NAME: TimePPC.h
|
| PURPOSE: To provide interface to time functions for the
|          PowerPC chips.
|
| DESCRIPTION:
|
| NOTE:
|
| HISTORY: 02.07.98
------------------------------------------------------------*/
#ifndef _TIMEPPC_H_
#define _TIMEPPC_H_

#ifdef __cplusplus
extern "C" {
#endif

void     ElapsedTimePPC( u64* );
asm void GetRealTimePPC( u64* );
asm void GetTimeBasePPC( u64* );
void     GetTimePPC( u64* );
void     SetUpTimePPC();

#ifdef __cplusplus
} // extern "C"
#endif

#endif // _TIMEPPC_H_

END FILE ‘TimePPC.h’ --------------------------------------

BEGIN FILE ‘TimePPC.c’ ------------------------------------

/*------------------------------------------------------------
| NAME: TimePPC.c
|
| PURPOSE: To provide timing functions for PowerPC chips.
|
| DESCRIPTION: There are two different ways of measuring the
| rate of change in the PowerPC chip family: one way only
| works with the 601 chip and the other way only works for
| non-601 chips.
|
| On the PowerMac these differences are smoothed over by
| using the illegal instruction handler to emulate one or
| the other timing methods. The seam that shows up is the
| extra time spent servicing the illegal instruction
| exception.
|
| The two functions 'GetRealTimePPC' and 'GetTimeBasePPC' are
| opcode-for-opcode identical to the low-level routines used
| by the Metrowerks Profiler. These routines are also
| published in the official PowerPC manuals as the right way
| to get the contents of the timing registers.
|
| The following is from 'PowerPC 601 RISC Microprocessor
| User's Manual', p. B-7:
|
| "B.24 Timing Facilities
|
| This section describes differences between the POWER
| architecture and the PowerPC architecture timer
| facilities.
|
| B.24.1 Real-Time Clock
|
| The 601 implements a POWER-based RTC. Note that the POWER
| RTC is not supported in the PowerPC architecture. Instead,
| the PowerPC architecture provides a time base (TB). Both
| the RTC and the time base are 64-bit special purpose
| registers, but they differ in the following respects.
|
| * The RTC counts seconds, and nanoseconds, while the TB
|   counts 'ticks'. The frequency of the RTC is
|   implementation-dependent.
|
| * The RTC increments discontinuously -- 1 is added to RTCU
|   when the value in RTCL passes 999_999_999. The TB
|   increments continuously -- 1 is added to TBU when the
|   value in TBL passes x'FFFF FFFF'.
|
| * The RTC is written and read by the 'mtspr' and 'mfspr'
|   instructions, using SPR numbers that denote the RTCU and
|   RTCL. The TB is written to and read by the instructions
|   'mtspr' and 'mftb'.
|
| * The SPR numbers that denote RTCL and RTCU are invalid in
|   the PowerPC architecture except the 601.
|
| * The RTC is guaranteed to increment at least once in the
|   time required to execute 10 Add Immediate (addi)
|   instructions. No analogous guarantee is made for the TB.
|
| * Not all bits of RTCL need to be implemented, while all
|   bits of the TB must be implemented."
|
| From page 10-127: "For forward compatibility with other
| members of the PowerPC microprocessor family the 'mftb'
| instruction should be used to obtain the contents of the
| RTCL and RTCU registers. The 'mftb' instruction is a
| PowerPC instruction unimplemented by the 601, and will
| be trapped by the illegal instruction exception handler,
| which can then issue the appropriate mfspr instructions
| for reading the RTCL and RTCU registers."
|
| HISTORY: 02.07.98 from "MacTech" Jan '98 p. 48 which cites
|                   PowerPC 601 RISC User's Manual by
|                   Motorola as its source.
|          06.15.98 Added notes and updated for chips beyond
|                   601.
|          06.16.98 Revised to read entire 64-bit register not
|                   just the low part.
|          06.18.98 Edited comments.
------------------------------------------------------------*/

#include <Gestalt.h>

typedef unsigned char       u8;
typedef unsigned short      u16;
typedef unsigned long       u32;
typedef unsigned long long  u64;
typedef signed char         s8;
typedef short               s16;
typedef long                s32;
typedef long long           s64;


#include "TimePPC.h"

u64 GetTimePPCOverhead;
    // The number of ticks that need to be subtracted from
    // an elapsed time result to correct for the time taken
    // to measure the time. This gets set by 'SetUpTimePPC()'.

static u32 CPUType = 0;
    // Holds '601' if the currently running CPU is a 601 chip,
    // else holds '603'. This gets set by 'SetUpTimePPC()'.

/*------------------------------------------------------------
| NAME: ElapsedTimePPC
|------------------------------------------------------------
|
| PURPOSE: To compute the elapsed time since a time
|          measurement was taken.
|
| DESCRIPTION: This is a generic routine for high-frequency
| time measurement on any PowerPC chip.
|
| Takes a time value as input, computes the number of ticks
| between then and now, and saves the result over the input.
|
| Returns the elapsed time measured in units dependent on the
| tick rate of the chip.
|
| The input/result is a 64-bit number with this format:
|
|          -------------------
|          |   Hi   |   Lo   |
|   Byte   -------------------
|   Offset 0        4
|
| EXAMPLE:
|
|    u64 ATime;
|
|    GetTimePPC( &ATime );
|
|    < Some code to be timed goes here. >
|
|    ElapsedTimePPC( &ATime );
|
| NOTE:
|
| ASSUMES: The function 'SetUpTimePPC()' has been called
|          prior to calling this function to identify the
|          CPU type that is running.
|
|          The process takes less time than the longest span
|          that can be measured by the time base.
|
| HISTORY: 06.17.98
------------------------------------------------------------*/
void ElapsedTimePPC( u64* t )
{
    u64 now;

    // Mark the end of a process being timed.
    GetTimePPC( &now );

    // If the end time is larger than the start time.
    if( now > *t )
    {
        // Compute the difference.
        *t = now - *t;
    }
    else // The time base has wrapped around during the
         // process being timed.
    {
        // Adjust the original measure.
        *t = ((u64) -1) - *t;

        // Add the final time.
        *t += now;
    }
}

/*------------------------------------------------------------
| NAME: GetRealTimePPC
|------------------------------------------------------------
|
| PURPOSE: To read the real-time clock registers of the
|          PowerPC 601 chip.
|
| DESCRIPTION: Returns the contents of the RTC registers as a
| a 64-bit number with this format:
|
|          -----------------------
|          |   RTCU   |   RTCL   |
|   Byte   -----------------------
|   Offset 0          4
|
| where:
|
|   RTCU is the upper register of the real time clock which
|        holds the number of seconds since the time specified
|        in the software.
|
|   RTCL is the lower register of the real time clock. It
|        holds the number of nanoseconds since the beginning
|        of the second, with a resolution of 128 nanoseconds
|        per tick.
|
|        Not all the bits are implemented and should always
|        read as 0.
|
|                       RTCL
|        ---------------------------------
|        | 00 |                | 0000000 |
|        ---------------------------------
|          0 1  2            24 25     31
|                             ^
|                             |__ Least Significant Bit
|
| The low register counts from zero to 999,999,872, one
| billion minus 128 after 999,999,999 nS. The next time
| RTCL is incremented, it cycles to all zeros and RTCU is
| incremented.
|

| The RTCL is incremented 7812500 times per second, once
| every 128 nanoseconds.
|
| EXAMPLE:
|
|    u64 RTCL_HiLo;
|
|    GetRealTimePPC( &RTCL_HiLo );
|
| NOTE: See page 2-16 of 601 User's Manual for the detailed
|
| ASSUMES:
|
| HISTORY: 06.17.98
|          06.24.98 Updated description.
------------------------------------------------------------*/
asm void GetRealTimePPC( u64* t )
{
    machine 601         // This is only for the 601 chip.

A:  mfspr   r4, 4       // Get upper real time clock register.
    mfspr   r5, 5       // Get lower real time clock register.
    mfspr   r6, 4       // Get upper real time clock register again.
    cmpw    r4,r6       // If the upper register has changed.
    bne     A           // Try reading again.
    stw     r4,0(r3)    // Put the hi part at the result.
    stw     r5,4(r3)    // Put the lo part at offset 4 of result.
    blr                 // Return.
}

/*------------------------------------------------------------
| NAME: GetTimeBasePPC
|------------------------------------------------------------
|
| PURPOSE: To read the time base register of any PowerPC chip
|          other than the 601 chip.
|
| DESCRIPTION: Returns a number measured in units dependent
| on the time base tick rate of the chip.
|
| The result is a 64-bit number with this format:
|
|          -------------------
|          |   Hi   |   Lo   |
|   Byte   -------------------
|   Offset 0        4
|
| EXAMPLE:
|
|    u64 Before, After, Diff;
|
|    GetTimeBasePPC( &Before );
|
|    < Some code to be timed goes here. >
|
|    GetTimeBasePPC( &After );
|
|    // Assuming value in 'After' is larger than 'Before',
|    // calculate the elapsed time in ticks.
|    Diff = After - Before;
|
| NOTE: Not supported on the 601, use 'GetRealTimePPC()'
|       instead.
|
| ASSUMES:
|
| HISTORY: 06.17.98
------------------------------------------------------------*/
asm void GetTimeBasePPC( u64* t )
{
    machine 603         // For any PowerPC chip other than the 601.

A:  mftbu   r4          // Get the upper time base register.
    mftb    r5          // Get the lower time base register.
    mftbu   r6          // Get upper time base register again.
    cmpw    r4,r6       // If the upper register has changed.
    bne     A           // Try reading again.
    stw     r4,0(r3)    // Put the hi part at the result.
    stw     r5,4(r3)    // Put the lo part at offset 4 of result.
    blr                 // Return.
}

/*------------------------------------------------------------
| NAME: GetTimePPC
|------------------------------------------------------------
|
| PURPOSE: To read the time register of any PowerPC chip.
|
| DESCRIPTION: This is a generic routine for high-frequency
| time measurement on any PowerPC chip.
|
| Returns a number measured in units dependent on the tick
| rate of the chip.
|
| The result is a 64-bit number with this format:
|
|          -------------------
|          |   Hi   |   Lo   |
|   Byte   -------------------
|   Offset 0        4
|

| EXAMPLE:
|
|    u64 ATime;
|
|    GetTimePPC( &ATime );
|
|    < Some code to be timed goes here. >
|
|    ElapsedTimePPC( &ATime );
|
| NOTE:
|
| ASSUMES: The function 'SetUpTimePPC()' has been called
|          prior to calling this function to identify the
|          CPU type that is running.
|
| HISTORY: 06.17.98
|          06.29.98 Added unit conversion for 601 chip.
------------------------------------------------------------*/
void GetTimePPC( u64* t )
{
    u32* lo;
    u32* hi;
    u32  H, L;

    // If this is a 601 chip.
    if( CPUType == 601 )
    {
        // Read the real time clock register.
        GetRealTimePPC( t );

        // Convert seconds:nanoseconds to units of 128 nanoseconds each...

        // Refer to the upper register field, RTCU.
        hi = (u32*) t;

        // Refer to the lower register field, RTCL.
        lo = (u32*) ( ((u8*) t) + 4 );

        // Get the value of the lower register.
        L = *lo;

        // Shift the lo part to the left two bits, then right 9 bits
        // to clear the high bits and right justify the significant bits.
        //
        // This converts nanosecond units to 128-nS units.
        L = ( L << 2 ) >> 9;

        // Get the value of the upper registers.
        H = *hi;

        // Shift high value to the right nine bits to convert seconds
        // to 128-nS units.
        *hi = H >> 9;

        // Shift high value left 23 bits to left justify the section of the
        // upper 32 bits that shifts into the lower 32-bits when converting
        // seconds to 128-nS units.
        H = H << 23;

        // Merge the bits shifted down from the upper register with the
        // justified bits of the lower register.
        *lo = H | L;
    }
    else // This is a non-601 chip.
    {
        // Read the time base register.
        GetTimeBasePPC( t );
    }
}

/*------------------------------------------------------------
| NAME: SetUpTimePPC
|------------------------------------------------------------
|
| PURPOSE: To prepare for high-resolution time measurement on
|          the PowerPC chip.
|
| DESCRIPTION: Call this routine prior to taking a time
| measurement. The CPU type and timing overhead are computed
| for use by the timing functions.
|
| EXAMPLE:
|
| NOTE:
|
| ASSUMES: The timing overhead factor computed by this
|          function is based on having the timing functions
|          resident in the on-chip cache at the time they are
|          called. It's your responsibility to pre-fetch
|          the timing functions to make this true.
|
| HISTORY: 06.17.98
------------------------------------------------------------*/
void SetUpTimePPC()
{
    OSErr err;
    s32   result;
    u64   Timer;

    // If CPUType isn't identified.
    if( CPUType == 0 )
    {
        // Find what kind of CPU is running.
        err = Gestalt( gestaltProcessorType, &result );

        // If this is a PowerPC 601 chip.
        if( result == gestaltCPU601 )
        {
            CPUType = 601;
        }
        else // Treat all other PowerPC chips as if they
             // have a time base register like the 603.
        {
            CPUType = 603;
        }
    }

    // Assume that there is no overhead for timing calls.
    GetTimePPCOverhead = 0;

    // Call these two routines here just to get them into
    // the on-chip cache.
    GetTimePPC( &Timer );
    ElapsedTimePPC( &Timer );

    // Now make the real measurement of how long it takes
    // to do nothing.
    GetTimePPC( &Timer );
    ElapsedTimePPC( &Timer );

    // The resulting time is the timing overhead, a correction
    // factor to be applied to future timings.
    GetTimePPCOverhead = Timer;
}

END FILE ‘TimePPC.c’ --------------------------------------

BEGIN FILE ‘AccessorTest.c’ -------------------------------

// PURPOSE: To measure the cost of using accessors on PowerPC chips.

#include <stdio.h>

typedef unsigned char       u8;
typedef unsigned short      u16;
typedef unsigned long       u32;
typedef unsigned long long  u64;
typedef signed char         s8;
typedef short               s16;
typedef long                s32;
typedef long long           s64;


#include "TimePPC.h"

// Define one or the other of the following symbols and then compile
// to make a benchmark for that method:

//#define DIRECT_METHOD
#define ACCESSOR_METHOD

// These are the variables, separated by padding.
int a0, pada0[32], b0, padb0[32], c0, padc0[32], d0, padd0[32];
int e0, pade0[32], f0, padf0[32], g0, padg0[32], h0, padh0[32];
int i0, padi0[32], j0, padj0[32], k0, padk0[32], l0, padl0[32];
int m0, padm0[32], n0, padn0[32], o0, pado0[32], p0, padp0[32];
int q0, padq0[32], r0, padr0[32], s0, pads0[32], t0, padt0[32];
int u0, padu0[32], v0, padv0[32], w0, padw0[32], x0, padx0[32];
int y0, pady0[32], z0, padz0[32];

int a1, pada1[32], b1, padb1[32], c1, padc1[32], d1, padd1[32];
int e1, pade1[32], f1, padf1[32], g1, padg1[32], h1, padh1[32];
int i1, padi1[32], j1, padj1[32], k1, padk1[32], l1, padl1[32];
int m1, padm1[32], n1, padn1[32], o1, pado1[32], p1, padp1[32];
int q1, padq1[32], r1, padr1[32], s1, pads1[32], t1, padt1[32];
int u1, padu1[32], v1, padv1[32], w1, padw1[32], x1, padx1[32];
int y1, pady1[32], z1, padz1[32];

int a2, pada2[32], b2, padb2[32], c2, padc2[32], d2, padd2[32];
int e2, pade2[32], f2, padf2[32], g2, padg2[32], h2, padh2[32];
int i2, padi2[32], j2, padj2[32], k2, padk2[32], l2, padl2[32];
int m2, padm2[32], n2, padn2[32], o2, pado2[32], p2, padp2[32];
int q2, padq2[32], r2, padr2[32], s2, pads2[32], t2, padt2[32];
int u2, padu2[32], v2, padv2[32], w2, padw2[32], x2, padx2[32];
int y2, pady2[32], z2, padz2[32];

int a3, pada3[32], b3, padb3[32], c3, padc3[32], d3, padd3[32];
int e3, pade3[32], f3, padf3[32], g3, padg3[32], h3, padh3[32];
int i3, padi3[32], j3, padj3[32], k3, padk3[32], l3, padl3[32];
int m3, padm3[32], n3, padn3[32], o3, pado3[32], p3, padp3[32];
int q3, padq3[32], r3, padr3[32], s3, pads3[32], t3, padt3[32];
int u3, padu3[32], v3, padv3[32], w3, padw3[32], x3, padx3[32];
int y3, pady3[32], z3, padz3[32];

int a4, pada4[32], b4, padb4[32], c4, padc4[32], d4, padd4[32];
int e4, pade4[32], f4, padf4[32], g4, padg4[32], h4, padh4[32];
int i4, padi4[32], j4, padj4[32], k4, padk4[32], l4, padl4[32];
int m4, padm4[32], n4, padn4[32], o4, pado4[32], p4, padp4[32];
int q4, padq4[32], r4, padr4[32], s4, pads4[32], t4, padt4[32];
int u4, padu4[32], v4, padv4[32], w4, padw4[32], x4, padx4[32];
int y4, pady4[32], z4, padz4[32];

int a5, pada5[32], b5, padb5[32], c5, padc5[32], d5, padd5[32];
int e5, pade5[32], f5, padf5[32], g5, padg5[32], h5, padh5[32];
int i5, padi5[32], j5, padj5[32], k5, padk5[32], l5, padl5[32];
int m5, padm5[32], n5, padn5[32], o5, pado5[32], p5, padp5[32];
int q5, padq5[32], r5, padr5[32], s5, pads5[32], t5, padt5[32];
int u5, padu5[32], v5, padv5[32], w5, padw5[32], x5, padx5[32];
int y5, pady5[32], z5, padz5[32];

int a6, pada6[32], b6, padb6[32], c6, padc6[32], d6, padd6[32];
int e6, pade6[32], f6, padf6[32], g6, padg6[32], h6, padh6[32];
int i6, padi6[32], j6, padj6[32], k6, padk6[32], l6, padl6[32];
int m6, padm6[32], n6, padn6[32], o6, pado6[32], p6, padp6[32];
int q6, padq6[32], r6, padr6[32], s6, pads6[32], t6, padt6[32];
int u6, padu6[32], v6, padv6[32], w6, padw6[32], x6, padx6[32];
int y6, pady6[32], z6, padz6[32];

int a7, pada7[32], b7, padb7[32], c7, padc7[32], d7, padd7[32];
int e7, pade7[32], f7, padf7[32], g7, padg7[32], h7, padh7[32];
int i7, padi7[32], j7, padj7[32], k7, padk7[32], l7, padl7[32];
int m7, padm7[32], n7, padn7[32], o7, pado7[32], p7, padp7[32];
int q7, padq7[32], r7, padr7[32], s7, pads7[32], t7, padt7[32];
int u7, padu7[32], v7, padv7[32], w7, padw7[32], x7, padx7[32];
int y7, pady7[32], z7, padz7[32];

int a8, pada8[32], b8, padb8[32], c8, padc8[32], d8, padd8[32];
int e8, pade8[32], f8, padf8[32], g8, padg8[32], h8, padh8[32];
int i8, padi8[32], j8, padj8[32], k8, padk8[32], l8, padl8[32];
int m8, padm8[32], n8, padn8[32], o8, pado8[32], p8, padp8[32];
int q8, padq8[32], r8, padr8[32], s88, pads8[32], t8, padt8[32];
int u88, padu8[32], v8, padv8[32], w8, padw8[32], x8, padx8[32];
int y8, pady8[32], z8, padz8[32];

int a9, pada9[32], b9, padb9[32], c9, padc9[32], d9, padd9[32];
int e9, pade9[32], f9, padf9[32], g9, padg9[32], h9, padh9[32];
int i9, padi9[32], j9, padj9[32], k9, padk9[32], l9, padl9[32];
int m9, padm9[32], n9, padn9[32], o9, pado9[32], p9, padp9[32];
int q9, padq9[32], r9, padr9[32], s9, pads9[32], t9, padt9[32];
int u9, padu9[32], v9, padv9[32], w9, padw9[32], x9, padx9[32];
int y9, pady9[32], z9, padz9[32];


#ifdef ACCESSOR_METHOD

int geta0(), getb0(), getc0(), getd0(), gete0(), getf0(), getg0();
int geth0(), geti0(), getj0(), getk0(), getl0(), getm0(), getn0();
int geto0(), getp0(), getq0(), getr0(), gets0(), gett0(), getu0();
int getv0(), getw0(), getx0(), gety0(), getz0();

int geta1(), getb1(), getc1(), getd1(), gete1(), getf1(), getg1();
int geth1(), geti1(), getj1(), getk1(), getl1(), getm1(), getn1();
int geto1(), getp1(), getq1(), getr1(), gets1(), gett1(), getu1();
int getv1(), getw1(), getx1(), gety1(), getz1();

int geta2(), getb2(), getc2(), getd2(), gete2(), getf2(), getg2();
int geth2(), geti2(), getj2(), getk2(), getl2(), getm2(), getn2();
int geto2(), getp2(), getq2(), getr2(), gets2(), gett2(), getu2();
int getv2(), getw2(), getx2(), gety2(), getz2();

int geta3(), getb3(), getc3(), getd3(), gete3(), getf3(), getg3();
int geth3(), geti3(), getj3(), getk3(), getl3(), getm3(), getn3();
int geto3(), getp3(), getq3(), getr3(), gets3(), gett3(), getu3();
int getv3(), getw3(), getx3(), gety3(), getz3();

int geta4(), getb4(), getc4(), getd4(), gete4(), getf4(), getg4();
int geth4(), geti4(), getj4(), getk4(), getl4(), getm4(), getn4();
int geto4(), getp4(), getq4(), getr4(), gets4(), gett4(), getu4();
int getv4(), getw4(), getx4(), gety4(), getz4();

int geta5(), getb5(), getc5(), getd5(), gete5(), getf5(), getg5();
int geth5(), geti5(), getj5(), getk5(), getl5(), getm5(), getn5();
int geto5(), getp5(), getq5(), getr5(), gets5(), gett5(), getu5();
int getv5(), getw5(), getx5(), gety5(), getz5();

int geta6(), getb6(), getc6(), getd6(), gete6(), getf6(), getg6();
int geth6(), geti6(), getj6(), getk6(), getl6(), getm6(), getn6();
int geto6(), getp6(), getq6(), getr6(), gets6(), gett6(), getu6();
int getv6(), getw6(), getx6(), gety6(), getz6();

int geta7(), getb7(), getc7(), getd7(), gete7(), getf7(), getg7();
int geth7(), geti7(), getj7(), getk7(), getl7(), getm7(), getn7();
int geto7(), getp7(), getq7(), getr7(), gets7(), gett7(), getu7();
int getv7(), getw7(), getx7(), gety7(), getz7();

int geta8(), getb8(), getc8(), getd8(), gete8(), getf8(), getg8();
int geth8(), geti8(), getj8(), getk8(), getl8(), getm8(), getn8();
int geto8(), getp8(), getq8(), getr8(), gets88(), gett8(), getu88();
int getv8(), getw8(), getx8(), gety8(), getz8();

int geta9(), getb9(), getc9(), getd9(), gete9(), getf9(), getg9();
int geth9(), geti9(), getj9(), getk9(), getl9(), getm9(), getn9();
int geto9(), getp9(), getq9(), getr9(), gets9(), gett9(), getu9();
int getv9(), getw9(), getx9(), gety9(), getz9();

void setz9( int ); int int int int int int int int int int int int geta0() getb0() getc0() getd0() gete0() getf0() getg0() geth0() geti0() getj0() getk0() getl0() { { { { { { { { { { { { return( return( return( return( return( return( return( return( return( return( return( return( a0 b0 c0 d0 e0 f0 g0 h0 i0 j0 k0 l0 ); ); ); ); ); ); ); ); ); ); ); ); } } } } } } } } } } } }

20

int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int

getm0() getn0() geto0() getp0() getq0() getr0() gets0() gett0() getu0() getv0() getw0() getx0() gety0() getz0() geta1() getb1() getc1() getd1() gete1() getf1() getg1() geth1() geti1() getj1() getk1() getl1() getm1() getn1() geto1() getp1() getq1() getr1() gets1() gett1() getu1() getv1() getw1() getx1() gety1() getz1() geta2() getb2() getc2() getd2() gete2() getf2() getg2() geth2() geti2() getj2() getk2() getl2()

{ { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { {

return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return(

m0 n0 o0 p0 q0 r0 s0 t0 u0 v0 w0 x0 y0 z0 a1 b1 c1 d1 e1 f1 g1 h1 i1 j1 k1 l1 m1 n1 o1 p1 q1 r1 s1 t1 u1 v1 w1 x1 y1 z1 a2 b2 c2 d2 e2 f2 g2 h2 i2 j2 k2 l2

); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); );

} } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } }

21

int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int int

getm2() getn2() geto2() getp2() getq2() getr2() gets2() gett2() getu2() getv2() getw2() getx2() gety2() getz2() geta3() getb3() getc3() getd3() gete3() getf3() getg3() geth3() geti3() getj3() getk3() getl3() getm3() getn3() geto3() getp3() getq3() getr3() gets3() gett3() getu3() getv3() getw3() getx3() gety3() getz3() geta4() getb4() getc4() getd4() gete4() getf4() getg4() geth4() geti4() getj4() getk4() getl4()

{ { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { { {

return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return( return(

m2 n2 o2 p2 q2 r2 s2 t2 u2 v2 w2 x2 y2 z2 a3 b3 c3 d3 e3 f3 g3 h3 i3 j3 k3 l3 m3 n3 o3 p3 q3 r3 s3 t3 u3 v3 w3 x3 y3 z3 a4 b4 c4 d4 e4 f4 g4 h4 i4 j4 k4 l4

); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); ); );

} } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } } }

22

int getm4() { return( m4 ); }
int getn4() { return( n4 ); }
int geto4() { return( o4 ); }
int getp4() { return( p4 ); }
int getq4() { return( q4 ); }
int getr4() { return( r4 ); }
int gets4() { return( s4 ); }
int gett4() { return( t4 ); }
int getu4() { return( u4 ); }
int getv4() { return( v4 ); }
int getw4() { return( w4 ); }
int getx4() { return( x4 ); }
int gety4() { return( y4 ); }
int getz4() { return( z4 ); }
int geta5() { return( a5 ); }
int getb5() { return( b5 ); }
int getc5() { return( c5 ); }
int getd5() { return( d5 ); }
int gete5() { return( e5 ); }
int getf5() { return( f5 ); }
int getg5() { return( g5 ); }
int geth5() { return( h5 ); }
int geti5() { return( i5 ); }
int getj5() { return( j5 ); }
int getk5() { return( k5 ); }
int getl5() { return( l5 ); }
int getm5() { return( m5 ); }
int getn5() { return( n5 ); }
int geto5() { return( o5 ); }
int getp5() { return( p5 ); }
int getq5() { return( q5 ); }
int getr5() { return( r5 ); }
int gets5() { return( s5 ); }
int gett5() { return( t5 ); }
int getu5() { return( u5 ); }
int getv5() { return( v5 ); }
int getw5() { return( w5 ); }
int getx5() { return( x5 ); }
int gety5() { return( y5 ); }
int getz5() { return( z5 ); }
int geta6() { return( a6 ); }
int getb6() { return( b6 ); }
int getc6() { return( c6 ); }
int getd6() { return( d6 ); }
int gete6() { return( e6 ); }
int getf6() { return( f6 ); }
int getg6() { return( g6 ); }
int geth6() { return( h6 ); }
int geti6() { return( i6 ); }
int getj6() { return( j6 ); }
int getk6() { return( k6 ); }
int getl6() { return( l6 ); }


int getm6() { return( m6 ); }
int getn6() { return( n6 ); }
int geto6() { return( o6 ); }
int getp6() { return( p6 ); }
int getq6() { return( q6 ); }
int getr6() { return( r6 ); }
int gets6() { return( s6 ); }
int gett6() { return( t6 ); }
int getu6() { return( u6 ); }
int getv6() { return( v6 ); }
int getw6() { return( w6 ); }
int getx6() { return( x6 ); }
int gety6() { return( y6 ); }
int getz6() { return( z6 ); }
int geta7() { return( a7 ); }
int getb7() { return( b7 ); }
int getc7() { return( c7 ); }
int getd7() { return( d7 ); }
int gete7() { return( e7 ); }
int getf7() { return( f7 ); }
int getg7() { return( g7 ); }
int geth7() { return( h7 ); }
int geti7() { return( i7 ); }
int getj7() { return( j7 ); }
int getk7() { return( k7 ); }
int getl7() { return( l7 ); }
int getm7() { return( m7 ); }
int getn7() { return( n7 ); }
int geto7() { return( o7 ); }
int getp7() { return( p7 ); }
int getq7() { return( q7 ); }
int getr7() { return( r7 ); }
int gets7() { return( s7 ); }
int gett7() { return( t7 ); }
int getu7() { return( u7 ); }
int getv7() { return( v7 ); }
int getw7() { return( w7 ); }
int getx7() { return( x7 ); }
int gety7() { return( y7 ); }
int getz7() { return( z7 ); }
int geta8() { return( a8 ); }
int getb8() { return( b8 ); }
int getc8() { return( c8 ); }
int getd8() { return( d8 ); }
int gete8() { return( e8 ); }
int getf8() { return( f8 ); }
int getg8() { return( g8 ); }
int geth8() { return( h8 ); }
int geti8() { return( i8 ); }
int getj8() { return( j8 ); }
int getk8() { return( k8 ); }
int getl8() { return( l8 ); }


int getm8() { return( m8 ); }
int getn8() { return( n8 ); }
int geto8() { return( o8 ); }
int getp8() { return( p8 ); }
int getq8() { return( q8 ); }
int getr8() { return( r8 ); }
int gets88() { return( s88 ); }
int gett8() { return( t8 ); }
int getu88() { return( u88 ); }
int getv8() { return( v8 ); }
int getw8() { return( w8 ); }
int getx8() { return( x8 ); }
int gety8() { return( y8 ); }
int getz8() { return( z8 ); }
int geta9() { return( a9 ); }
int getb9() { return( b9 ); }
int getc9() { return( c9 ); }
int getd9() { return( d9 ); }
int gete9() { return( e9 ); }
int getf9() { return( f9 ); }
int getg9() { return( g9 ); }
int geth9() { return( h9 ); }
int geti9() { return( i9 ); }
int getj9() { return( j9 ); }
int getk9() { return( k9 ); }
int getl9() { return( l9 ); }
int getm9() { return( m9 ); }
int getn9() { return( n9 ); }
int geto9() { return( o9 ); }
int getp9() { return( p9 ); }
int getq9() { return( q9 ); }
int getr9() { return( r9 ); }
int gets9() { return( s9 ); }
int gett9() { return( t9 ); }
int getu9() { return( u9 ); }
int getv9() { return( v9 ); }
int getw9() { return( w9 ); }
int getx9() { return( x9 ); }
int gety9() { return( y9 ); }
int getz9() { return( z9 ); }

void setz9( int value ) { z9 = value; }
#endif

void main(void)
{
    u64 TimeCount;

    /* Measure timer overhead and pre-load into cpu cache. */
    SetUpTimePPC();


#ifdef DIRECT_METHOD
    GetTimePPC( &TimeCount );
    z9 += a0 + b0 + c0 + d0 + e0 + f0 + g0 + h0 + i0 + j0 + k0 + l0 + m0 + n0 + o0 + p0 + q0 + r0 + s0 + t0 + u0 + v0 + w0 + x0 + y0 + z0;
    z9 += a1 + b1 + c1 + d1 + e1 + f1 + g1 + h1 + i1 + j1 + k1 + l1 + m1 + n1 + o1 + p1 + q1 + r1 + s1 + t1 + u1 + v1 + w1 + x1 + y1 + z1;
    z9 += a2 + b2 + c2 + d2 + e2 + f2 + g2 + h2 + i2 + j2 + k2 + l2 + m2 + n2 + o2 + p2 + q2 + r2 + s2 + t2 + u2 + v2 + w2 + x2 + y2 + z2;
    z9 += a3 + b3 + c3 + d3 + e3 + f3 + g3 + h3 + i3 + j3 + k3 + l3 + m3 + n3 + o3 + p3 + q3 + r3 + s3 + t3 + u3 + v3 + w3 + x3 + y3 + z3;
    z9 += a4 + b4 + c4 + d4 + e4 + f4 + g4 + h4 + i4 + j4 + k4 + l4 + m4 + n4 + o4 + p4 + q4 + r4 + s4 + t4 + u4 + v4 + w4 + x4 + y4 + z4;
    z9 += a5 + b5 + c5 + d5 + e5 + f5 + g5 + h5 + i5 + j5 + k5 + l5 + m5 + n5 + o5 + p5 + q5 + r5 + s5 + t5 + u5 + v5 + w5 + x5 + y5 + z5;
    z9 += a6 + b6 + c6 + d6 + e6 + f6 + g6 + h6 + i6 + j6 + k6 + l6 + m6 + n6 + o6 + p6 + q6 + r6 + s6 + t6 + u6 + v6 + w6 + x6 + y6 + z6;
    z9 += a7 + b7 + c7 + d7 + e7 + f7 + g7 + h7 + i7 + j7 + k7 + l7 + m7 + n7 + o7 + p7 + q7 + r7 + s7 + t7 + u7 + v7 + w7 + x7 + y7 + z7;
    z9 += a8 + b8 + c8 + d8 + e8 + f8 + g8 + h8 + i8 + j8 + k8 + l8 + m8 + n8 + o8 + p8 + q8 + r8 + s88 + t8 + u88 + v8 + w8 + x8 + y8 + z8;
    z9 += a9 + b9 + c9 + d9 + e9 + f9 + g9 + h9 + i9 + j9 + k9 + l9 + m9 + n9 + o9 + p9 + q9 + r9 + s9 + t9 + u9 + v9 + w9 + x9 + y9 + z9;
    ElapsedTimePPC( &TimeCount );
    printf( "Direct: %d\n", (u32) TimeCount );
#endif

#ifdef ACCESSOR_METHOD
    GetTimePPC( &TimeCount );
    setz9( getz9() +
           geta0() + getb0() + getc0() + getd0() + gete0() + getf0() +
           getg0() + geth0() + geti0() + getj0() + getk0() + getl0() +
           getm0() + getn0() + geto0() + getp0() + getq0() + getr0() +
           gets0() + gett0() + getu0() + getv0() + getw0() + getx0() +
           gety0() + getz0() );
    setz9( getz9() +
           geta1() + getb1() + getc1() + getd1() + gete1() + getf1() +
           getg1() + geth1() + geti1() + getj1() + getk1() + getl1() +
           getm1() + getn1() + geto1() + getp1() + getq1() + getr1() +
           gets1() + gett1() + getu1() + getv1() + getw1() + getx1() +
           gety1() + getz1() );
    setz9( getz9() +
           geta2() + getb2() + getc2() + getd2() + gete2() + getf2() +
           getg2() + geth2() + geti2() + getj2() + getk2() + getl2() +
           getm2() + getn2() + geto2() + getp2() + getq2() + getr2() +
           gets2() + gett2() + getu2() + getv2() + getw2() + getx2() +
           gety2() + getz2() );
    setz9( getz9() +
           geta3() + getb3() + getc3() + getd3() + gete3() + getf3() +
           getg3() + geth3() + geti3() + getj3() + getk3() + getl3() +
           getm3() + getn3() + geto3() + getp3() + getq3() + getr3() +
           gets3() + gett3() + getu3() + getv3() + getw3() + getx3() +
           gety3() + getz3() );
    setz9( getz9() +
           geta4() + getb4() + getc4() + getd4() + gete4() + getf4() +
           getg4() + geth4() + geti4() + getj4() + getk4() + getl4() +
           getm4() + getn4() + geto4() + getp4() + getq4() + getr4() +
           gets4() + gett4() + getu4() + getv4() + getw4() + getx4() +
           gety4() + getz4() );
    setz9( getz9() +
           geta5() + getb5() + getc5() + getd5() + gete5() + getf5() +
           getg5() + geth5() + geti5() + getj5() + getk5() + getl5() +
           getm5() + getn5() + geto5() + getp5() + getq5() + getr5() +
           gets5() + gett5() + getu5() + getv5() + getw5() + getx5() +
           gety5() + getz5() );
    setz9( getz9() +
           geta6() + getb6() + getc6() + getd6() + gete6() + getf6() +
           getg6() + geth6() + geti6() + getj6() + getk6() + getl6() +
           getm6() + getn6() + geto6() + getp6() + getq6() + getr6() +
           gets6() + gett6() + getu6() + getv6() + getw6() + getx6() +
           gety6() + getz6() );
    setz9( getz9() +
           geta7() + getb7() + getc7() + getd7() + gete7() + getf7() +
           getg7() + geth7() + geti7() + getj7() + getk7() + getl7() +
           getm7() + getn7() + geto7() + getp7() + getq7() + getr7() +
           gets7() + gett7() + getu7() + getv7() + getw7() + getx7() +
           gety7() + getz7() );
    setz9( getz9() +
           geta8() + getb8() + getc8() + getd8() + gete8() + getf8() +
           getg8() + geth8() + geti8() + getj8() + getk8() + getl8() +
           getm8() + getn8() + geto8() + getp8() + getq8() + getr8() +
           gets88() + gett8() + getu88() + getv8() + getw8() + getx8() +
           gety8() + getz8() );
    setz9( getz9() +
           geta9() + getb9() + getc9() + getd9() + gete9() + getf9() +
           getg9() + geth9() + geti9() + getj9() + getk9() + getl9() +
           getm9() + getn9() + geto9() + getp9() + getq9() + getr9() +
           gets9() + gett9() + getu9() + getv9() + getw9() + getx9() +
           gety9() + getz9() );
    ElapsedTimePPC( &TimeCount );
    printf( "Accessor: %d\n", (u32) TimeCount );
#endif
}

END FILE ‘AccessorTest.c’
---------------------------------

END MAC SOURCE CODE
=======================================
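
Each build of the benchmark above prints one raw count ("Direct: ..." or "Accessor: ..."), and the efficiency figure defined in the introduction is the ratio of the two. The arithmetic can be sketched as follows, assuming any timer overhead is known separately and is subtracted from both counts first (pass 0 if the reported counts already exclude it). The function name accessor_efficiency and its parameters are illustrative and do not appear in the original sources.

```c
/* Hypothetical helper, not part of the original benchmark sources.
   Efficiency = (time to fetch directly) / (time to fetch via accessor),
   with the separately measured timer overhead removed from both counts. */
double accessor_efficiency(unsigned long direct_ticks,
                           unsigned long accessor_ticks,
                           unsigned long overhead_ticks)
{
    double direct;
    double accessor;

    /* A count no larger than the overhead carries no usable signal. */
    if (direct_ticks <= overhead_ticks || accessor_ticks <= overhead_ticks)
        return -1.0;

    direct   = (double)(direct_ticks - overhead_ticks);
    accessor = (double)(accessor_ticks - overhead_ticks);
    return direct / accessor;
}
```

With made-up counts of 120 (direct), 1020 (accessor), and 20 (overhead), the call accessor_efficiency(120UL, 1020UL, 20UL) gives 100/1000, an efficiency of about 0.1: the accessor build takes ten times as long.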

BEGIN X86 SOURCE CODE
=====================================

BEGIN FILE ‘timex86.asm’
-----------------------------------

;-----------------------------------------------------------
; NAME: timex86.asm
;
; PURPOSE: To provide timing functions for Intel x86 chips.
;
; DESCRIPTION: Copied from the original file...
;
; "**** PCZTNEAR.ASM
; The C-near-callable version of the precision Zen timer
; (PZTIMER.ASM)
;
; Note: use NOSMART with TASM (at least version 2.0) to keep
;       the assembler from turning far calls in the reference
;       timing code into PUSH CS/near call sequences, thereby
;       messing up the reference call times. This problem may
;       arise with other optimizing assemblers as well.
;
; Uses the 8253 timer to time the performance of code that takes
; less than about 54 milliseconds to execute, with a resolution
; of better than 10 microseconds.
;
; By Michael Abrash 4/26/89
;
; Externally callable routines:
;
; ZTimerOn: Starts the Zen timer, with interrupts disabled.
;
; ZTimerOff: Stops the Zen timer, saves the timer count,


;       times the overhead code, and restores interrupts to the
;       state they were in when ZTimerOn was called.
;
; ZTimerReport: Prints the net time that passed between starting
;       and stopping the timer.
;
; Note: If longer than about 54 ms passes between ZTimerOn and
;       ZTimerOff calls, the timer turns over and the count is
;       inaccurate. When this happens, an error message is displayed
;       instead of a count. The long-period Zen timer should be used
;       in such cases.
;
; Note: Interrupts *MUST* be left off between calls to ZTimerOn
;       and ZTimerOff for accurate timing and for detection of
;       timer overflow.
;
; Note: These routines can introduce slight inaccuracies into the
;       system clock count for each code section timed even if
;       timer 0 doesn't overflow. If timer 0 does overflow, the
;       system clock can become slow by virtually any amount of
;       time, since the system clock can't advance while the
;       precision timer is timing. Consequently, it's a good idea
;       to reboot at the end of each timing session. (The
;       battery-backed clock, if any, is not affected by the Zen
;       timer.)
;
; All registers, and all flags except the interrupt flag, are
; preserved by all routines. Interrupts are enabled and then disabled
; by ZTimerOn, and are restored by ZTimerOff to the state they were
; in when ZTimerOn was called."
;
; 07.08.98 Now necessary to measure timer overhead separately and
;          perform calculation by hand. This approach taken to
;          gauge the variation in the timing mechanism.
;
; Use like this:
;
;   ZTIMERON();     /* Measure time overhead and print results. */
;   ZTIMEROFF();
;   ZTIMERREPORT();
;
;   ZTIMERON();     /* Measure overhead again: timer code is now in cpu cache. */
;   ZTIMEROFF();
;   ZTIMERREPORT();
;
;   ZTIMERON();     /* Measure overhead again. */
;   ZTIMEROFF();
;   ZTIMERREPORT();
;
;   ZTIMERON();     /* Now measure code of interest. */
;   MyTimeConsumingFunction();


;   ZTIMEROFF();
;   ZTIMERREPORT();
;
; NOTE: See also a similar timer code for PowerPC in 'TimePPC.c'.
;
; HISTORY: 04.26.89 By Michael Abrash as file 'PCZTNEAR.ASM'.
;          07.08.98 Revised to support faster chips:
;                   timer overhead calculation removed,
;                   microsecond conversion removed: now returns
;                   timer ticks.
;------------------------------------------------------------*/

_TEXT   segment word public 'CODE'
        assume  cs:_TEXT, ds:nothing
        public  _ZTimerOn, _ZTimerOff, _ZTimerReport

;
; Base address of the 8253 timer chip.
;
BASE_8253       equ     40h
;
; The address of the timer 0 count registers in the 8253.
;
TIMER_0_8253    equ     BASE_8253 + 0
;
; The address of the mode register in the 8253.
;
MODE_8253       equ     BASE_8253 + 3
;
; The address of Operation Command Word 3 in the 8259 Programmable
; Interrupt Controller (PIC) (write only, and writable only when
; bit 4 of the byte written to this address is 0 and bit 3 is 1).
;
OCW3            equ     20h
;
; The address of the Interrupt Request register in the 8259 PIC
; (read only, and readable only when bit 1 of OCW3 = 1 and bit 0
; of OCW3 = 0).
;
IRR             equ     20h
;
; Macro to emulate a POPF instruction in order to fix the bug in some
; 80286 chips which allows interrupts to occur during a POPF even when
; interrupts remain disabled.
;
MPOPF   macro
        local   p1, p2
        jmp     short p2
p1:     iret                    ;jump to pushed address & pop flags
p2:     push    cs              ;construct far return address to
        call    p1              ; the next instruction
        endm


; ; Macro to delay briefly to ensure that enough time has elapsed ; between successive I/O accesses so that the device being accessed ; can respond to both accesses even on a very fast PC. ; ; 07.08.98 TL Changed from 3 jumps to 30 to be on the safe side. DELAY macro jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 jmp $+2 endm OriginalFlags db ? ;storage for upper byte of ; FLAGS register when ; ZTimerOn called ;timer 0 count when the timer ; is stopped ;number of counts required to ; execute timer overhead code ;used to indicate whether the ; timer overflowed during the ; timing interval

TimedCount ReferenceCount OverflowFlag

dw dw db

? ? ?

; ; String printed to report results.


;
OutputStr       label   byte
                db      'Timed count: ', 5 dup (?)
ASCIICountEnd   label   byte
                db      ' ticks ', 0dh, 0ah
                db      '$'

;
; String printed to report timer overflow.
;
OverflowStr     label   byte
        db      0dh, 0ah
        db      '****************************************************'
        db      0dh, 0ah
        db      '* The timer overflowed, so the interval timed was *'
        db      0dh, 0ah
        db      '* too long for the precision timer to measure. *'
        db      0dh, 0ah
        db      '* Please perform the timing test again with the *'
        db      0dh, 0ah
        db      '* long-period timer. *'
        db      0dh, 0ah
        db      '****************************************************'
        db      0dh, 0ah
        db      '$'

;********************************************************************
;* Routine called to start timing. *
;********************************************************************

_ZTimerOn       proc    near

;
; Save the context of the program being timed.
;
        push    ax
        pushf
        pop     ax                      ;get flags so we can keep
                                        ; interrupts off when leaving
                                        ; this routine
        mov     cs:[OriginalFlags],ah   ;remember the state of the
                                        ; Interrupt flag
        and     ah,0fdh                 ;set pushed interrupt flag
                                        ; to 0
        push    ax
;
; Turn on interrupts, so the timer interrupt can occur if it's
; pending.
;
        sti
;
; Set timer 0 of the 8253 to mode 2 (divide-by-N), to cause
; linear counting rather than count-by-two counting. Also


; leaves the 8253 waiting for the initial timer 0 count to
; be loaded.
;
        mov     al,00110100b            ;mode 2
        out     MODE_8253,al
;
; Set the timer count to 0, so we know we won't get another
; timer interrupt right away.
; Note: this introduces an inaccuracy of up to 54 ms in the system
; clock count each time it is executed.
;
        DELAY
        sub     al,al
        out     TIMER_0_8253,al         ;lsb
        DELAY
        out     TIMER_0_8253,al         ;msb
;
; Wait before clearing interrupts to allow the interrupt generated
; when switching from mode 3 to mode 2 to be recognized. The delay
; must be at least 210 ns long to allow time for that interrupt to
; occur. Here, 10 jumps are used for the delay to ensure that the
; delay time will be more than long enough even on a very fast PC.
;
;       rept 10
; 07.08.98 TL Changed to 60 to allow for current high speeds.
        rept 60
        jmp     $+2
        endm
;
; Disable interrupts to get an accurate count.
;
        cli
;
; Set the timer count to 0 again to start the timing interval.
;
        mov     al,00110100b            ;set up to load initial
        out     MODE_8253,al            ; timer count
        DELAY
        sub     al,al
        out     TIMER_0_8253,al         ;load count lsb
        DELAY
        out     TIMER_0_8253,al         ;load count msb
;
; Restore the context and return.
;
        MPOPF                           ;keeps interrupts off
        pop     ax
        ret

_ZTimerOn       endp

;********************************************************************


;* Routine called to stop timing and get count. *
;********************************************************************

_ZTimerOff      proc    near
;
; Save the context of the program being timed.
;
        push    ax
        push    cx
        pushf
;
; Latch the count.
;
        mov     al,00000000b            ;latch timer 0
        out     MODE_8253,al
;
; See if the timer has overflowed by checking the 8259 for a pending
; timer interrupt.
;
        mov     al,00001010b            ;OCW3, set up to read
        out     OCW3,al                 ; Interrupt Request register
        DELAY
        in      al,IRR                  ;read Interrupt Request
                                        ; register
        and     al,1                    ;set AL to 1 if IRQ0 (the
                                        ; timer interrupt) is pending
        mov     cs:[OverflowFlag],al    ;store the timer overflow
                                        ; status
;
; Allow interrupts to happen again.
;
        sti
;
; Read out the count we latched earlier.
;
        in      al,TIMER_0_8253         ;least significant byte
        DELAY
        mov     ah,al
        in      al,TIMER_0_8253         ;most significant byte
        xchg    ah,al
        neg     ax                      ;convert from countdown
                                        ; remaining to elapsed
                                        ; count
        mov     cs:[TimedCount],ax
; Time a zero-length code fragment, to get a reference for how
; much overhead this routine has. Time it 16 times and average it,
; for accuracy, rounding the result.
;
; 07.08.98 TL Revised to skip reference count calculation:
;             Reference count now calculated in 'ZTimerSetUp'.
;       mov     cs:[ReferenceCount],0


;       mov     cx,16
;       cli                             ;interrupts off to allow a
;                                       ; precise reference count
;RefLoop:
;       call    ReferenceZTimerOn
;       call    ReferenceZTimerOff
;       loop    RefLoop
;       sti
;       add     cs:[ReferenceCount],8   ;total + (0.5 * 16)
;       mov     cl,4
;       shr     cs:[ReferenceCount],cl  ;(total) / 16 + 0.5
;
; Restore original interrupt state.
;
        pop     ax                      ;retrieve flags when called
        mov     ch,cs:[OriginalFlags]   ;get back the original upper
                                        ; byte of the FLAGS register
        and     ch,not 0fdh             ;only care about original
                                        ; interrupt flag...
        and     ah,0fdh                 ;...keep all other flags in
                                        ; their current condition
        or      ah,ch                   ;make flags word with original
                                        ; interrupt flag
        push    ax                      ;prepare flags to be popped
;
; Restore the context of the program being timed and return to it.
;
        MPOPF                           ;restore the flags with the
                                        ; original interrupt state
        pop     cx
        pop     ax
        ret

_ZTimerOff      endp

;
; Called by ZTimerOff to start timer for overhead measurements.
;

ReferenceZTimerOn       proc    near
;
; Save the context of the program being timed.
;
        push    ax
        pushf                           ;interrupts are already off
;
; Set timer 0 of the 8253 to mode 2 (divide-by-N), to cause
; linear counting rather than count-by-two counting.
;
        mov     al,00110100b            ;set up to load
        out     MODE_8253,al            ; initial timer count
        DELAY


;
; Set the timer count to 0.
;
        sub     al,al
        out     TIMER_0_8253,al         ;load count lsb
        DELAY
        out     TIMER_0_8253,al         ;load count msb
;
; Restore the context of the program being timed and return to it.
;
        MPOPF
        pop     ax
        ret

ReferenceZTimerOn       endp

;
; Called by ZTimerOff to stop timer and add result to ReferenceCount
; for overhead measurements.
;

ReferenceZTimerOff      proc    near
;
; Save the context of the program being timed.
;
        push    ax
        push    cx
        pushf
;
; Latch the count and read it.
;
        mov     al,00000000b            ;latch timer 0
        out     MODE_8253,al
        DELAY
        in      al,TIMER_0_8253         ;lsb
        DELAY
        mov     ah,al
        in      al,TIMER_0_8253         ;msb
        xchg    ah,al
        neg     ax                      ;convert from countdown
                                        ; remaining to amount
                                        ; counted down
        add     cs:[ReferenceCount],ax
;
; Restore the context of the program being timed and return to it.
;
        MPOPF
        pop     cx
        pop     ax
        ret

ReferenceZTimerOff      endp


;********************************************************************
;* Routine called to report timing results. *
;********************************************************************

_ZTimerReport   proc    near

        pushf
        push    ax
        push    bx
        push    cx
        push    dx
        push    si
        push    ds
;
        push    cs      ;DOS functions require that DS point
        pop     ds      ; to text to be displayed on the screen
        assume  ds:_TEXT
;
; Check for timer 0 overflow.
;
        cmp     [OverflowFlag],0
        jz      PrintGoodCount
        mov     dx,offset OverflowStr
        mov     ah,9
        int     21h
        jmp     short EndZTimerReport
;
; Convert net count to decimal ASCII in microseconds.
;
PrintGoodCount:
        mov     ax,[TimedCount]
; 07.08.98 TL Don't subtract out the reference count yet.
;       sub     ax,[ReferenceCount]
        mov     si,offset ASCIICountEnd - 1
; 07.08.98 TL Don't convert to microseconds to preserve highest
;             resolution ticks.
;
; Convert count to microseconds by multiplying by .8381.
;
;       mov     dx,8381
;       mul     dx
;       mov     bx,10000
;       div     bx              ;* .8381 = * 8381 / 10000
;
; Convert time in microseconds to 5 decimal ASCII digits.
;
        mov     bx,10
        mov     cx,5
CTSLoop:


        sub     dx,dx
        div     bx
        add     dl,'0'
        mov     [si],dl
        dec     si
        loop    CTSLoop
;
; Print the results.
;
        mov     ah,9
        mov     dx,offset OutputStr
        int     21h
;
EndZTimerReport:
        pop     ds
        pop     si
        pop     dx
        pop     cx
        pop     bx
        pop     ax
        MPOPF
        ret

_ZTimerReport   endp

_TEXT   ends
        end

END FILE ‘timex86.asm’

-----------------------------------
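
The revised timer reports raw 8253 ticks and leaves both the overhead subtraction and the microsecond conversion to be done by hand. The commented-out instructions in ZTimerReport show the original conversion: multiply by 8381 and divide by 10000. A minimal C sketch of the same arithmetic follows; the helper name ticks_to_microseconds and its parameters are hypothetical and are not part of timex86.asm.

```c
/* Hypothetical helper, not part of the original sources.
   Subtracts the separately measured timer overhead, then converts
   8253 ticks to microseconds as ticks * 8381 / 10000 (about 0.8381 us
   per tick), mirroring the commented-out mul/div in ZTimerReport.
   Integer division truncates, as the DIV instruction does. */
unsigned long ticks_to_microseconds(unsigned long timed_ticks,
                                    unsigned long overhead_ticks)
{
    unsigned long net_ticks;

    /* Guard against an overhead reading larger than the timed count. */
    if (timed_ticks < overhead_ticks)
        return 0UL;

    net_ticks = timed_ticks - overhead_ticks;

    /* A 16-bit count is at most 65535, so the product fits in 32 bits. */
    return (net_ticks * 8381UL) / 10000UL;
}
```

For example, a made-up reading of 1020 ticks with 20 ticks of overhead leaves 1000 net ticks, which converts to 838 microseconds.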

BEGIN FILE ‘acctest.c’

-----------------------------------

/* PURPOSE: To measure the cost of using accessors. */

extern int ZTIMERON();
extern int ZTIMEROFF();
extern void ZTIMERREPORT();

/* Define one or the other of the following symbols and then compile
   to make a benchmark for that method: */
#define DIRECT_METHOD
/*#define ACCESSOR_METHOD*/

/* These are the variables, separated by padding. */
int a0, pada0[32], b0, padb0[32], c0, padc0[32], d0, padd0[32];
int e0, pade0[32], f0, padf0[32], g0, padg0[32], h0, padh0[32];
int i0, padi0[32], j0, padj0[32], k0, padk0[32], l0, padl0[32];
int m0, padm0[32], n0, padn0[32], o0, pado0[32], p0, padp0[32];
int q0, padq0[32], r0, padr0[32], s0, pads0[32], t0, padt0[32];
int u0, padu0[32], v0, padv0[32], w0, padw0[32], x0, padx0[32];
int y0, pady0[32], z0, padz0[32];


int a1, pada1[32], b1, padb1[32], c1, padc1[32], d1, padd1[32];
int e1, pade1[32], f1, padf1[32], g1, padg1[32], h1, padh1[32];
int i1, padi1[32], j1, padj1[32], k1, padk1[32], l1, padl1[32];
int m1, padm1[32], n1, padn1[32], o1, pado1[32], p1, padp1[32];
int q1, padq1[32], r1, padr1[32], s1, pads1[32], t1, padt1[32];
int u1, padu1[32], v1, padv1[32], w1, padw1[32], x1, padx1[32];
int y1, pady1[32], z1, padz1[32];
int a2, pada2[32], b2, padb2[32], c2, padc2[32], d2, padd2[32];
int e2, pade2[32], f2, padf2[32], g2, padg2[32], h2, padh2[32];
int i2, padi2[32], j2, padj2[32], k2, padk2[32], l2, padl2[32];
int m2, padm2[32], n2, padn2[32], o2, pado2[32], p2, padp2[32];
int q2, padq2[32], r2, padr2[32], s2, pads2[32], t2, padt2[32];
int u2, padu2[32], v2, padv2[32], w2, padw2[32], x2, padx2[32];
int y2, pady2[32], z2, padz2[32];
int a3, pada3[32], b3, padb3[32], c3, padc3[32], d3, padd3[32];
int e3, pade3[32], f3, padf3[32], g3, padg3[32], h3, padh3[32];
int i3, padi3[32], j3, padj3[32], k3, padk3[32], l3, padl3[32];
int m3, padm3[32], n3, padn3[32], o3, pado3[32], p3, padp3[32];
int q3, padq3[32], r3, padr3[32], s3, pads3[32], t3, padt3[32];
int u3, padu3[32], v3, padv3[32], w3, padw3[32], x3, padx3[32];
int y3, pady3[32], z3, padz3[32];
int a4, pada4[32], b4, padb4[32], c4, padc4[32], d4, padd4[32];
int e4, pade4[32], f4, padf4[32], g4, padg4[32], h4, padh4[32];
int i4, padi4[32], j4, padj4[32], k4, padk4[32], l4, padl4[32];
int m4, padm4[32], n4, padn4[32], o4, pado4[32], p4, padp4[32];
int q4, padq4[32], r4, padr4[32], s4, pads4[32], t4, padt4[32];
int u4, padu4[32], v4, padv4[32], w4, padw4[32], x4, padx4[32];
int y4, pady4[32], z4, padz4[32];
int a5, pada5[32], b5, padb5[32], c5, padc5[32], d5, padd5[32];
int e5, pade5[32], f5, padf5[32], g5, padg5[32], h5, padh5[32];
int i5, padi5[32], j5, padj5[32], k5, padk5[32], l5, padl5[32];
int m5, padm5[32], n5, padn5[32], o5, pado5[32], p5, padp5[32];
int q5, padq5[32], r5, padr5[32], s5, pads5[32], t5, padt5[32];
int u5, padu5[32], v5, padv5[32], w5, padw5[32], x5, padx5[32];
int y5, pady5[32], z5, padz5[32];
int a6, pada6[32], b6, padb6[32], c6, padc6[32], d6, padd6[32];
int e6, pade6[32], f6, padf6[32], g6, padg6[32], h6, padh6[32];
int i6, padi6[32], j6, padj6[32], k6, padk6[32], l6, padl6[32];
int m6, padm6[32], n6, padn6[32], o6, pado6[32], p6, padp6[32];
int q6, padq6[32], r6, padr6[32], s6, pads6[32], t6, padt6[32];
int u6, padu6[32], v6, padv6[32], w6, padw6[32], x6, padx6[32];
int y6, pady6[32], z6, padz6[32];
int a7, pada7[32], b7, padb7[32], c7, padc7[32], d7, padd7[32];
int e7, pade7[32], f7, padf7[32], g7, padg7[32], h7, padh7[32];
int i7, padi7[32], j7, padj7[32], k7, padk7[32], l7, padl7[32];


int m7, padm7[32], n7, padn7[32], o7, pado7[32], p7, padp7[32];
int q7, padq7[32], r7, padr7[32], s7, pads7[32], t7, padt7[32];
int u7, padu7[32], v7, padv7[32], w7, padw7[32], x7, padx7[32];
int y7, pady7[32], z7, padz7[32];
int a8, pada8[32], b8, padb8[32], c8, padc8[32], d8, padd8[32];
int e8, pade8[32], f8, padf8[32], g8, padg8[32], h8, padh8[32];
int i8, padi8[32], j8, padj8[32], k8, padk8[32], l8, padl8[32];
int m8, padm8[32], n8, padn8[32], o8, pado8[32], p8, padp8[32];
int q8, padq8[32], r8, padr8[32], s8, pads8[32], t8, padt8[32];
int u8, padu8[32], v8, padv8[32], w8, padw8[32], x8, padx8[32];
int y8, pady8[32], z8, padz8[32];
int a9, pada9[32], b9, padb9[32], c9, padc9[32], d9, padd9[32];
int e9, pade9[32], f9, padf9[32], g9, padg9[32], h9, padh9[32];
int i9, padi9[32], j9, padj9[32], k9, padk9[32], l9, padl9[32];
int m9, padm9[32], n9, padn9[32], o9, pado9[32], p9, padp9[32];
int q9, padq9[32], r9, padr9[32], s9, pads9[32], t9, padt9[32];
int u9, padu9[32], v9, padv9[32], w9, padw9[32], x9, padx9[32];
int y9, pady9[32], z9, padz9[32];

#ifdef ACCESSOR_METHOD
int geta0(), getb0(), getc0(), getd0(), gete0(), getf0(), getg0();
int geth0(), geti0(), getj0(), getk0(), getl0(), getm0(), getn0();
int geto0(), getp0(), getq0(), getr0(), gets0(), gett0(), getu0();
int getv0(), getw0(), getx0(), gety0(), getz0();
int geta1(), getb1(), getc1(), getd1(), gete1(), getf1(), getg1();
int geth1(), geti1(), getj1(), getk1(), getl1(), getm1(), getn1();
int geto1(), getp1(), getq1(), getr1(), gets1(), gett1(), getu1();
int getv1(), getw1(), getx1(), gety1(), getz1();
int geta2(), getb2(), getc2(), getd2(), gete2(), getf2(), getg2();
int geth2(), geti2(), getj2(), getk2(), getl2(), getm2(), getn2();
int geto2(), getp2(), getq2(), getr2(), gets2(), gett2(), getu2();
int getv2(), getw2(), getx2(), gety2(), getz2();
int geta3(), getb3(), getc3(), getd3(), gete3(), getf3(), getg3();
int geth3(), geti3(), getj3(), getk3(), getl3(), getm3(), getn3();
int geto3(), getp3(), getq3(), getr3(), gets3(), gett3(), getu3();
int getv3(), getw3(), getx3(), gety3(), getz3();
int geta4(), getb4(), getc4(), getd4(), gete4(), getf4(), getg4();
int geth4(), geti4(), getj4(), getk4(), getl4(), getm4(), getn4();
int geto4(), getp4(), getq4(), getr4(), gets4(), gett4(), getu4();
int getv4(), getw4(), getx4(), gety4(), getz4();
int geta5(), getb5(), getc5(), getd5(), gete5(), getf5(), getg5();
int geth5(), geti5(), getj5(), getk5(), getl5(), getm5(), getn5();
int geto5(), getp5(), getq5(), getr5(), gets5(), gett5(), getu5();
int getv5(), getw5(), getx5(), gety5(), getz5();


int geta6(), getb6(), getc6(), getd6(), gete6(), getf6(), getg6();
int geth6(), geti6(), getj6(), getk6(), getl6(), getm6(), getn6();
int geto6(), getp6(), getq6(), getr6(), gets6(), gett6(), getu6();
int getv6(), getw6(), getx6(), gety6(), getz6();
int geta7(), getb7(), getc7(), getd7(), gete7(), getf7(), getg7();
int geth7(), geti7(), getj7(), getk7(), getl7(), getm7(), getn7();
int geto7(), getp7(), getq7(), getr7(), gets7(), gett7(), getu7();
int getv7(), getw7(), getx7(), gety7(), getz7();
int geta8(), getb8(), getc8(), getd8(), gete8(), getf8(), getg8();
int geth8(), geti8(), getj8(), getk8(), getl8(), getm8(), getn8();
int geto8(), getp8(), getq8(), getr8(), gets8(), gett8(), getu8();
int getv8(), getw8(), getx8(), gety8(), getz8();
int geta9(), getb9(), getc9(), getd9(), gete9(), getf9(), getg9();
int geth9(), geti9(), getj9(), getk9(), getl9(), getm9(), getn9();
int geto9(), getp9(), getq9(), getr9(), gets9(), gett9(), getu9();
int getv9(), getw9(), getx9(), gety9(), getz9();

void setz9( int );

int geta0() { return( a0 ); }
int getb0() { return( b0 ); }
int getc0() { return( c0 ); }
int getd0() { return( d0 ); }
int gete0() { return( e0 ); }
int getf0() { return( f0 ); }
int getg0() { return( g0 ); }
int geth0() { return( h0 ); }
int geti0() { return( i0 ); }
int getj0() { return( j0 ); }
int getk0() { return( k0 ); }
int getl0() { return( l0 ); }
int getm0() { return( m0 ); }
int getn0() { return( n0 ); }
int geto0() { return( o0 ); }
int getp0() { return( p0 ); }
int getq0() { return( q0 ); }
int getr0() { return( r0 ); }
int gets0() { return( s0 ); }
int gett0() { return( t0 ); }
int getu0() { return( u0 ); }
int getv0() { return( v0 ); }
int getw0() { return( w0 ); }
int getx0() { return( x0 ); }
int gety0() { return( y0 ); }
int getz0() { return( z0 ); }
int geta1() { return( a1 ); }
int getb1() { return( b1 ); }
int getc1() { return( c1 ); }


int getd1() { return( d1 ); }
int gete1() { return( e1 ); }
int getf1() { return( f1 ); }
int getg1() { return( g1 ); }
int geth1() { return( h1 ); }
int geti1() { return( i1 ); }
int getj1() { return( j1 ); }
int getk1() { return( k1 ); }
int getl1() { return( l1 ); }
int getm1() { return( m1 ); }
int getn1() { return( n1 ); }
int geto1() { return( o1 ); }
int getp1() { return( p1 ); }
int getq1() { return( q1 ); }
int getr1() { return( r1 ); }
int gets1() { return( s1 ); }
int gett1() { return( t1 ); }
int getu1() { return( u1 ); }
int getv1() { return( v1 ); }
int getw1() { return( w1 ); }
int getx1() { return( x1 ); }
int gety1() { return( y1 ); }
int getz1() { return( z1 ); }
int geta2() { return( a2 ); }
int getb2() { return( b2 ); }
int getc2() { return( c2 ); }
int getd2() { return( d2 ); }
int gete2() { return( e2 ); }
int getf2() { return( f2 ); }
int getg2() { return( g2 ); }
int geth2() { return( h2 ); }
int geti2() { return( i2 ); }
int getj2() { return( j2 ); }
int getk2() { return( k2 ); }
int getl2() { return( l2 ); }
int getm2() { return( m2 ); }
int getn2() { return( n2 ); }
int geto2() { return( o2 ); }
int getp2() { return( p2 ); }
int getq2() { return( q2 ); }
int getr2() { return( r2 ); }
int gets2() { return( s2 ); }
int gett2() { return( t2 ); }
int getu2() { return( u2 ); }
int getv2() { return( v2 ); }
int getw2() { return( w2 ); }
int getx2() { return( x2 ); }
int gety2() { return( y2 ); }
int getz2() { return( z2 ); }
int geta3() { return( a3 ); }
int getb3() { return( b3 ); }
int getc3() { return( c3 ); }


int getd3() { return( d3 ); }
int gete3() { return( e3 ); }
int getf3() { return( f3 ); }
int getg3() { return( g3 ); }
int geth3() { return( h3 ); }
int geti3() { return( i3 ); }
int getj3() { return( j3 ); }
int getk3() { return( k3 ); }
int getl3() { return( l3 ); }
int getm3() { return( m3 ); }
int getn3() { return( n3 ); }
int geto3() { return( o3 ); }
int getp3() { return( p3 ); }
int getq3() { return( q3 ); }
int getr3() { return( r3 ); }
int gets3() { return( s3 ); }
int gett3() { return( t3 ); }
int getu3() { return( u3 ); }
int getv3() { return( v3 ); }
int getw3() { return( w3 ); }
int getx3() { return( x3 ); }
int gety3() { return( y3 ); }
int getz3() { return( z3 ); }
int geta4() { return( a4 ); }
int getb4() { return( b4 ); }
int getc4() { return( c4 ); }
int getd4() { return( d4 ); }
int gete4() { return( e4 ); }
int getf4() { return( f4 ); }
int getg4() { return( g4 ); }
int geth4() { return( h4 ); }
int geti4() { return( i4 ); }
int getj4() { return( j4 ); }
int getk4() { return( k4 ); }
int getl4() { return( l4 ); }
int getm4() { return( m4 ); }
int getn4() { return( n4 ); }
int geto4() { return( o4 ); }
int getp4() { return( p4 ); }
int getq4() { return( q4 ); }
int getr4() { return( r4 ); }
int gets4() { return( s4 ); }
int gett4() { return( t4 ); }
int getu4() { return( u4 ); }
int getv4() { return( v4 ); }
int getw4() { return( w4 ); }
int getx4() { return( x4 ); }
int gety4() { return( y4 ); }
int getz4() { return( z4 ); }
int geta5() { return( a5 ); }
int getb5() { return( b5 ); }
int getc5() { return( c5 ); }


int getd5() { return( d5 ); }
int gete5() { return( e5 ); }
int getf5() { return( f5 ); }
int getg5() { return( g5 ); }
int geth5() { return( h5 ); }
int geti5() { return( i5 ); }
int getj5() { return( j5 ); }
int getk5() { return( k5 ); }
int getl5() { return( l5 ); }
int getm5() { return( m5 ); }
int getn5() { return( n5 ); }
int geto5() { return( o5 ); }
int getp5() { return( p5 ); }
int getq5() { return( q5 ); }
int getr5() { return( r5 ); }
int gets5() { return( s5 ); }
int gett5() { return( t5 ); }
int getu5() { return( u5 ); }
int getv5() { return( v5 ); }
int getw5() { return( w5 ); }
int getx5() { return( x5 ); }
int gety5() { return( y5 ); }
int getz5() { return( z5 ); }
int geta6() { return( a6 ); }
int getb6() { return( b6 ); }
int getc6() { return( c6 ); }
int getd6() { return( d6 ); }
int gete6() { return( e6 ); }
int getf6() { return( f6 ); }
int getg6() { return( g6 ); }
int geth6() { return( h6 ); }
int geti6() { return( i6 ); }
int getj6() { return( j6 ); }
int getk6() { return( k6 ); }
int getl6() { return( l6 ); }
int getm6() { return( m6 ); }
int getn6() { return( n6 ); }
int geto6() { return( o6 ); }
int getp6() { return( p6 ); }
int getq6() { return( q6 ); }
int getr6() { return( r6 ); }
int gets6() { return( s6 ); }
int gett6() { return( t6 ); }
int getu6() { return( u6 ); }
int getv6() { return( v6 ); }
int getw6() { return( w6 ); }
int getx6() { return( x6 ); }
int gety6() { return( y6 ); }
int getz6() { return( z6 ); }
int geta7() { return( a7 ); }
int getb7() { return( b7 ); }
int getc7() { return( c7 ); }


int getd7() { return( d7 ); }
int gete7() { return( e7 ); }
int getf7() { return( f7 ); }
int getg7() { return( g7 ); }
int geth7() { return( h7 ); }
int geti7() { return( i7 ); }
int getj7() { return( j7 ); }
int getk7() { return( k7 ); }
int getl7() { return( l7 ); }
int getm7() { return( m7 ); }
int getn7() { return( n7 ); }
int geto7() { return( o7 ); }
int getp7() { return( p7 ); }
int getq7() { return( q7 ); }
int getr7() { return( r7 ); }
int gets7() { return( s7 ); }
int gett7() { return( t7 ); }
int getu7() { return( u7 ); }
int getv7() { return( v7 ); }
int getw7() { return( w7 ); }
int getx7() { return( x7 ); }
int gety7() { return( y7 ); }
int getz7() { return( z7 ); }
int geta8() { return( a8 ); }
int getb8() { return( b8 ); }
int getc8() { return( c8 ); }
int getd8() { return( d8 ); }
int gete8() { return( e8 ); }
int getf8() { return( f8 ); }
int getg8() { return( g8 ); }
int geth8() { return( h8 ); }
int geti8() { return( i8 ); }
int getj8() { return( j8 ); }
int getk8() { return( k8 ); }
int getl8() { return( l8 ); }
int getm8() { return( m8 ); }
int getn8() { return( n8 ); }
int geto8() { return( o8 ); }
int getp8() { return( p8 ); }
int getq8() { return( q8 ); }
int getr8() { return( r8 ); }
int gets8() { return( s8 ); }
int gett8() { return( t8 ); }
int getu8() { return( u8 ); }
int getv8() { return( v8 ); }
int getw8() { return( w8 ); }
int getx8() { return( x8 ); }
int gety8() { return( y8 ); }
int getz8() { return( z8 ); }
int geta9() { return( a9 ); }
int getb9() { return( b9 ); }
int getc9() { return( c9 ); }


int getd9() { return( d9 ); }
int gete9() { return( e9 ); }
int getf9() { return( f9 ); }
int getg9() { return( g9 ); }
int geth9() { return( h9 ); }
int geti9() { return( i9 ); }
int getj9() { return( j9 ); }
int getk9() { return( k9 ); }
int getl9() { return( l9 ); }
int getm9() { return( m9 ); }
int getn9() { return( n9 ); }
int geto9() { return( o9 ); }
int getp9() { return( p9 ); }
int getq9() { return( q9 ); }
int getr9() { return( r9 ); }
int gets9() { return( s9 ); }
int gett9() { return( t9 ); }
int getu9() { return( u9 ); }
int getv9() { return( v9 ); }
int getw9() { return( w9 ); }
int getx9() { return( x9 ); }
int gety9() { return( y9 ); }
int getz9() { return( z9 ); }

void setz9( int value ) { z9 = value; }

#endif

void main(void)
{
    /* Measure timer overhead and pre-load into cpu cache. */
    /* Note that the timer-only numbers need to be subtracted from
     * the data access numbers to compensate for timer overhead. */
    ZTIMERON(); ZTIMEROFF(); ZTIMERREPORT();
    ZTIMERON(); ZTIMEROFF(); ZTIMERREPORT();
    ZTIMERON(); ZTIMEROFF(); ZTIMERREPORT();

#ifdef DIRECT_METHOD
    ZTIMERON();
    z9 += a0 + b0 + c0 + d0 + e0 + f0 + g0 + h0 + i0 + j0 + k0 + l0 + m0 +
          n0 + o0 + p0 + q0 + r0 + s0 + t0 + u0 + v0 + w0 + x0 + y0 + z0;
    z9 += a1 + b1 + c1 + d1 + e1 + f1 + g1 + h1 + i1 + j1 + k1 + l1 + m1 +
          n1 + o1 + p1 + q1 + r1 + s1 + t1 + u1 + v1 + w1 + x1 + y1 + z1;
    z9 += a2 + b2 + c2 + d2 + e2 + f2 + g2 + h2 + i2 + j2 + k2 + l2 + m2 +
          n2 + o2 + p2 + q2 + r2 + s2 + t2 + u2 + v2 + w2 + x2 + y2 + z2;
    z9 += a3 + b3 + c3 + d3 + e3 + f3 + g3 + h3 + i3 + j3 + k3 + l3 + m3 +
          n3 + o3 + p3 + q3 + r3 + s3 + t3 + u3 + v3 + w3 + x3 + y3 + z3;
    z9 += a4 + b4 + c4 + d4 + e4 + f4 + g4 + h4 + i4 + j4 + k4 + l4 + m4 +
          n4 + o4 + p4 + q4 + r4 + s4 + t4 + u4 + v4 + w4 + x4 + y4 + z4;
    z9 += a5 + b5 + c5 + d5 + e5 + f5 + g5 + h5 + i5 + j5 + k5 + l5 + m5 +
          n5 + o5 + p5 + q5 + r5 + s5 + t5 + u5 + v5 + w5 + x5 + y5 + z5;
    z9 += a6 + b6 + c6 + d6 + e6 + f6 + g6 + h6 + i6 + j6 + k6 + l6 + m6 +
          n6 + o6 + p6 + q6 + r6 + s6 + t6 + u6 + v6 + w6 + x6 + y6 + z6;
    z9 += a7 + b7 + c7 + d7 + e7 + f7 + g7 + h7 + i7 + j7 + k7 + l7 + m7 +
          n7 + o7 + p7 + q7 + r7 + s7 + t7 + u7 + v7 + w7 + x7 + y7 + z7;
    z9 += a8 + b8 + c8 + d8 + e8 + f8 + g8 + h8 + i8 + j8 + k8 + l8 + m8 +
          n8 + o8 + p8 + q8 + r8 + s8 + t8 + u8 + v8 + w8 + x8 + y8 + z8;
    z9 += a9 + b9 + c9 + d9 + e9 + f9 + g9 + h9 + i9 + j9 + k9 + l9 + m9 +
          n9 + o9 + p9 + q9 + r9 + s9 + t9 + u9 + v9 + w9 + x9 + y9 + z9;
    ZTIMEROFF();
    ZTIMERREPORT();
#endif

#ifdef ACCESSOR_METHOD
    ZTIMERON();
    setz9( getz9() +
           geta0() + getb0() + getc0() + getd0() + gete0() + getf0() +
           getg0() + geth0() + geti0() + getj0() + getk0() + getl0() +
           getm0() + getn0() + geto0() + getp0() + getq0() + getr0() +
           gets0() + gett0() + getu0() + getv0() + getw0() + getx0() +
           gety0() + getz0() );
    setz9( getz9() +
           geta1() + getb1() + getc1() + getd1() + gete1() + getf1() +
           getg1() + geth1() + geti1() + getj1() + getk1() + getl1() +
           getm1() + getn1() + geto1() + getp1() + getq1() + getr1() +
           gets1() + gett1() + getu1() + getv1() + getw1() + getx1() +
           gety1() + getz1() );
    setz9( getz9() +
           geta2() + getb2() + getc2() + getd2() + gete2() + getf2() +
           getg2() + geth2() + geti2() + getj2() + getk2() + getl2() +
           getm2() + getn2() + geto2() + getp2() + getq2() + getr2() +
           gets2() + gett2() + getu2() + getv2() + getw2() + getx2() +
           gety2() + getz2() );
    setz9( getz9() +
           geta3() + getb3() + getc3() + getd3() + gete3() + getf3() +
           getg3() + geth3() + geti3() + getj3() + getk3() + getl3() +
           getm3() + getn3() + geto3() + getp3() + getq3() + getr3() +
           gets3() + gett3() + getu3() + getv3() + getw3() + getx3() +
           gety3() + getz3() );
    setz9( getz9() +
           geta4() + getb4() + getc4() + getd4() + gete4() + getf4() +
           getg4() + geth4() + geti4() + getj4() + getk4() + getl4() +
           getm4() + getn4() + geto4() + getp4() + getq4() + getr4() +
           gets4() + gett4() + getu4() + getv4() + getw4() + getx4() +
           gety4() + getz4() );
    setz9( getz9() +
           geta5() + getb5() + getc5() + getd5() + gete5() + getf5() +
           getg5() + geth5() + geti5() + getj5() + getk5() + getl5() +
           getm5() + getn5() + geto5() + getp5() + getq5() + getr5() +
           gets5() + gett5() + getu5() + getv5() + getw5() + getx5() +
           gety5() + getz5() );
    setz9( getz9() +
           geta6() + getb6() + getc6() + getd6() + gete6() + getf6() +
           getg6() + geth6() + geti6() + getj6() + getk6() + getl6() +
           getm6() + getn6() + geto6() + getp6() + getq6() + getr6() +
           gets6() + gett6() + getu6() + getv6() + getw6() + getx6() +
           gety6() + getz6() );
    setz9( getz9() +
           geta7() + getb7() + getc7() + getd7() + gete7() + getf7() +
           getg7() + geth7() + geti7() + getj7() + getk7() + getl7() +
           getm7() + getn7() + geto7() + getp7() + getq7() + getr7() +
           gets7() + gett7() + getu7() + getv7() + getw7() + getx7() +
           gety7() + getz7() );
    setz9( getz9() +
           geta8() + getb8() + getc8() + getd8() + gete8() + getf8() +
           getg8() + geth8() + geti8() + getj8() + getk8() + getl8() +
           getm8() + getn8() + geto8() + getp8() + getq8() + getr8() +
           gets8() + gett8() + getu8() + getv8() + getw8() + getx8() +
           gety8() + getz8() );
    setz9( getz9() +
           geta9() + getb9() + getc9() + getd9() + gete9() + getf9() +
           getg9() + geth9() + geti9() + getj9() + getk9() + getl9() +
           getm9() + getn9() + geto9() + getp9() + getq9() + getr9() +
           gets9() + gett9() + getu9() + getv9() + getw9() + getx9() +
           gety9() + getz9() );
    ZTIMEROFF();
    ZTIMERREPORT();
#endif
}

END FILE ‘acctest.c’
END X86 SOURCE CODE
---------------------------------------------------------------------------
END OF DOCUMENT
