
Run Time Efficiency of Accessor Functions

by Tim Lee, 7.13.98

Purpose: To measure the efficiency of accessor functions on a
variety of computers.

Introduction: Two programs are used as benchmarks to run on nine
different computers.

The first program measures how long it takes to fetch data from
memory using the direct access operations built into each
processor.

The second program measures how long it takes to fetch data
indirectly, by calling a subroutine composed of direct access
operations. This kind of routine is commonly called an accessor
function, or an accessor for short.

Since there are extra steps involved in the accessor function
method, it will always be slower than the direct method.

The purpose of this study is to measure how much slower.

Summary of Results: Accessor functions were measured to take
from three to eighteen times longer to do the same work as the
direct access method, depending on the processor type.

It was also found that the faster the CPU clock speed the less
efficient accessor functions are.

Finally, the PowerPC chip family was measured to process accessor
functions four times more efficiently than the 80x86 chip family.

In this paper the term efficiency means what it normally does,
namely, how much time or energy is required to do the same amount
of work relative to some standard.

In this case the standard is how long it takes to fetch a 16-bit
unit of memory using the direct access method:

             Time to fetch a 16-bit unit directly
Efficiency = ----------------------------------------
             Time to fetch a 16-bit unit via accessor

For example, an efficiency rating of 0.5 means it takes twice as
long for an accessor function to do the same work as using direct
access, and likewise an efficiency rating of 0.1 means it takes
ten times as long.
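The definition can be written out as a short C sketch. The function names are hypothetical and are not part of the benchmark source; the inputs are tick counts taken on the same machine with the same timer.

```c
#include <assert.h>

/* Efficiency as defined above: time for a direct fetch divided by
   time for the same fetch made through an accessor. */
double accessor_efficiency(unsigned long direct_ticks,
                           unsigned long accessor_ticks)
{
    assert(accessor_ticks > 0);   /* guard against dividing by zero */
    return (double)direct_ticks / (double)accessor_ticks;
}

/* "How many times slower" is the reciprocal of the efficiency. */
double times_slower(unsigned long direct_ticks,
                    unsigned long accessor_ticks)
{
    assert(direct_ticks > 0);     /* guard against dividing by zero */
    return (double)accessor_ticks / (double)direct_ticks;
}
```

Applied to the median tick counts reported for the PowerPC 601 in the measurement data (78976 direct, 236928 via accessor), the two functions give an efficiency of about .3333 and a slowdown factor of 3.00.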

A graphical summary of the results follows.

[Figure: Accessor Function Efficiency by Clock Speed. Efficiency
(vertical axis, 0.00 to 0.40) is plotted against clock speed in MHz
(horizontal axis, 0 to 400) for each processor tested.]

Processor Type      CPU Clock      Efficiency    How Many Times
                    Speed (MHz)                  Slower Than Direct
PowerPC 601          80            .3333          3.00
PowerPC 603ev       180            .2870          3.48
PowerPC 604         200            .2300          4.35
PowerPC 750 (G3)    266            .2620          3.81
80386SX              33            .0960         10.40
80486DX2             66            .0896         11.10
Pentium             133            .0769         13.00
Pentium             166            .0689         14.50
Pentium II          333            .0555         18.00

Description of Procedure: The following general procedure was
used:

1. Each benchmark was run nine times.

2. The median measurement was chosen to be representative of
   the other measurements for computing the efficiency rating.
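Picking the median of an odd number of trials can be sketched in C as follows. This is an illustration only; the function names are hypothetical and the paper does not say how the median was computed.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of tick counts. */
static int compare_ticks(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* Return the median of an odd number of tick counts.
   The array is sorted in place as a side effect. */
unsigned long median_ticks(unsigned long *ticks, size_t count)
{
    assert(count > 0 && count % 2 == 1);
    qsort(ticks, count, sizeof ticks[0], compare_ticks);
    return ticks[count / 2];
}
```

For nine trials the result is the fifth value in sorted order, so a single unusually slow or fast run does not shift the rating.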

On the Mac all benchmarks were run under MacOS 8 or 8.1.

On 80x86 machines all benchmarks were run under DOS after it was
found that running the benchmarks in a DOS command window under
Windows NT skewed the results.

Description of Benchmark Programs: The benchmark programs
consist of two main parts, timing functions and data access
functions. The timing part is written in assembler and the data
access part is written in ANSI C.

The Data Access Part: The data access part differs in the two
benchmark programs so that one method of access can be compared
with another.

In pattern, the source code differences are as follows:

Direct Access Source Code      Accessor Function Source Code

int x, y;                      int x, y;

x = y;                         int gety() { return( y ); }

                               x = gety();
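The two patterns can be put side by side in a small portable program. The sketch below is not the benchmark itself: it uses a loop over an array of padded slots for brevity, the standard `clock()` function instead of the machine-specific timers, and hypothetical names throughout. An optimizing compiler may inline the accessor, which would hide the cost being measured, so it would need to be built with inlining turned off.

```c
#include <assert.h>
#include <time.h>

#define SLOT_COUNT 256   /* number of target values (hypothetical) */
#define PAD_INTS   32    /* padding between target values */

/* One target value followed by padding. 'volatile' keeps the
   compiler from dropping or merging the reads. */
struct slot { volatile int value; int pad[PAD_INTS]; };

static struct slot slots[SLOT_COUNT];

/* Give every target a known value. */
void fill_slots(void)
{
    int i;
    for (i = 0; i < SLOT_COUNT; i++)
        slots[i].value = i;
}

/* Accessor: fetch one target through a function call. */
int get_slot(int i)
{
    return slots[i].value;
}

/* Direct method: read every target in place, 'repeats' times over. */
long sum_direct(int repeats)
{
    long sum = 0;
    int r, i;
    for (r = 0; r < repeats; r++)
        for (i = 0; i < SLOT_COUNT; i++)
            sum += slots[i].value;
    return sum;
}

/* Accessor method: the same reads, each made through get_slot(). */
long sum_accessor(int repeats)
{
    long sum = 0;
    int r, i;
    for (r = 0; r < repeats; r++)
        for (i = 0; i < SLOT_COUNT; i++)
            sum += get_slot(i);
    return sum;
}

/* Run one method and return the processor time it used, in seconds. */
double time_method(long (*method)(int), int repeats, long *sum_out)
{
    clock_t start;
    assert(method != 0 && sum_out != 0);
    start = clock();
    *sum_out = method(repeats);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `fill_slots()` and then `time_method()` once with `sum_direct` and once with `sum_accessor` gives two times whose ratio corresponds to the efficiency rating. Because `clock()` is coarse, the repeat count has to be large before the ratio means anything.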

The data access part of the benchmark program is designed to test
the general case of data access where the data is in main memory
rather than in level 1 or level 2 cache memory.

To accomplish this aim, many separate memory locations are read
rather than reading from the same memory location many times.
Also, the memory addresses are separated enough so that cache line
fetches can’t get more than one target value at a time.
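The spacing argument can be checked with a little arithmetic. The sketch below assumes a 32-byte cache line, which is a hypothetical figure chosen for illustration, and stores the padded values in an array of structures because C does not promise that separately declared globals are laid out in declaration order. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAD_INTS           32   /* padding between target values */
#define ASSUMED_LINE_BYTES 32   /* hypothetical cache line size */

/* One target value followed by its padding. */
struct padded_int { int value; int pad[PAD_INTS]; };

/* Distance in bytes between consecutive target values when the
   padded values are stored back to back in an array. */
size_t target_stride_bytes(void)
{
    return sizeof(struct padded_int);
}

/* Nonzero when no single cache line of the given size can hold two
   target values: two addresses that are at least one line apart
   can never fall within the same line. */
int targets_never_share_line(size_t line_bytes)
{
    assert(line_bytes > 0);
    return target_stride_bytes() >= line_bytes;
}
```

With 32 integers of padding, the stride is 33 times the size of an `int`, which is 66 bytes with 16-bit integers and 132 bytes with 32-bit integers. Either is wider than the line size assumed here.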

What is unknown is the extent to which the program resides in
cache memory after having been loaded from disk by the OS just
prior to running it. More experiments would need to be done to
separate out this possible bias, but the efficiency of accessor
functions isn’t expected to change since both benchmarks run under
the same conditions.

The Timing Part: The timing functions of the benchmark programs
are machine type specific. On the Mac timing functions read time
registers built into the PowerPC chip providing a very high
resolution measurement of time. On 80x86 machines timing functions
read the 8253 timer chip. Details can be found in the source code
below.

Description of Software: On the Mac, the source code for the
benchmark programs was compiled using the Metrowerks CodeWarrior
Pro 2 compiler. On the 80x86 machines, the benchmark C source code
was compiled using Microsoft QuickC Version 2.5 and the assembler
code for the timer functions was assembled using Microsoft Macro
Assembler 5.1.

Description of Hardware:

Computer                   Processor Type      CPU Clock Speed (MHz)
PowerMac 8100/80           PowerPC 601          80
PowerBook 2400c/180        PowerPC 603ev       180
PowerMac 9500              PowerPC 604         200
PowerMac G3                PowerPC 750 (G3)    266
A generic 386SX box        80386SX              33
Commax Desktop Systems     80486DX2             66
Micron Home MPC Pro        Pentium             133
Micron Millenium           Pentium             166
Dell Dimension XPS D333    Pentium II          333

Measurement Data: The following measurement data was collected
to compute the accessor efficiency ratings:

PowerMac 8100/80
PowerPC 601, 80 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         78208         236928
2         78208         235904
3         78976         238336
4         78592         236160
5         78336         237184
6         79616         236544
7         78976         238336
8         79232         243840
9         79104         235776
Median    78976         236928

PowerBook 2400c/180
PowerPC 603ev, 180 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         423           1527
2         440           1545
3         449           1530
4         425           1534
5         442           1539
6         441           1538
7         437           1533
8         436           1544
9         448           1524
Median    440           1534

PowerMac 9500
PowerPC 604, 200 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         524           2101
2         482           2308
3         498           1990
4         492           2031
5         471           2356
6         484           1696
7         478           2559
8         483           2370
9         466           1910
Median    483           2101

PowerMac G3
PowerPC 750 (G3), 266 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         389           1456
2         388           1481
3         385           1488
4         380           1492
5         381           1479
6         394           1462
7         408           1465
8         387           1479
9         388           1465
Median    388           1479

A generic 386SX box
80386SX, 33 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         194           2030
2         195           2033
3         196           2030
4         195           2033
5         195           2030
6         195           2033
7         195           2031
8         195           2031
9         194           2032
Median    195           2031

Commax Desktop Systems
80486DX2, 66 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         107           1175
2         107           1177
3         107           1177
4         107           1174
5         106           1187
6         106           1167
7         106           1174
8         106           1177
9         106           1177
Median    106           1177

Micron Home MPC Pro
Pentium, 133 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         12            143
2         11            143
3         10            144
4         12            144
5         11            144
6         11            143
7         11            143
8         12            143
9         12            144
Median    11            143

Micron Millenium
Pentium, 166 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         11            146
2         10            143
3         9             145
4         10            145
5         12            143
6         10            145
7         9             145
8         9             144
9         10            144
Median    10            145

Dell Dimension XPS D333
Pentium II, 333 MHz
          Direct        Accessor
Trial     Tick Count    Tick Count
1         6             109
2         6             108
3         6             112
4         6             108
5         5             108
6         6             108
7         6             108
8         6             108
9         6             108
Median    6             108

Source code for the Mac and x86 benchmarks follows:

BEGIN MAC SOURCE CODE =====================================

BEGIN FILE ‘TimePPC.h’ ------------------------------------

/*------------------------------------------------------------
| NAME: TimePPC.h
|
| PURPOSE: To provide interface to time functions for the
| PowerPC chips.
|
| DESCRIPTION:
|
| NOTE:
|
| HISTORY: 02.07.98
------------------------------------------------------------*/

#ifndef _TIMEPPC_H_
#define _TIMEPPC_H_

#ifdef __cplusplus
extern "C"
{
#endif

void ElapsedTimePPC( u64* );
asm void GetRealTimePPC( u64* );
asm void GetTimeBasePPC( u64* );
void GetTimePPC( u64* );
void SetUpTimePPC();

#ifdef __cplusplus
} // extern "C"
#endif

#endif // _TIMEPPC_H_

END FILE ‘TimePPC.h’ --------------------------------------
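The header above pairs 'GetTimePPC' with 'ElapsedTimePPC': the first marks a start time and the second replaces it with the number of ticks elapsed. The arithmetic behind such a pair can be sketched in portable C. The names below are hypothetical, and the sketch uses the C99 `uint64_t` type in place of the `u64` typedef used in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 64-bit
   counter. Unsigned subtraction is carried out modulo 2^64, so the
   result is still correct if the counter wrapped around once
   between the two readings. */
uint64_t elapsed_ticks(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Remove the measured cost of the timing calls themselves from a
   raw elapsed time, clamping at zero. */
uint64_t corrected_ticks(uint64_t raw, uint64_t overhead)
{
    return raw > overhead ? raw - overhead : 0;
}

/* Small self-check covering the ordinary and wrap-around cases. */
void check_elapsed_ticks(void)
{
    /* Ordinary case: the counter did not wrap. */
    assert(elapsed_ticks(100, 250) == 150);

    /* Identical readings mean no ticks elapsed. */
    assert(elapsed_ticks(42, 42) == 0);

    /* Wrap-around: 6 ticks to reach zero, then 10 more. */
    assert(elapsed_ticks(UINT64_MAX - 5, 10) == 16);

    /* Overhead larger than the raw time is clamped to zero. */
    assert(corrected_ticks(3, 5) == 0);
    assert(corrected_ticks(150, 5) == 145);
}
```

A single unsigned subtraction covers both cases, which avoids a separate wrap-around branch and the boundary cases that come with one.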

BEGIN FILE ‘TimePPC.c’ ------------------------------------

/*------------------------------------------------------------
| NAME: TimePPC.c
|
| PURPOSE: To provide timing functions for PowerPC chips.
|
| DESCRIPTION: There are two different ways of measuring the
| rate of change in the PowerPC chip family: one way only
| works with the 601 chip and the other way only works for
| non-601 chips.

|
| On the PowerMac these differences are smoothed over by
| using the illegal instruction handler to emulate one
| or the other timing methods. The seam that shows up is
| the extra time spent servicing the illegal instruction
| exception.
|
| The two functions 'GetRealTimePPC' and 'GetTimeBasePPC'
| are opcode-for-opcode identical to the low-level routines
| used by the Metrowerks Profiler. These routines are also
| published in the official PowerPC manuals as the right
| way to get the contents of the timing registers.
|
| The following is from 'PowerPC 601 RISC Microprocessor
| User's Manual', p. B-7:
|
| "B.24 Timing Facilities
|
| This section describes differences between the POWER
| architecture and the PowerPC architecture timer facilities.
|
| B.24.1 Real-Time Clock
|
| The 601 implements a POWER-based RTC. Note that the
| POWER RTC is not supported in the PowerPC architecture.
| Instead, the PowerPC architecture provides a time base
| (TB).
|
| Both the RTC and the time base are 64-bit special purpose
| registers, but they differ in the following respects.
|
| * The RTC counts seconds, and nanoseconds, while the TB
| counts 'ticks'. The frequency of the RTC is implementation-
| dependent.
|
| * The RTC increments discontinuously -- 1 is added to RTCU
| when the value in RTCL passes 999_999_999. The TB
| increments continuously -- 1 is added to TBU when the
| value in TBL passes x'FFFF FFFF'.
|
| * The RTC is written and read by the 'mtspr' and 'mfspr'
| instructions, using SPR numbers that denote the RTCU and
| RTCL. The TB is written to and read by the instructions
| 'mtspr' and 'mftb'.
|
| * The SPR numbers that denote RTCL and RTCU are invalid in
| the PowerPC architecture except the 601.
|
| * The RTC is guaranteed to increment at least once in the
| time required to execute 10 Add Immediate (addi)
| instructions. No analogous guarantee is made for the TB.
|

| * Not all bits of RTCL need to be implemented, while all
| bits of the TB must be implemented."
|
| From page 10-127: "For forward compatibility with other
| members of the PowerPC microprocessor family the 'mftb'
| instruction should be used to obtain the contents of the
| RTCL and RTCU registers. The 'mftb' instruction is a
| PowerPC instruction unimplemented by the 601, and will
| be trapped by the illegal instruction exception handler,
| which can then issue the appropriate mfspr instructions
| for reading the RTCL and RTCU registers."
|
| HISTORY: 02.07.98 from "MacTech" Jan '98 p. 48 which cites
| PowerPC 601 RISC User's Manual by
| Motorola as its source.
| 06.15.98 Added notes and updated for chips beyond
| 601.
| 06.16.98 Revised to read entire 64-bit register not
| just the low part.
| 06.18.98 Edited comments.
------------------------------------------------------------*/

#include <Gestalt.h>

typedef unsigned char u8;
typedef unsigned short u16;
typedef unsigned long u32;
typedef unsigned long long u64;

typedef signed char s8;
typedef short s16;
typedef long s32;
typedef long long s64;

#include "TimePPC.h"

u64 GetTimePPCOverhead;
// The number of ticks that need to be subtracted from
// an elapsed time result to correct for the time taken
// to measure the time. This gets set by 'SetUpTimePPC()'.

static u32 CPUType = 0;
// Holds '601' if the currently running CPU is a 601 chip,
// else holds '603'. This gets set by 'SetUpTimePPC()'.

/*------------------------------------------------------------
| NAME: ElapsedTimePPC
|-------------------------------------------------------------
|
| PURPOSE: To compute the elapsed time since a time
| measurement was taken.
|

| DESCRIPTION: This is a generic routine for high-frequency
| time measurement on any PowerPC chip.
|
| Takes a time value as input, computes the number of ticks
| between then and now, and saves the result over the input.
|
| Returns the elapsed time measured in units dependent on the
| tick rate of the chip.
|
| The input/result is a 64-bit number with this format:
|
| -------------------
| | Hi | Lo |
| Byte -------------------
| Offset 0 4
|
| EXAMPLE:
|
| u64 ATime;
|
| GetTimePPC( &ATime );
|
| < Some code to be timed goes here. >
|
| ElapsedTimePPC( &ATime );
|
| NOTE:
|
| ASSUMES: The function 'SetUpTimePPC()' has been called
| prior to calling this function to identify the
| CPU type that is running.
|
| The process takes less time than the longest span
| that can be measured by the time base.
|
| HISTORY: 06.17.98
------------------------------------------------------------*/
void
ElapsedTimePPC( u64* t )
{
    u64 now;

    // Mark the end of a process being timed.
    GetTimePPC( &now );

    // If the end time is not smaller than the start time.
    if( now >= *t )
    {
        // Compute the difference.
        *t = now - *t;
    }
    else // The time base has wrapped around during the
         // process being timed.
    {
        // Count the ticks from the start time up to the wrap-around
        // point, including the tick that wraps the counter to zero.
        *t = ((u64) -1) - *t + 1;

        // Add the final time.
        *t += now;
    }
}

/*------------------------------------------------------------
| NAME: GetRealTimePPC
|-------------------------------------------------------------
|
| PURPOSE: To read the real-time clock registers of the
| PowerPC 601 chip.
|
| DESCRIPTION: Returns the contents of the RTC registers as a
| 64-bit number with this format:
|
| -----------------------
| | RTCU | RTCL |
| Byte -----------------------
| Offset 0 4
|
| where:
|
| RTCU is the upper register of the real time clock which
| holds the number of seconds since the time specified
| in the software.
|
| RTCL is the lower register of the real time clock. It
| holds the number of nanoseconds since the beginning
| of the second, with a resolution of 128 nanoseconds
| per tick.
|
| Not all the bits are implemented; unimplemented bits
| always read as 0.
|
| RTCL
| ---------------------------------
| | 00 | | 0000000 |
| ---------------------------------
| 0 1 2 24 25 31
| ^
| |__ Least Significant Bit
|
| The low register counts from zero to 999,999,872, which is
| one billion minus 128. The next time RTCL is incremented,
| it cycles to all zeros and RTCU is incremented.
|
| The RTCL is incremented 7812500 times per second, once
| every 128 nanoseconds.
|
| EXAMPLE:
|
| u64 RTCL_HiLo;
|
| GetRealTimePPC( &RTCL_HiLo );
|
| NOTE: See page 2-16 of 601 User's Manual for the detailed
|
|
| ASSUMES:
|
| HISTORY: 06.17.98
| 06.24.98 Updated description.
------------------------------------------------------------*/
asm
void
GetRealTimePPC( u64* t )
{
machine 601 // This is only for the 601 chip.
A: mfspr r4, 4 // Get upper real time clock register.
mfspr r5, 5 // Get lower real time clock register.
mfspr r6, 4 // Get upper real time clock register again.
cmpw r4,r6 // If the upper register has changed.
bne A // Try reading again.
stw r4,0(r3) // Put the hi part at the result.
stw r5,4(r3) // Put the lo part at offset 4 of result.
blr // Return.
}

/*------------------------------------------------------------
| NAME: GetTimeBasePPC
|-------------------------------------------------------------
|
| PURPOSE: To read the time base register of any PowerPC chip
| other than the 601 chip.
|
| DESCRIPTION: Returns a number measured in units dependent
| on the time base tick rate of the chip.
|
| The result is a 64-bit number with this format:
|
| -------------------
| | Hi | Lo |
| Byte -------------------
| Offset 0 4
|
| EXAMPLE:
|
| u64 Before, After, Diff;

|
| GetTimeBasePPC( &Before );
|
| < Some code to be timed goes here. >
|
| GetTimeBasePPC( &After );
|
| // Assuming value in 'After' is larger than 'Before',
| // calculate the elapsed time in ticks.
| Diff = After - Before;
|
| NOTE: Not supported on the 601, use 'GetRealTimePPC()'
| instead.
|
| ASSUMES:
|
| HISTORY: 06.17.98
------------------------------------------------------------*/
asm
void
GetTimeBasePPC( u64* t )
{
machine 603 // For any PowerPC chip other than the 601.
A: mftbu r4 // Get the upper time base register.
mftb r5 // Get the lower time base register.
mftbu r6 // Get upper time base register again.
cmpw r4,r6 // If the upper register has changed.
bne A // Try reading again.
stw r4,0(r3) // Put the hi part at the result.
stw r5,4(r3) // Put the lo part at offset 4 of result.
blr // Return.
}

/*------------------------------------------------------------
| NAME: GetTimePPC
|-------------------------------------------------------------
|
| PURPOSE: To read the time register of any PowerPC chip.
|
| DESCRIPTION: This is a generic routine for high-frequency
| time measurement on any PowerPC chip.
|
| Returns a number measured in units dependent on the tick
| rate of the chip.
|
| The result is a 64-bit number with this format:
|
| -------------------
| | Hi | Lo |
| Byte -------------------
| Offset 0 4
|

| EXAMPLE:
|
| u64 ATime;
|
| GetTimePPC( &ATime );
|
| < Some code to be timed goes here. >
|
| ElapsedTimePPC( &ATime );
|
| NOTE:
|
| ASSUMES: The function 'SetUpTimePPC()' has been called
| prior to calling this function to identify the
| CPU type that is running.
|
| HISTORY: 06.17.98
| 06.29.98 Added unit conversion for 601 chip.
------------------------------------------------------------*/
void
GetTimePPC( u64* t )
{
    u32* lo;
    u32* hi;
    u32  H, L;

    // If this is a 601 chip.
    if( CPUType == 601 )
    {
        // Read the real time clock register.
        GetRealTimePPC( t );

        // Convert seconds:nanoseconds to units of 128 nanoseconds each...

        // Refer to the upper register field, RTCU.
        hi = (u32*) t;

        // Refer to the lower register field, RTCL.
        lo = (u32*) ( ((u8*) t) + 4 );

        // Get the value of the lower register.
        L = *lo;

        // Shift the lo part to the left two bits, then right 9 bits
        // to clear the high bits and right justify the significant bits.
        //
        // This converts nanosecond units to 128-nS units.
        L = ( L << 2 ) >> 9;

        // Get the value of the upper registers.
        H = *hi;

        // Shift high value to the right nine bits to convert seconds
        // to 128-nS units.
        *hi = H >> 9;

        // Shift high value left 23 bits to left justify the section of the
        // upper 32 bits that shifts into the lower 32-bits when converting
        // seconds to 128-nS units.
        H = H << 23;

        // Merge the bits shifted down from the upper register with the
        // justified bits of the lower register.
        *lo = H | L;
    }
    else // This is a non-601 chip.
    {
        // Read the time base register.
        GetTimeBasePPC( t );
    }
}

/*------------------------------------------------------------
| NAME: SetUpTimePPC
|-------------------------------------------------------------
|
| PURPOSE: To prepare for high-resolution time measurement on
| the PowerPC chip.
|
| DESCRIPTION: Call this routine prior to taking a time
| measurement. The CPU type and timing overhead are computed
| for use by the timing functions.
|
| EXAMPLE:
|
| NOTE:
|
| ASSUMES: The timing overhead factor computed by this
| function is based on having the timing functions
| resident in the on-chip cache at the time they are
| called. It's your responsibility to pre-fetch
| the timing functions to make this true.
|
| HISTORY: 06.17.98
------------------------------------------------------------*/
void
SetUpTimePPC()
{
OSErr err;
s32 result;
u64 Timer;

    // If CPUType isn't identified.
    if( CPUType == 0 )
    {
        // Find what kind of CPU is running. The native CPU type
        // selector is the one that reports PowerPC chip types.
        err = Gestalt( gestaltNativeCPUtype, &result );

        // If the call succeeded and this is a PowerPC 601 chip.
        if( err == noErr && result == gestaltCPU601 )
        {
            CPUType = 601;
        }
        else // Treat all other PowerPC chips as if they
             // have a time base register like the 603.
        {
            CPUType = 603;
        }
    }

    // Assume that there is no overhead for timing calls.
    GetTimePPCOverhead = 0;

    // Call these two routines here just to get them into
    // the on-chip cache.
    GetTimePPC( &Timer );
    ElapsedTimePPC( &Timer );

    // Now make the real measurement of how long it takes
    // to do nothing.
    GetTimePPC( &Timer );
    ElapsedTimePPC( &Timer );

    // The resulting time is the timing overhead, a correction
    // factor to be applied to future timings.
    GetTimePPCOverhead = Timer;
}

END FILE ‘TimePPC.c’ --------------------------------------
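One detail of 'GetTimePPC' in the listing above is worth spelling out. On the 601 path it packs seconds and nanoseconds together with shifts, which weights each second as 2^23 = 8,388,608 units, whereas a second holds 7,812,500 units of 128 nanoseconds. The difference between two packed readings is therefore exact when both fall within the same second, and too large by 576,108 units for each seconds boundary crossed. A sketch of an exact conversion follows, alongside the shift-based packing for comparison. The function names are hypothetical, and the sketch uses C99 `<stdint.h>` types rather than the typedefs in the listing.

```c
#include <assert.h>
#include <stdint.h>

#define RTC_NS_PER_TICK   128u       /* RTCL resolution in nanoseconds */
#define RTC_TICKS_PER_SEC 7812500u   /* 1,000,000,000 / 128 */

/* Exact conversion of a seconds:nanoseconds reading to a count of
   128 ns ticks. */
uint64_t rtc_to_ticks(uint32_t seconds, uint32_t nanoseconds)
{
    assert(nanoseconds < 1000000000u);
    return (uint64_t)seconds * RTC_TICKS_PER_SEC
         + nanoseconds / RTC_NS_PER_TICK;
}

/* Shift-based packing, equivalent to the arithmetic in the listing:
   the seconds are placed in the bits above bit 22. */
uint64_t rtc_to_packed(uint32_t seconds, uint32_t nanoseconds)
{
    assert(nanoseconds < 1000000000u);
    return ((uint64_t)seconds << 23) | (nanoseconds >> 7);
}
```

Taking one reading at the last tick of a second (999,999,872 ns) and the next at the start of the following second, the exact conversion reports a difference of 1 tick while the packed form reports 576,109. Since the benchmark intervals are short and the median of nine runs is used, an occasional boundary crossing would not be expected to change the published ratings.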

BEGIN FILE ‘AccessorTest.c’ -------------------------------

// PURPOSE: To measure the cost of using accessors on PowerPC chips.

#include <stdio.h>

typedef unsigned char u8;
typedef unsigned short u16;
typedef unsigned long u32;
typedef unsigned long long u64;

typedef signed char s8;
typedef short s16;
typedef long s32;
typedef long long s64;

#include "TimePPC.h"

// Define one or the other of the following symbols and then compile
// to make a benchmark for that method:

//#define DIRECT_METHOD
#define ACCESSOR_METHOD

// These are the variables, separated by padding.
int a0, pada0[32], b0, padb0[32], c0, padc0[32], d0, padd0[32];
int e0, pade0[32], f0, padf0[32], g0, padg0[32], h0, padh0[32];
int i0, padi0[32], j0, padj0[32], k0, padk0[32], l0, padl0[32];
int m0, padm0[32], n0, padn0[32], o0, pado0[32], p0, padp0[32];
int q0, padq0[32], r0, padr0[32], s0, pads0[32], t0, padt0[32];
int u0, padu0[32], v0, padv0[32], w0, padw0[32], x0, padx0[32];
int y0, pady0[32], z0, padz0[32];

int a1, pada1[32], b1, padb1[32], c1, padc1[32], d1, padd1[32];
int e1, pade1[32], f1, padf1[32], g1, padg1[32], h1, padh1[32];
int i1, padi1[32], j1, padj1[32], k1, padk1[32], l1, padl1[32];
int m1, padm1[32], n1, padn1[32], o1, pado1[32], p1, padp1[32];
int q1, padq1[32], r1, padr1[32], s1, pads1[32], t1, padt1[32];
int u1, padu1[32], v1, padv1[32], w1, padw1[32], x1, padx1[32];
int y1, pady1[32], z1, padz1[32];

int a2, pada2[32], b2, padb2[32], c2, padc2[32], d2, padd2[32];
int e2, pade2[32], f2, padf2[32], g2, padg2[32], h2, padh2[32];
int i2, padi2[32], j2, padj2[32], k2, padk2[32], l2, padl2[32];
int m2, padm2[32], n2, padn2[32], o2, pado2[32], p2, padp2[32];
int q2, padq2[32], r2, padr2[32], s2, pads2[32], t2, padt2[32];
int u2, padu2[32], v2, padv2[32], w2, padw2[32], x2, padx2[32];
int y2, pady2[32], z2, padz2[32];

int a3, pada3[32], b3, padb3[32], c3, padc3[32], d3, padd3[32];
int e3, pade3[32], f3, padf3[32], g3, padg3[32], h3, padh3[32];
int i3, padi3[32], j3, padj3[32], k3, padk3[32], l3, padl3[32];
int m3, padm3[32], n3, padn3[32], o3, pado3[32], p3, padp3[32];
int q3, padq3[32], r3, padr3[32], s3, pads3[32], t3, padt3[32];
int u3, padu3[32], v3, padv3[32], w3, padw3[32], x3, padx3[32];
int y3, pady3[32], z3, padz3[32];

int a4, pada4[32], b4, padb4[32], c4, padc4[32], d4, padd4[32];
int e4, pade4[32], f4, padf4[32], g4, padg4[32], h4, padh4[32];
int i4, padi4[32], j4, padj4[32], k4, padk4[32], l4, padl4[32];
int m4, padm4[32], n4, padn4[32], o4, pado4[32], p4, padp4[32];
int q4, padq4[32], r4, padr4[32], s4, pads4[32], t4, padt4[32];
int u4, padu4[32], v4, padv4[32], w4, padw4[32], x4, padx4[32];
int y4, pady4[32], z4, padz4[32];

int a5, pada5[32], b5, padb5[32], c5, padc5[32], d5, padd5[32];
int e5, pade5[32], f5, padf5[32], g5, padg5[32], h5, padh5[32];
int i5, padi5[32], j5, padj5[32], k5, padk5[32], l5, padl5[32];
int m5, padm5[32], n5, padn5[32], o5, pado5[32], p5, padp5[32];
int q5, padq5[32], r5, padr5[32], s5, pads5[32], t5, padt5[32];
int u5, padu5[32], v5, padv5[32], w5, padw5[32], x5, padx5[32];
int y5, pady5[32], z5, padz5[32];

int a6, pada6[32], b6, padb6[32], c6, padc6[32], d6, padd6[32];
int e6, pade6[32], f6, padf6[32], g6, padg6[32], h6, padh6[32];
int i6, padi6[32], j6, padj6[32], k6, padk6[32], l6, padl6[32];
int m6, padm6[32], n6, padn6[32], o6, pado6[32], p6, padp6[32];
int q6, padq6[32], r6, padr6[32], s6, pads6[32], t6, padt6[32];
int u6, padu6[32], v6, padv6[32], w6, padw6[32], x6, padx6[32];
int y6, pady6[32], z6, padz6[32];

int a7, pada7[32], b7, padb7[32], c7, padc7[32], d7, padd7[32];
int e7, pade7[32], f7, padf7[32], g7, padg7[32], h7, padh7[32];
int i7, padi7[32], j7, padj7[32], k7, padk7[32], l7, padl7[32];
int m7, padm7[32], n7, padn7[32], o7, pado7[32], p7, padp7[32];
int q7, padq7[32], r7, padr7[32], s7, pads7[32], t7, padt7[32];
int u7, padu7[32], v7, padv7[32], w7, padw7[32], x7, padx7[32];
int y7, pady7[32], z7, padz7[32];

int a8, pada8[32], b8, padb8[32], c8, padc8[32], d8, padd8[32];
int e8, pade8[32], f8, padf8[32], g8, padg8[32], h8, padh8[32];
int i8, padi8[32], j8, padj8[32], k8, padk8[32], l8, padl8[32];
int m8, padm8[32], n8, padn8[32], o8, pado8[32], p8, padp8[32];
int q8, padq8[32], r8, padr8[32], s88, pads8[32], t8, padt8[32];
int u88, padu8[32], v8, padv8[32], w8, padw8[32], x8, padx8[32];
int y8, pady8[32], z8, padz8[32];

int a9, pada9[32], b9, padb9[32], c9, padc9[32], d9, padd9[32];
int e9, pade9[32], f9, padf9[32], g9, padg9[32], h9, padh9[32];
int i9, padi9[32], j9, padj9[32], k9, padk9[32], l9, padl9[32];
int m9, padm9[32], n9, padn9[32], o9, pado9[32], p9, padp9[32];
int q9, padq9[32], r9, padr9[32], s9, pads9[32], t9, padt9[32];
int u9, padu9[32], v9, padv9[32], w9, padw9[32], x9, padx9[32];
int y9, pady9[32], z9, padz9[32];

#ifdef ACCESSOR_METHOD

int geta0(), getb0(), getc0(), getd0(), gete0(), getf0(), getg0();
int geth0(), geti0(), getj0(), getk0(), getl0(), getm0(), getn0();
int geto0(), getp0(), getq0(), getr0(), gets0(), gett0(), getu0();
int getv0(), getw0(), getx0(), gety0(), getz0();

int geta1(), getb1(), getc1(), getd1(), gete1(), getf1(), getg1();
int geth1(), geti1(), getj1(), getk1(), getl1(), getm1(), getn1();
int geto1(), getp1(), getq1(), getr1(), gets1(), gett1(), getu1();
int getv1(), getw1(), getx1(), gety1(), getz1();

int geta2(), getb2(), getc2(), getd2(), gete2(), getf2(), getg2();
int geth2(), geti2(), getj2(), getk2(), getl2(), getm2(), getn2();
int geto2(), getp2(), getq2(), getr2(), gets2(), gett2(), getu2();
int getv2(), getw2(), getx2(), gety2(), getz2();

int geta3(), getb3(), getc3(), getd3(), gete3(), getf3(), getg3();
int geth3(), geti3(), getj3(), getk3(), getl3(), getm3(), getn3();
int geto3(), getp3(), getq3(), getr3(), gets3(), gett3(), getu3();
int getv3(), getw3(), getx3(), gety3(), getz3();

int geta4(), getb4(), getc4(), getd4(), gete4(), getf4(), getg4();
int geth4(), geti4(), getj4(), getk4(), getl4(), getm4(), getn4();
int geto4(), getp4(), getq4(), getr4(), gets4(), gett4(), getu4();
int getv4(), getw4(), getx4(), gety4(), getz4();

int geta5(), getb5(), getc5(), getd5(), gete5(), getf5(), getg5();
int geth5(), geti5(), getj5(), getk5(), getl5(), getm5(), getn5();
int geto5(), getp5(), getq5(), getr5(), gets5(), gett5(), getu5();
int getv5(), getw5(), getx5(), gety5(), getz5();

int geta6(), getb6(), getc6(), getd6(), gete6(), getf6(), getg6();
int geth6(), geti6(), getj6(), getk6(), getl6(), getm6(), getn6();
int geto6(), getp6(), getq6(), getr6(), gets6(), gett6(), getu6();
int getv6(), getw6(), getx6(), gety6(), getz6();

int geta7(), getb7(), getc7(), getd7(), gete7(), getf7(), getg7();
int geth7(), geti7(), getj7(), getk7(), getl7(), getm7(), getn7();
int geto7(), getp7(), getq7(), getr7(), gets7(), gett7(), getu7();
int getv7(), getw7(), getx7(), gety7(), getz7();

int geta8(), getb8(), getc8(), getd8(), gete8(), getf8(), getg8();
int geth8(), geti8(), getj8(), getk8(), getl8(), getm8(), getn8();
int geto8(), getp8(), getq8(), getr8(), gets88(), gett8(), getu88();
int getv8(), getw8(), getx8(), gety8(), getz8();

int geta9(), getb9(), getc9(), getd9(), gete9(), getf9(), getg9();
int geth9(), geti9(), getj9(), getk9(), getl9(), getm9(), getn9();
int geto9(), getp9(), getq9(), getr9(), gets9(), gett9(), getu9();
int getv9(), getw9(), getx9(), gety9(), getz9();

void setz9( int );

int geta0() { return( a0 ); }
int getb0() { return( b0 ); }
int getc0() { return( c0 ); }
int getd0() { return( d0 ); }
int gete0() { return( e0 ); }
int getf0() { return( f0 ); }
int getg0() { return( g0 ); }
int geth0() { return( h0 ); }
int geti0() { return( i0 ); }
int getj0() { return( j0 ); }
int getk0() { return( k0 ); }
int getl0() { return( l0 ); }

int getm0() { return( m0 ); }
int getn0() { return( n0 ); }
int geto0() { return( o0 ); }
int getp0() { return( p0 ); }
int getq0() { return( q0 ); }
int getr0() { return( r0 ); }
int gets0() { return( s0 ); }
int gett0() { return( t0 ); }
int getu0() { return( u0 ); }
int getv0() { return( v0 ); }
int getw0() { return( w0 ); }
int getx0() { return( x0 ); }
int gety0() { return( y0 ); }
int getz0() { return( z0 ); }
int geta1() { return( a1 ); }
int getb1() { return( b1 ); }
int getc1() { return( c1 ); }
int getd1() { return( d1 ); }
int gete1() { return( e1 ); }
int getf1() { return( f1 ); }
int getg1() { return( g1 ); }
int geth1() { return( h1 ); }
int geti1() { return( i1 ); }
int getj1() { return( j1 ); }
int getk1() { return( k1 ); }
int getl1() { return( l1 ); }
int getm1() { return( m1 ); }
int getn1() { return( n1 ); }
int geto1() { return( o1 ); }
int getp1() { return( p1 ); }
int getq1() { return( q1 ); }
int getr1() { return( r1 ); }
int gets1() { return( s1 ); }
int gett1() { return( t1 ); }
int getu1() { return( u1 ); }
int getv1() { return( v1 ); }
int getw1() { return( w1 ); }
int getx1() { return( x1 ); }
int gety1() { return( y1 ); }
int getz1() { return( z1 ); }
int geta2() { return( a2 ); }
int getb2() { return( b2 ); }
int getc2() { return( c2 ); }
int getd2() { return( d2 ); }
int gete2() { return( e2 ); }
int getf2() { return( f2 ); }
int getg2() { return( g2 ); }
int geth2() { return( h2 ); }
int geti2() { return( i2 ); }
int getj2() { return( j2 ); }
int getk2() { return( k2 ); }
int getl2() { return( l2 ); }

int getm2() { return( m2 ); }
int getn2() { return( n2 ); }
int geto2() { return( o2 ); }
int getp2() { return( p2 ); }
int getq2() { return( q2 ); }
int getr2() { return( r2 ); }
int gets2() { return( s2 ); }
int gett2() { return( t2 ); }
int getu2() { return( u2 ); }
int getv2() { return( v2 ); }
int getw2() { return( w2 ); }
int getx2() { return( x2 ); }
int gety2() { return( y2 ); }
int getz2() { return( z2 ); }
int geta3() { return( a3 ); }
int getb3() { return( b3 ); }
int getc3() { return( c3 ); }
int getd3() { return( d3 ); }
int gete3() { return( e3 ); }
int getf3() { return( f3 ); }
int getg3() { return( g3 ); }
int geth3() { return( h3 ); }
int geti3() { return( i3 ); }
int getj3() { return( j3 ); }
int getk3() { return( k3 ); }
int getl3() { return( l3 ); }
int getm3() { return( m3 ); }
int getn3() { return( n3 ); }
int geto3() { return( o3 ); }
int getp3() { return( p3 ); }
int getq3() { return( q3 ); }
int getr3() { return( r3 ); }
int gets3() { return( s3 ); }
int gett3() { return( t3 ); }
int getu3() { return( u3 ); }
int getv3() { return( v3 ); }
int getw3() { return( w3 ); }
int getx3() { return( x3 ); }
int gety3() { return( y3 ); }
int getz3() { return( z3 ); }
int geta4() { return( a4 ); }
int getb4() { return( b4 ); }
int getc4() { return( c4 ); }
int getd4() { return( d4 ); }
int gete4() { return( e4 ); }
int getf4() { return( f4 ); }
int getg4() { return( g4 ); }
int geth4() { return( h4 ); }
int geti4() { return( i4 ); }
int getj4() { return( j4 ); }
int getk4() { return( k4 ); }
int getl4() { return( l4 ); }

int getm4() { return( m4 ); }
int getn4() { return( n4 ); }
int geto4() { return( o4 ); }
int getp4() { return( p4 ); }
int getq4() { return( q4 ); }
int getr4() { return( r4 ); }
int gets4() { return( s4 ); }
int gett4() { return( t4 ); }
int getu4() { return( u4 ); }
int getv4() { return( v4 ); }
int getw4() { return( w4 ); }
int getx4() { return( x4 ); }
int gety4() { return( y4 ); }
int getz4() { return( z4 ); }
int geta5() { return( a5 ); }
int getb5() { return( b5 ); }
int getc5() { return( c5 ); }
int getd5() { return( d5 ); }
int gete5() { return( e5 ); }
int getf5() { return( f5 ); }
int getg5() { return( g5 ); }
int geth5() { return( h5 ); }
int geti5() { return( i5 ); }
int getj5() { return( j5 ); }
int getk5() { return( k5 ); }
int getl5() { return( l5 ); }
int getm5() { return( m5 ); }
int getn5() { return( n5 ); }
int geto5() { return( o5 ); }
int getp5() { return( p5 ); }
int getq5() { return( q5 ); }
int getr5() { return( r5 ); }
int gets5() { return( s5 ); }
int gett5() { return( t5 ); }
int getu5() { return( u5 ); }
int getv5() { return( v5 ); }
int getw5() { return( w5 ); }
int getx5() { return( x5 ); }
int gety5() { return( y5 ); }
int getz5() { return( z5 ); }
int geta6() { return( a6 ); }
int getb6() { return( b6 ); }
int getc6() { return( c6 ); }
int getd6() { return( d6 ); }
int gete6() { return( e6 ); }
int getf6() { return( f6 ); }
int getg6() { return( g6 ); }
int geth6() { return( h6 ); }
int geti6() { return( i6 ); }
int getj6() { return( j6 ); }
int getk6() { return( k6 ); }
int getl6() { return( l6 ); }
int getm6() { return( m6 ); }
int getn6() { return( n6 ); }
int geto6() { return( o6 ); }
int getp6() { return( p6 ); }
int getq6() { return( q6 ); }
int getr6() { return( r6 ); }
int gets6() { return( s6 ); }
int gett6() { return( t6 ); }
int getu6() { return( u6 ); }
int getv6() { return( v6 ); }
int getw6() { return( w6 ); }
int getx6() { return( x6 ); }
int gety6() { return( y6 ); }
int getz6() { return( z6 ); }
int geta7() { return( a7 ); }
int getb7() { return( b7 ); }
int getc7() { return( c7 ); }
int getd7() { return( d7 ); }
int gete7() { return( e7 ); }
int getf7() { return( f7 ); }
int getg7() { return( g7 ); }
int geth7() { return( h7 ); }
int geti7() { return( i7 ); }
int getj7() { return( j7 ); }
int getk7() { return( k7 ); }
int getl7() { return( l7 ); }
int getm7() { return( m7 ); }
int getn7() { return( n7 ); }
int geto7() { return( o7 ); }
int getp7() { return( p7 ); }
int getq7() { return( q7 ); }
int getr7() { return( r7 ); }
int gets7() { return( s7 ); }
int gett7() { return( t7 ); }
int getu7() { return( u7 ); }
int getv7() { return( v7 ); }
int getw7() { return( w7 ); }
int getx7() { return( x7 ); }
int gety7() { return( y7 ); }
int getz7() { return( z7 ); }
int geta8() { return( a8 ); }
int getb8() { return( b8 ); }
int getc8() { return( c8 ); }
int getd8() { return( d8 ); }
int gete8() { return( e8 ); }
int getf8() { return( f8 ); }
int getg8() { return( g8 ); }
int geth8() { return( h8 ); }
int geti8() { return( i8 ); }
int getj8() { return( j8 ); }
int getk8() { return( k8 ); }
int getl8() { return( l8 ); }
int getm8() { return( m8 ); }
int getn8() { return( n8 ); }
int geto8() { return( o8 ); }
int getp8() { return( p8 ); }
int getq8() { return( q8 ); }
int getr8() { return( r8 ); }
int gets88() { return( s88 ); }
int gett8() { return( t8 ); }
int getu88() { return( u88 ); }
int getv8() { return( v8 ); }
int getw8() { return( w8 ); }
int getx8() { return( x8 ); }
int gety8() { return( y8 ); }
int getz8() { return( z8 ); }
int geta9() { return( a9 ); }
int getb9() { return( b9 ); }
int getc9() { return( c9 ); }
int getd9() { return( d9 ); }
int gete9() { return( e9 ); }
int getf9() { return( f9 ); }
int getg9() { return( g9 ); }
int geth9() { return( h9 ); }
int geti9() { return( i9 ); }
int getj9() { return( j9 ); }
int getk9() { return( k9 ); }
int getl9() { return( l9 ); }
int getm9() { return( m9 ); }
int getn9() { return( n9 ); }
int geto9() { return( o9 ); }
int getp9() { return( p9 ); }
int getq9() { return( q9 ); }
int getr9() { return( r9 ); }
int gets9() { return( s9 ); }
int gett9() { return( t9 ); }
int getu9() { return( u9 ); }
int getv9() { return( v9 ); }
int getw9() { return( w9 ); }
int getx9() { return( x9 ); }
int gety9() { return( y9 ); }
int getz9() { return( z9 ); }

void setz9( int value ) { z9 = value; }

#endif

void main(void)
{
u64 TimeCount;

/* Measure timer overhead and pre-load into cpu cache. */


SetUpTimePPC();

#ifdef DIRECT_METHOD
GetTimePPC( &TimeCount );

z9 += a0 + b0 + c0 + d0 + e0 + f0 + g0 + h0 + i0 + j0 + k0 + l0 + m0 +
n0 + o0 + p0 + q0 + r0 + s0 + t0 + u0 + v0 + w0 + x0 + y0 + z0;

z9 += a1 + b1 + c1 + d1 + e1 + f1 + g1 + h1 + i1 + j1 + k1 + l1 + m1 +
n1 + o1 + p1 + q1 + r1 + s1 + t1 + u1 + v1 + w1 + x1 + y1 + z1;

z9 += a2 + b2 + c2 + d2 + e2 + f2 + g2 + h2 + i2 + j2 + k2 + l2 + m2 +
n2 + o2 + p2 + q2 + r2 + s2 + t2 + u2 + v2 + w2 + x2 + y2 + z2;

z9 += a3 + b3 + c3 + d3 + e3 + f3 + g3 + h3 + i3 + j3 + k3 + l3 + m3 +
n3 + o3 + p3 + q3 + r3 + s3 + t3 + u3 + v3 + w3 + x3 + y3 + z3;

z9 += a4 + b4 + c4 + d4 + e4 + f4 + g4 + h4 + i4 + j4 + k4 + l4 + m4 +
n4 + o4 + p4 + q4 + r4 + s4 + t4 + u4 + v4 + w4 + x4 + y4 + z4;

z9 += a5 + b5 + c5 + d5 + e5 + f5 + g5 + h5 + i5 + j5 + k5 + l5 + m5 +
n5 + o5 + p5 + q5 + r5 + s5 + t5 + u5 + v5 + w5 + x5 + y5 + z5;

z9 += a6 + b6 + c6 + d6 + e6 + f6 + g6 + h6 + i6 + j6 + k6 + l6 + m6 +
n6 + o6 + p6 + q6 + r6 + s6 + t6 + u6 + v6 + w6 + x6 + y6 + z6;

z9 += a7 + b7 + c7 + d7 + e7 + f7 + g7 + h7 + i7 + j7 + k7 + l7 + m7 +
n7 + o7 + p7 + q7 + r7 + s7 + t7 + u7 + v7 + w7 + x7 + y7 + z7;

z9 += a8 + b8 + c8 + d8 + e8 + f8 + g8 + h8 + i8 + j8 + k8 + l8 + m8 +
n8 + o8 + p8 + q8 + r8 + s88 + t8 + u88 + v8 + w8 + x8 + y8 + z8;

z9 += a9 + b9 + c9 + d9 + e9 + f9 + g9 + h9 + i9 + j9 + k9 + l9 + m9 +
n9 + o9 + p9 + q9 + r9 + s9 + t9 + u9 + v9 + w9 + x9 + y9 + z9;

ElapsedTimePPC( &TimeCount );

printf( "Direct: %lu\n", (unsigned long) TimeCount );


#endif

#ifdef ACCESSOR_METHOD
GetTimePPC( &TimeCount );

setz9( getz9() +
geta0() + getb0() + getc0() + getd0() + gete0() + getf0() +
getg0() + geth0() + geti0() + getj0() + getk0() + getl0() +
getm0() + getn0() + geto0() + getp0() + getq0() + getr0() +
gets0() + gett0() + getu0() + getv0() + getw0() + getx0() +
gety0() + getz0() );

setz9( getz9() +
geta1() + getb1() + getc1() + getd1() + gete1() + getf1() +
getg1() + geth1() + geti1() + getj1() + getk1() + getl1() +
getm1() + getn1() + geto1() + getp1() + getq1() + getr1() +
gets1() + gett1() + getu1() + getv1() + getw1() + getx1() +
gety1() + getz1() );

setz9( getz9() +
geta2() + getb2() + getc2() + getd2() + gete2() + getf2() +
getg2() + geth2() + geti2() + getj2() + getk2() + getl2() +
getm2() + getn2() + geto2() + getp2() + getq2() + getr2() +
gets2() + gett2() + getu2() + getv2() + getw2() + getx2() +
gety2() + getz2() );

setz9( getz9() +
geta3() + getb3() + getc3() + getd3() + gete3() + getf3() +
getg3() + geth3() + geti3() + getj3() + getk3() + getl3() +
getm3() + getn3() + geto3() + getp3() + getq3() + getr3() +
gets3() + gett3() + getu3() + getv3() + getw3() + getx3() +
gety3() + getz3() );

setz9( getz9() +
geta4() + getb4() + getc4() + getd4() + gete4() + getf4() +
getg4() + geth4() + geti4() + getj4() + getk4() + getl4() +
getm4() + getn4() + geto4() + getp4() + getq4() + getr4() +
gets4() + gett4() + getu4() + getv4() + getw4() + getx4() +
gety4() + getz4() );

setz9( getz9() +
geta5() + getb5() + getc5() + getd5() + gete5() + getf5() +
getg5() + geth5() + geti5() + getj5() + getk5() + getl5() +
getm5() + getn5() + geto5() + getp5() + getq5() + getr5() +
gets5() + gett5() + getu5() + getv5() + getw5() + getx5() +
gety5() + getz5() );

setz9( getz9() +
geta6() + getb6() + getc6() + getd6() + gete6() + getf6() +
getg6() + geth6() + geti6() + getj6() + getk6() + getl6() +
getm6() + getn6() + geto6() + getp6() + getq6() + getr6() +
gets6() + gett6() + getu6() + getv6() + getw6() + getx6() +
gety6() + getz6() );

setz9( getz9() +
geta7() + getb7() + getc7() + getd7() + gete7() + getf7() +
getg7() + geth7() + geti7() + getj7() + getk7() + getl7() +
getm7() + getn7() + geto7() + getp7() + getq7() + getr7() +
gets7() + gett7() + getu7() + getv7() + getw7() + getx7() +
gety7() + getz7() );

setz9( getz9() +
geta8() + getb8() + getc8() + getd8() + gete8() + getf8() +
getg8() + geth8() + geti8() + getj8() + getk8() + getl8() +
getm8() + getn8() + geto8() + getp8() + getq8() + getr8() +
gets88() + gett8() + getu88() + getv8() + getw8() + getx8() +
gety8() + getz8() );

setz9( getz9() +
geta9() + getb9() + getc9() + getd9() + gete9() + getf9() +
getg9() + geth9() + geti9() + getj9() + getk9() + getl9() +
getm9() + getn9() + geto9() + getp9() + getq9() + getr9() +
gets9() + gett9() + getu9() + getv9() + getw9() + getx9() +
gety9() + getz9() );

ElapsedTimePPC( &TimeCount );

printf( "Accessor: %lu\n", (unsigned long) TimeCount );

#endif
}

END FILE ‘AccessorTest.c’ ---------------------------------

END MAC SOURCE CODE =======================================
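
Each of the two programs prints a single tick count. As a minimal sketch, and not part of the original benchmark, the efficiency ratio defined earlier (time to fetch directly divided by time to fetch via accessor) could be computed from those two counts as shown here. The function name AccessorEfficiency and the overheadTicks parameter are assumptions made for illustration; pass 0 for overheadTicks if the timer has already removed its own overhead.

```c
/* Hypothetical helper, not part of the original listing.
   Returns the direct time divided by the accessor time after removing
   the same timer overhead from both readings.  Returns -1.0 when the
   readings cannot yield a meaningful ratio. */
double AccessorEfficiency( unsigned long directTicks,
                           unsigned long accessorTicks,
                           unsigned long overheadTicks )
{
    /* The overhead can never exceed a valid reading, and the
       accessor time must stay positive to serve as the divisor. */
    if ( directTicks < overheadTicks || accessorTicks <= overheadTicks )
        return( -1.0 );

    return( (double)( directTicks - overheadTicks ) /
            (double)( accessorTicks - overheadTicks ) );
}
```

For example, a direct count of 50 and an accessor count of 100 with no overhead gives 0.5, meaning the accessor version takes twice as long.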

BEGIN X86 SOURCE CODE =====================================

BEGIN FILE ‘timex86.asm’ -----------------------------------

;------------------------------------------------------------
; NAME: timex86.asm
;
; PURPOSE: To provide timing functions for Intel x86 chips.
;
; DESCRIPTION: Copied from the original file...
;
; "**** PCZTNEAR.ASM
; The C-near-callable version of the precision Zen timer
; (PZTIMER.ASM)
;
; Note: use NOSMART with TASM (at least version 2.0) to keep
; the assembler from turning far calls in the reference
; timing code into PUSH CS/near call sequences, thereby
; messing up the reference call times. This problem may
; arise with other optimizing assemblers as well.
;
; Uses the 8253 timer to time the performance of code that takes
; less than about 54 milliseconds to execute, with a resolution
; of better than 10 microseconds.
;
; By Michael Abrash 4/26/89
;
; Externally callable routines:
;
; ZTimerOn: Starts the Zen timer, with interrupts disabled.
;
; ZTimerOff: Stops the Zen timer, saves the timer count,
; times the overhead code, and restores interrupts to the
; state they were in when ZTimerOn was called.
;
; ZTimerReport: Prints the net time that passed between starting
; and stopping the timer.
;
; Note: If longer than about 54 ms passes between ZTimerOn and
; ZTimerOff calls, the timer turns over and the count is
; inaccurate. When this happens, an error message is displayed
; instead of a count. The long-period Zen timer should be used
; in such cases.
;
; Note: Interrupts *MUST* be left off between calls to ZTimerOn
; and ZTimerOff for accurate timing and for detection of
; timer overflow.
;
; Note: These routines can introduce slight inaccuracies into the
; system clock count for each code section timed even if
; timer 0 doesn't overflow. If timer 0 does overflow, the
; system clock can become slow by virtually any amount of
; time, since the system clock can't advance while the
; precision timer is timing. Consequently, it's a good idea
; to reboot at the end of each timing session. (The
; battery-backed clock, if any, is not affected by the Zen
; timer.)
;
; All registers, and all flags except the interrupt flag, are
; preserved by all routines. Interrupts are enabled and then disabled
; by ZTimerOn, and are restored by ZTimerOff to the state they were
; in when ZTimerOn was called."
;
; 07.08.98 Now necessary to measure timer overhead separately and
; perform calculation by hand. This approach taken to
; gauge the variation in the timing mechanism.
;
; Use like this:
;
; ZTIMERON(); /* Measure time overhead and print results. */
; ZTIMEROFF();
; ZTIMERREPORT();
;
; ZTIMERON(); /* Measure overhead again: timer code is now in
;                cpu cache. */
; ZTIMEROFF();
; ZTIMERREPORT();
;
; ZTIMERON(); /* Measure overhead again. */
; ZTIMEROFF();
; ZTIMERREPORT();
;
; ZTIMERON(); /* Now measure code of interest. */
; MyTimeConsumingFunction();
; ZTIMEROFF();
; ZTIMERREPORT();
;
; NOTE: See also a similar timer code for PowerPC in 'TimePPC.c'.
;
; HISTORY: 04.26.89 By Michael Abrash as file 'PCZTNEAR.ASM'.
; 07.08.98 Revised to support faster chips:
; timer overhead calculation removed,
; microsecond conversion removed: now returns
; timer ticks.
;------------------------------------------------------------

_TEXT segment word public 'CODE'


assume cs:_TEXT, ds:nothing
public _ZTimerOn, _ZTimerOff, _ZTimerReport

;
; Base address of the 8253 timer chip.
;
BASE_8253 equ 40h
;
; The address of the timer 0 count registers in the 8253.
;
TIMER_0_8253 equ BASE_8253 + 0
;
; The address of the mode register in the 8253.
;
MODE_8253 equ BASE_8253 + 3
;
; The address of Operation Command Word 3 in the 8259 Programmable
; Interrupt Controller (PIC) (write only, and writable only when
; bit 4 of the byte written to this address is 0 and bit 3 is 1).
;
OCW3 equ 20h
;
; The address of the Interrupt Request register in the 8259 PIC
; (read only, and readable only when bit 1 of OCW3 = 1 and bit 0
; of OCW3 = 0).
;
IRR equ 20h
;
; Macro to emulate a POPF instruction in order to fix the bug in some
; 80286 chips which allows interrupts to occur during a POPF even when
; interrupts remain disabled.
;
MPOPF macro
local p1, p2
jmp short p2
p1: iret ;jump to pushed address & pop flags
p2: push cs ;construct far return address to
call p1 ; the next instruction
endm

;
; Macro to delay briefly to ensure that enough time has elapsed
; between successive I/O accesses so that the device being accessed
; can respond to both accesses even on a very fast PC.
;
; 07.08.98 TL Changed from 3 jumps to 30 to be on the safe side.
DELAY macro
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
jmp $+2
endm

OriginalFlags db ? ;storage for upper byte of
; FLAGS register when
; ZTimerOn called
TimedCount dw ? ;timer 0 count when the timer
; is stopped
ReferenceCount dw ? ;number of counts required to
; execute timer overhead code
OverflowFlag db ? ;used to indicate whether the
; timer overflowed during the
; timing interval
;
; String printed to report results.
;
OutputStr label byte
db 'Timed count: ', 5 dup (?)
ASCIICountEnd label byte
db ' ticks ', 0dh, 0ah
db '$'
;
; String printed to report timer overflow.
;
OverflowStr label byte
db 0dh, 0ah
db '****************************************************'
db 0dh, 0ah
db '* The timer overflowed, so the interval timed was *'
db 0dh, 0ah
db '* too long for the precision timer to measure. *'
db 0dh, 0ah
db '* Please perform the timing test again with the *'
db 0dh, 0ah
db '* long-period timer. *'
db 0dh, 0ah
db '****************************************************'
db 0dh, 0ah
db '$'

;********************************************************************
;* Routine called to start timing. *
;********************************************************************

_ZTimerOn proc near

;
; Save the context of the program being timed.
;
push ax
pushf
pop ax ;get flags so we can keep
; interrupts off when leaving
; this routine
mov cs:[OriginalFlags],ah ;remember the state of the
; Interrupt flag
and ah,0fdh ;set pushed interrupt flag
; to 0
push ax
;
; Turn on interrupts, so the timer interrupt can occur if it's
; pending.
;
sti
;
; Set timer 0 of the 8253 to mode 2 (divide-by-N), to cause
; linear counting rather than count-by-two counting. Also
; leaves the 8253 waiting for the initial timer 0 count to
; be loaded.
;
mov al,00110100b ;mode 2
out MODE_8253,al
;
; Set the timer count to 0, so we know we won't get another
; timer interrupt right away.
; Note: this introduces an inaccuracy of up to 54 ms in the system
; clock count each time it is executed.
;
DELAY
sub al,al
out TIMER_0_8253,al ;lsb
DELAY
out TIMER_0_8253,al ;msb
;
; Wait before clearing interrupts to allow the interrupt generated
; when switching from mode 3 to mode 2 to be recognized. The delay
; must be at least 210 ns long to allow time for that interrupt to
; occur. Here, 10 jumps are used for the delay to ensure that the
; delay time will be more than long enough even on a very fast PC.
;
; rept 10 ; 07.08.98 TL Changed to 60 to allow for current high
;           speeds.
rept 60
jmp $+2
endm
;
; Disable interrupts to get an accurate count.
;
cli
;
; Set the timer count to 0 again to start the timing interval.
;
mov al,00110100b ;set up to load initial
out MODE_8253,al ; timer count
DELAY
sub al,al
out TIMER_0_8253,al ;load count lsb
DELAY
out TIMER_0_8253,al ;load count msb
;
; Restore the context and return.
;
MPOPF ;keeps interrupts off
pop ax
ret

_ZTimerOn endp

;********************************************************************
;* Routine called to stop timing and get count. *
;********************************************************************

_ZTimerOff proc near

;
; Save the context of the program being timed.
;
push ax
push cx
pushf
;
; Latch the count.
;
mov al,00000000b ;latch timer 0
out MODE_8253,al
;
; See if the timer has overflowed by checking the 8259 for a pending
; timer interrupt.
;
mov al,00001010b ;OCW3, set up to read
out OCW3,al ; Interrupt Request register
DELAY
in al,IRR ;read Interrupt Request
; register
and al,1 ;set AL to 1 if IRQ0 (the
; timer interrupt) is pending
mov cs:[OverflowFlag],al ;store the timer overflow
; status
;
; Allow interrupts to happen again.
;
sti
;
; Read out the count we latched earlier.
;
in al,TIMER_0_8253 ;least significant byte
DELAY
mov ah,al
in al,TIMER_0_8253 ;most significant byte
xchg ah,al
neg ax ;convert from countdown
; remaining to elapsed
; count
mov cs:[TimedCount],ax
; Time a zero-length code fragment, to get a reference for how
; much overhead this routine has. Time it 16 times and average it,
; for accuracy, rounding the result.
;
; 07.08.98 TL Revised to skip reference count calculation:
; Reference count now calculated in 'ZTimerSetUp'.
; mov cs:[ReferenceCount],0
; mov cx,16
; cli ;interrupts off to allow a
; ; precise reference count
;RefLoop:
; call ReferenceZTimerOn
; call ReferenceZTimerOff
; loop RefLoop
; sti
; add cs:[ReferenceCount],8 ;total + (0.5 * 16)
; mov cl,4
; shr cs:[ReferenceCount],cl ;(total) / 16 + 0.5
;
; Restore original interrupt state.
;
pop ax ;retrieve flags when called
mov ch,cs:[OriginalFlags] ;get back the original upper
; byte of the FLAGS register
and ch,not 0fdh ;only care about original
; interrupt flag...
and ah,0fdh ;...keep all other flags in
; their current condition
or ah,ch ;make flags word with original
; interrupt flag
push ax ;prepare flags to be popped
;
; Restore the context of the program being timed and return to it.
;
MPOPF ;restore the flags with the
; original interrupt state
pop cx
pop ax
ret

_ZTimerOff endp

;
; Called by ZTimerOff to start timer for overhead measurements.
;

ReferenceZTimerOn proc near


;
; Save the context of the program being timed.
;
push ax
pushf ;interrupts are already off
;
; Set timer 0 of the 8253 to mode 2 (divide-by-N), to cause
; linear counting rather than count-by-two counting.
;
mov al,00110100b ;set up to load
out MODE_8253,al ; initial timer count
DELAY
;
; Set the timer count to 0.
;
sub al,al
out TIMER_0_8253,al ;load count lsb
DELAY
out TIMER_0_8253,al ;load count msb
;
; Restore the context of the program being timed and return to it.
;
MPOPF
pop ax
ret

ReferenceZTimerOn endp

;
; Called by ZTimerOff to stop timer and add result to ReferenceCount
; for overhead measurements.
;

ReferenceZTimerOff proc near


;
; Save the context of the program being timed.
;
push ax
push cx
pushf
;
; Latch the count and read it.
;
mov al,00000000b ;latch timer 0
out MODE_8253,al
DELAY
in al,TIMER_0_8253 ;lsb
DELAY
mov ah,al
in al,TIMER_0_8253 ;msb
xchg ah,al
neg ax ;convert from countdown
; remaining to amount
; counted down
add cs:[ReferenceCount],ax
;
; Restore the context of the program being timed and return to it.
;
MPOPF
pop cx
pop ax
ret

ReferenceZTimerOff endp

;********************************************************************
;* Routine called to report timing results. *
;********************************************************************

_ZTimerReport proc near

pushf
push ax
push bx
push cx
push dx
push si
push ds
;
push cs ;DOS functions require that DS point
pop ds ; to text to be displayed on the screen
assume ds:_TEXT
;
; Check for timer 0 overflow.
;
cmp [OverflowFlag],0
jz PrintGoodCount
mov dx,offset OverflowStr
mov ah,9
int 21h
jmp short EndZTimerReport
;
; Convert the raw count (in timer ticks) to decimal ASCII.
;
PrintGoodCount:
mov ax,[TimedCount]

; 07.08.98 TL Don't subtract out the reference count yet.
; sub ax,[ReferenceCount]
mov si,offset ASCIICountEnd - 1

; 07.08.98 TL Don't convert to microseconds to preserve highest
; resolution ticks.
;
; Convert count to microseconds by multiplying by .8381.
;
; mov dx,8381
; mul dx
; mov bx,10000
; div bx ;* .8381 = * 8381 / 10000
;
; Convert count in ticks to 5 decimal ASCII digits.
;
mov bx,10
mov cx,5
CTSLoop:
sub dx,dx
div bx
add dl,'0'
mov [si],dl
dec si
loop CTSLoop
;
; Print the results.
;
mov ah,9
mov dx,offset OutputStr
int 21h
;
EndZTimerReport:
pop ds
pop si
pop dx
pop cx
pop bx
pop ax
MPOPF
ret

_ZTimerReport endp

_TEXT ends
end
END FILE ‘timex86.asm’ -----------------------------------
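
The comments in 'timex86.asm' describe three pieces of arithmetic that the revised timer leaves to be done by hand: averaging 16 overhead readings with rounding, subtracting that overhead from the timed count, and converting ticks to microseconds by multiplying by .8381. A minimal C sketch of that arithmetic follows. The function names are hypothetical and do not appear in the original files, and the clamp to zero in NetTicks is an assumption added for safety.

```c
/* Hypothetical helpers, not part of timex86.asm or acctest.c. */

#define REFERENCE_RUNS 16

/* Average 16 overhead readings, rounded to the nearest tick.
   Mirrors the commented-out assembly: (total + 8) / 16. */
unsigned int AverageOverhead( const unsigned int readings[REFERENCE_RUNS] )
{
    unsigned long total = 0;
    int i;

    for ( i = 0; i < REFERENCE_RUNS; i++ )
        total += readings[i];

    return( (unsigned int)( ( total + REFERENCE_RUNS / 2 ) / REFERENCE_RUNS ) );
}

/* Subtract the timer overhead from a timed count.
   Assumption: clamp to zero rather than wrap if overhead is larger. */
unsigned int NetTicks( unsigned int timedTicks, unsigned int overheadTicks )
{
    return( timedTicks > overheadTicks ? timedTicks - overheadTicks : 0 );
}

/* Convert 8253 timer ticks to whole microseconds:
   ticks * .8381 computed as ticks * 8381 / 10000. */
unsigned long TicksToMicroseconds( unsigned int ticks )
{
    return( (unsigned long) ticks * 8381UL / 10000UL );
}
```

With these helpers, a full 16-bit count of 65535 ticks converts to 54924 microseconds, which agrees with the limit of about 54 milliseconds stated in the timer's header comment.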

BEGIN FILE ‘acctest.c’ -----------------------------------

/* PURPOSE: To measure the cost of using accessors. */

extern int ZTIMERON();


extern int ZTIMEROFF();
extern void ZTIMERREPORT();

/* Define one or the other of the following symbols and then compile
to make a benchmark for that method:
*/
#define DIRECT_METHOD
/*#define ACCESSOR_METHOD*/

/* These are the variables, separated by padding. */


int a0, pada0[32], b0, padb0[32], c0, padc0[32], d0, padd0[32];
int e0, pade0[32], f0, padf0[32], g0, padg0[32], h0, padh0[32];
int i0, padi0[32], j0, padj0[32], k0, padk0[32], l0, padl0[32];
int m0, padm0[32], n0, padn0[32], o0, pado0[32], p0, padp0[32];
int q0, padq0[32], r0, padr0[32], s0, pads0[32], t0, padt0[32];
int u0, padu0[32], v0, padv0[32], w0, padw0[32], x0, padx0[32];
int y0, pady0[32], z0, padz0[32];

int a1, pada1[32], b1, padb1[32], c1, padc1[32], d1, padd1[32];
int e1, pade1[32], f1, padf1[32], g1, padg1[32], h1, padh1[32];
int i1, padi1[32], j1, padj1[32], k1, padk1[32], l1, padl1[32];
int m1, padm1[32], n1, padn1[32], o1, pado1[32], p1, padp1[32];
int q1, padq1[32], r1, padr1[32], s1, pads1[32], t1, padt1[32];
int u1, padu1[32], v1, padv1[32], w1, padw1[32], x1, padx1[32];
int y1, pady1[32], z1, padz1[32];

int a2, pada2[32], b2, padb2[32], c2, padc2[32], d2, padd2[32];


int e2, pade2[32], f2, padf2[32], g2, padg2[32], h2, padh2[32];
int i2, padi2[32], j2, padj2[32], k2, padk2[32], l2, padl2[32];
int m2, padm2[32], n2, padn2[32], o2, pado2[32], p2, padp2[32];
int q2, padq2[32], r2, padr2[32], s2, pads2[32], t2, padt2[32];
int u2, padu2[32], v2, padv2[32], w2, padw2[32], x2, padx2[32];
int y2, pady2[32], z2, padz2[32];

int a3, pada3[32], b3, padb3[32], c3, padc3[32], d3, padd3[32];


int e3, pade3[32], f3, padf3[32], g3, padg3[32], h3, padh3[32];
int i3, padi3[32], j3, padj3[32], k3, padk3[32], l3, padl3[32];
int m3, padm3[32], n3, padn3[32], o3, pado3[32], p3, padp3[32];
int q3, padq3[32], r3, padr3[32], s3, pads3[32], t3, padt3[32];
int u3, padu3[32], v3, padv3[32], w3, padw3[32], x3, padx3[32];
int y3, pady3[32], z3, padz3[32];

int a4, pada4[32], b4, padb4[32], c4, padc4[32], d4, padd4[32];


int e4, pade4[32], f4, padf4[32], g4, padg4[32], h4, padh4[32];
int i4, padi4[32], j4, padj4[32], k4, padk4[32], l4, padl4[32];
int m4, padm4[32], n4, padn4[32], o4, pado4[32], p4, padp4[32];
int q4, padq4[32], r4, padr4[32], s4, pads4[32], t4, padt4[32];
int u4, padu4[32], v4, padv4[32], w4, padw4[32], x4, padx4[32];
int y4, pady4[32], z4, padz4[32];

int a5, pada5[32], b5, padb5[32], c5, padc5[32], d5, padd5[32];


int e5, pade5[32], f5, padf5[32], g5, padg5[32], h5, padh5[32];
int i5, padi5[32], j5, padj5[32], k5, padk5[32], l5, padl5[32];
int m5, padm5[32], n5, padn5[32], o5, pado5[32], p5, padp5[32];
int q5, padq5[32], r5, padr5[32], s5, pads5[32], t5, padt5[32];
int u5, padu5[32], v5, padv5[32], w5, padw5[32], x5, padx5[32];
int y5, pady5[32], z5, padz5[32];

int a6, pada6[32], b6, padb6[32], c6, padc6[32], d6, padd6[32];


int e6, pade6[32], f6, padf6[32], g6, padg6[32], h6, padh6[32];
int i6, padi6[32], j6, padj6[32], k6, padk6[32], l6, padl6[32];
int m6, padm6[32], n6, padn6[32], o6, pado6[32], p6, padp6[32];
int q6, padq6[32], r6, padr6[32], s6, pads6[32], t6, padt6[32];
int u6, padu6[32], v6, padv6[32], w6, padw6[32], x6, padx6[32];
int y6, pady6[32], z6, padz6[32];

int a7, pada7[32], b7, padb7[32], c7, padc7[32], d7, padd7[32];


int e7, pade7[32], f7, padf7[32], g7, padg7[32], h7, padh7[32];
int i7, padi7[32], j7, padj7[32], k7, padk7[32], l7, padl7[32];
int m7, padm7[32], n7, padn7[32], o7, pado7[32], p7, padp7[32];
int q7, padq7[32], r7, padr7[32], s7, pads7[32], t7, padt7[32];
int u7, padu7[32], v7, padv7[32], w7, padw7[32], x7, padx7[32];
int y7, pady7[32], z7, padz7[32];

int a8, pada8[32], b8, padb8[32], c8, padc8[32], d8, padd8[32];


int e8, pade8[32], f8, padf8[32], g8, padg8[32], h8, padh8[32];
int i8, padi8[32], j8, padj8[32], k8, padk8[32], l8, padl8[32];
int m8, padm8[32], n8, padn8[32], o8, pado8[32], p8, padp8[32];
int q8, padq8[32], r8, padr8[32], s8, pads8[32], t8, padt8[32];
int u8, padu8[32], v8, padv8[32], w8, padw8[32], x8, padx8[32];
int y8, pady8[32], z8, padz8[32];

int a9, pada9[32], b9, padb9[32], c9, padc9[32], d9, padd9[32];


int e9, pade9[32], f9, padf9[32], g9, padg9[32], h9, padh9[32];
int i9, padi9[32], j9, padj9[32], k9, padk9[32], l9, padl9[32];
int m9, padm9[32], n9, padn9[32], o9, pado9[32], p9, padp9[32];
int q9, padq9[32], r9, padr9[32], s9, pads9[32], t9, padt9[32];
int u9, padu9[32], v9, padv9[32], w9, padw9[32], x9, padx9[32];
int y9, pady9[32], z9, padz9[32];

#ifdef ACCESSOR_METHOD

int geta0(), getb0(), getc0(), getd0(), gete0(), getf0(), getg0();


int geth0(), geti0(), getj0(), getk0(), getl0(), getm0(), getn0();
int geto0(), getp0(), getq0(), getr0(), gets0(), gett0(), getu0();
int getv0(), getw0(), getx0(), gety0(), getz0();

int geta1(), getb1(), getc1(), getd1(), gete1(), getf1(), getg1();


int geth1(), geti1(), getj1(), getk1(), getl1(), getm1(), getn1();
int geto1(), getp1(), getq1(), getr1(), gets1(), gett1(), getu1();
int getv1(), getw1(), getx1(), gety1(), getz1();

int geta2(), getb2(), getc2(), getd2(), gete2(), getf2(), getg2();


int geth2(), geti2(), getj2(), getk2(), getl2(), getm2(), getn2();
int geto2(), getp2(), getq2(), getr2(), gets2(), gett2(), getu2();
int getv2(), getw2(), getx2(), gety2(), getz2();

int geta3(), getb3(), getc3(), getd3(), gete3(), getf3(), getg3();


int geth3(), geti3(), getj3(), getk3(), getl3(), getm3(), getn3();
int geto3(), getp3(), getq3(), getr3(), gets3(), gett3(), getu3();
int getv3(), getw3(), getx3(), gety3(), getz3();

int geta4(), getb4(), getc4(), getd4(), gete4(), getf4(), getg4();


int geth4(), geti4(), getj4(), getk4(), getl4(), getm4(), getn4();
int geto4(), getp4(), getq4(), getr4(), gets4(), gett4(), getu4();
int getv4(), getw4(), getx4(), gety4(), getz4();

int geta5(), getb5(), getc5(), getd5(), gete5(), getf5(), getg5();


int geth5(), geti5(), getj5(), getk5(), getl5(), getm5(), getn5();
int geto5(), getp5(), getq5(), getr5(), gets5(), gett5(), getu5();
int getv5(), getw5(), getx5(), gety5(), getz5();

int geta6(), getb6(), getc6(), getd6(), gete6(), getf6(), getg6();
int geth6(), geti6(), getj6(), getk6(), getl6(), getm6(), getn6();
int geto6(), getp6(), getq6(), getr6(), gets6(), gett6(), getu6();
int getv6(), getw6(), getx6(), gety6(), getz6();

int geta7(), getb7(), getc7(), getd7(), gete7(), getf7(), getg7();


int geth7(), geti7(), getj7(), getk7(), getl7(), getm7(), getn7();
int geto7(), getp7(), getq7(), getr7(), gets7(), gett7(), getu7();
int getv7(), getw7(), getx7(), gety7(), getz7();

int geta8(), getb8(), getc8(), getd8(), gete8(), getf8(), getg8();


int geth8(), geti8(), getj8(), getk8(), getl8(), getm8(), getn8();
int geto8(), getp8(), getq8(), getr8(), gets8(), gett8(), getu8();
int getv8(), getw8(), getx8(), gety8(), getz8();

int geta9(), getb9(), getc9(), getd9(), gete9(), getf9(), getg9();


int geth9(), geti9(), getj9(), getk9(), getl9(), getm9(), getn9();
int geto9(), getp9(), getq9(), getr9(), gets9(), gett9(), getu9();
int getv9(), getw9(), getx9(), gety9(), getz9();

void setz9( int );

int geta0() { return( a0 ); }


int getb0() { return( b0 ); }
int getc0() { return( c0 ); }
int getd0() { return( d0 ); }
int gete0() { return( e0 ); }
int getf0() { return( f0 ); }
int getg0() { return( g0 ); }
int geth0() { return( h0 ); }
int geti0() { return( i0 ); }
int getj0() { return( j0 ); }
int getk0() { return( k0 ); }
int getl0() { return( l0 ); }
int getm0() { return( m0 ); }
int getn0() { return( n0 ); }
int geto0() { return( o0 ); }
int getp0() { return( p0 ); }
int getq0() { return( q0 ); }
int getr0() { return( r0 ); }
int gets0() { return( s0 ); }
int gett0() { return( t0 ); }
int getu0() { return( u0 ); }
int getv0() { return( v0 ); }
int getw0() { return( w0 ); }
int getx0() { return( x0 ); }
int gety0() { return( y0 ); }
int getz0() { return( z0 ); }
int geta1() { return( a1 ); }
int getb1() { return( b1 ); }
int getc1() { return( c1 ); }
int getd1() { return( d1 ); }
int gete1() { return( e1 ); }
int getf1() { return( f1 ); }
int getg1() { return( g1 ); }
int geth1() { return( h1 ); }
int geti1() { return( i1 ); }
int getj1() { return( j1 ); }
int getk1() { return( k1 ); }
int getl1() { return( l1 ); }
int getm1() { return( m1 ); }
int getn1() { return( n1 ); }
int geto1() { return( o1 ); }
int getp1() { return( p1 ); }
int getq1() { return( q1 ); }
int getr1() { return( r1 ); }
int gets1() { return( s1 ); }
int gett1() { return( t1 ); }
int getu1() { return( u1 ); }
int getv1() { return( v1 ); }
int getw1() { return( w1 ); }
int getx1() { return( x1 ); }
int gety1() { return( y1 ); }
int getz1() { return( z1 ); }
int geta2() { return( a2 ); }
int getb2() { return( b2 ); }
int getc2() { return( c2 ); }
int getd2() { return( d2 ); }
int gete2() { return( e2 ); }
int getf2() { return( f2 ); }
int getg2() { return( g2 ); }
int geth2() { return( h2 ); }
int geti2() { return( i2 ); }
int getj2() { return( j2 ); }
int getk2() { return( k2 ); }
int getl2() { return( l2 ); }
int getm2() { return( m2 ); }
int getn2() { return( n2 ); }
int geto2() { return( o2 ); }
int getp2() { return( p2 ); }
int getq2() { return( q2 ); }
int getr2() { return( r2 ); }
int gets2() { return( s2 ); }
int gett2() { return( t2 ); }
int getu2() { return( u2 ); }
int getv2() { return( v2 ); }
int getw2() { return( w2 ); }
int getx2() { return( x2 ); }
int gety2() { return( y2 ); }
int getz2() { return( z2 ); }
int geta3() { return( a3 ); }
int getb3() { return( b3 ); }
int getc3() { return( c3 ); }
int getd3() { return( d3 ); }
int gete3() { return( e3 ); }
int getf3() { return( f3 ); }
int getg3() { return( g3 ); }
int geth3() { return( h3 ); }
int geti3() { return( i3 ); }
int getj3() { return( j3 ); }
int getk3() { return( k3 ); }
int getl3() { return( l3 ); }
int getm3() { return( m3 ); }
int getn3() { return( n3 ); }
int geto3() { return( o3 ); }
int getp3() { return( p3 ); }
int getq3() { return( q3 ); }
int getr3() { return( r3 ); }
int gets3() { return( s3 ); }
int gett3() { return( t3 ); }
int getu3() { return( u3 ); }
int getv3() { return( v3 ); }
int getw3() { return( w3 ); }
int getx3() { return( x3 ); }
int gety3() { return( y3 ); }
int getz3() { return( z3 ); }
int geta4() { return( a4 ); }
int getb4() { return( b4 ); }
int getc4() { return( c4 ); }
int getd4() { return( d4 ); }
int gete4() { return( e4 ); }
int getf4() { return( f4 ); }
int getg4() { return( g4 ); }
int geth4() { return( h4 ); }
int geti4() { return( i4 ); }
int getj4() { return( j4 ); }
int getk4() { return( k4 ); }
int getl4() { return( l4 ); }
int getm4() { return( m4 ); }
int getn4() { return( n4 ); }
int geto4() { return( o4 ); }
int getp4() { return( p4 ); }
int getq4() { return( q4 ); }
int getr4() { return( r4 ); }
int gets4() { return( s4 ); }
int gett4() { return( t4 ); }
int getu4() { return( u4 ); }
int getv4() { return( v4 ); }
int getw4() { return( w4 ); }
int getx4() { return( x4 ); }
int gety4() { return( y4 ); }
int getz4() { return( z4 ); }
int geta5() { return( a5 ); }
int getb5() { return( b5 ); }
int getc5() { return( c5 ); }
int getd5() { return( d5 ); }
int gete5() { return( e5 ); }
int getf5() { return( f5 ); }
int getg5() { return( g5 ); }
int geth5() { return( h5 ); }
int geti5() { return( i5 ); }
int getj5() { return( j5 ); }
int getk5() { return( k5 ); }
int getl5() { return( l5 ); }
int getm5() { return( m5 ); }
int getn5() { return( n5 ); }
int geto5() { return( o5 ); }
int getp5() { return( p5 ); }
int getq5() { return( q5 ); }
int getr5() { return( r5 ); }
int gets5() { return( s5 ); }
int gett5() { return( t5 ); }
int getu5() { return( u5 ); }
int getv5() { return( v5 ); }
int getw5() { return( w5 ); }
int getx5() { return( x5 ); }
int gety5() { return( y5 ); }
int getz5() { return( z5 ); }
int geta6() { return( a6 ); }
int getb6() { return( b6 ); }
int getc6() { return( c6 ); }
int getd6() { return( d6 ); }
int gete6() { return( e6 ); }
int getf6() { return( f6 ); }
int getg6() { return( g6 ); }
int geth6() { return( h6 ); }
int geti6() { return( i6 ); }
int getj6() { return( j6 ); }
int getk6() { return( k6 ); }
int getl6() { return( l6 ); }
int getm6() { return( m6 ); }
int getn6() { return( n6 ); }
int geto6() { return( o6 ); }
int getp6() { return( p6 ); }
int getq6() { return( q6 ); }
int getr6() { return( r6 ); }
int gets6() { return( s6 ); }
int gett6() { return( t6 ); }
int getu6() { return( u6 ); }
int getv6() { return( v6 ); }
int getw6() { return( w6 ); }
int getx6() { return( x6 ); }
int gety6() { return( y6 ); }
int getz6() { return( z6 ); }
int geta7() { return( a7 ); }
int getb7() { return( b7 ); }
int getc7() { return( c7 ); }
int getd7() { return( d7 ); }
int gete7() { return( e7 ); }
int getf7() { return( f7 ); }
int getg7() { return( g7 ); }
int geth7() { return( h7 ); }
int geti7() { return( i7 ); }
int getj7() { return( j7 ); }
int getk7() { return( k7 ); }
int getl7() { return( l7 ); }
int getm7() { return( m7 ); }
int getn7() { return( n7 ); }
int geto7() { return( o7 ); }
int getp7() { return( p7 ); }
int getq7() { return( q7 ); }
int getr7() { return( r7 ); }
int gets7() { return( s7 ); }
int gett7() { return( t7 ); }
int getu7() { return( u7 ); }
int getv7() { return( v7 ); }
int getw7() { return( w7 ); }
int getx7() { return( x7 ); }
int gety7() { return( y7 ); }
int getz7() { return( z7 ); }
int geta8() { return( a8 ); }
int getb8() { return( b8 ); }
int getc8() { return( c8 ); }
int getd8() { return( d8 ); }
int gete8() { return( e8 ); }
int getf8() { return( f8 ); }
int getg8() { return( g8 ); }
int geth8() { return( h8 ); }
int geti8() { return( i8 ); }
int getj8() { return( j8 ); }
int getk8() { return( k8 ); }
int getl8() { return( l8 ); }
int getm8() { return( m8 ); }
int getn8() { return( n8 ); }
int geto8() { return( o8 ); }
int getp8() { return( p8 ); }
int getq8() { return( q8 ); }
int getr8() { return( r8 ); }
int gets8() { return( s8 ); }
int gett8() { return( t8 ); }
int getu8() { return( u8 ); }
int getv8() { return( v8 ); }
int getw8() { return( w8 ); }
int getx8() { return( x8 ); }
int gety8() { return( y8 ); }
int getz8() { return( z8 ); }
int geta9() { return( a9 ); }
int getb9() { return( b9 ); }
int getc9() { return( c9 ); }
int getd9() { return( d9 ); }
int gete9() { return( e9 ); }
int getf9() { return( f9 ); }
int getg9() { return( g9 ); }
int geth9() { return( h9 ); }
int geti9() { return( i9 ); }
int getj9() { return( j9 ); }
int getk9() { return( k9 ); }
int getl9() { return( l9 ); }
int getm9() { return( m9 ); }
int getn9() { return( n9 ); }
int geto9() { return( o9 ); }
int getp9() { return( p9 ); }
int getq9() { return( q9 ); }
int getr9() { return( r9 ); }
int gets9() { return( s9 ); }
int gett9() { return( t9 ); }
int getu9() { return( u9 ); }
int getv9() { return( v9 ); }
int getw9() { return( w9 ); }
int getx9() { return( x9 ); }
int gety9() { return( y9 ); }
int getz9() { return( z9 ); }

void setz9( int value ) { z9 = value; }

#endif

void main(void)
{
/* Measure timer overhead and pre-load into cpu cache. */

/* Note that the timer-only numbers need to be subtracted from
 * the data access numbers to compensate for timer overhead.
 */
ZTIMERON();
ZTIMEROFF();
ZTIMERREPORT();
ZTIMERON();
ZTIMEROFF();
ZTIMERREPORT();
ZTIMERON();
ZTIMEROFF();
ZTIMERREPORT();

#ifdef DIRECT_METHOD
ZTIMERON();

z9 += a0 + b0 + c0 + d0 + e0 + f0 + g0 + h0 + i0 + j0 + k0 + l0 + m0 +
n0 + o0 + p0 + q0 + r0 + s0 + t0 + u0 + v0 + w0 + x0 + y0 + z0;

z9 += a1 + b1 + c1 + d1 + e1 + f1 + g1 + h1 + i1 + j1 + k1 + l1 + m1 +
n1 + o1 + p1 + q1 + r1 + s1 + t1 + u1 + v1 + w1 + x1 + y1 + z1;

z9 += a2 + b2 + c2 + d2 + e2 + f2 + g2 + h2 + i2 + j2 + k2 + l2 + m2 +
n2 + o2 + p2 + q2 + r2 + s2 + t2 + u2 + v2 + w2 + x2 + y2 + z2;

z9 += a3 + b3 + c3 + d3 + e3 + f3 + g3 + h3 + i3 + j3 + k3 + l3 + m3 +
n3 + o3 + p3 + q3 + r3 + s3 + t3 + u3 + v3 + w3 + x3 + y3 + z3;

z9 += a4 + b4 + c4 + d4 + e4 + f4 + g4 + h4 + i4 + j4 + k4 + l4 + m4 +
n4 + o4 + p4 + q4 + r4 + s4 + t4 + u4 + v4 + w4 + x4 + y4 + z4;

z9 += a5 + b5 + c5 + d5 + e5 + f5 + g5 + h5 + i5 + j5 + k5 + l5 + m5 +
n5 + o5 + p5 + q5 + r5 + s5 + t5 + u5 + v5 + w5 + x5 + y5 + z5;

z9 += a6 + b6 + c6 + d6 + e6 + f6 + g6 + h6 + i6 + j6 + k6 + l6 + m6 +
n6 + o6 + p6 + q6 + r6 + s6 + t6 + u6 + v6 + w6 + x6 + y6 + z6;

z9 += a7 + b7 + c7 + d7 + e7 + f7 + g7 + h7 + i7 + j7 + k7 + l7 + m7 +
n7 + o7 + p7 + q7 + r7 + s7 + t7 + u7 + v7 + w7 + x7 + y7 + z7;

z9 += a8 + b8 + c8 + d8 + e8 + f8 + g8 + h8 + i8 + j8 + k8 + l8 + m8 +
n8 + o8 + p8 + q8 + r8 + s8 + t8 + u8 + v8 + w8 + x8 + y8 + z8;

z9 += a9 + b9 + c9 + d9 + e9 + f9 + g9 + h9 + i9 + j9 + k9 + l9 + m9 +
n9 + o9 + p9 + q9 + r9 + s9 + t9 + u9 + v9 + w9 + x9 + y9 + z9;

ZTIMEROFF();

ZTIMERREPORT();
#endif

#ifdef ACCESSOR_METHOD
ZTIMERON();

setz9( getz9() +
geta0() + getb0() + getc0() + getd0() + gete0() + getf0() +
getg0() + geth0() + geti0() + getj0() + getk0() + getl0() +
getm0() + getn0() + geto0() + getp0() + getq0() + getr0() +
gets0() + gett0() + getu0() + getv0() + getw0() + getx0() +
gety0() + getz0() );

setz9( getz9() +
geta1() + getb1() + getc1() + getd1() + gete1() + getf1() +
getg1() + geth1() + geti1() + getj1() + getk1() + getl1() +
getm1() + getn1() + geto1() + getp1() + getq1() + getr1() +
gets1() + gett1() + getu1() + getv1() + getw1() + getx1() +
gety1() + getz1() );

setz9( getz9() +
geta2() + getb2() + getc2() + getd2() + gete2() + getf2() +
getg2() + geth2() + geti2() + getj2() + getk2() + getl2() +
getm2() + getn2() + geto2() + getp2() + getq2() + getr2() +
gets2() + gett2() + getu2() + getv2() + getw2() + getx2() +
gety2() + getz2() );

setz9( getz9() +
geta3() + getb3() + getc3() + getd3() + gete3() + getf3() +
getg3() + geth3() + geti3() + getj3() + getk3() + getl3() +
getm3() + getn3() + geto3() + getp3() + getq3() + getr3() +
gets3() + gett3() + getu3() + getv3() + getw3() + getx3() +
gety3() + getz3() );

setz9( getz9() +
geta4() + getb4() + getc4() + getd4() + gete4() + getf4() +
getg4() + geth4() + geti4() + getj4() + getk4() + getl4() +
getm4() + getn4() + geto4() + getp4() + getq4() + getr4() +
gets4() + gett4() + getu4() + getv4() + getw4() + getx4() +
gety4() + getz4() );

setz9( getz9() +
geta5() + getb5() + getc5() + getd5() + gete5() + getf5() +
getg5() + geth5() + geti5() + getj5() + getk5() + getl5() +
getm5() + getn5() + geto5() + getp5() + getq5() + getr5() +
gets5() + gett5() + getu5() + getv5() + getw5() + getx5() +
gety5() + getz5() );

setz9( getz9() +
geta6() + getb6() + getc6() + getd6() + gete6() + getf6() +
getg6() + geth6() + geti6() + getj6() + getk6() + getl6() +
getm6() + getn6() + geto6() + getp6() + getq6() + getr6() +
gets6() + gett6() + getu6() + getv6() + getw6() + getx6() +
gety6() + getz6() );

setz9( getz9() +
geta7() + getb7() + getc7() + getd7() + gete7() + getf7() +
getg7() + geth7() + geti7() + getj7() + getk7() + getl7() +
getm7() + getn7() + geto7() + getp7() + getq7() + getr7() +
gets7() + gett7() + getu7() + getv7() + getw7() + getx7() +
gety7() + getz7() );

setz9( getz9() +
geta8() + getb8() + getc8() + getd8() + gete8() + getf8() +
getg8() + geth8() + geti8() + getj8() + getk8() + getl8() +
getm8() + getn8() + geto8() + getp8() + getq8() + getr8() +
gets8() + gett8() + getu8() + getv8() + getw8() + getx8() +
gety8() + getz8() );

setz9( getz9() +
geta9() + getb9() + getc9() + getd9() + gete9() + getf9() +
getg9() + geth9() + geti9() + getj9() + getk9() + getl9() +
getm9() + getn9() + geto9() + getp9() + getq9() + getr9() +
gets9() + gett9() + getu9() + getv9() + getw9() + getx9() +
gety9() + getz9() );

ZTIMEROFF();

ZTIMERREPORT();
#endif
}

END FILE ‘acctest.c’ --------------------------------------

END X86 SOURCE CODE ---------------------------------------

END OF DOCUMENT
