Power ISA™ Version 3.0
IBM®
Power ISA
PowerPC®
Power Architecture
PowerPC Architecture
Power Family
RISC/System 6000®
POWER®
POWER2
POWER4
POWER4+
POWER5
POWER5+
POWER6®
POWER7®
System/370
System z
The POWER ARCHITECTURE and POWER.ORG
word marks and the Power and Power.org logos and
related marks are trademarks and service marks
licensed by Power.org.
AltiVec is a trademark of Freescale Semiconductor, Inc.
used under license.
Notice to U.S. Government Users—Documentation
Related to Restricted Rights—Use, duplication or
disclosure is subject to restrictions set forth in GSA ADP
Schedule Contract with IBM Corporation.
Preface
D-form VSX Floating-Point Storage Access Instructions: Adds base+displacement forms of VSR load and store instructions.

Integer Multiply-Add Instructions: Adds new integer multiply-add instructions to accelerate arbitrary-length multiplication.

msgsndp Hypervisor Facility Availability Interrupt: Adds a new HFSCR bit to control the availability of the msgsndp instruction and the associated control registers.

VSX Permute: Adds new permute instructions that can address all 64 VSRs.

Array Index Support: Enhances support for mixed-datatype addressing into arrays (e.g., base + 32-bit index).

Hypervisor Virtualization Interrupt: Defines a new exception and corresponding interrupt that is caused by events external to the processor that relate to virtualization.

Debug Extension for TM: Adds a new type of Trace interrupt to enable skipping past a transaction in the debugging process.

wait Instruction Enhancements: Improves the capabilities of the wait instruction so that resumption of processing can occur due to event-based branches and external signals.

Decrementer and Hypervisor Decrementer Enhancements: Defines a new mode bit in the LPCR that enables additional Decrementer and Hypervisor Decrementer bits in order to increase the time between the associated interrupts.

Deliver A Random Number: Adds a new instruction to place a random number in a GPR in one of three formats.

Data Storage Interrupt Status Register for Alignment Interrupt: Simplifies the Alignment interrupt by removing the Data Storage Interrupt Status Register (DSISR) from the set of registers modified by the Alignment interrupt.

CA32 & OV32 and Move XER to CR Extended: Adds support for 32-bit CA & OV status in 64-bit mode for dynamically-typed languages.

VSX Shift Variable: Accelerates parallel element extraction from packed vectors of arbitrary-width-element values.

Enhanced Virtualization for Linux: Delivers exceptions caused by the OS attempting to use hypervisor instructions and SPRs to the hypervisor instead of the OS. Accesses to unimplemented SPRs by the OS newly cause interrupts that are also directed to the hypervisor.

Synchronizing Messages and Storage Updates: Adds a new instruction to make latent storage updates from another thread accessible after receiving a Directed Hypervisor Doorbell interrupt from that thread.
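
As an illustration of the Integer Multiply-Add Instructions item above, the following C sketch shows why a multiply-add step is the core of arbitrary-length multiplication. It is not taken from this document: the function names mul_add_row and mp_mul, the 32-bit limb size, and the little-endian limb order are assumptions chosen for portability, and the code uses ordinary C arithmetic rather than the new instructions themselves.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: acc[0..n] += a[0..n-1] * b, with little-endian
 * 32-bit limbs. acc must hold n + 1 limbs. Returns the carry out of acc[n]. */
static uint32_t mul_add_row(uint32_t *acc, const uint32_t *a, size_t n, uint32_t b)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Multiply-add step: (2^32-1)^2 + 2*(2^32-1) = 2^64-1, so the
         * product plus accumulator limb plus carry always fits in 64 bits. */
        uint64_t t = (uint64_t)a[i] * b + acc[i] + carry;
        acc[i] = (uint32_t)t;   /* low half becomes the result limb */
        carry = t >> 32;        /* high half carries into the next limb */
    }
    uint64_t top = (uint64_t)acc[n] + carry;
    acc[n] = (uint32_t)top;
    return (uint32_t)(top >> 32);
}

/* Hypothetical schoolbook product: r[0..na+nb-1] = a * b. */
static void mp_mul(uint32_t *r, const uint32_t *a, size_t na,
                   const uint32_t *b, size_t nb)
{
    for (size_t i = 0; i < na + nb; i++)
        r[i] = 0;
    /* One multiply-add row per limb of b, shifted by j limbs. */
    for (size_t j = 0; j < nb; j++)
        (void)mul_add_row(r + j, a, na, b[j]);
}
```

Each inner-loop step computes a product plus two addends and splits the result into a low limb and a carry. That is the pattern a hardware multiply-add returning the low or high half of the sum can shorten; whether a compiler maps this loop onto such instructions is implementation-dependent.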
Table of Contents
Preface . . . v
Summary of Changes in Power ISA Version 3.0 . . . vi
Table of Contents . . . ix

Book I: Power ISA User Instruction Set Architecture . . . 1

Chapter 1. Introduction . . . 3
1.1 Overview . . . 3
1.2 Instruction Mnemonics and Operands . . . 3
1.3 Document Conventions . . . 3
1.3.1 Definitions . . . 3
1.3.2 Notation . . . 4
1.3.3 Reserved Fields, Reserved Values, and Reserved SPRs . . . 5
1.3.4 Description of Instruction Operation . . . 6
1.3.5 Phased-Out Facilities . . . 8
1.4 Processor Overview . . . 9
1.5 Computation modes . . . 10
1.6 Instruction Formats . . . 11
1.6.1 A-FORM . . . 12
1.6.2 B-FORM . . . 12
1.6.3 D-FORM . . . 12
1.6.4 DQ-FORM . . . 12
1.6.5 DS-FORM . . . 12
1.6.6 DX-FORM . . . 12
1.6.7 I-FORM . . . 12
1.6.8 M-FORM . . . 12
1.6.9 MD-FORM . . . 12
1.6.10 MDS-FORM . . . 12
1.6.11 SC-FORM . . . 12
1.6.12 VA-FORM . . . 12
1.6.13 VC-FORM . . . 12
1.6.14 VX-FORM . . . 13
1.6.15 X-FORM . . . 13
1.6.16 XFL-FORM . . . 15
1.6.17 XFX-FORM . . . 15
1.6.18 XL-FORM . . . 15
1.6.19 XO-FORM . . . 15
1.6.20 XS-FORM . . . 15
1.6.21 XX2-FORM . . . 15
1.6.22 XX3-FORM . . . 15
1.6.23 XX4-FORM . . . 15
1.6.24 Z22-FORM . . . 15
1.6.25 Z23-FORM . . . 16
1.7 Instruction Fields . . . 16
1.8 Classes of Instructions . . . 22
1.8.1 Defined Instruction Class . . . 22
1.8.2 Illegal Instruction Class . . . 22
1.8.3 Reserved Instruction Class . . . 22
1.9 Forms of Defined Instructions . . . 23
1.9.1 Preferred Instruction Forms . . . 23
1.9.2 Invalid Instruction Forms . . . 23
1.9.3 Reserved-no-op Instructions . . . 23
1.10 Exceptions . . . 23
1.11 Storage Addressing . . . 24
1.11.1 Storage Operands . . . 24
1.11.2 Instruction Fetches . . . 26
1.11.3 Effective Address Calculation . . . 27

Chapter 2. Branch Facility . . . 29
2.1 Branch Facility Overview . . . 29
2.2 Instruction Execution Order . . . 29
2.3 Branch Facility Registers . . . 30
2.3.1 Condition Register . . . 30
2.3.2 Link Register . . . 32
2.3.3 Count Register . . . 32
2.3.4 Target Address Register . . . 32
2.4 Branch History Rolling Buffer . . . 32
2.4.1 Branch History Rolling Buffer Entry Format . . . 33
2.5 Branch Instructions . . . 34
2.6 Condition Register Instructions . . . 41
2.6.1 Condition Register Logical Instructions . . . 41
2.6.2 Condition Register Field Instruction . . . 42
2.7 System Call Instructions . . . 43
2.8 Branch History Rolling Buffer Instructions . . . 44

Chapter 3. Fixed-Point Facility . . . 45
3.1 Fixed-Point Facility Overview . . . 45
3.2 Fixed-Point Facility Registers . . . 45
3.2.1 General Purpose Registers . . . 45
3.2.2 Fixed-Point Exception Register . . . 45
3.2.3 VR Save Register . . . 46
3.2.4 Load Monitored Region Registers . . . 46
3.2.4.1 Load Monitored Region Register . . . 46
3.2.4.2 Load Monitored Section Enable Register . . . 46
3.3 Fixed-Point Facility Instructions . . . 47
3.3.1 Fixed-Point Storage Access Instructions . . . 47
3.3.1.1 Storage Access Exceptions . . . 47
3.3.2 Fixed-Point Load Instructions . . . 47
3.3.2.1 64-bit Fixed-Point Load Instructions . . . 52
3.3.3 Fixed-Point Store Instructions . . . 55
3.3.3.1 64-bit Fixed-Point Store Instructions . . . 58
3.3.4 Fixed Point Load and Store Quadword Instructions . . . 59
3.3.5 Fixed-Point Load and Store with Byte Reversal Instructions . . . 61
3.3.5.1 64-Bit Load and Store with Byte Reversal Instructions . . . 62
3.3.6 Fixed-Point Load and Store Multiple Instructions . . . 63
3.3.7 Fixed-Point Move Assist Instructions [Phased Out] . . . 64
3.3.8 Other Fixed-Point Instructions . . . 67
3.3.9 Fixed-Point Arithmetic Instructions . . . 68
3.3.9.1 64-bit Fixed-Point Arithmetic Instructions . . . 80
3.3.10 Fixed-Point Compare Instructions . . . 85
3.3.10.1 Character-Type Compare Instructions . . . 87
3.3.11 Fixed-Point Trap Instructions . . . 89
3.3.11.1 64-bit Fixed-Point Trap Instructions . . . 90
3.3.12 Fixed-Point Select . . . 90
3.3.13 Fixed-Point Logical Instructions . . . 91
3.3.13.1 64-bit Fixed-Point Logical Instructions . . . 98
3.3.14 Fixed-Point Rotate and Shift Instructions . . . 100
3.3.14.1 Fixed-Point Rotate Instructions . . . 100
3.3.14.1.1 64-bit Fixed-Point Rotate Instructions . . . 103
3.3.14.2 Fixed-Point Shift Instructions . . . 106
3.3.14.2.1 64-bit Fixed-Point Shift Instructions . . . 108
3.3.15 Binary Coded Decimal (BCD) Assist Instructions . . . 110
3.3.16 Move To/From Vector-Scalar Register Instructions . . . 111
3.3.17 Move To/From System Register Instructions . . . 116

Chapter 4. Floating-Point Facility . . . 123
4.1 Floating-Point Facility Overview . . . 123
4.2 Floating-Point Facility Registers . . . 124
4.2.1 Floating-Point Registers . . . 124
4.2.2 Floating-Point Status and Control Register . . . 124
4.3 Floating-Point Data . . . 127
4.3.1 Data Format . . . 127
4.3.2 Value Representation . . . 127
4.3.3 Sign of Result . . . 129
4.3.4 Normalization and Denormalization . . . 129
4.3.5 Data Handling and Precision . . . 129
4.3.5.1 Single-Precision Operands . . . 129
4.3.5.2 Integer-Valued Operands . . . 130
4.3.6 Rounding . . . 131
4.4 Floating-Point Exceptions . . . 132
4.4.1 Invalid Operation Exception . . . 134
4.4.1.1 Definition . . . 134
4.4.1.2 Action . . . 134
4.4.2 Zero Divide Exception . . . 134
4.4.2.1 Definition . . . 134
4.4.2.2 Action . . . 135
4.4.3 Overflow Exception . . . 135
4.4.3.1 Definition . . . 135
4.4.3.2 Action . . . 135
4.4.4 Underflow Exception . . . 136
4.4.4.1 Definition . . . 136
4.4.4.2 Action . . . 136
4.4.5 Inexact Exception . . . 136
4.4.5.1 Definition . . . 136
4.4.5.2 Action . . . 136
4.5 Floating-Point Execution Models . . . 137
4.5.1 Execution Model for IEEE Operations . . . 137
4.5.2 Execution Model for Multiply-Add Type Instructions . . . 139
4.6 Floating-Point Facility Instructions . . . 140
4.6.1 Floating-Point Storage Access Instructions . . . 141
4.6.1.1 Storage Access Exceptions . . . 141
4.6.2 Floating-Point Load Instructions . . . 141
4.6.3 Floating-Point Store Instructions . . . 145
4.6.4 Floating-Point Load and Store Double Pair Instructions [Phased-Out] . . . 149
4.6.5 Floating-Point Move Instructions . . . 151
4.6.6 Floating-Point Arithmetic Instructions . . . 153
4.6.6.1 Floating-Point Elementary Arithmetic Instructions . . . 153
4.6.6.2 Floating-Point Multiply-Add Instructions . . . 158
4.6.7 Floating-Point Rounding and Conversion Instructions . . . 160
4.6.7.1 Floating-Point Rounding Instruction . . . 160
4.6.7.2 Floating-Point Convert To/From Integer Instructions . . . 160
4.6.7.3 Floating Round to Integer Instructions . . . 166
4.6.8 Floating-Point Compare Instructions . . . 168
4.6.9 Floating-Point Select Instruction . . . 169
4.6.10 Floating-Point Status and Control Register Instructions . . . 171

Chapter 5. Decimal Floating-Point . . . 175
5.1 Decimal Floating-Point (DFP) Facility Overview . . . 175
5.2 DFP Register Handling . . . 176
5.2.1 DFP Usage of Floating-Point Registers . . . 176
5.3 DFP Support for Non-DFP Data Types . . . 178
5.4 DFP Number Representation . . . 179
5.4.1 DFP Data Format . . . 179
5.4.1.1 Fields Within the Data Format . . . 179
5.4.1.2 Summary of DFP Data Formats . . . 180
5.4.1.3 Preferred DPD Encoding . . . 181
5.4.2 Classes of DFP Data . . . 181
5.5 DFP Execution Model . . . 182
5.5.1 Rounding . . . 182
5.5.2 Rounding Mode Specification . . . 183
5.5.3 Formation of Final Result . . . 184
5.5.3.1 Use of Ideal Exponent . . . 184
5.5.4 Arithmetic Operations . . . 184
5.5.4.1 Sign of Arithmetic Result . . . 184
5.5.5 Compare Operations . . . 185
5.5.6 Test Operations . . . 185
5.5.7 Quantum Adjustment Operations . . . 185
5.5.8 Conversion Operations . . . 185
5.5.8.1 Data-Format Conversion . . . 185
5.5.8.2 Data-Type Conversion . . . 186
5.5.9 Format Operations . . . 186
5.5.10 DFP Exceptions . . . 186
5.5.10.1 Invalid Operation Exception . . . 188
5.5.10.2 Zero Divide Exception . . . 189
5.5.10.3 Overflow Exception . . . 189
5.5.10.4 Underflow Exception . . . 190
5.5.10.5 Inexact Exception . . . 191
5.5.11 Summary of Normal Rounding And Range Actions . . . 192
5.6 DFP Instruction Descriptions . . . 194
5.6.1 DFP Arithmetic Instructions . . . 195
5.6.2 DFP Compare Instructions . . . 199
5.6.3 DFP Test Instructions . . . 202
5.6.4 DFP Quantum Adjustment Instructions . . . 205
5.6.5 DFP Conversion Instructions . . . 214
5.6.5.1 DFP Data-Format Conversion Instructions . . . 214
5.6.5.2 DFP Data-Type Conversion Instructions . . . 217
5.6.6 DFP Format Instructions . . . 219
5.6.7 DFP Instruction Summary . . . 223

Chapter 6. Vector Facility . . . 225
6.1 Vector Facility Overview . . . 225
6.2 Chapter Conventions . . . 225
6.2.1 Description of Instruction Operation . . . 225
6.3 Vector Facility Registers . . . 234
6.3.1 Vector Registers . . . 234
6.3.2 Vector Status and Control Register . . . 234
6.3.3 VR Save Register . . . 235
6.4 Vector Storage Access Operations . . . 236
6.4.1 Accessing Unaligned Storage Operands . . . 238
6.5 Vector Integer Operations . . . 239
6.5.1 Integer Saturation . . . 239
6.6 Vector Floating-Point Operations . . . 241
6.6.1 Floating-Point Overview . . . 241
6.6.2 Floating-Point Exceptions . . . 241
6.6.2.1 NaN Operand Exception . . . 241
6.6.2.2 Invalid Operation Exception . . . 242
6.6.2.3 Zero Divide Exception . . . 242
6.6.2.4 Log of Zero Exception . . . 242
6.6.2.5 Overflow Exception . . . 242
6.6.2.6 Underflow Exception . . . 242
6.7 Vector Storage Access Instructions . . . 243
6.7.1 Storage Access Exceptions . . . 243
6.7.2 Vector Load Instructions . . . 244
6.7.3 Vector Store Instructions . . . 247
6.7.4 Vector Alignment Support Instructions . . . 249
6.8 Vector Permute and Formatting Instructions . . . 250
6.8.1 Vector Pack and Unpack Instructions . . . 250
6.8.2 Vector Merge Instructions . . . 257
6.8.3 Vector Splat Instructions . . . 260
6.8.4 Vector Permute Instruction . . . 262
6.8.5 Vector Select Instruction . . . 263
6.8.6 Vector Shift Instructions . . . 264
6.8.7 Vector Extract Element Instructions . . . 269
6.8.8 Vector Insert Element Instructions . . . 270
6.9 Vector Integer Instructions . . . 271
6.9.1 Vector Integer Arithmetic Instructions . . . 271
6.9.1.1 Vector Integer Add Instructions . . . 271
6.9.1.2 Vector Integer Subtract Instructions . . . 277
6.9.1.3 Vector Integer Multiply Instructions . . . 283
6.9.1.4 Vector Integer Multiply-Add/Sum Instructions . . . 287
6.9.1.5 Vector Integer Sum-Across Instructions . . . 292
6.9.1.6 Vector Integer Negate Instructions . . . 295
6.9.2 Vector Extend Sign Instructions . . . 296
6.9.2.1 Vector Integer Average Instructions . . . 298
6.9.2.2 Vector Integer Absolute Difference Instructions . . . 300
6.9.2.3 Vector Integer Maximum and Minimum Instructions . . . 302
6.9.3 Vector Integer Compare Instructions . . . 306
6.9.4 Vector Logical Instructions . . . 315
6.9.5 Vector Parity Byte Instructions . . . 317
6.9.6 Vector Integer Rotate and Shift Instructions . . . 318
6.10 Vector Floating-Point Instruction Set . . . 324
6.10.1 Vector Floating-Point Arithmetic Instructions . . . 324
6.10.2 Vector Floating-Point Maximum and Minimum Instructions . . . 326
6.10.3 Vector Floating-Point Rounding and Conversion Instructions . . . 327
6.10.4 Vector Floating-Point Compare Instructions . . . 331
6.10.5 Vector Floating-Point Estimate Instructions . . . 334
6.11 Vector Exclusive-OR-based Instructions . . . 336
6.11.1 Vector AES Instructions . . . 336
6.11.2 Vector SHA-256 and SHA-512 Sigma Instructions . . . 338
6.11.3 Vector Binary Polynomial Multiplication Instructions . . . 339
6.11.4 Vector Permute and Exclusive-OR Instruction . . . 341
6.12 Vector Gather Instruction . . . 342
6.13 Vector Count Leading Zeros Instructions . . . 343
6.14 Vector Count Trailing Zeros Instructions . . . 344
6.14.1 Vector Count Leading/Trailing Zero LSB Instructions . . . 345
6.14.2 Vector Extract Element Instructions . . . 346
6.15 Vector Population Count Instructions . . . 348
6.16 Vector Bit Permute Instruction . . . 349
6.17 Decimal Integer Instructions . . . 350
6.17.1 Decimal Integer Arithmetic Instructions . . . 350
6.17.2 Decimal Integer Format Conversion Instructions . . . 352
6.17.3 Decimal Integer Sign Manipulation Instructions . . . 358
6.17.4 Decimal Integer Shift and Round Instructions . . . 359
6.17.5 Decimal Integer Truncate Instructions . . . 362
6.18 Vector Status and Control Register Instructions . . . 364

Chapter 7. Vector-Scalar Floating-Point Operations . . . 365
7.1 Introduction . . . 365
7.1.1 Overview of the Vector-Scalar Extension . . . 365
7.1.1.1 Compatibility with Floating-Point and Decimal Floating-Point Operations . . . 365
7.1.1.2 Compatibility with Vector Operations . . . 365
7.2 VSX Registers . . . 366
7.2.1 Vector-Scalar Registers . . . 366
7.2.1.1 Floating-Point Registers . . . 366
7.2.1.2 Vector Registers . . . 368
7.2.2 Floating-Point Status and Control Register . . . 369
7.3 VSX Operations . . . 374
7.3.1 VSX Floating-Point Arithmetic Overview . . . 374
7.3.2 VSX Floating-Point Data . . . 375
7.3.2.1 Data Format . . . 375
7.3.2.2 Value Representation . . . 377
7.3.2.3 Sign of Result . . . 378
7.3.2.4 Normalization and Denormalization . . . 379
7.3.2.5 Data Handling and Precision . . . 379
7.3.2.6 Rounding . . . 383
7.3.3 VSX Floating-Point Execution Models . . . 386
7.3.3.1 VSX Execution Model for IEEE Operations . . . 386
7.3.3.2 VSX Execution Model for Multiply-Add Type Instructions . . . 387
7.4 VSX Floating-Point Exceptions . . . 389
7.4.1 Floating-Point Invalid Operation Exception . . . 392
7.4.1.1 Definition . . . 392
7.4.1.2 Action for VE=1 . . . 392
7.4.1.3 Action for VE=0 . . . 394
7.4.2 Floating-Point Zero Divide Exception . . . 403
7.4.2.1 Definition . . . 403
7.4.2.2 Action for ZE=1 . . . 403
7.4.2.3 Action for ZE=0 . . . 404
7.4.3 Floating-Point Overflow Exception . . . 406
7.4.3.1 Definition . . . 406
7.4.3.2 Action for OE=1 . . . 406
7.4.3.3 Action for OE=0 . . . 409
7.4.4 Floating-Point Underflow Exception . . . 411
7.4.4.1 Definition . . . 411
7.4.4.2 Action for UE=1 . . . 411
7.4.4.3 Action for UE=0 . . . 413
7.4.5 Floating-Point Inexact Exception . . . 416
7.4.5.1 Definition . . . 416
7.4.5.2 Action for XE=1 . . . 416
7.4.5.3 Action for XE=0 . . . 419
7.5 VSX Storage Access Operations . . . 422
7.5.1 Accessing Aligned Storage Operands . . . 422
7.5.2 Accessing Unaligned Storage Operands . . . 423
7.5.3 Storage Access Exceptions . . . 424
7.6 VSX Instruction Set . . . 425
7.6.1 VSX Instruction Set Summary . . . 425
7.6.1.1 VSX Storage Access Instructions . . . 425
7.6.1.2 VSX Move Instructions . . . 426
7.6.1.3 VSX Floating-Point Arithmetic Instructions . . . 426
7.6.1.4 VSX Floating-Point Compare Instructions . . . 429
7.6.1.5 VSX FP-FP Conversion Instructions . . . 430
7.6.1.6 VSX FP-Integer Conversion Instructions . . . 430
7.6.1.7 VSX Round to Floating-Point Integer Instructions . . . 432
7.6.1.8 VSX Scalar Floating-Point Support Instructions . . . 432
7.6.1.9 VSX Logical Instructions . . . 433
7.6.1.10 VSX Permute Instructions . . . 434
7.6.2 VSX Instruction Description Conventions . . . 435
7.6.2.1 VSX Instruction RTL Operators . . . 435
7.6.2.2 VSX Instruction RTL Function Calls . . . 436
7.6.3 VSX Instruction Descriptions . . . 481

Appendix A. Suggested Floating-Point Models . . . 779
A.1 Floating-Point Round to Single-Precision Model . . . 779
A.2 Floating-Point Convert to Integer Model . . . 783
A.3 Floating-Point Convert from Integer Model . . . 786
A.4 Floating-Point Round to Integer Model . . . 788

Appendix B. Densely Packed Decimal . . . 791
B.1 BCD-to-DPD Translation . . . 791
B.2 DPD-to-BCD Translation . . . 791
B.3 Preferred DPD encoding . . . 792

Appendix C. Assembler Extended Mnemonics . . . 795
C.1 Symbols . . . 795
C.2 Branch Mnemonics . . . 796
C.2.1 BO and BI Fields . . . 796
C.2.2 Simple Branch Mnemonics . . . 796
C.2.3 Branch Mnemonics Incorporating Conditions . . . 797
C.2.4 Branch Prediction . . . 798
C.3 Condition Register Logical Mnemonics . . . 799
C.4 Subtract Mnemonics . . . 799
C.4.1 Subtract Immediate . . . 799
C.4.2 Subtract . . . 799
C.5 Compare Mnemonics . . . 800
C.5.1 Doubleword Comparisons . . . 800
C.5.2 Word Comparisons . . . 800
C.6 Trap Mnemonics . . . 801
C.7 Integer Select Mnemonics . . . 802
C.8 Rotate and Shift Mnemonics . . . 803
C.8.1 Operations on Doublewords . . . 803
C.8.2 Operations on Words . . . 804
C.9 Move To/From Special Purpose Register Mnemonics . . . 805
C.10 Miscellaneous Mnemonics . . . 806

Book II: Power ISA Virtual Environment Architecture . . . 809

Chapter 1. Storage Model . . . 811
1.1 Definitions . . . 811
1.2 Introduction . . . 812
1.3 Virtual Storage . . . 812
1.4 Single-Copy Atomicity . . . 813
1.5 Cache Model . . . 814
1.6 Storage Control Attributes . . . 814
1.6.1 Write Through Required . . . 814
1.6.2 Caching Inhibited . . . 815
1.6.3 Memory Coherence Required . . . 815
1.6.4 Guarded . . . 815
1.6.5 Strong Access Order . . . 817
1.7 Shared Storage . . . 818
1.7.1 Storage Access Ordering . . . 818
1.7.2 Storage Ordering of Copy/Paste-Initiated Data Transfers . . . 820
1.7.3 Storage Ordering of I/O Accesses . . . 820
1.7.4 Atomic Update . . . 820
1.7.4.1 Reservations . . . 821
1.7.4.2 Forward Progress . . . 823
1.8 Transactions . . . 823
1.8.1 Rollback-Only Transactions . . . 825
1.9 Cluster Shared Memory . . . 825
1.10 Instruction Storage . . . 826
1.10.1 Concurrent Modification and Execution of Instructions . . . 828

Chapter 2. Performance Considerations and Instruction Restart . . . 829
2.1 Performance-Optimized Instruction Sequences . . . 829
2.1.1 Load and Store Operations . . . 830
2.1.2 32-Bit Constant Generation . . . 833
2.1.3 Sign and Zero Extension . . . 833
2.1.4 Load/Store Addressing Relative to Program Counter . . . 834
2.1.5 Destructive Operation Operand Preservation . . . 835
2.2 Instruction Restart . . . 836

Chapter 3. Management of Shared Resources . . . 837
3.1 Program Priority Registers . . . 837
3.2 “or” Instruction . . . 837

Chapter 4. Storage Control Instructions . . . 839
4.1 Parameters Useful to Application Programs . . . 839
4.2 Data Stream Control Register (DSCR) . . . 839
4.3 Cache Management Instructions . . . 841
4.3.1 Instruction Cache Instructions . . . 842
4.3.2 Data Cache Instructions . . . 843
4.3.2.1 Obsolete Data Cache Instructions . . . 854
4.3.3 “or” Instruction . . . 855
4.4 Copy-Paste Facility . . . 856
4.5 Atomic Memory Operations . . . 861
4.5.1 Load Atomic . . . 861
4.5.2 Store Atomic . . . 864
4.6 Synchronization Instructions . . . 867
4.6.1 Instruction Synchronize Instruction . . . 867
4.6.2 Load and Reserve and Store Conditional Instructions . . . 867
4.6.2.1 64-Bit Load and Reserve and Store Conditional Instructions . . . 873
4.6.2.2 128-bit Load and Reserve Store Conditional Instructions . . . 875
4.6.3 Memory Barrier Instructions . . . 877
4.6.4 Wait Instruction . . . 880

Chapter 5. Transactional Memory Facility . . . 881
5.1 Transactional Memory Facility Overview . . . 881
5.1.1 Definitions . . . 882
5.2 Transactional Memory Facility States . . . 883
5.2.1 The TDOOMED Bit . . . 885
5.3 Transaction Failure . . . 885
5.3.1 Causes of Transaction Failure . . . 885
5.3.2 Recording of Transaction Failure . . . 888
5.3.3 Handling of Transaction Failure . . . 888
5.4 Transactional Memory Facility Registers . . . 889
5.4.1 Transaction Failure Handler Address Register (TFHAR) . . . 889
5.4.2 Transaction EXception And Status Register (TEXASR) . . . 889
5.4.3 Transaction Failure Instruction Address Register (TFIAR) . . . 891
5.5 Transactional Facility Instructions . . . 893

Chapter 6. Time Base . . . 901
6.1 Time Base Instructions . . . 902

Chapter 7. Event-Based Branch Facility . . . 905
7.1 Event-Based Branch Overview . . . 905
7.2 Event-Based Branch Registers . . . 906
7.2.1 Branch Event Status and Control Register . . . 906
7.2.2 Event-Based Branch Handler Register . . . 908
7.2.3 Event-Based Branch Return Register . . . 908
7.3 Event-Based Branch Instructions . . . 909

Appendix A. Assembler Extended Mnemonics . . . 911
A.1 Data Cache Block Touch [for Store] Mnemonics . . . 911
A.2 Data Cache Block Flush Mnemonics . . . 911
A.3 Or Mnemonics . . . 911
A.4 Load and Reserve Mnemonics . . . 911
A.5 Synchronize Mnemonics . . . 912
4.4.3 Load Monitored Doubleword Instruction . . . 968
4.4.4 Transactional Memory Instructions . . . 969
4.4.5 Move To/From System Register Instructions . . . 970
Chapter 5. Storage Control . . . 981
5.1 Overview . . . 981
5.2 Storage Exceptions . . . 981
5.3 Instruction Fetch . . . 981
5.3.1 Implicit Branch . . . 981
5.3.2 Address Wrapping Combined with Changing MSR Bit SF . . . 981
5.4 Data Access . . . 981
5.5 Performing Operations Out-of-Order . . . 982
5.6 Invalid Real Address . . . 982
5.7 Storage Addressing . . . 983
5.7.1 32-Bit Mode . . . 983
5.7.2 Virtualized Partition Memory (VPM) Mode . . . 983
5.7.3 Hypervisor Real And Virtual Real Addressing Modes . . . 984
5.7.3.1 Hypervisor Offset Real Mode Address . . . 984
5.7.3.2 Storage Control Attributes for Accesses in Hypervisor Real Addressing Mode . . . 984
5.7.3.2.1 Hypervisor Real Mode Storage Control . . . 985
5.7.3.3 Virtual Real Mode Addressing Mechanism . . . 985
5.7.3.4 Storage Control Attributes for Implicit Storage Accesses . . . 986
5.7.4 Definitions . . . 986
5.7.5 Address Ranges Having Defined Uses . . . 987
5.7.5.1 Effective Address Space Structure for Radix-using Partitions . . . 988
5.7.6 In-Memory Tables . . . 989
5.7.6.1 Partition Table . . . 989
5.7.6.2 Process Table . . . 991
5.7.7 Address Translation Overview . . . 991
5.7.8 Segment Translation . . . 994
5.7.8.1 Segment Lookaside Buffer (SLB) . . . 994
5.7.8.2 SLB Search . . . 996
5.7.8.3 Segment Table Description and Search . . . 996
5.7.8.3.1 Primary Hash for 256MB Segment . . . 996
5.7.8.3.2 Primary Hash for 1TB Segment . . . 996
5.7.8.3.3 Secondary Hash for 256MB Segment . . . 996
5.7.8.3.4 Secondary Hash for 1TB Segment . . . 996
5.7.9 Hashed Page Table Translation . . . 997
5.7.9.1 Hashed Page Table . . . 998
5.7.9.2 Page Table Search . . . 999
5.7.10 Radix Tree Translation . . . 1001
5.7.10.1 Radix Tree Page Directory Entry . . . 1002
5.7.10.2 Radix Tree Page Table Entry . . . 1003
5.7.11 Nested Translation . . . 1003
5.7.12 Translation Process . . . 1005
5.7.12.1 Fully-Qualified Address . . . 1005
5.7.12.2 Finding the Page Tables . . . 1005
5.7.12.3 Obtaining Host Real Address, Radix on Radix . . . 1006
5.7.12.4 Obtaining Host Real Address, HPT . . . 1007
5.7.13 Reference and Change Recording . . . 1007
5.7.14 Storage Protection . . . 1011
5.7.14.1 Virtual Page Class Key Protection . . . 1011
5.7.14.2 Basic Storage Protection, Address Translation Enabled . . . 1015
5.7.14.3 Basic Storage Protection, Address Translation Disabled . . . 1016
5.7.14.4 Radix Tree Translation Storage Protection . . . 1016
5.7.15 Cluster Shared Memory Protection . . . 1017
5.8 Storage Control Attributes . . . 1017
5.8.1 Guarded Storage . . . 1017
5.8.1.1 Out-of-Order Accesses to Guarded Storage . . . 1018
5.8.2 Storage Control Bits . . . 1018
5.8.2.1 Storage Control Bit Restrictions . . . 1019
5.8.2.2 Altering the Storage Control Bits . . . 1019
5.9 Storage Control Instructions . . . 1021
5.9.1 Cache Management Instructions . . . 1021
5.9.2 Synchronize Instruction . . . 1021
5.9.3 Lookaside Buffer Management . . . 1022
5.9.3.1 Thread-Specific Segment Translations . . . 1023
5.9.3.2 SLB Management Instructions . . . 1023
5.9.3.3 TLB Management Instructions . . . 1033
5.10 Translation Table Update Synchronization Requirements . . . 1043
5.10.1 Translation Table Updates . . . 1043
5.10.1.1 Adding a Page Table Entry . . . 1045
Book I:
Chapter 1. Introduction

1.1 Overview
This chapter describes computation modes, document conventions, a processor overview, instruction formats, storage addressing, and instruction fetching.

1.2 Instruction Mnemonics and Operands
The description of each instruction includes the mnemonic and a formatted list of operands. Some examples are the following.

stw RS,D(RA)
addis RT,RA,SI

Power ISA-compliant Assemblers will support the mnemonics and operand lists exactly as shown. They should also provide certain extended mnemonics, such as the ones described in Appendix C of Book I.

1.3 Document Conventions

1.3.1 Definitions
The following definitions are used throughout this document.

program
A sequence of related instructions.

application program
A program that uses only the instructions and resources described in Books I and II.

processor
The hardware component that implements the instruction set, storage model, and other facilities defined in the Power ISA architecture, and executes the instructions specified in a program.

quadword, doubleword, word, halfword, and byte
128 bits, 64 bits, 32 bits, 16 bits, and 8 bits, respectively.

positive
Means greater than zero.

negative
Means less than zero.

floating-point single format (or simply single format)
Refers to the representation of a single-precision binary floating-point value in a register or storage.

floating-point double format (or simply double format)
Refers to the representation of a double-precision binary floating-point value in a register or storage.

system library program
A component of the system software that can be called by an application program using a Branch instruction.

system service program
A component of the system software that can be called by an application program using a System Call or System Call Vectored instruction.

system trap handler
A component of the system software that receives control when the conditions specified in a Trap instruction are satisfied.

system error handler
A component of the system software that receives control when an error occurs. The system error handler includes a component for each of the various kinds of error. These error-specific components are referred to as the system alignment error handler, the system data storage error handler, etc.

latency
Refers to the interval from the time an instruction begins execution until it produces a result that is available for use by a subsequent instruction.

unavailable
Refers to a resource that cannot be used by the program. For example, storage is unavailable if access to it is denied. See Book III.
undefined value
May vary between implementations, and between different executions on the same implementation, and similarly for register contents, storage contents, etc., that are specified as being undefined.

boundedly undefined
The results of executing a given instruction are said to be boundedly undefined if they could have been achieved by executing an arbitrary finite sequence of instructions (none of which yields boundedly undefined results) in the state the processor was in before executing the given instruction. Boundedly undefined results may include the presentation of inconsistent state to the system error handler as described in Section 1.10.1 of Book II. Boundedly undefined results for a given instruction may vary between implementations, and between different executions on the same implementation.

“must”
If software violates a rule that is stated using the word “must” (e.g., “this field must be set to 0”), the results are boundedly undefined unless otherwise stated.

sequential execution model
The model of program execution described in Section 2.2, “Instruction Execution Order” on page 29.

1.3.2 Notation
The following notation is used throughout the Power ISA documents.

All numbers are decimal unless specified in some special way.
- 0bnnnn means a number expressed in binary format.
- 0xnnnn means a number expressed in hexadecimal format.
Underscores may be used between digits.

RT, RA, R1, ... refer to General Purpose Registers.

FRT, FRA, FR1, ... refer to Floating-Point Registers.

FRTp, FRAp, FRBp, ... refer to an even-odd pair of Floating-Point Registers. Values must be even, otherwise the instruction form is invalid.

VRT, VRA, VR1, ... refer to Vector Registers.

(x) means the contents of register x, where x is the name of an instruction field. For example, (RA) means the contents of register RA, and (FRA) means the contents of register FRA, where RA and FRA are instruction fields. Names such as LR and CTR denote registers, not fields, so parentheses are not used with them. Parentheses are also omitted when register x is the register into which the result of an operation is placed.

(RA|0) means the contents of register RA if the RA field has the value 1-31, or the value 0 if the RA field is 0.

Bytes in instructions, fields, and bit strings are numbered from left to right, starting with byte 0 (most significant).

Bits in registers, instructions, fields, and bit strings are specified as follows. In the last three items (definition of Xp etc.), if X is a field that specifies a GPR, FPR, or VR (e.g., the RS field of an instruction), the definitions apply to the register, not to the field.
- Bits in instructions, fields, and bit strings are numbered from left to right, starting with bit 0.
- For all registers except the Vector registers, bits in registers that are less than 64 bits start with bit number 64-L, where L is the register length; for the Vector registers, bits in registers that are less than 128 bits start with bit number 128-L.
- The leftmost bit of a sequence of bits is the most significant bit of the sequence.
- Xp means bit p of register/instruction/field/bit_string X.
- Xp:q means bits p through q of register/instruction/field/bit_string X.
- Xp q ... means bits p, q, ... of register/instruction/field/bit_string X.

¬(RA) means the one’s complement of the contents of register RA.

A period (.) as the last character of an instruction mnemonic means that the instruction records status information in certain fields of the Condition Register as a side effect of execution.

The symbol || is used to describe the concatenation of two values. For example, 010 || 111 is the same as 010111.

xⁿ means x raised to the nth power.

ⁿx means the replication of x, n times (i.e., x concatenated to itself n-1 times). ⁿ0 and ⁿ1 are special cases:
- ⁿ0 means a field of n bits with each bit equal to 0. Thus ⁵0 is equivalent to 0b00000.
- ⁿ1 means a field of n bits with each bit equal to 1. Thus ⁵1 is equivalent to 0b11111.

Each bit and field in instructions, and in status and control registers (e.g., XER, FPSCR) and Special Purpose Registers, is either defined or reserved. Some defined fields contain reserved values. In such cases when this document refers to the specific field, it refers only to the defined values, unless otherwise specified.

/, //, ///, ... denotes a reserved field, in a register, instruction, field, or bit string.

?, ??, ???, ... denotes an implementation-dependent field in a register, instruction, field or bit string.

1.3.3 Reserved Fields, Reserved Values, and Reserved SPRs
Reserved fields in instructions are ignored by the processor.
In some cases a defined field of an instruction has cer-
tain values that are reserved. This includes cases in
which the field is shown in the instruction layout as con-
taining a particular value; in such cases all other values
of the field are reserved. In general, if an instruction is
coded such that a defined field contains a reserved
value the instruction form is invalid; see Section 1.9.2
on page 23. The only exception to the preceding rule is
that it does not apply to Reserved and Illegal classes of
instructions (see Section 1.8) or to portions of defined
fields that are specified, in the instruction description,
as being treated as reserved fields.
To maximize compatibility with future architecture
extensions, software must ensure that reserved fields
in instructions contain zero and that defined fields of
instructions do not contain reserved values.
The handling of reserved bits in System Registers (e.g.,
XER, FPSCR) depends on whether the processor is in
problem state. Unless otherwise stated, software is per-
mitted to write any value to such a bit. In problem state,
a subsequent reading of the bit returns 0 regardless of
the value written; in privileged states, a subsequent
reading of the bit returns 0 if the value last written to the
bit was 0 and returns an undefined value (0 or 1) other-
wise.
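The read-back rule for reserved System Register bits stated above can be summarized as a small model. This is only an illustrative sketch: the function name `read_reserved_bit` and the use of `None` to stand for an undefined value (0 or 1) are assumptions made here, not part of the architecture.

```python
def read_reserved_bit(last_written, problem_state):
    """Model the value read from a reserved System Register bit.

    Returns 0 where the text guarantees 0, and None where the
    architecture leaves the value undefined (it may be 0 or 1).
    """
    if problem_state:
        # In problem state the bit reads as 0 regardless of the value written.
        return 0
    # In privileged states only a previously written 0 is guaranteed to read as 0.
    return 0 if last_written == 0 else None
```

Software that needs a predictable result in privileged states would therefore write 0 to such bits, which matches the guidance given for reserved fields elsewhere in this section.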
In some cases, a defined field of a System Register has
certain values that are reserved. Software must not set
a defined field of a System Register to a reserved
value. References elsewhere in this document to a
defined field (in an instruction or System Register) that
has reserved values assume the field does not contain
a reserved value, unless otherwise stated or obvious
from context.
In some cases, a given bit of a System Register is
specified to be set to a constant value by a given
instruction or event. Unless otherwise stated or obvious
from context, software should not depend on this con-
stant value because the bit may be assigned a meaning
in a future version of the architecture.
The reserved SPRs include SPRs 808, 809, 810, and
811. mtspr and mfspr instructions specifying these
SPRs are treated as noops. Reserved SPRs are pro-
vided in the architecture to anticipate the eventual
adoption of performance hint functionality that must be
controlled by SPRs. Control of these capabilities using
reserved SPRs will allow software to use these new
capabilities on new implementations that support them
while remaining compatible with existing implementa-
tions that may not support the new functionality.
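The no-op treatment of the reserved SPRs can be pictured with a toy model. The sketch below assumes a simplified machine state held in two Python dictionaries (`sprs` and `gprs`, both hypothetical) and ignores privilege checks and unimplemented SPRs; it is meant only to show that neither instruction changes any state when a reserved SPR is named.

```python
# SPR numbers listed as reserved in the text above.
RESERVED_SPRS = frozenset({808, 809, 810, 811})
MASK64 = (1 << 64) - 1


def mtspr(sprs, spr, value):
    """Toy mtspr: a write to a reserved SPR is treated as a no-op."""
    if spr in RESERVED_SPRS:
        return
    sprs[spr] = value & MASK64


def mfspr(sprs, gprs, rt, spr):
    """Toy mfspr: a read of a reserved SPR leaves the target GPR unchanged."""
    if spr in RESERVED_SPRS:
        return
    # Assumption for this sketch: SPRs never written read as 0.
    gprs[rt] = sprs.get(spr, 0)
```

For example, after `mtspr(sprs, 808, 1)` the dictionary `sprs` is unchanged, whereas a write to a non-reserved number such as SPR 9 (CTR) is recorded and can be read back with `mfspr`.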
Reserved SPRs are not assigned names. There are no individual descriptions of reserved SPRs in this document.

Assembler Note
Assemblers should report uses of reserved values of defined fields of instructions as errors.

Programming Note
It is the responsibility of software to preserve bits that are now reserved in System Registers, because they may be assigned a meaning in some future version of the architecture.
In order to accomplish this preservation in implementation-independent fashion, software should do the following.
- Initialize each such register supplying zeros for all reserved bits.
- Alter (defined) bit(s) in the register by reading the register, altering only the desired bit(s), and then writing the new value back to the register.
The XER and FPSCR are partial exceptions to this recommendation. Software can alter the status bits in these registers, preserving the reserved bits, by executing instructions that have the side effect of altering the status bits. Similarly, software can alter any defined bit in the FPSCR by executing a Floating-Point Status and Control Register instruction. Using such instructions is likely to yield better performance than using the method described in the second item above.

1.3.4 Description of Instruction Operation
Instruction descriptions (including related material such as the introduction to the section describing the instructions) mention that the instruction may cause a system error handler to be invoked, under certain conditions, if and only if the system error handler may treat the case as a programming error. (An instruction may cause a system error handler to be invoked under other conditions as well; see Chapter 6 of Book III).

A formal description is given of the operation of each instruction. In addition, the operation of most instructions is described by a semiformal language at the register transfer level (RTL). This RTL uses the notation given below, in addition to the notation described in Section 1.3.2. Some of this notation is also used in the formal descriptions of instructions. RTL notation not summarized here should be self-explanatory.

The RTL descriptions cover the normal execution of the instruction, except that “standard” setting of status registers, such as the Condition Register, is not shown. (“Non-standard” setting of these registers, such as the setting of the Condition Register by the Compare instructions, is shown.) The RTL descriptions do not cover cases in which the system error handler is invoked, or for which the results are boundedly undefined.

The RTL descriptions specify the architectural transformation performed by the execution of an instruction. They do not imply any particular implementation.

Notation        Meaning
←               Assignment
←iea            Assignment of an instruction effective address. In 32-bit mode the high-order 32 bits of the 64-bit target address are set to 0.
¬               NOT logical operator
+               Two’s complement addition
-               Two’s complement subtraction, unary minus
×               Multiplication
×si             Signed-integer multiplication
×ui             Unsigned-integer multiplication
/               Division
÷               Division, with result truncated to integer
%               Remainder of integer division
√               Square root
=, ≠            Equals, Not Equals relations
<, ≤, >, ≥      Signed comparison relations
<u, >u          Unsigned comparison relations
?               Unordered comparison relation
&, |            AND, OR logical operators
⊕, ≡            Exclusive OR, Equivalence logical operators ((a≡b) = (a⊕¬b))
ABS(x)          Absolute value of x
BCD_TO_DPD(x)   The low-order 24 bits of x contain six, 4-bit BCD fields which are converted to two declets; each set of two declets is placed into the low-order 20 bits of the result. See Section B.1, “BCD-to-DPD Translation”.
CEIL(x)         Least integer ≥ x
DOUBLE(x)       Result of converting x from floating-point single format to floating-point double format, using the model shown on page 141
DPD_TO_BCD(x)   The low-order 20 bits of x contain two declets which are converted to six, 4-bit BCD fields; each set of six, 4-bit BCD fields is placed into the low-order 24 bits of the result. See Section B.2, “DPD-to-BCD Translation”.
EXTS(x)         Result of extending x on the left with sign bits
FLOOR(x)        Greatest integer ≤ x
GPR(x)          General Purpose Register x
MASK(x, y)      Mask having 1s in positions x through y (wrapping if x > y) and 0s elsewhere
MEM(x, y)       Contents of a sequence of y bytes of storage. The sequence depends on the byte ordering used for storage access, as follows.
                Big-Endian byte ordering: The sequence starts with the byte at address x and ends with the byte at address x+y-1.
                Little-Endian byte ordering: The sequence starts with the byte at address x+y-1 and ends with the byte at address x.
ROTL64(x, y)    Result of rotating the 64-bit value x left y positions
ROTL32(x, y)    Result of rotating the 64-bit value x||x left y positions, where x is 32 bits long
SINGLE(x)       Result of converting x from floating-point double format to floating-point single format, using the model shown on page 145
SPR(x)          Special Purpose Register x
switch/case/default
                switch/case/default statement, indenting shows range. The clause after “switch” specifies the expression to evaluate. The clause after “case” specifies individual values for the expression, followed by a colon, followed by the actions that are taken if the evaluated expression has any of the specified values. “default” is optional. If present, it must follow all the “case” clauses. The clause after “default” starts with a colon, and specifies the actions that are taken if the evaluated expression does not have any of the values specified in the preceding case statements.
TRAP            Invoke the system trap handler
characterization
                Reference to the setting of status bits, in a standard way that is explained in the text
undefined       An undefined value.
CIA             Current Instruction Address, which is the 64-bit address of the instruction being described by a sequence of RTL. Used by relative branches to set the Next Instruction Address (NIA), and by Branch instructions with LK=1 to set the Link Register. Does not correspond to any architected register. The CIA is sometimes referred to as the Program Counter (PC).
NIA             Next Instruction Address, which is the 64-bit address of the next instruction to be executed. For a successful branch, the next instruction address is the branch target address: in RTL, this is indicated by assigning a value to NIA. For other instructions that cause non-sequential instruction fetching (see Book III), the RTL is similar. For instructions that do not branch, and do not otherwise cause instruction fetching to be non-sequential, the next instruction address is CIA+4. Does not correspond to any architected register.
if... then... else...
                Conditional execution, indenting shows range; else is optional.
do              Do loop, indenting shows range. “To” and/or “by” clauses specify incrementing an iteration variable, and a “while” clause gives termination conditions.
leave           Leave innermost do loop, or do loop described in leave statement.
for             For loop, indenting shows range. Clause after “for” specifies the entities for which to execute the body of the loop.
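Several of the functional RTL notations listed above (EXTS, MASK, ROTL64, ROTL32, MEM), together with the left-to-right bit numbering of Section 1.3.2, can be illustrated with a short Python sketch. The function names, the explicit width parameters, and the reading of MEM as an unsigned integer whose first byte in the sequence is most significant are assumptions made for this illustration; the architecture defines only the notation, not an implementation.

```python
MASK64 = (1 << 64) - 1


def bits(x, p, q, width=64):
    """Bits p through q of a width-bit value, bit 0 being the most significant."""
    nbits = q - p + 1
    return (x >> (width - 1 - q)) & ((1 << nbits) - 1)


def exts(x, nbits, width=64):
    """EXTS(x): extend the nbits-bit value x on the left with sign bits."""
    if (x >> (nbits - 1)) & 1:
        x |= ((1 << width) - 1) & ~((1 << nbits) - 1)
    return x & ((1 << width) - 1)


def mask(x, y):
    """MASK(x, y): 64-bit mask with 1s in positions x through y, wrapping if x > y."""
    def run(a, b):  # 1s in positions a..b, with a <= b
        return ((1 << (b - a + 1)) - 1) << (63 - b)
    if x <= y:
        return run(x, y)
    return run(x, 63) | run(0, y)


def rotl64(x, y):
    """ROTL64(x, y): rotate the 64-bit value x left y positions."""
    y %= 64
    return ((x << y) | (x >> (64 - y))) & MASK64


def rotl32(x, y):
    """ROTL32(x, y): rotate the 64-bit value x||x left y positions (x is 32 bits)."""
    x &= 0xFFFFFFFF
    return rotl64((x << 32) | x, y)


def mem(storage, x, y, little_endian=False):
    """MEM(x, y): y bytes of storage starting at address x, as an integer."""
    order = "little" if little_endian else "big"
    return int.from_bytes(storage[x:x + y], order)
```

With these definitions, `mask(60, 63)` yields `0xF` and `mask(62, 1)` wraps around to `0xC000000000000003`, which shows how position 0 denotes the most significant bit.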
The precedence rules for RTL operators are summarized in Table 1. Operators higher in the table are applied before those lower in the table. Operators at the same level in the table associate from left to right, from right to left, or not at all, as shown. (For example, - associates from left to right, so a-b-c = (a-b)-c.) Parentheses are used to override the evaluation order implied by the table or to increase clarity; parenthesized expressions are evaluated before serving as operands.

Table 1: Operator precedence
Operators                                                          Associativity
subscript, function evaluation                                     left to right
pre-superscript (replication), post-superscript (exponentiation)   right to left
unary -, ¬                                                         right to left
×, ÷                                                               left to right
+, -                                                               left to right
||                                                                 left to right
=, ≠, <, ≤, >, ≥, <u, >u, ?                                        left to right
&, ⊕, ≡                                                            left to right
|                                                                  left to right
: (range)                                                          none
←, ←iea                                                            none

1.3.5 Phased-Out Facilities

Phased-Out Facilities
These are facilities and instructions that, in some future version of the architecture, will be dropped out of the architecture. System developers should develop a migration plan to eliminate use of them in new systems. These facilities are marked with a [Phased-Out] marker.
Phased-Out facilities and instructions must be implemented.

Programming Note
Warning: Instructions and facilities being phased out of the architecture are likely to perform poorly on future implementations. New programs should not use them.
Register(s)            Bits     Described in
CR                     32:63    “Condition Register” on page 30
LR                     0:63     “Link Register” on page 32
CTR                    0:63     “Count Register” on page 32
GPR 0 ... GPR 31       0:63     “General Purpose Registers” on page 45
XER                    0:63     “Fixed-Point Exception Register” on page 45
VRSAVE                 32:63    “VR Save Register” on page 235
FPR 0 ... FPR 31       0:63     “Floating-Point Registers” on page 124
FPSCR                  32:63    “Floating-Point Status and Control Register” on page 124
VR 0 ... VR 31         0:127    “Vector Registers” on page 234
VSCR                   96:127   “Vector Status and Control Register” on page 234
VSR 0 ... VSR 63       0:127    “Vector-Scalar Registers” on page 366
Programming Note
Although instructions that set a 64-bit register affect all 64 bits in both 32-bit and 64-bit modes, operating systems often do not preserve the upper 32 bits of all registers across context switches done in 32-bit mode. For this reason, application programs operating in 32-bit mode should not assume that the upper 32 bits of the GPRs are preserved from instruction to instruction unless the operating system is known to preserve these bits.
PO BF / L RA SI 1.6.9 MD-FORM
PO BF / L RA UI 0 6 11 16 21 27 30 31
PO FRS RA D PO RS RA sh mb XO sh Rc
PO FRT RA D PO RS RA sh me XO sh Rc
PO RS RA D Figure 11. MD instruction format
PO RS RA UI
PO RT RA D 1.6.10 MDS-FORM
PO RT RA SI 0 6 11 16 21 25 27 31
PO TO RA SI PO RS RA RB mb XO Rc
Figure 5. D instruction format PO RS RA RB me XO Rc
0 6 11 16 30 31 PO RT RA RB RC XO
PO FRSp RA DS XO PO VRT VRA VRB / SHB XO
PO FRTp RA DS XO PO VRT VRA VRB VRC XO
PO RS RA DS XO Figure 14. VA instruction format
PO RSp RA DS XO
PO RT RA DS XO 1.6.13 VC-FORM
PO VRS RA DS XO 0 6 11 16 21 22 31
1.6.14 VX-FORM 0 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0 6 11 12 13 14 16 21 22 23 31 PO BF // FRA FRBp XO /
PO /// /// VRB XO PO BF // FRAp FRBp XO /
PO RT XO VRB XO PO BF // RA RB XO /
PO VRT /// /// XO PO BF // UIM FRB XO /
PO VRT /// VRB XO PO BF // UIM FRBp XO /
PO VRT /// UIM VRB XO PO BF // VRA VRB XO /
PO VRT // UIM VRB XO PO BF / 1 RA RB XO /
PO VRT / UIM VRB XO PO BF / L RA RB XO /
PO VRT RA VRB XO PO BF DCMX VRB XO /
PO VRT SIM /// XO PO BT /// /// XO Rc
PO VRT UIM VRB XO PO FRS RA RB XO /
PO VRT VRA /// XO PO FRSp RA RB XO /
PO VRT VRA VRB 1 / XO PO FRT /// /// XO Rc
PO VRT VRA VRB 1 PS XO PO FRT /// FRB XO Rc
PO VRT VRA VRB XO PO FRT /// FRBp XO Rc
PO VRT EO VRB 1 / XO PO FRT FRA FRB XO /
PO VRT EO VRB 1 PS XO PO FRT FRA FRB XO Rc
PO VRT EO VRB XO PO FRT RA RB XO /
PO FRT S /// FRB XO Rc
Figure 16. VX instruction format
PO FRT SP /// FRB XO Rc
PO FRTp /// FRB XO Rc
1.6.15 X-FORM
0 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 PO FRTp /// FRBp XO Rc
PO /// /// /// XO / PO FRTp FRA FRBp XO Rc
PO /// /// /// XO 1 PO FRTp FRAp FRBp XO Rc
PO /// /// RB XO / PO FRTp RA RB XO /
PO /// RA /// XO / PO FRTp S /// FRBp XO Rc
PO /// RA /// XO 1 PO FRTp SP /// FRBp XO Rc
PO /// RA RB XO / PO RS /// RB XO /
PO /// L /// /// XO / PO RS / RIC PR R RB XO /
PO /// L /// RB XO / PO RS /// L /// XO /
PO /// L RA RB XO / PO RS / SR /// XO /
PO /// L RA RB XO Rc PO RS BFA // /// XO /
PO /// L /// /// XO / PO RS RA /// XO /
PO /// L RA RB XO / PO RS RA /// XO 1
PO /// WC /// /// XO / PO RS RA /// XO Rc
PO // IH /// /// XO / PO RS RA FC XO /
PO / CT RA RB XO / PO RS RA NB XO /
PO A /// /// /// XO / PO RS RA SH XO Rc
PO A /// R /// /// XO / PO RS RA RB XO /
PO BF // /// /// XO / PO RS RA RB XO 1
PO BF // /// FRB XO / PO RS RA RB XO Rc
PO BF // /// W U / XO Rc PO RSp RA RB XO 1
PO BF // BFA // /// XO / PO RT /// /// XO /
PO BF // FRA FRB XO / PO RT /// RB XO /
0 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
PO RT /// RB XO 1
PO RT /// L /// XO /
PO RT / SR /// XO /
PO RT RA FC XO /
PO RT RA NB XO /
PO RT RA RB XO /
PO RT RA RB XO EH
PO RTp RA RB XO EH
PO S RA /// XO SX
PO S RA RB XO SX
PO T RA /// XO TX
PO T RA RB XO TX
PO T EO IMM8 XO TX
PO TH RA RB XO /
PO TO RA SI XO 1
PO TO RA RB XO /
PO TO RA RB XO 1
PO VRS RA RB XO /
PO VRT RA RB XO /
PO VRT VRA VRB XO /
PO VRT VRA VRB XO RO
PO VRT XO VRB XO /
PO VRT XO VRB XO RO
PO RT spr XO / PO BF // A B XO AX BX /
PO RT tbr XO / PO T A B 0 DM XO AX BX TX
1.6.18 XL-FORM PO T A B XO AX BX TX
0 6 9 11 14 16 19 20 21 31 Figure 24. XX3 instruction format
PO /// /// /// XO /
PO /// /// /// S XO / 1.6.23 XX4-FORM
PO BF // BFA // /// XO / 0 6 11 16 21 26 27 28 29 30 31
PO BO BI /// BH XO LK PO T A B C XO CX AX BX TX
PO BT BA BB XO / Figure 25. XX4 instruction format
Figure 20. XL instruction format
1.6.24 Z22-FORM
1.6.19 XO-FORM 0 6 9 11 15 16 22 31
0 6 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 PO BF // FRA DCM XO /
PO RT RA /// OE XO Rc PO BF // FRA DGM XO /
PO RT RA RB / XO / PO BF // FRAp DCM XO /
PO RT RA RB / XO Rc PO BF // FRAp DGM XO /
PO RT RA RB OE XO Rc PO FRT FRA SH XO Rc
PO RS RA sh XO sh Rc
L (6)
Field used to specify whether the mtfsf instruction updates the entire FPSCR.
Formats: XFL

MB (21:25)
Field used in M-form instructions to specify the first 1-bit of a 64-bit mask, as described in Section 3.3.14, “Fixed-Point Rotate and Shift Instructions” on page 100.
Formats: M

mb (21:26)
Field used in MD-form and MDS-form instructions to specify the first 1-bit of a 64-bit mask, as described in Section 3.3.14, “Fixed-Point Rotate and Shift Instructions” on page 100.
Formats: MD, MDS

me (21:26)
Field used in MD-form and MDS-form instructions to specify the last 1-bit of a 64-bit mask, as described in Section 3.3.14, “Fixed-Point Rotate and Shift Instructions” on page 100.
Formats: MD, MDS

ME (26:30)
Field used in M-form instructions to specify the last 1-bit of a 64-bit mask, as described in Section 3.3.14, “Fixed-Point Rotate and Shift Instructions” on page 100.
Formats: M

NB (16:20)
Field used to specify the number of bytes to move in an immediate Move Assist instruction.
Formats: X

OE (21)
Field used by XO-form instructions to enable setting OV and SO in the XER.
Formats: XO

PO (0:5)
Primary opcode.
Formats: all

PRS (14)
Field used to specify whether to invalidate process- or partition-scoped entries for tlbie[l].
Formats: X

PS (22)
Field used to specify preferred sign for BCD operations.
Formats: VX

PT (28:31)
Immediate field used to specify a 4-bit unsigned value.
Formats: DQ

R (10)
Field used by the tbegin. instruction to specify the start of a ROT.
Formats: X

R (15)
Immediate field that specifies whether the RMC is specifying the primary or secondary encoding.
Field used to specify whether to invalidate Radix Tree or HPT entries for tlbie[l].
Formats: X, Z23

RA (11:15)
Field used to specify a GPR to be used as a source or as a target.
Formats: A, D, DQ, DQE, DS, M, MD, MDS, TX, VA, VX, X, XO, XS, XX1

RB (16:20)
Field used to specify a GPR to be used as a source.
Formats: A, M, MDS, VA, X, XO, XX1

Rc (21)
RECORD bit.
0 Do not alter the Condition Register.
1 Set Condition Register Field 6 as described in Section 2.3.1, “Condition Register” on page 30.
Formats: VC, XX3

RC (21:25)
Field used to specify a GPR to be used as a source.
Formats: VA

Rc (31)
RECORD bit.
0 Do not alter the Condition Register.
1 Set Condition Register Field 0 or Field 1 as described in Section 2.3.1, “Condition Register” on page 30.
Formats: A, M, MD, MDS, X, XFL, XO, XS, Z22, Z23

RIC (12:13)
Field used to specify what types of entries to invalidate for tlbie[l].
Formats: X

RMC (21:22)
Immediate field used for DFP rounding mode control.
Formats: Z23

RO (31)
Round to Odd override
Formats: X
RS (6:10)
Field used to specify a GPR to be used as a source.
Formats: D, DS, M, MD, MDS, X, XFX, XS

RSp (6:10)
Field used to specify an even/odd pair of GPRs to be concatenated and used as a source.
Formats: DS, X

RT (6:10)
Field used to specify a GPR to be used as a target.
Formats: A, D, DQE, DS, DX, VA, VX, X, XFX, XO, XX2

RTp (6:10)
Field used to specify an even/odd pair of GPRs to be concatenated and used as a target.
Formats: DQ, X

S (11)
Immediate field that specifies signed versus unsigned conversion.
Formats: X

S (20)
Immediate field that specifies whether or not the rfebb instruction re-enables event-based branches.
Formats: XL

SH (16:20)
Field used to specify a shift amount.
Formats: M, X

SH (16:21)
Field used to specify a shift amount.
Formats: Z22

sh (30,16:20)
Fields that are concatenated to specify a shift amount.
Formats: MD, XS

SHB (22:25)
Field used to specify a shift amount in bytes.
Formats: VA

SHW (22:23)
Field used to specify a shift amount in words.
Formats: XX3

SI (16:20)
Immediate field used to specify a 5-bit signed integer.
Formats: X

SI (16:31)
Immediate field used to specify a 16-bit signed integer.
Formats: D

SIM (11:15)
Immediate field used to specify a 5-bit signed integer.
Formats: VX

SP (11:12)
Immediate field that specifies signed versus unsigned conversion.
Formats: X

SPR (11:20)
Field used to specify a Special Purpose Register for the mtspr and mfspr instructions.
Formats: X

SR (12:15)
Field used by the Segment Register Manipulation instructions (see Book III).
Formats: X

SX,S (28,6:10)
Fields SX and S are concatenated to specify a VSR to be used as a source.
Formats: DQ

SX,S (31,6:10)
Fields SX and S are concatenated to specify a VSR to be used as a source.
Formats: XX1

TBR (11:20)
Field used by the Move From Time Base instruction (see Section 6.1 of Book II).
Formats: X

TE (11:15)
Immediate field that specifies a DFP exponent.
Formats: Z23

TH (6:10)
Field used by the data stream variant of the dcbt and dcbtst instructions (see Section 4.3.2 of Book II).
Formats: X

TO (6:10)
Field used to specify the conditions on which to trap. The encoding is described in Section 3.3.10.1, “Character-Type Compare Instructions” on page 87.
Formats: TX, X
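
As an illustration of how the bit ranges in these field definitions are used, the following C sketch extracts a few fields from a 32-bit instruction word. It assumes that bit 0 is the most significant bit of the instruction, as in the bit numbering used throughout this document. The helper names field, field_rt, field_si16, and field_xs are hypothetical and are not defined by the architecture.

```c
#include <stdint.h>

/* Extract bits first:last (inclusive) of a 32-bit instruction word.
 * Bit 0 is the most significant bit, bit 31 the least significant. */
static uint32_t field(uint32_t insn, unsigned first, unsigned last)
{
    unsigned width = last - first + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (insn >> (31 - last)) & mask;
}

/* RT (6:10): number of the target GPR. */
static unsigned field_rt(uint32_t insn)
{
    return field(insn, 6, 10);
}

/* SI (16:31): 16-bit signed immediate, returned sign-extended. */
static int32_t field_si16(uint32_t insn)
{
    uint32_t v = field(insn, 16, 31);
    return (int32_t)(v ^ 0x8000u) - 0x8000;
}

/* SX,S (31,6:10): SX is concatenated on the left of S, giving a
 * 6-bit VSR number in the range 0..63. */
static unsigned field_xs(uint32_t insn)
{
    return (field(insn, 31, 31) << 5) | field(insn, 6, 10);
}
```

For example, the word 0x3860FFFF has 14 in bits 0:5, 3 in bits 6:10, and 0xFFFF in bits 16:31, so field_rt yields 3 and field_si16 yields -1.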
Defined
Illegal
Reserved
The class is determined by examining the opcode, and the extended opcode if any. If the opcode, or combination of opcode and extended opcode, is not that of a defined instruction or a reserved instruction, the instruction is illegal.
1.9 Forms of Defined Instructions

1.9.1 Preferred Instruction Forms

Some of the defined instructions have preferred forms. For such an instruction, the preferred form will execute in an efficient manner, but any other form may take significantly longer to execute than the preferred form.

Instructions having preferred forms are:
- the Condition Register Logical instructions
- the Load Quadword instruction
- the Move Assist instructions
- the Or Immediate instruction (preferred form of no-op)
- the Move To Condition Register Fields instruction

1.9.2 Invalid Instruction Forms

Some of the defined instructions can be coded in a form that is invalid. An instruction form is invalid if one or more fields of the instruction, excluding the opcode field(s), are coded incorrectly in a manner that can be deduced by examining only the instruction encoding.

In general, any attempt to execute an invalid form of an instruction will either cause the system illegal instruction error handler to be invoked or yield boundedly undefined results. Exceptions to this rule are stated in the instruction descriptions.

Some instruction forms are invalid because the instruction contains a reserved value in a defined field (see Section 1.3.3 on page 5); these invalid forms are not discussed further. All other invalid forms are identified in the instruction descriptions.

References to instructions elsewhere in this document assume the instruction form is not invalid, unless otherwise stated or obvious from context.

Assembler Note
Assemblers should report uses of invalid instruction forms as errors.

1.9.3 Reserved-no-op Instructions

Reserved-no-op instructions include the following extended opcodes under primary opcode 31: 530, 562, 594, 626, 658, 690, 722, and 754.

Reserved-no-op instructions are provided in the architecture to anticipate the eventual adoption of performance hint instructions to the architecture. For these instructions, which cause no visible change to architected state, employing a reserved-no-op opcode will allow software to use this new capability on new implementations that support it while remaining compatible with existing implementations that may not support the new function.

When a reserved-no-op instruction is executed, no operation is performed.

Reserved-no-op instructions are not assigned instruction names or mnemonics. There are no individual descriptions of reserved-no-op instructions in this document.

1.10 Exceptions

There are two kinds of exception, those caused directly by the execution of an instruction and those caused by an asynchronous event. In either case, the exception may cause one of several components of the system software to be invoked.

The exceptions that can be caused directly by the execution of an instruction include the following:
- an attempt to execute an illegal instruction, or an attempt by an application program to execute a “privileged” instruction (see Book III) (system illegal instruction error handler or system privileged instruction error handler)
- the execution of a defined instruction using an invalid form (system illegal instruction error handler or system privileged instruction error handler)
- an attempt to execute an instruction that is not provided by the implementation (system illegal instruction error handler)
- an attempt to access a storage location that is unavailable (system instruction storage error handler or system data storage error handler)
- an attempt to access storage with an effective address alignment that is invalid for the instruction (system alignment error handler)
- the execution of a System Call or System Call Vectored instruction (system service program)
- the execution of a Trap instruction that traps (system trap handler)
- the execution of a floating-point instruction that causes a floating-point enabled exception to exist (system floating-point enabled exception error handler)
- the execution of an auxiliary processor instruction that causes an auxiliary processor enabled exception to exist (system auxiliary processor enabled exception error handler)

The exceptions that can be caused by an asynchronous event are described in Book III.

The invocation of the system error handler is precise, except that the invocation of the auxiliary processor enabled exception error handler may be imprecise, and
if one of the imprecise modes for invoking the system floating-point enabled exception error handler is in effect (see page 133), then the invocation of the system floating-point enabled exception error handler may also be imprecise. When the system error handler is invoked imprecisely, the excepting instruction does not appear to complete before the next instruction starts (because one of the effects of the excepting instruction, namely the invocation of the system error handler, has not yet occurred).

Additional information about exception handling can be found in Book III.

1.11 Storage Addressing

A program references storage using the effective address computed by the processor when it executes a Storage Access or Branch instruction (or certain other instructions described in Book II and Book III), or when it fetches the next sequential instruction.

Bytes in storage are numbered consecutively starting with 0. Each number is the address of the corresponding byte.

The byte ordering (Big-Endian or Little-Endian) for a storage access is specified by the operating system. This byte ordering is also referred to as the Endian mode and it applies to both data accesses and instruction fetches. The Endian mode is specified by the LE mode bit (see Section 3.2.1 of Book III), which applies to all of storage.

1.11.1 Storage Operands

A storage operand may be a byte, a halfword, a word, a doubleword, or a quadword, or, for the Load/Store Multiple and Move Assist instructions, a sequence of bytes (Move Assist) or words (Load/Store Multiple). The address of a storage operand is the address of its first byte (i.e., of its lowest-numbered byte). An instruction for which the storage operand is a byte is said to cause a byte access, and similarly for halfword, word, doubleword, and quadword.

The length of the storage operand is the number of bytes (of the storage operand) that the instruction would access in the absence of invocations of the system error handler. The length is generally implied by the name of the instruction (equivalently, by the opcode, and extended opcode if any). For example, the length of the storage operand of a Load Word and Zero, Load Floating-Point Single, and Load Vector Element Word instruction is four bytes (one word), and the length of a Store Quadword, Store Floating-Point Double Pair, and Store VSX Vector Word*4 instruction is 16 bytes (one quadword). The only exceptions are the Load/Store Multiple and Move Assist instructions, for which the length of the storage operand is implied by the identity of the specified source or target register (Load/Store Multiple), or by an immediate field in the instruction or the contents of a field in the XER (Move Assist), as well as by the name of the instruction. For example, the length of the storage operand of a Load Multiple Word instruction for which the specified target register is GPR 20 is 48 bytes ((32-20)x4), and the length of the storage operand of a Load String Word Immediate instruction for which the immediate field contains the number 20 is 20 bytes.

The storage operand of a Load or Store instruction other than a Load/Store Multiple or Move Assist instruction is said to be aligned if the address of the storage operand is an integral multiple of the storage operand length; otherwise it is said to be unaligned. See the following table. (The storage operand of a Load/Store Multiple or Move Assist instruction is neither said to be aligned nor said to be unaligned. Its alignment properties are described, when necessary, using terms such as “word-aligned”, which are defined below.)

Operand      Length     Addr60:63 if aligned
Byte         8 bits     xxxx
Halfword     2 bytes    xxx0
Word         4 bytes    xx00
Doubleword   8 bytes    x000
Quadword     16 bytes   0000

Note: An “x” in an address bit position indicates that the bit can be 0 or 1 independent of the contents of other bits in the address.

The concept of alignment is also applied more generally, to any datum in storage.
- A datum having length that is an integral power of 2 is said to be aligned if its address is an integral multiple of its length.
- A datum of any length is said to be halfword-aligned (or aligned at a halfword boundary) if its address is an integral multiple of 2, word-aligned (or aligned at a word boundary) if its address is an integral multiple of 4, etc. (All data in storage is byte-aligned.)

The concept of alignment can also be applied to data in registers, with the "address" of the datum interpreted as the byte number of the datum in the register. E.g., a word element (4 bytes) in a Vector Register is said to be aligned if its byte number is an integral multiple of 4.

Programming Note
The technical literature sometimes uses the term “naturally aligned” to mean “aligned.”

Versions of the architecture that precede Version 2.07 also used “naturally aligned” as defined above. The term was dropped from the architecture in Version 2.07 because it seemed to mean different things to different readers and is not needed.
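
The alignment rule and the Load Multiple Word length example in Section 1.11.1 can be expressed in a few lines of C. This is a minimal sketch, not part of the architecture: the function names is_aligned and lmw_operand_length are hypothetical, and is_aligned assumes the length is a power of 2 (1, 2, 4, 8, or 16 bytes).

```c
#include <stdbool.h>
#include <stdint.h>

/* A storage operand is aligned if its address is an integral multiple
 * of its length. For a power-of-2 length this is equivalent to the
 * low-order address bits being zero, as in the Addr60:63 table. */
static bool is_aligned(uint64_t ea, unsigned length)
{
    return (ea & (uint64_t)(length - 1)) == 0;
}

/* Length in bytes of the storage operand of Load Multiple Word when
 * the target register is GPR rt (0..31): (32 - rt) words of 4 bytes. */
static unsigned lmw_operand_length(unsigned rt)
{
    return (32u - rt) * 4u;
}
```

With these definitions, an address of 0x1004 is word-aligned but not doubleword-aligned, and lmw_operand_length(20) gives the 48 bytes quoted in the text.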
Some instructions require their storage operands to have certain alignments. In addition, alignment may affect performance. In general, the best performance is obtained when storage operands are aligned.

When a storage operand of length N bytes starting at effective address EA is copied between storage and a register that is R bytes long (i.e., the register contains bytes numbered from 0, most significant, through R-1, least significant), the bytes of the operand are placed into the register or into storage in a manner that depends on the byte ordering for the storage access as shown in Figure 28, unless otherwise specified in the instruction description.

Big-Endian Byte Ordering
  Load                                  Store
  for i=0 to N-1:                       for i=0 to N-1:
    RT_{(R-N)+i} ← MEM(EA+i,1)            MEM(EA+i,1) ← (RS)_{(R-N)+i}

Little-Endian Byte Ordering
  Load                                  Store
  for i=0 to N-1:                       for i=0 to N-1:
    RT_{(R-1)-i} ← MEM(EA+i,1)            MEM(EA+i,1) ← (RS)_{(R-1)-i}

Notes:
1. In this table, subscripts refer to bytes in a register rather than to bits as defined in Section 1.3.2.
2. This table does not apply to the lvebx, lvehx, lvewx, stvebx, stvehx, and stvewx instructions.

Figure 28. Storage operands and byte ordering

Figure 29 shows an example of a C language structure s containing an assortment of scalars and one character string. The value assumed to be in each structure element is shown in hex in the C comments; these values are used below to show how the bytes making up each structure element are mapped into storage. It is assumed that structure s is compiled for 32-bit mode or for a 32-bit implementation. (This affects the length of the pointer to c.)

C structure mapping rules permit the use of padding (skipped bytes) in order to align the scalars on desirable boundaries. Figures 30 and 31 show each scalar as aligned. This alignment introduces padding of four bytes between a and b, one byte between d and e, and two bytes between e and f. The same amount of padding is present for both Big-Endian and Little-Endian mappings.

The Big-Endian mapping of structure s is shown in Figure 30. Addresses are shown in hex at the left of each doubleword, and in small figures below each byte. The contents of each byte, as indicated in the C example in Figure 29, are shown in hex (as characters for the elements of the string).

The Little-Endian mapping of structure s is shown in Figure 31. Doublewords are shown laid out from right to left, which is the common way of showing storage maps for processors that implement only Little-Endian byte ordering.
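
The load rows of Figure 28 can be modeled directly in C. The sketch below is illustrative only: it fixes the register length at R = 8 bytes, assumes N is at most 8, and assumes that register bytes not written by the load are zero (as for a zero-extending load). The name load_operand is hypothetical.

```c
#include <stdint.h>

/* Model of the Load rows of Figure 28 for an 8-byte register (R = 8).
 * mem points at the N bytes starting at EA; reg[0] is the most
 * significant register byte and reg[7] the least significant. */
static uint64_t load_operand(const uint8_t *mem, unsigned n, int little_endian)
{
    uint8_t reg[8] = {0};   /* assumption: unwritten bytes are zero */
    for (unsigned i = 0; i < n; i++) {
        /* Big-Endian: RT byte (R-N)+i; Little-Endian: RT byte (R-1)-i */
        unsigned idx = little_endian ? (8 - 1) - i : (8 - n) + i;
        reg[idx] = mem[i];
    }
    uint64_t value = 0;
    for (unsigned i = 0; i < 8; i++)
        value = (value << 8) | reg[i];
    return value;
}
```

For the four bytes 0x11, 0x12, 0x13, 0x14 at increasing addresses, a Big-Endian word load produces 0x11121314 and a Little-Endian word load produces 0x14131211.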
struct {
int a; /* 0x1112_1314 word */
double b; /* 0x2122_2324_2526_2728 doubleword */
char * c; /* 0x3132_3334 word */
char d[7]; /* ‘A’, ‘B’, ‘C’, ‘D’, ‘E’, ‘F’, ‘G’ array of bytes */
short e; /* 0x5152 halfword */
int f; /* 0x6162_6364 word */
} s;
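
The padding described for structure s (four bytes between a and b, one byte between d and e, two bytes between e and f) can be reproduced with a small layout calculation. This sketch does not ask the host compiler for the layout, because pointer size and the alignment of double differ between platforms; instead it hard-codes the assumptions stated in the text: 32-bit pointers and every scalar aligned to its own length. The names align_up and layout_s are hypothetical.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of 2). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Compute the offsets of members a, b, c, d, e, f of structure s.
 * offsets[] receives the six offsets; the return value is the total
 * number of padding bytes inserted between members. */
static size_t layout_s(size_t offsets[6])
{
    /* size and alignment of: int a, double b, char *c (32-bit),
     * char d[7], short e, int f */
    static const size_t size[6]  = {4, 8, 4, 7, 2, 4};
    static const size_t align[6] = {4, 8, 4, 1, 2, 4};
    size_t offset = 0, padding = 0;

    for (int i = 0; i < 6; i++) {
        size_t start = align_up(offset, align[i]);
        padding += start - offset;   /* bytes skipped before member i */
        offsets[i] = start;
        offset = start + size[i];
    }
    return padding;
}
```

Under these assumptions the members land at offsets 0x00, 0x08, 0x10, 0x14, 0x1C, and 0x20, with seven padding bytes in total.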
loop:
cmplwi r5,0
beq done
lwzux r4,r5,r6
add r7,r7,r4
subi r5,r5,4
b loop
done:
stw r7,total
Figure 33. Assembly language program ‘p’
The Big-Endian mapping of program p is shown in
Figure 34 (assuming the program starts at address 0).
Programming Note
The terms Big-Endian and Little-Endian come from Part I, Chapter 4, of Jonathan Swift’s Gulliver’s Travels. Here is the complete passage, from the edition printed in 1734 by George Faulkner in Dublin.

... our Histories of six Thousand Moons make no Mention of any other Regions, than the two great Empires of Lilliput and Blefuscu. Which two mighty Powers have, as I was going to tell you, been engaged in a most obstinate War for six and thirty Moons past. It began upon the following Occasion. It is allowed on all Hands, that the primitive Way of breaking Eggs before we eat them, was upon the larger End: But his present Majesty’s Grand-father, while he was a Boy, going to eat an Egg, and breaking it according to the ancient Practice, happened to cut one of his Fingers. Whereupon the Emperor his Father, published an Edict, commanding all his Subjects, upon great Penalties, to break the smaller End of their Eggs. The People so highly resented this Law, that our Histories tell us, there have been six Rebellions raised on that Account; wherein one Emperor lost his Life, and another his Crown. These civil Commotions were constantly fomented by the Monarchs of Blefuscu; and when they were quelled, the Exiles always fled for Refuge to that Empire. It is computed that eleven Thousand Persons have, at several Times, suffered Death, rather than submit to break their Eggs at the smaller End. Many hundred large Volumes have been published upon this Controversy: But the Books of the Big-Endians have been long forbidden, and the whole Party rendered incapable by Law of holding Employments. During the Course of these Troubles, the Emperors of Blefuscu did frequently expostulate by their Ambassadors, accusing us of making a Schism in Religion, by offending against a fundamental Doctrine of our great Prophet Lustrog, in the fifty-fourth Chapter of the Brundrecal, (which is their Alcoran.) This, however, is thought to be a mere Strain upon the text: For the Words are these; That all true Believers shall break their Eggs at the convenient End: and which is the convenient End, seems, in my humble Opinion, to be left to every Man’s Conscience, or at least in the Power of the chief Magistrate to determine. Now the Big-Endian Exiles have found so much Credit in the Emperor of Blefuscu’s Court; and so much private Assistance and Encouragement from their Party here at home, that a bloody War has been carried on between the two Empires for six and thirty Moons with various Success; during which Time we have lost Forty Capital Ships, and a much greater Number of smaller Vessels, together with thirty thousand of our best Seamen and Soldiers; and the Damage received by the Enemy is reckoned to be somewhat greater than ours. However, they have now equipped a numerous Fleet, and are just preparing to make a Descent upon us: and his Imperial Majesty, placing great Confidence in your Valour and Strength, hath commanded me to lay this Account of his Affairs before you.
1.11.3 Effective Address Calculation

An effective address is computed by the processor when executing a Storage Access or Branch instruction (or certain other instructions described in Book II and Book III), when fetching the next sequential instruction, or when invoking a system error handler. The following provides an overview of this process. More detail is provided in the individual instruction descriptions.

Effective address calculations, for both data and instruction accesses, use 64-bit two’s complement addition. All 64 bits of each address component participate in the calculation regardless of mode (32-bit or 64-bit). In this computation one operand is an address (which is by definition an unsigned number) and the second is a signed offset. Carries out of the most significant bit are ignored.

In 64-bit mode, the entire 64-bit result comprises the 64-bit effective address. The effective address arithmetic wraps around from the maximum address, 2^64 - 1, to address 0, except that if the current instruction is at effective address 2^64 - 4 the effective address of the next sequential instruction is undefined.

In 32-bit mode, the low-order 32 bits of the 64-bit result, preceded by 32 0 bits, comprise the 64-bit effective address for the purpose of addressing storage. When an effective address is placed into a register by an instruction or event, the value placed into the high-order 32 bits of the register is as follows.
- Load with Update and Store with Update instructions set the high-order 32 bits of register RA to the high-order 32 bits of the 64-bit result.
- In all other cases (e.g., the Link Register when set by Branch instructions having LK=1, Special Purpose Registers when set to an effective address by invocation of a system error handler) the high-order 32 bits of the register
are set to 0s except as described in the last sentence of this paragraph.

As used to address storage, the effective address arithmetic appears to wrap around from the maximum address, 2^32 - 1, to address 0, except that if the current instruction is at effective address 2^32 - 4 the effective address of the next sequential instruction is undefined.

RA is a field in the instruction which specifies an address component in the computation of an effective address. A zero in the RA field indicates the absence of the corresponding address component. A value of zero is substituted for the absent component of the effective address computation. This substitution is shown in the instruction descriptions as (RA|0).

Effective addresses are computed as follows. In the descriptions below, it should be understood that “the contents of a GPR” refers to the entire 64-bit contents, independent of mode, but that in 32-bit mode only bits 32:63 of the 64-bit result of the computation are used to address storage.
- With X-form instructions, in computing the effective address of a data element, the contents of the GPR designated by RB (or the value zero for lswi and stswi) are added to the contents of the GPR designated by RA or to zero if RA=0 or RA is not used in forming the EA.
- With D-form instructions, the 16-bit D field is sign-extended to form a 64-bit address component. In computing the effective address of a data element, this address component is added to the contents of the GPR designated by RA or to zero if RA=0.
- With DS-form instructions, the 14-bit DS field is concatenated on the right with 0b00 and sign-extended to form a 64-bit address component. In computing the effective address of a data element, this address component is added to the contents of the GPR designated by RA or to zero if RA=0.
- With DQ-form instructions, the 12-bit DQ field is concatenated on the right with 0b0000 and sign-extended to form a 64-bit address component. In computing the effective address of a data element, this address component is added to the contents of the GPR designated by RA or to zero if RA=0.
- With I-form Branch instructions, the 24-bit LI field is concatenated on the right with 0b00 and sign-extended to form a 64-bit address component. If AA=0, this address component is added to the address of the Branch instruction to form the effective address of the target instruction. If AA=1, this address component is the effective address of the target instruction.
- With B-form Branch instructions, the 14-bit BD field is concatenated on the right with 0b00 and sign-extended to form a 64-bit address component. If AA=0, this address component is added to the address of the Branch instruction to form the effective address of the target instruction. If AA=1, this address component is the effective address of the target instruction.
- With XL-form Branch instructions, bits 0:61 of the Link Register or the Count Register are concatenated on the right with 0b00 to form the effective address of the target instruction.
- With sequential instruction fetching, the value 4 is added to the address of the current instruction to form the effective address of the next instruction, except that if the current instruction is at the maximum instruction effective address for the mode (2^64 - 4 in 64-bit mode, 2^32 - 4 in 32-bit mode) the effective address of the next sequential instruction is undefined.

If the size of the operand of a Storage Access instruction is more than one byte, the effective address for each byte after the first is computed by adding 1 to the effective address of the preceding byte.
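
The D-form and DS-form rules of Section 1.11.3 can be sketched in C as follows. This is an illustration under stated assumptions rather than a definition: the function names sign_extend, ea_d_form, and ea_ds_form are hypothetical, the ra_is_zero flag models the (RA|0) substitution when the RA field of the instruction is 0, and the result is the address as used to address storage (so in 32-bit mode the high-order 32 bits are returned as 0s). Unsigned 64-bit arithmetic is used so that carries out of the most significant bit are ignored.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of value to 64 bits (1 <= bits <= 32).
 * The result is returned as an unsigned value modulo 2^64. */
static uint64_t sign_extend(uint64_t value, unsigned bits)
{
    uint64_t sign = 1ull << (bits - 1);
    value &= (sign << 1) - 1;
    return (value ^ sign) - sign;
}

/* D-form: EA = (RA|0) + EXTS(D), with D a 16-bit field. */
static uint64_t ea_d_form(uint64_t ra_contents, int ra_is_zero,
                          uint16_t d, int mode64)
{
    uint64_t base = ra_is_zero ? 0 : ra_contents;
    uint64_t ea = base + sign_extend(d, 16);   /* wraps modulo 2^64 */
    return mode64 ? ea : (ea & 0xFFFFFFFFull);
}

/* DS-form: EA = (RA|0) + EXTS(DS || 0b00), with DS a 14-bit field. */
static uint64_t ea_ds_form(uint64_t ra_contents, int ra_is_zero,
                           uint16_t ds, int mode64)
{
    uint64_t base = ra_is_zero ? 0 : ra_contents;
    uint64_t disp = (uint64_t)(ds & 0x3FFFu) << 2;
    uint64_t ea = base + sign_extend(disp, 16);
    return mode64 ? ea : (ea & 0xFFFFFFFFull);
}
```

For example, with (RA) = 0x1000 and D = 0xFFFC (that is, -4), the 64-bit mode effective address is 0x0FFC; with (RA) = 0x00000001FFFFFFFF and D = 1 in 32-bit mode, the address used for storage wraps to 0.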
2.1 Branch Facility Overview

This chapter describes the registers and instructions that make up the Branch Facility.

2.2 Instruction Execution Order

In general, instructions appear to execute sequentially, in the order in which they appear in storage. The exceptions to this rule are listed below.
- Branch instructions for which the branch is taken cause execution to continue at the target address specified by the Branch instruction.
- Trap instructions for which the trap conditions are satisfied, and System Call and System Call Vectored instructions, cause the appropriate system handler to be invoked.
- Transaction failure will eventually cause the transaction’s failure handler, implied by the tbegin. instruction, to be invoked. See the programming note following the tbegin. description in Section 5.5 of Book II.
- Exceptions can cause the system error handler to be invoked, as described in Section 1.10, “Exceptions” on page 23.
- Returning from a system service program, system trap handler, or system error handler causes execution to continue at a specified address.

The model of program execution in which the processor appears to execute one instruction at a time, completing each instruction before beginning to execute the next instruction is called the “sequential execution model”. In general, the processor obeys the sequential execution model. For the instructions and facilities defined in this Book, the only exceptions to this rule are the following.
- A floating-point exception occurs when the processor is running in one of the Imprecise floating-point exception modes (see Section 4.4). The instruction that causes the exception need not complete before the next instruction begins execution, with respect to setting exception bits and (if the exception is enabled) invoking the system error handler.
- A Store instruction modifies one or more bytes in an area of storage that contains instructions that will subsequently be executed. Before an instruction in that area of storage is executed, software synchronization is required to ensure that the instructions executed are consistent with the results produced by the Store instruction.

Programming Note
This software synchronization will generally be provided by system library programs (see Section 1.10 of Book II). Application programs should call the appropriate system library program before attempting to execute modified instructions.
if (64-bit mode)
  then M ← 0
  else M ← 32
if (target_register)_{M:63} < 0 then c ← 0b100
else if (target_register)_{M:63} > 0 then c ← 0b010
else c ← 0b001
CR0 ← c || XER_SO

If any portion of the result is undefined, then the value placed into the first three bits of CR Field 0 is undefined.

The bits of CR Field 0 are interpreted as follows.

Programming Note
Setting of bit 3 of the specified CR field to zero by tcheck and of field CR0_3 to zero by other TM instructions is intended to preserve these bits for future function. Software should not depend on the bits being zero.

The paste. instruction (see Section 4.4, “Copy-Paste Facility”, in Book II) and the stbcx., sthcx., stwcx., stdcx., and stqcx. instructions (see Section 4.6.2, “Load and Reserve and Store Conditional Instructions”, in Book II) also set CR Field 0.

For all floating-point instructions in which Rc=1, CR Field 1 (bits 36:39 of the Condition Register) is set to the Floating-Point exception status, copied from bits 32:35 of the Floating-Point Status and Control Register. This occurs regardless of whether any exceptions are enabled, and regardless of whether the writing of the result is suppressed (see Section 4.4, “Floating-Point Exceptions” on page 132). These bits are interpreted as follows.

Bit  Description
32   Floating-Point Exception Summary (FX)
     This is a copy of the contents of FPSCR_FX at the completion of the instruction.
33   Floating-Point Enabled Exception Summary (FEX)
     This is a copy of the contents of FPSCR_FEX at the completion of the instruction.
34   Floating-Point Invalid Operation Exception Summary (VX)
     This is a copy of the contents of FPSCR_VX at the completion of the instruction.
35   Floating-Point Overflow Exception (OX)
     This is a copy of the contents of FPSCR_OX at the completion of the instruction.

For Compare instructions, a specified CR field is set to reflect the result of the comparison. The bits of the specified CR field are interpreted as follows. A complete description of how the bits are set is given in the instruction descriptions in Section 3.3.10, “Fixed-Point Compare Instructions” on page 85, and Section 4.6.8, “Floating-Point Compare Instructions” on page 168.

Bit  Description
0    Less Than, Floating-Point Less Than (LT, FL)
     For fixed-point Compare instructions, (RA) < SI or (RB) (signed comparison) or (RA) <u UI or (RB) (unsigned comparison). For floating-point Compare instructions, (FRA) < (FRB).
1    Greater Than, Floating-Point Greater Than (GT, FG)
     For fixed-point Compare instructions, (RA) > SI or (RB) (signed comparison) or (RA) >u UI or (RB) (unsigned comparison). For floating-point Compare instructions, (FRA) > (FRB).
2    Equal, Floating-Point Equal (EQ, FE)
     For fixed-point Compare instructions, (RA) = SI, UI, or (RB). For floating-point Compare instructions, (FRA) = (FRB).
3    Summary Overflow, Floating-Point Unordered (SO,FU)
     For fixed-point Compare instructions, this is a copy of the contents of XER_SO at the completion of the instruction. For floating-point Compare instructions, one or both of (FRA) and (FRB) is a NaN.

The Vector Integer Compare instructions (see Section 6.9.3, “Vector Integer Compare Instructions”) compare two Vector Registers element by element, interpreting the elements as unsigned or signed integers depending on the instruction, and set the corresponding element of the target Vector Register to all 1s if the relation being tested is true and 0s if the relation being tested is false.

If Rc=1, CR Field 6 is set to reflect the result of the comparison, as follows

Bit  Description
0    The relation is true for all element pairs (i.e., VRT is set to all 1s).
1    0
2    The relation is false for all element pairs (i.e., VRT is set to all 0s).
3    0

The Vector Floating-Point Compare instructions compare two Vector Registers word element by word element, interpreting the elements as single-precision floating-point numbers. With the exception of the Vector Compare Bounds Floating-Point instruction, they set the target Vector Register, and CR Field 6 if Rc=1, in the same manner as do the Vector Integer Compare instructions.

Bit  Description
0    The relation is true for all element pairs (i.e., VRT is set to all 1s).
1    0
2    The relation is false for all element pairs (i.e., VRT is set to all 0s).
3    0

The Vector Compare Bounds Floating-Point instruction on page 331 sets CR Field 6 if Rc=1, to indicate whether the elements in VRA are within the bounds specified by the corresponding element in VRB, as explained in the instruction description. A single-precision floating-point value x is said to be “within the bounds” specified by a single-precision floating-point value y if -y ≤ x ≤ y.

Bit  Description
0    0
1    0
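
Returning to the rule for setting CR Field 0 given by the pseudocode earlier in this section (M, c, and CR0 ← c || XER_SO), the following C sketch shows one way to model it. It is illustrative only: the function name cr0_from_result is hypothetical, and the 4-bit field is returned with CR0 bit 0 as the most significant bit of the result (value 8) and the XER SO copy as the least significant bit (value 1).

```c
#include <stdint.h>

/* Model of: M <- 0 (64-bit mode) or 32 (32-bit mode);
 * c <- 0b100 / 0b010 / 0b001 for negative / positive / zero
 * (target_register) bits M:63; CR0 <- c || XER_SO. */
static unsigned cr0_from_result(uint64_t target, int mode64, unsigned xer_so)
{
    /* Only bits M:63 of the target register take part in the comparison. */
    uint64_t bits = mode64 ? target : (target & 0xFFFFFFFFull);
    uint64_t sign = mode64 ? (1ull << 63) : (1ull << 31);
    unsigned c;

    if (bits & sign)
        c = 0x4;            /* 0b100: result is negative */
    else if (bits != 0)
        c = 0x2;            /* 0b010: result is positive */
    else
        c = 0x1;            /* 0b001: result is zero */

    return (c << 1) | (xer_so & 1u);
}
```

The same 64-bit value can therefore give different CR Field 0 settings in the two modes: 0x0000000100000000 is positive in 64-bit mode but compares as zero in 32-bit mode, where only bits 32:63 are examined.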
2.4.1 Branch History Rolling Buffer Entry Format

Branch History Rolling Buffer Entries (BHRBEs) have the following format.

Effective Address   T    P
0                   62   63

00     This entry either is not implemented or has been cleared. There are no valid entries beyond the current entry.
01-11  Reserved.
BH   Hint
00   bclr[l]: The instruction is a subroutine return
     bcctr[l] and bctar[l]: The instruction is not a subroutine return; the target address is likely to be the same as the target address used the preceding time the branch was taken
01   bclr[l]: The instruction is not a subroutine return; the target address is likely to be the same as the target address used the preceding time the branch was taken
     bcctr[l] and bctar[l]: Reserved
10   Reserved
11   bclr[l], bcctr[l], and bctar[l]: The target address is not predictable

Figure 43. BH field encodings
Programming Note
The hint provided by the BH field is independent of
the hint provided by the “at” bits (e.g., the BH field
provides no indication of whether the branch is
likely to be taken).
Programming Note
The hints provided by the “at” bits and by the BH
field do not affect the results of executing the
instruction.
The “z” bits should be set to 0, because they may
be assigned a meaning in some future version of
the architecture.
Programming Note
Many implementations have dynamic mechanisms for predicting the target addresses of bclr[l] and bcctr[l] instructions. These mechanisms may cache return addresses (i.e., Link Register values set by Branch instructions for which LK=1 and for which the branch was taken, other than the special form shown in the first example below) and recently used branch target addresses. To obtain the best performance across the widest range of implementations, the programmer should obey the following rules.
- Use Branch instructions for which LK=1 only as subroutine calls (including function calls, etc.), or in the special form shown in the first example below.
- Pair each subroutine call (i.e., each Branch instruction for which LK=1 and the branch is taken, other than the special form shown in the first example below) with a bclr instruction that returns from the subroutine and has BH=0b00.
- Do not use bclrl as a subroutine call. (Some implementations access the return address cache at most once per instruction; such implementations are likely to treat bclrl as a subroutine return, and not as a subroutine call.)
- For bclr[l] and bcctr[l], use the appropriate value in the BH field.

The following are examples of programming conventions that obey these rules. In the examples, BH is assumed to contain 0b00 unless otherwise stated. In addition, the “at” bits are assumed to be coded appropriately.

Let A, B, and Glue be specific programs.

Obtaining the address of the next instruction:
Use the following form of Branch and Link.
    bcl 20,31,$+4

Loop counts:
Keep them in the Count Register, and use a bc instruction (LK=0) to decrement the count and to branch back to the beginning of the loop if the decremented count is nonzero.

Computed goto’s, case statements, etc.:
Use the Count Register to hold the address to branch to, and use a bcctr instruction (LK=0, and BH=0b11 if appropriate) to branch to the selected address.

Direct subroutine linkage:
Here A calls B and B returns to A. The two branches should be as follows.
- A calls B: use a bl or bcl instruction (LK=1).
- B returns to A: use a bclr instruction (LK=0) (the return address is in, or can be restored to, the Link Register).

Indirect subroutine linkage:
Here A calls Glue, Glue calls B, and B returns to A rather than to Glue. (Such a calling sequence is common in linkage code used when the subroutine that the programmer wants to call, here B, is in a different module from the caller; the Binder inserts “glue” code to mediate the branch.) The three branches should be as follows.
- A calls Glue: use a bl or bcl instruction (LK=1).
- Glue calls B: place the address of B into the Count Register, and use a bcctr instruction (LK=0).
- B returns to A: use a bclr instruction (LK=0) (the return address is in, or can be restored to, the Link Register).

Function call:
Here A calls a function, the identity of which may vary from one instance of the call to another, instead of calling a specific program B. This case should be handled using the conventions of the preceding two bullets, depending on whether the call is direct or indirect, with the following differences.
- If the call is direct, place the address of the function into the Count Register, and use a bcctrl instruction (LK=1) instead of a bl or bcl instruction.
- For the bcctr[l] instruction that branches to the function, use BH=0b11 if appropriate.
Compatibility Note
The bits corresponding to the current “a” and “t”
bits, and to the current “z” bits except in the “branch
always” BO encoding, had different meanings in
versions of the architecture that precede Version
2.00.
The bit corresponding to the “t” bit was called
the “y” bit. The “y” bit indicated whether to use
the architected default prediction (y=0) or to
use the complement of the default prediction
(y=1). The default prediction was defined as
follows.
- If the instruction is bc[l][a] with a negative
value in the displacement field, the branch
is taken. (This is the only case in which
the prediction corresponding to the “y” bit
differs from the prediction corresponding
to the “t” bit.)
- In all other cases (bc[l][a] with a nonnega-
tive value in the displacement field, bclr[l],
or bcctr[l]), the branch is not taken.
The BO encodings that test both the Count
Register and the Condition Register had a “y”
bit in place of the current “z” bit. The meaning
of the “y” bit was as described in the preceding
item.
The “a” bit was a “z” bit.
Because these bits have always been defined
either to be ignored or to be treated as hints, a
given program will produce the same result on any
implementation regardless of the values of the bits.
Also, because even the “y” bit is ignored, in prac-
tice, by most processors that comply with versions
of the architecture that precede Version 2.00, the
performance of a given program on those proces-
sors will not be affected by the values of the bits.
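The default prediction that this note attributes to versions before 2.00 can be sketched in a few lines. This is only an illustrative model: the function name and its arguments (`is_bc`, `displacement`, `y`) are hypothetical, and the code says nothing about how any real processor predicts branches.

```python
def predict_taken_pre_v200(is_bc, displacement, y):
    """Model of the pre-Version-2.00 static prediction described above.

    is_bc        -- True for bc[l][a], False for bclr[l] or bcctr[l]
    displacement -- signed value of the displacement field (ignored if not bc)
    y            -- the "y" bit: 0 = use the default, 1 = use its complement
    """
    # Default: only a bc with a negative displacement is predicted taken.
    default_taken = is_bc and displacement < 0
    # y=1 selects the complement of the default prediction.
    return default_taken != bool(y)
```

For example, a backward `bc` with y=0 is predicted taken, while the same branch with y=1 is predicted not taken.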
18 LI AA LK
0 6 30 31

16 BO BI BD AA LK
0 6 11 16 30 31
Programming Note
bclr, bclrl, bcctr, and bcctrl each serve as both a
basic and an extended mnemonic. The Assembler
will recognize a bclr, bclrl, bcctr, or bcctrl mne-
monic with three operands as the basic form, and a
bclr, bclrl, bcctr, or bcctrl mnemonic with two
operands as the extended form. In the extended
form the BH operand is omitted and assumed to be
0b00.
19 BO BI /// BH 560 LK
0 6 11 16 19 21 31
if (64-bit mode)
then M ← 0
else M ← 32
if ¬BO2 then CTR ← CTR - 1
ctr_ok ← BO2 | ((CTR[M:63] ≠ 0) ⊕ BO3)
cond_ok ← BO0 | (CR[BI+32] ≡ BO1)
if ctr_ok & cond_ok then NIA ←iea TAR[0:61] || 0b00
if LK then LR ←iea CIA + 4
BI+32 specifies the Condition Register bit to be tested.
The BO field is used to resolve the branch as described
in Figure 41. The BH field is used as described in
Figure 43. The branch target address is
TAR[0:61] || 0b00, with the high-order 32 bits of the
branch target address set to 0 in 32-bit mode.
If LK=1 then the effective address of the instruction fol-
lowing the Branch instruction is placed into the Link
Register.
Special Registers Altered:
CTR (if BO2=0)
LR (if LK=1)
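The branch-resolution pseudocode above can be modelled as a small function. This is a sketch under stated assumptions, not a definitive implementation: the function name, the argument layout (BO passed as a 5-bit integer with BO0 as its most significant bit, the tested CR bit passed as 0 or 1) and the returned tuple are all hypothetical conveniences.

```python
MASK64 = (1 << 64) - 1

def bctar_resolve(bo, cr_bit, ctr, tar, mode64=True):
    """Return (taken, target_or_None, new_ctr) for the bctar pseudocode.

    bo     -- 5-bit BO field, BO0 is the most significant bit
    cr_bit -- value (0 or 1) of Condition Register bit BI+32
    ctr    -- current Count Register value
    tar    -- current Target Address Register value
    """
    bo_bits = [(bo >> (4 - i)) & 1 for i in range(5)]  # bo_bits[k] is BOk
    if not bo_bits[2]:
        ctr = (ctr - 1) & MASK64                       # CTR <- CTR - 1
    # Only bits M:63 of CTR are tested (M=0 in 64-bit mode, 32 otherwise).
    ctr_tested = ctr if mode64 else ctr & 0xFFFFFFFF
    ctr_ok = bool(bo_bits[2]) or ((ctr_tested != 0) != bool(bo_bits[3]))
    cond_ok = bool(bo_bits[0]) or (cr_bit == bo_bits[1])
    target = tar & ~0b11 & MASK64                      # TAR[0:61] || 0b00
    if not mode64:
        target &= 0xFFFFFFFF                           # high-order 32 bits are 0
    taken = ctr_ok and cond_ok
    return taken, (target if taken else None), ctr
```

With `bo=0b10100` neither the Count Register nor the Condition Register is consulted, so the branch is always taken; with `bo=0b10000` the Count Register is decremented and the branch is taken only while the decremented value is nonzero.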
Programming Note
In some systems, the system software will restrict
usage of the bctar[l] instruction to only selected
programs. If an attempt is made to execute the
instruction when it is not available, the system error
handler will be invoked. See Book III for additional
information.
19 BT BA BB 257 / 19 BT BA BB 225 /
0 6 11 16 21 31 0 6 11 16 21 31
19 BT BA BB 449 / 19 BT BA BB 193 /
0 6 11 16 21 31 0 6 11 16 21 31
19 BT BA BB 33 / 19 BT BA BB 289 /
0 6 11 16 21 31 0 6 11 16 21 31
19 BT BA BB 129 / 19 BT BA BB 417 /
0 6 11 16 21 31 0 6 11 16 21 31
19 BF // BFA // /// 0 /
0 6 9 11 14 16 21 31
CR[4×BF+32:4×BF+35] ← CR[4×BFA+32:4×BFA+35]
The contents of Condition Register field BFA are copied
to Condition Register field BF.
Special Registers Altered:
CR field BF
Programming Note
sc serves as both a basic and an extended mne-
monic. The Assembler will recognize an sc mne-
monic with one operand as the basic form, and an
sc mnemonic with no operand as the extended
form. In the extended form the LEV operand is
omitted and assumed to be 0.
In application programs the value of the LEV oper-
and for sc should be 0.
Programming Note
In order to read all the BHRB entries containing
information about taken branches, software should
read the entries starting from entry number 0 and
continuing until an entry containing all 0s is read or
until all implemented BHRB entries have been
read.
Since the number of BHRB entries may decrease
or the BHRB may be cleared at any time, if a given
entry, m, is read as not containing all 0s and is read
again subsequently, the subsequent read may
return all 0s even though the program has not exe-
cuted clrbhrb.
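The read procedure in this note amounts to a bounded loop. A minimal sketch, assuming a hypothetical callable `read_entry(n)` that stands in for whatever primitive returns BHRB entry n, and a hypothetical `num_implemented` count supplied by the caller:

```python
def read_bhrb_entries(read_entry, num_implemented):
    """Collect BHRB entries from entry 0 until an all-zero entry or the end."""
    entries = []
    for n in range(num_implemented):
        value = read_entry(n)
        if value == 0:        # an all-0s entry marks the end of valid data
            break
        entries.append(value)
    return entries
```

Because the buffer may shrink or be cleared at any time, a caller should treat the returned list as a snapshot and should not assume that re-reading an entry yields the same value.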
3.2.1 General Purpose Registers
All manipulation of information is done in registers internal
to the Fixed-Point Facility. The principal storage
internal to the Fixed-Point Facility is a set of 32 General
Purpose Registers (GPRs). See Figure 44.

The bits are set based on the operation of an instruction
considered as a whole, not on intermediate results
(e.g., the Subtract From Carrying instruction, the result
of which is specified as the sum of three values, sets
bits in the Fixed-Point Exception Register based on the
entire operation, not on an intermediate sum).

instructions, or by other instructions (except
mtspr to the XER, and mcrxr) that cannot
overflow.
34 Carry (CA)
The Carry bit is set as follows, during execution
of certain instructions. Add Carrying, Subtract
From Carrying, Add Extended, and
Subtract From Extended types of instructions
set it to 1 if there is a carry out of bit M, and
set it to 0 otherwise. Shift Right Algebraic
instructions set it to 1 if any 1-bits have been
shifted out of a negative operand, and set it to
0 otherwise. The CA bit is not altered by Compare
instructions, or by other instructions
(except Shift Right Algebraic, mtspr to the
XER, and mcrxr) that cannot carry.
35:43 Reserved
44 Overflow32 (OV32)
OV32 is set whenever OV is set, and is set to
the same value that OV is defined to be set to
in 32-bit mode.
45 Carry32 (CA32)
CA32 is set whenever CA is set, and is set to
the same value that CA is defined to be set to
in 32-bit mode.

When the value read by an ldmx instruction is equal to
an effective address within an enabled section of the
Load Monitored region specified by these registers, a
Load Monitored event-based exception occurs if
BESCR[GE LME] = (1,1). (See Section 7.2.1 of Book II.)
3.2.4.1 Load Monitored Region Register
The Load Monitored Region Register (LMRR) specifies
the base address and size of the Load Monitored
region.
Base Effective Address /// Size
0 39 60
Figure 46. Load Monitored Region Register
0:38 Base Effective Address
This field specifies the most-significant bits of
the effective address of the first byte in the
Load Monitored region. Only the high-order
39-S bits of the field are used, where S is the
value specified in the Size field; the remaining
S+25 bits of the effective address are
assumed to be 0s. For example, for a 32 MB
region all 39 bits of the field are used, and for
a 1 TB region only bits 0:23 are used.
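The relationship between the Size field and the usable base-address bits can be checked with a little arithmetic. The sketch below is hedged: it assumes the Size value S occupies the four low-order bits (60:63) of the register, as the layout figure suggests, and that the region size is 32 MB × 2^S, which is inferred from the two examples in the text (32 MB uses all 39 bits, 1 TB uses bits 0:23) rather than taken from a stated encoding table. The function name is hypothetical.

```python
def lmrr_region(lmrr):
    """Return (base_ea, size_in_bytes) implied by a 64-bit LMRR value."""
    s = lmrr & 0xF                 # Size field, assumed to be bits 60:63
    low_zero_bits = s + 25         # address bits assumed to be 0s
    # Keep only the high-order 39-S bits of the Base Effective Address field.
    base_ea = (lmrr >> low_zero_bits) << low_zero_bits
    size = 1 << low_zero_bits      # 2**25 bytes = 32 MB when S = 0
    return base_ea, size
```

With S=0 this yields a 32 MB region whose base keeps all 39 field bits; with S=15 it yields a 1 TB region whose base keeps only the 24 high-order bits, matching the examples above.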
Programming Note
In some implementations, the Load Algebraic and
Load with Update instructions may have greater
latency than other types of Load instructions. More-
over, Load with Update instructions may take
longer to execute in some implementations than
the corresponding pair of a non-update Load
instruction and an Add instruction.
Load Byte and Zero D-form
lbz RT,D(RA)
34 RT RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
RT ← ⁵⁶0 || MEM(EA, 1)
Let the effective address (EA) be the sum (RA|0)+ D.
The byte in storage addressed by EA is loaded into
RT[56:63]. RT[0:55] are set to 0.
Special Registers Altered:
None

Load Byte and Zero Indexed X-form
lbzx RT,RA,RB
31 RT RA RB 87 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
RT ← ⁵⁶0 || MEM(EA, 1)
Let the effective address (EA) be the sum
(RA|0)+ (RB). The byte in storage addressed by EA is
loaded into RT[56:63]. RT[0:55] are set to 0.
Special Registers Altered:
None
Load Byte and Zero with Update D-form
lbzu RT,D(RA)
35 RT RA D
0 6 11 16 31
EA ← (RA) + EXTS(D)
RT ← ⁵⁶0 || MEM(EA, 1)
RA ← EA
Let the effective address (EA) be the sum (RA)+ D. The
byte in storage addressed by EA is loaded into RT[56:63].
RT[0:55] are set to 0.
EA is placed into register RA.
If RA=0 or RA=RT, the instruction form is invalid.
Special Registers Altered:
None

Load Byte and Zero with Update Indexed X-form
lbzux RT,RA,RB
31 RT RA RB 119 /
0 6 11 16 21 31
EA ← (RA) + (RB)
RT ← ⁵⁶0 || MEM(EA, 1)
RA ← EA
Let the effective address (EA) be the sum (RA)+ (RB).
The byte in storage addressed by EA is loaded into
RT[56:63]. RT[0:55] are set to 0.
EA is placed into register RA.
If RA=0 or RA=RT, the instruction form is invalid.
Special Registers Altered:
None
Load Halfword and Zero D-form
lhz RT,D(RA)
40 RT RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
RT ← ⁴⁸0 || MEM(EA, 2)
Let the effective address (EA) be the sum (RA|0)+ D.
The halfword in storage addressed by EA is loaded into
RT[48:63]. RT[0:47] are set to 0.
Special Registers Altered:
None

Load Halfword and Zero Indexed X-form
lhzx RT,RA,RB
31 RT RA RB 279 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
RT ← ⁴⁸0 || MEM(EA, 2)
Let the effective address (EA) be the sum
(RA|0)+ (RB). The halfword in storage addressed by
EA is loaded into RT[48:63]. RT[0:47] are set to 0.
Special Registers Altered:
None
Load Halfword and Zero with Update Load Halfword and Zero with Update
D-form Indexed X-form
lhzu RT,D(RA) lhzux RT,RA,RB
41 RT RA D 31 RT RA RB 311 /
0 6 11 16 31 0 6 11 16 21 31
Load Halfword Algebraic D-form
lha RT,D(RA)
42 RT RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
RT ← EXTS(MEM(EA, 2))
Let the effective address (EA) be the sum (RA|0)+ D.
The halfword in storage addressed by EA is loaded into
RT[48:63]. RT[0:47] are filled with a copy of bit 0 of the
loaded halfword.
Special Registers Altered:
None

Load Halfword Algebraic Indexed X-form
lhax RT,RA,RB
31 RT RA RB 343 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
RT ← EXTS(MEM(EA, 2))
Let the effective address (EA) be the sum
(RA|0)+ (RB). The halfword in storage addressed by
EA is loaded into RT[48:63]. RT[0:47] are filled with a copy
of bit 0 of the loaded halfword.
Special Registers Altered:
None
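The difference between the "and Zero" and "Algebraic" halfword loads is only in how the upper 48 bits of RT are filled. A minimal sketch, assuming storage is modelled as a Python bytes object in Big-Endian byte order; the helper names `exts`, `load_halfword_zero` and `load_halfword_algebraic` are hypothetical:

```python
MASK64 = (1 << 64) - 1

def exts(value, bits):
    """Sign-extend the low `bits` bits of value to a 64-bit register image."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return ((value ^ sign) - sign) & MASK64

def load_halfword_zero(mem, ea):
    # High-order 48 bits of the result are 0.
    return int.from_bytes(mem[ea:ea + 2], "big")

def load_halfword_algebraic(mem, ea):
    # High-order 48 bits are copies of bit 0 of the loaded halfword.
    return exts(load_halfword_zero(mem, ea), 16)
```

For the storage bytes 0x80 0x01, the zero-extending form yields 0x0000_0000_0000_8001 and the algebraic form yields 0xFFFF_FFFF_FFFF_8001.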
Load Halfword Algebraic with Update Load Halfword Algebraic with Update
D-form Indexed X-form
lhau RT,D(RA) lhaux RT,RA,RB
43 RT RA D 31 RT RA RB 375 /
0 6 11 16 31 0 6 11 16 21 31
Load Word and Zero D-form
lwz RT,D(RA)
32 RT RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
RT ← ³²0 || MEM(EA, 4)
Let the effective address (EA) be the sum (RA|0)+ D.
The word in storage addressed by EA is loaded into
RT[32:63]. RT[0:31] are set to 0.
Special Registers Altered:
None

Load Word and Zero Indexed X-form
lwzx RT,RA,RB
31 RT RA RB 23 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
RT ← ³²0 || MEM(EA, 4)
Let the effective address (EA) be the sum
(RA|0)+ (RB). The word in storage addressed by EA is
loaded into RT[32:63]. RT[0:31] are set to 0.
Special Registers Altered:
None
Load Word and Zero with Update D-form
lwzu RT,D(RA)
33 RT RA D
0 6 11 16 31
EA ← (RA) + EXTS(D)
RT ← ³²0 || MEM(EA, 4)
RA ← EA
Let the effective address (EA) be the sum (RA)+ D. The
word in storage addressed by EA is loaded into
RT[32:63]. RT[0:31] are set to 0.
EA is placed into register RA.
If RA=0 or RA=RT, the instruction form is invalid.
Special Registers Altered:
None

Load Word and Zero with Update Indexed X-form
lwzux RT,RA,RB
31 RT RA RB 55 /
0 6 11 16 21 31
EA ← (RA) + (RB)
RT ← ³²0 || MEM(EA, 4)
RA ← EA
Let the effective address (EA) be the sum (RA)+ (RB).
The word in storage addressed by EA is loaded into
RT[32:63]. RT[0:31] are set to 0.
EA is placed into register RA.
If RA=0 or RA=RT, the instruction form is invalid.
Special Registers Altered:
None
Load Word Algebraic DS-form
lwa RT,DS(RA)
58 RT RA DS 2
0 6 11 16 30 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(DS || 0b00)
RT ← EXTS(MEM(EA, 4))
Let the effective address (EA) be the sum
(RA|0)+ (DS||0b00). The word in storage addressed by
EA is loaded into RT[32:63]. RT[0:31] are filled with a copy
of bit 0 of the loaded word.
Special Registers Altered:
None

Load Word Algebraic Indexed X-form
lwax RT,RA,RB
31 RT RA RB 341 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
RT ← EXTS(MEM(EA, 4))
Let the effective address (EA) be the sum
(RA|0)+ (RB). The word in storage addressed by EA is
loaded into RT[32:63]. RT[0:31] are filled with a copy of bit 0
of the loaded word.
Special Registers Altered:
None
31 RT RA RB 373 /
0 6 11 16 21 31
EA ← (RA) + (RB)
RT ← EXTS(MEM(EA, 4))
RA ← EA
Let the effective address (EA) be the sum (RA)+ (RB).
The word in storage addressed by EA is loaded into
RT[32:63]. RT[0:31] are filled with a copy of bit 0 of the
loaded word.
EA is placed into register RA.
If RA=0 or RA=RT, the instruction form is invalid.
Special Registers Altered:
None
Load Doubleword DS-form
ld RT,DS(RA)
58 RT RA DS 0
0 6 11 16 30 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(DS || 0b00)
RT ← MEM(EA, 8)
Let the effective address (EA) be the sum
(RA|0)+ (DS||0b00). The doubleword in storage
addressed by EA is loaded into RT.
Special Registers Altered:
None

Load Doubleword Indexed X-form
ldx RT,RA,RB
31 RT RA RB 21 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
RT ← MEM(EA, 8)
Let the effective address (EA) be the sum
(RA|0)+ (RB). The doubleword in storage addressed by
EA is loaded into RT.
Special Registers Altered:
None
Load Doubleword with Update DS-form Load Doubleword with Update Indexed
X-form
ldu RT,DS(RA)
ldux RT,RA,RB
58 RT RA DS 1
0 6 11 16 30 31 31 RT RA RB 53 /
0 6 11 16 21 31
Programming Note
Warning: The ldmx instruction should be in stor-
age to which the event-based branch handler has
read access (see Section 5.7.14 of Book III),
because the event-based branch handler may need
to load the ldmx instruction from storage in order to
determine the instruction's register usage.
Store Byte D-form
stb RS,D(RA)
38 RS RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
MEM(EA, 1) ← (RS)[56:63]
Let the effective address (EA) be the sum (RA|0)+ D.
(RS)[56:63] are stored into the byte in storage addressed
by EA.
Special Registers Altered:
None

Store Byte Indexed X-form
stbx RS,RA,RB
31 RS RA RB 215 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 1) ← (RS)[56:63]
Let the effective address (EA) be the sum
(RA|0)+ (RB). (RS)[56:63] are stored into the byte in
storage addressed by EA.
Special Registers Altered:
None
Store Byte with Update D-form Store Byte with Update Indexed X-form
stbu RS,D(RA) stbux RS,RA,RB
39 RS RA D 31 RS RA RB 247 /
0 6 11 16 31 0 6 11 16 21 31
Store Halfword D-form
sth RS,D(RA)
44 RS RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
MEM(EA, 2) ← (RS)[48:63]
Let the effective address (EA) be the sum (RA|0)+ D.
(RS)[48:63] are stored into the halfword in storage
addressed by EA.
Special Registers Altered:
None

Store Halfword Indexed X-form
sthx RS,RA,RB
31 RS RA RB 407 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 2) ← (RS)[48:63]
Let the effective address (EA) be the sum
(RA|0)+ (RB). (RS)[48:63] are stored into the halfword in
storage addressed by EA.
Special Registers Altered:
None
Store Halfword with Update D-form
sthu RS,D(RA)
45 RS RA D
0 6 11 16 31
EA ← (RA) + EXTS(D)
MEM(EA, 2) ← (RS)[48:63]
RA ← EA
Let the effective address (EA) be the sum (RA)+ D.
(RS)[48:63] are stored into the halfword in storage
addressed by EA.
EA is placed into register RA.
If RA=0, the instruction form is invalid.
Special Registers Altered:
None

Store Halfword with Update Indexed X-form
sthux RS,RA,RB
31 RS RA RB 439 /
0 6 11 16 21 31
EA ← (RA) + (RB)
MEM(EA, 2) ← (RS)[48:63]
RA ← EA
Let the effective address (EA) be the sum (RA)+ (RB).
(RS)[48:63] are stored into the halfword in storage
addressed by EA.
EA is placed into register RA.
If RA=0, the instruction form is invalid.
Special Registers Altered:
None
Store Word D-form
stw RS,D(RA)
36 RS RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
MEM(EA, 4) ← (RS)[32:63]
Let the effective address (EA) be the sum (RA|0)+ D.
(RS)[32:63] are stored into the word in storage addressed
by EA.
Special Registers Altered:
None

Store Word Indexed X-form
stwx RS,RA,RB
31 RS RA RB 151 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 4) ← (RS)[32:63]
Let the effective address (EA) be the sum
(RA|0)+ (RB). (RS)[32:63] are stored into the word in
storage addressed by EA.
Special Registers Altered:
None
Store Word with Update D-form Store Word with Update Indexed X-form
stwu RS,D(RA) stwux RS,RA,RB
37 RS RA D 31 RS RA RB 183 /
0 6 11 16 31 0 6 11 16 21 31
Store Doubleword DS-form
std RS,DS(RA)
62 RS RA DS 0
0 6 11 16 30 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(DS || 0b00)
MEM(EA, 8) ← (RS)
Let the effective address (EA) be the sum
(RA|0)+ (DS||0b00). (RS) is stored into the doubleword
in storage addressed by EA.
Special Registers Altered:
None

Store Doubleword Indexed X-form
stdx RS,RA,RB
31 RS RA RB 149 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 8) ← (RS)
Let the effective address (EA) be the sum
(RA|0)+ (RB). (RS) is stored into the doubleword in
storage addressed by EA.
Special Registers Altered:
None
Store Doubleword with Update DS-form Store Doubleword with Update Indexed
X-form
stdu RS,DS(RA)
stdux RS,RA,RB
62 RS RA DS 1
0 6 11 16 30 31 31 RS RA RB 181 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(DQ || 0b0000)
RTp ← MEM(EA, 16)
Let the effective address (EA) be the sum (RA|0)+
(DQ||0b0000). The quadword in storage addressed by
EA is loaded into register pair RTp.
62 RSp RA DS 2
0 6 11 16 30 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(DS || 0b00)
MEM(EA, 16) ← RSp
Let the effective address (EA) be the sum (RA|0)+
(DS||0b00). The contents of register pair RSp are
stored into the quadword in storage addressed by EA.
Programming Note
In versions of the architecture prior to V. 2.07, this
instruction was privileged.
Load Halfword Byte-Reverse Indexed X-form
lhbrx RT,RA,RB
31 RT RA RB 790 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
load_data ← MEM(EA, 2)
RT ← ⁴⁸0 || load_data[8:15] || load_data[0:7]
Let the effective address (EA) be the sum (RA|0)+(RB).
Bits 0:7 of the halfword in storage addressed by EA are
loaded into RT[56:63]. Bits 8:15 of the halfword in storage
addressed by EA are loaded into RT[48:55]. RT[0:47] are
set to 0.
Special Registers Altered:
None

Store Halfword Byte-Reverse Indexed X-form
sthbrx RS,RA,RB
31 RS RA RB 918 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 2) ← (RS)[56:63] || (RS)[48:55]
Let the effective address (EA) be the sum
(RA|0)+ (RB). (RS)[56:63] are stored into bits 0:7 of the
halfword in storage addressed by EA. (RS)[48:55] are
stored into bits 8:15 of the halfword in storage
addressed by EA.
Special Registers Altered:
None
Load Word Byte-Reverse Indexed X-form
lwbrx RT,RA,RB
31 RT RA RB 534 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
load_data ← MEM(EA, 4)
RT ← ³²0 || load_data[24:31] || load_data[16:23]
     || load_data[8:15] || load_data[0:7]
Let the effective address (EA) be the sum
(RA|0)+ (RB). Bits 0:7 of the word in storage addressed
by EA are loaded into RT[56:63]. Bits 8:15 of the word in
storage addressed by EA are loaded into RT[48:55]. Bits
16:23 of the word in storage addressed by EA are
loaded into RT[40:47]. Bits 24:31 of the word in storage
addressed by EA are loaded into RT[32:39]. RT[0:31] are
set to 0.
Special Registers Altered:
None

Store Word Byte-Reverse Indexed X-form
stwbrx RS,RA,RB
31 RS RA RB 662 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 4) ← (RS)[56:63] || (RS)[48:55] || (RS)[40:47]
             || (RS)[32:39]
Let the effective address (EA) be the sum
(RA|0)+ (RB). (RS)[56:63] are stored into bits 0:7 of the
word in storage addressed by EA. (RS)[48:55] are stored
into bits 8:15 of the word in storage addressed by EA.
(RS)[40:47] are stored into bits 16:23 of the word in
storage addressed by EA. (RS)[32:39] are stored into bits
24:31 of the word in storage addressed by EA.
Special Registers Altered:
None
Load Doubleword Byte-Reverse Indexed X-form
ldbrx RT,RA,RB
31 RT RA RB 532 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
load_data ← MEM(EA, 8)
RT ← load_data[56:63] || load_data[48:55]
     || load_data[40:47] || load_data[32:39]
     || load_data[24:31] || load_data[16:23]
     || load_data[8:15] || load_data[0:7]
Let the effective address (EA) be the sum (RA|0)+(RB).
Bits 0:7 of the doubleword in storage addressed by EA
are loaded into RT[56:63]. Bits 8:15 of the doubleword in
storage addressed by EA are loaded into RT[48:55]. Bits
16:23 of the doubleword in storage addressed by EA
are loaded into RT[40:47]. Bits 24:31 of the doubleword in
storage addressed by EA are loaded into RT[32:39]. Bits
32:39 of the doubleword in storage addressed by EA
are loaded into RT[24:31]. Bits 40:47 of the doubleword in
storage addressed by EA are loaded into RT[16:23]. Bits
48:55 of the doubleword in storage addressed by EA
are loaded into RT[8:15]. Bits 56:63 of the doubleword in
storage addressed by EA are loaded into RT[0:7].
Special Registers Altered:
None

Store Doubleword Byte-Reverse Indexed X-form
stdbrx RS,RA,RB
31 RS RA RB 660 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 8) ← (RS)[56:63] || (RS)[48:55]
             || (RS)[40:47] || (RS)[32:39]
             || (RS)[24:31] || (RS)[16:23]
             || (RS)[8:15] || (RS)[0:7]
Let the effective address (EA) be the sum
(RA|0)+ (RB). (RS)[56:63] are stored into bits 0:7 of the
doubleword in storage addressed by EA. (RS)[48:55] are
stored into bits 8:15 of the doubleword in storage
addressed by EA. (RS)[40:47] are stored into bits 16:23 of
the doubleword in storage addressed by EA. (RS)[32:39]
are stored into bits 24:31 of the doubleword in storage
addressed by EA. (RS)[24:31] are stored into bits 32:39 of
the doubleword in storage addressed by EA. (RS)[16:23]
are stored into bits 40:47 of the doubleword in storage
addressed by EA. (RS)[8:15] are stored into bits 48:55 of
the doubleword in storage addressed by EA. (RS)[0:7]
are stored into bits 56:63 of the doubleword in storage
addressed by EA.
Special Registers Altered:
None
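The byte-reverse loads and stores above place the first byte in storage into the low-order byte of the register, and so on. A minimal sketch of that mapping, assuming storage is modelled as a Python bytearray whose index 0 corresponds to bits 0:7 of the addressed unit; the function names are hypothetical:

```python
def load_byte_reversed(mem, ea, nbytes):
    """Model of lhbrx / lwbrx / ldbrx for nbytes = 2, 4 or 8."""
    # The byte at EA lands in RT[56:63], the next byte in RT[48:55], etc.;
    # for 2- and 4-byte loads the remaining high-order bits are 0.
    return int.from_bytes(mem[ea:ea + nbytes], "little")

def store_byte_reversed(mem, ea, nbytes, rs):
    """Model of sthbrx / stwbrx / stdbrx for nbytes = 2, 4 or 8."""
    low = rs & ((1 << (8 * nbytes)) - 1)   # only the low-order bytes are stored
    mem[ea:ea + nbytes] = low.to_bytes(nbytes, "little")
```

For the storage bytes 0x11 0x22 0x33 0x44, a 2-byte byte-reversed load gives 0x2211 and a 4-byte one gives 0x44332211.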
Load Multiple Word D-form
lmw RT,D(RA)
46 RT RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
r ← RT
do while r ≤ 31
  GPR(r) ← ³²0 || MEM(EA, 4)
  r ← r + 1
  EA ← EA + 4
Let n = (32-RT). Let the effective address (EA) be the
sum (RA|0)+ D.
n consecutive words starting at EA are loaded into the
low-order 32 bits of GPRs RT through 31. The
high-order 32 bits of these GPRs are set to zero.
If RA is in the range of registers to be loaded, including
the case in which RA=0, the instruction form is invalid.
This instruction is not supported in Little-Endian mode.
If it is executed in Little-Endian mode, the system alignment
error handler is invoked.
Special Registers Altered:
None

Store Multiple Word D-form
stmw RS,D(RA)
47 RS RA D
0 6 11 16 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
r ← RS
do while r ≤ 31
  MEM(EA, 4) ← GPR(r)[32:63]
  r ← r + 1
  EA ← EA + 4
Let n = (32-RS). Let the effective address (EA) be the
sum (RA|0)+ D.
n consecutive words starting at EA are stored from the
low-order 32 bits of GPRs RS through 31.
This instruction is not supported in Little-Endian mode.
If it is executed in Little-Endian mode, the system alignment
error handler is invoked.
Special Registers Altered:
None
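The two loops above can be modelled directly. This is a sketch under stated assumptions: GPRs are a 32-element Python list of 64-bit integers, storage is a bytearray in Big-Endian byte order, and the function names are hypothetical. Invalid forms (for example RA within the loaded range) are not checked.

```python
def load_multiple_word(mem, gprs, rt, ea):
    """Load words into the low-order 32 bits of GPRs rt..31; high bits become 0."""
    for r in range(rt, 32):
        gprs[r] = int.from_bytes(mem[ea:ea + 4], "big")
        ea += 4

def store_multiple_word(mem, gprs, rs, ea):
    """Store the low-order 32 bits of GPRs rs..31 as consecutive words."""
    for r in range(rs, 32):
        mem[ea:ea + 4] = (gprs[r] & 0xFFFFFFFF).to_bytes(4, "big")
        ea += 4
```

Starting at register 29, each call transfers n = 32 - 29 = 3 words, matching the definition of n in the text.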
Load String Word Immediate X-form
lswi RT,RA,NB
31 RT RA NB 597 /
0 6 11 16 21 31
if RA = 0 then EA ← 0
else EA ← (RA)
if NB = 0 then n ← 32
else n ← NB
r ← RT - 1
i ← 32
do while n > 0
  if i = 32 then
    r ← r + 1 (mod 32)
    GPR(r) ← 0
  GPR(r)[i:i+7] ← MEM(EA, 1)
  i ← i + 8
  if i = 64 then i ← 32
  EA ← EA + 1
  n ← n - 1
Let the effective address (EA) be (RA|0). Let n = NB if
NB≠0, n = 32 if NB=0; n is the number of bytes to load.
Let nr=CEIL(n/4); nr is the number of registers to
receive data.
n consecutive bytes starting at EA are loaded into
GPRs RT through RT+nr-1. Data are loaded into the
low-order four bytes of each GPR; the high-order four
bytes are set to 0.
Bytes are loaded left to right in each register. The
sequence of registers wraps around to GPR 0 if
required. If the low-order four bytes of register RT+nr-1
are only partially filled, the unfilled low-order byte(s) of
that register are set to 0.
If RA is in the range of registers to be loaded, including
the case in which RA=0, the instruction form is invalid.
This instruction is not supported in Little-Endian mode.
If it is executed in Little-Endian mode, the system alignment
error handler is invoked.
Special Registers Altered:
None

Load String Word Indexed X-form
lswx RT,RA,RB
31 RT RA RB 533 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
n ← XER[57:63]
r ← RT - 1
i ← 32
RT ← undefined
do while n > 0
  if i = 32 then
    r ← r + 1 (mod 32)
    GPR(r) ← 0
  GPR(r)[i:i+7] ← MEM(EA, 1)
  i ← i + 8
  if i = 64 then i ← 32
  EA ← EA + 1
  n ← n - 1
Let the effective address (EA) be the sum
(RA|0)+ (RB). Let n=XER[57:63]; n is the number of bytes
to load. Let nr=CEIL(n/4); nr is the number of registers
to receive data.
If n>0, n consecutive bytes starting at EA are loaded
into GPRs RT through RT+nr-1. Data are loaded into
the low-order four bytes of each GPR; the high-order
four bytes are set to 0.
Bytes are loaded left to right in each register. The
sequence of registers wraps around to GPR 0 if
required. If the low-order four bytes of register RT+nr-1
are only partially filled, the unfilled low-order byte(s) of
that register are set to 0.
If n=0, the contents of register RT are undefined.
If RA or RB is in the range of registers to be loaded,
including the case in which RA=0, the instruction is
treated as if the instruction form were invalid. If RT=RA
or RT=RB, the instruction form is invalid.
This instruction is not supported in Little-Endian mode.
If it is executed in Little-Endian mode and n>0, the system
alignment error handler is invoked.
Special Registers Altered:
None
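The register wrap-around and partial-fill behaviour of the string loads is easiest to see by running the loop. The following sketch models the Load String Word Immediate pseudocode only; it assumes GPRs are a 32-element Python list, storage is a bytes-like object, and the caller has already computed EA. The function name is hypothetical and invalid forms are not checked.

```python
def load_string_word_immediate(mem, gprs, rt, ea, nb):
    """Model of the lswi loop: load n bytes into the low words of successive GPRs."""
    n = 32 if nb == 0 else nb
    r = rt - 1
    i = 32
    while n > 0:
        if i == 32:
            r = (r + 1) % 32          # wrap from GPR 31 to GPR 0
            gprs[r] = 0               # clears high word and unfilled low bytes
        # Bits i:i+7 (bit 0 is most significant) sit 56-i bits above bit 63.
        gprs[r] |= mem[ea] << (56 - i)
        i += 8
        if i == 64:
            i = 32
        ea += 1
        n -= 1
```

Loading six bytes 01 02 03 04 05 06 with RT=31 fills GPR 31 with 0x01020304 and then wraps to GPR 0, which receives 0x05060000 (the two unfilled low-order bytes are 0).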
Store String Word Immediate X-form
stswi RS,RA,NB
31 RS RA NB 725 /
0 6 11 16 21 31
if RA = 0 then EA ← 0
else EA ← (RA)
if NB = 0 then n ← 32
else n ← NB
r ← RS - 1
i ← 32
do while n > 0
  if i = 32 then r ← r + 1 (mod 32)
  MEM(EA, 1) ← GPR(r)[i:i+7]
  i ← i + 8
  if i = 64 then i ← 32
  EA ← EA + 1
  n ← n - 1
Let the effective address (EA) be (RA|0). Let n = NB if
NB≠0, n = 32 if NB=0; n is the number of bytes to store.
Let nr =CEIL(n/4); nr is the number of registers to
supply data.
n consecutive bytes starting at EA are stored from
GPRs RS through RS+nr-1. Data are stored from the
low-order four bytes of each GPR.
Bytes are stored left to right from each register. The
sequence of registers wraps around to GPR 0 if
required.
This instruction is not supported in Little-Endian mode.
If it is executed in Little-Endian mode, the system alignment
error handler is invoked.
Special Registers Altered:
None

Store String Word Indexed X-form
stswx RS,RA,RB
31 RS RA RB 661 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
n ← XER[57:63]
r ← RS - 1
i ← 32
do while n > 0
  if i = 32 then r ← r + 1 (mod 32)
  MEM(EA, 1) ← GPR(r)[i:i+7]
  i ← i + 8
  if i = 64 then i ← 32
  EA ← EA + 1
  n ← n - 1
Let the effective address (EA) be the sum
(RA|0)+ (RB). Let n = XER[57:63]; n is the number of
bytes to store. Let nr = CEIL(n/4); nr is the number of
registers to supply data.
If n>0, n consecutive bytes starting at EA are stored
from GPRs RS through RS+nr-1. Data are stored from
the low-order four bytes of each GPR.
Bytes are stored left to right from each register. The
sequence of registers wraps around to GPR 0 if
required.
If n=0, no bytes are stored.
This instruction is not supported in Little-Endian mode.
If it is executed in Little-Endian mode and n>0, the system
alignment error handler is invoked.
Special Registers Altered:
None
14 RT RA SI 15 RT RA SI
0 6 11 16 31 0 6 11 16 31
19 RT d1 d0 2 d2
D ← d0||d1||d2
RT ← NIA + EXTS(D || ¹⁶0)
The sum of NIA + (D || 0x0000) is placed into register
RT.
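The address computation above can be illustrated numerically. The sketch assumes that D is 16 bits wide and is assembled from a 10-bit d0, a 5-bit d1 and a 1-bit d2; those field widths are not stated in this excerpt and are labelled here as assumptions, as is the function name.

```python
MASK64 = (1 << 64) - 1

def add_pc_immediate_shifted(nia, d0, d1, d2):
    """Model of RT <- NIA + EXTS(D || 16 zero bits), with D = d0 || d1 || d2."""
    # Assumed widths: d0 = 10 bits, d1 = 5 bits, d2 = 1 bit (16 bits total).
    d = ((d0 & 0x3FF) << 6) | ((d1 & 0x1F) << 1) | (d2 & 1)
    imm = d << 16                     # D || 0x0000
    if imm & 0x8000_0000:             # sign-extend the 32-bit quantity
        imm -= 1 << 32
    return (nia + imm) & MASK64
```

With D=1 the result is NIA + 0x10000; with D=0xFFFF the result is NIA - 0x10000.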
31 RT RA RB OE 266 Rc 31 RT RA RB OE 40 Rc
0 6 11 16 21 22 31 0 6 11 16 21 22 31
RT ← (RA) + EXTS(SI)
The sum (RA) + SI is placed into register RT.
Special Registers Altered:
CA CA32
Extended Mnemonics:
Example of extended mnemonics for Add Immediate
Carrying:
Extended:            Equivalent to:
subic Rx,Ry,value    addic Rx,Ry,-value

RT ← (RA) + EXTS(SI)
The sum (RA) + SI is placed into register RT.
Special Registers Altered:
CR0 CA CA32
Extended Mnemonics:
Example of extended mnemonics for Add Immediate
Carrying and Record:
Extended:             Equivalent to:
subic. Rx,Ry,value    addic. Rx,Ry,-value
8 RT RA SI
0 6 11 16 31
RT ← ¬(RA) + EXTS(SI) + 1
The sum ¬(RA) + SI + 1 is placed into register RT.
Special Registers Altered:
CA CA32
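The expression ¬(RA) + SI + 1 is two's-complement subtraction (SI minus RA), and the carry out of the 64-bit sum is what CA records in 64-bit mode. A small sketch, assuming 64-bit mode only and modelling CA but not CA32; the function name is hypothetical:

```python
MASK64 = (1 << 64) - 1

def subtract_from_immediate_carrying(ra, si):
    """Return (RT, CA) for RT <- not(RA) + EXTS(SI) + 1 in 64-bit mode."""
    si &= 0xFFFF
    si64 = (si - (1 << 16) if si & 0x8000 else si) & MASK64   # EXTS(SI)
    total = ((~ra) & MASK64) + si64 + 1
    return total & MASK64, total >> 64     # carry out of bit 0 becomes CA
```

Computing 10 - 3 gives RT=7 with CA=1, while 3 - 10 gives the 64-bit two's-complement encoding of -7 with CA=0.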
31 RT RA RB OE 10 Rc 31 RT RA RB OE 8 Rc
0 6 11 16 21 22 31 0 6 11 16 21 22 31
31 RT RA RB OE 138 Rc 31 RT RA RB OE 136 Rc
0 6 11 16 21 22 31 0 6 11 16 21 22 31
Add to Minus One Extended XO-form
addme RT,RA (OE=0 Rc=0)
addme. RT,RA (OE=0 Rc=1)
addmeo RT,RA (OE=1 Rc=0)
addmeo. RT,RA (OE=1 Rc=1)
31 RT RA /// OE 234 Rc
0 6 11 16 21 22 31
RT ← (RA) + CA - 1
The sum (RA) + CA + ⁶⁴1 is placed into register RT.
Special Registers Altered:
CA CA32
CR0 (if Rc=1)
SO OV OV32 (if OE=1)

Subtract From Minus One Extended XO-form
subfme RT,RA (OE=0 Rc=0)
subfme. RT,RA (OE=0 Rc=1)
subfmeo RT,RA (OE=1 Rc=0)
subfmeo. RT,RA (OE=1 Rc=1)
31 RT RA /// OE 232 Rc
0 6 11 16 21 22 31
RT ← ¬(RA) + CA - 1
The sum ¬(RA) + CA + ⁶⁴1 is placed into register RT.
Special Registers Altered:
CA CA32
CR0 (if Rc=1)
SO OV OV32 (if OE=1)
RT ← (RA) + CA
The sum (RA) + CA is placed into register RT.
Special Registers Altered:
CA CA32
CR0 (if Rc=1)
SO OV OV32 (if OE=1)

RT ← ¬(RA) + CA
The sum ¬(RA) + CA is placed into register RT.
Special Registers Altered:
CA CA32
CR0 (if Rc=1)
SO OV OV32 (if OE=1)
Programming Note
The setting of CA and CA32 by the Add and Sub-
tract From instructions, including the Extended ver-
sions thereof, is mode-dependent. If a sequence of
these instructions is used to perform extended-pre-
cision addition or subtraction, the same mode
should be used throughout the sequence.
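Extended-precision addition chains the carry from one limb to the next, which is why the mode must stay the same for the whole sequence. A minimal sketch in 64-bit mode, assuming an Add Carrying step followed by an Add Extended step; the Python function names are hypothetical models, not mnemonics defined by this excerpt:

```python
MASK64 = (1 << 64) - 1

def add_carrying(a, b):
    """Low-limb add: returns (sum, CA) with CA the carry out of 64 bits."""
    total = a + b
    return total & MASK64, total >> 64

def add_extended(a, b, ca):
    """Higher-limb add: folds the incoming CA into the sum."""
    total = a + b + ca
    return total & MASK64, total >> 64

def add_128(a_hi, a_lo, b_hi, b_lo):
    """128-bit addition built from two 64-bit steps; returns (hi, lo, CA)."""
    lo, ca = add_carrying(a_lo, b_lo)
    hi, ca = add_extended(a_hi, b_hi, ca)
    return hi, lo, ca
```

If the low-limb step were evaluated with 32-bit carry semantics and the high-limb step with 64-bit semantics, the carry passed between them would no longer describe the same sum.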
Negate XO-form
neg RT,RA (OE=0 Rc=0)
neg. RT,RA (OE=0 Rc=1)
nego RT,RA (OE=1 Rc=0)
nego. RT,RA (OE=1 Rc=1)
31 RT RA /// OE 104 Rc
0 6 11 16 21 22 31
RT ← ¬(RA) + 1
The sum ¬(RA) + 1 is placed into register RT.
If the processor is in 64-bit mode and register RA con-
tains the most negative 64-bit number (0x8000_
0000_0000_0000), the result is the most negative num-
ber and, if OE=1, OV and OV32 are set to 1. Similarly, if
the processor is in 32-bit mode and (RA)32:63 contain
the most negative 32-bit number (0x8000_0000), the
low-order 32 bits of the result contain the most negative
32-bit number and, if OE=1, OV and OV32 are set to 1.
Special Registers Altered:
CR0 (if Rc=1)
SO OV OV32 (if OE=1)
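The most-negative-number case described above can be demonstrated with a short model. This sketch computes RT and the value OV would take when OE=1; OV32, SO and CR0 are deliberately left out, and the function name and `mode64` flag are hypothetical.

```python
MASK64 = (1 << 64) - 1

def negate(ra, mode64=True):
    """Return (RT, OV) for RT <- not(RA) + 1."""
    rt = ((~ra) + 1) & MASK64
    if mode64:
        overflow = ra == 0x8000_0000_0000_0000
    else:
        # In 32-bit mode only (RA)[32:63] decides the overflow case.
        overflow = (ra & 0xFFFF_FFFF) == 0x8000_0000
    return rt, overflow
```

Negating 0x8000_0000_0000_0000 in 64-bit mode returns the same value with overflow reported, because its positive counterpart is not representable.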
31 RT RA RB OE 235 Rc 31 RT RA RB / 11 Rc
0 6 11 16 21 22 31 0 6 11 16 21 22 31
31 RT RA RB OE 491 Rc 31 RT RA RB OE 459 Rc
0 6 11 16 21 22 31 0 6 11 16 21 22 31
31 RT RA RB 779 / 31 RT RA RB 267 /
0 6 11 16 21 31 0 6 11 16 21 31
31 RT RA RB OE 427 Rc 31 RT RA RB OE 395 Rc
0 6 11 16 21 22 31 0 6 11 16 21 22 31
Programming Note
L    Format
0    ³²0 || CRN[0:31]
1    CRN[0:63]
2    RRN[0:63]
3    reserved
Format above is for non-error conditions.
0xFFFFFFFF_FFFFFFFF for error conditions.
CRN = conditioned random number
RRN = raw random number
Programming Note
32-bit software running in an environment that does
not preserve the high-order 32 bits of GPRs across
invocations of the system error handler, signal han-
dlers, event-based branch handlers, etc. may use
the L=0 variant of darn and interpret the value
0xFFFFFFFF to indicate an error condition. The
fact that the error condition includes the valid value
0x00000000_FFFFFFFF together with the true
error value 0xFFFFFFFF_FFFFFFFF is not a prob-
lem.
Programming Note
When the error value is obtained, software is
expected to repeat the operation. If a non-error
value has not been obtained after several attempts,
a software random number generation method
should be used. The recommended number of
attempts may be implementation specific. In the
absence of other guidance, ten attempts should be
adequate.
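The retry policy in this note can be sketched as a bounded loop with a software fallback. The callable `darn` is a hypothetical stand-in for whatever wrapper executes the instruction and returns its 64-bit result; the use of Python's `secrets.randbits` as the fallback is an assumption chosen for illustration, not something the note prescribes.

```python
import secrets

DARN_ERROR = 0xFFFFFFFF_FFFFFFFF

def random_u64(darn, attempts=10):
    """Try the hardware source a few times, then fall back to software."""
    for _ in range(attempts):
        value = darn()
        if value != DARN_ERROR:       # any other value is a valid random number
            return value
    # Fallback: a software random number generation method.
    return secrets.randbits(64)
```

The default of ten attempts follows the guidance above; an implementation-specific recommendation, where one exists, would replace it.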
4 RT RA RB RC 51
0 6 11 16 21 26 31
31 RT RA RB OE 489 Rc 31 RT RA RB OE 457 Rc
0 6 11 16 21 22 31 0 6 11 16 21 22 31
Programming Note
Unsigned long division of a 128-bit dividend con-
tained in two 64-bit registers by a 64-bit divisor can
be accomplished using the technique described in
the Programming Note with the divweu instruction
description: divd[e]u would be used instead of
divw[e]u (and cmpld instead of cmplw, etc.).
31 RT RA RB 777 / 31 RT RA RB 265 /
0 6 11 16 21 31 0 6 11 16 21 31
Let dividend be the signed integer doubleword in register RA.
Let divisor be the signed integer doubleword in register RB.
The remainder of dividend divided by divisor is placed into register RT. The quotient is not supplied as a result.
The remainder is the unique signed integer that satisfies
  remainder = dividend - (quotient × divisor)
where 0 ≤ remainder < |divisor| if the dividend is nonnegative, and -|divisor| < remainder ≤ 0 if the dividend is negative.

Let dividend be the unsigned integer doubleword in register RA.
Let divisor be the unsigned integer doubleword in register RB.
The remainder of dividend divided by divisor is placed into register RT. The quotient is not supplied as a result.
The remainder is the unique unsigned integer that satisfies
  remainder = dividend - (quotient × divisor)
where 0 ≤ remainder < divisor.
If an attempt is made to perform any of the divisions
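The sign rule for the remainder can be illustrated with a short model. This is a sketch under stated assumptions: the function names `modsd` and `modud` are illustrative labels for the signed and unsigned doubleword forms, registers are modelled as Python integers masked to 64 bits, and `None` stands in for cases this sketch treats as having no defined result (a zero divisor, and the most negative dividend with a divisor of -1); the source sentence listing those cases is cut off, so that list is an assumption.

```python
M64 = (1 << 64) - 1


def to_signed64(x):
    """Interpret a 64-bit register image as a two's-complement integer."""
    x &= M64
    return x - (1 << 64) if x >> 63 else x


def modud(ra, rb):
    """Unsigned remainder: 0 <= remainder < divisor."""
    ra &= M64
    rb &= M64
    if rb == 0:
        return None  # assumed undefined case
    return ra % rb


def modsd(ra, rb):
    """Signed remainder: the result takes the sign of the dividend."""
    dividend, divisor = to_signed64(ra), to_signed64(rb)
    if divisor == 0 or (dividend == -(1 << 63) and divisor == -1):
        return None  # assumed undefined cases
    remainder = abs(dividend) % abs(divisor)
    if dividend < 0:
        remainder = -remainder
    return remainder & M64  # back to a 64-bit register image
```

Note that Python's own `%` rounds toward negative infinity, which is why the signed case works on absolute values and then restores the sign of the dividend.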
11 BF / L RA SI 31 BF / L RA RB 0 /
0 6 9 10 11 16 31 0 6 9 10 11 16 21 31
10 BF / L RA UI 31 BF / L RA RB 32 /
0 6 9 10 11 16 31 0 6 9 10 11 16 21 31
if L = 0 then a ← ³²0 || (RA)32:63
         else a ← (RA)
if      a <u (⁴⁸0 || UI) then c ← 0b100
else if a >u (⁴⁸0 || UI) then c ← 0b010
else                          c ← 0b001
CR4×BF+32:4×BF+35 ← c || XERSO
The contents of register RA ((RA)32:63 zero-extended to 64 bits if L=0) are compared with ⁴⁸0 || UI, treating the operands as unsigned integers. The result of the comparison is placed into CR field BF.
Special Registers Altered:
CR field BF
Extended Mnemonics:
Examples of extended mnemonics for Compare Logical Immediate:
Extended:               Equivalent to:
cmpldi Rx,value         cmpli 0,1,Rx,value
cmplwi cr3,Rx,value     cmpli 3,0,Rx,value

if L = 0 then a ← ³²0 || (RA)32:63
              b ← ³²0 || (RB)32:63
         else a ← (RA)
              b ← (RB)
if      a <u b then c ← 0b100
else if a >u b then c ← 0b010
else                c ← 0b001
CR4×BF+32:4×BF+35 ← c || XERSO
The contents of register RA ((RA)32:63 if L=0) are compared with the contents of register RB ((RB)32:63 if L=0), treating the operands as unsigned integers. The result of the comparison is placed into CR field BF.
Special Registers Altered:
CR field BF
Extended Mnemonics:
Examples of extended mnemonics for Compare Logical:
Extended:               Equivalent to:
cmpld Rx,Ry             cmpl 0,1,Rx,Ry
cmplw cr3,Rx,Ry         cmpl 3,0,Rx,Ry
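The register-register logical compare described above can be modelled in a few lines. This is a sketch only: `cmpl_field` is an illustrative name, registers are Python integers, and the function returns the 4-bit value that would be written to CR field BF (the 3-bit comparison result followed by the XER SO bit).

```python
M64 = (1 << 64) - 1


def cmpl_field(l, ra, rb, xer_so=0):
    """Return the 4-bit CR field value produced by a logical compare.

    l selects 32-bit (0) or 64-bit (1) operands; xer_so is the current
    summary overflow bit, which is copied into the low-order bit.
    """
    if l == 0:
        a = ra & 0xFFFF_FFFF  # (RA)32:63 zero-extended
        b = rb & 0xFFFF_FFFF  # (RB)32:63 zero-extended
    else:
        a = ra & M64
        b = rb & M64
    if a < b:
        c = 0b100
    elif a > b:
        c = 0b010
    else:
        c = 0b001
    return (c << 1) | (xer_so & 1)
```

With L=0 the high-order words are ignored, so two registers that differ only in bits 0:31 compare as equal.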
31 BF // RA RB 224 /
0 6 9 11 16 21 31
src1 ← GPR[RA].bit[56:63]
CR4×BF+32 ← 0b0
CR4×BF+33 ← match
CR4×BF+34 ← 0b0
CR4×BF+35 ← 0b0
Programming Note
cmpeqb is useful for implementing character
typing functions such as isspace() that are
implemented by comparing the character to 1 or
more values.
3 TO RA SI 31 TO RA RB 4 /
0 6 11 16 31 0 6 11 16 21 31
a ← EXTS((RA)32:63)
if (a < EXTS(SI)) & TO0 then TRAP
if (a > EXTS(SI)) & TO1 then TRAP
if (a = EXTS(SI)) & TO2 then TRAP
if (a <u EXTS(SI)) & TO3 then TRAP
if (a >u EXTS(SI)) & TO4 then TRAP
The contents of RA32:63 are compared with the sign-extended value of the SI field. If any bit in the TO field is set to 1 and its corresponding condition is met by the result of the comparison, the system trap handler is invoked.
If the trap conditions are met, this instruction is context synchronizing (see Book III).
Special Registers Altered:
None
Extended Mnemonics:
Examples of extended mnemonics for Trap Word Immediate:
Extended:           Equivalent to:
twgti Rx,value      twi 8,Rx,value
twllei Rx,value     twi 6,Rx,value

a ← EXTS((RA)32:63)
b ← EXTS((RB)32:63)
if (a < b) & TO0 then TRAP
if (a > b) & TO1 then TRAP
if (a = b) & TO2 then TRAP
if (a <u b) & TO3 then TRAP
if (a >u b) & TO4 then TRAP
The contents of RA32:63 are compared with the contents of RB32:63. If any bit in the TO field is set to 1 and its corresponding condition is met by the result of the comparison, the system trap handler is invoked.
If the trap conditions are met, this instruction is context synchronizing (see Book III).
Special Registers Altered:
None
Extended Mnemonics:
Examples of extended mnemonics for Trap Word:
Extended:           Equivalent to:
tweq Rx,Ry          tw 4,Rx,Ry
twlge Rx,Ry         tw 5,Rx,Ry
trap                tw 31,0,0
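How the 5-bit TO field selects trap conditions can be checked with a small model. This is a sketch, with `tw_traps` as an illustrative name: it evaluates the five conditions of the register form and reports whether a trap would be taken, treating TO0 as the most significant bit of the field.

```python
M64 = (1 << 64) - 1


def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def tw_traps(to, ra, rb):
    """Return True if a Trap Word with this TO field and operands would trap."""
    a = to_signed(ra, 32)  # EXTS((RA)32:63)
    b = to_signed(rb, 32)  # EXTS((RB)32:63)
    au, bu = a & M64, b & M64  # the same values viewed as unsigned
    conditions = [a < b, a > b, a == b, au < bu, au > bu]  # TO0 .. TO4
    return any(cond and (to >> (4 - i)) & 1 for i, cond in enumerate(conditions))
```

For example, TO=5 (0b00101) enables the "equal" and "logically greater than" conditions, which matches the twlge mnemonic, and TO=31 enables every condition, so the instruction always traps.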
28 RS RA UI 24 RS RA UI
0 6 11 16 31 0 6 11 16 31
29 RS RA UI Extended Mnemonics:
0 6 11 16 31
Example of extended mnemonics for OR Immediate:
25 RS RA UI
0 6 11 16 31
26 RS RA UI 27 RS RA UI
0 6 11 16 31 0 6 11 16 31
xori 0,0,0
Special Registers Altered:
None
Extended Mnemonics:
Example of extended mnemonics for XOR Immediate:
Programming Note
The executed form of no-op should be used only
when the intent is to alter the timing of a program.
31 RS RA RB 28 Rc 31 RS RA RB 444 Rc
0 6 11 16 21 31 0 6 11 16 21 31
31 RS RA RB 316 Rc
0 6 11 16 21 31
RA ← (RS) ⊕ (RB)
The contents of register RS are XORed with the con-
tents of register RB and the result is placed into register
RA.
Special Registers Altered:
CR0 (if Rc=1)
NAND X-form
nand RA,RS,RB (Rc=0)
nand. RA,RS,RB (Rc=1)
31 RS RA RB 476 Rc
0 6 11 16 21 31
Programming Note
nand or nor with RS=RB can be used to obtain the
one’s complement.
31 RS RA RB 124 Rc 31 RS RA RB 284 Rc
0 6 11 16 21 31 0 6 11 16 21 31
31 RS RA RB 60 Rc 31 RS RA RB 412 Rc
0 6 11 16 21 31 0 6 11 16 21 31
s ← (RS)56
RA56:63 ← (RS)56:63
RA0:55 ← ⁵⁶s
(RS)56:63 are placed into RA56:63. RA0:55 are filled with a copy of (RS)56.
Special Registers Altered:
CR0 (if Rc=1)

s ← (RS)48
RA48:63 ← (RS)48:63
RA0:47 ← ⁴⁸s
(RS)48:63 are placed into RA48:63. RA0:47 are filled with a copy of (RS)48.
Special Registers Altered:
CR0 (if Rc=1)
n ← 32
do while n < 64
  if (RS)n = 1 then leave
  n ← n + 1
RA ← n - 32

31 RS RA /// 538 Rc
0 6 11 16 21 31
n ← 0
do while n < 32
  if (RS)63-n = 0b1 then leave
  n ← n + 1
do n = 0 to 7
  if RS8×n:8×n+7 = (RB)8×n:8×n+7 then
    RA8×n:8×n+7 ← ⁸1
  else
    RA8×n:8×n+7 ← ⁸0
Each byte of the contents of register RS is compared to each corresponding byte of the contents in register RB. If they are equal, the corresponding byte in RA is set to 0xFF. Otherwise the corresponding byte in RA is set to 0x00.
Special Registers Altered:
None

do i = 0 to 7
  n ← 0
  do j = 0 to 7
    if (RS)(i×8)+j = 1 then
      n ← n+1
  RA(i×8):(i×8)+7 ← n
A count of the number of one bits in each byte of register RS is placed into the corresponding byte of register RA. This number ranges from 0 to 8, inclusive.
Special Registers Altered:
None
Population Count Words X-form
popcntw RA, RS
31 RS RA /// 378 /
0 6 11 16 21 31
do i = 0 to 1
  n ← 0
  do j = 0 to 31
    if (RS)(i×32)+j = 1 then
      n ← n+1
  RA(i×32):(i×32)+31 ← n
A count of the number of one bits in each word of regis-
ter RS is placed into the corresponding word of register
RA. This number ranges from 0 to 32, inclusive.
Special Registers Altered:
None
s ← 0
do i = 0 to 7
  s ← s ⊕ (RS)i×8+7
RA ← ⁶³0 || s
The least significant bit in each byte of the contents of register RS is examined. If there is an odd number of one bits the value 1 is placed into register RA; otherwise the value 0 is placed into register RA.
Special Registers Altered:
None

s ← 0
t ← 0
do i = 0 to 3
  s ← s ⊕ (RS)i×8+7
do i = 4 to 7
  t ← t ⊕ (RS)i×8+7
RA0:31 ← ³¹0 || s
RA32:63 ← ³¹0 || t
The least significant bit in each byte of (RS)0:31 is examined. If there is an odd number of one bits the value 1 is placed into RA0:31; otherwise the value 0 is placed into RA0:31. The least significant bit in each byte of (RS)32:63 is examined. If there is an odd number of one bits the value 1 is placed into RA32:63; otherwise the value 0 is placed into RA32:63.
Special Registers Altered:
None
Programming Note
The Parity instructions are designed to be used in conjunction with the Population Count instruction to compute
the parity of words or a doubleword. The parity of the upper and lower words in (RS) can be computed as follows.
popcntb RA, RS
prtyw RA, RA
The parity of (RS) can be computed as follows.
popcntb RA, RS
prtyd RA, RA
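Why the two-instruction sequences above yield the parity can be seen from a small model. This is a sketch with registers as Python integers: the byte-wise population count leaves each byte's bit count in that byte, and the parity step XORs together the least significant bit of every count, which equals the parity of the total number of one bits.

```python
M64 = (1 << 64) - 1


def popcntb(rs):
    """Place the count of one bits of each byte of rs into the same byte."""
    ra = 0
    for i in range(8):
        byte = (rs >> (8 * i)) & 0xFF
        ra |= bin(byte).count("1") << (8 * i)
    return ra


def prtyd(rs):
    """XOR of the least significant bit of each of the eight bytes."""
    s = 0
    for i in range(8):
        s ^= (rs >> (8 * i)) & 1
    return s


def prtyw(rs):
    """Same as prtyd, but computed separately for each 32-bit word."""
    low = high = 0
    for i in range(4):
        low ^= (rs >> (8 * i)) & 1
        high ^= (rs >> (32 + 8 * i)) & 1
    return (high << 32) | low


def parity64(x):
    """Model of the popcntb followed by prtyd sequence."""
    return prtyd(popcntb(x & M64))
```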
n ← 0
do while n < 64
  if (RS)n = 1 then leave
  n ← n + 1
RA ← n
A count of the number of consecutive zero bits starting at bit 0 of register RS is placed into register RA. This number ranges from 0 to 64, inclusive.
If Rc=1, CR Field 0 is set to reflect the result.
Special Registers Altered:
CR0 (if Rc=1)

n ← 0
do while n < 64
  if (RS)63-n = 0b1 then leave
  n ← n + 1
RA ← EXTZ64(n)
A count of the number of consecutive zero bits starting at bit 63 of register RS is placed into register RA. This number ranges from 0 to 64, inclusive.
If Rc is equal to 1, CR field 0 is set to reflect the result.
Special Registers Altered:
CR0 (if Rc=1)
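The two counting loops differ only in the direction of the scan, which is easy to get wrong because bit 0 is the most significant bit in this numbering. A minimal sketch, with illustrative function names and registers modelled as Python integers:

```python
M64 = (1 << 64) - 1


def count_leading_zeros64(rs):
    """Scan from bit 0 (the most significant bit) toward bit 63."""
    rs &= M64
    n = 0
    while n < 64:
        if (rs >> (63 - n)) & 1:  # (RS)n in big-endian bit numbering
            break
        n += 1
    return n


def count_trailing_zeros64(rs):
    """Scan from bit 63 (the least significant bit) toward bit 0."""
    rs &= M64
    n = 0
    while n < 64:
        if (rs >> n) & 1:  # (RS)63-n in big-endian bit numbering
            break
        n += 1
    return n
```

Both functions return 64 for a zero operand, matching the stated range of 0 to 64 inclusive.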
31 RS RA RB 252 /
0 6 11 16 21 31
For i = 0 to 7
  index ← (RS)8*i:8*i+7
  If index < 64
    then permi ← (RB)index
    else permi ← 0
RA ← ⁵⁶0 || perm0:7
Eight permuted bits are produced. For each permuted
bit i where i ranges from 0 to 7 and for each byte i of
RS, do the following.
If byte i of RS is less than 64, permuted bit i is set
to the bit of RB specified by byte i of RS; otherwise
permuted bit i is set to 0.
The permuted bits are placed in the least-significant
byte of RA, and the remaining bits are filled with 0s.
Special Registers Altered:
None
Programming Note
The fact that the permuted bit is 0 if the corre-
sponding index value exceeds 63 permits the per-
muted bits to be selected from a 128-bit quantity,
using a single index register. For example, assume
that the 128-bit quantity Q, from which the per-
muted bits are to be selected, is in registers r2
(high-order 64 bits of Q) and r3 (low-order 64 bits of
Q), that the index values are in register r1, with
each byte of r1 containing a value in the range
0:127, and that each byte of register r4 contains the
value 64. The following code sequence selects
eight permuted bits from Q and places them into
the low-order byte of r6.
bpermd r6,r1,r2   # select from high-order half of Q
xor    r0,r1,r4   # adjust index values
bpermd r5,r0,r3   # select from low-order half of Q
or     r6,r6,r5   # merge the two selections
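The four-instruction sequence above can be checked against a direct model of the bit permute. This is a sketch with registers as Python integers; `bpermd` follows the RTL given for the instruction, and `select_from_128` is an illustrative name for the sequence in the note.

```python
M64 = (1 << 64) - 1
ALL_64 = 0x4040_4040_4040_4040  # each byte holds the value 64 (register r4)


def bpermd(rs, rb):
    """Select eight bits of rb using the eight index bytes of rs."""
    result = 0
    for i in range(8):
        index = (rs >> (56 - 8 * i)) & 0xFF  # byte i, byte 0 is most significant
        bit = (rb >> (63 - index)) & 1 if index < 64 else 0
        result |= bit << (7 - i)  # perm0 lands in bit 56, perm7 in bit 63
    return result


def select_from_128(q_high, q_low, indexes):
    """Model of the note's sequence: eight bits selected from a 128-bit Q."""
    r6 = bpermd(indexes, q_high)   # indexes 64..127 contribute 0 here
    r0 = indexes ^ ALL_64          # maps 64..127 to 0..63 and 0..63 to 64..127
    r5 = bpermd(r0, q_low)         # indexes originally below 64 contribute 0
    return r6 | r5
```

The XOR with 64 works because it flips only bit 6 of each index byte, so every index is in range for exactly one of the two permutes.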
Rotate Left Word then AND with Mask M-form
rlwnm RA,RS,RB,MB,ME (Rc=0)
rlwnm. RA,RS,RB,MB,ME (Rc=1)
23 RS RA RB MB ME Rc
0 6 11 16 21 26 31
n ← (RB)59:63
r ← ROTL32((RS)32:63, n)
m ← MASK(MB+32, ME+32)
RA ← r & m
The contents of register RS are rotated32 left the number of bits specified by (RB)59:63. A mask is generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA.
Special Registers Altered:
CR0 (if Rc=1)
Extended Mnemonics:
Example of extended mnemonics for Rotate Left Word then AND with Mask:
Extended:           Equivalent to:
rotlw Rx,Ry,Rz      rlwnm Rx,Ry,Rz,0,31
Programming Note
Let RSL represent the low-order 32 bits of register RS, with the bits numbered from 0 through 31.
rlwnm can be used to extract an n-bit field that starts at variable bit position b in RSL, right-justified into the low-order 32 bits of register RA (clearing the remaining 32-n bits of the low-order 32 bits of RA), by setting RB59:63=b+n, MB=32-n, and ME=31. It can be used to extract an n-bit field that starts at variable bit position b in RSL, left-justified into the low-order 32 bits of register RA (clearing the remaining 32-n bits of the low-order 32 bits of RA), by setting RB59:63=b, MB = 0, and ME=n-1. It can be used to rotate the contents of the low-order 32 bits of a register left (right) by variable n bits, by setting RB59:63=n (32-n), MB=0, and ME=31.
For all the uses given above, the high-order 32 bits of register RA are cleared.
Extended mnemonics are provided for some of these uses; see Appendix C, “Assembler Extended Mnemonics” on page 795.

Rotate Left Word Immediate then Mask Insert M-form
rlwimi RA,RS,SH,MB,ME (Rc=0)
rlwimi. RA,RS,SH,MB,ME (Rc=1)
20 RS RA SH MB ME Rc
0 6 11 16 21 26 31
n ← SH
r ← ROTL32((RS)32:63, n)
m ← MASK(MB+32, ME+32)
RA ← r&m | (RA)&¬m
The contents of register RS are rotated32 left SH bits. A mask is generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits elsewhere. The rotated data are inserted into register RA under control of the generated mask.
Special Registers Altered:
CR0 (if Rc=1)
Extended Mnemonics:
Example of extended mnemonics for Rotate Left Word Immediate then Mask Insert:
Extended:           Equivalent to:
inslwi Rx,Ry,n,b    rlwimi Rx,Ry,32-b,b,b+n-1
Programming Note
Let RAL represent the low-order 32 bits of register RA, with the bits numbered from 0 through 31.
rlwimi can be used to insert an n-bit field that is left-justified in the low-order 32 bits of register RS, into RAL starting at bit position b, by setting SH=32-b, MB=b, and ME=(b+n)-1. It can be used to insert an n-bit field that is right-justified in the low-order 32 bits of register RS, into RAL starting at bit position b, by setting SH=32-(b+n), MB=b, and ME=(b+n)-1.
Extended mnemonics are provided for both of these uses; see Appendix C, “Assembler Extended Mnemonics” on page 795.
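The field extraction and insertion recipes in the notes above can be tried out with a small model of the rotate-and-mask primitives. This is a sketch under stated assumptions: registers are Python integers, `rotl32` replicates the rotated word into both halves of a 64-bit value, and `mask` uses the numbering in which bit 0 is the most significant of 64 bits and wraps around when the start bit is greater than the end bit.

```python
M64 = (1 << 64) - 1
M32 = 0xFFFF_FFFF


def rotl32(word, n):
    """Rotate the low 32 bits left by n and replicate them into both halves."""
    word &= M32
    n &= 31
    rotated = ((word << n) | (word >> (32 - n))) & M32
    return (rotated << 32) | rotated


def mask(mb, me):
    """1-bits from bit mb through bit me (bit 0 is the MSB), 0-bits elsewhere."""
    if mb <= me:
        return ((1 << (me - mb + 1)) - 1) << (63 - me)
    zeros = ((1 << (mb - me - 1)) - 1) << (64 - mb)  # bits me+1 .. mb-1
    return M64 & ~zeros


def rlwnm(rs, rb, mb, me):
    """Rotate left by (RB)59:63, then AND with the mask."""
    return rotl32(rs, rb & 0x1F) & mask(mb + 32, me + 32)


def rlwimi(ra, rs, sh, mb, me):
    """Rotate left by SH, then insert into ra under control of the mask."""
    m = mask(mb + 32, me + 32)
    return (rotl32(rs, sh) & m) | (ra & M64 & ~m)
```

For instance, with b=8 and n=8, `rlwnm(rs, b + n, 32 - n, 31)` returns the second byte of the low-order word right-justified, and `rlwimi(ra, rs, 32 - (b + n), b, b + n - 1)` writes a right-justified byte from rs into that same position of ra.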
Rotate Left Doubleword Immediate then Clear Left MD-form
rldicl RA,RS,SH,MB (Rc=0)
rldicl. RA,RS,SH,MB (Rc=1)
30 RS RA sh mb 0 sh Rc
0 6 11 16 21 27 30 31

Rotate Left Doubleword Immediate then Clear Right MD-form
rldicr RA,RS,SH,ME (Rc=0)
rldicr. RA,RS,SH,ME (Rc=1)
30 RS RA sh me 1 sh Rc
0 6 11 16 21 27 30 31

Rotate Left Doubleword Immediate then Clear MD-form
rldic RA,RS,SH,MB (Rc=0)
rldic. RA,RS,SH,MB (Rc=1)
30 RS RA sh mb 2 sh Rc
0 6 11 16 21 27 30 31

Rotate Left Doubleword then Clear Left MDS-form
rldcl RA,RS,RB,MB (Rc=0)
rldcl. RA,RS,RB,MB (Rc=1)
30 RS RA RB mb 8 Rc
0 6 11 16 21 27 31

Rotate Left Doubleword then Clear Right MDS-form
rldcr RA,RS,RB,ME (Rc=0)
rldcr. RA,RS,RB,ME (Rc=1)
30 RS RA RB me 9 Rc
0 6 11 16 21 27 31

Rotate Left Doubleword Immediate then Mask Insert MD-form
rldimi RA,RS,SH,MB (Rc=0)
rldimi. RA,RS,SH,MB (Rc=1)
30 RS RA sh mb 3 sh Rc
0 6 11 16 21 27 30 31
31 RS RA RB 24 Rc 31 RS RA RB 536 Rc
0 6 11 16 21 31 0 6 11 16 21 31
n ← (RB)59:63
r ← ROTL32((RS)32:63, n)
if (RB)58 = 0 then
  m ← MASK(32, 63-n)
else m ← ⁶⁴0
RA ← r & m
The contents of the low-order 32 bits of register RS are shifted left the number of bits specified by (RB)58:63. Bits shifted out of position 32 are lost. Zeros are supplied to the vacated positions on the right. The 32-bit result is placed into RA32:63. RA0:31 are set to zero. Shift amounts from 32 to 63 give a zero result.
Special Registers Altered:
CR0 (if Rc=1)

n ← (RB)59:63
r ← ROTL32((RS)32:63, 64-n)
if (RB)58 = 0 then
  m ← MASK(n+32, 63)
else m ← ⁶⁴0
RA ← r & m
The contents of the low-order 32 bits of register RS are shifted right the number of bits specified by (RB)58:63. Bits shifted out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The 32-bit result is placed into RA32:63. RA0:31 are set to zero. Shift amounts from 32 to 63 give a zero result.
Special Registers Altered:
CR0 (if Rc=1)
Shift Right Algebraic Word Immediate X-form
srawi RA,RS,SH (Rc=0)
srawi. RA,RS,SH (Rc=1)
31 RS RA SH 824 Rc
0 6 11 16 21 31
n ← SH
r ← ROTL32((RS)32:63, 64-n)
m ← MASK(n+32, 63)
s ← (RS)32
RA ← r&m | (⁶⁴s)&¬m
carry ← s & ((r&¬m)32:63≠0)
CA ← carry
CA32 ← carry
The contents of the low-order 32 bits of register RS are shifted right SH bits. Bits shifted out of position 63 are lost. Bit 32 of RS is replicated to fill the vacated positions on the left. The 32-bit result is placed into RA32:63. Bit 32 of RS is replicated to fill RA0:31. CA and CA32 are set to 1 if the low-order 32 bits of (RS) contain a negative number and any 1-bits are shifted out of position 63; otherwise CA and CA32 are set to 0. A shift amount of zero causes RA to receive EXTS((RS)32:63), and CA and CA32 to be set to 0.
Special Registers Altered:
CA CA32
CR0 (if Rc=1)

Shift Right Algebraic Word X-form
sraw RA,RS,RB (Rc=0)
sraw. RA,RS,RB (Rc=1)
31 RS RA RB 792 Rc
0 6 11 16 21 31
n ← (RB)59:63
r ← ROTL32((RS)32:63, 64-n)
if (RB)58 = 0 then
  m ← MASK(n+32, 63)
else m ← ⁶⁴0
s ← (RS)32
RA ← r&m | (⁶⁴s)&¬m
carry ← s & ((r&¬m)32:63≠0)
CA ← carry
CA32 ← carry
The contents of the low-order 32 bits of register RS are shifted right the number of bits specified by (RB)58:63. Bits shifted out of position 63 are lost. Bit 32 of RS is replicated to fill the vacated positions on the left. The 32-bit result is placed into RA32:63. Bit 32 of RS is replicated to fill RA0:31. CA and CA32 are set to 1 if the low-order 32 bits of (RS) contain a negative number and any 1-bits are shifted out of position 63; otherwise CA and CA32 are set to 0. A shift amount of zero causes RA to receive EXTS((RS)32:63), and CA and CA32 to be set to 0. Shift amounts from 32 to 63 give a result of 64 sign bits, and cause CA and CA32 to receive the sign bit of (RS)32:63.
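The carry rule for the algebraic word shifts is the part most often misread: CA is set only when the operand is negative and at least one 1-bit is shifted out. A minimal sketch of the register form, with `sraw` returning the 64-bit result and the CA value as a pair (CA32 receives the same value); it computes the result arithmetically instead of through the rotate-and-mask RTL, which is a modelling choice.

```python
M64 = (1 << 64) - 1


def sraw(rs, rb):
    """Return (RA, CA) for an algebraic right shift of the low-order word."""
    n = rb & 0x3F                      # (RB)58:63
    word = rs & 0xFFFF_FFFF            # (RS)32:63
    signed = word - (1 << 32) if word & 0x8000_0000 else word
    negative = signed < 0
    if n >= 32:
        # Shift amounts from 32 to 63: 64 sign bits, CA receives the sign bit.
        result = -1 if negative else 0
        ca = 1 if negative else 0
    else:
        result = signed >> n           # Python's >> is arithmetic on negative ints
        shifted_out = word & ((1 << n) - 1)
        ca = 1 if negative and shifted_out != 0 else 0
    return result & M64, ca
```

This carry definition is what lets a shift followed by an add-with-carry round a signed division by a power of two toward zero.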
31 RS RA RB 27 Rc 31 RS RA RB 539 Rc
0 6 11 16 21 31 0 6 11 16 21 31
n ← (RB)58:63
r ← ROTL64((RS), n)
if (RB)57 = 0 then
  m ← MASK(0, 63-n)
else m ← ⁶⁴0
RA ← r & m
The contents of register RS are shifted left the number of bits specified by (RB)57:63. Bits shifted out of position 0 are lost. Zeros are supplied to the vacated positions on the right. The result is placed into register RA. Shift amounts from 64 to 127 give a zero result.
Special Registers Altered:
CR0 (if Rc=1)

n ← (RB)58:63
r ← ROTL64((RS), 64-n)
if (RB)57 = 0 then
  m ← MASK(n, 63)
else m ← ⁶⁴0
RA ← r & m
The contents of register RS are shifted right the number of bits specified by (RB)57:63. Bits shifted out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The result is placed into register RA. Shift amounts from 64 to 127 give a zero result.
Special Registers Altered:
CR0 (if Rc=1)
n ← sh5 || sh0:4
r ← ROTL64((RS), 64-n)
m ← MASK(n, 63)
s ← (RS)0
RA ← r&m | (⁶⁴s)&¬m
carry ← s & ((r&¬m)≠0)
CA ← carry
CA32 ← carry
The contents of register RS are shifted right SH bits. Bits shifted out of position 63 are lost. Bit 0 of RS is replicated to fill the vacated positions on the left. The result is placed into register RA. CA and CA32 are set to 1 if (RS) is negative and any 1-bits are shifted out of position 63; otherwise CA and CA32 are set to 0. A shift amount of zero causes RA to be set equal to (RS), and CA and CA32 to be set to 0.
Special Registers Altered:
CA CA32
CR0 (if Rc=1)

n ← (RB)58:63
r ← ROTL64((RS), 64-n)
if (RB)57 = 0 then
  m ← MASK(n, 63)
else m ← ⁶⁴0
s ← (RS)0
RA ← r&m | (⁶⁴s)&¬m
carry ← s & ((r&¬m)≠0)
CA ← carry
CA32 ← carry
The contents of register RS are shifted right the number of bits specified by (RB)57:63. Bits shifted out of position 63 are lost. Bit 0 of RS is replicated to fill the vacated positions on the left. The result is placed into register RA. CA and CA32 are set to 1 if (RS) is negative and any 1-bits are shifted out of position 63; otherwise CA and CA32 are set to 0. A shift amount of zero causes RA to be set equal to (RS), and CA and CA32 to be set to 0. Shift amounts from 64 to 127 give a result of 64 sign bits in RA, and cause CA and CA32 to receive the sign bit of (RS).
Special Registers Altered:
CA CA32
CR0 (if Rc=1)
31 RS RA sh 445 sh Rc
0 6 11 16 21 30 31
n ← sh5 || sh0:4
r ← ROTL64(EXTS64(RS32:63), n)
m ← MASK(0, 63-n)
RA ← r & m
Convert Declets To Binary Coded Decimal X-form
cdtbcd RA, RS
31 RS RA /// 282 /
0 6 11 16 21 31
do i = 0 to 1
  n ← i x 32
  RAn+0:n+7 ← 0
  RAn+8:n+19 ← DPD_TO_BCD( (RS)n+12:n+21 )
  RAn+20:n+31 ← DPD_TO_BCD( (RS)n+22:n+31 )
The low-order 20 bits of each word of register RS contain two declets which are converted to six, 4-bit BCD fields; each set of six, 4-bit BCD fields is placed into the low-order 24 bits of the corresponding word in RA. The high-order 8 bits in each word of RA are set to 0.
Special Registers Altered:
None

Convert Binary Coded Decimal To Declets X-form
cbcdtd RA, RS
31 RS RA /// 314 /
0 6 11 16 21 31
do i = 0 to 1
  n ← i x 32
  RAn+0:n+11 ← 0
  RAn+12:n+21 ← BCD_TO_DPD( (RS)n+8:n+19 )
  RAn+22:n+31 ← BCD_TO_DPD( (RS)n+20:n+31 )
The low-order 24 bits of each word of register RS contain six, 4-bit BCD fields which are converted to two declets; each set of two declets is placed into the low-order 20 bits of the corresponding word in RA. The high-order 12 bits in each word of RA are set to 0.
If a 4-bit BCD field has a value greater than 9 the results are undefined.
Special Registers Altered:
None

Add and Generate Sixes XO-form
addg6s RT,RA,RB
31 RT RA RB / 74 /
0 6 11 16 21 22 31
do i = 0 to 15
  dci ← carry_out(RA4xi:63 + RB4xi:63)
c ← ⁴(dc0) || ⁴(dc1) || ... || ⁴(dc15)
RT ← (¬c) & 0x6666_6666_6666_6666
The contents of register RA are added to the contents of register RB. Sixteen carry bits are produced, one for each carry out of decimal position n (bit position 4xn).
A doubleword is composed from the 16 carry bits, and placed into RT. The doubleword consists of a decimal six (0b0110) in every decimal digit position for which the corresponding carry bit is 0, and a zero (0b0000) in every position for which the corresponding carry bit is 1.
Special Registers Altered:
None
Programming Note
addg6s can be used to add or subtract two BCD operands. In these examples it is assumed that r0 contains 0x666...666. (BCD data formats are described in Section 5.3.)
Addition of the unsigned BCD operand in register RA to the unsigned BCD operand in register RB can be accomplished as follows.
add    r1,RA,r0
add    r2,r1,RB
addg6s RT,r1,RB
subf   RT,RT,r2   # RT = RA +BCD RB
Subtraction of the unsigned BCD operand in register RA from the unsigned BCD operand in register RB can be accomplished as follows. (In this example it is assumed that RB is not register 0.)
addi   r1,RB,1
nor    r2,RA,RA   # one's complement of RA
add    r3,r1,r2
addg6s RT,r1,r2
subf   RT,RT,r3   # RT = RB -BCD RA
Additional instructions are needed to handle signed BCD operands, and BCD operands that occupy more than one register (e.g., unsigned BCD operands that have more than 16 decimal digits).
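The two sequences in the note can be traced with a small model. This is a sketch with registers as 64-bit Python integers: `addg6s` follows the description above (a six in every digit position that produced no carry), while `bcd_add` and `bcd_sub` are illustrative names for the addition and subtraction sequences, using a 16-digit constant of sixes for r0.

```python
M64 = (1 << 64) - 1
SIXES = 0x6666_6666_6666_6666  # assumed content of r0: a six in all 16 digits


def addg6s(ra, rb):
    """Return a six in each decimal digit position whose add produced no carry."""
    rt = 0
    for k in range(16):  # k = 0 is the least significant digit
        width = 4 * (k + 1)
        low = (1 << width) - 1
        carry = ((ra & low) + (rb & low)) >> width  # carry out of digit k
        if carry == 0:
            rt |= 0x6 << (4 * k)
    return rt


def bcd_add(a, b):
    """Unsigned BCD a + b, following the addition sequence in the note."""
    r1 = (a + SIXES) & M64         # add    r1,RA,r0
    r2 = (r1 + b) & M64            # add    r2,r1,RB
    rt = addg6s(r1, b)             # addg6s RT,r1,RB
    return (r2 - rt) & M64         # subf   RT,RT,r2


def bcd_sub(b, a):
    """Unsigned BCD b - a, following the subtraction sequence in the note."""
    r1 = (b + 1) & M64             # addi   r1,RB,1
    r2 = ~a & M64                  # nor    r2,RA,RA
    r3 = (r1 + r2) & M64           # add    r3,r1,r2
    rt = addg6s(r1, r2)            # addg6s RT,r1,r2
    return (r3 - rt) & M64         # subf   RT,RT,r3
```

Pre-adding six to every digit makes a decimal carry coincide with a binary carry out of the digit; the final subtraction removes the six again from exactly those digits that did not carry.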
Move From VSR Doubleword XX1-form
mfvsrd RA,XS
31 S RA /// 51 SX
0 6 11 16 21 31
if SX=0 & MSR.FP=0 then FP_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()

Move From VSR Lower Doubleword XX1-form
mfvsrld RA,XS
31 S RA /// 307 SX
0 6 11 16 21 31
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
0 64 127
31 S RA /// 115 SX
0 6 11 16 21 31
GPR[RA] ← EXTZ64(VSR[32×SX+S].word[1])
0 32 64 127
if TX=0 & MSR.FP=0 then FP_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
The contents of GPR[RA] are placed into doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
For TX=0, mtvsrd is treated as a Floating-Point instruction in terms of resource availability.
For TX=1, mtvsrd is treated as a Vector instruction in terms of resource availability.
Extended Mnemonics      Equivalent To
mtfprd FRT,RA           mtvsrd FRT,RA
mtvrd VRT,RA            mtvsrd VRT+32,RA
Special Registers Altered
None
Data Layout for mtvsrd
src = GPR[RA]

if TX=0 & MSR.FP=0 then FP_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
The two’s-complement integer in bits 32:63 of GPR[RA] is sign-extended to 64 bits and placed into doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
For TX=0, mtvsrwa is treated as a Floating-Point instruction in terms of resource availability.
For TX=1, mtvsrwa is treated as a Vector instruction in terms of resource availability.
Extended Mnemonics      Equivalent To
mtfprwa FRT,RA          mtvsrwa FRT,RA
mtvrwa VRT,RA           mtvsrwa VRT+32,RA
Special Registers Altered
None
Data Layout for mtvsrwa
src = GPR[RA]
Move To VSR Word and Zero XX1-form
mtvsrwz XT,RA
31 T RA /// 243 TX
0 6 11 16 21 31
if TX=0 & MSR.FP=0 then FP_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()

Move To VSR Double Doubleword XX1-form
mtvsrdd XT,RA,RB
31 T RA RB 435 TX
0 6 11 16 21 31
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
31 T RA /// 403 TX
0 6 11 16 21 31
VSR[32×TX+T].word[0] ← GPR[RA].bit[32:63]
VSR[32×TX+T].word[1] ← GPR[RA].bit[32:63]
VSR[32×TX+T].word[2] ← GPR[RA].bit[32:63]
VSR[32×TX+T].word[3] ← GPR[RA].bit[32:63]
Programming Note
The AMR is part of the “context” of the program
(see Book III). Therefore modification of the AMR
requires “synchronization” by software. For this rea-
son, most operating systems provide a system
library program that application programs can use
to modify the AMR.
count ← 0
do i = 0 to 7
  if FXMi = 1 then
    n ← i
    count ← count + 1
if count = 1 then
  CR4×n+32:4×n+35 ← (RS)4×n+32:4×n+35
else CR ← undefined
If exactly one bit of the FXM field is set to 1, let n be the position of that bit in the field (0 ≤ n ≤ 7). The contents of bits 4×n+32:4×n+35 of register RS are placed into CR field n (CR bits 4×n+32:4×n+35). Otherwise, the contents of the Condition Register are undefined.
Special Registers Altered:
CR field selected by FXM

mask ← ⁴(FXM0) || ⁴(FXM1) || ... ⁴(FXM7)
CR ← ((RS)32:63 & mask) | (CR & ¬mask)
The contents of bits 32:63 of register RS are placed into the Condition Register under control of the field mask specified by FXM. The field mask identifies the 4-bit fields affected. Let i be an integer in the range 0-7. If FXMi=1 then CR field i (CR bits 4×i+32:4×i+35) is set to the contents of the corresponding field of the low-order 32 bits of RS.
Special Registers Altered:
CR fields selected by mask
Extended Mnemonics:
Example of extended mnemonics for Move To Condition Register Fields:
Move From One Condition Register Field XFX-form
mfocrf RT,FXM
31 RT 1 FXM / 19 /
0 6 11 12 20 21 31
RT ← undefined
count ← 0
do i = 0 to 7
  if FXMi = 1 then
    n ← i
    count ← count + 1
if count = 1 then
  RT ← ⁶⁴0
  RT4×n+32:4×n+35 ← CR4×n+32:4×n+35
If exactly one bit of the FXM field is set to 1, let n be the position of that bit in the field (0 ≤ n ≤ 7). The contents of CR field n (CR bits 4×n+32:4×n+35) are placed into bits 4×n+32:4×n+35 of register RT, and the contents of the remaining bits of register RT are undefined. Otherwise, the contents of register RT are undefined.
If exactly one bit of the FXM field is set to 1, the contents of the remaining bits of register RT are set to 0's instead of being undefined as specified above.
Special Registers Altered:
None
Programming Note
Warning: mfocrf is not backward compatible with processors that comply with versions of the architecture that precede Version 3.0. Such processors may not set to 0 the bits of register RT that do not correspond to the specified CR field. If programs that depend on this clearing behavior are run on such processors, the programs may get incorrect results.
The POWER4, POWER5, POWER7 and POWER8 processors set to 0's all bytes of register RT other than the byte that contains the specified CR field. In the byte that contains the CR field, bits other than those containing the CR field may or may not be set to 0s.

Move From Condition Register XFX-form
mfcr RT
31 RT 0 /// 19 /
0 6 11 12 21 31
RT ← ³²0 || CR
The contents of the Condition Register are placed into RT32:63. RT0:31 are set to 0.
Special Registers Altered:
None

Set Boolean X-form
setb RT,BFA
31 RT BFA // /// 128 /
0 6 11 14 16 21 31
if CR4×BFA+32=1 then
  RT ← 0xFFFF_FFFF_FFFF_FFFF
else if CR4×BFA+33=1 then
  RT ← 0x0000_0000_0000_0001
else
  RT ← 0x0000_0000_0000_0000
If the contents of bit 0 of CR field BFA are equal to 0b1, the contents of register RT are set to 0xFFFF_FFFF_FFFF_FFFF.
Otherwise, if the contents of bit 1 of CR field BFA are equal to 0b1, the contents of register RT are set to 0x0000_0000_0000_0001.
Otherwise, the contents of register RT are set to 0x0000_0000_0000_0000.
Special Registers Altered:
None
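The single-field move and the Set Boolean behaviour described in this section can be modelled together. This is a sketch under stated assumptions: the Condition Register is a 32-bit Python integer with field 0 in the most significant four bits, the function names mirror the mnemonics for readability only, and `None` stands in for an undefined result.

```python
M64 = (1 << 64) - 1


def cr_field(cr, n):
    """Return the 4-bit CR field n (field 0 is the most significant nibble)."""
    return (cr >> (28 - 4 * n)) & 0xF


def mfocrf(cr, fxm):
    """Copy one CR field into the matching bits of RT, zeroing all other bits."""
    fxm &= 0xFF
    if bin(fxm).count("1") != 1:
        return None  # more or fewer than one FXM bit set: RT is undefined
    n = 7 - (fxm.bit_length() - 1)  # FXM0 is the most significant FXM bit
    return cr & (0xF << (28 - 4 * n))


def setb(cr, bfa):
    """Return all ones, 1, or 0 depending on bits 0 and 1 of CR field bfa."""
    field = cr_field(cr, bfa)
    if field & 0b1000:   # bit 0 of the field
        return M64       # 0xFFFF_FFFF_FFFF_FFFF, i.e. -1 as a signed value
    if field & 0b0100:   # bit 1 of the field
        return 1
    return 0
```

Because a compare sets bit 0 of the field for "less than" and bit 1 for "greater than", the Set Boolean result read as a signed integer is -1, 1, or 0, the usual three-way comparison value.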
4.1 Floating-Point Facility Overview
This chapter describes the registers and instructions that make up the Floating-Point Facility.
The processor (augmented by appropriate software support, where required) implements a floating-point system compliant with the ANSI/IEEE Standard 754-1985, “IEEE Standard for Binary Floating-Point Arithmetic” (hereafter referred to as “the IEEE standard”). That standard defines certain required “operations” (addition, subtraction, etc.). Herein, the term “floating-point operation” is used to refer to one of these required operations and to additional operations defined (e.g., those performed by Multiply-Add or Reciprocal Estimate instructions). A Non-IEEE mode is also provided. This mode, which may produce results not in strict compliance with the IEEE standard, allows shorter latency.
Instructions are provided to perform arithmetic, rounding, conversion, comparison, and other operations in floating-point registers; to move floating-point data between storage and these registers; and to manipulate the Floating-Point Status and Control Register explicitly.
These instructions are divided into two categories.
computational instructions
  The computational instructions are those that perform addition, subtraction, multiplication, division, extracting the square root, rounding, conversion, comparison, and combinations of these operations. These instructions provide the floating-point operations. They place status information into the Floating-Point Status and Control Register. They are the instructions described in Sections 4.6.6 through 4.6.8.
non-computational instructions
  The non-computational instructions are those that perform loads and stores, move the contents of a floating-point register to another floating-point register possibly altering the sign, manipulate the Floating-Point Status and Control Register explicitly, and select the value from one of two floating-point registers based on the value in a third floating-point register. The operations performed by these instructions are not considered floating-point operations. With the exception of the instructions that manipulate the Floating-Point Status and Control Register explicitly, they do not alter the Floating-Point Status and Control Register. They are the instructions described in Sections 4.6.2 through 4.6.5, and 4.6.10.
A floating-point number consists of a signed exponent and a signed significand. The quantity expressed by this number is the product of the significand and the number 2^exponent. Encodings are provided in the data format to represent finite numeric values, ±Infinity, and values that are “Not a Number” (NaN). Operations involving infinities produce results obeying traditional mathematical conventions. NaNs have no mathematical interpretation. Their encoding permits a variable diagnostic information field. They may be used to indicate such things as uninitialized variables and can be produced by certain invalid operations.
There is one class of exceptional events that occur during instruction execution that is unique to the Floating-Point Facility: the Floating-Point Exception. Floating-point exceptions are signaled with bits set in the Floating-Point Status and Control Register (FPSCR). They can cause the system floating-point enabled exception error handler to be invoked, precisely or imprecisely, if the proper control bits are set.
Floating-Point Exceptions
The following floating-point exceptions are detected by the processor:
Invalid Operation Exception (VX)
  SNaN (VXSNAN)
  Infinity-Infinity (VXISI)
  Infinity÷Infinity (VXIDI)
  Zero÷Zero (VXZDZ)
  Infinity×Zero (VXIMZ)
  Invalid Compare (VXVC)
  Software-Defined Condition (VXSOFT)
  Invalid Square Root (VXSQRT)
  Invalid Integer Convert (VXCVI)
Zero Divide Exception (ZX)
Overflow Exception (OX)
Underflow Exception (UX)
Inexact Exception (XX)
Each floating-point exception, and each category of Invalid Operation Exception, has an exception bit in the FPSCR. In addition, each floating-point exception has a corresponding enable bit in the FPSCR. See Section 4.2.2, “Floating-Point Status and Control Register” on page 124 for a description of these exception and enable bits, and Section 4.4, “Floating-Point Exceptions” on page 132 for a detailed discussion of
the result placed into the target FPR, and the setting of status bits in the FPSCR and in the Condition Register (if Rc=1), are undefined.
FPR 0
FPR 1
...
FPR 30
FPR 31
0 63
the FPCC bits to 1 and the other three FPCC
bits to 0. Arithmetic, rounding, and Convert
From Integer instructions may set the FPCC
bits with the C bit, to indicate the class of the
result as shown in Figure 50 on page 127.
Note that in this case the high-order three bits
of the FPCC retain their relational significance
indicating that the value is less than, greater
than, or equal to zero.
48 Floating-Point Less Than or Negative (FL or <)
49 Floating-Point Greater Than or Positive (FG or >)
50 Floating-Point Equal or Zero (FE or =)
51 Floating-Point Unordered or NaN (FU or ?)
52 Reserved
53 Floating-Point Invalid Operation Exception
(Software-Defined Condition) (VXSOFT)
This bit can be altered only by mcrfs, mtfsfi,
mtfsf, mtfsb0, or mtfsb1. See Section 4.4.1.
Programming Note
FPSCRVXSOFT can be used by software
to indicate the occurrence of an arbitrary,
software-defined, condition that is to be
treated as an Invalid Operation Exception.
For example, the bit could be set by a program
that computes a base 10 logarithm if
the supplied input is negative.
54 Floating-Point Invalid Operation Exception
(Invalid Square Root) (VXSQRT)
See Section 4.4.1.
55 Floating-Point Invalid Operation Exception
(Invalid Integer Convert) (VXCVI)
See Section 4.4.1.
56 Floating-Point Invalid Operation Exception
Enable (VE)
See Section 4.4.1.
57 Floating-Point Overflow Exception Enable (OE)
See Section 4.4.3, “Overflow Exception” on
page 135.
58 Floating-Point Underflow Exception Enable (UE)
See Section 4.4.4, “Underflow Exception” on
page 136.
59 Floating-Point Zero Divide Exception Enable (ZE)
See Section 4.4.2, “Zero Divide Exception” on
page 134.
60 Floating-Point Inexact Exception Enable (XE)
See Section 4.4.5, “Inexact Exception” on
page 136.
61 Floating-Point Non-IEEE Mode (NI)
Floating-point non-IEEE mode is optional. If
floating-point non-IEEE mode is not implemented,
this bit is treated as reserved, and the
remainder of the definition of this bit does not
apply.
If floating-point non-IEEE mode is implemented,
this bit has the following meaning.
0 The processor is not in floating-point
non-IEEE mode (i.e., all floating-point
operations conform to the IEEE standard).
1 The processor is in floating-point
non-IEEE mode.
When the processor is in floating-point
non-IEEE mode, the remaining FPSCR bits
may have meanings different from those given
in this document, and floating-point operations
need not conform to the IEEE standard. The
effects of executing a given floating-point
instruction with FPSCRNI=1, and any additional
requirements for using non-IEEE mode,
are implementation-dependent. The results of
executing a given instruction in non-IEEE
mode may vary between implementations,
and between different executions on the same
implementation.
Programming Note
When the processor is in floating-point
non-IEEE mode, the results of floating-point
operations may be approximate,
and performance for these operations
may be better, more predictable, or less
data-dependent than when the processor
is not in non-IEEE mode. For example, in
non-IEEE mode an implementation may
return 0 instead of a denormalized number,
and may return a large number
instead of an infinity.
62:63 Floating-Point Rounding Control (RN)
See Section 4.3.6, “Rounding” on page 131.
00 Round to Nearest
01 Round toward Zero
10 Round toward +Infinity
11 Round toward -Infinity
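
As an illustration of the bit numbering used in the FPSCR description above, the following Python sketch decodes the exception enable bits, NI, and the RN field from a 64-bit FPSCR image. It assumes that bit 0 is the most significant bit of the 64-bit register, so bit n corresponds to the mask 1 << (63 - n). The helper names are hypothetical and are not part of the architecture.

```python
# Bit positions taken from the FPSCR description (bit 0 = most significant bit).
ENABLE_BITS = {"VE": 56, "OE": 57, "UE": 58, "ZE": 59, "XE": 60}
NI_BIT = 61
ROUNDING_MODES = {
    0b00: "Round to Nearest",
    0b01: "Round toward Zero",
    0b10: "Round toward +Infinity",
    0b11: "Round toward -Infinity",
}


def fpscr_bit(fpscr: int, n: int) -> int:
    """Return bit n of a 64-bit FPSCR image, numbering from the MSB."""
    return (fpscr >> (63 - n)) & 1


def decode_controls(fpscr: int) -> dict:
    """Decode the exception enables, NI, and RN (bits 62:63)."""
    controls = {name: fpscr_bit(fpscr, pos) for name, pos in ENABLE_BITS.items()}
    controls["NI"] = fpscr_bit(fpscr, NI_BIT)
    controls["RN"] = ROUNDING_MODES[fpscr & 0b11]
    return controls
```

For example, an image whose low-order byte is 0x81 has VE set and selects Round toward Zero.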
Figure 52. Floating-point double format
S sign bit
EXP exponent+bias
FRACTION fraction
Representation of numeric values in the floating-point
formats consists of a sign bit (S), a biased exponent
(EXP), and the fraction portion (FRACTION) of the significand.
The significand consists of a leading implied
bit concatenated on the right with the FRACTION. This
leading implied bit is 1 for normalized numbers and 0
for denormalized numbers and is located in the unit bit
position (i.e., the first bit to the left of the binary point).
Values representable within the two floating-point for-

[Figure: number line of floating-point values, ordered -INF, -NOR, -DEN, -0, +0, +DEN, +NOR, +INF]
The NaNs are not related to the numeric values or infinities
by order or value but are encodings used to convey
diagnostic information such as the representation
of uninitialized variables.
The following is a description of the different floating-point
values defined in the architecture:
Binary floating-point numbers
Machine representable values used as approximations
to real numbers. Three categories of numbers are supported:
normalized numbers, denormalized numbers,
and zero values.
Exception generates this QNaN (i.e.,
0x7FF8_0000_0000_0000).
A double-precision NaN is considered to be representable
in single format if and only if the low-order 29 bits
of the double-precision NaN’s fraction are zero.

4.3.3 Sign of Result
The following rules govern the sign of the result of an
arithmetic, rounding, or conversion operation, when the
operation does not yield an exception. They apply even
when the operands or results are zeros or infinities.
  The sign of the result of an add operation is the
  sign of the operand having the larger absolute
  value. If both operands have the same sign, the
  sign of the result of an add operation is the same
  as the sign of the operands. The sign of the result
  of the subtract operation x-y is the same as the
  sign of the result of the add operation x+(-y).
  When the sum of two operands with opposite sign,
  or the difference of two operands with the same
  sign, is exactly zero, the sign of the result is positive
  in all rounding modes except Round toward
  -Infinity, in which mode the sign is negative.
  The sign of the result of a multiply or divide operation
  is the Exclusive OR of the signs of the operands.
  The sign of the result of a Square Root or Reciprocal
  Square Root Estimate operation is always positive,
  except that the square root of -0 is -0 and
  the reciprocal square root of -0 is -Infinity.
  The sign of the result of a Round to Single-Precision,
  or Convert From Integer, or Round to Integer
  operation is the sign of the operand being converted.
For the Multiply-Add instructions, the rules given above
are applied first to the multiply operation and then to
the add or subtract operation (one of the inputs to the
add or subtract operation is the result of the multiply
operation).

4.3.4 Normalization and Denormalization
The intermediate result of an arithmetic or frsp instruction
may require normalization and/or denormalization
as described below. Normalization and denormalization
do not affect the sign of the result.
When an arithmetic or rounding instruction produces an
intermediate result which carries out of the significand,
or in which the significand is nonzero but has a leading
zero bit, it is not a normalized number and must be normalized
before it is stored. For the carry-out case, the
significand is shifted right one bit, with a one shifted into
the leading significand bit, and the exponent is incremented
by one. For the leading-zero case, the significand
is shifted left while decrementing its exponent by
one for each bit shifted, until the leading significand bit
becomes one. The Guard bit and the Round bit (see
Section 4.5.1, “Execution Model for IEEE Operations”
on page 137) participate in the shift with zeros shifted
into the Round bit. The exponent is regarded as if its
range were unlimited.
After normalization, or if normalization was not
required, the intermediate result may have a nonzero
significand and an exponent value that is less than the
minimum value that can be represented in the format
specified for the result. In this case, the intermediate
result is said to be “Tiny” and the stored result is determined
by the rules described in Section 4.4.4, “Underflow
Exception”. These rules may require
denormalization.
A number is denormalized by shifting its significand
right while incrementing its exponent by 1 for each bit
shifted, until the exponent is equal to the format’s minimum
value. If any significant bits are lost in this shifting
process then “Loss of Accuracy” has occurred (See
Section 4.4.4, “Underflow Exception” on page 136) and
Underflow Exception is signaled.

4.3.5 Data Handling and Precision
Most of the Floating-Point Facility Architecture, including
all computational, Move, and Select instructions,
use the floating-point double format to represent data in
the FPRs. Single-precision and integer-valued operands
may be manipulated using double-precision operations.
Instructions are provided to coerce these values
from a double format operand. Instructions are also
provided for manipulations which do not require
double-precision. In addition, instructions are provided to
access a true single-precision representation in storage,
and a fixed-point integer representation in GPRs.

4.3.5.1 Single-Precision Operands
For single format data, a format conversion from single
to double is performed when loading from storage into
an FPR and a format conversion from double to single
is performed when storing from an FPR to storage. No
floating-point exceptions are caused by these instructions.
An instruction is provided to explicitly convert a
double format operand in an FPR to single-precision.
Floating-point single-precision is enabled with four
types of instruction.
1. Load Floating-Point Single
This form of instruction accesses a single-precision
operand in single format in storage, converts it
to double format, and loads it into an FPR. No
floating-point exceptions are caused by these
instructions.
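
The sign rules of Section 4.3.3 for add, subtract, multiply, and divide can be sketched in Python as follows. This is a minimal illustration that works on a sign bit (0 for positive, 1 for negative) and a finite magnitude per operand; the function names are hypothetical, and the sketch covers only operations that do not yield an exception.

```python
def sign_of_sum(x_sign, x_mag, y_sign, y_mag, round_toward_minus_inf=False):
    """Sign (0 = +, 1 = -) of x + y for finite magnitudes x_mag, y_mag >= 0."""
    if x_sign == y_sign:
        # Same sign: the result takes the sign of the operands.
        return x_sign
    if x_mag == y_mag:
        # Opposite signs and an exactly zero sum: positive, except in
        # Round toward -Infinity, where the zero is negative.
        return 1 if round_toward_minus_inf else 0
    # Otherwise the operand with the larger absolute value decides.
    return x_sign if x_mag > y_mag else y_sign


def sign_of_difference(x_sign, x_mag, y_sign, y_mag, round_toward_minus_inf=False):
    """x - y has the same sign as the add operation x + (-y)."""
    return sign_of_sum(x_sign, x_mag, y_sign ^ 1, y_mag, round_toward_minus_inf)


def sign_of_product_or_quotient(x_sign, y_sign):
    """Multiply and divide: Exclusive OR of the operand signs."""
    return x_sign ^ y_sign
```

For a Multiply-Add, the same helpers would be applied first to the multiply and then to the add or subtract, as the text above describes.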
4.4 Floating-Point Exceptions
This architecture defines the following floating-point
exceptions:
Invalid Operation Exception
  SNaN
  Infinity-Infinity
  Infinity÷Infinity
  Zero÷Zero
  Infinity×Zero
  Invalid Compare
  Software-Defined Condition
  Invalid Square Root
  Invalid Integer Convert
Zero Divide Exception
Overflow Exception
Underflow Exception
Inexact Exception
These exceptions, other than Invalid Operation Exception
due to Software-Defined Condition, may occur during
execution of computational instructions. An Invalid
Operation Exception due to Software-Defined Condition
occurs when a Move To FPSCR instruction sets
FPSCRVXSOFT to 1.
Each floating-point exception, and each category of
Invalid Operation Exception, has an exception bit in the
FPSCR. In addition, each floating-point exception has a
corresponding enable bit in the FPSCR. The exception
bit indicates occurrence of the corresponding exception.
If an exception occurs, the corresponding enable
bit governs the result produced by the instruction and,
in conjunction with the FE0 and FE1 bits (see
page 133), whether and how the system floating-point
enabled exception error handler is invoked. (In general,
the enabling specified by the enable bit is of invoking
the system error handler, not of permitting the exception
to occur. The occurrence of an exception depends
only on the instruction and its inputs, not on the setting
of any control bits. The only deviation from this general
rule is that the occurrence of an Underflow Exception
may depend on the setting of the enable bit.)
A single instruction, other than mtfsfi or mtfsf, may set
more than one exception bit only in the following cases:
  Inexact Exception may be set with Overflow
  Exception.
  Inexact Exception may be set with Underflow
  Exception.
  Invalid Operation Exception (SNaN) is set with
  Invalid Operation Exception (∞×0) for Multiply-Add
  instructions for which the values being multiplied
  are infinity and zero and the value being added is
  an SNaN.
  Invalid Operation Exception (SNaN) may be set
  with Invalid Operation Exception (Invalid Compare)
  for Compare Ordered instructions.
  Invalid Operation Exception (SNaN) may be set
  with Invalid Operation Exception (Invalid Integer
  Convert) for Convert To Integer instructions.
When an exception occurs the writing of a result to the
target register may be suppressed or a result may be
delivered, depending on the exception.
The writing of a result to the target register is suppressed
for the following kinds of exception, so that
there is no possibility that one of the operands is lost:
  Enabled Invalid Operation
  Enabled Zero Divide
For the remaining kinds of exception, a result is generated
and written to the destination specified by the
instruction causing the exception. The result may be a
different value for the enabled and disabled conditions
for some of these exceptions. The kinds of exception
that deliver a result are the following:
  Disabled Invalid Operation
  Disabled Zero Divide
  Disabled Overflow
  Disabled Underflow
  Disabled Inexact
  Enabled Overflow
  Enabled Underflow
  Enabled Inexact
Subsequent sections define each of the floating-point
exceptions and specify the action that is taken when
they are detected.
The IEEE standard specifies the handling of exceptional
conditions in terms of “traps” and “trap handlers”.
In this architecture, an FPSCR exception enable bit of 1
causes generation of the result value specified in the
IEEE standard for the “trap enabled” case; the expectation
is that the exception will be detected by software,
which will revise the result. An FPSCR exception
enable bit of 0 causes generation of the “default result”
value specified for the “trap disabled” (or “no trap
occurs” or “trap is not implemented”) case; the expectation
is that the exception will not be detected by software,
which will simply use the default result. The result
to be delivered in each case for each exception is
described in the sections below.
The IEEE default behavior when an exception occurs is
to generate a default value and not to notify software. In
this architecture, if the IEEE default behavior when an
exception occurs is desired for all exceptions, all
FPSCR exception enable bits should be set to 0 and
Ignore Exceptions Mode (see below) should be used.
In this case the system floating-point enabled exception
error handler is not invoked, even if floating-point
exceptions occur: software can inspect the FPSCR
exception bits if necessary, to determine whether
exceptions have occurred.
In this architecture, if software is to be notified that a
given kind of exception has occurred, the corresponding
FPSCR exception enable bit must be set to 1 and a
mode other than Ignore Exceptions Mode must be
used. In this case the system floating-point enabled
exception error handler is invoked if an enabled floating-point
exception occurs. The system floating-point
enabled exception error handler is also invoked if a
Move To FPSCR instruction causes an exception bit
and the corresponding enable bit both to be 1; the
Move To FPSCR instruction is considered to cause the
enabled exception.
The FE0 and FE1 bits control whether and how the system
floating-point enabled exception error handler is
invoked if an enabled floating-point exception occurs.
The location of these bits and the requirements for
altering them are described in Book III. (The system
floating-point enabled exception error handler is never
invoked because of a disabled floating-point exception.)
The effects of the four possible settings of these
bits are as follows.
FE0 FE1 Description
0   0   Ignore Exceptions Mode
        Floating-point exceptions do not cause
        the system floating-point enabled exception
        error handler to be invoked.
0   1   Imprecise Nonrecoverable Mode
        The system floating-point enabled exception
        error handler is invoked at some point
        at or beyond the instruction that caused
        the enabled exception. It may not be possible
        to identify the excepting instruction
        or the data that caused the exception.
        Results produced by the excepting
        instruction may have been used by or may
        have affected subsequent instructions
        that are executed before the error handler
        is invoked.
1   0   Imprecise Recoverable Mode
        The system floating-point enabled exception
        error handler is invoked at some point
        at or beyond the instruction that caused
        the enabled exception. Sufficient information
        is provided to the error handler that it
        can identify the excepting instruction and
        the operands, and correct the result. No
        results produced by the excepting instruction
        have been used by or have affected
        subsequent instructions that are executed
        before the error handler is invoked.
1   1   Precise Mode
        The system floating-point enabled exception
        error handler is invoked precisely at
        the instruction that caused the enabled
        exception.
In all cases, the question of whether a floating-point
result is stored, and what value is stored, is governed
by the FPSCR exception enable bits, as described in
subsequent sections, and is not affected by the value of
the FE0 and FE1 bits.
In all cases in which the system floating-point enabled
exception error handler is invoked, all instructions
before the instruction at which the system floating-point
enabled exception error handler is invoked have completed,
and no instruction after the instruction at which
the system floating-point enabled exception error handler
is invoked has begun execution. The instruction at
which the system floating-point enabled exception error
handler is invoked has completed if it is the excepting
instruction and there is only one such instruction. Otherwise
it has not begun execution (or may have been
partially executed in some cases, as described in Book
III).
Programming Note
In any of the three non-Precise modes, a Floating-Point
Status and Control Register instruction
can be used to force any exceptions, due to
instructions initiated before the Floating-Point Status
and Control Register instruction, to be recorded
in the FPSCR. (This forcing is superfluous for Precise
Mode.)
In either of the Imprecise modes, a Floating-Point
Status and Control Register instruction can be used
to force any invocations of the system floating-point
enabled exception error handler, due to instructions
initiated before the Floating-Point Status and Control
Register instruction, to occur. (This forcing has
no effect in Ignore Exceptions Mode, and is superfluous
for Precise Mode.)
The last sentence of the paragraph preceding this
Programming Note can apply only in the Imprecise
modes, or if the mode has just been changed from
Ignore Exceptions Mode to some other mode. (It
always applies in the latter case.)
In order to obtain the best performance across the widest
range of implementations, the programmer should
obey the following guidelines.
  If the IEEE default results are acceptable to the
  application, Ignore Exceptions Mode should be
  used with all FPSCR exception enable bits set to 0.
  If the IEEE default results are not acceptable to the
  application, Imprecise Nonrecoverable Mode
  should be used, or Imprecise Recoverable Mode if
  recoverability is needed, with FPSCR exception
  enable bits set to 1 for those exceptions for which
  the system floating-point enabled exception error
  handler is to be invoked.
  Ignore Exceptions Mode should not, in general, be
  used when any FPSCR exception enable bits are
  set to 1.
  Precise Mode may degrade performance in some
  implementations, perhaps substantially, and therefore
  should be used only for debugging and other
  specialized applications.
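
The interaction between the FPSCR exception and enable bits and the FE0 and FE1 mode bits can be summarized with a small Python sketch. It only answers whether the system floating-point enabled exception error handler is invoked at all; when it is invoked and what the handler can recover depend on the mode, as described in the FE0/FE1 table. The function and table names are hypothetical.

```python
FP_EXCEPTION_MODES = {
    (0, 0): "Ignore Exceptions Mode",
    (0, 1): "Imprecise Nonrecoverable Mode",
    (1, 0): "Imprecise Recoverable Mode",
    (1, 1): "Precise Mode",
}


def handler_invoked(fe0, fe1, exception_bit, enable_bit):
    """True if an enabled exception leads to the error handler being invoked."""
    # The handler is never invoked because of a disabled exception.
    enabled_exception = exception_bit == 1 and enable_bit == 1
    # In Ignore Exceptions Mode the handler is not invoked even when enabled.
    ignoring = FP_EXCEPTION_MODES[(fe0, fe1)] == "Ignore Exceptions Mode"
    return enabled_exception and not ignoring
```

Note that the value written to the target register is governed by the enable bit alone and does not depend on FE0 or FE1.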
4.4.3.1 Definition
An Overflow Exception occurs when the magnitude of
what would have been the rounded result if the expo-
nent range were unbounded exceeds that of the largest
finite number of the specified result precision.
4.4.3.2 Action
The action to be taken depends on the setting of the
Overflow Exception Enable bit of the FPSCR.
When Overflow Exception is enabled (FPSCROE=1)
and an Overflow Exception occurs, the following
actions are taken:
1. Overflow Exception is set
FPSCROX ← 1
2. For double-precision arithmetic instructions, the
exponent of the normalized intermediate result is
adjusted by subtracting 1536
3. For single-precision arithmetic instructions and the
Floating Round to Single-Precision instruction, the
exponent of the normalized intermediate result is
adjusted by subtracting 192
4. The adjusted rounded result is placed into the tar-
get FPR
5. FPSCRFPRF is set to indicate the class and sign of
the result (± Normal Number)
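
As a worked illustration of steps 2 and 3, the following Python sketch applies the exponent adjustment for an enabled Overflow Exception and checks that the adjusted exponent falls back into the normal range of the result format. The exponent bounds used here are the standard IEEE normal ranges for double and single format, and the function names are hypothetical.

```python
# Adjustment constants from the enabled Overflow Exception actions above.
OVERFLOW_ADJUST = {"double": 1536, "single": 192}

# Unbiased exponent range of normalized numbers in each format.
NORMAL_RANGE = {"double": (-1022, 1023), "single": (-126, 127)}


def enabled_overflow_exponent(unbiased_exponent, precision):
    """Exponent of the result delivered when Overflow Exception is enabled."""
    return unbiased_exponent - OVERFLOW_ADJUST[precision]


def is_normal_exponent(unbiased_exponent, precision):
    """True if the exponent is representable as a normal number."""
    low, high = NORMAL_RANGE[precision]
    return low <= unbiased_exponent <= high
```

For example, a double-precision intermediate result with unbiased exponent 1024 is just out of range; after subtracting 1536 it has exponent -512, which is a normal number, consistent with step 5.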
When Overflow Exception is disabled (FPSCROE=0)
and an Overflow Exception occurs, the following
actions are taken:
Programming Note
In some implementations, enabling Inexact Excep-
tions may degrade performance more than does
enabling other types of floating-point exception.
[Figure: execution-model format with fields S, C, L, FRACTION, X’ spanning bit positions 0 through 106]
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
FRT ← DOUBLE(MEM(EA, 4))
Let the effective address (EA) be the sum (RA|0)+D.
The word in storage addressed by EA is interpreted as
a floating-point single-precision operand. This word is
converted to floating-point double format (see
page 141) and placed into register FRT.
Special Registers Altered:
None

if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
FRT ← DOUBLE(MEM(EA, 4))
Let the effective address (EA) be the sum (RA|0)+(RB).
The word in storage addressed by EA is interpreted as
a floating-point single-precision operand. This word is
converted to floating-point double format (see
page 141) and placed into register FRT.
Special Registers Altered:
None
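
The effective-address calculation and the single-to-double conversion on load described above can be sketched in Python. The sketch assumes a 16-bit D field, 64-bit address arithmetic, and big-endian storage, none of which is stated in this passage; the helper names are hypothetical. It relies on the host's float conversion, so it is meant for ordinary numeric values and does not model the bit-exact treatment of NaNs.

```python
import struct

MASK64 = (1 << 64) - 1


def exts16(d):
    """Sign-extend a 16-bit displacement (assumed width of the D field)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def effective_address(ra_index, gprs, d):
    """EA = (RA|0) + EXTS(D): register 0 as RA means a base of zero."""
    base = 0 if ra_index == 0 else gprs[ra_index]
    return (base + exts16(d)) & MASK64


def load_single_as_double(memory, ea):
    """Read a single-format word at ea and return the double-format image."""
    (value,) = struct.unpack_from(">f", memory, ea)
    (image,) = struct.unpack(">Q", struct.pack(">d", value))
    return image
```

The indexed form differs only in that the second addend is the contents of RB rather than the sign-extended displacement.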
Load Floating-Point Single with Update D-form
lfsu FRT,D(RA)

Load Floating-Point Single with Update Indexed X-form
lfsux FRT,RA,RB
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
FRT ← MEM(EA, 8)
Let the effective address (EA) be the sum (RA|0)+D.
The doubleword in storage addressed by EA is loaded
into register FRT.
Special Registers Altered:
None

if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
FRT ← MEM(EA, 8)
Let the effective address (EA) be the sum (RA|0)+(RB).
The doubleword in storage addressed by EA is loaded
into register FRT.
Special Registers Altered:
None
Load Floating-Point Double with Update D-form
lfdu FRT,D(RA)

Load Floating-Point Double with Update Indexed X-form
lfdux FRT,RA,RB
31 FRT RA RB 855 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
FRT ← EXTS(MEM(EA, 4))
Let the effective address (EA) be the sum (RA|0)+(RB).
The word in storage addressed by EA is loaded into
FRT32:63. FRT0:31 are filled with a copy of bit 0 of the
loaded word.
Special Registers Altered:
None
31 FRT RA RB 887 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
FRT ← ³²0 || MEM(EA, 4)
Let the effective address (EA) be the sum (RA|0)+(RB).
The word in storage addressed by EA is loaded into
FRT32:63. FRT0:31 are set to 0.
Special Registers Altered:
None
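
The two word loads above differ only in how FRT0:31 is filled. A minimal Python sketch of the two results, operating on the 32-bit word already fetched from storage (the function names are hypothetical):

```python
def load_word_sign_extended(word):
    """FRT32:63 = word; FRT0:31 = 32 copies of bit 0 of the loaded word."""
    word &= 0xFFFFFFFF
    high = 0xFFFFFFFF if (word >> 31) & 1 else 0
    return (high << 32) | word


def load_word_zero_extended(word):
    """FRT32:63 = word; FRT0:31 = 0."""
    return word & 0xFFFFFFFF
```

Neither operation converts the word to a floating-point format; the 64-bit result is an integer image placed in the FPR.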
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
MEM(EA, 4) ← SINGLE((FRS))
Let the effective address (EA) be the sum (RA|0)+D.
The contents of register FRS are converted to single
format (see page 145) and stored into the word in storage
addressed by EA.
Special Registers Altered:
None

if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 4) ← SINGLE((FRS))
Let the effective address (EA) be the sum (RA|0)+(RB).
The contents of register FRS are converted to single
format (see page 145) and stored into the word in storage
addressed by EA.
Special Registers Altered:
None
Store Floating-Point Single with Update D-form
stfsu FRS,D(RA)

Store Floating-Point Single with Update Indexed X-form
stfsux FRS,RA,RB
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(D)
MEM(EA, 8) ← (FRS)
Let the effective address (EA) be the sum (RA|0)+D.
The contents of register FRS are stored into the doubleword
in storage addressed by EA.
Special Registers Altered:
None

if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 8) ← (FRS)
Let the effective address (EA) be the sum (RA|0)+(RB).
The contents of register FRS are stored into the doubleword
in storage addressed by EA.
Special Registers Altered:
None
Store Floating-Point Double with Update D-form
stfdu FRS,D(RA)

Store Floating-Point Double with Update Indexed X-form
stfdux FRS,RA,RB
31 FRS RA RB 983 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 4) ← (FRS)32:63
Let the effective address (EA) be the sum (RA|0)+(RB).
(FRS)32:63 are stored, without conversion, into the word
in storage addressed by EA.
If the contents of register FRS were produced, either
directly or indirectly, by a Load Floating-Point Single
instruction, a single-precision Arithmetic instruction, or
frsp, then the value stored is undefined. (The contents
of register FRS are produced directly by such an
instruction if FRS is the target register for the instruc-
tion. The contents of register FRS are produced indi-
rectly by such an instruction if FRS is the final target
register of a sequence of one or more Floating-Point
Move instructions, with the input to the sequence hav-
ing been produced directly by such an instruction.)
Special Registers Altered:
None
Load Floating-Point Double Pair DS-form
lfdp FRTp,DS(RA)
57 FRTp RA DS 00
0 6 11 16 30 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(DS||0b00)
FRTpeven ← MEM(EA, 8)
FRTpodd ← MEM(EA+8, 8)
Let the effective address (EA) be the sum (RA|0) +
(DS||0b00).
The doubleword in storage addressed by EA is placed
into the even-numbered register of FRTp.
The doubleword in storage addressed by EA+8 is
placed into the odd-numbered register of FRTp.
If FRTp is odd, the instruction form is invalid.
Special Registers Altered:
None

Load Floating-Point Double Pair Indexed X-form
lfdpx FRTp,RA,RB
31 FRTp RA RB 791 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
FRTpeven ← MEM(EA, 8)
FRTpodd ← MEM(EA+8, 8)
Let the effective address (EA) be the sum (RA|0) +
(RB).
The doubleword in storage addressed by EA is placed
into the even-numbered register of FRTp.
The doubleword in storage addressed by EA+8 is
placed into the odd-numbered register of FRTp.
If FRTp is odd, the instruction form is invalid.
Special Registers Altered:
None

Store Floating-Point Double Pair DS-form
stfdp FRSp,DS(RA)
61 FRSp RA DS 00
0 6 11 16 30 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + EXTS(DS||0b00)
MEM(EA, 8) ← FRSpeven
MEM(EA+8, 8) ← FRSpodd
Let the effective address (EA) be the sum (RA|0) +
(DS||0b00).
The contents of the even-numbered register of FRSp
are stored into the doubleword in storage addressed by
EA.
The contents of the odd-numbered register of FRSp are
stored into the doubleword in storage addressed by
EA+8.
If FRSp is odd, the instruction form is invalid.
Special Registers Altered:
None

Store Floating-Point Double Pair Indexed X-form
stfdpx FRSp,RA,RB
31 FRSp RA RB 919 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
MEM(EA, 8) ← FRSpeven
MEM(EA+8, 8) ← FRSpodd
Let the effective address (EA) be the sum (RA|0) +
(RB).
The contents of the even-numbered register of FRSp
are stored into the doubleword in storage addressed by
EA.
The contents of the odd-numbered register of FRSp are
stored into the doubleword in storage addressed by
EA+8.
If FRSp is odd, the instruction form is invalid.
The contents of register FRB are placed into register
FRT.
Special Registers Altered:
CR1 (if Rc=1)

The contents of register FRB with bit 0 inverted are
placed into register FRT.
Special Registers Altered:
CR1 (if Rc=1)

The contents of register FRB with bit 0 set to zero are
placed into register FRT.
Special Registers Altered:
CR1 (if Rc=1)

The contents of register FRB with bit 0 set to the value
of bit 0 of register FRA are placed into register FRT.
Special Registers Altered:
CR1 (if Rc=1)

The contents of register FRB with bit 0 set to one are
placed into register FRT.
Special Registers Altered:
CR1 (if Rc=1)

if MSR.FP=0 then FP_Unavailable()
FPR[FRT].word[0] ← FPR[FRA].word[0]
FPR[FRT].word[1] ← FPR[FRB].word[0]
The contents of word element 0 of FPR[FRA] are placed
into word element 0 of FPR[FRT].
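
The move operations above act purely on the 64-bit register image: bit 0 is the sign bit, and no arithmetic or rounding is involved. The following Python sketch models each of them on integers holding the register contents. The function names are hypothetical, and the sketch assumes that bit 0 and word element 0 are the most significant bit and the most significant word of the register.

```python
MASK64 = (1 << 64) - 1
SIGN = 1 << 63  # bit 0 in the document's numbering (assumed MSB)


def move(frb):
    return frb & MASK64


def negate(frb):
    # Bit 0 inverted.
    return (frb ^ SIGN) & MASK64


def absolute(frb):
    # Bit 0 set to zero.
    return frb & ~SIGN & MASK64


def negative_absolute(frb):
    # Bit 0 set to one.
    return (frb | SIGN) & MASK64


def copy_sign(fra, frb):
    # Bit 0 taken from FRA, remaining bits from FRB.
    return ((fra & SIGN) | (frb & ~SIGN)) & MASK64


def merge_even_word(fra, frb):
    # word[0] of the result from word[0] of FRA, word[1] from word[0] of FRB.
    return (((fra >> 32) & 0xFFFFFFFF) << 32) | ((frb >> 32) & 0xFFFFFFFF)
```

Because only bit patterns are manipulated, these operations apply equally to NaNs, infinities, and zeros.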
The floating-point operand in register FRA is added to
the floating-point operand in register FRB.
If the most significant bit of the resultant significand is
not 1, the result is normalized. The result is rounded to
the target precision under control of the Floating-Point
Rounding Control field RN of the FPSCR and placed
into register FRT.
Floating-point addition is based on exponent comparison
and addition of the two significands. The exponents
of the two operands are compared, and the significand
accompanying the smaller exponent is shifted right,
with its exponent increased by one for each bit shifted,
until the two exponents are equal. The two significands
are then added or subtracted as appropriate, depending
on the signs of the operands, to form an intermediate
sum. All 53 bits of the significand as well as all three
guard bits (G, R, and X) enter into the computation.
If a carry occurs, the sum’s significand is shifted right
one bit position and the exponent is increased by one.
FPSCRFPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCRVE=1.
Special Registers Altered:
FPRF FR FI
FX OX UX XX
VXSNAN VXISI
CR1 (if Rc=1)

The floating-point operand in register FRB is subtracted
from the floating-point operand in register FRA.
If the most significant bit of the resultant significand is
not 1, the result is normalized. The result is rounded to
the target precision under control of the Floating-Point
Rounding Control field RN of the FPSCR and placed
into register FRT.
The execution of the Floating Subtract instruction is
identical to that of Floating Add, except that the contents
of FRB participate in the operation with the sign
bit (bit 0) inverted.
FPSCRFPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCRVE=1.
Special Registers Altered:
FPRF FR FI
FX OX UX XX
VXSNAN VXISI
CR1 (if Rc=1)
The floating-point operand in register FRA is multiplied
by the floating-point operand in register FRC.
If the most significant bit of the resultant significand is
not 1, the result is normalized. The result is rounded to
the target precision under control of the Floating-Point
Rounding Control field RN of the FPSCR and placed
into register FRT.
Floating-point multiplication is based on exponent addition
and multiplication of the significands.
FPSCRFPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCRVE=1.
Special Registers Altered:
FPRF FR FI
FX OX UX XX
VXSNAN VXIMZ
CR1 (if Rc=1)

The floating-point operand in register FRA is divided by
the floating-point operand in register FRB. The remainder
is not supplied as a result.
If the most significant bit of the resultant significand is
not 1, the result is normalized. The result is rounded to
the target precision under control of the Floating-Point
Rounding Control field RN of the FPSCR and placed
into register FRT.
Floating-point division is based on exponent subtraction
and division of the significands.
FPSCRFPRF is set to the class and sign of the result,
except for Invalid Operation Exceptions when
FPSCRVE=1 and Zero Divide Exceptions when
FPSCRZE=1.
Special Registers Altered:
FPRF FR FI
FX OX UX ZX XX
VXSNAN VXIDI VXZDZ
CR1 (if Rc=1)
Note
See the Notes that appear with fre[s].
Floating Test for software Divide X-form
ftdiv BF,FRA,FRB
63 BF // FRA FRB 128 /
0 6 9 11 16 21 31
Let e_a be the unbiased exponent of the double-precision
floating-point operand in register FRA.
Let e_b be the unbiased exponent of the double-precision
floating-point operand in register FRB.
fe_flag is set to 1 if any of the following conditions occurs.
 - The double-precision floating-point operand in register FRA is
   a NaN or an Infinity.
 - The double-precision floating-point operand in register FRB is
   a Zero, a NaN, or an Infinity.
 - e_b is less than or equal to -1022.
 - e_b is greater than or equal to 1021.
 - The double-precision floating-point operand in register FRA is
   not a zero and the difference, e_a - e_b, is greater than or
   equal to 1023.
 - The double-precision floating-point operand in register FRA is
   not a zero and the difference, e_a - e_b, is less than or equal
   to -1021.
 - The double-precision floating-point operand in register FRA is
   not a zero and e_a is less than or equal to -970.
Otherwise fe_flag is set to 0.
fg_flag is set to 1 if either of the following conditions occurs.
 - The double-precision floating-point operand in register FRA is
   an Infinity.
 - The double-precision floating-point operand in register FRB is
   a Zero, an Infinity, or a denormalized value.
Otherwise fg_flag is set to 0.
If the implementation guarantees a relative error of fre[s][.] of
less than or equal to 2^-14, then fl_flag is set to 1. Otherwise
fl_flag is set to 0.
CR field BF is set to the value
fl_flag || fg_flag || fe_flag || 0b0.
Special Registers Altered:
CR field BF

Floating Test for software Square Root X-form
ftsqrt BF,FRB
63 BF // /// FRB 160 /
0 6 9 11 16 21 31
Let e_b be the unbiased exponent of the double-precision
floating-point operand in register FRB.
fe_flag is set to 1 if either of the following conditions occurs.
 - The double-precision floating-point operand in register FRB is
   a zero, a NaN, or an infinity, or a negative value.
 - e_b is less than or equal to -970.
Otherwise fe_flag is set to 0.
fg_flag is set to 1 if the following condition occurs.
 - The double-precision floating-point operand in register FRB is
   a Zero, an Infinity, or a denormalized value.
Otherwise fg_flag is set to 0.
If the implementation guarantees a relative error of
frsqrte[s][.] of less than or equal to 2^-14, then fl_flag is set
to 1. Otherwise fl_flag is set to 0.
CR field BF is set to the value
fl_flag || fg_flag || fe_flag || 0b0.
Special Registers Altered:
CR field BF

Programming Note
ftdiv and ftsqrt are provided to accelerate software emulation of
divide and square root operations, by performing the requisite
special case checking. Software needs only a single branch, on
FE=1 (in CR[BF]), to a special case handler. FG and FL may
provide further acceleration opportunities.
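
The ftsqrt flag rules can be modelled in a few lines. The following is a minimal Python sketch, not an authoritative model: the function names are hypothetical, fl_flag is passed in because it depends on the implementation, and the unbiased exponent is assumed to be the raw exponent field minus 1023 (so zeros and denormalized values report -1023).

```python
import math
import struct


def _fields(x: float):
    """Return (biased_exponent, fraction) of an IEEE 754 double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


def ftsqrt_cr_field(frb: float, fl_flag: int = 0) -> int:
    """Hypothetical model of the 4-bit value ftsqrt places in CR field BF."""
    biased, fraction = _fields(frb)
    e_b = biased - 1023  # assumption: exponent field minus the bias
    is_zero = frb == 0.0
    is_denormal = biased == 0 and fraction != 0

    # fe_flag: zero, NaN, infinity, negative value, or e_b <= -970
    fe_flag = int(is_zero or math.isnan(frb) or math.isinf(frb)
                  or frb < 0.0 or e_b <= -970)
    # fg_flag: zero, infinity, or denormalized value
    fg_flag = int(is_zero or math.isinf(frb) or is_denormal)

    # fl_flag || fg_flag || fe_flag || 0b0
    return (fl_flag << 3) | (fg_flag << 2) | (fe_flag << 1)
```

A caller would branch to its special-case handler whenever bit 0b0010 (FE) of the returned value is set, which mirrors the single branch mentioned in the Programming Note.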
The floating-point operand in register FRB is rounded to
single-precision, using the rounding mode specified by RN, and
placed into register FRT.
The rounding is described fully in Section A.1, “Floating-Point
Round to Single-Precision Model” on page 779.
FPRF is set to the class and sign of the result, except for
Invalid Operation Exceptions when VE=1.
Special Registers Altered:
FPRF FR FI
FX OX UX XX VXSNAN
CR1 (if Rc=1)

Let src be the double-precision floating-point value in FRB.
If src is a NaN, then the result is 0x8000_0000_0000_0000, VXCVI
is set to 1, and, if src is an SNaN, VXSNAN is set to 1.
Otherwise, src is rounded to a floating-point integer using the
rounding mode specified by RN.
If the rounded value is greater than 2^63-1, then the result is
0x7FFF_FFFF_FFFF_FFFF and VXCVI is set to 1.
Otherwise, if the rounded value is less than -2^63, then the
result is 0x8000_0000_0000_0000 and VXCVI is set to 1.
Otherwise, the result is the rounded value converted to 64-bit
signed-integer format, and XX is set to 1 if the result is
inexact.
If an enabled Invalid Operation Exception does not occur, then
the result is placed into FRT.
The conversion is described fully in Section A.2, “Floating-Point
Convert to Integer Model” on page 783.
Except for enabled Invalid Operation Exceptions, FPRF is
undefined. FR is set if the result is incremented when rounded.
FI is set if the result is inexact.
Special Registers Altered:
FPRF (undefined) FR FI
FX XX VXSNAN VXCVI
CR1 (if Rc=1)
Let src be the double-precision floating-point value in FRB.
If src is a NaN, then the result is 0x8000_0000_0000_0000, VXCVI
is set to 1, and, if src is an SNaN, VXSNAN is set to 1.
Otherwise, src is rounded to a floating-point integer using the
rounding mode Round toward Zero.
If the rounded value is greater than 2^63-1, then the result is
0x7FFF_FFFF_FFFF_FFFF and VXCVI is set to 1.
Otherwise, if the rounded value is less than -2^63, then the
result is 0x8000_0000_0000_0000 and VXCVI is set to 1.
Otherwise, the result is the rounded value converted to 64-bit
signed-integer format, and XX is set to 1 if the result is
inexact.
If an enabled Invalid Operation Exception does not occur, then
the result is placed into FRT.
The conversion is described fully in Section A.2, “Floating-Point
Convert to Integer Model” on page 783.
Except for enabled Invalid Operation Exceptions, FPRF is
undefined. FR is set if the result is incremented when rounded.
FI is set if the result is inexact.
Special Registers Altered:
FPRF (undefined) FR FI
FX XX VXSNAN VXCVI
CR1 (if Rc=1)

Let src be the double-precision floating-point value in FRB.
If src is a NaN, then the result is 0x0000_0000_0000_0000, VXCVI
is set to 1, and, if src is an SNaN, VXSNAN is set to 1.
Otherwise, src is rounded to a floating-point integer using the
rounding mode specified by RN.
If the rounded value is greater than 2^64-1, then the result is
0xFFFF_FFFF_FFFF_FFFF, and VXCVI is set to 1.
Otherwise, if the rounded value is less than 0, then the result
is 0x0000_0000_0000_0000, and VXCVI is set to 1.
Otherwise, the result is the rounded value converted to 64-bit
unsigned-integer format, and XX is set to 1 if the result is
inexact.
If an enabled Invalid Operation Exception does not occur, then
the result is placed into FRT.
The conversion is described fully in Section A.2, “Floating-Point
Convert to Integer Model” on page 783.
Except for enabled Invalid Operation Exceptions, FPRF is
undefined. FR is set if the result is incremented when rounded.
FI is set if the result is inexact.
Special Registers Altered:
FPRF (undefined) FR FI
FX XX VXSNAN VXCVI
CR1 (if Rc=1)
Let src be the double-precision floating-point value in FRB.
If src is a NaN, then the result is 0x0000_0000_0000_0000, VXCVI
is set to 1, and, if src is an SNaN, VXSNAN is set to 1.
Otherwise, src is rounded to a floating-point integer using the
rounding mode Round toward Zero.
If the rounded value is greater than 2^64-1, then the result is
0xFFFF_FFFF_FFFF_FFFF, and VXCVI is set to 1.
Otherwise, if the rounded value is less than 0, then the result
is 0x0000_0000_0000_0000, and VXCVI is set to 1.
Otherwise, the result is the rounded value converted to 64-bit
unsigned-integer format, and XX is set to 1 if the result is
inexact.
If an enabled Invalid Operation Exception does not occur, then
the result is placed into FRT.
The conversion is described fully in Section A.2, “Floating-Point
Convert to Integer Model” on page 783.
Except for enabled Invalid Operation Exceptions, FPRF is
undefined. FR is set if the result is incremented when rounded.
FI is set if the result is inexact.
Special Registers Altered:
FPRF (undefined) FR FI
FX XX VXSNAN VXCVI
CR1 (if Rc=1)

Let src be the double-precision floating-point value in FRB.
If src is a NaN, then the result is 0x8000_0000, VXCVI is set to
1, and, if src is an SNaN, VXSNAN is set to 1.
Otherwise, src is rounded to a floating-point integer using the
rounding mode specified by RN.
If the rounded value is greater than 2^31-1, then the result is
0x7FFF_FFFF, and VXCVI is set to 1.
Otherwise, if the rounded value is less than -2^31, then the
result is 0x8000_0000, and VXCVI is set to 1.
Otherwise, the result is the rounded value converted to 32-bit
signed-integer format, and XX is set to 1 if the result is
inexact.
If an enabled Invalid Operation Exception does not occur, then
the result is placed into FRT[32:63] and FRT[0:31] is undefined.
The conversion is described fully in Section A.2, “Floating-Point
Convert to Integer Model” on page 783.
Except for enabled Invalid Operation Exceptions, FPRF is
undefined. FR is set if the result is incremented when rounded.
FI is set if the result is inexact.
Special Registers Altered:
FPRF (undefined) FR FI
FX XX VXSNAN VXCVI
CR1 (if Rc=1)
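
The saturating behaviour of the Round toward Zero conversions can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the Section A.2 model: the function name and its return tuple (result bit pattern, VXCVI, XX) are hypothetical, only Round toward Zero is modelled, and FR, FI, VXSNAN and enabled exceptions are ignored.

```python
import math


def convert_toward_zero(src: float, bits: int = 64, signed: bool = True):
    """Return (result_bit_pattern, vxcvi, xx) for a truncating conversion."""
    mask = (1 << bits) - 1
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        nan_result = 1 << (bits - 1)  # 0x8000_..._0000
    else:
        lo, hi = 0, mask
        nan_result = 0

    if math.isnan(src):
        return nan_result, True, False
    if math.isinf(src):
        # math.trunc() cannot handle infinities, so saturate directly
        return (hi if src > 0 else lo) & mask, True, False

    rounded = math.trunc(src)  # Round toward Zero
    if rounded > hi:
        return hi & mask, True, False
    if rounded < lo:
        return lo & mask, True, False
    # In range: two's complement bit pattern; XX reports an inexact result
    return rounded & mask, False, rounded != src
```

For example, `convert_toward_zero(-1.0, signed=False)` saturates to 0 with VXCVI set, while `convert_toward_zero(-0.5, signed=False)` truncates to 0 and only reports an inexact result.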
Otherwise, the result is the rounded value converted to 32-bit
unsigned-integer format, and XX is set to 1 if the result is
inexact.

Special Registers Altered:
FPRF FR FI FX XX
CR1 (if Rc=1)
The 64-bit unsigned fixed-point operand in register FRB is
converted to an infinitely precise floating-point integer. The
result of the conversion is rounded to double-precision, using
the rounding mode specified by FPSCRRN, and placed into register
FRT.
The conversion is described fully in Section A.3, “Floating-Point
Convert from Integer Model”.
FPSCRFPRF is set to the class and sign of the result. FR is set
if the result is incremented when rounded. FPSCRFI is set if the
result is inexact.
Special Registers Altered:
FPRF FR FI
FX XX
CR1 (if Rc=1)

The 64-bit signed fixed-point operand in register FRB is
converted to an infinitely precise floating-point integer. The
result of the conversion is rounded to single-precision, using
the rounding mode specified by FPSCRRN, and placed into register
FRT.
The conversion is described fully in Section A.3, “Floating-Point
Convert from Integer Model”.
FPSCRFPRF is set to the class and sign of the result. FR is set
if the result is incremented when rounded. FPSCRFI is set if the
result is inexact.
Special Registers Altered:
FPRF FR FI
FX XX
CR1 (if Rc=1)

Programming Note
Converting an unsigned integer word to single-precision
floating-point can be accomplished by loading the word from
storage using Load Float Word and Zero Indexed and then using
fcfidus.
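
As an illustration of the unsigned 64-bit to double-precision conversion described above, the following Python sketch covers the Round to Nearest case only. It relies on CPython's int-to-float conversion, which rounds to nearest with ties to even. The function name is hypothetical, and the other rounding modes, FR, and FPRF are not modelled.

```python
def convert_unsigned_to_double(value: int):
    """Return (result, inexact) for a 64-bit unsigned integer, Round to Nearest."""
    if not 0 <= value < (1 << 64):
        raise ValueError("operand must fit in 64 unsigned bits")
    result = float(value)            # correctly rounded, ties to even
    inexact = int(result) != value   # corresponds to XX/FI being set
    return result, inexact
```

Integers up to 2^53 convert exactly; above that, neighbouring integers may collapse onto the same double, which is when the inexact indication would be raised.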
Floating Round to Integer Nearest X-form
frin FRT,FRB (Rc=0)
frin. FRT,FRB (Rc=1)
The floating-point operand in register FRB is rounded to an
integral value as follows, with the result placed into register
FRT. If the sign of the operand is positive, (FRB) + 0.5 is
truncated to an integral value, otherwise (FRB) - 0.5 is
truncated to an integral value.
FPSCRFPRF is set to the class and sign of the result, except for
Invalid Operation Exceptions when FPSCRVE = 1.
Special Registers Altered:
FPRF FR (set to 0) FI (set to 0)
FX
VXSNAN
CR1 (if Rc = 1)

Floating Round to Integer Plus X-form
frip FRT,FRB (Rc=0)
frip. FRT,FRB (Rc=1)
The floating-point operand in register FRB is rounded to an
integral value using the rounding mode round toward +infinity,
and the result is placed into register FRT.
FPSCRFPRF is set to the class and sign of the result, except for
Invalid Operation Exceptions when FPSCRVE = 1.
Special Registers Altered:
FPRF FR (set to 0) FI (set to 0)
FX
VXSNAN
CR1 (if Rc = 1)
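
The frin rule (add or subtract 0.5 according to the sign, then truncate) rounds ties away from zero, which differs from Python's built-in round(). The sketch below models frin and frip on Python floats. The function names are hypothetical stand-ins for the instructions, the ±0.5 step is assumed to be infinitely precise (hence the use of Fraction), and NaN quieting and FPSCR updates are not modelled.

```python
import math
from fractions import Fraction


def frin(x: float) -> float:
    """Round to an integral value: truncate x + 0.5 or x - 0.5 by sign."""
    if math.isnan(x) or math.isinf(x):
        return x  # simplification: special values pass through unchanged
    half = Fraction(1, 2)
    exact = Fraction(x)  # exact value of the double, so no early rounding
    if math.copysign(1.0, x) > 0:
        n = math.trunc(exact + half)
    else:
        n = math.trunc(exact - half)
    return math.copysign(float(n), x)  # keep the operand's sign on zero


def frip(x: float) -> float:
    """Round to an integral value toward +infinity."""
    if math.isnan(x) or math.isinf(x):
        return x
    return math.copysign(float(math.ceil(x)), x)
```

Using exact rational arithmetic avoids a pitfall of the naive `math.trunc(x + 0.5)`: for the largest double below 0.5 the floating-point sum already rounds up to 1.0.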
Floating Round to Integer Toward Zero X-form
friz FRT,FRB (Rc=0)
friz. FRT,FRB (Rc=1)
63 FRT /// FRB 424 Rc
0 6 11 16 21 31

Floating Round to Integer Minus X-form
frim FRT,FRB (Rc=0)
frim. FRT,FRB (Rc=1)
63 FRT /// FRB 488 Rc
0 6 11 16 21 31
This section gives examples of how the Floating Select instruction can be used to implement certain simple forms of
if-then-else constructions, without branching.
The examples show program fragments in an imaginary, C-like, high-level programming language, and the corre-
sponding program fragment using fsel and other Power ISA instructions. In the examples, a, b, x, y, and z are float-
ing-point variables, which are assumed to be in FPRs fa, fb, fx, fy, and fz. FPR fs is assumed to be available for
scratch space.
Warning: Care must be taken in using fsel if IEEE compatibility is required, or if the values being tested can be NaNs
or infinities; see Section .
High-level language:                  Power ISA:           Notes
if a ≥ 0.0 then x ← y else x ← z      fsel fx,fa,fy,fz     (1)

if a > 0.0 then x ← y else x ← z      fneg fs,fa           (1,2)
                                      fsel fx,fs,fz,fy

if a = 0.0 then x ← y else x ← z      fsel fx,fa,fy,fz     (1)
                                      fneg fs,fa
                                      fsel fx,fs,fx,fz

if a ≥ b then x ← y else x ← z        fsub fs,fa,fb        (4,5)
                                      fsel fx,fs,fy,fz

if a > b then x ← y else x ← z        fsub fs,fb,fa        (3,4,5)
                                      fsel fx,fs,fz,fy

if a = b then x ← y else x ← z        fsub fs,fa,fb        (4,5)
                                      fsel fx,fs,fy,fz
                                      fneg fs,fs
                                      fsel fx,fs,fx,fz
Notes:
The following Notes apply to the preceding examples and to the
corresponding cases using the other three arithmetic relations
(<, ≤, and ≠). They should also be considered when any other use
of fsel is contemplated.
In these Notes, the “optimized program” is the Power ISA program
shown, and the “unoptimized program” (not shown) is the
corresponding Power ISA program that uses fcmpu and Branch
Conditional instructions instead of fsel.
1. The unoptimized program affects the VXSNAN bit of the FPSCR,
   and therefore may cause the system error handler to be invoked
   if the corresponding exception is enabled, while the optimized
   program does not affect this bit. This property of the
   optimized program is incompatible with the IEEE standard.
2. The optimized program gives the incorrect result if a is a
   NaN.
3. The optimized program gives the incorrect result if a and/or b
   is a NaN (except that it may give the correct result in some
   cases for the minimum and maximum functions, depending on how
   those functions are defined to operate on NaNs).
4. The optimized program gives the incorrect result if a and b
   are infinities of the same sign. (Here it is assumed that
   Invalid Operation Exceptions are disabled, in which case the
   result of the subtraction is a NaN. The analysis is more
   complicated if Invalid Operation Exceptions are enabled,
   because in that case the target register of the subtraction is
   unchanged.)
5. The optimized program affects the OX, UX, XX, and VXISI bits
   of the FPSCR, and therefore may cause the system error handler
   to be invoked if the corresponding exceptions are enabled,
   while the unoptimized program does not affect these bits. This
   property of the optimized program is incompatible with the
   IEEE standard.
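
Note 4 can be reproduced with a small model of the “if a ≥ b” example. This is a sketch in Python, not Power ISA code: the function names are hypothetical, fsel is modelled as selecting its second source when the first is greater than or equal to zero (so a NaN selects the third), and Python float subtraction stands in for fsub with Invalid Operation Exceptions disabled.

```python
def fsel(fra: float, frc: float, frb: float) -> float:
    """Model of fsel FRT,FRA,FRC,FRB: FRC if FRA >= 0.0, otherwise FRB."""
    return frc if fra >= 0.0 else frb  # a NaN in fra fails the test


def select_ge_optimized(a: float, b: float, y: float, z: float) -> float:
    """Branch-free 'if a >= b then x <- y else x <- z' using fsub + fsel."""
    fs = a - b              # fsub fs,fa,fb
    return fsel(fs, y, z)   # fsel fx,fs,fy,fz


def select_ge_unoptimized(a: float, b: float, y: float, z: float) -> float:
    """Compare-and-branch version, the 'unoptimized program' of the Notes."""
    return y if a >= b else z
```

With `a = b = float("inf")` the subtraction yields a NaN, so the optimized version returns z while the compare-and-branch version returns y.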
The value of the U field is placed into FPSCR field BF+8×(1-W).
FPSCRFX is altered only if BF = 0 and W = 0.
Special Registers Altered:
FPSCR field BF + 8×(1-W)
CR1 (if Rc=1)

Programming Note
mtfsfi serves as both a basic and an extended mnemonic. The
Assembler will recognize a mtfsfi mnemonic with three operands as
the basic form, and a mtfsfi mnemonic with two operands as the
extended form. In the extended form the W operand is omitted and
assumed to be 0.

Programming Note
When FPSCR32:35 is specified, bits 32 (FX) and 35 (OX) are set to
the values of U0 and U3 (i.e., even if this instruction causes OX
to change from 0 to 1, FX is set from U0 and not by the usual
rule that FX is set to 1 when an exception bit changes from 0 to
1). Bits 33 and 34 (FEX and VX) are set according to the usual
rule, given on page 125, and not from U1:2.

The FPSCR is modified as specified by the FLM, L, and W fields.
L=0
The contents of register FRB are placed into the FPSCR under
control of the W field and the field mask specified by FLM. W and
the field mask identify the 4-bit fields affected. Let i be an
integer in the range 0-7. If FLMi=1 then FPSCR field k is set to
the contents of the corresponding field of register FRB, where
k = i+8×(1-W).
L=1
The contents of register FRB are placed into the FPSCR.
FPSCRFX is not altered implicitly by this instruction.
Special Registers Altered:
FPSCR fields selected by mask, L, and W
CR1 (if Rc=1)

Programming Note
mtfsf serves as both a basic and an extended mnemonic. The
Assembler will recognize a mtfsf mnemonic with four operands as
the basic form, and a mtfsf mnemonic with two operands as the
extended form. In the extended form the W and L operands are
omitted and both are assumed to be 0.
Programming Note
Updating fewer than eight fields of the FPSCR may
have substantially poorer performance on some
implementations than updating eight fields or all of
the fields.
Programming Note
If L=1 or if L=0 and FPSCR32:35 is specified, bits 32
(FX) and 35 (OX) are set to the values of (FRB)32
and (FRB)35 (i.e., even if this instruction causes OX
to change from 0 to 1, FX is set from (FRB)32 and
not by the usual rule that FX is set to 1 when an
exception bit changes from 0 to 1). Bits 33 and 34
(FEX and VX) are set according to the usual rule,
given on page 125, and not from (FRB)33:34.
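
The field selection for mtfsf with L=0 can be sketched as follows. This Python model is a simplification under stated assumptions: the function name is hypothetical, the FPSCR and FRB are treated as 64-bit integers with field k occupying bits 4k:4k+3 in the document's left-to-right bit numbering, FLM bit 0 is the leftmost bit of the 8-bit mask, and the special handling of FEX and VX described in the Programming Note above is not modelled.

```python
def mtfsf_l0(fpscr: int, frb: int, flm: int, w: int) -> int:
    """Copy the FPSCR fields selected by FLM and W from frb (L=0 case only)."""
    for i in range(8):
        if (flm >> (7 - i)) & 1:      # FLM_i, with i = 0 the leftmost mask bit
            k = i + 8 * (1 - w)       # FPSCR field number, 0..15
            shift = 4 * (15 - k)      # field k sits 4*(15-k) bits above bit 63
            mask = 0xF << shift
            fpscr = (fpscr & ~mask) | (frb & mask)
    return fpscr
```

With W=0 the mask addresses fields 8 to 15 (bits 32:63); with W=1 it addresses fields 0 to 7 (bits 0:31).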
Bit BT+32 of the FPSCR is set to 0.
Special Registers Altered:
FPSCR bit BT+32
CR1 (if Rc=1)

Bit BT+32 of the FPSCR is set to 1.
Special Registers Altered:
FPSCR bits BT+32 and FX
CR1 (if Rc=1)
5.1 Decimal Floating-Point (DFP) Facility Overview

This chapter describes the behavior of the decimal floating-point
facility, the supported data types, formats, and classes, and the
usage of registers. Also included are the execution model,
exceptions, and instructions supported by the decimal
floating-point facility.

The decimal floating-point (DFP) facility shares the 32
floating-point registers (FPRs) and the Floating-Point Status and
Control Register (FPSCR) with the floating-point (BFP) facility.
However, the interpretation of data formats in the FPRs, and the
meaning of some control and status bits in the FPSCR are
different between the BFP and DFP facilities.

The DFP facility also shares the Condition Register (CR) with the
fixed-point facility, the BFP facility, and the vector facility.

The DFP facility supports three DFP data formats: DFP Short
(single precision), DFP Long (double precision), and DFP Extended
(quad precision). Most operations are performed on DFP Long or
DFP Extended format directly. Support for DFP Short is limited to
conversion to and from DFP Long. Some DFP instructions operate on
other data types, including signed or unsigned binary fixed-point
data, and signed or unsigned decimal data.

DFP instructions are provided to perform arithmetic, compare,
test, quantum-adjustment, conversion, and format operations on
operands held in FPRs or FPR pairs.

Arithmetic instructions
These instructions perform addition, subtraction, multiplication,
and division operations.

Compare instructions
These instructions perform a comparison operation on the
numerical value of two DFP operands.

Test instructions
These instructions test the data class, the data group, the
exponent, or the number of significant digits of a DFP operand.

Quantum-adjustment instructions
These instructions convert a DFP number to a result in the form
that has the designated exponent, which may be explicitly or
implicitly specified.

Conversion instructions
These instructions perform conversion between different data
formats or data types.

Format instructions
These instructions facilitate composing or decomposing a DFP
operand.

These instructions are described in Section 5.6 “DFP Instruction
Descriptions” on page 194.

The three DFP data formats allow finite numbers to be represented
with different precision and ranges. Special codes are also
provided to represent +Infinity, -Infinity, Quiet NaN
(Not-a-Number), and Signaling NaN. Operations involving
infinities produce results obeying traditional mathematical
conventions. NaNs have no mathematical interpretation. The
encoding of NaNs provides a diagnostic information field. This
diagnostic field may be used to indicate such things as the
source of an uninitialized variable or the reason an invalid
result was produced.

The DFP processor recognizes a set of DFP exceptions which are
indicated via bits set in the FPSCR. Additionally, the DFP
exception actions depend on the setting of the various exception
enable bits in the FPSCR.

The following DFP exceptions are detected by the DFP processor.
The exception status bits in the FPSCR are indicated in
parentheses.
Invalid Operation Exception (VX)
  SNaN (VXSNAN)
  ∞-∞ (VXISI)
  ∞÷∞ (VXIDI)
  0÷0 (VXZDZ)
  ∞×0 (VXIMZ)
  Invalid Compare (VXVC)
57    Floating-Point Overflow Exception Enable (OE)
      See Section 5.5.10.3, “Overflow Exception” on page 189.
58    Floating-Point Underflow Exception Enable (UE)
      See Section 5.5.10.4, “Underflow Exception” on page 190.
59    Floating-Point Zero Divide Exception Enable (ZE)
      See Section 5.5.10.2, “Zero Divide Exception” on page 189.
60    Floating-Point Inexact Exception Enable (XE)
      See Section 5.5.10.5, “Inexact Exception” on page 191.
61    Reserved (not used by DFP)
62:63 Binary Floating-Point Rounding Control (RN)
      See Section 5.5.1, “Rounding” on page 182.
      00 Round to Nearest
      01 Round toward Zero
      10 Round toward +Infinity
      11 Round toward -Infinity

Result Flags
C < > = ?    Result Value Class
0 0 0 0 1    Signaling NaN (DFP only)
1 0 0 0 1    Quiet NaN
0 1 0 0 1    - Infinity
0 1 0 0 0    - Normal Number
1 1 0 0 0    - Subnormal Number
1 0 0 1 0    - Zero
0 0 0 1 0    + Zero
1 0 1 0 0    + Subnormal Number
0 0 1 0 0    + Normal Number
0 0 1 0 1    + Infinity

are zeros, including the sign bit. Negative numbers are
represented in two’s complement binary notation with a one in the
sign-bit position.

For decimal data, each byte contains a pair of four-bit nibbles;
each four-bit nibble contains a binary-coded-decimal (BCD) code.
There are two kinds of BCD codes: digit code and sign code. For
unsigned decimal data, all nibbles contain a digit code (D) as
shown in Figure 62.

D D D D ... D D D D
Figure 62. Format for Unsigned Decimal Data

For signed decimal data, the rightmost nibble contains a sign
code (S) and all other nibbles contain a digit code as shown in
Figure 63.

D D D D ... D D D S
Figure 63. Format for Signed Decimal Data

The decimal digits 0-9 have the binary encoding 0000-1001. The
preferred plus-sign codes are 1100 and 1111. The preferred minus
sign code is 1101. These are the sign codes generated for the
results of the Decode DPD To BCD instruction. A selection is
provided by this instruction to specify which of the two
preferred plus sign codes is to be generated. Alternate sign
codes are also recognized as valid in the sign position: 1010 and
1110 are alternate sign codes for plus, and 1011 is an alternate
sign code for minus. Alternate sign codes are accepted for any
source operand, but are not generated as a result by the
instruction. When an invalid digit or sign code is detected by
the Encode BCD To DPD instruction, an invalid-operation exception
occurs. A summary of digit and sign codes is provided in
Figure 64.

Binary    Recognized As
Code      Digit      Sign
0000      0          Invalid
0001      1          Invalid
0010      2          Invalid
0011      3          Invalid
0100      4          Invalid
0101      5          Invalid
0110      6          Invalid
0111      7          Invalid
1000      8          Invalid
1001      9          Invalid
1010      Invalid    Plus
1011      Invalid    Minus
1100      Invalid    Plus (preferred; option 1)
1101      Invalid    Minus (preferred)
1110      Invalid    Plus
1111      Invalid    Plus (preferred; option 2)
Figure 64. Summary of BCD Digit and Sign Codes

5.4 DFP Number Representation

The significand consists of a number of decimal digits, which are
to the left of the implied decimal point. The rightmost digit of
the significand is called the units digit. The numerical value of
a DFP finite number is represented as
(-1)^sign × significand × 10^exponent and the unit value of this
number is (1 × 10^exponent), which is called the quantum.

DFP finite numbers are not normalized. This allows leading zeros
and trailing zeros to exist in the significand. This unnormalized
DFP number representation allows some values to have redundant
forms; each form represents the DFP number with a different
combination of the significand value and the exponent value. For
example, 1000000 × 10^5 and 10 × 10^10 are two different forms of
the same numerical value. A form of this number representation
carries information about both the numerical value and the
quantum of a DFP finite number.

The significant digits of a DFP finite number are the digits in
the significand beginning with the leftmost nonzero digit and
ending with the units digit.

5.4.1 DFP Data Format

DFP numbers and NaNs may be represented in FPRs in any of the
three data formats: DFP Short, DFP Long, or DFP Extended. The
contents of each data format represent encoded information.
Special codes are assigned to NaNs and infinities. Different
formats support different sizes in both significand and exponent.
Arithmetic, compare, test, quantum-adjustment, and format
instructions are provided for DFP Long and DFP Extended formats
only.

The sign is encoded as a one bit binary value. Significand is
encoded as an unsigned decimal integer in two distinct parts. The
leftmost digit (LMD) of the significand is encoded as part of the
combination field; the remaining digits of the significand are
encoded in the trailing significand field. The exponent is
contained in the combination field in two parts. However, prior
to encoding, the exponent is converted to an unsigned binary
value called the biased exponent by adding a bias value which is
a constant for each format. The two leftmost bits of the biased
exponent are encoded with the leftmost digit of the significand
in the leftmost bits of the combination field. The rest of the
biased exponent occupies the remaining portion of the combination
field.

5.4.1.1 Fields Within the Data Format

The DFP data representation comprises three fields, as diagrammed
below for each of the three formats:

Figure 65. DFP Short format

S G T
0 1 14 63
Figure 66. DFP Long format

S G T
0 1 18 63
T (continued)
64 127
Figure 67. DFP Extended format

The fields are defined as follows:

Sign bit (S)
The sign bit is in bit 0 of each format, and is zero for plus and
one for minus.

Combination field (G)
As the name implies, this field provides a combination of the
exponent and the left-most digit (LMD) of the significand, for
finite numbers, or provides a special code
for denoting the value as either a Not-a-Number or an Infinity.

The first 5 bits of the combination field contain the encoding of
NaN or infinity, or the two leftmost bits of the biased exponent
and the leftmost digit (LMD) of the significand. The following
tables show the encoding:

G0:4          Description
11111         NaN
11110         Infinity
All others    Finite Number (see Figure 69)
Figure 68. Encoding of the G field for Special Symbols

        Leftmost 2 bits of biased exponent
LMD     00       01       10
0       00000    01000    10000
1       00001    01001    10001
2       00010    01010    10010
3       00011    01011    10011
4       00100    01100    10100
5       00101    01101    10101
6       00110    01110    10110
7       00111    01111    10111
8       11000    11010    11100
9       11001    11011    11101
Figure 69. Encoding of bits 0:4 of the G field for Finite Numbers

For DFP finite numbers, the rightmost N-5 bits of the N-bit
combination field contain the remaining bits of the biased
exponent. For NaNs, bit 5 of the combination field is used to
distinguish a Quiet NaN from a Signaling NaN; the remaining bits
in a source operand are ignored and they are set to zeros in a
target operand by most operations. For infinities, the rightmost
N-5 bits of the N-bit combination field of a source operand are
ignored and they are set to zeros in a target operand by most
operations.

Trailing Significand field (T)
For DFP finite numbers, this field contains the remaining
significand digits. For NaNs, this field may be used to contain
diagnostic information. For infinities, contents in this field of
a source operand are ignored and they are set to zeros in a
target operand by most operations. The trailing significand field
is a multiple of 10-bit blocks. The multiple depends on the
format. Each 10-bit block is called a declet and represents three
decimal digits, using the Densely Packed Decimal (DPD) encoding
defined in Appendix B.

5.4.1.2 Summary of DFP Data Formats

The properties of the three DFP formats are summarized in the
following table:

                                    DFP Short          DFP Long             DFP Extended
Widths (bits):
  Format                            32                 64                   128
  Sign (S)                          1                  1                    1
  Combination (G)                   11                 13                   17
  Trailing Significand (T)          20                 50                   110
Exponent:
  Maximum biased                    191                767                  12,287
  Maximum (Xmax)                    90                 369                  6111
  Minimum (Xmin)                    -101               -398                 -6176
  Bias                              101                398                  6176
Magnitude:
  Maximum normal number (Nmax)      (10^7 - 1) × 10^90 (10^16 - 1) × 10^369 (10^34 - 1) × 10^6111
  Minimum normal number (Nmin)      1 × 10^-95         1 × 10^-383          1 × 10^-6143
  Minimum subnormal number (Dmin)   1 × 10^-101        1 × 10^-398          1 × 10^-6176
Figure 70. Summary of DFP Formats
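
The exponent rows of Figure 70 can be cross-checked from the field widths and the bias. The Python sketch below is an illustrative consistency check, not part of the architecture: the names are hypothetical, the digit count assumes one leftmost digit plus three digits per declet, the maximum biased exponent assumes the two leftmost exponent bits are limited to 00, 01 and 10 (Figure 69), and the relation Nmin = 1 × 10^(Xmin + digits - 1) is an observation that matches the table rather than a rule stated in the text.

```python
# (G width in bits, T width in bits, bias), copied from Figure 70
FORMATS = {
    "DFP Short": (11, 20, 101),
    "DFP Long": (13, 50, 398),
    "DFP Extended": (17, 110, 6176),
}


def derive(g_bits: int, t_bits: int, bias: int) -> dict:
    """Derive the exponent limits of a DFP format from its field widths."""
    digits = 1 + 3 * (t_bits // 10)          # LMD plus three digits per declet
    max_biased = 3 * 2 ** (g_bits - 5) - 1   # leftmost two bits at most 0b10
    xmax = max_biased - bias
    xmin = -bias
    return {
        "digits": digits,
        "max_biased": max_biased,
        "xmax": xmax,
        "xmin": xmin,
        "nmin_exponent": xmin + digits - 1,  # exponent of 10 in Nmin
    }


if __name__ == "__main__":
    for name, widths in FORMATS.items():
        print(name, derive(*widths))
```

Running the script prints, for each format, values that can be compared line by line with the Exponent and Magnitude rows of the table.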
Data Class    Sign    Magnitude
Zero          ±       0*
Subnormal     ±       Dmin ≤ |X| < Nmin
Normal        ±       Nmin ≤ |X| ≤ Nmax
* The significand is zero and the exponent is any representable
  value
Figure 71. Value Ranges for Finite Number Data Classes

Data Class       S    G                       T
+Infinity        0    11110xxx . . . xxx      xxx . . . xxx
–Infinity        1    11110xxx . . . xxx      xxx . . . xxx
Quiet NaN        x    111110xx . . . xxx      xxx . . . xxx
Signaling NaN    x    111111xx . . . xxx      xxx . . . xxx
x Don’t care
Figure 72. Encoding of NaN and Infinity Data Classes

Zeros
Zeros have a zero significand and any representable value in the
exponent. A +0 is distinct from -0, and zeros with different
exponents are distinct, except that comparison treats them as
equal.

Subnormal Numbers
Subnormal numbers have values that are smaller than Nmin and
greater than zero in magnitude.

Normal Numbers
Normal numbers are nonzero finite numbers whose magnitude is
between Nmin and Nmax inclusively.

Infinities
Infinities are represented by 0b11110 in the leftmost 5 bits of
the combination field. When an operation is defined to generate
an infinity as the result, a default infinity is sometimes
supplied. A default infinity has all remaining bits in the
combination field and trailing significand field set to zeros.

When infinities are used as source operands, only the leftmost 5
bits of the combination field are interpreted (i.e., 0b11110
indicates the value is an infinity). The trailing significand
field of infinities is usually ignored. For generated infinities,
the leftmost 5 bits of the combination field are set to 0b11110
and all remaining combination bits are set to zero.

Infinities can participate in most arithmetic operations and give
a consistent result. In comparisons, any +Infinity compares
greater than any finite number, and any -Infinity compares less
than any finite number. All +Infinity are compared equal and all
-Infinity are compared equal.

Signaling and Quiet NaNs
There are two types of Not-a-Numbers (NaNs), Signaling (SNaN) and
Quiet (QNaN).

0b111110 in the leftmost 6 bits of the combination field
indicates a Quiet NaN, whereas 0b111111 indicates a Signaling
NaN.

A special QNaN is sometimes supplied as the default QNaN for a
disabled invalid-operation exception; it has a plus sign, the
leftmost 6 bits of the combination field set to 0b111110 and
remaining bits in the combination field and the trailing
significand field set to zero.

Normally, source QNaNs are propagated during operations so that
they will remain visible at the end. When a QNaN is propagated,
the sign is preserved, the decimal value of the trailing
significand field is preserved but reencoded using the preferred
DPD codes, and the contents in the rightmost N-6 bits of the
combination field are set to zero, where N is the width of the
combination field for the format.

A source SNaN generally causes an invalid-operation exception. If
the exception is disabled, the SNaN is converted to the
corresponding QNaN and propagated. The primary encoding
difference between an SNaN and a QNaN is that bit 5 of an SNaN is
1 and bit 5 of a QNaN is 0. When an SNaN is propagated as a QNaN,
bit 5 is set to 0, and, just as with QNaN propagation, the sign
is preserved, the decimal value of the trailing significand field
is preserved but reencoded using the preferred DPD codes, and the
contents in the rightmost N-6 bits of the combination field are
set to zero, where N is the width of the combination field for
the format. For some format-conversion instructions, a source
SNaN does not cause an invalid-operation exception, and an SNaN
is returned as the target operand.

For instructions with two source NaNs where a NaN is to be
propagated as the result, do the following.
 - If there is a QNaN in FRA and an SNaN in FRB, the SNaN in FRB
   is propagated.
 - Otherwise, propagate the NaN in FRA.

5.5 DFP Execution Model

DFP operations are performed as if they first produce an
intermediate result correct to infinite precision and with
unbounded range. The intermediate result is then rounded to the
destination’s precision according to one of the eight DFP
rounding modes. If the rounded result has only one form, it is
delivered as the final result; if the rounded result has
redundant forms, then an ideal exponent is used to select the
form of the final result. The ideal exponent determines the form,
not the value, of the final result. (See Section 5.5.3 “Formation
of Final Result” on page 184.)

5.5.1 Rounding

Rounding takes a number regarded as infinitely precise and, if
necessary, modifies it to fit the destination’s precision. The
destination’s precision of an operation defines the set of
permissible resultant values. For
most operations, the destination’s precision is the target-format precision and the permissible resultant values are those values representable in the target format. For some special operations, the destination precision is constrained by both the target format and some additional restrictions, and the permissible resultant values are a subset of the values representable in the target format.
Rounding sets FPSCR bits FR and FI. When an inexact exception occurs, FI is set to one; otherwise, FI is set to zero. When an inexact exception occurs and if the rounded result is greater in magnitude than the intermediate result, then FR is set to one; otherwise, FR is set to zero. The exception is the Round to FP Integer Without Inexact instruction, which always sets FR and FI to zero. Rounding may cause an overflow exception or underflow exception; it may also cause an inexact exception.
Refer to Figure 73 below for rounding. Let Z be the intermediate result of a DFP operation. Z may or may not fit in the destination’s precision. If Z is exactly one of the permissible representable resultant values, then the final result in all rounding modes is Z. Otherwise, either Z1 or Z2 is chosen to approximate the result, where Z1 and Z2 are the next larger and smaller permissible resultant values, respectively.
[Figure 73. Labels: “By increasing |Z|”, “Infinitely precise value”, “By decreasing |Z|”]
Round toward -∞
Choose Z2.
Round to Nearest, Ties away from 0
Choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the larger in magnitude (Z1 or Z2). However, an infinitely precise result with magnitude at least (Nmax + 0.5Q(Nmax)) is rounded to infinity with no change in sign; where Q(Nmax) is the quantum of Nmax.
Round to Nearest, Ties toward 0
Choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the smaller in magnitude (Z1 or Z2). However, an infinitely precise result with magnitude greater than (Nmax + 0.5Q(Nmax)) is rounded to infinity with no change in sign; where Q(Nmax) is the quantum of Nmax.
Round away from 0
Choose the larger in magnitude (Z1 or Z2).
Round to prepare for shorter precision
Choose the smaller in magnitude (Z1 or Z2). If the selected value is inexact and the units digit of the selected value is either 0 or 5, then the digit is incremented by one and the incremented result is delivered. In all other cases, the selected value is delivered. When a value has redundant forms, the units digit is determined by using the form that has the smallest exponent.
5.5.2 Rounding Mode Specification
Unless otherwise specified in the instruction definition, the rounding mode used by an operation is specified in the DFP rounding control (DRN) field of the FPSCR. The eight DFP rounding modes are encoded in the DRN field as specified in the table below.
bit. The following tables define the primary encoding and the secondary encoding.
The following table specifies the ideal exponent for each instruction.
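As an illustration of the rounding modes defined in Section 5.5.1, the following sketch uses Python's decimal module, whose rounding constants behave like the eight DFP modes on finite values. The mapping table (including the use of ROUND_05UP for Round to prepare for shorter precision) and the names DFP_MODES and round_to are assumptions made for this example only; the DRN field encodings and the overflow-to-infinity rules are not modeled.

```python
from decimal import (Decimal, ROUND_HALF_EVEN, ROUND_DOWN, ROUND_CEILING,
                     ROUND_FLOOR, ROUND_HALF_UP, ROUND_HALF_DOWN, ROUND_UP,
                     ROUND_05UP)

# Assumed correspondence between DFP rounding modes and decimal constants.
DFP_MODES = {
    "Round to Nearest, Ties to Even": ROUND_HALF_EVEN,
    "Round toward 0": ROUND_DOWN,
    "Round toward +inf": ROUND_CEILING,
    "Round toward -inf": ROUND_FLOOR,
    "Round to Nearest, Ties away from 0": ROUND_HALF_UP,
    "Round to Nearest, Ties toward 0": ROUND_HALF_DOWN,
    "Round away from 0": ROUND_UP,
    "Round to prepare for shorter precision": ROUND_05UP,
}


def round_to(value, quantum, mode):
    """Round value (given as a string) to the quantum under the named mode."""
    return Decimal(value).quantize(Decimal(quantum), rounding=DFP_MODES[mode])


if __name__ == "__main__":
    # Show how each mode treats a negative tie and a value just above 1.0.
    for name in DFP_MODES:
        tie = round_to("-2.5", "1", name)
        near = round_to("1.01", "0.1", name)
        print(f"{name:40s} {tie!s:>4} {near!s:>5}")
```

In the last mode, 1.01 truncates to 1.0, which is inexact and has a units digit of 0, so the digit is incremented and 1.1 is delivered, matching the rule stated for Round to prepare for shorter precision.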
5.5.5 Compare Operations
Two sets of instructions are provided for comparing numerical values: Compare Ordered and Compare Unordered. In the absence of NaNs, these instructions work the same. These instructions work differently when either of the following is true:
1. At least one source operand of the instruction is an SNaN and the invalid-operation exception is disabled.
2. When there is no SNaN in any source operand, at least one source operand of the instruction is a QNaN.
In case 1, Compare Unordered recognizes an invalid-operation exception and sets the FPSCRVXSNAN flag, but Compare Ordered recognizes the exception and sets both the FPSCRVXSNAN and FPSCRVXVC flags. In case 2, Compare Unordered does not recognize an exception, but Compare Ordered recognizes an invalid-operation exception and sets the FPSCRVXVC flag.
For finite numbers, comparisons are performed on values, that is, all redundant forms of a DFP number are treated as equal.
Comparisons are always exact and cannot cause an inexact exception.
Comparison ignores the sign of zero, that is, +0 equals -0.
Infinities with like sign compare equal, that is, +∞ equals +∞, and -∞ equals -∞.
A NaN compares as unordered with any other operand, whether a finite number, an infinity, or another NaN, including itself.
Execution of a compare instruction always completes, regardless of whether any DFP exception occurs or not, and whether the exception is enabled or not.
5.5.6 Test Operations
Four kinds of test operations are provided: Test Data Class, Test Data Group, Test Exponent, and Test Significance.
The Test Data Class instruction examines the contents of a source operand and determines if the operand is one of the specified data classes. The test result and the sign of the source operand are indicated in the FPSCRFPCC field and CR field BF.
The Test Data Group instruction examines the contents of a source operand and determines if the operand is one of the specified data groups. The test result and the sign of the source operand are indicated in the FPSCRFPCC field and CR field BF.
The Test Exponent instruction compares the exponent of the two source operands. The test operation ignores the sign and significand of operands. Infinities compare equal, and NaNs compare equal. The test result is indicated in the FPSCRFPCC field and CR field BF.
The Test Significance instruction compares the number of significant digits of one source operand with the referenced number of significant digits in another source operand. The test result is indicated in the FPSCRFPCC field and CR field BF.
Execution of a test instruction does not cause any DFP exception.
5.5.7 Quantum Adjustment Operations
Four kinds of quantum-adjustment operations are provided: Quantize, Quantize Immediate, Reround, and Round To FP Integer. Each of them has an immediate field which specifies whether the rounding mode in FPSCR or a different one is to be used.
The Quantize instruction is used to adjust a DFP number to the form that has the specified target exponent. The Quantize Immediate instruction is similar to the Quantize instruction, except that the target exponent is specified in a 5-bit immediate field as a signed binary integer and has a limited range.
The Reround instruction is used to simulate a DFP operation of a precision other than that of DFP Long or DFP Extended. For the Reround instruction to produce a result which accurately reflects that which would have resulted from a DFP operation of the desired precision d in the range {1: 33} inclusively, the following conditions must be met:
The precision of the preceding DFP operation must be at least one digit larger than d.
The rounding mode used by the preceding DFP operation must be round-to-prepare-for-shorter-precision.
The Round To FP Integer instruction is used to round a DFP number to an integer value of the same format. The target exponent is implicitly specified, and is greater than or equal to zero.
5.5.8 Conversion Operations
There are two kinds of conversion operations: data-format conversion and data-type conversion.
5.5.8.1 Data-Format Conversion
The instructions Convert To DFP Long and Convert To DFP Extended convert DFP operands to wider formats; the instructions Round To DFP Short and Round To DFP Long convert DFP operands to narrower formats.
When converting a finite number to a wider format, the result is exact. When converting a finite number to a
narrower format, the source operand is rounded to the target-format precision, which is specified by the instruction, not by the target register size.
When converting a finite number, the ideal exponent of the result is the source exponent.
Conversion of an infinity or NaN to a different format does not preserve the source combination field. Let N be the width of the target format’s combination field.
When the result is an infinity or a QNaN, the contents of the rightmost N-5 bits of the N-bit target combination field are set to zero.
When the result is an SNaN, bit 5 of the target format’s combination field is set to one and the rightmost N-6 bits of the N-bit target combination field are set to zero.
When converting a NaN to a wider format or when converting an infinity from DFP Short to DFP Long, digits in the source trailing significand field are reencoded using the preferred DPD codes with sufficient zeros appended on the left to form the target trailing significand field. When converting a NaN to a narrower format or when converting an infinity from DFP Long to DFP Short, the appropriate number of leftmost digits of the source trailing significand field are removed and the remaining digits of the field are reencoded using the preferred DPD codes to form the target trailing significand field.
When converting an infinity between DFP Long and DFP Extended, a default infinity with the same sign is produced.
When converting an SNaN between DFP Short and DFP Long, it is converted to an SNaN without causing an invalid-operation exception. When converting an SNaN between DFP Long and DFP Extended, the invalid-operation exception occurs; if the invalid-operation exception is disabled, the result is converted to the corresponding QNaN.
5.5.8.2 Data-Type Conversion
The instructions Convert From Fixed and Convert To Fixed are provided to convert a number between the DFP data type and the signed 64-bit binary-integer data type.
Conversion of a signed 64-bit binary integer to a DFP Extended number is always exact.
Conversion of a DFP number to a signed 64-bit binary integer results in an invalid-operation exception when the converted value does not fit into the target format, or when the source operand is an infinity or NaN. When the exception is disabled, the most positive integer is returned if the source operand is a positive number or +∞, and the most negative integer is returned if the source operand is a negative number, -∞, or NaN.
5.5.9 Format Operations
The format instructions are provided to facilitate composing or decomposing a DFP number, and consist of Encode BCD To DPD, Decode DPD To BCD, Extract Biased Exponent, Insert Biased Exponent, Shift Significand Left Immediate, and Shift Significand Right Immediate. A source operand of SNaN does not cause an invalid-operation exception, and an SNaN may be produced as the target operand.
5.5.10 DFP Exceptions
This architecture defines the following DFP exceptions:
Invalid Operation Exception
    SNaN
    ∞-∞
    ∞÷∞
    0÷0
    ∞×0
    Invalid Compare
    Invalid Conversion
Zero Divide Exception
Overflow Exception
Underflow Exception
Inexact Exception
These exceptions may occur during execution of a DFP instruction.
Each DFP exception, and each category of the Invalid Operation Exception, has an exception status bit in the FPSCR. In addition, each DFP exception has a corresponding enable bit in the FPSCR. The exception status bit indicates occurrence of the corresponding exception. If an exception occurs, the corresponding enable bit governs the result produced by the instruction and, in conjunction with the FE0 and FE1 bits (see the discussion of FE0 and FE1 below), whether and how the system floating-point enabled exception error handler is invoked. (In general, the enabling specified by the enable bit is of invoking the system error handler, not of permitting the exception to occur. The occurrence of an exception depends only on the instruction and its source operands, not on the setting of any control bits. The only deviation from this general rule is that the occurrence of an Underflow Exception may depend on the setting of the enable bit.)
A single instruction, other than mtfsfi or mtfsf, may set more than one exception bit only in the following cases:
Inexact Exception may be set with Overflow Exception.
Inexact Exception may be set with Underflow Exception.
Invalid Operation Exception (SNaN) may be set with Invalid Operation Exception (Invalid Compare) for Compare Ordered instructions.
Invalid Operation Exception (SNaN) may be set with Invalid Operation Exception (Invalid Conversion) for Convert To Fixed instructions.
When an exception occurs the instruction execution may be completed or partially completed, depending on the exception and the operation.
For all instructions, except for the Compare and Test instructions, the following exceptions cause the instruction execution to be partially completed. That is, setting of CR field 1 (when Rc=1) and exception status flags is performed, but no result is stored into the target FPR or FPR pair. For Compare and Test instructions, instruction execution is always completed, regardless of whether any DFP exception occurs or not, and whether the exception is enabled or not.
Enabled Invalid Operation
Enabled Zero Divide
For the remaining kinds of exceptions, instruction execution is completed, a result, if specified by the instruction, is generated and stored into the target FPR or FPR pair, and appropriate status flags are set. The result may be a different value for the enabled and disabled conditions for some of these exceptions. The kinds of exceptions that deliver a result in target FPR are the following:
Disabled Invalid Operation
Disabled Zero Divide
Disabled Overflow
Disabled Underflow
Disabled Inexact
Enabled Overflow
Enabled Underflow
Enabled Inexact
Subsequent sections define each of the DFP exceptions and specify the action that is taken when they are detected.
The IEEE standard specifies the handling of exceptional conditions in terms of “traps” and “trap handlers”. In this architecture, an FPSCR exception enable bit of 1 causes generation of the result value specified in the IEEE standard for the “trap enabled” case: the expectation is that the exception will be detected by software, which will revise the result. An FPSCR exception enable bit of 0 causes generation of the “default result” value specified for the “trap disabled” (or “no trap occurs” or “trap is not implemented”) case: the expectation is that the exception will not be detected by software, which will simply use the default result. The result to be delivered in each case for each exception is described in the sections below.
The IEEE default behavior when an exception occurs is to generate a default value and not to notify software. In this architecture, if the IEEE default behavior when an exception occurs is desired for all exceptions, all FPSCR exception enable bits should be set to zero and Ignore Exceptions Mode (see below) should be used. In this case the system floating-point enabled exception error handler is not invoked, even if DFP exceptions occur: software can inspect the FPSCR exception bits if necessary, to determine whether exceptions have occurred.
In this architecture, if software is to be notified that a given kind of exception has occurred, the corresponding FPSCR exception enable bit must be set to one and a mode other than Ignore Exceptions Mode must be used. In this case the system floating-point enabled exception error handler is invoked if an enabled DFP exception occurs. The system floating-point enabled exception error handler is also invoked if a Move To FPSCR instruction causes an exception bit and the corresponding enable bit both to be 1; the Move To FPSCR instruction is considered to cause the enabled exception.
The FE0 and FE1 bits control whether and how the system floating-point enabled exception error handler is invoked if an enabled DFP exception occurs. The location of these bits and the requirements for altering them are described in Book III, Power ISA Operating Environment Architecture. (The system floating-point enabled exception error handler is never invoked because of a disabled DFP exception.) The effects of the four possible settings of these bits are as follows.
FE0 FE1 Description
0   0   Ignore Exceptions Mode
        DFP exceptions do not cause the system floating-point enabled exception error handler to be invoked.
0   1   Imprecise Nonrecoverable Mode
        The system floating-point enabled exception error handler is invoked at some point at or beyond the instruction that caused the enabled exception. It may not be possible to identify the excepting instruction or the data that caused the exception. Results produced by the excepting instruction may have been used by or may have affected subsequent instructions that are executed before the error handler is invoked.
1   0   Imprecise Recoverable Mode
        The system floating-point enabled exception error handler is invoked at some point at or beyond the instruction that caused the enabled exception. Sufficient information is provided to the error handler that it can identify the excepting instruction and the operands, and correct the result. No results produced by the excepting instruction have been used by or have affected subsequent instructions that are executed before the error handler is invoked.
FE0 FE1 Description
1   1   Precise Mode
        The system floating-point enabled exception error handler is invoked precisely at the instruction that caused the enabled exception.
In all cases, the question of whether a DFP result is stored, and what value is stored, is governed by the FPSCR exception enable bits, as described in subsequent sections, and is not affected by the value of the FE0 and FE1 bits.
In all cases in which the system floating-point enabled exception error handler is invoked, all instructions before the instruction at which the system floating-point enabled exception error handler is invoked have completed, and no instruction after the instruction at which the system floating-point enabled exception error handler is invoked has begun execution. (Recall that, for the two Imprecise modes, the instruction at which the system floating-point enabled exception error handler is invoked need not be the instruction that caused the exception.) The instruction at which the system floating-point enabled exception error handler is invoked has not been executed unless it is the excepting instruction, in which case it has been executed if the exception is not among those listed on page 186 as suppressed.
Programming Note
In the ignore and both imprecise modes, a Floating-Point Status and Control Register instruction can be used to force any exceptions, due to instructions initiated before the Floating-Point Status and Control Register instruction, to be recorded in the FPSCR. (This forcing is superfluous for Precise Mode.)
In either of the Imprecise modes, a Floating-Point Status and Control Register instruction can be used to force any invocations of the system floating-point enabled exception error handler, due to instructions initiated before the Floating-Point Status and Control Register instruction, to occur. (This forcing has no effect in Ignore Exceptions Mode, and is superfluous for Precise Mode.)
In order to obtain the best performance across the widest range of implementations, the programmer should obey the following guidelines.
If the IEEE default results are acceptable to the application, Ignore Exceptions Mode should be used with all FPSCR exception enable bits set to zero.
If the IEEE default results are not acceptable to the application, Imprecise Nonrecoverable Mode should be used, or Imprecise Recoverable Mode if recoverability is needed, with FPSCR exception enable bits set to one for those exceptions for which the system floating-point enabled exception error handler is to be invoked.
Ignore Exceptions Mode should not, in general, be used when any FPSCR exception enable bits are set to one.
Precise Mode may degrade performance in some implementations, perhaps substantially, and therefore should be used only for debugging and other specialized applications.
5.5.10.1 Invalid Operation Exception
Definition
An Invalid Operation Exception occurs when an operand is invalid for the specified DFP operation. The invalid DFP operations are:
Any DFP operation on a signaling NaN (SNaN), except for Test, Round To DFP Short, Convert To DFP Long, Decode DPD To BCD, Extract Biased Exponent, Insert Biased Exponent, Shift Significand Left Immediate, and Shift Significand Right Immediate
For add or subtract operations, magnitude subtraction of infinities (+∞) + (-∞)
Division of infinity by infinity (∞ ÷ ∞)
Division of zero by zero (0 ÷ 0)
Multiplication of infinity by zero (∞ × 0)
Ordered comparison involving a NaN (Invalid Compare)
The Quantize operation detects that the significand associated with the specified target exponent would have more significant digits than the target-format precision
For the Quantize operation, when one source operand specifies an infinity and the other specifies a finite number
The Reround operation detects that the target exponent associated with the specified target significance would be greater than Xmax
The Encode BCD To DPD operation detects an invalid BCD digit or sign code
The Convert To Fixed operation involving a number too large in magnitude to be represented in the target format, or involving a NaN.
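The interaction between the FE0 and FE1 bits and the FPSCR enable bits, described earlier in Section 5.5.10, can be summarized in a small sketch. The names FE_MODES and handler_invoked are hypothetical, chosen for this example; the sketch models only whether the system floating-point enabled exception error handler is invoked, not when it is invoked or what result is stored.

```python
# (FE0, FE1) -> mode name, following the FE0/FE1 table in Section 5.5.10.
FE_MODES = {
    (0, 0): "Ignore Exceptions Mode",
    (0, 1): "Imprecise Nonrecoverable Mode",
    (1, 0): "Imprecise Recoverable Mode",
    (1, 1): "Precise Mode",
}


def handler_invoked(fe0, fe1, exception_bit, enable_bit):
    """Return True if the enabled-exception error handler would be invoked.

    The handler runs only for an enabled exception (status bit and enable
    bit both 1) and only in a mode other than Ignore Exceptions Mode.
    """
    mode = FE_MODES[(fe0, fe1)]
    enabled_exception = bool(exception_bit and enable_bit)
    return enabled_exception and mode != "Ignore Exceptions Mode"


if __name__ == "__main__":
    for bits, name in FE_MODES.items():
        print(name, handler_invoked(*bits, exception_bit=1, enable_bit=1))
```

A disabled exception never invokes the handler in any mode, which is why the guidelines pair Ignore Exceptions Mode with all enable bits set to zero.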
When Overflow Exception is enabled (FPSCROE=1) and overflow occurs, the following actions are taken:
1. Overflow Exception is set
   FPSCROX ← 1
2. The infinitely precise result is divided by 10^α. That is, the exponent adjustment α is subtracted from the exponent. This is called the wrapped result. The exponent adjustment for all operations, except for Round To DFP Short and Round To DFP Long, is 576 for DFP Long and 9216 for DFP Extended. For Round To DFP Short and Round To DFP Long, the exponent adjustment is 192 for the source format of DFP Long and 3072 for the source format of DFP Extended.
3. The wrapped result is rounded to the target-format precision. This is called the wrapped rounded result.
4. If the wrapped rounded result has only one form, it is the delivered result. If the wrapped rounded result has redundant forms and is exact, the result of the form that has the exponent closest to the wrapped ideal exponent is returned. If the wrapped rounded result has redundant forms and is inexact, the result of the form that has the smallest exponent is returned. The wrapped ideal exponent is the result of subtracting the exponent adjustment from the ideal exponent.
5. FPSCRFPRF is set to indicate the class and sign of the result (± Normal Number)
When Overflow Exception is disabled (FPSCROE=0) and overflow occurs, the following actions are taken:
1. Overflow Exception is set
   FPSCROX ← 1
2. Inexact Exception is set
   FPSCRXX ← 1
3. The result is determined by the rounding mode and the sign of the intermediate result as follows.
                                          Sign of intermediate result
Rounding Mode                             Plus      Minus
Round to Nearest, Ties to Even            +∞        -∞
Round toward 0                            +Nmax     -Nmax
Round toward +∞                           +∞        -Nmax
Round toward -∞                           +Nmax     -∞
Round to Nearest, Ties away from 0        +∞        -∞
Round to Nearest, Ties toward 0           +∞        -∞
Round away from 0                         +∞        -∞
Round to prepare for shorter precision    +Nmax     -Nmax
Figure 78. Overflow Results When Exception Is Disabled
4. The result is placed into the target FPR
5. FPSCRFR is set to one if the returned result is ±∞, and is set to zero if the returned result is ±Nmax
6. FPSCRFI is set to one
7. FPSCRFPRF is set to indicate the class and sign of the result (±∞ or ± Normal number)
5.5.10.4 Underflow Exception
Definition
Except for Reround, the following describes the handling of the IEEE underflow exception condition. The Reround operation does not recognize an underflow exception condition.
The Underflow Exception is defined differently for the enabled and disabled states. However, a tininess condition is recognized in both states when a result computed as though both the precision and exponent range were unbounded would be nonzero and less than the target format’s smallest normal number, Nmin, in magnitude.
Unless otherwise defined in the instruction description, an underflow exception occurs as follows:
Enabled:
When the tininess condition is recognized.
Disabled:
When the tininess condition is recognized and when the delivered result value differs from what would have been computed were both the precision and the exponent range unbounded.
Action
The action to be taken depends on the setting of the Underflow Exception Enable bit of the FPSCR.
When Underflow Exception is enabled (FPSCRUE=1) and underflow occurs, the following actions are taken:
1. Underflow Exception is set
   FPSCRUX ← 1
2. The infinitely precise result is multiplied by 10^α. That is, the exponent adjustment α is added to the exponent. This is called the wrapped result. The exponent adjustment for all operations, except for Round To DFP Short and Round To DFP Long, is 576 for DFP Long and 9216 for DFP Extended. For Round To DFP Short and Round To DFP Long, the exponent adjustment is 192 for the source format of DFP Long and 3072 for the source format of DFP Extended.
3. The wrapped result is rounded to the target-format precision. This is called the wrapped rounded result.
4. If the wrapped rounded result has only one form, it is the delivered result. If the wrapped rounded result has redundant forms and is exact, the result of the form that has the exponent closest to the
Definition
Except for Round to FP Integer Without Inexact, the fol-
lowing describes the handling of the IEEE inexact
exception condition. The Round to FP Integer Without
Inexact does not recognize an inexact exception condi-
tion.
An Inexact Exception occurs when either of two conditions occurs during rounding:
1. The delivered result differs from what would have
been computed were both the precision and expo-
nent range unbounded.
2. The rounded result overflows and Overflow Excep-
tion is disabled.
Action
The action to be taken does not depend on the setting
of the Inexact Exception Enable bit of the FPSCR.
When Inexact Exception occurs, the following actions
are taken:
1. Inexact Exception is set
FPSCRXX ← 1
2. The rounded or overflowed result is placed into the
target FPR
3. FPSCRFPRF is set to indicate the class and sign of
the result
Programming Note
In some implementations, enabling Inexact Excep-
tions may degrade performance more than does
enabling other types of floating-point exception.
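The status updates that accompany an inexact result can be sketched as follows. This is a minimal model under stated assumptions: the function name rounding_status is hypothetical, the FI and FR rules are taken from Section 5.5.1, the sticky behavior of XX follows the legend of Figure 79 (Part 2), and the special cases (Round to FP Integer Without Inexact, and overflow with the exception disabled) are not handled.

```python
from decimal import Decimal


def rounding_status(intermediate, rounded, old_xx):
    """Return (FI, FR, XX) after rounding `intermediate` to `rounded`.

    FI is 1 when the delivered result is inexact, FR is 1 when the result
    is inexact and was incremented in magnitude, and XX accumulates FI.
    """
    fi = int(rounded != intermediate)                   # inexact result
    fr = int(fi and abs(rounded) > abs(intermediate))   # magnitude increased
    xx = old_xx | fi                                    # XX is sticky
    return fi, fr, xx


if __name__ == "__main__":
    print(rounding_status(Decimal("1.26"), Decimal("1.3"), 0))
    print(rounding_status(Decimal("1.20"), Decimal("1.2"), 1))
```

The second call shows the sticky property: the operation itself is exact, so FI and FR are zero, but XX keeps the value left by an earlier inexact operation.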
                                 Result (r) when Rounding Mode Is
Range of v             Case      RNE    RNTZ   RNAZ   RAFZ   RTMI   RFSP   RTPI   RTZ
v < -Nmax, q < -Nmax   Overflow  -∞1    -∞1    -∞1    -∞1    -∞1    -Nmax  -Nmax  -Nmax
v < -Nmax, q = -Nmax   Normal    -Nmax  -Nmax  -Nmax  —      —      -Nmax  -Nmax  -Nmax
-Nmax ≤ v ≤ -Nmin      Normal    b      b      b      b      b      b      b      b
-Nmin < v ≤ -Dmin      Tiny      b*     b*     b*     b*     b*     b*     b      b
-Dmin < v < -Dmin/2    Tiny      -Dmin  -Dmin  -Dmin  -Dmin  -Dmin  -Dmin  -0     -0
v = -Dmin/2            Tiny      -0     -0     -Dmin  -Dmin  -Dmin  -Dmin  -0     -0
-Dmin/2 < v < 0        Tiny      -0     -0     -0     -Dmin  -Dmin  -Dmin  -0     -0
v = 0                  EZD       +0     +0     +0     +0     -0     +0     +0     +0
0 < v < +Dmin/2        Tiny      +0     +0     +0     +Dmin  +0     +Dmin  +Dmin  +0
v = +Dmin/2            Tiny      +0     +0     +Dmin  +Dmin  +0     +Dmin  +Dmin  +0
+Dmin/2 < v < +Dmin    Tiny      +Dmin  +Dmin  +Dmin  +Dmin  +0     +Dmin  +Dmin  +0
+Dmin ≤ v < +Nmin      Tiny      b*     b*     b*     b*     b      b*     b*     b
+Nmin ≤ v ≤ +Nmax      Normal    b      b      b      b      b      b      b      b
+Nmax < v, q = +Nmax   Normal    +Nmax  +Nmax  +Nmax  —      +Nmax  +Nmax  —      +Nmax
+Nmax < v, q > +Nmax   Overflow  +∞1    +∞1    +∞1    +∞1    +Nmax  +Nmax  +∞1    +Nmax
Explanation:
— This situation cannot occur.
1 The normal result r is considered to have been incremented.
* The rounded value, in the extreme case, may be Nmin. In this case, the exception conditions are underflow,
inexact, and incremented.
b The value derived when the precise result v is rounded to the destination’s precision, including both bounded
precision and bounded exponent range.
q The value derived when the precise result v is rounded to the destination’s precision, but assuming an
unbounded exponent range.
r This is the returned value when neither overflow nor underflow is enabled.
v Precise result before rounding, assuming unbounded precision and an unbounded exponent range. For
data-format conversion operations, v is the source value.
Dmin Smallest (in magnitude) representable subnormal number in the target format.
EZD The result r of the exact-zero-difference case applies only to ADD and SUBTRACT with both source operands
having opposite signs. (For ADD and SUBTRACT, when both source operands have the same sign, the sign of
the zero result is the same sign as the sign of the source operands.)
Nmax Largest (in magnitude) representable finite number in the target format.
Nmin Smallest (in magnitude) representable normalized number in the target format.
RAFZ Round away from 0.
RFSP Round to Prepare for Shorter Precision.
RNAZ Round to Nearest, Ties away from 0.
RNE Round to Nearest, Ties to even.
RNTZ Round to Nearest, Ties toward 0.
RTPI Round toward +∞.
RTMI Round toward -∞.
RTZ Round toward 0.
Figure 79. Rounding and Range Actions (Part 1)
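The two Overflow rows of Figure 79 (Part 1) reduce to a small decision on the rounding mode and the sign of the result. The sketch below encodes that decision using the mode abbreviations from the figure legend; the function name overflow_result and the symbolic return strings are assumptions made for this example.

```python
# Modes that deliver an infinity on overflow regardless of sign.
TO_INFINITY = {"RNE", "RNAZ", "RNTZ", "RAFZ"}


def overflow_result(mode, negative):
    """Return the symbolic result delivered when overflow is disabled."""
    sign = "-" if negative else "+"
    if mode in TO_INFINITY:
        return sign + "inf"
    if mode == "RTPI":                  # Round toward +infinity
        return "-Nmax" if negative else "+inf"
    if mode == "RTMI":                  # Round toward -infinity
        return "-inf" if negative else "+Nmax"
    if mode in ("RTZ", "RFSP"):         # toward 0, prepare for shorter precision
        return sign + "Nmax"
    raise ValueError(f"unknown rounding mode: {mode}")


if __name__ == "__main__":
    for mode in ("RNE", "RNTZ", "RNAZ", "RAFZ", "RTMI", "RFSP", "RTPI", "RTZ"):
        print(mode, overflow_result(mode, False), overflow_result(mode, True))
```

The directed modes deliver an infinity only when rounding moves the result in their own direction; otherwise the largest finite magnitude, Nmax, is delivered.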
          Is r                       Is r         Is q     Is q
          inexact                    Incremented  inexact  Incremented
Case      (r≠v)    OE=1  UE=1  XE=1  (|r|>|v|)    (q≠v)    (|q|>|v|)    Returned Results and Status Setting*
Overflow  Yes1     No    —     No    No           —        —            T(r), OX ← 1, FI ← 1, FR ← 0, XX ← 1
Overflow  Yes1     No    —     No    Yes          —        —            T(r), OX ← 1, FI ← 1, FR ← 1, XX ← 1
Overflow  Yes1     No    —     Yes   No           —        —            T(r), OX ← 1, FI ← 1, FR ← 0, XX ← 1, TX
Overflow  Yes1     No    —     Yes   Yes          —        —            T(r), OX ← 1, FI ← 1, FR ← 1, XX ← 1, TX
Overflow  Yes1     Yes   —     —     —            No       No1          Tw(q÷β), OX ← 1, FI ← 0, FR ← 0, TO
Overflow  Yes1     Yes   —     —     —            Yes      No           Tw(q÷β), OX ← 1, FI ← 1, FR ← 0, XX ← 1, TO
Overflow  Yes1     Yes   —     —     —            Yes      Yes          Tw(q÷β), OX ← 1, FI ← 1, FR ← 1, XX ← 1, TO
Normal    No       —     —     —     —            —        —            T(r), FI ← 0, FR ← 0
Normal    Yes      —     —     No    No           —        —            T(r), FI ← 1, FR ← 0, XX ← 1
Normal    Yes      —     —     No    Yes          —        —            T(r), FI ← 1, FR ← 1, XX ← 1
Normal    Yes      —     —     Yes   No           —        —            T(r), FI ← 1, FR ← 0, XX ← 1, TX
Normal    Yes      —     —     Yes   Yes          —        —            T(r), FI ← 1, FR ← 1, XX ← 1, TX
Tiny      No       —     No    —     —            —        —            T(r), FI ← 0, FR ← 0
Tiny      No       —     Yes   —     —            No1      No1          Tw(q•β), UX ← 1, FI ← 0, FR ← 0, TU
Tiny      Yes      —     No    No    No           —        —            T(r), UX ← 1, FI ← 1, FR ← 0, XX ← 1
Tiny      Yes      —     No    No    Yes          —        —            T(r), UX ← 1, FI ← 1, FR ← 1, XX ← 1
Tiny      Yes      —     No    Yes   No           —        —            T(r), UX ← 1, FI ← 1, FR ← 0, XX ← 1, TX
Tiny      Yes      —     No    Yes   Yes          —        —            T(r), UX ← 1, FI ← 1, FR ← 1, XX ← 1, TX
Tiny      Yes      —     Yes   —     —            No       No1          Tw(q•β), UX ← 1, FI ← 0, FR ← 0, TU
Tiny      Yes      —     Yes   —     —            Yes      No           Tw(q•β), UX ← 1, FI ← 1, FR ← 0, XX ← 1, TU
Tiny      Yes      —     Yes   —     —            Yes      Yes          Tw(q•β), UX ← 1, FI ← 1, FR ← 1, XX ← 1, TU
Explanation:
— The results do not depend on this condition.
1 This condition is true by virtue of the state of some condition to the left of this column.
* Rounding sets only the FI and FR status flags. Setting of the OX, XX, or UX flag is part of the exception actions. They
are listed here for reference.
β Wrap adjust, which depends on the type of operation and operand format. For all operations except Round to DFP
Short and Round to DFP Long, the wrap adjust depends on the target format: β = 10^α, where α is 576 for DFP Long,
and 9216 for DFP Extended. For Round to DFP Short and Round to DFP Long, the wrap adjust depends on the source
format: β = 10^κ, where κ is 192 for DFP Long and 3072 for DFP Extended.
q The value derived when the precise result v is rounded to the destination’s precision, but assuming an unbounded exponent
range.
r The result as defined in Part 1 of this figure.
v Precise result before rounding, assuming unbounded precision and unbounded exponent range.
FI Floating-Point-Fraction-Inexact status flag, FPSCRFI. This status flag is non-sticky.
FR Floating-Point-Fraction-Rounded status flag, FPSCRFR.
OX Floating-Point Overflow Exception status flag, FPSCROX.
TO The system floating-point enabled exception error handler is invoked for the overflow exception if the FE0 and FE1 bits
in the machine-state register are set to any mode other than the ignore-exception mode.
TU The system floating-point enabled exception error handler is invoked for the underflow exception if the FE0 and FE1 bits
in the machine-state register are set to any mode other than the ignore-exception mode.
TX The system floating-point enabled exception error handler is invoked for the inexact exception if the FE0 and FE1 bits in
the machine-state register are set to any mode other than the ignore-exception mode.
T(x) The value x is placed at the target operand location.
Tw(x) The wrapped rounded result x is placed at the target operand location. For all operations except data format
conversions, the wrapped rounded result is in the same format and length as normal results at the target location. For
data format conversions, the wrapped rounded result is in the same format and length as the source, but rounded to the
target-format precision.
UX Floating-Point-Underflow-Exception status flag, FPSCRUX.
XX Floating-Point-Inexact-Exception status flag, FPSCRXX. The flag is a sticky version of FPSCRFI. When FPSCRFI is set to a
new value, the new value of FPSCRXX is set to the result of ORing the old value of FPSCRXX with the new value of
FPSCRFI.
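The wrap adjust β used for enabled overflow and underflow amounts to adding or subtracting a fixed constant from the exponent. The sketch below works on exponents only; the names WRAP_EXPONENT and wrapped_exponent, and the boolean used to distinguish the Round To DFP Short and Round To DFP Long operations, are assumptions made for this example.

```python
# Exponent adjustment, keyed by (is Round To DFP Short/Long, format name).
WRAP_EXPONENT = {
    (False, "DFP Long"): 576,        # selected by target format
    (False, "DFP Extended"): 9216,
    (True, "DFP Long"): 192,         # selected by source format
    (True, "DFP Extended"): 3072,
}


def wrapped_exponent(exponent, fmt, overflow, round_to_narrower=False):
    """Return the exponent of the wrapped result.

    For an enabled overflow the result is divided by beta, so the
    adjustment is subtracted; for an enabled underflow the result is
    multiplied by beta, so the adjustment is added.
    """
    adjustment = WRAP_EXPONENT[(round_to_narrower, fmt)]
    return exponent - adjustment if overflow else exponent + adjustment


if __name__ == "__main__":
    print(wrapped_exponent(400, "DFP Long", overflow=True))
    print(wrapped_exponent(-400, "DFP Long", overflow=False))
```

An error handler that receives a wrapped rounded result can undo the adjustment by applying the same constant in the opposite direction.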
DFP Multiply [Quad] X-form
dmul FRT,FRA,FRB (Rc=0)
dmul. FRT,FRA,FRB (Rc=1)
59 FRT FRA FRB 34 Rc
0 6 11 16 21 31
invalid-operation exception, in which case the field remains unchanged.
Special Registers Altered:
FPRF FR FI
FX OX UX XX VXSNAN VXIMZ
CR1 (if Rc=1)
DFP Divide [Quad] X-form
ddiv FRT,FRA,FRB (Rc=0)
ddiv. FRT,FRA,FRB (Rc=1)
59 FRT FRA FRB 546 Rc
0 6 11 16 21 31
ddivq FRTp,FRAp,FRBp (Rc=0)
ddivq. FRTp,FRAp,FRBp (Rc=1)
63 FRTp FRAp FRBp 546 Rc
0 6 11 16 21 31
Figure 83 summarizes the actions for Divide. Figure 83 does not include the setting of the FPSCR.FPRF field. The FPSCR.FPRF field is always set to the class and sign of the result, except for enabled invalid-operation and enabled zero-divide exceptions, in which cases the field remains unchanged.
Special Registers Altered:
FPRF FR FI
FX OX UX ZX XX
VXSNAN VXIDI VXZDZ
CR1 (if Rc=1)
dcmpuq BF,FRAp,FRBp
dcmpoq BF,FRAp,FRBp
DFP Test Data Class [Quad] Z22-form
dtstdc BF,FRA,DCM
Let the DCM (Data Class Mask) field specify one or more of the 6 possible data classes, where each bit corresponds to a specific data class.
DCM Bit Data Class
0 Zero
1 Subnormal
2 Normal
3 Infinity
4 Quiet NaN
5 Signaling NaN
CR field BF and FPSCR.FPCC are set to indicate the sign of the DFP operand in FRA[p] and whether the data class of the DFP operand in FRA[p] matches any of the data classes specified by DCM.
Field Meaning
0000 Operand positive with no match
0010 Operand positive with match
1000 Operand negative with no match
1010 Operand negative with match
Special Registers Altered:
CR field BF
FPCC
DFP Test Data Group [Quad] Z22-form
dtstdg BF,FRA,DGM
Let the DGM (Data Group Mask) field specify one or more of the 6 possible data groups, where each bit corresponds to a specific data group.
The term extreme exponent means either the maximum exponent, Xmax, or the minimum exponent, Xmin.
DGM Bit Data Group
0 Zero with non-extreme exponent
1 Zero with extreme exponent
2 Subnormal or (Normal with extreme exponent)
3 Normal with non-extreme exponent and leftmost zero digit in significand
4 Normal with non-extreme exponent and leftmost nonzero digit in significand
5 Special symbol (Infinity, QNaN, or SNaN)
CR field BF and FPSCR.FPCC are set to indicate the sign of the DFP operand in FRA[p] and whether the data group of the DFP operand in FRA[p] matches any of the data groups specified by DGM.
Field Meaning
0000 Operand positive with no match
0010 Operand positive with match
1000 Operand negative with no match
1010 Operand negative with match
Special Registers Altered:
CR field BF
FPCC
dtstexq BF,FRAp,FRBp
Bit Description
0 Ea < Eb
1 Ea > Eb
2 Ea = Eb
3 Ea ? Eb
Special Registers Altered:
CR field BF
FPCC
DFP Test Significance [Quad] X-form
dtstsf BF,FRA,FRB
Let k be the contents of bits 58:63 of FPR[FRA], which specify the reference significance.
For dtstsf, let the value NSDb be the number of significant digits of the DFP value in FPR[FRB].
For dtstsfq, let the value NSDb be the number of significant digits of the DFP value in FPR[FRBp:FRBp+1].
For this instruction, the number of significant digits of the value 0 is considered to be zero.
NSDb is compared to k. The result of the compare is placed into CR field BF and the FPCC as follows.
Bit Description
0 k ≠ 0 and k < NSDb
1 k ≠ 0 and k > NSDb, or k = 0
2 k ≠ 0 and k = NSDb
3 k ? NSDb
Special Registers Altered:
CR field BF
FPCC
Actions for Test Significance when the operand in VSR[FRB] or VSR[FRBp:FRBp+1] is
F ∞ QNaN SNaN
C(k:NSDb) AuoB AuoB AuoB
Explanation:
C(k:NSDb) Algebraic comparison. See the table below.
F All finite numbers, including zeros.
AeqB CR field BF and FPCC are set to 0b0010.
AgtB CR field BF and FPCC are set to 0b0100.
AltB CR field BF and FPCC are set to 0b1000.
AuoB CR field BF and FPCC are set to 0b0001.
Relation of Value NSDb to Value k Action for C(k:NSDb)
k ≠ 0 and k = NSDb AeqB
k ≠ 0 and k < NSDb AltB
k ≠ 0 and k > NSDb, or k = 0 AgtB
Figure 87. Actions: Test Significance
DFP Test Significance Immediate [Quad]
dtstsfi BF,UIM,FRB
Let the value UIM specify the reference significance.
For dtstsfi, let the value NSDb be the number of significant digits of the DFP value in FPR[FRB].
For dtstsfiq, let the value NSDb be the number of significant digits of the DFP value in FPR[FRBp:FRBp+1].
For this instruction, the number of significant digits of the value 0 is considered to be zero.
NSDb is compared to UIM. The result of the compare is placed into CR field BF and the FPCC as follows.
Bit Description
0 UIM ≠ 0 and UIM < NSDb
1 UIM ≠ 0 and UIM > NSDb, or UIM = 0
2 UIM ≠ 0 and UIM = NSDb
3 UIM ? NSDb
Special Registers Altered:
CR field BF
FPCC
Actions for Test Significance when the operand in VSR[FRB] or VSR[FRBp:FRBp+1] is
F ∞ QNaN SNaN
C(UIM:NSDb) AuoB AuoB AuoB
Explanation:
C(UIM:NSDb) Algebraic comparison. See the table below.
F All finite numbers, including zeros.
AeqB CR field BF and FPCC are set to 0b0010.
AgtB CR field BF and FPCC are set to 0b0100.
AltB CR field BF and FPCC are set to 0b1000.
AuoB CR field BF and FPCC are set to 0b0001.
Relation of Value NSDb to Value UIM Action for C(UIM:NSDb)
UIM ≠ 0 and UIM = NSDb AeqB
UIM ≠ 0 and UIM < NSDb AltB
UIM ≠ 0 and UIM > NSDb, or UIM = 0 AgtB
Figure 88. Actions: Test Significance
Programming Note
The reference significance can be loaded into an FPR using a Load Float as Integer Word Algebraic instruction.
1.00 (100 x 10^-2) 9. (9 x 10^0) 9.00 (900 x 10^-2) 9.00 (900 x 10^-2)
1.00 (100 x 10^-2) 49.1234 (491234 x 10^-4) 49.12 (4912 x 10^-2) 49.12 (4912 x 10^-2)
1.00 (100 x 10^-2) 49.9876 (499876 x 10^-4) 49.98 (4998 x 10^-2) 49.99 (4999 x 10^-2)
0.01 (1 x 10^-2) 49.9876 (499876 x 10^-4) 49.98 (4998 x 10^-2) 49.99 (4999 x 10^-2)
1.0 (10 x 10^-1) 9999999999999999 (9999999999999999 x 10^0) QNaN QNaN
DFP Reround [Quad] Z23-form
drrnd FRT,FRA,FRB,RMC (Rc=0)
drrnd. FRT,FRA,FRB,RMC (Rc=1)
59 FRT FRA FRB RMC 35 Rc
0 6 11 16 21 23 31
If the exponent of the rounded result of the form that has the specified number of significant digits would be greater than Xmax, an invalid-operation exception (VXCVI) occurs. When the invalid-operation exception occurs, and if the exception is disabled, a default QNaN is returned. When an invalid-operation exception occurs, no inexact exception is recognized.
In the absence of an invalid-operation exception, if the result differs in value from the operand in FRB[p], an inexact exception is recognized.
This operation causes neither an overflow nor an underflow exception.
Figure 93 summarizes the actions for Reround. The table does not include the setting of the FPSCR.FPRF field. The FPSCR.FPRF field is always set to the class and sign of the result, except for an enabled invalid-operation exception, in which case the field remains unchanged.
Special Registers Altered:
FPRF FR FI
FX XX
VXSNAN VXCVI
CR1 (if Rc=1)
dxex f0,FRB # biased exponent of the original value
stfd f0,-16(r1)
drrnd f1,f13,FRB,1 # reround 1 digit toward 0
dxex f1,f1 # biased exponent of the rerounded value
stfd f1,-8(r1)
ld r11,-16(r1) # load both exponents into GPRs
ld r3,-8(r1)
subf r3,r11,r3
addi r3,r3,1
Given the value 412.34 the result is E(4 x 10^2) - E(41234 x 10^-2) + 1 = (398+2) - (398-2) + 1 = 400 - 396 + 1 = 5. Additional code is required to detect and handle special values like Subnormal, Infinity, and NaN.
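The digit-counting arithmetic in the 412.34 example can be checked with Python's standard decimal module. This is only a sketch of the arithmetic, not of the instruction: rounding to one digit with a precision-1 context stands in for drrnd with a reference significance of 1, and the constant and function names are chosen here for illustration. The bias of 398 is the DFP Long bias used in the example.

```python
from decimal import Context, Decimal, ROUND_DOWN

DFP_LONG_BIAS = 398  # exponent bias for DFP Long


def significant_digits(value: Decimal) -> int:
    """Count significant digits as E(rerounded) - E(original) + 1."""
    # Round to one significant digit toward zero (stand-in for the reround step).
    one_digit = Context(prec=1, rounding=ROUND_DOWN).plus(value)
    biased_rerounded = DFP_LONG_BIAS + one_digit.as_tuple().exponent
    biased_original = DFP_LONG_BIAS + value.as_tuple().exponent
    return biased_rerounded - biased_original + 1
```

As with the assembly sequence, zero, subnormal, infinite, and NaN inputs would need separate handling.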
Programming Note
DFP Reround combined with DFP Quantize can be used to left justify a value (as needed by the frexp function). FRB is the DFP value that we want to left justify; f13 contains the reference significance value 0x0000000000000001; and r1 is the stack pointer, with free space for a doubleword at offset -8. This doubleword is used to transfer the biased exponent from the FPR to a GPR, for integer computation. The adjusted biased exponent (+ format precision - 1) is transferred back into an FPR so it can be inserted into the rerounded value. The adjusted rerounded value becomes the quantize reference value. The quantize instruction returns the left justified result in FRT.
drrnd f1,f13,FRB,1 # reround 1 digit toward 0
dxex f0,f1 # extract biased exponent
stfd f0,-8(r1) # FPR to storage
ld r11,-8(r1) # storage to GPR
addi r11,r11,15 # biased exp + precision - 1
std r11,-8(r1) # GPR to storage
lfd f0,-8(r1) # storage to FPR
diex f1,f0,f1 # adjust exponent
dqua FRT,f1,f0,1 # quantize to adjusted exponent
Figure 95 summarizes the actions for Round To FP Integer Without Inexact. The table does not include the setting of the FPSCR.FPRF field. The FPSCR.FPRF field is always set to the class and sign of the result, except for an enabled invalid-operation, in which case the field remains unchanged.
Note that nearbyint is similar to the rint function but without raising the inexact exception. Similarly ceil, floor, round, and trunc do not require the inexact exception.
DFP Convert To DFP Long X-form
dctdp FRT,FRB (Rc=0)
dctdp. FRT,FRB (Rc=1)
The DFP short operand in bits 32:63 of FRB is converted to DFP long format and the converted result is placed into FRT. The sign of the result is the same as the sign of the source operand. The ideal exponent is the exponent of the source operand.
If the operand in FRB is an SNaN, it is converted to an SNaN in DFP long format and does not cause an invalid-operation exception.
Special Registers Altered:
FPRF FR (undefined) FI (undefined)
CR1 (if Rc=1)
Programming Note
Note that DFP short format is a storage-only format. Therefore, conversion of a short SNaN to long format will not cause an exception and the SNaN is preserved. Subsequent operation on that SNaN in long format will cause an exception.
DFP Convert To DFP Extended X-form
dctqpq FRTp,FRB (Rc=0)
dctqpq. FRTp,FRB (Rc=1)
The DFP long operand in FRB is converted to DFP extended format and placed into FRTp. The sign of the result is the same as the sign of the operand in FRB. The ideal exponent is the exponent of the operand in FRB.
If the operand in FRB is an SNaN, an invalid-operation exception is recognized. If the exception is disabled, the SNaN is converted to the corresponding QNaN in DFP extended format.
Special Registers Altered:
FPRF FR (set to 0) FI (set to 0)
FX
VXSNAN
CR1 (if Rc=1)
DFP Round To DFP Short X-form
drsp FRT,FRB (Rc=0)
drsp. FRT,FRB (Rc=1)
The DFP long operand in FRB is converted and rounded to DFP short format. The DFP short value is extended on the left with zeros to form a 64-bit entity and placed into FRT. The sign of the result is the same as the sign of the source operand. The ideal exponent is the exponent of the source operand.
If the operand in FRB is an SNaN, it is converted to an SNaN in DFP short format and does not cause an invalid-operation exception.
Normally, the result is in the format and length of the target. However, when an overflow or underflow exception occurs and if the exception is enabled, the operation is completed by producing a wrapped rounded result in the same format and length as the source but rounded to the target-format precision.
Special Registers Altered:
FPRF FR FI
FX OX UX XX
CR1 (if Rc=1)
Programming Note
Note that DFP short format is a storage-only format. Therefore, conversion of a long SNaN to short format will not cause an exception. Converting a long format SNaN to short format is an implied move operation.
DFP Round To DFP Long X-form
drdpq FRTp,FRBp (Rc=0)
drdpq. FRTp,FRBp (Rc=1)
The DFP extended operand in FRBp is converted and rounded to DFP long format. The result concatenated with 64 0s is placed in FRTp. The sign of the result is the same as the sign of the source operand. The ideal exponent is the exponent of the operand in FRBp.
If the operand in FRBp is an SNaN, an invalid-operation exception is recognized. If the exception is disabled, the SNaN is converted to the corresponding QNaN in DFP long format.
Normally, the result is in the format and length of the target. However, when an overflow or underflow exception occurs and if the exception is enabled, the operation is completed by producing a wrapped rounded result in the same format and length as the source but rounded to the target-format precision.
Special Registers Altered:
FPRF FR FI
FX OX UX XX
VXSNAN
CR1 (if Rc=1)
Programming Note
Note that DFP Round to DFP Long, while producing a result in DFP long format, actually targets a register pair, writing 64 0s in FRTp+1.
DFP Convert From Fixed X-form
dcffix FRT,FRB (Rc=0)
dcffix. FRT,FRB (Rc=1)
The 64-bit signed binary integer in FRB is converted and rounded to a DFP Long value and placed into FRT. The sign of the result is the same as the sign of the source operand. The ideal exponent is zero.
If the source operand is a zero, then a plus zero with a zero exponent is returned.
The FPSCR.FPRF field is set to the class and sign of the result.
Special Registers Altered:
FPRF FR FI
FX XX
CR1 (if Rc=1)
DFP Convert From Fixed Quad X-form
dcffixq FRTp,FRB (Rc=0)
dcffixq. FRTp,FRB (Rc=1)
DFP Convert To Fixed [Quad] X-form
dctfix FRT,FRB (Rc=0)
dctfix. FRT,FRB (Rc=1)
dctfixq FRT,FRBp (Rc=0)
dctfixq. FRT,FRBp (Rc=1)
63 FRT /// FRBp 290 Rc
0 6 11 16 21 31
The DFP operand in FRB[p] is rounded to an integer value and is placed into FRT in the 64-bit signed binary integer format. The sign of the result is the same as the sign of the source operand, except when the source operand is a NaN or a zero.
Figure 97 summarizes the actions for Convert To Fixed.
Special Registers Altered:
FPRF (undefined) FR FI
FX XX
VXSNAN VXCVI
CR1 (if Rc=1)
DFP Decode DPD To BCD [Quad] X-form
ddedpd SP,FRT,FRB (Rc=0)
ddedpd. SP,FRT,FRB (Rc=1)
A portion of the significand of the DFP operand in FRB[p] is converted to a signed or unsigned BCD number depending on the SP field. For infinity and NaN, the significand is considered to be the contents in the trailing significand field padded on the left by a zero digit.
SP₀ = 0 (unsigned conversion)
The rightmost 16 digits of the significand (32 digits for ddedpdq) are converted to an unsigned BCD number and the result is placed into FRT[p].
SP₀ = 1 (signed conversion)
The rightmost 15 digits of the significand (31 digits for ddedpdq) are converted to a signed BCD number with the same sign as the DFP operand, and the result is placed into FRT[p]. If the DFP operand is negative, the sign is encoded as 0b1101. If the DFP operand is positive, SP₁ indicates which preferred plus sign encoding is used. If SP₁ = 0, the plus sign is encoded as 0b1100 (the option-1 preferred sign code), otherwise the plus sign is encoded as 0b1111 (the option-2 preferred sign code).
Special Registers Altered:
CR1 (if Rc=1)
DFP Encode BCD To DPD [Quad] X-form
denbcd S,FRT,FRB (Rc=0)
denbcd. S,FRT,FRB (Rc=1)
The signed or unsigned BCD operand, depending on the S field, in FRB[p] is converted to a DFP number. The ideal exponent is zero.
S = 0 (unsigned BCD operand)
The unsigned BCD operand in FRB[p] is converted to a positive DFP number of the same magnitude and the result is placed into FRT[p].
S = 1 (signed BCD operand)
The signed BCD operand in FRB[p] is converted to the corresponding DFP number and the result is placed into FRT[p].
If an invalid BCD digit or sign code is detected in the source operand, an invalid-operation exception (VXCVI) occurs.
FPSCR.FPRF is set to the class and sign of the result, except for Invalid Operation Exception when FPSCR.VE=1.
Special Registers Altered:
FPRF FR (set to 0) FI (set to 0)
FX
VXCVI
CR1 (if Rc=1)
DFP Extract Biased Exponent [Quad] X-form
dxex FRT,FRB (Rc=0)
dxex. FRT,FRB (Rc=1)
The biased exponent of the operand in FRB[p] is extracted and placed into FRT in the 64-bit signed binary integer format. When the operand in FRB is an infinity, QNaN, or SNaN, a special code is returned.
Operand Result
Finite Number biased exponent value
Infinity -1
QNaN -2
SNaN -3
DFP Insert Biased Exponent [Quad] X-form
diex FRT,FRA,FRB (Rc=0)
diex. FRT,FRA,FRB (Rc=1)
Let a be the value of the 64-bit signed binary integer in FRA.
a Result
a > MBE (note 1) QNaN
MBE ≥ a ≥ 0 Finite number with biased exponent a
a = -1 Infinity
a = -2 QNaN
a = -3 SNaN
a < -3 QNaN
Note 1: MBE is the maximum biased exponent for the target format.
Programming Note
The exponent bias value is 101 for DFP Short, 398
for DFP Long, and 6176 for DFP Extended.
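The special codes used by Extract and Insert Biased Exponent can be modelled with a short Python sketch. The function names and string labels are hypothetical. The bias values are the ones quoted in the Programming Note; the maximum biased exponents (767 for DFP Long, 12287 for DFP Extended) are an assumption taken from the IEEE 754-2008 decimal64 and decimal128 exponent ranges rather than from this section.

```python
BIAS = {"long": 398, "extended": 6176}
# Assumption: maximum biased exponent (MBE) per format, from IEEE 754-2008.
MAX_BIASED_EXPONENT = {"long": 767, "extended": 12287}

SPECIAL_CODES = {"infinity": -1, "qnan": -2, "snan": -3}


def extract_biased_exponent(kind: str, exponent: int = 0, fmt: str = "long") -> int:
    """Model of the dxex result: a biased exponent, or a special code."""
    if kind in SPECIAL_CODES:
        return SPECIAL_CODES[kind]
    return exponent + BIAS[fmt]


def insert_result_class(a: int, fmt: str = "long") -> str:
    """Model of how diex interprets the signed integer a taken from FRA."""
    if a > MAX_BIASED_EXPONENT[fmt]:
        return "QNaN"
    if a >= 0:
        return "Finite"
    return {-1: "Infinity", -2: "QNaN", -3: "SNaN"}.get(a, "QNaN")
```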
Operand a in Actions for Insert Biased Exponent when operand b in FRB[p] specifies
FRA[p] specifies F ∞ QNaN SNaN
F N, Rb Z, Rb Z, Rb Z, Rb
∞ I, Rb I, Rb I, Rb I, Rb
QNaN Q, Rb Q, Rb Q, Rb Q, Rb
SNaN S, Rb S, Rb S, Rb S, Rb
Explanation:
F All finite numbers, including zeros
I The combination field in FRT[p] is set to indicate a default Infinity.
N The combination field in FRT[p] is set to the specified biased exponent in FRA and
the leftmost significand digit in FRB[p].
Q The combination field in FRT[p] is set to indicate a default QNaN.
S The combination field in FRT[p] is set to indicate a default SNaN.
Z The combination field in FRT[p] is set to indicate the specific biased exponent in FRA
and a leftmost coefficient digit of zero.
Rb The contents of the trailing significand field in FRB[p] are reencoded using preferred
DPD encodings and the reencoded result is placed in the same field in FRT[p]. The
sign bit of FRB[p] is copied into the sign bit in FRT[p].
Figure 98. Actions: Insert Biased Exponent
DFP Shift Significand Left Immediate DFP Shift Significand Right Immediate
[Quad] Z22-form [Quad] Z22-form
dscli FRT,FRA,SH (Rc=0) dscri FRT,FRA,SH (Rc=0)
dscli. FRT,FRA,SH (Rc=1) dscri. FRT,FRA,SH (Rc=1)
Mnemonic  Full Name  FORM  Operands  SNaN (Vs G)  Encoding  FPRF  FPCC  Exception (V Z O U X)  FR\FI  IE  Rc
dadd DFP Add X FRT, FRA, FRB Y N RE Y Y V O U X Y Y Y
daddq DFP Add Quad X FRTp, FRAp, FRBp Y N RE Y Y V O U X Y Y Y
dsub DFP Subtract X FRT, FRA, FRB Y N RE Y Y V O U X Y Y Y
dsubq DFP Subtract Quad X FRTp, FRAp, FRBp Y N RE Y Y V O U X Y Y Y
dmul DFP Multiply X FRT, FRA, FRB Y N RE Y Y V O U X Y Y Y
dmulq DFP Multiply Quad X FRTp, FRAp, FRBp Y N RE Y Y V O U X Y Y Y
ddiv DFP Divide X FRT, FRA, FRB Y N RE Y Y V Z O U X Y Y Y
ddivq DFP Divide Quad X FRTp, FRAp, FRBp Y N RE Y Y V Z O U X Y Y Y
dcmpo DFP Compare Ordered X BF, FRA, FRB Y - - N Y V - - N
dcmpoq DFP Compare Ordered Quad X BF, FRAp, FRBp Y - - N Y V - - N
dcmpu DFP Compare Unordered X BF, FRA, FRB Y - - N Y V - - N
dcmpuq DFP Compare Unordered Quad X BF, FRAp, FRBp Y - - N Y V - - N
dtstdc DFP Test Data Class Z22 BF, FRA, DCM N - - N Y 1 - - N
dtstdcq DFP Test Data Class Quad Z22 BF, FRAp, DCM N - - N Y 1 - - N
dtstdgq DFP Test Data Group Quad Z22 BF, FRAp, DGM N - - N Y 1 - - N
dtstex DFP Test Exponent X BF, FRA, FRB N - - N Y - - N
dtstexq DFP Test Exponent Quad X BF, FRAp, FRBp N - - N Y - - N
dtstsf DFP Test Significance X BF, FRA(FIX), FRB N - - N Y - - N
dtstsfq DFP Test Significance Quad X BF, FRA(FIX), FRBp N - - N Y - - N
dquai DFP Quantize Immediate Z23 TE, FRT, FRB, RMC Y N RE Y Y V X Y Y Y
dquaiq DFP Quantize Immediate Quad Z23 TE, FRTp, FRBp, RMC Y N RE Y Y V X Y Y Y
dqua DFP Quantize Z23 FRT,FRA,FRB,RMC Y N RE Y Y V X Y Y Y
dquaq DFP Quantize Quad Z23 FRTp,FRAp,FRBp, RMC Y N RE Y Y V X Y Y Y
drrnd DFP Reround Z23 FRT,FRA(FIX),FRB,RMC Y N RE Y Y V X Y Y Y
drrndq DFP Reround Quad Z23 FRTp, FRA(FIX), FRBp, RMC Y N RE Y Y V X Y Y Y
drintx DFP Round To FP Integer With Inexact Z23 R,FRT, FRB,RMC Y N RE Y Y V X Y Y Y
drintxq DFP Round To FP Integer With Inexact Quad Z23 R,FRTp,FRBp,RMC Y N RE Y Y V X Y Y Y
drintn DFP Round To FP Integer Without Inexact Z23 R,FRT, FRB,RMC Y N RE Y Y V Y# Y Y
Mnemonic  Full Name  FORM  Operands  SNaN (Vs G)  Encoding  FPRF  FPCC  Exception (V Z O U X)  FR\FI  IE  Rc
ddedpdq DFP Decode DPD To BCD Quad X SP, FRTp(BCD), FRBp N - - N N - - Y
denbcd DFP Encode BCD To DPD X S, FRT, FRB (BCD) - N RE Y Y V Y# Y Y
denbcdq DFP Encode BCD To DPD Quad X S, FRTp, FRBp (BCD) - N RE Y Y V Y# Y Y
dxex DFP Extract Biased Exponent X FRT (FIX), FRB N N - N N - - Y
dxexq DFP Extract Biased Exponent Quad X FRT (FIX), FRBp N N - N N - - Y
diex DFP Insert Biased Exponent X FRT, FRA(FIX), FRB N Y RE N N - Y Y
diexq DFP Insert Biased Exponent Quad X FRTp, FRA(FIX), FRBp N Y RE N N - Y Y
dscli DFP Shift Significand Left Immediate Z22 FRT,FRA,SH N Y RE N N - - Y
dscliq DFP Shift Significand Left Immediate Quad Z22 FRTp,FRAp,SH N Y RE N N - - Y
dscri DFP Shift Significand Right Immediate Z22 FRT,FRA,SH N Y RE N N - - Y
dscriq DFP Shift Significand Right Immediate Quad Z22 FRTp,FRAp,SH N Y RE N N - - Y
Explanation:
# FI and FR are set to zeros for these instructions.
- Not applicable.
1 A unique definition of the FPSCR.FPCC field is provided for the instruction.
2 These are the only instructions that may generate an SNaN and also set the FPSCR.FPRF field. Since the BFP FPSCR.FPRF field does not include a code for SNaN, these instructions cause the need for redefining the FPSCR.FPRF field for DFP.
DCM A 6-bit immediate operand specifying the data-class mask.
DGM A 6-bit immediate operand specifying the data-group mask.
G An SNaN can be generated as the target operand.
IE An ideal exponent is defined for the instruction.
FI Setting of the FPSCR.FI flag.
FR Setting of the FPSCR.FR flag.
N No.
O An overflow exception may be recognized.
Rc The record bit, Rc, is provided to record FPSCR bits 32:35 in CR field 1.
RE The trailing significand field is reencoded using preferred DPD encodings. The preferred DPD encodings are also used for propagated NaNs, or converted NaNs and infinities.
RMC A 2-bit immediate operand specifying the rounding-mode control.
S A one-bit immediate operand specifying if the operation is signed or unsigned.
SP A two-bit immediate operand: one bit specifies if the operation is signed or unsigned and, for signed operations, another bit specifies which preferred plus sign code is generated.
U An underflow exception may be recognized.
V An invalid-operation exception may be recognized.
Vs An input operand of SNaN causes an invalid-operation exception.
X An inexact exception may be recognized.
Y Yes.
U Undefined
Z A zero-divide exception may be recognized.
Figure 99. Decimal Floating-Point Instructions Summary (Continued)
x.bit[y]
Return the contents of bit y of x.
x.bit[y:z]
Return the contents of bits y:z of x.
x.nibble[y]
Return the contents of the 4-bit nibble element y of x.
x.nibble[y:z]
Return the contents of the nibble elements y:z of x.
x.byte[y]
Return the contents of byte element y of x.
x.byte[y:z]
Return the contents of byte elements y:z of x.
x.hword[y]
Return the contents of halfword element y of x.
x.hword[y:z]
Return the contents of halfword elements y:z of x.
x.word[y]
Return the contents of word element y of x.
+fp
Floating-point addition.
–fp
Floating-point subtraction.
×sui
Multiplication of a signed-integer (first operand) by an unsigned-integer (second operand).
×fp
Floating-point multiplication.
=int
Integer equals relation.
=fp
Floating-point equals relation.
<ui, ≤ui, >ui, ≥ui
Unsigned-integer comparison relations.
<si, ≤si, >si, ≥si
Signed-integer comparison relations.
<fp, ≤fp, >fp, ≥fp
Floating-point comparison relations.
LENGTH( x )
Length of x, in bits. If x is the word “element”, LENGTH(x) is the length, in bits, of the element implied by the instruction mnemonic.
x << y
Result of shifting x left by y bits, filling vacated bits with zeros.
b ← LENGTH(x)
result ← (y < b) ? (x.bit[y:b-1] || DUP(0b0,y)) : DUP(0b0,b)
x >>ui y
Result of shifting x right by y bits, filling vacated bits with zeros.
b ← LENGTH(x)
result ← (y < b) ? (DUP(0b0,y) || x.bit[0:(b-y)-1]) : DUP(0b0,b)
x >> y
Result of shifting x right by y bits, filling vacated bits with copies of bit 0 (sign bit) of x.
b ← LENGTH(x)
result ← (y < b) ? (DUP(x.bit[0],y) || x.bit[0:(b-y)-1]) : DUP(x.bit[0],b)
x <<< y
Result of rotating x left by y bits.
ConvertSItoBCD(x,y)
Let x be a signed integer quadword.
Let y indicate the preferred sign code.
Return the signed integer value x in packed decimal format.
if (x<0) then do
   x ← ¬x + 1
   sign ← 0x000D
end
else
   sign ← (y=0) ? 0x000C : 0x000F
result ← 0
shcnt ← 4
do while (x > 0)
   digit ← x % 10
   result ← result | (digit<<shcnt)
   x ← x ÷ 10
   shcnt ← shcnt + 4
end
return (result | sign)
ConvertBCDtoSI(x)
Let x be a packed decimal value.
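The ConvertSItoBCD pseudocode translates almost line for line into Python. The sketch below is illustrative: the function name is chosen here, and Python's unbounded integers stand in for the quadword, so no width limit is enforced.

```python
def convert_si_to_bcd(x: int, y: int) -> int:
    """Return signed integer x as packed decimal; y selects the plus sign code."""
    if x < 0:
        x = -x
        sign = 0xD                        # minus sign code
    else:
        sign = 0xC if y == 0 else 0xF     # preferred plus sign codes
    result = 0
    shcnt = 4                             # the low nibble is reserved for the sign
    while x > 0:
        digit = x % 10
        result |= digit << shcnt
        x //= 10
        shcnt += 4
    return result | sign
```

For instance, 123 with y=0 becomes the nibble sequence 1, 2, 3, C.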
if (x < y) then
   result ← y
   VSCR.SAT ← 1
else if (x > z) then
   result ← z
   VSCR.SAT ← 1
else result ← x
ConvertSPtoSXWsaturate(x, y)
Let x be a single-precision floating-point value.
Let y be an unsigned integer value.
sign ← x.bit[0]
exp ← x.bit[1:8]
frac.bit[0:22] ← x.bit[9:31]
frac.bit[23:30] ← 0b0000_0000
if (exp==255) & (frac!=0) then return (0x0000_0000) // NaN operand
if (exp==255) & (frac==0) then do // infinity operand
   VSCR.SAT ← 1
   return ((sign==1) ? 0x8000_0000 : 0x7FFF_FFFF)
end
if ((exp+y-127)>30) then do // large operand
   VSCR.SAT ← 1
   return ((sign==1) ? 0x8000_0000 : 0x7FFF_FFFF)
end
if ((exp+y-127)<0) then return (0x0000_0000) // -1.0 < value < 1.0 (value rounds to 0)
significand.bit[0] ← 0b1
significand.bit[1:31] ← frac
do i = 1 to 31-(exp+y-127)
   significand ← significand >>ui 1
end
return ((sign==0) ? significand : (¬significand + 1))
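A Python transcription of ConvertSPtoSXWsaturate may help when checking edge cases. It is a sketch, not a reference model: the function name is hypothetical, x is taken as a raw 32-bit pattern, and the VSCR.SAT update is returned as a boolean instead of being written to a register.

```python
def convert_sp_to_sxw_saturate(x: int, y: int):
    """Return (32-bit result, saturated) for single-precision bit pattern x scaled by 2**y."""
    sign = (x >> 31) & 1
    exp = (x >> 23) & 0xFF
    frac = (x & 0x7FFFFF) << 8            # 23 fraction bits followed by 8 zero bits
    if exp == 255 and frac != 0:          # NaN operand
        return 0x00000000, False
    limit = 0x80000000 if sign else 0x7FFFFFFF
    if exp == 255:                        # infinity operand
        return limit, True
    e = exp + y - 127
    if e > 30:                            # large operand
        return limit, True
    if e < 0:                             # magnitude below 1.0 rounds to 0
        return 0x00000000, False
    significand = (1 << 31) | frac        # implicit leading 1
    significand >>= 31 - e                # truncate toward zero
    if sign:
        significand = (-significand) & 0xFFFFFFFF
    return significand, False
```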
ConvertSPtoUXWsaturate(x, y)
Let x be a single-precision floating-point value.
Let y be an unsigned integer value.
sign ← x.bit[0]
exp ← x.bit[1:8]
frac.bit[0:22] ← x.bit[9:31]
frac.bit[23:30] ← 0b0000_0000
if (exp==255) & (frac!=0) then return (0x0000_0000) // NaN operand
if (exp==255) & (frac==0) then do // infinity operand
   VSCR.SAT ← 1
   return ((sign==1) ? 0x0000_0000 : 0xFFFF_FFFF)
end
if ((exp+y-127)>31) then do // large operand
   VSCR.SAT ← 1
   return ((sign==1) ? 0x0000_0000 : 0xFFFF_FFFF)
end
if ((exp+y-127)<0) then return (0x0000_0000) // -1.0 < value < 1.0 (value rounds to 0)
if (sign==1) then do // negative operand
   VSCR.SAT ← 1
   return (0x0000_0000)
end
significand.bit[0] ← 0b1
significand.bit[1:31] ← frac
do i = 1 to 31-(exp+y-127)
   significand ← significand >>ui 1
end
return (significand)
ConvertSXWtoSP(x)
Let x be a 32-bit signed integer value.
sign ← x.bit[0]
exp ← 32 + 127
frac.bit[0] ← x.bit[0]
frac.bit[1:32] ← x.bit[0:31]
if (frac==0) return (0x0000_0000) // Zero Operand
if (sign==1) then frac ← ¬frac + 1
do while (frac.bit[0]=0)
   frac ← frac << 1
   exp ← exp - 1
end
lsb ← frac.bit[23]
gbit ← frac.bit[24]
xbit ← frac.bit[25:32]!=0
inc ← (lsb & gbit) | (gbit & xbit)
frac.bit[0:23] ← frac.bit[0:23] + inc
if (carry_out=1) then exp ← exp + 1
result.bit[0] ← sign
result.bit[1:8] ← exp
result.bit[9:31] ← frac.bit[1:23]
return (result)
ConvertUXWtoSP(x)
Let x be a 32-bit unsigned integer value.
exp ← 31 + 127
frac ← x.bit[0:31]
if (frac==0) return (0x0000_0000) // Zero Operand
do while (frac.bit[0]==0)
   frac ← frac << 1
   exp ← exp - 1
end
lsb ← frac.bit[23]
gbit ← frac.bit[24]
xbit ← frac.bit[25:31]!=0
inc ← (lsb & gbit) | (gbit & xbit)
frac.bit[0:23] ← frac.bit[0:23] + inc
if (carry_out=1) then exp ← exp + 1
result.bit[0] ← 0b0
result.bit[1:8] ← exp
result.bit[9:31] ← frac.bit[1:23]
return (result)
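The lsb/gbit/xbit increment in ConvertUXWtoSP is a round-to-nearest-even step. The Python sketch below transcribes the pseudocode and compares it against the host's own integer-to-single conversion obtained through the standard struct module. Function names are hypothetical, and the comparison assumes the host rounds to nearest even, which is the usual default.

```python
import struct


def convert_uxw_to_sp(x: int) -> int:
    """Return the single-precision bit pattern for 32-bit unsigned integer x."""
    if x == 0:
        return 0x00000000
    exp = 31 + 127
    frac = x
    while (frac >> 31) & 1 == 0:          # normalize: bring the leading 1 to bit 0
        frac = (frac << 1) & 0xFFFFFFFF
        exp -= 1
    lsb = (frac >> 8) & 1                 # frac.bit[23]
    gbit = (frac >> 7) & 1                # frac.bit[24]
    xbit = 1 if (frac & 0x7F) else 0      # frac.bit[25:31] != 0
    inc = (lsb & gbit) | (gbit & xbit)
    top24 = (frac >> 8) + inc             # frac.bit[0:23] + inc
    if top24 >> 24:                       # carry out of the 24-bit field
        exp += 1
    return (exp << 23) | (top24 & 0x7FFFFF)


def host_single_bits(x: int) -> int:
    """Bit pattern produced by the host when packing x as a 32-bit float."""
    return struct.unpack(">I", struct.pack(">f", float(x)))[0]
```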
DUP(x,y)
Return the concatenation of y copies of x.
DUP(0b01,4) = 0b01010101
DUP(0b001,3) = 0b001001001
EXTZ(x)
Result of extending x on the left with zeros.
b ← LENGTH(x)
result ← x & ((1<<b)-1)
InvMixColumns(x)
do c = 0 to 3
result.word[c].byte[0] = 0x0E•x.word[c].byte[0] ^ 0x0B•x.word[c].byte[1] ^ 0x0D•x.word[c].byte[2] ^ 0x09•x.word[c].byte[3]
result.word[c].byte[1] = 0x09•x.word[c].byte[0] ^ 0x0E•x.word[c].byte[1] ^ 0x0B•x.word[c].byte[2] ^ 0x0D•x.word[c].byte[3]
result.word[c].byte[2] = 0x0D•x.word[c].byte[0] ^ 0x09•x.word[c].byte[1] ^ 0x0E•x.word[c].byte[2] ^ 0x0B•x.word[c].byte[3]
result.word[c].byte[3] = 0x0B•x.word[c].byte[0] ^ 0x0D•x.word[c].byte[1] ^ 0x09•x.word[c].byte[2] ^ 0x0E•x.word[c].byte[3]
end
return(result);
where “•” is a GF(2^8) multiply, a binary polynomial multiplication reduced modulo 0x11B.
The GF(2^8) multiply of 0x09•x can be expressed in minimized terms as the following.
product.bit[0] = x.bit[0] ^ x.bit[3]
product.bit[1] = x.bit[1] ^ x.bit[4] ^ x.bit[0]
product.bit[2] = x.bit[2] ^ x.bit[5] ^ x.bit[0] ^ x.bit[1]
product.bit[3] = x.bit[3] ^ x.bit[6] ^ x.bit[1] ^ x.bit[2]
product.bit[4] = x.bit[4] ^ x.bit[7] ^ x.bit[0] ^ x.bit[2]
product.bit[5] = x.bit[5] ^ x.bit[0] ^ x.bit[1]
product.bit[6] = x.bit[6] ^ x.bit[1] ^ x.bit[2]
product.bit[7] = x.bit[7] ^ x.bit[2]
The GF(2^8) multiply of 0x0B•x can be expressed in minimized terms as the following.
product.bit[0] = x.bit[0] ^ x.bit[1] ^ x.bit[3]
product.bit[1] = x.bit[1] ^ x.bit[2] ^ x.bit[4] ^ x.bit[0]
product.bit[2] = x.bit[2] ^ x.bit[3] ^ x.bit[5] ^ x.bit[0] ^ x.bit[1]
product.bit[3] = x.bit[3] ^ x.bit[4] ^ x.bit[6] ^ x.bit[0] ^ x.bit[1] ^ x.bit[2]
product.bit[4] = x.bit[4] ^ x.bit[5] ^ x.bit[7] ^ x.bit[2]
product.bit[5] = x.bit[5] ^ x.bit[6] ^ x.bit[0] ^ x.bit[1]
product.bit[6] = x.bit[6] ^ x.bit[7] ^ x.bit[0] ^ x.bit[1] ^ x.bit[2]
product.bit[7] = x.bit[7] ^ x.bit[0] ^ x.bit[2]
The GF(2^8) multiply of 0x0D•x can be expressed in minimized terms as the following.
product.bit[0] = x.bit[0] ^ x.bit[2] ^ x.bit[3]
product.bit[1] = x.bit[1] ^ x.bit[3] ^ x.bit[4] ^ x.bit[0]
product.bit[2] = x.bit[2] ^ x.bit[4] ^ x.bit[5] ^ x.bit[1]
product.bit[3] = x.bit[3] ^ x.bit[5] ^ x.bit[6] ^ x.bit[0] ^ x.bit[2]
product.bit[4] = x.bit[4] ^ x.bit[6] ^ x.bit[7] ^ x.bit[0] ^ x.bit[1] ^ x.bit[2]
product.bit[5] = x.bit[5] ^ x.bit[7] ^ x.bit[1]
product.bit[6] = x.bit[6] ^ x.bit[0] ^ x.bit[2]
product.bit[7] = x.bit[7] ^ x.bit[1] ^ x.bit[2]
The GF(2^8) multiply of 0x0E•x can be expressed in minimized terms as the following.
product.bit[0] = x.bit[1] ^ x.bit[2] ^ x.bit[3]
product.bit[1] = x.bit[2] ^ x.bit[3] ^ x.bit[4] ^ x.bit[0]
product.bit[2] = x.bit[3] ^ x.bit[4] ^ x.bit[5] ^ x.bit[1]
product.bit[3] = x.bit[4] ^ x.bit[5] ^ x.bit[6] ^ x.bit[2]
product.bit[4] = x.bit[5] ^ x.bit[6] ^ x.bit[7] ^ x.bit[1] ^ x.bit[2]
product.bit[5] = x.bit[6] ^ x.bit[7] ^ x.bit[1]
product.bit[6] = x.bit[7] ^ x.bit[2]
product.bit[7] = x.bit[0] ^ x.bit[1] ^ x.bit[2]
InvShiftRows(x)
result.word[0].byte[0] = x.word[0].byte[0]
result.word[1].byte[0] = x.word[1].byte[0]
result.word[2].byte[0] = x.word[2].byte[0]
result.word[3].byte[0] = x.word[3].byte[0]
result.word[0].byte[1] = x.word[3].byte[1]
result.word[1].byte[1] = x.word[0].byte[1]
result.word[2].byte[1] = x.word[1].byte[1]
result.word[3].byte[1] = x.word[2].byte[1]
result.word[0].byte[2] = x.word[2].byte[2]
result.word[1].byte[2] = x.word[3].byte[2]
result.word[2].byte[2] = x.word[0].byte[2]
result.word[3].byte[2] = x.word[1].byte[2]
result.word[0].byte[3] = x.word[1].byte[3]
result.word[1].byte[3] = x.word[2].byte[3]
result.word[2].byte[3] = x.word[3].byte[3]
result.word[3].byte[3] = x.word[0].byte[3]
return(result)
InvSubBytes(x)
InvSBOX.byte[256] = { 0x52,0x09,0x6A,0xD5,0x30,0x36,0xA5,0x38,0xBF,0x40,0xA3,0x9E,0x81,0xF3,0xD7,0xFB,
0x7C,0xE3,0x39,0x82,0x9B,0x2F,0xFF,0x87,0x34,0x8E,0x43,0x44,0xC4,0xDE,0xE9,0xCB,
0x54,0x7B,0x94,0x32,0xA6,0xC2,0x23,0x3D,0xEE,0x4C,0x95,0x0B,0x42,0xFA,0xC3,0x4E,
0x08,0x2E,0xA1,0x66,0x28,0xD9,0x24,0xB2,0x76,0x5B,0xA2,0x49,0x6D,0x8B,0xD1,0x25,
0x72,0xF8,0xF6,0x64,0x86,0x68,0x98,0x16,0xD4,0xA4,0x5C,0xCC,0x5D,0x65,0xB6,0x92,
0x6C,0x70,0x48,0x50,0xFD,0xED,0xB9,0xDA,0x5E,0x15,0x46,0x57,0xA7,0x8D,0x9D,0x84,
0x90,0xD8,0xAB,0x00,0x8C,0xBC,0xD3,0x0A,0xF7,0xE4,0x58,0x05,0xB8,0xB3,0x45,0x06,
0xD0,0x2C,0x1E,0x8F,0xCA,0x3F,0x0F,0x02,0xC1,0xAF,0xBD,0x03,0x01,0x13,0x8A,0x6B,
0x3A,0x91,0x11,0x41,0x4F,0x67,0xDC,0xEA,0x97,0xF2,0xCF,0xCE,0xF0,0xB4,0xE6,0x73,
0x96,0xAC,0x74,0x22,0xE7,0xAD,0x35,0x85,0xE2,0xF9,0x37,0xE8,0x1C,0x75,0xDF,0x6E,
0x47,0xF1,0x1A,0x71,0x1D,0x29,0xC5,0x89,0x6F,0xB7,0x62,0x0E,0xAA,0x18,0xBE,0x1B,
0xFC,0x56,0x3E,0x4B,0xC6,0xD2,0x79,0x20,0x9A,0xDB,0xC0,0xFE,0x78,0xCD,0x5A,0xF4,
0x1F,0xDD,0xA8,0x33,0x88,0x07,0xC7,0x31,0xB1,0x12,0x10,0x59,0x27,0x80,0xEC,0x5F,
0x60,0x51,0x7F,0xA9,0x19,0xB5,0x4A,0x0D,0x2D,0xE5,0x7A,0x9F,0x93,0xC9,0x9C,0xEF,
0xA0,0xE0,0x3B,0x4D,0xAE,0x2A,0xF5,0xB0,0xC8,0xEB,0xBB,0x3C,0x83,0x53,0x99,0x61,
0x17,0x2B,0x04,0x7E,0xBA,0x77,0xD6,0x26,0xE1,0x69,0x14,0x63,0x55,0x21,0x0C,0x7D }
do i = 0 to 15
result.byte[i] = InvSBOX.byte[x.byte[i]]
end
return(result)
MixColumns(x)
do c = 0 to 3
result.word[c].byte[0] = 0x02•x.word[c].byte[0] ^ 0x03•x.word[c].byte[1] ^ x.word[c].byte[2] ^ x.word[c].byte[3]
result.word[c].byte[1] = x.word[c].byte[0] ^ 0x02•x.word[c].byte[1] ^ 0x03•x.word[c].byte[2] ^ x.word[c].byte[3]
result.word[c].byte[2] = x.word[c].byte[0] ^ x.word[c].byte[1] ^ 0x02•x.word[c].byte[2] ^ 0x03•x.word[c].byte[3]
result.word[c].byte[3] = 0x03•x.word[c].byte[0] ^ x.word[c].byte[1] ^ x.word[c].byte[2] ^ 0x02•x.word[c].byte[3]
end
return(result)
The GF(2^8) multiply of 0x02•x can be expressed in minimized terms as the following.
product.bit[0] = x.bit[1]
product.bit[1] = x.bit[2]
product.bit[2] = x.bit[3]
product.bit[3] = x.bit[4] ^ x.bit[0]
product.bit[4] = x.bit[5] ^ x.bit[0]
product.bit[5] = x.bit[6]
product.bit[6] = x.bit[7] ^ x.bit[0]
product.bit[7] = x.bit[0]
The GF(2^8) multiply of 0x03•x can be expressed in minimized terms as the following.
product.bit[0] = x.bit[0] ^ x.bit[1]
product.bit[1] = x.bit[1] ^ x.bit[2]
product.bit[2] = x.bit[2] ^ x.bit[3]
product.bit[3] = x.bit[3] ^ x.bit[4] ^ x.bit[0]
product.bit[4] = x.bit[4] ^ x.bit[5] ^ x.bit[0]
product.bit[5] = x.bit[5] ^ x.bit[6]
product.bit[6] = x.bit[6] ^ x.bit[7] ^ x.bit[0]
product.bit[7] = x.bit[7] ^ x.bit[0]
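The bit equations above can be cross-checked against the usual shift-and-reduce form of the multiply. The following is a minimal sketch, assuming the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) and the bit numbering used in this document (bit 0 is the most significant bit). The names xtime, mul3, xtime_bits, and mix_column are illustrative only and do not appear in the architecture.

```python
def xtime(x):
    """Multiply a byte by 0x02 in GF(2^8), reducing by 0x11B."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11B
    return x & 0xFF


def mul3(x):
    """Multiply a byte by 0x03: 0x03*x = 0x02*x ^ x."""
    return xtime(x) ^ x


def bit(x, i):
    """Return bit i of a byte, where bit 0 is the most significant bit."""
    return (x >> (7 - i)) & 1


def xtime_bits(x):
    """0x02*x built from the minimized bit equations given in the text."""
    b = [bit(x, i) for i in range(8)]
    p = [b[1], b[2], b[3], b[4] ^ b[0], b[5] ^ b[0], b[6], b[7] ^ b[0], b[0]]
    out = 0
    for v in p:
        out = (out << 1) | v
    return out


def mix_column(col):
    """Apply the MixColumns equations to one 4-byte word."""
    b0, b1, b2, b3 = col
    return [
        xtime(b0) ^ mul3(b1) ^ b2 ^ b3,
        b0 ^ xtime(b1) ^ mul3(b2) ^ b3,
        b0 ^ b1 ^ xtime(b2) ^ mul3(b3),
        mul3(b0) ^ b1 ^ b2 ^ xtime(b3),
    ]
```

The bit-level and shift-based forms should agree for every byte value, which is a convenient sanity check when transcribing the equations into hardware or software.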
ShiftRows(x)
result.word[0].byte[0] = x.word[0].byte[0]
result.word[1].byte[0] = x.word[1].byte[0]
result.word[2].byte[0] = x.word[2].byte[0]
result.word[3].byte[0] = x.word[3].byte[0]
result.word[0].byte[1] = x.word[1].byte[1]
result.word[1].byte[1] = x.word[2].byte[1]
result.word[2].byte[1] = x.word[3].byte[1]
result.word[3].byte[1] = x.word[0].byte[1]
result.word[0].byte[2] = x.word[2].byte[2]
result.word[1].byte[2] = x.word[3].byte[2]
result.word[2].byte[2] = x.word[0].byte[2]
result.word[3].byte[2] = x.word[1].byte[2]
result.word[0].byte[3] = x.word[3].byte[3]
result.word[1].byte[3] = x.word[0].byte[3]
result.word[2].byte[3] = x.word[1].byte[3]
result.word[3].byte[3] = x.word[2].byte[3]
return(result)
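The sixteen assignments in ShiftRows follow a single pattern: byte r of result word c is taken from byte r of source word (c+r) mod 4. A minimal sketch of that pattern follows, assuming the 128-bit operand is held as 16 bytes with word c occupying bytes 4×c through 4×c+3. The function name shift_rows is illustrative.

```python
def shift_rows(x):
    """Return ShiftRows(x) for a 16-byte value laid out as four 4-byte words."""
    out = bytearray(16)
    for c in range(4):          # word (column) index
        for r in range(4):      # byte (row) index within the word
            out[4 * c + r] = x[4 * ((c + r) % 4) + r]
    return bytes(out)
```

With this layout, byte 0 of every word is unchanged, and bytes 1, 2, and 3 are taken from the words 1, 2, and 3 positions further along, wrapping around.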
Signed_BCD_Add(x,y,z)
Let x and y be 31-digit signed decimal values.
If the unbounded result is equal to zero, eq_flag is set to 1. Otherwise, eq_flag is set to 0.
If the unbounded result is greater than zero, gt_flag is set to 1. Otherwise, gt_flag is set to 0.
If the unbounded result is less than zero, lt_flag is set to 1. Otherwise, lt_flag is set to 0.
If the magnitude of the unbounded result is greater than 10^31-1, ox_flag is set to 1. Otherwise, ox_flag is set to 0.
If the unbounded result is greater than or equal to zero, the sign code of the result is set to 0b1100 if z=0.
If the unbounded result is greater than or equal to zero, the sign code of the result is set to 0b1111 if z=1.
If the unbounded result is less than zero, the sign code of the result is set to 0b1101.
The low-order 31 digits of the unbounded result magnitude concatenated with the sign code are returned.
If either operand is an invalid encoding of a signed decimal value, the result returned is undefined and inv_flag
is set to 1 and lt_flag, gt_flag and eq_flag are set to 0. Otherwise, inv_flag is set to 0.
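The rules above can be sketched in software as follows. This is an illustration under stated assumptions, not the architected implementation: it assumes a 128-bit packed layout of 31 BCD digits followed by a 4-bit sign code in the low-order nibble, and it assumes that 0xA, 0xC, 0xE, and 0xF are the valid plus sign codes and 0xB and 0xD the valid minus sign codes. The names decode, encode, and signed_bcd_add are hypothetical.

```python
POSITIVE_SIGNS = {0xA, 0xC, 0xE, 0xF}   # assumed valid plus sign codes
NEGATIVE_SIGNS = {0xB, 0xD}             # assumed valid minus sign codes
LIMIT = 10 ** 31                        # 31 digits hold magnitudes up to 10^31-1


def decode(v):
    """Return the signed integer held in a packed value, or None if invalid."""
    sign = v & 0xF
    if sign not in POSITIVE_SIGNS and sign not in NEGATIVE_SIGNS:
        return None
    magnitude = 0
    for shift in range(124, 0, -4):     # 31 digits, most significant first
        digit = (v >> shift) & 0xF
        if digit > 9:
            return None
        magnitude = magnitude * 10 + digit
    return -magnitude if sign in NEGATIVE_SIGNS else magnitude


def encode(magnitude, sign):
    """Pack the low-order 31 digits of magnitude together with a sign code."""
    v = 0
    for ch in f"{magnitude % LIMIT:031d}":
        v = (v << 4) | int(ch)
    return (v << 4) | sign


def signed_bcd_add(x, y, z):
    """Return (result, flags) following the Signed_BCD_Add description."""
    a, b = decode(x), decode(y)
    if a is None or b is None:
        # The result is undefined; the text does not specify ox_flag here,
        # so this sketch leaves it out of the returned flags.
        return None, {"lt": 0, "gt": 0, "eq": 0, "inv": 1}
    total = a + b                       # the unbounded result
    flags = {
        "lt": int(total < 0),
        "gt": int(total > 0),
        "eq": int(total == 0),
        "ox": int(abs(total) > LIMIT - 1),
        "inv": 0,
    }
    sign = 0xD if total < 0 else (0xF if z == 1 else 0xC)
    return encode(abs(total), sign), flags
```

Under the same assumptions, a subtract variant would differ only in computing a - b for the unbounded result.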
Signed_BCD_Subtract(x,y,z)
Let x and y be 31-digit signed decimal values.
If the unbounded result is equal to zero, eq_flag is set to 1. Otherwise, eq_flag is set to 0.
If the unbounded result is greater than zero, gt_flag is set to 1. Otherwise, gt_flag is set to 0.
If the unbounded result is less than zero, lt_flag is set to 1. Otherwise, lt_flag is set to 0.
If the magnitude of the unbounded result is greater than 10^31-1, ox_flag is set to 1. Otherwise, ox_flag is set to 0.
If the unbounded result is greater than or equal to zero, the sign code of the result is set to 0b1100 if z=0.
If the unbounded result is greater than or equal to zero, the sign code of the result is set to 0b1111 if z=1.
If the unbounded result is less than zero, the sign code of the result is set to 0b1101.
The low-order 31 digits of the unbounded result magnitude concatenated with the sign code are returned.
If either operand is an invalid encoding of a signed decimal value, the result returned is undefined and inv_flag
is set to 1 and lt_flag, gt_flag and eq_flag are set to 0. Otherwise, inv_flag is set to 0.
SubBytes(x)
SBOX.byte[0:255] = { 0x63,0x7C,0x77,0x7B,0xF2,0x6B,0x6F,0xC5,0x30,0x01,0x67,0x2B,0xFE,0xD7,0xAB,0x76,
0xCA,0x82,0xC9,0x7D,0xFA,0x59,0x47,0xF0,0xAD,0xD4,0xA2,0xAF,0x9C,0xA4,0x72,0xC0,
0xB7,0xFD,0x93,0x26,0x36,0x3F,0xF7,0xCC,0x34,0xA5,0xE5,0xF1,0x71,0xD8,0x31,0x15,
0x04,0xC7,0x23,0xC3,0x18,0x96,0x05,0x9A,0x07,0x12,0x80,0xE2,0xEB,0x27,0xB2,0x75,
0x09,0x83,0x2C,0x1A,0x1B,0x6E,0x5A,0xA0,0x52,0x3B,0xD6,0xB3,0x29,0xE3,0x2F,0x84,
0x53,0xD1,0x00,0xED,0x20,0xFC,0xB1,0x5B,0x6A,0xCB,0xBE,0x39,0x4A,0x4C,0x58,0xCF,
0xD0,0xEF,0xAA,0xFB,0x43,0x4D,0x33,0x85,0x45,0xF9,0x02,0x7F,0x50,0x3C,0x9F,0xA8,
0x51,0xA3,0x40,0x8F,0x92,0x9D,0x38,0xF5,0xBC,0xB6,0xDA,0x21,0x10,0xFF,0xF3,0xD2,
0xCD,0x0C,0x13,0xEC,0x5F,0x97,0x44,0x17,0xC4,0xA7,0x7E,0x3D,0x64,0x5D,0x19,0x73,
0x60,0x81,0x4F,0xDC,0x22,0x2A,0x90,0x88,0x46,0xEE,0xB8,0x14,0xDE,0x5E,0x0B,0xDB,
0xE0,0x32,0x3A,0x0A,0x49,0x06,0x24,0x5C,0xC2,0xD3,0xAC,0x62,0x91,0x95,0xE4,0x79,
0xE7,0xC8,0x37,0x6D,0x8D,0xD5,0x4E,0xA9,0x6C,0x56,0xF4,0xEA,0x65,0x7A,0xAE,0x08,
0xBA,0x78,0x25,0x2E,0x1C,0xA6,0xB4,0xC6,0xE8,0xDD,0x74,0x1F,0x4B,0xBD,0x8B,0x8A,
0x70,0x3E,0xB5,0x66,0x48,0x03,0xF6,0x0E,0x61,0x35,0x57,0xB9,0x86,0xC1,0x1D,0x9E,
0xE1,0xF8,0x98,0x11,0x69,0xD9,0x8E,0x94,0x9B,0x1E,0x87,0xE9,0xCE,0x55,0x28,0xDF,
0x8C,0xA1,0x89,0x0D,0xBF,0xE6,0x42,0x68,0x41,0x99,0x2D,0x0F,0xB0,0x54,0xBB,0x16 }
do i = 0 to 15
result.byte[i] = SBOX.byte[x.byte[i]]
end
return(result)
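The SBOX table need not be typed in by hand: it can be regenerated and checked. The sketch below assumes the standard AES construction (multiplicative inverse in GF(2^8) modulo 0x11B, followed by the affine transformation with constant 0x63), which is not stated in this document but reproduces the table entries shown above. The names gf_mul, gf_inv, rotl8, sub_bytes, and INV_SBOX are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p


def gf_inv(a):
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    if a == 0:
        return 0
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r


def rotl8(v, n):
    """Rotate a byte left by n bits."""
    return ((v << n) | (v >> (8 - n))) & 0xFF


SBOX = []
for value in range(256):
    inv = gf_inv(value)
    SBOX.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)

# The inverse table is the inverse permutation of SBOX.
INV_SBOX = [0] * 256
for index, value in enumerate(SBOX):
    INV_SBOX[value] = index


def sub_bytes(x):
    """SubBytes(x): substitute each of the 16 bytes of x through SBOX."""
    return bytes(SBOX[b] for b in x)
```

Comparing the generated list with the table in the text, entry by entry, is a quick way to catch transcription errors in either table.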
RoundToSPIntCeil(x)
The value x if x is a single-precision floating-point integer; otherwise the smallest
single-precision floating-point integer that is greater than x.
RoundToSPIntFloor(x)
The value x if x is a single-precision floating-point integer; otherwise the largest
single-precision floating-point integer that is less than x.
RoundToSPIntNear(x)
The value x if x is a single-precision floating-point integer; otherwise the single-precision
floating-point integer that is nearest in value to x (in case of a tie, the even single-precision
floating-point integer is used).
RoundToSPIntTrunc(x)
The value x if x is a single-precision floating-point integer; otherwise the largest
single-precision floating-point integer that is less than x if x>0, or the smallest
single-precision floating-point integer that is greater than x if x<0.
RoundToNearSP(x)
The single-precision floating-point number that is nearest in value to the infinitely-precise
floating-point intermediate result x (in case of a tie, the single-precision floating-point value
with the least-significant bit equal to 0 is used).
ReciprocalEstimateSP(x)
A single-precision floating-point estimate of the reciprocal of the single-precision
floating-point number x.
ReciprocalSquareRootEstimateSP(x)
A single-precision floating-point estimate of the reciprocal of the square root of the
single-precision floating-point number x.
LogBase2EstimateSP(x)
A single-precision floating-point estimate of the base 2 logarithm of the single-precision
floating-point number x.
Power2EstimateSP(x)
A single-precision floating-point estimate of 2 raised to the power of the single-precision
floating-point number x.
Depending on the instruction, the contents of a Vector Register are interpreted as a sequence of
equal-length elements (bytes, halfwords, or words) or as a quadword. Each of the elements is
aligned within the Vector Register, as shown in Figure 100. Many instructions perform a given
operation in parallel on all elements in a Vector Register. Depending on the instruction, a byte,
halfword, or word element can be interpreted as a signed-integer, an unsigned-integer, or a
logical value; a word element can also be interpreted as a single-precision floating-point value.
In the instruction descriptions, phrases like “signed-integer word element” are used as shorthand
for “word element, interpreted as a signed-integer”.

Load and Store instructions are provided that transfer a byte, halfword, word, or quadword
between storage and a Vector Register.

The bit definitions for the VSCR are as follows.

Bit(s)    Description
96:110    Reserved
111       Vector Non-Java Mode (NJ)
          This bit controls how denormalized values are handled by Vector Floating-Point
          instructions.
          0  Denormalized values are handled as specified by Java and the IEEE standard; see
             Section 6.6.1.
          1  If an element in a source VR contains a denormalized value, the value 0 is used
             instead. If an instruction causes an Underflow Exception, the corresponding element
             in the target VR is set to 0. In both cases the 0 has the same sign as the
             denormalized or underflowing value.
112:126   Reserved
127       Vector Saturation (SAT)
6.4 Vector Storage Access Operations

The Vector Storage Access instructions provide the means by which data can be copied from storage
to a Vector Register or from a Vector Register to storage. Instructions are provided that access
byte, halfword, word, and quadword storage operands. These instructions differ from the
fixed-point and floating-point Storage Access instructions in that vector storage operands are
assumed to be aligned, and vector storage accesses are performed as if the appropriate number of
low-order bits of the specified effective address (EA) were zero. For example, the low-order bit
of EA is ignored for halfword Vector Storage Access instructions, and the low-order four bits of
EA are ignored for quadword Vector Storage Access instructions. The effect is to load or store the
storage operand of the specified length that contains the byte addressed by EA.

If a storage operand is unaligned, additional instructions must be used to ensure that the operand
is correctly placed in a Vector Register or in storage. Instructions are provided that shift and
merge the contents of two Vector Registers, such that an unaligned quadword storage operand can be
copied between storage and the Vector Registers in a relatively efficient manner.

As shown in Figure 100, the elements in Vector Registers are numbered; the high-order (or most
significant) byte element is numbered 0 and the low-order (or least significant) byte element is
numbered 15. The numbering affects the values that must be placed into the permute control vector
for the Vector Permute instruction in order for that instruction to achieve the desired effects,
as illustrated by the examples in the following subsections.

A vector quadword Load instruction for which the effective address (EA) is quadword-aligned places
the byte in storage addressed by EA into byte element 0 of the target Vector Register, the byte in
storage addressed by EA+1 into byte element 1 of the target Vector Register, etc. Similarly, a
vector quadword Store instruction for which the EA is quadword-aligned places the contents of byte
element 0 of the source Vector Register into the byte in storage addressed by EA, the contents of
byte element 1 of the source Vector Register into the byte in storage addressed by EA+1, etc.

Figure 103 shows an aligned quadword in storage. Figure 104 shows the result of loading that
quadword into a Vector Register or, equivalently, shows the contents that must be in a Vector
Register if storing that Vector Register is to produce the storage contents shown in Figure 103.

When an aligned byte, halfword, or word storage operand is loaded into a Vector Register, the
element (byte, halfword, or word respectively) that receives the data is the element that would
have received the data had the entire aligned quadword containing the storage operand addressed by
EA been loaded. Similarly, when a byte, halfword, or word element in a Vector Register is stored
into an aligned storage operand (byte, halfword, or word respectively), the element selected to be
stored is the element that would have been stored into the storage operand addressed by EA had the
entire Vector Register been stored to the aligned quadword containing the storage operand
addressed by EA. (Byte storage operands are always aligned.)

For aligned byte, halfword, and word storage operands, if the corresponding element number is
known when the program is written, the appropriate Vector Splat and Vector Permute instructions
can be used to copy or replicate the data contained in the storage operand after loading the
operand into a Vector Register. An example of this is given in the Programming Note for Vector
Splat; see page 259. Another example is to replicate the element across an entire Vector Register
before storing it into an arbitrary aligned storage operand of the same length; the replication
ensures that the correct data are stored regardless of the offset of the storage operand in its
aligned quadword in storage.
[Figure 103: an aligned quadword in storage, bytes 00 through 0F at offsets 0 through F]
[Figure 104: Vector Register contents after loading that quadword, byte elements 0 through 15]
[Figure 105: an unaligned quadword in storage, bytes 00 through 04 in the quadword at 00 and bytes 05 through 0F in the quadword at 10]
[Figure 106: Vector Registers Vhi (bytes 00 through 04), Vlo (bytes 05 through 0F), and Vt,Vs (bytes 00 through 0F), byte elements 0 through 15]
Programming Note
The sequence of instructions given below is one approach that can be used to load the unaligned
quadword shown in Figure 105 into a Vector Register. In Figure 106 Vhi and Vlo are the Vector
Registers that will receive the most significant quadword and least significant quadword
respectively. VRT is the target Vector Register.

After the two quadwords have been loaded into Vhi and Vlo, using Load Vector Indexed instructions,
the alignment is performed by shifting the 32-byte quantity Vhi || Vlo left by an amount
determined by the address of the first byte of the desired data. The shifting is done using a
Vector Permute instruction for which the permute control vector is generated by a Load Vector for
Shift Left instruction. The Load Vector for Shift Left instruction uses the same address
specification as the Load Vector Indexed instruction that loads the Vhi register; this is the
address of the desired unaligned quadword.

The following sequence of instructions copies the unaligned quadword storage operand into
register Vt.

# Assumptions:
# Rb != 0 and contents of Rb = 0xB
lvx      Vhi,0,Rb            # load MSQ
lvsl     Vp,0,Rb             # set permute control vector
addi     Rb,Rb,16            # address of LSQ
lvx      Vlo,0,Rb            # load LSQ
vperm    Vt,Vhi,Vlo,Vp       # align the data

The procedure for storing an unaligned quadword is essentially the reverse of the procedure for
loading one. However, a read-modify-write sequence is required that inserts the source quadword
into two aligned quadwords in storage. The quadword to be stored is assumed to be in Vs; see
Figure 106. The contents of Vs are shifted right and split into two parts, each of which is merged
(using a Vector Select instruction) with the current contents of the two aligned quadwords (MSQ
and LSQ) that will contain the most significant bytes and least significant bytes, respectively,
of the unaligned quadword. The resulting two quadwords are stored using Store Vector Indexed
instructions. A Load Vector for Shift Right instruction is used to generate the permute control
vector that is used for the shifting. A single register is used for the “shifted” contents; this
is possible because the “shifting” is done by means of a right rotation. The rotation is
accomplished by specifying Vs for both components of the Vector Permute instruction. In addition,
the same permute control vector is used on a sequence of 1s and 0s to generate the mask used by
the Vector Select instructions that do the merging.

The following sequence of instructions copies the contents of Vs into an unaligned quadword in
storage.

# Assumptions:
# Rb != 0 and contents of Rb = 0xB
lvx      Vhi,0,Rb            # load current MSQ
lvsr     Vp,0,Rb             # set permute control vector
addi     Rb,Rb,16            # address of LSQ
lvx      Vlo,0,Rb            # load current LSQ
vspltisb V1s,-1              # generate the select mask bits
vspltisb V0s,0
vperm    Vmask,V0s,V1s,Vp    # generate the select mask
vperm    Vs,Vs,Vs,Vp         # right rotate the data
vsel     Vlo,Vs,Vlo,Vmask    # insert LSQ component
vsel     Vhi,Vhi,Vs,Vmask    # insert MSQ component
stvx     Vlo,0,Rb            # store LSQ
addi     Rb,Rb,-16           # address of MSQ
stvx     Vhi,0,Rb            # store MSQ
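The two instruction sequences in this Programming Note can be traced with a small software model. The following is a minimal sketch, assuming Big-Endian byte ordering, storage modeled as a Python bytearray, and Vector Registers modeled as 16-byte values with byte element 0 first. Each helper mirrors only the behavior needed here (for example, vsel takes bits from its second operand where the mask bit is 1), and the names load_unaligned and store_unaligned are illustrative.

```python
def lvx(mem, ea):
    """Load the aligned quadword containing EA (low-order 4 bits ignored)."""
    base = ea & ~0xF
    return bytes(mem[base:base + 16])


def stvx(mem, vrs, ea):
    """Store a quadword to the aligned quadword containing EA."""
    base = ea & ~0xF
    mem[base:base + 16] = vrs


def lvsl(ea):
    """Permute control vector for a left shift: bytes sh to sh+15 of X."""
    sh = ea & 0xF
    return bytes(range(sh, sh + 16))


def lvsr(ea):
    """Permute control vector for a right shift: bytes 16-sh to 31-sh of X."""
    sh = ea & 0xF
    return bytes(range(16 - sh, 32 - sh))


def vperm(vra, vrb, vrc):
    """Select bytes of vra || vrb using the low-order 5 bits of each control byte."""
    src = vra + vrb
    return bytes(src[i & 0x1F] for i in vrc)


def vsel(vra, vrb, vrc):
    """Bitwise select: vrb where the mask bit is 1, vra where it is 0."""
    return bytes((a & ~m & 0xFF) | (b & m) for a, b, m in zip(vra, vrb, vrc))


def load_unaligned(mem, rb):
    vhi = lvx(mem, rb)                  # load MSQ
    vp = lvsl(rb)                       # set permute control vector
    vlo = lvx(mem, rb + 16)             # load LSQ
    return vperm(vhi, vlo, vp)          # align the data


def store_unaligned(mem, vs, rb):
    vhi = lvx(mem, rb)                  # load current MSQ
    vp = lvsr(rb)                       # set permute control vector
    vlo = lvx(mem, rb + 16)             # load current LSQ
    vmask = vperm(b"\x00" * 16, b"\xff" * 16, vp)   # generate the select mask
    vs = vperm(vs, vs, vp)              # right rotate the data
    vlo = vsel(vs, vlo, vmask)          # insert LSQ component
    vhi = vsel(vhi, vs, vmask)          # insert MSQ component
    stvx(mem, vlo, rb + 16)             # store LSQ
    stvx(mem, vhi, rb)                  # store MSQ
```

With the address 0xB used in the assumptions above, the load returns the 16 bytes at 0x0B through 0x1A, and the store changes only those 16 bytes of the two aligned quadwords.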
then the result is that NaN
else if Invalid Operation exception (Section 6.6.2.2)
then the result is the QNaN 0x7FC0_0000

If the instruction is either of the two Vector Convert To Fixed-Point Word instructions, the
corresponding result is 0x0000_0000. VSCR[SAT] is not affected.

If the instruction is Vector Compare Bounds Floating-Point, the corresponding result is
0xC000_0000.

If the instruction is one of the other Vector Floating-Point Compare instructions, the
corresponding result is 0x0000_0000.

6.6.2.2 Invalid Operation Exception

An Invalid Operation Exception occurs when a source value or set of source values is invalid for
the specified operation. The invalid operations are:

– Magnitude subtraction of infinities
– Multiplication of infinity by zero
– Reciprocal square root estimate of a negative, nonzero number or -infinity.
– Log base 2 estimate of a negative, nonzero number or -infinity.

range were unbounded exceeds that of the largest finite floating-point number for the target
floating-point format.

– For either of the two Vector Convert To Fixed-Point Word instructions, either a source value is
  an infinity or the product of a source value and 2^UIM is a number too large in magnitude to be
  represented in the target fixed-point format.

The following actions are taken:

1. If the vector instruction would normally produce floating-point results, the corresponding
   result is an infinity, where the sign is the sign of the intermediate result.
2. If the instruction is Vector Convert To Unsigned Fixed-Point Word Saturate, the corresponding
   result is 0xFFFF_FFFF if the source value is a positive number or +infinity, and is
   0x0000_0000 if the source value is a negative number or -infinity. VSCR[SAT] is set to 1.
3. If the instruction is Vector Convert To Signed Fixed-Point Word Saturate, the corresponding
   result is 0x7FFF_FFFF if the source value is a positive number or +infinity, and is
   0x8000_0000 if the source value is a negative number or -infinity. VSCR[SAT] is set to 1.
Load Vector Element Byte Indexed X-form
lvebx VRT,RA,RB
31 VRT RA RB 7 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
eb ← EA[60:63]
VRT ← undefined
if Big-Endian byte ordering then
  VRT[8×eb:8×eb+7] ← MEM(EA,1)
else
  VRT[120-(8×eb):127-(8×eb)] ← MEM(EA,1)

Let the effective address (EA) be the sum (RA|0)+(RB).

Let eb be bits 60:63 of EA.

If Big-Endian byte ordering is used for the storage access, the contents of the byte in storage at
address EA are placed into byte eb of register VRT. The remaining bytes in register VRT are set to
undefined values.

If Little-Endian byte ordering is used for the storage access, the contents of the byte in storage
at address EA are placed into byte 15-eb of register VRT. The remaining bytes in register VRT are
set to undefined values.

Special Registers Altered:
None

Load Vector Element Halfword Indexed X-form
lvehx VRT,RA,RB
31 VRT RA RB 39 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← (b + (RB)) & 0xFFFF_FFFF_FFFF_FFFE
eb ← EA[60:63]
VRT ← undefined
if Big-Endian byte ordering then
  VRT[8×eb:8×eb+15] ← MEM(EA,2)
else
  VRT[112-(8×eb):127-(8×eb)] ← MEM(EA,2)

Let the effective address (EA) be the result of ANDing 0xFFFF_FFFF_FFFF_FFFE with the sum
(RA|0)+(RB).

Let eb be bits 60:63 of EA.

If Big-Endian byte ordering is used for the storage access,

– the contents of the byte in storage at address EA are placed into byte eb of register VRT,
– the contents of the byte in storage at address EA+1 are placed into byte eb+1 of register VRT,
  and
– the remaining bytes in register VRT are set to undefined values.

If Little-Endian byte ordering is used for the storage access,
Programming Note
On some implementations, the hint provided by the
lvxl instruction and the corresponding hint
provided by the stvxl, lvepxl, and stvepxl
instructions are applied to the entire cache block
containing the specified quadword. On such
implementations, the effect of the hint may be to
cause that cache block to be considered a likely
candidate for replacement when space is needed
in the cache for a new block. Thus, on such
implementations, the hint should be used with
caution if the cache block containing the quadword
also contains data that may be needed by the
program in the near future. Also, the hint may be
used before the last reference in a sequence of
references to the quadword if the subsequent
references are likely to occur sufficiently soon that
the cache block containing the quadword is not
likely to be displaced from the cache before the
last reference.
Store Vector Element Byte Indexed X-form
stvebx VRS,RA,RB
31 VRS RA RB 135 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
eb ← EA[60:63]
if Big-Endian byte ordering then
  MEM(EA,1) ← VRS[8×eb:8×eb+7]
else
  MEM(EA,1) ← VRS[120-(8×eb):127-(8×eb)]

Let the effective address (EA) be the sum (RA|0)+(RB).

Let eb be bits 60:63 of EA.

If Big-Endian byte ordering is used for the storage access, the contents of byte eb of register
VRS are placed in the byte in storage at address EA.

If Little-Endian byte ordering is used for the storage access, the contents of byte 15-eb of
register VRS are placed in the byte in storage at address EA.

Special Registers Altered:
None

Programming Note
Unless bits 60:63 of the address are known to match the byte offset of the subject byte element in
register VRS, software should use Vector Splat to splat the subject byte element before performing
the store.

Store Vector Element Halfword Indexed X-form
stvehx VRS,RA,RB
31 VRS RA RB 167 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← (b + (RB)) & 0xFFFF_FFFF_FFFF_FFFE
eb ← EA[60:63]
if Big-Endian byte ordering then
  MEM(EA,2) ← VRS[8×eb:8×eb+15]
else
  MEM(EA,2) ← VRS[112-(8×eb):127-(8×eb)]

Let the effective address (EA) be the result of ANDing 0xFFFF_FFFF_FFFF_FFFE with the sum
(RA|0)+(RB).

Let eb be bits 60:63 of EA.

If Big-Endian byte ordering is used for the storage access,

– the contents of byte eb of register VRS are placed in the byte in storage at address EA, and
– the contents of byte eb+1 of register VRS are placed in the byte in storage at address EA+1.

If Little-Endian byte ordering is used for the storage access,

– the contents of byte 15-eb of register VRS are placed in the byte in storage at address EA, and
– the contents of byte 14-eb of register VRS are placed in the byte in storage at address EA+1.
Programming Note
Unless bits 60:62 of the address are known to
match the halfword offset of the subject halfword
element in register VRS, software should use
Vector Splat to splat the subject halfword element
before performing the store.
Programming Note
Unless bits 60:61 of the address are known to
match the word offset of the subject word element
in register VRS, software should use Vector Splat
to splat the subject word element before
performing the store.
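The reason for splatting before an element store is that the element written is chosen by the low-order bits of the address, not by the program. The following is a minimal sketch of the byte case, assuming Big-Endian byte ordering, storage modeled as a bytearray, and a 16-byte value standing in for register VRS. The helper vspltb models a Vector Splat Byte that replicates one byte element across the register.

```python
def stvebx(mem, vrs, ea):
    """Big-Endian element byte store: byte element eb of VRS goes to address EA."""
    eb = ea & 0xF                 # bits 60:63 of EA
    mem[ea] = vrs[eb]


def vspltb(vrb, uim):
    """Replicate byte element uim of vrb into all 16 byte elements."""
    return bytes([vrb[uim]]) * 16


vrs = bytes(range(0xA0, 0xB0))    # byte element n holds 0xA0 + n
mem = bytearray(32)

# Intent: store byte element 3 (0xA3) at address 0x17.
stvebx(mem, vrs, 0x17)            # eb = 7, so element 7 is stored instead
direct = mem[0x17]

stvebx(mem, vspltb(vrs, 3), 0x17) # every element now holds 0xA3
splatted = mem[0x17]
```

Here the store without the splat writes element 7 because bits 60:63 of the address are 7, while the splatted register stores the intended element whatever the address offset is.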
Load Vector for Shift Left Indexed X-form
lvsl VRT,RA,RB
31 VRT RA RB 6 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
sh ← (b + (RB))[60:63]
switch(sh)
case(0x0): VRT ← 0x000102030405060708090A0B0C0D0E0F
case(0x1): VRT ← 0x0102030405060708090A0B0C0D0E0F10
case(0x2): VRT ← 0x02030405060708090A0B0C0D0E0F1011
case(0x3): VRT ← 0x030405060708090A0B0C0D0E0F101112
case(0x4): VRT ← 0x0405060708090A0B0C0D0E0F10111213
case(0x5): VRT ← 0x05060708090A0B0C0D0E0F1011121314
case(0x6): VRT ← 0x060708090A0B0C0D0E0F101112131415
case(0x7): VRT ← 0x0708090A0B0C0D0E0F10111213141516
case(0x8): VRT ← 0x08090A0B0C0D0E0F1011121314151617
case(0x9): VRT ← 0x090A0B0C0D0E0F101112131415161718
case(0xA): VRT ← 0x0A0B0C0D0E0F10111213141516171819
case(0xB): VRT ← 0x0B0C0D0E0F101112131415161718191A
case(0xC): VRT ← 0x0C0D0E0F101112131415161718191A1B
case(0xD): VRT ← 0x0D0E0F101112131415161718191A1B1C
case(0xE): VRT ← 0x0E0F101112131415161718191A1B1C1D
case(0xF): VRT ← 0x0F101112131415161718191A1B1C1D1E

Let sh be bits 60:63 of the sum (RA|0)+(RB). Let X be the 32-byte value 0x00 || 0x01 || 0x02 || …
|| 0x1E || 0x1F.

Bytes sh to sh+15 of X are placed into VRT.

Special Registers Altered:
None

Load Vector for Shift Right Indexed X-form
lvsr VRT,RA,RB
31 VRT RA RB 38 /
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
sh ← (b + (RB))[60:63]
switch(sh)
case(0x0): VRT ← 0x101112131415161718191A1B1C1D1E1F
case(0x1): VRT ← 0x0F101112131415161718191A1B1C1D1E
case(0x2): VRT ← 0x0E0F101112131415161718191A1B1C1D
case(0x3): VRT ← 0x0D0E0F101112131415161718191A1B1C
case(0x4): VRT ← 0x0C0D0E0F101112131415161718191A1B
case(0x5): VRT ← 0x0B0C0D0E0F101112131415161718191A
case(0x6): VRT ← 0x0A0B0C0D0E0F10111213141516171819
case(0x7): VRT ← 0x090A0B0C0D0E0F101112131415161718
case(0x8): VRT ← 0x08090A0B0C0D0E0F1011121314151617
case(0x9): VRT ← 0x0708090A0B0C0D0E0F10111213141516
case(0xA): VRT ← 0x060708090A0B0C0D0E0F101112131415
case(0xB): VRT ← 0x05060708090A0B0C0D0E0F1011121314
case(0xC): VRT ← 0x0405060708090A0B0C0D0E0F10111213
case(0xD): VRT ← 0x030405060708090A0B0C0D0E0F101112
case(0xE): VRT ← 0x02030405060708090A0B0C0D0E0F1011
case(0xF): VRT ← 0x0102030405060708090A0B0C0D0E0F10

Let sh be bits 60:63 of the sum (RA|0)+(RB). Let X be the 32-byte value 0x00 || 0x01 || 0x02 || …
|| 0x1E || 0x1F.

Bytes 16-sh to 31-sh of X are placed into VRT.

Special Registers Altered:
None
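Both case tables are slices of the same 32-byte value X, so they can be generated rather than tabulated. The following is a minimal sketch of the two prose definitions, with sh taken as the low-order four bits of the address sum. The function names mirror the mnemonics and the argument addr stands for the sum (RA|0)+(RB).

```python
X = bytes(range(32))              # 0x00 || 0x01 || ... || 0x1F


def lvsl(addr):
    """Bytes sh to sh+15 of X."""
    sh = addr & 0xF
    return X[sh:sh + 16]


def lvsr(addr):
    """Bytes 16-sh to 31-sh of X."""
    sh = addr & 0xF
    return X[16 - sh:32 - sh]
```

For any address, lvsr yields the same vector that lvsl yields for the complementary shift 16-sh, except at sh = 0, where lvsr returns bytes 0x10 through 0x1F.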
Programming Note
Each source word can be considered to be a 32-bit
"pixel", consisting of four 8-bit "channels". Each
target halfword can be considered to be a 16-bit
pixel, consisting of one 1-bit channel and three
5-bit channels. A channel can be used to specify
the intensity of a particular color, such as red,
green, or blue, or to provide other information
needed by the application.
Let doubleword elements 0 and 1 of src be the contents of VR[VRA].

Let doubleword elements 2 and 3 of src be the contents of VR[VRB].

For each integer value i from 0 to 3, do the following.
  The signed integer value in doubleword element i of src is placed into word element i of
  VR[VRT] in unsigned integer format.
  – If the value is greater than 2^32-1 the result saturates to 2^32-1.
  – If the value is less than 0 the result saturates to 0.

Special Registers Altered:
SAT

Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB.

For each integer value i from 0 to 15, do the following.
  Signed-integer halfword element i in the source vector is converted to a signed-integer byte.
  – If the value of the element is greater than 127 the result saturates to 127.
  – If the value of the element is less than -128 the result saturates to -128.
  The low-order 8 bits of the result are placed into byte element i of VRT.

Special Registers Altered:
SAT
Vector Pack Signed Halfword Unsigned Saturate VX-form
vpkshus VRT,VRA,VRB
do i=0 to 63 by 8
  src1 ← EXTS((VRA)[i×2:i×2+15])
  src2 ← EXTS((VRB)[i×2:i×2+15])
  VRT[i:i+7] ← Clamp(src1, 0, 255)[24:31]
  VRT[i+64:i+71] ← Clamp(src2, 0, 255)[24:31]
end

Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB.

For each integer value i from 0 to 15, do the following.
  Signed-integer halfword element i in the source vector is converted to an unsigned-integer byte.
  – If the value of the element is greater than 255 the result saturates to 255.
  – If the value of the element is less than 0 the result saturates to 0.
  The low-order 8 bits of the result are placed into byte element i of VRT.

Vector Pack Signed Word Signed Saturate VX-form
vpkswss VRT,VRA,VRB
do i=0 to 63 by 16
  src1 ← EXTS((VRA)[i×2:i×2+31])
  src2 ← EXTS((VRB)[i×2:i×2+31])
  VRT[i:i+15] ← Clamp(src1, -2^15, 2^15-1)[16:31]
  VRT[i+64:i+79] ← Clamp(src2, -2^15, 2^15-1)[16:31]
end

Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB.

For each integer value i from 0 to 7, do the following.
  Signed-integer word element i in the source vector is converted to a signed-integer halfword.
  – If the value of the element is greater than 2^15-1 the result saturates to 2^15-1.
  – If the value of the element is less than -2^15 the result saturates to -2^15.
  The low-order 16 bits of the result are placed into halfword element i of VRT.
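The saturating pack operations above share one shape: clamp each source element to the target range, then keep the narrow result. The following is a minimal sketch, assuming the source elements are given as Python integers already interpreted as signed values, and assuming SAT is set when at least one element is clamped. The helper pack_saturate and its return convention are illustrative.

```python
def clamp(value, low, high):
    """Limit value to the inclusive range low..high."""
    return max(low, min(high, value))


def pack_saturate(vra, vrb, low, high):
    """Clamp every element of the source vector vra || vrb; report saturation."""
    result, sat = [], False
    for value in list(vra) + list(vrb):
        clamped = clamp(value, low, high)
        sat = sat or clamped != value   # assumed condition for setting SAT
        result.append(clamped)
    return result, sat


def vpkshus(vra, vrb):
    """Signed halfwords to unsigned bytes, range 0..255."""
    return pack_saturate(vra, vrb, 0, 255)


def vpkswss(vra, vrb):
    """Signed words to signed halfwords, range -2^15..2^15-1."""
    return pack_saturate(vra, vrb, -2 ** 15, 2 ** 15 - 1)
```

Elements taken from VRA fill the first half of the result and elements taken from VRB fill the second half, matching the i and i+64 bit offsets in the pseudocode.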
For each integer value i from 0 to 7, do the following.
  Signed-integer word element i in the source vector is converted to an unsigned-integer halfword.
  – If the value of the element is greater than 2^16-1 the result saturates to 2^16-1.
  – If the value of the element is less than 0 the result saturates to 0.
  The low-order 16 bits of the result are placed into halfword element i of VRT.

Special Registers Altered:
SAT

Let doubleword elements 0 and 1 of src be the contents of VR[VRA].

Let doubleword elements 2 and 3 of src be the contents of VR[VRB].

For each integer value i from 0 to 3, do the following.
  The unsigned integer value in doubleword element i of src is placed into word element i of
  VR[VRT] in unsigned integer format.
  – If the value of the element is greater than 2^32-1 the result saturates to 2^32-1.

Special Registers Altered:
SAT
Let doubleword elements 0 and 1 of src be the contents of VR[VRA].

Let doubleword elements 2 and 3 of src be the contents of VR[VRB].

For each integer value i from 0 to 3, do the following.
  The contents of bits 32:63 of doubleword element i of src are placed into word element i of
  VR[VRT].

For each integer value i from 0 to 15, do the following.
  The contents of bits 8:15 of halfword element i in the source vector are placed into byte
  element i of VRT.

Special Registers Altered:
None
Vector Pack Unsigned Halfword Unsigned Saturate VX-form
vpkuhus VRT,VRA,VRB
do i=0 to 63 by 8
  src1 ← EXTZ((VRA)[i×2:i×2+15])
  src2 ← EXTZ((VRB)[i×2:i×2+15])
  VRT[i:i+7] ← Clamp(src1, 0, 255)[24:31]
  VRT[i+64:i+71] ← Clamp(src2, 0, 255)[24:31]
end

Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB.

For each integer value i from 0 to 15, do the following.
  Unsigned-integer halfword element i in the source vector is converted to an unsigned-integer
  byte.
  – If the value of the element is greater than 255 the result saturates to 255.
  The low-order 8 bits of the result are placed into byte element i of VRT.

Special Registers Altered:
SAT

Vector Pack Unsigned Word Unsigned Saturate VX-form
vpkuwus VRT,VRA,VRB
do i=0 to 63 by 16
  src1 ← EXTZ((VRA)[i×2:i×2+31])
  src2 ← EXTZ((VRB)[i×2:i×2+31])
  VRT[i:i+15] ← Clamp(src1, 0, 2^16-1)[16:31]
  VRT[i+64:i+79] ← Clamp(src2, 0, 2^16-1)[16:31]
end

Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB.

For each integer value i from 0 to 7, do the following.
  Unsigned-integer word element i in the source vector is converted to an unsigned-integer
  halfword.
  – If the value of the element is greater than 2^16-1 the result saturates to 2^16-1.
  The low-order 16 bits of the result are placed into halfword element i of VRT.

Special Registers Altered:
SAT
do i=0 to 63 by 16
  VRT[i:i+15] ← (VRA)[i×2+16:i×2+31]
  VRT[i+64:i+79] ← (VRB)[i×2+16:i×2+31]
end
Vector Unpack High Pixel VX-form
vupkhpx VRT,VRB
do i=0 to 63 by 16
  VRT[i×2:i×2+7] ← EXTS((VRB)[i])
  VRT[i×2+8:i×2+15] ← EXTZ((VRB)[i+1:i+5])
  VRT[i×2+16:i×2+23] ← EXTZ((VRB)[i+6:i+10])
  VRT[i×2+24:i×2+31] ← EXTZ((VRB)[i+11:i+15])
end

For each vector element i from 0 to 3, do the following.
  Halfword element i in VRB is unpacked as follows.
  – sign-extend bit 0 of the halfword to 8 bits
  – zero-extend bits 1:5 of the halfword to 8 bits
  – zero-extend bits 6:10 of the halfword to 8 bits
  – zero-extend bits 11:15 of the halfword to 8 bits
  The result is placed in word element i of VRT.

Special Registers Altered:
None

Vector Unpack Low Pixel VX-form
vupklpx VRT,VRB
do i=0 to 63 by 16
  VRT[i×2:i×2+7] ← EXTS((VRB)[i+64])
  VRT[i×2+8:i×2+15] ← EXTZ((VRB)[i+65:i+69])
  VRT[i×2+16:i×2+23] ← EXTZ((VRB)[i+70:i+74])
  VRT[i×2+24:i×2+31] ← EXTZ((VRB)[i+75:i+79])
end

For each vector element i from 0 to 3, do the following.
  Halfword element i+4 in VRB is unpacked as follows.
  – sign-extend bit 0 of the halfword to 8 bits
  – zero-extend bits 1:5 of the halfword to 8 bits
  – zero-extend bits 6:10 of the halfword to 8 bits
  – zero-extend bits 11:15 of the halfword to 8 bits
  The result is placed in word element i of VRT.

Special Registers Altered:
None
Programming Note
The source and target elements can be considered
to be 16-bit and 32-bit “pixels” respectively, having
the formats described in the Programming Note for
the Vector Pack Pixel instruction on page 250.
Programming Note
Notice that the unpacking done by the Vector
Unpack Pixel instructions does not reverse the
packing done by the Vector Pack Pixel instruction.
Specifically, if a 16-bit pixel is unpacked to a 32-bit
pixel which is then packed to a 16-bit pixel, the
resulting 16-bit pixel will not, in general, be equal
to the original 16-bit pixel (because, for each
channel except the first, Vector Unpack Pixel
inserts high-order bits while Vector Pack Pixel
discards low-order bits).
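The asymmetry described in this note can be seen on a single pixel. The following is a minimal sketch: unpack_pixel follows the Vector Unpack Pixel description above, while pack_pixel is an assumption about Vector Pack Pixel, which is not defined in this section. It is taken to keep the low-order bit of the first channel and the high-order 5 bits of each of the other three channels. Both function names are illustrative.

```python
def unpack_pixel(h):
    """16-bit 1/5/5/5 pixel to 32-bit 8/8/8/8 pixel (bit 0 is most significant)."""
    c0 = 0xFF if (h >> 15) & 1 else 0x00    # sign-extend bit 0 to 8 bits
    c1 = (h >> 10) & 0x1F                   # zero-extend bits 1:5
    c2 = (h >> 5) & 0x1F                    # zero-extend bits 6:10
    c3 = h & 0x1F                           # zero-extend bits 11:15
    return (c0 << 24) | (c1 << 16) | (c2 << 8) | c3


def pack_pixel(w):
    """Assumed pack: low-order bit of channel 0, high-order 5 bits of the rest."""
    c0 = (w >> 24) & 1
    c1 = (w >> 19) & 0x1F
    c2 = (w >> 11) & 0x1F
    c3 = (w >> 3) & 0x1F
    return (c0 << 15) | (c1 << 10) | (c2 << 5) | c3


original = 0xFFFF
round_trip = pack_pixel(unpack_pixel(original))
```

Under these assumptions each 5-bit channel value v comes back as v >> 3, so only the first channel survives the round trip unchanged.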
Vector Unpack High Signed Byte VX-form Vector Unpack Low Signed Byte VX-form
vupkhsb VRT,VRB vupklsb VRT,VRB
do i=0 to 63 by 8 do i=0 to 63 by 8
VRT[i×2:i×2+15] ← EXTS((VRB)[i:i+7])   VRT[i×2:i×2+15] ← EXTS((VRB)[i+64:i+71])
end end
For each vector element i from 0 to 7, do the following. For each vector element i from 0 to 7, do the following.
Signed-integer byte element i in VRB is Signed-integer byte element i+8 in VRB is
sign-extended to produce a signed-integer sign-extended to produce a signed-integer
halfword and placed into halfword element i in halfword and placed into halfword element i in
VRT. VRT.
Vector Unpack High Signed Halfword Vector Unpack Low Signed Halfword
VX-form VX-form
vupkhsh VRT,VRB vupklsh VRT,VRB
do i=0 to 63 by 16 do i=0 to 63 by 16
VRT[i×2:i×2+31] ← EXTS((VRB)[i:i+15])   VRT[i×2:i×2+31] ← EXTS((VRB)[i+64:i+79])
end end
For each vector element i from 0 to 3, do the following. For each vector element i from 0 to 3, do the following.
Signed-integer halfword element i in VRB is Signed-integer halfword element i+4 in VRB is
sign-extended to produce a signed-integer word sign-extended to produce a signed-integer word
and placed into word element i in VRT. and placed into word element i in VRT.
Vector Unpack High Signed Word VX-form Vector Unpack Low Signed Word VX-form
vupkhsw VRT,VRB vupklsw VRT,VRB
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 1, do the following.
The signed integer value in word element i of The signed integer value in word element i+2 of
VR[VRB] is sign-extended and placed into VR[VRB] is sign-extended and placed into
doubleword element i of VR[VRT]. doubleword element i of VR[VRT].
Vector Merge High Byte VX-form Vector Merge Low Byte VX-form
vmrghb VRT,VRA,VRB vmrglb VRT,VRA,VRB
do i=0 to 63 by 8 do i=0 to 63 by 8
VRT[i×2:i×2+7] ← (VRA)[i:i+7]   VRT[i×2:i×2+7] ← (VRA)[i+64:i+71]
VRT[i×2+8:i×2+15] ← (VRB)[i:i+7]   VRT[i×2+8:i×2+15] ← (VRB)[i+64:i+71]
end end
For each vector element i from 0 to 7, do the following. For each vector element i from 0 to 7, do the following.
Byte element i in VRA is placed into byte element Byte element i+8 in VRA is placed into byte
2×i in VRT. element 2×i in VRT.
Byte element i in VRB is placed into byte element Byte element i+8 in VRB is placed into byte
2×i+1 in VRT. element 2×i+1 in VRT.
Vector Merge High Halfword VX-form Vector Merge Low Halfword VX-form
vmrghh VRT,VRA,VRB vmrglh VRT,VRA,VRB
do i=0 to 63 by 16 do i=0 to 63 by 16
VRT[i×2:i×2+15] ← (VRA)[i:i+15]   VRT[i×2:i×2+15] ← (VRA)[i+64:i+79]
VRT[i×2+16:i×2+31] ← (VRB)[i:i+15]   VRT[i×2+16:i×2+31] ← (VRB)[i+64:i+79]
end end
For each vector element i from 0 to 3, do the following. For each vector element i from 0 to 3, do the following.
Halfword element i in VRA is placed into halfword Halfword element i+4 in VRA is placed into
element 2×i in VRT. halfword element 2×i in VRT.
Halfword element i in VRB is placed into halfword Halfword element i+4 in VRB is placed into
element 2×i+1 in VRT. halfword element 2×i+1 in VRT.
Vector Merge High Word VX-form Vector Merge Low Word VX-form
vmrghw VRT,VRA,VRB vmrglw VRT,VRA,VRB
do i=0 to 63 by 32 do i=0 to 63 by 32
VRT[i×2:i×2+31] ← (VRA)[i:i+31]   VRT[i×2:i×2+31] ← (VRA)[i+64:i+95]
VRT[i×2+32:i×2+63] ← (VRB)[i:i+31]   VRT[i×2+32:i×2+63] ← (VRB)[i+64:i+95]
end end
For each vector element i from 0 to 1, do the following. For each vector element i from 0 to 1, do the following.
Word element i in VRA is placed into word Word element i+2 in VRA is placed into word
element 2×i in VRT. element 2×i in VRT.
Word element i in VRB is placed into word Word element i+2 in VRB is placed into word
element 2×i+1 in VRT. element 2×i+1 in VRT.
The word elements in the high-order half of VRA are Special Registers Altered:
placed, in the same order, into the even-numbered None
word elements of VRT. The word elements in the
high-order half of VRB are placed, in the same order,
into the odd-numbered word elements of VRT.
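The merge operations described above can be sketched in Python as follows. This is a minimal model, assuming each register is represented as a list of elements with element 0 first. The names merge_high and merge_low are illustrative and are not instruction mnemonics.

```python
def merge_high(vra, vrb):
    """Interleave the high-order (first) halves of two element lists."""
    half = len(vra) // 2
    vrt = []
    for i in range(half):
        vrt.append(vra[i])  # element i of VRA -> element 2*i of VRT
        vrt.append(vrb[i])  # element i of VRB -> element 2*i+1 of VRT
    return vrt


def merge_low(vra, vrb):
    """Interleave the low-order (second) halves of two element lists."""
    half = len(vra) // 2
    vrt = []
    for i in range(half):
        vrt.append(vra[i + half])  # element i+half of VRA -> element 2*i
        vrt.append(vrb[i + half])  # element i+half of VRB -> element 2*i+1
    return vrt


words_a = ["a0", "a1", "a2", "a3"]
words_b = ["b0", "b1", "b2", "b3"]
print(merge_high(words_a, words_b))
print(merge_low(words_a, words_b))
```

The same two functions model the byte, halfword, and word forms when given lists of 16, 8, or 4 elements respectively.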
Vector Merge Even Word VX-form Vector Merge Odd Word VX-form
vmrgew VRT,VRA,VRB vmrgow VRT,VRA,VRB
The contents of word element 0 of VR[VRA] are placed The contents of word element 1 of VR[VRA] are placed
into word element 0 of VR[VRT]. into word element 0 of VR[VRT].
The contents of word element 0 of VR[VRB] are placed The contents of word element 1 of VR[VRB] are placed
into word element 1 of VR[VRT]. into word element 1 of VR[VRT].
The contents of word element 2 of VR[VRA] are placed The contents of word element 3 of VR[VRA] are placed
into word element 2 of VR[VRT]. into word element 2 of VR[VRT].
The contents of word element 2 of VR[VRB] are placed The contents of word element 3 of VR[VRB] are placed
into word element 3 of VR[VRT]. into word element 3 of VR[VRT].
vmrgew is treated as a Vector instruction in terms of vmrgow is treated as a Vector instruction in terms of
resource availability. resource availability.
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 3, do the following.
The contents of byte element UIM in VRB are The contents of word element UIM in VRB are
placed into byte element i of VRT. placed into word element i of VRT.
b ← UIM || 0b0000
do i=0 to 127 by 16
   VRT[i:i+15] ← (VRB)[b:b+15]
end
do i=0 to 127 by 8
   VRT[i:i+7] ← EXTS(SIM, 8)
end
do i=0 to 127 by 16
   VRT[i:i+15] ← EXTS(SIM, 16)
end
do i=0 to 127 by 32
   VRT[i:i+31] ← EXTS(SIM, 32)
end
do i = 0 to 15 do i = 0 to 15
index ← VR[VRC].byte[i].bit[3:7]   index ← VR[VRC].byte[i].bit[3:7]
VR[VRT].byte[i] ← src.byte[index]   VR[VRT].byte[i] ← src.byte[31-index]
end end
Let the source vector be the concatenation of the Let the source vector be the concatenation of the
contents of VR[VRA] followed by the contents of contents of VR[VRA] followed by the contents of
VR[VRB]. VR[VRB].
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 15, do the following.
Let index be the value specified by bits 3:7 of byte Let index be the value specified by bits 3:7 of byte
element i of VR[VRC]. element i of VR[VRC].
The contents of byte element index of src are The contents of byte element 31-index of src are
placed into byte element i of VR[VRT]. placed into byte element i of VR[VRT].
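The two permute operations described above can be modeled in Python as follows. This is a sketch that assumes each register is a 16-byte bytes object with byte element 0 first. The function names mirror the indexing shown above (index and 31-index) and are otherwise illustrative.

```python
def permute(vra, vrb, vrc):
    """Select bytes from VRA || VRB using bits 3:7 of each control byte."""
    src = bytes(vra) + bytes(vrb)  # 32-byte source vector
    # Bits 3:7 of a byte are its low-order 5 bits.
    return bytes(src[c & 0x1F] for c in vrc)


def permute_right_indexed(vra, vrb, vrc):
    """Same selection, but byte 31-index of the source vector is used."""
    src = bytes(vra) + bytes(vrb)
    return bytes(src[31 - (c & 0x1F)] for c in vrc)


vra = bytes(range(16))
vrb = bytes(range(16, 32))
control = bytes([31, 0, 16, 0xE1] + [0] * 12)
print(permute(vra, vrb, control).hex())
print(permute_right_indexed(vra, vrb, control).hex())
```

Note that the three high-order bits of each control byte are ignored, so the control value 0xE1 selects the same source byte as 0x01.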
Programming Note
See the Programming Notes with the Load Vector
for Shift Left and Load Vector for Shift Right
instructions on page 249 for examples of uses of
vperm.
do i = 0 to 127
   mask ← VR[VRC].bit[i]
   VR[VRT].bit[i] ← (mask=0) ? VR[VRA].bit[i] : VR[VRB].bit[i]
end
src.qword[0] ← VR[VRA]
src.qword[1] ← VR[VRB]
VR[VRT] ← src.byte[SHB:SHB+15]
t ← 1   t ← 1
do i = 0 to 15   do i = 0 to 15
t ← t & (VR[VRB].byte[i].bit[5:7] = sh)   t ← t & (VR[VRB].byte[i].bit[5:7] = sh)
end   end
if t=1 then   if t=1 then
VR[VRT] ← VR[VRA] << sh   VR[VRT] ← VR[VRA] >> sh
else   else
VR[VRT] ← undefined   VR[VRT] ← undefined
The contents of VR[VRA] are shifted left by the number The contents of VR[VRA] are shifted right by the number
of bits specified in bits 125:127 of VR[VRB]. of bits specified in bits 125:127 of VR[VRB].
– Bits shifted out of bit 0 are lost. – Bits shifted out of bit 127 are lost.
– Zeros are supplied to the vacated bits on the right. – Zeros are supplied to the vacated bits on the left.
The result is placed into VR[VRT], except if, for any byte The result is placed into VR[VRT], except if, for any byte
element in register VR[VRB], the low-order 3 bits are not element in register VR[VRB], the low-order 3 bits are not
equal to the shift amount, then VR[VRT] is undefined. equal to the shift amount, then VR[VRT] is undefined.
Vector Shift Left by Octet VX-form Vector Shift Right by Octet VX-form
vslo VRT,VRA,VRB vsro VRT,VRA,VRB
The contents of VR[VRA] are shifted left by the number The contents of VR[VRA] are shifted right by the number
of bytes specified in bits 121:124 of VR[VRB]. of bytes specified in bits 121:124 of VR[VRB].
– Bytes shifted out of byte 0 are lost. – Bytes shifted out of byte 15 are lost.
– Zeros are supplied to the vacated bytes on the – Zeros are supplied to the vacated bytes on the
right. left.
The result is placed into VR[VRT]. The result is placed into VR[VRT].
Programming Note
A double-register shift by a dynamically specified number of bits (0-127) can be performed in six instructions.
The following example shifts Vw || Vx left by the number of bits specified in Vy and places the high-order 128 bits
of the result into Vz.
Vector Shift Left Variable VX-form Vector Shift Right Variable VX-form
vslv VRT,VRA,VRB vsrv VRT,VRA,VRB
do i = 0 to 15 do i = 0 to 15
sh ← VR[VRB].byte[i].bit[5:7] sh ← VR[VRB].byte[i].bit[5:7]
VR[VRT].byte[i] ← src.byte[i:i+1].bit[sh:sh+7] VR[VRT].byte[i] ← src.byte[i:i+1].bit[8-sh:15-sh]
end end
Let bytes 0:15 of src be the contents of VR[VRA]. Let bytes 0:15 of src be the contents of VR[VRA].
Let byte 16 of src be the value 0x00. Let byte 16 of src be the value 0x00.
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 15, do the following.
Let sh be the value in bits 5:7 of byte element i of Let sh be the value in bits 5:7 of byte element i of
VR[VRB]. VR[VRB].
The contents of bits sh:sh+7 of the halfword in The contents of bits 8-sh:15-sh of the halfword in
byte elements i:i+1 of src are placed into byte byte elements i:i+1 of src are placed into byte
element i of VR[VRT]. element i of VR[VRT].
Programming Note
Assume vSRC contains a vector of packed 7-bit values, A located in bits 0:6, B located in bits 7:13, C located in
bits 14:20, etc.
The leftmost seven packed 7-bit values can be unpacked into byte elements 0 to 6 using vsrv with vSHCNT1.
The next seven packed 7-bit values can then be unpacked into byte elements 7 to 13 using vsrv with vSHCNT2.
The next two packed 7-bit values can then be unpacked into byte elements 14 to 15 using vsrv with vSHCNT3.
The most-significant bit in each byte element is masked off to produce a vector of sixteen unsigned byte
elements.
The vector of sixteen unsigned byte elements can be further unpacked to two vectors of eight unsigned halfword
elements using a vupkhsb and a vupklsb.
The resultant two vectors of eight unsigned halfword elements can then be further unpacked to four vectors of
four unsigned word elements using two vupkhsh and two vupklsh instructions.
The contents of byte elements UIM:UIM+3 of VR[VRB] are placed into word element 1 of VR[VRT]. The contents of the remaining word elements of VR[VRT] are set to 0.
If the value of UIM is greater than 12, the results are undefined.
Special Registers Altered:
None
The contents of byte elements UIM:UIM+7 of VR[VRB] are placed into VR[VRT]. The contents of doubleword element 1 of VR[VRT] are set to 0.
If the value of UIM is greater than 8, the results are undefined.
Special Registers Altered:
None
The contents of byte element 7 of VR[VRB] are placed into byte element UIM of VR[VRT]. The contents of the remaining byte elements of VR[VRT] are not modified.
Special Registers Altered:
None
The contents of halfword element 3 of VR[VRB] are placed into byte elements UIM:UIM+1 of VR[VRT]. The contents of the remaining byte elements of VR[VRT] are not modified.
If the value of UIM is greater than 14, the results are undefined.
The contents of word element 1 of VR[VRB] are placed The contents of doubleword element 0 of VR[VRB] are
into byte elements UIM:UIM+3 of VR[VRT]. The contents placed into byte elements UIM:UIM+7 of VR[VRT]. The
of the remaining byte elements of VR[VRT] are not contents of the remaining byte elements of VR[VRT] are
modified. not modified.
If the value of UIM is greater than 12, the results are If the value of UIM is greater than 8, the results are
undefined. undefined.
Vector Add and Write Carry-Out Unsigned Word VX-form
vaddcuw VRT,VRA,VRB
For each integer value i from 0 to 3, do the following.
Unsigned-integer word element i in VRA is added to unsigned-integer word element i in VRB. The carry out of the 32-bit sum is zero-extended to 32 bits and placed into word element i of VRT.
Special Registers Altered:
None
Vector Add Signed Halfword Saturate VX-form
vaddshs VRT,VRA,VRB
For each integer value i from 0 to 7, do the following.
Signed-integer halfword element i in VRA is added to signed-integer halfword element i in VRB.
– If the sum is greater than 2^15-1 the result saturates to 2^15-1.
– If the sum is less than -2^15 the result saturates to -2^15.
The low-order 16 bits of the result are placed into halfword element i of VRT.
Special Registers Altered:
SAT
Vector Add Signed Byte Saturate VX-form
vaddsbs VRT,VRA,VRB
4 VRT VRA VRB 768
0 6 11 16 21 31
do i=0 to 127 by 8
   aop ← EXTS((VRA)[i:i+7])
   bop ← EXTS((VRB)[i:i+7])
   VRT[i:i+7] ← Clamp( aop +int bop, -128, 127 )[24:31]
end
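The saturating signed add shown above can be sketched in Python as follows. This is a minimal model, assuming each register is a list of unsigned element values and that the SAT indication is returned as a boolean rather than recorded in a status register. The helper names to_signed, clamp, and add_signed_saturate are illustrative.

```python
def to_signed(x, bits):
    """Interpret the low-order 'bits' bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def clamp(x, lo, hi):
    """Return (value limited to lo..hi, True if saturation occurred)."""
    if x < lo:
        return lo, True
    if x > hi:
        return hi, True
    return x, False


def add_signed_saturate(vra, vrb, bits):
    """Element-wise signed add with saturation; returns (elements, sat)."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    mask = (1 << bits) - 1
    vrt, sat = [], False
    for a, b in zip(vra, vrb):
        r, s = clamp(to_signed(a, bits) + to_signed(b, bits), lo, hi)
        vrt.append(r & mask)  # low-order 'bits' bits of the result
        sat = sat or s
    return vrt, sat


print(add_signed_saturate([0x7F, 0x80, 0x01], [0x01, 0xFF, 0x02], 8))
```

Passing bits=8, 16, or 32 models the byte, halfword, and word forms respectively.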
Vector Add Signed Word Saturate VX-form
vaddsws VRT,VRA,VRB
do i=0 to 127 by 32
   aop ← EXTS((VRA)[i:i+31])
   bop ← EXTS((VRB)[i:i+31])
   VRT[i:i+31] ← Clamp(aop +int bop, -2^31, 2^31-1)
end
For each integer value i from 0 to 3, do the following.
Signed-integer word element i in VRA is added to signed-integer word element i in VRB.
– If the sum is greater than 2^31-1 the result saturates to 2^31-1.
– If the sum is less than -2^31 the result saturates to -2^31.
The low-order 32 bits of the result are placed into word element i of VRT.
Special Registers Altered:
SAT
Vector Add Unsigned Doubleword Modulo VX-form
vaddudm VRT,VRA,VRB
do i = 0 to 1
   aop ← VR[VRA].dword[i]
   bop ← VR[VRB].dword[i]
   VR[VRT].dword[i] ← Chop( aop +int bop, 64 )
end
For each integer value i from 0 to 1, do the following.
The integer value in doubleword element i of VR[VRB] is added to the integer value in doubleword element i of VR[VRA].
The low-order 64 bits of the result are placed into doubleword element i of VR[VRT].
Special Registers Altered:
None
Programming Note
vaddudm can be used for signed or unsigned integers.
do i=0 to 127 by 8
   aop ← EXTZ((VRA)[i:i+7])
   bop ← EXTZ((VRB)[i:i+7])
   VRT[i:i+7] ← Chop( aop +int bop, 8 )
end
Programming Note
vaddubm can be used for unsigned or
signed integers.
Vector Add Unsigned Halfword Modulo Vector Add Unsigned Word Modulo
VX-form VX-form
vadduhm VRT,VRA,VRB vadduwm VRT,VRA,VRB
The low-order 16 bits of the result are placed into The low-order 32 bits of the result are placed into
halfword element i of VRT. word element i of VRT.
Vector Add Unsigned Byte Saturate Vector Add Unsigned Word Saturate
VX-form VX-form
vaddubs VRT,VRA,VRB vadduws VRT,VRA,VRB
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 3, do the following.
Unsigned-integer byte element i in VRA is added Unsigned-integer word element i in VRA is added
to unsigned-integer byte element i in VRB. to unsigned-integer word element i in VRB.
– If the sum is greater than 255 the result – If the sum is greater than 2^32-1 the result
saturates to 255. saturates to 2^32-1.
The low-order 8 bits of the result are placed into The low-order 32 bits of the result are placed into
byte element i of VRT. word element i of VRT.
do i=0 to 127 by 16
   aop ← EXTZ((VRA)[i:i+15])
   bop ← EXTZ((VRB)[i:i+15])
   VRT[i:i+15] ← Clamp(aop +int bop, 0, 2^16-1)[16:31]
end
Vector Add Unsigned Quadword Modulo Vector Add & write Carry Unsigned
VX-form Quadword VX-form
vadduqm VRT,VRA,VRB vaddcuq VRT,VRA,VRB
Let src1 be the integer value in VR[VRA]. Let src1 be the integer value in VR[VRA].
Let src2 be the integer value in VR[VRB]. Let src2 be the integer value in VR[VRB].
src1 and src2 can be signed or unsigned integers. src1 and src2 can be signed or unsigned integers.
The rightmost 128 bits of the sum of src1 and src2 are The carry out of the sum of src1 and src2 is placed
placed into VR[VRT]. into VR[VRT].
Vector Add Extended Unsigned Vector Add Extended & write Carry
Quadword Modulo VA-form Unsigned Quadword VA-form
vaddeuqm VRT,VRA,VRB,VRC vaddecuq VRT,VRA,VRB,VRC
VR[VRT] ← Chop(sum, 128) VR[VRT] ← Chop( EXTZ( Chop(sum >> 128, 1) ), 128 )
Let src1 be the integer value in VR[VRA]. Let src1 be the integer value in VR[VRA].
Let src2 be the integer value in VR[VRB]. Let src2 be the integer value in VR[VRB].
Let cin be the integer value in bit 127 of VR[VRC]. Let cin be the integer value in bit 127 of VR[VRC].
src1 and src2 can be signed or unsigned integers. src1 and src2 can be signed or unsigned integers.
The rightmost 128 bits of the sum of src1, src2, and cin The carry out of the sum of src1, src2, and cin is
are placed into VR[VRT]. placed into VR[VRT].
Programming Note
The Vector Add Unsigned Quadword instructions support efficient wide-integer addition. The following code
sequence can be used to implement a 512-bit signed or unsigned add operation.
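As an illustration of how the four quadword add operations combine, the following Python sketch models a 512-bit add built from 128-bit pieces. It is not the instruction sequence of this note. It assumes each 128-bit register is represented as a Python integer, that the operands are supplied as four quadwords with the least-significant quadword first, and that the helper names split512, join512, and add512 are hypothetical.

```python
MASK128 = (1 << 128) - 1


def vadduqm(a, b):
    return (a + b) & MASK128              # rightmost 128 bits of the sum


def vaddcuq(a, b):
    return (a + b) >> 128                 # carry out of the sum


def vaddeuqm(a, b, c):
    return (a + b + (c & 1)) & MASK128    # cin is bit 127 (low-order bit) of c


def vaddecuq(a, b, c):
    return (a + b + (c & 1)) >> 128


def split512(x):
    """Split a 512-bit value into four quadwords, least-significant first."""
    return [(x >> (128 * k)) & MASK128 for k in range(4)]


def join512(quads):
    return sum(q << (128 * k) for k, q in enumerate(quads))


def add512(a_quads, b_quads):
    total = [vadduqm(a_quads[0], b_quads[0])]
    carry = vaddcuq(a_quads[0], b_quads[0])
    for a, b in zip(a_quads[1:], b_quads[1:]):
        total.append(vaddeuqm(a, b, carry))
        carry = vaddecuq(a, b, carry)     # uses the previous carry
    return total


a = (1 << 512) - 1
b = 1
print(hex(join512(add512(split512(a), split512(b)))))
```

The carry from each quadword is computed before it is overwritten, so each extended add sees the carry out of the next-lower quadword.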
Vector Subtract and Write Carry-Out Unsigned Word VX-form
vsubcuw VRT,VRA,VRB
For each integer value i from 0 to 3, do the following.
Unsigned-integer word element i in VRB is subtracted from unsigned-integer word element i in VRA. The complement of the borrow out of bit 0 of the 32-bit difference is zero-extended to 32 bits and placed into word element i of VRT.
Special Registers Altered:
None
Vector Subtract Signed Halfword Saturate VX-form
vsubshs VRT,VRA,VRB
For each integer value i from 0 to 7, do the following.
Signed-integer halfword element i in VRB is subtracted from signed-integer halfword element i in VRA.
– If the intermediate result is greater than 2^15-1 the result saturates to 2^15-1.
– If the intermediate result is less than -2^15 the result saturates to -2^15.
The low-order 16 bits of the result are placed into halfword element i of VRT.
Vector Subtract Signed Byte Saturate VX-form
do i=0 to 127 by 8
   aop ← EXTS((VRA)[i:i+7])
   bop ← EXTS((VRB)[i:i+7])
   VRT[i:i+7] ← Clamp(aop +int ¬bop +int 1, -128, 127)[24:31]
end
do i=0 to 127 by 32
   aop ← EXTS((VRA)[i:i+31])
   bop ← EXTS((VRB)[i:i+31])
   VRT[i:i+31] ← Clamp(aop +int ¬bop +int 1, -2^31, 2^31-1)
end
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 7, do the following.
Unsigned-integer byte element i in VRB is Unsigned-integer halfword element i in VRB is
subtracted from unsigned-integer byte element i in subtracted from unsigned-integer halfword
VRA. The low-order 8 bits of the result are placed element i in VRA. The low-order 16 bits of the
into byte element i of VRT. result are placed into halfword element i of VRT.
do i = 0 to 1 do i=0 to 127 by 32
aop ← VR[VRA].dword[i]   aop ← EXTZ((VRA)[i:i+31])
bop ← VR[VRB].dword[i]   bop ← EXTZ((VRB)[i:i+31])
VR[VRT].dword[i] ← Chop( aop +int ~bop +int 1, 64 )   VRT[i:i+31] ← Chop( aop +int ¬bop +int 1, 32 )
end end
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 3, do the following.
The integer value in doubleword element i of Unsigned-integer word element i in VRB is
VR[VRB] is subtracted from the integer value in subtracted from unsigned-integer word element i
doubleword element i of VR[VRA]. in VRA. The low-order 32 bits of the result are
placed into word element i of VRT.
The low-order 64 bits of the result are placed into
doubleword element i of VR[VRT]. Special Registers Altered:
None
Special Registers Altered:
None
Programming Note
vsubudm can be used for signed or unsigned integers.
Vector Subtract Unsigned Byte Saturate VX-form
vsububs VRT,VRA,VRB
4 VRT VRA VRB 1536
0 6 11 16 21 31
do i=0 to 127 by 8
   aop ← EXTZ((VRA)[i:i+7])
   bop ← EXTZ((VRB)[i:i+7])
   VRT[i:i+7] ← Clamp(aop +int ¬bop +int 1, 0, 255)[24:31]
end
For each integer value i from 0 to 15, do the following.
Unsigned-integer byte element i in VRB is subtracted from unsigned-integer byte element i in VRA. If the intermediate result is less than 0 the result saturates to 0. The low-order 8 bits of the result are placed into byte element i of VRT.
Special Registers Altered:
SAT
Vector Subtract Unsigned Word Saturate VX-form
vsubuws VRT,VRA,VRB
4 VRT VRA VRB 1664
0 6 11 16 21 31
do i=0 to 127 by 32
   aop ← EXTZ((VRA)[i:i+31])
   bop ← EXTZ((VRB)[i:i+31])
   VRT[i:i+31] ← Clamp(aop +int ¬bop +int 1, 0, 2^32-1)
end
For each integer value i from 0 to 3, do the following.
Unsigned-integer word element i in VRB is subtracted from unsigned-integer word element i in VRA.
– If the intermediate result is less than 0 the result saturates to 0.
The low-order 32 bits of the result are placed into word element i of VRT.
Special Registers Altered:
SAT
Vector Subtract Unsigned Halfword Saturate VX-form
vsubuhs VRT,VRA,VRB
do i=0 to 127 by 16
   aop ← EXTZ((VRA)[i:i+15])
   bop ← EXTZ((VRB)[i:i+15])
   VRT[i:i+15] ← Clamp(aop +int ¬bop +int 1, 0, 2^16-1)[16:31]
end
Vector Subtract Unsigned Quadword Vector Subtract & write Carry Unsigned
Modulo VX-form Quadword VX-form
vsubuqm VRT,VRA,VRB vsubcuq VRT,VRA,VRB
Let src1 be the integer value in VR[VRA]. Let src1 be the integer value in VR[VRA].
Let src2 be the integer value in VR[VRB]. Let src2 be the integer value in VR[VRB].
src1 and src2 can be signed or unsigned integers. src1 and src2 can be signed or unsigned integers.
The rightmost 128 bits of the sum of src1, the one’s The carry out of the sum of src1, the one’s
complement of src2, and the value 1 are placed into complement of src2, and the value 1 is placed into
VR[VRT]. VR[VRT].
Vector Subtract Extended Unsigned Vector Subtract Extended & write Carry
Quadword Modulo VA-form Unsigned Quadword VA-form
vsubeuqm VRT,VRA,VRB,VRC vsubecuq VRT,VRA,VRB,VRC
Let src1 be the integer value in VR[VRA]. Let src1 be the integer value in VR[VRA].
Let src2 be the integer value in VR[VRB]. Let src2 be the integer value in VR[VRB].
Let cin be the integer value in bit 127 of VR[VRC]. Let cin be the integer value in bit 127 of VR[VRC].
src1 and src2 can be signed or unsigned integers. src1 and src2 can be signed or unsigned integers.
The rightmost 128 bits of the sum of src1, the one’s The carry out of the sum of src1, the one’s
complement of src2, and cin are placed into VR[VRT]. complement of src2, and cin is placed into VR[VRT].
Programming Note
The Vector Subtract Unsigned Quadword instructions support efficient wide-integer subtraction. The following
code sequence can be used to implement a 512-bit signed or unsigned subtract operation.
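As an illustration of how the four quadword subtract operations combine, the following Python sketch models a 512-bit subtract built from 128-bit pieces. It is not the instruction sequence of this note. It assumes each 128-bit register is represented as a Python integer, that the operands are supplied as four quadwords with the least-significant quadword first, and that the helper names split512, join512, and sub512 are hypothetical.

```python
MASK128 = (1 << 128) - 1


def vsubuqm(a, b):
    return (a + (~b & MASK128) + 1) & MASK128        # src1 + ¬src2 + 1


def vsubcuq(a, b):
    return (a + (~b & MASK128) + 1) >> 128           # carry out (1 = no borrow)


def vsubeuqm(a, b, c):
    return (a + (~b & MASK128) + (c & 1)) & MASK128  # src1 + ¬src2 + cin


def vsubecuq(a, b, c):
    return (a + (~b & MASK128) + (c & 1)) >> 128


def split512(x):
    """Split a 512-bit value into four quadwords, least-significant first."""
    return [(x >> (128 * k)) & MASK128 for k in range(4)]


def join512(quads):
    return sum(q << (128 * k) for k, q in enumerate(quads))


def sub512(a_quads, b_quads):
    diff = [vsubuqm(a_quads[0], b_quads[0])]
    carry = vsubcuq(a_quads[0], b_quads[0])
    for a, b in zip(a_quads[1:], b_quads[1:]):
        diff.append(vsubeuqm(a, b, carry))
        carry = vsubecuq(a, b, carry)                # uses the previous carry
    return diff


a = 1 << 384
b = 1
print(hex(join512(sub512(split512(a), split512(b)))))
```

A carry of 1 out of a quadword means that no borrow is needed from the next-higher quadword, which is why the extended forms add cin rather than the value 1.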
Vector Multiply Even Signed Byte VX-form Vector Multiply Odd Signed Byte VX-form
vmulesb VRT,VRA,VRB vmulosb VRT,VRA,VRB
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 7, do the following.
Signed-integer byte element i×2 in VRA is Signed-integer byte element i×2+1 in VRA is
multiplied by signed-integer byte element i×2 in multiplied by signed-integer byte element i×2+1 in
VRB. The low-order 16 bits of the product are VRB. The low-order 16 bits of the product are
placed into halfword element i of VRT. placed into halfword element i of VRT.
Vector Multiply Even Unsigned Byte Vector Multiply Odd Unsigned Byte
VX-form VX-form
vmuleub VRT,VRA,VRB vmuloub VRT,VRA,VRB
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 7, do the following.
Unsigned-integer byte element i×2 in VRA is Unsigned-integer byte element i×2+1 in VRA is
multiplied by unsigned-integer byte element i×2 in multiplied by unsigned-integer byte element i×2+1
VRB. The low-order 16 bits of the product are in VRB. The low-order 16 bits of the product are
placed into halfword element i of VRT. placed into halfword element i of VRT.
Vector Multiply Even Signed Halfword Vector Multiply Odd Signed Halfword
VX-form VX-form
vmulesh VRT,VRA,VRB vmulosh VRT,VRA,VRB
For each integer value i from 0 to 3, do the following. For each integer value i from 0 to 3, do the following.
Signed-integer halfword element i×2 in VRA is Signed-integer halfword element i×2+1 in VRA is
multiplied by signed-integer halfword element i×2 multiplied by signed-integer halfword element
in VRB. The low-order 32 bits of the product are i×2+1 in VRB. The low-order 32 bits of the product
placed into word element i of VRT. are placed into word element i of VRT.
Vector Multiply Even Unsigned Halfword Vector Multiply Odd Unsigned Halfword
VX-form VX-form
vmuleuh VRT,VRA,VRB vmulouh VRT,VRA,VRB
For each integer value i from 0 to 3, do the following. For each integer value i from 0 to 3, do the following.
Unsigned-integer halfword element i×2 in VRA is Unsigned-integer halfword element i×2+1 in VRA
multiplied by unsigned-integer halfword element is multiplied by unsigned-integer halfword element
i×2 in VRB. The low-order 32 bits of the product i×2+1 in VRB. The low-order 32 bits of the product
are placed into word element i of VRT. are placed into word element i of VRT.
Vector Multiply Even Signed Word Vector Multiply Odd Signed Word
VX-form VX-form
vmulesw VRT,VRA,VRB vmulosw VRT,VRA,VRB
do i = 0 to 1 do i = 0 to 1
src1 ← VR[VRA].word[2×i] src1 ← VR[VRA].word[2×i+1]
src2 ← VR[VRB].word[2×i] src2 ← VR[VRB].word[2×i+1]
VR[VRT].dword[i] ← src1 ×si src2 VR[VRT].dword[i] ← src1 ×si src2
end end
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 1, do the following.
The signed integer in word element 2×i of VR[VRA] The signed integer in word element 2×i+1 of
is multiplied by the signed integer in word element VR[VRA] is multiplied by the signed integer in word
2×i of VR[VRB]. element 2×i+1 of VR[VRB].
The 64-bit product is placed into doubleword The 64-bit product is placed into doubleword
element i of VR[VRT]. element i of VR[VRT].
Vector Multiply Even Unsigned Word Vector Multiply Odd Unsigned Word
VX-form VX-form
vmuleuw VRT,VRA,VRB vmulouw VRT,VRA,VRB
do i = 0 to 1 do i = 0 to 1
src1 ← VR[VRA].word[2×i] src1 ← VR[VRA].word[2×i+1]
src2 ← VR[VRB].word[2×i] src2 ← VR[VRB].word[2×i+1]
VR[VRT].dword[i] ← src1 ×ui src2 VR[VRT].dword[i] ← src1 ×ui src2
end end
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 1, do the following.
The unsigned integer in word element 2×i of The unsigned integer in word element 2×i+1 of
VR[VRA] is multiplied by the unsigned integer in VR[VRA] is multiplied by the unsigned integer in
word element 2×i of VR[VRB]. word element 2×i+1 of VR[VRB].
The 64-bit product is placed into doubleword The 64-bit product is placed into doubleword
element i of VR[VRT]. element i of VR[VRT].
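The even and odd word multiplies described above can be sketched in Python as follows. This is a minimal model, assuming each source register is a list of four unsigned 32-bit word values with word element 0 first and that the result is a list of two unsigned 64-bit doubleword values. The helper to_signed32 is illustrative.

```python
MASK64 = (1 << 64) - 1


def to_signed32(x):
    """Interpret a 32-bit value as a two's-complement signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def vmuleuw(vra, vrb):
    # Unsigned products of the even-numbered word elements (0 and 2).
    return [vra[2 * i] * vrb[2 * i] for i in range(2)]


def vmulouw(vra, vrb):
    # Unsigned products of the odd-numbered word elements (1 and 3).
    return [vra[2 * i + 1] * vrb[2 * i + 1] for i in range(2)]


def vmulesw(vra, vrb):
    # Signed products of the even-numbered word elements, as 64-bit patterns.
    return [(to_signed32(vra[2 * i]) * to_signed32(vrb[2 * i])) & MASK64
            for i in range(2)]


vra = [0xFFFFFFFF, 2, 3, 4]
vrb = [0xFFFFFFFF, 5, 6, 7]
print([hex(d) for d in vmuleuw(vra, vrb)])
print([hex(d) for d in vmulesw(vra, vrb)])
```

Because each 64-bit product holds the full result of a 32-bit by 32-bit multiply, no saturation or truncation is involved, but the signed and unsigned forms give different results when a high-order bit is set.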
do i = 0 to 3
src1 ← VR[VRA].word[i]
src2 ← VR[VRB].word[i]
VR[VRT].word[i] ← Chop( src1 ×ui src2, 32 )
end
Programming Note
vmuluwm can be used for unsigned or signed
integers.
– If the intermediate result is less than -2^15 the – If the intermediate result is greater than 2^15-1
result saturates to -2^15. the result saturates to 2^15-1.
The low-order 16 bits of the result are placed into – If the intermediate result is less than -2^15 the
halfword element i of VRT. result saturates to -2^15.
Special Registers Altered: The low-order 16 bits of the result are placed into
SAT halfword element i of VRT.
For each word element in VRT the following operations For each word element in VRT the following operations
are performed, in the order shown. are performed, in the order shown.
– Each of the four signed-integer byte elements – Each of the two signed-integer halfword elements
contained in the corresponding word element of contained in the corresponding word element of
VRA is multiplied by the corresponding VRA is multiplied by the corresponding
unsigned-integer byte element in VRB, producing signed-integer halfword element in VRB,
a signed-integer product. producing a signed-integer product.
– The sum of these four signed-integer halfword – The sum of these two signed-integer word
products is added to the signed-integer word products is added to the signed-integer word
element in VRC. element in VRC.
– The signed-integer result is placed into the – The signed-integer word result is placed into the
corresponding word element of VRT. corresponding word element of VRT.
For each word element in VRT the following operations For each word element in VRT the following operations
are performed, in the order shown. are performed, in the order shown.
– Each of the two signed-integer halfword elements – Each of the two unsigned-integer halfword
contained in the corresponding word element of elements contained in the corresponding word
VRA is multiplied by the corresponding element of VRA is multiplied by the corresponding
signed-integer halfword element in VRB, unsigned-integer halfword element in VRB,
producing a signed-integer product. producing an unsigned-integer word product.
– The sum of these two signed-integer word – The sum of these two unsigned-integer word
products is added to the signed-integer word products is added to the unsigned-integer word
element in VRC. element in VRC.
– If the intermediate result is greater than 2^31-1 the – The unsigned-integer result is placed into the
result saturates to 2^31-1 and if it is less than -2^31 it corresponding word element of VRT.
saturates to -2^31.
Special Registers Altered:
– The result is placed into the corresponding word None
element of VRT.
do i=0 to 127 by 32
   temp ← EXTZ((VRC)[i:i+31])
   do j=0 to 31 by 16
      src1 ← EXTZ((VRA)[i+j:i+j+15])
      src2 ← EXTZ((VRB)[i+j:i+j+15])
      prod ← src1 ×ui src2
      temp ← temp +int prod
   end
   VRT[i:i+31] ← Clamp(temp, 0, 2^32-1)
end
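The unsigned halfword multiply-sum with saturation can be sketched in Python as follows. This is a minimal model, assuming VRA and VRB are lists of eight unsigned halfword values, VRC is a list of four unsigned word values, and the SAT indication is returned as a boolean. The function name multiply_sum_unsigned_halfword_saturate is illustrative.

```python
def multiply_sum_unsigned_halfword_saturate(vra, vrb, vrc):
    """Return (four word results, True if any result saturated)."""
    vrt, sat = [], False
    for i in range(4):
        temp = vrc[i]
        for j in range(2):
            # Both halfword products of word i are accumulated.
            temp += vra[2 * i + j] * vrb[2 * i + j]
        if temp > 0xFFFFFFFF:
            temp, sat = 0xFFFFFFFF, True  # clamp to 2^32-1
        vrt.append(temp)
    return vrt, sat


vra = [0xFFFF, 0xFFFF, 1, 2, 3, 4, 5, 6]
vrb = [0xFFFF, 0xFFFF, 1, 1, 1, 1, 1, 1]
vrc = [0, 10, 20, 30]
print(multiply_sum_unsigned_halfword_saturate(vra, vrb, vrc))
```

In this example only word 0 saturates, because the sum of its two products exceeds 2^32-1.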
Vector Sum across Signed Word Saturate Vector Sum across Half Signed Word
VX-form Saturate VX-form
vsumsws VRT,VRA,VRB vsum2sws VRT,VRA,VRB
Vector Sum across Quarter Signed Byte Vector Sum across Quarter Signed
Saturate VX-form Halfword Saturate VX-form
vsum4sbs VRT,VRA,VRB vsum4shs VRT,VRA,VRB
For each integer value i from 0 to 3, do the following. For each integer value i from 0 to 3, do the following.
The sum of the four signed-integer byte elements The sum of the two signed-integer halfword
contained in word element i of VRA is added to elements contained in word element i of VRA is
signed-integer word element i in VRB. added to signed-integer word element i in VRB.
– If the intermediate result is greater than 2^31-1 – If the intermediate result is greater than 2^31-1
the result saturates to 2^31-1. the result saturates to 2^31-1.
– If the intermediate result is less than -2^31 the – If the intermediate result is less than -2^31 the
result saturates to -2^31. result saturates to -2^31.
The low-order 32 bits of the result are placed into The low-order 32 bits of the result are placed into
word element i of VRT. the corresponding word element of VRT.
do i=0 to 127 by 32
   temp ← EXTZ((VRB)[i:i+31])
   do j=0 to 31 by 8
      temp ← temp +int EXTZ((VRA)[i+j:i+j+7])
   end
   VRT[i:i+31] ← Clamp( temp, 0, 2^32-1 )
end
do i = 0 to 3 do i = 0 to 1
src ← EXTS(VR[VRB].word[i])   src ← EXTS(VR[VRB].dword[i])
VR[VRT].word[i] ← Chop((¬src + 1), 32)   VR[VRT].dword[i] ← Chop((¬src + 1), 64)
end end
For each integer value i from 0 to 3, do the following. For each integer value i from 0 to 1, do the following.
The sum of the one’s-complement of the signed The sum of the one’s-complement of the signed
integer in word element i of VR[VRB] and 1 is integer in doubleword element i of VR[VRB] and 1
placed into word element i of VR[VRT]. is placed into doubleword element i of VR[VRT].
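The negate operation described above can be sketched in Python as follows. This is a minimal model, assuming each register is a list of unsigned element values and that the element width is passed as a parameter. The function name negate_elements is illustrative.

```python
def negate_elements(vrb, bits):
    """Two's-complement negate each element: Chop(¬src + 1, bits)."""
    mask = (1 << bits) - 1
    return [((~e & mask) + 1) & mask for e in vrb]


print([hex(w) for w in negate_elements([1, 0, 0x80000000, 0xFFFFFFFF], 32)])
```

Because the result is chopped to the element width, the most-negative value (0x80000000 for a word) negates to itself.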
Vector Extend Sign Byte To Word VX-form
vextsb2w VRT,VRB
4 VRT 16 VRB 1538
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 3
   VR[VRT].word[i] ← EXTS32(VR[VRB].word[i].byte[3])
end
For each integer value i from 0 to 3, do the following.
The rightmost byte of word element i of VR[VRB] is sign-extended and placed into word element i of VR[VRT].
Special Registers Altered:
None
Vector Extend Sign Byte To Doubleword VX-form
vextsb2d VRT,VRB
4 VRT 24 VRB 1538
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 1
   VR[VRT].dword[i] ← EXTS64(VR[VRB].dword[i].byte[7])
end
For each integer value i from 0 to 1, do the following.
The rightmost byte of doubleword element i of VR[VRB] is sign-extended and placed into doubleword element i of VR[VRT].
Special Registers Altered:
None
do i = 0 to 1
   VR[VRT].dword[i] ← EXTS64(VR[VRB].dword[i].word[1])
end
Vector Average Signed Byte VX-form Vector Average Signed Word VX-form
vavgsb VRT,VRA,VRB vavgsw VRT,VRA,VRB
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 3, do the following.
Signed-integer byte element i in VRA is added to Signed-integer word element i in VRA is added to
signed-integer byte element i in VRB. The sum is signed-integer word element i in VRB. The sum is
incremented by 1 and then shifted right 1 bit. incremented by 1 and then shifted right 1 bit.
The low-order 8 bits of the result are placed into The low-order 32 bits of the result are placed into
byte element i of VRT. word element i of VRT.
do i=0 to 127 by 16
aop ← EXTS((VRA)[i:i+15])
bop ← EXTS((VRB)[i:i+15])
VRT[i:i+15] ← Chop(( aop +int bop +int 1 ) >> 1, 16)
end
do i=0 to 127 by 32
aop ← EXTZ((VRA)[i:i+31])
bop ← EXTZ((VRB)[i:i+31])
VRT[i:i+31] ← Chop((aop +int bop +int 1) >>ui 1, 32)
end
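The averaging operations above add the two operands, increment the sum by 1, and shift right by one bit, using an intermediate wide enough that the sum cannot overflow. A minimal Python sketch follows; the function name vavg, its parameters, and the list-based vector model are assumptions of the sketch.

```python
def vavg(a_elems, b_elems, width, signed):
    """Rounded average of corresponding elements, truncated to width bits."""
    mask = (1 << width) - 1

    def widen(x):
        # EXTS for signed elements, EXTZ for unsigned elements.
        x &= mask
        if signed and x >> (width - 1):
            x -= 1 << width
        return x

    # Python integers are unbounded, so the sum never overflows; the right
    # shift of a negative Python int is arithmetic, matching a signed shift.
    return [((widen(a) + widen(b) + 1) >> 1) & mask
            for a, b in zip(a_elems, b_elems)]
```

For example, averaging the signed bytes 0xFF (-1) and 0xFE (-2) gives 0xFF (-1), since the increment rounds the half toward positive infinity.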
for i = 0 to 15 for i = 0 to 7
src1 ← EXTZ(VR[VRA].byte[i]) src1 ← EXTZ(VR[VRA].hword[i])
src2 ← EXTZ(VR[VRB].byte[i]) src2 ← EXTZ(VR[VRB].hword[i])
if (src1>src2) then if (src1>src2) then
VR[VRT].byte[i] ← Chop(src1 + ¬src2 + 1, 8) VR[VRT].hword[i] ← Chop(src1 + ¬src2 + 1, 16)
else else
VR[VRT].byte[i] ← Chop(src2 + ¬src1 + 1, 8) VR[VRT].hword[i] ← Chop(src2 + ¬src1 + 1, 16)
end end
For each integer value i from 0 to 15, do the following.
The unsigned integer value in byte element i of
VR[VRB] is subtracted from the unsigned integer
value in byte element i of VR[VRA]. The absolute
value of the difference is placed into byte element
i of VR[VRT].
For each integer value i from 0 to 7, do the following.
The unsigned integer value in halfword element i
of VR[VRB] is subtracted from the unsigned integer
value in halfword element i of VR[VRA]. The
absolute value of the difference is placed into
halfword element i of VR[VRT].
for i = 0 to 3
src1 ← EXTZ(VR[VRA].word[i])
src2 ← EXTZ(VR[VRB].word[i])
if (src1>src2) then
VR[VRT].word[i] ← Chop(src1 + ¬src2 + 1, 32)
else
VR[VRT].word[i] ← Chop(src2 + ¬src1 + 1, 32)
end
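The absolute-difference pseudocode above always subtracts the smaller operand from the larger, using the identity x + ¬y + 1 = x - y in two's-complement arithmetic. The following Python sketch mirrors that structure; the name vabsdu and the list-based vector model are assumptions made for illustration.

```python
def vabsdu(a_elems, b_elems, width):
    """Unsigned absolute difference of corresponding elements."""
    mask = (1 << width) - 1
    out = []
    for a, b in zip(a_elems, b_elems):
        a &= mask
        b &= mask
        if a > b:
            # src1 + ~src2 + 1, truncated to the element width
            out.append((a + (~b & mask) + 1) & mask)
        else:
            # src2 + ~src1 + 1, truncated to the element width
            out.append((b + (~a & mask) + 1) & mask)
    return out
```

Because the larger operand is always the minuend, the result never wraps and no saturation is needed.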
Vector Maximum Signed Byte VX-form Vector Maximum Unsigned Byte VX-form
vmaxsb VRT,VRA,VRB vmaxub VRT,VRA,VRB
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 15, do the following.
Signed-integer byte element i in VRA is compared Unsigned-integer byte element i in VRA is
to signed-integer byte element i in VRB. The compared to unsigned-integer byte element i in
larger of the two values is placed into byte VRB. The larger of the two values is placed into
element i of VRT. byte element i of VRT.
do i = 0 to 1 do i = 0 to 1
aop ← VR[VRA].dword[i] aop ← VR[VRA].dword[i]
bop ← VR[VRB].dword[i] bop ← VR[VRB].dword[i]
VR[VRT].dword[i] ← (aop >si bop) ? aop : bop VR[VRT].dword[i] ← (aop >ui bop) ? aop : bop
end end
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 1, do the following.
The signed integer value in doubleword element i The unsigned integer value in doubleword
of VR[VRA] is compared to the signed integer value element i of VR[VRA] is compared to the unsigned
in doubleword element i of VR[VRB]. The larger of integer value in doubleword element i of VR[VRB].
the two values is placed into doubleword element The larger of the two values is placed into
i of VR[VRT]. doubleword element i of VR[VRT].
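The signed and unsigned maximum forms differ only in how the same bit pattern is interpreted during the comparison. A small Python sketch illustrates this; the names to_signed and vmax, and the modeling of a vector as a list of unsigned integers, are assumptions of the sketch.

```python
def to_signed(x, width):
    """Interpret the low width bits of x as a two's-complement integer."""
    x &= (1 << width) - 1
    return x - (1 << width) if x >> (width - 1) else x

def vmax(vra, vrb, width, signed):
    """Element-wise maximum, mirroring (aop > bop) ? aop : bop."""
    mask = (1 << width) - 1
    if signed:
        key = lambda x: to_signed(x, width)   # >si comparison
    else:
        key = lambda x: x & mask              # >ui comparison
    return [a if key(a) > key(b) else b for a, b in zip(vra, vrb)]
```

With the doubleword 0xFFFF_FFFF_FFFF_FFFF as one operand and 1 as the other, the signed form selects 1 (since the first operand is -1), while the unsigned form selects 0xFFFF_FFFF_FFFF_FFFF.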
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 7, do the following.
Signed-integer halfword element i in VRA is Unsigned-integer halfword element i in VRA is
compared to signed-integer halfword element i in compared to unsigned-integer halfword element i
VRB. The larger of the two values is placed into in VRB. The larger of the two values is placed into
halfword element i of VRT. halfword element i of VRT.
Vector Maximum Signed Word VX-form Vector Maximum Unsigned Word VX-form
vmaxsw VRT,VRA,VRB vmaxuw VRT,VRA,VRB
For each integer value i from 0 to 3, do the following. For each integer value i from 0 to 3, do the following.
Signed-integer word element i in VRA is Unsigned-integer word element i in VRA is
compared to signed-integer word element i in compared to unsigned-integer word element i in
VRB. The larger of the two values is placed into VRB. The larger of the two values is placed into
word element i of VRT. word element i of VRT.
Vector Minimum Signed Byte VX-form Vector Minimum Unsigned Byte VX-form
vminsb VRT,VRA,VRB vminub VRT,VRA,VRB
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 15, do the following.
Signed-integer byte element i in VRA is compared Unsigned-integer byte element i in VRA is
to signed-integer byte element i in VRB. The compared to unsigned-integer byte element i in
smaller of the two values is placed into byte VRB. The smaller of the two values is placed into
element i of VRT. byte element i of VRT.
do i = 0 to 1 do i = 0 to 1
aop ← VR[VRA].dword[i] aop ← VR[VRA].dword[i]
bop ← VR[VRB].dword[i] bop ← VR[VRB].dword[i]
VR[VRT].dword[i] ← (EXTS(aop) <si EXTS(bop)) ? aop : bop VR[VRT].dword[i] ← (aop <ui bop) ? aop : bop
end end
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 1, do the following.
The signed integer value in doubleword element i The unsigned integer value in doubleword
of VR[VRA] is compared to the signed integer value element i of VR[VRA] is compared to the unsigned
in doubleword element i of VR[VRB]. The smaller integer value in doubleword element i of VR[VRB].
of the two values is placed into doubleword The smaller of the two values is placed into
element i of VR[VRT]. doubleword element i of VR[VRT].
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 7, do the following.
Signed-integer halfword element i in VRA is Unsigned-integer halfword element i in VRA is
compared to signed-integer halfword element i in compared to unsigned-integer halfword element i
VRB. The smaller of the two values is placed into in VRB. The smaller of the two values is placed
halfword element i of VRT. into halfword element i of VRT.
Vector Minimum Signed Word VX-form Vector Minimum Unsigned Word VX-form
vminsw VRT,VRA,VRB vminuw VRT,VRA,VRB
For each integer value i from 0 to 3, do the following. For each integer value i from 0 to 3, do the following.
Signed-integer word element i in VRA is Unsigned-integer word element i in VRA is
compared to signed-integer word element i in compared to unsigned-integer word element i in
VRB. The smaller of the two values is placed into VRB. The smaller of the two values is placed into
word element i of VRT. word element i of VRT.
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 7, do the following.
Unsigned-integer byte element i in VRA is Unsigned-integer halfword element i in VRA is
compared to unsigned-integer byte element i in compared to unsigned-integer halfword element
VRB. Byte element i in VRT is set to all 1s if element i in VRB. Halfword element i in VRT is set
unsigned-integer byte element i in VRA is equal to to all 1s if unsigned-integer halfword element i in
unsigned-integer byte element i in VRB, and is set VRA is equal to unsigned-integer halfword
to all 0s otherwise. element i in VRB, and is set to all 0s otherwise.
do i=0 to 127 by 32
VRT[i:i+31] ← ((VRA)[i:i+31] =int (VRB)[i:i+31]) ? ³²1 : ³²0
end
if Rc=1 then do
t ← (VRT=¹²⁸1)
f ← (VRT=¹²⁸0)
CR6 ← t || 0b0 || f || 0b0
end
For each integer value i from 0 to 3, do the following.
The unsigned integer value in word element i in
VR[VRA] is compared to the unsigned integer value
in word element i in VR[VRB]. Word element i in
VR[VRT] is set to all 1s if unsigned-integer word
element i in VR[VRA] is equal to unsigned-integer
word element i in VR[VRB], and is set to all 0s
otherwise.
Special Registers Altered:
CR field 6 (if Rc=1)
do i = 0 to 1
aop ← EXTZ(VR[VRA].dword[i])
bop ← EXTZ(VR[VRB].dword[i])
if (aop = bop) then do
VR[VRT].dword[i] ← 0xFFFF_FFFF_FFFF_FFFF
flag.bit[i] ← 0b1
end
else do
VR[VRT].dword[i] ← 0x0000_0000_0000_0000
flag.bit[i] ← 0b0
end
end
if Rc=1 then do
CR.bit[24] ← (flag=0b11)
CR.bit[25] ← 0b0
CR.bit[26] ← (flag=0b00)
CR.bit[27] ← 0b0
end
For each integer value i from 0 to 1, do the following.
The unsigned integer value in doubleword
element i of VR[VRA] is compared to the unsigned
integer value in doubleword element i of VR[VRB].
Doubleword element i of VR[VRT] is set to all 1s if
the unsigned integer value in doubleword element
i of VR[VRA] is equal to the unsigned integer value
in doubleword element i of VR[VRB], and is set to
all 0s otherwise.
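The relationship between the per-element compare result and the CR field 6 setting can be sketched in Python. The function name vcmpequd_model, its return convention, and the encoding of CR field 6 as a 4-bit integer (leftmost bit first) are assumptions of the sketch.

```python
ONES64 = 0xFFFF_FFFF_FFFF_FFFF

def vcmpequd_model(vra, vrb, rc=False):
    """Compare doubleword elements for equality.

    Returns (vrt, cr6), where cr6 is None when rc is False and otherwise a
    4-bit value laid out as: all_true || 0 || all_false || 0.
    """
    vrt = [ONES64 if (a & ONES64) == (b & ONES64) else 0
           for a, b in zip(vra, vrb)]
    cr6 = None
    if rc:
        all_true = int(all(e == ONES64 for e in vrt))
        all_false = int(all(e == 0 for e in vrt))
        cr6 = (all_true << 3) | (all_false << 1)
    return vrt, cr6
```

A mixed result (some elements equal, some not) leaves both indicator bits 0, so software can distinguish the all-true, all-false, and mixed cases with a single branch on CR field 6.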
Vector Compare Greater Than Signed Byte VC-form
vcmpgtsb VRT,VRA,VRB (Rc=0)
vcmpgtsb. VRT,VRA,VRB (Rc=1)
do i=0 to 127 by 8
VRT[i:i+7] ← ((VRA)[i:i+7] >si (VRB)[i:i+7]) ? ⁸1 : ⁸0
end
if Rc=1 then do
t ← (VRT=¹²⁸1)
f ← (VRT=¹²⁸0)
CR6 ← t || 0b0 || f || 0b0
end
For each integer value i from 0 to 15, do the following.
The signed integer value in byte element i in
VR[VRA] is compared to the signed integer value in
byte element i in VR[VRB]. Byte element i in
VR[VRT] is set to all 1s if signed-integer byte
element i in VR[VRA] is greater than
signed-integer byte element i in VR[VRB], and is set
to all 0s otherwise.
Special Registers Altered:
CR field 6 (if Rc=1)
Vector Compare Greater Than Signed Doubleword VX-form
vcmpgtsd VRT,VRA,VRB (Rc=0)
vcmpgtsd. VRT,VRA,VRB (Rc=1)
do i = 0 to 1
aop ← EXTS(VR[VRA].dword[i])
bop ← EXTS(VR[VRB].dword[i])
if (aop >si bop) then do
VR[VRT].dword[i] ← 0xFFFF_FFFF_FFFF_FFFF
flag.bit[i] ← 0b1
end
else do
VR[VRT].dword[i] ← 0x0000_0000_0000_0000
flag.bit[i] ← 0b0
end
end
if “vcmpgtsd.” then do
CR.bit[24] ← (flag=0b11)
CR.bit[25] ← 0b0
CR.bit[26] ← (flag=0b00)
CR.bit[27] ← 0b0
end
For each integer value i from 0 to 1, do the following.
The signed integer value in doubleword element i
of VR[VRA] is compared to the signed integer value
in doubleword element i of VR[VRB]. Doubleword
element i of VR[VRT] is set to all 1s if the signed
integer value in doubleword element i of VR[VRA]
is greater than the signed integer value in
doubleword element i of VR[VRB], and is set to all
0s otherwise.
Vector Compare Greater Than Signed Vector Compare Greater Than Signed
Halfword VC-form Word VC-form
vcmpgtsh VRT,VRA,VRB (Rc=0) vcmpgtsw VRT,VRA,VRB (Rc=0)
vcmpgtsh. VRT,VRA,VRB (Rc=1) vcmpgtsw. VRT,VRA,VRB (Rc=1)
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 3, do the following.
Signed-integer halfword element i in VRA is Signed-integer word element i in VRA is
compared to signed-integer halfword element i in compared to signed-integer word element i in
VRB. Halfword element i in VRT is set to all 1s if VRB. Word element i in VRT is set to all 1s if
signed-integer halfword element i in VRA is signed-integer word element i in VRA is greater
greater than signed-integer halfword element i in than signed-integer word element i in VRB, and is
VRB, and is set to all 0s otherwise. set to all 0s otherwise.
Vector Compare Greater Than Unsigned Byte VC-form
vcmpgtub VRT,VRA,VRB (Rc=0)
vcmpgtub. VRT,VRA,VRB (Rc=1)
do i=0 to 127 by 8
VRT[i:i+7] ← ((VRA)[i:i+7] >ui (VRB)[i:i+7]) ? ⁸1 : ⁸0
end
if Rc=1 then do
t ← (VRT=¹²⁸1)
f ← (VRT=¹²⁸0)
CR6 ← t || 0b0 || f || 0b0
end
For each integer value i from 0 to 15, do the following.
Unsigned-integer byte element i in VRA is
compared to unsigned-integer byte element i in
VRB. Byte element i in VRT is set to all 1s if
unsigned-integer byte element i in VRA is greater
than unsigned-integer byte element i in VRB,
and is set to all 0s otherwise.
Special Registers Altered:
CR field 6 (if Rc=1)
Vector Compare Greater Than Unsigned Doubleword VX-form
vcmpgtud VRT,VRA,VRB (Rc=0)
vcmpgtud. VRT,VRA,VRB (Rc=1)
do i = 0 to 1
aop ← EXTZ(VR[VRA].dword[i])
bop ← EXTZ(VR[VRB].dword[i])
if (aop >ui bop) then do
VR[VRT].dword[i] ← 0xFFFF_FFFF_FFFF_FFFF
flag.bit[i] ← 0b1
end
else do
VR[VRT].dword[i] ← 0x0000_0000_0000_0000
flag.bit[i] ← 0b0
end
end
if “vcmpgtud.” then do
CR.bit[24] ← (flag=0b11)
CR.bit[25] ← 0b0
CR.bit[26] ← (flag=0b00)
CR.bit[27] ← 0b0
end
For each integer value i from 0 to 1, do the following.
The unsigned integer value in doubleword
element i of VR[VRA] is compared to the unsigned
integer value in doubleword element i of VR[VRB].
Doubleword element i of VR[VRT] is set to all 1s if
the unsigned integer value in doubleword element
i of VR[VRA] is greater than the unsigned integer
value in doubleword element i of VR[VRB], and is
set to all 0s otherwise.
Vector Compare Greater Than Unsigned Vector Compare Greater Than Unsigned
Halfword VC-form Word VC-form
vcmpgtuh VRT,VRA,VRB (Rc=0) vcmpgtuw VRT,VRA,VRB (Rc=0)
vcmpgtuh. VRT,VRA,VRB (Rc=1) vcmpgtuw. VRT,VRA,VRB (Rc=1)
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 3, do the following.
Unsigned-integer halfword element i in VRA is Unsigned-integer word element i in VRA is
compared to unsigned-integer halfword element i compared to unsigned-integer word element i in
in VRB. Halfword element i in VRT is set to all 1s VRB. Word element i in VRT is set to all 1s if
if unsigned-integer halfword element i in VRA is unsigned-integer word element i in VRA is greater
greater than unsigned-integer halfword element than unsigned-integer word element i in VRB,
i in VRB, and is set to all 0s otherwise. and is set to all 0s otherwise.
Vector Compare Not Equal Byte VX-form
vcmpneb VRT,VRA,VRB (if Rc=0)
vcmpneb. VRT,VRA,VRB (if Rc=1)
4 VRT VRA VRB Rc 7
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
for i = 0 to 15
src1 ← VR[VRA].byte[i]
src2 ← VR[VRB].byte[i]
if (src1 != src2) then
VR[VRT].byte[i] ← 0xFF
else
VR[VRT].byte[i] ← 0x00
end
all_true ← (VR[VRT]=0xFFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF)
all_false ← (VR[VRT]=0x0000_0000_0000_0000_0000_0000_0000_0000)
if Rc=1 then CR.bit[56:59] ← (all_true<<3) + (all_false<<1)
For each integer value i from 0 to 15, do the following.
The integer value in byte element i in VR[VRA] is
compared to the integer value in byte element i in
VR[VRB]. The contents of byte element i in VR[VRT]
are set to 0xFF if the integer value in byte element
i in VR[VRA] is not equal to the integer value in byte
element i in VR[VRB], and are set to 0x00
otherwise.
If Rc=1, CR field 6 is set to indicate whether all vector
elements compared true and whether all vector
elements compared false.
Special Registers Altered:
CR field 6 (if Rc=1)
Vector Compare Not Equal or Zero Byte VX-form
vcmpnezb VRT,VRA,VRB (if Rc=0)
vcmpnezb. VRT,VRA,VRB (if Rc=1)
4 VRT VRA VRB Rc 263
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
for i = 0 to 15
src1 ← VR[VRA].byte[i]
src2 ← VR[VRB].byte[i]
if (src1 = 0) | (src2 = 0) | (src1 != src2) then
VR[VRT].byte[i] ← 0xFF
else
VR[VRT].byte[i] ← 0x00
end
all_true ← (VR[VRT]=0xFFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF)
all_false ← (VR[VRT]=0x0000_0000_0000_0000_0000_0000_0000_0000)
if Rc=1 then CR.bit[56:59] ← (all_true<<3) + (all_false<<1)
For each integer value i from 0 to 15, do the following.
The integer value in byte element i in VR[VRA] is
compared to the integer value in byte element i in
VR[VRB]. The contents of byte element i in VR[VRT]
are set to 0xFF if the integer value in byte element
i in VR[VRA] is not equal to the integer value in byte
element i in VR[VRB] or either value is equal to
0x00, and are set to 0x00 otherwise.
If Rc=1, CR field 6 is set to indicate whether all vector
elements compared true and whether all vector
elements compared false.
Special Registers Altered:
CR field 6 (if Rc=1)
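The difference between the two byte compares above can be seen on zero-terminated data. The Python sketch below models both; the function names, the list-of-bytes vector model, and the zero-padded string used in the usage note are assumptions chosen for illustration and are not taken from the architecture text.

```python
def vcmpneb_model(vra, vrb):
    """0xFF where the bytes differ, 0x00 where they are equal."""
    return [0xFF if a != b else 0x00 for a, b in zip(vra, vrb)]

def vcmpnezb_model(vra, vrb):
    """0xFF where the bytes differ or either byte is zero, 0x00 otherwise."""
    return [0xFF if (a == 0 or b == 0 or a != b) else 0x00
            for a, b in zip(vra, vrb)]
```

As a usage illustration, take two 16-byte vectors that both hold the bytes of "power" followed by zero padding. The not-equal form reports 0x00 in every element, while the not-equal-or-zero form reports 0x00 only for the five matching non-zero bytes and 0xFF from the first zero byte onward, so the first 0xFF element marks either a mismatch or the end of the data.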
Vector Compare Not Equal Halfword Vector Compare Not Equal or Zero
VX-form Halfword VX-form
vcmpneh VRT,VRA,VRB (if Rc=0) vcmpnezh VRT,VRA,VRB (if Rc=0)
vcmpneh. VRT,VRA,VRB (if Rc=1) vcmpnezh. VRT,VRA,VRB (if Rc=1)
for i = 0 to 7 for i = 0 to 7
src1 ← VR[VRA].hword[i] src1 ← VR[VRA].hword[i]
src2 ← VR[VRB].hword[i] src2 ← VR[VRB].hword[i]
if (src1 != src2) then if (src1=0) | (src2=0) | (src1≠src2) then
VR[VRT].hword[i] ← 0xFFFF VR[VRT].hword[i] ← 0xFFFF
else else
VR[VRT].hword[i] ← 0x0000 VR[VRT].hword[i] ← 0x0000
end end
if Rc=1 then CR.bit[56:59] ← (all_true<<3) + (all_false<<1) if Rc=1 then CR.bit[56:59] ← (all_true<<3) + (all_false<<1)
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 7, do the following.
The integer value in halfword element i in VR[VRA] The integer value in halfword element i in VR[VRA]
is compared to the integer value in halfword is compared to the integer value in halfword
element i in VR[VRB]. The contents of halfword element i in VR[VRB]. The contents of halfword
element i in VR[VRT] are set to 0xFFFF if integer element i in VR[VRT] are set to 0xFFFF if integer
value in halfword element i in VR[VRA] is not equal value in halfword element i in VR[VRA] is not equal
to the integer value in halfword element i in to the integer value in halfword element i in
VR[VRB], and are set to 0x0000 otherwise. VR[VRB] or either value is equal to 0x0000, and are
set to 0x0000 otherwise.
If Rc=1, CR field 6 is set to indicate whether all vector
elements compared true and whether all vector If Rc=1, CR field 6 is set to indicate whether all vector
elements compared false. elements compared true and whether all vector
elements compared false.
Special Registers Altered:
CR field 6 (if Rc=1) Special Registers Altered:
CR field 6 (if Rc=1)
Vector Compare Not Equal Word VX-form
vcmpnew VRT,VRA,VRB (if Rc=0)
vcmpnew. VRT,VRA,VRB (if Rc=1)
4 VRT VRA VRB Rc 135
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
for i = 0 to 3
src1 ← VR[VRA].word[i]
src2 ← VR[VRB].word[i]
if (src1 != src2) then
VR[VRT].word[i] ← 0xFFFF_FFFF
else
VR[VRT].word[i] ← 0x0000_0000
end
all_true ← (VR[VRT]=0xFFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF)
all_false ← (VR[VRT]=0x0000_0000_0000_0000_0000_0000_0000_0000)
if Rc=1 then CR.bit[56:59] ← (all_true<<3) + (all_false<<1)
For each integer value i from 0 to 3, do the following.
The integer value in word element i in VR[VRA] is
compared to the integer value in word element i in
VR[VRB]. The contents of word element i in
VR[VRT] are set to 0xFFFF_FFFF if the integer value
in word element i in VR[VRA] is not equal to the
integer value in word element i in VR[VRB], and
are set to 0x0000_0000 otherwise.
If Rc=1, CR field 6 is set to indicate whether all vector
elements compared true and whether all vector
elements compared false.
Special Registers Altered:
CR field 6 (if Rc=1)
Vector Compare Not Equal or Zero Word VX-form
vcmpnezw VRT,VRA,VRB (if Rc=0)
vcmpnezw. VRT,VRA,VRB (if Rc=1)
4 VRT VRA VRB Rc 391
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
for i = 0 to 3
src1 ← VR[VRA].word[i]
src2 ← VR[VRB].word[i]
if (src1=0) | (src2=0) | (src1≠src2) then
VR[VRT].word[i] ← 0xFFFF_FFFF
else
VR[VRT].word[i] ← 0x0000_0000
end
all_true ← (VR[VRT]=0xFFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF)
all_false ← (VR[VRT]=0x0000_0000_0000_0000_0000_0000_0000_0000)
if Rc=1 then CR.bit[56:59] ← (all_true<<3) + (all_false<<1)
For each integer value i from 0 to 3, do the following.
The integer value in word element i in VR[VRA] is
compared to the integer value in word element i in
VR[VRB]. The contents of word element i in
VR[VRT] are set to 0xFFFF_FFFF if the integer value
in word element i in VR[VRA] is not equal to the
integer value in word element i in VR[VRB] or
either value is equal to 0x0000_0000, and are set to
0x0000_0000 otherwise.
If Rc=1, CR field 6 is set to indicate whether all vector
elements compared true and whether all vector
elements compared false.
Special Registers Altered:
CR field 6 (if Rc=1)
The contents of VR[VRA] are ANDed with the contents
of VR[VRB] and the result is placed into VR[VRT].
Special Registers Altered:
None
The contents of VR[VRA] are XORed with the contents
of VR[VRB] and the complemented result is placed into
VR[VRT].
Special Registers Altered:
None
The contents of VR[VRA] are ANDed with the
complement of the contents of VR[VRB] and the result is
placed into VR[VRT].
Special Registers Altered:
None
VR[VRT] ← ¬( VR[VRA] & VR[VRB] )
The contents of VR[VRA] are ANDed with the contents
of VR[VRB] and the complemented result is placed into
VR[VRT].
Special Registers Altered:
None
VR[VRT] ← ¬( VR[VRA] | VR[VRB] )
The contents of VR[VRA] are ORed with the contents of
VR[VRB] and the complemented result is placed into
VR[VRT].
The contents of VR[VRA] are XORed with the contents
of VR[VRB] and the result is placed into VR[VRT].
Special Registers Altered:
None
Vector Parity Byte Word VX-form Vector Parity Byte Doubleword VX-form
vprtybw VRT,VRB vprtybd VRT,VRB
do i = 0 to 3 do i = 0 to 1
s ← 0 s ← 0
do j = 0 to 3 do j = 0 to 7
s ← s ^ VR[VRB].word[i].byte[j].bit[7] s ← s ^ VR[VRB].dword[i].byte[j].bit[7]
end end
VR[VRT].word[i] ← Chop(EXTZ(s), 32) VR[VRT].dword[i] ← Chop(EXTZ(s), 64)
end end
For each integer value i from 0 to 3, do the following For each integer value i from 0 to 1, do the following
If the sum of the least significant bit in each byte If the sum of the least significant bit in each byte
sub-element of word element i of VR[VRB] is odd, sub-element of doubleword element i of VR[VRB]
the value 1 is placed into word element i of is odd, the value 1 is placed into doubleword
VR[VRT]; otherwise the value 0 is placed into word element i of VR[VRT]; otherwise the value 0 is
element i of VR[VRT]. placed into doubleword element i of VR[VRT].
s ← 0
do j = 0 to 15
s ← s ^ VR[VRB].byte[j].bit[7]
end
VR[VRT] ← Chop(EXTZ(s), 128)
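The parity operations above XOR together bit 7 (the least significant bit) of every byte in an element. A Python sketch of the word and doubleword forms follows; the helper parity_lsb and the integer-per-element vector model are assumptions of the sketch.

```python
def parity_lsb(element, nbytes):
    """XOR of the least significant bit of each of the nbytes bytes."""
    s = 0
    for j in range(nbytes):
        # Byte order does not matter here because XOR is commutative.
        s ^= (element >> (8 * j)) & 1
    return s

def vprtybw(vrb_words):
    """Model of vprtybw: one parity result (0 or 1) per word element."""
    return [parity_lsb(w, 4) for w in vrb_words]

def vprtybd(vrb_dwords):
    """Model of vprtybd: one parity result (0 or 1) per doubleword element."""
    return [parity_lsb(d, 8) for d in vrb_dwords]
```

The result is the zero-extended parity bit, so each target element holds either 0 or 1.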
Vector Rotate Left Byte VX-form Vector Rotate Left Word VX-form
vrlb VRT,VRA,VRB vrlw VRT,VRA,VRB
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 3, do the following.
Byte element i in VRA is rotated left by the Word element i in VRA is rotated left by the
number of bits specified in the low-order 3 bits of number of bits specified in the low-order 5 bits of
the corresponding byte element i in VRB. the corresponding word element i in VRB.
The result is placed into byte element i in VRT. The result is placed into word element i in VRT.
Vector Rotate Left Halfword VX-form Vector Rotate Left Doubleword VX-form
vrlh VRT,VRA,VRB vrld VRT,VRA,VRB
do i=0 to 127 by 16 do i = 0 to 1
sh ← (VRB)[i+12:i+15] sh ← VR[VRB].dword[i].bit[58:63]
VRT[i:i+15] ← (VRA)[i:i+15] <<< sh VR[VRT].dword[i] ← VR[VRA].dword[i] <<< sh
end end
For each integer value i from 0 to 7, do the following. For each integer value i from 0 to 1, do the following.
Halfword element i in VRA is rotated left by the The contents of doubleword element i of VR[VRA]
number of bits specified in the low-order 4 bits of are rotated left by the number of bits specified in
the corresponding halfword element i in VRB. bits 58:63 of doubleword element i of VR[VRB].
The result is placed into halfword element i in The result is placed into doubleword element i of
VRT. VR[VRT].
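The rotate instructions above use only the low-order log2(width) bits of each element of VRB as the rotate count. The Python sketch below models this for any power-of-two element width; the names rotl and vrl and the list-based vector model are assumptions of the sketch.

```python
def rotl(value, sh, width):
    """Rotate the low width bits of value left by sh bit positions."""
    mask = (1 << width) - 1
    value &= mask
    sh %= width
    return ((value << sh) | (value >> (width - sh))) & mask

def vrl(vra, vrb, width):
    """Element-wise rotate left; the count is the low-order bits of each VRB element."""
    # width - 1 selects 3 bits for bytes, 4 for halfwords, 5 for words,
    # and 6 for doublewords (bits 58:63).
    return [rotl(a, b & (width - 1), width) for a, b in zip(vra, vrb)]
```

Higher-order bits of the count element are ignored, so a word count of 0x24 rotates by 4.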
Vector Shift Left Byte VX-form Vector Shift Left Word VX-form
vslb VRT,VRA,VRB vslw VRT,VRA,VRB
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 3, do the following.
Byte element i in VRA is shifted left by the number Word element i in VRA is shifted left by the
of bits specified in the low-order 3 bits of byte number of bits specified in the low-order 5 bits of
element i in VRB. word element i in VRB.
– Bits shifted out of bit 0 are lost. – Bits shifted out of bit 0 are lost.
– Zeros are supplied to the vacated bits on the – Zeros are supplied to the vacated bits on the
right. right.
The result is placed into byte element i of VRT. The result is placed into word element i of VRT.
Vector Shift Left Halfword VX-form Vector Shift Left Doubleword VX-form
vslh VRT,VRA,VRB vsld VRT,VRA,VRB
do i=0 to 127 by 16 do i = 0 to 1
sh ← (VRB)[i+12:i+15] sh ← VR[VRB].dword[i].bit[58:63]
VRT[i:i+15] ← (VRA)[i:i+15] << sh VR[VRT].dword[i] ← VR[VRA].dword[i] << sh
end end
For each integer value i from 0 to 7, do the following.
Halfword element i in VRA is shifted left by the
number of bits specified in the low-order 4 bits of
halfword element i in VRB.
– Bits shifted out of bit 0 are lost.
– Zeros are supplied to the vacated bits on the
right.
The result is placed into halfword element i of
VRT.
Special Registers Altered:
None
For each integer value i from 0 to 1, do the following.
The contents of doubleword element i of VR[VRA]
are shifted left by the number of bits specified in
bits 58:63 of doubleword element i of VR[VRB].
– Bits shifted out of bit 0 are lost.
– Zeros are supplied to the vacated bits on the
right.
The result is placed into doubleword element i of
VR[VRT].
Special Registers Altered:
None
Vector Shift Right Byte VX-form Vector Shift Right Word VX-form
vsrb VRT,VRA,VRB vsrw VRT,VRA,VRB
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 3, do the following.
Byte element i in VRA is shifted right by the Word element i in VRA is shifted right by the
number of bits specified in the low-order 3 bits of number of bits specified in the low-order 5 bits of
byte element i in VRB. Bits shifted out of the word element i in VRB. Bits shifted out of the
least-significant bit are lost. Zeros are supplied to least-significant bit are lost. Zeros are supplied to
the vacated bits on the left. The result is placed the vacated bits on the left. The result is placed
into byte element i of VRT. into word element i of VRT.
Vector Shift Right Halfword VX-form Vector Shift Right Doubleword VX-form
vsrh VRT,VRA,VRB vsrd VRT,VRA,VRB
do i=0 to 127 by 16 do i = 0 to 1
sh ← (VRB)[i+12:i+15] sh ← VR[VRB].dword[i].bit[58:63]
VRT[i:i+15] ← (VRA)[i:i+15] >>ui sh VR[VRT].dword[i] ← VR[VRA].dword[i] >>ui sh
end end
For each integer value i from 0 to 7, do the following.
Halfword element i in VRA is shifted right by the
number of bits specified in the low-order 4 bits of
halfword element i in VRB. Bits shifted out of the
least-significant bit are lost. Zeros are supplied to
the vacated bits on the left. The result is placed
into halfword element i of VRT.
Special Registers Altered:
None
For each integer value i from 0 to 1, do the following.
The contents of doubleword element i of VR[VRA]
are shifted right by the number of bits specified in
bits 58:63 of doubleword element i of VR[VRB].
Zeros are supplied to the vacated bits on the left.
The result is placed into doubleword element i of
VR[VRT].
Special Registers Altered:
None
Vector Shift Right Algebraic Byte VX-form
vsrab VRT,VRA,VRB
4 VRT VRA VRB 772
0 6 11 16 21 31
do i=0 to 127 by 8
sh ← (VRB)[i+5:i+7]
VRT[i:i+7] ← (VRA)[i:i+7] >>si sh
end
For each integer value i from 0 to 15, do the following.
Byte element i in VRA is shifted right by the
number of bits specified in the low-order 3 bits of
the corresponding byte element i in VRB. Bits
shifted out of bit 7 of the byte element are lost. Bit
0 of the byte element is replicated to fill the
vacated bits on the left. The result is placed into
byte element i of VRT.
Special Registers Altered:
None
Vector Shift Right Algebraic Word VX-form
vsraw VRT,VRA,VRB
4 VRT VRA VRB 900
0 6 11 16 21 31
do i=0 to 127 by 32
sh ← (VRB)[i+27:i+31]
VRT[i:i+31] ← (VRA)[i:i+31] >>si sh
end
For each integer value i from 0 to 3, do the following.
Word element i in VRA is shifted right by the
number of bits specified in the low-order 5 bits of
the corresponding word element i in VRB. Bits
shifted out of bit 31 of the word are lost. Bit 0 of
the word is replicated to fill the vacated bits on the
left. The result is placed into word element i of
VRT.
Special Registers Altered:
None
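An algebraic right shift replicates bit 0 (the sign bit) into the vacated positions. The Python sketch below models the byte and word forms; the names sra and vsra and the list-based vector model are assumptions of the sketch.

```python
def sra(value, sh, width):
    """Arithmetic (sign-propagating) right shift of a width-bit element."""
    mask = (1 << width) - 1
    value &= mask
    if value >> (width - 1):      # sign bit set: reinterpret as negative
        value -= 1 << width
    # Python's >> on a negative int fills with the sign, as required here.
    return (value >> sh) & mask

def vsra(vra, vrb, width):
    """Element-wise algebraic shift; the count is the low-order bits of each VRB element."""
    return [sra(a, b & (width - 1), width) for a, b in zip(vra, vrb)]
```

For example, the byte 0x80 shifted right algebraically by 1 gives 0xC0, whereas a logical shift would give 0x40.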
Vector Rotate Left Word then AND with Mask VX-form
vrlwnm VRT,VRA,VRB
do i = 0 to 3
src1.word[0] ← VR[VRA].word[i]
src1.word[1] ← VR[VRA].word[i]
src2 ← VR[VRB].word[i]
b ← src2.bit[11:15]
e ← src2.bit[19:23]
n ← src2.bit[27:31]
r ← src1.bit[n:n+31]
m ← MASK(b, e)
VR[VRT].word[i] ← r & m
end
For each integer value i from 0 to 3, do the following.
Let src1 be the contents of word element i of
VR[VRA].
Let src2 be the contents of word element i of
VR[VRB].
Let mb be the contents of bits 11:15 of src2.
Let me be the contents of bits 19:23 of src2.
Let sh be the contents of bits 27:31 of src2.
src1 is rotated left sh bits. A mask is generated
having 1-bits from bit mb through bit me and 0-bits
elsewhere. The rotated data are ANDed with the
generated mask.
The result is placed into word element i of
VR[VRT].
Special Registers Altered:
None
Vector Rotate Left Word then Mask Insert VX-form
vrlwmi VRT,VRA,VRB
do i = 0 to 3
src1.word[0] ← VR[VRA].word[i]
src1.word[1] ← VR[VRA].word[i]
src2 ← VR[VRB].word[i]
src3 ← VR[VRT].word[i]
b ← src2.bit[11:15]
e ← src2.bit[19:23]
n ← src2.bit[27:31]
r ← src1.bit[n:n+31]
m ← MASK(b, e)
VR[VRT].word[i] ← (r & m) | (src3 & ¬m)
end
For each integer value i from 0 to 3, do the following.
Let src1 be the contents of word element i of
VR[VRA].
Let src2 be the contents of word element i of
VR[VRB].
Let src3 be the contents of word element i of
VR[VRT].
Let mb be the contents of bits 11:15 of src2.
Let me be the contents of bits 19:23 of src2.
Let sh be the contents of bits 27:31 of src2.
src1 is rotated left sh bits. A mask is generated
having 1-bits from bit mb through bit me and 0-bits
elsewhere. The rotated data are inserted into src3
under control of the generated mask.
The result is placed into word element i of
VR[VRT].
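The rotate-then-mask operations above can be sketched in Python as follows. Extracting bits n:n+31 from the doubled source is equivalent to rotating the word left by n bits, which is how the sketch implements it. The function names, the list-based vector model, and the wrap-around behavior of mask32 when mb is greater than me are assumptions of the sketch (the wrap-around follows the convention of the scalar rotate-and-mask instructions).

```python
WORD = 0xFFFF_FFFF

def rotl32(value, sh):
    """Rotate a 32-bit value left by sh bits (0 <= sh <= 31)."""
    value &= WORD
    return ((value << sh) | (value >> (32 - sh))) & WORD

def mask32(mb, me):
    """1-bits from bit mb through bit me, where bit 0 is the most significant bit."""
    from_mb = WORD >> mb                  # ones in bits mb..31
    to_me = (WORD << (31 - me)) & WORD    # ones in bits 0..me
    # Assumed: the mask wraps around when mb > me.
    return (from_mb & to_me) if mb <= me else (from_mb | to_me)

def fields(src2):
    """Extract (mb, me, sh) from bits 11:15, 19:23, and 27:31 of src2."""
    return (src2 >> 16) & 0x1F, (src2 >> 8) & 0x1F, src2 & 0x1F

def vrlwnm(vra, vrb):
    out = []
    for a, b in zip(vra, vrb):
        mb, me, sh = fields(b)
        out.append(rotl32(a, sh) & mask32(mb, me))
    return out

def vrlwmi(vrt, vra, vrb):
    out = []
    for t, a, b in zip(vrt, vra, vrb):
        mb, me, sh = fields(b)
        m = mask32(mb, me)
        # Rotated bits replace the target where the mask is 1.
        out.append((rotl32(a, sh) & m) | (t & ~m & WORD))
    return out
```

With mb=24, me=31, and sh=8, the AND form extracts the most significant byte of the source into the low-order byte of the result, and the insert form writes that byte into the target while preserving the target's other three bytes.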
Vector Rotate Left Doubleword then AND with Mask VX-form
vrldnm VRT,VRA,VRB
do i = 0 to 1
src1.dword[0] ← VR[VRA].dword[i]
src1.dword[1] ← VR[VRA].dword[i]
src2 ← VR[VRB].dword[i]
b ← src2.bit[42:47]
e ← src2.bit[50:55]
n ← src2.bit[58:63]
r ← src1.bit[n:n+63]
m ← MASK(b, e)
VR[VRT].dword[i] ← r & m
end
For each integer value i from 0 to 1, do the following.
Let src1 be the contents of doubleword element i
of VR[VRA].
Let src2 be the contents of doubleword element i
of VR[VRB].
Let mb be the contents of bits 42:47 of src2.
Let me be the contents of bits 50:55 of src2.
Let sh be the contents of bits 58:63 of src2.
src1 is rotated left sh bits. A mask is generated
having 1-bits from bit mb through bit me and 0-bits
elsewhere. The rotated data are ANDed with the
generated mask.
The result is placed into doubleword element i of
VR[VRT].
Special Registers Altered:
None
Vector Rotate Left Doubleword then Mask Insert VX-form
vrldmi VRT,VRA,VRB
do i = 0 to 1
src1.dword[0] ← VR[VRA].dword[i]
src1.dword[1] ← VR[VRA].dword[i]
src2 ← VR[VRB].dword[i]
src3 ← VR[VRT].dword[i]
b ← src2.bit[42:47]
e ← src2.bit[50:55]
n ← src2.bit[58:63]
r ← src1.bit[n:n+63]
m ← MASK(b, e)
VR[VRT].dword[i] ← (r & m) | (src3 & ¬m)
end
For each integer value i from 0 to 1, do the following.
Let src1 be the contents of doubleword element i
of VR[VRA].
Let src2 be the contents of doubleword element i
of VR[VRB].
Let src3 be the contents of doubleword element i
of VR[VRT].
Let mb be the contents of bits 42:47 of src2.
Let me be the contents of bits 50:55 of src2.
Let sh be the contents of bits 58:63 of src2.
src1 is rotated left sh bits. A mask is generated
having 1-bits from bit mb through bit me and 0-bits
elsewhere. The rotated data are inserted into src3
under control of the generated mask.
The result is placed into doubleword element i of
VR[VRT].
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRA is
added to single-precision floating-point element i
in VRB. The intermediate result is rounded to the
nearest single-precision floating-point number and
placed into word element i of VRT.
Special Registers Altered:
None
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRB is
subtracted from single-precision floating-point
element i in VRA. The intermediate result is
rounded to the nearest single-precision
floating-point number and placed into word
element i of VRT.
Special Registers Altered:
None
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRA is
multiplied by single-precision floating-point
element i in VRC. Single-precision floating-point
element i in VRB is added to the infinitely-precise
product. The intermediate result is rounded to the
nearest single-precision floating-point number and
placed into word element i of VRT.
Special Registers Altered:
None
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRA is
multiplied by single-precision floating-point
element i in VRC. Single-precision floating-point
element i in VRB is subtracted from the
infinitely-precise product. The intermediate result
is rounded to the nearest single-precision
floating-point number, then negated and placed
into word element i of VRT.
Special Registers Altered:
None
Programming Note
To use a multiply-add to perform an IEEE or Java
compliant multiply, the addend must be -0.0. This
is necessary to ensure that the sign of a zero result
will be correct when the product is -0.0 (+0.0 + -0.0
⇒ +0.0, and -0.0 + -0.0 ⇒ -0.0). When the sign of a
resulting 0.0 is not important, then +0.0 can be
used as an addend which may, in some cases,
avoid the need for a second register to hold a -0.0
in addition to the integer 0/floating-point +0.0 that
may already be available.
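The sign rule in the note above can be checked with ordinary IEEE 754 arithmetic. The following is a minimal Python sketch, not a model of the vector instruction: it uses double-precision values and an unfused `a * c + b`, and the helper names `madd` and `sign_of` are hypothetical. That is enough to show the sign of a zero result under round-to-nearest.

```python
import math

def madd(a: float, c: float, b: float) -> float:
    """Illustrative, unfused multiply-add: a * c + b."""
    return a * c + b

def sign_of(x: float) -> str:
    """Return '-' or '+' for the sign bit of x (works for zeros too)."""
    return "-" if math.copysign(1.0, x) < 0 else "+"

# The product -1.0 * 0.0 is -0.0 in both cases below.
neg_addend = madd(-1.0, 0.0, -0.0)  # -0.0 + -0.0
pos_addend = madd(-1.0, 0.0, 0.0)   # -0.0 + +0.0
print(sign_of(neg_addend), sign_of(pos_addend))
```

Only the -0.0 addend preserves the sign of the -0.0 product, which is why a plain multiply built from multiply-add needs it.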
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRA is
compared to single-precision floating-point
element i in VRB. The larger of the two values is
placed into word element i of VRT.
The maximum of +0 and -0 is +0. The maximum of any
value and a NaN is a QNaN.

For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRA is
compared to single-precision floating-point
element i in VRB. The smaller of the two values is
placed into word element i of VRT.
The minimum of +0 and -0 is -0. The minimum of any
value and a NaN is a QNaN.
For each integer value i from 0 to 3, do the following.
Single-precision floating-point word element i in
VRB is multiplied by 2^UIM. The product is
converted to a 32-bit signed fixed-point integer
using the rounding mode Round toward Zero.
– If the intermediate result is greater than 2^31-1
the result saturates to 2^31-1.
– If the intermediate result is less than -2^31 the
result saturates to -2^31.
The result is placed into word element i of VRT.

For each integer value i from 0 to 3, do the following.
Single-precision floating-point word element i in
VRB is multiplied by 2^UIM. The product is
converted to a 32-bit unsigned fixed-point integer
using the rounding mode Round toward Zero.
– If the intermediate result is greater than 2^32-1
the result saturates to 2^32-1.
The result is placed into word element i of VRT.
Special Registers Altered:
SAT
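The scale, truncate, and saturate steps described above can be sketched for a single element as follows. This is an illustrative Python sketch under stated assumptions: the function name `to_fixed_word` is hypothetical, inputs are Python doubles rather than single-precision values, NaN handling is not modeled, and the lower clamp at 0 for the unsigned case is an assumption not stated in the text above.

```python
import math

def to_fixed_word(x: float, uim: int, signed: bool = True) -> int:
    """Scale x by 2**uim, round toward zero, saturate to a 32-bit range."""
    if math.isnan(x):
        raise ValueError("NaN handling is outside this sketch")
    if signed:
        lo, hi = -2**31, 2**31 - 1
    else:
        lo, hi = 0, 2**32 - 1          # lower bound of 0 is an assumption
    scaled = x * 2.0 ** uim
    # Saturate first so that infinities never reach math.trunc().
    if scaled >= hi:
        return hi
    if scaled <= lo:
        return lo
    return math.trunc(scaled)          # Round toward Zero

print(to_fixed_word(1.75, 1), to_fixed_word(-1.75, 1))
```

With UIM=1 the result carries one fraction bit, so 1.75 becomes 3 (binary 1.1) and -1.75 becomes -3.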
For each integer value i from 0 to 3, do the following.
Signed fixed-point word element i in VRB is
converted to the nearest single-precision
floating-point value. Each result is divided by 2^UIM
and placed into word element i of VRT.

For each integer value i from 0 to 3, do the following.
Unsigned fixed-point word element i in VRB is
converted to the nearest single-precision
floating-point value. The result is divided by 2^UIM
and placed into word element i of VRT.
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRB is
rounded to a single-precision floating-point integer
using the rounding mode Round toward -Infinity.
The result is placed into the corresponding word
element i of VRT.

For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRB is
rounded to a single-precision floating-point integer
using the rounding mode Round to Nearest.
The result is placed into the corresponding word
element i of VRT.
Programming Note
The Vector Convert To Fixed-Point Word
instructions support only the rounding mode
Round toward Zero. A floating-point number can
be converted to a fixed-point integer using any of
the other three rounding modes by executing the
appropriate Vector Round to Floating-Point Integer
instruction before the Vector Convert To
Fixed-Point Word instruction.
Programming Note
The fixed-point integers used by the Vector
Convert instructions can be interpreted as
consisting of 32-UIM integer bits followed by UIM
fraction bits.

Vector Round to Floating-Point Integer
toward +Infinity VX-form
vrfip VRT,VRB
4 VRT /// VRB 650
0 6 11 16 21 31
do i=0 to 127 by 32
   VRTi:i+31 ← RoundToSPIntCeil( (VRB)i:i+31 )
end
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRB is
rounded to a single-precision floating-point integer
using the rounding mode Round toward +Infinity.
do i=0 to 127 by 32
   VRTi:i+31 ← RoundToSPIntTrunc( (VRB)i:i+31 )
end
Bit Description
0 Set to 0
1 Set to 0
For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRA is
compared to single-precision floating-point
element i in VRB. Word element i in VRT is set to
all 1s if single-precision floating-point element i in
VRA is equal to single-precision floating-point
element i in VRB, and is set to all 0s otherwise.
If the source element i in VRA or the source
element i in VRB is a NaN, VRT is set to all 0s,
indicating “not equal to”. If the source element i in
VRA and the source element i in VRB are both
infinity with the same sign, VRT is set to all 1s,
indicating “equal to”.
Special Registers Altered:
CR field 6 . . . . . . . . . . . . . . . . . . . . . . . . . . (if Rc=1)

For each integer value i from 0 to 3, do the following.
Single-precision floating-point element i in VRA is
compared to single-precision floating-point
element i in VRB. Word element i in VRT is set to
all 1s if single-precision floating-point element i in
VRA is greater than or equal to single-precision
floating-point element i in VRB, and is set to all 0s
otherwise.
If the source element i in VRA or the source
element i in VRB is a NaN, VRT is set to all 0s,
indicating “not greater than or equal to”. If the
source element i in VRA and the source element i
in VRB are both infinity with the same sign, VRT is
set to all 1s, indicating “greater than or equal to”.
Special Registers Altered:
CR field 6 . . . . . . . . . . . . . . . . . . . . . . . . . . (if Rc=1)
do i=0 to 127 by 32
   VRTi:i+31 ← ((VRA)i:i+31 >fp (VRB)i:i+31) ? ³²1 : ³²0
end
if Rc=1 then do
   t ← ( VRT = ¹²⁸1 )
   f ← ( VRT = ¹²⁸0 )
   CR6 ← t || 0b0 || f || 0b0
end
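The element-wise compare and the CR field 6 summary in the pseudocode above can be sketched as follows. This is a minimal Python illustration, assuming the four elements are passed as lists of Python floats; the function name `compare_greater_than` is hypothetical. Python comparisons involving NaN evaluate to false, which matches the all-0s result described for NaN operands.

```python
def compare_greater_than(vra, vrb):
    """Return (mask_words, cr6) for two four-element lists of floats."""
    # Each word is all 1s when the comparison is true, all 0s otherwise.
    mask = [0xFFFFFFFF if a > b else 0 for a, b in zip(vra, vrb)]
    t = int(all(m == 0xFFFFFFFF for m in mask))   # every element true
    f = int(all(m == 0 for m in mask))            # every element false
    cr6 = (t << 3) | (f << 1)                     # t || 0b0 || f || 0b0
    return mask, cr6

mask, cr6 = compare_greater_than([2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0])
print([hex(m) for m in mask], bin(cr6))
```

A mixed result (some elements true, some false) leaves both the t and f bits at 0.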
For each integer value i from 0 to 3, do the following.
The single-precision floating-point estimate of 2
raised to the power of single-precision
floating-point element i in VRB is placed into word
element i of VRT.
Let x be any single-precision floating-point input value.
Unless x< -146 or the single-precision floating-point
result of computing 2 raised to the power x would be a
zero, an infinity, or a QNaN, the estimate has a relative
error in precision no greater than one part in 16. The
most significant 12 bits of the estimate’s significand
are monotonic. An integral input value returns an
integral value when the result is representable.
The result for various special cases of the source
value is given below.
Value        Result
- Infinity   +0
-0           +1
+0           +1
+Infinity    +Infinity
NaN          QNaN
Special Registers Altered:
None

For each integer value i from 0 to 3, do the following.
The single-precision floating-point estimate of the
base 2 logarithm of single-precision floating-point
element i in VRB is placed into the corresponding
word element of VRT.
Let x be any single-precision floating-point input value.
Unless | x-1 | is less than or equal to 0.125 or the
single-precision floating-point result of computing the
base 2 logarithm of x would be an infinity or a QNaN,
the estimate has an absolute error in precision
(absolute value of the difference between the estimate
and the infinitely precise value) no greater than 2^-5.
Under the same conditions, the estimate has a relative
error in precision no greater than one part in 8.
The most significant 12 bits of the estimate’s
significand are monotonic. The estimate is exact if
x=2^y, where y is an integer between -149 and +127
inclusive. Otherwise the value placed into the element
of register VRT may vary between implementations,
and between different executions on the same
implementation.
The result for various special cases of the source
value is given below.
Value        Result
- Infinity   QNaN
<0           QNaN
-0           - Infinity
+0           - Infinity
+Infinity    +Infinity
NaN          QNaN
For each integer value i from 0 to 3, do the following.
The single-precision floating-point estimate of the
reciprocal of single-precision floating-point
element i in VRB is placed into word element i of
VRT.
Unless the single-precision floating-point result of
computing the reciprocal of a value would be a zero,
an infinity, or a QNaN, the estimate has a relative error
in precision no greater than one part in 4096.
Note that results may vary between implementations,
and between different executions on the same
implementation.
The result for various special cases of the source
value is given below.
Value        Result
- Infinity   -0
-0           - Infinity
+0           + Infinity
+Infinity    +0
NaN          QNaN
Special Registers Altered:
None

For each integer value i from 0 to 3, do the following.
The single-precision floating-point estimate of the
reciprocal of the square root of single-precision
floating-point element i in VRB is placed into word
element i of VRT.
Let x be any single-precision floating-point value.
Unless the single-precision floating-point result of
computing the reciprocal of the square root of x would
be a zero, an infinity, or a QNaN, the estimate has a
relative error in precision no greater than one part in
4096.
Note that results may vary between implementations,
and between different executions on the same
implementation.
The result for various special cases of the source
value is given below.
Value        Result
- Infinity   QNaN
<0           QNaN
-0           - Infinity
+0           + Infinity
+Infinity    +0
NaN          QNaN
Vector AES Inverse Cipher VX-form Vector AES Inverse Cipher Last VX-form
vncipher VRT,VRA,VRB vncipherlast VRT,VRA,VRB
State ← VR[VRA]
VR[VRT] ← SubBytes(State)
do i = 0 to 7
   prod[i].bit[0:30] ← 0
   srcA ← VR[VRA].halfword[i]
   srcB ← VR[VRB].halfword[i]
   do j = 0 to 15
      do k = 0 to j
         gbit ← srcA.bit[k] & srcB.bit[j-k]
         prod[i].bit[j] ← prod[i].bit[j] ^ gbit
      end
   end
   do j = 16 to 30
      do k = j-15 to 15
         gbit ← (srcA.bit[k] & srcB.bit[j-k])
         prod[i].bit[j] ← prod[i].bit[j] ^ gbit
      end
   end
end
VR[VRT].word[0] ← 0b0 || (prod[0] ^ prod[1])
VR[VRT].word[1] ← 0b0 || (prod[2] ^ prod[3])
VR[VRT].word[2] ← 0b0 || (prod[4] ^ prod[5])
VR[VRT].word[3] ← 0b0 || (prod[6] ^ prod[7])
For each integer value i from 0 to 7, do the following.
Let prod[i] be the 31-bit result of a binary
polynomial multiplication of the contents of
halfword element i of VR[VRA] and the contents of
halfword element i of VR[VRB].
For each integer value i from 0 to 3, do the following.
The exclusive-OR of prod[2×i] and prod[2×i+1] is
placed in bits 1:31 of word element i of VR[VRT].
Bit 0 of word element i of VR[VRT] is set to 0.
Special Registers Altered:
None

do i = 0 to 3
   prod[i].bit[0:62] ← 0
   srcA ← VR[VRA].word[i]
   srcB ← VR[VRB].word[i]
   do j = 0 to 31
      do k = 0 to j
         gbit ← srcA.bit[k] & srcB.bit[j-k]
         prod[i].bit[j] ← prod[i].bit[j] ^ gbit
      end
   end
   do j = 32 to 62
      do k = j-31 to 31
         gbit ← (srcA.bit[k] & srcB.bit[j-k])
         prod[i].bit[j] ← prod[i].bit[j] ^ gbit
      end
   end
end
VR[VRT].dword[0] ← 0b0 || (prod[0] ^ prod[1])
VR[VRT].dword[1] ← 0b0 || (prod[2] ^ prod[3])
For each integer value i from 0 to 3, do the following.
Let prod[i] be the 63-bit result of a binary
polynomial multiplication of the contents of word
element i of VR[VRA] and the contents of word
element i of VR[VRB].
For each integer value i from 0 to 1, do the following.
The exclusive-OR of prod[2×i] and prod[2×i+1] is
placed in bits 1:63 of doubleword element i of
VR[VRT]. Bit 0 of doubleword element i of VR[VRT]
is set to 0.
Special Registers Altered:
None
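The binary polynomial multiplication and pairwise exclusive-OR described above can be sketched with integers, where each bit is a polynomial coefficient over GF(2). This is an illustrative Python sketch for the halfword case; the names `clmul` and `poly_mul_sum_halfwords` are hypothetical, and elements are passed as plain integers with element 0 first.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (binary polynomial) product of two non-negative integers."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a      # add (XOR) the shifted multiplicand
        a <<= 1
        b >>= 1
    return prod

def poly_mul_sum_halfwords(vra, vrb):
    """vra, vrb: eight 16-bit values each; returns four 32-bit words."""
    prods = [clmul(a, b) for a, b in zip(vra, vrb)]   # each fits in 31 bits
    # Bit 0 (the most significant bit) of every result word stays 0.
    return [prods[2 * i] ^ prods[2 * i + 1] for i in range(4)]

print(hex(clmul(0xFFFF, 0xFFFF)))
```

Because two 16-bit polynomials have degree at most 15, their product has degree at most 30, which is why 31 bits are enough and the top bit of each word is zero.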
do i = 0 to 15
indexA ← VR[VRC].byte[i].bit[0:3]
indexB ← VR[VRC].byte[i].bit[4:7]
src1 ← VR[VRA].byte[indexA]
src2 ← VR[VRB].byte[indexB]
VSR[VRT].byte[i] ← src1 ^ src2
end
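The loop above selects one byte from each source using the two nibbles of the control byte and XORs them. A minimal Python sketch follows, assuming each register is given as a 16-byte `bytes` object with byte element 0 first; the function name `permute_xor` is hypothetical.

```python
def permute_xor(vra: bytes, vrb: bytes, vrc: bytes) -> bytes:
    """For each byte i, XOR vra[high nibble of vrc[i]] with vrb[low nibble]."""
    out = bytearray(16)
    for i in range(16):
        index_a = vrc[i] >> 4      # bits 0:3 of the control byte
        index_b = vrc[i] & 0xF     # bits 4:7 of the control byte
        out[i] = vra[index_a] ^ vrb[index_b]
    return bytes(out)

result = permute_xor(bytes(range(16)), bytes(range(16, 32)), bytes([0x10] * 16))
print(result.hex())
```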
Vector Gather Bits by Bytes by
Doubleword VX-form
vgbbd VRT,VRB
4 VRT /// VRB 1292
0 6 11 16 21 31
do i = 0 to 1
   do j = 0 to 7
      do k = 0 to 7
         b ← VSR[VRB].dword[i].byte[k].bit[j]
         VSR[VRT].dword[i].byte[j].bit[k] ← b
      end
   end
end
Let src be the contents of VR[VRB], composed of two
doubleword elements numbered 0 and 1.
Let each doubleword element be composed of eight
bytes numbered 0 through 7.
An 8-bit × 8-bit bit-matrix transpose is performed on
the contents of each doubleword element of VR[VRB]
(see Figure 107).
For each integer value i from 0 to 1, do the following.
The contents of bit 0 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 0 of doubleword element i of VR[VRT].
The contents of bit 1 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 1 of doubleword element i of VR[VRT].
The contents of bit 2 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 2 of doubleword element i of VR[VRT].
The contents of bit 3 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 3 of doubleword element i of VR[VRT].
The contents of bit 4 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 4 of doubleword element i of VR[VRT].
The contents of bit 5 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 5 of doubleword element i of VR[VRT].
The contents of bit 6 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 6 of doubleword element i of VR[VRT].
The contents of bit 7 of each byte of doubleword
element i of VR[VRB] are concatenated and placed
into byte 7 of doubleword element i of VR[VRT].
Special Registers Altered:
None
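The 8-bit × 8-bit transpose described above can be sketched for one doubleword as follows. This is an illustrative Python sketch, assuming the doubleword is an 8-byte `bytes` object with byte 0 first and bit 0 taken as the most significant bit of each byte; the function name `gather_bits_by_bytes` is hypothetical.

```python
def gather_bits_by_bytes(dword: bytes) -> bytes:
    """Bit-matrix transpose: output byte j, bit k = input byte k, bit j."""
    out = bytearray(8)
    for j in range(8):
        for k in range(8):
            b = (dword[k] >> (7 - j)) & 1   # bit j of input byte k
            out[j] |= b << (7 - k)          # becomes bit k of output byte j
    return bytes(out)

print(gather_bits_by_bytes(bytes([0x80] * 8)).hex())
```

Applying the function twice returns the original doubleword, as expected for a transpose.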
Vector Count Leading Zeros Byte VX-form
vclzb VRT,VRB
4 VRT /// VRB 1794
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 15
   n ← 0
   do while n < 8
      if VR[VRB].byte[i].bit[n] = 0b1 then leave
      n ← n + 1
   end
   VSR[VRT].byte[i] ← n
end
For each integer value i from 0 to 15, do the following.
A count of the number of consecutive zero bits
starting at bit 0 of byte element i of VR[VRB] is
placed into byte element i of VR[VRT]. This
number ranges from 0 to 8, inclusive.
Special Registers Altered:
None

Vector Count Leading Zeros Word
VX-form
vclzw VRT,VRB
4 VRT /// VRB 1922
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 3
   n ← 0
   do while n < 32
      if VR[VRB].word[i].bit[n] = 0b1 then leave
      n ← n + 1
   end
   VSR[VRT].word[i] ← n
end
For each integer value i from 0 to 3, do the following.
A count of the number of consecutive zero bits
starting at bit 0 of word element i of VR[VRB] is
placed into word element i of VR[VRT]. This
number ranges from 0 to 32, inclusive.
Special Registers Altered:
None
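The per-element loop above scans from bit 0 (the most significant bit) until it finds a 1. A minimal Python sketch for one element of a given width follows; the function name `count_leading_zeros` is hypothetical, and the element is passed as a non-negative integer.

```python
def count_leading_zeros(value: int, width: int) -> int:
    """Count consecutive zero bits starting at the most significant bit."""
    n = 0
    while n < width and not (value >> (width - 1 - n)) & 1:
        n += 1
    return n

# Byte and word elements use the same loop with width 8 and 32.
print(count_leading_zeros(0x10, 8), count_leading_zeros(1, 32))
```

An all-zero element yields the element width, which is why the documented range includes 8 and 32.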
For each integer value i from 0 to 7, do the following.
A count of the number of consecutive zero bits
starting at bit 0 of halfword element i of VR[VRB] is
placed into halfword element i of VR[VRT]. This
number ranges from 0 to 16, inclusive.
Special Registers Altered:
None

For each integer value i from 0 to 1, do the following.
A count of the number of consecutive zero bits
starting at bit 0 of doubleword element i of
VR[VRB] is placed into doubleword element i of
VR[VRT]. This number ranges from 0 to 64,
inclusive.
Special Registers Altered:
None
Vector Count Trailing Zeros Byte VX-form
vctzb VRT,VRB
4 VRT 28 VRB 1538
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 15
   n ← 0
   do while n < 8
      if VR[VRB].byte[i].bit[7-n] = 0b1 then leave
      n ← n + 1
   end
   VR[VRT].byte[i] ← Chop(EXTZ(n), 8)
end
For each integer value i from 0 to 15, do the following.
A count of the number of consecutive zero bits
starting at bit 7 of byte element i of VR[VRB] is
placed into byte element i of VR[VRT]. This number
ranges from 0 to 8, inclusive.
Special Registers Altered:
None

Vector Count Trailing Zeros Word
VX-form
vctzw VRT,VRB
4 VRT 30 VRB 1538
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 3
   n ← 0
   do while n < 32
      if VR[VRB].word[i].bit[31-n] = 0b1 then leave
      n ← n + 1
   end
   VR[VRT].word[i] ← Chop(EXTZ(n), 32)
end
For each integer value i from 0 to 3, do the following.
A count of the number of consecutive zero bits
starting at bit 31 of word element i of VR[VRB] is
placed into word element i of VR[VRT]. This number
ranges from 0 to 32, inclusive.
Special Registers Altered:
None
count ← 0
do while count < 16
   if (VR[VRB].byte[count].bit[7]=1) break
   count ← count + 1
end
Let count be the number of contiguous leading byte
elements in VR[VRB] having a zero least-significant bit.

count ← 0
do while count < 16
   if (VR[VRB].byte[15-count].bit[7]=1) break
   count ← count + 1
end
Let count be the number of contiguous trailing byte
elements in VR[VRB] having a zero least-significant bit.
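The leading and trailing counts described above differ only in the direction of the scan. A minimal Python sketch follows, assuming the register is a 16-byte `bytes` object with byte element 0 first; the function name `count_zero_lsb_bytes` and its `trailing` parameter are hypothetical.

```python
def count_zero_lsb_bytes(vrb: bytes, trailing: bool = False) -> int:
    """Count contiguous bytes whose least-significant bit is 0.

    Scans from byte 0 upward, or from byte 15 downward when trailing=True.
    """
    order = reversed(vrb) if trailing else vrb
    count = 0
    for byte in order:
        if byte & 1:          # bit 7 in big-endian bit numbering
            break
        count += 1
    return count

v = bytes([2, 4, 1] + [0] * 13)
print(count_zero_lsb_bytes(v), count_zero_lsb_bytes(v, trailing=True))
```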
Let index be the contents of bits 60:63 of GPR[RA]. Let index be the contents of bits 60:63 of GPR[RA].
The contents of byte element index of VR[VRB] are The contents of byte element 15-index of VR[VRB] are
placed into bits 56:63 of GPR[RT]. placed into bits 56:63 of GPR[RT].
The contents of bits 0:55 of GPR[RT] are set to 0. The contents of bits 0:55 of GPR[RT] are set to 0.
Let index be the contents of bits 60:63 of GPR[RA]. Let index be the contents of bits 60:63 of GPR[RA].
The contents of byte elements index:index+1 of The contents of byte elements 14-index:15-index of
VR[VRB] are placed into bits 48:63 of GPR[RT]. VR[VRB] are placed into bits 48:63 of GPR[RT].
The contents of bits 0:47 of GPR[RT] are set to 0. The contents of bits 0:47 of GPR[RT] are set to 0.
If the value of index is greater than 14, the results are If the value of index is greater than 14, the results are
undefined. undefined.
Let index be the contents of bits 60:63 of GPR[RA]. Let index be the contents of bits 60:63 of GPR[RA].
The contents of byte elements index:index+3 of The contents of byte elements index:index+3 of
VR[VRB] are placed into bits 32:63 of GPR[RT]. VR[VRB] are placed into bits 32:63 of GPR[RT].
The contents of bits 0:31 of GPR[RT] are set to 0. The contents of bits 0:31 of GPR[RT] are set to 0.
If the value of index is greater than 12, the results are If the value of index is greater than 12, the results are
undefined. undefined.
do i = 0 to 1
   n ← 0
   do j = 0 to 63
      n ← n + VR[VRB].dword[i].bit[j]
   end
   VSR[VRT].dword[i] ← n
end
For each integer value i from 0 to 1, do the following.
A count of the number of bits set to 1 in
doubleword element i of VR[VRB] is placed into
doubleword element i of VR[VRT]. This number
ranges from 0 to 64, inclusive.

do i = 0 to 3
   n ← 0
   do j = 0 to 31
      n ← n + VR[VRB].word[i].bit[j]
   end
   VSR[VRT].word[i] ← n
end
For each integer value i from 0 to 3, do the following.
A count of the number of bits set to 1 in word
element i of VR[VRB] is placed into word element i
of VR[VRT]. This number ranges from 0 to 32,
inclusive.
Vector Bit Permute Doubleword VX-form
vbpermd VRT,VRA,VRB
4 VRT VRA VRB 1484
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 1
   do j = 0 to 7
      index ← VR[VRB].dword[i].byte[j]
      if index < 64 then
         perm.bit[j] ← VR[VRA].dword[i].bit[index]
      else
         perm.bit[j] ← 0
   end
   VR[VRT].dword[i] ← EXTZ64(perm)
end
For each integer value i from 0 to 1, and for each
integer value j from 0 to 7, do the following.
Let index be the contents of byte sub-element j of
doubleword element i of VR[VRB].
If index is less than 64, then the contents of bit
index of doubleword i of VR[VRA] are placed into
bit 56+j of doubleword element i of VR[VRT].
Otherwise, bit 56+j of doubleword element i of
VR[VRT] is set to 0.
The contents of bits 0:55 of doubleword element i
of VR[VRT] are set to 0.

Vector Bit Permute Quadword VX-form
vbpermq VRT,VRA,VRB
4 VRT VRA VRB 1356
0 6 11 16 21 31
if MSR.VEC=0 then Vector_Unavailable()
do i = 0 to 15
   index ← VR[VRB].byte[i]
   if index < 128 then
      perm.bit[i] ← VR[VRA].bit[index]
   else
      perm.bit[i] ← 0
end
VR[VRT].dword[0] ← Chop(EXTZ(perm),64)
VR[VRT].dword[1] ← 0x0000_0000_0000_0000
For each integer value i from 0 to 15, do the following.
Let index be the contents of byte element i of
VR[VRB].
If index is less than 128, then the contents of bit
index of VR[VRA] are placed into bit 48+i of
VR[VRT]. Otherwise, bit 48+i of VR[VRT] is set to 0.
The contents of bits 0:47 of VR[VRT] are set to 0.
The contents of bits 64:127 of VR[VRT] are set to 0.
Special Registers Altered:
None
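The quadword bit permute above gathers 16 selected bits into bits 48:63 of the result. A minimal Python sketch follows, assuming each register is a 16-byte `bytes` object with byte 0 first and bit 0 as the most significant bit of the register; the function name `bit_permute_quadword` is hypothetical.

```python
def bit_permute_quadword(vra: bytes, vrb: bytes) -> bytes:
    """Select 16 bits of vra by the byte indices in vrb."""
    perm = 0
    for i in range(16):
        index = vrb[i]
        bit = 0
        if index < 128:
            # Bit `index` of the 128-bit value, counting from the left.
            bit = (vra[index // 8] >> (7 - index % 8)) & 1
        perm = (perm << 1) | bit          # perm.bit[i], bit 0 leftmost
    # Zero-extend perm into doubleword 0; doubleword 1 is all zeros.
    return perm.to_bytes(8, "big") + bytes(8)

src = bytes([0x80] + [0] * 15)            # only bit 0 is set
print(bit_permute_quadword(src, bytes([0] + [128] * 15)).hex())
```

Indices of 128 or more contribute a 0 bit, which is the property the Programming Note below relies on.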
Programming Note
The fact that the permuted bit is 0 if the corresponding index value exceeds 127 permits the permuted bits to be
selected from a 256-bit quantity, using a single index register. For example, assume that the 256-bit quantity Q,
from which the permuted bits are to be selected, is in registers v2 (high-order 128 bits of Q) and v3 (low-order
128 bits of Q), that the index values are in register v1, with each byte of v1 containing a value in the range 0:255,
and that each byte of register v4 contains the value 128. The following code sequence selects eight permuted
bits from Q and places them into the low-order byte of v6.
Source operands with sign codes of 0b1010, 0b1100, Negative results are encoded with a sign code of
0b1110, and 0b1111 are interpreted as positive values. 0b1101.
Let src1 be the decimal integer value in VR[VRA]. Let src1 be the decimal integer value in VR[VRA].
Let src2 be the decimal integer value in VR[VRB]. Let src2 be the decimal integer value in VR[VRB].
If the unbounded result is equal to zero, do the If the unbounded result is equal to zero, do the
following. following.
If PS=0, the sign code of the result is set to 0b1100. If PS=0, the sign code of the result is set to 0b1100.
If PS=1, the sign code of the result is set to 0b1111. If PS=1, the sign code of the result is set to 0b1111.
If the unbounded result is greater than zero, do the If the unbounded result is greater than zero, do the
following. following.
If PS=0, the sign code of the result is set to 0b1100. If PS=0, the sign code of the result is set to 0b1100.
If PS=1, the sign code of the result is set to 0b1111. If PS=1, the sign code of the result is set to 0b1111.
If the operation overflows, CR field 6 is set to If the operation overflows, CR field 6 is set to
0b0101. Otherwise, CR field 6 is set to 0b0100. 0b0101. Otherwise, CR field 6 is set to 0b0100.
If the unbounded result is less than zero, do the If the unbounded result is less than zero, do the
following. following.
The sign code of the result is set to 0b1101. The sign code of the result is set to 0b1101.
If the operation overflows, CR field 6 is set to If the operation overflows, CR field 6 is set to
0b1001. Otherwise, CR field 6 is set to 0b1000. 0b1001. Otherwise, CR field 6 is set to 0b1000.
The low-order 31 digits of the magnitude of the result The low-order 31 digits of the magnitude of the result
are placed in bits 0:123 of VR[VRT]. are placed in bits 0:123 of VR[VRT].
The sign code is placed in bits 124:127 of VR[VRT]. The sign code is placed in bits 124:127 of VR[VRT].
If either src1 or src2 is an invalid encoding of a 31-digit If either src1 or src2 is an invalid encoding of a 31-digit
signed decimal value, the result is undefined and CR signed decimal value, the result is undefined and CR
field 6 is set to 0b0001. field 6 is set to 0b0001.
Decimal Convert From National VX-form Let src be the national decimal value in VR[VRB].
bcdcfn. VRT,VRB,PS src is placed in VR[VRT] in packed decimal format.
4 VRT 7 VRB 1 PS 385
0 6 11 16 21 22 23 31
A valid encoding of a national decimal value requires
the following.
if MSR.VEC=0 then Vector_Unavailable() – The contents of halfword 7 (sign code) must be
either 0x002B or 0x002D.
src_sign ← (VR[VRB].hword[7] = 0x002D) – The contents of halfwords 0 to 6 must be in the
eq_flag ← 1 range 0x0030 to 0x0039.
/* check for valid sign */
inv_flag ← (VR[VRB].hword[7] != 0x002B) & National decimal values having a sign code of 0x002B
(VR[VRB].hword[7] != 0x002D) are interpreted as positive values.
CR.bit[56] ← inv_flag ? 0b0 : lt_flag CR field 6 is set to reflect src compared to zero.
CR.bit[57] ← inv_flag ? 0b0 : gt_flag
CR.bit[58] ← inv_flag ? 0b0 : eq_flag If src is an invalid encoding of a national decimal
CR.bit[59] ← inv_flag
value, the contents of VR[VRT] are undefined and CR
field 6 is set to 0b0001.
Decimal Convert From Zoned VX-form Zoned decimal values having a sign code of 0x0,
0x1, 0x2, 0x3, 0x8, 0x9, 0xA, or 0xB are interpreted
bcdcfz. VRT,VRB,PS
as positive values.
4 VRT 6 VRB 1 PS 385
0 6 11 16 21 22 23 31 Zoned decimal values having a sign code of 0x4,
0x5, 0x6, 0x7, 0xC, 0xD, 0xE, or 0xF are interpreted
if MSR.VEC=0 then Vector_Unavailable() as negative values.
/* check for valid sign */
When PS=1, do the following.
inv_flag ← ((VR[VRB].byte[15].nibble[0] < 0xA) & (PS=1)) |
(VR[VRB].byte[15].nibble[1] > 0x9)
A valid encoding of a zoned decimal source
operand requires the following.
/* check for valid digits */ – The contents of bits 0:3 of byte 15 (sign code)
MIN ← (PS=0) ? 0x30 : 0xF0 must be a value in the range 0xA to 0xF.
MAX ← (PS=0) ? 0x39 : 0xF9 – The contents of bits 0:3 of bytes 0 to 14 must
do i = 0 to 14 be the value 0xF.
inv_flag ← inv_flag | (VR[VRB].byte[i] < MIN) – The contents of bits 4:7 of bytes 0 to 15 must
| (VR[VRB].byte[i] > MAX)
be a value in the range 0x0 to 0x9.
end
if PS=0 then
Zoned decimal source operands having a sign
src_sign ← VR[VRB].nibble[30].bit[1] code of 0xA, 0xC, 0xE, or 0xF are interpreted as
else positive values.
src_sign ← (VR[VRB].nibble[30] = 0b1011) |
(VR[VRB].nibble[30] = 0b1101) Zoned decimal source operands having a sign
code of 0xB or 0xD are interpreted as negative
eq_flag ← 1 values.
result.nibble[31] ← (src_sign=0) ? 0xC : 0xD For each integer value i from 0 to 15,
The contents of nibble 1 of byte element i of src
VR[VRT] ← inv_flag ? undefined : result are placed into nibble element i+15 of VR[VRT].
CR.bit[56] ← inv_flag ? 0b0 : lt_flag CR field 6 is set to reflect src compared to zero.
CR.bit[57] ← inv_flag ? 0b0 : gt_flag
CR.bit[58] ← inv_flag ? 0b0 : eq_flag If src is an invalid encoding of a zoned decimal value,
CR.bit[59] ← inv_flag the contents of VR[VRT] are undefined and CR field 6 is
set to 0b0001.
Let src be the zoned decimal value in VR[VRB].
Special Registers Altered:
src is placed in VR[VRT] in packed decimal format.
CR field 6
When PS=0, do the following.
A valid encoding of a zoned decimal value
requires the following.
– The contents of bits 0:3 of byte 15 (sign code)
can be any value in the range 0x0 to 0xF.
– The contents of bits 0:3 of bytes 0 to 14 must
be the value 0x3.
– The contents of bits 4:7 of bytes 0 to 15 must
be a value in the range 0x0 to 0x9.
Decimal Convert To National VX-form
bcdctn. VRT,VRB
4 VRT 5 VRB 1 / 385
0 6 11 16 21 22 23 31
if MSR.VEC=0 then Vector_Unavailable()
ox_flag ← 0
do i = 0 to 23
   ox_flag ← ox_flag | (VR[VRB].nibble[i] != 0x0)
end
inv_flag ← (VR[VRB].nibble[31] < 0xA)
do i = 0 to 30
   inv_flag ← inv_flag | (VR[VRB].nibble[i] > 0x9)
end
src_sign ← (VR[VRB].nibble[31] = 0xB) |
           (VR[VRB].nibble[31] = 0xD)
eq_flag ← (VR[VRB].nibble[0:30] = 0)
lt_flag ← (eq_flag=0) & (src_sign=1)
gt_flag ← (eq_flag=0) & (src_sign=0)
do i = 0 to 6
   result.hword[i].nibble[0:2] ← 0x003
   result.hword[i].nibble[3] ← VR[VRB].nibble[i+24]
end

For each integer value i from 0 to 6, do the following.
The value 0x003 is placed into nibbles 0:2 of
halfword element i of VR[VRT].
The contents of nibble element i+24 of VR[VRB] are
placed into nibble 3 of halfword element i of
VR[VRT].
The contents of halfword element 7 (i.e., sign code) of
VR[VRT] are set to 0x002B for positive values and to
0x002D for negative values.
CR field 6 is set to reflect src compared to zero,
including whether or not src is too large to be
represented in national decimal format.
If src is an invalid encoding of a packed decimal value,
the contents of VR[VRT] are undefined and CR field 6 is
set to 0b0001.
Special Registers Altered:
CR field 6
src_sign ← (VR[VRB].nibble[31] = 0xB) |
           (VR[VRB].nibble[31] = 0xD)
eq_flag ← (VR[VRB].nibble[0:30] = 0)
lt_flag ← (eq_flag=0) & (src_sign=1)
gt_flag ← (eq_flag=0) & (src_sign=0)
do i = 0 to 14
   result.byte[i].nibble[0] ← (PS=0) ? 0x3 : 0xF
   result.byte[i].nibble[1] ← VR[VRB].nibble[i+15]
end
if src_sign=0 then
   result.byte[15].nibble[0] ← (PS=0) ? 0x3 : 0xC
else
   result.byte[15].nibble[0] ← (PS=0) ? 0x7 : 0xD
result.byte[15].nibble[1] ← VR[VRB].nibble[30]
VR[VRT] ← inv_flag ? undefined : result
CR.bit[56] ← inv_flag ? 0b0 : lt_flag
CR.bit[57] ← inv_flag ? 0b0 : gt_flag
CR.bit[58] ← inv_flag ? 0b0 : eq_flag
CR.bit[59] ← inv_flag | ox_flag

Negative zoned decimal results are returned with
a sign code of 0xD.
For each integer value i from 0 to 15, do the following.
The rightmost nibble of each digit i of the zoned
decimal result is set to the contents of nibble i+15
of src.
The result is placed into VR[VRT].
CR field 6 is set to reflect src compared to zero,
including whether or not src is too large to be
represented in zoned decimal format.
If src is an invalid encoding of a packed decimal value,
the contents of VR[VRT] are undefined and CR field 6 is
set to 0b0001.
Special Registers Altered:
CR field 6
For PS=0, the contents of nibble element 31 (i.e., sign
code) of VR[VRT] are set to 0xC for values greater than
or equal to 0 and to 0xD for values less than 0.
For PS=1, the contents of nibble element 31 (i.e., sign
code) of VR[VRT] are set to 0xF for values greater than
or equal to 0 and to 0xD for values less than 0.
If the signed integer value in VR[VRB] is greater than
10^31-1 or less than -(10^31-1), the value is too large to be
represented in packed decimal format, and the
contents of VR[VRT] are undefined.
CR field 6 is set to reflect src compared to zero and
whether or not src is too large in magnitude to be
represented in packed decimal format.

src is placed into VR[VRT] in signed integer format.
A valid encoding of a signed packed decimal value
requires the following.
– The contents of nibble 31 (sign code) must be a
value in the range 0xA to 0xF.
– The contents of each nibble 0-30 must be a value
in the range 0x0 to 0x9.
Packed decimal values with sign codes of 0xA, 0xC,
0xE, or 0xF are interpreted as positive values.
Packed decimal values with sign codes of 0xB or 0xD
are interpreted as negative values.
CR field 6 is set to reflect src compared to zero.
The decimal value in VR[VRA] is placed into VR[VRT] Packed decimal values with sign codes of 0xA, 0xC,
with the sign code of the decimal value in VR[VRB]. 0xE, or 0xF are interpreted as positive values.
CR field 6 is set to reflect the result compared to zero. Packed decimal values with sign codes of 0xB or 0xD
are interpreted as negative values.
If either the decimal value in VR[VRA] or the decimal
value in VR[VRB] is an invalid encoding, the contents of If src is negative, src is placed into VR[VRT] with the
VR[VRT] are undefined and CR field 6 is set to 0b0001. sign code set to 0xD.
Special Registers Altered: If src is positive and PS=0, src is placed into VR[VRT]
CR field 6 with the sign code set to 0xC.
Decimal Shift VX-form Let n be the signed integer value in byte element 7 of
VR[VRA].
bcds. VRT,VRA,VRB,PS
Decimal Unsigned Shift VX-form Let n be the signed integer value in byte element 7 of
VR[VRA].
bcdus. VRT,VRA,VRB
eq_flag ← (VR[VRB].nibble[0:31] = 0)
gt_flag ← (eq_flag=0)
if (n >si 0) then do // shift left
   shcnt ← (n<33) ? n : 32
   src.nibble[0:31] ← VR[VRB]
   src.nibble[32:63] ← DUP(0b0000,32)
   result ← src.nibble[shcnt:shcnt+31]
   ox_flag ← (shcnt > 0) & (src.nibble[0:shcnt-1] != 0)
end
else do // shift right
   shcnt ← ((¬n+1)<33) ? (¬n+1) : 32
   src.nibble[0:31] ← DUP(0b0000,32)
   src.nibble[32:63] ← VR[VRB]
   result ← src.nibble[32-shcnt:63-shcnt]
   ox_flag ← 0
end
CR.bit[56] ← 0b0
CR.bit[57] ← inv_flag ? 0b0 : gt_flag
CR.bit[58] ← inv_flag ? 0b0 : eq_flag
CR.bit[59] ← inv_flag | ox_flag

If n is less than zero, src is shifted right -n digits. Zeros
are supplied to vacated digits on the left.
The shifted result is placed into VR[VRT].
CR field 6 is set to reflect src compared to zero,
including whether or not significant digits were shifted
out when the shift count is positive (i.e., left shift
operation).
If src is an invalid encoding of a packed decimal value,
the contents of VR[VRT] are undefined and CR field 6 is
set to 0b0001.
Special Registers Altered:
CR field 6
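The shift pseudocode above widens the 32-digit source to 64 digits and then selects a 32-digit window. A minimal Python sketch of that step follows, assuming the source is a list of 32 decimal digits with the most significant digit first; the function name `decimal_unsigned_shift` is hypothetical, and the validity check and CR field 6 update are left out.

```python
def decimal_unsigned_shift(digits, n):
    """Shift 32 decimal digits left (n > 0) or right (n <= 0).

    Returns (result_digits, overflow), where overflow reports non-zero
    digits shifted out on a left shift.
    """
    assert len(digits) == 32
    if n > 0:                              # shift left
        shcnt = min(n, 32)
        src = list(digits) + [0] * 32
        result = src[shcnt:shcnt + 32]
        overflow = any(src[:shcnt])
    else:                                  # shift right, or no shift
        shcnt = min(-n, 32)
        src = [0] * 32 + list(digits)
        result = src[32 - shcnt:64 - shcnt]
        overflow = False
    return result, overflow

value = [0] * 29 + [1, 2, 3]
print(decimal_unsigned_shift(value, 1)[0][-4:])
```

Clamping the shift count to 32 means any larger count simply produces an all-zero result.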
Decimal Shift and Round VX-form Let n be the signed integer value in byte element 7 of
VR[VRA].
bcdsr. VRT,VRA,VRB,PS
Decimal Truncate VX-form
bcdtrunc. VRT,VRA,VRB,PS

4    VRT   VRA   VRB   1    PS   257
0    6     11    16    21   22   23   31

   if MSR.VEC=0 then Vector_Unavailable()

   inv_flag ← (VR[VRB].nibble[31] < 0xA)
   do i = 0 to 30
      inv_flag ← inv_flag | (VR[VRB].nibble[i] > 0x9)
   end

   length ← VR[VRA].bit[48:63]
   ox_flag ← 0

   src_sign ← (VR[VRB].nibble[31] = 0xB) |
              (VR[VRB].nibble[31] = 0xD)

   eq_flag ← (VR[VRB].nibble[0:30] = 0)
   lt_flag ← src_sign & ¬eq_flag
   gt_flag ← ¬src_sign & ¬eq_flag

   if length < 31 then do
      do i = 0 to 30-length
         if VR[VRB].nibble[i]!=0b0000 then ox_flag ← 1
         result.nibble[i] ← 0b0000
      end
      if length > 0 then do
         do i = 31-length to 30
            result.nibble[i] ← VR[VRB].nibble[i]
         end
      end
   end
   else result.nibble[0:30] ← VR[VRB].nibble[0:30]

   result.nibble[31] ← (src_sign=0) ? ((PS=0) ? 0xC : 0xF) : 0xD

   VR[VRT] ← inv_flag ? undefined : result

   CR.bit[56] ← inv_flag ? 0b0 : lt_flag
   CR.bit[57] ← inv_flag ? 0b0 : gt_flag
   CR.bit[58] ← inv_flag ? 0b0 : eq_flag
   CR.bit[59] ← inv_flag | ox_flag

Let length be the integer value in bits 48:63 of VR[VRA].
Let src be the signed decimal value in VR[VRB].

A valid encoding of a packed decimal source operand requires the
following.
– The contents of nibble 31 (sign code) must be a value in the range
  0xA to 0xF.
– The contents of each nibble 0-30 must be a value in the range 0x0
  to 0x9.

Packed decimal values with sign codes of 0xA, 0xC, 0xE, or 0xF are
interpreted as positive values.

Packed decimal values with sign codes of 0xB or 0xD are interpreted
as negative values.

If src is negative, the sign code of the result is set to 0b1101.

If src is positive, the sign code of the result is set to 0b1100 if
PS=0 and is set to 0b1111 if PS=1.

src is copied into VR[VRT] with the leftmost 31-length digits each
set to 0b0000. If any of the leftmost 31-length digits of the signed
decimal value in VR[VRB] are non-zero, an overflow occurs.

CR field 6 is set to reflect src compared to zero, including whether
or not significant digits were truncated.

If src is an invalid encoding of a packed decimal value, the contents
of VR[VRT] are undefined and CR field 6 is set to 0b0001.

Special Registers Altered:
   CR field 6
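
As a cross-check of the truncate pseudocode above, the following Python sketch mirrors it step by step. It is a minimal model under stated assumptions, not a definitive implementation: the name bcdtrunc, the list-of-nibbles operand (index 31 holds the sign code), and the returned CR field 6 tuple are hypothetical, and the "undefined" result for an invalid operand is represented as None.

```python
def bcdtrunc(vrb, length, ps=0):
    """Model of the signed decimal truncate pseudocode (hypothetical helper).

    vrb: 32 nibble values; nibbles 0-30 are digits, nibble 31 is the sign code.
    length: number of rightmost digits to keep (value of VR[VRA] bits 48:63).
    Returns (result_nibbles_or_None, (bit56, bit57, bit58, bit59)).
    """
    if len(vrb) != 32:
        raise ValueError("expected 32 nibbles")
    inv_flag = vrb[31] < 0xA or any(d > 0x9 for d in vrb[:31])
    src_sign = vrb[31] in (0xB, 0xD)          # negative sign codes
    eq_flag = all(d == 0 for d in vrb[:31])   # source compared to zero
    lt_flag = src_sign and not eq_flag
    gt_flag = (not src_sign) and not eq_flag
    ox_flag = False
    if length < 31:
        cut = 31 - length                     # leftmost digits to clear
        ox_flag = any(d != 0 for d in vrb[:cut])
        digits = [0] * cut + list(vrb[cut:31])
    else:
        digits = list(vrb[:31])
    sign = 0xD if src_sign else (0xC if ps == 0 else 0xF)
    if inv_flag:
        return None, (0, 0, 0, 1)
    return digits + [sign], (int(lt_flag), int(gt_flag), int(eq_flag), int(ox_flag))
```

Truncating +123 to two digits gives +23 with the gt and overflow bits set (CR field 6 = 0b0101), because the discarded digit 1 is non-zero.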
Decimal Unsigned Truncate VX-form
bcdutrunc. VRT,VRA,VRB

4    VRT   VRA   VRB   1    /    321
0    6     11    16    21   22   23   31

   if MSR.VEC=0 then Vector_Unavailable()

   CR.bit[56] ← 0b0
   CR.bit[57] ← inv_flag ? 0b0 : gt_flag
   CR.bit[58] ← inv_flag ? 0b0 : eq_flag
   CR.bit[59] ← inv_flag | ox_flag

Let length be the integer value in bits 48:63 of VR[VRA].
Let src be the unsigned decimal value in VR[VRB].

A valid encoding of a packed decimal source operand requires that
the contents of each nibble 0-31 be a value in the range 0x0 to 0x9.
7.1 Introduction
[Figure: Vector-Scalar Registers. VSR[0] through VSR[63], each 128 bits wide (bits 0:127).]
[Figure: Vector-Scalar Register elements. Bits 0:127 hold one SQ/UQ/QP/BCD element, two SD/UD/MD/DP elements (0-1), four SW/UW/MW/SP elements (0-3), or eight HP elements (0-7).]
[Figure: Floating-Point Registers as part of the VSRs. FPR[0] through FPR[31] occupy bits 0:63 of VSR[0] through VSR[31].]
[Figure: Vector Registers as part of the VSRs. VR[0] through VR[31] occupy bits 0:127 of VSR[32] through VSR[63].]
Result Flags
C   FL  FG  FE  FU    Result Value Class
1   0   0   0   1     Quiet NaN
0   1   0   0   1     - Infinity
0   1   0   0   0     - Normalized Number
1   1   0   0   0     - Denormalized Number
1   0   0   1   0     - Zero
0   0   0   1   0     + Zero
1   0   1   0   0     + Denormalized Number
0   0   1   0   0     + Normalized Number
0   0   1   0   1     + Infinity
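
The result-flag table above is effectively a lookup from the five flag bits to a value class. A minimal Python sketch of such a lookup follows. The names FPRF_CLASSES and decode_fprf are hypothetical, and packing the flags into a 5-bit integer with C as the most significant bit is an assumption made for the example (it simply follows the column order C FL FG FE FU).

```python
# Keys are (C, FL, FG, FE, FU) exactly as listed in the result-flag table.
FPRF_CLASSES = {
    (1, 0, 0, 0, 1): "Quiet NaN",
    (0, 1, 0, 0, 1): "-Infinity",
    (0, 1, 0, 0, 0): "-Normalized Number",
    (1, 1, 0, 0, 0): "-Denormalized Number",
    (1, 0, 0, 1, 0): "-Zero",
    (0, 0, 0, 1, 0): "+Zero",
    (1, 0, 1, 0, 0): "+Denormalized Number",
    (0, 0, 1, 0, 0): "+Normalized Number",
    (0, 0, 1, 0, 1): "+Infinity",
}


def decode_fprf(flags):
    """Return the value class for a 5-bit flag value, or None if unlisted.

    Assumption: bit 4 of `flags` is C and bit 0 is FU.
    """
    key = tuple((flags >> shift) & 1 for shift in (4, 3, 2, 1, 0))
    return FPRF_CLASSES.get(key)
```

For example, decode_fprf(0b00010) reports "+Zero", while a combination that does not appear in the table returns None.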
This section describes the floating-point arithmetic and exception
model supported by Vector-Scalar Extension. Except for extensions to
support 32-bit single-precision floating-point vector operations, the
models are identical to those described in Chapter 4. Floating-Point
Facility.

The processor (augmented by appropriate software support, where
required) implements a floating-point system compliant with the
ANSI/IEEE Standard 754-1985, IEEE Standard for Binary Floating-Point
Arithmetic (hereafter referred to as the IEEE standard). That standard
defines certain required "operations" (addition, subtraction, and so
on). Herein, the term floating-point operation is used to refer to one
of these required operations and to additional operations defined
(e.g., those performed by Multiply-Add or Reciprocal Estimate
instructions). A Non-IEEE mode is also provided. This mode, which is
permitted to produce results not in strict compliance with the IEEE
standard, allows shorter latency.

A floating-point number consists of a signed exponent and a signed
significand. The quantity expressed by this number is the product of
the significand and the number 2^exponent. Encodings are provided in
the data format to represent finite numeric values, ±Infinity, and
values that are “Not a Number” (NaN). Operations involving infinities
produce results obeying traditional mathematical conventions. NaNs
have no mathematical interpretation. Their encoding permits a variable
diagnostic information field. NaNs might be used to indicate such
things as uninitialized variables and can be produced by certain
invalid operations.

There is one class of exceptional events that occur during instruction
execution that is unique to Vector-Scalar Extension and
Floating-Point: the Floating-Point Exception. Floating-point
exceptions are signaled with bits set in the FPSCR. They can cause the
system floating-point enabled exception error handler to be invoked,
precisely or imprecisely, if the proper control bits are set.
These instructions are divided into two categories.

– computational instructions

  The computational instructions are those that perform addition,
  subtraction, multiplication, division, extracting the square root,
  rounding, conversion, comparison, and combinations of these
  operations. These instructions provide the floating-point
  operations. There are two forms of computational instructions,
  scalar, which perform a single floating-point operation, and vector,
  which perform either two double-precision floating-point operations
  or four single-precision operations. Computational instructions
  place status information into the Floating-Point Status and Control
  Register. They are the instructions described in Sections 7.6.1.3
  through 7.6.1.7.2.

– noncomputational instructions

  The noncomputational instructions are those that perform loads and
  stores, move the contents of a VSR to another floating-point
  register possibly altering the sign, and select the value from one
  of two VSRs based on the value in a third VSR. The

– Invalid Operation exception (VX)
     SNaN (VXSNAN)
     Infinity-Infinity (VXISI)
     Infinity÷Infinity (VXIDI)
     Zero÷Zero (VXZDZ)
     Infinity×Zero (VXIMZ)
     Invalid Compare (VXVC)
     Software-Defined Condition (VXSOFT)
     Invalid Square Root (VXSQRT)
     Invalid Integer Convert (VXCVI)
– Zero Divide exception (ZX)
– Overflow exception (OX)
– Underflow exception (UX)
– Inexact exception (XX)

Each floating-point exception, and each category of Invalid Operation
exception, has an exception bit in the FPSCR. In addition, each
floating-point exception has a corresponding enable bit in the FPSCR.
See Section 7.2.2, “Floating-Point Status and Control Register” on
page 369 for a description of these exception and enable bits, and
Section 7.3.3, “VSX Floating-Point Execution Models” on page 386 for a
detailed discussion of floating-point exceptions, including the
effects of the enable bits.
S  EXP  FRACTION
0  1    6        15
Figure 112. Floating-point half-precision format

S  EXP  FRACTION
0  1    9        31
Figure 113. Floating-point single-precision format

S  EXP  FRACTION
0  1    12       63
Figure 114. Floating-point double-precision format

S  EXP  FRACTION
0  1    16       127
Figure 115. Floating-point quad-precision format (binary128)
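
The four layouts in Figures 112 through 115 differ only in field widths, so one routine can split any of them. The sketch below is illustrative: the names FIELD_WIDTHS and split_fields are hypothetical, and the widths are read off the bit positions in the figures (bit 0 is the leftmost bit, S).

```python
import struct

# (EXP width, FRACTION width) derived from the figure bit positions.
FIELD_WIDTHS = {
    "half": (5, 10),     # EXP bits 1:5,  FRACTION bits 6:15
    "single": (8, 23),   # EXP bits 1:8,  FRACTION bits 9:31
    "double": (11, 52),  # EXP bits 1:11, FRACTION bits 12:63
    "quad": (15, 112),   # EXP bits 1:15, FRACTION bits 16:127
}


def split_fields(bits, fmt):
    """Split an encoded value (as an unsigned integer) into (S, EXP, FRACTION)."""
    exp_w, frac_w = FIELD_WIDTHS[fmt]
    fraction = bits & ((1 << frac_w) - 1)
    exp = (bits >> frac_w) & ((1 << exp_w) - 1)
    sign = (bits >> (exp_w + frac_w)) & 1
    return sign, exp, fraction


# Usage: encode host floats with struct, then split the raw bits.
one_dp = int.from_bytes(struct.pack(">d", 1.0), "big")
neg_sp = int.from_bytes(struct.pack(">f", -2.5), "big")
print(split_fields(one_dp, "double"))  # sign, biased exponent, fraction of 1.0
print(split_fields(neg_sp, "single"))  # sign, biased exponent, fraction of -2.5
```

The integer argument holds the encoding with the sign in its most significant position, which matches the left-to-right bit numbering used in the figures.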
        Half                           Single                           Double                             Quad
Nmax    (2-2^-10) x 2^15 ≈ 6.6 x 10^4  (1-2^-24) x 2^128 ≈ 3.4 x 10^38  (1-2^-53) x 2^1024 ≈ 1.8 x 10^308  (1-2^-113) x 2^16384 ≈ 1.2 x 10^4932
Nmin    1.0 x 2^-14 ≈ 6.1 x 10^-5      1.0 x 2^-126 ≈ 1.2 x 10^-38      1.0 x 2^-1022 ≈ 2.2 x 10^-308      1.0 x 2^-16382 ≈ 3.4 x 10^-4932
Dmin    1.0 x 2^-24 ≈ 6.0 x 10^-8      1.0 x 2^-149 ≈ 1.4 x 10^-45      1.0 x 2^-1074 ≈ 4.9 x 10^-324      1.0 x 2^-16494 ≈ 6.5 x 10^-4966

≈     Value is approximate
Dmin  Smallest (in magnitude) representable denormalized number.
Nmax  Largest (in magnitude) representable number.
Nmin  Smallest (in magnitude) representable normalized number.

Table 3. IEEE floating-point fields
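
The limits in Table 3 can be checked against a host's IEEE arithmetic with exact rational numbers. The following Python sketch is one way to do so. The names TABLE3 and decode are hypothetical, and it assumes the host Python uses IEEE binary16/32/64 for the struct codes "e", "f", and "d". Quad precision is listed for completeness but not cross-checked, because struct has no binary128 code.

```python
from fractions import Fraction
import struct
import sys

# (Nmax, Nmin, Dmin) written exactly as in Table 3.
TABLE3 = {
    "half": ((2 - Fraction(1, 2**10)) * 2**15, Fraction(1, 2**14), Fraction(1, 2**24)),
    "single": ((1 - Fraction(1, 2**24)) * 2**128, Fraction(1, 2**126), Fraction(1, 2**149)),
    "double": ((1 - Fraction(1, 2**53)) * 2**1024, Fraction(1, 2**1022), Fraction(1, 2**1074)),
    "quad": ((1 - Fraction(1, 2**113)) * 2**16384, Fraction(1, 2**16382), Fraction(1, 2**16494)),
}


def decode(code, nbytes, bits):
    """Decode a big-endian bit pattern to an exact Fraction via struct."""
    return Fraction(struct.unpack(">" + code, bits.to_bytes(nbytes, "big"))[0])


# Largest finite, smallest normalized, smallest denormalized encodings.
half_ok = (decode("e", 2, 0x7BFF), decode("e", 2, 0x0400),
           decode("e", 2, 0x0001)) == TABLE3["half"]
single_ok = (decode("f", 4, 0x7F7FFFFF), decode("f", 4, 0x00800000),
             decode("f", 4, 0x00000001)) == TABLE3["single"]
double_ok = (Fraction(sys.float_info.max), Fraction(sys.float_info.min),
             decode("d", 8, 0x1)) == TABLE3["double"]
print(half_ok, single_ok, double_ok)
```

Using Fraction avoids any rounding in the comparison, so each check confirms the table's closed-form expression rather than its decimal approximation.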
7.3.2.2 Value Representation

This architecture defines numeric and nonnumeric values representable
within each of the supported formats. The numeric values are
approximations to the real numbers and include the normalized numbers,
denormalized numbers, and zero values. The nonnumeric values
representable are the infinities and the Not a Numbers (NaNs). The
infinities are adjoined to the real numbers, but are not numbers
themselves, and the standard rules of arithmetic do not hold when they
are used in an operation. They are related to the real numbers by
order alone. It is possible however to define restricted operations
among numbers and infinities as defined below. The relative location
on the real number line for each of the defined entities is shown in
Figure 116.

-INF   -NOR   -DEN   –0   +0   +DEN   +NOR   +INF
Figure 116. Approximation to real numbers

The NaNs are not related to the numeric values or infinities by order
or value but are encodings used to convey diagnostic information such
as the representation of uninitialized variables.

The following is a description of the different floating-point values
defined in the architecture:

Binary floating-point numbers
   Machine representable values used as approximations to real
   numbers. Three categories of numbers are supported: normalized
   numbers, denormalized numbers, and zero values.

Normalized numbers (±NOR)
   These are values that have a biased exponent value in the range:

      1 to 30 in half-precision format
      1 to 254 in single-precision format
      1 to 2046 in double-precision format
      1 to 32766 in quad-precision format

   They are values in which the implied unit bit is 1. Normalized
   numbers are interpreted as follows:

      NOR = (-1)^s x 2^E x (1.fraction)

   where s is the sign, E is the unbiased exponent, and 1.fraction is
   the significand, which is composed of a leading unit bit (implied
   bit) and a fraction part.

Zero values (±0)
   These are values that have a biased exponent value of zero and a
   fraction value of zero. Zeros can have a positive or negative sign.
   The sign of zero is ignored by comparison operations (that is,
   comparison regards +0 as equal to -0).

Denormalized numbers (±DEN)
   These are values that have a biased exponent value of zero and a
   nonzero fraction value. They are nonzero numbers smaller in
   magnitude than the representable normalized numbers. They are
   values in which the implied unit bit is 0. Denormalized numbers are
   interpreted as follows:

      DEN = (-1)^s x 2^Emin x (0.fraction)

   where Emin is the minimum representable exponent value:

      -14 for half-precision
      -126 for single-precision
      -1022 for double-precision
      -16382 for quad-precision

Infinities (±INF)
   These are values that have the maximum biased exponent value:

      31 in half-precision format
      255 in single-precision format
      2047 in double-precision format
      32767 in quad-precision format

   and a zero fraction value. They are used to approximate values
   greater in magnitude than the maximum normalized value.

   Infinity arithmetic is defined as the limiting case of real
   arithmetic, with restricted operations defined among numbers and
   infinities. Infinities and the real numbers can be related by
   ordering in the affine sense:

      -Infinity < every finite number < +Infinity

   Arithmetic on infinities is always exact and does not signal any
   exception, except when an exception occurs due to the invalid
   operations as described in Section 7.4.1, “Floating-Point Invalid
   Operation Exception” on page 392.

   For comparison operations, +Infinity compares equal to +Infinity
   and -Infinity compares equal to -Infinity.

Not a Numbers (NaNs)
   These are values that have the maximum biased exponent value and a
   nonzero fraction value. The sign bit is ignored (that is, NaNs are
   neither positive nor negative). If the high-order bit of the
   fraction field is 0, the NaN is a Signaling NaN; otherwise it is a
   Quiet NaN.
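
The classification rules in this section can be condensed into a short routine. The Python sketch below is a non-authoritative illustration: classify and PARAMS are hypothetical names, the inputs are the already separated S, EXP, and FRACTION fields, and the exponent bias is taken as half of the maximum biased exponent (15, 127, 1023, 16383), which is consistent with the Emin values listed above.

```python
from fractions import Fraction

# (EXP width, FRACTION width) for each format.
PARAMS = {"half": (5, 10), "single": (8, 23), "double": (11, 52), "quad": (15, 112)}


def classify(sign, biased_exp, fraction, fmt):
    """Return (class_name, exact_value_or_None) for the given fields."""
    exp_w, frac_w = PARAMS[fmt]
    max_exp = (1 << exp_w) - 1          # 31, 255, 2047, 32767
    bias = max_exp >> 1                 # 15, 127, 1023, 16383
    emin = 1 - bias                     # -14, -126, -1022, -16382
    s = -1 if sign else 1
    prefix = "-" if sign else "+"
    frac = Fraction(fraction, 1 << frac_w)
    if biased_exp == max_exp:
        if fraction == 0:
            return prefix + "INF", None
        # High-order fraction bit 0 means Signaling NaN, otherwise Quiet NaN.
        quiet = (fraction >> (frac_w - 1)) & 1
        return ("QNaN" if quiet else "SNaN"), None
    if biased_exp == 0:
        if fraction == 0:
            return prefix + "0", Fraction(0)
        return prefix + "DEN", s * Fraction(2) ** emin * frac       # 0.fraction
    return prefix + "NOR", s * Fraction(2) ** (biased_exp - bias) * (1 + frac)
```

For instance, half-precision fields (1, 0, 1) classify as -DEN with value -2^-24, the negative of Dmin for that format.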
When an arithmetic or rounding instruction produces an intermediate
result which carries out of the significand, or in which the
significand is nonzero but has a leading zero bit, it is not a
normalized number and must be normalized before it is stored. For the
carry-out case, the significand is shifted right one bit, with a one
shifted into the leading significand bit, and the exponent is
incremented by one. For the leading-zero case, the significand is
shifted left while decrementing its exponent by one for each bit
shifted, until the leading significand bit becomes one. The Guard bit
and the Round bit (see Section 7.3.3.1, “VSX Execution Model for IEEE
Operations” on page 386) participate in the shift with zeros shifted
into the Round bit. The exponent is regarded as if its range were
unlimited.

After normalization, or if normalization was not required, the
intermediate result can have a nonzero significand and an exponent
value that is less than the minimum value that can be represented in
the format specified for the result. In this case, the intermediate
result is said to be “Tiny” and the stored result is determined by the
rules described in Section 7.4.4, “Floating-Point Underflow Exception”
on page 411. These rules can require denormalization.

A number is denormalized by shifting its significand right while
incrementing its exponent by 1 for each bit shifted, until the
exponent is equal to the format’s minimum value. If any significant
bits are lost in this shifting process, “Loss of Accuracy” has
occurred (see Section 7.4.4, “Floating-Point Underflow Exception” on
page 411) and an Underflow exception is signaled.

Engineering Note
   When denormalized numbers are operands of multiply, divide, and
   square root operations, some implementations might prenormalize the
   operands internally before performing the operations.

7.3.2.5 Data Handling and Precision

Scalar double-precision floating-point data is represented in
double-precision format in VSRs and storage.

Double-precision operands may be used as input for double-precision
scalar arithmetic operations.

Double-precision operands may be used as input for single-precision
scalar arithmetic operations when trapping on overflow and underflow
exceptions is disabled.

Single-precision operands may be used as input for double-precision
and single-precision scalar arithmetic operations.

Double-precision operands may be used as input for double-precision
vector arithmetic operations.

Single-precision operands may be used as input for single-precision
vector arithmetic operations.

Instructions are also provided for manipulations which do not require
double-precision or single-precision. In addition, instructions are
provided to access an integer representation in GPRs.

Half-Precision Operands

Instructions are provided to convert between half-precision and
single-precision formats for vector data in VSRs and between
half-precision and double-precision formats for scalar data. Note that
scalar double-precision format is identical to scalar single-precision
format.

An instruction is provided to explicitly convert half-precision format
operands in a VSR to single-precision format. Scalar single-precision
floating-point is enabled with six types of instruction.

1. VSX Scalar Convert Half-Precision format to Double-Precision
   format XX2-form

   The half-precision floating-point value in the rightmost halfword
   of doubleword element 0 of the source VSR is placed into doubleword
   element 0 of the target VSR in double-precision format.

2. VSX Scalar round & Convert Double-Precision format to
   Half-Precision format XX2-form
For xsresp or xsrsqrtesp, if the input value is finite and has an
unbiased exponent greater than +127, the input value is interpreted as
an Infinity.

6. Store VSX Scalar Single-Precision

   stxsspx converts a single-precision value that is in
   double-precision format to single-precision format and stores that
   operand into storage. No floating-point exceptions are caused by
   stxsspx. (The value being stored is effectively assumed to be the
   result of an instruction of one of the preceding five types.)

When the result of a Load VSX Scalar Single-Precision (lxsspx), a VSX
Scalar Round to Single-Precision (xsrsp), or a VSX Scalar
Single-Precision Arithmetic[1] instruction is stored in a VSR, the
low-order 29 bits of FRACTION are zero.

Programming Note
   VSX Scalar Round to Single-Precision (xsrsp) is provided to allow
   value conversion from double-precision to single-precision with
   appropriate exception checking and rounding. xsrsp should be used
   to convert double-precision floating-point values to
   single-precision values prior to storing them into single format
   storage elements or using them as operands for single-precision
   arithmetic instructions. Values produced by single-precision load
   and arithmetic instructions are already single-precision values and
   can be stored directly into single format storage elements, or used
   directly as operands for single-precision arithmetic instructions,
   without preceding the store, or the arithmetic instruction, by an
   xsrsp.

Except for xsresp or xsrsqrtesp, any double-precision value can be
used in single-precision scalar arithmetic operations when OE=0 and
UE=0. When OE=1 or UE=1, or if the instruction is xsresp or
xsrsqrtesp, source operands must be representable in single-precision
format.

Some implementations may execute single-precision arithmetic
instructions faster than double-precision arithmetic instructions.
Therefore, if double-precision accuracy is not required,
single-precision data and instructions should be used.

Programming Note
   Both single-precision and double-precision forms are provided for
   most scalar floating-point instructions. Some scalar floating-point
   instructions are only provided in double-precision form since their
   operation is identical to the equivalent scalar single-precision
   operation.

   Of the operations for which only a double-precision form of the
   instruction is provided,

   – instructions that return the absolute value, the negative
     absolute value, or the negated value (xsnabsdp, xsabsdp,
     xsnegdp) can be used to perform these operations on scalar
     single-precision operands,

   – instructions that perform a comparison (xscmpodp, xscmpudp) can
     be used to perform these operations on scalar single-precision
     operands,

   – instructions that determine the maximum (xsmaxdp) or minimum
     (xsmindp) can be used to perform these operations on scalar
     single-precision operands, and
xsrdpic, xvrdpic, and xvrspic can also cause Inexact exception.

VSX Scalar Integer Doubleword to Single-Precision Format
Conversion[10] instructions convert a 64-bit signed or unsigned
integer to a single-precision floating-point value and return the
result in double-precision format.

VSX Vector Integer Doubleword to Double-Precision Format Conversion[1]
instructions convert the 64-bit signed or unsigned integer in each
doubleword element in the source vector operand to double-precision
floating-point format.

VSX Vector Integer Word to Double-Precision Format Conversion[2]
instructions convert the 32-bit signed or unsigned integer in each
odd-numbered word element in the source vector operand to
double-precision floating-point format.

VSX Vector Integer Doubleword to Single-Precision Format Conversion[3]
instructions convert the 64-bit signed or unsigned integer in each
doubleword element in the source vector operand to single-precision
floating-point format.

VSX Vector Integer Word to Single-Precision Format Conversion[4]
instructions convert the 32-bit signed or unsigned integer in each
word element in the source vector operand to single-precision
floating-point format.

Rounding is performed using the rounding mode specified in RN. Because
of the limitations of the source format, only an Inexact exception can
be generated.

7.3.2.6 Rounding

The material in this section applies to operations that have numeric
operands (that is, operands that are not infinities or NaNs). Rounding
the intermediate result of such an operation can cause an Overflow
exception, an Underflow exception, or an Inexact exception. The
remainder of this section assumes that the operation causes no
exceptions and that the result is numeric. See Section 7.3.2.2, “Value
Representation” and Section 7.4, “VSX Floating-Point Exceptions” for
the cases not covered here.

The floating-point arithmetic, and rounding and conversion
instructions round their intermediate results. With the exception of
the estimate instructions, these instructions produce an intermediate
result that can be regarded as having unbounded precision and exponent
range. All but two groups of these instructions normalize or
denormalize the intermediate result prior to rounding and then place
the final result into the target element of the target VSR in either
double-precision, single-precision, or quad-precision format.

The scalar round to double-precision integer, vector round to
double-precision integer, and convert double-precision to integer
instructions with biased exponents ranging from 1022 through 1074 are
prepared for rounding by repetitively shifting the significand right
one position and incrementing the biased exponent until it reaches a
value of 1075. (Intermediate results with biased exponents 1075 or
larger are already integers, and with biased exponents 1021 or less
round to zero.) After rounding, the final result for round to
double-precision integer instructions is normalized and put in
double-precision format, and, for the convert double-precision to
integer instructions, is converted to a signed or unsigned integer.

The vector round to single-precision integer and vector convert
single-precision to integer instructions with biased exponents ranging
from 126 through 178 are prepared for rounding by repetitively
shifting the significand right one position and incrementing the
biased exponent until it reaches a value of 179. (Intermediate results
with biased exponents 179 or larger are already integers, and with
biased exponents 125 or less round to zero.) After rounding, the final
result for vector round to single-precision integer is normalized and
put in double-precision format, and for vector convert
single-precision to integer is converted to a signed or unsigned
integer.

FR and FI generally indicate the results of rounding. Each of the
scalar instructions which rounds its intermediate result sets these
bits. There are no vector instructions that modify FR and FI. If the
fraction is incremented during rounding, FR is set to 1, otherwise FR
is set to 0. If the result is inexact, FI is set to 1, otherwise FI is
set to 0. The scalar round to double-precision integer instructions
are exceptions to this rule, setting FR and FI to 0. The scalar
double-precision estimate instructions set FR and FI to undefined
values. The remaining scalar floating-point instructions do not alter
FR and FI.
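
The FR and FI rule in this section ("fraction incremented" versus "result inexact") can be illustrated with exact rational arithmetic. The sketch below is only a model under stated assumptions: round_fraction is a hypothetical name, it handles a non-negative significand, it uses round-to-nearest-even as the example mode, and it does not renormalize if the increment carries out of the significand.

```python
from fractions import Fraction


def round_fraction(significand, frac_bits):
    """Round a non-negative significand to `frac_bits` fraction bits.

    Returns (rounded_value, FR, FI) where FR=1 if the fraction was
    incremented during rounding and FI=1 if the result is inexact.
    Rounding mode: round to nearest, ties to even (assumed for the example).
    """
    scaled = Fraction(significand) * (1 << frac_bits)
    truncated = scaled.numerator // scaled.denominator
    remainder = scaled - truncated
    inexact = remainder != 0
    half = Fraction(1, 2)
    incremented = remainder > half or (remainder == half and truncated % 2 == 1)
    rounded = Fraction(truncated + (1 if incremented else 0), 1 << frac_bits)
    return rounded, int(incremented), int(inexact)
```

With two fraction bits, 1.125 is a tie that rounds down to 1.0 (FR=0, FI=1), 1.375 is a tie that rounds up to 1.5 (FR=1, FI=1), and 1.25 is exact (FR=0, FI=0).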
Programming Note
Round to Odd rounding mode is useful when the results of a Quad-Precision Arithmetic instruction are required
to be rounded to a shorter precision while avoiding a double rounding error. In this case, the rounding mode of
the Quad-Precision Arithmetic instruction is overridden as Round To Odd by setting the RO bit in the instruction
encoding to 1, then the result of that Quad-Precision Arithmetic instruction can be rounded to the desired shorter
precision using the rounding mode specified in RN by following with a VSX Scalar Round Quad-Precision to
Double-Extended-Precision for 15-bit exponent range and 64-bit significand precision, VSX Scalar Round
Quad-Precision to Double-Precision for 11-bit exponent range and 53-bit significand precision, or VSX Scalar
Round Quad-Precision to Single-Precision for 8-bit exponent range and 24-bit significand precision. For
example,
Let Z be the intermediate arithmetic result or the operand of a
convert operation. If Z can be represented exactly in the target
format, the result in all rounding modes is Z as represented in the
target format. If Z cannot be represented exactly in the target
format, let Z1 and Z2 bound Z as the next larger and next smaller
numbers representable in the target format. Then Z1 or Z2 can be used
to approximate the result in the target format.

Figure 117 also summarizes the rounding actions for floating-point
intermediate result for all supported rounding modes.

[Figure: number line with Z lying between Z2 and Z1, shown once for negative values (left of 0) and once for positive values (right of 0).]
Otherwise, choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the one that is furthest
away from 0.
Otherwise, choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the one that is even (least
significant bit is 0).
Round to Odd
Choose Z if Z is representable in the target precision.
Otherwise, choose the value (Z1 or Z2) that is odd (least significant bit is 1).
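
The choice between Z1 and Z2 described above can be sketched as follows. This is an illustration under simplifying assumptions, not the architected algorithm: the integers stand in for the values representable in the target format (so Z1 is the ceiling and Z2 the floor of Z), "even" and "odd" refer to the integer's least significant bit, and the function and mode names are hypothetical. Only the three selection rules quoted above are modelled.

```python
from fractions import Fraction
import math

MODES = ("nearest_away", "nearest_even", "odd")


def round_z(z, mode):
    """Pick Z, Z1, or Z2 for an exact value z under the given rule."""
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)
    z = Fraction(z)
    z1 = math.ceil(z)    # next larger representable value
    z2 = math.floor(z)   # next smaller representable value
    if z1 == z2:         # Z is representable: every mode returns Z
        return z1
    if mode == "odd":
        return z1 if z1 % 2 == 1 else z2
    d1, d2 = z1 - z, z - z2
    if d1 != d2:         # not a tie: choose the closer value
        return z1 if d1 < d2 else z2
    if mode == "nearest_even":
        return z1 if z1 % 2 == 0 else z2
    return z1 if abs(z1) > abs(z2) else z2   # tie: furthest from 0
```

For Z = 2.5 the three rules give 3 (tie away from zero), 2 (tie to even), and 3 (odd). For Z = -2.5 they give -3, -2, and -3.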
0 1 1
1 0 0   IR midway between NL and NH
1 0 1   IR closer to NH
1 1 0
1 1 1

– Round to Odd
  If IR is exact, choose IR.
  Otherwise, choose NL, and if G=1, R=1, or X=1, the least-significant
  bit of the result is set to 1.

Four of the rounding modes are user-selectable through RN.
S C L FRACTION X’
0 1 2 3 106
Double 53 54 OR of 55:105, X’
Single 24 25 OR of 26:105, X’
These exceptions, other than Invalid Operation exception resulting
from a Software-Defined Condition, can occur during execution of
computational instructions. An Invalid Operation exception resulting
from a Software-Defined Condition occurs when a Move To FPSCR
instruction sets VXSOFT to 1.

Each floating-point exception, and each category of Invalid Operation
exception, has an exception bit in the FPSCR. In addition, each
floating-point exception has a corresponding enable bit in the FPSCR.
The exception bit indicates the occurrence of the corresponding
exception. If an exception occurs, the corresponding enable bit
governs the result produced by the instruction and, in conjunction
with the FE0 and FE1 bits (see page 390), whether and how the system
floating-point enabled exception error handler is invoked. In general,
the enabling specified by the enable bit is of invoking the system
error handler, not of permitting the exception to occur. The
occurrence of an exception depends only on the instruction and its
inputs, not on the setting of any control bits. The only deviation
from this general rule is that the occurrence of an Underflow
exception depends on the setting of the enable bit.

– An Invalid Operation exception (SNaN) can be set with an Invalid
  Operation exception (Invalid Integer Convert) for convert to integer
  instructions.

When an exception occurs, the writing of a result to the target
register can be suppressed, or a result can be delivered, depending on
the exception.

The writing of a result to the target register is suppressed for
certain kinds of exceptions, based on whether the instruction is a
vector or a scalar instruction, so that there is no possibility that
one of the operands is lost. For other kinds of exceptions and also
depending on whether the instruction is a vector or a scalar
instruction, a result is generated and written to the destination
specified by the instruction causing the exception. The result can be
a different value for the enabled and disabled conditions for some of
these exceptions. Table 7 lists the types of exceptions and indicates
whether a result is written to the target VSR or suppressed.

                             Scalar        Vector
On exception type...         Instruction   Instruction
                             Results       Results
Enabled Invalid Operation    suppressed    suppressed
Enabled Zero Divide          suppressed    suppressed
Enabled Overflow             written       suppressed
Enabled Underflow            written       suppressed
Enabled Inexact              written       suppressed
Disabled Invalid Operation   written       written
                             Scalar        Vector
On exception type...         Instruction   Instruction
                             Results       Results
Disabled Zero Divide         written       written
Disabled Overflow            written       written
Disabled Underflow           written       written
Disabled Inexact             written       written

Table 7. Exception Types Result Suppression

The subsequent sections define each of the floating-point exceptions
and specify the action that is taken when they are detected.

The IEEE standard specifies the handling of exceptional conditions in
terms of traps and trap handlers. In this architecture, an FPSCR
exception enable bit of 1 causes generation of the result value
specified in the IEEE standard for the trap enabled case; the
expectation is that the exception is detected by software, which
revises the result. An FPSCR exception enable bit of 0 causes
generation of the default result value specified for the trap disabled
(or no trap occurs or trap is not implemented) case. The expectation
is that the exception is not detected by software, which uses the
default result. The result to be delivered in each case for each
exception is described in the following sections.

The IEEE default behavior when an exception occurs is to generate a
default value and not to notify software. In this architecture, if the
IEEE default behavior when an exception occurs is required for all
exceptions, all FPSCR exception enable bits must be set to 0, and
Ignore Exceptions Mode (see below) should be used. In this case, the
system floating-point enabled exception error handler is not invoked,
even if floating-point exceptions occur: software can inspect the
FPSCR exception bits, if necessary, to determine whether exceptions
have occurred.

In this architecture, if software is to be notified that a given kind
of exception has occurred, the corresponding FPSCR exception enable
bit must be set to 1, and a mode other than Ignore Exceptions Mode
must be used. In this case, the system floating-point enabled
exception error handler is invoked if an enabled floating-point
exception occurs. The system floating-point enabled exception error
handler is also invoked if a Move To FPSCR instruction causes an
exception bit and the corresponding enable bit both to be 1. The Move
To FPSCR instruction is considered to cause the enabled exception.

The FE0 and FE1 bits control whether and how the system floating-point
enabled exception error handler is invoked if an enabled
floating-point exception occurs. The location of these bits and the
requirements for altering them are described in Book III. The system
floating-point enabled exception error handler is never invoked
because of a disabled floating-point exception. The effects of the
four possible settings of these bits are as follows.

FE0 FE1  Description
0   0    Ignore Exceptions Mode
         Floating-point exceptions do not cause the system
         floating-point enabled exception error handler to be invoked.
0   1    Imprecise Nonrecoverable Mode
         The system floating-point enabled exception error handler is
         invoked at some point at or beyond the instruction that
         caused the enabled exception. It may not be possible to
         identify the excepting instruction or the data that caused
         the exception. Results produced by the excepting instruction
         might have been used by or might have affected subsequent
         instructions that are executed before the error handler is
         invoked.
1   0    Imprecise Recoverable Mode
         The system floating-point enabled exception error handler is
         invoked at some point at or beyond the instruction that
         caused the enabled exception. Sufficient information is
         provided to the error handler for it to identify the
         excepting instruction, the operands, and correct the result.
         No results produced by the excepting instruction have been
         used by or affected subsequent instructions that are executed
         before the error handler is invoked.
1   1    Precise Mode
         The system floating-point enabled exception error handler is
         invoked precisely at the instruction that caused the enabled
         exception.

In all cases, the question of whether a floating-point result is
stored, and what value is stored, is governed by the FPSCR exception
enable bits, as described in subsequent sections, and is not affected
by the value of the FE0 and FE1 bits.

In all cases in which the system floating-point enabled exception
error handler is invoked, all instructions before the instruction at
which the system floating-point enabled exception error handler is
invoked have been completed, and no instruction after the instruction
at which the system floating-point enabled exception error handler is
invoked has begun execution. The instruction at which the system
floating-point enabled exception error handler is invoked has
completed if it is the excepting instruction,
Programming Note
In any of the three non-Precise modes, a
Floating-Point Status and Control Register
instruction can be used to force any exceptions,
because of instructions initiated before the
Floating-Point Status and Control Register
instruction, to be recorded in the FPSCR. (This
forcing is superfluous for Precise Mode.)
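The rule for when software is notified can be sketched as a small Python model. This is an illustrative sketch only, not architected pseudocode: the names FP_EXCEPTION_MODES and handler_invoked are hypothetical, and the three non-Precise encodings are assumed from the architecture's FE0/FE1 mode table, of which only the Precise Mode row appears in this text.

```python
# Assumed FE0/FE1 encodings; only (1, 1) Precise Mode is shown in the text above.
FP_EXCEPTION_MODES = {
    (0, 0): "Ignore Exceptions Mode",
    (0, 1): "Imprecise Nonrecoverable Mode",
    (1, 0): "Imprecise Recoverable Mode",
    (1, 1): "Precise Mode",
}


def handler_invoked(exception_bit, enable_bit, fe0, fe1):
    """Return True if the enabled exception error handler would be invoked.

    The handler is invoked only when the FPSCR exception bit and its
    enable bit are both 1 and the mode is not Ignore Exceptions Mode.
    """
    mode = FP_EXCEPTION_MODES[(fe0, fe1)]
    enabled_exception = exception_bit == 1 and enable_bit == 1
    return enabled_exception and mode != "Ignore Exceptions Mode"
```

In Ignore Exceptions Mode the function returns False even for an enabled exception, which corresponds to the case where software inspects the FPSCR exception bits itself.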
For VSX Scalar Floating-Point Arithmetic, VSX Scalar DP-SP Conversion, VSX Scalar Convert Floating-Point to
Integer, and VSX Scalar Round to Floating-Point Integer instructions:
4. FPRF is unchanged.
do the following.
do the following.
2. FR, FI, and C are not modified. FPCC is set to reflect unordered.
do the following.
1. VXSNAN is set to 1.
2. VSR[XT] is not modified.
3. FR and FI are set to 0. FPRF is not modified.
do the following.
1. VXSNAN is set to 1.
2. VSR[XT] is not modified.
3. FR, FI, and FPRF are not modified.
do the following.
4. FPRF is unchanged.
For the VSX Scalar round and Convert Double-Precision to Single-Precision format (xscvdpsp) instruction:
1. VXSNAN is set to 1.
2. The single-precision representation of a Quiet NaN is placed into word element 0 of VSR[XT]. The
contents of word elements 1-3 of VSR[XT] are undefined.
For the VSX Vector Single-Precision Arithmetic instructions, VSX Vector Single-Precision Maximum/Minimum
instructions, the VSX Vector round and Convert Double-Precision to Single-Precision format (xvcvdpsp)
instruction, and the VSX Vector Round to Single-Precision Integer instructions:
2. The single-precision representation of a Quiet NaN is placed into its respective word element of
VSR[XT].
For the VSX Scalar Double-Precision Arithmetic instructions, VSX Scalar Double-Precision Maximum/Minimum
instructions, the VSX Scalar Convert Single-Precision to Double-Precision format (xscvspdp) instruction, and
the VSX Scalar Round to Double-Precision Integer instructions:
2. The double-precision representation of a Quiet NaN is placed into doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
do the following.
3. FR and FI are set to 0. FPRF is set to indicate the class of the result (Quiet NaN).
1. VXSNAN is set to 1.
3. FR and FI are set to 0. FPRF is set to indicate the class of the result (Quiet NaN).
do the following.
For VSX Scalar Convert Quad-Precision to Double-Precision [using round to Odd] (xscvqpdp[o]), do the
following.
1. VXSNAN is set to 1.
2. The double-precision Quiet NaN result is placed into doubleword element 0 of VSR[VRT+32] in
double-precision format.
3. FR and FI are set to 0. FPRF is set to indicate the class of the result (Quiet NaN).
For VSX Scalar Convert Quad-Precision to Signed Doubleword (xscvqpsdz), do the following.
For VSX Scalar Convert Quad-Precision to Signed Word (xscvqpswz), do the following.
2. 0x7FFF_FFFF is placed into word element 1 of VSR[VRT+32] if the quad-precision operand in VSR[VRB+32]
is a positive number or +Infinity.
0x8000_0000 is placed into word element 1 of VSR[VRT+32] if the quad-precision operand in VSR[VRB+32]
is a negative number, -Infinity, or NaN.
For VSX Scalar Convert Quad-Precision to Unsigned Doubleword (xscvqpudz), do the following.
For VSX Scalar Convert Quad-Precision to Unsigned Word (xscvqpuwz), do the following.
2. 0xFFFF_FFFF is placed into word element 1 of VSR[VRT+32] if the quad-precision operand in VSR[VRB+32]
is a positive number or +Infinity.
0x0000_0000 is placed into word element 1 of VSR[VRT+32] if the quad-precision operand in VSR[VRB+32]
is a negative number, -Infinity, or NaN.
For VSX Scalar Convert Double-Precision to Half-Precision with round (xscvdphp), do the following.
1. VXSNAN is set to 1.
2. The half-precision representation of a Quiet NaN is placed into the rightmost halfword of doubleword
element 0 of VSR[XT]. The contents of the leftmost 3 halfwords of doubleword element 0 of VSR[XT] are
set to 0. The contents of doubleword element 1 of VSR[XT] are undefined.
3. FR and FI are set to 0. FPRF is set to indicate the class of the result (Quiet NaN).
1. VXSNAN is set to 1.
2. The double-precision representation of a Quiet NaN is placed into doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
3. FR and FI are set to 0. FPRF is set to indicate the class of the result (Quiet NaN).
For the VSX Vector Double-Precision Arithmetic instructions, VSX Vector Double-Precision Maximum/Minimum
instructions, the VSX Vector Convert Single-Precision to Double-Precision format (xvcvspdp) instruction, and
the VSX Vector Round to Double-Precision Integer instructions:
2. The double-precision representation of a Quiet NaN is placed into its respective doubleword element
of VSR[XT].
For the VSX Scalar Convert Double-Precision to Signed Integer Doubleword (xscvdpsxd) instruction:
4. FPRF is undefined.
For the VSX Scalar Convert Double-Precision to Unsigned Integer Doubleword (xscvdpuxd) instruction:
4. FPRF is undefined.
For the VSX Scalar Convert Double-Precision to Signed Integer Word (xscvdpsxw) instruction:
2. 0x7FFF_FFFF is placed into word element 1 of VSR[XT] if the double-precision operand in doubleword
element 0 of VSR[XB] is a positive number or +Infinity.
0x8000_0000 is placed into word element 1 of VSR[XT] if the double-precision operand in doubleword
element 0 of VSR[XB] is a negative number, -Infinity, or NaN.
4. FPRF is undefined.
For the VSX Scalar Convert Double-Precision to Unsigned Integer Word (xscvdpuxw) instruction:
2. 0xFFFF_FFFF is placed into word element 1 of VSR[XT] if the double-precision operand in doubleword
element 0 of VSR[XB] is a positive number or +Infinity.
0x0000_0000 is placed into word element 1 of VSR[XT] if the double-precision operand in doubleword
element 0 of VSR[XB] is a negative number, -Infinity, or NaN.
4. FPRF is undefined.
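The word results listed above for the scalar convert-to-integer instructions can be summarized in a short Python sketch. The function name saturated_word is hypothetical, and the sketch assumes the operand has already been found to cause the Invalid Operation exception (that is, it is a NaN, an infinity, or out of range for the target format).

```python
import math


def saturated_word(operand, signed):
    """Word value delivered when a convert-to-integer-word raises a
    disabled Invalid Operation exception.

    Positive numbers and +Infinity give the largest value; negative
    numbers, -Infinity, and NaN give the smallest value.
    """
    positive = (not math.isnan(operand)) and operand > 0
    if signed:
        return 0x7FFF_FFFF if positive else 0x8000_0000
    return 0xFFFF_FFFF if positive else 0x0000_0000
```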
For the VSX Vector Convert Double-Precision to Signed Integer Doubleword (xvcvdpsxd) instruction:
For the VSX Vector Convert Double-Precision to Unsigned Integer Doubleword (xvcvdpuxd) instruction:
For the VSX Vector Convert Double-Precision to Signed Integer Word (xvcvdpsxw) instruction:
2. 0x7FFF_FFFF is placed into word element i×2 of VSR[XT] if the double-precision operand in doubleword
element i of VSR[XB] is a positive number or +Infinity.
0x8000_0000 is placed into word element i×2 of VSR[XT] if the double-precision operand in doubleword
element i of VSR[XB] is a negative number, -Infinity, or NaN.
For the VSX Vector Convert Double-Precision to Unsigned Integer Word (xvcvdpuxw) instruction:
2. 0xFFFF_FFFF is placed into word element i×2 of VSR[XT] if the double-precision operand in doubleword
element i of VSR[XB] is a positive number or +Infinity.
0x0000_0000 is placed into word element i×2 of VSR[XT] if the double-precision operand in doubleword
element i of VSR[XB] is a negative number, -Infinity, or NaN.
For the VSX Vector Convert Single-Precision to Signed Integer Doubleword (xvcvspsxd) instruction:
For the VSX Vector Convert Single-Precision to Unsigned Integer Doubleword (xvcvspuxd) instruction:
For the VSX Vector Convert Single-Precision to Signed Integer Word (xvcvspsxw) instruction:
2. 0x7FFF_FFFF is placed into word element i of VSR[XT] if the single-precision operand in word element i
of VSR[XB] is a positive number or +Infinity.
0x8000_0000 is placed into word element i of VSR[XT] if the single-precision operand in word element i
of VSR[XB] is a negative number, -Infinity, or NaN.
For the VSX Vector Convert Single-Precision to Unsigned Integer Word (xvcvspuxw) instruction:
2. 0xFFFF_FFFF is placed into word element i of VSR[XT] if the single-precision operand in word element i
of VSR[XB] is a positive number or +Infinity.
0x0000_0000 is placed into word element i of VSR[XT] if the single-precision operand in word element i
of VSR[XB] is a negative number, -Infinity, or NaN.
For VSX Vector Convert Single-Precision to Half-Precision with round (xvcvsphp), do the following.
1. VXSNAN is set to 1.
2. The half-precision representation of a Quiet NaN is placed into the rightmost halfword of its respective
word element of VSR[XT]. The contents of the leftmost halfword of its respective word element of
VSR[XT] are set to 0.
1. VXSNAN is set to 1.
2. The half-precision representation of a Quiet NaN is placed into the rightmost halfword of its respective
word element of VSR[XT]. The contents of the leftmost halfword of its respective word element of
VSR[XT] are set to 0.
A Zero Divide exception also occurs when a VSX Floating-Point Reciprocal Estimate[2] instruction or a VSX
Floating-Point Reciprocal Square Root Estimate[3] instruction is executed with an operand value of zero.
The action to be taken depends on the setting of the Zero Divide Exception Enable bit of the FPSCR.
do the following.
1. ZX is set to 1.
2. Update of VSR[XT] is suppressed.
3. FR and FI are set to 0.
4. FPRF is unchanged.
1. ZX is set to 1.
2. Update of VSR[VRT+32] is suppressed.
3. FR and FI are set to 0. FPRF is not modified.
do the following.
1. ZX is set to 1.
2. Update of VSR[XT] is suppressed for all vector elements.
3. FR and FI are unchanged.
4. FPRF is unchanged.
1. ZX is set to 1.
2. An Infinity, having a sign determined by the XOR of the signs of the source operands, is placed into
doubleword element 0 of VSR[XT] in double-precision format. The contents of doubleword element 1 of
VSR[XT] are undefined.
4. FPRF is set to indicate the class and sign of the result (± Infinity).
1. ZX is set to 1.
2. An Infinity, having a sign determined by the XOR of the signs of the source operands, is placed into
VSR[VRT+32] in quad-precision format.
3. FR and FI are set to 0. FPRF is set to indicate the class and sign of the result (± Infinity).
1. ZX is set to 1.
2. For each vector element causing a Zero Divide exception, an Infinity, having a sign determined by the
XOR of the signs of the source operands, is placed into its respective doubleword element of VSR[XT]
in double-precision format.
1. ZX is set to 1.
2. For each vector element causing a Zero Divide exception, an Infinity, having a sign determined by the
XOR of the signs of the source operands, is placed into its respective word element of VSR[XT] in
single-precision format.
For VSX Scalar Floating-Point Reciprocal Estimate[1] instructions and VSX Scalar Floating-Point Reciprocal
Square Root Estimate[2] instructions, do the following.
1. ZX is set to 1.
2. An Infinity, having the sign of the source operand, is placed into doubleword element 0 of VSR[XT] in
double-precision format. The contents of doubleword element 1 of VSR[XT] are undefined.
4. FPRF is set to indicate the class and sign of the result (± Infinity).
For the VSX Vector Reciprocal Estimate Double-Precision (xvredp) and VSX Vector Reciprocal Square Root
Estimate Double-Precision (xvrsqrtedp) instructions:
1. ZX is set to 1.
2. For each vector element causing a Zero Divide exception, an Infinity, having the sign of the source
operand, is placed into its respective doubleword element of VSR[XT] in double-precision format.
For the VSX Vector Reciprocal Estimate Single-Precision (xvresp) and VSX Vector Reciprocal Square Root
Estimate Single-Precision (xvrsqrtesp) instructions:
1. ZX is set to 1.
2. For each vector element causing a Zero Divide exception, an Infinity, having the sign of the source
operand, is placed into its respective word element of VSR[XT] in single-precision format.
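The sign rules for the Infinity delivered when the Zero Divide exception is disabled can be sketched in Python as follows. Both function names are hypothetical, and host floats stand in for the VSX element values.

```python
import math


def divide_by_zero_result(dividend, divisor):
    """Infinity whose sign is the XOR of the signs of the source operands."""
    dividend_negative = math.copysign(1.0, dividend) < 0
    divisor_negative = math.copysign(1.0, divisor) < 0
    return -math.inf if dividend_negative != divisor_negative else math.inf


def estimate_of_zero_result(operand):
    """Infinity having the sign of the (zero) source operand, as for the
    reciprocal estimate and reciprocal square root estimate cases."""
    return math.copysign(math.inf, operand)
```

copysign is used rather than a comparison with zero because the sign of a zero operand matters here: -0.0 compares equal to 0.0 but carries a negative sign.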
The action to be taken depends on the setting of the Overflow Exception Enable bit of the FPSCR.
For the VSX Scalar round and Convert Double-Precision to Single-Precision format (xscvdpsp) instruction:
1. OX is set to 1.
2. If the unbiased exponent of the normalized intermediate result is less than or equal to 319
(Emax+192), the exponent is adjusted by subtracting 192. Otherwise the result is undefined.
3. The adjusted rounded result is placed into word element 0 of VSR[XT] in single-precision format. The
contents of word elements 1-3 of VSR[XT] are undefined.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
1. OX is set to 1.
3. The adjusted rounded result is placed into doubleword element 0 of VSR[XT] in double-precision
format. The contents of doubleword element 1 of VSR[XT] are undefined.
4. FPRF is set to indicate the class and sign of the result (±Normal Number).
1. OX is set to 1.
3. The adjusted and rounded result is placed into doubleword element 0 of VSR[XT] in double-precision
format. The contents of doubleword element 1 of VSR[XT] are undefined.
4. FPRF is set to indicate the class and sign of the result (±Normal Number).
do the following.
1. OX is set to 1.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
For VSX Scalar Convert Quad-Precision to Double-Precision [using round to Odd] (xscvqpdp[o]), do the
following.
1. OX is set to 1.
2. The exponent is adjusted by subtracting 1536. If the adjusted exponent is greater than +1023 (Emax), the
result is undefined.
3. The adjusted, rounded result is placed into doubleword element 0 of VSR[VRT+32] in double-precision
format.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
For VSX Scalar Convert Double-Precision to Half-Precision with round (xscvdphp), do the following.
1. OX is set to 1.
2. The exponent is adjusted by subtracting 24. If the adjusted exponent is greater than +15 (Emax), the
result is undefined.
3. The adjusted, rounded result is placed into the rightmost halfword of doubleword element 0 of VSR[XT] in
half-precision format.
The contents of the leftmost 3 halfwords of doubleword element 0 of VSR[XT] are set to 0.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
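The exponent adjustment applied when an enabled Overflow exception occurs can be sketched in Python for the two conversions whose adjustment and Emax are both stated above. The table and function names are hypothetical, and None is used here only as a stand-in for "the result is undefined".

```python
# (exponent adjustment, Emax of the target format), as stated in the text.
OVERFLOW_ADJUST = {
    "xscvqpdp": (1536, 1023),  # quad-precision to double-precision
    "xscvdphp": (24, 15),      # double-precision to half-precision
}


def adjust_overflow_exponent(mnemonic, exponent):
    """Subtract the adjustment; return None if the adjusted exponent
    still exceeds Emax (the architecture leaves the result undefined)."""
    adjustment, emax = OVERFLOW_ADJUST[mnemonic]
    adjusted = exponent - adjustment
    return None if adjusted > emax else adjusted
```

For example, an unbiased exponent of 16 overflows half-precision (Emax is +15); subtracting 24 gives -8, which is representable, so the handler receives a normal number with a known scaling.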
For VSX Vector Double-Precision Arithmetic[1] instructions, VSX Vector Single-Precision Arithmetic[2]
instructions, and VSX Vector round and Convert Double-Precision to Single-Precision format instruction
(xvcvdpsp), do the following.
1. OX is set to 1.
For VSX Vector Convert Single-Precision to Half-Precision with round (xvcvsphp), do the following.
1. OX is set to 1.
2. VSR[XT] is not modified.
3. FR, FI, and FPRF are not modified.
2. The result is determined by the rounding mode (RN) and the sign of the intermediate result as follows:
For VSX Scalar round and Convert Double-Precision to Single-Precision format (xscvdpsp):
3. The result is placed into word element 0 of VSR[XT] as a single-precision value. The contents of word
elements 1-3 of VSR[XT] are undefined.
4. FR is undefined.
5. FI is set to 1.
For VSX Scalar Double-Precision Arithmetic[1] instructions and VSX Scalar Single-Precision Arithmetic[2]
instructions, do the following.
3. The result is placed into doubleword element 0 of VSR[XT] as a double-precision value. The contents
of doubleword element 1 of VSR[XT] are undefined.
4. FR is undefined.
5. FI is set to 1.
do the following.
4. FR is undefined. FI is set to 1. FPRF is set to indicate the class and sign of the result.
4. FR is undefined. FI is set to 1. FPRF is set to indicate the class and sign of the result.
2. The result is placed into the rightmost halfword of doubleword element 0 of VSR[XT] as a half-precision
value.
The contents of the leftmost 3 halfwords of doubleword element 0 of VSR[XT] are set to 0.
3. FR is undefined. FI is set to 1. FPRF is set to indicate the class and sign of the result.
3. For each vector element causing an Overflow exception, the result is placed into its respective
doubleword element of VSR[XT] in double-precision format.
For VSX Vector Single-Precision Arithmetic[2] instructions and VSX Vector round and Convert Double-Precision
to Single-Precision format (xvcvdpsp), do the following.
3. For each vector element causing an Overflow exception, the result is placed into its respective word
element of VSR[XT] in single-precision format.
For VSX Vector Convert Single-Precision to Half-Precision with round (xvcvsphp), do the following.
2. For each vector element causing an Overflow exception, the result is placed into the rightmost
halfword of its respective word element of VSR[XT] in half-precision format.
The contents of the leftmost halfword of its respective word element of VSR[XT] are set to 0.
Enabled:
Underflow occurs when the intermediate result is “Tiny”.
Disabled:
Underflow occurs when the intermediate result is “Tiny” and there is “Loss of Accuracy”.
A tiny result is detected before rounding, when a nonzero intermediate result computed as though both the precision
and the exponent range were unbounded would be less in magnitude than the smallest normalized number.
If the intermediate result is tiny and Underflow exception is disabled (UE=0), the intermediate result is denormalized
(see Section 7.3.2.4, “Normalization and Denormalization” on page 379) and rounded (see Section 7.3.2.6,
“Rounding” on page 383) before being placed into the target VSR.
Loss of accuracy is detected when the delivered result value differs from what would have been computed were
both the precision and the exponent range unbounded.
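The conditions above can be expressed as a minimal Python sketch. The function names are hypothetical; exact rationals (fractions.Fraction) stand in for an intermediate result with unbounded precision and exponent range, and the double-precision smallest normalized number is taken from the host's sys.float_info.min.

```python
import sys
from fractions import Fraction

# Smallest normalized double-precision magnitude (2**-1022) as an exact rational.
SMALLEST_NORMAL_DP = Fraction(sys.float_info.min)


def is_tiny(intermediate, smallest_normal=SMALLEST_NORMAL_DP):
    """Tiny: nonzero and smaller in magnitude than the smallest normalized
    number, judged before rounding."""
    return intermediate != 0 and abs(intermediate) < smallest_normal


def loss_of_accuracy(intermediate, delivered):
    """Loss of accuracy: the delivered value differs from the unbounded result."""
    return Fraction(delivered) != intermediate


def underflow_occurs(ue, tiny, inaccurate):
    """Enabled (UE=1): tiny suffices. Disabled (UE=0): tiny and loss of accuracy."""
    return tiny if ue == 1 else (tiny and inaccurate)
```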
The action to be taken depends on the setting of the Underflow Exception Enable bit of the FPSCR.
For VSX Scalar round and Convert Double-Precision to Single-Precision format (xscvdpsp), do the following.
1. UX is set to 1.
2. If the unbiased exponent of the normalized intermediate result is greater than or equal to -318
(Emin-192), the exponent is adjusted by adding 192. Otherwise the result is undefined.
3. The adjusted rounded result is placed into word element 0 of VSR[XT] in single-precision format. The
contents of word elements 1-3 of VSR[XT] are undefined.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
For VSX Scalar Double-Precision Arithmetic[1] instructions and VSX Scalar Double-Precision Reciprocal
Estimate (xsredp), do the following.
1. UX is set to 1.
3. The adjusted rounded result is placed into doubleword element 0 of VSR[XT] in double-precision
format. The contents of doubleword element 1 of VSR[XT] are undefined.
4. FPRF is set to indicate the class and sign of the result (±Normal Number).
do the following.
1. UX is set to 1.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
For VSX Scalar Convert Quad-Precision to Double-Precision [using round to Odd] (xscvqpdp[o]), do the
following.
1. UX is set to 1.
2. The exponent of the normalized intermediate result is adjusted by adding 1536. If the adjusted
exponent is less than -1022, the result is undefined.
3. The adjusted, rounded result is placed into doubleword element 0 of VSR[VRT+32] in double-precision
format.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
For VSX Scalar Single-Precision Arithmetic[1] instructions and VSX Scalar Single-Precision Reciprocal Estimate
(xsresp), do the following.
1. UX is set to 1.
3. The adjusted rounded result is placed into doubleword element 0 of VSR[XT] in double-precision
format. The contents of doubleword element 1 of VSR[XT] are undefined.
4. FPRF is set to indicate the class and sign of the result (±Normal Number).
Programming Note
The FR and FI bits are provided to allow the system floating-point enabled exception error handler, when
invoked because of an Underflow exception, to simulate a “trap disabled” environment. That is, the FR and FI
bits allow the system floating-point enabled exception error handler to unround the result, thus allowing the
result to be denormalized and correctly rounded.
For VSX Scalar Convert Double-Precision to Half-Precision with round (xscvdphp), do the following.
1. UX is set to 1.
2. The exponent of the normalized intermediate result is adjusted by adding 24. If the adjusted exponent
is less than -14, the result is undefined.
3. The adjusted, rounded result is placed into the rightmost halfword of doubleword element 0 of VSR[XT] in
half-precision format.
The contents of the leftmost 3 halfwords of doubleword element 0 of VSR[XT] are set to 0.
4. Unless the result is undefined, FPRF is set to indicate the class and sign of the result (±Normal
Number).
For VSX Vector Floating-Point Arithmetic[1] instructions, VSX Vector Floating-Point Reciprocal Estimate[2]
instructions, and VSX Vector round and Convert Double-Precision to Single-Precision format (xvcvdpsp), do
the following.
1. UX is set to 1.
For VSX Vector Convert Single-Precision to Half-Precision with round (xvcvsphp), do the following.
1. UX is set to 1.
2. VSR[XT] is not modified.
3. FR, FI, and FPRF are not modified.
For VSX Scalar round and Convert Double-Precision to Single-Precision format (xscvdpsp), do the following.
1. UX is set to 1.
2. The result is placed into word element 0 of VSR[XT] in single-precision format. The contents of word
elements 1-3 of VSR[XT] are undefined.
For VSX Scalar Floating-Point Arithmetic[3] instructions and VSX Scalar Reciprocal Estimate[4] instructions, do
the following.
1. UX is set to 1.
2. The result is placed into doubleword element 0 of VSR[XT] in double-precision format. The contents of
doubleword element 1 of VSR[XT] are undefined.
do the following.
1. UX is set to 1.
1. UX is set to 1.
For VSX Scalar Convert Double-Precision to Half-Precision with round (xscvdphp), do the following.
1. UX is set to 1.
2. The result is placed into the rightmost halfword of doubleword element 0 of VSR[XT] as a half-precision
value.
The contents of the leftmost 3 halfwords of doubleword element 0 of VSR[XT] are set to 0.
For VSX Vector Double-Precision Arithmetic[1] instructions and VSX Vector Reciprocal Estimate
Double-Precision (xvredp), do the following.
1. UX is set to 1.
2. For each vector element causing an Underflow exception, the result is placed into its respective
doubleword element of VSR[XT] in double-precision format.
For VSX Vector Single-Precision Arithmetic[2] instructions, VSX Vector Reciprocal Estimate Single-Precision
(xvresp), and VSX Vector round and Convert Double-Precision to Single-Precision format (xvcvdpsp), do the
following.
1. UX is set to 1.
1. VSX Vector Double-Precision Arithmetic instructions:
xvadddp, xvdivdp, xvmuldp, xvsubdp, xvmaddadp, xvmaddmdp, xvmsubadp, xvmsubmdp, xvnmaddadp, xvnmaddmdp, xvnmsubadp,
xvnmsubmdp
2. VSX Vector Single-Precision Arithmetic instructions:
xvaddsp, xvdivsp, xvmulsp, xvsubsp, xvmaddasp, xvmaddmsp, xvmsubasp, xvmsubmsp, xvnmaddasp, xvnmaddmsp, xvnmsubasp,
xvnmsubmsp
2. For each vector element causing an Underflow exception, the result is placed into its respective word
element of VSR[XT] in single-precision format.
For VSX Vector Convert Single-Precision to Half-Precision with round (xvcvsphp), do the following.
1. UX is set to 1.
2. For each vector element causing an Underflow exception, the result is placed into the rightmost
halfword of its respective word element of VSR[XT] in half-precision format.
The contents of the leftmost halfword of its respective word element of VSR[XT] are set to 0.
1. The rounded result differs from the intermediate result assuming both the precision and the exponent range of
the intermediate result to be unbounded. In this case the result is said to be inexact. (If the rounding causes an
enabled Overflow exception or an enabled Underflow exception, an Inexact exception also occurs only if the
significands of the rounded result and the intermediate result differ.)
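The inexact condition can be illustrated with a short Python sketch that compares a rounded double-precision sum against the exact rational sum. The function name is hypothetical. The sketch assumes finite operands and a finite result, and it relies on the host's float addition, which normally rounds to nearest even, rather than on the FPSCR rounding mode. FI and FR here follow the descriptions used in this section: FI is 1 when the result is inexact, and FR indicates whether rounding incremented the result in magnitude.

```python
from fractions import Fraction


def add_with_inexact_flags(a, b):
    """Return (rounded_sum, FI, FR) for a double-precision add.

    Only valid for finite inputs whose sum is finite.
    """
    exact = Fraction(a) + Fraction(b)       # unbounded precision and range
    rounded = a + b                         # host rounding (assumed nearest-even)
    fi = int(Fraction(rounded) != exact)    # inexact: rounded differs from exact
    fr = int(abs(Fraction(rounded)) > abs(exact))  # magnitude was incremented
    return rounded, fi, fr
```

Adding 1.0 and 2.0 is exact, so both flags are 0, whereas adding the doubles nearest 0.1 and 0.2 needs more significand bits than double-precision provides and sets FI.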
The action to be taken depends on the setting of the Inexact Exception Enable bit of the FPSCR.
Programming Note
In some implementations, enabling Inexact exceptions can degrade performance more than does enabling other
types of floating-point exception.
When Inexact exception is enabled (XE=1) and an Inexact exception occurs, the following actions are taken:
For the VSX Scalar round and Convert Double-Precision to Single-Precision format (xscvdpsp) instruction:
1. XX is set to 1.
2. The result is placed into word element 0 of VSR[XT] in single-precision format. The contents of word
elements 1-3 of VSR[XT] are undefined.
For VSX Scalar Floating-Point Arithmetic[1] instructions, VSX Scalar Round to Double-Precision Integer Exact
using Current rounding mode (xsrdpic), and VSX Scalar Integer to Floating-Point Format Conversion[2]
instructions, do the following.
1. XX is set to 1.
2. The result is placed into doubleword element 0 of VSR[XT] in double-precision format. The contents of
doubleword element 1 of VSR[XT] are undefined.
For VSX Scalar Floating-Point to Integer Word Format Conversion[3] instructions, do the following.
1. XX is set to 1.
2. The result is placed into word element 1 of VSR[XT]. The contents of word elements 0, 2, and 3 of
VSR[XT] are undefined.
do the following.
1. XX is set to 1.
3. FR is set to indicate if the rounded result was incremented. FI is set to 1. FPRF is set to indicate the
class and sign of the result.
1. XX is set to 1.
3. FR is set to indicate if the rounded result was incremented. FI is set to 1. FPRF is set to indicate the
class and sign of the result.
For VSX Scalar truncate & Convert Quad-Precision to Signed Doubleword (xscvqpsdz), do the following.
1. XX is set to 1.
2. The result is placed into doubleword element 0 of VSR[XT] in signed integer format.
For VSX Scalar truncate & Convert Quad-Precision to Signed Word (xscvqpswz), do the following.
1. XX is set to 1.
2. The result is placed into word element 1 of VSR[XT] in signed integer format.
For VSX Scalar truncate & Convert Quad-Precision to Unsigned Doubleword (xscvqpudz), do the following.
1. XX is set to 1.
2. The result is placed into doubleword element 0 of VSR[XT] in unsigned integer format.
For VSX Scalar truncate & Convert Quad-Precision to Unsigned Word (xscvqpuwz), do the following.
1. XX is set to 1.
2. The result is placed into word element 1 of VSR[XT] in unsigned integer format.
For VSX Scalar Convert Double-Precision to Half-Precision with round (xscvdphp), do the following.
1. XX is set to 1.
2. The result is placed into the rightmost halfword of doubleword element 0 of VSR[XT] as a half-precision
value.
The contents of the leftmost 3 halfwords of doubleword element 0 of VSR[XT] are set to 0.
3. FR is set to indicate if the rounded result was incremented. FI is set to 1. FPRF is set to indicate the
class and sign of the result.
For VSX Vector Floating-Point Arithmetic[1] instructions, VSX Vector Floating-Point Reciprocal Estimate[2]
instructions, VSX Vector round and Convert Double-Precision to Single-Precision format (xvcvdpsp), VSX
Vector Double-Precision to Integer Format Conversion[3] instructions, and VSX Vector Integer to Floating-Point
Format Conversion[4] instructions, do the following.
1. XX is set to 1.
For VSX Vector Convert Single-Precision to Half-Precision with round (xvcvsphp), do the following.
1. XX is set to 1.
2. VSR[XT] is not modified.
3. FR, FI, and FPRF are not modified.
For VSX Scalar round and Convert Double-Precision to Single-Precision format (xscvdpsp), do the following.
1. XX is set to 1.
2. The result is placed into word element 0 of VSR[XT] as a single-precision value. The contents of word
elements 1-3 of VSR[XT] are undefined.
For VSX Scalar Double-Precision Arithmetic[1] instructions, VSX Scalar Single-Precision Arithmetic[2]
instructions, VSX Scalar Round to Single-Precision (xsrsp), the VSX Scalar Round to Double-Precision Integer
Exact using Current rounding mode (xsrdpic), and VSX Scalar Integer to Double-Precision Format
Conversion[3] instructions, do the following.
1. XX is set to 1.
2. The result is placed into doubleword element 0 of VSR[XT] as a double-precision value. The contents
of doubleword element 1 of VSR[XT] are undefined.
For VSX Scalar Convert Double-Precision To Integer Word format with Saturate instructions,
xscvdpsxws, xscvdpuxws
do the following.
1. XX is set to 1.
2. The result is placed into word element 1 of VSR[XT]. The contents of word elements 0, 2, and 3 of
VSR[XT] are undefined.
For VSX Scalar Convert Double-Precision to Half-Precision with round (xscvdphp), do the following.
1. XX is set to 1.
2. The result is placed into the rightmost halfword of doubleword element 0 of VSR[XT] as a half-precision
value.
The contents of the leftmost 3 halfwords of doubleword element 0 of VSR[XT] are set to 0.
3. FR is set to indicate if the rounded result was incremented. FI is set to 1. FPRF is set to indicate the
class and sign of the result.
do the following.
1. XX is set to 1.
2. For each vector element causing an Inexact exception, the result is placed into its respective
doubleword element of VSR[XT] in double-precision format.
do the following.
1. XX is set to 1.
3. FR is set to indicate if the rounded result was incremented. FI is set to 1. FPRF is set to indicate the
class and sign of the result.
For VSX Scalar round & Convert Quad-Precision to Double-Precision (xscvqpdp), do the following.
1. XX is set to 1.
3. FR is set to indicate if the rounded result was incremented. FI is set to 1. FPRF is set to indicate the
class and sign of the result.
do the following.
1. XX is set to 1.
2. The result is placed into doubleword element 0 of VSR[VRT+32] in signed integer format.
do the following.
1. XX is set to 1.
2. The result is placed into doubleword element 0 of VSR[VRT+32] in unsigned integer format.
For VSX Vector Convert Single-Precision to Half-Precision with round (xvcvsphp), do the following.
1. XX is set to 1.
2. For each vector element causing an Inexact exception, the result is placed into the rightmost
halfword of its respective word element of VSR[XT] in half-precision format.
The contents of the leftmost halfword of its respective word element of VSR[XT] are set to 0.
1. XX is set to 1.
2. For each vector element causing an Inexact exception, the result is placed into its respective word
element of VSR[XT] in single-precision format.
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x0000: 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0x0010:

         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x0000: 03 02 01 00 07 06 05 04 0B 0A 09 08 0F 0E 0D 0C
0x0010:

Figure 125. Vector-Scalar Register contents for aligned quadword Load or Store VSX Vector

Little-Endian storage image of array B
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x0000: 67 45 23 01 33 22 11 00 77 66 55 44 BB AA 99 88
0x0010: FF EE DD CC

# Assumptions
GPR[Ra] = address of B
GPR[Rb] = 4 (index to B[1])

lxvw4x Xt,Ra,Rb

     0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
Xt: 00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF
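The unaligned load shown in the figure can be reproduced with a small Python model. This is a sketch under stated assumptions, not the architected RTL: the function name lxvw4x_model is hypothetical, storage is modelled as a bytes object starting at the address of B, and the register image is returned as 16 bytes in left-to-right (byte 0 first) order.

```python
import struct

# Storage images of array B, copied from the figures in this section.
LE_IMAGE = bytes.fromhex("67 45 23 01 33 22 11 00 77 66 55 44 BB AA 99 88 FF EE DD CC")
BE_IMAGE = bytes.fromhex("01 23 45 67 00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF")


def lxvw4x_model(storage, base, index, little_endian):
    """Load four word elements from storage at base+index into a VSR image.

    Each word is read using the storage byte ordering, then placed into
    word elements 0-3 of the register image.
    """
    ea = base + index
    fmt = "<4I" if little_endian else ">4I"
    words = struct.unpack_from(fmt, storage, ea)
    return struct.pack(">4I", *words)


# GPR[Ra] = address of B (offset 0 here), GPR[Rb] = 4 (index to B[1]).
xt = lxvw4x_model(LE_IMAGE, 0, 4, little_endian=True)
print(xt.hex(" ").upper())
```

Under this model the same register contents result from either storage image, because only the byte order within each word differs between them.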
Storing a VSR to elements 1 through 4 of B (see Figure 126) into VR[VT] involves an unaligned quadword storage access.
VSX supports word-aligned vector and scalar storage accesses using Big-Endian byte ordering.
Big-Endian storage image of array B
0x0000: 01 23 45 67 00 11 22 33 44 55 66 77 88 99 AA BB
0x0010: CC DD EE FF
0 1 2 3 4 5 6 7 8 9 A B C D E F
Xs: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
GPR[Ra] = address of B
GPR[Rb] = 4 (index to B[1])
0x0010: FC FD FE FF
0 1 2 3 4 5 6 7 8 9 A B C D E F
Figure 129. Process to store unaligned quadword to Big-Endian storage using Store VSX Vector Word*4 Indexed
Storing a VSR to elements 1 through 4 of B (see Figure 126) into VR[VT] involves an unaligned quadword storage access.
VSX supports word-aligned vector and scalar storage accesses using Little-Endian byte ordering.
Little-Endian storage image of array B
0x0000: 67 45 23 01 33 22 11 00 77 66 55 44 BB AA 99 88
0x0010: FF EE DD CC
0 1 2 3 4 5 6 7 8 9 A B C D E F
Xs: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
GPR[Ra] = address of B
GPR[Rb] = 4 (index to B[1])
0x0010: FF FE FD FC
0 1 2 3 4 5 6 7 8 9 A B C D E F
Figure 130. Process to store unaligned quadword to Little-Endian storage using Store VSX Vector Word*4 Indexed
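The two store examples can be reproduced with the following Python sketch. It is a simplified model under stated assumptions: store_vsx_word4_indexed is a hypothetical helper, the source register is assumed to hold bytes F0 through FF in order, and only the byte movement of a word-wise store is modeled.

```python
def store_vsx_word4_indexed(storage: bytearray, ea: int, reg: bytes, byteorder: str) -> None:
    """Hypothetical model of stxvw4x: store the four words of reg at address ea."""
    for i in range(4):
        word = int.from_bytes(reg[4 * i:4 * i + 4], "big")  # register word i
        storage[ea + 4 * i:ea + 4 * i + 4] = word.to_bytes(4, byteorder)

XS = bytes(range(0xF0, 0x100))  # assumed register contents F0 F1 ... FF

# Storage images of array B before the store, copied from the examples above.
be_image = bytearray.fromhex("01234567" "00112233" "44556677" "8899AABB" "CCDDEEFF")
le_image = bytearray.fromhex("67452301" "33221100" "77665544" "BBAA9988" "FFEEDDCC")

# GPR[Ra] = address of B (offset 0 here), GPR[Rb] = 4 (index to B[1]).
store_vsx_word4_indexed(be_image, 4, XS, "big")
store_vsx_word4_indexed(le_image, 4, XS, "little")

print(be_image[0x10:0x14].hex(" ").upper())  # FC FD FE FF
print(le_image[0x10:0x14].hex(" ").upper())  # FF FE FD FC
```

In both cases element 0 of B (the first four bytes of the image) is left unchanged, and the last word of the register lands at address 0x0010.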
x.dword[y]
Return the contents of doubleword element y of x.
x<=y
x and y are integer values.
~x
Return the one’s complement of x.
!x
Return 1 if the contents of x are equal to 0,
otherwise return 0.
x || y
Return the value of x concatenated with the value
of y. For example, 0b010 || 0b111 is the same as
0b010111.
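As a small illustration of the concatenation operator, the sketch below models bit strings as (value, width) pairs in Python. The helper name concat and the explicit width arguments are assumptions made for this example, since a Python integer does not carry a bit length of its own.

```python
def concat(x: int, x_bits: int, y: int, y_bits: int):
    """Return (value, width) of the bit string x || y."""
    # x occupies the high-order bits, y the low-order bits.
    return (x << y_bits) | y, x_bits + y_bits

value, width = concat(0b010, 3, 0b111, 3)
print(format(value, f"0{width}b"))  # 010111
```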
x^y
Return the value of x exclusive ORed with the
value of y.
x?y:z
If the value of x is true, return the value of y,
otherwise return the value z.
x+y
x and y are integer values.
AddDP(x,y)
x and y are double-precision floating-point values.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x and y are infinities of opposite sign, return the standard QNaN.
Otherwise, return the normalized sum of x and y, having unbounded range and precision.
AddSP(x,y)
x and y are single-precision floating-point values.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x and y are infinities of opposite sign, return the standard QNaN.
Otherwise, return the normalized sum of x and y, having unbounded range and precision.
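The operand priority in the addition rules above can be sketched in Python on 64-bit double-precision bit patterns. This is a rough model, not the architected function: add_dp_sketch and its helpers are hypothetical names, the standard QNaN is assumed to be encoded as 0x7FF8_0000_0000_0000, an SNaN is quieted by setting the most-significant fraction bit, and the final sum uses Python's double-precision addition rather than unbounded range and precision.

```python
import struct

EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QNAN_BIT = 1 << 51                      # most-significant fraction bit
STANDARD_QNAN = 0x7FF8_0000_0000_0000   # assumed encoding of the standard QNaN

def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0

def is_qnan(bits):
    return is_nan(bits) and bool(bits & QNAN_BIT)

def is_snan(bits):
    return is_nan(bits) and not (bits & QNAN_BIT)

def is_inf(bits):
    return (bits & ~(1 << 63)) == EXP_MASK

def to_float(bits):
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]

def to_bits(value):
    return int.from_bytes(struct.pack(">d", value), "big")

def add_dp_sketch(x, y):
    """Apply the NaN and Infinity rules in the order listed above."""
    if is_qnan(x):
        return x
    if is_snan(x):
        return x | QNAN_BIT             # x represented as a QNaN
    if is_qnan(y):
        return y
    if is_snan(y):
        return y | QNAN_BIT             # y represented as a QNaN
    if is_inf(x) and is_inf(y) and (x ^ y) >> 63:
        return STANDARD_QNAN            # infinities of opposite sign
    # Assumption: rounded double-precision sum stands in for the exact sum.
    return to_bits(to_float(x) + to_float(y))
```

For example, adding an SNaN in x to a QNaN in y returns the quieted x, because x is examined before y.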
bfp_ABSOLUTE(x)
x is a binary floating-point value represented in the working floating-point format.
bfp_ADD(x, y)
x is a binary floating-point value represented in the working floating-point format.
y is a binary floating-point value represented in the working floating-point format.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x and y are infinities of opposite sign, return the standard QNaN.
Otherwise, return the normalized sum of x and y, having unbounded range and precision.
bfp_COMPARE_EQ(x, y)
x is a binary floating-point value represented in the working floating-point format.
y is a binary floating-point value represented in the working floating-point format.
bfp_COMPARE_GT(x, y)
x is a binary floating-point value represented in the working floating-point format.
y is a binary floating-point value represented in the working floating-point format.
bfp_COMPARE_LT(x, y)
x is a binary floating-point value represented in the working floating-point format.
y is a binary floating-point value represented in the working floating-point format.
bfp_CONVERT_FROM_BFP16(x)
x is a floating-point value represented in half-precision format.
result.significand is shifted left until the contents of bit 0 of result.significand are equal to 1.
result.exponent is decremented by the number of bits result.significand was shifted.
Return result.
bfp_CONVERT_FROM_BFP32(x)
x is a floating-point value represented in single-precision format.
result.significand is shifted left until the contents of bit 0 of result.significand are equal to 1.
result.exponent is decremented by the number of bits result.significand was shifted.
Return result.
bfp_CONVERT_FROM_BFP64(x)
x is a binary floating-point value represented in double-precision format.
result.sign is initialized to 0.
result.exponent is initialized to 0.
result.significand is initialized to 0.
result.class.SNaN is initialized to 0.
result.class.QNaN is initialized to 0.
result.class.Infinity is initialized to 0.
result.class.Zero is initialized to 0.
result.class.Denormal is initialized to 0.
result.class.Normal is initialized to 0.
bfp_CONVERT_FROM_BFP128(x)
x is a binary floating-point value represented in quad-precision format.
result.sign is initialized to 0.
result.exponent is initialized to 0.
result.significand is initialized to 0.
result.class.SNaN is initialized to 0.
result.class.QNaN is initialized to 0.
result.class.Infinity is initialized to 0.
result.class.Zero is initialized to 0.
result.class.Denormal is initialized to 0.
result.class.Normal is initialized to 0.
bfp_CONVERT_FROM_SI64(x)
x is an integer value represented in signed doubleword integer format.
result.sign is initialized to 0.
result.exponent is initialized to 0.
result.significand is initialized to 0.
result.class.SNaN is initialized to 0.
result.class.QNaN is initialized to 0.
result.class.Infinity is initialized to 0.
result.class.Zero is initialized to 0.
result.class.Denormal is initialized to 0.
result.class.Normal is initialized to 0.
If x is equal to 0x0000_0000_0000_0000,
result.class.Zero is set to 1.
bfp_CONVERT_FROM_UI64(x)
x is an integer value represented in unsigned doubleword integer format.
result.sign is initialized to 0.
result.exponent is initialized to 0.
result.significand is initialized to 0.
result.class.SNaN is initialized to 0.
result.class.QNaN is initialized to 0.
result.class.Infinity is initialized to 0.
result.class.Zero is initialized to 0.
result.class.Denormal is initialized to 0.
result.class.Normal is initialized to 0.
bfp_CONVERT_TO_BFP16(x)
x is a floating-point value represented in the working format.
Return result.
bfp_CONVERT_TO_BFP32(x)
x is a floating-point value represented in the working format.
Return result.
bfp_CONVERT_TO_BFP64(x)
x is a floating-point value represented in the working format.
Return result.
bfp_CONVERT_TO_BFP128(x)
x is a quad-precision floating-point value that is represented in the working floating-point format.
If x is a QNaN,
the contents of bit 0 of result are set to the value of x.sign,
the contents of bits 1:15 of result are set to the value 0b111_1111_1111_1111, and
the contents of bits 16:127 of result are set to the value of bits 1:112 of x.significand.
Otherwise, if x is a Zero,
the contents of bit 0 of result are set to the value of x.sign,
the contents of bits 1:15 of result are set to the value 0b000_0000_0000_0000, and
the contents of bits 16:127 of result are set to the value 0x0000_0000_0000_0000_0000_0000_0000.
Otherwise, if x is an Infinity,
the contents of bit 0 of result are set to the value of x.sign,
the contents of bits 1:15 of result are set to the value 0b111_1111_1111_1111, and
the contents of bits 16:127 of result are set to the value 0x0000_0000_0000_0000_0000_0000_0000.
bfp_CONVERT_TO_SI64(x)
x is an integer value represented in the working floating-point format.
bfp_CONVERT_TO_UI64(x)
x is an integer value represented in the working floating-point format.
bfp_DENORM(x, y)
x is an integer value specifying the target format’s Emin value.
y is a binary floating-point value that is represented in the working floating-point format.
If y.exponent is less than Emin, let sh_cnt be the value Emin - y.exponent.
Otherwise, let sh_cnt be the value 0.
bfp_DIVIDE(x, y)
x is a binary floating-point value that is represented in the working floating-point format.
y is a binary floating-point value that is represented in the working floating-point format.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x and y are infinities, return the standard QNaN.
Otherwise, if x and y are zeros, return the standard QNaN.
Otherwise, if y is a zero, return infinity, having the sign of the exclusive-OR of the signs of x and y.
Otherwise, return the normalized quotient of x ÷ y, having unbounded range and precision.
bfp_INFINITY()
Return a positive floating-point infinity value, represented in the working format.
bfp_INITIALIZE(result)
result.class.Infinity ← 1
return(result)
bfp_INITIALIZE(x)
Return x.
bfp_MULTIPLY(x, y)
x is a binary floating-point value represented in the working floating-point format.
y is a binary floating-point value represented in the working floating-point format.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is an infinity and y is a zero, return the standard QNaN.
Otherwise, if x is a zero and y is an infinity, return the standard QNaN.
Otherwise, return the normalized product of x × y, having unbounded range and precision.
bfp_MULTIPLY_ADD(x, y, z)
x is a binary floating-point value represented in the working floating-point format.
y is a binary floating-point value represented in the working floating-point format.
z is a binary floating-point value represented in the working floating-point format.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if z is a QNaN, return z.
Otherwise, if z is an SNaN, return z represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is an infinity and y is a zero, return the standard QNaN.
Otherwise, if x is a zero and y is an infinity, return the standard QNaN.
Otherwise, if z and the product of x × y are Infinity values having opposite signs, return the standard QNaN.
Otherwise, return the sum of z and the normalized product of x × y, having unbounded range and precision.
bfp_NEGATE(x)
x is a binary floating-point value that is represented in the working floating-point format.
bfp_NMAX_BFP16()
Return the largest, positive, normalized half-precision floating-point value, (2 - 2^-10) × 2^+15, represented in the
working format.
bfp_INITIALIZE(result)
result.exponent ← +15
result.significand.bit[0:10] ← 0b111_1111_1111
result.class.Normal ← 1
return(result)
bfp_NMAX_BFP64
Return the largest finite double-precision value (i.e., 2^1024 - 2^(1024-53)) in the working floating-point format.
return( bfp_CONVERT_FROM_BFP64(0x7FEF_FFFF_FFFF_FFFF) )
bfp_NMAX_BFP80
Return the largest finite double-extended-precision value (i.e., 2^16384 - 2^(16384-65)) in the working floating-point
format.
return( bfp_CONVERT_FROM_BFP80(0x7FFE_FFFF_FFFF_FFFF_FFFF) )
bfp_NMAX_BFP128
Return the largest finite quad-precision value (i.e., 2^16384 - 2^(16384-113)) in the working floating-point format.
return( bfp_CONVERT_FROM_BFP128(0x7FFE_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF) )
bfp_NMIN_BFP16()
Return the smallest, positive, normalized half-precision floating-point value, 2^-14, represented in the working
format.
bfp_INITIALIZE(result)
result.exponent ← -14
result.significand.bit[0:10] ← 0b100_0000_0000
result.class.Normal ← 1
return(result)
bfp_NMIN_BFP64
Return the smallest, positive, normalized double-precision value, 2^-1022, represented in the binary floating-point
working format.
return( bfp_CONVERT_FROM_BFP64(0x0010_0000_0000_0000) )
bfp_NMIN_BFP80
Return the smallest, positive, normalized double-extended-precision value, 2^-16382, represented in the binary
floating-point working format.
return( bfp_CONVERT_FROM_BFP80(0x0001_0000_0000_0000_0000) )
bfp_NMIN_BFP128
Return the smallest, positive, normalized quad-precision value, 2^-16382, represented in the binary floating-point
working format.
return( bfp_CONVERT_FROM_BFP128(0x0001_0000_0000_0000_0000_0000_0000_0000) )
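The half-precision and double-precision limits given above can be cross-checked with exact rational arithmetic. The sketch below is illustrative: bfp64 and bfp16 are hypothetical helper names, and the half-precision encodings 0x7BFF and 0x0400 are the IEEE 754 binary16 patterns for the stated exponent and significand fields rather than values quoted from the text.

```python
import struct
from fractions import Fraction

def bfp64(bits: int) -> Fraction:
    """Exact value of a finite double-precision bit pattern."""
    return Fraction(struct.unpack(">d", bits.to_bytes(8, "big"))[0])

def bfp16(bits: int) -> Fraction:
    """Exact value of a finite half-precision bit pattern."""
    return Fraction(struct.unpack(">e", bits.to_bytes(2, "big"))[0])

nmax64 = bfp64(0x7FEF_FFFF_FFFF_FFFF)
nmin64 = bfp64(0x0010_0000_0000_0000)
nmax16 = bfp16(0x7BFF)   # exponent +15, significand 0b111_1111_1111
nmin16 = bfp16(0x0400)   # exponent -14, significand 0b100_0000_0000

print(nmax64 == 2**1024 - 2**(1024 - 53))           # True
print(nmin64 == Fraction(1, 2**1022))               # True
print(nmax16 == (2 - Fraction(1, 2**10)) * 2**15)   # True
print(nmin16 == Fraction(1, 2**14))                 # True
```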
bfp_QUIET(x)
x is a Signalling NaN.
Return x converted to a Quiet NaN with x.class.QNaN set to 1 and x.class.SNaN set to 0.
bfp_ROUND_CEIL(p, x)
x is a binary floating-point value that is represented in the working floating-point format and has unbounded
exponent range and significand precision. x must be rounded as presented, without prenormalization.
p is an integer value specifying the precision (i.e., number of bits) the significand is rounded to.
Return the smallest floating-point number having unbounded exponent range and a significand with a width of p
bits that is greater or equal in value to x.
bfp_ROUND_FLOOR(p, x)
x is a binary floating-point value that is represented in the working floating-point format and has unbounded
exponent range and significand precision. The value must be rounded as presented, without prenormalization.
p is an integer value specifying the precision (i.e., number of bits) the significand is rounded to.
Return the largest floating-point number having unbounded exponent range and a significand with a width of p
bits that is lesser or equal in value to x.
bfp_ROUND_TO_BFP16(x,y)
y is a normalized floating-point value represented in the working format, having unbounded exponent range and
significand precision.
If y is a QNaN, Infinity, or Zero, return y. Otherwise, if y is an SNaN, set vxsnan_flag to 1 and return the
corresponding QNaN representation of y. Otherwise, return the value y rounded to half-precision format’s
exponent range and significand precision using the rounding mode specified by x.
if bfp_COMPARE_LT(y,bfp_NMIN_BFP16()) then do
   if FPSCR.UE=0 then do
      do while y.exponent < -14 // denormalize y
         y.significand ← y.significand >> 1
         y.exponent ← y.exponent + 1
      end
      if x=0b00 then result ← bfp_ROUND_TO_BFP16_NEAR_EVEN(y)
      if x=0b01 then result ← bfp_ROUND_TO_BFP16_TRUNC(y)
      if x=0b10 then result ← bfp_ROUND_TO_BFP16_CEIL(y)
      if x=0b11 then result ← bfp_ROUND_TO_BFP16_FLOOR(y)
      do while result.significand.bit[0] = 0 // normalize result
         result.significand ← result.significand << 1
         result.exponent ← result.exponent - 1
      end
      ux_flag ← xx_flag
      return(result)
   end
   else do
      y.exponent ← y.exponent + 24
      ux_flag ← 1
   end
end
return(result)
bfp_ROUND_TO_BFP16_CEIL(x)
x is a normalized floating-point value represented in the working format, having unbounded exponent range and
significand precision.
Return the smallest floating-point number having unbounded exponent range but half-precision significand
precision that is greater or equal in value to x.
bfp_ROUND_TO_BFP16_FLOOR(x)
x is a normalized floating-point value represented in the working format, having unbounded exponent range and
significand precision.
Return the largest floating-point number having unbounded exponent range but half-precision significand
precision that is lesser or equal in value to x.
bfp_ROUND_TO_BFP16_NEAR_EVEN(x)
x is a normalized floating-point value represented in the working format, having unbounded exponent range and
significand precision.
Return the floating-point number having unbounded exponent range but half-precision significand precision that
is nearest in value to x (in case of a tie, the floating-point number having unbounded exponent range but
half-precision significand precision with the least-significant bit equal to 0 is used).
bfp_ROUND_TO_BFP16_TRUNC(x)
x is a normalized floating-point value represented in the working format, having unbounded exponent range and
significand precision.
Return the largest floating-point number having unbounded exponent range but half-precision significand
precision that is lesser or equal in value to x if x>0, or the smallest floating-point number having unbounded
exponent range but half-precision significand precision that is greater or equal in value to x if x<0.
bfp_ROUND_TO_INTEGER(rmode, x)
x is a binary floating-point value that is represented in the working floating-point format and has unbounded
exponent range and significand precision.
If x is a QNaN, return x.
bfp_ROUND_ODD(p, x)
x is a binary floating-point value that is represented in the working floating-point format and has unbounded
exponent range and significand precision. x must be rounded as presented, without prenormalization.
p is an integer value specifying the precision (i.e., number of bits) the significand is rounded to.
Return x with bit p-1 of the significand set to 1 if any of the bits to the right of bit p-1 of the significand of x are
equal to 1, and all bits to the right of bit p-1 of the significand of the value returned are set to 0. Otherwise return
x with all bits to the right of bit p-1 of the significand set to 0.
bfp_ROUND_NEAR_EVEN(p, x)
x is a binary floating-point value that is represented in the working floating-point format and has unbounded
exponent range and significand precision. x must be rounded as presented, without prenormalization.
p is an integer value specifying the precision (i.e., number of bits) the significand is rounded to.
Return the floating-point number having unbounded exponent range and a significand with a width of p bits that
is nearest in value to x (in case of a tie, the floating-point number having unbounded exponent range and a
p-bit significand with the least-significant bit equal to 0 is used).
bfp_ROUND_TRUNC(p, x)
x is a binary floating-point value that is represented in the working floating-point format and has unbounded
exponent range and significand precision. x must be rounded as presented, without prenormalization.
p is an integer value specifying the precision (i.e., number of bits) the significand is rounded to.
Return the largest floating-point number having unbounded exponent range and a significand with a width of p
bits that is lesser or equal in value to x if x>0, or the smallest floating-point number having unbounded exponent
range and a significand with a width of p bits that is greater or equal in value to x if x<0.
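The five p-bit rounding functions above (ceiling, floor, odd, nearest-even, truncate) can be compared side by side with exact rationals. The following Python sketch is a simplified model under stated assumptions: round_to_p_bits and its mode strings are hypothetical names, x is treated as a normalized value (the architected functions round the operand as presented, without prenormalization), and the exponent range is unbounded because Fraction has no exponent limit.

```python
import math
from fractions import Fraction

def round_to_p_bits(p: int, x: Fraction, mode: str) -> Fraction:
    """Round x to a p-bit significand with unbounded exponent range."""
    if x == 0:
        return x
    ax = abs(x)
    # e = floor(log2(|x|)), found from the bit lengths and one correction.
    e = ax.numerator.bit_length() - ax.denominator.bit_length()
    if Fraction(2) ** e > ax:
        e -= 1
    quantum = Fraction(2) ** (e - p + 1)   # weight of significand bit p-1
    scaled = x / quantum                   # |scaled| lies in [2^(p-1), 2^p)
    if mode == "ceil":
        n = math.ceil(scaled)
    elif mode == "floor":
        n = math.floor(scaled)
    elif mode == "trunc":
        n = math.trunc(scaled)
    elif mode == "near_even":
        n = round(scaled)                  # Fraction rounds ties to even
    elif mode == "odd":
        n = math.trunc(scaled)
        if n != scaled:                    # discarded bits were nonzero
            n = n | 1 if n >= 0 else -((-n) | 1)
    else:
        raise ValueError(mode)
    return n * quantum

x = Fraction(9, 8)                         # 1.001 in binary
for mode in ("ceil", "floor", "trunc", "near_even", "odd"):
    print(mode, round_to_p_bits(3, x, mode))
```

With p = 3, the value 9/8 lies exactly halfway between 1 and 5/4, so nearest-even selects 1 (significand 0b100) while round-to-odd selects 5/4 (significand 0b101).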
bfp_ROUND_TO_BFP128(ro, rmode, x)
x is a normalized binary floating-point value that is represented in the working floating-point format and has
unbounded exponent range and significand precision.
ro is a 1-bit unsigned integer and rmode is a 2-bit unsigned integer, together specifying one of five rounding
modes to be used in rounding x.
Return the value x rounded to quad-precision under control of the specified rounding mode.
bfp_ROUND_TO_BFP80(rmode,x)
x is a normalized binary floating-point value that is represented in the working floating-point format and has
unbounded exponent range and significand precision.
rmode is a 2-bit unsigned integer specifying one of four rounding modes to be used in rounding x.
Return the value x rounded to double-extended-precision under control of the specified rounding mode.
bfp_ROUND_TO_BFP64(ro, rmode, x)
x is a normalized binary floating-point value that is represented in the working floating-point format and has
unbounded exponent range and significand precision.
ro is a 1-bit unsigned integer and rmode is a 2-bit unsigned integer, together specifying one of five rounding
modes to be used in rounding x.
Return the value x rounded to double-precision under control of the specified rounding mode.
bfp_SQUARE_ROOT(x)
x is a binary floating-point value that is represented in the working floating-point format and has unbounded
exponent range and significand precision.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if x is -Zero, return -Zero.
Otherwise, if x is negative, return the standard QNaN.
Otherwise, return the normalized square root of x, having unbounded range and precision.
ClassDP(x,y)
Return a 5-bit characterization of the double-precision floating-point number x.
ClassSP(x,y)
Return a 5-bit characterization of the single-precision floating-point number x.
CompareEQDP(x,y)
x and y are double-precision floating-point values.
If x or y is a NaN, return 0.
Otherwise, if x is equal to y, return 1.
Otherwise, return 0.
CompareEQSP(x,y)
x and y are single-precision floating-point values.
If x or y is a NaN, return 0.
Otherwise, if x is equal to y, return 1.
Otherwise, return 0.
CompareGTDP(x,y)
x and y are double-precision floating-point values.
If x or y is a NaN, return 0.
Otherwise, if x is greater than y, return 1.
Otherwise, return 0.
CompareGTSP(x,y)
x and y are single-precision floating-point values.
If x or y is a NaN, return 0.
Otherwise, if x is greater than y, return 1.
Otherwise, return 0.
CompareLTDP(x,y)
x and y are double-precision floating-point values.
If x or y is a NaN, return 0.
Otherwise, if x is less than y, return 1.
Otherwise, return 0.
CompareLTSP(x,y)
x and y are single-precision floating-point values.
If x or y is a NaN, return 0.
Otherwise, if x is less than y, return 1.
Otherwise, return 0.
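The comparison functions above share one shape: any NaN operand forces a result of 0. A minimal Python sketch follows, with hypothetical function names and with Python floats (double-precision) standing in for both the DP and SP variants.

```python
import math

def compare_eq(x: float, y: float) -> int:
    if math.isnan(x) or math.isnan(y):
        return 0                      # unordered operands never compare equal
    return 1 if x == y else 0

def compare_gt(x: float, y: float) -> int:
    if math.isnan(x) or math.isnan(y):
        return 0
    return 1 if x > y else 0

def compare_lt(x: float, y: float) -> int:
    if math.isnan(x) or math.isnan(y):
        return 0
    return 1 if x < y else 0

nan = float("nan")
print(compare_eq(nan, nan), compare_gt(2.0, 1.0), compare_lt(1.0, nan))  # 0 1 0
```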
ConvertDPtoSD(x)
x is a floating-point value in double-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x8000_0000_0000_0000,
Otherwise,
xx_flag is set to 1 if rnd is inexact.
return rnd in 64-bit signed integer format.
ConvertDPtoSP(x)
x is a floating-point value in double-precision format.
If x is an SNaN, vxsnan_flag is set to 1.
If x is an SNaN, returns x, converted to a QNaN, in single-precision floating-point format.
Otherwise, if x is a QNaN, an Infinity, or a Zero, returns x in single-precision floating-point format.
Otherwise, returns x, rounded to single-precision using the rounding mode specified in RN, in single-precision
floating-point format.
ox_flag is set to 1 if rounding x resulted in an Overflow exception.
ux_flag is set to 1 if rounding x resulted in an Underflow exception.
xx_flag is set to 1 if rounding x returns an inexact result.
inc_flag is set to 1 if the significand of the result was incremented during rounding.
ConvertDPtoSP_NS(x)
x is a single-precision floating-point value represented in double-precision format.
Returns x in single-precision format.
sign ← x.bit[0]
exponent ← x.bit[1:11]
fraction ← 0b1 || x.bit[12:63] // implicit bit set to 1 (for now)
Programming Note
If x is not representable in single-precision, some exponent and/or significand bits will be discarded, likely
producing undesirable results. The low-order 29 bits of the significand of x are discarded, more if the
unbiased exponent of x is less than -126 (i.e., denormal). Finite values of x having an unbiased exponent
less than -150 will return a result of Zero. Finite values of x having an unbiased exponent greater than +127
will result in discarding significant bits of the exponent. SNaN inputs having no significant bits in the upper
23 bits of the significand will return Infinity as the result. No status is set for any of these cases.
ConvertDPtoSW(x)
x is a floating-point value in double-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x8000_0000,
Otherwise,
xx_flag is set to 1 if rnd is inexact.
return rnd in 32-bit signed integer format.
ConvertDPtoUD(x)
x is a floating-point value in double-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x0000_0000_0000_0000.
Otherwise,
xx_flag is set to 1 if rnd is inexact.
return rnd in 64-bit unsigned integer format.
ConvertDPtoUW(x)
x is a floating-point value in double-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x0000_0000,
Otherwise,
xx_flag is set to 1 if rnd is inexact.
return rnd in 32-bit unsigned integer format.
ConvertFPtoDP(x)
Return the floating-point value x in double-precision format.
ConvertFPtoSP(x)
Return the floating-point value x in single-precision format.
ConvertSDtoFP(x)
x is a 64-bit signed integer value.
Return the value x converted to floating-point format having unbounded significand precision.
ConvertSPtoDP_NS(x)
x is a single-precision floating-point value.
Returns x in double-precision format.
sign ← x.bit[0]
exponent ← (x.bit[1] || ¬x.bit[1] || ¬x.bit[1] || ¬x.bit[1] || x.bit[2:8])
fraction ← 0b0 || x.bit[9:31] || 0b0_0000_0000_0000_0000_0000_0000_0000
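The exponent expression above concatenates bit 1 of x, three copies of its complement, and bits 2:8 of x. For normal values this rebiases the 8-bit single-precision exponent (bias 127) to the 11-bit double-precision exponent (bias 1023) without an adder. The Python sketch below checks that claim for every normal biased exponent; sp_to_dp_exponent is a hypothetical name, and zero, denormal, infinity, and NaN exponents (0 and 255) are outside what this check covers.

```python
def sp_to_dp_exponent(e8: int) -> int:
    """Build b1 || not-b1 || not-b1 || not-b1 || b2..b8 from an 8-bit exponent."""
    b1 = (e8 >> 7) & 1        # x.bit[1], the most-significant exponent bit
    nb1 = b1 ^ 1              # its complement
    return (b1 << 10) | (nb1 << 9) | (nb1 << 8) | (nb1 << 7) | (e8 & 0x7F)

# Normal single-precision values have biased exponents 1 through 254.
print(all(sp_to_dp_exponent(e) == e - 127 + 1023 for e in range(1, 255)))  # True
```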
ConvertSP64toSP(x)
x is a single-precision floating-point value in double-precision format.
Returns the value x in single-precision format. x must be representable in single-precision, or else the result returned
is undefined. x may require denormalization. No rounding is performed. If x is an SNaN, it is converted to a
single-precision SNaN having the same payload as x.
sign ← x.bit[0]
exp ← x.bit[1:11] - 1023
frac ← x.bit[12:63]
ConvertSPtoDP(x)
x is a single-precision floating-point value.
ConvertSPtoSD(x)
x is a floating-point value in single-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x8000_0000_0000_0000.
Otherwise,
xx_flag is set to 1 if rnd is inexact, and
return rnd in 64-bit signed integer format.
ConvertSPtoSP64(x)
x is a floating-point value in single-precision format.
Returns the value x in double-precision format. If x is an SNaN, it is converted to a double-precision SNaN having
the same payload as x.
sign ← x.bit[0]
exp ← x.bit[1:8] - 127
frac ← x.bit[9:31]
result.bit[0] ← sign
result.bit[1:11] ← exp + 1023
result.bit[12:34] ← frac
result.bit[35:63] ← 0
return(result)
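The field moves shown above translate directly into integer shifts and masks. The following Python sketch mirrors only the steps listed here, so it is valid for normal single-precision inputs; handling of zero, denormal, infinity, and NaN operands is not shown in this excerpt and is not modeled. The function name and the struct-based helpers are assumptions for illustration.

```python
import struct

def convert_sp_to_sp64(x: int) -> int:
    """Widen a normal single-precision bit pattern to double-precision."""
    sign = x >> 31                        # x.bit[0]
    exp = ((x >> 23) & 0xFF) - 127        # x.bit[1:8], unbiased
    frac = x & 0x7F_FFFF                  # x.bit[9:31]
    # result.bit[0], result.bit[1:11], result.bit[12:34]; bits 35:63 stay 0.
    return (sign << 63) | ((exp + 1023) << 52) | (frac << 29)

def sp_bits(value: float) -> int:
    return int.from_bytes(struct.pack(">f", value), "big")

def dp_bits(value: float) -> int:
    return int.from_bytes(struct.pack(">d", value), "big")

print(hex(convert_sp_to_sp64(sp_bits(1.5))))  # 0x3ff8000000000000
```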
ConvertSPtoSW(x)
x is a floating-point value in single-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x8000_0000.
Otherwise,
xx_flag is set to 1 if rnd is inexact, and
return rnd in 32-bit signed integer format.
ConvertSPtoUD(x)
x is a floating-point value in single-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x0000_0000_0000_0000.
Otherwise,
xx_flag is set to 1 if rnd is inexact, and
return rnd in 64-bit unsigned integer format.
ConvertSPtoUW(x)
x is a floating-point value in single-precision format.
If x is a NaN,
vxcvi_flag is set to 1,
vxsnan_flag is set to 1 if x is an SNaN, and
return 0x0000_0000.
Otherwise,
xx_flag is set to 1 if rnd is inexact, and
return rnd in 32-bit unsigned integer format.
ConvertSWtoFP(x)
x is a 32-bit signed integer value.
Return the value x converted to floating-point format having unbounded significand precision.
ConvertUDtoFP(x)
x is a 64-bit unsigned integer value.
Return the value x converted to floating-point format having unbounded significand precision.
ConvertUWtoFP(x)
x is a 32-bit unsigned integer value.
Return the value x converted to floating-point format having unbounded significand precision.
DivideDP(x,y)
x and y are double-precision floating-point values.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is a Zero and y is a Zero, return the standard QNaN.
Otherwise, if x is a finite, nonzero value and y is a Zero with the same sign as x, return +Infinity.
Otherwise, if x is a finite, nonzero value and y is a Zero with the opposite sign as x, return -Infinity.
Otherwise, if x is an Infinity and y is an Infinity, return the standard QNaN.
Otherwise, return the normalized quotient of x divided by y, having unbounded range and precision.
DivideSP(x,y)
x and y are single-precision floating-point values.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is a Zero and y is a Zero, return the standard QNaN.
Otherwise, if x is a finite, nonzero value and y is a Zero with the same sign as x, return +Infinity.
Otherwise, if x is a finite, nonzero value and y is a Zero with the opposite sign as x, return -Infinity.
Otherwise, if x is an Infinity and y is an Infinity, return the standard QNaN.
Otherwise, return the normalized quotient of x divided by y, having unbounded range and precision.
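The special cases of the divide functions above can be sketched with Python floats. This is an approximation under stated assumptions: divide_sketch is a hypothetical name, Python cannot portably distinguish SNaN from QNaN so the quieting step is omitted, an infinite x divided by a Zero is assumed to return a signed Infinity (the list above only names finite, nonzero x), and the final quotient is rounded to double-precision rather than having unbounded range and precision.

```python
import math

def divide_sketch(x: float, y: float) -> float:
    if math.isnan(x):
        return x
    if math.isnan(y):
        return y
    if x == 0.0 and y == 0.0:
        return math.nan                   # Zero divided by Zero
    if math.isinf(x) and math.isinf(y):
        return math.nan                   # Infinity divided by Infinity
    if y == 0.0:
        # copysign reads the sign even when y is -0.0.
        same_sign = math.copysign(1.0, x) == math.copysign(1.0, y)
        return math.inf if same_sign else -math.inf
    return x / y

print(divide_sketch(1.0, -0.0), divide_sketch(-3.0, -0.0))  # -inf inf
```

The explicit Zero-divisor branch is needed in Python because the built-in division raises ZeroDivisionError instead of returning an Infinity.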
DenormDP(x)
x is a floating-point value having unbounded range and precision.
Return the value x with its significand shifted right by a number of bits equal to the difference between -1022 and
the unbiased exponent of x, and its unbiased exponent set to -1022.
DenormSP(x)
x is a floating-point value having unbounded range and precision.
Return the value x with its significand shifted right by a number of bits equal to the difference between -126 and
the unbiased exponent of x, and its unbiased exponent set to -126.
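Denormalization trades exponent for significand bits without changing the represented value. A minimal Python sketch follows, assuming the value is held as a (significand, exponent) pair with an exact rational significand; denorm is a hypothetical name, and leaving the pair unchanged when the exponent is already at or above the minimum is an assumption of this sketch.

```python
from fractions import Fraction

def denorm(significand: Fraction, exponent: int, emin: int):
    """Shift the significand right until the exponent equals emin."""
    shift = emin - exponent               # e.g. -1022 - exponent for DP
    if shift <= 0:
        return significand, exponent      # assumed: nothing to do
    return significand / 2**shift, emin

sig, exp = denorm(Fraction(3, 2), -1025, -1022)   # 1.5 x 2^-1025
print(sig, exp)                                   # 3/16 -1022
```

Because the significand here has unbounded precision, no bits are lost by the shift; rounding to the target precision is a separate step.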
EXTZ32(x)
Result of extending the b-bit value x on the left with 32-b zeros, forming a 32-bit value.
b ← LENGTH(x)
result.bit[0:31-b] ← 0
result.bit[32-b:31] ← x
EXTZ64(x)
Result of extending the b-bit value x on the left with 64-b zeros, forming a 64-bit value.
b ← LENGTH(x)
result.bit[0:63-b] ← 0
result.bit[64-b:63] ← x
EXTZ128(x)
Result of extending the b-bit value x on the left with 128-b zeros, forming a 128-bit value.
b ← LENGTH(x)
result.bit[0:127-b] ← 0
result.bit[128-b:127] ← x
fprf_CLASS_BFP16(x)
x is a floating-point value represented in half-precision format.
Return the 5-bit code that specifies the sign and class of x.
fprf_CLASS_BFP64(x)
x is a floating-point value represented in double-precision format.
Return the 5-bit code that specifies the sign and class of x.
fprf_CLASS_BFP128(x)
x is a binary floating-point value that is represented in quad-precision format.
IsInf(x)
Return 1 if x is an Infinity, otherwise return 0.
IsNaN(x)
Return 1 if x is either an SNaN or a QNaN, otherwise return 0.
IsNeg(x)
Return 1 if x is a negative, nonzero value, otherwise return 0.
IsSNaN(x)
Return 1 if x is an SNaN, otherwise return 0.
IsZero(x)
Return 1 if x is a Zero, otherwise return 0.
MaximumDP(x,y)
x and y are double-precision floating-point values.
MaximumSP(x,y)
x and y are single-precision floating-point values.
MinimumDP(x,y)
x and y are double-precision floating-point values.
MinimumSP(x,y)
x and y are single-precision floating-point values.
MultiplyAddDP(x,y,z)
x, y and z are double-precision floating-point values.
If the product of x and y is an Infinity and z is an Infinity of the opposite sign, vxisi_flag is set to 1.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if z is a QNaN, return z.
Otherwise, if z is an SNaN, return z represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is a Zero and y is an Infinity or x is an Infinity and y is a Zero, return the standard QNaN.
Otherwise, if the product of x and y is an Infinity, and z is an Infinity of the opposite sign, return the standard
QNaN.
Otherwise, return the normalized sum of z and the product of x and y, having unbounded range and precision.
MultiplyAddSP(x,y,z)
x, y and z are single-precision floating-point values.
If the product of x and y is an Infinity and z is an Infinity of the opposite sign, vxisi_flag is set to 1.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if z is a QNaN, return z.
Otherwise, if z is an SNaN, return z represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is a Zero and y is an Infinity or x is an Infinity and y is a Zero, return the standard QNaN.
Otherwise, if the product of x and y is an Infinity, and z is an Infinity of the opposite sign, return the standard
QNaN.
Otherwise, return the normalized sum of z and the product of x and y, having unbounded range and precision.
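The multiply-add rules above examine NaN operands in the order x, z, y, and flag an Infinity minus Infinity case through vxisi_flag. The Python sketch below illustrates that ordering with floats. It is not the architected function: multiply_add_sketch is a hypothetical name, SNaN quieting is omitted because Python cannot portably distinguish SNaN from QNaN, only vxisi_flag is modeled, and the final x * y + z rounds twice instead of using unbounded range and precision.

```python
import math

def multiply_add_sketch(x: float, y: float, z: float):
    """Return (result, vxisi_flag) using the special-case order above."""
    for operand in (x, z, y):             # NaN priority: x, then z, then y
        if math.isnan(operand):
            return operand, 0
    if (x == 0.0 and math.isinf(y)) or (math.isinf(x) and y == 0.0):
        return math.nan, 0                # Zero times Infinity
    product_is_inf = math.isinf(x) or math.isinf(y)
    product_sign = math.copysign(1.0, x) * math.copysign(1.0, y)
    if product_is_inf and math.isinf(z) and product_sign != math.copysign(1.0, z):
        return math.nan, 1                # Infinity of opposite sign added
    if math.isinf(z):
        return z, 0                       # avoids a spurious overflow in x * y
    # Assumption: double-rounded stand-in for the exact sum z + x * y.
    return x * y + z, 0

print(multiply_add_sketch(2.0, 3.0, 4.0))             # (10.0, 0)
print(multiply_add_sketch(math.inf, 2.0, -math.inf))  # (nan, 1)
```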
MultiplyDP(x,y)
x and y are double-precision floating-point values.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is a Zero and y is an Infinity or x is an Infinity and y is a Zero, return the standard QNaN.
Otherwise, return the normalized product of x and y, having unbounded range and precision.
MultiplySP(x,y)
x and y are single-precision floating-point values.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if y is a QNaN, return y.
Otherwise, if y is an SNaN, return y represented as a QNaN.
Otherwise, if x is a Zero and y is an Infinity or x is an Infinity and y is a Zero, return the standard QNaN.
Otherwise, return the normalized product of x and y, having unbounded range and precision.
NegateDP(x)
If the double-precision floating-point value x is a NaN, return x.
Otherwise, return the double-precision floating-point value x with its sign bit complemented.
NegateSP(x)
If the single-precision floating-point value x is a NaN, return x.
Otherwise, return the single-precision floating-point value x with its sign bit complemented.
ReciprocalEstimateDP(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if x is a Zero, return an Infinity with the sign of x.
Otherwise, if x is an Infinity, return a Zero with the sign of x.
Otherwise, return an estimate of the reciprocal of x having unbounded exponent range.
ReciprocalEstimateSP(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if x is a Zero, return an Infinity with the sign of x.
Otherwise, if x is an Infinity, return a Zero with the sign of x.
Otherwise, return an estimate of the reciprocal of x having unbounded exponent range.
ReciprocalSquareRootEstimateDP(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if x is a negative, nonzero value, return the default QNaN.
Otherwise, return an estimate of the reciprocal of the square root of x having unbounded exponent range.
ReciprocalSquareRootEstimateSP(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if x is a negative, nonzero value, return the default QNaN.
Otherwise, return an estimate of the reciprocal of the square root of x having unbounded exponent range.
reset_xflags()
vxsnan_flag is set to 0.
vximz_flag is set to 0.
vxidi_flag is set to 0.
vxisi_flag is set to 0.
vxzdz_flag is set to 0.
vxsqrt_flag is set to 0.
vxcvi_flag is set to 0.
vxvc_flag is set to 0.
ox_flag is set to 0.
ux_flag is set to 0.
xx_flag is set to 0.
zx_flag is set to 0.
RoundToDP(x,y)
x is a 2-bit unsigned integer specifying one of four rounding modes.
Return the value y rounded to double-precision under control of the rounding mode specified by x.
RoundToDPCeil(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
RoundToDPFloor(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
RoundToDPIntegerCeil(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
RoundToDPIntegerFloor(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
RoundToDPIntegerNearAway(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
RoundToDPIntegerNearEven(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
RoundToDPIntegerTrunc(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
RoundToDPNearEven(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
RoundToDPTrunc(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
RoundToSP(x,y)
x is a 2-bit unsigned integer specifying one of four rounding modes.
Return the value y rounded to single-precision under control of the rounding mode specified by x.
RoundToSPCeil(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
RoundToSPFloor(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
RoundToSPIntegerCeil(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
RoundToSPIntegerFloor(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
RoundToSPIntegerNearAway(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
RoundToSPIntegerNearEven(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
RoundToSPIntegerTrunc(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
RoundToSPNearEven(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
RoundToSPTrunc(x)
x is a floating-point value having unbounded range and precision.
If x is a QNaN, return x.
Scalb(x,y)
x is a floating-point value having unbounded range and precision.
y is a signed integer.
SetFX(x)
x is one of the exception flags in the FPSCR.
SquareRootDP(x)
x is a double-precision floating-point value.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if x is a negative, nonzero value, return the default QNaN.
Otherwise, return the normalized square root of x, having unbounded range and precision.
SquareRootSP(x)
x is a single-precision floating-point value.
If x is a QNaN, return x.
Otherwise, if x is an SNaN, return x represented as a QNaN.
Otherwise, if x is a negative, nonzero value, return the default QNaN.
Otherwise, return the normalized square root of x, having unbounded range and precision.
57 VRT RA DS 2
0 6 11 16 30 31
Let EA be the sum of the contents of GPR[RA], or 0 if RA=0, and the signed integer value DS<<2.
When Big-Endian byte ordering is employed, the contents of the doubleword in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte 0 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 1 of load_data, and so forth until
– the contents of the byte in storage at address EA+7 are placed into byte 7 of load_data.
When Little-Endian byte ordering is employed, let load_data be the contents of the doubleword in storage at address EA such that;
– the contents of the byte in storage at address EA are placed into byte 7 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 6 of load_data, and so forth until
– the contents of the byte in storage at address EA+7 are placed into byte 0 of load_data.
load_data is placed into doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
31 T RA RB 588 TX
0 6 11 16 21 31
Let EA be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
When Big-Endian byte ordering is employed, the contents of the doubleword in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte 0 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 1 of load_data, and so forth until
– the contents of the byte in storage at address EA+7 are placed into byte 7 of load_data.
When Little-Endian byte ordering is employed, the contents of the doubleword in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte 7 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 6 of load_data, and so forth until
– the contents of the byte in storage at address EA+7 are placed into byte 0 of load_data.
load_data is placed into doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
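The byte-placement rules above for a doubleword load can be sketched as follows. The function name and the model of storage as a Python `bytes` object are assumptions made for illustration; byte 0 of load_data is taken to be the most significant byte.

```python
def load_doubleword(storage: bytes, ea: int, little_endian: bool) -> int:
    """Build load_data from 8 bytes of storage, following the rules above."""
    load_data = [0] * 8
    for i in range(8):
        # Big-Endian: byte at EA+i goes to byte i.
        # Little-Endian: byte at EA+i goes to byte 7-i.
        target = 7 - i if little_endian else i
        load_data[target] = storage[ea + i]
    return int.from_bytes(bytes(load_data), "big")  # byte 0 is most significant
```

Under either byte ordering the result is simply the doubleword interpreted as an integer in that ordering, which is why the Little-Endian case reverses the byte positions.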
Load VSX Scalar as Integer Byte & Zero Indexed X-form
lxsibzx XT,RA,RB
31 T RA RB 781 TX
0 6 11 16 21 31
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
The unsigned integer in the byte in storage addressed by EA is placed in doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
VSR Data Layout for lxsibzx
tgt = VSR[XT]
tgt.dword[0] undefined
0 64 127
Load VSX Scalar as Integer Halfword & Zero Indexed X-form
lxsihzx XT,RA,RB
31 T RA RB 813 TX
0 6 11 16 21 31
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
The unsigned integer in the halfword in storage addressed by EA is placed in doubleword element 0 of VSR[XT].
The contents of doubleword element 1 of VSR[XT] are undefined.
VSR Data Layout for lxsihzx
tgt = VSR[XT]
tgt.dword[0] undefined
0 64 127
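A minimal sketch of the zero-extending byte and halfword loads described above. The function name is hypothetical, storage is modelled as a `bytes` object, and the halfword is assumed to be interpreted according to the byte ordering in effect.

```python
def load_scalar_zero_extended(storage: bytes, ea: int, size: int,
                              little_endian: bool) -> int:
    """Return doubleword element 0 for a byte (size=1) or halfword (size=2) load."""
    order = "little" if little_endian else "big"
    value = int.from_bytes(storage[ea:ea + size], order)  # unsigned interpretation
    # The upper bits of the doubleword are zero; no sign extension takes place.
    return value & 0xFFFF_FFFF_FFFF_FFFF
```

Doubleword element 1 is left out of the sketch because its contents are undefined.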
VSR[32×TX+T].dword[0] ← EXTS64(MEM(EA,4))
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
VSR[32×TX+T].dword[0] ← ExtendZero(MEM(EA,4))
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
Load VSX Scalar Single DS-form
lxssp VRT,DS(RA)
57 VRT RA DS 3
0 6 11 16 30 31
if MSR.VEC=0 then Vector_Unavailable()
EA ← ((RA=0)? 0 : GPR[RA]) + EXTS(DS||0b00)
VSR[VRT+32].dword[0] ← ConvertSPtoSP64(MEM(EA,4))
VSR[VRT+32].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
Let XT be the value VRT + 32.
Let EA be the sum of the contents of GPR[RA], or 0 if RA=0, and the signed integer value DS||0b00.
When Big-Endian byte ordering is employed, the contents of the word in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte 0 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 1 of load_data,
– the contents of the byte in storage at address EA+2 are placed into byte 2 of load_data, and
– the contents of the byte in storage at address EA+3 are placed into byte 3 of load_data.
When Little-Endian byte ordering is employed, the contents of the word in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte 3 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 2 of load_data,
– the contents of the byte in storage at address EA+2 are placed into byte 1 of load_data, and
– the contents of the byte in storage at address EA+3 are placed into byte 0 of load_data.
load_data, interpreted as a single-precision floating-point value, is placed into doubleword element 0 of VSR[VRT+32] in double-precision format.
The contents of doubleword element 1 of VSR[VRT+32] are undefined.
Special Registers Altered:
None
Load VSX Scalar Single-Precision Indexed X-form
lxsspx XT,RA,RB
31 T RA RB 524 TX
0 6 11 16 21 31
if MSR.VSX=0 then VSX_Unavailable()
EA ← ((RA=0)? 0 : GPR[RA]) + GPR[RB]
VSR[VRT+32].dword[0] ← ConvertSPtoSP64(MEM(EA,4))
VSR[VRT+32].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
Let XT be the value 32×TX + T.
Let EA be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
When Big-Endian byte ordering is employed, the contents of the word in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte 0 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 1 of load_data,
– the contents of the byte in storage at address EA+2 are placed into byte 2 of load_data, and
– the contents of the byte in storage at address EA+3 are placed into byte 3 of load_data.
When Little-Endian byte ordering is employed, the contents of the word in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte 3 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte 2 of load_data,
– the contents of the byte in storage at address EA+2 are placed into byte 1 of load_data, and
– the contents of the byte in storage at address EA+3 are placed into byte 0 of load_data.
load_data, interpreted as a single-precision floating-point value, is placed in doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
Special Registers Altered
None
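The two steps described above for the scalar single-precision loads, forming the DS-form effective address and widening the loaded word to double-precision format, can be sketched as below. Function names are hypothetical; the DS field is taken to be 14 bits wide, and the widening relies on the host's single-to-double conversion, which is assumed to match ConvertSPtoSP64 only for ordinary finite values (NaN handling is not modelled).

```python
import struct


def ds_form_ea(ra: int, gpr_ra: int, ds: int) -> int:
    """EA = (RA=0 ? 0 : GPR[RA]) + EXTS(DS || 0b00), truncated to 64 bits."""
    disp = (ds & 0x3FFF) << 2          # append 0b00 to the 14-bit DS field
    if disp & 0x8000:                  # sign-extend the 16-bit displacement
        disp -= 0x1_0000
    base = 0 if ra == 0 else gpr_ra
    return (base + disp) & 0xFFFF_FFFF_FFFF_FFFF


def load_single_as_double(storage: bytes, ea: int, little_endian: bool) -> int:
    """Load a 4-byte single and return doubleword element 0 as a 64-bit pattern."""
    fmt = "<f" if little_endian else ">f"
    (value,) = struct.unpack(fmt, storage[ea:ea + 4])
    return struct.unpack(">Q", struct.pack(">d", value))[0]
```

For example, the single-precision pattern 0x3FC0_0000 (1.5) becomes the double-precision pattern 0x3FF8_0000_0000_0000 regardless of which byte ordering the storage image uses.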
Load VSX Vector Byte*16 Indexed X-form
lxvb16x XT,RA,RB
31 T RA RB 876 TX
0 6 11 16 21 31
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
EA ← RA=0 ? GPR[RB] : GPR[RA] + GPR[RB]
do i = 0 to 15
  VSR[32×TX+T].byte[i] ← MEM(EA+i, 1)
end
Let XT be the value 32×TX + T.
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
For each integer value from 0 to 15, do the following.
The contents of the byte in storage at address EA+i are placed into byte element i of VSR[XT].
Special Registers Altered:
None
Example: Loading data using Load VSX Vector Byte*16 Indexed
char X[] = { 0xF0, 0xF1, 0xF2, 0xF3,
0xF4, 0xF5, 0xF6, 0xF7,
0xE0, 0xE1, 0xE2, 0xE3,
0xE4, 0xE5, 0xE6, 0xE7 };
Big-endian storage image of X
addr(X): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
Little-endian storage image of X
addr(X): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
Loading a vector of 16 byte elements from Big-Endian storage in VSR[XT] using lxvb16x, retaining left-to-right element ordering.
# Assumptions
# GPR[PX] = address of X
lxvb16x xX,r0,rPX
VSR[X]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
# GPR[PX] = address of X
lxvb16x xX,r0,rPX
VSR[X]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
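The lxvb16x loop above copies byte i of storage to byte element i without any reordering, which is why the example yields the same register image for both storage images. A minimal sketch, with a hypothetical function name and storage modelled as a `bytes` object:

```python
def lxvb16x(storage: bytes, ea: int) -> list:
    """Return the 16 byte elements of the target VSR, element 0 first."""
    # Element i always comes from EA+i; byte ordering mode plays no part.
    return [storage[ea + i] for i in range(16)]


def hex_row(elements) -> str:
    """Format byte elements in the style used by the examples."""
    return " ".join(f"{b:02X}" for b in elements)
```

Applied to the storage image of X shown above, `hex_row(lxvb16x(x_image, 0))` reproduces the `VSR[X]` row of the example.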
31 T RA RB 269 TX
0 6 11 16 21 31
EA ← (RA=0) ? 0 : GPR[RA]
nb ← EXTZ(GPR[RB].bit[0:7])
load_data ← 0x0000_0000_0000_0000_0000_0000_0000_0000
VSR[32×TX+T] ← load_data
Example: Loading less than 16-byte data into VSR using lxvl
Loading less than 16-byte data from Big-Endian storage in VSR[XT] using lxvl.
addr(S)+0x0010: E2 E3 E4 E5 E6 E7 E8 E9 EA EB F0 F1 F2 F3 F4 F5
addr(S)+0x0020: F6 F7 F8 F9 00 00 00 00 00 00 00 00 00 00 00 00
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
# GPR[NS] = 14 (length of S in # of bytes)
# GPR[NX] = 12 (length of X in # of bytes)
# GPR[NZ] = 10 (length of Z in # of bytes)
# GPR[PS] = address of S
VSR[X]: E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB 00 00 00 00
VSR[Z]: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 00 00 00 00 00 00
0 1 2 3 4 5 6 7 8 9 A B C D E F
Loading less than 16-byte data from Little-Endian storage in VSR[XT] using lxvl.
addr(S)+0x0010: E3 E2 E5 E4 E7 E6 E9 E8 EB EA F9 F8 F7 F6 F5 F4
addr(S)+0x0020: F3 F2 F1 F0 00 00 00 00 00 00 00 00 00 00 00 00
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
# GPR[NS] = 14 (length of S in # of bytes)
# GPR[NX] = 12 (length of X in # of bytes)
# GPR[NZ] = 10 (length of Z in # of bytes)
# GPR[PS] = address of S
VSR[X]: 00 00 00 00 EA EB E8 E9 E6 E7 E4 E5 E2 E3 E0 E1
VSR[Z]: 00 00 00 00 00 00 F0 F1 F2 F3 F4 F5 F6 F7 F8 F9
0 1 2 3 4 5 6 7 8 9 A B C D E F
Load VSX Vector Left-justified with Length X-form
lxvll XT,RA,RB
31 T RA RB 301 TX
0 6 11 16 21 31
Example: Loading less than 16-byte left-justified data
decimal X = +1234567890123456789;
decimal Y = -123456;
decimal Z = +1004966723510220;
61 T RA DQ TX 1
0 6 11 16 28 29 31
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the signed integer value DQ||0b0000.
When Big-Endian byte ordering is employed, the contents of the quadword in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte element 0 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte element 1 of load_data, and so forth until
– the contents of the byte in storage at address EA+15 are placed into byte element 15 of load_data.
When Little-Endian byte ordering is employed, the contents of the quadword in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte element 15 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte element 14 of load_data, and so forth until
– the contents of the byte in storage at address EA+15 are placed into byte element 0 of load_data.
31 T RA RB 4 / 12 TX
0 6 11 16 21 25 26 31
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
When Big-Endian byte ordering is employed, the contents of the quadword in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte element 0 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte element 1 of load_data, and so forth until
– the contents of the byte in storage at address EA+15 are placed into byte element 15 of load_data.
When Little-Endian byte ordering is employed, the contents of the quadword in storage at address EA are placed into load_data in such an order that;
– the contents of the byte in storage at address EA are placed into byte element 15 of load_data,
– the contents of the byte in storage at address EA+1 are placed into byte element 14 of load_data, and so forth until
– the contents of the byte in storage at address EA+15 are placed into byte element 0 of load_data.
char W[16] = { 0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7, 0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6, 0xE7 };
short X[8] = { 0xF0F1, 0xF2F3, 0xF4F5, 0xF6F7, 0xE0E1, 0xE2E3, 0xE4E5, 0xE6E7 };
float Y[4] = { 0xF0F1_F2F3, 0xF4F5_F6F7, 0xE0E1_E2E3, 0xE4E5_E6E7 };
double Z[2] = { 0xF0F1_F2F3_F4F5_F6F7, 0xE0E1_E2E3_E4E5_E6E7 };
Loading 16 bytes of data from Big-Endian storage in VSR[XT] using lxvx.
addr(W+0x0010): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
addr(W+0x0020): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
addr(W+0x0030): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
# GPR[PW] = address of W
# GPR[PX] = address of X = GPR[PW] + 16
# GPR[PY] = address of Y = GPR[PW] + 32
# GPR[PZ] = address of Z = GPR[PW] + 48
VSR[X]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
VSR[Y]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
VSR[Z]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
Loading 16 bytes of data from Little-Endian storage in VSR[XT] using lxvx.
addr(W+0x0010): F1 F0 F3 F2 F5 F4 F7 F6 E1 E0 E3 E2 E5 E4 E7 E6
addr(W+0x0020): F3 F2 F1 F0 F7 F6 F5 F4 E3 E2 E1 E0 E7 E6 E5 E4
addr(W+0x0030): F7 F6 F5 F4 F3 F2 F1 F0 E7 E6 E5 E4 E3 E2 E1 E0
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
# GPR[PW] = address of W
# GPR[PX] = address of X = GPR[PW] + 16
# GPR[PY] = address of Y = GPR[PW] + 32
# GPR[PZ] = address of Z = GPR[PW] + 48
VSR[X]: E6 E7 E4 E5 E2 E3 E0 E1 F6 F7 F4 F5 F2 F3 F0 F1
VSR[Y]: E4 E5 E6 E7 E0 E1 E2 E3 F4 F5 F6 F7 F0 F1 F2 F3
VSR[Z]: E0 E1 E2 E3 E4 E5 E6 E7 F0 F1 F2 F3 F4 F5 F6 F7
0 1 2 3 4 5 6 7 8 9 A B C D E F
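The quadword byte-placement rules and the lxvx example above can be checked with a short sketch. The function names are hypothetical and storage is modelled as a `bytes` object; the test data are the storage images of X and Z taken from the example.

```python
def load_quadword(storage: bytes, ea: int, little_endian: bool) -> list:
    """Return the 16 byte elements of load_data, element 0 first."""
    load_data = [0] * 16
    for i in range(16):
        # Big-Endian: EA+i -> byte element i.
        # Little-Endian: EA+i -> byte element 15-i.
        element = 15 - i if little_endian else i
        load_data[element] = storage[ea + i]
    return load_data


def hex_row(elements) -> str:
    """Format byte elements in the style used by the examples."""
    return " ".join(f"{b:02X}" for b in elements)
```

In Little-Endian mode the whole quadword is byte-reversed, so the halfword array X arrives with its elements in right-to-left order but with each halfword's own bytes in natural order, as the `VSR[X]` row of the example shows.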
31 T RA RB 332 TX
0 6 11 16 21 31
load_data ← MEM(EA, 8)
VSR[32×TX+T].dword[0] ← load_data
VSR[32×TX+T].dword[1] ← load_data
Load VSX Vector Halfword*8 Indexed X-form
lxvh8x XT,RA,RB
31 T RA RB 812 TX
0 6 11 16 21 31
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
Example: Loading data using Load VSX Vector Halfword*8 Indexed
short X[] = { 0x0001, 0x1011, 0x2021, 0x3031, 0x4041, 0x5051, 0x6061, 0x7071 };
Loading a vector of 8 halfword elements from Big-Endian storage in VSR[XT] using lxvh8x, retaining left-to-right element ordering.
0 1 2 3 4 5 6 7 8 9 A B C D E F
– the contents of the byte in storage at address
EA+2×i are placed into byte element 1 of
load_data,
Load VSX Vector Word*4 Indexed X-form
lxvw4x XT,RA,RB
31 T RA RB 780 TX
0 6 11 16 21 31
if MSR.VSX=0 then VSX_Unavailable()
EA ← RA=0 ? GPR[RB] : GPR[RA] + GPR[RB]
VSR[32×TX+T].word[0] ← MEM(EA, 4)
VSR[32×TX+T].word[1] ← MEM(EA+4, 4)
VSR[32×TX+T].word[2] ← MEM(EA+8, 4)
VSR[32×TX+T].word[3] ← MEM(EA+12, 4)
Let XT be the value 32×TX + T.
– the contents of the byte in storage at address EA+4×i+3 are placed into byte element 0 of load_data.
load_data is placed into word element i of VSR[XT].
Special Registers Altered
None
VSR Data Layout for lxvw4x
tgt = VSR[XT]
.word[0] .word[1] .word[2] .word[3]
0 32 64 96 127
Load VSX Vector Word & Splat Indexed X-form
lxvwsx XT,RA,RB
31 T RA RB 364 TX
0 6 11 16 21 31
if TX=0 & MSR.VSX=0 then VSX_Unavailable()
if TX=1 & MSR.VEC=0 then Vector_Unavailable()
EA ← RA=0 ? GPR[RB] : GPR[RA] + GPR[RB]
load_data ← MEM(EA,4)
do i = 0 to 3
  VSR[32×TX+T].word[i] ← load_data
end
Let XT be the value 32×TX + T.
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
– the contents of the byte in storage at address EA are placed into byte element 0 of load_data,
Example: Loading data using Load VSX Vector Word & Splat Indexed
int X = 0xF0F1_F2F3;
Big-endian storage image of X
addr(X): F0 F1 F2 F3 00 00 00 00 00 00 00 00 00 00 00 00
0 1 2 3 4 5 6 7 8 9 A B C D E F
Little-endian storage image of X
addr(X): F3 F2 F1 F0 00 00 00 00 00 00 00 00 00 00 00 00
0 1 2 3 4 5 6 7 8 9 A B C D E F
Loading scalar word data from Big-Endian storage in VSR[XT] using lxvwsx.
# Assumptions
# GPR[PX] = address of X
lxvwsx xX,r0,rPX
Loading scalar word data from Little-Endian storage in VSR[XT] using lxvwsx.
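The load-and-splat behaviour of lxvwsx can be sketched as follows, using the two storage images of X from the example above. The function name is hypothetical, and the word is assumed to be interpreted according to the byte ordering in effect.

```python
def lxvwsx(storage: bytes, ea: int, little_endian: bool) -> list:
    """Load one word and replicate it into all four word elements."""
    order = "little" if little_endian else "big"
    load_data = int.from_bytes(storage[ea:ea + 4], order)
    return [load_data] * 4   # word[0] .. word[3] all receive load_data
```

Both storage images of `int X = 0xF0F1_F2F3` produce the same four word elements, because the word is interpreted in the byte ordering under which it was stored.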
Store VSX Scalar Doubleword DS-form
stxsd VRS,DS(RA)
61 VRS RA DS 2
0 6 11 16 30 31
Let EA be the sum of the contents of GPR[RA], or 0 if RA=0, and the signed integer value DS<<2.
Let store_data be the contents of doubleword element 0 of VSR[XS].
When Big-Endian byte ordering is employed, store_data is placed in the doubleword in storage at address EA in such order that;
– byte 0 of store_data is placed into the byte in storage at address EA,
– byte 1 of store_data is placed into the byte in storage at address EA+1, and so forth until
– byte 7 of store_data is placed into the byte in storage at address EA+7.
When Little-Endian byte ordering is employed, store_data is placed in the doubleword in storage at address EA in such order that;
– the contents of byte 7 of doubleword element 0 of VSR[VRS+32] are placed into the byte in storage at address EA,
– the contents of byte 6 of doubleword element 0 of VSR[VRS+32] are placed into the byte in storage at address EA+1, and so forth until
– the contents of byte 0 of doubleword element 0 of VSR[VRS+32] are placed into the byte in storage at address EA+7.
Store VSX Scalar Doubleword Indexed X-form
stxsdx XS,RA,RB
31 S RA RB 716 SX
0 6 11 16 21 31
Let EA be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
Let store_data be the contents of doubleword element 0 of VSR[XS].
When Big-Endian byte ordering is employed, store_data is placed in the doubleword in storage at address EA in such order that;
– byte 0 of store_data is placed into the byte in storage at address EA,
– byte 1 of store_data is placed into the byte in storage at address EA+1, and so forth until
– byte 7 of store_data is placed into the byte in storage at address EA+7.
When Little-Endian byte ordering is employed, store_data is placed in the doubleword in storage at address EA in such order that;
– byte 0 of store_data is placed into the byte in storage at address EA+7,
– byte 1 of store_data is placed into the byte in storage at address EA+6, and so forth until
– byte 7 of store_data is placed into the byte in storage at address EA.
Special Registers Altered
None
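The store-side byte placement described above mirrors the load case. A minimal sketch, with a hypothetical function name, storage modelled as a mutable `bytearray`, and byte 0 of store_data taken to be the most significant byte:

```python
def store_doubleword(storage: bytearray, ea: int, store_data: int,
                     little_endian: bool) -> None:
    """Write the 8 bytes of store_data to storage, following the rules above."""
    data_bytes = store_data.to_bytes(8, "big")   # data_bytes[0] is byte 0
    for i in range(8):
        # Big-Endian: byte i -> EA+i.  Little-Endian: byte i -> EA+7-i.
        address = ea + (7 - i if little_endian else i)
        storage[address] = data_bytes[i]
```

Only the eight bytes starting at EA are modified; neighbouring storage is left untouched.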
.dword[0] unused
0 64 127
Store VSX Scalar as Integer Byte Indexed X-form
stxsibx XS,RA,RB
31 S RA RB 909 SX
0 6 11 16 21 31
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
The contents of byte element 7 of VSR[XS] are placed into the byte in storage addressed by EA.
VSR Data Layout for stxsibx
src = VSR[XS]
Store VSX Scalar as Integer Halfword Indexed X-form
stxsihx XS,RA,RB
31 S RA RB 941 SX
0 6 11 16 21 31
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
The contents of halfword element 3 of VSR[XS] are placed into the halfword in storage addressed by EA.
VSR Data Layout for stxsihx
src = VSR[XS]
MEM(EA,4) ← VSR[32×SX+S].word[1]
MEM(EA,4) ← ConvertSP64toSP(VSR[VRS+32].dword[0])
MEM(EA,4) ← ConvertSP64toSP(VSR[32×SX+S].dword[0])
Store VSX Vector Byte*16 Indexed X-form
stxvb16x XS,RA,RB
31 S RA RB 1004 SX
0 6 11 16 21 31
Let XS be the value 32×SX + S.
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
For each integer value from 0 to 15, do the following.
The contents of byte element i of VSR[XS] are placed into the byte in storage at address EA+i.
Example: Storing data using Store VSX Vector Byte*16 Indexed
char X[16];
VSR[X]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
stxvb16x xX,r0,rPX
Big-endian storage image of X
addr(X): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
Storing a vector of 16 byte elements from VSR[X] into Little-Endian storage using stxvb16x, retaining left-to-right element ordering.
stxvb16x xX,r0,rPX
0 1 2 3 4 5 6 7 8 9 A B C D E F
31 S RA RB 972 SX
0 6 11 16 21 31
Store VSX Vector Halfword*8 Indexed X-form
stxvh8x XS,RA,RB
31 S RA RB 940 SX
0 6 11 16 21 31
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
EA ← RA=0 ? GPR[RB] : GPR[RA] + GPR[RB]
do i = 0 to 7
  MEM(EA+2×i,2) ← VSR[32×SX+S].hword[i]
end
Let XS be the value 32×SX + S.
Let the effective address (EA) be the sum of the contents of GPR[RA], or 0 if RA is equal to 0, and the contents of GPR[RB].
For each integer value from 0 to 7, do the following.
The contents of byte element i of VSR[XS] are placed into the byte in storage at address EA+i.
For each integer value from 0 to 7, do the following.
When Big-Endian byte ordering is employed, the contents of halfword element i of VSR[XS] are placed into the halfword in storage at address EA+2×i in such an order that;
Example: Storing data using Store VSX Vector Halfword*8 Indexed
short X[8];
VSR[X]: 00 01 10 11 20 21 30 31 40 41 50 51 60 61 70 71
0 1 2 3 4 5 6 7 8 9 A B C D E F
Storing a vector of 8 halfword elements from VSR[X] into Big-Endian storage using stxvh8x, retaining left-to-right element ordering.
# Assumptions
# GPR[PX] = address of X
stxvh8x xX,r0,rPX
Big-endian storage image of X
addr(X): 00 01 10 11 20 21 30 31 40 41 50 51 60 61 70 71
0 1 2 3 4 5 6 7 8 9 A B C D E F
Storing a vector of 8 halfword elements from VSR[X] into Little-Endian storage using stxvh8x, retaining left-to-right element ordering.
# Assumptions
# GPR[PX] = address of X
stxvh8x xX,r0,rPX
Store VSX Vector DQ-form
stxv XS,DQ(RA)
61 S RA DQ SX 5
0 6 11 16 28 29 31
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
When Big-Endian byte ordering is employed, store_data is placed into the quadword in storage at address EA in such an order that;
– byte 0 of store_data is placed into the byte in storage at address EA,
– byte 1 of store_data is placed into the byte in storage at address EA+1, and so forth until
– byte 15 of store_data is placed into the byte in storage at address EA+15.
When Little-Endian byte ordering is employed, store_data is placed into the quadword in storage at address EA in such an order that;
– byte 15 of store_data is placed into the byte in storage at address EA,
– byte 14 of store_data is placed into the byte in storage at address EA+1, and so forth until
– byte 0 of store_data is placed into the byte in storage at address EA+15.
Special Registers Altered
None
Store VSX Vector with Length X-form
stxvl XS,RA,RB
31 S RA RB 397 SX
0 6 11 16 21 31
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
Let the effective address (EA) be the contents of GPR[RA], or 0 if RA is equal to 0.
Let nb be the unsigned integer value in bits 0:7 of GPR[RB].
If nb is equal to 0, the storage access is not performed.
Otherwise, when Big-Endian byte ordering is employed, do the following.
If nb is less than 16, the contents of the leftmost nb bytes of VSR[XS] are placed in storage starting at address EA.
Otherwise, the contents of VSR[XS] are placed into the quadword in storage at address EA.
Otherwise, when Little-Endian byte ordering is employed, do the following.
If nb is less than 16, the contents of the rightmost nb bytes of VSR[XS] are placed in storage starting at address EA in byte-reversed order.
Otherwise, the contents of VSR[XS] are placed into the quadword in storage at address EA in byte-reversed order.
If the contents of bits 8:63 of GPR[RB] are not equal to 0, the results are boundedly undefined.
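The length-controlled store described above for stxvl can be sketched as below. The function name is hypothetical; storage is a `bytearray`, the VSR is given as 16 bytes with byte element 0 first, and bits 0:7 of GPR[RB] are taken to be the most significant byte of the 64-bit register. Raising an exception when bits 8:63 are nonzero is purely a choice of this sketch, since the architecture leaves that case boundedly undefined.

```python
def stxvl(storage: bytearray, ea: int, vsr: bytes, gpr_rb: int,
          little_endian: bool) -> None:
    """Store up to 16 bytes of vsr at ea, with the count taken from GPR[RB]."""
    if gpr_rb & 0x00FF_FFFF_FFFF_FFFF:
        # Boundedly undefined in the architecture; rejected in this sketch.
        raise ValueError("bits 8:63 of GPR[RB] must be 0")
    nb = (gpr_rb >> 56) & 0xFF           # bits 0:7 in big-endian bit numbering
    if nb == 0:
        return                           # no storage access is performed
    count = min(nb, 16)
    if little_endian:
        data = vsr[16 - count:][::-1]    # rightmost bytes, byte-reversed
    else:
        data = vsr[:count]               # leftmost bytes, in order
    storage[ea:ea + count] = data
```

A length of 16 or more stores the full quadword, reversed as a whole in Little-Endian mode.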
Example: Storing less than 16-byte data from VSR using stxvl
0 1 2 3 4 5 6 7 8 9 A B C D E F
Store VSX Vector Left-justified with Length X-form
stxvll XS,RA,RB
31 S RA RB 429 SX
0 6 11 16 21 31
Example: Storing less than 16-byte left-justified data
decimal X = +1234567890123456789;
decimal Y = -123456;
decimal Z = +1004966723510220;
31 S RA RB 396 SX
0 6 11 16 21 31
MEM(EA,16) ← VSR[32×SX+S]
char W[16] = { 0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7, 0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6, 0xE7 };
short X[8] = { 0xF0F1, 0xF2F3, 0xF4F5, 0xF6F7, 0xE0E1, 0xE2E3, 0xE4E5, 0xE6E7 };
float Y[4] = { 0xF0F1_F2F3, 0xF4F5_F6F7, 0xE0E1_E2E3, 0xE4E5_E6E7 };
double Z[2] = { 0xF0F1_F2F3_F4F5_F6F7, 0xE0E1_E2E3_E4E5_E6E7 };
Storing 16 bytes of data into Big-Endian storage from VSR[XS] using stxvx.
VSR[W]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
VSR[X]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
VSR[Y]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
VSR[Z]: F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
# GPR[PW] = address of W
# GPR[PX] = address of X = GPR[PW] + 16
# GPR[PY] = address of Y = GPR[PW] + 32
# GPR[PZ] = address of Z = GPR[PW] + 48
addr(W+0x0010): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
addr(W+0x0020): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
addr(W+0x0030): F0 F1 F2 F3 F4 F5 F6 F7 E0 E1 E2 E3 E4 E5 E6 E7
0 1 2 3 4 5 6 7 8 9 A B C D E F
Storing 16 bytes of data into Little-Endian storage from VSR[XS] using stxvx.
VSR[W]: E7 E6 E5 E4 E3 E2 E1 E0 F7 F6 F5 F4 F3 F2 F1 F0
VSR[X]: E6 E7 E4 E5 E2 E3 E0 E1 F6 F7 F4 F5 F2 F3 F0 F1
VSR[Y]: E4 E5 E6 E7 E0 E1 E2 E3 F4 F5 F6 F7 F0 F1 F2 F3
VSR[Z]: E0 E1 E2 E3 E4 E5 E6 E7 F0 F1 F2 F3 F4 F5 F6 F7
0 1 2 3 4 5 6 7 8 9 A B C D E F
# Assumptions
# GPR[PW] = address of W
# GPR[PX] = address of X = GPR[PW] + 16
# GPR[PY] = address of Y = GPR[PW] + 32
# GPR[PZ] = address of Z = GPR[PW] + 48
addr(W+0x0010): F1 F0 F3 F2 F5 F4 F7 F6 E1 E0 E3 E2 E5 E4 E7 E6
addr(W+0x0020): F3 F2 F1 F0 F7 F6 F5 F4 E3 E2 E1 E0 E7 E6 E5 E4
addr(W+0x0030): F7 F6 F5 F4 F3 F2 F1 F0 E7 E6 E5 E4 E3 E2 E1 E0
0 1 2 3 4 5 6 7 8 9 A B C D E F
VSX Scalar Absolute Value Double-Precision XX2-form
xsabsdp XT,XB
60 T /// B 345 BX TX
0 6 11 16 21 30 31
VSX Scalar Absolute Quad-Precision X-form
xsabsqp VRT,VRB
63 VRT 0 VRB 804 /
0 6 11 16 21 31
if MSR.VSX=0 then VSX_Unavailable()
src
The contents of doubleword element 1 of VSR[XT] are
undefined. VSR[XT]
tgt
Special Registers Altered
None
Programming Note
This instruction can be used to operate on a
single-precision source operand.
VSX Scalar Add Double-Precision XX3-form
xsadddp XT,XA,XB
60 T A B 32 AX BX TX
0 6 11 16 21 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
reset_xflags()
src1 ← VSR[XA]{0:63}
src2 ← VSR[XB]{0:63}
v{0:inf} ← AddDP(src1,src2)
result{0:63} ← RoundToDP(RN,v)
if(vxsnan_flag) then SetFX(VXSNAN)
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
vex_flag ← VE & (vxsnan_flag | vxisi_flag)
if( ~vex_flag ) then do
  VSR[XT] ← result || 0xUUUU_UUUU_UUUU_UUUU
  FPRF ← ClassDP(result)
  FR ← inc_flag
  FI ← xx_flag
end
else do
  FR ← 0b0
  FI ← 0b0
end
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF is set to the class and sign of the result. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
If a trap-enabled invalid operation exception occurs, VSR[XT] and FPRF are not modified, and FR and FI are set to 0.
See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.
Special Registers Altered
FPRF FR FI FX OX UX XX VXSNAN VXISI
VSR Data Layout for xsadddp
src1 = VSR[XA]
DP unused
src2 = VSR[XB]
DP unused
tgt = VSR[XT]
DP undefined
0 64 127
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
-Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
-Zero v ← -Infinity v ← src2 v ← -Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← -Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
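The special-operand rows of the table above can be sketched on raw 64-bit patterns. This is a minimal illustration, not the architected definition: the names `add_dp`, `is_snan`, and the constants are hypothetical, Q(x) is assumed to mean "set the most-significant fraction bit", and the ordinary finite case is delegated to the host's double-precision addition (already rounded to nearest, so it does not model the unbounded intermediate v or the RN-dependent sign of Rezd).

```python
import struct

DQNAN = 0x7FF8_0000_0000_0000      # default quiet NaN, as listed in the table
SIGN_BIT = 0x8000_0000_0000_0000
EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000  # assumption: Q(x) sets this bit, keeps payload


def to_bits(x):
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def to_float(b):
    return struct.unpack(">d", struct.pack(">Q", b))[0]


def is_nan(b):
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0


def is_snan(b):
    return is_nan(b) and not (b & QUIET_BIT)


def is_inf(b):
    return (b & ~SIGN_BIT) == EXP_MASK


def add_dp(src1, src2):
    """Return (v_bits, vxsnan_flag, vxisi_flag) for two DP bit patterns."""
    vxsnan = is_snan(src1) or is_snan(src2)
    if is_nan(src1):                      # SNaN -> Q(src1); QNaN -> src1
        return src1 | QUIET_BIT, vxsnan, False
    if is_nan(src2):                      # SNaN -> Q(src2); QNaN -> src2
        return src2 | QUIET_BIT, vxsnan, False
    if is_inf(src1) and is_inf(src2) and (src1 ^ src2) & SIGN_BIT:
        return DQNAN, False, True         # Infinity plus opposite Infinity
    # Remaining cases: host addition stands in for A(src1,src2).
    return to_bits(to_float(src1) + to_float(src2)), False, False
```

For example, `add_dp(0x7FF0_0000_0000_0001, to_bits(1.0))` yields the quieted NaN `0x7FF8_0000_0000_0001` with `vxsnan_flag` set.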
Rounding Mode
Range of v Case Round To Round Towards Round Towards Round Towards Round To
Nearest (RTN) Zero (RTZ) +Infinity (RTP) –Infinity (RTM) Odd (RTO)
Explanation:
– This situation cannot occur.
v The precise intermediate result defined in the instruction having unbounded range and precision.
den(x) The significand of x is shifted right by the amount of the difference between the target rounding precision Emin and the unbiased
exponent of x. The unbiased exponent of the denormalized value is Emin. The significand of the denormalized value has
unbounded significand precision.
Emin = -16382 (quad-precision)
Emin = -16382 (double-extended-precision)
Emin = -1022 (double-precision)
Emin = -126 (single-precision)
Rezd Exact-zero-difference result. Applies only to add operations involving source operands having the same magnitude and different
signs or subtract operations involving source operands having the same magnitude and same signs. Whether +Zero or -Zero is
returned is controlled by the setting of the rounding mode in RN, even when the rounding mode is overridden to Round to Odd.
rnd(x) The significand of x is rounded to the target rounding precision according to the rounding mode specified in FPSCR.RN. Exponent
range of the rounded result is unbounded. See Section 7.3.2.6.
Nmax Largest (in magnitude) representable normalized number in the target rounding precision format.
Nmax = ± 2^+16383 × 1.FFFFFFFFFFFFFFFFFFFFFFFFFFFF (quad-precision)
Nmax = ± 2^+16383 × 1.FFFFFFFFFFFFFFFF000000000000 (double-extended-precision)
Nmax = ± 2^+1023 × 1.FFFFFFFFFFFFF000000000000000 (double-precision)
Nmax = ± 2^+127 × 1.FFFFFF0000000000000000000000 (single-precision)
Nmin Smallest (in magnitude) representable normalized number in the target rounding precision format.
Nmin = ± 2^-16382 × 1.0000000000000000000000000000 (quad-precision)
Nmin = ± 2^-16382 × 1.0000000000000000000000000000 (double-extended-precision)
Nmin = ± 2^-1022 × 1.0000000000000000000000000000 (double-precision)
Nmin = ± 2^-126 × 1.0000000000000000000000000000 (single-precision)
ulp Least significant bit in the target precision format’s significand (Unit in the Last Position).
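As a quick check of the double-precision entries above, the following sketch rebuilds Nmax and Nmin from Emin, the maximum exponent, and the 52-bit fraction width, and models den(x) with exact rationals. The names `EMIN`, `EMAX`, `FRACTION_BITS`, and `den` are chosen for illustration only; the host is assumed to use IEEE double-precision floats.

```python
import math
import sys
from fractions import Fraction

EMIN = -1022          # Emin for double-precision, from the explanation above
EMAX = 1023           # exponent of Nmax for double-precision
FRACTION_BITS = 52    # 13 hexadecimal fraction digits in 1.FFFFFFFFFFFFF

# Nmin = 2^Emin x 1.0 and Nmax = 2^Emax x (2 - 2^-52)
nmin = math.ldexp(1.0, EMIN)
nmax = math.ldexp(2.0 - 2.0 ** -FRACTION_BITS, EMAX)


def den(significand, exponent):
    """Shift the significand right by (Emin - exponent) and fix the exponent at Emin.

    The significand is an exact Fraction, so no bits are lost (unbounded precision).
    """
    shift = EMIN - exponent
    return significand / Fraction(2) ** shift, EMIN


print(nmin == sys.float_info.min, nmax == sys.float_info.max)
print(den(Fraction(3, 2), -1024))
```

The denormalized pair represents the same value as the input: 1.5 × 2^-1024 equals 0.375 × 2^-1022.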
Is q inexact? (q ≠ v)
vxsnan_flag
vxsqrt_flag
vximz_flag
vxisi_flag
vxidi_flag
vxzdz_flag
FPSCR.VE
FPSCR.OE
FPSCR.UE
FPSCR.ZE
FPSCR.XE
zx_flag
Case Returned Results and Status Setting
Explanation:
– The results do not depend on this condition.
T(x) Places the result into the target VSR.
For scalar single-precision and double-precision results
VSR[XT].dword[0] ← bfp_CONVERT_TO_BFP64(r)
VSR[XT].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
For scalar quad-precision results
VSR[VRT+32] ← bfp_CONVERT_TO_BFP128(r)
class_bfp(x) Sets FPSCR.FPRF to the sign and class of x.
FPSCR.FPRF ← fprf_CLASS_BFP32(x) (single-precision)
FPSCR.FPRF ← fprf_CLASS_BFP64(x) (double-precision)
FPSCR.FPRF ← fprf_CLASS_BFP128(x) (quad-precision)
fx(x) FPSCR.FX is set to 1 if FPSCR.x=0. FPSCR.x is set to 1.
fi(x) FPSCR.FI is set to the value x.
fr(x) FPSCR.FR is set to the value x.
β Wrap adjust
β = 2^192 (single-precision)
β = 2^1536 (double-precision)
β = 2^24576 (quad-precision)
See Table 7.4.3.2, “Action for OE=1,” on page 406 for trap-enabled Overflow exceptions.
See Table 7.4.4.2, “Action for UE=1,” on page 411 for trap-enabled Underflow exceptions.
q The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the
target rounding precision, unbounded exponent range.
r The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the
target rounding precision, exponent bounded to the target rounding precision format exponent range.
error() The system error handler is invoked for the trap-enabled exception if MSR.FE0 and MSR.FE1 are set to any mode other than the
ignore-exception mode.
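The fx(x) helper defined above has a sticky, edge-triggered character: FX is raised only when the named status bit changes from 0 to 1. A minimal sketch, assuming a hypothetical `Fpscr` class that stores a handful of status bits in a dictionary:

```python
class Fpscr:
    """Toy model of a few FPSCR status bits (illustrative only)."""

    def __init__(self):
        self.bits = {"FX": 0, "OX": 0, "UX": 0, "XX": 0, "VXSNAN": 0, "VXISI": 0}

    def fx(self, name):
        # FPSCR.FX is set to 1 if FPSCR.x = 0; FPSCR.x is then set to 1.
        if self.bits[name] == 0:
            self.bits["FX"] = 1
        self.bits[name] = 1


fpscr = Fpscr()
fpscr.fx("XX")            # first inexact result: XX and FX both become 1
fpscr.bits["FX"] = 0      # software clears the summary bit
fpscr.fx("XX")            # XX was already 1, so FX stays 0
print(fpscr.bits["XX"], fpscr.bits["FX"])
```

The second call leaves FX at 0 because XX did not make a 0-to-1 transition.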
VSX Scalar Add Single-Precision XX3-form

xsaddsp XT,XA,XB

60 T A B 0 AX BX TX
0 6 11 16 21 29 30 31

reset_xflags()
src1 ← VSR[32×AX+A].dword[0]
src2 ← VSR[32×BX+B].dword[0]
v ← AddDP(src1,src2)
result ← RoundToSP(RN,v)
if(vxsnan_flag) then SetFX(VXSNAN)
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)

The result is placed into doubleword element 0 of VSR[XT] in double-precision format.

The contents of doubleword element 1 of VSR[XT] are undefined.

FPRF is set to the class and sign of the result as represented in single-precision format. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.

If a trap-enabled invalid operation exception occurs, VSR[XT] and FPRF are not modified, and FR and FI are set to 0.

See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.
1. Floating-point addition is based on exponent comparison and addition of the two significands. The exponents of the two operands are compared,
and the significand accompanying the smaller exponent is shifted right, with its exponent increased by one for each bit shifted, until the two
exponents are equal. The two significands are then added or subtracted as appropriate, depending on the signs of the operands, to form an
intermediate sum. All 53 bits of the significand as well as all three guard bits (G, R, and X) enter into the computation.
2. Floating-point normalization is based on shifting the significand left until the most-significant bit is 1 and decrementing the exponent by the number
of bits the significand was shifted.
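The alignment step described in note 1 can be sketched with integer significands. This is an illustration under stated assumptions: both operands are positive, a value is `significand × 2^exponent`, three extra low-order positions stand in for the G, R, and X bits, and any bits shifted out past X are ORed into X (the usual sticky-bit treatment). The function name `align_and_add` is hypothetical.

```python
def align_and_add(e1, m1, e2, m2, guard_bits=3):
    """Add m1*2**e1 and m2*2**e2 (both positive).

    Returns (exponent, significand) where the significand carries
    guard_bits extra low-order bits; the result is not normalized.
    """
    # Append guard positions (G, R, X) below each significand.
    m1 <<= guard_bits
    m2 <<= guard_bits
    # Make operand 1 the one with the larger exponent.
    if e1 < e2:
        e1, m1, e2, m2 = e2, m2, e1, m1
    # Shift the smaller operand right until the exponents match.
    shift = e1 - e2
    lost = m2 & ((1 << shift) - 1)
    m2 >>= shift
    if lost:
        m2 |= 1  # sticky: remember that nonzero bits fell off the end
    return e1, m1 + m2


print(align_and_add(2, 1, 0, 1))
```

With the inputs shown (4 + 1), the returned significand 10 carries three guard positions, so the value is 10/8 × 2^2 = 5.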
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
-Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
-Zero v ← -Infinity v ← src2 v ← -Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← -Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
if MSR.VSX=0 then VSX_Unavailable() If the intermediate result is Tiny (i.e., the unbiased
exponent is less than -16382) and UE=0, the
reset_xflags() significand is shifted right N bits, where N is the
difference between -16382 and the unbiased
src1 ← bfp_CONVERT_FROM_BFP128(VSR[VRA+32])
src2 ← bfp_CONVERT_FROM_BFP128(VSR[VRB+32])
exponent of the intermediate result. The exponent
v ← bfp_ADD(src1, src2) of the intermediate result is set to the value
rnd ← bfp_ROUND_TO_BFP128(RO, FPSCR.RN, v) -16382.
result ← bfp_CONVERT_TO_BFP128(rnd)
If RO=1, let the rounding mode be Round to Odd.
if(vxsnan_flag) then SetFX(FPSCR.VXSNAN) Otherwise, let the rounding mode be specified by
if(vxisi_flag) then SetFX(FPSCR.VXISI) RN. The intermediate result is rounded to
if(ox_flag) then SetFX(FPSCR.OX) quad-precision using the specified rounding mode.
if(ux_flag) then SetFX(FPSCR.UX)
if(xx_flag) then SetFX(FPSCR.XX) See Table 58, “Scalar Floating-Point Intermediate
Result Handling,” on page 516.
vx_flag ← vxsnan_flag | vxisi_flag
ex_flag ← FPSCR.VE & vx_flag
The result is placed into VSR[VRT+32] in quad-precision
if ex_flag=0 then do
format.
VSR[VRT+32] ← result
FPSCR.FPRF ← fprf_CLASS_BFP128(result) FPRF is set to the class and sign of the result. FR is set
end to indicate if the rounded result was incremented. FI is
FPSCR.FR ← (vx_flag=0) & inc_flag set to indicate the result is inexact.
FPSCR.FI ← (vx_flag=0) & xx_flag
If a trap-disabled Invalid Operation exception occurs,
Let src1 be the floating-point value in VSR[VRA+32] FR and FI are set to 0.
represented in quad-precision format.
If a trap-enabled Invalid Operation exception occurs,
Let src2 be the floating-point value in VSR[VRB+32] VSR[VRT+32] and FPRF are not modified, and FR and FI
represented in quad-precision format. are set to 0.
If either src1 or src2 is a Signalling NaN, an Invalid See Table 59, “VSX Scalar Floating-Point Final
Operation exception occurs and VXSNAN is set to 1. Result,” on page 517.
If src1 and src2 are Infinity values having opposite Special Registers Altered:
signs, an Invalid Operation exception occurs and VXISI FPRF FR FI
is set to 1. FX VXSNAN VXISI OX UX XX
If src1 is a Signalling NaN, the result is the Quiet NaN VSR Data Layout for xsaddqp[o]
corresponding to src1. VSR[VRA+32]
VSR[VRB+32]
Otherwise, if src2 is a Signalling NaN, the result is the
Quiet NaN corresponding to src2. src2
VSR[VRT+32]
Otherwise, if src2 is a Quiet NaN, the result is src2.
tgt
Otherwise, if src1 and src2 are Infinity values having
opposite signs, the result is the default Quiet NaN[1].
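The Round to Odd mode selected by RO=1 is commonly described as truncation followed by forcing the least-significant kept bit to 1 whenever any discarded bit was nonzero. The sketch below applies that rule to an integer significand; it is an assumption-labeled illustration (the name `round_to_odd` and the integer representation are hypothetical), not the architected rounding routine.

```python
def round_to_odd(significand, drop):
    """Discard the low `drop` bits of a non-negative integer significand.

    If the discarded bits were all zero the result is exact and returned
    unchanged; otherwise the low bit of the kept part is forced to 1.
    """
    kept = significand >> drop
    if significand & ((1 << drop) - 1):
        kept |= 1
    return kept


print(round_to_odd(0b10100, 2), round_to_odd(0b10001, 2), round_to_odd(0b10111, 2))
```

Because an inexact result always ends in a 1 bit, a later rounding to a narrower format can still tell that the value was not exact, which is why this mode is useful as an intermediate step.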
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN
-Infinity v ← -Infinity
vxisi_flag ← 1
-NZF v ← add(src1,src2) v ← src1 v ← add(src1,src2)
VSX Scalar Compare Exponents Let src1 be the floating-point value in VSR[VRA+32]
Quad-Precision X-form represented in quad-precision format.
vex_flag ← FPSCR.VE & vxsnan_flag A NaN compared to any value, including itself,
compares false for the predicate, equal.
if(vxsnan_flag) SetFX(FPSCR.VXSNAN)
The contents of doubleword 0 of VSR[XT] are set to
if (vex_flag=0) then do 0xFFFF_FFFF_FFFF_FFFF if src1 is equal to src2, and are
if bfp_COMPARE_EQ(src1, src2)=1 then
set to 0x0000_0000_0000_0000 otherwise.
VSR[32×TX+T].dword[0] ← 0xFFFF_FFFF_FFFF_FFFF
VSR[32×TX+T].dword[1] ← 0x0000_0000_0000_0000
end
The contents of doubleword 1 of VSR[XT] are set to
else do 0x0000_0000_0000_0000.
VSR[32×TX+T].dword[0] ← 0x0000_0000_0000_0000
VSR[32×TX+T].dword[1] ← 0x0000_0000_0000_0000 If a trap-enabled Invalid Operation occurs, VSR[XT] is
end not modified.
end
Special Registers Altered:
Let XT be the value 32×TX + T. FX VXSNAN
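The mask-style result of the compare-equal pseudocode above can be sketched as follows. The function name `compare_eq_dwords` is hypothetical, the host's IEEE equality stands in for bfp_COMPARE_EQ, and the trap-enabled path (VSR[XT] left unmodified when vex_flag=1) is not modeled.

```python
ALL_ONES = 0xFFFF_FFFF_FFFF_FFFF


def compare_eq_dwords(src1, src2):
    """Return (dword0, dword1) of the target for two double-precision inputs."""
    # A NaN compares false against everything, including itself.
    dword0 = ALL_ONES if src1 == src2 else 0
    return dword0, 0


nan = float("nan")
print(compare_eq_dwords(1.0, 1.0), compare_eq_dwords(nan, nan))
```

An all-ones or all-zeros doubleword is convenient as a select mask for a following bitwise select operation.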
VSX Scalar Compare Greater Than or Equal Let XT be the value 32×TX + T.
Double-Precision XX3-form Let XA be the value 32×AX + A.
Let XB be the value 32×BX + B.
xscmpgedp XT,XA,XB
Let src1 be the double-precision floating-point value in
60 T A B 19 AX BX TX
0 6 11 16 21 29 30 31 doubleword 0 of VSR[XA].
FL ← CompareLTDP(src1,src2)
FG ← CompareGTDP(src1,src2)
FE ← CompareEQDP(src1,src2)
FU ← IsNAN(src1) | IsNAN(src2)
CR[BF] ← FL || FG || FE || FU
if(vxsnan_flag) then SetFX(VXSNAN)
if(vxvc_flag) then SetFX(VXVC)
src2
–Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
cc←0b0001
cc←0b0001
–Infinity cc←0b0010 cc←0b1000 cc←0b1000 cc←0b1000 cc←0b1000 cc←0b1000 vxsnan_flag←1
vxvc_flag←1
vxvc_flag←(VE=0)
cc←0b0001
cc←0b0001
–NZF cc←0b0100 cc←C(src1,src2) cc←0b1000 cc←0b1000 cc←0b1000 cc←0b1000 vxsnan_flag←1
vxvc_flag←1
vxvc_flag←(VE=0)
cc←0b0001
cc←0b0001
–Zero cc←0b0100 cc←0b0100 cc←0b0010 cc←0b0010 cc←0b1000 cc←0b1000 vxsnan_flag←1
vxvc_flag←1
vxvc_flag←(VE=0)
cc←0b0001
cc←0b0001
+Zero cc←0b0100 cc←0b0100 cc←0b0010 cc←0b0010 cc←0b1000 cc←0b1000 vxsnan_flag←1
vxvc_flag←1
vxvc_flag←(VE=0)
src1
cc←0b0001
cc←0b0001
+NZF cc←0b0100 cc←0b0100 cc←0b0100 cc←0b0100 cc←C(src1,src2) cc←0b1000 vxsnan_flag←1
vxvc_flag←1
vxvc_flag←(VE=0)
cc←0b0001
cc←0b0001
+Infinity cc←0b0100 cc←0b0100 cc←0b0100 cc←0b0100 cc←0b0100 cc←0b0010 vxsnan_flag←1
vxvc_flag←1
vxvc_flag←(VE=0)
cc←0b0001
cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001
QNaN vxsnan_flag←1
vxvc_flag←1 vxvc_flag←1 vxvc_flag←1 vxvc_flag←1 vxvc_flag←1 vxvc_flag←1 vxvc_flag←1
vxvc_flag←(VE=0)
cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001 cc←0b0001
SNaN vxsnan_flag←1 vxsnan_flag←1 vxsnan_flag←1 vxsnan_flag←1 vxsnan_flag←1 vxsnan_flag←1 vxsnan_flag←1 vxsnan_flag←1
vxvc_flag←(VE=0) vxvc_flag←(VE=0) vxvc_flag←(VE=0) vxvc_flag←(VE=0) vxvc_flag←(VE=0) vxvc_flag←(VE=0) vxvc_flag←(VE=0) vxvc_flag←(VE=0)
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
C(x,y) The floating-point value x is compared to the floating-point value y, returning one of three 4-bit results.
0b1000 when x is less than y
0b0100 when x is greater than y
0b0010 when x is equal to y
cc The 4-bit result compare code.
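A sketch of the 4-bit compare code, with the bit assignment taken from the concatenation FL || FG || FE || FU in the pseudocode and from the table rows (for example, src1 = –Infinity against any larger src2 gives 0b1000). The function name `compare_code` is hypothetical, and the exception flags (vxsnan_flag, vxvc_flag) are not modeled.

```python
import math


def compare_code(x, y):
    """Return cc as FL || FG || FE || FU for two double-precision values."""
    if math.isnan(x) or math.isnan(y):
        return 0b0001   # FU: unordered
    if x < y:
        return 0b1000   # FL: less than
    if x > y:
        return 0b0100   # FG: greater than
    return 0b0010       # FE: equal (includes -Zero versus +Zero)


print(format(compare_code(float("-inf"), -1.0), "04b"))
```

The same value is written to both FPCC and the CR field selected by BF.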
– 0 0 FPCC←cc, CR[BF]←cc
0 0 1 FPCC←cc, CR[BF]←cc, fx(VXVC)
0 1 0 FPCC←cc, CR[BF]←cc, fx(VXSNAN)
0 1 1 FPCC←cc, CR[BF]←cc, fx(VXSNAN), fx(VXVC)
1 0 1 FPCC←cc, CR[BF]←cc, fx(VXVC), error()
1 1 – FPCC←cc, CR[BF]←cc, fx(VXSNAN), error()
Explanation:
– The results do not depend on this condition.
cc The 4-bit result as defined in Table 62.
fx(x) FX is set to 1 if x=0. x is set to 1.
error() The system error handler is invoked for the trap-enabled exception if the FE0 and FE1 bits in the Machine State Register are set
to any mode other than the ignore-exception mode.
FX Floating-Point Summary Exception status flag, FPSCR.FX.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCR.VXSNAN. See Section 7.4.1.
VXVC Floating-Point Invalid Operation Exception (Invalid Compare) status flag, FPSCR.VXVC. See Section 7.4.1.
VSX Scalar Compare Ordered Quad-Precision Let src1 be the floating-point value in VSR[VRA+32]
X-form represented in quad-precision format.
src1
VSR[VRB+32]
src2
FL ← CompareLTDP(src1,src2)
FG ← CompareGTDP(src1,src2)
FE ← CompareEQDP(src1,src2)
FU ← IsNAN(src1) | IsNAN(src2)
CR[BF] ← FL || FG || FE || FU
if(vxsnan_flag) then SetFX(VXSNAN)
Programming Note
This instruction can be used to operate on
single-precision source operands.
src2
–Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
cc = 0b0001
–Infinity cc = 0b0010 cc = 0b1000 cc = 0b1000 cc = 0b1000 cc = 0b1000 cc = 0b1000 cc = 0b0001
vxsnan_flag = 1
cc = 0b0001
–NZF cc = 0b0100 cc = C(src1,src2) cc = 0b1000 cc = 0b1000 cc = 0b1000 cc = 0b1000 cc = 0b0001
vxsnan_flag = 1
cc = 0b0001
–Zero cc = 0b0100 cc = 0b0100 cc = 0b0010 cc = 0b0010 cc = 0b1000 cc = 0b1000 cc = 0b0001
vxsnan_flag = 1
cc = 0b0001
+Zero cc = 0b0100 cc = 0b0100 cc = 0b0010 cc = 0b0010 cc = 0b1000 cc = 0b1000 cc = 0b0001
vxsnan_flag = 1
src1
cc = 0b0001
+NZF cc = 0b0100 cc = 0b0100 cc = 0b0100 cc = 0b0100 cc = C(src1,src2) cc = 0b1000 cc = 0b0001
vxsnan_flag = 1
cc = 0b0001
+Infinity cc = 0b0100 cc = 0b0100 cc = 0b0100 cc = 0b0100 cc = 0b0100 cc = 0b0010 cc = 0b0001
vxsnan_flag = 1
cc = 0b0001
QNaN cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001
vxsnan_flag = 1
cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001 cc = 0b0001
SNaN
vxsnan_flag = 1 vxsnan_flag = 1 vxsnan_flag = 1 vxsnan_flag = 1 vxsnan_flag = 1 vxsnan_flag = 1 vxsnan_flag = 1 vxsnan_flag = 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
C(x,y) The floating-point value x is compared to the floating-point value y, returning one of three 4-bit results.
0b1000 when x is less than y
0b0100 when x is greater than y
0b0010 when x is equal to y
cc The 4-bit result compare code.
– 0 FPCC←cc, CR[BF]←cc
0 1 FPCC←cc, CR[BF]←cc, fx(VXSNAN)
1 1 FPCC←cc, CR[BF]←cc, fx(VXSNAN), error()
Explanation:
– The results do not depend on this condition.
cc The 4-bit result as defined in Table 64.
fx(x) FX is set to 1 if x=0. x is set to 1.
error() The system error handler is invoked for the trap-enabled exception if the FE0 and FE1 bits in the Machine State Register are set
to any mode other than the ignore-exception mode.
FX Floating-Point Summary Exception status flag, FPSCR.FX.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCR.VXSNAN. See Section 7.4.1.
VSX Scalar Compare Unordered Let src1 be the floating-point value in VSR[VRA+32]
Quad-Precision X-form represented in quad-precision format.
src1
VSR[VRB+32]
src2
VSX Scalar Copy Sign Double-Precision XX3-form

xscpsgndp XT,XA,XB

60 T A B 176 AX BX TX
0 6 11 16 21 29 30 31

XT ← TX || T
XA ← AX || A
XB ← BX || B
result{0:63} ← VSR[XA]{0} || VSR[XB]{1:63}
VSR[XT] ← result || 0xUUUU_UUUU_UUUU_UUUU

Let XT be the value 32×TX + T.
Let XA be the value 32×AX + A.
Let XB be the value 32×BX + B.

Bit 0 of VSR[XT] is set to the contents of bit 0 of VSR[XA].

Bits 1:63 of VSR[XT] are set to the contents of bits 1:63 of VSR[XB].

The contents of doubleword element 1 of VSR[XT] are undefined.

Special Registers Altered
None

DP unused
tgt = VSR[XT]
DP undefined
0 64 127

Programming Note
This instruction can be used to operate on single-precision source operands.

VSX Scalar Copy Sign Quad-Precision X-form

xscpsgnqp VRT,VRA,VRB

63 VRT VRA VRB 100 /
0 6 11 16 21 31

if MSR.VSX=0 then VSX_Unavailable()
src1 ← VSR[VRA+32] & 0x8000_0000_0000_0000_0000_0000_0000_0000
src2 ← VSR[VRB+32] & 0x7FFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF
VSR[VRT+32] ← src1 | src2

Let src1 be the floating-point value in VSR[VRA+32] represented in quad-precision format.

Let src2 be the floating-point value in VSR[VRB+32] represented in quad-precision format.

src2 is placed into VSR[VRT+32] with the sign of src1.

Special Registers Altered:
None

VSR Data Layout for xscpsgnqp
VSR[VRA+32]
src1
VSR[VRB+32]
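The copy-sign operation is a pure bit manipulation, so it can be sketched directly on 64-bit patterns for the double-precision form. The names `copy_sign_bits`, `to_bits`, and `to_float` are illustrative helpers, not architected functions; bit 0 in the specification's numbering is the most-significant bit.

```python
import math
import struct

SIGN_BIT = 0x8000_0000_0000_0000
MAGNITUDE_MASK = 0x7FFF_FFFF_FFFF_FFFF


def to_bits(x):
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def to_float(b):
    return struct.unpack(">d", struct.pack(">Q", b))[0]


def copy_sign_bits(xa, xb):
    """Bit 0 comes from xa, bits 1:63 come from xb."""
    return (xa & SIGN_BIT) | (xb & MAGNITUDE_MASK)


result = copy_sign_bits(to_bits(-1.0), to_bits(2.5))
print(to_float(result), math.copysign(2.5, -1.0))
```

No floating-point status is touched, which matches the "Special Registers Altered: None" entry; the quad-precision form applies the same idea to a 128-bit pattern.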
VSX Scalar round & Convert Double-Precision format to Half-Precision format XX2-form

if vex_flag=0 then do
VSR[TX×32+T].hword[0:2] ← 0x0000_0000_0000
VSR[TX×32+T].hword[3] ← result
VSR[TX×32+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
FPSCR.FPRF ← fprf_CLASS_BFP16(result)
end
FPSCR.FR ← (vex_flag=0) & inc_flag
FPSCR.FI ← (vex_flag=0) & xx_flag

Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.

Let src be the double-precision floating-point value in doubleword element 0 of VSR[XB].

If src is an SNaN, the result is the half-precision representation of that SNaN converted to a QNaN.

Otherwise, if src is a QNaN, the result is the half-precision representation of that QNaN.

FPRF is set to the class and sign of the result as represented in half-precision. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.

If a trap-enabled invalid operation exception occurs, VSR[XT] and FPRF are not modified, and FR and FI are set to 0.

Special Registers Altered:
FPRF FR FI
FX VXSNAN OX UX XX

Programming Note
This instruction can be used to operate on a single-precision source operand.
src ← bfp_CONVERT_FROM_BFP64(VSR[VRB+32].dword[0])
if src.class.SNaN then
result ← bfp_CONVERT_TO_BFP128(bfp_QUIET(src))
else
result ← bfp_CONVERT_TO_BFP128(src)
vxsnan_flag ← src.class.SNaN
if(vxsnan_flag) then SetFX(FPSCR.VXSNAN)
vex_flag ← FPSCR.VE & vxsnan_flag
if vex_flag=0 then do
VSR[VRT+32] ← result
FPSCR.FPRF ← fprf_CLASS_BFP128(result)
end
FPSCR.FR ← 0
FPSCR.FI ← 0
FR is set to 0. FI is set to 0.
src.dword[0] unused
VSR[VRT+32]
tgt
VSX Scalar round Double-Precision to See Table 59, “VSX Scalar Floating-Point Final
single-precision and Convert to Result,” on page 517.
Single-Precision format XX2-form
Special Registers Altered
xscvdpsp XT,XB FPRF FR FI FX OX UX XX VXSNAN
60 T /// B 265 BX TX
0 6 11 16 21 30 31
VSR Data Layout for xscvdpsp
reset_xflags() src = VSR[XB]
src ← VSR[32×BX+B].dword[0]
result ← ConvertDPtoSP(src) DP unused
if(vxsnan_flag) then SetFX(FPSCR.VXSNAN)
tgt = VSR[XT]
if(xx_flag) then SetFX(FPSCR.XX)
if(ox_flag) then SetFX(FPSCR.OX) SP undefined undefined
if(ux_flag) then SetFX(FPSCR.UX) 0 32 64 127
vex_flag ← FPSCR.VE & vxsnan_flag
Programming Note
if( ~vex_flag ) then do
VSR[32×TX+T].word[0] ← result This instruction can be used to operate on a
VSR[32×TX+T].word[1] ← 0xUUUU_UUUU single-precision source operand.
VSR[32×TX+T].word[2] ← 0xUUUU_UUUU
VSR[32×TX+T].word[3] ← 0xUUUU_UUUU
FPSCR.FPRF ← ClassSP(result)
FPSCR.FR ← inc_flag
FPSCR.FI ← xx_flag
end
else do
FPSCR.FR ← 0b0
FPSCR.FI ← 0b0
end
reset_xflags() XT ← TX || T
src ← VSR[32×BX+B].dword[0] XB ← BX || B
result ← ConvertDPtoSP_NS(src) reset_xflags()
VSR[32×TX+T].word[0] ← result result{0:63} ← ConvertDPtoSD(VSR[XB]{0:63})
VSR[32×TX+T].word[1] ← 0xUUUU_UUUU if(vxsnan_flag) then SetFX(VXSNAN)
VSR[32×TX+T].word[2] ← 0xUUUU_UUUU if(vxcvi_flag) then SetFX(VXCVI)
VSR[32×TX+T].word[3] ← 0xUUUU_UUUU if(xx_flag) then SetFX(XX)
vex_flag ← VE & (vxsnan_flag | vxcvi_flag)
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B. if( ~vex_flag ) then do
VSR[XT] ← result || 0xUUUU_UUUU_UUUU_UUUU
Let src be the single-precision floating-point value in FPRF ← 0bUUUUU
doubleword element 0 of VSR[XB] represented in FR ← inc_flag
FI ← xx_flag
double-precision format.
end
else do
src is placed into word element 0 of VSR[XT] in FR ← 0b0
single-precision format. FI ← 0b0
end
The contents of word elements 1, 2, and 3 of VSR[XT]
are undefined. Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
Special Registers Altered
None Let src be the double-precision floating-point value in
doubleword element 0 of VSR[XB].
VSR Data Layout for xscvdpspn
src = VSR[XB] If src is a NaN, the result is the value
0x8000_0000_0000_0000 and VXCVI is set to 1. If src is
SP unused
an SNaN, VXSNAN is also set to 1.
tgt = VSR[XT]
Otherwise, src is rounded to a floating-point integer
SP undefined undefined undefined using the rounding mode Round Toward Zero.
0 32 64 96 127
If the rounded value is greater than 2^63-1, the result is
Programming Note 0x7FFF_FFFF_FFFF_FFFF and VXCVI is set to 1.
xscvdpsp should be used to convert a scalar
double-precision value to vector single-precision Otherwise, if the rounded value is less than -2^63, the
format. result is 0x8000_0000_0000_0000 and VXCVI is set to 1.
xscvdpspn should be used to convert a scalar Otherwise, the result is the rounded value converted to
single-precision value to vector single-precision 64-bit signed-integer format, and if the result is inexact
format. (i.e., not equal to src), XX is set to 1.
Otherwise,
– The result is placed into doubleword element 0 of
VSR[XT]. The contents of doubleword element 1 of
VSR[XT] are undefined.
– FPRF is set to an undefined value.
– FR is set to indicate if the result was incremented
when rounded.
– FI is set to indicate the result is inexact.
Programming Note
This instruction can be used to operate on a
single-precision source operand.
Programming Note
xscvdpsxds rounds using Round towards Zero
rounding mode. For other rounding modes, software
must use a Round to Double-Precision Integer
instruction that corresponds to the desired rounding
mode, including xsrdpic which uses the rounding
mode specified by RN.
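The saturating, truncating conversion performed by xscvdpsxds can be sketched as below. This is a simplified model: the name `convert_dp_to_sd` is hypothetical, the function returns the 64-bit result pattern together with the VXCVI and XX indications, and it does not distinguish SNaN from QNaN (so VXSNAN is not modeled) nor the trap-enabled path.

```python
import math

INT64_MAX_BITS = 0x7FFF_FFFF_FFFF_FFFF
INT64_MIN_BITS = 0x8000_0000_0000_0000
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def convert_dp_to_sd(src):
    """Return (result_bits, vxcvi_flag, xx_flag) for a double-precision src."""
    if math.isnan(src):
        return INT64_MIN_BITS, True, False
    if src >= 2.0 ** 63:               # rounded value would exceed 2^63-1
        return INT64_MAX_BITS, True, False
    if src < -(2.0 ** 63):             # rounded value would be below -2^63
        return INT64_MIN_BITS, True, False
    rounded = math.trunc(src)          # Round Toward Zero
    # Two's-complement encoding of the integer; inexact if a fraction was dropped.
    return rounded & MASK64, False, rounded != src


print(convert_dp_to_sd(-1.75))
```

Here -1.75 truncates to -1, so the result pattern is all ones and the inexact indication is set.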
VSX Scalar Convert Half-Precision format to Otherwise, if src is a QNaN, the result is the
Double-Precision format XX2-form double-precision representation of that QNaN.
vex_flag ← FPSCR.VE & vxsnan_flag FPRF is set to the class and sign of the result as
represented in double-precision format. FR is set to
if vex_flag=0 then do
indicate if the rounded result was incremented. FI is
VSR[VRT+32].dword[0] ← result
VSR[VRT+32].dword[1] ← 0x0000_0000_0000_0000
set to indicate the result is inexact.
FPSCR.FPRF ← fprf_CLASS_BFP64(result)
end If a trap-disabled Invalid Operation exception occurs,
FPSCR.FR ← (vxsnan_flag=0) & inc_flag FR and FI are set to 0.
FPSCR.FI ← (vxsnan_flag=0) & xx_flag
If a trap-enabled Invalid Operation exception occurs,
Let src be the quad-precision floating-point value in VSR[VRT+32] and FPRF are not modified, and FR and FI
VSR[VRB+32]. are set to 0.
If src is a Signalling NaN, an Invalid Operation See Table 59, “VSX Scalar Floating-Point Final
exception occurs and VXSNAN is set to 1. Result,” on page 517.
If src is a Signalling NaN, the result is the Quiet NaN Special Registers Altered:
corresponding to the Signalling NaN, with the FPRF FR FI
significand truncated to the rounding precision. FX VXSNAN OX UX XX
Otherwise, if src is a Quiet NaN, then the result is src VSR Data Layout for xscvqpdp[o]
with the significand truncated to double-precision. VSR[VRB+32]
tgt.dword[0] 0x0000_0000_0000_0000
VSX Scalar truncate & Convert If src is a Quiet NaN or an Infinity, an Invalid Operation
Quad-Precision format to Signed Doubleword exception occurs and VXCVI is set to 1.
format X-form
If src is a NaN, the result is 0x8000_0000_0000_0000.
xscvqpsdz VRT,VRB
Otherwise, if src is a Zero, the result is
63 VRT 25 VRB 836 /
0 6 11 16 21 31 0x0000_0000_0000_0000.
tgt.dword[0] 0x0000_0000_0000_0000
If src is a Signalling NaN, an Invalid Operation
exception occurs and VXSNAN and VXCVI are set to 1.
bfp_ROUND_TO_INTEGER(0b001,src) ≠ src
FPSCR.VE
FPSCR.XE
VSX Scalar truncate & Convert If src is a Quiet NaN or an Infinity, an Invalid Operation
Quad-Precision format to Signed Word format exception occurs and VXCVI is set to 1.
X-form
If src is a NaN, the result is 0xFFFF_FFFF_8000_0000.
xscvqpswz VRT,VRB
Otherwise, if src is a Zero, the result is
63 VRT 9 VRB 836 /
0 6 11 16 21 31 0x0000_0000_0000_0000.
src
Let src be the quad-precision floating-point value in
VSR[VRB+32]. VSR[VRT+32]
tgt.dword[0] 0x0000_0000_0000_0000
If src is a Signalling NaN, an Invalid Operation
exception occurs and VXSNAN and VXCVI are set to 1.
bfp_ROUND_TO_INTEGER(0b001,src) ≠ src
FPSCR.VE
FPSCR.XE
VSX Scalar truncate & Convert If src is a Quiet NaN or an Infinity, an Invalid Operation
Quad-Precision format to Unsigned exception occurs and VXCVI is set to 1.
Doubleword format X-form
If src is a NaN, the result is 0x0000_0000_0000_0000.
xscvqpudz VRT,VRB
Otherwise, if src is a Zero, the result is
63 VRT 17 VRB 836 /
0 6 11 16 21 31 0x0000_0000_0000_0000.
bfp_ROUND_TO_INTEGER(0b001,src) ≠ src
FPSCR.VE
FPSCR.XE
VSX Scalar truncate & Convert If src is a Quiet NaN or an Infinity, an Invalid Operation
Quad-Precision format to Unsigned Word exception occurs and VXCVI is set to 1.
format X-form
If src is a NaN, the result is 0x0000_0000_0000_0000.
xscvqpuwz VRT,VRB
Otherwise, if src is a Zero, the result is
63 VRT 1 VRB 836 /
0 6 11 16 21 31 0x0000_0000_0000_0000.
tgt.dword[0] 0x0000_0000_0000_0000
Let src be the quad-precision floating-point value in
VSR[VRB+32].
bfp_ROUND_TO_INTEGER(0b001,src) ≠ src
FPSCR.VE
FPSCR.XE
src ← bfp_CONVERT_FROM_SI64(VSR[VRB+32].dword[0])
result ← bfp_CONVERT_TO_BFP128(src)
VSR[VRT+32] ← result
FPSCR.FPRF ← fprf_CLASS_BFP128(result)
FPSCR.FR ← 0
FPSCR.FI ← 0
src.dword[0] unused
VSR[VRT+32]
tgt
reset_xflags()
src ← VSR[32×BX+B].word[0]
result ← ConvertVectorSPtoScalarSP(src)
if(vxsnan_flag) then SetFX(FPSCR.VXSNAN)
vex_flag ← FPSCR.VE & vxsnan_flag
FPSCR.FR ← 0b0
FPSCR.FI ← 0b0
if( ~vex_flag ) then do
VSR[32×TX+T].dword[0] ← result
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
FPSCR.FPRF ← ClassDP(result)
end
reset_xflags()
src ← VSR[32×BX+B].word[0]
result ← ConvertSPtoDP_NS(src)
VSR[32×TX+T].dword[0] ← result
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
Programming Note
xscvspdp should be used to convert a vector
single-precision floating-point value to scalar
double-precision format.
VSX Scalar Convert Signed Integer Doubleword to floating-point format and round to Double-Precision format XX2-form
xscvsxddp XT,XB
XT ← TX || T
XB ← BX || B
reset_xflags()
v{0:inf} ← ConvertSDtoFP(VSR[XB]{0:63})
result{0:63} ← RoundToDP(RN,v)
VSR[XT] ← result || 0xUUUU_UUUU_UUUU_UUUU
if(xx_flag) then SetFX(XX)
FPRF ← ClassDP(result)
FR ← inc_flag
FI ← xx_flag
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
Let src be the signed integer value in doubleword element 0 of VSR[XB].
src is converted to an unbounded-precision floating-point value and rounded to double-precision using the rounding mode specified by RN.
The result is placed into doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF is set to the class and sign of the result. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
Special Registers Altered
FPRF FR FI FX XX
VSX Scalar Convert Signed Integer Doubleword to floating-point format and round to Single-Precision XX2-form
xscvsxdsp XT,XB
reset_xflags()
src ← ConvertSDtoDP(VSR[32×BX+B].dword[0])
result ← RoundToSP(RN,src)
VSR[32×TX+T].dword[0] ← ConvertSPtoSP64(result)
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
if(xx_flag) then SetFX(XX)
FPRF ← ClassSP(result)
FR ← inc_flag
FI ← xx_flag
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
Let src be the two’s-complement integer value in doubleword element 0 of VSR[XB].
src is converted to floating-point format, and rounded to single-precision using the rounding mode specified by RN.
The result is placed into doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF is set to the class and sign of the result as represented in single-precision format. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
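The xscvsxddp steps above (convert, round, then derive FR and FI from inc_flag and xx_flag) can be sketched in Python. This is a minimal illustration, not the architected behaviour: it assumes RN selects round-to-nearest-even, which is how CPython rounds an int to a float, and the function name `xscvsxddp_sketch` is hypothetical.

```python
def xscvsxddp_sketch(dword0: int):
    """Model xscvsxddp for the round-to-nearest-even case only (an assumption).

    dword0 is the raw 64-bit pattern in doubleword element 0 of VSR[XB].
    Returns (result, xx_flag, inc_flag).
    """
    # Interpret the 64-bit pattern as a two's-complement signed integer.
    src = dword0 - (1 << 64) if dword0 & (1 << 63) else dword0
    # CPython converts int -> float with round-to-nearest, ties-to-even.
    result = float(src)
    # Inexact if the rounded value differs from the exact integer.
    xx_flag = int(result) != src
    # "Incremented" here means the magnitude grew during rounding.
    inc_flag = xx_flag and abs(int(result)) > abs(src)
    return result, xx_flag, inc_flag
```

Every integer up to 2^53 in magnitude converts exactly, so FI and FR can only become 1 for larger operands.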
VSX Scalar Convert Signed Doubleword format to Quad-Precision format X-form
xscvsdqp VRT,VRB
Let src be the signed integer value in doubleword element 0 of VSR[VRB+32].
src is placed into VSR[VRT+32] in quad-precision floating-point format.
FPRF is set to the class and sign of the result. FR is set to 0. FI is set to 0.
VSR Data Layout for xscvsdqp
VSR[VRB+32]
VSR[VRT+32]: tgt
VSX Scalar Convert Unsigned Doubleword format to Quad-Precision format X-form
xscvudqp VRT,VRB
Let src be the unsigned integer value in doubleword element 0 of VSR[VRB+32].
src is placed into VSR[VRT+32] in quad-precision floating-point format.
FPRF is set to the class and sign of the result. FR is set to 0. FI is set to 0.
VSR Data Layout for xscvudqp
VSR[VRB+32]
VSR[VRT+32]: tgt
VSX Scalar Convert Unsigned Integer Doubleword to floating-point format and round to Double-Precision format XX2-form
xscvuxddp XT,XB
XT ← TX || T
XB ← BX || B
reset_xflags()
src{0:inf} ← ConvertUDtoFP(VSR[XB]{0:63})
result{0:63} ← RoundToDP(RN,src)
VSR[XT] ← result || 0xUUUU_UUUU_UUUU_UUUU
if(xx_flag) then SetFX(XX)
FPRF ← ClassDP(result)
FR ← inc_flag
FI ← xx_flag
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
Let src be the unsigned integer value in doubleword element 0 of VSR[XB].
src is converted to an unbounded-precision floating-point value and rounded to double-precision using the rounding mode specified by RN.
The result is placed into doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF is set to the class and sign of the result. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
Special Registers Altered
FPRF FR FI FX XX
VSX Scalar Convert Unsigned Integer Doubleword to floating-point format and round to Single-Precision XX2-form
xscvuxdsp XT,XB
reset_xflags()
src ← ConvertUDtoDP(VSR[32×BX+B].dword[0])
result ← RoundToSP(RN,src)
VSR[32×TX+T].dword[0] ← ConvertSPtoSP64(result)
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
if(xx_flag) then SetFX(XX)
FPRF ← ClassSP(result)
FR ← inc_flag
FI ← xx_flag
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
Let src be the unsigned integer value in doubleword element 0 of VSR[XB].
src is converted to floating-point format, and rounded to single-precision using the rounding mode specified by RN.
The result is placed into doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF is set to the class and sign of the result as represented in single-precision format. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
VSX Scalar Divide Double-Precision XX3-form
xsdivdp XT,XA,XB
60 T A B 56 AX BX TX
0 6 11 16 21 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
reset_xflags()
src1 ← VSR[XA]{0:63}
src2 ← VSR[XB]{0:63}
v{0:inf} ← DivideFP(src1,src2)
result{0:63} ← RoundToDP(RN,v)
if(vxsnan_flag) then SetFX(VXSNAN)
if(vxidi_flag) then SetFX(VXIDI)
if(vxzdz_flag) then SetFX(VXZDZ)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
if(zx_flag) then SetFX(ZX)
vex_flag ← VE & (vxsnan_flag | vxidi_flag | vxzdz_flag)
zex_flag ← ZE & zx_flag
if( ~vex_flag & ~zex_flag ) then do
   VSR[XT] ← result || 0xUUUU_UUUU_UUUU_UUUU
   FPRF ← ClassDP(result)
   FR ← inc_flag
   FI ← xx_flag
end
else do
   FR ← 0b0
   FI ← 0b0
end
Let XT be the value 32×TX + T.
Let XA be the value 32×AX + A.
Let XB be the value 32×BX + B.
The result is placed into doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF is set to the class and sign of the result. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
If a trap-enabled invalid operation exception or a trap-enabled zero divide exception occurs, VSR[XT] and FPRF are not modified, and FR and FI are set to 0.
See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.
Special Registers Altered
FPRF FR FI FX OX UX ZX XX VXSNAN VXIDI VXZDZ
VSR Data Layout for xsdivdp (bits 0:63 | bits 64:127)
src1 = VSR[XA]: DP | unused
src2 = VSR[XB]: DP | unused
tgt = VSR[XT]: DP | undefined
Rows give src1; columns give src2. Each cell lists the intermediate result v, followed by any flag that is set.
src1 \ src2 | -Infinity | -NZF | -Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
-Infinity | v ← dQNaN, vxidi_flag ← 1 | v ← +Infinity | v ← +Infinity | v ← –Infinity | v ← –Infinity | v ← dQNaN, vxidi_flag ← 1 | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
-NZF | v ← +Zero | v ← D(src1,src2) | v ← +Infinity, zx_flag ← 1 | v ← –Infinity, zx_flag ← 1 | v ← D(src1,src2) | v ← –Zero | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
-Zero | v ← +Zero | v ← +Zero | v ← dQNaN, vxzdz_flag ← 1 | v ← dQNaN, vxzdz_flag ← 1 | v ← –Zero | v ← –Zero | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
+Zero | v ← –Zero | v ← –Zero | v ← dQNaN, vxzdz_flag ← 1 | v ← dQNaN, vxzdz_flag ← 1 | v ← +Zero | v ← +Zero | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
D(x,y) Return the normalized quotient of floating-point value x divided by floating-point value y, having unbounded range and precision.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
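The special-case rows of the divide table can be sketched as a small classifier over raw 64-bit patterns. This is a hedged illustration only: the names `classify` and `divide_dp` are hypothetical, OX/UX/XX and rounding-mode handling are not modelled, and the NaN-in-src1 precedence is an assumption carried over from the ordering given in the xsdivqp[o] prose. Q(x) is modelled as setting the most significant fraction bit.

```python
import struct

DQNAN = 0x7FF8_0000_0000_0000   # default quiet NaN from the table legend
SIGN = 1 << 63
INF = 0x7FF0_0000_0000_0000
QUIET_BIT = 1 << 51             # most significant fraction bit

def classify(b: int) -> str:
    """Classify a double-precision bit pattern (sign ignored)."""
    exp, frac = (b >> 52) & 0x7FF, b & ((1 << 52) - 1)
    if exp == 0x7FF:
        if frac == 0:
            return "Infinity"
        return "QNaN" if frac & QUIET_BIT else "SNaN"
    return "Zero" if exp == 0 and frac == 0 else "NZF"

def to_float(b: int) -> float:
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def to_bits(x: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def divide_dp(src1: int, src2: int):
    """Return (result bits, set of flag names raised) for src1 / src2."""
    c1, c2 = classify(src1), classify(src2)
    sign = (src1 ^ src2) & SIGN
    flags = set()
    if "SNaN" in (c1, c2):
        flags.add("vxsnan")
    if c1 in ("SNaN", "QNaN"):          # assumed precedence: src1 NaN first
        return src1 | QUIET_BIT, flags
    if c2 in ("SNaN", "QNaN"):
        return src2 | QUIET_BIT, flags
    if c1 == c2 == "Infinity":
        return DQNAN, {"vxidi"}
    if c1 == c2 == "Zero":
        return DQNAN, {"vxzdz"}
    if c2 == "Zero":
        if c1 == "NZF":                 # only a nonzero finite dividend raises ZX
            flags.add("zx")
        return sign | INF, flags
    if c1 == "Infinity":
        return sign | INF, flags
    if c1 == "Zero" or c2 == "Infinity":
        return sign, flags              # signed zero
    # NZF / NZF: host division, round-to-nearest-even assumed.
    return to_bits(to_float(src1) / to_float(src2)), flags
```

Working on bit patterns rather than Python floats keeps the SNaN/QNaN distinction explicit and avoids relying on how the host propagates signalling NaNs.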
VSX Scalar Divide Quad-Precision [using round to Odd] X-form
xsdivqp VRT,VRA,VRB (RO=0)
xsdivqpo VRT,VRA,VRB (RO=1)
63 VRT VRA VRB 548 RO
0 6 11 16 21 31
if MSR.VSX=0 then VSX_Unavailable()
reset_xflags()
src1 ← bfp_CONVERT_FROM_BFP128(VSR[VRA+32])
src2 ← bfp_CONVERT_FROM_BFP128(VSR[VRB+32])
v ← bfp_DIVIDE(src1, src2)
rnd ← bfp_ROUND_TO_BFP128(RO, FPSCR.RN, v)
result ← bfp_CONVERT_TO_BFP128(rnd)
if(vxsnan_flag) then SetFX(FPSCR.VXSNAN)
if(vxidi_flag) then SetFX(FPSCR.VXIDI)
if(vxzdz_flag) then SetFX(FPSCR.VXZDZ)
if(ox_flag) then SetFX(FPSCR.OX)
if(ux_flag) then SetFX(FPSCR.UX)
if(zx_flag) then SetFX(FPSCR.ZX)
if(xx_flag) then SetFX(FPSCR.XX)
vx_flag ← vxsnan_flag | vxidi_flag | vxzdz_flag
ex_flag ← (FPSCR.VE & vx_flag) | (FPSCR.ZE & zx_flag)
if ex_flag=0 then do
   VSR[VRT+32] ← result
   FPSCR.FPRF ← fprf_CLASS_BFP128(result)
end
FPSCR.FR ← (vx_flag=0) & (zx_flag=0) & inc_flag
FPSCR.FI ← (vx_flag=0) & (zx_flag=0) & xx_flag
Let src1 be the floating-point value in VSR[VRA+32] represented in quad-precision format.
Let src2 be the floating-point value in VSR[VRB+32] represented in quad-precision format.
If either src1 or src2 is a Signalling NaN, an Invalid Operation exception occurs and VXSNAN is set to 1.
If src1 and src2 are Infinity values, an Invalid Operation exception occurs and VXIDI is set to 1.
If src1 and src2 are Zero values, an Invalid Operation exception occurs and VXZDZ is set to 1.
If src1 is a finite value and src2 is a Zero value, a Zero Divide exception occurs and ZX is set to 1.
If src1 is a Signalling NaN, the result is the Quiet NaN corresponding to src1.
Otherwise, if src1 is a Quiet NaN, the result is src1.
Otherwise, if src2 is a Signalling NaN, the result is the Quiet NaN corresponding to src2.
Otherwise, if src2 is a Quiet NaN, the result is src2.
Otherwise, if src1 and src2 are Infinity values, or if src1 and src2 are Zero values, the result is the default Quiet NaN[1].
Otherwise, if src1 is a non-zero value and src2 is a Zero value, the result is an Infinity.
Otherwise, do the following.
   The normalized quotient of src1 divided by src2 is produced with unbounded significand precision and exponent range.
   See Table 75, “Actions for xsdivqp[o],” on page 567.
   If the intermediate result is Tiny (i.e., the unbiased exponent is less than -16382) and UE=0, the significand is shifted right N bits, where N is the difference between -16382 and the unbiased exponent of the intermediate result. The exponent of the intermediate result is set to the value -16382.
   If RO=1, let the rounding mode be Round to Odd. Otherwise, let the rounding mode be specified by RN. Unless the result is an Infinity or a Zero, the intermediate result is rounded to quad-precision using the specified rounding mode.
   See Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516.
The result is placed into VSR[VRT+32] in quad-precision format.
FPRF is set to the class and sign of the result. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
If a trap-disabled Invalid Operation exception occurs, FR and FI are set to 0.
If a trap-disabled Zero Divide exception occurs, FR and FI are set to 0.
If a trap-enabled Invalid Operation exception or a trap-enabled Zero Divide exception occurs, VSR[VRT+32] and FPRF are not modified, and FR and FI are set to 0.
See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.
Special Registers Altered:
FPRF FR FI
FX VXSNAN VXIDI VXZDZ OX UX ZX XX
1. The quad-precision default Quiet NaN is the value 0x7FFF_8000_0000_0000_0000_0000_0000_0000.
VSR Data Layout for xsdivqp[o]: src1 = VSR[VRA+32], src2 = VSR[VRB+32], tgt = VSR[VRT+32].
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← dQNaN
-Infinity v ← +Infinity v ← +Infinity v ← -Infinity v ← -Infinity
vxidi_flag ← 1 vxidi_flag ← 1
v ← +Infinity v ← -Infinity
-NZF v ← Div(src1,src2) v ← Div(src1,src2)
zx_flag ← 1 zx_flag ← 1
-Zero v ← +Zero v ← -Zero
v ← dQNaN
v ← src2
vxzdz_flag ← 1 v ← quiet(src2)
+Zero v ← -Zero v ← +Zero
vxsnan_flag ← 1
src1
v ← -Infinity v ← +Infinity
+NZF v ← Div(src1,src2) v ← Div(src1,src2)
zx_flag ← 1 zx_flag ← 1
v ← dQNaN v ← dQNaN
+Infinity v ← -Infinity v ← -Infinity v ← +Infinity v ← +Infinity
vxidi_flag ← 1 vxidi_flag ← 1
v ← src1
QNaN v ← src1
vxsnan_flag ← 1
v ← quiet(src1)
SNaN
vxsnan_flag ← 1
Explanation:
src1 The quad-precision floating-point value in VSR[VRA+32].
src2 The quad-precision floating-point value in VSR[VRB+32].
dQNaN Default quiet NaN (0x7FFF_8000_0000_0000_0000_0000_0000_0000).
NZF Nonzero finite number.
Div(x,y) The floating-point value x is divided by floating-point value y. Return the normalized quotient, having unbounded range and precision.
quiet(x) Convert x to the corresponding Quiet NaN.
v The intermediate result having unbounded significand precision and unbounded exponent range.
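The last steps of the xsdivqp[o] pseudocode (ex_flag, the conditional update of the target, and the FR/FI assignments) reduce to a few boolean expressions. The sketch below restates them in Python; the function name and the returned dictionary keys are hypothetical, chosen only for illustration.

```python
def xsdivqp_writeback(vx_flag, zx_flag, inc_flag, xx_flag, ve, ze):
    """Mirror the ex_flag / FR / FI logic of the xsdivqp[o] pseudocode.

    vx_flag is the OR of vxsnan_flag, vxidi_flag and vxzdz_flag.
    ve and ze stand for FPSCR.VE and FPSCR.ZE.
    """
    # A trap-enabled invalid-operation or zero-divide exception blocks the update.
    ex_flag = bool((ve and vx_flag) or (ze and zx_flag))
    # FR and FI are forced to 0 by any invalid-operation or zero-divide
    # exception, whether or not the trap is enabled.
    quiet = (not vx_flag) and (not zx_flag)
    return {
        "update_target": not ex_flag,   # VSR[VRT+32] and FPRF are written
        "FR": bool(quiet and inc_flag),
        "FI": bool(quiet and xx_flag),
    }
```

Note the asymmetry: a trap-disabled exception still writes the target and FPRF, but FR and FI are cleared in both the trap-enabled and trap-disabled cases.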
VSX Scalar Divide Single-Precision XX3-form
xsdivsp XT,XA,XB
60 T A B 24 AX BX TX
0 6 11 16 21 29 30 31
reset_xflags()
src1 ← VSR[32×AX+A].dword[0]
src2 ← VSR[32×BX+B].dword[0]
v ← DivideDP(src1,src2)
result ← RoundToSP(RN,v)
if(vxsnan_flag) then SetFX(VXSNAN)
if(vxidi_flag) then SetFX(VXIDI)
if(vxzdz_flag) then SetFX(VXZDZ)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
if(zx_flag) then SetFX(ZX)
vex_flag ← VE & (vxsnan_flag|vxidi_flag|vxzdz_flag)
zex_flag ← ZE & zx_flag
if( ~vex_flag & ~zex_flag ) then do
   VSR[32×TX+T].dword[0] ← ConvertSPtoSP64(result)
   VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
   FPRF ← ClassSP(result)
   FR ← inc_flag
   FI ← xx_flag
end
else do
   FR ← 0b0
   FI ← 0b0
end
The result is placed into doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF is set to the class and sign of the result as represented in single-precision format. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
If a trap-enabled invalid operation exception or a trap-enabled zero divide exception occurs, VSR[XT] and FPRF are not modified, and FR and FI are set to 0.
See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.
Special Registers Altered
FPRF FR FI FX OX UX ZX XX VXSNAN VXIDI VXZDZ
VSR Data Layout for xsdivsp (bits 0:63 | bits 64:127)
src1 = VSR[XA]: DP | unused
src2 = VSR[XB]: DP | unused
tgt = VSR[XT]: DP | undefined
Rows give src1; columns give src2. Each cell lists the intermediate result v, followed by any flag that is set.
src1 \ src2 | -Infinity | -NZF | -Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
-Infinity | v ← dQNaN, vxidi_flag ← 1 | v ← +Infinity | v ← +Infinity | v ← –Infinity | v ← –Infinity | v ← dQNaN, vxidi_flag ← 1 | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
-NZF | v ← +Zero | v ← D(src1,src2) | v ← +Infinity, zx_flag ← 1 | v ← –Infinity, zx_flag ← 1 | v ← D(src1,src2) | v ← –Zero | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
-Zero | v ← +Zero | v ← +Zero | v ← dQNaN, vxzdz_flag ← 1 | v ← dQNaN, vxzdz_flag ← 1 | v ← –Zero | v ← –Zero | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
+Zero | v ← –Zero | v ← –Zero | v ← dQNaN, vxzdz_flag ← 1 | v ← dQNaN, vxzdz_flag ← 1 | v ← +Zero | v ← +Zero | v ← src2 | v ← Q(src2), vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
D(x,y) Return the normalized quotient of floating-point value x divided by floating-point value y, having unbounded range and precision.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
Programming Note
This instruction can be used to produce a
single-precision result.
src1 GPR[RA]
src2 GPR[RB]
VSR[VRT+32].bit[0] ← VSR[VRA+32].bit[0]
VSR[VRT+32].bit[1:15] ← VSR[VRB+32].dword[0].bit[49:63]
VSR[VRT+32].bit[16:127] ← VSR[VRA+32].bit[16:127]
VSR[VRB+32]
src2.dword[0] unused
VSR[VRT+32]
tgt
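The three bit-field assignments shown above (sign and significand taken from VSR[VRA+32], the 15-bit exponent taken from the low bits of doubleword 0 of VSR[VRB+32]) can be sketched on 128-bit Python integers. The function name is hypothetical, and the sketch assumes the big-endian bit numbering used throughout the pseudocode, where bit 0 is the most significant bit.

```python
MASK128 = (1 << 128) - 1
EXP_FIELD = 0x7FFF << 112   # bits 1:15 in big-endian bit numbering

def insert_exponent_qp(vra: int, vrb: int) -> int:
    """Combine sign/significand of vra with an exponent taken from vrb.

    vra, vrb: 128-bit register images, bit 0 being the most significant bit.
    """
    # VSR[VRB+32].dword[0].bit[49:63] is the low 15 bits of the upper doubleword.
    exp = (vrb >> 64) & 0x7FFF
    # Keep bit[0] (sign) and bit[16:127] (significand) of VSR[VRA+32].
    keep = vra & ~EXP_FIELD & MASK128
    return keep | (exp << 112)
```

For example, inserting the exponent 0x4000 into the quad-precision image of 1.0 (0x3FFF in the exponent field, zero fraction) yields the image of 2.0.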
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← +Zero p ← –Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← –Zero p ← +Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← –Infinity v ← src2 v ← –Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← –Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 For xsmaddadp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
For xsmaddmdp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
src3 For xsmaddadp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
For xsmaddmdp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
if(vxsnan_flag) then SetFX(VXSNAN) See part 2 of Table 78, “Actions for xsmadd(a|m)sp,”
if(vximz_flag) then SetFX(VXIMZ) on page 577.
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX) The intermediate result is rounded to single-precision
if(ux_flag) then SetFX(UX)
using the rounding mode specified by RN.
if(xx_flag) then SetFX(XX)
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← +Zero p ← –Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← –Zero p ← +Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← –Infinity v ← src2 v ← –Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← –Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
p
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN &
v←p
src1 is a v←p v←p v←p v←p v←p v←p v←p
vxsnan_flag ← 1
NaN
QNaN &
v ← Q(src2)
src1 not a v ← p v←p v←p v←p v←p v←p v ← src2
vxsnan_flag ← 1
NaN
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 For xsmaddasp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
For xsmaddmsp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
src3 For xsmaddasp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
For xsmaddmsp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
VSX Scalar Multiply-Add Quad-Precision [using round to Odd] X-form
xsmaddqp VRT,VRA,VRB (RO=0)
xsmaddqpo VRT,VRA,VRB (RO=1)
63 VRT VRA VRB 388 RO
0 6 11 16 21 31
if MSR.VSX=0 then VSX_Unavailable()
reset_xflags()
Let src1 be the floating-point value in VSR[VRA+32] represented in quad-precision format.
Let src2 be the floating-point value in VSR[VRT+32] represented in quad-precision format.
Let src3 be the floating-point value in VSR[VRB+32] represented in quad-precision format.
If either src1, src2, or src3 is a Signalling NaN, an Invalid Operation exception occurs and VXSNAN is set to 1.
If src1 is an Infinity value and src3 is a Zero value, or if src1 is a Zero value and src3 is an Infinity value, an Invalid Operation exception occurs and VXIMZ is set to 1.
If src2 and the product of src1 and src3 are Infinity values having opposite signs, an Invalid Operation exception occurs and VXISI is set to 1.
Otherwise, if src1 is a Quiet NaN, the result is src1.
Otherwise, if src2 is a Signalling NaN, the result is the Quiet NaN corresponding to src2.
Otherwise, if src2 is a Quiet NaN, the result is src2.
Otherwise, if src3 is a Signalling NaN, the result is the Quiet NaN corresponding to src3.
Otherwise, if src3 is a Quiet NaN, the result is src3.
   If the intermediate result is Tiny (i.e., the unbiased exponent is less than -16382) and UE=0, the significand is shifted right N bits, where N is the difference between -16382 and the unbiased exponent of the intermediate result. The exponent of the intermediate result is set to the value -16382.
   If RO=1, let the rounding mode be Round to Odd. Otherwise, let the rounding mode be specified by RN. Unless the result is an Infinity or a Zero, the intermediate result is rounded to quad-precision using the specified rounding mode.
   See Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516.
The result is placed into VSR[VRT+32] in quad-precision format.
VSR Data Layout for xsmaddqp[o]: src1 = VSR[VRA+32], src2 = VSR[VRT+32], src3 = VSR[VRB+32], tgt = VSR[VRT+32].
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN
–Infinity p ← +Infinity p ← –Infinity
vximz_flag ← 1
p← p←
–NZF
mul(src1,src3) mul(src1,src3)
p← p←
+NZF
mul(src1,src3) mul(src1,src3)
p ← dQNaN
+Infinity p ← –Infinity p ← +Infinity
vximz_flag ← 1
p ← src1
QNaN p ← src1
vxsnan_flag ← 1
p ← quiet(src1)
SNaN
vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN
–Infinity v ← –Infinity
vxisi_flag ← 1
–NZF v ← add(p,src2) v←p v ← add(p,src2)
Explanation:
src1 The quad-precision floating-point value in VSR[VRA+32].
src2 The quad-precision floating-point value in VSR[VRT+32].
src3 The quad-precision floating-point value in VSR[VRB+32].
dQNaN Default quiet NaN (0x7FFF_8000_0000_0000_0000_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
quiet(x) Return a QNaN with the payload of x.
add(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
mul(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
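The Tiny-result shift and the Round to Odd mode used by the quad-precision instructions can be illustrated on integer significands. This is a sketch under stated assumptions: the helper names are hypothetical, the significand is modelled as a non-negative integer carrying extra low-order bits, and Round to Odd is taken in its usual sense (truncate, then set the least significant kept bit to 1 if any discarded bit was nonzero).

```python
def tiny_shift(unbiased_exp: int, emin: int = -16382) -> int:
    """Number of bits N the significand is shifted right when the result is Tiny."""
    return max(0, emin - unbiased_exp)

def round_to_odd(sig: int, drop_bits: int):
    """Drop the low drop_bits bits of sig using Round to Odd.

    Returns (rounded significand, inexact flag).
    """
    kept = sig >> drop_bits
    inexact = (sig & ((1 << drop_bits) - 1)) != 0
    if inexact:
        kept |= 1   # force the result odd so the lost bits stay visible
    return kept, inexact
```

A Tiny intermediate result simply discards N more bits than a normal one, for example `round_to_odd(sig, guard_bits + tiny_shift(exp))`, where `guard_bits` is a hypothetical count of extra precision bits held by the model. Because an inexact result is always odd, a later rounding to a narrower format cannot suffer a double-rounding error, which is the usual motivation for this mode.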
Programming Note
This instruction can be used to operate on
single-precision source operands.
src2
–Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
T(Q(src2))
–Infinity T(src1) T(src2) T(src2) T(src2) T(src2) T(src2) T(src1)
fx(VXSNAN)
T(Q(src2))
–NZF T(src1) T(M(src1,src2)) T(src2) T(src2) T(src2) T(src2) T(src1)
fx(VXSNAN)
T(Q(src2))
–Zero T(src1) T(src1) T(src1) T(src2) T(src2) T(src2) T(src1)
fx(VXSNAN)
T(Q(src2))
+Zero T(src1) T(src1) T(src1) T(src1) T(src2) T(src2) T(src1)
fx(VXSNAN)
src1
T(Q(src2))
+NZF T(src1) T(src1) T(src1) T(src1) T(M(src1,src2)) T(src2) T(src1)
fx(VXSNAN)
T(Q(src2))
+Infinity T(src1) T(src1) T(src1) T(src1) T(src1) T(src1) T(src1)
fx(VXSNAN)
T(src1)
QNaN T(src2) T(src2) T(src2) T(src2) T(src2) T(src2) T(src1)
fx(VXSNAN)
T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1))
SNaN
fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
Q(x) Return a QNaN with the payload of x.
M(x,y) Return the greater of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCR.VXSNAN. If VE=1, update of VSR[XT] is suppressed.
Programming Note
xsmaxcdp can be used to implement the C/C++/Java conditional operation (x>y)?x:y for single-precision and
double-precision arguments.
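The conditional expression named in the note has well-defined behaviour for NaN and signed-zero operands, which the following sketch makes explicit. It models only the selection rule `(x>y)?x:y` in Python; exception flags such as VXSNAN are not modelled, and the function name is hypothetical.

```python
import math

def max_c(x: float, y: float) -> float:
    """C-style conditional maximum: (x > y) ? x : y."""
    # Any comparison involving a NaN is false, so y is selected in that case.
    return x if x > y else y
```

Because `x > y` is false whenever either operand is a NaN, and also when the operands are zeros of opposite sign, the second operand is returned in all of those cases. This differs from an IEEE-style maximum, which prefers the non-NaN operand.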
src2
–Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
T(src2)
–Infinity T(src2) T(src2) T(src2) T(src2) T(src2) T(src2) T(src2)
fx(VXSNAN)
T(src2)
–NZF T(src1) T(M(src1,src2)) T(src2) T(src2) T(src2) T(src2) T(src2)
fx(VXSNAN)
T(src2)
–Zero T(src1) T(src1) T(src2) T(src2) T(src2) T(src2) T(src2)
fx(VXSNAN)
T(src2)
+Zero T(src1) T(src1) T(src2) T(src2) T(src2) T(src2) T(src2)
fx(VXSNAN)
src1
T(src2)
+NZF T(src1) T(src1) T(src1) T(src1) T(M(src1,src2)) T(src2) T(src2)
fx(VXSNAN)
T(src2)
+Infinity T(src1) T(src1) T(src1) T(src1) T(src1) T(src2) T(src2)
fx(VXSNAN)
T(src2)
QNaN T(src2) T(src2) T(src2) T(src2) T(src2) T(src2) T(src2)
fx(VXSNAN)
T(src2) T(src2) T(src2) T(src2) T(src2) T(src2) T(src2) T(src2)
SNaN
fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
M(x,y) Return the greater of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, VXSNAN. If VE=1, update of VSR[XT] is suppressed.
else if (src2.type=“SNaN”) | (src2.type=“QNaN”) then Otherwise, if src1 is a Zero and src2 is a Zero and
result ← VSR[32×BX+B].dword[0] either src1 or src2 is a +Zero, the result is +Zero.
else if (src1.type=“Zero”) & (src2.type=“Zero”) then
Otherwise, if src1 is a -Zero and src2 is a -Zero, the
if (src1.sign=0) | (src2.sign=0) then
result ← 0x0000_0000_0000_0000 // +Zero
result is -Zero.
else
result ← 0x8000_0000_0000_0000 // -Zero Otherwise, if src1 is greater than src2, result is src1.
src2
–Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
T(src2)
–Infinity T(-INF) T(src2) T(src2) T(src2) T(src2) T(src2) T(src2)
fx(VXSNAN)
T(src2)
–NZF T(src1) T(M(src1,src2)) T(src2) T(src2) T(src2) T(src2) T(src2)
fx(VXSNAN)
T(src2)
–Zero T(src1) T(src1) T(-Zero) T(+Zero) T(src2) T(src2) T(src2)
fx(VXSNAN)
T(src2)
+Zero T(src1) T(src1) T(+Zero) T(+Zero) T(src2) T(src2) T(src2)
fx(VXSNAN)
src1
T(src2)
+NZF T(src1) T(src1) T(src1) T(src1) T(M(src1,src2)) T(src2) T(src2)
fx(VXSNAN)
T(src2)
+Infinity T(src1) T(src1) T(src1) T(src1) T(src1) T(+INF) T(src2)
fx(VXSNAN)
T(src1)
QNaN T(src1) T(src1) T(src1) T(src1) T(src1) T(src1) T(src1)
fx(VXSNAN)
T(src1) T(src1) T(src1) T(src1) T(src1) T(src1) T(src1) T(src1)
SNaN
fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
M(x,y) Return the greater of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, VXSNAN. If VE=1, update of VSR[XT] is suppressed.
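The selection rules in the pseudocode fragment and table above (a NaN in src1 wins, then a NaN in src2, then the signed-zero rule, then the ordinary comparison) can be sketched as follows. The function name is hypothetical, VXSNAN signalling is not modelled, and Python floats stand in for the double-precision operands.

```python
import math

def max_j(src1: float, src2: float) -> float:
    """Maximum with NaN propagation and +Zero preferred over -Zero."""
    if math.isnan(src1):
        return src1                      # src1 NaN is returned as-is
    if math.isnan(src2):
        return src2
    if src1 == 0.0 and src2 == 0.0:
        # +Zero if either operand is +Zero, otherwise -Zero.
        either_positive = (math.copysign(1.0, src1) > 0
                           or math.copysign(1.0, src2) > 0)
        return 0.0 if either_positive else -0.0
    return src1 if src1 > src2 else src2
```

Unlike the conditional-operator form `(x>y)?x:y`, this rule returns a NaN whenever either operand is a NaN and never returns -Zero when a +Zero operand is present.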
Programming Note
This instruction can be used to operate on
single-precision source operands.
src2
–Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
T(Q(src2))
–Infinity T(src1) T(src1) T(src1) T(src1) T(src1) T(src1) T(src1)
fx(VXSNAN)
T(Q(src2))
–NZF T(src2) T(M(src1,src2)) T(src1) T(src1) T(src1) T(src1) T(src1)
fx(VXSNAN)
T(Q(src2))
–Zero T(src2) T(src2) T(src1) T(src1) T(src1) T(src1) T(src1)
fx(VXSNAN)
T(Q(src2))
+Zero T(src2) T(src2) T(src2) T(src1) T(src1) T(src1) T(src1)
fx(VXSNAN)
src1
T(Q(src2))
+NZF T(src2) T(src2) T(src2) T(src2) T(M(src1,src2)) T(src1) T(src1)
fx(VXSNAN)
T(Q(src2))
+Infinity T(src2) T(src2) T(src2) T(src2) T(src2) T(src1) T(src1)
fx(VXSNAN)
T(src1)
QNaN T(src2) T(src2) T(src2) T(src2) T(src2) T(src2) T(src1)
fx(VXSNAN)
T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1)) T(Q(src1))
SNaN
fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN) fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
Q(x) Return a QNaN with the payload of x.
M(x,y) Return the lesser of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCR.VXSNAN. If VE=1, update of VSR[XT] is suppressed.
Programming Note
xsmincdp can be used to implement the C/C++/Java conditional operator (x<y)?x:y for single-precision and
double-precision arguments.
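The conditional-operator equivalence in the note above can be illustrated with a minimal Python sketch. The function name `xsmincdp_sketch` is hypothetical, and the sketch models only the value selection: it does not model VXSNAN, the VE-dependent suppression of the target update, or the undefined contents of doubleword element 1.

```python
import math


def xsmincdp_sketch(src1: float, src2: float) -> float:
    """Value selection only: (src1 < src2) ? src1 : src2."""
    # Any ordered comparison with a NaN is false, so src2 is selected
    # whenever either operand is a NaN. -0.0 < +0.0 is also false, so
    # two zeros of either sign select src2.
    return src1 if src1 < src2 else src2


if __name__ == "__main__":
    print(xsmincdp_sketch(1.0, 2.0))
    print(xsmincdp_sketch(float("nan"), 1.0))
    print(math.copysign(1.0, xsmincdp_sketch(-0.0, 0.0)))
```

Because the comparison is false for any NaN operand, the selection is not symmetric in its arguments: a NaN in src1 is discarded in favor of src2, while a NaN in src2 is returned.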
Rows are src1, columns are src2.
src1 \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | T(src2) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
–NZF | T(src2) | T(M(src1,src2)) | T(src1) | T(src1) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
–Zero | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
+Zero | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
+NZF | T(src2) | T(src2) | T(src2) | T(src2) | T(M(src1,src2)) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
+Infinity | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2), fx(VXSNAN)
QNaN | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2), fx(VXSNAN)
SNaN | T(src2), fx(VXSNAN) | T(src2), fx(VXSNAN) | T(src2), fx(VXSNAN) | T(src2), fx(VXSNAN) | T(src2), fx(VXSNAN) | T(src2), fx(VXSNAN) | T(src2), fx(VXSNAN) | T(src2), fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
M(x,y) Return the lesser of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, VXSNAN. If VE=1, update of VSR[XT] is suppressed.
else if (src2.type=“SNaN”) | (src2.type=“QNaN”) then
   result ← VSR[32×BX+B].dword[0]
else if (src1.type=“Zero”) & (src2.type=“Zero”) then
   if (src1.sign=1) | (src2.sign=1) then
      result ← 0x8000_0000_0000_0000 // -Zero
   else
      result ← 0x0000_0000_0000_0000 // +Zero

Otherwise, if src1 is a Zero and src2 is a Zero and either src1 or src2 is a -Zero, the result is -Zero.

Otherwise, if src1 is a +Zero and src2 is a +Zero, the result is +Zero.

Otherwise, if src1 is less than src2, result is src1.
Rows are src1, columns are src2.
src1 \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | T(-INF) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
–NZF | T(src2) | T(M(src1,src2)) | T(src1) | T(src1) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
–Zero | T(src2) | T(src2) | T(-Zero) | T(-Zero) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
+Zero | T(src2) | T(src2) | T(-Zero) | T(+Zero) | T(src1) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
+NZF | T(src2) | T(src2) | T(src2) | T(src2) | T(M(src1,src2)) | T(src1) | T(src2) | T(src2), fx(VXSNAN)
+Infinity | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(+INF) | T(src2) | T(src2), fx(VXSNAN)
QNaN | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1), fx(VXSNAN)
SNaN | T(src1), fx(VXSNAN) | T(src1), fx(VXSNAN) | T(src1), fx(VXSNAN) | T(src1), fx(VXSNAN) | T(src1), fx(VXSNAN) | T(src1), fx(VXSNAN) | T(src1), fx(VXSNAN) | T(src1), fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
NZF Nonzero finite number.
M(x,y) Return the lesser of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element 0 of VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are undefined.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, VXSNAN. If VE=1, update of VSR[XT] is suppressed.
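The actions in the table above (a NaN in src1 is returned as-is, otherwise a NaN in src2 is returned, two zeros give -Zero if either is negative, otherwise the lesser operand) can be sketched in Python over raw 64-bit patterns. This is an illustrative model, not the architected RTL: the function and constant names are hypothetical, and the VE-dependent suppression of the target update is left to the caller.

```python
import struct

SIGN = 0x8000_0000_0000_0000
EXP = 0x7FF0_0000_0000_0000
FRAC = 0x000F_FFFF_FFFF_FFFF
QUIET = 0x0008_0000_0000_0000  # most significant fraction bit


def is_nan(bits: int) -> bool:
    return (bits & EXP) == EXP and (bits & FRAC) != 0


def is_snan(bits: int) -> bool:
    return is_nan(bits) and (bits & QUIET) == 0


def is_zero(bits: int) -> bool:
    return (bits & (EXP | FRAC)) == 0


def to_float(bits: int) -> float:
    # Only called for non-NaN patterns in this sketch.
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def min_j_sketch(src1: int, src2: int):
    """Return (result_bits, vxsnan_flag) for two double-precision bit patterns."""
    vxsnan_flag = is_snan(src1) or is_snan(src2)
    if is_nan(src1):
        result = src1                      # QNaN and SNaN rows: T(src1)
    elif is_nan(src2):
        result = src2                      # QNaN and SNaN columns: T(src2)
    elif is_zero(src1) and is_zero(src2):
        result = SIGN if (src1 | src2) & SIGN else 0
    elif to_float(src1) < to_float(src2):
        result = src1
    else:
        result = src2
    return result, vxsnan_flag
```

A caller modeling the full instruction would skip the register write when both `vxsnan_flag` and VE are 1, as stated in the VXSNAN entry of the Explanation.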
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← +Zero p ← –Zero p ← M(src1,src3) p ← –Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← –Zero p ← +Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← +Infinity v ← –src2 v ← Rezd v ← –Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← +Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 For xsmsubadp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
For xsmsubmdp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
src3 For xsmsubadp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
For xsmsubmdp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
if(vxsnan_flag) then SetFX(VXSNAN)
if(vximz_flag) then SetFX(VXIMZ)
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
vex_flag ← VE & (vxsnan_flag | vximz_flag | vxisi_flag)

See part 2 of Table 87, “Actions for xsmsub(a|m)sp”.

The intermediate result is rounded to single-precision using the rounding mode specified by RN.

See Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516.
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← +Zero p ← –Zero p ← M(src1,src3) p ← –Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← –Zero p ← +Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← +Infinity v ← –src2 v ← Rezd v ← –Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← +Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
p
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN &
v←p
src1 is a v←p v←p v←p v←p v←p v←p v←p
vxsnan_flag ← 1
NaN
QNaN &
v ← Q(src2)
src1 not a v ← p v←p v←p v←p v←p v←p v ← src2
vxsnan_flag ← 1
NaN
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 For xsmsubasp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
For xsmsubmsp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
src3 For xsmsubasp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
For xsmsubmsp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
VSX Scalar Multiply-Subtract Quad-Precision [using round to Odd] X-form

xsmsubqp VRT,VRA,VRB (RO=0)
xsmsubqpo VRT,VRA,VRB (RO=1)

63 VRT VRA VRB 420 RO
0 6 11 16 21 31

if MSR.VSX=0 then VSX_Unavailable()
reset_xflags()

Otherwise, if src1 is a Quiet NaN, the result is src1.

Otherwise, if src2 is a Signalling NaN, the result is the Quiet NaN corresponding to src2.

Otherwise, if src2 is a Quiet NaN, the result is src2.

Otherwise, if src3 is a Signalling NaN, the result is the Quiet NaN corresponding to src3.

Otherwise, if src3 is a Quiet NaN, the result is src3.
VSR Data Layout for xsmsubqp[o]
src1 = VSR[VRA+32]
src2 = VSR[VRT+32]
src3 = VSR[VRB+32]
tgt = VSR[VRT+32]
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN
–Infinity p ← +Infinity p ← –Infinity
vximz_flag ← 1
p← p←
–NZF
mul(src1,src3) mul(src1,src3)
p← p←
+NZF
mul(src1,src3) mul(src1,src3)
p ← dQNaN
+Infinity p ← –Infinity p ← +Infinity
vximz_flag ← 1
p ← src1
QNaN p ← src1
vxsnan_flag ← 1
p ← quiet(src1)
SNaN
vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN
–Infinity v ← –Infinity
vxisi_flag ← 1
v ← dQNaN
+Infinity v ← +Infinity
vxisi_flag ← 1
QNaN & v←p
src1 is a NaN vxsnan_flag ← 1
v←p
QNaN & v ← quiet(src2)
v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The quad-precision floating-point value in VSR[VRA+32].
src2 The quad-precision floating-point value in VSR[VRT+32].
src3 The quad-precision floating-point value in VSR[VRB+32].
dQNaN Default quiet NaN (0x7FFF_8000_0000_0000_0000_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
quiet(x) Return a QNaN with the payload of x.
sub(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
mul(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
VSX Scalar Multiply Double-Precision XX3-form

xsmuldp XT,XA,XB

60 T A B 48 AX BX TX
0 6 11 16 21 29 30 31

XT ← TX || T
XA ← AX || A
XB ← BX || B
reset_xflags()
src1 ← VSR[XA]{0:63}
src2 ← VSR[XB]{0:63}
v{0:inf} ← MultiplyFP(src1,src2)
result{0:63} ← RoundToDP(RN,v)
if(vxsnan_flag) then SetFX(VXSNAN)
if(vximz_flag) then SetFX(VXIMZ)
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
vex_flag ← VE & (vxsnan_flag | vximz_flag)
if( ~vex_flag ) then do
   VSR[XT] ← result || 0xUUUU_UUUU_UUUU_UUUU
   FPRF ← ClassDP(result)
   FR ← inc_flag
   FI ← xx_flag
end
else do
   FR ← 0b0
   FI ← 0b0
end

The contents of doubleword element 1 of VSR[XT] are undefined.

FPRF is set to the class and sign of the result. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.

If a trap-enabled invalid operation exception occurs, VSR[XT] and FPRF are not modified, and FR and FI are set to 0.

See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.

Special Registers Altered
FPRF FR FI FX OX UX XX VXSNAN VXIMZ

VSR Data Layout for xsmuldp
src1 = VSR[XA]: DP (bits 0:63), unused (bits 64:127)
src2 = VSR[XB]: DP (bits 0:63), unused (bits 64:127)
tgt = VSR[XT]: DP (bits 0:63), undefined (bits 64:127)
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← dQNaN v ← Q(src2)
-Infinity v ← +Infinity v ← +Infinity v ← –Infinity v ← –Infinity v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← +Infinity v ← M(src1,src2) v ← +Zero v ← –Zero v ← M(src1,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
-Zero v ← +Zero v ← +Zero v ← –Zero v ← –Zero v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
+Zero v ← –Zero v ← –Zero v ← +Zero v ← +Zero v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← –Infinity v ← M(src1,src2) v ← –Zero v ← +Zero v ← M(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
+Infinity v ← –Infinity v ← –Infinity v ← +Infinity v ← +Infinity v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
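The special-operand handling in the xsmuldp action table can be sketched in Python over 64-bit patterns. This is a minimal illustration under stated assumptions: the names `classify` and `multiply_actions` are hypothetical, the sign of an Infinity or Zero product is taken as the exclusive-OR of the operand signs (the usual IEEE 754 rule), and the exact product M(src1,src2) is modeled with `fractions.Fraction` to stand in for "unbounded range and precision". Rounding and the OX, UX and XX conditions are not modeled.

```python
import struct
from fractions import Fraction

SIGN_BIT = 0x8000_0000_0000_0000
EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000
DQNAN = 0x7FF8_0000_0000_0000  # default quiet NaN from the Explanation


def classify(bits: int) -> str:
    exp, frac = bits & EXP_MASK, bits & FRAC_MASK
    if exp == EXP_MASK:
        if frac == 0:
            return "Infinity"
        return "QNaN" if frac & QUIET_BIT else "SNaN"
    return "Zero" if exp == 0 and frac == 0 else "NZF"


def exact(bits: int) -> Fraction:
    # Exact rational value of a finite double-precision pattern.
    return Fraction(struct.unpack(">d", struct.pack(">Q", bits))[0])


def multiply_actions(src1: int, src2: int):
    """Return (v, vxsnan_flag, vximz_flag); v is a (kind, payload) pair."""
    c1, c2 = classify(src1), classify(src2)
    vxsnan = "SNaN" in (c1, c2)
    if c1 == "SNaN":
        return ("NaN", src1 | QUIET_BIT), vxsnan, False   # v <- Q(src1)
    if c1 == "QNaN":
        return ("NaN", src1), vxsnan, False               # v <- src1
    if c2 == "SNaN":
        return ("NaN", src2 | QUIET_BIT), vxsnan, False   # v <- Q(src2)
    if c2 == "QNaN":
        return ("NaN", src2), vxsnan, False               # v <- src2
    if {c1, c2} == {"Infinity", "Zero"}:
        return ("NaN", DQNAN), False, True                # v <- dQNaN
    negative = bool((src1 ^ src2) & SIGN_BIT)
    if "Infinity" in (c1, c2):
        return ("Infinity", negative), False, False
    if "Zero" in (c1, c2):
        return ("Zero", negative), False, False
    return ("NZF", exact(src1) * exact(src2)), False, False
```

Keeping NaN operands as integers rather than Python floats avoids relying on the host preserving signalling NaN payloads when values pass through floating-point registers.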
VSX Scalar Multiply Quad-Precision [using round to Odd] X-form

xsmulqp VRT,VRA,VRB (RO=0)
xsmulqpo VRT,VRA,VRB (RO=1)

63 VRT VRA VRB 36 RO
0 6 11 16 21 31

if MSR.VSX=0 then VSX_Unavailable()

reset_xflags()

src1 ← bfp_CONVERT_FROM_BFP128(VSR[VRA+32])
src2 ← bfp_CONVERT_FROM_BFP128(VSR[VRB+32])
v ← bfp_MULTIPLY(src1, src2)
rnd ← bfp_ROUND_TO_BFP128(RO, FPSCR.RN, v)
result ← bfp_CONVERT_TO_BFP128(rnd)

if(vxsnan_flag) then SetFX(FPSCR.VXSNAN)
if(vximz_flag) then SetFX(FPSCR.VXIMZ)
if(ox_flag) then SetFX(FPSCR.OX)
if(ux_flag) then SetFX(FPSCR.UX)
if(xx_flag) then SetFX(FPSCR.XX)

vx_flag ← vxsnan_flag | vximz_flag
ex_flag ← FPSCR.VE & vx_flag

if ex_flag=0 then do
   VSR[VRT+32] ← result
   FPSCR.FPRF ← fprf_CLASS_BFP128(result)
end
FPSCR.FR ← (vx_flag=0) & inc_flag
FPSCR.FI ← (vx_flag=0) & xx_flag

Let src1 be the floating-point value in VSR[VRA+32] represented in quad-precision format.

Let src2 be the floating-point value in VSR[VRB+32] represented in quad-precision format.

If either src1 or src2 is a Signalling NaN, an Invalid Operation exception occurs and VXSNAN is set to 1.

If src1 is an Infinity value and src2 is a Zero value, or if src1 is a Zero value and src2 is an Infinity value, an Invalid Operation exception occurs and VXIMZ is set to 1.

If src1 is a Signalling NaN, the result is the Quiet NaN corresponding to src1.

Otherwise, if src1 is an Infinity value and src2 is a Zero value, or if src1 is a Zero value and src2 is an Infinity value, the result is the default Quiet NaN[1].

Otherwise, do the following.

   The normalized product of src1 multiplied by src2 is produced with unbounded significand precision and exponent range.

   See Table 90. "Actions for xsmulqp[o]".

   If the intermediate result is Tiny (i.e., the unbiased exponent is less than -16382) and UE=0, the significand is shifted right N bits, where N is the difference between -16382 and the unbiased exponent of the intermediate result. The exponent of the intermediate result is set to the value -16382.

   If RO=1, let the rounding mode be Round to Odd. Otherwise, let the rounding mode be specified by RN. Unless the result is an Infinity or a Zero, the intermediate result is rounded to quad-precision using the specified rounding mode.

   See Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516.

The result is placed into VSR[VRT+32] in quad-precision format.

FPRF is set to the class and sign of the result. FR is set to indicate if the rounded result was incremented. FI is set to indicate the result is inexact.

If a trap-disabled Invalid Operation exception occurs, FR and FI are set to 0.

If a trap-enabled Invalid Operation exception occurs, VSR[VRT+32] and FPRF are not modified, and FR and FI are set to 0.

See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.

Special Registers Altered:
FPRF FR FI FX VXSNAN VXIMZ OX UX XX
VSR Data Layout for xsmulqp[o]
src1 = VSR[VRA+32]
src2 = VSR[VRB+32]
tgt = VSR[VRT+32]
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN
-Infinity v ← +Infinity v ← –Infinity
vximz_flag ← 1
v ← dQNaN
+Infinity v ← –Infinity v ← +Infinity
vximz_flag ← 1
v ← src1
QNaN v ← src1
vxsnan_flag ← 1
v ← quiet(src1)
SNaN
vxsnan_flag ← 1
Explanation:
src1 The quad-precision floating-point value in VSR[VRA+32].
src2 The quad-precision floating-point value in VSR[VRB+32].
dQNaN Default quiet NaN (0x7FFF_8000_0000_0000_0000_0000_0000_0000).
NZF Nonzero finite number.
mul(x,y) The floating-point value x is multiplied[1] by the floating-point value y. Return the normalized product, having unbounded significand
precision and exponent range.
quiet(x) Convert x to the corresponding Quiet NaN.
v The intermediate result having unbounded significand precision and unbounded exponent range.
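The Tiny-result shift and the Round to Odd mode used by xsmulqpo can be illustrated with integer significands. The sketch below is an assumption-laden model, not the architected bfp_ROUND_TO_BFP128 routine: the helper names are hypothetical, the significand is treated as a plain non-negative integer, and Round to Odd is taken in its usual sense (truncate, then force the least-significant kept bit to 1 if any discarded bit was nonzero).

```python
EMIN_QP = -16382  # smallest unbiased exponent of a normal quad-precision value


def denormalize_tiny(significand: int, exponent: int):
    """Shift a Tiny intermediate right by N = EMIN_QP - exponent bits.

    Returns (significand, exponent, sticky), where sticky records whether
    any nonzero bit was shifted out of this fixed-width integer model.
    """
    if exponent >= EMIN_QP:
        return significand, exponent, False
    n = EMIN_QP - exponent
    lost = significand & ((1 << n) - 1)
    return significand >> n, EMIN_QP, lost != 0


def round_to_odd(significand: int, extra_bits: int, sticky: bool = False) -> int:
    """Drop extra_bits low-order bits; if the result is inexact, set the LSB."""
    dropped = significand & ((1 << extra_bits) - 1)
    kept = significand >> extra_bits
    if dropped or sticky:
        kept |= 1
    return kept


if __name__ == "__main__":
    sig, exp, sticky = denormalize_tiny(0b1011, -16384)
    print(bin(sig), exp, sticky)
    print(bin(round_to_odd(0b10010, 2)))
```

Round to Odd is useful because a result rounded this way can later be rounded again to a narrower format without introducing a double-rounding error, which is the usual reason for pairing it with a subsequent conversion.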
VSX Scalar Multiply Single-Precision XX3-form

if(vxsnan_flag) then SetFX(VXSNAN)
if(vximz_flag) then SetFX(VXIMZ)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
vex_flag ← VE & (vxsnan_flag | vximz_flag)

The result is placed into doubleword element 0 of VSR[XT] in double-precision format.

See Table 59, “VSX Scalar Floating-Point Final Result,” on page 517.

Special Registers Altered
FPRF FR FI FX OX UX XX VXSNAN VXIMZ
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← dQNaN v ← Q(src2)
-Infinity v ← +Infinity v ← +Infinity v ← –Infinity v ← –Infinity v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← +Infinity v ← M(src1,src2) v ← +Zero v ← –Zero v ← M(src1,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
-Zero v ← +Zero v ← +Zero v ← –Zero v ← –Zero v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
+Zero v ← –Zero v ← –Zero v ← +Zero v ← +Zero v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← –Infinity v ← M(src1,src2) v ← –Zero v ← +Zero v ← M(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
+Infinity v ← –Infinity v ← –Infinity v ← +Infinity v ← +Infinity v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
Programming Note
This instruction can be used to operate on a
single-precision source operand.
The contents of doubleword element 1 of VSR[XT] are undefined.

VSR Data Layout for xsnegqp
src = VSR[VRB+32]
tgt = VSR[VRT+32]
VSR Data Layout for xsnegdp
src = VSR[XB]
DP unused
tgt = VSR[XT]
DP undefined
0 64 127
Programming Note
This instruction can be used to operate on a
single-precision source operand.
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← +Zero p ← –Zero p ← M(src1,src3) p ← –Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← –Zero p ← +Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← –Infinity v ← src2 v ← –Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← –Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 For xsnmaddadp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
For xsnmaddmdp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
src3 For xsnmaddadp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
For xsnmaddmdp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
Is q inexact? (q ≠ v)
vxsnan_flag
vximz_flag
vxisi_flag
OE
UE
VE
XE
Case ZE Returned Results and Status Setting
Explanation:
– The results do not depend on this condition.
ClassFP(x) Classifies the floating-point value x as defined in Table 2, “Floating-Point Result Flags,” on page 373.
fx(x) FX is set to 1 if x=0. x is set to 1.
β Wrap adjust, where β = 2^1536 for double-precision and β = 2^192 for single-precision.
q The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target
precision, unbounded exponent range.
r The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target
precision, bounded exponent range.
v The precise intermediate result defined in the instruction having unbounded significand precision, unbounded exponent range.
FI Floating-Point Fraction Inexact status flag, FPSCR.FI. This status flag is nonsticky.
FR Floating-Point Fraction Rounded status flag, FPSCR.FR.
OX Floating-Point Overflow Exception status flag, FPSCR.OX.
error() The system error handler is invoked for the trap-enabled exception if the FE0 and FE1 bits in the Machine State Register are set
to any mode other than the ignore-exception mode.
N(x) The value x is negated by complementing the sign bit of x.
T(x) The value x is placed in element 0 of VSR[XT] in the target precision format.
The contents of the remaining element(s) of VSR[XT] are undefined.
UX Floating-Point Underflow Exception status flag, FPSCR.UX.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCR.VXSNAN.
VXIMZ Floating-Point Invalid Operation Exception (Infinity × Zero) status flag, FPSCR.VXIMZ.
VXISI Floating-Point Invalid Operation Exception (Infinity – Infinity) status flag, FPSCR.VXISI.
XX Floating-Point Inexact Exception status flag, FPSCR.XX. The flag is a sticky version of FPSCR.FI. When FPSCR.FI is set to a new
value, the new value of FPSCR.XX is set to the result of ORing the old value of FPSCR.XX with the new value of FPSCR.FI.
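The relationship between the nonsticky FI flag and the sticky XX flag described in the last entry can be sketched as follows. The dictionary-based FPSCR model and the function name `update_inexact` are hypothetical, chosen only to illustrate the OR-accumulation.

```python
def update_inexact(fpscr: dict, new_fi: int) -> None:
    """Set FI to its new value and OR that value into the sticky XX flag."""
    fpscr["FI"] = new_fi
    fpscr["XX"] = fpscr["XX"] | new_fi


if __name__ == "__main__":
    fpscr = {"FI": 0, "XX": 0}
    update_inexact(fpscr, 1)   # an inexact result
    update_inexact(fpscr, 0)   # a later exact result clears FI only
    print(fpscr)
```

After the second call FI reflects only the most recent operation, while XX still records that some earlier operation was inexact.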
Is q inexact? (q ≠ v)
vxsnan_flag
vximz_flag
vxisi_flag
OE
UE
VE
XE
ZE
Case Returned Results and Status Setting
Explanation:
– The results do not depend on this condition.
ClassFP(x) Classifies the floating-point value x as defined in Table 2, “Floating-Point Result Flags,” on page 373.
fx(x) FX is set to 1 if x=0. x is set to 1.
β Wrap adjust, where β = 2^1536 for double-precision and β = 2^192 for single-precision.
q The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target
precision, unbounded exponent range.
r The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target
precision, bounded exponent range.
v The precise intermediate result defined in the instruction having unbounded significand precision, unbounded exponent range.
FI Floating-Point Fraction Inexact status flag, FPSCR.FI. This status flag is nonsticky.
FR Floating-Point Fraction Rounded status flag, FPSCR.FR.
OX Floating-Point Overflow Exception status flag, FPSCR.OX.
error() The system error handler is invoked for the trap-enabled exception if the FE0 and FE1 bits in the Machine State Register are set
to any mode other than the ignore-exception mode.
N(x) The value x is negated by complementing the sign bit of x.
T(x) The value x is placed in element 0 of VSR[XT] in the target precision format.
The contents of the remaining element(s) of VSR[XT] are undefined.
UX Floating-Point Underflow Exception status flag, FPSCR.UX.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCR.VXSNAN.
VXIMZ Floating-Point Invalid Operation Exception (Infinity × Zero) status flag, FPSCR.VXIMZ.
VXISI Floating-Point Invalid Operation Exception (Infinity – Infinity) status flag, FPSCR.VXISI.
XX Floating-Point Inexact Exception status flag, FPSCR.XX. The flag is a sticky version of FPSCR.FI. When FPSCR.FI is set to a new
value, the new value of FPSCR.XX is set to the result of ORing the old value of FPSCR.XX with the new value of FPSCR.FI.
if(vxsnan_flag) then SetFX(VXSNAN)
if(vximz_flag) then SetFX(VXIMZ)
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
vex_flag ← VE & (vxsnan_flag | vximz_flag | vxisi_flag)
if( ~vex_flag ) then do
   VSR[32×TX+T].dword[0] ← ConvertToSP(result)
   VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
   FPRF ← ClassSP(result)
   FR ← inc_flag
   FI ← xx_flag
end
else do
   FR ← 0b0
   FI ← 0b0
end

See part 2 of Table 94, “Actions for xsnmadd(a|m)sp,” on page 617.

The intermediate result is rounded to single-precision using the rounding mode specified by RN.

See Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516.

The result is negated and placed into doubleword element 0 of VSR[XT] in double-precision format.

The contents of doubleword element 1 of VSR[XT] are undefined.

FPRF is set to the class and sign of the result as represented in single-precision format. FR is set to indicate if the result was incremented when rounded. FI is set to indicate the result is inexact.
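The status-update and commit step at the end of the RTL above can be sketched in Python. This is an illustrative model under assumptions: the FPSCR is a plain dictionary, the helper names are hypothetical, SetFX is assumed to behave like the fx(x) notation used in the table legends (FX is set to 1 if the flag was 0, then the flag is set to 1), and the FPRF classification and the undefined doubleword element 1 are omitted.

```python
STATUS_BITS = ("VXSNAN", "VXIMZ", "VXISI", "OX", "UX", "XX")


def new_fpscr() -> dict:
    """A zeroed model of the FPSCR fields used by this sketch."""
    return dict.fromkeys(STATUS_BITS + ("FX", "FR", "FI", "VE"), 0)


def set_fx(fpscr: dict, bit: str) -> None:
    # Assumed SetFX behavior: FX records a 0 -> 1 transition of the flag.
    if fpscr[bit] == 0:
        fpscr["FX"] = 1
    fpscr[bit] = 1


def commit(fpscr: dict, flags: dict, result, inc_flag: int):
    """Update status bits; return the value to write, or None if suppressed."""
    for bit in STATUS_BITS:
        if flags.get(bit.lower() + "_flag", 0):
            set_fx(fpscr, bit)
    invalid = any(flags.get(name, 0)
                  for name in ("vxsnan_flag", "vximz_flag", "vxisi_flag"))
    vex_flag = bool(fpscr["VE"]) and invalid
    if not vex_flag:
        fpscr["FR"] = inc_flag
        fpscr["FI"] = flags.get("xx_flag", 0)
        return result
    # Trap-enabled invalid operation: target is not modified.
    fpscr["FR"] = 0
    fpscr["FI"] = 0
    return None
```

Returning `None` for the suppressed case keeps the "VSR[XT] is not modified" rule explicit at the call site, where the caller decides whether to write the target register.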
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← +Zero p ← –Zero p ← M(src1,src3) p ← –Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← –Zero p ← +Zero p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← –Infinity v ← src2 v ← –Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← –Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
p
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN &
v←p
src1 is a v←p v←p v←p v←p v←p v←p v←p
vxsnan_flag ← 1
NaN
QNaN &
v ← Q(src2)
src1 not a v ← p v←p v←p v←p v←p v←p v ← src2
vxsnan_flag ← 1
NaN
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 For xsnmaddasp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
For xsnmaddmsp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
src3 For xsnmaddasp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
For xsnmaddmsp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
VSX Scalar Negative Multiply-Add Otherwise, if src1 is a Quiet NaN, the result is src1.
Quad-Precision [using round to Odd] X-form
Otherwise, if src2 is a Signalling NaN, the result is the
xsnmaddqp VRT,VRA,VRB (RO=0) Quiet NaN corresponding to src2.
xsnmaddqpo VRT,VRA,VRB (RO=1)
Otherwise, if src2 is a Quiet NaN, the result is src2.
63 VRT VRA VRB 452 RO
0 6 11 16 21 31 Otherwise, if src3 is a Signalling NaN, the result is the
Quiet NaN corresponding to src3.
if MSR.VSX=0 then VSX_Unavailable()
reset_xflags()
Otherwise, if src3 is a Quiet NaN, the result is src3.
Let src1 be the floating-point value in VSR[VRA+32] If the intermediate result is Tiny (i.e., the unbiased
represented in quad-precision format. exponent is less than -16382) and UE=0, the
significand is shifted right N bits, where N is the
Let src2 be the floating-point value in VSR[VRT+32] difference between -16382 and the unbiased
represented in quad-precision format. exponent of the intermediate result. The exponent
of the intermediate result is set to the value
Let src3 be the floating-point value in VSR[VRB+32] -16382.
represented in quad-precision format.
If RO=1, let the rounding mode be Round to Odd.
If either src1, src2, or src3 is a Signalling NaN, an Otherwise, let the rounding mode be specified by
Invalid Operation exception occurs and VXSNAN is set to RN. Unless the result is an Infinity or a Zero, the
1. intermediate result is rounded to quad-precision
using the specified rounding mode.
If src1 is an Infinity value and src3 is a Zero value, or if
src1 is a Zero value and src3 is an Infinity value, an See Table 58, “Scalar Floating-Point Intermediate
Invalid Operation exception occurs and VXIMZ is set to Result Handling,” on page 516.
1.
The result is negated and placed into VSR[VRT+32] in
If src2 and the product of src1 and src3 are Infinity quad-precision format.
values having opposite signs, an Invalid Operation
exception occurs and VXISI is set to 1. FPRF is set to the class and sign of the result. FR is set
to indicate if the rounded result was incremented. FI is
If src1 is a Signalling NaN, the result is the Quiet NaN set to indicate the result is inexact.
corresponding to src1.
src1 = VSR[VRA+32]
src2 = VSR[VRT+32]
src3 = VSR[VRB+32]
tgt = VSR[VRT+32]
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN
–Infinity p ← +Infinity p ← –Infinity
vximz_flag ← 1
p← p←
–NZF
Mul(src1,src3) Mul(src1,src3)
p← p←
+NZF
Mul(src1,src3) Mul(src1,src3)
p ← dQNaN
+Infinity p ← –Infinity p ← +Infinity
vximz_flag ← 1
p ← src1
QNaN p ← src1
vxsnan_flag ← 1
p ← quiet(src1)
SNaN
vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN
–Infinity v ← –Infinity
vxisi_flag ← 1
v ← dQNaN
+Infinity v ← +Infinity
vxisi_flag ← 1
QNaN & v←p
src1 is a NaN vxsnan_flag ← 1
v←p
QNaN & v ← quiet(src2)
v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The quad-precision floating-point value in VSR[VRA+32].
src2 The quad-precision floating-point value in VSR[VRT+32].
src3 The quad-precision floating-point value in VSR[VRB+32].
dQNaN Default quiet NaN (0x7FFF_8000_0000_0000_0000_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
quiet(x) Return a QNaN with the payload of x.
Add(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Mul(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
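The NaN priority described above for xsnmaddqp[o] (src1 first, then src2, then src3, with a Signalling NaN converted to the corresponding Quiet NaN) can be sketched as follows. This is a minimal illustration, not the architected RTL: the helper names `is_nan`, `is_snan` and `nan_result` are hypothetical, operands are modelled as 128-bit Python integers, and the IEEE 754 binary128 field layout is assumed. The final negation step is not modelled.

```python
# Quad-precision (binary128) field masks, assuming the IEEE 754 layout:
# 1 sign bit, 15 exponent bits, 112 fraction bits.
EXP_MASK = 0x7FFF << 112
FRAC_MASK = (1 << 112) - 1
QUIET_BIT = 1 << 111  # most-significant fraction bit


def is_nan(x):
    """True if the 128-bit pattern x encodes any NaN."""
    return (x & EXP_MASK) == EXP_MASK and (x & FRAC_MASK) != 0


def is_snan(x):
    """True if x is a NaN whose quiet bit is clear (Signalling NaN)."""
    return is_nan(x) and (x & QUIET_BIT) == 0


def quiet(x):
    """Return the Quiet NaN corresponding to x (a QNaN is unchanged)."""
    return x | QUIET_BIT


def nan_result(src1, src2, src3):
    """Return (result, vxsnan_flag); result is None if no operand is a NaN.

    The first NaN found in the order src1, src2, src3 supplies the result.
    """
    vxsnan_flag = any(is_snan(s) for s in (src1, src2, src3))
    for s in (src1, src2, src3):
        if is_nan(s):
            return quiet(s), vxsnan_flag
    return None, vxsnan_flag


# Illustrative operand patterns (hypothetical values).
ONE = 0x3FFF << 112
SNAN = EXP_MASK | 1
QNAN = EXP_MASK | QUIET_BIT | 2
```

For example, `nan_result(ONE, SNAN, QNAN)` selects src2, because src1 is not a NaN, and reports `vxsnan_flag` as set.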
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← +Infinity v ← –src2 v ← Rezd v ← –Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← +Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 For xsnmsubadp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
For xsnmsubmdp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
src3 For xsnmsubadp, the double-precision floating-point value in doubleword element 0 of VSR[XB].
For xsnmsubmdp, the double-precision floating-point value in doubleword element 0 of VSR[XT].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
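The table above forms p = M(src1,src3) and v = S(p,src2) with unbounded range and precision, so rounding happens only once. The sketch below illustrates that single-rounding property for finite, in-range operands under Round to Nearest Even only. It is an assumption-laden model, not the architected behavior: `nmsub_dp` is a hypothetical name, exact rational arithmetic stands in for the unbounded intermediate, and NaN, Infinity, zero-sign and exception handling are omitted.

```python
from fractions import Fraction


def nmsub_dp(src1, src2, src3):
    """Model -(src1*src3 - src2) with a single rounding to double.

    Only finite operands whose result is in double range are supported.
    """
    p = Fraction(src1) * Fraction(src3)  # exact product, like M(src1,src3)
    v = p - Fraction(src2)               # exact difference, like S(p,src2)
    return -float(v)                     # one rounding (nearest even), then negate


a = 1.0 + 2.0**-30
c = 1.0 - 2.0**-30

fused = nmsub_dp(a, 1.0, c)      # exact product 1 - 2**-60 survives
unfused = -((a * c) - 1.0)       # product is rounded to 1.0 first
```

With these operands the fused form keeps the 2**-60 term that a separately rounded multiply followed by a subtract would lose.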
if(vxsnan_flag) then SetFX(VXSNAN) See part 2 of Table 97, “Actions for xsnmsub(a|m)sp,”
if(vximz_flag) then SetFX(VXIMZ) on page 626.
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX) The intermediate result is rounded to single-precision
if(ux_flag) then SetFX(UX) using the rounding mode specified by RN.
if(xx_flag) then SetFX(XX)
See Table 58, “Scalar Floating-Point Intermediate
vex_flag ← VE & (vxsnan_flag | vximz_flag | vxisi_flag) Result Handling,” on page 516.
if( ~vex_flag ) then do The result is negated and placed into doubleword
VSR[32×TX+T].dword[0] ← ConvertSPtoSP64(result)
element 0 of VSR[XT] in double-precision format.
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU
FPRF ← ClassSP(result)
FR ← inc_flag
The contents of doubleword element 1 of VSR[XT] are
FI ← xx_flag undefined.
end
else do FPRF is set to the class and sign of the result as
FR ← 0b0 represented in single-precision format. FR is set to
FI ← 0b0 indicate if the result was incremented when rounded.
end FI is set to indicate the result is inexact.
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← +Infinity v ← –src2 v ← Rezd v ← –Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← +Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in VSR[XA].dword[0].
src2 For xsnmsubasp, the double-precision floating-point value in VSR[XT].dword[0].
For xsnmsubmsp, the double-precision floating-point value in VSR[XB].dword[0].
src3 For xsnmsubasp, the double-precision floating-point value in VSR[XB].dword[0].
For xsnmsubmsp, the double-precision floating-point value in VSR[XT].dword[0].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
VSX Scalar Negative Multiply-Subtract Otherwise, if src1 is a Quiet NaN, the result is src1.
Quad-Precision [using round to Odd] X-form
Otherwise, if src2 is a Signalling NaN, the result is the
xsnmsubqp VRT,VRA,VRB (RO=0) Quiet NaN corresponding to src2.
xsnmsubqpo VRT,VRA,VRB (RO=1)
Otherwise, if src2 is a Quiet NaN, the result is src2.
63 VRT VRA VRB 484 RO
0 6 11 16 21 31 Otherwise, if src3 is a Signalling NaN, the result is the
Quiet NaN corresponding to src3.
if MSR.VSX=0 then VSX_Unavailable()
reset_xflags()
Otherwise, if src3 is a Quiet NaN, the result is src3.
src1 = VSR[VRA+32]
src2 = VSR[VRT+32]
src3 = VSR[VRB+32]
tgt = VSR[VRT+32]
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN
–Infinity p ← +Infinity p ← –Infinity
vximz_flag ← 1
p← p←
–NZF
Mul(src1,src3) Mul(src1,src3)
p ← +Zero p ← –Zero
–Zero p ← +Zero p ← –Zero
p ← dQNaN p ← dQNaN p ← quiet(src3)
p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
+Zero p ← –Zero p ← +Zero
src1
p ← –Zero p ← +Zero
p← p←
+NZF
Mul(src1,src3) Mul(src1,src3)
p ← dQNaN
+Infinity p ← –Infinity p ← +Infinity
vximz_flag ← 1
p ← src1
QNaN p ← src1
vxsnan_flag ← 1
p ← quiet(src1)
SNaN
vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN
–Infinity v ← –Infinity
vxisi_flag ← 1
v ← dQNaN
+Infinity v ← +Infinity
vxisi_flag ← 1
QNaN & v←p
src1 is a NaN vxsnan_flag ← 1
v←p
QNaN & v ← quiet(src2)
v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The quad-precision floating-point value in VSR[VRA+32].
src2 The quad-precision floating-point value in VSR[VRT+32].
src3 The quad-precision floating-point value in VSR[VRB+32].
dQNaN Default quiet NaN (0x7FFF_8000_0000_0000_0000_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
quiet(x) Return a QNaN with the payload of x.
sub(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
Mul(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
60 T /// B 73 BX TX
0 6 11 16 21 30 31
XT ← TX || T
XB ← BX || B
reset_xflags()
result{0:63} ← RoundToDPIntegerNearAway(VSR[XB]{0:63})
if(vxsnan_flag) then SetFX(VXSNAN)
FR ← 0b0
FI ← 0b0
vex_flag ← VE & vxsnan_flag
Programming Note
This instruction can be used to operate on a
single-precision source operand.
Programming Note
This instruction can be used to operate on a
single-precision source operand.
VSX Scalar Round to Double-Precision Integer VSX Scalar Round to Double-Precision Integer
using round toward -Infinity XX2-form using round toward +Infinity XX2-form
xsrdpim XT,XB xsrdpip XT,XB
XT ← TX || T XT ← TX || T
XB ← BX || B XB ← BX || B
reset_xflags() reset_xflags()
result{0:63} ← RoundToDPIntegerFloor(VSR[XB]{0:63}) result{0:63} ← RoundToDPIntegerCeil(VSR[XB]{0:63})
if(vxsnan_flag) then SetFX(VXSNAN) if(vxsnan_flag) then SetFX(VXSNAN)
FR ← 0b0 FR ← 0b0
FI ← 0b0 FI ← 0b0
vex_flag ← VE & vxsnan_flag vex_flag ← VE & vxsnan_flag
Let src be the double-precision floating-point value in Let src be the double-precision floating-point value in
doubleword element 0 of VSR[XB]. doubleword element 0 of VSR[XB].
src is rounded to an integer using the rounding mode src is rounded to an integer using the rounding mode
Round toward -Infinity. Round toward +Infinity.
The result is placed into doubleword element 0 of The result is placed into doubleword element 0 of
VSR[XT] in double-precision format. VSR[XT] in double-precision format.
The contents of doubleword element 1 of VSR[XT] are The contents of doubleword element 1 of VSR[XT] are
undefined. undefined.
FPRF is set to the class and sign of the result. FR is FPRF is set to the class and sign of the result. FR is
set to 0. FI is set to 0. set to 0. FI is set to 0.
If a trap-enabled invalid operation exception occurs, If a trap-enabled invalid operation exception occurs,
VSR[XT] and FPRF are not modified, and FR and FI VSR[XT] and FPRF are not modified, and FR and FI
are set to 0. are set to 0.
VSR Data Layout for xsrdpim VSR Data Layout for xsrdpip
src = VSR[XB] src = VSR[XB]
DP unused DP unused
tgt = VSR[XT] tgt = VSR[XT]
DP undefined DP undefined
0 64 127 0 64 127
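The two roundings above (xsrdpim toward -Infinity, xsrdpip toward +Infinity) can be sketched for ordinary double-precision values as follows. The function name `round_to_dp_integer` and its `toward` parameter are hypothetical. Signalling NaN handling and the FPSCR updates are not modelled, and keeping the sign of the source on a zero result is assumed from IEEE 754 rather than stated in the text above.

```python
import math


def round_to_dp_integer(src, toward):
    """Round a double to an integral double.

    toward is "floor" (as for xsrdpim) or "ceil" (as for xsrdpip).
    """
    # NaN, Infinity and Zero operands are returned unchanged in this sketch.
    if math.isnan(src) or math.isinf(src) or src == 0.0:
        return src
    r = math.floor(src) if toward == "floor" else math.ceil(src)
    # float(r) is exact here; copysign keeps -0.0 for cases like ceil(-0.5).
    return math.copysign(float(r), src)
```

For instance, `round_to_dp_integer(-2.5, "floor")` yields -3.0, while `round_to_dp_integer(-0.5, "ceil")` yields a negative zero.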
60 T /// B 89 BX TX
0 6 11 16 21 30 31
XT ← TX || T
XB ← BX || B
reset_xflags()
result{0:63} ← RoundToDPIntegerTrunc(VSR[XB]{0:63})
if(vxsnan_flag) then SetFX(VXSNAN)
FR ← 0b0
FI ← 0b0
vex_flag ← VE & vxsnan_flag
Programming Note
This instruction can be used to operate on a
single-precision source operand.
\[ \left| \frac{\text{estimate} - \frac{1}{\text{src}}}{\frac{1}{\text{src}}} \right| \le \frac{1}{16384} \]
VSX Scalar Round to Quad-Precision Integer Let R and RMC specify the rounding mode as follows.
[with Inexact] Z23-form
FPSCR.RN
xsrqpi R,VRT,VRB,RMC (EX=0)
RMC
xsrqpix R,VRT,VRB,RMC (EX=1)
Rounding Mode
R
63 VRT /// R VRB RMC 5 EX 0 00 – Round to Nearest Away
0 6 11 15 16 21 23 31
0 01 – reserved
if MSR.VSX=0 then VSX_Unavailable() 0 10 – reserved
0 11 00 Round to Nearest Even
reset_xflags()
0 11 01 Round towards Zero
if R=0 then do 0 11 10 Round towards +Infinity
if RMC=0b00 then // Round to Nearest Away 0 11 11 Round towards -Infinity
rmode ← 0b100 1 00 – Round to Nearest Even
if RMC=0b11 then do
1 01 – Round towards Zero
if FPSCR.RN=0b00 then // Round to Nearest Even
rmode ← 0b000 1 10 – Round towards +Infinity
if FPSCR.RN=0b01 then // Round towards Zero 1 11 – Round towards -Infinity
rmode ← 0b001
if FPSCR.RN=0b10 then // Round towards +Infinity Let src be the floating-point value in VSR[VRB+32]
rmode ← 0b010 represented in quad-precision format.
if FPSCR.RN=0b11 then // Round towards -Infinity
rmode ← 0b011 If src is a Signalling NaN, an Invalid Operation
end exception occurs, VXSNAN is set to 1, and the result is
end the Quiet NaN corresponding to the Signalling NaN.
else do // R=1
if RMC=0b00 then // Round to Nearest Even Otherwise, if src is a Quiet NaN, an Infinity, or a Zero,
rmode ← 0b000 then the result is src.
if RMC=0b01 then // Round towards Zero
rmode ← 0b001
Otherwise, src is rounded to an integer using the
if RMC=0b10 then // Round towards +Infinity
rmode ← 0b010
rounding mode rmode.
if RMC=0b11 then // Round towards -Infinity
rmode ← 0b011 The result is placed into VSR[VRT+32] in quad-precision
end format.
src ← bfp_CONVERT_FROM_BFP128(VSR[VRB+32]) FPRF is set to the class and sign of the result.
src = VSR[VRB+32]
tgt = VSR[VRT+32]
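The R/RMC rounding-mode selection for xsrqpi[x] can be sketched as a small lookup. The 3-bit `rmode` encodings follow the pseudocode above; the function name `select_rmode` and the `RMODE_NAMES` table are hypothetical, and raising an error for the reserved R=0 encodings is an assumption of this sketch (the pseudocode leaves `rmode` unassigned in those cases).

```python
RMODE_NAMES = {
    0b000: "Round to Nearest Even",
    0b001: "Round towards Zero",
    0b010: "Round towards +Infinity",
    0b011: "Round towards -Infinity",
    0b100: "Round to Nearest Away",
}


def select_rmode(r, rmc, fpscr_rn):
    """Return the 3-bit rmode chosen by the R and RMC fields and FPSCR.RN."""
    if r == 0:
        if rmc == 0b00:
            return 0b100              # Round to Nearest Away
        if rmc == 0b11:
            return fpscr_rn & 0b11    # mode taken from FPSCR.RN
        # Assumption: reserved encodings (RMC=0b01, 0b10) are rejected here.
        raise ValueError("reserved R/RMC combination")
    return rmc & 0b11                 # R=1: RMC names the mode directly
```

For example, `RMODE_NAMES[select_rmode(0, 0b11, 0b10)]` gives "Round towards +Infinity", since R=0 with RMC=0b11 defers to FPSCR.RN.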
VSX Scalar Round Quad-Precision to Let R and RMC specify the rounding mode as follows.
Double-Extended Precision Z23-form
FPSCR.RN
xsrqpxp R,VRT,VRB,RMC
RMC
63 VRT /// R VRB RMC 37 / Rounding Mode
R
0 6 11 15 16 21 23 31
0 00 – Round to Nearest Away
if MSR.VSX=0 then VSX_Unavailable() 0 01 – reserved
0 10 – reserved
reset_xflags()
0 11 00 Round to Nearest Even
if R=0 then do 0 11 01 Round to Zero
if RMC=0b00 then // Round to Nearest Away 0 11 10 Round to +Infinity
rmode ← 0b100 0 11 11 Round to -Infinity
if RMC=0b11 then do 1 00 – Round to Nearest Even
if FPSCR.RN=0b00 then // Round to Nearest Even
1 01 – Round to Zero
rmode ← 0b000
if FPSCR.RN=0b01 then // Round towards Zero 1 10 – Round to +Infinity
rmode ← 0b001 1 11 – Round to -Infinity
if FPSCR.RN=0b10 then // Round towards +Infinity
rmode ← 0b010 Let src be the floating-point value in VSR[VRB+32]
if FPSCR.RN=0b11 then // Round towards -Infinity represented in quad-precision format.
rmode ← 0b011
end If src is a Signalling NaN, an Invalid Operation
end exception occurs, VXSNAN is set to 1, and the result is
else do // R=1 the Quiet NaN corresponding to the Signalling NaN,
if RMC=0b00 then // Round to Nearest Even with the significand truncated to
rmode ← 0b000
double-extended-precision.
if RMC=0b01 then // Round towards Zero
rmode ← 0b001
if RMC=0b10 then // Round towards +Infinity
Otherwise, if src is a Quiet NaN, then the result is src
rmode ← 0b010 with the significand truncated to
if RMC=0b11 then // Round towards -Infinity double-extended-precision.
rmode ← 0b011
end Otherwise, if src is an Infinity or a Zero, the result is
src.
src ← bfp_CONVERT_FROM_BFP128(VSR[VRB+32])
rnd ← bfp_ROUND_TO_BFP80(rmode,src) Otherwise, src is rounded to double-extended
result ← bfp_CONVERT_TO_BFP128(rnd) precision (i.e., 15-bit exponent range and 64-bit
significand precision) using the specified rounding
if(vxsnan_flag) then SetFX(FPSCR.VXSNAN) mode.
if(ox_flag) then SetFX(FPSCR.OX)
if(ux_flag) then SetFX(FPSCR.UX)
See Table 58, “Scalar Floating-Point Intermediate
if(xx_flag) then SetFX(FPSCR.XX)
Result Handling,” on page 516.
ex_flag ← FPSCR.VE & vxsnan_flag
The result is placed into VSR[VRT+32] in quad-precision
if ex_flag=0 then do format.
VSR[VRT+32] ← result
FPSCR.FPRF ← fprf_CLASS_BFP128(result) FPRF is set to the class and sign of the result. FR is set
end to indicate if the rounded result was incremented. FI is
FPSCR.FR ← (vxsnan_flag=0) & inc_flag set to indicate the result is inexact.
FPSCR.FI ← (vxsnan_flag=0) & xx_flag
If a trap-disabled Invalid Operation exception occurs,
FPRF is set to an undefined value, and FR and FI are set
to 0.
src = VSR[VRB+32]
tgt = VSR[VRT+32]
src ← VSR[32×BX+B].dword[0]
result ← RoundToSP(RN,src)
VSX Scalar Reciprocal Square Root Estimate Source Value Result Exception
Double-Precision XX2-form
–Infinity QNaN1 VXSQRT
xsrsqrtedp XT,XB –Finite QNaN1 VXSQRT
DP undefined
0 64 127
\[ \left| \frac{\text{estimate} - \frac{1}{\sqrt{\text{src}}}}{\frac{1}{\sqrt{\text{src}}}} \right| \le \frac{1}{16384} \]
VSX Scalar Reciprocal Square Root Estimate Source Value Result Exception
Single-Precision XX2-form
–Infinity QNaN1 VXSQRT
xsrsqrtesp XT,XB
–Finite QNaN1 VXSQRT
60 T /// B 10 BXTX 2
–Zero –Infinity ZX
0 6 11 16 21 30 31
2
+Zero +Infinity ZX
reset_xflags()
+Infinity +Zero None
src ← VSR[32×BX+B].dword[0]
1
v ← ReciprocalSquareRootEstimateDP(src) SNaN QNaN VXSNAN
result ← RoundToSP(RN,v)
QNaN QNaN None
if(vxsnan_flag) then SetFX(VXSNAN) 1. No result if VE=1.
if(vxsqrt_flag) then SetFX(VXSQRT) 2. No result if ZE=1.
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX) The contents of doubleword element 1 of VSR[XT] are
if(0bU) then SetFX(XX) undefined.
if(zx_flag) then SetFX(ZX)
FPRF is set to the class and sign of the result as
vex_flag ← VE & (vxsnan_flag | vxsqrt_flag)
represented in single-precision format. FR is set to an
zex_flag ← ZE & zx_flag
undefined value. FI is set to an undefined value.
if( ~vex_flag & ~zex_flag ) then do
VSR[32×TX+T].dword[0] ← ConvertSPtoSP64(result) If a trap-enabled invalid operation exception or a
VSR[32×TX+T].dword[1] ← 0xUUUU_UUUU_UUUU_UUUU trap-enabled zero divide exception occurs, VSR[XT]
FPRF ← ClassSP(result) and FPRF are not modified.
FR ← 0bU
FI ← 0bU The results of executing this instruction are permitted to
end vary between implementations, and between different
else do executions on the same implementation.
FR ← 0b0
FI ← 0b0 Special Registers Altered
end
FPRF FR=0bU FI=0bU FX OX UX ZX
XX=0bU VXSNAN VXSQRT
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
VSR Data Layout for xsrsqrtesp
Let src be the double-precision floating-point value in src = VSR[XB]
doubleword element 0 of VSR[XB]. DP unused
tgt = VSR[XT]
A single-precision floating-point estimate of the
reciprocal square root of src is placed into doubleword DP undefined
element 0 of VSR[XT] in double-precision format. 0 64 127
\[ \left| \frac{\text{estimate} - \frac{1}{\sqrt{\text{src}}}}{\frac{1}{\sqrt{\text{src}}}} \right| \le \frac{1}{16384} \]
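The accuracy bound above (relative error of the reciprocal square root estimate no greater than 1/16384) can be checked numerically as sketched below. The function name `rsqrt_estimate_ok` is hypothetical, the reference value is computed in ordinary double-precision arithmetic, and only positive finite src values are meaningful here.

```python
import math


def rsqrt_estimate_ok(estimate, src):
    """True if estimate is within relative error 1/16384 of 1/sqrt(src)."""
    exact = 1.0 / math.sqrt(src)
    return abs((estimate - exact) / exact) <= 1.0 / 16384
```

Because results are permitted to vary between implementations, a check of this kind is more appropriate in a test than a comparison against one fixed bit pattern.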
src
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← dQNaN v ← Q(src)
v ← +Zero v ← +Zero v ← SQRT(src) v ← +Infinity v ← src
vxsqrt_flag ← 1 vxsqrt_flag ← 1 vxsnan_flag ← 1
Explanation:
src The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
SQRT(x) The unbounded-precision square root of the floating-point value x.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
If src is a Signalling NaN, the result is the Quiet NaN See Table 59, “VSX Scalar Floating-Point Final
corresponding to src. Result,” on page 517.
Otherwise, if src is a Quiet NaN, the result is src. Special Registers Altered:
FPRF FR FI FX VXSNAN VXSQRT XX
Otherwise, if src is a negative value, the result is the
default Quiet NaN[1]. VSR Data Layout for xssqrtqp[o]
src = VSR[VRB+32]
tgt = VSR[VRT+32]
src
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← dQNaN v ← quiet(src)
vxsqrt_flag ← 1 vxsqrt_flag ← 1 v ← +Zero v ← +Zero v ← sqrt(src) v ← +Infinity v ← src
vxsnan_flag ← 1
Explanation:
src The quad-precision floating-point value in VSR[VRB+32].
dQNaN Default quiet NaN (0x7FFF_8000_0000_0000_0000_0000_0000_0000).
NZF Nonzero finite number.
sqrt(x) Return the normalized1 square root of floating-point value x, having unbounded significand precision and exponent range.
quiet(x) Convert x to the corresponding Quiet NaN.
v The intermediate result having unbounded significand precision and unbounded exponent range.
src
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← dQNaN v ← Q(src)
v ← +Zero v ← +Zero v ← SQRT(src) v ← +Infinity v ← src
vxsqrt_flag ← 1 vxsqrt_flag ← 1 vxsnan_flag ← 1
Explanation:
src The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
SQRT(x) The square root of the floating-point value x, having unbounded significand precision and exponent range.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
VSX Scalar Subtract Double-Precision The result is placed into doubleword element 0 of
XX3-form VSR[XT].
1. Floating-point addition is based on exponent comparison and addition of the two significands. The exponents of the two operands are compared,
and the significand accompanying the smaller exponent is shifted right, with its exponent increased by one for each bit shifted, until the two expo-
nents are equal. The two significands are then added or subtracted as appropriate, depending on the signs of the operands, to form an interme-
diate sum. All 53 bits of the significand as well as all three guard bits (G, R, and X) enter into the computation.
2. Floating-point normalization is based on shifting the significand left until the most-significant bit is 1 and decrementing the exponent by the num-
ber of bits the significand was shifted.
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
-Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← +Infinity v ← S(src1,src2) v ← src1 v ← src1 v ← S(src1,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
-Zero v ← +Infinity v ← –src2 v ← Rezd v ← –Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← +Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← +Infinity v ← S(src1,src2) v ← src1 v ← src1 v ← S(src1,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
S(x,y) The floating-point value y is negated and then added to the floating-point value x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
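The Infinity entries of the table above can be sketched as follows: subtracting two Infinity values of the same sign produces the default quiet NaN and sets vxisi_flag, while every other non-NaN combination is an ordinary subtraction. The names `dqnan` and `subtract_dp` are hypothetical; NaN operands, rounding-mode-dependent zero signs and FPSCR updates are outside this sketch.

```python
import math
import struct

DQNAN_BITS = 0x7FF8_0000_0000_0000  # default quiet NaN from the Explanation


def dqnan():
    """Return the default quiet NaN as a Python float."""
    return struct.unpack(">d", struct.pack(">Q", DQNAN_BITS))[0]


def subtract_dp(src1, src2):
    """Return (v, vxisi_flag) for non-NaN double-precision operands."""
    if math.isinf(src1) and math.isinf(src2) and src1 == src2:
        # Infinity minus Infinity of the same sign: Invalid Operation (VXISI).
        return dqnan(), True
    return src1 - src2, False
```

For example, `subtract_dp(float("inf"), float("inf"))` returns a NaN with the flag set, whereas `subtract_dp(float("-inf"), float("inf"))` returns -Infinity with the flag clear.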
Let src1 be the floating-point value in VSR[VRA+32] If a trap-disabled Invalid Operation exception occurs,
represented in quad-precision format. FPRF is set to an undefined value, and FR and FI are set
to 0.
Let src2 be the floating-point value in VSR[VRB+32]
represented in quad-precision format. If a trap-enabled Invalid Operation exception occurs,
VSR[VRT+32] and FPRF are not modified, and FR and FI
If either src1 or src2 is a Signalling NaN, an Invalid are set to 0.
Operation exception occurs and VXSNAN is set to 1.
See Table 59, “VSX Scalar Floating-Point Final
If src1 and src2 are Infinity values having the same sign, Result,” on page 517.
an Invalid Operation exception occurs and VXISI is set
to 1. Special Registers Altered:
FPRF FR FI FX VXSNAN VXISI OX UX XX
If src1 is a Signalling NaN, the result is the Quiet NaN
corresponding to src1. VSR Data Layout for xssubqp[o]
VSR[VRA+32]
Otherwise, if src1 is a Quiet NaN, the result is src1.
src1
Otherwise, if src2 is a Signalling NaN, the result is the VSR[VRB+32]
Quiet NaN corresponding to src2.
src2
Otherwise, if src2 is a Quiet NaN, the result is src2. VSR[VRT+32]
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN
-Infinity v ← -Infinity
vxisi_flag ← 1
-NZF v ← sub(src1,src2) v ← src1 v ← sub(src1,src2)
VSX Scalar Subtract Single-Precision XX3-form
xssubsp XT,XA,XB
60 T A B 8 AX BX TX
0 6 11 16 21 29 30 31
reset_xflags()
src1 ← VSR[32×AX+A].dword[0]
src2 ← VSR[32×BX+B].dword[0]
v ← AddDP(src1,NegateDP(src2))
result ← RoundToSP(RN,v)
if(vxsnan_flag) then SetFX(VXSNAN)
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)

The contents of doubleword element 1 of VSR[XT] are
undefined.

FPRF is set to the class and sign of the result as
represented in single-precision format. FR is set to
indicate if the result was incremented when rounded.
FI is set to indicate the result is inexact.

If a trap-enabled invalid operation exception occurs,
VSR[XT] and FPRF are not modified, and FR and FI are
set to 0.

See Table 59, “VSX Scalar Floating-Point Final
Result,” on page 517.

Special Registers Altered
FPRF FR FI FX OX UX XX
VXSNAN VXISI
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
-Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← +Infinity v ← S(src1,src2) v ← src1 v ← src1 v ← S(src1,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
-Zero v ← +Infinity v ← –src2 v ← –Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← Rezd v ← +Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← +Infinity v ← S(src1,src2) v ← src1 v ← src1 v ← S(src1,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element 0 of VSR[XA].
src2 The double-precision floating-point value in doubleword element 0 of VSR[XB].
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
S(x,y) The floating-point value y is negated and then added to the floating-point value x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
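The special-operand rows of the subtraction table above can be sketched in Python as follows. This is an illustrative model only, not the architected definition: the helper names (`subtract_special`, `quiet`, and so on) are hypothetical, operands are passed as 64-bit double-precision bit patterns, and the finite case rounds directly to double-precision instead of producing the unbounded intermediate value v.

```python
import struct

DQNAN = 0x7FF8_0000_0000_0000  # default quiet NaN from the Explanation list

def bits_to_float(b):
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def float_to_bits(x):
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def is_nan(b):
    return ((b >> 52) & 0x7FF) == 0x7FF and (b & ((1 << 52) - 1)) != 0

def is_snan(b):
    # A signalling NaN has the most-significant fraction bit clear.
    return is_nan(b) and ((b >> 51) & 1) == 0

def is_inf(b):
    return (b & 0x7FFF_FFFF_FFFF_FFFF) == 0x7FF0_0000_0000_0000

def quiet(b):
    # Q(x): keep the payload and set the quiet bit (no change for a QNaN).
    return b | (1 << 51)

def subtract_special(src1, src2):
    """Return (v_bits, flags) for src1 - src2, in the table's priority order."""
    flags = {"vxsnan": is_snan(src1) or is_snan(src2), "vxisi": False}
    if is_nan(src1):                 # src1 NaN rows take priority
        return quiet(src1), flags
    if is_nan(src2):
        return quiet(src2), flags
    if is_inf(src1) and is_inf(src2) and (src1 >> 63) == (src2 >> 63):
        flags["vxisi"] = True        # Infinity - Infinity with like signs
        return DQNAN, flags
    # Remaining rows: ordinary subtraction (rounded to double here).
    return float_to_bits(bits_to_float(src1) - bits_to_float(src2)), flags
```

For example, passing `0x7FF0_0000_0000_0000` for both operands yields the default quiet NaN with `vxisi` set, matching the +Infinity/+Infinity cell of the table.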
60 BF // /// B 106 BX /
0 6 9 11 16 21 30 31
XB ← BX || B
src ← VSR[XB]{0:63}
e_b ← VSR[XB]{1:11} - 1023
fe_flag ← IsNaN(src) | IsInf(src) | IsZero(src) |
IsNeg(src) | ( e_b <= -970 )
fg_flag ← IsInf(src) | IsZero(src) | IsDen(src)
fl_flag ← xsrsqrtedp_error() <= 2^-14
CR[BF] ← 0b1 || fg_flag || fe_flag || 0b0
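A minimal Python sketch of the fe_flag and fg_flag computation above, assuming the operand is supplied as a Python float. The function name `tsqrt_flags` is hypothetical, fl_flag (which depends on the accuracy of the implementation's estimate instruction) is not modelled, and the CR field is returned as a 4-bit integer whose most-significant bit corresponds to bit 0 of CR[BF].

```python
import struct

def tsqrt_flags(x):
    """Return (fe_flag, fg_flag, cr_field) for a double-precision operand."""
    b = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = b >> 63
    exp_field = (b >> 52) & 0x7FF
    frac = b & ((1 << 52) - 1)
    e_b = exp_field - 1023           # unbiased exponent, as in the RTL

    is_nan = exp_field == 0x7FF and frac != 0
    is_inf = exp_field == 0x7FF and frac == 0
    is_zero = exp_field == 0 and frac == 0
    is_den = exp_field == 0 and frac != 0
    is_neg = sign == 1

    fe = is_nan or is_inf or is_zero or is_neg or e_b <= -970
    fg = is_inf or is_zero or is_den
    # CR[BF] <- 0b1 || fg_flag || fe_flag || 0b0
    cr = (1 << 3) | (int(fg) << 2) | (int(fe) << 1)
    return fe, fg, cr
```

Under these assumptions, an ordinary operand such as 4.0 gives `0b1000`, while a zero operand sets both flags and gives `0b1110`.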
VSX Scalar Test Data Class Double-Precision XX2-form
xststdcdp BF,XB,DCMX
60 BF DCMX B 362 BX /
0 6 9 16 21 30 31
if MSR.VSX=0 then VSX_Unavailable()
src ← VSR[32×BX+B].dword[0]
exponent ← src.bit[1:11]
fraction ← src.bit[12:63]
class.Infinity ← (exponent = 0x7FF) & (fraction = 0)
class.NaN ← (exponent = 0x7FF) & (fraction != 0)
class.Zero ← (exponent = 0x000) & (fraction = 0)
class.Denormal ← (exponent = 0x000) & (fraction != 0)
match ← (DCMX.bit[0] & class.NaN) |
        (DCMX.bit[1] & class.Infinity & !sign) |
        (DCMX.bit[2] & class.Infinity & sign) |
        (DCMX.bit[3] & class.Zero & !sign) |
        (DCMX.bit[4] & class.Zero & sign) |
        (DCMX.bit[5] & class.Denormal & !sign) |
        (DCMX.bit[6] & class.Denormal & sign)

Let XB be the sum 32×BX + B.

Let src be the double-precision floating-point value in
doubleword element 0 of VSR[XB].

Bit 0 of CR field BF and bit 0 of FPCC are set to the sign
bit of src.

Bit 1 of CR field BF and bit 1 of FPCC are set to 0b0.

Bit 2 of CR field BF and bit 2 of FPCC are set to indicate
whether the data class of src, as represented in
double-precision format, matches any of the data
classes specified by DCMX (Data Class Mask).

DCMX bit  Data Class
0         NaN
1         +Infinity
2         -Infinity
3         +Zero
4         -Zero
5         +Denormal
6         -Denormal
VSX Scalar Test Data Class Quad-Precision X-form
xststdcqp BF,VRB,DCMX
63 BF DCMX VRB 708 /
0 6 9 16 21 31

Let src be the quad-precision floating-point value in
VSR[VRB+32].

Let the DCMX (Data Class Mask) field specify one or
more of the 7 possible data classes, where each bit
corresponds to a specific data class.
src
VSX Scalar Test Data Class Single-Precision XX2-form
xststdcsp BF,XB,DCMX
60 BF DCMX B 298 BX /
0 6 9 16 21 30 31
if MSR.VSX=0 then VSX_Unavailable()
src ← VSR[32×BX+B].dword[0]
exponent ← src.bit[1:11]
fraction ← src.bit[12:63]
class.Infinity ← (exponent = 0x7FF) & (fraction = 0)
class.NaN ← (exponent = 0x7FF) & (fraction != 0)
class.Zero ← (exponent = 0x000) & (fraction = 0)
class.Denormal ← (exponent = 0x000) & (fraction != 0) |
                 (exponent > 0x000) & (exponent < 0x381)
match ← (DCMX.bit[0] & class.NaN) |
        (DCMX.bit[1] & class.Infinity & !sign) |
        (DCMX.bit[2] & class.Infinity & sign) |
        (DCMX.bit[3] & class.Zero & !sign) |
        (DCMX.bit[4] & class.Zero & sign) |
        (DCMX.bit[5] & class.Denormal & !sign) |
        (DCMX.bit[6] & class.Denormal & sign)
not_SP_value ← (src != Convert_SPtoDP(Convert_DPtoSP(src)))

Let XB be the sum 32×BX + B.

Let src be the double-precision floating-point value in
doubleword element 0 of VSR[XB].

Bit 0 of CR field BF and bit 0 of FPCC are set to the sign
bit of src.

Bit 1 of CR field BF and bit 1 of FPCC are set to 0b0.

Bit 2 of CR field BF and bit 2 of FPCC are set to indicate
whether the data class of src, as represented in
single-precision format, matches any of the data
classes specified by DCMX (Data Class Mask).

DCMX bit  Data Class
0         NaN
1         +Infinity
2         -Infinity
3         +Zero
4         -Zero
5         +Denormal
6         -Denormal

Bit 3 of CR field BF and bit 3 of FPCC are set to indicate if
src is not representable in single-precision format.
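The two single-precision-specific parts of this test, the widened Denormal class and the not_SP_value round trip, can be sketched in Python as below. The function names are hypothetical, the operand is a Python float (double-precision), and the round trip relies on `struct`'s `f` format, which rounds to nearest; NaN operands are deliberately left out of scope because a value comparison cannot express their handling.

```python
import math
import struct

def dp_fields(x):
    b = struct.unpack(">Q", struct.pack(">d", x))[0]
    return b >> 63, (b >> 52) & 0x7FF, b & ((1 << 52) - 1)

def is_sp_denormal_class(x):
    """class.Denormal as defined for the single-precision test."""
    _, exponent, fraction = dp_fields(x)
    # Either a double-precision denormal, or a double-precision normal
    # whose biased exponent lies below 0x381 (the smallest exponent
    # that is still normal in single-precision).
    return (exponent == 0 and fraction != 0) or (0 < exponent < 0x381)

def not_sp_value(x):
    """True if x changes when narrowed to single-precision and widened back."""
    if math.isnan(x):
        raise ValueError("NaN handling is outside this sketch")
    try:
        narrowed = struct.unpack(">f", struct.pack(">f", x))[0]
    except OverflowError:
        return True                  # magnitude too large for single-precision
    return narrowed != x
```

For instance, 2^-127 has biased double-precision exponent 0x380 and is therefore classed as Denormal, whereas 2^-126 (exponent 0x381) is not.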
Let XB be the sum 32×BX + B.

Let src be the double-precision floating-point value in
doubleword element 0 of VSR[XB].

The value of the exponent field in src is placed into
GPR[RT] in unsigned integer format.

Special Registers Altered:
None

Programming Note
This instruction can be used to operate on a
single-precision source operand.

Let src be the quad-precision floating-point value in
VSR[VRB+32].

The contents of the exponent field of src (bits 1:15) are
zero-extended and placed into doubleword 0 of
VSR[VRT+32].

The contents of doubleword 1 of VSR[VRT+32] are set to
0.

Special Registers Altered:
None
tgt GPR[RT]
0 63 127
src VSR[VRB+32]
Let src be the double-precision floating-point value in
doubleword element 0 of VSR[XB].

The significand of src is placed into GPR[RT] in
unsigned integer format. If src is a normal value, the
implicit leading bit is set to 1.

Special Registers Altered:
None

Programming Note
This instruction can be used to operate on a
single-precision source operand.

The significand of src is placed into VSR[VRT+32].

If the value of the exponent field of src is equal to
0b000_0000_0000_0000 (i.e., Zero or Denormal value) or
0b111_1111_1111_1111 (i.e., Infinity or NaN), 0b0 is placed
into bit 15 of VSR[VRT+32]. Otherwise (i.e., Normal
value), 0b1 is placed into bit 15 of VSR[VRT+32]. The
contents of bits 0:14 of VSR[VRT+32] are set to 0.

Special Registers Altered:
None
tgt GPR[RT]
0 64 127
src VSR[VRB+32]
tgt VSR[VRT+32]
0 127
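The double-precision exponent and significand extraction described above can be modelled with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the operand is a Python float, and the returned integers stand for the value placed into GPR[RT].

```python
import struct

def dp_bits(x):
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def extract_exponent_dp(x):
    """Biased 11-bit exponent field (bits 1:11), as an unsigned integer."""
    return (dp_bits(x) >> 52) & 0x7FF

def extract_significand_dp(x):
    """52-bit fraction, with the implicit leading bit set for normal values."""
    b = dp_bits(x)
    exponent = (b >> 52) & 0x7FF
    fraction = b & ((1 << 52) - 1)
    # Normal values have an exponent field that is neither all 0s
    # (Zero/Denormal) nor all 1s (Infinity/NaN).
    implicit = 1 if 0 < exponent < 0x7FF else 0
    return (implicit << 52) | fraction
```

For 1.0 this gives an exponent of 0x3FF and a significand of 0x0010_0000_0000_0000; for the smallest denormal the implicit bit stays 0 and the significand is 1.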
VSX Vector Absolute Value Double-Precision XX2-form
xvabsdp XT,XB
XT ← TX || T
XB ← BX || B

VSX Vector Absolute Value Single-Precision XX2-form
xvabssp XT,XB
XT ← TX || T
XB ← BX || B
VSX Vector Add Double-Precision XX3-form
xvadddp XT,XA,XB
60 T A B 96 AX BX TX
0 6 11 16 21 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0
do i=0 to 127 by 64
   reset_xflags()
   src1 ← VSR[XA]{i:i+63}
   src2 ← VSR[XB]{i:i+63}
   v{0:inf} ← AddDP(src1,src2)
   result{i:i+63} ← RoundToDP(RN,v)
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxisi_flag) then SetFX(VXISI)
   if(ox_flag) then SetFX(OX)
   if(ux_flag) then SetFX(UX)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxisi_flag)
   ex_flag ← ex_flag | (OE & ox_flag)
   ex_flag ← ex_flag | (UE & ux_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

The result is placed into doubleword element i of
VSR[XT] in double-precision format.

See Table 106, “Vector Floating-Point Final
Result,” on page 663.

If a trap-enabled exception occurs in any element of
the vector, no results are written to VSR[XT].

Special Registers Altered
FX OX UX XX VXSNAN VXISI

VSR Data Layout for xvadddp
src1 = VSR[XA]
DP DP
src2 = VSR[XB]
DP DP
tgt = VSR[XT]
DP DP
0 64 127
1. Floating-point addition is based on exponent comparison and addition of the two significands. The exponents of the two operands are compared,
and the significand accompanying the smaller exponent is shifted right, with its exponent increased by one for each bit shifted, until the two expo-
nents are equal. The two significands are then added or subtracted as appropriate, depending on the signs of the operands, to form an interme-
diate sum. All 53 bits of the significand as well as all three guard bits (G, R, and X) enter into the computation.
2. Floating-point normalization is based on shifting the significand left until the most-significant bit is 1 and decrementing the exponent by the num-
ber of bits the significand was shifted.
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
-Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
-Zero v ← -Infinity v ← src2 v ← -Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← -Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 The double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
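The element loop and the rule that no results are written when a trap-enabled exception occurs in any element can be sketched in Python. This is a simplified model, not the architected behaviour: the function name `xvadd_sketch` is hypothetical, elements are Python floats, only the VXISI and OX conditions are detected, and SNaN, underflow and inexact handling are omitted.

```python
import math

def xvadd_sketch(xa, xb, xt, ve=False, oe=False):
    """Return (new_xt, status) for an element-wise add of xa and xb.

    xt is the current target contents, returned unchanged when an
    enabled exception is recorded for any element.
    """
    result = []
    ex_flag = False
    status = {"VXISI": False, "OX": False}
    for src1, src2 in zip(xa, xb):
        # Infinity + Infinity with unlike signs is an invalid operation.
        vxisi = math.isinf(src1) and math.isinf(src2) and src1 != src2
        v = float("nan") if vxisi else src1 + src2
        # Overflow: finite operands producing an infinite result.
        ox = math.isinf(v) and math.isfinite(src1) and math.isfinite(src2)
        status["VXISI"] |= vxisi
        status["OX"] |= ox
        ex_flag |= (ve and vxisi) or (oe and ox)
        result.append(v)
    return (list(xt) if ex_flag else result), status
```

Calling it with an Infinity/−Infinity pair in one element and `ve=True` leaves the target untouched even though the other element's sum is well defined, which is the whole-vector suppression described above.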
Is q inexact? (q ≠ v)
vxsnan_flag
vxsqrt_flag
vximz_flag
vxisi_flag
vxidi_flag
vxzdz_flag
zx_flag
OE
UE
VE
XE
Case ZE Returned Results and Status Setting
– – – – – 0 0 0 0 0 0 0 – – – – T(r)
– – – 0 – – – – – – – 1 – – – – T(r), fx(ZX)
– – – 1 – – – – – – – 1 – – – – fx(ZX), error()
0 – – – – – – – – – 1 – – – – – T(r), fx(VXSQRT)
0 – – – – – – – – 1 – – – – – – T(r), fx(VXZDZ)
0 – – – – – – – 1 – – – – – – – T(r), fx(VXIDI)
0 – – – – – – 1 – – – – – – – – T(r), fx(VXISI)
0 – – – – 0 1 – – – – – – – – – T(r), fx(VXIMZ)
Special 0 – – – – 1 0 – – – – – – – – – T(r), fx(VXSNAN)
0 – – – – 1 1 – – – – – – – – – T(r), fx(VXSNAN), fx(VXIMZ)
1 – – – – – – – – – 1 – – – – – T(r), fx(VXSQRT)
1 – – – – – – – – 1 – – – – – – fx(VXZDZ), error()
1 – – – – – – – 1 – – – – – – – fx(VXIDI), error()
1 – – – – – – 1 – – – – – – – – fx(VXISI), error()
1 – – – – 0 1 – – – – – – – – – fx(VXIMZ), error()
1 – – – – 1 0 – – – – – – – – – fx(VXSNAN), error()
1 – – – – 1 1 – – – – – – – – – fx(VXSNAN), fx(VXIMZ), error()
– – – – – – – – – – – – no – – – T(r)
– – – – 0 – – – – – – – yes no – – T(r), fx(XX)
Normal – – – – 0 – – – – – – – yes yes – – T(r), fx(XX)
– – – – 1 – – – – – – – yes no – – T(r), fx(XX), error()
– – – – 1 – – – – – – – yes yes – – T(r), fx(XX), error()
Explanation:
– The results do not depend on this condition.
fx(x) FX is set to 1 if x=0. x is set to 1.
q The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target
precision, unbounded exponent range.
r The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target
precision, bounded exponent range.
v The precise intermediate result defined in the instruction having unbounded significand precision, unbounded exponent range.
OX Floating-Point Overflow Exception status flag, FPSCR.OX.
error() The system error handler is invoked for the trap-enabled exception if the FE0 and FE1 bits in the Machine State Register are set
to any mode other than the ignore-exception mode. Update of the target VSR is suppressed for all vector elements.
T(x) The value x is placed in element i of VSR[XT] in the target precision format (where i ∈ {0,1} for results with 64-bit elements, and
i ∈ {0,1,2,3} for results with 32-bit elements).
UX Floating-Point Underflow Exception status flag, FPSCR.UX.
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCR.VXSNAN.
VXSQRT Floating-Point Invalid Operation Exception (Invalid Square Root) status flag, FPSCR.VXSQRT.
VXIDI Floating-Point Invalid Operation Exception (Infinity ÷ Infinity) status flag, FPSCR.VXIDI.
VXIMZ Floating-Point Invalid Operation Exception (Infinity × Zero) status flag, FPSCR.VXIMZ.
VXISI Floating-Point Invalid Operation Exception (Infinity – Infinity) status flag, FPSCR.VXISI.
VXZDZ Floating-Point Invalid Operation Exception (Zero ÷ Zero) status flag, FPSCR.VXZDZ.
XX Floating-Point Inexact Exception status flag, FPSCR.XX. The flag is a sticky version of FPSCR.FI. When FPSCR.FI is set to a new
value, the new value of FPSCR.XX is set to the result of ORing the old value of FPSCR.XX with the new value of FPSCR.FI.
ZX Floating-Point Zero Divide Exception status flag, FPSCR.ZX.
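The fx(x) rule and the sticky behaviour of XX described in this explanation can be illustrated in Python. The sketch assumes the status register is modelled as a plain dictionary of flag names to 0/1 values; `set_fx` and `update_fi` are hypothetical helper names.

```python
def set_fx(fpscr, name):
    """fx(x): FX is set to 1 if x was 0; x is then set to 1."""
    if not fpscr.get(name, 0):
        fpscr["FX"] = 1
    fpscr[name] = 1

def update_fi(fpscr, new_fi):
    """XX is a sticky version of FI: new XX = old XX OR new FI."""
    fpscr["FI"] = new_fi
    fpscr["XX"] = fpscr.get("XX", 0) | new_fi
```

The first `set_fx(fpscr, "OX")` on a cleared register raises both OX and FX. If software then clears FX, a second call leaves FX at 0 because OX was already 1.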
VSX Vector Add Single-Precision XX3-form
xvaddsp XT,XA,XB
60 T A B 64 AX BX TX
0 6 11 16 21 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0
do i=0 to 127 by 32
   reset_xflags()
   src1 ← VSR[XA]{i:i+31}
   src2 ← VSR[XB]{i:i+31}
   v{0:inf} ← AddSP(src1,src2)
   result{i:i+31} ← RoundToSP(RN,v)
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxisi_flag) then SetFX(VXISI)
   if(ox_flag) then SetFX(OX)
   if(ux_flag) then SetFX(UX)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxisi_flag)
   ex_flag ← ex_flag | (OE & ox_flag)
   ex_flag ← ex_flag | (UE & ux_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

The result is placed into word element i of
VSR[XT] in single-precision format.

See Table 106, “Vector Floating-Point Final
Result,” on page 663.

If a trap-enabled exception occurs in any element of
the vector, no results are written to VSR[XT].

Special Registers Altered
FX OX UX XX VXSNAN VXISI

VSR Data Layout for xvaddsp
src1 = VSR[XA]
SP SP SP SP
src2 = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
1. Floating-point addition is based on exponent comparison and addition of the two significands. The exponents of the two operands are compared,
and the significand accompanying the smaller exponent is shifted right, with its exponent increased by one for each bit shifted, until the two expo-
nents are equal. The two significands are then added or subtracted as appropriate, depending on the signs of the operands, to form an interme-
diate sum. All 53 bits of the significand as well as all three guard bits (G, R, and X) enter into the computation.
2. Floating-point normalization is based on shifting the significand left until the most-significant bit is 1 and decrementing the exponent by the num-
ber of bits the significand was shifted.
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
-Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← -Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
-Zero v ← -Infinity v ← src2 v ← -Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← -Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← -Infinity v ← A(src1,src2) v ← src1 v ← src1 v ← A(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 The single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
VSX Vector Compare Equal To Double-Precision XX3-form
xvcmpeqdp XT,XA,XB (Rc=0)
xvcmpeqdp. XT,XA,XB (Rc=1)

Two zero inputs of same or different signs return
true for that element.

Two infinity inputs of same signs return true for
that element.

VSX Vector Compare Equal To Single-Precision XX3-form
xvcmpeqsp XT,XA,XB (Rc=0)
xvcmpeqsp. XT,XA,XB (Rc=1)

Two zero inputs of same or different signs return
true for that element.

Two infinity inputs of same signs return true for
that element.
VSX Vector Compare Greater Than or Equal To Double-Precision XX3-form
xvcmpgedp XT,XA,XB (Rc=0)
xvcmpgedp. XT,XA,XB (Rc=1)
60 T A B Rc 115 AX BX TX
0 6 11 16 21 22 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0
all_false ← 0b1
all_true ← 0b1

The contents of doubleword element i of VSR[XT]
are set to all 1s if src1 is greater than or equal to
src2, and are set to all 0s otherwise.

A NaN input causes the comparison to return false
for that element.

Two zero inputs of same or different signs return
true for that element.

Two infinity inputs of same signs return true for
that element.
VSX Vector Compare Greater Than or Equal To Single-Precision XX3-form
xvcmpgesp XT,XA,XB (Rc=0)
xvcmpgesp. XT,XA,XB (Rc=1)
60 T A B Rc 83 AX BX TX
0 6 11 16 21 22 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0
all_false ← 0b1
all_true ← 0b1
do i=0 to 127 by 32
   reset_xflags()
   src1 ← VSR[XA]{i:i+31}
   src2 ← VSR[XB]{i:i+31}
   if( IsSNaN(src1) | IsSNaN(src2) ) then do
      vxsnan_flag ← 0b1
      if(VE=0) then vxvc_flag ← 0b1
   end
   else vxvc_flag ← IsQNaN(src1) | IsQNaN(src2)

The contents of word element i of VSR[XT] are set
to all 1s if src1 is greater than or equal to src2,
and are set to all 0s otherwise.

A NaN input causes the comparison to return false
for that element.

Two zero inputs of same or different signs return
true for that element.

Two infinity inputs of same signs return true for
that element.

If Rc=1, CR Field 6 is set as follows.
– Bit 0 of CR[6] is set to indicate all vector elements
compared true.
– Bit 1 of CR[6] is set to 0.
– Bit 2 of CR[6] is set to indicate all vector elements
compared false.
– Bit 3 of CR[6] is set to 0.

If a trap-enabled exception occurs in any element of
the vector, no results are written to VSR[XT] and the
contents of CR[6] are undefined if Rc is equal to 1.
VSX Vector Compare Greater Than Double-Precision XX3-form
xvcmpgtdp XT,XA,XB (Rc=0)
xvcmpgtdp. XT,XA,XB (Rc=1)
60 T A B Rc 107 AX BX TX
0 6 11 16 21 22 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0
all_false ← 0b1
all_true ← 0b1
do i=0 to 127 by 64
   reset_xflags()
   src1 ← VSR[XA]{i:i+63}
   src2 ← VSR[XB]{i:i+63}
   if( IsSNaN(src1) | IsSNaN(src2) ) then do
      vxsnan_flag ← 0b1
      if(VE=0) then vxvc_flag ← 0b1
   end
   else vxvc_flag ← IsQNaN(src1) | IsQNaN(src2)
   if( CompareGTDP(src1,src2) ) then do
      result{i:i+63} ← 0xFFFF_FFFF_FFFF_FFFF
      all_false ← 0b0
   end
   else do
      result{i:i+63} ← 0x0000_0000_0000_0000
      all_true ← 0b0
   end
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxvc_flag) then SetFX(VXVC)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxvc_flag)
end
if(Rc=1) then do
   if( !ex_flag ) then
      CR[6] ← all_true || 0b0 || all_false || 0b0
   else
      CR[6] ← 0bUUUU
end

The contents of doubleword element i of VSR[XT]
are set to all 1s if src1 is greater than src2, and
are set to all 0s otherwise.

A NaN input causes the comparison to return false
for that element.

Two zero inputs of same or different signs return
false for that element.

If Rc=1, CR Field 6 is set as follows.
– Bit 0 of CR[6] is set to indicate all vector elements
compared true.
– Bit 1 of CR[6] is set to 0.
– Bit 2 of CR[6] is set to indicate all vector elements
compared false.
– Bit 3 of CR[6] is set to 0.

If a trap-enabled exception occurs in any element of
the vector, no results are written to VSR[XT] and the
contents of CR[6] are undefined if Rc is equal to 1.

Special Registers Altered
CR[6] (if Rc=1)
FX VXSNAN VXVC

VSR Data Layout for xvcmpgtdp[.]
src1 = VSR[XA]
DP DP
src2 = VSR[XB]
DP DP
tgt = VSR[XT]
MD MD
0 64 127
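The mask generation and CR field 6 summary for a greater-than compare can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `xvcmpgt_sketch` is hypothetical, elements are Python floats, exception flags and trap-enabled suppression are not modelled, and CR[6] is returned as a 4-bit integer whose most-significant bit is bit 0 of the field.

```python
def xvcmpgt_sketch(xa, xb, rc=False):
    """Return (mask_elements, cr6) for an element-wise src1 > src2 compare."""
    all_true = True
    all_false = True
    result = []
    for src1, src2 in zip(xa, xb):
        # Python's > is False whenever either operand is a NaN and
        # treats +0.0 and -0.0 as equal, matching the rules above.
        if src1 > src2:
            result.append(0xFFFF_FFFF_FFFF_FFFF)
            all_false = False
        else:
            result.append(0x0000_0000_0000_0000)
            all_true = False
    # CR[6] <- all_true || 0b0 || all_false || 0b0
    cr6 = ((int(all_true) << 3) | (int(all_false) << 1)) if rc else None
    return result, cr6
```

A caller would typically test bit 0 of the returned field to branch when every element compared true, or bit 2 when none did; a mixed outcome gives 0b0000.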
VSX Vector Compare Greater Than Single-Precision XX3-form
xvcmpgtsp XT,XA,XB (Rc=0)
xvcmpgtsp. XT,XA,XB (Rc=1)
60 T A B Rc 75 AX BX TX
0 6 11 16 21 22 29 30 31
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0
all_false ← 0b1
all_true ← 0b1
do i=0 to 127 by 32
   reset_xflags()
   src1 ← VSR[XA]{i:i+31}
   src2 ← VSR[XB]{i:i+31}
   if( IsSNaN(src1) | IsSNaN(src2) ) then do
      vxsnan_flag ← 0b1
      if(VE=0) then vxvc_flag ← 0b1
   end
   else vxvc_flag ← IsQNaN(src1) | IsQNaN(src2)
   if( CompareGTSP(src1,src2) ) then do
      result{i:i+31} ← 0xFFFF_FFFF
      all_false ← 0b0
   end
   else do
      result{i:i+31} ← 0x0000_0000
      all_true ← 0b0
   end
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxvc_flag) then SetFX(VXVC)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxvc_flag)
end
if(Rc=1) then do
   if( !ex_flag ) then
      CR[6] ← all_true || 0b0 || all_false || 0b0
   else
      CR[6] ← 0bUUUU
end

The contents of word element i of VSR[XT] are set
to all 1s if src1 is greater than src2, and are set to
all 0s otherwise.

A NaN input causes the comparison to return false
for that element.

Two zero inputs of same or different signs return
false for that element.

If Rc=1, CR Field 6 is set as follows.
– Bit 0 of CR[6] is set to indicate all vector elements
compared true.
– Bit 1 of CR[6] is set to 0.
– Bit 2 of CR[6] is set to indicate all vector elements
compared false.
– Bit 3 of CR[6] is set to 0.

If a trap-enabled exception occurs in any element of
the vector, no results are written to VSR[XT] and the
contents of CR[6] are undefined if Rc is equal to 1.

Special Registers Altered
CR[6] (if Rc=1)
FX VXSNAN VXVC

VSR Data Layout for xvcmpgtsp[.]
src1 = VSR[XA]
SP SP SP SP
src2 = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
MW MW MW MW
0 32 64 96 127
if (vxsnan_flag=1) SetFX(FPSCR.VXSNAN)
if Rc=1 then do
CR.bit[56] ← all_true
CR.bit[57] ← 0b0
CR.bit[58] ← all_false
CR.bit[59] ← 0b0
end
VSX Vector Compare Not Equal The contents of doubleword 1 of VSR[VRT] are set to
Single-Precision XX3-form 0x0000_0000_0000_0000.
all_true ← 1
all_false ← 1
reset_flags()
do i = 0 to 3
src1 ← bfp_CONVERT_FROM_BFP32(VSR[32×AX+A].word[i])
src2 ← bfp_CONVERT_FROM_BFP32(VSR[32×BX+B].word[i])
if (vxsnan_flag=1) SetFX(FPSCR.VXSNAN)
if Rc=1 then do
CR.bit[56] ← all_true
CR.bit[57] ← 0b0
CR.bit[58] ← all_false
CR.bit[59] ← 0b0
end
VSX Vector Copy Sign Double-Precision XX3-form
xvcpsgndp XT,XA,XB
XT ← TX || T
XA ← AX || A
XB ← BX || B

For each vector element i from 0 to 1, do the following.
The contents of bit 0 of doubleword element i of
VSR[XA] are concatenated with the contents of
bits 1:63 of doubleword element i of VSR[XB] and
placed into doubleword element i of VSR[XT].

VSR Data Layout for xvcpsgndp
src1 = VSR[XA]
DP DP
src2 = VSR[XB]
DP DP
tgt = VSR[XT]
DP DP
0 64 127

VSX Vector Copy Sign Single-Precision XX3-form
xvcpsgnsp XT,XA,XB
XT ← TX || T
XA ← AX || A
XB ← BX || B

For each vector element i from 0 to 3, do the following.
The contents of bit 0 of word element i of VSR[XA]
are concatenated with the contents of bits 1:31 of
word element i of VSR[XB] and placed into word
element i of VSR[XT].

VSR Data Layout for xvcpsgnsp
src1 = VSR[XA]
SP SP SP SP
src2 = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
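The copy-sign operation is a pure bit concatenation, which a short Python sketch makes concrete for one double-precision element. The function name `cpsgn_dp` is hypothetical, and operands are Python floats converted to and from their 64-bit patterns.

```python
import struct

def cpsgn_dp(a, b):
    """Sign bit (bit 0) of a concatenated with bits 1:63 of b."""
    abits = struct.unpack(">Q", struct.pack(">d", a))[0]
    bbits = struct.unpack(">Q", struct.pack(">d", b))[0]
    rbits = (abits & (1 << 63)) | (bbits & ((1 << 63) - 1))
    return struct.unpack(">d", struct.pack(">Q", rbits))[0]
```

Because only bits are moved, no exception can arise: `cpsgn_dp(-1.0, 2.5)` gives -2.5, and passing the same operand twice simply copies it.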
60 T /// B 393 BX TX
0 6 11 16 21 30 31
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
do i=0 to 127 by 64
reset_xflags()
src ← VSR[XB]{i:i+63}
result{i:i+31} ← RoundToSP(RN,src)
result{i+32:i+63} ← 0xUUUU_UUUU
if(vxsnan_flag) then SetFX(VXSNAN)
if(ox_flag) then SetFX(OX)
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX)
ex_flag ← ex_flag | (VE & vxsnan_flag)
ex_flag ← ex_flag | (OE & ox_flag)
ex_flag ← ex_flag | (UE & ux_flag)
ex_flag ← ex_flag | (XE & xx_flag)
end
do i=0 to 127 by 64
   reset_xflags()
   result{i:i+63} ← ConvertDPtoSD(VSR[XB]{i:i+63})
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxcvi_flag) then SetFX(VXCVI)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxcvi_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

Programming Note
xvcvdpsxds rounds using Round towards Zero
rounding mode. For other rounding modes, software
must use a Round to Double-Precision Integer
instruction that corresponds to the desired
rounding mode, including xvrdpic which uses the
rounding mode specified by RN.
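The point of the Programming Note, that the conversion itself always truncates and a different rounding must be applied beforehand, can be shown with a small Python sketch. The function names are hypothetical, only finite in-range operands are considered, and Python's built-in `round` (round half to even) stands in for a preceding round-to-integer step.

```python
import math

def convert_truncate(x):
    """Convert to integer using Round towards Zero."""
    return math.trunc(x)

def convert_nearest_even(x):
    """Round to an integral value first, then convert; truncation is then exact."""
    return math.trunc(round(x))
```

For 2.7 the first function returns 2, while the second returns 3; for negative operands truncation moves towards zero, so -2.7 becomes -2 rather than -3.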
do i=0 to 127 by 64
   reset_xflags()
   result{i:i+63} ← ConvertDPtoUD(VSR[XB]{i:i+63})
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxcvi_flag) then SetFX(VXCVI)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxcvi_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

Programming Note
xvcvdpuxds rounds using Round towards Zero
rounding mode. For other rounding modes,
software must use a Round to Double-Precision
Integer instruction that corresponds to the desired
rounding mode, including xvrdpic which uses the
rounding mode specified by RN.
VSX Vector Convert Half-Precision format to
Single-Precision format XX2-form

For each integer value i from 0 to 3, do the following.
Let src be the half-precision floating-point value in
the rightmost halfword of word element i of
VSR[XB].

If src is an SNaN, the result is the single-precision
representation of that SNaN converted to a QNaN.

Special Registers Altered:
FX VXSNAN
60 T /// B 457 BX TX
0 6 11 16 21 30 31
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
do i=0 to 127 by 64
reset_xflags()
result{i:i+63} ← ConvertSPtoDP(VSR[XB]{i:i+31})
if(vxsnan_flag) then SetFX(VXSNAN)
ex_flag ← ex_flag | (VE & vxsnan_flag)
end
VSX Vector round and Convert Single-Precision format to
Half-Precision format XX2-form
xvcvsphp XT,XB
60 T 25 B 475 BX TX
0 6 11 16 21 30 31
if MSR.VSX=0 then VSX_Unavailable()
reset_flags()
do i = 0 to 3
   src ← bfp_CONVERT_FROM_BFP32(VSR[BX×32+B].word[i])
   rnd ← bfp_ROUND_TO_BFP16(FPSCR.RN,src)
   result.hword[2×i] ← 0x0000
   result.hword[2×i+1] ← bfp_CONVERT_TO_BFP16(rnd)
   if(vxsnan_flag) then SetFX(FPSCR.VXSNAN)
   if(ox_flag) then SetFX(FPSCR.OX)
   if(ux_flag) then SetFX(FPSCR.UX)
   if(xx_flag) then SetFX(FPSCR.XX)
   ex_flag ← ex_flag | (FPSCR.VE & vxsnan_flag)
                     | (FPSCR.OE & ox_flag)
                     | (FPSCR.UE & ux_flag)
                     | (FPSCR.XE & xx_flag)
end

If src is an SNaN, the result is the half-precision
representation of that SNaN converted to a QNaN.

Otherwise, if src is a QNaN, the result is the
half-precision representation of that QNaN.

Otherwise, if src is an Infinity, the result is the
half-precision representation of Infinity with the
same sign as src.

Otherwise, if src is a Zero, the result is the
half-precision representation of Zero with the
same sign as src.

Otherwise, the result is the half-precision
representation of src rounded to half-precision
using the rounding mode specified by RN.

The result is zero-extended and placed into word
element i of VSR[XT].

If a trap-enabled exception occurs, VSR[XT] is not
modified.

Special Registers Altered:
FX VXSNAN OX UX XX
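The per-element result layout, a half-precision value zero-extended into a 32-bit word, can be sketched with Python's `struct` module, whose `e` format packs IEEE half-precision. The sketch assumes round-to-nearest-even only (the behaviour of `struct`), does not model exception flags, and uses the hypothetical names `sp_to_hp_word` and `convert_vector`; operands whose magnitude is too large for half-precision raise `OverflowError` in this model.

```python
import struct

def sp_to_hp_word(x):
    """Return the 32-bit word holding 0x0000 || half-precision(x)."""
    hp = struct.unpack(">H", struct.pack(">e", x))[0]
    return hp                        # upper halfword is zero

def convert_vector(words):
    """Apply the conversion to each of the four word elements."""
    return [sp_to_hp_word(x) for x in words]
```

For example, 1.0 converts to the word `0x0000_3C00`, and the sign of a Zero or Infinity is carried into bit 0 of the half-precision value.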
do i=0 to 127 by 64
   reset_xflags()
   result{i:i+63} ← ConvertSPtoSD(VSR[XB]{i:i+31})
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxcvi_flag) then SetFX(VXCVI)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxcvi_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

Programming Note
xvcvspsxds rounds using Round towards Zero
rounding mode. For other rounding modes,
software must use a Round to Single-Precision
Integer instruction that corresponds to the desired
rounding mode, including xvrspic which uses the
rounding mode specified by RN.
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0

do i=0 to 127 by 32
   reset_xflags()
   result{i:i+31} ← ConvertSPtoSW(VSR[XB]{i:i+31})
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxcvi_flag) then SetFX(VXCVI)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxcvi_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

SW SW SW SW
0 32 64 96 127

Programming Note
xvcvspsxws rounds using Round towards Zero rounding mode. For other rounding modes, software must use a Round to Single-Precision Integer instruction that corresponds to the desired rounding mode, including xvrspic which uses the rounding mode specified by RN.
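The per-element behaviour of a round-toward-zero conversion such as ConvertSPtoSW can be sketched in Python. This is a minimal illustration, not the architected definition: the function name `convert_sp_to_sw` is hypothetical, the saturation values and the NaN result (most negative integer with VXCVI) are assumptions not stated in this excerpt, and the SNaN/QNaN distinction is not modeled.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def convert_sp_to_sw(x):
    """Model one element: return (signed 32-bit result, set of raised flags)."""
    if math.isnan(x):
        # Assumption: a NaN source yields the most negative integer and VXCVI.
        return INT32_MIN, {"vxcvi_flag"}
    if math.isinf(x) or not INT32_MIN <= math.trunc(x) <= INT32_MAX:
        # Assumption: out-of-range sources saturate and raise VXCVI.
        return (INT32_MAX if x > 0 else INT32_MIN), {"vxcvi_flag"}
    result = math.trunc(x)          # Round toward Zero, independent of RN
    flags = {"xx_flag"} if result != x else set()   # inexact if a fraction was dropped
    return result, flags
```

For example, `convert_sp_to_sw(-2.75)` truncates to -2 and reports an inexact result, whereas a round-to-nearest conversion would have produced -3.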
do i=0 to 127 by 64
   reset_xflags()
   result{i:i+63} ← ConvertSPtoUD(VSR[XB]{i:i+31})
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxcvi_flag) then SetFX(VXCVI)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxcvi_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

Programming Note
xvcvspuxds rounds using Round towards Zero rounding mode. For other rounding modes, software must use a Round to Single-Precision Integer instruction that corresponds to the desired rounding mode, including xvrspic which uses the rounding mode specified by RN.
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0

do i=0 to 127 by 32
   reset_xflags()
   result{i:i+31} ← ConvertSPtoUW(VSR[XB]{i:i+31})
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxcvi_flag) then SetFX(VXCVI)
   if(xx_flag) then SetFX(XX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxcvi_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

UW UW UW UW
0 32 64 96 127

Programming Note
xvcvspuxws rounds using Round towards Zero rounding mode. For other rounding modes, software must use a Round to Single-Precision Integer instruction that corresponds to the desired rounding mode, including xvrspic which uses the rounding mode specified by RN.
VSX Vector Convert and round Signed Integer Doubleword to Double-Precision format XX2-form

xvcvsxddp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0

VSX Vector Convert and round Signed Integer Doubleword to Single-Precision format XX2-form

xvcvsxdsp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
VSX Vector Convert Signed Integer Word to Double-Precision format XX2-form

xvcvsxwdp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result

For each vector element i from 0 to 1, do the following.
   Let src be the signed integer in bits 0:31 of doubleword element i of VSR[XB].
   The result is placed into doubleword element i of VSR[XT] in double-precision format.

If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].

VSR Data Layout for xvcvsxwdp
src = VSR[XB]
SW unused SW unused
tgt = VSR[XT]
DP DP
0 32 64 96 127

VSX Vector Convert and round Signed Integer Word to Single-Precision format XX2-form

xvcvsxwsp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result

For each vector element i from 0 to 3, do the following.
   Let src be the signed integer in word element i of VSR[XB].
   The result is placed into word element i of VSR[XT] in single-precision format.

If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].

VSR Data Layout for xvcvsxwsp
src = VSR[XB]
SW SW SW SW
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
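The source element selection of xvcvsxwdp can be sketched as follows. This is an illustrative model only: the function name `xvcvsxwdp_model` and the list-of-words register representation are assumptions. Bit 0 is the most significant bit, so bits 0:31 of doubleword element i correspond to word element 2×i, and the other word of each doubleword is unused.

```python
def xvcvsxwdp_model(words):
    """words: four unsigned 32-bit values, word element 0 first.

    Returns the two double-precision results as Python floats.
    """
    def as_signed(w):
        # Reinterpret a 32-bit pattern as a two's-complement signed integer.
        return w - (1 << 32) if w & 0x80000000 else w

    # Word elements 0 and 2 hold bits 0:31 of doubleword elements 0 and 1.
    return [float(as_signed(words[2 * i])) for i in range(2)]
```

Every 32-bit integer is exactly representable in double-precision format, so no rounding occurs in this conversion, which is consistent with the instruction name omitting "and round".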
VSX Vector Convert and round Unsigned Integer Doubleword to Double-Precision format XX2-form

xvcvuxddp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0

VSX Vector Convert and round Unsigned Integer Doubleword to Single-Precision format XX2-form

xvcvuxdsp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
VSX Vector Convert and round Unsigned Integer Word to Double-Precision format XX2-form

xvcvuxwdp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result

For each vector element i from 0 to 1, do the following.
   Let src be the unsigned integer in bits 0:31 of doubleword element i of VSR[XB].
   The result is placed into doubleword element i of VSR[XT] in double-precision format.

If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].

VSR Data Layout for xvcvuxwdp
src = VSR[XB]
UW unused UW unused
tgt = VSR[XT]
DP DP
0 32 64 96 127

VSX Vector Convert and round Unsigned Integer Word to Single-Precision format XX2-form

xvcvuxwsp XT,XB

XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result

For each vector element i from 0 to 3, do the following.
   Let src be the unsigned integer in word element i of VSR[XB].
   The result is placed into word element i of VSR[XT] in single-precision format.

If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].

VSR Data Layout for xvcvuxwsp
src = VSR[XB]
UW UW UW UW
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
VSX Vector Divide Double-Precision XX3-form

xvdivdp XT,XA,XB

60 T A B 120 AX BX TX
0 6 11 16 21 29 30 31

XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0

do i=0 to 127 by 64
   reset_xflags()
   src1 ← VSR[XA]{i:i+63}
   src2 ← VSR[XB]{i:i+63}
   v{0:inf} ← DivideDP(src1,src2)
   result{i:i+63} ← RoundToDP(RN,v)
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxidi_flag) then SetFX(VXIDI)
   if(vxzdz_flag) then SetFX(VXZDZ)
   if(ox_flag) then SetFX(OX)
   if(ux_flag) then SetFX(UX)
   if(xx_flag) then SetFX(XX)
   if(zx_flag) then SetFX(ZX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxidi_flag)
   ex_flag ← ex_flag | (VE & vxzdz_flag)
   ex_flag ← ex_flag | (OE & ox_flag)
   ex_flag ← ex_flag | (UE & ux_flag)
   ex_flag ← ex_flag | (ZE & zx_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

The result is placed into doubleword element i of VSR[XT] in double-precision format.

See Table 106, “Vector Floating-Point Final Result,” on page 663.

If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].

Special Registers Altered
FX OX UX ZX XX
VXSNAN VXIDI VXZDZ

VSR Data Layout for xvdivdp
src1 = VSR[XA]
DP DP
src2 = VSR[XB]
DP DP
tgt = VSR[XT]
DP DP
0 64 127
src1 \ src2 | -Infinity | -NZF | -Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
-Infinity | v ← dQNaN; vxidi_flag ← 1 | v ← +Infinity | v ← +Infinity | v ← –Infinity | v ← –Infinity | v ← dQNaN; vxidi_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
-NZF | v ← +Zero | v ← D(src1,src2) | v ← +Infinity; zx_flag ← 1 | v ← –Infinity; zx_flag ← 1 | v ← D(src1,src2) | v ← –Zero | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
-Zero | v ← +Zero | v ← +Zero | v ← dQNaN; vxzdz_flag ← 1 | v ← dQNaN; vxzdz_flag ← 1 | v ← –Zero | v ← –Zero | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Zero | v ← –Zero | v ← –Zero | v ← dQNaN; vxzdz_flag ← 1 | v ← dQNaN; vxzdz_flag ← 1 | v ← +Zero | v ← +Zero | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 The double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
D(x,y) Return the normalized quotient of floating-point value x divided by floating-point value y, having unbounded range and precision.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
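The special-case handling in the division table can be sketched in Python. This is a simplified model under stated assumptions: the function name `divide_special_cases` is hypothetical, SNaN operands are not modeled (Python floats do not reliably distinguish them), the rows for +NZF and +Infinity dividends are assumed to mirror the rows shown, and the ordinary quotient is rounded to double precision by Python rather than kept with unbounded range and precision as D(x,y) specifies.

```python
import math

def divide_special_cases(src1, src2):
    """Return (v, flags) for one element, covering the non-SNaN table cells."""
    flags = set()
    if math.isnan(src1):
        return src1, flags                      # QNaN dividend propagates
    if math.isnan(src2):
        return src2, flags                      # QNaN divisor propagates
    # Sign of the quotient: negative when exactly one operand is negative.
    sign = math.copysign(1.0, src1) * math.copysign(1.0, src2)
    if math.isinf(src1):
        if math.isinf(src2):
            return math.nan, {"vxidi_flag"}     # Infinity ÷ Infinity
        return math.copysign(math.inf, sign), flags
    if math.isinf(src2):
        return math.copysign(0.0, sign), flags  # finite ÷ Infinity
    if src2 == 0.0:
        if src1 == 0.0:
            return math.nan, {"vxzdz_flag"}     # Zero ÷ Zero
        return math.copysign(math.inf, sign), {"zx_flag"}  # NZF ÷ Zero
    return src1 / src2, flags                   # D(src1,src2), rounded here
```

A nonzero finite dividend over a zero divisor yields a signed Infinity with zx_flag, while Zero over Zero yields the default QNaN with vxzdz_flag.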
VSX Vector Divide Single-Precision XX3-form

xvdivsp XT,XA,XB

60 T A B 88 AX BX TX
0 6 11 16 21 29 30 31

XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0

do i=0 to 127 by 32
   reset_xflags()
   src1 ← VSR[XA]{i:i+31}
   src2 ← VSR[XB]{i:i+31}
   v{0:inf} ← DivideSP(src1,src2)
   result{i:i+31} ← RoundToSP(RN,v)
   if(vxsnan_flag) then SetFX(VXSNAN)
   if(vxidi_flag) then SetFX(VXIDI)
   if(vxzdz_flag) then SetFX(VXZDZ)
   if(ox_flag) then SetFX(OX)
   if(ux_flag) then SetFX(UX)
   if(xx_flag) then SetFX(XX)
   if(zx_flag) then SetFX(ZX)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
   ex_flag ← ex_flag | (VE & vxidi_flag)
   ex_flag ← ex_flag | (VE & vxzdz_flag)
   ex_flag ← ex_flag | (OE & ox_flag)
   ex_flag ← ex_flag | (UE & ux_flag)
   ex_flag ← ex_flag | (ZE & zx_flag)
   ex_flag ← ex_flag | (XE & xx_flag)
end

The result is placed into word element i of VSR[XT] in single-precision format.

See Table 106, “Vector Floating-Point Final Result,” on page 663.

If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].

Special Registers Altered
FX OX UX ZX XX
VXSNAN VXIDI VXZDZ

VSR Data Layout for xvdivsp
src1 = VSR[XA]
SP SP SP SP
src2 = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
src1 \ src2 | -Infinity | -NZF | -Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
-Infinity | v ← dQNaN; vxidi_flag ← 1 | v ← +Infinity | v ← +Infinity | v ← –Infinity | v ← –Infinity | v ← dQNaN; vxidi_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
-NZF | v ← +Zero | v ← D(src1,src2) | v ← +Infinity; zx_flag ← 1 | v ← –Infinity; zx_flag ← 1 | v ← D(src1,src2) | v ← –Zero | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
-Zero | v ← +Zero | v ← +Zero | v ← dQNaN; vxzdz_flag ← 1 | v ← dQNaN; vxzdz_flag ← 1 | v ← –Zero | v ← –Zero | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Zero | v ← –Zero | v ← –Zero | v ← dQNaN; vxzdz_flag ← 1 | v ← dQNaN; vxzdz_flag ← 1 | v ← +Zero | v ← +Zero | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 The single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
D(x,y) Return the normalized quotient of floating-point value x divided by floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
VSX Vector Insert Exponent Double-Precision XX3-form

xviexpdp XT,XA,XB

do i = 0 to 1
   src1 ← VSR[32×AX+A].dword[i]
   src2 ← VSR[32×BX+B].dword[i]

For each integer value i from 0 to 1, do the following.
   Let src1 be the unsigned integer value in doubleword element i of VSR[XA].
   Let src2 be the unsigned integer value in doubleword element i of VSR[XB].
   The contents of bit 0 of src1 are placed into bit 0 of doubleword element i of VSR[XT].
   The contents of bits 53:63 of src2 are placed into bits 1:11 of doubleword element i of VSR[XT].
   The contents of bits 12:63 of src1 are placed into bits 12:63 of doubleword element i of VSR[XT].

VSX Vector Insert Exponent Single-Precision XX3-form

xviexpsp XT,XA,XB

do i = 0 to 3
   src1 ← VSR[32×AX+A].word[i]
   src2 ← VSR[32×BX+B].word[i]

For each integer value i from 0 to 3, do the following.
   Let src1 be the unsigned integer value in word element i of VSR[XA].
   Let src2 be the unsigned integer value in word element i of VSR[XB].
   The contents of bit 0 of src1 are placed into bit 0 of word element i of VSR[XT].
   The contents of bits 24:31 of src2 are placed into bits 1:8 of word element i of VSR[XT].
   The contents of bits 9:31 of src1 are placed into bits 9:31 of word element i of VSR[XT].
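The bit-field assembly performed by xviexpdp on one doubleword element can be sketched with integer masks. This is an illustration only; the function names `xviexpdp_element` and `bits_to_double` are hypothetical. Because bit 0 is the most significant bit, bits 53:63 of src2 are its low-order 11 bits, and bits 1:11 of the result are the exponent field just below the sign bit.

```python
import struct

SIGN_MASK = 0x8000_0000_0000_0000      # bit 0
FRACTION_MASK = 0x000F_FFFF_FFFF_FFFF  # bits 12:63

def xviexpdp_element(src1, src2):
    """Combine sign and fraction of src1 with the low 11 bits of src2 as exponent."""
    sign = src1 & SIGN_MASK
    exponent = (src2 & 0x7FF) << 52    # bits 53:63 of src2 moved to bits 1:11
    fraction = src1 & FRACTION_MASK
    return sign | exponent | fraction

def bits_to_double(bits):
    """Reinterpret a 64-bit pattern as a double-precision value."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

For instance, inserting the biased exponent 1024 into the bit pattern of 1.5 (0x3FF8_0000_0000_0000) gives 0x4008_0000_0000_0000, which is 3.0.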
Part 1: Multiply
src1 \ src3 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | p ← +Infinity | p ← +Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← –Infinity | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–NZF | p ← +Infinity | p ← M(src1,src3) | p ← +Zero | p ← –Zero | p ← M(src1,src3) | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–Zero | p ← dQNaN; vximz_flag ← 1 | p ← +Zero | p ← +Zero | p ← –Zero | p ← –Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Zero | p ← dQNaN; vximz_flag ← 1 | p ← –Zero | p ← –Zero | p ← +Zero | p ← +Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+NZF | p ← –Infinity | p ← M(src1,src3) | p ← –Zero | p ← +Zero | p ← M(src1,src3) | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Infinity | p ← –Infinity | p ← –Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← +Infinity | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
QNaN | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1; vxsnan_flag ← 1
SNaN | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1

Part 2: Add
p \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–NZF | v ← –Infinity | v ← A(p,src2) | v ← p | v ← p | v ← A(p,src2) | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–Zero | v ← –Infinity | v ← src2 | v ← –Zero | v ← Rezd | v ← src2 | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Zero | v ← –Infinity | v ← src2 | v ← Rezd | v ← +Zero | v ← src2 | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+NZF | v ← –Infinity | v ← A(p,src2) | v ← p | v ← p | v ← A(p,src2) | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
QNaN & src1 is a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p; vxsnan_flag ← 1
QNaN & src1 not a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 For xvmaddadp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
For xvmaddmdp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
src3 For xvmaddadp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
For xvmaddmdp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
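The two-part evaluation (multiply, then add) determines which invalid-operation flags a multiply-add can raise. The following sketch covers only that flag logic; the function name `classify_madd` is hypothetical, NaN operands are returned without flags because the SNaN/QNaN distinction is not modeled, and the actual arithmetic result is not computed.

```python
import math

def classify_madd(src1, src3, src2):
    """Return the invalid-operation flags raised by src1 × src3 + src2."""
    flags = set()
    if any(math.isnan(x) for x in (src1, src2, src3)):
        return flags                    # NaN handling is not modeled in this sketch
    # Part 1: Infinity × Zero produces dQNaN and sets vximz_flag.
    if (math.isinf(src1) and src3 == 0.0) or (src1 == 0.0 and math.isinf(src3)):
        flags.add("vximz_flag")
        return flags                    # p is a QNaN, so Part 2 just propagates it
    # Part 2: an infinite product added to an Infinity of opposite sign.
    if math.isinf(src1) or math.isinf(src3):
        p_sign = math.copysign(1.0, src1) * math.copysign(1.0, src3)
        if math.isinf(src2) and math.copysign(1.0, src2) != p_sign:
            flags.add("vxisi_flag")
    return flags
```

The product is evaluated first, so an Infinity × Zero product is reported as vximz_flag regardless of the addend.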
Part 1: Multiply
src1 \ src3 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | p ← +Infinity | p ← +Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← –Infinity | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–NZF | p ← +Infinity | p ← M(src1,src3) | p ← +Zero | p ← –Zero | p ← M(src1,src3) | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–Zero | p ← dQNaN; vximz_flag ← 1 | p ← +Zero | p ← +Zero | p ← –Zero | p ← –Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Zero | p ← dQNaN; vximz_flag ← 1 | p ← –Zero | p ← –Zero | p ← +Zero | p ← +Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+NZF | p ← –Infinity | p ← M(src1,src3) | p ← –Zero | p ← +Zero | p ← M(src1,src3) | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Infinity | p ← –Infinity | p ← –Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← +Infinity | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
QNaN | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1; vxsnan_flag ← 1
SNaN | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1

Part 2: Add
p \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–NZF | v ← –Infinity | v ← A(p,src2) | v ← p | v ← p | v ← A(p,src2) | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–Zero | v ← –Infinity | v ← src2 | v ← –Zero | v ← Rezd | v ← src2 | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Zero | v ← –Infinity | v ← src2 | v ← Rezd | v ← +Zero | v ← src2 | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+NZF | v ← –Infinity | v ← A(p,src2) | v ← p | v ← p | v ← A(p,src2) | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
QNaN & src1 is a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p; vxsnan_flag ← 1
QNaN & src1 not a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 For xvmaddasp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).
For xvmaddmsp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
src3 For xvmaddasp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
For xvmaddmsp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
src1 \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | T(src1) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
–NZF | T(src1) | T(M(src1,src2)) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
–Zero | T(src1) | T(src1) | T(src1) | T(src2) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Zero | T(src1) | T(src1) | T(src1) | T(src1) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
+NZF | T(src1) | T(src1) | T(src1) | T(src1) | T(M(src1,src2)) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Infinity | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
QNaN | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1); fx(VXSNAN)
SNaN | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 The double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
NZF Nonzero finite number.
Q(x) Return a QNaN with the payload of x.
M(x,y) Return the greater of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element i (i∈{0,1}) of VSR[XT] in double-precision format.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-point Invalid Operation Exception (SNaN). If VE=1, update of VSR[XT] is suppressed.
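The maximum table can be sketched on raw 64-bit patterns, which keeps SNaN and QNaN distinguishable. This is a minimal model of one doubleword element: the names `max_dp`, `from_float`, and `to_float` are hypothetical, and suppression of the VSR[XT] update when VE=1 is not modeled.

```python
import struct

EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000

def is_nan(b):
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0

def is_snan(b):
    return is_nan(b) and not (b & QUIET_BIT)

def to_float(b):
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def from_float(x):
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def max_dp(src1, src2):
    """Return (result_bits, vxsnan) following the table row by row."""
    if is_snan(src1):
        return src1 | QUIET_BIT, True        # SNaN row: T(Q(src1))
    if is_nan(src1):                         # QNaN row
        if is_nan(src2):
            return src1, is_snan(src2)       # T(src1); VXSNAN if src2 is an SNaN
        return src2, False                   # T(src2)
    if is_snan(src2):
        return src2 | QUIET_BIT, True        # SNaN column: T(Q(src2))
    if is_nan(src2):
        return src1, False                   # QNaN column: T(src1)
    a, b = to_float(src1), to_float(src2)
    if a == b:
        # Equal values differ in bits only for -0 versus +0; the AND clears
        # the sign unless both are -0, so +Zero is treated as the greater.
        return src1 & src2, False
    return (src1 if a > b else src2), False
```

Note that a QNaN paired with a number returns the number, so a NaN appears in the result only when both operands are NaNs or one is an SNaN.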
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0

do i=0 to 127 by 32
   reset_xflags()
   src1 ← VSR[XA]{i:i+31}
   src2 ← VSR[XB]{i:i+31}
   result{i:i+31} ← MaximumSP(src1,src2)
   if(vxsnan_flag) then SetFX(VXSNAN)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
end

SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
src1 \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | T(src1) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
–NZF | T(src1) | T(M(src1,src2)) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
–Zero | T(src1) | T(src1) | T(src1) | T(src2) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Zero | T(src1) | T(src1) | T(src1) | T(src1) | T(src2) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
+NZF | T(src1) | T(src1) | T(src1) | T(src1) | T(M(src1,src2)) | T(src2) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Infinity | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
QNaN | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1); fx(VXSNAN)
SNaN | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN)
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 The single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
NZF Nonzero finite number.
Q(x) Return a QNaN with the payload of x.
M(x,y) Return the greater of floating-point value x and floating-point value y.
T(x) The value x is placed in word element i (i∈{0,1,2,3}) of VSR[XT] in single-precision format.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-point Invalid Operation Exception (SNaN). If VE=1, update of VSR[XT] is suppressed.
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0

do i=0 to 127 by 64
   reset_xflags()
   src1 ← VSR[XA]{i:i+63}
   src2 ← VSR[XB]{i:i+63}
   result{i:i+63} ← MinimumDP(src1,src2)
   if(vxsnan_flag) then SetFX(VXSNAN)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
end

DP DP
tgt = VSR[XT]
DP DP
0 64 127
src1 \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
–NZF | T(src2) | T(M(src1,src2)) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
–Zero | T(src2) | T(src2) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Zero | T(src2) | T(src2) | T(src2) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
+NZF | T(src2) | T(src2) | T(src2) | T(src2) | T(M(src1,src2)) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Infinity | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
QNaN | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1); fx(VXSNAN)
SNaN | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN)
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 The double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
NZF Nonzero finite number.
Q(x) Return a QNaN with the payload of x.
M(x,y) Return the lesser of floating-point value x and floating-point value y.
T(x) The value x is placed in doubleword element i (i∈{0,1}) of VSR[XT] in double-precision format.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-point Invalid Operation Exception (SNaN). If VE=1, update of VSR[XT] is suppressed.
XT ← TX || T
XA ← AX || A
XB ← BX || B
ex_flag ← 0b0

do i=0 to 127 by 32
   reset_xflags()
   src1 ← VSR[XA]{i:i+31}
   src2 ← VSR[XB]{i:i+31}
   result{i:i+31} ← MinimumSP(src1,src2)
   if(vxsnan_flag) then SetFX(VXSNAN)
   ex_flag ← ex_flag | (VE & vxsnan_flag)
end

SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
src1 \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
–NZF | T(src2) | T(M(src1,src2)) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
–Zero | T(src2) | T(src2) | T(src1) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Zero | T(src2) | T(src2) | T(src2) | T(src1) | T(src1) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
+NZF | T(src2) | T(src2) | T(src2) | T(src2) | T(M(src1,src2)) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
+Infinity | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1) | T(Q(src2)); fx(VXSNAN)
QNaN | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src2) | T(src1) | T(src1); fx(VXSNAN)
SNaN | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN) | T(Q(src1)); fx(VXSNAN)
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 The single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
NZF Nonzero finite number.
Q(x) Return a QNaN with the payload of x.
M(x,y) Return the lesser of floating-point value x and floating-point value y.
T(x) The value x is placed in word element i (i∈{0,1,2,3}) of VSR[XT] in single-precision format.
FPRF, FR and FI are not modified.
fx(x) If x is equal to 0, FX is set to 1. x is set to 1.
VXSNAN Floating-point Invalid Operation Exception (SNaN). If VE=1, update of VSR[XT] is suppressed.
Part 1: Multiply
src1 \ src3 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | p ← +Infinity | p ← +Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← –Infinity | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–NZF | p ← +Infinity | p ← M(src1,src3) | p ← +Zero | p ← –Zero | p ← M(src1,src3) | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–Zero | p ← dQNaN; vximz_flag ← 1 | p ← +Zero | p ← +Zero | p ← –Zero | p ← –Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Zero | p ← dQNaN; vximz_flag ← 1 | p ← –Zero | p ← –Zero | p ← +Zero | p ← +Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+NZF | p ← –Infinity | p ← M(src1,src3) | p ← –Zero | p ← +Zero | p ← M(src1,src3) | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Infinity | p ← –Infinity | p ← –Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← +Infinity | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
QNaN | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1; vxsnan_flag ← 1
SNaN | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1

Part 2: Subtract
p \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–NZF | v ← +Infinity | v ← S(p,src2) | v ← p | v ← p | v ← S(p,src2) | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–Zero | v ← +Infinity | v ← –src2 | v ← Rezd | v ← –Zero | v ← –src2 | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Zero | v ← +Infinity | v ← –src2 | v ← +Zero | v ← Rezd | v ← –src2 | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+NZF | v ← +Infinity | v ← S(p,src2) | v ← p | v ← p | v ← S(p,src2) | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
QNaN & src1 is a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p; vxsnan_flag ← 1
QNaN & src1 not a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 For xvmsubadp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
For xvmsubmdp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
src3 For xvmsubadp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
For xvmsubmdp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
Part 1: Multiply
src1 \ src3 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | p ← +Infinity | p ← +Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← –Infinity | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–NZF | p ← +Infinity | p ← M(src1,src3) | p ← +Zero | p ← –Zero | p ← M(src1,src3) | p ← –Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
–Zero | p ← dQNaN; vximz_flag ← 1 | p ← +Zero | p ← +Zero | p ← –Zero | p ← –Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Zero | p ← dQNaN; vximz_flag ← 1 | p ← –Zero | p ← –Zero | p ← +Zero | p ← +Zero | p ← dQNaN; vximz_flag ← 1 | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+NZF | p ← –Infinity | p ← M(src1,src3) | p ← –Zero | p ← +Zero | p ← M(src1,src3) | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
+Infinity | p ← –Infinity | p ← –Infinity | p ← dQNaN; vximz_flag ← 1 | p ← dQNaN; vximz_flag ← 1 | p ← +Infinity | p ← +Infinity | p ← src3 | p ← Q(src3); vxsnan_flag ← 1
QNaN | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1 | p ← src1; vxsnan_flag ← 1
SNaN | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1 | p ← Q(src1); vxsnan_flag ← 1

Part 2: Subtract
p \ src2 | –Infinity | –NZF | –Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
–Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–NZF | v ← +Infinity | v ← S(p,src2) | v ← p | v ← p | v ← S(p,src2) | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
–Zero | v ← +Infinity | v ← –src2 | v ← Rezd | v ← –Zero | v ← –src2 | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Zero | v ← +Infinity | v ← –src2 | v ← +Zero | v ← Rezd | v ← –src2 | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+NZF | v ← +Infinity | v ← S(p,src2) | v ← p | v ← p | v ← S(p,src2) | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← +Infinity | v ← dQNaN; vxisi_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
QNaN & src1 is a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p; vxsnan_flag ← 1
QNaN & src1 not a NaN | v ← p | v ← p | v ← p | v ← p | v ← p | v ← p | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 For xvmsubasp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).
For xvmsubmsp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
src3 For xvmsubasp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
For xvmsubmsp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
VSX Vector Multiply Double-Precision XX3-form

See Table 106, “Vector Floating-Point Final Result,” on page 663.

src1 \ src2 | -Infinity | -NZF | -Zero | +Zero | +NZF | +Infinity | QNaN | SNaN
-Infinity | v ← +Infinity | v ← +Infinity | v ← dQNaN; vximz_flag ← 1 | v ← dQNaN; vximz_flag ← 1 | v ← –Infinity | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
-NZF | v ← +Infinity | v ← M(src1,src2) | v ← +Zero | v ← –Zero | v ← M(src1,src2) | v ← –Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
-Zero | v ← dQNaN; vximz_flag ← 1 | v ← +Zero | v ← +Zero | v ← –Zero | v ← –Zero | v ← dQNaN; vximz_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Zero | v ← dQNaN; vximz_flag ← 1 | v ← –Zero | v ← –Zero | v ← +Zero | v ← +Zero | v ← dQNaN; vximz_flag ← 1 | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+NZF | v ← –Infinity | v ← M(src1,src2) | v ← –Zero | v ← +Zero | v ← M(src1,src2) | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
+Infinity | v ← –Infinity | v ← –Infinity | v ← dQNaN; vximz_flag ← 1 | v ← dQNaN; vximz_flag ← 1 | v ← +Infinity | v ← +Infinity | v ← src2 | v ← Q(src2); vxsnan_flag ← 1
QNaN | v ← src1 | v ← src1 | v ← src1 | v ← src1 | v ← src1 | v ← src1 | v ← src1 | v ← src1; vxsnan_flag ← 1
SNaN | v ← Q(src1); vxsnan_flag ← 1 | v ← Q(src1); vxsnan_flag ← 1 | v ← Q(src1); vxsnan_flag ← 1 | v ← Q(src1); vxsnan_flag ← 1 | v ← Q(src1); vxsnan_flag ← 1 | v ← Q(src1); vxsnan_flag ← 1 | v ← Q(src1); vxsnan_flag ← 1 | v ← Q(src1); vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 The double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
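The NaN and Infinity × Zero entries of the multiply table above can be sketched as a small classifier over 64-bit bit patterns. This is only an illustration under stated assumptions: the function name `mul_special`, the flag strings, and the choice to model Q(x) as "set the most-significant fraction bit" are hypothetical and are not part of the architecture definition. Working on integers avoids any host-dependent quieting of signaling NaNs.

```python
SIGN_MASK = (1 << 63) - 1          # ANDing with this clears the sign bit
EXP_MASK = 0x7FF << 52
FRAC_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51                # assumed: Q(x) sets this bit
DQNAN = 0x7FF8_0000_0000_0000      # default quiet NaN from the Explanation


def is_nan(b):
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0


def is_snan(b):
    return is_nan(b) and (b & QUIET_BIT) == 0


def is_inf(b):
    return (b & SIGN_MASK) == EXP_MASK


def is_zero(b):
    return (b & SIGN_MASK) == 0


def mul_special(src1, src2):
    """Return (v, flags) for the special rows and columns of the table.

    v is None when the ordinary product M(src1,src2), or a signed
    Infinity or Zero, applies; those cases are not modeled here.
    """
    flags = set()
    if is_snan(src1) or is_snan(src2):
        flags.add("vxsnan")
    if is_nan(src1):                       # src1 NaN takes priority
        return src1 | QUIET_BIT, flags
    if is_nan(src2):
        return src2 | QUIET_BIT, flags
    if (is_inf(src1) and is_zero(src2)) or (is_zero(src1) and is_inf(src2)):
        flags.add("vximz")
        return DQNAN, flags
    return None, flags
```

For example, `mul_special(0x7FF0_0000_0000_0000, 0x8000_0000_0000_0000)` (+Infinity × –Zero) yields the default quiet NaN with `vximz` set, matching the corresponding table cell.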
VSX Vector Multiply Single-Precision XX3-form
See Table 106, “Vector Floating-Point Final Result,” on page 663.
src2
-Infinity -NZF -Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← dQNaN v ← Q(src2)
-Infinity v ← +Infinity v ← +Infinity v ← –Infinity v ← –Infinity v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
-NZF v ← +Infinity v ← M(src1,src2) v ← +Zero v ← –Zero v ← M(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
-Zero v ← +Zero v ← +Zero v ← –Zero v ← –Zero v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
+Zero v ← –Zero v ← –Zero v ← +Zero v ← +Zero v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
v ← Q(src2)
+NZF v ← –Infinity v ← M(src1,src2) v ← –Zero v ← +Zero v ← M(src1,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← dQNaN v ← Q(src2)
+Infinity v ← –Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
v ← src1
QNaN v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1 v ← src1
vxsnan_flag ← 1
v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1) v ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 The single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
XT ← TX || T
XB ← BX || B
For each vector element i from 0 to 1, do the following.
The contents of doubleword element i of VSR[XB], with bit 0 set to 1, are placed into doubleword element i of VSR[XT].
VSR Data Layout for xvnabsdp
src = VSR[XB]
DP DP
tgt = VSR[XT]
DP DP
0 64 127
XT ← TX || T
XB ← BX || B
For each vector element i from 0 to 3, do the following.
The contents of word element i of VSR[XB], with bit 0 set to 1, are placed into word element i of VSR[XT].
VSR Data Layout for xvnabssp
src = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
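As a rough illustration of the two descriptions above, the following sketch models a VSR as a 128-bit Python integer in which bit 0 is the most-significant bit, so "bit 0 of element i" is the sign bit of that element. The function names simply reuse the mnemonics for readability; the integer model of the register is an assumption made for this example.

```python
import struct

MASK128 = (1 << 128) - 1


def xvnabsdp(vsr_xb: int) -> int:
    """Set the sign bit of both doubleword elements of a 128-bit value."""
    sign_bits = (1 << 127) | (1 << 63)
    return (vsr_xb | sign_bits) & MASK128


def xvnabssp(vsr_xb: int) -> int:
    """Set the sign bit of all four word elements of a 128-bit value."""
    sign_bits = (1 << 127) | (1 << 95) | (1 << 63) | (1 << 31)
    return (vsr_xb | sign_bits) & MASK128


def pack_dp(a: float, b: float) -> int:
    """Hypothetical helper: place two doubles in elements 0 and 1."""
    return int.from_bytes(struct.pack(">2d", a, b), "big")


def unpack_dp(vsr: int):
    """Hypothetical helper: read back doubleword elements 0 and 1."""
    return struct.unpack(">2d", vsr.to_bytes(16, "big"))
```

With these helpers, `unpack_dp(xvnabsdp(pack_dp(1.5, -2.0)))` gives `(-1.5, -2.0)`: every element ends up with its sign bit set regardless of its original sign.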
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← –Infinity v ← src2 v ← –Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← –Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 For xvnmaddadp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
For xvnmaddmdp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
src3 For xvnmaddadp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
For xvnmaddmdp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
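The role of the unbounded-precision intermediates p and v can be illustrated with exact rational arithmetic. The sketch below is an assumption-laden model for finite operands only: it uses Python's `fractions.Fraction` to stand in for "unbounded range and precision", rounds once to double precision with round-to-nearest-even, and then negates. NaN, Infinity, the sign of an exact-zero-difference result, and other rounding modes are not modeled, and the name `nmadd_dp` is hypothetical.

```python
from fractions import Fraction


def nmadd_dp(src1: float, src2: float, src3: float) -> float:
    """Negative multiply-add for finite doubles with a single rounding."""
    p = Fraction(src1) * Fraction(src3)   # exact intermediate product p
    v = p + Fraction(src2)                # exact intermediate result v
    return -float(v)                      # round once, then negate


# Operands chosen so that rounding the product first would lose the answer.
a = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30
fused = nmadd_dp(a, -1.0, c)              # exact: -((1 - 2**-60) - 1)
unfused = -(a * c + (-1.0))               # product rounds to 1.0 first
```

Here the exact product is 1 − 2⁻⁶⁰, which a separately rounded multiply would round to 1.0, so the unfused expression evaluates to zero while the single-rounding model returns 2⁻⁶⁰.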
Is q inexact? (q ≠ v)
vxsnan_flag
vximz_flag
vxisi_flag
OE
UE
VE
XE
ZE
Case Returned Results and Status Setting
– – – – – 0 0 0 – – – – T(N(r))
0 – – – – – – 1 – – – – T(r), fx(VXISI)
0 – – – – 0 1 – – – – – T(r), fx(VXIMZ)
0 – – – – 1 0 – – – – – T(r), fx(VXSNAN)
Special 0 – – – – 1 1 – – – – – T(r), fx(VXSNAN), fx(VXIMZ)
1 – – – – – – 1 – – – – fx(VXISI), error()
1 – – – – 0 1 – – – – – fx(VXIMZ), error()
1 – – – – 1 0 – – – – – fx(VXSNAN), error()
1 – – – – 1 1 – – – – – fx(VXSNAN), fx(VXIMZ), error()
– – – – – – – – no – – – T(N(r))
– – – – 0 – – – yes no – – T(N(r)), fx(XX)
Normal – – – – 0 – – – yes yes – – T(N(r)), fx(XX)
– – – – 1 – – – yes no – – T(N(r)), fx(XX), error()
– – – – 1 – – – yes yes – – T(N(r)), fx(XX), error()
– 0 – – 0 – – – – – – – T(N(r)), fx(OX), fx(XX)
– 0 – – 1 – – – – – – – T(N(r)), fx(OX), fx(XX), error()
Overflow – 1 – – – – – – – – no – fx(OX), error()
– 1 – – – – – – – – yes no fx(OX), fx(XX), error()
– 1 – – – – – – – – yes yes fx(OX), fx(XX), error()
Explanation:
– The results do not depend on this condition.
fx(x) FX is set to 1 if x=0. x is set to 1.
q The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target precision, unbounded exponent range.
r The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target precision, bounded exponent range.
v The precise intermediate result defined in the instruction having unbounded significand precision, unbounded exponent range.
FI Floating-Point Fraction Inexact status flag, FPSCRFI. This status flag is nonsticky.
FR Floating-Point Fraction Rounded status flag, FPSCRFR.
OX Floating-Point Overflow Exception status flag, FPSCROX.
error() The system error handler is invoked for the trap-enabled exception if the FE0 and FE1 bits in the Machine State Register are set
to any mode other than the ignore-exception mode. Update of the target VSR is suppressed for all vector elements.
N(x) The value x is negated by complementing the sign bit of x.
T(x) The value x is placed in element i of VSR[XT] in the target precision format (where i ∈ {0,1} for results with 64-bit elements, and i ∈ {0,1,2,3} for results with 32-bit elements).
UX Floating-Point Underflow Exception status flag, FPSCRUX
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCRVXSNAN.
VXIMZ Floating-Point Invalid Operation Exception (Infinity × Zero) status flag, FPSCRVXIMZ.
VXISI Floating-Point Invalid Operation Exception (Infinity – Infinity) status flag, FPSCRVXISI.
XX Floating-Point Inexact Exception status flag, FPSCRXX. The flag is a sticky version of FPSCRFI. When FPSCRFI is set to a new
value, the new value of FPSCRXX is set to the result of ORing the old value of FPSCRXX with the new value of FPSCRFI.
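The fx(x) rule and the sticky relationship between XX and FI described in the Explanation can be sketched as follows. The class `Fpscr`, its dictionary of bits, and the method names are hypothetical scaffolding for illustration; only the two update rules are taken from the text above.

```python
class Fpscr:
    """Toy model of a few FPSCR status bits (not the full register)."""

    def __init__(self):
        self.bits = {"FX": 0, "OX": 0, "UX": 0, "XX": 0, "FI": 0,
                     "VXSNAN": 0, "VXIMZ": 0, "VXISI": 0}

    def fx(self, name: str) -> None:
        # fx(x): FX is set to 1 if x=0, then x is set to 1.
        if self.bits[name] == 0:
            self.bits["FX"] = 1
        self.bits[name] = 1

    def set_fi(self, value: int) -> None:
        # XX is the sticky version of FI: new XX = old XX OR new FI.
        self.bits["FI"] = value
        self.bits["XX"] |= value
```

Note that under this rule a second `fx("OX")` does not set FX again if FX was cleared in between, because OX is already 1; likewise `set_fi(0)` clears FI but leaves XX set.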
Is q inexact? (q ≠ v)
vxsnan_flag
vximz_flag
vxisi_flag
OE
UE
VE
XE
Case ZE Returned Results and Status Setting
– – 0 – – – – – no – – – T(N(r))
– – 0 – 0 – – – yes no – – T(N(r)), fx(UX), fx(XX)
– – 0 – 0 – – – yes yes – – T(N(r)), fx(UX), fx(XX)
– – 0 – 1 – – – yes no – – T(N(r)), fx(UX), fx(XX), error()
Tiny
– – 0 – 1 – – – yes yes – – T(N(r)), fx(UX), fx(XX), error()
– – 1 – – – – – yes – no – fx(UX), error()
– – 1 – – – – – yes – yes no fx(UX), fx(XX), error()
– – 1 – – – – – yes – yes yes fx(UX), fx(XX), error()
Explanation:
– The results do not depend on this condition.
fx(x) FX is set to 1 if x=0. x is set to 1.
q The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target precision, unbounded exponent range.
r The value defined in Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516, significand rounded to the target precision, bounded exponent range.
v The precise intermediate result defined in the instruction having unbounded significand precision, unbounded exponent range.
FI Floating-Point Fraction Inexact status flag, FPSCRFI. This status flag is nonsticky.
FR Floating-Point Fraction Rounded status flag, FPSCRFR.
OX Floating-Point Overflow Exception status flag, FPSCROX.
error() The system error handler is invoked for the trap-enabled exception if the FE0 and FE1 bits in the Machine State Register are set
to any mode other than the ignore-exception mode. Update of the target VSR is suppressed for all vector elements.
N(x) The value x is negated by complementing the sign bit of x.
T(x) The value x is placed in element i of VSR[XT] in the target precision format (where i ∈ {0,1} for results with 64-bit elements, and i ∈ {0,1,2,3} for results with 32-bit elements).
UX Floating-Point Underflow Exception status flag, FPSCRUX
VXSNAN Floating-Point Invalid Operation Exception (SNaN) status flag, FPSCRVXSNAN.
VXIMZ Floating-Point Invalid Operation Exception (Infinity × Zero) status flag, FPSCRVXIMZ.
VXISI Floating-Point Invalid Operation Exception (Infinity – Infinity) status flag, FPSCRVXISI.
XX Floating-Point Inexact Exception status flag, FPSCRXX. The flag is a sticky version of FPSCRFI. When FPSCRFI is set to a new
value, the new value of FPSCRXX is set to the result of ORing the old value of FPSCRXX with the new value of FPSCRFI.
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Add –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← –Infinity v ← src2 v ← –Zero v ← Rezd v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← –Infinity v ← src2 v ← Rezd v ← +Zero v ← src2 v ← +Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← –Infinity v ← A(p,src2) v←p v←p v ← A(p,src2) v ← +Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 For xvnmaddasp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).
For xvnmaddmsp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
src3 For xvnmaddasp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
For xvnmaddmsp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
A(x,y) Return the normalized sum of floating-point value x and floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← +Infinity v ← –src2 v ← –Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← Rezd v ← +Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 For xvnmsubadp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
For xvnmsubmdp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
src3 For xvnmsubadp, the double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
For xvnmsubmdp, the double-precision floating-point value in doubleword element i of VSR[XT] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
Part 1: src3
Multiply –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
p ← dQNaN p ← dQNaN p ← Q(src3)
–Infinity p ← +Infinity p ← +Infinity p ← –Infinity p ← –Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← Q(src3)
–NZF p ← +Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
–Zero p ← +Zero p ← +Zero p ← –Zero p ← –Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Zero p ← –Zero p ← –Zero p ← +Zero p ← +Zero p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
src1
p ← Q(src3)
+NZF p ← –Infinity p ← M(src1,src3) p ← src1 p ← src1 p ← M(src1,src3) p ← +Infinity p ← src3
vxsnan_flag ← 1
p ← dQNaN p ← dQNaN p ← Q(src3)
+Infinity p ← –Infinity p ← +Infinity p ← +Infinity p ← +Infinity p ← src3
vximz_flag ← 1 vximz_flag ← 1 vxsnan_flag ← 1
p ← src1
QNaN p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1 p ← src1
vxsnan_flag ← 1
p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1) p ← Q(src1)
SNaN
vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1 vxsnan_flag ← 1
Part 2: src2
Subtract –Infinity –NZF –Zero +Zero +NZF +Infinity QNaN SNaN
v ← dQNaN v ← Q(src2)
–Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← –Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
v ← Q(src2)
–NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
–Zero v ← +Infinity v ← –src2 v ← –Zero v ← Rezd v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← Q(src2)
+Zero v ← +Infinity v ← –src2 v ← Rezd v ← +Zero v ← –src2 v ← –Infinity v ← src2
vxsnan_flag ← 1
p
v ← Q(src2)
+NZF v ← +Infinity v ← S(p,src2) v←p v←p v ← S(p,src2) v ← –Infinity v ← src2
vxsnan_flag ← 1
v ← dQNaN v ← Q(src2)
+Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← +Infinity v ← src2
vxisi_flag ← 1 vxsnan_flag ← 1
QNaN & v←p
v←p v←p v←p v←p v←p v←p v←p
src1 is a NaN vxsnan_flag ← 1
QNaN & v ← Q(src2)
v←p v←p v←p v←p v←p v←p v ← src2
src1 not a NaN vxsnan_flag ← 1
Explanation:
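src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 For xvnmsubasp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).
For xvnmsubmsp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
src3 For xvnmsubasp, the single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
For xvnmsubmsp, the single-precision floating-point value in word element i of VSR[XT] (where i ∈ {0,1,2,3}).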
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs). Can also occur with two
nonzero finite number source operands.
Q(x) Return a QNaN with the payload of x.
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = y, v is considered to be an exact-zero-difference result (Rezd).
M(x,y) Return the normalized product of floating-point value x and floating-point value y, having unbounded range and precision.
p The intermediate product having unbounded range and precision.
v The intermediate result having unbounded range and precision.
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
VSR Data Layout for xvrdpi
src = VSR[XB]
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
The result is placed into doubleword element i of VSR[XT] in double-precision format.
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result
For each vector element i from 0 to 1, do the following.
Let src be the double-precision floating-point operand in doubleword element i of VSR[XB].
src is rounded to an integer using the rounding mode Round toward -Infinity.
The result is placed into doubleword element i of VSR[XT] in double-precision format.
If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].
VSR Data Layout for xvrdpim
src = VSR[XB]
DP DP
tgt = VSR[XT]
DP DP
0 64 127
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result
For each vector element i from 0 to 1, do the following.
Let src be the double-precision floating-point operand in doubleword element i of VSR[XB].
src is rounded to an integer using the rounding mode Round toward +Infinity.
The result is placed into doubleword element i of VSR[XT] in double-precision format.
If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].
VSR Data Layout for xvrdpip
src = VSR[XB]
DP DP
tgt = VSR[XT]
DP DP
0 64 127
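The per-element rounding and the rule that a trap-enabled exception suppresses the whole vector update can be sketched as below. This is a simplified model, not the architected RTL: elements are passed as 64-bit bit patterns, `ve` stands for the VE enable bit, the only exception modeled is a signaling NaN operand (assumed to be quieted by setting the most-significant fraction bit), and all function names are hypothetical.

```python
import math
import struct

EXP_MASK = 0x7FF << 52
FRAC_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51


def dp_bits(x: float) -> int:
    """Hypothetical helper: bit pattern of a double."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def round_element(bits: int, rounder):
    """Return (result_bits, vxsnan_flag) for one doubleword element."""
    if (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK):
        # NaN: return it quieted; flag only if it was signaling.
        return bits | QUIET_BIT, (bits & QUIET_BIT) == 0
    x = struct.unpack(">d", struct.pack(">Q", bits))[0]
    if math.isinf(x) or x == 0.0:
        return bits, False
    # copysign keeps the sign of a zero result, e.g. ceil(-0.5) is -0.0.
    r = math.copysign(float(rounder(x)), x)
    return dp_bits(r), False


def round_vector(src, old_target, rounder, ve: bool):
    """Round both elements; keep old_target if a trap-enabled exception occurs."""
    result, ex_flag = [], False
    for bits in src:
        r, vxsnan = round_element(bits, rounder)
        result.append(r)
        ex_flag = ex_flag or (ve and vxsnan)
    return old_target if ex_flag else result
```

Passing `math.floor` as `rounder` corresponds to Round toward -Infinity and `math.ceil` to Round toward +Infinity. With `ve=True`, a single signaling NaN in either element leaves the old target untouched, including the element that rounded without error.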
60 T /// B 217 BX TX
0 6 11 16 21 30 31
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
do i=0 to 127 by 64
reset_xflags()
result{i:i+63} ← RoundToDPIntegerTrunc(VSR[XB]{i:i+63})
if(vxsnan_flag) then SetFX(VXSNAN)
ex_flag ← ex_flag | (VE & vxsnan_flag)
end
if( ex_flag = 0 ) then VSR[XT] ← result
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
For each vector element i from 0 to 1, do the following.
Let src be the double-precision floating-point operand in doubleword element i of VSR[XB].
$$\left| \frac{\text{estimate} - \frac{1}{\text{src}}}{\frac{1}{\text{src}}} \right| \le \frac{1}{16384}$$
VSR Data Layout for xvredp
src = VSR[XB]
DP DP
tgt = VSR[XT]
DP DP
0 64 127
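The reciprocal-estimate error bound can be checked numerically with a few lines of code. The sketch below is only a test aid under the assumption that the bound is read as a relative error of at most one part in 16384; the function name and the `parts` parameter are hypothetical, and ordinary host double-precision arithmetic is used for the reference value.

```python
def within_estimate_bound(estimate: float, src: float, parts: int = 16384) -> bool:
    """True if estimate is within 1/parts relative error of 1/src.

    Only meaningful when 1/src is a nonzero finite number.
    """
    exact = 1.0 / src
    return abs((estimate - exact) / exact) <= 1.0 / parts
```

For instance, an estimate of 0.5 for src = 2.0 trivially satisfies the bound, while an estimate that is off by one part in 10000 does not.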
if( ex_flag = 0 ) then VSR[XT] ← result
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
For each vector element i from 0 to 3, do the following.
Let src be the single-precision floating-point operand in word element i of VSR[XB].
$$\left| \frac{\text{estimate} - \frac{1}{\text{src}}}{\frac{1}{\text{src}}} \right| \le \frac{1}{16384}$$
VSR Data Layout for xvresp
src = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
VSX Vector Round to Single-Precision Integer using round to Nearest Away XX2-form
xvrspi XT,XB
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
VSR Data Layout for xvrspi
src = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
VSX Vector Round to Single-Precision Integer Exact using Current rounding mode XX2-form
xvrspic XT,XB
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
The result is placed into word element i of VSR[XT] in single-precision format.
If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].
Special Registers Altered
FX XX VXSNAN
VSX Vector Round to Single-Precision Integer using round toward -Infinity XX2-form
xvrspim XT,XB
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result
For each vector element i from 0 to 3, do the following.
Let src be the single-precision floating-point operand in word element i of VSR[XB].
src is rounded to an integer using the rounding mode Round toward -Infinity.
The result is placed into word element i of VSR[XT] in single-precision format.
If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].
VSR Data Layout for xvrspim
src = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
VSX Vector Round to Single-Precision Integer using round toward +Infinity XX2-form
xvrspip XT,XB
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result
For each vector element i from 0 to 3, do the following.
Let src be the single-precision floating-point operand in word element i of VSR[XB].
src is rounded to an integer using the rounding mode Round toward +Infinity.
The result is placed into word element i of VSR[XT] in single-precision format.
If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].
VSR Data Layout for xvrspip
src = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
VSX Vector Round to Single-Precision Integer using round toward Zero XX2-form
xvrspiz XT,XB
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
For each vector element i from 0 to 3, do the following.
Let src be the single-precision floating-point operand in word element i of VSR[XB].
src is rounded to an integer using the rounding mode Round toward Zero.
The result is placed into word element i of VSR[XT] in single-precision format.
If a trap-enabled exception occurs in any element of the vector, no results are written to VSR[XT].
Special Registers Altered
FX VXSNAN
VSR Data Layout for xvrspiz
src = VSR[XB]
SP SP SP SP
tgt = VSR[XT]
SP SP SP SP
0 32 64 96 127
VSX Vector Reciprocal Square Root Estimate Double-Precision XX2-form
xvrsqrtedp XT,XB
XT ← TX || T
XB ← BX || B
ex_flag ← 0b0
if( ex_flag = 0 ) then VSR[XT] ← result
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
For each vector element i from 0 to 1, do the following.
Let src be the double-precision floating-point operand in doubleword element i of VSR[XB].
A double-precision floating-point estimate of the reciprocal square root of src is placed into doubleword element i of VSR[XT] in double-precision format.
Unless the reciprocal of the square root of src would be a zero, an infinity, or a QNaN, the estimate has a relative error in precision no greater than one part in 16384 of the reciprocal of the square root of src. That is,
$$\left| \frac{\text{estimate} - \frac{1}{\sqrt{\text{src}}}}{\frac{1}{\sqrt{\text{src}}}} \right| \le \frac{1}{16384}$$
Operation with various special values of the operand is summarized below.
VSX Vector Reciprocal Square Root Estimate Source Value Result Exception
Single-Precision XX2-form
–Infinity QNaN1 VXSQRT
xvrsqrtesp XT,XB +Infinity +Zero None
VSX Vector Square Root Double-Precision XX2-form
See Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516.
src          Intermediate result and flags
-Infinity    v ← dQNaN, vxsqrt_flag ← 1
-NZF         v ← dQNaN, vxsqrt_flag ← 1
-Zero        v ← +Zero
+Zero        v ← +Zero
+NZF         v ← SQRT(src)
+Infinity    v ← +Infinity
QNaN         v ← src
SNaN         v ← Q(src), vxsnan_flag ← 1
Explanation:
src The double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
SQRT(x) The unbounded-precision square root of the floating-point value x.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
VSX Vector Square Root Single-Precision XX2-form
See Table 58, “Scalar Floating-Point Intermediate Result Handling,” on page 516.
src          Intermediate result and flags
-Infinity    v ← dQNaN, vxsqrt_flag ← 1
-NZF         v ← dQNaN, vxsqrt_flag ← 1
-Zero        v ← +Zero
+Zero        v ← +Zero
+NZF         v ← SQRT(src)
+Infinity    v ← +Infinity
QNaN         v ← src
SNaN         v ← Q(src), vxsnan_flag ← 1
Explanation:
src The single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
SQRT(x) The unbounded-precision square root of the floating-point value x.
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
VSX Vector Subtract Double-Precision XX3-form
The result is placed into doubleword element i of VSR[XT] in double-precision format.
1. Floating-point addition is based on exponent comparison and addition of the two significands. The exponents of the two operands are compared, and the significand accompanying the smaller exponent is shifted right, with its exponent increased by one for each bit shifted, until the two exponents are equal. The two significands are then added or subtracted as appropriate, depending on the signs of the operands, to form an intermediate sum. All 53 bits of the significand as well as all three guard bits (G, R, and X) enter into the computation.
2. Floating-point normalization is based on shifting the significand left until the most-significant bit is 1 and decrementing the exponent by the number of bits the significand was shifted.
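The alignment and normalization steps described in the two notes above can be sketched with integer significands. This is a simplified illustration, not the architected datapath: it handles magnitudes of like sign only, treats a significand as 53 bits followed by the three guard bits (56 bits in total), assumes the lowest bit X acts as a sticky bit that records any 1s shifted out, and ignores a possible carry out of the sum. The function names are hypothetical.

```python
def add_magnitudes(sig_a: int, exp_a: int, sig_b: int, exp_b: int):
    """Align two (significand, exponent) pairs and add the significands."""
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    shift = exp_a - exp_b
    # Shift the significand with the smaller exponent right; its exponent
    # rises by one per bit shifted until both exponents are equal.
    lost = sig_b & ((1 << shift) - 1)
    sig_b = (sig_b >> shift) | (1 if lost else 0)   # fold lost bits into X
    return sig_a + sig_b, exp_a


def normalize(sig: int, exp: int, width: int = 56):
    """Shift left until the most-significant bit of the significand is 1."""
    if sig == 0:
        return 0, exp
    while sig < (1 << (width - 1)):
        sig <<= 1
        exp -= 1          # one decrement per bit shifted
    return sig, exp
```

As a small example with a 4-bit width, `normalize(0b0011, 5, width=4)` shifts left twice and returns `(0b1100, 3)`.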
src1 \ src2  -Infinity         -NZF              -Zero             +Zero             +NZF              +Infinity         QNaN              SNaN
-Infinity    v ← dQNaN         v ← –Infinity     v ← –Infinity     v ← –Infinity     v ← –Infinity     v ← –Infinity     v ← src2          v ← Q(src2)
             vxisi_flag ← 1                                                                                                                vxsnan_flag ← 1
-NZF         v ← +Infinity     v ← S(src1,src2)  v ← src1          v ← src1          v ← S(src1,src2)  v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
-Zero        v ← +Infinity     v ← –src2         v ← –Zero         v ← Rezd          v ← –src2         v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
+Zero        v ← +Infinity     v ← –src2         v ← Rezd          v ← +Zero         v ← –src2         v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
+NZF         v ← +Infinity     v ← S(src1,src2)  v ← src1          v ← src1          v ← S(src1,src2)  v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
+Infinity    v ← +Infinity     v ← +Infinity     v ← +Infinity     v ← +Infinity     v ← +Infinity     v ← dQNaN         v ← src2          v ← Q(src2)
                                                                                                       vxisi_flag ← 1                      vxsnan_flag ← 1
QNaN         v ← src1          v ← src1          v ← src1          v ← src1          v ← src1          v ← src1          v ← src1          v ← src1
                                                                                                                                           vxsnan_flag ← 1
SNaN         v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)
             vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1
Explanation:
src1 The double-precision floating-point value in doubleword element i of VSR[XA] (where i ∈ {0,1}).
src2 The double-precision floating-point value in doubleword element i of VSR[XB] (where i ∈ {0,1}).
dQNaN Default quiet NaN (0x7FF8_0000_0000_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
VSX Vector Subtract Single-Precision The result is placed into word element i of VSR[XT]
XX3-form in single-precision format.
do i=0 to 127 by 32
reset_xflags()
VSR Data Layout for xvsubsp
src1 ← VSR[XA]{i:i+31}
src2 ← VSR[XB]{i:i+31} src1 = VSR[XA]
v{0:inf} ← AddSP(src1,NegateSP(src2))
SP SP SP SP
result{i:i+31} ← RoundToSP(RN,v)
if(vxsnan_flag) then SetFX(VXSNAN) src2 = VSR[XB]
if(vxisi_flag) then SetFX(VXISI)
if(ox_flag) then SetFX(OX) SP SP SP SP
if(ux_flag) then SetFX(UX)
if(xx_flag) then SetFX(XX) tgt = VSR[XT]
ex_flag ← ex_flag | (VE & vxsnan_flag) SP SP SP SP
ex_flag ← ex_flag | (VE & vxisi_flag)
0 32 64 96 127
ex_flag ← ex_flag | (OE & ox_flag)
ex_flag ← ex_flag | (UE & ux_flag)
ex_flag ← ex_flag | (XE & xx_flag)
end
1. Floating-point addition is based on exponent comparison and addition of the two significands. The exponents of the two operands are compared,
and the significand accompanying the smaller exponent is shifted right, with its exponent increased by one for each bit shifted, until the two expo-
nents are equal. The two significands are then added or subtracted as appropriate, depending on the signs of the operands, to form an interme-
diate sum. All 53 bits of the significand as well as all three guard bits (G, R, and X) enter into the computation.
2. Floating-point normalization is based on shifting the significand left until the most-significant bit is 1 and decrementing the exponent by the num-
ber of bits the significand was shifted.
src1 \ src2  -Infinity         -NZF              -Zero             +Zero             +NZF              +Infinity         QNaN              SNaN
-Infinity    v ← dQNaN         v ← –Infinity     v ← –Infinity     v ← –Infinity     v ← –Infinity     v ← –Infinity     v ← src2          v ← Q(src2)
             vxisi_flag ← 1                                                                                                                vxsnan_flag ← 1
-NZF         v ← +Infinity     v ← S(src1,src2)  v ← src1          v ← src1          v ← S(src1,src2)  v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
-Zero        v ← +Infinity     v ← –src2         v ← –Zero         v ← Rezd          v ← –src2         v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
+Zero        v ← +Infinity     v ← –src2         v ← Rezd          v ← +Zero         v ← –src2         v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
+NZF         v ← +Infinity     v ← S(src1,src2)  v ← src1          v ← src1          v ← S(src1,src2)  v ← –Infinity     v ← src2          v ← Q(src2)
                                                                                                                                           vxsnan_flag ← 1
+Infinity    v ← +Infinity     v ← +Infinity     v ← +Infinity     v ← +Infinity     v ← +Infinity     v ← dQNaN         v ← src2          v ← Q(src2)
                                                                                                       vxisi_flag ← 1                      vxsnan_flag ← 1
QNaN         v ← src1          v ← src1          v ← src1          v ← src1          v ← src1          v ← src1          v ← src1          v ← src1
                                                                                                                                           vxsnan_flag ← 1
SNaN         v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)       v ← Q(src1)
             vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1   vxsnan_flag ← 1
Explanation:
src1 The single-precision floating-point value in word element i of VSR[XA] (where i ∈ {0,1,2,3}).
src2 The single-precision floating-point value in word element i of VSR[XB] (where i ∈ {0,1,2,3}).
dQNaN Default quiet NaN (0x7FC0_0000).
NZF Nonzero finite number.
Rezd Exact-zero-difference result (addition of two finite numbers having same magnitude but different signs).
S(x,y) Return the normalized sum of floating-point value x and negated floating-point value y, having unbounded range and precision.
Note: If x = -y, v is considered to be an exact-zero-difference result (Rezd).
Q(x) Return a QNaN with the payload of x.
v The intermediate result having unbounded significand precision and unbounded exponent range.
VSX Vector Test for software Divide fg_flag is set to 1 for any of the following
Double-Precision XX3-form conditions.
– src1 is an infinity.
xvtdivdp BF,XA,XB – src2 is a zero, an infinity, or a denormalized
value.
60 BF // A B 125 AX BX /
0 6 9 11 16 21 29 30 31 CR field BF is set to the value
XA ← AX || A
0b1 || fg_flag || fe_flag || 0b0.
XB ← BX || B
eq_flag ← 0b0 Special Registers Altered
gt_flag ← 0b0 CR[BF]
do i=0 to 127 by 64
src1 ← VSR[XA]{i:i+63} VSR Data Layout for xvtdivdp
src2 ← VSR[XB]{i:i+63}
src1 = VSR[XA]
e_a ← src1{1:11} - 1023
e_b ← src2{1:11} - 1023 .dword[0] .dword[1]
fe_flag ← fe_flag | IsNaN(src1) | IsInf(src1) |
IsNaN(src2) | IsInf(src2) | IsZero(src2) | src2 = VSR[XB]
( e_b <= -1022 ) |
( e_b >= 1021 ) | .dword[0] .dword[1]
( !IsZero(src1) & ( (e_a - e_b) >= 1023 ) ) | 0 64 127
( !IsZero(src1) & ( (e_a - e_b) <= -1021 ) ) |
( !IsZero(src1) & ( e_a <= -970 ) )
fg_flag ← fg_flag | IsInf(src1) | IsInf(src2) |
IsZero(src2) | IsDen(src2)
end
fe_flag is initialized to 0.
fg_flag is initialized to 0.
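The fe_flag and fg_flag conditions above can be sketched in Python as follows. This is only an illustrative model, not the architected definition: the helper names (`tdiv_flags`, `cr_field`, `_biased_exp`, `_is_den`) are hypothetical, and each vector is modelled as a list of Python floats, one per doubleword element.

```python
import math
import struct

def _biased_exp(x):
    """Return the 11-bit biased exponent field (bits 1:11) of a double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return (bits >> 52) & 0x7FF

def _is_den(x):
    """True for a denormalized (subnormal) value: nonzero with a zero exponent field."""
    return x != 0.0 and _biased_exp(x) == 0

def tdiv_flags(pairs):
    """Model of the xvtdivdp flag computation over (src1, src2) element pairs."""
    fe = fg = False  # both flags are initialized to 0 and ORed across elements
    for src1, src2 in pairs:
        e_a = _biased_exp(src1) - 1023
        e_b = _biased_exp(src2) - 1023
        nonzero1 = src1 != 0.0
        fe |= (math.isnan(src1) or math.isinf(src1)
               or math.isnan(src2) or math.isinf(src2) or src2 == 0.0
               or e_b <= -1022 or e_b >= 1021
               or (nonzero1 and e_a - e_b >= 1023)
               or (nonzero1 and e_a - e_b <= -1021)
               or (nonzero1 and e_a <= -970))
        fg |= (math.isinf(src1) or math.isinf(src2)
               or src2 == 0.0 or _is_den(src2))
    return fe, fg

def cr_field(fe, fg):
    """CR field BF is 0b1 || fg_flag || fe_flag || 0b0."""
    return 0b1000 | (int(fg) << 2) | (int(fe) << 1)
```

For example, `cr_field(*tdiv_flags([(1.0, 2.0), (3.0, 4.0)]))` yields 0b1000, while a zero divisor in either element yields 0b1110.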
VSX Vector Test for software Divide fg_flag is set to 1 for any of the following
Single-Precision XX3-form conditions.
– src1 is an infinity.
xvtdivsp BF,XA,XB – src2 is a zero, an infinity, or a denormalized
value.
60 BF // A B 93 AX BX /
0 6 9 11 16 21 29 30 31 CR field BF is set to the value
XA ← AX || A
0b1 || fg_flag || fe_flag || 0b0.
XB ← BX || B
eq_flag ← 0b0 Special Registers Altered
gt_flag ← 0b0 CR[BF]
do i=0 to 127 by 32
VSR Data Layout for xvtdivsp
src1 ← VSR[XA]{i:i+31}
src2 ← VSR[XB]{i:i+31} src1 = VSR[XA]
e_a ← src1{1:8} - 127
.word[0] .word[1] .word[2] .word[3]
e_b ← src2{1:8} - 127
fe_flag ← fe_flag | IsNaN(src1) | IsInf(src1) | src2 = VSR[XB]
IsNaN(src2) | IsInf(src2) | IsZero(src2) |
( e_b <= -126 ) | .word[0] .word[1] .word[2] .word[3]
( e_b >= 125 ) | 0 32 64 96 127
( !IsZero(src1) & ( (e_a - e_b) >= 127 ) ) |
( !IsZero(src1) & ( (e_a - e_b) <= -125 ) ) |
( !IsZero(src1) & ( e_a <= -103 ) )
fg_flag ← fg_flag | IsInf(src1) | IsInf(src2) |
IsZero(src2) | IsDen(src2)
end
fe_flag is initialized to 0.
fg_flag is initialized to 0.
VSX Vector Test for software Square Root VSX Vector Test for software Square Root
Double-Precision XX2-form Single-Precision XX2-form
xvtsqrtdp BF,XB xvtsqrtsp BF,XB
XB ← BX || B XB ← BX || B
fe_flag ← 0b0 fe_flag ← 0b0
fg_flag ← 0b0 fg_flag ← 0b0
For each vector element i from 0 to 1, do the following. For each vector element i from 0 to 3, do the following.
Let src be the double-precision floating-point Let src be the single-precision floating-point
operand in doubleword element i of VSR[XB]. operand in word element i of VSR[XB].
Let e_b be the unbiased exponent of src. Let e_b be the unbiased exponent of src.
fe_flag is set to 1 for any of the following fe_flag is set to 1 for any of the following
conditions. conditions.
– src is a zero, a NaN, an infinity, or a negative – src is a zero, a NaN, an infinity, or a negative
value. value.
– e_b is less than or equal to -970. – e_b is less than or equal to -103.
fg_flag is set to 1 for the following condition. fg_flag is set to 1 for the following condition.
– src is a zero, an infinity, or a denormalized – src is a zero, an infinity, or a denormalized
value. value.
VSR Data Layout for xvtsqrtdp VSR Data Layout for xvtsqrtsp
src = VSR[XB] src = VSR[XB]
.dword[0] .dword[1] .word[0] .word[1] .word[2] .word[3]
0 64 127 0 32 64 96 127
VSX Vector Test Data Class Double-Precision XX2-form
xvtstdcdp XT,XB,DCMX
60 T dx B 15 dc 5 dm BX TX
0 6 11 16 21 25 26 29 30 31
if MSR.VSX=0 then VSX_Unavailable()
if match = 1 then
   VSR[32×TX+T].dword[i] ← 0xFFFF_FFFF_FFFF_FFFF
else
   VSR[32×TX+T].dword[i] ← 0x0000_0000_0000_0000
end
Let XB be the sum 32×BX + B.
Let XT be the sum 32×TX + T.
Let DCMX be the value dc concatenated with dm concatenated with dx.
For each integer value i from 0 to 1, do the following.
   Let src be the double-precision floating-point value in doubleword element i of VSR[XB].
VSX Vector Test Data Class Single-Precision XX2-form
xvtstdcsp XT,XB,DCMX
60 T dx B 13 dc 5 dm BX TX
0 6 11 16 21 25 26 29 30 31
if MSR.VSX=0 then VSX_Unavailable()
if match = 1 then
   VSR[32×TX+T].word[i] ← 0xFFFF_FFFF
else
   VSR[32×TX+T].word[i] ← 0x0000_0000
end
Let XB be the sum 32×BX + B.
Let XT be the sum 32×TX + T.
Let DCMX be the value dc concatenated with dm concatenated with dx.
For each integer value i from 0 to 3, do the following.
   Let src be the single-precision floating-point value in word element i of VSR[XB].
60 T 0 B 475 BX TX 60 T 8 B 475 BX TX
0 6 11 16 21 30 31 0 6 11 16 21 30 31
do i = 0 to 1 do i = 0 to 3
src ← VSR[32×BX+B].dword[i]    src ← VSR[32×BX+B].word[i]
VSR[32×TX+T].dword[i] ← EXTZ64(src.bit[1:11])    VSR[32×TX+T].word[i] ← EXTZ32(src.bit[1:8])
end end
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 3, do the following.
Let src be the double-precision floating-point Let src be the single-precision floating-point value
value in doubleword element i of VSR[XB]. in word element i of VSR[XB].
The value of the exponent field in src is placed The value of the exponent field in src is placed
into doubleword element i of VSR[XT] in unsigned into word element i of VSR[XT] in unsigned integer
integer format. format.
60 T 1 B 475 BX TX 60 T 9 B 475 BX TX
0 6 11 16 21 30 31 0 6 11 16 21 30 31
do i = 0 to 1 do i = 0 to 3
src ← VSR[32×BX+B].dword[i]    src ← VSR[32×BX+B].word[i]
exponent ← EXTZ(src.bit[1:11])    exponent ← EXTZ(src.bit[1:8])
fraction ← EXTZ64(src.bit[12:63])    fraction ← EXTZ32(src.bit[9:31])
if (exponent != 0) & (exponent != 2047) then    if (exponent != 0) & (exponent != 255) then
fraction ← fraction | 0x0010_0000_0000_0000    fraction ← fraction | 0x0080_0000
For each integer value i from 0 to 1, do the following. For each integer value i from 0 to 3, do the following.
Let src be the double-precision floating-point Let src be the single-precision floating-point value
value in doubleword element i of VSR[XB]. in word element i of VSR[XB].
The significand of src is placed into doubleword The significand of src is placed into word element
element i of VSR[XT] in unsigned integer format. If i of VSR[XT] in unsigned integer format. If src is a
src is a normal value, the implicit leading bit is set normal value, the implicit leading bit is set to 1.
to 1.
Special Registers Altered:
Special Registers Altered: None
None
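The double-precision exponent and significand extraction described above can be modelled on a single element as in the sketch below. The function names `xexp_dp` and `xsig_dp` are hypothetical, and a Python float stands in for one doubleword element of VSR[XB].

```python
import struct

def _dp_bits(x):
    """Return the 64-bit IEEE 754 image of a Python float (bit 0 = sign, MSB)."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def xexp_dp(x):
    """Exponent field (bits 1:11), zero-extended, as an unsigned integer."""
    return (_dp_bits(x) >> 52) & 0x7FF

def xsig_dp(x):
    """Significand: fraction bits 12:63, plus the implicit bit for normal values."""
    bits = _dp_bits(x)
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent != 0 and exponent != 2047:
        fraction |= 0x0010_0000_0000_0000  # implicit leading 1 of a normal value
    return fraction
```

Zero, denormal, infinity and NaN operands keep their fraction unchanged because the exponent test excludes the all-zeros and all-ones exponent fields.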
VSX Vector Byte-Reverse Quadword XX2-form
xxbrq XT,XB
do i = 0 to 15
   VSR[32×TX+T].byte[i] ← VSR[32×BX+B].byte[15-i]
end
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
For each integer value i from 0 to 15, do the following.
   The contents of byte sub-element 15-i of VSR[XB] are placed into byte sub-element i of VSR[XT].
Special Registers Altered:
   None
VSX Vector Byte-Reverse Word XX2-form
xxbrw XT,XB
do i = 0 to 3
   do j = 0 to 3
      VSR[32×TX+T].word[i].byte[j] ← VSR[32×BX+B].word[i].byte[3-j]
   end
end
Let XT be the value 32×TX + T.
Let XB be the value 32×BX + B.
For each integer value i from 0 to 3, do the following.
   The contents of byte 3 of word element i of VSR[XB] are placed into byte 0 of word element i of VSR[XT].
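The two byte-reverse operations above can be sketched in Python, modelling a VSR as a 16-byte `bytes` object with byte 0 first. The function names simply mirror the mnemonics; this is an illustration of the RTL loops, not an implementation of the instructions.

```python
def xxbrq(xb: bytes) -> bytes:
    """Reverse all 16 bytes: target byte i takes source byte 15-i."""
    assert len(xb) == 16
    return xb[::-1]

def xxbrw(xb: bytes) -> bytes:
    """Reverse the 4 bytes inside each word element: byte j takes byte 3-j."""
    assert len(xb) == 16
    return b"".join(xb[i:i + 4][::-1] for i in range(0, 16, 4))
```

Applying either function twice returns the original register contents, which is a quick sanity check for an emulator.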
VSX Vector Extract Unsigned Word XX2-form VSX Vector Insert Word XX2-form
xxextractuw XT,XB,UIM xxinsertw XT,XB,UIM
VSX Logical AND XX3-form VSX Logical AND with Complement XX3-form
xxland XT,XA,XB xxlandc XT,XA,XB
XT ← TX || T XT ← TX || T
XA ← AX || A XA ← AX || A
XB ← BX || B XB ← BX || B
VSR[XT] ← VSR[XA] & VSR[XB] VSR[XT] ← VSR[XA] & ~VSR[XB]
The contents of VSR[XA] are ANDed with the contents The contents of VSR[XA] are ANDed with the
of VSR[XB] and the result is placed into VSR[XT]. complement of the contents of VSR[XB] and the result is
placed into VSR[XT].
Special Registers Altered
None
VSR Data Layout for xxland and xxlandc
src1 = VSR[XA]
src2 = VSR[XB]
tgt = VSR[XT]
0 127
The contents of VSR[XA] are exclusive-ORed with the The contents of VSR[XA] are ANDed with the contents
contents of VSR[XB] and the complemented result is of VSR[XB] and the complemented result is placed into
placed into VSR[XT]. VSR[XT].
VSR Data Layout for xxleqv VSR Data Layout for xxlnand
src = VSR[XA] src = VSR[XA]
0 127 0 127
src2 = VSR[XB]
src2 = VSR[XB]
tgt = VSR[XT]
tgt = VSR[XT]
0 127
0 127
The contents of VSR[XA] are ORed with the contents of The contents of VSR[XA] are exclusive-ORed with the
VSR[XB] and the result is placed into VSR[XT]. contents of VSR[XB] and the result is placed into
VSR[XT].
Special Registers Altered
None Special Registers Altered
None
VSR Data Layout for xxlor
src1 = VSR[XA] VSR Data Layout for xxlxor
src1 = VSR[XA]
src2 = VSR[XB]
src2 = VSR[XB]
tgt = VSR[XT]
tgt = VSR[XT]
0 127
0 127
VSX Merge High Word XX3-form VSX Merge Low Word XX3-form
xxmrghw XT,XA,XB xxmrglw XT,XA,XB
60 T A B 18 AX BX TX 60 T A B 50 AXBX TX
0 6 11 16 21 29 30 31 0 6 11 16 21 29 30 31
The contents of word element 0 of VSR[XA] are placed The contents of word element 2 of VSR[XA] are placed
into word element 0 of VSR[XT]. into word element 0 of VSR[XT].
The contents of word element 0 of VSR[XB] are placed The contents of word element 2 of VSR[XB] are placed
into word element 1 of VSR[XT]. into word element 1 of VSR[XT].
The contents of word element 1 of VSR[XA] are placed The contents of word element 3 of VSR[XA] are placed
into word element 2 of VSR[XT]. into word element 2 of VSR[XT].
The contents of word element 1 of VSR[XB] are placed The contents of word element 3 of VSR[XB] are placed
into word element 3 of VSR[XT]. into word element 3 of VSR[XT].
VSR Data Layout for xxmrghw VSR Data Layout for xxmrglw
src1 = VSR[XA] src1 = VSR[XA]
.word[0] .word[1] unused unused unused unused .word[2] .word[3]
src2 = VSR[XB] src2 = VSR[XB]
.word[0] .word[1] unused unused unused unused .word[2] .word[3]
tgt = VSR[XT] tgt = VSR[XT]
.word[0] .word[1] .word[2] .word[3] .word[0] .word[1] .word[2] .word[3]
0 32 64 96 127 0 32 64 96 127
60 T A B 26 AXBXTX 60 T A B 58 AXBXTX
0 6 11 16 21 29 30 31 0 6 11 16 21 29 30 31
do i = 0 to 15 do i = 0 to 15
idx ← pcv.byte[i].bit[3:7] idx ← pcv.byte[i].bit[3:7]
VSR[32×TX+T].byte[i] ← src.byte[idx] VSR[32×TX+T].byte[i] ← src.byte[31-idx]
end end
Let bytes 0:15 of src be the contents of VSR[XA]. Let bytes 0:15 of src be the contents of VSR[XA].
Let bytes 16:31 of src be the contents of VSR[XT]. Let bytes 16:31 of src be the contents of VSR[XT].
Let the permute control vector pcv be the contents of Let the permute control vector pcv be the contents of
VSR[XB]. VSR[XB].
For each integer value i from 0 to 15, do the following. For each integer value i from 0 to 15, do the following.
Let idx be the unsigned integer in bits 3:7 of byte Let idx be the unsigned integer in bits 3:7 of byte
element i of pcv. element i of pcv.
The contents of byte element idx of src is placed The contents of byte element 31-idx of src is
into byte element i of VSR[XT]. placed into byte element i of VSR[XT].
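The two permute loops above can be sketched in Python as follows. Each register is modelled as a 16-byte `bytes` object; the function names `permute` and `permute_right` are hypothetical labels for the left-hand and right-hand RTL respectively. Bits 3:7 of a byte are its five low-order bits under the big-endian bit numbering used here.

```python
def permute(xa: bytes, xt: bytes, xb: bytes) -> bytes:
    """Left-hand form: target byte i takes src.byte[idx]."""
    src = xa + xt                      # bytes 0:15 from VSR[XA], 16:31 from VSR[XT]
    return bytes(src[pcv & 0x1F] for pcv in xb)

def permute_right(xa: bytes, xt: bytes, xb: bytes) -> bytes:
    """Right-hand form: target byte i takes src.byte[31-idx]."""
    src = xa + xt
    return bytes(src[31 - (pcv & 0x1F)] for pcv in xb)
```

Because VSR[XT] is both a source and the target, the functions take the old XT contents as an argument and return the new contents.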
VSX Shift Left Double by Word Immediate XX3-form
xxsldwi XT,XA,XB,SHW
60 T A B 0 SHW 2 AX BX TX
0 6 11 16 21 22 24 29 30 31
if MSR.VSX=0 then VSX_Unavailable()
source.qword[0] ← VSR[32×AX+A]
source.qword[1] ← VSR[32×BX+B]
VSR[32×TX+T] ← source.word[SHW:SHW+3]
VSX Splat Word XX2-form
xxspltw XT,XB,UIM
60 T /// UIM B 164 BX TX
0 6 11 14 16 21 30 31
if MSR.VSX=0 then VSX_Unavailable()
VSR[32×TX+T].word[0] ← VSR[32×BX+B].word[UIM]
VSR[32×TX+T].word[1] ← VSR[32×BX+B].word[UIM]
VSR[32×TX+T].word[2] ← VSR[32×BX+B].word[UIM]
VSR[32×TX+T].word[3] ← VSR[32×BX+B].word[UIM]
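A minimal Python sketch of the xxsldwi and xxspltw RTL above, modelling each VSR as a list of four word elements (word 0 first). The list-based representation is an assumption made for illustration only.

```python
def xxsldwi(xa, xb, shw):
    """Select words SHW..SHW+3 of the 8-word concatenation VSR[XA] || VSR[XB]."""
    assert len(xa) == 4 and len(xb) == 4 and 0 <= shw <= 3
    source = list(xa) + list(xb)
    return source[shw:shw + 4]

def xxspltw(xb, uim):
    """Replicate word element UIM of VSR[XB] into all four word elements."""
    assert len(xb) == 4 and 0 <= uim <= 3
    return [xb[uim]] * 4
```

With SHW=0 the result is simply VSR[XA]; larger values shift words of VSR[XB] in from the right.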
60 T 0 IMM8 360 TX
0 6 11 13 21 31
do i = 0 to 15
   VSR[32×TX+T].byte[i] ← UIM8
end
If (FRB)[1:11] > 896 and (FRB)[1:11] < 1151 then goto Normal Operand
Zero Operand:
   FRT ← (FRB)
   If (FRB)[0] = 0 then FPSCR[FPRF] ← “+ zero”
   If (FRB)[0] = 1 then FPSCR[FPRF] ← “- zero”
   FPSCR[FR FI] ← 0b00
   Done
Infinity Operand:
   FRT ← (FRB)
   If (FRB)[0] = 0 then FPSCR[FPRF] ← “+ infinity”
   If (FRB)[0] = 1 then FPSCR[FPRF] ← “- infinity”
   FPSCR[FR FI] ← 0b00
   Done
QNaN Operand:
   FRT ← (FRB)[0:34] || ²⁹0
   FPSCR[FPRF] ← “QNaN”
   FPSCR[FR FI] ← 0b00
   Done
SNaN Operand:
   FPSCR[VXSNAN] ← 1
   If FPSCR[VE] = 0 then
   Do
      FRT[0:11] ← (FRB)[0:11]
      FRT[12] ← 1
      FRT[13:63] ← (FRB)[13:34] || ²⁹0
      FPSCR[FPRF] ← “QNaN”
   End
   FPSCR[FR FI] ← 0b00
   Done
Normal Operand:
   sign ← (FRB)[0]
   exp ← (FRB)[1:11] - 1023
   frac[0:52] ← 0b1 || (FRB)[12:63]
   Round Single(sign,exp,frac[0:52],0,0,0)
   FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]
   If exp > 127 and FPSCR[OE] = 0 then go to Disabled Exponent Overflow
   If exp > 127 and FPSCR[OE] = 1 then go to Enabled Overflow
   FRT[0] ← sign
   FRT[1:11] ← exp + 1023
   FRT[12:63] ← frac[1:52]
   If sign = 0 then FPSCR[FPRF] ← “+ normal number”
   If sign = 1 then FPSCR[FPRF] ← “- normal number”
   Done
Round Single(sign,exp,frac[0:52],G,R,X):
   inc ← 0
   lsb ← frac[23]
   gbit ← frac[24]
   rbit ← frac[25]
   xbit ← (frac[26:52] || G || R || X) ≠ 0
   If FPSCR[RN] = 0b00 then /* Round to Nearest */
   Do /* comparisons ignore u bits */
      If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0bu011u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc ← 1
   End
   If FPSCR[RN] = 0b10 then /* Round toward + Infinity */
   Do /* comparisons ignore u bits */
      If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc ← 1
   End
   If FPSCR[RN] = 0b11 then /* Round toward - Infinity */
   Do /* comparisons ignore u bits */
      If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc ← 1
   End
   frac[0:23] ← frac[0:23] + inc
   If carry_out = 1 then
   Do
      frac[0:23] ← 0b1 || frac[0:22]
      exp ← exp + 1
   End
   frac[24:52] ← ²⁹0
   FPSCR[FR] ← inc
   FPSCR[FI] ← gbit | rbit | xbit
   Return
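The Round Single routine above can be sketched in Python as follows. The sketch assumes the 53-bit significand frac[0:52] is passed as an integer whose most-significant bit is bit 0, and it returns the FR and FI values instead of writing the FPSCR; the function name and return convention are illustrative only. Round toward Zero (RN = 0b01) never increments, which matches the initial `inc ← 0`.

```python
def round_single(sign, exp, frac, rn, g=0, r=0, x=0):
    """Round a 53-bit significand to 24 bits; returns (exp, frac, FR, FI)."""
    def bit(i):
        return (frac >> (52 - i)) & 1          # bit 0 is the most-significant bit

    lsb, gbit, rbit = bit(23), bit(24), bit(25)
    low = frac & ((1 << 27) - 1)               # bits 26:52
    xbit = int(bool(low or g or r or x))
    inexact = gbit | rbit | xbit

    inc = 0
    if rn == 0b00:                             # Round to Nearest, ties to even
        inc = gbit & (lsb | rbit | xbit)
    elif rn == 0b10:                           # Round toward +Infinity
        inc = inexact if sign == 0 else 0
    elif rn == 0b11:                           # Round toward -Infinity
        inc = inexact if sign == 1 else 0

    top = (frac >> 29) + inc                   # frac[0:23] + inc
    if top >> 24:                              # carry out of the 24-bit field
        top = (1 << 23) | ((top & 0xFFFFFF) >> 1)
        exp += 1
    return exp, top << 29, inc, inexact        # frac[24:52] cleared to zeros
```

A value exactly halfway between two single-precision numbers with an even lsb is left unchanged under Round to Nearest but still reports FI = 1.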
if Floating Convert To Integer Word Unsigned with round toward Zero then do
   round_mode ← 0b01
   tgt_precision ← “32-bit unsigned integer”
end
if Floating Convert To Integer Doubleword Unsigned with round toward Zero then do
   round_mode ← 0b01
   tgt_precision ← “64-bit unsigned integer”
end
sign ← (FRB)[0]
if (FRB)[1:11] = 2047 and (FRB)[12:63] = 0 then goto Infinity Operand
if (FRB)[1:11] = 2047 and (FRB)[12] = 0 then goto SNaN Operand
if (FRB)[1:11] = 2047 and (FRB)[12] = 1 then goto QNaN Operand
if (FRB)[1:11] > 1086 then goto Large Operand
Infinity Operand:
   FPSCR[FR] ← 0b0
   FPSCR[FI] ← 0b0
   FPSCR[VXCVI] ← 0b1
   if FPSCR[VE] = 0 then do
      if tgt_precision = “32-bit signed integer” then do
         if sign=0 then FRT ← 0xUUUU_UUUU_7FFF_FFFF
         if sign=1 then FRT ← 0xUUUU_UUUU_8000_0000
      end
      else if tgt_precision = “32-bit unsigned integer” then do
         if sign=0 then FRT ← 0xUUUU_UUUU_FFFF_FFFF
         if sign=1 then FRT ← 0xUUUU_UUUU_0000_0000
      end
      else if tgt_precision = “64-bit signed integer” then do
         if sign=0 then FRT ← 0x7FFF_FFFF_FFFF_FFFF
         if sign=1 then FRT ← 0x8000_0000_0000_0000
      end
SNaN Operand:
   FPSCR[FR] ← 0b0
   FPSCR[FI] ← 0b0
   FPSCR[VXSNAN] ← 0b1
   FPSCR[VXCVI] ← 0b1
   if FPSCR[VE] = 0 then do
      if tgt_precision = “32-bit signed integer” then FRT ← 0xUUUU_UUUU_8000_0000
      if tgt_precision = “64-bit signed integer” then FRT ← 0x8000_0000_0000_0000
      if tgt_precision = “32-bit unsigned integer” then FRT ← 0xUUUU_UUUU_0000_0000
      if tgt_precision = “64-bit unsigned integer” then FRT ← 0x0000_0000_0000_0000
      FPSCR[FPRF] ← 0bUUUUU
   end
   done
QNaN Operand:
   FPSCR[FR] ← 0b0
   FPSCR[FI] ← 0b0
   FPSCR[VXCVI] ← 0b1
   if FPSCR[VE] = 0 then do
      if tgt_precision = “32-bit signed integer” then FRT ← 0xUUUU_UUUU_8000_0000
      if tgt_precision = “64-bit signed integer” then FRT ← 0x8000_0000_0000_0000
      if tgt_precision = “32-bit unsigned integer” then FRT ← 0xUUUU_UUUU_0000_0000
      if tgt_precision = “64-bit unsigned integer” then FRT ← 0x0000_0000_0000_0000
      FPSCR[FPRF] ← 0bUUUUU
   end
   done
Large Operand:
   FPSCR[FR] ← 0b0
   FPSCR[FI] ← 0b0
   FPSCR[VXCVI] ← 0b1
   if FPSCR[VE] = 0 then do
      if tgt_precision = “32-bit signed integer” then do
         if sign = 0 then FRT ← 0xUUUU_UUUU_7FFF_FFFF
         if sign = 1 then FRT ← 0xUUUU_UUUU_8000_0000
      end
      else if tgt_precision = “64-bit signed integer” then do
         if sign = 0 then FRT ← 0x7FFF_FFFF_FFFF_FFFF
         if sign = 1 then FRT ← 0x8000_0000_0000_0000
      end
      else if tgt_precision = “32-bit unsigned integer” then do
         if sign = 0 then FRT ← 0xUUUU_UUUU_FFFF_FFFF
         if sign = 1 then FRT ← 0xUUUU_UUUU_0000_0000
      end
      else if tgt_precision = “64-bit unsigned integer” then do
         if sign = 0 then FRT ← 0xFFFF_FFFF_FFFF_FFFF
         if sign = 1 then FRT ← 0x0000_0000_0000_0000
      end
      FPSCR[FPRF] ← 0bUUUUU
   end
   done
Zero Operand:
   FPSCR[FR] ← 0b00
   FPSCR[FI] ← 0b00
   FPSCR[FPRF] ← “+ zero”
   FRT ← 0x0000_0000_0000_0000
   done
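The saturated results that the Large Operand and NaN Operand cases above write to FRT when FPSCR[VE] = 0 can be summarized in a small lookup, sketched here in Python. The table and function names are hypothetical. For 32-bit targets only the low-order 32 bits are modelled, since the high-order 32 bits are undefined (shown as U in the pseudocode).

```python
# (result for sign = 0, result for sign = 1), keyed by target precision
SATURATED = {
    "32-bit signed integer":   (0x7FFF_FFFF, 0x8000_0000),
    "32-bit unsigned integer": (0xFFFF_FFFF, 0x0000_0000),
    "64-bit signed integer":   (0x7FFF_FFFF_FFFF_FFFF, 0x8000_0000_0000_0000),
    "64-bit unsigned integer": (0xFFFF_FFFF_FFFF_FFFF, 0x0000_0000_0000_0000),
}

def invalid_convert_result(tgt_precision, sign, is_nan=False):
    """Result for an out-of-range or NaN source operand (VE = 0 case)."""
    positive, negative = SATURATED[tgt_precision]
    if is_nan:
        # SNaN and QNaN operands give the most-negative signed value or unsigned zero,
        # which coincides with the sign = 1 column regardless of the NaN's sign bit.
        return negative
    return positive if sign == 0 else negative
```

In every one of these cases VXCVI is also set, and an SNaN operand additionally sets VXSNAN; the sketch does not model those status bits.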
   lsb ← frac[52]
   gbit ← frac[53]
   rbit ← frac[54]
   xbit ← frac[55:63] > 0
end
FPSCR[FR] ← inc
FPSCR[FI] ← gbit | rbit | xbit
FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]
return
sign ← (FRB)[0]
exp ← (FRB)[1:11] - 1023 /* exp - bias */
frac[0:52] ← 0b1 || (FRB)[12:63]
gbit || rbit || xbit ← 0b000
Do i = 1, 52 - exp
   frac[0:52] || gbit || rbit || xbit ← 0b0 || frac[0:52] || gbit || (rbit | xbit)
End
Do i = 2, 52 - exp
   frac[0:52] ← frac[1:52] || 0b0
End
FRT[0] ← sign
FRT[1:11] ← exp + 1023
FRT[12:63] ← frac[1:52]
SNaN Operand:
   FPSCR[VXSNAN] ← 1
   If FPSCR[VE] = 0 then
   Do
      FRT ← (FRB)
      FRT[12] ← 1
      FPSCR[FPRF] ← “QNaN”
   End
   FPSCR[FR FI] ← 0b00
   Done
QNaN Operand:
   FRT ← (FRB)
   FPSCR[FPRF] ← “QNaN”
   FPSCR[FR FI] ← 0b00
   Done
Zero Operand:
   If (FRB)[0] = 0 then
   Do
      FRT ← 0x0000_0000_0000_0000
      FPSCR[FPRF] ← “+ zero”
   End
   Else
   Do
      FRT ← 0x8000_0000_0000_0000
      FPSCR[FPRF] ← “- zero”
   End
   FPSCR[FR FI] ← 0b00
   Done
Small Operand:
   If inst = Floating Round to Integer Nearest and (FRB)[1:11] < 1022 then goto Zero Operand
   If inst = Floating Round to Integer Toward Zero then goto Zero Operand
   If inst = Floating Round to Integer Plus and (FRB)[0] = 1 then goto Zero Operand
   If inst = Floating Round to Integer Minus and (FRB)[0] = 0 then goto Zero Operand
   If (FRB)[0] = 0 then
   Do
      FRT ← 0x3FF0_0000_0000_0000 /* value = 1.0 */
      FPSCR[FPRF] ← “+ normal number”
   End
   Else
   Do
      FRT ← 0xBFF0_0000_0000_0000 /* value = -1.0 */
      FPSCR[FPRF] ← “- normal number”
   End
   FPSCR[FR FI] ← 0b00
   Done
Large Operand:
   FRT ← (FRB)
The trailing significand field of the decimal floating-point data format is encoded using Densely Packed Decimal
(DPD). DPD encoding is a compression technique which supports the representation of decimal integers of arbitrary
length. Translation operates on three Binary Coded Decimal (BCD) digits at a time compressing the 12 bits into 10 bits
with an algorithm that can be applied or reversed using simple Boolean operations. In the following examples, a
3-digit BCD number is represented as (abcd)(efgh)(ijkm), a 10-bit DPD number is represented as (pqr)(stu)(v)(wxy),
and the Boolean operations, & (AND), | (OR), and ¬ (NOT) are used.
B.1 BCD-to-DPD Translation
The translation from a 3-digit BCD number to a 10-bit DPD can be performed through the following Boolean
operations.
vwxst   abcd   efgh   ijkm
0----   0pqr   0stu   0wxy
100--   0pqr   0stu   100y
101--   0pqr   100u   0sty
110--   100r   0stu   0pqy
11100   100r   100u   0pqy
11101   100r   0pqu   100y
11110   0pqr   100u   100y
11111   100r   100u   100y
The full translation of the 10-bit DPD to a 3-digit BCD number is shown in Table 140 on page 794. The 10-bit DPD
index is produced by concatenating the 6-bit value shown in the left column with the 4-bit index along the top row,
both represented in hexadecimal. The values in parentheses are non-preferred translations and are explained further
in the following section.
with the DPD entries shown in hexadecimal format. The BCD number is produced by replacing ‘_’ in the leftmost column
with the corresponding digit along the top row. The table is split into two halves, with the right half being a
continuation of the left half.
B.3 Preferred DPD encoding
DPD Code                         BCD Value    DPD Code                         BCD Value
0x06E (0x16E) (0x26E) (0x36E)    888          0x0EE (0x1EE) (0x2EE) (0x3EE)    988
0x06F (0x16F) (0x26F) (0x36F)    889          0x0EF (0x1EF) (0x2EF) (0x3EF)    989
0x07E (0x17E) (0x27E) (0x37E)    898          0x0FE (0x1FE) (0x2FE) (0x3FE)    998
0x07F (0x17F) (0x27F) (0x37F)    899          0x0FF (0x1FF) (0x2FF) (0x3FF)    999
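The vwxst decoding table in this appendix (DPD bits (pqr)(stu)(v)(wxy) mapped to BCD digits (abcd)(efgh)(ijkm)) can be sketched in Python as below. The function name `dpd_to_bcd_digits` is hypothetical, and the code is a direct transcription of the eight table rows rather than the Boolean equations of the specification.

```python
def dpd_to_bcd_digits(dpd):
    """Decode a 10-bit DPD value into three decimal digits (hundreds, tens, units)."""
    assert 0 <= dpd < 1024
    # Bit names follow the text: p is the most-significant bit, y the least.
    p, q, r, s, t, u, v, w, x, y = [(dpd >> (9 - i)) & 1 for i in range(10)]

    def digit(b3, b2, b1, b0):
        return (b3 << 3) | (b2 << 2) | (b1 << 1) | b0

    small = lambda hi, mid, lo: digit(0, hi, mid, lo)   # digit 0-7: 0xyz
    large = lambda lo: digit(1, 0, 0, lo)               # digit 8-9: 100z

    if v == 0:
        return small(p, q, r), small(s, t, u), small(w, x, y)
    if (w, x) == (0, 0):
        return small(p, q, r), small(s, t, u), large(y)
    if (w, x) == (0, 1):
        return small(p, q, r), large(u), small(s, t, y)
    if (w, x) == (1, 0):
        return large(r), small(s, t, u), small(p, q, y)
    # w, x = 1, 1: bits s and t select among the remaining four rows
    if (s, t) == (0, 0):
        return large(r), large(u), small(p, q, y)
    if (s, t) == (0, 1):
        return large(r), small(p, q, u), large(y)
    if (s, t) == (1, 0):
        return small(p, q, r), large(u), large(y)
    return large(r), large(u), large(y)
```

In the 11111 row the bits p and q do not affect the result, which is why each of the codes 0x06E, 0x16E, 0x26E and 0x36E decodes to 888; only the first of these is the preferred encoding.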
In order to make assembler language programs simpler to write and easier to understand, a set of extended mne-
monics and symbols is provided that defines simple shorthand for the most frequently used forms of Branch Condi-
tional, Compare, Trap, Rotate and Shift, and certain other instructions.
Assemblers should provide the extended mnemonics and symbols listed here, and may provide others.
C.1 Symbols
The following symbols are defined for use in instructions (basic or extended mnemonics) that specify a Condition
Register field or a Condition Register bit. The first five (lt, ..., un) identify a bit number within a CR field. The remainder
(cr0, ..., cr7) identify a CR field. An expression in which a CR field symbol is multiplied by 4 and then added to a
bit-number-within-CR-field symbol and 32 can be used to identify a CR bit.
The extended mnemonics in Sections C.2.2 and C.3 require identification of a CR bit: if one of the CR field symbols is
used, it must be multiplied by 4 and added to a bit-number-within-CR-field (value in the range 0-3, explicit or sym-
bolic) and 32. The extended mnemonics in Sections C.2.3 and C.5 require identification of a CR field: if one of the CR
field symbols is used, it must not be multiplied by 4 or added to 32. (For the extended mnemonics in Section C.2.3,
the bit number within the CR field is part of the extended mnemonic. The programmer identifies the CR field, and the
Assembler does the multiplication and addition required to produce a CR bit number for the BI field of the underlying
basic mnemonic.)
Examples
1. Decrement CTR and branch if it is still nonzero (closure of a loop controlled by a count loaded into CTR).
bdnz target (equivalent to: bc 16,0,target)
2. Same as (1) but branch only if CTR is nonzero and condition in CR0 is “equal”.
bdnzt eq,target (equivalent to: bc 8,2,target)
3. Same as (2), but “equal” condition is in CR5.
bdnzt 4×cr5+eq,target (equivalent to: bc 8,22,target)
4. Branch if bit 59 of CR is 0.
bf 27,target (equivalent to: bc 4,27,target)
5. Same as (4), but set the Link Register. This is a form of conditional “call”.
bfl 27,target (equivalent to: bcl 4,27,target)
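The arithmetic behind expressions such as 4×cr5+eq can be sketched in Python. The symbol values used here (lt=0, gt=1, eq=2, so=3, un=3, and cr0..cr7 = 0..7) are assumed from the symbol table of the ISA, which is not reproduced in this excerpt; the function names are hypothetical.

```python
# Bit number within a CR field for each symbol (assumed values, see lead-in).
CR_BIT_SYMBOL = {"lt": 0, "gt": 1, "eq": 2, "so": 3, "un": 3}

def bi_operand(cr_field, bit_symbol):
    """Value the Assembler places in the BI field for 4*crN + symbol."""
    assert 0 <= cr_field <= 7
    return 4 * cr_field + CR_BIT_SYMBOL[bit_symbol]

def cr_bit_number(bi):
    """CR bits are numbered 32:63, so BI value n names CR bit n + 32."""
    assert 0 <= bi <= 31
    return bi + 32
```

This reproduces the examples above: `bi_operand(5, "eq")` gives 22, the operand in `bc 8,22,target`, and `cr_bit_number(27)` gives 59, the bit tested by `bf 27,target`.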
Code Meaning
lt Less than
le Less than or equal
eq Equal
ge Greater than or equal
gt Greater than
nl Not less than
ne Not equal
ng Not greater than
so Summary overflow
ns Not summary overflow
un Unordered (after floating-point comparison)
nu Not unordered (after floating-point comparison)
These codes are reflected in the mnemonics shown in Table 142.
Examples
1. Branch if CR0 reflects condition “not equal”.
bne target (equivalent to: bc 4,2,target)
2. Same as (1), but condition is in CR3.
Examples
1. Branch if CR0 reflects condition “less than”, specifying that the branch should be predicted to be taken.
blt+ target
2. Same as (1), but target address is in the Link Register and the branch should be predicted not to be taken.
bltlr-
Examples
1. Set CR bit 57.
crset 25 (equivalent to: creqv 25,25,25)
2. Clear the SO bit of CR0.
crclr so (equivalent to: crxor 3,3,3)
3. Same as (2), but SO bit to be cleared is in CR3.
crclr 4×cr3+so (equivalent to: crxor 15,15,15)
4. Invert the EQ bit.
crnot eq,eq (equivalent to: crnor 2,2,2)
5. Same as (4), but EQ bit to be inverted is in CR4, and the result is to be placed into the EQ bit of CR5.
crnot 4×cr5+eq,4×cr4+eq (equivalent to: crnor 22,18,18)
C.4.2 Subtract
The Subtract From instructions subtract the second operand (RA) from the third (RB). Extended mnemonics are pro-
vided that use the more “normal” order, in which the third operand is subtracted from the second. Both these mne-
monics can be coded with a final “o” and/or “.” to cause the OE and/or Rc bit to be set in the underlying instruction.
sub Rx,Ry,Rz (equivalent to: subf Rx,Rz,Ry)
subc Rx,Ry,Rz (equivalent to: subfc Rx,Rz,Ry)
Examples
1. Compare register Rx and immediate value 100 as unsigned 64-bit integers and place result into CR0.
cmpldi Rx,100 (equivalent to: cmpli 0,1,Rx,100)
2. Same as (1), but place result into CR4.
cmpldi cr4,Rx,100 (equivalent to: cmpli 4,1,Rx,100)
3. Compare registers Rx and Ry as signed 64-bit integers and place result into CR0.
cmpd Rx,Ry (equivalent to: cmp 0,1,Rx,Ry)
Examples
1. Compare bits 32:63 of register Rx and immediate value 100 as signed 32-bit integers and place result into CR0.
cmpwi Rx,100 (equivalent to: cmpi 0,0,Rx,100)
2. Same as (1), but place result into CR4.
cmpwi cr4,Rx,100 (equivalent to: cmpi 4,0,Rx,100)
3. Compare bits 32:63 of registers Rx and Ry as unsigned 32-bit integers and place result into CR0.
cmplw Rx,Ry (equivalent to: cmpl 0,0,Rx,Ry)
Examples
1. Trap if register Rx is not 0.
tdnei Rx,0 (equivalent to: tdi 24,Rx,0)
2. Same as (1), but comparison is to register Ry.
tdne Rx,Ry (equivalent to: td 24,Rx,Ry)
3. Trap if bits 32:63 of register Rx, considered as a 32-bit quantity, are logically greater than 0x7FF.
twlgti Rx,0x7FF (equivalent to: twi 1,Rx,0x7FF)
4. Trap unconditionally.
trap (equivalent to: tw 31,0,0)
5. Trap unconditionally with immediate parameters Rx and Ry.
tdu Rx,Ry (equivalent to: td 31,Rx,Ry)
Code Meaning
lt Less than
eq Equal
gt Greater than
These codes are reflected in the mnemonics shown in Table 147.
Examples
1. Set register Rx to Ry if the LT bit is set in CR0, and to Rz otherwise.
isellt Rx,Ry,Rz (equivalent to: isel Rx,Ry,Rz,0)
2. Set register Rx to Ry if the GT bit is set in CR0, and to Rz otherwise.
iselgt Rx,Ry,Rz (equivalent to: isel Rx,Ry,Rz,1)
3. Set register Rx to Ry if the EQ bit is set in CR0, and to Rz otherwise.
iseleq Rx,Ry,Rz (equivalent to: isel Rx,Ry,Rz,2)
Examples
1. Extract the sign bit (bit 0) of register Ry and place the result right-justified into register Rx.
extrdi Rx,Ry,1,0 (equivalent to: rldicl Rx,Ry,1,63)
2. Insert the bit extracted in (1) into the sign bit (bit 0) of register Rz.
insrdi Rz,Rx,1,0 (equivalent to: rldimi Rz,Rx,63,0)
3. Shift the contents of register Rx left 8 bits.
sldi Rx,Rx,8 (equivalent to: rldicr Rx,Rx,8,55)
4. Clear the high-order 32 bits of register Ry and place the result into register Rx.
clrldi Rx,Ry,32 (equivalent to: rldicl Rx,Ry,0,32)
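The equivalences in the examples above can be checked with a small Python model of the underlying rotate instructions. The model assumes the usual definitions (rldicl rotates left by SH and keeps bits MB:63; rldicr rotates left by SH and keeps bits 0:ME, with bit 0 the most-significant bit); these definitions come from the instruction descriptions elsewhere in the ISA, not from this section, and the function names are illustrative.

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64

def mask(mb, me):
    """Ones in bit positions mb..me (bit 0 = most significant), for mb <= me."""
    return (MASK64 >> mb) & (MASK64 << (63 - me)) & MASK64

def rldicl(rs, sh, mb):
    """Rotate Left Doubleword Immediate then Clear Left (modelled)."""
    return rotl64(rs, sh) & mask(mb, 63)

def rldicr(rs, sh, me):
    """Rotate Left Doubleword Immediate then Clear Right (modelled)."""
    return rotl64(rs, sh) & mask(0, me)
```

For instance, `rldicl(Ry, 1, 63)` leaves only the original sign bit, right-justified, which is what `extrdi Rx,Ry,1,0` requests, and `rldicr(Rx, 8, 55)` matches a plain 64-bit left shift by 8.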
Examples
1. Extract the sign bit (bit 32) of register Ry and place the result right-justified into register Rx.
extrwi Rx,Ry,1,0 (equivalent to: rlwinm Rx,Ry,1,31,31)
2. Insert the bit extracted in (1) into the sign bit (bit 32) of register Rz.
insrwi Rz,Rx,1,0 (equivalent to: rlwimi Rz,Rx,31,0,0)
3. Shift the contents of register Rx left 8 bits, clearing the high-order 32 bits.
slwi Rx,Rx,8 (equivalent to: rlwinm Rx,Rx,8,0,23)
4. Clear the high-order 16 bits of the low-order 32 bits of register Ry and place the result into register Rx, clearing
the high-order 32 bits of register Rx.
clrlwi Rx,Ry,16 (equivalent to: rlwinm Rx,Ry,0,16,31)
Examples
1. Copy the contents of register Rx to the XER.
mtxer Rx (equivalent to: mtspr 1,Rx)
2. Copy the contents of the LR to register Rx.
mflr Rx (equivalent to: mfspr Rx,8)
3. Copy the contents of register Rx to the CTR.
mtctr Rx (equivalent to: mtspr 9,Rx)
Load Immediate
The addi and addis instructions can be used to load an immediate value into a register. Extended mnemonics are
provided to convey the idea that no addition is being performed but merely data movement (from the immediate field
of the instruction to a register).
Load a 16-bit signed immediate value into register Rx.
li Rx,value (equivalent to: addi Rx,0,value)
Load a 16-bit signed immediate value, shifted left by 16 bits, into register Rx.
lis Rx,value (equivalent to: addis Rx,0,value)
Load Address
This mnemonic permits computing the value of a base-displacement operand, using the addi instruction which nor-
mally requires separate register and immediate operands.
la Rx,D(Ry) (equivalent to: addi Rx,Ry,D)
The la mnemonic is useful for obtaining the address of a variable specified by name, allowing the Assembler to supply the base register number and compute the displacement. If the variable v is located at offset Dv bytes from the address in register Rv, and the Assembler has been told to use register Rv as a base for references to the data structure containing v, then the following line causes the address of v to be loaded into register Rx.
la Rx,v (equivalent to: addi Rx,Rv,Dv)
Move Register
Several Power ISA instructions can be coded in a way such that they simply copy the contents of one register to
another. An extended mnemonic is provided to convey the idea that no computation is being performed but merely
data movement (from one register to another).
The following instruction copies the contents of register Ry to register Rx. This mnemonic can be coded with a final
“.” to cause the Rc bit to be set in the underlying instruction.
mr Rx,Ry (equivalent to: or Rx,Ry,Ry)
Complement Register
Several Power ISA instructions can be coded in a way such that they complement the contents of one register and
place the result into another register. An extended mnemonic is provided that allows this operation to be coded easily.
The following instruction complements the contents of register Ry and places the result into register Rx. This mnemonic can be coded with a final “.” to cause the Rc bit to be set in the underlying instruction.
not Rx,Ry (equivalent to: nor Rx,Ry,Ry)
Book II:
1.5 Cache Model

A cache model in which there is one cache for instructions and another cache for data is called a “Harvard-style” cache. This is the model assumed by the Power ISA, e.g., in the descriptions of the Cache Management instructions in Section 4.3. Alternative cache models may be implemented (e.g., a “combined cache” model, in which a single cache is used for both instructions and data, or a model in which there are several levels of caches), but they support the programming model implied by a Harvard-style cache.

The processor is not required to maintain copies of storage locations in the instruction cache consistent with modifications to those storage locations (e.g., modifications caused by Store instructions).

A location in the data cache is considered to be modified in that cache if the location has been modified (e.g., by a Store instruction) and the modified data have not been written to main storage.

Cache Management instructions are provided so that programs can manage the caches when needed. For example, program management of the caches is needed when a program generates or modifies code that will be executed (i.e., when the program modifies data in storage and then attempts to execute the modified data as instructions). The Cache Management instructions are also useful in optimizing the use of memory bandwidth in such applications as graphics and numerically intensive computing. The functions performed by these instructions depend on the storage control attributes associated with the specified storage location (see Section 1.6, “Storage Control Attributes”).

The Cache Management instructions allow the program to do the following.
- invalidate the copy of storage in an instruction cache block (icbi)
- provide a hint that an instruction will probably soon be accessed from a specified instruction cache block (icbt)
- provide a hint that the program will probably soon access a specified data cache block (dcbt, dcbtst)
- set the contents of a data cache block to zeros (dcbz)
- copy the contents of a modified data cache block to main storage (dcbst)
- copy the contents of a modified data cache block to main storage and make the copy of the block in the data cache invalid (dcbf or dcbfl)

1.6 Storage Control Attributes

Some operating systems may provide a means to allow programs to specify the storage control attributes described in this section. Because the support provided for these attributes by the operating system may vary between systems, the details of the specific system being used must be known before these attributes can be used.

Storage control attributes are associated with units of storage that are multiples of the page size. Each storage access is performed according to the storage control attributes of the specified storage location, as described below. The storage control attributes are the following.
- Write Through Required
- Caching Inhibited
- Memory Coherence Required
- Guarded
- Strong Access Order

These attributes have meaning only when an effective address is translated by the processor performing the storage access.

Programming Note
The Write Through Required and Caching Inhibited attributes are mutually exclusive because, as described below, the Write Through Required attribute permits the storage location to be in the data cache while the Caching Inhibited attribute does not.

Storage that is Write Through Required or Caching Inhibited is not intended to be used for general-purpose programming. For example, the lbarx, lharx, lwarx, ldarx, lqarx, stbcx., sthcx., stwcx., stdcx., and stqcx. instructions may cause the system data storage error handler to be invoked if they specify a location in storage having either of these attributes. To obtain the best performance across the widest range of implementations, storage that is Write Through Required or Caching Inhibited should be used only when the use of such storage meets specific functional or semantic needs or enables a performance optimization.

In the remainder of this section, “Load instruction” includes the Cache Management and other instructions that are stated in the instruction descriptions to be “treated as a Load” unless they are explicitly excluded, and similarly for “Store instruction”.

1.6.1 Write Through Required

A store to a Write Through Required storage location is performed in main storage. A Store instruction that specifies a location in Write Through Required storage may cause additional locations in main storage to be accessed. If a copy of the block containing the specified location is retained in the data cache, the store is also performed in the data cache. The store does not cause the block to be considered to be modified in the data cache.

In general, accesses caused by separate Store instructions that specify locations in Write Through Required storage may be combined into one access. Such combining does not occur if the Store instructions are separated by a sync or eieio instruction.

1.6.2 Caching Inhibited

An access to a Caching Inhibited storage location is performed in main storage. A Load instruction that specifies a location in Caching Inhibited storage may cause additional locations in main storage to be accessed unless the specified location is also Guarded. An instruction fetch from Caching Inhibited storage may cause additional words in main storage to be accessed. No copy of the accessed locations is placed into the caches.

In general, non-overlapping accesses caused by separate Load instructions that specify locations in Caching Inhibited storage may be combined into one access, as may non-overlapping accesses caused by separate Store instructions that specify locations in Caching Inhibited storage. Such combining does not occur if the Load or Store instructions are separated by a sync instruction. Combining may also occur among such accesses from multiple processors that share a common memory interface. No combining occurs if the storage is also Guarded.

Programming Note
None of the memory barrier instructions prevent the combining of accesses from different processors. The Guarded storage attribute must be used in combination with Caching Inhibited to prevent such combining.

1.6.3 Memory Coherence Required

An access to a Memory Coherence Required storage location is performed coherently, as follows.

Memory coherence refers to the ordering of stores to a single location. Atomic stores to a given location are coherent if they are serialized in some order, and no processor or mechanism is able to observe any subset of those stores as occurring in a conflicting order. This serialization order is an abstract sequence of values; the physical storage location need not assume each of the values written to it. For example, a processor may update a location several times before the value is written to physical storage. The result of a store operation is not available to every processor or mechanism at the same instant, and it may be that a processor or mechanism observes only some of the values that are written to a location. However, when a location is accessed atomically and coherently by all processors and mechanisms, the sequence of values loaded from the location by any processor or mechanism during any interval of time forms a subsequence of the sequence of values that the location logically held during that interval. That is, a processor or mechanism can never load a “newer” value first and then, later, load an “older” value.

Memory coherence is managed in blocks called coherence blocks. Their size is implementation-dependent, but is larger than a word and is usually the size of a cache block.

For storage that is not Memory Coherence Required, software must explicitly manage memory coherence to the extent required by program correctness. The operations required to do this may be system-dependent.

Because the Memory Coherence Required attribute for a given storage location is of little use unless all processors that access the location do so coherently, in statements about Memory Coherence Required storage elsewhere in this document it is generally assumed that the storage has the Memory Coherence Required attribute for all processors that access it.

Programming Note
Operating systems that allow programs to request that storage not be Memory Coherence Required should provide services to assist in managing memory coherence for such storage, including all system-dependent aspects thereof.

In most systems the default is that all storage is Memory Coherence Required. For some applications in some systems, software management of coherence may yield better performance. In such cases, a program can request that a given unit of storage not be Memory Coherence Required, and can manage the coherence of that storage by using the sync instruction, the Cache Management instructions, and services provided by the operating system.

1.6.4 Guarded

A data access to a Guarded storage location is performed only if either (a) the access is caused by an instruction that is known to be required by the sequential execution model, or (b) the access is a load and the storage location is already in a cache. If the storage is also Caching Inhibited, only the storage location specified by the instruction is accessed; otherwise any storage location in the cache block containing the specified storage location may be accessed.

Instructions are not fetched from virtual storage that is Guarded. If the instruction addressed by the current instruction address is in such storage, the system instruction storage error handler may be invoked (see Section 6.5.5 of Book III).

Programming Note
In some implementations, instructions may be exe-
cuted before they are known to be required by the
sequential execution model. Because the results of
instructions executed in this manner are discarded
if it is later determined that those instructions would
not have been executed in the sequential execution
model, this behavior does not affect most pro-
grams.
This behavior does affect programs that access
storage locations that are not “well-behaved” (e.g.,
a storage location that represents a control register
on an I/O device that, when accessed, causes the
device to perform an operation). To avoid unin-
tended results, programs that access such storage
locations should request that the storage be
Guarded, and should prevent such storage loca-
tions from being in a cache (e.g., by requesting that
the storage also be Caching Inhibited).
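The "never newer, then older" property stated for Memory Coherence Required storage in Section 1.6.3 can be illustrated with a small check. The Python sketch below is an illustration only. It assumes that every value stored to the location is distinct, so that a value identifies its position in the serialization order, and the function name is chosen here.

```python
def loads_are_coherent(loaded, coherence_order):
    """Return True if the values one observer loaded never go backwards.

    coherence_order: the values the location logically held, oldest first
                     (assumed distinct in this sketch).
    loaded:          the values one processor or mechanism loaded, in the
                     order in which it loaded them.
    """
    position = {value: i for i, value in enumerate(coherence_order)}
    indices = [position[value] for value in loaded]  # KeyError: value never stored
    # An observer may skip values or see one value repeatedly, but it may
    # not load a newer value and then, later, an older one.
    return all(a <= b for a, b in zip(indices, indices[1:]))
```

An observer that loads 1 and then 3 from a location that held 1, 2, 3 is consistent with coherence even though it never saw 2; an observer that loads 3 and then 1 is not.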
1.6.5 Strong Access Order

All accesses to storage with the Strong Access Order (SAO) attribute (referred to as SAO storage) will be performed using a set of ordering rules different from that of the weakly consistent model that is described in Section 1.7.1, “Storage Access Ordering”. These rules apply only to accesses that are caused by a Load or a Store, and not to accesses associated with those instructions. Furthermore, these rules do not apply to accesses that are caused by or associated with instructions that are stated in their descriptions to be “treated as a Load” or “treated as a Store.” The details are described below, from the programmer’s point of view. (The processor may deviate from these rules if the programmer cannot detect the deviation.) The SAO attribute is not intended to be used for general purpose programming. It is provided in a manner that is not fully independent of the other storage attributes. Specifically, it is only provided for storage that is Memory Coherence Required, but not Write Through Required, not Caching Inhibited, and not Guarded. See Section 5.8.2.1, “Storage Control Bit Restrictions”, in Book III for more details. Accesses to SAO storage are likely to be performed more slowly than similar accesses to non-SAO storage.

The order in which a processor performs storage accesses to SAO storage, the order in which those accesses are performed with respect to other processors and mechanisms, and the order in which those accesses are performed in main storage are the same except in the circumstances described in the following paragraph. The ordering rules for accesses performed by a single processor to SAO storage are as follows. Stores are performed in program order. When a store accesses data adjacent to that which is accessed by the next store in program order, the two storage accesses may be combined into a single larger access. Loads are performed in program order. When a load accesses data adjacent to that which is accessed by the next load in program order, the two storage accesses may be combined into a single larger access. Stores may not be performed before loads which precede them in program order. Loads may be performed before stores which precede them in program order, with the provision that a load which follows a store of the same datum (to the same address) must obtain a value which is no older (in consideration of the possibility of programs on other processors sharing the same storage) than the value stored by the preceding store.

When any given processor loads the datum it just stored, as described above, the load may be performed by the processor before the preceding store has been performed with respect to other processors and mechanisms, and in main storage. This may cause the processor to see its store earlier relative to stores performed by other processors than it is observed by other processors and mechanisms, and than it is performed in memory. A direct consequence of this consideration is that although programs running on each processor will see the same sequence of accesses from any individual processor to SAO storage, each may in general see a different interleaving of the individual sequences. The memory barrier instructions may be used to establish stronger ordering, as described in Section 1.7.1, “Storage Access Ordering”, beginning with the third major bullet.
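The single-processor ordering rules for SAO storage given above reduce to one permitted reordering: a load may be performed before a store that precedes it in program order. The Python sketch below checks a proposed performed order against that rule. It is a simplified illustration with a function name chosen here; it ignores the combining of adjacent accesses and the same-datum provision.

```python
def sao_allows(program, performed):
    """Check a performed order against the SAO single-processor rules.

    program:   list of "load" / "store" strings in program order.
    performed: list of indices into program, in the order performed.
    """
    when = {index: k for k, index in enumerate(performed)}
    for i in range(len(program)):
        for j in range(i + 1, len(program)):
            if when[j] < when[i]:
                # Access j overtook the earlier access i. The only case
                # the rules permit is a load overtaking an earlier store.
                if not (program[i] == "store" and program[j] == "load"):
                    return False
    return True
```

For the program order store, load, store, the performed order [1, 0, 2] is accepted by this check, while [2, 0, 1] is rejected because the second store would overtake the first.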
Programming Note
Because stores cannot be performed “out-of-order” (see Book III), if a Store instruction depends on the value returned by a preceding Load instruction (because the value returned by the Load is used to compute either the effective address specified by the Store or the value to be stored), the corresponding storage accesses are performed in program order. The same applies if whether the Store instruction is executed depends on a conditional Branch instruction that in turn depends on the value returned by a preceding Load instruction.

Because an isync instruction prevents the execution of instructions following the isync until instructions preceding the isync have completed, if an isync follows a conditional Branch instruction that depends on the value returned by a preceding Load instruction, the load on which the Branch depends is performed before any loads caused by instructions following the isync. This applies even if the effects of the “dependency” are independent of the value loaded (e.g., the value is compared to itself and the Branch tests the EQ bit in the selected CR field), and even if the branch target is the sequentially next instruction.

With the exception of the cases described above and earlier in this section, data dependencies and control dependencies do not order storage accesses. Examples include the following.
- If a Load instruction specifies the same storage location as a preceding Store instruction and the location is in storage that is not Caching Inhibited, the load may be satisfied from a “store queue” (a buffer into which the processor places stored values before presenting them to the storage subsystem), and not be visible to other processors and mechanisms. A consequence is that if a subsequent Store depends on the value returned by the Load, the two stores need not be performed in program order with respect to other processors and mechanisms.
- Because a Store Conditional instruction may complete before its store has been performed, a conditional Branch instruction that depends on the CR0 value set by a Store Conditional instruction does not order the Store Conditional's store with respect to storage accesses caused by instructions that follow the Branch.
- Because processors may predict branch target addresses and branch condition resolution, control dependencies (e.g., branches) do not order storage accesses except as described above. For example, when a subroutine returns to its caller the return address may be predicted, with the result that loads caused by instructions at or after the return address may be performed before the load that obtains the return address is performed.

Because processors may implement nonarchitected duplicates of architected resources (e.g., GPRs, CR fields, and the Link Register), resource dependencies (e.g., specification of the same target register for two Load instructions) do not order storage accesses.

Examples of correct uses of dependencies, sync and lwsync to order storage accesses can be found in Appendix B. “Programming Examples for Sharing Storage” on page 913.

Because the storage model is weakly consistent, the sequential execution model as applied to instructions that cause storage accesses guarantees only that those accesses appear to be performed in program order with respect to the processor executing the instructions. For example, an instruction may complete, and subsequent instructions may be executed, before storage accesses caused by the first instruction have been performed. However, for a sequence of atomic accesses to the same storage location, if the location is in storage that is Memory Coherence Required the definition of coherence guarantees that the accesses are performed in program order with respect to any processor or mechanism that accesses the location coherently, and similarly if the location is in storage that is Caching Inhibited.

Because accesses to storage that is Caching Inhibited are performed in main storage, memory barriers and dependencies on Load instructions order such accesses with respect to any processor or mechanism even if the storage is not Memory Coherence Required.
3. The processor holding the reservation executes an AMO that updates the same reservation granule: whether the reservation is lost is undefined.
4. Any of the following occurs on the processor holding the reservation.
   a. The transaction state changes (from Non-transactional, Transactional, or Suspended state to one of the other two states; see Section 5.2, “Transactional Memory Facility States”), except in the following cases.
      - If the change is from Transactional state to Suspended state, the reservation is not lost.
      - If the change is from Suspended state to Transactional state, the reservation is not lost if it was established in Transactional state.
      - If the change is caused by a treclaim. or trechkpt. instruction, whether the reservation is lost is undefined.
   b. The transaction nesting depth (see Section 5.4, “Transactional Memory Facility Registers”) changes; whether the reservation is lost is undefined. (This item applies only if the processor is in Transactional state both before and after the change.)
   c. The processor is in Suspended state and executes a Store Conditional instruction (stbcx., sthcx., stwcx., stdcx., or stqcx.) or a waitrsv instruction; the reservation is lost if it was established in Transactional state. In this case the Store Conditional instruction’s store is not performed, and the waitrsv does not wait. (For Store Conditional, the reservation is also lost if it was established in Suspended state; see item 2.)
5. Some other processor executes a Store or dcbz that specifies a location in the same reservation granule.
6. Some other processor executes a dcbtst or dcbt that specifies a location in the same reservation granule: whether the reservation is lost is undefined. (For a dcbtst instruction that specifies a data stream, "location" in the preceding sentence includes all locations in the data stream.)
7. Any processor modifies a Reference or Change bit (see Book III) in the same reservation granule: whether the reservation is lost is undefined.
8. Some mechanism other than a processor modifies a storage location in the same reservation granule.
9. An interrupt (see Book III) occurs on the processor holding the reservation: the reservation is not lost. (However, system software invoked by interrupts may clear the reservation.)
10. Implementation-specific characteristics of the coherence mechanism cause the reservation to be lost.

Virtualized Implementation Note
A reservation may be lost if:
- Software executes a privileged instruction or utilizes a privileged facility
- Software accesses storage not intended for general-purpose programming
- Software accesses a Device Control Register

Programming Note
One use of lwarx and stwcx. is to emulate a “Compare and Swap” primitive like that provided by the IBM System/370 Compare and Swap instruction; see Section B.1, “Atomic Update Primitives” on page 913. A System/370-style Compare and Swap checks only that the old and current values of the word being tested are equal, with the result that programs that use such a Compare and Swap to control a shared resource can err if the word has been modified and the old value subsequently restored. The combination of lwarx and stwcx. improves on such a Compare and Swap, because the reservation reliably binds the lwarx and stwcx. together. The reservation is always lost if the word is modified by another processor or mechanism between the lwarx and stwcx., so the stwcx. never succeeds unless the word has not been stored into (by another processor or mechanism) since the lwarx.

Programming Note
In general, programming conventions must ensure that lwarx and stwcx. specify addresses that match; a stwcx. should be paired with a specific lwarx to the same storage location. Situations in which a stwcx. may erroneously be issued after some lwarx other than that with which it is intended to be paired must be scrupulously avoided. For example, there must not be a context switch in which the processor holds a reservation in behalf of the old context, and the new context resumes after a lwarx and before the paired stwcx.. The stwcx. in the new context might succeed, which is not what was intended by the programmer. Such a situation must be prevented by executing a stbcx., sthcx., stwcx., stdcx., or stqcx. that specifies a dummy writable aligned location as part of the context switch; see Section 6.4.3 of Book III.
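The difference between a value-comparing Compare and Swap and the lwarx/stwcx. pair, described in the first Programming Note above, can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration and not a description of any implementation: it models one word, one reservation held by a single processor, and stores by "another processor" as a method call; all class and function names are chosen here.

```python
class ReservedWord:
    """One storage word plus the reservation of a single modelled processor."""

    def __init__(self, value):
        self.value = value
        self.reserved = False

    def lwarx(self):
        """Load the word and establish the reservation."""
        self.reserved = True
        return self.value

    def stwcx(self, new_value):
        """Store only if the reservation still exists; always clear it."""
        success = self.reserved
        self.reserved = False
        if success:
            self.value = new_value
        return success

    def store_by_other(self, new_value):
        """A store by another processor: the reservation is lost."""
        self.value = new_value
        self.reserved = False


def compare_and_swap(word, expected, new_value):
    """Value-comparing swap: succeeds whenever the current value matches."""
    if word.value == expected:
        word.value = new_value
        return True
    return False


def aba_with_compare_and_swap():
    word = ReservedWord(5)
    old = word.value
    word.store_by_other(7)   # word modified ...
    word.store_by_other(5)   # ... and the old value restored
    return compare_and_swap(word, old, old + 1)


def aba_with_reservation():
    word = ReservedWord(5)
    old = word.lwarx()
    word.store_by_other(7)
    word.store_by_other(5)
    return word.stwcx(old + 1)
```

In this model aba_with_compare_and_swap() reports success even though the word was modified in between, whereas aba_with_reservation() reports failure because the intervening stores caused the reservation to be lost.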
cedes the transaction and orders storage accesses pairwise, as follows. Let A and B be sets of storage accesses as defined below. For each pair a_i b_j of storage accesses such that a_i is in A and b_j is in B, the memory barrier ensures that a_i will be performed with respect to any processor or mechanism, to the extent required by the associated Memory Coherence Required attributes, before b_j is performed with respect to that processor or mechanism. Set A contains all data accesses caused by instructions preceding the tbegin. that are neither Write Through Required nor Caching Inhibited. Set B contains all data accesses caused by instructions following the tbegin., including Suspended state accesses, that are neither Write Through Required nor Caching Inhibited. The ordering done by this memory barrier is cumulative.

Programming Note
The reason the creation of the memory barrier by tbegin. is specified to be contingent on the transaction succeeding is that delaying the creation may improve performance, and does not seriously inconvenience software.

A successful transaction has an integrated memory barrier behavior. When a processor (P1) executes a tend. instruction and tend. processing determines that the transaction will succeed, a memory barrier is created, which orders storage accesses pairwise, as follows. Let A and B be sets of storage accesses as defined below. For each pair a_i b_j of storage accesses such that a_i is in A and b_j is in B, the memory barrier ensures that a_i will be performed with respect to any processor or mechanism, to the extent required by the associated Memory Coherence Required attributes, before b_j is performed with respect to that processor or mechanism. Set A contains all non-transactional data accesses by other processors and mechanisms that have been performed with respect to P1 before the memory barrier is created and are neither Write Through Required nor Caching Inhibited. Set B contains the aggregate store and all non-transactional data accesses by other processors and mechanisms that are performed after a Load instruction executed by that processor or mechanism has returned the value stored by a store that is in set B. Note that the cumulative memory barrier does not order Suspended state storage accesses interleaved with the transaction. The ordering done by this memory barrier is cumulative.

A tend. instruction that ends a successful transaction creates a memory barrier that immediately follows the transaction and orders storage accesses pairwise, as follows. Let A and B be sets of storage accesses as defined below. For each pair a_i b_j of storage accesses such that a_i is in A and b_j is in B, the memory barrier ensures that a_i will be performed with respect to any processor or mechanism, to the extent required by the associated Memory Coherence Required attributes, before b_j is performed with respect to that processor or mechanism. Set A contains all data accesses caused by instructions preceding the tend., including Suspended state accesses, that are neither Write Through Required nor Caching Inhibited. Set B contains all data accesses caused by instructions following the tend. that are neither Write Through Required nor Caching Inhibited.

Programming Note
The memory barriers that are created by the execution of a successful transaction (those associated with tbegin., tend., and the integrated cumulative memory barrier) render most explicit memory barriers in and around transactions redundant. An exception is when there is a need to establish order among Suspended state accesses.

1.8.1 Rollback-Only Transactions

A Rollback-Only Transaction (ROT) is a sequence of instructions that is executed, or not, as a unit. The purpose of the ROT is to enable bulk speculation of instructions with minimum overhead. It leverages the rollback mechanism that is invoked as part of transaction failure handling, but has reduced overhead in that it does not have the full atomic nature of the transaction and its synchronization and serialization properties. The absence of a (normal) transaction’s atomic quality means that a ROT must not be used to manipulate shared data.

More specifically, a ROT differs from a normal transaction as follows.
- ROTs are not serialized.
- There are no memory barriers created by tbegin. and tend.
- A ROT has no integrated cumulative memory barrier.
- There is no monitoring of storage locations specified by loads for modification by other processors and mechanisms between the performing of the loads and the completion of the ROT.
- The stores that are included in the ROT need not appear to be performed as an aggregate store. (Implementations are likely to provide an aggregate store appearance, but the correctness of the program must not depend on the aggregate store appearance.)

1.9 Cluster Shared Memory

Computer systems may be connected to provide a means for one to directly address storage in another. The group of systems so interconnected is called a “cluster.” Real address space in each system in the cluster may be designated to map to “Cluster Shared Memory” or CSM. Main storage from all systems in the
Programming Note
The results of attempting to execute instructions
from storage that does not satisfy this assumption
are described in Section 1.6.2 and Section 1.6.4 of
this Book and in Book III.
Programming Note
Following are examples of how to make instruction storage consistent with data storage. Because the optimal instruction sequence to make instruction storage consistent with data storage may vary between systems, many operating systems will provide a system service to perform this function.

Case 1: The given program does not modify instructions executed by another program nor does another program modify the instructions executed by the given program.

Assume that location X previously contained the instruction A0; the program modified one or more bytes of that location such that, in data storage, the location contains the instruction A1; and location X is wholly contained in a single cache block. The following instruction sequence will make instruction storage consistent with data storage such that if the isync was in location X-4, the instruction A1 in location X would be executed immediately after the isync.

dcbst X       #copy the block to main storage
sync          #order copy before invalidation
icbi X        #invalidate copy in instr cache
isync         #discard prefetched instructions

Case 2: One or more programs execute the instructions that are concurrently being modified by another program.

Assume program A has modified the instruction at location X and other programs are waiting for program A to signal that the new instruction is ready to execute. The following instruction sequence will make instruction storage consistent with data storage and then set a flag to indicate to the waiting programs that the new instruction can be executed.

li r0,1       #put a 1 value in r0
dcbst X       #copy the block to main storage
sync          #order copy before invalidation
icbi X        #invalidate copy in instr cache
sync          #order invalidation before store
              # to flag
stw r0,flag   #set flag indicating instruction
              # storage is now consistent

The following instruction sequence, executed by the waiting program, will prevent the waiting programs from executing the instruction at location X until location X in instruction storage is consistent with data storage, and then will cause any prefetched instructions to be discarded.

lwz r0,flag   #loop until flag = 1 (when 1 is
cmpwi r0,1    # loaded, location X in inst’n
bne $-8       # storage is consistent with
              # location X in data storage)
isync         #discard any prefetched inst’ns

In the preceding instruction sequence any context synchronizing instruction (e.g., rfid) can be used instead of isync. (For Case 1 only isync can be used.)

For both cases, if two or more instructions in separate data cache blocks have been modified, the dcbst instruction in the examples must be replaced by a sequence of dcbst instructions such that each block containing the modified instructions is copied back to main storage. Similarly, for icbi the sequence must invalidate each instruction cache block containing a location of an instruction that was modified. The sync instruction that appears above between “dcbst X” and “icbi X” would be placed between the sequence of dcbst instructions and the sequence of icbi instructions.
2.1 Performance-Optimized Instruction Sequences
Performance-optimized instruction sequences are
instruction sequences that provide better performance
than other ways of achieving the same results. The
supported performance-optimized sequences are
shown in the following sections. In order to achieve the
improved performance, the sequences must be coded
exactly as shown, including instruction order, register
re-use, and lack of intervening instructions. The pro-
cessor achieves the improved performance by execut-
ing the sequence as a single operation, or in some
other highly efficient, sequence-specific, manner. (The
improved performance may not be obtained if the
sequence causes the system error handler to be
invoked, or for implementation-dependent reasons.)
Programming Note
Even independent of the performance optimization described above, the techniques illustrated in Table 1 and Table 2
generally perform better than other ways of achieving the effect of having a large displacement field for D-form and
DS-form fixed-point Load/Store instructions (Table 1), and of having a displacement field for X-form and XX1-form
Vector and VSX Load/Store instructions (Table 2).
The technique for the fixed-point Load/Store instructions is complicated by the fact that D-form and DS-form Loads
and Stores treat the D/DS value as signed.
For simplicity, most of this Note assumes that the fixed-point Load/Store instruction is D-form; the modifications for
DS-form fixed-point Load/Store instructions are straightforward.
Let the desired effective address to load from or store to be (RA) + DISP, where DISP is a signed 32-bit value.
(RA) + DISP = (RA) + DISP0:15 || DISP16:31
= (RA) + (DISP0:15 || 0x0000) + DISP16:31
where DISP0:15 is a signed 16-bit value.
If DISP0:15 is used as the SI value for the addis, the addis forms the sum
(RA) + (DISP0:15 || 0x0000)
and places the result into Rx.
If DISP16:31 is used as the D value for the Load or Store and Rx is used as the base register for the Load or Store,
and DISP16 = 0, the Load or Store computes the EA to load from as
(Rx) + DISP16:31 = (RA) + (DISP0:15 || 0x0000) + DISP16:31
= (RA) + DISP
However, because D-form Loads and Stores treat the D value as signed, if DISP16 = 1 the Load or Store computes
the EA as
(Rx) + DISP16:31 = (RA) + (DISP0:15 || 0x0000) + DISP16:31 + 0xFFFF_FFFF_FFFF_0000
= (RA) + (DISP0:15 || 0x0000) + DISP16:31 - 2^16
= (RA) + DISP - 2^16
To compensate for this effective subtraction of 2^16, if DISP16 = 1 the SI value used for the addis must be
DISP0:15 + 1. Then the addis sets Rx to
(RA) + ((DISP0:15 + 1) || 0x0000) = (RA) + (DISP0:15 || 0x0000) + 2^16
and the Load or Store computes the EA as
(Rx) + DISP16:31 = (RA) + (DISP0:15 || 0x0000) + 2^16 + DISP16:31 - 2^16
= (RA) + DISP
as desired.
Thus the rules for using the technique illustrated in Table 1 are as follows.
For the RA field of the addis, use the desired base register for the Load or Store.
For the D field of the Load or Store, use DISP16:31.
(For DS-form Loads and Stores, for the DS field use DISP16:29; DISP30:31 are 0b00.)
For the SI field of the addis:
- if DISP16 = 0 use DISP0:15;
- if DISP16 = 1 use DISP0:15 + 1.
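The rules above can be checked with a small arithmetic model. The following Python sketch is illustrative only: the function names split_displacement and effective_address are hypothetical, registers are modelled as plain integers, and the range check on SI reflects an assumption of this sketch (that an SI value which no longer fits in a signed 16-bit field should be rejected), a corner case the Note itself does not discuss.

```python
def split_displacement(disp):
    """Return (SI, D): SI for the addis, D for the D-form Load or Store.

    disp is the signed 32-bit DISP of the Note. Names are hypothetical.
    """
    if not -(1 << 31) <= disp < (1 << 31):
        raise ValueError("DISP must fit in a signed 32-bit value")
    low = disp & 0xFFFF                          # DISP16:31 as an unsigned field
    d = low - 0x10000 if low & 0x8000 else low   # value seen after sign extension
    si = (disp - d) >> 16                        # DISP0:15, or DISP0:15 + 1 if DISP16 = 1
    if not -(1 << 15) <= si < (1 << 15):
        # Assumption of this sketch: reject an SI that overflows the field.
        raise ValueError("DISP0:15 + 1 does not fit in the signed SI field")
    return si, d


def effective_address(ra, disp):
    """Model addis Rx,RA,SI followed by a D-form access with base Rx."""
    si, d = split_displacement(disp)
    rx = ra + (si << 16)    # addis: (RA) + (SI || 0x0000)
    return rx + d           # Load/Store: (Rx) + sign-extended D
```

For any DISP the sketch accepts, effective_address(ra, disp) equals ra + disp, which is the property the derivation above establishes.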
2.1.2 32-Bit Constant Generation
The following instruction sequences will optimize performance when generating zero-extended 32-bit unsigned constants (when RA0:63 equal 0) and when performing 32-bit logical operations on RA32:63.
Operation                          Instruction Sequence
Unsigned constant                  oris Rx,RA,UIh
(UIh,UIl zero extended)            ori Rt,Rx,UIl
Unsigned constant                  xoris Rx,RA,UIh
(UIh,UIl zero extended)            xori Rt,Rx,UIl
Table 3: 32-bit Unsigned Constant Generation
Operation                          Instruction Sequence
Signed constant                    addis Rx,RA,SIh
(SIh,SIl sign extended)            addi Rt,Rx,SIl
Signed constant                    addis Rx,0,SIh
(SIh sign extended; UI zero        ori Rt,Rx,UIl
extended)
Table 4: 32-bit Signed Constant Generation
Instruction Sequence
add Rx,RA,RB
extsw[.] Rt,Rx
addi Rx,RA,SI
extsw[.] Rt,Rx
addis Rx,RA,SI
extsw[.] Rt,Rx
subf Rx,RA,RB
extsw[.] Rt,Rx
neg Rx,RA
extsw[.] Rt,Rx
Table 5: 32-bit Sign Extended Addition
The following instruction sequence will optimize performance when zero-extending the result of a 32-bit addition.
Operation                          Instruction Sequence
Unsigned constant                  add Rx,RA,RB
(RA + RB zero extended)            rldicl Rt,Rx,0,32
Table 6: 32-bit Zero-Extended addition
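As an illustration of the constant-generation pairs in Tables 3 and 4, the following Python sketch models the two-instruction sequences on 64-bit register values. It is a simplified model, not a description of any implementation: the helper names are hypothetical, registers are unsigned 64-bit integers, and the base register value is taken to be 0 in both sequences.

```python
MASK64 = (1 << 64) - 1


def exts16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def oris(ra, ui):
    """OR the 16-bit UI into bits 32:47 of the register value."""
    return (ra | ((ui & 0xFFFF) << 16)) & MASK64


def ori(ra, ui):
    """OR the 16-bit UI into bits 48:63 of the register value."""
    return (ra | (ui & 0xFFFF)) & MASK64


def addis(ra, si):
    """Add the sign-extended SI, shifted left 16 bits, to the register value."""
    return (ra + (exts16(si) << 16)) & MASK64


def unsigned_const(value):
    """oris Rx,RA,UIh ; ori Rt,Rx,UIl with (RA) = 0 (Table 3)."""
    rx = oris(0, value >> 16)
    return ori(rx, value)


def signed_const(value):
    """addis Rx,0,SIh ; ori Rt,Rx,UIl (Table 4, second row)."""
    rx = addis(0, value >> 16)
    return ori(rx, value)
```

With the same 32-bit pattern, the first pair yields a zero-extended result and the second a sign-extended one, for example 0x0000_0000_DEAD_BEEF versus 0xFFFF_FFFF_DEAD_BEEF for the pattern 0xDEADBEEF.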
Programming Note
See the Programming Notes for Table 1.
Programming Note
Table 8 includes only the Type-A Multiply-Add
instructions because supporting only one of the two
types (i.e. either Type-A or Type-M) is sufficient to
preserve the contents of the destination operand of
the permute or Multiply-Add instruction. The xxlor
instruction “preserves” the contents of the destina-
tion operand by copying it into another register, and
the copy is then used as the destination operand of
the Multiply-Add instruction, which is overwritten
upon execution.
Programming Note
In order to ensure that the contents of registers are
preserved to the extent that a partially executed
instruction can be re-executed correctly, the regis-
ters that are preserved must satisfy the following
conditions. For any given instruction, zero or more
of the conditions applies.
For a fixed-point Load instruction that is not a
multiple or string form, if RT=RA or RT=RB
then the contents of register RT are not
altered.
For an update form Load or Store instruction,
the contents of register RA are not altered.
Programming Note
Warning: Other forms of or Rx,Rx,Rx that are not
described in this section and in Section 4.3.3 may
also cause program priority to change. Use of
these forms should be avoided except when soft-
ware explicitly intends to alter program priority. If a
no-op is needed, the preferred no-op (ori 0,0,0)
should be used.
[Figure 2. Data Stream Control Register. Field labels: URG, CNT, SSE, STE, LSD, LTE. Bit positions: 0 38 39 40 41 42 43 44 45 54 55 57 58 59 60 61 63]
Bit(s) Description
39 Software Transient Enable (SWTE)
 0 SWTE is disabled.
 0 default
 1 not urgent
 2 least urgent
 3 less urgent
 4 medium
 5 urgent
 6 more urgent
 7 most urgent
58 Load Stream Disable (LSD)
 0 No effect.
Programming Note
The URG, LSD, SNSE and SSE fields do not affect
the initiation of streams specified using the dcbt
and dcbtst instructions.
Note that even when SNSE is not set, hardware
may detect Stride-N streams in intervals when they
access elements that map to sequential cache
blocks.
Programming Note
Accesses that are caused by or associated with
Cache Management instructions that are “treated
as a Load” or “treated as a Store” are not subject to
the special ordering rules described for SAO stor-
age. These accesses are always performed in
accordance with the weakly consistent storage
model.
31 /// RA RB 982 /
0 6 11 16 21 31
31 / CT RA RB 22 /
0 6 7 11 16 21 31
Programming Note
Because the instruction is treated as a Load, the
effective address is translated using translation
resources that are used for data accesses, even
though the block being invalidated was copied into
the instruction cache based on translation
resources used for instruction fetches (see Book
III).
Programming Note
The invalidation of the specified block need not
have been performed with respect to the processor
executing the icbi instruction until a subsequent
isync instruction has been executed by that pro-
cessor. No other instruction or event has the corre-
sponding effect.
Programming Note
The architecture does not provide a way to specify
the size of the data elements that compose a
stream. An implementation may assume some
fixed size for all data elements. As a result,
depending on the offset, stride, and size (and in
particular whether the elements are aligned), the
implementation may reduce the latency for access-
ing only a portion of some of the elements. A future
version of the architecture may enable the specifi-
cation of element size to avoid this limitation.
32 GO 47:56 UNITCNT
Programming Note
To obtain the best performance across the widest range of implementations that support the data stream variants of dcbt/dcbtst, the programmer should assume the following model when using those variants.
The processor’s response to a hint that the program will probably soon access a given data stream is to take actions that reduce the latency of accesses to the first few elements of the stream. (Such actions may include prefetching cache blocks into levels of the storage hierarchy that are “near” the processor.) Thereafter, as the program accesses each successive element of the stream, the processor takes latency-reducing actions for additional elements of the stream, pacing these actions with the program’s accesses (i.e., taking the actions for only a limited number of elements ahead of the element that the program is currently accessing).
The processor’s response to a hint that the program will probably no longer access a given data stream, or to the cessation of existence of a data stream, is to stop taking latency-reducing actions for the stream.
A data stream having finite length ceases to exist when the latency-reducing actions have been taken for all elements of the stream.
If the program ceases to need a given data stream before having accessed all elements of the stream (always the case for streams having unlimited length), performance may be improved if the program then provides a hint that it will no longer access the stream (e.g., by executing the appropriate dcbt instruction with TH=0b01010 and S≠0b00).
At each level of the storage hierarchy that is “near” the processor, elements of a data stream that is specified as transient are most likely to be replaced. As a result, it may be desirable to stagger addresses of streams (choose addresses that map to different cache congruence classes) to reduce the likelihood that an element of a transient stream will be replaced prior to being accessed by the program.
Processors that comply with versions of the architecture that do not support the TH field at all treat TH = 0b01000, 0b01010, and 0b01011 as if TH = 0b00000.
A single set of stream IDs is shared between the dcbt and dcbtst instructions.
On some implementations, data streams that are not specified by software may be detected by the processor. Such data streams are called “hardware-detected data streams”. On some such implementations, data stream resources (resources that are used primarily to support data streams) are shared between software-specified data streams and hardware-detected data streams. On these latter implementations, the programming model includes the following.
- Software-specified data streams take precedence over hardware-detected data streams in use of data stream resources.
- The processor’s response to a hint that the program will probably no longer access a given data stream, or to the cessation of existence of a data stream, includes releasing the associated data stream resources, so that they can be used by hardware-detected data streams.
Programming Note
The latency-reducing actions taken in response to
a program's hints about access to a data stream,
including the depth and urgency parameters, may
vary based on its behavior and on the behavior of
other programs sharing platform resources, as well
as on the design of the platform resources they
use. Without actually changing the stream specifi-
cation or DSCR parameters, the processor may
adjust its actions (e.g. slow down prefetches or be
more selective choosing them) based on their
effectiveness and on the availability of storage
bandwidth. In general, the goal of this variation is
to improve overall system performance and fair-
ness across the set of programs that share
resources. There often will be a performance ben-
efit, however, from adjusting stream specifications
to the platform and co-resident programs to adjust
for these actions by the processor.
Programming Note
This Programming Note describes several aspects of using the data stream variants of the dcbt and dcbtst instructions.
A non-transient data stream having unlimited length and which will access consecutive units in storage can be completely specified, including providing the hint that the program will probably soon access it, using one dcbt instruction. The corresponding specification for a data stream having other attributes requires two or three dcbt/dcbtst instructions to describe the stream and one additional dcbt instruction to start the stream. However, one dcbt instruction with TH=0b01010 and GO=1 can apply to a set of the data streams described in the preceding sentence, so the corresponding specification for n such data streams requires 2×n to 3×n dcbt/dcbtst instructions plus one dcbt instruction. (There is no need to execute a dcbt/dcbtst instruction with TH=0b01010 and S=0b10 for a given stream ID before using the stream ID for a new data stream; the implicit portion of the hint provided by dcbt/dcbtst instructions that describe data streams suffices.)
If it is desired that the hint provided by a given dcbt/dcbtst instruction be provided in program order with respect to the hint provided by another dcbt/dcbtst instruction, the two instructions must be separated by an eieio instruction. For example, if a dcbt instruction with TH=0b01010 and GO=1 is intended to indicate that the program will probably soon access nascent data streams described (completely) by preceding dcbt/dcbtst instructions, and is intended not to indicate that the program will probably soon access nascent data streams described (completely) by following dcbt/dcbtst instructions, an eieio instruction must separate the dcbt instruction with GO=1 from the preceding dcbt/dcbtst instructions, and another eieio instruction must separate that dcbt instruction from the following dcbt/dcbtst instructions.
In practice, the second eieio described above can sometimes be omitted. For example, if the program consists of an outer loop that contains the dcbt/dcbtst instructions and an inner loop that contains the Load or Store instructions that access the data streams, the characteristics of the inner loop and of the implementation’s branch prediction mechanisms may make it highly unlikely that hints corresponding to a given iteration of the outer loop will be provided out of program order with respect to hints corresponding to the previous iteration of the outer loop. (Also, any providing of hints out of program order affects only performance, not program correctness.)
To mitigate the effects of interrupts on data streams, it may be desirable to specify a given “logical” data stream as a sequence of shorter, component data streams. Similar considerations apply to conditions and events that, depending on the implementation, may cause an existing data stream to cease to exist; for example, in some implementations an existing data stream ceases to exist when it comes to the end of a virtual page.
If it is desired to specify data streams without regard to the number of stream IDs provided by the implementation, stream IDs should be assigned to data streams in order of decreasing stream importance (stream ID 0 to the most important stream, stream ID 1 to the next most important stream, etc.). This order ensures that the hints for the most important data streams will be provided.
TH=0b10000
If TH=0b10000, the dcbt instruction provides a hint that the program will probably soon load from the block containing the byte addressed by EA, and that the program’s need for the block will be transient (i.e., the time interval during which the program accesses the block is likely to be short).
Programming Note
The processor’s response to the hint that access to the block will be transient is to prefetch data into the cache hierarchy in a way that minimizes the displacement of data that has not been identified as transient.
TH=0b10001
If TH=0b10001, the dcbt instruction provides a hint that the program will probably not access the block containing the byte addressed by EA for a relatively long period of time.
31 TH RA RB 246 /
0 6 11 16 21 31
Programming Note
See the Programming Notes at the beginning of
this section.
Data Cache Block set to Zero X-form
dcbz RA,RB
For storage that is either Write Through Required or Caching Inhibited, dcbz is likely to take significantly longer to execute than an equivalent sequence of Store instructions. For example, on some implementations dcbz for such storage may cause the system alignment error handler to be invoked; on such implementations the system alignment error handler sets the specified block to zero using Store instructions.
See Section 5.9.1 of Book III for additional information about dcbz.
Data Cache Block Store X-form
dcbst RA,RB
Special Registers Altered:
None
This instruction is treated as a Load (see Section 4.3), except that reference and change recording need not be done.
The treatment of these instructions is independent of whether other Vector instructions are available (i.e., is independent of the contents of MSRVEC (see Book III)).
Programming Note
Warning: Other forms of or Rx,Rx,Rx that are not
described in this section and in Section 3.2 may
also cause program priority to change. Use of
these forms should be avoided except when soft-
ware explicitly intends to alter program priority. If a
no-op is needed, the preferred no-op (ori 0,0,0)
should be used.
Programming Note
A failure of a singleton move will generally be the
result of a shortage of the resources required to
complete the operation. When the resources are
known to be shared by multiple programs, a
credit-based system is frequently used to improve
quality of service. If such a credit system is in use,
or if the resources are not shared, the program
should continually repeat the singleton move until it
succeeds. However, if no credit system is in use for
shared resources, it may be appropriate to apply
some sort of backoff algorithm after having retried
the singleton move a few times.
Programming Note
WARNING: In rare circumstances, paste_last may
falsely report successful completion when the
move group is coded incorrectly. This may occur if
the move group sequence includes a redundant
copy_first and the sequence is interrupted just
prior to the redundant copy_first. Since interrupts
should be rare, any sequence that returns a false
positive CR0 value should fail for most executions.
Programming Note
Once a move group encounters an error, the hard-
ware is permitted to take shortcuts on the remain-
der of the group to reduce resource utilization.
Software that properly breaks a failed move group
into singletons and completes that emulation of the
move group prior to accessing the targets of the
move group can not detect such shortcuts. In gen-
eral the move group itself and any interleaved
instructions must constitute an idempotent collec-
tion of operations.
The instruction descriptions include an example of
how the hardware can take a shortcut by not per-
forming the transfer, e.g. if MGV=0.
Engineering Note
Copy X-form
copy RA,RB,L
31 /// L RA RB 774 /
0 6 9 10 11 16 21 31
if L=1
 if MGIP=0
 MGIP ← 1
 MGV ← 1
 CPTOGGLE ← 0
 else
 MGV ← 0
if (MGIP=1)&(MGV=1)&(CPTOGGLE=0)
 if RA = 0 then b ← 0
 else b ← (RA)
 EA ← b +(RB)
 copy_buffer ← mem(EA,128)
 CPTOGGLE ← 1
else
 MGIP ← 1
 MGV ← 0
The copy instruction copies 128 bytes from the specified location in storage into a copy buffer, as described below. If L=1, the instruction identifies the beginning of a move group.
In preparation for the copy operation, the following move group state changes are made. If L=1 and a move group is already in progress MGV is set to zero; otherwise, MGIP is set to 1, MGV is set to 1, and CPTOGGLE is set to 0.
If MGIP=0, the instruction is attempting to start a sequence with a plain copy. If MGV=0, either a copy_first was issued with a move group already in progress or some programming error happened before this instruction. If CPTOGGLE=1 when neither of the previous errors occurred, this is a copy following a copy. All three of these errors will result in setting MGIP=1 and MGV=0. Otherwise, this is a good copy, and the instruction copies 128 bytes of storage into the copy buffer as follows.
- Let the effective address (EA) be the sum (RA|0)+(RB).
- If the storage addressed by EA is in cluster shared memory, an outboard translation/authority mechanism will be invoked in addition to the normal translation/authority checking.
- The 128 bytes in storage addressed by EA are loaded into the copy buffer and CPTOGGLE is set to 1 to indicate a Copy instruction has been processed.
If the EA is not a multiple of 128, the system alignment error handler is invoked.
If the specified block is in storage that is Caching Inhibited, the system data storage error handler is invoked.
If any of the sequencing errors described above have occurred and this copy is part of a singleton move, the system data storage error handler will be invoked in association with the execution of the related paste. instruction (if another interruption did not separate the two). If any of the sequencing errors described above have occurred and this copy is part of a (non-singleton) move group, the paste. will return “move group failed” in CR0.
When successful, this instruction is treated as a Load (see Section 4.3, “Cache Management Instructions”), except that the transfer ordering is described in Section 1.7.2, “Storage Ordering of Copy/Paste-Initiated Data Transfers”.
Special Registers Altered:
None
Extended Mnemonics:
Extended mnemonics for Copy:
Extended:            Equivalent to:
copy RA,RB           copy RA,RB,0
copy_first RA,RB     copy RA,RB,1
Programming Note
copy serves as both a basic and an extended mnemonic. The Assembler will recognize a copy mnemonic with three operands as the basic form, and a copy mnemonic with two operands as the extended form. In the extended form the L operand is omitted and assumed to be 0.
Paste X-form
paste RA,RB,L (L=0 Rc=0)
paste. RA,RB,L (L=1 Rc=1)
31 /// L RA RB 902 Rc
0 6 9 10 11 16 21 31
if (MGIP=0)|(MGV=0)|(CPTOGGLE=0)
 MGIP ← 1
 MGV ← 0
else
 if RA = 0 then b ← 0
 else b ← (RA)
 EA ← b +(RB)
 post-mem(EA,128,L) ← copy_buffer
 CPTOGGLE ← 0
if L=1
 if (MGIP=1)&(MGV=1)
 wait for group completion status
 if group succeeded
 CR0 ← 0b001||XERSO
 else
 CR0 ← 0b000||XERSO
 else /* this is a programming error */
 CR0 ← 0b000||XERSO
 MGIP ← 0
 MGV ← 0
The paste instruction copies 128 bytes from the copy buffer to the specified location in storage, as described below. If L=1, the instruction identifies the end of a move group, and the successful completion of the instruction is contingent on the successful completion of all of the paste operations in the group.
If MGIP=0, the instruction is attempting to start a sequence with a paste. If MGV=0, some programming error happened before this instruction. If CPTOGGLE=0 when neither of the previous errors occurred, this is a paste following a paste. All three of these errors will result in setting MGIP=1 and MGV=0. Otherwise, this is a good paste, and the instruction pastes the 128 bytes from the copy buffer to storage as follows.
- Let the effective address (EA) be the sum (RA|0)+(RB).
- If the storage addressed by EA is in cluster shared memory, an outboard translation/authority mechanism will be invoked in addition to the normal translation/authority checking.
- The contents of the copy buffer are posted to be stored in the 128 bytes of storage or sent to the accelerator addressed by EA and an indicator of whether this is a paste_last is sent along. The actual update of storage will be done asynchronously, but prior to the completion of the paste_last for the move group. If it is a paste_last, status for all the transfers will be collected to determine the success of the move group. CPTOGGLE is set to 0 to indicate a Paste instruction has been processed.
- If L=1 and nothing has gone wrong with the move group so far, the processor waits for the group status to be collected, and CR0 is set as follows to report the status of the move group.
CR0                Description
0b000||XERSO       Move group failed.
0b001||XERSO       Move group successful.
If L≠Rc, the instruction form is invalid.
If the EA is not a multiple of 128, the system alignment error handler is invoked.
If the specified block is in storage that is Caching Inhibited, the system data storage error handler is invoked.
If any of the following occurs as part of a (non-singleton) move group, the paste. ending the group will return “move group failed” in CR0. If any of them occurs as part of a singleton move, the behavior will be as described for each situation.
- For a copy / paste[.] pair, if one instruction specifies a location in CSM and the other does not specify a location in (local) main storage, the system data storage error handler will be invoked.
- For a copy / paste[.] pair that specifies an accelerator, if the copy specifies (local) main storage and the paste[.] specifies the accelerator, the accelerator operation will be initiated (once executed as a singleton move); otherwise, the system data storage error handler will be invoked.
- If any of the sequencing errors described above have occurred, the system data storage error handler will be invoked.
- If an error from the copy description that is specified to occur when the associated paste[.] is executed occurs, the system data storage error handler is invoked.
When successful, this instruction is treated as a Store (see Section 4.3, “Cache Management Instructions”), except that the transfer ordering is described in Section 1.7.2, “Storage Ordering of Copy/Paste-Initiated Data Transfers”.
Special Registers Altered:
CR0
Extended Mnemonics:
Extended mnemonics for Paste:
Extended:             Equivalent to:
paste RA,RB           paste RA,RB,0
paste_last RA,RB      paste. RA,RB,1
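The move group sequencing rules in the copy and paste RTL can be followed more easily with a small state model. The Python sketch below tracks only the three state bits named in the RTL (MGIP, MGV, CPTOGGLE) and the CR0 value returned by paste with L=1. It is a simplified illustration under stated assumptions: the class and function names are hypothetical, no data is moved, alignment and storage errors are not modelled, and the outcome of the data transfers is supplied by the caller through the hypothetical transfers_ok parameter.

```python
from dataclasses import dataclass


@dataclass
class MoveGroupState:
    """The three move group state bits used by the copy and paste RTL."""
    mgip: int = 0
    mgv: int = 0
    cptoggle: int = 0


def copy(state, L):
    """Apply the copy RTL; return True if the 128-byte copy would be done."""
    if L == 1:
        if state.mgip == 0:
            state.mgip, state.mgv, state.cptoggle = 1, 1, 0
        else:
            state.mgv = 0
    if state.mgip == 1 and state.mgv == 1 and state.cptoggle == 0:
        state.cptoggle = 1       # good copy: copy_buffer <- mem(EA,128)
        return True
    state.mgip, state.mgv = 1, 0  # sequencing error
    return False


def paste(state, L, transfers_ok=True, xer_so=0):
    """Apply the paste RTL; return the 4-bit CR0 value if L=1, else None."""
    if state.mgip == 0 or state.mgv == 0 or state.cptoggle == 0:
        state.mgip, state.mgv = 1, 0  # sequencing error
    else:
        state.cptoggle = 0            # good paste: post-mem(EA,128,L) <- copy_buffer
    if L != 1:
        return None
    succeeded = state.mgip == 1 and state.mgv == 1 and transfers_ok
    state.mgip, state.mgv = 0, 0      # the group ends either way
    return ((0b001 if succeeded else 0b000) << 1) | xer_so
```

In this model a copy_first followed by a paste_last reports success, while a group that begins with a plain copy, or that contains two copies in a row, reports “move group failed” at the closing paste_last.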
Programming Note
paste serves as both a basic and an extended mnemonic. The Assembler will recognize a paste mnemonic with three operands as the basic form,
CP_Abort X-form
cp_abort
31 /// /// /// 838 /
0 6 11 16 21 31
Architecture Note
Engineering
Function Code   GPR operands      Storage operands   Function name and RTL
10000           RT, RT+1, RT+2    mem(EA,s)          Compare and Swap Not Equal
                                                     t ← mem(EA, s)
                                                     if t != (RT+1) then mem(EA,s) ← (RT+2)
                                                     RT ← t
Notes:
s = operand size in number of bytes
Function codes not listed in this table are considered invalid.
For word atomics, only the least significant word of each source register is used, and the least significant word of
the target register is updated with the result, while the upper word is set to zero.
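The Compare and Swap Not Equal entry can be read as the following Python sketch. It is an illustration only: storage is modelled as a dictionary from effective address to an unsigned integer of s bytes, the function name and its parameters are hypothetical, and atomicity is simply assumed because the model is single-threaded.

```python
def compare_and_swap_not_equal(mem, ea, rt_plus_1, rt_plus_2, s=8):
    """Model function code 10000; return the value placed in RT.

    mem maps EA to the unsigned value of the s-byte operand (an assumption
    of this sketch). For word atomics (s=4) only the least significant word
    of each source register is used and the result is zero-extended.
    """
    mask = (1 << (8 * s)) - 1
    t = mem[ea] & mask                 # t <- mem(EA, s)
    if t != (rt_plus_1 & mask):        # if t != (RT+1)
        mem[ea] = rt_plus_2 & mask     #   mem(EA, s) <- (RT+2)
    return t                           # RT <- t
```

Note that the store happens when the comparison fails, which is the opposite sense of a conventional compare-and-swap; the old storage value is returned in either case.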
31 RT RA FC 582 /
0 6 11 16 21 31
31 RT RA FC 614 /
0 6 11 16 21 31
Function Code GPR operands Storage operands Function name and RTL
Notes:
s = operand size in number of bytes
Function codes not listed in this table are considered invalid.
For word atomics, only the least significant word of each source register is used.
31 RS RA FC 710 /
0 6 11 16 21 31
31 RS RA FC 742 /
0 6 11 16 21 31
Programming Note
The Memory Coherence Required attribute on
other processors and mechanisms ensures that
their stores to the reservation granule will cause the
reservation created by the Load And Reserve
instruction to be lost.
Programming Note
Because the Load And Reserve and Store Conditional instructions have implementation dependencies (e.g., the granularity at which reservations are managed), they must be used with care. The operating system should provide system library programs that use these instructions to implement the high-level synchronization functions (Test and Set, Compare and Swap, locking, etc.; see Appendix B) that are needed by application programs. Application programs should use these library programs, rather than use the Load And Reserve and Store Conditional instructions directly.
Programming Note
EH = 1 should be used when the program is obtaining a lock variable which it will subsequently release before another program attempts to perform a store to it. When contention for a lock is significant, using this hint may reduce the number of times a cache block is transferred between processor caches.
EH = 0 should be used when all accesses to a mutex variable are performed using an instruction sequence with Load and Reserve followed by Store Conditional (e.g., emulating atomic update primitives such as “Fetch and Add;” see Appendix B). The processor may use this hint to optimize the cache to cache transfer of the block containing the mutex variable, thus reducing the latency of performing an operation such as ‘Fetch and Add’.
Engineering Note
Programming Note
Either value of the EH field is appropriate for a Load and Reserve instruction that is intended to establish a reservation for a subsequent waitrsv and not a subsequent Store Conditional instruction.
Programming Note
Warning: On some processors that comply with versions of the architecture that precede Version 2.00, executing a Load And Reserve instruction in which EH = 1 will cause the illegal instruction error handler to be invoked.
31 RT RA RB 52 EH
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b +(RB)
RESERVE ← 1
RESERVE_LENGTH ← 1
RESERVE_ADDR ← real_addr(EA)
RT ← ⁵⁶0 || MEM(EA, 1)
Let the effective address (EA) be the sum (RA|0)+(RB). The byte in storage addressed by EA is loaded into RT56:63. RT0:55 are set to 0.
This instruction creates a reservation for use by a stbcx. instruction. A real address computed from the EA as described in Section 1.7.4.1 is associated with the reservation, and replaces any address previously associated with the reservation. A length of 1 byte is associated with the reservation, and replaces any length previously associated with the reservation.
The value of EH provides a hint as to whether the program will perform a subsequent store to the byte in storage addressed by EA before some other processor attempts to modify it.
0 Other programs might attempt to modify the byte in storage addressed by EA regardless of the result of the corresponding stbcx. instruction.
1 Other programs will not attempt to modify the byte in storage addressed by EA until the program that has acquired the lock performs a subsequent store releasing the lock.
Special Registers Altered:
None
Programming Note
lbarx serves as both a basic and an extended mnemonic. The Assembler will recognize a lbarx mnemonic with four operands as the basic form, and a lbarx mnemonic with three operands as the extended form. In the extended form the EH operand is omitted and assumed to be 0.
Load Halfword And Reserve Indexed X-form
lharx RT,RA,RB,EH
31 RT RA RB 116 EH
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b +(RB)
RESERVE ← 1
RESERVE_LENGTH ← 2
RESERVE_ADDR ← real_addr(EA)
RT ← ⁴⁸0 || MEM(EA, 2)
Let the effective address (EA) be the sum (RA|0)+(RB). The halfword in storage addressed by EA is loaded into RT48:63. RT0:47 are set to 0.
This instruction creates a reservation for use by a sthcx. instruction. A real address computed from the EA as described in Section 1.7.4.1 is associated with the reservation, and replaces any address previously associated with the reservation. A length of 2 bytes is associated with the reservation, and replaces any length previously associated with the reservation.
The value of EH provides a hint as to whether the program will perform a subsequent store to the halfword in storage addressed by EA before some other processor attempts to modify it.
0 Other programs might attempt to modify the halfword in storage addressed by EA regardless of the result of the corresponding sthcx. instruction.
1 Other programs will not attempt to modify the halfword in storage addressed by EA until the program that has acquired the lock performs a subsequent store releasing the lock.
EA must be a multiple of 2. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined.
Special Registers Altered:
None
Load Word And Reserve Indexed X-form
lwarx RT,RA,RB,EH
31 RT RA RB 20 EH
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b +(RB)
RESERVE ← 1
RESERVE_LENGTH ← 4
RESERVE_ADDR ← real_addr(EA)
RT ← ³²0 || MEM(EA, 4)
Let the effective address (EA) be the sum (RA|0)+(RB). The word in storage addressed by EA is loaded into RT32:63. RT0:31 are set to 0.
This instruction creates a reservation for use by a stwcx. instruction. A real address computed from the EA as described in Section 1.7.4.1 is associated with the reservation, and replaces any address previously associated with the reservation. A length of 4 bytes is associated with the reservation, and replaces any length previously associated with the reservation.
The value of EH provides a hint as to whether the program will perform a subsequent store to the word in storage addressed by EA before some other processor attempts to modify it.
0 Other programs might attempt to modify the word in storage addressed by EA regardless of the result of the corresponding stwcx. instruction.
1 Other programs will not attempt to modify the word in storage addressed by EA until the program that has acquired the lock performs a subsequent store releasing the lock.
EA must be a multiple of 4. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined.
Special Registers Altered:
None
Programming Note
Programming Note lwarx serves as both a basic and an extended
lharx serves as both a basic and an extended mnemonic. The Assembler will recognize a lwarx
mnemonic. The Assembler will recognize a lharx mnemonic with four operands as the basic form,
mnemonic with four operands as the basic form, and a lwarx mnemonic with three operands as the
and a lharx mnemonic with three operands as the extended form. In the extended form the EH oper-
extended form. In the extended form the EH oper- and is omitted and assumed to be 0.
and is omitted and assumed to be 0.
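The Load And Reserve RTL for lharx and lwarx can be sketched as a small Python model. This is an illustration under stated assumptions, not the architecture's definition: the Thread class and function names are hypothetical, real addresses are taken to equal effective addresses, storage is treated as big-endian, and a misaligned EA always raises an exception (the text also permits boundedly undefined results).

```python
MASK64 = (1 << 64) - 1


class AlignmentError(Exception):
    """Stands in for the system alignment error handler (one permitted outcome)."""


class Thread:
    """Hypothetical container for the state the RTL touches."""

    def __init__(self, memory):
        self.memory = memory        # bytearray; assumption: real address == effective address
        self.gpr = [0] * 32
        self.reserve = False
        self.reserve_length = 0
        self.reserve_addr = None


def load_and_reserve(thread, rt, ra, rb, length):
    """Model lharx (length=2) or lwarx (length=4); returns the EA used."""
    b = 0 if ra == 0 else thread.gpr[ra]          # (RA|0): RA=0 means the value 0
    ea = (b + thread.gpr[rb]) & MASK64
    if ea % length != 0:
        raise AlignmentError(f"EA {ea:#x} is not a multiple of {length}")
    thread.reserve = True                          # RESERVE <- 1
    thread.reserve_length = length                 # replaces any previous length
    thread.reserve_addr = ea                       # replaces any previous address
    # Zero-extended load: the high-order bits of RT are 0 (big-endian assumed).
    thread.gpr[rt] = int.from_bytes(thread.memory[ea:ea + length], "big")
    return ea


mem = bytearray(16)
mem[4:8] = bytes.fromhex("DEADBEEF")
t = Thread(mem)
t.gpr[4] = 4
load_and_reserve(t, rt=3, ra=0, rb=4, length=4)
print(hex(t.gpr[3]), t.reserve_length)
```

A later Load And Reserve simply overwrites the recorded address and length, which matches the statement that each replaces any value previously associated with the reservation.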
Store Byte Conditional Indexed X-form
stbcx. RS,RA,RB
31 RS RA RB 694 1
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
if RESERVE then
  if RESERVE_LENGTH = 1 then
    if RESERVE_ADDR = real_addr(EA) then
      MEM(EA, 1) ← (RS)[56:63]
      undefined_case ← 0
      store_performed ← 1
    else
      z ← smallest real page size supported by
          implementation
      if RESERVE_ADDR ÷ z = real_addr(EA) ÷ z then
        undefined_case ← 1
      else
        undefined_case ← 0
        store_performed ← 0
  else
    undefined_case ← 1
else
  undefined_case ← 0
  store_performed ← 0
if undefined_case then
  u1 ← undefined 1-bit value
  if u1 then
    MEM(EA, 1) ← (RS)[56:63]
  u2 ← undefined 1-bit value
  CR0 ← 0b00 || u2 || XER[SO]
else
  CR0 ← 0b00 || store_performed || XER[SO]
RESERVE ← 0
Let the effective address (EA) be the sum (RA|0)+(RB).
If a reservation exists, the length associated with the
reservation is 1 byte, and the real storage location
specified by the stbcx. is the same as the real storage
location specified by the lbarx instruction that established
the reservation, (RS)[56:63] are stored into the byte
in storage addressed by EA.
If a reservation exists, the length associated with the
reservation is 1 byte, and the real storage location
specified by the stbcx. is not the same as the real storage
location specified by the lbarx instruction that
established the reservation, the following applies.
  Let z denote an aligned block of real storage
  whose size is the smallest real page size supported
  by the implementation. If the real storage
  location specified by the stbcx. is in the same z as
  the real storage location specified by the lbarx
  instruction that established the reservation, it is
  undefined whether (RS)[56:63] are stored into the
  byte in storage addressed by EA. Otherwise, no
  store is performed.
If a reservation exists and the length associated with
the reservation is not 1 byte, it is undefined whether
(RS)[56:63] are stored into the byte in storage addressed
by EA.
If a reservation does not exist, no store is performed.
CR Field 0 is set as follows. n is a 1-bit value that indicates
whether the store was performed, except that if,
per the preceding description, it is undefined whether
the store is performed, the value of n is undefined (and
need not reflect whether the store was performed).
CR0[LT GT EQ SO] = 0b00 || n || XER[SO]
The reservation is cleared.
Special Registers Altered:
CR0
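The decision tree in the stbcx. RTL can be restated as a short Python classifier. This is a sketch under assumptions, not a definitive model: the Reservation class and function name are hypothetical, the default page_size of 4096 is only a placeholder for "smallest real page size supported by implementation", and the undefined cases are reported explicitly instead of being resolved with an arbitrary 1-bit value.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """Hypothetical stand-in for RESERVE, RESERVE_LENGTH and RESERVE_ADDR."""
    valid: bool = False
    length: int = 0
    addr: int = 0


def store_conditional(res, real_addr, length, page_size=4096):
    """Classify one Store Conditional; returns (outcome, cr0_eq).

    outcome is "stored", "not_stored" or "undefined". cr0_eq is the CR0 EQ bit
    (1 or 0), or None when the architecture leaves it undefined.
    page_size is an assumed value for z, not taken from the text.
    """
    if not res.valid:
        outcome = "not_stored"                      # no reservation exists
    elif res.length != length:
        outcome = "undefined"                       # reservation length differs
    elif res.addr == real_addr:
        outcome = "stored"                          # same real storage location
    elif res.addr // page_size == real_addr // page_size:
        outcome = "undefined"                       # different location, same block z
    else:
        outcome = "not_stored"                      # different block z
    res.valid = False                               # RESERVE <- 0 on every path
    cr0_eq = {"stored": 1, "not_stored": 0, "undefined": None}[outcome]
    return outcome, cr0_eq


r = Reservation(valid=True, length=1, addr=0x1000)
print(store_conditional(r, 0x1000, 1))
print(store_conditional(r, 0x1000, 1))
```

The second call reports that no store is performed because the first call already cleared the reservation, which is the behaviour the RTL expresses with its final RESERVE assignment.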
Store Halfword Conditional Indexed X-form
sthcx. RS,RA,RB
31 RS RA RB 726 1
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
if RESERVE then
  if RESERVE_LENGTH = 2 then
    if RESERVE_ADDR = real_addr(EA) then
      MEM(EA, 2) ← (RS)[48:63]
      undefined_case ← 0
      store_performed ← 1
    else
      z ← smallest real page size supported by
          implementation
      if RESERVE_ADDR ÷ z = real_addr(EA) ÷ z then
        undefined_case ← 1
      else
        undefined_case ← 0
        store_performed ← 0
  else
    undefined_case ← 1
else
  undefined_case ← 0
  store_performed ← 0
if undefined_case then
  u1 ← undefined 1-bit value
  if u1 then
    MEM(EA, 2) ← (RS)[48:63]
  u2 ← undefined 1-bit value
  CR0 ← 0b00 || u2 || XER[SO]
else
  CR0 ← 0b00 || store_performed || XER[SO]
RESERVE ← 0
Let the effective address (EA) be the sum (RA|0)+(RB).
If a reservation exists, the length associated with the
reservation is 2 bytes, and the real storage location
specified by the sthcx. is the same as the real storage
location specified by the lharx instruction that established
the reservation, (RS)[48:63] are stored into the
halfword in storage addressed by EA.
If a reservation exists, the length associated with the
reservation is 2 bytes, and the real storage location
specified by the sthcx. is not the same as the real storage
location specified by the lharx instruction that
established the reservation, the following applies.
  Let z denote an aligned block of real storage
  whose size is the smallest real page size supported
  by the implementation. If the real storage
  location specified by the sthcx. is in the same z as
  the real storage location specified by the lharx
  instruction that established the reservation, it is
  undefined whether (RS)[48:63] are stored into the
  halfword in storage addressed by EA. Otherwise,
  no store is performed.
If a reservation exists and the length associated with
the reservation is not 2 bytes, it is undefined whether
(RS)[48:63] are stored into the halfword in storage
addressed by EA.
If a reservation does not exist, no store is performed.
CR Field 0 is set as follows. n is a 1-bit value that indicates
whether the store was performed, except that if,
per the preceding description, it is undefined whether
the store is performed, the value of n is undefined (and
need not reflect whether the store was performed).
CR0[LT GT EQ SO] = 0b00 || n || XER[SO]
The reservation is cleared.
EA must be a multiple of 2. If it is not, either the system
alignment error handler is invoked or the results are
boundedly undefined.
Special Registers Altered:
CR0
Store Word Conditional Indexed X-form
stwcx. RS,RA,RB
31 RS RA RB 150 1
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
if RESERVE then
  if RESERVE_LENGTH = 4 then
    if RESERVE_ADDR = real_addr(EA) then
      MEM(EA, 4) ← (RS)[32:63]
      undefined_case ← 0
      store_performed ← 1
    else
      z ← smallest real page size supported by
          implementation
      if RESERVE_ADDR ÷ z = real_addr(EA) ÷ z then
        undefined_case ← 1
      else
        undefined_case ← 0
        store_performed ← 0
  else
    undefined_case ← 1
else
  undefined_case ← 0
  store_performed ← 0
if undefined_case then
  u1 ← undefined 1-bit value
  if u1 then
    MEM(EA, 4) ← (RS)[32:63]
  u2 ← undefined 1-bit value
  CR0 ← 0b00 || u2 || XER[SO]
else
  CR0 ← 0b00 || store_performed || XER[SO]
RESERVE ← 0
Let the effective address (EA) be the sum (RA|0)+(RB).
If a reservation exists, the length associated with the
reservation is 4 bytes, and the real storage location
specified by the stwcx. is the same as the real storage
location specified by the lwarx instruction that established
the reservation, (RS)[32:63] are stored into the
word in storage addressed by EA.
If a reservation exists, the length associated with the
reservation is 4 bytes, and the real storage location
specified by the stwcx. is not the same as the real storage
location specified by the lwarx instruction that
established the reservation, the following applies.
  Let z denote an aligned block of real storage
  whose size is the smallest real page size supported
  by the implementation. If the real storage
  location specified by the stwcx. is in the same z as
  the real storage location specified by the lwarx
  instruction that established the reservation, it is
  undefined whether (RS)[32:63] are stored into the
  word in storage addressed by EA. Otherwise, no
  store is performed.
If a reservation exists and the length associated with
the reservation is not 4 bytes, it is undefined whether
(RS)[32:63] are stored into the word in storage addressed
by EA.
If a reservation does not exist, no store is performed.
CR Field 0 is set as follows. n is a 1-bit value that indicates
whether the store was performed, except that if,
per the preceding description, it is undefined whether
the store is performed, the value of n is undefined (and
need not reflect whether the store was performed).
CR0[LT GT EQ SO] = 0b00 || n || XER[SO]
The reservation is cleared.
EA must be a multiple of 4. If it is not, either the system
alignment error handler is invoked or the results are
boundedly undefined.
Special Registers Altered:
CR0
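Because stwcx. reports in CR0 whether the store was performed, software typically pairs it with lwarx in a retry loop. The following Python sketch illustrates that pattern on a single word. The Word class and atomic_add function are hypothetical, only the "same location, same length" case is modelled, and the model assumes that a store to the word by another processor causes the reservation to be lost.

```python
class Word:
    """One aligned word plus one thread's reservation on it (illustrative only)."""

    def __init__(self, value=0):
        self.value = value
        self.reserved = False

    def lwarx(self):
        self.reserved = True            # RESERVE <- 1
        return self.value

    def stwcx(self, new_value):
        performed = self.reserved       # store only if the reservation still exists
        if performed:
            self.value = new_value & 0xFFFFFFFF
        self.reserved = False           # the reservation is always cleared
        return performed                # plays the role of the CR0 EQ bit

    def store_from_other_processor(self, new_value):
        # Assumption of this model: an intervening store loses the reservation.
        self.value = new_value & 0xFFFFFFFF
        self.reserved = False


def atomic_add(word, delta, interference=()):
    """Retry lwarx/stwcx. until the store is performed; returns (old, attempts).

    interference lists values another processor stores between the load and
    the store conditional, one per attempt, to exercise the retry path.
    """
    pending = list(interference)
    attempts = 0
    while True:
        attempts += 1
        old = word.lwarx()
        if pending:
            word.store_from_other_processor(pending.pop(0))
        if word.stwcx(old + delta):
            return old, attempts


w = Word(10)
print(atomic_add(w, 5, interference=[100]), w.value)
```

In the example the first attempt fails because of the interfering store, so the loop reloads the new value and the second attempt succeeds.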
Load Doubleword And Reserve Indexed X-form
ldarx RT,RA,RB,EH
31 RT RA RB 84 EH
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
RESERVE ← 1
RESERVE_LENGTH ← 8
RESERVE_ADDR ← real_addr(EA)
RT ← MEM(EA, 8)
Let the effective address (EA) be the sum (RA|0)+(RB).
The doubleword in storage addressed by EA is loaded
into RT.
This instruction creates a reservation for use by a
stdcx. instruction. A real address computed from the
EA as described in Section 1.7.4.1 is associated with
the reservation, and replaces any address previously
associated with the reservation. A length of 8 bytes is
associated with the reservation, and replaces any
length previously associated with the reservation.
The value of EH provides a hint as to whether the program
will perform a subsequent store to the doubleword
in storage addressed by EA before some other processor
attempts to modify it.
0 Other programs might attempt to modify
  the doubleword in storage addressed by
  EA regardless of the result of the corresponding
  stdcx. instruction.
1 Other programs will not attempt to modify
  the doubleword in storage addressed by
  EA until the program that has acquired the
  lock performs a subsequent store releasing
  the lock.
EA must be a multiple of 8. If it is not, either the system
alignment error handler is invoked or the results are
boundedly undefined.
Special Registers Altered:
None
Programming Note
ldarx serves as both a basic and an extended
mnemonic. The Assembler will recognize a ldarx
mnemonic with four operands as the basic form,
and a ldarx mnemonic with three operands as the
extended form. In the extended form the EH operand
is omitted and assumed to be 0.
Store Doubleword Conditional Indexed X-form
stdcx. RS,RA,RB
31 RS RA RB 214 1
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
if RESERVE then
  if RESERVE_LENGTH = 8 then
    if RESERVE_ADDR = real_addr(EA) then
      MEM(EA, 8) ← (RS)
      undefined_case ← 0
      store_performed ← 1
    else
      z ← smallest real page size supported by
          implementation
      if RESERVE_ADDR ÷ z = real_addr(EA) ÷ z then
        undefined_case ← 1
      else
        undefined_case ← 0
        store_performed ← 0
  else
    undefined_case ← 1
else
  undefined_case ← 0
  store_performed ← 0
if undefined_case then
  u1 ← undefined 1-bit value
  if u1 then
    MEM(EA, 8) ← (RS)
  u2 ← undefined 1-bit value
  CR0 ← 0b00 || u2 || XER[SO]
else
  CR0 ← 0b00 || store_performed || XER[SO]
RESERVE ← 0
Let the effective address (EA) be the sum (RA|0)+(RB).
If a reservation exists, the length associated with the
reservation is 8 bytes, and the real storage location
specified by the stdcx. is the same as the real storage
location specified by the ldarx instruction that established
the reservation, (RS) is stored into the doubleword
in storage addressed by EA.
If a reservation exists, the length associated with the
reservation is 8 bytes, and the real storage location
specified by the stdcx. is not the same as the real storage
location specified by the ldarx instruction that
established the reservation, the following applies.
  Let z denote an aligned block of real storage
  whose size is the smallest real page size supported
  by the implementation. If the real storage
  location specified by the stdcx. is in the same z as
  the real storage location specified by the ldarx
  instruction that established the reservation, it is
Load Quadword And Reserve Indexed X-form
lqarx RTp,RA,RB,EH
31 RTp RA RB 276 EH
0 6 11 16 21 31
If RTp is odd, RTp=RA, or RTp=RB the instruction form
is invalid. If RTp=RA or RTp=RB, an attempt to execute
this instruction will invoke the system illegal instruction
error handler. (The RTp=RA case includes the case of
RTp=RA=0.)
Special Registers Altered:
None
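The operand restrictions stated for lqarx can be checked mechanically. The Python sketch below is illustrative only: the function name and the three result strings are hypothetical labels, and the choice to report the illegal instruction case ahead of the odd-register case is an assumption made for the example, since the text guarantees the handler invocation only when RTp equals RA or RB.

```python
def classify_lqarx_operands(rtp, ra, rb):
    """Classify an lqarx register combination according to the stated rules.

    Returns one of three hypothetical labels:
      "illegal-instruction-handler": RTp=RA or RTp=RB (includes RTp=RA=0)
      "invalid-form":                RTp is odd
      "valid":                       none of the restrictions applies
    """
    for name, reg in (("RTp", rtp), ("RA", ra), ("RB", rb)):
        if not 0 <= reg <= 31:
            raise ValueError(f"{name}={reg} is not a GPR number")
    if rtp == ra or rtp == rb:
        # The form is invalid and the handler is guaranteed to be invoked.
        return "illegal-instruction-handler"
    if rtp % 2 == 1:
        # The form is invalid; the text does not say which handler results.
        return "invalid-form"
    return "valid"


for operands in ((4, 1, 2), (3, 1, 2), (0, 0, 5)):
    print(operands, classify_lqarx_operands(*operands))
```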
Store Quadword Conditional Indexed X-form
stqcx. RSp,RA,RB
31 RSp RA RB 182 1
0 6 11 16 21 31
if RA = 0 then b ← 0
else b ← (RA)
EA ← b + (RB)
if RESERVE then
  if RESERVE_LENGTH = 16 then
    if RESERVE_ADDR = real_addr(EA) then
      MEM(EA, 16) ← (RSp)
      undefined_case ← 0
      store_performed ← 1
    else
      z ← smallest real page size supported by
          implementation
      if RESERVE_ADDR ÷ z = real_addr(EA) ÷ z then
        undefined_case ← 1
      else
        undefined_case ← 0
        store_performed ← 0
  else
    undefined_case ← 1
else
  undefined_case ← 0
  store_performed ← 0
if undefined_case then
  u1 ← undefined 1-bit value
  if u1 then
    MEM(EA, 16) ← (RSp)
  u2 ← undefined 1-bit value
  CR0 ← 0b00 || u2 || XER[SO]
else
  CR0 ← 0b00 || store_performed || XER[SO]
RESERVE ← 0
Let the effective address (EA) be the sum (RA|0)+(RB).
If a reservation exists, the length associated with the
reservation is 16 bytes, and the real storage location
specified by the stqcx. is the same as the real storage
location specified by the lqarx instruction that established
the reservation, (RSp) is stored into the quadword
in storage addressed by EA.
If a reservation exists, the length associated with the
reservation is 16 bytes, and the real storage location
specified by the stqcx. is not the same as the real storage
location specified by the lqarx instruction that
established the reservation, the following applies.
  Let z denote an aligned block of real storage
  whose size is the smallest real page size supported
  by the implementation. If the real storage
  location specified by the stqcx. is in the same z as
  the real storage location specified by the lqarx
  instruction that established the reservation, it is
  undefined whether (RSp) is stored into the quadword
  in storage addressed by EA. Otherwise, no
  store is performed.
If a reservation exists and the length associated with
the reservation is not 16 bytes, it is undefined whether
(RSp) is stored into the quadword in storage addressed
by EA.
If a reservation does not exist, no store is performed.
CR Field 0 is set as follows. n is a 1-bit value that indicates
whether the store was performed, except that if,
per the preceding description, it is undefined whether
the store is performed, the value of n is undefined (and
need not reflect whether the store was performed).
CR0[LT GT EQ SO] = 0b00 || n || XER[SO]
The reservation is cleared.
EA must be a multiple of 16. If it is not, either the system
alignment error handler is invoked or the results
are boundedly undefined.
If RSp is odd, the instruction form is invalid.
Special Registers Altered:
CR0
Extended Mnemonics:
Extended mnemonics for Synchronize:
Programming Note
sync serves as both a basic and an extended mne-
monic. Assemblers will recognize a sync mne-
monic with one operand as the basic form, and a
sync mnemonic with no operand as the extended
form. In the extended form the L operand is omitted
and assumed to be 0.
Programming Note
The sync instruction can be used to ensure that all
stores into a data structure, caused by Store
instructions executed in a “critical section” of a pro-
gram, will be performed with respect to another
processor before the store that releases the lock is
performed with respect to that processor; see
Section B.2, “Lock Acquisition and Release, and
Related Techniques” on page 915.
The memory barrier created by a sync instruction
with L=1 does not order implicit storage accesses
or instruction fetches. The memory barrier created
by a sync instruction with L=0 (or L=2) orders
implicit storage accesses and instruction fetches
associated with instructions preceding the sync
instruction but not those associated with instruc-
tions following the sync instruction.
In order to obtain the best performance across the
widest range of implementations, the programmer
should use the sync instruction with L=1, or the
eieio instruction, if any of these is sufficient for his
needs; otherwise he should use sync with L=0.
sync with L=2 should not be used by application
programs.
Programming Note
The functions provided by sync with L=1 are a
strict subset of those provided by sync with L=0.
(The functions provided by sync with L=2 are a
strict superset of those provided by sync with L=0;
see Book III.)
The wait instruction is used to stop instruction fetching
and execution until certain events occur. These events
include exceptions (see Section 1.2.1 of Book III) and
event-based branch exceptions (see Section 1.1).
ized: each happens in its entirety in some order, even
when that order is not specified in the program or
enforced between processors. Until a transaction commits,
its set of transactional accesses is provisional,
and will be discarded should the transaction fail. The
set of transactional accesses is also referred to as the
“transactional footprint.”
Non-transactional accesses: Storage accesses performed
in the existing Power storage model are said to
be “non-transactional.” In contrast to transactional storage
accesses, there is no provision of atomicity across
multiple non-transactional accesses. Non-transactional
storage updates are not discarded in the event of a
transaction failure.
Outer transaction: A transaction that is initiated from
the Non-transactional state is said to be an outer transaction.
A tbegin. instruction that initiates an outer
transaction is sometimes referred to as an “outer tbegin..”
Similarly, a tend. instruction with A=0 that ends
an outer transaction is sometimes referred to as an
“outer tend..”
Nested Transaction: A transaction that is initiated
while already executing a transaction is said to be
“nested” within the pre-existing transaction. The set of
active nested transactions forms a stack growing from
the outer transaction. A tend. with A=0 will remove the
most recently nested transaction from the stack.
being invalidated by the tlbie or slbieg. For a ROT, no
conflict is caused if the access is a load.
A Suspended state cache control instruction is said to
cause a conflict if it would cause the destruction of a
transactional update or if it would make a transactional
update visible to another thread.
Programming Note
Warning: In descriptions of the transactional memory
facility that precede V. 2.07B, the granularity at
which conflict between storage accesses is
detected was specified to be the cache block. Programs
that were based on these early descriptions
and depend on this granularity may need to be
revised so as not to depend on it.
A future version of the architecture may define
"transaction conflict granule" as the aligned unit of
storage having the property that the granularity at
which conflict between storage accesses is
detected is never larger than the transaction conflict
granule. The size of the transaction conflict
granule would be implementation-dependent and
would be added to the list of parameters useful to
application programs in Section 4.1, and the last
sentence of the first paragraph of the definition of
"conflict" would use "transaction conflict granule"
instead of "cache block".
N | T | N² | N² | N² | N² | Not applicable | N⁶ | S⁷
T | T | N, if outer transaction or A=1 form; otherwise T | N³,⁴ | S | T | N³,⁴ | N³ | S⁶
S | S¹ | S⁶ | S³ | S² | T⁵ | S³ | N³ | S⁶
Notes
1. CR0 updated indicating transactional initiation was unsuccessful, due to a pre-existing transaction occupying the
transactional facility.
2. Execution of these operations does not affect transaction state, allowing for the instructions to be used in software
modules called from Non-transactional, Transactional, and Suspended paths.
3. If failure recording has not previously occurred, failure recording is performed as defined in Section 5.3.2.
4. Failure handling is performed as defined in Section 5.3.3.
5. If failure has occurred during Suspended execution, failure handling will be performed sometime after the execu-
tion of tresume, and no later than the set of events listed in Section 5.3.3.
6. Generate TM Bad Thing type Program interrupt.
7. If TEXASRFS=0, generate a TM Bad Thing type Program interrupt.
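Notes 3 through 5 refer to failure recording and failure handling. As a rough illustration of the handling step, the following Python sketch mirrors the TMHandleFailure() RTL function that this chapter gives for failure handling. It is a simplified model under stated assumptions: the TMThread class and its field names are hypothetical, only GPRs stand in for the checkpointed registers, the transactional footprint is a plain dictionary of provisional stores, and the transaction state is kept as a string where the RTL sets the MSR TS field to 0b00.

```python
from dataclasses import dataclass, field


@dataclass
class TMThread:
    """Hypothetical thread state; only what the RTL function touches."""
    gprs: list
    checkpoint: list                                   # pre-transactional register values
    footprint: dict = field(default_factory=dict)      # provisional stores: addr -> value
    footprint_discarded: bool = False
    ts: str = "Transactional"                          # stands in for the MSR TS field
    tfhar: int = 0
    nia: int = 0
    cr0: int = 0


def tm_handle_failure(t, caused_by_treclaim=False):
    """Follow the steps of the TMHandleFailure() RTL function."""
    if not t.footprint_discarded:
        t.footprint.clear()                # discard transactional footprint
        t.footprint_discarded = True
    t.gprs = list(t.checkpoint)            # revert checkpointed registers
    t.ts = "Non-transactional"             # RTL: TS field set to 0b00
    if not caused_by_treclaim:
        t.nia = t.tfhar                    # redirect control flow to the failure handler
        t.cr0 = 0b1010                     # 0b101 || 0


t = TMThread(gprs=[9, 9], checkpoint=[1, 2], footprint={0x10: 5}, tfhar=0x4000, nia=0x1234)
tm_handle_failure(t)
print(t.gprs, t.ts, hex(t.nia), bin(t.cr0))
```

Passing caused_by_treclaim=True leaves nia and cr0 untouched, matching the statement that for treclaim. CR0 is not set to indicate failure and the failure handler is not invoked.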
Transactions will fail for the following externally-induced
causes
- Conflict with transactional access by another
  thread
- Conflict with non-transactional access by another
  thread
  In either of the previous two cases, if a successful
  Store Conditional would have conflicted, but the
  Store Conditional is not successful, it is
  implementation-dependent whether a conflict is detected.
- Conflict with a translation invalidation caused by a
  tlbie or slbieg performed by another thread
- paste[.] to a block that was previously accessed
  transactionally is executed by another thread.
- copy from a block that was previously written
  transactionally is executed by another thread.
Transactions will fail for the following self-induced
causes
- Termination caused by the execution of tabort.,
  tabortdc., tabortdci., tabortwc., tabortwci. or
  treclaim. instruction.
- Transaction level overflow, defined as an attempt to
  execute tbegin. when the transaction level is
  already at its maximum value
- Footprint overflow, defined as an attempt to perform
  a storage access in Transactional state which
  exceeds the capacity for tracking transactional
  accesses.
- Execution of the following instructions while in the
  Transactional state: icbi, copy, paste[.],
  cp_abort, lwat, ldat, stwat, stdat, dcbf, dcbi,
  dcbst, rfscv, [h]rfid, rfebb, mtmsr[d], msgsnd,
  msgsndp, msgclr, msgclrp, slbie[g], slbia, slbmte,
  slbfee, stop, and tlbie[l]. (These instructions
  are considered to be disallowed in
  Transactional state.) The disallowed instruction is
  not executed; failure handling occurs before it has
  been executed.
  Programming Note
  Note that execution of a stop instruction in
  Suspended state causes a TM Bad Thing type
  Program interrupt.
- Execution, while in Transactional state, of mtspr
  specifying an SPR that is not part of the checkpointed
  registers and is not a Transactional Memory
  SPR. The mtspr is not executed; failure
  handling occurs before it has been executed.
  (Modification of XER[FXCC] and CR[CR0] is allowed,
  but the changes will not be rolled back in the event
  of transaction failure.)
- Conflict caused by a Suspended state store to a
  storage location that was previously accessed
  transactionally. If the store would have been performed
  by a successful Store Conditional instruction,
  but the Store Conditional instruction does not
  succeed, it is implementation-dependent whether
  a conflict is detected.
- Conflict caused by a Suspended state Load
  Atomic or Store Atomic instruction updating a
  block that was previously accessed transactionally.
- paste[.] to a block that was previously accessed
  transactionally is executed in Suspended state on
  the thread executing the transaction.
- Conflict caused by a Suspended state tlbie or
  slbieg that specifies a translation that was previously
  used transactionally. (This case will be
  recorded as a translation invalidation conflict
  because it may be hard to differentiate from a conflict
  caused by a tlbie or slbieg performed by
  another thread and because it is highly likely to be
  a transient failure.)
For each of the following potential causes, the transaction
will fail if the absence of failure would compromise
transaction semantics; otherwise, whether the transaction
fails is undefined.
- Execution of the following instructions while in the
  Transactional state: lbzcix, ldcix, lhzcix, lwzcix,
  stbcix, stdcix, sthcix, stwcix. The disallowed
  instruction is not executed; failure handling occurs
  before it has been executed. (These instructions
  are considered to be disallowed in Transactional
  state if they cause transaction failure in Transactional
  state.) Execution of these instructions in the
  Suspended state is allowed and does not cause
  transaction failure.
- Execution of the following instruction in the Transactional
  state: wait. The disallowed instruction is
  not executed; failure handling occurs before it has
  been executed. (This instruction is considered to
  be disallowed in a transaction if it causes transaction
  failure.)
- Execution of the following instruction in the Suspended
  state: wait. The disallowed instruction is
  treated as a no-op; failure recording occurs. (This
  instruction is considered to be disallowed in a
  transaction if it causes transaction failure.)
- Access of a disallowed type while in the Transactional
  state: Caching Inhibited, Write Through
  Required, and Memory Coherence not Required
  for data access; Caching Inhibited for instruction
  fetch. The disallowed access is not performed;
  failure handling occurs such that the instruction
  that would cause (or be associated with, for
  instruction fetch) the disallowed access type
  appears not to have been executed. Accesses of
  this type in the Suspended state are allowed and
  do not cause transaction failure.
- Instruction fetch from a storage location that was
  previously written transactionally (reported as a
  unique cause that includes both self-induced and
  externally-induced instances)
- dcbf, dcbi, or icbi specifying a block that was previously
  accessed transactionally, in either of the
  following cases.
treclaim., the following things occur. CR0 is set to
0b101 || 0. The transaction state is set to Non-transactional,
and control flow is redirected to the instruction
address stored in TFHAR. If the failure is caused by
treclaim., CR0 is not set to indicate failure and the
transaction’s failure handler is not invoked.
The following RTL function specifies the actions taken
during the handling of transaction failure:
TMHandleFailure()
If the transactional footprint has not previously been discarded
  Discard transactional footprint
Revert checkpointed registers to pre-transactional values
Discard all resources related to current transaction
MSR[TS] ← 0b00 #Non-transactional
If failure was not caused by treclaim.,
  NIA ← TFHAR
  CR0 ← 0b101 || 0
Upon failure detected in Suspended state from causes
other than the execution of a treclaim. instruction, failure
handling is deferred until the transaction is
resumed. Once resumed, failure handling will occur no
later than the set of failure synchronizing events listed
above. Upon failure in Suspended state caused by treclaim.,
failure handling is immediate (but CR0 is not set
to indicate failure and the transaction’s failure handler is
not invoked).
Programming Note
A Load instruction executed immediately after treclaim.
or a conditional or unconditional Abort
instruction is guaranteed not to load a transactional
storage update.
5.4.1 Transaction Failure Handler Address Register (TFHAR)
The Transaction Failure Handler Address Register is a
64-bit SPR that records the effective address of a software
failure handler used in the event of transaction
failure. Bits 62:63 are reserved.
TFHA //
0 62 63
Figure 5. Transaction Failure Handler Address Register (TFHAR)
This register is written with the NIA for the tbegin. as a
side-effect of the execution of an outer tbegin. instruction
(tbegin. executed in the Non-transactional state).
5.4.2 Transaction EXception And Status Register (TEXASR)
The Transaction EXception And Status Register is a
64-bit register, containing a transaction level
(TEXASR[TL]) and status information for use by transaction
failure handlers. The identification of the cause and
persistence of transaction failure reported in bits 7:30
may rarely be inaccurate. Bits 0:31 are called the failure
cause in the instruction descriptions.
TEXASR
0 63
Figure 6. Transaction EXception And Status Register (TEXASR)
TEXASRU
0 31
MSR[HV] || MSR[PR] when the failure was
recorded.
36 Failure Summary (FS)
   Set to 1 when a failure has been detected and
   failure recording has been performed.
37 TFIAR Exact
   Set to 1 when the value in the TFIAR is exact.
   Otherwise the value in the TFIAR is approximate.
38 ROT
   Set to 1 when a ROT is initiated. Set to zero
   when a non-ROT tbegin. is executed.
39 Reserved
40:51 Reserved
52:63 Transaction Level (TL)
   Transaction level (nesting depth + 1) for the
   active transaction, if any; otherwise 0 if the
   most recently executed transaction completed
   successfully, or the transaction level at
   which the most recently executed transaction
   failed if the most recently executed transaction
   did not complete successfully.
Programming Note
A value of 1 corresponds to an outer transaction. A
value greater than 1 corresponds to a nested transaction.
The transaction level in TEXASR[TL] contains an
unsigned integer indicating whether the current transaction
is an outer transaction, or is nested, and if
nested, its depth. The maximum transaction level supported
by a given implementation is of the form 2^t - 1.
The value of t corresponding to the smallest maximum
is 4; the value of t corresponding to the largest maximum
is 12. This value is tied to the “Maximum transaction
level” parameter useful for application
programmers, as specified in Section 4.1. The
high-order 12-t bits of TEXASR[TL] are treated as
reserved.
Transaction failure information is contained in
TEXASR[0:37]. The fields of TEXASR are initialized upon
the successful initiation of a transaction from the
Non-transactional state, by setting TEXASR[TL] to 1,
indicating an outer transaction, and all other fields to 0.
When transaction failure is recorded, the failure summary
bit TEXASR[FS] is set to 1, indicating that failure
has been detected for the active transaction and that
failure recording has been performed. TEXASR[0:31] are
set indicating the source of the failure. Exactly one of
bits 8 through 31 will be set indicating the instruction or
event that caused failure. In the event of failure due to
the execution of a tabort., tabortdc., tabortdci.,
tabortwc., tabortwci. or treclaim. instruction,
TEXASR[31] is set to 1, and, for tabort. and treclaim., a
software defined failure code is copied from a register
operand to TEXASR[0:7]. TEXASR[Suspended] indicates
whether the transaction was in the Suspended state at
the time that failure occurred. The values of MSR[HV]
and MSR[PR] at the time that failure occurs are copied to
TEXASR[34] and TEXASR[35], respectively. In some circumstances,
the failure causing instruction address in
TFIAR may not be exact. In such circumstances,
TEXASR[37] is set to 0 indicating that the contents of
TFIAR are not exact; otherwise TEXASR[37] is set to 1.
Programming Note
The transaction level contained in TEXASR[TL]
should be interpreted by software as follows:
When in the Transactional or Suspended state, this
field contains an unsigned integer representing the
transaction level of the active transaction, with 1
indicating an outer transaction, and a number
greater than 1 indicating a nested transaction. The
nesting depth of the active transaction is
TEXASR[TL] - 1.
When in the Non-transactional state, TEXASR[TL]
contains 0 if the last transaction committed successfully,
otherwise it contains the transaction level
at which the most recent transaction failed.
Programming Note
The Privilege bits in TEXASR represent the state of
the machine at the point when failure occurs. This
information may be used by problem state software
to determine whether an unexpected hypervisor or
operating system interaction was responsible for
transaction failure. This information may be useful
to operating systems or hypervisors when restoring
register state for failure handling after the transactional
facility was reclaimed, to determine which of
the operating system or the hypervisor has retained
the pre-transactional version of the checkpointed
registers.
5.4.3 Transaction Failure Instruction Address Register (TFIAR)
The Transaction Failure Instruction Address Register is
a 64-bit SPR that is set to the exact effective address of
the instruction causing the failure, when possible. Bits
62:63 contain the privilege state when the failure was
recorded. This was the value MSR[HV] || MSR[PR] when
the failure was recorded.
TFIA Privilege
0 62 63
Figure 8. Transaction Failure Instruction Address Register (TFIAR)
In certain cases, the exact address may not be available,
and therefore TFIAR will be an approximation. An
5.5 Transactional Facility Instructions

Similar to the Floating-Point Status and Control
Register instructions, modifications of transaction state
caused by the execution of Transactional Memory
instructions or by failure handling synchronize the
effects of exception-causing floating-point instructions
executed by a given processor. Executing a
Transactional Memory instruction, or invocation of the failure
handler, ensures that all floating-point instructions
previously initiated by the given processor have
completed before the transaction state is modified, and that
no subsequent floating-point instructions are initiated
by the given processor until the transaction state has
been modified. In particular:

- All exceptions that will be caused by the previously
  initiated instructions are recorded in the FPSCR
  before the transaction state is modified.
- All invocations of the system floating-point enabled
  exception error handler that will be caused by the
  previously initiated instructions have occurred
  before the transaction state is modified.
- No subsequent floating-point instruction that alters
  the settings of any FPSCR bits is initiated until the
  transaction state has been modified.

(Floating-point Storage Access instructions are not
affected.)
31 TO RA RB 782 1
0  6  11 16 21  31

a ← EXTS((RA)32:63)
b ← EXTS((RB)32:63)
abort ← 0
CR0 ← 0 || MSRTS || 0
if (a < b) & TO0 then abort ← 1
if (a > b) & TO1 then abort ← 1
if (a = b) & TO2 then abort ← 1
if (a <u b) & TO3 then abort ← 1
if (a >u b) & TO4 then abort ← 1
if abort & (MSRTS = 0b10 | MSRTS = 0b01) then #Transactional or Suspended
    cause ← 0x00000001
    if MSRTS = 0b01 & TEXASRFS = 0 then #Suspended
        Discard transactional footprint
    TMRecordFailure(cause)
    if MSRTS = 0b10 then #Transactional
        TMHandleFailure()

The tabortwc. instruction sets condition register field 0
to 0 || MSRTS || 0. The contents of register RA32:63 are
compared with the contents of register RB32:63. If any
bit in the TO field is set to 1 and its corresponding
condition is met by the result of the comparison, and the
transaction state is Transactional or Suspended, then
the tabortwc. instruction causes transaction failure,
resulting in the following:

- Failure recording is performed as defined in Section
  5.3.2, using the failure cause 0x00000001.
- If the transaction state is Transactional, failure handling
  is performed as defined in Section 5.3.3 (this includes
  discarding the transactional footprint).
- If the transaction state is Suspended, the transactional
  footprint is discarded (if not already discarded for a
  pending failure), but failure handling is deferred.

Other than the setting of CR0, execution of tabortwc.
in the Non-transactional state is treated as a no-op.

Special Registers Altered:
CR0 TEXASR TFIAR TS

31 TO RA SI 846 1
0  6  11 16 21  31

a ← EXTS((RA)32:63)
abort ← 0
CR0 ← 0 || MSRTS || 0
if a < EXTS(SI) & TO0 then abort ← 1
if a > EXTS(SI) & TO1 then abort ← 1
if a = EXTS(SI) & TO2 then abort ← 1
if a <u EXTS(SI) & TO3 then abort ← 1
if a >u EXTS(SI) & TO4 then abort ← 1
if abort & (MSRTS = 0b10 | MSRTS = 0b01) then #Transactional or Suspended
    cause ← 0x00000001
    if MSRTS = 0b01 & TEXASRFS = 0 then #Suspended
        Discard transactional footprint
    TMRecordFailure(cause)
    if MSRTS = 0b10 then #Transactional
        TMHandleFailure()

The tabortwci. instruction sets condition register field 0
to 0 || MSRTS || 0. The contents of register RA32:63 are
compared with the sign-extended value of the SI field. If
any bit in the TO field is set to 1 and its corresponding
condition is met by the result of the comparison, and
the transaction state is Transactional or Suspended,
then the tabortwci. instruction causes transaction
failure, resulting in the following:

- Failure recording is performed as defined in Section
  5.3.2, using the failure cause 0x00000001.
- If the transaction state is Transactional, failure handling
  is performed as defined in Section 5.3.3 (this includes
  discarding the transactional footprint).
- If the transaction state is Suspended, the transactional
  footprint is discarded (if not already discarded for a
  pending failure), but failure handling is deferred.

Other than the setting of CR0, execution of tabortwci.
in the Non-transactional state is treated as a no-op.

Special Registers Altered:
CR0 TEXASR TFIAR TS
31 TO RA RB 814 1
0  6  11 16 21  31

a ← (RA)
b ← (RB)
abort ← 0
CR0 ← 0 || MSRTS || 0
if (a < b) & TO0 then abort ← 1
if (a > b) & TO1 then abort ← 1
if (a = b) & TO2 then abort ← 1
if (a <u b) & TO3 then abort ← 1
if (a >u b) & TO4 then abort ← 1
if abort & (MSRTS = 0b10 | MSRTS = 0b01) then #Transactional or Suspended
    cause ← 0x00000001
    if MSRTS = 0b01 & TEXASRFS = 0 then #Suspended
        Discard transactional footprint
    TMRecordFailure(cause)
    if MSRTS = 0b10 then #Transactional
        TMHandleFailure()

The tabortdc. instruction sets condition register field 0
to 0 || MSRTS || 0. The contents of register RA are
compared with the contents of register RB. If any bit in the
TO field is set to 1 and its corresponding condition is
met by the result of the comparison, and the
transaction state is Transactional or Suspended, then the
tabortdc. instruction causes transaction failure,
resulting in the following:

- Failure recording is performed as defined in Section
  5.3.2, using the failure cause 0x00000001.
- If the transaction state is Transactional, failure handling
  is performed as defined in Section 5.3.3 (this includes
  discarding the transactional footprint).
- If the transaction state is Suspended, the transactional
  footprint is discarded (if not already discarded for a
  pending failure), but failure handling is deferred.

Other than the setting of CR0, execution of tabortdc. in
the Non-transactional state is treated as a no-op.

Special Registers Altered:
CR0 TEXASR TFIAR TS

31 TO RA SI 878 1
0  6  11 16 21  31

a ← (RA)
abort ← 0
CR0 ← 0 || MSRTS || 0
if a < EXTS(SI) & TO0 then abort ← 1
if a > EXTS(SI) & TO1 then abort ← 1
if a = EXTS(SI) & TO2 then abort ← 1
if a <u EXTS(SI) & TO3 then abort ← 1
if a >u EXTS(SI) & TO4 then abort ← 1
if abort & (MSRTS = 0b10 | MSRTS = 0b01) then #Transactional or Suspended
    cause ← 0x00000001
    if MSRTS = 0b01 & TEXASRFS = 0 then #Suspended
        Discard transactional footprint
    TMRecordFailure(cause)
    if MSRTS = 0b10 then #Transactional
        TMHandleFailure()

The tabortdci. instruction sets condition register field 0
to 0 || MSRTS || 0. The contents of register RA are
compared with the sign-extended value of the SI field. If any
bit in the TO field is set to 1 and its corresponding
condition is met by the result of the comparison, and the
transaction state is Transactional or Suspended, then
the tabortdci. instruction causes transaction failure,
resulting in the following:

- Failure recording is performed as defined in Section
  5.3.2, using the failure cause 0x00000001.
- If the transaction state is Transactional, failure handling
  is performed as defined in Section 5.3.3 (this includes
  discarding the transactional footprint).
- If the transaction state is Suspended, the transactional
  footprint is discarded (if not already discarded for a
  pending failure), but failure handling is deferred.

Other than the setting of CR0, execution of tabortdci.
in the Non-transactional state is treated as a no-op.

Special Registers Altered:
CR0 TEXASR TFIAR TS
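
The TO-field test shared by the four conditional abort instructions above can be sketched in a few lines of Python. This is only an illustrative model, not part of the architecture: the function names (to_signed, tabort_condition, cr0_after_tabort) are hypothetical, Python integers stand in for register contents, and the model assumes that TO0 is the most significant bit of the 5-bit TO field. It covers only the comparison and the CR0 value, not failure recording or failure handling.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def tabort_condition(to, a, b, bits=64):
    """Return True if any comparison selected by the TO field holds.

    to   : 5-bit TO field; TO0 is assumed to be the most significant bit
    a, b : operand values; b is (RB), or EXTS(SI) for the immediate forms
    bits : 32 models tabortwc./tabortwci., 64 models tabortdc./tabortdci.
    """
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask                      # unsigned views
    sa, sb = to_signed(ua, bits), to_signed(ub, bits)  # signed views
    return bool(
        (to & 0b10000 and sa < sb)       # TO0: a <  b (signed)
        or (to & 0b01000 and sa > sb)    # TO1: a >  b (signed)
        or (to & 0b00100 and sa == sb)   # TO2: a =  b
        or (to & 0b00010 and ua < ub)    # TO3: a <u b (unsigned)
        or (to & 0b00001 and ua > ub)    # TO4: a >u b (unsigned)
    )


def cr0_after_tabort(msr_ts):
    """CR0 is set to 0 || MSRTS || 0, i.e. the 2-bit TS value shifted left by 1."""
    return (msr_ts & 0b11) << 1
```

For example, with RA32:63 = 0xFFFFFFFF and RB32:63 = 1, the word form treats the first operand as -1 for the signed tests (TO0 matches) but as the largest 32-bit value for the unsigned tests (TO4 matches, TO3 does not).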
Programming Note
One use of the tcheck instruction in Suspended
state is to determine whether preceding loads from
transactionally modified locations have returned
the data the transaction stored. (If the transaction
has failed, some of the loads may have returned a
more recent value that was stored by a conflicting
store, or may have returned the pre-transaction
contents of the location.) It is important to use
tcheck. between any Suspended state loads that
might access transactionally modified locations and
subsequent computation using the Sus-
pended-state-loaded data. Otherwise, corrupt data
could cause problems such as wild branches or
infinite loops.
Another use of tcheck in Suspended state is to
determine whether the contents of storage, as seen
in Suspended state, are consistent with the trans-
action succeeding -- e.g., whether no location that
has been accessed transactionally (stored transac-
tionally, for ROTs), and has been seen in Sus-
pended state, has been subject to a conflict thus
far. (A location is seen in Suspended state either
by being loaded in Suspended state or by being
loaded in Transactional state and the value (or a
value derived therefrom) passed, in a register, into
Suspended state.)
A use of tcheck in Transactional state is to deter-
mine whether the transaction still has the potential
to succeed.
Note that tcheck provides an instantaneous check
on the integrity of a subset of the accesses per-
formed within a transaction. tcheck is not a failure
synchronizing mechanism. Even if no accesses
follow the tcheck, there may still be latent failures
that haven’t been recorded, for example caused by
accesses that tcheck does not wait for, by external
conflicts that will happen in the future, or simply by
time of flight to the failure detection mechanism for
operations that have already been performed.
Programming Note
The tcheck instruction can return 1 in bit 0 of CR
field BF before the failure has been recorded in
TEXASR and TFIAR.
Programming Note
The tcheck instruction may cause pipeline syn-
chronization. As a result, programs that use
tcheck excessively may perform poorly.
Programming Note
New programs should use mfspr instead of mftb to
access the Time Base.
Programming Note
mftb serves as both a basic and an extended mne-
monic. The Assembler will recognize an mftb
mnemonic with two operands as the basic form,
and an mftb mnemonic with one operand as the
extended form. In the extended form the TBR
operand is omitted and assumed to be 268 (the
value that corresponds to TB).
Programming Note
Since the update frequency of the Time Base is
implementation-dependent, the algorithm for converting the
current value in the Time Base to time of day is also
implementation-dependent.

As an example, assume that the Time Base increments
at the constant rate of 512 MHz. (Note, however, that
programs should allow for the possibility that some
implementations may not increment the least-significant
4 bits of the Time Base at a constant rate.) What is
wanted is the pair of 32-bit values comprising a POSIX
standard clock (see footnote 1): the number of whole
seconds that have passed since 00:00:00 January 1,
1970, UTC, and the remaining fraction of a second
expressed as a number of nanoseconds.

Assume that:

- The value 0 in the Time Base represents the start
  time of the POSIX clock (if this is not true, a simple
  64-bit subtraction will make it so).
- The integer constant ticks_per_sec contains the
  value 512,000,000, which is the number of times
  the Time Base is updated each second.
- The integer constant ns_adj contains the value

      (1,000,000,000 / 512,000,000) × 2^32 / 2 = 4194304000

  which is the number of nanoseconds per tick of the
  Time Base, multiplied by 2^32 for use in mulhwu
  (see below), and then divided by 2 in order to fit, as
  an unsigned integer, into 32 bits.

When the processor is in 64-bit mode, the POSIX
clock can be computed with an instruction sequence
such as this:

    mfspr  Ry,268            # Ry = Time Base
    lwz    Rx,ticks_per_sec
    divdu  Rz,Ry,Rx          # Rz = whole seconds
    stw    Rz,posix_sec
    mulld  Rz,Rz,Rx          # Rz = quotient * divisor
    sub    Rz,Ry,Rz          # Rz = excess ticks
    lwz    Rx,ns_adj
    slwi   Rz,Rz,1           # Rz = 2 * excess ticks
    mulhwu Rz,Rz,Rx          # mul by (ns/tick)/2 * 2^32
    stw    Rz,posix_ns       # product[0:31] = excess ns

Non-constant update frequency

In a system in which the update frequency of the Time
Base may change over time, it is not possible to convert
an isolated Time Base value into time of day. Instead, a
Time Base value has meaning only with respect to the
current update frequency and the time of day that the
update frequency was last changed. Each time the
update frequency changes, either the system software
is notified of the change via an interrupt (see Book III),
or the change was instigated by the system software
itself. At each such change, the system software must
compute the current time of day using the old update
frequency, compute a new value of ticks_per_sec for
the new frequency, and save the time of day, Time Base
value, and tick rate. Subsequent calls to compute Time
of Day use the current Time Base Value and the saved
value.
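
The arithmetic of the instruction sequence in this note can be checked with a small Python model. This is a sketch under stated assumptions, not architected behavior: the function name posix_clock is hypothetical, Python integers stand in for the registers Rx, Ry and Rz, and the constants are the ticks_per_sec and ns_adj values assumed in the note (a constant 512 MHz Time Base).

```python
MASK32 = 0xFFFFFFFF

TICKS_PER_SEC = 512_000_000      # Time Base updates per second (assumed)
NS_ADJ = 4_194_304_000           # (ns per tick) * 2**32 / 2


def posix_clock(time_base, ticks_per_sec=TICKS_PER_SEC, ns_adj=NS_ADJ):
    """Model of the mfspr/divdu/mulld/sub/slwi/mulhwu sequence.

    Returns (posix_sec, posix_ns) for a 64-bit Time Base value.
    """
    seconds = time_base // ticks_per_sec             # divdu: whole seconds
    excess = time_base - seconds * ticks_per_sec     # mulld + sub: excess ticks
    doubled = (excess << 1) & MASK32                 # slwi: 2 * excess ticks
    nanoseconds = (doubled * ns_adj) >> 32           # mulhwu: high 32 bits
    return seconds & MASK32, nanoseconds             # stw keeps the low word
```

With these constants, a Time Base value of 3.5 seconds' worth of ticks (1,792,000,000) yields 3 whole seconds and 500,000,000 nanoseconds. Doubling the excess ticks compensates for the division by 2 that was needed to make ns_adj fit in 32 bits.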
1. Described in POSIX Draft Standard P1003.4/D12, Draft Standard for Information Technology -- Portable Operating System Interface (POSIX) --
Part 1: System Application Program Interface (API) - Amendment 1: Real-time Extension [C Language]. Institute of Electrical and Electronics Engi-
neers, Inc., Feb. 1992.
Programming Note
When BESCRGE=0, the LME bit is ignored, and ldmx
behaves as ldx without affecting this bit.

30    External Event-Based Exception Enable (EE)
      0 External event-based (EBB) exceptions are
        disabled.
      1 External EBB exceptions are enabled until an
        external event-based exception occurs, at
        which time:
        - EE is set to 0
        - EEO is set to 1

      External event-based exceptions exist when
      an external EBB input from the platform is
      active. See the system documentation for
      information about the external EBB input.

31    Performance Monitor Event-Based Exception
      Enable (PME)
      0 Performance Monitor event-based
        exceptions are disabled.
      1 Performance Monitor event-based
        exceptions are enabled until a Performance
        Monitor event-based exception occurs, at
        which time:
        - PME is set to 0
        - PMEO is set to 1

      See Chapter 9 of Book III for information
      about Performance Monitor event-based
      exceptions and about the effects of this bit on
      the Performance Monitor.

32:33 Transactional State
      When an event-based branch occurs, hardware
      sets this field to indicate the transactional
      state of the processor when the event-based
      branch occurred.
      The values and their associated meanings are
      as follows.
      00 Non-transactional
      01 Suspended
      10 Transactional
      11 Reserved

      Programming Note
      Event-based branch handlers should not
      modify this field since its value is used by
      the processor to determine the transactional
      state of the processor after the rfebb
      instruction is executed.

34:60 Reserved

61    Load Monitored Event-Based Exception
      Occurred (LMEO)
      0 A Load Monitored event-based exception
        has not occurred since the last time
        software set this bit to 0.
      1 A Load Monitored event-based exception
        has occurred since the last time software
        set this bit to 0.

      This bit is set to 1 by the hardware when a
      Load Monitored event-based exception
      occurs. This bit can be set to 0 only by the
      mtspr instruction.

      Programming Note
      Software should set this bit to 0 after
      handling an event-based branch due to a
      Load Monitored event-based exception.

62    External Event-Based Exception
      Occurred (EEO)
      0 An external EBB exception has not
        occurred since the last time software set
        this bit to 0.
      1 An external EBB exception has occurred
        since the last time software set this bit to 0.

      Programming Note
      As part of processing an External EBB
      exception, it may also be necessary to
      perform additional operations to manage
      the external EBB input from the system.
      See the system documentation for details.

63    Performance Monitor Event-Based
      Exception Occurred (PMEO)
      0 A Performance Monitor event-based
        exception has not occurred since the last
        time software set this bit to 0.
      1 A Performance Monitor event-based
        exception has occurred since the last time
        software set this bit to 0.

      This bit is set to 1 by the hardware when a
      Performance Monitor event-based exception
      occurs. This bit can be set to 0 only by the
      mtspr instruction.

      See Chapter 9 of Book III for information
      about Performance Monitor event-based
      exceptions and about the effects of this bit on
      the Performance Monitor.
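
As an illustration of the bit numbering used for the BESCR fields described above (bit 0 is the most significant bit of the 64-bit register), the following Python sketch extracts bits 30:33 and 61:63 from a register value. The helper names (bit, field, decode_bescr) are hypothetical and only the fields listed in this part of the description are decoded; it is a reading aid, not a definition of the register.

```python
def bit(value, n, width=64):
    """Return bit n of value, where bit 0 is the most significant bit."""
    return (value >> (width - 1 - n)) & 1


def field(value, first, last, width=64):
    """Return bits first:last of value (inclusive) as an unsigned integer."""
    length = last - first + 1
    return (value >> (width - 1 - last)) & ((1 << length) - 1)


TS_NAMES = {
    0b00: "Non-transactional",
    0b01: "Suspended",
    0b10: "Transactional",
    0b11: "Reserved",
}


def decode_bescr(bescr):
    """Decode the BESCR fields described in this section."""
    return {
        "EE": bit(bescr, 30),                      # External EBB enable
        "PME": bit(bescr, 31),                     # Performance Monitor EBB enable
        "TS": TS_NAMES[field(bescr, 32, 33)],      # Transactional State
        "LMEO": bit(bescr, 61),                    # Load Monitored exception occurred
        "EEO": bit(bescr, 62),                     # External exception occurred
        "PMEO": bit(bescr, 63),                    # Performance Monitor exception occurred
    }
```

Under this numbering, bit 63 (PMEO) is the least significant bit of the value, and bit 30 (EE) corresponds to 1 << 33.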
Effective Address
0 62 63
Programming Note
The EBBHR can be used by software as a scratch-
pad register after entry into an event-based branch
handler, provided that its contents are restored
prior to executing rfebb 1. An example of such
usage is as follows, where SPRG3 is used to con-
tain a pointer to a storage area where context infor-
mation may be saved.
E: mtspr EBBHR,r1        // Save r1 in EBBHR
   mfspr r1,SPRG3        // Move SPRG3 to r1
   std   r2,offset1(r1)  // Store r2
   mfspr r2,EBBHR        // Copy original contents
                         // of r1 to r2
   std   r2,offset2(r1)  // save original r1
   ...                   // Store rest of state
   ...                   // Process event(s)
   ...                   // Restore all state except
                         // r1,r2
   r2 = &E               // Generate original value
                         // of EBBHR in r2
   mtspr EBBHR,r2        // Restore EBBHR
   ld    r2,offset1(r1)  // restore r2
   ld    r1,offset2(r1)  // restore r1
   rfebb 1               // Return from handler
Programming Note
If the BESCRTS has been modified by software
after an event-based branch occurs, an illegal
transaction state transition may occur. See Chapter
3.2.2 of Book III.
In order to make assembler language programs simpler
to write and easier to understand, a set of extended
mnemonics and symbols is provided for certain
instructions. This appendix defines extended mnemonics and
symbols related to instructions defined in Book II.
Assemblers should provide the extended mnemonics
and symbols listed here, and may provide others.

A.1 Data Cache Block Touch [for Store] Mnemonics

The TH field in the Data Cache Block Touch and Data
Cache Block Touch for Store instructions controls the
actions performed by the instructions. Extended
mnemonics are provided that represent the TH value in the
mnemonic rather than requiring it to be coded as a
numeric operand.

dcbtct RA,RB,TH     (equivalent to: dcbt for TH values
                    of 0b00000 - 0b00111);
                    other TH values are invalid.
dcbtds RA,RB,TH     (equivalent to: dcbt for TH values
                    of 0b00000 or 0b01000 - 0b01111);
                    other TH values are invalid.
dcbtt RA,RB         (equivalent to: dcbt for TH
                    value of 0b10000)
dcbna RA,RB         (equivalent to: dcbt for TH
                    value of 0b10001)
dcbtstct RA,RB,TH   (equivalent to: dcbtst for TH
                    values of 0b00000 - 0b00111);
                    other TH values are invalid.
dcbtstds RA,RB,TH   (equivalent to: dcbtst for TH
                    values of 0b00000 or
                    0b01000 - 0b01111);
                    other TH values are invalid.
dcbtstt RA,RB       (equivalent to: dcbtst for TH
                    value of 0b10000)

A.2 Data Cache Block Flush Mnemonics

The L field in the Data Cache Block Flush instruction
controls the scope of the flush function performed by
the instruction. Extended mnemonics are provided that
represent the L value in the mnemonic rather than
requiring it to be coded as a numeric operand.

Note: dcbf serves as both a basic and an extended
mnemonic. The Assembler will recognize a dcbf
mnemonic with three operands as the basic form, and a
dcbf mnemonic with two operands as the extended
form. In the extended form the L operand is omitted
and assumed to be 0.

dcbf RA,RB          (equivalent to: dcbf RA,RB,0)
dcbfl RA,RB         (equivalent to: dcbf RA,RB,1)
dcbflp RA,RB        (equivalent to: dcbf RA,RB,3)

A.3 Or Mnemonics

The three register fields in the or instruction can be
used to specify a hint indicating how the processor
should handle stores caused by previous Store or dcbz
instructions. An extended mnemonic is supported that
represents the operand values in the mnemonic rather
than requiring them to be coded as numeric operands.

miso                (equivalent to: or 26,26,26)

A.4 Load and Reserve Mnemonics

The EH field in the Load and Reserve instructions
provides a hint regarding the type of algorithm
implemented by the instruction sequence being executed.
Extended mnemonics are provided that allow the EH
value to be omitted and assumed to be 0b0.

Note: lbarx, lharx, lwarx, ldarx, and lqarx serve as
both basic and extended mnemonics. The Assembler
will recognize these mnemonics with four operands as
the basic form, and these mnemonics with three
operands as the extended form. In the extended form the
EH operand is omitted and assumed to be 0.

lbarx RT,RA,RB      (equivalent to: lbarx RT,RA,RB,0)
lharx RT,RA,RB      (equivalent to: lharx RT,RA,RB,0)
lwarx RT,RA,RB      (equivalent to: lwarx RT,RA,RB,0)
ldarx RT,RA,RB      (equivalent to: ldarx RT,RA,RB,0)
lqarx RT,RA,RB      (equivalent to: lqarx RT,RA,RB,0)

A.5 Synchronize Mnemonics

The L field in the Synchronize instruction controls the
scope of the synchronization function performed by the
instruction. Extended mnemonics are provided that
represent the L value in the mnemonic rather than
requiring it to be coded as a numeric operand. Two
extended mnemonics are provided for the L=0 value in
order to support Assemblers that do not recognize the
sync mnemonic.

Note: sync serves as both a basic and an extended
mnemonic. Assemblers will recognize a sync
mnemonic with one operand as the basic form, and a sync
mnemonic with no operand as the extended form. In
the extended form the L operand is omitted and
assumed to be 0.

sync                (equivalent to: sync 0)
lwsync              (equivalent to: sync 1)
ptesync             (equivalent to: sync 2)

A.6 Wait Mnemonics

The WC field in the wait instruction is reserved for
future use. It may be used in the future to indicate
the condition that causes instruction execution to
resume. An extended mnemonic is provided that
represents the WC value in the mnemonic rather than
requiring it to be coded as a numeric operand.

Note: wait serves as both a basic and an extended
mnemonic. The Assembler will recognize a wait
mnemonic with one operand as the basic form, and a wait
mnemonic with no operands as the extended form. In
the extended form the WC operand is omitted and
assumed to be 0.

wait                (equivalent to: wait 0)

the A value in the mnemonic rather than requiring it to
be coded as a numeric operand.

tend.               (equivalent to: tend. 0)
tendall.            (equivalent to: tend. 1)

The L field in the Transaction Suspend or Resume
instruction determines how to change the transaction
state. Extended mnemonics are provided that
represent the L value in the mnemonic rather than requiring
it to be coded as a numeric operand.

tsuspend.           (equivalent to: tsr. 0)
tresume.            (equivalent to: tsr. 1)

A.8 Move To/From Time Base Mnemonics

The TBR field in the Move From Time Base instruction
specifies whether the instruction reads the entire Time
Base or only the high-order half of the Time Base.

mftb Rx             (equivalent to: mftb Rx,268)
                    or: mfspr Rx,268
mftbu Rx            (equivalent to: mftb Rx,269)
                    or: mfspr Rx,269

A.9 Return From Event-Based Branch Mnemonic

The S field in the Return from Event-Based Branch
instruction specifies the value to which the instruction
sets the GE field in the BESCR. Extended mnemonics
are provided that represent the S value in the
mnemonic rather than requiring it to be coded as a numeric
operand.

rfebb               (equivalent to: rfebb 1)

Note: rfebb serves as both a basic and an extended
mnemonic. The Assembler will recognize this
mnemonic with one operand as the basic form, and this
mnemonic with no operands as the extended form. In
the extended form the S operand is omitted and
assumed to be 1.
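
The operand-count rule used throughout this appendix (the same mnemonic is the basic form with all operands present and the extended form with the trailing operand omitted) can be sketched as a table-driven rewrite. The following Python fragment is an illustration only and is not how any particular Assembler is implemented; the names EXTENDED_FORMS and expand are hypothetical, and the table covers only the fixed-operand equivalences listed in this appendix (the TH-range forms of A.1 are left out).

```python
# mnemonic -> (basic mnemonic, operand count of the extended form, text appended)
EXTENDED_FORMS = {
    "dcbf": ("dcbf", 2, "0"),
    "dcbfl": ("dcbf", 2, "1"),
    "dcbflp": ("dcbf", 2, "3"),
    "miso": ("or", 0, "26,26,26"),
    "lbarx": ("lbarx", 3, "0"),
    "lharx": ("lharx", 3, "0"),
    "lwarx": ("lwarx", 3, "0"),
    "ldarx": ("ldarx", 3, "0"),
    "lqarx": ("lqarx", 3, "0"),
    "sync": ("sync", 0, "0"),
    "lwsync": ("sync", 0, "1"),
    "ptesync": ("sync", 0, "2"),
    "wait": ("wait", 0, "0"),
    "tend.": ("tend.", 0, "0"),
    "tendall.": ("tend.", 0, "1"),
    "tsuspend.": ("tsr.", 0, "0"),
    "tresume.": ("tsr.", 0, "1"),
    "mftb": ("mftb", 1, "268"),
    "mftbu": ("mftb", 1, "269"),
    "rfebb": ("rfebb", 0, "1"),
}


def expand(line):
    """Rewrite an extended mnemonic to its basic form; leave other lines alone."""
    parts = line.split(None, 1)
    if not parts:
        return line
    mnemonic = parts[0]
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    entry = EXTENDED_FORMS.get(mnemonic)
    if entry is None or len(operands) != entry[1]:
        return line                      # already basic, or not in the table
    basic, _, appended = entry
    return basic + " " + ",".join(operands + [appended])
```

A line such as lwarx r6,0,r3,1 already has four operands and is returned unchanged, whereas lwarx r5,0,r3 gains the default EH operand of 0.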
This appendix gives examples of how dependencies
and the Synchronization instructions can be used to
control storage access ordering when storage is shared
between programs.

Many of the examples use extended mnemonics (e.g.,
bne, bne-, cmpw) that are defined in Appendix C of
Book I.

Many of the examples use the Load And Reserve and
Store Conditional instructions, in a sequence that
begins with a Load And Reserve instruction and ends
with a Store Conditional instruction (specifying the
same storage location as the Load And Reserve)
followed by a Branch Conditional instruction that tests
whether the Store Conditional instruction succeeded.

In these examples it is assumed that contention for the
shared resource is low; the conditional branches are
optimized for this case by using “+” and “-” suffixes
appropriately.

The examples deal with words; they can be used for
doublewords by changing all word-specific mnemonics
to the corresponding doubleword-specific mnemonics
(e.g., lwarx to ldarx, cmpw to cmpd).

In this appendix it is assumed that all shared storage
locations are in storage that is Memory Coherence
Required, and that the storage locations specified by
Load And Reserve and Store Conditional instructions
are in storage that is neither Write Through Required
nor Caching Inhibited.
Fetch and AND

The “Fetch and AND” primitive atomically ANDs a value
into a word in storage.

In this example it is assumed that the address of the
word to be ANDed is in GPR 3, the value to AND into it
is in GPR 4, and the old value is returned in GPR 5.

loop:
    lwarx  r5,0,r3     #load and reserve
    and    r0,r4,r5    #AND word
    stwcx. r0,0,r3     #store new value if still res’ved
    bne-   loop        #loop if lost reservation

Note:

1. The sequence given above can be changed to
   perform another Boolean operation atomically on a
   word in storage, simply by changing the and
   instruction to the desired Boolean instruction (or,
   xor, etc.).

Test and Set

This version of the “Test and Set” primitive atomically
loads a word from storage, sets the word in storage to a
nonzero value if the value loaded is zero, and sets the
EQ bit of CR Field 0 to indicate whether the value
loaded is zero.

In this example it is assumed that the address of the
word to be tested is in GPR 3, the new value (nonzero)
is in GPR 4, and the old value is returned in GPR 5.

loop:
    lwarx  r5,0,r3     #load and reserve
    cmpwi  r5,0        #done if word not equal to 0
    bne-   exit
    stwcx. r4,0,r3     #try to store non-0
    bne-   loop        #loop if lost reservation
exit: ...

Compare and Swap

loop:
    lwarx  r6,0,r3     #load and reserve
    cmpw   r4,r6       #1st 2 operands equal?
    bne-   exit        #skip if not
    stwcx. r5,0,r3     #store new value if still res’ved
    bne-   loop        #loop if lost reservation
exit:
    mr     r4,r6       #return value from storage

Notes:

1. The semantics given for “Compare and Swap”
   above are based on those of the IBM System/370
   Compare and Swap instruction. Other
   architectures may define a Compare and Swap instruction
   differently.

2. “Compare and Swap” is shown primarily for
   pedagogical reasons. It is useful on machines that lack
   the better synchronization facilities provided by
   lwarx and stwcx.. A major weakness of a
   System/370-style Compare and Swap instruction is
   that, although the instruction itself is atomic, it
   checks only that the old and current values of the
   word being tested are equal, with the result that
   programs that use such a Compare and Swap to
   control a shared resource can err if the word has
   been modified and the old value subsequently
   restored. The sequence shown above has the
   same weakness.

3. In some applications the second bne- instruction
   and/or the mr instruction can be omitted. The
   bne- is needed only if the application requires that
   if the EQ bit of CR Field 0 on exit indicates “not
   equal” then (r4) and (r6) are in fact not equal. The
   mr is needed only if the application requires that if
   the comparands are not equal then the word from
   storage is loaded into the register with which it was
   compared (rather than into a third register). If
   either or both of these instructions is omitted, the
   resulting Compare and Swap does not obey
   System/370 semantics.
B.2.1 Lock Acquisition and Import Barriers

An “import barrier” is an instruction or sequence of
instructions that prevents storage accesses caused by
instructions following the barrier from being performed
before storage accesses that acquire a lock have been
performed. An import barrier can be used to ensure
that a shared data structure protected by a lock is not
accessed until the lock has been acquired. A sync
instruction can be used as an import barrier, but the
approaches shown below will generally yield better
performance because they order only the relevant storage
accesses.

B.2.1.1 Acquire Lock and Import Shared Storage

If lwarx and stwcx. instructions are used to obtain the
lock, an import barrier can be constructed by placing an
isync instruction immediately following the loop
containing the lwarx and stwcx.. The following example
uses the “Compare and Swap” primitive to acquire the
lock.

In this example it is assumed that the address of the
lock is in GPR 3, the value indicating that the lock is
free is in GPR 4, the value to which the lock should be
set is in GPR 5, the old value of the lock is returned in
GPR 6, and the address of the shared data structure is
in GPR 9.

loop:
    lwarx  r6,0,r3,1     #load lock and reserve
    cmpw   r4,r6         #skip ahead if
    bne-   wait          # lock not free
    stwcx. r5,0,r3       #try to set lock
    bne-   loop          #loop if lost reservation
    isync                #import barrier
    lwz    r7,data1(r9)  #load shared data
    .
    .
wait: ...                #wait for lock to free

The hint provided with lwarx indicates that after the
program acquires the lock variable (i.e., stwcx. is
successful), it will release it (i.e., store to it) prior to another
program attempting to modify it.

The second bne- does not complete until CR0 has
been set by the stwcx.. The stwcx. does not set CR0
until it has completed (successfully or unsuccessfully).
The lock is acquired when the stwcx. completes
successfully. Together, the second bne- and the
subsequent isync create an import barrier that prevents the
load from “data1” from being performed until the branch
has been resolved not to be taken.

If the shared data structure is in storage that is neither
Write Through Required nor Caching Inhibited, an
lwsync instruction can be used instead of the isync
instruction. If lwsync is used, the load from “data1”
may be performed before the stwcx.. But if the stwcx.
fails, the second branch is taken and the lwarx is
re-executed. If the stwcx. succeeds, the value
returned by the load from “data1” is valid even if the
load is performed before the stwcx., because the
lwsync ensures that the load is performed after the
instance of the lwarx that created the reservation used
by the successful stwcx..

B.2.1.2 Obtain Pointer and Import Shared Storage

If lwarx and stwcx. instructions are used to obtain a
pointer into a shared data structure, an import barrier is
not needed if all the accesses to the shared data
structure depend on the value obtained for the pointer. The
following example uses the “Fetch and Add” primitive to
obtain and increment the pointer.

In this example it is assumed that the address of the
pointer is in GPR 3, the value to be added to the pointer
is in GPR 4, and the old value of the pointer is returned
in GPR 5.

loop:
    lwarx  r5,0,r3       #load pointer and reserve
    add    r0,r4,r5      #increment the pointer
    stwcx. r0,0,r3       #try to store new value
    bne-   loop          #loop if lost reservation
    lwz    r7,data1(r5)  #load shared data

The load from “data1” cannot be performed until the
pointer value has been loaded into GPR 5 by the lwarx.
The load from “data1” may be performed before the
stwcx.. But if the stwcx. fails, the branch is taken and
the value returned by the load from “data1” is
discarded. If the stwcx. succeeds, the value returned by
the load from “data1” is valid even if the load is
performed before the stwcx., because the load uses the
pointer value returned by the instance of the lwarx that
created the reservation used by the successful stwcx..

An isync instruction could be placed between the bne-
and the subsequent lwz, but no isync is needed if all
accesses to the shared data structure depend on the
value returned by the lwarx.
B.2.2 Lock Release and Export Barriers

An “export barrier” is an instruction or sequence of instructions that
prevents the store that releases a lock from being performed before
stores caused by instructions preceding the barrier have been
performed. An export barrier can be used to ensure that all stores to a
shared data structure protected by a lock will be performed with
respect to any other processor before the store that releases the lock
is performed with respect to that processor.

B.2.2.1 Export Shared Storage and Release Lock

A sync instruction can be used as an export barrier independent of the
storage control attributes (e.g., presence or absence of the Caching
Inhibited attribute) of the storage containing the shared data
structure.

Because the lock must be in storage that is neither Write Through
Required nor Caching Inhibited, if the shared data structure is in
storage that is Write Through Required or Caching Inhibited a sync
instruction must be used as the export barrier.

In this example it is assumed that the shared data structure is in
storage that is Caching Inhibited, the address of the lock is in GPR 3,
the value indicating that the lock is free is in GPR 4, and the address
of the shared data structure is in GPR 9.

    stw     r7,data1(r9)  #store shared data (last)
    sync                  #export barrier
    stw     r4,lock(r3)   #release lock

The sync ensures that the store that releases the lock will not be
performed with respect to any other processor until all stores caused
by instructions preceding the sync have been performed with respect to
that processor.

The lwsync ensures that the store that releases the lock will not be
performed with respect to any other processor until all stores caused
by instructions preceding the lwsync have been performed with respect
to that processor.

B.2.3 Safe Fetch

If a load must be performed before a subsequent store (e.g., the store
that releases a lock protecting a shared data structure), a technique
similar to the following can be used.

In this example it is assumed that the address of the storage operand
to be loaded is in GPR 3, the contents of the storage operand are
returned in GPR 4, and the address of the storage operand to be stored
is in GPR 5.

    lwz     r4,0(r3)      #load shared data
    cmpw    r4,r4         #set CR0 to “equal”
    bne-    $-8           #branch never taken
    stw     r7,0(r5)      #store other shared data

An alternative is to use a technique similar to that described in
Section B.2.1.2, by causing the stw to depend on the value returned by
the lwz and omitting the cmpw and bne-. The dependency could be created
by ANDing the value returned by the lwz with zero and then adding the
result to the value to be stored by the stw. If both storage operands
are in storage that is neither Write Through Required nor Caching
Inhibited, another alternative is to replace the cmpw and bne- with an
lwsync instruction.
B.5.1 Enter Critical Section

The following example shows the entry point to a critical section using
transactional lock elision. The entry code starts a transaction using
the tbegin. instruction and checks whether the transaction was aborted
or not. If not, it checks whether the lock is free or not. If the lock
is found to be free, the thread proceeds to execute the critical
section.

In this example it is assumed that the address of the lock is in GPR 3,
and the value indicating that the lock is free is in GPR 4. The
handling of cases of transaction abort and busy lock are described in
subsequent examples.

tle_entry:
    tbegin.               #Start TLE transaction
    beq-    tle_abort     #Handle TLE transaction abort
    lwz     r6,0(r3)      #Read lock
    cmpw    r6,r4         #Check if lock is free
    bne-    busy_lock     #If not, handle lock busy case
critical_section1:

B.5.2 Handling Busy Lock

In the event that the lock is already held, by either another thread or
the current thread, the transaction is aborted using the tabort
instruction, using a software-defined code TLE_BUSY_LOCK indicating the
cause of the abort. The abort returns control to the beq following
tbegin. in the critical section entrance sequence, allowing for an
abort handler to react appropriately.

busy_lock:
    li      r3, TLE_BUSY_LOCK
    tabort  r3            #Abort TLE transaction

GPR 3. The immediate value of the andis. instruction selects the
Failure Persistent bit in the upper half of TEXASR to be tested.

tle_abort:
    mfspr   r4, TEXASRU        # Read high-order half of TEXASR
    andis.  r5,r4,0x0100       # determine whether failure
                               # is likely to be persistent
    bne     tle_acquire_lock   #Persistent, acquire lock
                               #enter critical sec
    b       tle_entry          #Transient, try TLE again

This example can be extended to keep track of the number of transient
aborts and fall back on the acquisition of the lock after the number of
transient failures reaches some threshold. It can also be extended to
handle reentrant locks. Acquisition of TLE locks is described in a
subsequent example.

B.5.4 TLE Exit Section Critical Path

The following example illustrates the instruction sequence used to exit
a TLE critical section. The CR0 value set by tend. indicates whether
the current thread was in a transaction. If so, the exited critical
section was entered speculatively, and the transaction is ended. If
not, the execution takes a path to release the lock. Release of an
acquired TLE lock is described in a subsequent example.

tle_exit:
    tend.                      #End the current transaction, if any
    bng-    tle_release_lock   #Release lock, if was
                               #not in a transaction
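The bit test performed by the tle_abort sequence above can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of the architecture: the function names are hypothetical, and the only facts taken from the example are the 0x0100 immediate and the use of the upper half of TEXASR. Because andis. ANDs its source with the immediate shifted left 16 bits, the mask is 0x0100_0000, which is bit 7 of the 32-bit upper half when bits are numbered from the most significant end.

```python
# Hypothetical helper names; only the 0x0100 immediate and the use of the
# upper half of TEXASR (TEXASRU) come from the example above.
ANDIS_IMMEDIATE = 0x0100


def andis_mask(immediate):
    """Mask applied by andis.: the 16-bit immediate shifted left 16 bits."""
    return (immediate & 0xFFFF) << 16


def ibm_bit_number(single_bit_mask, width=32):
    """Translate a single-bit mask to big-endian (bit 0 = MSB) numbering."""
    return width - single_bit_mask.bit_length()


def failure_is_persistent(texasru):
    """True when the bit selected by 'andis. r5,r4,0x0100' is set."""
    return (texasru & andis_mask(ANDIS_IMMEDIATE)) != 0
```

A handler that counts transient aborts, as suggested above, would consult a predicate like failure_is_persistent before deciding whether to retry the transaction or acquire the lock.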
Book III:
Chapter 1. Introduction
    - The definition of “store is performed” applies to accesses for
      recording reference and change information.

    - A TLB entry invalidation by thread T1 is performed with respect
      to thread T2 when the instruction that requested the invalidation
      has caused the specified entry, if present, to be made invalid in
      T2’s TLB, and similarly for invalidations of entries in other
      caches of information derived from tables used in address
      translation.

exception
    An error, unusual condition, or external signal, that may set a
    status bit and may or may not cause an interrupt, depending upon
    whether the corresponding interrupt is enabled.

interrupt
    The act of changing the machine state in response to an exception,
    as described in Chapter 6. “Interrupts” on page 1047.

trap interrupt
    An interrupt that results from execution of a Trap instruction.

Additional exceptions to the rule that the thread obeys the sequential
execution model, beyond those described in Section 2.2 of Book I and in
the bullet defining “program order” in Section 1.1 of Book II, are the
following.

    - A System Reset or Machine Check interrupt may occur. The
      determination of whether an instruction is required by the
      sequential execution model is not affected by the potential
      occurrence of a System Reset or Machine Check interrupt. (The
      determination is affected by the potential occurrence of any
      other kind of interrupt.)

    - A context-altering instruction is executed (Chapter 11.
      “Synchronization Requirements for Context Alterations” on page
      1127). The context alteration need not take effect until the
      required subsequent synchronizing operation has occurred.

    - A Reference and Change bit is updated by the thread. The update
      need not be performed with respect to that thread until the
      required subsequent synchronizing operation has occurred.

    - A Branch instruction is executed and the branch is taken. The
      update of the Come-From Address Register (see Section 8.2 of
      Book III) need not occur until a subsequent context synchronizing
      operation has occurred.

    - An mtgsr is executed and an interrupt or event-based branch
      occurs before the mtspr sequence following mtgsr has finished
      executing. The contents of SPRs that are the targets of mtspr
      instructions between the point of interruption or EBB and the end
      of the mtspr sequence may be altered.

“must”
    If hypervisor software violates a rule that is stated using the
    word “must” (e.g., “this field must be set to 0”), and the rule
    pertains to the contents of a hypervisor resource, to executing an
    instruction that can be executed only in hypervisor state, or to
    accessing storage in real addressing mode, the results are
    undefined, and may include altering resources belonging to other
    partitions, causing the system to “hang”, etc.

hardware
    Any combination of hard-wired implementation, emulation assist, or
    interrupt for software assistance. In the last case, the interrupt
    may be to an architected location or to an implementation-dependent
    location. Any use of emulation assists or interrupts to implement
    the architecture is implementation-dependent.

hypervisor privileged
    A term used to describe an instruction or facility that is
    available only when the thread is in hypervisor state.

privileged state and supervisor mode
    Used interchangeably to refer to a state in which privileged
    facilities are available.

problem state and user mode
    Used interchangeably to refer to a state in which privileged
    facilities are not available.

/, //, ///, ... denotes a field that is reserved in an instruction, in
    a register, or in an architected storage table.

?, ??, ???, ... denotes a field that is implementation-dependent in an
    instruction, in a register, or in an architected storage table.

1.2.2 Reserved Fields

Book I's description of the handling of reserved bits in System
Registers, and of reserved values of defined fields of System
Registers, applies also to the SLB. Book I's description of the
handling of reserved values of defined fields of System Registers
applies also to architected storage tables (e.g., the Page Table).

Some fields of certain architected storage tables may be written to
automatically by the hardware, e.g., Reference and Change bits in the
Page Table. When the hardware writes to such a table, the following
rules are obeyed.
translation is disabled, the setting of the ISL bit has no effect. ISL
mode has no effect on SLB, TLB, and ERAT entry invalidations caused by
slbie, slbieg, slbia, tlbie, and tlbiel.

    Programming Note
    Specifying that L||LP=0b000 in PATE[PS] when VPM mode is enabled
    has the same effect on address translation when translation is
    disabled as enabling ISL mode when translation is enabled.

    ISL mode is needed when translation is enabled because translation
    uses the SLB, and the contents of the SLB are accessible to the
    operating system and should not be modified by the hypervisor. ISL
    mode is not needed when translation is disabled since Virtual Real
    Mode address translation uses PATE[PS], which is not visible to the
    operating system and is in complete control of the hypervisor.

3       Key-Based Virtualization (KBV)
        Controls whether Key-Based Virtualization is enabled as
        specified below.

        0 - KBV is disabled
        1 - KBV is enabled

        When KBV is enabled, Virtual Page Class Key Storage Protection
        exceptions that occur on operand accesses when VPM[1]=0 cause
        Hypervisor Data Storage interrupts.

        Programming Note
        Key-Based Virtualization provides an efficient means for the
        hypervisor to intercept storage references, e.g. MMIO, that
        must be emulated. (The corresponding behavior for instruction
        fetching is not desired.) Virtual Page Class Key Storage
        Protection exceptions not handled by the hypervisor should be
        reflected to the operating system at its Data Storage interrupt
        vector with the hypervisor having set DSISR[42].

4:8     Reserved

9:11    Default Prefetch Depth (DPFD)
        The DPFD field is used as the default prefetch depth for data
        stream prefetching when DSCR[DPFD]=0; see page 844.

12:16   Reserved

17:19   Power-saving mode Exit Cause Enable (Upper Section) (PECEU)

        17    Hypervisor Virtualization Exit Enable
              0  When the stop instruction is executed with
                 PSSCR[EC]=1, Hypervisor Virtualization exceptions are
                 not enabled to cause exit from power-saving mode.
              1  When the stop instruction is executed with
                 PSSCR[EC]=1, Hypervisor Virtualization exceptions are
                 enabled to cause exit from power-saving mode.
        18:19 Reserved

20:37   Reserved

38      Interrupt Little-Endian (ILE)
        The contents of the ILE bit are copied into MSR[LE] by
        interrupts that set MSR[HV] to 0 (see Section 6.5), to
        establish the Endian mode for the interrupt handler.

39:40   Alternate Interrupt Location (AIL)
        Controls the effective address offset, or alternate effective
        address for System Call Vectored, of the interrupt handler and
        the relocation mode in which it begins execution for all
        interrupts except those subject to the overrides described
        below.

        0  The interrupt is taken with MSR[IR DR] = 0b00 and no
           effective address offset or alternate effective address.
        1  Reserved
        2  The interrupt is taken with MSR[IR DR] = 0b11. If the
           interrupt is not System Call Vectored, an effective address
           offset of 0x0000_0000_0001_8000 is applied. System Call
           Vectored uses an alternate effective address of
           0x0000_0000_0001_7 || LEV || 0b0_0000.
        3  The interrupt is taken with MSR[IR DR] = 0b11. If the
           interrupt is not System Call Vectored, an effective address
           offset of 0xc000_0000_0000_4000 is applied. System Call
           Vectored uses an alternate effective address of
           0xc000_0000_0000_3 || LEV || 0b0_0000.

        Machine Check, System Reset, and Hypervisor Maintenance
        interrupts are taken as if LPCR[AIL]=0. In the remainder of
        this definition, “other interrupts” means interrupts other than
        these three.

        Other interrupts that occur when MSR[IR]=0 or MSR[DR]=0 are
        taken as if LPCR[AIL]=0.

        When the software receiving other interrupts uses Radix Tree
        translation and the interrupts occur when MSR[IR]=1 and
        MSR[DR]=1, the interrupts are taken as if LPCR[AIL]=3. This
        includes interrupts taken by the hypervisor
If the state of the PECE field is lost during power-saving mode,
implementations must provide the means to exit power-saving mode upon
the occurrence of a System Reset exception and any of the exceptions
that were enabled by the PECE field when the stop instruction was
executed. In addition, they may also exit power-saving mode on
exceptions that were disabled by the PECE field as well. See Section
6.5.1 and Section 6.5.2 for additional information about exit from
power-saving mode.

53      Guest Translation Shootdown Enable (GTSE)
        Controls whether the operating system is permitted to use tlbie
        and slbieg directly, or must issue a system call to the
        hypervisor.

        0  Guest is not permitted to use tlbie, slbieg, tlbsync, and
           slbsync.
        1  Guest is permitted to use tlbie, slbieg, tlbsync, and
           slbsync.
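The Alternate Interrupt Location (AIL) encodings of the LPCR described earlier in this section reduce to simple address arithmetic, sketched below in Python. This is an illustration under stated assumptions rather than a definition: the function names and the interrupt-name strings are hypothetical, only the overrides visible in the text are modeled (the text continues beyond them), and LEV is taken to be 7 bits wide because a 52-bit prefix, LEV, and five zero bits must together form a 64-bit address.

```python
# Hypothetical names; encodings follow the AIL field description only.
EXEMPT = {"machine_check", "system_reset", "hypervisor_maintenance"}


def effective_ail(lpcr_ail, msr_ir, msr_dr, interrupt, radix=False):
    """AIL value actually used, applying only the overrides listed above."""
    if interrupt in EXEMPT:
        return 0                      # taken as if LPCR[AIL]=0
    if msr_ir == 0 or msr_dr == 0:
        return 0                      # taken as if LPCR[AIL]=0
    if radix:
        return 3                      # Radix Tree translation, IR=1 and DR=1
    return lpcr_ail


def ea_offset(ail):
    """Effective address offset for interrupts other than System Call Vectored."""
    if ail == 0:
        return 0
    if ail == 2:
        return 0x0000_0000_0001_8000
    if ail == 3:
        return 0xC000_0000_0000_4000
    raise ValueError("AIL=1 is reserved")


def scv_alternate_ea(ail, lev):
    """Alternate effective address: prefix || LEV || 0b0_0000."""
    if not 0 <= lev < 128:            # assumption: LEV is a 7-bit field
        raise ValueError("LEV out of range")
    prefixes = {2: 0x0000_0000_0001_7000, 3: 0xC000_0000_0000_3000}
    if ail not in prefixes:
        raise ValueError("no alternate effective address for this AIL value")
    return prefixes[ail] | (lev << 5)
```

For example, under these assumptions scv_alternate_ea(3, 1) places LEV in bits just above the five low-order zero bits, so consecutive LEV values are 32 bytes apart.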
    - Instructions are treated as illegal instructions,

    - SPRs are treated as if they were not defined for the
      implementation,

    - The “reserved SPRs” (see Section 1.3.3 of Book I) are treated as
      not defined for the implementation,

    - Fields in instructions are treated as if they were 0s,

    - bits in system registers read back 0s, and mtspr operations have
      no effect on their values.

      Programming Note
      When a bit in a system register is made unavailable by the PCR,
      mtspr operations performed on the register in problem state have
      no effect on the value of the bit regardless of the privilege
      state in which the register may subsequently be read.

A PCR bit may also determine how an instruction field value is
interpreted or may define other behavior as specified in the bit
definitions below.

The PCR has no effect on the setting of the MSR and [H]SRR1 by
interrupts (and of the Count Register by the System Call Vectored
interrupt), and by the rfscv, [h]rfid and mtmsr[d] instructions, except
as specified elsewhere in this section.

    Programming Note
    Because the PCR does not prevent mtspr, rfscv, [h]rfid, and
    mtmsr[d] instructions from setting bits in system registers that
    the PCR will make unavailable after a transition to problem state,
    these instructions may cause interrupts in a variety of unexpected
    ways. For example, consider an operating system that sets SRR1
    such that rfid returns to problem state with MSR[TS] nonzero. A TM
    Bad Thing interrupt will result, even though TM is made unavailable
    by the PCR.

    Similarly, the PCR does not prevent rfebb instructions from
    setting bits in system registers that the PCR has made unavailable
    in problem state, and thus changes to BESCR[TS] made by privileged
    code have the potential to subsequently cause illegal transaction
    state transitions when rfebb is executed in problem state,
    resulting in the occurrence of TM Bad Thing type Program
    interrupts.

MMCR0 bit; facility availability interrupts (e.g. [Hypervisor] Facility
Unavailable, Vector Unavailable, etc.) do not occur as a result of
problem state accesses even if the corresponding field in the MSR,
[H]FSCR, or MMCR0 makes them unavailable in problem state.

    Programming Note
    Facilities that can be disabled in problem state by the PCR that
    also have enable bits in either the MSR or [H]FSCR include
    Transactional Memory, the BHRB instructions, event-based branch
    instructions, TAR, DSCR at SPR 3, SIER, MMCR2, and certain
    Floating-Point, Vector, and VSX instructions. When any of these
    facilities are made unavailable in problem state by the PCR, the
    corresponding [Hypervisor] Facility Unavailable, Floating-Point
    Unavailable, Vector Unavailable, or VSX Unavailable interrupts do
    not occur when the facility is accessed in problem state. Note,
    however, that the PCR does not affect privileged accesses, and thus
    any Hypervisor Facility Unavailable, Floating-Point Unavailable,
    Vector Unavailable, or VSX Unavailable interrupts that are
    specified to occur as a result of privileged accesses occur
    regardless of the PCR value.

The bit definitions for the PCR are shown below.

Bit     Description

0:59    Reserved

60      Version 2.07 (v2.07)
        This bit controls the availability, in problem state, of the
        following instructions, facilities, and behaviors that were
        newly available in problem state in the version of the
        architecture subsequent to Version 2.07.

        - The instructions listed in Table 1
        - The implicit setting of XER[OV32] and XER[CA32] by
          Fixed-Point Arithmetic instructions
        - LMRR, LMSER
        - scv

        0  The instructions, behaviors, and facilities listed above
           are available in problem state.
- icbt
- lq, stq, lbarx, lharx, stbcx., sthcx.
- lqarx, stqcx.
- clrbhrb, mfbhrbe
- rfebb, bctar[l]
- The entire Transactional Memory facility
- The instructions in Table 2
- The reserved no-op instructions (see
Section 1.9.3 of Book I)
- The reserved SPRs (see Section 1.3.3 of
Book I)
- PPR32
- DSCR at SPR number 3
- SIER and MMCR2
- MMCR0[42:47, 51:55] and MMCRA[0:63].
Programming Note
The specified bits of MMCR0 and
MMCRA above cannot be changed by
mtspr instructions and mfspr instruc-
tions return 0s for these bits.
- BESCR, EBBHR, and TAR
- The ability of the or 31,31,31 and or
5,5,5 instructions to change the value of
PPRPRI.
- The ability of mtspr instructions that
attempt to set PPRPRI to 001 or 101 to
change the value of PPRPRI.
6:28 Reserved
29:30 Transaction State (TS)
00 Non-transactional
01 Suspended
10 Transactional
11 Reserved
Programming Note
The only instructions that can alter MSR[LE]
are rfid and hrfid.
Programming Note
The transition rules are the same for mtmsrd as for
the rfid-type instructions because if a transition
were illegal for mtmsrd but allowed for rfid, or vice
versa, software could use the instruction for which
the transition is allowed to achieve the effect of the
other instruction.
Notes:
1. Generate TM Bad Thing type Program interrupt. “All others” includes
   all attempts to set MSR[TS] to 0b11 (reserved value).
2. Instruction completes, change to MSR[TM] suppressed, except when
   attempted by rfebb, in which case the result is a TM Bad Thing type
   Program interrupt.
Programming Note
For rfscv, [h]rfid, and mtmsrd, the attempted tran-
sition from S0 to N0 is suppressed in order that
interrupt handlers that are "unaware" of transac-
tional memory, and load an MSR value that has not
been updated to take account of transactional
memory, will continue to work correctly. (If the
interrupt occurs when a transaction is running or
suspended, the interrupt will set MSR[TS || TM] to
S0. If the interrupt handler attempts to load an
MSR value that has not been updated to take
account of transactional memory, that MSR value
will have TS || TM = N0. It is desirable that the
interrupt handler remain in state S0, so that it can
return normally to the interrupted transaction.)
The problem solved by suppressing this transition
does not apply to rfebb, so for rfebb an attempt to
transition from S0 to N0 is not suppressed, and
instead causes a TM Bad Thing type Program
interrupt.
PSSCR layout (field boundaries as labeled in the register figure):

Bits    Field
0:3     PLS
4:40    ///
41      SD
42      ESL
43      EC
44:47   PSLL
48:53   ///
54:55   TR
56:59   MTL
60:63   RL
The bits and their meanings are as follows.

0:3     Power-Saving Level Status (PLS)
        Hardware sets this field to the highest power-saving level that
        the thread entered between the time when the stop instruction
        is executed and when the thread exits power-saving mode. See
        the description of the SD field for the value returned in this
        field when the PSSCR is read.

        Programming Note
        Since the power-saving level entered during power-saving mode
        may vary with time, the PLS field may not indicate the
        power-saving level that existed at exit from power-saving mode.

4:40    Reserved

41      Status Disable (SD)
        This field is accessible only to the hypervisor.

        0  The current value of the PLS field is returned in the PLS
           field when reading the PSSCR (using mfspr).
        1  0’s are returned in the PLS field when reading the PSSCR
           (using mfspr).

42      ESL
        0  State loss while in power-saving mode is controlled by the
           RL, MTL, and PSLL fields.
        1  Non-hypervisor state loss is allowed while in power-saving
           mode in addition to state loss controlled by the RL, MTL,
           and PSLL fields.

        If this field is set to 1 when the stop instruction is executed
        in privileged non-hypervisor state, a Hypervisor Facility
        Unavailable interrupt occurs. See Section 6.5.26.

        Programming Note
        When state loss occurs, thread resources such as SPRs, GPRs,
        address translation resources, etc. may be powered off or
        allocated to other threads during power-saving mode. The amount
        of state loss for various combinations of ESL, RL, and MTL
        values is implementation dependent, subject to the restrictions
        specified in Section 3.3.2.

43      Exit Criterion (EC)
        This field is accessible only to the hypervisor.
        0  Hardware will exit power-saving mode when the exception
           corresponding to any system-caused interrupt occurs.
           Power-saving mode is exited either at the instruction
           following the stop (if MSR[EE]=0) or in the corresponding
           interrupt handler (if MSR[EE]=1).
        1  Provided LPCR[PECE] is not lost, hardware will exit
           power-saving mode only when a System Reset exception or one
           of the events specified in LPCR[PECE] occurs. If the event
           is a Machine Check exception, then a Machine Check interrupt
           occurs; otherwise a System Reset interrupt occurs, and the
           contents of SRR1 indicate the event that caused exit from
           power-saving mode.

        When the stop instruction is executed in hypervisor state, the
        hypervisor must set the ESL field to the same value as this
        field. Also, if the RL or MTL fields are set to values that
        allow state loss, then fields ESL and EC must both be set to 1.
        Other combinations of the values of the ESL, EC, RL, and MTL
        fields are reserved for future use.

        Architecture Note
        Other combinations of the values of the ESL, EC, RL, and MTL
        fields may be allowed in a future version of the architecture
        in order to provide additional functionality.

        If this field is set to 1 when the stop instruction is executed
        in privileged state, a Hypervisor Facility Unavailable
        interrupt occurs. See Section 6.5.26.

        Programming Note
        In order to enable an OS to enter power-saving mode without
        hypervisor involvement, both the EC and ESL bits must be set to
        0s. When this is done, OS execution of the stop instruction
        will not cause hypervisor involvement provided that the RL and
        MTL fields are less than or equal to PSLL. See Section 6.5.26
        for details.

54:55   Transition Rate (TR)
        This field is used to specify the relative rate at which the
        power-saving level increases during power-saving mode. The rate
        of power-saving level increase corresponding to each value is
        implementation-dependent, and monotonically increasing with the
        value specified.

56:59   Maximum Transition Level (MTL)
        If the value of this field is greater than the value of the
        Power-Saving Level Limit (PSLL) field when stop is executed in
        privileged, non-hypervisor state, a Hypervisor Facility
        Unavailable interrupt occurs. See Section 6.5.26 of Book III.

        Otherwise, if the value of this field is greater than the value
        of the RL field, the power-saving level is allowed to increase
        from the value in the RL field up to the value of this field
        during power-saving mode.

        If this field is less than or equal to the value of the PSLL
        field when stop is executed in privileged non-hypervisor state,
        this field is used to specify the maximum power-saving level
        that can be reached during power-saving mode provided that the
        value of this field is greater than the value of the RL field.
        If this field is less than the Requested Level (RL) field when
        stop is executed, hardware is not allowed to increase the
        power-saving level during power-saving mode beyond the value
        indicated in the RL field.

60:63   Requested Level (RL)
        This field is used to specify the power-saving level that is to
        be entered when the stop instruction is executed.

        If the value of this field is greater than the value of the
        Power-Saving Level Limit (PSLL) field when stop is executed in
        privileged, non-hypervisor state, a Hypervisor Facility
        Unavailable interrupt occurs.

        Programming Note
        The Hypervisor Facility Unavailable interrupt occurs when a
        privileged non-hypervisor program executes stop when
        PSSCR[RL] > PSSCR[PSLL] so that the Hypervisor may decide
        whether or not to allow the requested loss of state to occur.
        If the hypervisor decides that some loss of state is
        acceptable, it may choose to re-execute stop after either
        setting PSSCR[MTL] to a value that causes state loss, or
        setting both PSSCR[RL] and PSSCR[MTL] to values that cause
        state loss. When the thread exits power-saving mode, the
        hypervisor can quickly determine whether any resources were
        actually lost and need to be restored.
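The conditions under which a stop executed by an operating system involves the hypervisor can be collected into one predicate. The Python sketch below is illustrative only: the function names are hypothetical, bit 0 is taken to be the most significant bit of the 64-bit register, and the positions assumed for ESL (42), PSLL (44:47), and TR (54:55) are inferred from the register layout figure rather than from explicit field headings.

```python
def field(value, first, last):
    """Extract bits first:last of a 64-bit value (bit 0 = most significant)."""
    width = last - first + 1
    return (value >> (63 - last)) & ((1 << width) - 1)


def decode_psscr(psscr):
    """Split a PSSCR value into its named fields (positions partly assumed)."""
    return {
        "PLS": field(psscr, 0, 3),
        "SD": field(psscr, 41, 41),
        "ESL": field(psscr, 42, 42),    # assumed position
        "EC": field(psscr, 43, 43),
        "PSLL": field(psscr, 44, 47),   # assumed position
        "TR": field(psscr, 54, 55),     # assumed position
        "MTL": field(psscr, 56, 59),
        "RL": field(psscr, 60, 63),
    }


def stop_needs_hypervisor(psscr):
    """True if stop in privileged non-hypervisor state would cause a
    Hypervisor Facility Unavailable interrupt per the field descriptions."""
    f = decode_psscr(psscr)
    return (f["ESL"] == 1 or f["EC"] == 1
            or f["RL"] > f["PSLL"] or f["MTL"] > f["PSLL"])


def level_may_increase(psscr):
    """True if the power-saving level may rise above RL (MTL > RL)."""
    f = decode_psscr(psscr)
    return f["MTL"] > f["RL"]
```

With PSLL=3, MTL=2, and RL=1 (the value 0x30021 under the assumed layout), the predicate is false and the level may rise from 1 to 2; raising RL above PSLL or setting EC or ESL makes it true.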
Programming Note
If LEV=1 the hypervisor is invoked. This is the only
way that executing an instruction can cause hyper-
visor state to be entered.
Because this instruction is not privileged, it is possi-
ble for application software to invoke the hypervi-
sor. However, such invocation should be
considered a programming error.
LR ← CIA + 4
CTR[33:36 42:47] ← undefined
CTR[0:32 37:41 48:63] ← MSR[0:32 37:41 48:63]
MSR ← new_value (see below)
NIA ← (see below)

The effective address of the instruction following the System Call
Vectored instruction is placed into the Link Register. Bits 0:32,
37:41, and 48:63 of the MSR are placed into the corresponding bits of
Count Register, and bits 33:36 and 42:47 of Count Register are set to
undefined values.

Then a System Call Vectored interrupt is generated. The interrupt
causes the MSR to be altered as described in Section 6.5.

The interrupt causes the next instruction to be fetched as specified in
LPCR[AIL] (see Section 2.2).

The SRRs are not affected.

This instruction is context synchronizing.

Special Registers Altered:
    LR CTR MSR

if (MSR[29:31] ¬= 0b010 | CTR[29:31] ¬= 0b000) then
    MSR[29:31] ← CTR[29:31]
MSR[48] ← CTR[48] | CTR[49]
MSR[58] ← CTR[58] | CTR[49]
MSR[59] ← CTR[59] | CTR[49]
MSR[0:2 4:28 32 37:41 49:50 52:57 60:63] ← CTR[0:2 4:28 32 37:41 49:50 52:57 60:63]
NIA ←iea LR[0:61] || 0b00

If bits 29 through 31 of the MSR are not equal to 0b010 or bits 29
through 31 of CTR are not equal to 0b000, then the value of bits 29
through 31 of CTR is placed into bits 29 through 31 of the MSR. The
result of ORing bits 48 and 49 of CTR is placed into MSR[48]. The
result of ORing bits 58 and 49 of CTR is placed into MSR[58]. The
result of ORing bits 59 and 49 of CTR is placed into MSR[59]. Bits 0:2,
4:28, 32, 37:41, 49:50, 52:57, and 60:63 of CTR are placed into the
corresponding bits of the MSR.

If the instruction attempts to cause an illegal transaction state
transition (see Table 3, “Transaction state transitions that can be
requested by rfebb, rfid, rfscv, hrfid, and mtmsrd.,” on page 947), or
when TM is disabled by the PCR, a transition to Problem state with an
active transaction, a TM Bad Thing type Program interrupt is generated
(unless a higher-priority exception is pending). If this interrupt is
generated, the value placed into SRR0 by the interrupt processing
mechanism (see Section 6.4.3) is the address of the rfid instruction.

Otherwise, if the new MSR value does not enable any pending exceptions,
then the next instruction is fetched, under control of the new MSR
value, from the address LR[0:61] || 0b00 (when SF=1 in the new MSR
value) or (32)0 || LR[32:61] || 0b00, where (32)0 denotes 32 zero bits
(when SF=0 in the new MSR value). If the new MSR value enables one or
more pending exceptions, the interrupt associated with the highest
priority pending exception is generated; in this case the value placed
into SRR0 or HSRR0 by the interrupt processing mechanism (see Section
6.4.3) is the address of the instruction that would have been executed
next had the interrupt not occurred.

This instruction is privileged and context synchronizing.

Special Registers Altered:
    MSR
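The way the return sequence above rebuilds the MSR from the Count Register can be traced with a small model. The Python sketch below is a hedged illustration, not the architected definition: the helper names are hypothetical, bit 0 is the most significant bit of a 64-bit register, and the transaction-state legality checks and pending-exception handling described above are deliberately omitted.

```python
def _mask(first, last):
    """Mask covering bits first:last of a 64-bit value (bit 0 = MSB)."""
    width = last - first + 1
    return ((1 << width) - 1) << (63 - last)


def get(value, first, last):
    """Read bits first:last."""
    return (value & _mask(first, last)) >> (63 - last)


def put(value, first, last, new):
    """Return value with bits first:last replaced by new."""
    m = _mask(first, last)
    return (value & ~m) | ((new << (63 - last)) & m)


# Bit ranges copied unchanged from CTR to the MSR.
COPIED = [(0, 2), (4, 28), (32, 32), (37, 41), (49, 50), (52, 57), (60, 63)]


def new_msr_from_ctr(msr, ctr):
    """Model only the MSR assignments; no TM Bad Thing or exception checks."""
    new = msr
    if get(msr, 29, 31) != 0b010 or get(ctr, 29, 31) != 0b000:
        new = put(new, 29, 31, get(ctr, 29, 31))
    bit49 = get(ctr, 49, 49)
    new = put(new, 48, 48, get(ctr, 48, 48) | bit49)
    new = put(new, 58, 58, get(ctr, 58, 58) | bit49)
    new = put(new, 59, 59, get(ctr, 59, 59) | bit49)
    for first, last in COPIED:
        new = put(new, first, last, get(ctr, first, last))
    return new
```

Note that bits outside the listed ranges (for example bit 3 and bit 51) keep their previous MSR values in this model, and that setting bit 49 in CTR forces bits 48, 58, and 59 to 1 as well.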
Programming Note
If this instruction sets MSR[PR] to 1, it also sets
MSR[EE], MSR[IR], and MSR[DR] to 1.
Programming Note
If this instruction sets MSR[PR] to 1, it also sets
MSR[EE], MSR[IR], and MSR[DR] to 1.
Programming Note
For the power-saving level corresponding to the
second item above, if the state of the Decrementer
were not maintained and updated as if the thread
was not in power-saving mode, Decrementer
exceptions would not reliably cause exit from this
power-saving level even if Decrementer exceptions
were enabled to cause exit.
stop                                                        XL-form

stop

    19    ///    ///    ///    370    /
    0     6      11     16     21     31

The thread is placed into power-saving mode and execution is stopped.

The power-saving level that is entered is determined by the contents of
the PSSCR (see Section 3.2.3). The thread state that is maintained
depends on the power-saving level that is entered. The thread state
that is maintained at each power-saving level is
implementation-dependent, subject to the restrictions specified in
Section 3.3.2.

The thread remains in power-saving mode until either a System Reset
exception or certain other events occur. The events that may cause exit
from power-saving mode are specified by PSSCR[EC] and LPCR[PECE]. If
the event that causes the exit is a System Reset, Machine Check, or
Hypervisor Maintenance exception, resource state that would be lost if
the exception occurred when the thread was not in power-saving mode may
be lost.

An attempt to execute this instruction in Suspended state will result
in a TM Bad Thing type of Program interrupt.

This instruction is privileged and context synchronizing.

Special Registers Altered:
    None

3.3.2.2 Entering and Exiting Power-Saving Mode

Before software executes the stop instruction, the PSSCR is
initialized. If the stop instruction is to be used by the OS, the
hypervisor initializes the fields that are accessible only to the
hypervisor before dispatching the OS. These fields include the SD, ESL,
EC, and PSLL fields. See the Programming Notes for these fields in
Section 3.2.3 for additional information.

If the stop instruction is to be executed by the hypervisor when
PSSCR[EC]=1, the LPCR[PECE] must be set to the desired value (see
Section 2.2). Depending on the implementation and the power-saving
level to be entered, it may also be necessary to save the state of
certain resources and perform synchronization procedures to ensure that
all stores have been performed with respect to other threads or
mechanisms that use the storage areas before executing the stop. See
the User’s Manual for the implementation for details.

Software must also specify the requested and maximum power-saving level
limit fields (i.e., the RL and MTL fields), and the Transition Rate
(TR) field in the PSSCR in order to bound the range of power-saving
modes that can be entered. If the value of the RL field is greater than
or equal to the value of the MTL field, the power-saving level will not
increase from the initial level during power-saving mode.

    Programming Note
    If MSR[EE]=1 when the stop instruction is executed, then the
    interrupt corresponding to the exception that was expected to cause
    exit from power-saving mode may occur immediately prior to
    execution of the stop instruction. If this occurs, the result may
    be a software hang condition since the exception that was expected
    to cause exit from power-saving mode has already occurred.

    The above software hang condition can be prevented by setting
    MSR[EE]=0 prior to executing stop.

After the thread has entered power-saving mode with PSSCR[EC]=0, any
exception may cause exit from power-saving mode. When an exception
occurs, power-saving mode is exited either at the instruction following
the stop (if MSR[EE]=0) or in the corresponding interrupt handler (if
MSR[EE]=1).

    Programming Note
    If stop was executed when PSSCR[EC]=0 and MSR[EE]=0 (in order to
    avoid the hang condition described in the above Programming Note),
    MSR[EE] should be set to 1 after power-saving mode is exited in
    order to take the interrupt corresponding to the exception that
    caused exit from power-saving mode.

After the thread has entered power-saving mode with PSSCR[EC]=1, only
the System Reset or Machine Check exceptions and the exceptions enabled
in LPCR[PECE] will cause exit. If the event that causes exit is a
Machine Check exception, then a Machine Check interrupt occurs;
otherwise a System Reset interrupt occurs, and the contents of SRR1
indicate the exception that caused exit from power-saving mode.

If the hypervisor has set PSSCR[SD]=0 prior to when the stop
instruction is executed, the instruction following the stop may
typically be a mfspr in order to read the
4.1 Fixed-Point Facility Overview

This chapter describes the details concerning the registers and the
privileged instructions implemented in the Fixed-Point Facility that are not
covered in Book I.

4.2 Special Purpose Registers

Special Purpose Registers (SPRs) are read and written using the mfspr (page
975) and mtspr (page 974) instructions. Most SPRs are defined in other
chapters of this book; see the index to locate those definitions.

4.3.1 Processor Version Register

The Processor Version Register (PVR) is a 32-bit read-only register that
contains a value identifying the version and revision level of the
implementation. The contents of the PVR can be copied to a GPR by the mfspr
instruction. Read access to the PVR is privileged; write access is not
provided.

Version      Revision
32           48           63

Figure 7. Processor Version Register

The PVR distinguishes between implementations that differ in attributes that
may affect software. It contains two fields.

Version   A 16-bit number that identifies the version of the implementation.
          Different version numbers indicate major differences between
          implementations.
Revision  A 16-bit number that distinguishes between implementations of the
          version. Different revision numbers indicate minor differences
          between implementations having the same version number, such as
          clock rate and Engineering Change level.

Version numbers are assigned by the Power ISA process. Revision numbers are
assigned by an implementation-defined process.

4.3.2 Chip Information Register

The Chip Information Register (CIR) is a 32-bit read-only register that
contains a value identifying the manufacturer and other characteristics of
the chip on which the processor is implemented. The contents of the CIR can
be copied to a GPR by the mfspr instruction. Read access to the CIR is
privileged; write access is not provided.

Bit     Description
32:35   Manufacturer ID (ID)
        A four-bit field that identifies the manufacturer of the chip.
36:63   Implementation-dependent.

Figure 8. Chip Information Register

4.3.3 Processor Identification Register

The Processor Identification Register (PIR) is a 32-bit register that
contains a 20-bit PROCID field that can be used to distinguish the thread
from other threads in the system. The contents of the PIR can be copied to a
GPR by the mfspr instruction. Read access to the PIR is privileged; write
access is not provided.
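The PVR and CIR field layouts given in Sections 4.3.1 and 4.3.2 can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample register value are hypothetical, and in practice the register contents would be obtained with a privileged mfspr.

```python
def decode_pvr(pvr: int) -> dict:
    """Split a 32-bit PVR value into Version (bits 32:47) and Revision (bits 48:63)."""
    pvr &= 0xFFFF_FFFF                # the PVR is a 32-bit register
    return {"version": pvr >> 16, "revision": pvr & 0xFFFF}


def decode_cir(cir: int) -> dict:
    """Split a 32-bit CIR value into Manufacturer ID (bits 32:35) and the rest (bits 36:63)."""
    cir &= 0xFFFF_FFFF                # the CIR is a 32-bit register
    return {
        "manufacturer_id": cir >> 28,                  # four high-order bits
        "implementation_dependent": cir & 0x0FFF_FFFF, # remaining 28 bits
    }


# Hypothetical sample value, chosen only to show the field split.
print(decode_pvr(0x004E_1200))
```

Bits 32:63 of the architected numbering correspond to the 32 low-order bits of the GPR that mfspr returns, which is why the masks above operate on a 32-bit value.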
The means by which the PIR is initialized are implementation-dependent.

The PIR is a hypervisor resource; see Chapter 2.

4.3.4 Process Identification Register

Programming Note
Radix tree translation assigns special meaning to PID=0, specifically
indicating the operating system’s kernel process. When GR=1, PIDR should not
be set to zero except when MSR[PR]=0.

4.3.5 Control Register

The Control Register (CTRL) is a 32-bit register as shown below.

///      TS       ///      RUN
32       48       56       63

Figure 11. Control Register

The field definitions for the CTRL are shown below.

[...]

Hypervisor accesses
Bits 0:7 of this field are read-only bits that indicate the state of
CTRL[RUN] for threads with hypervisor thread numbers 0 through 7,
respectively; bits corresponding to hypervisor thread numbers higher than the
maximum hypervisor thread number supported are set to 0s.

4.3.6 Program Priority Register

Privileged programs may set a wider range of program priorities in the PRI
field of PPR and PPR32 than may be set by problem state programs (see Chapter
3 of Book II). Problem state programs may only set values in the range of
0b001 to 0b100 unless the Problem State Priority Boost register (see Section
4.3.7) allows the value 0b101. Privileged programs may set values in the
range of 0b001 to 0b110. Hypervisor software may also set 0b111. For all
priorities except 0b101, if a program attempts to set a value that is not
allowed for its privilege level, the PRI field remains unchanged. If a
problem state program attempts to set its priority value to 0b101 when this
priority value is not allowed for problem state programs, the priority is set
to 0b100. The values and their corresponding meanings are as follows.
Figure 12. Problem State Priority Boost Register

A problem state program is able to set the program priority to medium high
only when the PSPB of the thread contains a non-zero value.

The maximum value to which the PSPB can be set must be a power of 2 minus 1.
Bits that are not required to represent this maximum value must return 0s
when read regardless of what was written to them.

When the PSPB is set to a value less than its maximum value but greater than
0, its contents decrease monotonically at the same rate as the SPURR until
its contents minus the amount it is to be decreased are 0 or less when a
problem state program is executing on the thread at a priority of medium
high. When the contents of the PSPB minus the amount it is to be decreased
are 0 or less, its contents are replaced by 0.

When the PSPB is set to its maximum value or 0, its contents do not change
until it is set to a different value.

Whenever the priority of a thread is medium high and either of the following
conditions exist, hardware changes the priority to medium:

- the PSPB counts down to 0, or
- PSPB=0 and the privilege state of the thread is changed to problem state
  (MSR[PR]=1).

4.3.8 Relative Priority Register

The Relative Priority Register (RPR) is a 64-bit register that allows the
hypervisor to control the relative priorities corresponding to each valid
value of PPR[PRI].

4.3.9 Software-use SPRs

Software-use SPRs are 64-bit registers provided for use by software.

SPRG0
SPRG1
SPRG2
SPRG3
0                63

Figure 14. Software-use SPRs

SPRG0, SPRG1, and SPRG2 are privileged registers. SPRG3 is a privileged
register except that the contents may be copied to a GPR in Problem state
when accessed using the mfspr instruction.

Programming Note
Neither the contents of the SPRGs, nor accessing them using mtspr or mfspr,
has a side effect on the operation of the thread. One or more of the
registers is likely to be needed by non-hypervisor interrupt handler programs
(e.g., as scratch registers and/or pointers to per thread save areas).

Operating systems must ensure that no sensitive data are left in SPRG3 when
a problem state program is dispatched, and operating systems for secure
systems must ensure that SPRG3 cannot be used to implement a “covert
channel” between problem state programs. These requirements can be satisfied
by clearing SPRG3 before passing control to a program that will run in
problem state.
HSPRG0
HSPRG1
0 63
Programming Note
Neither the contents of the HSPRGs, nor accessing
them using mtspr or mfspr, has a side effect on
the operation of the thread. One or more of the reg-
isters is likely to be needed by hypervisor interrupt
handler programs (e.g., as scratch registers and/or
pointers to per thread save areas).
Load Byte and Zero Caching Inhibited Indexed X-form

lbzcix RT,RA,RB

31    RT    RA    RB    853    /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← ⁵⁶0 || MEM(EA, 1)

Let the effective address (EA) be the sum (RA|0)+(RB). The byte in storage
addressed by EA is loaded into RT[56:63]. RT[0:55] are set to 0.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None

Load Halfword and Zero Caching Inhibited Indexed X-form

lhzcix RT,RA,RB

31    RT    RA    RB    821    /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← ⁴⁸0 || MEM(EA, 2)

Let the effective address (EA) be the sum (RA|0)+(RB). The halfword in
storage addressed by EA is loaded into RT[48:63]. RT[0:47] are set to 0.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None
Load Word and Zero Caching Inhibited Indexed X-form

lwzcix RT,RA,RB

31    RT    RA    RB    789    /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← ³²0 || MEM(EA, 4)

Let the effective address (EA) be the sum (RA|0)+(RB). The word in storage
addressed by EA is loaded into RT[32:63]. RT[0:31] are set to 0.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None

Load Doubleword Caching Inhibited Indexed X-form

ldcix RT,RA,RB

31    RT    RA    RB    885    /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
RT ← MEM(EA, 8)

Let the effective address (EA) be the sum (RA|0)+(RB). The doubleword in
storage addressed by EA is loaded into RT.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None
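The effective-address computation and zero extension shared by the Load Caching Inhibited instructions can be modelled as below. This is a simplified sketch, not an emulator: the `gpr` list, the `mem` dictionary and the function names are illustrative assumptions, storage is assumed to be accessed in big-endian byte order, and the privilege check and the Caching Inhibited and Guarded attributes are not modelled.

```python
MASK64 = (1 << 64) - 1


def effective_address(gpr, ra, rb):
    """EA = (RA|0) + (RB): a register number of 0 for RA contributes the value 0."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & MASK64


def load_ci_zero_extended(mem, gpr, rt, ra, rb, nbytes):
    """Model lbzcix/lhzcix/lwzcix/ldcix for nbytes = 1, 2, 4 or 8.

    mem maps byte addresses to byte values (hypothetical storage model).
    """
    ea = effective_address(gpr, ra, rb)
    data = bytes(mem[(ea + i) & MASK64] for i in range(nbytes))
    # Assumption: big-endian interpretation. The loaded value occupies the
    # low-order bits of RT, and the high-order bits are set to 0.
    gpr[rt] = int.from_bytes(data, "big")
    return ea


gpr = [0] * 32
gpr[4], gpr[5] = 0x1000, 0x10
mem = {0x1010: 0xAB, 0x1011: 0xCD}
load_ci_zero_extended(mem, gpr, rt=3, ra=4, rb=5, nbytes=2)   # behaves like lhzcix r3,r4,r5
```

Because a Python integer built from two bytes has no bits set above bit position 15, the zero fill of RT[0:47] falls out of `int.from_bytes` without an explicit mask.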
Store Byte Caching Inhibited Indexed X-form

stbcix RS,RA,RB

31    RS    RA    RB    981    /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
MEM(EA, 1) ← (RS)[56:63]

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)[56:63] are
stored into the byte in storage addressed by EA.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None

Store Halfword Caching Inhibited Indexed X-form

sthcix RS,RA,RB

31    RS    RA    RB    949    /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
MEM(EA, 2) ← (RS)[48:63]

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)[48:63] are
stored into the halfword in storage addressed by EA.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None
Store Word Caching Inhibited Indexed X-form

stwcix RS,RA,RB

31    RS    RA    RB    917    /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
MEM(EA, 4) ← (RS)[32:63]

Let the effective address (EA) be the sum (RA|0)+(RB). (RS)[32:63] are
stored into the word in storage addressed by EA.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None

Store Doubleword Caching Inhibited Indexed X-form

stdcix RS,RA,RB

31    RS    RA    RB    1013   /
0     6     11    16    21     31

if RA = 0 then b ← 0
else           b ← (RA)
EA ← b + (RB)
MEM(EA, 8) ← (RS)

Let the effective address (EA) be the sum (RA|0)+(RB). (RS) is stored into
the doubleword in storage addressed by EA.

The storage access caused by this instruction is performed as though the
specified storage location is Caching Inhibited and Guarded.

This instruction is hypervisor privileged.

Special Registers Altered:
None
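The X-form layouts shown for the Caching Inhibited loads and stores (primary opcode 31 in bits 0:5, three 5-bit register fields, a 10-bit extended opcode in bits 21:30, and a reserved bit 31) can be turned into a small encoder. The sketch below is illustrative: the function and table names are hypothetical, the mnemonics stwcix and stdcix for extended opcodes 917 and 1013 are assumed from the instruction semantics, and the reserved bit is assumed to be encoded as 0.

```python
# Extended opcodes taken from the instruction layouts in this section.
XO = {
    "lbzcix": 853, "lhzcix": 821, "lwzcix": 789, "ldcix": 885,
    "stbcix": 981, "sthcix": 949, "stwcix": 917, "stdcix": 1013,
}


def encode_x_form(mnemonic, rt_or_rs, ra, rb):
    """Assemble a 32-bit X-form instruction word for a Caching Inhibited access."""
    for reg in (rt_or_rs, ra, rb):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers must be in the range 0..31")
    return (
        (31 << 26)            # bits 0:5   primary opcode
        | (rt_or_rs << 21)    # bits 6:10  RT or RS
        | (ra << 16)          # bits 11:15 RA
        | (rb << 11)          # bits 16:20 RB
        | (XO[mnemonic] << 1) # bits 21:30 extended opcode; bit 31 left as 0
    )


word = encode_x_form("lbzcix", 3, 4, 5)   # lbzcix r3,r4,r5
print(f"{word:#010x}")
```

Instruction bit 0 is the most significant bit of the word, so a field that starts at bit position p and is w bits wide is shifted left by 32 - p - w.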
4.4.2 OR Instruction

or Rx,Rx,Rx can be used to set PPR[PRI] (see Section 4.3.6) as shown in
Figure 16. For all priorities except medium high, PPR[PRI] remains unchanged
if the privilege state of the thread executing the instruction is lower than
the privilege indicated in the figure. For priority medium high, PPR[PRI] is
set to medium if the thread executing the instruction is in problem state
and medium high priority is not allowed for problem state programs. (The
encodings available to problem state programs, as well as encodings for
additional shared resource hints not shown here, are described in Chapter 3
of Book II.)
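The privilege rules for setting PPR[PRI], as stated in Section 4.3.6 and summarized above, can be captured in a short model. This is a hedged sketch: the function name, the privilege constants and the `boost_allowed` flag (standing in for "the PSPB allows medium high") are hypothetical, and a request for a value outside every listed range, such as 0b000, is assumed to leave the field unchanged.

```python
PROBLEM, PRIVILEGED, HYPERVISOR = 0, 1, 2   # hypothetical privilege-level labels


def set_ppr_pri(current, requested, privilege, boost_allowed=False):
    """Return the PRI value that results from an attempt to set `requested`."""
    if privilege == HYPERVISOR:
        highest = 0b111
    elif privilege == PRIVILEGED:
        highest = 0b110
    else:
        # Problem state: 0b101 (medium high) only when the PSPB allows it.
        highest = 0b101 if boost_allowed else 0b100

    if 0b001 <= requested <= highest:
        return requested
    if requested == 0b101:
        # Only reachable in problem state without boost: demoted to 0b100.
        return 0b100
    return current                          # not allowed: PRI unchanged


print(bin(set_ppr_pri(0b010, 0b101, PROBLEM)))
```

The special case for 0b101 is the only one in which a disallowed request changes the field rather than being ignored.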
Programming Note
In privileged state, if ldmx were not illegal it would
behave as if it were ldx (because event-based
exceptions and event-based branches do not occur
in privileged state.) Such behavior could result in
silent errors if privileged software used ldmx
expecting it to behave as it does in problem state.
SPRs” in Book I.
9 The length of the GSR is undefined. An access to this SPR affects synchronization of subsequent mtspr
instructions. See the introductory text in this section for more details.
10 SPR numbers 777-778, 783, 793-794, and 799 are reserved for the Performance Monitor. All other SPR num-
bers that are not shown above and are not implementation-specific are reserved.
11 The mftb instruction is Phased-Out. Assemblers targeting Version 2.03 or later of the architecture should gener-
ate an mfspr instruction for the mftb and mftbu extended mnemonics; see the corresponding Assembler Note
in the mftb instruction description (see Section 6.1 of Book II).
12 mfspr specifying the GSR has no meaningful use. It is treated as a noop. As a result, no extended mnemonic
is assigned for it.
13 No extended mnemonic is provided because previous versions of the architecture defined the obvious extended
mnemonic as resolving to the non-privileged SPR number, and because there is no software benefit in using the
privileged SPR number, rather than the non-privileged SPR number, for this function.
*This figure also defines extended mnemonics for the mtspr and mfspr instructions, including the Special Purpose
Registers (SPRs) defined in Book I and for the Move From Time Base instruction defined in Book II.
The mtspr and mfspr instructions specify an SPR as a numeric operand; extended mnemonics are provided that
represent the SPR in the mnemonic rather than requiring it to be coded as an operand. Similar extended mnemon-
ics are provided for the Move From Time Base instruction, which specifies the portion of the Time Base as a
numeric operand.
Note: mftb serves as both a basic and an extended mnemonic. The Assembler will recognize an mftb mnemonic
with two operands as the basic form, and an mftb mnemonic with one operand as the extended form. In the
extended form the TBR operand is omitted and assumed to be 268 (the value that corresponds to TB).
Move To Special Purpose Register XFX-form

mtspr SPR,RS

31    RS    spr    467    /
0     6     11     21     31

n ← spr[5:9] || spr[0:4]
switch (n)
  case (13):  if MSR[HV PR] = 0b10 then
                SPR(13) ← (RS)
              else
                if MSR[HV PR] = 0b00 then
                  SPR(13) ← ((RS) & AMOR) | ((SPR(13)) & ¬AMOR)
                else
                  SPR(13) ← ((RS) & UAMOR) | ((SPR(13)) & ¬UAMOR)
  case (29,61): if MSR[HV PR] = 0b10 then
                SPR(n) ← (RS)
              else
                SPR(n) ← ((RS) & AMOR) | ((SPR(n)) & ¬AMOR)
  case (157): if MSR[HV PR] = 0b10 then
                SPR(157) ← (RS)
              else
                SPR(157) ← (RS) & AMOR
  case (158): start mtspr sequence optimization
  case (336): SPR(336) ← (SPR(336)) & (RS)
  case (808, 809, 810, 811):
  default:    if length(SPR(n)) = 64 then
                SPR(n) ← (RS)
              else
                SPR(n) ← (RS)[32:63]

The SPR field denotes a Special Purpose Register, encoded as shown in Figure
17. If the SPR field contains the value 158, the instruction indicates the
start of a sequence of mtspr instructions that may be synchronized as a
group. See the introductory material in this section for more information.
If the SPR field contains a value from 808 through 811, the instruction
specifies a reserved SPR, and is treated as a no-op; see Section 1.3.3,
“Reserved Fields, Reserved Values, and Reserved SPRs” in Book I. Otherwise,
the contents of register RS are placed into the designated Special Purpose
Register, except as described in the next four paragraphs. For Special
Purpose Registers that are 32 bits long, the low-order 32 bits of RS are
placed into the SPR.

When the designated SPR is the Authority Mask Register (AMR), (using SPR 13
or SPR 29), or the designated SPR is the Instruction Authority Mask Register
(IAMR), and MSR[HV PR]=0b00, the contents of bit positions of register RS
corresponding to 1 bits in the Authority Mask Override Register (AMOR) are
placed into the corresponding bits of the AMR or IAMR, respectively; the
other AMR or IAMR bits are not modified.

When the designated SPR is the AMR, using SPR 13, and MSR[PR]=1, the
contents of bit positions of register RS corresponding to 1 bits in the User
Authority Mask Override Register (UAMOR) are placed into the corresponding
bits of the AMR; the other AMR bits are not modified.

When the designated SPR is the UAMOR and MSR[HV PR]=0b00, the contents of
register RS are ANDed with the contents of the AMOR and the result is placed
into the UAMOR.

When the designated SPR is the Hypervisor Maintenance Exception Register
(HMER), the contents of register RS are ANDed with the contents of the HMER
and the result is placed into the HMER.

For this instruction, SPRs TBL and TBU are treated as separate 32-bit
registers; setting one leaves the other unaltered.

spr[0]=1 if and only if writing the register is privileged. Execution of
this instruction specifying an SPR number with spr[0]=1 causes a Privileged
Instruction type Program interrupt when MSR[PR]=1 and, if the SPR is a
hypervisor resource (see Figure 17) when MSR[HV PR]=0b00, causes a
Privileged Instruction type of Program interrupt if LPCR[EVIRT]=0 and a
Hypervisor Emulation Assistance interrupt if LPCR[EVIRT]=1.

Execution of this instruction specifying an SPR number that is not defined
for the implementation causes one of the following.

if spr[0]=0:
  - if MSR[PR]=1: Hypervisor Emulation Assistance interrupt
  - if MSR[PR]=0: Hypervisor Emulation Assistance interrupt for SPR 0 and no
    operation (i.e., the instruction is treated as a no-op) when
    LPCR[EVIRT]=0 or Hypervisor Emulation Assistance interrupt when
    LPCR[EVIRT]=1 for all other SPRs
if spr[0]=1:
  - if MSR[PR]=1: Privileged Instruction type Program interrupt
  - if MSR[PR]=0: no operation (i.e., the instruction is treated as a no-op)
    when LPCR[EVIRT]=0 and Hypervisor Emulation Assistance interrupt when
    LPCR[EVIRT]=1

If an attempt is made to execute mtspr specifying a Transactional Memory SPR
in other than Non-transactional state, with the exception of TFAR in
suspended state, a TM Bad Thing type Program interrupt is generated.

Special Registers Altered:
See Figure 17
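Two pieces of the mtspr description lend themselves to a small model: the swap of the two 5-bit halves of the spr field, and the masked update of the AMR through SPR 13. The sketch below assumes plain Python integers for register contents; the function names are hypothetical, and interrupts, privilege checks and the other SPR cases are not modelled.

```python
MASK64 = (1 << 64) - 1


def spr_number(spr_field):
    """n = spr[5:9] || spr[0:4]: swap the two 5-bit halves of the 10-bit field."""
    high_half = (spr_field >> 5) & 0x1F     # spr[0:4]
    low_half = spr_field & 0x1F             # spr[5:9]
    return (low_half << 5) | high_half


def mtspr_amr13(rs, amr, amor, uamor, hv, pr):
    """New AMR contents after mtspr to SPR 13, selected by MSR[HV PR]."""
    if (hv, pr) == (1, 0):
        return rs & MASK64                  # hypervisor state: full write
    # Privileged non-hypervisor state uses AMOR; problem state uses UAMOR.
    mask = amor if (hv, pr) == (0, 0) else uamor
    return ((rs & mask) | (amr & ~mask)) & MASK64


def mtspr_hmer(rs, hmer):
    """SPR 336 (HMER): the register is ANDed with RS, so bits can only be cleared."""
    return hmer & rs & MASK64


print(spr_number(13 << 5))                                            # field value for SPR 13
print(hex(mtspr_amr13(0xFFFF, 0x0F0F, 0x00FF, 0x000F, hv=0, pr=0)))
```

Only bit positions with a 1 in the selected mask take their value from RS; all other positions keep their previous AMR contents.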
Note
See the Notes that appear with mtspr.
Programming Note
If this instruction sets MSR[PR] to 1, it also sets MSR[EE], MSR[IR], and
MSR[DR] to 1.

This instruction does not alter MSR[ME] or MSR[LE]. (This instruction does
not alter MSR[HV] because it does not alter any of the high-order 32 bits of
the MSR.)

If the only MSR bits to be altered are MSR[EE RI], to obtain the best
performance L=1 should be used.
Move To Machine State Register Doubleword X-form

mtmsrd RS,L

31    RS    ///    L    ///    178    /
0     6     11     15   16     21     31

if L = 0 then
  if (MSR[29:31] ¬= 0b010 | RS[29:31] ¬= 0b000) then
    MSR[29:31] ← RS[29:31]
  MSR[48] ← (RS)[48] | (RS)[49]
  MSR[58] ← (RS)[58] | (RS)[49]
  MSR[59] ← (RS)[59] | (RS)[49]
  MSR[0:2 4:28 32:47 49:50 52:57 60:62] ←
      (RS)[0:2 4 6:28 32:47 49:50 52:57 60:62]
else
  MSR[48 62] ← (RS)[48 62]

The MSR is set based on the contents of register RS and of the L field.

L=0:
  If bits 29 through 31 of the MSR are not equal to 0b010 or bits 29 through
  31 of RS are not equal to 0b000, then the value of bits 29 through 31 of RS
  is placed into bits 29 through 31 of the MSR. The result of ORing bits 48
  and 49 of register RS is placed into MSR[48]. The result of ORing bits 58
  and 49 of register RS is placed into MSR[58]. The result of ORing bits 59
  and 49 of register RS is placed into MSR[59]. Bits 0:2, 4:28, 32:47, 49:50,
  52:57, and 60:62 of register RS are placed into the corresponding bits of
  the MSR.

L=1:
  Bits 48 and 62 of register RS are placed into the corresponding bits of the
  MSR. The remaining bits of the MSR are unchanged.

If the instruction attempts to cause an illegal transaction state transition
(see Table 3, “Transaction state transitions that can be requested by rfebb,
rfid, rfscv, hrfid, and mtmsrd.,” on page 947), or when TM is disabled by
the PCR, a transition to Problem state with an active transaction, a TM Bad
Thing type Program interrupt is generated (unless a higher-priority
exception is pending). If this interrupt is generated, the value placed into
SRR0 by the interrupt processing mechanism (see Section 6.4.3) is the
address of the mtmsrd instruction.

This instruction is privileged.

If L=0 this instruction is context synchronizing. If L=1 this instruction is
execution synchronizing; in addition, the alterations of the EE and RI bits
take effect as soon as the instruction completes.

Special Registers Altered:
MSR

Except in the mtmsrd instruction description in this section, references to
“mtmsrd” in this document imply either L value unless otherwise stated or
obvious from context (e.g., a reference to an mtmsrd instruction that
modifies an MSR bit other than the EE or RI bit implies L=0).

Programming Note
If this instruction sets MSR[PR] to 1, it also sets MSR[EE], MSR[IR], and
MSR[DR] to 1.

This instruction does not alter MSR[LE], MSR[ME] or MSR[HV].

If the only MSR bits to be altered are MSR[EE RI], to obtain the best
performance L=1 should be used.

Programming Note
If MSR[EE]=0 and an External, Decrementer, or Performance Monitor exception
is pending, executing an mtmsrd instruction that sets MSR[EE] to 1 will
cause the interrupt to occur before the next instruction is executed, if no
higher priority exception exists (see Section 6.9, “Interrupt Priorities” on
page 1086). Similarly, if a Hypervisor Decrementer interrupt is pending,
execution of the instruction by the hypervisor causes a Hypervisor
Decrementer interrupt to occur if HDICE=1.

For a discussion of software synchronization requirements when altering
certain MSR bits, see Chapter 11.

Programming Note
mtmsrd serves as both a basic and an extended mnemonic. The Assembler will
recognize an mtmsrd mnemonic with two operands as the basic form, and an
mtmsrd mnemonic with one operand as the extended form. In the extended form
the L operand is omitted and assumed to be 0.
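The bit manipulation performed by mtmsrd can be followed more easily with a small model that uses the architecture's big-endian bit numbering (bit 0 is the most significant bit of the 64-bit MSR). This is a sketch under stated assumptions: the helper names are hypothetical, the set of copied bits follows the prose description for L=0 (0:2, 4:28, 32:47, 49:50, 52:57, 60:62), and the TM Bad Thing checks and synchronization effects are not modelled.

```python
MASK64 = (1 << 64) - 1


def bit(n):
    """Mask for bit n in big-endian numbering (bit 0 = most significant)."""
    return 1 << (63 - n)


def field(lo, hi):
    """Mask covering bits lo..hi inclusive."""
    mask = 0
    for n in range(lo, hi + 1):
        mask |= bit(n)
    return mask


COPY_MASK = (field(0, 2) | field(4, 28) | field(32, 47)
             | field(49, 50) | field(52, 57) | field(60, 62))
TS_MASK = field(29, 31)


def mtmsrd(msr, rs, l):
    """Return the new MSR value for mtmsrd RS,L."""
    if l == 1:
        mask = bit(48) | bit(62)            # only EE and RI are written
        return (msr & ~mask) | (rs & mask)

    new = (msr & ~COPY_MASK) | (rs & COPY_MASK)
    msr_ts = (msr & TS_MASK) >> 32          # bits 29:31 as a 3-bit value
    rs_ts = (rs & TS_MASK) >> 32
    if msr_ts != 0b010 or rs_ts != 0b000:
        new = (new & ~TS_MASK) | (rs & TS_MASK)
    pr = bool(rs & bit(49))
    for n in (48, 58, 59):                  # EE, IR, DR are ORed with RS bit 49 (PR)
        if (rs & bit(n)) or pr:
            new |= bit(n)
        else:
            new &= ~bit(n)
    return new & MASK64


# Setting PR alone also sets EE, IR and DR.
print(hex(mtmsrd(0, bit(49), 0)))
```

Bits 3, 51 and 63 never appear in any mask used for L=0, which matches the Programming Note stating that MSR[HV], MSR[ME] and MSR[LE] are not altered.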
31 RT /// /// 83 /
0 6 11 16 21 31
RT ← MSR
The contents of the MSR are placed into register RT.
This instruction is privileged.
Special Registers Altered:
None
Storage Control Overview

Host real address space size is 2^m bytes, m≤60; see Note 1.

Guest real address space size is 2^m bytes, m≤60; see Notes 1 and 2.

Real page size is 2^12 bytes (4 KB).

Effective address space size is 2^64 bytes.

An effective address is translated to a virtual address via the Segment
Lookaside Buffer (SLB).
  - Virtual address space size is 2^n bytes, 65≤n≤78; see Note 3.
  - Segment size is 2^s bytes, s=28 or 40.
  - 2^(n-40) ≤ number of virtual segments ≤ 2^(n-28); see Note 3.
  - Virtual page size is 2^p bytes, where 12≤p, and 2^p is no larger than
    either the size of the biggest segment or the real address space; a size
    of 4 KB, 64 KB, 2MB for Radix Tree translation, and an
    implementation-dependent number of other sizes are supported; see Note
    4. For HPT translation, the Page Table specifies the virtual page size
    and the SLB or implied segment descriptor specifies the base virtual
    page size, which is the smallest virtual page size that the segment can
    contain. The base virtual page size is 2^b bytes. For Radix Tree
    translation, the virtual page size is determined by the location of the
    Page Table Entry in the Radix Tree.
  - Segments contain pages of a single size, a mixture of 4 KB and 64 KB
    pages, or a mixture of page sizes that include implementation-dependent
    page sizes.

A virtual address is translated to a real address via the Page Table.

Notes:

1. The value of m is implementation-dependent (subject to the maximum given
   above). When used to address storage or to represent a guest real
   address, the high-order 60-m bits of the “60-bit” real address must be
   zeros.

2. The hypervisor may assign a guest real address space size for each
   partition. Accesses to guest real storage outside this range but still
   mappable by the second level Radix Tree or HPT will cause an HDSI.
   Accesses to storage outside the mappable range will have boundedly
   undefined results.

3. The value of n is implementation-dependent (subject to the range given
   above). In references to 78-bit virtual addresses elsewhere in this Book,
   the high-order 78-n bits of the “78-bit” virtual address are assumed to
   be zeros.

4. The supported values of p for the larger virtual page sizes are
   implementation-dependent (subject to the limitations given above).

Programming Note
Note that without some of the reserved bits in the Radix PTE, the RPN field
cannot address the full 60-bit real address space. Similarly without some of
the reserved bits in the HPT PTE, the AVA field cannot address the full
60-bit real address space.

Note that without some of the reserved bits in the Radix PTE, the AVA field
cannot resolve the full 78-bit virtual address.

5.7.1 32-Bit Mode

The computation of the 64-bit effective address is independent of whether
the thread is in 32-bit mode or 64-bit mode. In 32-bit mode (MSR[SF]=0), the
high-order 32 bits of the 64-bit effective address are treated as zeros for
the purpose of addressing storage. This applies to both data accesses and
instruction fetches. It applies independent of whether address translation
is enabled or disabled. This truncation of the effective address is the only
respect in which storage accesses in 32-bit mode differ from those in 64-bit
mode.

Programming Note
Treating the high-order 32 bits of the effective address as zeros
effectively truncates the 64-bit effective address to a 32-bit effective
address such as would have been generated on a 32-bit implementation of the
Power ISA. Thus, for example, the ESID in 32-bit mode is the high-order four
bits of this truncated effective address; the ESID thus lies in the range
0-15. When address translation is enabled, these four bits would select a
Segment Register on a 32-bit implementation of the Power ISA. The SLB
entries that translate these 16 ESIDs can be used to emulate these Segment
Registers.

5.7.2 Virtualized Partition Memory (VPM) Mode

VPM mode enables the hypervisor to reassign all or part of a partition’s
memory transparently so that the reassignment is not visible to the
partition. When this is done, the partition’s memory is said to be
“virtualized.” This mode is only available within Paravirtualized HPT
translation mode. The other translation mode provides equivalent function by
providing two levels of translation with separate Page Tables for the
operating system and the hypervisor. (See Section 5.7.7 for a more complete
overview of the translation modes.) The VPM field in the LPCR enables VPM
mode when address translation is enabled. VPM is always enabled when address
translation is disabled.

If the thread is not in hypervisor state, and either address translation is
enabled and VPM[1]=1, or address translation is disabled, conditions that
would have caused a Data Storage or an Instruction Storage interrupt if the
affected memory were not virtualized instead cause a Hypervisor Data Storage
or a Hypervisor Instruction Storage interrupt respectively. Because the
Hypervisor Data Storage and Hypervisor Instruction Storage interrupts always
put the thread in hypervisor state, they permit the hypervisor to handle the
condition if appropriate (e.g., to restore the contents of a page that was
reassigned), and to reflect it to the operating system’s Data Storage or
Instruction Storage interrupt handler otherwise.

When address translation is enabled, VPM mode has no effect on address
translation. When address translation is disabled, addressing is controlled
as specified in Section 5.7.3.

5.7.3 Hypervisor Real And Virtual Real Addressing Modes

When a storage access is an instruction fetch performed when instruction
address translation is disabled, or if the access is a data access and data
address translation is disabled, it is said to be performed in “hypervisor
real addressing mode” if the thread is in hypervisor state. If the thread is
not in hypervisor state, the access is said to be performed in “virtual real
addressing mode.” Storage accesses in hypervisor real and virtual real
addressing modes are performed in a manner that depends on the contents of
MSR[HV], VPM, PATE[PS], HRMOR (see Chapter 2), bit 0 of the effective
address (EA[0]), and the state of the Real Mode Storage Control Facility as
described below. Bits 1:3 of the effective address are ignored.

MSR[HV]=1
  If EA[0]=0, the Hypervisor Offset Real Mode Address mechanism, described
  in Section 5.7.3.1, controls the access.

  If EA[0]=1, bits 4:63 of the effective address are used as the real
  address for the access.

MSR[HV]=0
  If PATE[HR]||GR=0b00, the Virtual Real Mode Addressing mechanism,
  described in Section 5.7.3.3, controls the access.

  If PATE[HR]||GR≠0b00, partition-scoped translation is performed on the
  effective address. (See Section 5.7.12.3, “Obtaining Host Real Address,
  Radix on Radix”.)

5.7.3.1 Hypervisor Offset Real Mode Address

If MSR[HV] = 1 and EA[0] = 0, the access is controlled by the contents of
the Hypervisor Real Mode Offset Register, as follows.

Hypervisor Real Mode Offset Register (HRMOR)
  Bits 4:63 of the effective address for the access are ORed with the 60-bit
  offset represented by the contents of the HRMOR, and the 60-bit result is
  used as the real address for the access. The supported offset values are
  all values of the form i×2^r, where 0 ≤ i and i is less than 2^j, and j
  and r are implementation-dependent values having the properties that
  12 ≤ r ≤ 26 (i.e., the minimum offset granularity is 4 KB and the maximum
  offset granularity is 64 MB) and j+r = m, where the real address size
  supported by the implementation is m bits.

Programming Note
EA[4:63-r] should equal 60-r zero bits. If this condition is satisfied,
ORing the effective address with the offset produces a result that is
equivalent to adding the effective address and the offset.

If m is less than 60, EA[4:63-m] and HRMOR[4:63-m] must be zeros.

5.7.3.2 Storage Control Attributes for Accesses in Hypervisor Real
Addressing Mode

Storage accesses in hypervisor real addressing mode are performed as though
all of storage had the following storage control attributes, except as
modified by the Hypervisor Real Mode Storage Control facility (see Section
5.7.3.2.1). (The storage control attributes are defined in Book II.)

  - not Write Through Required
  - not Caching Inhibited, for instruction fetches
  - not Caching Inhibited, for data accesses except those caused by the
    Load/Store Caching Inhibited instructions; Caching Inhibited, for data
    accesses caused by the Load/Store Caching Inhibited instructions
  - Memory Coherence Required, for data accesses
  - Guarded
  - not SAO

Additionally, storage accesses in hypervisor real addressing mode are
performed as though all storage was not No-execute.
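The 32-bit mode truncation of Section 5.7.1 and the hypervisor real mode address selection of Sections 5.7.3 and 5.7.3.1 reduce to a few mask operations. The sketch below is illustrative only: the function names and the sample HRMOR value are hypothetical, and the implementation-dependent limits on m, j and r are not checked.

```python
MASK64 = (1 << 64) - 1
MASK60 = (1 << 60) - 1


def truncate_for_32bit_mode(ea, sf):
    """When MSR[SF]=0, the high-order 32 bits of the EA are treated as zeros."""
    return ea & MASK64 if sf else ea & 0xFFFF_FFFF


def esid_32bit(ea):
    """ESID in 32-bit mode: the high-order four bits of the truncated EA (0..15)."""
    return (ea & 0xFFFF_FFFF) >> 28


def hypervisor_real_address(ea, hrmor):
    """Real address for an access with MSR[HV]=1 and translation disabled."""
    ea0 = (ea >> 63) & 1
    low60 = ea & MASK60                   # EA bits 4:63; bits 1:3 are ignored
    if ea0 == 1:
        return low60                      # EA[0]=1: the offset is bypassed
    return low60 | (hrmor & MASK60)       # EA[0]=0: OR with the HRMOR offset


# Hypothetical HRMOR value of 128 MB (2 x 2^26).
print(hex(hypervisor_real_address(0x100, 0x0800_0000)))
```

When the effective address has no 1 bits at or above the offset granularity, the OR gives the same result as an addition, which is the condition described in the Programming Note of Section 5.7.3.1.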
virtual address: Refers to the output of Segment translation and input
to HPT translation.

host real address: Refers to the output of the partition-scoped
translation process when two levels of translation are being performed
or the output of the process-scoped translation in Radix on Radix when
MSRHV=1 for quadrants 0 and 3. The simpler “real address” may be used
interchangeably.

Page Directory: A table within the Radix Tree translation structure
that contains elements (“Page Directory Entries”) that point to other
tables, instead of containing just Page Table Entries. The Page
Directory that is at the root of the Radix Tree is called the “Root
Page Directory.”

effLPID, effPID: This is shorthand for effective LPID and effective
PID. In certain circumstances, the value used for the LPID and/or the
PID is specified to be zero instead of the actual register contents.
“Effective” or “eff” is used to indicate the possibility of such a
substitution. This value substitution happens only in Radix Tree
translation, and is based on the value of EA0:1 (see Section 5.7.5.1,
“Effective Address Space Structure for Radix-using Partitions”). Value
substitution does not happen in HPT translation. When a guest has its
own Radix Tree (PATEGR=1), PID substitution may take place. When a host
uses Radix Tree translation (PATEHR=1), both PID and LPID substitution
may take place. When a host uses HPT translation, the only special
significance associated with LPIDR=0 is with regard to Segment Table
walk when MSRHV=1, as described later.

adjunct: An adjunct is a software entity that resides in a partition
along with an operating system and its applications in order to
efficiently provide services (e.g. device drivers) for the partition.
The adjunct partition is managed by the hypervisor. It runs in problem
state with MSRHV PR=0b11, thereby restricting the resources it can
modify (MSRPR=1) and causing its interrupts to go to the hypervisor
(MSRHV=1). It shares an HPT with the partition it serves. The adjunct
is kept separate from the partition using Virtual Page Class Key
protection. (The adjunct partition’s lightness of weight derives from
not requiring a full partition context switch (SLB flush, TLB flush,
PTCR change, etc.) when the client partition invokes the services of
the adjunct partition.) Each thread may have its own unique
translations for an adjunct. As a result, adjunct segment descriptors
cannot exist in the process’s Segment Table and must instead be bolted
in the SLB manually. The adjunct construct exists only with a
hypervisor that uses HPT translation and only for LPID≠0. The adjunct
has its own 64-bit EA space. Accesses performed by the adjunct are not
subject to guest translation (when using nested translation). Entry to
an adjunct is only possible from hypervisor state. Prior to dispatching
the adjunct, the hypervisor must invalidate SLB entries that map the
effective address range that will be used by the adjunct. Similarly, on
exit from the adjunct, the hypervisor must invalidate its SLB entries

5.7.5 Address Ranges Having Defined Uses

The address ranges described below have uses that are defined by the
architecture.

    Fixed interrupt vectors
    Except for the first 256 bytes, which are reserved for software
    use, the real page beginning at real address
    0x0000_0000_0000_0000 is either used for interrupt vectors or
    reserved for future interrupt vectors.

    Implementation-specific use
    The two contiguous real pages beginning at real address
    0x0000_0000_0000_1000 are reserved for implementation-specific
    purposes.

    Offset Real Mode interrupt vectors
    The real page beginning at the real address specified by the HRMOR
    is used similarly to the page for the fixed interrupt vectors.

    Relocated interrupt vectors
    Depending on the values of MSRIR DR and LPCRAIL and on whether the
    specific interrupt will cause MSRHV to change, either the virtual
    page containing the byte addressed by effective address
    0x0000_0000_0001_8000 or the virtual page containing the byte
    addressed by effective address 0xC000_0000_0000_4000 may be used
    similarly to the page for the fixed interrupt vectors. (See
    Section 2.2.)

    System Call Vectored interrupt vectors
    Depending on the value of LPCRAIL, the virtual page containing the
    effective address 0x0000_0000_0001_7000 or 0xc000_0000_0000_3000
    contains the interrupt vectors that are invoked by the System Call
    Vectored instruction.

    Page Table
    A contiguous sequence of real pages beginning at the real address
    specified by the first doubleword of the Partition Table Entry when
    HR=1 contains the Page Table.

    Adjunct Virtual Address Space
    In systems in which the hypervisor uses HPT translation, the
    following virtual address ranges are reserved for adjunct use:
    FFFFFD1FFF0000000 to FFFFFDFFFFFFFFFFF,
    FFFFFE1FFF0000000 to FFFFFEFFFFFFFFFFF,
    and
    FFFFFF0FFF0000000 to FFFFFFEFFFFFFFFFF

5.7.5.1 Effective Address Space Structure for Radix-using Partitions

When Radix Tree translation is in use but translation is off (MSRPR=0),
MSRHV selects between partition-scoped translation of the real mode
guest real address, formed by treating EA0:1 as 0b00, and hypervisor
real mode (see Section 5.7.3). When Radix Tree translation is in use
and translation is on, EA0:1 together with MSRHV are used to select one
of as many as four Radix Trees with which to perform process-scoped
translation, as a technique to make system calls and interrupts more
efficient by avoiding the need to immediately change the contents of
the PIDR and LPIDR. (See Figure 19 for an illustration of the mappings.
Note that the second and third columns and related description below
apply only when the HR field of the Partition Table Entry (PATEHR, see
Figure 21) for the partition is set to 1. Also note that PATEHR comes
from the Partition Table Entry (PATE) for the partition indicated by
the LPIDR, rather than effLPID.) Since there’s nothing to prevent a
process from generating any address in the 64b EA space, the
exceptional cases are defined as follows. When a quadrant of the EA
space has no associated Radix Tree, access to it results in an
Instruction Segment exception or Data Segment exception, as appropriate
for the type of access. Similarly, reference to any portion of these
quadrants or the real mode guest real address described above that is
not mapped by a Radix Tree (versus mapped by an invalid entry) will
cause an Instruction or Data Segment exception.

Programming Note
Note that the quadrant structure is only available to software running
in 64b mode. 32b software will only be able to access storage mapped by
its own Radix Tree.

Programming Note
The intent is that the PIDR and LPIDR contents indicate the process and
partition on behalf of which execution is taking place. For example,
when a guest process interrupts to the hypervisor, execution to service
the interrupt will generally be on behalf of the guest partition. When
execution changes to be purely managing hypervisor resources that are
not directly tied to any partition, the hypervisor should set LPIDR
to 0.

For guest and host applications and the guest operating system,
quadrant 0 (EA0:1=0b00) addresses the Radix Tree for the application
and quadrant 3 (EA0:1=0b11) addresses the direct supervisor of the
application. For the guest and host applications, it will frequently be
the case that page protection is used to prevent access to quadrant 3,
but partition-wide shared text and/or data may also be located there.
Quadrants 1 and 2 have no associated Radix Tree.

Programming Note
Outboard accelerators may commonly be limited to accessing quadrants 0
and 3 as a matter of platform architecture. In such platforms,
references to quadrants 1 and 2 may be regarded as errors.

For the hypervisor, quadrants 0 and 3 are as described above. Quadrant
1 (EA0:1=0b01) addresses the guest application and quadrant 2
(EA0:1=0b10) addresses the guest operating system, one of which
experienced a hypervisor interrupt or performed a system call to the
hypervisor. It will rarely be the case that quadrants 0 and 1 will be
in use concurrently. A new value will usually be put in PIDR between
accesses to quadrants 0 and 1.

When MSRHV=1 and EA0:1=0b00 or 0b11, only process-scoped translation is
performed. When MSRHV=0 and MSRIR/DR=0, only partition-scoped
translation is performed. Otherwise, nested process- and
partition-scoped translations are performed.

Programming Note
Warning: The functionality described in this section, e.g. directing
most hypervisor interrupts to the LPID=0 translation tables, places
great importance on the correctness of the format of and mappings in
Partition Table Entry 0 and the tables it anchors. An error in any of
these structures could have severe consequences including system
checkstops and hangs.
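
The quadrant rules above can be illustrated with a short sketch. It assumes a Radix-using partition (PATEHR=1), follows the three-way rule for process-scoped, partition-scoped, and nested translation literally, and leaves hypervisor real mode (Section 5.7.3) out of scope. All names are hypothetical.

```python
# Illustrative sketch; names are hypothetical, not part of the architecture.
def quadrant(ea):
    """Return EA0:1, the two most significant bits of a 64-bit EA."""
    return (ea >> 62) & 0b11


def translation_scopes(msr_hv, translation_on, ea):
    """Classify an access in a Radix-using partition (PATEHR=1 assumed)."""
    q = quadrant(ea)
    if not translation_on:
        if msr_hv == 1:
            # Hypervisor real mode is covered by Section 5.7.3, not here.
            raise ValueError("hypervisor real mode is not modelled")
        return "partition-scoped only"
    if msr_hv == 1 and q in (0b00, 0b11):
        return "process-scoped only"
    if msr_hv == 0 and q in (0b01, 0b10):
        # Quadrants 1 and 2 have no associated Radix Tree for guests.
        return "segment exception"
    return "nested"
```

A guest access to quadrant 0 with translation on is therefore classified as nested, while the same address accessed with MSRHV=1 uses process-scoped translation only.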
The In-Memory Tables are used to find the tables that are used in the
actual translation process for the partition and process that are
executing. They enable hardware, including accelerator hardware
separate and distinct from the Power ISA processors in the platform, to
perform the translation process largely without software intervention.
Description of the In-Memory Table structure follows. Hardware may
cache the contents of the In-Memory Tables. Variants of tlbie[l] may be
used to manage the caching even though the In-Memory Table contents are
not cached in the TLB.

When an address in the In-Memory Table structure is specified to be a
virtual or guest real address, the access to that address is considered
to be performed with translation on. For a host using HPT translation,
a base page size is specified for each such access to be used in the
HPT search. The hypervisor can override the Segment Table Page Size in
the Process Table Entry (PRTESTPS, see Figure 22) using LPCRISL. The
base page size for the Process Table (PATEPRTPS) can be safely altered
by the hypervisor since the OS does not have direct access to the
Partition Table Entry. All accesses to the In-Memory Tables, the
Segment Tables, and the guest Radix Tables that are performed with
translation on, including for instruction address translation, are data
accesses performed as if MSRPR=0 for the purpose of determining storage
protection, although instruction side translation exceptions cause
[H]ISI. (A specific example of the implications of

Partition Descriptor

Bit(s)   Name   Description
4:51     PATB   Partition Table Base
59:63    PATS   Partition Table Size = 2^(12+PATS), PATS≤24
All other fields are reserved.

Figure 20. Partition Table Control Register

Programming Note
If it becomes necessary to shrink the Partition Table or to change PATB
to point to a table that is not identical to the existing one, it is
necessary to issue tlbie with RIC=2 to invalidate caching of outdated
In-Memory Table Entries.

The Partition Table is composed of a pair of doublewords per partition.
The first doubleword indicates whether the host uses HPT or Radix Tree
translation, and contains the base of the host’s translation table
structure in host real memory. The first doubleword also contains the
size of the table structure and the size of the Root Page Directory for
a hypervisor using Radix Tree translation, or the base page size for
the VRMA for Paravirtualized HPT translation. Additional details about
the parameters for HPT translation follow.

The HTABORG field contains the high-order 42 bits of the 60-bit real
address of the Page Table. The Page Table is thus constrained to lie on
a 2^18 byte (256 KB) boundary at a minimum. At least 11 bits from the
hash function (see Figure 29) are used to index into the Page Table.
The minimum size Page Table is 256 KB (2^11 PTEGs of 128 bytes each).

The Page Table can be any size 2^n bytes where 18≤n≤46. As the table
size is increased, more bits are used from the hash to index into the
table.

The HTABSIZE field contains an integer giving the number of bits (in
addition to the minimum of 11 bits) from the hash that are used in the
Page Table index. This number must not exceed 28. HTABSIZE is used to
generate a mask of the form 0b00...011...1, which is a string of
28 - HTABSIZE 0-bits followed by a string of HTABSIZE 1-bits. The
1-bits determine which additional bits (beyond the minimum of 11) from
the hash are used in the index (see Figure 29).

On implementations that support a real address size of only m bits,
m<60, bits 0:59-m of the HTABORG field are treated as reserved bits,
and software must set them to zeros.

Programming Note
Let n equal the virtual address size (in bits) supported by the
implementation. If n<67, software should set the HTABSIZE field to a
value that does not exceed n-39. Because the high-order 78-n bits of
the VSID are assumed to be zeros, the hash value used in the Page Table
search will have the high-order 67-n bits either all 0s (primary hash;
see Section 5.7.9.2) or all 1s (secondary hash). If HTABSIZE > n-39,
some of these hash value bits will be used to index into the Page
Table, with the result that certain PTEGs will not be searched.

Example:

Suppose that the Page Table is 16,384 (2^14) 128-byte PTEGs, for a
total size of 2^21 bytes (2 MB). A 14-bit index is required. Eleven
bits are provided from the hash to start with, so 3 additional bits
from the hash must be selected. Thus the value in HTABSIZE must be 3
and the value in HTABORG must have its low-order 3 bits (bits 43:45 of
the first doubleword of the Partition Table Entry) equal to 0. This
means that the Page Table must begin on a 2^(3+11+7) = 2^21 = 2 MB
boundary.

Paravirtualized HPT Partition Table Entry

First doubleword layout:  0 (bit 0) | / (1:3) | HTABORG (4:45) | // (46:55) | PS (56:58) | HTABSIZE (59:63)
Second doubleword layout: 0 (bit 0) | PRTB (1:38) | /// (39:55) | PRTPS (56:58) | PRTS (59:63)

First doubleword
Bit(s)   Name       Description
0        HR         Host Radix
                    0b0 - hypervisor uses HPT translation for this
                          partition
                    0b1 - hypervisor uses Radix Tree translation for
                          this partition
4:45     HTABORG    Hashed Page Table Base
56:58    PS         Page Size (uses L||LP encoding as in current SLBE)
59:63    HTABSIZE   HPT size = 2^(HTABSIZE+18), HTABSIZE ≤ 28

Second doubleword
Bit(s)   Name       Description
0        GR         Guest Radix
                    0b0 - partition uses HPT
                    0b1 - partition uses Radix Tree
1:38     PRTB       Process Table Base (when UPRT=1)
56:58    PRTPS      Process Table Page Size (when UPRT=1) (uses L||LP
                    encoding as in current SLBE)
59:63    PRTS       Process Table Size = 2^(12+PRTS), PRTS≤24 (when
                    UPRT=1)

Radix on Radix Partition Table Entry

First doubleword layout:  1 (bit 0) | RTS1 (1:2) | / (3) | RPDB (4:55) | RTS2 (56:58) | RPDS (59:63)
Second doubleword layout: 1 (bit 0) | / (1:3) | PRTB (4:51) | // (52:58) | PRTS (59:63)

First doubleword
Bit(s)   Name       Description
0        HR         Host Radix
                    0b0 - hypervisor uses HPT translation for this
                          partition
                    0b1 - hypervisor uses Radix Tree translation for
                          this partition
1:2      RTS1       Radix Tree Size[0:1]
4:55     RPDB       Root Page Directory Base
56:58    RTS2       Radix Tree Size[2:4] (number of address bits
                    mapped), size = 2^(RTS+31)
59:63    RPDS       Root Page Directory Size = 2^(RPDS+3), RPDS≥5

Second doubleword
Bit(s)   Name       Description
0        GR         Guest Radix
                    0b0 - partition uses HPT
                    0b1 - partition uses Radix Tree
4:51     PRTB       Process Table Base
59:63    PRTS       Process Table Size = 2^(12+PRTS), PRTS≤24 (when
                    UPRT=1)
All other fields are reserved.

Figure 21. Partition Table Entry Variants
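
As an illustration of the first-doubleword layout for the Paravirtualized HPT variant, the following sketch extracts the fields using the architecture's big-endian bit numbering (bit 0 is the most significant bit) and derives the Page Table base and size. The helper names are hypothetical; only the bit positions and the size formula come from the description above.

```python
# Illustrative sketch; helper names are hypothetical.
def field(dw, first, last):
    """Extract bits first:last of a 64-bit doubleword (bit 0 = MSB)."""
    width = last - first + 1
    return (dw >> (63 - last)) & ((1 << width) - 1)


def decode_hpt_pate_dw0(dw0):
    """Decode the first doubleword of a Paravirtualized HPT PATE."""
    return {
        "HR": field(dw0, 0, 0),
        "HTABORG": field(dw0, 4, 45),
        "PS": field(dw0, 56, 58),
        "HTABSIZE": field(dw0, 59, 63),
    }


def hpt_size_bytes(htabsize):
    """HPT size = 2^(HTABSIZE+18) bytes; HTABSIZE must not exceed 28."""
    if not 0 <= htabsize <= 28:
        raise ValueError("HTABSIZE must be in the range 0..28")
    return 1 << (htabsize + 18)


def hpt_base(htaborg):
    """HTABORG holds the high-order 42 bits of the 60-bit real address."""
    return htaborg << 18


# The 2 MB example from the text: HTABSIZE=3, table based at 2 MB.
example = decode_hpt_pate_dw0((0x8 << 18) | 3)
```

With the example value, HTABSIZE decodes to 3, the table size is 2^21 bytes, and the base address 0x200000 is a multiple of that size, matching the alignment requirement in the worked example.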
The second doubleword of the Partition Table Entry
indicates whether the guest has its own Radix Tree. It
also contains the base of the partition’s Process Table,
which is a guest real address (or effective address
when effective LPID=0) for radix hypervisor and virtual
address for HPT hypervisor, and the size of the Pro-
[Figure labels: Lookup in process-scoped Page Table; Lookup in
partition-scoped Page Table.]
[Figure: Segment Lookaside Buffer (SLB) entry SLBEn supplying
VSID0:77-s. The 78-bit Virtual Address consists of VSID (78-s bits),
Page (s-p bits), and Byte (p bits); VSID and Page together form the
Virtual Page Number (VPN).]
Programming Note
It is permissible for software to replace the contents
of a valid SLB entry without invalidating the transla-
tion specified by that entry provided the specified
restrictions are followed. See Chapter 11 Note 10.
[Figure: formation of the 60-bit Real Address of a Page Table Entry
Group (PTEG). The Page Table consists of PTEG 0 through PTEG n, each
128 bytes and holding PTE0 through PTE7 of 16 bytes each. The 60-bit
Real Address is formed from (ARPN||LP)0:59-p (60-p bits) followed by
Byte (p bits).]
word of the Partition Table Entry (HTABSIZE) and then added to the
value of bits 4:45 of the first doubleword of the Partition Table Entry
(HTABORG).
    Bits 28:38 of the 39-bit hash value.
    Seven 0-bits.
This operation identifies a particular PTEG, called the “primary
PTEG”, whose eight PTEs will be tested.

2. Secondary Hash:
If the secondary Page Table search is enabled (LPCRTC=0), perform the
secondary hash function as follows; otherwise do not perform step 2 and
proceed to step 3 below.

If s=28, the hash value is computed by taking the ones complement of
the Exclusive OR of VA11:49 with (11+b 0-bits || VA50:77-b).

If s=40, the hash value is computed by taking the ones complement of
the Exclusive OR of the following three quantities:
(VA24:37 || 25 0-bits), (0 || VA0:37), and (b-1 0-bits || VA38:77-b).

The 60-bit real address of a PTEG is formed by concatenating the
following values:
    Bits 0:27 of the 39-bit appropriate primary or secondary hash value
    ANDed with the mask generated from bits 59:63 of the first
    doubleword of the Partition Table Entry (HTABSIZE) and then added
    to the value of bits 4:45 of the first doubleword of the Partition
    Table Entry (HTABORG).
    Bits 28:38 of the 39-bit hash value.
    Seven 0-bits.
This operation identifies the “secondary PTEG”.

3. As many as 8 PTEs in the primary PTEG and, if the secondary Page
Table search is enabled, 8 PTEs in the secondary PTEG are tested to
determine if any translate the given virtual address. Let
q = minimum(54, 77-b). For a match to exist, the following conditions
must be satisfied, where SLBE is the SLBE used to form the virtual
address.
    PTEH=0 for the primary PTEG, 1 for the secondary PTEG
    PTEV=1
    PTEB=SLBEB
    PTEAVA[0:q]=VA0:q
    if b = 12 then
        (PTEL = 0) | (PTELP specifies the 4KB base page size)
    else
        (PTEL = 1) & (PTELP specifies the base page size specified by
        SLBEL||LP)

If no match is found, the search fails. The result is a page fault -- a
[Hypervisor] Instruction Storage exception or a [Hypervisor] Data
Storage exception, depending on whether the effective address is for an
instruction fetch or for a data access. If one match is found, the
search succeeds. If more than one match is found, one of the matching
entries is used as if it were the only matching entry, or a Machine
Check occurs.

If the Page Table search succeeds, the real address (RA) is formed by
concatenating bits 0:59-p of ARPN||LP from the matching PTE with bits
64-p:63 of the effective address (the byte offset), where the p value
is the log2 of the actual page size specified by PTEL LP.

RA = (ARPN || LP)0:59-p || EA64-p:63

A TLB entry may be created as a result of the successful HPT
translation. Depending on the specific TLB implementation, the scope of
the entry may be the base page size, the virtual page size, or any size
in between. In the absence of a TLB, software would be required to
create a PTE for each base page sized piece of storage within the
virtual page. The number of PTEs actually created to map a virtual page
will depend on the scopes supported for TLB entries, the access
pattern, and the lifetime of the TLB entries. Hardware generally will
not create more than one TLB entry to translate a given virtual
address. Multiple matching TLB entries may be created only if the Page
Table contains PTEs that map different-sized virtual pages that overlap
in the virtual address space. If a TLB search finds multiple matching
TLB entries created from such PTEs, one of the matching TLB entries is
used as if it were the only matching entry, or a Machine Check occurs.
Software should scrupulously avoid creating such mappings.
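
The PTEG address formation and the s=28 secondary hash described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the virtual address is held as a 78-bit integer with bit 0 as the most significant bit, and b denotes log2 of the base page size as in the text.

```python
# Illustrative sketch; function names are hypothetical.
MASK_39 = (1 << 39) - 1
MASK_42 = (1 << 42) - 1


def va_bits(va, first, last):
    """Extract bits first:last of a 78-bit virtual address (bit 0 = MSB)."""
    width = last - first + 1
    return (va >> (77 - last)) & ((1 << width) - 1)


def secondary_hash_s28(va, b):
    """Ones complement of VA11:49 XOR (11+b 0-bits || VA50:77-b)."""
    high = va_bits(va, 11, 49)      # 39 bits
    low = va_bits(va, 50, 77 - b)   # 28-b bits, zero-extended to 39 bits
    return ~(high ^ low) & MASK_39


def pteg_real_address(hash39, htaborg, htabsize):
    """Form the 60-bit real address of a PTEG from a 39-bit hash value."""
    if not 0 <= htabsize <= 28:
        raise ValueError("HTABSIZE must be in the range 0..28")
    mask = (1 << htabsize) - 1      # 28-HTABSIZE 0-bits, HTABSIZE 1-bits
    hash_0_27 = hash39 >> 11        # bits 0:27 of the hash
    hash_28_38 = hash39 & 0x7FF     # bits 28:38 of the hash
    upper42 = ((hash_0_27 & mask) + htaborg) & MASK_42
    # 42 bits || 11 bits || seven 0-bits = 60 bits
    return (upper42 << 18) | (hash_28_38 << 7)
```

Using the 2 MB example (HTABORG=0x8, HTABSIZE=3), an all-ones hash selects the last PTEG of the table at real address 0x3FFF80, which is 128 bytes below the end of the table.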
Figure 32. Four level Radix Tree walk translating a 52b EA with NLS=13
in the root PDE and NLS=9 in the other PDEs.

5.7.10.1 Radix Tree Page Directory Entry
Figure 35. Radix on Radix Page Table search for a 52-bit EA depicting
memory reads 1-24 numbered in sequence

When nested translation is being performed, there is the potential for
two different sets of protection settings and two different sets of
storage attributes. For protection settings, the least permissive
values take effect. For read, write, and execute authority, each is
controlled independently based on the least permissive setting of the
two translation mechanisms (including all component authority
mechanisms within each of them). For storage ordering, the SAO
attribute takes effect when both SAO and normal memory attributes are
specified. (The hypervisor will typically specify “normal memory” and
the OS may override that with SAO.)
When the non-idempotent (IG=0b11) and tolerant (IG=0b10) I/O attributes
are specified, the setting specified by the guest operating system
takes effect. Any other mismatch of attributes will cause a data
storage or instruction storage exception, as appropriate for the
access.

Reference and Change bit recording is done in both the process-scoped
and partition-scoped Page Table Entries. Recording is done as described
in Section 5.7.13, “Reference and Change Recording”.

Faults that occur as part of process-scoped translation will generally
be signalled by ISI/DSI, while faults that occur as part of
partition-scoped translation will generally be signalled by HISI/HDSI.
(In this categorization, single level translation is considered
process-scoped translation except when VPM is active, in which case it
is treated like partition-scoped translation.) The address specified in
ASDR is the guest real address or VSID for which translation has most
immediately failed except when the translation fails too early to
produce that value. HDAR will generally contain the EA or lower VA bits
for which translation has most immediately failed. For example, in the
case of a Page Directory being paged out, the ASDR will contain the
guest real address of the Page Directory Entry (down to bit 51), rather
than the GRA of the datum being accessed.

Exceptions may be manifest in unexpected ways. For example, an
instruction fetch can fail to set a Change bit in the host PTE mapping
the guest PTE. Similarly, the Reference bit update might fail for lack
of write authority on the PTE.

For performance reasons, the result of each walk of a Radix Tree or HPT
may be cached in a TLB. Logically, the result of each walk is cached
separately. For nested translation, the effective to guest real
(process-scoped) translation may be cached, as well as the
partition-scoped translation for each guest real address produced by
the translation process. A minimum of two TLB accesses is required to
complete a nested translation: one for the effective to guest real
address and one for the guest real to host real address. (An
implementation may optimize the process, as long as the optimization
can be managed correctly using the tlbie instructions that software
will use to manage the logical model.)

5.7.12 Translation Process

As previously described, in its most complicated form the translation
process includes the following steps:
    use of the PTCR to find the required Partition Table Entry
    use of the Partition Table Entry to find the partition-scoped Page
    Table
    use of partition-scoped Page Table to find the required Process
    Table Entry
    use of the Process Table Entry and partition-scoped Page Table to
    find the required Segment Table Entry or walk the process-scoped
    Page Table (i.e. translate the effective address to a virtual or
    guest real address), and
    use of the partition-scoped Page Table to translate the virtual or
    guest real address.

Depending on the translation mode and process state, some of these
steps may be skipped. The following subsections enumerate the cases and
explain the steps in more detail.

5.7.12.1 Fully-Qualified Address

The storage control facilities enable hardware to perform the entire
translation process given a “fully-qualified address” and context that
makes it a unique input. In addition to its normal use, the term
“effective address” is sometimes used as shorthand for the
fully-qualified address, and the architecture should be read with this
possibility in mind. The following are the components of the
fully-qualified address.
    effLPID
    effPID
    EA
The additional context required to perform a translation or match a
cached translation may include the following.
    PATEHR GR (selected using the value in LPIDR, not effLPID)
    MSRHV PR IR DR

At a high level, the translation mode is selected by the Host Radix and
Guest Radix bits found in the Partition Table Entry. The Host Radix bit
indicates whether the hypervisor is using HPT or Radix Tree
translation. The Guest Radix bit indicates whether the guest manages an
effective to guest real mapping using Radix Tree translation, or an
effective to virtual mapping using Segment translation. Given the
overall process, MSRHV||PR||IR||DR determine where and how the process
is entered.

5.7.12.2 Finding the Page Tables

[The following description assumes that no legacy mode is active, i.e.
LPCRUPRT=1.]

The components of the fully-qualified address are used to determine the
table(s) used in the translation process. The effective LPID and
effective PID are used to find the appropriate Page Table base
address(es) using the In-Memory Table structures. Some types of
translation use process-scoped Page Tables, some use partition-scoped
Page Tables, and some use both.

Process-scoped table descriptors are found in the Process Tables as
follows. The partition table is assumed to be aligned in host real
address space. The Partition Table Entry (PATE) host real address is
calculated by ORing the Partition Table Base Address
(PATB || 12 0-bits) in the PTCR with 16 times the effective LPID and
then
performing partition-scoped translation. The second doubleword of the
entry contains the base address of the Process Table for the partition.
The Process Table is assumed to be aligned in effective (HR=1,
effLPID=0), virtual, or guest real address space. (If the table is not
aligned or is not large enough to support the PID value, an unreported
error will most likely result.) The Process Table Entry (PRTE) host
real address is calculated by ORing the Process Table Base Address
(PRTB || 40 0-bits for an HPT host and PRTB || 12 0-bits for a radix
host) in the PATE with 16 times the effective PID and then performing
partition-scoped translation. (If the table is not aligned or is not
large enough to support the PID value, an unreported error will most
likely result.) The resulting Process Table Entry guest real address
for a radix guest on a radix host (effLPID≠0), effective address for
radix host (effLPID=0), or virtual address for all cases with an HPT
host must be translated via the appropriate partition-scoped table. The
Process Table Entry at that location contains a process-scoped table
base address, which is a guest real address for a radix guest on a
radix host (HV=0), a host real address for a radix host (HV=1), or a
virtual address (all cases with an HPT host). The virtual or guest real
address must be translated via the appropriate partition-scoped table.

Programming Note
The guest real address for a guest of a radix host, or virtual address
for a guest of an HPT host, of the Process Table may be set via an
hcall. The radix guest may choose to map the Process Table into its own
virtual address space. These matters are not visible to the
architecture.

Programming Note
Note that the sole purpose of partition-scoped Page Table descriptor
when LPID=0 for a radix host is to translate the effective addresses of
the Process Table Entries for LPID=0. (If the Process Table Base
address for LPID=0 was a real address, the Process Table would have to
be in contiguous real storage.) This descriptor will commonly be the
same as the descriptor found in the LPID=0, PID=0 Process Table Entry,
both pointing to the hypervisor’s own page table, but it may be set up
to point to a table used solely to translate the addresses of Process
Table Entries.

5.7.12.3 Obtaining Host Real Address, Radix on Radix

The following cases exist.
    Guest access to quadrant 0 with translation on: process-scoped
    translation is performed on LPIDR||PIDR||EA, with the result
    subject to partition-scoped translation with effective LPID=LPIDR.
    Guest access to quadrant 3 with translation on: process-scoped
    translation is performed on LPIDR||0||EA, with the result subject
    to partition-scoped translation with effective LPID=LPIDR.
    Hypervisor access to quadrant 1 with translation on: process-scoped
    translation is performed on LPIDR||PIDR||EA, with the result
    subject to partition-scoped translation with effective LPID=LPIDR
    if LPIDR≠0.
    Hypervisor access to quadrant 2 with translation on: process-scoped
    translation is performed on LPIDR||0||EA, with the result subject
    to partition-scoped translation with effective LPID=LPIDR if
    LPIDR≠0.
    Guest OS access with translation off: partition-scoped translation
    is performed with effective LPID = LPIDR.
    Hypervisor or host application access to quadrant 0 with
    translation on: process-scoped translation is performed on
    0||PIDR||EA.
    Hypervisor or host application access to quadrant 3 with
    translation on: process-scoped translation is performed with
    0||0||EA.
    Hypervisor real mode access: subject to HRMOR and EA0 as described
    in Section 5.7.3.1.
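
The cases listed in Section 5.7.12.3 (Radix on Radix) can be summarized in a small sketch that returns the effective LPID and PID used for process-scoped translation and the LPID, if any, used for the subsequent partition-scoped translation. The function name and the return shape are hypothetical, and hypervisor real mode is reported rather than modelled.

```python
# Illustrative sketch of the Radix on Radix cases; names are hypothetical.
def fully_qualified_address(msr_hv, translation_on, ea, lpidr, pidr):
    """Return ((effLPID, effPID) or None, partition-scoped LPID or None)."""
    q = (ea >> 62) & 0b11  # quadrant = EA0:1
    if not translation_on:
        if msr_hv == 1:
            raise ValueError("hypervisor real mode: see Section 5.7.3.1")
        # Guest OS access with translation off: partition-scoped only.
        return (None, lpidr)
    if msr_hv == 0:
        if q == 0b00:
            return ((lpidr, pidr), lpidr)
        if q == 0b11:
            return ((lpidr, 0), lpidr)
        # Quadrants 1 and 2 have no Radix Tree for guest accesses.
        raise ValueError("segment exception: no Radix Tree for quadrant")
    # Hypervisor or host application access with translation on.
    if q == 0b00:
        return ((0, pidr), None)
    if q == 0b11:
        return ((0, 0), None)
    partition_lpid = lpidr if lpidr != 0 else None
    if q == 0b01:
        return ((lpidr, pidr), partition_lpid)
    return ((lpidr, 0), partition_lpid)  # quadrant 2
```

For instance, a hypervisor access to quadrant 1 with LPIDR=5 and PIDR=7 yields process-scoped translation on 5||7||EA followed by partition-scoped translation with effective LPID 5.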
Reference Bit
The Reference bit is set to 1 if the corresponding
access (load, store, implicit access, or instruction
fetch) is required by the sequential execution
model and is performed. Otherwise the Reference
bit may be set to 1 if the corresponding access is
attempted, either in-order or out-of-order, even if
the attempt causes an exception, except that the
Reference bit is not set to 1 for the access caused
by an indexed Move Assist instruction for which the
XER specifies a length of zero.
Change Bit
Programming Note
Because the sync instruction is execution synchro-
nizing, the set of Reference and Change bit
updates that are performed with respect to the
thread executing the sync instruction before the
memory barrier is created includes all Reference
and Change bit updates associated with instruc-
tions preceding the sync instruction.
Status of Access                                        R       C
Indexed Move Assist insn w 0 len in XER                 No      No
Storage protection violation                            Acc(1)  No
Out-of-order I-fetch or Load-type Inst’n
  (including transactional Load-type inst’n
  or dcbtst)                                            Acc     No
Out-of-order Store-type inst’n, including
  transactional Store-type inst’n, excluding dcbtst
    Would be required by the sequential execution
    model in the absence of system-caused or
    imprecise interrupts(3), or transaction failure     Acc     Acc(1,2)
    All other cases                                     Acc     No
In-order Load-type or Store-type insn,
  access not performed(4)
    Load-type insn                                      Acc     No
    Store-type insn                                     Acc     Acc(2)
Other in-order access
    I-fetch                                             Yes     No
    Ordinary Load                                       Yes     No
    Other ordinary Store, dcbz                          Yes     Yes
    icbi, icbt, dcbt, dcbtst, dcbst, dcbf[l]            Acc     No

“Acc” means that it is acceptable to set the bit.
(1) It is preferable not to set the bit.
(2) If C is set, R is also set unless it is already set.
(3) For Floating-Point Enabled Exception type Program interrupts,
    “imprecise” refers to the exception mode controlled by MSRFE0 FE1.
(4) This case does not apply to the Touch instructions, because they do
    not cause a storage access.
Programming Note
The preceding requirement permits designs to
implement the AMOR and/or UAMOR as 32-bit reg-
isters — specifically, to implement only the
even-numbered bits (or only the odd-numbered
bits) of the register — in a manner such that the
reduction, from the architecturally-required 64 bits
to 32 bits, is not visible to (correct) software. This
implementation technique saves space in the hard-
ware. (A design that uses this technique does the
appropriate “fan in/out” when the register is
accessed, to provide the appearance, to (correct)
software, of supporting all 64 bits of the register.)
Permitting designs to implement the [U]AMOR as
32-bit registers by virtue of the software require-
ment specified above, rather than by defining the
[U]AMOR as 32-bit registers, permits the architec-
ture to be extended in the future to support control-
ling modification of the “read access” AMR bits (the
odd-numbered bits) independently from the “write
access” AMR bits (the even-numbered bits), if that
proves desirable. If this independent control does
prove desirable, the only architecture change would
be to eliminate the software requirement.
Programming Note
When modifying the AMOR and/or UAMOR, the
hypervisor should ensure that the two registers are
consistent with one another before giving control to
a non-hypervisor program. In particular, the hyper-
visor should ensure that if AMORi=0 then
UAMORi=0, for all i in the range 0:63. (Having
AMORi=0 and UAMORi=1 would permit problem
state programs, but not the operating system, to
modify AMR bit i.)
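The consistency rule in the preceding note can be checked with a single mask operation. The following C sketch is illustrative only; the function names are hypothetical, and the two arguments are assumed to be 64-bit images of the AMOR and UAMOR that privileged software has already read or is about to write.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when there is no bit i with AMOR[i]=0 and UAMOR[i]=1.
   Bit numbering does not matter here: the test is bitwise. */
bool amor_uamor_consistent(uint64_t amor, uint64_t uamor)
{
    return (uamor & ~amor) == 0;
}

/* One possible repair (an assumption, not an architected procedure):
   clear every UAMOR bit whose AMOR bit is 0. */
uint64_t uamor_clamped(uint64_t amor, uint64_t uamor)
{
    return uamor & amor;
}
```

A hypervisor could apply a check of this kind just before giving control to a non-hypervisor program.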
Programming Note
The Virtual Page Class Key Protection mechanism replaces the Data Address Compare mechanism that was defined
in versions of the architecture that precede Version 2.04 (e.g., the two facilities use some of the same resources, as
described below). However, the Virtual Page Class Key Protection mechanism can be used to emulate the Data
Address Compare mechanism. Moreover, programs that use the Data Address Compare mechanism can be modi-
fied in a manner such that they will work correctly both on implementations that comply with versions of the architec-
ture that precede Version 2.04 (and hence implement the Data Address Compare mechanism) and on
implementations that comply with Version 2.04 of the architecture or with any subsequent version (and hence instead
implement the Virtual Page Class Key Protection mechanism). The technique takes advantage of the facts that the
SPR number for privileged access to the AMR (29) is the same as the SPR number for the Data Address Compare
mechanism's ACCR (Address Compare Control Register), that KEY4 occupies the same bit in the PTE as the Data
Address Compare mechanism's AC (Address Compare) bit, and that the definition of ACCR62:63 is very similar to the
definition of each even-odd pair of AMR bits. The technique is as follows, where PTE1 refers to doubleword 1 of the
PTE.
- Set bits 2:3 and 62:63 of SPR 29 (which is either the ACCR or the AMR) to x, where x is the desired 2-bit
  value for controlling Data Address Compare matches, and set bits 0:1 to 0s.
- Set bit 54 of PTE1 (which is either the AC bit or KEY4) to the same value that the AC bit would be set to, and
  set bits 2:3 of PTE1 (which are either RPN bits, that correspond to a real address size larger than the size
  supported by any implementation that supports the Data Address Compare mechanism, or KEY0:1) and
  bits 52:53 of PTE1 (which are either reserved bits or KEY2:3) to 0s.
- Use PTEKEY values 0 and 1 only for purposes of emulating the Data Address Compare mechanism, except
  that PTEKEY value 0 may also be used for any virtual pages for which it is desired that the Virtual Page Class
  Key Protection mechanism permit all accesses. Do not use PTEKEY=31.
- When a Hypervisor Data Storage interrupt occurs, if HDSISR42=1 then ignore the interrupt for Cache
  Management instructions other than dcbz. (These instructions can cause a virtual page class key protection
  violation but cannot cause a Data Address Compare match.) Otherwise forward the interrupt to the operating
  system, which will treat the interrupt as if a Data Address Compare match had occurred. (Note: Cases for
  which it is undefined whether a Data Address Compare match occurs do not necessarily cause a virtual page
  class key protection violation.)
(Because privileged software can access the AMR using either SPR 13 or SPR 29, it might seem that, when SPR 13
was added to the architecture (in Version 2.06), SPR 29 should have been removed. SPR 29 is retained for two rea-
sons: first, to avoid requiring privileged software to change to use the newer SPR number; and second, to retain the
ability to emulate the Data Address Compare mechanism as described above.)
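As an illustration of the first step of the technique above, the following C sketch builds the SPR 29 image. It is a sketch under stated assumptions, not architected code: the helper names are hypothetical, the register is modeled as a 64-bit integer, and bits other than 0:3 and 62:63 are simply carried over from the old value because the note does not say how they are to be set.

```c
#include <stdint.h>

/* Insert 'value' into bits first:last of a 64-bit image, using
   Power ISA bit numbering (bit 0 is the most significant bit). */
uint64_t set_ibm_field(uint64_t reg, unsigned first, unsigned last,
                       uint64_t value)
{
    unsigned width = last - first + 1;
    unsigned shift = 63 - last;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return (reg & ~(mask << shift)) | ((value & mask) << shift);
}

/* Bits 0:1 = 0b00, bits 2:3 = x, bits 62:63 = x; x is the desired
   2-bit Data Address Compare control value. */
uint64_t spr29_for_dac_emulation(uint64_t old, unsigned x)
{
    uint64_t v = old;
    v = set_ibm_field(v, 0, 1, 0);
    v = set_ibm_field(v, 2, 3, x);
    v = set_ibm_field(v, 62, 63, x);
    return v;
}
```

The resulting value would then be written with a privileged move to SPR 29; that step is omitted because it cannot be expressed portably in C.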
Real Mode Storage Control facility (see
Section 5.7.3.2.1).

In general, storage that is not well-behaved should be
Guarded. Because such storage may represent a control
register on an I/O device or may include locations
that do not exist, an out-of-order access to such storage
may cause an I/O device to perform unintended
operations or may result in a Machine Check.

The following rules apply to in-order execution of Load
and Store instructions for which the first byte of the
storage operand is in storage that is both Caching
Inhibited and Guarded.

Load or Store instruction that causes an atomic
access
  If any portion of the storage operand has been
  accessed and an External, Decrementer, Hypervisor
  Decrementer, Performance Monitor, or Imprecise
  mode Floating-Point Enabled exception is
  pending, the instruction completes before the
  interrupt occurs.

Load or Store instruction that causes an Alignment
exception, or that causes a [Hypervisor] Data Storage
exception for reasons other than Data Address
Watchpoint match.
  The portion of the storage operand that is in Caching
  Inhibited and Guarded storage is not accessed.
  (The corresponding rules for instructions that
  cause a Data Address Watchpoint match are given
  in Section 8.4.)

tion model, or is in the real page immediately following
such a page.

Programming Note
Software should ensure that only well-behaved
storage is copied into a cache, either by accessing
as Caching Inhibited (and Guarded) all storage that
may not be well-behaved, or by accessing such
storage as not Caching Inhibited (but Guarded) and
referring only to cache blocks that are
well-behaved.

If a real page contains instructions that will be executed
when MSRIR=0 and MSRHV=1, software
should ensure that this real page and the next real
page contain only well-behaved storage (or that the
Hypervisor Real Mode Storage Control facility
specifies that this real page is not Guarded).

5.8.2 Storage Control Bits

When address translation is enabled, each storage
access is performed under the control of the Page
Table Entry used to translate the effective address.
Each Page Table Entry contains storage control bits
that specify the presence or absence of the corresponding
storage control for all accesses translated by
the entry as shown in Figure 45 and Figure 46. In the
following description, references to individual WIMG
bits apply to the corresponding Radix Att encoding
except where otherwise stated or obvious from context.
Programming Note
It is recommended that dcbf be used, rather than
dcbfl, when changing the value of the I or W bit
from 0 to 1. (dcbfl would have to be executed on all
threads for which the contents of the data cache
may be inconsistent with the new value of the bit,
whereas, if the M bit for the page is 1, dcbf need be
executed on only one thread in the system.)
Programming Note
For example, when changing the M bit in some
directory-based systems, software may be required
to execute dcbf[l] on each thread to flush all stor-
age locations accessed with the old M value before
permitting the locations to be accessed with the
new M value.
extended mnemonic ptesync (equivalent to sync with
L=2).

The ptesync instruction has all of the properties of
sync with L=0 and also the following additional properties.

- The memory barrier created by the ptesync
  instruction provides an ordering function for the
  storage accesses associated with all instructions
  that are executed by the thread executing the ptesync
  instruction and, as elements of set A, for all
  Reference and Change bit updates associated
  with additional address translations that were performed,
  by the thread executing the ptesync
  instruction, before the ptesync instruction is executed.
  The applicable pairs are all pairs ai,bj in
  which bj is a data access and ai is not an instruction
  fetch.

- The ptesync instruction causes all Reference and
  Change bit updates associated with address translations
  that were performed, by the thread executing
  the ptesync instruction, before the ptesync
  instruction is executed, to be performed with
  respect to that thread before the ptesync instruction’s
  memory barrier is created.

- The memory barrier created by the ptesync
  instruction provides an ordering function for all
  stores to the Partition Table, Process Tables, Segment
  Tables, Page Directories, and Page Tables
  caused by Store instructions preceding the ptesync
  instruction with respect to invalidations, of
  cached copies of information derived from these
  tables, caused by slbieg and tlbie instructions following
  the ptesync instruction. The memory barrier
  ensures that all searches of these tables by
  another thread, that are performed after an invalidation
  caused by such an slbieg or tlbie instruction
  has been performed with respect to the other
  thread and that implicitly load from the target location
  of such a store, will obtain the value stored (or
  a value stored subsequently).

  Programming Note
  The next bullet is sufficient to order the stores
  with respect to the invalidations on the thread
  executing the ptesync instruction. That bullet
  is also sufficient to provide the ordering with
  respect to invalidations caused by slbie, slbia,
  and tlbiel instructions, which affect only the
  thread executing them.

- The ptesync instruction provides an ordering function
  for all stores to the Partition Table, Process
  Tables, Segment Tables, Page Directories, and
  Page Tables caused by Store instructions preceding
  the ptesync instruction with respect to
  searches of these tables that are performed, by the
  thread executing the ptesync instruction, after the
  ptesync instruction completes. Executing a ptesync
  instruction ensures that all such searches
  that implicitly load from the target location of such
  a store will obtain the value stored (or a value
  stored subsequently). Also, the memory barrier
  created by the ptesync instruction ensures that all
  searches of these tables by any other thread, that
  are performed after a store in set B of the memory
  barrier has been performed with respect to the
  other thread and that implicitly load from the target
  location of such a store, will obtain the value stored
  (or a value stored subsequently).

- In conjunction with the tlbie and tlbsync instructions,
  the ptesync instruction provides an ordering
  function for TLB invalidations and related storage
  accesses on other threads as described in the tlbsync
  instruction description on page 1042.

- Similarly, in conjunction with the slbieg and slbsync
  instructions, the ptesync instruction provides
  an ordering function for SLB invalidations and
  related storage accesses on other threads as
  described in the slbsync instruction description on
  page 1031.

Programming Note
For instructions following a ptesync instruction, the
memory barrier need not order implicit storage
accesses for purposes of address translation and
reference and change recording.

The functions performed by the ptesync instruction
may take a significant amount of time to complete,
so this form of the instruction should be used only if
the functions listed above are needed. Otherwise
sync with L=0 should be used (or sync with L=1,
or eieio, if appropriate).

Section 5.10, “Translation Table Update Synchronization
Requirements” on page 1043 gives examples
of uses of ptesync.

5.9.3 Lookaside Buffer Management

All implementations have a Segment Lookaside Buffer
(SLB). Independent of whether the executing partition
operates in a mode that uses hardware SLB loading
and bolting versus pure software loading (controlled by
the value of LPCRUPRT), software is responsible for
keeping the SLB current with the segment mapping for
the process that is executing. To simplify management
of the SLB, hardware will only create speculative SLB
entries for the context that is executing, in particular
using the current values of LPID and PID. Except when
LPID=0, no such entries will be created when
MSRHV=1. Similarly, when MSRIR=0 and MSRDR=0,
no such entries will be created. Proper management of
the SLB across context switches is described in programming
notes.
For performance reasons, most implementations also
cache other information that is used in address translation.
These caches may include: a Translation Lookaside
Buffer (TLB) which is a cache of recently used
Page Table Entries (PTEs); a cache of recently used
translations of effective addresses to real addresses; a
Page Walk Cache for Radix Tree translation; caching of
the In-Memory Tables; or any combination of these.
Lookaside information, including the SLB, is managed
using the instructions described in the subsections of
this section unless additional requirements are provided
in implementation-specific documentation.

Lookaside information derived from PTEs is not necessarily
kept consistent with the Page Table. When software
alters the contents of a PTE, in general it must
also invalidate all corresponding TLB entries and implementation-specific
lookaside information; exceptions to
this rule are described in Section 5.10.1.2.

The effects of the slbie, slbieg, slbia, and TLB Management
instructions on address translations, as specified
in Sections 5.9.3.2 for the SLB and 5.9.3.3 for
the TLB, Page Walk Cache, and In-Memory Table
caches, apply to all implementation-specific lookaside
information that is used in address translation. Unless
otherwise stated or obvious from context, references to
SLB entry invalidation and TLB entry invalidation elsewhere
in the Books apply also to invalidation of Page
Walk Cache content, In-Memory Table cache content,
and all implementation-specific lookaside information
that is derived from SLB entries and PTEs, respectively.

Programming Note
Because the instructions used to manage TLBs,
SLBs, Page Walk Caches, caches of Partition and
Process Table Entries, and implementation-specific
lookaside information may be changed in a future
version of the architecture, it is recommended that
software “encapsulate” their use into subroutines.

Programming Note
The function of all the instructions described in
Sections 5.9.3.2 - 5.9.3.3 is independent of
whether address translation is enabled or disabled.

For a discussion of software synchronization
requirements when invalidating SLB and TLB
entries, see Chapter 11.

5.9.3.1 Thread-Specific Segment Translations

It is necessary to provide thread-specific temporary
ESID to VSID translations. These translations cannot
be placed in valid entries in the Segment Table
because the Segment Table has a process scope
rather than a thread scope. Instead, software will use
slbmte to install such translations in the SLB. All SLB
entries created using slbmte are considered to be
“software created.” Software created entries will only
translate accesses from the hardware thread by which
they are installed. When LPCRUPRT=1, they are also
considered to be “bolted.” Each thread has the ability
to bolt four entries.
entries. slbfee has no function when LPCRUPRT=1 for
the partition that is running.

Programming Note
Accesses to a given SLB entry caused by the
instructions described in this section obey the

SLB Invalidate Entry X-form

slbie RB

31 /// /// RB 434 /
0 6 11 16 21 31

- (RB)37
- (RB)39
- (RB)40:63
- If s = 40, (RB)24:35

If this instruction is executed in 32-bit mode, (RB)0:31
must be zeros.

This instruction is privileged.

Special Registers Altered:
None

Programming Note
When switching to execute an adjunct, a hypervisor
will turn off translation and use slbie to be sure
there is no SLB entry mapping the effective
address space that will be used by the incoming
adjunct. It will then bolt an entry for the incoming
adjunct and transfer control to that adjunct. While
translation is off and during adjunct execution, no
speculative Segment Table walks will be performed.
Programming Note
slbie does not affect SLBs on other threads.
Programming Note
The reason the class value specified by slbie must
be the same as the Class value that is or was in the
relevant SLB entry is that the hardware may use
these values to optimize invalidation of implemen-
tation-specific lookaside information used in
address translation. If the value specified by slbie
differs from the value that is or was in the relevant
SLB entry, these optimizations may produce incor-
rect results. (An example of implementation-spe-
cific address translation lookaside information is
the set of recently used translations of effective
addresses to real addresses that some implemen-
tations maintain in an Effective to Real Address
Translation (ERAT) lookaside buffer.)
When switching tasks in certain cases, it may be
advantageous to preserve some implementa-
tion-specific lookaside entries while invalidating
others. The IH=0b001 invalidation hint of the slbia
instruction can be used for this purpose if SLB
class values are appropriately assigned, i.e., a
class value of 0 gives the hint that the entry should
be preserved and a class value of 1 indicates the
entry must be invalidated. Also, it is advantageous
to assign a class value of 1 to entries that need to
be invalidated via an slbie instruction while pre-
serving implementation-specific lookaside entries
that are not derived from an SLB entry since such
entries are assigned a class value of 0.
Programming Note
The B value in register RB may be needed for inval-
idating ERAT entries corresponding to the transla-
tion being invalidated.
SLB Invalidate Entry Global X-form

slbieg RS,RB

31 RS /// RB 466 /
0 6 11 16 21 31

target_PID = RS0:31
if MSRHV=1 then target_LPID = RS32:63
else target_LPID = LPIDR
ea0:35 ← (RB)0:35
for each thread with LPIDR=target_LPID and
    PIDR=target_PID
  if, for each SLB entry that
      translates or most recently translated ea
      entry_class = (RB)36 and
      entry_seg_size = size specified in (RB)37:38
  then for SLB entry (if any)
      that translates ea and is not software-created
    SLBEV ← 0
    all other fields of SLBE ← undefined
  else
    s ← log_base_2(entry_seg_size)
    esid ← (RB)0:63-s
    u ← undefined 1-bit value
    if u then
      if an SLB entry translates esid and the entry
          is not software-created
        SLBEV ← 0
        all other fields of SLBE ← undefined

The operation performed by this instruction is based on
the contents of registers RS and RB. The contents of
these registers are shown below.

RS
PID LPID
0 32 63

ment size, are the same as for the B field in the SLBE
(see Figure 26 on page 995).

Only SLBs for threads running on behalf of target_LPID
and target_PID are searched. Software-created
entries are ignored. The class value and segment size
must be the same as the class value and segment size
in the SLB entry that translates the EA, or the values
that were in the SLB entry that most recently translated
the EA if the translation is no longer in the SLB; if these
values are not the same, it is implementation-dependent
whether the SLB entry (or implementation-dependent
translation information) that translates the EA is
invalidated, and the next paragraph need not apply.

If the SLB contains only a single entry that translates
the EA, then that is the only SLB entry that is invalidated,
except that it is implementation-dependent
whether an implementation-specific lookaside entry for
a real mode address “translation” is invalidated. If the
SLB contains more than one such entry, then zero or
more such entries are invalidated, and similarly for any
implementation-specific lookaside information used in
address translation; additionally, a machine check may
occur.

SLB entries are invalidated by setting the V bit in the
entry to 0, and the remaining fields of the entry are set
to undefined values.

The hardware ignores the contents of RB listed below
and software must set them to 0s.

- (RB)37
- (RB)39
- (RB)40:63
- If s = 40, (RB)24:35

If this instruction is executed in 32-bit mode, (RB)0:31
must be zeros.
Programming Note
The reason the class value specified by slbieg
must be the same as the Class value that is or was
in the relevant SLB entry is that the hardware may
use these values to optimize invalidation of implementation-specific
lookaside information used in
address translation. If the value specified by slbieg
differs from the value that is or was in the relevant
SLB entry, these optimizations may produce incorrect
results. (An example of implementation-specific
address translation lookaside information is
the set of recently used translations of effective
addresses to real addresses that some implementations
maintain in an Effective to Real Address
Translation (ERAT) lookaside buffer.)

When switching tasks in certain cases, it may be
advantageous to preserve some implementation-specific
lookaside entries while invalidating
others. The IH=0b001 invalidation hint of the slbia
instruction can be used for this purpose if SLB
class values are appropriately assigned, i.e., a class
value of 0 gives the hint that the entry should be
preserved and a class value of 1 indicates the entry
must be invalidated. Also, it is advantageous to
assign a class value of 1 to entries that need to be
invalidated via an slbieg instruction while preserving
implementation-specific lookaside entries that
are not derived from an SLB entry since such
entries are assigned a class value of 0.

Programming Note
slbieg specifying LPID=0 and PID=0 in RS should
be used to invalidate hypervisor real mode ERAT
entries in the “nest” (see the appropriate platform
architecture document), since hypervisor real mode
ERAT entries in the processor are typically invalidated
using slbie, which does not broadcast.

Programming Note
The B value in register RB may be needed for invalidating
ERAT entries corresponding to the translation
being invalidated.

Programming Note
Use of slbieg to invalidate software-created segment
descriptors is a programming error. The
architecture requires that bolted entries be ignored
by the instruction.

SLB Invalidate All X-form

slbia IH

31 // IH /// /// 498 /
0 6 8 11 16 21 31

for each SLB entry except SLB entry 0
  SLBEV ← 0
  all other fields of SLBE ← undefined

For all SLB entries except SLB entry 0, the V bit in the
entry is set to 0, making the entry invalid, and the
remaining fields of the entry are set to undefined values.
SLB entry 0 is altered only if IH=0b100 or its C bit
is 1 and IH=0b011.

On implementations that have implementation-specific
lookaside information for effective to real address translations,
the IH field provides a hint that can be used to
invalidate entries selectively in such lookaside information.
The defined values for IH are as follows.

0b000 All such implementation-specific lookaside
      information is invalidated. (This value is not a
      hint.)
0b001 Preserve such implementation-specific lookaside
      information having a Class value of 0.
0b010 Preserve such implementation-specific lookaside
      information created when MSRIR/DR=0.
0b011 Preserve such implementation-specific lookaside
      information having a Class value of 0.
      Preserve SLB entries with a Class value of 0.
      If SLB entry 0 has a Class value of 1, it is
      invalidated. (This value is not a hint.)
0b100 All such implementation-specific lookaside
      information is invalidated, and in addition, SLB
      entry 0 is invalidated. (This value is not a
      hint.)
0b110 Preserve such implementation-specific lookaside
      information created when MSRHV=1,
      MSRPR=0, and MSRIR/DR=0.
0b111 All such implementation-specific lookaside
      information is invalidated, but no SLB entries
      are invalidated.

All other IH values are reserved. If the IH field contains
a reserved value, the hint provided by the IH field is
undefined.

Implementation specific lookaside information for which
preservation is not requested is invalidated. Implementation
specific lookaside information for which preservation
is requested may be invalidated.

When IH=0b000, execution of this instruction has the
side effect of clearing the storage access history associated
with the Hypervisor Real Mode Storage Control
facility. See Section 5.7.3.2.1, “Hypervisor Real Mode
Storage Control” for more details.

This instruction terminates any Segment Table walks
being performed on behalf of the thread that executes
it, and ensures that any new table walks will be performed
using the current PIDR value.
Programming Note
slbia serves as both a basic and an extended mne-
monic. The Assembler will recognize an slbia mne-
monic with one operand as the basic form, and an
slbia mnemonic with no operand as the extended
form. In the extended form the IH operand is omit-
ted and assumed to be 0.
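The IH rules stated in the slbia description can be summarized in a small software model. The C sketch below is illustrative only: the function names are hypothetical, and it models just two statements from the description, namely that every IH value other than the seven listed ones is reserved, and that SLB entry 0 is altered only if IH=0b100, or if IH=0b011 and the entry's C (Class) bit is 1.

```c
#include <stdbool.h>

/* IH is a 3-bit field. The description defines 0b000, 0b001, 0b010,
   0b011, 0b100, 0b110 and 0b111, which leaves 0b101 reserved. */
bool slbia_ih_is_reserved(unsigned ih)
{
    return (ih & 7u) == 5u;
}

/* True when slbia with this IH value alters SLB entry 0, given the
   Class (C) bit currently held in entry 0. */
bool slbia_alters_entry0(unsigned ih, unsigned entry0_class)
{
    ih &= 7u;
    return ih == 4u || (ih == 3u && entry0_class == 1u);
}
```

A model like this can be useful in a simulator or in assertions inside the subroutines that encapsulate lookaside management.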
SLB Move To Entry X-form

slbmte RS,RB

31 RS /// RB 402 /
0 6 11 16 21 31

When LPCRUPRT=0, this instruction is the sole means
for specifying Segment translations to the hardware.
When LPCRUPRT=1, Segment Table walks populate the
SLB, and this instruction is used only to bolt
thread-specific Segment translations.

The SLB entry specified by bits 52:63 of register RB is
loaded from register RS and from the remainder of register
RB. The contents of these registers are interpreted
as shown in Figure 47.

RS
B VSID KsKpNLC 0 LP 0s
0 2 52 57 58 60 63
RB
ESID V 0s index
0 36 37 52 63

RS0:1 B
RS2:51 VSID
RS52 Ks
RS53 Kp
RS54 N
RS55 L
RS56 C
RS57 must be 0b0
RS58:59 LP
RS60:63 must be 0b0000
RB0:35 ESID
RB36 V
RB37:51 must be 0b000 || 0x000
RB52:63 index, which selects the SLB entry

Figure 47. GPR contents for slbmte

On implementations that support a virtual address size
of only n bits, n<78, (RS)2:79-n must be zeros.

When LPCRUPRT=1, the value of index must not
exceed 3. (RB)52:61 are ignored.

High-order bits of (RB)52:63 that correspond to SLB
entries beyond the size of the SLB provided by the
implementation must be zeros.

The hardware ignores the contents of RS and RB listed
below and software must set them to 0s.
- (RS)57
- (RS)60:63
- (RB)37:51

If this instruction is executed in 32-bit mode, (RB)0:31
must be zeros (i.e., the ESID must be in the range 0:15).

This instruction must not be used to load a segment
descriptor that is in the Segment Table when LPCRUPRT=1,
and cannot be used to invalidate the translation
contained in an SLB entry.

This instruction is privileged.

Special Registers Altered:
None

Programming Note
The reason slbmte must not be used to load segment
descriptors that are in the Segment Table is
that there could be a race condition with hardware
loading the same segment descriptor, resulting in
duplicate SLB entries. Software must not allow
duplicate SLB entries to be created; see
Section 5.7.8.2, “SLB Search”.

The reason slbmte cannot be used to invalidate an
SLB entry is that it does not necessarily affect
implementation-specific address translation lookaside
information. slbie (or slbia) must be used for
this purpose.
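The layout in Figure 47 can be expressed as shift-and-mask code. The following C sketch packs the RS and RB operands for slbmte from individual field values; it is illustrative only, the function names are hypothetical, and it relies on the Power ISA convention that bit 0 is the most significant bit, so a field whose last bit is k is shifted left by 63-k. Fields that must be zero (RS57, RS60:63, RB37:51) are left as zeros.

```c
#include <stdint.h>

/* RS image: B (0:1), VSID (2:51), Ks (52), Kp (53), N (54), L (55),
   C (56), LP (58:59). */
uint64_t slbmte_rs(unsigned b, uint64_t vsid, unsigned ks, unsigned kp,
                   unsigned n, unsigned l, unsigned c, unsigned lp)
{
    return ((uint64_t)(b & 3u) << 62)
         | ((vsid & ((1ULL << 50) - 1)) << 12)
         | ((uint64_t)(ks & 1u) << 11)
         | ((uint64_t)(kp & 1u) << 10)
         | ((uint64_t)(n & 1u) << 9)
         | ((uint64_t)(l & 1u) << 8)
         | ((uint64_t)(c & 1u) << 7)
         | ((uint64_t)(lp & 3u) << 4);
}

/* RB image: ESID (0:35), V (36), index (52:63). */
uint64_t slbmte_rb(uint64_t esid, unsigned v, unsigned index)
{
    return ((esid & ((1ULL << 36) - 1)) << 28)
         | ((uint64_t)(v & 1u) << 27)
         | (uint64_t)(index & 0xFFFu);
}
```

The sketch does not enforce the additional restrictions in the text, such as the index limit of 3 when LPCRUPRT=1 or the implementation-dependent limits on VSID and index width; a caller would need to check those separately.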
SLB Move From Entry VSID X-form

slbmfev RT,RB

This instruction is used to read software-loaded SLB
entries. When LPCRUPRT=0, the entry is specified by
bits 52:63 of register RB. When LPCRUPRT=1, only the
first four entries can be read, so bits 52:61 of register
RB are ignored. If the specified entry is valid (V=1), the
contents of the B, VSID, Ks, Kp, N, L, C, and LP fields
of the entry are placed into register RT. The contents of
these registers are interpreted as shown in Figure 48.

RT
B VSID KsKpNLC 0 LP 0s
0 2 52 57 58 60 63
RB
0s index
0 52 63

RT0:1 B
RT2:51 VSID
RT52 Ks
RT53 Kp
RT54 N
RT55 L
RT56 C
RT57 set to 0b0
RT58:59 LP
RT60:63 set to 0b0000
RB0:51 must be 0x0_0000_0000_0000
RB52:63 index, which selects the SLB entry

Figure 48. GPR contents for slbmfev

On implementations that support a virtual address size
of only n bits, n<78, RT2:79-n are set to zeros.

If the SLB entry specified by bits 52:63 of register RB is
invalid (V=0), the contents of register RT are set to 0.

The use of the L field is implementation specific.

High-order bits of (RB)52:63 that correspond to SLB
entries beyond the size of the SLB provided by the
implementation must be zeros.

SLB Move From Entry ESID X-form

slbmfee RT,RB

This instruction is used to read software-loaded SLB
entries. When LPCRUPRT=0, the entry is specified by
bits 52:63 of register RB. When LPCRUPRT=1, only the
first four entries can be read, so bits 52:61 of register
RB are ignored. If the specified entry is valid (V=1), the
contents of the ESID and V fields of the entry are
placed into register RT. If LPCRUPRT=1, the value of
the BO field of the entry is also placed into register RT.
The contents of these registers are interpreted as
shown in Figure 49.

RT
ESID V BO 0s
0 36 37 38 63
RB
0s index
0 52 63

RT0:35 ESID
RT36 V
RT37 BO, entry is bolted
RT38:63 set to 0b000 || 0x00_0000
RB0:51 must be 0x0_0000_0000_0000
RB52:63 index, which selects the SLB entry

Figure 49. GPR contents for slbmfee

If the SLB entry specified by bits 52:63 of register RB is
invalid (V=0), the contents of register RT are set to 0.

High-order bits of (RB)52:63 that correspond to SLB
entries beyond the size of the SLB provided by the
implementation must be zeros.

The hardware ignores the contents of RB0:51.

This instruction is privileged.

Special Registers Altered:
None
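The RT layout of Figure 49 can be decoded with shifts and masks. The C sketch below is illustrative only; the type and function names are hypothetical, and the input is assumed to be the 64-bit value returned in RT by slbmfee, with bit 0 as the most significant bit.

```c
#include <stdint.h>

/* Decoded view of the RT value of Figure 49. */
typedef struct {
    uint64_t esid;   /* RT0:35 */
    unsigned v;      /* RT36, 1 if the entry is valid */
    unsigned bo;     /* RT37, 1 if the entry is bolted */
} slbmfee_fields;

slbmfee_fields slbmfee_decode(uint64_t rt)
{
    slbmfee_fields f;
    f.esid = rt >> 28;                     /* field ends at bit 35 */
    f.v    = (unsigned)((rt >> 27) & 1u);  /* bit 36 */
    f.bo   = (unsigned)((rt >> 26) & 1u);  /* bit 37 */
    return f;
}
```

Because RT is set to 0 when the selected entry is invalid, an all-zero result decodes to V=0, and software should test the v member before trusting the ESID.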
Figure 50. GPR contents for slbfee.

If s > 28, RT80-s:51 are set to zeros. On implementations
that support a virtual address size of only n bits, n
< 78, RT2:79-n are set to zeros.

CR Field 0 is set as follows. j is a 1-bit value that is
equal to 0b1 if a matching entry was found. Otherwise, j
is 0b0. When LPCRUPRT≠0, j=0b0.

This instruction is privileged except when
LPCRGTSE=0, making it hypervisor privileged.

See Section 5.10 for a description of other requirements
associated with the use of this instruction.

Special Registers Altered:
None

Programming Note
slbsync should not be used to synchronize the
completion of slbie.
      for each thread invalidate Partition Table
        caching associated with partition
        search_LPID
    if PRS=1 then
      for each thread invalidate Process Table
        caching associated with partition
        search_LPID
case (0b11):
  if RIC=0 | RIC=2 then
    if MSRHV then
      for all threads
        if PRS=0 then
          all partition-scoped TLB entries ← invalid
        else
          all process-scoped TLB entries ← invalid
    if (MSRHV=0)&(PRS=1) then
      for each process-scoped TLB entry for each
          thread
        if TLBELPID=search_LPID
          then TLB entry ← invalid
  if RIC=1 | RIC=2 then
    if MSRHV then
      if PRS=0 then
        for all threads
          invalidate all partition-scoped
            page walk caching
      else
        for all threads
          invalidate all process-scoped
            page walk caching
    if (MSRHV=0) & (PRS=1) then
      for each thread invalidate process-scoped
        page walk caching associated with
        partition search_LPID
  if RIC=2 then
    if MSRHV then
      if PRS=0 then
        for each thread
          invalidate all Partition Table caching
      else
        for each thread
          invalidate all Process Table caching
    if (MSRHV=0) & (PRS=1) then
      for each thread invalidate Process Table
        caching associated with partition
        search_LPID

The operation performed by this instruction is based on
the contents of registers RS and RB. The contents of
these registers are shown below, where IS is (RB)52:53
and L is (RB)63.

RS:
PID LPID
0 32 63

RB for R=1 and IS=0b00:
EPN IS 0s AP 0s
0 52 54 56 59 63

RB for R=0, IS=0b00, and L=0:
AVA IS B AP 0s L
0 52 54 56 59 63

RB for R=0, IS=0b00, and L=1:
AVA LP IS B AVAL L
0 44 52 54 56 63

RB for IS=0b01, 0b10, or 0b11:
0s IS 0s
0 52 54 63

If this instruction is executed in hypervisor state,
RS32:63 contains the partition ID (LPID) of the partition
for which one or more translations are being invalidated.
Otherwise, the value in LPIDR is used. The supported
(RS)32:63 values are the same as the LPID
values supported in LPIDR. RS0:31 contains a PID
value. The supported values of RS0:31 are the same as
the PID values supported in PIDR.

The following forms are invalid.
- PRS=1, R=0, and RIC≠2 (The only process-scoped
  HPT caching is of the Process Table.)
- RIC=1 and R=0 (There is no Page Walk Cache for
  HPT translation.)
- RIC=3 and R=1 (Cluster bombs are only supported
  for HPT translation.)

The following forms are treated as if the instruction form
were invalid.
- RIC=1 and IS=0 (The architecture does not support
  shootdown of individual translations in the
  Page Walk Cache.)
- RIC=2 and IS=0 (RIC is for comprehensive invalidation
  that is not supported at the level of an individual
  page.)
- RIC=3 and IS≠0 (Cluster bombs are only supported
  for individual pages.)
- PRS=0 and IS=1 (Partition-scoped translations
  are not associated with processes.)
- R=0 and IS=1 (HPT translations are not associated
  with processes.)

The results of an attempt to invalidate a translation outside
of quadrant 0 for Radix Tree translation (R=1,
RIC=0, PRS=1, IS=0, and EA0:1≠0b00) are boundedly
undefined.

IS field in RB contains 0b00

If RIC=0, this is a search for a single TLB entry. The
following relationships must be true and tests and
actions are performed to search for an HPT translation.

If the base page size specified by the PTE that was
used to create the TLB entry to be invalidated is 4
KB, the L field in register RB must contain 0.
- If the L field in RB contains 0, the base page size is 4 KB and RB56:58 (AP - Actual Page size field) must be set to the SLBEL||LP encoding for the page size corresponding to the actual page size specified by the PTE that was used to create the TLB entry to be invalidated. Thus, b is equal to 12 and p is equal to log2 (actual page size specified by (RB)56:58). The Abbreviated Virtual Address (AVA) field in register RB must contain bits 14:65 of the virtual address translated by the TLB entry to be invalidated. Variable i is equal to 51.
- If the L field in RB contains 1, the following rules apply.
  - The base page size and actual page size are specified in the LP field in register RB, where the relationship between (RB)44:51 (LP - Large Page size selector field) and the base page size and actual page size is the same as the relationship between PTELP and the base page size and actual page size, except for the “r” bits (see Section 5.7.9.1 on page 998 and Figure 31 on page 999). Thus, b is equal to log2 (base page size specified by (RB)44:51) and p is equal to log2 (actual page size specified by (RB)44:51). Specifically, (RB)44+c:51 must be equal to the contents of bits c:7 of the LP field of the PTE that was used to create the TLB entry to be invalidated, where c is the maximum of 0 and (20-p).
  - Variable i is the larger of (63-p) and the value that is the smaller of 43 and (63-b). (RB)0:i must contain bits 14:(i+14) of the virtual address translated by the TLB to be invalidated. If b>20, RB64-b:43 may contain any value and are ignored by the hardware.
  - If b<20, (RB)56:75-b must contain bits 58:77-b of the virtual address translated by the TLB to be invalidated, and other bits in (RB)56:62 may contain any value and are ignored by the hardware.
  - If b≥20, (RB)56:62 (AVAL - Abbreviated Virtual Address, Lower) may contain any value and are ignored by the hardware.
- Let the segment size be equal to the segment size specified in (RB)54:55 (B field). The contents of RB54:55 must be the same as the contents of the B field of the PTE that was used to create the TLB entry to be invalidated.
- RB52:53 and RB59:62 (when (RB)63 = 0) must contain zeros and are ignored by the hardware.
- All TLB entries on all threads that have all of the following properties are made invalid.
  - The entry translates a virtual address for which all the following are true.
    - VA14:14+i is equal to (RB)0:i.
    - L=0 or b≥20 or, if L=1 and b<20, VA58:77-b is equal to (RB)56:75-b.
  - The segment size of the entry is the same as the segment size specified in (RB)54:55.
  - Either of the following is true:
    - The L field in RB is 0, the base page size of the entry is 4 KB, and the actual page size of the entry matches the actual page size specified in (RB)56:58.
    - The L field in RB is 1, the base page size of the entry matches the base page size specified in (RB)44:51, and the actual page size of the entry matches the actual page size specified in (RB)44:51.
  - TLBELPID = search_LPID.

Additional TLB entries may also be made invalid if those TLB entries contain an LPID that matches search_LPID.

The following relationships must be true and tests and actions are performed to search for a Radix Tree translation. For a partition-scoped invalidation, references to the effective address are understood to refer to the guest real address.
- The page size is encoded in RB56:58 (AP - Actual Page size field). Thus p is equal to log2 (page size specified by RB56:58). The Effective Page Number (EPN) field in register RB must contain the bits 0:i of the effective address translated by the TLB entry to be invalidated. Variable i is equal to 63-p.
- The fields shown as zeros must be set to zero and are ignored by the hardware.
- All TLB entries on all threads that have all of the following properties are made invalid.
  - The entry translates an effective address for which EA0:i is equal to (RB)0:i.
  - The page size of the entry matches the page size specified in (RB)56:58.
  - The entry has the appropriate scope (partition or process).
  - The process ID specified in RS matches the process ID in the TLB entry if not invalidating a partition-scoped translation.
  - TLBELPID matches the partition ID of the partition for which the translation is to be invalidated.

Additional TLB entries may also be made invalid if those TLB entries contain an LPID that matches the partition ID of the partition for which the translation is to be invalidated.

If RIC=3, then an implementation-specific encoding of AP indicates the number and (base) size of a series of sequential virtual pages or an address range for which the translations will be invalidated. The range or pages occupy an aligned region of virtual storage. The address in RB is masked to get the base address of the region. For the cluster bomb version, tlbies are then performed for the virtual pages within the region. For the range bomb version, the entire TLB is searched and all entries that map a portion of the address range for the target partition are marked invalid.
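The RS layout and the RB layout for R=1 with IS=0b00 described above use IBM bit numbering, in which bit 0 is the most significant bit of the 64-bit register. The following is a minimal sketch, not architected pseudocode, of how software might assemble these operands. The helper names (`ibm_field`, `pack_rs`, `pack_rb_radix_is0`) are hypothetical, and the sketch assumes a 4 KB granule for extracting the 52-bit EPN; the AP encoding itself is passed in by the caller and is not interpreted here.

```python
def ibm_field(value, first_bit, last_bit, width=64):
    """Place value in IBM-numbered bits first_bit:last_bit (bit 0 is the MSB)."""
    nbits = last_bit - first_bit + 1
    if not 0 <= value < (1 << nbits):
        raise ValueError(f"value {value:#x} does not fit in bits {first_bit}:{last_bit}")
    # Bit last_bit sits (width - 1 - last_bit) positions above the LSB.
    return value << (width - 1 - last_bit)


def pack_rs(pid, lpid):
    """RS: PID in bits 0:31, LPID in bits 32:63."""
    return ibm_field(pid, 0, 31) | ibm_field(lpid, 32, 63)


def pack_rb_radix_is0(effective_address, ap):
    """RB for R=1 and IS=0b00: EPN in 0:51, IS in 52:53, AP in 56:58.

    Assumption: the EPN is taken as the upper 52 bits of the address;
    all fields shown as zeros in the layout are left as zero.
    """
    epn = effective_address >> 12
    return (ibm_field(epn, 0, 51)
            | ibm_field(0b00, 52, 53)
            | ibm_field(ap, 56, 58))
```

For example, `pack_rs(0x12, 0x34)` places the PID in the upper word and the LPID in the lower word, and `pack_rb_radix_is0(0x12345000, 0b101)` keeps the page-aligned address in the high-order bits with AP shifted into bits 56:58.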
IS field in RB is non-zero

If RIC=0 or RIC=2, all partition-scoped TLB entries when PRS=0 and MSRHV=1 or all process-scoped TLB entries when PRS=1 on all threads for which any of the following conditions are met for the entry are made invalid.
- The IS field in RB contains 0b10 or MSRHV=0 and the IS field contains 0b11, and TLBELPID matches the partition ID of the partition for which the translation is to be invalidated.
- The IS field in RB contains 0b01, TLBELPID matches the partition ID of the partition for which the translation is to be invalidated, and TLBEPID=RS0:31.
- The IS field in RB contains 0b11 and MSRHV=1.

If RIC=1 or RIC=2, if the following conditions are met, the respective partition-scoped contents when PRS=0 and MSRHV=1 or process-scoped contents when PRS=1 of the page walk cache are invalidated.
- If the IS field in RB contains 0b10 or if IS contains 0b11 and MSRHV=0, for all threads, all properly-scoped page walk caching associated with the partition for which the translation is to be invalidated is invalidated.
- If the IS field in RB contains 0b11 and MSRHV=1, the entire properly-scoped page walk caching for each thread is invalidated.
- If the IS field in RB contains 0b01 (and PRS=1 and R=1), for all threads, all properly-scoped page walk caching associated with process RS0:31 in the partition for which the translation is to be invalidated is invalidated.

If RIC=2, if the following conditions are met, the respective partition and Process Table caching are invalidated for all threads.
- If the IS field in RB contains 0b01 and PRS=1, for all threads, caching of Process Table Entries for process RS0:31 in the partition for which the translation is to be invalidated is invalidated.
- If the IS field in RB contains 0b10, MSRHV=1, and PRS=0, for all threads, caching of Partition Tables for the partition for which the translation is to be invalidated is invalidated.
- If the IS field in RB contains 0b10 and PRS=1, for all threads, caching of Process Tables for the partition for which the translation is to be invalidated is invalidated.
- If the IS field in RB contains 0b11, MSRHV=1, and PRS=0, for all threads, all Partition Table caching is invalidated.
- If the IS field in RB contains 0b11, MSRHV=1, and PRS=1, for all threads, all Process Table caching is invalidated.
- If the IS field in RB contains 0b11, MSRHV=0, and PRS=1, for all threads, caching of Process Tables for the partition for which the translation is to be invalidated is invalidated.

When i>40, RB40:i-1 may contain any value and are ignored by the hardware.

For all IS values

For all threads, any implementation specific lookaside information that is based on any TLB entry that would be invalidated by this instruction will also be invalidated.

MSRSF must be 1 when this instruction is executed; otherwise the results are undefined.

If the value specified in RS0:31, RS32:63, RB54:55 when R=0, RB56:58 when RB63=0, or RB44:51 when RB63=1 is not supported by the implementation, the instruction is treated as if the instruction form were invalid.

The operation performed by this instruction is ordered by the eieio (or sync or ptesync) instruction with respect to a subsequent tlbsync instruction executed by the thread executing the tlbie instruction. The operations caused by tlbie and tlbsync are ordered by eieio as a fourth set of operations, which is independent of the other three sets that eieio orders.

This instruction is privileged except when LPCRGTSE=0 or when PRS=0 and HR||GR≠0b00, making it hypervisor privileged.

See Section 5.10, “Translation Table Update Synchronization Requirements” for a description of other requirements associated with the use of this instruction.

Special Registers Altered:
    None

Extended Mnemonics:

Extended mnemonic for tlbie:
    Extended:         Equivalent to:
    tlbie RB,RS       tlbie RB,RS,0,0,0
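The tlbie description above lists three forms that are invalid and five that are treated as if the instruction form were invalid. As a rough illustration only, those eight rules can be collected into a single predicate. The function name and the returned strings are hypothetical; a result of "ok" means only that none of the eight listed rules applies, not that every other requirement in the description is met.

```python
def tlbie_form_status(ric, prs, r, is_field):
    """Classify a tlbie operand combination against the listed form rules."""
    # Forms listed as invalid.
    if prs == 1 and r == 0 and ric != 2:
        return "invalid"            # only process-scoped HPT caching is the Process Table
    if ric == 1 and r == 0:
        return "invalid"            # no Page Walk Cache for HPT translation
    if ric == 3 and r == 1:
        return "invalid"            # cluster bombs only for HPT translation

    # Forms listed as treated as if the instruction form were invalid.
    if ric == 1 and is_field == 0:
        return "treated as invalid"  # no per-translation PWC shootdown
    if ric == 2 and is_field == 0:
        return "treated as invalid"  # comprehensive invalidation is not per-page
    if ric == 3 and is_field != 0:
        return "treated as invalid"  # cluster bombs only for individual pages
    if prs == 0 and is_field == 1:
        return "treated as invalid"  # partition-scoped entries have no process
    if r == 0 and is_field == 1:
        return "treated as invalid"  # HPT translations have no process
    return "ok"
```

A caller could use such a check in a simulator or assembler front end to reject operand combinations before emitting the instruction.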
Programming Note
For tlbie[l] instructions in which (RB)63=0, the AP value in RB is provided to make it easier for the hardware to locate address translations, in lookaside buffers, corresponding to the address translation being invalidated.

For tlbie[l] instructions the AP specification is not binary compatible with versions of the architecture that precede Version 2.06. As an example, for an actual page size of 64 KB AP=0b101, whereas software written for an implementation that complies with a version of the architecture that precedes V. 2.06 would have AP=0b100 since AP was a 1 bit value followed by 0s in RB57:58. If binary compatibility is important, for a 64 KB page software can use AP=0b101 on these earlier implementations since these implementations were required to ignore RB57:58.

Programming Note
For tlbie[l] instructions the AVA and AVAL fields in RB contain different VA bits from those in PTEAVA.

TLB Invalidate Entry Local X-form

tlbiel RB,RS,RIC,PRS,R

    31    RS    /    RIC   PRS   R     RB    274   /
    0     6     11   12    14    15    16    21    31

IS ← (RB)52:53
search_LPID=LPIDRLPID
switch(IS)
  case (0b00):
    If RIC=0
      If R=0
        L ← (RB)63
        if L = 0 then
          base_pg_size = 4K
          actual_pg_size =
            page size specified in (RB)56:58
          i = 51
        else
          base_pg_size = base page size specified
            in (RB)44:51
          actual_pg_size =
            actual page size specified in (RB)44:51
          b ← log_base_2(base_pg_size)
          p ← log_base_2(actual_pg_size)
          i = max(min(43,63-b),63-p)
        sg_size ← segment size specified in (RB)54:55
        for each TLB entry
          if (entry_VA14:i+14 = (RB)0:i) &
             (entry_sg_size = segment_size) &
             (entry_base_pg_size = base_pg_size) &
             (entry_actual_pg_size = actual_pg_size) &
             (TLBELPID=search_LPID) &
             (entry_process_scoped=0) &
             (entry_radix=0))
          then
            if ((L = 0)|(b ≥ 20)) then
              TLB entry ← invalid
            else
              if (entry_VA58:77-b = (RB)56:75-b) then
                TLB entry ← invalid
      else
        pg_size = page size specified in (RB)56:58
        p ← log_base_2(pg_size)
        i = 63-p
        for each TLB entry
          if (entry_EA0:i = (RB)0:i) &
             (entry_pg_size = pg_size) &
             (entry_LPID = search_LPID) &
             (entry_process_scoped = PRS) &
             (entry_radix = R) &
             ((PRS = 0) |
              (entry_PID = (RS)0:31))
          then
            TLB entry ← invalid
  case (0b01):
    if RIC=0 | RIC=2 then
      i ← implementation-dependent number, 40≤i≤51
      for each TLB entry in set (RB)i:51
        if (entry_LPID=search_LPID)
           &(entry_PID=RS0:31)
           &(entry_R=1)
           &(entry_PRS=1)
        then TLB entry ← invalid
    if RIC=1 | RIC=2 then
      invalidate process-scoped radix page walk
        caching associated with process RS0:31 in
        partition search_LPID
    if (RIC=2)&(PRS=1) then
      invalidate Process Table caching associated
        with process RS0:31 in partition search_LPID
  case (0b10):
    if RIC=0 | RIC=2 then
      i ← implementation-dependent number, 40≤i≤51
      if (PRS=0)&(MSRHV=1) then
        for each partition-scoped TLB entry in set
            (RB)i:51
          if entry_LPID=search_LPID
            then TLB entry ← invalid
      if PRS=1 then
        for each process-scoped TLB entry in
            set (RB)i:51
          if entry_LPID=search_LPID
            then TLB entry ← invalid
    if RIC=1 | RIC=2 then
      for each thread
        if (PRS=0)&(MSRHV=1) then
          invalidate partition-scoped page walk
            caching associated with partition
            search_LPID
        if PRS=1 then
          invalidate process-scoped page walk
            caching associated with partition
            search_LPID
    if RIC=2 then
      if (PRS=0)&(MSRHV=1) then
        invalidate Partition Table caching
          associated with partition search_LPID
    PID          ///
    0            32           63

RB for R=1 and IS=0b00:
    EPN          IS     0s     AP     0s
    0            52     54     56     59     63

RB for R=0, IS=0b00, and L=0:
    AVA          IS     B      AP     0s     L
    0            52     54     56     59     63

If RIC=0, this is a search for a single TLB entry. The following relationships must be true and tests and actions are performed to search for an HPT translation.
- If the base page size specified by the PTE that was used to create the TLB entry to be invalidated is 4 KB, the L field in register RB must contain 0.
- If the L field in RB contains 0, the base page size is 4 KB and RB56:58 (AP - Actual Page size field) must be set to the SLBEL||LP encoding for the page size corresponding to the actual page size specified by the PTE that was used to create the TLB entry to be invalidated. Thus, b is equal to 12 and p is equal to log2 (actual page size specified by (RB)56:58). The Abbreviated Virtual Address (AVA) field in register RB must contain bits 14:65 of the virtual address translated by the TLB entry to be invalidated. Variable i is equal to 51.
- If the L field in RB contains 1, the following rules apply.
  - The base page size and actual page size are specified in the LP field in register RB, where the relationship between (RB)44:51 (LP - Large Page size selector field) and the base page size and actual page size is the same as the relationship between PTELP and the base page size and actual page size, except for the “r” bits (see Section 5.7.9.1 on page 998 and Figure 31 on page 999). Thus, b is equal to log2 (base page size specified by (RB)44:51) and p is equal to log2 (actual page size specified by (RB)44:51). Specifically, (RB)44+c:51 must be equal to the contents of bits c:7 of the LP field of the PTE that was used to create the TLB entry to be invalidated, where c is the maximum of 0 and (20-p).
  - Variable i is the larger of (63-p) and the value that is the smaller of 43 and (63-b). (RB)0:i must contain bits 14:(i+14) of the virtual address translated by the TLB to be invalidated. If b>20, RB64-b:43 may contain any value and are ignored by the hardware.
  - If b<20, (RB)56:75-b must contain bits 58:77-b of the virtual address translated by the TLB to be invalidated, and other bits in (RB)56:62 may contain any value and are ignored by the hardware.
  - If b≥20, (RB)56:62 (AVAL - Abbreviated Virtual Address, Lower) may contain any value and are ignored by the hardware.
- Let the segment size be equal to the segment size specified in (RB)54:55 (B field). The contents of RB54:55 must be the same as the contents of the B field of the PTE that was used to create the TLB entry to be invalidated.
- All TLB entries that have all of the following properties are made invalid on the thread executing the tlbiel instruction.
  - The entry translates a virtual address for which all the following are true.
    - VA14:14+i is equal to (RB)0:i.
    - L=0 or b≥20 or, if L=1 and b<20, VA58:77-b is equal to (RB)56:75-b.
  - The segment size of the entry is the same as the segment size specified in (RB)54:55.
  - Either of the following is true:
    - The L field in RB is 0, the base page size of the entry is 4 KB, and the actual page size of the entry matches the actual page size specified in (RB)56:58.
    - The L field in RB is 1, the base page size of the entry matches the base page size specified in (RB)44:51, and the actual page size of the entry matches the actual page size specified in (RB)44:51.
  - TLBELPID = LPIDRLPID.

The following relationships must be true and tests and actions are performed to search for a Radix Tree translation. For a partition-scoped invalidation, references to the effective address are understood to refer to the guest real address.
- The page size is encoded in RB56:58 (AP - Actual Page size field). Thus p is equal to log2 (page size specified by RB56:58). The Effective Page Number (EPN) field in register RB must contain the bits 0:i of the effective address translated by the TLB entry to be invalidated. Variable i is equal to 63-p.
- The fields shown as zeros must be set to zero and are ignored by the hardware.
- All TLB entries that have all of the following properties are made invalid on the thread executing the tlbiel instruction.
  - The entry translates an effective address for which EA0:i is equal to (RB)0:i.
  - The page size of the entry matches the page size specified in (RB)56:58.
  - The entry has the appropriate scope (partition or process).
  - The process ID specified in RS matches the process ID in the TLB entry if not invalidating a partition-scoped translation.
  - TLBELPID matches the partition ID of the partition for which the translation is to be invalidated.

IS field in RB is non-zero

If RIC=0 or RIC=2, (RB)i:51 (bits i-40:11 of the SET field in (RB)) specify a set of TLB entries, where i is an implementation-dependent value in the range 40:51. Each partition-scoped entry when PRS=0 and MSRHV=1 or each process-scoped entry when PRS=1 in the set is invalidated if any of the following conditions are met for the entry.
- The IS field in RB contains 0b10, or MSRHV=0 and the IS field contains 0b11, and TLBELPID = LPIDRLPID.
- The IS field in RB contains 0b01, TLBELPID=LPIDRLPID, and TLBEPID=RS0:31.
- The IS field in RB contains 0b11 and MSRHV=1.

How the TLB is divided into the 2^(52-i) sets is implementation-dependent. The relationship of virtual addresses to these sets is also implementation-dependent. However, if, in an implementation, there can be multiple TLB entries for the same virtual address and same partition, then all these entries must be in a single set.

If RIC=1 or RIC=2, if the following conditions are met, the respective partition-scoped contents when PRS=0 and MSRHV=1 or process-scoped contents when PRS=1 of the page walk cache are invalidated.
- If the IS field in RB contains 0b10 or if IS contains 0b11 and MSRHV=0, all properly-scoped page walk caching associated with partition LPIDRLPID is invalidated.
Programming Note
tlbsync should not be used to synchronize the
completion of tlbiel.
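Because tlbiel with a non-zero IS field operates on one set at a time, selected by (RB)i:51, software that wants to cover the whole local TLB has to issue one tlbiel per set. The sketch below only computes the RB values such a loop would use; it does not execute any instruction. The function names are hypothetical, and the default i=40 is an assumption chosen for illustration, since i is implementation-dependent.

```python
def tlbiel_set_rb(set_index, is_field, i=40):
    """Build RB for tlbiel with a non-zero IS: set number in bits i:51, IS in 52:53."""
    if not 40 <= i <= 51:
        raise ValueError("i must be in the range 40:51")
    if is_field not in (0b01, 0b10, 0b11):
        raise ValueError("this layout applies only when IS is non-zero")
    num_sets = 1 << (52 - i)              # the TLB is divided into 2^(52-i) sets
    if not 0 <= set_index < num_sets:
        raise ValueError("set index out of range for this i")
    # IBM bit 51 is 12 positions above the LSB; bit 53 is 10 positions above it.
    return (set_index << 12) | (is_field << 10)


def all_set_rbs(is_field, i=40):
    """RB values for visiting every set once, in ascending set order."""
    return [tlbiel_set_rb(s, is_field, i) for s in range(1 << (52 - i))]
```

With i=45, for instance, `all_set_rbs(0b10, i=45)` yields 128 RB values, one per set.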
while they are being modified. (For Partition Table Entries, and for Process Table Entries when HR||GR≠0b00, the process or partition affected must be inactive because the entries do not have valid bits.) With the exceptions previously identified for Segment Table walks (see Section 5.9.3, “Lookaside Buffer Management”), any thread, including a thread on which software is modifying any of the set of tables described in the first sentence, may look in those tables at any time in an attempt to translate an address. When modifying an entry in any of the former set of tables, software must ensure that the table entry’s V bit is 0 if the table entry does not correctly specify its portion of the translation (e.g., if the RPN field is not correct for the current AVA field).

For HPT translation, updates of Reference and Change bits by the hardware are not synchronized with the accesses that cause the updates. When modifying doubleword 1 of a PTE, software must take care to avoid overwriting a hardware update of these bits and to avoid having the value written by a Store instruction overwritten by a hardware update.

The most basic sequence that will achieve proper system synchronization for PTE updates is the following.

    tlbie instruction(s) specifying the same LPID operand value
    eieio
    tlbsync
    ptesync

Other instructions may be interleaved among these instructions. Operating system and hypervisor software that updates Page Table Entries should use this sequence.

Operating systems and nested hypervisors are exposed to being interrupted during this sequence. The interrupting hypervisor is responsible for completing the sequence above. In general this will require the hypervisor to include the following sequence in an interrupt handler.

    eieio
    tlbsync
    ptesync

This sequence itself may be interrupted by a higher level hypervisor. When returning to the interrupted software, the original sequence will be completed. Hardware must tolerate the result of nested interleaving of these sequences. tlbie and tlbsync instructions should only be used as part of these sequences.

The corresponding sequences for Segment Table updates use slbieg in place of tlbie and slbsync in place of tlbsync. Similarly slbieg and slbsync should only be used as part of these sequences. In circumstances where a hypervisor may be interrupting either a PTE update or a Segment Table update, it must include both tlbsync and slbsync in its completing sequence, in either order. Hardware must tolerate the result of nested interleaving of these additional sequences.

The PTE sequences are also used to synchronize updates to partition and Process Table Entries.

On systems consisting of only a single-threaded processor, the eieio and tlbsync or slbsync instructions can be omitted.

The following subsections illustrate sequences that must be used for translation table updates to tables that are subject to concurrent use by hardware (i.e. that have valid bits in their entries). For Partition Table Entries and for Process Table Entries that do not have valid bits, simpler sequences consisting of just the preceding sequences, perhaps with mutual exclusion if the update processes are multithreaded, is sufficient.

Programming Note
The eieio instruction prevents the reordering of the preceding tlbie or slbieg instructions with respect to the subsequent tlbsync or slbsync instruction. The tlbsync or slbsync instruction and the subsequent ptesync instruction together ensure that all storage accesses for which the address was translated using the translations being invalidated (by the tlbie or slbieg instructions), and all Reference and Change bit updates associated with address translations that were performed using the translations being invalidated, will be performed with respect to any thread or mechanism, to the extent required by the associated Memory Coherence Required attributes, before any data accesses caused by instructions following the ptesync instruction are performed with respect to that thread or mechanism.

For Page Table update sequences that mark the PTE invalid (see Section 5.10.1.2, “Modifying a Translation Table Entry”), Reference and Change bit updates cease when the sequence is complete. When the PTE is marked invalid using an atomic update and the Store Conditional setting the entry invalid is successful, the Reference and Change bits obtained by the corresponding Load And Reserve instruction are stable/final values.

The sequences of operations shown in the following subsections assume a multi-threaded environment. In an environment consisting of only a single-threaded processor, the tlbsync or slbsync and the eieio that separates the tlbie or slbieg from the tlbsync or slbsync can be omitted. In a multi-threaded environment, when tlbiel or slbie is used instead of tlbie or slbieg in a Page or Segment Table update, the synchronization requirements are the same as when tlbie or slbieg is used in an environment consisting of only a single-threaded processor.
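As an illustration of the ordering rule above, the following sketch checks whether a recorded instruction trace contains the basic synchronization sequence in order, allowing other instructions to be interleaved. It is a simplified model for reasoning about traces, not a description of hardware behavior; the function names and the representation of a trace as a list of mnemonic strings are assumptions made for the example.

```python
def required_sequence(invalidate="tlbie", single_threaded=False):
    """Return the ordered mnemonics the basic sequence calls for."""
    sync = {"tlbie": "tlbsync", "slbieg": "slbsync"}[invalidate]
    if single_threaded:
        # On a single-threaded processor the eieio and tlbsync/slbsync can be omitted.
        return [invalidate, "ptesync"]
    return [invalidate, "eieio", sync, "ptesync"]


def contains_in_order(required, trace):
    """True if the required mnemonics appear in trace in order (gaps allowed)."""
    remaining = iter(trace)
    # 'op in remaining' advances the iterator past the first match, so each
    # required mnemonic must be found after the previous one.
    return all(op in remaining for op in required)
```

For example, a trace of `tlbie, add, eieio, tlbsync, ptesync` satisfies the multi-threaded sequence, while one that issues tlbsync before eieio does not.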
dated, the following sequence can be used to modify the STE, maintain a consistent state, ensure that the translation instantiated by the old entry is no longer available, and ensure that a subsequent reference to the effective address translated by the new entry will use the correct virtual address and associated attributes. (The sequence is much like the general case for a change to an HPT PTE, and is equivalent to deleting the STE and then adding a new one.) Mutual exclusion with respect to other software threads may be required. A similar sequence (except using tlbie with RIC=2 and tlbsync) may be used to modify HR||GR=0b00 Process Table Entries.

    STEV ← 0   /* (other fields don’t matter)*/
    ptesync    /* order update before slbieg and
                  before next Segment Table search */
    slbieg(old_B,old_ESID,old_C,old_TA)
               /*invalidate old translation*/
    eieio      /* order slbieg before slbsync */
    slbsync    /* order slbieg before ptesync */
    ptesync    /* order slbieg, slbsync and 1st
                  update before 2nd update */
               /* deletion sequence ends here */
    STEVSID, Ks, Kp, N, L, C, LP, SW ← new values
    eieio      /* order 2nd update before 3rd */
    STEESID,V ← new values (V=1)
    ptesync    /* order 2nd and 3rd updates before
                  next Segment Table search and
                  before next data access */

Resetting the Reference Bit (PTE)

If the only change being made to a valid entry is to set the Reference bit to 0, a simpler sequence suffices because the Reference bit need not be maintained exactly. The byte store is exposed to overwriting another change being performed by multithreaded software, so mutual exclusion may be required.

    oldR ← PTER   /* get old R */
    if oldR = 1 then
      PTER ← 0    /* store byte (R=0, other bits
                     unchanged) */
      tlbie(B,VA14:77-b,L,LP,AP,LPID) /* invalidate
                     entry */
      eieio       /* order tlbie before tlbsync */
      tlbsync     /* order tlbie before ptesync */
      ptesync     /* order tlbie, tlbsync, and update
                     before next Page Table search
                     and before next data access */

    stdcx. PTE_dwd_0 ← r1 /* store dwd 0 of PTE
                     if still reserved (new SW value, other
                     fields unchanged) */
    bne- loop     /* loop if lost reservation */

A lbarx/stbcx., lharx/sthcx., or lwarx/stwcx. pair (specifying the low-order byte, halfword, or word respectively of doubleword 0 of the PTE) can be used instead of the ldarx/stdcx. pair shown above for HPT translation. The split SW field in the radix PTE cannot be updated with a single smaller atomic update. This sequence interacts correctly with hardware updates and is safe for multithreaded software. A similar sequence (including the possibility of using a smaller atomic update) can be used to update a Segment Table Entry.

Modifying the Effective Address (STE)

If the effective address translated by a valid STE is to be modified and the new effective address hashes to the same STEG as does the old effective address, the following sequence can be used to modify the STE, maintain a consistent state, ensure that the translation instantiated by the old entry is no longer available, and ensure that a subsequent reference to the effective address translated by the new entry will use the correct virtual address and associated attributes. Mutual exclusion with respect to other software threads may be required. The corresponding change of the virtual address in the PTE for HPT translation can be performed using a similar sequence, interacting correctly with hardware table updates, as long as the second doubleword of the PTE is not stored.

    STEESID,V ← new values (V=1)
    ptesync    /* order update before slbieg and
                  before next Segment Table search */
    slbieg(old_B,old_ESID,old_TA,old_C)
               /*invalidate old translation*/
    eieio      /* order slbieg before slbsync */
    slbsync    /* order slbieg before ptesync */
    ptesync    /* order slbieg, slbsync, and update
                  before next data access */
Chapter 6. Interrupts

6.2 Interrupt Registers

    HSRR1
    0                         63

Segment Descriptor Register (ASDR). When nested translation is taking place, the ASDR will generally provide the guest real address down to bit 51. (The smallest supported page size is 4k.) When using paravirtualized HPT translation or for HV=1 accesses when HR=0, information from the segment descriptor that was used to perform the effective to virtual translation is provided in the ASDR. For exceptions that take place when translating the address of the process table entry or segment table entry group, only the VSID will be provided, because those addresses are specified as virtual addresses and the rest of the segment descriptor is implied. Some instances of the Machine Check interrupt may require the ASDR to be set similarly to how it is set for the hypervisor storage interrupts. The ASDR is set independent of the value of UPRT for the partition that is running.

caused the interrupt, with the high-order 32 bits of the HDAR set to 0 if the interrupt occurs in 32-bit mode.

    HDAR
    0                         63

Figure 56. Hypervisor Data Address Register

6.2.6 Data Storage Interrupt Status Register

The Data Storage Interrupt Status Register (DSISR) is a 32-bit register that is set by the Machine Check, Data Storage, and Data Segment interrupts; see Sections 6.5.2, 6.5.3, and 6.5.4.
attempted. The values and their meanings are specified below.

    02  Access to the DSCR at SPR 3
    03  Access to a Performance Monitor SPR in group A or B when MMCR0PMCC is set to a value for which the access results in a Facility Unavailable interrupt. (See the definition of MMCR0PMCC in Section 9.4.4.)
    04  Execution of a BHRB Instruction
    05  Access to a Transactional Memory SPR or execution of a Transactional Memory Instruction
    06  Reserved
    07  Access to an Event-Based Branch SPR or execution of an Event-Based Branch instruction
    08  Access to the Target Address Register
    0A  Access to the msgsndp or msgclrp instructions, the TIR or the DPDES Register
    0B  Access to the Load Monitored Region Register or Load Monitored Section Enable Register.
    0C  Execution of scv

All other values are reserved.

8:63 Facility Enable (FE)

The FE field controls the availability of various facilities in problem state as specified below.

8:50 Reserved

51 scv instruction
    0  The scv instruction is not available.
    1  The scv instruction is available.

52 Load Monitored Facilities (LM)
    0  The Load Monitored Region and Load Monitored Section Enable Registers are not available in problem state. Load Monitored event-based exceptions do not occur, and Load Monitored event-based branches do not occur.
       When executed in problem state, ldmx behaves as if it were ldx.
    1  The Load Monitored Region Register and Load Monitored Section Enable Register are available in problem state unless made unavailable by another register, and Load Monitored event-based exceptions and Load Monitored event-based branches occur if enabled by other registers.
       ldmx behaves as described in Section 3.3.2.1 of Book I.

53 msgsndp instructions and SPRs (MSGP)
    0  The msgsndp and msgclrp instructions and the TIR and DPDES registers are not available in privileged non-hypervisor state.
    1  The msgsndp and msgclrp instructions and the TIR and DPDES registers are available in privileged non-hypervisor state unless made unavailable by another register.

54 Reserved

55 Target Address Register (TAR)
    0  The TAR and bctar instruction are not available in problem state.
    1  The TAR and bctar instruction are available in problem state unless made unavailable by another register.

56 Event-Based Branch Facility (EBB)
    0  The Event-Based Branch facility SPRs and instructions are not available in problem state, and event-based exceptions and branches do not occur.
    1  The Event-Based Branch facility SPRs and instructions (see Chapter 7 of Book II) are available in problem state unless made unavailable by another register, and event-based exceptions and branches are allowed to occur if enabled by other registers.

57:60 Reserved

Programming Note
HFSCR58:60 are used to control the availability of Transactional Memory, the Performance Monitor, and the BHRB in problem and privileged non-hypervisor states. FSCR58:60 are reserved since the availability of Transactional Memory is controlled by the MSR, and the availability of the Performance Monitor and BHRB is controlled by MMCR0.

61 Data Stream Control Register at SPR 3 (DSCR)
    0  SPR 3 is not available in problem state.
    1  SPR 3 is available in problem state unless made unavailable by another register.

62:63 Reserved

Programming Note
When an OS has set the FSCR such that a facility is unavailable, the OS should either emulate the facility when it is accessed or provide an application interface that requires the application to request use of the facility before it accesses the facility.
Programming Note
The FSCR can be used to determine whether a
particular facility is being used by an application,
and the HFSCR can be used to determine whether
a particular facility is being used by either an appli-
cation or by an operating system. This is done by
disabling the facility initially, and enabling it in the
interrupt handler upon first usage. The information
about the usage of a particular facility can be used
to determine whether that facility’s state must be
saved and restored when changing program con-
text.
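The usage-tracking technique in the preceding note can be sketched as a small software model. This is only an illustration: the structure `task_model` and the functions `task_start`, `on_facility_unavailable`, and `must_save_state` are hypothetical names, and a real handler would read and write the FSCR with mfspr and mtspr rather than a struct field. The FE bit positions (56 for EBB, 61 for DSCR) are taken from the field description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit register. */
#define IBM_BIT(n) (1ULL << (63 - (n)))

#define FSCR_FE_EBB  IBM_BIT(56)  /* Event-Based Branch facility enable */
#define FSCR_FE_DSCR IBM_BIT(61)  /* DSCR at SPR 3 enable */

/* Hypothetical per-task model: the FSCR image and the facilities seen in use. */
struct task_model {
    uint64_t fscr;
    uint64_t used;
};

/* Start with every facility disabled so that first use causes an interrupt. */
static void task_start(struct task_model *t)
{
    assert(t != 0);
    t->fscr = 0;
    t->used = 0;
}

/* Model of the interrupt handler: enable the facility and record its use.
 * Returns false if the facility was already enabled (nothing to do). */
static bool on_facility_unavailable(struct task_model *t, uint64_t fe_bit)
{
    if (t->fscr & fe_bit)
        return false;
    t->fscr |= fe_bit;
    t->used |= fe_bit;
    return true;
}

/* Context switch: only facilities that were actually used need save/restore. */
static bool must_save_state(const struct task_model *t, uint64_t fe_bit)
{
    return (t->used & fe_bit) != 0;
}
```

With this model, a task that never touches the EBB facility never pays for saving its registers, which is the point of disabling the facility initially.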
Programming Note
The following tables summarize the interrupts that occur as a result of accessing the non-privileged Performance
Monitor registers in problem state when MMCR0[PMCC], PCR, and HFSCR are set to various values. (Accesses to
privileged Performance Monitor SPRs (SPRs 784-792, 795-798) in problem state result in Privileged Instruction Type
Program interrupts.)
                        mfspr, by PMCC                    mtspr, by PMCC
SPR              #      00    01       10    11     00       01       10    11
MMCR2 (note 3)   769    HU4   FU, HU4  HU4   HU4    HE, HU4  FU, HU4  HU4   HU4
MMCRA            770    HU4   FU, HU4  HU4   HU4    HE, HU4  FU, HU4  HU4   HU4
PMC1             771    HU4   FU, HU4  HU4   HU4    HE, HU4  FU, HU4  HU4   HU4
4 4 4 4 4 4 4
HU4
Group A
SIER3 768 HU4 FU, HU4 HU4 HU4 See 2. See 2. See 2. See 2.
HU4 HU4 HU4 HU4
Group B
Programming Note
When an MSR bit makes a facility unavailable, the
facility is made unavailable in all privilege states.
Examples of this include the Floating Point, Vector,
and VSX facilities. The FSCR and HFSCR affect
the availability of facilities only in privilege states
that are lower than the privilege of the register
(FSCR or HFSCR).
6.3 Interrupt Synchronization

When an interrupt occurs, in general SRR0 or HSRR0 is set to point to
an instruction such that all preceding instructions have completed
execution, no subsequent instruction has begun execution, and the
instruction addressed by SRR0 or HSRR0 may or may not have completed
execution, depending on the interrupt type.

The only exception is that if an mtspr sequence started by mtgsr is
active when the interrupt occurs, some of the sequence’s mtsprs beyond
the instruction pointed to by SRR0 or HSRR0 may have been executed; see
Chapter 11.

With the exception of System Reset and Machine Check interrupts, all
interrupts are context synchronizing as defined in Section 1.5.1.
System Reset and Machine Check interrupts are context synchronizing if
they are recoverable (i.e., if bit 62 of SRR1 is set to 1 by the
interrupt). If a System Reset or Machine Check interrupt is not
recoverable (i.e., if bit 62 of SRR1 is set to 0 by the interrupt), it
acts like a context synchronizing operation with respect to subsequent
instructions. That is, a non-recoverable System Reset or Machine Check
interrupt need not satisfy items 1 through 3 of Section 1.5.1, but does
satisfy items 4 and 5.

Programming Note
While the behavior of correct mtgsr-initiated mtspr sequences
described above does not directly violate a careful interpretation of
the requirements of context synchronization, it may violate programmer
expectations. For example, speculative execution is permitted

6.4 Interrupt Classes

interrupts. System Reset and Machine Check interrupts are not maskable.

“Instruction-caused” interrupts are further divided into two classes,
precise and imprecise.

6.4.1 Precise Interrupt

All instruction-caused interrupts other than the Imprecise Mode
Floating-Point Enabled Exception type Program interrupt are precise,
except that if an mtspr sequence started by mtgsr is active when the
interrupt occurs, some of the sequence’s mtsprs beyond the interrupt
point may have been executed; see Chapter 11. Statements elsewhere in
Book III-S that a given interrupt is precise do not preclude the mtspr
case just described.

When the fetching or execution of an instruction causes a precise
interrupt, the following conditions exist at the interrupt point.

1. SRR0 addresses either the instruction causing the exception or the
   immediately following instruction. Which instruction is addressed
   can be determined from the interrupt type and status bits.

2. An interrupt is generated such that all instructions preceding the
   instruction causing the exception appear to have completed with
   respect to the executing thread.

3. The instruction causing the exception may appear not to have begun
   execution (except for causing the exception), may have been
   partially executed, or may have completed, depending on the
   interrupt type.

4. Architecturally, no subsequent instruction has begun execution.
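Section 6.3 above ties the recoverability of System Reset and Machine Check interrupts to bit 62 of SRR1. As a minimal sketch, assuming the saved SRR1 value is already available in a 64-bit variable, the test can be written as follows; the helper names `ibm_bit` and `srr1_recoverable` are illustrative and not part of the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit, bit 63 the least. */
static inline uint64_t ibm_bit(unsigned n)
{
    assert(n <= 63);
    return 1ULL << (63 - n);
}

/* True if the interrupt that saved this SRR1 value is recoverable,
 * i.e. bit 62 of SRR1 was set to 1 by the interrupt. */
static inline bool srr1_recoverable(uint64_t srr1)
{
    return (srr1 & ibm_bit(62)) != 0;
}
```

Bit 62 in this numbering corresponds to the mask value 0x2, so the check is a single AND on the saved register image.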
3. Bits 0:32, 37:41, and 48:63 of SRR1 or HSRR1 are loaded with a copy
   of the corresponding bits of the MSR.

4. The MSR is set as shown in Figure 64 on page 1061. In particular,
   MSR bits IR and DR are set as specified by LPCR[AIL] (see Section
   2.2), and MSR bit SF is set to 1, selecting 64-bit mode. The new
   values take effect beginning with the first instruction executed
   following the interrupt.

5. Instruction fetch and execution resumes, using the new MSR value,
   at the effective address specific to the interrupt type. These
   effective addresses are shown in Figure 65 on page 1062. An offset
   may be applied to get the effective addresses, as specified by
   LPCR[AIL] (see Section 2.2).

Interrupts do not clear reservations obtained with lbarx, lharx, lwarx,
ldarx, or lqarx.

isync or rfid, to ensure that the instructions in the “new” program
execute in the “new” context.

treclaim, to ensure that any previous use of the transactional facility
is terminated.
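Item 3 above names the SRR1 or HSRR1 bits that are copied from the MSR. The following sketch derives the corresponding mask from those bit ranges. It assumes IBM bit numbering (bit 0 is the most significant bit); the function names are hypothetical, and `srr1_from_msr` is only a model of how a saved value is composed from MSR bits and interrupt-specific status bits, not a description of any implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering IBM-numbered bits first..last of a 64-bit register. */
static uint64_t ibm_mask(unsigned first, unsigned last)
{
    assert(first <= last && last <= 63);
    unsigned width = last - first + 1;
    uint64_t ones = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return ones << (63 - last);
}

/* Bits 0:32, 37:41, and 48:63 are copied from the MSR. */
static uint64_t srr1_msr_copy_mask(void)
{
    return ibm_mask(0, 32) | ibm_mask(37, 41) | ibm_mask(48, 63);
}

/* Model: MSR bits land in the copied positions; the remaining positions
 * (33:36 and 42:47) carry interrupt-specific status. */
static uint64_t srr1_from_msr(uint64_t msr, uint64_t status_bits)
{
    uint64_t copy = srr1_msr_copy_mask();
    return (msr & copy) | (status_bits & ~copy);
}
```

The complement of the mask covers exactly bits 33:36 and 42:47, which are the positions the individual interrupt descriptions set explicitly.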
Programming Note
For instruction-caused interrupts, in some cases it may be desirable
for the operating system to emulate the instruction that caused the
interrupt, while in other cases it may be desirable for the operating
system not to emulate the instruction. The following list, while not
complete, illustrates criteria by which decisions regarding emulation
should be made. The list applies to general execution environments; it
does not necessarily apply to special environments such as program
debugging, bring-up, etc.

In general, the instruction should be emulated if:

- The interrupt is caused by a condition for which the instruction
  description (including related material such as the introduction to
  the section describing the instruction) implies that the instruction
  works correctly. Example: Alignment interrupt caused by lmw for
  which the storage operand is not aligned, or by dcbz for which the
  storage operand is in storage that is Write Through Required or
  Caching Inhibited.

- The instruction is an illegal instruction that should appear, to the
  program executing it, as if it were supported by the implementation.
  Example: A Hypervisor Emulation Assistance interrupt is caused by an
  instruction that has been phased out of the architecture but is
  still used by some programs that the operating system supports.

If the instruction is a Storage Access instruction, the emulation must
satisfy the atomicity requirements described in Section 1.4 of Book II.

In general, the instruction should not be emulated if:

- The purpose of the instruction is to cause an interrupt. Example:
  System Call interrupt caused by sc.

- The interrupt is caused by a condition that is stated, in the
  instruction description, potentially to cause the interrupt.
  Example: Alignment interrupt caused by lwarx for which the storage
  operand is not aligned.

- The program is attempting to perform a function that it should not
  be permitted to perform. Example: Data Storage interrupt caused by
  lwz for which the storage operand is in storage that the program
  should not be permitted to access. (If the function is one that the
  program should be permitted to perform, the conditions that caused
  the interrupt should be corrected and the program re-dispatched such
  that the instruction will be re-executed. Example: Data Storage
  interrupt caused by lwz for which the storage operand is in storage
  that the program should be permitted to access but for which there
  currently is no PTE that satisfies the Page Table search.)
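The emulation criteria in this note can be summarized as a small decision function. This is a sketch only: the enumerations and the function `choose_action` are hypothetical names for illustration, and classifying a real interrupt into one of these reasons is the hard part that the sketch does not attempt.

```c
#include <assert.h>

/* Hypothetical classification of why an instruction-caused interrupt occurred. */
enum interrupt_reason {
    REASON_DESCRIPTION_IMPLIES_SUCCESS,   /* e.g. unaligned lmw, dcbz to Caching Inhibited storage */
    REASON_PHASED_OUT_INSTRUCTION,        /* illegal instruction still used by supported programs */
    REASON_INSTRUCTION_MEANT_TO_INTERRUPT,/* e.g. sc */
    REASON_DOCUMENTED_INTERRUPT_CONDITION,/* e.g. unaligned lwarx */
    REASON_FUNCTION_NOT_PERMITTED,        /* e.g. lwz to storage the program may not access */
    REASON_PERMITTED_BUT_CONDITION_FIXABLE/* e.g. lwz with no PTE yet for permitted storage */
};

enum handler_action {
    ACTION_EMULATE,
    ACTION_DO_NOT_EMULATE,
    ACTION_FIX_AND_REEXECUTE
};

/* Map each reason to the action the note recommends in general. */
static enum handler_action choose_action(enum interrupt_reason reason)
{
    switch (reason) {
    case REASON_DESCRIPTION_IMPLIES_SUCCESS:
    case REASON_PHASED_OUT_INSTRUCTION:
        return ACTION_EMULATE;
    case REASON_INSTRUCTION_MEANT_TO_INTERRUPT:
    case REASON_DOCUMENTED_INTERRUPT_CONDITION:
    case REASON_FUNCTION_NOT_PERMITTED:
        return ACTION_DO_NOT_EMULATE;
    case REASON_PERMITTED_BUT_CONDITION_FIXABLE:
        return ACTION_FIX_AND_REEXECUTE;
    }
    assert(!"unknown interrupt reason");
    return ACTION_DO_NOT_EMULATE;
}
```

As the note says, the list is not complete and applies to general execution environments, so a debugger or bring-up environment might map the same reasons differently.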
Programming Note
If a program modifies an instruction that it or
another program will subsequently execute and the
execution of the instruction causes an interrupt, the
state of storage and the content of some registers
may appear to be inconsistent to the interrupt han-
dler program. For example, this could be the result
of one program executing an instruction that
causes a Hypervisor Emulation Assistance inter-
rupt just before another instance of the same pro-
gram stores an Add Immediate instruction in that
storage location. To the interrupt handler code, it
would appear that the hardware generated the interrupt
as the result of executing a valid instruction.
In each interrupt handler, when enough state has been saved that a
Machine Check or System Reset interrupt can be recovered from, set
MSR[RI] to 1.

In each interrupt handler, do the following (in order) just before
returning.
1. Set MSR[RI] to 0.
2. Set SRR0 and SRR1 to the values to be used by rfid. The new value
   of SRR1 should have bit 62 set to 1 (which will happen naturally if
   SRR1 is restored to the value saved there by the interrupt, because
   the interrupt handler will not be executing this sequence unless
   the interrupt is recoverable).
3. Execute rfid.

For interrupts that set the SRRs other than Machine Check or System
Reset, MSR[RI] can be managed similarly when these interrupts occur
within interrupt handlers for other interrupts that set the SRRs.

This Note does not apply to interrupts that set the HSRRs because
these interrupts put the thread into hypervisor state, and either do
not occur or can be prevented from occurring within interrupt handlers
for other interrupts that set the HSRRs.

6.4.4 Implicit alteration of HSRR0 and HSRR1

Executing some of the more complex instructions may have the side
effect of altering the contents of HSRR0 and HSRR1. The instructions
listed below are guaranteed not to have this side effect. Any omission
of instruction suffixes is significant; e.g., add is listed but add. is
excluded.

5. Logical and Extend Sign instructions
   ori, oris, xori, xoris, and, or, xor, nand, nor, eqv, andc, orc,
   extsb, extsh, extsw

6. Rotate and Shift instructions
   rldicl, rldicr, rldic, rlwinm, rldcl, rldcr, rlwnm, rldimi, rlwimi,
   sld, slw, srd, srw

7. Other instructions
   isync
   rfid, hrfid
   mtspr, mfspr, mtmsrd, mfmsr

Programming Note
Instructions excluded from the list include the following.
- instructions that set or use XER[CA]
- instructions that set XER[OV] or XER[SO]
- andi., andis., and fixed-point instructions with Rc=1 (Fixed-point
  instructions with Rc=1 can be replaced by the corresponding
  instruction with Rc=0 followed by a Compare instruction.)
- all floating-point instructions
- mftb
These instructions, and the other excluded instructions, may be
implemented with the assistance of the Hypervisor Emulation Assistance
interrupt, or of implementation-specific interrupts that modify HSRR0
and HSRR1. The included instructions are guaranteed not to be
implemented thus. (The included instructions are sufficiently simple
as to be unlikely to need such assistance. Moreover, they are likely
to be needed in interrupt handlers before HSRR0 and HSRR1 have been
saved or after HSRR0 and HSRR1 have been restored.)
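Because instruction suffixes are significant in the list for Section 6.4.4, a tool that checks handler code against it needs exact mnemonic matching. The sketch below covers only groups 5 through 7 as they appear in this section, so a false result means "not in these three groups", not "excluded by the architecture". The table and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mnemonics from groups 5, 6, and 7 only; earlier groups are not covered. */
static const char *const listed_groups_5_to_7[] = {
    /* 5. Logical and Extend Sign */
    "ori", "oris", "xori", "xoris", "and", "or", "xor", "nand", "nor",
    "eqv", "andc", "orc", "extsb", "extsh", "extsw",
    /* 6. Rotate and Shift */
    "rldicl", "rldicr", "rldic", "rlwinm", "rldcl", "rldcr", "rlwnm",
    "rldimi", "rlwimi", "sld", "slw", "srd", "srw",
    /* 7. Other */
    "isync", "rfid", "hrfid", "mtspr", "mfspr", "mtmsrd", "mfmsr"
};

/* Exact match: "and" is listed, "and." (Rc=1) is not. */
static bool in_listed_groups_5_to_7(const char *mnemonic)
{
    assert(mnemonic != NULL);
    size_t count = sizeof listed_groups_5_to_7 / sizeof listed_groups_5_to_7[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(listed_groups_5_to_7[i], mnemonic) == 0)
            return true;
    }
    return false;
}
```

Using `strcmp` rather than a prefix comparison is the design point here: a prefix match would wrongly accept the Rc=1 forms that the text excludes.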
Effective
Address (note 1)   Interrupt Type
00..0000_0100      System Reset
00..0000_0200      Machine Check
00..0000_0300      Data Storage
00..0000_0380      Data Segment
00..0000_0400      Instruction Storage
00..0000_0480      Instruction Segment
00..0000_0500      External
00..0000_0600      Alignment
00..0000_0700      Program
00..0000_0800      Floating-Point Unavailable
00..0000_0900      Decrementer
00..0000_0980      Hypervisor Decrementer
00..0000_0A00      Directed Privileged Doorbell
00..0000_0B00      Reserved
00..0000_0C00      System Call
00..0000_0D00      Trace
00..0000_0E00      Hypervisor Data Storage
00..0000_0E20      Hypervisor Instruction Storage
00..0000_0E40      Hypervisor Emulation Assistance
00..0000_0E60      Hypervisor Maintenance
00..0000_0E80      Directed Hypervisor Doorbell
00..0000_0EA0      Hypervisor Virtualization
00..0000_0EC0      Reserved
00..0000_0EE0      Reserved for implementation-dependent interrupt
                   for performance monitoring
00..0000_0F00      Performance Monitor
00..0000_0F20      Vector Unavailable
00..0000_0F40      VSX Unavailable
00..0000_0F60      Facility Unavailable
00..0000_0F80      Hypervisor Facility Unavailable
00..0000_0FA0      Reserved
. . .              ...
00..0000_0FFF      Reserved
00..0001_7000      System Call Vectored
00..0001_7020      System Call Vectored
. . .              ...
00..0001_7FE0      System Call Vectored
00..0001_7FFF      (end of scv interrupt vectors)

1 The values in the Effective Address column are interpreted as
  follows.
  00..0000_0nnn means 0x0000_0000_0000_0nnn unless the values of
  LPCR[AIL] and MSR[HV IR DR] cause the application of an effective
  address offset. See the description of LPCR[AIL] in Section 2.2 for
  more details.
  00..0001_7nnn means 0x0000_0000_0001_7nnn unless the values of
  LPCR[AIL] and MSR[HV IR DR] cause the usage of an alternate
  effective address. See the description of LPCR[AIL] in Section 2.2
  for details.
2 Effective addresses 0x0000_0000_0000_0000 through
  0x0000_0000_0000_00FF are used by software and will not be assigned
  as interrupt vectors.

Figure 65. Effective address of interrupt vector by interrupt type

Programming Note
When address translation is disabled, use of any of the effective
addresses that are shown as reserved in Figure 65 risks incompatibility
with future implementations.

6.5.1 System Reset Interrupt

If a System Reset exception causes an interrupt that is not context
synchronizing or causes the loss of a Machine Check exception or a
Direct External exception, or if the state of the thread has been
corrupted, the interrupt is not recoverable.

When the thread is in any power-saving level, a System Reset interrupt
occurs when a System Reset exception exists. When the thread is in a
power-saving level that was entered when PSSCR[EC]=1, a System Reset
interrupt also occurs when any of the following events occurs provided
that the event is enabled to cause exit from power-saving mode (see
Section 2.2). When the thread is in a power-saving level that allows
the state of the LPCR to be lost, it is implementation-specific whether
the following events, when enabled, cause exit, or whether only a
system-reset exception causes exit.

External
Decrementer
Directed Privileged Doorbell
Directed Hypervisor Doorbell
Hypervisor Maintenance
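The vector layout of Figure 65 lends itself to a lookup table. The sketch below encodes the non-reserved fixed vectors and computes System Call Vectored addresses. Two things are assumptions rather than statements from the figure: the addresses are the base values before any offset or alternate address selected by LPCR AIL is applied, and the scv formula is inferred from the 0x20 spacing between 0x17000 and 0x17FE0, with the index `lev` limited to 128 entries. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vector_entry {
    uint32_t offset;      /* low-order part of the effective address */
    const char *name;
};

/* Non-reserved entries of Figure 65, before any LPCR AIL adjustment. */
static const struct vector_entry fixed_vectors[] = {
    {0x0100, "System Reset"},
    {0x0200, "Machine Check"},
    {0x0300, "Data Storage"},
    {0x0380, "Data Segment"},
    {0x0400, "Instruction Storage"},
    {0x0480, "Instruction Segment"},
    {0x0500, "External"},
    {0x0600, "Alignment"},
    {0x0700, "Program"},
    {0x0800, "Floating-Point Unavailable"},
    {0x0900, "Decrementer"},
    {0x0980, "Hypervisor Decrementer"},
    {0x0A00, "Directed Privileged Doorbell"},
    {0x0C00, "System Call"},
    {0x0D00, "Trace"},
    {0x0E00, "Hypervisor Data Storage"},
    {0x0E20, "Hypervisor Instruction Storage"},
    {0x0E40, "Hypervisor Emulation Assistance"},
    {0x0E60, "Hypervisor Maintenance"},
    {0x0E80, "Directed Hypervisor Doorbell"},
    {0x0EA0, "Hypervisor Virtualization"},
    {0x0F00, "Performance Monitor"},
    {0x0F20, "Vector Unavailable"},
    {0x0F40, "VSX Unavailable"},
    {0x0F60, "Facility Unavailable"},
    {0x0F80, "Hypervisor Facility Unavailable"}
};

/* Return the interrupt name for a base vector address, or NULL if the
 * address is reserved or not a fixed vector. */
static const char *vector_name(uint64_t ea)
{
    size_t count = sizeof fixed_vectors / sizeof fixed_vectors[0];
    for (size_t i = 0; i < count; i++) {
        if (fixed_vectors[i].offset == ea)
            return fixed_vectors[i].name;
    }
    return NULL;
}

/* Assumed layout: one System Call Vectored entry every 0x20 bytes. */
static uint64_t scv_vector(unsigned lev)
{
    assert(lev < 128);
    return 0x17000ULL + 0x20ULL * lev;
}
```

For example, `vector_name(0x0300)` yields the Data Storage entry, while a reserved address such as 0x0B00 yields NULL.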
46:47 Set to indicate whether the interrupt occurred when the thread
      was in power-saving mode and, if so, the extent to which
      resource state was maintained while the thread was in
      power-saving mode, as follows.

      00 The interrupt did not occur when the thread was in
         power-saving mode.

      01 The interrupt occurred when the thread was in power-saving
         mode. The state of all resources was maintained as if the
         thread was not in power-saving mode.

      10 The interrupt occurred when the thread was in power-saving
         mode. The state of some resources was not maintained, but
         the state of all hypervisor resources, including the TB,
         PURR, and SPURR, was maintained as if the thread was not in
         power-saving mode and the state of all other resources is
         such that the hypervisor can resume execution. (See Section
         2.6 for the list of hypervisor resources.)

      11 The interrupt occurred when the thread was in power-saving
         mode. The state of some resources was not maintained, and
         the state of some hypervisor resources was not maintained or
         the state of some resources is such that the hypervisor
         cannot resume execution.

      Programming Note
      Although the resources that are maintained in power-saving mode
      (except when all resources are maintained) are
      implementation-dependent, the hypervisor can avoid
      implementation-dependence in the portion of the System Reset
      and Machine Check interrupt handlers that recover from having
      been in power-saving mode by using the contents of
      SRR1[46:47], to determine what state to restore. (To avoid
      implementation-dependence in the portion of the hypervisor that
      enters power-saving mode, the hypervisor must use the
      specification of the four instructions to determine what state
      to save.)

62    If the interrupt did not occur while the thread was in a
      power-saving level that was entered when PSSCR[EC]=1, loaded
      from bit 62 of the MSR if the thread is in a recoverable state;
      otherwise set to 0. If the interrupt occurred while the thread
      was in a power-saving level that was entered when PSSCR[EC]=1,
      set to 1 if the thread is in a recoverable state; otherwise set
      to 0.

Others Set to an implementation-dependent value.

MSR   See Figure 64.

DSISR Set to an implementation-dependent value.

DAR   Set to an implementation-dependent value.

ASDR  Set to an implementation-dependent value.

Execution resumes at effective address 0x0000_0000_0000_0200.

A Machine Check interrupt caused by the existence of multiple SLB
entries or TLB entries (or similar entries in implementation-specific
translation caches) which translate a given effective or virtual
address (see Sections 5.7.8.2 and 5.7.9.2) must occur while still in
the context of the partition that caused it. The interrupt must be
presented in a way that permits continuing execution, with damage
limited to the causing partition. Treating the exception as
instruction-caused will achieve these requirements.

Programming Note
If a Machine Check interrupt is caused by an error in the storage
subsystem, the storage subsystem may return incorrect data, which may
be placed into registers. This corruption of register contents may
occur even if the interrupt is recoverable.

6.5.3 Data Storage Interrupt

A Data Storage interrupt occurs when no higher priority exception
exists and either

(a) HR||GR=0b00, the value of the expression

    ((MSR[HV PR]=0b10)|((¬VPM1|¬PRTEV)& MSR[DR]))

    is 1, and a data access cannot be performed, except for the case
    of MSR[HV PR]≠0b10, VPM1=0, LPCR[KBV]=1, and a Virtual Storage
    Page Class Key Protection exception exists or

(b) HR||GR ≠ 0b00 and process-scoped translation prevents a data
    access from being performed

for any of the following reasons. (In the expression for (a) above,
“¬PRTEV” is shorthand representing the case of an invalid segment
table descriptor stopping the translation process.)
- Data address translation is enabled (MSR[DR]=1) and the virtual
  address of any byte of the storage location specified by a Load,
  Store, icbi, dcbz, dcbst, or dcbf[l] instruction cannot be
  translated to a real address because no valid PTE was found for the
  process-scoped translation or paravirtualized translation with VPM
  off.
- The address of the appropriate process table entry or segment table
  entry group cannot be translated when HR||GR=0b00 and either VPM1=0
  or the process table entry is invalid (independent of VPM1).
- The effective address specified by a lq, stq, lwat, ldat, lbarx,
  lharx, lwarx, ldarx, lqarx, stwat, stdat, stbcx., sthcx., stwcx.,
  stdcx., or stqcx. instruction refers to storage that is Write
  Through Required or Caching Inhibited; or the effective address
  specified by a lwat, ldat, stwat, or stdat instruction refers to
  storage that is Guarded.
- The effective address specified by a copy or paste[.] instruction
  refers to storage that is Caching Inhibited.
- The effective address specified by an instruction other than copy or
  paste[.] refers to CSM.
- An accelerator is specified as the source of a copy instruction or
  an attempt is made to access an accelerator that is not properly
  initialized for the software’s use.
- A transfer specified to access CSM does not specify (local) main
  storage as the other end of the transfer.
- The access violates Basic Storage Protection.
- The access violates Virtual Page Class Key Storage Protection and
  LPCR[KBV]=0.
- The process- and partition-scoped page attributes conflict.
- An unsupported radix tree configuration is found in the
  process-scoped tables.
- A reference or change bit update cannot be performed in a
  process-scoped PTE.
- A Data Address Watchpoint match occurs.
- An attempt is made to execute a Fixed-Point Load or Store Caching
  Inhibited instruction with MSR[DR]=1 or specifying a storage
  location that is specified by the Hypervisor Real Mode Storage
  Control facility to be treated as non-Guarded.

A Data Storage interrupt also occurs when no higher priority exception
exists and an attempt is made to execute a Load Atomic or Store Atomic
instruction specifying an invalid function code.

If a stbcx., sthcx., stwcx., stdcx., or stqcx. would not perform its
store in the absence of a Data Storage interrupt, and either (a) the
specified effective address refers to storage that is Write Through
Required or Caching Inhibited, or (b) a non-conditional Store to the
specified effective address would cause a Data Storage interrupt, it is
implementation-dependent whether a Data Storage interrupt occurs.

If the XER specifies a length of zero for an indexed Move Assist
instruction, a Data Storage interrupt does not occur.

The following registers are set:

SRR0  Set to the effective address of the instruction that caused the
      interrupt.

SRR1
33:36 Set to 0.
42:47 Set to 0.
Others Loaded from the MSR.

MSR   See Figure 64.

DSISR
32    Set to 0.
33    Set to 1 if MSR[DR]=1 and the translation for an attempted
      access is not found in the Page Table; otherwise set to 0.
34    Set to 1 if the process- and partition-scoped page attributes
      conflict; otherwise set to 0.
35    Set to 0.
36    Set to 1 if the access is not permitted by Figure 43 45, or the
      privilege, read, or read/write bits in Figure 44 as
      appropriate; otherwise set to 0.
37    Set to 1 if the access is due to a lq, stq, lwat, ldat, lbarx,
      lharx, lwarx, ldarx, lqarx, stwat, stdat, stbcx., sthcx.,
      stwcx., stdcx., or stqcx. instruction that addresses storage
      that is Write Through Required or Caching Inhibited; or if the
      access is due to a lwat, ldat, stwat, or stdat instruction that
      addresses storage that is Guarded; or if the access is due to a
      copy or paste[.] instruction that addresses storage that is
      Caching Inhibited; or if the access is due to an instruction
      other than copy or paste[.] addressing CSM; otherwise set to 0.
38    Set to 1 for a Store or dcbz instruction; otherwise set to 0.
39:40 Set to 0.
41    Set to 1 if a Data Address Watchpoint match occurs; otherwise
      set to 0.
42    Set to 1 if the access is not permitted by virtual page class
      key protection; otherwise set to 0.
43    Set to 0.
44    Set to 1 if an unsupported radix tree configuration is found
      during the translation process; otherwise set to 0.
45    Set to 1 if an attempt to atomically set a reference or change
      bit fails; otherwise set to 0.
46    Set to 1 if the address of the appropriate process table entry
      or segment table entry group cannot be translated when VPM1=0
      and HR||GR=0b00, or the process table
An Instruction Storage interrupt occurs when no higher priority
exception exists and either

(a) HR||GR=0b00, the value of the expression

    ((MSR[HV PR]=0b10)|((¬VPM1|¬PRTEV)&MSR[IR]))

    is 1, and the next instruction to be executed cannot be fetched,
    or

(b) HR||GR ≠ 0b00 and process-scoped translation prevents the next
    instruction to be executed from being fetched

for any of the following reasons. (In the expression for (a) above,
“¬PRTEV” is shorthand representing the case of an invalid segment
table descriptor stopping the translation process.)

42    Set to 1 if the access is not permitted by virtual page class
      key protection; otherwise set to 0.
43    Set to 0.
44    Set to 1 if an unsupported radix tree configuration is found
      during the translation process; otherwise set to 0.
45    Set to 1 if an attempt to atomically set a reference bit fails;
      otherwise set to 0.
46    Set to 1 if the address of the appropriate process table entry
      or segment table entry group cannot be translated when VPM1=0
      and HR||GR=0b00, or the process table entry is invalid
      (independent of VPM1) when HR||GR=0b00.
47    Set to 0.
Others Loaded from the MSR.

MSR   See Figure 64.
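The gating expression in condition (a) for the Instruction Storage interrupt can be read as a boolean predicate. The sketch below evaluates only that expression together with the HR||GR=0b00 test; it does not model whether the fetch itself fails. The structure `xlate_ctx`, its field names, and the function name are hypothetical, and the field `prtev` simply stands for the PRTEV term as written in the expression.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs named in condition (a). */
struct xlate_ctx {
    unsigned hr_gr;   /* two-bit concatenation HR||GR */
    bool msr_hv;
    bool msr_pr;
    bool msr_ir;
    bool vpm1;
    bool prtev;
};

/* True when HR||GR=0b00 and
 * ((MSR HV PR = 0b10) | ((not VPM1 | not PRTEV) & MSR IR)) is 1. */
static bool isi_condition_a_gate(const struct xlate_ctx *c)
{
    assert(c != 0 && c->hr_gr <= 3);
    bool hv_pr_is_10 = c->msr_hv && !c->msr_pr;
    bool expr = hv_pr_is_10 || ((!c->vpm1 || !c->prtev) && c->msr_ir);
    return c->hr_gr == 0 && expr;
}
```

The Data Storage interrupt uses the same shape of expression with the MSR DR bit in place of the IR bit, so the predicate could be reused by passing the relevant translation-enable bit.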
If multiple Instruction Storage exceptions occur due to attempting to
fetch a single instruction, any one or more of the bits corresponding
to these exceptions may be set to 1 in SRR1.

Execution resumes at effective address 0x0000_0000_0000_0400, possibly
offset as specified in Figure 65.

6.5.6 Instruction Segment Interrupt

For Paravirtualized HPT Translation, an Instruction Segment interrupt
occurs when no higher priority exception exists and the next
instruction to be executed cannot be fetched because instruction
address translation is enabled and the effective address cannot be
translated to a virtual address.

For Radix Tree Translation (in other than hypervisor real mode), an
Instruction Segment interrupt occurs when no higher priority exception
exists and the next instruction to be executed cannot be fetched
because EA[0:1]=0b01 or EA[0:1]=0b10 when MSR[HV PR] ≠ 0b10 and
instruction address translation is enabled, or EA[2:63] is outside the
range translated by the appropriate Radix Tree.

The following registers are set:

SRR0  Set to the effective address of the instruction that the thread
      would have attempted to execute next if no interrupt conditions
      were present (if the interrupt occurs on attempting to fetch a
      branch target, SRR0 is set to the branch target address).

SRR1
33:36 Set to 0.
42:47 Set to 0.
Others Loaded from the MSR.

MSR   See Figure 64 on page 1061.

Execution resumes at effective address 0x0000_0000_0000_0480, possibly
offset as specified in Figure 65.

Programming Note
An Instruction Segment interrupt occurs if MSR[IR]=1 and the
translation of the effective address of the next instruction to be
executed is not found in the SLB (or in any implementation-specific
address translation lookaside information).

6.5.7 External Interrupt

An External interrupt is classified as being either a Direct External
interrupt or a Mediated External interrupt. Throughout this Book,
usage of the phrase “External interrupt”, without further
classification, refers to both a Direct External interrupt and a
Mediated External interrupt.

6.5.7.1 Direct External Interrupt

A Direct External interrupt occurs when no higher priority exception
exists, a Direct External exception exists, and the value of the
expression

    MSR[EE] & ¬(MSR[HV] & ¬MSR[PR] & LPCR[HEIC]) |
    (¬(LPES) & (¬(MSR[HV]) | MSR[PR]))

is one. The occurrence of the interrupt does not cause the exception
to cease to exist.

Programming Note
When HEIC=1, Direct External exceptions will not result in external
interrupts when the processor is in hypervisor state. This enables the
Hypervisor Interrupt Virtualization handler to prevent External
interrupts from occurring during the Hypervisor Virtualization
interrupt handler.

When LPES=0, the following registers are set:

HSRR0 Set to the effective address of the instruction that the thread
      would have attempted to execute next if no interrupt conditions
      were present.

HSRR1
33:36 Set to 0.
42:47 Set to 0.
Others Loaded from the MSR.

MSR   See Figure 64 on page 1061.

When LPES=1, the following registers are set:

SRR0  Set to the effective address of the instruction that the thread
      would have attempted to execute next if no interrupt conditions
      were present.

SRR1
33:36 Set to 0.
42:47 Set to 0.
Others Loaded from the MSR.

MSR   See Figure 64 on page 1061.

Execution resumes at effective address 0x0000_0000_0000_0500, possibly
offset as specified in Figure 65.

Programming Note
Because the value of MSR[EE] is always 1 when the thread is in problem
state, the simpler expression

    MSR[EE] & ¬(MSR[HV] & ¬MSR[PR] & LPCR[HEIC]) |
    ¬(LPES | MSR[HV])

is equivalent to the expression given above.
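The equivalence claimed for the two Direct External interrupt expressions in Section 6.5.7.1 can be checked exhaustively. The sketch below assumes that & binds more tightly than | in both expressions, and it models the stated premise (the EE bit is always 1 in problem state) as a constraint that skips combinations with PR=1 and EE=0. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Full condition: EE & not(HV & not PR & HEIC) | (not LPES & (not HV | PR)). */
static bool direct_ext_full(bool ee, bool hv, bool pr, bool heic, bool lpes)
{
    return (ee && !(hv && !pr && heic)) || (!lpes && (!hv || pr));
}

/* Simpler condition: EE & not(HV & not PR & HEIC) | not(LPES | HV). */
static bool direct_ext_simple(bool ee, bool hv, bool pr, bool heic, bool lpes)
{
    return (ee && !(hv && !pr && heic)) || !(lpes || hv);
}

/* Count input combinations on which the two expressions disagree.
 * When ee_set_in_problem_state is true, combinations with PR=1 and EE=0
 * are skipped, matching the premise of the Programming Note. */
static int count_mismatches(bool ee_set_in_problem_state)
{
    int mismatches = 0;
    for (unsigned bits = 0; bits < 32; bits++) {
        bool ee   = (bits & 1u) != 0;
        bool hv   = (bits & 2u) != 0;
        bool pr   = (bits & 4u) != 0;
        bool heic = (bits & 8u) != 0;
        bool lpes = (bits & 16u) != 0;
        if (ee_set_in_problem_state && pr && !ee)
            continue;
        if (direct_ext_full(ee, hv, pr, heic, lpes) !=
            direct_ext_simple(ee, hv, pr, heic, lpes))
            mismatches++;
    }
    return mismatches;
}
```

Under the constraint the two expressions agree on every combination; without it they differ only when HV=1, PR=1, EE=0, and LPES=0, which is exactly the case the premise rules out.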
Execution resumes at effective address 0x0000_0000_0000_0600, possibly
offset as specified in Figure 65.

Programming Note
If an Alignment interrupt occurs for a case in the second bulleted
list above, the Alignment interrupt handler should emulate the
instruction. The emulation must satisfy the atomicity requirements
described in Section 1.4 of Book II.
If an Alignment interrupt occurs for a case in the first bulleted list
above, the Alignment interrupt handler must not attempt to emulate the
instruction, but instead should treat the instruction as a programming
error.

6.5.9 Program Interrupt

A Program interrupt occurs when no higher priority exception exists
and one of the following exceptions arises during execution of an
instruction:

Floating-Point Enabled Exception

A Floating-Point Enabled Exception type Program interrupt is generated
when the value of the expression

    (MSR[FE0] | MSR[FE1]) & FPSCR[FEX]

is 1. FPSCR[FEX] is set to 1 by the execution of a floating-point
instruction that causes an enabled exception, including the case of a
Move To FPSCR instruction that causes an exception bit and the
corresponding enable bit both to be 1.

than Non-transactional state, with the exception of TFHAR in Suspended
state.

An attempt is made to execute a stop instruction in Suspended state.

Privileged Instruction

The following applies if the instruction is executed when MSR[PR] = 1.

    A Privileged Instruction type Program interrupt is generated when
    execution is attempted of a privileged instruction, or of an mtspr
    or mfspr instruction with an SPR field that contains a value
    having spr0=1.

The following applies if the instruction is executed when
MSR[HV PR] = 0b00 and LPCR[EVIRT]=0.

    A Privileged Instruction type Program interrupt is generated when
    execution is attempted of an mtspr or mfspr instruction with an
    SPR field that designates an SPR that is accessible by the
    instruction only when the thread is in hypervisor state, or when
    execution of a hypervisor-privileged instruction is attempted.

    Programming Note
    These are the only cases in which a Privileged Instruction type
    Program interrupt can be generated when MSR[PR]=0. They can be
    distinguished from other causes of Privileged Instruction type
    Program interrupts by examining SRR1[49] (the bit in which MSR[PR]
    was saved by the interrupt).

Trap
Programming Note
SRR1[47] can be set to 1 only if the
exception is a Floating-Point Enabled
Exception and either MSR[FE0 FE1] =
0b01 or 0b10 or MSR[FE0 FE1] has just
been changed from 0b00 to a nonzero
value. (SRR1[47] is always set to 1 in the
last case.)
to execute next if no interrupt conditions action has succeeded and execution of the tend.
were present. instruction has successfully completed or that failure
handling (see Section 5.3.3 of Book II) has completed
SRR1
for any cause of failure other than execution of tre-
33:36 Set to 0.
claim.
42:47 Set to 0.
Others Loaded from the MSR. Successful completion for an instruction means that the
instruction caused no other interrupt and, if the thread
MSR See Figure 64 on page 1061. is in Transactional state, did not cause the transaction
to fail in such a way that the instruction did not com-
Execution resumes at effective address
plete. (See Section 5.3.1 of Book II). Thus a Trace
0x0000_0000_0000_0A00, possibly offset as specified
interrupt never occurs for a System Call or System Call
in Figure 65.
Vectored instruction, or for a Trap instruction that traps,
or for a dcbf that is executed in Transactional state. The
6.5.14 System Call Interrupt instruction that causes a Trace interrupt is called the
“traced instruction”.
A System Call interrupt occurs when a System Call
instruction is executed. The following registers are set:
The following registers are set: SRR0 Set to the effective address of the instruc-
tion that the thread would have attempted
SRR0 Set to the effective address of the instruc- to execute next if no interrupt conditions
tion following the System Call instruction. were present.
SRR1 SRR1
33:36 Set to 0. 33 Set to 1.
42:47 Set to 0. 34 Set to 0.
Others Loaded from the MSR. 35 Set to 1 if the Trace interrupt is not the
MSR See Figure 64 on page 1061. result of a CIABR match and the traced
instruction is a Load instruction or is speci-
Execution resumes at effective address fied to be treated as a Load instruction; oth-
0x0000_0000_0000_0C00, possibly offset as specified erwise set to 0.
in Figure 65. 36 Set to 1 if the Trace interrupt is not the
result of a CIABR match and the traced
Programming Note instruction is a Store instruction or is speci-
An attempt to execute an sc instruction with LEV=1 fied to be treated as a Store instruction;
in problem state should be treated as a program- otherwise set to 0.
ming error. 43 Set to 1 if the traced instruction is the result
of a CIABR match.
44 Set to 1 if the interrupt occurred as the result
6.5.15 Trace Interrupt of transaction completion.
45:47 Set to 0.
A Trace interrupt occurs when no higher priority excep- Others Loaded from the MSR.
tion exists and either a transaction is completed or any
Programming Note
instruction except rfid, hrfid, rfscv, or a Power-Saving
Mode instruction is successfully completed, provided Bit 33 is set to 1 for historical reasons.
any of the following is true:
SIAR For all Trace interrupts other than those
- the instruction is mtmsr[d] and caused by a transaction completion or a
MSRTE=0b10 when the instruction was initi- CIABR match, set to the effective address
ated, of the traced instruction; otherwise unde-
- the instruction is not mtmsr[d] and fined.
MSRTE=0b10, SDAR For all Trace interrupts other than those
- the instruction is a Branch instruction and caused by a transaction completion or a
MSRTE=0b01, CIABR match, set to the effective address
of the storage operand (if any) of the traced
- the transaction is completed when instruction; otherwise undefined.
MSRTE=0b11, or
If the state of the Performance Monitor is such that the
- a CIABR match occurs.
Performance Monitor may be altering the SIAR and
Completion of a transaction, for the purposes of Trans- SDAR (i.e., if MMCR0PMAE=1), the contents of the
action Completion Tracing, means that either the trans-
SIAR and SDAR are undefined for the Trace interrupt described below, or
and may change even when no Trace interrupt occurs.
(d) HR||GR ≠ 0b00 and partition-scoped translation
MSR See Figure 64 on page 1061.
prevents an access from being performed for any of
Execution resumes at effective address
0x0000_0000_0000_0D00, possibly offset as specified the following reasons.
in Figure 65. For a Trace interrupt resulting from execu- (In the expression for (b) above, “PRTEV” is shorthand
tion of an instruction that modifies the value of MSRIR indicating that an invalid segment table descriptor did
or MSRDR, the Trace interrupt vector location is based not stop the translation process. Note that an SLB hit
on the modified values. may satisfy this condition even when the Process Table
Entry is invalid.)
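The SRR1 settings listed for the Trace interrupt in Section 6.5.15 can be illustrated with a small decoder. This is a minimal sketch, not part of the architecture: the helper names (`ppc_bit`, `decode_trace_srr1`) and the dictionary keys are assumptions chosen for illustration. It relies only on the bit positions given above and on the Power ISA convention that bit 0 is the most significant bit of a 64-bit register.

```python
def ppc_bit(n: int, width: int = 64) -> int:
    """Return the mask for bit n in Power ISA numbering (bit 0 is the most significant)."""
    return 1 << (width - 1 - n)


def decode_trace_srr1(srr1: int) -> dict:
    """Decode the Trace-specific SRR1 bits; the key names are illustrative only."""
    return {
        "bit33": bool(srr1 & ppc_bit(33)),                    # set to 1 for historical reasons
        "load": bool(srr1 & ppc_bit(35)),                     # traced instruction is (treated as) a Load
        "store": bool(srr1 & ppc_bit(36)),                    # traced instruction is (treated as) a Store
        "ciabr_match": bool(srr1 & ppc_bit(43)),              # result of a CIABR match
        "transaction_completion": bool(srr1 & ppc_bit(44)),   # result of transaction completion
    }


# Example: SRR1 image for a traced Load instruction (bits 33 and 35 set).
sample = ppc_bit(33) | ppc_bit(35)
flags = decode_trace_srr1(sample)
print(flags)
```

A handler written along these lines would test `ciabr_match` and `transaction_completion` first, since SIAR and SDAR are undefined for those two causes.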
Programming Note
HR||GR=0b00, data address translation is enabled
The following instructions are not traced.
(MSRDR=1) and the virtual address of any byte of
rfid the storage location specified by a Load, Store,
hrfid icbi, dcbz, dcbst, or dcbf[l] instruction cannot be
rfscv translated to a real address because no valid PTE
sc, scv, and Trap instructions that trap was found for the VPM translation.
Power-Saving Mode instructions HR||GR≠0b00 and the virtual / guest real address
other instructions that cause interrupts (other of any byte of the storage location specified by a
than Trace interrupts) Load, Store, icbi, dcbz, dcbst, or dcbf[l] instruc-
the first instructions of any interrupt handler tion cannot be translated to a host real address
instructions that are emulated by software because no valid PTE was found in the parti-
instructions, executed in Transactional state, tion-scoped page table.
that are disallowed in Transactional state The virtual / guest real address of a page directory
instructions, executed in Transactional state, entry or process table entry could not be translated
that cause types of accesses that are disal- when HR||GR≠0b00; or the virtual address of a
lowed in Transactional state process table entry or segment table entry group
mtspr, executed in Transactional state, speci- could not be translated when VPM1=1 and
fying an SPR that is not part of the Transac- HR||GR=0b00.
tional Memory checkpointed registers An unsupported MMU configuration is found. In
tbegin. executed at maximum nesting depth addition to an invalid radix tree configuration found
in the partition-scoped tables, this type of excep-
In general, interrupt handlers can achieve the effect
tion will also be reported outside of hypervisor real
of tracing these instructions.
mode for translation mode mismatches including
GR≠HR, UPRT=0 when GR=1 or HR=1, LPID=0 if
MSRHV=0 when GR=1 or HR=1, and HR=0 for
LPID=0 when HR=1 for another partition ID.
6.5.16 Hypervisor Data Storage A reference or change bit update in a parti-
tion-scoped PTE cannot be performed (including
Interrupt for the process-scoped PDE or PTE or process
A Hypervisor Data Storage interrupt occurs when no table entry for a radix guest or the process table
higher priority exception exists, the thread is not in entry or segment table entry group for a paravirtu-
hypervisor state, and either alized HPT guest).
Programming Note
- a byte in the first aligned double-
word for which access was
The exception identified by bit 58 may attempted in the page that caused
be a problem the hypervisor can fix. the exception, for a non-quadword
Additional information may be retained Load or Store instruction
by the platform for this exception. undefined, for a Data Address Watch-
point match
59 Set to 1 if a transfer specified to access For the cases in which the HDAR is speci-
CSM does not specify (local) main storage fied above to be set to an effective address,
as the other end of the transfer; otherwise if the interrupt occurs in 32-bit mode the
set to 0. high-order 32 bits of the HDAR are set to 0.
60 Set to 1 if an accelerator is specified as the
source of a copy instruction or an attempt Programming Note
is made to access an accelerator that is not
Note that for HPT translation, the full EA is a super-
properly initialized for the software’s use;
set of the bits required to construct the full VA,
otherwise set to 0.
when also provided with the VSID in the ASDR.
61 Set to 1 if an attempt is made to execute a
Load Atomic or Store Atomic instruction
ASDR When HR||GR=0b00, loaded with VSID, B,
specifying an invalid function code; other-
Ks, Kp, N, C, L, and LP values from the
wise set to 0.
segment descriptor that translated the
62:63 Set to 0.
access or indicated the base of the table, or
HDAR Set to the effective address or portion of the undefined, as described in the following list.
VPN of a storage element, or undefined, as For a large segment the values of the bits
described in the following list. The list below the VSID are undefined. When
should be read from the top down; the HR||GR≠0b00 (nested translaiton is taking
HDAR is set as described by the first item place), loaded with the guest real address
that corresponds to an exception that is down to bit 51 of a storage element or table
reported in the HDSISR. For example, if a entry, or undefined, as described in the fol-
Load Word instruction causes a storage lowing list. (Note that the size of the GRA
protection violation and a Data Address may differ depending on whether the host
Watchpoint match (and both are reported in uses HPT or radix tree translation. The
the HDSISR), the HDAR is set to the effec- GRA is effectively an abbreviated VSID for
tive address of a byte in the first aligned an HPT host, while its size is determined by
doubleword for which access was the maximum size of a radix tree for a guest
attempted in the page that caused the that uses radix tree translation.) The list
exception. should be read from the top down; the
least significant 64 bits of the VA of the ASDR is set as described by the first item
table entry or group when a process that corresponds to an exception that is
table entry or segment table entry reported in the HDSISR.
group virtual address cannot be trans- the guest real page address of the
lated in Paravirtualized HPT mode with table entry when a process table or
VPM1=1. process-scoped page directory or
EA, when a Hypervisor Data Storage page table entry guest real address
exception occurs for reasons other cannot be translated or the VSID of the
than a Data Address Watchpoint table entry when a process or segment
match table entry virtual address cannot be
- a byte in the block that caused the translated (the rest of the segment
exception, for a Cache Manage- descriptor is implied).
ment instruction the guest real address of the pro-
- a byte in the first aligned quad- cess-scoped PDE or PTE or process
word for which access was table entry when a reference or
attempted in the page that caused change bit in the partition-scoped PTE
the exception, for a quadword mapping the process-scoped PDE or
Load or Store instruction (i.e., a PTE or process table entry cannot be
Load or Store instruction for which set atomically
the storage operand is a quad- the guest real address of the storage
word; “first” refers to address element when a reference or change
order: see Section 6.7) bit in the partition-scoped PTE cannot
be set atomically
not be translated to a real address by means of the change bit must be set. It may also be set
virtual real addressing mechanism. as a modifier to bits 36 and 42, to indicate
The fetch violates storage protection. In addition that write authority was required to com-
to the legacy VPM cases, this includes mis- plete the operation.
matches in access authority in which the pro- Others Loaded from the MSR.
cess-scoped PTE permits the access but the
HDAR Set to the least significant 64 bits of the VA
partition-scoped PTE does not. It also includes
of a table entry or group when
lack of necessary authority for accesses to pro-
HR||GR=0b00 and a process table entry or
cess-scoped tables, for example lack of write
segment table entry group virtual address
authority to set a reference bit in the pro-
cannot be translated and VPM1=1. May be
cess-scoped PTE. (In such a case, the “access”
set spuriously in other cases.
reported as failing would be the access to the pro-
cess-scoped table. The HDAR would provide the ASDR When HR||GR=0b00, loaded with VSID, B,
guest real / (abbreviated) virtual address of the Ks, Kp, N, C, L, and LP values from the
table entry.) segment descriptor that translated the
access or indicated the base of the table, or
The following registers are set: undefined, as described in the following list.
HSRR0 Set to the effective address of the instruction For a large segment the values of the bits
that the thread would have attempted to exe- below the VSID are undefined. When
cute next if no interrupt conditions were HR||GR≠0b00 (nested translation is taking
present (if the interrupt occurs on attempting place), set to the guest real address down
to fetch a branch target, HSRR0 is set to the to bit 51 of the instruction or table entry, or
branch target address). undefined, as described in the following
list.
HSRR1 the guest real address of the table
33 Set to 1 if the translation for an attempted entry when a process table or pro-
access is not found in the Page Table; oth- cess-scoped page directory or page
erwise set to 0. table entry guest real address cannot
34 Set to 0. be translated or the VSID of the table
35 Set to 1 if the access is to No-execute (as entry when a process or segment table
indicated by the N bit in the segment table entry virtual address cannot be trans-
entry or the N bit in the HPT PTE or the lated (the rest of the segment descriptor
Execute and Privilege bits in the EAA field is implied).
of the Radix PTE and IAMR key 0) or the guest real address of the pro-
Guarded storage; otherwise set to 0. cess-scoped PDE or PTE or process
36 Set to 1 if the access is not permitted by table entry when a reference or
Figure 43 45, or the privilege, read, or change bit in the partition-scoped PTE
write bits in Figure 44 as appropriate; oth- mapping the process-scoped PDE or
erwise set to 0. PTE or process table entry cannot be
42 Set to 1 if the access is not permitted by vir- set atomically
tual page class key protection; otherwise the guest real address of the instruc-
set to 0. tion when a reference or change bit in
43 Set to 0. the partition-scoped PTE cannot be
44 Set to 1 if an unsupported MMU configura- set atomically
tion is found during the translation process. the guest real address of the instruc-
45 Set to 1 if an attempt to atomically set a ref- tion, process table entry, page direc-
erence or change bit fails; otherwise set to tory entry, or page table entry
0. (depending on which partition-scoped
46 Set to 1 if HR||GR≠0b00 and the virtual / table has the flaw) for an unsupported
guest real address of a page directory radix tree configuration in the parti-
entry, page table entry, or process table tion-scoped table (the effective
entry could not be translated; or address for other cases of the invalid
HR||GR=0b00, VPM1=1, and the virtual MMU configuration exception will be
address of a process table entry or seg- found in HSRR0)
ment table entry group could not be trans- the guest real address of the pro-
lated; otherwise set to 0. cess-scoped PTE when an attempt is
47 Set to 1 if the operation that caused the made to set a reference bit without
exception was attempting to update stor- write authority in the partition-scoped
age; otherwise set to 0. This bit may be set PTE that maps it
as a modifier to bit 45 to indicate that a
6.7 Exception Ordering is that Virtual Page Class Key Storage Protection
exceptions that occur when LPCRKBV=1 and Virtual-
Since multiple exceptions can exist at the same time ized Partition Memory is disabled by VPM1=0 cause
and the architecture does not provide for reporting only a Hypervisor Data Storage exception (and never a
more than one interrupt at a time, the generation of Data Storage exception).
more than one interrupt is prohibited. Some exceptions,
System-Caused or Imprecise
such as the Mediated External exception, persist and
can be deferred. However, other exceptions would be 1. Program
lost if they were not recognized and handled when they - Imprecise Mode Floating-Point Enabled Exception
occur. For example, if an External interrupt was gener- 2. Hypervisor Maintenance
ated when a Data Storage exception existed, the Data 3. Hypervisor Virtualization, External, [Hypervisor]
Storage exception would be lost. If the Data Storage Decrementer, Performance Monitor, Directed Privi-
exception was caused by a Store Multiple instruction for leged Doorbell, Directed Hypervisor Doorbell
which the storage operand crosses a virtual page
boundary and the exception was a result of attempting
to access the second virtual page, the store could have
modified locations in the first virtual page even though it
appeared that the Store Multiple instruction was never
executed.
For the above reasons, all exceptions are prioritized
with respect to other exceptions that may exist at the
same instant to prevent the loss of any exception that is
not persistent. Some exceptions cannot exist at the
same instant as some others.
Data Storage, Hypervisor Data Storage, Data Seg-
ment, and Alignment exceptions and transaction failure
due to attempted access of a disallowed type while in
Transactional state occur as if the storage operand
were accessed one byte at a time in order of increasing
effective address (with the obvious caveat if the oper-
and includes both the maximum effective address and
effective address 0). (The required ordering of excep-
tions on components of non-atomic accesses does not
extend to the performing of the component accesses in
the event of an exception. For example, if byte n
causes a data storage exception, it is not necessarily
true that the access to byte n-1 has been performed.)
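The byte-at-a-time ordering rule above can be sketched as a small model. This is an illustration under stated assumptions: `first_excepting_byte` and the `causes_exception` predicate are hypothetical names, and the 4 KB page size in the example is chosen only for the demonstration. The model reports which byte's exception is recognized first; as the parenthetical above notes, it says nothing about whether accesses to lower-addressed bytes have been performed.

```python
MASK64 = (1 << 64) - 1


def first_excepting_byte(ea, length, causes_exception):
    """Scan the operand one byte at a time in order of increasing effective address.

    causes_exception is a caller-supplied predicate (hypothetical) that says
    whether an access to a given byte address would cause an exception.
    Addresses wrap from the maximum effective address to 0.
    """
    for i in range(length):
        addr = (ea + i) & MASK64
        if causes_exception(addr):
            return addr
    return None


# Example: an 8-byte operand that crosses into an inaccessible second page.
PAGE = 0x1000
second_page_inaccessible = lambda addr: PAGE <= addr < 2 * PAGE
hit = first_excepting_byte(0x0FFC, 8, second_page_inaccessible)
print(hex(hit))
```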
Instruction-Caused and Precise a single instruction would generate are in the order
shown in the list of instruction-caused exceptions.
1. Instruction Segment Exceptions with different numbers have different order-
2. [Hypervisor] Instruction Storage ing. Exceptions with the same numbering but different
3.a Hypervisor Emulation Assistance lettering are mutually exclusive and cannot be caused
3.b Program by the same instruction. The Hypervisor Virtualization,
- Privileged Instruction External, [Hypervisor] Decrementer, Performance Mon-
4. Function-Dependent itor, Directed Privileged Doorbell, and Directed Hypervi-
4.a Fixed-Point and Branch sor Doorbell interrupts have equal ordering. Similarly,
1 Hypervisor Facility Unavailable where Data Storage, Data Segment, and Alignment
2 Facility Unavailable exceptions are listed in the same item they have equal
3a Program ordering.
- Trap
Even on threads that are capable of executing several
- TM Bad Thing
instructions simultaneously, or out of order, instruc-
3b System Call or System Call Vectored
tion-caused interrupts (precise and imprecise) occur in
3c.1 Data Storage for the case of Fixed-Point
program order.
Load or Store Caching Inhibited instructions
with MSRDR=1 or the case of an invalid
function code for an Atomic Memory
Operation
6.8 Event-Based Branch Excep-
3c.2 all other Data Storage, Hypervisor Data tion Ordering
Storage, [Hypervisor] Data Segment, or
Alignment Event-based exceptions are not ordered because they
4 Trace can occur simultaneously. Whenever an event-based
4.b Floating-Point exception occurs and the exception is enabled, the cor-
1 Hypervisor Facility Unavailable responding “exception occurred” bit in the BESCR is
2 FP Unavailable set to 1. See Section 7.2.1 of Book II.
3a Program
- Precise Mode Floating-Pt Enabled Excep’n
3b [Hypervisor] Data Storage, [Hypervisor] Data 6.9 Interrupt Priorities
Segment, or Alignment
4 Trace This section describes the relationship of nonmaskable,
4.c Vector maskable, precise, and imprecise interrupts. In the fol-
1 Hypervisor Facility Unavailable lowing descriptions, the interrupt mechanism waiting for
2 Vector Unavailable all possible exceptions to be reported includes only
3a [Hypervisor] Data Storage, [Hypervisor] Data exceptions caused by previously initiated instructions
Segment, or Alignment (e.g., it does not include waiting for the Decrementer to
4 Trace step through zero). The exceptions are listed in order
4.d VSX of highest to lowest priority. The phrase "corresponding
1 Hypervisor Facility Unavailable interrupt" means the interrupt having the same name
2 VSX Unavailable as the exception unless the thread is in power-saving
3a Program mode, in which case the phrase means the System
- Precise Mode Floating-Pt Enabled Excep’n Reset interrupt.
3b [Hypervisor] Data Storage, [Hypervisor] Data
Unless otherwise stated or obvious from context, it is
Segment, or Alignment
assumed below that one of the following conditions is
4 Trace satisfied.
4.e Other Instructions The thread is not in power-saving mode and the
1 Hypervisor Facility Unavailable interrupt, unless it is the Machine Check inter-
2 Facility Unavailable rupt, is not disabled. (For the Machine Check
3a [Hypervisor] Data Storage, [Hypervisor] Data interrupt no assumption is made regarding
Segment, or Alignment enablement.)
4 Trace
The thread is in power-saving mode and the
For implementations that execute multiple instructions exception is enabled to cause exit from the
in parallel using pipeline or superscalar techniques, or mode.
combinations of these, it can be difficult to understand
the ordering of exceptions. To understand this ordering With one exception, in the following list the hypervisor
it is useful to consider a model in which each instruction forms of the Data Storage and Instruction Storage
is fetched, then decoded, then executed, all before the exceptions can be substituted for the non-hypervisor
next instruction is fetched. In this model, the exceptions forms since the hypervisor forms cannot be caused by
the same instruction and have the same priority. The c. Facility Unavailable
exception is that exceptions caused by Virtual Page d. Data Storage for the case of Fixed-Point
Class Key Storage Protection exceptions that occur Load or Store Caching Inhibited instructions
when LPCRKBV=1 and Virtualized Partition Memory is with MSRDR=1 or the case of an invalid
disabled by VPM1=0 cause only a Hypervisor Data function code for an Atomic Memory
Storage exception (and never a Data Storage excep- Operation
tion). e. all other Data Storage, Hypervisor Data
Storage, [Hypervisor] Data Segment, or
1. System Reset
Alignment
System Reset exception has the highest priority of f. Trace
all exceptions. If this exception exists, the interrupt
B. Floating-Point Loads and Stores
mechanism ignores all other exceptions and gen-
a. Hypervisor Emulation Assistance
erates a System Reset interrupt.
b. Hypervisor Facility Unavailable
Once the System Reset interrupt is generated, no c. Floating-Point Unavailable
nonmaskable interrupts are generated due to d. [Hypervisor] Data Storage, [Hypervisor]
exceptions caused by instructions issued prior to Data Segment, or Alignment
the generation of this interrupt. e. Trace
2. Machine Check C. Vector Loads and Stores
a. Hypervisor Emulation Assistance
Machine Check exception is the second highest
b. Hypervisor Facility Unavailable
priority exception. If this exception exists and a
c. Vector Unavailable
System Reset exception does not exist, the inter-
d. [Hypervisor] Data Storage, [Hypervisor]
rupt mechanism ignores all other exceptions and
Data Segment, or Alignment
generates a Machine Check interrupt.
e. Trace
Once the Machine Check interrupt is generated,
D. VSX Loads and Stores
no nonmaskable interrupts are generated due to
a. Hypervisor Emulation Assistance
exceptions caused by instructions issued prior to
b. Hypervisor Facility Unavailable
the generation of this interrupt.
c. VSX Unavailable
3. Instruction-Caused and Precise d. [Hypervisor] Data Storage, [Hypervisor]
Data Segment, or Alignment
This exception is the third highest priority excep-
e. Trace
tion. When this exception is created, the interrupt
mechanism waits for all possible Imprecise excep- E. Other Floating-Point Instructions
tions to be reported. It then generates the appro- a. Hypervisor Emulation Assistance
priate ordered interrupt if no higher priority b. Hypervisor Facility Unavailable
exception exists when the interrupt is to be gener- c. Floating-Point Unavailable
ated. Within this category a particular instruction d. Program - Precise Mode Floating-Point
may present more than a single exception. When Enabled Exception
this occurs, those exceptions are ordered in prior- e. Trace
ity as indicated in the following lists. Where [Hyper-
F. Other Vector Instructions
visor] Data Storage, Data Segment, and Alignment
a. Hypervisor Emulation Assistance
exceptions are listed in the same item they have
b. Hypervisor Facility Unavailable
equal priority (i.e., the hardware may generate any
c. Vector Unavailable
one of the three interrupts for which an exception
d. Trace
exists). For instructions that are forbidden in
Transactional state, transaction failure takes prior- G. Other VSX Instructions
ity over all interrupts except Privileged Instruction a. Hypervisor Emulation Assistance
type Program Interrupts. For data accesses that b. Hypervisor Facility Unavailable
are forbidden in Transactional state, transaction c. VSX Unavailable
failure has the same priority as the group of “other” d. Program - Precise Mode Floating-Point
[Hypervisor] Data Storage, Data Segment, and Enabled Exception
Alignment exceptions. (See Section 5.3.1 of Book e. Trace
II ).
H. TM instruction, mt/fspr specifying TM SPR
A. Fixed-Point Loads and Stores a. Program - Privileged Instruction (only for
a. These exceptions are mutually exclusive treclaim., trechkpt., and mtspr)
and have the same priority: b. Hypervisor Facility Unavailable
Hypervisor Emulation Assistance c. Facility Unavailable
Program - Privileged Instruction d. Program - TM Bad Thing (only for treclaim.,
b. Hypervisor Facility Unavailable trechkpt., and mtspr)
6.10 Relationship of
Event-Based Branches to Inter-
rupts
6.10.1 EBB Exception Priority
Event-based branches have a priority lower than that of
all interrupts. When an event-based exception is cre-
ated, the Event-Based Branch facility waits for all possi-
ble exceptions that would cause interrupts to be
reported. It then generates the event-based branch if
no exception that would cause an interrupt exists when
the event-based branch is to be generated.
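The relationship described in Sections 6.9 and 6.10.1 can be sketched as a selection function. This is a simplified model, not the architected mechanism: only the three highest-priority classes named in Section 6.9 are ordered here, the ordering among lower-priority classes is deliberately not modeled, and `select_event` is a hypothetical name.

```python
# The three highest-priority classes from Section 6.9, highest first.
TOP_PRIORITY = ["System Reset", "Machine Check", "Instruction-Caused and Precise"]


def select_event(pending_exceptions, ebb_pending):
    """Pick what is generated next from a set of pending exception classes.

    Returns ("interrupt", name), ("ebb", None), or (None, None).
    """
    for name in TOP_PRIORITY:
        if name in pending_exceptions:
            return ("interrupt", name)
    if pending_exceptions:
        # Any exception that would cause an interrupt outranks an event-based branch.
        return ("interrupt", "lower-priority class (ordering not modeled)")
    if ebb_pending:
        return ("ebb", None)
    return (None, None)


print(select_event({"Machine Check", "System Reset"}, ebb_pending=True))
print(select_event(set(), ebb_pending=True))
```

The event-based branch is selected only when the set of interrupt-causing exceptions is empty, which mirrors the rule that the facility waits for all such exceptions to be reported first.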
Programming Note
If software initializes the Time Base on power-on to
some reasonable value and the update frequency
of the Time Base is constant, the Time Base can
be used as a source of values that increase at a
constant rate, such as for time stamps in trace
entries.

Even if the update frequency is not constant, values
read from the Time Base are monotonically
increasing (except when the Time Base wraps from
2^64-1 to 0). If a trace entry is recorded each time
the update frequency changes, the sequence of
Time Base values can be post-processed to
become actual time values.

Successive readings of the Time Base may return
identical values.

    mftb Ry # Read 64-bit Time Base value
    clrldi Ry,Ry,40 # lower 24 bits of old TB
    mttbu40 Rx # write upper 40 bits of TB
    mftb Rz # read TB value again
    clrldi Rz,Rz,40 # lower 24 bits of new TB
    cmpld Rz,Ry # compare new and old lwr 24
    bge done # no carry out of low 24 bits
    addis Rx,Rx,0x0100 # increment upper 40 bits
    mttbu40 Rx # update to adjust for carry
done:

Programming Note
The instructions for writing the Time Base are
mode-independent. Thus code written to set the
Time Base will work correctly in either 64-bit or
32-bit mode.
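The carry adjustment in the mttbu40 sequence can be checked with a small software model. This is a sketch under an explicit assumption: writing TBU40 is modeled as replacing the upper 40 bits of the Time Base while leaving the lower 24 bits unchanged. The class `TimeBaseModel`, the function `set_upper40`, and the `ticks_between` parameter are hypothetical names used only to inject Time Base increments between steps.

```python
MASK64 = (1 << 64) - 1
LOW24 = (1 << 24) - 1
UPPER40 = MASK64 & ~LOW24


class TimeBaseModel:
    """Toy 64-bit Time Base; not a description of any implementation."""

    def __init__(self, value=0):
        self.value = value & MASK64

    def tick(self, n=1):
        self.value = (self.value + n) & MASK64

    def mftb(self):
        return self.value

    def mttbu40(self, rs):
        # Assumed semantics: upper 40 bits come from rs, lower 24 bits are preserved.
        self.value = (rs & UPPER40) | (self.value & LOW24)


def set_upper40(tb, rx, ticks_between=(0, 0)):
    """Mirror the assembly sequence: write, re-read, and adjust for a lost carry."""
    ry = tb.mftb() & LOW24            # mftb Ry ; clrldi Ry,Ry,40
    tb.tick(ticks_between[0])
    tb.mttbu40(rx)                    # mttbu40 Rx
    tb.tick(ticks_between[1])
    rz = tb.mftb() & LOW24            # mftb Rz ; clrldi Rz,Rz,40
    if rz < ry:                       # cmpld Rz,Ry ; bge done
        rx = (rx + (1 << 24)) & MASK64    # addis Rx,Rx,0x0100
        tb.mttbu40(rx)                # mttbu40 Rx
    return tb.mftb()


# The low 24 bits wrap just before the first write, so the carry would be lost.
wrapped_before = set_upper40(TimeBaseModel(0x00FF_FFFF), 5 << 24, ticks_between=(1, 0))
# The low 24 bits wrap just after the first write; the second write is harmless.
wrapped_after = set_upper40(TimeBaseModel(0x00FF_FFFF), 5 << 24, ticks_between=(0, 1))
print(hex(wrapped_before), hex(wrapped_after))
```

In this model both orderings end with the same value, because the second write stores an absolute value for the upper 40 bits and does not add to whatever is already there.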
If Time Base bits 60:63 are used as part of a ran-
dom number generator, software must account for
7.3 Virtual Time Base
the fact that these bits are set to 0x0 only when bit The Virtual Time Base (VTB) is a 64-bit incrementing
59 changes state regardless of whether or not they counter.
incremented to 0xF since they were previously set
to 0x0.
VTB
See the description of the Time Base in Chapter 6 0 63
of Book II for ways to compute time of day in POSIX
format from the Time Base. Figure 67. Virtual Time Base
Virtual Time Base increments at the same rate as the
Time Base until its value becomes
7.2.1 Writing the Time Base 0xFFFF_FFFF_FFFF_FFFF (2^64 - 1); at the next
increment its value becomes 0x0000_0000_0000_0000.
Writing the Time Base is privileged, and can be done There is no interrupt or other indication when this
only in hypervisor state. Reading the Time Base is not occurs.
privileged; it is discussed in Chapter 6 of Book II.
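Because the Virtual Time Base wraps from 0xFFFF_FFFF_FFFF_FFFF to 0 without any interrupt or other indication, software that measures intervals has to allow for the wrap itself. The following sketch models the counter as a 64-bit unsigned value; the helper names are illustrative, and `elapsed` assumes the counter has wrapped at most once between the two readings.

```python
MASK64 = (1 << 64) - 1


def increment(counter, amount=1):
    """Advance a 64-bit incrementing counter, wrapping modulo 2**64."""
    return (counter + amount) & MASK64


def elapsed(earlier, later):
    """Ticks between two readings, assuming at most one wrap in between."""
    return (later - earlier) & MASK64


start = MASK64 - 2                # three ticks before the counter wraps to 0
end = increment(start, 8)         # the counter wraps during the interval
print(hex(end), elapsed(start, end))
```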
The operation of the Virtual Time Base has the follow-
It is not possible to write the entire 64-bit Time Base ing additional properties.
using a single instruction. The mttbl and mttbu
extended mnemonics write the lower and upper halves 1. Loading a GPR from the Virtual Time Base has no
of the Time Base (TBL and TBU), respectively, preserv- effect on the accuracy of the Virtual Time Base.
ing the other half. These are extended mnemonics for 2. Copying the contents of a GPR to the Virtual Time
the mtspr instruction; see Figure 17. Base replaces the contents of the Virtual Time
The Time Base can be written by a sequence such as: Base with the contents of the GPR.
7.4.1 Writing and Reading the The operation of the Hypervisor Decrementer has the
following additional properties.
Decrementer
1. Loading a GPR from the Hypervisor Decrementer
The contents of the Decrementer can be read or written has no effect on the accuracy of the Hypervisor
using the mfspr and mtspr instructions, both of which Decrementer.
are privileged when they refer to the Decrementer.
2. Copying the contents of a GPR to the Hypervisor
Using an extended mnemonic (Figure 17), the Decre-
Decrementer replaces the contents of the Hypervi-
menter can be written from GPR Rx using:
sor Decrementer with the contents of the GPR.
mtdec Rx
Programming Note
The Decrementer can be read into GPR Rx using:
In systems that change the Time Base update fre-
quency for purposes such as power management,
mfdec Rx
the Hypervisor Decrementer update frequency will
Copying the Decrementer to a GPR has no effect on also change. Software must be aware of this in
the Decrementer contents or on the interrupt mecha- order to set interval timers.
nism.
If Hypervisor Decrementer bits 60:63 are used as
part of a random number generator, software must
7.5 Hypervisor Decrementer account for the fact that these bits are set to 0xF
only when bit 59 changes state regardless of
The Hypervisor Decrementer is an h-bit decrementing whether or not they decremented to 0x0 since they
counter that is sign-extended to 64 bits. The value of h were previously set to 0xF.
is implementation dependent, however the number of
bits supported by the Hypervisor Decrementer must be Programming Note
greater than or equal to the number of bits supported
by the Decrementer. When the Decrementer is written, A Hypervisor Decrementer exception is not created
bits 0:63-h are ignored by the hardware. if the thread is in a power-saving mode when
HDEC0 changes from 0 to 1 because having a
Hypervisor Decrementer interrupt occur almost
Programming Note
immediately after exiting the power-saving mode in
The maximum positive value supported by the this case is deemed unnecessary. The hypervisor
Hypervisor Decrementer is 2^(h-1) - 1, represented with already has control, and if a timed exit from the
bits 0:64-h containing 0’s and bits 65-h:63 contain- power-saving mode is necessary and possible, the
ing 1’s. The minimum value supported by the hypervisor can use the Decrementer to exit the
Hypervisor Decrementer is -2^(h-1), represented as power-saving mode at the appropriate time. For
0xFFFF_FFFF_FFFF_FFFF. some power-saving levels, the state of the Hyper-
visor Decrementer and Decrementer is not neces-
The binary value of the Hypervisor Decrementer counts sarily maintained and updated.
down until its value becomes
0x0000_0000_0000_0000; at the next decrement its
value becomes the minimum value supported, which is
represented as 0xFFFF_FFFF_FFFF_FFFF. 7.6 Processor Utilization of
When the contents of HDEC0 change from 0 to 1 and Resources Register (PURR)
the thread is not in a power-saving mode, a Hypervisor
Decrementer exception will come into existence within The Processor Utilization of Resources Register
a reasonable period of time. When a Hypervisor Decre- (PURR) is a 64-bit counter, the contents of which pro-
menter interrupt occurs, the existing Hypervisor Decre- vide an estimate of the resources used by the thread.
menter exception will cease to exist within a reasonable The contents of the PURR are treated as a 64-bit
period of time, but not later than the completion of the unsigned integer.
next context synchronizing instruction or event. Even if
multiple HDEC0 transitions from 0 to 1 occur PURR
before a Hypervisor Decrementer interrupt occurs, at 0 63
most one Hypervisor Decrementer exception exists. Figure 69. Processor Utilization of Resources
The preceding paragraph applies regardless of whether Register
the change in the contents of HDEC0 is the result of The PURR is a hypervisor resource; see Chapter 2.
decrementation of the Hypervisor Decrementer by the
hardware or of modification of the Hypervisor Decre- The contents of the PURR increase monotonically,
menter caused by execution of an mtspr instruction. unless altered by software, until the sum of the contents
plus the amount by which it is to be increased exceeds thread. The contents of the SPURR are treated as a
0xFFFF_FFFF_FFFF_FFFF (2^64 - 1) at which point the 64-bit unsigned integer.
contents are replaced by that sum modulo 2^64. There
is no interrupt or other indication when this occurs. SPURR
0 63
The rate at which the value represented by the contents
of the PURR increases is an estimate of the portion of Figure 70. Scaled Processor Utilization of
resources used by the thread per unit time with respect Resources Register
to other threads that share those resources monitored
by the PURR. When the thread is idle, the rate at The SPURR is a hypervisor resource; see Section 2.6.
which the PURR value increases is implementation The contents of the SPURR increase monotonically,
dependent. unless altered by software, until the sum of the contents
Let the difference between the value represented by plus the amount by which it is to be increased exceed
the contents of the Time Base at times Ta and Tb be 0xFFFF_FFFF_FFFF_FFFF (264 - 1) at which point the
Tab. Let the difference between the value represented contents are replaced by that sum modulo 264. There
by the contents of the PURR at time Ta and Tb be the is no interrupt or other indication when this occurs.
value Pab. The ratio of Pab/Tab is an estimate of the per- The rate at which the value represented by the contents
centage of shared resources used by the thread during of the SPURR increases is an estimate of the portion of
the interval Tab. For the set {S} of threads that share resources used by the thread with respect to other
the resources monitored by the PURR, the sum of the threads that share those resources monitored by the
usage estimates for all the threads in the set is 1.0. SPURR, and relative to the computational capacity pro-
The definition of the set of threads S, the shared vided by those resources. The computational capacity
resources corresponding to the set S, and specifics of provided by the shared resources may vary as a func-
the algorithm for incrementing the PURR are imple- tion of the frequency of the oscillator which drives the
mentation-specific. resources or as a result of deliberate delays in process-
ing that are created to reduce power consumption.
The PURR is implemented such that: When the thread is idle, the rate at which the SPURR
1. Loading a GPR from the PURR has no effect on value increases is implementation dependent.
the accuracy of the PURR. Let the difference between the value represented by
2. Copying the contents of a GPR to the PURR the contents of the Time Base at times Ta and Tb be
replaces the contents of the PURR with the con- Tab. Let the ratio of the effective and nominal frequen-
tents of the GPR. cies of the oscillator driving instruction execution fe/fn
be fr. Let the ratio of delay cycles created by power
Programming Note reduction circuitry and total cycles cd/ct be cr. Let the
difference between the value represented by the con-
Estimates computed as described above may be
tents of the SPURR at time Ta and Tb be the value Sab.
useful for purposes related to resource utilization,
The ratio of Sab/(Tab x fr x (1 - cr)) is an estimate of the
including utilization-based system management
percentage of shared resource capacity used by the
and planning.
thread during the interval Tab. For the set {S} of threads
Because the rate at which the PURR accumulates that share the resources monitored by the SPURR, the
resource usage estimates is dependent on the fre- sum of the usage estimates for all the threads in the set
quency at which the Time Base is incremented, is 1.0.
and the frequency of the oscillator that drives
The definition of the set of threads S, the shared
instruction execution may vary independently from
resources corresponding to the set S, and specifics of
that of the Time Base, the interpretation of the con-
the algorithm for incrementing the SPURR are imple-
tents of the PURR may be inaccurate as a mea-
mentation-specific.
surement of capacity consumption for accounting
purposes. The SPURR should be used for The SPURR is implemented such that:
accounting purposes.
1. Loading a GPR from the SPURR has no effect on
the accuracy of the SPURR.
2. Copying the contents of a GPR to the SPURR
7.7 Scaled Processor Utilization replaces the contents of the SPURR with the con-
of Resources Register (SPURR) tents of the GPR.
Programming Note
Estimates computed as described above may be
useful for purposes of resource use accounting,
program dispatching, etc.
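
The two ratios defined above, Pab/Tab for the PURR and Sab/(Tab x fr x (1 - cr)) for the SPURR, can be illustrated with a short sketch. It is not part of the architecture: the function names are hypothetical, and the register samples, fr, and cr are assumed to have been obtained by software already (how they are read is outside the sketch). Differences are taken modulo 2^64 because the counters wrap silently; this gives the right difference only if a counter wrapped at most once between the two samples.

```python
MASK64 = (1 << 64) - 1  # 0xFFFF_FFFF_FFFF_FFFF


def delta64(start, end):
    """Difference end - start for a 64-bit counter that wraps modulo 2^64."""
    return (end - start) & MASK64


def purr_utilization(purr_a, purr_b, tb_a, tb_b):
    """Estimate Pab/Tab from PURR and Time Base samples taken at Ta and Tb."""
    t_ab = delta64(tb_a, tb_b)
    if t_ab == 0:
        raise ValueError("Time Base did not advance between the samples")
    return delta64(purr_a, purr_b) / t_ab


def spurr_utilization(spurr_a, spurr_b, tb_a, tb_b, f_r, c_r):
    """Estimate Sab / (Tab x fr x (1 - cr)).

    f_r is the ratio fe/fn of effective to nominal frequency and c_r is the
    ratio cd/ct of delay cycles to total cycles; both are inputs here.
    """
    capacity = delta64(tb_a, tb_b) * f_r * (1.0 - c_r)
    if capacity <= 0:
        raise ValueError("no computational capacity in the interval")
    return delta64(spurr_a, spurr_b) / capacity
```

For example, a PURR difference of 250 over a Time Base difference of 1000 gives an estimate of 0.25 for the thread.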
IC
0 63
8.1 Overview

Implementations provide debug facilities to enable hardware and software debug functions, such as control flow tracing, data address watchpoints, and program single-stepping. The debug facilities described in this section consist of the Come-From Address Register (see Section 8.2), Completed Instruction Address Breakpoint Register (see Section 8.3), and the Data Address Watchpoint Register (DAWRn) and Data Address Watchpoint Register Extension (DAWRXn) (see Section 8.4). The interrupt associated with the Data Address Breakpoint registers is described in Section 6.5.3. The interrupt associated with the Completed Instruction Address Breakpoint Register is described in Section 6.5.15. The Trace facility, which can be used for single-stepping as well as for control flow tracing, is described in Section 6.5.15.

The mfspr and mtspr instructions (see Section 4.4.5) provide access to the registers of the debug facilities.

In addition to the facilities mentioned above, implementations typically provide debug facilities, modes, and access mechanisms that are implementation-specific. For example, implementations typically provide facilities for instruction address tracing, and also access to certain debug facilities via a dedicated interface such as the IEEE 1149.1 Test Access Port (JTAG).

8.2 Come-From Address Register

The Come-From Address Register (CFAR) is a 64-bit register. When an rfebb, rfid, or rfscv instruction is executed, the register is set to the effective address of the instruction. When a Branch instruction is executed and the branch is taken, the register is set to the effective address of an instruction in the instruction cache block containing the Branch instruction, except that if the Branch instruction is a B-form Branch (i.e., bc, bca, bcl, or bcla) for which the target address is in the instruction cache block containing the Branch instruction or is in the previous or next cache block, the register is not necessarily set. For Branch instructions, the setting need not occur until a subsequent context synchronizing operation has occurred.

CFAR //
0 62 63
Figure 72. Come-From Address Register

The contents of the CFAR can be read and written using the mfspr and mtspr instructions. Access to the CFAR is privileged.

Programming Note
This register can be used for purposes of debugging software. For example, often a software bug results in the program executing a portion of the code that it should not have reached or causing an unexpected interrupt. In the former case, a breakpoint can be placed in the portion of the code that was erroneously reached and the program reexecuted. In either case, the interrupt handler can save the contents of the CFAR (before executing the first instruction that would modify the register), and then make the saved contents available for a debugger to use in determining the control flow path by which the exception was reached.

In order to preserve the CFAR's contents for each partition and to prevent it from being used to implement a "covert channel" between partitions, the hypervisor should initialize/save/restore the CFAR when switching partitions on a given thread.

8.3 Completed Instruction Address Breakpoint

The Completed Instruction Address Breakpoint mechanism provides a means of detecting an instruction completion at a specific instruction address. The address comparison is done on an effective address (EA).

The Completed Instruction Address Breakpoint mechanism is controlled by the Completed Instruction
Address Breakpoint Register (CIABR), shown in Figure 74.

The Data Address Watchpoint mechanism provides a means of detecting load and store accesses to a range of addresses starting at a designated doubleword. The address comparison is done on an effective address (EA).

Programming Note
The Data Address Watchpoint mechanism employs a simple EA compare. It makes no attempt to take the radix table translation quadrants (keyed off EA0:1) into account to enable a single setting to work in all privilege levels.

Figure 74, and the Data Address Watchpoint Register Extension (DAWRXn), shown in Figure 75.

All other fields are reserved.

Figure 75. Data Address Watchpoint Register Extension

The supported PRIVM values are 0b000, 0b001, 0b010, 0b011, 0b100, and 0b111. If the PRIVM field does not contain one of the supported values, then whether a match occurs for a given storage access is undefined. Elsewhere in this section it is assumed that the PRIVM field contains one of the supported values.
Programming Note
PMC5 and PMC6 are defined to facilitate calculat-
ing basic performance metrics such as cycles per
instruction (CPI).
9.4.4 Monitor Mode Control Register 0

Monitor Mode Control Register 0 (MMCR0) is a 64-bit register as shown below.

MMCR0
0 63
Figure 77. Monitor Mode Control Register 0

MMCR0 is used to control multiple functions of the Performance Monitor. Some fields of MMCR0 are altered by the hardware when various events occur.

The following notation is used in the definitions below. “PMCs” refers to PMCs 1 - n and “PMCj” refers to PMCj, where 2 ≤ j ≤ n. n=4 when MMCR0PMCC=0b11 and n=6 otherwise.

When MMCR0PMCC is set to 0b10 or 0b11, providing problem state programs read/write access to MMCR0, only FC, PMAE, PMAO can be accessed. All other bits are not changed when mtspr is executed in problem state, and all other bits return 0s when mfspr is executed in problem state.

1 The PMCs are not incremented, and entries are not written into the BHRB, if MSRHV PR=0b00.

34 Conditionally Freeze Counters and BHRB in Problem State (FCP)
If the value of bit 51 (FCPC) is 0, this field has the following meaning.
0 The PMCs are incremented (if permitted by other MMCR bits) and entries are written into the BHRB (if permitted by the BHRB Instruction Filtering Mode field in MMCRA).
1 The PMCs are not incremented, and entries are not written into the BHRB, if MSRPR=1.
If the value of bit 51 (FCPC) is 1, this field has the following meaning.
0 The PMCs are not incremented, and entries are not written into the BHRB, if MSRHV PR=0b01.
1 The PMCs are not incremented, and entries are not written into the BHRB, if MSRHV PR=0b11.
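
The way FCP (bit 34) and FCPC (bit 51) combine can be sketched as a small decision function. This is only an illustration under stated assumptions: the function and constant names are hypothetical, the MSR HV and PR bits are passed in as plain integers, and only the FCP/FCPC condition is modelled, not the other MMCR0 bits that can also freeze the counters. Bits are numbered with bit 0 as the most significant bit of the 64-bit register.

```python
FCP_BIT = 34   # Conditionally Freeze Counters and BHRB in Problem State
FCPC_BIT = 51  # selects which meaning of FCP applies


def bit(reg, n):
    """Return bit n of a 64-bit value, where bit 0 is the most significant bit."""
    return (reg >> (63 - n)) & 1


def frozen_by_fcp(mmcr0, msr_hv, msr_pr):
    """True if the FCP/FCPC setting stops the PMCs and BHRB for this MSR state."""
    fcp = bit(mmcr0, FCP_BIT)
    fcpc = bit(mmcr0, FCPC_BIT)
    if fcpc == 0:
        # FCP=0: no freeze from this field; FCP=1: freeze when MSR PR = 1.
        return fcp == 1 and msr_pr == 1
    if fcp == 0:
        # FCPC=1, FCP=0: freeze when MSR HV PR = 0b01.
        return (msr_hv, msr_pr) == (0, 1)
    # FCPC=1, FCP=1: freeze when MSR HV PR = 0b11.
    return (msr_hv, msr_pr) == (1, 1)
```

With FCPC=0 the field ignores HV entirely; with FCPC=1 the two FCP encodings select between problem state with HV=0 and problem state with HV=1.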
Load or Store instruction to the point at which it has reported all exceptions it will cause.
F6 The thread has failed to locate an ERAT entry during instruction address translation.
F8 A cycle has occurred during which all previously initiated instructions have completed and no instructions are available for initiation.
FA A cycle has occurred during which the RUN bit of the CTRL register for one or more threads of the multi-threaded processor was set to 1.
FC A load type instruction finished. If the instruction caused more than one reference, only one will be counted.
FE The thread has completed an instruction.

40:47 PMC2 Selector (PMC2SEL)
The value of PMC2SEL specifies the event to be counted by PMC2 as defined below. All values in the range of E0 - FF that are not specified below are reserved.

Hex Event
00 Disable events. (No events occur.)
01-BF Implementation-dependent
C0-DF Reserved

The following events can occur only when random sampling is enabled (MMCRASE=1). The sampling modes corresponding to each event are listed in parentheses. (The sampling mode is specified in MMCRASM.)
E0 The thread has obtained the data for a randomly sampled Load instruction from storage that did not reside in any cache. (RIS, RLS)
E2 The thread has failed to locate the data for a randomly sampled Load instruction in the primary data cache. (RIS, RLS)
E4 The thread filled a block in the primary data cache with data that were accessed by a randomly sampled Load instruction and obtained from a location other than the secondary or tertiary cache. (RIS, RLS)
E6 The threshold event counter has exceeded the number of events corresponding to threshold B (see Table 4). (RIS, RLS, RBS)
E6 The threshold event counter has exceeded the number of events corresponding to threshold F (see Table 4). (RIS, RLS, RBS)

The following events can occur regardless of whether random sampling is enabled.
F0 The thread has completed a Store instruction to the point at which it has reported all the exceptions it will cause.
F2 The thread has dispatched an instruction.
F4 A cycle has occurred during which the RUN bit of the thread’s CTRL register contained 1.
F6 The thread has failed to locate an ERAT entry during data address translation, and a new ERAT entry corresponding to the data effective address has been written.
F8 An external interrupt for the thread has occurred.
FA The thread has completed a Branch instruction for which the branch was taken.
FC The thread has failed to locate an instruction in the primary cache.
FE The thread has filled a block in the primary data cache with data that were accessed by a Load instruction and obtained from a location other than the secondary cache.

48:55 PMC3 Selector (PMC3SEL)
The value of PMC3SEL specifies the event to be counted by PMC3 as defined below. All values in the range of E0 - FF that are not specified below are reserved.

Hex Event
00 Disable events. (No events occur.)
01-BF Implementation-dependent
C0-DF Reserved

The following events can occur only when random sampling is enabled (MMCRASE=1). The sampling modes corresponding to each event are listed in parentheses. (The sampling mode is specified in MMCRASM.)
E2 The thread has completed a randomly sampled Store instruction to the point at which it has reported all exceptions it will cause. (RIS, RLS)
E4 The thread has mispredicted either whether or not the branch would be taken, or if taken, the target address of a randomly sampled Branch instruction. (RIS, RBS)
E6 The thread has failed to locate an ERAT entry during data address translation for a randomly sampled instruction. (RIS, RLS)
E8 The threshold event counter has exceeded the number of events corresponding to threshold C (see Table 4). (RIS, RLS, RBS)
EA The threshold event counter has exceeded the number of events corresponding to threshold G (see Table 4). (RIS, RLS, RBS)
The following events can occur regardless of whether random sampling is enabled.
F0 The thread has attempted to store data in the primary data cache but no block corresponding to the real address existed.
F2 The thread has dispatched an instruction.
F4 The thread has completed an instruction when the RUN bit of the CTRL register for all threads on the multi-threaded processor contained 1.
F6 The thread has filled a block in the primary data cache with data that were accessed by a Load instruction.
F8 A Time Base transition event has occurred for the thread. This event is counted regardless of whether or not Time Base transition events are enabled by MMCR0TBEE.
FA The thread has loaded an instruction from a higher level cache than the tertiary cache.
FC The thread was unable to translate a data virtual address using the TLB.
FE The thread has filled a block in the primary data cache with data that were accessed by a Load instruction and obtained from a location other than the secondary or tertiary cache.

56:63 PMC4 Selector (PMC4SEL)
The value of PMC4SEL specifies the event to be counted by PMC4 as defined below. All values in the range of E0 - FF that are not specified below are reserved.

Hex Event
00 Disable events. (No events occur.)
01-BF Implementation-dependent
C0-DF Reserved

The following events can occur only when random sampling is enabled (MMCRASE=1). The sampling modes corresponding to each event are listed in parentheses. (The sampling mode is specified in MMCRASM.)
E0 The thread has completed a randomly sampled instruction. (RIS, RLS, RBS)
E4 The thread was unable to translate a data virtual address using the TLB for a randomly sampled instruction. (RIS, RLS)
E6 The thread has loaded a randomly sampled instruction from a higher level cache than the tertiary cache. (RIS)
E8 The thread has filled a block in the primary data cache with data that were accessed by a randomly sampled Load instruction and obtained from a location other than the secondary cache. (RIS, RLS)
EA The threshold event counter has exceeded the number of events corresponding to threshold D (see Table 4). (RIS, RLS, RBS)
EC The threshold event counter has exceeded the number of events corresponding to threshold H (see Table 4). (RIS, RLS, RBS)

The following events can occur regardless of whether random sampling is enabled.
F0 The thread has attempted to load data from the primary data cache but no block corresponding to the real address existed.
F2 A cycle has occurred during which the thread has dispatched one or more instructions.
F4 A cycle has occurred during which the PURR was incremented when the RUN bit of the thread’s CTRL register contained 1.
F6 The thread has mispredicted either whether or not the branch would be taken, or if taken, the target address of a Branch instruction.
F8 The thread has discarded prefetched instructions.
FA The thread has completed an instruction when the RUN bit of the thread’s CTRL register contained 1.
FC The thread was unable to translate an instruction virtual address using the TLB, and a new TLB entry corresponding to the instruction virtual address has been written.
FE The thread has obtained the data for a Load instruction from storage that did not reside in any cache.

The following event must be supported with an implementation-dependent event code.

The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. (See Section 3.2.4 of Book I.)

This event only occurs when the Load Monitored Facility and Event-Based Branch Facility are enabled (FSCRLM EBB= 0b11).

Programming Note
This event can be used to measure the frequency of accesses by ldmx to an enabled section of the Load Monitored region. It would typically be used when Load Monitored event-based branch exceptions are disabled (i.e. BESCRLME=0) so that Load Monitored event-based branches do not occur during the measurement process.
MMCRA
0 63

MMCRA gives privileged programs the ability to control the sampling process, BHRB filtering, and threshold events.

When MMCR0PMCC is set to 0b10 or 0b11, providing problem state programs read/write access to MMCRA, the Threshold Event Counter Exponent (TECX) and Threshold Event Counter Multiplier (TECM) fields are read-only, and all other fields return 0s, when mfspr is executed in problem state; no fields are changed when mtspr is executed in problem state.

Programming Note
Read/write access is provided to MMCRA in problem state (SPR 770) when MMCR0PMCC = 0b10 or 0b11 even though no fields can be modified by mtspr because future versions of the architecture may allow various fields of MMCRA to be modified in problem state.

The bit definitions of MMCRA are as follows.

Bit(s) Description
0:26 Problem state access (SPR 770)
Reserved
Privileged access (SPR 770 or 786)
Implementation-dependent

Bits 27:33 are referred to as the “filtering fields.” These fields control the filter criterion used by the hardware when recording Branch instructions in the BHRB. See Section 9.5.1 for specifications of the terminology used and an example of their usage.

27 Filter Direct Branch Instructions (FD)
0 Enter direct Branch instructions into the BHRB if allowed by other filtering fields.
1 Do not enter direct Branch instructions into the BHRB.

28 Filter Unconditional Branch Instructions (FU)
0 Enter unconditional Branch instructions into the BHRB if allowed by other filtering fields.
1 Do not enter unconditional Branch instructions into the BHRB.

29 Filter Call Instructions (FC)
0 Enter call instructions into the BHRB if allowed by other filtering fields.
1 Do not enter call instructions into the BHRB.

30 Filter Return Instructions (FR)
0 Enter return instructions into the BHRB if allowed by other filtering fields.
1 Do not enter return instructions into the BHRB.

31 Filter Jump Instructions (FJ)
0 Enter jump instructions into the BHRB if allowed by other filtering fields.
1 Do not enter jump instructions into the BHRB.

32:33 BHRB Instruction Filtering Mode (IFM)
00 All taken Branch instructions are entered into the BHRB unless prevented by other filtering fields.
01 Do not enter jump instructions or return instructions into the BHRB.
10 Do not enter unconditional Branch instruction addresses into the BHRB, and enter the target address only if the instruction is indirect.
11 Filter as defined for IFM=10; additionally, do not enter conditional Branch instruction addresses into the BHRB if BO0=1 or if the “a” bit in the BO field is set to 1, and enter the target address only if the instruction is indirect.

Programming Note
Filtering mode 11 provides additional filtering for instructions that provide a hint or for which the outcome does not depend on the value of the CR.

34:36 Threshold Event Counter Exponent (TECX)
This field specifies the exponent of the threshold event counter value. See Section 9.4.3 for additional information. The maximum exponent supported is at least 5.

37 Reserved

38:44 Threshold Event Counter Multiplier (TECM)
This field specifies the multiplier of the threshold event counter value. See Section 9.4.3 for additional information.
57:59 Eligibility for Random Sampling (ES)
When random sampling is enabled (MMCRASE=1) and the SM field indicates random instruction sampling (RIS), the encodings of this field specify the instructions that are eligible to be sampled as follows.
000 All instructions
001 All Load and Store instructions
010 All probe no-op instructions
011 Reserved
The definition of the following values depends on whether the access to MMCRA is in problem state or in privileged state.
Problem state access (SPR 770)
100 - 111 - Reserved
Privileged access (SPR 770 or 786)
100 - 111 - Implementation-dependent

When random sampling is enabled (MMCRASE=1) and the SM field indicates random Load/Store Facility sampling (RLS), the encodings of this field specify the instructions that are eligible to be sampled as follows.
000 Instructions for which the thread has attempted to load data from the data cache but no block corresponding to the real address existed.
001 Reserved
010 Reserved
011 Reserved
The definition of the following values depends on whether the access to MMCRA is in problem state or in privileged state.
Problem state access (SPR 770)
100 - 111 - Reserved
Privileged access (SPR 770 or 786)
100 - 111 - Implementation-dependent

When random sampling is enabled (MMCRASE=1) and the SM field indicates random Branch Facility sampling (RBS), the encodings of this field specify the instructions that are eligible to be sampled as follows.
000 Instructions for which the thread has either mispredicted whether or not the branch would be taken, or if taken, the target address of a Branch instruction.
001 Instructions for which the thread has mispredicted whether or not the branch of a Branch instruction would be taken because the contents of the Condition Register differed from the predicted contents.
010 Instructions for which the thread has mispredicted the target address of a Branch instruction.
011 All Branch instructions for which the branch was taken.
The definition of the following values depends on whether the access to MMCRA is in problem state or in privileged state.
Problem state access (SPR 770)
100 - 111 - Reserved
Privileged access (SPR 770 or 786)
100 - 111 - Implementation-dependent

60 Reserved

61:62 Random Sampling Mode (SM)
00 Random Instruction Sampling (RIS) - Instructions that meet the criterion specified in the ES field for random instruction sampling are eligible to be sampled.
01 Random Load/Store Facility Sampling (RLS) - Instructions that meet the criterion specified in the ES field for random Load/Store Facility sampling are eligible for sampling.
10 Random Branch Facility Sampling (RBS) - Instructions that meet the criterion specified in the ES field for random Branch Facility sampling are eligible for sampling.
11 Reserved

63 Random Sampling Enable (SE)
0 Random sampling is disabled.
1 Random sampling is enabled.
See Section 9.4.2.1 for information about random sampling.

9.4.8 Sampled Instruction Address Register

The Sampled Instruction Address Register (SIAR) is a 64-bit register.

SIAR
0 63
Figure 81. Sampled Instruction Address Register

When a Performance Monitor alert occurs because of an event caused by execution of a randomly sampled instruction, the SIAR contains the effective address of the instruction if SIERSIARV = 1 and contains an undefined value if SIERSIARV = 0.

When a Performance Monitor alert occurs because of an event other than an event caused by execution of a randomly sampled instruction, the SIAR contains the effective address of an instruction that was being executed, possibly out-of-order, at or around the time that the Performance Monitor alert occurred.

The instruction located at the effective address contained in the SIAR is called the “sampled instruction”.

9.4.10 Sampled Instruction Event Register

The Sampled Instruction Event Register (SIER) is a 64-bit register.
000 The instruction did not require data address translation.
001 The thread translated the data virtual address using the TLB.
010 A PTEG required for data address translation for the instruction was obtained from the secondary cache.
011 A PTEG required for data address translation for the instruction was obtained from the tertiary cache.
100 A PTEG required for data address translation for the instruction was obtained from storage that did not reside in any cache.
101 A PTEG required for data address translation for the instruction was obtained from a cache on a different multi-threaded processor that resides on the same chip as the thread.
110 A PTEG required for data address translation for the instruction was obtained from a cache on a different chip from the thread.
111 Reserved

60:62 Sampled Instruction Data Storage Access Information (SIDSAI)
This field contains information about data storage accesses made by the sampled instruction. The values and their meanings are as follows.
000 The instruction did not require data address translation.
001 The instruction was a Read for which the thread obtained the referenced data from the primary data cache.
010 The instruction was a Read for which the thread obtained the referenced data from the secondary cache.
011 The instruction was a Read for which the thread obtained the referenced data from the tertiary cache.
100 The instruction was a Read for which the thread obtained the referenced data from storage that did not reside in any cache.
101 The instruction was a Read for which the thread obtained the referenced data from a cache on a different multi-threaded processor that resides on the same chip as the thread.
110 The instruction was a Read for which the thread obtained the referenced data from a cache on a different chip from the thread.
111 The instruction was a Store for which the data were placed into a location other than the primary data cache.

63 Sampled Instruction Completed (SICMPL)
Set to 1 if the sampled instruction has completed; otherwise set to 0.

9.5 Branch History Rolling Buffer

The Branch History Rolling Buffer (BHRB) is described in Section 2.4 of Book I but only at the level required by application programmers. Additional aspects of the BHRB are described here.

In order to enable problem state programs to use the BHRB, MMCR0BHRBA must be set to 1 to enable execution of clrbhrb and mfbhrbe instructions in problem state. Additionally, MMCR0PMCC must be set to 0b10 or 0b11 to allow problem state programs to read and write the necessary Performance Monitor registers. (See Section 9.4.4.)

If Performance Monitor event-based branching is desired, MMCR0EBE must also be set to 1 to enable Performance Monitor event-based branches.

Programming Note
Enabling Performance Monitor event-based branching eliminates the need for the problem state program to poll MMCR0PMAO in order to determine when a Performance Monitor alert occurs.

The BHRB is written by the hardware if and only if Performance Monitor alerts are enabled by setting MMCR0PMAE to 1. After MMCR0PMAE has been set to 1 and a Performance Monitor alert occurs, MMCR0PMAE is set to 0 and the BHRB is not altered by hardware until software sets MMCR0PMAE to 1 again.

When MMCR0PMAE=1, mfbhrbe instructions return 0s to the target register.

Programming Note
mfbhrbe instructions return 0s when MMCR0PMAE=1 in order to prevent software from reading the BHRB while it is being written by hardware.

9.5.1 BHRB Filtering

When the BHRB is written by hardware, only those Branch instructions that meet the filtering criterion specified by the filtering fields in MMCRA27:33 and for which the branch was taken are included.

Filtering restricts the type of Branch instructions that are entered into the BHRB. The filtering criteria are defined using the following terminology.

Call: A Branch instruction with the LK field set to 1.
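
The interaction between MMCR0PMAE, Performance Monitor alerts, and mfbhrbe described in Section 9.5 can be modelled with a toy software sketch. Everything here is illustrative: the class and method names are hypothetical, the buffer is an unbounded list, branch filtering is ignored, and the choice that entry number 0 is the most recently written entry and that an out-of-range entry reads as 0 are assumptions of the sketch, not statements taken from this section.

```python
class BhrbModel:
    """Toy model of the MMCR0 PMAE handshake that guards the BHRB."""

    def __init__(self, bhrba=0, pmcc=0b00):
        self.bhrba = bhrba   # models MMCR0 BHRBA
        self.pmcc = pmcc     # models MMCR0 PMCC
        self.pmae = 0        # models MMCR0 PMAE
        self.entries = []    # oldest first; real hardware has a fixed size

    def problem_state_usable(self):
        """Problem state needs BHRBA=1 and PMCC set to 0b10 or 0b11."""
        return self.bhrba == 1 and self.pmcc in (0b10, 0b11)

    def record_branch(self, ea):
        """Hardware writes the BHRB only while PMAE=1."""
        if self.pmae == 1:
            self.entries.append(ea)

    def performance_monitor_alert(self):
        """An alert sets PMAE to 0, which stops further BHRB writes."""
        self.pmae = 0

    def mfbhrbe(self, index):
        """Return 0 while PMAE=1; otherwise read entry `index`.

        Assumption of this sketch: index 0 is the most recent entry and
        an index past the recorded entries reads as 0.
        """
        if self.pmae == 1 or index >= len(self.entries):
            return 0
        return self.entries[-1 - index]
```

A typical sequence in this model is: software sets `pmae` to 1, branches are recorded, an alert clears `pmae`, and only then do `mfbhrbe` reads return recorded addresses.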
Programming Note
A few examples of how software might use the filtering bits include the following. See Section 9.4.7 for definitions of MMCRAIFM and of filtering bits FD, FU, FC, FR, and FJ. In the following examples, MMCRAIFM is set to 0 since that filter is not used. Also, the filtering bits listed are set to 1 and other filtering bits are set to 0s.
Enter only return instructions: FC, FJ
Enter only indirect call instructions: FD, FR, FJ
Enter only conditional branches: FU
Enter only indirect jump instructions: FD, FC, FR
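
The filter combinations listed in the note above can be checked with a small model of the five filtering bits. This is a sketch under stated assumptions: the `Branch` record and `entered_into_bhrb` function are hypothetical names, the IFM field is assumed to be 0, and the classification of each branch as a call, return, or jump is supplied by the caller rather than derived from the instruction encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    kind: str            # "call", "return", or "jump" (caller-supplied)
    direct: bool         # True for a direct branch, False for indirect
    unconditional: bool  # True for an unconditional branch
    taken: bool = True   # only taken branches are candidates


def entered_into_bhrb(branch, filters):
    """Apply the filtering bits named in `filters`, e.g. {"FC", "FJ"}."""
    if not branch.taken:
        return False
    if "FD" in filters and branch.direct:
        return False
    if "FU" in filters and branch.unconditional:
        return False
    if "FC" in filters and branch.kind == "call":
        return False
    if "FR" in filters and branch.kind == "return":
        return False
    if "FJ" in filters and branch.kind == "jump":
        return False
    return True
```

For instance, with the filters FD, FR, and FJ set, an indirect call passes while a direct call, a return, and a jump are all rejected, which matches the "Enter only indirect call instructions" case.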
Programming Note
A potential combined use of the Trace and Perfor-
mance Monitor facilities is to trace the control flow
of a program and simultaneously count events for
that program.
DPDES
0 63
Programming Note
The primary use of the DPDES is to provide the means for the hypervisor to save a [sub-]processor's Directed Privileged Doorbell exception state when the set of programs running on the [sub-]processor is swapped out or moved from one [sub-]processor to another. Since the hypervisor has no need for a similar function, there is no similar register for the hypervisor. Privileged programs are able to read the DPDES in order to poll for Directed Privileged Doorbell exceptions when the corresponding interrupt is disabled (MSREE=0).
Programming Note
msgclr is typically issued only when MSREE=0. If msgclr is executed when MSREE=1 and a Directed Hypervisor Doorbell interrupt is about to occur, the corresponding interrupt may or may not occur.
Message Send Privileged X-form

msgsndp RB

31 /// /// RB 142 /
0 6 11 16 21 31

Message Payload
/// TYPE /// TIRTAG
0 32 37 39 57 63

The actions taken on receipt of a message are defined in Section 10.2.

This instruction is privileged.

Special Registers Altered:
DPDES
Changing the contents of certain System Registers, the contents of
SLB entries, or the contents of other system resources that control
the context in which a program executes can have the side effect of
altering the context in which data addresses and instruction
addresses are interpreted, and in which instructions are executed and
data accesses are performed. For example, changing MSR[IR] from 0 to
1 has the side effect of enabling translation of instruction
addresses. These side effects need not occur in program order, and
therefore may require explicit synchronization by software. (Program
order is defined in Book II.)

An instruction that alters the context in which data addresses or
instruction addresses are interpreted, or in which instructions are
executed or data accesses are performed, is called a context-altering
instruction. This chapter covers all the context-altering
instructions. The software synchronization required for them is shown
in Table 5 (for data access) and Table 6 (for instruction fetch and
execution).

The notation “CSI” in the tables means any context synchronizing
instruction (e.g., sc, isync, or rfid). A context synchronizing
interrupt (i.e., any interrupt except non-recoverable System Reset or
non-recoverable Machine Check) can be used instead of a context
synchronizing instruction. If it is, phrases like “the synchronizing
instruction”, below, should be interpreted as meaning the instruction
at which the interrupt occurs. If no software synchronization is
required before (after) a context-altering instruction, “the
synchronizing instruction before (after) the context-altering
instruction” should be interpreted as meaning the context-altering
instruction itself.

The synchronizing instruction before the context-altering instruction
ensures that all instructions up to and including that synchronizing
instruction are fetched and executed in the context that existed
before the alteration. The synchronizing instruction after the
context-altering instruction ensures that all instructions after that
synchronizing instruction are fetched and executed in the context
established by the alteration. Instructions after the first
synchronizing instruction, up to and including the second
synchronizing instruction, may be fetched or executed in either
context.

If a sequence of instructions contains context-altering instructions
and contains no instructions that are affected by any of the context
alterations, no software synchronization is required within the
sequence.

Programming Note
Sometimes advantage can be taken of the fact that certain events,
such as interrupts, and certain instructions that occur naturally in
the program, such as the rfid that returns from an interrupt handler,
provide the required synchronization.

No software synchronization is required before or after a
context-altering instruction that is also context synchronizing or
when altering the MSR in most cases (see the tables). No software
synchronization is required before most of the other alterations
shown in Table 6, because all instructions preceding the
context-altering instruction are fetched and decoded before the
context-altering instruction is executed (the hardware must determine
whether any of these preceding instructions are context
synchronizing).

In situations such as context switch in which multiple SPRs are
loaded in sequence, it is often the case that the composition of the
implicit (implementation-specific, nonarchitectural) synchronizations
performed for each individual mtspr will be excessive for the
purpose. Software may identify such sequences by placing a mtgsr
before the sequence. Hardware may respond to this identification by
removing redundant synchronization so that the net synchronization
effect approaches that of a single context synchronization at the end
of the sequence. A potential side effect of the optimization is that
the SPRs specified by the sequence may be loaded in an order other
than that specified by the program with the result that if an
exception or EBB interrupts the sequence, mtspr instructions past the
point of interruption may have loaded their SPRs. When control
returns to the interrupted sequence, any such mtspr instructions are
re-executed. The programmer must ensure that this side effect will
not affect the outcome of the sequence. The degree of optimization is
implementation-specific. Transaction failure may compromise
optimization.
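
The rule for the synchronizing instructions before and after a context-altering instruction can be illustrated with a small model. This is only a sketch: the function name and the labels "old", "either", and "new" are illustrative, and instructions are identified simply by their position in program order.

```python
from typing import List


def context_for_each(n: int, sync_before: int, sync_after: int) -> List[str]:
    """Label each of n instructions (indices 0..n-1 in program order).

    sync_before: index of the synchronizing instruction before the
                 context-altering instruction.
    sync_after:  index of the synchronizing instruction after it.
    When no synchronization is required on one side, pass the index of
    the context-altering instruction itself for that side.
    """
    if not 0 <= sync_before <= sync_after < n:
        raise ValueError("expected 0 <= sync_before <= sync_after < n")
    labels = []
    for i in range(n):
        if i <= sync_before:
            # Up to and including the first synchronizing instruction.
            labels.append("old")
        elif i <= sync_after:
            # After the first, up to and including the second.
            labels.append("either")
        else:
            # After the second synchronizing instruction.
            labels.append("new")
    return labels


if __name__ == "__main__":
    # Index 1 is the sync before, index 2 the context-altering
    # instruction, index 3 the sync after.
    print(context_for_each(6, 1, 3))
```

Note that the context-altering instruction itself and the second synchronizing instruction both fall in the "either" window, which is why code that depends on the new context belongs after the second synchronizing instruction.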
Programming Note
Because the individual mtspr instructions in an optimized sequence
may be executed in any order, a single sequence should not contain
multiple loads of the same SPR. If it does, the final value of the
SPR will be indeterminate. Similarly, any set of SPRs for which the
relative order of execution of the mtspr instructions in the set
matters should not be included in a single optimized sequence.

Unless otherwise stated, the material in this chapter assumes a
single-threaded environment.

Instruction or Event              Required Before   Required After    Notes
event-based branch and rfebb      none              none              21
interrupt                         none              none
rfid                              none              none
hrfid                             none              none
rfscv                             none              none
sc                                none              none
scv                               none              none
Trap                              none              none
mtspr (AMR)                       CSI               CSI               13
mtspr (PIDR)                      CSI               CSI               6
mtspr (DAWRn)                     CSI               CSI
mtspr (DAWRXn)                    CSI               CSI
mtspr (HRMOR)                     CSI               CSI               11,17
mtspr (LPCR)                      CSI               CSI               11
mtspr (PTCR)                      ptesync           CSI               3
mtmsrd (SF)                       none              none
mtmsrd (TS)                       none              none
mtmsrd (TM)                       none              none
mtmsr[d] (PR)                     none              none
mtmsr[d] (DR)                     none              none
mtspr (PIDR)                      CSI               CSI               6
slbie                             CSI               CSI               4
slbieg                            CSI               CSI               4,6
slbia                             CSI               CSI               4
slbmte                            CSI               CSI               4,10
tlbie                             CSI               CSI               4,6
tlbiel                            CSI               ptesync           4
Store(PTE)                        none              {ptesync, CSI}    5,6
Store(STE)                        none              {ptesync, CSI}    5,6
Store(PRTE)                       none              {ptesync, CSI}    5,6
Store(PATE)                       none              {ptesync, CSI}    5,6
transaction failure and all TM    none              none              21
  instructions except tcheck

Table 5: Synchronization requirements for data access
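
One way to make use of Table 5 in tooling is to hold its rows in a lookup structure. The sketch below copies only a handful of rows from the table; the names TABLE5_SUBSET and bracket are hypothetical, and the entry "{ptesync, CSI}" is kept as an opaque string because its exact meaning is given by the table notes rather than by this sketch.

```python
# A few rows copied from Table 5: entry -> (required before, required after).
TABLE5_SUBSET = {
    "rfid": ("none", "none"),
    "mtspr (AMR)": ("CSI", "CSI"),
    "mtspr (PTCR)": ("ptesync", "CSI"),
    "tlbiel": ("CSI", "ptesync"),
    "Store(PTE)": ("none", "{ptesync, CSI}"),
}


def bracket(entry: str) -> list:
    """Return the entry surrounded by whatever Table 5 requires around it."""
    before, after = TABLE5_SUBSET[entry]
    sequence = []
    if before != "none":
        sequence.append(before)
    sequence.append(entry)
    if after != "none":
        sequence.append(after)
    return sequence


if __name__ == "__main__":
    for name in TABLE5_SUBSET:
        print(name, "->", bracket(name))
```

A real tool would also need to carry the Notes column, since several rows have conditions that the before/after pair alone does not express.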
    different ESID (e.g., to satisfy an SLB miss). However, the slbie
    is needed later if and when the translation that was contained in
    the replaced SLB entry is to be invalidated.

11. When the HRMOR or the VC field of the LPCR is modified, software
    must invalidate all implementation-specific lookaside information
    used in address translation that depends on the old contents of
    the register or field (i.e., the contents immediately before the
    modification). The slbia instruction can be used to invalidate
    all such implementation-specific lookaside information.

12. A context synchronizing instruction or event that is executed or
    occurs when LPCR[MER] = 1 does not necessarily ensure that the
    exception effects of LPCR[MER] are consistent with the contents
    of LPCR[MER]. See Section 2.2.

13. This line applies regardless of which SPR number (13 or 29) is
    used for the AMR.

14. LPIDR must not be altered when MSR[DR]=1 or MSR[IR]=1; if it is,
    the results are undefined.

    Programming Note
    Altering LPIDR when MSR[IR]=1 or MSR[DR]=1 is prohibited because
    of the difficulty of avoiding an implicit branch relative to the
    value of enabling software to avoid using hypervisor real
    addressing mode for the operation. (The tables used for
    translation are determined by the partition ID. See Section 5.7.6
    for details.)

15. This line applies to the following Performance Monitor SPRs:
    PMC1-6, MMCR0, MMCR1, MMCR2, and MMCRA.

16. This line applies to all SPR numbers that access the BESCR
    (800-803, 806).

17. There are additional software synchronization requirements when
    an mtspr instruction modifies this SPR in a multi-threaded
    environment. See Section 2.7.

18. As an alternative to a CSI, the execution of an rfebb instruction
    or the occurrence of an event-based branch is sufficient to
    provide the necessary synchronization.

19. These instructions and events, with the exception of nested
    tbegin., nested tend., TM instructions that except or are
    described to be treated as no-ops, Transaction Abort Conditional
    instructions that do not abort, and events and rfebb instructions
    for which the event did not take place in Transactional state,
    will change MSR[TS]. No software synchronization is required.
This appendix contains opcode maps showing the primary opcodes,
extended opcodes, and expanded opcodes.

Table 7 describes the conventions used in the opcode maps.

The instruction consisting entirely of binary 0s causes the system
illegal instruction error handler to be invoked for all members of
the POWER family, and this is likely to remain true in future models
(it is guaranteed in the Power ISA). An instruction having primary
opcode 0 but not consisting entirely of binary 0s is reserved except
for the following extended opcode (instruction bits 21:30).

    256   Service Processor “Attention”
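
The treatment of primary opcode 0 described above can be sketched as a small classifier. The function name and the returned labels are illustrative. The sketch assumes that the extended opcode in instruction bits 21:30 alone identifies the Service Processor "Attention" instruction, without checking the remaining bits.

```python
def classify_primary_opcode_0(word: int) -> str:
    """Classify a 32-bit instruction word whose primary opcode is 0."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit instruction word")
    if (word >> 26) != 0:
        raise ValueError("primary opcode (bits 0:5) is not 0")
    if word == 0:
        # Entirely binary 0s: illegal instruction.
        return "illegal"
    # Bits 21:30 with bit 0 as the MSB: drop bit 31, keep 10 bits.
    extended_opcode = (word >> 1) & 0x3FF
    if extended_opcode == 256:
        return "attention"
    return "reserved"


if __name__ == "__main__":
    for w in (0x00000000, 0x00000200, 0x00000002):
        print(hex(w), classify_primary_opcode_0(w))
```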
Table 7: Opcode Maps Legend

Cell layout for a primary opcode:

    po                        book
    mnemonic
    version privilege format

    po         primary opcode (decimal format)

Cell layout for an extended or expanded opcode:

    xop                       book
    mnemonic
    version privilege format

    xop        extended or expanded opcode image (binary format)
               0  instruction bit corresponding to an extended/expanded
                  opcode bit having value of 0
               1  instruction bit corresponding to an extended/expanded
                  opcode bit having value of 1
               /  reserved instruction bit, must have value of 0,
                  otherwise invalid form
               .  instruction bit corresponding to an operand or
                  control bit, can have a value of either 0 or 1

Fields common to both layouts:

    book       Book in which the instruction is defined
    version    ISA version in which the instruction was introduced
    privilege  P  privileged instruction
               H  hypervisor-privileged instruction
    format     instruction format

Cell types:

    Illegal opcode
        Opcode having no previous or current assignment, available for
        future use.

    Defined opcode (primary, extended, or expanded)
        Example cell: 08, book I, subfic, P1 D.
        Opcode assigned to a defined instruction.

    Primary opcode having an extended opcode field
        Example cell: 17, EXT17, {extended}.
        Opcode having extended opcode field used to identify multiple
        instructions.

    Extended opcode having an expanded opcode field
        Example cell: 10110 000001, XPND04-1, {expanded}.
        Opcode having expanded opcode field used to identify multiple
        instructions.

    Reserved opcode (primary, extended, or expanded)
        Marked {reserved}. Opcode is not available for future use
        without careful consideration.
        1. Opcode corresponds to an instruction defined in a previous
           version of the architecture that has been subsequently
           removed from the architecture. The opcode is treated as an
           illegal opcode.
        2. Or, opcode is reserved for implementation-dependent use.
        These opcodes will not be assigned a meaning in the Power ISA
        except after careful consideration of the effect of such
        assignment on existing implementations.

    Invalid form opcode
        Marked {invalid}. Opcode corresponding to a defined instruction
        encoding with one or more reserved opcode bits having a value
        of 1.
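
The opcode image symbols in the legend (0, 1, / and .) can be read as a simple pattern language. The following sketch, with an illustrative function name and return labels, checks a string of instruction bits against an image and reports an invalid form when a reserved bit is 1. Spaces in images are ignored, since the maps use them only to separate row and column bits.

```python
def match_image(image: str, bits: str) -> str:
    """Compare instruction bits against an opcode image from the maps.

    Returns "match", "invalid form", or "no match".
    """
    image = image.replace(" ", "")
    bits = bits.replace(" ", "")
    if len(image) != len(bits):
        raise ValueError("image and bits must have the same length")
    reserved_bit_set = False
    for symbol, bit in zip(image, bits):
        if bit not in "01":
            raise ValueError("bits must contain only 0 and 1")
        if symbol in "01":
            if symbol != bit:
                return "no match"      # fixed opcode bit differs
        elif symbol == "/":
            if bit == "1":
                reserved_bit_set = True  # reserved bit must be 0
        elif symbol != ".":
            raise ValueError("unknown image symbol: " + symbol)
        # "." is an operand or control bit and accepts either value.
    return "invalid form" if reserved_bit_set else "match"


if __name__ == "__main__":
    print(match_image("1/010 000001", "10010 000001"))
    print(match_image("1/010 000001", "11010 000001"))
    print(match_image(".0000 000110", "10000 000110"))
```

The second call corresponds to the way the maps show the same mnemonic a second time, marked {invalid}, in the cell where a reserved image bit would be 1.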
Table 9: EXT17: Extended Opcode Map for Primary Opcode 17 (opcode bits 30:31)

    00          01          10          11
                01 I        1/ I        1/
                scv         sc          sc
                v3.0 SC     PPC SC      {invalid}
Table 10:EXT30: Extended Opcode Map for Primary Opcode 30 (opcode bits 27:30)
000 001 010 011 100 101 110 111
000. I 000. I 001. I 001. I 010. I 010. I 011. I 011. I
0 rldicl[.] rldicl[.] rldicr[.] rldicr[.] rldic[.] rldic[.] rldimi[.] rldimi[.] 0
PPC MD PPC MD PPC MD PPC MD PPC MD PPC MD PPC MD PPC MD
1000 I 1001 I
1 rldcl[.] rldcr[.] 1
PPC MDS PPC MDS {reserved} {reserved} {reserved} {reserved}
Table 11: EXT57: Extended Opcode Map for Primary Opcode 57 (opcode bits 30:31)

    00          01          10          11
    00 I                    10 I        11 I
    lfdp                    lxsd        lxssp
    v2.05 DS    {reserved}  v3.0 DS     v3.0 DS
Table 12: EXT58: Extended Opcode Map for Primary Opcode 58 (opcode bits 30:31)

    00          01          10          11
    00 I        01 I        10 I
    ld          ldu         lwa
    PPC DS      PPC DS      PPC DS      {reserved}
Table 13: EXT61: Extended Opcode Map for Primary Opcode 61 (opcode bits 29:31)

    000         001         010         011         100         101         110         111
    .00 I       001 I       .10 I       .11 I       .00 I       101 I       .10 I       .11 I
    stfdp       lxv         stxsd       stxssp      stfdp       stxv        stxsd       stxssp
    v2.05 DS    v3.0 DQ     v3.0 DS     v3.0 DS     v2.05 DS    v3.0 DQ     v3.0 DS     v3.0 DS
Table 14: EXT62: Extended Opcode Map for Primary Opcode 62 (opcode bits 30:31)

    00          01          10          11
    00 I        01 I        10 I
    std         stdu        stq
    PPC DS      PPC DS      v2.03 DS    {reserved}
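
The two-bit extended opcode maps for primary opcodes 57, 58, and 62 (opcode bits 30:31) lend themselves to a table-driven decoder. The sketch below transcribes those three maps; the names DS_MAPS and decode_ds are illustrative, and only the mnemonic is recovered, not the operands.

```python
# Extended opcode (instruction bits 30:31) -> map entry, per primary opcode.
DS_MAPS = {
    57: {0b00: "lfdp", 0b01: "{reserved}", 0b10: "lxsd", 0b11: "lxssp"},
    58: {0b00: "ld", 0b01: "ldu", 0b10: "lwa", 0b11: "{reserved}"},
    62: {0b00: "std", 0b01: "stdu", 0b10: "stq", 0b11: "{reserved}"},
}


def decode_ds(word: int) -> str:
    """Return the map entry for a 32-bit word with primary opcode 57, 58, or 62."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit instruction word")
    primary_opcode = word >> 26          # instruction bits 0:5
    if primary_opcode not in DS_MAPS:
        raise ValueError("primary opcode not covered by this sketch")
    extended_opcode = word & 0b11        # instruction bits 30:31
    return DS_MAPS[primary_opcode][extended_opcode]


if __name__ == "__main__":
    for w in (0xE8000000, 0xE8000001, 0xF8000002, 0xE4000003):
        print(hex(w), decode_ds(w))
```

Primary opcode 61 is left out because its map uses three bits (29:31), with the first bit acting as an operand bit for some entries.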
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 1 of 8)
000000 000001 000010 000011 000100 000101 000110 000111
00000 000000 I 00000 000001 I 00000 000010 I 00000 000100 I .0000 000110 I .0000 000111 I
00000 vaddubm vmul10cuq vmaxub vrlb vcmpequb vcmpneb 00000
v2.03 VX v3.0 VX v2.03 VX v2.03 VX v2.03 VC v3.0 VC
00001 000000 I 00001 000001 I 00001 000010 I 00001 000100 I .0001 000110 I .0001 000111 I
00001 vadduhm vmul10ecuq vmaxuh vrlh vcmpequh vcmpneh 00001
v2.03 VX v3.0 VX v2.03 VX v2.03 VX v2.03 VC v3.0 VC
00010 000000 I 00010 000010 I 00010 000100 I 00010 000101 I .0010 000110 I .0010 000111 I
00010 vadduwm vmaxuw vrlw vrlwmi vcmpequw vcmpnew 00010
v2.03 VX v2.03 VX v2.03 VX v3.0 VX v2.03 VC v3.0 VC
00011 000000 I 00011 000010 I 00011 000100 I 00011 000101 I .0011 000110 I .0011 000111 I
00011 vaddudm vmaxud vrld vrldmi vcmpeqfp vcmpequd 00011
v2.07 VX v2.07 VX v2.07 VX v3.0 VX v2.03 VC v2.07 VC
00100 000000 I 00100 000010 I 00100 000100 I .0100 000111 I
00100 vadduqm vmaxsb vslb vcmpnezb 00100
v2.07 VX v2.03 VX v2.03 VX v3.0 VC
00101 000000 I 00101 000010 I 00101 000100 I .0101 000111 I
00101 vaddcuq vmaxsh vslh vcmpnezh 00101
v2.07 VX v2.03 VX v2.03 VX v3.0 VC
00110 000000 I 00110 000010 I 00110 000100 I 00110 000101 I .0110 000111 I
00110 vaddcuw vmaxsw vslw vrlwnm vcmpnezw 00110
v2.03 VX v2.03 VX v2.03 VX v3.0 VX v3.0 VC
00111 000010 I 00111 000100 I 00111 000101 I .0111 000110 I
00111 vmaxsd vsl vrldnm vcmpgefp 00111
v2.07 VX v2.03 VX v3.0 VX v2.03 VC
01000 000000 I 01000 000001 I 01000 000010 I 01000 000100 I .1000 000110 I
01000 vaddubs vmul10uq vminub vsrb vcmpgtub 01000
v2.03 VX v3.0 VX v2.03 VX v2.03 VX v2.03 VC
01001 000000 I 01001 000001 I 01001 000010 I 01001 000100 I .1001 000110 I
01001 vadduhs vmul10euq vminuh vsrh vcmpgtuh 01001
v2.03 VX v3.0 VX v2.03 VX v2.03 VX v2.03 VC
01010 000000 I 01010 000010 I 01010 000100 I .1010 000110 I
01010 vadduws vminuw vsrw vcmpgtuw 01010
v2.03 VX v2.03 VX v2.03 VX v2.03 VC
01011 000010 I 01011 000100 I .1011 000110 I .1011 000111 I
01011 vminud vsr vcmpgtfp vcmpgtud 01011
v2.07 VX v2.03 VX v2.03 VC v2.07 VC
01100 000000 I 01100 000010 I 01100 000100 I .1100 000110 I
01100 vaddsbs vminsb vsrab vcmpgtsb 01100
v2.03 VX v2.03 VX v2.03 VX v2.03 VC
01101 000000 I 01101 000001 I 01101 000010 I 01101 000100 I .1101 000110 I
01101 vaddshs bcdcpsgn. vminsh vsrah vcmpgtsh 01101
v2.03 VX v3.0 VX v2.03 VX v2.03 VX v2.03 VC
01110 000000 I 01110 000010 I 01110 000100 I .1110 000110 I
01110 vaddsws vminsw vsraw vcmpgtsw 01110
v2.03 VX v2.03 VX v2.03 VX v2.03 VC
01111 000010 I 01111 000100 I .1111 000110 I .1111 000111 I
01111 vminsd vsrad vcmpbfp vcmpgtsd 01111
v2.07 VX v2.07 VX v2.03 VC v2.07 VC
10000 000000 I 1.000 000001 I 10000 000010 I 10000 000011 I 10000 000100 I .0000 000110 I .0000 000111 I
10000 vsububm bcdadd. vavgub vabsdub vand vcmpequb. vcmpneb. 10000
v2.03 VX v2.07 VX v2.03 VX v3.0 VX v2.03 VX v2.03 VC v3.0 VC
10001 000000 I 1.001 000001 I 10001 000010 I 10001 000011 I 10001 000100 I .0001 000110 I .0001 000111 I
10001 vsubuhm bcdsub. vavguh vabsduh vandc vcmpequh. vcmpneh. 10001
v2.03 VX v2.07 VX v2.03 VX v3.0 VX v2.03 VX v2.03 VC v3.0 VC
10010 000000 I 1/010 000001 I 10010 000010 I 10010 000011 I 10010 000100 I .0010 000110 I .0010 000111 I
10010 vsubuwm bcdus. vavguw vabsduw vor vcmpequw. vcmpnew. 10010
v2.03 VX v3.0 VX v2.03 VX v3.0 VX v2.03 VX v2.03 VC v3.0 VC
10011 000000 I 1.011 000001 I 10011 000100 I .0011 000110 I .0011 000111 I
10011 vsubudm bcds. vxor vcmpeqfp. vcmpequd. 10011
v2.07 VX v3.0 VX v2.03 VX v2.03 VC v2.07 VC
10100 000000 I 1.100 000001 I 10100 000010 I 10100 000100 I .0100 000111 I
10100 vsubuqm bcdtrunc. vavgsb vnor vcmpnezb. 10100
v2.07 VX v3.0 VX v2.03 VX v2.03 VX v3.0 VC
10101 000000 I 1/101 000001 I 10101 000010 I 10101 000100 I .0101 000111 I
10101 vsubcuq bcdutrunc. vavgsh vorc vcmpnezh. 10101
v2.07 VX v3.0 VX v2.03 VX v2.07 VX v3.0 VC
10110 000000 I 10110 000001 10110 000010 I 10110 000100 I .0110 000111 I
10110 vsubcuw XPND04-1 vavgsw vnand vcmpnezw. 10110
v2.03 VX {expanded} v2.03 VX v2.07 VX v3.0 VC
1.111 000001 I 10111 000100 I .0111 000110 I
10111 bcdsr. vsld vcmpgefp. 10111
v3.0 VX v2.07 VX v2.03 VC
11000 000000 I 1.000 000001 I 11000 000010 11000 000100 I .1000 000110 I
11000 vsububs bcdadd. XPND04-3 mfvscr vcmpgtub. 11000
v2.03 VX v2.07 VX {expanded} v2.03 VX v2.03 VC
11001 000000 I 1.001 000001 I 11001 000100 I .1001 000110 I
11001 vsubuhs bcdsub. mtvscr vcmpgtuh. 11001
v2.03 VX v2.07 VX v2.03 VX v2.03 VC
11010 000000 I 1/010 000001 11010 000010 I 11010 000100 I .1010 000110 I
11010 vsubuws bcdus. vshasigmaw veqv vcmpgtuw. 11010
v2.03 VX {invalid} v2.07 VX v2.07 VX v2.03 VC
1.011 000001 I 11011 000010 I 11011 000100 I .1011 000110 I .1011 000111 I
11011 bcds. vshasigmad vsrd vcmpgtfp. vcmpgtud. 11011
v3.0 VX v2.07 VX v2.07 VX v2.03 VC v2.07 VC
11100 000000 I 1.100 000001 I 11100 000010 I 11100 000011 I 11100 000100 I .1100 000110 I
11100 vsubsbs bcdtrunc. vclzb vpopcntb vsrv vcmpgtsb. 11100
v2.03 VX v3.0 VX v2.07 VX v2.07 VX v3.0 VX v2.03 VC
11101 000000 I 1/101 000001 11101 000010 I 11101 000011 I 11101 000100 I .1101 000110 I
11101 vsubshs bcdutrunc. vclzh vpopcnth vslv vcmpgtsh. 11101
v2.03 VX {invalid} v2.07 VX v2.07 VX v3.0 VX v2.03 VC
11110 000000 I 11110 000001 11110 000010 I 11110 000011 I .1110 000110 I
11110 vsubsws XPND04-2 vclzw vpopcntw vcmpgtsw. 11110
v2.03 VX {expanded} v2.07 VX v2.07 VX v2.03 VC
1.111 000001 I 11111 000010 I 11111 000011 I .1111 000110 I .1111 000111 I
11111 bcdsr. vclzd vpopcntd vcmpbfp. vcmpgtsd. 11111
v3.0 VX v2.07 VX v2.07 VX v2.03 VC v2.07 VC
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 2 of 8)
001000 001001 001010 001011 001100 001101 001110 001111
00000 001000 I 00000 001010 I 00000 001100 I 00000 001110 I
00000 vmuloub vaddfp vmrghb vpkuhum 00000
v2.03 VX v2.03 VX v2.03 VX v2.03 VX
00001 001000 I 00001 001010 I 00001 001100 I 00001 001110 I
00001 vmulouh vsubfp vmrghh vpkuwum 00001
v2.03 VX v2.03 VX v2.03 VX v2.03 VX
00010 001000 I 00010 001001 I 00010 001100 I 00010 001110 I
00010 vmulouw vmuluwm vmrghw vpkuhus 00010
v2.07 VX v2.07 VX v2.03 VX v2.03 VX
00011 001110 I
00011 vpkuwus 00011
v2.03 VX
00100 001000 I 00100 001010 I 00100 001100 I 00100 001110 I
00100 vmulosb vrefp vmrglb vpkshus 00100
v2.03 VX v2.03 VX v2.03 VX v2.03 VX
00101 001000 I 00101 001010 I 00101 001100 I 00101 001110 I
00101 vmulosh vrsqrtefp vmrglh vpkswus 00101
v2.03 VX v2.03 VX v2.03 VX v2.03 VX
00110 001000 I 00110 001010 I 00110 001100 I 00110 001110 I
00110 vmulosw vexptefp vmrglw vpkshss 00110
v2.07 VX v2.03 VX v2.03 VX v2.03 VX
00111 001010 I 00111 001110 I
00111 vlogefp vpkswss 00111
v2.03 VX v2.03 VX
01000 001000 I 01000 001010 I 01000 001100 I 01000 001101 I 01000 001110 I
01000 vmuleub vrfin vspltb vextractub vupkhsb 01000
v2.03 VX v2.03 VX v2.03 VX v3.0 VX v2.03 VX
01001 001000 I 01001 001010 I 01001 001100 I 01001 001101 I 01001 001110 I
01001 vmuleuh vrfiz vsplth vextractuh vupkhsh 01001
v2.03 VX v2.03 VX v2.03 VX v3.0 VX v2.03 VX
01010 001000 I 01010 001010 I 01010 001100 I 01010 001101 I 01010 001110 I
01010 vmuleuw vrfip vspltw vextractuw vupklsb 01010
v2.07 VX v2.03 VX v2.03 VX v3.0 VX v2.03 VX
01011 001010 I 01011 001101 I 01011 001110 I
01011 vrfim vextractd vupklsh 01011
v2.03 VX v3.0 VX v2.03 VX
01100 001000 I 01100 001010 I 01100 001100 I 01100 001101 I 01100 001110 I
01100 vmulesb vcfux vspltisb vinsertb vpkpx 01100
v2.03 VX v2.03 VX v2.03 VX v3.0 VX v2.03 VX
01101 001000 I 01101 001010 I 01101 001100 I 01101 001101 I 01101 001110 I
01101 vmulesh vcfsx vspltish vinserth vupkhpx 01101
v2.03 VX v2.03 VX v2.03 VX v3.0 VX v2.03 VX
01110 001000 I 01110 001010 I 01110 001100 I 01110 001101 I
01110 vmulesw vctuxs vspltisw vinsertw 01110
v2.07 VX v2.03 VX v2.03 VX v3.0 VX
01111 001010 I 01111 001101 I 01111 001110 I
01111 vctsxs vinsertd vupklpx 01111
v2.03 VX v3.0 VX v2.03 VX
10000 001000 I 10000 001010 I 10000 001100 I
10000 vpmsumb vmaxfp vslo 10000
v2.07 VX v2.03 VX v2.03 VX
10001 001000 I 10001 001010 I 10001 001100 I 10001 001110 I
10001 vpmsumh vminfp vsro vpkudum 10001
v2.07 VX v2.03 VX v2.03 VX v2.07 VX
10010 001000 I
10010 vpmsumw 10010
v2.07 VX
10011 001000 I 10011 001110 I
10011 vpmsumd vpkudus 10011
v2.07 VX v2.07 VX
10100 001000 I 10100 001001 I 10100 001100 I
10100 vcipher vcipherlast vgbbd 10100
v2.07 VX v2.07 VX v2.07 VX
10101 001000 I 10101 001001 I 10101 001100 I 10101 001110 I
10101 vncipher vncipherlast vbpermq vpksdus 10101
v2.07 VX v2.07 VX v2.07 VX v2.07 VX
10110 10110
11111 11111
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 3 of 8)
010000 010001 010010 010011 010100 010101 010110 010111
00000 00000
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 4 of 8)
011000 011001 011010 011011 011100 011101 011110 011111
00000 00000
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 5 of 8)
100000 100001 100010 100011 100100 100101 100110 100111
..... 100000 I ..... 100001 I ..... 100010 I ..... 100100 I ..... 100101 I ..... 100110 I ..... 100111 I
00000 vmhaddshs vmhraddshs vmladduhm vmsumubm vmsummbm vmsumuhm vmsumuhs 00000
v2.03 VA v2.03 VA v2.03 VA v2.03 VA v2.03 VA v2.03 VA v2.03 VA
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 6 of 8)
101000 101001 101010 101011 101100 101101 101110 101111
..... 101000 I ..... 101001 I ..... 101010 I ..... 101011 I /.... 101100 I ..... 101101 I ..... 101110 I ..... 101111 I
00000 vmsumshm vmsumshs vsel vperm vsldoi vpermxor vmaddfp vnmsubfp 00000
v2.03 VA v2.03 VA v2.03 VA v2.03 VA v2.03 VA v2.07 VA v2.03 VA v2.03 VA
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
/.... 101100
10000 vsldoi 10000
{invalid}
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 7 of 8)
110000 110001 110010 110011 110100 110101 110110 110111
..... 110000 I ..... 110001 I ..... 110011 I
00000 maddhd maddhdu maddld 00000
v3.0 VA v3.0 VA v3.0 VA
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 15:EXT04: Extended Opcode Map for Primary Opcode 4 (opcode bits 21:31) (Sheet 8 of 8)
111000 111001 111010 111011 111100 111101 111110 111111
..... 111011 I ..... 111100 I ..... 111101 I ..... 111110 I ..... 111111 I
00000 vpermr vaddeuqm vaddecuq vsubeuqm vsubecuq 00000
v3.0 VA v2.07 VA v2.07 VA v2.07 VA v2.07 VA
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 16: XPND04-1: Expanded Opcode Map for PO=4 XO=0b10110_000001 (opcode bits 21:30)

    000         001         010         011         100         101         110         111
00  00 000 I                00 010 I                00 100 I    00 101 I    00 110 I    00 111 I
    bcdctsq.                bcdcfsq.                bcdctz.     bcdctn.     bcdcfz.     bcdcfn.
    v3.0 VX                 v3.0 VX                 v3.0 VX     v3.0 VX     v3.0 VX     v3.0 VX
01
10
11                                                                                      11 111 I
                                                                                        bcdsetsgn.
                                                                                        v3.0 VX
Table 17: XPND04-2: Expanded Opcode Map for PO=4 XO=0b11110_000001 (opcode bits 21:30)

    000         001         010         011         100         101         110         111
00  00 000 I                00 010 I                00 100 I    00 101 I    00 110 I    00 111 I
    bcdctsq.                bcdcfsq.                bcdctz.     bcdctn.     bcdcfz.     bcdcfn.
    {invalid}               v3.0 VX                 v3.0 VX     {invalid}   v3.0 VX     v3.0 VX
01
10
11                                                                                      11 111 I
                                                                                        bcdsetsgn.
                                                                                        v3.0 VX
Table 18: XPND04-3: Expanded Opcode Map for PO=4 XO=0b11000_000010 (opcode bits 21:30)

    000         001         010         011         100         101         110         111
00  00 000 I    00 001 I                                                    00 110 I    00 111 I
    vclzlsbb    vctzlsbb                                                    vnegw       vnegd
    v3.0 VX     v3.0 VX                                                     v3.0 VX     v3.0 VX
01  01 000 I    01 001 I    01 010 I
    vprtybw     vprtybd     vprtybq
    v3.0 VX     v3.0 VX     v3.0 VX
10  10 000 I    10 001 I
    vextsb2w    vextsh2w
    v3.0 VX     v3.0 VX
11  11 000 I    11 001 I    11 010 I                11 100 I    11 101 I    11 110 I    11 111 I
    vextsb2d    vextsh2d    vextsw2d                vctzb       vctzh       vctzw       vctzd
    v3.0 VX     v3.0 VX     v3.0 VX                 v3.0 VX     v3.0 VX     v3.0 VX     v3.0 VX
Table 19:EXT19: Extended Opcode Map for Primary Opcode 19 (opcode bits 21:30) (Sheet 1 of 4)
00000 00001 00010 00011 00100 00101 00110 00111
00000 00000 I ..... 00010 I
00000 mcrf addpcis 00000
P1 XL v3.0 DX
00001 00001 I
00001 crnor 00001
P1 XL
00010 00010
00011 00011
00100 00001 I
00100 crandc 00100
P1 XL
00101 00101
00110 00001 I
00110 crxor 00110
P1 XL
00111 00001 I
00111 crnand 00111
P1 XL
01000 00001 I
01000 crand 01000
P1 XL
01001 00001 I
01001 creqv 01001
P1 XL
01010 01010
01011 01011
01100 01100
01101 00001 I
01101 crorc 01101
P1 XL
01110 00001 I
01110 cror 01110
P1 XL
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 19:EXT19: Extended Opcode Map for Primary Opcode 19 (opcode bits 21:30) (Sheet 2 of 4)
01000 01001 01010 01011 01100 01101 01110 01111
00000 00000
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 19:EXT19: Extended Opcode Map for Primary Opcode 19 (opcode bits 21:30) (Sheet 3 of 4)
10000 10001 10010 10011 10100 10101 10110 10111
00000 10000 I 00000 10010 III
00000 bclr[l] rfid 00000
P1 XL PPC P XL
00001 00001
{reserved} {reserved}
00010 10010 III
00010 rfscv 00010
v3.0 P XL
00011 00011
00101 00101
00110 00110
00111 00111
01001 01001
01010 01010
01100 01100
{reserved}
01101 01101
{reserved}
01110 01110
{reserved}
01111 01111
{reserved}
10000 10000 I
10000 bcctr[l] 10000
P1 XL
10001 10000 I
10001 bctar[l] 10001
v2.07 XL
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 19:EXT19: Extended Opcode Map for Primary Opcode 19 (opcode bits 21:30) (Sheet 4 of 4)
11000 11001 11010 11011 11100 11101 11110 11111
00000 00000
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
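
In the EXT19 map, each cell's opcode image is the row label followed by the column label, which together form the 10-bit extended opcode in instruction bits 21:30. The sketch below transcribes the fully specified cells of Table 19 and uses them to decode a word. The names are illustrative, and addpcis is omitted because its image contains operand bits.

```python
def xo_from_cell(row: str, col: str) -> int:
    """Combine a 5-bit row label and a 5-bit column label into the XO value."""
    if len(row) != 5 or len(col) != 5:
        raise ValueError("row and column labels are 5 bits each")
    return int(row + col, 2)


# (row, column) -> mnemonic, copied from the cells of Table 19.
EXT19_CELLS = {
    ("00000", "00000"): "mcrf",
    ("00001", "00001"): "crnor",
    ("00100", "00001"): "crandc",
    ("00110", "00001"): "crxor",
    ("00111", "00001"): "crnand",
    ("01000", "00001"): "crand",
    ("01001", "00001"): "creqv",
    ("01101", "00001"): "crorc",
    ("01110", "00001"): "cror",
    ("00000", "10000"): "bclr[l]",
    ("10000", "10000"): "bcctr[l]",
    ("10001", "10000"): "bctar[l]",
    ("00000", "10010"): "rfid",
    ("00010", "10010"): "rfscv",
}

EXT19_BY_XO = {xo_from_cell(r, c): m for (r, c), m in EXT19_CELLS.items()}


def decode_ext19(word: int):
    """Return the mnemonic for a primary-opcode-19 word, or None if not listed."""
    if (word >> 26) != 19:
        raise ValueError("primary opcode (bits 0:5) is not 19")
    extended_opcode = (word >> 1) & 0x3FF   # instruction bits 21:30
    return EXT19_BY_XO.get(extended_opcode)


if __name__ == "__main__":
    print(xo_from_cell("01110", "00001"))
    print(decode_ext19(0x4E800020))
```

The same row-then-column reading applies to the EXT31 map; the EXT04 map uses a 5-bit row and a 6-bit column because it covers opcode bits 21:31.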
Table 20:EXT31: Extended Opcode Map for Primary Opcode 31 (opcode bits 21:30) (Sheet 1 of 4)
00000 00001 00010 00011 00100 00101 00110 00111
00000 00000 I 00000 00100 I 00000 00110 I 00000 00111 I
00000 cmp tw lvsl lvebx 00000
P1 X P1 X v2.03 X v2.03 X
00001 00000 I 00001 00110 I 00001 00111 I
00001 cmpl lvsr lvehx 00001
P1 X {reserved} v2.03 X v2.03 X
00010 00100 I 00010 00111 I
00010 td lvewx 00010
{reserved} PPC X v2.03 X
00011 00111 I
00011 lvx 00011
{reserved} v2.03 X
00100 00000 I 00100 00111 I
00100 setb stvebx 00100
v3.0 X {reserved} v2.03 X
00101 00111 I
00101 stvehx 00101
{reserved} {reserved} v2.03 X
00110 00000 I 00110 00111 I
00110 cmprb stvewx 00110
v3.0 X v2.03 X
00111 00000 I 00111 00111 I
00111 cmpeqb stvx 00111
v3.0 X {reserved} v2.03 X
01000 01000
{reserved}
01001 01001
{reserved}
01010 01010
{reserved}
01011 00111 I
01011 lvxl 01011
{reserved} v2.03 X
01100 01100
01101 01101
{reserved}
01110 01110
{reserved} {reserved}
01111 00111 I
01111 stvxl 01111
{reserved} {reserved} v2.03 X
10000 10000
{reserved}
10001 10001
{reserved} {reserved} {reserved}
10010 00000 I 10010 00110 II
10010 mcrxrx lwat 10010
v3.0 X v3.0 X
10011 00110 II
10011 ldat 10011
{reserved} v3.0 X
10100 10100
{reserved}
10101 10101
{reserved} {reserved}
10110 00110 II
10110 stwat 10110
v3.0 X
10111 00110 II
10111 stdat 10111
{reserved} v3.0 X
11000 00110 II
11000 copy 11000
v3.0 X {reserved}
11001 11001
{reserved} {reserved}
11010 00110 II
11010 cp_abort 11010
v3.0 X
11011 11011
{reserved}
11100 00110 II
11100 paste[.] 11100
v3.0 X {reserved}
11101 11101
{reserved} {reserved}
11110 11110
{reserved}
11111 11111
{reserved} {reserved}
Table 20:EXT31: Extended Opcode Map for Primary Opcode 31 (opcode bits 21:30) (Sheet 2 of 4)
01000 01001 01010 01011 01100 01101 01110 01111
00000 01000 I /0000 01001 I 00000 01010 I /0000 01011 I 00000 01100 I ..... 01111 I
00000 subfc[.] mulhdu[.] addc[.] mulhwu[.] lxsiwzx isel 00000
P1 XO PPC XO P1 XO PPC XO v2.07 XX1 v2.03 A
00001 01000 I
00001 subf[.] 00001
PPC XO
/0010 01001 I /0010 01010 I /0010 01011 I 00010 01100 I
00010 mulhd[.] addg6s mulhw[.] lxsiwax 00010
PPC XO v2.06 XO PPC XO v2.07 XX1
00011 01000 I
00011 neg[.] 00011
P1 XO {reserved}
00100 01000 I 00100 01010 I 00100 01100 I 00100 01110 III
00100 subfe[.] adde[.] stxsiwx msgsndp 00100
P1 XO P1 XO v2.07 XX1 v2.07 P X
00101 01110 III
00101 msgclrp 00101
v2.07 P X
00110 01000 I 00110 01010 I 00110 01110 III
00110 subfze[.] addze[.] msgsnd 00110
P1 XO P1 XO v2.07 H X
00111 01000 I 00111 01001 I 00111 01010 I 00111 01011 I 00111 01110 III
00111 subfme[.] mulld[.] addme[.] mullw[.] msgclr 00111
P1 XO PPC XO P1 XO P1 XO v2.07 H X
01000 01001 I .1000 01010 I 01000 01011 I 01000 01100 I 01000 01101 I
01000 modud add[.] moduw lxvx lxvl 01000
{reserved} v3.0 X P1 XO v3.0 X v3.0 XX1 v3.0 XX1
01001 01101 I 01001 01110 I
01001 lxvll mfbhrbe 01001
v3.0 XX1 v2.07 X
01010 01100 I
01010 lxvdsx 01010
{reserved} v2.06 XX1
01011 01100 I
01011 lxvwsx 01011
{reserved} {reserved} v3.0 XX1
01100 01001 I 01100 01011 I 01100 01100 I 01100 01101 I
01100 divdeu[.] divweu[.] stxvx stxvl 01100
v2.06 XO v2.06 XO v3.0 XX1 v3.0 XX1
01101 01001 I 01101 01011 I 01101 01101 I 01101 01110 I
01101 divde[.] divwe[.] stxvll clrbhrb 01101
v2.06 XO v2.06 XO v3.0 XX1 v2.07 X
01110 01001 I 01110 01011 I
01110 divdu[.] divwu[.] 01110
PPC XO PPC XO
01111 01001 I 01111 01011 I
01111 divd[.] divw[.] 01111
{reserved} PPC XO PPC XO
10000 01000 I /0000 01001 10000 01010 I /0000 01011 10000 01100 I
10000 subfco[.] mulhdu[.] addco[.] mulhwu[.] lxsspx 10000
P1 XO {invalid} P1 XO {invalid} v2.07 XX1
10001 01000 I
10001 subfo[.] 10001
PPC XO
/0010 01001 /0010 01010 /0010 01011 10010 01100 I
10010 mulhd[.] addg6s mulhw[.] lxsdx 10010
{invalid} {invalid} {invalid} v2.06 XX1
10011 01000 I
10011 nego[.] 10011
P1 XO {reserved}
10100 01000 I 10100 01010 I 10100 01100 I 10100 01110 II
10100 subfeo[.] addeo[.] stxsspx tbegin. 10100
P1 XO P1 XO v2.07 XX1 v2.07 X
10101 01110 II
10101 tend. 10101
v2.07 X
10110 01000 I 10110 01010 I 10110 01100 I 10110 01110 II
10110 subfzeo[.] addzeo[.] stxsdx tcheck 10110
P1 XO P1 XO v2.06 XX1 v2.07 X
10111 01000 I 10111 01001 I 10111 01010 I 10111 01011 I 10111 01110 II
10111 subfmeo[.] mulldo[.] addmeo[.] mullwo[.] tsr. 10111
P1 XO PPC XO P1 XO P1 XO v2.07 X
11000 01001 I 11000 01010 I 11000 01011 I 11000 01100 I 11000 01101 I 11000 01110 II
11000 modsd addo[.] modsw lxvw4x lxsibzx tabortwc. 11000
{reserved} v3.0 X P1 XO v3.0 X v2.06 XX1 v3.0 XX1 v2.07 X
11001 01100 I 11001 01101 I 11001 01110 II
11001 lxvh8x lxsihzx tabortdc. 11001
v3.0 XX1 v3.0 XX1 v2.07 X
11010 01100 I 11010 01110 II
11010 lxvd2x tabortwci. 11010
{reserved} v2.06 XX1 v2.07 X
11011 01100 I 11011 01110 II
11011 lxvb16x tabortdci. 11011
{reserved} {reserved} v3.0 XX1 v2.07 X
11100 01001 I 11100 01011 I 11100 01100 I 11100 01101 I 11100 01110 II
11100 divdeuo[.] divweuo[.] stxvw4x stxsibx tabort. 11100
v2.06 XO v2.06 XO v2.06 XX1 v3.0 XX1 v2.07 X
11101 01001 I 11101 01011 I 11101 01100 I 11101 01101 I 11101 01110 II
11101 divdeo[.] divweo[.] stxvh8x stxsihx treclaim. 11101
v2.06 XO v2.06 XO v3.0 XX1 v3.0 XX1 v2.07 X
11110 01001 I 11110 01011 I 11110 01100 I
11110 divduo[.] divwuo[.] stxvd2x 11110
PPC XO PPC XO v2.06 XX1
11111 01001 I 11111 01011 I 11111 01100 I 11111 01110 II
11111 divdo[.] divwo[.] stxvb16x trechkpt. 11111
{reserved} PPC XO PPC XO v3.0 XX1 v2.07 X
Table 20: EXT31: Extended Opcode Map for Primary Opcode 31 (opcode bits 21:30) (Sheet 3 of 4)
10000 10001 10010 10011 10100 10101 10110 10111
00000 10011 I 00000 10100 II 00000 10101 I 00000 10110 II 00000 10111 I
00000 mfcr/mfocrf lwarx ldx icbt lwzx 00000
P1/v2.01 XFX PPC X PPC X v2.07 X P1 X
00001 10011 I 00001 10100 II 00001 10101 I 00001 10110 II 00001 10111 I
00001 mfvsrd lbarx ldux dcbst lwzux 00001
v2.07 XX1 v2.06 X PPC X PPC X P1 X
00010 10011 III 00010 10100 II 00010 10110 II 00010 10111 I
00010 mfmsr ldarx dcbf lbzx 00010
{reserved} P1 P X PPC X PPC X P1 X
00011 10011 I 00011 10100 II 00011 10111 I
00011 mfvsrwz lharx lbzux 00011
{reserved} v2.07 XX1 v2.06 X {reserved} P1 X
00100 10000 I 00100 10010 III 00100 10101 I 00100 10110 II 00100 10111 I
00100 mtcrf/mtocrf mtmsr stdx stwcx. stwx 00100
P1/v2.01 XFX P1 P X {reserved} PPC X PPC X P1 X
00101 10010 III 00101 10011 I 00101 10101 I 00101 10110 I 00101 10111 I
00101 mtmsrd mtvsrd stdux stqcx. stwux 00101
PPC P X v2.07 XX1 PPC X v2.07 X P1 X
00110 10011 I 00110 10110 II 00110 10111 I
00110 mtvsrwa stdcx. stbx 00110
{reserved} v2.07 XX1 PPC X P1 X
00111 10011 I 00111 10110 II 00111 10111 I
00111 mtvsrwz dcbtst stbux 00111
{reserved} v2.07 XX1 PPC X P1 X
01000 10010 III 01000 10100 I 01000 10110 II 01000 10111 I
01000 tlbiel lqarx dcbt lhzx 01000
v2.03 P X v2.07 X {reserved} PPC X P1 X
01001 10010 III 01001 10011 I 01001 10101 I 01001 10111 I
01001 tlbie mfvsrld ldmx lhzux 01001
P1 H X v3.0 XX1 {reserved} v3.0 PI X {reserved} P1 X
01010 10010 III 01010 10011 X 01010 10101 I 01010 10111 I
01010 slbsync mfspr lwax lhax 01010
v3.0 P X P1 O X PPC X {reserved} P1 X
01011 10011 II 01011 10101 I 01011 10111 I
01011 mftb lwaux lhaux 01011
{reserved} PPC X PPC X {reserved} P1 X
01100 10010 III 01100 10011 I 01100 10111 I
01100 slbmte mtvsrws sthx 01100
v2.00 P X v3.0 XX1 P1 X
01101 10010 III 01101 10011 I 01101 10111 I
01101 slbie mtvsrdd sthux 01101
PPC P X v3.0 XX1 {reserved} P1 X
01110 10010 III 01110 10011 X
01110 slbieg mtspr 01110
v3.0 P X P1 O X
01111 10010 III
01111 slbia 01111
PPC P X {reserved}
10000 10010 I 10000 10100 I 10000 10101 I 10000 10110 I 10000 10111 I
10000 nop ldbrx lswx lwbrx lfsx 10000
v2.05 X {reserved} v2.06 X P1 X P1 X P1 X
10001 10010 I 10001 10101 I 10001 10110 III 10001 10111 I
10001 nop lsdx tlbsync lfsux 10001
v2.05 X PPCAS X PPC H X P1 X
10010 10010 I 10010 10101 I 10010 10110 II 10010 10111 I
10010 nop lswi sync lfdx 10010
v2.05 X {reserved} P1 X P1 X P1 X
10011 10010 I 10011 10101 I 10011 10111 I
10011 nop lsdi lfdux 10011
v2.05 X {reserved} PPCAS X {reserved} P1 X
10100 10010 I 10100 10100 I 10100 10101 I 10100 10110 I 10100 10111 I
10100 nop stdbrx stswx stwbrx stfsx 10100
v2.05 X {reserved} v2.06 X P1 X P1 X P1 X
10101 10010 I 10101 10101 I 10101 10110 II 10101 10111 I
10101 nop stsdx stbcx. stfsux 10101
v2.05 X PPCAS X v2.06 X P1 X
10110 10010 I 10110 10101 I 10110 10110 II 10110 10111 I
10110 nop stswi sthcx. stfdx 10110
v2.05 X P1 X v2.06 X P1 X
10111 10010 I 10111 10011 I 10111 10101 I 10111 10111 I
10111 nop darn stsdi stfdux 10111
v2.05 X v3.0 X PPCAS X P1 X
11000 10101 III 11000 10110 I 11000 10111 I
11000 lwzcix lhbrx lfdpx 11000
v2.05 H X P1 X v2.05 X
11001 10101 III
11001 lhzcix 11001
{reserved} v2.05 H X {reserved} {reserved}
11010 10011 III 11010 10101 III 11010 10110 II 11010 10111 I
11010 slbmfev lbzcix eieio lfiwax 11010
v2.00 P X v2.05 H X PPC X v2.05 X
11011 10101 III 11011 10110 III 11011 10111 I
11011 ldcix msgsync lfiwzx 11011
v2.05 H X v3.0 H X v2.06 X
11100 10011 III 11100 10101 III 11100 10110 I 11100 10111 I
11100 slbmfee stwcix sthbrx stfdpx 11100
{reserved} v2.00 P X v2.05 H X P1 X v2.05 X
11101 10101 III
11101 sthcix 11101
{reserved} v2.05 H X {reserved}
11110 10011 III 11110 10101 III 11110 10110 II 11110 10111 I
11110 slbfee. stbcix icbi stfiwx 11110
{reserved} v2.05 P X v2.05 H X PPC X PPC X
11111 10101 III 11111 10110 II
11111 stdcix dcbz 11111
{reserved} v2.05 H X P1 X
Table 20: EXT31: Extended Opcode Map for Primary Opcode 31 (opcode bits 21:30) (Sheet 4 of 4)
11000 11001 11010 11011 11100 11101 11110 11111
00000 11000 I 00000 11010 I 00000 11011 I 00000 11100 I 00000 11110 II
00000 slw[.] cntlzw[.] sld[.] and[.] wait 00000
P1 X P1 X PPC X P1 X {reserved} v3.0 X
00001 11010 I 00001 11100 I 00001 11101 I
00001 cntlzd[.] andc[.] dsixes 00001
PPC X P1 X PPCAS X
00010 11101 I
00010 dtcs. 00010
PPCAS X
00011 11010 I 00011 11100 I
00011 popcntb nor[.] 00011
v2.02 X P1 X
00100 11010 I
00100 prtyw 00100
{reserved} {reserved} v2.05 X
00101 11010 I
00101 prtyd 00101
{reserved} v2.05 X
00110 00110
{reserved} {reserved}
00111 11100 I
00111 bpermd 00111
{reserved} v2.06 X
01000 11010 I 01000 11100 I
01000 cdtbcd eqv[.] 01000
v2.06 X P1 X
01001 11010 I 01001 11100 I
01001 cbcdtd xor[.] 01001
v2.06 X P1 X
01010 01010
01011 11010 I
01011 popcntw 01011
v2.06 X
01100 11100 I
01100 orc[.] 01100
P1 X
01101 11100 I
01101 or[.] 01101
P1 X
01110 11100 I
01110 nand[.] 01110
P1 X
01111 11010 I 01111 11100 I
01111 popcntd cmpb 01111
v2.06 X v2.05 X
10000 11000 I 10000 11010 I 10000 11011 I
10000 srw[.] cnttzw[.] srd[.] 10000
P1 X {reserved} v3.0 X PPC X {reserved}
10001 11010 I
10001 cnttzd[.] 10001
v3.0 X
10010 10010
10011 10011
10100 10100
{reserved} {reserved}
10101 10101
{reserved}
10110 10110
{reserved} {reserved}
10111 10111
{reserved}
11000 11000 I 11000 11010 I
11000 sraw[.] srad[.] 11000
P1 X PPC X
11001 11000 I 11001 1101. I
11001 srawi[.] sradi[.] 11001
P1 X PPC XS
11010 11010
11011 1101. I
11011 extswsli[.] 11011
v3.0 XS
11100 11010 I
11100 extsh[.] 11100
{reserved} {reserved} P1 X
11101 11010 I
11101 extsb[.] 11101
{reserved} PPC X
11110 11010 I
11110 extsw[.] 11110
PPC X
11111 11111
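As a reading aid, the sketch below locates a Primary Opcode 31 instruction word in the map above. It assumes that the row label corresponds to instruction bits 21:25 and the column label to bits 26:30 (which is how the cell labels read), with bit 0 as the most-significant bit of the 32-bit word. The names field, ext31_cell, and SAMPLE_CELLS are illustrative only, and the dictionary holds just a few cells copied from the map, not the whole table.

```python
def field(word, first, last):
    """Return instruction bits first..last (bit 0 = most significant of 32)."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def ext31_cell(word):
    """Return the (row, column) labels of the EXT31 map cell for this word."""
    if field(word, 0, 5) != 31:
        raise ValueError("primary opcode is not 31")
    row = format(field(word, 21, 25), "05b")     # assumption: row = bits 21:25
    column = format(field(word, 26, 30), "05b")  # assumption: column = bits 26:30
    return row, column


# A handful of cells copied from the map above, keyed by (row, column).
SAMPLE_CELLS = {
    ("00000", "11011"): "sld[.]",
    ("01101", "11100"): "or[.]",
    ("01111", "01011"): "divw[.]",
    ("11000", "01010"): "addo[.]",
}

# Hypothetical operand fields (3, 3, 4); only bits 0:5 and 21:30 pick the cell.
addo_word = (31 << 26) | (3 << 21) | (3 << 16) | (4 << 11) | (0b1100001010 << 1)
cell = ext31_cell(addo_word)
print(cell, SAMPLE_CELLS.get(cell))
```

Bit 31 is left out of the lookup on purpose, since the map covers opcode bits 21:30 only.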
Table 21: EXT59: Extended Opcode Map for Primary Opcode 59 (opcode bits 21:30) (Sheet 1 of 4)
00000 00001 00010 00011 00100 00101 00110 00111
00000 00010 I ..000 00011 I
00000 dadd[.] dqua[.] 00000
v2.05 X v2.05 Z23
00001 00010 I ..001 00011 I
00001 dmul[.] drrnd[.] 00001
v2.05 X v2.05 Z23
.0010 00010 I ..010 00011 I
00010 dscli[.] dquai[.] 00010
v2.05 Z22 v2.05 Z23
.0011 00010 I ..011 00011 I
00011 dscri[.] drintx[.] 00011
v2.05 Z22 v2.05 Z23
00100 00010 I
00100 dcmpo 00100
v2.05 X
00101 00010 I
00101 dtstex 00101
v2.05 X
.0110 00010 I
00110 dtstdc 00110
v2.05 Z22
.0111 00010 I ..111 00011 I
00111 dtstdg drintn[.] 00111
v2.05 Z22 v2.05 Z23
01000 00010 I ..000 00011 I
01000 dctdp[.] dqua[.] 01000
v2.05 X v2.05 Z23
01001 00010 I ..001 00011 I
01001 dctfix[.] drrnd[.] 01001
v2.05 X v2.05 Z23
01010 00010 I ..010 00011 I
01010 ddedpd[.] dquai[.] 01010
v2.05 X v2.05 Z23
01011 00010 I ..011 00011 I
01011 dxex[.] drintx[.] 01011
v2.05 X v2.05 Z23
01100 01100
01101 01101
01110 01110
..111 00011 I
01111 drintn[.] 01111
v2.05 Z23
10000 00010 I ..000 00011 I
10000 dsub[.] dqua[.] 10000
v2.05 X v2.05 Z23
10001 00010 I ..001 00011 I
10001 ddiv[.] drrnd[.] 10001
v2.05 X v2.05 Z23
.0010 00010 I ..010 00011 I
10010 dscli[.] dquai[.] 10010
v2.05 Z22 v2.05 Z23
.0011 00010 I ..011 00011 I
10011 dscri[.] drintx[.] 10011
v2.05 Z22 v2.05 Z23
10100 00010 I
10100 dcmpu 10100
v2.05 X
10101 00010 I 10101 00011 I
10101 dtstsf dtstsfi 10101
v2.05 X v3.0 X
.0110 00010 I
10110 dtstdc 10110
v2.05 Z22
.0111 00010 I ..111 00011 I
10111 dtstdg drintn[.] 10111
v2.05 Z22 v2.05 Z23
11000 00010 I ..000 00011 I
11000 drsp[.] dqua[.] 11000
v2.05 X v2.05 Z23
11001 00010 I ..001 00011 I
11001 dcffix[.] drrnd[.] 11001
v2.06 X v2.05 Z23
11010 00010 I ..010 00011 I
11010 denbcd[.] dquai[.] 11010
v2.05 X v2.05 Z23
11011 00010 I ..011 00011 I
11011 diex[.] drintx[.] 11011
v2.05 X v2.05 Z23
11100 11100
11101 11101
11110 11110
..111 00011 I
11111 drintn[.] 11111
v2.05 Z23
Table 21: EXT59: Extended Opcode Map for Primary Opcode 59 (opcode bits 21:30) (Sheet 2 of 4)
01000 01001 01010 01011 01100 01101 01110 01111
00000 00000
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 01110 I
11010 fcfids[.] 11010
v2.06 X
11011 11011
11100 11100
11101 11101
11110 01110 I
11110 fcfidus[.] 11110
v2.06 X
11111 11111
Table 21: EXT59: Extended Opcode Map for Primary Opcode 59 (opcode bits 21:30) (Sheet 3 of 4)
10000 10001 10010 10011 10100 10101 10110 10111
///// 10010 I ///// 10100 I ///// 10101 I ///// 10110 I
00000 fdivs[.] fsubs[.] fadds[.] fsqrts[.] 00000
PPC A PPC A PPC A PPC A
///// 10010 ///// 10100 ///// 10101 ///// 10110
00001 fdivs[.] fsubs[.] fadds[.] fsqrts[.] 00001
{invalid} {invalid} {invalid} {invalid}
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 21: EXT59: Extended Opcode Map for Primary Opcode 59 (opcode bits 21:30) (Sheet 4 of 4)
11000 11001 11010 11011 11100 11101 11110 11111
///// 11000 I ..... 11001 I ///// 11010 I ..... 11100 I ..... 11101 I ..... 11110 I ..... 11111 I
00000 fres[.] fmuls[.] frsqrtes[.] fmsubs[.] fmadds[.] fnmsubs[.] fnmadds[.] 00000
PPC A PPC A v2.02 A PPC A PPC A PPC A PPC A
///// 11000 ///// 11010
00001 fres[.] frsqrtes[.] 00001
{invalid} {invalid}
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 22: EXT60: Extended Opcode Map for Primary Opcode 60 (opcode bits 21:30) (Sheet 1 of 4)
00000 00001 00010 00011 00100 00101 00110 00111
00000 000.. I 00000 001.. I
00000 xsaddsp xsmaddasp 00000
v2.07 XX3 v2.07 XX3
00001 000.. I 00001 001.. I
00001 xssubsp xsmaddmsp 00001
v2.07 XX3 v2.07 XX3
00010 000.. I 00010 001.. I
00010 xsmulsp xsmsubasp 00010
v2.07 XX3 v2.07 XX3
00011 000.. I 00011 001.. I
00011 xsdivsp xsmsubmsp 00011
v2.07 XX3 v2.07 XX3
00100 000.. I 00100 001.. I
00100 xsadddp xsmaddadp 00100
v2.06 XX3 v2.06 XX3
00101 000.. I 00101 001.. I
00101 xssubdp xsmaddmdp 00101
v2.06 XX3 v2.06 XX3
00110 000.. I 00110 001.. I
00110 xsmuldp xsmsubadp 00110
v2.06 XX3 v2.06 XX3
00111 000.. I 00111 001.. I
00111 xsdivdp xsmsubmdp 00111
v2.06 XX3 v2.06 XX3
01000 000.. I 01000 001.. I
01000 xvaddsp xvmaddasp 01000
v2.06 XX3 v2.06 XX3
01001 000.. I 01001 001.. I
01001 xvsubsp xvmaddmsp 01001
v2.06 XX3 v2.06 XX3
01010 000.. I 01010 001.. I
01010 xvmulsp xvmsubasp 01010
v2.06 XX3 v2.06 XX3
01011 000.. I 01011 001.. I
01011 xvdivsp xvmsubmsp 01011
v2.06 XX3 v2.06 XX3
01100 000.. I 01100 001.. I
01100 xvadddp xvmaddadp 01100
v2.06 XX3 v2.06 XX3
01101 000.. I 01101 001.. I
01101 xvsubdp xvmaddmdp 01101
v2.06 XX3 v2.06 XX3
01110 000.. I 01110 001.. I
01110 xvmuldp xvmsubadp 01110
v2.06 XX3 v2.06 XX3
01111 000.. I 01111 001.. I
01111 xvdivdp xvmsubmdp 01111
v2.06 XX3 v2.06 XX3
10000 000.. I 10000 001.. I
10000 xsmaxcdp xsnmaddasp 10000
v3.0 XX3 v2.07 XX3
10001 000.. I 10001 001.. I
10001 xsmincdp xsnmaddmsp 10001
v3.0 XX3 v2.07 XX3
10010 000.. I 10010 001.. I
10010 xsmaxjdp xsnmsubasp 10010
v3.0 XX3 v2.07 XX3
10011 000.. I 10011 001.. I
10011 xsminjdp xsnmsubmsp 10011
v3.0 XX3 v2.07 XX3
10100 000.. I 10100 001.. I
10100 xsmaxdp xsnmaddadp 10100
v2.06 XX3 v2.06 XX3
10101 000.. I 10101 001.. I
10101 xsmindp xsnmaddmdp 10101
v2.06 XX3 v2.06 XX3
10110 000.. I 10110 001.. I
10110 xscpsgndp xsnmsubadp 10110
v2.06 XX3 v2.06 XX3
10111 001.. I
10111 xsnmsubmdp 10111
v2.06 XX3
11000 000.. I 11000 001.. I
11000 xvmaxsp xvnmaddasp 11000
v2.06 XX3 v2.06 XX3
11001 000.. I 11001 001.. I
11001 xvminsp xvnmaddmsp 11001
v2.06 XX3 v2.06 XX3
11010 000.. I 11010 001.. I
11010 xvcpsgnsp xvnmsubasp 11010
v2.06 XX3 v2.06 XX3
11011 000.. I 11011 001.. I
11011 xviexpsp xvnmsubmsp 11011
v3.0 XX3 v2.06 XX3
11100 000.. I 11100 001.. I
11100 xvmaxdp xvnmaddadp 11100
v2.06 XX3 v2.06 XX3
11101 000.. I 11101 001.. I
11101 xvmindp xvnmaddmdp 11101
v2.06 XX3 v2.06 XX3
11110 000.. I 11110 001.. I
11110 xvcpsgndp xvnmsubadp 11110
v2.06 XX3 v2.06 XX3
11111 000.. I 11111 001.. I
11111 xviexpdp xvnmsubmdp 11111
v3.0 XX3 v2.06 XX3
Table 22: EXT60: Extended Opcode Map for Primary Opcode 60 (opcode bits 21:30) (Sheet 2 of 4)
01000 01001 01010 01011 01100 01101 01110 01111
0..00 010.. I 00000 011.. I
00000 xxsldwi xscmpeqdp 00000
v2.06 XX3 v3.0 XX3
0..01 010.. I 00001 011.. I
00001 xxpermdi xscmpgtdp 00001
v2.06 XX3 v3.0 XX3
00010 010.. I 00010 011.. I
00010 xxmrghw xscmpgedp 00010
v2.06 XX3 v3.0 XX3
00011 010.. I 00011 011.. I
00011 xxperm xscmpnedp 00011
v3.0 XX3 v3.0 XX3
0..00 010.. I 00100 011.. I
00100 xxsldwi xscmpudp 00100
v2.06 XX3 v2.06 XX3
0..01 010.. I 00101 011.. I
00101 xxpermdi xscmpodp 00101
v2.06 XX3 v2.06 XX3
00110 010.. I
00110 xxmrglw 00110
v2.06 XX3
00111 010.. I 00111 011.. I
00111 xxpermr xscmpexpdp 00111
v3.0 XX3 v3.0 XX3
0..00 010.. I 01000 011.. I
01000 xxsldwi xvcmpeqsp 01000
v2.06 XX3 v2.06 XX3
0..01 010.. I 01001 011.. I
01001 xxpermdi xvcmpgtsp 01001
v2.06 XX3 v2.06 XX3
01010 0100. I 01010 0101. I 01010 011.. I
01010 xxspltw xxextractuw xvcmpgesp 01010
v2.06 XX2 v3.0 XX2 v2.06 XX3
01011 01000 I 01011 0101. I 01011 011.. I
01011 XPND60-1 xxinsertw xvcmpnesp 01011
{expanded} v3.0 XX2 v3.0 XX3
0..00 010.. I 01100 011.. I
01100 xxsldwi xvcmpeqdp 01100
v2.06 XX3 v2.06 XX3
0..01 010.. I 01101 011.. I
01101 xxpermdi xvcmpgtdp 01101
v2.06 XX3 v2.06 XX3
01110 011.. I
01110 xvcmpgedp 01110
v2.06 XX3
01111 011.. I
01111 xvcmpnedp 01111
v3.0 XX3
10000 010.. I
10000 xxland 10000
v2.06 XX3
10001 010.. I
10001 xxlandc 10001
v2.06 XX3
10010 010.. I
10010 xxlor 10010
v2.06 XX3
10011 010.. I
10011 xxlxor 10011
v2.06 XX3
10100 010.. I
10100 xxlnor 10100
v2.06 XX3
10101 010.. I
10101 xxlorc 10101
v2.07 XX3
10110 010.. I
10110 xxlnand 10110
v2.07 XX3
10111 010.. I
10111 xxleqv 10111
v2.07 XX3
11000 011.. I
11000 xvcmpeqsp. 11000
v2.06 XX3
11001 011.. I
11001 xvcmpgtsp. 11001
v2.06 XX3
11010 011.. I
11010 xvcmpgesp. 11010
v2.06 XX3
11011 011.. I
11011 xvcmpnesp. 11011
v3.0 XX3
11100 011.. I
11100 xvcmpeqdp. 11100
v2.06 XX3
11101 011.. I
11101 xvcmpgtdp. 11101
v2.06 XX3
11110 011.. I
11110 xvcmpgedp. 11110
v2.06 XX3
11111 011.. I
11111 xvcmpnedp. 11111
v3.0 XX3
Table 22: EXT60: Extended Opcode Map for Primary Opcode 60 (opcode bits 21:30) (Sheet 3 of 4)
10000 10001 10010 10011 10100 10101 10110 10111
00000 1010. I 00000 1011. I
00000 xsrsqrtesp xssqrtsp 00000
v2.07 XX2 v2.07 XX2
00001 1010. I
00001 xsresp 00001
v2.07 XX2
00010 00010
00011 00011
Table 22: EXT60: Extended Opcode Map for Primary Opcode 60 (opcode bits 21:30) (Sheet 4 of 4)
11000 11001 11010 11011 11100 11101 11110 11111
..... 11... I
00000 xxsel 00000
v2.06 XX4
00001 00001
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 23: XPND60-1: Expanded Opcode Map for PO=60 XO=0b01011_01000 (opcode bits 21:30)
Row  Column             Mnemonic  Version  Format  Book
00   ... (all columns)  xxspltib  v3.0     XX1     I
Rows 01, 10, and 11 are blank.
Table 24: XPND60-2: Expanded Opcode Map for PO=60 XO=0b10101_1011. (opcode bits 21:30)
Row  Column  Mnemonic  Version  Format  Book
00   000     xsxexpdp  v3.0     XX2     I
00   001     xsxsigdp  v3.0     XX2     I
10   000     xscvhpdp  v3.0     XX2     I
10   001     xscvdphp  v3.0     XX2     I
All other cells (columns 000 through 111 of rows 00 through 11) are blank.
Table 25: XPND60-3: Expanded Opcode Map for PO=60 XO=0b11101_1011. (opcode bits 21:30)
Row  Column  Mnemonic  Version  Format  Book
00   000     xvxexpdp  v3.0     XX2     I
00   001     xvxsigdp  v3.0     XX2     I
00   111     xxbrh     v3.0     XX2     I
01   000     xvxexpsp  v3.0     XX2     I
01   001     xvxsigsp  v3.0     XX2     I
01   111     xxbrw     v3.0     XX2     I
10   111     xxbrd     v3.0     XX2     I
11   000     xvcvhpsp  v3.0     XX2     I
11   001     xvcvsphp  v3.0     XX2     I
11   111     xxbrq     v3.0     XX2     I
All other cells (columns 000 through 111 of rows 00 through 11) are blank.
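The two-level lookup implied by the XPND60-3 map can be sketched as follows. This is a minimal illustration, not a complete decoder: it assumes bit 0 is the most-significant bit of the 32-bit word, that the "." in the XO shown in the caption marks bit 30 as not part of the match, and that the row and column labels together form instruction bits 11:15. The names field, XPND60_3, and decode_xpnd60_3 are hypothetical.

```python
def field(word, first, last):
    """Return instruction bits first..last (bit 0 = most significant of 32)."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


# Entries copied from the XPND60-3 map; key = row bits followed by column bits.
XPND60_3 = {
    0b00000: "xvxexpdp", 0b00001: "xvxsigdp", 0b00111: "xxbrh",
    0b01000: "xvxexpsp", 0b01001: "xvxsigsp", 0b01111: "xxbrw",
    0b10111: "xxbrd",
    0b11000: "xvcvhpsp", 0b11001: "xvcvsphp", 0b11111: "xxbrq",
}


def decode_xpnd60_3(word):
    """Return the mnemonic for a PO=60, XO=0b11101_1011. word, else None."""
    if field(word, 0, 5) != 60:
        return None
    if field(word, 21, 29) != 0b111011011:  # bit 30 is ignored (assumption)
        return None
    selector = field(word, 11, 15)          # assumption: expanded opcode bits
    return XPND60_3.get(selector)           # None for a blank cell


# Hypothetical word: all operand fields zero, selector 0b10111.
word = (60 << 26) | (0b10111 << 16) | (0b111011011 << 2)
print(decode_xpnd60_3(word))
```

Returning None for both a non-matching opcode and a blank cell keeps the sketch short; a real decoder would probably distinguish the two cases.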
Table 26: EXT63: Extended Opcode Map for Primary Opcode 63 (opcode bits 21:30) (Sheet 1 of 4)
00000 00001 00010 00011 00100 00101 00110 00111
00000 00000 I 00000 00010 I ..000 00011 I 00000 00100 I ..000 00101 I
00000 fcmpu daddq[.] dquaq[.] xsaddqp xsrqpi[x] 00000
P1 X v2.05 X v2.05 Z23 v3.0 X v3.0 X
00001 00000 I 00001 00010 I ..001 00011 I 00001 00100 I ..001 00101 I 00001 00110 I
00001 fcmpo dmulq[.] drrndq[.] xsmulqp[o] xsrqpxp mtfsb1[.] 00001
P1 X v2.05 X v2.05 Z23 v3.0 X v3.0 X P1 X
00010 00000 I .0010 00010 I ..010 00011 I 00010 00110 I
00010 mcrfs dscliq[.] dquaiq[.] mtfsb0[.] 00010
P1 X v2.05 Z22 v2.05 Z23 P1 X
.0011 00010 I ..011 00011 I 00011 00100 I
00011 dscriq[.] drintxq[.] xscpsgnqp 00011
v2.05 Z22 v2.05 Z23 v3.0 X
00100 00000 I 00100 00010 I 00100 00100 I 00100 00110 I
00100 ftdiv dcmpoq xscmpoqp mtfsfi[.] 00100
v2.06 X v2.05 X v3.0 X P1 X
00101 00000 I 00101 00010 I 00101 00100 I
00101 ftsqrt dtstexq xscmpexpqp 00101
v2.06 X v2.05 X v3.0 X
.0110 00010 I
00110 dtstdcq 00110
v2.05 Z22
.0111 00010 I ..111 00011 I
00111 dtstdgq drintnq[.] 00111
v2.05 Z22 v2.05 Z23
01000 00010 I ..000 00011 I ..000 00101 I
01000 dctqpq[.] dquaq[.] xsrqpi[x] 01000
v2.05 X v2.05 Z23 v3.0 X
01001 00010 I ..001 00011 I ..001 00101 I
01001 dctfixq[.] drrndq[.] xsrqpxp 01001
v2.05 X v2.05 Z23 v3.0 X
01010 00010 I ..010 00011 I
01010 ddedpdq[.] dquaiq[.] 01010
v2.05 X v2.05 Z23
01011 00010 I ..011 00011 I
01011 dxexq[.] drintxq[.] 01011
v2.05 X v2.05 Z23
01100 00100 I
01100 xsmaddqp[o] 01100
v3.0 X
01101 00100 I
01101 xsmsubqp[o] 01101
v3.0 X
01110 00100 I
01110 xsnmaddqp[o] 01110
v3.0 X
..111 00011 I 01111 00100 I
01111 drintnq[.] xsnmsubqp[o] 01111
v2.05 Z23 v3.0 X
10000 00010 I ..000 00011 I 10000 00100 I ..000 00101 I
10000 dsubq[.] dquaq[.] xssubqp[o] xsrqpi[x] 10000
v2.05 X v2.05 Z23 v3.0 X v3.0 X
10001 00010 I ..001 00011 I 10001 00100 I ..001 00101 I
10001 ddivq[.] drrndq[.] xsdivqp[o] xsrqpxp 10001
v2.05 X v2.05 Z23 v3.0 X v3.0 X
.0010 00010 I ..010 00011 I 10010 00111 I
10010 dscliq[.] dquaiq[.] mffs[.] 10010
v2.05 Z22 v2.05 Z23 P1 X
.0011 00010 I ..011 00011 I
10011 dscriq[.] drintxq[.] 10011
v2.05 Z22 v2.05 Z23
10100 00010 I 10100 00100 I
10100 dcmpuq xscmpuqp 10100
v2.05 X v3.0 X
10101 00010 I 10101 00011 I
10101 dtstsfq dtstsfiq 10101
v2.05 X v3.0 X
.0110 00010 I 10110 00100 I 10110 00111 I
10110 dtstdcq xststdcqp mtfsf[.] 10110
v2.05 Z22 v3.0 X P1 XFL
.0111 00010 I ..111 00011 I
10111 dtstdgq drintnq[.] 10111
v2.05 Z22 v2.05 Z23
11000 00010 I ..000 00011 I ..000 00101 I
11000 drdpq[.] dquaq[.] xsrqpi[x] 11000
v2.05 X v2.05 Z23 v3.0 Z23
11001 00010 I ..001 00011 I 11001 00100 I ..001 00101 I
11001 dcffixq[.] drrndq[.] XPND63-1 xsrqpxp 11001
v2.05 X v2.05 Z23 {expanded} v3.0 Z23
11010 00010 I ..010 00011 I 11010 00100 I 11010 00110 I
11010 denbcdq[.] dquaiq[.] XPND63-2 fmrgow 11010
v2.05 X v2.05 Z23 {expanded} v2.07 X
11011 00010 I ..011 00011 I 11011 00100 I
11011 diexq[.] drintxq[.] xsiexpqp 11011
v2.05 X v2.05 Z23 v3.0 X
11100 11100
11101 11101
11110 00110 I
11110 fmrgew 11110
v2.07 X
..111 00011 I
11111 drintnq[.] 11111
v2.05 Z23
Table 26: EXT63: Extended Opcode Map for Primary Opcode 63 (opcode bits 21:30) (Sheet 2 of 4)
01000 01001 01010 01011 01100 01101 01110 01111
00000 01000 I 00000 01100 I 00000 01110 I 00000 01111 I
00000 fcpsgn[.] frsp[.] fctiw[.] fctiwz[.] 00000
v2.05 X P1 X P2 X P2 X
00001 01000 I
00001 fneg[.] 00001
P1 X
00010 01000 I
00010 fmr[.] 00010
P1 X
00011 00011
00101 00101
00110 00110
00111 00111
01000 01000 I
01000 fabs[.] 01000
P1 X
01001 01001
01010 01010
01011 01011
01100 01000 I
01100 frin[.] 01100
v2.02 X
01101 01000 I
01101 friz[.] 01101
v2.02 X
01110 01000 I
01110 frip[.] 01110
v2.02 X
01111 01000 I
01111 frim[.] 01111
v2.02 X
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11011 11011
11100 11100
11111 11111
Table 26: EXT63: Extended Opcode Map for Primary Opcode 63 (opcode bits 21:30) (Sheet 3 of 4)
10000 10001 10010 10011 10100 10101 10110 10111
///// 10010 I ///// 10100 I ///// 10101 I ///// 10110 I ..... 10111 I
00000 fdiv[.] fsub[.] fadd[.] fsqrt[.] fsel[.] 00000
P1 A P1 A P1 A P2 A PPC A
///// 10010 ///// 10100 ///// 10101 ///// 10110
00001 fdiv[.] fsub[.] fadd[.] fsqrt[.] 00001
{invalid} {invalid} {invalid} {invalid}
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 26: EXT63: Extended Opcode Map for Primary Opcode 63 (opcode bits 21:30) (Sheet 4 of 4)
11000 11001 11010 11011 11100 11101 11110 11111
///// 11000 I ..... 11001 I ///// 11010 I ..... 11100 I ..... 11101 I ..... 11110 I ..... 11111 I
00000 fre[.] fmul[.] frsqrte[.] fmsub[.] fmadd[.] fnmsub[.] fnmadd[.] 00000
v2.02 A P1 A PPC A P1 A P1 A P1 A P1 A
///// 11000 ///// 11010
00001 fre[.] frsqrte[.] 00001
{invalid} {invalid}
00010 00010
00011 00011
00100 00100
00101 00101
00110 00110
00111 00111
01000 01000
01001 01001
01010 01010
01011 01011
01100 01100
01101 01101
01110 01110
01111 01111
10000 10000
10001 10001
10010 10010
10011 10011
10100 10100
10101 10101
10110 10110
10111 10111
11000 11000
11001 11001
11010 11010
11011 11011
11100 11100
11101 11101
11110 11110
11111 11111
Table 27: XPND63-1: Expanded Opcode Map for PO=63 XO=0b11001_00100 (opcode bits 21:30)
Row  Column  Mnemonic     Version  Format  Book
00   000     xsabsqp      v3.0     X       I
00   010     xsxexpqp     v3.0     X       I
01   000     xsnabsqp     v3.0     X       I
10   000     xsnegqp      v3.0     X       I
10   010     xsxsigqp     v3.0     X       I
11   011     xssqrtqp[o]  v3.0     X       I
All other cells (columns 000 through 111 of rows 00 through 11) are blank.
Table 28: XPND63-2: Expanded Opcode Map for PO=63 XO=0b11010_00100 (opcode bits 21:30)
Row  Column  Mnemonic     Version  Format  Book
00   001     xscvqpuwz    v3.0     X       I
00   010     xscvudqp     v3.0     X       I
01   001     xscvqpswz    v3.0     X       I
01   010     xscvsdqp     v3.0     X       I
10   001     xscvqpudz    v3.0     X       I
10   100     xscvqpdp[o]  v3.0     X       I
10   110     xscvdpqp     v3.0     X       I
11   001     xscvqpsdz    v3.0     X       I
All other cells (columns 000 through 111 of rows 00 through 11) are blank.
This appendix lists all the instructions in the Power ISA, sorted by primary opcode, then by extended opcode bits
26:31 (if any), then by opcode bits 21:25 (if any), then by expanded opcode bits 11:15 (if any).
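The ordering described above can be expressed as a four-part sort key. The sketch below is illustrative: it assumes each row's bit pattern is written as 32 characters in space-separated groups, and that operand positions (".") and reserved positions ("/") compare as 0, which is a guess that happens to reproduce the order of the sample rows. The names sort_key and rows are hypothetical.

```python
def sort_key(pattern):
    """Build the (primary, bits 26:31, bits 21:25, bits 11:15) sort key."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected a 32-bit pattern")

    def field(first, last):
        # Assumption: '.' and '/' positions are treated as 0 for ordering.
        text = bits[first:last + 1].replace(".", "0").replace("/", "0")
        return int(text, 2)

    return (field(0, 5), field(26, 31), field(21, 25), field(11, 15))


# Bit patterns and mnemonics copied from the listing, deliberately shuffled.
rows = [
    ("000100 ..... ..... ..... 00000 000100", "vrlb"),
    ("000100 ..... 00001 ..... 11000 000010", "vctzlsbb"),
    ("000100 ..... ..... ..... 1.111 000001", "bcdsr."),
    ("000100 ..... ..... ..... 00000 000010", "vmaxub"),
    ("000100 ..... 00010 ..... 1.110 000001", "bcdcfsq."),
    ("000100 ..... 00000 ..... 11000 000010", "vclzlsbb"),
]

ordered = [name for _, name in sorted(rows, key=lambda r: sort_key(r[0]))]
print(ordered)
```

Python's tuple comparison applies the four criteria in sequence, so no custom comparator is needed.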
Instruction¹  Format  Book  Page  Version²  Privilege³  Mode Dep⁴  Mnemonic
1 11111 11112 22222 222233
012345 6789 0 12345 67890 12345 678901 Name
000100 ..... 00010 ..... 1.110 000001 VX I 356 v3.0 bcdcfsq. Decimal Convert From Signed Qword & record
000100 ..... 00100 ..... 1.110 000001 VX I 355 v3.0 bcdctz. Decimal Convert To Zoned & record
000100 ..... 00101 ..... 1/110 000001 VX I 354 v3.0 bcdctn. Decimal Convert To National & record
000100 ..... 00110 ..... 1.110 000001 VX I 353 v3.0 bcdcfz. Decimal Convert From Zoned & record
000100 ..... 00111 ..... 1.110 000001 VX I 352 v3.0 bcdcfn. Decimal Convert From National & record
000100 ..... 11111 ..... 1.110 000001 VX I 358 v3.0 bcdsetsgn. Decimal Set Sign & record
000100 ..... ..... ..... 1.111 000001 VX I 361 v3.0 bcdsr. Decimal Shift & Round & record
000100 ..... ..... ..... 00000 000010 VX I 302 v2.03 vmaxub Vector Maximum Unsigned Byte
000100 ..... ..... ..... 00001 000010 VX I 303 v2.03 vmaxuh Vector Maximum Unsigned Hword
000100 ..... ..... ..... 00010 000010 VX I 303 v2.03 vmaxuw Vector Maximum Unsigned Word
000100 ..... ..... ..... 00011 000010 VX I 302 v2.07 vmaxud Vector Maximum Unsigned Dword
000100 ..... ..... ..... 00100 000010 VX I 302 v2.03 vmaxsb Vector Maximum Signed Byte
000100 ..... ..... ..... 00101 000010 VX I 303 v2.03 vmaxsh Vector Maximum Signed Hword
000100 ..... ..... ..... 00110 000010 VX I 303 v2.03 vmaxsw Vector Maximum Signed Word
000100 ..... ..... ..... 00111 000010 VX I 302 v2.07 vmaxsd Vector Maximum Signed Dword
000100 ..... ..... ..... 01000 000010 VX I 304 v2.03 vminub Vector Minimum Unsigned Byte
000100 ..... ..... ..... 01001 000010 VX I 305 v2.03 vminuh Vector Minimum Unsigned Hword
000100 ..... ..... ..... 01010 000010 VX I 305 v2.03 vminuw Vector Minimum Unsigned Word
000100 ..... ..... ..... 01011 000010 VX I 304 v2.07 vminud Vector Minimum Unsigned Dword
000100 ..... ..... ..... 01100 000010 VX I 304 v2.03 vminsb Vector Minimum Signed Byte
000100 ..... ..... ..... 01101 000010 VX I 305 v2.03 vminsh Vector Minimum Signed Hword
000100 ..... ..... ..... 01110 000010 VX I 305 v2.03 vminsw Vector Minimum Signed Word
000100 ..... ..... ..... 01111 000010 VX I 304 v2.07 vminsd Vector Minimum Signed Dword
000100 ..... ..... ..... 10000 000010 VX I 299 v2.03 vavgub Vector Average Unsigned Byte
000100 ..... ..... ..... 10001 000010 VX I 299 v2.03 vavguh Vector Average Unsigned Hword
000100 ..... ..... ..... 10010 000010 VX I 299 v2.03 vavguw Vector Average Unsigned Word
000100 ..... ..... ..... 10100 000010 VX I 298 v2.03 vavgsb Vector Average Signed Byte
000100 ..... ..... ..... 10101 000010 VX I 298 v2.03 vavgsh Vector Average Signed Hword
000100 ..... ..... ..... 10110 000010 VX I 298 v2.03 vavgsw Vector Average Signed Word
000100 ..... 00000 ..... 11000 000010 VX I 345 v3.0 vclzlsbb Vector Count Leading Zero Least-Significant Bits Byte
000100 ..... 00001 ..... 11000 000010 VX I 345 v3.0 vctzlsbb Vector Count Trailing Zero Least-Significant Bits Byte
000100 ..... 00110 ..... 11000 000010 VX I 295 v3.0 vnegw Vector Negate Word
000100 ..... 00111 ..... 11000 000010 VX I 295 v3.0 vnegd Vector Negate Dword
000100 ..... 01000 ..... 11000 000010 VX I 317 v3.0 vprtybw Vector Parity Byte Word
000100 ..... 01001 ..... 11000 000010 VX I 317 v3.0 vprtybd Vector Parity Byte Dword
000100 ..... 01010 ..... 11000 000010 VX I 317 v3.0 vprtybq Vector Parity Byte Qword
000100 ..... 10000 ..... 11000 000010 VX I 296 v3.0 vextsb2w Vector Extend Sign Byte to Word
000100 ..... 10001 ..... 11000 000010 VX I 296 v3.0 vextsh2w Vector Extend Sign Hword to Word
000100 ..... 11000 ..... 11000 000010 VX I 296 v3.0 vextsb2d Vector Extend Sign Byte to Dword
000100 ..... 11001 ..... 11000 000010 VX I 296 v3.0 vextsh2d Vector Extend Sign Hword to Dword
000100 ..... 11010 ..... 11000 000010 VX I 297 v3.0 vextsw2d Vector Extend Sign Word to Dword
000100 ..... 11100 ..... 11000 000010 VX I 344 v3.0 vctzb Vector Count Trailing Zeros Byte
000100 ..... 11101 ..... 11000 000010 VX I 344 v3.0 vctzh Vector Count Trailing Zeros Hword
000100 ..... 11110 ..... 11000 000010 VX I 344 v3.0 vctzw Vector Count Trailing Zeros Word
000100 ..... 11111 ..... 11000 000010 VX I 344 v3.0 vctzd Vector Count Trailing Zeros Dword
000100 ..... ..... ..... 11010 000010 VX I 338 v2.07 vshasigmaw Vector SHA-256 Sigma Word
000100 ..... ..... ..... 11011 000010 VX I 338 v2.07 vshasigmad Vector SHA-512 Sigma Dword
000100 ..... ///// ..... 11100 000010 VX I 343 v2.07 vclzb Vector Count Leading Zeros Byte
000100 ..... ///// ..... 11101 000010 VX I 343 v2.07 vclzh Vector Count Leading Zeros Hword
000100 ..... ///// ..... 11110 000010 VX I 343 v2.07 vclzw Vector Count Leading Zeros Word
000100 ..... ///// ..... 11111 000010 VX I 343 v2.07 vclzd Vector Count Leading Zeros Dword
000100 ..... ..... ..... 10000 000011 VX I 300 v3.0 vabsdub Vector Absolute Difference Unsigned Byte
000100 ..... ..... ..... 10001 000011 VX I 300 v3.0 vabsduh Vector Absolute Difference Unsigned Hword
000100 ..... ..... ..... 10010 000011 VX I 301 v3.0 vabsduw Vector Absolute Difference Unsigned Word
000100 ..... ///// ..... 11100 000011 VX I 348 v2.07 vpopcntb Vector Population Count Byte
000100 ..... ///// ..... 11101 000011 VX I 348 v2.07 vpopcnth Vector Population Count Hword
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 2 of 17)
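Each row's bit pattern in this figure can be read as a mask and value pair for matching an instruction word. The following sketch assumes bit 0 is the most-significant bit of the word and treats both "." (operand) and "/" (reserved) positions as don't-care; a stricter decoder might instead require reserved bits to be 0. The function names compile_pattern and matches are hypothetical.

```python
def compile_pattern(pattern):
    """Turn a 32-character bit pattern into a (mask, value) pair."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected a 32-bit pattern")
    mask = value = 0
    for ch in bits:
        mask <<= 1
        value <<= 1
        if ch in "01":
            mask |= 1          # this bit position must match
            value |= int(ch)
        elif ch not in "./":   # assumption: '.' and '/' are don't-care
            raise ValueError("unexpected character: " + ch)
    return mask, value


def matches(word, pattern):
    """Return True if the word agrees with every fixed bit of the pattern."""
    mask, value = compile_pattern(pattern)
    return (word & mask) == value


VMAXUB = "000100 ..... ..... ..... 00000 000010"
VCLZLSBB = "000100 ..... 00000 ..... 11000 000010"

# Hypothetical operand fields (1, 2, 3) placed in bits 6:10, 11:15, 16:20.
word = (4 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | 0b000010
print(matches(word, VMAXUB), matches(word, VCLZLSBB))
```

Rows that fix bits 11:15 (such as vclzlsbb) share their low-order bits with other rows, so a full decoder would test the more specific patterns first.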
000100 ..... ///// ..... 11110 000011 VX I 348 v2.07 vpopcntw Vector Population Count Word
000100 ..... ///// ..... 11111 000011 VX I 348 v2.07 vpopcntd Vector Population Count Dword
000100 ..... ..... ..... 00000 000100 VX I 318 v2.03 vrlb Vector Rotate Left Byte
000100 ..... ..... ..... 00001 000100 VX I 318 v2.03 vrlh Vector Rotate Left Hword
000100 ..... ..... ..... 00010 000100 VX I 318 v2.03 vrlw Vector Rotate Left Word
000100 ..... ..... ..... 00011 000100 VX I 318 v2.07 vrld Vector Rotate Left Dword
000100 ..... ..... ..... 00100 000100 VX I 319 v2.03 vslb Vector Shift Left Byte
000100 ..... ..... ..... 00101 000100 VX I 319 v2.03 vslh Vector Shift Left Hword
000100 ..... ..... ..... 00110 000100 VX I 319 v2.03 vslw Vector Shift Left Word
000100 ..... ..... ..... 00111 000100 VX I 266 v2.03 vsl Vector Shift Left
000100 ..... ..... ..... 01000 000100 VX I 320 v2.03 vsrb Vector Shift Right Byte
000100 ..... ..... ..... 01001 000100 VX I 320 v2.03 vsrh Vector Shift Right Hword
000100 ..... ..... ..... 01010 000100 VX I 320 v2.03 vsrw Vector Shift Right Word
000100 ..... ..... ..... 01011 000100 VX I 266 v2.03 vsr Vector Shift Right
000100 ..... ..... ..... 01100 000100 VX I 321 v2.03 vsrab Vector Shift Right Algebraic Byte
000100 ..... ..... ..... 01101 000100 VX I 321 v2.03 vsrah Vector Shift Right Algebraic Hword
000100 ..... ..... ..... 01110 000100 VX I 321 v2.03 vsraw Vector Shift Right Algebraic Word
000100 ..... ..... ..... 01111 000100 VX I 321 v2.07 vsrad Vector Shift Right Algebraic Dword
000100 ..... ..... ..... 10000 000100 VX I 315 v2.03 vand Vector Logical AND
000100 ..... ..... ..... 10001 000100 VX I 315 v2.03 vandc Vector Logical AND with Complement
000100 ..... ..... ..... 10010 000100 VX I 316 v2.03 vor Vector Logical OR
000100 ..... ..... ..... 10011 000100 VX I 316 v2.03 vxor Vector Logical XOR
000100 ..... ..... ..... 10100 000100 VX I 316 v2.03 vnor Vector Logical NOR
000100 ..... ..... ..... 10101 000100 VX I 316 v2.07 vorc Vector Logical OR with Complement
000100 ..... ..... ..... 10110 000100 VX I 315 v2.07 vnand Vector Logical NAND
000100 ..... ..... ..... 10111 000100 VX I 319 v2.07 vsld Vector Shift Left Dword
000100 ..... ///// ///// 11000 000100 VX I 364 v2.03 mfvscr Move From VSCR
000100 ///// ///// ..... 11001 000100 VX I 364 v2.03 mtvscr Move To VSCR
000100 ..... ..... ..... 11010 000100 VX I 315 v2.07 veqv Vector Logical Equivalence
000100 ..... ..... ..... 11011 000100 VX I 320 v2.07 vsrd Vector Shift Right Dword
000100 ..... ..... ..... 11100 000100 X I 267 v3.0 vsrv Vector Shift Right Variable
000100 ..... ..... ..... 11101 000100 X I 267 v3.0 vslv Vector Shift Left Variable
000100 ..... ..... ..... 00010 000101 VX I 322 v3.0 vrlwmi Vector Rotate Left Word then Mask Insert
000100 ..... ..... ..... 00011 000101 VX I 323 v3.0 vrldmi Vector Rotate Left Dword then Mask Insert
000100 ..... ..... ..... 00110 000101 VX I 322 v3.0 vrlwnm Vector Rotate Left Word then AND with Mask
000100 ..... ..... ..... 00111 000101 VX I 323 v3.0 vrldnm Vector Rotate Left Dword then AND with Mask
000100 ..... ..... ..... .0000 000110 VC I 306 v2.03 vcmpequb[.] Vector Compare Equal Unsigned Byte
000100 ..... ..... ..... .0001 000110 VC I 306 v2.03 vcmpequh[.] Vector Compare Equal Unsigned Hword
000100 ..... ..... ..... .0010 000110 VC I 307 v2.03 vcmpequw[.] Vector Compare Equal Unsigned Word
000100 ..... ..... ..... .0011 000110 VC I 332 v2.03 vcmpeqfp[.] Vector Compare Equal To Floating-Point
000100 ..... ..... ..... .0111 000110 VC I 332 v2.03 vcmpgefp[.] Vector Compare Greater Than or Equal To Floating-Point
000100 ..... ..... ..... .1000 000110 VC I 310 v2.03 vcmpgtub[.] Vector Compare Greater Than Unsigned Byte
000100 ..... ..... ..... .1001 000110 VC I 311 v2.03 vcmpgtuh[.] Vector Compare Greater Than Unsigned Hword
000100 ..... ..... ..... .1010 000110 VC I 311 v2.03 vcmpgtuw[.] Vector Compare Greater Than Unsigned Word
000100 ..... ..... ..... .1011 000110 VC I 333 v2.03 vcmpgtfp[.] Vector Compare Greater Than Floating-Point
000100 ..... ..... ..... .1100 000110 VC I 308 v2.03 vcmpgtsb[.] Vector Compare Greater Than Signed Byte
000100 ..... ..... ..... .1101 000110 VC I 309 v2.03 vcmpgtsh[.] Vector Compare Greater Than Signed Hword
000100 ..... ..... ..... .1110 000110 VC I 309 v2.03 vcmpgtsw[.] Vector Compare Greater Than Signed Word
000100 ..... ..... ..... .1111 000110 VC I 331 v2.03 vcmpbfp[.] Vector Compare Bounds Floating-Point
000100 ..... ..... ..... .0000 000111 VC I 312 v3.0 vcmpneb[.] Vector Compare Not Equal Byte
000100 ..... ..... ..... .0001 000111 VC I 313 v3.0 vcmpneh[.] Vector Compare Not Equal Hword
000100 ..... ..... ..... .0010 000111 VC I 314 v3.0 vcmpnew[.] Vector Compare Not Equal Word
000100 ..... ..... ..... .0011 000111 VC I 307 v2.07 vcmpequd[.] Vector Compare Equal Unsigned Dword
000100 ..... ..... ..... .0100 000111 VC I 312 v3.0 vcmpnezb[.] Vector Compare Not Equal or Zero Byte
000100 ..... ..... ..... .0101 000111 VC I 313 v3.0 vcmpnezh[.] Vector Compare Not Equal or Zero Hword
000100 ..... ..... ..... .0110 000111 VC I 314 v3.0 vcmpnezw[.] Vector Compare Not Equal or Zero Word
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 3 of 17)
Instruction1 (bits 0-5 6-10 11-15 16-20 21-25 26-31) Format Book Page Version2 Privilege3 Mode Dep4 Mnemonic Name
000100 ..... ..... ..... .1011 000111 VC I 310 v2.07 vcmpgtud[.] Vector Compare Greater Than Unsigned Dword
000100 ..... ..... ..... .1111 000111 VC I 308 v2.07 vcmpgtsd[.] Vector Compare Greater Than Signed Dword
000100 ..... ..... ..... 00000 001000 VX I 283 v2.03 vmuloub Vector Multiply Odd Unsigned Byte
000100 ..... ..... ..... 00001 001000 VX I 284 v2.03 vmulouh Vector Multiply Odd Unsigned Hword
000100 ..... ..... ..... 00010 001000 VX I 285 v2.07 vmulouw Vector Multiply Odd Unsigned Word
000100 ..... ..... ..... 00100 001000 VX I 283 v2.03 vmulosb Vector Multiply Odd Signed Byte
000100 ..... ..... ..... 00101 001000 VX I 284 v2.03 vmulosh Vector Multiply Odd Signed Hword
000100 ..... ..... ..... 00110 001000 VX I 285 v2.07 vmulosw Vector Multiply Odd Signed Word
000100 ..... ..... ..... 01000 001000 VX I 283 v2.03 vmuleub Vector Multiply Even Unsigned Byte
000100 ..... ..... ..... 01001 001000 VX I 284 v2.03 vmuleuh Vector Multiply Even Unsigned Hword
000100 ..... ..... ..... 01010 001000 VX I 285 v2.07 vmuleuw Vector Multiply Even Unsigned Word
000100 ..... ..... ..... 01100 001000 VX I 283 v2.03 vmulesb Vector Multiply Even Signed Byte
000100 ..... ..... ..... 01101 001000 VX I 284 v2.03 vmulesh Vector Multiply Even Signed Hword
000100 ..... ..... ..... 01110 001000 VX I 285 v2.07 vmulesw Vector Multiply Even Signed Word
000100 ..... ..... ..... 10000 001000 VX I 339 v2.07 vpmsumb Vector Polynomial Multiply-Sum Byte
000100 ..... ..... ..... 10001 001000 VX I 340 v2.07 vpmsumh Vector Polynomial Multiply-Sum Hword
000100 ..... ..... ..... 10010 001000 VX I 340 v2.07 vpmsumw Vector Polynomial Multiply-Sum Word
000100 ..... ..... ..... 10011 001000 VX I 339 v2.07 vpmsumd Vector Polynomial Multiply-Sum Dword
000100 ..... ..... ..... 10100 001000 VX I 336 v2.07 vcipher Vector AES Cipher
000100 ..... ..... ..... 10101 001000 VX I 337 v2.07 vncipher Vector AES Inverse Cipher
000100 ..... ..... ///// 10111 001000 VX I 337 v2.07 vsbox Vector AES SubBytes
000100 ..... ..... ..... 11000 001000 VX I 294 v2.03 vsum4ubs Vector Sum across Quarter Unsigned Byte Saturate
000100 ..... ..... ..... 11001 001000 VX I 293 v2.03 vsum4shs Vector Sum across Quarter Signed Hword Saturate
000100 ..... ..... ..... 11010 001000 VX I 292 v2.03 vsum2sws Vector Sum across Half Signed Word Saturate
000100 ..... ..... ..... 11100 001000 VX I 293 v2.03 vsum4sbs Vector Sum across Quarter Signed Byte Saturate
000100 ..... ..... ..... 11110 001000 VX I 292 v2.03 vsumsws Vector Sum across Signed Word Saturate
000100 ..... ..... ..... 00010 001001 VX I 286 v2.07 vmuluwm Vector Multiply Unsigned Word Modulo
000100 ..... ..... ..... 10100 001001 VX I 336 v2.07 vcipherlast Vector AES Cipher Last
000100 ..... ..... ..... 10101 001001 VX I 337 v2.07 vncipherlast Vector AES Inverse Cipher Last
000100 ..... ..... ..... 00000 001010 VX I 324 v2.03 vaddfp Vector Add Floating-Point
000100 ..... ..... ..... 00001 001010 VX I 324 v2.03 vsubfp Vector Subtract Floating-Point
000100 ..... ///// ..... 00100 001010 VX I 335 v2.03 vrefp Vector Reciprocal Estimate Floating-Point
000100 ..... ///// ..... 00101 001010 VX I 335 v2.03 vrsqrtefp Vector Reciprocal Square Root Estimate Floating-Point
000100 ..... ///// ..... 00110 001010 VX I 334 v2.03 vexptefp Vector 2 Raised to the Exponent Estimate Floating-Point
000100 ..... ..... ..... 00111 001010 VX I 334 v2.03 vlogefp Vector Log Base 2 Estimate Floating-Point
000100 ..... ///// ..... 01000 001010 VX I 329 v2.03 vrfin Vector Round to Floating-Point Integral Nearest
000100 ..... ///// ..... 01001 001010 VX I 330 v2.03 vrfiz Vector Round to Floating-Point Integral toward Zero
000100 ..... ///// ..... 01010 001010 VX I 329 v2.03 vrfip Vector Round to Floating-Point Integral toward +Infinity
000100 ..... ///// ..... 01011 001010 VX I 329 v2.03 vrfim Vector Round to Floating-Point Integral toward -Infinity
000100 ..... ..... ..... 01100 001010 VX I 328 v2.03 vcfux Vector Convert From Unsigned Word
000100 ..... ..... ..... 01101 001010 VX I 328 v2.03 vcfsx Vector Convert From Signed Word
000100 ..... ..... ..... 01110 001010 VX I 327 v2.03 vctuxs Vector Convert To Unsigned Word Saturate
000100 ..... ..... ..... 01111 001010 VX I 327 v2.03 vctsxs Vector Convert To Signed Word Saturate
000100 ..... ..... ..... 10000 001010 VX I 326 v2.03 vmaxfp Vector Maximum Floating-Point
000100 ..... ..... ..... 10001 001010 VX I 326 v2.03 vminfp Vector Minimum Floating-Point
000100 ..... ..... ..... 00000 001100 VX I 257 v2.03 vmrghb Vector Merge High Byte
000100 ..... ..... ..... 00001 001100 VX I 257 v2.03 vmrghh Vector Merge High Hword
000100 ..... ..... ..... 00010 001100 VX I 258 v2.03 vmrghw Vector Merge High Word
000100 ..... ..... ..... 00100 001100 VX I 257 v2.03 vmrglb Vector Merge Low Byte
000100 ..... ..... ..... 00101 001100 VX I 257 v2.03 vmrglh Vector Merge Low Hword
000100 ..... ..... ..... 00110 001100 VX I 258 v2.03 vmrglw Vector Merge Low Word
000100 ..... /.... ..... 01000 001100 VX I 260 v2.03 vspltb Vector Splat Byte
000100 ..... //... ..... 01001 001100 VX I 260 v2.03 vsplth Vector Splat Hword
000100 ..... ///.. ..... 01010 001100 VX I 260 v2.03 vspltw Vector Splat Word
000100 ..... ..... ///// 01100 001100 VX I 261 v2.03 vspltisb Vector Splat Immediate Signed Byte
000100 ..... ..... ///// 01101 001100 VX I 261 v2.03 vspltish Vector Splat Immediate Signed Hword
000100 ..... ..... ///// 01110 001100 VX I 261 v2.03 vspltisw Vector Splat Immediate Signed Word
000100 ..... ..... ..... 10000 001100 VX I 266 v2.03 vslo Vector Shift Left by Octet
000100 ..... ..... ..... 10001 001100 VX I 266 v2.03 vsro Vector Shift Right by Octet
000100 ..... ///// ..... 10100 001100 VX I 342 v2.07 vgbbd Vector Gather Bits by Byte by Dword
000100 ..... ..... ..... 10101 001100 VX I 349 v2.07 vbpermq Vector Bit Permute Qword
000100 ..... ..... ..... 10111 001100 VX I 349 v3.0 vbpermd Vector Bit Permute Dword
000100 ..... ..... ..... 11010 001100 VX I 259 v2.07 vmrgow Vector Merge Odd Word
000100 ..... ..... ..... 11110 001100 VX I 259 v2.07 vmrgew Vector Merge Even Word
000100 ..... /.... ..... 01000 001101 VX I 269 v3.0 vextractub Vector Extract Unsigned Byte
000100 ..... /.... ..... 01001 001101 VX I 269 v3.0 vextractuh Vector Extract Unsigned Hword
000100 ..... /.... ..... 01010 001101 VX I 269 v3.0 vextractuw Vector Extract Unsigned Word
000100 ..... /.... ..... 01011 001101 VX I 269 v3.0 vextractd Vector Extract Dword
000100 ..... /.... ..... 01100 001101 VX I 270 v3.0 vinsertb Vector Insert Byte
000100 ..... /.... ..... 01101 001101 VX I 270 v3.0 vinserth Vector Insert Hword
000100 ..... /.... ..... 01110 001101 VX I 270 v3.0 vinsertw Vector Insert Word
000100 ..... /.... ..... 01111 001101 VX I 270 v3.0 vinsertd Vector Insert Dword
000100 ..... ..... ..... 11000 001101 VX I 346 v3.0 vextublx Vector Extract Unsigned Byte Left-Indexed
000100 ..... ..... ..... 11001 001101 VX I 346 v3.0 vextuhlx Vector Extract Unsigned Hword Left-Indexed
000100 ..... ..... ..... 11010 001101 VX I 347 v3.0 vextuwlx Vector Extract Unsigned Word Left-Indexed
000100 ..... ..... ..... 11100 001101 VX I 346 v3.0 vextubrx Vector Extract Unsigned Byte Right-Indexed
000100 ..... ..... ..... 11101 001101 VX I 346 v3.0 vextuhrx Vector Extract Unsigned Hword Right-Indexed
000100 ..... ..... ..... 11110 001101 VX I 347 v3.0 vextuwrx Vector Extract Unsigned Word Right-Indexed
000100 ..... ..... ..... 00000 001110 VX I 253 v2.03 vpkuhum Vector Pack Unsigned Hword Unsigned Modulo
000100 ..... ..... ..... 00001 001110 VX I 254 v2.03 vpkuwum Vector Pack Unsigned Word Unsigned Modulo
000100 ..... ..... ..... 00010 001110 VX I 254 v2.03 vpkuhus Vector Pack Unsigned Hword Unsigned Saturate
000100 ..... ..... ..... 00011 001110 VX I 254 v2.03 vpkuwus Vector Pack Unsigned Word Unsigned Saturate
000100 ..... ..... ..... 00100 001110 VX I 252 v2.03 vpkshus Vector Pack Signed Hword Unsigned Saturate
000100 ..... ..... ..... 00101 001110 VX I 253 v2.03 vpkswus Vector Pack Signed Word Unsigned Saturate
000100 ..... ..... ..... 00110 001110 VX I 251 v2.03 vpkshss Vector Pack Signed Hword Signed Saturate
000100 ..... ..... ..... 00111 001110 VX I 252 v2.03 vpkswss Vector Pack Signed Word Signed Saturate
000100 ..... ///// ..... 01000 001110 VX I 256 v2.03 vupkhsb Vector Unpack High Signed Byte
000100 ..... ///// ..... 01001 001110 VX I 256 v2.03 vupkhsh Vector Unpack High Signed Hword
000100 ..... ///// ..... 01010 001110 VX I 256 v2.03 vupklsb Vector Unpack Low Signed Byte
000100 ..... ///// ..... 01011 001110 VX I 256 v2.03 vupklsh Vector Unpack Low Signed Hword
000100 ..... ..... ..... 01100 001110 VX I 250 v2.03 vpkpx Vector Pack Pixel
000100 ..... ///// ..... 01101 001110 VX I 255 v2.03 vupkhpx Vector Unpack High Pixel
000100 ..... ///// ..... 01111 001110 VX I 255 v2.03 vupklpx Vector Unpack Low Pixel
000100 ..... ..... ..... 10001 001110 VX I 253 v2.07 vpkudum Vector Pack Unsigned Dword Unsigned Modulo
000100 ..... ..... ..... 10011 001110 VX I 253 v2.07 vpkudus Vector Pack Unsigned Dword Unsigned Saturate
000100 ..... ..... ..... 10101 001110 VX I 251 v2.07 vpksdus Vector Pack Signed Dword Unsigned Saturate
000100 ..... ..... ..... 10111 001110 VX I 250 v2.07 vpksdss Vector Pack Signed Dword Signed Saturate
000100 ..... ///// ..... 11001 001110 VX I 256 v2.07 vupkhsw Vector Unpack High Signed Word
000100 ..... ///// ..... 11011 001110 VX I 256 v2.07 vupklsw Vector Unpack Low Signed Word
000100 ..... ..... ..... ..... 100000 VA I 287 v2.03 vmhaddshs Vector Multiply-High-Add Signed Hword Saturate
000100 ..... ..... ..... ..... 100001 VA I 287 v2.03 vmhraddshs Vector Multiply-High-Round-Add Signed Hword Saturate
000100 ..... ..... ..... ..... 100010 VA I 288 v2.03 vmladduhm Vector Multiply-Low-Add Unsigned Hword Modulo
000100 ..... ..... ..... ..... 100100 VA I 288 v2.03 vmsumubm Vector Multiply-Sum Unsigned Byte Modulo
000100 ..... ..... ..... ..... 100101 VA I 289 v2.03 vmsummbm Vector Multiply-Sum Mixed Byte Modulo
000100 ..... ..... ..... ..... 100110 VA I 290 v2.03 vmsumuhm Vector Multiply-Sum Unsigned Hword Modulo
000100 ..... ..... ..... ..... 100111 VA I 291 v2.03 vmsumuhs Vector Multiply-Sum Unsigned Hword Saturate
000100 ..... ..... ..... ..... 101000 VA I 289 v2.03 vmsumshm Vector Multiply-Sum Signed Hword Modulo
000100 ..... ..... ..... ..... 101001 VA I 290 v2.03 vmsumshs Vector Multiply-Sum Signed Hword Saturate
000100 ..... ..... ..... ..... 101010 VA I 263 v2.03 vsel Vector Select
000100 ..... ..... ..... ..... 101011 VA I 262 v2.03 vperm Vector Permute
000100 ..... ..... ..... /.... 101100 VA I 265 v2.03 vsldoi Vector Shift Left Double by Octet Immediate
000100 ..... ..... ..... ..... 101101 VA I 341 v2.07 vpermxor Vector Permute & Exclusive-OR
000100 ..... ..... ..... ..... 101110 VA I 325 v2.03 vmaddfp Vector Multiply-Add Floating-Point
000100 ..... ..... ..... ..... 101111 VA I 325 v2.03 vnmsubfp Vector Negative Multiply-Subtract Floating-Point
000100 ..... ..... ..... ..... 110000 VA I 81 v3.0 maddhd Multiply-Add High Dword
000100 ..... ..... ..... ..... 110001 VA I 81 v3.0 maddhdu Multiply-Add High Dword Unsigned
000100 ..... ..... ..... ..... 110011 VA I 81 v3.0 maddld Multiply-Add Low Dword
000100 ..... ..... ..... ..... 111011 VA I 262 v3.0 vpermr Vector Permute Right-indexed
000100 ..... ..... ..... ..... 111100 VA I 275 v2.07 vaddeuqm Vector Add Extended Unsigned Qword Modulo
000100 ..... ..... ..... ..... 111101 VA I 275 v2.07 vaddecuq Vector Add Extended & write Carry Unsigned Qword
000100 ..... ..... ..... ..... 111110 VA I 281 v2.07 vsubeuqm Vector Subtract Extended Unsigned Qword Modulo
000100 ..... ..... ..... ..... 111111 VA I 281 v2.07 vsubecuq Vector Subtract Extended & write Carry Unsigned Qword
000111 ..... ..... ..... ..... ...... D I 74 P1 mulli Multiply Low Immediate
001000 ..... ..... ..... ..... ...... D I 71 P1 SR subfic Subtract From Immediate Carrying
001010 .../. ..... ..... ..... ...... D I 86 P1 cmpli Compare Logical Immediate
001011 .../. ..... ..... ..... ...... D I 85 P1 cmpi Compare Immediate
001100 ..... ..... ..... ..... ...... D I 70 P1 SR addic Add Immediate Carrying
001101 ..... ..... ..... ..... ...... D I 70 P1 SR addic. Add Immediate Carrying & record
001110 ..... ..... ..... ..... ...... D I 68 P1 addi Add Immediate
001111 ..... ..... ..... ..... ...... D I 68 P1 addis Add Immediate Shifted
010000 ..... ..... ..... ..... ...... B I 38 P1 CT bc[l][a] Branch Conditional [& Link] [Absolute]
010001 ///// ///// ////. ..... .///01 SC I 43 v3.0 scv System Call Vectored
010001 ///// ///// ////. ..... .///1/ SC I 43 PPC sc System Call
010010 ..... ..... ..... ..... ...... I I 38 P1 b[l][a] Branch [& Link] [Absolute]
010011 ...// ...// ///// 00000 00000/ XL I 42 P1 mcrf Move CR Field
010011 ..... ..... ..... 00001 00001/ XL I 42 P1 crnor CR NOR
010011 ..... ..... ..... 00100 00001/ XL I 42 P1 crandc CR AND with Complement
010011 ..... ..... ..... 00110 00001/ XL I 41 P1 crxor CR XOR
010011 ..... ..... ..... 00111 00001/ XL I 41 P1 crnand CR NAND
010011 ..... ..... ..... 01000 00001/ XL I 41 P1 crand CR AND
010011 ..... ..... ..... 01001 00001/ XL I 42 P1 creqv CR Equivalent
010011 ..... ..... ..... 01101 00001/ XL I 42 P1 crorc CR OR with Complement
010011 ..... ..... ..... 01110 00001/ XL I 41 P1 cror CR OR
010011 ..... ..... ..... ..... 00010. DX I 69 v3.0 addpcis Add PC Immediate Shifted
010011 ..... ..... ///.. 00000 10000. XL I 39 P1 CT bclr[l] Branch Conditional to LR [& Link]
010011 ..... ..... ///.. 10000 10000. XL I 39 P1 CT bcctr[l] Branch Conditional to CTR [& Link]
010011 ..... ..... ///.. 10001 10000. XL I 40 v2.07 bctar[l] Branch Conditional to BTAR [& Link]
010011 ///// ///// ///// 00000 10010/ XL III 954 PPC P rfid Return from Interrupt Dword
010011 ///// ///// ///// 00010 10010/ XL III 953 v3.0 P rfscv Return From System Call Vectored
010011 ///// ///// ////. 00100 10010/ XL II 909 v2.07 rfebb Return from Event Based Branch
010011 ///// ///// ///// 01000 10010/ XL III 955 v2.02 H hrfid Return From Interrupt Dword Hypervisor
010011 ///// ///// ///// 01011 10010/ XL III 957 v3.0 P stop Stop
010011 ///// ///// ///// 00100 10110/ XL II 867 P1 isync Instruction Synchronize
010100 ..... ..... ..... ..... ...... M I 102 P1 SR rlwimi[.] Rotate Left Word Immediate then Mask Insert
010101 ..... ..... ..... ..... ...... M I 101 P1 SR rlwinm[.] Rotate Left Word Immediate then AND with Mask
010111 ..... ..... ..... ..... ...... M I 102 P1 SR rlwnm[.] Rotate Left Word then AND with Mask
011000 ..... ..... ..... ..... ...... D I 91 P1 ori OR Immediate
011001 ..... ..... ..... ..... ...... D I 92 P1 oris OR Immediate Shifted
011010 ..... ..... ..... ..... ...... D I 92 P1 xori XOR Immediate
011010 00000 00000 00000 00000 000000 D I 92 v2.05 xnop Executed No Operation
011011 ..... ..... ..... ..... ...... D I 92 P1 xoris XOR Immediate Shifted
011100 ..... ..... ..... ..... ...... D I 91 P1 SR andi. AND Immediate & record
011101 ..... ..... ..... ..... ...... D I 91 P1 SR andis. AND Immediate Shifted & record
011110 ..... ..... ..... ..... .000.. MD I 104 PPC SR rldicl[.] Rotate Left Dword Immediate then Clear Left
011110 ..... ..... ..... ..... .001.. MD I 105 PPC SR rldicr[.] Rotate Left Dword Immediate then Clear Right
011110 ..... ..... ..... ..... .010.. MD I 104 PPC SR rldic[.] Rotate Left Dword Immediate then Clear
011110 ..... ..... ..... ..... .011.. MD I 105 PPC SR rldimi[.] Rotate Left Dword Immediate then Mask Insert
011110 ..... ..... ..... ..... .1000. MDS I 103 PPC SR rldcl[.] Rotate Left Dword then Clear Left
011110 ..... ..... ..... ..... .1001. MDS I 103 PPC SR rldcr[.] Rotate Left Dword then Clear Right
011111 .../. ..... ..... 00000 00000/ X I 85 P1 cmp Compare
011111 .../. ..... ..... 00001 00000/ X I 86 P1 cmpl Compare Logical
011111 ..... ...// ///// 00100 00000/ X I 121 v3.0 setb Set Boolean
011111 .../. ..... ..... 00110 00000/ X I 87 v3.0 cmprb Compare Ranged Byte
011111 ...// ..... ..... 00111 00000/ X I 88 v3.0 cmpeqb Compare Equal Byte
011111 ...// ///// ///// 10010 00000/ X I 119 v3.0 mcrxrx Move XER to CR Extended
011111 ..... ..... ..... 00000 00100/ X I 89 P1 tw Trap Word
011111 ..... ..... ..... 00010 00100/ X I 90 PPC td Trap Dword
011111 ..... ..... ..... 00000 00110/ X I 249 v2.03 lvsl Load Vector for Shift Left
011111 ..... ..... ..... 00001 00110/ X I 249 v2.03 lvsr Load Vector for Shift Right
011111 ..... ..... ..... 10010 00110/ X II 864 v3.0 lwat Load Word ATomic
011111 ..... ..... ..... 10011 00110/ X II 864 v3.0 ldat Load Dword ATomic
011111 ..... ..... ..... 10110 00110/ X II 866 v3.0 stwat Store Word ATomic
011111 ..... ..... ..... 10111 00110/ X II 866 v3.0 stdat Store Dword ATomic
011111 ////. ..... ..... 11000 00110/ X II 858 v3.0 copy Copy
011111 ///// ///// ///// 11010 00110/ X II 860 v3.0 cp_abort CP_Abort
011111 ////. ..... ..... 11100 00110. X II 859 v3.0 paste[.] Paste
011111 ..... ..... ..... 00000 00111/ X I 244 v2.03 lvebx Load Vector Element Byte Indexed
011111 ..... ..... ..... 00001 00111/ X I 244 v2.03 lvehx Load Vector Element Hword Indexed
011111 ..... ..... ..... 00010 00111/ X I 245 v2.03 lvewx Load Vector Element Word Indexed
011111 ..... ..... ..... 00011 00111/ X I 245 v2.03 lvx Load Vector Indexed
011111 ..... ..... ..... 00100 00111/ X I 247 v2.03 stvebx Store Vector Element Byte Indexed
011111 ..... ..... ..... 00101 00111/ X I 247 v2.03 stvehx Store Vector Element Hword Indexed
011111 ..... ..... ..... 00110 00111/ X I 248 v2.03 stvewx Store Vector Element Word Indexed
011111 ..... ..... ..... 00111 00111/ X I 248 v2.03 stvx Store Vector Indexed
011111 ..... ..... ..... 01011 00111/ X I 245 v2.03 lvxl Load Vector Indexed Last
011111 ..... ..... ..... 01111 00111/ X I 248 v2.03 stvxl Store Vector Indexed Last
011111 ..... ..... ..... .0000 01000. XO I 71 P1 SR subfc[o][.] Subtract From Carrying
011111 ..... ..... ..... .0001 01000. XO I 70 PPC SR subf[o][.] Subtract From
011111 ..... ..... ///// .0011 01000. XO I 73 P1 SR neg[o][.] Negate
011111 ..... ..... ..... .0100 01000. XO I 72 P1 SR subfe[o][.] Subtract From Extended
011111 ..... ..... ///// .0110 01000. XO I 73 P1 SR subfze[o][.] Subtract From Zero Extended
011111 ..... ..... ///// .0111 01000. XO I 72 P1 SR subfme[o][.] Subtract From Minus One Extended
011111 ..... ..... ..... /0000 01001. XO I 80 PPC SR mulhdu[.] Multiply High Dword Unsigned
011111 ..... ..... ..... /0010 01001. XO I 80 PPC SR mulhd[.] Multiply High Dword
011111 ..... ..... ..... .0111 01001. XO I 80 PPC SR mulld[o][.] Multiply Low Dword
011111 ..... ..... ..... .1100 01001. XO I 83 v2.06 SR divdeu[o][.] Divide Dword Extended Unsigned
011111 ..... ..... ..... .1101 01001. XO I 83 v2.06 SR divde[o][.] Divide Dword Extended
011111 ..... ..... ..... .1110 01001. XO I 82 PPC SR divdu[o][.] Divide Dword Unsigned
011111 ..... ..... ..... .1111 01001. XO I 82 PPC SR divd[o][.] Divide Dword
011111 ..... ..... ..... 01000 01001/ X I 84 v3.0 modud Modulo Unsigned Dword
011111 ..... ..... ..... 11000 01001/ X I 84 v3.0 modsd Modulo Signed Dword
011111 ..... ..... ..... .0000 01010. XO I 71 P1 SR addc[o][.] Add Carrying
011111 ..... ..... ..... /0010 01010/ XO I 110 v2.06 addg6s Add & Generate Sixes
011111 ..... ..... ..... .0100 01010. XO I 72 P1 SR adde[o][.] Add Extended
011111 ..... ..... ///// .0110 01010. XO I 73 P1 SR addze[o][.] Add to Zero Extended
011111 ..... ..... ///// .0111 01010. XO I 72 P1 SR addme[o][.] Add to Minus One Extended
011111 ..... ..... ..... .1000 01010. XO I 70 P1 SR add[o][.] Add
011111 ..... ..... ..... /0000 01011. XO I 74 PPC SR mulhwu[.] Multiply High Word Unsigned
011111 ..... ..... ..... /0010 01011. XO I 74 PPC SR mulhw[.] Multiply High Word
011111 ..... ..... ..... .0111 01011. XO I 74 P1 SR mullw[o][.] Multiply Low Word
011111 ..... ..... ..... .1100 01011. XO I 77 v2.06 SR divweu[o][.] Divide Word Extended Unsigned
011111 ..... ..... ..... .1101 01011. XO I 77 v2.06 SR divwe[o][.] Divide Word Extended
011111 ..... ..... ..... .1110 01011. XO I 75 PPC SR divwu[o][.] Divide Word Unsigned
011111 ..... ..... ..... .1111 01011. XO I 75 PPC SR divw[o][.] Divide Word
011111 ..... ..... ..... 01000 01011/ X I 76 v3.0 moduw Modulo Unsigned Word
011111 ..... ..... ..... 11000 01011/ X I 76 v3.0 modsw Modulo Signed Word
011111 ..... ..... ..... 00000 01100. XX1 I 485 v2.07 lxsiwzx Load VSX Scalar as Integer Word & Zero Indexed
011111 ..... ..... ..... 00010 01100. XX1 I 484 v2.07 lxsiwax Load VSX Scalar as Integer Word Algebraic Indexed
011111 ..... ..... ..... 00100 01100. XX1 I 501 v2.07 stxsiwx Store VSX Scalar as Integer Word Indexed
011111 ..... ..... ..... 01000 01100. XX1 I 493 v3.0 lxvx Load VSX Vector Indexed
011111 ..... ..... ..... 01010 01100. XX1 I 495 v2.06 lxvdsx Load VSX Vector Dword & Splat Indexed
011111 ..... ..... ..... 01011 01100. XX1 I 498 v3.0 lxvwsx Load VSX Vector Word & Splat Indexed
011111 ..... ..... ..... 01100 01100. XX1 I 511 v3.0 stxvx Store VSX Vector Indexed
011111 ..... ..... ..... 10000 01100. XX1 I 486 v2.07 lxsspx Load VSX Scalar SP Indexed
011111 ..... ..... ..... 10010 01100. XX1 I 481 v2.06 lxsdx Load VSX Scalar Dword Indexed
011111 ..... ..... ..... 10100 01100. XX1 I 503 v2.07 stxsspx Store VSX Scalar SP Indexed
011111 ..... ..... ..... 10110 01100. XX1 I 499 v2.06 stxsdx Store VSX Scalar Dword Indexed
011111 ..... ..... ..... 11000 01100. XX1 I 497 v2.06 lxvw4x Load VSX Vector Word*4 Indexed
011111 ..... ..... ..... 11001 01100. XX1 I 496 v3.0 lxvh8x Load VSX Vector Hword*8 Indexed
011111 ..... ..... ..... 11010 01100. XX1 I 489 v2.06 lxvd2x Load VSX Vector Dword*2 Indexed
011111 ..... ..... ..... 11011 01100. XX1 I 488 v3.0 lxvb16x Load VSX Vector Byte*16 Indexed
011111 ..... ..... ..... 11100 01100. XX1 I 507 v2.06 stxvw4x Store VSX Vector Word*4 Indexed
011111 ..... ..... ..... 11101 01100. XX1 I 506 v3.0 stxvh8x Store VSX Vector Hword*8 Indexed
011111 ..... ..... ..... 11110 01100. XX1 I 505 v2.06 stxvd2x Store VSX Vector Dword*2 Indexed
011111 ..... ..... ..... 11111 01100. XX1 I 504 v3.0 stxvb16x Store VSX Vector Byte*16 Indexed
011111 ..... ..... ..... 01000 01101. XX1 I 490 v3.0 lxvl Load VSX Vector with Length
011111 ..... ..... ..... 01001 01101. XX1 I 492 v3.0 lxvll Load VSX Vector Left-justified with Length
011111 ..... ..... ..... 01100 01101. XX1 I 508 v3.0 stxvl Store VSX Vector with Length
011111 ..... ..... ..... 01101 01101. XX1 I 510 v3.0 stxvll Store VSX Vector Left-justified with Length
011111 ..... ..... ..... 11000 01101. XX1 I 483 v3.0 lxsibzx Load VSX Scalar as Integer Byte & Zero Indexed
011111 ..... ..... ..... 11001 01101. XX1 I 483 v3.0 lxsihzx Load VSX Scalar as Integer Hword & Zero Indexed
011111 ..... ..... ..... 11100 01101. XX1 I 500 v3.0 stxsibx Store VSX Scalar as Integer Byte Indexed
011111 ..... ..... ..... 11101 01101. XX1 I 500 v3.0 stxsihx Store VSX Scalar as Integer Hword Indexed
011111 ///// ///// ..... 00100 01110/ X III 1125 v2.07 P msgsndp Message Send Privileged
011111 ///// ///// ..... 00101 01110/ X III 1126 v2.07 P msgclrp Message Clear Privileged
011111 ///// ///// ..... 00110 01110/ X III 1123 v2.07 H msgsnd Message Send
011111 ///// ///// ..... 00111 01110/ X III 1124 v2.07 H msgclr Message Clear
011111 ..... ..... ..... 01001 01110/ X I 44 v2.07 mfbhrbe Move From BHRB
011111 ///// ///// ///// 01101 01110/ X I 44 v2.07 clrbhrb Clear BHRB
011111 .//// ///// ///// 10101 01110/ X II 894 v2.07 tend. Transaction End & record
011111 ...// ///// ///// 10110 01110/ X II 898 v2.07 tcheck Transaction Check & record
011111 ////. ///// ///// 10111 01110/ X II 898 v2.07 tsr. Transaction Suspend or Resume & record
011111 .///. ///// ///// 10100 011101 X II 893 v2.07 tbegin. Transaction Begin & record
011111 ..... ..... ..... 11000 011101 X II 896 v2.07 tabortwc. Transaction Abort Word Conditional & record
011111 ..... ..... ..... 11001 011101 X II 897 v2.07 tabortdc. Transaction Abort Dword Conditional & record
011111 ..... ..... ..... 11010 011101 X II 896 v2.07 tabortwci. Transaction Abort Word Conditional Immediate & record
011111 ..... ..... ..... 11011 011101 X II 897 v2.07 tabortdci. Transaction Abort Dword Conditional Immediate & record
011111 ///// ..... ///// 11100 011101 X II 895 v2.07 tabort. Transaction Abort & record
011111 ///// ..... ///// 11101 011101 X III 969 v2.07 P treclaim. Transaction Reclaim & record
011111 ///// ///// ///// 11111 011101 X III 970 v2.07 P trechkpt. Transaction Recheckpoint & record
011111 ..... ..... ..... ..... 01111/ A I 90 v2.03 isel Integer Select
011111 ..... 0.... ..../ 00100 10000/ XFX I 120 P1 mtcrf Move To CR Fields
011111 ..... 1.... ..../ 00100 10000/ XFX I 120 v2.01 mtocrf Move To One CR Field
011111 ..... ////. ///// 00100 10010/ X III 977 P1 P mtmsr Move To MSR
011111 ..... ////. ///// 00101 10010/ X III 978 PPC P mtmsrd Move To MSR Dword
011111 ///// ///// ..... 01000 10010/ X III 1038 v2.03 P 64 tlbiel TLB Invalidate Entry Local
011111 ////. ///// ..... 01001 10010/ X III 1034 P1 H 64 tlbie TLB Invalidate Entry
011111 ///// ///// ///// 01010 10010/ X III 1031 v3.0 P slbsync SLB Synchronize
011111 ..... ///// ..... 01100 10010/ X III 1029 v2.00 P slbmte SLB Move To Entry
011111 ///// ///// ..... 01101 10010/ X III 1024 PPC P slbie SLB Invalidate Entry
011111 ..... ///// ..... 01110 10010/ X III 1025 v3.0 P slbieg SLB Invalidate Entry Global
011111 //... ///// ///// 01111 10010/ X III 1027 PPC P slbia SLB Invalidate All
011111 ..... 0//// ///// 00000 10011/ XFX I 121 P1 mfcr Move From CR
011111 ..... 1.... ..../ 00000 10011/ XFX I 121 v2.01 mfocrf Move From One CR Field
011111 ..... ..... ///// 00001 10011. XX1 I 111 v2.07 mfvsrd Move From VSR Dword
011111 ..... ///// ///// 00010 10011/ X III 979 P1 P mfmsr Move From MSR
011111 ..... ..... ///// 00011 10011. XX1 I 112 v2.07 mfvsrwz Move From VSR Word & Zero
011111 ..... ..... ///// 00101 10011. XX1 I 113 v2.07 mtvsrd Move To VSR Dword
011111 ..... ..... ///// 00110 10011. XX1 I 113 v2.07 mtvsrwa Move To VSR Word Algebraic
011111 ..... ..... ///// 00111 10011. XX1 I 114 v2.07 mtvsrwz Move To VSR Word & Zero
011111 ..... ..... ///// 01001 10011. XX1 I 111 v3.0 mfvsrld Move From VSR Lower Dword
011111 ..... ..... ..... 01010 10011/ X I 118, III 975 P1 O mfspr Move From SPR
011111 ..... ..... ..... 01011 10011/ X II 902 PPC mftb Move From Time Base
011111 ..... ..... ///// 01100 10011. XX1 I 115 v3.0 mtvsrws Move To VSR Word & Splat
011111 ..... ..... ..... 01101 10011. XX1 I 114 v3.0 mtvsrdd Move To VSR Double Dword
011111 ..... ..... ..... 01110 10011/ X I 116, III 974 P1 O mtspr Move To SPR
011111 ..... ///.. ///// 10111 10011/ X I 79 v3.0 darn Deliver A Random Number
011111 ..... ///// ..... 11010 10011/ X III 1030 v2.00 P slbmfev SLB Move From Entry VSID
011111 ..... ///// ..... 11100 10011/ X III 1030 v2.00 P slbmfee SLB Move From Entry ESID
011111 ..... ///// ..... 11110 100111 X III 1031 v2.05 P SR slbfee. SLB Find Entry ESID & record
011111 ..... ..... ..... 00000 10100/ X II 869 PPC lwarx Load Word & Reserve Indexed
011111 ..... ..... ..... 00001 10100. X II 868 v2.06 lbarx Load Byte And Reserve Indexed
011111 ..... ..... ..... 00010 10100/ X II 873 PPC ldarx Load Dword And Reserve Indexed
011111 ..... ..... ..... 00011 10100. X II 869 v2.06 lharx Load Hword And Reserve Indexed
011111 ..... ..... ..... 01000 10100. X II 875 v2.07 lqarx Load Qword And Reserve Indexed
011111 ..... ..... ..... 10000 10100/ X I 62 v2.06 ldbrx Load Dword Byte-Reverse Indexed
011111 ..... ..... ..... 10100 10100/ X I 62 v2.06 stdbrx Store Dword Byte-Reverse Indexed
011111 ..... ..... ..... 00000 10101/ X I 53 PPC ldx Load Dword Indexed
011111 ..... ..... ..... 00001 10101/ X I 53 PPC ldux Load Dword with Update Indexed
011111 ..... ..... ..... 00100 10101/ X I 58 PPC stdx Store Dword Indexed
011111 ..... ..... ..... 00101 10101/ X I 58 PPC stdux Store Dword with Update Indexed
011111 ..... ..... ..... 01001 10101/ X I 54 v3.0 PI ldmx Load Dword Monitored Indexed
011111 ..... ..... ..... 01010 10101/ X I 52 PPC lwax Load Word Algebraic Indexed
011111 ..... ..... ..... 01011 10101/ X I 52 PPC lwaux Load Word Algebraic with Update Indexed
011111 ..... ..... ..... 10000 10101/ X I 65 P1 lswx Load String Word Indexed
011111 ..... ..... ..... 10010 10101/ X I 65 P1 lswi Load String Word Immediate
011111 ..... ..... ..... 10100 10101/ X I 66 P1 stswx Store String Word Indexed
011111 ..... ..... ..... 10110 10101/ X I 66 P1 stswi Store String Word Immediate
011111 ..... ..... ..... 11000 10101/ X III 966 v2.05 H lwzcix Load Word & Zero Caching Inhibited Indexed
011111 ..... ..... ..... 11001 10101/ X III 966 v2.05 H lhzcix Load Hword & Zero Caching Inhibited Indexed
011111 ..... ..... ..... 11010 10101/ X III 966 v2.05 H lbzcix Load Byte & Zero Caching Inhibited Indexed
011111 ..... ..... ..... 11011 10101/ X III 966 v2.05 H ldcix Load Dword Caching Inhibited Indexed
011111 ..... ..... ..... 11100 10101/ X III 967 v2.05 H stwcix Store Word Caching Inhibited Indexed
011111 ..... ..... ..... 11101 10101/ X III 967 v2.05 H sthcix Store Hword Caching Inhibited Indexed
011111 ..... ..... ..... 11110 10101/ X III 967 v2.05 H stbcix Store Byte Caching Inhibited Indexed
011111 ..... ..... ..... 11111 10101/ X III 967 v2.05 H stdcix Store Dword Caching Inhibited Indexed
011111 /.... ..... ..... 00000 10110/ X II 842 v2.07 icbt Instruction Cache Block Touch
011111 ///// ..... ..... 00001 10110/ X II 853 PPC dcbst Data Cache Block Store
011111 ///.. ..... ..... 00010 10110/ X II 854 PPC dcbf Data Cache Block Flush
011111 ..... ..... ..... 00111 10110/ X II 852 PPC dcbtst Data Cache Block Touch for Store
011111 ..... ..... ..... 01000 10110/ X II 851 PPC dcbt Data Cache Block Touch
011111 ..... ..... ..... 10000 10110/ X I 61 P1 lwbrx Load Word Byte-Reverse Indexed
011111 ///// ///// ///// 10001 10110/ X III 1042 PPC H tlbsync TLB Synchronize
011111 ///.. ///// ///// 10010 10110/ X II 877 P1 sync Synchronize
011111 ..... ..... ..... 10100 10110/ X I 61 P1 stwbrx Store Word Byte-Reverse Indexed
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 9 of 17)
Columns: Instruction1 (bit positions 0-5, 6-10, 11-15, 16-20, 21-25, 26-31), Format, Book, Page, Version2, Privilege3, Mode Dep4, Mnemonic, Name
011111 ..... ..... ..... 11000 10110/ X I 61 P1 lhbrx Load Hword Byte-Reverse Indexed
011111 ///// ///// ///// 11010 10110/ X II 879 PPC eieio Enforce In-order Execution of I/O
011111 ///// ///// ///// 11011 10110/ X III 1126 v3.0 H msgsync Message Synchronize
011111 ..... ..... ..... 11100 10110/ X I 61 P1 sthbrx Store Hword Byte-Reverse Indexed
011111 ///// ..... ..... 11110 10110/ X II 842 PPC icbi Instruction Cache Block Invalidate
011111 ///// ..... ..... 11111 10110/ X II 853 P1 dcbz Data Cache Block Zero
011111 ..... ..... ..... 00100 101101 X II 872 PPC stwcx. Store Word Conditional Indexed & record
011111 ..... ..... ..... 00101 101101 X II 876 v2.07 stqcx. Store Qword Conditional Indexed & record
011111 ..... ..... ..... 00110 101101 X II 873 PPC stdcx. Store Dword Conditional Indexed & record
011111 ..... ..... ..... 10101 101101 X II 870 v2.06 stbcx. Store Byte Conditional Indexed & record
011111 ..... ..... ..... 10110 101101 X II 871 v2.06 sthcx. Store Hword Conditional Indexed & record
011111 ..... ..... ..... 00000 10111/ X I 51 P1 lwzx Load Word & Zero Indexed
011111 ..... ..... ..... 00001 10111/ X I 51 P1 lwzux Load Word & Zero with Update Indexed
011111 ..... ..... ..... 00010 10111/ X I 48 P1 lbzx Load Byte & Zero Indexed
011111 ..... ..... ..... 00011 10111/ X I 48 P1 lbzux Load Byte & Zero with Update Indexed
011111 ..... ..... ..... 00100 10111/ X I 57 P1 stwx Store Word Indexed
011111 ..... ..... ..... 00101 10111/ X I 57 P1 stwux Store Word with Update Indexed
011111 ..... ..... ..... 00110 10111/ X I 55 P1 stbx Store Byte Indexed
011111 ..... ..... ..... 00111 10111/ X I 55 P1 stbux Store Byte with Update Indexed
011111 ..... ..... ..... 01000 10111/ X I 49 P1 lhzx Load Hword & Zero Indexed
011111 ..... ..... ..... 01001 10111/ X I 49 P1 lhzux Load Hword & Zero with Update Indexed
011111 ..... ..... ..... 01010 10111/ X I 50 P1 lhax Load Hword Algebraic Indexed
011111 ..... ..... ..... 01011 10111/ X I 50 P1 lhaux Load Hword Algebraic with Update Indexed
011111 ..... ..... ..... 01100 10111/ X I 56 P1 sthx Store Hword Indexed
011111 ..... ..... ..... 01101 10111/ X I 56 P1 sthux Store Hword with Update Indexed
011111 ..... ..... ..... 10000 10111/ X I 142 P1 lfsx Load Floating Single Indexed
011111 ..... ..... ..... 10001 10111/ X I 142 P1 lfsux Load Floating Single with Update Indexed
011111 ..... ..... ..... 10010 10111/ X I 143 P1 lfdx Load Floating Double Indexed
011111 ..... ..... ..... 10011 10111/ X I 143 P1 lfdux Load Floating Double with Update Indexed
011111 ..... ..... ..... 10100 10111/ X I 146 P1 stfsx Store Floating Single Indexed
011111 ..... ..... ..... 10101 10111/ X I 146 P1 stfsux Store Floating Single with Update Indexed
011111 ..... ..... ..... 10110 10111/ X I 147 P1 stfdx Store Floating Double Indexed
011111 ..... ..... ..... 10111 10111/ X I 147 P1 stfdux Store Floating Double with Update Indexed
011111 ..... ..... ..... 11000 10111/ X I 150 v2.05 lfdpx Load Floating Double Pair Indexed
011111 ..... ..... ..... 11010 10111/ X I 144 v2.05 lfiwax Load Floating as Integer Word Algebraic Indexed
011111 ..... ..... ..... 11011 10111/ X I 144 v2.06 lfiwzx Load Floating as Integer Word & Zero Indexed
011111 ..... ..... ..... 11100 10111/ X I 150 v2.05 stfdpx Store Floating Double Pair Indexed
011111 ..... ..... ..... 11110 10111/ X I 148 PPC stfiwx Store Floating as Integer Word Indexed
011111 ..... ..... ..... 00000 11000. X I 106 P1 SR slw[.] Shift Left Word
011111 ..... ..... ..... 10000 11000. X I 106 P1 SR srw[.] Shift Right Word
011111 ..... ..... ..... 11000 11000. X I 107 P1 SR sraw[.] Shift Right Algebraic Word
011111 ..... ..... ..... 11001 11000. X I 107 P1 SR srawi[.] Shift Right Algebraic Word Immediate
011111 ..... ..... ..... 11001 1101.. XS I 109 PPC SR sradi[.] Shift Right Algebraic Dword Immediate
011111 ..... ..... ..... 11011 1101.. XS I 109 v3.0 extswsli[.] Extend Sign Word & Shift Left Immediate
011111 ..... ..... ///// 00000 11010. X I 95 P1 SR cntlzw[.] Count Leading Zeros Word
011111 ..... ..... ///// 00001 11010. X I 98 PPC SR cntlzd[.] Count Leading Zeros Dword
011111 ..... ..... ///// 00011 11010/ X I 96 v2.02 popcntb Population Count Byte
011111 ..... ..... ///// 00100 11010/ X I 97 v2.05 prtyw Parity Word
011111 ..... ..... ///// 00101 11010/ X I 97 v2.05 prtyd Parity Dword
011111 ..... ..... ///// 01000 11010/ X I 110 v2.06 cdtbcd Convert Declets To Binary Coded Decimal
011111 ..... ..... ///// 01001 11010/ X I 110 v2.06 cbcdtd Convert Binary Coded Decimal To Declets
011111 ..... ..... ///// 01011 11010/ X I 96 v2.06 popcntw Population Count Words
011111 ..... ..... ///// 01111 11010/ X I 98 v2.06 popcntd Population Count Dword
011111 ..... ..... ///// 10000 11010. X I 95 v3.0 cnttzw[.] Count Trailing Zeros Word
011111 ..... ..... ///// 10001 11010. X I 98 v3.0 cnttzd[.] Count Trailing Zeros Dword
011111 ..... ..... ..... 11000 11010. X I 109 PPC SR srad[.] Shift Right Algebraic Dword
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 10 of 17)
011111 ..... ..... ///// 11100 11010. X I 94 P1 SR extsh[.] Extend Sign Hword
011111 ..... ..... ///// 11101 11010. X I 94 PPC SR extsb[.] Extend Sign Byte
011111 ..... ..... ///// 11110 11010. X I 98 PPC SR extsw[.] Extend Sign Word
011111 ..... ..... ..... 00000 11011. X I 108 PPC SR sld[.] Shift Left Dword
011111 ..... ..... ..... 10000 11011. X I 108 PPC SR srd[.] Shift Right Dword
011111 ..... ..... ..... 00000 11100. X I 93 P1 SR and[.] AND
011111 ..... ..... ..... 00001 11100. X I 94 P1 SR andc[.] AND with Complement
011111 ..... ..... ..... 00011 11100. X I 94 P1 SR nor[.] NOR
011111 ..... ..... ..... 00111 11100/ X I 99 v2.06 bpermd Bit Permute Dword
011111 ..... ..... ..... 01000 11100. X I 94 P1 SR eqv[.] Equivalent
011111 ..... ..... ..... 01001 11100. X I 93 P1 SR xor[.] XOR
011111 ..... ..... ..... 01100 11100. X I 94 P1 SR orc[.] OR with Complement
011111 ..... ..... ..... 01101 11100. X I 93 P1 SR or[.] OR
011111 ..... ..... ..... 01110 11100. X I 93 P1 SR nand[.] NAND
011111 ..... ..... ..... 01111 11100/ X I 96 v2.05 cmpb Compare Bytes
011111 ///.. ///// ///// 00000 11110/ X II 880 v3.0 wait Wait for Interrupt
100000 ..... ..... ..... ..... ...... D I 51 P1 lwz Load Word & Zero
100001 ..... ..... ..... ..... ...... D I 51 P1 lwzu Load Word & Zero with Update
100010 ..... ..... ..... ..... ...... D I 48 P1 lbz Load Byte & Zero
100011 ..... ..... ..... ..... ...... D I 48 P1 lbzu Load Byte & Zero with Update
100100 ..... ..... ..... ..... ...... D I 57 P1 stw Store Word
100101 ..... ..... ..... ..... ...... D I 57 P1 stwu Store Word with Update
100110 ..... ..... ..... ..... ...... D I 55 P1 stb Store Byte
100111 ..... ..... ..... ..... ...... D I 55 P1 stbu Store Byte with Update
101000 ..... ..... ..... ..... ...... D I 49 P1 lhz Load Hword & Zero
101001 ..... ..... ..... ..... ...... D I 49 P1 lhzu Load Hword & Zero with Update
101010 ..... ..... ..... ..... ...... D I 50 P1 lha Load Hword Algebraic
101011 ..... ..... ..... ..... ...... D I 50 P1 lhau Load Hword Algebraic with Update
101100 ..... ..... ..... ..... ...... D I 56 P1 sth Store Hword
101101 ..... ..... ..... ..... ...... D I 56 P1 sthu Store Hword with Update
101110 ..... ..... ..... ..... ...... D I 63 P1 lmw Load Multiple Word
101111 ..... ..... ..... ..... ...... D I 63 P1 stmw Store Multiple Word
110000 ..... ..... ..... ..... ...... D I 142 P1 lfs Load Floating Single
110001 ..... ..... ..... ..... ...... D I 142 P1 lfsu Load Floating Single with Update
110010 ..... ..... ..... ..... ...... D I 143 P1 lfd Load Floating Double
110011 ..... ..... ..... ..... ...... D I 143 P1 lfdu Load Floating Double with Update
110100 ..... ..... ..... ..... ...... D I 146 P1 stfs Store Floating Single
110101 ..... ..... ..... ..... ...... D I 146 P1 stfsu Store Floating Single with Update
110110 ..... ..... ..... ..... ...... D I 147 P1 stfd Store Floating Double
110111 ..... ..... ..... ..... ...... D I 147 P1 stfdu Store Floating Double with Update
111000 ..... ..... ..... ..... ...... DQ I 59 v2.03 lq Load Qword
111001 ..... ..... ..... ..... ....00 DS I 150 v2.05 lfdp Load Floating Double Pair
111001 ..... ..... ..... ..... ....10 DS I 481 v3.0 lxsd Load VSX Scalar Dword
111001 ..... ..... ..... ..... ....11 DS I 486 v3.0 lxssp Load VSX Scalar Single
111010 ..... ..... ..... ..... ....00 DS I 53 PPC ld Load Dword
111010 ..... ..... ..... ..... ....01 DS I 53 PPC ldu Load Dword with Update
111010 ..... ..... ..... ..... ....10 DS I 52 PPC lwa Load Word Algebraic
111011 ..... ..... ..... .0010 00010. Z22 I 222 v2.05 dscli[.] DFP Shift Significand Left Immediate
111011 ..... ..... ..... .0011 00010. Z22 I 222 v2.05 dscri[.] DFP Shift Significand Right Immediate
111011 ...// ..... ..... .0110 00010/ Z22 I 202 v2.05 dtstdc DFP Test Data Class
111011 ...// ..... ..... .0111 00010/ Z22 I 202 v2.05 dtstdg DFP Test Data Group
111011 ..... ..... ..... 00000 00010. X I 195 v2.05 dadd[.] DFP Add
111011 ..... ..... ..... 00001 00010. X I 197 v2.05 dmul[.] DFP Multiply
111011 ...// ..... ..... 00100 00010/ X I 201 v2.05 dcmpo DFP Compare Ordered
111011 ...// ..... ..... 00101 00010/ X I 203 v2.05 dtstex DFP Test Exponent
111011 ..... ///// ..... 01000 00010. X I 215 v2.05 dctdp[.] DFP Convert To DFP Long
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 11 of 17)
111011 ..... ///// ..... 01001 00010. X I 217 v2.05 dctfix[.] DFP Convert To Fixed
111011 ..... ../// ..... 01010 00010. X I 219 v2.05 ddedpd[.] DFP Decode DPD To BCD
111011 ..... ///// ..... 01011 00010. X I 220 v2.05 dxex[.] DFP Extract Exponent
111011 ..... ..... ..... 10000 00010. X I 195 v2.05 dsub[.] DFP Subtract
111011 ..... ..... ..... 10001 00010. X I 198 v2.05 ddiv[.] DFP Divide
111011 ...// ..... ..... 10100 00010/ X I 200 v2.05 dcmpu DFP Compare Unordered
111011 ...// ..... ..... 10101 00010/ X I 204 v2.05 dtstsf DFP Test Significance
111011 ..... ///// ..... 11000 00010. X I 216 v2.05 drsp[.] DFP Round To DFP Short
111011 ..... ///// ..... 11001 00010. X I 217 v2.06 dcffix[.] DFP Convert From Fixed
111011 ..... .//// ..... 11010 00010. X I 219 v2.05 denbcd[.] DFP Encode BCD To DPD
111011 ..... ..... ..... 11011 00010. X I 220 v2.05 diex[.] DFP Insert Exponent
111011 ..... ..... ..... ..000 00011. Z23 I 206 v2.05 dqua[.] DFP Quantize
111011 ..... ..... ..... ..001 00011. Z23 I 208 v2.05 drrnd[.] DFP Reround
111011 ..... ..... ..... ..010 00011. Z23 I 205 v2.05 dquai[.] DFP Quantize Immediate
111011 ..... ////. ..... ..011 00011. Z23 I 211 v2.05 drintx[.] DFP Round To FP Integer With Inexact
111011 ..... ////. ..... ..111 00011. Z23 I 213 v2.05 drintn[.] DFP Round To FP Integer Without Inexact
111011 ...// ..... ..... 10101 00011/ X I 204 v3.0 dtstsfi DFP Test Significance Immediate
111011 ..... ///// ..... 11010 01110. X I 165 v2.06 fcfids[.] Floating Convert From Integer Dword Single
111011 ..... ///// ..... 11110 01110. X I 166 v2.06 fcfidus[.] Floating Convert From Integer Dword Unsigned Single
111011 ..... ..... ..... ///// 10010. A I 154 PPC fdivs[.] Floating Divide Single
111011 ..... ..... ..... ///// 10100. A I 153 PPC fsubs[.] Floating Subtract Single
111011 ..... ..... ..... ///// 10101. A I 153 PPC fadds[.] Floating Add Single
111011 ..... ///// ..... ///// 10110. A I 155 PPC fsqrts[.] Floating Square Root Single
111011 ..... ///// ..... ///// 11000. A I 155 PPC fres[.] Floating Reciprocal Estimate Single
111011 ..... ..... ///// ..... 11001. A I 154 PPC fmuls[.] Floating Multiply Single
111011 ..... ///// ..... ///// 11010. A I 156 v2.02 frsqrtes[.] Floating Reciprocal Square Root Estimate Single
111011 ..... ..... ..... ..... 11100. A I 159 PPC fmsubs[.] Floating Multiply-Subtract Single
111011 ..... ..... ..... ..... 11101. A I 158 PPC fmadds[.] Floating Multiply-Add Single
111011 ..... ..... ..... ..... 11110. A I 159 PPC fnmsubs[.] Floating Negative Multiply-Subtract Single
111011 ..... ..... ..... ..... 11111. A I 159 PPC fnmadds[.] Floating Negative Multiply-Add Single
111100 ..... ..... ..... 00000 000... XX3 I 519 v2.07 xsaddsp VSX Scalar Add SP
111100 ..... ..... ..... 00001 000... XX3 I 651 v2.07 xssubsp VSX Scalar Subtract SP
111100 ..... ..... ..... 00010 000... XX3 I 606 v2.07 xsmulsp VSX Scalar Multiply SP
111100 ..... ..... ..... 00011 000... XX3 I 568 v2.07 xsdivsp VSX Scalar Divide SP
111100 ..... ..... ..... 00100 000... XX3 I 514 v2.06 xsadddp VSX Scalar Add DP
111100 ..... ..... ..... 00101 000... XX3 I 647 v2.06 xssubdp VSX Scalar Subtract DP
111100 ..... ..... ..... 00110 000... XX3 I 602 v2.06 xsmuldp VSX Scalar Multiply DP
111100 ..... ..... ..... 00111 000... XX3 I 564 v2.06 xsdivdp VSX Scalar Divide DP
111100 ..... ..... ..... 01000 000... XX3 I 665 v2.06 xvaddsp VSX Vector Add SP
111100 ..... ..... ..... 01001 000... XX3 I 759 v2.06 xvsubsp VSX Vector Subtract SP
111100 ..... ..... ..... 01010 000... XX3 I 727 v2.06 xvmulsp VSX Vector Multiply SP
111100 ..... ..... ..... 01011 000... XX3 I 702 v2.06 xvdivsp VSX Vector Divide SP
111100 ..... ..... ..... 01100 000... XX3 I 661 v2.06 xvadddp VSX Vector Add DP
111100 ..... ..... ..... 01101 000... XX3 I 757 v2.06 xvsubdp VSX Vector Subtract DP
111100 ..... ..... ..... 01110 000... XX3 I 725 v2.06 xvmuldp VSX Vector Multiply DP
111100 ..... ..... ..... 01111 000... XX3 I 700 v2.06 xvdivdp VSX Vector Divide DP
111100 ..... ..... ..... 10000 000... XX3 I 583 v3.0 xsmaxcdp VSX Scalar Maximum Type-C Double-Precision
111100 ..... ..... ..... 10001 000... XX3 I 589 v3.0 xsmincdp VSX Scalar Minimum Type-C Double-Precision
111100 ..... ..... ..... 10010 000... XX3 I 585 v3.0 xsmaxjdp VSX Scalar Maximum Type-J Double-Precision
111100 ..... ..... ..... 10011 000... XX3 I 591 v3.0 xsminjdp VSX Scalar Minimum Type-J Double-Precision
111100 ..... ..... ..... 10100 000... XX3 I 581 v2.06 xsmaxdp VSX Scalar Maximum DP
111100 ..... ..... ..... 10101 000... XX3 I 587 v2.06 xsmindp VSX Scalar Minimum DP
111100 ..... ..... ..... 10110 000... XX3 I 535 v2.06 xscpsgndp VSX Scalar Copy Sign DP
111100 ..... ..... ..... 11000 000... XX3 I 713 v2.06 xvmaxsp VSX Vector Maximum SP
111100 ..... ..... ..... 11001 000... XX3 I 717 v2.06 xvminsp VSX Vector Minimum SP
111100 ..... ..... ..... 11010 000... XX3 I 675 v2.06 xvcpsgnsp VSX Vector Copy Sign SP
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 12 of 17)
111100 ..... ..... ..... 11011 000... XX3 I 704 v3.0 xviexpsp VSX Vector Insert Exponent SP
111100 ..... ..... ..... 11100 000... XX3 I 711 v2.06 xvmaxdp VSX Vector Maximum DP
111100 ..... ..... ..... 11101 000... XX3 I 715 v2.06 xvmindp VSX Vector Minimum DP
111100 ..... ..... ..... 11110 000... XX3 I 675 v2.06 xvcpsgndp VSX Vector Copy Sign DP
111100 ..... ..... ..... 11111 000... XX3 I 704 v3.0 xviexpdp VSX Vector Insert Exponent DP
111100 ..... ..... ..... 00000 001... XX3 I 575 v2.07 xsmaddasp VSX Scalar Multiply-Add Type-A SP
111100 ..... ..... ..... 00001 001... XX3 I 575 v2.07 xsmaddmsp VSX Scalar Multiply-Add Type-M SP
111100 ..... ..... ..... 00010 001... XX3 I 596 v2.07 xsmsubasp VSX Scalar Multiply-Subtract Type-A SP
111100 ..... ..... ..... 00011 001... XX3 I 596 v2.07 xsmsubmsp VSX Scalar Multiply-Subtract Type-M SP
111100 ..... ..... ..... 00100 001... XX3 I 572 v2.06 xsmaddadp VSX Scalar Multiply-Add Type-A DP
111100 ..... ..... ..... 00101 001... XX3 I 572 v2.06 xsmaddmdp VSX Scalar Multiply-Add Type-M DP
111100 ..... ..... ..... 00110 001... XX3 I 593 v2.06 xsmsubadp VSX Scalar Multiply-Subtract Type-A DP
111100 ..... ..... ..... 00111 001... XX3 I 593 v2.06 xsmsubmdp VSX Scalar Multiply-Subtract Type-M DP
111100 ..... ..... ..... 01000 001... XX3 I 708 v2.06 xvmaddasp VSX Vector Multiply-Add Type-A SP
111100 ..... ..... ..... 01001 001... XX3 I 708 v2.06 xvmaddmsp VSX Vector Multiply-Add Type-M SP
111100 ..... ..... ..... 01010 001... XX3 I 722 v2.06 xvmsubasp VSX Vector Multiply-Subtract Type-A SP
111100 ..... ..... ..... 01011 001... XX3 I 722 v2.06 xvmsubmsp VSX Vector Multiply-Subtract Type-M SP
111100 ..... ..... ..... 01100 001... XX3 I 705 v2.06 xvmaddadp VSX Vector Multiply-Add Type-A DP
111100 ..... ..... ..... 01101 001... XX3 I 705 v2.06 xvmaddmdp VSX Vector Multiply-Add Type-M DP
111100 ..... ..... ..... 01110 001... XX3 I 719 v2.06 xvmsubadp VSX Vector Multiply-Subtract Type-A DP
111100 ..... ..... ..... 01111 001... XX3 I 719 v2.06 xvmsubmdp VSX Vector Multiply-Subtract Type-M DP
111100 ..... ..... ..... 10000 001... XX3 I 615 v2.07 xsnmaddasp VSX Scalar Negative Multiply-Add Type-A SP
111100 ..... ..... ..... 10001 001... XX3 I 615 v2.07 xsnmaddmsp VSX Scalar Negative Multiply-Add Type-M SP
111100 ..... ..... ..... 10010 001... XX3 I 624 v2.07 xsnmsubasp VSX Scalar Negative Multiply-Subtract Type-A SP
111100 ..... ..... ..... 10011 001... XX3 I 624 v2.07 xsnmsubmsp VSX Scalar Negative Multiply-Subtract Type-M SP
111100 ..... ..... ..... 10100 001... XX3 I 610 v2.06 xsnmaddadp VSX Scalar Negative Multiply-Add Type-A DP
111100 ..... ..... ..... 10101 001... XX3 I 610 v2.06 xsnmaddmdp VSX Scalar Negative Multiply-Add Type-M DP
111100 ..... ..... ..... 10110 001... XX3 I 621 v2.06 xsnmsubadp VSX Scalar Negative Multiply-Subtract Type-A DP
111100 ..... ..... ..... 10111 001... XX3 I 621 v2.06 xsnmsubmdp VSX Scalar Negative Multiply-Subtract Type-M DP
111100 ..... ..... ..... 11000 001... XX3 I 736 v2.06 xvnmaddasp VSX Vector Negative Multiply-Add Type-A SP
111100 ..... ..... ..... 11001 001... XX3 I 736 v2.06 xvnmaddmsp VSX Vector Negative Multiply-Add Type-M SP
111100 ..... ..... ..... 11010 001... XX3 I 742 v2.06 xvnmsubasp VSX Vector Negative Multiply-Subtract Type-A SP
111100 ..... ..... ..... 11011 001... XX3 I 742 v2.06 xvnmsubmsp VSX Vector Negative Multiply-Subtract Type-M SP
111100 ..... ..... ..... 11100 001... XX3 I 731 v2.06 xvnmaddadp VSX Vector Negative Multiply-Add Type-A DP
111100 ..... ..... ..... 11101 001... XX3 I 731 v2.06 xvnmaddmdp VSX Vector Negative Multiply-Add Type-M DP
111100 ..... ..... ..... 11110 001... XX3 I 739 v2.06 xvnmsubadp VSX Vector Negative Multiply-Subtract Type-A DP
111100 ..... ..... ..... 11111 001... XX3 I 739 v2.06 xvnmsubmdp VSX Vector Negative Multiply-Subtract Type-M DP
111100 ..... ..... ..... 0..00 010... XX3 I 778 v2.06 xxsldwi VSX Vector Shift Left Double by Word Immediate
111100 ..... ..... ..... 0..01 010... XX3 I 777 v2.06 xxpermdi VSX Vector Dword Permute Immediate
111100 ..... ..... ..... 00010 010... XX3 I 775 v2.06 xxmrghw VSX Vector Merge Word High
111100 ..... ..... ..... 00011 010... XX3 I 776 v3.0 xxperm VSX Vector Permute
111100 ..... ..... ..... 00110 010... XX3 I 775 v2.06 xxmrglw VSX Vector Merge Word Low
111100 ..... ..... ..... 00111 010... XX3 I 776 v3.0 xxpermr VSX Vector Permute Right-indexed
111100 ..... ..... ..... 10000 010... XX3 I 771 v2.06 xxland VSX Vector Logical AND
111100 ..... ..... ..... 10001 010... XX3 I 771 v2.06 xxlandc VSX Vector Logical AND with Complement
111100 ..... ..... ..... 10010 010... XX3 I 774 v2.06 xxlor VSX Vector Logical OR
111100 ..... ..... ..... 10011 010... XX3 I 774 v2.06 xxlxor VSX Vector Logical XOR
111100 ..... ..... ..... 10100 010... XX3 I 773 v2.06 xxlnor VSX Vector Logical NOR
111100 ..... ..... ..... 10101 010... XX3 I 773 v2.07 xxlorc VSX Vector Logical OR with Complement
111100 ..... ..... ..... 10110 010... XX3 I 772 v2.07 xxlnand VSX Vector Logical NAND
111100 ..... ..... ..... 10111 010... XX3 I 772 v2.07 xxleqv VSX Vector Logical Equivalence
111100 ..... ///.. ..... 01010 0100.. XX2 I 778 v2.06 xxspltw VSX Vector Splat Word
111100 ..... 00... ..... 01011 01000. XX1 I 778 v3.0 xxspltib VSX Vector Splat Immediate Byte
111100 ..... /.... ..... 01010 0101.. XX2 I 770 v3.0 xxextractuw VSX Vector Extract Unsigned Word
111100 ..... /.... ..... 01011 0101.. XX2 I 770 v3.0 xxinsertw VSX Vector Insert Word
111100 ..... ..... ..... .1000 011... XX3 I 668 v2.06 xvcmpeqsp[.] VSX Vector Compare Equal SP
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 13 of 17)
111100 ..... ..... ..... .1001 011... XX3 I 672 v2.06 xvcmpgtsp[.] VSX Vector Compare Greater Than SP
111100 ..... ..... ..... .1010 011... XX3 I 670 v2.06 xvcmpgesp[.] VSX Vector Compare Greater Than or Equal SP
111100 ..... ..... ..... .1011 011... XX3 I 674 v3.0 xvcmpnesp[.] VSX Vector Compare Not Equal Single-Precision
111100 ..... ..... ..... .1100 011... XX3 I 667 v2.06 xvcmpeqdp[.] VSX Vector Compare Equal DP
111100 ..... ..... ..... .1101 011... XX3 I 671 v2.06 xvcmpgtdp[.] VSX Vector Compare Greater Than DP
111100 ..... ..... ..... .1110 011... XX3 I 669 v2.06 xvcmpgedp[.] VSX Vector Compare Greater Than or Equal DP
111100 ..... ..... ..... .1111 011... XX3 I 673 v3.0 xvcmpnedp[.] VSX Vector Compare Not Equal Double-Precision
111100 ..... ..... ..... 00000 011... XX3 I 525 v3.0 xscmpeqdp VSX Scalar Compare Equal Double-Precision
111100 ..... ..... ..... 00001 011... XX3 I 527 v3.0 xscmpgtdp VSX Scalar Compare Greater Than Double-Precision
111100 ..... ..... ..... 00010 011... XX3 I 526 v3.0 xscmpgedp VSX Scalar Compare Greater Than or Equal Double-Precision
111100 ..... ..... ..... 00011 011... XX3 I 528 v3.0 xscmpnedp VSX Scalar Compare Not Equal Double-Precision
111100 ...// ..... ..... 00100 011../ XX3 I 532 v2.06 xscmpudp VSX Scalar Compare Unordered DP
111100 ...// ..... ..... 00101 011../ XX3 I 529 v2.06 xscmpodp VSX Scalar Compare Ordered DP
111100 ...// ..... ..... 00111 011../ XX3 I 523 v3.0 xscmpexpdp VSX Scalar Compare Exponents DP
111100 ..... ///// ..... 00100 1000.. XX2 I 546 v2.06 xscvdpuxws VSX Scalar Convert DP to Unsigned Word truncate
111100 ..... ///// ..... 00101 1000.. XX2 I 542 v2.06 xscvdpsxws VSX Scalar Convert DP to Signed Word truncate
111100 ..... ///// ..... 01000 1000.. XX2 I 694 v2.06 xvcvspuxws VSX Vector Convert SP to Unsigned Word truncate
111100 ..... ///// ..... 01001 1000.. XX2 I 690 v2.06 xvcvspsxws VSX Vector Convert SP to Signed Word truncate
111100 ..... ///// ..... 01010 1000.. XX2 I 699 v2.06 xvcvuxwsp VSX Vector Convert Unsigned Word to SP
111100 ..... ///// ..... 01011 1000.. XX2 I 697 v2.06 xvcvsxwsp VSX Vector Convert Signed Word to SP
111100 ..... ///// ..... 01100 1000.. XX2 I 683 v2.06 xvcvdpuxws VSX Vector Convert DP to Unsigned Word truncate
111100 ..... ///// ..... 01101 1000.. XX2 I 679 v2.06 xvcvdpsxws VSX Vector Convert DP to Signed Word truncate
111100 ..... ///// ..... 01110 1000.. XX2 I 699 v2.06 xvcvuxwdp VSX Vector Convert Unsigned Word to DP
111100 ..... ///// ..... 01111 1000.. XX2 I 697 v2.06 xvcvsxwdp VSX Vector Convert Signed Word to DP
111100 ..... ///// ..... 10010 1000.. XX2 I 563 v2.07 xscvuxdsp VSX Scalar Convert Unsigned Dword to SP
111100 ..... ///// ..... 10011 1000.. XX2 I 561 v2.07 xscvsxdsp VSX Scalar Convert Signed Dword to SP
111100 ..... ///// ..... 10100 1000.. XX2 I 544 v2.06 xscvdpuxds VSX Scalar Convert DP to Unsigned Dword truncate
111100 ..... ///// ..... 10101 1000.. XX2 I 539 v2.06 xscvdpsxds VSX Scalar Convert DP to Signed Dword truncate
111100 ..... ///// ..... 10110 1000.. XX2 I 563 v2.06 xscvuxddp VSX Scalar Convert Unsigned Dword to DP
111100 ..... ///// ..... 10111 1000.. XX2 I 561 v2.06 xscvsxddp VSX Scalar Convert Signed Dword to DP
111100 ..... ///// ..... 11000 1000.. XX2 I 692 v2.06 xvcvspuxds VSX Vector Convert SP to Unsigned Dword truncate
111100 ..... ///// ..... 11001 1000.. XX2 I 688 v2.06 xvcvspsxds VSX Vector Convert SP to Signed Dword truncate
111100 ..... ///// ..... 11010 1000.. XX2 I 698 v2.06 xvcvuxdsp VSX Vector Convert Unsigned Dword to SP
111100 ..... ///// ..... 11011 1000.. XX2 I 696 v2.06 xvcvsxdsp VSX Vector Convert Signed Dword to SP
111100 ..... ///// ..... 11100 1000.. XX2 I 681 v2.06 xvcvdpuxds VSX Vector Convert DP to Unsigned Dword truncate
111100 ..... ///// ..... 11101 1000.. XX2 I 677 v2.06 xvcvdpsxds VSX Vector Convert DP to Signed Dword truncate
111100 ..... ///// ..... 11110 1000.. XX2 I 698 v2.06 xvcvuxddp VSX Vector Convert Unsigned Dword to DP
111100 ..... ///// ..... 11111 1000.. XX2 I 696 v2.06 xvcvsxddp VSX Vector Convert Signed Dword to DP
111100 ..... ///// ..... 00100 1001.. XX2 I 630 v2.06 xsrdpi VSX Scalar Round DP to Integral to Nearest Away
111100 ..... ///// ..... 00101 1001.. XX2 I 633 v2.06 xsrdpiz VSX Scalar Round DP to Integral toward Zero
111100 ..... ///// ..... 00110 1001.. XX2 I 632 v2.06 xsrdpip VSX Scalar Round DP to Integral toward +Infinity
111100 ..... ///// ..... 00111 1001.. XX2 I 632 v2.06 xsrdpim VSX Scalar Round DP to Integral toward -Infinity
111100 ..... ///// ..... 01000 1001.. XX2 I 750 v2.06 xvrspi VSX Vector Round SP to Integral to Nearest Away
111100 ..... ///// ..... 01001 1001.. XX2 I 752 v2.06 xvrspiz VSX Vector Round SP to Integral toward Zero
111100 ..... ///// ..... 01010 1001.. XX2 I 751 v2.06 xvrspip VSX Vector Round SP to Integral toward +Infinity
111100 ..... ///// ..... 01011 1001.. XX2 I 751 v2.06 xvrspim VSX Vector Round SP to Integral toward -Infinity
111100 ..... ///// ..... 01100 1001.. XX2 I 745 v2.06 xvrdpi VSX Vector Round DP to Integral to Nearest Away
111100 ..... ///// ..... 01101 1001.. XX2 I 747 v2.06 xvrdpiz VSX Vector Round DP to Integral toward Zero
111100 ..... ///// ..... 01110 1001.. XX2 I 746 v2.06 xvrdpip VSX Vector Round DP to Integral toward +Infinity
111100 ..... ///// ..... 01111 1001.. XX2 I 746 v2.06 xvrdpim VSX Vector Round DP to Integral toward -Infinity
111100 ..... ///// ..... 10000 1001.. XX2 I 538 v2.06 xscvdpsp VSX Scalar Convert DP to SP
111100 ..... ///// ..... 10001 1001.. XX2 I 640 v2.07 xsrsp VSX Scalar Round DP to SP
111100 ..... ///// ..... 10100 1001.. XX2 I 559 v2.06 xscvspdp VSX Scalar Convert SP to DP
111100 ..... ///// ..... 10101 1001.. XX2 I 513 v2.06 xsabsdp VSX Scalar Absolute DP
111100 ..... ///// ..... 10110 1001.. XX2 I 608 v2.06 xsnabsdp VSX Scalar Negative Absolute DP
111100 ..... ///// ..... 10111 1001.. XX2 I 609 v2.06 xsnegdp VSX Scalar Negate DP
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 14 of 17)
111100 ..... ///// ..... 11000 1001.. XX2 I 676 v2.06 xvcvdpsp VSX Vector Convert DP to SP
111100 ..... ///// ..... 11001 1001.. XX2 I 660 v2.06 xvabssp VSX Vector Absolute SP
111100 ..... ///// ..... 11010 1001.. XX2 I 729 v2.06 xvnabssp VSX Vector Negative Absolute SP
111100 ..... ///// ..... 11011 1001.. XX2 I 730 v2.06 xvnegsp VSX Vector Negate SP
111100 ..... ///// ..... 11100 1001.. XX2 I 686 v2.06 xvcvspdp VSX Vector Convert SP to DP
111100 ..... ///// ..... 11101 1001.. XX2 I 660 v2.06 xvabsdp VSX Vector Absolute DP
111100 ..... ///// ..... 11110 1001.. XX2 I 729 v2.06 xvnabsdp VSX Vector Negative Absolute DP
111100 ..... ///// ..... 11111 1001.. XX2 I 730 v2.06 xvnegdp VSX Vector Negate DP
111100 ...// ..... ..... 00111 101../ XX3 I 653 v2.06 xstdivdp VSX Scalar Test for software Divide DP
111100 ...// ..... ..... 01011 101../ XX3 I 762 v2.06 xvtdivsp VSX Vector Test for software Divide SP
111100 ...// ..... ..... 01111 101../ XX3 I 761 v2.06 xvtdivdp VSX Vector Test for software Divide DP
111100 ..... ..... ..... 1101. 101... XX2 I 765 v3.0 xvtstdcsp VSX Vector Test Data Class SP
111100 ..... ..... ..... 1111. 101... XX2 I 764 v3.0 xvtstdcdp VSX Vector Test Data Class DP
111100 ..... ///// ..... 00000 1010.. XX2 I 642 v2.07 xsrsqrtesp VSX Scalar Reciprocal Square Root Estimate SP
111100 ..... ///// ..... 00001 1010.. XX2 I 635 v2.07 xsresp VSX Scalar Reciprocal Estimate SP
111100 ..... ///// ..... 00100 1010.. XX2 I 641 v2.06 xsrsqrtedp VSX Scalar Reciprocal Square Root Estimate DP
111100 ..... ///// ..... 00101 1010.. XX2 I 634 v2.06 xsredp VSX Scalar Reciprocal Estimate DP
111100 ...// ///// ..... 00110 1010./ XX2 I 654 v2.06 xstsqrtdp VSX Scalar Test for software Square Root DP
111100 ..... ///// ..... 01000 1010.. XX2 I 754 v2.06 xvrsqrtesp VSX Vector Reciprocal Square Root Estimate SP
111100 ..... ///// ..... 01001 1010.. XX2 I 749 v2.06 xvresp VSX Vector Reciprocal Estimate SP
111100 ...// ///// ..... 01010 1010./ XX2 I 763 v2.06 xvtsqrtsp VSX Vector Test for software Square Root SP
111100 ..... ///// ..... 01100 1010.. XX2 I 752 v2.06 xvrsqrtedp VSX Vector Reciprocal Square Root Estimate DP
111100 ..... ///// ..... 01101 1010.. XX2 I 748 v2.06 xvredp VSX Vector Reciprocal Estimate DP
111100 ...// ///// ..... 01110 1010./ XX2 I 763 v2.06 xvtsqrtdp VSX Vector Test for software Square Root DP
111100 ..... ..... ..... 10010 1010./ XX2 I 657 v3.0 xststdcsp VSX Scalar Test Data Class SP
111100 ..... ..... ..... 10110 1010./ XX2 I 655 v3.0 xststdcdp VSX Scalar Test Data Class DP
111100 ..... ///// ..... 00000 1011.. XX2 I 646 v2.07 xssqrtsp VSX Scalar Square Root SP
111100 ..... ///// ..... 00100 1011.. XX2 I 643 v2.06 xssqrtdp VSX Scalar Square Root DP
111100 ..... ///// ..... 00110 1011.. XX2 I 631 v2.06 xsrdpic VSX Scalar Round DP to Integral using Current rounding mode
111100 ..... ///// ..... 01000 1011.. XX2 I 756 v2.06 xvsqrtsp VSX Vector Square Root SP
111100 ..... ///// ..... 01010 1011.. XX2 I 750 v2.06 xvrspic VSX Vector Round SP to Integral using Current rounding mode
111100 ..... ///// ..... 01100 1011.. XX2 I 755 v2.06 xvsqrtdp VSX Vector Square Root DP
111100 ..... ///// ..... 01110 1011.. XX2 I 745 v2.06 xvrdpic VSX Vector Round DP to Integral using Current rounding mode
111100 ..... ///// ..... 10000 1011.. XX2 I 539 v2.07 xscvdpspn VSX Scalar Convert DP to SP Non-signalling
111100 ..... ///// ..... 10100 1011.. XX2 I 560 v2.07 xscvspdpn VSX Scalar Convert SP to DP Non-signalling
111100 ..... 00000 ..... 10101 1011./ XX2 I 658 v3.0 xsxexpdp VSX Scalar Extract Exponent DP
111100 ..... 00001 ..... 10101 1011./ XX2 I 659 v3.0 xsxsigdp VSX Scalar Extract Significand DP
111100 ..... 10000 ..... 10101 1011.. XX2 I 548 v3.0 xscvhpdp VSX Scalar Convert HP to DP
111100 ..... 10001 ..... 10101 1011.. XX2 I 536 v3.0 xscvdphp VSX Scalar Convert DP to HP
111100 ..... 00000 ..... 11101 1011.. XX2 I 766 v3.0 xvxexpdp VSX Vector Extract Exponent DP
111100 ..... 00001 ..... 11101 1011.. XX2 I 767 v3.0 xvxsigdp VSX Vector Extract Significand DP
111100 ..... 00111 ..... 11101 1011.. XX2 I 768 v3.0 xxbrh VSX Vector Byte-Reverse Hword
111100 ..... 01000 ..... 11101 1011.. XX2 I 766 v3.0 xvxexpsp VSX Vector Extract Exponent SP
111100 ..... 01001 ..... 11101 1011.. XX2 I 767 v3.0 xvxsigsp VSX Vector Extract Significand SP
111100 ..... 01111 ..... 11101 1011.. XX2 I 769 v3.0 xxbrw VSX Vector Byte-Reverse Word
111100 ..... 10111 ..... 11101 1011.. XX2 I 768 v3.0 xxbrd VSX Vector Byte-Reverse Dword
111100 ..... 11000 ..... 11101 1011.. XX2 I 685 v3.0 xvcvhpsp VSX Vector Convert HP to SP
111100 ..... 11001 ..... 11101 1011.. XX2 I 687 v3.0 xvcvsphp VSX Vector Convert SP to HP
111100 ..... 11111 ..... 11101 1011.. XX2 I 769 v3.0 xxbrq VSX Vector Byte-Reverse Qword
111100 ..... ..... ..... 11100 10110. XX1 I 570 v3.0 xsiexpdp VSX Scalar Insert Exponent DP
111100 ..... ..... ..... ..... 11.... XX4 I 777 v2.06 xxsel VSX Vector Select
111101 ..... ..... ..... ..... ....00 DS I 150 v2.05 stfdp Store Floating Double Pair
111101 ..... ..... ..... ..... ....10 DS I 499 v3.0 stxsd Store VSX Scalar Dword
111101 ..... ..... ..... ..... ....11 DS I 502 v3.0 stxssp Store VSX Scalar SP
111101 ..... ..... ..... ..... ...001 DQ I 493 v3.0 lxv Load VSX Vector
111101 ..... ..... ..... ..... ...101 DQ I 508 v3.0 stxv Store VSX Vector
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 15 of 17)
Columns: Instruction(1), Format, Book, Page, Version(2), Privilege(3), Mode Dep(4), Mnemonic, Name
Instruction bit positions, left to right: 0-5 6-10 11-15 16-20 21-25 26-31
111110 ..... ..... ..... ..... ....00 DS I 58 PPC std Store Dword
111110 ..... ..... ..... ..... ....01 DS I 58 PPC stdu Store Dword with Update
111110 ..... ..... ..... ..... ....10 DS I 60 v2.03 stq Store Qword
111111 ...// ..... ..... 00000 00000/ X I 168 P1 fcmpu Floating Compare Unordered
111111 ...// ..... ..... 00001 00000/ X I 168 P1 fcmpo Floating Compare Ordered
111111 ...// ...// ///// 00010 00000/ X I 171 P1 mcrfs Move To CR from FPSCR
111111 ...// ..... ..... 00100 00000/ X I 157 v2.06 ftdiv Floating Test for software Divide
111111 ...// ///// ..... 00101 00000/ X I 157 v2.06 ftsqrt Floating Test for software Square Root
111111 ..... ..... ..... .0010 00010. Z22 I 222 v2.05 dscliq[.] DFP Shift Significand Left Immediate Quad
111111 ..... ..... ..... .0011 00010. Z22 I 222 v2.05 dscriq[.] DFP Shift Significand Right Immediate Quad
111111 ...// ..... ..... .0110 00010/ Z22 I 202 v2.05 dtstdcq DFP Test Data Class Quad
111111 ...// ..... ..... .0111 00010/ Z22 I 202 v2.05 dtstdgq DFP Test Data Group Quad
111111 ..... ..... ..... 00000 00010. X I 195 v2.05 daddq[.] DFP Add Quad
111111 ..... ..... ..... 00001 00010. X I 197 v2.05 dmulq[.] DFP Multiply Quad
111111 ...// ..... ..... 00100 00010/ X I 201 v2.05 dcmpoq DFP Compare Ordered Quad
111111 ...// ..... ..... 00101 00010/ X I 203 v2.05 dtstexq DFP Test Exponent Quad
111111 ..... ///// ..... 01000 00010. X I 215 v2.05 dctqpq[.] DFP Convert To DFP Extended
111111 ..... ///// ..... 01001 00010. X I 217 v2.05 dctfixq[.] DFP Convert To Fixed Quad
111111 ..... ../// ..... 01010 00010. X I 219 v2.05 ddedpdq[.] DFP Decode DPD To BCD Quad
111111 ..... ///// ..... 01011 00010. X I 220 v2.05 dxexq[.] DFP Extract Exponent Quad
111111 ..... ..... ..... 10000 00010. X I 195 v2.05 dsubq[.] DFP Subtract Quad
111111 ..... ..... ..... 10001 00010. X I 198 v2.05 ddivq[.] DFP Divide Quad
111111 ...// ..... ..... 10100 00010/ X I 200 v2.05 dcmpuq DFP Compare Unordered Quad
111111 ...// ..... ..... 10101 00010/ X I 204 v2.05 dtstsfq DFP Test Significance Quad
111111 ..... ///// ..... 11000 00010. X I 216 v2.05 drdpq[.] DFP Round To DFP Long
111111 ..... ///// ..... 11001 00010. X I 217 v2.05 dcffixq[.] DFP Convert From Fixed Quad
111111 ..... .//// ..... 11010 00010. X I 219 v2.05 denbcdq[.] DFP Encode BCD To DPD Quad
111111 ..... ..... ..... 11011 00010. X I 220 v2.05 diexq[.] DFP Insert Exponent Quad
111111 ..... ..... ..... ..000 00011. Z23 I 206 v2.05 dquaq[.] DFP Quantize Quad
111111 ..... ..... ..... ..001 00011. Z23 I 208 v2.05 drrndq[.] DFP Reround Quad
111111 ..... ..... ..... ..010 00011. Z23 I 205 v2.05 dquaiq[.] DFP Quantize Immediate Quad
111111 ..... ////. ..... ..011 00011. Z23 I 211 v2.05 drintxq[.] DFP Round To FP Integer With Inexact Quad
111111 ..... ////. ..... ..111 00011. Z23 I 213 v2.05 drintnq[.] DFP Round To FP Integer Without Inexact Quad
111111 ...// ..... ..... 10101 00011/ X I 204 v3.0 dtstsfiq DFP Test Significance Immediate Quad
111111 ..... ..... ..... 00000 00100. X I 521 v3.0 xsaddqp[o] VSX Scalar Add QP
111111 ..... ..... ..... 00001 00100. X I 604 v3.0 xsmulqp[o] VSX Scalar Multiply QP
111111 ..... ..... ..... 00011 00100/ X I 535 v3.0 xscpsgnqp VSX Scalar Copy Sign QP
111111 ...// ..... ..... 00100 00100/ X I 531 v3.0 xscmpoqp VSX Scalar Compare Ordered QP
111111 ...// ..... ..... 00101 00100/ X I 524 v3.0 xscmpexpqp VSX Scalar Compare Exponents QP
111111 ..... ..... ..... 01100 00100. X I 578 v3.0 xsmaddqp[o] VSX Scalar Multiply-Add QP
111111 ..... ..... ..... 01101 00100. X I 599 v3.0 xsmsubqp[o] VSX Scalar Multiply-Subtract QP
111111 ..... ..... ..... 01110 00100. X I 618 v3.0 xsnmaddqp[o] VSX Scalar Negative Multiply-Add QP
111111 ..... ..... ..... 01111 00100. X I 627 v3.0 xsnmsubqp[o] VSX Scalar Negative Multiply-Subtract QP
111111 ..... ..... ..... 10000 00100. X I 649 v3.0 xssubqp[o] VSX Scalar Subtract QP
111111 ..... ..... ..... 10001 00100. X I 566 v3.0 xsdivqp[o] VSX Scalar Divide QP
111111 ...// ..... ..... 10100 00100/ X I 534 v3.0 xscmpuqp VSX Scalar Compare Unordered QP
111111 ..... ..... ..... 10110 00100/ X I 656 v3.0 xststdcqp VSX Scalar Test Data Class QP
111111 ..... 00000 ..... 11001 00100/ X I 513 v3.0 xsabsqp VSX Scalar Absolute QP
111111 ..... 00010 ..... 11001 00100/ X I 658 v3.0 xsxexpqp VSX Scalar Extract Exponent QP
111111 ..... 01000 ..... 11001 00100/ X I 608 v3.0 xsnabsqp VSX Scalar Negative Absolute QP
111111 ..... 10000 ..... 11001 00100/ X I 609 v3.0 xsnegqp VSX Scalar Negate QP
111111 ..... 10010 ..... 11001 00100/ X I 659 v3.0 xsxsigqp VSX Scalar Extract Significand QP
111111 ..... 11011 ..... 11001 00100. X I 644 v3.0 xssqrtqp[o] VSX Scalar Square Root QP
111111 ..... 00001 ..... 11010 00100/ X I 556 v3.0 xscvqpuwz VSX Scalar Convert QP to Unsigned Word truncate
111111 ..... 00010 ..... 11010 00100/ X I 562 v3.0 xscvudqp VSX Scalar Convert Unsigned Dword to QP
111111 ..... 01001 ..... 11010 00100/ X I 552 v3.0 xscvqpswz VSX Scalar Convert QP to Signed Word truncate
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 16 of 17)
111111 ..... 01010 ..... 11010 00100/ X I 558 v3.0 xscvsdqp VSX Scalar Convert Signed Dword to QP
111111 ..... 10001 ..... 11010 00100/ X I 554 v3.0 xscvqpudz VSX Scalar Convert QP to Unsigned Dword truncate
111111 ..... 10100 ..... 11010 00100. X I 549 v3.0 xscvqpdp[o] VSX Scalar Convert QP to DP
111111 ..... 10110 ..... 11010 00100/ X I 537 v3.0 xscvdpqp VSX Scalar Convert DP to QP
111111 ..... 11001 ..... 11010 00100/ X I 550 v3.0 xscvqpsdz VSX Scalar Convert QP to Signed Dword truncate
111111 ..... ..... ..... 11011 00100/ X I 571 v3.0 xsiexpqp VSX Scalar Insert Exponent QP
111111 ..... ////. ..... ..000 00101. Z23 I 636 v3.0 xsrqpi[x] VSX Scalar Round QP to Integral
111111 ..... ////. ..... ..001 00101/ Z23 I 638 v3.0 xsrqpxp VSX Scalar Round QP to XP
111111 ..... ///// ///// 00001 00110. X I 173 P1 mtfsb1[.] Move To FPSCR Bit 1
111111 ..... ///// ///// 00010 00110. X I 173 P1 mtfsb0[.] Move To FPSCR Bit 0
111111 ...// ////. ..../ 00100 00110. X I 172 P1 mtfsfi[.] Move To FPSCR Field Immediate
111111 ..... ..... ..... 11010 00110/ X I 152 v2.07 fmrgow Floating Merge Odd Word
111111 ..... ..... ..... 11110 00110/ X I 151 v2.07 fmrgew Floating Merge Even Word
111111 ..... ///// ///// 10010 00111. X I 171 P1 mffs[.] Move From FPSCR
111111 ..... ..... ..... 10110 00111. XFL I 172 P1 mtfsf[.] Move To FPSCR Fields
111111 ..... ..... ..... 00000 01000. X I 151 v2.05 fcpsgn[.] Floating Copy Sign
111111 ..... ///// ..... 00001 01000. X I 151 P1 fneg[.] Floating Negate
111111 ..... ///// ..... 00010 01000. X I 151 P1 fmr[.] Floating Move Register
111111 ..... ///// ..... 00100 01000. X I 151 P1 fnabs[.] Floating Negative Absolute Value
111111 ..... ///// ..... 01000 01000. X I 151 P1 fabs[.] Floating Absolute
111111 ..... ///// ..... 01100 01000. X I 167 v2.02 frin[.] Floating Round To Integer Nearest
111111 ..... ///// ..... 01101 01000. X I 167 v2.02 friz[.] Floating Round To Integer Zero
111111 ..... ///// ..... 01110 01000. X I 167 v2.02 frip[.] Floating Round To Integer Plus
111111 ..... ///// ..... 01111 01000. X I 167 v2.02 frim[.] Floating Round To Integer Minus
111111 ..... ///// ..... 00000 01100. X I 160 P1 frsp[.] Floating Round to SP
111111 ..... ///// ..... 00000 01110. X I 162 P2 fctiw[.] Floating Convert To Integer Word
111111 ..... ///// ..... 00100 01110. X I 163 v2.06 fctiwu[.] Floating Convert To Integer Word Unsigned
111111 ..... ///// ..... 11001 01110. X I 160 PPC fctid[.] Floating Convert To Integer Dword
111111 ..... ///// ..... 11010 01110. X I 164 PPC fcfid[.] Floating Convert From Integer Dword
111111 ..... ///// ..... 11101 01110. X I 161 v2.06 fctidu[.] Floating Convert To Integer Dword Unsigned
111111 ..... ///// ..... 11110 01110. X I 165 v2.06 fcfidu[.] Floating Convert From Integer Dword Unsigned
111111 ..... ///// ..... 00000 01111. X I 163 P2 fctiwz[.] Floating Convert To Integer Word truncate
111111 ..... ///// ..... 00100 01111. X I 164 v2.06 fctiwuz[.] Floating Convert To Integer Word Unsigned truncate
111111 ..... ///// ..... 11001 01111. X I 161 PPC fctidz[.] Floating Convert To Integer Dword truncate
111111 ..... ///// ..... 11101 01111. X I 162 v2.06 fctiduz[.] Floating Convert To Integer Dword Unsigned truncate
111111 ..... ..... ..... ///// 10010. A I 154 P1 fdiv[.] Floating Divide
111111 ..... ..... ..... ///// 10100. A I 153 P1 fsub[.] Floating Subtract
111111 ..... ..... ..... ///// 10101. A I 153 P1 fadd[.] Floating Add
111111 ..... ///// ..... ///// 10110. A I 155 P2 fsqrt[.] Floating Square Root
111111 ..... ..... ..... ..... 10111. A I 169 PPC fsel[.] Floating Select
111111 ..... ///// ..... ///// 11000. A I 155 v2.02 fre[.] Floating Reciprocal Estimate
111111 ..... ..... ///// ..... 11001. A I 154 P1 fmul[.] Floating Multiply
111111 ..... ///// ..... ///// 11010. A I 156 PPC frsqrte[.] Floating Reciprocal Square Root Estimate
111111 ..... ..... ..... ..... 11100. A I 159 P1 fmsub[.] Floating Multiply-Subtract
111111 ..... ..... ..... ..... 11101. A I 158 P1 fmadd[.] Floating Multiply-Add
111111 ..... ..... ..... ..... 11110. A I 159 P1 fnmsub[.] Floating Negative Multiply-Subtract
111111 ..... ..... ..... ..... 11111. A I 159 P1 fnmadd[.] Floating Negative Multiply-Add
Figure 87. Power ISA Instruction Set Sorted by Opcode (Sheet 17 of 17)
1. Key to Instruction column (primary and extended opcode bits shaded in gray).
/ Instruction bit that corresponds to a reserved field, must have a value of 0, otherwise invalid form.
. Instruction bit that corresponds to an operand bit, may have a value of either 0 or 1.
0 Instruction bit having a value 0.
1 Instruction bit having a value 1.
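The key above can be applied mechanically. The following Python sketch is illustrative only and is not part of the architecture: it turns one Instruction-column pattern into bit masks and classifies a 32-bit instruction word against it. The names `compile_pattern` and `classify` are hypothetical. The fneg[.] pattern is copied from Figure 87, and bit 0 is assumed to be the most-significant bit of the word, following the bit ruler in the table heading.

```python
def compile_pattern(pattern):
    """Convert an Instruction-column pattern into (fixed_mask, fixed_value, reserved_mask)."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("pattern must describe exactly 32 instruction bits")
    fixed_mask = fixed_value = reserved_mask = 0
    for ch in bits:                      # leftmost character is bit 0 (assumed MSB)
        fixed_mask <<= 1
        fixed_value <<= 1
        reserved_mask <<= 1
        if ch == "1":                    # bit having a value 1
            fixed_mask |= 1
            fixed_value |= 1
        elif ch == "0":                  # bit having a value 0
            fixed_mask |= 1
        elif ch == "/":                  # reserved bit, must be 0
            reserved_mask |= 1
        elif ch != ".":                  # '.' is an operand bit, either value
            raise ValueError("unexpected pattern character: %r" % ch)
    return fixed_mask, fixed_value, reserved_mask


def classify(word, pattern):
    """Return 'match', 'invalid form', or 'no match' for a 32-bit instruction word."""
    fixed_mask, fixed_value, reserved_mask = compile_pattern(pattern)
    if (word & fixed_mask) != fixed_value:
        return "no match"
    if (word & reserved_mask) != 0:
        return "invalid form"            # opcode bits match but a reserved bit is set
    return "match"


# Pattern for fneg[.] as listed in Figure 87.
FNEG = "111111 ..... ///// ..... 00001 01000."

print(classify(0xFC201050, FNEG))  # match
print(classify(0xFC211050, FNEG))  # invalid form (a bit in the reserved field 11-15 is set)
print(classify(0xFC201090, FNEG))  # no match (different extended opcode)
```

A complete decoder would have to try the patterns in a suitable order, because some rows leave more bits open than others; this sketch only checks a single row.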
This appendix lists all the instructions in the Power ISA, sorted by ISA version in reverse order (most recent version first) and, within each version, alphabetically by mnemonic.
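As a rough illustration of this ordering, the Python sketch below sorts a few (version, mnemonic) pairs taken from the tables. The oldest-to-newest ranking of the version tags in `VERSION_ORDER` is an assumption made for this example, as are the names `VERSION_ORDER`, `RANK`, and `sort_by_version`; only tags that appear in the rows of this section are listed.

```python
# Assumed oldest-to-newest order of the Version tags (an assumption for this sketch).
VERSION_ORDER = ["P1", "P2", "PPC", "v2.02", "v2.03", "v2.05", "v2.06", "v2.07", "v3.0"]
RANK = {tag: i for i, tag in enumerate(VERSION_ORDER)}


def sort_by_version(rows):
    """Sort (version, mnemonic) pairs: newest version first, then by mnemonic."""
    return sorted(rows, key=lambda row: (-RANK[row[0]], row[1]))


rows = [
    ("P1", "fneg[.]"),
    ("v2.07", "bctar[l]"),
    ("v3.0", "modsd"),
    ("v2.07", "bcdadd."),
    ("v3.0", "mcrxrx"),
]

for version, mnemonic in sort_by_version(rows):
    print(version, mnemonic)
# v3.0 mcrxrx
# v3.0 modsd
# v2.07 bcdadd.
# v2.07 bctar[l]
# P1 fneg[.]
```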
Columns: Instruction(1), Format, Book, Page, Version(2), Privilege(3), Mode Dep(4), Mnemonic, Name
Instruction bit positions, left to right: 0-5 6-10 11-15 16-20 21-25 26-31
011111 ...// ///// ///// 10010 00000/ X I 119 v3.0 mcrxrx Move XER to CR Extended
011111 ..... ..... ///// 01001 10011. XX1 I 111 v3.0 mfvsrld Move From VSR Lower Dword
011111 ..... ..... ..... 11000 01001/ X I 84 v3.0 modsd Modulo Signed Dword
011111 ..... ..... ..... 11000 01011/ X I 76 v3.0 modsw Modulo Signed Word
011111 ..... ..... ..... 01000 01001/ X I 84 v3.0 modud Modulo Unsigned Dword
011111 ..... ..... ..... 01000 01011/ X I 76 v3.0 moduw Modulo Unsigned Word
011111 ///// ///// ///// 11011 10110/ X III 1126 v3.0 H msgsync Message Synchronize
011111 ..... ..... ..... 01101 10011. XX1 I 114 v3.0 mtvsrdd Move To VSR Double Dword
011111 ..... ..... ///// 01100 10011. XX1 I 115 v3.0 mtvsrws Move To VSR Word & Splat
011111 ////. ..... ..... 11100 00110. X II 859 v3.0 paste[.] Paste
011111 ..... ...// ///// 00100 00000/ X I 121 v3.0 setb Set Boolean
011111 ..... ///// ..... 01110 10010/ X III 1025 v3.0 P slbieg SLB Invalidate Entry Global
011111 ///// ///// ///// 01010 10010/ X III 1031 v3.0 P slbsync SLB Synchronize
011111 ..... ..... ..... 10111 00110/ X II 866 v3.0 stdat Store Dword ATomic
010011 ///// ///// ///// 01011 10010/ XL III 957 v3.0 P stop Stop
011111 ..... ..... ..... 10110 00110/ X II 866 v3.0 stwat Store Word ATomic
111101 ..... ..... ..... ..... ....10 DS I 499 v3.0 stxsd Store VSX Scalar Dword
011111 ..... ..... ..... 11100 01101. XX1 I 500 v3.0 stxsibx Store VSX Scalar as Integer Byte Indexed
011111 ..... ..... ..... 11101 01101. XX1 I 500 v3.0 stxsihx Store VSX Scalar as Integer Hword Indexed
111101 ..... ..... ..... ..... ....11 DS I 502 v3.0 stxssp Store VSX Scalar SP
111101 ..... ..... ..... ..... ...101 DQ I 508 v3.0 stxv Store VSX Vector
011111 ..... ..... ..... 11111 01100. XX1 I 504 v3.0 stxvb16x Store VSX Vector Byte*16 Indexed
011111 ..... ..... ..... 11101 01100. XX1 I 506 v3.0 stxvh8x Store VSX Vector Hword*8 Indexed
011111 ..... ..... ..... 01100 01101. XX1 I 508 v3.0 stxvl Store VSX Vector with Length
011111 ..... ..... ..... 01101 01101. XX1 I 510 v3.0 stxvll Store VSX Vector Left-justified with Length
011111 ..... ..... ..... 01100 01100. XX1 I 511 v3.0 stxvx Store VSX Vector Indexed
000100 ..... ..... ..... 10000 000011 VX I 300 v3.0 vabsdub Vector Absolute Difference Unsigned Byte
000100 ..... ..... ..... 10001 000011 VX I 300 v3.0 vabsduh Vector Absolute Difference Unsigned Hword
000100 ..... ..... ..... 10010 000011 VX I 301 v3.0 vabsduw Vector Absolute Difference Unsigned Word
000100 ..... ..... ..... 10111 001100 VX I 349 v3.0 vbpermd Vector Bit Permute Dword
000100 ..... 00000 ..... 11000 000010 VX I 345 v3.0 vclzlsbb Vector Count Leading Zero Least-Significant Bits Byte
000100 ..... ..... ..... .0000 000111 VC I 312 v3.0 vcmpneb[.] Vector Compare Not Equal Byte
000100 ..... ..... ..... .0001 000111 VC I 313 v3.0 vcmpneh[.] Vector Compare Not Equal Hword
000100 ..... ..... ..... .0010 000111 VC I 314 v3.0 vcmpnew[.] Vector Compare Not Equal Word
000100 ..... ..... ..... .0100 000111 VC I 312 v3.0 vcmpnezb[.] Vector Compare Not Equal or Zero Byte
000100 ..... ..... ..... .0101 000111 VC I 313 v3.0 vcmpnezh[.] Vector Compare Not Equal or Zero Hword
000100 ..... ..... ..... .0110 000111 VC I 314 v3.0 vcmpnezw[.] Vector Compare Not Equal or Zero Word
000100 ..... 11100 ..... 11000 000010 VX I 344 v3.0 vctzb Vector Count Trailing Zeros Byte
000100 ..... 11111 ..... 11000 000010 VX I 344 v3.0 vctzd Vector Count Trailing Zeros Dword
000100 ..... 11101 ..... 11000 000010 VX I 344 v3.0 vctzh Vector Count Trailing Zeros Hword
000100 ..... 00001 ..... 11000 000010 VX I 345 v3.0 vctzlsbb Vector Count Trailing Zero Least-Significant Bits Byte
000100 ..... 11110 ..... 11000 000010 VX I 344 v3.0 vctzw Vector Count Trailing Zeros Word
000100 ..... /.... ..... 01011 001101 VX I 269 v3.0 vextractd Vector Extract Dword
000100 ..... /.... ..... 01000 001101 VX I 269 v3.0 vextractub Vector Extract Unsigned Byte
000100 ..... /.... ..... 01001 001101 VX I 269 v3.0 vextractuh Vector Extract Unsigned Hword
000100 ..... /.... ..... 01010 001101 VX I 269 v3.0 vextractuw Vector Extract Unsigned Word
000100 ..... 11000 ..... 11000 000010 VX I 296 v3.0 vextsb2d Vector Extend Sign Byte to Dword
000100 ..... 10000 ..... 11000 000010 VX I 296 v3.0 vextsb2w Vector Extend Sign Byte to Word
000100 ..... 11001 ..... 11000 000010 VX I 296 v3.0 vextsh2d Vector Extend Sign Hword to Dword
000100 ..... 10001 ..... 11000 000010 VX I 296 v3.0 vextsh2w Vector Extend Sign Hword to Word
000100 ..... 11010 ..... 11000 000010 VX I 297 v3.0 vextsw2d Vector Extend Sign Word to Dword
000100 ..... ..... ..... 11000 001101 VX I 346 v3.0 vextublx Vector Extract Unsigned Byte Left-Indexed
000100 ..... ..... ..... 11100 001101 VX I 346 v3.0 vextubrx Vector Extract Unsigned Byte Right-Indexed
000100 ..... ..... ..... 11001 001101 VX I 346 v3.0 vextuhlx Vector Extract Unsigned Hword Left-Indexed
000100 ..... ..... ..... 11101 001101 VX I 346 v3.0 vextuhrx Vector Extract Unsigned Hword Right-Indexed
000100 ..... ..... ..... 11010 001101 VX I 347 v3.0 vextuwlx Vector Extract Unsigned Word Left-Indexed
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 2 of 17)
000100 ..... ..... ..... 11110 001101 VX I 347 v3.0 vextuwrx Vector Extract Unsigned Word Right-Indexed
000100 ..... /.... ..... 01100 001101 VX I 270 v3.0 vinsertb Vector Insert Byte
000100 ..... /.... ..... 01111 001101 VX I 270 v3.0 vinsertd Vector Insert Dword
000100 ..... /.... ..... 01101 001101 VX I 270 v3.0 vinserth Vector Insert Hword
000100 ..... /.... ..... 01110 001101 VX I 270 v3.0 vinsertw Vector Insert Word
000100 ..... ..... ///// 00000 000001 VX I 357 v3.0 vmul10cuq Vector Multiply-by-10 & write Carry Unsigned Qword
000100 ..... ..... ..... 00001 000001 VX I 357 v3.0 vmul10ecuq Vector Multiply-by-10 Extended & write Carry Unsigned Qword
000100 ..... ..... ..... 01001 000001 VX I 357 v3.0 vmul10euq Vector Multiply-by-10 Extended Unsigned Qword
000100 ..... ..... ///// 01000 000001 VX I 357 v3.0 vmul10uq Vector Multiply-by-10 Unsigned Qword
000100 ..... 00111 ..... 11000 000010 VX I 295 v3.0 vnegd Vector Negate Dword
000100 ..... 00110 ..... 11000 000010 VX I 295 v3.0 vnegw Vector Negate Word
000100 ..... ..... ..... ..... 111011 VA I 262 v3.0 vpermr Vector Permute Right-indexed
000100 ..... 01001 ..... 11000 000010 VX I 317 v3.0 vprtybd Vector Parity Byte Dword
000100 ..... 01010 ..... 11000 000010 VX I 317 v3.0 vprtybq Vector Parity Byte Qword
000100 ..... 01000 ..... 11000 000010 VX I 317 v3.0 vprtybw Vector Parity Byte Word
000100 ..... ..... ..... 00011 000101 VX I 323 v3.0 vrldmi Vector Rotate Left Dword then Mask Insert
000100 ..... ..... ..... 00111 000101 VX I 323 v3.0 vrldnm Vector Rotate Left Dword then AND with Mask
000100 ..... ..... ..... 00010 000101 VX I 322 v3.0 vrlwmi Vector Rotate Left Word then Mask Insert
000100 ..... ..... ..... 00110 000101 VX I 322 v3.0 vrlwnm Vector Rotate Left Word then AND with Mask
000100 ..... ..... ..... 11101 000100 X I 267 v3.0 vslv Vector Shift Left Variable
000100 ..... ..... ..... 11100 000100 X I 267 v3.0 vsrv Vector Shift Right Variable
011111 ///.. ///// ///// 00000 11110/ X II 880 v3.0 wait Wait for Interrupt
111111 ..... 00000 ..... 11001 00100/ X I 513 v3.0 xsabsqp VSX Scalar Absolute QP
111111 ..... ..... ..... 00000 00100. X I 521 v3.0 xsaddqp[o] VSX Scalar Add QP
111100 ..... ..... ..... 00000 011... XX3 I 525 v3.0 xscmpeqdp VSX Scalar Compare Equal Double-Precision
111100 ...// ..... ..... 00111 011../ XX3 I 523 v3.0 xscmpexpdp VSX Scalar Compare Exponents DP
111111 ...// ..... ..... 00101 00100/ X I 524 v3.0 xscmpexpqp VSX Scalar Compare Exponents QP
111100 ..... ..... ..... 00010 011... XX3 I 526 v3.0 xscmpgedp VSX Scalar Compare Greater Than or Equal Double-Precision
111100 ..... ..... ..... 00001 011... XX3 I 527 v3.0 xscmpgtdp VSX Scalar Compare Greater Than Double-Precision
111100 ..... ..... ..... 00011 011... XX3 I 528 v3.0 xscmpnedp VSX Scalar Compare Not Equal Double-Precision
111111 ...// ..... ..... 00100 00100/ X I 531 v3.0 xscmpoqp VSX Scalar Compare Ordered QP
111111 ...// ..... ..... 10100 00100/ X I 534 v3.0 xscmpuqp VSX Scalar Compare Unordered QP
111111 ..... ..... ..... 00011 00100/ X I 535 v3.0 xscpsgnqp VSX Scalar Copy Sign QP
111100 ..... 10001 ..... 10101 1011.. XX2 I 536 v3.0 xscvdphp VSX Scalar Convert DP to HP
111111 ..... 10110 ..... 11010 00100/ X I 537 v3.0 xscvdpqp VSX Scalar Convert DP to QP
111100 ..... 10000 ..... 10101 1011.. XX2 I 548 v3.0 xscvhpdp VSX Scalar Convert HP to DP
111111 ..... 10100 ..... 11010 00100. X I 549 v3.0 xscvqpdp[o] VSX Scalar Convert QP to DP
111111 ..... 11001 ..... 11010 00100/ X I 550 v3.0 xscvqpsdz VSX Scalar Convert QP to Signed Dword truncate
111111 ..... 01001 ..... 11010 00100/ X I 552 v3.0 xscvqpswz VSX Scalar Convert QP to Signed Word truncate
111111 ..... 10001 ..... 11010 00100/ X I 554 v3.0 xscvqpudz VSX Scalar Convert QP to Unsigned Dword truncate
111111 ..... 00001 ..... 11010 00100/ X I 556 v3.0 xscvqpuwz VSX Scalar Convert QP to Unsigned Word truncate
111111 ..... 01010 ..... 11010 00100/ X I 558 v3.0 xscvsdqp VSX Scalar Convert Signed Dword to QP
111111 ..... 00010 ..... 11010 00100/ X I 562 v3.0 xscvudqp VSX Scalar Convert Unsigned Dword to QP
111111 ..... ..... ..... 10001 00100. X I 566 v3.0 xsdivqp[o] VSX Scalar Divide QP
111100 ..... ..... ..... 11100 10110. XX1 I 570 v3.0 xsiexpdp VSX Scalar Insert Exponent DP
111111 ..... ..... ..... 11011 00100/ X I 571 v3.0 xsiexpqp VSX Scalar Insert Exponent QP
111111 ..... ..... ..... 01100 00100. X I 578 v3.0 xsmaddqp[o] VSX Scalar Multiply-Add QP
111100 ..... ..... ..... 10000 000... XX3 I 583 v3.0 xsmaxcdp VSX Scalar Maximum Type-C Double-Precision
111100 ..... ..... ..... 10010 000... XX3 I 585 v3.0 xsmaxjdp VSX Scalar Maximum Type-J Double-Precision
111100 ..... ..... ..... 10001 000... XX3 I 589 v3.0 xsmincdp VSX Scalar Minimum Type-C Double-Precision
111100 ..... ..... ..... 10011 000... XX3 I 591 v3.0 xsminjdp VSX Scalar Minimum Type-J Double-Precision
111111 ..... ..... ..... 01101 00100. X I 599 v3.0 xsmsubqp[o] VSX Scalar Multiply-Subtract QP
111111 ..... ..... ..... 00001 00100. X I 604 v3.0 xsmulqp[o] VSX Scalar Multiply QP
111111 ..... 01000 ..... 11001 00100/ X I 608 v3.0 xsnabsqp VSX Scalar Negative Absolute QP
111111 ..... 10000 ..... 11001 00100/ X I 609 v3.0 xsnegqp VSX Scalar Negate QP
111111 ..... ..... ..... 01110 00100. X I 618 v3.0 xsnmaddqp[o] VSX Scalar Negative Multiply-Add QP
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 3 of 17)
111111 ..... ..... ..... 01111 00100. X I 627 v3.0 xsnmsubqp[o] VSX Scalar Negative Multiply-Subtract QP
111111 ..... ////. ..... ..000 00101. Z23 I 636 v3.0 xsrqpi[x] VSX Scalar Round QP to Integral
111111 ..... ////. ..... ..001 00101/ Z23 I 638 v3.0 xsrqpxp VSX Scalar Round QP to XP
111111 ..... 11011 ..... 11001 00100. X I 644 v3.0 xssqrtqp[o] VSX Scalar Square Root QP
111111 ..... ..... ..... 10000 00100. X I 649 v3.0 xssubqp[o] VSX Scalar Subtract QP
111100 ..... ..... ..... 10110 1010./ XX2 I 655 v3.0 xststdcdp VSX Scalar Test Data Class DP
111111 ..... ..... ..... 10110 00100/ X I 656 v3.0 xststdcqp VSX Scalar Test Data Class QP
111100 ..... ..... ..... 10010 1010./ XX2 I 657 v3.0 xststdcsp VSX Scalar Test Data Class SP
111100 ..... 00000 ..... 10101 1011./ XX2 I 658 v3.0 xsxexpdp VSX Scalar Extract Exponent DP
111111 ..... 00010 ..... 11001 00100/ X I 658 v3.0 xsxexpqp VSX Scalar Extract Exponent QP
111100 ..... 00001 ..... 10101 1011./ XX2 I 659 v3.0 xsxsigdp VSX Scalar Extract Significand DP
111111 ..... 10010 ..... 11001 00100/ X I 659 v3.0 xsxsigqp VSX Scalar Extract Significand QP
111100 ..... ..... ..... .1111 011... XX3 I 673 v3.0 xvcmpnedp[.] VSX Vector Compare Not Equal Double-Precision
111100 ..... ..... ..... .1011 011... XX3 I 674 v3.0 xvcmpnesp[.] VSX Vector Compare Not Equal Single-Precision
111100 ..... 11000 ..... 11101 1011.. XX2 I 685 v3.0 xvcvhpsp VSX Vector Convert HP to SP
111100 ..... 11001 ..... 11101 1011.. XX2 I 687 v3.0 xvcvsphp VSX Vector Convert SP to HP
111100 ..... ..... ..... 11111 000... XX3 I 704 v3.0 xviexpdp VSX Vector Insert Exponent DP
111100 ..... ..... ..... 11011 000... XX3 I 704 v3.0 xviexpsp VSX Vector Insert Exponent SP
111100 ..... ..... ..... 1111. 101... XX2 I 764 v3.0 xvtstdcdp VSX Vector Test Data Class DP
111100 ..... ..... ..... 1101. 101... XX2 I 765 v3.0 xvtstdcsp VSX Vector Test Data Class SP
111100 ..... 00000 ..... 11101 1011.. XX2 I 766 v3.0 xvxexpdp VSX Vector Extract Exponent DP
111100 ..... 01000 ..... 11101 1011.. XX2 I 766 v3.0 xvxexpsp VSX Vector Extract Exponent SP
111100 ..... 00001 ..... 11101 1011.. XX2 I 767 v3.0 xvxsigdp VSX Vector Extract Significand DP
111100 ..... 01001 ..... 11101 1011.. XX2 I 767 v3.0 xvxsigsp VSX Vector Extract Significand SP
111100 ..... 10111 ..... 11101 1011.. XX2 I 768 v3.0 xxbrd VSX Vector Byte-Reverse Dword
111100 ..... 00111 ..... 11101 1011.. XX2 I 768 v3.0 xxbrh VSX Vector Byte-Reverse Hword
111100 ..... 11111 ..... 11101 1011.. XX2 I 769 v3.0 xxbrq VSX Vector Byte-Reverse Qword
111100 ..... 01111 ..... 11101 1011.. XX2 I 769 v3.0 xxbrw VSX Vector Byte-Reverse Word
111100 ..... /.... ..... 01010 0101.. XX2 I 770 v3.0 xxextractuw VSX Vector Extract Unsigned Word
111100 ..... /.... ..... 01011 0101.. XX2 I 770 v3.0 xxinsertw VSX Vector Insert Word
111100 ..... ..... ..... 00011 010... XX3 I 776 v3.0 xxperm VSX Vector Permute
111100 ..... ..... ..... 00111 010... XX3 I 776 v3.0 xxpermr VSX Vector Permute Right-indexed
111100 ..... 00... ..... 01011 01000. XX1 I 778 v3.0 xxspltib VSX Vector Splat Immediate Byte
000100 ..... ..... ..... 1.000 000001 VX I 351 v2.07 bcdadd. Decimal Add Modulo & record
000100 ..... ..... ..... 1.001 000001 VX I 351 v2.07 bcdsub. Decimal Subtract Modulo & record
010011 ..... ..... ///.. 10001 10000. XL I 40 v2.07 bctar[l] Branch Conditional to BTAR [& Link]
011111 ///// ///// ///// 01101 01110/ X I 44 v2.07 clrbhrb Clear BHRB
111111 ..... ..... ..... 11110 00110/ X I 151 v2.07 fmrgew Floating Merge Even Word
111111 ..... ..... ..... 11010 00110/ X I 152 v2.07 fmrgow Floating Merge Odd Word
011111 /.... ..... ..... 00000 10110/ X II 842 v2.07 icbt Instruction Cache Block Touch
011111 ..... ..... ..... 01000 10100. X II 875 v2.07 lqarx Load Qword And Reserve Indexed
011111 ..... ..... ..... 00010 01100. XX1 I 484 v2.07 lxsiwax Load VSX Scalar as Integer Word Algebraic Indexed
011111 ..... ..... ..... 00000 01100. XX1 I 485 v2.07 lxsiwzx Load VSX Scalar as Integer Word & Zero Indexed
011111 ..... ..... ..... 10000 01100. XX1 I 486 v2.07 lxsspx Load VSX Scalar SP Indexed
011111 ..... ..... ..... 01001 01110/ X I 44 v2.07 mfbhrbe Move From BHRB
011111 ..... ..... ///// 00001 10011. XX1 I 111 v2.07 mfvsrd Move From VSR Dword
011111 ..... ..... ///// 00011 10011. XX1 I 112 v2.07 mfvsrwz Move From VSR Word & Zero
011111 ///// ///// ..... 00111 01110/ X III 1124 v2.07 H msgclr Message Clear
011111 ///// ///// ..... 00101 01110/ X III 1126 v2.07 P msgclrp Message Clear Privileged
011111 ///// ///// ..... 00110 01110/ X III 1123 v2.07 H msgsnd Message Send
011111 ///// ///// ..... 00100 01110/ X III 1125 v2.07 P msgsndp Message Send Privileged
011111 ..... ..... ///// 00101 10011. XX1 I 113 v2.07 mtvsrd Move To VSR Dword
011111 ..... ..... ///// 00110 10011. XX1 I 113 v2.07 mtvsrwa Move To VSR Word Algebraic
011111 ..... ..... ///// 00111 10011. XX1 I 114 v2.07 mtvsrwz Move To VSR Word & Zero
010011 ///// ///// ////. 00100 10010/ XL II 909 v2.07 rfebb Return from Event Based Branch
011111 ..... ..... ..... 00101 101101 X II 876 v2.07 stqcx. Store Qword Conditional Indexed & record
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 4 of 17)
011111 ..... ..... ..... 00100 01100. XX1 I 501 v2.07 stxsiwx Store VSX Scalar as Integer Word Indexed
011111 ..... ..... ..... 10100 01100. XX1 I 503 v2.07 stxsspx Store VSX Scalar SP Indexed
011111 ///// ..... ///// 11100 011101 X II 895 v2.07 tabort. Transaction Abort & record
011111 ..... ..... ..... 11001 011101 X II 897 v2.07 tabortdc. Transaction Abort Dword Conditional & record
011111 ..... ..... ..... 11011 011101 X II 897 v2.07 tabortdci. Transaction Abort Dword Conditional Immediate & record
011111 ..... ..... ..... 11000 011101 X II 896 v2.07 tabortwc. Transaction Abort Word Conditional & record
011111 ..... ..... ..... 11010 011101 X II 896 v2.07 tabortwci. Transaction Abort Word Conditional Immediate & record
011111 .///. ///// ///// 10100 011101 X II 893 v2.07 tbegin. Transaction Begin & record
011111 ...// ///// ///// 10110 01110/ X II 898 v2.07 tcheck Transaction Check & record
011111 .//// ///// ///// 10101 01110/ X II 894 v2.07 tend. Transaction End & record
011111 ///// ///// ///// 11111 011101 X III 970 v2.07 P trechkpt. Transaction Recheckpoint & record
011111 ///// ..... ///// 11101 011101 X III 969 v2.07 P treclaim. Transaction Reclaim & record
011111 ////. ///// ///// 10111 01110/ X II 898 v2.07 tsr. Transaction Suspend or Resume & record
000100 ..... ..... ..... 00101 000000 VX I 275 v2.07 vaddcuq Vector Add & write Carry Unsigned Qword
000100 ..... ..... ..... ..... 111101 VA I 275 v2.07 vaddecuq Vector Add Extended & write Carry Unsigned Qword
000100 ..... ..... ..... ..... 111100 VA I 275 v2.07 vaddeuqm Vector Add Extended Unsigned Qword Modulo
000100 ..... ..... ..... 00011 000000 VX I 272 v2.07 vaddudm Vector Add Unsigned Dword Modulo
000100 ..... ..... ..... 00100 000000 VX I 272 v2.07 vadduqm Vector Add Unsigned Qword Modulo
000100 ..... ..... ..... 10101 001100 VX I 349 v2.07 vbpermq Vector Bit Permute Qword
000100 ..... ..... ..... 10100 001000 VX I 336 v2.07 vcipher Vector AES Cipher
000100 ..... ..... ..... 10100 001001 VX I 336 v2.07 vcipherlast Vector AES Cipher Last
000100 ..... ///// ..... 11100 000010 VX I 343 v2.07 vclzb Vector Count Leading Zeros Byte
000100 ..... ///// ..... 11111 000010 VX I 343 v2.07 vclzd Vector Count Leading Zeros Dword
000100 ..... ///// ..... 11101 000010 VX I 343 v2.07 vclzh Vector Count Leading Zeros Hword
000100 ..... ///// ..... 11110 000010 VX I 343 v2.07 vclzw Vector Count Leading Zeros Word
000100 ..... ..... ..... .0011 000111 VC I 307 v2.07 vcmpequd[.] Vector Compare Equal Unsigned Dword
000100 ..... ..... ..... .1111 000111 VC I 308 v2.07 vcmpgtsd[.] Vector Compare Greater Than Signed Dword
000100 ..... ..... ..... .1011 000111 VC I 310 v2.07 vcmpgtud[.] Vector Compare Greater Than Unsigned Dword
000100 ..... ..... ..... 11010 000100 VX I 315 v2.07 veqv Vector Logical Equivalence
000100 ..... ///// ..... 10100 001100 VX I 342 v2.07 vgbbd Vector Gather Bits by Byte by Dword
000100 ..... ..... ..... 00111 000010 VX I 302 v2.07 vmaxsd Vector Maximum Signed Dword
000100 ..... ..... ..... 00011 000010 VX I 302 v2.07 vmaxud Vector Maximum Unsigned Dword
000100 ..... ..... ..... 01111 000010 VX I 304 v2.07 vminsd Vector Minimum Signed Dword
000100 ..... ..... ..... 01011 000010 VX I 304 v2.07 vminud Vector Minimum Unsigned Dword
000100 ..... ..... ..... 11110 001100 VX I 259 v2.07 vmrgew Vector Merge Even Word
000100 ..... ..... ..... 11010 001100 VX I 259 v2.07 vmrgow Vector Merge Odd Word
000100 ..... ..... ..... 01110 001000 VX I 285 v2.07 vmulesw Vector Multiply Even Signed Word
000100 ..... ..... ..... 01010 001000 VX I 285 v2.07 vmuleuw Vector Multiply Even Unsigned Word
000100 ..... ..... ..... 00110 001000 VX I 285 v2.07 vmulosw Vector Multiply Odd Signed Word
000100 ..... ..... ..... 00010 001000 VX I 285 v2.07 vmulouw Vector Multiply Odd Unsigned Word
000100 ..... ..... ..... 00010 001001 VX I 286 v2.07 vmuluwm Vector Multiply Unsigned Word Modulo
000100 ..... ..... ..... 10110 000100 VX I 315 v2.07 vnand Vector Logical NAND
000100 ..... ..... ..... 10101 001000 VX I 337 v2.07 vncipher Vector AES Inverse Cipher
000100 ..... ..... ..... 10101 001001 VX I 337 v2.07 vncipherlast Vector AES Inverse Cipher Last
000100 ..... ..... ..... 10101 000100 VX I 316 v2.07 vorc Vector Logical OR with Complement
000100 ..... ..... ..... ..... 101101 VA I 341 v2.07 vpermxor Vector Permute & Exclusive-OR
000100 ..... ..... ..... 10111 001110 VX I 250 v2.07 vpksdss Vector Pack Signed Dword Signed Saturate
000100 ..... ..... ..... 10101 001110 VX I 251 v2.07 vpksdus Vector Pack Signed Dword Unsigned Saturate
000100 ..... ..... ..... 10001 001110 VX I 253 v2.07 vpkudum Vector Pack Unsigned Dword Unsigned Modulo
000100 ..... ..... ..... 10011 001110 VX I 253 v2.07 vpkudus Vector Pack Unsigned Dword Unsigned Saturate
000100 ..... ..... ..... 10000 001000 VX I 339 v2.07 vpmsumb Vector Polynomial Multiply-Sum Byte
000100 ..... ..... ..... 10011 001000 VX I 339 v2.07 vpmsumd Vector Polynomial Multiply-Sum Dword
000100 ..... ..... ..... 10001 001000 VX I 340 v2.07 vpmsumh Vector Polynomial Multiply-Sum Hword
000100 ..... ..... ..... 10010 001000 VX I 340 v2.07 vpmsumw Vector Polynomial Multiply-Sum Word
000100 ..... ///// ..... 11100 000011 VX I 348 v2.07 vpopcntb Vector Population Count Byte
000100 ..... ///// ..... 11111 000011 VX I 348 v2.07 vpopcntd Vector Population Count Dword
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 5 of 17)
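Each row of this figure starts with a 32-character bit pattern in which `0` and `1` are fixed opcode bits, `.` marks an operand bit, and `/` marks a reserved bit. A minimal sketch of how such a pattern could be matched against an instruction word follows. The helper names `compile_pattern` and `matches` are hypothetical, and treating reserved (`/`) bits as don't-care is an assumption made for illustration, not a statement of what a processor does.

```python
from typing import Tuple


def compile_pattern(pattern: str) -> Tuple[int, int]:
    """Convert a row pattern such as '000100 ..... ///// ..... 11111 000011'
    into a (mask, value) pair. Bit 0 of the pattern is the most significant bit."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 pattern characters, got %d" % len(bits))
    mask = 0
    value = 0
    for ch in bits:
        mask <<= 1
        value <<= 1
        if ch in "01":
            # Fixed opcode bit: it must match exactly.
            mask |= 1
            value |= int(ch)
        elif ch not in "./":
            raise ValueError("unexpected pattern character %r" % ch)
        # '.' (operand) and '/' (reserved) are treated as don't-care here.
    return mask, value


def matches(word: int, pattern: str) -> bool:
    """Return True if the 32-bit instruction word fits the row pattern."""
    mask, value = compile_pattern(pattern)
    return (word & mask) == value


# Hand-assembled word: primary opcode 4, VRT=2, VRB=3, extended opcode 1987.
word = (4 << 26) | (2 << 21) | (3 << 11) | 1987
print(matches(word, "000100 ..... ///// ..... 11111 000011"))  # vpopcntd row
print(matches(word, "000100 ..... ///// ..... 11100 000011"))  # vpopcntb row
```

A real disassembler would also need to order the patterns so that more specific rows are tried before more general ones; this sketch does not address that.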
Instruction¹ (bits 0-5, 6-10, 11-15, 16-20, 21-25, 26-31)  Format  Book  Page  Version²  Privilege³  Mode Dep⁴  Mnemonic  Name
000100 ..... ///// ..... 11101 000011 VX I 348 v2.07 vpopcnth Vector Population Count Hword
000100 ..... ///// ..... 11110 000011 VX I 348 v2.07 vpopcntw Vector Population Count Word
000100 ..... ..... ..... 00011 000100 VX I 318 v2.07 vrld Vector Rotate Left Dword
000100 ..... ..... ///// 10111 001000 VX I 337 v2.07 vsbox Vector AES SubBytes
000100 ..... ..... ..... 11011 000010 VX I 338 v2.07 vshasigmad Vector SHA-512 Sigma Dword
000100 ..... ..... ..... 11010 000010 VX I 338 v2.07 vshasigmaw Vector SHA-256 Sigma Word
000100 ..... ..... ..... 10111 000100 VX I 319 v2.07 vsld Vector Shift Left Dword
000100 ..... ..... ..... 01111 000100 VX I 321 v2.07 vsrad Vector Shift Right Algebraic Dword
000100 ..... ..... ..... 11011 000100 VX I 320 v2.07 vsrd Vector Shift Right Dword
000100 ..... ..... ..... 10101 000000 VX I 281 v2.07 vsubcuq Vector Subtract & write Carry Unsigned Qword
000100 ..... ..... ..... ..... 111111 VA I 281 v2.07 vsubecuq Vector Subtract Extended & write Carry Unsigned Qword
000100 ..... ..... ..... ..... 111110 VA I 281 v2.07 vsubeuqm Vector Subtract Extended Unsigned Qword Modulo
000100 ..... ..... ..... 10011 000000 VX I 279 v2.07 vsubudm Vector Subtract Unsigned Dword Modulo
000100 ..... ..... ..... 10100 000000 VX I 281 v2.07 vsubuqm Vector Subtract Unsigned Qword Modulo
000100 ..... ///// ..... 11001 001110 VX I 256 v2.07 vupkhsw Vector Unpack High Signed Word
000100 ..... ///// ..... 11011 001110 VX I 256 v2.07 vupklsw Vector Unpack Low Signed Word
111100 ..... ..... ..... 00000 000... XX3 I 519 v2.07 xsaddsp VSX Scalar Add SP
111100 ..... ///// ..... 10000 1011.. XX2 I 539 v2.07 xscvdpspn VSX Scalar Convert DP to SP Non-signalling
111100 ..... ///// ..... 10100 1011.. XX2 I 560 v2.07 xscvspdpn VSX Scalar Convert SP to DP Non-signalling
111100 ..... ///// ..... 10011 1000.. XX2 I 561 v2.07 xscvsxdsp VSX Scalar Convert Signed Dword to SP
111100 ..... ///// ..... 10010 1000.. XX2 I 563 v2.07 xscvuxdsp VSX Scalar Convert Unsigned Dword to SP
111100 ..... ..... ..... 00011 000... XX3 I 568 v2.07 xsdivsp VSX Scalar Divide SP
111100 ..... ..... ..... 00000 001... XX3 I 575 v2.07 xsmaddasp VSX Scalar Multiply-Add Type-A SP
111100 ..... ..... ..... 00001 001... XX3 I 575 v2.07 xsmaddmsp VSX Scalar Multiply-Add Type-M SP
111100 ..... ..... ..... 00010 001... XX3 I 596 v2.07 xsmsubasp VSX Scalar Multiply-Subtract Type-A SP
111100 ..... ..... ..... 00011 001... XX3 I 596 v2.07 xsmsubmsp VSX Scalar Multiply-Subtract Type-M SP
111100 ..... ..... ..... 00010 000... XX3 I 606 v2.07 xsmulsp VSX Scalar Multiply SP
111100 ..... ..... ..... 10000 001... XX3 I 615 v2.07 xsnmaddasp VSX Scalar Negative Multiply-Add Type-A SP
111100 ..... ..... ..... 10001 001... XX3 I 615 v2.07 xsnmaddmsp VSX Scalar Negative Multiply-Add Type-M SP
111100 ..... ..... ..... 10010 001... XX3 I 624 v2.07 xsnmsubasp VSX Scalar Negative Multiply-Subtract Type-A SP
111100 ..... ..... ..... 10011 001... XX3 I 624 v2.07 xsnmsubmsp VSX Scalar Negative Multiply-Subtract Type-M SP
111100 ..... ///// ..... 00001 1010.. XX2 I 635 v2.07 xsresp VSX Scalar Reciprocal Estimate SP
111100 ..... ///// ..... 10001 1001.. XX2 I 640 v2.07 xsrsp VSX Scalar Round DP to SP
111100 ..... ///// ..... 00000 1010.. XX2 I 642 v2.07 xsrsqrtesp VSX Scalar Reciprocal Square Root Estimate SP
111100 ..... ///// ..... 00000 1011.. XX2 I 646 v2.07 xssqrtsp VSX Scalar Square Root SP
111100 ..... ..... ..... 00001 000... XX3 I 651 v2.07 xssubsp VSX Scalar Subtract SP
111100 ..... ..... ..... 10111 010... XX3 I 772 v2.07 xxleqv VSX Vector Logical Equivalence
111100 ..... ..... ..... 10110 010... XX3 I 772 v2.07 xxlnand VSX Vector Logical NAND
111100 ..... ..... ..... 10101 010... XX3 I 773 v2.07 xxlorc VSX Vector Logical OR with Complement
011111 ..... ..... ..... /0010 01010/ XO I 110 v2.06 addg6s Add & Generate Sixes
011111 ..... ..... ..... 00111 11100/ X I 99 v2.06 bpermd Bit Permute Dword
011111 ..... ..... ///// 01001 11010/ X I 110 v2.06 cbcdtd Convert Binary Coded Decimal To Declets
011111 ..... ..... ///// 01000 11010/ X I 110 v2.06 cdtbcd Convert Declets To Binary Coded Decimal
111011 ..... ///// ..... 11001 00010. X I 217 v2.06 dcffix[.] DFP Convert From Fixed
011111 ..... ..... ..... .1101 01001. XO I 83 v2.06 SR divde[o][.] Divide Dword Extended
011111 ..... ..... ..... .1100 01001. XO I 83 v2.06 SR divdeu[o][.] Divide Dword Extended Unsigned
011111 ..... ..... ..... .1101 01011. XO I 77 v2.06 SR divwe[o][.] Divide Word Extended
011111 ..... ..... ..... .1100 01011. XO I 77 v2.06 SR divweu[o][.] Divide Word Extended Unsigned
111011 ..... ///// ..... 11010 01110. X I 165 v2.06 fcfids[.] Floating Convert From Integer Dword Single
111111 ..... ///// ..... 11110 01110. X I 165 v2.06 fcfidu[.] Floating Convert From Integer Dword Unsigned
111011 ..... ///// ..... 11110 01110. X I 166 v2.06 fcfidus[.] Floating Convert From Integer Dword Unsigned Single
111111 ..... ///// ..... 11101 01110. X I 161 v2.06 fctidu[.] Floating Convert To Integer Dword Unsigned
111111 ..... ///// ..... 11101 01111. X I 162 v2.06 fctiduz[.] Floating Convert To Integer Dword Unsigned truncate
111111 ..... ///// ..... 00100 01110. X I 163 v2.06 fctiwu[.] Floating Convert To Integer Word Unsigned
111111 ..... ///// ..... 00100 01111. X I 164 v2.06 fctiwuz[.] Floating Convert To Integer Word Unsigned truncate
111111 ...// ..... ..... 00100 00000/ X I 157 v2.06 ftdiv Floating Test for software Divide
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 6 of 17)
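The rows of this figure follow a fixed column order: six bit-pattern groups, then Format, Book, Page, Version, optional Privilege and Mode Dep entries, then the mnemonic and the name. The sketch below splits one row into those fields. The `Row` class and `parse_row` function are hypothetical names, and the sets of Privilege and Mode Dep values are assumptions limited to the values that appear in this part of the figure (`P`, `H`, `SR`, `64`).

```python
from dataclasses import dataclass

# Assumption: only the privilege and mode-dependency codes seen in these sheets.
PRIVILEGE_CODES = {"P", "H"}
MODE_DEP_CODES = {"SR", "64"}


@dataclass
class Row:
    pattern: str
    fmt: str
    book: str
    page: int
    version: str
    privilege: str
    mode_dep: str
    mnemonic: str
    name: str


def parse_row(line: str) -> Row:
    """Split one whitespace-separated table row into its columns."""
    tokens = line.split()
    if len(tokens) < 12:
        raise ValueError("row has too few columns")
    pattern = " ".join(tokens[:6])          # six bit-field groups
    fmt, book, page, version = tokens[6:10]
    rest = tokens[10:]
    # Privilege and Mode Dep are optional; an absent column is stored as "".
    privilege = rest.pop(0) if rest[0] in PRIVILEGE_CODES else ""
    mode_dep = rest.pop(0) if rest[0] in MODE_DEP_CODES else ""
    mnemonic = rest.pop(0)
    return Row(pattern, fmt, book, int(page), version,
               privilege, mode_dep, mnemonic, " ".join(rest))


row = parse_row("011111 ..... ///// ..... 11110 100111 X III 1031 v2.05 "
                "P SR slbfee. SLB Find Entry ESID & record")
print(row.mnemonic, row.privilege, row.mode_dep, row.page)
```

The parser relies on no mnemonic being spelled like one of the privilege or mode codes, which holds for the rows shown here but is not guaranteed for the whole figure.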
111111 ...// ///// ..... 00101 00000/ X I 157 v2.06 ftsqrt Floating Test for software Square Root
011111 ..... ..... ..... 00001 10100. X II 868 v2.06 lbarx Load Byte And Reserve Indexed
011111 ..... ..... ..... 10000 10100/ X I 62 v2.06 ldbrx Load Dword Byte-Reverse Indexed
011111 ..... ..... ..... 11011 10111/ X I 144 v2.06 lfiwzx Load Floating as Integer Word & Zero Indexed
011111 ..... ..... ..... 00011 10100. X II 869 v2.06 lharx Load Hword And Reserve Indexed
011111 ..... ..... ..... 10010 01100. XX1 I 481 v2.06 lxsdx Load VSX Scalar Dword Indexed
011111 ..... ..... ..... 11010 01100. XX1 I 489 v2.06 lxvd2x Load VSX Vector Dword*2 Indexed
011111 ..... ..... ..... 01010 01100. XX1 I 495 v2.06 lxvdsx Load VSX Vector Dword & Splat Indexed
011111 ..... ..... ..... 11000 01100. XX1 I 497 v2.06 lxvw4x Load VSX Vector Word*4 Indexed
011111 ..... ..... ///// 01111 11010/ X I 98 v2.06 popcntd Population Count Dword
011111 ..... ..... ///// 01011 11010/ X I 96 v2.06 popcntw Population Count Words
011111 ..... ..... ..... 10101 101101 X II 870 v2.06 stbcx. Store Byte Conditional Indexed & record
011111 ..... ..... ..... 10100 10100/ X I 62 v2.06 stdbrx Store Dword Byte-Reverse Indexed
011111 ..... ..... ..... 10110 101101 X II 871 v2.06 sthcx. Store Hword Conditional Indexed & record
011111 ..... ..... ..... 10110 01100. XX1 I 499 v2.06 stxsdx Store VSX Scalar Dword Indexed
011111 ..... ..... ..... 11110 01100. XX1 I 505 v2.06 stxvd2x Store VSX Vector Dword*2 Indexed
011111 ..... ..... ..... 11100 01100. XX1 I 507 v2.06 stxvw4x Store VSX Vector Word*4 Indexed
111100 ..... ///// ..... 10101 1001.. XX2 I 513 v2.06 xsabsdp VSX Scalar Absolute DP
111100 ..... ..... ..... 00100 000... XX3 I 514 v2.06 xsadddp VSX Scalar Add DP
111100 ...// ..... ..... 00101 011../ XX3 I 529 v2.06 xscmpodp VSX Scalar Compare Ordered DP
111100 ...// ..... ..... 00100 011../ XX3 I 532 v2.06 xscmpudp VSX Scalar Compare Unordered DP
111100 ..... ..... ..... 10110 000... XX3 I 535 v2.06 xscpsgndp VSX Scalar Copy Sign DP
111100 ..... ///// ..... 10000 1001.. XX2 I 538 v2.06 xscvdpsp VSX Scalar Convert DP to SP
111100 ..... ///// ..... 10101 1000.. XX2 I 539 v2.06 xscvdpsxds VSX Scalar Convert DP to Signed Dword truncate
111100 ..... ///// ..... 00101 1000.. XX2 I 542 v2.06 xscvdpsxws VSX Scalar Convert DP to Signed Word truncate
111100 ..... ///// ..... 10100 1000.. XX2 I 544 v2.06 xscvdpuxds VSX Scalar Convert DP to Unsigned Dword truncate
111100 ..... ///// ..... 00100 1000.. XX2 I 546 v2.06 xscvdpuxws VSX Scalar Convert DP to Unsigned Word truncate
111100 ..... ///// ..... 10100 1001.. XX2 I 559 v2.06 xscvspdp VSX Scalar Convert SP to DP
111100 ..... ///// ..... 10111 1000.. XX2 I 561 v2.06 xscvsxddp VSX Scalar Convert Signed Dword to DP
111100 ..... ///// ..... 10110 1000.. XX2 I 563 v2.06 xscvuxddp VSX Scalar Convert Unsigned Dword to DP
111100 ..... ..... ..... 00111 000... XX3 I 564 v2.06 xsdivdp VSX Scalar Divide DP
111100 ..... ..... ..... 00100 001... XX3 I 572 v2.06 xsmaddadp VSX Scalar Multiply-Add Type-A DP
111100 ..... ..... ..... 00101 001... XX3 I 572 v2.06 xsmaddmdp VSX Scalar Multiply-Add Type-M DP
111100 ..... ..... ..... 10100 000... XX3 I 581 v2.06 xsmaxdp VSX Scalar Maximum DP
111100 ..... ..... ..... 10101 000... XX3 I 587 v2.06 xsmindp VSX Scalar Minimum DP
111100 ..... ..... ..... 00110 001... XX3 I 593 v2.06 xsmsubadp VSX Scalar Multiply-Subtract Type-A DP
111100 ..... ..... ..... 00111 001... XX3 I 593 v2.06 xsmsubmdp VSX Scalar Multiply-Subtract Type-M DP
111100 ..... ..... ..... 00110 000... XX3 I 602 v2.06 xsmuldp VSX Scalar Multiply DP
111100 ..... ///// ..... 10110 1001.. XX2 I 608 v2.06 xsnabsdp VSX Scalar Negative Absolute DP
111100 ..... ///// ..... 10111 1001.. XX2 I 609 v2.06 xsnegdp VSX Scalar Negate DP
111100 ..... ..... ..... 10100 001... XX3 I 610 v2.06 xsnmaddadp VSX Scalar Negative Multiply-Add Type-A DP
111100 ..... ..... ..... 10101 001... XX3 I 610 v2.06 xsnmaddmdp VSX Scalar Negative Multiply-Add Type-M DP
111100 ..... ..... ..... 10110 001... XX3 I 621 v2.06 xsnmsubadp VSX Scalar Negative Multiply-Subtract Type-A DP
111100 ..... ..... ..... 10111 001... XX3 I 621 v2.06 xsnmsubmdp VSX Scalar Negative Multiply-Subtract Type-M DP
111100 ..... ///// ..... 00100 1001.. XX2 I 630 v2.06 xsrdpi VSX Scalar Round DP to Integral to Nearest Away
111100 ..... ///// ..... 00110 1011.. XX2 I 631 v2.06 xsrdpic VSX Scalar Round DP to Integral using Current rounding mode
111100 ..... ///// ..... 00111 1001.. XX2 I 632 v2.06 xsrdpim VSX Scalar Round DP to Integral toward -Infinity
111100 ..... ///// ..... 00110 1001.. XX2 I 632 v2.06 xsrdpip VSX Scalar Round DP to Integral toward +Infinity
111100 ..... ///// ..... 00101 1001.. XX2 I 633 v2.06 xsrdpiz VSX Scalar Round DP to Integral toward Zero
111100 ..... ///// ..... 00101 1010.. XX2 I 634 v2.06 xsredp VSX Scalar Reciprocal Estimate DP
111100 ..... ///// ..... 00100 1010.. XX2 I 641 v2.06 xsrsqrtedp VSX Scalar Reciprocal Square Root Estimate DP
111100 ..... ///// ..... 00100 1011.. XX2 I 643 v2.06 xssqrtdp VSX Scalar Square Root DP
111100 ..... ..... ..... 00101 000... XX3 I 647 v2.06 xssubdp VSX Scalar Subtract DP
111100 ...// ..... ..... 00111 101../ XX3 I 653 v2.06 xstdivdp VSX Scalar Test for software Divide DP
111100 ...// ///// ..... 00110 1010./ XX2 I 654 v2.06 xstsqrtdp VSX Scalar Test for software Square Root DP
111100 ..... ///// ..... 11101 1001.. XX2 I 660 v2.06 xvabsdp VSX Vector Absolute DP
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 7 of 17)
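The column header numbers instruction bits from 0 (most significant) to 31 (least significant), so the groups in each row correspond to bits 0-5, 6-10, 11-15, 16-20, 21-25, and 26-31. As a small illustration, the sketch below extracts a field given its first and last bit number in that convention. The function name `field` and the sample word are illustrative assumptions.

```python
def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last (inclusive) from a 32-bit word,
    where bit 0 is the most significant bit and bit 31 the least."""
    if not 0 <= first <= last <= 31:
        raise ValueError("bit range must satisfy 0 <= first <= last <= 31")
    width = last - first + 1
    shift = 31 - last              # distance of the field from the low end
    return (word >> shift) & ((1 << width) - 1)


# Sample word built by hand: primary opcode 4, bits 6-10 = 2,
# bits 16-20 = 3, bits 21-31 = 1987.
word = 0x10401FC3
print(field(word, 0, 5))     # primary opcode
print(field(word, 6, 10))    # first register field
print(field(word, 16, 20))   # third register field
print(field(word, 21, 31))   # 11-bit extended opcode of a VX-form row
```

For VX-form rows the last two groups of the pattern (bits 21-25 and 26-31) read together as one 11-bit extended opcode, which is why the example extracts bits 21-31 as a single field.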
111100 ..... ///// ..... 11001 1001.. XX2 I 660 v2.06 xvabssp VSX Vector Absolute SP
111100 ..... ..... ..... 01100 000... XX3 I 661 v2.06 xvadddp VSX Vector Add DP
111100 ..... ..... ..... 01000 000... XX3 I 665 v2.06 xvaddsp VSX Vector Add SP
111100 ..... ..... ..... .1100 011... XX3 I 667 v2.06 xvcmpeqdp[.] VSX Vector Compare Equal DP
111100 ..... ..... ..... .1000 011... XX3 I 668 v2.06 xvcmpeqsp[.] VSX Vector Compare Equal SP
111100 ..... ..... ..... .1110 011... XX3 I 669 v2.06 xvcmpgedp[.] VSX Vector Compare Greater Than or Equal DP
111100 ..... ..... ..... .1010 011... XX3 I 670 v2.06 xvcmpgesp[.] VSX Vector Compare Greater Than or Equal SP
111100 ..... ..... ..... .1101 011... XX3 I 671 v2.06 xvcmpgtdp[.] VSX Vector Compare Greater Than DP
111100 ..... ..... ..... .1001 011... XX3 I 672 v2.06 xvcmpgtsp[.] VSX Vector Compare Greater Than SP
111100 ..... ..... ..... 11110 000... XX3 I 675 v2.06 xvcpsgndp VSX Vector Copy Sign DP
111100 ..... ..... ..... 11010 000... XX3 I 675 v2.06 xvcpsgnsp VSX Vector Copy Sign SP
111100 ..... ///// ..... 11000 1001.. XX2 I 676 v2.06 xvcvdpsp VSX Vector Convert DP to SP
111100 ..... ///// ..... 11101 1000.. XX2 I 677 v2.06 xvcvdpsxds VSX Vector Convert DP to Signed Dword truncate
111100 ..... ///// ..... 01101 1000.. XX2 I 679 v2.06 xvcvdpsxws VSX Vector Convert DP to Signed Word truncate
111100 ..... ///// ..... 11100 1000.. XX2 I 681 v2.06 xvcvdpuxds VSX Vector Convert DP to Unsigned Dword truncate
111100 ..... ///// ..... 01100 1000.. XX2 I 683 v2.06 xvcvdpuxws VSX Vector Convert DP to Unsigned Word truncate
111100 ..... ///// ..... 11100 1001.. XX2 I 686 v2.06 xvcvspdp VSX Vector Convert SP to DP
111100 ..... ///// ..... 11001 1000.. XX2 I 688 v2.06 xvcvspsxds VSX Vector Convert SP to Signed Dword truncate
111100 ..... ///// ..... 01001 1000.. XX2 I 690 v2.06 xvcvspsxws VSX Vector Convert SP to Signed Word truncate
111100 ..... ///// ..... 11000 1000.. XX2 I 692 v2.06 xvcvspuxds VSX Vector Convert SP to Unsigned Dword truncate
111100 ..... ///// ..... 01000 1000.. XX2 I 694 v2.06 xvcvspuxws VSX Vector Convert SP to Unsigned Word truncate
111100 ..... ///// ..... 11111 1000.. XX2 I 696 v2.06 xvcvsxddp VSX Vector Convert Signed Dword to DP
111100 ..... ///// ..... 11011 1000.. XX2 I 696 v2.06 xvcvsxdsp VSX Vector Convert Signed Dword to SP
111100 ..... ///// ..... 01111 1000.. XX2 I 697 v2.06 xvcvsxwdp VSX Vector Convert Signed Word to DP
111100 ..... ///// ..... 01011 1000.. XX2 I 697 v2.06 xvcvsxwsp VSX Vector Convert Signed Word to SP
111100 ..... ///// ..... 11110 1000.. XX2 I 698 v2.06 xvcvuxddp VSX Vector Convert Unsigned Dword to DP
111100 ..... ///// ..... 11010 1000.. XX2 I 698 v2.06 xvcvuxdsp VSX Vector Convert Unsigned Dword to SP
111100 ..... ///// ..... 01110 1000.. XX2 I 699 v2.06 xvcvuxwdp VSX Vector Convert Unsigned Word to DP
111100 ..... ///// ..... 01010 1000.. XX2 I 699 v2.06 xvcvuxwsp VSX Vector Convert Unsigned Word to SP
111100 ..... ..... ..... 01111 000... XX3 I 700 v2.06 xvdivdp VSX Vector Divide DP
111100 ..... ..... ..... 01011 000... XX3 I 702 v2.06 xvdivsp VSX Vector Divide SP
111100 ..... ..... ..... 01100 001... XX3 I 705 v2.06 xvmaddadp VSX Vector Multiply-Add Type-A DP
111100 ..... ..... ..... 01000 001... XX3 I 708 v2.06 xvmaddasp VSX Vector Multiply-Add Type-A SP
111100 ..... ..... ..... 01101 001... XX3 I 705 v2.06 xvmaddmdp VSX Vector Multiply-Add Type-M DP
111100 ..... ..... ..... 01001 001... XX3 I 708 v2.06 xvmaddmsp VSX Vector Multiply-Add Type-M SP
111100 ..... ..... ..... 11100 000... XX3 I 711 v2.06 xvmaxdp VSX Vector Maximum DP
111100 ..... ..... ..... 11000 000... XX3 I 713 v2.06 xvmaxsp VSX Vector Maximum SP
111100 ..... ..... ..... 11101 000... XX3 I 715 v2.06 xvmindp VSX Vector Minimum DP
111100 ..... ..... ..... 11001 000... XX3 I 717 v2.06 xvminsp VSX Vector Minimum SP
111100 ..... ..... ..... 01110 001... XX3 I 719 v2.06 xvmsubadp VSX Vector Multiply-Subtract Type-A DP
111100 ..... ..... ..... 01010 001... XX3 I 722 v2.06 xvmsubasp VSX Vector Multiply-Subtract Type-A SP
111100 ..... ..... ..... 01111 001... XX3 I 719 v2.06 xvmsubmdp VSX Vector Multiply-Subtract Type-M DP
111100 ..... ..... ..... 01011 001... XX3 I 722 v2.06 xvmsubmsp VSX Vector Multiply-Subtract Type-M SP
111100 ..... ..... ..... 01110 000... XX3 I 725 v2.06 xvmuldp VSX Vector Multiply DP
111100 ..... ..... ..... 01010 000... XX3 I 727 v2.06 xvmulsp VSX Vector Multiply SP
111100 ..... ///// ..... 11110 1001.. XX2 I 729 v2.06 xvnabsdp VSX Vector Negative Absolute DP
111100 ..... ///// ..... 11010 1001.. XX2 I 729 v2.06 xvnabssp VSX Vector Negative Absolute SP
111100 ..... ///// ..... 11111 1001.. XX2 I 730 v2.06 xvnegdp VSX Vector Negate DP
111100 ..... ///// ..... 11011 1001.. XX2 I 730 v2.06 xvnegsp VSX Vector Negate SP
111100 ..... ..... ..... 11100 001... XX3 I 731 v2.06 xvnmaddadp VSX Vector Negative Multiply-Add Type-A DP
111100 ..... ..... ..... 11000 001... XX3 I 736 v2.06 xvnmaddasp VSX Vector Negative Multiply-Add Type-A SP
111100 ..... ..... ..... 11101 001... XX3 I 731 v2.06 xvnmaddmdp VSX Vector Negative Multiply-Add Type-M DP
111100 ..... ..... ..... 11001 001... XX3 I 736 v2.06 xvnmaddmsp VSX Vector Negative Multiply-Add Type-M SP
111100 ..... ..... ..... 11110 001... XX3 I 739 v2.06 xvnmsubadp VSX Vector Negative Multiply-Subtract Type-A DP
111100 ..... ..... ..... 11010 001... XX3 I 742 v2.06 xvnmsubasp VSX Vector Negative Multiply-Subtract Type-A SP
111100 ..... ..... ..... 11111 001... XX3 I 739 v2.06 xvnmsubmdp VSX Vector Negative Multiply-Subtract Type-M DP
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 8 of 17)
111100 ..... ..... ..... 11011 001... XX3 I 742 v2.06 xvnmsubmsp VSX Vector Negative Multiply-Subtract Type-M SP
111100 ..... ///// ..... 01100 1001.. XX2 I 745 v2.06 xvrdpi VSX Vector Round DP to Integral to Nearest Away
111100 ..... ///// ..... 01110 1011.. XX2 I 745 v2.06 xvrdpic VSX Vector Round DP to Integral using Current rounding mode
111100 ..... ///// ..... 01111 1001.. XX2 I 746 v2.06 xvrdpim VSX Vector Round DP to Integral toward -Infinity
111100 ..... ///// ..... 01110 1001.. XX2 I 746 v2.06 xvrdpip VSX Vector Round DP to Integral toward +Infinity
111100 ..... ///// ..... 01101 1001.. XX2 I 747 v2.06 xvrdpiz VSX Vector Round DP to Integral toward Zero
111100 ..... ///// ..... 01101 1010.. XX2 I 748 v2.06 xvredp VSX Vector Reciprocal Estimate DP
111100 ..... ///// ..... 01001 1010.. XX2 I 749 v2.06 xvresp VSX Vector Reciprocal Estimate SP
111100 ..... ///// ..... 01000 1001.. XX2 I 750 v2.06 xvrspi VSX Vector Round SP to Integral to Nearest Away
111100 ..... ///// ..... 01010 1011.. XX2 I 750 v2.06 xvrspic VSX Vector Round SP to Integral using Current rounding mode
111100 ..... ///// ..... 01011 1001.. XX2 I 751 v2.06 xvrspim VSX Vector Round SP to Integral toward -Infinity
111100 ..... ///// ..... 01010 1001.. XX2 I 751 v2.06 xvrspip VSX Vector Round SP to Integral toward +Infinity
111100 ..... ///// ..... 01001 1001.. XX2 I 752 v2.06 xvrspiz VSX Vector Round SP to Integral toward Zero
111100 ..... ///// ..... 01100 1010.. XX2 I 752 v2.06 xvrsqrtedp VSX Vector Reciprocal Square Root Estimate DP
111100 ..... ///// ..... 01000 1010.. XX2 I 754 v2.06 xvrsqrtesp VSX Vector Reciprocal Square Root Estimate SP
111100 ..... ///// ..... 01100 1011.. XX2 I 755 v2.06 xvsqrtdp VSX Vector Square Root DP
111100 ..... ///// ..... 01000 1011.. XX2 I 756 v2.06 xvsqrtsp VSX Vector Square Root SP
111100 ..... ..... ..... 01101 000... XX3 I 757 v2.06 xvsubdp VSX Vector Subtract DP
111100 ..... ..... ..... 01001 000... XX3 I 759 v2.06 xvsubsp VSX Vector Subtract SP
111100 ...// ..... ..... 01111 101../ XX3 I 761 v2.06 xvtdivdp VSX Vector Test for software Divide DP
111100 ...// ..... ..... 01011 101../ XX3 I 762 v2.06 xvtdivsp VSX Vector Test for software Divide SP
111100 ...// ///// ..... 01110 1010./ XX2 I 763 v2.06 xvtsqrtdp VSX Vector Test for software Square Root DP
111100 ...// ///// ..... 01010 1010./ XX2 I 763 v2.06 xvtsqrtsp VSX Vector Test for software Square Root SP
111100 ..... ..... ..... 10000 010... XX3 I 771 v2.06 xxland VSX Vector Logical AND
111100 ..... ..... ..... 10001 010... XX3 I 771 v2.06 xxlandc VSX Vector Logical AND with Complement
111100 ..... ..... ..... 10100 010... XX3 I 773 v2.06 xxlnor VSX Vector Logical NOR
111100 ..... ..... ..... 10010 010... XX3 I 774 v2.06 xxlor VSX Vector Logical OR
111100 ..... ..... ..... 10011 010... XX3 I 774 v2.06 xxlxor VSX Vector Logical XOR
111100 ..... ..... ..... 00010 010... XX3 I 775 v2.06 xxmrghw VSX Vector Merge Word High
111100 ..... ..... ..... 00110 010... XX3 I 775 v2.06 xxmrglw VSX Vector Merge Word Low
111100 ..... ..... ..... 0..01 010... XX3 I 777 v2.06 xxpermdi VSX Vector Dword Permute Immediate
111100 ..... ..... ..... ..... 11.... XX4 I 777 v2.06 xxsel VSX Vector Select
111100 ..... ..... ..... 0..00 010... XX3 I 778 v2.06 xxsldwi VSX Vector Shift Left Double by Word Immediate
111100 ..... ///.. ..... 01010 0100.. XX2 I 778 v2.06 xxspltw VSX Vector Splat Word
011111 ..... ..... ..... 01111 11100/ X I 96 v2.05 cmpb Compare Bytes
111011 ..... ..... ..... 00000 00010. X I 195 v2.05 dadd[.] DFP Add
111111 ..... ..... ..... 00000 00010. X I 195 v2.05 daddq[.] DFP Add Quad
111111 ..... ///// ..... 11001 00010. X I 217 v2.05 dcffixq[.] DFP Convert From Fixed Quad
111011 ...// ..... ..... 00100 00010/ X I 201 v2.05 dcmpo DFP Compare Ordered
111111 ...// ..... ..... 00100 00010/ X I 201 v2.05 dcmpoq DFP Compare Ordered Quad
111011 ...// ..... ..... 10100 00010/ X I 200 v2.05 dcmpu DFP Compare Unordered
111111 ...// ..... ..... 10100 00010/ X I 200 v2.05 dcmpuq DFP Compare Unordered Quad
111011 ..... ///// ..... 01000 00010. X I 215 v2.05 dctdp[.] DFP Convert To DFP Long
111011 ..... ///// ..... 01001 00010. X I 217 v2.05 dctfix[.] DFP Convert To Fixed
111111 ..... ///// ..... 01001 00010. X I 217 v2.05 dctfixq[.] DFP Convert To Fixed Quad
111111 ..... ///// ..... 01000 00010. X I 215 v2.05 dctqpq[.] DFP Convert To DFP Extended
111011 ..... ../// ..... 01010 00010. X I 219 v2.05 ddedpd[.] DFP Decode DPD To BCD
111111 ..... ../// ..... 01010 00010. X I 219 v2.05 ddedpdq[.] DFP Decode DPD To BCD Quad
111011 ..... ..... ..... 10001 00010. X I 198 v2.05 ddiv[.] DFP Divide
111111 ..... ..... ..... 10001 00010. X I 198 v2.05 ddivq[.] DFP Divide Quad
111011 ..... .//// ..... 11010 00010. X I 219 v2.05 denbcd[.] DFP Encode BCD To DPD
111111 ..... .//// ..... 11010 00010. X I 219 v2.05 denbcdq[.] DFP Encode BCD To DPD Quad
111011 ..... ..... ..... 11011 00010. X I 220 v2.05 diex[.] DFP Insert Exponent
111111 ..... ..... ..... 11011 00010. X I 220 v2.05 diexq[.] DFP Insert Exponent Quad
111011 ..... ..... ..... 00001 00010. X I 197 v2.05 dmul[.] DFP Multiply
111111 ..... ..... ..... 00001 00010. X I 197 v2.05 dmulq[.] DFP Multiply Quad
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 9 of 17)
111011 ..... ..... ..... ..000 00011. Z23 I 206 v2.05 dqua[.] DFP Quantize
111011 ..... ..... ..... ..010 00011. Z23 I 205 v2.05 dquai[.] DFP Quantize Immediate
111111 ..... ..... ..... ..010 00011. Z23 I 205 v2.05 dquaiq[.] DFP Quantize Immediate Quad
111111 ..... ..... ..... ..000 00011. Z23 I 206 v2.05 dquaq[.] DFP Quantize Quad
111111 ..... ///// ..... 11000 00010. X I 216 v2.05 drdpq[.] DFP Round To DFP Long
111011 ..... ////. ..... ..111 00011. Z23 I 213 v2.05 drintn[.] DFP Round To FP Integer Without Inexact
111111 ..... ////. ..... ..111 00011. Z23 I 213 v2.05 drintnq[.] DFP Round To FP Integer Without Inexact Quad
111011 ..... ////. ..... ..011 00011. Z23 I 211 v2.05 drintx[.] DFP Round To FP Integer With Inexact
111111 ..... ////. ..... ..011 00011. Z23 I 211 v2.05 drintxq[.] DFP Round To FP Integer With Inexact Quad
111011 ..... ..... ..... ..001 00011. Z23 I 208 v2.05 drrnd[.] DFP Reround
111111 ..... ..... ..... ..001 00011. Z23 I 208 v2.05 drrndq[.] DFP Reround Quad
111011 ..... ///// ..... 11000 00010. X I 216 v2.05 drsp[.] DFP Round To DFP Short
111011 ..... ..... ..... .0010 00010. Z22 I 222 v2.05 dscli[.] DFP Shift Significand Left Immediate
111111 ..... ..... ..... .0010 00010. Z22 I 222 v2.05 dscliq[.] DFP Shift Significand Left Immediate Quad
111011 ..... ..... ..... .0011 00010. Z22 I 222 v2.05 dscri[.] DFP Shift Significand Right Immediate
111111 ..... ..... ..... .0011 00010. Z22 I 222 v2.05 dscriq[.] DFP Shift Significand Right Immediate Quad
111011 ..... ..... ..... 10000 00010. X I 195 v2.05 dsub[.] DFP Subtract
111111 ..... ..... ..... 10000 00010. X I 195 v2.05 dsubq[.] DFP Subtract Quad
111011 ...// ..... ..... .0110 00010/ Z22 I 202 v2.05 dtstdc DFP Test Data Class
111111 ...// ..... ..... .0110 00010/ Z22 I 202 v2.05 dtstdcq DFP Test Data Class Quad
111011 ...// ..... ..... .0111 00010/ Z22 I 202 v2.05 dtstdg DFP Test Data Group
111111 ...// ..... ..... .0111 00010/ Z22 I 202 v2.05 dtstdgq DFP Test Data Group Quad
111011 ...// ..... ..... 00101 00010/ X I 203 v2.05 dtstex DFP Test Exponent
111111 ...// ..... ..... 00101 00010/ X I 203 v2.05 dtstexq DFP Test Exponent Quad
111011 ...// ..... ..... 10101 00010/ X I 204 v2.05 dtstsf DFP Test Significance
111111 ...// ..... ..... 10101 00010/ X I 204 v2.05 dtstsfq DFP Test Significance Quad
111011 ..... ///// ..... 01011 00010. X I 220 v2.05 dxex[.] DFP Extract Exponent
111111 ..... ///// ..... 01011 00010. X I 220 v2.05 dxexq[.] DFP Extract Exponent Quad
111111 ..... ..... ..... 00000 01000. X I 151 v2.05 fcpsgn[.] Floating Copy Sign
011111 ..... ..... ..... 11010 10101/ X III 966 v2.05 H lbzcix Load Byte & Zero Caching Inhibited Indexed
011111 ..... ..... ..... 11011 10101/ X III 966 v2.05 H ldcix Load Dword Caching Inhibited Indexed
111001 ..... ..... ..... ..... ....00 DS I 150 v2.05 lfdp Load Floating Double Pair
011111 ..... ..... ..... 11000 10111/ X I 150 v2.05 lfdpx Load Floating Double Pair Indexed
011111 ..... ..... ..... 11010 10111/ X I 144 v2.05 lfiwax Load Floating as Integer Word Algebraic Indexed
011111 ..... ..... ..... 11001 10101/ X III 966 v2.05 H lhzcix Load Hword & Zero Caching Inhibited Indexed
011111 ..... ..... ..... 11000 10101/ X III 966 v2.05 H lwzcix Load Word & Zero Caching Inhibited Indexed
011111 ..... ..... ///// 00101 11010/ X I 97 v2.05 prtyd Parity Dword
011111 ..... ..... ///// 00100 11010/ X I 97 v2.05 prtyw Parity Word
011111 ..... ///// ..... 11110 100111 X III 1031 v2.05 P SR slbfee. SLB Find Entry ESID & record
011111 ..... ..... ..... 11110 10101/ X III 967 v2.05 H stbcix Store Byte Caching Inhibited Indexed
011111 ..... ..... ..... 11111 10101/ X III 967 v2.05 H stdcix Store Dword Caching Inhibited Indexed
111101 ..... ..... ..... ..... ....00 DS I 150 v2.05 stfdp Store Floating Double Pair
011111 ..... ..... ..... 11100 10111/ X I 150 v2.05 stfdpx Store Floating Double Pair Indexed
011111 ..... ..... ..... 11101 10101/ X III 967 v2.05 H sthcix Store Hword Caching Inhibited Indexed
011111 ..... ..... ..... 11100 10101/ X III 967 v2.05 H stwcix Store Word Caching Inhibited Indexed
011010 00000 00000 00000 00000 000000 D I 92 v2.05 xnop Executed No Operation
011111 ..... ..... ..... ..... 01111/ A I 90 v2.03 isel Integer Select
011111 ..... ..... ..... 00000 00111/ X I 244 v2.03 lvebx Load Vector Element Byte Indexed
011111 ..... ..... ..... 00001 00111/ X I 244 v2.03 lvehx Load Vector Element Hword Indexed
011111 ..... ..... ..... 00010 00111/ X I 245 v2.03 lvewx Load Vector Element Word Indexed
011111 ..... ..... ..... 00000 00110/ X I 249 v2.03 lvsl Load Vector for Shift Left
011111 ..... ..... ..... 00001 00110/ X I 249 v2.03 lvsr Load Vector for Shift Right
011111 ..... ..... ..... 00011 00111/ X I 245 v2.03 lvx Load Vector Indexed
011111 ..... ..... ..... 01011 00111/ X I 245 v2.03 lvxl Load Vector Indexed Last
000100 ..... ///// ///// 11000 000100 VX I 364 v2.03 mfvscr Move From VSCR
000100 ///// ///// ..... 11001 000100 VX I 364 v2.03 mtvscr Move To VSCR
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 10 of 17)
111110 ..... ..... ..... ..... ....10 DS I 60 v2.03 stq Store Qword
011111 ..... ..... ..... 00100 00111/ X I 247 v2.03 stvebx Store Vector Element Byte Indexed
011111 ..... ..... ..... 00101 00111/ X I 247 v2.03 stvehx Store Vector Element Hword Indexed
011111 ..... ..... ..... 00110 00111/ X I 248 v2.03 stvewx Store Vector Element Word Indexed
011111 ..... ..... ..... 00111 00111/ X I 248 v2.03 stvx Store Vector Indexed
011111 ..... ..... ..... 01111 00111/ X I 248 v2.03 stvxl Store Vector Indexed Last
011111 ///// ///// ..... 01000 10010/ X III 1038 v2.03 P 64 tlbiel TLB Invalidate Entry Local
000100 ..... ..... ..... 00110 000000 VX I 271 v2.03 vaddcuw Vector Add & Write Carry-Out Unsigned Word
000100 ..... ..... ..... 00000 001010 VX I 324 v2.03 vaddfp Vector Add Floating-Point
000100 ..... ..... ..... 01100 000000 VX I 271 v2.03 vaddsbs Vector Add Signed Byte Saturate
000100 ..... ..... ..... 01101 000000 VX I 271 v2.03 vaddshs Vector Add Signed Hword Saturate
000100 ..... ..... ..... 01110 000000 VX I 272 v2.03 vaddsws Vector Add Signed Word Saturate
000100 ..... ..... ..... 00000 000000 VX I 272 v2.03 vaddubm Vector Add Unsigned Byte Modulo
000100 ..... ..... ..... 01000 000000 VX I 274 v2.03 vaddubs Vector Add Unsigned Byte Saturate
000100 ..... ..... ..... 00001 000000 VX I 273 v2.03 vadduhm Vector Add Unsigned Hword Modulo
000100 ..... ..... ..... 01001 000000 VX I 274 v2.03 vadduhs Vector Add Unsigned Hword Saturate
000100 ..... ..... ..... 00010 000000 VX I 273 v2.03 vadduwm Vector Add Unsigned Word Modulo
000100 ..... ..... ..... 01010 000000 VX I 274 v2.03 vadduws Vector Add Unsigned Word Saturate
000100 ..... ..... ..... 10000 000100 VX I 315 v2.03 vand Vector Logical AND
000100 ..... ..... ..... 10001 000100 VX I 315 v2.03 vandc Vector Logical AND with Complement
000100 ..... ..... ..... 10100 000010 VX I 298 v2.03 vavgsb Vector Average Signed Byte
000100 ..... ..... ..... 10101 000010 VX I 298 v2.03 vavgsh Vector Average Signed Hword
000100 ..... ..... ..... 10110 000010 VX I 298 v2.03 vavgsw Vector Average Signed Word
000100 ..... ..... ..... 10000 000010 VX I 299 v2.03 vavgub Vector Average Unsigned Byte
000100 ..... ..... ..... 10001 000010 VX I 299 v2.03 vavguh Vector Average Unsigned Hword
000100 ..... ..... ..... 10010 000010 VX I 299 v2.03 vavguw Vector Average Unsigned Word
000100 ..... ..... ..... 01101 001010 VX I 328 v2.03 vcfsx Vector Convert From Signed Word
000100 ..... ..... ..... 01100 001010 VX I 328 v2.03 vcfux Vector Convert From Unsigned Word
000100 ..... ..... ..... .1111 000110 VC I 331 v2.03 vcmpbfp[.] Vector Compare Bounds Floating-Point
000100 ..... ..... ..... .0011 000110 VC I 332 v2.03 vcmpeqfp[.] Vector Compare Equal To Floating-Point
000100 ..... ..... ..... .0000 000110 VC I 306 v2.03 vcmpequb[.] Vector Compare Equal Unsigned Byte
000100 ..... ..... ..... .0001 000110 VC I 306 v2.03 vcmpequh[.] Vector Compare Equal Unsigned Hword
000100 ..... ..... ..... .0010 000110 VC I 307 v2.03 vcmpequw[.] Vector Compare Equal Unsigned Word
000100 ..... ..... ..... .0111 000110 VC I 332 v2.03 vcmpgefp[.] Vector Compare Greater Than or Equal To Floating-Point
000100 ..... ..... ..... .1011 000110 VC I 333 v2.03 vcmpgtfp[.] Vector Compare Greater Than Floating-Point
000100 ..... ..... ..... .1100 000110 VC I 308 v2.03 vcmpgtsb[.] Vector Compare Greater Than Signed Byte
000100 ..... ..... ..... .1101 000110 VC I 309 v2.03 vcmpgtsh[.] Vector Compare Greater Than Signed Hword
000100 ..... ..... ..... .1110 000110 VC I 309 v2.03 vcmpgtsw[.] Vector Compare Greater Than Signed Word
000100 ..... ..... ..... .1000 000110 VC I 310 v2.03 vcmpgtub[.] Vector Compare Greater Than Unsigned Byte
000100 ..... ..... ..... .1001 000110 VC I 311 v2.03 vcmpgtuh[.] Vector Compare Greater Than Unsigned Hword
000100 ..... ..... ..... .1010 000110 VC I 311 v2.03 vcmpgtuw[.] Vector Compare Greater Than Unsigned Word
000100 ..... ..... ..... 01111 001010 VX I 327 v2.03 vctsxs Vector Convert To Signed Word Saturate
000100 ..... ..... ..... 01110 001010 VX I 327 v2.03 vctuxs Vector Convert To Unsigned Word Saturate
000100 ..... ///// ..... 00110 001010 VX I 334 v2.03 vexptefp Vector 2 Raised to the Exponent Estimate Floating-Point
000100 ..... ..... ..... 00111 001010 VX I 334 v2.03 vlogefp Vector Log Base 2 Estimate Floating-Point
000100 ..... ..... ..... ..... 101110 VA I 325 v2.03 vmaddfp Vector Multiply-Add Floating-Point
000100 ..... ..... ..... 10000 001010 VX I 326 v2.03 vmaxfp Vector Maximum Floating-Point
000100 ..... ..... ..... 00100 000010 VX I 302 v2.03 vmaxsb Vector Maximum Signed Byte
000100 ..... ..... ..... 00101 000010 VX I 303 v2.03 vmaxsh Vector Maximum Signed Hword
000100 ..... ..... ..... 00110 000010 VX I 303 v2.03 vmaxsw Vector Maximum Signed Word
000100 ..... ..... ..... 00000 000010 VX I 302 v2.03 vmaxub Vector Maximum Unsigned Byte
000100 ..... ..... ..... 00001 000010 VX I 303 v2.03 vmaxuh Vector Maximum Unsigned Hword
000100 ..... ..... ..... 00010 000010 VX I 303 v2.03 vmaxuw Vector Maximum Unsigned Word
000100 ..... ..... ..... ..... 100000 VA I 287 v2.03 vmhaddshs Vector Multiply-High-Add Signed Hword Saturate
000100 ..... ..... ..... ..... 100001 VA I 287 v2.03 vmhraddshs Vector Multiply-High-Round-Add Signed Hword Saturate
000100 ..... ..... ..... 10001 001010 VX I 326 v2.03 vminfp Vector Minimum Floating-Point
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 11 of 17)
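Instruction(1) Format Book Page Version(2) Privilege(3) Mode Dep(4) Mnemonic Name
Instruction bit numbers (tens digit on the first line, units digit on the second):
000000 00001 11111 11112 22222 222233
012345 67890 12345 67890 12345 678901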
000100 ..... ..... ..... 01100 000010 VX I 304 v2.03 vminsb Vector Minimum Signed Byte
000100 ..... ..... ..... 01101 000010 VX I 305 v2.03 vminsh Vector Minimum Signed Hword
000100 ..... ..... ..... 01110 000010 VX I 305 v2.03 vminsw Vector Minimum Signed Word
000100 ..... ..... ..... 01000 000010 VX I 304 v2.03 vminub Vector Minimum Unsigned Byte
000100 ..... ..... ..... 01001 000010 VX I 305 v2.03 vminuh Vector Minimum Unsigned Hword
000100 ..... ..... ..... 01010 000010 VX I 305 v2.03 vminuw Vector Minimum Unsigned Word
000100 ..... ..... ..... ..... 100010 VA I 288 v2.03 vmladduhm Vector Multiply-Low-Add Unsigned Hword Modulo
000100 ..... ..... ..... 00000 001100 VX I 257 v2.03 vmrghb Vector Merge High Byte
000100 ..... ..... ..... 00001 001100 VX I 257 v2.03 vmrghh Vector Merge High Hword
000100 ..... ..... ..... 00010 001100 VX I 258 v2.03 vmrghw Vector Merge High Word
000100 ..... ..... ..... 00100 001100 VX I 257 v2.03 vmrglb Vector Merge Low Byte
000100 ..... ..... ..... 00101 001100 VX I 257 v2.03 vmrglh Vector Merge Low Hword
000100 ..... ..... ..... 00110 001100 VX I 258 v2.03 vmrglw Vector Merge Low Word
000100 ..... ..... ..... ..... 100101 VA I 289 v2.03 vmsummbm Vector Multiply-Sum Mixed Byte Modulo
000100 ..... ..... ..... ..... 101000 VA I 289 v2.03 vmsumshm Vector Multiply-Sum Signed Hword Modulo
000100 ..... ..... ..... ..... 101001 VA I 290 v2.03 vmsumshs Vector Multiply-Sum Signed Hword Saturate
000100 ..... ..... ..... ..... 100100 VA I 288 v2.03 vmsumubm Vector Multiply-Sum Unsigned Byte Modulo
000100 ..... ..... ..... ..... 100110 VA I 290 v2.03 vmsumuhm Vector Multiply-Sum Unsigned Hword Modulo
000100 ..... ..... ..... ..... 100111 VA I 291 v2.03 vmsumuhs Vector Multiply-Sum Unsigned Hword Saturate
000100 ..... ..... ..... 01100 001000 VX I 283 v2.03 vmulesb Vector Multiply Even Signed Byte
000100 ..... ..... ..... 01101 001000 VX I 284 v2.03 vmulesh Vector Multiply Even Signed Hword
000100 ..... ..... ..... 01000 001000 VX I 283 v2.03 vmuleub Vector Multiply Even Unsigned Byte
000100 ..... ..... ..... 01001 001000 VX I 284 v2.03 vmuleuh Vector Multiply Even Unsigned Hword
000100 ..... ..... ..... 00100 001000 VX I 283 v2.03 vmulosb Vector Multiply Odd Signed Byte
000100 ..... ..... ..... 00101 001000 VX I 284 v2.03 vmulosh Vector Multiply Odd Signed Hword
000100 ..... ..... ..... 00000 001000 VX I 283 v2.03 vmuloub Vector Multiply Odd Unsigned Byte
000100 ..... ..... ..... 00001 001000 VX I 284 v2.03 vmulouh Vector Multiply Odd Unsigned Hword
000100 ..... ..... ..... ..... 101111 VA I 325 v2.03 vnmsubfp Vector Negative Multiply-Subtract Floating-Point
000100 ..... ..... ..... 10100 000100 VX I 316 v2.03 vnor Vector Logical NOR
000100 ..... ..... ..... 10010 000100 VX I 316 v2.03 vor Vector Logical OR
000100 ..... ..... ..... ..... 101011 VA I 262 v2.03 vperm Vector Permute
000100 ..... ..... ..... 01100 001110 VX I 250 v2.03 vpkpx Vector Pack Pixel
000100 ..... ..... ..... 00110 001110 VX I 251 v2.03 vpkshss Vector Pack Signed Hword Signed Saturate
000100 ..... ..... ..... 00100 001110 VX I 252 v2.03 vpkshus Vector Pack Signed Hword Unsigned Saturate
000100 ..... ..... ..... 00111 001110 VX I 252 v2.03 vpkswss Vector Pack Signed Word Signed Saturate
000100 ..... ..... ..... 00101 001110 VX I 253 v2.03 vpkswus Vector Pack Signed Word Unsigned Saturate
000100 ..... ..... ..... 00000 001110 VX I 253 v2.03 vpkuhum Vector Pack Unsigned Hword Unsigned Modulo
000100 ..... ..... ..... 00010 001110 VX I 254 v2.03 vpkuhus Vector Pack Unsigned Hword Unsigned Saturate
000100 ..... ..... ..... 00001 001110 VX I 254 v2.03 vpkuwum Vector Pack Unsigned Word Unsigned Modulo
000100 ..... ..... ..... 00011 001110 VX I 254 v2.03 vpkuwus Vector Pack Unsigned Word Unsigned Saturate
000100 ..... ///// ..... 00100 001010 VX I 335 v2.03 vrefp Vector Reciprocal Estimate Floating-Point
000100 ..... ///// ..... 01011 001010 VX I 329 v2.03 vrfim Vector Round to Floating-Point Integral toward -Infinity
000100 ..... ///// ..... 01000 001010 VX I 329 v2.03 vrfin Vector Round to Floating-Point Integral Nearest
000100 ..... ///// ..... 01010 001010 VX I 329 v2.03 vrfip Vector Round to Floating-Point Integral toward +Infinity
000100 ..... ///// ..... 01001 001010 VX I 330 v2.03 vrfiz Vector Round to Floating-Point Integral toward Zero
000100 ..... ..... ..... 00000 000100 VX I 318 v2.03 vrlb Vector Rotate Left Byte
000100 ..... ..... ..... 00001 000100 VX I 318 v2.03 vrlh Vector Rotate Left Hword
000100 ..... ..... ..... 00010 000100 VX I 318 v2.03 vrlw Vector Rotate Left Word
000100 ..... ///// ..... 00101 001010 VX I 335 v2.03 vrsqrtefp Vector Reciprocal Square Root Estimate Floating-Point
000100 ..... ..... ..... ..... 101010 VA I 263 v2.03 vsel Vector Select
000100 ..... ..... ..... 00111 000100 VX I 266 v2.03 vsl Vector Shift Left
000100 ..... ..... ..... 00100 000100 VX I 319 v2.03 vslb Vector Shift Left Byte
000100 ..... ..... ..... /.... 101100 VA I 265 v2.03 vsldoi Vector Shift Left Double by Octet Immediate
000100 ..... ..... ..... 00101 000100 VX I 319 v2.03 vslh Vector Shift Left Hword
000100 ..... ..... ..... 10000 001100 VX I 266 v2.03 vslo Vector Shift Left by Octet
000100 ..... ..... ..... 00110 000100 VX I 319 v2.03 vslw Vector Shift Left Word
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 12 of 17)
000100 ..... /.... ..... 01000 001100 VX I 260 v2.03 vspltb Vector Splat Byte
000100 ..... //... ..... 01001 001100 VX I 260 v2.03 vsplth Vector Splat Hword
000100 ..... ..... ///// 01100 001100 VX I 261 v2.03 vspltisb Vector Splat Immediate Signed Byte
000100 ..... ..... ///// 01101 001100 VX I 261 v2.03 vspltish Vector Splat Immediate Signed Hword
000100 ..... ..... ///// 01110 001100 VX I 261 v2.03 vspltisw Vector Splat Immediate Signed Word
000100 ..... ///.. ..... 01010 001100 VX I 260 v2.03 vspltw Vector Splat Word
000100 ..... ..... ..... 01011 000100 VX I 266 v2.03 vsr Vector Shift Right
000100 ..... ..... ..... 01100 000100 VX I 321 v2.03 vsrab Vector Shift Right Algebraic Byte
000100 ..... ..... ..... 01101 000100 VX I 321 v2.03 vsrah Vector Shift Right Algebraic Hword
000100 ..... ..... ..... 01110 000100 VX I 321 v2.03 vsraw Vector Shift Right Algebraic Word
000100 ..... ..... ..... 01000 000100 VX I 320 v2.03 vsrb Vector Shift Right Byte
000100 ..... ..... ..... 01001 000100 VX I 320 v2.03 vsrh Vector Shift Right Hword
000100 ..... ..... ..... 10001 001100 VX I 266 v2.03 vsro Vector Shift Right by Octet
000100 ..... ..... ..... 01010 000100 VX I 320 v2.03 vsrw Vector Shift Right Word
000100 ..... ..... ..... 10110 000000 VX I 277 v2.03 vsubcuw Vector Subtract & Write Carry-Out Unsigned Word
000100 ..... ..... ..... 00001 001010 VX I 324 v2.03 vsubfp Vector Subtract Floating-Point
000100 ..... ..... ..... 11100 000000 VX I 277 v2.03 vsubsbs Vector Subtract Signed Byte Saturate
000100 ..... ..... ..... 11101 000000 VX I 277 v2.03 vsubshs Vector Subtract Signed Hword Saturate
000100 ..... ..... ..... 11110 000000 VX I 278 v2.03 vsubsws Vector Subtract Signed Word Saturate
000100 ..... ..... ..... 10000 000000 VX I 279 v2.03 vsububm Vector Subtract Unsigned Byte Modulo
000100 ..... ..... ..... 11000 000000 VX I 280 v2.03 vsububs Vector Subtract Unsigned Byte Saturate
000100 ..... ..... ..... 10001 000000 VX I 279 v2.03 vsubuhm Vector Subtract Unsigned Hword Modulo
000100 ..... ..... ..... 11001 000000 VX I 280 v2.03 vsubuhs Vector Subtract Unsigned Hword Saturate
000100 ..... ..... ..... 10010 000000 VX I 279 v2.03 vsubuwm Vector Subtract Unsigned Word Modulo
000100 ..... ..... ..... 11010 000000 VX I 280 v2.03 vsubuws Vector Subtract Unsigned Word Saturate
000100 ..... ..... ..... 11010 001000 VX I 292 v2.03 vsum2sws Vector Sum across Half Signed Word Saturate
000100 ..... ..... ..... 11100 001000 VX I 293 v2.03 vsum4sbs Vector Sum across Quarter Signed Byte Saturate
000100 ..... ..... ..... 11001 001000 VX I 293 v2.03 vsum4shs Vector Sum across Quarter Signed Hword Saturate
000100 ..... ..... ..... 11000 001000 VX I 294 v2.03 vsum4ubs Vector Sum across Quarter Unsigned Byte Saturate
000100 ..... ..... ..... 11110 001000 VX I 292 v2.03 vsumsws Vector Sum across Signed Word Saturate
000100 ..... ///// ..... 01101 001110 VX I 255 v2.03 vupkhpx Vector Unpack High Pixel
000100 ..... ///// ..... 01000 001110 VX I 256 v2.03 vupkhsb Vector Unpack High Signed Byte
000100 ..... ///// ..... 01001 001110 VX I 256 v2.03 vupkhsh Vector Unpack High Signed Hword
000100 ..... ///// ..... 01111 001110 VX I 255 v2.03 vupklpx Vector Unpack Low Pixel
000100 ..... ///// ..... 01010 001110 VX I 256 v2.03 vupklsb Vector Unpack Low Signed Byte
000100 ..... ///// ..... 01011 001110 VX I 256 v2.03 vupklsh Vector Unpack Low Signed Hword
000100 ..... ..... ..... 10011 000100 VX I 316 v2.03 vxor Vector Logical XOR
111111 ..... ///// ..... ///// 11000. A I 155 v2.02 fre[.] Floating Reciprocal Estimate
111111 ..... ///// ..... 01111 01000. X I 167 v2.02 frim[.] Floating Round To Integer Minus
111111 ..... ///// ..... 01100 01000. X I 167 v2.02 frin[.] Floating Round To Integer Nearest
111111 ..... ///// ..... 01110 01000. X I 167 v2.02 frip[.] Floating Round To Integer Plus
111111 ..... ///// ..... 01101 01000. X I 167 v2.02 friz[.] Floating Round To Integer Zero
111011 ..... ///// ..... ///// 11010. A I 156 v2.02 frsqrtes[.] Floating Reciprocal Square Root Estimate Single
010011 ///// ///// ///// 01000 10010/ XL III 955 v2.02 H hrfid Return From Interrupt Dword Hypervisor
011111 ..... ..... ///// 00011 11010/ X I 96 v2.02 popcntb Population Count Byte
011111 ..... 1.... ..../ 00000 10011/ XFX I 121 v2.01 mfocrf Move From One CR Field
011111 ..... 1.... ..../ 00100 10000/ XFX I 120 v2.01 mtocrf Move To One CR Field
011111 ..... ///// ..... 11100 10011/ X III 1030 v2.00 P slbmfee SLB Move From Entry ESID
011111 ..... ///// ..... 11010 10011/ X III 1030 v2.00 P slbmfev SLB Move From Entry VSID
011111 ..... ///// ..... 01100 10010/ X III 1029 v2.00 P slbmte SLB Move To Entry
111000 ..... ..... ..... ..... ...... DQ I 59 v2.03 lq Load Qword
010011 ///// ///// ///// 00010 10010/ XL III 953 v3.0 P rfscv Return From System Call Vectored
010001 ///// ///// ////. ..... .///01 SC I 43 v3.0 scv System Call Vectored
011111 ..... ..... ///// 00001 11010. X I 98 PPC SR cntlzd[.] Count Leading Zeros Dword
011111 ///.. ..... ..... 00010 10110/ X II 854 PPC dcbf Data Cache Block Flush
011111 ///// ..... ..... 00001 10110/ X II 853 PPC dcbst Data Cache Block Store
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 13 of 17)
011111 ..... ..... ..... 01000 10110/ X II 851 PPC dcbt Data Cache Block Touch
011111 ..... ..... ..... 00111 10110/ X II 852 PPC dcbtst Data Cache Block Touch for Store
011111 ..... ..... ..... .1111 01001. XO I 82 PPC SR divd[o][.] Divide Dword
011111 ..... ..... ..... .1110 01001. XO I 82 PPC SR divdu[o][.] Divide Dword Unsigned
011111 ..... ..... ..... .1111 01011. XO I 75 PPC SR divw[o][.] Divide Word
011111 ..... ..... ..... .1110 01011. XO I 75 PPC SR divwu[o][.] Divide Word Unsigned
011111 ///// ///// ///// 11010 10110/ X II 879 PPC eieio Enforce In-order Execution of I/O
011111 ..... ..... ///// 11101 11010. X I 94 PPC SR extsb[.] Extend Sign Byte
011111 ..... ..... ///// 11110 11010. X I 98 PPC SR extsw[.] Extend Sign Word
111011 ..... ..... ..... ///// 10101. A I 153 PPC fadds[.] Floating Add Single
111111 ..... ///// ..... 11010 01110. X I 164 PPC fcfid[.] Floating Convert From Integer Dword
111111 ..... ///// ..... 11001 01110. X I 160 PPC fctid[.] Floating Convert To Integer Dword
111111 ..... ///// ..... 11001 01111. X I 161 PPC fctidz[.] Floating Convert To Integer Dword truncate
111011 ..... ..... ..... ///// 10010. A I 154 PPC fdivs[.] Floating Divide Single
111011 ..... ..... ..... ..... 11101. A I 158 PPC fmadds[.] Floating Multiply-Add Single
111011 ..... ..... ..... ..... 11100. A I 159 PPC fmsubs[.] Floating Multiply-Subtract Single
111011 ..... ..... ///// ..... 11001. A I 154 PPC fmuls[.] Floating Multiply Single
111011 ..... ..... ..... ..... 11111. A I 159 PPC fnmadds[.] Floating Negative Multiply-Add Single
111011 ..... ..... ..... ..... 11110. A I 159 PPC fnmsubs[.] Floating Negative Multiply-Subtract Single
111011 ..... ///// ..... ///// 11000. A I 155 PPC fres[.] Floating Reciprocal Estimate Single
111111 ..... ///// ..... ///// 11010. A I 156 PPC frsqrte[.] Floating Reciprocal Square Root Estimate
111111 ..... ..... ..... ..... 10111. A I 169 PPC fsel[.] Floating Select
111011 ..... ///// ..... ///// 10110. A I 155 PPC fsqrts[.] Floating Square Root Single
111011 ..... ..... ..... ///// 10100. A I 153 PPC fsubs[.] Floating Subtract Single
011111 ///// ..... ..... 11110 10110/ X II 842 PPC icbi Instruction Cache Block Invalidate
111010 ..... ..... ..... ..... ....00 DS I 53 PPC ld Load Dword
011111 ..... ..... ..... 00010 10100/ X II 873 PPC ldarx Load Dword And Reserve Indexed
111010 ..... ..... ..... ..... ....01 DS I 53 PPC ldu Load Dword with Update
011111 ..... ..... ..... 00001 10101/ X I 53 PPC ldux Load Dword with Update Indexed
011111 ..... ..... ..... 00000 10101/ X I 53 PPC ldx Load Dword Indexed
111010 ..... ..... ..... ..... ....10 DS I 52 PPC lwa Load Word Algebraic
011111 ..... ..... ..... 00000 10100/ X II 869 PPC lwarx Load Word & Reserve Indexed
011111 ..... ..... ..... 01011 10101/ X I 52 PPC lwaux Load Word Algebraic with Update Indexed
011111 ..... ..... ..... 01010 10101/ X I 52 PPC lwax Load Word Algebraic Indexed
011111 ..... ..... ..... 01011 10011/ X II 902 PPC mftb Move From Time Base
011111 ..... ////. ///// 00101 10010/ X III 978 PPC P mtmsrd Move To MSR Dword
011111 ..... ..... ..... /0010 01001. XO I 80 PPC SR mulhd[.] Multiply High Dword
011111 ..... ..... ..... /0000 01001. XO I 80 PPC SR mulhdu[.] Multiply High Dword Unsigned
011111 ..... ..... ..... /0010 01011. XO I 74 PPC SR mulhw[.] Multiply High Word
011111 ..... ..... ..... /0000 01011. XO I 74 PPC SR mulhwu[.] Multiply High Word Unsigned
011111 ..... ..... ..... .0111 01001. XO I 80 PPC SR mulld[o][.] Multiply Low Dword
010011 ///// ///// ///// 00000 10010/ XL III 954 PPC P rfid Return from Interrupt Dword
011110 ..... ..... ..... ..... .1000. MDS I 103 PPC SR rldcl[.] Rotate Left Dword then Clear Left
011110 ..... ..... ..... ..... .1001. MDS I 103 PPC SR rldcr[.] Rotate Left Dword then Clear Right
011110 ..... ..... ..... ..... .010.. MD I 104 PPC SR rldic[.] Rotate Left Dword Immediate then Clear
011110 ..... ..... ..... ..... .000.. MD I 104 PPC SR rldicl[.] Rotate Left Dword Immediate then Clear Left
011110 ..... ..... ..... ..... .001.. MD I 105 PPC SR rldicr[.] Rotate Left Dword Immediate then Clear Right
011110 ..... ..... ..... ..... .011.. MD I 105 PPC SR rldimi[.] Rotate Left Dword Immediate then Mask Insert
010001 ///// ///// ////. ..... .///1/ SC I 43 PPC sc System Call
011111 //... ///// ///// 01111 10010/ X III 1027 PPC P slbia SLB Invalidate All
011111 ///// ///// ..... 01101 10010/ X III 1024 PPC P slbie SLB Invalidate Entry
011111 ..... ..... ..... 00000 11011. X I 108 PPC SR sld[.] Shift Left Dword
011111 ..... ..... ..... 11000 11010. X I 109 PPC SR srad[.] Shift Right Algebraic Dword
011111 ..... ..... ..... 11001 1101.. XS I 109 PPC SR sradi[.] Shift Right Algebraic Dword Immediate
011111 ..... ..... ..... 10000 11011. X I 108 PPC SR srd[.] Shift Right Dword
111110 ..... ..... ..... ..... ....00 DS I 58 PPC std Store Dword
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 14 of 17)
011111 ..... ..... ..... 00110 101101 X II 873 PPC stdcx. Store Dword Conditional Indexed & record
111110 ..... ..... ..... ..... ....01 DS I 58 PPC stdu Store Dword with Update
011111 ..... ..... ..... 00101 10101/ X I 58 PPC stdux Store Dword with Update Indexed
011111 ..... ..... ..... 00100 10101/ X I 58 PPC stdx Store Dword Indexed
011111 ..... ..... ..... 11110 10111/ X I 148 PPC stfiwx Store Floating as Integer Word Indexed
011111 ..... ..... ..... 00100 101101 X II 872 PPC stwcx. Store Word Conditional Indexed & record
011111 ..... ..... ..... .0001 01000. XO I 70 PPC SR subf[o][.] Subtract From
011111 ..... ..... ..... 00010 00100/ X I 90 PPC td Trap Dword
000010 ..... ..... ..... ..... ...... D I 90 PPC tdi Trap Dword Immediate
011111 ///// ///// ///// 10001 10110/ X III 1042 PPC H tlbsync TLB Synchronize
111111 ..... ///// ..... 00000 01110. X I 162 P2 fctiw[.] Floating Convert To Integer Word
111111 ..... ///// ..... 00000 01111. X I 163 P2 fctiwz[.] Floating Convert To Integer Word truncate
111111 ..... ///// ..... ///// 10110. A I 155 P2 fsqrt[.] Floating Square Root
011111 ..... ..... ..... .1000 01010. XO I 70 P1 SR add[o][.] Add
011111 ..... ..... ..... .0000 01010. XO I 71 P1 SR addc[o][.] Add Carrying
011111 ..... ..... ..... .0100 01010. XO I 72 P1 SR adde[o][.] Add Extended
001110 ..... ..... ..... ..... ...... D I 68 P1 addi Add Immediate
001100 ..... ..... ..... ..... ...... D I 70 P1 SR addic Add Immediate Carrying
001101 ..... ..... ..... ..... ...... D I 70 P1 SR addic. Add Immediate Carrying & record
001111 ..... ..... ..... ..... ...... D I 68 P1 addis Add Immediate Shifted
011111 ..... ..... ///// .0111 01010. XO I 72 P1 SR addme[o][.] Add to Minus One Extended
011111 ..... ..... ///// .0110 01010. XO I 73 P1 SR addze[o][.] Add to Zero Extended
011111 ..... ..... ..... 00000 11100. X I 93 P1 SR and[.] AND
011111 ..... ..... ..... 00001 11100. X I 94 P1 SR andc[.] AND with Complement
011100 ..... ..... ..... ..... ...... D I 91 P1 SR andi. AND Immediate & record
011101 ..... ..... ..... ..... ...... D I 91 P1 SR andis. AND Immediate Shifted & record
010010 ..... ..... ..... ..... ...... I I 38 P1 b[l][a] Branch [& Link] [Absolute]
010000 ..... ..... ..... ..... ...... B I 38 P1 CT bc[l][a] Branch Conditional [& Link] [Absolute]
010011 ..... ..... ///.. 10000 10000. XL I 39 P1 CT bcctr[l] Branch Conditional to CTR [& Link]
010011 ..... ..... ///.. 00000 10000. XL I 39 P1 CT bclr[l] Branch Conditional to LR [& Link]
011111 .../. ..... ..... 00000 00000/ X I 85 P1 cmp Compare
001011 .../. ..... ..... ..... ...... D I 85 P1 cmpi Compare Immediate
011111 .../. ..... ..... 00001 00000/ X I 86 P1 cmpl Compare Logical
001010 .../. ..... ..... ..... ...... D I 86 P1 cmpli Compare Logical Immediate
011111 ..... ..... ///// 00000 11010. X I 95 P1 SR cntlzw[.] Count Leading Zeros Word
010011 ..... ..... ..... 01000 00001/ XL I 41 P1 crand CR AND
010011 ..... ..... ..... 00100 00001/ XL I 42 P1 crandc CR AND with Complement
010011 ..... ..... ..... 01001 00001/ XL I 42 P1 creqv CR Equivalent
010011 ..... ..... ..... 00111 00001/ XL I 41 P1 crnand CR NAND
010011 ..... ..... ..... 00001 00001/ XL I 42 P1 crnor CR NOR
010011 ..... ..... ..... 01110 00001/ XL I 41 P1 cror CR OR
010011 ..... ..... ..... 01101 00001/ XL I 42 P1 crorc CR OR with Complement
010011 ..... ..... ..... 00110 00001/ XL I 41 P1 crxor CR XOR
011111 ///// ..... ..... 11111 10110/ X II 853 P1 dcbz Data Cache Block Zero
011111 ..... ..... ..... 01000 11100. X I 94 P1 SR eqv[.] Equivalent
011111 ..... ..... ///// 11100 11010. X I 94 P1 SR extsh[.] Extend Sign Hword
111111 ..... ///// ..... 01000 01000. X I 151 P1 fabs[.] Floating Absolute
111111 ..... ..... ..... ///// 10101. A I 153 P1 fadd[.] Floating Add
111111 ...// ..... ..... 00001 00000/ X I 168 P1 fcmpo Floating Compare Ordered
111111 ...// ..... ..... 00000 00000/ X I 168 P1 fcmpu Floating Compare Unordered
111111 ..... ..... ..... ///// 10010. A I 154 P1 fdiv[.] Floating Divide
111111 ..... ..... ..... ..... 11101. A I 158 P1 fmadd[.] Floating Multiply-Add
111111 ..... ///// ..... 00010 01000. X I 151 P1 fmr[.] Floating Move Register
111111 ..... ..... ..... ..... 11100. A I 159 P1 fmsub[.] Floating Multiply-Subtract
111111 ..... ..... ///// ..... 11001. A I 154 P1 fmul[.] Floating Multiply
111111 ..... ///// ..... 00100 01000. X I 151 P1 fnabs[.] Floating Negative Absolute Value
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 15 of 17)
111111 ..... ///// ..... 00001 01000. X I 151 P1 fneg[.] Floating Negate
111111 ..... ..... ..... ..... 11111. A I 159 P1 fnmadd[.] Floating Negative Multiply-Add
111111 ..... ..... ..... ..... 11110. A I 159 P1 fnmsub[.] Floating Negative Multiply-Subtract
111111 ..... ///// ..... 00000 01100. X I 160 P1 frsp[.] Floating Round to SP
111111 ..... ..... ..... ///// 10100. A I 153 P1 fsub[.] Floating Subtract
010011 ///// ///// ///// 00100 10110/ XL II 867 P1 isync Instruction Synchronize
100010 ..... ..... ..... ..... ...... D I 48 P1 lbz Load Byte & Zero
100011 ..... ..... ..... ..... ...... D I 48 P1 lbzu Load Byte & Zero with Update
011111 ..... ..... ..... 00011 10111/ X I 48 P1 lbzux Load Byte & Zero with Update Indexed
011111 ..... ..... ..... 00010 10111/ X I 48 P1 lbzx Load Byte & Zero Indexed
110010 ..... ..... ..... ..... ...... D I 143 P1 lfd Load Floating Double
110011 ..... ..... ..... ..... ...... D I 143 P1 lfdu Load Floating Double with Update
011111 ..... ..... ..... 10011 10111/ X I 143 P1 lfdux Load Floating Double with Update Indexed
011111 ..... ..... ..... 10010 10111/ X I 143 P1 lfdx Load Floating Double Indexed
110000 ..... ..... ..... ..... ...... D I 142 P1 lfs Load Floating Single
110001 ..... ..... ..... ..... ...... D I 142 P1 lfsu Load Floating Single with Update
011111 ..... ..... ..... 10001 10111/ X I 142 P1 lfsux Load Floating Single with Update Indexed
011111 ..... ..... ..... 10000 10111/ X I 142 P1 lfsx Load Floating Single Indexed
101010 ..... ..... ..... ..... ...... D I 50 P1 lha Load Hword Algebraic
101011 ..... ..... ..... ..... ...... D I 50 P1 lhau Load Hword Algebraic with Update
011111 ..... ..... ..... 01011 10111/ X I 50 P1 lhaux Load Hword Algebraic with Update Indexed
011111 ..... ..... ..... 01010 10111/ X I 50 P1 lhax Load Hword Algebraic Indexed
011111 ..... ..... ..... 11000 10110/ X I 61 P1 lhbrx Load Hword Byte-Reverse Indexed
101000 ..... ..... ..... ..... ...... D I 49 P1 lhz Load Hword & Zero
101001 ..... ..... ..... ..... ...... D I 49 P1 lhzu Load Hword & Zero with Update
011111 ..... ..... ..... 01001 10111/ X I 49 P1 lhzux Load Hword & Zero with Update Indexed
011111 ..... ..... ..... 01000 10111/ X I 49 P1 lhzx Load Hword & Zero Indexed
101110 ..... ..... ..... ..... ...... D I 63 P1 lmw Load Multiple Word
011111 ..... ..... ..... 10010 10101/ X I 65 P1 lswi Load String Word Immediate
011111 ..... ..... ..... 10000 10101/ X I 65 P1 lswx Load String Word Indexed
011111 ..... ..... ..... 10000 10110/ X I 61 P1 lwbrx Load Word Byte-Reverse Indexed
100000 ..... ..... ..... ..... ...... D I 51 P1 lwz Load Word & Zero
100001 ..... ..... ..... ..... ...... D I 51 P1 lwzu Load Word & Zero with Update
011111 ..... ..... ..... 00001 10111/ X I 51 P1 lwzux Load Word & Zero with Update Indexed
011111 ..... ..... ..... 00000 10111/ X I 51 P1 lwzx Load Word & Zero Indexed
010011 ...// ...// ///// 00000 00000/ XL I 42 P1 mcrf Move CR Field
111111 ...// ...// ///// 00010 00000/ X I 171 P1 mcrfs Move To CR from FPSCR
011111 ..... 0//// ///// 00000 10011/ XFX I 121 P1 mfcr Move From CR
111111 ..... ///// ///// 10010 00111. X I 171 P1 mffs[.] Move From FPSCR
011111 ..... ///// ///// 00010 10011/ X III 979 P1 P mfmsr Move From MSR
011111 ..... ..... ..... 01010 10011/ X I 118, III 975 P1 O mfspr Move From SPR
011111 ..... 0.... ..../ 00100 10000/ XFX I 120 P1 mtcrf Move To CR Fields
111111 ..... ///// ///// 00010 00110. X I 173 P1 mtfsb0[.] Move To FPSCR Bit 0
111111 ..... ///// ///// 00001 00110. X I 173 P1 mtfsb1[.] Move To FPSCR Bit 1
111111 ..... ..... ..... 10110 00111. XFL I 172 P1 mtfsf[.] Move To FPSCR Fields
111111 ...// ////. ..../ 00100 00110. X I 172 P1 mtfsfi[.] Move To FPSCR Field Immediate
011111 ..... ////. ///// 00100 10010/ X III 977 P1 P mtmsr Move To MSR
011111 ..... ..... ..... 01110 10011/ X I 116, III 974 P1 O mtspr Move To SPR
000111 ..... ..... ..... ..... ...... D I 74 P1 mulli Multiply Low Immediate
011111 ..... ..... ..... .0111 01011. XO I 74 P1 SR mullw[o][.] Multiply Low Word
011111 ..... ..... ..... 01110 11100. X I 93 P1 SR nand[.] NAND
011111 ..... ..... ///// .0011 01000. XO I 73 P1 SR neg[o][.] Negate
011111 ..... ..... ..... 00011 11100. X I 94 P1 SR nor[.] NOR
011111 ..... ..... ..... 01101 11100. X I 93 P1 SR or[.] OR
011111 ..... ..... ..... 01100 11100. X I 94 P1 SR orc[.] OR with Complement
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 16 of 17)
011000 ..... ..... ..... ..... ...... D I 91 P1 ori OR Immediate
011001 ..... ..... ..... ..... ...... D I 92 P1 oris OR Immediate Shifted
010100 ..... ..... ..... ..... ...... M I 102 P1 SR rlwimi[.] Rotate Left Word Immediate then Mask Insert
010101 ..... ..... ..... ..... ...... M I 101 P1 SR rlwinm[.] Rotate Left Word Immediate then AND with Mask
010111 ..... ..... ..... ..... ...... M I 102 P1 SR rlwnm[.] Rotate Left Word then AND with Mask
011111 ..... ..... ..... 00000 11000. X I 106 P1 SR slw[.] Shift Left Word
011111 ..... ..... ..... 11000 11000. X I 107 P1 SR sraw[.] Shift Right Algebraic Word
011111 ..... ..... ..... 11001 11000. X I 107 P1 SR srawi[.] Shift Right Algebraic Word Immediate
011111 ..... ..... ..... 10000 11000. X I 106 P1 SR srw[.] Shift Right Word
100110 ..... ..... ..... ..... ...... D I 55 P1 stb Store Byte
100111 ..... ..... ..... ..... ...... D I 55 P1 stbu Store Byte with Update
011111 ..... ..... ..... 00111 10111/ X I 55 P1 stbux Store Byte with Update Indexed
011111 ..... ..... ..... 00110 10111/ X I 55 P1 stbx Store Byte Indexed
110110 ..... ..... ..... ..... ...... D I 147 P1 stfd Store Floating Double
110111 ..... ..... ..... ..... ...... D I 147 P1 stfdu Store Floating Double with Update
011111 ..... ..... ..... 10111 10111/ X I 147 P1 stfdux Store Floating Double with Update Indexed
011111 ..... ..... ..... 10110 10111/ X I 147 P1 stfdx Store Floating Double Indexed
110100 ..... ..... ..... ..... ...... D I 146 P1 stfs Store Floating Single
110101 ..... ..... ..... ..... ...... D I 146 P1 stfsu Store Floating Single with Update
011111 ..... ..... ..... 10101 10111/ X I 146 P1 stfsux Store Floating Single with Update Indexed
011111 ..... ..... ..... 10100 10111/ X I 146 P1 stfsx Store Floating Single Indexed
101100 ..... ..... ..... ..... ...... D I 56 P1 sth Store Hword
011111 ..... ..... ..... 11100 10110/ X I 61 P1 sthbrx Store Hword Byte-Reverse Indexed
101101 ..... ..... ..... ..... ...... D I 56 P1 sthu Store Hword with Update
011111 ..... ..... ..... 01101 10111/ X I 56 P1 sthux Store Hword with Update Indexed
011111 ..... ..... ..... 01100 10111/ X I 56 P1 sthx Store Hword Indexed
101111 ..... ..... ..... ..... ...... D I 63 P1 stmw Store Multiple Word
011111 ..... ..... ..... 10110 10101/ X I 66 P1 stswi Store String Word Immediate
011111 ..... ..... ..... 10100 10101/ X I 66 P1 stswx Store String Word Indexed
100100 ..... ..... ..... ..... ...... D I 57 P1 stw Store Word
011111 ..... ..... ..... 10100 10110/ X I 61 P1 stwbrx Store Word Byte-Reverse Indexed
100101 ..... ..... ..... ..... ...... D I 57 P1 stwu Store Word with Update
011111 ..... ..... ..... 00101 10111/ X I 57 P1 stwux Store Word with Update Indexed
011111 ..... ..... ..... 00100 10111/ X I 57 P1 stwx Store Word Indexed
011111 ..... ..... ..... .0000 01000. XO I 71 P1 SR subfc[o][.] Subtract From Carrying
011111 ..... ..... ..... .0100 01000. XO I 72 P1 SR subfe[o][.] Subtract From Extended
001000 ..... ..... ..... ..... ...... D I 71 P1 SR subfic Subtract From Immediate Carrying
011111 ..... ..... ///// .0111 01000. XO I 72 P1 SR subfme[o][.] Subtract From Minus One Extended
011111 ..... ..... ///// .0110 01000. XO I 73 P1 SR subfze[o][.] Subtract From Zero Extended
011111 ///.. ///// ///// 10010 10110/ X II 877 P1 sync Synchronize
011111 ////. ///// ..... 01001 10010/ X III 1034 P1 H 64 tlbie TLB Invalidate Entry
011111 ..... ..... ..... 00000 00100/ X I 89 P1 tw Trap Word
000011 ..... ..... ..... ..... ...... D I 89 P1 twi Trap Word Immediate
011111 ..... ..... ..... 01001 11100. X I 93 P1 SR xor[.] XOR
011010 ..... ..... ..... ..... ...... D I 92 P1 xori XOR Immediate
011011 ..... ..... ..... ..... ...... D I 92 P1 xoris XOR Immediate Shifted
Figure 88. Power ISA Instruction Set Sorted by Version (Sheet 17 of 17)
1. Key to Instruction column (primary and extended opcode bits shaded in gray).
/ Instruction bit that corresponds to a reserved field; it must have a value of 0, otherwise the instruction is an invalid form.
. Instruction bit that corresponds to an operand bit, may have a value of either 0 or 1.
0 Instruction bit having a value 0.
1 Instruction bit having a value 1.
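The key above can be read as a recipe for matching a 32-bit instruction word against a row of the table. The following is a minimal sketch of that reading, not part of the architecture: the helper names `compile_pattern` and `classify` are hypothetical, and the example words (0x7C642A14 for `add r3,r4,r5`, 0x7C0004AC for `sync`) were assembled by hand from the corresponding rows. Bit 0 is taken to be the most significant bit, following the bit numbering in the column header.

```python
def compile_pattern(pattern):
    """Turn an Instruction-column pattern into (opcode_mask, opcode_value, reserved_mask).

    '0'/'1' are fixed opcode bits, '.' is an operand bit (either value),
    '/' is a reserved bit that must be 0 for a valid form.
    """
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 instruction bits, got %d" % len(bits))
    opcode_mask = opcode_value = reserved_mask = 0
    for ch in bits:  # leftmost character is bit 0, the most significant bit
        opcode_mask <<= 1
        opcode_value <<= 1
        reserved_mask <<= 1
        if ch in "01":
            opcode_mask |= 1
            opcode_value |= int(ch)
        elif ch == "/":
            reserved_mask |= 1
        elif ch != ".":
            raise ValueError("unexpected character %r in pattern" % ch)
    return opcode_mask, opcode_value, reserved_mask


def classify(word, pattern):
    """Return 'match', 'invalid form', or 'no match' for a 32-bit word."""
    opcode_mask, opcode_value, reserved_mask = compile_pattern(pattern)
    if word & opcode_mask != opcode_value:
        return "no match"
    if word & reserved_mask:
        return "invalid form"  # a reserved ('/') bit is set
    return "match"


# Patterns copied from the add[o][.] and sync rows of the table.
ADD = "011111 ..... ..... ..... .1000 01010."
SYNC = "011111 ///.. ///// ///// 10010 10110/"

print(classify(0x7C642A14, ADD))   # add r3,r4,r5
print(classify(0x7C0004AC, SYNC))  # sync
print(classify(0x7C0004AD, SYNC))  # same opcode bits, reserved bit 31 set
```

A real decoder would typically index on the primary opcode (bits 0:5) first and only then compare extended opcode bits; the linear comparison here is chosen only to mirror the key one symbol at a time.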
This appendix lists all the instructions in the Power ISA, sorted by mnemonic.
Instruction(1) Format Book Page Version(2) Privilege(3) Mode Dep(4) Mnemonic Name
Instruction bit numbers (tens digit on the first line, units digit on the second):
000000 00001 11111 11112 22222 222233
012345 67890 12345 67890 12345 678901
011111 ...// ..... ..... 00111 00000/ X I 88 v3.0 cmpeqb Compare Equal Byte
001011 .../. ..... ..... ..... ...... D I 85 P1 cmpi Compare Immediate
011111 .../. ..... ..... 00001 00000/ X I 86 P1 cmpl Compare Logical
001010 .../. ..... ..... ..... ...... D I 86 P1 cmpli Compare Logical Immediate
011111 .../. ..... ..... 00110 00000/ X I 87 v3.0 cmprb Compare Ranged Byte
011111 ..... ..... ///// 00001 11010. X I 98 PPC SR cntlzd[.] Count Leading Zeros Dword
011111 ..... ..... ///// 00000 11010. X I 95 P1 SR cntlzw[.] Count Leading Zeros Word
011111 ..... ..... ///// 10001 11010. X I 98 v3.0 cnttzd[.] Count Trailing Zeros Dword
011111 ..... ..... ///// 10000 11010. X I 95 v3.0 cnttzw[.] Count Trailing Zeros Word
011111 ////. ..... ..... 11000 00110/ X II 858 v3.0 copy Copy
011111 ///// ///// ///// 11010 00110/ X II 860 v3.0 cp_abort CP_Abort
010011 ..... ..... ..... 01000 00001/ XL I 41 P1 crand CR AND
010011 ..... ..... ..... 00100 00001/ XL I 42 P1 crandc CR AND with Complement
010011 ..... ..... ..... 01001 00001/ XL I 42 P1 creqv CR Equivalent
010011 ..... ..... ..... 00111 00001/ XL I 41 P1 crnand CR NAND
010011 ..... ..... ..... 00001 00001/ XL I 42 P1 crnor CR NOR
010011 ..... ..... ..... 01110 00001/ XL I 41 P1 cror CR OR
010011 ..... ..... ..... 01101 00001/ XL I 42 P1 crorc CR OR with Complement
010011 ..... ..... ..... 00110 00001/ XL I 41 P1 crxor CR XOR
111011 ..... ..... ..... 00000 00010. X I 195 v2.05 dadd[.] DFP Add
111111 ..... ..... ..... 00000 00010. X I 195 v2.05 daddq[.] DFP Add Quad
011111 ..... ///.. ///// 10111 10011/ X I 79 v3.0 darn Deliver A Random Number
011111 ///.. ..... ..... 00010 10110/ X II 854 PPC dcbf Data Cache Block Flush
011111 ///// ..... ..... 00001 10110/ X II 853 PPC dcbst Data Cache Block Store
011111 ..... ..... ..... 01000 10110/ X II 851 PPC dcbt Data Cache Block Touch
011111 ..... ..... ..... 00111 10110/ X II 852 PPC dcbtst Data Cache Block Touch for Store
011111 ///// ..... ..... 11111 10110/ X II 853 P1 dcbz Data Cache Block Zero
111011 ..... ///// ..... 11001 00010. X I 217 v2.06