You are on page 1of 103

Advanced Tutorial on Hardware

Design and Petri nets

Jordi Cortadella Univ. Politècnica de Catalunya
Luciano Lavagno Università di Udine
Alex Yakovlev Univ. Newcastle upon Tyne

Tutorial Outline
Introduction
• Modeling Hardware with PNs
• Synthesis of Circuits from PN
specifications
• Circuit verification with PNs
• Performance analysis using PNs

Introduction.Outline
• Role of Hardware in modern systems
• Role of Hardware design tools
• Role of a modeling language
• Why Petri nets are good for Hardware
Design
• History of “relationship” between
Hardware Design and Petri nets
• Asynchronous Circuit Design

Role of Hardware in modern systems • Technology soon allows putting 1 billion transistors on a chip • Systems on chip is a reality – 1 billion operations per second • Hardware and software designs are no longer separate • Hardware becomes distributed. asynchronous and concurrent .

Role of Hardware design tools • Design productivity is a problem due to chip complexity and time to market demands • Need for well-integrated CAD with simulation. synthesis. verification and testing tools • Modelling of system behaviour at all levels of abstraction with feedback to the designer • Design re-use is a must but with max technology independence .

waveforms. size and power . timing diagrams. traditionally used by logic designers) • Today’s hardware description languages allow high level of abstraction • Models must allow for equivalence-preserving refinements • They must allow for non-functional qualities such as speed. Role of Modelling Language • Design methods and tools require good modelling and specification techniques • Those must be formal and rigorous and easy to comprehend (cf.

Why Petri nets are good • Finite State Machine is still the main formal tool in hardware design but it may be inadequate for distributed. concurrent and asynchronous hardware • Petri nets: – simple and easy to understand graphical capture – modelling power adjustable to various types of behaviour at different abstraction levels – formal operational semantics and verification of correctnes (safety and liveness) properties – possibility of mechanical synthesis of circuits from net models .

Petrify • 2000’s: Powerful asynchronous design flow . …) • 1970’s: Toward Parellel Computations (MIT. St. Toulouse. Forcage. Signal Transition Graphs (STGs) • 1990’s: First asynchronous design (verification and synthesis) tools: SIS. Concurrency theory. Petri. Karp & Miller. Petersburg. A bit of history of their marriage • 1950’s and 60’s: Foundations (Muller & Bartky. Manchester …) • 1980’s: First progress in VLSI and CAD.

Introduction to Asynchronous Circuits • What is an asynchronous circuit? – Physical (analogue) level – Logical level – Speed-independent and delay-insensitive circuits • Why go asynchronous? • Why control logic? • Role of Petri nets • Asynchronous circuit design based on Petri nets .

circuits are self-timed or self-clocked • Can be viewed as hardwired versions of parallel and distributed programs – statements are activated when their guards are true • No special run-time mechanism – the “program statements” are physical components: logic gates. or hierarchical modules • Interconnections are also physical components: wires. busses . memory latches. What is an asynchronous circuit • No global clock.

Synchronous Design
Clock

Data input

Data Register
Register
Sender Logic Clock
Receiver

Tsetup Thold

Timing constraint: input data must stay unchanged within a
setup/hold window around clock event. Otherwise, the latch may fail
(e.g. metastability)

Asynchronous Design
Req(est)
Ack(nowledge)

Data input

Data Register
Register
Sender Logic Req
Receiver

Ack

Req/Ack (local) signal handshake protocol instead of global clock
Causal relationship
Handshake signals implemented with completion detection in data
path

Physical (Analogue) level
• Strict view: an asynchronous circuit is a (analogue)
dynamical system – e.g. to be described by
differential equations
• In most cases can be safely approximated by logic
level (0-to-1 and 1-to-0 transitions) abstraction; even
hazards can be captured
• For some anomalous effects, such as metastability
and oscillations, absolute need for analogue models
• Analogue aspects are not considered in this tutorial
(cf. reference list)

directly or transitively) • The order is partial if concurrency is present • A class of async timed (yet not clocked!) circuits allows special timing order relations (a occurs before b. Logical Level • Circuit behaviour is described by sequences of up (0- to-1) and down (1-to-0) transitions on inputs and outputs • The order of transitions is defined by causal relationship. not by clock (a causes b. due to delay assumptions) .

Simple circuit example ack1 req1 x + C y out=(x+y)*(a+b) req3 req2 ack2 ack3 a + * out b .

Simple circuit example ack1 req1 x + C y out=(x+y)*(a+b) req3 req2 ack2 ack3 a + * out b x y + out * a + b Data flow graph .

Simple circuit example ack1 req1 x + C y out=(x+y)*(a+b) req3 req2 ack2 ack3 a + * out b x req1 ack1 y + out req3 ack3 * a req2 ack2 + b Data flow graph Control flow graph – Petri net .

Muller C-element Key component in asynchronous circuit design – like a Petri net transition x1 y=x1*x2+(x1+x2)y x2 C y Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – like a Petri net transition x1 y=x1*x2+(x1+x2)y x2 C y Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – like a Petri net transition x1 0 0 y=x1*x2+(x1+x2)y x2 0 C y Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – like a Petri net transition 0->1 x1 0 y=x1*x2+(x1+x2)y x2 0 C y Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – behaves like a Petri net transition 0->1 x1 0* y=x1*x2+(x1+x2)y x2 0->1 C y excited Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – behaves like a Petri net transition 1 x1 0* y=x1*x2+(x1+x2)y x2 1 C y excited Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – behaves like a Petri net transition 1 x1 1 y=x1*x2+(x1+x2)y x2 1 C y stable (new Set-part Reset-part value) .

Muller C-element Key component in asynchronous circuit design – behaves like a Petri net transition 1 x1 1 y=x1*x2+(x1+x2)y x2 1 C y Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – behaves like a Petri net transition 1->0 x1 1 y=x1*x2+(x1+x2)y x2 1 C y Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – behaves like a Petri net transition 1->0 x1 1* y=x1*x2+(x1+x2)y x2 1->0 C y excited Set-part Reset-part .

Muller C-element Key component in asynchronous circuit design – behaves like a Petri net transition 0 x1 0 y=x1*x2+(x1+x2)y x2 0 C y stable (new Set-part Reset-part value) .

Muller C-element Key component in asynchronous circuit design – like a Petri net transition x1 y=x1*x2+(x1+x2)y x2 C y Set-part Reset-part It acts symmetrically for pairs of 0-1 and 1-0 transitions – waits for both input events to occur .

Muller C-element Key component in asynchronous circuit design – like a Petri net transition x1 y=x1*x2+(x1+x2)y x2 C y Set-part Reset-part It acts symmetrically for pairs of 0-1 and 1-0 transitions – waits for both input events to occur .

Muller C-element x1 y=x1*x2+(x1+x2)y x2 C y Set-part Reset-part Power NMOS circuit y implementation x1 x2 x1 x2 Ground .

Muller C-element x1 y=x1*x2+(x1+x2)y x2 C y Set-part Reset-part Power y x1 x2 x1 x2 Ground .

Muller C-element x1 y=x1*x2+(x1+x2)y x2 C y Set-part Reset-part Power y x1 x2 x1 x2 Ground .

not max delays) • Robustness (operationally scalable. well-defined interfaces) • Testability (inherent self-checking via ack signals) . no clock distribution. important when gate-to-wire delay ratio changes) • Low Power (‘change-based’ computing – fewer signal transitions) • Low Electromagnetic Emission (more even power/frequency spectrum) • Modularity and re-use (parts designed independently. Why asynchronous is good • Performance (work on actual.

trained ‘with clock’ – biggest obstacle • Overbalancing effect of periodic (every 10 years) ‘asynchronous euphoria’ . Obstacles to Async Design • Design tool support – commercial design tools are aimed at clocked systems • Difficulty of production testing – production testing is heavily committed to use of clock • Aversion of majority of designers.

or a modulo-N counter • Their behaviour is a combination of partial orders of signal events • Examples of data-dominated logic are: a register bank or an arithmetic-logic unit (ALU) . so the distinction is relative • Examples of control-dominated logic: a bus interface adapter. an arbiter. Why control logic • Customary in hardware design to separate control logic from datapath logic due to different design techniques • Control logic implements the control flow of a (possibly concurrent) algorithm • Datapath logic deals with operational part of the algorithms • Datapath operations may have their (lower level) control flow elements.

deterministic and non-deterministic choice in the circuit and its environment • They allow: – composition of labelled PNs (transition or place sync/tion) – refinement of event annotation (from abstract operations down to signal transitions) – use of observational equivalence (lambda-events) – clear link with state-transition models in both directions . Role of Petri Nets • We concentrate here on control logic • Control logic is behaviourally more diverse than data path • Petri nets capture causality and concurrency between signalling events.

…) Abstract behavioural model Labelled Petri nets (LPNs) Timing Signalling refinement diagrams Verification and Performance analysis Logic behavioural model Signal Transition Graphs (STGs) Syntax-direct translation STG-based logic synthesis (deriving circuit structure) (deriving boolean functions) Library Circuit Decomposition and gate cells netlist mapping . CSP. Design flow with Petri nets Transition systems High-level languages Abstract behaviour synthesis Predicates on traces (VHDL.

Tutorial Outline • Introduction Modeling Hardware with PNs • Synthesis of Circuits from PN specifications • Circuit verification with PNs • Performance analysis using PNs .

processor example • Low level modelling and logic synthesis.Outline • High level modelling and abstract refinement. interface controller example • Modelling of logic circuits: event-driven and level-driven parts • Properties analysed . Modelling.

High-level modelling:Processor Instruction Example Instruction Fetch Execution .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution (not exactly yet!) One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution (now it is!) One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

High-level modelling:Processor Instruction Example Instruction Fetch Execution One-word Instruction Decode One-word Instruction Execute Memory Read Two-word Program Memory Instruction Counter Address Execute Two-word Update Register Load Instruction Instruction Register Load Decode .

182-191. A. High-level modelling:Processor Example • The details of further refinement.2. Koelmans and A. Yakovlev. Designing an asynchronous processor using Petri Nets. L.M. vol. circuit implementation (by direct translation) and performance estimation (using UltraSan) are in: A. Yakovlev. no. Dec. pp. Analysing superscala processor architectures with coloured Petri nets. 17(2):54-64. IEEE Micro. March 1997 • For use of Coloured Petri net models and use of Design/CPN in processor modeling: F. Journal on Software Tools for Technology Transfer. .2. Semenov. Koelmans. A. 1998.Lloyd and A.M.Burns. Int.

Low-level Modelling:Interface Example • Insert VME bus figure 1 – timing diagrams .

Low-level Modelling:Interface Example • Insert VME bus figure 2 .STG .

complete the reference . Petrov. Low-level Modelling:Interface Example • Details of how to model interfaces and design controllers are in: A.Yakovlev and A.

Low-level Modelling:Interface Example • Insert VME bus figure 3 – circuit diagram .

Logic Circuit Modelling Event-driven elements Petri net equivalents C Muller C- element Toggle .

Logic Circuit Modelling Level-driven elements Petri net equivalents x(=1) y(=0) x=0 y=1 y=0 x=1 NOT gate x=0 x(=1) z(=0) z=1 y(=1) y=0 NAND gate z=0 b x=1 y=1 .

Event-driven circuit example • Insert the eps file for fast fwd pipeline cell control .

Level-driven circuit example • Insert the eps file for the example with two inverters and OR gate .

Properties analysed • Functional correctness (need to model environment) • Deadlocks • Hazards • Timing constraints – Absolute (need for Time(d) Petri nets) – Relative (compose with a PN model of order conditions) .

Adequacy of PN modelling • Petri nets have events with atomic action semantics • Asynchronous circuits may exhibit behaviour that does not fit within this domain – due to inertia a ab 0->1 0* 00 a 10 01 b 0->1 11 0* b .

Other modelling examples • Examples with mixed event and level based signalling: – “Lazy” token ring arbiter spec – RGD arbiter with mutex .

Lazy ring adaptor Lr R R G D dum dum Lr Rr G Ring Rr La Ra adaptor La Ra D t=0 (token isn’t initially here) t=0 t=1 .

Lazy ring adaptor Lr R R G D dum dum Lr Rr G Ring Rr La Ra adaptor La Ra D t=0->1->0 (token must be taken from the right and past to the left t=0 t=1 .

Lazy ring adaptor Lr R R G D dum dum Lr Rr G Ring Rr La Ra adaptor La Ra D t=1 (token is already here) t=0 t=1 .

Lazy ring adaptor Lr R R G D dum dum Lr Rr G Ring Rr La Ra adaptor La Ra D t=0->1 (token must be taken from the right) t=0 t=1 .

Lazy ring adaptor Lr R R G D dum dum Lr Rr G Ring Rr La Ra adaptor La Ra D t=1 (token is here) t=0 t=1 .

Tutorial Outline • Introduction • Modeling Hardware with PNs Synthesis of Circuits from PN specifications • Circuit verification with PNs • Performance analysis using PNs .

Outline • Abstract synthesis of LPNs from transition systems and characteristic trace specifications • Handshake and signal refinement (LPN-to-STG) • Direct translation of LPNs and STGs to circuits Examples • Logic synthesis from STGs Examples . Synthesis.

Synthesis from trace specs • Modelling behaviour in terms of characteristic predicates on traces (produce LPN “snippets”) • Construction of LPNs as compositions of snippets • Examples: n-place buffer. 2-way merge .

Synthesis from transition systems • Modelling behaviour in terms of a sequential capture – transition system • Synthesis of LPN (distributed and concurrent object) from TS (using theory of regions) • Examples: one place buffer. counterflow pp .

…) • Synthesis of LPN (concurrent object with explicit causality) from process-based model (concurrency is explicit but causality implicit) • Examples: modulo-N counter . Synthesis from process-based languages • Modelling behaviour in terms of a process (-algebraic) specifications (CSP.

and introduction of “silent” events • Handshake refinement • Signalling protocol refinement (return-to-zero versus NRZ) • Arbitration refinement • Brief comment on what is implemented in Petrify and what isn’t yet . Refinement at the LPN level • Examples of refinements.

Translation of LPNs to circuits
• Examples of refinements, and introduction of
“silent” events

Why direct translation?
• Logic synthesis has problems with state
space explosion, repetitive and regular
structures (log-based encoding approach)
• Direct translation has linear complexity but
can be area inefficient (inherent one-hot
encoding)
What about performance?

Direct Translation of Petri Nets
• Previous work dates back to 70s
• Synthesis into event-based (2-phase) circuits
(similar to micropipeline control)
– S.Patil, F.Furtek (MIT)
• Synthesis into level-based (4-phase) circuits
(similar to synthesis from one-hot encoded FSMs)
– R. David (’69, translation FSM graphs to CUSA cells)
– L. Hollaar (’82, translation from parallel flowcharts)
– V. Varshavsky et al. (’90,’96, translation from PN into
an interconnection of David Cells)

Synthesis into event-based circuits • Patil’s translation method for simple PNs • Furtek’s extension to 1-safe net • “Pragmatic” extensions to Patil’s set (for non- simple PNs) • Examples: modulo-N counter. Lazy ring adapter .

VME bus. Synthesis into level-based circuits • David’s method for FSMs • Holaar’s extensions to parallel flow charts • Varshavsky’s method for 1-safe Petri nets • Examples: counter. butterfly circuit .

David’s original approach a x’1 yb x1 x2 b d ya yc c x’2 x1 x’2 Fragment of flow graph CUSA for storing state b .

Hollaar’s approach M (0) (1) K A N (1) M N B (1) (1) L L K 1 (0) 1 A B Fragment of flow-chart One-hot circuit cell .

Hollaar’s approach M 1 0 K A N (1) M N B 0 (1) L L K 1 (0) 1 A B Fragment of flow-chart One-hot circuit cell .

Hollaar’s approach M 1 0 K A N (1) M N B 1 (1) L L K 0 (0) 1 A B Fragment of flow-chart One-hot circuit cell .

Varshavsky’s Approach Controlled p1 Operation p2 p1 p2 (1) (0) (0) (1) (1) 1* To Operation .

Varshavsky’s Approach p1 p2 p1 p2 (1) (0) 0->1 1->0 1->0 (1) .

Varshavsky’s Approach p1 p2 p1 p2 1->0 0->1 0->1 1->0 1->0->1 1* .

Design Methods.Async.Translation in brief This method has been used for designing control of a token ring adaptor [Yakovlev et al.. 1995] The size of control was about 80 David Cells with 50 controlled hand shakes .

Direct translation examples In this work we tried direct translation: • From STG-refined specification (VME bus controller) – Worse than logic synthesis • From a largish abstract specification with high degree of repetition (mod-6 counter) – Considerable gain to logic synthesis • From a small concurrent specification with dense coding space (“butterfly” circuit) – Similar or better than logic synthesis .

DSw- + D . L D T A C K O U T P U T S : D . / 1 r e q D S w + L D S - ( 1 ) ( 1 ) ( 1 ) 1 * 1 * LDTACK. D T A C K . & ( 1 ) L D T A C K + / 1 ( 1 ) ( 1 ) D + / 1 a c k ( 1 ) ( 1 ) D S r - ( 1 ) ( 1 ) D. / 2 r e q D + / 2 r e q L D S + / 2 & D S w - LDS+/1 L D T A C K + / 2 D . 1 0 ( 1 ) 1 0 0 1 (0) (1) (0) + + p 2 p2 p 1 p 4 D . Example 1: VME bus controller p 0 D S w + D S r + D T A C K - Result of direct translation (DC unoptimised): p 1 D + / 1 L D S + LDTACK+ D+ 0 1 0 1 0 1 0 1 ( 1 ) ( 1 ) ( 1 ) (1) & DTACK+ p r 1 p r 2 p r 3 p r 4 L D S + / 1 D + / 1 r e q D T A C K + / 1 DSr. + LDS. DTACK+/1 ( 1 ) 0 1 0 1 0 1 0 1 ( 1 ) ( 1 ) ( 1 ) & p3 D-/1 p w 1 p w 2 p w 3 p w 4 D T A C K + / 2 D . L D S . / 2 a c k ( 1 ) D + / 2 a c k ( 1 ) ( 1 ) ( 1 ) ( 1 ) ( 1 ) ( 1 ) ( 1 ) LDTACK+/1 I N P U T S : D S r . / 1 a c k D S r + L D T A C K - D T A C K . D S w .

/ 1 r e q ( 1 ) & D . / 2 r e q & L D S + / 2 D + / 2 r e q (1) L D T A C K + / 2 D . VME bus controller After DC-optimisation (in the style of Varshavsky et al WODES’96) 0 1 ( 1 ) D . + D S r + + LDS- D T A C K - & D S w + 1* (1) 1* (1) (1) 0 1 ( 1 ) & pw 1 pw 2 (1) D . / 1 a c k (1) p r 1 pr2 & L D S + / 1 D + / 1 r e q D T A C K + / 1 L D T A C K + / 1 D + / 1 a c k (1) (1) D S r - ( 1 ) ( 1 ) ( 1 ) 1 0 01 + + p 1 (1) p2 & LDTACK. / 2 a c k D + / 2 a c k ( 1 ) ( 1 ) ( 1 ) ( 1 ) (1) D T A C K + / 2 D S w - .

David Cell library D C 1 0 D C 4 0 1 1 0 1 0 1 1 1 1 1 p p 1 1 1 1 + 1 0 D C 2 D C 5 1 0 1 0 1 0 1 + 1 p 1 1 1 p 1 & 1 1 1 1 1 1 D C 6 D C 3 0 0 0 1 0 1 1 & 1 + 1 p 1 1 p 1 + 1 1 1 1 1 1 0 D C 7 1 0 1 + 1 1 p 1 & 1 + & 1 1 1 1 .

DSR/DSW): D T A C K + / 1 ( r ) D T A C K . ( 1 ) D T A C K + / 2 ( w ) 3 w i r e D S w - ( 1 ) D S r + h / s D S r + D T A C K - D T A C K - ( 1 ) D S w + ( 1 ) ( 1 ) D S w + ( 1 ) D T A C K D S r D S w ( 1 ) ( 0 ) ( 0 ) D T A C K D S r D S w . D S R / D S w h a n d s h a k e : D S r - ( 1 ) ( 1 ) D T A C K + / 2 ( w ) D S w - D T A C K + / 1 ( r ) ( 1 ) D S r . “Data path” control logic Example of interface with a handshake control (DTACK.

p?.q!)5. Example 2: “Flat” mod-6 Counter TE-like Specification: ((p?.c!)* Petri net (5-safe): q! 5 p? c! 5 .

“Flat” mod-6 Counter Refined (by hand) and optimised (by Petrify) Petri net: .

“Flat” mod-6 counter Result of direct translation (optimised by hand): .

David Cells and Timed circuits (a) Speed-independent (b) With Relative Timing .

“Flat” mod-6 counter (a) speed-independent (b) with relative timing .

“Butterfly” circuit Initial Specification: STG after CSC resolution: a+ b+ x+ a+ a- y+ z+ x- dummy a. z- . b- b+ b- y.

“Butterfly” circuit Speed-independent logic synthesis solution: x ( 0 ) ( 1 ) a b 0 * 0 * ( 0 ) ( 0 ) y z ( 1 ) ( 1 ) ( 0 ) ( 1 ) ( 1 ) ( 0 ) .

“Butterfly” circuit Speed-independent DC-circuit: ( 1 ) ( 0 1 ) ( 1 0 ) ( 0 ) ( 1 ) a + a - ( 1 ) ( 1 ) 1 * ( 0 1 ) ( 1 ) ( 1 ) & ( 0 1 ) ( 1 0 ) ( 0 ) ( 1 ) ( 1 ) ( 1 ) b + b - & 1 * ( 1 ) .

“Butterfly” circuit DC-circuit with aggressive relative timing: a an pa2 pa2n p a 1 p a 1 n ( 1 ) 1* ( 0 ) ( 1 ) ( 0 ) ( 1 ) ( 1 ) ta1 p pn ( 0 ) (1) ( 1 ) pb2 pb2n p b 1 p b 1 n ( 1 ) 1* (0) (1) b bn ( 0 ) ( 1 ) ( 1 ) tb1 .