LOGIC-LEVEL SYNTHESIS AND OPTIMIZATION
FIGURE 8.22 Example of a two-way fanout stem.
at x1 and x2 and let f^{x1,x2}(δ1, δ2) denote the input/output behavior of the perturbed network. The observability along the path including (v_x, v_y) is computed assuming that δ1 and δ2 are independent perturbations. In particular, ODC_{x,y} = f^{x1,x2}(1, δ2) ⊕̄ f^{x1,x2}(0, δ2). Similarly, ODC_{x,z} = f^{x1,x2}(δ1, 1) ⊕̄ f^{x1,x2}(δ1, 0). Note that f^{x1,x2}(1, 1) ⊕̄ f^{x1,x2}(0, 0) is the ODC set of variable x in the original network, because it captures the conditions under which a consistent change in polarity of x is propagated to the outputs along the paths through v_y and v_z. We use now an algebraic manipulation² to relate the previous expressions for ODC_{x,y} and ODC_{x,z} to ODC_x, namely:

ODC_x = ODC_{x,y}|_{δ2=1} ⊕̄ ODC_{x,z}|_{δ1=0}    (8.9)
This formula yields a way of constructing the ODC set of vertex v_x from those of the edges (v_x, v_y), (v_x, v_z). By symmetry, by exchanging y with z (and the corresponding ODC sets), a similar formula can be derived:

ODC_x = ODC_{x,y}|_{δ2=0} ⊕̄ ODC_{x,z}|_{δ1=1}    (8.10)
Note that for two-way fanout stems there is no need for explicit edge variables. Indeed, any edge ODC set can depend only on the copy of the variable associated with the other edge. Since no confusion can arise, the edge variables can be the same as the corresponding vertex variable. Hence we can rewrite Equations 8.9 and 8.10 as:
ODC_x = ODC_{x,y}|_{x=x'} ⊕̄ ODC_{x,z}    (8.11)

ODC_x = ODC_{x,y} ⊕̄ ODC_{x,z}|_{x=x'}    (8.12)
²Recall that a ⊕̄ 1 = a and that b ⊕̄ b = 1 for any variable or function a, b. Hence a ⊕̄ (b ⊕̄ b) = a.
This is not the case for multi-way fanout stems, because an edge ODC set may depend on different copies of the related vertex variable. Hence, edge variables must be recorded and distinguished.

Example 8.4.12. Let us consider again the network of Example 8.4.11, shown in Figure 8.21 (b). Since there is a single output, we use the scalar notation ODC_{a,b} = x1 + a + x4 and ODC_{a,c} = x1 + a + x4. Then ODC_a = ODC_{a,b}|_{a=a'} ⊕̄ ODC_{a,c}. Hence ODC_a = (x1 + a' + x4) ⊕̄ (x1 + a + x4) = x1 + x4, which represents exactly the conditions under which variable a is not observed at the output.
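The computation in Example 8.4.12 can be checked exhaustively. The sketch below (illustrative, not part of the text) enumerates all assignments of x1, a, x4 and confirms that the XNOR of the two edge expressions, with the substitution a = a' applied to the first, equals x1 + x4; here `xnor` plays the role of the barred ⊕ operator.

```python
from itertools import product

def xnor(p, q):
    """p XNOR q (agreement), the barred-plus operator of the text."""
    return not (p ^ q)

for x1, a, x4 in product([False, True], repeat=3):
    odc_ab_sub = x1 or (not a) or x4   # ODC_{a,b} restricted with a = a'
    odc_ac     = x1 or a or x4         # ODC_{a,c} = x1 + a + x4
    assert xnor(odc_ab_sub, odc_ac) == (x1 or x4)   # ODC_a = x1 + x4
print("ODC_a = x1 + x4 verified on all 8 assignments")
```

When x1 + x4 holds, both edge conditions are true and the XNOR is true; when x1 = x4 = 0, the two expressions reduce to a' and a, which always disagree.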
This result can be generalized to more than two fanout stems. The following theorem shows how the ODC of a vertex can be computed from those of its outgoing edges in the general case.

Theorem 8.4.1. Let v_x ∈ V be any internal or input vertex of a logic network. Let {x_i, i = 1, 2, ..., p} be the variables associated with the edges {(v_x, v_{y_i}); i = 1, 2, ..., p} and ODC_{x,y_i}, i = 1, 2, ..., p, the corresponding ODC sets. The observability don't care set at a vertex is given by:

ODC_x = ⊕̄_{i=1}^{p} ODC_{x,y_i}|_{x_{i+1}=···=x_p=x'}    (8.13)
Proof. Let f^x(x_1, x_2, ..., x_p) describe the modified network with independent perturbations {δ_i, i = 1, 2, ..., p} on variables {x_i, i = 1, 2, ..., p}. The ODC set for v_x is:

ODC_x = f^x(1, 1, ..., 1, 1) ⊕̄ f^x(0, 0, ..., 0, 0)    (8.14)

It can be rewritten as:

ODC_x = (f^x(1, 1, ..., 1) ⊕̄ f^x(0, 1, ..., 1)) ⊕̄ ··· ⊕̄ (f^x(0, 0, ..., 1) ⊕̄ f^x(0, 0, ..., 0))    (8.15)

Equivalently:

ODC_x = ODC_{x,y_1}|_{δ_2=···=δ_p=1} ⊕̄ ··· ⊕̄ ODC_{x,y_p}|_{δ_1=···=δ_{p-1}=0}    (8.16)

ODC_x = ODC_{x,y_1}|_{x_2=···=x_p=x'} ⊕̄ ··· ⊕̄ ODC_{x,y_p}    (8.17)

which can be rewritten as:

ODC_x = ⊕̄_{i=1}^{p} ODC_{x,y_i}|_{x_{i+1}=···=x_p=x'}    (8.18)
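The step from 8.14 to 8.16 rests only on the pairwise cancellation of the intermediate terms under the ⊕̄ operator. The following sketch (illustrative only) checks the cancellation identity exhaustively and then verifies the telescoping for p = 3 over every Boolean function of the three perturbation bits.

```python
from itertools import product

def xnor(p, q):
    """p XNOR q, the operator written as a barred plus in the text."""
    return not (p ^ q)

# The pairwise cancellation identity behind the telescoping proof.
for a, b, c in product([False, True], repeat=3):
    assert xnor(xnor(a, b), xnor(b, c)) == xnor(a, c)

# Telescoping for p = 3: the XNOR of the three consecutive differences equals
# f(1,1,1) xnor f(0,0,0) for every one of the 256 functions of three bits.
points = list(product([False, True], repeat=3))
for values in product([False, True], repeat=8):
    f = dict(zip(points, values))
    t1 = xnor(f[(True, True, True)],   f[(False, True, True)])
    t2 = xnor(f[(False, True, True)],  f[(False, False, True)])
    t3 = xnor(f[(False, False, True)], f[(False, False, False)])
    assert xnor(xnor(t1, t2), t3) == xnor(f[(True, True, True)], f[(False, False, False)])
print("telescoping identity verified for all 256 functions")
```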
This theorem allows us to devise an algorithm, shown in Algorithm 8.4.2, for the computation of the internal and input ODC sets, starting from ODC_out. The algorithm traverses the network from the primary outputs to the primary inputs. For each vertex being considered, the ODC sets associated with its outgoing edges are computed first as the union of the ODC sets of the direct successors and the complement of the local Boolean difference. Then, the ODC set for the vertex is computed using Theorem 8.4.1.
OBSERVABILITY( G_n(V, E), ODC_out ) {
    foreach vertex v_x ∈ V in reverse topological order {       /* consider all direct successors of v_x */
        for (i = 1 to p)
            ODC_{x,y_i} = (∂f_{y_i}/∂x)' 1 + ODC_{y_i};         /* compute edge ODC set */
        ODC_x = ⊕̄_{i=1}^{p} ODC_{x,y_i}|_{x_{i+1}=···=x_p=x'};  /* compute vertex ODC set */
    }
}

ALGORITHM 8.4.2
Note that this step is trivial when the vertex has only one direct successor, because its ODC set equals that of the outgoing edge. Vertices are sorted in a reverse topological order consistent with the partial order induced by the network graph. The complexity of the algorithm is linear in the size of the graph if we bound the complexity of performing the cofactor operation, which may require re-expressing local functions with different variables. For each variable, the intersection of the elements of the corresponding ODC vector yields the conditions under which that variable is not observed at any output. This set is called the global ODC of a variable. The vector of dimension n_i whose entries are the global ODC conditions of the input variables is the input observability don't care set ODC_in.
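As a companion to the algorithm, the following Python sketch computes an ODC set by brute force on a small assumed network (x = ab, y = bc, z = xy; the network and signal names are hypothetical). It checks the defining condition directly, rather than implementing the linear-time reverse-topological traversal of Algorithm 8.4.2.

```python
from itertools import product

def z_with(s_name, s_val, a, b, c):
    # Evaluate the output z with internal signal s_name forced to s_val.
    x = s_val if s_name == "x" else (a and b)
    y = s_val if s_name == "y" else (b and c)
    return x and y

def odc(s_name):
    result = []
    for a, b, c in product([False, True], repeat=3):
        observed = z_with(s_name, True, a, b, c) != z_with(s_name, False, a, b, c)
        result.append(not observed)   # pattern is in the ODC if the flip is not observed
    return result

# For this network, flipping x matters only when y = bc is true, so
# ODC_x = y' = b' + c'.
assert odc("x") == [(not b) or (not c) for a, b, c in product([False, True], repeat=3)]
print("ODC_x = b' + c' confirmed by exhaustive simulation")
```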
Example 8.4.13. Consider again the network of Figure 8.18. The ODC sets at vertices v_d and v_e are ODC_d = [0 1]^T and ODC_e = [1 0]^T, which means that d is fully observable at the first output (i.e., empty ODC) and not observable at the second one (full ODC). Opposite considerations apply to e.

First, consider vertex v_b and edge (v_b, v_d): ODC_{b,d} = (∂f_d/∂b)' 1 + ODC_d, where c' = (∂f_d/∂b)'. Similarly, ODC_{b,e} = (∂f_e/∂b)' 1 + ODC_e. Therefore:

ODC_b = ODC_{b,d}|_{b=b'} ⊕̄ ODC_{b,e}

Next, consider vertex v_c, whose edge ODC sets ODC_{c,d} and ODC_{c,e} are derived in the same way from ODC_d and ODC_e. Consider now vertex v_a: its edge ODC sets ODC_{a,b} and ODC_{a,c} are obtained from ODC_b and ODC_c, respectively. Therefore:

ODC_a = ODC_{a,b}|_{a=a'} ⊕̄ ODC_{a,c}
In general, we can then determine the observability of the inputs, which may be used as external ODC sets for the optimization of a network feeding the one under consideration.

ODC sets may be large functions. Differently from the CDC set case, we cannot drop arbitrary elements of the ODC set of a variable and then use the subset to compute the ODC set for some other variable. Indeed, the ⊕̄ operator used to compute the vertex ODC set involves implicitly the complement of the edge ODC sets. Hence, some ODC subset may lead to computing ODC supersets at other vertices, which is not a safe approximation.

Example 8.4.14. Consider again the network of Figure 8.18. Let the ODC sets of edges (v_a, v_b) and (v_a, v_c) be approximated by subsets, and let ODC_a be computed from them by Equation 8.11. Then both components of the result are not subsets of the corresponding components of ODC_a. Hence the approximation is not safe.

Different safe approximations of the ODC sets have been proposed, and they are surveyed and contrasted in reference [17]. Among these, we mention two that are noteworthy, and we show here the formulae for two-way fanout stems, the extension being straightforward. The first can be obtained by expanding the ⊕̄ operator appearing in Equations 8.11 and 8.12 and summing them:

ODC_x = ODC_{x,y}|_{x=x'} · ODC_{x,z} + (ODC_{x,y})'|_{x=x'} · (ODC_{x,z})' + ODC_{x,y} · ODC_{x,z}|_{x=x'} + (ODC_{x,y})' · (ODC_{x,z})'|_{x=x'}    (8.19)

If subsets of the edge ODCs and of their complements are available, a subset of the ODC is returned by substituting the subsets in place of the corresponding full sets. Note that subsets of the ODC sets and of their complements must be computed and stored.

A second approximation method is based on disregarding the complements of the ODC sets and hence on the formula:

ODC_x = ODC_{x,y}|_{x=x'} · ODC_{x,z}

If subsets of the edge ODCs are used as arguments of the above formula, then the equation provides a subset of the actual ODC. In other words, the ODC subsets of the successors are neglected. The simplest approximation [8] is to take the local ODC set to be just the intersection of the complements of the Boolean differences of the immediate successors with respect to the variable under consideration. Other approximations yield subsets of larger size.

Example 8.4.15. Consider again the network of Figure 8.18, and again let the ODC sets of edges (v_a, v_b) and (v_a, v_c) be approximated by subsets. If ODC_a is computed with the safe approximation of Equation 8.19, both components of the result are now subsets of the corresponding components of ODC_a. Therefore such ODC subsets may be used safely for logic optimization.

8.4.2 Boolean Simplification and Substitution

We consider now the problem of simplifying a logic function with the degrees of freedom provided by the local don't care set. We shall then extend this problem to the simultaneous optimization of more than one local function. We define simplification informally as the manipulation of a representation of the local function that yields an equivalent one with smaller size. Equivalence will be defined rigorously later. Two-level logic optimization algorithms are applicable to this problem under the assumption that local functions are represented by two-level sum of products forms. Nevertheless we must consider a few subtle differences with respect to the problems addressed in Chapter 7.

First, the objective of local function simplification in the frame of multiple-level logic optimization should not be the reduction of the number of terms. A more relevant goal is reducing the number of literals, which is directly related to the required local interconnections. It may also be important to minimize the cardinality of the support set of the local function, i.e., the number of local interconnections of v_x in the network. This is obviously dependent on the local environment. Exact and heuristic algorithms for two-level minimization can be modified to optimize the number of literals. In the exact case, prime implicants can be weighted by the number of literals they represent as a product, leading to a minimum weighted (unate) covering problem that can be solved with the usual methods. In the heuristic case, the minimizer may be steered by heuristic rules toward a solution that rewards a maximal reduction of literals (instead of product terms); the expand operator then yields a prime cover that corresponds to a locally minimum-literal solution. A method for minimizing the cardinality of the support set is summarized in reference [6].

The second noteworthy issue in simplifying a local function is that the reduction in literals can be achieved by using a variable which was not originally in the support set. Roughly speaking, this corresponds to "substituting a portion of the function" by that variable. Note that the simplification of a local function in isolation (i.e., without the don't care conditions induced by the network interconnection) would never lead to adding a variable to the support set, because such a variable is apparently unrelated to the function. The don't care conditions help to bridge this gap, because they encapsulate global information of the network and represent the interplay among the different local functions.

Example 8.4.16. Consider again the second case of substitution mentioned in Section 8.2, where we try to substitute q = a + cd into f_h = a + bcd + e to get f_h = a + bq + e. Let us consider the SDC set q ⊕ (a + cd) = q'a + q'cd + qa'(cd)'. The simplification of f_h = a + bcd + e with q'a + q'cd + qa'(cd)' as a don't care set yields f_h = a + bq + e. (Cube bq can replace cube bcd because the minterms where the cubes differ are part of the don't care set.) One literal is saved by changing the support set of f_h.

Thus, this is exactly the same as performing the Boolean substitution transformation that was mentioned in Section 8.2: Boolean simplification encompasses Boolean substitution. Therefore we can concentrate on Boolean simplification in the sequel.

We can envision two major strategies for performing Boolean simplification: optimizing the local functions one at a time or optimizing simultaneously a set of local functions. We refer to the two strategies as single-vertex and multiple-vertex optimization, respectively. The two approaches differ substantially; the target of each simplification is a vertex in the first case and a subnetwork in the second case. When optimizing the local functions one at a time, the don't care sets change as the network is being optimized, because optimizing a local function affects the don't care sets of other local functions. Conversely, when optimizing more than one local function at a time, the corresponding degrees of freedom cannot be expressed completely by independent sets.

SINGLE-VERTEX OPTIMIZATION. Let us consider a logic network G_n(V, E) and a vertex v_x that is chosen to be a target of optimization. We must consider first which functions are the feasible replacements of function f_x.
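The substitution in Example 8.4.16 can be validated mechanically. The sketch below (illustrative) confirms that wherever the new variable q agrees with a + cd, i.e., outside the SDC set q ⊕ (a + cd), the simplified cover a + bq + e equals the original a + bcd + e.

```python
from itertools import product

for a, b, c, d, e, q in product([False, True], repeat=6):
    sdc   = q ^ (a or (c and d))            # don't care: q differs from a + cd
    f_old = a or (b and c and d) or e
    f_new = a or (b and q) or e
    assert sdc or (f_old == f_new)          # covers may differ only inside the DC set
print("a + bcd + e simplifies to a + bq + e under the SDC of q")
```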
Theorem 8. (b) Perturbed network at x to show the replacement of f. the difference in inputloutputbehavior of the two networks must don't care set..By perfomlng an expansion on variable 6. say g. which impllec that: SODC: C DC. . A sufficient condition for equivalence is that the perturbation S = f. fZ(O) @ fX(S)c DC. Additional insight into the problem can be gained by the following theorem. with another one.. Consider a logic network and another network obtained by replacing a local functlon f. can be modeled by means of a perturbation 8 = f.. where S = f. . with g . as shown in Figure 8. We say that a perturbation is feasible if 8.Split by PDF Splitter FIGURE 8. The replacement of a local function f . Recall that f. with gx. (c) Optimized network..5. the latter obtained from the former by replacing f . Consider a logic network obtained by replacing a local function f. + ODC. Proof: By definition of equivalence. with 8 .. or equivalently: be included in the exten~al If we define e(S) = f r ( 0 ) @ fr(S). @ g . then e(S) represents the error introduced by replacing f. Definition 8.4. P and are functions of the primary input variables. . The don't care set represents the tolerance on the error. $ g.23. we have: e(S)=S(fT(0)@ f V ( l ) )+ S'(fr(0) @ f'(0)) Hence e(S) = S ODC:... with g.2. We use again the concept of a pemrbed network.4. This allows us to formalize the equivalence of two networks. = a .. e(6) = f r ( 0 ) @ f7(S). = nb with g. The two logic networks are equivalent if the vector equality: is satisfied for all observable components o f f and all possible primary input assignments.23 holds...23 (a) Example of a logic network.. @ g is . Let . contained in all components of DC. In other words.
R. it is convenient to check its containment in their intersection.. A necessary and sufficient condition for equivalence is that the perturbation S = f.27) where SDC. + ODC.....4.. i.1. Note that since the perturbation must be contained in all components of DC.. + SDC. The corollary states that DC. Then the impossible input patterns for f and .T(~I Finally. Assume now that f. and g. Again the impossible input patterns at the inputs o f f . are additional degrees of freedom for the replacement of f. + CDC.4.: DC. (8.. be replaced by g. can be ignored. g.. + ODC. assume that sup(g.t3g..23 (c).2 does not assume anything about the support set of g. i. are primary inputs. represents the don't care conditions for any vertex u. = a b e a = ab'. The optimized network is shown in Figure 8. = DC. SDC.) is included in the set of network variables.e. Let f. and g. Assume that there are no external don't care sets and that we want to optimize f. the condition is also necessary.. = DC..Split by PDF Splitter MULTIPLELEVEL COMBWATIONAL LOGIC OFllMIZATlON 399 The above equality holds if and only if: GI I DC. = S D C . + ODC.. = y e f. Whereas this definition of SDC. o. = DC.e.4.tV": u.. l . E V G . cannot be added to the support of the local function f. Therefore the local don't care set of variable x includes also the controllability component. also have to be accounted for. A simple analysis yields ODC.. + The local don't cure sets encapsulate all possible degrees of freedom for replacing f. Then DC. and therefore the corresponding contribution to SDC. Example 8. (8.. Hence the pemrbation is 6 = f..#F. = DC. the astute reader will notice that variables col~espondingto successor vertices of u. @ g.. is compatible with the literature [6].17. because otherwise there would be some don't care condition not accounted for. In practice.. Consider the network of Flgure 8. with g. the external don't cure sets are specified along with the .e. Therefore: DC. = a . 
is contained in all componerlts of DC. Therefore the local don't care set includes also the SDC set for all variables but x . SDC. + ODC. as in the case in which we attempt Boolean substitution. + ODC.) is included in the set of all variables excluding x. i. the replacement is feasible. denoted by DC. where sup(&'. = J' = b' + c'.. = b' + c'. by replacing the A N D gate by a straight connection.23 (a).  Corollar~ 8. + SDC. If we assume that sup(&) is a subset of the primary input variables. with g. A feasible revlacement is one where the vetturbation is contained in the local don't care set DC.. = b' + c' because the inputs of g.. Since 6 = ab' c DC.25) Theorem 8. have the same support set S(x) that includes variables associated with internal vertices.
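The feasibility test of Example 8.4.17 amounts to two containment checks, which the following sketch (illustrative; the network of Figure 8.23 is assumed only through its ODC) performs exhaustively.

```python
from itertools import product

# δ = f_x xor g_x = ab xor a = ab', ODC_x = b' + c', empty external DC set.
# By Theorem 8.4.2, the error e(δ) = δ · ODC_x' must lie inside DC_ext = 0.
for a, b, c in product([False, True], repeat=3):
    delta = a and (not b)                 # δ = ab'
    odc_x = (not b) or (not c)            # ODC_x = b' + c'
    assert not (delta and not odc_x)      # e(δ) = δ · ODC_x' = 0 ⊆ DC_ext
    assert (not delta) or odc_x           # equivalently, δ ⊆ DC_x = b' + c'
print("replacement of f_x = ab by g_x = a is feasible")
```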
An example is discarding those cubes of the don't care set whose support is disjoint from that of the function to be minimized.Hence we introduce multiple perturbations IS. Filters are used to reduce the size of the local don't care sets. i. . Let DC. one for each variable. $ g . n J . . . . that they contribute to minimizing those functions.24. Both types of filters exploit the topology of the network. Let us assume that we want to simplify n local functions. Each perturbation is the difference between the old and new functions to be assigned to each vertex. i = 1 . = D U E such that ( ( s u p ( F )U s u p ( D ) ) n sup(E)I 5 1. Different heuristics can be used to select vertices and filters to limit the size of the local don't care set.. . . Optimize the function f. .
FIGURE 8.24 (a) Example of a logic network. (b) Network with multiple perturbations. (c) Optimized network. (d) Another optimized network.

Note first that a simultaneous optimization of multiple vertices using the local don't care sets, computed as in the previous section, would lead to erroneous results.

Example 8.4.18. Consider the circuit of Figure 8.24 (a). Assume that the external don't care set is empty. Let us compute the ODC sets: ODC_x = y' = b' + c' and ODC_y = x' = b' + a'. Assume we want to replace f_x = ab by g_x = a and f_y = bc by g_y = c. Even though both perturbations δ_1 = ab' and δ_2 = b'c are included in the corresponding local don't care sets, the simultaneous optimization leads to an erroneous network, implementing z = ac. Only one perturbation (corresponding to replacing one AND gate by a wire) is applicable at a time.

By definition of equivalence, the difference in input/output behavior of the original and perturbed network must be contained in the external don't care set, namely:

f^x(0) ⊕ f^x(δ) ⊆ DC_ext    (8.28)

It would be useful to transform this inequality into bounds on the individual perturbations {δ_i, i = 1, 2, ..., n}, in a fashion similar to Theorem 8.4.2. Unfortunately the bounds are more complex than those used in single-vertex optimization, because the bound for one perturbation depends in general on the other perturbations.

Example 8.4.19. Consider again the circuit of Figure 8.24 (a). The constraint on the equivalence of the perturbations is:

(ab ⊕ δ_1)(bc ⊕ δ_2) = abc

A trivial solution is obviously provided by δ_1 = 0, δ_2 = 0. Nontrivial solutions are δ_1 = ab', δ_2 = 0 and δ_1 = 0, δ_2 = b'c, whose validity can be easily checked by back substitution. The first nontrivial perturbation corresponds to choosing g_x = ab ⊕ ab' = a, and its implementation is shown in Figure 8.24 (c).

It is interesting to look at the multiple-vertex optimization problem from a different perspective. Let x be the set of variables under consideration. Then, some patterns of x may be equivalent as far as determining the value of the primary outputs. Thus, the mapping from the primary input variables to the variables x can be characterized by a Boolean relation. Therefore multiple-vertex optimization can be modeled as finding a minimal implementation of a multiple-output function compatible with a Boolean relation.

Example 8.4.20. Consider again the circuit of Figure 8.24 (a) with no external don't cares. Let x, y represent the perturbed variables. The primary output is TRUE only when x, y = 1, 1. The remaining patterns for x, y, i.e., {00, 01, 10}, are equivalent. Hence the subnetwork with inputs a, b, c and outputs x, y can be represented by the following Boolean relation:

abc = 111 : {11}
otherwise : {00, 01, 10}

A minimum-literal solution is x = a and y = bc, corresponding to the implementation shown in Figure 8.24 (c). Note that this solution is not unique. For example, another minimum-literal solution is x = ab and y = c, as shown in Figure 8.24 (d).

Two points are noteworthy. First, Boolean relations reduce to Boolean functions in the limiting case of single-vertex optimization. Indeed, single-output relations are logic functions, possibly incompletely specified; thus, in the single-output case it is possible to specify feasible perturbations as incompletely specified functions. Second, Boolean relations capture more degrees of freedom than don't care sets, and thus multiple-vertex optimization is a more powerful paradigm for optimization than single-vertex optimization.

There are then two major avenues for multiple-vertex optimization. The first is to extract the subnetwork target of optimization, model its input/output behavior as a Boolean relation and apply the corresponding exact (or heuristic) minimization techniques.
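Example 8.4.18 can be replayed by simulation. In the sketch below the network structure (x = ab, y = bc, z = xy) is assumed from the figure; it confirms that each replacement is safe in isolation but erroneous in combination.

```python
from itertools import product

def make_z(fx, fy):
    return lambda a, b, c: fx(a, b, c) and fy(a, b, c)

f_x = lambda a, b, c: a and b
f_y = lambda a, b, c: b and c
g_x = lambda a, b, c: a          # applies perturbation δ1 = ab'
g_y = lambda a, b, c: c          # applies perturbation δ2 = b'c

orig   = make_z(f_x, f_y)        # z = abc
only_x = make_z(g_x, f_y)
only_y = make_z(f_x, g_y)
both   = make_z(g_x, g_y)        # z = ac

pts = list(product([False, True], repeat=3))
assert all(only_x(*p) == orig(*p) for p in pts)       # g_x alone is safe
assert all(only_y(*p) == orig(*p) for p in pts)       # g_y alone is safe
assert not all(both(*p) == orig(*p) for p in pts)     # together: z = ac != abc
print("each perturbation alone is feasible; both together are not")
```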
The second is to search for bounds on feasible simultaneous perturbations. Satisfaction of these bounds is a sufficient (but not necessary) condition for equivalence. These bounds can be used to compute mutually compatible don't care sets that reflect the degrees of freedom of each local function in the simultaneous optimization problem. Whereas this approach leads to formulating the optimization problem as a Boolean function minimization problem (and hence simpler to solve than the corresponding Boolean relation minimization problem), it is less general than Boolean relation optimization: some feasible transformations may not be found, because of the restriction applied to the bounds. The relation-based approach, on the other hand, is well-suited for optimizing cascaded subnetworks. We describe multiple-vertex optimization by means of compatible don't care sets first, and we comment on relation-based optimization later.

COMPATIBLE DON'T CARE CONDITIONS.* Single-vertex optimization relies on the results of Theorem 8.4.2 and Corollary 8.4.1, which state that a single perturbation is valid when it is bounded from above by the local don't care set. This result does not generalize to multiple-vertex optimization, as shown by Example 8.4.18. Indeed it is possible to show formally [17] that multiple perturbations are feasible when they satisfy bilateral (upper and lower) bounds, and that the bounds for one perturbation depend in general on the other perturbations. To avoid the complexity of computing and using the exact bounds, a conservative approach can be used, where only perturbation-independent upper bounds are considered. Note that the external don't care sets are invariant on the perturbations. We concentrate then on the analysis of the ODC sets and on the computation of those subsets that are invariant under simultaneous independent perturbations.

Definition 8.4.6. Consider a set of variables x_i, i = 1, 2, ..., n. Their observability don't care sets are called compatible, and denoted CODC_{x_i}, i = 1, 2, ..., n, when they do not depend on the perturbations themselves and

δ_i ⊆ DC_ext + CODC_{x_i},  i = 1, 2, ..., n    (8.29)

represents a sufficient condition for equivalence.

It should be noted that compatible don't care sets can be computed for arbitrary subsets of vertices whose local functions may have arbitrary support sets.
A compatible ODC (CODC) set is said to be maximal if no other cube can be added to it while preserving compatibility. Maximal CODC sets are relevant because they provide more degrees of freedom than non-maximal ones. Thus CODCs play the role that ODCs have in single-vertex optimization. From a practical standpoint, the derivation of CODC sets is very important, because it enables us to simplify simultaneously and correctly a set of local functions.

Two simple observations are obvious. First, the CODC sets depend on the order of the variables being considered. Since it is not efficient to consider all orders, we must settle for a possible underestimation of the degrees of freedom due to observability. The second observation is that CODC sets can still be derived by network traversal algorithms, such as the ODCs.

From an intuitive point of view, we can justify the computation of the CODC sets as follows. Let us order the vertices of interest in an arbitrary sequence. The first vertex would have its CODC equal to its ODC set. The second vertex would have a CODC smaller than its ODC by a quantity that measures how much a permissible perturbation at the first vertex would increase the observability of the second (i.e., deprive the second vertex of some degrees of freedom). And so on.

The calculation of the CODC sets by network traversal entails three major steps:

1. The computation of the ODC sets related to the edges.
2. The derivation of the corresponding CODC sets by restricting the ODC sets.
3. The computation of the CODC sets related to the vertices by combining the edge CODC sets.

The first and last steps parallel those used for the full ODC sets. The intermediate step is illustrated here for the case of n = 2 perturbations and a single-output network; we refer the reader to reference [17] for the general case. Attention has to be paid to restricting the ODC sets in an appropriate way, because we remove iteratively the dependency of each ODC set on the other perturbations. Consider two perturbed variables x_1 and x_2, in this order. The first CODC set is the full ODC:

CODC_{x_1} = ODC_{x_1}    (8.30)

The second CODC set consists of two terms: the component of ODC_{x_2} independent of x_1, and the restriction of ODC_{x_2} to when x_1 is observable (and thus no optimization can take advantage of CODC_{x_1}):

CODC_{x_2} = C_{x_1}(ODC_{x_2}) + ODC_{x_2} · ODC_{x_1}'    (8.31)

Unfortunately, in the general case, maximal vertex CODC sets cannot be derived directly from the edge CODC sets, and full ODC sets must also be derived. Approximation methods are also possible that compute non-maximal CODC sets while avoiding the computation of the full ODC sets.

Example 8.4.21. Consider computing the CODC sets for the circuit of Figure 8.24 (a). The ODC sets are ODC_x = y' = b' + c' and ODC_y = x' = b' + a'. Note that in this example the AND gates have single fanout. Hence the edge observability is the same as the vertex observability and step 3 is skipped. Assume (arbitrarily) that the first vertex is v_x and the second is v_y. Then CODC_x = ODC_x = y' and

CODC_y = C_x(ODC_y) + ODC_y (ODC_x)' = C_x(x') + x'y = x'y = (b' + a')bc = a'bc

Thus, the simultaneous optimization of f_x = ab with DC_x = b' + c' and f_y = bc with DC_y = a'bc yields g_x = a and g_y = bc. The significance of the increased observability of y is due to the fact that f_x and f_y cannot be simultaneously reduced to g_x = a and g_y = c, because this would not yield equivalent networks. Note that the multiple optimization specified by g_x = ab and g_y = c cannot be found with these CODC sets, while it is feasible in principle [Figure 8.24 (d)]. Conversely, the computation of the CODC sets with a different variable sequence would allow the simplifications g_x = ab and g_y = c, while it would disallow g_x = a and g_y = bc.

The general approach to multiple-vertex simplification using compatible don't care sets and logic minimization is based on iterating the steps shown in Algorithm 8.4.4. Often, all internal vertices are considered at once (i.e., U = V) and one iteration is used. Thus the local don't care sets reduce to the combination of the CODC sets and the external don't care sets. Multiple-output logic minimizers can optimize simultaneously all local functions.

SIMPLIFY_MV( G_n(V, E) ) {
    repeat {
        U = selected vertex subset;
        foreach vertex v_x ∈ U
            Compute CODC_x and the corresponding local don't care subset DC_x;
        Optimize simultaneously the functions at the vertices of U;
    } until (no more reduction is possible)
}

ALGORITHM 8.4.4

BOOLEAN RELATIONS AND MULTIPLE-VERTEX OPTIMIZATION.* Multiple-vertex optimization can be modeled by associating the vertices that are the target of optimization with the outputs of a Boolean relation. This approach is more general than using compatible don't cares, because Boolean relations model implicitly the mutual degrees of freedom of several functions. The difficulties related to using a relational model are the following. First, the equivalence classes of the Boolean relation have to be determined. Second, an optimal compatible (multiple-output) function must be derived; Boolean relation optimization (see Section 7.6) involves solving a binate covering problem and is more difficult than minimization of Boolean functions. We show next how it is possible to derive the equivalence classes of a Boolean relation from the ODC sets of the vertices that are the target of multiple-vertex optimization. We introduce the method by elaborating on an example.
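The order-dependent CODC computation of Example 8.4.21 can be reproduced by exhaustive evaluation. The sketch below (illustrative, with the network x = ab, y = bc assumed as above) derives CODC_y = a'bc from Equation 8.31.

```python
from itertools import product

def consensus_x(g):
    # Consensus with respect to x: g|x=1 · g|x=0, the x-independent part of g.
    return lambda a, b, c: g(True, a, b, c) and g(False, a, b, c)

odc_y = lambda x, a, b, c: not x           # ODC_y = x'
c_x   = consensus_x(odc_y)                 # C_x(x') = x · x' = 0

for a, b, c in product([False, True], repeat=3):
    x, y = a and b, b and c
    codc_y = c_x(a, b, c) or ((not x) and y)     # C_x(ODC_y) + ODC_y · ODC_x'
    assert codc_y == ((not a) and b and c)       # CODC_y = a'bc
print("CODC_y = a'bc verified")
```

The restriction term x'y keeps only the part of ODC_y where x is observable (ODC_x' = y), which is exactly what prevents the unsafe simultaneous reduction of Example 8.4.18.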
Example 8.4.21. Consider the two cascaded logic networks N1 and N2 of Figure 8.25, related to Example 7.6.1. We want to determine the equivalence classes of the relation that can model subnetwork N1, whose observability is limited by network N2. Let f denote the input/output behavior of network N2, with inputs x2, x1, x0. Without loss of generality, assume that the comparator's outputs are 0,0 when the value x is smaller than 3; 0,1 when x is 3 or 4; and 1,1 when x is larger than 4. The observability don't care sets of the variables x2, x1, x0 are found by the OBSERVABILITY algorithm.

The equivalence class of a pattern x2, x1, 0 with respect to an output is the set of patterns x2, x1, x0 that satisfy f(x2, x1, x0) ⊕̄ f(x2, x1, 0) = 1, where ⊕̄ denotes the XNOR operation, and is therefore described by the function:

EQV_0 = f(x2, x1, x0) ⊕̄ f(x2, x1, 0)

We use now an algebraic manipulation similar to that used for computing the ODC sets, namely the identity:

f(a, b) ⊕̄ f(a, 0) = (b · ∂f/∂b)' = b' + ODC_b        (8.35)

so that each comparison of the network behavior on two patterns differing in one variable can be expressed by that variable and its observability don't care set. Iterating over the variables yields the general result.

Theorem 8.4.5. Let x° ∈ B^n be a reference pattern. The equivalence class of any given configuration pattern x° is the set of configurations x = [x1, x2, ..., xn]^T that satisfies:

EQV^{x°} = ⊕̄_{i=1,...,n} [ (x_i ⊕̄ x_i°) + ODC_{x_i} |_{x_1 = x_1°, ..., x_{i-1} = x_{i-1}°} ]

Proof. The following identity holds: f(x) ⊕̄ f(x°) can be expanded as the ⊕̄-chain of the terms f(x_1°, ..., x_{i-1}°, x_i, ..., x_n) ⊕̄ f(x_1°, ..., x_i°, x_{i+1}, ..., x_n), i = 1, 2, ..., n, because b ⊕̄ b = 1 and a ⊕̄ 1 = a for any variable or function a, b. The theorem follows from the fact that the ith term of the chain is (x_i ⊕̄ x_i°) + ODC_{x_i}, with the previously considered variables set to their reference values, by identity (8.35).

Each component of EQV^{x°} describes the equivalence class of the reference pattern with respect to an output. We are interested in the equivalence class with respect to all outputs, described by the product of all components of EQV^{x°}.

Example 8.4.22. Consider again the comparator network. The product of the components yields the function x2'(x1' + x0'), which describes the equivalence class of the reference pattern 0,0,0. This corresponds to the patterns {000, 001, 010} of Example 7.6.1. The other equivalence classes can be derived in a similar way.

The derivation of the equivalence classes is applicable to any subnetwork with outputs x1, ..., xn, such as that corresponding to the shaded gates in the figure. The significance of this theorem is that we can express each equivalence class for the outputs of the subnetwork to be optimized by means of the information of the ODC sets.
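The characterization of equivalence classes by ODC sets can be checked mechanically on small functions. The sketch below is an assumption-laden illustration (the demonstration function `f` is hypothetical): it computes the ODC of each cut variable by comparing cofactors, and decides equivalence of a pattern x with a reference x° by walking from x to x° one variable at a time, accumulating modulo 2 only the changes that are observable, mirroring the telescoping chain in the proof.

```python
from itertools import product

def odc(f, i, n):
    """ODC set of input i of a single-output f: the input patterns on
    which complementing bit i does not change the output of f."""
    dc = set()
    for pat in product((0, 1), repeat=n):
        flipped = pat[:i] + (1 - pat[i],) + pat[i + 1:]
        if f(pat) == f(flipped):
            dc.add(pat)
    return dc

def equivalent(f, n, x, x0):
    """Decide whether pattern x lies in the equivalence class of the
    reference pattern x0, using only ODC information: each step from x
    toward x0 contributes f(before) XOR f(after), and the exclusive-or
    of all steps telescopes to f(x) XOR f(x0)."""
    observed = 0
    probe = list(x)
    for i in range(n):
        if x[i] != x0[i] and tuple(probe) not in odc(f, i, n):
            observed ^= 1          # this single-bit change is observed
        probe[i] = x0[i]           # walk one variable toward x0
    return observed == 0

# Hypothetical driven function: is the 3-bit value x2 x1 x0 below 3?
f = lambda pat: int(pat[0] * 4 + pat[1] * 2 + pat[2] < 3)
```

For a single output the result agrees, by construction, with direct comparison of f on the two patterns; for multiple outputs the check is applied per output and the results are intersected.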
Boolean relation-based optimization of multiple-level networks can be organized as shown in Algorithm 8.5.

SIMPLIFY.MVR( G_n(V, E) ) {
  repeat {
    U = selected vertex subset;
    foreach vertex v_x ∈ U
      compute ODC_x;
    determine the equivalence classes of the Boolean relation of the subnetwork induced by U;
    find an optimal function compatible with the relation using a relation minimizer;
  } until (no more reduction is possible);
}

ALGORITHM 8.5

Example 8.4.23. Consider again the network of Figure 8.24 (a), where the selected subnetwork has outputs x and y. Let us compute first the ODC sets at the outputs of the subnetwork: ODC_x = y' and ODC_y = x'. Let us now derive the equivalence classes. Since there are only four possible patterns, consider first pattern 00: its equivalence class is {00, 01, 10}. This can be understood by considering that only the pattern 11 can yield a 1 at the primary output; hence the patterns {00, 01, 10} are equivalent at the primary output, and this completes the computation of the equivalence classes. The relation can then be expressed in tabular form. The corresponding minimum compatible functions are g_x = a and g_y = bc [Figure 8.24 (c)] or g_x = ab and g_y = c [Figure 8.24 (d)].

8.4.3 Other Optimization Algorithms Using Boolean Transformations*

We review briefly here other methods for multiple-level logic optimization that use Boolean operations and transformations.

REDUNDANCY IDENTIFICATION AND REMOVAL. Techniques based on redundancy identification and removal originated in the testing community [10, 12]. The underlying concept is that an untestable fault corresponds to a redundant part of the circuit which can be removed. These methods can be explained in terms of don't care sets and perturbations with the formalism of the previous section. We restrict our attention to single-output networks for the sake of simplicity.

We consider tests for single stuck-at faults at the inputs of a gate. Let us assume that the gate corresponds to vertex v_x, modeled by a local function f_x. The input connection to be tested corresponds to edge (v_y, v_x), with input variable y.
An input test pattern t that tests for a stuck-at-0 (or for a stuck-at-1) fault on y must set variable y to TRUE (or to FALSE) and ensure observability of y at the primary output. Namely, the set of all input vectors that detect a stuck-at-0 fault on y is {t | y(t) · ∂f_x/∂y (t) · ODC_x'(t) = 1} ({t | y'(t) · ∂f_x/∂y (t) · ODC_x'(t) = 1} for stuck-at-1 faults). By definition of an untestable fault, no input vector t can make this expression TRUE.

Assume now that a test pattern generator detects an untestable stuck-at-0 (or stuck-at-1) fault on variable y. Then, variable y can be set to a permanent FALSE (or TRUE) value in the local function f_x, i.e., the fault is modeled by asserting a 0 on that connection. This leads to removal of a connection and to simplification of the corresponding local function.

We justify redundancy removal by relating it to perturbation analysis. We consider the case of variable y stuck at 0. (The case of variable y stuck at 1 is analogous.) Let then y be set to 0. This can be seen as a perturbation δ = f_x ⊕ f_x|_{y=0}. We claim that the perturbation is feasible whenever y is untestable for stuck-at-0 faults.

Example 8.4.24. Consider the circuit of Figure 8.26 (a). A test pattern generator has detected an untestable stuck-at-0 fault on the connection labeled y. The circuit is modified first, by asserting a 0 on that connection, as shown in Figure 8.26 (b). The circuit can then be simplified further to that of Figure 8.26 (c).

FIGURE 8.26 (a) Circuit with untestable stuck-at-0 fault. (b) Modified circuit. (c) Simplified circuit.
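For small single-output circuits, untestable faults can be identified by exhaustive test generation. The sketch below is hypothetical but in the spirit of Figure 8.26: the circuit computes ac' + bc with an extra wire y = ab OR-ed in, and a fault is untestable exactly when the faulty circuit never differs from the good one.

```python
from itertools import product

def stuck_at_tests(circuit, n, wire, value):
    """All input vectors that detect `wire` stuck at `value`: the faulty
    circuit (wire forced to `value`) must differ from the good one."""
    faulty = lambda pat: circuit(pat, force={wire: value})
    return [pat for pat in product((0, 1), repeat=n)
            if circuit(pat) != faulty(pat)]

# Hypothetical circuit: x = a c' + b c + y, where y = a b is the
# consensus of the other two terms and hence carries no information.
def circuit(pat, force=None):
    a, b, c = pat
    force = force or {}
    y = force.get("y", a & b)            # candidate redundant wire
    return (a & (1 - c)) | (b & c) | y
```

No test detects y stuck-at-0, so the connection can be removed and the circuit simplified to ac' + bc without changing its input/output behavior; y stuck-at-1, by contrast, is testable.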
Indeed, since y is untestable for stuck-at-0 faults, no input vector of the logic network can make y · ∂f_x/∂y · ODC_x' TRUE. Hence the quantity y · ∂f_x/∂y is included in ODC_x and therefore in the local don't care set DC_x = ODC_x ∪ DC_ext. Since the perturbation caused by setting y to 0 is precisely δ = f_x ⊕ f_x|_{y=0} = y · ∂f_x/∂y, the perturbation is included in the local don't care conditions and is feasible. Thus the simplification preserves the network input/output behavior.

Example 8.4.25. Consider again the circuit of Figure 8.26 (a). The gate under consideration computes ab, and the observability don't care set of its output is ODC = ac' + bc, so that DC = ODC. Setting the gate output to a permanent 0 corresponds to the perturbation δ = ab ⊕ 0 = ab. Since ab ⊆ ac' + bc (ab is the consensus of ac' and bc), the perturbation is feasible: the stuck-at-0 fault is untestable and the gate and its connection can be removed.

In practice, redundancy removal is applied as a by-product of automatic test pattern generation (ATPG), which returns a list (possibly empty) of untestable faults. Note that simplifying a local function in connection with an untestable fault may make other untestable faults become testable. Hence, for each untestable fault (excluding the first), an additional redundancy test is applied; if the fault is still untestable, the local function is simplified. The entire process can be repeated until there are no untestable faults.

TRANSDUCTION.* The transduction method is an iterative optimization method proposed by Muroga et al. [36]. Transduction means transformation plus reduction. The original transduction method dealt with networks of NOR gates only; later enhancements removed this restriction. A different terminology was used in the description of the transduction method, and we relate the original terminology to our definitions of local functions and don't care sets. (Vertices are referred to as gates.)

A permissible function at gate v_x is a function g_x satisfying the bounds f_x · DC_x' ⊆ g_x ⊆ f_x + DC_x, i.e., a function that can replace f_x without changing the network input/output behavior. The maximum set of permissible functions (MSPF) is the largest set of functions g_x, in terms of primary input variables, satisfying the bounds. The compatible set of permissible functions (CSPF) is the largest set of functions g_x satisfying the bounds when simultaneous optimizations are performed. Hence CSPF sets are related to compatible observability don't care sets. The original methods for computing the MSPF and CSPF of each gate were based on manipulation of tabular forms; more recent implementations of the transduction method exploit BDDs.

Several optimization techniques have been proposed for the transduction framework [36]. The most important algorithms are based on the CSPF computation and on the iterative application of the following transformations.

Pruning of redundant subcircuits. Redundant portions of the network are identified, based on their observability, and removed. When using CSPFs, redundancy removal can be applied simultaneously to several gates.

Gate substitution. When the local function f_x associated with vertex v_x is included in the CSPF (or MSPF) of another gate v_y, and v_x is not a successor of v_y, then f_x is a feasible replacement for f_y. Hence gate v_x can substitute v_y: the output of v_x is connected to all direct successors of v_y, and v_y can be deleted. This is a form of Boolean substitution.

Connection addition and deletion. The addition of a connection is justified when other connections can be dropped or gates can be merged (as described next). This corresponds to doing simplification (and Boolean substitution) by increasing (and/or decreasing) the support set of a local function.
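Whether a candidate function is permissible at a gate can be checked directly from the definition, by comparing the network outputs before and after the substitution. A minimal sketch under assumed names: the surrounding network `out` and the functions `f` and `g` below are hypothetical, with out = v c' + b c.

```python
from itertools import product

def is_permissible(network_out, local_f, candidate_g, n):
    """A candidate local function g is permissible at a vertex when
    substituting it for f leaves the primary output unchanged for every
    input vector, i.e. f XOR g is contained in the local don't-care set."""
    for pat in product((0, 1), repeat=n):
        if network_out(pat, local_f) != network_out(pat, candidate_g):
            return False
    return True

# Hypothetical environment: the vertex output v feeds out = v c' + b c,
# so the vertex is unobservable whenever c = 1.
out = lambda pat, v: (v(pat) & (1 - pat[2])) | (pat[1] & pat[2])
f = lambda pat: pat[0] | (pat[1] & pat[2])   # current local function
g = lambda pat: pat[0]                       # simpler candidate
```

Here f ⊕ g = a'bc is contained in the observability don't care condition c, so g is permissible and the gate can be simplified.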
Error compensation. Remove a gate and compute the corresponding error at the primary outputs; then restructure the network to compensate for the error.

Gate merging. Consider two gates and their CSPFs. If the intersection of the two CSPFs is not void, and feasible (i.e., acyclic) connections can be made to a new gate implementing a function in the intersection of the CSPFs, then the two gates can be replaced by this new gate.

Overall, the transduction method can be seen as a family of Boolean transformations that exploit a different organization of the logic information than that described in Section 8.4, but that also have an intrinsic close resemblance to the methods described there.

GLOBAL FLOW.* The global flow method was proposed by Berman and Trevillyan and incorporated in IBM's LOGIC SYNTHESIS SYSTEM [4]. It consists of two steps: first, collecting information about the network with a method similar to those used for data-flow analysis in software compilers; second, optimizing the circuit iteratively using the gathered information, with the overall goal of reducing the circuit area and/or delay. Performance constraints can be added to the network. Overall, the method exploits global transformations and the Boolean model.

Global flow information is expressed in terms of forcing sets for each variable. Let F_ij(x), with i, j ∈ {0, 1}, denote the set of variables that are forced to value j when x = i. Note that the implication x = i ⟹ s = j can be stated as (x ⊕̄ i)' + (s ⊕̄ j), and that its complement (x ⊕̄ i)(s ⊕̄ j)' is part of the SDC set. It is important to remark that forcing sets express implicitly the information of the SDC set. Indeed, since variables x and s may not be related by a local function, the implication may not lead to a cube that appears in the SDC set as computed according to its definition, but such a cube may still be derived by performing variable substitution and consensus.

Example 8.4.26. Consider the circuit of Figure 8.27 (a). If x = 1, then c = 0, d = 0, e = 1 and l = 0. Hence {c, d, l} ⊆ F_10(x) and {e} ⊆ F_11(x).

FIGURE 8.27 (a) Logic network fragment. (b) Logic network fragment after reduction. (c) Logic network fragment after reduction and expansion.
The computation of the forcing sets is complex, because it involves a solution to a satisfiability problem. In practice, subsets of the forcing sets are often used that are called controlling sets. Such sets can be efficiently computed as fixed points of monotonic recurrence relations. Weaker conditions can also be used to allow fast incremental recomputation of the sets as the network is modified [4].

As an example, we consider the controlling sets of reference [4]. Consider the circuit of Figure 8.27 (a), where it is assumed that all local functions are NORs. Let the direct predecessor relation among variables be denoted by p(·), i.e., y ∈ p(x) when (v_y, v_x) is an edge of the network graph. Consider the conditions that apply when a variable x = 1 forces s = 0. If x forces some variable y ∈ p(s) to 1, then it must force s to 0, because a TRUE input to a NOR gate forces its output to FALSE. This corresponds to propagating the blocking effect of x = 1 forward in the network. Similarly, s = 1 ⟹ x = 0 implies that x = 1 ⟹ s = 0, which is called the contrapositive. Hence the controlling sets can be defined recursively as follows:

C_10(x) ⊇ { s : p(s) ∩ C_11(x) ≠ ∅ }
C_11(x) ⊇ { x } ∪ { s : p(s) ⊆ C_10(x) }

The controlling sets can then be computed by iteration until the sets stabilize. Similar properties and transformations can be defined for the controlling sets C_01(x) and C_00(x). (See Problem 12.)

Transformations using controlling sets aim at adding and removing connections in the circuit, with the overall goal of reducing the circuit area and/or delay. A first transformation, called reduction in [6], targets the replacement of a local function f_s by f_s · x', i.e., the addition of literal x to the inputs of a NOR gate. This transformation is feasible when s ∈ C_10(x): whenever x = 1 the original output is already 0, so the addition does not change the network behavior. The reduction step increases the number of literals. A transformation in the opposite direction, called expansion in [6], targets the replacement of a local function f_s by its cofactor with respect to x (or to x'), i.e., making it insensitive to x, thus reducing the number of literals, as in two-level logic minimization. For this transformation to be feasible, a specific relation must exist between x and s, which can be captured by means of the controlling sets. Since the formal justification of this transformation is complex, we refer the reader to reference [4] for details.

Example 8.4.27. Consider the circuit of Figure 8.27 (a). Since z ∈ C_10(x), we can modify f_z = (e + g)' to f_z' = (e + g)' · x' = (e + g + x)'. The modified network is shown in Figure 8.27 (b).
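The fixed-point computation of the controlling sets can be sketched for a network of NOR gates as follows. The gate names and topology below are hypothetical; `preds` maps each NOR gate to its input variables.

```python
def controlling_sets(preds, var):
    """Fixed-point computation of the controlling sets C11/C10 of `var`
    on a NOR network: if any input of a NOR is forced to 1 its output is
    forced to 0, and if all inputs are forced to 0 it is forced to 1."""
    c11, c10 = {var}, set()
    changed = True
    while changed:
        changed = False
        for gate, inputs in preds.items():
            if c11 & set(inputs) and gate not in c10:
                c10.add(gate); changed = True   # some input forced to 1
            if inputs and set(inputs) <= c10 and gate not in c11:
                c11.add(gate); changed = True   # all inputs forced to 0
    return c11, c10

# Hypothetical NOR network: d = NOR(x, y), e = NOR(d), l = NOR(e, z).
preds = {"d": ["x", "y"], "e": ["d"], "l": ["e", "z"]}
c11, c10 = controlling_sets(preds, "x")
# x = 1 forces d = 0, then e = 1, then l = 0.
```

The recurrences are monotonic (the sets only grow), so the iteration stabilizes after at most as many passes as there are gates.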
Example 8.4.28. Consider the circuit of Figure 8.27 (b). Two relevant subsets of vertices can be isolated in the logic network graph for any variable x, whose corresponding variables belong to appropriate controlling sets of x. We call them A(x) and B(x). The local functions associated with B(x) have the property that they can be replaced by their cofactor with respect to x (or to x'), i.e., made insensitive to x, when all functions in A(x) are reduced as described above. Hence the global flow transformation consists of selecting a variable x, determining its corresponding sets A(x) and B(x), reducing the functions in A(x) and expanding those in B(x). The process can be iterated while using heuristics to select the variables. In this circuit some vertices satisfy this property and can be expanded; the resulting network is shown in Figure 8.27 (c).

POLARITY ASSIGNMENT. Polarity assignment problems arise in the context of single-rail circuit design, which is the most common case (e.g., static CMOS logic design). When dual-rail circuit styles are used, such as emitter-coupled, current-mode and differential cascode logic, signals are always present along with their complements, and no inverters are required. Unfortunately, double-rail circuit design has overheads that limit its use to specific applications. We consider here single-rail circuit design.

Combinational networks are often components of sequential circuits. When static latches (or registers) are used, they often provide double-polarity output signals. Hence it is possible to optimize a design with the freedom of selecting the polarity of the input and/or output signals, a degree of freedom allowed by several design styles.

We address here the reduction of the size (or delay) of the network by choosing the polarity of the local functions. For this reason this problem is called global polarity assignment. When the logic network model of Section 8.2 is used, the logic network models accurately the area and the delay. When local functions are modeled by BDDs or factored forms, their complementation is straightforward and size preserving, so it is reasonable to assume that the area (and/or delay) is invariant under complementation. Note, however, that the complementation of a local function requires the input signals with complemented polarity. The cost of the inverters is implicit in the representation, but it eventually has to be taken into account; optimizing a network with this additional cost component is more difficult. Thus the global polarity assignment problem considered here corresponds to reducing the number of inverters in a network.

The problem of choosing the polarity of the network's local functions so that the required inverters are fewer than a given bound is intractable [43]. The inverter minimization problem can still be solved exactly by formulating it as a ZOLP, even though this approach may be practical only for networks of limited size. For tree (or forest) networks, a dynamic programming approach to the inverter minimization problem can yield the optimum choice in linear time. Unfortunately, most networks have reconverging paths. For these reasons, heuristic methods have been used [8, 43]. We now briefly survey methods for inverter reduction.

A heuristic algorithm is based on a local search that measures the variations in inverter count due to the complementation of a local function. Let v_x be the vertex under consideration, and let PFO(x) and NFO(x) be the subsets of direct successors of v_x whose functions depend on literals x and x', respectively. Note that PFO(x) ∩ NFO(x) may not be void. The inverter count variation depends on four factors:

An inverter is removed at the output of v_x when PFO(x) = ∅. (Before complementation, all direct successors of v_x depend on x' and an inverter is present at the output; after complementation, all direct successors depend on x and no inverter is required.)

An inverter is added at the output of v_x when NFO(x) = ∅. (Before complementation, all direct successors of v_x depend on x and no inverter is present at the output; after complementation, all direct successors depend on x'.)

For each direct predecessor v_y of v_x, an inverter is removed when NFO(y) = {x} and x ∉ PFO(y). (Before complementation, f_x depends on literal y' and an inverter is present; after complementation, f_x requires only literal y, and no other function uses y'.)

For each direct predecessor v_y of v_x, an inverter is added when NFO(y) = ∅. (Before complementation, f_x depends on literal y and no inverter is present at the output of v_y; after complementation, f_x requires literal y'.)

With these rules, the variation, called inverter saving, can be measured by local inspection. An example is shown in Figure 8.28.

FIGURE 8.28 Example of the inverter savings due to polarity assignment: (a) before complementation of f_x; (b) after complementation of f_x.
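The four rules translate directly into a local computation of the inverter saving. In this sketch the fanout sets are passed in explicitly; all vertex names are hypothetical.

```python
def inverter_saving(x, pfo, nfo, fanin, pfo_of, nfo_of):
    """Variation in inverter count if the local function at vertex x is
    complemented.  pfo/nfo are the direct successors of x reading
    literal x / x'; fanin lists the direct predecessors y of x, with
    pfo_of[y]/nfo_of[y] the corresponding successor sets of each y."""
    saving = 0
    if not pfo:                     # all successors read x': drop inverter
        saving += 1
    if not nfo:                     # all successors read x: add inverter
        saving -= 1
    for y in fanin:
        if nfo_of[y] == {x} and x not in pfo_of[y]:
            saving += 1             # y' was used only here: drop inverter
        if not nfo_of[y]:
            saving -= 1             # y' now becomes needed: add inverter
    return saving
```

For instance, a vertex whose successors all read x' and whose only predecessor y is read in complemented form by x alone gains two inverters when complemented.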
b. It would terminate when no local complementation can lead to positive savings.4. On the other hand. e ) . as in the case of static (or domino) CMOS designs. . Consider the fragment of a logic network shown in Figure 8. For macrocell design. Whereas the major goals in logic synthesis are to provide minimumarea (or minimumdelay) circuits possibly under delay (or area) constraints. . The process is repeated until an improvement is achieved. starting from those with the largest ones.4. The algorithm can be perfected by doing a group complementation that is reminiscent of the group exchange of the Kernighan and Lin partitioning algorithm [30].. The sequence of complementations that globally saves most inverters is accepted.Split by PDF Splitter Example 8. It is important to remark that the polarity assignment problem is solved best in conjunction with library binding. We now extend this result to multiplelevel networks. because each local function is directly implemented as a logic gate.. Therefore we extend the notion of primality and irredundancy to logic networks. because N F O ( s ) = x and x 6 P F O ( y ) .E) whose local functions are expressed in sum . Comparisons of different heuristics for polarity assignments are reported in references 181 and [43]. the synthesis of fully (or at least highly) testable circuits for singlelmultiple stuckat faults is also highly desirable. . the polarity assignment problem may have the additional constraint of requiring that each local function is negative (or positive) unate. an inverter is saved at the input of u . as in the case of twolevel logic circuits. We refer the reader to reference 1231 for synthesis of circuits that are delaylfaelt testable.3. ofproducts form is prime and irredundant if no literal or implicant of any local function can be dropped without changing the inputloutput behavior of the network.2. We restrict our analysis to stuckat faults. 
We summarize in this section the major findings on the relations between testable and optimal designs. Nevertheless.28.(V. the complementation of the function would not change the number of inverters at its output. Since both fanout subsets are not empty. c) and P F O ( x ) = {c. a fully testable circuit is one that has test patterns which can detect all faults. Let us consider the complementation o f f . This issue will be addressed in Section 10.  8. The sequence is constructed in a greedy fashion. We use the definition of testability introduced in Section 7. Definition 851 A logic network C.5 SYNTHESIS OF TESTABLE NETWORKS Optimization of multiplelevel logic circuits is tightly connected to their testability properties. which represent the most common fault model. where the real cost of the implementation of the logic functions by means of cells is commensurable with the inverter cost.. Namely. A simple greedy algorithm for phase assignment would complement those functions with positive savings.3. In this case. Here N F O ( x ) = (a. and functions are complemented only once. d . the inverter savings problem is also relevant for macrocellbased design styles.29. complementation~with a local increase of inveners are accepted to escape lowquality local minima. We recall that a necessary and sufficient condition for full testability for single stuckat faults of an ANDOR implementation of a twolevel cover is that the cover is prime and irredundant.
Definition 8.5.2. A logic network G_n(V, E) whose local functions are expressed in sum of products form is simultaneously prime and irredundant if no subset of literals and/or implicants can be dropped without changing the input/output behavior of the network.

Theorem 8.5.1. A logic network is prime and irredundant if and only if its AND-OR implementation is fully testable for single stuck-at faults.

Proof. A logic network can be put in one-to-one correspondence with a circuit implementation by replacing each local sum of products expression by some AND gates and one OR gate. Note that a complete circuit implementation would require inverters, but these do not alter the testability analysis and thus can be neglected. (This is consistent with our logic network model, where signals are available along with their complements and inverters are implicit in the representation.) Faults can happen at the inputs of the ANDs or of the OR.

Consider a vertex of the network and the AND-OR implementation of its local function in sum of products form. Assume that there is an untestable stuck-at-x, x ∈ {0, 1}, fault on an input of the OR gate related to vertex v_x. Since this fault is not detectable, it is not possible to observe it at the primary outputs, and we can set that input of the OR gate to x with no change in the circuit input/output behavior. Thus, either an implicant in the sum of products form is redundant (and the network is not irredundant) or an implicant can be expanded to a tautology (and hence it is not prime). Assume next that there is an untestable stuck-at-x fault on an input of an AND gate corresponding to variable y. Then, that input can be set to x without affecting the network input/output behavior. Hence, either the implicant can be dropped (and the network is not irredundant) or the variable y can be dropped (and the network is not prime). Conversely, if a literal or an implicant of some local function can be dropped while preserving the network input/output behavior, then the corresponding stuck-at fault on the related AND or OR input is not detectable, and the circuit is not fully testable.

Corollary 8.5.1. A logic network is simultaneously prime and irredundant if and only if its AND-OR implementation is fully testable for multiple stuck-at faults.

The key role in relating testability to primality and irredundancy is played by the don't care sets, which capture the interplay among the local functions in the logic network. When each local function is considered without don't care sets, its primality and irredundancy are just necessary conditions for full testability of the network.

Example 8.5.1. Consider the network of Figure 8.26 (a), whose local function f_x = ab + ac' + bc contains the redundant term ab, the consensus of the other two implicants. A stuck-at-0 fault on the corresponding input of the OR gate is not detectable; hence the network is not irredundant and the circuit is not fully testable. It may be made testable by replacing f_x with g_x = ac' + bc, i.e., by dropping the redundant implicant, as done by redundancy removal.

It can be shown by similar arguments [39] that when all local functions of a logic network are prime and irredundant sum of products forms with respect to their local (and complete) don't care sets, then the logic network is fully testable for single stuck-at faults.
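The correspondence between untestable single stuck-at faults and non-prime or redundant covers can be verified exhaustively on small two-level examples. In this sketch (data and fault labels are illustrative) a cover is a list of cubes, each a map from variable index to required value; the fault classes enumerated are the ones relevant to the primality and irredundancy checks: an OR input stuck-at-0 drops a cube, and an AND input stuck-at-1 drops a literal.

```python
from itertools import product

def eval_cover(cover, pat):
    """Evaluate an AND-OR cover: OR over cubes, each an AND of literals."""
    return int(any(all(pat[v] == val for v, val in cube.items())
                   for cube in cover))

def untestable_faults(cover, nvars):
    """Single stuck-at faults of the AND-OR implementation whose faulty
    function equals the good one, i.e. faults no test pattern detects."""
    pats = list(product((0, 1), repeat=nvars))
    good = [eval_cover(cover, p) for p in pats]
    faults = []
    for i, cube in enumerate(cover):
        variants = [("or-input s-a-0", cover[:i] + cover[i + 1:])]
        for v in cube:
            smaller = {w: val for w, val in cube.items() if w != v}
            variants.append((f"and-input {v} s-a-1",
                             cover[:i] + [smaller] + cover[i + 1:]))
        for name, faulty in variants:
            if [eval_cover(faulty, p) for p in pats] == good:
                faults.append((i, name))
    return faults

# f = a c' + b c is prime and irredundant: every such fault is testable.
# Adding the consensus term a b introduces an untestable fault.
prime = [{0: 1, 2: 0}, {1: 1, 2: 1}]
redundant = prime + [{0: 1, 1: 1}]
```

Running the check on `prime` returns no untestable faults, while on `redundant` the OR input of the consensus cube is reported untestable, matching the example above.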
This suggests a way to construct fully testable networks by computing don't care sets and minimizing the local functions. Note that the don't care conditions change as the local functions change. Thus it is possible to iterate (single-vertex) optimization of the local functions using the complete local don't care sets, until all local functions are prime and irredundant (with respect to these don't care sets). This approach has been implemented in program ESPRESSO-MLD [3], which simplifies the local functions and updates the internal don't care sets repeatedly. Experiments have shown that the method is effective only for networks of limited size, due to the possibility that the iteration does not terminate in a reasonable amount of time. Unfortunately, complete local don't care sets cannot be used for the simultaneous (multiple-vertex) optimization of the local functions. The use of compatible don't care sets would allow us to perform simultaneous minimization, but it would not guarantee full testability, because only subsets of the local don't care sets are taken into account.

An alternative way for constructing testable networks is based on flattening multiple-level networks to two-level representations, making these fully testable, and then re-transforming the two-level covers into multiple-level networks while preserving testability. The premise of this approach is that flattening does not lead to an explosion in size. If a network can be flattened, then it can be made fully testable for multiple stuck-at faults by computing prime and irredundant covers of each individual output with a logic minimizer (see Section 7.4). We need then transformations that are testability preserving and that allow us to construct a (possibly area/delay minimal) multiple-level network from a two-level representation. Hachtel et al. [29] showed that some algebraic transformations are just syntactic rewritings of the network and that they preserve multiple-fault testability. In particular, they proved formally that algebraic factorization, substitution and cube and kernel extraction are testability preserving. These transformations can then be used to transform a two-level cover into a multiple-level network with the desired area/delay properties and full testability for multiple faults. It is interesting to note that single-fault testability may not be preserved by algebraic transformations, as shown by some counterexamples [29]. No general result applies to Boolean transformations as far as testability preservation is concerned. Thus, instead of searching for transformations that preserve testability, it is possible to search for observability invariance as a result of a transformation, i.e., to relate transformations to the potential change of observability of one or more variables. Some results along these lines are reported in reference [6].

We would like to summarize this section by stating that optimizing circuit area correlates positively to maximizing testability. Indeed, (simultaneously) prime and irredundant networks, which guarantee full testability for (multiple) stuck-at faults, are also local minima as far as area reduction is concerned, using literals as an area measure. As far as the relation between delay and testability is concerned, Keutzer et al. [31] proposed algorithms that transform any logic network into a fully testable one (for single or multiple stuck-at faults) with equal or less delay. Hence a minimal-delay circuit may correspond to a fully testable one. Unfortunately, we still ignore the triangular relations among area, delay and testability. Namely, given a partially testable circuit with a given area and delay, we do not know how to make it fully testable while guaranteeing no increase in both area and delay, which indeed may happen. This is the subject of future research.
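The iterative simplification loop described above can be sketched as follows: tentatively drop an implicant or a literal of a local cover and keep the change exactly when the surrounding network behavior, which implicitly encodes the complete local don't care conditions, is preserved. The environment `behavior` and the cover below are hypothetical; this is an exhaustive-simulation sketch, not the tabular machinery of an actual minimizer.

```python
from itertools import product

def eval_local(cover, pat):
    return int(any(all(pat[v] == val for v, val in cube.items())
                   for cube in cover))

def make_prime_irredundant(cover, behavior, nvars):
    """Greedily drop implicants and literals of a local sum-of-products
    cover as long as the network behavior, evaluated through the
    environment behavior(pat, local), is unchanged.  On exit no single
    drop is possible, so the cover is prime and irredundant with
    respect to its complete local don't care conditions."""
    pats = list(product((0, 1), repeat=nvars))
    net = lambda cv: [behavior(p, lambda q: eval_local(cv, q)) for p in pats]
    reference = net(cover)
    changed = True
    while changed:
        changed = False
        for i in range(len(cover)):
            trials = [cover[:i] + cover[i + 1:]]               # drop implicant
            trials += [cover[:i] + [{w: val for w, val in cover[i].items()
                                     if w != v}] + cover[i + 1:]
                       for v in cover[i]]                      # drop one literal
            for t in trials:
                if net(t) == reference:
                    cover, changed = t, True
                    break
            if changed:
                break
    return cover

# Trivial environment: the local output is the primary output.
identity = lambda pat, local: local(pat)
```

Applied to the cover ab + ac' + bc with the identity environment, the loop drops the consensus term and stops, since the remaining cover is prime and irredundant.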
8.6 ALGORITHMS FOR DELAY EVALUATION AND OPTIMIZATION

A major problem in circuit design is achieving the maximum performance for the given technology. In the context of logic synthesis and optimization, maximizing performance means reducing the maximum propagation delay from the inputs to the outputs. When combinational networks are part of sequential circuits, the maximum propagation delay in the combinational component is a lower bound on the cycle-time, whose reduction is one of the major design goals.

Multiple-level networks offer interesting trade-offs between area and delay. Very often the fastest circuit is not the smallest. An example is given by combinational adders: ripple-carry implementations tend to be compact and slow, while carry look-ahead adders are larger and faster (see Figure 1.16). Thus circuit transformations can target the decrease of area or delay, possibly while satisfying bounds on delay or area. This leads eventually to circuit implementations that are noninferior points in the area/delay evaluation space.

Performance evaluation is an important task per se, even when it is not coupled to optimization. Whereas performance can be determined accurately by circuit simulation, this becomes infeasible for large circuits, due to the large number of possible input/output responses. Timing analysis consists of extracting the delay information directly from a logic network.

We shall consider first delay modeling issues for logic networks in Section 8.6.1. The modeling framework will enable us to define topological critical paths, whose delay measure provides useful, but loose, information on performance. We shall then consider the problem of determining genuine critical paths, i.e., paths that can propagate events and have the longest delays, by considering the logic information of the local functions while computing the path delays. This involves weeding out the false paths, as shown in Section 8.6.2. Eventually we shall consider circuit transformations and algorithms for delay (or area) reduction, possibly under area (or delay) constraints, in Section 8.6.3.

8.6.1 Delay Modeling

Delay modeling of digital circuits is a complex issue. We present here simple models and the fundamental concepts that provide sufficient information to understand the algorithms. The extension of the algorithms presented here to more elaborate models is often straightforward.
We are concerned here with delay evaluation in a logic network. Assume first that we are given a bound network. Then every vertex of the network is associated with a cell, whose timing characteristics are known. Often propagation delays are functions of the load on the cell, which can be estimated by considering the load for each fanout stem, as well as of the operating conditions (e.g., temperature, power supply) and fabrication parameters. In general, a certain margin of error exists in propagation delay estimates. Hence worst-case estimates are used that include the possible worst-case operating conditions. The worst-case assumption is motivated by the desire of being on the safe side: underestimating delays may lead to circuits that do not operate correctly, while overestimating delays may lead to circuits that do not exploit a technology at its best.

For our purposes, we can just assume that the propagation delay assigned to a vertex of the logic network is a positive number. More detailed models include also a load dependency factor and separate estimates for rising and falling transitions.

For unbound networks, less precise measures of the propagation delays are possible. Different models have been used for estimating the propagation delay of the virtual gate implementing a local Boolean function. The simplest model is to use unit delays for each stage. A more refined model is to relate the delay to a minimal factored form. The underlying principles are that a cell generator can synthesize logic gates from any factored form and that the decomposition tree of the factored form is related to the series and parallel transistor interconnections. Empirical formulae exist for different technologies that relate the parameters of the factored form to the delay [8, 21]. Good correlation has been shown between the predicted delay and the delay measured by circuit simulation.

We assign also to each vertex an estimate of the time at which the signal it generates settles, called data-ready time or arrival time. Data-ready times of the primary inputs denote when these are stable, and they represent the reference points for the delay computation in the circuit. Often the data-ready times of the primary inputs are zero. This is a convenient convention for sequential circuits, because the delay computation starts at the clock edge that makes the data available to the combinational component. In general, positive input data-ready times may be useful to model a variety of effects in a circuit, including specific delays through the input ports or through circuit blocks that are not part of the logic network abstraction, such as those considered in this chapter. Similarly, a nonzero propagation delay can be associated with the output vertices to model the delay through the output ports, in the same way as nonzero input data-ready times can model delays through the input ports. Note that each output vertex depends on one internal vertex.

We can assume then that the data-ready time at each internal and output vertex of the network is the sum of the propagation delay plus the data-ready time of the latest local input. For the time being, data-ready times are computed by considering the dependencies of the logic network graph only, excluding the possibility that some paths would never propagate events due to the specific local Boolean functions. (This assumption will be removed in Section 8.6.2.)
Let the propagation delays and the data-ready times be denoted by {d_i : v_i ∈ V} and {t_i : v_i ∈ V}, respectively. Then:

t_i = d_i + max over (v_j, v_i) ∈ E of t_j,   for all v_i ∈ V   (8.38)

The data-ready times can be computed by a forward traversal of the logic network in O(|V| + |E|) time. The maximum data-ready time occurs at an output vertex. It corresponds to the weight of the longest path in the network, where the weights are the propagation delays associated with the vertices. Such a path is called a topological critical path, and the maximum data-ready time is called the topological critical delay of the network. Note that such a path may not be unique. Timing optimization problems for multiple-level networks can be formulated by restructuring the logic network to satisfy bounds on the output data-ready times or to minimize them.

We denote the required data-ready time at an output by a bar, e.g., t̄_i. It is usual to record at each vertex the difference between the required data-ready time and the actual data-ready time. This quantity is called the timing slack:

s_i = t̄_i − t_i,   for all v_i ∈ V   (8.39)

The required data-ready times can be propagated backwards, from the outputs to the inputs, by means of a backward network traversal, namely:

t̄_j = min over (v_j, v_i) ∈ E of (t̄_i − d_i),   for all v_j ∈ V   (8.40)

Example 8.6.1. Consider the logic network of Figure 8.29, with the propagation delays of the internal vertices as annotated there (ranging from 1 to 10 units) and the data-ready times of the primary inputs equal to zero. A forward traversal by Equation 8.38 yields the data-ready times of the internal vertices, which evaluate to 3, 11, 13, 15, 17 and 20 units; at each multiple-fanin vertex the latest input counts, e.g., 3 + max(15, 17) = 20. The maximum data-ready time, occurring at an output vertex, is 25 units. Hence the topological critical delay is 25, and the topological critical path is traced by following backwards, from that output, a latest fanin at each vertex.
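The forward traversal of Equation 8.38 can be sketched in executable form. The five-vertex network below is a hypothetical example, not the network of Figure 8.29; memoization makes the recursion equivalent to the O(|V| + |E|) traversal:

```python
from functools import lru_cache

# A small hypothetical bound network: delays[v] is the propagation delay
# d_v of vertex v, and preds[v] lists its fanins (primary inputs have none).
delays = {"a": 0, "b": 0, "g": 3, "h": 8, "o": 2}
preds = {"a": [], "b": [], "g": ["a"], "h": ["a", "b"], "o": ["g", "h"]}

@lru_cache(maxsize=None)
def data_ready(v):
    """Equation 8.38: t_v = d_v + max of the fanin data-ready times."""
    if not preds[v]:
        return delays[v]  # a primary input settles at its own arrival time
    return delays[v] + max(data_ready(u) for u in preds[v])

t = {v: data_ready(v) for v in delays}
critical_delay = max(t.values())   # topological critical delay
print(t, critical_delay)           # t["o"] = 2 + max(3, 8) = 10
```

Here the topological critical path is (v_b, v_h, v_o), since v_h provides the latest input to the output vertex.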
FIGURE 8.29 Example of logic network. (Vertices are placed according to their data-ready times.)

Example 8.6.2. Consider again the circuit of Example 8.6.1, shown in Figure 8.29, and let us assume that the required data-ready time at both outputs is 25, i.e., equal to the maximum data-ready time. The required data-ready times and the slacks of the internal and input vertices can be computed by a backward network traversal in O(|V| + |E|) time. The vertices along the topological critical path have zero slack, and the remaining vertices have positive slacks. For instance, the slacks at the two output signals are s_x = 2 and s_y = 0, which means that signal x could be delayed up to 2 units without violating the bound, while signal y cannot.

Critical paths are identified by the vertices with zero slack, when the required data-ready times at the outputs are set equal to the maximum data-ready time; these vertices induce a path if the critical path is unique. Note the similarities and differences with scheduling (see Section 5.3): in scheduling we use the start time of a computation, while in logic design the data-ready time denotes the end time of a logic evaluation.
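The backward propagation of Equations 8.39 and 8.40 can be sketched in the same style, again on a small hypothetical network (not the one of Figure 8.29):

```python
# Hypothetical network: delays and fanins as before, with the data-ready
# times t already computed by a forward traversal.
delays = {"a": 0, "b": 0, "g": 3, "h": 8, "o": 2}
preds = {"a": [], "b": [], "g": ["a"], "h": ["a", "b"], "o": ["g", "h"]}
succs = {v: [w for w in preds if v in preds[w]] for v in preds}
t = {"a": 0, "b": 0, "g": 3, "h": 8, "o": 10}

required = {"o": 10}   # required data-ready time at the (single) output

def req(v):
    """Equation 8.40: tbar_v = min over fanouts w of (tbar_w - d_w)."""
    if v in required:
        return required[v]
    return min(req(w) - delays[w] for w in succs[v])

# Equation 8.39: the slack is the margin between required and actual times.
slack = {v: req(v) - t[v] for v in t}
print(slack)  # zero slack identifies the critical path (v_a, v_h, v_o)
```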
8.6.2 Detection of False Paths*

Topological critical paths can be determined easily from the logic network graph only. Unfortunately, they may lead to overestimating the delay of the circuit. It may be possible that a topological critical path is a false path, i.e., a path along which no event (signal transition) can propagate. This can be explained by considering the interplay of the local functions of the network with the network topology.

Example 8.6.3. Consider the circuit of Figure 8.30. Assume that the propagation delay of each gate is 1 unit of delay and that the inputs are available at time 0. The longest topological path is (v_a, v_c, v_d, v_o) and has a total of 4 units of delay. It is easy to see that no event can propagate along this path, because the AND gate requires e = 1 and the OR gate requires e = 0 to propagate the event. The longest true path has a total of 3 units of delay.

The detection of the critical path is important for timing verification, i.e., checking if a circuit can operate at a given speed. It is also important for delay optimization, because logic transformations should be applied to the true critical paths, which are responsible for the circuit speed. Therefore, it is very important to weed out false paths and detect the true critical paths. The true critical paths are called critical paths for brevity and may have smaller delay than the topological critical paths. Obviously, false paths do not affect the circuit performance.

Definition 8.6.1. A path of a logic network is sensitizable if an event can propagate from its tail to its head. A critical path is a sensitizable path of maximum weight.

In the sequel, we do not distinguish between a network of (virtual) gates and its representative graph. Let us consider now conditions for the sensitization of a path P = (v_{x0}, v_{x1}, ..., v_{xm}). The direct predecessors of a vertex under consideration are called the inputs to the vertex; the inputs that are not along the path are called side inputs. An event propagates along P if ∂f_{x(i+1)}/∂x_i = 1, for all i = 1, 2, ..., m − 1, at the time the event propagates. Since the Boolean differences are functions of the side inputs, and the values on the side inputs may change, the sensitization condition is that the Boolean differences must be true at the time that the event propagates. For this reason, this form of sensitization is called dynamic sensitization.

Example 8.6.4. Consider again Example 8.6.3 and the path (v_a, v_c, v_d, v_o) of Figure 8.30, assuming that the event leaves v_a at time 0. For the event to propagate, we need ∂f_d/∂c = e = 1 at time 2 and ∂f_o/∂d = e' = 1 at time 3. On the other hand, the value of e settles at time 1 and hence it is either 0 or 1. It is impossible to meet both conditions. Therefore no event can propagate along the path.
Another important criterion for the detection of critical paths is robustness. The robustness of a method relies on interpreting correctly its underlying assumptions. Consider the propagation delay of a vertex. It may be expressed by one number, representing the worst-case condition, or by an interval, representing the possible delays of the gate implementation under all possible operating conditions. The former delay model is called a fixed delay model and the latter a bounded delay model.

A weaker condition than dynamic sensitization is static sensitization, which is not sufficient to characterize the false-path problem but deserves an analysis because it is related to other sensitization criteria. A path is statically sensitizable if there is an assignment of primary inputs t ∈ B^n such that ∂f_{x(i+1)}/∂x_i (t) = 1, for all i = 1, 2, ..., m − 1. In other words, in static sensitization we assume that the Boolean differences must be true at time infinity, when all signals settle, instead of when the event propagates.

Example 8.6.5. Consider first the circuit of Figure 8.30, where all input data-ready times are zero and all vertices (gates) have unit delay. The path (v_a, v_c, v_d, v_o) is not statically sensitizable, because no input vector can make e = 1 and e' = 1. The path is false, as we argued before.

Consider next the circuit of Figure 8.31. Let the event be a simultaneous transition of a and b from 1 to 0, and let us assume that c = 0 in the interval of interest. Then e rises to 1 and d drops to 0 after 1 unit of time, and signal g drops to 0 after 2 units of time. Eventually signal o rises to 1 and drops to 0 after 2 and 3 units of time, respectively. Hence the critical path delay is 3, and any sampling of the output after only 2 units of delay would yield a wrong result. On the other hand, for a to propagate to d we must have b = 1, while for g to propagate to o we must have e = 1, which implies a = b = 0. It is impossible to meet both the first and last conditions. Hence no input vector statically sensitizes the paths through v_d, even though, as the waveforms show, they propagate events: both paths with delay 3 are true critical paths, even though they are not statically sensitizable.

As a consequence, any approach based on static sensitization may underestimate the delay, because there may be paths that are not statically sensitizable but that can still propagate events.
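Static sensitization can be checked by enumerating the primary input vectors and evaluating the Boolean differences along the path. The two-gate circuit below is a hypothetical sketch in the spirit of Figure 8.30, not the figure itself: the path (a, d, o) goes through an AND gate d = a·e and an OR gate o = d + e, where the side input e = b' feeds both gates.

```python
from itertools import product

def statically_sensitizable():
    """Return True iff some input vector makes every Boolean difference
    along the hypothetical path (a, d, o) evaluate to 1."""
    for a, b in product([0, 1], repeat=2):
        e = 1 - b          # the shared side input
        dd_da = e          # Boolean difference of d = a AND e w.r.t. a
        do_dd = 1 - e      # Boolean difference of o = d OR e w.r.t. d
        if dd_da == 1 and do_dd == 1:
            return True
    return False

# The AND gate needs e = 1 and the OR gate needs e = 0: never both.
print(statically_sensitizable())  # prints False
```

This reproduces the contradiction of Example 8.6.5: the two Boolean differences place incompatible requirements on the same side input, so the path is not statically sensitizable.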
In the transition mode of operation, the variables are assumed to hold their previous values until they respond to an input stimulus vector t. Conversely, in the floating mode of operation, the values of the variables are assumed to be unknown until they respond to an input stimulus vector t. Any circuit delay measure involves then two input vectors, the first devoted to setting the circuit in a particular state. Needless to say, the circuit under consideration is memoryless; nevertheless, due to parasitic capacitances in practical circuits, a delay computation requires assumptions on the "memory" that circuit nets can have. The transition mode abstracts accurately the operation of the circuit, but delay computation with this mode can be lengthy. Whereas the delay computation with the floating mode of operation is simpler to perform, it results in pessimistic estimates of the performance: the critical path delay is never inferior to that computed with the transition mode [14].

It is interesting to remark that the transition mode of operating a circuit may not have the monotone speedup property [32]. Namely, the replacement of a gate with a faster one may lead to a slowdown of the overall circuit. The following example is adapted from reference [32].

Example 8.6.6. Consider the circuit in Figure 8.32 (a). Assume that all propagation delays are 2 units. The waveforms for a rising transition at the input are shown in Figure 8.32 (b). Note that the circuit output settles after 6 delay units. Assume now that the propagation delay of the shaded gate is 1, all other delays being unchanged. The waveforms for a rising transition at the input are shown now in Figure 8.32 (c): the output settles later. Hence speeding up the shaded gate makes the overall circuit slower. Note also that, in the floating mode, the delay of the circuit is 6 regardless of the value of the propagation delay of the shaded gate.

Thus, any robust method for false-path detection using the transition mode must take the monotone speedup effect into account. Moreover, when considering the transition mode of operation, the fixed gate delay model and the bounded gate delay model may yield different delay estimates, whereas in the floating mode of operation the critical path delay of a circuit is the same when considering either bounded or worst-case propagation delays [14].

There are several theories for computing the critical path delays that use different underlying assumptions [14, 24, 27, 32, 33]. We summarize here an approach that uses the floating mode of operation [24], and we present the intuitive features of that analysis, as reported in reference [24]. Without loss of generality, assume that the circuit can be represented in terms of AND and OR gates, in addition to inverters that do not affect our analysis. We say that 0 is a controlling value for an AND gate and that 1 is a controlling value for an OR gate. The controlling value determines the value at the gate output regardless of the other input values. We say that a gate has a controlled value if one of its inputs has a controlling value.

Consider now a path P = (v_{x0}, v_{x1}, ..., v_{xm}). A vector statically cosensitizes the path to 1 (or to 0) if x_m = 1 (or 0) and if x_i presents a controlling value whenever v_{x(i+1)} has a controlled value. Static cosensitization is a necessary condition for a path to be true, whereas static sensitization is not. Note that this criterion differs from static sensitization: it is based on logic properties only and does not take delays into account.
Example 8.6.7. Consider the circuit of Figure 8.31. The path through v_d is statically cosensitized to 0 by the input vector a = 0, b = 1, c = 0, because the gates with controlled values along the path have a controlling input on the path. On the other hand, static sensitization requires e = 1, i.e., requires the side input to present the only controlling value, and fails on this path, as shown before.

When a vector statically cosensitizes a path to 1 (or 0), there may be some other path that sets the path output to that value earlier. The reason is that when a gate has a controlled value, the delay up to its output is determined by the first controlling input,
while when it does not have a controlled value, the delay is determined by the last (noncontrolling) input. Thus, in the floating mode of operation, for a path to cause the output transition, the following must hold for all gates along the path: if a gate has a controlled value, then the path must provide the first of the controlling values; if a gate does not have a controlled value, the path must provide the last of the noncontrolling values.

As a result, a path is false if, for all possible input vectors, one of the following conditions is true:

1. A gate along the path is controlled and both the path and a side input have controlling values, but the side input presents the controlling value first.
2. A gate along the path is controlled, the path provides a noncontrolling value and a side input provides a controlling value.
3. A gate along the path is not controlled, but a side input presents the controlling value last.

Example 8.6.8. Consider the circuit of Figure 8.30. We question the falsity of the path (v_a, v_c, v_d, v_o). For input vector a = 0, b = 0, condition 2 occurs at the OR gate. For input vector a = 0, b = 1, condition 1 occurs at the AND gate. For input vector a = 1, b = 0, condition 1 occurs at the OR gate. For input vector a = 1, b = 1, condition 2 occurs at the AND gate. Therefore the path is false.

There are different interesting design problems related to false-path detection. A very important question is to determine if a circuit works at a given speed, i.e., if its critical path delay is no larger than a bound t̄. This problem can be rephrased as checking whether all paths with delay larger than t̄ are false. A test for detecting a false path leverages concepts developed in the testing community, such as the D-calculus. The description of these techniques goes beyond the scope of this book; we refer the interested reader to reference [24] for further details.

Another important issue is to determine the critical path delay of a circuit. This problem can be reduced to the previous one, by making a sequence of tests of whether the critical path delay is no larger than t̄, where t̄ is the outcome of a binary search in the set of path delays sorted in decreasing order.

Indeed, the number of paths in a network grows exponentially with the number of vertices, and false-path detection is particularly important for those circuits with many paths with similar delays. Therefore it is important to have methods that can detect groups of false paths, to avoid checking paths one at a time. Multiple false-path detection methods have been developed recently [24].
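The binary search over sorted path delays can be sketched as follows. The oracle `all_longer_false` is a placeholder assumption, standing in for the sequence of false-path tests described above:

```python
# sorted_delays lists the distinct path delays in decreasing order;
# all_longer_false(tau) reports whether every path with delay larger
# than tau is false (in practice, the outcome of false-path tests).
def true_critical_delay(sorted_delays, all_longer_false):
    lo, hi = 0, len(sorted_delays) - 1
    answer = sorted_delays[0]          # pessimistic: topological delay
    while lo <= hi:
        mid = (lo + hi) // 2
        tau = sorted_delays[mid]
        if all_longer_false(tau):
            answer = tau               # critical delay is at most tau
            lo = mid + 1               # try a smaller tau
        else:
            hi = mid - 1               # some longer path is true
    return answer

# Hypothetical instance: paths of delay 9 and 8 are false, 6 is true.
print(true_critical_delay([9, 8, 6, 4], lambda tau: tau >= 6))  # prints 6
```

The search performs O(log n) falsity tests over n distinct path delays, instead of testing every path individually.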
8.6.3 Algorithms and Transformations for Delay Optimization

We consider in this section algorithms for reducing the critical delay (possibly under area constraints) and for reducing the area under input/output delay constraints. We shall refer to critical paths without distinguishing between topological and sensitizable paths, because the techniques are independent of the criterion used.

SYNTHESIS OF MINIMAL-DELAY CIRCUITS. Let us consider first the problem of minimizing the critical delay, which arises often in synchronous circuit design, where the logic network under consideration is the combinational portion of a sequential circuit. Recall that the critical delay is a lower bound for the cycle-time. Most critical delay optimization algorithms have the frame shown in Algorithm 8.6.1.

REDUCE_DELAY(G_n(V, E), ε) {
    repeat {
        Compute critical paths and critical delay τ;
        Set output required data-ready times to τ;
        Compute slacks;
        U = vertex subset with slack lower than ε;
        W = select vertices in U;
        Apply transformations to vertices W;
    } until (no transformation can reduce τ);
}

ALGORITHM 8.6.1

The parameter ε is a threshold that allows us to consider a wider set of paths than the critical ones. When ε = 0, the vertices in U are critical, and they induce a path if the critical path is unique. Quasi-critical paths can be selected with ε > 0. It is obvious that the smaller the difference between the critical delay and the largest delay of a noncritical path, the smaller the gain in speeding up the critical paths only.

Most algorithms differentiate in the selection of the vertex subset W and in the transformations that are applied. Obviously the quality of the results reflects the choice. Transformations target the reduction of the data-ready time of a vertex. There are two possibilities: reducing the propagation delay of the vertex and reducing its dependency on some critical input. The propagation delay model of the local functions plays an important role, because it is tightly coupled to the transformations. The transformation must not have as a side effect the creation of another critical path with equal or larger delay; this can be ensured by monitoring the data-ready variation at the neighboring vertices, which should be bounded from above by their corresponding slacks. When area constraints are specified, the marginal area variation of each transformation should be recorded and checked against a bound.

Note that recomputation of the critical paths and slacks is also an important issue for the computational efficiency of the algorithm. When a single transformation is applied at each iteration of the algorithm, several schemes for determining the cone of influence of the transformation have been devised, so that the data-ready times and slacks can be updated without redundant computation. Alternatively, more transformations can be applied between two recomputations of the critical path.
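The control frame of REDUCE_DELAY can be sketched in executable form. The "network" below is a hypothetical bare chain of gate delays and the "transformation" simply halves the delay of the selected vertex; both are stand-ins for real critical-path extraction and logic transformations such as elimination:

```python
def reduce_delay(delays, epsilon=0, max_iter=20):
    """Toy REDUCE_DELAY loop on a chain of gates with the given delays."""
    for _ in range(max_iter):
        tau = sum(delays.values())            # critical delay of a chain
        slack = {v: 0 for v in delays}        # on a chain, all vertices critical
        u = [v for v in delays if slack[v] <= epsilon]
        w = max(u, key=lambda v: delays[v])   # select the slowest critical vertex
        if delays[w] <= 1:
            break                             # no transformation can reduce tau
        delays[w] //= 2                       # apply the "transformation"
    return sum(delays.values())

print(reduce_delay({"g": 8, "h": 4, "k": 2}))  # 8+4+2 shrinks to 1+1+1 = 3
```

The exit condition mirrors the `until` clause of the frame: the loop stops as soon as no applicable transformation reduces the critical delay.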
We consider first, as a simple example, the case of unit propagation delays. In this case, a minimum-delay network is a single-stage network, which can be obtained by the ELIMINATE algorithm. Single-stage networks may have local functions so complex that they are not practical. To cope with the problem of literal and area explosion, vertex elimination can be applied selectively: the selective elimination of a vertex into its critical successor removes one stage of delay along the corresponding critical path.

Let us consider algorithm REDUCE_DELAY and let us choose W as a minimum-weight vertex separation set in the subgraph induced by U, where the weight of each vertex is its value, i.e., the increase in area due to its elimination. Those vertices that provide primary output signals are given infinite weight, because they cannot be eliminated. An area constraint can be easily incorporated as a stopping criterion: the algorithm terminates when an elimination causes the area estimate to overcome the given bound. Thus the algorithm generates a sequence of networks corresponding to different area/delay trade-off points.

Example 8.6.10. Consider the network of Figure 8.33 (a). Assume that all gates have unit delay and all input data-ready times are 0, so that the topological critical paths span the maximum number of stages. Let us consider the values of the vertices, i.e., the area increases due to their elimination (using a factored-form model). A critical vertex separation set of minimum cost can then be selected; its elimination leads to the expressions s = eghk and w = grs, as shown in Figure 8.33 (b). At this point a few paths with weight 2 still remain. They would be eliminated in the next iteration of the algorithm, which would choose the only separation set left.

Note that as the algorithm proceeds, the local expressions get larger and the assumption of unit-delay gates becomes less reasonable. Nevertheless, general propagation delay models can be used, e.g., models where the propagation delay grows with the size of the expression and with the fanout load. When the delay model depends on the fanout load, neighboring vertices can be affected by a transformation and should be checked.

Example 8.6.11. Let us consider a delay model where the propagation delay grows with the size of the expression and with the fanout load, and let us analyze the fragment of network shown in Figure 8.34 (a). The elimination shown in Figure 8.34 (b) saves one stage of delay; however, the gain may be offset by the larger propagation delay of the target vertex, because its function now has more literals. In addition, the load on the remaining fanins has increased, which can raise their propagation delays. Even though a path through such a fanin was not critical, it may become so if the variation in its data-ready time is larger than its slack.
FIGURE 8.33 (a) Example of logic network. (b) Example of logic network after two iterations of the algorithm.

FIGURE 8.34 (a, b) Network fragments before and after elimination. (c, d) Network fragments before and after substitution. (Shaded area is critical.)
Let us analyze next the network fragment shown in Figure 8.34 (c), described by the expressions:

x = ka + kb + rde

Let (v_r, v_x) be on a critical path, the data-ready time t_r being much larger than the other data-ready times of the inputs to v_x. Then it is likely that substituting y for a + b in f_x reduces its propagation delay d_x, and hence the data-ready time t_x, as shown in Figure 8.34 (d). Note also in this case that the load on v_y has increased, which may raise the data-ready times of the other fanouts of v_y. Even though this may not affect the critical path, we must ascertain that the variation of the data-ready time at each affected vertex is bounded by its slack.

A more elaborate approach is used by the SPEED_UP algorithm in program MIS [41]. The algorithm can fit into the frame of REDUCE_DELAY shown before; note that REDUCE_DELAY can support other choices for W as well as other transformations. In this approach, the network is decomposed beforehand into two-input NAND functions and inverters. This step makes the network homogeneous and simplifies the delay evaluation. Different propagation delay models can be used, ranging from unit delay to unit delay with fanout dependency or to library cell evaluation.

The set W is determined by first choosing an appropriate vertex separation set in the subgraph induced by U and then adding to it the predecessor vertices separated by no more than d stages, d being a parameter of the algorithm. The separation set is chosen as a minimum-weight separation set, where the weights are a linear combination of the area penalty and the potential speed-up. Vertices with successors outside the subnetwork are duplicated; the increase in area due to this duplication is called the area penalty, and the potential speed-up is an educated guess on the speed-up achievable after elimination and resynthesis.

Then unconstrained elimination is applied to the subnetwork induced by W, and the subnetwork is resynthesized by means of a timing-driven decomposition, i.e., by selecting appropriate divisors according to the data-ready times. Note that the data-ready times of the inputs to the subnetwork are known, and so the data-ready times can be computed correctly bottom up. The recursive decomposition process extracts first the vertices that are fed by the inputs, while optimizing the delay. Eventually the subnetwork is cast into NAND functions. The algorithm is controlled by a few empirical parameters: ε, d, the coefficients of the linear combination of area penalty and potential speed-up, and the choice of propagation delay model. We refer the reader to reference [41] for more details.

Example 8.6.12. Consider the circuit of Figure 8.35 (a), with corresponding expression a' + b' + c' + d'e'. Let all NAND gates have propagation delay equal to 2 units and each inverter equal to 1 unit. Let the data-ready times of all inputs be 0, except for d, where t_d = 3. The critical path goes then through the late-arriving input d, and its delay is 11 units. Assume that the vertex feeding the output is selected as the vertex separation set and that the parameter d = 5; let us assume also that ε = 2. Thus W includes the vertices on the path from v_d to the output, including the gates described above. Figure 8.35 (b) shows the circuit after the subnetwork has been eliminated into a single vertex. The subnetwork is then resynthesized by a new decomposition, shown in Figure 8.35 (c). The decomposition has a structure similar to the original circuit, but it reduces the number of stages for the late-arriving input d, the critical delay being now equal to 8 units. Almost all input/output paths are now critical. Note that the gates in the shaded area have been duplicated. The cost of that duplication (one NAND gate and one inverter) has been weighted against the possible speed-up of the network due to a more favorable decomposition that balances the skews of the inputs.
FIGURE 8.35 (a) Logic network. (b) Logic network after elimination. (c) Logic network after resynthesis by a new decomposition.

SYNTHESIS OF MINIMAL-AREA CIRCUITS UNDER DELAY CONSTRAINTS. Another relevant problem in multiple-level logic synthesis is the search for minimal-area implementations subject to timing constraints. The problem can be decomposed into two subproblems: compute a timing-feasible solution, i.e., one that satisfies all timing constraints, and then compute an area-optimal solution. The former subproblem is the more difficult one, and sometimes no solution may be found. Unfortunately, due to the heuristic nature of multiple-level logic optimization, the lack of a solution may not be related to an overspecification of the design, because the method itself may not be able to find a solution.
Timing constraints are expressed as input arrival times and output required dataready times. In particular, different required dataready times on different outputs are transformed into a single constraint. This can be done by using the largest required dataready time as an overall constraint and by adding the difference with respect to the actual required dataready times as "fake" delays on the output ports. A circuit is timing feasible if no slack is negative. The larger the magnitude of a negative slack, the more acute is the local problem in satisfying the timing requirements, because a major restructuring is required in the neighborhood of that vertex.

A first approach to achieve timing feasibility is to apply transformations to the vertices with negative slacks repeatedly until all slacks are nonnegative. A pitfall of this approach is that no transformation may be able to make noncritical those vertices with large negative slacks. Since major changes require a sequence of transformations and most of them lack a lookahead capability, no transformation may be found and the method may fail.

A gradual approach to achieving timing feasibility is the following. At each iteration, some critical vertices are identified and transformed. The goal is to find the transformation that most reduces the delay, rather than a transformation that reduces the delay by a given amount. Hence it is more likely that a transformation is found and that the algorithm succeeds. We can then use algorithm REDUCE_DELAY with a different exit condition: namely, meeting the required dataready time breaks the loop. Care has to be taken that no transformation violates the timing constraints: if a transformation increases the dataready time at any vertex, this amount should be bounded from above by the slacks. Note that efficient updates of the dataready times and slacks are of great importance for the overall efficiency of the method.

When the given circuit is timing feasible, the search for a minimum-area solution can be done by logic transformations. For example, we could eliminate vertices with negative value. When considering more realistic propagation delay models, the choice of the transformations and their applicability is obviously more involved.

Example 8.6.13. Consider again the network of Figure 8.33 (a), where gates have unit delay and all input dataready times are 0. Assume that the required dataready times at the outputs for x and z are smaller by one unit than the largest required dataready time, which is 3. We could then assume a single required dataready time of 3 by adding an additional unit delay on the output ports for x and z. The network is not timing feasible, because the critical path has weight 4. Let us then set the required dataready time to 4 and compute the slacks. We would detect the same critical paths as in the earlier example. Assume that a set W of critical vertices is selected and that these vertices are eliminated. Then, a successive computation of the longest path would result in a critical delay of 3, matching the required dataready time after having added the fake output delays. Since the timing constraints are now met, we can look for other transformations that reduce the area while preserving timing feasibility. It is important to remember that the transformations are straightforward in this example, because the unit delay model was purposely chosen to show the underlying mechanism of the algorithm.
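The dataready-time and slack computations used throughout this discussion can be sketched as follows. This is a minimal illustration, assuming the network is given as a topologically ordered fanin table and a single required dataready time applies at the outputs; the data-structure choices are assumptions, not the book's notation.

```python
def dataready_and_slacks(gates, delay, arrival, required):
    """Compute dataready times and slacks in a combinational network.

    gates:    dict mapping each gate to its list of fanin signals,
              listed in topological order.
    delay:    dict mapping each gate to its propagation delay.
    arrival:  dict mapping each primary input to its dataready time.
    required: the single required dataready time (assumed to apply
              to every signal, as after the "fake delay" transformation).
    """
    # Dataready times are computed correctly bottom up, since the
    # dataready times of the inputs are known.
    t = dict(arrival)
    for g, fanins in gates.items():
        t[g] = delay[g] + max(t[f] for f in fanins)

    # Required times propagate backward through each fanin edge;
    # slack = required time - dataready time.
    r = {s: required for s in t}
    for g in reversed(list(gates)):
        for f in gates[g]:
            r[f] = min(r[f], r[g] - delay[g])
    slack = {s: r[s] - t[s] for s in t}
    return t, slack
```

A circuit is then timing feasible exactly when `min(slack.values()) >= 0`, and the vertices with the smallest slacks lie on the critical paths.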
8.7 RULE-BASED SYSTEMS FOR LOGIC OPTIMIZATION

Expert systems are an application of artificial intelligence to the solution of specific problems. Rule-based systems are a class of expert systems that use a set of rules to determine the action to be taken toward reaching a global goal [13]. A key concept in expert systems is the separation between knowledge and the mechanism used to apply the knowledge, called the inference procedure. Hence this methodology is applicable to solving a wide class of problems, each characterized by its domain-specific knowledge. Rule-based systems evolved over the years to a good degree of maturity, and several knowledge engineering frameworks have been designed to support the design and use of such systems. We refer the reader interested in rule-based systems to reference [13] for details.

We consider in this section the application of rule-based systems to multiple-level logic optimization. The first logic synthesis and optimization systems used rules. The most notable example is IBM's LOGIC SYNTHESIS SYSTEM (LSS) [18, 19]. Some commercial logic synthesis and optimization systems are based on rules. Rule-based systems for logic optimization have been fairly successful, especially for solving the library binding problem. Therefore some of these considerations will be echoed in Section 10.5. Here we concentrate on the application of the rule-based methodology to the optimization of unbound logic networks. Hence our view of rule-based systems will be somewhat limited in scope and focused on our application.

In a rule-based system, a network is optimized by stepwise refinement. The network undergoes local transformations that preserve its functionality. Each transformation can be seen as the replacement of a subnetwork by an equivalent one toward the desired goal (such as optimizing area, testability or delay). A rule database contains a family of circuit patterns and, for each pattern, the corresponding replacement according to the overall goal. Several rules may match a pattern, and a priority scheme is used to choose the replacement.

A rule-based system in this domain consists of:

- A rule database that contains two types of rules: replacement rules and metarules. The former abstract the local knowledge about subnetwork replacement and the latter the global heuristic knowledge about the convenience of using a particular strategy (i.e., applying a set of replacement rules).
- A system for entering and maintaining the database.
- A controlling mechanism that implements the inference engine. It is usually a heuristic algorithm. Hence the inference procedure is a rule interpreter.
A major advantage of this approach is that rules can be added to the database to cover all thinkable replacements and particular design styles. This feature played a key role in the acceptance of optimization systems, because when a designer could outsmart the program, his or her "trick" (knowledge) could then be translated into a rule and incorporated into the database.

The database must encode the rules in an efficient way to provide for fast execution of the system. Since this task is not simple, specific programs collect the knowledge about the possible transformations and create the database. Even though the generation of the database is time consuming, this task is performed very infrequently. Some systems, like LSS, use a common form for representing both the logic network and the rules, for example, interconnections of NAND or NOR gates with a limit on the number of inputs. This facilitates matching the patterns and detecting the applicability of a rule. Other systems use signatures of subcircuits that encode the truth table and can be obtained by simulation.

Example 8.7.1. Consider the rules shown in Figure 8.36. The first rule shows that two cascaded inverters can be replaced by a wire. The second rule indicates a direct implementation of a three-input NAND function. The last rule removes a local redundancy, and therefore it increases the testability.

FIGURE 8.36 Three simple transformation rules.

The most compelling problem in rule-based systems is the order in which rules should be applied and the possibility of lookahead and backtracking. This is the task of the control algorithm that implements the inference engine. The algorithm has to measure the quality of the circuit undergoing transformations. The measure of area and delay is usually incremental and involves standard techniques. Then the algorithm enables the rules to identify the subnetwork that is the target for replacement and to choose the replacement. This may involve firing one or more rules.

The problems that the control algorithm has to cope with can be understood by means of the following ideal model. (The adjective "ideal" means that the model is not related to the implementation of the algorithm.) Let us consider the set of all equivalent logic networks that implement the behavior of the original network. We define a configuration search graph, where the vertices correspond to the set of all equivalent networks and the edges represent the transformation rules. Assume each vertex is labeled with the cost of the corresponding implementation (e.g., area or timing cost).
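A replacement rule in the spirit of Figure 8.36 — here, collapsing a pair of cascaded inverters — can be sketched as pattern matching on a netlist. The encoding (signal mapped to gate type and fanins, with a BUF marking the resulting wire) is an assumption introduced only for this illustration.

```python
def remove_inverter_pairs(netlist):
    """Fire one replacement rule exhaustively: a pair of cascaded
    inverters is functionally the identity, so the second inverter
    is rewritten as a wire (modeled here as a BUF) from the first
    inverter's input.

    netlist: dict mapping signal -> (gate_type, [fanin signals]);
             primary inputs do not appear as keys.
    """
    rewired = dict(netlist)
    changed = True
    while changed:
        changed = False
        for sig, (gtype, fanins) in list(rewired.items()):
            if gtype != "INV":
                continue
            src = fanins[0]
            # Pattern: sig = INV(src) where src = INV(x).
            if src in rewired and rewired[src][0] == "INV":
                rewired[sig] = ("BUF", rewired[src][1][:])
                changed = True
    return rewired
```

Each firing turns an INV into a BUF, so the INV count strictly decreases and the loop terminates; the first inverter is kept, since other fanouts may still use it.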
Given a circuit configuration, the moves are determined by how transformations can improve the cost function from that configuration. The control mechanism is data driven: this approach, also called forward chaining of the rules, is a typical choice when rule-based systems are applied to logic optimization problems. From an implementation point of view, the control algorithm iterates the following tasks. First it evaluates the cost functions of interest. Then it fires the metarules that determine some global parameters of the search. This is followed by firing replacement rules in a specific order. When no rule can improve the cost function of interest, the network is declared optimal and the optimization procedure ends. Obviously it may not be the (global) optimum solution.

A greedy search is a sequence of rules, each decreasing the circuit cost, whose head is the final solution. Hence it corresponds to a path in the configuration search graph. Some rule-based systems, such as LSS [19], use this kind of strategy for the control algorithm. Higher-quality solutions can be found by using a more complex search for the transformation to apply. The rationale is to explore the different choices and their consequences before applying a replacement. Indeed the choice of applying a rule affects which rules can be fired next. Consider the set of possible sequences of transformations: the branching factor at each configuration is called the breadth of the search, and the length of the sequences is the depth. A set of rules is complete if each vertex of the configuration search graph is reachable from any other vertex. Ideally, if the set of rules is complete and the depth and breadth are unbounded, the optimum configuration can be reached. Needless to say, questioning the completeness of the rule set is a hard problem because of the size of the configuration search graph, which grows exponentially with the depth. The larger the breadth and the depth, the better the quality of the solution but also the higher the computing time. Some rule-based systems, such as SOCRATES [28], use heuristics to bound the breadth and the depth [22]. Metarules are rules that decide upon the value of these parameters. Whereas experiments can be used to determine good values for the depth and breadth parameters, a better strategy is to vary them dynamically, given the state of the optimization [28]. Metarules can be used also to trade off area versus delay or the quality of the solution versus computing time, as well as for restricting the search to the rules that apply to a small portion of the circuit (peephole optimization).

We shall revisit rule-based systems for the optimization of bound networks in Section 10.5. We shall then compare the rule-based approach to the algorithmic approach.

8.8 PERSPECTIVES

Multiple-level logic synthesis has evolved tremendously over the last two decades. Synthesis and optimization of large-scale circuits is today made possible by CAD tools developed on the basis of the algorithms described in this chapter. The use of automatic circuit optimization has become common practice in industry and in academia.
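A greedy forward-chaining control loop of the kind described above can be sketched as follows. The rule and cost interfaces are assumptions for illustration: each rule maps a network to a transformed copy, or to None when its pattern does not match.

```python
def greedy_search(network, rules, cost):
    """Greedy control algorithm: at each step fire the rule that most
    reduces the cost function; stop (declaring the network "optimal")
    when no rule improves it.  The returned trace is the path followed
    in the configuration search graph."""
    current, trace = network, []
    while True:
        candidates = [r(current) for r in rules]
        candidates = [c for c in candidates if c is not None]
        if not candidates:
            break
        best = min(candidates, key=cost)
        if cost(best) >= cost(current):
            break                # local optimum of the greedy search
        trace.append(best)
        current = best
    return current, trace
```

This corresponds to a search of breadth `len(rules)` and unbounded depth with no lookahead; a metarule layer would instead adjust the breadth and depth dynamically during the optimization.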
Multiple-level logic synthesis and optimization have evolved from minimization methods for sum-of-products expressions and are especially well-suited for manipulating control circuits. Existing CAD tools support area- and performance-oriented design and are linked to test pattern generators and to programs that ease the testing problem in different ways.

There are still several open issues in this field. First of all, exact methods are few and inefficient, and they are not considered practical for usual circuits. This contrasts the case of two-level minimization, where large problems can be solved exactly today. Even though heuristic logic minimization often satisfies the practical needs of most digital designs, it would be useful to be able to compute at least lower bounds on area and/or delay, to which the solutions computed by heuristic methods could be contrasted. Moreover, it would be very interesting to synthesize (even small) circuits with provably optimum delay properties. Indeed, timing-critical portions of processors are still hand designed or tuned. When considering datapath design, the optimization of circuits involving exclusive-or primitives, such as adders and multipliers, is often difficult to achieve with presently known heuristic algorithms. Thus an important open problem is the optimal synthesis of data paths, including arithmetic units, and research on optimization of AND-EXOR forms is very important and promising.

8.9 REFERENCES

Optimization methods for multiple-level circuits were investigated heavily in the last thirty years, which explains the rich bibliography of this chapter. Most of the contributions are scattered in recent journal papers and conference proceedings. Among the classical methods, it is worth remembering the decomposition techniques of Ashenhurst [2] and Curtis [16]. Lawler [35] proposed a minimum-literal factorization algorithm for single-output networks. Davidson [20] proposed exact and approximate algorithms for determining minimal gate-count (i.e., area) implementations of multiple-output networks, subject to stage (i.e., delay), fanin and fanout constraints. The method, based on a branch-and-bound search (that can be relaxed in the approximate case), has superpolynomial-time complexity.

Heuristic logic optimization flourished in the 1980s. Algorithms for optimization were pioneered by Brayton and McMullen [7], who developed the algebraic model and the kernel theory. These methods were later perfected by Brayton, Rudell, Wang and Sangiovanni-Vincentelli, who introduced and solved the rectangle covering problems [8, 37, 43]. Algebraic and Boolean algorithms have been used in several programs, such as MIS [8] and BOLD [3]. The transduction method, proposed by Muroga et al. [36], was implemented as part of Fujitsu's design system and has been shown to yield good results. The global flow method, proposed by Berman and Trevillyan, was incorporated in IBM's LSS system [18, 19]. Hachtel et al. [29] described thoroughly the relations between algebraic transformations and testability for both single and multiple stuck-at faults.

The theory of don't care specifications and computation has been the object of long investigation. Bartlett et al. presented first, in a seminal paper [3], the links between logic optimality and testability. Coudert et al. [15] proposed a method for image computation that was later perfected by Touati et al. [42] and applied to the computation of controllability don't care sets. Damiani and De Micheli [17] proposed a method for the exact computation of observability don't care sets by means of network traversal and showed the link between the equivalence classes of a Boolean relation and the ODC sets. The concept of ODC sets has been recently extended to observability relations [40].
"LSS: A System for Production Logic Synthesis. 3. "Multilevel Logic Synthesis. R. F. NJ. Vol. Vol." IEEE Transnclion. "Redundancy ldenttfication and Removal. SangiovanniVincentelli and A." ISCAS. Jacuby. Van Nosuand. Trevillyan. 1957. R. Berman and L. 5. Joyner. IE36. Proc~edinpro the lnternationnl Symporium on Circuits ond Sy. Ashenhurst. February 1990. K. 537545. 17. pp. Rulebased methods have been applied to logic optimiraliorl for many y e m . pp.pp. pp. Darringer. 10621081. 18. A. W. No. 9.rsnctions on CADIICAS. "Principles of Rulebased Enpen Systems. R. De Micheli. Malik and Saldanhe 1311 addressed the relations between delay optimization and testability Wherea? methods for timing optimization had been explored cince the early stages of logic synthesis 18. 16. R. 10. Lecture Notes in Co. Rudell. 272280. 4. Boston. Daminger. Rudzll. 4.! on Industrial Elecrronicr. F. I. 557564. W. SpringerVerlag.  . Poirot. Wmg. 723740. Val. 6. 6. Saucier and F. "Automated Synthesis fur Tetahility. SangiovanniVincentelli and A. Editor. Editor. 1. Lisanke. Cunk. Yavits." TAU90. Keutrer. 2. E. 10981109. pp. the laurr being mainly applied to the hinding problem. V o l 25. 1962. R. pp. 1990. 365388. March 1993. 12. A. pp. Abouzeid.Split by PDF Splitter testability for both single and multiple stuckat faults. Madre. Morrison. pp. Germany. I I . Hachtel. 0. Beman and L. 1. R." in J. 264300. "Don't Care Specifications in Cornhinational and Synchronous Logic i Circuits. C. 365368. L. "An Algorithm for N & N D Decomposition under Network Constraints. CAD6. MA. 7. R. 5. Brayton. Decemher 1969. San Diego. No. Brgler and R. f 8." in M. 1987. "Multilevel Logic Minimization Using Implicit Don't Cilres. Sakouti. H. R. "MultiLevel Logic Optimization and the Rectangular Covering Problem:' I C C A D .^. Brown.<ign A~tomntionConfer~nce. pp. L. Du. Rudell. pp. Cauden. V o l 28. Brayton and C. A. 1990. Pro. Sifakis. No. 1983. 
IBMk LSS system was the first logic optimization tool widely used in industry and paved the way to the development of this ficld. Duda. Bryan. Kluwer Academic Publishen. M. R. Bryan.~ocrionron CADIICAS. C18. "Verification of Sequential Machines Based on Symbolic Execution. V o l 78. V o l CAD12. D. A. No. 12f129. D. L. R. 69. B m d and V. lyengar. 1982. No. Davidson. Deradas e r v l 1241 developed the falsepath theory based on static cosensiti~atiun. 4954. "Multilevel Synthesis Minimizing the Routing Factor. 6. July 1981. Proceedingr of ACM Workshop on Timing Isrues in the Specificorinn ond SlnrheWs of Digin81 Sjsrems. "MIS: A Multiplelevel Logic Optimization System. K. o l 22. P.creding. Proceedings ofrhe De. 263277. V 14." I E E E Trrrns~ acrirmr on CAD/ICAS. F. V o l CAD10. pp. G. G. pp." D A C . Trevillyan. ~ C m i a n and G. Lisanke. Wang. Brayton." I E E E Tron. A. November on 1987. New Approach to the Durign of Switching circuit. Buchanan and R. Benhet and J . D.stems." I S M Journal of Research cznd Developmmt. 1986. Their hook is a good source of infomalion on the problem. K. Brglez." lnrernntional Workshop on Logic Synrhrsir.nput<r Scimcr." I E E E Trmrr~cnons CAD/lCAS. No. Brand and Iyengar 151 devised conditions for dynamic sensitization. May 1991. 1990." I E E E Transactions on Compurerr. G." I E E E Tr'8. Brayton. 12. No. D. Advrmces in C<~mputerr. 'Timing Analysis Using a Functional Relationship. Princeton. 1990. Academic Press. SangiovanruVincentelli. No. 19. Chen and N. "Global Flow Optimiration in Automatic Logic Design. Kedem and R. V o l CAD7. "LSS: Logic Synthesis through Local Transformations." I C C A D . C. Boolean Reusorring. pp." I B M J o m u l ofRerearch and Duwlupmenr. R. P r o c e e d i r l ~ ~ ofthe lnrernnlio~mlConference on Compuvr Aided Design. June 1988. 741 16. Trevillyan. 2. 211.r of the lr~ternalionolC<vzJ?re~rence on Cornpuler Aided Design. "Path Sensitization in Critical Path Problem. May 1989. SangiovmniVincenlelli and A. M. 
Brand. "The Decomposition of Switching Functions. the falsepath problem was brought back to the community's attention only recently. J." Proceedings of the lnterrzational Sympoxium r~rrthe T h e o q ofSwitrhing. 20. 2. pp. D. McMullen. 'The Decomposition and Factorization of Boolean Expressions. September 1984. Hachtel and A. R. B. Calhaun. Banlett. Berlin. 1991 13." I E E E Proceedings. Wanp. Joyner and L. 5. Brayton. 15. G.McGeer and Brayton 1321 proposed another thrury using the concept of viability. C. bmnres 1281 uses both the algorithmic and the rule~baaedapproaches.
" DAC. A. "On Pmpenies of Algebraic Transformations and the Synthesis of Multifaulthdundant Circuits. R. Vol 11. Brayton and A. 27." ICCAD. 35. "Algorithms for Multilevel Logic Optimization. Vol. 427435. Hachtel. University of California at Berkeley. Dordrecht." ICCAD. pp. 37. pp." DAC. 30. Proceedings of rhe Design Aulornafion Conference. S. . 23. 43.. A. 42. April 1989. 1991.D. pp. Gregory. De Micheli. Saldanha. Momson. "Synthesis of Robust DelayFault Testable Circuits: Theory. Procerdingr of the Internntionol Conference on Cvmpuler Aided Design. McGeer and R. Proceed. VLSI Design of Digital Syaems. No. Boston. A. 1991. No. S. 33. K. 1990. Kluwer Academic Publishers. SangiovanniVincentelli. 5. Y. 39. Saldanha. G."Bell System . I. Technical Journal. D.Split by PDF Splitter 438 LffiICLEVEL SYNTHESIS AND OPllMlZATlON 21. Proceedings of rhe Internalionol Conference on Computer Aided Design. Editor. CADI I . D. pp. 3. Dissenation. Savoj and R. Stable Algebraic Operation on Logic Expressions. Savoy." ICCAD. E." in T. 24. SangiovanniVincentelli and P. "Is Redundancy Necessary to Reduce Delay?" IEEE Transactiunr on CADNCAS. 282285. 1964 36. Tauati. "Logic Synthesis for VLSI Design. Jacoby.Lopic Synthesix nnd Silicon Cornpilorion. LogicSInthesisond Oplimi:ofion. January 1969. Ph. k$. No. Lin. A. pp. B. P. Devadas. "MultiLevel Logic Simplification 38. "Delay Models and Exact Timing Analysis. 283295. Yen and S. Vol. 10. 1991. Keutzer. University of California at Berkeley. S. Dissertation. Needham Heights. "Efficient. 1988." IEEE Transocrionr on CADIICAS. S. Lawler. SangiovanniVil~centelli. 1986. "An Efficient Heuristic Pmcedure far Partitioning Gra~hs." in C. Brayton. Wang. 1122. 28. Hachtel. Amsterdam. MA. Singh. Brayton and A. 'The Transduction MethodDesign of Logic Networks Based on Permissible Functions. 26. "Socrates: A System far Automatically Synthesizing and Optimizing Combinational Logic. bp. De Micheli. Proceedinys of rhe Design Auromation Conference. . Malik. 
Antognetti (Editors). April 1989. Sasao.D. Keuuer and C. 7 9 4 5 . ~: 1987. 291307. 29. Vol CAD6. Allyn and Bacon. "Logic Design Automation of Fanin Limited NAND Networks. 22. Culliney. Malik and A. 1993. 751761. Design Sy. Ph. Savoy. 32. 41. 11." ICCAD. D. Keutrer and S. Proceedings o the hremrionol Conference on Computer Aided Design." IEEE Tranractions on Compurers. Brayton. April 1991. Proceeding* of IheDesign Auromlion Conference. K. No~&Holland. Kambayashi. Dietmeyer and Y. "Performance Oriented Synthesis of Large Scale Domino CMOS Circuits. R. A. using Don't Cares and Filters.~tems for VISl C i r c ~ i t . "On the General Falre Path Problem in Timing Analysis. Logic Design of Digital Sy. D." IEEE Tronsetionr on Computers. 4. C18. M . Sequin. Saldanha. Prvceedin~so. 555560. I." Memorandum UCB/ERL M89149. H." IEEE Trnnsacrions on CADflCAS. 14041424. Rudell. R. A. pp. Touati. Boston. Keutzer. January 1992. No. P. 1988. Braytan. H. "Delay Computation in ~ambinatibnal Logic Circuits: Theory and Aleorithms. Vol. Vol. 1989. Vol. R. G. 313321. 31. September 1987. . R. 34. de Geus and G. "An Approach to Multilevel Boolean Minimization. "Observability Relations and Observability Don't Cares." Journal ofACM. S. 40. Kernighan and S." IEEE Trunsoctions on CAD/ICAS. Brayton and H. 'Timing Optimization of Combinaf tional Logic.   ." DAC." Memorandum UCBERL M89149. No. CAD10. a. SangiovanniVincentelli. 13&l33. The Netherlands. 516517. Wang. Editor."Implicit State Enumeration of Finite State Machines using BDDs. pp. Brayton and A. 87101. Ghanta. Wang. February 1970. K. H. pp. March 1992. Banlen. Integrating F u n c t i o ~ land Temporal D o m i n s In Logic Design. R. H. "Extracting Local Don't Cares for Network Optimization. Du. 1989. McGrer. Dietmeyer. K. of che lntemorioml Conference on Cumpuler A i d c d D e s ~ ~pp 518521." ICCAD. Murogil. Lin. Devadas and K. pp.vtemr. K. McGeer and R. A. 176179. C38. G. 25. 1991. B. 
the Internarionol Conference on Computer Aided Desiyn. Kluwer Academic Publishers. f pp. MA. Brayton and A. DD. 1989. Njihoff Publishers. pp. MA. pp. pp. Lai and J. P. 277282. 1978. CADNo. SangiovanniVincentelli. Su. R.
8.10 PROBLEMS

1. Consider the logic network defined by the following expressions:
x = ad' + a'b' + a'd' + bc + bd' + ae
z = a'c' + a'd' + b'c' + b'd' + e
u = a'c + a'd + b'd + e'
Draw the logic network graph. Compute all kernels and co-kernels of z and u.
2. Consider the logic network defined by the following expressions:
x = abcf + efc + de
y = acdef + def
Determine the cube-variable matrix and all prime rectangles. Identify all feasible cube intersections. Extract a multiple-cube subexpression common to f_x and f_y, and redraw the network graph. Show all steps.
3. Let y = abc'. Substitute y into f_x and perform the algebraic division f_x/f_y. Determine the minimum-literal network that can be derived by cube extraction. Show all steps.
4. Design an algorithm that finds all 0-level kernels of a function.
5. Design an algorithm that finds just one 0-level kernel of a function.
6. Prove that, given two algebraic expressions f_dividend and f_divisor, the algebraic quotient of the former by the latter is empty if any of the following conditions apply: f_divisor contains a variable not in f_dividend; f_divisor contains a cube not contained in any cube of f_dividend; f_divisor contains more terms than f_dividend; the count of any variable in f_divisor is larger than in f_dividend.
7. Consider the logic network defined by the following expressions:
f = (a + d)'
Inputs are {a, ..., c} and outputs are {x, y}. Assume CDC_in is given. Compute CDC_out.
8. Consider the logic network of Problem 7. Compute the ODC sets for all internal and input vertices, assuming that the outputs are fully observable.
9. Design an algorithm that computes the exact ODC set by backward traversal that does not use the results of Theorem 8.4.1 but exploits formulae 8.9 and 8.10.
10. Prove formally the following theorem justifying an exact filter. Let DC = D ∪ E be such that |(sup(f) ∪ sup(D)) ∩ sup(E)| ≤ 1. Then the cubes of E are useless for optimizing f.
11. Give an example of the usefulness of an approximate filter based on the following: discard those cubes of the don't care set whose support is disjoint from that of the function to be minimized. Show also that it is not an exact filter.
12. Prove the following theorem. Let the input/output behavior of the network under consideration be represented by f and the corresponding external don't care set by DC_ext, and let g be a feasible implementation of the corresponding local function. Then f_min ⊆ g ⊆ f_max, where f_min = (f ∩ DC'_ext) and f_max = (f ∪ DC_ext). Justify the formula.
13. Consider the logic network defined by the following expressions:
g = (b + c)'
Inputs are {a, ..., s} and outputs are {x, y}. Draw the logic network graph. Apply simplification to vertex v_g and show all steps.
14. Formulate the inverter minimization problem as a ZOLP.
15. Design an exact algorithm for inverter minimization in a tree-structured network. Determine its time complexity. Consider networks with general topology: derive a recursive formula to compute the controlling set C.
16. Consider the circuit of Figure 8.31. Assume that the delay of each inverter, AND and OR gate is 1 and that the delay of each NOR, EXOR and EXNOR gate is 2. Assume that the input dataready times are zero, except for one input whose dataready time is 4, and that the required dataready time at the output is 7. Determine the topological critical path. Compute the dataready times and slacks for all vertices in the network. Is the circuit fully testable for single stuck-at faults? Can you draw the conclusion that a critical true path is not statically sensitizable if the circuit is not fully testable? Why?
CHAPTER 9

SEQUENTIAL LOGIC OPTIMIZATION

Non c'è un unico tempo: ci sono molti nastri che paralleli slittano...
There isn't one unique time: many tapes run parallel...
E. Montale, Satura.

9.1 INTRODUCTION

We consider in this chapter the optimization of sequential circuits modeled by finite-state machines at the logic level. From a circuit standpoint, they consist of interconnections of combinational logic gates and registers, as shown in Figure 9.1. Thus, synchronous sequential circuits are often modeled by a combinational circuit component and registers. We assume here, for the sake of simplicity, that registers are edge triggered and store 1 bit of information. We restrict our attention to synchronous models that operate on a single-phase clock. Most of the techniques described in this chapter can be extended, mutatis mutandis, to the case of level-sensitive latches [32] and to that of multiple-phase clocks [3].
Sequential circuits can be specified in terms of HDL models or synthesized by means of the techniques described in Chapter 4. In both cases, the sequential circuit model may have either a behavioral or a structural flavor or a combination of both. It is the purpose of this chapter to distill those techniques based on classical methods that have been shown practical for CAD synthesis systems as well as to report on the recent developments in the field.

FIGURE 9.1 Block diagram of a finite-state machine implementation.

Whereas the behavior of combinational logic circuits can be expressed by logic functions, the behavior of sequential circuits can be captured by traces, i.e., by input and output sequences. A convenient way to express the circuit behavior is by means of finite-state machine models, e.g., state transition diagrams (Section 3.3.3), as shown in Figure 9.2 (a). State transition diagrams encapsulate the traces that the corresponding circuit can accept and produce. Sequential circuit optimization has been the subject of intensive investigation for several decades. Classical methods for sequential optimization use state transition diagrams (or equivalent representations, e.g., tables). Some textbooks [20, 21, 23, 29] present the underlying theory and the fundamental algorithms in detail. In general, state-based representations have a behavioral flavor. While many optimization techniques have been proposed, this finite-state machine representation lacks a direct relation between state manipulation (e.g., state minimization) and the corresponding area and delay variations: optimization techniques target a reduction in complexity of the model that correlates well with area reduction but not necessarily with performance improvement.

FIGURE 9.2 Sequential circuit models: (a) state transition diagram; (b) synchronous logic network.
State encoding defines the state representation in terms of state variables. This leads to circuit representations in terms of synchronous logic networks that express the interconnection of combinational modules and registers, as shown in Figure 9.2 (b). Synchronous logic networks have a structural flavor when the combinational modules correspond to logic gates. They are hybrid structural/behavioral views of the circuit when the modules are associated with logic functions. An alternative representation of the circuit behavior is by means of logic expressions in terms of time-labeled variables, as introduced in Section 3.3, thus allowing the description in terms of networks. Some recent optimization algorithms for sequential circuits, such as retiming [24], use the network representation. In this case there is a direct relation between circuit transformations and area and/or performance improvement. Therefore these optimization methods parallel the behavioral-level transformations for architectural models.

State transition diagrams can be transformed into synchronous logic networks by state encoding and can be recovered from synchronous logic networks by state extraction. The major task in state extraction is to determine the valid states (e.g., those reachable from the reset state) among those identified by all possible polarity assignments to the state variables. Unused state codes are don't care conditions that can be used for the network optimization.

Design systems for sequential optimization leverage optimization methods in both representation domains. In this chapter we shall first describe sequential circuit optimization using state-based models in Section 9.2. In particular, we concentrate on the state minimization and state encoding problems in Sections 9.2.1 and 9.2.2, respectively. Then we shall consider those methods relying on network models in Section 9.3, in particular implicit methods for finite-state machine traversal, and the relations between the network models and state transition diagrams. We shall survey briefly other relevant optimization techniques in Section 9.4.
We shall conclude with some comments on testability of sequential circuits.2 SEQUENTIAL CIRCUIT OPTIMIZATION USING STATEBASED MODELS We consider in this section algorithms for sequential optimization using statebased models as well a8 transformations into and from structural models.3. defined by the quintuple (X.2. In particular.4.2. . For this reason. S Finitestate machine optimization has been covered by several textbooks [20. described in Section 3. 9. with particular reference to the relation to other synthesis problems and to recent results.3. 23. in the sequel..Split by PDF Splitter improvement. We shall survey briefly other relevant optimization techniques in Section 9. 21. Unused state codes are don't care conditions that can be used for the network optimization. 8. In this chapter we shall first describe sequential circuit optimization using statebased models in Section 9. as shown in Figure 9. They are hybrid structurallbehavioral views of the circuit when the modules are associated with logic functions.2.2. The major task in state extraction is to determine the valid states (e. we summarize the state of the art in the field. such a retiming [24]. Some recent optimization algorithms for sequential circuits use the network representation. In this case there ? is a direct relation between circuit transformations and area andlor performance improvement.2.
= n.'hence satisfying a necessary condition of equivalence. These can be derived by an iterative refinement of a paitition of the state set. they have identical outputs and the corresponding next states are equivalent. A minimumstate implementation of a finitestate machine is one where each state represents only one class. reflexive and transitive. for some value of i .) The reduction in states correlates to a reduction in transitions. Such a partition is unique. .. have next states in the same block of n. the transition function S and the output function A are specified for each pair (input. the corresponding blocks are equivalence classes. i. denote the partitions. Then. the number of registers is the ceiling of the logarithm of the number of states. for any possible input.. Equivalency is checked by using the result of the following theorem 1211. n.e. 9.. At first. In the limiting case that n. Theorem 9. and hence to a reduction of logic gates. i. State reduction may correlate to a reduction of the number of storage elements. states can be partitioned into equivalence classes. STATE MINIMIZATION FOR COMPLETELY SPECIFIED FINITESTATE MACHINES. when n. Hence state minimization is described separately for both cases in the following sections.+. The complexity of this algorithm is ~ ( n ? ) . . state) in X x S. no pair of states are equivalent. we obtain a 0partition where all blocks have one state each. .1. iterations are required. partition blocks are iteratively refined by further splinings by requiring that all states in any block of n. State minimization can be defined informally as deriving a finitestate machine with similar behavior and minimum number of states. (When states are encoded with a minimum number of bits.2. Since equivalency is symmetric.+. Let n.i = I. . the problem complexity and the algorithms. 
of the finitestate machine initialized in the two states coincide for any input sequence.2.1 State Minimization The state minimization problem aims at reducing the number of machine states. A more precise definition relies on choosing to consider completely (or incompletely) specified finitestate machines.1. Two states are equivalent if the output sequencer. blocks of partition ill contain states whose outputs match for any input. iterations. In addition. . When the iteration converges. 2. for any input. Note that convergence is always achieved in at most n. a recent technique for state minimization of completely specified finitestate machines using an implicit representation (instead of an explicit statebased model) is described in Section 9. When considering completely specified finitestate machines.2. Minimizing the number of states of a completely specified finitestate machine then entails computing the equivalence classes. Two states of a finitestate machine are equivalent if and only if. by Theorem 9. or equivalently where only one state per class is retained.2.. This leads to a reduction in size of the state transition graph.Split by PDF Splitter 444 LOGICLEVEL SYNTHESIS AND OPTIMIZATION machine optimization problems and on issues related to synthesis and optimization of interconnected finitestate machines.e. This decision affects the formalism.4.
Example 9.2.1. Consider the state diagram shown in Figure 9.3, whose state table is reported next:

Input  State  Next state  Output
  0     s1       s3         1
  1     s1       s5         1
  0     s2       s3         1
  1     s2       s5         1
  0     s3       s2         0
  1     s3       s1         1
  0     s4       s2         0
  1     s4       s5         1
  0     s5       s4         1
  1     s5       s1         0

FIGURE 9.3 State diagram.

The state set can be partitioned first according to the outputs, yielding:

Π_1 = {{s1, s2}, {s3, s4}, {s5}}

Then we check each block of Π_1 to see if the corresponding next states are in a single block of Π_1 for any input. The next states of s1 and s2 match. The next states of s3 and s4 are in different blocks. Hence the block {s3, s4} must be split, yielding:

Π_2 = {{s1, s2}, {s3}, {s4}, {s5}}

When checking the blocks again, we find that no further refinement is possible. Hence there are four classes of equivalent states. We denote {s1, s2} as s12, resulting in the following minimal table:

Input  State  Next state  Output
  0     s12      s3         1
  1     s12      s5         1
  0     s3       s12        0
  1     s3       s12        1
  0     s4       s12        0
  1     s4       s5         1
  0     s5       s4         1
  1     s5       s12        0

The corresponding diagram is shown in Figure 9.4.
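The iterative refinement used in Example 9.2.1 can be sketched compactly. The sketch below is illustrative; the dictionary-based table encoding (delta, lam) and the function name are assumptions of the sketch, not part of the classical formulation.

```python
def equivalence_classes(states, inputs, delta, lam):
    """Partition `states` into equivalence classes by iterative refinement.

    delta: (input, state) -> next state;  lam: (input, state) -> output.
    """
    # Initial partition: group states by their output vector (blocks of Pi_1).
    blocks = {}
    for s in states:
        key = tuple(lam[(i, s)] for i in inputs)
        blocks.setdefault(key, set()).add(s)
    part = list(blocks.values())
    while True:
        def block_of(s):
            return next(k for k, b in enumerate(part) if s in b)
        new_part = []
        for b in part:
            # Split b so that its states agree on the block of every next state.
            sub = {}
            for s in b:
                key = tuple(block_of(delta[(i, s)]) for i in inputs)
                sub.setdefault(key, set()).add(s)
            new_part.extend(sub.values())
        if len(new_part) == len(part):  # Pi_{i+1} = Pi_i: convergence
            return new_part
        part = new_part
```

On the table of Example 9.2.1 this returns four classes, with s1 and s2 merged into a single class, matching the minimal machine derived above.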
FIGURE 9.4 Minimum state diagram. (Dotted edges are superfluous.)

In the algorithm described above, the refinement of the partitions is done by looking at the transitions from the states in the block under consideration to other states. Hopcroft [19] suggested a partition refinement method where the transitions into the states of the block under consideration are examined instead. Consider a partition block A ∈ Π and, for each input, the subset P of the states whose next states are in A. Then any other block B ∈ Π is consistent with A if either B ⊆ P or B ∩ P = ∅. If neither of these conditions is satisfied, block B is split into B¹ = B ∩ P and B² = B − P, and the two splinters become part of Π. If no block requires a split, then the partition defines the classes of equivalent states.

Note that when a block is considered to split the others, it does not need to be reconsidered unless it is split itself. Moreover, when a block is split, both splinters would yield the same results when considered in turn to split the other blocks; hence we can consider only the smaller one in the future iterations. This fact is key in proving that the algorithm can be executed in O(n_s log n_s) steps, which is why Hopcroft's method is important: it has a better asymptotic behavior. The proof is laborious and reported in references [1] and [19].

Example 9.2.2. Consider the table of Example 9.2.1. The state set can be partitioned first according to the outputs, as in the previous case:

Π_1 = {{s1, s2}, {s3, s4}, {s5}}

Let A = {s5} and let us consider input 1. The states whose next state is in A form the set P = {s1, s2, s4}. Block B = {s1, s2} is a subset of P and requires no further split. Block B = {s3, s4} is not a subset of P and B ∩ P = {s4}. Hence the block is split as {{s3}, {s4}}. No further splits are possible, and the resulting partition defines four classes of equivalent states.

STATE MINIMIZATION FOR INCOMPLETELY SPECIFIED FINITE-STATE MACHINES. In the case of incompletely specified finite-state machines, the transition function δ and the output function λ are not specified for some (input, state) pairs. Equivalently, don't care conditions denote the unspecified transitions and outputs.
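One refinement step of the predecessor-based (Hopcroft-style) method just described can be sketched as follows; the representation of blocks as sets and of the transition function as a dictionary is an assumption of this illustrative sketch.

```python
def split_against(partition, delta, inp, A):
    """Split every block of `partition` against the splitter block A under
    input `inp`, using the predecessor set P = {s : delta(inp, s) in A}."""
    P = {s for blk in partition for s in blk if delta[(inp, s)] in A}
    refined = []
    for B in partition:
        B1, B2 = B & P, B - P
        if B1 and B2:
            refined.extend([B1, B2])  # B inconsistent with A: split it
        else:
            refined.append(B)         # B subset of P, or disjoint from P
    return refined
```

Applied to Example 9.2.2 (splitter A = {s5}, input 1), the block {s3, s4} is split into {s3} and {s4}, while {s1, s2} is left intact.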
They model the knowledge that some input patterns cannot occur in some states or that some outputs are not observed in some states under some input conditions. An input sequence is said to be applicable if it does not lead to any unspecified transition. Two states are compatible if the output sequences of the finite-state machine initialized in the two states coincide, whenever both outputs are specified, for any applicable input sequence. The following theorem [21] applies to incompletely specified finite-state machines.

Theorem 9.2.2. Two states of a finite-state machine are compatible if and only if, for any input, the corresponding output functions match when both are specified and the corresponding next states are compatible when both are specified.

This theorem forms the basis of iterative procedures for determining classes of compatible states that are the counterpart of those used for computing equivalent states. A class of compatible states is defined to be such that all its states are pairwise compatible. Nevertheless, two major differences exist with respect to the case of completely specified finite-state machines. First, compatibility is not an equivalence relation, because compatibility is a symmetric and reflexive relation but not a transitive one. As a consequence, maximal classes of compatible states do not form a partition of the state set; classes may overlap, and multiple solutions to the problem may exist. Second, the selection of an adequate number of compatibility classes to cover the state set is complicated by implications among the classes themselves, because the compatibility of two or more states may require the compatibility of others. Hence the selection of a compatibility class to be represented by a single state may imply that some other class also has to be chosen. A set of compatibility classes has the closure property when all implied compatibility classes are in the set or are contained by classes in the set. The intractability of the state minimization problem stems from these facts.

Example 9.2.3. Consider the finite-state machine of Figure 9.5 (a), where only the output function λ is incompletely specified for the sake of simplicity. Note first that replacing the don't care output entries by 1s would lead to the table of Example 9.2.1, which can be minimized to four states. Other choices of the don't care entries would lead to other completely specified finite-state machines.
Unfortunately, there is an exponential number of completely specified finite-state machines in correspondence to the choices of the don't care values. Let us consider pairwise compatibility. By applying Theorem 9.2.2, the compatible and the incompatible state pairs can be listed, together with the pairs whose compatibility they imply: five pairs are compatible and five are incompatible. A pair may be compatible only subject to the compatibility of another pair, which may itself be incompatible; this shows the lack of transitivity of the compatibility relation. The maximal compatibility classes can then be derived, together with the classes they imply.

FIGURE 9.5 (a) State diagram. (b) Minimum state diagram.

Minimizing the number of states of an incompletely specified finite-state machine consists of selecting enough compatibility classes satisfying the closure property so that all the states are covered. Hence the state minimization problem can be formulated as a binate covering problem and solved exactly or heuristically by the corresponding algorithms. The states of a minimum implementation correspond then to the selected classes, and their number to the cardinality of a minimum cover. Minimum covers may involve compatibility classes that are not necessarily maximal. This can be explained informally by noticing that smaller classes may have fewer implication requirements.

It is worth noticing that the set of maximal compatible classes always satisfies the closure property. Hence its computation always gives a feasible solution, and no check for implications is needed. However, its cardinality may be larger than that of a minimum cover and even larger than the original state set cardinality. Upper bounds on the optimum solution size are thus given by the cardinality of the set of maximal compatible classes and by the original state set cardinality. A lower bound is given by a unate cover solution that disregards the implications.

Exact methods for state minimization of incompletely specified finite-state machines can take advantage of considering a smaller set of compatibility classes. Grasselli and Luccio [15] suggested restricting the attention to prime classes, which are those not included in other classes implying the same set or a subset of classes. The use of prime classes allows us to prune the solution space.

Heuristic methods for state minimization have different flavors. It has been noticed [29] that often a unate cover in terms of maximal compatible classes has the closure property and hence is a valid solution. Otherwise, an approximation to the solution of the binate covering problem can be computed by heuristic covering algorithms.

Example 9.2.4. Consider the table of Example 9.2.3. There are three maximal classes of compatible states. Nevertheless, a minimum closed cover involves only two classes, therefore with cardinality 2. The minimum state diagram is shown in Figure 9.5 (b).

9.2.2 State Encoding

The state encoding (or assignment) problem consists of determining the binary representation of the states of a finite-state machine. Encoding affects the circuit area and performance. The circuit complexity is related to the number of storage bits n_b used for the state representation (i.e., the encoding length) and to the size of the combinational component. A measure of the latter is much different when considering two-level or multiple-level circuit implementations. For this reason, state encoding techniques for two-level and multiple-level logic have been developed independently; we shall survey methods for both cases next. Most known techniques for state encoding target the reduction of circuit complexity measures that correlate well with circuit area but only weakly with circuit performance. In the most general case, the state encoding problem is complicated by the choice of the register type used for storage (e.g., D, T, JK) [29]. We consider here only D-type registers, because they are the most commonly used.
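The pairwise part of the compatibility computation (Theorem 9.2.2) can be sketched as an iterative marking procedure. This is an illustrative sketch only: the table encoding, the use of None for unspecified entries, and the function name are assumptions.

```python
from itertools import combinations

def compatible_pairs(states, inputs, delta, lam):
    """Return the set of compatible state pairs.

    Start from all pairs; repeatedly discard a pair if, for some input,
    the specified outputs conflict or the implied next-state pair has
    already been found incompatible (None denotes an unspecified entry).
    """
    pairs = {frozenset(p) for p in combinations(states, 2)}
    changed = True
    while changed:
        changed = False
        for p in list(pairs):
            s, t = tuple(p)
            for i in inputs:
                o1, o2 = lam.get((i, s)), lam.get((i, t))
                if o1 is not None and o2 is not None and o1 != o2:
                    ok = False  # specified outputs disagree
                else:
                    n1, n2 = delta.get((i, s)), delta.get((i, t))
                    ok = (n1 is None or n2 is None or n1 == n2
                          or frozenset((n1, n2)) in pairs)
                if not ok:
                    pairs.discard(p)
                    changed = True
                    break
    return pairs
```

Note how a pair can be eliminated indirectly: it may survive the direct output check yet be discarded later because the pair it implies has been removed, mirroring the implication structure discussed above.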
STATE ENCODING FOR TWO-LEVEL CIRCUITS. Two-level circuits have been the object of investigation for several decades. The circuit complexity of a sum of products representation is related to the number of inputs, outputs and product terms. For PLA-based implementations, such numbers can be used to compute readily the circuit area and the physical length of the longest path, which correlates to the critical path delay. (The area of a PLA is proportional to the product of the number of I/Os, i.e., columns, and the number of product terms, i.e., rows; for simplicity we consider as one column any input column pair corresponding to a signal and its complement. The longest physical path is proportional to the sum of twice the number of rows, i.e., 2 column lengths, plus the number of I/Os, i.e., 1 row length.) The number of inputs and outputs of the combinational component is the sum of twice the state encoding length plus the number of primary inputs and outputs. The number of product terms to be considered is the size of a minimum (or minimal) sum of products representation. Thus the choice of an encoding affects both the encoding length and the size of a two-level cover.

Early work on state encoding focused on the use of minimum-length codes, i.e., using n_b = ⌈log2 n_s⌉ bits to represent the set of states S. There are 2^n_b! / (2^n_b − n_s)! possible encodings, and the choice of the best one is a formidable task. Note that the size of sum of products representations is invariant under permutation and complementation of the encoding bits. Hence the number of relevant codes can be refined to (2^n_b − 1)! / ((2^n_b − n_s)! n_b!). The simplest encoding is 1-hot state encoding, where each state is encoded by a corresponding code bit set to 1, all others being 0. Thus n_b = n_s. The 1-hot encoding requires an excessive number of inputs/outputs.

Most classical heuristic methods for state encoding are based on a reduced dependency criterion [20, 31]. The rationale is to encode the states so that the state variables have the least dependencies on those representing the previous states. Reduced dependencies, however, correlate only weakly with the minimality of a sum of products representation.

In the early 1980s, symbolic minimization (see Section 7.5) was introduced [8, 9] with the goal of solving the state encoding problem. Minimizing a symbolic representation is equivalent to minimizing the size of the sum of products forms related to all codes that satisfy the corresponding constraints. Note that the finite-state machine model requires the solution of both the input and the output encoding problems. Namely, the feedback nature of the finite-state machine imposes a consistency requirement: symbol pairs (in the input and output fields) corresponding to the same states must share the same code. In other words, the state set must be encoded while satisfying simultaneously both input and output constraints. Exact and heuristic encoding algorithms, as well as encoding programs, have been previously reported in Section 7.5. Since exact and heuristic symbolic minimizers are available, they can be readily applied to the state encoding problem.
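The counting formulas above can be evaluated directly. A small sketch (the function name is illustrative):

```python
import math

def encoding_stats(ns):
    """For ns states and minimum-length codes, return the encoding length,
    the number of possible encodings 2^nb!/(2^nb - ns)!, and the number of
    relevant codes (2^nb - 1)!/((2^nb - ns)! nb!) after factoring out bit
    permutations and complementations."""
    nb = math.ceil(math.log2(ns))
    codes = 2 ** nb
    total = math.factorial(codes) // math.factorial(codes - ns)
    distinct = math.factorial(codes - 1) // (math.factorial(codes - ns)
                                             * math.factorial(nb))
    return nb, total, distinct
```

For example, with n_s = 5 states there are 6720 possible 3-bit encodings, of which only 140 are relevant once permutations and complementations of the encoding bits are factored out.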
Example 9.2.5. Consider a finite-state machine and a minimal symbolic cover of its combinational component. The symbolic cover induces covering constraints on the state codes; in this example, s5 is covered by all the other states. From the symbolic cover, the input and output encoding constraint matrices are derived. An encoding that satisfies simultaneously both sets of constraints is then computed, where each row of the encoding matrix relates to a state, and by replacing the codes in the symbolic cover an encoded cover of the finite-state machine combinational component is obtained.

Whereas symbolic minimization and constrained encoding provide a framework for the solution of the state assignment problem for two-level circuits, we would like to stress a limitation of this approach. The minimum-length solution compatible with a set of constraints may require a number of bits larger than ⌈log2 n_s⌉. In practice, experimental results have shown that just a few extra bits may be needed to satisfy all constraints. In addition, constraints can be relaxed to satisfy bounds on the encoding length [2, 35]. Thus, the desired circuit implementation can be searched for by trading off the number of I/Os with the number of product terms.

Example 9.2.6. Consider a finite-state machine whose minimal symbolic cover has cardinality 3 and no output encoding constraints. The satisfaction of the input encoding constraints requires at least 3 bits. A feasible encoding is s1 = 100, s2 = 010, s3 = 001. Hence a cover with cardinality 3 and n_b = 3 can be constructed; the corresponding PLA would have 3 rows and 11 columns. Alternatively, we may choose not to satisfy one constraint, to achieve a 2-bit encoding. Assume we split the top symbolic implicant into two, namely 00 s1 s2 100 and 00 s2 s3 100. Then the following 2-bit encoding is feasible: s1 = 00, s2 = 11, s3 = 01. Now we would have a cover with cardinality 4 and n_b = 2; the corresponding PLA would have 4 rows and 9 columns.

STATE ENCODING FOR MULTIPLE-LEVEL CIRCUITS. State encoding techniques for multiple-level circuits use the logic network model described in Chapter 8 for the combinational component of the finite-state machine. The overall area measure is related to the number of encoding bits (i.e., registers) and to the number of literals in the logic network. The delay corresponds to the critical path length in the network. The difficulty of state encoding for multiple-level logic models stems from the wide variety of transformations available to optimize a network and from the literal estimation problem. To date, only heuristic methods have been developed for computing state encodings that optimize the area estimate.

The simplest approach is first to compute an optimal state assignment for a two-level logic model and then to restructure the circuit with combinational optimization algorithms. Despite the fact that the choice of the state codes is done while considering a different model, experimental results have shown surprisingly good results. A possible explanation is that many finite-state machines have shallow transition and output functions that can be best implemented in a few stages.

Alternatively, researchers have considered encoding techniques in connection with one particular logic transformation. Malik et al. [27] addressed the problem of optimal subexpression extraction and its relation to encoding. Devadas et al. [2, 11] proposed a heuristic encoding method that privileges the extraction of common cubes; the method has been improved upon in a later independent development [16].
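The trade-off of Example 9.2.6 can be checked with the PLA cost model stated earlier (columns = primary I/Os plus twice the encoding length; area proportional to columns times rows; path length proportional to twice the rows plus the I/Os). The sketch below is illustrative and ignores constant technology factors.

```python
def pla_costs(n_in, n_out, nb, rows):
    """Relative PLA costs: (columns, area, longest-path length).

    State bits contribute twice to the I/O count, since they appear both
    as present-state inputs and as next-state outputs.
    """
    cols = n_in + 2 * nb + n_out
    return cols, cols * rows, 2 * rows + cols
```

With 2 primary inputs and 3 primary outputs, the 3-bit encoding gives 11 columns and 3 rows (relative area 33), while the 2-bit encoding gives 9 columns and 4 rows (relative area 36): fewer columns do not necessarily pay for the extra row.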
We summarize here the method that targets common cube extraction [2, 11]. The rationale is to determine desired code proximity criteria: when two (or more) states have a transition to the same next state, it is convenient to keep the distance between the corresponding codes short, because that correlates with the size of a common cube that can be extracted.

Example 9.2.7. Take, for example, two states s1 and s2 with a transition into the same state s3 under inputs i' and i, respectively. Assume that a 3-bit encoding is used, with state variables {a, b, c}, and let us encode both states with adjacent (i.e., distance-1) codes, namely 000 and 001. These correspond to the cubes a'b'c' and a'b'c, respectively. Then the transition pair can be written as i'a'b'c' + ia'b'c, or equivalently a'b'(i'c' + ic), exposing the common cube a'b'. Note that no cube could have been extracted if s1 were encoded as 111, and only the smaller cube a' could have been extracted by choosing a code at distance 2 from that of s2, e.g., 011 instead of 001 for the second state.

To exploit this observation, the encoding problem is modeled by a complete edge-weighted graph, where the vertices are in one-to-one correspondence with the states. The weights denote the desired code proximity of state pairs and are determined by scanning systematically all state pairs. State encoding is then determined by embedding this graph in a Boolean space of appropriate dimension. There are still two major complications. First, the size of a common cube is determined by the code distance, but the number of possible cube extractions depends also on the encoding of the next states, so that the area gain cannot be related directly to the transitions. Second, the extractions of common cubes interact with each other, and the weights are only an imprecise indication of the possible overall gain by cube extraction. Moreover, graph embedding is an intractable problem. To cope with these difficulties, heuristic algorithms are used to determine an encoding where pairwise code distance correlates reasonably well with the weight (the higher the weight, the lower the distance).

Two such heuristic algorithms were proposed by Devadas et al. [2, 11]. They both use a complete edge-weighted graph, the weights being determined differently. The first algorithm, called fanout oriented, assigns high edge weights to state pairs that have transitions into the same next state (to achieve close codes); the weights are computed by a complex formula which takes the output patterns into account. This approach attempts to maximize the size of the common cubes in the encoded next-state function. The second algorithm, called fanin oriented, assigns high weights to state pairs with incoming transitions from the same states; again a complex rule determines the weights while taking the input patterns into account. This strategy tries to maximize the number of common cubes in the encoded next-state function. We refer the reader to reference [2] for details.

Example 9.2.8. Consider the table of Example 9.2.5 and the fanout-oriented algorithm. First a complete graph of five vertices is constructed and then the edge weights are determined; for the sake of simplicity, we consider binary-valued weights. By scanning all state pairs, unit weight is assigned to the edges corresponding to state pairs with transitions into a common next state, while the remaining edges are given weight 0. An encoding reflecting the proximity requirements specified by the weights is then computed. By replacing the state codes in the table and by deleting those rows that do not contribute to the encoded transition and output functions, an encoded cover is obtained.

Other rules for detecting potential common cubes have been proposed, as well as formulae for a more precise evaluation of the literal savings [16]. They all correspond to examining next-state transition pairs that can be represented symbolically as i1 s1 + i2 s2, where i1, i2 are input patterns (cubes) and {s1, s2} ⊆ S is any state pair. Whereas the general case was demonstrated in Example 9.2.8, specific cases apply when i1 = i2 or i1 ⊆ i2. For example, when i1 = ij ⊆ i2 = i and the two states have the adjacent codes a'b'c' (000) and a'b'c (001), the transition pair can be expressed as ija'b'c' + ia'b'c and simplified to ia'b'(jc' + c) = ia'b'(j + c).

Example 9.2.9. Consider the encoded cover of Example 9.2.8, let the primary input be variable i, the state variables be a, b, c and the encoded transition functions be fa, fb, fc. Consider the first expression, f = iabc' + i'a'bc + i'a'b'c, which can be rewritten as f = iabc' + i'a'c(b + b') = iabc' + i'a'c. The common cube i'a'c is related to the codes of (s4, s5), which are adjacent; in this particular case i1 = i2 = i', because the primary inputs match in correspondence with the two transitions under consideration. Note that the cube i'a'c is not in conjunction with any other expression. For the sake of the example, assume now that s5 is given the code 101. Then f = iabc' + i'a'bc + i'ab'c, which can be rewritten as f = iabc' + i'c(a'b + ab'). This expression is larger in size (i.e., literal count) than the previous one. This is due to the increased distance of the codes of (s4, s5).
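The algebraic identities used in Examples 9.2.7 and 9.2.9 can be checked mechanically by exhaustive evaluation over all variable assignments; a minimal sketch (the helper name `eq` is illustrative):

```python
from itertools import product

def eq(f, g, names):
    """Check two Boolean expressions for equivalence by truth-table sweep."""
    return all(f(**dict(zip(names, v))) == g(**dict(zip(names, v)))
               for v in product((0, 1), repeat=len(names)))

names = ("i", "a", "b", "c")

# f = iabc' + i'a'bc + i'a'b'c  versus its factored form  iabc' + i'a'c
f1 = lambda i, a, b, c: (i & a & b & (1 - c)) | ((1 - i) & (1 - a) & b & c) \
                        | ((1 - i) & (1 - a) & (1 - b) & c)
g1 = lambda i, a, b, c: (i & a & b & (1 - c)) | ((1 - i) & (1 - a) & c)
assert eq(f1, g1, names)
```

The same check confirms that with s5 recoded as 101 the best factoring is iabc' + i'c(a'b + ab'), which indeed has more literals, as claimed above.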
It is important to remember that this encoding technique exploits only one method for reducing the complexity of a multiple-level network: cube extraction reduces the number of literals, but many other (interacting) transformations can achieve the same goal. Hence the estimation of the size of an optimal multiple-level network by considering cube extraction only may be poor. Experimental results have nevertheless shown this choice to be adequate, mainly due to the fact that transition functions are shallow ones and do not require many levels.

9.2.3 Other Optimization Methods and Recent Developments*

We review briefly some other optimization techniques for finite-state machines and their interrelations. This section is intended to give the reader a flavor for the problems involved; for details the reader is referred to the literature [2, 20, 23].

FINITE-STATE MACHINE DECOMPOSITION. Finite-state machine decomposition has been a classic problem for many years, and it has been reported in some textbooks [20, 23]. Most classical techniques have theoretical importance only, because their computational complexity limits their use to small-sized circuits. We comment here on the recent developments in the field.

The goal of finite-state machine decomposition is to find an interconnection of smaller finite-state machines with equivalent functionality to the original one. The rationale is that the decomposition may lead to a reduction in area and to an increase in performance. Different decomposition types can be achieved, as shown in Figure 9.6. Decomposition entails a partition, or a cover, of the state set by subsets, each one defining a component of the decomposed finite-state machine. Whereas collapsing two (or more) finite-state machines into a single one is straightforward, by merging the transition diagrams, the reverse process is more difficult. As for state minimization and encoding, because of the abstractness of the state transition graph model, it is often difficult to relate a decomposition to the final figures of merit of the circuit. Nevertheless, in the particular case of two-way decomposition with two-level logic models, symbolic minimization and constrained encoding can be used to yield partitions whose figures of merit closely represent those of the resulting circuits.

Recently, decomposition algorithms based on factorization have been reported [2]. Factorization consists of finding two (or more) similar disjoint subgraphs in a state transition diagram. An exact factorization corresponds to determining two (or more) subgraphs whose vertices, edges and transitions match. It is then possible to achieve a two-way decomposition of the original finite-state machine by implementing the common subgraphs as a single slave finite-state machine that acts as a "subroutine" invoked by a master finite-state machine. This leads to a sharing of states and transitions and therefore to an overall complexity reduction. Exact factors are hard to find in usual designs, unless these are modeled by state diagrams obtained by flattening hierarchical diagrams. Approximate factors, i.e., similar disjoint subgraphs in a state transition diagram with possibly some mismatches, are more common.
FIGURE 9.6 Different types of finite-state machine decompositions: (a) original machine; (b) general two-way decomposition; (c) parallel decomposition; (d) cascade decomposition.
approximate factorization is more complex and gains are more limited, because mismatches in the common subgraphs must be compensated for. Nevertheless, it can still be used as a basis for two-way decomposition and leads to an overall reduction in complexity.

AN OVERALL LOOK AT FINITE-STATE MACHINE STATE-BASED OPTIMIZATION. There are many degrees of freedom in optimizing finite-state machines that parallel those existing in multiple-level logic design. State minimization is the counterpart of Boolean minimization and state encoding corresponds to library binding. Even though their nature is different, the underlying mechanism is the same: both are transformations into structural representations. Moreover, transformations like decomposition, factoring and collapsing are defined in both domains and are similar in objectives. Therefore finite-state machine optimization systems can be built that support these transformations. As in the case of multiple-level logic optimization, the search for an optimum implementation is made difficult by the degrees of freedom available and the present ignorance about the interrelations among the transformations themselves.

Consider, for example, state minimization and encoding. Whereas it is reasonable to apply both optimization steps (with minimization preceding encoding), counterexamples show that solving exactly, but independently, both problems may not lead to an overall optimum solution. Indeed, encoding non-state-minimal finite-state machines may lead to smaller circuits than encoding state-minimal finite-state machines. State encoding techniques that take advantage of equivalent or compatible states are described in the upcoming section on symbolic relations.

The relations of decomposition to state encoding have been investigated [20, 23]. Indeed, an encoding can be seen as a set of two-way partitions of the state set, each induced by the choice of a 1 or a 0 in an encoding column. As we mentioned before, two-way decomposition can be achieved by symbolic minimization and constrained encoding. The definitions of the generalized prime implicants, the encoding problems and the evaluation of the objective function differ slightly, and we refer the interested reader to reference [2] for details.

There are also interesting relations between decomposition and state minimization. Indeed, a decomposition yields an interconnection of finite-state machines, and the state minimization problem of each component can take advantage of the decomposition structure. This issue is described in the next section.

STATE MINIMIZATION FOR INTERCONNECTED FINITE-STATE MACHINES. Most classical work on state minimization has dealt with finite-state machines in isolation. The quest for an optimal design of interconnected finite-state machines has prompted scientists to revisit the state minimization problem and extend it. We summarize here the major results.

Consider the cascade interconnection of two finite-state machines, as shown in Figure 9.6 (d). The first (driving) machine feeds the input patterns to the second (driven) one. It is then often possible to compute the set of all possible and impossible pattern sequences generated by the driving machine. This corresponds to specifying
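The reachable behavior of the driving machine is what generates don't care conditions for the driven machine. As a minimal illustration (much simpler than the sequence-based techniques discussed next, which track entire pattern sequences), the following Python sketch computes which output symbols a Mealy-style driving machine can ever emit; a symbol it can never emit is an input don't care condition for the driven machine. The machine data here are invented for illustration.

```python
def emittable_outputs(transitions, outputs, start):
    """Collect every output symbol the driving machine can produce
    from states reachable from `start`.

    transitions: dict mapping (state, input) -> next state
    outputs:     dict mapping (state, input) -> output symbol (Mealy)
    """
    reachable, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for (state, inp), nxt in transitions.items():
            if state == s and nxt not in reachable:
                reachable.add(nxt)
                frontier.append(nxt)
    return {outputs[(s, i)] for (s, i) in outputs if s in reachable}

# Hypothetical 3-state driving machine over inputs {0, 1}.
T = {('a', 0): 'b', ('a', 1): 'a',
     ('b', 0): 'a', ('b', 1): 'b',
     ('c', 0): 'a', ('c', 1): 'c'}
O = {('a', 0): 'x', ('a', 1): 'y',
     ('b', 0): 'y', ('b', 1): 'x',
     ('c', 0): 'z', ('c', 1): 'z'}

# State c is unreachable from a, so symbol 'z' is never emitted:
# the driven machine may treat 'z' as an input don't care condition.
print(sorted(emittable_outputs(T, O, 'a')))  # ['x', 'y']
```

Note that this single-symbol view is conservative: the methods below exploit impossible *sequences* of symbols as well, which is a strictly larger set of don't care conditions.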
the input don't care sequences that can be used for the state minimization of the driven machine. Kim and Newborn [22] proposed a technique for the state minimization of the driven machine based on this observation; these sequences correspond to those that the related finite-state machine models can accept and generate. The technique was then improved by Rho et al. [30]. Note that classical state minimization considers only (input, state) pairs, while we are considering here input sequences from a given state. Similarly, it is possible to derive the sequences of patterns of the driving machine that do not affect the output of the driven machine. The knowledge of these output don't care sequences can then be used to simplify the driving machine. We shall show in Section 9.3.3 how the input and output don't care sequences can be computed in the case of network representations.

SYMBOLIC RELATIONS. An approach to state minimization and encoding based on symbolic relations was proposed by Lin and Somenzi [25]. Symbolic relations are extensions of Boolean relations to the symbolic domain. By replacing the state transition function by a state transition relation, we can model the transitions from a state into more than one next state that is indistinguishable as far as these specific transitions are concerned. Thus, symbolic relations allow us to express state tables with more general degrees of freedom for encoding than symbolic functions do. The optimization of symbolic relations parallels that of Boolean relations, and it has applications to several encoding problems.

Example. Consider a three-state machine with states {s1, s2, s3} whose state table is given in reference [23]. By noticing that the compatible classes are {s1, s2} and {s2, s3}, the table can be minimized to a two-state table with states s12 and s23 [23, 25]. The minimal table shows that a transition from s23 under input 0 can lead to either s12 or s23, because the two next states are indistinguishable as far as this specific transition is concerned. Note though that s12 and s23 are not compatible states. A state transition function cannot express this degree of freedom, because it does not distinguish among the individual transitions; a state transition relation can.

9.3 SEQUENTIAL CIRCUIT OPTIMIZATION USING NETWORK MODELS

We already mentioned in the introduction to this chapter that the behavior of sequential circuits can be described by traces, i.e., by sequences of inputs and outputs.
Logic networks where modules can represent storage elements can be used to represent both bound and unbound networks. Bound networks represent gate-level implementations, while unbound networks associate internal modules with logic expressions in terms of time-labeled variables. We consider unbound networks in the sequel. Hence they represent a structured way of representing the circuit behavior. For this reason, we restrict our attention to non-hierarchical logic networks, and we assume that the interconnection nets are split into sets of two-terminal nets.

Network models for synchronous circuits are extensions of those used for describing multiple-level combinational circuits. We recall that we restrict our attention to synchronous circuits with single-phase edge-triggered registers for the sake of simplicity. In addition, it is convenient to discretize time into an infinite set of time points, corresponding to the set of integers and to the triggering conditions of the registers. By this choice, the registers are represented implicitly, and the behavior of a synchronous network can be expressed by functions (or relations) over time-labeled variables. We denote sequences by time-labeled variables: for example, x^(n) denotes variable x at time n, and a sequence of values of variable x is denoted by x^(n) for all n in an interval of interest. We assume that the observed operation of the network begins at time n = 0 after an appropriate initializing (or reset) sequence is applied; the reset inputs are applied at some time point n <= 0. We can then extend the notion of literals, products and sums to time-labeled variables.

Example 9.3.1. Consider the circuit of Figure 9.2 (b), which provides an oscillating sequence when input r is FALSE. Input r is a reset condition. The behavior can be expressed as z^(n) = (z^(n-1) XOR r^(n))' for all n >= 0.

It is convenient sometimes to have a shorthand notation for variables, without explicit dependency on time but just marking the synchronous delay offset with respect to a reference time point. We represent this by appending to the variable the reserved symbol @ and the offset when this is different from zero. Hence x@k = x^(n-k), and x = x@0. Circuit equations in the shorthand notation are normalized by assuming that the left-hand side has zero offset. For example, x^(n+1) = y^(n) is translated to x = y@1. Using the shorthand notation, the circuit of Example 9.3.1 can be described by the expression z = (z@1 XOR r)'.

When compared to combinational logic networks, synchronous networks differ in being edge weighted and in not being restricted to being acyclic. The weights denote the corresponding synchronous delays, i.e., positive weights are assigned to the nets (and to the edges in the corresponding graph) that traverse registers. In particular, a direct connection between two combinational modules has a zero weight, while a connection through a register has a unit weight; a connection through a k-stage shift register has weight k. Zero weights are sometimes omitted in the graphical representations. We call path weight the sum of the edge weights along a path. (Path weights should not be confused with path delays, which are the composition of vertex propagation delays.)
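As an illustration of the time-labeled notation, the recurrence z = (z@1 XOR r)' of Example 9.3.1 can be evaluated step by step. The sketch below is a plain Python simulation; the initial register value is an assumption standing in for the reset sequence.

```python
def simulate(r_seq, z_init=False):
    """Evaluate z(n) = NOT(z(n-1) XOR r(n)), i.e. z = (z@1 XOR r)'.

    z_init plays the role of the value stored in the register before
    the observed operation begins (an assumed reset value).
    """
    z, out = z_init, []
    for r in r_seq:
        z = not (z ^ r)       # complement of the XOR of state and reset
        out.append(int(z))
    return out

# With r held FALSE the output oscillates, as the text states.
print(simulate([False] * 6))  # [1, 0, 1, 0, 1, 0]
```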
We summarize these considerations by refining the earlier definition of a logic network.

Definition 9.3.1. A non-hierarchical synchronous logic network is:

A set of vertices V partitioned into three subsets called primary inputs V^I, primary outputs V^O and internal vertices V^G. Each vertex is assigned to a variable.

A set of scalar combinational Boolean functions associated with the internal vertices. The support variables of each local function are time-labeled variables associated with primary inputs or other internal vertices.

A set of assignments of the primary outputs to internal vertices, that denotes which variables are directly observable from outside the network.

The network model is a multigraph, denoted by G_s(V, E, W). The dependency relation of the support variables corresponds to the edge set E, and the synchronous delay offsets, i.e., the differences in time labels, correspond to the edge weights W. Note that a local function may depend on the value of a variable at different instances of time. In this case the model requires multiple edges between the corresponding vertices, with the appropriate weights. Thus a synchronous network is modeled by a multigraph. The synchronous circuit assumption requires each cycle to have positive weight, to disallow direct combinational feedback. As in the case of combinational networks, an alternative representation is possible in terms of logic equations, whose support are variables with explicit dependence on time. Note also that synchronous logic networks simplify to combinational networks when no registers are present.

An example of a synchronous circuit and its network model is shown in Figures 9.7 (a) and (b), respectively.

Example 9.3.2. Consider the network of Figure 9.7 (a). The circuit decodes an incoming data stream coded with biphase marks, as produced by a compact-disc player. It can be described by a set of equations over time-labeled variables, one per internal vertex. There are two unit-weighted cycles, and there are two edges between one pair of vertices, with zero and unit weight, because the corresponding local function depends on the other variable at two different instances of time.
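The multigraph model described above can be captured by a small data structure. This Python sketch (class and vertex names are illustrative, not from the text) stores, for each tail vertex, a list of (head, weight) pairs, so parallel edges with distinct weights, as in Example 9.3.2, are represented naturally.

```python
from collections import defaultdict

class SynchronousNetwork:
    """Edge-weighted multigraph model: each edge carries a weight equal
    to the number of registers (synchronous delays) on that connection.
    Parallel edges with different weights are allowed."""

    def __init__(self):
        self.edges = defaultdict(list)   # tail -> [(head, weight), ...]

    def connect(self, tail, head, weight=0):
        self.edges[tail].append((head, weight))

    def is_combinational(self):
        # With no registers present the model reduces to an ordinary
        # combinational logic network.
        return all(w == 0 for es in self.edges.values() for _, w in es)

# Two edges between the same vertex pair, with zero and unit weight,
# as in Example 9.3.2.
net = SynchronousNetwork()
net.connect('vb', 'vc', 0)
net.connect('vb', 'vc', 1)
print(net.is_combinational())  # False
```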
FIGURE 9.7 (a) Synchronous circuit; (b) synchronous logic network.

There are different approaches to optimizing synchronous networks. The simplest is to ignore the registers and to optimize the combinational component using techniques of combinational logic synthesis. This is equivalent to deleting the edges with positive weights and optimizing the corresponding combinational logic network. Needless to say, the removal of the registers segments the circuit and weakens the optimality of the result. A radically different approach is retiming. By retiming a network, we move the position of the registers only: we do not change the graph topology, but we modify the weight set W. Leiserson and Saxe [24] presented polynomially bound algorithms for finding the optimum retiming, which minimizes the circuit cycle-time or area. The most general approach to synchronous logic optimization is to perform network transformations that blend retiming with combinational transformations. Such transformations can have an algebraic or a Boolean flavor; in the latter case, the concept of don't care conditions must be extended to synchronous networks.

We present retiming first. Then we survey recent results on synchronous logic transformations, as well as enhancements to the original retiming method. We conclude this section by describing the specification of don't care conditions for optimizing synchronous networks.

9.3.1 Retiming

Retiming algorithms address the problem of minimizing the cycle-time or the area of synchronous circuits by changing the position of the registers. Recall that the cycle-time is bounded from below by the critical path delay in the combinational component of a synchronous circuit, i.e., by the longest path between a pair of registers. Hence retiming aims at placing the registers in appropriate positions, so that the critical paths they embrace are as short as possible. Moving the registers may increase or decrease the number of registers. Thus, area minimization by retiming corresponds to minimizing the overall number of registers.

MODELING AND ASSUMPTIONS FOR RETIMING. We describe first the original retiming algorithms of Leiserson and Saxe [24], using a graph model that abstracts the computation performed at each vertex. Indeed, retiming can be applied to networks that are more general than synchronous logic networks, where any type of computation is performed at the vertices (e.g., arithmetic operations) and synchronous delays are modeled by weights on the edges. Because of the generality of the model, we shall refer to it as a synchronous network and denote it by G_s(V, E, W). When modeling circuits for retiming, it is convenient to represent the environment around a logic circuit within the network model, such as representing distinguished primary input and output ports. Hence, we assume that one or more vertices perform combined input/output operations, so that no vertex is a source or a sink in the graph. We shall defer to a later section a discussion of specific issues related to modeling the environment.

The retiming algorithms proposed by Leiserson and Saxe [24] assume that vertices have fixed propagation delays. When registers have input loads different from other gates, shifting the registers in the circuit may indeed affect the propagation delays; this is a limitation that may lead to inaccurate results. Moreover, we consider topological critical paths only. Hence, retiming may not lead to the best implementation, because only register movement is considered.

Example 9.3.4. A synchronous network is shown in Figure 9.8 [24]. The numbers above the vertices are the propagation delays. The path delay between two registers (identified by non-zero weights on some edges) is the sum of the propagation delays of the vertices along the path.
FIGURE 9.8 Example of a synchronous network.

For a given path (v_i, ..., v_j), we define the path weight as the sum of the weights of the edges along the path:

w(v_i, ..., v_j) = sum of the w_k over the edges of the path   (9.1)

The path weight relates to the register count along that path. For a given path (v_i, ..., v_j), we define the path delay as the sum of the propagation delays of the vertices along the path, including the extremal vertices:

d(v_i, ..., v_j) = sum of the d_k over the vertices of the path   (9.2)

Note that the path delay is defined independently of the presence of registers along that path. The path delay must not be confused with the path weight.

Retiming a vertex means moving registers from its outputs to its inputs, or vice versa. When this is possible, the retiming of a vertex is an integer that measures the amount of synchronous delay that has been moved: a positive value corresponds to shifting registers from the outputs to the inputs, a negative one to the opposite direction. Retiming can be formally defined as follows.

Definition 9.3.2. A retiming of a network G_s(V, E, W) is an integer-valued vertex labeling r : V -> Z that transforms G_s(V, E, W) into G_s(V, E, Ŵ), where for each edge (v_i, v_j) ∈ E the weight after retiming is:

ŵ_ij = w_ij + r_j - r_i   (9.3)

The retiming of a network is represented by a vector r whose elements are the retiming values of the corresponding vertices.

Example 9.3.5. Consider the circuit fragment shown in Figure 9.9 (a), whose network is shown in Figure 9.9 (b). A retiming of vertex v_b by 1 leads to the circuit fragment of Figure 9.9 (c), whose network is shown in Figure 9.9 (d). Since the weight on the output edge of v_b is 1 before retiming, the weight after retiming is ŵ = 1 + 0 - 1 = 0, while the zero weight on its input edge becomes 0 + 1 - 0 = 1.

It is simple to show [24] that the weight of a path after retiming depends only on the retiming of its extremal vertices, because the retiming of the internal vertices moves registers within the path itself. Namely, for a given path (v_i, ..., v_j), the weight after retiming is:

ŵ(v_i, ..., v_j) = w(v_i, ..., v_j) + r_j - r_i

Example 9.3.6. Consider the network fragments shown in Figures 9.9 (b) and (d): the weights of the paths through v_b change only according to the retiming of the extremal vertices.
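Equation 9.3 and the legality condition discussed next translate directly into code. In this Python sketch (vertex names are hypothetical), retime_weights applies ŵ_ij = w_ij + r_j - r_i edge by edge, and is_legal checks that no weight becomes negative.

```python
def retime_weights(edges, r):
    """Apply Equation 9.3: the weight of edge (vi, vj) after retiming
    is w_ij + r[vj] - r[vi]."""
    return {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}

def is_legal(edges, r):
    """A retiming is legal iff the retimed network has no negative weights."""
    return all(w >= 0 for w in retime_weights(edges, r).values())

# As in Example 9.3.5: retiming vb by 1 moves the register on its
# output edge (weight 1 -> 1 + 0 - 1 = 0) onto its input edge
# (weight 0 -> 0 + 1 - 0 = 1).
edges = {('va', 'vb'): 0, ('vb', 'vc'): 1}
r = {'va': 0, 'vb': 1, 'vc': 0}
print(retime_weights(edges, r))  # {('va', 'vb'): 1, ('vb', 'vc'): 0}
print(is_legal(edges, r))        # True
```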
FIGURE 9.9 (a) Circuit fragment; (b) network fragment; (c) circuit fragment after retiming; (d) network fragment after retiming.

Note that the path delay is invariant with respect to retiming by definition. As a consequence, weights on cycles are invariant under retiming, because the retiming of the extremal vertices of a cycle cancels out. A retiming is said to be legal if the retimed network has no negative weights. Leiserson and Saxe [24] proved formally that networks obtained by legal retiming are equivalent to the original ones. Moreover, they also showed that retiming is the most general method for changing the register count and position without knowing the functions performed at the vertices. Hence, the family of networks equivalent to the original one can be characterized by the set of legal retiming vectors.

Example 9.3.7. Consider the network of Figure 9.8. An equivalent network is shown in Figure 9.10, corresponding to the retiming vector r = [1 1 2 2 2 1 0 0]^T, where the entries are associated with the vertices in lexicographic order.

FIGURE 9.10 Retimed network. (Zero weights are omitted.)

CYCLE-TIME MINIMIZATION BY RETIMING. Retiming a circuit affects its topological critical paths. The goal of this section is to show how an optimum retiming vector can be determined that minimizes the length of the topological critical path in the network.
For each ordered vertex pair v_i, v_j ∈ V, we define W(v_i, v_j) = min w(v_i, ..., v_j) over all paths from v_i to v_j, and D(v_i, v_j) = max d(v_i, ..., v_j) over all paths from v_i to v_j with weight W(v_i, v_j). Thus D(v_i, v_j) is the maximum delay on a path with minimum register count between the two vertices. We denote by W and D the square matrices of size |V| containing these elements. These quantities can be computed by using an all-pair shortest/longest path algorithm, such as Warshall-Floyd's. The usefulness of relating the retiming theory to the quantities W(v_i, v_j) and D(v_i, v_j) is that they are unique for each vertex pair and capture the most stringent timing requirement between them.

Example 9.3.8. Consider the network of Figure 9.8 and a pair of its vertices. There are two paths between them, with weights 2 and 3, respectively. Hence the pair is characterized by W = 2 and by the maximum delay D = 16 over the paths of weight 2. Note also that the topological critical path of the network of Figure 9.8 has delay 24 units, whereas the topological critical path of the retimed network of Figure 9.10 has delay 13 units.

Given a legal retiming, we say that the network is timing feasible if it can operate correctly with a cycle-time φ.² The circuit operates correctly if and only if any path whose delay is larger than φ is "broken" by at least one register, or, equivalently, its weight after retiming is larger than or equal to 1. This is equivalent to saying that Ŵ(v_i, v_j) >= 1 for all pairs v_i, v_j ∈ V such that D(v_i, v_j) > φ. We say that a retiming vector is feasible if it is legal and if the retimed network is timing feasible for the given cycle-time φ. We can now state Leiserson and Saxe's theorem [24].

Theorem 9.3.1. Given a network G_s(V, E, W) and a cycle-time φ, r is a feasible retiming if and only if:

r_i - r_j <= w_ij for each edge (v_i, v_j) ∈ E   (9.4)

r_i - r_j <= W(v_i, v_j) - 1 for all v_i, v_j ∈ V such that D(v_i, v_j) > φ   (9.5)

Proof. A retiming is legal if and only if ŵ_ij = w_ij + r_j - r_i >= 0 for each edge, i.e., r_i - r_j <= w_ij. Given a legal retiming, the retimed network is timing feasible if and only if Ŵ(v_i, v_j) >= 1 for all vertex pairs such that D(v_i, v_j) > φ. Since Ŵ(v_i, v_j) = W(v_i, v_j) + r_j - r_i, this is equivalent to W(v_i, v_j) + r_j - r_i >= 1, which can be recast as r_i - r_j <= W(v_i, v_j) - 1. Note that D(v_i, v_j) is invariant under retiming.

²We assume that set-up times can be subtracted from φ and that clock skew and register propagation delays are negligible.
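The matrices W and D can be computed together by one all-pairs pass over lexicographically ordered cost pairs. In the Python sketch below (a Warshall-Floyd pass; the network data are hypothetical), each edge (u, v) costs (w(u, v), -d(u)): minimizing the pair first minimizes the register count and, among minimum-register paths, maximizes the delay, which is recovered by adding back the head vertex's delay.

```python
def wd_matrices(vertices, edges, delay):
    """Compute W (minimum path register count) and D (maximum delay over
    minimum-register paths) for all connected ordered vertex pairs."""
    INF = (float('inf'), 0.0)
    cost = {(u, v): INF for u in vertices for v in vertices}
    for (u, v), w in edges.items():
        cost[(u, v)] = min(cost[(u, v)], (w, -delay[u]))
    for k in vertices:                       # Warshall-Floyd relaxation
        for i in vertices:
            for j in vertices:
                ik, kj = cost[(i, k)], cost[(k, j)]
                cand = (ik[0] + kj[0], ik[1] + kj[1])
                if cand < cost[(i, j)]:      # lexicographic comparison
                    cost[(i, j)] = cand
    W, D = {}, {}
    for (u, v), (w, negd) in cost.items():
        if w != float('inf'):
            W[(u, v)] = w
            D[(u, v)] = delay[v] - negd      # total path delay incl. head
    return W, D

# Hypothetical three-vertex chain: one register on (a, b), none on (b, c).
vertices = ['a', 'b', 'c']
edges = {('a', 'b'): 1, ('b', 'c'): 0}
delay = {'a': 3, 'b': 7, 'c': 7}
W, D = wd_matrices(vertices, edges, delay)
print(W[('a', 'c')], D[('a', 'c')])  # 1 17
```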
The computation of the path weights and delays, as well as that of matrices W and D, is trivial. Checking the conditions of Theorem 9.3.1 is fairly straightforward, because the right-hand sides of the inequalities are known constants for a given φ. The solution to this linear set of inequalities can be computed efficiently by the Bellman-Ford algorithm, which searches for the longest path in a graph with vertex set V and with edges and weights determined by inequalities (9.4) and (9.5). (See Section 2.4.) Note that the problem may have no solution when it is over-constrained, i.e., when there exists a topological critical path whose delay exceeds φ in all circuit configurations corresponding to a legal retiming.

A naive way to minimize the cycle-time in a network is to check whether there exists a feasible retiming for decreasing values of φ. A more efficient search can be done by noticing that the optimum cycle-time must match a path delay for some vertex pair; namely, in any network there exists a vertex pair v_i, v_j ∈ V such that D(v_i, v_j) equals the optimum cycle-time. Hence a binary search among the entries of D provides the candidate values for the cycle-time. For each tentative value of φ, inequalities (9.4) and (9.5) are constructed, and a solution is sought by invoking the Bellman-Ford algorithm. This is the original algorithm proposed by Leiserson and Saxe [24]. (See Algorithm 9.3.1.) The overall worst-case computational complexity is O(|V|^3 log |V|), where the cubic term is due to the Bellman-Ford algorithm and the logarithmic one to the binary search.

RETIME-DELAY( G_s(V, E, W) ) {
  Compute all path weights and delays in G_s(V, E, W);
  foreach φ determined by a binary search on the elements of D {
    Construct inequalities (9.4) and (9.5);
    Solve inequalities (9.4) and (9.5) by the Bellman-Ford algorithm;
  }
  Determine r by the last successful solution to the Bellman-Ford algorithm;
}

ALGORITHM 9.3.1

Example 9.3.9. Consider the network of Figure 9.8. The elements of D, once sorted, range from 33 down to 3 units. (The matrices related to this example are reported in reference [24].) The binary search selects first φ = 19; the set of inequalities (9.4) and (9.5) is constructed, and the Bellman-Ford algorithm is applied. Since no positive cycle is detected, a feasible retiming exists for a cycle-time of 19 units, and the binary search continues with smaller candidate values.
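The feasibility check at the heart of RETIME-DELAY is a system of difference constraints, which the Bellman-Ford algorithm either solves or proves infeasible by detecting a negative cycle in the constraint graph. A minimal Python sketch, with made-up constraints rather than those of Figure 9.8:

```python
def solve_difference_constraints(variables, constraints):
    """Each constraint (i, j, b) encodes ri - rj <= b, i.e. an edge
    j -> i of weight b in the constraint graph.  Relax as if from a
    virtual source at distance 0 to every vertex; a solution exists
    iff the graph has no negative cycle."""
    r = {v: 0 for v in variables}
    for _ in range(len(variables) + 1):
        changed = False
        for (i, j, b) in constraints:
            if r[j] + b < r[i]:
                r[i] = r[j] + b
                changed = True
        if not changed:
            return r        # converged: feasible assignment
    return None             # still relaxing: negative cycle, infeasible

# ra - rb <= 1 together with rb - ra <= 0 is satisfiable:
print(solve_difference_constraints(['a', 'b'],
                                   [('a', 'b', 1), ('b', 'a', 0)]))
# ra - rb <= 0 together with rb - ra <= -1 forces ra <= ra - 1:
print(solve_difference_constraints(['a', 'b'],
                                   [('a', 'b', 0), ('b', 'a', -1)]))  # None
```

In RETIME-DELAY the variables are the retiming entries and the bounds b are the constants w_ij and W(v_i, v_j) - 1 of inequalities (9.4) and (9.5).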
FIGURE 9.11 Constraint graph.

We describe instead the case when φ = 13 is selected. Inequalities (9.4) and (9.5) are computed; some of these are redundant, and the relevant ones are shown by dotted edges in Figure 9.11. The constraint graph has the same topology as the network, with complemented weights, and the retiming entries can be thought of as the weights of the longest paths from a reference vertex, say v_h. The Bellman-Ford algorithm can then be used to compute the retiming vector. An inspection of the graph shows no positive cycle, and therefore the constraints are consistent. In this case the retiming vector is [1 2 2 3 2 1 0 0]^T. Note that the solution is not unique, and that other (smaller) retiming values are possible for v_b and v_d: another feasible retiming vector is [1 1 2 2 2 1 0 0]^T, corresponding to the network shown in Figure 9.10. Attempts with a shorter cycle-time would then fail.

Even though this method has polynomial-time complexity, its run time may be high: computing matrices W and D may require large storage for graphs with many vertices. Most large networks are sparse, i.e., the number of edges is much smaller than the number of vertex pairs, and some retiming algorithms exploit the sparsity of the network to be more efficient on large networks. We review here a relaxation method that can be used to check the existence of a feasible retiming for a given cycle-time φ [24]. It is called FEAS, and it can replace the Bellman-Ford algorithm in RETIME-DELAY. Note that when the algorithm is incorporated in RETIME-DELAY, matrices W and D must still be computed, and the overall complexity may be dominated by their evaluation. However, the algorithm can also be used in a heuristic search for a minimum cycle-time, where just a few values of φ are tried out in decreasing order. This may correspond to the practical design requirement of choosing clock cycles that are multiples of some unit, or to other design choices. In that case only the synchronous network needs to be stored.

The algorithm uses the notion of data-ready time at each vertex, equal to the data-ready time in the combinational network obtained by deleting the edges with positive weights. As in the combinational case, the data-ready times are denoted by {t_i : v_i ∈ V}; namely:

t_i = d_i + max { t_j : (v_j, v_i) ∈ E, w_ji = 0 }   (9.6)

with t_i = d_i for vertices with no zero-weight incoming edge. The circuit operates correctly with cycle-time φ when the maximum data-ready time is less than or equal to φ.

Algorithm FEAS is iterative in nature. At each iteration it computes the data-ready times and their maximum value. If this value is less than or equal to the given value of φ, the algorithm terminates successfully. Otherwise, it searches for all those vertices whose data-ready times exceed the cycle-time, namely, those vertices where the output signal is generated too late to be stored in a register. It then retimes these vertices by 1 unit, i.e., it moves registers backward along those paths whose delay is too large. Since moving registers may create timing violations on some other paths, the process is iterated until a feasible retiming is found. A remarkable property of the algorithm is that if it fails to find a solution in |V| iterations, no feasible retiming exists for the given φ, and it returns FALSE [24]. (See Algorithm 9.3.2.)
FEAS( G_s(V, E, W), φ ) {
  Set r = 0;
  for (k = 1 to |V|) {
    Compute the set of data-ready times;
    if (max_i t_i <= φ)                /* all path delays are bounded by φ */
      return ( G_s(V, E, Ŵ) retimed by r );
    else                               /* some path delays exceed φ */
      foreach vertex v_i such that t_i > φ
        r_i = r_i + 1;                 /* retime vertices with excessive delay */
  }
  return ( FALSE );
}

ALGORITHM 9.3.2

The computational complexity of the algorithm is O(|V||E|).

Theorem 9.3.2. Given a synchronous network G_s(V, E, W) and a cycle-time φ, algorithm FEAS returns a timing-feasible network G_s(V, E, Ŵ) if and only if a feasible retiming exists. If no feasible retiming exists, it returns FALSE [24].

Proof. We first show that the algorithm constructs only legal retiming vectors. Consider any vertex, say v_i, being retimed at some iteration of the algorithm. Since t_i > φ implies t_j > φ for any vertex v_j that is the head of a zero-weight path from v_i, all such successors are retimed as well; hence the retiming of v_i as done by the algorithm cannot introduce a negative edge weight. When the algorithm returns G_s(V, E, Ŵ), the returned network is obviously timing feasible for the given φ. It remains to be proven that when the algorithm returns FALSE, no feasible retiming exists for the given φ. This can be shown by relating algorithm FEAS to the Bellman-Ford algorithm and showing that they are equivalent, as reported in reference [24].
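The FEAS iteration can be sketched compactly in Python. The example network here is hypothetical: a three-vertex cycle carrying one register. Data-ready times are computed on the zero-weight (combinational) subgraph by repeated relaxation, which converges because that subgraph is acyclic.

```python
def data_ready(weights, delay, vertices):
    """Data-ready times on the combinational component: ignore edges
    with registers (positive weight) and relax |V| times."""
    t = {v: delay[v] for v in vertices}
    for _ in vertices:
        for (u, v), w in weights.items():
            if w == 0:
                t[v] = max(t[v], t[u] + delay[v])
    return t

def feas(edges, delay, phi):
    """Leiserson-Saxe FEAS: retime every late vertex by 1 per iteration;
    if |V| iterations do not suffice, no feasible retiming exists."""
    vertices = sorted({u for e in edges for u in e})
    r = {v: 0 for v in vertices}
    w = dict(edges)
    for _ in vertices:
        t = data_ready(w, delay, vertices)
        late = [v for v in vertices if t[v] > phi]
        if not late:
            return r
        for v in late:
            r[v] += 1
            for (a, b) in w:            # Equation 9.3, applied incrementally
                w[(a, b)] += (v == b) - (v == a)
    return None

# Cycle a -> b -> c -> a with one register and total delay 13: the
# cycle delay cannot be broken below 13 with a single register.
edges = {('a', 'b'): 0, ('b', 'c'): 0, ('c', 'a'): 1}
delay = {'a': 3, 'b': 3, 'c': 7}
print(feas(edges, delay, 13))  # {'a': 0, 'b': 0, 'c': 0}
print(feas(edges, delay, 10))  # None
```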
which would not alter the feasibility of the network. = 3: r. = 1.13 (c). Let # = 13. because the corresponding constraint in 9. . > d ( v i . which yields 1. u.12 (c2)] and determine the dataready times. = 14: r. vj) > 4 and . = 17: 1. r8 = 7..) 1 is exactly the asagnment done by the BellmanFord algorithm. v . I?. r... SO that the algorithm retimes I. u. Let us compute Ihe dataready times. r. r.. . rd = 3: r ..) = 0 and d(u. . Setting r.. r. reported again in Figure 9.) = 0. We then concentrate on vertex pairs a. as reported in reference [24].. u. v. . the variation of the number of registers is the only relevant factor a s far a s area optimization by retiming is concerned. : D(u.. Recall that cutting the edges with positive weighls determines the combinational component of the circuit.. to u . iterations by setting r. We can also disregard vertex pairs u..) > @. Since retiming does not affect the functions associated with the vertices of the network. We give here a short intuitive proof. = b 10: ti = 17: r8 = 24: 1. Thus: 1 . The network is timing feasible.) . = 3. f h = 7. We consider again the combinational component of the circuit [Figure 9. u : D(u.Split by PDF Splitter SEQUWTIAL LCGlC OPnMlZATlON 469 no feasible retiming exists. is incremented only when there exists a path.12 (bl). = I. u.12 (cl). . = 3: r = 3: r.) + 1 for each vertex pair u . LC. = r.12 (al). r.. there is no reason for having independent regisers on multiple fanout paths. u. 1. Retiming these vertices by I corresponds to assigning w. t h ) . Hence r. = 3...W(u.. I = r.W(u. w.
FIGURE 9.12 (a1, b1, c1) Synchronous network being retimed by the FEAS algorithm at various steps (shaded vertices are not timing feasible); (a2, b2, c2) combinational component of the network at the corresponding steps.

Consider a generic vertex, say v_i ∈ V, that is retimed by r_i. The local variation in register count is r_i (indegree(v_i) - outdegree(v_i)). Therefore, the overall variation in register count due to a retiming r is c^T r, where c is a known constant vector of size |V| whose entries are the differences between the indegree and the outdegree of each vertex.
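The identity "register variation equals c^T r" can be checked directly against Equation 9.3 on a small invented network fragment:

```python
def register_delta(edges, r):
    """c[v] = indegree(v) - outdegree(v); the change in total register
    count under retiming r (without register sharing) is c^T r."""
    c = {}
    for (u, v) in edges:
        c[u] = c.get(u, 0) - 1
        c[v] = c.get(v, 0) + 1
    return sum(c.get(v, 0) * rv for v, rv in r.items())

# Hypothetical fragment; retiming vc (indegree 2, outdegree 0 here)
# by 1 adds one register on each of its two input edges.
edges = {('a', 'b'): 0, ('a', 'c'): 1, ('b', 'c'): 1}
r = {'a': 0, 'b': 0, 'c': 1}
retimed = {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}
delta = sum(retimed.values()) - sum(edges.values())
print(delta, register_delta(edges, r))  # 2 2
```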
Since any retiming vector must be legal, the unconstrained area minimization problem can be stated as follows:

min c^T r such that r_i - r_j <= w_ij for each (v_i, v_j) ∈ E

Since the right-hand sides of the inequalities are integers, the retiming vector r has integer entries. This formulation is also the linear programming dual of a minimum-cost flow problem, which can be solved in polynomial time [24]. Similarly, area minimization under a cycle-time constraint φ can be formulated as:

min c^T r such that r_i - r_j <= w_ij for each (v_i, v_j) ∈ E and r_i - r_j <= W(v_i, v_j) - 1 for all v_i, v_j : D(v_i, v_j) > φ

These problems can be solved by any linear program solver, such as the simplex algorithm. Note that the timing feasibility problem under an area constraint can be modeled by appending a bound on c^T r to inequalities (9.4) and (9.5) and solving by linear programming techniques. This provides us with a means of solving the minimum cycle-time problem under area constraints by retiming.

Example 9.3.11. Consider the network of Figure 9.8. It is area minimal: there are only four registers, and they are placed on a cycle, whose weight no legal retiming can reduce. The implementation of Figure 9.10, obtained by retiming for a shorter cycle-time, requires five registers: the additional register is the price paid for the reduced cycle-time. Note that retiming v_b by r_b = -1 would yield a timing-feasible circuit for the same cycle-time.

FIGURE 9.13 (a) Circuit and network fragment; (b) retiming without register sharing; (c) retiming with register sharing.

To take register sharing into account, we transform the network G_s(V, E, W) into a modified one, called Ĝ_s(V̂, Ê, Ŵ). The transformation is such that the
To take register sharing into account, we transform the network G_n(V, E, W) into a modified one, called G^_n(V^, E^, W^). The transformation is such that the overall register count in G^_n(V^, E^, W^) without register sharing equals the register count in G_n(V, E, W) with register sharing. The transformation targets only those vertices with multiple direct successors, i.e., those where register sharing matters; without loss of generality, we consider a single such vertex.

Consider a vertex, say v_i, with k direct successors u_1, u_2, ..., u_k. The modified network G^_n(V^, E^, W^) is obtained from G_n(V, E, W) by adding a dummy vertex u^ with zero propagation delay and edges from u_1, ..., u_k to u^. Let w_max = max_j w_ij. The weights of the new edges are w^_j = w_max - w_ij, j = 1, 2, ..., k, so that all path weights w(v_i, ..., u^) are equal. They remain equal after any retiming, even though the edge weights change, and u^ itself cannot be retimed, because it is a sink of the graph.

Let each edge in G^_n(V^, E^, W^) have a numeric attribute, called breadth. We set the breadth of the edges (v_i, u_j) and (u_j, u^) to 1/k. We then minimize the register count using a modified cost function c^T r, where the entries of c are the differences of the sums of the breadths of the incoming edges minus the sums of the breadths of the outgoing ones. Note that when all breadths are 1, vector c represents just the difference of the degrees of the vertices, as before. This formulation models accurately register sharing on the net that stems from v_i when the overall register count in G^_n(V^, E^, W^) is minimized: since the k paths from v_i to u^ have equal weight and breadth 1/k, the net is charged exactly max_j w_ij registers, thus modeling the register sharing effect. A local increase in registers at such a net may then be justified by an overall decrease in other parts of the network.

Example 9.3.13. A network fragment is shown in Figure 9.14(a), where dotted edges model the interaction with the rest of the network. Vertex v_i has outdegree k = 3. The modified network before and after retiming is shown in Figures 9.14(b) and (c), respectively. Before retiming, w_max = 3, corresponding to three registers in a cascade connection, and the marginal area cost associated with the modified network is (3 + 3 + 3)/3 = 3. After retiming, w_max = 2 and the marginal cost is 2.
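The effect of the breadth attribute can be sketched numerically. The fanout weights below are hypothetical; the point is that the breadth-1/k modified network charges a shared fanout net max_j w_ij registers rather than the sum of the branch weights:

```python
def fanout_registers(weights, sharing=True):
    """Registers needed by a single fanout net with the given edge
    weights: without sharing each branch carries its own register
    chain; with sharing one chain of max(weights) registers serves
    every branch, tapped at the appropriate depths."""
    return max(weights) if sharing else sum(weights)

# Hypothetical fanout net with k = 3 branches.
w = [3, 2, 1]
k = len(w)
# In the modified network, every path from the fanout vertex to the
# dummy sink has weight max(w), and each is counted with breadth 1/k,
# so the breadth-weighted register count equals max(w).
modified_cost = sum((1 / k) * max(w) for _ in range(k))
```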
FIGURE 9.14 (a) Network fragment. (b) Modified network before retiming. (c) Modified network after retiming.

A practical problem associated with retiming synchronous logic networks is the determination of an initial state of the retimed network such that the new network is equivalent to the original one. This is always possible when the reset functions of the registers are modeled explicitly by combinational circuits. In practical circuits, however, it is often convenient to associate the reset function with the registers themselves, because this simplifies the network and often matches the circuit specification of the register cells. An example of a register with an explicit external synchronous reset circuit is shown in Figure 9.15(b); it can be compared to the register with internal reset of Figure 9.15(a). In practical terms the difference lies in the internal transistor-level model and in the possible support for asynchronous reset; as far as we are concerned, the difference is in the level of detail of the objects of our optimization technique. When the reset function is internal to the registers, the determination of the initial state of a retimed network may be problematic.

Consider the negative retiming of a vertex, i.e., registers are moved from the inputs to the output of the corresponding combinational gate. Then the initial state can be determined by applying the local function to the initial states of the inputs. Conversely, when considering a positive retiming of a vertex, registers are moved from the output to the inputs of the local gate, and an equivalent set of initial states must be found at the inputs. This is not always possible: consider the case when no register sharing is used and the gate under consideration drives more than one register with different initial states.

Example 9.3.14. Consider the circuit of Figure 9.15(c), where the registers have internal reset and are initialized to different values. It is impossible to find an equivalent initial state in the circuit retimed by 1, shown in Figure 9.15(d).

FIGURE 9.15 (a) Register model with internal reset. (b) Register model with explicit external reset circuit. (c) Circuit fragment before retiming, with initial conditions. (d) Circuit fragment after retiming.
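The two cases above can be sketched as follows; this is a toy model over {0, 1} with made-up helper names, not an algorithm from the text:

```python
from itertools import product

def forward_init(gate, input_inits):
    """Negative retiming of a vertex: the input registers move to the
    output, and the new register's initial state is obtained by
    applying the local function to the removed registers' states."""
    return gate(*input_inits)

def backward_init(gate, n_inputs, output_inits):
    """Positive retiming of a vertex: the output registers (one per
    fanout branch, possibly with different initial values) move to
    the inputs.  Search for input initial states that reproduce every
    removed output value; return None when no equivalent state
    exists."""
    for a in product([0, 1], repeat=n_inputs):
        if all(gate(*a) == v for v in output_inits):
            return a
    return None

AND = lambda x, y: x & y
```

For instance, forward_init(AND, (1, 1)) gives 1, while backward_init(AND, 2, [0, 1]) returns None: as in Example 9.3.14, two driven registers with different initial values admit no equivalent initial state after positive retiming.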
RETIMING MULTIPLE-PORT SYNCHRONOUS LOGIC NETWORKS AND OPTIMUM PIPELINING. Let us consider now the input/output modeling issue. Leiserson's model for retiming synchronous networks assumes that input/output is modeled by one or more vertices of the network. Practical examples of networks have distinguished primary input and output ports, which motivates the use of distinguished vertices for the inputs and outputs. The most common assumption is that the environment provides inputs and samples outputs at each cycle. Hence we can lump all primary input vertices into a single vertex (v_I) and all primary output vertices into another one (v_O). An edge of unit weight between the output vertex and the input vertex models the synchronous operation of the environment. With this simple transformation, no vertex is a source or a sink in the graph. Without loss of generality, the input and output vertices are assumed to have zero propagation delay; finite delays on any port can be modeled by introducing extra vertices with the propagation delays of the ports.

Example 9.3.15. The network of Figure 9.7 is reproposed in Figure 9.16(a), after having combined the primary output vertices and added the edge representing the environment. Assume all gates have unit delay. The critical delay is 3 units. An optimum retiming is shown in Figure 9.16(b), where the critical path length is 2 units.

FIGURE 9.16 (a) Modified synchronous network. (b) Retimed synchronous network.

A pitfall of this modeling style is that the synchronous delay of the environment may be redistributed inside the circuit by retiming. This problem can be obviated by the additional constraint r_I = r_O. Under this constraint, retiming preserves the relative timing of the input and output signals, because retiming the merged input (or output) vertex implies adding or removing the same amount of weight on all edges incident to it.
A different modeling style for retiming multiple-port networks is to leave the network unaltered and require zero retiming at the ports. This can be modeled by the additional constraint r_I = r_O = 0, which is a stronger condition. When using algorithms that allow retiming of a single polarity, like the FEAS algorithm that retimes vertices iteratively by +1 and thus moves registers from outputs to inputs, this modeling style may in some cases preclude finding the optimum solution. Consider, for example, a network whose critical path is an input/output path with a (high) weight on its tail edge: the corresponding registers cannot be redistributed by the algorithm along that path.

Let us now consider the following optimum pipelining problem: given a combinational circuit, insert appropriate registers so that the cycle-time is minimum for a given latency lambda. The circuit can be modeled by merging all primary inputs and outputs into two distinguished vertices and joining the output vertex to the input vertex by means of an edge with weight lambda. Note that in this case the synchronous delay of the environment is meant to be distributed inside the network. If our pipeline model is such that primary inputs and outputs should be synchronized to the clock, then we may require that the weight on the edge representing the environment be at least 1 after retiming, i.e., r_O - r_I <= lambda - 1. By applying algorithm RETIME_DELAY to the circuit for different values of lambda, we can derive the exact latency/cycle-time trade-off curve. Similarly, we can minimize the number of registers in a pipeline, under latency and cycle-time constraints, by using the linear programming formulation of the previous section.

9.3.2 Synchronous Circuit Optimization by Retiming and Logic Transformations

When considering the overall problem of optimizing synchronous circuits, we can think of optimization as the application of a set of operators, where retiming is one of them. Other operators are the combinational transformations described in Sections 8.3 and 8.4, applied to the combinational component. It is obviously possible to alternate retiming and combinational optimization. Nevertheless, the two are interrelated. Retiming is limited in optimizing the performance of a circuit by the presence of vertices with large propagation delays, which we call combinational bottlenecks. Conversely, combinational optimization algorithms are limited in their ability to restructure a circuit by the fragmentation of the combinational component due to the registers interspersed with the logic gates. Hence it is desirable to perform retiming in conjunction with optimization of the combinational component.

Dey et al. [15] proposed a method that determines timing requirements on combinational subcircuits such that a retimed circuit can be made timing feasible for a given phi when these requirements are met. The timing constraints on the propagation delays of the combinational subcircuits can then guide algorithms such as SPEED_UP (see Section 8.3) to restructure the logic circuit to meet the bounds. In other words, the method attempts to remove the combinational bottlenecks so that the retiming algorithm can meet a cycle-time requirement.
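The latency/cycle-time trade-off mentioned above can be illustrated on the special case of a chain of gates, where optimum pipelining reduces to cutting the chain into lambda + 1 stages. This brute-force sketch is an illustration only, not the RETIME_DELAY algorithm, and the delay values are hypothetical:

```python
from itertools import combinations

def min_cycle_time(delays, latency):
    """Optimum pipelining of a chain of gates by exhaustive search:
    `latency` register banks cut the chain into latency + 1
    contiguous stages, and the cycle-time is the largest stage
    delay.  Returns the minimum cycle-time over all cut choices."""
    n = len(delays)
    best = sum(delays)                   # single-stage (latency 0) bound
    for cuts in combinations(range(1, n), latency):
        bounds = [0, *cuts, n]
        stage = max(sum(delays[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, stage)
    return best

# Latency/cycle-time trade-off curve for a hypothetical chain.
curve = [min_cycle_time([1, 2, 3, 2], lam) for lam in range(4)]
```

The curve is nonincreasing in lambda and flattens once the largest single gate delay dominates.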
Malik et al. [28] proposed a complementary approach that removes the registers temporarily from (a portion of) a network, so that combinational logic optimization algorithms can be applied to networks that are not segmented by the presence of registers, or equivalently so that local transformations can be performed across register boundaries. Since the registers are temporarily placed at the periphery of the circuit, the method is called peripheral retiming. The goal of peripheral retiming is to exploit at best the potential of combinational logic optimization, which is most effective in reducing delay (or area) when dealing with a larger, unpartitioned circuit. Peripheral retiming is followed by combinational logic optimization and then by a regular retiming that places the registers back into the circuit.

PERIPHERAL RETIMING. Peripheral retiming applies to acyclic networks; when synchronous logic networks have cycles, these can be cut by temporarily removing some feedback edges. The vertices of a synchronous logic network can be partitioned into peripheral vertices, defined as V^P = V^I union V^O, and internal ones. Similarly, edges can be classified as peripheral or internal: the former are incident to a peripheral vertex, while the latter are the remaining ones.

Definition 9.3.3. A peripheral retiming of a network G_n(V, E, W) is an integer-valued vertex labeling p : V -> Z, with p(v) = 0 for all v in V^P, that transforms G_n(V, E, W) into G^_n(V, E, W^), where for each edge (v_i, v_j) in E the weight after peripheral retiming is w^_ij = w_ij + p_j - p_i, and w^_ij = 0 for each internal edge.

In other words, peripheral vertices cannot be retimed, and the internal vertices are retimed so that every internal edge has zero weight. As a consequence, all registers are moved to the periphery of the circuit. Negative weights are allowed on peripheral edges: a negative weight means that we are temporarily borrowing time from the environment while applying combinational optimization. Note that a legal retiming (see Section 9.3.1) requires w_ij + r_j - r_i >= 0 for each edge, while a peripheral retiming requires w_ij + p_j - p_i = 0 for each internal edge.

Example 9.3.16. Consider the synchronous circuit of Figure 9.17(a) [28]. One AND gate is redundant, but combinational techniques cannot detect this when operating on the network fragments obtained by removing the registers. Figure 9.17(b) shows the circuit after peripheral retiming; note that one register has negative weight. The result of applying combinational optimization to the peripherally retimed circuit is shown in Figure 9.17(c), where the redundancy is detected and eliminated. The circuit after the final retiming is shown in Figure 9.17(d); note that no negative weight is left in the circuit.

Other approaches merge local transformations with local retiming; as in the case of combinational circuits, they can have the algebraic or the Boolean flavor.
FIGURE 9.17 (a) Synchronous network and circuit. (b) Modified network and circuit after peripheral retiming. (c) Optimized network and circuit after combinational optimization. (d) Synchronous network and circuit after final retiming.

Two issues determine the applicability of this method: first, the characterization of the class of circuits for which peripheral retiming is possible; second, the characterization of the combinational transformations for which a valid retiming exists afterwards (i.e., such that the negative weights can be removed). The circuits that satisfy the following conditions have a peripheral retiming: (i) the network graph is acyclic; (ii) there exist integer vectors a in Z^|V^I| and b in Z^|V^O| such that w(v_i, ..., v_j) = a_i + b_j for all paths (v_i, ..., v_j) with v_i in V^I and v_j in V^O. In particular, there can be no two paths with different weights from an input vertex to an output vertex. The correctness of this condition is shown formally in reference [28]. Vectors a and b represent the weights on the input and output peripheral edges of a retimed circuit. Pipelined circuits, where all input/output paths have the same weight, fall in the class of circuits that can be peripherally retimed. Verifying the condition and computing a and b (which is equivalent to performing peripheral retiming) can be done in O(|E||V|) time [28].

Let us now consider the applicable logic transformations. Unfortunately, some combinational logic transformations may introduce input/output paths with negative weights. Malik et al. [28] showed that a legal retiming of a peripherally retimed network requires nonnegative input/output path weights; obviously, it is always possible to retime a peripherally retimed circuit whose topology has not changed, by restoring the original positions of the registers. A transformation that induces an input/output path with negative weight impedes a valid retiming, and hence a feasible implementation, and must be rejected.

Example 9.3.17. Consider the synchronous circuit of Figure 9.18(a) [28]. Figure 9.18(b) shows the circuit after peripheral retiming, and Figure 9.18(c) shows the circuit after combinational optimization, where a three-input OR gate has been replaced by a two-input OR gate. Unfortunately, this change induces an input/output path with negative weight. Hence the combinational transformation is not acceptable.

FIGURE 9.18 (a) Synchronous circuit. (b) Synchronous circuit modified by peripheral retiming. (c) Modified circuit after combinational optimization.
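The path-weight condition w(v_i, ..., v_j) = a_i + b_j can be checked directly on the matrix of input/output path weights. The sketch below assumes, for simplicity, a complete I/O connection matrix (every input reaches every output); the data and names are hypothetical:

```python
def peripheral_retimable(path_weights):
    """Check the peripheral-retiming condition on an acyclic network.
    `path_weights` maps (input, output) pairs to the set of weights
    of all paths joining them (a complete I/O connection matrix is
    assumed here).  The network is peripherally retimable iff every
    pair has a single path weight w(i, j) and the matrix decomposes
    additively as w(i, j) = a[i] + b[j]; returns (a, b) or None."""
    if any(len(ws) != 1 for ws in path_weights.values()):
        return None                      # two paths with different weights
    w = {k: next(iter(ws)) for k, ws in path_weights.items()}
    ins = sorted({i for i, _ in w})
    outs = sorted({o for _, o in w})
    i0, o0 = ins[0], outs[0]
    b = {o: w[(i0, o)] for o in outs}    # fix a[i0] = 0
    a = {i: w[(i, o0)] - b[o0] for i in ins}
    ok = all(w[(i, o)] == a[i] + b[o] for i in ins for o in outs)
    return (a, b) if ok else None

good = {("x", "u"): {2}, ("x", "v"): {3}, ("y", "u"): {1}, ("y", "v"): {2}}
bad = {("x", "u"): {2}, ("x", "v"): {3}, ("y", "u"): {1}, ("y", "v"): {5}}
```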
Peripheral retiming may be applied to acyclic networks when the above conditions are satisfied, or to portions of acyclic networks when partitioning is necessary to satisfy them by breaking some paths. Unfortunately, partitioning the network weakens the power of the method. By the same token, peripheral retiming can be applied to cyclic networks by temporarily deleting enough edges to satisfy the assumptions for peripheral retiming.

TRANSFORMATIONS FOR SYNCHRONOUS NETWORKS. We consider now logic transformations that are specific to synchronous circuits, called synchronous logic transformations. They are extensions of combinational logic transformations and can be seen as transformations across register boundaries; i.e., they blend local circuit modification and retiming. Synchronous logic transformations can have the algebraic or the Boolean flavor. In the algebraic case, local expressions can be viewed as polynomials over time-labeled variables [10].

Before describing the transformations, let us consider retiming itself as a transformation on synchronous logic networks described by time-labeled expressions. Retiming a variable by an integer r means adding r to its time label; retiming a time-labeled expression means retiming all its variables. Retiming a vertex of a synchronous logic network by r then corresponds to retiming by r the corresponding variable or, equivalently, the related expression. Combinational logic transformations can be applied by considering the time-labeled variables as new variables.

Example 9.3.18. Consider the network fragment of Figure 9.19, where c^(n) = a^(n) b^(n). Retiming vertex v_c by 1 corresponds to replacing the expression c^(n) = a^(n) b^(n) by c^(n+1) = a^(n) b^(n), or equivalently by c^(n) = a^(n-1) b^(n-1). In the shorthand notation, where x_{-k} stands for x^(n-k) and zero time shifts are omitted, c = ab is retimed to c = a_{-1} b_{-1}. This corresponds to retiming variable c by 1 or, equivalently, expression ab by 1.

FIGURE 9.19 (a) Circuit fragment. (b) Network fragment. (c) Circuit fragment after retiming. (d) Network fragment after retiming.

The shorthand notation is particularly convenient for representing synchronous transformations, which exploit the flexibility of using retimed variables and expressions to simplify the circuit.
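Retiming of time-labeled expressions, and the synchronous elimination it enables, can be sketched with a minimal cube representation: an expression is a set of cubes, each cube a frozenset of (variable, shift) literals. These data structures are illustrative choices, not structures from the text:

```python
def retime(expr, r):
    """Retime a time-labeled expression by r: add r to the shift of
    every literal."""
    return {frozenset((v, t + r) for v, t in cube) for cube in expr}

def eliminate(expr, var, defn):
    """Synchronous elimination of `var`: each literal (var, t) is
    replaced by the definition of `var` retimed by t, expanding the
    products (a simple algebraic, single-output view)."""
    out = set()
    for cube in expr:
        keep = frozenset(l for l in cube if l[0] != var)
        parts = {keep}
        for t in (t for v, t in cube if v == var):
            parts = {p | d for p in parts for d in retime(defn, t)}
        out |= parts
    return out

# c = a b : retiming vertex v_c by 1 gives c = a_{-1} b_{-1}.
c_def = {frozenset({("a", 0), ("b", 0)})}
c_retimed = retime(c_def, -1)

# x = d c_{-1} : eliminating c yields x = d a_{-1} b_{-1}.
x = {frozenset({("d", 0), ("c", -1)})}
x_new = eliminate(x, "c", c_def)
```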
Example 9.3.19. Consider the circuit of Figure 9.20(a), described by the expressions c = ab and x = d c_{-1}. Variable c is eliminated after retiming the corresponding vertex by 1: c = ab implies c_{-1} = a_{-1} b_{-1}, and replacing c_{-1} in the expression for x leads to x = d a_{-1} b_{-1}. The elimination saves one literal but adds one register.

FIGURE 9.20 (a) Fragment of synchronous network. (b) Example of synchronous elimination.

Example 9.3.20. Consider the circuit of Figure 9.21(a), described by the expressions x = a_{-1} + b and y = a_{-2} + b_{-1} + c. The local expression for x, i.e., a_{-1} + b, once retimed by -1 (yielding x_{-1} = a_{-2} + b_{-1}), is an algebraic divisor of the expression for y. The local expression for y is therefore simplified by algebraic substitution, adding variable x retimed by -1 to its support set. The substitution leads to x = a_{-1} + b and y = x_{-1} + c, while saving two literals and a register.

FIGURE 9.21 (a) Fragment of synchronous network. (b) Example of synchronous substitution.

It is important to note that synchronous transformations affect the circuit area by modifying both the number of literals and the number of registers in the network model. Similarly, they affect the network delay by changing the propagation delays associated with the local functions as well as the register boundaries. Thus the computation of the objective functions is more elaborate than in the combinational case.
The synchronous Boolean transformations require the use of logic minimizers that operate on function representations based on time-labeled variables. When the don't care conditions of interest are expressed in sum of products form, we can perform simplification by applying any two-level logic minimizer; otherwise, specific relation minimizers or optimization methods are required. Note that sum of products representations of don't care sets may involve literals at one or more time points.

Example 9.3.21. Let us consider the simplification of the function y = uv + u'v' (the EXNOR of u and v) associated with variable y. We assume that the don't care conditions include u_{-1}' u' + u_{-1} v', and thus involve literals at different time points (different from the combinational case). Note that u'v' = u'v'u_{-1}' + u'v'u_{-1}, and that both terms are included in the local don't care set. Hence the local function can be replaced by y = uv, with a savings of two literals. The local don't care conditions for this function are not straightforward and will be derived in Example 9.3.25.

9.3.3 Don't Care Conditions in Synchronous Networks

The power of Boolean methods relies on capturing the degrees of freedom for optimization by don't care conditions. Since the behavior of synchronous networks can be described in terms of traces, the don't care conditions represent the degrees of freedom in associating output sequences with input sequences. Hence the most general representation of synchronous networks and of their don't care conditions is in terms of Boolean relations. Nevertheless, some don't care conditions can be expressed by sum of products expressions over time-labeled literals and computed by extending the don't care set computation methods of combinational circuits. We describe next the computation of don't care conditions expressed explicitly in sum of products form. We shall then consider implicit specifications of synchronous networks and their don't care conditions, as well as the corresponding optimization methods.

FIGURE 9.22 Interconnected networks.

FIGURE 9.23 Optimized network.

EXPLICIT DON'T CARE CONDITIONS IN SYNCHRONOUS NETWORKS. The external don't care conditions of synchronous logic networks are related to the embedding of the network in its environment and, as in the combinational case, consist of a controllability and an observability component.

Definition 9.3.4. The input controllability don't care set includes all input sequences that are never produced by the environment at the network's inputs.

Definition 9.3.5. The output observability don't care sets denote all input sequences that represent situations in which an output is not observed by the environment at the current time or in the future.

We denote the input controllability don't care set by CDC_in and the output observability don't care sets by ODC_out, a vector whose entries are associated with the observability conditions at each output. Differently from combinational circuits, sequential networks have a dynamic behavior, for example because of the circuit initialization; hence the don't care conditions are referred to the time points of interest, e.g., CDC_in^(n) at time n, and the overall observability don't care set is the intersection of the observability don't care sets at the time points of interest.

Example 9.3.22. Consider the circuit of Figure 9.22, and let us derive the input controllability don't care conditions for network N2, whose inputs u and v are driven by N1. The limited controllability of the inputs of N2 is reflected by the set of its impossible input sequences. The structure of N1 is such that u cannot take the value 0 twice consecutively: a FALSE value of u at time n forces a TRUE value at time n + 1. Hence u^(n)' u^(n+1)' is an impossible input sequence for N2, i.e., u^(n)' u^(n+1)' is contained in CDC_in for all n >= 0. Moreover, as a consequence of the sequence that initializes the network, output v of N1 cannot assume the value 0 at time 3; this contributes a transient component to the controllability don't care conditions.
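The containment check underlying the simplification of Example 9.3.21 can be sketched by enumeration over the time-labeled support. Here u1 stands for u at time n-1, and the don't care expression is the one assumed in that example:

```python
from itertools import product

def valid_replacement(f, g, dc, n_vars):
    """A replacement g of a local function f is acceptable iff the
    error f XOR g is contained in the don't care set: f may differ
    from g only where dc holds."""
    return all(dc(*p) or f(*p) == g(*p)
               for p in product([0, 1], repeat=n_vars))

# Support: u and v at time n, plus u at time n-1 (written u1).
f = lambda u, v, u1: int(u == v)                 # y = u EXNOR v
g = lambda u, v, u1: u & v                       # candidate y = u v
dc = lambda u, v, u1: ((1 - u1) & (1 - u)) | (u1 & (1 - v))  # u1'u' + u1 v'
```

With this don't care set the error u'v' is fully covered, so the EXNOR can be replaced by an AND; with an empty don't care set the same check fails.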
Since u^(n) = a^(n) b^(n), these impossible sequences can also be expressed in terms of the primary inputs of N1.

The interconnection of the two networks also limits the observability of the primary outputs of N1. We now compute the output observability don't care conditions for N1, concentrating on the scalar component related to output u. The value of u^(n) can be observed at the outputs of N2 only at time n or at time n + 1. The observability don't care conditions of u at time n are described by a function ODC_u^(n), and those at time n + 1 by ODC_u^(n+1). The conditions under which u^(n) is never observed at the primary outputs of N2 are described by the product of the two, which contains in particular the cube v^(n-1)' v^(n)'. An intuitive explanation of this ODC condition is that two consecutive FALSE values at v suffice to keep both z and x TRUE, so that the value of u is not observed.

It is interesting to note that synchronous don't care conditions contain a time-invariant and a time-varying component. The latter may have a transient subcomponent, due to the initialization of the circuit, and a periodic subcomponent, due to some periodic behavior. In the previous example, u^(n)' u^(n+1)' for all n >= 0 is a time-invariant don't care condition, while the condition induced by the initializing sequence at time 3 is a transient one. An example of a periodic don't care component can be found in a network comprising a two-input OR gate and an oscillator feeding one of its inputs: the value at the other input of the OR gate is irrelevant at every other time point. Most approaches to Boolean simplification using synchronous don't care conditions take advantage of the time-invariant component only: the transient component may not be usable for circuit optimization, and it may be difficult to exploit the periodic component.

When considering the internals of a synchronous network, it is possible to define the corresponding satisfiability, controllability and observability don't care sets. The satisfiability don't care set can be modeled by considering, for each time point n and each internal variable x, the term x^(n) xor f_x^(n). The internal controllability and observability don't care sets can be defined by considering them as external don't care sets of a subnetwork of the synchronous logic network under consideration; thus the internal controllability don't care sets can be computed by network traversal, extending the methods presented in Section 8.4.

Example 9.3.23. Consider again the circuit of Figure 9.22. Since u^(n) = a^(n) b^(n), the condition u^(n)' u^(n+1)' can be expressed in terms of the primary inputs as (a^(n)' + b^(n)')(a^(n+1)' + b^(n+1)'), which therefore belongs to the corresponding component of the don't care set of N1.
ACYCLIC NETWORKS. When compared to combinational circuits, additional difficulties stem from the interaction of variables at different time points and from the presence of feedback connections. We shall first comment on acyclic networks and then extend our considerations to cyclic ones. It is the purpose of the following sections to give the reader a flavor of the issues involved in deriving and using don't care conditions for synchronous network optimization, without delving into all the technical details; the complete computation of the internal controllability and observability don't care conditions is complex, and we refer the interested reader to reference [6].

The acyclic structure of the network makes it possible to express its input/output behavior at any time point n by a synchronous Boolean expression f^(n) in terms of time-labeled primary input variables. Similarly, every internal variable can be associated with a local expression in terms of time-labeled primary input variables.

The notion of a perturbed network (Section 8.4.1) can also be extended to acyclic synchronous networks. Consider an arbitrary vertex v_x in V, and denote by f^x the behavior of the network when perturbed at v_x. Let p denote the maximum weight of a path from v_x to any primary output vertex. Since the output of the network can be affected by perturbations occurring at different time points, we consider sequences of perturbations delta^(n), delta^(n-1), ..., delta^(n-p), and we denote the behavior of the perturbed network explicitly as f^x(delta^(n), delta^(n-1), ..., delta^(n-p)). Note that f = f^x(0, ..., 0).

Definition 9.3.6. We call internal observability don't care conditions of variable x at time n - m, 0 <= m <= p, the function:

ODC_x^(n-m) = f^x(delta^(n), ..., delta^(n-m) = 1, ..., delta^(n-p)) exnor f^x(delta^(n), ..., delta^(n-m) = 0, ..., delta^(n-p))

Note that in the present case ODC_x^(n-m) may depend on the perturbations at the other time points, because an output at time n may depend on perturbations delta^(m) at all times m <= n.

Example 9.3.24. Let us compute the ODC sets for vertex v_y in Figure 9.22. By applying Definition 9.3.6 at times n and n + 1, the conditions ODC_y^(n) and ODC_y^(n+1) are obtained; their product describes the situations in which y^(n) is never observed at the primary output.
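Definition 9.3.6 can be illustrated by exhaustive simulation on a toy acyclic network. The network, its variable names, and the single-offset restriction below are assumptions made for illustration:

```python
from itertools import product

def odc_of_perturbation(out, n_free):
    """Observability don't cares of a single-offset perturbation, by
    exhaustive simulation: the assignments of the free time-labeled
    variables for which toggling the perturbation leaves the network
    output unchanged."""
    return {p for p in product([0, 1], repeat=n_free)
            if out(*p, 0) == out(*p, 1)}

# Toy acyclic network: u = a AND b feeds a register, and the output
# at time n is f = u^(n-1) OR c^(n).  Perturbing u at time n-1:
def f_pert(a1, b1, c0, delta):
    u1 = (a1 & b1) ^ delta        # u at time n-1, possibly flipped
    return u1 | c0                # network output at time n

odc = odc_of_perturbation(f_pert, 3)
```

Here the perturbation at time n-1 is unobservable exactly when c^(n) = 1.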
When performing the simplification of a local function in a synchronous logic network, the effect of replacing f_x by a new function g_x on the network behavior can be expressed by means of perturbations: the behavior of the network after the replacement is described by f^x(delta^(n), ..., delta^(n-p)), where delta^(i) = f_x^(i) xor g_x^(i), i = n - p, ..., n. The error introduced at the outputs at time n is then:

epsilon^(n) = f^x(delta^(n), ..., delta^(n-p)) xor f^x(0, ..., 0)

A sufficient condition for equivalence, i.e., for the validity of the replacement, is that the error is contained in the external don't care set:

epsilon^(n) contained in DC_ext^(n)   for all n >= 0    (9.8)

The problem of verifying the feasibility of a replacement bears similarity to multiple-vertex optimization in the combinational case: instead of multiple perturbations at different vertices at the same time point, we now have multiple perturbations at the same vertex at different time points. This should not be surprising. In general, since epsilon^(n) depends on all of delta^(n), ..., delta^(n-p), it is not possible to transform Equation 9.8 into independent unilateral bounds on the individual perturbations. Nevertheless, we can always determine a set DC_x, representing the tolerance on the error introduced by replacing f_x with g_x as measured at the outputs, such that a constraint of the type

(f_x xor g_x) contained in DC_x    (9.9)

is sufficient to guarantee the validity of Equation 9.8.

Unilateral bounds can be derived exactly when all paths from v_x to the primary outputs have the same weight, because then epsilon^(n) depends on the single perturbation at one time offset. A network models a pipeline when this property holds for each vertex; combinational circuits are the special case of zero path weight. It has been shown formally [6] that in this case Equation 9.8 holds if and only if a constraint of the form of Equation 9.9 holds, where DC_x is a complete local don't care set, as in single-vertex combinational logic optimization: these don't care conditions fully describe the degrees of freedom for the optimization of such networks. The computation of don't care sets in pipelined networks is a straightforward extension of the combinational case, and Boolean optimization of pipelined networks is conceptually no more difficult than Boolean optimization of combinational networks.
In the general case, when considering arbitrary acyclic networks, a vertex may have multiple paths with different weights to a primary output, so that the function e(n) has multiple dependencies upon the perturbations δ(n), ..., δ(n-p), and the don't care conditions associated with Equation 9.8 are correspondingly more complex. Indeed, we search for subsets of the observability don't care sets that are independent of the multiple perturbations. A possible approach is to derive bounds on each perturbation, make each bound independent of the other perturbations and then consider their intersection. Techniques similar to those used for computing CODC sets in combinational circuits (see Section 8.2) can be applied to the present situation. We refer the interested reader to reference [6] for the details of the method. We report here an example.

Example. Consider again the circuit of Figure 9.22 and the optimization of vertex v_y. The time-invariant controllability don't care set contains the cube u'(n)u'(n-1), which is independent of variable y. The conjunction of the observability don't care sets at times n and n+1 contains the cube u(n)u(n-1). Note that DC_y can thus be expressed as u'(n)u'(n-1) + u(n)u(n-1); the first term is included in the CDC component and the second in the ODC component of DC_y. Replacing the EXNOR gate by an AND gate is equivalent to the following perturbation:

    δ(n) = (u(n)u(n-1) + u'(n)u'(n-1)) ⊕ (u(n)u(n-1)) = u'(n)u'(n-1) ⊆ DC_y   ∀ n ≥ 0

Hence the replacement is possible. The optimized network is shown in Figure 9.23.

CYCLIC NETWORKS. The computation of internal controllability and observability don't care conditions is more complex for cyclic networks, because additional controllability and observability conditions are induced by the feedback connection and are inherent to the cyclic nature of the network. The simplest approach to dealing with cyclic networks is to consider the subset of internal don't care conditions induced by any acyclic network obtained by removing a set of feedback edges. The principle is to consider at first the feedback connection fully controllable (or observable). This is equivalent to assuming a corresponding empty CDC (or ODC) set; in practice, this simplification is equivalent to resorting to combinational don't care set computation techniques. Note that the acyclic network may be combinational or not.

Iterative methods can be used for computing the don't care conditions related to the feedback connection. Some feedback sequences may never be asserted by the network and may therefore be considered as external controllability don't care conditions for the acyclic component. Similarly, some values of the feedback input may never be observed at the primary outputs, thereby resulting in external observability don't care conditions of the feedback outputs of the acyclic component. The impossible feedback sequences (or unobservable feedback sequences) are computed for the acyclic network, possibly using the external don't care conditions. If the corresponding CDC (or ODC) set is different than before, the process can be iterated until the set stabilizes [6]. A particular form of iterative method is used for state extraction, described later in Section 9.4.
FIGURE 9.24 Network of Figure 9.21 redrawn to highlight the feedback connection.

Example. Consider again the network of Figure 9.21, redrawn in Figure 9.24 to highlight the feedback connection. We denote the dangling feedback connection as t̂. For the sake of simplicity, assume that the external don't care set is empty, but that at least one register has an initial condition equal to 1, i.e., that the initialization sequence (b(1), b(2)) = (1, 1) is applied to the network. Consider now the acyclic component of the network obtained by removing the feedback connection. It follows that u(1) = 1 and t(2) = t'(1). Thus, only (0, 1) and (1, 0) are possible sequences for two consecutive values of t: a sequence in which t repeats the same value at consecutive time points, i.e., t(n-1)t(n) + t'(n-1)t'(n) ∀ n ≥ 2, cannot be produced by the acyclic component at the feedback output, and hence t̂ can never carry such a sequence at the feedback input. These impossible sequences are part of the output controllability don't care set of the acyclic network. Note that similar considerations apply when we neglect the initialization sequence.

IMPLICIT DON'T CARE CONDITIONS IN SYNCHRONOUS NETWORKS.* We stated previously that traces model best the behavior of sequential circuits. Unfortunately, don't care sets expressed as sums of products of time-labeled literals are functional (and not relational) representations, and they may fall short of representing all degrees of freedom for optimization. It may be necessary to resort to a relational model if we want to be able to consider all possible degrees of freedom for optimization. Therefore the most general representation of synchronous networks and their corresponding don't care conditions is in terms of relations that associate the possible input sequences with the corresponding possible output sequences.

Consider the optimization of a synchronous network confined to Boolean simplification of the local function associated with a single-vertex or a single-output subnetwork. Whereas the inclusion of the perturbation in a local don't care set is a sufficient condition for equivalence, it is by no means necessary in the case of synchronous networks. Therefore we must search for more general conditions for stating feasible transformations. A few issues are important in selecting a replacement for a local function. First, feasibility must be ensured: a feasible replacement must yield indistinguishable behavior at all time points (possibly excluding the external don't care conditions of the network). Second, we may restrict our attention to replacements that do not introduce further cycles in the network. Third, the substitution of the local function must satisfy some optimality criteria.

Example. Consider the circuit fragment of Figure 9.25 (a) and assume that the external don't care set is empty. First note that the circuit fragment has two input/output paths of unequal weight, ruling out the use of (peripheral) retiming techniques. Second, note that removing the inverter yielding variable x is equivalent to a perturbation δ(n) = a'(n) ⊕ a(n) = 1 ∀ n, which is not contained in the (empty) don't care set. This could lead us to the erroneous conclusion that the inverter cannot be removed. Nevertheless, it can be easily verified that the inverter can be replaced by a direct connection, leading to the simpler and equivalent circuit of Figure 9.25 (b). (It can be derived by noticing that the parity function is invariant to complementation of an even number of inputs.)

When considering single-vertex optimization, say at v_x, the most general method for verifying the feasibility of the replacement of a local function f_x by another one g_x is to equate the terminal behavior of the network in both cases.

Example. Consider again the circuit fragment of Figure 9.25 (a). The input/output behavior of the network is:

    z(n) = a'(n) ⊕ a'(n-1)   ∀ n > 0

Consider the subnetwork N1, shown inside a box in the figure, implementing x(n) = a'(n). Any implementation of N1 is valid as long as the following holds:

    x(n) ⊕ x(n-1) = a'(n) ⊕ a'(n-1)   ∀ n > 0

The above equation represents the constraints on the replacement for subnetwork N1. Possible solutions are the following:

    x(n) = a'(n) ∀ n ≥ 0. This corresponds to the original network.
    x(n) = a(n) ∀ n ≥ 0. This corresponds to removing the inverter.
    x(n) = x(n-1) ⊕ a(n) ⊕ a(n-1) ∀ n > 0. This solution can be derived by adding the term x(n-1) to both sides of the constraint equation after having complemented a(n) and a(n-1). The corresponding circuit is shown in Figure 9.25 (c).

Note that the last implementation of the network introduces a feedback connection.

FIGURE 9.25 (a) Circuit fragment. (b) Optimized circuit fragment. (c) Other implementation of circuit fragment.

The first step is equating the terminal behavior of the original network and of the network embedding the local replacement. This yields an implicit synchronous recurrence equation that relates the network variables at different time points. A tabulation of the possible values of the variables leads to the specification of a relation table describing the possible input and output traces of the subnetwork that satisfy the equation. The synchronous recurrence equation, and equivalently the relation table, specifies implicitly the subnetwork and all degrees of freedom for its optimization. Hence don't care conditions are represented implicitly. Note that the relation table expresses the possible traces for x and thus the values of the same variable at different time points. This differs from the specification of a combinational Boolean relation, which specifies the values of different variables at the same time. Therefore, combinational Boolean relation minimizers (as described in Section 7.6) cannot be applied tout court to the solution of this problem. A specific synchronous relation minimizer [33] has been developed to solve this problem. Alternatively, the solution can be sought by representing the desired replacement by a truth table in terms of unknown coefficients [7]. An implementation compatible with the relation table is a specification of x(n) as a function of the network inputs at different time points: constraints on the feasible values of the coefficients can be inferred from the relation table, and truth tables satisfying the constraints correspond to feasible network replacements. Among these, an optimal solution may be chosen. A method for computing a minimum sum of products implementation of an acyclic replacement subnetwork is described in reference [7]. We present here the highlights of the method by elaborating on an example.

Example. Consider again the circuit fragment of Figure 9.25 (a). The corresponding synchronous recurrence equation is:

    x(n) ⊕ x(n-1) = a(n) ⊕ a(n-1)   ∀ n > 0

which can be tabulated as follows:

    a(n-1) a(n) | x(n-1) x(n)
      0     0   |  00 or 11
      0     1   |  01 or 10
      1     0   |  01 or 10
      1     1   |  00 or 11

The right part of the table shows the possible traces for x in conjunction with the input traces.
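The tabulation step can be sketched in a few lines. The following sketch assumes the recurrence reconstructed above, x(n) ⊕ x(n-1) = a(n) ⊕ a(n-1), and enumerates the compatible output traces for each input trace:

```python
# Build the relation table of subnetwork N1: for each input trace
# (a(n-1), a(n)), list the output traces (x(n-1), x(n)) satisfying
#   x(n) XOR x(n-1) == a(n) XOR a(n-1).
from itertools import product

table = {}
for a_prev, a in product((0, 1), repeat=2):            # input trace
    traces = [(x_prev, x) for x_prev, x in product((0, 1), repeat=2)
              if (x ^ x_prev) == (a ^ a_prev)]         # recurrence constraint
    table[(a_prev, a)] = traces
    print((a_prev, a), "->", traces)

# Constant input traces force equal x values; changing ones force a toggle.
assert table[(0, 0)] == [(0, 0), (1, 1)]
assert table[(0, 1)] == [(0, 1), (1, 0)]
```

Each row of the printed table matches a row of the relation table above; the don't care conditions are implicit in the fact that several output traces are allowed per input trace.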
Example. Assume that subnetwork N1 of Figure 9.25 (a) is replaced by a function specified by the following truth table in terms of unknown coefficients:

    a(n) a(n-1) | x(n)
      0     0   |  f0
      0     1   |  f1
      1     0   |  f2
      1     1   |  f3

We can now re-express the constraints of the relation table in terms of the coefficients. Consider the traces x(n-1), x(n) induced by each input trace a(n-2), a(n-1), a(n). For input trace 000, x(n-1) = x(n) = f0 and the relation table requires x(n) = x(n-1); the resulting constraint, f0 = f0, is tautological. For input trace 001, the relation table requires x(n) ≠ x(n-1), i.e., f2 ⊕ f0 = 1, or equivalently (f0 + f2)(f0' + f2') = 1. Similar considerations apply to the remaining input traces (010, 011, 100, 101, 110, 111). The resulting constraints on the coefficients, excluding the tautological ones and duplications, are:

    f0 ⊕ f2 = 1;   f1 ⊕ f2 = 1;   f2 ⊕ f3 = 0;   f0 ⊕ f1 = 0

Solutions are f0 = 1, f1 = 1, f2 = 0, f3 = 0 and f0 = 0, f1 = 0, f2 = 1, f3 = 1. The first solution corresponds to selecting x(n) = a'(n) and the second to selecting x(n) = a(n). The second situation is preferable, because it does not require an inverter. Note that this method does not consider solutions with additional feedbacks, such as the implementation of Figure 9.25 (c).

It is interesting to note that approaches to solving exactly the simplification problem for synchronous networks involve the solution of binate covering problems. This fact is rooted in the need for considering the values of the variable at multiple time points, and therefore for modeling the problem by relations rather than by functions, even for single-vertex optimization.

9.4 IMPLICIT FINITE-STATE MACHINE TRAVERSAL METHODS

Traversing a finite-state machine means executing symbolically its transitions. If a state transition diagram representation is available, an explicit traversal means following all directed paths whose tail is the reset state, thus detecting all reachable states. If the finite-state machine is described by a synchronous logic network, a traversal means determining all possible value assignments to the state variables that can be achieved, starting from the assignment corresponding to the reset state. States that are not reachable can be eliminated and considered as don't care conditions.

Finite-state machine traversal is widely applicable to verification of the equivalence of two representations of sequential circuits [4, 5, 34]. We limit our comments here to those applications strictly related to synthesis and to the topics described in this chapter. Namely, we consider state extraction and we revisit state minimization.

The potential size of the state space is large, because it grows exponentially with the number of registers: if the network has n registers, there are at most 2^n possible states. In general only a small fraction are valid states, i.e., reachable from the reset state under some input sequence. Implicit methods are capable of handling circuit models with large state sets (e.g., 10^60), because reachable and unreachable state sets are represented implicitly by functions over the state variables. Consider, for example, a circuit with 64 state variables. The condition x1 + x2 denotes all states with one of the first two state variables assigned to 1: there are 3 * 2^62, or about 10^19, states that can be represented implicitly by this compact expression.

9.4.1 State Extraction

The extraction of a state transition diagram from a synchronous logic network requires us to identify the state set first. We first consider an intuitive method for reachability analysis and then comment on some refinements. Let the network under consideration have n_i inputs and n state variables.

Example. Consider the synchronous network of Figure 9.26 (a), with one input (x), one output (z) and two state variables (p, q). Let the reset state correspond to the state assignment p = 0; q = 0. There are at most four states, corresponding to the different polarity assignments of p and q. We question if all four states are reachable from the reset state. Next we shall attempt to construct a consistent state transition diagram.

FIGURE 9.26 (a) Synchronous network. (b) Extracted state transition diagram.
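The coefficient-based search described above can be sketched by brute force. The following sketch (following reference [7] in spirit, with the recurrence reconstructed earlier) enumerates all 16 truth tables x(n) = f(a(n), a(n-1)) and keeps those satisfying the relation table for every input trace:

```python
# Enumerate candidate truth tables (f0, f1, f2, f3) for x(n) = f(a(n), a(n-1))
# and keep those satisfying x(n) XOR x(n-1) == a(n) XOR a(n-1) on all traces.
from itertools import product

solutions = []
for f in product((0, 1), repeat=4):          # rows indexed by (a(n), a(n-1))
    def g(a, a_prev, f=f):
        return f[2 * a + a_prev]
    ok = all((g(a2, a1) ^ g(a1, a0)) == (a2 ^ a1)
             for a0, a1, a2 in product((0, 1), repeat=3))
    if ok:
        solutions.append(f)

print(solutions)   # [(0, 0, 1, 1), (1, 1, 0, 0)]: x(n) = a(n) and x(n) = a'(n)
```

Exactly the two feasible (feedback-free) replacements survive, matching the constraint analysis above; the feedback solution x(n) = x(n-1) ⊕ a(n) ⊕ a(n-1) is outside this search space by construction.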
Let f : B^(ni+n) -> B^n define the state transition function (denoted elsewhere as δ), which defines the next-state variables in terms of the present-state variables and primary inputs. Let r0 represent the reset state in terms of the state variables. The states directly reachable from r0 are the image of r0 under f and can be expressed implicitly as a function of the state variables. Similarly, the states directly reachable from any state set represented implicitly by r_k are the image of r_k under f. By defining r_{k+1}, k ≥ 0, to be the union of r_k and the image of r_k under f, an iterative method for implicit reachability computation is specified. The iteration terminates when a fixed point is reached, i.e., when r_{k+1} = r_k for some value of k = k*. It terminates in a finite number of steps, because the functions r_k denote monotonically increasing state sets and the number of states is finite. The expression for r_{k*} encapsulates all reachable states, whereas its complement denotes unreachable states and represents don't care conditions that can be used to optimize the circuit, as in the case of controllability don't care computation for combinational circuits. The image computation can be performed efficiently using ROBDDs [4, 34].

Example. Consider again the synchronous network of Figure 9.26 (a), whose reset state is represented by (p = 0; q = 0). The state transition function is f = [f1 f2]^T, where f1 = x'p'q' + pq and f2 = xp' + pq'. Thus r0 = p'q'. The image of p'q' under f can be derived intuitively by considering that when (p = 0; q = 0) the function f reduces to [x' x]^T, whose range is represented by vectors [0 1]^T and [1 0]^T. (The formal computation, using the state transition relation, is shown later.) Thus the states directly reachable from r0 are encoded as (p = 1; q = 0) and (p = 0; q = 1). Therefore r1 = p'q' + pq' + p'q = p' + q'. The image of r1 under f is represented by vectors [0 0]^T, [0 1]^T and [1 0]^T, and hence r2 = p' + q'. Since r2 = r1, the iteration has converged. Thus p' + q' represents implicitly all reachable states, namely (p = 0; q = 0), (p = 1; q = 0) and (p = 0; q = 1). The state corresponding to pq is unreachable, and thus pq can be considered as a don't care condition.

Once the state set has been identified, the state transition diagram can be constructed by determining the transitions which correspond to the edges of the graph. The transition edges and the qualifying inputs on the edges can be derived by computing the inverse image of the head state representation under f. Each state can be associated with a minterm of p' + q': namely, s0 corresponds to p'q', s1 to pq' and s2 to p'q.

Example. Transitions into s0, corresponding to p'q', are identified by those patterns that make f = [0 0]^T, i.e., those that satisfy (f1)'(f2)' = (x'p'q' + pq)'(xp' + pq')' = x'p'q. Hence there is a transition into state s0 from state s2 (encoded as p'q) under input x'. The corresponding primary output can be derived by evaluating the network in a straightforward way: input x = 0 and state p'q yield z = 0. All other transitions can be extracted in a similar fashion. The extracted state transition diagram is shown in Figure 9.26 (b).
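The reachability iteration of the example can be sketched explicitly. The following sketch uses the state transition function f1 = x'p'q' + pq, f2 = xp' + pq' of the network of Figure 9.26 (a); an implicit method would represent each set r_k as an ROBDD instead of enumerating its states.

```python
# Fixed-point reachability iteration for the network of Figure 9.26 (a).
def f(x, p, q):
    f1 = ((1 - x) & (1 - p) & (1 - q)) | (p & q)   # f1 = x'p'q' + pq
    f2 = (x & (1 - p)) | (p & (1 - q))             # f2 = xp' + pq'
    return (f1, f2)

r = {(0, 0)}                       # r0: the reset state p'q'
while True:
    image = {f(x, p, q) for x in (0, 1) for (p, q) in r}
    r_next = r | image             # r_{k+1} = r_k union image(r_k)
    if r_next == r:                # fixed point reached
        break
    r = r_next

print(sorted(r))                   # [(0, 0), (0, 1), (1, 0)]: pq is unreachable
```

The loop converges in two iterations, and the unreachable state (p = 1; q = 1) is exactly the don't care condition pq derived in the text.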
Some variations and improvements on the aforementioned scheme have been proposed. The inverse image of a state set under the state transition function is also of interest, because it allows us to derive all states that have transitions into a given state subset. Reachability analysis can be performed using the state transition relation, linking the possible triples (i, x, y) of inputs, present states and next states. The state transition relation is represented by the characteristic equation:

    χ(i, x, y) = Π_j (y_j ⊕ f_j(i, x))' = 1

which is the characteristic equation of the state transition function and can be efficiently represented as an ROBDD. Let r_k(x) be a function over the state variables x denoting implicitly a state set. Then the image of r_k under the state transition function can be computed as S_{i,x}(χ(i, x, y) · r_k(x)), where S denotes the smoothing (existential quantification) operator [26, 34]. Similarly, let r_k(y) be a function over the next-state variables y denoting implicitly a state set. Then the inverse image of r_k under the state transition function can be computed as S_{i,y}(χ(i, x, y) · r_k(y)). With this formalism it has been possible to develop algorithms for finite-state machine traversal in both directions, applicable to sequential circuits with potentially large state sets (e.g., 10^60).

Example. Consider again the synchronous network of Figure 9.26 (a) and let p̂, q̂ denote the next-state variables. Then:

    χ(x, p, q, p̂, q̂) = (p̂ ⊕ (x'p'q' + pq))' · (q̂ ⊕ (xp' + pq'))' = 1

is the characteristic equation of the state transition function. The states reachable from r0 = p'q' are those denoted by:

    S_{x,p,q}(χ(x, p, q, p̂, q̂) · p'q') = p̂'q̂ + p̂q̂'

corresponding to s1 and s2. Similarly, the states reachable from r1 = p' + q' are those denoted by:

    S_{x,p,q}(χ(x, p, q, p̂, q̂) · (p' + q')) = p̂'q̂' + p̂'q̂ + p̂q̂' = p̂' + q̂'

representing states s0, s1 and s2. The inverse image can be computed with the same ease. For example, the inverse image of p'q' is:

    S_{x,p̂,q̂}(χ(x, p, q, p̂, q̂) · p̂'q̂') = p'q

Therefore state s0 can be reached from state s2, encoded as p'q.

Example. Consider again the control unit of the complete differential equation integrator, modeled as a synchronous logic network. This description uses a 1-hot encoding and it entails 46 literals and 6 registers. Finite-state machine traversal is performed to extract a valid set of states. There are six states, and the corresponding state transition diagram, extracted from the network, is shown in Figure 9.27. (The output signals are omitted in the figure.)

FIGURE 9.27 State transition diagram for the control unit of the complete differential equation integrator.
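Image and inverse-image computation through the transition relation can be sketched as follows. The relation χ of the example is represented here as an explicit set of (input, present state, next state) triples rather than as an ROBDD; smoothing a variable corresponds to dropping the corresponding tuple component.

```python
# Image and inverse image via the transition relation of Figure 9.26 (a).
from itertools import product

def f(x, p, q):
    f1 = ((1 - x) & (1 - p) & (1 - q)) | (p & q)   # f1 = x'p'q' + pq
    f2 = (x & (1 - p)) | (p & (1 - q))             # f2 = xp' + pq'
    return (f1, f2)

# chi holds exactly the triples (input, present state, next state).
chi = {(x, (p, q), f(x, p, q)) for x, p, q in product((0, 1), repeat=3)}

def image(states):
    """One-step successors: smooth away input and present-state variables."""
    return {y for (_, s, y) in chi if s in states}

def inverse_image(states):
    """States with a transition into the given set: smooth input, next state."""
    return {s for (_, s, y) in chi if y in states}

print(sorted(image({(0, 0)})))          # [(0, 1), (1, 0)]
print(sorted(inverse_image({(0, 0)})))  # [(0, 1)]: s0 is entered only from s2
```

Both results match the smoothing computations in the example: the image of p'q' is p̂'q̂ + p̂q̂', and the inverse image of p'q' is p'q.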
9.4.2 Implicit State Minimization*

State traversal methods can be used for finite-state machine state minimization using an implicit model, which is appealing because it can handle much larger state sets than explicit methods do. We consider here state minimization of completely specified finite-state machines, in particular the derivation of the equivalence classes as suggested by Lin et al. [26]. Classes of equivalent states are determined by considering pairwise state equivalence and by exploiting the transitivity of the equivalence relation.

Given a sequential logic network representation of a finite-state machine, we duplicate the network to obtain a product machine representation, as sketched in Figure 9.28. Each state of the product machine corresponds to a pair of states in the original finite-state machine, and the states of the product machine are identified by the state variables. All states of the product machine whose output is TRUE for all inputs denote state pairs in the original finite-state machine whose outputs match for all inputs; we call these state pairs candidate equivalent pairs. This is a necessary, but not sufficient, condition for equivalence. Candidate state pairs are represented implicitly by a characteristic function.

Since state pairs correspond to single states of the product machine, the inverse image of any state of the product machine under the transition function yields the state pairs that are mapped into the state under consideration. Equivalently, it is possible to check if a state pair has next states that are candidate equivalent pairs for all inputs. This can be performed by multiplying the characteristic function by the inverse image of the candidate set under all input conditions. An iterative procedure can thus be defined that updates the characteristic function by its product with the inverse image: at each step, the state pair information stored in the characteristic function is refined by requiring that such pairs have next states that are also represented by the characteristic function. It is possible to show [26] that this iteration converges in a finite number of steps to a characteristic function representing implicitly all equivalent pairs.

Example. Consider again the control unit of the complete differential equation integrator, where state s0 is the reset state. No state can be eliminated by state minimization. Such a network entails 93 literals and 6 registers. The encoding produced by program NOVA, called by program SIS for state encoding, requires 3 bits. After logic optimization, the network can be reduced to 48 literals and 3 registers.
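The pairwise-equivalence fixpoint behind the implicit method can be illustrated on a small hypothetical machine. In the method of Lin et al. [26] the candidate-pair set is stored as an ROBDD characteristic function over the product-machine state variables; in this sketch the set of pairs is simply kept explicitly, and the machine (states, transitions, outputs) is invented for illustration.

```python
# Iterative refinement of candidate equivalent state pairs (explicit sketch
# of the implicit fixpoint; the FSM below is hypothetical).
from itertools import product

states = ['s0', 's1', 's2', 's3']
inputs = (0, 1)
delta = {('s0', 0): 's1', ('s0', 1): 's2',
         ('s1', 0): 's0', ('s1', 1): 's2',
         ('s2', 0): 's3', ('s2', 1): 's0',
         ('s3', 0): 's0', ('s3', 1): 's2'}
out = {('s0', 0): 0, ('s0', 1): 0,
       ('s1', 0): 1, ('s1', 1): 0,
       ('s2', 0): 0, ('s2', 1): 1,
       ('s3', 0): 1, ('s3', 1): 0}

# Candidate equivalent pairs: outputs match for all inputs (necessary only).
pairs = {(s, t) for s, t in product(states, repeat=2)
         if all(out[s, x] == out[t, x] for x in inputs)}

# Refinement: keep a pair only if all its next-state pairs are still candidates.
while True:
    refined = {(s, t) for (s, t) in pairs
               if all((delta[s, x], delta[t, x]) in pairs for x in inputs)}
    if refined == pairs:
        break
    pairs = refined

print(sorted(p for p in pairs if p[0] < p[1]))   # [('s1', 's3')]
```

For this toy machine the fixpoint leaves one nontrivial equivalent pair, (s1, s3), so the machine can be reduced to three states.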
It is interesting to compare the implicit method with the classical state minimization methods described earlier. Classical methods perform iterative refinements of state partitions. They require explicit representations of the states, and thus they are limited by the state set cardinality. The implicit method performs an iterative refinement of a set of state pairs that are described implicitly in terms of the state variables of the product machine (which are just twice as many as those of the original machine). Since implicit methods leverage efficient set manipulation based on ROBDD representations and operations, they can handle problems of much larger size than classical methods do. We refer the interested reader to reference [26] for details.

FIGURE 9.28 (a) Original finite-state machine. (b) Product finite-state machine.

9.5 TESTABILITY CONSIDERATIONS FOR SYNCHRONOUS CIRCUITS

The synthesis of testable sequential circuits is a broad area of research. We limit ourselves to a few comments, and we refer the reader to specialized books on the subject [17, 29]. To be precise, we use single stuck-at fault models and we consider fully testable those sequential circuits where all faults can be detected.

Different testing strategies are used for synchronous sequential circuits. Scan techniques were introduced and divulged by IBM Corporation and are used today with different flavors. In this design and test methodology, registers can be configured to be linked in a chain, so that data can be introduced and/or observed directly while testing the circuit. Thus, scan methods provide full controllability and observability of the registers. As a result, the sequential testing problem reduces to that of the corresponding combinational component. The overhead of using scan techniques varies according to the circuit technology and clocking policy. In some cases, a considerable area penalty is associated with the replacement of regular registers with those supporting scan. To alleviate this problem, partial scan techniques have been proposed, where only a fraction of the registers are in the scan chain. The controllability and observability of the remaining registers is either achieved by exploiting features of the combinational logic component or compromised partially.

Testing sequential circuits without scan registers may require long sequences of test vectors. This is caused by the lack of direct controllability and observability of the state registers. Recent approaches to design for testability of sequential logic circuits have explored the possibility of using the degrees of freedom in logic synthesis to make the circuit fully testable with relatively short sequences of test vectors. The degrees of freedom are, for example, don't care conditions and/or the choice of state encoding. There is a wide spectrum of methods for designing fully testable sequential circuits not based on scan. On one side of the spectrum, there are techniques that constrain the implementation of the sequential circuit. On the other side, we find synthesis techniques that eliminate untestable faults by using optimization methods with appropriate don't care sets. The first family of methods requires simpler synthesis tasks but possibly an overhead in terms of area; the second group requires computationally expensive (or prohibitive) synthesis techniques but no implementation overhead [14]. Both approaches have advantages and disadvantages. The advantages of these methods are that both eliminate the area overhead due to the scan registers and that they provide support for faster testing by avoiding loading and unloading the scan chain. The starting points for these methods are either state transition diagram [12] or network [17] representations of sequential circuits. We shall consider two representative examples of these approaches.

We explain first a method for achieving full testability using an appropriate state encoding and requiring a specific implementation style based on partitioning (and possibly duplicating) a portion of the combinational component [12]. This technique is applicable to sequential circuits modeled as both Moore-style and Mealy-style finite-state machines. We comment on the first case only, for the sake of simplicity. Consider the Moore-style finite-state machine shown in Figure 9.29 (a). Assume first that the circuit is partitioned into registers, a combinational next-state logic block (feeding the registers) and a combinational output-logic block.
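The benefit of scan can be made concrete with a small sketch. The register model below is illustrative (names and interface are not from the text): in test mode the registers form a shift chain, so any state can be loaded bit by bit (controllability) and the current state shifted out (observability).

```python
# Minimal model of a scan-enabled register bank (hypothetical interface).
class ScanRegisterBank:
    def __init__(self, n):
        self.state = [0] * n

    def clock(self, next_state=None, scan_in=None):
        if scan_in is None:                  # normal mode: load next-state logic
            self.state = list(next_state)
            return None
        scan_out = self.state[-1]            # scan mode: shift by one position
        self.state = [scan_in] + self.state[:-1]
        return scan_out

bank = ScanRegisterBank(3)
for bit in (1, 0, 1):                        # shift in the test state bit by bit
    bank.clock(scan_in=bit)
print(bank.state)                            # [1, 0, 1]
```

Once the desired state is loaded, one normal-mode clock applies a combinational test, and the resulting next state can be shifted out and compared against the expected response, reducing sequential testing to combinational testing.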
we must find first a state and an input vector which propagate the effects of the fault to the state registers. Both approaches have advantages and disadvantages. We must then be able to distinguish the faulty state from the nonfaulty state by observing the primary outputs. this problem is complex unless the state variables are . The controllability and observability of the remaining registers is either achieved by exploiting features of the combinational logic component or compromised partially. The circuit has to be driven first to a state that excites the fault. The starting points for these methods are either state transition diagram [I 21 or network [I71 representations of sequential circuits. Recent approaches to design for testability of sequential logic circuits have explored the possibility of using the degrees of freedom in logic synthesis to make the circuit fully testable with relatively short sequences of test vectors. We shall consider two representative examples of these approaches. In some cases. We explain first a method for achieving full testability using an appropriate state encoding and requiring a specific implementation style based on partitioning (and possibly duplicating) a portion of the combinational component [12]. As a consequence. a considerable area penalty is associated with the replacement of regular registers with those supporting scan. The degrees of freedom are. Consider the Moorestyle finitestate machine shown in Figure 9. don't care conditions and/or the choice of state encoding. In general. Testing sequential circuits without scan registers may require long sequences of test vectors. partial scan techniques have been proposed where only a fraction of the registers are in the scan chain.
For this purpose. (See Section 8. any single fault in the nextstate logic will affect one state variable and its effect will be observable at the primary outputs.5. Untestable faults can be divided into two categories: combinational and sequential faults. This task can be done by reachability analysis. They can be detected by using combinational logic optimization methods and removed by making the combinational component prime and irredundant. all possible 2" states must be made reachable from the reset state.) Sequential untestable faults are faults in the combinational component of the circuit that cannot be tested because of the sequential . (b) Moorestyle finitestate machine with partitioned nextstate logic. the testing method requires just consmcting sequences of input test vectors that set the machine in the desired state.Split by PDF Splitter NEXTSTATE >g R SEQUENTIAL LOOK OPI1MIU11ON 497 (a) (b) FIGURE 9. Then the nextstate logic block is further partitioned into independent combinational circuits each feeding a state register. As a result. The combinational circuits are made prime and irredundant using the techniques of Section 8. as shown in Figure 9. Afundamental problem in sequential synthesis for testability is relating faults to the sequential behavior of the circuit. we need to characterize faults in sequential circuits. even if the registers were fully observable and controllable. Assuming that n registers are used. Moreover. Combinational untestable faults are those that are untestable in the combinational component of the circuit. A fully testable implementation of a Moorestyle finitestate machine can be achieved as follows. The states are encoded so that two state codes have distance 2 (differ in 2 bits) when they assert the same output. Thus.29 (a) Original Moorestyle finitestate machine. the fault is not detectable. possibly by adding transitions to the state transition diagram. 
We consider next a synthesis method to achieve testable sequential circuits that relies on optimization techniques. directly observable at the outputs.29 (b). if the faulty state is equivalent to the nonfaulty state.5.
Designing a fully testable sequential circuit involves checking for and removing untestable sequential faults as well as removing combinational untestable faults. Sequential untestable faults can be grouped in the following three classes:

Equivalent-sequential untestable faults cause the interchange or creation of equivalent states.

Invalid-sequential untestable faults cause transitions from invalid states corresponding to unused state codes. An example of an invalid-sequential fault is a fault that affects the circuit only when in an invalid state.

Isomorphic-sequential untestable faults cause the circuit to behave as modeled by a different, but isomorphic, state transition diagram, i.e., with a different state encoding.

Devadas et al. [13, 17] showed that these are all possible types of untestable sequential faults.

Example 9.5.1. This example of untestable faults is due to Devadas et al. [13]. Consider the state transition diagram of Figure 9.30 (a), whose network implementation is shown in Figure 9.30 (b). The circuit has one primary input (i), one primary output (o) and five states encoded by variables p1, p2, p3. Consider a stuck-at-0 fault on input w1. The fault changes the state transition diagram to that of Figure 9.30 (c). Since the states encoded as 010 and 110 are equivalent, the fault causes an interchange of equivalent states and so it is an equivalent-sequential untestable fault. Consider next a stuck-at-1 fault on input w2. The fault changes the state transition diagram to that of Figure 9.30 (d). The fault creates an extra state, encoded as 111, that was originally invalid (unreachable) and that is equivalent to state 110. This is again an example of an equivalent-sequential untestable fault. An example of an isomorphic-sequential untestable fault is shown in Figure 9.30 (e), where the faulty machine represents an equivalent machine with a different encoding. Namely, the states with codes 000 and 001 have been swapped. The corrupted transition is shown by a dotted edge in the figure.

Devadas et al. [13] proposed a method that uses three major steps: state minimization, state encoding and combinational logic optimization with an appropriate don't care set that captures the interplay of equivalent and invalid states. A special state encoding is used that guarantees that no single stuck-at fault can induce isomorphic-sequential untestable faults. Don't care set extraction and combinational logic optimization are iterated, and the overall method is guaranteed to converge to a fully testable sequential circuit. We refer the reader to references [13] and [17] for the details.

9.6 PERSPECTIVES

Synthesis and optimization of sequential synchronous circuits has been a research playground for a few decades. There is a wealth of techniques for optimizing state-based representations of finite-state machines, including algorithms for decomposition, state minimization and state encoding.
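The classification above rests on machine equivalence: a sequential fault is untestable precisely when the faulty machine remains input/output-equivalent to the fault-free one. A minimal sketch of such an equivalence check by product-machine traversal, on invented toy Moore machines (the second machine plays the role of a faulty copy with swapped state codes, in the spirit of Figure 9.30 (e)):

```python
from collections import deque

def equivalent(ns1, out1, ns2, out2, inputs, r1, r2):
    """Traverse the product machine breadth-first: two Moore machines
    are equivalent iff the outputs agree in every reachable state pair."""
    seen = {(r1, r2)}
    frontier = deque([(r1, r2)])
    while frontier:
        s1, s2 = frontier.popleft()
        if out1(s1) != out2(s2):
            return False
        for x in inputs:
            pair = (ns1(s1, x), ns2(s2, x))
            if pair not in seen:
                seen.add(pair)
                frontier.append(pair)
    return True

# Machine B is machine A with the two state codes swapped: an isomorphic
# machine, so a fault producing B from A would be untestable.
ns_a = lambda s, x: x
out_a = lambda s: s % 2
ns_b = lambda s, x: 1 - x
out_b = lambda s: (1 - s) % 2
print(equivalent(ns_a, out_a, ns_b, out_b, [0, 1], 0, 1))  # True
```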
FIGURE 9.30 (a) State transition diagram. (b) Network implementation of the combinational component. (c) State transition diagram of the faulty circuit due to an equivalent-sequential untestable stuck-at-0 fault at w1. (d) State transition diagram of the faulty circuit due to an equivalent-sequential untestable stuck-at-1 fault at w2. (e) State transition diagram of a faulty circuit with an isomorphic-sequential untestable fault.

Most classical methods are not practical for circuits of usual size. In particular, when considering the state minimization problem for incompletely specified finite-state machines, we can solve small or medium-sized problems and we cannot compare heuristic solutions to optimum solutions. Heuristic algorithms have been shown useful, but the quality of their results is not well assessed yet. When considering state assignment for multiple-level logic implementations, the quality of the results is still often unpredictable, because of the inability of current algorithms to forecast precisely the effects of the choice of the codes on the area and performance of the resulting circuits. This should be contrasted to two-level combinational logic minimization, where approximate solutions are known to be close in quality to the exact solutions in many instances. One reason for this discrepancy is the harder underlying covering problem (binate versus unate).
Sequential optimization methods using network models evolved from retiming and multiple-level combinational logic synthesis. The flexibility in implementing multiple-level circuits allows us to develop a rich set of circuit transformations but makes it hard to devise algorithms yielding optimum implementations. Once again, most network-based optimization algorithms are heuristic in nature. Network transformations affect the encoding of the states of the circuit being optimized, with the exception of retiming, which, on the other hand, exploits only one degree of freedom (i.e., register positioning) for circuit optimization. Whereas state transition diagrams can be easily extracted from network models, the implications of network transformations on the corresponding properties of the state transition diagrams are not yet fully understood and exploited. Thus, to date, most sequential circuit optimization programs use different modeling paradigms (e.g., state transition diagrams or networks) without exploiting the synergism between the two representations. Even though these programs are used routinely and successfully for digital design, we believe that this field still has many open problems for further research and that progress in sequential logic synthesis and optimization will lead to even more powerful design tools in the years to come.

9.7 REFERENCES

Finite-state machine optimization using state models has been described in several classical textbooks, e.g., Hartmanis and Stearns' [20], Kohavi's [23], Hill and Peterson's [21] and McCluskey's [29]. Hopcroft's algorithm for state minimization was first presented in reference [19], but a generalization of the method is described in detail in reference [1]. Machine optimization using don't care sequences was first postulated by Paull and Unger [36]. Kim and Newborn [22] described a state minimization method using input don't care sequences. The technique was perfected later by Rho et al. [30]. Recent work on finite-state machine optimization, with particular emphasis on encoding and decomposition, is reported in a recent monograph on sequential synthesis [2].

The symbolic approach to the state encoding problem for two-level circuits was first presented in reference [8] and then extended in references [9] and [35]. State encoding for multiple-level circuits was addressed by Devadas et al. [11] and perfected by Du et al. [16]. Recently, Malik et al. [27] developed a solution to an input encoding problem that optimizes the extraction of common subexpressions and that can be used for state encoding of multiple-level circuits. Symbolic relations were introduced by Lin and Somenzi [25].

Retiming was proposed by Leiserson and Saxe in the early 1980s. A complete and accurate description was reported in a later paper [24]. The ramifications in the synthesis field have been several, and several extensions to retiming have been proposed [10, 15, 28, 32]. An analysis of don't care conditions for sequential circuits modeled with state transition graphs is reported in reference [2], while specifications of don't care sets using network models are reviewed in reference [6]. Damiani and De Micheli [7] investigated modeling and optimization of sequential circuits using synchronous recurrence equations.

The implicit finite-state machine traversal problem was tackled by Coudert et al. [5] and by other authors [4, 34], with the goal of verifying equivalence between sequential circuit models. Devadas et al. [12-14] investigated several aspects of design for testability of sequential circuits, including the efficient derivation of controllability don't care conditions for both combinational and sequential circuits. A recent book describing the overall problem is reference [17].
1. A. Aho, J. Hopcroft and J. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1974.
2. P. Ashar, S. Devadas and A. Newton, Sequential Logic Synthesis, Kluwer Academic Publishers, Boston, MA, 1992.
3. K. Bartlett, G. Borriello and S. Raju, "Timing Optimization of Multiphase Sequential Logic," IEEE Transactions on CAD/ICAS, Vol. CAD-10, No. 1, pp. 51-62, January 1991.
4. J. Burch, E. Clarke, K. McMillan and D. Dill, "Sequential Circuit Verification Using Symbolic Model Checking," DAC, Proceedings of the Design Automation Conference, pp. 46-51, 1990.
5. O. Coudert, C. Berthet and J. C. Madre, "Verification of Sequential Machines Based on Symbolic Execution," in J. Sifakis, Editor, Automatic Verification Methods for Finite State Systems, Lecture Notes in Computer Science, Springer-Verlag, Berlin, Germany, 1990.
6. M. Damiani and G. De Micheli, "Don't Care Set Specifications in Combinational and Synchronous Logic Circuits," IEEE Transactions on CAD/ICAS, Vol. CAD-12, No. 3, pp. 365-388, March 1993.
7. M. Damiani and G. De Micheli, "Recurrence Equations and the Optimization of Synchronous Logic Circuits," DAC, Proceedings of the Design Automation Conference, pp. 556-561, 1992.
8. G. De Micheli, R. Brayton and A. Sangiovanni-Vincentelli, "Optimal State Assignment for Finite State Machines," IEEE Transactions on CAD/ICAS, Vol. CAD-4, No. 3, pp. 269-284, July 1985.
9. G. De Micheli, "Symbolic Design of Combinational and Sequential Logic Circuits Implemented by Two-level Logic Macros," IEEE Transactions on CAD/ICAS, Vol. CAD-5, No. 4, pp. 597-616, October 1986.
10. G. De Micheli, "Synchronous Logic Synthesis: Algorithms for Cycle-Time Minimization," IEEE Transactions on CAD/ICAS, Vol. CAD-10, No. 1, pp. 63-73, January 1991.
11. S. Devadas, H-K. Ma, A. Newton and A. Sangiovanni-Vincentelli, "MUSTANG: State Assignment of Finite-State Machines Targeting Multilevel Logic Implementations," IEEE Transactions on CAD/ICAS, Vol. CAD-7, No. 12, pp. 1290-1300, December 1988.
12. S. Devadas and K. Keutzer, "A Unified Approach to the Synthesis of Fully Testable Sequential Machines," IEEE Transactions on CAD/ICAS, Vol. CAD-10, No. 1, pp. 39-50, January 1991.
13. S. Devadas, H-K. Ma, A. Newton and A. Sangiovanni-Vincentelli, "Irredundant Sequential Machines Via Optimal Logic Synthesis," IEEE Transactions on CAD/ICAS, Vol. CAD-9, No. 1, pp. 8-18, January 1990.
14. S. Devadas, H-K. Ma, A. Newton and A. Sangiovanni-Vincentelli, "Synthesis and Optimization Procedures for Fully and Easily Testable Sequential Machines," IEEE Transactions on CAD/ICAS, Vol. CAD-8, No. 10, pp. 1100-1107, October 1989.
15. S. Dey, M. Potkonjak and S. Rothweiler, "Performance Optimization of Sequential Circuits by Eliminating Retiming Bottlenecks," ICCAD, Proceedings of the International Conference on Computer Aided Design, pp. 504-509, 1992.
16. X. Du, G. Hachtel, B. Lin and A. Newton, "MUSE: A MUltilevel Symbolic Encoding Algorithm for State Assignment," IEEE Transactions on CAD/ICAS, Vol. CAD-10, No. 1, pp. 28-38, January 1991.
17. A. Ghosh, S. Devadas and A. Newton, Sequential Logic Testing and Verification, Kluwer Academic Publishers, Boston, MA, 1992.
18. A. Grasselli and F. Luccio, "A Method for Minimizing the Number of Internal States in Incompletely Specified Sequential Networks," IEEE Transactions on Electronic Computers, Vol. EC-14, No. 3, pp. 350-359, June 1965.
19. J. Hopcroft, "An n log n Algorithm for Minimizing States in a Finite Automaton," in Z. Kohavi, Editor, Theory of Machines and Computations, Academic Press, New York, pp. 189-198, 1971.
20. J. Hartmanis and R. Stearns, Algebraic Structure Theory of Sequential Machines, Prentice-Hall, Englewood Cliffs, NJ, 1966.
21. F. Hill and G. Peterson, Switching Theory and Logical Design, Wiley, New York, 1981.
22. J. Kim and M. Newborn, "The Simplification of Sequential Machines with Input Restrictions," IEEE Transactions on Computers, Vol. C-21, pp. 1440-1443, December 1972.
23. Z. Kohavi, Switching and Finite Automata Theory, McGraw-Hill, New York, 1978.
24. C. Leiserson and J. Saxe, "Retiming Synchronous Circuitry," Algorithmica, Vol. 6, No. 1, pp. 5-35, 1991.
25. B. Lin and F. Somenzi, "Minimization of Symbolic Relations," ICCAD, Proceedings of the International Conference on Computer Aided Design, pp. 88-91, 1990.
26. B. Lin, H. Touati and A. Newton, "Don't Care Minimization of Multi-Level Sequential Logic Networks," ICCAD, Proceedings of the International Conference on Computer Aided Design, pp. 414-417, 1990.
27. S. Malik, L. Lavagno, R. Brayton and A. Sangiovanni-Vincentelli, "Symbolic Minimization of Multiple-level Logic and the Input Encoding Problem," IEEE Transactions on CAD/ICAS, Vol. CAD-11, No. 7, pp. 825-843, July 1992.
28. S. Malik, E. Sentovich, R. Brayton and A. Sangiovanni-Vincentelli, "Retiming and Resynthesis: Optimizing Sequential Networks with Combinational Techniques," IEEE Transactions on CAD/ICAS, Vol. CAD-10, No. 1, pp. 74-84, January 1991.
29. E. McCluskey, Logic Design Principles, Prentice-Hall, Englewood Cliffs, NJ, 1986.
30. J. Rho, G. Hachtel and F. Somenzi, "Don't Care Sequences and the Optimization of Interacting Finite-State Machines," ICCAD, Proceedings of the International Conference on Computer Aided Design, pp. 418-421, 1991.
31. G. Saucier, M. Crastes de Paulet and P. Sicard, "ASYL: A Rule-based System for Controller Synthesis," IEEE Transactions on CAD/ICAS, Vol. CAD-6, No. 6, pp. 1088-1097, November 1987.
32. N. Shenoy, R. Brayton and A. Sangiovanni-Vincentelli, "Retiming of Circuits with Single Phase Transparent Latches," ICCD, Proceedings of the International Conference on Computer Design, pp. 86-89, 1991.
33. V. Singhal, Y. Watanabe and R. Brayton, "Heuristic Minimization for Synchronous Relations," ICCD, Proceedings of the International Conference on Computer Design, pp. 428-433, 1993.
34. H. Touati, H. Savoj, B. Lin, R. Brayton and A. Sangiovanni-Vincentelli, "Implicit State Enumeration of Finite State Machines using BDDs," ICCAD, Proceedings of the International Conference on Computer Aided Design, pp. 130-133, 1990.
35. T. Villa and A. Sangiovanni-Vincentelli, "NOVA: State Assignment for Finite State Machines for Optimal Two-level Logic Implementation," IEEE Transactions on CAD/ICAS, Vol. CAD-9, No. 9, pp. 905-924, September 1990.
36. S. Unger, Asynchronous Sequential Switching Circuits, Wiley, New York, 1972.
9.8 PROBLEMS

1. Consider the state table of Example 9.2.3. Derive a completely specified cover by replacing
the don't care entries by 0s. Minimize the machine using the standard and Hopcroft's algorithms. Repeat the exercise with the don't care entries replaced by 0 and by 1.
2. Consider the state table of Example 9.2.3. Derive a minimum symbolic cover and the
corresponding encoding constraints. Then compute a feasible encoding. Can you reduce the encoding length by using constraints derived from a non-minimum cover? Show possible product-term/encoding-length trade-offs.
3. Consider the state encoding problem specified by two matrices A and B derived by symbolic minimization. Assume that matrix B specifies only covering constraints (i.e., exclude disjunctive relations). Let S be the state set. Prove the following. A necessary and sufficient condition for the existence of an encoding of S satisfying both the constraints specified by A and B is that for each triple of states (r, s, t) in S such that b_{r,s} = 1 and b_{s,t} = 1 there exists no row k of A such that a_{k,r} = 1, a_{k,s} = 0, a_{k,t} = 1.

4. Consider the network of Figure 9.8. Draw the weighted graph modeling the search for a legal retiming with cycle-time of 22 units using the Bellman-Ford method. Compute the retiming and draw the retimed network graph.
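The Bellman-Ford machinery that Problems 4 and 5 refer to solves systems of difference constraints of the form r(v) - r(u) <= w, each encoded as an edge u -> v of weight w. A generic sketch (the graph data below is invented, not the network of Figure 9.8):

```python
def bellman_ford(n, edges, source=0):
    """Shortest paths from source over n vertices; returns None when a
    negative cycle exists (i.e., the constraint system is infeasible)."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):               # n-1 relaxation passes suffice
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                # one more pass detects cycles
        if dist[u] + w < dist[v]:
            return None                  # infeasible: no legal retiming
    return dist

print(bellman_ford(3, [(0, 1, 2), (1, 2, -1), (0, 2, 3)]))  # [0, 2, 1]
```

A feasible distance vector yields one legal assignment of retiming values; a negative cycle certifies that no retiming meets the chosen cycle-time.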
5. Suppose you want to constrain the maximum number of registers on a path while doing retiming. How do you incorporate this requirement in the retiming formalism? Would the complexity of the algorithm be affected? As an example, assume that you require at most one register on the path (v_d, v_h) in the network of Figure 9.8. What would the
minimum cycle-time be? Show graphically the constraints of the Bellman-Ford algorithm in correspondence with the minimum cycle-time as well as the retimed network graph.

6. Give examples of synchronous elimination and algebraic substitution as applied to the network of Figure 9.7.
7. Show that for a pipelined network, for any single perturbation at any vertex v, representing the replacement of a local function f_v by g_v, necessary and sufficient conditions for the feasibility of the replacement are:

for a suitable value of p.

8. Consider the acyclic synchronous logic network obtained from that of Figure 9.7 by cutting the loops and using additional input variables for the dangling connections. Determine the observability don't care sets at all internal and input vertices.
CHAPTER 10

CELL-LIBRARY BINDING
Cum legere non possis quantum habueris, satis est habere quantum legas.
Since you cannot read all the books which you may possess, it is enough to possess only as many books as you can read.

Seneca, Epistulae ad Lucilium.
10.1 INTRODUCTION
Cell-library binding is the task of transforming an unbound logic network into a bound network, i.e., into an interconnection of components that are instances of elements of a given library. This step is very important for standard-cell and array-based semicustom circuit design, because it provides a complete structural representation of the logic circuit which serves as an interface to physical design tools. Library binding allows us to retarget logic designs to different technologies and implementation styles. Hence it is of crucial importance for updating and customizing circuit designs. Library binding is often called technology mapping. The origin of this term is due to the early applications of semicustom circuits, which reimplemented circuits, originally designed in SSI bipolar technologies, in LSI CMOS technologies. Circuit technological parameters are fully represented by the characterization of a library in terms of the area and speed parameters of each cell. Therefore we prefer the
name library binding, because the essential task is to relate the circuit representation to that of the cell library and to find a cell interconnection.

Even though digital circuits are often sequential and hierarchical in nature, the most studied binding problems deal with their combinational components, because the choice of implementation of registers, input/output circuits and drivers in a given library is often straightforward and binding is done by direct replacement. Therefore we consider here the library binding problem for multiple-level combinational circuits. Two-level logic representations are decomposed into multiple-level networks before library binding, unless they are implemented with a specific macro-cell style (e.g., PLAs). We also restrict our attention to libraries of combinational single-output cells. Even though this assumption may seem restrictive, it is justified because specific multiple-output functions (e.g., adders, encoders), as well as sequential and interface elements (e.g., registers, Schmitt triggers and three-state drivers), are usually bound by direct replacement.

Practical approaches to library binding can be classified into two major groups: heuristic algorithms and rule-based approaches. Whereas most binding algorithms are limited to single-output combinational cells, rule-based approaches can handle arbitrarily complex libraries. The drawbacks of the latter methods are the creation and maintenance of the set of rules and the speed of execution. Most commercial tools use a combination of algorithms and rules in library binding to leverage the advantages of both. In this chapter we consider both methods in detail and highlight their advantages and limitations. After having formulated and analyzed the library binding problem in Section 10.2, we describe algorithms for standard libraries of combinational gates in Section 10.3. Next, in Section 10.4, we consider algorithms for specific design styles, such as field-programmable gate arrays (FPGAs), where the library can be defined implicitly, instead of by enumeration. We describe rule-based library binding in Section 10.5 and we compare it to the algorithmic approach.

10.2 PROBLEM FORMULATION AND ANALYSIS

A cell library is a set of primitive logic gates, including combinational, sequential and interface (e.g., driver) elements. Each element is characterized by its function, terminals and some macroscopic parameters such as area, delay and capacitive load. In the case of standard-cell libraries, a cell layout is associated with each element. For array-based design, information is provided (to the physical design tools) on how to implement the cell with the prediffused patterns. The library contains the set of logic primitives that are available in the desired design style. Hence the binding process must exploit the features of such a library in the search for the best possible implementation. Optimization of area and/or delay, as well as testability enhancement, is always associated with the binding process. We shall show that the optimization tasks are difficult, because they entail the solution of intractable problems.
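A cell library of this kind can be captured by a small record per cell. The sketch below is illustrative only: the cell names follow common usage, and the area and delay figures are invented.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Cell:
    name: str
    inputs: int
    area: float
    delay: float                      # single worst-case delay figure
    func: Callable[..., int]          # single-output logic function

# Hypothetical characterization; area/delay numbers are made up.
LIBRARY = [
    Cell("INV",  1, 1.0, 0.5, lambda a: 1 - a),
    Cell("AND2", 2, 4.0, 1.0, lambda a, b: a & b),
    Cell("OR2",  2, 4.0, 1.0, lambda a, b: a | b),
    Cell("OA21", 3, 5.0, 1.5, lambda a, b, c: (a | b) & c),
]
print(LIBRARY[3].func(1, 0, 1))  # 1
```

A real library would also record pin capacitances and load-dependent rise/fall delays; a single worst-case figure keeps the sketch short.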
Let us consider a combinational logic network that may have been optimized by means of the algorithms described in Chapter 8. For the sake of simplicity, we restrict our attention to non-hierarchical combinational logic networks. The library binding problem entails finding an equivalent logic network whose internal modules are instances of library cells. We assume that the library is a set of cells, each one characterized by: a single-output combinational logic function; an area cost; the input/output propagation delays (that are generally specified as a function of the load, or fanout, for rising/falling transitions at the output, or by worst-case values). Note that cell sharing is not possible at the logic level, as done for resources at the architectural level, because cells evaluate logic functions concurrently.

We say that a cell matches a subnetwork when they are functionally equivalent. Note that a cell may match a subnetwork even if the number of inputs differs and some of these are shorted together or connected to a fixed voltage rail. For example, a three-input AND cell can be used to implement a two-input AND function. We neglect these cases in the sequel, because libraries usually contain those cells that can be obtained from more complex ones by removing (or bridging) inputs, and such cells may be identified using simple replacement rules.

An unbound logic network, where each local function matches a library cell, can be translated into a bound network in a straightforward way by binding each vertex to an instance of a matching library cell. We call this binding trivial. Even when the unbound network is area and/or timing optimal, a trivial binding may not be so, because optimization of unbound networks involves simplified models and ignores the actual cell parameters. Thus, library binding often involves a restructuring of the logic network.

A common approach for achieving library binding is to restrict binding to the replacement of subnetworks (of the original unbound network) with cell-library instances [19, 23, 26]. Covering entails recognizing that a portion of a logic network can be replaced by a library cell and selecting an adequate number of instances of library elements to cover the logic network while optimizing some figure of merit, such as area and/or delay. This is called network covering by library cells. It is usual to search for a binding that minimizes the area cost (possibly under some delay constraints) or the maximum delay (possibly under an area constraint). A simple example of network covering is shown in Figure 10.1. Binding for other design styles, such as lookup-table-based FPGAs where the library is best characterized otherwise, is presented in Section 10.4.

FIGURE 10.1 (a) Simple network. (b) Trivial binding by using one gate per vertex. (c) Network cover with library cells including two-input and three-input AND and OR gates. (d) Alternative network cover.

The binding problem is computationally hard. Indeed, even checking the equivalence of an unbound and a bound network (tautology problem) is intractable.
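The earlier observation that a cell can match a function with fewer inputs by bridging pins can be checked by comparing truth tables; a minimal sketch:

```python
from itertools import product

def truth_table(f, n):
    """Enumerate f over all n-bit input combinations."""
    return tuple(f(*bits) for bits in product((0, 1), repeat=n))

# A 3-input AND cell realizes a 2-input AND when two pins are bridged.
and3 = lambda a, b, c: a & b & c
bridged = lambda a, b: and3(a, a, b)     # pins 1 and 2 tied together
print(truth_table(bridged, 2) == truth_table(lambda a, b: a & b, 2))  # True
```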
We describe now the covering problem of an unbound network G_n(V, E) in more detail. A rooted subnetwork is a subgraph of G_n(V, E) that has a single vertex with zero outdegree. In the covering problem, we associate with each internal vertex v of G_n(V, E) the subset M_v of library cells that match some subnetwork rooted at v. We say that a cell in M_v covers v and the other vertices of the matching subnetwork. Let M be the set of all matching cells associated with the internal vertices of the network, i.e., M is the union of the sets M_v over all internal vertices v. The covering problem can be modeled by selecting enough matches in M that cover all internal vertices of the unbound network. For each selected match, we must ensure that the vertices bound to the inputs of the corresponding cell are also associated with the outputs of other matching cells. Thus the choice of a match implies the choice of some other matches, and the network covering problem can be classified as a binate covering problem. A necessary condition for the network covering problem to have a solution is that each internal vertex is covered by at least one match. This condition is usually satisfied by decomposing the unbound network prior to covering, so that all local functions have at least one match in the library. An example of an optimal solution is one that minimizes the total cost associated with the selected matches. For example, the cost may be the area taken by the individual cells.

Example 10.2.1. Consider the simple library shown in Figure 10.2 (a) and the unbound network of Figure 10.2 (b). We consider the problem of finding a network cover that minimizes the total area cost associated with the chosen cells.
m1: (v1, OR2)
m2: (v2, AND2)
m3: (v3, AND2)
m4: ({v1, v2}, OA21)
m5: ({v1, v3}, OA21)
(a) Simple lihrary. (h) Unbound network. (c) Trivial binding. (d) Match set. ie) Network cover. (fl Alter native bound network which is not a cover of the unbound network shown in lb).
(a) Simple library. (b) Unbound network. (c) Trivial binding. (d) Match set. (e) Network cover. (f) Alternative bound network which is not a cover of the unbound network shown in (b).

There are several possibilities for covering this network. For example, a trivial binding is shown in Figure 10.2 (c). A more interesting binding can be found by considering the possible matches. Consider, for example, vertex v1, which can be bound to a two-input OR gate (OR2), and vertex v2, which can be bound to a two-input AND gate (AND2). Moreover, the subnetwork consisting of {v1, v2} can be bound to a complex gate (OA21). We can associate binary variables with the matches. Variable m1 is TRUE whenever the OR2 gate is bound to v1, variable m2 is TRUE whenever the AND2 gate is bound to v2 and m4 represents the use of OA21 for covering the subnetwork {v1, v2}. Similar considerations apply to vertex v3. We use variable m3 to denote the choice of a two-input AND gate (AND2) for v3 and m5 to represent the choice of OA21 for {v1, v3}. The possible matches are shown in Figure 10.2 (d). Therefore we can represent the requirement that v1 be covered by at least one gate in the library by the unate clause m1 + m4 + m5. Similarly, the covering requirements of v2 and v3 can be represented by clauses m2 + m4 and m3 + m5, respectively. In addition to these requirements, we must ensure that the appropriate inputs are available to each chosen cell. For instance, binding an AND2 gate to v2 requires that its inputs are available, which is the case only when an OR2 gate is bound to v1. The former choice is represented by m2 and the latter by m1. This implication can be expressed by m2 implies m1, or alternatively by the binate clause m2' + m1. Similarly, it can be shown that
m3 implies m1, or m3' + m1. Therefore, the following overall clause must hold:

(m1 + m4 + m5) (m2 + m4) (m3 + m5) (m2' + m1) (m3' + m1) = 1
The clause is binate. An exact solution can be obtained by binate covering, taking into account the cell costs. Among the cubes satisfying the clause, the least-cost one denotes the desired binding. In this case, the optimum solution is represented by cube m1' m2' m3' m4 m5, with a total cost of 10, corresponding to the use of two OA21 gates. The optimal bound network is shown in Figure 10.2 (e). The bound network obtained by covering depends on the initial decomposition. For example, Figure 10.2 (f) shows another bound network, which is equivalent to the network of Figure 10.2 (b), but not one of its covers.

An exact algorithm for solving the network covering problem must cope with binate covering. The branch-and-bound algorithm of Section 2.5.3 can be used for this purpose, but experimental results have shown that this approach is viable only for small-scale circuits, which are not of practical interest [26]. It is important to remember that library binding could be solved exactly by methods other than covering. Unfortunately, the difficulty of the covering problem stems from its binate nature, due to the fact that the choice of any cell requires selecting other cells to provide correct connectivity. Any other formulation of library binding will face the same problem, and thus we conjecture that solving library binding is at least as complex as solving it by network covering. For this reason, heuristic algorithms have been developed to approximate the solution of the network covering problem. Alternatively, rule-based systems can perform network covering by stepwise replacement of matching subnetworks. We shall review heuristic algorithms next.
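For an instance as small as Example 10.2.1, the binate covering problem can be solved by exhaustive enumeration. The sketch below encodes the five matches of the example; the area costs are assumptions, chosen so that the two-OA21 cover is the cheapest.

```python
from itertools import product

MATCHES = {   # match -> (covered vertices, prerequisite matches, area)
    "m1": ({"v1"}, set(), 4.0),          # OR2 at v1
    "m2": ({"v2"}, {"m1"}, 4.0),         # AND2 at v2, needs v1's output
    "m3": ({"v3"}, {"m1"}, 4.0),         # AND2 at v3, needs v1's output
    "m4": ({"v1", "v2"}, set(), 5.0),    # OA21 covering {v1, v2}
    "m5": ({"v1", "v3"}, set(), 5.0),    # OA21 covering {v1, v3}
}
VERTICES = {"v1", "v2", "v3"}

best, best_cost = None, float("inf")
names = sorted(MATCHES)
for bits in product((0, 1), repeat=len(names)):
    chosen = {m for m, b in zip(names, bits) if b}
    covered = set().union(*(MATCHES[m][0] for m in chosen)) if chosen else set()
    feasible = covered >= VERTICES and all(
        MATCHES[m][1] <= chosen for m in chosen)   # binate (input) clauses
    cost = sum(MATCHES[m][2] for m in chosen)
    if feasible and cost < best_cost:
        best, best_cost = chosen, cost
print(sorted(best), best_cost)  # ['m4', 'm5'] 10.0
```

The feasibility test mirrors the clause of the example: the unate terms require every vertex to be covered, and the binate terms require each chosen cell's inputs to be produced by other chosen cells.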

10.3 ALGORITHMS FOR LIBRARY BINDING
Algorithms for library binding were pioneered at AT&T Bell Laboratories by Keutzer [19], who recognized the similarity between the library binding problem and the code generation task in a software compiler. In both cases, a matching problem addresses the identification of the possible substitutions and a covering problem the optimal selection of matches. There are two major approaches to solving the matching problem, which relate to the representation being used for the network and the library. In the Boolean approach, the library cells and the portion of the network of interest are described by Boolean functions. In the structural approach, graphs representing algebraic decompositions of the Boolean functions are used instead. Since the algebraic representation of an expression can be cast into a graph, expression pattern matching approaches can be classified as structural techniques. The structural approach was used by programs DAGON [19], MIS [10, 26] and TECHMAP [24], while the Boolean approach was implemented first in program CERES [23] and in Fujitsu's binder [29]. The structural and Boolean approaches differ mainly in the matching step. We define formally the matching of two scalar combinational functions, representing a cell and a subnetwork, as follows.
Split by PDF Splitter
510
LDGlC~LEVELSYNTHESIS A N 0 OPTIMUXTION
Definition 10.3.1. Given two single-output combinational functions f(x) and g(y), with the same number of support variables, we say that f matches g if there exists a permutation matrix P such that f(x) = g(P x) is a tautology.

The degrees of freedom in associating input pins are modeled by the permutation matrix. We refer to this type of matching as Boolean matching, because it is based on the properties of the function, and to distinguish it from another, weaker form of matching. Given a structural representation of two functions by two graphs in a predefined format (e.g., sum of products representations or networks of two-input NAND gates and inverters), there is a structural match if the graphs are isomorphic. A structural match implies a Boolean match, but the converse is not true.
Example 10.3.1. Consider the following two functions: f = ab + c and g = p + qr. Since it is possible to express g as a function of {a, b, c} with a suitable input variable permutation so that f = g is a tautology, it follows that f and g have a Boolean match. Functions f and g can be represented by their OR-AND decomposition graphs, as shown in Figures 10.3 (a) and (b). Since the two graphs are isomorphic, f and g have a structural match.

Consider now the following two functions over the same support set: f = xy + x'y' + y'z and g = xy + x'y' + xz. They are logically equivalent and hence they yield a Boolean match. Nevertheless, they are entirely different in their expression patterns and in their structural representation.

Note that different structures for a given function arise because there exist different possible ways of factoring an expression, and there are even different sum of products representations of the same function.

Matching algorithms are described in Sections 10.3.1 and 10.3.2. The Boolean matching problem is intractable, because the complement of the tautology problem belongs to the NP-complete class [12]. The structural matching problem for general functions, represented by dags, is also conjectured to be intractable, because it is transformable into the graph isomorphism problem [12]. Nevertheless, efficient algorithms for matching have been developed, because the size of the matching problem is usually small, since it is related to the maximum number of inputs of the library cells.

The major difficulty in solving the library binding problem lies in the network covering problem, as we have shown in the previous section. To render the problem solvable and tractable, most heuristic algorithms apply two preprocessing steps to the network before covering: decomposition and partitioning.
FIGURE 10.3 (a) Representative graph for f = ab + c. (b) Representative graph for g = p + qr.
Decomposition is required to guarantee a solution to the network covering problem by ensuring that each vertex is covered by at least one match. Indeed, if no library cell implements a base function f, there may exist a vertex of the network whose expression is f and that is not included in any subnetwork that matches a library element. The goal of decomposition in this context is to express all local functions as simple functions, such as two-input NORs or NANDs. These simple functions are called base functions. The library must include cells implementing the base functions to ensure the existence of a solution. Conversely, a trivial binding can always be derived from a network decomposed into base functions. Different heuristic decomposition algorithms can be used for this purpose, but attention must be paid, because network decompositions into base functions are not unique and affect the quality of the solution. Therefore heuristics may be used to bias some features of decomposed networks. For example, while searching for a minimal-delay binding, a decomposition may be chosen such that late inputs traverse fewer stages.

The second major preprocessing step in heuristic binding is partitioning, which allows the covering algorithm to consider a collection of multiple-input single-output networks in place of a multiple-input multiple-output network. The rationale for partitioning is twofold. First, the size of each covering problem is smaller. Second, while the covering problem is intractable in the case of general networks, it becomes tractable under some additional assumptions, especially for those approaches based on structural matching, as described in Section 10.3.1. Partitioning is also used to isolate the combinational portions of a network from the sequential elements and from the I/Os.

Different schemes can be used for partitioning. When considering a combinational network, vertices with multiple out-degrees can be marked, and the edges whose tails are marked vertices define the partition boundary. This method implies that library binding will not try to improve the previous structure of the network as far as changing multiple-fanout points. Another approach is to iteratively identify a partition block consisting of all vertices that are tails of paths to a primary output and to delete them from the graph [10]. While this approach privileges the formation of larger partition blocks, it suffers from the dependency on the choice of output. The subnetworks that are obtained by partitioning the original network are called subject graphs [19].

Finally, each subject graph is covered by an interconnection of library cells. Subject graphs are covered by library elements one at a time. For selected portions of the subject graph, all cells in the library are tried for a match; when one exists and is selected, that portion of the subject graph is labeled with the matching cell along with its area and timing attributes. The choice of a match is done according to different covering schemes, as described in detail in the following sections.

It is important to stress that, although the partitioning and decomposition steps are heuristics that help reduce the problem difficulty, they can hurt the quality of the solution. The choice of the base functions is important, and it is obviously dependent on the library under consideration. The decomposition, partitioning and covering steps are illustrated in Figures 10.4, 10.5 and 10.6.
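The multiple-fanout partitioning scheme just described can be sketched in a few lines. The representation of the network (gates as fanin lists) and all names are assumptions of this sketch, not a prescribed data structure.

```python
from collections import defaultdict

def partition_by_fanout(gates, outputs):
    """Mark multiple-fanout vertices and primary outputs as subject-graph
    roots; each partition block holds the gates reachable from a root
    without crossing another root."""
    fanout = defaultdict(int)
    for fanins in gates.values():
        for f in fanins:
            fanout[f] += 1
    roots = set(outputs) | {g for g in gates if fanout[g] > 1}

    def collect(root):
        block, stack = [], [root]
        while stack:
            v = stack.pop()
            block.append(v)
            for f in gates.get(v, []):       # stop at inputs and other roots
                if f in gates and f not in roots:
                    stack.append(f)
        return block

    return {r: collect(r) for r in roots}
```

For a gate n1 feeding both n2 and n3, the vertex n1 becomes a root and three single-output blocks result, so each covering problem is a single-output network.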
10.3.1 Covering Algorithms Based on Structural Matching

Structural matching relies on the identification of common patterns. For this reason, both the subject graph and the library functions must be cast into a form that is comparable. Hence they are both decomposed in terms of the same set of base functions. The graphs associated with the library elements are called pattern graphs. Some library cells may have more than one corresponding pattern graph, because the decomposition of their representative functions may not be unique. We also assume in the sequel that inverters are explicitly modeled in the subject and pattern graphs. We shall represent these graphs by their corresponding gate networks, because operations on this representation are easier to visualize and understand.

The subject and pattern graphs are acyclic and rooted, the root being associated with the subnetwork and cell outputs. We assume in the sequel that decomposition yields either trees or dags where paths that stem from the root reconverge only at the input vertices (leaves). Such dags are called in jargon leaf-dags. When the decomposition yields a tree, all corresponding input variables are associated with different vertices. Otherwise, an input variable may be associated with more than one vertex.

FIGURE 10.4 Network decomposition into base functions (two-input ANDs and two-input ORs).
FIGURE 10.5 Network partitioning into subject graphs.

Example 10.3.2. Consider the following cells in a library: a two-input AND gate, a two-input EXOR gate and a four-input AND gate, whose representative functions are f1 = ab, f2 = a ⊕ b and f3 = abcd, respectively. The pattern graphs are expressed using a decomposition into two-input NANDs and inverters. Non-terminal vertices are labeled by the letter "N" to denote a two-input NAND and by the letter "I" to denote an inverter. The pattern graphs are shown in Figure 10.7. In the first case the graph is a tree; in the second it is a leaf-dag. In the last case, there are two pattern graphs associated with the cell (excluding isomorphic graphs).

Structural matching can be verified by checking the isomorphism between two rooted dags. Even though this problem is conjectured to be intractable [12], experimental results have shown that the computation time is negligible for problems of practical size [10]. Nevertheless, the matching problem can be simplified by noticing that most cells in any library can be represented by rooted trees and that tree matching and tree covering problems can be solved in linear time.
FIGURE 10.6 Covering of a subject graph.

For those cells, such as EXOR and EXNOR gates, that do not have a tree-like decomposition, it is still possible to use a tree representation (by splitting the leaves of the leaf-dag) with an additional notation for those input variables associated with more than one leaf vertex.

We consider now structural matching and covering using the tree-based representation, as proposed by Keutzer [19]. We describe first tree-based matching and then tree-based covering.

SIMPLE TREE-BASED MATCHING. We describe two methods for tree matching. The first is a simple and intuitive approach that applies to trees obtained by decomposition into base functions of one type only. The second approach is more general in nature, because it supports arbitrary sets of base functions and it uses an automaton to capture the library and check for matching.

We describe tree matching in the case that only one type of base function is used in the decomposition. We consider two-input NANDs, but the same considerations are applicable to two-input NORs. We assume that the subject graph is represented by a rooted tree, obtained by splitting the leaves of the corresponding leaf-dag. An example of a simple library and the corresponding pattern trees is shown in Figures 10.8 (a) and (b).
Since only one type of base function is used, vertices need not be labeled by the base function. Thus, each vertex of the subject and pattern trees is associated with either a two-input NAND, and has two children, or an inverter, and has one child. Note that an inverter can be seen as a single-input NAND. The type of a vertex (e.g., NAND, inverter or leaf) is easily identified by its degree, i.e., by the number of its children. (It is customary to call the degree of a vertex in a rooted tree the number of its children rather than the number of edges incident to it.)

There are several algorithms for tree matching. We describe a simple algorithm that determines if a pattern tree is isomorphic to a subgraph of the subject tree. This is performed by matching the root of the pattern tree to a vertex of the subject tree and visiting recursively their children. The isomorphism can be easily verified by comparing the degrees of pairs of vertices in both the subject and the pattern trees, starting from the initial vertices and proceeding top down until the leaves of the pattern tree are reached. If there is a mismatch, the algorithm terminates with an unsuccessful match; otherwise, the corresponding children are recursively visited. Algorithm 10.3.1 is invoked with v as a vertex of the subject graph and u as the root of the pattern graph.

FIGURE 10.7 (a) Pattern graph for f1 = ab. (b) Pattern graph for f2 = a ⊕ b. (c, d) Pattern graphs for f3 = abcd.
When visiting a vertex pair, if the vertex of the pattern graph is a leaf, then a path from the root to that leaf in the pattern graph has a match in the subject graph. Conversely, when the vertex of the subject graph is a leaf and the vertex of the pattern graph is not a leaf, a match is impossible. When both vertices are not leaves, they must have the same number of children, which must recursively match for a match to be possible. The algorithm is linear in the size of the graphs.

FIGURE 10.8 (a) Simple cell library. (b) Pattern trees (I = white, v = gray, N = black). (c) Pattern strings. (d) Pattern tree identifiers.

Example 10.3.3. Consider the subject tree for function x = a + b' in terms of a two-input NAND decomposition, as shown in Figure 10.9 (a, b). Let us consider the pattern trees of the simple library shown in Figure 10.8.
MATCH(v, u) {
	if (u is a leaf)					/* Leaf of the pattern graph reached */
		return (TRUE);
	else {
		if (v is a leaf)				/* Leaf of the subject graph reached */
			return (FALSE);
		if (degree(v) != degree(u))			/* Degree mismatch */
			return (FALSE);
		if (degree(v) == 1) {				/* One child each: visit subtree recursively */
			vc = child of v; uc = child of u;
			return (MATCH(vc, uc));
		}
		else {						/* Two children each: visit subtrees recursively */
			vl = left-child of v; vr = right-child of v;
			ul = left-child of u; ur = right-child of u;
			return (MATCH(vl, ul) · MATCH(vr, ur) + MATCH(vl, ur) · MATCH(vr, ul));
		}
	}
}

ALGORITHM 10.3.1

We use algorithm MATCH to determine if a pattern tree matches a subtree of the subject tree with the same root. First, let us apply the algorithm to the subject tree and the inverter (INV) pattern tree. Since the root of the subject tree has two children and that of the INV pattern tree has only one, there is no match. Next, let us apply the algorithm to the subject tree and to the NAND2 pattern tree. Both roots have two children. In both recursive calls, the children of the pattern tree are leaves and the calls return TRUE. Hence the NAND2 pattern tree matches a subtree of the subject tree with the same root.

FIGURE 10.9 (a) NAND2 decomposition of x = a + b'. (b) Subject tree. (c) Pattern tree for INV. (d) Pattern tree for NAND2.

TREE-BASED MATCHING USING AUTOMATA. We consider now a method for tree matching that uses an automaton to represent the cell library. This approach is based on an encoding of the trees by strings of characters and on a string-recognition algorithm.
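Before turning to the automaton-based method, Algorithm MATCH can be rendered as a runnable sketch. The nested-tuple tree encoding (a leaf is the empty tuple) is an assumption of this sketch, not the book's data structure.

```python
def match(v, u):
    """True if pattern tree u is isomorphic to a subtree of subject tree v
    sharing v's root; trees are tuples of child subtrees."""
    if len(u) == 0:                      # leaf of the pattern tree reached
        return True
    if len(v) == 0:                      # leaf of the subject tree reached
        return False
    if len(v) != len(u):                 # degree mismatch
        return False
    if len(v) == 1:                      # inverter: one child each
        return match(v[0], u[0])
    (vl, vr), (ul, ur) = v, u            # NAND2: children match in either order
    return (match(vl, ul) and match(vr, ur)) or \
           (match(vl, ur) and match(vr, ul))
```

For the subject tree of x = a + b' (a NAND2 whose left input is an inverter), the INV pattern fails at the root by degree mismatch while the NAND2 pattern succeeds, as in Example 10.3.3.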
This method supports library descriptions in terms of arbitrary base functions and is more general in nature than the simple algorithm of the previous section.

A tree can be represented by a set of strings, where each string corresponds to a path from the root to a leaf. Since a path is a sequence of vertices and edges, a string has a character denoting each vertex in the path and its type (i.e., the corresponding base function), alternated with a number denoting the edge. When a vertex has two children, the left edge is labeled by 1 and the right edge is labeled by 2. When a vertex has one child, the edge to its child is labeled by 1.

Example 10.3.4. Consider the subject tree of Figure 10.9 (b). Let us label edges according to this convention. There are two paths from the root to the leaves. They can be labeled by strings {N1I1v, N2v}, where "v" denotes a leaf.

Tree matching using string matching is based on the Aho-Corasick algorithm, which was devised to recognize strings of characters in a given text [1]. It constructs an automaton that detects the matching strings. The automaton processes strings that encode paths in the subject tree and recognizes those that match paths in the pattern trees. The automaton is constructed once for all elements of a given library, i.e., there is one automaton for the entire cell library. Thus, all pattern trees contribute to the automaton.

The automaton consists of a set of states, a set of transitions that are partitioned into goto and failure transitions, corresponding to the status of detection of a character in a string, and an output function that signals the full detection of a string.

We summarize now the algorithm for constructing the automaton, which is fully detailed in reference [1]. The automaton is constructed incrementally. Initially the automaton contains only the reset state. Then each string is processed in turn, one character at a time, a new state being added for each of the characters not recognized by the automaton under construction. These characters are used as labels of the goto transitions into the states being added. The output function that signals the full detection of a string is specified in conjunction with each transition to a state corresponding to the last character of the string itself. Once all strings have been considered, the automaton transition diagram has a tree structure. The states can be assigned a level equal to the number of transitions from the reset state, and a parent/child relation can be established among state pairs.

Next the automaton is revisited to add the failure function, which indicates which next state to reach when an incoming character does not match the label of any outgoing edge from the current state. This mechanism, detailed later in this section, is used to detect all matching patterns. Initially the failure function of states at level 1 is set to the reset state. Then, while traversing the automaton by breadth-first search, the failure function for a state s, reached from state r under input character c, is set to the state reached by a transition under c from the failure state of r. The output function is updated while deriving the failure functions. In particular, we append to the output function of a given state the output function associated with the transition into the failure state under the corresponding character.

The automaton may recognize arbitrary subtrees of a given tree. In practice, it is convenient to detect subtrees that have the same root or the same leaves. This can be easily achieved by using a distinguished character for the root or the leaves.

Example 10.3.5. Consider a library of two cells: a three-input NAND and a two-input NAND. The corresponding pattern trees are shown in Figure 10.10, where we assume that the base functions are a two-input NAND and an inverter. Hence, vertices are labeled by "N," "I" and "v," where "v" denotes a leaf. The following strings encode the first pattern tree: {N1v, N2I1N1v, N2I1N2v}. These strings are labeled t1.1, t1.2, t1.3, respectively. Similarly, the second pattern is encoded by {N1v, N2v}, denoted as t2.1, t2.2. Note that the first string in both patterns is the same.

We want to construct an automaton that recognizes the strings related to the two cells just mentioned. This automaton is shown in Figure 10.11. We consider now the individual steps for assembling the automaton. While processing the first string, i.e., N1v, three states are added to the reset state, forming a chain with transitions labeled by N, 1 and v. The last transition is coupled with the output value t1.1, denoting the detection of the first string. Next the second string is considered: N2I1N1v. Starting from the reset state, the existing automaton would recognize the first character (i.e., N), but not the following ones. Hence new states and transitions are added to the automaton: the first transition to be added is from state 1 to state 4, and so on. When all strings have been considered and the state set finalized, the remaining transitions are determined based on the failure function. The failure function for state 1 is the reset state. This means that when in state 1, if an input character c does not yield a goto transition (i.e., c ≠ 1 and c ≠ 2), the next state is determined by the transition from the reset state under c. In particular, when c = N, the next state is state 1. And so on.
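The construction and detection steps above can be sketched with a compact Aho-Corasick recognizer over the path strings of Example 10.3.5. The list-based automaton layout and the label names are assumptions of this sketch.

```python
from collections import deque

def build_automaton(patterns):
    """patterns: iterable of (string, label) pairs, e.g. ('N1v', 't1.1').
    Returns (goto, fail, out) with state 0 as the reset state."""
    goto, out, fail = [{}], [set()], [0]
    for s, label in patterns:
        state = 0
        for ch in s:                         # add goto transitions
            if ch not in goto[state]:
                goto.append({}); out.append(set()); fail.append(0)
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(label)                # full detection of the string
    queue = deque(goto[0].values())          # level-1 states fail to reset
    while queue:                             # breadth-first failure links
        r = queue.popleft()
        for ch, s in goto[r].items():
            queue.append(s)
            f = fail[r]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[s] = goto[f].get(ch, 0)
            out[s] |= out[fail[s]]           # append failure-state outputs
    return goto, fail, out

def search(string, automaton):
    """Feed one subject-tree path string; return the labels detected."""
    goto, fail, out = automaton
    state, found = 0, set()
    for ch in string:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= out[state]
    return found
```

Feeding the string N2I1N1v detects t1.2 and, through the failure links, also the shared suffix N1v (labels t1.1 and t2.1), illustrating how all patterns are tracked at once.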
Let us now consider the use of the automaton for finding the pattern trees that match a given subject tree. We shall consider the particular case where we search for pattern trees isomorphic to subtrees of the subject graph and with the same leaves.
FIGURE 10.10 Two pattern graphs.
FIGURE 10.11 Aho-Corasick matching automaton. The goto transitions are indicated by edges and the failure function is denoted by subscripts.

The subject tree is visited bottom up, and for each vertex all strings that correspond to paths to the leaves are processed. A match exists if the strings are recognized by the automaton as belonging to the same pattern tree. The algorithm complexity is linear in the size of the subject graph.

Example 10.3.6. We consider here the simple library and the pattern trees of Figure 10.8. The corresponding automaton is shown in Figure 10.12. Let us assume that the subject tree models the function x = a + b', as shown in Figure 10.9. The subject tree can be modeled by strings {N1I1v, N2v}. Let us feed these strings (representing the subject tree) to the automaton of Figure 10.12 (representing the library). It can be seen that the automaton would recognize both strings, leading to output functions t5.1 and t2.2, respectively (states 30 and 32 in Figure 10.12). Since these functions belong to different pattern graphs, there is no cell in the library that can implement the subject graph directly. Nevertheless, substring I1v can be recognized by the automaton, yielding output t1.1 (state 3 in Figure 10.12). This means that an inverter covers a part of the subject tree. After pruning the corresponding part of the tree, the subject tree is reduced to the strings {N1v, N2v}, which can be recognized by the automaton (states 27 and 32 in Figure 10.12). The corresponding output functions denote the two-input NAND cell. Thus, a NAND2 cell and an INV cell provide a cover.

When comparing this example with Example 10.3.3, the reader may be surprised to notice that the MATCH algorithm finds a match for the NAND2 cell, while the automaton finds a match for the INV cell. Note first that these matches cover part of the subject tree and that in both cases a tree cover can be achieved by using both a NAND2 cell and an INV cell.
Second, the difference in the matches detected by the two algorithms is due to the fact that the former algorithm looks for a pattern tree isomorphic to a subtree of the subject graph with the same root, whereas the automaton-based algorithm looks for a pattern tree isomorphic to a subtree of the subject graph with the same leaves. It is possible to modify the automaton algorithm by using a distinguished character for the root instead of the leaves and by reversing the strings, so that it detects whether a pattern tree is isomorphic to a subtree of the subject graph with the same root.
FIGURE 10.12 Aho-Corasick automaton for the simple library.
In the previous examples, the subject and pattern trees were represented in terms of NAND2 and INV base functions. Thus, all strings contained only three characters, including the terminator (e.g., "v"). The automaton recognizer can support tree recognition with decompositions into arbitrary sets of base functions by just associating each base function with a character. The automaton construction and recognition algorithms remain the same.

When compared to the simple MATCH algorithm, the automaton-based approach has the additional advantage that it considers all patterns available for matching at the same time, while the MATCH algorithm compares one tree at a time. This advantage is offset by the increased complexity of handling trees as separate strings. The choice of the base functions affects the quality of the solution as well as the number of pattern graphs and consequently the computing time for binding. There are some arguments and experimental results favoring the choice of using one base function only (e.g., NAND2 or NOR2 plus inverters) [26]. Other approaches yield equivalent results. The set of base functions useful for library binding is usually small.

TREE-BASED COVERING. Optimum tree covering can be computed by dynamic programming [2, 19]. Here we consider the algorithm described in Section 2.4 in the context of solving the library binding problem. We describe first the minimum-area covering problem. Each cell has an area cost. The total area of the bound network is the objective to minimize. Note that overall optimality is weakened by the fact that the total area of a bound network depends also on the partitioning and decomposition steps.

The tree covering algorithm traverses the subject graph in a bottom-up fashion. For the sake of this explanation, we consider matching of pattern trees whose roots correspond to the vertex of the subject tree under consideration. There are three possibilities for any given pattern tree:

1. The pattern tree and the locally rooted subject subtree are isomorphic. In this case, the vertex is labeled with the corresponding cell cost, representing its area usage.
2. The pattern tree is isomorphic to a subtree of the locally rooted subject subtree with the same root and a set of leaves L. In this case, the vertex is labeled with the corresponding cell cost plus the labels of the vertices L.
3. There is no match.

If we assume that the library contains the gates implementing the base functions, then for any vertex there exists at least one cell for which one of the first two cases applies. While visiting the subject tree bottom up, the covering algorithm determines the matches of the locally rooted subtrees with the pattern trees. Then, it is possible to choose for each vertex in the subject graph the best labeling among all possible matches, and we can label that vertex. The corresponding pattern is recorded, along with the cost of the network bound to the rooted subtree. At the end of the tree traversal, the vertex labeling corresponds to an optimum covering. The complexity of the algorithm is linear in the size of the subject tree.

Example 10.3.7. Consider the network shown in Figure 10.13 with the library and patterns of Figure 10.8. (The network is represented by its trivial binding after a NAND2 decomposition.) Assume that the area costs of the cells (INV, NAND2, AND2, AOI21) are (2, 3, 4, 6), respectively. For the sake of clarity, only one match is found for vertices x, y, z and w, and we can label those vertices. Let us consider then the root vertex o. Three matches are possible:

1. One INV gate, with input w. Hence the cost is that of an inverter plus that of the subtree rooted in w.
2. One AND2 gate, with inputs y and z. Hence the cost is that of an AND2 plus those of the subtrees rooted in y and z.
3. One AOI21 gate, with inputs x, c and d. Hence the cost is that of the AOI21 gate plus that of the subtree rooted in x.

The optimum binding is given by the third option, i.e., by choosing gate AOI21 fed by a NAND2 gate (Figure 10.14).

FIGURE 10.13 Example of structural covering: network, subject graph, possible matches at each vertex and corresponding costs.

FIGURE 10.14 (a) Subject graph with area-cost annotation. (b) Optimum cover for area.

Let us now consider delay minimization in conjunction with library binding. When considering the network partitioned into subject graphs, the problem reduces to minimizing the data-ready time (see Section 8.3) at the output corresponding to the root of the subject graph. In this case, the cost of each cell is its input/output propagation delay.
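The bottom-up labeling of the minimum-area covering of Example 10.3.7 can be written as a short dynamic program. The interfaces below (the tree as child lists, matches(v) returning (cell, uncovered-leaves) pairs, an explicit root) are assumptions of this sketch, not the book's data structures.

```python
def min_area_cover(tree, matches, area, root):
    """Optimum-area tree covering by dynamic programming: the label of a
    vertex is the best cell cost plus the labels of the match's leaves."""
    best = {}

    def cost(v):
        if not tree.get(v):                  # primary input: nothing to bind
            return 0
        if v not in best:
            best[v] = min(area[cell] + sum(cost(leaf) for leaf in leaves)
                          for cell, leaves in matches(v))
        return best[v]

    return cost(root)
```

On a two-vertex subject tree whose root can be covered either by an INV over a NAND2 (area 2 + 3) or directly by an AND2 (area 4), the AND2 labeling wins.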
FIGURE 10.15 (a) Subject graph with vertex data-ready annotation. (b) Optimum cover for delay.

For the sake of simplicity, we consider the worst-case delay for the cells' rising and falling transitions at the outputs, and we assume that the cell delay from each input to the output is the same. The algorithms still apply, with some slight modifications, if these assumptions are removed. We first consider the case in which the propagation delay is constant, and then we look at the more general case.

When the propagation delay is constant, the cost associated with a cell is just a single positive number. The data-ready time at a cell output is then the sum of its propagation delay plus the maximum of the data-ready times at the cell inputs. The input data-ready times can be easily taken into account. A bottom-up traversal of the subject tree would allow us to determine the binding that minimizes the data-ready time at each vertex and hence the minimum time at the root. The overall cost is the maximum path delay, assuming that all paths can be sensitized. Note again that the delay optimality is valid within the tree model and is dependent on the chosen decomposition.

Example 10.3.8. Consider again the network of Figure 10.13 with the library and patterns of Figure 10.8. Assume that a minimum-delay cover is searched for, that the delays of the cells (INV, NAND2, AND2, AOI21) are (2, 4, 5, 10), respectively, and that all inputs are available at time 0 except for input d, which is available at time 6. Then the data-ready times of vertices x, z and w are 4, 10 and 14, respectively. Three matches are again possible for vertex o. If an inverter is chosen, the output data-ready time is 2 + 14 = 16. If an AND2 gate is chosen, it is 5 plus the maximum data-ready time at its inputs y and z, i.e., 15. The choice of an AOI21 gate leads to an output data-ready time of 10 + max(4, 6) = 16. Hence the binding corresponding to the fastest network corresponds to the second choice. It entails the interconnection of an AND2, two NAND2 and an INV gate (Figure 10.15).

Let us now consider the general case, where the propagation delay depends on the load. Most libraries have multiple gates with different drive capabilities for the same logic function. The higher the driving capability, the shorter the propagation delay for the same load and the higher the input capacitance, because larger devices are employed. The load at the output of a gate depends on the cell (or cells) being driven and on the wiring load. Therefore the propagation delay of a cell can be measured in terms of a constant plus a load-dependent term. This model is highly desirable for an accurate delay representation.
Example 10.3.9. Consider again the network of Figure 10.13 with the library and patterns of Figure 10.8. Assume that a minimum-delay cover is searched for and that the delays of the cells (INV, NAND2, AND2, AOI21) are now (1 + l, 3 + l, 4 + l, 9 + l), where l is the load at the output of the gate. All cells load the previous stage by 1. If the output load is also 1, the problem reduces to the previous one described in Example 10.3.8.

Assume next that a super-inverter cell SINV is available, with load 2 and delay 1 + 0.5 l. The best match at vertices x, y and z is the same as in Example 10.3.8; at vertex w a load of either 1 or 2 may be possible, with corresponding data-ready times of either 14 or 15. There are four choices for matching at the root.

Assume the output load is 1. The regular inverter INV would have a propagation delay of 1 + 1 and a load of 1, yielding an output data-ready time of 14 + 2 = 16. The super-inverter SINV would have a propagation delay of 1 + 0.5 and a load of 2, yielding an output data-ready time of 15 + 1.5 = 16.5. The AND2 would yield an output data-ready time of 4 + 1 + max(2, 10) = 15 and the AOI21 would yield an output data-ready time of 9 + 1 + 6 = 16. Hence, the AND2 solution would be preferred, as in the previous example.

Assume now that the output load is 5. The regular inverter INV would have a propagation delay of 1 + 5 and a load of 1, yielding an output data-ready time of 14 + 6 = 20. The super-inverter SINV would have a propagation delay of 1 + 0.5 · 5 and a load of 2, corresponding to an output data-ready time of 15 + 1 + 0.5 · 5 = 18.5. The AND2 would yield an output data-ready time of 4 + 5 + max(2, 10) = 19 and the AOI21 would yield an output data-ready time of 9 + 5 + 6 = 20. Hence, the SINV solution would now be preferred.

The problem of selecting a cell in a bottom-up traversal of the subject graph is not straightforward, because the input capacitance loads of the following stages are unknown when matching. Indeed, the following stages correspond to vertices closer to the root and therefore are yet to be bound. Rudell realized that for most libraries the values of input capacitances are a finite and small set [26]. Therefore he used a load binning technique, which consists of labeling each vertex with all possible total load values or by an approximation. The tree covering algorithm is extended by computing an array of solutions for each vertex, corresponding to the loads under consideration. For each match, the arrival time is computed for each load value. For each input to the matching cell, the best match for driving the cell (for any load) is selected and the corresponding data-ready time is used. If all possible load values are considered, then the algorithm guarantees an optimum solution for the delay model within the tree modeling assumption. Otherwise, the solution is an approximation. The computational complexity of the tree covering approach is linear in the size of the subject tree and in the number of load values being considered.

Whereas the tree covering and matching approach is very appealing for its simplicity and efficiency, it represents a heuristic method approximating the exact solution.
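The load-binning extension can be sketched by indexing the memoization table with (vertex, load) pairs, i.e., by computing an array of solutions per vertex as described above. The cell model (intrinsic delay, load slope, input capacitance) and all names are assumptions of this sketch.

```python
def min_delay_cover_loads(tree, matches, cells, arrival, root, loads):
    """cells[c] = (d0, slope, cin): delay is d0 + slope * load, and the
    cell presents capacitance cin to the stage driving it."""
    memo = {}

    def ready(v, load):
        if not tree.get(v):
            return arrival.get(v, 0)
        if (v, load) not in memo:
            best = float('inf')
            for cell, leaves in matches(v):
                d0, slope, cin = cells[cell]
                # each uncovered leaf now drives this cell's input pin,
                # so it is evaluated under load cin
                t = d0 + slope * load + max((ready(l, cin) for l in leaves),
                                            default=0)
                best = min(best, t)
            memo[(v, load)] = best
        return memo[(v, load)]

    return {l: ready(root, l) for l in loads}
```

For an inverter chain with INV = (1, 1, 1) and SINV = (1, 0.5, 2), both choices cost 3.5 at output load 1, while SINV wins (5.5 versus 7.5) at output load 5, mirroring the flip observed in Example 10.3.9.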
Whereas the tree covering and matching approach is very appealing for its simplicity and efficiency, there are three pitfalls. First, some cells, such as EXOR and EXNOR gates, cannot be represented by trees. A partial solution is to extend tree covering to use leaf-dags [10, 26]; pattern leaf-dags can match vertices of the subject graph as long as the corresponding leaves match. As a result, a limited use of these cells can be achieved. Second, a library cell may correspond to more than one pattern, because the decomposition into given base functions is not necessarily unique. Since there are multiple non-isomorphic patterns for some cells, each vertex of the subject graph must be tested for matching against a potentially larger number of pattern graphs, increasing the computational burden of the algorithm. Lastly, structural matching can only detect a subset of the possible matches and it does not permit the use of don't care information in library binding. This can lead to solutions of inferior quality.

10.3.2 Covering Algorithms Based on Boolean Matching

Boolean matching can overcome the aforementioned pitfalls of structural matching. Boolean matching can find matches that are not detected by structural matching and, more importantly, it may exploit the degrees of freedom provided by don't care conditions. We consider Boolean matching of completely specified functions in this section, and we defer consideration of the use of don't care conditions to Section 10.3.4. Boolean covering consists of identifying subnetworks whose corresponding cluster functions have matches in the library and selecting an adequate set of matches that optimizes the area and/or delay of the bound network. Whereas Boolean covering and matching are potentially more computationally expensive than structural covering and matching, recent implementations have shown that computing times may be comparable and that the quality of the solutions may be superior.

BOOLEAN MATCHING. Boolean matching requires an equivalence check between two functions, one representing a portion of the network and called the cluster function and the other representing a cell and named the pattern function. We represent the cluster function by f(x) and the pattern function by g(y), where x and y denote the input variables associated with the subnetwork and the library cell, respectively. We assume that x and y have the same size n. Boolean matching addresses the question of whether functions f(x) and g(y) match according to Definition 10.3.1, i.e., whether a permutation matrix P exists such that f(x) = g(P x) is a tautology. The equivalence between two functions can be detected using ordered binary decision diagrams (OBDDs) in various ways (see Section 2 and reference [23]): the order of the variables of one OBDD can be arbitrary and fixed, while different orderings of the variables in the other OBDD are tried until a match is found. Thus, in the worst case, Boolean matching requires solving a factorial number of tautology checks. Fortunately, this number can be greatly reduced by applying the techniques described next: Boolean matching can be made practical by using filters that drastically reduce the number of variable permutations to be considered and that prune the set of pattern functions which need to be tested against a given cluster function.
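As a baseline, the factorial-cost search that the filters below try to avoid can be sketched with truth tables standing in for OBDDs; the functions f and g here are illustrative stand-ins for a cluster and a pattern function:

```python
# Brute-force matching under input permutation (Definition 10.3.1): try
# every assignment of cluster variables to pattern variables until
# f(x) = g(Px) holds. Truth tables replace OBDDs for simplicity.
from itertools import permutations, product

def truth_table(f, n):
    return tuple(f(*bits) for bits in product((0, 1), repeat=n))

def permutation_match(f, g, n):
    """Return a permutation p with f(x) = g(x[p[0]], x[p[1]], ...) or None."""
    target = truth_table(f, n)
    for p in permutations(range(n)):
        permuted = lambda *x: g(*(x[i] for i in p))
        if truth_table(permuted, n) == target:
            return p
    return None

# f and g are the same 2-to-1 selection function up to input ordering.
f = lambda a, b, s: (s and a) or (not s and b)
g = lambda s, a, b: (s and a) or (not s and b)
print(permutation_match(f, g, 3) is not None)   # True
```

For n inputs this tries up to n! orders, which is exactly the cost the unate/binate and symmetry filters reduce.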
Filters are based on properties of the Boolean functions that represent necessary conditions for matching and that can be easily verified. Namely:

Any input permutation must associate a unate (binate) variable in the cluster function with a unate (binate) variable in the pattern function.

Variables or groups of variables that are interchangeable in the cluster function must be interchangeable in the pattern function.

The first condition implies that the cluster and pattern functions must have the same number of unate and binate variables to have a match. In addition, if there are b binate variables, then an upper bound on the number of variable permutations to be considered in the search for a match is b!(n - b)!. This number must be compared to the overall number of permutations, n!, which is much larger.

Example 10.3.10. Consider the following pattern function from a commercial library: g = s1's2'a + s1's2 b + s1 s3' c + s1 s3 d. Function g has 4 unate variables and 3 binate variables. Consider a cluster function f with n = 7 variables. A necessary condition for f to match g is to have also 4 unate variables and 3 binate variables. If this is the case, only 3! 4! = 144 variable orders and corresponding OBDDs need to be considered in the worst case, instead of 7! = 5040. (A match can be detected before all 144 variable orders are considered.)

The second condition allows us to exploit symmetry properties to simplify the search for a match [23, 24]. Consider the support set of a function f(x). A symmetry set is a set of variables that are pairwise interchangeable without affecting the logic functionality. A symmetry class is an ensemble of symmetry sets with the same cardinality. We denote a symmetry class by Ci when its elements have cardinality i, i = 1, 2, ..., n. Obviously, classes can be void.

Example 10.3.11. Consider the function f = x1x2x3 + x4x5 + x6x7 with n = 7 variables. The support variables of f(x) can be partitioned into three symmetry sets: {x1, x2, x3}, {x4, x5}, {x6, x7}. There are two nonvoid symmetry classes, namely C2 = {(x4, x5), (x6, x7)} and C3 = {(x1, x2, x3)}.

Most libraries have pattern functions exhibiting symmetries. The symmetry classes of the pattern functions can be computed beforehand, and they provide a signature for the patterns themselves. The symmetry classes of the cluster function can be quickly determined before comparing the OBDDs, and they can be used to restrict the set of pattern functions that can match a given cluster function as follows. A necessary condition for two functions to match is having symmetry classes of the same cardinality for each i = 1, 2, ..., n. Thus, pattern functions not satisfying this test can be weeded out. In addition, the symmetry classes can be used to simplify the search for a match in different ways [23]. First, variables can be paired only when they belong to symmetry sets of the same size. Second, since all variables in any given symmetry set are equivalent, the symmetry classes are used to determine nonredundant variable orders.
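The symmetry-set signature is cheap to compute from a truth table: two variables are pairwise interchangeable exactly when swapping them leaves the function unchanged. A minimal sketch for the function of Example 10.3.11 (variables indexed from 0 here):

```python
# Partition the support of f into symmetry sets by pairwise swap tests.
from itertools import product

def swap_invariant(f, n, i, j):
    """True if f is unchanged when variables i and j are exchanged."""
    for bits in product((0, 1), repeat=n):
        swapped = list(bits)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        if f(*bits) != f(*swapped):
            return False
    return True

def symmetry_sets(f, n):
    sets, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        group = {i} | {j for j in range(i + 1, n)
                       if swap_invariant(f, n, i, j)}
        assigned |= group
        sets.append(sorted(group))
    return sets

# f = x1 x2 x3 + x4 x5 + x6 x7
f = lambda *x: (x[0] & x[1] & x[2]) | (x[3] & x[4]) | (x[5] & x[6])
print(symmetry_sets(f, 7))   # [[0, 1, 2], [3, 4], [5, 6]]
```

The multiset of set sizes (here 3, 2, 2) is the signature compared against the precomputed signatures of the library patterns.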
As a result, the ordering of the variables within a symmetry set is irrelevant and, when considering the library pattern functions, only permutations over symmetry sets of the same size need to be considered. Thus, another upper bound on the number of permutations required to detect a match is the product over i of |Ci|!.

Example 10.3.12. Consider again the function f = x1x2x3 + x4x5 + x6x7. There are two nonvoid symmetry classes, one of cardinality 2 and one of cardinality 1. When weeding out the library, only those pattern functions with |C2| = 2 and |C3| = 1 are retained. Assuming that the OBDD of f has its variables ordered as (x1, x2, x3, x4, x5, x6, x7), the relevant variable orders for the OBDDs of a candidate pattern function g are (x1, x2, x3, x4, x5, x6, x7) and (x1, x2, x3, x6, x7, x4, x5). Thus, only 2! = 2 variable orders and corresponding OBDDs need to be considered.

However, the unateness information and the symmetry classes can be used together to derive a tighter bound on the number of nonredundant variable orders, since both unateness and symmetry properties have to be the same for two variables to be interchangeable. Unate and binate symmetry sets are disjoint. Thus we can write Ci = Ci^b U Ci^u, where the superscripts b and u denote binate and unate, respectively, and |Ci| = |Ci^b| + |Ci^u|. Hence the number of nonredundant permutations is at most the product over i of |Ci^b|! |Ci^u|!, which never exceeds the product of |Ci|!.

BOOLEAN COVERING. We describe here a procedure for Boolean covering of a subject graph. The procedure is reminiscent of the tree-covering algorithm, because it uses a bottom-up traversal of the subject graph. However, the subject graph is not required to be a tree, but just a rooted dag. As with structural covering, we assume that the network has been partitioned into subject graphs and decomposed into base functions beforehand; in addition, the library is required to include the base functions. We define a cluster as a rooted connected subgraph of the subject graph, characterized by its depth (the longest path from the root to a leaf). The associated cluster function is the Boolean function obtained by collapsing the logic expressions associated with the vertices of the cluster into a single expression. The covering algorithm attempts to match each cluster function to a library element.

Example 10.3.13. Consider the subject graph shown in Figure 10.16, whose base functions are 2-input AND and OR functions. The expressions associated with the subject graph are the following: w = xy; x = e + z; y = a + c; z = c' + b. We consider the clusters rooted at vertex vw, shown by different shadings in the picture. The corresponding cluster functions are: f1 = xy; f2 = (e + z)y; f3 = x(a + c); f4 = (e + c' + b)y; f5 = (e + z)(a + c); f6 = (e + c' + b)(a + c).
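The collapse that produces cluster functions is plain substitution of local expressions; a minimal sketch for the expressions of the example above (the variable names follow the example; the check merely confirms that the full collapse equals the largest cluster function):

```python
# Collapse w = x y, x = e + z, y = a + c, z = c' + b into one expression.
from itertools import product

def z(b, c): return (1 - c) | b
def x(b, c, e): return e | z(b, c)
def y(a, c): return a | c
def w(a, b, c, e): return x(b, c, e) & y(a, c)   # full collapse = f6

# The fully collapsed cluster function equals (e + c' + b)(a + c).
ok = all(w(a, b, c, e) == ((e | (1 - c) | b) & (a | c))
         for a, b, c, e in product((0, 1), repeat=4))
print(ok)   # True
```

Intermediate clusters (f2 through f5) correspond to substituting only some of the internal variables.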
FIGURE 10.16 Clusters of the Boolean covering algorithm.

Let us consider first the minimum-area covering problem. The area cost of a cover is computed by adding the area cost of the matching cell to the area costs of the covers of the subgraphs rooted at the vertices providing the inputs to the matching cell. When matches exist for multiple clusters rooted at a given vertex, the algorithm selects the cluster that minimizes the cost of the cover of the locally rooted subgraph. The Boolean covering algorithm is based on the same dynamic programming paradigm used for structural matching. However, its complexity is higher, because many clusters exist for each vertex of the subject graph. An exact pruning procedure disregards those cluster functions whose support cardinality is larger than the maximum number of inputs of any library cell. A heuristic method used in CERES [23] limits the clusters to those whose depth is bounded from above by a predefined parameter. There is always at least one match for each vertex, because the library contains the base functions. As in the case of structural covering, the solution depends on the particular decomposition and partition into subject graphs. Therefore, the global optimality of the covering step per se has limited practical value, and near-optimal covering solutions are often more than adequate to obtain results of overall good quality.
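The area recurrence can be sketched as follows; matches(v) is a hypothetical enumeration of (cell area, cluster inputs) pairs for the clusters rooted at v that match some library cell, here given as a literal table:

```python
# Dynamic-programming area cost: cost(v) = min over matched clusters at v
# of (cell area + sum of costs of the cluster's input vertices).
from functools import lru_cache

def cover_cost(root, matches, primary_inputs):
    @lru_cache(maxsize=None)
    def cost(u):
        if u in primary_inputs:
            return 0.0
        return min(area + sum(cost(i) for i in inputs)
                   for area, inputs in matches[u])
    return cost(root)

# Tiny example: w = x AND y, x = a OR b; one cell covers both levels.
matches = {
    "x": [(2.0, ("a", "b"))],                 # OR2 at x
    "w": [(2.0, ("x", "y")),                  # AND2 at w
          (3.5, ("a", "b", "y"))],            # larger cell covering both
}
print(cover_cost("w", matches, frozenset({"a", "b", "y"})))   # 3.5
```

The larger cell wins here (3.5 versus 2.0 + 2.0 = 4.0), illustrating why enumerating deeper clusters can pay off.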
A similar procedure is used to minimize the data-ready time at the root of the subject graph, under the assumption of constant propagation delays. The data-ready time at any vertex is computed by adding the propagation delay of the matching cell to the maximum of the data-ready times at the vertices providing the inputs to the matching cell. When matches exist for multiple clusters, the algorithm again selects the cluster that minimizes the local data-ready time. Unfortunately, this local choice may prevent reaching an optimum solution. Nevertheless, the Boolean covering algorithm still yields optimum solutions for tree or leaf-dag decompositions of the subject graph when the depth of the clusters is unbounded.

Example 10.3.14. We consider once more the control unit of the complete differential equation integrator, described by the optimized logic network of Chapter 8. The result of binding the network to a commercial library with the program CERES is shown next:
.model control
.inputs s1 s2 s3 s4 s5 s6 reset CLK
.outputs s1 s2 s3 s4 s5 s6 s1s3 s2s3 s2s4 s1s2s3 s1s3s4 s2s3s4 s1s5s6 s2s4s5s6 s3s4s5s6
.call NOR2_s1s3 NOR2 (LatchOut_v4, s1 ; s1s3)
.call OR2_s2s4 OR2 (s4, s2 ; s2s4)
.call OR2_s2s3s4 OR2 (s2s4, s3 ; s2s3s4)
.call OR2_s1s2s3 OR2 (s2s3, s1 ; s1s2s3)
.call INV_s1s3s4 INV (s1s3s4 ; zz2)
.call INV_s1s5s6 INV (s1s5s6 ; zz4)
.call AND2_s3s4s5s6 AND2 (LatchOut_v3, s6 ; s3s4s5s6)
.call AND3B_zz1 AND3B (reset, v5, s2 ; zz1)
.call OA21B_s1s3s4 OA21B (LatchOut_v4, s1, s1s3 ; s1s3s4)
.call OR2_s2s4s5s6 OR2 (LatchOut_v3, s2s4 ; s2s4s5s6)
.call DF1_LatchOut_v2 DF1 (v5, CLK ; LatchOut_v2)
.call DF1_LatchOut_v3 DF1 (v5, CLK ; LatchOut_v3)
.call DF1_LatchOut_v4 DF1 (v5, CLK ; LatchOut_v4)
.endmodel control

The network is represented as a module-oriented netlist. Each record corresponds to a cell instance: the first label is an arbitrary instance identifier, the second label denotes the cell type, and the arguments in parentheses denote the names of the nets that are inputs and outputs to the cell. Note that the original signal names of Example 4.6 have been preserved.

10.3.3 Covering Algorithms and Polarity Assignment

In this section we describe the polarity assignment problem in conjunction with the matching problem, because together they impact the cost of an implementation.
The goal of considering the polarity assignment problem together with library binding is to find the best cover regardless of the polarity of the signals. Hence, a cover of lower cost can be found as compared to approaches that disregard the flexibility in choosing the signal polarities.

STRUCTURAL COVERING. In the case of structural covering, the optimal polarity assignment can be achieved by using a clever trick [10, 26]. All connections between base gates are replaced by inverter pairs, which do not affect the network and cell behavior. (Connections to or from inverters do not need to be replaced.) In addition, a fake element is added to the library: an inverter pair whose actual implementation is a direct connection and whose cost is zero. This transformation can be easily applied to both the subject graph and the pattern graphs by replacing selected edges by edge pairs with an internal vertex labeled as an inverter. We assume that input signals are available in both polarities, and this is modeled by adding a zero-cost inverter on all inputs.

The dynamic programming covering algorithm can now take advantage of the existence of both polarities for each signal in the subject graph in the search for a minimum (area or delay) cost solution. Because of the optimality of the covering algorithm, the bound network obtained from an unbound network with inverter pairs has lower (or at most equal) cost than a bound network derived without using inverter pairs. Note again that the optimality is within a tree. It is important that the newly introduced inverters are removed when they do not contribute to lowering the overall cost; heuristics can be used to minimize the inverters across different subject graphs [26]. The only drawback of using the inverter-pair heuristic is a slightly increased computational cost, due to the larger size of the subject and pattern graphs.

Example 10.3.15. Consider the network of Figure 10.7, repeated for convenience in Figure 10.17 (a), and assume that the library cells available are those shown in Figure 10.8. We search for a minimum-area cover and consider the network and the library cells after decomposition into base functions. Figure 10.17 (b) shows the network after the insertion of inverter pairs and Figure 10.17 (c) the subject graph. (Note that inverter pairs should be added to the pattern trees of the last two cells of Figure 10.8.) We assume that input signals are available in both polarities.

Let us visit the subject graph bottom up. The best cover at v1 and v2 is provided by a zero-cost INV gate, because input signals are available in both polarities. The best cover at v3 is provided either by a NAND2 gate (plus two zero-cost INV gates) or by an OR2 gate; the choice depends on the actual cost of the cells. When considering vertex v4, there are three possible covers: (1) an INV gate plus a NAND2 gate (and two zero-cost INV gates); (2) an AND2 gate (and two zero-cost INV gates); (3) a NOR2 gate. And so on. Assume that the cost of a NOR2 cell is 2.5 and the cost of an inverter pair is 0. Then a minimum-area cover can be derived that uses three NOR2 cells, with a total cost of 7.5. Note that the cost is inferior to that of the cover computed without inverter pairs in the earlier minimum-area example, which was 9. Note also that inputs b, c, d have been used with the negative polarity.
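The edge transformation of the trick can be sketched directly on an edge-list representation of the subject graph; the graph encoding here is hypothetical:

```python
# Replace each connection between base gates by a pair of series inverters.
# A zero-cost "inverter pair" library element then lets the covering step
# absorb unused pairs for free.
def insert_inverter_pairs(edges, gate_type):
    """edges: (src, dst) pairs; gate_type: vertex -> 'INV' or a base gate."""
    new_edges, counter = [], 0
    for src, dst in edges:
        if gate_type.get(src) == "INV" or gate_type.get(dst) == "INV":
            new_edges.append((src, dst))  # connections to/from inverters stay
            continue
        a, b = f"inv{counter}", f"inv{counter + 1}"
        counter += 2
        gate_type[a] = gate_type[b] = "INV"
        new_edges += [(src, a), (a, b), (b, dst)]
    return new_edges

edges = [("g1", "g2")]
gates = {"g1": "AND", "g2": "OR"}
print(insert_inverter_pairs(edges, gates))
# [('g1', 'inv0'), ('inv0', 'inv1'), ('inv1', 'g2')]
```

After covering, any inverter pair not absorbed into a cell is bound to the zero-cost fake element, i.e., removed.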
FIGURE 10.17 (a) Original network. (b) Network after the insertion of inverter pairs. (c) Subject tree. (The numbers are vertex identifiers.) (d) Network cover.

BOOLEAN COVERING. In the case of Boolean covering, the polarity assignment problem can be explained with the help of a formalism used to classify Boolean functions. Consider all scalar Boolean functions over the same support set of n variables. Two functions f(x) and g(x) are said to belong to the same NPN class if there are a permutation matrix P and complementation operators Ni, No such that f(x) = No g(P Ni x) is a tautology [15]. The complementation operators specify the possible negation of some of the arguments and of the output. In other words, two functions belong to the same NPN class, and are called NPN-equivalent, when they are equivalent modulo the negation (N) of the inputs, the permutation (P) of the inputs and the negation (N) of the output.

When considering the Boolean covering problem jointly with the polarity assignment, the definition of a Boolean match is extended as follows: two single-output combinational functions f(x) and g(y) (with the same number of support variables) match when they are NPN-equivalent. Note that this extension to Definition 10.3.1 entails the use of the complementation operators that model the freedom in choosing the polarity of the input and output signals.
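NPN equivalence can be tested by brute force over truth tables, enumerating every input permutation (P), input complementation (Ni) and output polarity (No); this is exponential but acceptable for library-cell sizes. A minimal sketch:

```python
# Brute-force NPN-equivalence check over truth tables.
from itertools import permutations, product

def truth_table(f, n):
    return tuple(f(*bits) for bits in product((0, 1), repeat=n))

def npn_equivalent(f, g, n):
    target = truth_table(f, n)
    for perm in permutations(range(n)):
        for flips in product((0, 1), repeat=n):
            for out_flip in (0, 1):
                h = lambda *x: out_flip ^ g(*((x[perm[i]] ^ flips[i])
                                              for i in range(n)))
                if truth_table(h, n) == target:
                    return True
    return False

f = lambda a, b: (a & b) ^ 1      # NAND
g = lambda a, b: (a | b) ^ 1      # NOR: same NPN class as NAND
print(npn_equivalent(f, g, 2))    # True
```

NAND and NOR match here (complement both inputs and the output), whereas EXOR would not, since it belongs to a different NPN class.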
The covering algorithm described in Section 10.3.2 can still be used, provided that the search for a match is extended to consider all input/output polarities and input permutations. As a result of the more general definition of a match, more library cells can match a given cluster function, and vice versa. Because of the extension to the definition of a Boolean match, the search for a match can also be simplified in two ways. First, the polarity information of the unate input variables is irrelevant, and the search can be simplified by complementing the negative unate variables. Second, the search for a match can be reduced to a sequence of equivalence tests while considering the possible polarity assignments to the binate variables [23]. Filters that quickly check necessary conditions for matching and that are based on the unateness and symmetry properties (Section 10.3.2) can still be used to reduce further the number of equivalence checks [23].

Example 10.3.16. Any cluster function f(a, b) in the set {a + b, a + b', a' + b, a' + b', ab, ab', a'b, a'b'} can be matched by the pattern function g(x, y) = x + y (OR2 cell), provided that the search for a match considers all input and output polarities. For example, f = a' + b has a match provided that input a is complemented, and f = a' + b' has a match provided that inputs a and b are complemented. On the other hand, cluster functions in the set {a'b + ab', a'b' + ab} cannot be matched to that pattern function, because they belong to a different NPN class.

Example 10.3.17. Consider the cluster function f1 = a'b + ac and the pattern function g1 = pq + p'r. Let us consider the equivalence tests required to detect a match. Since there is only one binate variable in f1 (i.e., a), the polarity assignments to be considered are those of a, corresponding to the functions f1 = a'b + ac and f2 = ab + a'c, which are compared to g1. It would be wasteful to consider the polarity assignments of all the input variables of f1, because b and c are unate. (Note that a permutation of the unate variables, i.e., exchanging b with c, would lead to the same functions in this case.) Consider now the cluster function f3 = a'b' + ac and the same pattern function. Since the polarity information of the unate variables is irrelevant, the polarity of the unate variable b can be changed, and f3* = a'b + ac can be considered instead of f3; a match of f3* implies a match of f3, provided that input b is complemented.

10.3.4 Concurrent Logic Optimization and Library Binding*

It is customary to perform library-independent logic optimization and library binding as two separate tasks. Multiple-level optimization techniques based on logic transformations do not impose constraints on the local expressions, such as having to match some library element. For example, the extraction of a subexpression is done
regardless of whether this subexpression has a match. It is conjectured that if logic transformations were constrained to provide only valid matches, the solution space would be reduced and the results would be poor. We consider now the possibility of combining logic transformations and library binding in a single step. In particular, we concentrate on the relations between logic simplification (see Section 8.2) and binding. Since simplification takes advantage of the degrees of freedom expressed by don't care conditions, and since the topology of the network changes during the covering stage, the don't care conditions must be computed dynamically.

We consider first the problem of extracting the don't care information of a logic network during binding. In particular, we consider those don't care sets that are specified at the network boundary and those that arise from the network interconnection itself. A sketch of a partially bound network is shown in Figure 10.18. The interconnection of the cells in the bound portion of the network induces a satisfiability don't care (SDC) set (see Section 8.4.1). For example, if vertex va is bound to an OR2 cell with inputs b and c, the assignment a = b + c implies that the variable combinations given by a XOR (b + c) = a'b + a'c + ab'c' can never happen and belong to the SDC set.

Consider now the problem of matching a cluster to a cell using the same local inputs or a subset thereof. The degrees of freedom in matching the corresponding cluster function f are the impossible local input patterns and those whose effect is unobservable at the network outputs. The former are given by the local controllability don't care (CDC) set and the latter by the local observability don't care (ODC) set. (See Section 8.4.) Note that, by the nature of the binding algorithms that progress from the network inputs to the outputs, it is easier to use CDCs than ODCs, because CDCs depend on a bound portion of the network, while ODCs depend on an unbound portion and hence are subject to change. Therefore, ODCs must be either updated after each cell binding or approximated (e.g., by using compatible ODCs).

FIGURE 10.18 Example of a partially bound network.
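The CDC extraction used in the next example can be checked exhaustively on a truth table: the binding x = c' + b + e induces SDC = x XOR (c' + b + e), and the controllability don't cares seen by a cluster are the assignments to its inputs for which the SDC holds under all values of the remaining variables (the consensus). For brevity the sketch projects onto the two variables that actually appear in the result:

```python
# Consensus of the induced SDC with respect to the variables outside the
# cluster support: only the cube x'c' survives.
from itertools import product

VARS = ("x", "a", "b", "c", "e")

def sdc(v):
    return v["x"] ^ ((1 - v["c"]) | v["b"] | v["e"])

def cdc_minterms(support):
    """Assignments to `support` for which SDC holds under ALL other values."""
    others = [v for v in VARS if v not in support]
    result = []
    for sup_bits in product((0, 1), repeat=len(support)):
        assign = dict(zip(support, sup_bits))
        if all(sdc({**assign, **dict(zip(others, rest))})
               for rest in product((0, 1), repeat=len(others))):
            result.append(assign)
    return result

print(cdc_minterms(("x", "c")))   # [{'x': 0, 'c': 0}], i.e. the cube x'c'
```

With x bound to c' + b + e, c = 0 forces x = 1, so the pattern x'c' is indeed impossible.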
To exploit don't care conditions in library binding, the concept of matching needs to be generalized further. Let us consider a cluster function f(x) and a pattern function g(y), with the don't care conditions represented by a function d(x). The pattern function can replace the cluster function if there exists a completely specified function f*(x) that matches g(y) and satisfies f d' <= f* <= f + d. The purpose of using the don't care conditions in library binding is to increase the possible number of matches in the search for a lower cost cover. By using the controllability (and possibly the observability) don't care conditions of the cluster function f while trying to match it to a cell, we combine Boolean simplification with library binding; we search for the locally best match allowed by the degrees of freedom specified by the don't cares. This approach is then analogous to the use of Boolean division, which is now performed concurrently with library binding.

The controllability don't care set expresses the portion of the SDC set, induced by the portion of the network already bound, that is independent of the values of the variables that are not in sup(f); i.e., it is expressed in terms of the variables that are in the support of f(x). It can be computed as the consensus of the SDC set with respect to the variables not in sup(f).

Example 10.3.18. Consider again the subject graph of Figure 10.16. Let us assume that vertex vx has been bound to a three-input OR gate, i.e., x = OR3(c', b, e). Hence, x XOR (c' + b + e) = x'c' + x'b + x'e + xcb'e' belongs to the satisfiability don't care set. Consider now the cluster function f = x(a + c) and its corresponding controllability don't care set, CDC = C_{b,e}(x'c' + x'b + x'e + xcb'e') = x'c'. Indeed, the pattern x'c' cannot appear at the inputs of the cluster represented by f = x(a + c).

Let us consider now the matching of f = x(a + c) with don't care conditions d = x'c'. Equivalently, f d' <= f* <= f + d, or xa + xc <= f* <= xa + xc + x'c'. Whereas f can match g1 = p(q + r), it can also match g2 = qr + q's, which represents a multiplexer. (This is possible because f* = cx + c'a satisfies the bounds and matches g2.) The choice of g2 is preferable to g1 in some libraries, due to the efficient implementation of the multiplexing function with pass transistors (Figure 10.19).

A further extension of this method is to allow the matching cell to use as inputs any output of a bound cell.

FIGURE 10.19 (a) Bound network. (b) Bound network exploiting don't care conditions.
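The interval check of the example can be verified exhaustively; f, d and the candidate implementation f* = cx + c'a are taken from the example above:

```python
# Matching with don't cares: f* must satisfy f.d' <= f* <= f + d.
from itertools import product

f  = lambda x, a, c: x & (a | c)                 # cluster function
d  = lambda x, a, c: (1 - x) & (1 - c)           # don't cares: x'c'
g2 = lambda x, a, c: (c & x) | ((1 - c) & a)     # f* = cx + c'a (mux form)

def fits_interval(fstar, f, d):
    for bits in product((0, 1), repeat=3):
        lower = f(*bits) & (1 - d(*bits))
        upper = f(*bits) | d(*bits)
        if not (lower <= fstar(*bits) <= upper):
            return False
    return True

print(fits_interval(g2, f, d))   # True: the multiplexer match is legal
```

Because f* agrees with f everywhere outside the don't care set, binding the cluster to the multiplexer cell is functionally safe.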
Boolean matching with don't care conditions can be verified using ROBDDs by performing the containment tests f(x) d'(x) <= f*(x) <= f(x) + d(x) while considering all possible polarities and input assignments of y to x. Mailhot et al. [23] proposed a method based on a traversal of a matching compatibility graph, representing all NPN classes of functions of n variables and annotated with the library information. Savoj et al. [28] proposed a Boolean matching method based on ROBDDs that exploits the symmetry information, as an alternative formulation to shorten the search for a match. Both methods have been shown to yield bound networks of superior quality when compared to methods that do not exploit don't care conditions.

The simplest approach to using don't care conditions in library binding is to simplify the cluster functions before matching. This entails the computation of the don't cares induced by the partially bound network and the invocation of a two-level logic minimizer. This approach has a potential pitfall, because minimizers for multiple-level designs target the reduction of the number of literals of a Boolean expression. Whereas such an approach leads to a smaller (and faster) implementation in the case of a design style based on cell generators [4], it may not improve the local area and timing performance in a standard-cell-based or array-based design style. For example, cell libraries exploiting pass transistors may be faster and/or smaller than other gates having fewer literals. Thus, the degrees of freedom provided by don't care sets are best used in selecting an appropriate Boolean match that minimizes area (or delay), rather than in a separate simplification step preceding the matching.

Example 10.3.19. Consider again the network of Example 10.3.18. After binding vertex vx, the CDC set for the cluster function f = x(a + c) is d = x'c'. The simplification of f with this don't care set would leave f unaltered. Thus, f would be bound to a cell modeled by pattern function g1 = p(q + r) and not to a multiplexer. This choice is unfortunate in the cases in which the multiplexer implementation is smaller and faster. The example has just shown that applying Boolean simplification before matching may lead to inferior results.

10.3.5 Testability Properties of Bound Networks

The testability of a network is affected by all logic synthesis and optimization techniques, including library binding. We restrict our attention in this section to the testability of bound combinational networks with stuck-at fault models. We use the definition of testability introduced in Sections 7.2.4 and 8.2, where a fully testable circuit is one that has test patterns which can detect all faults. We assume that the library cells are individually fully testable.

Let us consider first a circuit implementing a subject tree. Tree covering techniques, which replace subtrees by logically equivalent cells, preserve the circuit's testability. This is true both for structural matching and for Boolean matching without the use of don't cares.
Let us now consider multiple-input, multiple-output networks. Consider a logic network decomposed in terms of NAND base functions and partitioned into trees. The following theorem, due to Hachtel et al. [14], gives conditions for full testability for multiple faults.

Theorem 10.3.1. Let the network be bound to library cells in such a way that each tree is replaced by a tree of cells that is fully testable for single stuck-at faults, and let the fanout points of the unbound network match the fanout points of the bound network. Then, if a circuit implementing a trivial binding of the original decomposed network is fully testable for multiple faults, so is a circuit realizing the bound network.

The practical significance of this theorem is the following. If we can design an unbound network which is fully testable (using logic synthesis methods such as those shown in Section 8.5), then the covering methods shown in Sections 10.3.1 and 10.3.2 yield fully testable networks for multiple stuck-at faults when applied while satisfying the assumptions of Theorem 10.3.1.

Lastly, let us consider the testability properties of bound networks constructed using Boolean matching techniques with don't care conditions and non-tree-based decompositions, when the complete controllability don't care set can be computed for each cluster function f. Two extreme cases can be detected while matching a cluster rooted at a vertex, say vy. The first is when the cluster function f is included in the don't care set; the second is when the disjunction (i.e., the union) of the function f and the don't care set is a tautology. In these two cases, vertex vy is associated with either a FALSE or a TRUE value, which is then propagated in the network, and no cell is bound to vy. Otherwise, Boolean matching binds the cluster to a cell only when there exist test patterns at the circuit inputs that can set the output of that cell both to FALSE and to TRUE. Hence, the bound network is fully controllable. If redundancy removal is applied to the bound network to remove unobservable cells, then the resulting network is fully testable for single stuck-at faults, regardless of the properties of the original network.

10.4 SPECIFIC PROBLEMS AND ALGORITHMS FOR LIBRARY BINDING

Standard-cell and mask-programmable gate array libraries can be described well as collections of cells, and so the binding algorithms of Section 10.3 are directly applicable. We consider now other design styles where libraries are represented implicitly, i.e., without enumerating their elements. This is the case for some macro-cell and field-programmable gate array (FPGA) design styles, where special techniques for library binding are required.

There are several types of macro-cell design styles where module generators construct the physical view of a macro-cell from an (optimized) unbound logic network. Some macro-cell generators construct the macro-cell by placing and wiring functional cells, which are similar in principle to standard cells but are not stored in a library. Their physical layout is automatically synthesized from logic expressions. To satisfy area and performance requirements related to the corresponding physical view, the logic expressions must satisfy some constraints. Typical constraints are related to
the maximum number of inputs and/or the maximum number of transistors in series or in parallel in any given cell. Logic expressions and cells that satisfy these functional constraints constitute a virtual library.

Field-programmable gate arrays are pre-wired circuits that are programmed by the users (in the field, after chip fabrication) to perform the desired functions. (See Chapter 1.) Today there are several FPGA providers. Broadly speaking, FPGAs can be classified as either soft or hard programmable. Circuits in the first class are implemented by a programmable interconnection of registers and look-up tables. They can be programmed by loading an on-chip memory that configures the tables and the interconnection. Circuits in the second class consist of an array of programmable logic modules.

Library binding for FPGAs is an important and complex problem. Its difficulty stems from the fact that binding pre-wired circuits is deeply related to physical design issues [27], because of the programmable interconnect technology. Thus iterative improvement techniques have been shown to be important in achieving good-quality solutions. It is hard to present the binding problem in a comprehensive way, and at present this topic is still the subject of ongoing research. We therefore consider here the fundamental problems and solutions at the logic level of abstraction, and we purposely neglect physical design issues, except to explain their relation to the library binding problem. A review of the state of the art is reported in reference [27].

10.4.1 Look-Up Table FPGAs

The virtual library of look-up table FPGAs is represented by all logic functions that can be realized by the tables. Even when considering just those tables implementing single-output functions of a few variables (e.g., 5), enumerating the library cells is not practical.

Example 10.4.1. Let us consider the FPGAs marketed by Xilinx Inc. In the 3000 series,
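The impracticality of explicit enumeration follows directly from the doubly exponential count of table contents; a tiny illustrative check (not part of the original text):

```python
# Each n-input table holds one of 2**(2**n) truth tables, so the virtual
# library of a look-up-table FPGA grows doubly exponentially with n.
for n in range(1, 7):
    print(n, 2 ** (2 ** n))
# n = 5 already gives 4294967296, i.e., about 4 billion distinct functions.
```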
each look-up table implements any single-output function of five variables or any two-output function of five variables. Since FPGA architectures are novel and quickly evolving, we consider here just their major features, without delving into implementation details. In the first class of FPGAs, the look-up tables store the personality of local combinational logic functions; in the second class, each module implements a logic function, and the modules can be personalized and connected by programming the antifuses. In addition, circuits differ widely in their programming technology and implementation style, and path delays are heavily conditioned by wiring: bound networks must be routable, and routability depends on the binding itself. We do not address these issues, as these are beyond the scope of this book and also because they are often dependent on the specific architecture.

Library binding consists of manipulating the logic network until all its expressions satisfy the functional constraints and the network has some desired optimality properties. Berkelaar and Jess [4] proposed a heuristic binding algorithm for this task, where the functional constraints are the maximum number of transistors in series or in parallel in any cell. When the functional constraint is only the maximum number of cell inputs, the problem is similar to the binding problem for FPGAs described in Section 10.4.1.
The organization of the look-up tables differs in various FPGA products. We assume that look-up tables can implement any scalar combinational function of n input variables. When considering n-input look-up tables, there are 2^(2^n) different functions of n variables; for n = 5 this is 2^32, or about 4 billion, possible single-output functions. Hence, the algorithms of Section 10.3 are not applicable to this problem, because the library cannot be enumerated; at the time of writing, this topic is still under investigation. For this reason, we present only the flavor of two approaches [6, 25] and refer the reader to specific articles [17, 22] for the others.

We concentrate here on binding combinational networks, and in particular on the following problem, which is important for all look-up-table-based FPGAs. Given a combinational logic network, binding consists of finding an equivalent logic network with a minimum number of vertices (or minimum critical path delay) such that each vertex is associated with a function implementable by a look-up table. Covering is driven by the principle of packing as much functionality as possible into each look-up table, subject to its input size constraint n. Usually, the starting point for binding is a logic network decomposed into base functions.

The tree covering paradigm has been adapted to this problem by Francis et al. [25], whose heuristic approach to binding involves covering a sequence of subject graphs into which the decomposed network has been partitioned.

Example 10.4.2. Consider the network of Figure 10.20 (a), decomposed so that each component has no more than four variables in its support. Assume that the look-up tables can implement any combinational function of n = 5 inputs. Then the network can be covered by three tables, as shown by Figure 10.20 (b).

FIGURE 10.20 (a) Subject tree. (b) Cover by three five-input look-up tables.
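The flavor of covering a fanout-free subject tree with n-input tables can be sketched in a few lines. This is an illustrative greedy pass, not the algorithm of Francis et al.: `map_tree` and `lut_count` are hypothetical helper names, and trees are encoded as nested tuples whose leaves are variable names.

```python
def map_tree(node, n, luts):
    """Return the input set of the (not yet sealed) table cone rooted at node.
    Leaves are variable names; internal nodes are tuples of children.
    Assumes a tree, so the supports of distinct children are disjoint."""
    if isinstance(node, str):
        return {node}
    child_inputs = [map_tree(child, n, luts) for child in node]
    merged = set().union(*child_inputs)
    # While the cone exceeds n inputs, seal the widest child cone as its own
    # table and replace its variables by the single signal that table drives.
    for inputs in sorted(child_inputs, key=len, reverse=True):
        if len(merged) <= n:
            break
        merged -= inputs
        merged.add('lut%d' % len(luts))
        luts.append(inputs)
    return merged

def lut_count(tree, n):
    """Number of n-input look-up tables used to cover the tree."""
    luts = []
    luts.append(map_tree(tree, n, luts))   # the root cone is the final table
    return len(luts)
```

For a balanced tree of eight inputs built from two-input gates and n = 5, the sketch uses two tables; with n = 2 it degenerates, as expected, to one table per gate.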
Some specialized binding algorithms have been recently proposed. Usually the base functions, such as ANDs and ORs, are required to have at most n inputs, but it is usually convenient to use two-input base functions to achieve a finer network granularity. Each vertex of the bound network must then be associated with a function implementable by a look-up table, or equivalently with a local function having at most n input variables. This leads to the need for solving the following subproblem.
Consider a sum-of-products representation of a single-output function whose product terms have at most n variables. (Recall that each table handles at most n variables.) If the function has at most n inputs, it can be trivially implemented by one table. Otherwise, groups of product terms must be assigned to different tables, and additional tables must be devoted to performing the sum of the partial sums. This problem bears a strong resemblance to the bin packing problem, which consists of packing a given set of objects (here product terms) of bounded size into bins (here tables) of a given capacity. Unfortunately, bin packing does not model the problem precisely, because partitioning the set of product terms into b tables may require more than b look-up tables to implement the function, the additional ones being devoted to performing the sums.

Assume first that the product terms have disjoint support, as in the case corresponding to a tree-like decomposition. The problem can then be solved by modifying a bin packing procedure as follows [25]. The algorithm iteratively selects the product term with the most variables and places it into any table where it fits. If no table has enough capacity, a new table is added to the current solution, containing the selected product term. When all product terms have been assigned to tables, the following steps are iterated. The table with the fewest unused inputs is declared final, and a variable, say z, is associated with it; this variable is then assigned, as a single-literal product term, to the first table that can accept it. The procedure terminates when one table is left; this table yields the desired output. Even though this algorithm is heuristic, it can be shown that the solution has a minimum number of tables when the product terms are disjoint (i.e., for subject trees) and n <= 6.

This algorithm has been implemented in program CHORTLE-CRF [25] with a few extensions. First, subject graphs are not restricted to be trees, and as a consequence product-term pairs may share variables. To cope with this extension, CHORTLE-CRF exhaustively explores all possible assignments of product terms with intersecting support to the same table. The second extension addresses the inefficiencies due to the network partition: CHORTLE-CRF attempts to duplicate product terms in exchange for reducing the total number of look-up tables.
Example 10.4.3. Let the function to be implemented into tables be f = ab + cd, and let the table size be n = 3. Even though the two product terms have 2 < n = 3 variables each, function f has 4 > n = 3 inputs and cannot be implemented by a single table. The algorithm assigns a table to each product term. Without loss of generality, let the first table correspond to product term cd. This table is declared final and a variable, say z, is associated with it; it represents the assignment z := cd implemented by the table. Then the algorithm tries to fit the single-variable product term z into the other table. Since the second table has an unused input, product term z can be added to yield f = ab + z. Note that two tables cover the function. By contrast, the naive decomposition f = f1 + f2, f1 = ab, f2 = cd would use two tables for the product terms plus one additional table devoted to performing their sum, i.e., three tables in total. We refer the interested reader to references [25] and [27] for further details.
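The packing in the example above can be sketched compactly (an illustrative simplification: `lut_or_pack` is a hypothetical name, product terms are given as disjoint variable sets, and the shared-support and duplication extensions of CHORTLE-CRF are omitted):

```python
def lut_or_pack(terms, n):
    """terms: product-term supports (disjoint sets of variables).
    Returns the number of n-input tables used to implement their sum."""
    # Phase 1: first-fit decreasing bin packing of product terms into tables.
    tables = []                                   # used-input count per table
    for term in sorted(terms, key=len, reverse=True):
        for i, used in enumerate(tables):
            if used + len(term) <= n:
                tables[i] += len(term)
                break
        else:
            tables.append(len(term))
    total = len(tables)
    # Phase 2: merge partial sums; the fullest table is sealed, and its output
    # variable is fed, as a one-literal term, into a table with a free input.
    while len(tables) > 1:
        tables.sort()
        tables.pop()                              # seal the fullest table
        if tables[0] + 1 <= n:
            tables[0] += 1                        # reuse a free input
        else:
            tables.append(1)                      # fresh table for the sum
            total += 1
    return total
```

On f = ab + cd with n = 3 the sketch returns 2 tables, matching the example.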
A second remarkable approach to the look-up table binding problem was recently proposed by Cong and Ding [6]. They considered the problem of minimizing the critical path delay in a bound network. Since all tables are alike and display the same propagation delay, this problem is equivalent to minimizing the maximum number of tables on any input/output path. This problem can be solved exactly in polynomial time. The algorithm is based on a transformation of the binding problem into a network flow computation that can be solved exactly by standard algorithms; the transformation is fully detailed in reference [6]. The algorithm assumes that overlapping look-up tables in the cover are not beneficial for minimizing the number of tables and avoids them. Unfortunately, the bound network depends on the particular network decomposition that is used as the starting point for this procedure. This algorithm has been implemented in program FLOWMAP. Recently, the same authors proposed another algorithm that minimizes the number of look-up tables required under a bound on the critical path delay [7], which can be used to obtain area-delay trade-off curves for a given network.

10.4.2 Anti-Fuse-Based FPGAs

The virtual library of antifuse-based FPGAs is represented by all logic functions that can be implemented by personalizing the logic module. The organization of the FPGAs and the type of logic module differ in various products. For this reason, we make a simplifying assumption that defines a fundamental problem common to different antifuse-based FPGAs. Namely, we assume that all programmable modules implement the same single-output combinational function, called the module function. We concentrate also on the binding problem of combinational logic networks. Modules are personalized by programming the antifuses. This is usually achieved by shorting inputs either to a voltage rail or together, e.g., by providing a path from an input to the power rail through an antifuse. Note that a module personalization can be seen as inducing a stuck-at or a bridging fault on its inputs.

Thus, given a combinational logic network, binding consists of finding an equivalent logic network with a minimum number of vertices (or minimum critical path delay) such that each expression can be reduced to a personalization of the module function.

In some cases it is practical to derive the library explicitly, by considering all possible personalizations of the module function, because the size of the library is limited (e.g., fewer than 1000 functions). In this case, standard binding algorithms can be used. This approach has some advantages. First, a library subset can be considered
The flow-based algorithm above assumes that it is given a network decomposed into base functions with no more than n inputs.

Example 10.4.5. Let us consider the FPGAs marketed by Actel Inc. In the Act 1 series, the module implements the function m1 = (s0 + s1)(s2 a + s2' b) + (s0 + s1)'(s3 c + s3' d), while in the Act 2 series it implements the function m2 = (s0 + s1)(s2 s3 a + (s2 s3)' b) + (s0 + s1)'(s2 s3 c + (s2 s3)' d). In both cases, the module is a function of n = 8 inputs. There are about 700 functions that can be derived by programming either module. As an example of programming, function m1 implements the multiplexer s2 a + s2' b by setting s0 = s1 = 1. This is achieved by providing a path from inputs s0 and s1 to the power rail through an antifuse.
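The idea that a personalization assigns each module input to a constant or to a cluster signal can be checked by brute force. The sketch below is illustrative only: a toy two-input multiplexer stands in for the eight-input Actel modules, and `matches`, `mux` and `and2` are hypothetical names.

```python
from itertools import product

def matches(module, n_mod, f, n_vars):
    """Try every personalization: tie each module input to constant 0 or 1,
    or to one of f's variables (code >= 2 denotes variable bits[code - 2]).
    Returns the first assignment whose truth table equals f, else None."""
    choices = list(range(2 + n_vars))          # 0, 1 are stuck-at constants
    for assign in product(choices, repeat=n_mod):
        if all(module(*[c if c < 2 else bits[c - 2] for c in assign]) == f(*bits)
               for bits in product((0, 1), repeat=n_vars)):
            return assign
    return None

# Toy stand-ins (hypothetical): a 2:1 mux module and a 2-input AND cluster.
mux = lambda s, a, b: (s & a) | ((1 - s) & b)
and2 = lambda x, y: x & y
print(matches(mux, 3, and2, 2))   # (2, 3, 0): s = x, a = y, b stuck at 0
```

Note that shorting two module inputs together (bridging) corresponds here to assigning them the same variable code, so the same loop covers both kinds of personalization.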
Specialized algorithms for binding have been proposed, based on structural and Boolean representations [11, 18, 22, 27]. Structural approaches exploit the specific nature of the uncommitted module. Indeed, commercial implementations of FPGAs use programmable modules based on signal multiplexing. In this case, it is convenient to decompose the subject graph by choosing a multiplexer as the base function. The entire library can then be implicitly represented by pattern graphs that use a similar decomposition. Binding can then be done by structural covering using the dynamic programming paradigm [22, 27].

Example 10.4.6. Consider the module function m = s1(s2 a + s2' b) + s1'(s3 c + s3' d). Let us consider, for the sake of simplicity, only personalizations by input stuck-ats. There are four corresponding pattern graphs that represent the functions that m can implement. The pattern graphs are leaf-dags and are shown in Figure 10.21.

When the module differs from a cascade of multiplexers, the number of pattern graphs increases and may include dags that are not leaf-dags. For example, this applies to module function m1 = (s0 + s1)(s2 a + s2' b) + (s0 + s1)'(s3 c + s3' d). By using a restricted set of patterns to represent m1, namely eight leaf-dag patterns, good-quality solutions were achieved using program MIS-PGA [22].

The Boolean covering algorithm of Section 10.3.2 can be combined with a specialized Boolean matching algorithm that detects whether a cluster function can be implemented by personalizing the module function and that determines the personalization at the same time. ROBDD representations can be very useful to visualize and solve this matching problem. In particular, given an order of the variables of the module function and a corresponding ROBDD representation, its cofactors with respect to the first m variables in the order are represented by subgraphs of the ROBDD. These subgraphs are rooted at those vertices reachable from the root of the module ROBDD along edges corresponding to the variables with respect to which the cofactors have been taken, or equivalently to those variables that are stuck at a fixed value by the personalization.

FIGURE 10.21 (a) Module function. (b) Pattern graphs.
Example 10.Split by PDF Splitter that excludes gates with undesirable delays or pin configurations. 271. d ) . The entire library can then be implicitly represented by pattern graphs that use a similar decomposition.21. 18. Then. or equivalently to those variables that are stuck at a fixed value by the personaliza db b (a) (b) FIGURE 10. Indeed. this applies to module function ml = (so + sl)(sza + s. Structural approaches exploit the specific nature of the uncommitted module. We comment here on the general case where the virtual library is so large that it is not practical to enumerate it. (sza + s i b ) + s . the precise area and delay cost of each cell can be established.
Example 10.4.7. Consider the module function m = s1'(s2 a + s2' b) + s1 (s3 c + s3' d) and a cluster function f, shown in Figures 10.22 (a) and (d), respectively. Figure 10.22 (b) shows the ROBDD of m for the variable order (s1, s2, s3, a, b, c, d), and Figure 10.22 (c) shows the ROBDD of f. Since the ROBDD of f is isomorphic to the subgraph of the ROBDD of m rooted in the vertex labeled s3 (which is the right child of s1), the module function can implement f by sticking s1 at 1.

Note that other cluster functions that can be implemented by the module function may have ROBDDs that are not isomorphic to any subgraph of the ROBDD of Figure 10.22 (b). This is due to the fact that a specific variable order has been chosen to construct this ROBDD. Unfortunately, to consider all possible personalizations, all variable orders of the module function and the corresponding ROBDDs must also be considered. This can be done by constructing a shared ROBDD that encapsulates the virtual library corresponding to the module function.

Recently, Burch and Long [5] developed canonical forms for representing functions under input negation and permutation that can be used efficiently for Boolean matching, as well as for matching under a stuck-at constant and/or bridging of some inputs. These forms have been applied to the development of efficient algorithms for library binding of antifuse-based FPGAs [5]. Extensions to cope with personalization by bridging have also been proposed [11].

FIGURE 10.22 (a) Programmable module. (b) Module ROBDD. (c) Cluster ROBDD. (d) Representation of the cluster function.
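The subgraph-isomorphism test above can be phrased over truth tables instead of ROBDDs for small functions (an illustrative check under one fixed variable order; `cofactor_matches` is a hypothetical name):

```python
from itertools import product

def cofactor_matches(m, n_m, f, n_f):
    """Does f equal a cofactor of m obtained by fixing the first n_m - n_f
    variables of m to constants? Returns the constant assignment or None.
    Truth-table analogue of matching f against ROBDD subgraphs of m."""
    k = n_m - n_f
    for fixed in product((0, 1), repeat=k):
        if all(m(*fixed, *rest) == f(*rest)
               for rest in product((0, 1), repeat=n_f)):
            return fixed
    return None

# m(x1, x2, x3) = x1 x2 + x3; its cofactor at x1 = 1 is the OR of x2, x3.
m = lambda x1, x2, x3: (x1 & x2) | x3
print(cofactor_matches(m, 3, lambda a, b: a | b, 2))   # (1,)
```

As in the text, this covers only one variable order; a full matcher would repeat the test over the orders (or use the shared-ROBDD and canonical-form techniques cited above).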
10.5 RULE-BASED LIBRARY BINDING

Rule-based library binding is a widely used alternative and a complement to the algorithmic approach described in the previous sections. Some current commercial and proprietary design systems use rule-based binding, sometimes in conjunction with algorithmic binding [13, 16, 30]. Some of the early logic synthesis systems, such as LSS [8, 9], used rules for both logic optimization and library binding. Since rule-based library binding is similar to rule-based optimization of unbound networks, which was presented in Section 8.7, we shall describe only those issues that are specific to the library binding problem.

In a rule-based system, a network is bound by stepwise refinement. The network undergoes local transformations that preserve its behavior. Each transformation can be seen as the replacement of a subnetwork by an equivalent one that best exploits the cell library. Each entry in the rule database contains a circuit pattern, along with an equivalent replacement pattern in terms of one or more library elements. Entries may represent simple or complex rules. Simple rules propose just the best match for a subnetwork. Complex rules address situations requiring a restructuring of the network. The execution of the rules follows a priority scheme: for each rule, in a given order, the circuit patterns that match the rule are detected and the corresponding replacements are applied.

Example 10.5.1. Consider the rules shown in Figure 10.23. The first rule shows that two cascaded two-input AND gates can be bound to a three-input AND gate. The second rule indicates that a two-input AND gate with an inverted input and output can be bound to a two-input NOR gate with an inverted input. The third rule shows that an AND gate on a critical path with a high load can be replaced by a NAND gate followed by two inverters in parallel; the first inverter drives the critical path and the second the remaining gates. The first two rules can be called simple, the third complex.
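A simple rule like the first one above can be sketched as a bottom-up pattern rewrite on an expression tree. This is illustrative only: the tuple encoding and the `apply_rules` helper are hypothetical, and real systems match arbitrary subnetworks rather than trees.

```python
def apply_rules(node, rules):
    """Single bottom-up pass of greedy rule firing on an expression tree.
    A rule is a (match, rewrite) pair; list order is the priority scheme."""
    if isinstance(node, str):                 # leaf = signal name
        return node
    node = (node[0],) + tuple(apply_rules(c, rules) for c in node[1:])
    for match, rewrite in rules:
        if match(node):
            return rewrite(node)
    return node

# Rule: AND2(AND2(a, b), c) -> AND3(a, b, c); for brevity it matches a
# cascaded AND2 in the left operand only.
merge_and = (
    lambda n: n[0] == 'AND2' and isinstance(n[1], tuple) and n[1][0] == 'AND2',
    lambda n: ('AND3', n[1][1], n[1][2], n[2]),
)

net = ('AND2', ('AND2', 'a', 'b'), 'c')
print(apply_rules(net, [merge_and]))   # ('AND3', 'a', 'b', 'c')
```

Firing rules repeatedly until no pattern matches, and choosing among competing matches, is exactly where the priority schemes and meta-rules discussed in the text come in.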
Restructuring may involve the insertion of buffers or even the duplication of some gates. As an example, a complex rule may be applicable to a cell with a high load, which may require the use of a high-drive cell.

FIGURE 10.23 Two simple transformation rules.

The overall control strategy of rule-based systems for binding is similar to that used for optimizing unbound logic networks, described in Section 8.7. Some rule-based systems, such as LSS [9] and LORES/EX [16], use a greedy
search. Other systems, such as SOCRATES [13], use a more refined search for choosing the transformations, in an attempt to explore the different choices and their consequences before applying a replacement. SOCRATES tries a set of sequences of rules in the search for the best move. Recall from Section 8.7 that the size of this set is the breadth of the search and the length of the sequences is its depth. Meta-rules decide upon the breadth and depth of the search and on the trade-off between area and delay, or between the quality of the solution and the computing time.

10.5.1 Comparisons of Algorithmic and Rule-Based Library Binding

Let us consider first the technical differences and then the overall merit of the two approaches. Whereas covering algorithms bind a subnetwork to a cell only once, a rule-based binder repeatedly fires the rules corresponding to the local best improvement of the network cost according to some metric. By contrast with the prescribed scan order of covering, most rule-based systems execute rules in a given order and replace all circuit patterns that match the rules, either in the entire circuit or in a selectively chosen subcircuit.

Regarding generality, algorithms for library binding have been conceived to only handle single-output combinational logic cells. Rules, instead, can be complex at times, but all cases can in principle be covered.

The overall quality of the solutions is comparable for both approaches. Some binding algorithms can execute in a short time, because of their low computational complexity, but the speed of running a rule-based system varies. The number of rules under consideration and the meta-rules can be tuned so that a desired quality can be achieved with a predictable execution time. On the other hand, the same rule-based system can provide better or worse solutions according to the amount of time allowed to perform a binding and possibly improve upon it; the results often depend on the breadth and depth parameters, which can be varied dynamically by the binder during its execution. Note also that libraries change over time, as cells are added or deleted and library cell parameters are updated when faster fabrication processes become available.
Present algorithms for library binding use a covering approach, where the network is systematically scanned in a prescribed order and covered. By contrast, iterative rebinding and stepwise improvement are supported by rule-based systems. Rules can be thought of for all kinds of library cells without any restriction, whereas extensions of the binding algorithms to other types of cells involve ad hoc methods.

Provable optimality and testability properties can be claimed by some algorithmic approaches, for restricted classes of circuits. It is hard to prove that networks bound by rule-based systems have similar attributes, unless the rule set satisfies some completeness property and the order in which rules are applied follows some particular discipline.

Whereas the library description is straightforward in the case of the algorithmic-based approach, maintaining a rule set is difficult, because rule sets must be continuously updated to reflect any change in the library. As a result, rule databases are large and complex, and programs, often assisted by human experts, are used to compile the rule database.
In summary, both approaches have advantages and disadvantages. Complex design systems for library binding often couple algorithms and rules. Binding algorithms are usually applied to a large portion of the circuit and provide a first solution, which can be improved upon by the application of rules. Iterative binding techniques have been used for performance-oriented binding, where gates along critical paths are repeatedly identified and rebound. Most commercial [13] and proprietary [16, 30] programs use rules combined with algorithms.

10.6 PERSPECTIVES

Library binding is the key link between logic synthesis and the physical design of semicustom circuits. Library binding tools are widely available and successfully used, because they are directly coupled to the circuits' quality. Nevertheless, there are still interesting open research challenges to be solved in the years to come. As in the case of multiple-level circuit optimization of unbound networks, the present techniques leave space for improvements. In particular, binding algorithms are dependent on the initial network decomposition into base functions; it would be highly desirable to develop algorithms whose solutions depend only on the network behavior. Similarly, it would be useful to be able to compute precise lower bounds on the area and/or speed of bound networks, to evaluate the closeness of the solutions provided by heuristic binding algorithms and rule-based binders.

There is a wealth of techniques that are applicable to binding which have not been presented in this chapter for various reasons. Other techniques which have been shown to be promising are applicable to specific subproblems of library binding. For example, spectral analysis of Boolean functions is useful for determining criteria for Boolean matching and for filtering probable matches, and algorithms have been studied for determining the optimal usage and sizing of buffers to drive highly loaded nets. Algorithms and strategies for the rule-based systems of most commercial and proprietary binders are described only in documents with restricted access.

10.7 REFERENCES

Early work on automating library binding was done at IBM as part of the LSS project [8, 9]; library binding in LSS was based on rules. Keutzer [19] proposed the first algorithm for binding that leveraged the work on string matching [1, 2]. He developed program DAGON, which binds logic networks using a dag model and uses the tree-matching program TWIG, developed at AT&T Bell Laboratories, for matching. Detjens et al. [10] and Rudell [26] expanded Keutzer's ideas and implemented the binder of program MIS [10] at the University of California at Berkeley; they introduced the inverter-pair heuristic and developed algorithms for delay optimization. Morrison et al. [24] developed a binding algorithm using expression pattern matching and implemented it in program TECHMAP. Other programs based on structural covering with various flavors are reported in the literature [3, 20, 21]. Boolean matching and covering using don't care conditions was proposed by Mailhot and De Micheli, who developed program CERES. Burch and Long [5] developed canonical forms for Boolean matching. Rule-based binding systems include the one developed by Sato et al. [29] at Fujitsu.

Library binding techniques for FPGAs have been the subject of intensive investigation in recent years, due to the novelty and rapid distribution of FPGA technologies. Francis et al. [25] studied heuristic methods for binding table look-up architectures. Cong and Ding developed the first exact methods for minimum-delay binding [6] and for determining the area-delay trade-off [7].
Savoj et al. [28] extended the use of filters based on symmetries to enhance Boolean matching with don't care conditions.
Detjens. 48&501. 341347. D. ''On OnPmpenies of Algebraic Transformations and the Synthesis of Multifault Irredundant Circuits. Spectral Techniques in Digital Logic. June 1976.. Aho and S. 537545. 1991. SangiovanniVincentelli. 4853. G." DAC. H. Vol. f 19. pp.. 6 J Cone and Y . pp. 1. 262266. pp. 3. K. Johnson. No. pp. 28. 'Bagon: Technology Binding and Laeal Optimiration by Dag Matching. C. Miller and 1. Computers a d Inrronobility. Oguri. Kedem. A. 1991.logy Mspplng Algorilhm fur &lay Optrm!ralton In Lookupt. R. Bergamaschi. Sato." ICCD.thlr. 7. Proceeding. 2W243. Nishizaki. "A Rulebased Reorganization System: Loredex.he Design Automotion Conference. f 10. S." DAC. A. Brand. 1988. 3. "Mapping Pmperties of Multilevel Logic Synthesis Operations. 333340. Ishikawa. 6. D. "Optimal Code Generation for Expression Trees. ~ . Ercolani and G. f 5. loyner and L. Proceedings of the lntemationol Conference on Computer Design. Cannot. Vol. 20. 'Technology Mapping for StandardCell Generators. Y. J. R. pp. pp. K." IEEE Transactions on CADflCAS. Val. Rudell. Hurst. Morrison. 17. f 9. July 1981. pp." DAC. 5. Shenoy. C. pp. New York. 272280. "LSS: Logic Synthesis through Local Transformations. October 1988. S. Karplus. M. "Amap: A Technology Mapper for Selectorbased FieldRogrammable Gate Arrays. E. 13. Y. 1985. 1. CAD10. 313321. R. No. 1987. A good survey of the state of the art is given in reference 1271. Proceedings o the Design Automotion Conference. K. Kazuma and S. W. 1988. 11. pp. 7985. Long. Proceedings o . 4. Proceedings o f the lnrernorionol Conference on Computer Design." I B M Journl o Research ond Development. Murai." DAC. Trevillyan. f 18. Bartlen. 1990. SangiovanniVincentelli and A. 3. Jess. Karplus [17. November 1991. CAD11. pp. No. pp. Cong and Y. March 1992. 14. "An Opllmal Twhn'. "On Area/Depth Tradeoff in LUTbased FPGA Technology Mapping. Murio. f 8."Logic Synthesis f for Programmable Gate Arrays. 408411. 1986. Brayton and A. 236239. 
Proceedings o the Design Auromation Conference. 15. Trevillyan." DAC. "LSS: A System for Pmductian Logic Synthesis:' IBM Journol o Research n d Development. 252256. J. pp. Garey and D. de Geus and G. M. D. F. R. June 1975. K. "Efficient String Matching: An Aid to Bibliogaphic Search. No. Hachtel. 25. Academic Press. 620625. Hachtel. L. 18." ICCAD. Vol. Proceedings o the Internnono1 Conference on Computer Design. Proceedings o the lnterntionol Conference on Computer Aided Design. 47W73."ICCD." Joumal ofACM. Daninger. pp. 1991. 23. 1979. September 1984. pp. Proceedings o the lnterntionol Conference on Computer Aided Design. Brgler and G. 'Technology f Mapping in MIS." ICCAD. 1993. Deng. 257261. S. Johnson.mputrr Aided . Daminger. Jacobi. De Micheli." DAC.of Design. Proceedings o the Design Automation Conference. 1. 244247. Keutrer and C. "Mcmap: A Fast Technology Mapping Procedure for Multif level Loglc Synthes~s. "Efficient Boolean Function Matching. A. B w A m C i A De\ngn. Pnnrrdcn~s thr lntmmr. No. Vol. 1988. Vol. 22. Lega. 21. 16. G.Split by PDF Splitter ELLLIBRARY BINDING 547 minimumdelay binding [6] and for determining the &delay tradeoff 171. Berkelaar and I. Proceedings o the Design Automotion Conference. Karplus. 1992. Murgai. L. Proceedings of& Design Automion Conference. f 12. pp. Proceedings of the Interntlonol Conference on Computer Atded Derign. R. R. " S m t e s : A System for Automatically Synthesizing and Optimizing Combinational Logic." ICCD. No. 181 and independently Murgai er ol. Burch and D." DAC. 1987. I I . Ishida." Communicotiom o ACM. Keutzer. f 2. afthe Design Automation Conference. J. pp. pp. R. London. W. 'Technology Mapping for Elecmcally Pmgrammable Gate Arrays. 21S218. Lisanke. W. pp.onol Confcrmcr on C. A. "Xmap: A Technology Mapper for Tablelookup FieldProgrammable Gate Arrays. M. M." IEEE T r a m t i o m on CADflCAS. R. "SKOL: A System for Logic Synthesis and Technology Mapping. K. [22] studied methads for binding both sob. M. 
Gregory. 1992. 13421355. pp. Cornsick. Berman and L. K.and hardpmgrammable FPGAs." ICCAD. pp. 116119. loyner. Aho and M. Hiramine. Freeman." ICC'AD. 4. N. Dcng. Wang.
10.8 PROBLEMS

1. Consider the simple library of cells of Figure 10.8(a). Derive all pattern trees and corresponding strings for a decomposition into NAND2 and INV base functions. Repeat the exercise by considering NOR2 and INV base functions.
2. Enumerate the different NPN classes of functions of three variables, and show one representative function for each class.
3. Consider a library including the following cells: AND2 with cost 4, OR2 with cost 5 and INV with cost 1. Draw the pattern trees for these cells using NAND2 and INV as base functions.
4. Consider function f = ab' + c'd'b. Determine the subject graph for f using the same base functions (see Problem 3). Find a minimum cost cover of the subject graph using the inverter-pair heuristic. Hint: use the following decomposition: f = NAND2(p, q); p = NAND2(a, b'); q = NAND2(c', r'); r = NAND2(d', b). Is this the best implementation of the given network with the available cells? Is there another decomposition into the same base functions leading to a lower cost solution?
5. Consider the binding of a network implementing the conjunction of 10 variables. Available cells are only two-input AND gates with cost 2, three-input AND gates with cost 3 and four-input AND gates with cost 4. Find an optimum cover of a balanced tree decomposition of the network using two-input AND gates as base functions.
6. Derive a formula that yields the number of distinct decompositions of a function f, implementing the conjunction of n variables, into two-input AND gates. Tabulate the number of distinct decompositions for n = 2, 3, . . .
7. Consider a scalar function of n variables with a symmetry set of cardinality m. How many of the cofactors of f with respect to all variables in the set differ?
8. Consider the library of virtual gates corresponding to static CMOS implementations of single-output functions with at most s transistors in series and p transistors in parallel. What is the size of these libraries for s = p = 1, 2, . . . , 5?
9. Compute the automaton that represents the library.
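The NPN-class count asked for above can be cross-checked by brute force: with three variables there are only 2^8 = 256 truth tables and 3! x 2^3 x 2 = 96 input-permutation/input-negation/output-negation transforms, so the equivalence classes can be enumerated directly. The following sketch is illustrative only (it is not part of the original text); each function is represented as an 8-bit truth-table integer.

```python
from itertools import permutations, product

N = 3
MINTERMS = 1 << N        # 8 input combinations
FUNCS = 1 << MINTERMS    # 256 possible truth tables

def transform(tt, perm, neg, outneg):
    # Truth table of f with inputs permuted by perm, inputs negated
    # where neg[i] == 1, and the output negated when outneg == 1.
    res = 0
    for m in range(MINTERMS):
        bits = [(m >> i) & 1 for i in range(N)]
        src = sum((bits[perm[i]] ^ neg[i]) << i for i in range(N))
        res |= (((tt >> src) & 1) ^ outneg) << m
    return res

def npn_canonical(tt):
    # Canonical representative: the smallest truth table reachable
    # under the 96 NPN transforms.
    return min(transform(tt, p, n, o)
               for p in permutations(range(N))
               for n in product((0, 1), repeat=N)
               for o in (0, 1))

classes = {npn_canonical(tt) for tt in range(FUNCS)}
print(len(classes))  # 14
```

Counting distinct canonical forms yields the well-known result of 14 NPN classes for three-variable functions.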
CHAPTER 11

STATE OF THE ART AND FUTURE TRENDS

I libri non sono fatti per crederci, ma per essere sottoposti a indagine.
Books are not made to be believed, but to be subjected to inquiry.

U. Eco, Il nome della rosa.

11.1 THE STATE OF THE ART IN SYNTHESIS

In the introduction to this book we commented on the importance of computer-aided synthesis and optimization methods for the advancement of the electronic industry. We also presented a concise history of the major breakthroughs in this field in Section 1. In this chapter, we consider the implementation of the ideas presented in this book in current computer-aided design systems and their use for digital circuit design. We refer to synthesis and optimization methods as synthesis for brevity.

We now critically review the use of synthesis techniques in the support of microelectronic circuit design. The success of many ideas in CAD can often be measured by their use in design systems. Some algorithms failed to be applied because they either were not practical
or did not address problems relevant to the design methodologies being used. Some techniques disappeared because they were superseded by newer ones. As in many other fields, algorithms with lower computational complexity or with more powerful heuristics displaced others that executed in longer time or provided lower-quality solutions.

Overall, the impact of synthesis methods on microelectronic design has been extremely positive. Some microelectronic circuits could never have been designed at the required performance levels without the aid of CAD systems. This book has described some of the algorithms that constitute the inner engine of CAD systems. In other words, most of the techniques presented in this book are currently used by CAD systems.

Many synthesis and optimization tools are now commercially available, and most designers of digital circuits use them for at least some part of their designs. Recently, sales of synthesis systems have soared, especially for synthesis tools at the architectural and logic levels. The market growth has been impressive, as shown in Figure 11.1 [7]. Generally speaking, the directions in growth have been driven by trends.

CAD systems are very complex, and their usability depends not only on the algorithms being supported but also on the user interface, database and adaptability to the user environment and needs. Software engineering techniques play a major role in the development of robust and user-friendly CAD systems. Standards in design representations have evolved through the years, and design synthesis tools have followed the evolution by trying to match the standards. The use of specific hardware description languages and design formats is often dictated by company policies or by marketing issues.

FIGURE 11.1 Electronic CAD software market, US $ in billions: system software and integrated circuit software. (Source: VLSI Research Inc.)

In this chapter, we focus on the present state of the art and the likely future directions. We present first a taxonomy of synthesis systems and describe some representative ones. Eventually, we shall consider requirements for electronic circuit and system design in the close and distant future and mention relevant open and unresolved problems. We shall then take a broader view of the problem and put into perspective the growth of circuit synthesis as a fundamental design method.

11.2 SYNTHESIS SYSTEMS

Synthesis systems can be classified according to different criteria. Universities, research centers and industries are equally involved in designing such systems. A first differentiation among synthesis systems can be done by considering their primary goal, i.e., according to whether they are used for commercial circuit production or for research and exploratory reasons.

Production-level synthesis systems are conceived to be used for designing circuits to be manufactured and either marketed as off-the-shelf components or incorporated into electronic systems. Since the reliability of the CAD tools is of primary importance, they are based upon mature (i.e., well-experienced) techniques. They can be classified further as commercial or internal tools. The former are available for sale, while the latter are designed within companies for proprietary use. The difference between internal and commercial tools is beginning to blur, as some companies are now selling CAD programs originally developed for internal use.

Research synthesis systems are designed to explore new ideas in synthesis and optimization. Hence they tend to incorporate the most novel and advanced algorithms, which in turn may not have reached an adequate degree of maturity and stability. They are prototypes of production-level synthesis tools, and they are occasionally used for designing circuits that will be fabricated. Some research synthesis systems developed in universities are available for free or for a nominal charge.

Design systems can also be classified according to the circuit abstraction level they support, namely as architectural, logic and geometrical design tools. Some synthesis systems can claim full top-down synthesis, i.e., support synthesis at all levels. Other systems are limited to particular levels and tasks. In addition, specific synthesis systems have been designed for some application domains such as digital signal processing.

Synthesis systems should be classified according to the quality of the designs they produce, measured in terms of area, performance and testability. While such a figure would be extremely useful, specific comparative data are still not available. On the other hand, indicative measures support the belief that successful systems have targeted either full top-down synthesis in a restricted application domain or a restricted set of synthesis tasks for general circuits.

Users of CAD synthesis systems also evaluate them on the basis of their integration in the current design flow, which varies from site to site according to the circuit technology, implementation style and overall design methodology. Therefore we consider different factors that are useful for evaluating the fitness of a synthesis system for a given design task and for comparing different synthesis systems. The user is often confronted with the problem of mixing and matching tools from
different sources and/or blending them with other tools developed in-house for specific purposes. Thus, an important feature of synthesis systems is the possibility and ease of augmenting them by incorporating user-designed or other programs. Fast and easy access to internal data representation is very important for combining tools efficiently. Standard data representation formats play a key role in tool integration.

CAD frameworks have gained much attention in recent years. Frameworks support the integration of CAD systems and isolated tools by providing guidelines for links to a common user interface, database, data representation formats and intertool communication. The scope of CAD frameworks goes beyond circuit synthesis at all levels, and it includes circuit simulation, verification and testing support. When thinking of the increasingly difficult challenge of designing larger and larger microelectronic circuits, the key role played by CAD tool development, integration and management is immediately recognized.

11.2.1 Production-Level Synthesis Systems

Limited published information is available on details of the algorithms used by production-level synthesis systems. The scarcity of detailed information is due to the attempt to protect proprietary ideas. Therefore we shall restrict our comments to the major features of these systems.

IBM developed the first production-level logic synthesis and optimization system: LSS [4]. LSS was a major success not only for IBM but for the entire field, because it showed the practicality of using synthesis and optimization techniques for large-scale designs. Recently, BOOLEDOZER, a novel implementation of IBM's architectural and logic synthesis system, has been put on the market. Proprietary internal production-level synthesis tools have been developed by several companies, such as AT&T, Fujitsu, Intel, NEC, NTT, Philips and Siemens, among others. Also other companies (e.g., HP, Philips) are selling design systems that were originally developed as internal tools.

Several commercial synthesis tools are now available [13]. Most systems accept circuit specifications in hardware description languages like Verilog or VHDL. For architectural synthesis, specific synthesis policies are mandated to make unambiguous the interpretation of behavioral models. Others provide synthesis from HDL models to the specification of bound networks in standard netlist formats, thus providing a well-defined interface to external physical design tools. Some companies provide physical design tools that are directly coupled to the corresponding logic synthesis and optimization programs. Most ASIC vendors, especially field-programmable gate array suppliers, provide their customers with synthesis tools that are targeted to their libraries.

We report in Table 11.1 a summary of the offerings of some synthesis vendor companies in 1993. This table is only indicative, as these data evolve with time. Commercial CAD systems, their current features and their costs are often described in trade magazines [1, 2].
TABLE 11.1
Some commercial synthesis systems in 1993.

Organization              System                       Main features
Cadence Design Systems    SYNERGY                      Synthesis from VHDL and Verilog; resource sharing; logic synthesis and optimization; library binding.
Compass                   ASIC SYNTHESIZER             Synthesis from VHDL and Verilog; logic synthesis and optimization; library binding.
Dazix/Intergraph          ARCHSYN                      Synthesis from VHDL and Verilog; resource sharing; logic synthesis and optimization; library binding.
Exemplar Logic            CORE                         Logic optimization and binding for FPGAs.
Mentor Graphics           AUTOLOGIC                    Synthesis from VHDL; separate synthesis of data path and control; logic synthesis and optimization; library binding.
Synopsys                  HDL/DESIGN COMPILER,         Synthesis from VHDL and Verilog; resource sharing; control synthesis for loops;
                          DESIGNWARE                   logic synthesis and optimization; library binding.
Viewlogic                 SILCSYN, VIEWSYNTHESIS       Synthesis from VHDL, Verilog and graphical inputs; logic synthesis and optimization; library binding.

11.2.2 Research Synthesis Systems

Several research synthesis systems have been developed, and it is impossible to comment on all of them here. Some specialized books [4, 5] describe these systems in detail, and thus we just summarize their major features in Table 11.2. We describe instead the most salient features of four synthesis systems, which have been selected because they represent archetypes of different research ideas and directions.

THE SYSTEM ARCHITECT'S WORKBENCH. Several programs for architectural synthesis and exploration have been developed at Carnegie-Mellon University for over one decade. The SYSTEM ARCHITECT'S WORKBENCH (or SAW) [14] is a design system that encapsulates some of these architectural synthesis tools. Inputs to this design system are circuit specifications in the ISPS or Verilog languages that are compiled into an intermediate dataflow format called value trace. The value trace can be edited graphically to perform manual operations such as partitioning and expansion of
selected blocks. The workbench includes the following tools. CSTEP is responsible for deriving the hardware control portion: it is based on a list scheduling algorithm which supports resource constraints. EMUCS is a global data allocator that binds resources based on the interconnection cost. BUSSER synthesizes the bus interconnection by optimizing the hardware: it is based on a clique covering algorithm. APARTY, an automatic partitioner, is based on a cluster search. CORAL maintains the correspondence between the behavioral and structural views. SUGAR is a dedicated tool for microprocessor synthesis which recognizes some specific components of a processor (e.g., an instruction decode unit) and takes advantage of these structures in synthesis. The synthesis outcome is a structural circuit representation in terms of a network of hardware resources and a corresponding control unit. All tools are interfaced to each other and share a common data structure and user interface.

TABLE 11.2
Some research synthesis systems in 1993 (FSM = finite-state machine specification; ASO = architectural synthesis and optimization; LSO = logic synthesis and optimization; MG = module generation).

Organization            System              Input             Scope
AT&T                    BRIDGE, CHARM       FDL2              ASO, LSO
Carleton University     HAL                 graph models      ASO
CMU                     SAW                 Verilog, ISPS     ASO
IBM                     HIS                 VHDL              ASO
IMAG                    ASYL                FSM               LSO
IMEC                    CATHEDRAL I-IV      Silage            ASO, MG
Philips                 PHIDEO, PYRAMID     Silage            ASO, MG
Princeton University    PUBSS               FSM               ASO
Stanford University     OLYMPUS             HardwareC         ASO, LSO
U.C. Berkeley           SIS                 FSM, networks     LSO
U.C. Berkeley           HYPER               Silage            ASO
U. Colorado, Boulder    BOLD                VHDL, BDS         LSO
U.C. Irvine             VSS                 VHDL              ASO
U. Karlsruhe/Siemens    CADDY/CALLAS        DSL, VHDL         ASO
USC                     ADAM                SLIDE, DDS        ASO

THE SEQUENTIAL INTERACTIVE SYNTHESIS SYSTEM. The SEQUENTIAL INTERACTIVE SYNTHESIS system (or SIS), developed at the University of California at Berkeley, is a program that supports sequential and combinational logic synthesis. The SIS program evolved from the MULTILEVEL INTERACTIVE SYNTHESIS program (or MIS), which was limited to synthesis and optimization of combinational circuits. MIS supports logic optimization by means of both algebraic and Boolean transformations, including library binding. The MIS program has been very popular and widely distributed. Some commercial tools have drawn ideas from MIS, and some proprietary internal CAD systems have directly incorporated parts of the MIS program. Program SIS has now replaced MIS.

SIS supports all features of MIS as well as sequential logic optimization using either state-based or structural models. The state transition graph can be extracted from a structural representation of the circuit. Specific transformations for sequential circuits include state minimization and encoding as well as retiming to reduce the cycle-time or area. Don't care conditions can also be computed for sequential circuits. SIS supports a variety of transformations among network representations. It uses the ESPRESSO program for Boolean simplification, and it performs library binding by using a structural covering approach. SIS is an interactive program with batch capability. Script files can be used to sequence a set of commands, each related to a set of logic transformations of a given type, often used for optimization of combinational circuits. An example in Chapter 8 reports the rugged script. The output of SIS can be transferred to the OCTTOOLS suite, a package that supports physical design.

THE CATHEDRAL SYNTHESIS SYSTEMS. The Cathedral project was developed at IMEC in connection with the Catholic University of Leuven in Belgium and other partners under the auspices of project Esprit of the European Community. One of the guiding principles of the project is the tailoring of synthesis tools to specific application domains and implementation styles. Therefore, different CATHEDRAL programs have been designed to transform behavioral models of a particular class of designs, namely digital signal processors (DSP), into circuits with particular design styles. The Silage language is used for circuit modeling. CATHEDRAL-I is a hardware compiler for bit-serial digital filters. Typical applications are speech synthesis and analysis, digital audio, modems, etc. CATHEDRAL-II [3, 5] is a synthesis system for DSP applications using concurrent bit-parallel processors on a single chip. CATHEDRAL-II has been recently extended to cope with retargetable code generation. CATHEDRAL-III [1] targets hardwired bit-sliced architectures intended for the implementation of algorithms in the real-time video, image and communication domains. CATHEDRAL-IV is planned for implementing very repetitive algorithms for video processing. Commercial versions of CATHEDRAL-I and CATHEDRAL-II are also available. We describe here CATHEDRAL-II and CATHEDRAL-III because of their relevance in connection with the topics described in this book.

The general design methodology in CATHEDRAL-II is called the meet in the middle strategy. There are two major tasks in the system: architectural synthesis and module generation. Architectural synthesis maps behavioral circuit models into interconnections of instances of primitive modules, such as data paths, memories, I/O units and controllers. The physical layout is achieved by invoking module generators, which can be seen as a library of high-level cells described in a procedural style. Architectural optimization includes the following tasks: system partitioning into processes and protocols, datapath synthesis (i.e., mapping the partition blocks into execution units while minimizing the interconnection busses) and control synthesis based on a microcode style. Datapath synthesis is done with the aid of an architecture knowledge database. Control synthesis is based on a heuristic scheduling algorithm.
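Several of the systems described in this section (e.g., CSTEP in SAW, the control synthesis step of CATHEDRAL-II) schedule operations heuristically under resource constraints. The following minimal list-scheduling sketch illustrates the general technique only; it is not code from any of these systems, and the toy operation set and the successor-count priority heuristic are assumptions.

```python
def list_schedule(ops, deps, resources):
    """Unit-delay list scheduling.
    ops: {name: resource_type}; deps: {name: set of predecessors};
    resources: {resource_type: units available per cycle}.
    Returns {name: start cycle}."""
    done, schedule, cycle = set(), {}, 0
    while len(done) < len(ops):
        # operations whose predecessors have all completed
        ready = [o for o in ops if o not in done and deps.get(o, set()) <= done]
        # priority heuristic: operations with more successors go first
        nsucc = {o: sum(o in deps.get(s, set()) for s in ops) for o in ready}
        ready.sort(key=lambda o: -nsucc[o])
        used, now = {}, []
        for o in ready:
            t = ops[o]
            if used.get(t, 0) < resources[t]:   # respect the resource bound
                used[t] = used.get(t, 0) + 1
                schedule[o] = cycle
                now.append(o)
        done.update(now)
        cycle += 1
    return schedule

# toy dataflow graph: two multiplications feeding a chain of additions
ops = {"m1": "mul", "m2": "mul", "a1": "add", "a2": "add"}
deps = {"a1": {"m1"}, "a2": {"a1", "m2"}}
sched = list_schedule(ops, deps, {"mul": 1, "add": 1})
```

With a single multiplier, the two multiplications are serialized while data dependencies are still honored; the resource bounds model the limited hardware units available in each control step.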
Module generators are designed to be portable across different, but similar, circuit technologies.

CATHEDRAL-III exploits the concept of application-specific units, which are clusters of resources tailored to specific tasks [11]. Architectural synthesis provides operation clustering in the signal-flow graph representation into which the behavioral model is compiled. Clusters identify application-specific units for tasks such as sorting an array, performing a convolution or computing the minimum/maximum. Thus, datapath synthesis is centered on the optimal use of application-specific units. Memory management is provided in the CATHEDRAL-III environment by supporting different storage models and the synthesis of the addresses of the memory arrays where data are held.

THE OLYMPUS SYNTHESIS SYSTEM. The OLYMPUS synthesis system, developed at Stanford University, is a vertically integrated set of tools for the synthesis of digital circuits, with specific support for ASIC design. The OLYMPUS system supports architectural and logic synthesis. Circuits are modeled (at the architectural level) in a hardware description language called HardwareC, which has both procedural and declarative semantics and a C-like syntax [10]. A front-end tool called HERCULES performs parsing and behavioral-level optimization. Optimizing transformations include distributivity and associativity exploitation as well as retiming.

Program HEBE executes architectural optimization. It strives to compute a minimal-area implementation subject to performance requirements, modeled as relative timing constraints. HEBE applies the relative scheduling algorithm after having bound resources to operations. Binding and scheduling are iterated until a valid solution is found. If a valid schedule cannot be found that satisfies the timing constraints, a new resource binding is tried, unless HEBE determines that the constraints cannot be met and need to be relaxed. Details are reported in reference [10].

Program CERES performs library binding using a Boolean matching approach. Program MERCURY supports some logic transformations, an interface to the SIS program and gate-level simulation. Programs THESEUS and VENUS provide a waveform and a sequencing graph display facility, respectively. Logic synthesis and library binding are used in OLYMPUS for estimating area and delay of application-specific logic blocks during architectural synthesis, as well as for generating the system's output in terms of a hierarchical logic network bound to library elements. OLYMPUS does not support physical design, but it provides netlist translation into a few standard formats.

11.2.3 Achievements and Unresolved Issues

The major achievements of synthesis and optimization techniques are an improvement in the quality of circuit implementations (smaller, faster, more testable) and a reduction in their design time. The continuous increase in complexity and improvement in performance of microelectronic circuits could not have been achieved without the use of automated synthesis systems. These two factors are so important for circuit design that synthesis systems have become pervasive.
Several authors described examples of VLSI circuits that have been synthesized in full from architectural models. Thomas and Fuhrman [3] reported on the industrial use of the SYSTEM ARCHITECT'S WORKBENCH in connection with a commercial physical design tool for the design of applications for the automotive market. Similarly, Nakamura [3] described the full synthesis of a 32-bit RISC processor (called FDDP because it was designed in four days) and of a very long instruction word (VLIW) vector processor for DSP. Both designs were achieved with the PARTHENON synthesis system developed at NTT and required about 14,000 and 400,000 transistors, respectively. Application-specific circuits for the consumer industry, such as compact-disk controllers and interfaces, have been fully synthesized by several systems, e.g., CATHEDRAL-II, PYRAMID and OLYMPUS. Today, many examples of applications of logic synthesis and optimization to chip design, including popular processors, have been reported.

Hurdles have been encountered in applying synthesis techniques to the engineering design flow. To gain acceptance, synthesis systems had to support all (or almost all) circuit features that handcrafted designs have. Designers had to be educated to think of circuits in a more abstract way and to rely on the tools for decisions they used to make themselves. There are still some problems that limit the use of synthesis systems and that are due to the lack of maturity of this field. We believe that these problems will be overcome with time due to their technical, rather than fundamental, nature.

Other difficulties are intrinsic to the nature of the design problem. Even though heuristic algorithms are often used to cope with the computational complexity of many synthesis tasks, circuit designs can always be found that are too large for existing design systems. Moreover, due to the heuristic nature of the algorithms, there are also pathological circuit examples where optimization leads to poor results.

Today, the use of architectural synthesis is still limited by a few factors. A synthesis path that supports all features of handcrafted designs is not available yet. Some specific design tasks have not been satisfactorily addressed by automated synthesis methods. An example is logic optimization of datapath units. Most logic optimization algorithms have been devised for coping with sparse or control logic and are unable to exploit the special structure of arithmetic functions that is key to their optimization. This has limited the use of some optimization techniques to portions of VLSI circuits. The synthesis of efficient pipelined circuits is supported only by a few synthesis tools and to a limited extent. Moreover, the design of efficient data storage in memory hierarchies is crucial for the design of some circuits, including arithmetic functions. Unfortunately, synthesis and optimization methods are needed the most where the circuit scale is so large that human design is unlikely to be effective.

11.3 THE GROWTH OF SYNTHESIS IN THE NEAR AND DISTANT FUTURE

The ideas presented in this book are typical of an evolving science. Computer-aided design methods started as applications of algorithms to circuit design. Today, CAD techniques have matured and acquired the strength of an independent discipline. This
is corroborated by the fact that the CAD research and development community has grown in size through the years. Synthesis of digital circuits is part of design science. Design science encodes the engineering design flow in a rigorous framework and allows us to reason formally about design problems and their solutions. Architectural and logic synthesis have been revolutionary, because they changed the way in which we reason about circuits and address their optimization. Describing and synthesizing circuits from HDL models instead of using gate-level schematics can be paralleled to the replacement of assembly languages with high-level programming languages in the software domain.

Improvements in science are either evolutionary or revolutionary. The former consists of compounding little steps, each related to perfecting the solutions to some problems. The latter involves radical changes in the way in which problems are modeled and solved. An evolutionary growth of synthesis is likely in the near future. Many techniques need to be perfected, and many subproblems, which were originally considered of marginal interest, need now to be solved. In particular, the evolutionary growth will involve the horizontal extension of present techniques to less conventional design styles and circuit technologies as well as the full integration of present synthesis methods.

Examples of the horizontal growth of logic synthesis are the application of library binding to novel field-programmable gate arrays or to circuit families with specific connection rules, such as emitter-coupled logic (ECL) circuits (which support dotting). The horizontal growth of architectural synthesis can be related to extending its application domains. To date, architectural synthesis has been most used for signal processing circuits. We expect the growth of its use in the application-specific circuit and instruction-set processor domains.

Whereas synthesis and optimization of the geometric features of integrated circuits is a mature field, the integration of logic and architectural synthesis techniques with physical design is still only a partially solved problem. The integration of different synthesis tasks is not simple. Because architectural and logic synthesis are performed before physical design, logic and architectural synthesis programs will need to access accurate information of the physical layout, thus affecting cell selection (in library binding) and resource selection and sharing. Estimation techniques have been used for this purpose, but their level of accuracy needs to be raised to cope with forthcoming circuits with an increasingly higher level of integration. In the future the importance of this problem will grow as circuit density increases and wiring delays tend to dominate.

The coupling of synthesis and verification for circuit design will become common practice in the future. As circuits become more and more complex, validation of properties by means of formal verification methods becomes even more important. Indeed, even though synthesis and optimization algorithms have guaranteed properties of correctness, their software implementation may have bugs. By the same token, implementation verification by comparing synthesized representations at different levels will be relevant to ensure that the circuit has no flaws introduced by the synthesis process.

Revolutionary changes in synthesis methods will be required when considering circuits with both synchronous and asynchronous operation modes as well as with analog components. At present, isolated synthesis and optimization algorithms exist for some asynchronous styles and for some analog components, but design systems in these domains are not yet available. The integration of synthesis techniques for synchronous, asynchronous and analog components is a major challenge for the years to come. In the longer term, the vertical growth of synthesis methods will lead to the extension of the scope of synthesis beyond the design of integrated circuits. Synthesis of composite electrical (and electromechanical) systems, possibly involving both hardware and software components, will be a forthcoming major revolutionary conquest in the sphere of CAD, and it will change the practice of engineering jobs in several sectors.

11.3.1 System-Level Synthesis

System-level design is a broad term, as different meanings can be associated with the word system. It is customary to refer to computers as information systems. The scale of systems may vary widely. Consider, for example, a laptop computer and a distributed computing environment, or a telephone and a telephone switching network. To be concrete, we consider systems that are single physical objects and that have an electrical component with, possibly, interfaces and/or a mechanical component. Computer-aided design of electromechanical systems is a subject of ongoing research, but synthesis of such systems is still far on the horizon. Thus, we shall consider here issues related to the synthesis of the electrical component of a system, which can be thought of as an interconnection of integrated circuits. We shall consider synthesis at different levels of abstraction.

System-level synthesis is a challenging field still in its infancy, because different system components may be heterogeneous in nature. Consider, for example, the specification of a cellular telephone, handling both digital and analog signals at different frequencies. Due to the wide disparity of design paradigms, objective functions and constraints, new modeling, synthesis and optimization methods will be necessary, as well as the support for the concurrent design of heterogeneous chip sets. Computer-aided design of multiple-chip systems involves several tasks, including system specification, validation and synthesis.

System specification may be multiform, including hardware description languages, circuit schematics and constraints. Distinct functional requirements can be best described with different modeling paradigms. System validation may be solved (at least in part) by mixed-mode simulation. Whereas many mixed-mode simulators are available on the market, few address the real problems of modeling heterogeneous systems. Among these, PTOLEMY is a research design environment and simulator [9] for signal processing and communication-system design that provides a means for heterogeneous co-specification by supporting several modeling styles, including dataflow, discrete-event and user-defined models.

On the other hand, the physical design of electronic systems has evolved as the physical means of composing systems have progressed.
The physical design of electronic systems has evolved as the physical means of composing systems have progressed. Whereas placement and routing tools for electronic boards have been available for a long time, multiple-chip carriers provide a means of connecting efficiently several integrated circuits, thus motivating the development of specialized tools. A well-known example of a multiple-chip carrier is IBM's thermal conduction module (TCM). Important issues for multiple-chip physical design are wiring delay estimation and performance-oriented placement and routing. As a result, several research and some production-level tools have been developed for solving these problems.

Logic design of electronic systems must address data communication and clocking problems. Communication among the various system components must satisfy protocol requirements and possibly different data rates. Systems may have synchronous and asynchronous components, and synchronous subsystems may operate on different clocks. At present, system-level logic design is a very challenging task which is performed by experienced designers, due to the lack of CAD tools. With the increasing complexity of system design, system-level logic design will become prohibitively time-consuming and risky in the future.

A major task in the architectural synthesis of systems is defining a partition of the system's function over components (i.e., integrated circuits) which can be seen as system resources. Even though this problem seems to be an extension of architectural synthesis methods from the circuit to the system level, the cost functions and constraints are different. In system design, it is often convenient to leverage the use of components that are already available on the market (as off-the-shelf parts) or in-house (from previous system designs). Thus synthesis methods must support the use of predesigned components, which impose constraints on the synthesis of the remaining parts. The overall cost of an electronic system depends on the cost of designing and manufacturing some components and on the actual cost of the available components. Hence system-level partitioning can heavily affect the cost of the system as well as its performance. Since architectural-level decisions may affect strongly the cost/performance tradeoff of a system, CAD tools for this task, though not yet available, will be very important.

System-level design is not confined to hardware. Indeed, most digital systems consist of a hardware component and software programs which execute on the hardware platform. An important problem for architectural-level synthesis of any composite digital system is to find the appropriate balance between hardware and software. This problem falls into the domain of hardware/software codesign, described in the next section.

11.3.2 Hardware-Software Co-Design

The hardware/software codesign problem is not new, but it has been receiving more and more attention in recent years, due to the search for computer-aided design tools for effective codesign. Computer designers have exploited the synergism between hardware and software for many years, while defining hardware architectures and providing architectural support for operating systems. A given system can deliver higher performance when the hardware design is tuned to its software applications and vice versa.
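The hardware/software balance discussed above can be illustrated with a small greedy partitioning sketch. The cost model is hypothetical (purely additive latencies and per-task hardware areas) and far simpler than what real system-level partitioners use; it only shows the flavor of trading hardware area for latency:

```python
def partition(tasks, latency_budget):
    """Greedy hardware/software partitioning sketch.

    tasks: dict mapping a task name to (sw_time, hw_time, hw_area).
    All tasks start in software; the task with the best speedup per
    unit of hardware area is moved to hardware until the (simplistic,
    additive) latency budget is met.  Returns (hw_set, area, latency)
    or None when the budget cannot be reached.
    """
    hw = set()

    def latency():
        return sum(t[1] if n in hw else t[0] for n, t in tasks.items())

    area = 0
    while latency() > latency_budget:
        candidates = [
            (n, (t[0] - t[1]) / t[2])          # speedup per unit area
            for n, t in tasks.items()
            if n not in hw and t[0] > t[1]
        ]
        if not candidates:
            return None                        # budget not achievable
        best = max(candidates, key=lambda c: c[1])[0]
        hw.add(best)
        area += tasks[best][2]
    return hw, area, latency()

# Hypothetical tasks: (software time, hardware time, hardware area).
tasks = {"fft": (50, 10, 4), "ctrl": (5, 4, 2), "codec": (30, 5, 5)}
print(partition(tasks, 30))  # moves 'fft', then 'codec', to hardware
```

A production formulation would account for communication costs between the two partitions and would typically be solved by more robust search methods than this one-pass greedy rule.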
There are several reasons for using mixed hardware/software solutions in a system design. First, the development of a product may be eased by supporting prototypes where most (if not all) functionality is assigned to the software component, reducing (or avoiding) circuit fabrication in the prototyping stage. Second, the evolution of a system product may be better supported by allowing the software programs to undergo upgrades in subsequent releases of the system. Furthermore, we can search for implementations where the performance provided by customized hardware units can balance the programmability of the software components. By the same token, CAD tools can explore the hardware/software tradeoff and suggest an implementation which meets the needs of the design problem. In this case, performance estimation is not trivial and requires appropriate models for both hardware and software.

Several problems are encompassed by hardware/software codesign. We comment here on those related to general-purpose computing and to dedicated computing and control. The design of instruction-set processors falls in the former class, while the design of embedded controllers is an example of the latter.

Relevant hardware/software codesign problems for instruction-set processors are cache and pipeline design. The design and sizing of a memory cache require a match between circuit performance and the updating algorithm and its parameters. The overall processor performance is affected by the choice. Most cache designs are based on validating the design assumptions through simulation with specialized tools. No tools are yet available for the full automatic synthesis of caches and of the related updating algorithms.

The design and control of a pipeline in a processor requires removing pipeline hazards. Hardware or software techniques can be used for this purpose. An example of a hardware mechanism is flushing the pipe, while a typical software solution is reordering the instructions or inserting No-Operations. Computing the most effective number of pipe stages for a given architecture is thus a hardware/software codesign problem. The PIPER synthesis program is an example of a codesign tool that addresses this problem [8]. It provides pipe-stage partitioning and pipeline scheduling and determines the appropriate instruction reorder that the corresponding back-end compiler should use to avoid hazards.

We consider now embedded systems, which are computing and control systems dedicated to an application (Figure 11.2). The most restrictive view of an embedded system is a microcontroller or a processor running a fixed program. This model can be broadened to a processor, assisted by application-specific hardware and memory, that performs a dedicated function. Sensors and actuators allow the system to communicate with the environment. Embedded systems often fall into the class of reactive systems: they are meant to react to the environment by executing functions in response to specific input stimuli. In some cases, their functions must execute within predefined time windows; hence they are called real-time systems. Examples of reactive real-time systems are pervasive in the automotive field (e.g., engine combustion control), in the manufacturing industry (e.g., robot controllers) and in the consumer and telecommunication industries (e.g., portable telephones).

FIGURE 11.2 Embedded system: a simplified structural view.

Computer-aided synthesis of embedded systems, called here cosynthesis, is the natural evolution of existing hardware architectural synthesis methods. A working hypothesis for cosynthesis is that the overall system can be modeled consistently and be partitioned, either manually or automatically, into a hardware and a software component. The hardware component can be implemented by application-specific circuits using existing hardware synthesis tools. In some cases, the software component can be generated automatically to implement the function to which the processor is dedicated. Cosynthesis must support a means for interfacing and synchronizing the functions implemented in the hardware and software components.

The overall system cost and performance are affected by its partition into hardware and software components. At one end of the spectrum, hardware solutions may provide higher performance by supporting parallel execution of operations, at the expense of requiring the fabrication of one or more ASICs. At the other end of the spectrum, software solutions may run on high-performing processors available at low cost due to high-volume production. Nevertheless, operation serialization and lack of specific support for some tasks may result in a loss of performance. Thus a system design for a given market may find its cost-effective implementation by splitting its functions between hardware and software. At present, the overall CAD support for cosynthesis is primitive. Nevertheless, the potential payoffs make it an attractive area for further research and development.
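The software remedy for pipeline hazards mentioned earlier, inserting No-Operations, can be sketched for a toy pipeline without forwarding. The three-register instruction format and the hazard distance are illustrative assumptions, not tied to any particular processor:

```python
def insert_nops(program, hazard_distance=2):
    """Insert NOPs so that no instruction reads a register written by one
    of the previous `hazard_distance` instructions (a toy pipeline model
    without forwarding).  Each instruction is a tuple (dest, src1, src2);
    a NOP is represented by None."""
    scheduled = []
    for instr in program:
        dest, *srcs = instr
        # Registers written in the last `hazard_distance` issue slots.
        recent = {i[0] for i in scheduled[-hazard_distance:] if i is not None}
        while recent & set(srcs):
            scheduled.append(None)             # insert a NOP
            recent = {i[0] for i in scheduled[-hazard_distance:] if i is not None}
        scheduled.append(instr)
    return scheduled

prog = [("r1", "r2", "r3"),   # r1 <- r2 op r3
        ("r4", "r1", "r5"),   # reads r1 just written: data hazard
        ("r6", "r2", "r3")]
out = insert_nops(prog)
print([("nop" if i is None else i[0]) for i in out])  # ['r1', 'nop', 'nop', 'r4', 'r6']
```

Reordering independent instructions into the NOP slots, as a back-end compiler would, recovers the cycles that this naive insertion wastes.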
Several open problems still impede the rapid growth of the field. First and foremost, there exists a need to define better abstract models for hardware/software systems and to develop consistent languages to express them. Possible solutions range from the extension of existing hardware and software languages to the use of new heterogeneous paradigms. The problem is complicated by the remoteness of the abstract system models from the physical implementation. Second, cost and performance evaluation of mixed systems play an important role in driving partitioning and synthesis decisions. Last, but not least, methods for validating hardware/software systems are very important for large system design. Cosimulation provides a simple way of tracing the input/output (and internal) system behavior. However, cosimulation may provide insufficient evidence to prove some desired system property. Extending formal verification techniques to the hardware/software domain would thus be desirable.

11.4 ENVOY

Research and development in the CAD field has progressed tremendously in the last three decades. Computer-aided synthesis of digital circuits has become a scientific discipline attracting a large number of researchers, developers and users. The CAD industry, and in particular the digital design sector, has grown in size and occupies an important place in the overall electronic market.

The limit to our engineering design capabilities lies in the instruments we have. In the case of electronic design, CAD is one of the keys to the future evolution. The knowledge accumulated in this field to date should stimulate us to think, as Socrates suggested, of all other important design problems that are not solved yet.

This book could only mention some of the relevant problems in the synthesis field. Nevertheless, we hope it has raised enough interest in the reader to motivate him or her to search for additional information in the referenced articles and books and to follow the evolution of this discipline in the regular conferences and scientific journals.

11.5 REFERENCES

Some research synthesis systems are described in detail in the books edited by Gajski [5] and by Camposano and Wolf [3]. Walker and Camposano [15] summarize the major features of several architectural synthesis systems in a specialized book. The SYSTEM ARCHITECT'S WORKBENCH is described in a dedicated book [14]. Information about the SIS system can be found in reference [12] and about Olympus in Ku's [10] and in Camposano and Wolf's [3] books. A good reference on system design is Gajski's book [5], while a description of some more recent optimization algorithms used in CATHEDRAL-II and in CATHEDRAL-III is reported in reference [11]. The areas of system-level synthesis and hardware/software codesign are fairly new, and only a few published papers report on the recent developments [9].

1. "1991: What Will It Mean for ASICs, EDA?," ASIC Technology & News, January 1991.
2. B. Arnold, "Turning to HDLs for Fast Design Relief," ASIC & EDA, June 1993.
3. R. Camposano and W. Wolf, Editors, High-Level VLSI Synthesis, Kluwer Academic Publishers, Boston, MA, 1991.
4. J. Darringer, D. Brand, W. Joyner and L. Trevillyan, "LSS: A System for Production Logic Synthesis," IBM Journal of Research and Development, Vol. 28, No. 5, pp. 537-545, September 1984.
5. D. Gajski, Editor, Silicon Compilation, Addison-Wesley, Reading, MA, 1987.
6. R. Gupta and G. De Micheli, "Hardware-Software Cosynthesis for Digital Systems," IEEE Design & Test of Computers, Vol. 10, No. 3, pp. 29-41, September 1993.
7. "Combined Hardware Selection and Pipelining in High-Level Performance Data-Path Design," IEEE Transactions on CAD/ICAS, pp. 413-423, April 1992.
8. I.-J. Huang and A. Despain, "High-Level Synthesis of Pipelined Instruction Set Processors and Back-End Compilers," Proceedings of the Design Automation Conference, 1992.
9. A. Kalavade and E. Lee, "A Hardware-Software Codesign Methodology for DSP Applications," IEEE Design & Test of Computers, Vol. 10, No. 3, pp. 16-28, September 1993.
10. D. Ku and G. De Micheli, High-Level Synthesis of ASICs under Timing and Synchronization Constraints, Kluwer Academic Publishers, Boston, MA, 1992.
11. S. Note, F. Catthoor, G. Goossens and H. De Man, "Architecture-Driven Synthesis Techniques for VLSI Implementation of DSP Algorithms," Proceedings of the IEEE, Vol. 78, No. 2, February 1990.
12. E. Sentovich, K. Singh, C. Moon, H. Savoj, R. Brayton and A. Sangiovanni-Vincentelli, "Sequential Circuit Design Using Synthesis and Optimization," Proceedings of the International Conference on Computer Design, pp. 328-333, 1992.
13. Smith, "More Logic Synthesis for ASICs," IEEE Spectrum, pp. 44-48, November 1992.
14. D. Thomas, E. Lagnese, R. Walker, J. Nestor, J. Rajan and R. Blackburn, Algorithmic and Register-Transfer Level Synthesis: The System Architect's Workbench, Kluwer Academic Publishers, Boston, MA, 1990.
15. R. Walker and R. Camposano, A Survey of High-Level Synthesis Systems, Kluwer Academic Publishers, Boston, MA, 1991.
INDEX

AFAP - As Fast As Possible (scheduling)
AHPL - A Hardware Programming Language
ALAP - As Late As Possible (scheduling)
ALU - Arithmetic Logic Unit
APL - A Programming Language
ASAP - As Soon As Possible (scheduling)
ASIC - Application-Specific Integrated Circuit
ATPG - Automatic Test Pattern Generation
BDD - Binary Decision Diagram
BiCMOS - Bipolar Complementary Metal Oxide Semiconductor
BIST - Built-In Self-Test
CAD - Computer-Aided Design
CDFG - Control/Data-Flow Graph
CMOS - Complementary Metal Oxide Semiconductor
DSP - Digital Signal Processing
ECL - Emitter-Coupled Logic
FPGA - Field-Programmable Gate Array
HDL - Hardware Description Language
ILP - Integer Linear Program
ITE - If-Then-Else
LSI - Large-Scale Integration
LSS - Logic Synthesis System
LSSD - Level-Sensitive Scan Design
MIS - Multiple-level Interactive System
MPGA - Mask-Programmable Gate Array
OBDD - Ordered Binary Decision Diagram
ODC - Observability Don't Care (sets)
PLA - Programmable Logic Array
RAM - Random Access Memory
RISC - Reduced Instruction Set Computer
ROBDD - Reduced Ordered Binary Decision Diagram
SAW - System Architect's Workbench
SIS - Sequential Interactive Synthesis
SOS - Silicon on Sapphire
TCM - Thermal Conduction Module
TTL - Transistor-Transistor Logic
UDL/I - Unified Design Language for Integrated circuits
VHDL - VHSIC Hardware Description Language
VHSIC - Very High Speed Integrated Circuits
VLIW - Very Long Instruction Word
VLSI - Very Large Scale Integration
YLE - Yorktown Logic Editor
ZOLP - Zero-One Linear Program
.
100102 model. 433 input mival times. 561 heuristic algorithms (see also approximation algorithm). 13. 515. 103107 suuctural. 155. of controllability don't core sets. integrated circuits. 189. 4 interface constraints in archtectural synthesis. 252 Kronsker delta function. 91. 174 undirected. 4 W 9 5 incidence matrix. of an algebraic expression. 446 Hu's algorithm. 310 irredundant operator. 446 off set. 151. 315. 85 for reduced order binary decision diagram. 415 knobs and gauges approach.. 512. 257 weighted compatilibity. 194. 446 ioputloutput modeling. 144 invalidsequential untestable faults. 42 hardware description language (HDL). 41 subgraphs. 2+26. 12S126 data dependent. 4647. 215. 312. 63 planar. 8 k 8 2 image computation. 167 hierarchical sequencing graphs. 233. 40. 498 iteration consmctr. 172 in hierarchical sequencing graphs. behavioral. 529. 375. 530 perfect. 555 . 498 inverter minimization problem. 167. 270 input sequence. inference procedure. 32 775 implicant. 349. 170 canstrained minimumresource scheduling. 230 resource conflict. 102103 HardwareC. 5153. 272 on set. 73. 202. 284 hierarchical grapha. 201 of schedules. 154. 115 incompletely specified Bwlean functions. 203 latency.415 halting problem. 233 interface resources for architectural synthesis. 167.432 input encoding. 365. 385 implication clause. 68 hypergraghs. 145 internal controllability don't core sets. 258 scheduled sequencing. 147. 197 labeled hypergraph. 484 intersection of two implicants. computeraided. 291 interval graph. 38 stability number. 171 Ku algorithms. 31k311. 38k389 internal observability don'l care conditions. 67 hypercube. 375 levels of kernels. Hercules. 60. 53. 39 register compatibility. 367 Kernighan and Lin partitioning algorithm. 255. 413 i d u n d a n t cover of a function. 277.Split by PDF Splitter INDEX 573 nonhierarchical sequencing. sequencing. 126. 88 implementation constraints in architectural synthesis. 153. 146. 115. 59. 203. 233. 274 . of dataRow gpphs. 236. 38. 
203, 40, 246
pattern, 186, 142, 100
hardware languages, 218, 203
latency-constrained scheduling, 277
incidence structures, 100, 242, 37, 159, 474
integer linear program (ILP), 284, 259
high-level synthesis (see architectural synthesis)
Hopcroft's method, 38, 519, 98, 188
latency/cycle-time tradeoff, 100
hardware/software codesign, 228
simple, 198, 274, 199, 61
greedy algorithm, 455
subject, 145
implementation verification, 130, 65, 338
irredundant relevant anchor set, 194
isomorphic graphs, 272, 21, 530
unbounded-latency, 100
hardware languages, application specific, 246
graph coloring, 186
of sequencing graph models, 121, 171
ROBDD manipulation, 68, 63, 230, 37-39, 505
heuristic minimization of a cover, 78
HDL/Design Compiler, 241
resource compatibility, 475
leaf module, 272
incompletely specified finite-state machines, 41, 45, 552
features, 116, 373, 306
implicit traversal, 206
Huntington's postulates, of a sequencing graph, 369
intersections, 231
scheduled and bound sequencing graphs, 39
isomorph-sequential untestable faults, 254
unscheduled sequencing graph, 186, 272, 371, 278
ITE Boolean operator, 151
labeling, 239
vertices, 458
heterogeneous system components, of circuits, 366
extraction, 222, 511, 177, 61, 562
hash table, 153
intractable problems, 319-327
input plane, 203, 87, 236, 218
kernel, 153, 274
344
lifetimes of dataflow graphs, 243
memory map, concurrent with library binding, 43, 50, 191, 135
loop folding, 126
matching patterns, characteristics, 103, 356, array-based circuits, degrees of freedom, 9
mapped network, 199, 311
minimum latency, 376-380
minimum canonical cover of, 512
linear and integer problem algorithms, 318, 53, 474
levels of kernels, 126, 383, 430
by retiming, 528
minimum canonical cover, 304
logic optimization algorithms, 513
matching, 86, 69, 427-430
minimization of Boolean functions, 5
Lex, 304, 51
maximal clique, 194, 159
minimum state, 126
Logic Synthesis System (LSS), 116, 201, 476
Leiserson's model, 411, 78, 56
lookup table FPGAs, 38, 17, 411, 220, 74
mobility, 334
graphs of, modeling of, 433
trees, 344, 395, 360, 116, 346, cell-subnetwork, 113
metarules, 159, 15-17
lexicographical order, 143
Mercury, 224
loop winding, 233
minimum-cost satisfiability, 7, 104
logic functions, 54, 464, 19, 345
marking vertices, 120, of software compilation, 32
logic networks, 396
decomposition, 387, 432
libraries, 1517
logic, 76
left-edge algorithm, 246, 220
loose wiring (see global wiring), 126
Liao-Wong algorithm, 351
longest path problem, 46
list scheduling algorithm, 6
semicustom, binding, of a set, 44, 270
macrocell generation, 240, 538
loop expansion optimization, 287, 191
minimal area, 383, 47
M68000 processor, 63, 544
lower bound of complexity, 283
two-level optimization of, 418
logic optimization, 544
minimum resource, 197
library binding, 356-360
synchronous networks, 554
logic transformations, 257
minimum latency scheduling, 30
manufacturing time, 197, 410
maximum timing constraints, 284, 415, 234
legal retiming, 18, 283
minimum clique partition, 351, 116
logic circuits, 273, 207
Boolean simplification and substitution, 469
minterms of Boolean functions, 168
min operator, 11
custom, 525
local functions, 333
minimum-area, 86, 367
level-sensitive scan design (LSSD), 20, 208-211
load binning, 7, 89
logic networks, 479-481
logic minimization algorithms, 344-345, 433
microcompilation, 528
leaves of trees, 144
microelectronic circuits, variables, 373
lexical and syntax analysis, 270, 15-17
geometrical, 27, 380
logic-level simulation, 56
Mealy models, 3
microelectronic circuit design, 164
macrocell-based design, delay constraints, 118, 459
delay modeling, of vertices, 203, 443
minimum timing constraints, 119
memory binding, 56
Mini, 93
minimum cover, 33
levels of abstraction, 116-118
logic synthesis, 339
minimal-delay circuits, 231
maximum set of permissible functions, 457, 267, 487
local optimum, 536
matroids, 164, 558
metavariables, 87
minimum irredundant cover, 6
microcode compaction, 433
leaf-dags, 127
127-128
partial binding, 472
multiple-valued logic optimization, 251
Nova, 217-218
partitioning, 318, 446
output plane, 404
nodes, 318, 45
NP-complete, 526
isomorphic, combinational circuits, 106, 38, 21, 49
path, 174, 36
into cliques, 49
optimization algorithms, 452-454
multiple-level logic optimization, 280, 128
network covering, 443
network optimization, expression, of graphs, 405
mvi-functions, 115, 135
modeling, 42
observability, 320-321
directed scheduling, 40, 135
optimal, 42, 212
operator strength reduction optimization, 369
extraction, 530
output required times, 238-239
model expansion optimization, 343
area/delay tradeoff, 118
Multilevel Interactive Synthesis (MIS), 440
multiple-port synchronous logic networks, 344
multiple-vertex optimization, 109, 23, 512, 63
perfect vertex elimination scheme, 349
multiple-level circuits, 135
optimization problems, with Boolean transformations, of cliques, 356
rule-based approach, 455
multiple-level networks, 75, 215
perfect graphs, 45
NPN-equivalent, 115-116
models, 433, 541
module generators (see cell generators), for architectural synthesis, 21
optimum, 534
partially redundant cubes, of Boole's expansion, by library cells, 123, algorithmic approach, 434
optimum pipelining, 532
operation, 397, 505
network traversal algorithms, 218, 272
multiple-phase clocks, 98
language, 44, 272
multiple-output networks, 301, 27, 249
path sensitization, 169
peephole optimization, retiming of, 280, 128
Moore's Law, 424
Olympus synthesis system, Opus, optimality, principle of, 41
311
partially redundant prime implicants, 313
partially redundant set, 408
for transduction framework, 417
performance, 443
of a set, 310
pattern functions, 254, 21, 167-168, 348
synchronous circuits, 289, 73, 381, 38
nonhierarchical sequencing graphs, 506
network equivalence, 542
pattern tree, 228
partially bound networks, 64-66, 482, 27
module function, of equivalent states, 72
output encoding, 153, 257-258
monotone speedup property, 230, 229, 233, 247
of logic networks, 130
optimization of circuit design, 298, 384, 440
netlist, of software compilation, 94, 350
multiple-cube, 349, 526
pattern graphs, 127
parsing, 142
parse trees, 472
ordered binary decision diagram (OBDD), 44
optimality, 4
Moore models, 146, 179, 556
multiple-output minterm, 400
Boolean relations, 12
Pareto point, 295
model call, 272
p-valued expansion, 299
packaging of integrated circuits, 129
output polarity assignment, 63
performance optimization, in fanout, 356, 186
multiple-output implicants, 350, 351-354
orthonormal expansion, 272
output partitioning, 327
percolation scheduling, 215
path-based scheduling, 397, 422-423
operation probability, 556, 558
objective function, 444
NP problems, 494
partitions, 36
blocks of, 510-511
MPLmd, 510-511
no-operations, 505
abstract, structures, 365, 349, 150
280-281, 229, 194
relative schedule, 410
personality matrix, 54
minimum-cost satisfiability, 449
prime dichotomies, 116, 195
with timing constraints, 333, 338, 43, 150, 194
register compatibility graphs, 231
resource sharing, 302, 270
primality of an implicant, 100, 462, 461
positional-cube notation, 287
primary inputs and outputs, 228, 260
resources for architectural synthesis, 30
Pyramid, 461, 556
Quine's Theorem, 15-17
pipelined circuits, 476
permissible function, matrix representation, 150, 308
reduced ordered binary decision diagram, 450
reduced implicant, 536
portable package, 100
precedence-constrained multiprocessor scheduling problem, 416
primitive resources for architectural synthesis, 484
Petri nets, 125, 423
most, 21, 143
priority list for scheduling algorithms, 279, 308, 279
redundancy identification and removal, 199
relatively essential set, 315, 39
polar dag, 408
redundant anchor, 310
release times, 383, 152, 197, 390, 350, 40, 8, cell-subnetwork matching, 279
programmable logic array (PLA), 440, 186
polarity assignment, of a constraint graph, 49, 277
Petrick's method, 457
generalized, 489
relations, 455
inconsistent, 54, prime classes, 30
problems, 262
relevant anchor set, 191, 561
QuineMcCluskey Algorithm, 376
range computation, 89
finite-state decomposition, 506, 48-51
projective algorithm, 30
product of sums, 11, 329
relative anchor set, 228
functional, 288, 186, 208
privileged layers, 146, 194
relation table, 104
retained implicants, 205, 93
satisfiability, 85-87
shortest and longest path, 252, 494
product machine, 100
precedence-constrained, 122
prime and irredundant logic networks, 279
Petrick's method, 126
peripheral retiming, 279
registers, 282-283
retiming, 73
perturbed network, 320
pragmatics, 473
register sharing, 485
placement, 31
planar graphs, 39
polar graphs, 40, 186
polynomially bound algorithms, 38
Presto, 46
propagation delay, 7
polynomial complexity, 530
prime implicants, 329
disjoint, 354
reduced dependency criterion, 374-375
reduce operator, 279
reduced prime implicant table, 76-77
physical design (see also geometrical-level synthesis), problems, 299, 160, 257
relative scheduling, 193-195
relative timing constraints, 196
relative, 199
with timing constraints, 195
unconstrained, 196-197
resource binding, 176
resource compatibility graphs, 230
resource conflict graphs, 230
resource-dominated circuits, 264
resolution mechanisms, of hardware description languages, 143-144
procedural languages, 56
procedural layout styles, 104
programming, dynamic, 242
register model, 190
reset signal, 156
edge-triggered, 156, 157
prediffused array-based design, 203
RISC architectures, 218
pipeline network models, 245
compatible, 476
ripple-carry adders, 143, 541
Ptolemy, 541
robustness of critical path detection, 245
scheduling, 563, 529
covering, 506-510
243
ROBDD manipulation, 387-389
rectangle covering problem, 84
self-adjacent, 459
peripheral, 463
of a network, 472
of multiple-port synchronous logic network, 475
reduced, 329
prime classes, 457
Petrick's method, 326
prime implicants, 354
retained, 336
rounding, 31
physical views, 31
27, 287, 36
Shannon's expansion, 347
single-implicant containment, 505
script, 129
semicustom microelectronic design, 545-546, 433
logic optimization, 218, 78
standard triple, 46, 422
sequencing graphs, 130, 545
stability number of a graph, 203
scheduled sequencing graph, 231
SEHWA, 41
standard cell, 260, of Boolean functions, 163, 7
cell-based, 205
scheduling algorithms, 397, 259
Schmitt triggers, 118
static cosensitization, 101, 369
extraction, 174, 54
signature cubes, 157
table, 161, 51
sequential circuit optimization, 457
prime classes, 118
state encoding, 544, 250
state transition, 8
design, 69
sharing and binding, 175, 238-239
sequencing problem, 38
simplification, of Boolean functions, 27, 209
smoothing operator, branching, 146, library binding, 100
semantic analysis, 333
satisfiability of Boolean functions, 230, 233, 84
state diagram, 166
serial-parallel multipliers, 153
iteration, 186-187
precedence-constrained multiprocessor, 495, 180
model for, 40
spectral analysis, 7
sensitizable path of a logic network, 14, 425
static sensitization, 365
single stuck-at faults, 302
single-well CMOS, 324, 555
simple graphs, 38
single-source, 471
single-cube expressions, 205
Sasao's Theorem, 70
Socrates, 259
sequencing graph constructs, 113-114, semantics, 26, 194
state minimization, 6
array-based, 108, 208-211
path-based, 235, 423
steering logic circuits, datapath synthesis, 215
self-adjacent register, 223
force-directed, 177, 402, 412, 555
single-polarity logic networks, 347
single-output multiple-level networks, 350
217
resource-constrained, 356
rule-based systems, multiple-level logic optimization, 121-130
programming languages, 122
model call, 274, 541
signals, 259
scheduling problems, 101
of hardware description languages, 9899
source, 203, 146
with chaining, 177, 216
percolation, 292, resource-dominated circuits, 556
SilcSyn, 536
simulation and verification methods, 353, 285-286
side input, 556
sequential logic circuits, 442
Sequential Interactive Synthesis (SIS) program, 558
rugged script, 37
partitions, 350
sets, 36
Cartesian product, cardinality, 194
state extraction, 8
signal multiplexing, 157, 24
sequential resources, 147, 141, 101
silicon-on-sapphire CMOS, 255, 215
trace, 262
self-force of list, 122
sharp operation, 69
shortest and longest path problem, 187-188
scheduling, 198-202
list, 258
sharing, 294
simplex algorithm, 5
single-vertex optimization, 85
satisfiability don't care conditions, 146
simultaneously prime and irredundant logic networks, 32
schedules, constrained, 12-15, 21
ILP, 149, 398
scan techniques, 384
software, 51-53
algorithms, 442, 443-444
relation, 494
seed dichotomies, 449-453, network optimization, 122
Silage, 413
array-based design, 545
semantics, compiler, 118, 122
slack of an operation, 488
353, 100
synthesis, 132-133
transitive orientation property of graphs, 247
twin-well CMOS, 24-26, 109
structural hardware languages, 489
Synergy, 115
subgraph, 442, 215
traces, 286, production-level, 261
strong canonical form, 79
strongly compatible operations, 253-254
strongly unate functions, 320
output encoding, 6, 462
off set, 494
tree, 303
strongly typed, 536
optimization of, 109
structural approach, 163
structures, 415
testability, 509
technology mapping (see library binding)
testable networks, 102-103
structural-level synthesis (see architectural synthesis)
structural match, 510-515
using automata, 131-134
elimination, 320
output encoding, 562
Theseus, of a circuit, 297
Techmap, 513
three-vertex, 459
synchronous logic networks, 270-271, 5
twisted pair, 506-510
structural flavor, simple, 270
totally redundant prime implicants, 86
of a cover, 172
synthesis systems, bottom-up, 513
matching problem, 420
topological critical path, 513
multiple-level logic optimization, 333, 38
covering algorithm, 330
of Boolean functions, 173, 462
topological sort, 142, 432
testing of integrated circuits, 526
synchronous circuits, 18
of testable networks, 553
research-level, 531
substitution, 142, 17
geometrical-level, 555
syntax, of hardware description languages, 319
symmetry class, 458-461, 49
two-vertex, 553
System Architect's Workbench (SAW), 72
tautology, 420, 327
symbolic minimization, 86
of a cover, 291
Syco compiler, 366
successor, 218
structural testability, 555
tractable problems, 483
time-labeled variables, 21, 211
time-invariant don't care components, 174
subject graphs, 553
synchronous recurrence equation, 510
structural pipelining, 318, 415
synthesis algorithms, 17
architectural, tree-based matching, 118, 490
Tseng-Siewiorek algorithm, 486
synchronous logic transformations, 286
structural view, Boolean functions, 38
transduction, 517-518
tree-height reduction optimization, 261-262
134-135
dataflow based, 558
throughput, VHDL data types, of an architectural body, 179
time frame, 42
synthesis time, 118, 160
transformations, 525
implicit finite-state machine methods, 559
tabular form, 17
architectural, tree-based matching, 118
Tseng-Siewiorek algorithm, 510
structural pipelining, controlflow-based transformations, 164
symbolic minimization, 14, 279
testable networks, of circuits, 18
timing feasibility, 21, decision problem, 459
time-varying don't care components, 483
timing feasibility, 132
triangulated graph (see chordal graph)
truth table, 459
synchronous delay, 5-4
trace scheduling, 42
transition mode, 410-411
traversal methods, 233-234
tree-height reduction transformations, 134-135
296, 413, 555
vertical microcode, 15, 253
weakly unate functions, 326
unique table, 130
time-labeled, 37
vertex separation set, 295
unate recursive paradigm, 41
vertex coloring, 31
Yorktown Silicon Compiler, 450-453
two-level logic covers, 3
VHDL, 497-498
valid encoding matrix, 538
walk, 129
UDLA, dataflow analysis, 104
Y-chart, 70
undecided implicants, 1516
Zero-One Linear Program (ZOLP), 323
variables, branching, 295
unate cover, 122
source, 309
unbounded-latency graphs, 122
model call, 282-283
undirected graphs, 100, 196
wiring, estimating area and length, 112-113
unate functions, 87, 78, 250
detailed, 59, 60, 299
stable set, 543
unconstrained schedules, 187
underlying undirected graph, 70
type checking, 122, design space, 37-40, 169, 102, 146
sink, 61, 38
very large scale integration (VLSI), 3
Verilog, 187
vertex of a graph, 153
degree of, 153
operations, 153
links, 110, 122
delays, 157
views, behavioral, 122, 17
structural and physical, 15, 17
virtual library, 515
iteration, 245
well-posed constraint graph, 254
universal quantifier, 108, 288
union of two dichotomies, 174
unbound networks, 168
vertices, 37
degree of, 61
untestable faults, 233
vertex cover, 254
weakly compatible operations, 295-296, 345
weakly unate, 303
weighted compatibility graph, 459
Venus, lookup table FPGAs, 505, 554
ViewSynthesis, 108
state encoding for, 98, 136, 128
Wallace tree, 80
unscheduled sequencing graph, 31
global, 31