
CAD for IC design

VLSI Design Problem:
Optimization of design in several aspects
• Area minimization
• Speed
• Power dissipation
• Design time
• Testability
Design Methodologies: the approach followed to solve the VLSI design problem

To consider all parameters at once during the VLSI design process, a cost function is used.
Cost function: a measure of the VLSI design cost in terms of the different parameters.
Designing a VLSI circuit in one go while at the same time optimizing the cost function is simply too complex.
Two main concepts help to deal with this complexity:
1. Hierarchy
2. Abstraction
Hierarchy: shows the structure of a design at different levels.
Abstraction: hides the lower-level details.

Abstraction ex:

The Design Domains
The behavioral domain: part of the design (or the whole) is seen as a set of black boxes.
Black box: the relations between outputs and inputs are given without reference to the implementation of these relations.
The structural domain: the circuit is seen as a composition of subcircuits.
Each of the subcircuits has a description in the behavioral domain or a description in the structural domain itself.
The physical (or layout) domain: gives information on how the subparts that can be seen in the structural domain are located on the two-dimensional plane.

Design Methods and Technologies
Designing an integrated circuit is a sequence of many actions, most of which can be done by computer tools.
Full-custom design: maximal freedom; the ability to determine the shape of every mask layer for the production of the chip.
Semicustom design:
• smaller search space
• limits the freedom of the designer
• shorter design time
• semicustom design implies the use of gate arrays, standard cells and parameterizable modules.

Gate arrays
• Chips that have all their transistors preplaced in regular patterns.
• The designer specifies the wiring patterns.
• Gate arrays as described above are mask programmable.
• There also exist so-called field-programmable gate arrays (FPGAs).
• In FPGAs, interconnections can be configured by applying electrical signals on some inputs.
Standard Cells
• Simple logic gates, flip-flops, etc.
• Predesigned and made available to the designer in a library.
• Characterization of the cells: the determination of their timing behavior is done once by the library developer.
Module Generators
• Generators exist for those designs that have a regular structure, such as adders, multipliers and memories.
• Due to the regularity of the structure, the module can be described by one or two parameters.

VLSI design automation tools can be categorized in:
1. Algorithmic and system design
2. Structural and logic design
3. Transistor-level design
4. Layout design
5. Verification
6. Design management

Algorithmic and System Design:
• Mainly concerned with the initial algorithm to be implemented in hardware; works with a purely behavioral description.
• Hardware description languages (HDLs) are used for the purpose.
• A formal specification does not always need to be textual.
• Tools are available that convert graphical information into a textual equivalent (expressed in a language like VHDL) that can be accepted as input by a synthesis tool.
• A second application of a formal description is the possibility of automatic synthesis.
• A synthesizer reads the description and generates an equivalent description of the design at a much lower level.
• High-level synthesis: the synthesis from the algorithmic behavioral level to structural descriptions.
• Silicon compiler: a software system that takes a user's specifications and automatically generates an integrated circuit (IC) (initial synthesizer).

Hardware-software co-design
• The design for a complex system will consist of several chips, some of which are programmable.
• Part of the specification is realized in hardware and part in software (hardware-software co-design).
• Partitioning of the initial specification is required (difficult to automate).
• Tools exist that support the designer by providing information on the frequency at which each part of the specification is executed.
• The parts with the highest frequencies are the most likely to be realized in hardware.
• The result of co-design is a pair of descriptions:
• one of the hardware (e.g. in VHDL) that will contain programmable parts, and
• the other of the software (e.g. in C).

Code generation: mapping the high-level descriptions of the software to the low-level instructions of the programmable hardware (a CAD problem).
Hardware-software co-simulation: verification of the correctness of the result of co-design using simulation.

Structural and Logic Design
• Sometimes the tools might not be able to cope with the desired behaviour (inefficient synthesis).
• The designer then provides a lower-level description: structural and logic design.
• The designer can use a schematic editor program (a CAD tool).
• It allows the interactive specification of the blocks composing a circuit and their interconnections by means of a graphical interface.
• Schematics constructed in this way are hierarchical.
• Role of simulation: once the circuit schematics have been captured by an editor, it is common practice to verify the circuit by means of simulation.
• Fault simulation: checks whether a set of test vectors or test patterns (input signals used for testing) will be able to detect faults caused by imperfections of the fabrication process.
• Automatic test-pattern generation (ATPG): the computer searches for the best set of test vectors.

Logic synthesis: generation and optimization of a circuit at the level of logic gates. There are three different types of problems:
1. Synthesis of two-level combinational logic:
• A Boolean function can be written as a sum of products or a product of sums.
• Such expressions can be directly implemented as programmable logic arrays (PLAs).
• It is, therefore, important to minimize two-level expressions.
2. Synthesis of multilevel combinational logic:
• Some parts of integrated circuits consist of so-called random logic (circuitry that does not have a regular structure).
• Random logic is often built of standard cells, which means that the implementation does not restrict the depth of the logic.
3. Synthesis of sequential logic:
• Sequential logic has a state, which is normally stored in memory elements.
• The problem here is to find the logic necessary to minimize the state transitions.

Timing constraints:
• The designer should be informed about the maximum delay paths.
• The shorter these delays, the faster the operation of the circuit.
• One possibility of finding out about these delays is by means of simulation.
• Another is a timing analysis tool, which computes delays through the circuit without performing any simulation.

Transistor-level Design
Logic gates are composed of transistors. Depending on the accuracy required, transistors can be simulated at different levels:
• At the switch level, transistors are modeled as ideal bidirectional switches and the signals are essentially digital.
• At the timing level, analog signals are considered, but the transistors have simple models (e.g. piecewise linear functions).
• At the circuit level, more accurate models of the transistors are used, which often involve nonlinear differential equations for the currents and voltages.
The more accurate the model, the more computer time is necessary for simulation.

Process (full-custom transistor-level design):
1. Construct the network of transistors.
2. It is the custom to extract the circuit from the layout data: transistors, resistors and capacitances.
3. The extracted circuit can then be simulated at the circuit or switch level.

Layout Design
Design actions related to layout are very diverse; therefore, there are different layout tools.
If one has the layout of the subblocks of a design available, together with the list of interconnections, then:
1. First, a position in the plane is assigned to each subblock, trying to minimize the area to be occupied by interconnections (placement problem).
2. The next step is to generate the wiring patterns that realize the correct interconnections (routing problem).
The goal of placement and routing is to generate the minimal chip area (1).
Timing constraint (2): as the length of a wire affects the propagation time of a signal along the wire, it is important to keep specific wires short in order to guarantee an overall execution speed of the circuit (timing-driven layout).

Partitioning problem: grouping of the subblocks
• Subblocks that are tightly connected are put in the same group, while the number of connections from one group to the other is kept low.
• Partitioning helps to solve the placement problem.
Floorplanning:
• The simultaneous development of structure and layout is called floorplanning.
• When making a transition from a behavioral description to a structure, one also fixes the relative positions of the subblocks.
• It gives early feedback on e.g. long wires in the layout.
• Difference: detailed layout information is available in placement, whereas floorplanning mainly has to deal with estimations.

• Microcells: complexity of around 10 transistors.
• A cell compiler generates the layout for a network of transistors.
• Module: a block, the layout of which can be composed by an arrangement of cells.
• Module generation: given some parameters (e.g. the number of bits for an adder, or the word length and number of words in a memory), the module generator puts the right cells at the right place and composes a module in this way.
• Layout editor (in full-custom design): provides the possibility to modify the layout at the level of mask patterns.
• In a correct design, the mask patterns should obey some rules called design rules.
• Tools that analyze a layout to detect violations of these rules are called design-rule checkers.

Circuit extractor: takes the mask patterns as its input and constructs a circuit of transistors, resistors and capacitances that can be simulated.
A disadvantage of full-custom design is that the layout has to be redesigned when the technology changes; symbolic layout has been proposed as a solution.
Symbolic representation: represents positions of the patterns relative to each other.
Compactor: takes the symbolic description, assigns widths to all patterns and spaces the patterns such that all design rules are satisfied.

Verification Methods:
There are three ways of checking the correctness of an integrated circuit without actually fabricating it:
1. Prototyping
2. Simulation
3. Formal verification
1. Prototyping: building the system to be designed from discrete components rather than one or a few integrated circuits, e.g. breadboarding, or prototyping using programmable devices such as FPGAs (rapid system prototyping).
• A prerequisite for rapid system prototyping is the availability of a compiler that can "rapidly" map some algorithm on the programmable prototype.

2. Simulation: making a computer model of all relevant aspects of the circuit, executing the model for a set of input signals, and observing the output signals. As the set of all possible input signals and internal states grows too large, it is impossible to have an exhaustive test of a circuit of reasonable size.
3. Formal verification: the use of mathematical methods to prove that a circuit is correct.

Design Management Tools:
• CAD tools consume and produce design data.
• The quantity of data for a VLSI chip can be enormous, and appropriate data management techniques have to be used to store and retrieve it efficiently.
• Another aspect of design management is to maintain a consistent description of the design while multiple designers work on different parts of the design.
• A famous standard format for storing data is EDIF (Electronic Design Interchange Format).
• A framework is a universal interface used by tools to extract EDIF data from the database.

Algorithmic Graph Theory and Computational Complexity:
Algorithmic graph theory: the design of algorithms that operate on graphs.
Graph: a mathematical structure that describes a set of objects and the connections between them.
Graphs are used in the field of design automation for integrated circuits:
1. when dealing with entities that naturally look like a network (e.g. a circuit of transistors)
2. in more abstract cases (e.g. precedence relations in the computations of some algorithm)
Computational complexity: the time and memory required by a certain algorithm as a function of the size of the algorithm's input.

Terminology:
A graph G(V, E) is characterized by two sets:
1. a vertex set V (a vertex is also called a node) and
2. an edge set E (an edge is also called a branch)
The two vertices that are joined by an edge are called the edge's endpoints; the notation (u, v) is used.
The vertices u and v such that (u, v) ∈ E are called adjacent vertices.

Subgraph: when one removes vertices and edges from a given graph G, one gets a subgraph of G. Rule: removing a vertex implies the removal of all edges connected to it.
Digraph: a directed graph (or digraph) is a graph, or set of nodes connected by edges, where the edges have a direction associated with them.
Complete graph: a complete graph is a simple undirected graph in which every pair of distinct vertices is connected by a unique edge.
Complete digraph: a complete digraph is a directed graph in which every pair of distinct vertices is connected by a pair of unique edges (one in each direction).

Clique: a clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge, i.e. a subgraph that is complete.
Ex: three cliques identified by the vertex sets {V1, V2, V3}, {V3, V4} and {V5, V6}.

Maximal clique: a maximal clique is a clique that cannot be extended by including one more adjacent vertex, i.e. it is not a subset of a larger clique.
Degree of a vertex: the degree of a vertex is equal to the number of edges incident with it.
Selfloop: an edge (u, u), i.e. one starting and finishing at the same vertex, is called a selfloop.

Parallel edges: two edges of the form e1 = (v1, v2) and e2 = (v1, v2), i.e. having the same endpoints, are called parallel edges.
Simple graph: a graph without selfloops or parallel edges is called a simple graph.

Multigraph: a graph without selfloops but with parallel edges is called a multigraph.
Bigraph: a bipartite graph (or bigraph) is a graph whose vertices can be divided into two disjoint sets U and V (that is, U and V are each independent sets) such that every edge connects a vertex in U to one in V.

Planar graph: a graph that can be drawn on a two-dimensional plane without any of its edges intersecting is called planar.
Path: a sequence of alternating vertices and edges, starting and finishing with a vertex, such that an edge e = (u, v) is preceded by u and followed by v in the sequence (or vice versa), is called a path.
The length of a path equals the number of edges it contains.
Cycle: a path of which the first and last vertices are the same and the length is larger than zero is called a cycle (sometimes also: loop or circuit).
A path or a cycle not containing two or more occurrences of the same vertex is a simple path or cycle.

Connected graph: if all pairs of vertices in a graph are connected, the graph is called a connected graph.
Connected component: a connected component is a subgraph induced by a maximal subset of the vertex set, such that all pairs in the set are connected.
In-degree: the in-degree of a vertex is equal to the number of edges incident to it.
Out-degree: the out-degree of a vertex is equal to the number of edges incident from it.
Strongly connected vertices: two vertices u and v in a directed graph are called strongly connected if there is both a directed path from u to v and a directed path from v to u.
Strongly connected components: a directed graph is said to be strongly connected if every vertex is reachable from every other vertex; its strongly connected components are the maximal strongly connected subgraphs.

Weighted graph: a weighted graph is a graph in which each branch is given a numerical weight; it is a special type of labeled graph in which the labels are numbers (which are usually taken to be positive).

To implement graph algorithms, suitable data structures are required. Different algorithms require different data structures.
Adjacency matrix: if the graph G(V, E) has n vertices, an n×n matrix A is used. The adjacency matrices of undirected graphs are symmetric.

Finding all vertices connected to a given vertex requires inspection of a complete row and a complete column of the matrix (not very efficient).
Adjacency list representation: it consists of an array that has as many elements as the number of vertices in the graph.
• The array element identified by an index i corresponds with a vertex u.
• Each array element points to a linked list that contains the indices of all vertices to which the vertex corresponding to the element is connected.
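The two representations can be sketched as follows. This is a minimal illustration, not the structure used in the slides: the function names and the vertex numbering 0..n-1 are assumptions, and ordinary Python lists stand in for the linked lists.

```python
def adjacency_matrix(n, edges, directed=False):
    """Return an n x n 0/1 matrix; vertices are numbered 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = 1
        if not directed:
            a[v][u] = 1  # undirected graphs give a symmetric matrix
    return a


def adjacency_list(n, edges, directed=False):
    """Return a list in which entry u holds the neighbours of vertex u."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)
    return adj


edges = [(0, 1), (0, 2), (2, 3)]
print(adjacency_matrix(4, edges))
print(adjacency_list(4, edges))
```

With the list form, the neighbours of a vertex are read directly from one entry, whereas the matrix form needs a scan over n entries.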


The complexity (behaviour) of an algorithm is characterized by mathematical functions of the algorithm's "input size".
The input size or problem size of an algorithm is related to the number of symbols necessary to describe the input.
Ex: a sorting algorithm has to sort a list of n words, each consisting of at most 10 letters: input size = 10n. Using the ASCII code, which uses 8 bits for each letter: input size = 80n.
Two types of computational complexity are distinguished:
1. time complexity, which is a measure for the time necessary to accomplish a computation.
2. space complexity, which is a measure for the amount of memory required for a computation.
Space complexity is given less importance than time complexity.

Order of a function: Big O notation characterizes functions according to their growth rates.
• The growth rate of a function is also referred to as the order of the function.
• A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.
Worst-case time complexity:
• The duration of a computation is expressed in elementary computational steps.
• It is not only the size of the input that determines the number of computational steps: conditional constructs in the algorithm are the reason that the time required by the algorithm is different for different inputs of the same size.
• Therefore, one works with the worst-case time complexity, i.e. the largest number of steps over all inputs of a given size.

algorithm's time complexities: .

Examples of Graph Algorithms:
Depth-first Search:
• Used to traverse the graph, visiting all vertices only once.
• A new member 'mark' is added to the vertex structure.
• It is initialized with the value 1 and given the value 0 when the vertex is visited.


Each vertex is visited exactly once, and all edges are also visited exactly once. Assuming that the generic vertex and edge actions have a constant time complexity, this leads to a time complexity of O(n + |E|), where n = |V|.
Depth-first search could be used to find all vertices connected to a specific vertex u.
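A minimal sketch of depth-first search on an adjacency list, using the mark convention described above (1 = not yet visited, 0 = visited). The function name and the choice of recording the visiting order as the "vertex action" are assumptions made for illustration.

```python
def depth_first_search(adj, start):
    """Return the vertices reachable from start, in visiting order."""
    mark = [1] * len(adj)    # 1 = not yet visited
    order = []

    def visit(u):
        mark[u] = 0          # 0 = visited
        order.append(u)      # stands in for the generic vertex action
        for v in adj[u]:     # every adjacency-list entry is inspected once
            if mark[v]:
                visit(v)

    visit(start)
    return order


# Undirected example graph; vertex 4 is isolated.
adj = [[1, 2], [0, 3], [0], [1], []]
print(depth_first_search(adj, 0))
```

The returned list contains exactly the vertices connected to the start vertex, which is the use mentioned above. For very large graphs an explicit stack would avoid Python's recursion limit.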

Breadth-first Search:
• Applied to directed graphs represented by an adjacency list.
• The central element is the FIFO queue.
• The call shift_in(q, o) adds an object o to the queue q, and shift_out(q) removes the oldest object from the queue q.
• Adding and removing objects from a FIFO queue can be done in constant time.

Status of queue .

Like depth-first search, breadth-first search could be used to find all vertices connected to a specific vertex u. In breadth-first search, vertices are visited in the order of their shortest path from vs.
The shortest-path problem becomes more complex if the length of the path between two vertices is not simply the number of edges in the path (weighted graphs).
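A minimal sketch of breadth-first search on a directed graph stored as an adjacency list. It assumes Python's collections.deque as the FIFO queue, so append and popleft play the roles of shift_in and shift_out; the function name and the returned dictionary are illustrative choices.

```python
from collections import deque


def breadth_first_search(adj, vs):
    """Return {vertex: number of edges on a shortest path from vs}."""
    dist = {vs: 0}
    q = deque([vs])              # FIFO queue
    while q:
        u = q.popleft()          # plays the role of shift_out(q)
        for v in adj[u]:
            if v not in dist:    # the first visit is along a shortest path
                dist[v] = dist[u] + 1
                q.append(v)      # plays the role of shift_in(q, v)
    return dist


# Directed example: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 3 -> 4
adj = [[1, 2], [3], [3], [4], []]
print(breadth_first_search(adj, 0))
```

Both queue operations take constant time, so the traversal as a whole stays linear in the number of vertices and edges.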

Dijkstra's Shortest-path Algorithm
• A weighted directed graph G(V, E) is given.
• Edge weights w(e), with w(e) > 0.
• Visited vertices of the set V are transferred one by one to a set T.
• Ordering of vertices is done using the vertex attribute 'distance'.
• Initially, the distance attribute of a vertex v is equal to the edge weight w((vs, v)).

In the example, vt = V3 is reached after 5 iterations; continuing for one iteration more computes the lengths of the shortest paths for all vertices in the graph.
Each iteration of the while loop takes O(n) time, where n = |V|, which gives an overall contribution of O(n^2).
All edges are visited exactly once, viz. after the vertex from which they are incident is added to set T. This gives a contribution of O(|E|) to the overall time complexity.
Worst-case time complexity = O(n^2 + |E|).
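A minimal sketch of the array-based form of Dijkstra's algorithm, which matches the O(n^2 + |E|) bound above. The function name, the (u, v, w) edge format and the example graph are assumptions; the sketch starts with only vs at distance 0, which leads to the same result as initializing the distances with w((vs, v)).

```python
import math


def dijkstra(n, edges, vs):
    """Shortest-path lengths from vs; edges are (u, v, w) with w > 0."""
    out = [[] for _ in range(n)]
    for u, v, w in edges:
        out[u].append((v, w))
    dist = [math.inf] * n
    dist[vs] = 0
    in_t = [False] * n                    # membership of the set T
    for _ in range(n):
        # Select the closest vertex not yet in T: O(n) per iteration.
        u = min((v for v in range(n) if not in_t[v]), key=lambda v: dist[v])
        if dist[u] == math.inf:
            break                         # remaining vertices are unreachable
        in_t[u] = True
        for v, w in out[u]:               # each edge is inspected once
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist


edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5), (2, 3, 8)]
print(dijkstra(4, edges, 0))
```

Replacing the linear minimum search with a priority queue is a common variation; it is left out here to stay close to the complexity analysis in the notes.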

Prim's Algorithm for Minimum Spanning Trees
In the mathematical field of graph theory, a spanning tree T of an undirected graph G is a subgraph that is a tree and includes all of the vertices of G.
One gets a spanning tree by removing edges from E until all cycles in the graph have disappeared while all vertices remain connected.
In general, a graph has several spanning trees, all of which have the same number of edges (number of vertices minus one).
In the case of edge-weighted undirected graphs, the spanning tree with the least total edge weight, also called the tree length, is to be found (minimum spanning tree problem).

The algorithm starts with an arbitrary vertex, which is considered the initial tree.
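A minimal sketch of Prim's algorithm in its O(n^2) matrix form, assuming a connected undirected graph. The function name, the (u, v, weight) edge format and the example graph are illustrative assumptions. The tree grows from the start vertex by repeatedly adding the cheapest edge that connects a new vertex to the tree.

```python
import math


def prim(n, edges, start=0):
    """Return (tree_length, tree_edges) for a connected undirected graph."""
    w = [[math.inf] * n for _ in range(n)]
    for u, v, c in edges:
        w[u][v] = w[v][u] = min(w[u][v], c)
    in_tree = [False] * n
    in_tree[start] = True            # the initial tree is a single vertex
    best = list(w[start])            # cheapest known edge into the tree
    parent = [start] * n
    tree, length = [], 0
    for _ in range(n - 1):           # a spanning tree has n - 1 edges
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        tree.append((parent[u], u))
        length += best[u]
        for v in range(n):           # update the cheapest connections
            if not in_tree[v] and w[u][v] < best[v]:
                best[v], parent[v] = w[u][v], u
    return length, tree


edges = [(0, 1, 1), (1, 2, 2), (0, 2, 4), (2, 3, 3), (1, 3, 6)]
print(prim(4, edges))
```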


Tractable and Intractable Problems:
A problem that can be solved in polynomial time is called tractable. A problem that cannot be solved in polynomial time is called intractable.

Combinatorial Optimization Problems
Problem: refers to a general class, e.g. the "shortest-path problem".
Instance: refers to a specific case of a problem, e.g. "the shortest-path problem between vertex vs and vertex vt".
Instances of optimization problems can be characterized by a finite set of variables.
If the variables range over real numbers, the problem is called a continuous optimization problem.
If the variables are discrete, i.e. they can only assume a finite number of distinct values, the problem is called a combinatorial optimization problem.
An example of a simple combinatorial optimization problem is the satisfiability problem: to assign Boolean values to the variables in such a way that the whole expression becomes true. Each variable xi can only assume two values, making the problem a combinatorial optimization problem.

Another example: the shortest-path problem solved by Dijkstra's algorithm. A graph with given source and target vertices vs and vt defines an instance of the problem.
One could associate Boolean variables bi with the edges: bi = 1 means that the edge is "selected" and bi = 0 means that it is not. Solving the shortest-path problem for this graph can then be seen as assigning Boolean values to the variables bi, making the problem combinatorial.
A combinatorial optimization problem is defined as the set of all the instances of the problem, each instance I being defined as a pair (F, c).
F is called the set of feasible solutions (or the search space); c is a function assigning a cost to each element of F.
Solving a particular instance of a problem consists of finding a feasible solution f with minimal cost.

The traveling salesman problem (TSP):
• Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?
• TSP can be modelled as an undirected weighted graph, such that cities are the graph's vertices, paths are the graph's edges, and a path's distance is the edge's length.
• It is a minimization problem starting and finishing at a specified vertex after having visited each other vertex exactly once.
• Often, the model is a complete graph.

Any permutation of the cities defines a feasible solution, and the cost of the feasible solution is the length of the cycle represented by the solution.
Ex: a nonoptimal solution and an optimal solution for cities c1, c2, ..., c9.
If the coordinates of a city ci are given by (xi, yi), the distance between two cities ci and cj is simply given by sqrt((xi - xj)^2 + (yi - yj)^2).
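A minimal sketch of the TSP cost function for cities given by coordinates, assuming the Euclidean distance between cities. The function name and the four example cities are made up for illustration. A feasible solution is a permutation of city indices, and its cost is the length of the closed tour.

```python
import math


def tour_length(cities, order):
    """Length of the closed tour visiting the cities in the given order."""
    total = 0.0
    for k in range(len(order)):
        xi, yi = cities[order[k]]
        xj, yj = cities[order[(k + 1) % len(order)]]  # wrap to the start
        total += math.hypot(xi - xj, yi - yj)         # Euclidean distance
    return total


cities = [(0, 0), (3, 0), (3, 4), (0, 4)]
print(tour_length(cities, [0, 1, 2, 3]))  # tour along the rectangle sides
print(tour_length(cities, [0, 2, 1, 3]))  # tour using both diagonals
```

Comparing the two printed values shows how different permutations of the same cities give different costs.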

Decision Problems (part of optimization problem):
Decision problems (decision version): these are problems that only have two possible answers: "yes" or "no".
The optimization version of the shortest-path problem in graphs requires that the edges forming the shortest path are identified, whereas the evaluation version merely asks for the length of the shortest path.
If the optimization version can be solved in polynomial time, then the decision version can also be solved in polynomial time. Therefore, the computational complexity of the decision version of a problem gives a lower bound for the computational complexity of the optimization version.
On the other hand, if there is an algorithm that is able to decide in polynomial time whether there is a solution with cost less than or equal to k, it is not always obvious how to get the solution itself in polynomial time.

The decision version of a combinatorial problem can be defined as the set of its instances (F, c, k).
Note that each instance is now characterized by an extra parameter k; k is the parameter in the question "Is there a solution with cost less than or equal to k?".
An interesting subset of instances is formed by those instances for which the answer to the question is "yes".
A task associated with a decision problem is solution checking. It is the problem of verifying whether c(f) ≤ k.

Complexity Classes:
It is useful to group problems with the same degree of complexity in one complexity class.
The class of decision problems for which an algorithm is known that operates in polynomial time is called P (an abbreviation of "polynomial").
Deterministic and nondeterministic computers:
For a common (deterministic) computer it is always clear how a computation continues at a certain point in the computation. This is also reflected in the programming languages used for them.
A nondeterministic computer allows for the specification of multiple computations at a certain point in a program: the computer will make a nondeterministic choice on which of them is to be performed. This is not just a random choice, but a choice that will lead to the desired answer.
The machine splits itself into as many copies as there are choices, evaluates all choices in parallel, and then merges back to one machine.

Complexity class NP: the complexity class NP (an abbreviation of "nondeterministic polynomial") consists of those problems that can be solved in polynomial time on a nondeterministic computer.
Any decision problem for which solution checking can be done in polynomial time is in NP.
The class P is a subset of the class NP.
Halting problem (undecidable class): the problem is to find an algorithm that accepts a computer program as its input and has to decide whether or not this program will stop after a finite computation time.

NP-completeness: all decision problems contained in the class of NP-complete problems are polynomially reducible to each other.
An instance of any NP-complete problem can be expressed as an instance of any other NP-complete problem using transformations that have a polynomial time complexity.
Ex: HAMILTONIAN CYCLE problem: whether a given undirected graph G(V, E) contains a so-called Hamiltonian cycle, i.e. a simple cycle that goes through all vertices of V.
TRAVELING SALESMAN: the decision version of TSP amounts to answering the question of whether there is a tour (simple cycle) through all vertices, the length of which is less than or equal to k.

Nondeterministic computer: Turing machine (mathematical model)
A Turing machine is a computer with a sequentially accessible memory (a "tape") and a very simple instruction set.
The set only includes instructions for writing a symbol (from a finite set) to the memory location pointed at by the memory pointer and moving the pointer one position up or down.
A finite number of "internal states" should also be provided for a specific Turing machine.
A "program" then consists of a set of conditional statements that map a combination of a symbol at the current memory location and an internal state to a new symbol to be written, a new internal state and a pointer movement.
The input to the algorithm to be executed on a Turing machine is the initial state of the memory.
The machine stops when it enters one of the special internal states labeled "yes" and "no" (corresponding to the answers to a decision problem).

General-purpose Methods for Combinatorial Optimization:
The algorithm designer has three possibilities when confronted with an intractable problem:
1. Try to solve the problem exactly if the problem size is sufficiently small, using an algorithm that has an exponential (or even a higher order) time complexity in the worst case.
• The simplest way to look for an exact solution is exhaustive search: it simply visits all points in the search space in some order and retains the best solution visited.
• Other methods only visit part of the search space, albeit the number of points visited may grow exponentially (or worse) with the problem size.
2. Approximation algorithms
3. Heuristic algorithms

The Unit-size Placement Problem:
Problem: a set of cells is given, together with a description of how the cells should be interconnected.
The interconnections to be made are specified by nets. A net can be seen as a set of cells that share the same electrical signal.
Placement is to assign a location to each cell such that the total chip area occupied is minimized.
As the number of cells is not modified by placement, minimizing the area amounts to avoiding empty space and keeping the wires that will realize the interconnections as short as possible.
Cells in the circuit are supposed to have a layout with dimensions 1x1 (measured in some abstract length unit). It can be assumed that the only positions on which a cell can be put on the chip are the grid points of a grid created by horizontal and vertical lines with unit-length separation.
A nice property of unit-size placement is that the assignment of distinct coordinate pairs to each cell guarantees that the layouts of the cells will not overlap.

A possible way to evaluate the quality of a solution for unit-size placement is to route all nets and measure the extra area necessary for wiring. A bad placement will have longer connections, which normally will lead to more routing tracks between the cells and therefore to a larger chip area.

Solving the routing problem is an expensive way to evaluate the quality of a placement. This is especially the case if many tentative placements have to be evaluated in an algorithm that tries to find a good one. . An alternative used in most placement algorithms is to only estimate the wiring area.
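The notes do not say which estimate is used. As an illustration only, the sketch below assumes the half-perimeter of each net's bounding box, a commonly used wire-length estimate. The function name, the cell names and both placements are hypothetical.

```python
def half_perimeter(placement, nets):
    """Sum over all nets of the half perimeter of the net's bounding box."""
    total = 0
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Two nets over four unit-size cells placed on a 2 x 2 grid.
nets = [("a", "b"), ("b", "c", "d")]
good = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
bad = {"a": (0, 0), "b": (1, 1), "c": (0, 1), "d": (1, 0)}
print(half_perimeter(good, nets), half_perimeter(bad, nets))
```

The estimate is computed in time linear in the total net size, so many tentative placements can be compared without routing any of them.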

Backtracking and Branch-and-bound:
An instance I of a combinatorial optimization problem was defined by a pair (F, c), with F the "set of feasible solutions" (also called the "search space" or "solution space") and c a cost function assigning a real number to each element in F.
Suppose that each feasible solution can be characterized by an n-dimensional vector f = [f1, f2, ..., fn] and each fi (1 ≤ i ≤ n) can assume a finite number of values, called the explicit constraints.
The values assigned to the different components of f may sometimes not be chosen independently. In such a case one speaks of implicit constraints.
Consider a combinatorial optimization problem related to some graph G(V, E) in which a path with some properties is looked for. One can then associate a variable fi with each edge, whose value is either 1 to indicate that the corresponding edge is part of the path or 0 to indicate the opposite. The explicit constraints then state that fi ∈ {0, 1} for all i.

Backtracking: the principle of using backtracking for an exhaustive search of the solution space is to start with an initial partial solution in which as many variables as possible are left unspecified, and then to systematically assign values to the unspecified variables until either a single point in the search space is identified or an implicit constraint makes it impossible to process more unspecified variables.
The cost of the feasible solution found can be computed once all variables are specified.
The algorithm continues by going back to a partial solution generated earlier and then assigning a next value to an unspecified variable (hence the name "backtracking").

It is assumed that all variables fi have type solution-element. The partial solutions are generated in such a way that the variables fi are specified for 1 ≤ i ≤ k and are unspecified for i > k. Partial solutions having this structure will be denoted by f~(k). f~(n) corresponds to a fully-specified solution (a member of the set of feasible solutions). The global array val corresponds to the vector f~(k). The value of fk is stored in val[k - 1]. So, the values of array elements with index greater than or equal to k are meaningless and should not be inspected. The procedure cost(val) is supposed to compute the cost of a feasible solution using the cost function c. It is only called when k = n.

The procedure allowed(val, k) returns a set of values allowed by the explicit and implicit constraints for the variable f_(k+1), given f~(k).
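The backtracking scheme described above can be sketched as follows. This is a minimal illustration, not the original pseudocode: the names backtrack, toy_allowed and toy_cost, as well as the toy instance (pick items of maximal total weight such that no two adjacent items are picked), are assumptions made for the example.

```python
import math

def backtrack(n, allowed, cost):
    """Exhaustive backtracking search; returns (best_cost, best_solution)."""
    val = [None] * n            # val[k - 1] holds f_k; indices >= k are meaningless
    best = [math.inf, None]

    def search(k):
        if k == n:              # fully specified: cost() is only called here
            c = cost(val)
            if c < best[0]:
                best[0], best[1] = c, list(val)
            return
        for v in allowed(val, k):   # values permitted for f_(k+1) given f~(k)
            val[k] = v
            search(k + 1)
        # leaving the loop returns to the partial solution generated earlier

    search(0)
    return best[0], best[1]

# Hypothetical toy instance: f_i = 1 picks item i; the implicit constraint
# forbids picking two adjacent items; the cost is the negated total weight.
WEIGHTS = [3, 5, 4, 1]

def toy_allowed(val, k):
    return [0] if k > 0 and val[k - 1] == 1 else [0, 1]

def toy_cost(val):
    return -sum(w * f for w, f in zip(WEIGHTS, val))

best_cost, best_solution = backtrack(len(WEIGHTS), toy_allowed, toy_cost)
```

For this toy instance the search selects the first and third item.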


Branch-and-bound:
• Information about a certain partial solution f~(k), 1 ≤ k < n, at a certain level can indicate that any fully-specified solution f~(n) derived from it can never be the optimal solution.
• If inspection of f~(k) can guarantee that all of the solutions derived from f~(k) have a higher cost than some solution already found earlier during the backtracking, none of the children of f~(k) need any further investigation.
• One says that the node in the tree corresponding to f~(k) can be killed.
• Killing partial solutions is called branch-and-bound.
• A function that estimates a lower bound on the cost of all solutions derived from f~(k) is needed for this purpose.


The procedure Lower_bound_cost is called to get a lower bound on the cost of the partial solution, based on the lower-bound estimation function.
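As a minimal sketch of how killing nodes fits into the backtracking search: the function below prunes a partial solution as soon as its lower bound is not better than the best cost found so far. The function names, the cost table and the implicit constraint (consecutive variables must differ) are hypothetical and only serve the illustration.

```python
import math

def branch_and_bound(n, allowed, cost, lower_bound_cost):
    """Backtracking with killing of nodes; also counts the visited nodes."""
    val = [None] * n
    best = {"cost": math.inf, "sol": None, "visited": 0}

    def search(k):
        best["visited"] += 1
        if k == n:
            c = cost(val)
            if c < best["cost"]:
                best["cost"], best["sol"] = c, list(val)
            return
        # Kill the node: no completion can beat the best solution found so far.
        if lower_bound_cost(val, k) >= best["cost"]:
            return
        for v in allowed(val, k):
            val[k] = v
            search(k + 1)

    search(0)
    return best

# Hypothetical toy instance: TABLE[i][v] is the cost of giving f_(i+1) the value v.
TABLE = [[4, 1, 3], [2, 6, 2], [5, 5, 1], [1, 7, 3]]
N = len(TABLE)

def toy_allowed(val, k):
    return [v for v in range(3) if k == 0 or v != val[k - 1]]

def toy_cost(val):
    return sum(TABLE[i][val[i]] for i in range(N))

def toy_lower_bound(val, k):
    # cost of the specified part plus the cheapest entry of every remaining row
    return (sum(TABLE[i][val[i]] for i in range(k))
            + sum(min(TABLE[i]) for i in range(k, N)))

pruned = branch_and_bound(N, toy_allowed, toy_cost, toy_lower_bound)
exhaustive = branch_and_bound(N, toy_allowed, toy_cost, lambda val, k: -math.inf)
```

Both runs find the same optimum; the run with the real lower bound visits fewer nodes of the search tree.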


Dynamic Programming:
Dynamic programming is a technique that systematically constructs the optimal solution of some problem instance by defining the optimal solution in terms of optimal solutions of smaller size instances. Let p be a parameter that measures the complexity (size) of an instance. Dynamic programming can be applied to such a problem if there is a rule to construct the optimal solution for p = k (complete solution) from the optimal solutions of instances for which p < k (set of partial solutions). The fact that an optimal solution for a specific complexity can be constructed from the optimal solutions of lower-complexity problems only is essential for dynamic programming. This idea is called the principle of optimality.

The goal in the shortest-path problem is to find the shortest path from a source vertex vs to a destination vertex vt in a directed graph G(V, E), where the distance between two vertices u, v is given by the edge weight w((u, v)). If p = k, the optimization goal becomes: find the shortest path from vs to all other vertices in the graph considering paths that only pass through the first k closest vertices to vs. The optimal solution for the instance with p = 1 is found in a trivial way by assigning the edge weight w((vs, u)) to the distance attribute of all vertices u. Suppose that the optimal solution for p = k is known and that the k closest vertices to vs have been identified and transferred from V to T. Then, solving the problem for p = k+1 is simple: transfer the vertex u in V having the lowest value for its distance attribute from V to T and update the value of the distance attributes for those vertices remaining in V. Note that, in general, additional parameters may be necessary to distinguish multiple instances of the problem for the same value of p.
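The construction above (known as Dijkstra's algorithm) can be sketched as follows. The sketch assumes non-negative edge weights and a graph given as a dictionary of successor dictionaries; the function name and the example graph are hypothetical. Instead of initializing the distances with w((vs, u)), it starts with distance 0 for the source only, which yields the same result after the source has been transferred to T.

```python
import math

def shortest_paths(graph, source):
    """graph: dict vertex -> dict successor -> non-negative edge weight."""
    V = set(graph)                  # vertices whose distance is not yet final
    T = set()                       # the closest vertices identified so far
    dist = {u: math.inf for u in graph}
    dist[source] = 0
    while V:
        u = min(V, key=lambda x: dist[x])   # closest vertex still in V
        V.remove(u)
        T.add(u)
        for v, w in graph[u].items():       # update distances of vertices left in V
            if v in V and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# Hypothetical example graph
GRAPH = {"s": {"a": 4, "b": 1}, "a": {"t": 1}, "b": {"a": 2, "t": 6}, "t": {}}
dist = shortest_paths(GRAPH, "s")
```

Selecting the minimum with a linear scan keeps the sketch short; a priority queue would be the usual choice for large graphs.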

Integer Linear Programming:
• Integer linear programming (ILP) is a specific way of casting a
combinatorial optimization problem in a mathematical format.
• This does not help from the point of view of computational complexity,
as ILP is NP-complete itself.
• ILP formulations for problems from the field of VLSI design automation
are often encountered due to the existence of "ILP solvers".
• ILP solvers are software packages that accept any instance of ILP as
their input and generate an exact solution for the instance.

• Why ILP is useful in CAD for VLSI:
• The input sizes of the problems involved may be small enough for an
ILP solver to find a solution in reasonable time.
• One then has an easy way of obtaining exact solutions, compared to
techniques such as branch-and-bound.

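Linear programming (LP):
In the canonical form, all variables are apparently restricted to be non-negative.
This is no real restriction: a variable xi that may assume negative values can be
replaced by the difference xj - xk of two new variables xj and xk that are both
restricted to be non-negative.
In the standard form, inequality constraints are turned into equalities by adding
slack variables; b1 is such a slack variable.
It is possible to solve LP problems by a polynomial-time algorithm
called the ellipsoid algorithm.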

Integer Linear Programming: ILP is a variant of LP in which the variables
are restricted to be integers.
The techniques used for finding solutions of LP are in general not suitable
for ILP.
Other techniques that more explicitly deal with the integer values of the
variables should be used.
If the integer variables are restricted further to assume either of the values
zero or one, the variant of ILP is called zero-one integer linear
programming.
Zero-one ILP formulation for the TSP:
Consider a graph G(V, E) where the edge set E contains k edges: E = {e1, e2, ..., ek}.
The ILP formulation requires a variable xi for each edge ei.
The variable xi can either have the value 1, which means that the corresponding edge
ei has been selected as part of the solution, or the value 0, meaning that ei is not
part of the solution.
Cost function, to be minimized, with w(ei) the weight of edge ei:
$\sum_{i=1}^{k} w(e_i)\, x_i$
In the optimal solution, only those xi that correspond to edges in the optimal tour
have the value 1.

Degree constraint: for each vertex, exactly two of its incident edges are selected
(a tour enters and leaves every vertex once).
With these constraints alone, the solution set would also include solutions that
consist of multiple disjoint tours.
Additional constraint: a tour that visits all vertices in the graph should pass
through at least two of the edges that connect a vertex in V1 with a vertex in V2
(where V2 = V \ V1). Both V1 and V2 should contain at least three vertices.

Size of the problem instance:
The number of variables is equal to the number of edges.
The number of degree constraints is equal to the number of vertices.
The number of subset constraints, however, can grow exponentially, as the number of
subsets of V equals 2^|V|.

 
Local Search:
• Local search is a general-purpose optimization method that works with
fully specified solutions f of a problem instance (F, c)
• It makes use of the notion of a neighbourhood N(f) of a feasible
solution f.
• It works with a subset of F that is "close" to f in some sense.
• A neighbourhood is a function that assigns a number of feasible
solutions to each feasible solution: N : F → 2^F.
• 2^F denotes the power set of F.
• Any g ∈ N(f) is called a neighbour of f.

• The principle of local search is to successively visit a number of feasible solutions in the search space.
• The transition from one solution to the next in the neighbourhood is called a move or a local transformation.

Multiple minima problem:
• If the function has a single minimum, it will be found.
• For functions with many minima, most of which are local, local search has the property that it can get stuck in a local minimum.
• The larger the neighbourhoods considered, the larger is the part of the search space explored and the higher is the chance of finding a solution with good quality.
• One more possibility is to repeat the search a number of times with different initial solutions.
• One should be able to move to a solution with a higher cost, by means of so-called uphill moves.

Simulated Annealing:
• A material is first heated up to a temperature that allows all its molecules to move freely around (the material becomes liquid), and is then cooled down very slowly.
• At the end of the process, the total energy of the material is minimal.
• The energy corresponds to the cost function.
• The movement of the molecules corresponds to a sequence of moves in the set of feasible solutions.
• The temperature corresponds to a control parameter T which controls the acceptance probability for a move from a solution f to a neighbour g ∈ N(f).

Simulated annealing allows many uphill moves at the beginning of the search and gradually decreases their frequency. The function random(k) generates a real-valued random number between 0 and k with a uniform distribution. The combination of the functions thermal equilibrium, new temperature and stop realizes a strategy for simulated annealing, which is called the cooling schedule.
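A minimal sketch of such a search loop is given below. The acceptance rule exp(-Δc/T) (the usual Metropolis criterion), the geometric cooling schedule and all parameter values are assumptions of this sketch, as are the function names and the toy cost function.

```python
import math
import random

def simulated_annealing(f, cost, neighbour, t_start=10.0, alpha=0.9,
                        moves_per_t=50, t_stop=1e-3, seed=0):
    rng = random.Random(seed)
    best = f
    T = t_start
    while T > t_stop:                      # "stop": temperature low enough
        for _ in range(moves_per_t):       # crude "thermal equilibrium" test
            g = neighbour(f, rng)
            delta = cost(g) - cost(f)
            # downhill moves are always accepted,
            # uphill moves with probability exp(-delta / T)
            if delta <= 0 or rng.random() < math.exp(-delta / T):
                f = g
                if cost(f) < cost(best):
                    best = f
        T *= alpha                         # "new temperature": geometric cooling
    return best

# Hypothetical toy problem: integers 0..20, neighbours differ by one.
START = 0

def toy_cost(x):
    return (x - 7) ** 2

def toy_neighbour(x, rng):
    return min(20, max(0, x + rng.choice((-1, 1))))

sa_best = simulated_annealing(START, toy_cost, toy_neighbour)
```

Because T shrinks over time, the probability of accepting an uphill move of a given size decreases as the search proceeds.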

Tabu Search:
Given a neighbourhood subset G ⊆ N(f) of a feasible solution f, the principle of tabu search is to move to the cheapest element g ∈ G even when c(g) > c(f). The tabu search method does not directly restrict uphill moves throughout the search process. In order to avoid a circular search pattern, a so-called tabu list containing the k last visited feasible solutions is maintained. This only helps, of course, to avoid cycles of length ≤ k.
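The following minimal sketch illustrates the principle on a one-dimensional cost landscape. The function names, the landscape and the choice to take G as all neighbours that are not on the tabu list are assumptions of the example.

```python
from collections import deque

def tabu_search(f, cost, neighbours, k=5, iterations=50):
    tabu = deque([f], maxlen=k)        # the k last visited feasible solutions
    best = f
    for _ in range(iterations):
        G = [g for g in neighbours(f) if g not in tabu]
        if not G:                      # every neighbour is tabu: give up
            break
        f = min(G, key=cost)           # cheapest element, even if c(g) > c(f)
        tabu.append(f)
        if cost(f) < cost(best):
            best = f
    return best

# Hypothetical landscape: position 1 is a local minimum, position 5 the global one.
COSTS = [5, 3, 4, 6, 2, 1, 7]

def toy_cost(i):
    return COSTS[i]

def toy_neighbours(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(COSTS)]

tabu_best = tabu_search(0, toy_cost, toy_neighbours)
```

Starting at position 0, a purely downhill search would stop at position 1; the tabu list forces the search onwards over the hill at positions 2 and 3.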


Genetic Algorithms:
Instead of repetitively transforming a single current solution into a next one by the application of a move, the algorithm simultaneously keeps track of a set P of feasible solutions, called the population. In an iterative search process, the current population is replaced by the next one. In order to generate a new feasible solution, two feasible solutions, called the parents of the child, are first selected from the current population. The child is generated in such a way that it inherits part of its "properties" from one parent and the other part from the second parent, by the application of an operation called crossover.
First of all, this operation assumes that all feasible solutions can be encoded by a fixed-length vector f = [f1, f2, ..., fn]T, as was the case for the backtracking algorithm. Bit strings are used to represent feasible solutions. Not only the number of vector elements n is fixed, but also the number of bits used to represent the value of each element.

Consider an instance of the unit-size placement problem with 100 cells and a 10 x 10 grid. As 4 bits are necessary to represent one coordinate value (each value is an integer between 1 and 10) and 200 coordinates (100 coordinate pairs) specify a feasible solution, the chromosomes of this problem instance have a length of 800 bits.
A feasible solution = the phenotype.
Its encoding as a chromosome = the genotype.
Given two chromosomes, a crossover operator will use some of the bits of the first parent and some of the second parent to create a new bit string representing the child. A simple crossover operator works as follows: generate a random number r between 1 and the length l of the bit strings for the problem instance. Copy the bits 1 through r - 1 from the first parent and the bits r through l from the second parent.

Suppose that the bit strings of the example represent the coordinates of the placement problem on a 10 x 10 grid, now with only a single cell to place (an artificial problem). The bit string for a feasible solution is then obtained by concatenating the two 4-bit values of the coordinates of the cell. So, f(k) is a placement on position (5, 9) and g(k) one on position (8, 6). The children generated by crossover represent placements at respectively (5, 14) and (8, 1). Clearly, a placement at (5, 14) is illegal: it does not represent a feasible solution as coordinate values cannot exceed 10.
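The single-cell example can be reproduced with a few lines of code. The helper names are hypothetical, and the cut position r = 6 is an assumption: it is the value that yields the two children mentioned above.

```python
def encode(x, y):
    """Concatenate the two 4-bit coordinate values into one bit string."""
    return format(x, "04b") + format(y, "04b")

def decode(bits):
    return int(bits[:4], 2), int(bits[4:], 2)

def crossover(p1, p2, r):
    """Bits 1 .. r-1 from the first parent, bits r .. l from the second (1-based)."""
    return p1[:r - 1] + p2[r - 1:]

def is_feasible(bits):
    return all(1 <= c <= 10 for c in decode(bits))

f = encode(5, 9)                 # "01011001"
g = encode(8, 6)                 # "10000110"
child1 = crossover(f, g, 6)      # "01011" + "110"
child2 = crossover(g, f, 6)      # "10000" + "001"
```

The first child decodes to (5, 14) and fails the feasibility check, which shows that this encoding does not guarantee feasible children.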

The combination of the chromosome representation and the crossover operator for generating new feasible solutions may lead to more complications. Consider e.g. the traveling salesman problem, for which each of the feasible solutions can be represented by a permutation of the cities. Two example chromosomes for a six-city problem instance with cities c1 through c6 could then look like "C1C3C6C5C2C4" and "C4C2C1C5C3C6". In such a situation, the application of the crossover operator as described for binary strings is very likely to produce solutions that are not feasible: a cut after the second city gives the illegal solution "C1C3C1C5C3C6" (or "C4C2C6C5C2C4").

Order crossover: for chromosomes that represent permutations.
This operator copies the elements of the first parent chromosome until the point of the cut into the child chromosome. The remaining part of the child is composed of the elements missing in the permutation, in the order in which they appear in the second parent chromosome. Consider again the chromosomes "C1C3C6C5C2C4" and "C4C2C1C5C3C6", cut after the second city. Then the application of order crossover would lead to the child "C1C3C4C2C5C6".
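A minimal sketch of order crossover, with chromosomes represented as Python lists of city names (a representation chosen for this example):

```python
def order_crossover(parent1, parent2, cut):
    """Copy parent1 up to the cut, then add the missing elements in the
    order in which they appear in parent2."""
    head = parent1[:cut]
    tail = [city for city in parent2 if city not in head]
    return head + tail

P1 = ["C1", "C3", "C6", "C5", "C2", "C4"]
P2 = ["C4", "C2", "C1", "C5", "C3", "C6"]
child = order_crossover(P1, P2, 2)
```

The child is always a permutation of the cities, so it is feasible by construction.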

The function select is responsible for the selection of feasible solutions from the current population, favouring those that have a better cost. The function stop decides when to terminate the search, e.g. when there has been no improvement on the best solution in the population during the last m iterations, where m is a parameter of the algorithm.

Possible variations:
• A stronger preference can be given to parents with a lower cost when selecting pairs of parents to be submitted to the crossover operator.
• Mutation: mutation helps to avoid getting stuck in a local minimum.
• One can work with more sophisticated crossover operators, e.g. operators that make multiple cuts in a chromosome.
• One can copy some members of the population entirely to the new generation instead of generating new children from them.
• Instead of distinguishing between the populations P(k) and P(k+1), one could directly add a new child to the population and simultaneously remove some "weak" member of the population.

Longest-path Algorithm for DAGs
• A variable pi is associated with each vertex vi to keep count of the edges incident to vi that have already been processed.
• Because the graph is acyclic, once all incoming edges have been processed, the longest path to vi is known.
• Once processed, vi is included in a set Q.
• It will be taken out later on to traverse the edges incident from it in order to propagate the longest-path values to the vertices at their endpoints.
• Any data structure that is able to implement the semantics of a "set" can be used.
• All edges in the graph are visited exactly once during the execution of the inner for loop.
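The points above can be sketched as follows. In this sketch p[i] counts the incoming edges that are still unprocessed (it counts down instead of up), the source is assumed to be the only vertex without incoming edges, and the function name and example graph are hypothetical.

```python
def longest_path_dag(n, edges, source=0):
    """edges: list of (i, j, d_ij) for a DAG on vertices 0 .. n-1."""
    succ = [[] for _ in range(n)]
    p = [0] * n                      # p[j]: incoming edges of v_j not yet processed
    for i, j, d in edges:
        succ[i].append((j, d))
        p[j] += 1
    x = [0] * n                      # longest-path values, source at 0
    Q = {source}
    while Q:
        i = Q.pop()                  # any vertex whose longest path is known
        for j, d in succ[i]:         # every edge is visited exactly once
            x[j] = max(x[j], x[i] + d)
            p[j] -= 1
            if p[j] == 0:            # all incoming edges processed
                Q.add(j)
    return x

EDGES = [(0, 1, 2), (0, 2, 1), (1, 3, 3), (2, 3, 5), (1, 2, 2)]
x = longest_path_dag(4, EDGES)
```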



Layout Compaction:
At the lowest level, the level of the mask patterns for the fabrication of the circuit, a final optimization can be applied to remove redundant space. This optimization is called layout compaction. Layout compaction can be applied in four situations:
1. Converting symbolic layout to geometric layout.
2. Removing redundant area from geometric layout.
3. Adapting geometric layout to a new technology.
4. Correcting small design rule errors.
A new technology means that the design rules have changed. As long as the new and old technologies are compatible (e.g. both are CMOS technologies), this adaptation can be done automatically, e.g. by means of so-called mask-to-symbolic extraction.

The Layout design problem:
A layout is considered to consist of rectangles. The rectangles can be classified into two groups:
1. Rigid rectangles correspond to transistors and contact cuts whose length and width are fixed. When they are moved during a compaction process, their lengths and widths do not change.
2. Stretchable rectangles correspond to wires. In principle the width of a wire cannot be modified. The length of a wire, however, can be changed by compaction.

Compaction tools:
• Layout is essentially two-dimensional and layout elements can in principle be moved both horizontally and vertically for the purpose of compaction.
• When one-dimensional compaction tools are used, the layout elements are only moved along one direction (either vertically or horizontally). This means that the tool has to be applied at least twice: once for horizontal and once for vertical compaction.
• Two-dimensional compaction tools move layout elements in both directions simultaneously.
• Theoretically, only two-dimensional compaction can achieve an optimal result. This type of compaction is NP-complete. On the other hand, one-dimensional compaction can be solved optimally in polynomial time.

In one-dimensional, say horizontal, compaction a rigid rectangle can be represented by one x-coordinate (of its centre, for example) and a stretchable one by two (one for each of the endpoints). A minimum-distance design rule between two rectangle edges can now be expressed as an inequality of the form $x_j - x_i \ge d_{ij}$. In the example, the minimum width for the layer concerned is a and the minimum separation is b. A graph representing all these inequalities is called a constraint graph: each coordinate xi corresponds to a vertex vi and each inequality to an edge from vi to vj with weight dij.

There is a source vertex v0, located at x = 0.

Directed acyclic graph:
A constraint graph derived from only minimum-distance constraints has no cycles. The length of the longest path from the source vertex v0 to a specific vertex vi in the constraint graph G(V, E) gives the minimal x-coordinate xi associated to that vertex.

The Longest Path in Graphs with Cycles:
Two cases can be distinguished:
1. The graph only contains negative cycles, i.e. the sum of the edge weights along any cycle is negative.
2. The graph contains positive cycles. The problem for graphs with positive cycles is NP-hard. A constraint graph with positive cycles corresponds to a layout with conflicting constraints. Such a layout is called an over-constrained layout and is impossible to realize.

The Liao-Wong algorithm partitions the edge set E of the constraint graph G(V, E) into two sets Ef and Eb. The edges in Ef have been obtained from the minimum-distance inequalities and are called forward edges. The edges in Eb correspond to maximum-distance inequalities and are called backward edges.

At the kth iteration of the do loop, the values of the xi represent the longest paths going through all forward edges and possibly k backward edges.

As the DAG longest-path algorithm has a time complexity of O(|Ef|) and is called at most |Eb| times, the Liao-Wong algorithm has a time complexity of O(|Eb| x |Ef|). This makes the algorithm interesting in cases when the number of backward edges is relatively small.
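A minimal sketch of this scheme is shown below. It is not the original pseudocode: the forward pass is implemented by relaxing the forward edges in topological order (which has the same effect as the DAG longest-path algorithm started from the current values), and the function names and example constraints are hypothetical. A maximum-distance constraint xj - xi ≤ c is represented by a backward edge (j, i, -c).

```python
import math

def topological_order(n, edges):
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    for i, j, _ in edges:
        succ[i].append(j)
        indeg[j] += 1
    order, ready = [], [v for v in range(n) if indeg[v] == 0]
    while ready:
        i = ready.pop()
        order.append(i)
        for j in succ[i]:
            indeg[j] -= 1
            if indeg[j] == 0:
                ready.append(j)
    return order

def liao_wong(n, forward, backward, source=0):
    rank = {v: r for r, v in enumerate(topological_order(n, forward))}
    fwd_sorted = sorted(forward, key=lambda e: rank[e[0]])
    x = [-math.inf] * n
    x[source] = 0
    for _ in range(len(backward) + 1):
        # forward pass: longest paths in the DAG (V, Ef) from the current values
        for i, j, d in fwd_sorted:
            x[j] = max(x[j], x[i] + d)
        changed = False
        for i, j, d in backward:          # check the backward edges
            if x[j] < x[i] + d:
                x[j] = x[i] + d
                changed = True
        if not changed:
            return x
    raise ValueError("positive cycle: the layout is over-constrained")

# x1 - x0 >= 1, x2 - x0 >= 4, x3 - x1 >= 1, and x2 - x1 <= 2
FORWARD = [(0, 1, 1), (0, 2, 4), (1, 3, 1)]
BACKWARD = [(2, 1, -2)]
x = liao_wong(4, FORWARD, BACKWARD)

try:
    liao_wong(2, [(0, 1, 5)], [(1, 0, -3)])   # x1 - x0 >= 5 and x1 - x0 <= 3
    conflict_detected = False
except ValueError:
    conflict_detected = True
```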

The Bellman-Ford Algorithm:
The algorithm does not discriminate between forward and backward edges. S1 contains the current wave front and S2 is the one for the next iteration. n is the number of vertices. After k iterations, the algorithm has computed the longest-path values for paths going through k - 1 intermediate vertices.

The time complexity of the Bellman-Ford algorithm is O(n x |E|), as each iteration visits all edges at most once and there are at most n iterations.
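A minimal sketch of the wave-front formulation for longest paths follows. The function name and the example edges are hypothetical; the variable names s1 and s2 mirror the sets S1 and S2 mentioned above.

```python
import math

def bellman_ford_longest(n, edges, source=0):
    """edges: list of (i, j, d_ij); forward and backward edges are treated alike."""
    succ = [[] for _ in range(n)]
    for i, j, d in edges:
        succ[i].append((j, d))
    x = [-math.inf] * n
    x[source] = 0
    s1 = {source}                    # current wave front
    count = 0
    while count <= n and s1:
        s2 = set()                   # wave front for the next iteration
        for i in s1:
            for j, d in succ[i]:
                if x[j] < x[i] + d:
                    x[j] = x[i] + d
                    s2.add(j)
        s1 = s2
        count += 1
    if count > n:
        raise ValueError("positive cycle: the layout is over-constrained")
    return x

EDGES = [(0, 1, 1), (0, 2, 4), (1, 3, 1), (2, 1, -2)]
x = bellman_ford_longest(4, EDGES)

try:
    bellman_ford_longest(2, [(0, 1, 5), (1, 0, -3)])
    conflict_detected = False
except ValueError:
    conflict_detected = True
```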

High-level synthesis is often divided into a number of subtasks. Considering them as independent tasks makes it easier to define optimization problems and to design algorithms to solve them.
Scheduling is the task of determining the instants at which the execution of the operations in the data-flow graph (DFG) will start.
Allocation (or "resource allocation" or module selection) simply reserves the hardware resources that will be necessary to realize the algorithm.
Assignment (also called binding) maps each operation in the DFG to a specific functional unit on which the operation will be executed. Assignment is also concerned with mapping storage values to specific memory elements and data transfers to interconnection structures.

Clique cover problem:
The clique cover problem (also sometimes called partition into cliques) is the problem of determining whether the vertices of a graph can be partitioned into k cliques.

The assignment (also called binding) problem is called task-to-agent assignment, where a task can be an operation or a value and an agent can be a functional unit (FU) or a register. Tasks are called compatible if they can be executed on the same agent. In the case of values, compatibility means that their life times do not overlap. The set of tasks can be used as the vertex set of a so-called compatibility graph Gc(Vc, Ec). The graph has an edge (vi, vj) ∈ Ec if and only if the tasks vi and vj are compatible.
One can say that two tasks are in conflict if they cannot be executed on the same agent. The set of tasks is then used as the vertex set of a conflict graph that has edges for those vertex pairs that are in conflict. The conflict graph is the complement graph of the compatibility graph.

The goal of the assignment problem is to minimize the number of agents for the given set of tasks. The vertices of any complete subgraph of a compatibility graph correspond to a set of tasks that can be assigned to the same agent. The goal of the assignment problem is then to partition the compatibility graph in such a way that each subset in the partition forms a complete graph and the number of subsets in the partition is minimal. The subsets are pairwise disjoint and the union of the subsets forms the original set, by definition of a partition. In the literature such a partitioning is called a clique partitioning.
Combining vertices in the compatibility graph results in a supervertex. The index I of a supervertex represents the set of indices of the vertices from which the supervertex was formed. For example, combining vertices 1, 3 and 7 gives a supervertex v{1,3,7}. A supervertex vn is a common neighbour of the supervertices vi, vj ∈ Vk if both edges (vi, vn) and (vj, vn) are included in the set Ek.


Placement and Partitioning
The input to the placement problem is the structural description of a circuit; it is the output of high-level synthesis. Such a description consists of a list of design subentities (hardware subparts with their layout designs) and their interconnection patterns that together specify the complete circuit. The goal of placement is to determine the location of these layouts on the chip such that the total resulting chip area is minimal and sufficient space is left for wiring. The wiring should realize exactly the interconnections specified in the structural description (routing problem). Before dealing with the placement problem and possible solutions, some attention is paid to the representation of an electric circuit (partitioning).

Partitioning Problem
The partitioning problem deals with splitting a network into two or more parts by cutting connections. The partitioning problem is treated here together with placement because solution methods for the partitioning problem can be used as a subroutine for some types of placement algorithms.
Data model of an electric circuit: the organization of the data structures that represent an electric circuit.

The data model consists of the three structures cell, port and net. A cell is the basic building block of a circuit. A NAND gate is an example of a cell. The point at which a connection between a wire and a cell is established is called a port. The wire that electrically connects two or more ports is a net. A set of ports is associated with each net. A port can only be part of a single net.


The information stored in masters originates from a library. An input cell has a single port through which it sends a signal to the circuit and an output cell has a single port through which it receives a signal from the circuit. Ports are indicated by small squares. Dashed lines show the cell boundaries.

a net set. one for edges connecting nets with ports edges never connect vertices of the same type . a cell set. a port set and 3. There will be two edge sets: 4.The graph will have three distinct sets of vertices: 1. 2. one for edges connecting cells with ports and 5.

A hypergraph consists of vertices and hyperedges; hyperedges connect two or more vertices. In the hypergraph model, the vertices represent the cells and the hyperedges represent the nets.
Clique model: obtained by omitting the explicit representation of nets. Used for clique partitioning.

Wire-length Estimation
The total wire length is used to evaluate the quality of a placement.
Estimation: a wire-length metric is applied to each net, resulting in a length estimate per net. The total wire length estimation is then obtained by summing the individual estimates. The total wiring area can then be derived from this length by assuming a certain wire width and a wire separation distance. All metrics refer to a cell's coordinates. Common metrics are:
Half perimeter: This metric computes the smallest rectangle that encloses all terminals of a net and takes the sum of the width and height of the rectangle as an estimation of the wire length. The estimation is exact for two- and three-terminal nets and gives a lower bound for the wire length of nets with four or more terminals.
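The half-perimeter metric is simple to compute. In the sketch below, the data layout (a dictionary of cell coordinates and nets given as lists of cell names) and the function names are assumptions of the example.

```python
def half_perimeter(terminals):
    """Width plus height of the smallest rectangle enclosing all terminals."""
    xs = [x for x, _ in terminals]
    ys = [y for _, y in terminals]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wire_length(nets, position):
    """Sum of the per-net estimates; every terminal is taken at its cell's coordinates."""
    return sum(half_perimeter([position[c] for c in net]) for net in nets)

POSITION = {"a": (0, 0), "b": (3, 4), "c": (1, 2), "d": (5, 1)}
NETS = [["a", "b"], ["b", "c", "d"]]
estimate = total_wire_length(NETS, POSITION)
```

Here both nets contribute 7 units, giving a total estimate of 14.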

Minimum rectilinear spanning/Steiner tree: The minimum Steiner tree always has a length shorter than or equal to that of the spanning tree. Both can be utilized to estimate the total wire length required.
Squared Euclidean distance: This method is meant for the clique-model representation of an electric circuit. As nets are not explicitly present in this model, the total cost is obtained by summing over the cells rather than over the nets. The cost of a placement is then defined as:
$\frac{1}{2} \sum_{i} \sum_{j} \gamma_{ij} \left[ (x_i - x_j)^2 + (y_i - y_j)^2 \right]$
Here γij is the weight of the edge between the vertices vi and vj; γij is zero if there is no edge between them.

Types of Placement Problem
Standard-cell placement: standard cells are predesigned small circuits (e.g. simple logic gates, flip-flops, etc.).
Rules for placement:
1. Connections that are shared by all or most cells, like e.g. power and clock connections (called the logistic signals), cross the cells from left to right at fixed locations.
2. Signals related to the specific I/O of the cell have to leave the cell either at the top or the bottom.
3. Cells are collected into rows separated by wiring or routing channels; within a row they are connected by horizontal abutment (standard-cell layout).
4. In full-custom design, where designers have the freedom to give arbitrary shapes to their cells, the cells need wiring space all around (building-block layout).
Apart from the standard-cell and building-block layout styles, a combination of the two is also possible.

The placement problem for standard cells or building blocks is more complex than the unit-size placement problem. One obvious difference is that moves that exchange two cells, as encountered in many general-purpose algorithms, are not always possible due to the size difference.

Placement Algorithms:
Placement algorithms can be grouped into two categories:
Constructive placement: the algorithm is such that once the coordinates of a cell have been fixed, they are not modified anymore.
Iterative placement: all cells already have some coordinates and cells are moved around, their positions are interchanged, etc. in order to get a new (hopefully better) configuration.
An initial placement is obtained in a constructive way and attempts are made to increase the quality of the placement by iterative improvement.

Constructive Placement:
Partitioning methods, which divide the circuit in two or more subcircuits of a given size while minimizing the number of connections between the subcircuits:
1. min-cut partitioning and
2. clustering.
Min-cut partitioning: The basic idea of min-cut placement is to split the circuit into two subcircuits of more or less equal size while minimizing the number of nets that are connected to both subcircuits. The two subcircuits obtained will each be placed in separate halves of the layout. The number of long wires crossing from one half of the chip to the other will be minimized. Bipartitioning is recursively applied.

The second task can be based on different heuristics. One such heuristic is to look at the parts of the circuit that already have a fixed position (either because the placement of these parts is already fixed or because they are connected to the inputs or outputs of the chip that are located at the chip's periphery).
Min-cut placement is a top-down method.

Iterative Improvement:
Iterative improvement is a method that perturbs a given placement by changing the positions of one or more cells and evaluates the result. If the new cost is less than the old one, the new placement replaces the old one and the process continues.

Perturbation of a feasible solution for standard-cell or building-block placement is more complex due to the inequality of the cell sizes. Different approaches are possible:
1. One can allow that cells in a feasible solution overlap and make the overlap part of the cost function to be minimized. This will direct a placement algorithm towards solutions with little or no overlap. Any overlap that remains can be eliminated by pulling apart the cells in the final layout (at the expense of a larger overall chip area).
2. One can eliminate overlaps directly after each move by shifting an appropriate part of the cells in the layout. In general, this is a computation-intensive operation as the coordinates of many cells in the layout have to be recomputed as well.

Force-directed placement:
It assumes that cells that share nets feel an attractive "force" from each other. The goal is to reduce the total force in the network. One can compute the "center of gravity" of a cell, the position where the cell feels a force zero. With γij the weight of the connection between cells i and j (zero if they are not connected), the center of gravity (xig, yig) of a cell i is defined as:
$x_i^g = \frac{\sum_j \gamma_{ij}\, x_j}{\sum_j \gamma_{ij}}, \qquad y_i^g = \frac{\sum_j \gamma_{ij}\, y_j}{\sum_j \gamma_{ij}}$
A perturbation is then to move a cell to a legal position close to its center of gravity and, if there is another cell at that position, to move that cell to some empty location or to its own center of gravity.
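The center of gravity is a weighted average of the positions of the connected cells. In this sketch the connection weights are stored in a nested dictionary; that layout, the function name and the example values are assumptions.

```python
def center_of_gravity(i, position, weight):
    """weight[i][j]: connection weight between cells i and j (absent means zero).
    Cell i is assumed to have at least one connection."""
    total = sum(weight[i].values())
    xg = sum(w * position[j][0] for j, w in weight[i].items()) / total
    yg = sum(w * position[j][1] for j, w in weight[i].items()) / total
    return xg, yg

POSITION = {"b": (0, 0), "c": (4, 8)}
WEIGHT = {"a": {"b": 1, "c": 3}}
cog = center_of_gravity("a", POSITION, WEIGHT)
```

Cell a is pulled three times harder towards c than towards b, so its center of gravity lies at three quarters of the way from b to c.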

Partitioning:
Partitioning is needed when a large circuit has to be implemented with multiple chips and the number of pins on the IC packages necessary for interchip communication should be minimized.

Kernighan-Lin Partitioning Algorithm
There is an edge-weighted undirected graph G(V, E). The graph has 2n vertices (|V| = 2n). An edge (a, b) ∈ E has a weight γab; if (a, b) ∉ E, γab = 0. The problem is to find two sets A and B, subject to A ∪ B = V, A ∩ B = ∅ and |A| = |B| = n, which minimize the cut cost defined as follows:
$\sum_{(a, b) \in A \times B} \gamma_{ab}$

The principle of the algorithm is to start with an initial partition consisting of the sets A0 and B0 which, in general, will not have a minimal cut cost. In an iterative process, subsets of both sets are isolated and interchanged. In iteration number m, the set isolated from Am-1 will be denoted by Xm and the set isolated from Bm-1 will be denoted by Ym. The new sets Am and Bm are then obtained as follows:
$A_m = (A_{m-1} \setminus X_m) \cup Y_m, \qquad B_m = (B_{m-1} \setminus Y_m) \cup X_m$
The algorithm makes an attempt to find suitable subsets, interchanges them and then makes a new attempt, until the attempt does not lead to an improvement.

The construction of the sets Xm and Ym is based on external and internal costs for vertices in the sets Am-1 and Bm-1. The external cost Ea of a ∈ Am-1 is defined as follows:
$E_a = \sum_{y \in B_{m-1}} \gamma_{ay}$
The external cost for vertex a ∈ Am-1 is a measure for the pull that the vertex experiences from the vertices in Bm-1. Similarly, the external cost Eb for a vertex b ∈ Bm-1 is:
$E_b = \sum_{x \in A_{m-1}} \gamma_{bx}$
The internal costs Ia and Ib are:
$I_a = \sum_{x \in A_{m-1}} \gamma_{ax}, \qquad I_b = \sum_{y \in B_{m-1}} \gamma_{by}$
The difference between external and internal costs gives an indication about the desirability to move the vertex: a positive value shows that the vertex should better be moved to the opposite set; a negative value shows a preference to keep the vertex in its current set. The differences for the vertices in both sets are given by the variables Da and Db:
$D_a = E_a - I_a, \qquad D_b = E_b - I_b$
The gain in the cut cost, Δ, resulting from the interchange of two vertices a and b can be written as:
$\Delta = D_a + D_b - 2\gamma_{ab}$

It is important to realize that the best cut cost improvement leading to the selection of a pair (ai, bi) may be negative. Once all vertices have been locked, the pairs are investigated in the order of selection: the actual subsets to be interchanged correspond to the sequence of pairs (starting with i = 1) giving the best improvement. Pairs in the sequence may have negative cost improvements as long as the pairs following them compensate for it. Such a situation would occur when the exchange of two clusters of tightly connected vertices results in an improvement, while the exchange of individual vertices from each cluster does not improve the cut cost.
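One iteration of this procedure (selecting and locking pairs, then interchanging the best prefix of the sequence) can be sketched as follows. The sketch assumes the usual Kernighan-Lin quantities: D = external cost minus internal cost, and gain D_a + D_b - 2w(a, b) for interchanging a and b. The function names and the example graph (two triangles joined by one edge) are hypothetical.

```python
def kl_pass(A, B, w):
    """One Kernighan-Lin iteration; w(a, b) is the symmetric edge weight (0 if no edge).
    Returns the new sets and the gain in cut cost."""
    side = {v: 0 for v in A}
    side.update({v: 1 for v in B})
    # D_v = external cost - internal cost
    D = {v: sum(w(v, u) * (1 if side[u] != side[v] else -1)
                for u in side if u != v) for v in side}
    free_a, free_b = set(A), set(B)
    pairs, gains = [], []
    while free_a:
        # best unlocked pair; its gain may be negative
        g, a, b = max(((D[a] + D[b] - 2 * w(a, b), a, b)
                       for a in free_a for b in free_b), key=lambda t: t[0])
        pairs.append((a, b))
        gains.append(g)
        free_a.remove(a)                 # lock a and b
        free_b.remove(b)
        for x in free_a:                 # update D as if a and b were interchanged
            D[x] += 2 * w(x, a) - 2 * w(x, b)
        for y in free_b:
            D[y] += 2 * w(y, b) - 2 * w(y, a)
    # the prefix of the pair sequence with the best accumulated gain
    best_k, best_gain, total = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        total += g
        if total > best_gain:
            best_k, best_gain = k, total
    X = {a for a, _ in pairs[:best_k]}
    Y = {b for _, b in pairs[:best_k]}
    return (set(A) - X) | Y, (set(B) - Y) | X, best_gain

EDGES = {frozenset(e) for e in
         [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]}

def weight(a, b):
    return 1 if frozenset((a, b)) in EDGES else 0

new_a, new_b, gain = kl_pass({1, 2, 4}, {3, 5, 6}, weight)
```

In the full algorithm this pass is repeated on the new sets until a pass yields no positive gain.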

KL algorithm .


Floorplanning:
Floorplan-based design methodology: This top-down design methodology advocates that layout aspects should be taken into account in all design stages. At higher levels of abstraction, due to the lack of detailed information, only the relative positions of the subblocks in the structural description can be fixed. Taking layout into account in all design stages also gives early feedback: structural synthesis decisions can immediately be evaluated for their layout consequences and corrected if necessary. The presence of (approximate) layout information allows for an estimation of wire lengths. From these lengths, one can derive performance properties of the design such as timing and power consumption.

Example: three registers, two multiplexers, a controller and an ALU.

When this type of structural information is not yet fully available, one can estimate the area to be occupied by the various subblocks and, together with a precise or estimated interconnection pattern, try to allocate distinct regions of the integrated circuit to the specific subblocks.

Terminology and Floorplan Representation:
A floorplan can be represented hierarchically: cells are built from other cells, except for those cells that are at the lowest level of the hierarchy.
These lowest-level cells are called leaf cells.
Cells that are made from leaf cells are called composite cells. Composite cells can contain other composite cells as well.
The direct subcells of a composite cell are called its children.
Every cell, except for the one representing the complete circuit, has a parent cell.
Both leaf cells and composite cells are assumed to have a rectangular shape.
If the children of all composite cells can be obtained by bisecting the cell horizontally or vertically, the floorplan is called a slicing floorplan.

So, in a slicing floorplan a composite cell is made by combining its children horizontally or vertically.
A natural way to represent a slicing floorplan is by means of a slicing tree.
The leaves of this tree correspond to the leaf cells. Other nodes correspond with horizontal and vertical composition of the children nodes.

The children are ordered 'from left to right' in horizontal composition and 'from bottom to top' in vertical composition.
When the children of a given cell cannot be obtained by bisections, a wheel or spiral floorplan is present.
A composite cell needs to be composed of at least five cells in order not to be slicing.
One can derive new composition operators from the wheel floorplan and its mirror image and use them in combination with the horizontal and vertical composition operators in a floorplan tree.
A floorplan that can be described in this way is called a floorplan of order 5; a slicing floorplan can also be called a floorplan of order 2.

A representation mechanism that can deal with any floorplan is the polar graph, which actually consists of two directed graphs: the horizontal polar graph and the vertical one.
These graphs can be constructed by identifying the longest possible line segments that separate the cells in the floorplan.
The horizontal segments are used as vertices in the horizontal polar graph and the vertical segments as the vertices in the vertical polar graph.
Each cell is represented by an edge in the polar graph.
In the horizontal one, there will be an edge directed from the line segment that is the cell's top boundary to the line segment that is its bottom boundary.
In the vertical one, a similar idea is used where the edge direction is from the left boundary to the right one.


Abut: When two cells that need to be electrically connected have their terminals in the right order and separated correctly, the cells can simply be put against each other without the necessity for a routing channel in between them. Such cells are said to abut.
Ideally, all composite cells are created by abutment and no routing channels are used in a floorplan.
This requires the existence of flexible cells; flexible cells should be able to accommodate feedthrough wires.
Floorplan-based design does not exclude the existence of routing channels. The channels can be taken care of by incorporating them in the area estimations for the cells.

Optimization Problems in Floorplanning:
1. Mapping of a structural description to a floorplan (e.g. a slicing tree). In a true top-down design methodology, floorplanning will probably be performed manually or interactively as the number of children cells in which a parent cell is subdivided is relatively small and good decisions can be made based on designer experience. Techniques known from placement like min-cut partitioning can also be used in floorplanning. A floorplanning problem related to global routing is called abstract routing.
2. Floorplan sizing: The availability of flexible cells implies the possibility of having different shapes. It is therefore possible to choose a suitable shape for each leaf cell such that the resulting floorplan is optimal in some sense.

3. Generation of flexible cells: This task takes as input a cell shape, data on desired positions of terminals and a netlist of the circuit to be synthesized at some abstraction level, and uses a cell compiler to generate the layout that complies with the input.
The problem is especially complex when the layout has to be composed of individual transistors because of the many degrees of freedom and the huge search space that is associated with it.
Flexibility in a cell's shape can be achieved using primitives belonging to a level higher than the transistor level. An example is a register file of 64 registers that can be laid out in many different ways, such as 8 x 8, 16 x 4, 4 x 16 or 1 x 64.
As this style of design amounts to full-custom design, quite some extra effort has to be spent in the characterization of the generated cells.
Characterization is the process of determining all kinds of electrical properties of a cell, such as parasitic capacitances and propagation delay. Characterization is necessary for an accurate simulation of the circuit containing the generated cell.

Shape Functions and Floorplan Sizing:
When the cell is flexible, one could say that the realization needs an area A.
Whichever shape the cell will have, its height h and its width w have to obey the constraint hw ≥ A.
Due to design rules neither the height nor the width will asymptotically approach zero.
The minimal height given as a function of the width is called the shape function of the cell.

Inset or rigid cell: An inset cell, a predesigned cell residing in a library, has the possibility of rotations (only in multiples of 90°) and mirrorings as the only flexibility to be fit in a floorplan.
The shape function of a composite cell in a slicing floorplan can be computed from the shape functions of its children cells.
If the shape function of c1 is indicated by h1(w) and the one of c2 by h2(w), then for vertical composition the shape function h3(w) of the composite cell c3 can be expressed as:
h3(w) = h1(w) + h2(w)

A small example where both c1 and c2 are inset cells with respective sizes of 4 x 2 and 5 x 3: clearly, there are four ways to stack the two cells vertically.
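The 4 x 2 and 5 x 3 example can be checked with a short sketch. It assumes a shape function is stored as a list of its non-dominated (width, height) corner points; the function names are illustrative, not from the slides.

```python
def orientations(w, h):
    """Shapes available to an inset cell: as given, or rotated by 90 degrees."""
    return {(w, h), (h, w)}

def prune(shapes):
    """Keep only non-dominated (width, height) pairs, sorted by increasing width."""
    result = []
    for w, h in sorted(shapes):
        # A shape is useful only if it is lower than every narrower shape kept so far.
        if not result or h < result[-1][1]:
            result.append((w, h))
    return result

def stack_vertically(shapes1, shapes2):
    """Vertical composition: the widest child sets the width, heights add up."""
    return prune({(max(w1, w2), h1 + h2)
                  for w1, h1 in shapes1 for w2, h2 in shapes2})

def place_side_by_side(shapes1, shapes2):
    """Horizontal composition: widths add up, the tallest child sets the height."""
    return prune({(w1 + w2, max(h1, h2))
                  for w1, h1 in shapes1 for w2, h2 in shapes2})

c1 = orientations(4, 2)
c2 = orientations(5, 3)
print(stack_vertically(c1, c2))    # [(3, 9), (4, 7), (5, 5)]
print(place_side_by_side(c1, c2))  # [(5, 5), (7, 4), (9, 3)]
```

Of the four vertical stackings, the 5 x 7 one is dominated by 5 x 5, so only three corner points remain in the composite shape function.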

In the case of horizontal composition, the shape function of a composite cell has to be computed using a detour via the inverses of the children's shape functions.
The inverse of the composite cell's shape function is the sum of the inverses of its children cells' shape functions.
Children shapes can be easily derived from the chosen parent shape for both types of composition.
Sizing algorithm for slicing floorplans:
1. Construct the shape function of the top-level composite cell in a bottom-up fashion starting with the lowest level and combining shape functions while moving upwards.
2. Choose the optimal shape of the top-level cell.
3. Propagate the consequences of the choice for the optimal shape down the slicing tree until the shapes of all leaf cells are fixed.
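The three steps of the sizing algorithm can be sketched as follows. This is a brute-force illustration under stated assumptions: the slicing tree is a nested tuple ("leaf", name, shapes) or ("h" | "v", left, right), "optimal" is taken to mean minimal area, and no pruning of dominated shapes is done, so it is only suitable for small trees.

```python
def combine(op, shapes1, shapes2):
    """Map every parent shape to one pair of children shapes realizing it."""
    table = {}
    for w1, h1 in shapes1:
        for w2, h2 in shapes2:
            if op == "h":                      # children side by side
                parent = (w1 + w2, max(h1, h2))
            else:                              # "v": children stacked
                parent = (max(w1, w2), h1 + h2)
            table.setdefault(parent, ((w1, h1), (w2, h2)))
    return table

def size_floorplan(tree):
    tables = {}  # per composite node: parent shape -> children shapes

    def up(node):
        # Step 1: build the possible shapes bottom-up.
        if node[0] == "leaf":
            return {shape: None for shape in node[2]}
        table = combine(node[0], up(node[1]), up(node[2]))
        tables[id(node)] = table
        return table

    def down(node, shape, result):
        # Step 3: propagate the chosen shape down to the leaf cells.
        if node[0] == "leaf":
            result[node[1]] = shape
            return
        s1, s2 = tables[id(node)][shape]
        down(node[1], s1, result)
        down(node[2], s2, result)

    top = up(tree)
    # Step 2: choose the top-level shape (here: minimal area).
    best = min(top, key=lambda s: (s[0] * s[1], s))
    result = {}
    down(tree, best, result)
    return best, result

tree = ("v", ("leaf", "c1", [(4, 2), (2, 4)]),
             ("leaf", "c2", [(5, 3), (3, 5)]))
print(size_floorplan(tree))  # ((5, 5), {'c1': (4, 2), 'c2': (5, 3)})
```

A practical implementation would keep only the corner points of each shape function so that the tables stay small.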

Routing
The specification of a routing problem will consist of:
1. the position of the terminals,
2. the netlist that indicates which terminals should be interconnected and
3. the area available for routing in each layer.
Routing is normally performed in two stages.
The first stage, global or loose routing, determines through which wiring channels a connection will run.
The second stage, local or detailed routing, fixes the precise paths that a wire will take (its position inside a channel and its layer).

Types of Local Routing Problems are defined using the following parameters:
1. The number of wiring layers: The number of layers available depends on the technology and the design style. A contact cut that realizes a connection between two layers is often called a via in the context of routing.
2. The orientation of wire segments in a given layer: Reserved-layer models of routing use either horizontal or vertical segments in one layer. Sometimes it is also allowed to use segments with an orientation that is a multiple of 45°.
3. Gridded or gridless routing: In gridded routing, all wire segments run along lines of an orthogonal grid with uniform spacing between the lines. In gridless routing, wires of different widths as well as contacts are explicitly represented.
4. The presence or absence of obstacles: Sometimes the complete routing area is available for routing, sometimes part of the area in one or more layers is blocked.

6. Terminals with a fixed or floating position: In some problems the position of the terminals is fixed, but in other problems the router can move the terminal inside a restricted area.
7. Permutability of terminals: Sometimes the router is allowed to interchange terminals because they are functionally equivalent.
8. Electrically equivalent terminals: In some situations, a group of terminals belonging to the same net may already be connected to each other; the router should connect the rest of the net to only one of the terminals in this group, whichever is the most suitable.

Area Routing (single wiring layer, a grid, the presence of obstacles, and fixed terminals in all the routing area)
Routing problems in which terminals are allowed anywhere in the area available for routing are normally classified as area routing problems.
"Path connection" or "maze routing" algorithm:
The basic algorithm is meant to realize a connection between two points (the "source" terminal and the "target" terminal) in a plane, in an environment that may contain obstacles.
Obstacles are grid points through which no wire segments can pass.
The distance between two horizontally or vertically neighboring grid points corresponds to the shortest possible wire segment.
If a path exists, the algorithm always finds the shortest connection, going around obstacles.
The algorithm consists of three steps: wave propagation, backtracing, and cleanup.
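The three steps can be illustrated with a small sketch for a single layer. It assumes the grid is a list of rows of booleans (True marks an obstacle) and that grid points are (row, column) tuples; the name lee_route is illustrative.

```python
from collections import deque

def neighbors(r, c):
    return ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))

def lee_route(grid, source, target):
    """Return the shortest path from source to target as a list of grid
    points, or None if no path exists. The routed path is marked as an
    obstacle in grid for the connections that follow."""
    rows, cols = len(grid), len(grid[0])
    label = {source: 0}
    queue = deque([source])

    # Step 1, wave propagation: label points with their distance to the source.
    while queue and target not in label:
        r, c = queue.popleft()
        for nr, nc in neighbors(r, c):
            if (0 <= nr < rows and 0 <= nc < cols
                    and not grid[nr][nc] and (nr, nc) not in label):
                label[(nr, nc)] = label[(r, c)] + 1
                queue.append((nr, nc))
    if target not in label:
        return None

    # Step 2, backtracing: walk from the target along decreasing labels.
    path = [target]
    while path[-1] != source:
        r, c = path[-1]
        for nb in neighbors(r, c):
            if label.get(nb) == label[(r, c)] - 1:
                path.append(nb)  # first match; a heuristic could choose here
                break
    path.reverse()

    # Step 3, cleanup: the labels are local and simply discarded;
    # the new wire becomes an obstacle.
    for r, c in path:
        grid[r][c] = True
    return path

grid = [[False, True, False],
        [False, True, False],
        [False, False, False]]
print(lee_route(grid, (0, 0), (0, 2)))
```

In this sketch the backtracing step takes the first suitable neighbor it finds; a preference for keeping the current direction would reduce the number of bends.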



In the backtracing step, sometimes the neighbor carrying the next lower label is not unique: a heuristic should be used to make a choice. Another heuristic can be used not to change the orientation of the path unnecessarily.
Once a path has been found, it will act as an obstacle for the next connections to be made.
The worst-case time complexity of Lee's algorithm operating on an n x n grid is O(n^2). Its space complexity is also O(n^2).
In the case that there are multiple layers, the algorithm operates on a three-dimensional grid, where the size of the third dimension equals the number of layers available.
When a net has three or more terminals, first a path between two terminals should be found and then a generalization of the algorithm has to be used where a path can either act as a source or a target for the wave propagation.

The fact that nets have to be routed sequentially is the weak point of Lee's algorithm: routing the nets in a different order strongly influences the final result.

Channel Routing: Channel routing occurs as a natural problem in standard cell and building block layout styles, but also in the design of printed circuit boards (PCBs).
It consists of routing nets across a rectangular channel.
All terminals belonging to the same net have the same number.

The grid distance is equal to the horizontal separation between the terminals.
The nets have fixed terminals at the top and bottom of the channel and floating terminals at the "open" sides, at the left and right.
A floating terminal is known to enter the channel on the left or on the right side, but it is up to the router to determine the exact position.
Nets 1 and 3 have floating terminals at the left side and the nets 4 and 5 at the right.
The main goal of channel routing is the minimization of the height, while a secondary goal is the minimization of the total wire length and the number of vias.
In other words: the objective is to minimize the area of the channel's rectangular bounding box or, equivalently, to minimize the number of different horizontal tracks needed.

Switchbox routing
A routing problem that has some similarity with channel routing is switchbox routing.
Fixed terminals can be found on all four sides of the rectangular routing area.
Switchbox routing is a decision problem: the goal is to find out whether a solution exists; the minimization of the area is not an optimization goal.
When a solution can be found, a secondary goal is to minimize the total wire length and the number of vias.

Channel Routing Models: classical model
1. All wires run along orthogonal grid lines with uniform separation.
2. There are two wiring layers.
3. Horizontal segments are put on one layer and vertical segments on the other one.
4. For each net, the wiring is realized by a single horizontal segment, with vertical segments connecting it to all terminals of the net.

Gridless routing model: Routers have been designed for working without a grid; this allows each wire to have a specific width.
Reserved-layer model: each layer has only wires in one direction. Works for a small search space.
Nonreserved-layer model: works for a larger solution space.

The Vertical Constraint Graph
Consider a pair of terminals located in the same column and entering the channel in the same layer.
It is obvious that in any solution of the problem, the endpoint of the segment coming from the top has to finish at a position higher than the endpoint of the bottom segment (otherwise, there would be a short circuit).
This restriction is called a vertical constraint.
Each column having two terminals in the same layer gives rise to a vertical constraint.
The constraints are often represented in a vertical constraint graph (VCG).

In this directed graph, the vertices represent the endpoints of the terminal segments and the directed edges represent the relation "should be located above".
It consists of pairs of vertices, each pair connected by a single directed edge from one vertex to the other, one pair for each column that has two terminals in the same layer, and unconnected vertices for the other columns.
Cycles are not allowed.
Two forms are distinguished: the fully separated VCG and the fully merged VCG.

The main problem with the fully merged form is the possible existence of cycles, in which case the corresponding layout cannot be realized: a segment cannot be at the same time above and below another one.
In the absence of cycles in the VCG, a solution with a single horizontal segment per net would amount to finding the longest path in the graph.
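A fully merged VCG (one vertex per net) can be sketched as follows. The input format is an assumption: two equally long lists giving the net number of the top and bottom terminal in each column, with 0 meaning "no terminal". The second function returns the number of vertices on the longest path, which is the number of rows that the vertical constraints alone require, or None when the graph is cyclic.

```python
def build_vcg(top, bottom):
    """Return {net: set of nets that must be placed below it}."""
    vcg = {net: set() for net in top + bottom if net}
    for t, b in zip(top, bottom):
        if t and b and t != b:
            vcg[t].add(b)  # the segment of net t must lie above that of net b
    return vcg

def rows_needed(vcg):
    """Vertices on the longest path of the VCG, or None if it has a cycle."""
    depth = {}
    on_stack = set()

    def visit(v):
        if v in depth:
            return depth[v]
        if v in on_stack:
            raise ValueError("cycle in vertical constraint graph")
        on_stack.add(v)
        d = 1 + max((visit(u) for u in vcg[v]), default=0)
        on_stack.discard(v)
        depth[v] = d
        return d

    try:
        return max((visit(v) for v in vcg), default=0)
    except ValueError:
        return None

vcg = build_vcg(top=[1, 2, 0, 3], bottom=[2, 3, 1, 0])
print(vcg)               # {1: {2}, 2: {3}, 3: set()}
print(rows_needed(vcg))  # 3
print(rows_needed(build_vcg([1, 2], [2, 1])))  # None: nets 1 and 2 form a cycle
```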

Horizontal Constraints and the Left-edge Algorithm
If, in the classical model for channel routing, horizontal segments belonging to different nets are put on the same row (implying that they will be in the same layer too), the segments should not overlap (otherwise there would be a short circuit).
This restriction is called a horizontal constraint.
A net i in a channel routing problem without vertical constraints can be characterized by an interval [ximin, ximax], corresponding to the leftmost and right-most terminal positions of the net.
The goal of channel routing is then reduced to assigning a row position in the channel to each interval.
An optimal solution combines those nonoverlapping intervals on the same row that will lead to a minimal number of rows.

The number of intervals that contain a specific x-coordinate is called the local density at column position x and will be denoted by d(x).
The maximum local density in the range of all column positions is called the channel's density and is denoted by dmax.
Obviously, the density is a lower bound on the number of necessary rows: all intervals that contain the same x-coordinate must be put on distinct rows.
The left-edge algorithm always finds a solution with a number of rows equal to the lower bound.
Structures for the representation of intervals and linked lists of intervals are called interval and list of interval respectively.
Standard "list processing" function calls: first(l) gives the first element of a list l, rest(l) gives the list that remains when the first element is removed.

The list i_list contains the intervals in order of increasing left coordinate.

The time complexity of the algorithm can easily be expressed in terms of the number of intervals n and the density of the problem d (the number of rows in the solution).
Sorting the set of intervals by their left coordinate can be done in O(n log n).
The outer loop will be executed d times and at most n intervals from the sorted list will be inspected in the inner loop.
This leads to a total worst-case time complexity of O(n log n + dn).
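A compact sketch of the left-edge algorithm, together with the density lower bound, might look as follows. It assumes intervals are (x_min, x_max) tuples with closed ends and uses a Python list instead of the linked-list structures mentioned above.

```python
def left_edge(intervals):
    """Assign intervals to rows; returns a list of rows (lists of intervals)."""
    i_list = sorted(intervals)  # increasing left coordinate
    rows = []
    while i_list:               # one iteration per row
        row, remaining, right = [], [], None
        for interval in i_list:
            # Take the interval if it starts strictly right of the row's last one.
            if right is None or interval[0] > right:
                row.append(interval)
                right = interval[1]
            else:
                remaining.append(interval)
        rows.append(row)
        i_list = remaining
    return rows

def density(intervals):
    """Channel density: the maximum number of intervals sharing a column."""
    columns = {x for interval in intervals for x in interval}
    return max(sum(lo <= x <= hi for lo, hi in intervals) for x in columns)

nets = [(1, 4), (2, 6), (5, 8), (7, 9), (3, 5)]
rows = left_edge(nets)
print(rows)           # [[(1, 4), (5, 8)], [(2, 6), (7, 9)], [(3, 5)]]
print(density(nets))  # 3, equal to the number of rows
```

The density only needs to be evaluated at interval endpoints, because the maximum overlap of closed intervals is always reached at some left endpoint.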

The problem of assigning nonoverlapping intervals to rows can also be
described in graph-theoretical terms.
A set of intervals defines a so-called interval graph G(V, E):
for each interval i, there is a vertex vi and
there is an edge (vi, vj) if the corresponding intervals i and j overlap
The problem of finding the minimum number of rows for the channel
routing problem without vertical constraints is equivalent to finding a
vertex coloring of the corresponding interval graph with a minimal
number of colors
The vertex coloring problem for graphs is the problem of assigning a
"color" to all vertices of the graph such that adjacent vertices have
different colors and a minimal number of distinct colors are used

Channel Routing Algorithms
robust channel routing algorithm
The number of rows in the channel is given by the variable height.
The channel routing problems of decreasing size (stored in the variable
N) are solved in subsequent iterations (dynamic programming)
The selected nets will be located on the same row alternatingly either on
the top or the bottom of the remaining channel
Each iteration consists of two parts:
1. the assignment of weights to the nets and
2. the selection of a maximal-weight subset of these nets
The algorithm tries to eliminate vertical constraint violations by maze
routing.

The weight wi of a net i expresses the desirability to assign the net to
either the top or bottom row.
The side (top or bottom) that is selected in some point of the iteration
will be called the "current side".
The following rules are used to compute the weights:
1. For all nets i whose intervals contain the columns of maximal
density, add a large number B to the weights wi.
2. For each net i that has a current-side terminal at the column
position x, add to wi the local density d(x).
3. For each column x for which an assignment of some net i to the
current side will create a vertical constraint violation, subtract
K d(x) from wi;
K is a parameter that should have a value between 5 and 10.
This discourages the creation of vertical constraint violations.

Once all nets have received a weight, the robust routing algorithm
finds the
maximal-weight subset of nets that can be assigned to the same row.
The nets selected for the subset should not have horizontal
constraints.
For any graph, a set of vertices that does not contain pairs of adjacent
vertices is called an independent set.
The problem of finding the maximal-weight subset of the nets could
therefore be formulated as the maximal-weight independent set
problem of the corresponding interval graph.
In the case of the problem of obtaining the group of nonoverlapping
intervals with maximal total weight, the subinstances can be
identified by a single parameter y, with
1 ≤ y ≤ channel.width.
To obtain the subinstance with y = c, one should remove all intervals
that extend beyond column position c.
The costs of the optimal solutions for the subinstances with y = c are
stored in the array element total[c].

The optimal cost for the subinstance with y = c can be derived from the
optimal costs of the subinstances with y < c and the weights of the nets
that have their right-most terminals at position c (there are at most two
such nets).
Net n is part of the optimal solution if
total[c - 1] < wn + total[xnmin - 1],
that is, if n's weight added to the optimal solution for the subinstance
that did not include any nets that overlapped with n is larger than the
optimal solution for the subinstance with y = c - 1.
If a net is selected for some c, the net's identification is stored in the
array selected_net.

total[c] = total weight of the optimal solution for the subinstance with y = c.
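The dynamic-programming selection of a maximal-weight set of nonoverlapping nets can be sketched as follows. The arrays total and selected_net follow the names used above; the input dictionary (net name mapped to x_min, x_max and weight, with columns numbered from 1) and the function name are assumptions made for the illustration.

```python
def max_weight_nets(nets, channel_width):
    """Return (best total weight, list of selected net names)."""
    total = [0] * (channel_width + 1)            # total[c]: optimum for y = c
    selected_net = [None] * (channel_width + 1)  # net chosen at column c, if any

    ending_at = {}  # column -> nets whose right-most terminal is there
    for name, (x_min, x_max, weight) in nets.items():
        ending_at.setdefault(x_max, []).append(name)

    for c in range(1, channel_width + 1):
        total[c] = total[c - 1]                  # default: no net ends here
        for name in ending_at.get(c, []):
            x_min, _, weight = nets[name]
            # The net is selected if it improves on the best solution so far.
            if total[c] < weight + total[x_min - 1]:
                total[c] = weight + total[x_min - 1]
                selected_net[c] = name

    # Recover the selected nets by walking back through selected_net.
    chosen, c = [], channel_width
    while c > 0:
        name = selected_net[c]
        if name is None:
            c -= 1
        else:
            chosen.append(name)
            c = nets[name][0] - 1
    return total[channel_width], chosen[::-1]

nets = {"a": (1, 3, 5), "b": (2, 5, 6), "c": (4, 6, 4), "d": (6, 8, 3)}
print(max_weight_nets(nets, 8))  # (9, ['a', 'c'])
```

Nets with a negative weight, which can result from the vertical-constraint penalty, are never selected because they cannot improve the running optimum.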


The robust channel routing algorithm uses a restricted maze routing algorithm to repair vertical constraint violations by selectively undoing the assignments of nets to rows and rerouting these nets.
Such a strategy is often called rip up and reroute.

Introduction to Global Routing:
Global routing is a design action that precedes local routing and follows placement.
Global routing decides about the distribution across the available routing channels of the interconnections as specified by a netlist.
Standard-cell layout: this type of layout is characterized by rows of cells separated by wiring channels.
If all terminals of a net are connected to cells facing the same channel, the entire net can be routed by local routing only.
If the terminals of a net are connected to cells on more than two adjacent rows, the global router should split the net into parts that can be handled each by local routing.
Obtaining a wiring pattern that roughly interconnects all terminals of a net amounts to constructing a minimum rectilinear Steiner tree.

The rectilinear Steiner tree contains vertical segments that cross the rows of standard cells. They can be realized in different ways:
1. By simply using a wiring layer that is not used by the standard cells.
2. By making use of feedthrough wires that may be available within standard cells.
3. By making use of feedthrough cells; these are cells that are inserted between functional cells in a row of standard cells with the purpose of realizing vertical connections.
If feedthrough resources are scarce, their use can be minimized by building a Steiner tree for which vertical connections have a higher cost than horizontal ones.
Two further adjustments apply to these vertical segments. First of all, it may be necessary to slightly shift the segments in order to align with feedthrough wire positions. Second, segments at approximately the same location can be permuted to reduce the densities in the channels above and below the row that they cross.

In standard-cell layout, global routing minimizes the overall area if it minimizes the sum of all channel widths.
Given the fact that longer wires roughly correspond to larger delays, cells connected to critical nets (nets that are part of the critical path) will receive a higher priority to be placed close to each other during placement.
A long wire in an IC behaves more like a transmission line: partition the wire into multiple segments, each segment with its own resistance and capacitance. A model based on this principle is the Elmore delay model.
The signal flow in a net is unidirectional, starting from a source terminal and propagating to multiple sink terminals; signal changes will not arrive simultaneously at all sinks.
It may e.g. be necessary to optimize the length of the connection from the source to the critical sink (this is a connection that is part of the critical path) rather than the overall tree length.
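For a single source-to-sink wire, the Elmore delay can be sketched in a few lines. The sketch assumes an RC ladder in which each segment is a series resistance followed by a capacitance to ground, and an optional load capacitance at the sink; units are left to the caller, and a net with several sinks would need the downstream capacitance per branch instead.

```python
def elmore_delay(segments, load_cap=0.0):
    """Elmore delay of an RC chain.

    segments: list of (resistance, capacitance) pairs, ordered from source
              to sink (assumed model: R in series, C to ground after it).
    """
    delay = 0.0
    downstream = load_cap
    # Walk from the sink back to the source: each resistance charges all
    # capacitance located downstream of it.
    for resistance, capacitance in reversed(segments):
        downstream += capacitance
        delay += resistance * downstream
    return delay

# The same wire (total R = 2, total C = 2) modeled with one and two segments.
print(elmore_delay([(2.0, 2.0)]))              # 4.0
print(elmore_delay([(1.0, 1.0), (1.0, 1.0)]))  # 3.0
```

Splitting the wire into more segments lowers the estimate towards the distributed value, and doubling the wire length roughly quadruples the delay, which is why long connections to critical sinks deserve special attention.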

Building-block Layout and Channel Ordering
Global routing for building-block layout is somewhat more complex than for standard-cell layout as a consequence of a higher degree of irregularity of the layout.
Area for routing is reserved around the cells, but it is not always obvious how this area can be partitioned into channels that can be handled by channel routers (channel definition problem), and in which order these channels should be routed (channel ordering problem).
In the case of horizontal composition in the slicing tree, the channels are delimited by the top and bottom edges of the two cells involved in the composition. In the case of vertical composition, the left and right edges determine the channel borders.
Both the layout and the tree are annotated with a number between parentheses that indicates a possible correct order for routing the channels. This order can be obtained by a depth-first traversal of the tree.

Once Channel (2) has been routed, its floating terminals at its "bottom" side are fixed by the channel router and become fixed terminals for the top side of Channel (3).
The floating terminals at the left side of Channel (3) receive a fixed position after completing the routing of the channel and become fixed terminals for the right side of Channel (4).

Algorithms for Global Routing:
The layout is covered by a grid.
The horizontal grid lines are chosen such that they run across the centers of the cell rows.
Vertical grid lines can be chosen such that the horizontal and vertical resolutions are roughly equal.
Note that the exact distance between horizontal lines is not known in advance and depends on the results of channel routing.

The grid divides the routing area into elementary rectangles.
All terminals located in such a rectangle will be thought of as having the same coordinates.
The points to be interconnected by rectilinear Steiner trees will then all be considered to lie at the center of these unit rectangles.

Local density
The local vertical density dv(i, j) (1 ≤ i ≤ m, 1 ≤ j ≤ n - 1) is defined as the number of wires crossing the vertical grid segment located on vertical grid line j between the horizontal lines i - 1 and i.
The local horizontal density dh(i, j) (1 ≤ i ≤ m - 1, 1 ≤ j ≤ n) is defined as the number of wires crossing the horizontal grid line i between the vertical grid lines j - 1 and j.
The density Dv(i) (1 ≤ i ≤ m) of the channel between grid lines i - 1 and i is then given by:
Dv(i) = max over j of dv(i, j)
The goal of global routing is to minimize the total channel density given by:
sum over i = 1, ..., m of Dv(i), subject to dh(i, j) ≤ Mij

Mij are the parameters that give the maximum number of feedthroughs that can be accommodated per horizontal grid segment.
Algorithm concept: One could construct Steiner trees for all nets independently, examine the result for congested areas and try to modify the shapes of those trees that are the cause of overcongestion, or those trees that contribute to the reduction of the total channel density after reshaping.

Divide-and-conquer algorithm
Instead of using the same grid during the complete routing process, one could start with a very coarse grid, say a 2 x 2 grid, perform global routing on this grid by assuming that all terminals covered by an elementary rectangle are located at the rectangle's center, and construct Steiner trees that evenly distribute the wires crossing the grid segments.
One then gets four smaller routing problems that can be solved recursively following the same approach.
The decision on the ordering of wires crossing a boundary for one subproblem will constrain the search space of the neighboring one.
The recursion stops when a sufficient degree of detail has been reached for handing the problem over to a local router.

Efficient Rectilinear Steiner-tree Construction
The input of the rectilinear Steiner-tree problem is a set of n points P = {p1, p2, ..., pn} located in the two-dimensional plane.
The rectilinear distance between a pair of points pi = (xi, yi) and pj = (xj, yj) is equal to |xi - xj| + |yi - yj|.
The goal of the problem is to find a minimal-length tree that interconnects all points in P and makes use of new points. The set of new points will be denoted by S.
The problem of finding a 1-Steiner tree is the problem of finding a point s such that the minimum spanning tree of the set of points P U {s} has minimal length.
The function 1-steiner takes the vertex and edge sets of a spanning tree as input and returns three values corresponding to the vertex and edge sets of the constructed 1-Steiner tree, and the decrease in tree length that was the result of adding one Steiner point.


Selecting the point s: an optimal rectilinear Steiner tree can always be embedded in the grid composed of only those grid lines that carry points of the set P. The candidate points are commonly called Hanan points.
All candidate points s are visited and the spanning tree for the points in P U {s} is computed each time. The point that leads to the cheapest tree is then selected.
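The selection of a single Steiner point among the Hanan points can be sketched as follows. For simplicity, this sketch recomputes the minimum spanning tree from scratch with Prim's algorithm for every candidate rather than updating it incrementally; all function names are illustrative.

```python
def dist(p, q):
    """Rectilinear distance between two points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def mst_length(points):
    """Length of the rectilinear minimum spanning tree (Prim, O(n^2))."""
    points = list(points)
    # best[p]: cheapest connection of p to the tree built so far.
    best = {p: dist(points[0], p) for p in points[1:]}
    length = 0
    while best:
        p = min(best, key=best.get)
        length += best.pop(p)
        for q in best:
            best[q] = min(best[q], dist(p, q))
    return length

def hanan_points(points):
    """Grid intersections of the lines through the points, minus the points."""
    xs = {x for x, _ in points}
    ys = {y for _, y in points}
    return {(x, y) for x in xs for y in ys} - set(points)

def best_steiner_point(points):
    """Return (s, gain) for the best single Steiner point, or (None, 0)."""
    base = mst_length(points)
    gain, s = max(((base - mst_length(list(points) + [c]), c)
                   for c in hanan_points(points)), default=(0, None))
    return (s, gain) if gain > 0 else (None, 0)

terminals = [(0, 0), (2, 0), (1, 2)]
print(mst_length(terminals))          # 5
print(best_steiner_point(terminals))  # ((1, 0), 1)
```

Repeating this step, each time adding the selected point to the point set until no candidate gives a positive gain, yields an iterated 1-Steiner heuristic.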

Spanning-update: It involves the incremental computation in linear time of the minimum spanning tree for the set P U {s} given the minimum spanning tree for the set P.
The four points to which point s may be connected are the closest ones in each of the four regions obtained by partitioning the plane by two lines crossing s at angles of +45 and -45 degrees.
In the pseudo-code, these four regions are called north, east, south and west, while the closest point to s from a point set V (excluding s itself) in a region r is computed by the function closest-point.



Spanning_update operates in linear time O(n).
Given the fact that the number of Hanan points is O(n^2), the worst-case time complexity of the function 1-steiner becomes O(n^3).
Because the function 1-steiner will be called at most O(n^2) times, the time complexity of the main function steiner can be stated to be O(n^5).
It happens very often that a minimum rectilinear Steiner tree problem instance has many distinct optimal solutions. It may also happen that solutions exist with Steiner points that are not Hanan points.
Local Transformations for Global Routing
Once Steiner trees for all nets have been generated independently, congested areas in the grid can be identified.
The trees of the nets contributing to the congestion can be reshaped by applying local transformations.

ARCHITECTURAL SYNTHESIS
Architectural synthesis means constructing the macroscopic structure of a digital circuit, starting from behavioral models that can be captured by data-flow or sequencing graphs.
Outcome of architectural synthesis:
1. a structural view of the circuit, in particular of its data path, and
2. a logic-level specification of its control unit.
The data path is an interconnection of resources (implementing arithmetic or logic functions), steering logic circuits (e.g., multiplexers and busses) that send data to the appropriate destination at the appropriate time, and registers or memory arrays to store data.
Structural view of the differential equation integrator with one multiplier and one ALU.

Circuit implementations are evaluated on the basis of the following objectives: area, cycle-time (i.e., the clock period), latency (i.e., the number of cycles to perform all operations) and throughput (i.e., the computation rate).
Resource-dominated circuits: area and performance depend on the resources as well as on the steering logic, storage circuits, wiring and control. A common simplification is to consider area and performance as depending only on the resources. Circuits for which this assumption holds are called resource-dominated circuits.
Architectural design problem and subproblems: Realistic design examples have trade-off curves that are not smooth, because of two reasons. First, the design space is a finite set. Second, there are several non-linear effects that are compounded in determining the objectives as a function of the structure of the circuit.

CIRCUIT SPECIFICATIONS FOR ARCHITECTURAL SYNTHESIS
Specifications for the architectural synthesis problem include behavioral-level circuit models, details about the resources being used and constraints.
Resources
Resources implement different types of functions in hardware. They can be broadly classified as follows:
1. Functional resources process data. They implement arithmetic or logic functions and can be grouped into two subclasses:
   - Primitive resources are subcircuits that are designed carefully once and often used.
   - Application-specific resources are subcircuits that solve a particular subtask.
2. Memory resources store data.
3. Interface resources support data transfer.
The major decisions in architectural synthesis are often related to the usage of functional resources while neglecting the wiring space.
In the case of resource-dominated circuits, the area is determined only by the resource usage.

When architectural synthesis targets synchronous circuit implementations, as is often the case and as considered here, it is convenient to measure the performance of the resources in terms of cycles required to execute the corresponding operation, which we call the execution delay.
Constraints
Interface constraints: They relate to the format and timing of the I/O data transfers. The timing separation of I/O operations can be specified by timing constraints on synchronous I/O operations.
Resource binding constraint: a particular operation is required to be implemented by a given resource (synthesis from partial structure).

THE FUNDAMENTAL ARCHITECTURAL SYNTHESIS PROBLEMS
A circuit is specified by:
1. A sequencing graph.
2. A set of functional resources (fully characterized in terms of area and execution delays).
3. A set of constraints.
Sequencing graphs are polar and acyclic. The graph $G_s(V, E)$ has vertex set $V = \{v_i; i = 0, 1, \ldots, n\}$ representing operations and edge set $E = \{(v_i, v_j); i, j = 0, 1, \ldots, n\}$ representing dependencies.
We assume that there are $n_{ops}$ operations, with the source and sink vertices labeled $v_0$ and $v_n$, where $n = n_{ops} + 1$.
Architectural synthesis and optimization consists of two stages. First, placing the operations in time and in space, i.e., determining the time interval for their execution and their binding to resources. Second, determining the detailed interconnections of the data path and the logic-level specifications of the control unit.

Scheduling:
We denote the execution delays of the operations by the set $D = \{d_i; i = 0, 1, \ldots, n\}$. The delay of the source and sink vertices is zero.
The start times of the operations, represented by the set $T = \{t_i; i = 0, 1, \ldots, n\}$, are attributes of the vertices of the sequencing graph.
A scheduled sequencing graph is a vertex-weighted sequencing graph, where each vertex is labeled by its start time.
The latency of a scheduled sequencing graph is denoted by $\lambda$, and it is the difference between the start time of the sink and the start time of the source: $\lambda = t_n - t_0$.

Two (or more) combinational operations in a sequence can be chained in the same execution cycle if their overall propagation delay does not exceed the cycle-time.
The Spatial Domain: Binding
A single resource type can implement more than one operation type. With $n_{res}$ resource types, we denote the resource-type set by $\{1, 2, \ldots, n_{res}\}$. The function $\mathcal{T}: V \to \{1, 2, \ldots, n_{res}\}$ denotes the resource type that can implement an operation.
A simple case of binding is providing a dedicated resource. Each operation is bound to one resource, and the resource binding $\beta$ is a one-to-one function.
The binding problem can be extended to a resource selection (or module selection) problem by assuming that there may be more than one resource applicable to an operation (e.g., a ripple-carry and a carry-look-ahead adder for an addition). In this case $\mathcal{T}$ is a one-to-many mapping.

A resource binding may associate one instance of a resource type to more than one operation. In this case, that particular resource is shared and binding is a many-to-one function.
A necessary condition for a resource binding to produce a valid circuit implementation is that the operations corresponding to a shared resource do not execute concurrently.
A resource binding can be represented by a labeled hypergraph, where the vertex set $V$ represents operations and the edge set $E_\beta$ represents the binding of the operations to the resources.
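The necessary condition for sharing can be checked mechanically once a schedule is known. The sketch below is only an illustration: the function name `binding_is_valid`, the dictionary layout and the example operations are assumptions, not notation from these slides. A resource instance is written as a pair (type, instance index).

```python
def binding_is_valid(start, delay, binding):
    """Return True when no two operations bound to the same resource
    instance overlap in time (the necessary condition for sharing)."""
    by_resource = {}
    for op, res in binding.items():
        by_resource.setdefault(res, []).append(op)
    for ops in by_resource.values():
        ops.sort(key=lambda op: start[op])
        # After sorting by start time it is enough to compare neighbours.
        for prev, nxt in zip(ops, ops[1:]):
            if start[prev] + delay[prev] > start[nxt]:
                return False
    return True


# Hypothetical schedule: two multiplications in step 1, one addition in step 2.
start = {"m1": 1, "m2": 1, "a1": 2}
delay = {"m1": 1, "m2": 1, "a1": 1}
dedicated = {"m1": ("mult", 1), "m2": ("mult", 2), "a1": ("alu", 1)}
shared = {"m1": ("mult", 1), "m2": ("mult", 1), "a1": ("alu", 1)}

print(binding_is_valid(start, delay, dedicated))  # True
print(binding_is_valid(start, delay, shared))     # False: m1 and m2 are concurrent
```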

A resource binding is compatible with a partial binding when its restriction to the operations $U \subseteq V$ covered by the partial binding is identical to the partial binding itself.
Common constraints on binding are upper bounds on the resource usage of each type, denoted by $\{a_k; k = 1, 2, \ldots, n_{res}\}$. These bounds represent the allocation of instances for each resource type.
A resource binding satisfies resource bounds $\{a_k; k = 1, 2, \ldots, n_{res}\}$ when $\beta(v_i) = (t, r)$ with $r \le a_t$ for each operation $\{v_i; i = 1, 2, \ldots, n_{ops}\}$.
Scheduling and binding provide us with an annotation of the sequencing graph that can be used to estimate the area and performance of the circuit.

Hierarchical Models:
A hierarchical schedule can be defined by associating a start time to each vertex in each graph entity. The start times are now relative to that of the source vertex in the corresponding graph entity.
The latency computation of a hierarchical sequencing graph, with bounded-delay operations, can be performed by traversing the hierarchy bottom up.
Delay modeling
1. The delay of a model call vertex is the latency of the corresponding graph entity.
2. The delay of a branching vertex is the maximum of the latencies of the corresponding bodies.
3. The delay of an iteration vertex is the latency of its body times the maximum number of iterations.
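The three delay rules lend themselves to a short recursive sketch. The data layout below (dictionaries with the keys `kind`, `body`, `bodies`, `max_iter`) is an assumption made for illustration, and the vertices of each entity are assumed to be listed in topological order. Latency is taken as the longest path through the entity.

```python
def vertex_delay(v):
    """Delay of one vertex, following the three hierarchical rules."""
    kind = v["kind"]
    if kind == "op":                      # plain bounded-delay operation
        return v["delay"]
    if kind == "call":                    # rule 1: model call
        return latency(v["body"])
    if kind == "branch":                  # rule 2: maximum over the bodies
        return max(latency(b) for b in v["bodies"])
    if kind == "loop":                    # rule 3: body latency times max iterations
        return latency(v["body"]) * v["max_iter"]
    raise ValueError(kind)


def latency(entity):
    """Longest path through one graph entity (bottom-up via recursion).
    Assumes entity["vertices"] is listed in topological order."""
    preds = {}
    for u, v in entity["edges"]:
        preds.setdefault(v, []).append(u)
    finish = {}
    for name, v in entity["vertices"].items():
        start = max((finish[p] for p in preds.get(name, [])), default=0)
        finish[name] = start + vertex_delay(v)
    return max(finish.values(), default=0)


# Hypothetical hierarchy: a body of latency 3 used in a loop and in a branch.
body = {"vertices": {"a": {"kind": "op", "delay": 1},
                     "b": {"kind": "op", "delay": 2}},
        "edges": [("a", "b")]}
short = {"vertices": {"c": {"kind": "op", "delay": 1}}, "edges": []}
top = {"vertices": {"x": {"kind": "op", "delay": 1},
                    "lp": {"kind": "loop", "body": body, "max_iter": 4},
                    "br": {"kind": "branch", "bodies": [body, short]}},
       "edges": [("x", "lp"), ("x", "br")]}

print(latency(top))  # 13 = 1 (x) + 3 * 4 (loop)
```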

The Synchronization Problem
There are operations whose delay is not known at synthesis time (e.g., data-dependent iteration). Scheduling unbounded-latency sequencing graphs cannot be done with traditional techniques. One solution is to modify the sequencing graph by isolating the unbounded-delay operations and by splitting the graph into bounded-latency subgraphs.

AREA AND PERFORMANCE ESTIMATION
A schedule provides the latency $\lambda$ of a circuit for a given cycle-time. A binding provides us with information about the area of a circuit. These two objectives can be evaluated for scheduled and bound sequencing graphs.
Resource-Dominated Circuits
The area estimate of a structure is the sum of the areas of the bound resource instances, called the resource area. Equivalently, the total area is a weighted sum of the resource usage. A binding specifies fully the total area, but it is not necessary to know the binding to determine the area; it is sufficient to know how many instances of each resource type are used. The latency of a circuit can be determined by its schedule.

STRATEGIES FOR ARCHITECTURAL OPTIMIZATION
Architectural optimization comprises scheduling and binding. Complete architectural optimization is applicable to circuits that can be modeled by sequencing (or equivalent) graphs. Thus the goal of architectural optimization is to determine a scheduled sequencing graph with a complete resource binding that satisfies the given constraints.
Partial architectural optimization problems arise in connection with circuit models that either fully specify the timing behavior or fully characterize the resource usage.

It is obvious that any circuit model in terms of a scheduled and bound sequencing graph does not require any optimization at all, because the desired point of the design space is already prescribed.
Architectural optimization consists of determining a schedule and a binding that optimize the objectives (area, latency, cycle-time).
Architectural exploration is often done by exploring the (area/latency) trade-off for different values of the cycle-time. This approach is motivated by the fact that the cycle-time may be constrained to attain one specific value.
Another approach is the search for the (cycle-time/latency) trade-off for some binding or the (area/cycle-time) trade-off for some schedules.

Area/Latency Optimization
For resource-dominated circuits, given the cycle-time, the execution delays can be determined. Scheduling problems provide the framework for determining the (area/latency) trade-off points. Solutions to the minimum-latency scheduling problem and to the minimum-resource scheduling problem provide the extreme points of the design space. Intermediate solutions can be found by solving resource-constrained minimum-latency scheduling problems or latency-constrained minimum-resource scheduling problems.
In general circuits, area and latency can be determined by binding and scheduling, but the two problems are now deeply interrelated. Binding is affected by scheduling, because the amount of resource sharing depends on the concurrency of the operations.
CAD systems for architectural optimization perform either scheduling followed by binding or vice versa.

Scheduling before binding: Most approaches to architectural synthesis perform scheduling before binding. Such an approach fits well with processor and DSP designs, because these circuits are often resource dominated.
Binding before scheduling: Performing binding before scheduling permits the characterization of the steering logic and a more precise evaluation of the delays. No resource constraints are required in scheduling, because the resource usage is determined by binding. In this case, resource sharing requires that no operation pair with shared resources executes concurrently. This approach best fits the synthesis of those ASIC circuits that are control dominated and where the steering logic parameters can be comparable to those of some application-specific resource.

Cycle-Time/Latency Optimization:
Scheduling with chaining can be performed by considering the propagation delays of the resources.
Retiming: When the resources are combinational in nature, the problem reduces to determining the register boundaries that optimize the cycle-time. This problem has been referred to as retiming. The formulation and its solution can be extended to cope with sequential resources by modelling them as interconnections of a combinational component and registers.

Cycle-Time/Area Optimization:
Consider now scheduled sequencing graphs where latency is fixed. We are solving either a partial synthesis problem or the binding problem after scheduling.
This problem is not relevant for resource-dominated circuits, because changing the binding does not affect the cycle-time.
For general circuits, cycle-time is bounded by the delays in the resources, steering logic, etc. Here only the delays in the steering logic matter.

SCHEDULING ALGORITHMS
A sequencing graph prescribes only dependencies among the operations; the scheduling of a sequencing graph determines the precise start time of each task.
Sequencing and concurrency: The start times must satisfy the original dependencies of the sequencing graph, which limit the amount of parallelism of the operations, because any pair of operations related by a sequence dependency (or by a chain of dependencies) may not execute concurrently.
Impact on area: the maximum number of concurrent operations of any given type at any step of the schedule is a lower bound on the number of required hardware resources of that type. Therefore the choice of a schedule also affects the area of the implementation.

A MODEL FOR THE SCHEDULING PROBLEMS
A sequencing graph is a polar directed acyclic graph $G_s(V, E)$, where the vertex set $V = \{v_i; i = 0, 1, \ldots, n\}$ is in one-to-one correspondence with the set of operations and the edge set $E = \{(v_i, v_j); i, j = 0, 1, \ldots, n\}$ represents dependencies.
Recall that $n = n_{ops} + 1$ and that we denote the source vertex by $v_0$ and the sink by $v_n$; both are No-Operations.
Let $D = \{d_i; i = 0, 1, \ldots, n\}$ be the set of operation execution delays. The execution delays of the source and sink vertices are both zero, i.e., $d_0 = d_n = 0$.
We denote by $T = \{t_i; i = 0, 1, \ldots, n\}$ the start times of the operations, i.e., the cycles in which the operations start.
The sequencing graph requires that the start time of an operation is at least as large as the start time of each of its direct predecessors plus that predecessor's execution delay: $t_i \ge t_j + d_j$ for all $(v_j, v_i) \in E$.

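The latency of the schedule equals the weight of the longest path from source to sink.
Unconstrained minimum-latency scheduling problem: Given a set of operations $V$ with integer delays $D$ and a partial order on the operations $E$, find an integer labeling of the operations $\varphi: V \to Z^+$ such that $t_i = \varphi(v_i)$, $t_i \ge t_j + d_j$ for all $(v_j, v_i) \in E$, and $t_n$ is minimum.
Resource-constrained scheduling problem: Given a set of operations $V$ with integer delays $D$, a partial order on the operations $E$ and upper bounds $\{a_k; k = 1, 2, \ldots, n_{res}\}$, find an integer labeling of the operations $\varphi: V \to Z^+$ such that $t_i = \varphi(v_i)$, $t_i \ge t_j + d_j$ for all $(v_j, v_i) \in E$, $|\{v_i : \mathcal{T}(v_i) = k \text{ and } t_i \le l < t_i + d_i\}| \le a_k$ for each operation type $k$ and each schedule step $l$, and $t_n$ is minimum.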

SCHEDULING WITHOUT RESOURCE CONSTRAINTS
Unconstrained scheduling is applied when dedicated resources are used. It is also used when resource binding is done prior to scheduling; resource conflicts are then solved by serializing the operations that share the same resource.
The minimum latency of a schedule under some resource constraint is obviously at least as large as the latency computed with unlimited resources. That is why unconstrained scheduling can be used to derive a lower bound on latency for constrained problems.

Unconstrained Scheduling: The ASAP Scheduling Algorithm
We denote the start times computed by the ASAP algorithm by $t^S$, a vector whose entries are $\{t_i^S; i = 0, 1, \ldots, n\}$.
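A minimal sketch of ASAP scheduling follows, assuming the sequencing graph is given as a delay dictionary plus an edge list, and that the source starts at step 1. The names `asap`, `delay`, `edges` and the five-vertex example graph are illustrative assumptions. Each vertex starts as soon as all of its predecessors have finished.

```python
from graphlib import TopologicalSorter


def asap(delay, edges, t0=1):
    """ASAP start times: t_i = max over predecessors j of (t_j + d_j)."""
    preds = {v: [] for v in delay}
    for u, v in edges:
        preds[v].append(u)
    t = {}
    # static_order() yields every vertex after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        t[v] = max((t[u] + delay[u] for u in preds[v]), default=t0)
    return t


# Hypothetical polar graph: v0 is the source and vn the sink (zero delay).
delay = {"v0": 0, "a": 1, "b": 1, "c": 1, "vn": 0}
edges = [("v0", "a"), ("v0", "b"), ("a", "c"), ("c", "vn"), ("b", "vn")]

ts = asap(delay, edges)
print(ts["vn"] - ts["v0"])  # latency 2
```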

Latency-Constrained Scheduling: The ALAP Scheduling Algorithm
Latency-constrained scheduling imposes an upper bound on the latency, denoted by $\bar{\lambda}$. The problem can be solved by executing the ASAP scheduling algorithm and verifying that $t_n^S - t_0^S \le \bar{\lambda}$.
The ASAP scheduling algorithm yields the minimum values of the start times. The as-late-as-possible (ALAP) scheduling algorithm provides the corresponding maximum values.
Mobility (or slack): the difference of the start times computed by the ALAP and ASAP algorithms. Namely $\mu_i = t_i^L - t_i^S$; $i = 0, 1, \ldots, n$.

Zero mobility implies that an operation can be started only at one given time step in order to meet the overall latency constraint. When the mobility is larger than zero, it measures the span of the time interval in which the operation may be started.
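Mobility can be computed by running both schedulers on the same graph. The sketch below assumes the source starts at step 1 and that the sink must start no later than step $\bar{\lambda} + 1$; the function names and the example graph are illustrative assumptions.

```python
from graphlib import TopologicalSorter


def asap(delay, edges, t0=1):
    """Earliest start times (longest path from the source)."""
    preds = {v: [] for v in delay}
    for u, v in edges:
        preds[v].append(u)
    t = {}
    for v in TopologicalSorter(preds).static_order():
        t[v] = max((t[u] + delay[u] for u in preds[v]), default=t0)
    return t


def alap(delay, edges, latency_bound, t0=1):
    """Latest start times: t_i = min over successors j of t_j, minus d_i."""
    succs = {v: [] for v in delay}
    for u, v in edges:
        succs[u].append(v)
    t = {}
    # Passing successor lists makes static_order() yield the sink first.
    for v in TopologicalSorter(succs).static_order():
        latest_end = min((t[w] for w in succs[v]), default=t0 + latency_bound)
        t[v] = latest_end - delay[v]
    return t


delay = {"v0": 0, "a": 1, "b": 1, "c": 1, "vn": 0}
edges = [("v0", "a"), ("v0", "b"), ("a", "c"), ("c", "vn"), ("b", "vn")]

ts = asap(delay, edges)
tl = alap(delay, edges, latency_bound=2)
mobility = {v: tl[v] - ts[v] for v in delay}
print(mobility)  # only b has slack: it may start in step 1 or 2
```

Operations with zero mobility (here `a` and `c`) lie on the critical path for the chosen latency bound.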

Scheduling Under Timing Constraints:
The scheduling problem can be generalized to the case in which deadlines need to be met by other operations. A further generalization is considering relative timing constraints that bind the time separation between operation pairs, regardless of their absolute value. Absolute timing constraints can be seen as constraints relative to the source operation.
Relative timing constraints are positive integers specified for some operation pair $(v_i, v_j)$:
A minimum timing constraint $l_{ij} \ge 0$ requires $t_j \ge t_i + l_{ij}$.
A maximum timing constraint $u_{ij} \ge 0$ requires $t_j \le t_i + u_{ij}$.
The combination of maximum and minimum timing constraints permits us to specify the exact distance in time between two operations.
A consistent modeling of minimum and maximum timing constraints can be done by means of a constraint graph.

In the constraint graph, edges are weighted by the delay of the operation corresponding to their tail. Additional edges are related to the timing constraints.
For every minimum timing constraint $l_{ij}$, we add a forward edge $(v_i, v_j)$ in the constraint graph with weight equal to the minimum value, $w_{ij} = l_{ij}$.
For every maximum timing constraint $u_{ij}$, we add a backward edge $(v_j, v_i)$ in the constraint graph with weight equal to the opposite of the maximum value, $w_{ji} = -u_{ij}$.
The requirement of an upper bound on the time distance between the start times of two operations may be inconsistent with the time required to execute the first operation, plus possibly the time required by any sequence of operations in between.
The longest weighted path in the constraint graph between $v_i$ and $v_j$ (that determines the minimum separation in time between operations $v_i$ and $v_j$) must be less than or equal to the maximum timing constraint $u_{ij}$. Equivalently, any cycle in the constraint graph including edge $(v_j, v_i)$ must have negative or zero weight, i.e., the constraint graph does not have positive cycles.
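The positive-cycle condition can be tested with a longest-path variant of the Bellman-Ford relaxation. This is a sketch under stated assumptions, not necessarily the algorithm used in the course: start times are measured from 0 at the source, and the function name and example graphs are hypothetical.

```python
def schedule_with_timing(vertices, edges, source):
    """Longest-path start times on a constraint graph with weighted edges
    (u, v, w). Returns None when a positive cycle makes the timing
    constraints inconsistent."""
    t = {v: float("-inf") for v in vertices}
    t[source] = 0
    for _ in range(len(vertices)):
        changed = False
        for u, v, w in edges:
            if t[u] + w > t[v]:
                t[v] = t[u] + w
                changed = True
        if not changed:
            return t          # converged: all constraints are satisfied
    return None               # still relaxing after |V| passes: positive cycle


vertices = ["v0", "a", "b", "vn"]
# Sequencing edges weighted by the delay of their tail (d_a = 2, d_b = 1).
sequencing = [("v0", "a", 0), ("a", "b", 2), ("b", "vn", 1)]

# Maximum constraint u_ab = 3: backward edge (b, a) with weight -3.
print(schedule_with_timing(vertices, sequencing + [("b", "a", -3)], "v0"))
# Maximum constraint u_ab = 1 conflicts with d_a = 2 (cycle weight +1).
print(schedule_with_timing(vertices, sequencing + [("b", "a", -1)], "v0"))  # None
```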

Relative Scheduling
We assume that operations issue completion signals when execution is finished. A start signal to the source vertex is also its completion signal.
In a sequencing graph $G_s(V, E)$ where a subset of the vertices has unspecified execution delay, such vertices, as well as the source vertex, provide a frame of reference for determining the start time of the operations.
Anchors
The anchors of a constraint graph $G(V, E)$ consist of the source vertex $v_0$ and of all vertices with unbounded delay. The start time and stop time of the operations cannot be determined on an absolute scale; the schedule of the operations is relative to the anchors.
A defining path $p(a, v_i)$ from anchor $a$ to vertex $v_i$ is a path in $G(V, E)$ with one and only one unbounded weight: $d_a$.
The relevant anchor set of a vertex $v_i$ is the subset of anchors $R(v_i)$ such that $a \in R(v_i)$ if there exists a defining path $p(a, v_i)$.

When considering one path only and when anchors are cascaded along the path, only the last one affects the start time of the operation at the head of the path.
Let $t_i^a$ be the schedule of operation $v_i$ with respect to anchor $a$, computed on the polar subgraph induced by anchor $a$ and its successors, assuming that $a$ is the source of the subgraph and that all anchors have zero execution delay.
An anchor $a$ is redundant for vertex $v_i$ when there is another relevant anchor $b \in R(v_i)$ such that $t_i^a = t_i^b + t_b^a$.
For any given vertex $v_i$, the irredundant relevant anchor set represents the smallest subset of anchors that affects the start time of that vertex.
If there are no operations with unbounded delays, then the start times of all operations will be specified in terms of time offsets from the source vertex, which reduces to the traditional scheduling formulation.

Thus relative scheduling consists of the computation of the offset values $t_i^a$ for all irredundant relevant anchors $a$ of each vertex $v_i$; $i = 0, 1, \ldots, n$.

RELATIVE SCHEDULING UNDER TIMING CONSTRAINTS
The constraint graph formulation applies, although the weights on the edges whose tails are anchors are unknown (these values are not known when the schedule is computed). In this case a schedule may or may not exist under the timing constraints. It is important to be able to assess the existence of a schedule for any value of the unbounded delays.
Feasibility: A constraint graph is feasible if all timing constraints are satisfied when the execution delays of the anchors are zero.
Well-posed graph: A constraint graph is well-posed if it can be satisfied for all values of the execution delays of the anchors.
A feasible constraint graph $G_c(V_c, E_c)$ is well-posed or it can be made well-posed if and only if no cycles with unbounded weight exist in $G_c(V_c, E_c)$.

SCHEDULING WITH RESOURCE CONSTRAINTS
The solution of scheduling problems under resource constraints provides a means for computing the (area/latency) trade-off points.
The Integer Linear Programming Model
A formal model of the scheduling problem under resource constraints can be achieved by using binary decision variables with two indices, $X = \{x_{il}; i = 0, 1, \ldots, n; l = 1, 2, \ldots, \bar{\lambda} + 1\}$. The number $\bar{\lambda}$ represents an upper bound on the latency, because the schedule latency is unknown.
$x_{il}$, a binary variable, is 1 only when operation $v_i$ starts in step $l$ of the schedule, i.e., $l = t_i$.
Upper and lower bounds on the start times can be computed by the ASAP and ALAP algorithms on the corresponding unconstrained problem. Thus $x_{il}$ is necessarily zero for $l < t_i^S$ or $l > t_i^L$.

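The start time of each operation is unique: $\sum_l x_{il} = 1$, $i = 0, 1, \ldots, n$.
The start time of any operation $v_i$ can be stated in terms of $x_{il}$ as $t_i = \sum_l l \cdot x_{il}$.
The sequencing relations represented by $G_s(V, E)$ must be satisfied: $\sum_l l \cdot x_{il} \ge \sum_l l \cdot x_{jl} + d_j$ for all $(v_j, v_i) \in E$.
The number of all operations of type $k$ executing at step $l$ must be lower than or equal to the upper bound $a_k$: $\sum_{i: \mathcal{T}(v_i) = k} \sum_{m = l - d_i + 1}^{l} x_{im} \le a_k$, for $k = 1, 2, \ldots, n_{res}$ and $l = 1, 2, \ldots, \bar{\lambda} + 1$.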

Let us denote by $t$ the vector whose entries are the start times. Then the minimum-latency scheduling problem under resource constraints can be stated as minimizing $c^T t$ subject to the uniqueness, sequencing and resource constraints on $X$.
The vector $c = [0, \ldots, 0, 1]^T$ corresponds to minimizing the latency of the schedule, because $c^T t = t_n$ (c selects the start time of the sink).

This model can be enhanced to support relative timing constraints by adding the corresponding constraint inequality in terms of the variables $X$. For example, a maximum timing constraint $u_{ij}$ on the start times of operations $v_i$, $v_j$ can be expressed as $\sum_l l \cdot x_{jl} \le \sum_l l \cdot x_{il} + u_{ij}$.

Minimum-resource scheduling problem under latency constraints
The optimization goal is a weighted sum of the resource usage represented by $a$. Hence the objective function can be expressed by $c^T a$, where $c$ is a vector whose entries are the individual resource (area) costs.
Latency constraint (with the source starting at step 1): $\sum_l l \cdot x_{nl} \le \bar{\lambda} + 1$.

Multiprocessor Scheduling and Hu's Algorithm
In the resource-constrained scheduling problem, when we assume that all operations can be solved by the same type of resource, the problem is often referred to as a precedence-constrained multiprocessor scheduling problem.
Assume all operations have unit execution delay. The scheduling problem can then be restated with a single resource bound $a$.
We compute first a lower bound on the number of resources required to schedule a graph with a latency constraint under these assumptions.

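A labeling of a sequencing graph consists of marking each vertex with the weight of its longest path to the sink, measured in terms of edges. Labels are denoted by $\{\alpha_i; i = 1, 2, \ldots, n\}$, and $\alpha = \max_i \alpha_i$.
$p(j)$ is the number of vertices with label equal to $j$.
The latency is greater than or equal to the weight of the longest path, i.e., $\lambda \ge \alpha$.
A lower bound on the number of resources to complete a schedule with latency $\bar{\lambda}$ is
$\bar{a} = \max_{\gamma} \left\lceil \frac{\sum_{j=1}^{\gamma} p(\alpha + 1 - j)}{\gamma + \bar{\lambda} - \alpha} \right\rceil$
where $\gamma$ is a positive integer.

Proof (by contradiction): assume that a schedule exists with $a$ resources that satisfies the latency bound $\bar{\lambda}$, where $a < \bar{a}$. Let $\gamma^*$ be the value of $\gamma$ that attains the maximum in the bound.
The vertices scheduled up to step $l$ cannot exceed $a \cdot l$. At schedule step $l = \gamma^* + \bar{\lambda} - \alpha$ (the denominator of the bound) the scheduled vertices are at most $a \cdot (\gamma^* + \bar{\lambda} - \alpha) < \sum_{j=1}^{\gamma^*} p(\alpha + 1 - j)$.
This implies that at least one vertex with label at least $\alpha + 1 - \gamma^*$ has not been scheduled yet. Therefore, to schedule the remaining portion of the graph we need at least $\alpha + 1 - \gamma^*$ steps.
Thus, the schedule length is at least $\gamma^* + \bar{\lambda} - \alpha + \alpha + 1 - \gamma^* = \bar{\lambda} + 1$, which contradicts our hypothesis of satisfying a latency bound of $\bar{\lambda}$.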

Let $a$ denote the upper bound on the resource usage.

The algorithm can always achieve a latency $\bar{\lambda}$ with $\bar{a}$ resources.
The algorithm also achieves the minimum-latency schedule under resource constraints.
The algorithm schedules $a$ operations at each step, starting from the first one until a critical step, after which fewer than $a$ operations can be scheduled due to the precedence constraints.
We denote by $c$ the critical step. Then $c + 1$ is the first step in which fewer than $a$ operations are scheduled.
The vertices scheduled up to the critical step are $a \cdot c$ and those scheduled up to step $c + 1$ are $a \cdot (c + \delta)$, where $0 < \delta < 1$.
We denote by $\gamma'$ the largest integer such that all vertices with labels larger than or equal to $\alpha + 1 - \gamma'$ have been scheduled up to the critical step and the following one.
Then $\alpha - \gamma'$ schedule steps are used by the algorithm to schedule the remaining operations after step $c + 1$.

Hu's algorithm achieves latency $\bar{\lambda}$ with as many resources as the lower bound $\bar{a}$.

Combining the definition of $\bar{a}$ with the count of vertices scheduled up to step $c + 1$, and recalling that $\alpha - \gamma'$ schedule steps are used by the algorithm to schedule the remaining operations after step $c + 1$, the total number of steps used by the algorithm is $c + 1 + \alpha - \gamma' \le \bar{\lambda}$.
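The following sketch illustrates Hu's algorithm and the lower bound on a small example. It assumes, as Hu's algorithm does, unit-delay operations of a single type arranged as a tree (every operation has at most one successor). The names `hu_schedule`, `hu_lower_bound`, `succ` and the six-operation tree are illustrative assumptions.

```python
from math import ceil


def hu_schedule(succ, a):
    """succ maps each operation to its single successor (None when the
    operation feeds the sink). Returns (start steps, labels)."""
    label = {}

    def lab(v):  # longest path to the sink, measured in edges
        if v not in label:
            label[v] = 1 if succ[v] is None else 1 + lab(succ[v])
        return label[v]

    for v in succ:
        lab(v)
    preds = {v: set() for v in succ}
    for v, s in succ.items():
        if s is not None:
            preds[s].add(v)
    done, start, step = set(), {}, 1
    while len(done) < len(succ):
        ready = [v for v in succ if v not in done and preds[v] <= done]
        ready.sort(key=lambda v: -label[v])      # highest label first
        for v in ready[:a]:                      # at most a operations per step
            start[v] = step
        done.update(ready[:a])
        step += 1
    return start, label


def hu_lower_bound(label, lam):
    """Lower bound on resources for latency lam (requires lam >= max label)."""
    alpha = max(label.values())
    p = lambda j: sum(1 for x in label.values() if x == j)
    return max(ceil(sum(p(alpha + 1 - j) for j in range(1, g + 1))
                    / (g + lam - alpha))
               for g in range(1, alpha + 1))


# Hypothetical tree: a, b, c feed d; d and e feed f; f feeds the sink.
succ = {"a": "d", "b": "d", "c": "d", "d": "f", "e": "f", "f": None}
start, label = hu_schedule(succ, a=2)
print(max(start.values()))        # 4 steps with two resources
print(hu_lower_bound(label, 4))   # 2 resources are needed for latency 4
print(hu_lower_bound(label, 3))   # 3 resources are needed for latency 3
```

In this example the bound is tight in both cases: two resources give four steps and three resources give three steps.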

Heuristic Scheduling Algorithms:
List Scheduling
Consider first the problem of minimizing latency under resource constraints, represented by vector $a$.
Here the algorithm has to handle multiple operation types and multiple-cycle execution delays.
The candidate operations $U_{l,k}$ are those operations of type $k$ whose predecessors have already been scheduled early enough, so that the corresponding operations are completed at step $l$.
The unfinished operations $T_{l,k}$ are those operations of type $k$ that started at earlier cycles and whose execution is not finished at step $l$.
When the execution delays are 1, the set of unfinished operations is empty.
A priority list of the operations is used in choosing among the operations. (A common priority list is obtained by labeling the vertices with the weights of their longest paths to the sink.)
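A compact sketch of list scheduling for minimum latency under resource bounds follows. It assumes every operation has a delay of at least one cycle and a type that appears in `a` with a bound of at least 1 (source and sink No-Operations are left out). All names and the example are illustrative assumptions.

```python
def list_schedule(delay, rtype, edges, a, priority):
    """Resource-constrained list scheduling. At each step l and for each
    type k, start the highest-priority candidates that fit within a[k]."""
    preds = {v: [] for v in delay}
    for u, v in edges:
        preds[v].append(u)
    start, step = {}, 1
    while len(start) < len(delay):
        for k, bound in a.items():
            # Unfinished operations T_{l,k}: started earlier, still running.
            busy = sum(1 for v, t in start.items()
                       if rtype[v] == k and t + delay[v] > step)
            # Candidates U_{l,k}: all predecessors completed by this step.
            cand = [v for v in delay
                    if v not in start and rtype[v] == k
                    and all(u in start and start[u] + delay[u] <= step
                            for u in preds[v])]
            cand.sort(key=lambda v: -priority[v])
            for v in cand[:max(0, bound - busy)]:
                start[v] = step
        step += 1
    return start


# Hypothetical example: three 2-cycle multiplications and one 1-cycle addition.
delay = {"m1": 2, "m2": 2, "m3": 2, "a1": 1}
rtype = {"m1": "mult", "m2": "mult", "m3": "mult", "a1": "alu"}
edges = [("m1", "a1"), ("m2", "a1")]
priority = {"m1": 3, "m2": 3, "m3": 2, "a1": 1}   # longest path to the sink

print(list_schedule(delay, rtype, edges, {"mult": 2, "alu": 1}, priority))
```

With two multipliers, `m3` has to wait until step 3 because `m1` and `m2` are still executing in step 2.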


List scheduling can also be applied to minimize the resource usage under latency constraints. At the beginning, one resource per type is assumed, i.e., $a$ is a vector with all entries set to 1.
For this problem, the slack of an operation is used to rank the operations, where the slack is the difference between the latest possible start time (computed by an ALAP schedule) and the index of the schedule step under consideration. The lower the slack, the higher the urgency in the list.

Heuristic Scheduling Algorithms: Force-directed Scheduling
The time frame of an operation is the time interval where it can be scheduled. The time frames are denoted by $\{[t_i^S, t_i^L]; i = 0, 1, \ldots, n\}$. The width of the time frame of an operation is equal to its mobility plus 1.
The operation probability is a function that is zero outside the corresponding time frame and is equal to the reciprocal of the frame width inside it. We denote the probability of the operations at time $l$ by $\{p_i(l); i = 0, 1, \ldots, n\}$.
Operations whose time frame is one unit wide are bound to start in one specific time step. For the remaining operations, the larger the width, the lower the probability that the operation is scheduled in any given step inside the corresponding time frame.
The type distribution is the sum of the probabilities of the operations implementable by a specific resource type in the set $\{1, 2, \ldots, n_{res}\}$ at any time step of interest. We denote the type distribution at time $l$ by $\{q_k(l); k = 1, 2, \ldots, n_{res}\}$.

A distribution graph is a plot of an operation-type distribution over the schedule steps. Distribution graphs show the likelihood that a resource is used at each schedule step. A uniform plot in a distribution graph means that a type is evenly scattered in the schedule and it relates to a good measure of utilization of that resource.
In force-directed scheduling the selection of a candidate operation to be scheduled in a given time step is done using the concept of force.
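The operation probabilities and the type distribution can be computed directly from the time frames. In this sketch the function names, the dictionary layout and the three example frames are assumptions chosen for illustration.

```python
def op_probability(ts, tl, l):
    """Reciprocal of the frame width inside [ts, tl], zero outside."""
    return 1.0 / (tl - ts + 1) if ts <= l <= tl else 0.0


def type_distribution(frames, rtype, k, steps):
    """q_k(l): sum of the probabilities of all operations of type k."""
    return {l: sum(op_probability(*frames[v], l)
                   for v in frames if rtype[v] == k)
            for l in steps}


# Hypothetical time frames (ASAP, ALAP) for three multiplications.
frames = {"m1": (1, 1), "m2": (1, 2), "m3": (2, 3)}
rtype = {"m1": "mult", "m2": "mult", "m3": "mult"}

q = type_distribution(frames, rtype, "mult", steps=[1, 2, 3])
print(q)  # {1: 1.5, 2: 1.0, 3: 0.5}
```

The distribution is highest in step 1, which suggests that moving `m2` to step 2 would balance multiplier usage.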

The force exerted by an elastic spring is proportional to the displacement of its end points. The elastic constant is the proportionality factor.
We assume in this section that operations have unit delays. The assignment of an operation to a step is chosen while considering all forces relating it to the schedule steps in its time frame.
Two kinds of forces are considered: the set of forces relating an operation to the different possible control steps where it can be scheduled, called self-forces, and the forces related to the operation dependencies, called predecessor/successor forces.

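Self-forces
Let us consider operation $v_i$ of type $k = \mathcal{T}(v_i)$ when scheduled in step $l$. The force relating that operation to a step $m \in [t_i^S, t_i^L]$ is equal to the type distribution $q_k(m)$ times the variation in probability, which is $1 - p_i(m)$ for $m = l$ and $-p_i(m)$ for $m \ne l$.
The self-force is the sum of the forces relating that operation to all schedule steps in its time frame: $\text{self-force}(i, l) = \sum_{m = t_i^S}^{t_i^L} q_k(m) (\delta_{lm} - p_i(m))$, where $\delta_{lm}$ is 1 when $l = m$ and 0 otherwise.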
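As a small worked example of the self-force, the sketch below evaluates it for one operation with a two-step time frame. The distribution values in `q` and the function name are assumptions chosen for illustration; unit delays are assumed.

```python
def self_force(frame, q, l):
    """Sum over the time frame of q_k(m) times the change in probability
    when the operation is fixed in step l."""
    ts, tl = frame
    p = 1.0 / (tl - ts + 1)              # uniform probability inside the frame
    return sum(q[m] * ((1.0 if m == l else 0.0) - p)
               for m in range(ts, tl + 1))


# Hypothetical type distribution over three steps and a frame [1, 2].
q = {1: 1.5, 2: 1.0, 3: 0.5}
frame = (1, 2)

print(self_force(frame, q, 1))   # 0.25: step 1 is already crowded
print(self_force(frame, q, 2))   # -0.25: step 2 is the better choice
```

A negative self-force indicates that the assignment reduces the expected concurrency of that operation type.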

Assigning an operation to a specific step may reduce the time frame of other operations. Therefore, the effects implied by one assignment must be taken into account by considering the predecessor/successor forces, which are the forces on other operations linked by dependency relations.
The predecessor/successor forces are computed by evaluating the variation on the self-forces of the predecessors/successors due to the restriction of their time frames. Let $[t_i^S, t_i^L]$ be the initial time frame and $[\tilde{t}_i^S, \tilde{t}_i^L]$ be the reduced one.
The total force on an operation related to a schedule step is computed by adding to its self-force the predecessor/successor forces.

When force is used as the priority in list scheduling, the selected candidates are determined by reducing iteratively the candidate set $U_{l,k}$ by deferring those operations with the least force until the resource bound is met.
The force-directed scheduling algorithm itself considers the operations one at a time for scheduling, as opposed to the strategy of considering each schedule step at a time as done by list scheduling.

Boolean algebra
Boolean algebra is defined by the set $B = \{0, 1\}$ and by two operations, denoted by + and ·.
The multi-dimensional space spanned by n binary-valued Boolean variables is denoted by $B^n$. It is often referred to as the n-dimensional cube. A point in $B^n$ is represented by a binary-valued vector of dimension n.
A literal is an instance of a variable or of its complement. A product of n literals denotes a point in the Boolean space: it is a zero-dimensional cube.
A completely specified n-input, m-output function is a mapping $f: B^n \to B^m$.

An incompletely specified scalar Boolean function is defined over a subset of $B^n$. The points where the function is not defined are called don't care conditions.
In the case of multiple-output functions (i.e., m > 1), don't care points may differ for each component, because different outputs may be sampled under different conditions.
Incompletely specified functions are represented as $f: B^n \to \{0, 1, *\}^m$, where the symbol * denotes a don't care condition.

The subsets of the domain for which the function takes the values 0, 1 and * are called the off set, on set, and dc set, respectively.
The cofactor of $f(x_1, x_2, \ldots, x_i, \ldots, x_n)$ with respect to variable $x_i$ is $f_{x_i} = f(x_1, x_2, \ldots, 1, \ldots, x_n)$.
The cofactor of $f(x_1, x_2, \ldots, x_i, \ldots, x_n)$ with respect to variable $x_i'$ is $f_{x_i'} = f(x_1, x_2, \ldots, 0, \ldots, x_n)$.
The Boolean difference of $f$ with respect to $x_i$, $\partial f / \partial x_i = f_{x_i} \oplus f_{x_i'}$, indicates whether $f$ is sensitive to changes in input $x_i$.
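Cofactors and the Boolean difference are easy to experiment with when a function is modeled as a Python callable over a tuple of bits. This representation and the helper names are assumptions made for illustration only.

```python
from itertools import product


def cofactor(f, i, value):
    """f with input i fixed to value (1 for x_i, 0 for x_i')."""
    return lambda x: f(x[:i] + (value,) + x[i + 1:])


def boolean_difference(f, i):
    """XOR of the two cofactors with respect to input i."""
    f1, f0 = cofactor(f, i, 1), cofactor(f, i, 0)
    return lambda x: f1(x) ^ f0(x)


def is_sensitive(f, i, n):
    """True when some input point makes f depend on input i."""
    d = boolean_difference(f, i)
    return any(d(x) for x in product((0, 1), repeat=n))


# f(a, b, c) = a·b + c
f = lambda x: (x[0] & x[1]) | x[2]
df_da = boolean_difference(f, 0)     # equals b·c'

print(df_da((0, 1, 0)))              # 1: with b = 1, c = 0 the output follows a
print(df_da((0, 1, 1)))              # 0: c = 1 masks a
print(is_sensitive(lambda x: x[1], 0, 3))  # False: the function ignores input 0
```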

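The consensus of a function with respect to a variable, $C_{x_i}(f) = f_{x_i} \cdot f_{x_i'}$, represents the component that is independent of that variable.
The smoothing of a function with respect to a variable, $S_{x_i}(f) = f_{x_i} + f_{x_i'}$, corresponds to deleting all appearances of that variable.
Let $\phi_i$, $i = 1, 2, \ldots, k$, be a set of Boolean functions forming an orthonormal basis, i.e., $\sum_i \phi_i = 1$ and $\phi_i \cdot \phi_j = 0$ for $i \ne j$.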

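Let $f$, $g$ be two Boolean functions expressed by an expansion with respect to the same orthonormal basis $\phi_i$, $i = 1, 2, \ldots, k$. Let $\odot$ be an arbitrary binary operator representing a Boolean function of two arguments. Then $f \odot g = \sum_{i=1}^{k} \phi_i \cdot (f_{\phi_i} \odot g_{\phi_i})$.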

Representations of Boolean Functions
There are different ways of representing Boolean functions, which can be classified into tabular forms, logic expressions and binary decision diagrams.
Tabular form: A complete listing of all points in the Boolean input space and of the corresponding values of the outputs (a truth table). The input part is the set of all row vectors in $B^n$ and it can be omitted if a specific ordering of the points of the domain is assumed. The output part is the set of corresponding row vectors in $\{0, 1, *\}^m$. Since the size of a truth table is exponential in the number of inputs, truth tables are used only for functions of small size.
A multiple-output implicant (tabular form) has input and output parts. The input part represents a cube in the domain and is a row vector in $\{0, 1, *\}^n$. The output part has a symbol in correspondence to each output, denoting whether the corresponding input implies a value in the set $\{0, 1, *\}$.

EXPRESSION FORMS
Scalar Boolean functions can be represented by expressions of literals linked by the + and · operators. Single-level forms use only one operator. Standard two-level forms are sum of products of literals and product of sums of literals.
BINARY DECISION DIAGRAMS
A binary decision diagram represents a set of binary-valued decisions, culminating in an overall decision that can be either TRUE or FALSE.

Isomorphic OBDD: Two OBDDs are isomorphic if there is a one-to-one mapping between the vertex sets that preserves adjacency, indices and leaf values.
ROBDD: An OBDD is said to be a reduced OBDD (or ROBDD) if it contains no vertex v with low(v) = high(v), nor any pair {v, u} such that the subgraphs rooted in v and in u are isomorphic (i.e., redundancies have been eliminated from the diagram).


The ite operator: a unique table, which stores the ROBDD information in a strong canonical form, and a computed table are used to improve the performance of the algorithm.
The relevant terminal cases of this recursion are ite(f, 1, 0) = f, ite(1, g, h) = g, ite(0, g, h) = h, ite(f, g, g) = g and ite(f, 0, 1) = f'.
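The role of the two tables can be seen in a very small ROBDD sketch. Everything below (the class `BDD`, integer node identifiers, variable order by index) is an illustrative assumption rather than the algorithm as presented in the course. In particular, without complemented edges the case ite(f, 0, 1) = f' is handled by the general recursion instead of a terminal shortcut.

```python
class BDD:
    def __init__(self):
        self.nodes = {0: None, 1: None}   # id -> (var, low, high); 0 and 1 are leaves
        self.unique = {}                  # unique table: (var, low, high) -> id
        self.computed = {}                # computed table: (f, g, h) -> id

    def mk(self, var, low, high):
        if low == high:                   # never create a vertex with low = high
            return low
        key = (var, low, high)
        if key not in self.unique:        # never duplicate an isomorphic subgraph
            node = len(self.nodes)
            self.nodes[node] = key
            self.unique[key] = node
        return self.unique[key]

    def var(self, i):
        return self.mk(i, 0, 1)

    def top(self, f):
        return self.nodes[f][0] if f > 1 else float("inf")

    def cof(self, f, var, value):
        if f <= 1 or self.nodes[f][0] != var:
            return f                      # f does not depend on var at its root
        return self.nodes[f][2] if value else self.nodes[f][1]

    def ite(self, f, g, h):
        if f == 1:
            return g
        if f == 0:
            return h
        if g == h:
            return g
        if g == 1 and h == 0:
            return f
        key = (f, g, h)
        if key not in self.computed:
            v = min(self.top(f), self.top(g), self.top(h))
            hi = self.ite(self.cof(f, v, 1), self.cof(g, v, 1), self.cof(h, v, 1))
            lo = self.ite(self.cof(f, v, 0), self.cof(g, v, 0), self.cof(h, v, 0))
            self.computed[key] = self.mk(v, lo, hi)
        return self.computed[key]


bdd = BDD()
a, b = bdd.var(0), bdd.var(1)
neg = lambda f: bdd.ite(f, 0, 1)
conj = lambda f, g: bdd.ite(f, g, 0)
disj = lambda f, g: bdd.ite(f, 1, g)

# Canonicity: equal functions map to the same node identifier.
print(conj(a, b) == conj(b, a))                        # True
print(neg(conj(a, b)) == disj(neg(a), neg(b)))         # True (De Morgan)
```

Because the representation is canonical, checking the equivalence of two functions reduces to comparing two identifiers.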


TWO-LEVEL COMBINATIONAL LOGIC OPTIMIZATION
Two-level logic optimization is important:
1. Two-level optimization is of key importance to multiple-level logic design.
2. Two-level logic optimization has a direct impact on macro-cell design styles using programmable-logic arrays (PLAs).
3. It is a formal way of processing the representation of systems that can be described by logic functions.
LOGIC OPTIMIZATION PRINCIPLES
The objective of two-level logic minimization is to reduce the size of a Boolean function in either sum of products or product of sums form. The goals of logic optimization may vary slightly, according to the implementation styles.
Ex: Each row of a PLA is in one-to-one correspondence with a product term of the sum of products representation. The primary goal of logic minimization is the reduction of terms.

Logic minimization of single-output and multiple-output functions follows the same principle, but the latter task is obviously more complex.
Completely specified functions are the special case of functions with no don't care conditions.
A Boolean function $f: B^n \to \{0, 1, *\}^m$ can be represented in several ways. For each output $i = 1, 2, \ldots, m$ we define the corresponding on set, off set and dc set as the subsets of $B^n$ whose image under the ith component of $f$ (i.e., under $f_i$) is 1, 0 and *, respectively.
Multiple-output implicant: it combines an input pattern with an implied value of the function. A multiple-output implicant of a Boolean function $f: B^n \to \{0, 1, *\}^m$ is a pair of row vectors of dimensions n and m called the input part and output part, respectively. The input part has entries in the set $\{0, 1, *\}$ and represents a product of literals. The output part has entries in the set $\{0, 1\}$. For each output component, a 1 implies a TRUE or don't care value of the function in correspondence with the input part.

A multiple-output minterm of a Boolean function $f: B^n \to \{0, 1, *\}^m$ is a multiple-output implicant whose input part has elements in the set $\{0, 1\}$ and that implies the TRUE value of one and only one output of the function.
A multiple-output implicant corresponds to a subset of minterms of the function. Therefore, we can define containment and intersection among sets of implicants.
A cover of a Boolean function is a set (list) of implicants that covers its minterms.

The on set, off set and dc set of a function f can be modeled by covers.
The size, or cardinality, of a cover is the number of its implicants.
A minimum cover is a cover of minimum cardinality.
A minimal, or irredundant, cover of a function is a cover that is not a proper superset of any cover of the same function.
A cover is minimal with respect to single-implicant containment if no implicant is contained in any other implicant of the cover.
An implicant is prime if it is not contained by any other implicant of the function.
A cover is prime if all its implicants are prime.
For single-output functions, a prime implicant corresponds to a product of literals where no literal can be dropped while preserving the implication property.
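The "no literal can be dropped" characterization of a prime translates directly into a test. The following is a small sketch for single-output functions, assuming the on set and dc set are given as explicit sets of minterm strings (practical only for a handful of variables). The helper names are hypothetical.

```python
from itertools import product

def minterms(cube):
    """Expand a cube over '0', '1', '*' into the minterms it covers."""
    choices = ['01' if c == '*' else c for c in cube]
    return {''.join(bits) for bits in product(*choices)}

def is_implicant(cube, on_set, dc_set):
    """A cube is an implicant if it covers only on-set or dc-set minterms."""
    return minterms(cube) <= (on_set | dc_set)

def is_prime(cube, on_set, dc_set):
    """Prime: an implicant from which no single literal can be dropped."""
    if not is_implicant(cube, on_set, dc_set):
        return False
    for i, c in enumerate(cube):
        if c != '*':
            bigger = cube[:i] + '*' + cube[i + 1:]
            if is_implicant(bigger, on_set, dc_set):
                return False  # literal i can be dropped
    return True
```

With the on set `{'00', '01', '11'}` (the function a' + b) and an empty dc set, `'00'` is an implicant but not prime, while `'0*'` and `'*1'` are prime. Testing single literals is enough, because any cube lying between an implicant and a larger implicant is itself an implicant.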

Exact Logic Minimization
Exact logic minimization addresses the problem of computing a minimum cover.
Quine's theorem: there is a minimum cover that is prime.
A prime implicant table is a binary-valued matrix A whose columns are in one-to-one correspondence with the prime implicants of the function f and whose rows are in one-to-one correspondence with its minterms.
An entry a_ij is 1 if and only if the jth prime covers the ith minterm.
A minimum cover is a minimum set of columns which covers all rows.
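As an illustration of the prime implicant table, the sketch below builds the matrix from a list of minterms and a list of primes. It assumes the primes are already known and that both are written as strings over '0', '1' and '*'. The names `covers` and `prime_implicant_table` are hypothetical.

```python
def covers(cube, minterm):
    """True if the cube covers the given minterm."""
    return all(c == '*' or c == m for c, m in zip(cube, minterm))

def prime_implicant_table(minterm_list, primes):
    """Rows are minterms, columns are primes; entry is 1 iff the prime covers the minterm."""
    return [[1 if covers(p, m) else 0 for p in primes] for m in minterm_list]
```

For the function a' + b, with minterms `['00', '01', '11']` and primes `['0*', '*1']`, the table is `[[1, 0], [1, 1], [0, 1]]`.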

Therefore the covering problem can be viewed as the problem of finding a binary vector x representing a set of primes with minimum cardinality |x| such that:
A x ≥ 1 (every row is covered by at least one selected column)
Matrix A can be seen as the incidence matrix of a hypergraph, whose vertices correspond to the minterms and whose edges correspond to the prime implicants. The covering problem then corresponds to an edge cover of the hypergraph.
Row and column dominance can be used to reduce the size of matrix A.
Extraction of the essentials and removal of dominant rows and dominated columns can be iterated to yield a reduced prime implicant table.
Any rows left in the reduced table are covered by selecting the different column (prime) combinations and evaluating the corresponding cost (based on the essential and dominance rules).
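The covering problem can be made concrete with a deliberately naive sketch: find the essential columns, then search column subsets in order of increasing size. This exhaustive search is an assumption made for illustration only; it ignores the dominance rules and is exponential in the number of primes. The function names are hypothetical.

```python
from itertools import combinations

def essential_columns(A):
    """Columns that are the only cover of some row."""
    ess = set()
    for row in A:
        cols = [j for j, a in enumerate(row) if a]
        if len(cols) == 1:
            ess.add(cols[0])
    return ess

def minimum_cover(A):
    """Smallest set of columns such that every row has a 1 in a chosen column."""
    n_cols = len(A[0]) if A else 0
    for k in range(n_cols + 1):
        for cols in combinations(range(n_cols), k):
            if all(any(row[j] for j in cols) for row in A):
                return set(cols)
    return None  # some row contains no 1, so no cover exists
```

On the cyclic table `[[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [1, 0, 0, 1]]` there are no essential columns, and the search returns the two-column cover `{0, 2}`.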

Petrick's method: writing down the covering clauses of the (reduced) implicant table in a product of sums form.
The product of sums form is then transformed into a sum of products form by carrying out the products of the sums.
The corresponding sum of products expression is satisfied when any of its product terms is TRUE.
Each product term represents a selection of primes, so a minimum cover is identified by any product term of the sum of products form with the fewest literals.
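Petrick's method can be sketched in a few lines by keeping the sum of products as a set of column-index sets and multiplying in one covering clause per row. This is a minimal illustration, assuming the table is a list of 0/1 rows and that the cost of a cover is simply its number of primes. The function name is hypothetical.

```python
def petrick(A):
    """Return all minimum column sets of table A, found by Petrick's method."""
    # The sum of products is a set of products; each product is a frozenset of columns.
    products = {frozenset()}
    for row in A:
        clause = [j for j, a in enumerate(row) if a]
        # Multiply the current sum of products by the clause (a sum of columns).
        products = {p | {j} for p in products for j in clause}
        # Absorption: drop any product that strictly contains another one.
        products = {p for p in products if not any(q < p for q in products)}
    if not products:
        return []  # some row had no 1: the table cannot be covered
    best = min(len(p) for p in products)
    return sorted(sorted(p) for p in products if len(p) == best)
```

For the cyclic table `[[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [1, 0, 0, 1]]`, the clauses are (p0 + p1)(p1 + p2)(p2 + p3)(p0 + p3), and the result is `[[0, 2], [1, 3]]`.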

ESPRESSO-EXACT algorithm:
The major improvements of the ESPRESSO-EXACT algorithm over the Quine-McCluskey algorithm consist of the construction of a smaller reduced prime implicant table and of the use of an efficient branch-and-bound algorithm for covering.
ESPRESSO-EXACT partitions the prime implicants into three sets: essentials, partially redundant and totally redundant.
Totally redundant primes are those covered by the essentials.
The partially redundant set includes the remaining ones.
The rows of the reduced implicant table correspond to sets of minterms, rather than to single minterms as in the case of Quine-McCluskey's algorithm.
Each row corresponds to all minterms which are covered by the same subset of prime implicants.
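The three-way partition of the primes and the merging of rows can be illustrated on explicit data. The sketch below is not the ESPRESSO-EXACT implementation, which avoids enumerating minterms. It assumes each prime is given as the set of on-set minterms it covers, and all names are hypothetical.

```python
from collections import defaultdict

def classify_primes(prime_minterms, on_set):
    """Split primes into (essential, totally redundant, partially redundant)."""
    essentials = set()
    for m in on_set:
        owners = [p for p, ms in prime_minterms.items() if m in ms]
        if len(owners) == 1:
            essentials.add(owners[0])  # the only prime covering m
    covered = set().union(*(prime_minterms[p] for p in essentials))
    totally = {p for p, ms in prime_minterms.items()
               if p not in essentials and ms <= covered}
    partially = set(prime_minterms) - essentials - totally
    return essentials, totally, partially

def group_rows(minterm_list, A):
    """Merge minterms whose rows in A are identical (same covering primes)."""
    groups = defaultdict(list)
    for m, row in zip(minterm_list, A):
        groups[tuple(row)].append(m)
    return dict(groups)
```

Grouping shrinks the table because minterms covered by exactly the same primes impose the same covering constraint, so one row per group is sufficient.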

Heuristic Logic Minimization
Heuristic logic minimization follows an iterative improvement strategy.
Rather than generating all prime implicants, heuristic minimizers compute a prime cover starting from the initial specification of the function.
This cover is then manipulated by modifying and/or deleting implicants until a cover with a suitable minimality property is found.
Heuristic logic minimization can be viewed as applying a set of operators to the logic cover, which is initially provided to the minimizer along with the don't care set.
The most common operators in heuristic minimization are:
Expand: Each non-prime implicant is expanded to a prime, i.e., it is replaced by a prime implicant that contains it.
Reduce: Attempts to replace each implicant with another that is contained in it.
Reshape: Implicants are processed in pairs. One implicant is expanded while the other is reduced.
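The Expand operator can be sketched for a single-output function as follows. This is a minimal illustration, assuming cubes are strings over '0', '1' and '*' and that a cover of the off set is available. Literals are raised greedily from left to right, which is a simplification: real minimizers choose the expansion order with heuristics. The function names are hypothetical.

```python
def intersects(a, b):
    """True if cubes a and b share at least one minterm."""
    return all(x == '*' or y == '*' or x == y for x, y in zip(a, b))

def expand(cube, off_cover):
    """Raise literals to '*' while the cube stays disjoint from the off set."""
    for i, c in enumerate(cube):
        if c == '*':
            continue
        trial = cube[:i] + '*' + cube[i + 1:]
        if not any(intersects(trial, off) for off in off_cover):
            cube = trial  # literal i dropped; keep the larger cube
    return cube
```

For the function a' + b, whose off set is covered by `['10']`, `expand('00', ['10'])` returns `'0*'` and `expand('11', ['10'])` returns `'*1'`. A literal that cannot be raised at its turn cannot be raised later either, because later steps only enlarge the cube, so the result is prime.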

ABSTRACT MODELS
Structures
• Structural representations can be modeled in terms of incidence structures.
• An incidence structure consists of a set of modules, a set of nets and an incidence relation among modules and nets.
• A simple model for the structure is a hypergraph, where the vertices correspond to the modules and the edges to the nets.
• The incidence relation is then represented by the corresponding incidence matrix.
• An alternative way of specifying a structure is to denote each module by its terminals, called pins (or ports), and to describe the incidence among nets and pins.
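A small sketch of the hypergraph view: modules are vertices, each net is a set of modules, and the incidence matrix has one row per module and one column per net. The module and net names used in the example are made up for illustration, and the function name is hypothetical.

```python
def incidence_matrix(modules, nets):
    """nets maps a net name to the set of modules it connects.

    Returns the sorted net names (column order) and the matrix,
    with one row per module and one column per net.
    """
    net_names = sorted(nets)
    matrix = [[1 if m in nets[n] else 0 for n in net_names] for m in modules]
    return net_names, matrix
```

With modules `['m1', 'm2', 'm3']` and nets `{'n1': {'m1', 'm2'}, 'n2': {'m1', 'm2', 'm3'}}`, the columns are `['n1', 'n2']` and the matrix is `[[1, 1], [1, 1], [0, 1]]`. Net n2 connects three modules, which is why a hypergraph is needed rather than an ordinary graph.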

Logic Network
A generalized logic network is a structure where each leaf module is associated with a combinational or sequential logic function.
While this concept is general and powerful, we consider here two restrictions to this model: the combinational logic network and the synchronous logic network.
The combinational logic network, also called logic network or Boolean network, is a hierarchical structure where:
• Each leaf module is associated with a multiple-input, single-output combinational logic function, called a local function.
• Pins are partitioned into two classes, called inputs and outputs. Pins that do not belong to submodules are also partitioned into two classes, called primary inputs and primary outputs.
• Each net has a distinguished terminal, called a source, and an orientation from the source to the other terminals.

State Diagrams:
The behavioral view of sequential circuits at the logic level can be expressed by finite-state machine transition diagrams.
A finite-state machine can be described by:
• A set of primary input patterns, X.
• A set of primary output patterns, Y.
• A set of states, S.
• A state transition function, δ : X × S → S.
• An output function, λ : X × S → Y for Mealy models or λ : S → Y for Moore models.
• An initial state.
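The Mealy form of this description can be exercised with a short simulation. The sketch assumes the state transition function and the output function are stored as dictionaries keyed by (input, state) pairs. The names `delta`, `lam` and `run_mealy`, and the example machine, are hypothetical.

```python
def run_mealy(delta, lam, initial_state, inputs):
    """Apply the inputs in order; return the output sequence and the final state."""
    state, outputs = initial_state, []
    for x in inputs:
        outputs.append(lam[(x, state)])  # Mealy: output depends on input and state
        state = delta[(x, state)]
    return outputs, state

# Example machine: the output is 1 when the current and previous inputs are both 1.
delta = {(0, 's0'): 's0', (1, 's0'): 's1', (0, 's1'): 's0', (1, 's1'): 's1'}
lam = {(0, 's0'): 0, (1, 's0'): 0, (0, 's1'): 0, (1, 's1'): 1}
```

Running `run_mealy(delta, lam, 's0', [1, 1, 0, 1, 1, 1])` yields the outputs `[0, 1, 0, 0, 1, 1]` and the final state `'s1'`. A Moore model would index the output dictionary by state alone.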

Data-flow and Sequencing Graphs:
Abstract models of behavior at the architectural level are in terms of tasks (or operations) and their dependencies.
Tasks may be No Operations (NOPs), i.e., fake operations that execute instantaneously with no side effect.
Dependencies arise from several reasons.
A first reason is availability of data. When an input to an operation is the result of another operation, the former operation depends on the latter.
A second reason is serialization constraints in the specification. A task may have to follow a second one regardless of data dependency.
Data-flow graphs represent operations and data dependencies.
The data-flow graph model implicitly assumes the existence of variables (or carriers), whose values store the information required and generated by the operations.
Each variable has a lifetime that is the interval from its birth to its death, where the former is the time at which the value is generated as an output of an operation and the latter is the latest time at which the variable is referenced as an input to an operation.

We call sequencing graph G_s(V, E) a hierarchy of directed graphs.
A generic element in the hierarchy is called a sequencing graph entity, when confusion is possible between the overall hierarchical model and a single graph.
A sequencing graph entity is an extended data-flow graph that has two kinds of vertices: operations and links, the latter linking other sequencing graph entities in the hierarchy.
Sequencing graph entities that are leaves of the hierarchy have no link vertices.
A model call vertex is a pointer to another sequencing graph entity at a lower level in the hierarchy. It models a set of dependencies from its direct predecessors to the source vertex of the called entity and another set of dependencies from the corresponding sink to its direct successors.
A branching vertex is the tail of a set of alternative paths, corresponding to the possible branches.
Iteration can be modeled as a branch based on the iteration exit condition. The corresponding vertex is the tail of two edges, one modeling the exit from the loop and the other the return to the first operation in the loop.

Branching constructs can be modeled by a branching clause and branching bodies.
A branching body is a set of tasks that are selected according to the value of the branching clause.
The semantic interpretation of the sequencing graph model requires the notion of marking the vertices.
A marking denotes the state of the corresponding operation, which can be (i) waiting for execution, (ii) executing, and (iii) having completed execution.
Firing an operation means starting its execution.
Then, the semantics of the model is as follows: an operation can be fired as soon as all its direct predecessors have completed execution.
• Some attributes can be assigned to the vertices and edges of a sequencing graph model, such as measures or estimates of the corresponding area or delay cost.
• In general, the delay of a vertex can be data independent or data dependent.

Bounded or unbounded delay
Data-dependent delays can be bounded or unbounded.
The former case applies to data-dependent delay branching, where the maximum and minimum possible delays can be computed.
Bounded-latency graphs
A sequencing graph model with data-independent delays can be characterized by its overall delay, called latency.
Graphs with bounded delays (including data-independent ones) are called bounded-latency graphs.
Otherwise they are called unbounded-latency graphs, because the latency cannot be computed.
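For data-independent delays, the latency follows from the firing rule: each operation starts when its last direct predecessor completes, so the latency is the length of the longest weighted path. The sketch below assumes a single, acyclic sequencing graph entity given as a dictionary of direct predecessors, with one fixed delay per vertex. The names and the example graph are hypothetical.

```python
def latency(preds, delay):
    """Completion time of the last operation under as-soon-as-possible firing.

    preds maps each vertex to the set of its direct predecessors;
    delay maps each vertex to its (data-independent) execution delay.
    The graph is assumed to be acyclic.
    """
    finish = {}

    def finish_time(v):
        if v not in finish:
            # An operation fires once all its direct predecessors have completed.
            start = max((finish_time(p) for p in preds[v]), default=0)
            finish[v] = start + delay[v]
        return finish[v]

    return max(finish_time(v) for v in preds)

# Example: source and sink are NOPs with zero delay.
preds = {'src': set(), 'a': {'src'}, 'b': {'src'}, 'c': {'a', 'b'}, 'sink': {'c'}}
delay = {'src': 0, 'a': 2, 'b': 1, 'c': 1, 'sink': 0}
```

Here `latency(preds, delay)` is 3, set by the path through a and c. With bounded data-dependent delays the same computation can be run twice, once with the minimum and once with the maximum delays, to bracket the latency.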