A SEMINAR REPORT ON
CLOCKLESS CHIPS

Submitted by
AHMED SHAMS

In partial fulfillment for the award of the Degree of
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE & ENGINEERING

SCHOOL OF ENGINEERING
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
COCHIN - 682 022
AUGUST 2008

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SCHOOL OF ENGINEERING
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
KOCHI - 682022

CERTIFICATE

Certified that this is a bona fide record of the seminar work entitled "CLOCKLESS CHIPS" done by the following student, AHMED SHAMS, of the VIIth semester, Computer Science and Engineering, in the year 2008, in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science and Engineering of Cochin University of Science and Technology.

Mr. VINOD KUMAR, SEMINAR GUIDE
Dr. DAVID PETER.S, HEAD OF DIVISION

Place: Kochi
Date:

ACKNOWLEDGEMENT

At the outset, I thank the Lord Almighty for the grace, strength and hope to make this endeavor a success. I express my deep-felt gratitude to Dr. David Peter.S, Head of the Division of Computer Science, for his constant encouragement. I am profoundly grateful to Mr. V. Kumar, Lecturer, Department of Computer Science, my mentor and seminar guide, for his valuable guidance, support, suggestions and encouragement. I would also like to thank our staff coordinator, Mr. Pramod Pavithran, for his words of support. Furthermore, I would like to thank all others, especially my parents and numerous friends. This project would not have been a success without their inspiration, valuable suggestions and moral support throughout its course.

AHMED SHAMS

Clockless chips

ABSTRACT

Clockless chips are electronic chips that do not use a clock as their timing signal. They are implemented with asynchronous circuits. An asynchronous circuit is a circuit in which the parts are largely autonomous. They are not governed by a clock circuit or global clock signal, but instead need only wait for the signals that indicate completion of instructions and operations. These signals are specified by simple data transfer protocols. This digital logic design is contrasted with a synchronous circuit, which operates according to clock timing signals.

The term asynchronous logic is used to describe a variety of design styles, which use different assumptions about circuit properties. These vary from the bundled delay model, which uses 'conventional' data processing elements with completion indicated by a locally generated delay model, to delay-insensitive design, where arbitrary delays through circuit elements can be accommodated. The latter style tends to yield circuits which are larger and slower than synchronous (or bundled data) implementations, but which are insensitive to layout and parametric variations and are thus "correct by design."

Unlike a conventional processor, a clockless processor (asynchronous CPU) has no central clock to coordinate the progress of data through the pipeline. Instead, stages of the CPU are coordinated using logic devices called "pipeline controls" or "FIFO sequencers." Basically, the pipeline controller clocks the next stage of logic when the existing stage is complete. In this way, a central clock is unnecessary. It may actually be even easier to implement high performance devices in asynchronous, as opposed to clocked, logic.

Division Of Computer Engineering, SOE, CUSAT

CONTENTS

1. INTRODUCTION
   1.1 Definition
   1.2 Clock Concept
2. CLOCKLESS APPROACH
   2.1 Clock Limitations
   2.2 Asynchronous View
3. PROBLEMS WITH SYNCHRONOUS CIRCUITS
   3.1 Performance
   3.2 Speed
   3.3 Power Dissipation
   3.4 Electromagnetic Noise
4. ASYNCHRONOUS CIRCUITS
   4.1 Clockless Chips Implementation
   4.2 Throwing Away Global Clock
   4.3 Standardization of Components
5. HOW CLOCKLESS CHIPS WORK
6. SIMPLICITY IN DESIGN
   6.1 Asynchronous for Higher Performance
   6.2 Asynchronous for Low Power
   6.3 Asynchronous for EMN and Emission
7. APPLICATIONS OF CLOCKLESS CHIPS
   7.1 Wearable Computers
   7.2 Infrared Communication Receiver
   7.3 In Pagers
   7.4 Filter Bank for Digital Hearing
8. CHALLENGES
9. CONCLUSION
10. REFERENCES

LIST OF FIGURES

1. Figure 1
2. Figure 2
3. Figure 3
4. Figure 4

1. INTRODUCTION

1.1 DEFINITION

Every action of a computer takes place in tiny steps, each a billionth of a second long. A simple transfer of data may take only one step; complex calculations may take many steps. All operations, however, must begin and end according to the clock's timing signals.

The use of a central clock also creates problems. As speeds have increased, distributing the timing signals has become more and more difficult. Present-day transistors can process data so quickly that they can accomplish several steps in the time that it takes a wire to carry a signal from one side of the chip to the other. Keeping the rhythm identical in all parts of a large chip requires careful design and a great deal of electrical power. Wouldn't it be nice to have an alternative?

The clockless approach, which uses a technique known as asynchronous logic, differs from conventional computer circuit design in that the switching on and off of digital circuits is controlled individually by specific pieces of data rather than by a tyrannical clock that forces all of the millions of circuits on a chip to march in unison. It addresses the main disadvantages of a clocked circuit, such as slow speed, high power consumption and high electromagnetic noise. For these reasons clockless technology is considered a technology that is going to drive the majority of electronic chips in the coming years.

1.2 CLOCK CONCEPT

The clock is a tiny crystal oscillator that resides in the heart of every microprocessor chip. It sets the basic rhythm used throughout the machine. The clock orchestrates the synchronous dance of electrons that course through the hundreds of millions of wires and transistors of a modern computer. Such crystals, which tick up to 2 billion times each second in the fastest of today's desktop personal computers, dictate the timing of every circuit in every one of the chips that add, subtract, divide, multiply and move the ones and zeros that are the basic stuff of the information age.

Conventional (synchronous) chips operate under the control of a central clock, which samples data in the registers at precisely timed intervals. Computer chips of today are synchronous: they contain a main clock which controls the timing of the entire chip. One advantage of a clock is that it signals to the devices of the chip when to input or output. This property of synchronous design makes designing the chip much easier. Because values are sampled only at clock ticks, data in a circuit with a global clock can arrive in any sequence within a clock period; the order of arrival does not matter.

Figure 1: Global clock (frequency) governing the components of the system

The diagram above shows the global clock governing all components in the system that need timing signals. All components operate exactly once per clock tick, and their outputs need to be ready by the next clock tick.

2. CLOCKLESS APPROACH

2.1 CLOCK LIMITATIONS

There are problems that go along with the clock, however. Clock speeds are now in the gigahertz range and there is not much room for speedup before physical realities start to complicate things. With a gigahertz clock powering a chip, signals barely have enough time to make it across the chip before the next clock tick. At this point, speeding up the clock frequency could become disastrous. This is when a chip that is not constricted by clock speeds could become very valuable.

One can create a very fast clock and send its timing signals to the logic circuits that it governs. These logic circuits are supposed to respond to every tick of the clock; if they cannot keep up with that speed, they no longer match the clock and their inputs and outputs can become incorrect. This results in a hardware problem, since the chips must then be engineered to reach the speed of the clock, and a much more complicated situation arises.

2.2 ASYNCHRONOUS VIEW

By throwing out the clock, chip makers will be able to escape from huge power dissipation. Clockless chips draw power only when there is useful work to do, enabling a huge savings in battery-driven devices. Like a team of horses that can only run as fast as its slowest member, a clocked chip can run no faster than its most slothful piece of logic; the answer isn't guaranteed until every part completes its work. By contrast, the transistors on an asynchronous chip can swap information independently, without needing to wait for everything else. The result? Instead of the entire chip running at the speed of its slowest components, it can run at the average speed of all components. At both Intel and Sun, this approach has led to prototype chips that run two to three times faster than comparable products using conventional circuitry.

Another advantage of clockless chips is that they give off very low levels of electromagnetic noise. The faster the clock, the more difficult it is to prevent a device from interfering with other devices; dispensing with the clock all but eliminates this problem. The combination of low noise and low power consumption makes asynchronous chips a natural choice for mobile devices.
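The horse-team comparison above can be made concrete with a small calculation. The following is only a rough sketch: the delay values are invented for illustration, the function names are hypothetical, and the model assumes that operations run one after another and that handshake overhead is negligible.

```python
# Hypothetical delays (in nanoseconds) of five consecutive operations.
# These numbers are invented purely for illustration.
delays_ns = [1.0, 0.5, 2.0, 0.5, 1.0]


def synchronous_time(delays):
    """Every operation gets one clock period, sized for the slowest operation."""
    clock_period = max(delays)
    return clock_period * len(delays)


def asynchronous_time(delays):
    """Each operation hands over as soon as it finishes (overhead ignored)."""
    return sum(delays)


sync_total = synchronous_time(delays_ns)
async_total = asynchronous_time(delays_ns)
print(f"clocked:   {sync_total} ns")
print(f"clockless: {async_total} ns")
```

Under these assumed numbers the clocked version is paced by the 2.0 ns operation throughout, while the clockless version only pays for the time each operation actually needs.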

3. PROBLEMS WITH SYNCHRONOUS CIRCUITS

Synchronous circuits are digital circuits in which parts are synchronized by clock signals. In an ideal synchronous circuit, every change in the logical levels of its storage components is simultaneous. These transitions follow the level change of a special signal called the clock signal. Ideally, the input to each storage element has reached its final value before the next clock tick occurs, so the behavior of the whole circuit can be predicted exactly. Practically, some delay is required for each logical operation, resulting in a maximum speed at which each synchronous system can run. However, there are several problems associated with synchronous circuits:

3.1 LOW PERFORMANCE

In a synchronous system, all the components are tied together and the system works at its worst-case execution speed. The speed of execution cannot be faster than that of the slowest circuit in the system, and this determines the final working performance of the system. Even where there are faster, more sophisticated circuits, they depend on slower components for the input and output of data, so they can no longer run faster than the slowest components. Hence the performance of the synchronous system is limited to its worst-case performance.

3.2 LOW SPEED

A traditional CPU cannot "go faster" than the expected worst-case performance of the slowest stage, instruction or component. When an asynchronous CPU completes an operation more quickly than anticipated, the next stage can immediately begin processing the results, rather than waiting for synchronization with a central clock. An operation might finish faster than normal because of attributes of the data being processed (e.g., multiplication can be very fast when multiplying by 0 or 1, even when running code produced by a brain-dead compiler), or because of the presence of a higher voltage or bus speed setting, or a lower ambient temperature, than 'normal' or expected.

3.3 HIGH POWER DISSIPATION

The clock is a tiny crystal oscillator that keeps vibrating as long as the system is powered on. This leads to high power dissipation in synchronous circuits, since they rely on the central clock for their timing. The clock itself consumes about 30 percent of the total power supplied to the circuit, and this can sometimes reach values as high as 70 percent. Even when the synchronous system is not active, its clock is still oscillating and consuming power that is dissipated as heat. This makes synchronous systems consume more power, and hence makes them less suitable for mobile and battery-driven devices.

3.4 HIGH ELECTROMAGNETIC NOISE

Since the clock itself is a crystal oscillator, it is associated with electromagnetic waves. Its oscillations produce electromagnetic noise, and the noise is accompanied by an emission spectrum. The higher the clock speed, the higher the number of oscillations per second, and this leads to high levels of electromagnetic noise and spectral emission. This is not a good sign for the design of mobile devices either.

Apart from the problems above, the clock in a synchronous circuit is distributed globally to components that obviously run at different speeds, so the order in which signals arrive within a clock period is not important: data can be received and transmitted in any order, regardless of the sequence in which it arrived at the first stage of execution. The clock frequency must therefore be chosen very carefully. Because the frequency is fixed, a poor match between the clock and the design can cause problems in reusing resources and in interfacing with devices in mixed-timing environments.

4. ASYNCHRONOUS CIRCUITS

Asynchronous circuits are electronic digital circuits whose timing is not governed by a central clock. Instead, their components are standardized and use handshake signals to communicate with one another. The circuits are not tied together and forced to follow global clock timing signals; each component is loosely coupled to the others, and the components run at their average speed. Asynchronous operation can be achieved through three key techniques:

4.1 CLOCKLESS CHIPS IMPLEMENTATION

To achieve asynchronous operation, the electronic circuits must be implemented without a central clock, so that the system is free of components tied to the clock. One technique is to use clockless chips in the circuit design. Because these chips do not work with a central clock, they free the different components from being tied together. Each component can then run at its own performance and speed, and asynchronous operation is established.

4.2 THROWING AWAY GLOBAL CLOCK

Asynchronous operation cannot be implemented successfully if a global clock manages the timing signals of the whole system. Since the clock is installed only to synchronize the components, throwing away the global clock allows the components to be completely unsynchronized, with communication between them taking place only through a handshaking mechanism.

4.3 STANDARDIZATION OF COMPONENTS

In a synchronous system all the components are bound together so that they can be managed by the central clock. Synchronous operation can be broken up if these components are not bound together, and standardizing the components is one way to do this. All the components are made to a standard within a given range of working performance and speed. The system is designed to meet an average speed, so worst-case execution no longer dictates the timing.

5. HOW CLOCKLESS CHIPS WORK

Beyond a new generation of design-and-testing equipment, successful development of clockless chips requires an understanding of asynchronous design. Such talent is scarce, as asynchronous principles fly in the face of the way almost every university teaches its engineering students. Conventional chips can have values arrive at a register incorrectly and out of sequence; but in a clockless chip, the values that arrive in registers must be correct the first time. One way to achieve this goal is to pay close attention to such details as the lengths of the wires and the number of logic gates connected to a given register, thereby assuring that signals travel to the register in the proper logical sequence. But that means being far more meticulous about the physical design than synchronous designers have been trained to be.

An alternative is to open up a separate communication channel on the chip. Clocked chips represent ones and zeroes using low and high voltages on a single wire. "Dual-rail" circuits, on the other hand, use two wires, giving the chip communications pathways not only to send bits, but also to send "handshake" signals to indicate when work has been completed. Karl Fant additionally proposes replacing the conventional system of digital logic with what is known as "null convention logic," a scheme that identifies not only "yes" and "no," but also "no answer yet", a convenient way for clockless chips to recognize when an operation has not yet been completed.

All of these ideas and approaches are different enough that executing them could confound the mind of an engineer trained to design to the beat of a clock. It's no surprise that the two newest asynchronous startups, Asynchronous Digital Devices and Self-Timed Solutions, have emerged where clockless-chip research has been going on the longest. For a chip to be successful, all three elements (design tools, manufacturing efficiency and experienced designers) need to come together. The asynchronous cadre has very promising ideas.

It is not practical to build a complete system from purely asynchronous circuits, and this is one of the major barriers to clockless implementation. However, the circuits have been successfully standardized, so they do not have to operate in synchronous mode, and handshakes replace the clock as the means of coordination. A component that needs to communicate with another uses handshake signals to establish the connection and to indicate when it is going to send data; the component on the other side uses the same kind of handshake to confirm the connection and then waits to receive the data.
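The handshake exchange described above can be sketched in a few lines. The report does not say which handshake protocol is meant, so this sketch assumes a four-phase (return-to-zero) request/acknowledge protocol; the names `Channel`, `transfer`, `req` and `ack` are hypothetical. Sender and receiver steps are written sequentially for readability, whereas in hardware they are separate circuits reacting to each other's wires.

```python
class Channel:
    """A data bundle plus request (req) and acknowledge (ack) wires (assumed layout)."""

    def __init__(self):
        self.req = 0
        self.ack = 0
        self.data = None


def transfer(channel, value, log):
    """Move one value across the channel with a four-phase handshake."""
    # Phase 1: the sender places the data, then raises req.
    channel.data = value
    channel.req = 1
    log.append(("req", 1))
    # Phase 2: the receiver sees req, latches the data, then raises ack.
    latched = channel.data
    channel.ack = 1
    log.append(("ack", 1))
    # Phase 3: the sender sees ack and lowers req.
    channel.req = 0
    log.append(("req", 0))
    # Phase 4: the receiver sees req low and lowers ack; the channel is idle again.
    channel.ack = 0
    log.append(("ack", 0))
    return latched


log = []
ch = Channel()
received = [transfer(ch, v, log) for v in [3, 1, 4]]
print(received)
print(log[:4])
```

No clock appears anywhere in the exchange: each step is triggered by the previous wire change, so a transfer takes exactly as long as the two parties need.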

Figure 2: Synchronous system (centralized control by a clock) and asynchronous system (distributed control through handshake interfaces)

In circuits implemented with clockless chips, data does not move at arbitrary times and out of order, as it may in synchronous circuits, where the movement of data within a clock period is not so essential. In asynchronous circuits data is treated as a very important aspect: it does not move at just any time, but only when it is required to move, such as during transmission between components. This technique offers low power consumption, low electromagnetic noise and smooth data streaming.

6. SIMPLICITY IN DESIGN

The design of clockless chips is conceptually simple. The one fundamental step is to throw the central clock away, after which standardized components can be used extensively. Integrated pipelining plays an important role in the total system design. There are about four factors regarding the pipeline:

1. Domino logic
2. Delay insensitive
3. Bundled data
4. Dual rail

Domino logic is a CMOS-based evolution of the dynamic logic techniques which were based on either PMOS or NMOS transistors. It allows a rail-to-rail logic swing. It was developed to speed up circuits. In a cascade structure consisting of several stages, the evaluation of each stage triggers the evaluation of the next stage, like dominoes falling one after another. The structure is hence called Domino CMOS Logic. Important features include:

* They have smaller areas than conventional CMOS logic.
* Parasitic capacitances are smaller, so that higher operating speeds are possible.
* Operation is free of glitches, as each gate can make only one transition.
* Only non-inverting structures are possible because of the presence of the inverting buffer.
* Charge distribution may be a problem.

A delay-insensitive circuit is a type of asynchronous circuit which performs a logic operation, often within a computing processor chip. Instead of using clock signals or other global control signals, the sequencing of computation in a delay-insensitive circuit is determined by the data flow. Typically, handshake signals are used to indicate the readiness of such a circuit to accept new data (the previous computation is complete) and the delivery of such data by the requesting function. Similarly, there may be output handshake signals indicating the readiness of the result and the safe delivery of the result to the next stage in a computational chain or pipeline.

In a delay-insensitive circuit, there is therefore no need to provide a clock signal to determine a starting time for a computation. Instead, the arrival of data at the input of a sub-circuit triggers the computation to start. Consequently, the next computation can be initiated immediately when the result of the first computation is completed. The main advantage of such circuits is their ability to optimize processing of activities that can take arbitrary periods of time depending on the data or requested function. An example of a process with a variable time for completion would be mathematical division, or recovery of data where such data might be in a cache.

The Delay-Insensitive (DI) class is the most robust of all asynchronous circuit delay models. It makes no assumptions about the delay of wires or gates. In this model all transitions on gates or wires must be acknowledged before transitioning again. This condition stops unseen transitions from occurring. In DI circuits any transition on an input to a gate must be seen on the output of the gate before a subsequent transition on that input is allowed to happen. This forces some input states or sequences to become illegal. For example, OR gates must never go into the state where both inputs are one, as the entry to and exit from this state will not be seen on the output of the gate. Although this model is very robust, no practical circuits are possible due to the heavy restrictions. Instead, the Quasi-Delay-Insensitive model is the smallest compromise model yet capable of generating useful computing circuits. For this reason circuits are often incorrectly referred to as Delay-Insensitive when they are Quasi-Delay-Insensitive.

Dual rail is the technique that makes circuits asynchronous by providing two wires for every connection between circuits, so that a connection carries handshake signals as well as the data being transmitted.

Bundled-data pipelines have been proposed that include novel data-dependent delay lines with integrated control circuitry to efficiently implement speculative completion sensing. The control circuits are based on a novel control-circuit template that simplifies the design of such nonlinear pipelines. Extensive post-layout back-end timing analysis was performed to gain confidence in the timing margins as well as to quantify performance and energy. Comparison with a synchronous counterpart suggests that the best asynchronous design yields 30% higher average throughput with negligible energy overhead.
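To illustrate how two wires can carry both a data value and the signal that the value is ready, here is a minimal sketch of one common dual-rail convention. The convention itself is an assumption, not something taken from this report: each bit uses a pair `(wire_for_0, wire_for_1)`, exactly one wire is raised for valid data, and both wires low means "no answer yet". All function names are hypothetical.

```python
NULL = (0, 0)  # neither wire raised: "no answer yet" (assumed convention)


def encode_bit(bit):
    """Return the pair (wire_for_0, wire_for_1) for one data bit."""
    return (0, 1) if bit else (1, 0)


def encode_word(value, width):
    """Dual-rail encode an unsigned integer, least significant bit first."""
    return [encode_bit((value >> i) & 1) for i in range(width)]


def is_complete(rails):
    """A word is valid once every pair has exactly one wire raised."""
    return all(pair in ((1, 0), (0, 1)) for pair in rails)


def decode_word(rails):
    """Recover the integer; only allowed once the word is complete."""
    if not is_complete(rails):
        raise ValueError("data not yet valid")
    return sum(pair[1] << i for i, pair in enumerate(rails))


word = encode_word(5, 4)
partial = [word[0], NULL, word[2], word[3]]  # bit 1 has not arrived yet
print(is_complete(partial), is_complete(word), decode_word(word))
```

A receiver built this way needs no clock to know when to read: completion is detected from the data wires themselves, and the sender returns every pair to `NULL` before the next word.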

6.1 ASYNCHRONOUS FOR HIGHER PERFORMANCE

To increase the performance of the circuit, the following basics have to be considered:

* Data-dependent delays.
* All carry bits need to be computed.

Figure 3: The first circuit is not asynchronous; the second shows dual rail, with every bit taken into the computation.
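One way to read the two points above is through a ripple-carry adder, whose delay depends on how far a carry actually travels for the given operands. This reading is an assumption about what the figure illustrates, and the function below is a hypothetical sketch rather than a circuit from the report. It measures the longest run of bit positions that a single carry ripples through and compares the average over all operand pairs with the worst case that a clocked design must allow for.

```python
def longest_carry_chain(a, b, width):
    """Longest run of bit positions that one carry ripples through in a + b."""
    longest = 0
    run = 0    # length of the carry chain currently in flight
    carry = 0  # 1 while a carry is travelling upward
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:                # generate: a new carry starts at this bit
            run = 1
            carry = 1
        elif (x ^ y) and carry:  # propagate: the carry travels one more bit
            run += 1
        else:                    # kill, or no carry to propagate
            run = 0
            carry = 0
        longest = max(longest, run)
    return longest


WIDTH = 6
pairs = [(a, b) for a in range(2 ** WIDTH) for b in range(2 ** WIDTH)]
chains = [longest_carry_chain(a, b, WIDTH) for a, b in pairs]
worst_chain = max(chains)
average_chain = sum(chains) / len(chains)
print(f"worst case: {worst_chain} bits, average: {average_chain:.2f} bits")
```

A clocked adder has to budget for the worst-case chain on every addition, whereas an adder with completion detection (for example dual rail) can signal that it is done as soon as the carries that actually occur have settled.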

6.2 ASYNCHRONOUS FOR LOW POWER

Power consumption is a very important aspect in designing any mobile device and in extending the battery life of battery-driven devices. Asynchronous operation is therefore necessary to achieve a low level of power dissipation. The circuit should consume power only when and where it is active. The rest of the time the circuit returns to a non-dissipating state until the next activation.

The figure shows how less power is consumed: the first stage divides the given frequency by two, and the more circuits are cascaded, the further the frequency is divided. This provides a crucial reduction in power consumption.
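The effect of cascading divide-by-two stages can be checked with simple arithmetic. The sketch below assumes that each stage is driven by the output of the previous one and that switching power is roughly proportional to how often a stage toggles; the 1 MHz input, the four stages and the function names are illustrative assumptions only.

```python
from fractions import Fraction


def ripple_stage_frequencies(input_hz, stages):
    """Stage k is driven by stage k-1, so it sees input_hz / 2**k."""
    return [Fraction(input_hz, 2 ** k) for k in range(stages)]


def relative_switching_activity(stages):
    """Total toggle rate of the cascade, in units of the input frequency."""
    return sum(Fraction(1, 2 ** k) for k in range(stages))


freqs = ripple_stage_frequencies(1_000_000, 4)
ripple = relative_switching_activity(4)
clocked = Fraction(4)  # if all four stages were driven at the full input rate

print([int(f) for f in freqs])
print(f"cascade activity: {ripple}, fully clocked activity: {clocked}")
```

Under this model the total activity of the cascade stays below twice the input rate however many stages are added, while driving every stage from the same clock grows in proportion to the number of stages.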

6.3 ASYNCHRONOUS FOR LOW NOISE

Any system with a clock has oscillations in it, and these create electromagnetic noise; this is the source of the interference that conventional computers produce. For every clock cycle a spike is emitted, and an emission spectrum accompanies the noise. This problem is greatly reduced by discarding the central clock, as explained above, and the radiated spectra are much smoother in asynchronous circuits.

7. APPLICATIONS OF CLOCKLESS CHIPS

Clockless chips are also used in applications other than the design of computers. These include:

7.1 WEARABLE COMPUTERS

Wearable computers are mobile computers that are worn on the body. They have been applied to areas such as behavioral modeling, health monitoring systems, information technologies and media development. Government organizations, the military and health professionals have all incorporated wearable computers into their daily operations. Wearable computers are especially useful for applications that require computational support while the user's hands, voice, eyes or attention are actively engaged with the physical environment.

7.2 INFRARED COMMUNICATION RECEIVER

Infrared (IR) radiation is electromagnetic radiation whose wavelength is longer than that of visible light, but shorter than that of terahertz radiation and microwaves. It is used in receivers that accept data transmitted via infrared. The infrared communication receiver is a computer peripheral, and since it is asynchronous in nature, clockless chips are used in its design.

7.3 IN PAGERS

A pager (sometimes called a beeper) is a simple personal telecommunications device for short messages. A one-way numeric pager can only receive a message consisting of a few digits, typically a phone number that the user is then expected to call. Alphanumeric pagers are available, as well as two-way pagers that have the ability to send and receive email, numeric pages and SMS messages. Pager users consist largely of emergency service personnel, medical personnel and information technology support staff.

7.4 FILTER BANK FOR DIGITAL HEARING

A filter bank is an array of band-pass filters that separates the input signal into several components, each one carrying a single frequency subband of the original signal. It is also desirable to design the filter bank in such a way that the subbands can be recombined to recover the original signal. The first process is called analysis, while the second is called synthesis. The output of analysis is referred to as a subband signal, with as many subbands as there are filters in the filter bank.

The filter bank serves to isolate different frequency components in a signal. This is useful because for most applications some frequencies are more important than others. For example, these important frequencies can be coded with a fine resolution. Small differences at these frequencies are significant, and a coding scheme that preserves these differences must be used. On the other hand, less important frequencies do not have to be exact. A coarser coding scheme can be used, even though some of the finer details will be lost in the coding.
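The analysis and synthesis steps, and the idea of coding one subband more coarsely than another, can be tried out on the simplest possible case. The sketch below is not the filter bank of a real hearing aid: it assumes a two-band split based on pairwise averages and differences (a Haar-style split chosen here only because it is easy to follow), and the sample values, step size and function names are all made up.

```python
def analysis(signal):
    """Split an even-length signal into a low (average) and a high (difference) subband."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def synthesis(low, high):
    """Recombine the two subbands into the original samples."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def quantize(band, step):
    """Coarser coding: round every value to a multiple of step."""
    return [round(v / step) * step for v in band]


x = [4, 6, 10, 12, 8, 6, 5, 7]
low, high = analysis(x)
exact = synthesis(low, high)                # both subbands kept at full precision
coarse = synthesis(low, quantize(high, 2))  # high band coded coarsely

print(exact)
print(coarse)
```

With both subbands kept exactly, synthesis recovers the input; with the high band coded coarsely, the overall shape carried by the low band survives while some fine detail is lost, which is the trade-off described above.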

8. CHALLENGES

1. Interfacing between synchronous and asynchronous circuits
   * Many devices available now are synchronous in nature.
   * Special circuits are needed to align them.
2. Lack of expertise.
3. Lack of tools.
4. Engineers are not trained in these fields.
5. Academically, no courses are available.

9. CONCLUSION

This study has shown that clockless chips, implemented with asynchronous circuits, have great advantages over clocked chips. Their strengths (performance at average rather than worst-case speed, low power consumption, and less heat and noise) are in great demand in the current electronics and computing market. This is still a very new area of research, design and testing, but if more scientists and engineers dedicate themselves to it, it will surely be the future technology for mobile electronic devices.

10. REFERENCES

1. Scanning the Technology: Applications of Asynchronous Circuits - C. H. (Kees) van Berkel, Mark B. Josephs and Steven M. Nowick, Proceedings of the IEEE, December 1998.
2. Computers without clocks - Ivan E. Sutherland and Jo Ebergen, Scientific American, August 2002.
3. Is it time for Clockless chips? - David Geer, published by IEEE Computer Society, March 2005.
4. Guest Editors' Introduction: Clockless VLSI Systems - Soha Hassoun, Yong-Bin Kim and Fabrizio Lombardi, copublished by IEEE CS and IEEE, November-December 2005.
5. It's Time for Clockless Chips - Claire Tristram, MIT Technology Review, October 2001.
6. Old tricks for new chips - The Economist print edition, April 19th 2001.