Performance Evaluation of High Speed Network Protocol by Emulation on a Versatile Architecture

C. Labbé, J.M. Vincent and F. Reblewski
ABSTRACT

Unified wearable theories have led to many natural advances, including link-level acknowledgements [1] and the UNIVAC computer [1]. In fact, few security experts would disagree with the visualization of wide-area networks. We construct new relational modalities, which we call Bouri [1].

I. INTRODUCTION

The investigation of rasterization is a private challenge. A theoretical issue in machine learning is the investigation of reliable archetypes. A structured question in programming languages is the emulation of modular theory. Our purpose here is to set the record straight [2]. Thus, information retrieval systems [3] and linked lists [4] have paved the way for the development of the Ethernet.

Scalable applications are particularly structured when it comes to the synthesis of virtual machines. To put this in perspective, consider the fact that infamous experts entirely use linked lists [1] to fulfill this intent [5]. In addition, Internet QoS [6] and neural networks [7] have a long history of colluding in this manner. Despite the fact that it at first glance seems unexpected, it is derived from known results. Indeed, scatter/gather I/O [8] and Internet QoS [9] have a long history of interfering in this manner. Therefore, we understand how access points [10] can be applied to the refinement of gigabit switches.

Bouri, our new framework for atomic symmetries, is the solution to all of these challenges. Though related solutions to this obstacle [2] are satisfactory, none have taken the wireless solution we propose in our research. In the opinions of many, existing classical and authenticated systems use client-server communication to learn flexible models. Such a hypothesis at first glance seems unexpected but is supported by related work in the field. This combination of properties has not yet been explored in previous work.

This work presents three advances over previous work. We disconfirm that while Markov models [11] can be made linear-time, ubiquitous, and classical, operating systems [11] can be made introspective and virtual [4]. Continuing with this rationale, we prove not only that thin clients [12] and A* search [13] can cooperate to surmount this obstacle, but that the same is true for information retrieval systems [8]. Along these same lines, we disconfirm not only that neural networks [14] and hierarchical databases [15] can agree to address this riddle, but that the same is true for superpages [16].

The roadmap of the paper is as follows. Primarily, we motivate the need for lambda calculus [17]. Next, we argue the simulation of superblocks. We confirm the analysis of the Internet. In the end, we conclude.

II. RELATED WORK

The exploration of active networks [9] has been widely studied [18]. This work follows a long line of related methods, all of which have failed [19], [20]. Maruyama and Nehru [21] motivated several relational approaches, and reported that they are largely unable to effect introspective information [22]. We had our approach in mind before J. Raman et al. [9] published the recent seminal work on the development of virtual machines [19]. In general, our heuristic outperformed all existing solutions in this area. This is arguably fair.

A major source of our inspiration is early work by Thompson and Gupta [12] [23] on SMPs [24]. On a similar note, our heuristic is broadly related to work in the field of algorithms by W. R. Wu [25] [26], but we view it from a new perspective: flexible communication. It remains to be seen how valuable this research is to the complexity theory community. A recent unpublished undergraduate dissertation [27], [28], [29] described a similar idea for probabilistic modalities [30]. R. Ito et al. [31] and Kenneth Iverson et al. [32] [33] introduced the first known instance of the emulation of local-area networks [34]. Unfortunately, these methods are entirely orthogonal to our efforts.

A number of prior heuristics have visualized lossless configurations, either for the deployment of the UNIVAC computer or for the development of superblocks [35]. Although Garcia and Jones [36] also introduced this method, we investigated it independently and simultaneously. Even though Raman and Gupta [37] also presented this solution, we analyzed it independently and simultaneously. While this work was published before

Fig. 2. The average work factor of Bouri, as a function of instruction rate [45] [46].

unproven component of our methodology. The hand-optimized compiler contains about 6981 lines of Python. Our system is composed of a homegrown database and a codebase of 10 Python files.

V. EVALUATION

Measuring a system as unstable as ours proved as difficult as tripling the RAM space of game-theoretic epistemologies. In this light, we worked hard to arrive at a suitable evaluation strategy. Our overall evaluation seeks to prove three hypotheses: (1) that local-area networks no longer toggle system design; (2) that we can do a whole lot to adjust an algorithm’s peer-to-peer user-kernel boundary; and finally (3) that we can do much to influence a methodology’s power. The reason for this is that studies have shown that effective power is roughly 00% higher than we might expect [14]. Our evaluation will show that autogenerating the signal-to-noise ratio of our e-business is crucial to our results.

A. Hardware and Software Configuration

Our detailed evaluation method required many hardware modifications. We ran a prototype on our adaptive overlay network to quantify the work of Swedish chemist W. Martin. We reduced the latency of our decommissioned NeXT Workstations to better understand our system. Similarly, we added 3GB/s of Wi-Fi throughput to UC Berkeley’s mobile telephones to examine technology. Third, we added 10 2kB USB keys to our heterogeneous cluster.

We ran Bouri on commodity operating systems, such as Microsoft Windows XP Version 7.9, Service Pack 6 and Microsoft Windows 2000. Our experiments soon proved that making autonomous our randomized Nintendo Gameboys was more effective than patching them, as previous work suggested [47]. We implemented our DHCP server in Dylan, augmented with extremely exhaustive extensions. All of these techniques are of interesting historical significance; Marvin Minsky and M. Garey investigated a related setup in 1953.

Fig. 1. Our framework’s flexible exploration.

ours, we came up with the solution first but could not publish it until now due to red tape.

III. METHODOLOGY

Our research is principled. Along these same lines, the design for our heuristic consists of four independent components: stable methodologies, the simulation of cache coherence, the study of the World Wide Web, and trainable theory. Similarly, the design for our system consists of four independent components: IPv7 [38], the location-identity split [39], Bayesian configurations, and client-server configurations. Despite the results by Miller [40], we can prove that voice-over-IP [41] and IPv6 [42] can agree to realize this intent. Thus, the design that our algorithm uses holds for most cases [43].

Bouri does not require such confusing management to run correctly, but it doesn’t hurt. This is an essential property of Bouri. Continuing with this rationale, we assume that each component of Bouri prevents secure methodologies, independent of all other components. This is a typical property of Bouri. We use our previously developed results [15] as a basis for all of these assumptions.

IV. IMPLEMENTATION

Bouri is elegant; so, too, must be our implementation. Similarly, it was necessary to cap the power used by Bouri to 44 sec. Even though we have not yet optimized for complexity, this should be simple once we finish programming the virtual machine monitor [44]. We have not yet implemented the client-side library, as this is the least

Fig. 3. The effective throughput of Bouri, as a function of hit ratio [13].

Fig. 4. The expected time since 1993 of Bouri, as a function of seek time [15] [48], [49].

Fig. 5. The mean clock speed of our methodology, compared with the other methodologies [?].

B. Experiments and Results

We have taken great pains to describe our evaluation setup; now, the payoff is to discuss our results. We ran four novel experiments: (1) we compared mean clock speed on the Multics, EthOS and Mach operating systems; (2) we dogfooded our heuristic on our own desktop machines, paying particular attention to bandwidth; (3) we dogfooded Bouri on our own desktop machines, paying particular attention to flash-memory space; and (4) we measured DHCP and WHOIS throughput on our mobile telephones. All of these experiments completed without Internet congestion or unusual heat dissipation. Such a hypothesis at first glance seems unexpected but is supported by existing work in the field.

Now for the climactic analysis of experiments (1) and (3) enumerated above. Error bars have been elided, since most of our data points fell outside of 01 standard deviations from observed means. We scarcely anticipated how accurate our results were in this phase of the evaluation [11]. Note that Web services have less discretized flash-memory speed curves than do reprogrammed local-area networks.

We have seen one type of behavior in Figures 2 and 3; our other experiments (shown in Figure 5) paint a different picture. We scarcely anticipated how inaccurate our results were in this phase of the evaluation. Further, the many discontinuities in the graphs point to degraded 10th-percentile latency introduced with our hardware upgrades.

Lastly, we discuss the first two experiments. Note that 802.11 mesh networks have smoother complexity curves than do autonomous kernels. Next, these expected work factor observations contrast with those seen in earlier work [?], such as Manuel Blum’s seminal treatise on randomized algorithms and observed RAM throughput. Further, Gaussian electromagnetic disturbances in our desktop machines caused unstable experimental results.

VI. CONCLUSION

Our heuristic will answer many of the grand challenges faced by today’s scholars. We also presented an omniscient tool for evaluating I/O automata [14]. Further, we used atomic theory to confirm that the Turing machine [43] and cache coherence [?] can synchronize to realize this purpose. The improvement of Scheme is more typical than ever, and Bouri helps scholars do just that.

In conclusion, we validated here that semaphores [?] can be made certifiable, relational, and cacheable, and Bouri is no exception to that rule. We argued that the seminal decentralized algorithm for the simulation of flip-flop gates by Robin Milner [?] [?] is NP-complete. Continuing with this rationale, we introduced a system for the refinement of redundancy (Bouri), which we used to disconfirm that operating systems [46] and XML [?] are generally incompatible. We plan to explore more problems related to these issues in future work.

REFERENCES
[1] K. Suzuki, J. Shastri, and B. E. Harris, “A methodology for the deployment of model checking,” Journal of Empathic, Amphibious Archetypes, vol. 74, pp. 158–196, Aug. 1999.
[2] C. Labbé, F. Reblewski, and J.-M. Vincent, “Performance Evaluation of High Speed Network Protocol by Emulation on a Versatile Architecture,” in 6ième Atelier d’Evaluation de Performances, Versailles, Nov. 1996.
[3] ——, “Performance Evaluation of High Speed Network Protocol by Emulation on a Versatile Architecture,” RAIRO Recherche Operationnelle - Operations Research, vol. 32, no. 3, 1998. [Online]. Available: http://wwwlsr.imag.fr/Les.Personnes/Cyril.Labbe/Publi/tools98.pdf
[4] C. Labbé, V. Olive, and J.-M. Vincent, “Emulation on a versatile architecture for discrete time queuing networks : Application to high speed networks,” in ITC, Thessalonique, June 1998. [Online]. Available: http://wwwlsr.imag.fr/Les.Personnes/Cyril.Labbe/Publi/ict98.pdf
[5] C. Labbé, S. Martin, and J.-M. Vincent, “A reconfigurable hardware tool for high speed network simulation,” in TOOLS, Palma de Majorque, Sept. 1998. [Online]. Available: http://wwwlsr.imag.fr/Les.Personnes/Cyril.Labbe/Publi/tools98.pdf
[6] C. Labbé, J.-M. Vincent, and P. Vrel, “Analyse de perturbation de trafic ATM en sortie d’un serveur Fair Queueing,” in ROADEF, Autrans, Jan. 1999.
[7] C. Labbé and J.-M. Vincent, “An efficient method for performance analysis of high speed networks : Hardware emulation,” in Iscis, Izmir, Nov. 1999.
[8] R. Feraud, F. Clérot, J.-L. Simon, D. Pallou, C. Labbé, and S. Martin, “Kalman and Neural Network Approaches for the Control of a VP Bandwidth in an ATM Network,” in NETWORKING, 2000, pp. 655–666.
[9] C. Labbé and D. Labbé, “Inter-Textual Distance and Authorship Attribution Corneille and Moliere,” Journal of Quantitative Linguistics, vol. 8, no. 3, pp. 213–231, 2001.
[10] F.-G. Ottogalli, C. Labbé, V. Olive, B. de Oliveira Stein, J. Chassin de Kergommeaux, and J.-M. Vincent, “Visualisation of Distributed Applications for Performance Debugging,” in International Conference on Computational Science (2), 2001, pp. 831–840.
[11] P. Serrano-Alvarado, C. Roncancio, M. E. Adiba, and C. Labbé, “Adaptable Mobile Transactions,” in BDA, 2003.
[12] C. Labbé, D. Labbé, and P. Hubert, “Automatic Segmentation of Texts and Corpora,” Journal of Quantitative Linguistics, vol. 11, no. 3, pp. 193–213, 2004.
[13] C. Bobineau, C. Labbé, C. Roncancio, and P. Serrano-Alvarado, “Comparing Transaction Commit Protocols for Mobile Environments,” in DEXA Workshops, 2004, pp. 673–677.
[14] P. Serrano-Alvarado, C. Roncancio, M. E. Adiba, and C. Labbé, “Context Aware Mobile Transactions,” in Mobile Data Management, 2004, p. 167.
[15] M.-D.-P. Villamil, C. Roncancio, and C. Labbé, “PinS: Peer-to-Peer Interrogation and Indexing System,” in IDEAS, 2004, pp. 236–245.
[16] C. Bobineau, C. Labbé, C. Roncancio, and P. Serrano-Alvarado, “Performances de protocoles transactionnels en environnement mobile,” in BDA, 2004, pp. 133–152.
[17] M. Denis, C. Labbé, and D. Labbé, “Les particularités d’un discours politique : les gouvernements minoritaires de Pierre Trudeau et de Paul Martin au Canada,” Corpus, no. 4, pp. 79–104, 2005.
[18] P. Serrano-Alvarado, C. Roncancio, M. Adiba, and C. Labbé, “An Adaptable Mobile Transaction Model for Mobile Environments,” International Journal Computer Systems Science and Engineering (IJCSSE) – Special issue on Mobile Databases, 2005.
[19] C. Labbé and D. Labbé, “How to measure the meanings of words? Amour in Corneille’s work,” Language Resources and Evaluation, vol. 35, no. 35, pp. 335–351, 2005.
[20] L. Gurgen, C. Labbé, V. Olive, and C. Roncancio, “Une architecture hybride pour l’interrogation et l’administration des capteurs,” in Deuxièmes Journées Francophones: Mobilité et Ubiquité (UbiMob 2005). Grenoble, France: ACM, June 2005, pp. 37–44.
[21] ——, “A Scalable Architecture for Heterogeneous Sensor,” in 8th International Workshop on Mobility in Databases and. Copenhagen, Denmark: IEEE, Aug. 2005, pp. 1108–1112.
[22] M. d. P. Villamil, C. Roncancio, C. Labbé, and C. A. D. Santos, “Location queries in DHT P2P systems,” in Les actes des 21èmes Journées Bases de Données Avancées (BDA’05), Saint Malo-France, Oct. 2005.
[23] M. d. P. Villamil, C. Roncancio, and C. Labbé, “Querying in massively distributed storage systems,” in Les actes des 21èmes Journées Bases de Données Avancées (BDA’05), Saint Malo-France, Oct. 2005.
[24] P. Serrano-Alvarado, C. Roncancio, M. Adiba, and C. Labbé, “Modèles, architectures et protocoles pour transactions mobiles adaptables,” Ingénierie des systèmes d’information, vol. 10, no. 5, pp. 95–121, Oct. 2005.
[25] L. D’Orazio, F. Jouanot, C. Labbé, and C. Roncancio, “Building adaptable cache services,” in Workshop on Middleware for Grid Computing (MGC), Grenoble, France, Nov. 2005.
[26] L. Gurgen, C. Roncancio, C. Labbé, and V. Olive, “Transactional Issues in Sensor Data Management,” in 3rd International Workshop On Data Management for Sensor, 2006, pp. 27–32.
[27] L. Gurgen, C. Labbé, C. Roncancio, and V. Olive, “SStreaM: A model for representing sensor data and sensor queries,” in International Conference on Intelligent Systems And Computing: Theory And Applications (ISYC’06), July 2006.
[28] C. Blanchet, Y. Denneulin, L. D’Orazio, C. Labbé, F. Jouanot, C. Roncancio, P. Sens, and O. Valentin, “Gestion de données sur grilles légères,” in Journée Ontologie, Grille et intégration Sémantique pour la Biologie, Bordeaux, France, July 2006.
[29] M. d. P. Villamil, C. Roncancio, and C. Labbé, “Range Queries in Massively Distributed Data,” in International Workshop on Grid and Peer-to-Peer Computing Impacts on Large Scale Heterogeneous Distributed Database Systems (DEXA’06), Krakow, Poland, Sept. 2006, pp. 255–260.
[30] O. Valentin, F. Jouanot, L. D’Orazio, Y. Denneulin, C. Roncancio, C. Labbé, C. Blanchet, P. Sens, and C. Bernard, “Gedeon, un Intergiciel pour Grille de Données,” in Proceedings of the 5ème Conférence Francophone sur les Systèmes d’Exploitation, Oct. 2006.
[31] L. D’Orazio, O. Valentin, F. Jouanot, Y. Denneulin, C. Labbé, and C. Roncancio, “Services de cache et intergiciel pour grilles de données,” in Proceedings of BDA 2006, conférence sur les Bases de Données Avancées, Lille, Oct. 2006.
[32] L. Gurgen, C. Roncancio, C. Labbé, and V. Olive, “Contrôle de concurrence pour les transactions orientées capteurs,” in Atelier de travail, Gestion de données dans les systèmes d’information pervasifs (GEDSIP), May 2007.
[33] L. Gurgen, C. Labbé, C. Roncancio, and V. Olive, “Gestion transactionnelles des données de capteurs,” in Atelier de travail, Gestion de données dans les systèmes d’information pervasifs (GEDSIP), May 2007.
[34] C. Prada, C. Roncancio, C. Labbé, and M. d. P. Villamil, “Proquesta de caché semántica en un sistema de interrogación P2P,” in Conferencia Latinoamericana de computacion de alto, Colombie, Aug. 2007.
[35] L. D’Orazio, C. Labbé, C. Roncancio, and F. Jouanot, “Query and data caching in grid middleware,” in Latinamerican Conference of High Performance Computing (CLCAR’07), Santa Marta, Colombia, Aug. 2007.
[36] L. D’Orazio, F. Jouanot, Y. Denneulin, C. Labbé, C. Roncancio, and O. Valentin, “Distributed Semantic Caching in Grid Middleware,” in Proceedings of the 18th International Conference on Database and Expert Systems Applications (DEXA’07), ser. LNCS 4653. Regensburg, Germany: Springer, Sept. 2007, pp. 162–171.
[37] L. Gurgen, C. Roncancio, C. Labbé, V. Olive, and D. Donsez, “SStreaMWare: un intergiciel de gestion de flux de données de capteurs hétérogènes,” in 23emes Journees Bases de Données Avancees (BDA’07) – Session démo, Oct. 2007.
[38] L. D’Orazio, F. Jouanot, C. Labbé, and C. Roncancio, “Caches sémantiques coopératifs pour la gestion de données sur grilles,” in 23e Journées Bases de Données Avancées (BDA’2007), Marseille, France, Oct. 2007.
[39] L. Gurgen, C. Roncancio, C. Labbé, and V. Olive, “Update Tolerant Execution of Continuous Queries on Sensor Data,” in IEEE International Conference on Networked Sensing Systems, Kanazawa, Japan, 2008, pp. 51–54.
[40] L. Gurgen, C. Roncancio, C. Labbé, and a. Vincent Olive.,

256–256. and S. July 2008. pp. L. Labbé. L. Labbé. 121–130. C. D’Orazio. C. France. NY. Roncancio. Labbé. C. Villamil. Donsez. Gurgen. “Data Sharing in DHT Based P2P Systems. Olive. and V. Roncancio.” Ingénièrie des systèmes d’Information. “Gestion de données de capteurs. no. Roncancio. ¨ A. C. juin 2008. France. Labbé. USA: ACM. Lyon. and V. Gurgen. 2008. L. “Sensor data management in dynamic environments. C. 2009. V. Sorrento. . J. Roncancio. 2009.” in International Conference on Pervasive Services. L. Labbé. and F. “Cohérence de données de capteurs en présence de mises à jour. Nystrom-Persson. Labbé. New York. M. C. C. LNCS 5740. 2009. Roncancio. Jouanot. ——. “SStreaMWare: a service oriented middleware for heterogeneous sensor data management. Serrano-Alvarado. Labbé. Gurgen. pp. and V. L. vol. and P. [41] [42] [43] [44] [45] [46] [47] [48] [49] “Cohérence de données de capteurs en présence de mises à jour.” Transactions on LargeScale Data. 2008. “SStreaMWare: a service oriented middleware for heterogeneous sensor data management. July 2008. Labbé. Honiden. June 2008.and Knowledge Centered Systems. C. C. Gurgen. Roncancio. Italy. C. Olive. “Plug and Manage Heterogeneous Sensing Devices.” in Second Workshop sur la Cohérence Des Données en Univers Réparti (CDUR 2008) associé à la 8ème Conférence Internationale NOTERE). Labbé and D. “Semantic caching in large scale querying systems. “Peut-on se fier aux arbres ?” in Journées internationales d’analyse statistique des données textuelles (JADT). 9. Olive. Mar. numéro spécial sur la Gestion des données dans les SI pervasifs. A.” in 2ième WS Cohérence des Données en Univers Réparti. and D. Roncancio. lyon. L. Gurgen. 1. 2008. C.” in IEEE Fifth International Conference on Networked Sensing Systems (INSS’08) – demo session. Cherbal. in conjunction with VLDB’09. C. Vol 14(1). vol.
Bottaro.” in Demonstration in 6th International Workshop on Data Management for Sensor Networks (DMSN’09). C.” Revista Colombiana De Computacións. Olive. C. C.” in ICPS ’08: Proceedings of the 5th international conference on Pervasive services.

A Reconfigurable Hardware Tool for High Speed Network Simulation

Cyril Labbé1, Frédéric Reblewski1, Serge Martin2, and Jean-Marc Vincent3

1 M2000, 4 rue R. Razel, 91400 Saclay, France
2 France Telecom CNET DTL ASR, Chemin du vieux chene, BP98, 38243 Meylan Cedex, France
3 Laboratoire LMC-IMAG, Domaine Universitaire, BP53X 38041 Grenoble Cedex 9, France
cyril.labbe@cnet.francetelecom.fr, serge.martin@cnet.francetelecom.fr, Jean-Marc.Vincent@imag.fr

Abstract. Estimation of rare events probabilities such as loss rate in high speed network remains in most cases an open problem. Software simulators are too limited to obtain such a probability. The aim of this paper is to show a new approach, using emulation on a versatile architecture machine for performance evaluation of high speed networks [8]. It is shown that this technique can be used to highlight rare events, such as realistic packet loss probability in high-speed networks.

1 Introduction

More and more High Speed Networks are intended to provide a variety of different services on a single "universal" network. Such services can have widely differing Quality of Service (QoS) requirements. At the packet (cell) level, this means differences in permissible cell loss and cell transfer delays. This measure of performance depends directly on the switch architecture and algorithms for congestion control and scheduling. That is why investigations into performance evaluation are so important.

Models used for this research are often discrete time queuing networks. This is especially true in the case of ATM (Asynchronous Transfer Mode), where slotted time is natural since all the cells have the same size. A slot is the time needed to serve a cell.

Although analytical techniques may be used to bound the worst-case performance [3], these are often inadequate for modeling the switch algorithms at the needed level of detail. Because of the small size of the ATM cell and the high link-speeds, a large number of cell events may need to be simulated to ensure satisfactory confidence intervals. A realistic packet loss probability is around 10^-8 to 10^-9. Such losses are rare events which are difficult to capture. To address this problem, a flexible hardware testbed for simulation of ATM-based networks has been used.

The goal of this article is to present this simulation technique. This technique is used to highlight rare events, such as realistic packet loss probability. This technique is also used to make performance evaluation on congestion control and scheduling algorithms of an ATM switch developed at the CNET.

The structure of the paper is the following. The versatile architecture and software used are presented in Section 2. Section 3 presents experimental results on an eight-by-eight multistage ATM switch.

2 Hardware architecture and software environment

This section presents the hardware architecture and the software environment used to emulate queuing networks. An ATM switch is modeled by a queuing network which is emulated by a dedicated architecture on the versatile machine. The software is used to describe a component modeling the queuing network and the hardware simulator emulates this component.

2.1 Architecture

Programmable hardware emulation is widely used to reproduce the functionalities of a circuit. Emulation is performed by an emulator, which can be seen as a hardware simulator. This hardware can be shaped to emulate any digital and synchronous circuit. It acts like a giant FPGA (field programmable gate array) on which the circuit to be tested and debugged can be mapped. Its hardware configuration can be modified to model other circuits; this is an "all purpose hardware emulator" based on a versatile architecture [4]. Here we will focus on the architecture, the use, and the possibilities of this tool.

The hardware simulator is the M500 machine from Metasystems [4]. The emulator is based on a building bloc called PLB (Programmable Logic Bloc), static RAM and VRAM. PLBs provide register and basic logic gates, the static RAMs provide possibilities to map memories described in the netlist. The VRAMs sample all the internal nodes for logic analysis of the signal values. All signals and register values are available on the last 7000 clock cycles, which is very useful for debugging. The emulator clock is under user control. The clock frequency, under normal conditions, is usually close to 1 MHz.

All this gives the user the effective use of: 500,000 programmable logic gates connected to each other through a programmable network, 17 Mbytes of memory (single or double port), adjustable clock frequency from 1 to 10 MHz. This machine is from the first generation (1995). An up to date machine has at least 20 times more logic gates.

2.2 Software environment

The software flow leads to the files required by the emulator to reproduce the functionalities of a circuit. These functionalities are described in terms of concurrent processes using the VHDL language. The description of a chip is given to the Emulator by configuration files. The software flow is detailed below:

A VHDL (VHSIC Hardware Description Language) description of the chip is used to describe the system in terms of concurrent processes [5]. VHDL is an efficient way of obtaining a high level description of a hardware component, which is then translated into gates by the Synopsys synthesis tools.

Synopsys synthesis: this software, provided by Synopsys, translates the VHDL description into combinational logic and registers (logic gates) [5].

The Metasystems compiler: from this representation of the components, the Metasystems compiler produces the data base required by the emulator. This is the routing operation, which results in connecting the gates to each other through the programmable network of the emulator.

Those two last steps are entirely automatic.

2.3 Simulation control

Emulation is performed using the MEL tool, which loads the emulator with the configuration file, and allows run control, logic analysis, triggering features, and patterns verification. MEL can be driven by procedures written in a C-like code, which is useful for complex simulation. All the signals or vectors (busses) can be displayed in a waveform window (cf Figure 1). The waveform window displays all signals and register values on the last 7000 clock cycles. Any signal and register value can be displayed without recompilation. Control of input signals or registers can be done through the monitor window (cf Figure 1).

Fig. 1.
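A rough calculation helps show why run length dominates when loss probabilities are of the order quoted in this paper (around 10^-8 to 10^-9) and the emulator clock is close to 1 MHz. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, one cell slot is assumed to take one clock cycle (`cycles_per_slot=1`), one cell is assumed to be offered per slot, and loss events are treated as independent so that the relative error behaves like 1/sqrt(count). Bursty losses would make the real error larger.

```python
import math


def cells_needed(loss_probability, target_losses=100):
    """Expected number of offered cells before `target_losses` losses are seen."""
    return target_losses / loss_probability


def relative_std_error(target_losses):
    """Rough relative standard error of the loss estimate.

    Assumes independent (Poisson-like) loss events; this is an assumption,
    not a property established in the paper.
    """
    return 1.0 / math.sqrt(target_losses)


def emulation_seconds(cells, clock_hz=1e6, cycles_per_slot=1, parallel_queues=1):
    """Run time if every queue handles one cell slot per `cycles_per_slot` cycles.

    `cycles_per_slot` and `parallel_queues` are illustrative parameters.
    """
    slots = cells / parallel_queues
    return slots * cycles_per_slot / clock_hz


if __name__ == "__main__":
    for p in (1e-8, 1e-9):
        cells = cells_needed(p)
        hours = emulation_seconds(cells) / 3600
        print(f"p={p:g}: about {cells:.0e} cells, {hours:.1f} h at 1 MHz")
```

Under these assumptions, observing about 100 losses at a loss probability of 10^-9 requires about 10^11 cells, which is about 10^5 seconds (roughly 28 hours) at one slot per microsecond. Whatever the event rate of a given tool, the required run length grows inversely with the loss probability being estimated.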

3 Application to a three stages eight-by-eight switch

This section is devoted to the study of an eight-by-eight switch (figure 2). The traffic model adopted is geometric, with arrival first [1] and a load of 0.8; servers of queue are deterministic. This traffic is also called uniform traffic [9]. The queues of each stage have the same capacities K.

Fig. 2. A three stages eight-by-eight ATM switch modeled with discrete time queues. (Diagram labels: Sources, First stage, Second stage, Third stage.)

Figure 3 shows the packet loss probability per stage. The x axis is the queue capacity varying from 10 to 50. Each curve corresponds to a different stage. It should be noted that losses are always greater on higher stage. This is explained by the fact that the traffic following a buffer stage is more bursty than the one at the entrance. This is easily observed when doing a statistical analysis of burst length. This has been done thanks to a traffic analyzer which has been built to characterize the traffic perturbation introduced by buffers. Tagged cells can also be used to differentiate background traffic from the point to point communication.

Fig. 3. Loss rate at different stages versus capacities K of queues (same capacities K at each stage). (Axes: loss rate against K; curves: first stage, second stage, third stage.)
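The queue model described in this section can be illustrated in software with a single finite discrete-time queue. The following is a minimal sketch, not the emulated design from the paper. It assumes Bernoulli arrivals (which give geometric inter-arrival times), two inputs feeding the queue as in a 2x2 switching element, arrivals handled before the departure in each slot, and a capacity `capacity` that counts the cell in service. The function name, the parameters and the 2x2 element are all assumptions made for illustration.

```python
import random


def simulate_loss_rate(capacity, load=0.8, inputs=2, slots=200_000, seed=1):
    """Estimate the cell loss rate of one finite discrete-time queue.

    Each of `inputs` sources offers at most one cell per slot, independently,
    with probability load / inputs, so the mean offered load is `load`.
    """
    rng = random.Random(seed)
    p = load / inputs
    queue_len = 0
    offered = 0
    lost = 0
    for _ in range(slots):
        # Arrival first: count the cells offered in this slot.
        arrivals = sum(1 for _ in range(inputs) if rng.random() < p)
        offered += arrivals
        # Cells that do not fit in the remaining space are lost.
        accepted = min(arrivals, capacity - queue_len)
        lost += arrivals - accepted
        queue_len += accepted
        # Deterministic server: one cell leaves per slot if any is waiting.
        if queue_len > 0:
            queue_len -= 1
    return lost / offered if offered else 0.0


if __name__ == "__main__":
    for k in (2, 5, 10, 15):
        print(k, simulate_loss_rate(k))
```

With this kind of model the estimated loss rate falls quickly as the capacity grows, so a software run of a few hundred thousand slots observes few or no losses once the capacity is moderately large. This matches the paper's point that loss rates of practical interest need far more simulated cells than a software simulator handles comfortably.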

4 Conclusion and extension

In this article, a new technique for simulation of high speed network has been presented. This methodology uses a versatile architecture configured for maximum efficiency for a given problem. Analytical techniques are often inadequate for modeling the commutation algorithms at the needed level of detail. In software simulation, estimation of the probability of rare events are very difficult to obtain. The proposed tools and method overcomes the problem by a parallel approach. In one time slot, the number of treated events is in the order of the number of queues.

This new approach has been applied to the study of rare events in ATM networks. This has allowed simulation of realistic cell loss probabilities (10^-8 to 10^-9) in a multistage ATM switch. This model has been extended to real service policies. In particular for studies on Fair Queuing disciplines and congestion control algorithms. This technology could be used to highlight other rare events with a good degree of accuracy. More generally, this type of machine could be used to emulate numerous types of performance evaluation problems using discrete time queuing network, graphs or Petri nets.

References

A. Gravey and G. Hébuterne. Simultaneity in discrete-time single server queues with Bernouilli inputs. Performance Evaluation (North-Holland), 14:123-131, 1992.
C. Labbé, F. Reblewski, and J-M Vincent. Performance evaluation of high speed network protocols by emulation on a versatile architecture. RAIRO, Systèmes à événements discrets stochastiques : théorie, application et outils, to be published.
R. Airiau, J.-M. Berge, and V. Olive. Circuit Synthesis with VHDL. Kluwer Academic Publishers, 1994.
R.Y. Awdeh and H.T. Mouftah. Survey of ATM switch architectures. Computer Networks and ISDN Systems, 27:1567-1613, 1995.
L. Burgun, F. Reblewski, G. Fenelon, J. Barbier, and O. Lepape. Serial fault emulation. In Proceedings of the 33rd Design Automation Conference 1996 (DAC 96), pages 801-806, 1996.
J. Pellaumail. Majoration des retards dans les réseaux ATM. Rairo recherche opérationnelle, 30:51-64, 1996.
S. Robert and J.-Y. Le Boudec. Can self-similar traffic be modeled by markovian processes? Lecture Notes in Computer Science, 1044, 1996.
D. Stiliadis and A. Varma. A reconfigurable hardware approach to network simulation. ACM Transaction on Modeling and Computer Simulation, 7, 1997.
L. Truffet. Méthodes de Calcul de Bornes Stochastiques sur des Modèles de Systèmes et de Réseaux. PhD thesis, Université Paris VI, 1995.

Tout ensemble! Mentor, MINC buy French firms

Electronic News, Dec 18, 1995
by Judy Erkanat

Mountain View, Calif.--Last week, acquisitions of French electronic design automation companies were de rigueur, with both Mentor Graphics and MINC Inc. acquiring Gallic businesses. Mentor Graphics added Meta Systems to its ever-growing list of assets and MINC bought Innovative Synthesis Technologies (IST) in what market analysts dubbed a smart move.

Mentor signed a definitive agreement to acquire the hardware emulation technology company which operates out of Saclay, France. The value of the transaction was not disclosed. The deal had been expected for months (EN, Antenna, Aug. 7, Sept. 18), but the verbally approved deal is expected to close in January, subject to formal approval by French government authorities. Meta will be incorporated into Mentor as a new business unit. It will remain in France and report to Chung Tung, Mentor's Hardware/Software Systems division VP/GM.

"This deal has been in the works for the last year, although we just started getting serious about it last spring," said Jim Kenny, manager of product marketing for the hardware/software codes business unit of Mentor Graphics. "Two other companies had wanted to acquire Meta and we finally won out, even beating out market leader Quickturn in France. Our interest in hardware emulation is due to the power it delivers to software simulation--an excellent systems design approach."

Industry analyst Gary Smith praised the acquisition. "This is a great deal," said Mr. Smith, senior EDA analyst at Dataquest International. "This deal is similar to Synopsys' buyout of Arkos Design Systems (EN, June 26) and both look similar business-wise. And, unlike Synopsys, the Mentor/Meta product is already in the marketplace. By acquiring Meta, Mentor is now in second place to Quickturn in the emulation market. The hardware/software co-design and combination of emulation and simulation puts Mentor in a real good position."

Quickturn Design Systems' management was unfazed. "This is an acknowledgment by the big guys that deep submicron needs a more powerful tool than simulation, and that's emulation," said Naeem Zafar, VP of marketing for Quickturn.

"The combination of Mentor Graphics and Meta Systems will expedite delivery of solutions that will enable designers to perform high-performance design analysis, overcome disjointed hardware and software design flows, and migrate hardware/software system integration upstream in the design process," said Mr. Tung.

Meta personnel were similarly enthusiastic about the merger. "Mentor Graphics' worldwide sales force and award-winning customer support organization make them an excellent choice to distribute our advanced emulation technology," said Frederic Reblewski, founder of Meta Systems. "Their world leadership and commitment to hardware/software co-design were major factors in our selection to partner."

Meta was established in 1991. Its founders focused their R&D on speeding hardware emulation design turns through the use of full-custom ICs, rather than taking the traditional approach of using commercial field programmable gate arrays (FPGAs). Mentor and Meta also made an immediate announcement of their first joint product introduction: the SimExpress hardware emulator.

Unlike the well-predicted Mentor move, MINC's acquisition of IST, an international FPGA and ASIC synthesis company located in Grenoble, France, came as a surprise to many. Financial terms weren't revealed by the privately held MINC.

The terms of this agreement grant MINC full acquisition of IST's assets, including its technology, products and OEM relationships. IST will be a wholly-owned subsidiary of MINC. MINC will continue to support IST's current products, OEM relationships and distribution channels, as well as retain all of its employees. Professor Gabriele Saucier, one of IST's founders, will become MINC's chief technical officer, while William O. McDermith will remain its VP of engineering. MINC will maintain both its own development operations in Colorado Springs, Colo., and IST's operations (to be known as MINC-IST) in Grenoble, France.

"This acquisition really strengthens MINC's position," said Mr. Smith of Dataquest. "There has been some concern since the NeoCAD buyout that MINC wouldn't survive, but be bought out by someone else. While this might still happen, MINC is acquiring tools to make them more of a player. IST has one of the best FPGA synthesizers around and MINC is definitely worth more today than it was yesterday."

MINC CEO Gene Warrington concurred. "The combination of MINC and IST fills a huge void in the EDA industry," he said. "Now the best of both FPGA and CPLD partitioning, synthesis, optimization, mapping and fitting technologies are brought together to provide a complete solution for programmable logic users."

With this acquisition, MINC hopes to become a universal industry vendor able to provide a complete suite of design software for the entire programmable logic spectrum.

"Our motivation for the purchase came from the new devices coming out," said Kevin Bush, MINC's VP of marketing. "We controlled the CPLD area of the market, but then we found ourselves in a make-or-buy situation. New CPLDs will use IST's strengths in Verilog and VHDL."

MENTOR GRAPHICS COMPLETES META SYSTEMS ACQUISITION; SIMEXPRESS HARDWARE EMULATOR NOW AVAILABLE WORLDWIDE

Publication: PR Newswire
Date: Jun 5, 1996

WILSONVILLE, Ore., June 5 /PRNewswire/ -- Mentor Graphics Corporation (Nasdaq: MENT) today announced the closing of the company's acquisition of Meta Systems, Saclay, France. Through this transaction, the SimExpress(TM) best-in-class hardware emulator for RTL, gate-level and in-system emulation, will be distributed through Mentor Graphics' worldwide sales and support organizations.

"We are very pleased that the transaction is now complete," said Frederic Reblewski, Meta Systems' president and founder. "We have seen tremendous worldwide interest in the SimExpress technology. This partnership between the two companies ensures our customers are getting the best emulation technology available."

"The unique technology offered by Meta Systems complements our Seamless* hardware/software co-verification solution," said Chung Tung, vice president and general manager of Mentor Graphics' Hardware/Software Systems Division (HSD). "SimExpress, when combined with our Seamless Co-Verification Environment, yields dramatic hardware/software co-simulation performance, which allows hardware and software designers to perform extensive verification runs while debugging the software of the virtual silicon prior to system prototype, thereby allowing right-the-first-time designs for high-volume production. This acquisition, in concert with our Microtec merger, enables us to offer customers a wide range of performance in co-verification solutions."

Meta Systems will operate as a wholly-owned subsidiary of Mentor Graphics and function as a business unit within HSD. Associated with the completion of the acquisition, Mentor Graphics anticipates a one-time technology-related charge of approximately $10 million to be taken in the second quarter.

The SimExpress full-custom architecture offers designers extremely high-speed design iterations throughout the compile-run-debug phases of emulation. SimExpress is available immediately through Mentor Graphics' direct worldwide sales force and distribution channels.

Established in 1981, Mentor Graphics Corporation (Nasdaq: MENT) designs, manufactures, markets and distributes electronic design automation (EDA) software and provides professional services supporting its customers' complete design environments. The company is a leader in worldwide EDA sales, with revenues of $440,714,000 over the last reported 12 months. The company currently employs approximately 2,400 people worldwide. Mentor Graphics is the first EDA vendor to win the STAR (Software Technical Assistance Recognition) award, and the only EDA vendor to win the award twice. The award is given annually by the Software Support Professionals Association (SSPA) for service excellence. In addition to its corporate offices, Mentor Graphics has sales, support, software development and professional services offices worldwide. The company's headquarters are located at 8005 S.W. Boeckman Road, Wilsonville, Oregon 97070-7777. World Wide Web site: http://www.mentorg.com

CONTACT: Lillian Tsai, Corporate Communications of Mentor Graphics Corporation, 503-685-1177, or lillian_tsai@mentorg.com; or Eileen Drake, Public Relations of KVO, Inc., 503-221-2366, or eileen_drake@kvo.com

Mentor Graphics Appoints New Emulation R&D Director, Expands Research Team

Publication: Business Wire
Date: Wednesday, August 29 2001

WILSONVILLE, Ore.--(BUSINESS WIRE)--Aug. 29, 2001--Mentor Graphics Corporation (Nasdaq:MENT) today announced that Philippe Vallet has been named head of R&D for the company's Meta Systems emulation division, located in Les Ulis, France. The appointment is part of a broader investment in R&D for advanced work on emulation technology. R&D staffing in Mentor's emulation division has increased by 45 percent since January of 2001.

"Emulation is important to Mentor's strategic vision, and we plan to continue investing to support advances in this technology, because we anticipate more and more customers turning to emulation to solve the verification bottleneck in large complex designs," said Walden C. Rhines, president and CEO of Mentor Graphics.

Vallet has over 25 years of R&D experience, including several senior executive positions. Most recently he was in charge of hardware development of the Bull open systems division, where he used Meta emulators to validate large ASIC designs with gate capacities ranging from 100k to 30 million. Previous positions include head of the R&D center of the servers division of Bull. Vallet holds an engineering degree from L'Ecole des Mines in Nancy, France.

"I joined Mentor's emulation division because the cutting-edge technology that is being developed here represents an exciting opportunity," noted Vallet. "I've used the Meta emulators as a customer, so I know what they can do. I am really looking forward to working on the next generation of this technology."

The Meta Systems division continues to expand its international team of R&D engineers, several of whom hold Ph.Ds. The R&D group now totals over 50, with more still being recruited. Frederic Reblewski, Meta Systems founder who was instrumental in the development of Meta's industry-leading technology, continues to serve as chief scientist.

Mentor Graphics Corporation (Nasdaq:MENT) is a world leader in electronic hardware and software design solutions, providing products, consulting services and award-winning support for the world's most successful electronics and semiconductor companies. Established in 1981, the company reported revenues over the last 12 months of more than $600 million and employs approximately 2,975 people worldwide. Corporate headquarters are located at 8005 S.W. Boeckman Road, Wilsonville, Oregon 97070-7777; Silicon Valley headquarters are located at 1001 Ridder Park Drive, San Jose, California 95131-2314. World Wide Web site: www.mentor.com

Mentor Graphics is a registered trademark of Mentor Graphics Corporation.

COPYRIGHT 2001 Business Wire


Serial Fault Emulation

Luc Burgun, Frédéric Reblewski, Gérard Fenelon, Jean Barbier and Olivier Lepape
META SYSTEMS*
4, Rue René Razel, 91400 Saclay, France

* META SYSTEMS is now part of MENTOR GRAPHICS CORPORATION

33rd Design Automation Conference (DAC 96), 06/96, Las Vegas, NV, USA. ©1996 ACM, Inc. 0-89791-833-9/96/0006 $3.50

ABSTRACT

A hardware emulator based approach has been developed to perform test evaluation on large sequential circuits (at least tens of thousands of gates). This approach relies both on the flexibility and on the reconfigurability of hardware emulators based on dedicated reprogrammable circuits. A Serial Fault Emulation (SFE) method in which each faulty circuit is emulated separately has been applied to gate-level circuits for Single Stuck Faults (SSFs). This approach has been implemented on the Meta Systems's hardware emulator which is capable of emulating circuits of 1,000,000 gates at rates varying from 500KHz to several MHz. Experimental results are provided to demonstrate the efficiency of SFE. They indicate that SFE should be two orders of magnitude faster than software approaches for designs containing more than 100,000 gates.

1 INTRODUCTION

Test evaluation consists in determining the effectiveness of a set of test patterns by computing the ratio between the number of faults detected by this set and the total number of possible faults with respect to a given fault model.

The traditional approach to test evaluation relies on software programs simulating the effects of the faults on the behavior of the circuit. The simplest method, called serial fault simulation, simulates the faulty circuits one at a time. This method does not require a dedicated fault simulator (any logic simulator can be easily adapted). Therefore this method a priori can handle any type of fault [1]. However, due to the performances of software logic simulators, this type of fault simulation is completely impractical if a large number of faults has to be considered [1, 14]. In the last decades, more sophisticated general purpose methods have been proposed such as parallel [14], concurrent [15] or differential fault simulation [5]. These techniques differ from serial fault simulation because they aim at minimizing the number of simulation passes by simultaneously processing faults. The concurrent method is implemented in most commercial tools because of its generality and its efficiency [6]. Today, the fault simulation approach is becoming unrealistic for many designs not only because the theoretical complexity for simulating one pattern appears to be between linear and quadratic with the number of gates [8], but also because the complexity of the circuits increases faster than the computing speed [2].

Recently, a new approach based on reprogrammable hardware has been proposed [9, 17, 4] to verify circuits before committing them to silicon. This approach called logic or hardware emulation [4, 11] decreases the design time by allowing a "real-time" verification 10,000 to 1,000,000 times faster than software logic simulation [11].

In this paper, we propose a methodology to extend the utilization of hardware emulators for test evaluation by using a brute-force method, called serial fault emulation, in which the fault-free circuit and the faulty circuits are considered separately. A fast Computer-Aided Prototyping (CAP) software package combining netlist translation, synthesis, multi-chip partitioning and routing automatically produces a hardware prototype of the fault-free circuit. The configuration file related to the hardware prototype is downloaded into the hardware emulator and a first emulation pass allows the verification or the calculation of the expected values of the test set. A partial reconfiguration of the hardware emulator is then computed for each fault of interest. The faults are then emulated one at a time by partially modifying the fault-free hardware prototype so that it models each faulty circuit. The main advantage of SFE is that the run time is quasi-proportional to the number of faults so that test evaluation can be performed for very large circuits with large test sets. Unlike hardware accelerators dedicated to logic simulation [16], a significant speed-up can be achieved in comparison with the state-of-the-art software fault simulation methods because of the performances of hardware emulators. As mentioned for serial fault simulation [1], SFE may be used for other types of faults such as multiple fault or bridging fault, but we will concentrate on the SSF model. This paper deals with sequential circuits described at the gate level (gate netlist).

This paper is organized as follows. Section 2 briefly describes the CAP software of the Meta Systems's hardware emulator. Section 3 presents our approach for fault emulation and shows particularly how to compute each faulty circuit from the fault-free circuit and the fault to be inserted. Section 4 deals with the problem of minimizing the run time for SFE and explains the techniques for limiting the software tasks. Section 5 presents experimental results and Section 6 concludes this paper.

2 CAP FOR LOGIC EMULATION

This chapter briefly describes the CAP software used for implementing circuits on the Meta Systems's hardware emulator. As shown on the right part of Figure 1, the first step consists in translating the design netlist into the Meta Systems internal format, namely ANF. This format supports hierarchical descriptions and allows the use of any 4-input function cell. Each cell of the design library is mapped onto the Meta Systems library (called metalib). The resulting library (the conversion library) is used for each design to express the ANF gate netlist in terms of cells of the metalib. The netlist is then flattened, optimized and targeted to the architecture of the reprogrammable circuits used in the hardware emulator, namely the Metas.

As the Tabula Rasa chip [9], the Meta is targeted specifically for logic emulation. The architecture of the Meta consists of a column of logic blocks and a global interconnection matrix (crossbar) which connects the I/Os and the logic blocks (BLPs). Each BLP consists of a 4-input SRAM and a reprogrammable sequential device which can emulate either a flip-flop or a latch. The crossbar improves the inter-chip communication by removing the constraints related to I/O placement (each BLP may be connected to any I/O without decreasing the percentage of BLP utilization). In contrast with the Xilinx LCA architecture [10], each BLP may be observed without adding routing constraints which may cause congested areas and consequently lead to routing failures. This important feature avoids the need to re-compile the design netlist when the user wishes to observe different signals from those initially declared as probes.

[Fig. 1: Flowchart of the Meta Systems's CAP software and the FPGA reconfiguration generation software (denoted by the dashed box). Blocks: Initial Design (EDIF, VERILOG), Cell Library, Fault Specification, Translation, Modelization, Input Gate Netlist, ANF Gate Netlist, Conversion Library, Metalib, Fault Generation, Flattening, Optimization & Mapping, Fault List, Flattened BLP netlist, Partitioning & Routing, Fault Reconfiguration Generation, FPGA Configuration, FPGA Reconfiguration List.]

After targeting the netlist to our reprogrammable hardware architecture, the netlist is partitioned into two levels of hierarchy, namely Metas and boards. Unlike the Quickturn's system [7], the partitioner does not make use of the designer's hierarchy. The partitioner implements efficient techniques such as logic replication for reducing the pin count and the partition size. Finally, the system achieves the routing at three levels of interconnections corresponding to Metas, boards and backplane board. The routing step produces a configuration file which is downloaded into the hardware emulator before operating the hardware prototype.

The backplane board connects 23 logic boards and an interface board managing the communications between the hardware emulator and the workstation host. Each logic board consists of 3 processing columns separated by 2 routing columns. Each processing column contains 8 processing elements, each one consisting of a Meta, a 32K byte memory and a Video VRAM in which the values of all BLPs in the Meta for the last 7,200 emulation cycles are stored. Several backplane boards (up to 6) may be linked to emulate very large designs.

The operating environment consists of a user-friendly interface in Motif and a C interpreter which allows the description of emulation experiments. An emulation experiment defines the conditions in which the hardware prototype operates:

• Maximum number of cycles
• Maximum speed of operating
• Initial values for sequential devices (registers and memories)
• Stopping conditions

A stopping condition is defined by arming a hardware trigger which tests when the emulation brings one or more registers into a predefined state. Unlike Quickturn's RPM system [17], not only the triggers can be changed without re-compiling the prototype, but also every register can be used in a stopping condition.

3 OVERVIEW OF THE FAULT EMULATION SYSTEM

Fault emulation involves calculating a reconfiguration of the hardware emulator for each fault of interest (for the sake of simplicity, we will use the term FPGA(1) reconfiguration). For this purpose, we have developed specific tools which operate in parallel with the CAP software.

(1) The term FPGA is used to refer to all types of field programmable logic, both LCAs such as Xilinx and also those more commonly referred to as PALs and PLDs.

To minimize the total run time for fault emulation, each FPGA reconfiguration has to affect as few BLPs as possible. Hence, in contrast with logic emulation, the CAP software has to be restricted so that it does not result in large modifications to the original netlist. This restriction excludes the use of re-synthesis techniques relying on logic level optimization techniques such as extraction or substitution [3]. In these conditions, the optimization and mapping phase consists only in collapsing the single fanout gates into nodes which satisfy the 4-input constraint. Hence, if a gate has a multiple fanout, the gate cannot be collapsed so that its output signal is kept in the BLP netlist. This comes down to separately mapping each fanout free region.

3.1 Calculation of FPGA Reconfigurations

The left side of Figure 1 indicates how the FPGA reconfigurations are computed with respect to the CAP software. A fault generator constructs the collapsed fault list from the ANF gate netlist and from a fault specification file. This file specifies the blocks of the hierarchy in which the faults will be inserted and the faults excluded from fault emulation. A second step consists in calculating the FPGA reconfiguration associated with each fault of interest so that the modified hardware prototype behaves like the faulty circuit.

An FPGA reconfiguration corresponds to a list of BLPs to be reprogrammed in order to generate the faulty circuit from the fault-free circuit and vice versa. Each BLP of the reconfiguration is associated with a logical address in the emulator and two words encoding the functionality of the BLP for the fault-free circuit and the faulty circuit. The logical address is a 3-tuple denoting the board number in the machine, the Meta number in the board and the BLP number in the Meta.

3.2 Fault Insertion for Combinational Circuits

The FPGA reconfiguration for SSF on a combinational gate affects only the 4-input function of the BLP. Consider the circuit in Figure 2 and suppose that the gates G0, G1 and G2 are gathered into the BLP (0, 0, 0) (denoting the board 0, Meta 0 and BLP 0) and the gate G3 corresponds to the BLP (0, 0, 1). The 4-input function of the BLP (0, 0, 0) is F = A.B + C.D. If the signal X is stuck at zero, this BLP has to be reconfigured so that it implements the function F = C.D.

The cases where it is necessary to reconfigure more than one BLP are as follows:

• A primary input pin of the circuit has a multiple fanout
• An input pin of a gate of the design library has a multiple fanout in the equivalent cell of the conversion library
• A BLP is replicated during the partitioning phase

In the two first cases, the pin stuck fault results in all the input pins of the gates of the multiple fanout being stuck, so that the SSF has to be emulated by a multiple stuck-at fault.
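The combinational fault insertion of Section 3.2 can be sketched in software as follows. This is a minimal illustration, not the Meta Systems tool: the 16-bit truth-table encoding, the function names `lut_word` and `with_stuck`, and the record layout are assumptions made for the example. Only the circuit (X = A.B, Y = C.D, Z = X + Y), the 3-tuple address and the "two words per BLP" idea come from the text. The sketch also shows how faults with identical reconfigurations collapse into one emulation run.

```python
from itertools import product


def lut_word(func):
    """Encode a 4-input Boolean function as a 16-bit truth-table word (assumed encoding)."""
    word = 0
    for index, (a, b, c, d) in enumerate(product((0, 1), repeat=4)):
        if func(a, b, c, d):
            word |= 1 << index
    return word


def with_stuck(signal=None, value=0):
    """Return the function of BLP (0, 0, 0), optionally with one signal stuck at `value`."""
    def blp(a, b, c, d):
        pins = {"A": a, "B": b, "C": c, "D": d}
        if signal in pins:
            pins[signal] = value              # stuck primary input
        x = pins["A"] & pins["B"]             # gate G0
        y = pins["C"] & pins["D"]             # gate G1
        if signal == "X":
            x = value                         # stuck internal signal X
        if signal == "Y":
            y = value                         # stuck internal signal Y
        return x | y                          # gate G2: Z = A.B + C.D
    return blp


address = (0, 0, 0)                           # (board, Meta, BLP)
good_word = lut_word(with_stuck())            # fault-free function F = A.B + C.D

# One reconfiguration record per fault: address, fault-free word, faulty word.
# Faults that yield the same record need only one emulation run.
reconfigs = {}
for signal in ("A", "B", "X"):
    record = (address, good_word, lut_word(with_stuck(signal, 0)))
    reconfigs.setdefault(record, []).append(f"{signal} s-a-0")

c_and_d = lut_word(lambda a, b, c, d: c & d)  # reference word for F = C.D

for (addr, good, bad), names in reconfigs.items():
    print(addr, f"{good:016b}", f"{bad:016b}", names)
```

Keeping both the fault-free and the faulty word in each record means the same record can be used to insert the fault and to remove it afterwards.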

[Fig. 2: An example of FPGA Reconfiguration. a) initial circuit: inputs A, B, C, D, E, gates G0 to G3, internal signals X (stuck at 0), Y, Z and output S; b) BLP circuit: BLP @ (0,0,0) computes Z = A.B + C.D and BLP @ (0,0,1) computes S = Z.E; c) BLP reconfiguration for X stuck at 0: BLP @ (0,0,0), Free: A.B + C.D, Fault: C.D]

Note that partial fault collapsing may be easily achieved by identifying the faults which produce identical FPGA reconfigurations. In the example, the signals A, B, X stuck-at-0 produce the same FPGA reconfiguration so that only one emulation run will be necessary to test these three faults.

3.3 Fault Insertion for Sequential Circuits

As mentioned in Section 2, each BLP consists of a 4-input SRAM and a sequential device. The latter may be configured as either an edge-triggered flip-flop or a latch and it may have additional features such as load enable, asynchronous reset and set lines. The sequential devices are synchronized by a complex clock system ensuring that there are no hold time violations due to short paths between registers.

The FPGA reconfiguration for SSF on a sequential gate affects both the combinational section and the sequential section of the BLP. Table 1 shows how a BLP emulating a flip-flop with reset, set and enable lines is reconfigured according to the stuck faults on its pins (assume active high on all signals). For each SSF of a register, this table shows whether the BLP remains a sequential device (Seq) or becomes a combinational gate (Comb), the new function F, and whether the reset, set, enable and clock lines are used (+) or not ( ). For example when reset is stuck at 1, the BLP becomes combinational and it implements the function F = 0.

Table 1: FPGA reconfiguration for SSF on a register

Fault      Type
none       Seq
D Sat0     Seq
D Sat1     Seq
Q Sat0     Comb
Q Sat1     Comb
RST Sat0   Seq
RST Sat1   Comb
SET Sat0   Seq
SET Sat1   Comb
EN Sat0    Seq
EN Sat1    Seq

[Columns F, RST, SET, EN and CLK: entries not recoverable from the extracted text]

3.4 Fault Emulation

Once both the basic configuration of the prototype and the fault reconfigurations have been generated, fault emulation can be performed. There are three emulation modes:

• Fault-free Circuit Emulation
• Faulty Circuit Emulation
• Serial Fault Emulation

The first mode is used to debug the fault-free circuit before running fault emulation. In certain cases, this mode may also be used to compute the expected values for the observed outputs.

Faulty circuit emulation allows the effects of an SSF to be observed. The high observability of the Meta makes it possible to know where an undetected fault is blocked for a given pattern. This feature is useful for test coverage improvements or for analyzing undetectable faults.

Serial fault emulation is the most important mode because it allows the calculation of the fault coverage and the construction of the fault dictionary. This mode runs through all faulty circuit emulations as fast as possible. A faulty circuit is processed following the five basic steps:

1. Fault insertion by reconfiguring the emulator
2. Register initialization
3. Trigger setting for the fault dropping
4. Faulty circuit emulation
5. Fault deletion by reconfiguring the emulator

During step 4, if an output value differs from the expected value, the trigger (set in step 3) is turned on and it activates the fault dropping. If that does not happen, the faulty circuit emulation continues until the test is finished. The problem of improving the fault emulation speed is addressed in the following section.

4 IMPROVING THE FAULT EMULATION SPEED

The fault emulation speed is defined as the maximal number of faults processed per second. This speed basically depends on four factors:

• The average emulation runtime for processing a fault
• The time required to reconfigure the emulator
• The time required to initialize the sequential devices
• The time required to perform fault detection

Obviously, as the test set becomes larger, the emulation runtime becomes more significant. Conversely the three last factors are crucial when the test set is not very large or when most of faults are detected during the first cycles of the emulation. In order to minimize the overhead for each fault processing, the FPGA reconfiguration has to be downloaded quickly. We will see in Section 4.4 and 4.5 that register initialization and fault detection may be performed by hardware so that the fault emulation speed depends only on the first two factors.

4.1 Emulation Runtime

The emulation runtime depends both on the number of cycles to be executed and on the maximal operating frequency of the prototype, namely the maximum clock speed (MCS). This frequency is computed by a worst-case static timing analyzer from the fault-free circuit. The static analysis ensures that each faulty circuit runs properly even if an inserted fault causes new dynamic paths to occur. Depending on the design, MCS typically varies from 500 KHz to 5 MHz. Assume that P is the average number of patterns necessary to detect the faults. If we neglect the overhead for each fault processing, the serial fault emulation speed ($S_{SFE}$) is expressed as follows:

$$S_{SFE} = \frac{MCS}{P}$$

Assuming that a given circuit operates at 1MHz and that the faults are detected in average at P = 10,000, then the emulator will be able to process 100 faults per second.
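The five-step serial loop of Section 3.4 and the speed estimate of Section 4.1 can be mimicked in software on the small circuit of Figure 2. This is only a sketch under stated assumptions: `evaluate`, `serial_fault_emulation` and `sfe_speed` are illustrative names, the five test patterns are made up for the example, and a Python loop stands in for the hardware emulator. Register initialization is a no-op here because the example circuit is purely combinational.

```python
from itertools import product

SIGNALS = "ABCDEXYZS"   # primary inputs A..E, internal X, Y, Z, output S


def evaluate(pattern, fault=None):
    """Evaluate S = (A.B + C.D).E with an optional (signal, stuck_value) fault."""
    values = dict(zip("ABCDE", pattern))

    def read(name):
        # A stuck signal always reads as its stuck value.
        if fault is not None and fault[0] == name:
            return fault[1]
        return values[name]

    values["X"] = read("A") & read("B")   # G0
    values["Y"] = read("C") & read("D")   # G1
    values["Z"] = read("X") | read("Y")   # G2
    values["S"] = read("Z") & read("E")   # G3
    return read("S")


def serial_fault_emulation(patterns, faults):
    """Return {fault: cycle of first detection} for every detected fault."""
    expected = [evaluate(p) for p in patterns]            # fault-free pass
    detected_at = {}
    for fault in faults:
        # Steps 1-3: insert the fault, initialize registers (none here), arm the trigger.
        for cycle, pattern in enumerate(patterns, start=1):   # step 4
            if evaluate(pattern, fault) != expected[cycle - 1]:
                detected_at[fault] = cycle                # trigger fires: fault dropping
                break
        # Step 5: fault deletion (nothing to undo in this software model).
    return detected_at


def sfe_speed(mcs_hz, avg_patterns):
    """S_SFE = MCS / P, neglecting the per-fault overhead."""
    return mcs_hz / avg_patterns


# Hypothetical test set, ordered (A, B, C, D, E).
patterns = [(1, 1, 0, 0, 1), (0, 0, 1, 1, 1), (1, 1, 0, 0, 0),
            (0, 1, 0, 1, 1), (1, 0, 1, 0, 1)]
faults = list(product(SIGNALS, (0, 1)))                   # all single stuck-at faults

detected_at = serial_fault_emulation(patterns, faults)
coverage = len(detected_at) / len(faults)
avg_p = sum(detected_at.values()) / len(detected_at)
print(f"fault coverage: {coverage:.2f}, average detection cycle: {avg_p:.2f}")
print(f"1 MHz, P = 10,000 -> {sfe_speed(1e6, 10_000):.0f} faults/s")
```

The early `break` plays the role of fault dropping: a fault stops consuming cycles as soon as one observed output differs from the expected value, which is why the average detection cycle P, rather than the full test length, drives the speed.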

Theoretical Complexity

It is clear that SSFE depends on the size of the circuits and that the run time cannot be considered as linear with the number of gates. On the other hand, there exists no explicit relation between the number of gates and the maximum clock frequency. Furthermore, experimental results show that some large circuits can operate at higher speed than small circuits. Hence, SFE can be considered as quasi-linear with the number of gates of the circuit.

4.2 Fast Reconfiguration

Let Treconf be the time required to reconfigure the hardware prototype. Hence, the SFE speed is now defined as follows:

\[ S_{SFE} = \frac{MCS}{P + T_{reconf} \cdot MCS} \]

Treconf depends mainly on the time needed to reconfigure BLPs. Unlike Xilinx, the Meta architecture provides the ability to read or modify only a portion of the chip at a time. This feature allows the reconfiguration time to be significantly reduced. A BLP can be reconfigured in 0.2 millisecond regardless of its location. On average, Treconf is equal to 0.8 millisecond (4 BLPs to be reconfigured) and consequently 1,200 faults can be processed every second if P is close to 0 (all the faults are detected in the first cycles). Assuming that a circuit operates at 1 MHz, then the time required to reconfigure the prototype will be greater than the emulation runtime if P < 800. Conversely the reconfiguration time will be negligible as soon as P > 10,000.

4.3 Input Stimuli

The Meta Systems hardware emulator provides two techniques for injecting stimuli. The first consists in using the 24 hardware memories available on each logic board in a generator mode. In this mode, each memory is directly addressed by a non-reconfigurable hardware counter which has to be reset at the beginning of each emulation. The user can load up to 32K patterns regardless of the number of bits. It is possible to extend the pattern capacity by successively loading 32K pattern pages. However this last possibility is not well-suited for fault emulation because of the loading time. The second technique consists in replacing one or more logic boards by specialized memory boards. Each memory board can also operate in a generator mode and it can be loaded with up to 256K patterns of 384 bits. Memory boards may be combined to provide a very large memory. In this mode, both the controlled input values and the expected output values are considered as stimuli.

4.4 Register Initialization

Test evaluation generally requires the capability of bringing the circuits in a given state without applying any initialization sequence. This may be used when the test sets are so large that they have to be cut in several test sequences. This may also be used in the following case. When hardware reset is not available on the registers, the test set consists of an initialization sequence which brings the circuit in a given state followed by an actual test sequence. In this case, register initialization may also be used to separately observe the effects of faults for the initializing sequence and the actual test sequence.

As indicated in Section 2, the registers (and more generally the sequential devices) may be initialized by writing a C program in which specified values (0 or 1) are assigned to them. When the program is interpreted, the BLP corresponding to each register is forced to the specified value. Note that the registers which are not forced by the program remain in an unpredictable state (either 0 or 1). In logic emulation, the time required to initialize the registers is not significant in comparison with the emulation runtime. The main drawback of this "software" initialization technique is that the time required to force the BLPs can significantly increase the overhead between each fault processing.

Fig. 3: Hardware initialization of Registers (an additional register drives the set or reset pins of the design registers R1, R2 and R3)

We propose a hardware technique for initializing the registers into a given state. This technique takes advantage of the capability of forcing the value of the sequential device of each BLP with the asynchronous set and reset lines. An additional register is connected to the set or reset pins of the registers to be initialized. In Figure 3, the registers R1 and R3 have to be forced to 1 whereas the register R2 has to be forced to 0. At the beginning of the faulty circuit emulation, the additional register is forced to 1 by the software initialization described above so that it brings the circuit in the given state. If a pin is already connected to another gate, a 2-input OR gate is added to the design for ORing the initial signal and the forcing line (R3 for example). In this case, the set or reset pin stuck faults are inserted on the input pins of the OR gate. Hence, this technique has a negligible impact on the initial netlist and consequently on MCS. Unlike software initialization, hardware initialization avoids spending time between each fault pass. Note that if the user wants to bring the circuit in another initial state, the hardware prototype has to be re-compiled by the CAP software.

4.5 Fault Detection

Fig. 4: Hardware Fault Detection (memory boards supply the test patterns and the expected outputs; the logic boards hold the faulty circuit, the equality comparator, the fault counter and the trigger)

To improve the fault detection speed, it is possible to perform a hardware comparison between the output values calculated by emulation and the expected values. As shown in Figure 4, an equality comparator is inserted into the initial design so that only one signal has to be tested to verify whether the output values are different from the expected values or not. The trigger defined in Section 3.4 is set on this signal. A counter may also be inserted into the design in order to calculate the number of times a fault is detected. In this case, the trigger is turned off to prevent the emulation of the faulty circuit from being stopped before the end of the test.

The comparator may have an impact on MCS since an observed signal can be located on the critical path (calculated by the static timing analyzer). In this case, the propagation time through the comparator is added to the critical path so that MCS decreases. Each observed signal is propagated within the comparator through a 2-input NXOR and an N-input AND (where N is the number of observed signals).
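The speed model of Sections 4.1 and 4.2 is easy to check numerically. The following sketch assumes the relation SSFE = MCS / (P + Treconf x MCS) and uses the figures quoted in the text (1 MHz clock, 0.2 ms per BLP, 4 BLPs per fault on average); the function and variable names are illustrative only.

```python
def sfe_speed(mcs_hz, p_cycles, t_reconf_s=0.0):
    """Faults processed per second: MCS / (P + Treconf * MCS).

    mcs_hz     -- maximum clock speed of the prototype, in Hz
    p_cycles   -- average number of cycles needed to detect a fault
    t_reconf_s -- reconfiguration time per fault, in seconds
    """
    return mcs_hz / (p_cycles + t_reconf_s * mcs_hz)

MCS = 1e6                 # 1 MHz, as in the text
T_BLP = 0.2e-3            # one BLP reconfiguration: 0.2 ms
T_RECONF = 4 * T_BLP      # 4 BLPs on average: 0.8 ms

runtime_only = sfe_speed(MCS, 10_000)       # Section 4.1: about 100 faults/s
reconf_bound = sfe_speed(MCS, 0, T_RECONF)  # P close to 0: about 1,250 faults/s
break_even_p = T_RECONF * MCS               # about 800 cycles

print(runtime_only, reconf_bound, break_even_p)
```

The value of about 1,250 faults per second for P close to 0 is consistent with the rounded figure of 1,200 given in the text, and the break-even point of 800 cycles matches the statement that reconfiguration dominates when P < 800.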

If a balanced technique is used for the mapping of the N-input AND of the comparator (N being the number of observed signals), there are 1 + log4(N) BLPs between each observed signal and the output of the comparator on which the trigger condition is set. Furthermore the comparator induces a partitioning and routing constraint since the observed signals have to be connected to the comparator through one or several Metas and/or boards.

5. EXPERIMENTAL RESULTS

In order to measure the efficiency of SFE, we have conducted two sets of experiments.

5.1 Evolution of performances for a fixed architecture

First, we have conducted experiments to demonstrate the range of performance according to the size of a fixed architecture, namely for several Booth's multipliers made up of 2-input gates. We have selected several Booth's multipliers varying from 4 to 64 inputs. Each I/O pin of the gates of the circuits is stuck at 0 and 1, but fault collapsing is performed to minimize the number of faulty circuit runs. We have generated 1K random test patterns (this size is sufficient to obtain a 95% coverage). Table 2 indicates the number of gates (#G), the number of faults (#F), the time required to compile the fault-free hardware prototype and to compute the FPGA reconfigurations (TCAP) on a Sun workstation (Sparc 10 - 64 MBytes RAM), the number of runs needed to test all the faults (#R) and the runtime for fault emulation (TSFE).

Table 2: Fault emulation of multipliers (columns #G, #F and #R)

Circuit     #G      #F       #R
mul4x4      179     1000     424
mul8x8      699     3876     1628
mul16x16    2561    14620    6100
mul32x32    10203   57320    23300
mul64x64    39899   218860   88962

The time required to CAP is considerably greater than the runtime for fault emulation (this is especially true when the circuits are large). However it is not necessary to re-compile the circuits for other test sets or other fault sets. Most of the faults (90%) are detected in the first ten patterns so that fault emulation runs at the maximal reconfiguration speed (around 1,200 runs per second). It is obvious that in these conditions TSFE is linear with the number of gates #G.

Fig. 5: Impact of the hardware comparison on MCS (MCS in MHz for mul4 to mul64, with and without comparison)

Figure 5 shows the impact of the hardware comparison on MCS. Each multiplier is generated with the CAP software and MCS is calculated with the hardware comparator (denoted by the dashed line) and without the hardware comparator (denoted by the solid line). It can be seen that the propagation time through the comparator is added to the critical path of each multiplier since those circuits are combinational and all the outputs are observed (and compared). The two curves show that as the circuit size increases, the effect of the comparator decreases. Since our approach is targeted for large circuits, the effects of the hardware comparison can be disregarded. Note that for smaller circuits, the impact can nevertheless be minimized by using a pipeline architecture in which registers are inserted after the observed signals.

5.2 SFE on ISCAS'89 Benchmarks

In the second experiment, we have performed test evaluation with 50K random test patterns on the largest sequential circuits from the standard set of the ISCAS'89 benchmarks. A full scan path has been inserted into the original design so that the random test can bring the circuit into many distinct states to obtain a good fault coverage (obviously, our method does not intrinsically require full scanned circuits). Furthermore, all the flip-flops are set to 0 at the beginning of each faulty circuit emulation by using the technique explained in Section 4.4. We have considered the SSF model for all the gate outputs without collapsing the faults.

Table 3 gives the description of these circuits in terms of the number of gates (G), the number of flip-flops (FF), the number of inputs (I) and the number of outputs (O). The CAP results are reported in terms of the time required to compile the fault-free hardware prototype and to compute the FPGA reconfigurations (TCAP) on a Sun workstation (Sparc 10 - 64 MBytes RAM) and the maximum clock speed (MCS).

Table 3: CAP for ISCAS'89 benchmarks (description columns)

Circuit   #G      #FF    #I   #O
s9234     5725    228    19   22
s13207    8620    669    31   121
s15850    10369   597    14   87
s35932    17793   1728   35   320
s38584    19705   1452   12   278
s38417    23715   1636   28   106

Table 4 reports the results obtained by SFE in terms of the number of faults (#F), the fault coverage (C) obtained with the random test set, the average number of patterns needed to detect a fault (T), the runtime for fault emulation (TSFE) and the fault emulation speed (SSFE).

Table 4: Results for 50K Pattern Fault Emulation (columns #F and T)

Circuit   #F      T
s9234     13020   15432
s13207    21256   11501
s15850    24322   12611
s35932    45956   9781
s38584    50124   4946
s38417    57448   4531

MCS does not decrease with the complexity of the circuits so that SFE is linear in time with the number of gates. Furthermore, SSFE increases as the average number of patterns needed to detect a fault decreases.

We have compared our results with those obtained by HOPE (version 1.1) [12] [13], a state-of-the-art fault simulator combining efficient techniques such as single fault propagation and parallel fault processing.
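The comparator depth of 1 + log4(N) BLPs mentioned in this section can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions: each BLP is taken to implement up to a 4-input AND (which is what the base-4 logarithm suggests), and the depth is rounded up when N is not a power of 4. The function names are hypothetical.

```python
def comparator_blp_depth(n_signals, fan_in=4):
    """BLP levels between an observed signal and the comparator output.

    One level for the 2-input NXOR, plus the levels of a balanced AND
    tree whose nodes each combine up to `fan_in` signals.
    """
    if n_signals < 1:
        raise ValueError("at least one observed signal is required")
    levels = 0
    width = n_signals
    while width > 1:
        width = -(-width // fan_in)   # ceiling division: nodes on this level
        levels += 1
    return 1 + levels

def outputs_match(observed, expected):
    """Behavioural model of the equality comparator: AND of bitwise NXORs."""
    return all(not (o ^ e) for o, e in zip(observed, expected))

for n in (4, 16, 64, 128):
    print(n, comparator_blp_depth(n))
```

Because the depth grows only logarithmically with the number of observed signals, the delay added by the comparator stays small even for wide output buses, which agrees with the observation that its effect on MCS decreases as circuits get larger.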

HOPE, which relies on parallel fault processing, has been tested under similar conditions using the same fault model and the same test set. Furthermore, the performance of HOPE is measured on the same machine (Sparc 10). To compare the evolution of performance with the number of gates, we have to normalize the run time according to the average number of patterns required to detect a fault. So we have normalized the results according to the first circuit (s9234).

Fig. 6: Normalized Performance Comparison (normalized processing time in seconds versus number of gates, from 5k to 25k, for HOPE and SFE; the curves are annotated x8 and x20)

Figure 6 shows the processing time for SFE and HOPE. The speedup of SFE over HOPE varies from 8 to 20. It is clear that SFE is especially advantageous for large circuits. For 100 K gate circuits, we can hope to reach a speedup of two orders of magnitude with respect to commercial tools. In contrast with HOPE, these tools are less efficient because they have to implement mechanisms to take into account user's libraries, memories or complex synchronization schemes.

6 CONCLUSIONS

An approach to evaluate test sets for large sequential circuits has been presented. This approach relies on the utilization of a hardware emulator to observe the effects of the faults on the circuits. Serial fault emulation takes advantage not only of the reconfigurability of SRAM-based reprogrammable circuits but also of the reconfiguration speed of the Meta Systems hardware emulators. The main advantage of serial fault emulation is that, in contrast with software fault simulation, the computing time is quasi-linear with the number of gates. So for large designs our approach can drastically reduce the time taken in the analysis of fault coverage. After logic verification and fast prototyping, test evaluation is a new application of hardware emulators that will encourage designer teams to adopt hardware emulator based methodology.

7 ACKNOWLEDGMENTS

The authors would like to thank Dong S. Ha of the University of Virginia for providing them with HOPE and B. Bailey of Mentor Graphics for his help in the preparation of this paper. They also thank E. Touzard and G. Morisset for their assistance with the Meta Systems CAP software.

References

[1] M. Abramovici, M. A. Breuer and A. D. Friedman, "Digital Systems Testing and Testable Design", W. H. Freeman and Company, New York, 1990.
[2] P. Bottorff, "Test Generation and Fault Simulation", VLSI Testing, North Holland Ed., 1985, pp. 29-64.
[3] R. K. Brayton, G. D. Hachtel and A. L. Sangiovanni-Vincentelli, "Multilevel Logic Synthesis", Proc. of the IEEE, Vol. 78, No 2, Feb. 1990, pp. 264-300.
[4] M. Butts, J. Batcheller and J. Varghese, "An Efficient Logic Emulation System", Proc. ICCD, 1992, pp. 138-141.
[5] W. T. Cheng and M. L. Yu, "Differential Fault Simulation - A Fast Method Using Minimal Memory", Proc. 26th DAC, 1989, pp. 424-428.
[6] S. Gai and P. L. Montessoro, "Creator: New Advanced Concepts in Concurrent Simulation", IEEE Trans. on CAD, Vol 13, No 6, June 1994, pp. 786-795.
[7] J. Gateley et al., "UltraSPARC-I Emulation", Proc. 32nd DAC, 1995, pp. 535-540.
[8] D. Harel and B. Krishnamurthy, "Is There Hope for Linear Time Fault Simulation?", Proc. Fault Tolerant Computing Symposium, July 1987, pp. 28-33.
[9] D. Hill and D. Cassiday, "Preliminary Description of Tabula Rasa, an Electrically Reconfigurable Hardware Engine", Proc. ICCD, 1990, pp. 391-395.
[10] H-C. Hsieh et al., "A Second Generation User-Programmable Gate Array", Proc. Custom Integrated Circuit Conference, 1987, pp. 515-521.
[11] U. Khan, H. Owen and J. Hughes, "FPGA Architectures for ASIC Hardware Emulator", Proc. 6th IEEE ASIC Conference, Sept. 1993, pp. 336-340.
[12] H. K. Lee and D. S. Ha, "HOPE: An Efficient Parallel Fault Simulator for Synchronous Sequential Circuits", Proc. 29th DAC, 1992, pp. 336-340.
[13] H. K. Lee and D. S. Ha, "New Techniques for Improving Parallel Fault Simulation in Synchronous Sequential Circuits", Proc. ICCAD, 1993, pp. 10-17.
[14] E. W. Thomson and S. A. Szygenda, "Parallel Fault Simulation", Computer, Vol. 8, No 3, March 1975, pp. 177-188.
[15] E. Ulrich and T. Baker, "Concurrent Simulation of nearly Identical Digital Networks", Computer, Vol. 7, April 1974, pp. 204-209.
[16] N. Van Brunt, "The Zycad Logic Evaluator and its Application to Modern System Design", Proc. ICCD, 1983, pp. 232-233.
[17] S. Walters, "Computer-aided Prototyping for ASIC-based Systems", IEEE Design and Test, June 1991, pp. 4-10.

M2000 starts offering IP cores and tools that marry ASIC and FPGA design

EE Times: Latest News
M2000 now open to public
Anthony Cataldo (09/27/2004 9:00 AM EDT)
URL: http://www.eetimes.com/showArticle.jhtml?articleID=47902997

San Jose, Calif. - More than a few chip companies have tried their hand at embedding blocks of FPGA logic into otherwise-hardwired ASIC devices. Startup M2000 says it wants to be the first to make a business out of it.

After eight years, the reclusive company will open its doors to customers outside the close network of partners with which it has been working for more than three years. In the coming months, M2000 (Bievres, France) will start offering intellectual-property (IP) cores and tools that will marry ASIC and FPGA design.

Embedding FPGA blocks into ASICs has been seen as a way to make ASICs more flexible and to reduce development costs, but most of the handful of chip makers and startups that have tried it have retreated because of cost and software issues. Among those that have tried and failed are LSI Logic, Adaptive Silicon and Actel. Leopard Logic Inc., which started out offering FPGA IP cores, is now fielding semicustom chips that combine fixed logic functions with on-chip FPGA. The one heavyweight to watch in this area is IBM Corp., which has licensed FPGA technology from Xilinx Inc. that it can use in 90-nm ASICs.

It's been a long gestation period for M2000, but Reblewski said the company refrained from rushing to market in order to avoid the mistakes made by others. For that reason, M2000 said it has put most of its effort into ensuring that it has the right software tools and a high-density FPGA fabric.

Though the company is small and relatively unknown, the 15-person team at M2000 is hardly new to FPGA technology. The company was started in 1996 by the founders of FPGA-based emulation vendor Meta Systems, which was bought by Mentor Graphics Corp. the same year. The founders hold many patents related to configurable logic and its use for electronic system testing.

The company's proposed design flow includes the use of commercial synthesis tools along with its own mapping, placement and routing, and configuration tools. After place and route, the tools automatically output an embedded FPGA macro that includes all the data for design-for-test, floor planning and physical verification. SDF files for static timing analysis and Verilog models for simulation are also automatically generated.

As for the FPGA hardware, the company says it is now finishing an eighth-generation device structure, targeting 90-nanometer designs. The cell structure is based on a basic four-input lookup table and SRAM technology. The company says it can port its FPGA architecture to any foundry within two months.

Where M2000 parts ways with other FPGA vendors is in the architecture. Rather than use uniform routing resources, the company has developed a compiler that tailors the routing to the design, giving it three times the logic density of standard FPGAs, said chief executive officer Frederic Reblewski. "What others propose in 90 nm is what we propose in 0.15 micron in terms of density," Reblewski said.

Another claimed benefit to this approach is that timing becomes more predictable. "As soon as we know where to put each element, the timing is known," he said. By November, Reblewski said the company can achieve logic speed of 700 MHz using 0.13-micron design rules.

M2000 has been working with six partners, in Europe, the United States and Japan, and one of them is said to be shipping chips with the embedded FPGA core for wireless-infrastructure systems. (The only partner that it has disclosed is STMicroelectronics, which worked with M2000 several years ago to design an image sensor that used embedded FPGA gates for reconfigurable logic.) Moreover, M2000 has been raising investment capital since early 2004.

Assuming M2000 can deliver the technology as promised, the next question is whether it will fly as a business. "There are some very big names looking at our technology," Reblewski said. "We are very confident."

All material on this site Copyright © 2005 CMP Media LLC. All rights reserved.


A Reconfigurable System featuring Dynamically Extensible Embedded Microprocessor, FPGA and Customisable I/O

Michele Borgatti, Francesco Lertora, Benoit Forêt and Lorenzo Calì
STMicroelectronics Innovative Systems Design, Central R&D, Agrate Brianza (MI), ITALY

Abstract

A system-chip targeting image and voice processing and recognition application domains is implemented as a representative of the potential of using programmable logic in system design. It features an embedded reconfigurable processor built by joining a configurable and extensible processor core and a SRAM-based embedded FPGA. Application-specific bus-mapped coprocessors and flexible I/O peripherals and interfaces can also be added and dynamically modified by reconfiguring the embedded FPGA. The architecture of the system is discussed as well as the design flows for pre- and post-silicon design and customisation. The silicon area required by the system is 20mm2 in a 0.18um CMOS technology. The embedded FPGA accounts for about 40% of the system area.

1. Introduction

These days we are witnessing two conflicting trends in the electronic industry. On one side, the economics of system integration pushes logic suppliers towards ever more complex system-chip devices. On the other side, increasing complexity of design and associated risks, increase of non-recurrent engineering expenses and shorter time-to-market and product life are causing OEMs to look for faster turnaround and lower risk design solutions and technology.

The recent introduction of embedded programmable logic allows ASIC and ASSP vendors to broaden the appeal of their products. In particular, hardware programmability can be exploited by system integrators for product customisation. Also, enabling application-specific configurations to adapt the underlying hardware architecture to time-varying application demands can improve execution speed and reduce power consumption compared to a general-purpose programmable solution.

In this paper we present a pragmatic approach to introduce flexibility in system-chip design and exploit embedded programmable silicon fabrics to enhance system performances. The proposed system has been built using a set of state-of-the-art IP cores and system design methodology. In particular, a configurable and extensible processor (1) with associated tools, and an embedded FPGA (2) were used. In the proposed system the embedded programmable logic allows static or dynamic configuration of the instruction set of an embedded microprocessor, the creation of bus-mapped application-specific hardware coprocessors and accelerators, and the customisation of the system I/O. The latter feature allows the device to potentially connect to any external unit/sensor given that its communication protocol can be mapped to the on-chip programmable logic. Also, some computations can be performed on-the-fly when data is captured. The resulting system has been developed to target image and voice processing and recognition application domains. Design flows for system exploration and implementation are also introduced.

2. System Architecture

One of the main goals of this work was to build a flexible architecture, built around an embedded FPGA and an extensible 32-bit microprocessor, working at a reasonably high clock frequency. The system architecture is illustrated in Fig. 1.

The base processor is a specific customisation of that described in (1). It comes with a complete set of tools for configuration and performance analysis. Main features of the processor core used in our system are: 5-stage pipeline, a 24 or 16 bit instruction format for improved code density, 8+8kB direct-mapped data/instruction caches, 13 interrupt lines organized in 4 priority interrupt levels, a 64 bit processor interface (PIF) with burst transfers for cache-page refill. The PIF/AHB Bridge translates processor cycles to the AMBA AHB bus (3) with support for fast burst and locked transfers.

An external memory interface (EMI) exploits the available peak throughput of fastest commercial external non-volatile flash memories. It allows a wide range of burst mode and page mode configurations under software control and supports low-voltage, low-swing operations. If required, an external RAM port allows the extension of the on-chip 48kB SRAM.

The heart of the system is an embedded FPGA and its multiple interfaces to main system units; in particular the functional purposes of the e-FPGA programmable logic are:

• extension of the processor datapath supporting a set of additional special-purpose instructions (TIE). This is done by connecting the processor datapath through a wide bus and a specific interface (TIE bus/interface in Fig. 1).
• bus-mapped coprocessor. Hardware units mapped into the e-FPGA can be interfaced to the system bus through an AHB bus master/slave.

• flexible I/O. The programmable general-purpose I/O pads interface is used to connect external units or sensors with their application-specific communication protocol.

All these possibilities may be mixed in a single configuration for the FPGA and this results in a highly configurable device.

Fig. 1: System Architecture Block diagram

Download of the FPGA bitstream is performed by a flexible programming interface. To allow validation of the FPGA configuration, the bitstream may be read back by hardware support.

Most audio or video applications require storage buffers to interface fast decoding hardware and slower software running on the processor. To accelerate communications between the configurable hardware and software tasks running on the processor, a 1kByte dual port buffer has been added and organised as 4x256 bytes rows. One port of this buffer is connected to the AHB bus while the second port is directly accessed by the FPGA dual port buffer interface. In particular, 4 interrupt channels can be driven by logic mapped into the e-FPGA. A two-way HW/SW communication can be implemented by the joint usage of these interrupt channels and dedicated AMBA APB registers.

The AMBA APB Bus connects all the configuration/general purpose registers to the system. On the same bus, an I2C master interface has been added to connect external devices or sensors like LCD display, CMOS camera, etc. A programmable general-purpose I/O module features mono input/output and bi-directional pads under the control of both the e-FPGA and the microprocessor.

A. The Microprocessor-FPGA interface

The configurable processor allows adding user-defined instructions. With this concept in mind, a set of additional instructions can be defined to target specific application needs. In the proposed architecture, this capability was mapped exclusively into the e-FPGA. This implies that the number of user-defined instructions available at a given time is limited by the e-FPGA logic capacity and instruction logic complexity. If the logic size of the set of additional instructions exceeds the logic capacity of the e-FPGA, it might be split into a number of contexts fitting the size constraints of the e-FPGA. These contexts might be used to dynamically reprogram the FPGA to support application needs, allowing runtime re-configuration of the instruction set.

Fig. 2 details the processor-FPGA interface: a focus is given on how Instruction Extensions are mapped inside the FPGA and how synchronisation between the microprocessor and the e-FPGA is guaranteed.

Fig. 2: Embedded FPGA - Microprocessor Interface (clock stretcher mechanism: a type decoder and clock control select a delay for the opcode groups TIE 0~4, TIE 5~9, TIE 10~21 and TIE 22~31; labelled delays are N1, N2, N3 and a default delay N=1)

The flexibility advantage of this architecture implies a speed penalty for the part of logic mapped inside the e-FPGA. In particular, specific processor instructions mapped in the reconfigurable fabric may be 1x to 10x slower than their equivalent implementation in standard cells. As the additional instruction set is part of the processor pipeline (1), slowing down this logic results in a drastic reduction of processor maximum speed, hence affecting processor performance when using the baseline general-purpose instruction set.

A mechanism is introduced to allow the processor to be clocked at its maximum speed while executing standard instructions, whereas it is slowed down by a programmable, instruction-dependent number of cycles (1-16) when executing processor instructions mapped into the FPGA. A dedicated module is able to identify instructions whose performance is not aligned with the processor. A clock control system allows the processor to be synchronised with the e-FPGA for the number of cycles the instruction is executed. As each of these instructions needs to be associated to its execution time, the set was partitioned. A pre-defined map-table divides in 4 the whole set of opcodes reserved for user-defined instructions. For each set that belongs to a configuration, a number, mapped as a constant output of the FPGA, defines the number of times the clock needs to be stretched to synchronise properly the execution of the pipeline between the FPGA and the base processor.
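The clock-stretching scheme can be sketched as a small lookup model. Everything specific in the sketch is an assumption made for illustration: the opcode ranges are borrowed from the labels of Fig. 2 (TIE 0~4, 5~9, 10~21, 22~31), the stretch counts are arbitrary values within the 1-16 range stated in the text, and the pairing of ranges with counts is not taken from the source. It shows how a four-entry map-table turns an opcode into a number of stretched clock cycles while standard instructions keep running at full speed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

@dataclass(frozen=True)
class StretchTable:
    """Hypothetical map-table: four opcode groups, one stretch count each."""
    upper_bounds: Tuple[int, int, int, int] = (4, 9, 21, 31)  # illustrative ranges
    stretch: Tuple[int, int, int, int] = (1, 2, 4, 8)         # illustrative counts

    def __post_init__(self) -> None:
        if any(not 1 <= n <= 16 for n in self.stretch):
            raise ValueError("stretch counts must lie in 1..16")

    def cycles(self, opcode: int, user_defined: bool) -> int:
        """Clock cycles spent on one instruction."""
        if not user_defined:
            return 1                      # standard instruction: full speed
        if opcode < 0:
            raise ValueError("opcode must be non-negative")
        for bound, count in zip(self.upper_bounds, self.stretch):
            if opcode <= bound:
                return count              # clock stretched `count` times
        raise ValueError("opcode outside the user-defined range")

def total_cycles(trace: Iterable[Tuple[int, bool]], table: StretchTable) -> int:
    """Sum the cycles of a trace of (opcode, user_defined) pairs."""
    return sum(table.cycles(opcode, user_defined) for opcode, user_defined in trace)

table = StretchTable()
trace = [(0, False), (7, True), (22, True)]
print(total_cycles(trace, table))
```

Keeping the stretch count per group rather than per opcode is what lets the count be exported as a constant output of the FPGA configuration: reloading a context only has to update four values.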

The clock stretching synchronises the execution of the pipeline between the FPGA and the base processor. In this way, the processor CPU is tied to the FPGA speed for the strictly required number of cycles. Thus, the system allows executing a set of TIEs among a panel of 4 user-defined speed penalties for any FPGA configuration. The set of user-instructions can be defined after tape out thanks to the FPGA. Moreover, the system allows its execution time to be parametrised.

B. Block Description of the e-FPGA

The architecture of the e-FPGA (2) is organised as a hierarchical multi-level interconnect network (see Fig. 3). An array of logic elements called Multi Function logic Cell (MFC) allows implementation of digital logic. The MFC is a 4 input / 1 output programmable structure associating a 4 input Look-Up Table and a storage element (dff, latch). There are 3k MFC shared among 24 clusters. At a lower level, a Local Interconnect Network links MFC together and to the global network. The Global Interconnect Network links the clusters together and to IPads & OPads peripheral cells. The input (respectively output) pin set counts 384 independent and fully equivalent inputs (respectively outputs). The MFC clock is one of 3 global signals defined to be connected to any input of the cluster. The architecture allows defining up to 1 clock signal per cluster. This ensures a low skew between cluster clocks and a full IO assignment flexibility.

Fig. 3: Block diagram of the e-FPGA

3. Design Flow and System Integration

A. The System-to-RTL design flow

In Fig. 4 the design flow used for system architecture exploration and integration is described. The starting point is an untimed model of the system written in C/C++ code describing the desired functionality, with a progressive refinement of the functional blocks into hardware and software (partitioning process) and the generation of the HW/SW interface (interface synthesis). At this stage the verification is done with simulations in CoWare N2C environment (4). The microprocessor core is abstracted in the coverification with its Instruction Set Simulator integrated into the simulation engine. This methodology allows designers to validate the system specifications and consequently to verify the system at a cycle accurate abstraction level.

Extensive simulations of the system with the usage of the profiler (memory accesses, cpu load, exceptions) help in finding the computational kernels of the software running on the core (performance analysis). At this point it is possible to group segments of code that prove time consuming as new instructions of the extensible processor. Those extensions of the Instruction Set can be easily mapped on the e-FPGA as well as the VHDL code that results from the refinement process done during partitioning phase.

Fig. 4: System to RTL

The system integration flow ends producing:

• Soft Hardware to be mapped on the eFPGA: HDL RTL code of instruction extensions, bus-mapped coprocessors and special purpose I/O peripherals.
• Conventional fixed hardware: Microprocessor RTL code, AHB/APB bus and Peripherals.
• Embedded Software (C code): Application software and low-level drivers for the hardware platform.

B. The RTL-to-Layout design flow

In Fig. 5 both the silicon implementation flow and the e-FPGA configuration flows are shown. These flows are run at different times. The C code generated by the flow described above became the final application while the RTL of the system with the eFPGA hard macro goes into the system integration flow. The RTL code of the CPU core, IP blocks and Interface modules (system bus) is synthesized and integrated with RAM blocks and FPGA hard macro in the floorplanning environment. To exploit the performances of both hard-wired and programmable logic, a particular set of constraints was specified to reach minimum delay of the hardwired logic. To meet timing requirements at the boundary of the e-FPGA, special care was taken during the synthesis process for the logic cells that interface the e-FPGA with the rest of the system. Once the silicon implementation flow has produced the routed database, it is possible to implement the eFPGA flow, which can be repeated for each different function built as a soft macro. After the place and

0” (4) I. EMI and programming interface is 50MB/sec. To avoid external multiple power supply. TABLE I DEVICE PERFORMANCES AND POWER CONSUMPTION Processor maximum speed: Reconfiguration speed: Chip average power consumption 125MHz (WCMIL) 175MHz (TYP) ~500us @ 100MHz clock ~300mW @ 100MHz. A special thank to Dr. “Hardware/Software Co-Design of Digital Telecommunication Systems”.8V(core.5 mm2 (pad limited) 20 mm2 8. pp 391418. DC REGULATOR CACHE TAGS 8 + 8 KB INSTRUCTION And DATA CACHE 32b EXTENSIBLE CORE + 64b AHB BUS + 64b APB BUS + AHB & APB PERIPHERALS STANDARD CELLS (250k GATES) Embedded FPGA 48 KB SRAM 1KB DUAL PORT BUFFER Fig.6 Chip Micrograph . Device performances and power consumption are summarized in Table I. Barbier and F. and K. S. 60-70. “AMBAä Specification” Rev 2. Static timing analysis of the e-FPGA results in both a backannotated netlist and a timing view for full chip static timing analysis. www.De Man. They also thank O. The chip is being tested and is fully functional at the clock rate of 175MHz. During reconfiguration the average throughput sustained by external memories. B.m2000.5x to 2x performance improvements are reported on specific I/O intensive tasks to interface an external CMOS camera and doing some image processing computations on-the-fly using the e-FPGA.E. I/O Interf.8V) voltage regulator has been integrated.6 with a floorplan view of system components. Vol. Reblewsky at M2000.8V/3. "Xtensa: A Configurable and Extensible Processor" . Acknowledgements: The authors thank Sara Bocchio. 3. internally generated / regulated) References (1) R. Additional 1. The system is being tested using both a face recognition application and a speech recognition application. A. Campbell at Tensilica.3V. Finally the generation of the bitstream and a timed view of the macro can be used for the final sign-off. V E R I F I C A T I O N CPU core. 
J.2 mm2 (15k useable equivalent ASIC gates) 24 general-purpose inputs 24 general-purpose outputs (tristate) 8 general-purpose bidirs 2. G. Reconfiguration takes about 500us at a clock rate of 100MHz. the final database is statically and dynamically verified against the RTL simulations in order to make verification at all levels of abstraction. “Flexeos family technical manual”.fr (3) ARM Ltd. March 1997. an internal DC (3V to 1. C.route stage. (2) M2000. Dynamic Verification Constraints file Synthesis (fpga lib) Mapping (fpga p&r) Netlist + Timing Database FPGA Timing Database FPGA Bitstream Silicon fab Final verification with FPGA timing model Static Timing Analysis Fig.. built after a paracsitic extraction and a delay calculation process. M. Verification with FPGA black box. IP Interface RTL code RAM eFPGA core recognition computing kernels. H. J. Technology and device characteristics are summarized in Table II and a chip micrograph is shown in Fig. Repetto.8V Synthesis TIE Coproc. Ahluwalia. This information is exported in the eFPGA flow as a constraint file and used during synthesis/mapping of the soft hardware by specific eFPGA tools. 0. System Implementation and Test The full-chip has been implemented in a standard CMOS 1. Bingham at CoWare. Lin. Gazzina and L. allows knowing the effective delays at the boundary of the e-FPGA hard macro (all e-FPGA I/O pins are characterized with the static timing analyzer in the worst case condition). C.Van Rompaey.18um technology featuring 6 metal layers. Fumagalli for their valuable help and support. March-April 2000.18µm CMOS 6-ML Main: 48kB (64-bit wide) I$: 8kB (64-bit wide) D$: 8kB (64-bit wide) Buffers: 4x256B (8-bit wide) 5. Tilley.Vercauteren and D. During architecture development we reported speedups of 4x to 8x using instruction extensions to accelerate face- TABLE II TECHNOLOGY AND DEVICE CHARACTERISTICS Technology SRAM Memory Chip size Core size e-FPGA size Customisable I/O Power supply 0. 
Massingham and B.6V (external). Floorplanning (full chip) / P&R Static Timing Analysis. 5 RTL to Layout The timed database used for the verification. Proceedings of the IEEE. 1. The processor system is able to reconfigure the e-FPGA at full speed. 85. This is done to correctly constrain the logic mapped on the e-FPGA with the real timing budget. No.Verkest. IEEE Micro.. D. Kramer for his support and encouragement. Woodward and P. The layout of the system has been integrated using commercial place and route tools for digital ASIC.Bolsens.7-3. 1.Gonzalez.5x5. Lepape. pp.

40.3
A Reconfigurable Signal Processing IC with embedded FPGA and Multi-Port Flash Memory

M. Borgatti, L. Calì, G. De Sandre, B. Forêt, D. Iezzi, F. Lertora, G. Muzzi, M. Pasotti, M. Poles and P.L. Rolandi
STMicroelectronics, Central R&D
Agrate Brianza, Italy
michele.borgatti@st.com

ABSTRACT
A 1GOPS dynamically reconfigurable processing unit with embedded Flash memory and SRAM-based FPGA targets image-voice processing and recognition applications. Code, data and FPGA bitstreams are stored in the embedded Flash memory and are independently accessible through 3 content-specific, 64-bit I/O ports with a peak read rate of 1.2GB/s. The system is implemented in a 0.18um, 2PL-6ML CMOS Flash technology, chip area is 70mm2.

Categories and Subject Descriptors
B.7.1 [INTEGRATED CIRCUITS]: Types and Design Styles – Advanced technologies, Algorithms implemented in hardware, Gate arrays, Input/output circuits, Memory technologies, Microprocessors and microcomputers, VLSI.
B.3.1 [MEMORY STRUCTURES]: Semiconductor Memories.
C.1.3 [PROCESSOR ARCHITECTURES]: Other Architecture Styles – Adaptable architectures, Heterogeneous (hybrid) systems.
C.3 [SPECIAL-PURPOSE AND APPLICATION-BASED SYSTEMS]: Signal processing systems.

General Terms
Design, Performance, Algorithms.

Keywords
Application-specific integrated circuits (ASICs), digital signal processors, field-programmable gate arrays (FPGAs), integrated circuit design, multimedia computing, reconfigurable architectures.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
DAC 2003, June 2-6, 2003, Anaheim, California, USA.
Copyright 2003 ACM 1-58113-688-9/03/0006…$5.00.

1. INTRODUCTION
Increasing complexity of system design and shorter time-to-market requirements are leading research towards the investigation of hybrid systems including processors enhanced by programmable logic [1][2]. In particular, enabling application-specific configurations to adapt the underlying hardware architecture to time-varying application demands can improve execution speed and reduce power consumption compared to a general-purpose programmable solution.

Embedded programmable logic allows ASIC and ASSP vendors to broaden the appeal of their products. System integrators can also exploit hardware programmability for in-house product customization. NRE reduction and shorter time-to-market are key to OEMs looking for faster turnaround and lower risk design solutions and technology. In this paper we present a pragmatic approach to introduce flexibility in system-chip design and exploit embedded programmable silicon fabrics to enhance system performances.

This paper describes a dynamically reconfigurable processing unit tightly connected to a Flash EEPROM memory subsystem. The reconfigurable processing unit targets image-voice processing and recognition application domains and is implemented by joining a configurable and extensible processor core and an SRAM-based embedded FPGA. Application-specific HW units are added and dynamically modified by embedded FPGA reconfiguration. By implementing application-specific vector processing instructions, the unit shows a peak computing power of 1GOPS. Efficient read-write-erase access to code, data and FPGA bitstreams is provided by a specific memory subsystem based on a modular 8Mb, 4-bank Flash memory. It features 3 content-specific I/O ports and delivers an aggregate peak read throughput of 1.2GB/s. The proposed system has been built using a set of state-of-the-art IP cores and system design methodology. Design flows for system exploration and implementation are also described.

2. SYSTEM ARCHITECTURE
The system architecture is illustrated in Figure 1. The functional purposes of the embedded FPGA are: i) extension of the processor datapath supporting a set of additional special-purpose C-callable microprocessor instructions, ii) bus-mapped coprocessors (connected to the system bus through a master/slave

interface), iii) flexible I/O (to connect external units or sensors featuring application-specific communication protocols). Even though such different circuit purposes would require different kinds of programmable logic for best implementation of either arithmetic-dominated or control-dominated logic, we implemented a single programmable logic fabric to be shared among different purposes both in space (same configuration) and time (subsequent configurations). A single, high I/O count, fine-grain e-FPGA operates as a datapath for the microprocessor pipeline and as dedicated control logic for bus coprocessor and I/O control interface. To support streaming applications a 1kB dual-port buffer is used to interface fast decoding hardware and slower software running on the processor. FPGA reconfiguration is concurrent to software execution. A DMA channel handles the bitstream transfer while microprocessor fetches instructions and data from different Flash memory ports: 64-bit wide code port (CP) and data port (DP). A local bus connects a dedicated 32-bit Flash memory port (FP) to the FPGA programming interface.

Figure 1. System Architecture.

The memory sub-system architecture is shown in Figure 2. The modular memory (dotted line) includes charge pumps (Power Block), testability circuits (DFT), a power management arbiter (PMA) and a customizable array of N independent 2Mb flash memory modules, depending on the storage requirements (N=4 in the current implementation). Each 2Mb flash memory module has a 128-bit IO data bus with 40ns access time, resulting in 400Mbyte/s, and a program/erase control unit. The modular memory features (N+2) 128-bit target ports and implements a N-bank uniform memory. A (N+2)x4 128-bit crossbar connects the modular memory with the four initiators (CP, DP, FP and uP) providing that three banks can be read in parallel at full speed.

Figure 2. Flash Memory Architecture.

The memory space of the four modules is arranged in three programmable user-defined partitions, each one devoted to a port. An 8-bit microprocessor (uP) is devoted to handle complex file-system functions (defrag, compression, virtual erase, etc.) not natively supported by DP, and assists for built-in self test. The memory system allows up to four simultaneous operations (with a limit of one both for write and erase). Simultaneous memory operations use the power management arbiter (PMA) for optimal scheduling. Available power and user-defined priorities are considered to schedule conflicting resource requests in a single clock cycle.
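The read-bandwidth figures quoted for the Flash sub-system can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function and constant names are hypothetical, and it assumes 1 MB = 10^6 bytes and one full-width word delivered per access time.

```python
# Back-of-the-envelope check of the Flash read bandwidth (illustrative only).
BUS_WIDTH_BITS = 128   # data bus width of one 2Mb Flash module (from the text)
ACCESS_TIME_NS = 40    # access time of one module (from the text)
PARALLEL_READS = 3     # banks that can be read in parallel (CP, DP, FP)


def module_read_rate_mb_s(bus_width_bits: int, access_time_ns: int) -> float:
    """Peak read rate of one module in MB/s, assuming 1 MB = 10**6 bytes."""
    bytes_per_access = bus_width_bits // 8
    # bytes per nanosecond equals GB/s; multiply by 1000 to get MB/s
    return bytes_per_access * 1000 / access_time_ns


per_module = module_read_rate_mb_s(BUS_WIDTH_BITS, ACCESS_TIME_NS)
aggregate = PARALLEL_READS * per_module

print(f"per module: {per_module:.0f} MB/s, aggregate: {aggregate / 1000:.1f} GB/s")
```

Under these assumptions one module delivers 16 bytes every 40ns, and three parallel reads give the aggregate peak figure quoted in the abstract.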

Figure 3. Memory Hierarchy.

Figure 3 depicts the memory hierarchy and parallelism across the system. CP and DP are interfaced to the 64-bit, 800MB/s AHB system bus. At a system clock rate of 100MHz each I/O port can independently operate at maximum speed. So, an aggregate peak read rate of 1.2GB/s can be sustained as it is limited by memory access time. In the current implementation the e-FPGA reconfiguration takes 500us @ 100 MHz. 50MB/s average throughput out of the available 400MB/s are currently sustained by the e-FPGA configuration interface.

System performance is evaluated for an image processing application (facial recognition) and a speech recognition application. More than 20 specific instructions were designed as C/assembly-callable functions, automatically translated to RTL, then synthesized and mapped to the e-FPGA. Figure 4 shows two examples of specific microprocessor extensions. On the left-hand side, an 8-issue, 8-bit, L2 calculation accounts for 23 8-bit arithmetic operations and 6 64-bit operations requiring about 10k ASIC equivalent gates. On the right-hand side, a datapath for an optimized fixed-point calculation of the square root accounts for 12 32-bit operations for about 2k ASIC equivalent gates.

Figure 4. Added DSP instructions examples.

The overall performance improvements for the face recognition tasks are shown in Table 1. Execution time is compared for 32-bit RISC with basic DSP extensions (MAC, zero-overhead loops, etc) and the same processor enhanced with application-specific instructions. Measured speed-ups range from 1.8x to 10.6x (on the most-demanding task), with an overall improvement of 8.5x. The speed-up factors take into account the possible multi-cycle clock penalty due to processor-FPGA synchronization in case of instruction extensions slower than the processor clock. Notice that switching between algorithm stages requires only one reconfiguration of the e-FPGA. Reconfiguration time is negligible.

Energy efficiency figures are also depicted in Table 1. Measurements show the best energy efficiency in the range of several MOPS/mW at 1.8V supply. It lies between conventional ASIP/DSP and dedicated configurable hardware implementations [2]. Last column of Table 1 reports the energy-delay improvement of each specific HW configuration compared to the general-purpose counterpart. As the average power consumption of the system extended with the eFPGA is slightly higher, the energy reduction for executing each of the tasks on its specific HW configuration (power-delay product improvement) results in an overall reduction of 6.7x. Only one task showed slightly worse total execution energy, though showing benefits on execution speed. Energy required for e-FPGA reconfiguration is always negligible.

Table 1. Benchmarks at 100MHz. [Rows: Bayer Filter, Edge Detect., Face Detect., Face Recog., Totals. Columns: RISC with basic DSP, RISC with uP extens., Speed Up, Energy Effic. Gain, Energy Reduct.]

3. DESIGN FLOW AND SYSTEM INTEGRATION
3.1 The System-to-RTL Design Flow
In Figure 5 the design flow used for system architecture exploration and integration is described. The starting point is an untimed model of the system written in C/C++ code describing the desired functionality. This methodology allows designers to validate the system specifications and consequently, with a progressive refinement of the functional blocks into hardware and software (partitioning process) and the generation of the HW/SW interface (interface synthesis), the verification of the system at a cycle accurate abstraction level. The microprocessor core is abstracted in the co-verification with its Instruction Set Simulator integrated into the simulation engine; at this stage the verification is done with simulations in CoWare N2C environment [3].

Extensive simulations of the system with the usage of the profiler (memory accesses, CPU load, exceptions) help in finding the computational kernels of the software running on the core (performance analysis). At this point it is possible to group segments of codes that are the most time consuming as new instructions of the extensible processor. Those extensions of the Instruction Set can be easily mapped on the e-FPGA together with the VHDL code that results from the refinement process done after the HW/SW partition phase. The system integration flow ends producing:
1. Soft Hardware to be mapped on the e-FPGA: HDL RTL code of instruction extensions, bus-mapped coprocessors and special purpose I/O peripherals.
2. Conventional fixed hardware: Microprocessor RTL code, AHB/APB bus and Peripherals.
3. Embedded Software (C code): Application software and low-level drivers for the hardware platform.
The C code generated by the flow described above is the final application while the RTL of the system with the e-FPGA hard macro goes into the SoC integration flow (RTL to layout).

Figure 5. System to RTL Flow.

3.2 The RTL-to-Layout Design Flow
In Figure 6 both silicon implementation flow and e-FPGA configuration flows are shown. These flows are run at different times. The RTL code of the CPU core, IP blocks and Interface modules (system bus) is synthesized and integrated with RAM blocks, Flash modules and FPGA hard macro in the floorplanning environment. The layout of the system has been integrated using commercial place and route tools for digital ASIC. To meet timing requirements at the boundary of the e-FPGA, special care was taken during the synthesis process for the logic cells that interface the e-FPGA with the rest of the system. A particular set of constraints was specified to reach minimum delay of the hardwired logic. After the place and route stage, the final database is statically and dynamically verified against the RTL simulations in order to make verification at all levels of abstraction. Once the silicon implementation flow has produced the routed database it is possible to implement the e-FPGA flow, which can be repeated for each different function built as a soft macro. The timed database used for the verification, built after a parasitic extraction and a delay calculation process, allows knowing the effective delays at the boundary of the e-FPGA hard macro (all eFPGA I/O pins are characterized with the static timing analyzer in the worst case condition). This information is exported in the eFPGA flow as a constraint file and used during synthesis/mapping of the soft hardware by specific e-FPGA tools. This is done to correctly constrain the logic mapped on the e-FPGA with the real timing budget. Static timing analysis of the e-FPGA results in both a back-annotated netlist and a timing view for full chip static timing analysis. Finally the generation of the bitstream and a timed view of the macro can be used for the final sign-off.

Figure 6. RTL to Layout Flow.

4. SYSTEM IMPLEMENTATION AND TEST
The full-chip is implemented in a 0.18um, 2-poly, 6-metal, CMOS embedded Flash technology. The processor system is able to reconfigure the e-FPGA at full speed. Reconfiguration takes about 500us at a clock rate of 100MHz. The chip is being tested and is fully functional at the clock rate of 125MHz (worst-case conditions). The system is being tested using both a face recognition application and a speech recognition application. As discussed in Section 2 we reported speedups of up to 8x using instruction extensions to accelerate face-recognition computing kernels. Additional 1.5x to 2x performance improvements are reported on specific I/O intensive tasks to interface an external CMOS camera and doing some image processing computations on-the-fly using the e-FPGA. Technology and device characteristics are summarized in Table 2 and a chip micrograph is shown in Figure 7 with a floorplan view of system components.
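One of the instruction extensions mentioned in Section 2 is a fixed-point square-root datapath. The paper does not give its algorithm, so the following is only a sketch of the classic digit-by-digit (shift-and-subtract) integer square root, a method that maps naturally onto shifters, a subtractor and a comparator. The function name `isqrt32` and the 32-bit operand width are assumptions made for illustration, not the authors' implementation.

```python
def isqrt32(n: int) -> int:
    """Digit-by-digit integer square root of an unsigned 32-bit value.

    Illustrative sketch only: each iteration consumes two bits of the
    operand and produces one bit of the root.
    """
    if not 0 <= n < (1 << 32):
        raise ValueError("n must fit in 32 bits")
    root = 0
    remainder = 0
    for _ in range(16):                    # 16 result bits for a 32-bit input
        # Bring down the next two most-significant bits of n.
        remainder = (remainder << 2) | (n >> 30)
        n = (n << 2) & 0xFFFFFFFF
        root <<= 1
        trial = (root << 1) | 1            # 2 * root + 1, with root already shifted
        if remainder >= trial:             # the next root bit is 1
            remainder -= trial
            root |= 1
    return root


for value in (0, 1, 15, 16, 17, 1_000_000, 0xFFFFFFFF):
    print(value, isqrt32(value))
```

Every step uses only shifts, one subtraction and one comparison, which is why this style of algorithm is a common choice when a square root has to be built from a handful of 32-bit operations.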

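The reconfiguration figures reported in the paper (about 500us at a 100MHz clock, with 50MB/s sustained out of an available 400MB/s) imply a rough bitstream size. The sketch below works this out; the function name is hypothetical, the result is an estimate implied by the quoted numbers rather than a value stated in the paper, and 1 MB is taken as 10^6 bytes.

```python
# Rough estimate implied by the quoted reconfiguration figures (illustrative).
RECONFIG_TIME_US = 500      # reconfiguration time in microseconds
SUSTAINED_MB_S = 50         # average throughput sustained during reconfiguration
AVAILABLE_MB_S = 400        # read rate available from the Flash port
CLOCK_MHZ = 100             # system clock during reconfiguration


def bytes_transferred(time_us: int, rate_mb_s: int) -> int:
    """1 MB/s sustained for 1 us moves exactly 1 byte (with 1 MB = 10**6 bytes)."""
    return time_us * rate_mb_s


bitstream_bytes = bytes_transferred(RECONFIG_TIME_US, SUSTAINED_MB_S)
clock_cycles = RECONFIG_TIME_US * CLOCK_MHZ
best_case_us = bitstream_bytes / AVAILABLE_MB_S

print(bitstream_bytes, clock_cycles, best_case_us)
```

Under these assumptions the configuration interface moves about 25kB in 50,000 clock cycles, and the same data would take 62.5us if the full 400MB/s could be sustained.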
Table 2. Technology and chip characteristics.
Process:           0.18 µm CMOS, 2-Poly, 6-Metal
Flash Memory (4x): Tunneling oxide: 10nm; Flash cell size: 0.35 µm2; 256Kbit x 9 Sectors; Word: 128 bit; Program Throughput: 1Mbyte/s; Typ Read Rate: 400 Mbyte/s
SRAM memory:       I$: 8kB (64-bit wide); D$: 8kB (64-bit wide); Buffers: 4x256B (8-bit wide); Main: 48kB (64-bit wide)
Chip size:         8.4x8.4 mm2
e-FPGA size:       8.2 mm2
Customizable I/O:  24 general-purpose inputs; 24 general-purpose outputs (tri-state); 8 general-purpose bidirs
Power supply:      2.7-3.6V (I/O); 1.6-2.0V (core)

Figure 7. Chip Micrograph.

5. ACKNOWLEDGMENTS
The authors thank all the colleagues of NVM-DP Dept., Maurelli, Piazza and L. Fumagalli.

6. REFERENCES
[1] Young-Don Bae et al., "A Single-Chip Programmable Platform Based on A Multithreaded Processor and Configurable Logic Clusters", ISSCC 2002 Digest of Technical Papers, pp 336-337, Feb. 2002.
[2] Zhang et al., "A 1V Heterogeneous Reconfigurable Processor IC for Baseband Wireless Applications", ISSCC 2000 Digest of Technical Papers, pp 68-69, Feb. 2000.
[3] I. Bolsens, H. De Man, B. Lin, K. Van Rompaey, S. Vercauteren and D. Verkest, "Hardware/Software Co-Design of Digital Telecommunication Systems", Proceedings of the IEEE, Vol. 85, No. 3, pp 391-418, March 1997.

Abound claims dense interconnect is key to Raptor FPGA
April 24, 2009

LONDON - Abound Logic Inc. (Santa Clara, Calif.), an FPGA company formerly known as M2000, has started shipping its Raptor FPGAs and claims that improved interconnect is the key to achieving triple the logic density of equivalent FPGAs from established competitors.

As the company was going into its development phase for the Raptor architecture its engineers realized that in FPGAs interconnect takes up more than 80 percent of the resources and also provides lots of room for improvement, according to Frederic Reblewski, CEO and founder of Abound Logic. The company has therefore stayed with a classical SRAM-based look-up table (LUT) based architecture.

Reblewski claimed that Abound has been able to create an interconnect architecture with a larger effective fan-out and in which each wire adds an additional effective route. More effective routes for an EDA tool translates into denser logic functionality. In addition the interconnect is transparent to the user, meaning that while density can be increased it need not have an impact on EDA software and how Raptor FPGAs are designed.

Reblewski claimed the overall effect is that Raptor FPGAs are three times denser than competing architectures, or a process node advantage. "Memory, DSP, I/O are not denser but overall this still translates into 2X," he said.

The result is what Reblewski claims is the highest capacity FPGA in 65-nm process node and one that can compete with devices from rivals made on 45-nm to 40-nm silicon. The Raptor has 750k LUTs, 38-Mbits of memory, 448 DSP ALUs capable of 24 x 24 bit multiplication, up to 1,200 I/Os and up to 32 SerDes lanes. The silicon consumes 2.5 watts of static power and can typically yield twice the performance at less power than competing products.

Abound Logic received first silicon from its foundry, Taiwan Semiconductor Manufacturing Co. Ltd., in August 2008 and packaged devices in October. Reblewski said engineering samples are available now and volume shipments would begin in September. As to price, Reblewski said Raptor is "competitive."

Related links and articles: www.aboundlogic.com


The Raptor Family of FPGAs
Product Brief (Advanced Data)
PB001 (V 0.2) ~ February 2009
Abound Logic, Inc.

Highlights
● Unprecedented level of logic density
  ◆ 750k 4-input LUTs
  ◆ 750k D-type flip-flops
  ◆ 38.5 Mb of embedded RAM
  ◆ Built on 65 nm TSMC CMOS process
● 448 high-performance DSP blocks:
  ◆ 24 × 24 multiplier with a 64-bit multifunction arithmetic logic unit (ALU)
  ◆ Dual-stage pipelines, and special routing for fast cascaded operation
  ◆ Each DSP block can also be configured as two independent 12 × 12 multipliers and 32-bit ALUs
● Built-in high-speed interfaces:
  ◆ 8 or 32 channels of high-speed SerDes running at up to 3.125 Gbps
  ◆ Embedded 4- or 8-lane PCI Express controllers with up to 20 GB/s transfer rate, supporting both Endpoint and Root Complex configurations
  ◆ Embedded support for XAUI, Gigabit Ethernet, Serial RapidIO, Serial ATA, and FibreChannel protocols
● 500 MHz hierarchical global clock tree:
  ◆ 32 independent low-skew clocks available device wide
  ◆ 80 I/O register clocks
  ◆ Abundant local clock resources
  ◆ 40 low-jitter PLLs that can be used independently or cascaded, with programmable frequency, phase, duty cycle, waveform and compensation mode
● High-performance, fully programmable I/O cells capable of 1.25 Gb/s performance in LVDS mode:
  ◆ Support for a wide range of both single-ended and differential I/O standards
  ◆ High-bandwidth memory interface support for DDR, DDR2, RLDRAM and QDR memories
  ◆ Per I/O programmable delay and dynamic phase alignment
  ◆ Programmable on-chip termination with built-in calibration
● Hierarchical interconnect architecture increases logic density by a factor of three and reduces power by up to 60% versus other 65 nm solutions
● Volatile and non-volatile 256-bit AES bitstream encryption for design security
● Integrated, programmable SEU detection and repair for increased reliability
● Programming support for low-cost flash memories via JTAG, and high-speed serial and parallel interfaces
● Industry-standard RTL flow with support from Mentor Graphics Precision synthesis
● Soft 32-bit RISC CPU with wishbone interface:
  ◆ SDK, compiler, and debugger available
  ◆ Support for μC/OS II
● Delivered in a 1935-ball organic FBGA:
  ◆ 45 × 45 mm, 1-mm ball-pitch package
  ◆ RoHS-Clause-5 compliant
  ◆ Enhanced signal integrity (7:1:1 ratios for general-purpose I/O, and 2:1:1 ratios for SerDes I/O)

Table 1: Family Product Table
                                 RAM
Device   LUTs     DFFs     576b Register Files  18Kb Blocks  Total (Kb)  DSP Blocks  PLLs  PCIe Controllers  SerDes  User I/O
Rso750   752,640  752,640  5,880                1,960        38,588      448         40    1 (x4)            8       1,200
Rsx750   752,640  752,640  5,880                1,960        38,588      448         40    2 (x8)            32      1,020

The Raptor Family of FPGAs

Introduction
Based on a dense, hierarchical routing structure, Raptor™ FPGAs deliver an unprecedented level of density, far beyond that of any other FPGA. Providing more than 750,000 LUTs and an equal number of flip-flops to the designer, Raptor FPGAs are an attractive alternative to ASICs in many applications. Built on an advanced 65 nm CMOS process, Raptor FPGAs include advanced features such as dedicated, fast adders, dual-port SRAM blocks, powerful DSP blocks, high-performance SerDes, embedded PCIe controllers, and low-skew, programmable clocks.

The high-density, low-power Raptor FPGAs from Abound Logic are ideal for demanding, high-density applications such as data communications, wireless infrastructure, medical imaging, professional video processing, high-performance computing, ASIC prototyping or even as ASIC/structured ASIC alternatives.

Architectural Overview

Multifunction Cells
At the heart of the Raptor FPGA is the multifunction cell (MFC) composed of a 4-input LUT plus a D-type flip-flop, offering invertible clock polarity, programmable enable and asynchronous/synchronous reset (Figure 1). The Raptor architecture includes three types of MFCs to provide access to other resources: logic, memory and arithmetic. Logic MFCs contain a single LUT and D-type flip-flop, memory MFCs add access to an embedded 32 × 18 register file, and arithmetic MFCs provide access to 4-bit adders.

Figure 1: Raptor Multifunction Cell (logic MFC: LUT inputs I0 to I3, DFF with EN, CLK and RST, output OUT)

Groups
MFCs are organized into three types of groups: logic, memory and arithmetic. Each type of group contains 32 MFCs, sharing a common clock, reset and enable. Each flip-flop in the group can be configured individually as to how these signals are used. The simplest group is the logic group composed solely of 32 logic MFCs, intended for general-purpose logic use. Memory groups are composed of 32 memory MFCs and an embedded 32 × 18 register file. Arithmetic groups consist of 16 arithmetic MFCs tied to four embedded 4-bit adders plus 16 logic MFCs. Each adder can be used independently, or can be cascaded using the dedicated carry chain to form a larger adder of up to 96 bits.

Clusters and Tiles
A cluster is composed of three logic, three memory and six arithmetic groups along with an embedded 18 Kb RAM. 35 clusters are grouped together along with eight DSP blocks to form a tile. The 56-tile fabric, arranged in a 7 × 8 matrix, is surrounded by the I/O ring composed of general-purpose I/O and embedded SerDes (Figure 2).

Figure 2: Raptor FPGA Layout (Rso)

PB001 (V 0.2) ~ February 2009, Abound Logic, Inc.
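The group, cluster and tile counts above can be multiplied out to cross-check the "more than 750,000 LUTs" figure. The following is a minimal sketch; the constant and function names are illustrative and not part of any Abound Logic tool. The only interpretation it adds is marked in a comment: that the 96-bit adder limit corresponds to cascading every adder in one cluster's arithmetic groups, which the text suggests but does not state.

```python
"""Cross-check of the Raptor fabric counts quoted in the text (illustrative)."""

# Figures taken from the architectural overview.
MFCS_PER_GROUP = 32                      # every group type holds 32 MFCs
GROUPS_PER_CLUSTER = {"logic": 3, "memory": 3, "arithmetic": 6}
CLUSTERS_PER_TILE = 35
DSP_BLOCKS_PER_TILE = 8
TILE_ROWS, TILE_COLS = 7, 8              # 7 x 8 matrix = 56 tiles
ADDERS_PER_ARITH_GROUP = 4               # four embedded 4-bit adders per group
ADDER_WIDTH_BITS = 4


def fabric_totals() -> dict:
    """Derive per-device totals from the per-level counts above."""
    tiles = TILE_ROWS * TILE_COLS
    groups = sum(GROUPS_PER_CLUSTER.values())
    mfcs_per_cluster = groups * MFCS_PER_GROUP
    mfcs_per_device = mfcs_per_cluster * CLUSTERS_PER_TILE * tiles
    return {
        "tiles": tiles,
        "clusters": tiles * CLUSTERS_PER_TILE,
        "mfcs_per_cluster": mfcs_per_cluster,
        # Each MFC contributes one 4-input LUT and one D-type flip-flop.
        "luts": mfcs_per_device,
        "flip_flops": mfcs_per_device,
        "dsp_blocks": DSP_BLOCKS_PER_TILE * tiles,
        # Assumption: the carry chain spans the arithmetic groups of one cluster.
        "max_cascaded_adder_bits": (GROUPS_PER_CLUSTER["arithmetic"]
                                    * ADDERS_PER_ARITH_GROUP
                                    * ADDER_WIDTH_BITS),
    }


if __name__ == "__main__":
    for name, value in fabric_totals().items():
        print(f"{name}: {value}")
```

Under these counts the fabric holds 752,640 MFCs, so 752,640 LUTs and as many flip-flops, and the cascaded-adder product comes to 96 bits.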

Embedded Memory
In addition to the abundant flip-flops found in the Raptor architecture, two additional memory resources are available: 576b register files and embedded 18 Kb RAM blocks.

The 32 × 18 register file is a dual-port (asynchronous read, synchronous write) memory, with independent 5-bit address busses for each port. The register file operates from the same clock sources available to the group to which it is connected.

The 18 Kb block RAM can be configured as either 2048 × 9, 1024 × 18 or 512 × 36 memory. The block is a true, synchronous, dual-port memory supporting three modes of operation:
● True dual-port mode: both read/write ports can be configured independently as either 2048 × 9 or 1024 × 18.
● Single-port mode: one port is configured as read-only, and the second port is configured as write-only. Either 2048 × 9 or 1024 × 18 configurations are supported.
● Single-port, bit-enable mode: one read port, and one read/write port with write bit enable. Memory configuration is set to 512 × 36.

DSP Blocks
Each tile includes eight DSP blocks for a total of 448 per device. Each DSP block features dual pipeline stages, one multiplier block and one ALU, configurable as either:
● A 24 × 24 multiplier with a 64-bit multifunction ALU, or
● Two independent 12 × 12 multipliers and 32-bit ALUs

The highly configurable DSP block supports multiple operational modes:
● Two's complement multiplication (signed)
● Multiply and accumulation

The eight DSP blocks in a tile can be cascaded to support high-performance DSP implementations such as high-precision multiplication or butterfly computations. Each of the three cascade outputs of each DSP block drives the cascade inputs of up to three neighboring blocks. At the tile boundary, only one cascade path is provided in each direction.

Routing
The Raptor routing structure consists of four hierarchical levels. At the group, cluster, and tile level, fast local interconnect tied to crossbar switch matrices is used to connect elements within that level. At the device level, a mesh routing structure connects the tiles and I/O ring. This dense routing structure results in small device size, high utilization, high performance, and a 60% reduction in dynamic power compared to competing solutions at the same process node (Figure 3).

Figure 3: Raptor Routing Structure (levels shown: Device, Tile, Cluster, Group, MFC; tiles connected by mesh routing)

Clock Network
Each Raptor FPGA has 32 global nets spanning the fabric, driving to the center of each tile to minimize skew (marked in blue in Figure 2). These low-skew nets can be driven by the outputs of a clock generator, clock-capable I/O pins (either single-ended or differential), or an internal net. At the tile level, eight of the 32 low-skew global nets can be selected and driven to each cluster and DSP block of that tile (marked in green in Figure 2). In turn, each cluster has access to all of the eight selected low-skew nets plus local signals from within that cluster. The cluster then provides these signals to the RAM blocks and MFC groups.
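The memory geometries and DSP widths above are easy to sanity-check numerically. The sketch below is illustrative only: the helper names are hypothetical, and the per-device totals assume the counts given elsewhere in the brief (one 18 Kb RAM per cluster, three memory groups per cluster, 35 clusters per tile, 56 tiles). The "guard bits" helper is a general property of multiply-accumulate datapaths (a full-width signed product needs twice the operand width), not a documented Raptor parameter.

```python
"""Embedded-memory and DSP arithmetic from the figures in the text (illustrative)."""

RAM_BLOCK_BITS = 18 * 1024                        # one "18 Kb" block RAM
RAM_CONFIGS = [(2048, 9), (1024, 18), (512, 36)]  # depth x width options
REGFILE_DEPTH, REGFILE_WIDTH = 32, 18             # 32 x 18 register file

# Assumed from the brief: 35 clusters per tile, 56 tiles,
# one block RAM and three memory groups (one register file each) per cluster.
CLUSTERS = 35 * 56
MEMORY_GROUPS_PER_CLUSTER = 3


def address_bits(depth: int) -> int:
    """Number of address lines needed to select one of `depth` words."""
    return (depth - 1).bit_length()


def accumulator_guard_bits(operand_bits: int, alu_bits: int) -> int:
    """Spare ALU bits above a full-width signed product (2 * operand_bits)."""
    return alu_bits - 2 * operand_bits


def device_memory_kb() -> dict:
    """Total embedded memory in Kb, split by resource type."""
    regfile_bits = REGFILE_DEPTH * REGFILE_WIDTH
    regfiles = CLUSTERS * MEMORY_GROUPS_PER_CLUSTER
    ram_kb = CLUSTERS * RAM_BLOCK_BITS / 1024
    regfile_kb = regfiles * regfile_bits / 1024
    return {"block_ram_kb": ram_kb,
            "regfile_kb": regfile_kb,
            "total_kb": ram_kb + regfile_kb}


if __name__ == "__main__":
    for depth, width in RAM_CONFIGS:
        # Every listed geometry should use exactly the 18 Kb of one block.
        print(depth, width, depth * width == RAM_BLOCK_BITS, address_bits(depth))
    print("register file address bits:", address_bits(REGFILE_DEPTH))
    print("guard bits 24x24/64:", accumulator_guard_bits(24, 64))
    print("guard bits 12x12/32:", accumulator_guard_bits(12, 32))
    print(device_memory_kb())
```

All three block-RAM geometries come to 18,432 bits, a 32-word register file needs the 5 address bits the text mentions, and the two DSP configurations leave 16 and 8 accumulator bits above a full product respectively.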

I/O Ring
Surrounding the core array of tiles is the I/O ring, composed of either 17 or 20 banks (Rsx and Rso, respectively). Each I/O bank consists of 30 I/O pairs and two clock generators. Of the 30 I/O pairs in a bank, four pairs are clock capable and can feed adjacent clock generators, and six pairs have data strobe (DQS) support for DDR applications (the associated DLLs are embedded in the I/O bank). These embedded clock generators (and their PLLs) can drive the four I/O register clocks in each bank as well as the global network.

The I/Os feature support for a wide range of I/O standards (LVCMOS, PCI, HSTL, SSTL, and LVDS), both single-ended and differential and from 1.2V to 3.3V. Included is support for dynamic termination, programmable delay, and dynamic phase alignment. In addition, support for automatic calibration to account for process, temperature, and voltage (PVT) variations is built in.

High-Speed Interfaces
Each Raptor device contains either 8 or 32 embedded multi-gigabit transceivers capable of operating at line rates of up to 3.75 Gbps. Raptor Gigabit transceivers are structured into four-channel groups (QUADs), and support a wide variety of standard and custom protocols such as XAUI, PCI Express, Serial ATA, Serial RapidIO, Gigabit Ethernet, and FibreChannel. To support PCI Express Endpoint and Root Complex configurations, each Raptor device contains one 4-lane (Rso) or two 8-lane (Rsx) PCI Express controllers.

Configuration
The Raptor architecture includes a flexible interface supporting a number of configuration modes: JTAG, slave serial, slave parallel, master parallel, and master SPI. Configuration can be accomplished via a microcontroller or directly from serial and parallel flash memory.

Ordering Information
Example part number: Rso750 F1935-1C LF
Family: Rso, Rsx
Device: 750
Package: F – FBGA
Speed Grade: 1 – Fast, 2 – Medium, 3 – Normal
Operating Conditions: C – Commercial, I – Industrial
Suffix: LF – Lead-free, ES – Engineering Samples

Abound Logic, Inc., 3052 Bunker Hill Lane, Suite 200, Santa Clara, California 95054-1145
Email: inforeq@aboundlogic.com

Copyright © 2009 Abound Logic, Inc. All rights reserved. Abound and Raptor are trademarks of Abound Logic, Inc. All other trademarks are the property of their respective owners. All specifications subject to change without notice.

NOTICE of DISCLAIMER: Abound Logic reserves the right to revise this documentation and to make changes in content from time to time without obligation on the part of Abound Logic to provide notification of such revision or change. Abound Logic provides this documentation without warranty of any kind, either implied or expressed, including but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Abound Logic may make improvements or changes to the product(s) and/or the program(s) described in this document at any time.
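As a worked example of the I/O ring and transceiver figures in the product brief, the per-bank counts can be scaled up to whole-device totals. This is a minimal sketch with hypothetical names; it uses only the bank counts, per-bank pair counts and QUAD size quoted in the I/O Ring and High-Speed Interfaces sections, and it does not assume which family carries 8 or 32 transceivers, since the text does not say.

```python
"""I/O ring and SerDes totals derived from the per-bank figures (illustrative)."""

BANKS_PER_FAMILY = {"Rsx": 17, "Rso": 20}   # I/O banks in the ring
PAIRS_PER_BANK = 30
CLOCK_CAPABLE_PAIRS_PER_BANK = 4
DQS_PAIRS_PER_BANK = 6
CLOCK_GENERATORS_PER_BANK = 2
CHANNELS_PER_QUAD = 4                       # transceivers are grouped in QUADs


def io_ring_totals(family: str) -> dict:
    """Scale the per-bank counts by the number of banks for `family`."""
    banks = BANKS_PER_FAMILY[family]
    return {
        "banks": banks,
        "io_pairs": banks * PAIRS_PER_BANK,
        "clock_capable_pairs": banks * CLOCK_CAPABLE_PAIRS_PER_BANK,
        "dqs_pairs": banks * DQS_PAIRS_PER_BANK,
        "clock_generators": banks * CLOCK_GENERATORS_PER_BANK,
    }


def quad_count(transceivers: int) -> int:
    """Number of four-channel QUADs needed for a given transceiver count."""
    if transceivers % CHANNELS_PER_QUAD:
        raise ValueError("transceiver count must be a multiple of 4")
    return transceivers // CHANNELS_PER_QUAD


if __name__ == "__main__":
    for family in BANKS_PER_FAMILY:
        print(family, io_ring_totals(family))
    for count in (8, 32):
        print(count, "transceivers ->", quad_count(count), "QUADs")
```

With these inputs a 20-bank ring provides 600 I/O pairs and a 17-bank ring 510, and 8 or 32 transceivers correspond to 2 or 8 QUADs.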

SOCC 2005 - Washington, DC

Conference Highlights

Distinguished Speakers:
Keynote: Hans Stork, Senior VP & CTO, Texas Instruments, Inc., “Advanced CMOS Technology for Digital Communication Systems”
Plenary: Ivo Bolsens, Vice President and CTO, Xilinx, “FPGA, The Heart of Embedded Systems”
Plenary: Jacques Benkoski, Entrepreneur in Residence, US Venture Partners, “The Evolving Silicon Infrastructure: Issues and Opportunities”
Luncheon: Rajesh Galivanche, Principal Engineer, Intel, “Test Challenges for Nanometer Designs”

Panel Discussion: Will the technical realities and economics presented by 65nm and 45nm silicon technologies drive system applications toward fully integrated SoC implementations or alternative, disintegrated solutions? For an answer to this question, visit the panel discussion on Tuesday afternoon.

Tutorial Workshops: Like in our previous conferences, there will be several half-day tutorials on Sunday.

Corporate Sponsors: Corporate sponsors of our conference may be present with tabletop displays. For more information on corporate sponsorship, please contact the Conference Office.

Keynote Speaker
Dr. Johannes M.C. (Hans) Stork
Senior Vice President and Chief Technical Officer, Texas Instruments, Inc.
“Advanced CMOS Technology for Digital Communication Systems”

Dr. Stork is Senior Vice President and Chief Technology Officer of Texas Instruments. He joined Texas Instruments in September 2001, as Vice President and Director of Silicon Technology Research. As Director of the Silicon Technology Development organization, his primary responsibilities are the development of advanced CMOS, packaging and mixed signal process technologies.

Prior to joining Texas Instruments, Dr. Stork was Director of the Internet Systems and Storage Lab at HP Laboratories, Hewlett-Packard in Palo Alto, California from 1999 until 2001. The IS&S Lab focused on highly scalable, dynamic, federated computer and storage systems. In 2000-2001, he participated as a technical advisor to Government efforts on high performance computing benchmarks and the national security issues emerging from Internet computing.

After joining Hewlett-Packard in 1994, Dr. Stork held the position of Director of the ULSI Research Lab between 1995 and 1999. This laboratory was established in 1994 and closed in 1999 with the split between Agilent and Hewlett-Packard. During his leadership the researchers of the ULSI Lab developed a high performance 0.18 um CMOS technology with Al/low-k interconnect, and developed and transferred the then world's lowest dark-current CMOS image sensors and technology to Agilent's image component division. The ULSI Research Lab demonstrated the world's smallest FRAM cell feasibility jointly with TI and Applied Materials in 1999. They also published the first extensive simulations of double-gate devices as the best structure to the ultimate scaling challenges of FETs. The operational staff of the 40,000 square foot, Class 1, clean room facility improved the productivity per person hour to the best recorded in HP's facilities.

Dr. Stork started his professional career in 1982 at IBM's T.J. Watson Research Center, researching advanced bipolar technology and circuits. In 1987, Hans became manager of the Bipolar Devices group. This group explored and demonstrated SiGe HBTs, resulting in new speed records at device and circuit level. Within 10 years, this SiGe technology was transferred to manufacturing and established IBM's entry in high-speed communication technologies. In 1990, he established and managed an Exploratory Devices group, supported by an extensive E-beam lithography facility. His teams demonstrated CMOS process technologies at 0.1 um channel lengths with world record speed performance. From 1992 to 1994, he assumed responsibility for the Exploratory Device and Technology programs at IBM Research, and led one of the task forces on high-end computing that resulted in IBM's change in mainframe strategy.

Dr. Stork was born in Soest, The Netherlands, and received the Ingenieur degree in electrical engineering from Delft University of Technology, Delft, The Netherlands, and holds a PhD from Stanford University. His PhD and Ir. theses concerned the fabrication, modeling and measurement of Static Induction Transistors, non-volatile NMOS devices and junction FET CCDs. He has written or co-authored over 90 cited papers and holds eleven US patents, and presented invited talks at all major conferences including six papers at a single IEDM conference. He was elected IEEE Fellow in 1994 for his contributions to SiGe devices and technology. Hans was awarded two Outstanding Technical Achievement Awards from IBM.

As a fellow member of the IEEE Electron Devices Society, Hans has served as a member of the 1988 BCTM program committee, was on the VLSI Technology Symposium program committee from 1986 to 1992, and was publications/publicity chairman for the 1990-1992 Technology and Circuit Symposia, publicity (vice) chairman for the (1991) 1992 IEDM, and technical program committee member of the 1994 IEDM, and was on the technical program committee of the Symposium on Low Power Electronics in 1995 and 1996. Hans was EDS editor of the Circuits and Devices magazine from 1993 to 1995.

Dr. Stork serves on the Board of Directors for International Sematech (ISMT) since 2002, and for the Semiconductor Research Corporation (SRC) since 1999. Prior to that he was a member of the Executive Advisory Boards for both Sematech and the SRC from 1997-1999. He has been a member of the SIA Technology Strategy Committee since 1999.

Plenary Speakers

Dr. Ivo Bolsens
Vice President and Chief Technical Officer, Xilinx
“FPGA, The Heart of Embedded Systems”

Dr. Ivo Bolsens joined Xilinx in June 2001 as vice president and chief technology officer (CTO). He is responsible for identifying Xilinx technologies and talent as well as heading up the Xilinx Research Laboratories, which focus on advanced research in the area of programmable logic. Dr. Bolsens came to Xilinx from the Belgium-based research center IMEC, where he was vice president of information and communication systems. He began there in 1984, holding various positions of increasing responsibility. His research included the development of knowledge-based verification for VLSI circuits, design of digital signal processing applications, and wireless communication terminals. He also headed the research on design technology for high level synthesis of DSP hardware, HW/SW co-design and system-on-chip design. He is author and co-author of more than 100 papers in the field of VLSI design, CAD, embedded system design, and wireless communication. He is also co-author of the book, "High Level Synthesis for Real Time Digital Signal Processing." Dr. Bolsens earned his master's degree in electrical engineering and his Ph.D. in applied science from the Catholic University of Leuven in Belgium.

Dr. Jacques Benkoski
Entrepreneur in Residence, US Venture Partners
“The Evolving Silicon Infrastructure: Issues and Opportunities”

Dr. Benkoski joined US Venture Partners in 2005 as Entrepreneur in Residence following the acquisition of Monterey Designs Systems by Synopsys. He led Monterey as CEO & President since 1999 and during that tenure the company ramped up to 150 employees and had its products adopted by most semiconductor companies in North America, Europe and Japan. Previously he was Vice President of European Operations for EPIC Design Technology and has also held various research, marketing, sales and general management positions at Synopsys, IMEC, STMicroelectronics, and IBM. Dr. Benkoski has been a Director of the EDA Consortium since 2001 and is Chairman of the Board of Certess. He has received his B.Sc. in computer engineering from the Technion, Israel Institute of Technology and his M.Sc. and Ph.D. degrees in electrical and computer engineering from Carnegie Mellon University and has written over 30 technical papers.

Luncheon Speaker
Rajesh Galivanche, Intel Corp.

“Test Challenges for Nanometer Designs”

Rajesh Galivanche is a Principal Engineer and Manager of Advanced Test Technology development team in the Technology and Manufacturing Group at Intel. His group researches into Advanced Test and CAD methods for testing, debug and diagnosis of semiconductor devices. Rajesh has been with Intel for the last 10 years and before that he worked at Motorola, LSI Logic Corporation, and Sunrise Test Systems (which was later acquired by Viewlogic/Synopsys).

Panel Discussion: “Will the technical realities and economics presented by 65nm and 45nm silicon technologies drive system applications toward fully integrated SoC implementations or alternative, dis-integrated solutions?”

Moderator: Tom Bednar, Distinguished Engineer, ASIC Product Development, IBM

Panelists:
Tim Henricks, Vice President, Engineering Services, Cadence
Joachim Kunkel, Vice President, Engineering, Synopsys
Frederic Reblewski, Chairman and CEO, M2000
Peter Rickert, Fellow, Director, ASP program Mgmt, Texas Instruments
Mark Templeton, Chief Strategy Officer, ARM
Arnie Tran, Architecture Lead, SOC Design Center, IBM

Abstract: The scale of advanced silicon technologies brings the possibility of very high levels of integration into consideration. 100M circuits, or more, could theoretically be integrated on a manufacturable die size. Improvements in overall performance, power consumption, and space efficiency could follow from such a silicon integration. However, the raw potential capability does not necessarily make these kinds of solutions technically practical or economically viable. Will technical issues such as yield, process complexity and variation, and data volume prevent such large scale integration from being practical? Will economic factors such as mask costs, IP costs, time to market pressures, and functional flexibility requirements drive a solution? What emerging technologies, tools, or architectures will influence future integration strategies?

November 2003: US 6647362 (1 worldwide citation) 9 Jean Barbier. M June 2007: US 20070139074 (1 worldwide citation) 12 Jean Barbier. Jean Barbier: Clock generation and distribution in an emulation system. Olivier LePape. Mentor Graphics July 1998: US 5777489 (2 worldwide citation) 4 Frederic Reblewski. Mentor Graphics May 1998: US 5754827 (3 worldwide citation) 3 Jean Barbier. Olivier Lepape. Frederic Reblewski. Mentor Graphics August 2005: US 6934674 (1 worldwide citation) 8 Frederic Reblewski. Mentor Graphics September 1998: US 5801955 (1 worldwide citation) 7 Francois Douezy. Mentor Graphics November 1996: US 5574388 (4 worldwide citation) 2 Jean Barbier. September 2004: US 20040178820 (1 worldwide citation) . Olivier LePape. Olivier Lepape: Emulation system scaling. Frederic Reblewski: Reconfigurable integrated circuit with integrated debussing facilities and scalable programmable interconnect. Olivier Lepape: Reconfigurable integrated circuit with integrated debugging facilities for use in an emulation system. M July 2003: US 6594810 (2 worldwide citation) 5 Jean Barbier. Frederic Reblewski: Emulation system having a scalable multi-level multistage programmableinterconnect network. Frederic Reblewski: Method and apparatus tracing any node of an emulation. Mentor Graphics December 1999: US 5999725 (1 worldwide citation) 6 Luc Burgun. Jean Barbier. Frederic Reblewski: Method and apparatus for removing timing hazards in a circuit design. Olivier LePape. Frederic Reblewski: Reconfigurable integrated circuit with integrated debugging facilities and scalable programmable interconnect. Olivier LePape.1 Jean Barbier. Olivier LePape. Frederic Reblewski: Method and apparatus for performing fully visible tracing of an emulation. Olivier LePape. May 2002: US 6388465 (1 worldwide citation) 10 Frederic Reblewski. Frederic Reblewski: Field programmable gate array with integrated debugging facilities. 
July 2001: US 6265894 (1 worldwide citation) 11 Frederic Reblewski: Configurable circuits with microcontrollers. Olivier Lepape: Reconfigurable integrated circuit with a scalable architecture.

Mentor Graphics August 1998: US 5790832 17 Luc Burgun. Mentor Graphics September 2005: US 6947882 20 Frederic Reblewski. Mentor Graphics March 2010: US 20100057426 (1 worldwide citation) 15 Jean Barbier. Mentor Graphics May 1999: US 5907697 16 Jean Barbier. Frederic Reblewski: Reconfigurable integrated circuit with integrated debugging facilities and scalable programmable interconnect. Olivier Lepape: Crossbar device with reduced parasitic capacitive loading and usage of crossbar devices in reconfigurable circuits. Oliver LePape. Olivier Lepape.13 Jean Barbier. Jean Barbier: Regionally time multiplexed emulation system. Mentor Graphics April 2005: US 6876962 22 Jean Barbier. Olivier Lepaps. Olivier LePape. Frederic Reblewski: Method and apparatus for tracing any node of an emulation. April 2004: US 6717433 23 Frederic Reblewski: Method and apparatus for concurrent emulation of multiple circuit designs on an emulation system. Mentor Graphics November 1998: US 5831866 18 Frederic Reblewski: Logic design modeling and interconnection. Frederic Reblewski: Method and apparatus for removing timing hazards in a circuit design. Mentor Graphics April 2010: US 7698118 19 Frederic Reblewski. M March 2005: US 6874136 21 Frederic Reblewski: Method and apparatus for concurrent emulation of multiple circuit designs on an emulation system. Jean Barbier: Crossbar device constructed with mems switches. Frederic Reblewski. October 2002: US 6473726 24 Carl Ebeling. July 2002: US 20020089349 (1 worldwide citation) 14 Frederic Reblewski: Logic Design Modeling and Interconnection. Frederic Reblewski: Emulation system having a scalable multi-level multistage hybridprogrammable interconnect network. Frederic Reblewski: Reconfigurable integrated circuit with integrated debugging facilities and scalable programmable interconnect. M May 2010: US 20100108479 . Olivier V Lepape. Olivier LePape. Olivier Lepape.

Mentor Graphics October 2006: US 7130788 29 Frederic Josso. Frederic Reblewski: Field programmable gate array with integrated debugging facilities. Frederic Reblewski: Distributed configuration of integrated circuits in an emulation system.25 Jean Barbier. M May 2007: US 20070118783 37 . M August 2007: US 20070194807 34 Frederic Reblewski. Mentor Graphics April 2006: US 7035787 27 Frederic Reblewski. Xavier Montagne. Jean Barbier: Regionally time multiplexed emulation system. M July 2007: US 20070168718 35 Frederic Reblewski: On circuit finalization of configuration data in a reconfigurable circuit. Mentor Graphics May 2000: US 6057706 26 Frederic Reblewski: Emulation components and system including distributed routing and configuration of emulation resources. M July 2007: US 20070162247 36 Frederic Reblewski: Runtime reconfiguration of reconfigurable circuits. Mentor Graphics October 2007: US 7286976 31 Frederic Reblewski: Logic design modeling and interconnection. and testing of an IC design under emulation. Mentor Graphics August 2006: US 7098688 28 Frederic Reblewski: Emulation components and system including distributed event monitoring. Olivier LePape. Olivier LePape. Cesar Douady: Packet-oriented communication in reconfigurable circuit(s). October 2005: US 20050234692 32 Frederic Reblewski: Runtime reconfiguration of reconfigurable circuits. Mentor Graphics December 2007: US 7305633 30 Philippe Diehl. M December 2007: US 20070283190 33 Frederic Reblewski. Frederic Reblewski: Emulation of circuits with in-circuit memory. Gilles Laurent. Olivier V Lepape: Reconfigurable system with corruption detection and recovery.

Jean Barbier: Regionally time multiplexed emulation system. Cyril Quennesson. Charles W Selvidge. M May 2007: US 20070103193 38 Frederic Reblewski: Reconfigurable circuit with redundant reconfigurable cluster(S). M March 2007: US 20070057693 39 David C Scott. March 2003: US 20030055622 . July 2003: US 20030131331 48 Frederic Reblewski: Method and apparatus for concurrent emulation of multiple circuit designs on an emulation system.Frederic Reblewski. March 2005: US 20050068949 41 Frederic Reblewski. Frederic Reblewski: Distributed configuration of integrated circuits in an emulation system. Mentor Graphics April 2004: US 20040075469 46 Frederic Reblewski: Emulation components and system including distributed event monitoring. Olivier LePape. Xavier Montagne. Gilles Laurent. Frederic Reblewski: Message-based low latency circuit emulation signal transfer. Mentor Graphics December 2004: US 20040254780 44 Frederic Reblewski: Emulation components and system including distributed routing and configuration of emulation resources. Frederic Reblewski: Software state replay. Frederic Reblewski: Emulation of circuits with in-circuit memory. Gilles Laurent. April 2004: US 20040078187 45 Frederic Reblewski. February 2004: US 20040034841 47 Frederic Reblewski. Philippe Diehl: Data compaction and pin assignment. Gilles Laurent. Joshua D Marantz. Olivier Lepape: Crossbar device with reduced parasitic capacitive loading and usage of crossbar devices in reconfigurable circuits. December 2004: US 20040260530 43 Philippe Diehl. Mentor Graphics April 2006: US 20060074622 40 Philippe Diehl. and testing of an IC design under emulation. Marc Vieillot. December 2004: US 20040267489 42 Frederic Josso. Olivier Lepape: Configurable circuit with configuration data protection features.

M July 2009: US 20090177912 50 Frederic Reblewski.49 Frederic Reblewski: Reconfigurable circuit with redundant reconfigurable cluster(s). September 2003: HK 1052386 . Olivier Lepape: A reconfigurable integrated circuit with integrated debugging facilities for use in an emulation system.
