
JOURNAL OF COMPUTING, VOLUME 4, ISSUE 4, APRIL 2012, ISSN 2151-9617 https://sites.google.com/site/journalofcomputing WWW.JOURNALOFCOMPUTING.ORG

MT-NOC: A New Heterogeneous Topology For Network-on-Chip


Reza Kourdy, Department of Computer Engineering, Islamic Azad University, Khorramabad Branch, Iran
Mohammad Reza Nouri rad, Department of Computer Engineering, Islamic Azad University, Khorramabad Branch, Iran

Abstract: The Network-on-Chip (NoC) interconnect of future multi-processor systems-on-chip (MPSoCs) needs to be efficient in terms of energy and delay. Most NoC architectures are based on a mesh interconnection structure. In this paper, we present a new NoC architecture, which relies on source-synchronous data transfer. We also carry out a high-level simulation of the on-chip network using NS-2 to verify the analysis.

Index Terms: Mesh of Trees (MT), Network-on-Chip (NoC), MPSoC, System-on-Chip (SoC).

1 INTRODUCTION
MPSoCs are now integrating more and more processors on a single die, with the increase in transistor budgets enabled by Moore's law. As the number of processor cores on a single die increases, power consumption and wire delay also increase significantly, which makes on-chip communication among cores a critical issue. A scalable, energy-efficient on-chip interconnect network is needed to address these difficulties and facilitate on-chip communication.

As the minimum feature size of CMOS process technology is scaled down, higher density and lower chip cost become possible. However, technology scaling also widens process variation, which strongly affects SoC circuit characteristics. Network-on-Chip (NoC), which is emerging as a highly efficient network fabric for many-core processors [1], commonly adopts a single synchronous design across the whole chip. The NoC in a many-core processor has many network components, each of which is affected by process variation, and the spread in component delays grows as the number of network components increases. Therefore, the frequency of a large-scale chip-wide synchronous network is degraded to the level of its slowest network component.

The Network-on-Chip (NoC) architecture is a communication network that is used on a chip. The idea of a Network-on-Chip appeared in the 1990s; however, research started only around the year 2000. The most important papers that pioneered this new research field are those of Guerrier and Greiner [2], Hemani et al. [3], Dally and Towles [4], Wingard [5], Rijpkema et al. [6], Kumar et al. [7], and De Micheli and Benini [8].

2 RELATED WORK
The custom on-chip network, which targets a given application, has proved to be more efficient than the regular-structure on-chip network design in [10]. The reason is that the communication requirement of each data flow is available at design time, so the power consumption and packet latency are predictable once the links of the network are determined. Having this knowledge makes custom on-chip networks more efficient with topology synthesis, as shown in [9], [10], [15]-[20]. Among those, [9], [10], and [17] use a partition-based algorithm to reduce the time complexity. Both [9] and [10] use decomposition and clustering methodologies to find the best partition of traces. For each trace partition, [9] uses a K-way merge to construct the on-chip network topology, while [10] uses a Steiner tree engine to build the topology. The solution spaces of both [9] and [10] are limited by the implementation structure that they choose. A K-way partition plus K-way merge forces the maximum hop count to be 2, while using an existing Steiner tree package limits the location of routers to the Steiner points selected by that package. This usually results in a highly congested common backbone, because of the wire-length consideration.

The work in [17] presents a min-cut partition-based algorithm to group processing elements and uses cross-group traffic as the edge cost. The work in [18] uses linear programming techniques to solve the topology synthesis problem. The work in [21] provides a system-level roadmap for on-chip networks that gives the user the most suitable network designs tailored to their performance requirements and power/area constraints. The design space of NoC configurations in [21] includes only several fixed topologies, such as torus, mesh, ring, and fat tree, which might lose the opportunity to improve the performance of the on-chip network.


Hybrid architectures are also used in custom on-chip networks. In [16], the authors observe that the power consumption of a crossbar is mainly determined by its size. By introducing the concept of temporal non-overlapping, communication traces that do not occur at the same time can share the same resource. However, the topology generated by the work in [16] is limited to crossbars and buses only. In [19] and [20], the topologies are constructed on top of a standard mesh structure: [19] adds extra long-range links to reduce the average packet latency, while [20] uses radio-frequency interconnect (RFI) to simplify the topology. The works in [19] and [20] do not consider bus implementation in their algorithms.

Srinivasan et al. [15] proposed a three-phase topology synthesis method consisting of performance-aware floorplanning, core-to-router mapping, and communication trace routing. The target of [15] is to reduce power consumption under packet latency constraints. Instead, our approach simultaneously optimizes power consumption and packet latency. [15] formulated the problem of routing communication traces as a variation of the rectilinear Steiner arborescence problem and modeled it as a linear program. There, the locations of the routers are already determined before the communication trace routing is solved, whereas our algorithm determines the communication trace routing and the router locations simultaneously.

This paper presents an ATree-based topology synthesis approach. The design space of synthesized topologies that we explore is larger than that of previous work. As discussed above, previous approaches such as CosiNoC [9] and the Rectilinear-Steiner-Tree-based algorithm [10] limit the structure of synthesized topologies to a K-way merge or sets of rectilinear Steiner trees, which might lose the opportunity to synthesize a better topology in terms of power and packet latency. In our ATree-based algorithm, we iteratively refine the topology by constructing the sub-topology of the selected router in an ATree [11], [12] fashion, which is more flexible. In this way, we can achieve power-efficient topologies.
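As a rough illustration of the temporal non-overlapping idea attributed to [16] above, the following Python sketch checks whether communication traces are active at the same time and greedily groups traces that never overlap, so that each group could share one resource. The trace names, the time windows, and the greedy grouping rule are illustrative assumptions and are not taken from [16].

```python
from typing import Dict, List, Tuple

# Hypothetical activity window of a trace: (start, end), end exclusive.
Window = Tuple[int, int]


def overlaps(a: Window, b: Window) -> bool:
    """Return True if two half-open time windows share at least one instant."""
    return a[0] < b[1] and b[0] < a[1]


def group_non_overlapping(traces: Dict[str, Window]) -> List[List[str]]:
    """Greedily put each trace into the first group it does not overlap with."""
    groups: List[List[str]] = []
    # Visit traces in order of start time (an assumed, simple heuristic).
    for name in sorted(traces, key=lambda n: traces[n][0]):
        for group in groups:
            if all(not overlaps(traces[name], traces[other]) for other in group):
                group.append(name)
                break
        else:
            # No existing group is free during this window: open a new one.
            groups.append([name])
    return groups


if __name__ == "__main__":
    example = {"cpu->mem": (0, 10), "dsp->mem": (10, 20), "cpu->io": (5, 15)}
    print(group_non_overlapping(example))
```

In this example the first two traces never occur at the same time and end up in one group, while the third overlaps both and needs its own resource.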

3 BACKGROUND
This section reviews key NOC concepts, draws on prior work to identify important Kilo-NOC technologies, and analyzes their scalability bottlenecks. We start with conventional NOC attributes (topology, flow control, and routing), followed by quality-of-service technologies.

3.1 Network-On-Chip Attributes
a) Topology
Network topology determines the connectivity among nodes and is therefore a first-order determinant of network performance and energy efficiency. To avoid the large hop counts associated with the rings and meshes of early NOC designs [22, 23], researchers have turned to richly connected low-diameter networks that leverage the extensive on-chip wire budget. Such topologies reduce the number of costly router traversals at intermediate hops, thereby improving network latency and energy efficiency, and constitute a foundation for a Kilo-NOC. One low-diameter NOC topology is the flattened butterfly (FBfly), which maps a richly connected butterfly network to planar substrates by fully interconnecting nodes in each of the two dimensions via dedicated point-to-point channels [24]. An alternative topology called Multidrop Express Channels (MECS) uses point-to-multipoint channels to also provide full intra-dimension connectivity, but with fewer links [25]. Each node in a MECS network has four output channels, one per cardinal direction. Light-weight drop interfaces allow packets to exit the channel into one of the routers spanned by the link.

b) Flow Control
Flow control governs the flow of packets through the network by allocating channel bandwidth and buffer slots to packets. Conventional interconnects have traditionally employed packet-granularity bandwidth and storage allocation, exemplified by Virtual Cut-Through (VCT) flow control [26]. In contrast, NOCs have relied on flit-level flow control [27], refining the allocation granularity to reduce the per-node storage requirements.
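The storage argument for flit-level flow control can be made concrete with a back-of-the-envelope calculation. The Python sketch below compares the buffer space one input port would need under packet-granularity allocation (room for a whole packet, as in VCT) with flit-level allocation (room for a fixed number of flits). The packet length, flit width, and buffer depth used in the example are assumed values chosen only for illustration.

```python
def vct_buffer_bits(packet_flits: int, flit_bits: int) -> int:
    """Packet-granularity allocation: the buffer must hold one whole packet."""
    return packet_flits * flit_bits


def flit_level_buffer_bits(buffer_depth_flits: int, flit_bits: int) -> int:
    """Flit-level allocation: the buffer holds only a fixed number of flits."""
    return buffer_depth_flits * flit_bits


if __name__ == "__main__":
    # Assumed example: 16-flit packets, 64-bit flits, 4-flit-deep buffers.
    vct = vct_buffer_bits(packet_flits=16, flit_bits=64)
    flit = flit_level_buffer_bits(buffer_depth_flits=4, flit_bits=64)
    print(vct, flit, vct // flit)
```

Under these assumed numbers, packet-granularity allocation needs four times the per-port storage of the flit-level buffer.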

Fig.1. Mesh of Trees topology with 3 levels
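To make the hop-count argument from the topology discussion concrete, the following Python sketch compares the worst-case hop count of a k x k mesh with that of a two-dimensional flattened butterfly, modeled here simply as full connectivity within each row and each column. The function names and this simplified model (one node per router, no concentration) are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Tuple

Node = Tuple[int, int]


def mesh_hops(src: Node, dst: Node) -> int:
    """Mesh: one hop per step along X plus one hop per step along Y."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def fbfly_hops(src: Node, dst: Node) -> int:
    """Flattened butterfly model: at most one hop per dimension."""
    return int(src[0] != dst[0]) + int(src[1] != dst[1])


def max_hops(k: int, hops: Callable[[Node, Node], int]) -> int:
    """Worst-case hop count over all source/destination pairs in a k x k grid."""
    nodes = list(product(range(k), repeat=2))
    return max(hops(s, d) for s in nodes for d in nodes)


if __name__ == "__main__":
    print(max_hops(8, mesh_hops), max_hops(8, fbfly_hops))
```

For an 8 x 8 network the worst case is 14 hops in the mesh and 2 hops in the flattened butterfly model, which is the gap that low-diameter topologies exploit.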



c) Routing
A routing function determines the path of a packet from its source to its destination. Most networks use deterministic routing schemes, whose chief appeal is simplicity. In contrast, adaptive routing can boost the throughput of a given topology at the cost of additional storage and allocation complexity.
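As an example of a deterministic routing function, the Python sketch below implements dimension-order (XY) routing on a two-dimensional mesh: a packet first travels along X until it reaches the destination column, then along Y. The paper does not state which routing function it uses, so XY routing and the coordinate convention here are assumptions chosen only because they are a common deterministic scheme.

```python
from typing import List, Tuple

Node = Tuple[int, int]


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Return the list of nodes visited from src to dst, X dimension first."""
    x, y = src
    path = [(x, y)]
    # Resolve the X offset completely before touching Y.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Then resolve the Y offset.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Because the path depends only on the source and destination, no per-packet state or congestion information is needed, which is the simplicity the text refers to.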

4 NOC-MESH OF TREES

We classify the NOC-mesh of trees links into two categories:

4.1 Mesh of trees by Horizontal Links
4.2 Mesh of trees by Vertical Links

Fig.2. Mesh of Trees by Horizontal-Links

5. SIMULATION FRAMEWORK
In this paper, we have modeled our MPLS-NoC architecture concepts with the widely used network simulator ns-2 [26]. ns-2 has been widely applied in research related to the design and evaluation of computer networks, and to evaluate various design options for NoC architectures [27], including the design of routers, communication protocols, etc.

7. SIMULATION EXPERIMENTS
All of the topology parameters can be described as a script file in Tcl. A part of the ns-2 script file for constructing the topology is shown below.

    # Excerpt: $ns (the simulator instance) and $n (number of tree
    # levels, 1..4) are assumed to be set earlier in the script.
    set max [expr {1 << ($n - 1)}]   ;# 1, 2, 4, 8 for n = 1..4
    set colors {blue red brown green}

    # Create switches: level k holds max rows of 2^(k-1) switches each
    for {set k 1} {$k <= $n} {incr k} {
        set x [expr {1 << ($k - 1)}]
        for {set i 1} {$i <= $max} {incr i} {
            for {set j 1} {$j <= $x} {incr j} {
                set id [expr {$k*100 + $i*10 + $j}]
                set sw($id) [$ns node]
                $sw($id) color [lindex $colors [expr {$k - 1}]]
                $sw($id) label sw$id
            }
        }
    }

    # Create links (switches-switches): each switch to its two children
    for {set k 1} {$k < $n} {incr k} {
        set x [expr {1 << ($k - 1)}]
        for {set i 1} {$i <= $max} {incr i} {
            for {set j 1} {$j <= $x} {incr j} {
                $ns duplex-link $sw([expr {$k*100 + $i*10 + $j}]) $sw([expr {($k+1)*100 + $i*10 + (2*$j - 1)}]) 1Mb 10ms DropTail
                $ns duplex-link $sw([expr {$k*100 + $i*10 + $j}]) $sw([expr {($k+1)*100 + $i*10 + 2*$j}]) 1Mb 10ms DropTail
            }
        }
    }

    # After the loop above k equals n, so the links below join the leaf level.
    # Horizontal links between neighbours in the same row
    for {set i 1} {$i <= $max} {incr i} {
        for {set j 1} {$j < $max} {incr j} {
            $ns duplex-link $sw([expr {$k*100 + $i*10 + $j}]) $sw([expr {$k*100 + $i*10 + $j + 1}]) 1Mb 10ms DropTail
        }
    }
    # Vertical links between neighbours in the same column
    for {set i 1} {$i < $max} {incr i} {
        for {set j 1} {$j <= $max} {incr j} {
            $ns duplex-link $sw([expr {$k*100 + $i*10 + $j}]) $sw([expr {$k*100 + ($i+1)*10 + $j}]) 1Mb 10ms DropTail
        }
    }
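The switch-numbering scheme of the ns-2 script can be mirrored in a few lines of plain Python to check how many switches and links a given number of levels produces, without running the simulator. This is only a sketch of our reading of the script: it assumes that the leaf-level mesh loops use k equal to n (the value k holds after the tree-link loop), and the function names are hypothetical.

```python
from typing import List, Set, Tuple


def switch_id(k: int, i: int, j: int) -> int:
    """Same index encoding as the script: level k, row i, position j."""
    return k * 100 + i * 10 + j


def build_topology(n: int) -> Tuple[Set[int], List[Tuple[int, int]]]:
    """Return (switches, links) for n tree levels; assumes 1 <= n <= 4."""
    width = 1 << (n - 1)  # number of rows and leaves per row ('max' in the script)
    switches: Set[int] = set()
    links: List[Tuple[int, int]] = []

    # Switches: level k has 'width' rows of 2^(k-1) switches each.
    for k in range(1, n + 1):
        for i in range(1, width + 1):
            for j in range(1, (1 << (k - 1)) + 1):
                switches.add(switch_id(k, i, j))

    # Tree links: every switch above the leaf level connects to two children.
    for k in range(1, n):
        for i in range(1, width + 1):
            for j in range(1, (1 << (k - 1)) + 1):
                links.append((switch_id(k, i, j), switch_id(k + 1, i, 2 * j - 1)))
                links.append((switch_id(k, i, j), switch_id(k + 1, i, 2 * j)))

    # Leaf-level mesh (assumption: level n, as k == n after the Tcl loop).
    for i in range(1, width + 1):
        for j in range(1, width):
            links.append((switch_id(n, i, j), switch_id(n, i, j + 1)))  # horizontal
    for i in range(1, width):
        for j in range(1, width + 1):
            links.append((switch_id(n, i, j), switch_id(n, i + 1, j)))  # vertical
    return switches, links


if __name__ == "__main__":
    sw, ln = build_topology(3)
    print(len(sw), len(ln))
```

Under these assumptions, n = 3 gives 28 switches and 48 links: 24 tree links plus 12 horizontal and 12 vertical links at the leaf level.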

8. SIMULATION RESULTS
Figures 6 and 7 show different views of a Pyramid NOC.

8.1 NOC-mesh of trees by Horizontal Links

Fig.3. Mesh of Trees by Horizontal-Links

Figures 4 and 5 show different views of a NOC-mesh of trees by horizontal links:

a) A (4 x 4) NOC-mesh of trees
b) An (8 x 8) NOC-mesh of trees

Fig.4. (4*4) Mesh of Trees by Horizontal-Links

Fig.5. (8*8) Mesh of Trees by Horizontal-Links

8.2 NOC-mesh of trees by Vertical Links

Figures 6 and 7 show different views of a mesh of trees NOC by vertical links:

a) A (4 x 4) NOC-mesh of trees
b) An (8 x 8) NOC-mesh of trees

Fig.6. (4*4) Mesh of Trees by Horizontal-Links

Fig.7. (8*8) Mesh of Trees by Horizontal-Links


REFERENCES
[1] L. Benini and G. De Micheli, "Networks on chips: a new SoC paradigm," IEEE Computer, vol. 35, no. 1, pp. 70-78, Jan. 2002.
[2] P. Guerrier and A. Greiner, "A generic architecture for on-chip packet-switched interconnections," Proceedings of the Conference on Design, Automation and Test in Europe, pp. 250-256, 2000.
[3] A. Hemani et al., "Network on chip: An architecture for billion transistor era," in Proceedings of the IEEE NorChip Conference, 2000, pp. 166-173.
[4] W. J. Dally and B. Towles, "Route packets, not wires: on-chip interconnection networks," in Proceedings of the 38th Annual Design Automation Conference, Las Vegas, Nevada, United States, 2001, pp. 684-689.
[5] D. Wingard, "Micronetwork-based integration for SOCs," Proceedings of the 38th Annual Design Automation Conference, pp. 673-677, 2001.
[6] E. Rijpkema, K. Goossens, and P. Wielage, "A Router Architecture for Networks on Silicon," in Proceedings of Progress 2001, 2nd Workshop on Embedded Systems, pp. 181-188, 2001.
[7] S. Kumar et al., "A network on chip architecture and design methodology," in ISVLSI, 2002, p. 117.
[8] G. de Micheli and L. Benini, "Networks on Chip: A New Paradigm for Systems on Chip Design," Proceedings of the Conference on Design, Automation and Test in Europe, p. 418, 2002.
[9] A. Pinto, L. P. Carloni, and A. L. Sangiovanni-Vincentelli, "Efficient Synthesis of Networks on Chip," in ICCAD, pp. 146-150, 2003.
[10] S. Yan and B. Lin, "Custom Networks-on-Chip Architectures With Multicast Routing," IEEE Trans. VLSI, pp. 342-355, 2009.
[11] Jason Cong, Andrew B. Kahng, and Kwok-Shing Leung, "Efficient Algorithms for the Minimum Shortest Path Steiner Arborescence Problem with Applications to VLSI Physical Design," IEEE Trans. CAD, pp. 24-39, 1999.
[12] J. Cong and K. Leung, "On the Construction of Optimal or Near Optimal Steiner Arborescence," Department of Computer Science, University of California, Los Angeles, Tech. Rep. CSD-960033, 1996.
[13] A. Kahng, B. Li, L. Peh, and K. Samadi, "ORION 2.0: A Fast and Accurate NoC Power and Area Model for Early-Stage Design Space Exploration," Proc. DATE, pp. 550-557, 2009.
[14] ORION: http://www.princeton.edu/~peh/orion.html
[15] K. Srinivasan, K. S. Chatha, and G. Konjevod, "An Automated Technique for Topology and Route Generation of Application Specific On-Chip Interconnection Networks," in ICCAD, pp. 231-237, 2005.
[16] S. Murali, L. Benini, and G. De Micheli, "An Application-Specific Design Methodology for On-Chip Crossbar," IEEE Trans. CAD, vol. 26, no. 7, pp. 1283-1296, 2007.
[17] S. Murali, P. Meloni, F. Angiolini, D. Atienza, S. Carta, L. Benini, G. De Micheli, and L. Raffo, "Designing Application-Specific Networks on Chips with Floorplan Information," in ICCAD, pp. 355-362, 2006.
[18] K. Srinivasan, K. S. Chatha, and G. Konjevod, "Linear Programming based Techniques for Synthesis of Network-on-Chip Architectures," IEEE Trans. VLSI, pp. 407-420, 2006.
[19] U. Y. Ogras and R. Marculescu, "Application-Specific Network-on-Chip Architecture Customization via Long-Range Link Insertion," in ICCAD, pp. 246-253, 2005.
[20] M. F. Chang, J. Cong, A. Kaplan, M. Naik, G. Reinman, E. Socher, and S. Tam, "Power Reduction of CMP Communication Network via RF-Interconnects," IEEE/ACM MICRO, pp. 376-387, 2008.
[21] Vassos Soteriou, Noel Eisley, Hangsheng Wang, Bin Li, and Li-Shiuan Peh, "Polaris: A System-Level Roadmap for On-Chip Interconnection Networks," in Proceedings of the 24th International Conference on Computer Design (ICCD), San Jose, October 2006.
[22] D. Pham et al., "Overview of the Architecture, Circuit Design, and Physical Implementation of a First-Generation Cell Processor," IEEE Journal of Solid-State Circuits, 41(1):179-196, January 2006.
[23] E. Waingold, M. Taylor, D. Srikrishna, V. Sarkar, W. Lee, V. Lee, J. Kim, M. Frank, P. Finch, R. Barua, J. Babb, S. Amarasinghe, and A. Agarwal, "Baring It All to Software: RAW Machines," IEEE Computer, 30(9):86-93, September 1997.
[24] J. Kim, J. Balfour, and W. Dally, "Flattened Butterfly Topology for On-chip Networks," in International Symposium on Microarchitecture, pages 172-182, December 2007.
[25] B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu, "Express Cube Topologies for on-Chip Interconnects," in International Symposium on High-Performance Computer Architecture, pages 163-174, February 2009.
[26] P. Kermani and L. Kleinrock, "Virtual Cut-through: a New Computer Communication Switching Technique," Computer Networks, 3:267-286, September 1979.
[27] W. J. Dally, "Virtual-channel Flow Control," in International Symposium on Computer Architecture, pages 60-68, June 1990.
