II. RELATED WORKS

The idea of using TLM to describe communication in network structures has been investigated in [15], [16] and [17], where TLM is employed to model the communication between the adjacent switches on a packet's path. These approaches can already improve simulation performance significantly in comparison to RTL. A detailed description of transaction level modeling methodologies with SystemC is given in [18] and [19]. Communication mechanisms are modeled as channels and presented to modules using SystemC interface classes. Transaction requests take place by calling interface functions of these channel models, which encapsulate low-level details of the information exchange [18]. The authors of [20] introduced and applied Parallel Discrete Event Simulation techniques to build an SoC simulation environment at the Transaction Level Model with Time (TLM/T) level, using a NoC as its interconnect architecture. In [21], a novel modeling technique called Result Oriented Modeling, which delivers the same speed as TLM with fully accurate timing, is proposed. The authors described the AMBA AHB bus architecture with this technique and showed its efficient simulation. Transaction level modeling methodologies have already been widely used to describe electronic systems based on NoC. In [22], the authors enumerated the NoC characteristics to be modeled and presented the properties of SystemC models of a NoC system at several abstraction levels. In [23], a scalable and parametric NoC is implemented in SystemC. The NoC is composed of resources and routers connected by channels, with a mesh topology and wormhole routing. In [24] and [25], the authors modeled a MIMO-OFDM receiver at the transaction level on a NoC platform in SystemC, and then analyzed and compared the performance and complexity of the model for various configurations in terms of throughput.

Since the on-chip interconnect topology is key to determining the main performance metrics of the NoC, the objective of this paper is to build libraries of component models that allow constructing several types of router, which can later be used to build several network-on-chip topologies for performance evaluation purposes. This makes it possible to choose the best topology for a given traffic pattern or target application. To demonstrate our approach, several router types based on these component models are proposed and used to build mesh and STAR-RING networks on chip at transaction level. The next section gives a short introduction to TLM modeling and the mesh topology, followed by a detailed description of the STAR-RING topology and the routing algorithm it uses. We then present the proposed TLM model of the router components. Finally, the traffic generator and traffic analyzer are presented. Section 4 presents a comparative study of the two network topologies in terms of performance, using two traffic models with the same network sizes. Finally, we conclude in section 5 with remarks.

III. THE CONSIDERED PROPOSED TLM NETWORK ON CHIP

The connection is made through a bidirectional macro-socket as shown in figure 1. A socket is a high level mechanism introduced in the TLM 2.0 standard which combines a port with an export to provide designers with the facility of sending packets in both directions. Sockets belong to the protocol layer of the TLM interoperability layers. The initiator and target sockets group the TLM 2.0 interfaces, including the blocking and non-blocking transport interfaces (transport layer) for both the forward and backward paths, together into a single object.

Figure 1. TLM interconnection modules

The generic payload, which belongs to the application layer, is introduced to improve the interoperability of memory-mapped bus models and facilitate the reuse of TLM 2.0 IPs. It also provides an extension mechanism: extensions can be added to the generic payload object when the generic payload attributes are not adequate to model the full functionality of the architecture (NoC). Moreover, it is capable of creating detailed models of specific bus protocols, while reducing the implementation cost and increasing the simulation speed of the whole design. The main features of the generic payload are illustrated in figure 2.

In this work we use the mesh 2D topology with a deterministic routing function called XY routing. In this kind of routing, the transaction is routed in the X direction until the X coordinate of the current address equals that of the destination address. Afterwards, the transaction is routed in the Y direction until it reaches the destination router. In the X direction, the transaction is routed to the east port or to the west port. In the Y direction, the transaction can be routed to the north port or to the south port. Finally, the local port can route transactions to the north, south, east, and west ports. More information about the TLM modeling of the mesh 2D can be found in [30].

Figure 2. TLM Generic_Payload

In this section we propose a new deterministic routing algorithm that achieves a fixed diameter of two hops regardless of the network size and the STAR valence. The STAR RING topology is characterized by a diameter of two links independently of the total number of routers (N), as shown in figure 3.
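The XY routing rule described above for the mesh 2D topology can be illustrated with a minimal Python sketch. This is not the SystemC model of the paper: the names `Port`, `xy_route` and `xy_path` are hypothetical, and the sketch assumes that x grows toward the east and y grows toward the north.

```python
from enum import Enum


class Port(Enum):
    LOCAL = "L"
    EAST = "E"
    WEST = "W"
    NORTH = "N"
    SOUTH = "S"


def xy_route(current, dest):
    """Return the output port for one XY routing step.

    `current` and `dest` are (x, y) router coordinates.
    Assumption: x grows toward the east, y grows toward the north.
    """
    cx, cy = current
    dx, dy = dest
    # First resolve the X dimension.
    if dx > cx:
        return Port.EAST
    if dx < cx:
        return Port.WEST
    # X is aligned: resolve the Y dimension.
    if dy > cy:
        return Port.NORTH
    if dy < cy:
        return Port.SOUTH
    # Both coordinates match: deliver to the local port.
    return Port.LOCAL


def xy_path(src, dest):
    """List the routers visited from src to dest, both included."""
    step = {Port.EAST: (1, 0), Port.WEST: (-1, 0),
            Port.NORTH: (0, 1), Port.SOUTH: (0, -1)}
    path = [src]
    while path[-1] != dest:
        mx, my = step[xy_route(path[-1], dest)]
        x, y = path[-1]
        path.append((x + mx, y + my))
    return path
```

For example, `xy_path((0, 0), (2, 1))` first moves along X and only then along Y, which is what makes the routing deterministic.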
Figure 3. TLM interconnection modules

The proposed STAR-RING topology is built with two types of routers. The first one, named the peripheral router, is used to build the ring topology. It has four I/O ports numbered from 0 to 3, as presented in table I, and the peripheral routers are positioned in a ring around the central node.

TABLE I. I/O PORTS ATTRIBUTE

Reference   Port                     Symbol
0           Local Port               L
1           CLOCKWISE Port           CW
2           COUNTER CLOCKWISE Port   CCW
3           ACROSS Port              AC

The second router is the star router, which has a high radix (valence). The number of its I/O ports is equal to the number of routers placed in the ring (the valence V) plus one. It has V+1 I/O ports numbered from 0 to V, with the local port having the value V. Figure 7 presents a star router with 9 input ports: one local port and 8 ports connecting the star router to the 8 peripheral routers.

As shown in figure 4, each peripheral router uses 4 ports. The first one is the local port, where the IP core is connected. The other ports, called respectively the CW port, the CCW port, and the AC port, connect the peripheral router to its neighboring routers. The AC port establishes the connection between the peripheral router and the STAR router.

Figure 4. TLM interconnection modules

When a peripheral router initiates a transaction through its local port to another router, the routing function of the initiator router computes the difference between the initiator and target addresses. If the difference is greater than 2 or smaller than -2, the routing function sends a request to the across port and the transaction is routed to the STAR router, which later routes it to the destination peripheral router. When the difference is equal to 1 or 2, the transaction is routed to the CW port. Finally, if the difference is equal to -1 or -2, the transaction is routed to the CCW port. Figure 5 presents the pseudo code used to implement the routing function for the local port.

All the peripheral routers can send transactions to the STAR router by using its unique address. The routing function used by the STAR router is not the same. When the local port of the STAR router wants to send a transaction to a peripheral router, a connection is made with the across port of the destination router through the STAR router.

Figure 5. TLM interconnection modules

The routing function of port K of the STAR router can route packets to every router except routers K-2, K-1, K+1, and K+2. The central router is connected to all routers in the ring through its peripheral ports. Figure 6 presents the pseudo code used to implement the routing subroutine for the STAR router, which is different from the one used by the local port.
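The peripheral routing rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pseudo code of figure 5: the difference is taken as target minus current address, the CW direction is assumed to correspond to increasing addresses, a difference of 0 is assumed to mean delivery on the local port, and intermediate ring routers are assumed to apply the same rule. The function names are hypothetical; the port numbers come from table I.

```python
LOCAL, CW, CCW, AC = 0, 1, 2, 3  # port references from table I


def peripheral_route(current, target):
    """Output port chosen by the peripheral router at address `current`."""
    diff = target - current  # assumed sign convention
    if diff == 0:
        return LOCAL  # assumption: deliver to the attached IP core
    if diff in (1, 2):
        return CW
    if diff in (-1, -2):
        return CCW
    return AC  # difference > 2 or < -2: go through the STAR router


def hop_count(src, dst):
    """Number of links crossed between two peripheral routers."""
    hops, current = 0, src
    while current != dst:
        port = peripheral_route(current, dst)
        if port == AC:
            # One link up to the STAR router, one link down to the target.
            return hops + 2
        current += 1 if port == CW else -1
        hops += 1
    return hops
```

Under these assumptions, enumerating all source and destination pairs for a valence of 8 gives a maximum of two links, which matches the two-hop diameter stated for the topology.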
Figure 6. TLM interconnection modules

In order to model the TLM mesh 2D or STAR RING topologies, we present in figure 9 our extension, which includes several attributes needed by the routing algorithm used. Moreover, this extension carries the priority of the transaction, which can be used to improve QoS. The "Adress_Port_int" attribute corresponds to the source address of the initiator module issuing the transaction (Master IP), while "Adress_Port_dest" refers to the destination address of the target module (Slave IP) accepting the transaction. These parameters are the essential inputs of the routing functions of the communication network. To ensure the scalability of the 2D mesh or STAR RING, a template parameter specifies the number of rows and columns, or the valence, needed for the automatic instantiation and mapping of the network object.

Figure 8. TLM interconnection modules

The traffic generator uses an initiator socket connected to the target socket of the input port of the router. The traffic analyzer has a target socket connected to the initiator socket of the output port of the router. Note that the processing type of each router depends on its position with regard to the boundaries of the network, as shown in Figure 10. For example, the central switch has all five ports, whereas each corner switch has only three ports.
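As a rough illustration of the extension attributes and of how the set of ports depends on a router's position in the mesh, consider the following Python sketch. It is only a stand-in for the SystemC extension of figure 9: the class name `RoutingExtension`, the field name `priority`, the function `mesh_ports` and the convention that row 0 is the north edge are all assumptions, while `Adress_Port_int` and `Adress_Port_dest` are the attribute names used in the text.

```python
from dataclasses import dataclass


@dataclass
class RoutingExtension:
    """Hypothetical stand-in for the payload extension of figure 9."""
    Adress_Port_int: int   # source address (Master IP)
    Adress_Port_dest: int  # destination address (Slave IP)
    priority: int = 0      # transaction priority, usable for QoS


def mesh_ports(row, col, rows, cols):
    """Ports instantiated for the router at (row, col) of a rows x cols mesh.

    Assumption: row 0 is the north edge and column 0 is the west edge.
    """
    ports = ["L"]  # every router keeps its local port
    if row > 0:
        ports.append("N")
    if row < rows - 1:
        ports.append("S")
    if col > 0:
        ports.append("W")
    if col < cols - 1:
        ports.append("E")
    return ports
```

With a 3 x 3 mesh, the central position yields five ports, an edge position four, and a corner three, so the network can be instantiated automatically from the row and column counts alone.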
upstream of each target socket in order to manage competing requests. The routing and arbitration functions comply with the principle of internal connectivity through the mapping between the initiator sockets on the routing side and the target sockets on the arbitration side.

Arbitration function: it manages the competing requests from initiators that seek to reach the same target module. The transport method can indeed be called simultaneously by multiple processes. This implies that a hardware implementation of the routing node would need infinite computing resources and memory in order to simultaneously process an arbitrary number of transactions. To remedy this problem, a mutual exclusion (mutex) can be used in an arbitration module to manage the flow of data. Queues can be used for the purpose of synchronization. This yields a model with contention, which describes a realistic operation ready for synthesis.

Figure 11. Network components

Once the index is returned, the transaction is forwarded through the blocking transport interface b_transport.

IV. PERFORMANCE EVALUATION
Accordingly, throughput is measured in flits/cycle/IP. Throughput signifies the maximum value of the accepted traffic and it is related to the peak data rate sustainable by the system [27].

The workload or traffic model is basically defined by three parameters: distribution of destinations, injection rate, and message length. The traffic pattern indicates the destination for the next message at each node. We use the most frequently used traffic pattern, the uniform distribution [28]. In this distribution, the probability of node i sending a message to node j is the same for all i and j, i ≠ j [29]. The case of nodes sending packets to themselves is excluded because we are interested in the packet transfers that use the network [28].

Figure 12. Average transaction latency under Uniform traffic (y-axis: Latency (flit/cycle/IP); x-axis: Injection rate (flits/cycle); curves: STAR-RING, mesh2D)

Figures 12 and 13 depict the average packet latency and the packet throughput versus normalized accepted traffic for the 3 x 3 mesh 2D network using XY dimension-order routing, and for the STAR RING topology with a valence equal to 8 using the proposed routing algorithm, with a uniform distribution of message destinations.

Figure 13. Throughput under Uniform traffic (y-axis: throughput (flit/cycle/IP); x-axis: Injection rate (flits/cycle); curves: STAR-RING, mesh2D)

Figure 12 clearly indicates that the 3 x 3 mesh network saturates at 90% and the STAR RING network saturates at 80%. The STAR RING provides a better result in terms of latency than the mesh network. Figure 13 shows the variation of throughput under different injection rates. The mesh throughput is better than that of the STAR RING network starting from an injection rate of 80% because it has the lowest sensitivity to packet drops. For injection rates greater than 80%, the latency and throughput of the mesh network are better than those of the STAR RING network.

The second traffic pattern used is hot spot traffic. In this type of traffic, each traffic generator sends a set of packets to the other traffic analyzers with an equal probability, except for a specific node (called the hotspot), which receives packets with a greater probability. The percentage of additional packets that a hotspot node receives compared to other nodes is indicated after the hotspot name (Hotspot 100%).

Figure 14. Average transaction latency under hotspot traffic (y-axis: Latency (flit/cycle/IP); x-axis: Injection rate (flits/cycle); curves: STAR-RING, mesh2D)

Figures 14 and 15 show the average packet latency and the packet throughput versus normalized accepted traffic for the 3 x 3 mesh 2D network using XY dimension-order routing, and for the STAR RING topology with a valence equal to 8 using the proposed routing algorithm, with a hot spot distribution of message destinations. The hotspot node is the STAR router for the STAR RING network and router (11) for the 3 x 3 mesh 2D network.

Figure 15. Throughput under hotspot traffic (y-axis: throughput (flit/cycle/IP); x-axis: Injection rate (flits/cycle); curves: STAR-RING, mesh2D)

The mesh network outperforms the STAR RING network in terms of latency and throughput over the range of injection rates considered.
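The two destination distributions used in this evaluation can be sketched as follows. This is a hedged illustration rather than the traffic generator of the paper: the function name `destination_probabilities` and its parameters are hypothetical, and "Hotspot 100%" is interpreted as the hotspot receiving 100% more packets than any other node, that is, a relative weight of 2.

```python
def destination_probabilities(src, nodes, hotspot=None, extra=0.0):
    """Probability of each destination for the generator attached to `src`.

    `extra` is the fraction of additional packets the hotspot receives
    compared to any other node (assumed 1.0 for "Hotspot 100%").
    With hotspot=None the distribution is uniform.
    """
    weights = {}
    for node in nodes:
        if node == src:
            continue  # nodes do not send packets to themselves
        weights[node] = 1.0 + extra if node == hotspot else 1.0
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}
```

A generator could then draw each destination with the standard library call `random.choices(list(p), weights=list(p.values()))`, where `p` is the dictionary returned above.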
V. CONCLUSION

A library of generic TLM models of router components has been constructed. Two different topologies were selected and analyzed using specific routing and switching methods with SystemC TLM 2.0. A comparison study exploring different network-on-chip topologies has been conducted at the transaction level. Simulations have been carried out to compare performance metrics such as latency, throughput and network load. The mesh 2D topology shows a slight increase in throughput compared to the STAR RING topology under the uniform traffic pattern, but has a higher latency under the same pattern. The mesh 2D network outperforms the STAR RING topology over most of the injection rate range under the hot spot traffic pattern because of its regular and stable behavior. The STAR RING topology generally performs badly in most of the considered parameters.

Other architectures will be studied and simulated for different parameters using our library components.

REFERENCES

[1] L. Benini and G. De Micheli. Networks on chip: a new SoC paradigm. IEEE Computer, January 2002.
[2] S. Kolson, A. Jantsch et al. A NoC architecture and design methodology. Proc. Annual Symp. VLSI, 2002.
[3] W. Dally and B. Towles. Route packets not wires: On chip interconnection networks. Design Automation Conference, pp. 684-689, 2001.
[4] F. Moraes, N. Calazans, A. Mello, M. Leandro, and L. Ost. Hermes: An infrastructure for low area overhead packet-switching networks on chip. Integr. VLSI J., Vol. 38, Issue 1, pp. 69-93, 2004.
[5] S. Kumar et al. A network on chip architecture and design methodology. Proc. Symposium on VLSI, pp. 117-124, April 2002.
[6] J.-Y. Nollet, V. Marescaux, T. Verkest, D. Vernalde, S. Lauwereins, R. Bartic and T.A. Mignolet. Highly scalable network on chip for reconfigurable systems. IEEE International Symposium on System-on-Chip, pp. 79-82, November 2003.
[7] D. Bertozzi, A. Jalabert, S. Murali, R. Tamhankar, S. Stergiou, L. Benini, and G. De Micheli. NoC synthesis flow for customized domain specific multiprocessor systems-on-chip. IEEE Transactions on Parallel and Distributed Systems, Vol. 16, pp. 113-129, February 2005.
[8] L. Cai and D. Gajski. Transaction Level Modeling: An Overview. International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), pp. 19-24, 2003.
[9] A. Donlin. Transaction Level Modeling: Flows and Use Models. International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), pp. 75-80, 2004.
[10] L. Cai and D. Gajski. Transaction Level Modeling: An Overview. International Conference on Hardware/Software Codesign and System Synthesis, October 2003.
[11] L. Benini et al. SystemC Cosimulation and Emulation of Multiprocessor SoC Design. IEEE Computer, April 2003.
[12] A. Donlin. Transaction Level Modeling: Flows and Use Models. Int. Conf. Hardware/Software Codesign and System Synthesis, September 2004.
[13] Open SystemC Initiative OSCI: SystemC documentation, OSCI, www.systemc.org, 2004.
[14] OCP-IP: Open Core Protocol, OCP 2.1 Specification. OCP-IP, www.ocpip.org, 2005.
[15] J. Xi and P. Zhong. A Transaction-Level NoC Simulation Platform with Architecture-Level Dynamic and Leakage Energy Models. Proc. of the 16th ACM Great Lakes Symposium on VLSI (GLSVLSI), April 2006.
[16] S. Lee, S.-R. Yoon, J. Lee, M. L. Huang, and S.-C. Park. Transaction level model simulator for NoC-based MPSoC platform. Proc. of the 6th WSEAS International Conference on Instrumentation, Measurement, Circuits and Systems (IMCAS), January 2007.
[17] S. G. Pestana, E. Rijpkema, A. Radulescu, K. Goossens, and O. P. Gangwal. Cost-performance trade-offs in networks on chip: A simulation-based approach. Proc. of the Conference on Design, Automation and Test in Europe (DATE), February 2004.
[18] T. Grötker, S. Liao, G. Martin, and S. Swan. System Design with SystemC. Kluwer Academic Publishers: Massachusetts, 2002, pp. 131-151.
[19] F. Ghenassia. Transaction Level Modeling with SystemC: TLM Concepts and Applications for Embedded Systems. Springer: Netherlands, 2005, pp. 23-95.
[20] E. Viaud, F. Pêcheux, and A. Greiner. An Efficient TLM/T Modeling and Simulation Environment Based on Conservative Parallel Discrete Event Principles. Proceedings of the conference on Design, Automation and Test in Europe, Munich, Germany, 2006, pp. 94-99.
[21] G. Schirner and R. Dömer. Fast and Accurate Transaction Level Models using Result Oriented Modeling. Proceedings of the 2006 IEEE/ACM International Conference on Computer-Aided Design, San Jose, California, pp. 363-368, 2006.
[22] S. H. Sfar, I. E. Bennour, and R. Tourki. Transaction level modeling of an OSI-like layered NoC. International Conference on Design and Test of Integrated Systems in Nanoscale Technology, Gammarth, Tunisia, pp. 404-408, 2006.
[23] A. Portero, R. Pia and J. Carrabina. SystemC Implementation of a NoC. IEEE International Conference on Industrial Technology, Hong Kong, China, pp. 1132-1135, 2005.
[24] Sung-Rok Yoon and Sin-Chong Park. Case Study on Transaction Level Modeling of NoC based IEEE 802.11n. Asia-Pacific Conference on Communications, Busan, Korea, pp. 1-4, 2006.
[25] Sung-Rok Yoon, Jin Lee and Sin-Chong Park. Transaction level analysis of NoC based coded MIMO-OFDM receiver. IEEE Wireless Communications and Networking Conference, Las Vegas, USA, pp. 1794-1799, 2006.
[26] P. P. Pande, C. Grecu, A. Ivanov and R. Saleh. Design of a switch for network on chip applications. Proc. International Symposium on Circuits and Systems, pp. 217-220, 2003.
[27] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh. Performance evaluation and design trade-offs for network-on-chip interconnect architectures. IEEE Trans. Computer, Vol. 54-8, pp. 1025-1040, August 2005.
[28] J. Duato, S. Yalamanchili, and L. Ni. Interconnection networks: an engineering approach. Morgan Kaufmann Publishers, 2003.
[29] D. A. Reed and D. C. Grunwald. The performance of multicomputer interconnection networks. IEEE Trans. Computer, Vol. 20, Issue 6, pp. 63-73, June 1987.
[30] A. Noureddine, C. Wissem, and A. Brahim. Design and Performance Evaluation of On Chip Network with Transaction Level Modeling. Proceedings of the International Conference on Microelectronics (ICM), December 2011.