
JOURNAL OF COMPUTING, VOLUME 2, ISSUE 9, SEPTEMBER 2010, ISSN 2151-9617

HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/
WWW.JOURNALOFCOMPUTING.ORG 107

Efficient Mapping of Heuristic Packet Classifier on Network Processor based Router to enhance QoS for Multimedia Applications

R. Avudaiammal, Research Scholar, Anna University Tiruchirappalli
P. Seethalakshmi, Professor, Anna University Tiruchirappalli

Abstract— Packet classification is an important function performed by network devices such as edge routers, firewalls and intrusion detection systems to provide QoS and network security. Owing to its complexity, the exponential growth of link speeds and the diversified services offered by the Internet, packet classification is becoming a major bottleneck in router performance. Traditional routers perform this classification using Application Specific Integrated Circuits (ASICs), which suffer from a lack of flexibility. Powerful Embedded Network Processors (NPs), flexible and cost-efficient network appliances introduced by many companies, are an alternative that can implement packet classification at nearly link speed. The objective of this paper is to design and implement a new low complexity heuristic packet classification algorithm named Trie based Tuple Space Search (TTSS) and to efficiently map this packet classifier component onto a Network Processor based router. The performance is evaluated using Intel's IXP2400 NP simulator. The results demonstrate that TTSS outperforms other heuristic packet classification algorithms, and that parallel mapping of TTSS on the Network Processor based router gives better performance than its pipelined mapping, making it more suitable for enhancing QoS for multimedia applications.

Index Terms — Multimedia, QoS, Packet Classification, TTSS, Network Processor, IXP2400.

1 INTRODUCTION

Internet traffic has grown rapidly in recent times due to the growth of real-time multimedia applications such as video on demand, video telephony, video streaming and e-learning that require guaranteed QoS. Packet processing at the router, such as receiving IP packets from incoming links, classifying the packets, scheduling, routing and output porting, must be performed at high speed to satisfy these QoS requirements. Packet classification is an important function performed by network devices such as edge routers, firewalls and intrusion detection systems to provide QoS and network security. The packets that arrive at the input port of the router are classified into different flows by a classifier, by comparing the fields in the L3/L4 header of the incoming packet. A flow is a sequence of packets that belong to the same logical stream and should be treated similarly by the network. A set of rules named Filters (F), holding attributes of the packets such as source IP and destination IP, is stored as fields in the router table for the purpose of classifying the incoming packets. The classifier extracts the relevant fields from the header of the incoming packet, in the same order as in the Filter, to perform multi-dimensional packet classification. Packet classification is carried out by considering the source and destination address and protocol type of L3, and the source port and destination port of L4. This classification involves complex tasks: Longest Prefix Matching (LPM) for the source and destination addresses, Exact Matching (EM) for the protocol and Range Matching (RM) for the ports. Among the rules whose matching conditions are satisfied, the best matching rule is chosen. In addition to this complexity, packet classification has to be done at wire speed, which makes it a bottleneck at routers.

Currently, routers are mainly based on Application Specific Integrated Circuits (ASICs), which are custom made and not flexible enough to support diversified Internetworking services. Earlier General Purpose Processor (GPP) based routers offered flexibility in supporting new features by simply upgrading the software, but had difficulties supporting higher bandwidth [1]. Embedded Network Processors have recently emerged to provide both the performance of ASICs and the programmability of GPPs [2].
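The three per-field match kinds described above (LPM for addresses, exact match for protocol, range match for ports) can be sketched as follows. This is an illustrative Python sketch, not part of the paper's IXP2400 implementation, and the helper names and rule format are our own assumptions:

```python
# Illustrative helpers for the three match kinds used in multi-dimensional
# packet classification (names and rule format are assumptions, not the
# paper's code).

def prefix_match(value_bits: str, prefix: str) -> bool:
    """Prefix-style test: the address bits must start with the rule's
    prefix; '*' marks the end of the significant bits."""
    return value_bits.startswith(prefix.rstrip('*'))

def exact_match(value: str, rule_value: str) -> bool:
    """Exact match for the protocol field; '*' is a wildcard."""
    return rule_value == '*' or value == rule_value

def range_match(port: int, lo: int, hi: int) -> bool:
    """Inclusive range match for a port field."""
    return lo <= port <= hi

# A packet matches a rule only if every field test succeeds.
print(prefix_match('10101100', '1010*'),
      exact_match('TCP', '*'),
      range_match(80, 0, 1023))   # → True True True
```

A real classifier evaluates all rule fields together and, among the rules for which every field matches, selects the highest-priority one.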

Powerful Embedded Network Processors, introduced by many companies, can be placed in routers to execute various network related tasks at the packet level. The design and development of routers using Network Processors has gained significance due to their high performance.

Network Processors (NPs) have a high-performance parallel processing architecture on a single chip, well suited for detailed packet inspection, processing with complex algorithms and forwarding at wire speed. Network Processors have a set of hierarchically distributed memory devices and a set of on-chip processors (MicroEngines) that carry out packet level parallel processing through multitasking and multithreaded programming [16]. Each MicroEngine (ME) has multiple hardware thread contexts that enable thread context switches with zero or minimal overhead [3]. MicroEngines can examine and forward packets independently without using the host processor, bus, or memory. All these features show that MicroEngines in the Network Processor can be assigned different packet processing functionalities so that classification of packets can be done efficiently to provide QoS. Studies [16-19] have focused on implementing networking services using programmable network processors.

The work presented in this paper is based on the fully programmable Intel IXP2400, a member of Intel's second generation network processor family. The architecture of this integrated network processor [4], shown in Figure 1, has a single 32-bit XScale core processor, eight 32-bit MicroEngines (MEs) organized as two clusters, standard memory interfaces and high speed bus interfaces. Each ME has eight hardware-assisted threads and 4 KB of local memory. Each microengine has 256 general purpose registers that are equally shared among its eight threads. Microengines exchange information using either an on-chip scratchpad memory or 128 special purpose next neighbor registers. Data transfers between the MEs and locations external to the ME (e.g. DRAMs and SRAMs) are done using 512 Transfer Registers. The XScale core is responsible for initializing and managing the chip, handling control and management functions and loading the ME instructions.

The IXP2400 chip has a pair of buffers, the Receive BUF (RBUF) and Transmit BUF (TBUF), each of size 8 Kbytes, used to send/receive packets to/from the network ports. Packets are injected into the Network Processor from the network through the Media Switch Fabric (MSF) Interface and then forwarded to the MicroEngines for processing. All threads in a particular ME execute program code called a microblock, stored in the local memory of that ME. Finally, the processed packets are driven into the network by the MSF at the output port.

In this work, the heuristic packet classifier named Trie based Tuple Space Search (TTSS) has been proposed and implemented on the IXP2400 processor with two supporting microblocks, Packet Receive and Packet Transmit. In the router, packet arrival and departure can occur simultaneously; hence, to utilize the parallel processing of the Network Processor effectively, the microblocks are arranged in a pipeline. For example, while the arrival of a new packet is being handled by a receive microengine, an existing packet can be classified and transmitted by the respective microblocks simultaneously, which entails interaction between microblocks in the form of a pipeline. The impact of the different design mappings of the packet classification onto the MicroEngines, namely parallel and pipelined mapping, has also been examined, and the performance measures show that the parallel design mapping has a better packet processing rate than the pipelined mapping.

The remainder of this paper is organized as follows: Section 2 presents the background of packet classification. Sections 3 and 4 describe the proposed packet classification algorithm and its implementation details. The performance analysis is presented in Section 5 and the conclusion in Section 6.

Figure 1. Architecture of IXP2400

2 RELATED WORK

This section presents a brief overview of packet classification algorithms. Surveys of packet classification algorithms [5], [6], [7], [8] show that the packet classification problem is inherently hard. Linear search is the simplest method of packet classification; it uses a linked list to search through the set of rules. If N is the number of rules, its spatial and temporal complexity is O(N). Tuple Space (TS) search, a heuristic for packet classification [9], maps the set of rules to tuples stored in hash tables. As the number of distinct tuples is much smaller than the number of rules, even a simple linear search over the tuple space provides a significant increase in the speed of packet classification, with time complexity of order O(W^k), where W is the length of the IP prefix and k is the number of fields used in classification. Pruned Tuple Space Search (PTS) [10] reduces the time further by searching only a subset of the tuples. Though any field or combination of fields may be used for pruning, it is found that pruning on the source and destination addresses strikes a favorable balance between the reduction in the number of tuples and the overhead of the pruning steps.

The tuple-pruning algorithm achieves good performance in practical environments; however, its worst-case speed is not guaranteed. Trie search is another popular high speed IP route lookup scheme for packet classification; it uses a tree data structure with individual bit lookup [11].

Some trie-based algorithms follow the hierarchical approach to packet classification, which recursively performs a search in each field. In [12-15], trie based classification algorithms suitable for high speed two-dimensional packet classification are described. A one-dimensional 1-bit trie is a binary tree like structure in which each node has two element fields, le (the left element) and re (the right element), and each element field has the components child and data. Branching is done based on the bits of the search key: if the ith bit of the search key is 0, then at level i the branch followed at a node is from the left element (the root is at level 0); otherwise it is from the right element. In one-dimensional multibit tries, branching is based on a number of bits known as the "stride". Multi Dimensional Multibit Trie (MDMT) search [13-15] is a trie search in which packets are classified by searching the Destination Trie, Source Trie, Protocol Trie and Port Trie at different levels using strides. It has time complexity O(W/L) and space complexity O(N(W/L)2^(k-1)), where L is the average stride length. Though several algorithms are specialized for rules on two fields (e.g. source and destination IP address only), it is necessary to design a packet classification algorithm that uses more header fields to support diversified services, with the requirements of both low memory space and low access overhead [21]. A new low complexity Trie Based Tuple Space Search (TTSS) packet classification algorithm is therefore proposed as a remarkable enhancement of the existing trie based and tuple based algorithms, to support QoS for multimedia applications.

3 TRIE BASED TUPLE SPACE SEARCH ALGORITHM

This section presents the proposed TTSS packet classification algorithm, which accelerates the lookup by simplifying the lookup procedure and avoiding unnecessary tuple probing. This is achieved by dividing the tuple space into multiple subspaces. Since each rule has two major components, an application specification and an address prefix pair, classification is done in two stages: application field lookup and prefix-pair lookup. The application specification identifies a specific application session by transport protocol, source port and destination port. The address prefix pair identifies the communicating subnets by specifying a source address prefix and a destination address prefix. The implementation of the TTSS classifier uses the IP destination/source addresses, source/destination ports and protocol as filter/rule fields. TTSS uses a hierarchical trie structure to store the rules and traverses the trie through a set of rules to classify each incoming packet. In TTSS, exact matching is carried out at the first level for the protocol, and prefix matching is carried out at the next level for the IP addresses.

Taylor and Turner [22] have given useful characteristics of filters, obtained by analyzing rule sets provided by ISPs, network equipment vendors and other researchers in the field, in order to verify and compare the performance of proposed classification algorithms. From [22] it is observed that even though the transport-layer fields have a wide variety of specifications, the most common protocol is TCP (49%), followed by UDP (27%), the wildcard (13%) and ICMP (10%); other protocols such as OSPF and IGMP each account for less than 1% of the filters. Because of the small number of protocol types, the node of the trie at level 1 splits the rule set based on the protocol field of the header, which reduces the number of rules to be searched at the next level of the trie by about 50%. The sample rule table and the associated data structure of TTSS are shown in Table 1 and Figure 2.

Moreover, the speed and efficiency of several longest prefix matching and packet processing algorithms depend upon the number of unique prefix lengths and the distribution of rules across those unique values. A majority of rule sets specify fewer than 15 unique prefix lengths for either the source or destination address prefixes [16]. The number of unique source/destination prefix pair lengths is generally less than 32, which is small compared to the filter set size, and 8% of the rules are redundant. Following the classic IP addressing structure, it has been shown in [11] that most rules ignore subnetting.

Based on these observations [16], [22], each node of the trie at level 2 is constructed with multiple elements referring to hash tables of prefix pairs and prefix lengths, to further reduce the lookup time by shrinking the search space. A node with prefix length w has 2^w element fields. All matching candidate rules are identified using the destination prefix length w as a search key, and those candidates are then further filtered using the source prefix field. The leftmost element of a node has the hash table with prefix length i, and the hash tables of the other elements hold the rules with prefix length j, where the hash tables always satisfy the inequality i > j. Longest Prefix Matching (LPM) is preferred for address lookup, and hence packet classification is performed by traversing the trie from left to right and from top to bottom using the tuple, to reduce the time complexity. This technique gives the algorithm space complexity O(N) and time complexity O(log W). With this low complexity, TTSS can be effective for high speed classification compared to other Multidimensional Multibit Trie (MDMT) packet classification algorithms [11-15].

In this paper, the TTSS algorithm is implemented on the IXP2400 Network Processor platform because the algorithm is amenable to parallel implementation.
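The two-level TTSS lookup described in this section can be sketched as follows; the data shapes here are assumptions made for illustration (a dictionary per protocol at level 1, and hash tables keyed by prefix-length pair at level 2, probed longest-first so that the longest prefix match wins):

```python
from collections import defaultdict

# (rule_id, protocol, destination prefix, source prefix), Table 1 style.
rules = [
    ('R1', 'TCP', '1000000*', '1010*'),
    ('R3', 'UDP', '101001*',  '1010*'),
    ('R5', 'UDP', '110000*',  '1101*'),
]

# Level 1 splits on protocol; level 2 holds prefix-pair hash tables.
trie = defaultdict(lambda: defaultdict(dict))
for rid, proto, dst, src in rules:
    d, s = dst.rstrip('*'), src.rstrip('*')
    trie[proto][(len(d), len(s))][(d, s)] = rid

def ttss_classify(proto, dst_bits, src_bits):
    """Exact match on protocol, then probe prefix-pair tables longest-first."""
    subspaces = trie.get(proto)
    if subspaces is None:
        return None  # a real classifier falls back to a default rule
    for dlen, slen in sorted(subspaces, reverse=True):
        rid = subspaces[(dlen, slen)].get((dst_bits[:dlen], src_bits[:slen]))
        if rid is not None:
            return rid
    return None

print(ttss_classify('UDP', '10100111', '10101111'))  # → R3
```

Splitting on the protocol first mirrors the observation from [22] used above: with so few distinct protocol values, the level-1 exact match discards roughly half of the rules before any prefix table is probed.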
Its performance improvement, in terms of throughput and packet classification rate on the IXP2400 with different design mappings (parallel and pipelined), is also evaluated to demonstrate its support for enhancing the QoS of multimedia applications.

Table 1 Sample rule table

Rule  D.A prefix  S.A prefix  Protocol  Port
R1    1000000*    1010*       TCP       HTTP
R2    10010*      11010*      TCP       HTTP
R3    101001*     1010*       UDP       RTSP
R4    1000101*    010*        TCP       FTP
R5    110000*     1101*       UDP       SNMP
R6    111001*     1001*       IGMP      HTTP
R7    110011*     1100*       GRE       DNS
R8    1000110*    11011*      UDP       DNS
R9    110000*     1101*       TCP       FTP
R10   1000110*    1100*       TCP       HTTP
R11   10110*      1000*       UDP       RTSP
R12   10011*      010*        TCP       HTTP
R13   1001000*    0010*       UDP       RTP
R14   101000*     0010*       TCP       FTP
R15   1000100*    0010*       UDP       RTP
R16   1010000*    1001*       *         SMTP
R17   101000*     0010*       UDP       RTP
R18   1000100*    0011*       TCP       FTP
R19   10101*      1101*       UDP       RTP
R20   10001*      0010*       TCP       FTP
R21   100000*     101*        TCP       FTP
R22   101001*     1010*       TCP       HTTP
R23   111000*     10001*      *         SMTP
R24   1001010*    1101*       UDP       HTTP
R25   10100*      0100*       UDP       RTSP
R26   10100*      0101*       TCP       DLS
R27   100001*     010*        TCP       DNS
R28   1010010*    1010*       *         SMTP
R29   10000*      1011*       TCP       FTP
R30   1000101*    010*        UDP       SNMP
R31   1000001*    1101*       TCP       FTP
R32   1001011*    0010*       IGP       DNS

Figure 2 Data structure of TTSS

4 IMPLEMENTATION

This section presents the implementation details of TTSS packet classification on Intel's IXP2400 Network Processor. The implementation architecture, shown in Figure 3, consists of modules for Packet Receive, Packet Classification and Packet Transmit. Layer 2 de-capsulation and header validation are also included in the Packet Classification module.

The Packet Receive microengine is interfaced with the Media Switch Fabric Interface (MSF). Packets are injected through the MSF and the Receive Buffer (RBUF). For each packet, the packet data is written to DRAM and the packet meta-data (offset, size) is written to SRAM. The receive process is executed by microengine ME0 using all eight threads available in that microengine. Packet sequencing is maintained by executing the threads in strict order. The packet buffer information is written to a scratch ring for use by the packet processing stage; communication between pipeline stages is implemented through controlled access to shared ring buffers.

The Classifier microblock is executed by microengine ME1 and stores all rules in SRAM as a hierarchical trie data structure. The Classifier microblock does not operate on shared data structures, and hence inter-thread synchronization is not needed. The microblock is organized as a functional pipe-stage, i.e. all threads in a microengine execute the same algorithm but process different packets.

Each rule is associated with QoS information, class_id. This microblock removes the layer-3 header of each incoming packet from DRAM by updating the offset and size fields in the packet meta-data using standard protocol libraries of the IXP2400. Since it is important to maintain packet sequencing, the threads in the microblock execute in strict order to dequeue the packet header from the

DRAM for performing packet classification. The packet classifier validates the IP header of each incoming packet based on RFC 1812 [5]. If the validity check fails, the packet is dropped. Otherwise, the packet is classified into different traffic flows based on the IP header, using the proposed packet classification algorithm, and is enqueued in the respective queue. For each valid packet, the microblock builds a hash input from the header and compares the IP header fields with the rule stored in the hash entry. If a matching entry is found, the classifier writes selected dispatch loop variables with the data stored in the hash entry and enqueues the packet in the queue indicated by the class_id field of the rule, for further processing by the router. Otherwise, if matching fails, the algorithm loads the next hash entry in the chain, as indicated by next_entry_ptr, and repeats the entry matching procedure. If the classifier reaches the end of the chain without finding a matching entry, a default rule is applied.

The Packet Transmit microblock is executed by microengine ME3; it moves packets into TBUFs for transmission over the media interface through the different ports. The Packet Transmit microblock monitors the MSF and stops transmission on a port if the TBUF threshold for that port has been exceeded; in that case it queues the requests to transmit packets on that port in local memory. The Packet Transmit microblock periodically updates the classifier with information about how many packets have been transmitted.

4.1 Implementation Environment

IXA 3.51 SDK is a cycle based simulator [18] in which the IXP2400 is set to run under the following conditions: PLL output frequency 1200 MHz; ME frequency 600 MHz; XScale frequency 600 MHz; SRAM frequency 200 MHz, two channels, 64 MB per channel; DRAM frequency 150 MHz, 64 MB. The configured device type is x32 MPHY4 with bus mode 1x32 to send and receive packets from the simulator. A device with 4 ports, each with a data rate of 1000 Mbps and receive and transmit buffer sizes of 128 KB, is chosen for this application. The simulator is configured to send packet streams to ports 0 through 3 of the device. The implementation assigns individual blocks from the fast path pipeline to separate microengines on the IXP2400.

The traffic generated in this work includes uniformly distributed RTP/UDP packets, UDP packets, TCP packets, etc., covering all classes of IP addresses. Traffic is generated at a constant rate of 1000 Mb/s and the inter-packet gap is set to 96 ns.

5 PERFORMANCE EVALUATION

To evaluate the performance of the algorithm, the 5-tuple header field consisting of the IP source and destination addresses, protocol type, and source and destination ports is considered, because these five fields are the most common in the literature, even though a higher number of fields is possible [11]. Prefix match is used for the IP source and destination addresses and exact match is used for the protocol field.

5.1 Design Mappings

The classification phase can be implemented on the IXP2400 in different ways, namely parallel mapping and pipelined mapping. In both cases, Packet Rx and Packet Tx are processed by microengine 0 (ME0) and microengine 3 (ME3), and microengines ME1 and ME2 are used for classification. In parallel mapping, all the classification steps for a single packet are processed by a single microengine, and hence ME1 and ME2 perform the classification of different packets simultaneously. In pipelined mapping, the classification of a single packet is done by two different microengines, ME1 and ME2: one performs field extraction and the other performs header validation (according to RFC 1812 [5]) and table lookup.

Figure 4. Parallel and Pipelined Mapping

The four algorithms (LS, TSS, PTS and TTSS) are implemented using single microengine, parallel and pipelined design mappings, as shown in Figure 4. Figure 5 shows that at the end of 60,000 microengine cycles, the throughput using TSS is 34.15% more than LS, using PTS is 67.85% more than LS, and using TTSS is 73.5% more than LS. The throughput of TTSS is 59.8% and 17.6% more than that of TSS and PTS respectively.

Figure 5 Throughput
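The two mappings of Section 5.1 can be summarized as a small assignment table. This sketch only records which microblock runs on which microengine, as described above; it is an illustration, not IXP2400 configuration code:

```python
# Microengine assignments for the two design mappings (from Section 5.1).
parallel_mapping = {
    'ME0': 'packet receive',
    'ME1': 'full classification',   # whole packets; ME1 and ME2 classify
    'ME2': 'full classification',   # different packets simultaneously
    'ME3': 'packet transmit',
}

pipelined_mapping = {
    'ME0': 'packet receive',
    'ME1': 'field extraction',                    # stage 1 for every packet
    'ME2': 'header validation + table lookup',    # stage 2 for every packet
    'ME3': 'packet transmit',
}

for me in ('ME0', 'ME1', 'ME2', 'ME3'):
    print(f'{me}: parallel={parallel_mapping[me]!r}, '
          f'pipelined={pipelined_mapping[me]!r}')
```

In the parallel mapping, each packet's header is read from SRAM once by the single microengine that classifies it, while in the pipelined mapping both stages touch the header; Section 5 identifies this as the main cost of the pipelined design.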

TTSS outperforms the LS, TSS and PTS packet classification algorithms, and hence the performance of the different design mappings is analyzed with the TTSS classification algorithm, as shown in Figure 6.

Figure 6 Throughput

Table 2 Comparison of performance parameters

Table 2 shows that the speedup factor [20] of TTSS is 2.18 in parallel mapping and 1.52 in pipelined mapping, compared to TTSS classification using a single microengine. From the table it is also inferred that the classification rate of TTSS is 2788 KPPS (kilo packets per second) in parallel mapping and 1937 KPPS in pipelined mapping. The pipelined mapping seems an ideal mapping for the algorithm, in which the various steps in processing a packet are split and assigned across microengines. However, the packet processing speed of TTSS is 31.25% lower in pipelined mapping than in parallel mapping, as shown in Figure 7. This is because SRAM is accessed only once in parallel mapping to read the packet header, whereas in pipelined mapping it is accessed more than once for a single packet. Furthermore, in pipelined mapping a new packet cannot be handled by microengines earlier in the pipeline until inter-microengine buffer entries become available, and these entries become available only when the entire processing of a packet has been completed by all microengines.

Figure 7 Packet Sent/Receive Ratio

The microengine being the most important resource of the NP, its utilization can be described by its idle time, as shown in Figures 8a and 8b. With TTSS, the idle time of ME2 is almost 25% less than with the other algorithms, and for ME3 it is 32% less. The packet classifier in pipelined mapping exhibits more idle time than in parallel mapping: the idle times of ME2 and ME3 in parallel mapping are 21% and 32% less than those in pipelined mapping, respectively.

Figure 8a Idle Time of ME2

Figure 8b Idle Time of ME3

On analysis, it is seen that Trie based Tuple Space Search achieves high speed packet classification on an Embedded Network Processor in parallel mapping.

6 CONCLUSION

Network devices such as edge routers, firewalls and intrusion detection systems can utilize programmable Network Processors (NPs) to implement a computationally intensive packet classification

algorithm at line speeds to provide QoS and network security. This paper has described the design of a packet classifier component on a Network Processor based router and its performance enhancement. The proposed low complexity heuristic Trie based Tuple Space Search (TTSS) packet classification algorithm has been implemented on Intel's IXP2400 Network Processor to improve router performance. By dividing the tuple space into multiple subspaces, TTSS achieves space complexity O(N) and time complexity O(log W). The implementation results show that the throughput of TTSS is almost 60% more than that of TSS and 17.6% more than that of PTS. In this work, the performance of TTSS has also been evaluated using pipelined as well as parallel design mappings. The results show that the speedup factor of TTSS is 2.18 in parallel mapping and 1.52 in pipelined mapping, compared to TTSS classification using a single microengine, and that the classification rate of TTSS is 2788 KPPS (kilo packets per second) in parallel mapping and 1937 KPPS in pipelined mapping. Moreover, the pipelined design mapping has a packet processing rate 31.25% lower than the parallel mapping, primarily due to multiple memory reads per packet in the pipelined mapping. Compared with the pipelined mapping, parallel mapping of the TTSS classifier provides higher throughput and classification rate. Thus, this work suggests that TTSS based packet classification in parallel mapping is efficient for enhancing the QoS of multimedia applications.

REFERENCES

[1] M. Coss and R. Sharp, "The Network Processor Decision," Bell Labs Technical Journal, pp. 177-189, 2004.
[2] D. E. Comer, Network Systems Design Using Network Processors. Pearson Education, 2003.
[3] E. J. Johnson and A. R. Kunze, IXP2400/2850 Programming. Intel Press, 2004.
[4] Intel Corporation, Intel IXP2400/IXP2800 Network Processor Hardware Reference Manual, 2003.
[5] F. Baker, "Requirements for IP Version 4 Routers," RFC 1812, Internet Engineering Task Force, June 1995. ftp://ftp.ietf.org/rfc/rfc1812.txt
[6] D. E. Taylor, "Survey and Taxonomy of Packet Classification Techniques," ACM Computing Surveys, vol. 37, no. 5, pp. 238-275, Sep. 2005.
[7] P. Gupta and N. McKeown, "Algorithms for Packet Classification," IEEE Network Magazine, vol. 15, no. 2, pp. 24-32, April 2001.
[8] M. A. Ruiz-Sanchez, E. W. Biersack and W. Dabbous, "Survey and Taxonomy of IP Address Lookup Algorithms," IEEE Network, vol. 15, no. 2, pp. 8-23, April 2001.
[9] V. Srinivasan, S. Suri and G. Varghese, "Packet Classification Using Tuple Space Search," ACM SIGCOMM, pp. 135-146, September 1999.
[10] P.-C. Wang, C.-T. Chan, C.-L. Lee and H.-Y. Chang, "Scalable Packet Classification for Enabling Internet Differentiated Services," IEEE Transactions on Multimedia, vol. 8, no. 8, pp. 1239-1249, Dec. 2006.
[11] S. Giordano, G. Procissi, F. Rossi and F. Vitucci, "Design of a Multi-Dimensional Packet Classifier for Network Processors," Proc. IEEE ICC 2006, pp. 503-508.
[12] W. Lu and S. Sahni, "Efficient Construction of Pipelined Multibit-Trie Router Tables," IEEE Transactions on Computers, vol. 56, no. 1, pp. 32-43, Jan. 2007.
[13] W. Lu and S. Sahni, "Packet Classification Using Two-Dimensional Multibit Tries," Proc. 10th IEEE Symposium on Computers and Communications, 2005.
[14] W. Lu and S. Sahni, "Packet Classification Using Pipelined Two-Dimensional Multibit Tries," http://www.cise.ufl.edu/~wlu/papers/p-2dtries.pdf, 2008.
[15] W. Lu and S. Sahni, "Packet Classification Using Space-Efficient Pipelined Multibit Tries," IEEE Transactions on Computers, vol. 57, no. 5, May 2008.
[16] Intel Corporation, Intel Network Processors Product Information. http://www.intel.com/design/network/products/npfamily
[17] Intel Corporation, Microengine Version 2 (MEv2) Assembly Language Coding Standards, Revision 1.01g, June 2003.
[18] Intel Corporation, Intel IXP2400/IXP2800 Network Processors Development Tool User Guide, March 2004.
[19] Intel Corporation, Intel IXP2400 & IXP2800 Network Processors Programmer's Reference Manual, 2004.
[20] J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, 3rd ed. Morgan Kaufmann Publishers, 2003.
[21] A. Feldmann and S. Muthukrishnan, "Tradeoffs for Packet Classification," IEEE INFOCOM, pp. 1193-1202, March 2000.
[22] D. E. Taylor and J. S. Turner, "ClassBench: A Packet Classification Benchmark," IEEE INFOCOM, vol. 3, pp. 2068-2079, March 2005.
[23] P. Gupta and N. McKeown, "Packet Classification on Multiple Fields," ACM SIGCOMM, pp. 147-160, August 1999.
[24] A. Yoshioka, S. H. Shakot and M. S. Kim, "Rule Hashing for Efficient Packet Classification in Network Intrusion Detection," IEEE, 2008.

Mrs. R. Avudaiammal received her B.E. degree in Electronics and Communication Engineering from Madurai Kamaraj University, India in 1992 and her M.E. degree in Applied Electronics from Bharathiar University, India in 2000. She is an Associate Professor at St. Joseph's College of Engineering, Chennai, India, with 17 years of teaching experience. She has published books on Microprocessors with Dhanpat Rai Publications and on Information Coding Techniques with TMH Publishers. She is currently pursuing her research at Anna University Tiruchirappalli, India. Her research interests are in Embedded Systems, Multimedia Networks and Network Processors.

Dr. P. Seethalakshmi received her B.E. degree in Electronics and Communication Engineering in 1991 and her M.E. degree in Applied Electronics in 1995 from Bharathiar University, India. She obtained her doctoral degree from Anna University Chennai, India in 2004. She has 15 years of teaching experience and is Director/CAE, Anna University Tiruchirappalli. Her areas of research include Multimedia Streaming, Wireless Networks, Network Processors and Web Services.
