The layered concept of networking was developed to accommodate changes in technology. Each layer of a
specific network model may be responsible for a different function of the network. As data is processed,
each layer passes information up or down to the adjacent layer.
The OSI Network Model Standard
The OSI network model layers are arranged here from the lowest level, the physical (hardware) layer, to the
highest.
Physical Layer - The actual hardware.
Data Link Layer - Data transfer method (e.g., 802.x Ethernet). Puts data into frames and ensures error-free
transmission. Also controls the timing of the network transmission. Adds frame type, address, and error
control information. IEEE divided this layer into the two following sublayers.
Logical Link Control (LLC) - Maintains the link between two computers by establishing
Service Access Points (SAPs), which are a series of interface points. Defined in IEEE 802.2.
Media Access Control (MAC) - Used to coordinate the sending of data between
computers. The 802.3, 4, 5, and 12 standards apply to this layer. If you hear someone talking about the
MAC address of a network card, they are referring to the hardware address of the card.
Network Layer - IP network protocol. Routes messages using the best path available.
Transport Layer - TCP, UDP. Ensures properly sequenced and error-free transmission.
Session Layer - The user's interface to the network. Determines when the session is begun or
opened, how long it is used, and when it is closed. Controls the transmission of data during the session.
Supports security and name lookup enabling computers to locate each other.
Presentation Layer - ASCII or EBCDIC data syntax. Makes the type of data transparent to the layers
around it. Used to translate data to a computer-specific format, such as byte ordering. It may include
compression. It prepares the data, either for the network or the application, depending on the direction it is
going.
Application Layer - Provides services software applications need. Provides the ability for user
applications to interact with the network.
Many protocol stacks overlap the borders of the seven-layer model by operating at multiple layers of the
model. File Transfer Protocol (FTP) and Telnet both work at the application, presentation, and session
layers.
The Internet, TCP/IP, DOD Model
This model is sometimes called the DOD model since it was designed for the Department of Defense. It is
also called the TCP/IP four-layer protocol, or the Internet protocol. It has the following layers:
Link - Device driver and interface card which maps to the data link and physical layer of the OSI
model.
Network - Corresponds to the network layer of the OSI model and includes the IP, ICMP, and IGMP
protocols.
Transport - Corresponds to the transport layer and includes the TCP and UDP protocols.
Application - Corresponds to the OSI Session, Presentation and Application layers and includes
FTP, Telnet, ping, Rlogin, rsh, TFTP, SMTP, SNMP, DNS, your program, etc.
Note that in the four-layer TCP/IP model, each layer generates its own set of data.
The Link layer corresponds to the hardware, including the device driver and interface card. The link
layer has data packets associated with it depending on the type of network being used, such as ARCnet,
Token Ring, or Ethernet. In this document, we will be talking about Ethernet.
The network layer manages the movement of packets around the network and includes IP, ICMP,
and IGMP. It is responsible for making sure that packets reach their destinations, and for reporting errors
when they don't.
The transport layer is the mechanism two computers use to exchange data on behalf of software.
The two transport protocols are TCP and UDP. Other types of protocols exist for systems other than
TCP/IP, but we will talk about TCP and UDP in this document.
The application layer refers to networking protocols that are used to support various services such
as FTP, Telnet, BOOTP, etc. Note, to avoid confusion, that the application layer generally refers to
protocols such as FTP, Telnet, ping, and other programs designed for specific purposes, which are governed
by a specific set of protocols defined in RFCs (Requests for Comments). However, a program that you
write can define its own data structure to send between your client and server programs, so long as the
programs you run on both the client and server machines understand your protocol. For example, when your
program opens a socket to another machine, it is using the TCP protocol, but the data you send depends on
how you structure it.
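The point above, that TCP carries your bytes while the structure of those bytes is up to you, can be sketched in Python. This is a minimal illustration, not a real FTP exchange: the length-prefix framing and the "echo" behavior are invented for the example.

```python
import socket
import struct
import threading

# Hypothetical application protocol: a 4-byte big-endian length prefix
# followed by a UTF-8 payload. TCP just carries the bytes; the framing is ours.
def send_msg(sock, text):
    payload = text.encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_exact(sock, n):
    data = b""
    while len(data) < n:          # TCP is a byte stream; loop until n bytes
        data += sock.recv(n - len(data))
    return data

def recv_msg(sock):
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length).decode("utf-8")

def serve_once(server):
    conn, _ = server.accept()
    send_msg(conn, "echo: " + recv_msg(conn))   # toy server-side handler
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
server.listen(1)
t = threading.Thread(target=serve_once, args=(server,))
t.start()

client = socket.create_connection(server.getsockname())
send_msg(client, "hello")
reply = recv_msg(client)          # "echo: hello"
client.close()
t.join()
server.close()
```

Both ends must agree on the framing; nothing in TCP itself knows where one message ends and the next begins.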
Data Encapsulation, a Critical Concept to Understand
When starting with protocols that work at the upper layers of the network models, each set of data is wrapped
inside the next lower layer protocol, similar to wrapping letters inside an envelope. The application creates the
data, then the transport layer wraps that data inside its format, then the network layer wraps the data, and
finally the link (ethernet) layer encapsulates the data and transmits it.
Each network layer either encapsulates the data stream with additional information or manages some part of
the connection's data handling.
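The wrapping described above can be made concrete with a toy sketch in Python. The header layouts are deliberately simplified stand-ins, not real TCP, IP, or Ethernet formats: the point is only that each layer prepends its own header to whatever the layer above handed it.

```python
import struct

# Toy encapsulation: each layer prepends its own header, like an envelope
# inside an envelope. Field layouts are simplified, not real wire formats.
def transport_wrap(data, src_port, dst_port):
    return struct.pack(">HH", src_port, dst_port) + data       # "TCP" header

def network_wrap(segment, src_ip, dst_ip):
    return bytes(src_ip) + bytes(dst_ip) + segment             # "IP" header

def link_wrap(packet, src_mac, dst_mac):
    return bytes(dst_mac) + bytes(src_mac) + packet            # "Ethernet" header

app_data = b"GET file.txt"                                     # application layer
segment = transport_wrap(app_data, 40000, 21)                  # transport layer
packet = network_wrap(segment, [10, 0, 0, 1], [10, 0, 0, 2])   # network layer
frame = link_wrap(packet, [0xAA] * 6, [0xBB] * 6)              # link layer

# On the wire: link header, then network header, then transport header,
# with the application data innermost.
```

The receiving host strips these headers in the reverse order, one layer at a time.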
Without going into a great deal of technical detail, I will describe a general example of how these layers work
in real life. Assuming that the protocol stack being used is TCP/IP and the user is going to use an FTP client
program to get or send files from or to an FTP server, the following will essentially happen:
1. The user will start the FTP client program on the sending computer.
2. The user will select the address and port of the server. (If the user selected a name instead, DNS
would also be involved, which would complicate this scenario.)
3. The user will indicate to the FTP client program that they want to connect to the server.
4. The application layer will send information through the presentation layer to the session layer,
telling it to open a connection to the other computer at a specific address and port. The presentation layer
does little at this point; in practice, it is handled by the FTP program itself.
5. The session layer will negotiate with the FTP server for a connection. Several
synchronization signals are sent between the client and server computers just to establish the connection. This is a
description of the sending of a signal from the client to the server:
1. The session layer of the client will send a data packet (SYN) signal to the transport layer.
2. The transport layer will add a header (TCP header) to the packet indicating what the source
port is and what the destination port is. There are also some other flags and information that will not be
discussed here to minimize complexity of this explanation.
3. The network layer will add the source IP address and destination IP address, along with other
information, in an IP header.
4. The data link layer will determine (using ARP and routing information, which are not discussed
here for brevity) the hardware address of the computer the data is being sent to. An additional header
(Ethernet) will be added at this layer, indicating the hardware address to receive the message, along with
other information.
5. The information will be transmitted across the physical wire (hardware layer) until the signal
reaches the network card of the server computer. The signal may go through several hubs or repeaters.
6. The FTP server will normally only look for ethernet frames that are matching its own
hardware address.
7. The FTP server will see the ethernet frame matching its address and strip the ethernet
header information and send it to the network layer.
8. The network layer will examine the IP address information, strip the IP header, and if the IP
address matches its own, will send the information to the transport layer.
9. The transport layer will look at the TCP port number and based on the port number and
services being run, will strip the TCP header and send the information to the appropriate program which is
servicing the requested port.
10. At this point, the session layer in the FTP program will conduct a series of data exchanges
with the client computer through all the lower layers until a session is established.
6. At this point information may be sent through several FTP commands between the client and the
server. Every transmission passes through the network layers from the application layer down to the hardware
layer and back up the layers on the receiving computer.
7. When the client decides to terminate the session, the session layer will be informed by the higher
layers and will negotiate the closing of the connection.
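The receive side of this exchange (steps 6 through 9 of the scenario above) can be sketched as a toy decapsulation path in Python: each layer checks its own address field, strips its header, and passes the rest up. As before, the header layouts are simplified stand-ins for the real frame formats.

```python
import struct

# Toy receive path: 6-byte MAC addresses, 4-byte IP address, 2-byte port.
def link_receive(frame, my_mac):
    dst_mac, rest = frame[:6], frame[6:]
    if dst_mac != my_mac:
        return None                  # step 6: ignore frames for other hardware
    return rest[6:]                  # step 7: strip both MAC addresses, pass up

def network_receive(packet, my_ip):
    dst_ip, rest = packet[:4], packet[4:]
    if dst_ip != my_ip:
        return None                  # step 8: not our IP address
    return rest                      # strip the IP header, pass up

def transport_receive(segment, services):
    (port,) = struct.unpack(">H", segment[:2])
    return services.get(port), segment[2:]   # step 9: demultiplex on port

# A frame addressed to MAC BB..., IP 10.0.0.2, port 21 (FTP control).
frame = (bytes([0xBB] * 6) + bytes([0xAA] * 6)
         + bytes([10, 0, 0, 2]) + struct.pack(">H", 21) + b"SYN")
packet = link_receive(frame, bytes([0xBB] * 6))
segment = network_receive(packet, bytes([10, 0, 0, 2]))
handler, data = transport_receive(segment, {21: "ftpd"})
```

A frame whose destination MAC does not match is dropped at the first check, which is exactly why the server normally never sees traffic addressed to other hosts.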
This article is about a specific protocol technology. For the entire set of Internet related protocols, see Internet
Protocol Suite.
The Internet Protocol (IP) is a protocol used for communicating data across a packet-switched internetwork
using the Internet Protocol Suite, also referred to as TCP/IP.
IP is the primary protocol in the Internet Layer of the Internet Protocol Suite and has the task of delivering
distinguished protocol datagrams (packets) from the source host to the destination host solely based on their
addresses. For this purpose the Internet Protocol defines addressing methods and structures for datagram
encapsulation. The first major version of addressing structure, now referred to as Internet Protocol Version 4
(IPv4) is still the dominant protocol of the Internet, although the successor, Internet Protocol Version 6 (IPv6)
is being deployed actively worldwide.
IP encapsulation
Data from an upper layer protocol is encapsulated as packets/datagrams (the terms are basically synonymous
in IP). Circuit setup is not needed before a host may send packets to another host that it has previously not
communicated with (a characteristic of packet-switched networks), thus IP is a connectionless protocol. This is
in contrast to public switched telephone networks that require the setup of a circuit for each phone call
(connection-oriented protocol).
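The connectionless behavior described above is easy to see with UDP, which exposes IP's datagram model directly: a host simply sends a datagram to an address it has never communicated with, with no circuit setup. A minimal local sketch in Python:

```python
import socket

# UDP, like IP itself, is connectionless: no setup handshake, just send a
# datagram to an address. Both sockets here live on the local machine.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
receiver.settimeout(5.0)            # avoid hanging if the datagram is lost

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", receiver.getsockname())   # no connect() needed

data, addr = receiver.recvfrom(1024)
sender.close()
receiver.close()
```

Contrast this with the TCP example earlier, where a connection had to be established before any data could flow.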
Services provided by IP
Because of the abstraction provided by encapsulation, IP can be used over a heterogeneous network, i.e., a
network connecting computers may consist of a combination of Ethernet, ATM, FDDI, Wi-Fi, token ring, or
others. Each link layer implementation may have its own method of addressing (or possibly the complete lack
of it), with a corresponding need to resolve IP addresses to data link addresses. This address resolution is
handled by the Address Resolution Protocol (ARP) for IPv4 and Neighbor Discovery Protocol (NDP) for IPv6.
Reliability
The design principles of the Internet protocols assume that the network infrastructure is inherently unreliable
at any single network element or transmission medium and that it is dynamic in terms of availability of links
and nodes. No central monitoring or performance measurement facility exists that tracks or maintains the
state of the network. For the benefit of reducing network complexity, the intelligence in the network is
purposely located mostly in the end nodes of each data transmission, cf. the end-to-end principle. Routers in the
transmission path simply forward packets to the next known local gateway matching the routing prefix for the
destination address.
As a consequence of this design, the Internet Protocol only provides best-effort delivery and its service can also
be characterized as unreliable. In network architectural language it is a connectionless protocol, in contrast to
so-called connection-oriented modes of transmission. The lack of reliability allows any of the following fault
events to occur:
• data corruption
• lost data packets
• duplicate arrival
• out-of-order packet delivery; meaning, if packet 'A' is sent before packet 'B', packet 'B' may arrive
before packet 'A'. Since routing is dynamic and there is no memory in the network about the path of prior
packets, it is possible that the first packet sent takes a longer path to its destination.
The only assistance that the Internet Protocol provides in Version 4 (IPv4) is to ensure that the IP packet
header is error-free through computation of a checksum at the routing nodes. This has the side-effect of
discarding packets with bad headers on the spot. In this case no notification is required to be sent to either
end node, although a facility exists in the Internet Control Message Protocol (ICMP) to do so.
IPv6, on the other hand, has abandoned the use of IP header checksums for the benefit of rapid forwarding
through routing elements in the network.
The resolution or correction of any of these reliability issues is the responsibility of an upper layer protocol. For
example, to ensure in-order delivery the upper layer may have to cache data until it can be passed to the
application.
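The caching an upper layer must do to restore in-order delivery can be sketched as a small reorder buffer. This is an illustrative sketch, assuming each packet carries a sequence number, roughly what TCP does internally.

```python
# Restore in-order delivery: cache out-of-order packets until the next
# expected sequence number arrives, then flush every contiguous run.
def reorder(packets):
    """packets: iterable of (sequence_number, data); returns data in order."""
    expected, cache, delivered = 0, {}, []
    for seq, data in packets:
        cache[seq] = data
        while expected in cache:          # flush as far as we can
            delivered.append(cache.pop(expected))
            expected += 1
    return delivered

# Packet 'B' (seq 1) arrives before packet 'A' (seq 0), as in the fault
# list above; the receiver still delivers A, B, C, D to the application.
arrivals = [(1, "B"), (0, "A"), (3, "D"), (2, "C")]
delivered = reorder(arrivals)             # ["A", "B", "C", "D"]
```

Duplicate arrivals are handled for free here, since a repeated sequence number just overwrites the cached copy.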
In addition to issues of reliability, this dynamic nature and the diversity of the Internet and its components
provide no guarantee that any particular path is actually capable of, or suitable for performing the data
transmission requested, even if the path is available and reliable. One of the technical constraints is the size of
data packets allowed on a given link. An application must assure that it uses proper transmission
characteristics. Some of this responsibility lies also in the upper layer protocols between application and IP.
Facilities exist to examine the maximum transmission unit (MTU) size of the local link, as well as for the entire
projected path to the destination when using IPv6. The IPv4 internetworking layer has the capability to
automatically fragment the original datagram into smaller units for transmission. In this case, IP does provide
re-ordering of fragments delivered out-of-order.[1]
Transmission Control Protocol (TCP) is an example of a protocol that will adjust its segment size to be smaller
than the MTU. User Datagram Protocol (UDP) and Internet Control Message Protocol (ICMP) disregard MTU
size thereby forcing IP to fragment oversized datagrams.[2]
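Fragmentation itself is conceptually simple: split the payload into pieces no larger than the link MTU, each tagged with its offset so the receiver can reassemble them in any arrival order. The sketch below simplifies the real IPv4 mechanism (which measures offsets in 8-byte units and carries flags in the IP header).

```python
# Toy IPv4-style fragmentation: split a payload into MTU-sized pieces,
# each tagged with its byte offset (real IPv4 uses 8-byte offset units).
def fragment(payload, mtu):
    return [(offset, payload[offset:offset + mtu])
            for offset in range(0, len(payload), mtu)]

def reassemble(fragments):
    # Offsets let the receiver restore order even if fragments arrive shuffled.
    return b"".join(data for _, data in sorted(fragments))

payload = b"x" * 3000
frags = fragment(payload, 1500)       # 2 fragments for a 1500-byte MTU
restored = reassemble(list(reversed(frags)))
```

Note that if any one fragment is lost, the whole datagram is lost, which is one reason protocols prefer to avoid fragmentation by respecting the path MTU.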
IP addressing and routing
Perhaps the most complex aspects of IP are IP addressing and routing. Addressing refers to how end hosts
are assigned IP addresses and how subnetworks of IP host addresses are divided and grouped together. IP
routing is performed by all hosts, but most importantly by internetwork routers, which typically use either
interior gateway protocols (IGPs) or exterior gateway protocols (EGPs) to help make IP datagram forwarding
decisions across IP-connected networks.
Version history
In May 1974, the Institute of Electrical and Electronics Engineers (IEEE) published a paper entitled "A Protocol
for Packet Network Interconnection."[3] The paper's authors, Vint Cerf and Bob Kahn, described an
internetworking protocol for sharing resources using packet-switching among the nodes. A central control
component of this model was the "Transmission Control Program" (TCP) that incorporated both connection-
oriented links and datagram services between hosts. The monolithic Transmission Control Program was later
divided into a modular architecture consisting of the Transmission Control Protocol at the connection-oriented
layer and the Internet Protocol at the internetworking (datagram) layer. The model became known informally
as TCP/IP, although formally it was henceforth referenced as the Internet Protocol Suite.
The Internet Protocol is one of the determining elements that define the Internet. The dominant
internetworking protocol (Internet Layer) in use today is IPv4, with the number 4 assigned as the formal protocol
version number carried in every IP datagram. IPv4 is described in RFC 791 (1981).
The successor to IPv4 is IPv6. Its most prominent modification from Version 4 is the addressing system. IPv4
uses 32-bit addresses (c. 4 billion, or 4.3×10^9, addresses) while IPv6 uses 128-bit addresses (c. 340 undecillion,
or 3.4×10^38 addresses). Although adoption of IPv6 has been slow, as of June 2008, all United States
government systems have demonstrated basic infrastructure support for IPv6 (if only at the backbone level).[4]
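The figures above follow directly from the address widths, as a quick check shows:

```python
# Address-space arithmetic: 32-bit IPv4 versus 128-bit IPv6.
ipv4_addresses = 2 ** 32      # 4,294,967,296, about 4.3 * 10**9
ipv6_addresses = 2 ** 128     # about 3.4 * 10**38
growth_factor = ipv6_addresses // ipv4_addresses   # 2**96
```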
Version numbers 0 through 3 were development versions of IPv4 used between 1977 and 1979. Version
number 5 was used by the Internet Stream Protocol (ST), an experimental stream protocol.
Version numbers 6 through 9 were proposed for various protocol models designed to replace IPv4: SIPP
(Simple Internet Protocol Plus, known now as IPv6), TP/IX (RFC 1475), PIP (RFC 1621) and TUBA (TCP and UDP
with Bigger Addresses, RFC 1347). Version number 6 was eventually chosen as the official assignment for the
successor Internet protocol, subsequently standardized as IPv6.
A humorous Request for Comments that made an IPv9 protocol the center of its storyline was published on April 1,
1994 by the IETF.[5] It was intended as an April Fools' Day joke. Other protocol proposals named "IPv9" and
"IPv8" have also briefly surfaced, though these came with little or no support from the wider industry and
academia.[6]
Chapter 5 – IP Routing
Published: February 08, 2005 | Updated: April 18, 2006
Writer: Joe Davies
Abstract
This chapter describes how IPv4 and IPv6 forward packets from a source to a destination and the basic
concepts of routing infrastructure. A network administrator must understand routing tables, route
determination processes, and routing infrastructure when designing IP networks and troubleshooting
connectivity problems.
A version of this chapter updated for Windows Vista and Windows Server 2008 is also available.
Chapter Objectives
After completing this chapter, you will be able to:
Define the basic concepts of IP routing, including direct and indirect delivery, routing tables and their
contents, and static and dynamic routing.
Explain how IPv4 routing works with the TCP/IP component of Windows®, including routing table
contents and the route determination process.
Define IPv4 route aggregation and route summarization.
Configure Windows hosts, static routers, and dynamic routers for routing.
Define network address translation and how it is used on the Internet.
Explain how IPv6 routing works with the IPv6 component of Windows, including routing table
contents and the route determination process.
Configure hosts and static routers for the IPv6 component of Windows.
Define the use of the Route, Netsh, Ping, Tracert, and Pathping tools in IPv4 and IPv6 routing.
IP Routing Overview
IP routing is the process of forwarding a packet based on the destination IP address. Routing occurs at a
sending TCP/IP host and at an IP router. In each case, the IP layer at the sending host or router must decide
where to forward the packet. For IPv4, routers are also commonly referred to as gateways.
To make these decisions, the IP layer consults a routing table stored in memory. Routing table entries are
created by default when TCP/IP initializes, and entries can be added either manually or automatically.
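The route-determination step that the IP layer performs against this table is a longest-prefix match: among all entries that match the destination, the most specific route wins. A sketch in Python, using illustrative routes and next-hop addresses rather than any real table:

```python
import ipaddress

# An illustrative routing table: (destination prefix, next-hop address).
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.0.1"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "10.0.0.254"),
    (ipaddress.ip_network("10.1.0.0/16"), "10.1.0.254"),
]

def next_hop(destination):
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    best = max(matches, key=lambda m: m[0].prefixlen)      # longest prefix wins
    return best[1]

hop_a = next_hop("10.1.2.3")    # matches /0, /8 and /16 -> "10.1.0.254"
hop_b = next_hop("8.8.8.8")     # only the default route -> "192.168.0.1"
```

The default route ::/0 (or 0.0.0.0/0 for IPv4) matches everything, which is why it is only used when no more specific entry applies.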
Direct and Indirect Delivery
Forwarded IP packets use one of two types of delivery, based on whether the IP packet is forwarded to
the final destination or to an IP router. These two types of delivery are known as direct
and indirect delivery.
Direct delivery occurs when the IP node (either the sending host or an IP router) forwards a packet to
the final destination on a directly attached subnet. The IP node encapsulates the IP datagram in a frame for the
Network Interface layer. For a LAN technology such as Ethernet or Institute of Electrical and Electronics
Engineers (IEEE) 802.11, the IP node addresses the frame to the destination’s media access control (MAC)
address.
Indirect delivery occurs when the IP node (either the sending host or an IP router) forwards a packet
to an intermediate node (an IP router) because the final destination is not on a directly attached subnet. For a
LAN technology such as Ethernet or IEEE 802.11, the IP node addresses the frame to the IP router’s MAC
address.
End-to-end IP routing across an IP network combines direct and indirect deliveries.
In Figure 5-1, when sending packets to Host B, Host A performs a direct delivery. When sending packets to
Host C, Host A performs an indirect delivery to Router 1, Router 1 performs an indirect delivery to Router 2,
and then Router 2 performs a direct delivery to Host C.
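The decision between direct and indirect delivery comes down to one test: is the destination on a directly attached subnet? A sketch in Python, with illustrative addresses standing in for Host A's configuration:

```python
import ipaddress

# Illustrative host configuration: one attached subnet and a default router.
local_subnet = ipaddress.ip_network("192.168.1.0/24")
default_router = "192.168.1.1"

def delivery(destination):
    if ipaddress.ip_address(destination) in local_subnet:
        return ("direct", destination)       # frame to the destination's MAC
    return ("indirect", default_router)      # frame to the router's MAC

host_b = delivery("192.168.1.7")   # on-link -> ("direct", "192.168.1.7")
host_c = delivery("10.5.5.5")      # off-link -> ("indirect", "192.168.1.1")
```

In both cases the IP destination address in the packet stays the same; only the MAC address the frame is sent to differs.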
Figure 5-10 An example of route aggregation for an IPv6 unicast address prefix
Windows Support for IPv6 Static Routing
The IPv6 protocol component of Windows supports static routing. You can configure a computer running
Windows Server 2003 or Windows XP as a static IPv6 router by enabling forwarding on the computer's
interfaces and then configuring it to advertise address prefixes to local hosts.
Figure 5-11 shows an example network using a simple static routing configuration. The configuration consists
of three subnets, three host computers running Windows XP or Windows Server 2003 (Host A, Host B, and
Host C), and two router computers running Windows XP or Windows Server 2003 (Router 1 and Router 2).
Figure 5-11 Static routing example with the IPv6 protocol component of Windows
After the IPv6 protocol is installed on all computers on this example network, you must enable forwarding and
advertising over the two network adapters of Router 1 and Router 2. Use the following command:
netsh interface ipv6 set interface InterfaceName|InterfaceIndex forwarding=enabled advertise=enabled
in which InterfaceName is the name of the network connection in the Network Connections folder and
InterfaceIndex is the interface index number from the display of the netsh interface ipv6 show interface
command. You can use either the interface name or its index number.
For example, for Router 1, if the interface index of the network adapter connected to Subnet 1 is 4 and the
interface index of the network adapter connected to Subnet 2 is 5, the commands would be:
netsh int ipv6 set int 4 forw=enabled adv=enabled
netsh int ipv6 set int 5 forw=enabled adv=enabled
You can abbreviate each Netsh parameter to its shortest unambiguous form.
After you enable forwarding and advertising, you must configure the routers with the address prefixes for their
attached subnets. For the IPv6 protocol component of Windows Server 2003 and Windows XP, you do this by
adding routes to the router's routing table with instructions to advertise the route. Use the following
command:
netsh interface ipv6 set route Address/PrefixLength InterfaceName|InterfaceIndex publish=yes
in which Address is the address portion of the prefix and PrefixLength is the prefix length portion of the prefix.
To publish a route (to include it in a router advertisement), you must specify publish=yes.
For example, for Router 1 using the example interface indexes, the commands are:
netsh int ipv6 set rou 2001:db8:0:1::/64 4 pub=yes
netsh int ipv6 set rou 2001:db8:0:2::/64 5 pub=yes
The result of this configuration is the following:
Router 1 sends Router Advertisement messages on Subnet 1. These messages contain a Prefix
Information option to autoconfigure addresses for Subnet 1 (2001:DB8:0:1::/64), a Maximum Transmission
Unit (MTU) option for the link MTU of Subnet 1, and a Route Information option for the subnet prefix of
Subnet 2 (2001:DB8:0:2::/64).
Router 1 sends Router Advertisement messages on Subnet 2. These messages contain a Prefix
Information option to autoconfigure addresses for Subnet 2 (2001:DB8:0:2::/64), an MTU option for the link
MTU of Subnet 2, and a Route Information option for the subnet prefix of Subnet 1 (2001:DB8:0:1::/64).
When Host A receives the Router Advertisement message, the host automatically configures a global address
on its network adapter interface with the prefix 2001:DB8:0:1::/64 and an Extended Unique Identifier (EUI)-64-
derived interface identifier. The host also adds a route for the locally attached Subnet 1 (2001:DB8:0:1::/64)
and a route for Subnet 2 (2001:DB8:0:2::/64) with the next-hop address of the link-local address of Router 1's
interface on Subnet 1 to its routing table.
When Host B receives the Router Advertisement message, the host automatically configures a global address
on its network adapter interface with the prefix 2001:DB8:0:2::/64 and an EUI-64-derived interface identifier.
The host also adds a route for the locally attached Subnet 2 (2001:DB8:0:2::/64) and a route for Subnet 1
(2001:DB8:0:1::/64) with the next-hop address of the link-local address of Router 1's interface on Subnet 2 to
its routing table.
In this configuration, Router 1 does not advertise itself as a default router (the Router Lifetime field in the
Router Advertisement message is set to 0), and the routing tables of Host A and Host B do not contain default
routes. A computer running the IPv6 protocol component for Windows Server 2003 or Windows XP will not
advertise itself as a default router unless a default route is configured to be published.
To continue this example configuration, the interface index of Router 2's network adapter connected to
Subnet 2 is 4, and the interface index of Router 2's network adapter connected to Subnet 3 is 5. To provide
connectivity between Subnet 2 and Subnet 3, you would issue the following commands on Router 2:
netsh int ipv6 set rou 2001:db8:0:2::/64 4 pub=yes
netsh int ipv6 set rou 2001:db8:0:3::/64 5 pub=yes
The result of this configuration is the following:
Router 2 sends Router Advertisement messages on Subnet 2. These messages contain a Prefix
Information option to autoconfigure addresses for Subnet 2 (2001:DB8:0:2::/64), an MTU option for the link
MTU of Subnet 2, and a Route Information option for the subnet prefix of Subnet 3 (2001:DB8:0:3::/64).
Router 2 sends Router Advertisement messages on Subnet 3. These messages contain a Prefix
Information option to autoconfigure addresses for Subnet 3 (2001:DB8:0:3::/64), an MTU option for the link
MTU of Subnet 3, and a Route Information option for the subnet prefix of Subnet 2 (2001:DB8:0:2::/64).
When Host B receives the Router Advertisement message from Router 2, the host does not automatically
configure a global address using the 2001:DB8:0:2::/64 prefix, because a global address with that prefix already
exists. Host B also adds a route for Subnet 3 (2001:DB8:0:3::/64) with the next-hop address of the link-local
address of Router 2's interface on Subnet 2 to its routing table.
When Host C receives the Router Advertisement message, the host automatically configures a global address
on its network adapter interface with the prefix 2001:DB8:0:3::/64 and an EUI-64-derived interface identifier.
It also adds a route for the locally attached subnet (Subnet 3) (2001:DB8:0:3::/64) and a route for Subnet 2
(2001:DB8:0:2::/64) with the next-hop address of the link-local address of Router 2's interface on Subnet 3 to
its routing table.
The result of this configuration is that, although Host B can communicate with both Host A and Host C, Host A
and Host C cannot communicate because Host A has no routes to Subnet 3 and Host C has no routes to Subnet
1. You can solve this problem in either of two ways:
Configure Router 1 to publish a route to Subnet 3 with the next-hop address of Router 2's link-local
address on Subnet 2, and configure Router 2 to publish a route to Subnet 1 with the next-hop address of
Router 1's link-local address on Subnet 2.
Configure Router 1 to publish a default route with the next-hop address of Router 2's link-local
address on Subnet 2, and configure Router 2 to publish a default route with the next-hop address of Router 1's
link-local address on Subnet 2.
For the first solution, Router 1 will advertise two Route Information options on Subnet 1—one for Subnet 2
and one for Subnet 3. Therefore, Host A will add two routes to its routing table—one for 2001:DB8:0:2::/64
and one for 2001:DB8:0:3::/64. Router 1 will continue to advertise only one Route Information option (for Subnet 1)
on Subnet 2. Similarly, Router 2 will advertise two Route Information options on Subnet 3—one for Subnet 1
and one for Subnet 2. Therefore, Host C will add two routes to its routing table—one for 2001:DB8:0:1::/64
and one for 2001:DB8:0:2::/64. Router 2 will continue to advertise only one Route Information option (for Subnet 3)
on Subnet 2. The result of this configuration is that all the hosts and all the routers have specific routes to all
the subnets.
For the second solution, Router 1 will advertise itself as a default router with one Route Information option
(for Subnet 2) on Subnet 1. Therefore, Host A will add two routes to its routing table—one for the default
route ::/0 and one for 2001:DB8:0:2::/64. Similarly, Router 2 will advertise itself as a default router with one
Route Information option (for Subnet 2) on Subnet 3. Therefore, Host C will add two routes to its routing table
—one for the default route ::/0 and one for 2001:DB8:0:2::/64. The result of this configuration is that all the
hosts and all the routers have a combination of specific and general routes to all the subnets, with the
exception of Host B, which has only specific routes to all the subnets. The problem with solution 2 is that
Router 1 and Router 2 have default routes pointing to each other. Any non-link-local traffic sent from Host A or
Host C that does not match the prefix 2001:DB8:0:1::/64, 2001:DB8:0:2::/64, or 2001:DB8:0:3::/64 is sent in a
routing loop between Router 1 and Router 2.
You could extend this network of three subnets and two routers to include more subnets and more routers.
However, the administrative overhead to manage the configuration of the static routers does not scale. At
some point, you would want to use an IPv6 routing protocol.
Configuring Hosts for IPv6 Routing
IPv6 hosts are configured for routing through the router discovery process, which requires no configuration.
When an initializing IPv6 host receives a Router Advertisement message, IPv6 automatically configures the
following:
On-link subnet prefixes that correspond to autoconfiguration address prefixes contained within the
Router Advertisement message.
Off-link subnet prefixes that correspond to specific routes contained within the Router Advertisement
message.
A default route, if the router sending the Router Advertisement message is advertising itself as a
default router.
Because the typical IPv6 host automatically configures all the routes that it needs to forward
packets to an arbitrary destination, you do not need to configure routes on IPv6 hosts.
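The three kinds of routes the host derives from a Router Advertisement can be sketched as a small function. The message fields below are a simplified stand-in for the real RA option format, and the prefixes echo the example network above.

```python
import ipaddress

# Sketch of router-discovery autoconfiguration: build a host's routes from
# the options in a Router Advertisement (simplified message structure).
def routes_from_ra(ra, router_link_local):
    routes = []
    for prefix in ra["autoconfig_prefixes"]:       # on-link subnet prefixes
        routes.append((ipaddress.ip_network(prefix), "on-link"))
    for prefix in ra["route_information"]:         # off-link specific routes
        routes.append((ipaddress.ip_network(prefix), router_link_local))
    if ra["router_lifetime"] > 0:                  # default route, if offered
        routes.append((ipaddress.ip_network("::/0"), router_link_local))
    return routes

# Router 1's advertisement on Subnet 1: it is not a default router
# (Router Lifetime is 0), so Host A gets no default route.
ra = {"autoconfig_prefixes": ["2001:db8:0:1::/64"],
      "route_information": ["2001:db8:0:2::/64"],
      "router_lifetime": 0}
table = routes_from_ra(ra, "fe80::1")              # two routes, no default
```

With router_lifetime set to a positive value, the same function would also add the ::/0 default route, matching the behavior described for published default routes.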
Routing Tools
Windows Server 2003 and Windows XP operating systems include the following command-line utilities that
you can use to test reachability and routing and to maintain the routing tables:
Route
Displays the local IPv4 and IPv6 routing tables. You can use the Route tool to add temporary and persistent
routes, change existing routes, and remove routes from the IPv4 routing table.
Netsh interface ipv6
Displays the IPv6 routing table (netsh interface ipv6 show routes), adds routes (netsh interface ipv6 add
route), removes routes (netsh interface ipv6 delete route), and modifies existing routes (netsh interface ipv6
set route).
Ping
Verifies IP-level connectivity to another TCP/IP computer by sending either ICMP Echo or ICMPv6 Echo Request
messages. The tool displays the receipt of corresponding Echo Reply messages, along with round-trip times.
Ping is the primary TCP/IP tool used to troubleshoot connectivity, reachability, and name resolution.
Tracert
Determines the path taken to a destination by sending ICMP Echo or ICMPv6 Echo Request messages to the
destination with incrementally increasing Time to Live (TTL) or Hop Count field values. The path displayed is
the list of near-side router interfaces of the routers in the path between a source host and a destination. The
near-side interface is the interface of the router that is closest to the sending host in the path.
Pathping
Provides information about network latency and network loss at intermediate hops between a source and a
destination. Pathping sends multiple ICMP Echo or ICMPv6 Echo Request messages to each router between a
source and destination over a period of time and then computes results based on the packets returned from
each router. Because Pathping displays the degree of packet loss at any given router or link, you can determine
which routers or links might be having network problems.
Chapter Summary
The chapter includes the following pieces of key information:
IP routing is the process of forwarding a packet based on the destination IP address. IP uses a routing
table to determine the next-hop IP address and interface for a packet being sent or forwarded.
IP routing is a combination of direct and indirect deliveries. Direct delivery occurs when the IP node
forwards a packet to the final destination on a directly attached subnet, and indirect delivery occurs when the
IP node forwards a packet to an intermediate router.
Static routing relies on the manual administration of the routing table. Dynamic routing relies on
routing protocols, such as RIP and OSPF, to dynamically update the routing table through the exchange of
routing information between routers.
The TCP/IP component of Windows uses a local IPv4 routing table to determine the route used to
forward the packet. From the chosen route, the next-hop IPv4 address and interface are determined. IPv4
hands the packet to ARP to resolve the next-hop address to a MAC address and send the packet. You can use
the route print command to view the IPv4 routing table for the TCP/IP component of Windows.
Rather than use routes for the address prefixes of every subnet in your network, you can use route
summarization to advertise a summarized address prefix that includes all the subnets in a specific region of
your network.
An IPv4 host is configured with a default gateway. IPv4 static routers are configured with either
subnet routes or summarized routes. IPv4 dynamic routers are configured with the settings that allow them to
exchange routing information with neighboring routers.
A network address translator (NAT) is an IPv4 router that can translate the IP addresses and TCP/UDP
port numbers of packets as they are forwarded. A NAT allows a small network to share a single public IPv4
address.
The IPv6 component of Windows uses a local IPv6 routing table to determine the route used to
forward the packet. From the chosen route, IPv6 determines the next-hop IPv6 address and interface. IPv6
hands the packet to the Neighbor Discovery process to resolve the next-hop address to a MAC address and
send the packet. You can use the route print or netsh interface ipv6 show routes command to view the
routing table for the IPv6 component of Windows.
IPv6 hosts automatically configure themselves with routing information based on the receipt of
Router Advertisement messages. You must use netsh interface ipv6 commands to manually enable and
configure routers running the IPv6 component of Windows to advertise address prefixes and routes.
You use the Route and Netsh tools to manage IP routing tables. You use the Ping tool to test basic
reachability. You use the Tracert tool to show the path that a packet takes from source to a destination. You
use the Pathping tool to test for link and router reliability in a path from a source to a destination.
Chapter Glossary
default gateway – A configuration parameter for the Internet Protocol (TCP/IP) component that is the IPv4
address of a neighboring IPv4 router. Configuring a default gateway creates a default route in the IPv4 routing
table.
default route – A route that summarizes all possible destinations and is used for forwarding when the routing
table does not contain any other more specific routes for the destination. For example, if a router or sending
host cannot find a subnet route, a summarized route, or a host route for the destination, IP selects the default
route. The default route is used to simplify the configuration of hosts and routers. For IPv4 routing tables, the
default route is the route with the network destination of 0.0.0.0 and netmask of 0.0.0.0. For IPv6 routing
tables, the default route has the address prefix ::/0.
direct delivery – The delivery of an IP packet by an IP node to the final destination on a directly attached
subnet.
distance vector – A routing protocol technology that propagates routing information in the form of an address
prefix and its “distance” (hop count).
host route – A route to a specific IP address. Host routes allow packets to be routed on a per-IP address basis.
For IPv4 host routes, the route prefix is a specific IPv4 address with a 32-bit prefix length. For IPv6 host routes,
the route prefix is a specific IPv6 address with a 128-bit prefix length.
indirect delivery – The delivery of an IP packet by an IP node to an intermediate router.
link state – A routing protocol technology that exchanges routing information consisting of a router’s attached
subnet prefixes and their assigned costs. Link state information is advertised upon startup and when changes
in the network topology are detected.
longest matching route – The algorithm used to select the routes in the routing table that most closely match
the destination address of the packet being sent or forwarded.
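To make the longest-matching-route selection concrete, here is a minimal sketch using Python's standard ipaddress module; the routing-table entries and next-hop addresses are purely illustrative:

```python
import ipaddress

# Illustrative routing table: (prefix, next hop). The default route is 0.0.0.0/0.
routes = [
    ("0.0.0.0/0", "192.168.0.1"),     # default route
    ("10.0.0.0/8", "192.168.0.2"),    # summarized route
    ("10.1.1.0/24", "192.168.0.3"),   # subnet route
    ("10.1.1.7/32", "192.168.0.4"),   # host route
]

def next_hop(destination):
    """Select the matching route with the longest prefix length."""
    dst = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if dst in ipaddress.ip_network(prefix)
    ]
    # The most specific (longest) matching prefix wins.
    return max(candidates, key=lambda c: c[0].prefixlen)[1]

print(next_hop("10.1.1.7"))    # the /32 host route wins
print(next_hop("10.2.3.4"))    # the /8 summarized route wins
print(next_hop("172.16.0.1"))  # only the default route matches
```

A default route always matches, so a destination falls through to it only when no more specific route exists.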
NAT – See network address translator (NAT).
network address translator (NAT) – An IPv4 router that translates addresses and ports when forwarding
packets between a privately addressed network and the Internet.
next-hop determination – The process of determining the next-hop address and interface for sending or
forwarding a packet, based on the contents of the routing table.
Open Shortest Path First (OSPF) – A link state-based routing protocol for use within a single autonomous
system. An autonomous system is a portion of the network under the same administrative authority.
OSPF – See Open Shortest Path First (OSPF).
path vector – A routing protocol technology that exchanges sequences of hop information that indicate the
path for a route. For example, BGP-4 exchanges sequences of autonomous system numbers.
RIP – See Routing Information Protocol (RIP).
route determination process – The process of determining which single route in the routing table to use for
forwarding a packet.
route summarization – The practice of using address prefixes to summarize the address spaces of regions of a
network, rather than using the routes for individual subnets.
router – An IPv4 or IPv6 node that can forward received packets that are not addressed to itself (also called a
gateway for IPv4).
Router Advertisement – For IPv4, a message sent by a router that supports ICMP router discovery. For IPv6, an
IPv6 Neighbor Discovery message sent by a router that typically contains at least one Prefix Information
option, from which hosts create stateless autoconfigured unicast IPv6 addresses and routes.
router discovery – For IPv4, the ability of hosts to automatically configure and reconfigure a default gateway.
For IPv6, a Neighbor Discovery process in which a host discovers the neighboring routers on an attached link.
Routing Information Protocol (RIP) – A distance vector-based routing protocol used in small and medium sized
networks.
routing protocols – A series of periodic or on-demand messages that contain routing information that is
exchanged between dynamic routers.
routing table – The set of routes used to determine the next-hop address and interface for IP traffic sent by a
host or forwarded by a router.
static routing – The use of manually configured routes in the routing tables of routers.
supernetting – The obsolete use of route summarization to assign blocks of Class C address prefixes on the
Internet.
Network Layer Task
The network layer is concerned with getting packets from the source all the way to the destination, which
may require many hops through intermediate routers. This contrasts with the data link layer, which just
moves frames from one end of a wire to the other. To achieve its goals, the network layer must know the
topology of the communication subnet (the set of all routers) and choose appropriate paths through it. It
must take care to choose routes that avoid overloading some of the lines and routers while leaving others
idle. When the source and destination are in different networks, it must also deal with the differences
between them.
Services to the Transport Layer
The network layer services to the transport layer have been designed with the following goals: the services
should be independent of the router technology; the transport layer should be shielded from the number,
type, and topology of the subnets present; and the network addresses made available to the transport layer
should use a uniform numbering plan across LANs and WANs. The Internet community argues that the
subnet is inherently unreliable, so the hosts should do error control and flow control; the service should
thus be connectionless, but as reliable as possible, with the complexity placed on the hosts. The telephone
companies argue that the subnet should provide a reliable, connection-oriented service, placing the
complexity in their subnets.
Comparison of Virtual-Circuit and Datagram Subnets
Routing Algorithms
The following properties are desirable in a routing algorithm: correctness and simplicity; robustness against
software and hardware failures, traffic changes, and topology changes over very long periods; stability (some
algorithms never converge to an equilibrium); fairness and optimality, which are conflicting goals; and a
choice between static and adaptive operation.
The Optimality Principle
The optimality principle states that if router J is on the optimal path from router I to router K, then the
optimal path from J to K also falls along the same route. The quantity being optimized must take positive
values. As a consequence, the set of optimal routes from all sources to a given destination forms a tree
(a sink tree) rooted at the destination.
Shortest Path Routing
The subnet is modeled as an undirected graph: each node is a router, each arc is a communication link, and
each arc is labeled with a length. Dijkstra's (or another) algorithm is used to compute the path with the
shortest length between any two nodes. Several metrics of length are possible. When using the number of
hops as a measure, each arc has a length of 1, as in the figure above. If physical distance were taken as the
measure, M would go over L. In general, the labels on the arcs can be computed as a function of distance,
bandwidth, average traffic, communication costs, mean queue length, measured delay, etc.
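The computation described above can be sketched with Dijkstra's algorithm over a small hypothetical subnet (router names and arc lengths are invented for illustration):

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm on an undirected graph given as
    {node: {neighbor: arc_length}}; returns each node's distance from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, length in graph[node].items():
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Hypothetical subnet: routers A..E, arcs labeled with lengths.
subnet = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1, "E": 3},
    "E": {"D": 3},
}
print(shortest_paths(subnet, "A"))  # A reaches D at cost 4 via B and C
```

Using hop count as the metric simply means setting every arc length to 1.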
Flooding
Another static algorithm is flooding, in which every incoming packet is sent out on every outgoing line
except the one it arrived on. It generates a vast number of duplicate packets, an infinite number unless
measures are taken to damp the process, e.g. a hop counter in the header of each packet, which is
decremented at each hop; the packet is discarded when the counter reaches 0. In selective flooding, packets
are sent out only on those lines that go approximately in the right direction. Flooding might be usable in
military applications, where large numbers of routers may be blown to pieces at any instant, because it is
very robust. It is also useful during the initialization of routers.
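The hop-counter damping can be sketched as a small simulation (the topology here is invented for illustration); without the counter the duplicates would multiply forever:

```python
from collections import deque

def flood(graph, source, hop_limit):
    """Simulate flooding with a hop counter: each router forwards the packet on
    every line except the one it arrived on, until the counter reaches 0."""
    delivered = set()   # routers that have seen the packet
    transmissions = 0
    queue = deque([(source, None, hop_limit)])  # (router, arrived_from, hops left)
    while queue:
        router, came_from, hops = queue.popleft()
        delivered.add(router)
        if hops == 0:
            continue  # counter exhausted: discard instead of forwarding
        for neighbor in graph[router]:
            if neighbor != came_from:
                transmissions += 1
                queue.append((neighbor, router, hops - 1))
    return delivered, transmissions

# Small illustrative topology.
net = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
reached, sent = flood(net, "A", hop_limit=3)
```

Even in this tiny four-router network, 8 transmissions are needed to reach every router, showing how quickly duplicates accumulate.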
Distance Vector Routing
A routing table in each router contains, for each destination router, the preferred outgoing line for that
destination and an estimate of the "cost" of reaching it. The cost metric might be the number of hops, queue
length, time delay, etc. Time delay is measured by periodically sending ECHO packets. Once every T msec,
each router sends its neighbors a list of estimated "costs" to each destination.
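One table-merge step of this exchange can be sketched as follows (router names, costs, and table layout are illustrative):

```python
def dv_update(self_name, own_table, neighbor, neighbor_table, link_cost):
    """One distance vector exchange: merge the costs a neighbor advertises.
    Tables map destination -> (estimated cost, preferred outgoing line)."""
    for dest, (cost, _) in neighbor_table.items():
        if dest == self_name:
            continue  # ignore routes back to ourselves
        new_cost = link_cost + cost
        if dest not in own_table or new_cost < own_table[dest][0]:
            own_table[dest] = (new_cost, neighbor)  # route via this neighbor

# Router A reaches C directly at cost 7; neighbor B (link cost 1) advertises C at 2.
table_a = {"B": (1, "B"), "C": (7, "C")}
dv_update("A", table_a, "B", {"A": (1, "A"), "C": (2, "C")}, link_cost=1)
# A now prefers reaching C via B at total cost 3.
```

Each router runs this merge for every neighbor's periodic advertisement, which is exactly where the slow reaction to bad news (count to infinity) arises: a cost increase propagates only one exchange at a time.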
Link State Routing
Distance vector routing reacts slowly to bad news, e.g. the breakdown of a link (the count-to-infinity
problem). The core of the problem is that when X tells Y that it has a path somewhere, Y has no way of
knowing whether it is itself on that path. In link state routing, each router sends the costs to its neighbors
to all other routers. Each router must: discover its neighbors and learn their network addresses; measure
the delay or cost to each of its neighbors; construct a packet telling all it has just learned; send this packet
to all other routers; and compute the shortest path to every other router.
Link State Packets
The trickiest part is distributing the link state packets reliably, to ensure that each router has essentially the
same view of the subnet. A 32-bit sequence number (sufficient for 137 years if incremented every second) is
used. An age field is decremented every second and at every forwarding, and the packet is discarded when
the age reaches 0. All link state packets are acknowledged.
Broadcast Routing
Router I has to send a message to all other routers. Using the sink tree (b) is fastest, e.g. 4 hops and 14
messages sent, but all routers must know the sink tree. With reverse path forwarding, when a router
receives a broadcast packet, it checks whether the packet arrived on the line the router uses for sending
packets to the source of the broadcast. If so, it sends the packet out over all other links; otherwise it
discards it. This takes, e.g., 5 hops and 24 packets.
Routing for Mobile Hosts
Each mobile host has a fixed home agent. When the host moves, it registers with a foreign agent, which
forwards the registration information to the home agent.
Routing for Mobile Hosts (2)
Congestion
Congestion control and flow control are often confused because some congestion control algorithms
operate by sending messages back to various sources, telling them to "slow down". Thus a host can get a
"slow down" message either because the receiver on the direct link cannot handle the load or because the
network cannot handle it. When too many packets are present, buffers fill up, packets are discarded,
retransmissions increase, and fewer packets are delivered. Congestion thus tends to feed upon itself and
become worse, leading to collapse of the system.
Congestion Control in VC Subnets
With admission control, once congestion has been signaled, no more virtual circuits are set up until the
problem has gone away. This is a crude approach, but it is simple and easy to implement. Another way is to
route new connections around the problem area: the subnet is "redrawn", leaving congested routers and
all their lines out, and the best route for a new connection is then determined in that reduced subnet. A
step further is to also avoid routers that are directly connected to the congested routers.
Hop-by-Hop Choke Packets
(a) A choke packet that affects only the source. (b) A choke packet that affects each hop it passes through.
Jitter Control
For applications such as audio and video streaming, it does not matter much whether the packets take 20
or 30 msec to be delivered, as long as the transit time is constant; the jitter should be small. In some
applications, like video on demand, jitter can be compensated for by buffering at the receiver. For others,
like Internet telephony or videoconferencing, the delay inherent in buffering is not acceptable.
Buffering
Constant bit rate (e.g. telephony) attempts to simulate a wire, providing uniform bandwidth and delay.
Variable bit rate (e.g. compressed videoconferencing): images must arrive on time, independently of how
much they could be compressed. Non-real-time variable bit rate (e.g. watching a movie over the Internet):
a lot of buffering at the receiver is allowed. Available bit rate (e.g. file transfer): not sensitive to jitter or
delay.
The Leaky Bucket Algorithm
(a) A leaky bucket with water. (b) A leaky bucket with packets.
The Token Bucket Algorithm
(a) Before. (b) After.
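A minimal token bucket sketch, with illustrative rate and burst parameters: tokens accumulate at a fixed rate up to the bucket capacity, so short bursts are allowed while the long-term rate stays bounded.

```python
class TokenBucket:
    """Token bucket traffic shaper: sending consumes tokens; tokens refill
    at a fixed rate, capped at the bucket capacity (the maximum burst)."""
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.last = 0.0

    def allow(self, now, packet_cost=1):
        # Refill for the elapsed time, never exceeding the capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_cost:
            self.tokens -= packet_cost
            return True
        return False  # not enough tokens: queue or drop the packet

bucket = TokenBucket(rate=2, capacity=4)  # 2 tokens/s, bursts of up to 4
sends = [bucket.allow(t) for t in [0, 0, 0, 0, 0, 1.0]]
# A burst of 4 is admitted at t=0, the 5th packet is refused,
# and by t=1.0 enough tokens have accumulated to send again.
```

The leaky bucket differs in that it drains at a constant rate regardless of idle time, so it permits no bursts.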
Tunneling
Internetwork Routing
This is similar to routing within a single subnet, with some additional complications. Routers might prefer
routes with no protocol conversions over ones that use tunneling, and those over ones that need protocol
conversions, even if those routes are longer. Subnets might be run by different carriers with different
charging algorithms, so routing might be based on cost. When international boundaries are crossed,
various laws and political issues come into play.
Design Principles for the Internet
Make sure it works. Keep it simple. Make clear choices. Exploit modularity. Expect heterogeneity. Avoid
static options and parameters. Look for a good design; it need not be perfect. Be strict when sending and
tolerant when receiving. Think about scalability. Consider performance and cost.
Collection of Subnetworks
At the network layer, the Internet can be viewed as a collection of subnetworks or Autonomous Systems
(ASes). There is no real structure, but several major backbones exist, constructed from high-bandwidth lines
and fast routers.
The IPv4 Protocol
The IHL field tells how long the header is, in 32-bit words. The Type of Service field contains a 3-bit
Precedence field, used for priority from 0 (normal) to 7 (network control packet), and 3 flags, Delay,
Throughput, and Reliability, to specify what is most important for the packet. In practice, current routers
mostly ignore the TOS field, although the situation is changing.
Some Options for IPv4
The Time to Live field is a counter used to limit packet lifetimes; it must be decremented at each hop, and
the packet is discarded when the TTL hits 0. The Protocol field tells the receiving host which transport
process (TCP, UDP, or other) the packet should be given to. The Header Checksum verifies the header only;
it is useful for detecting errors caused by bad memory bytes or corrupted software inside a router. It must
be recomputed at each hop, because the TTL changes.
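The header checksum is the Internet checksum of RFC 1071: the one's-complement sum of the header's 16-bit words, computed with the checksum field itself set to zero. A minimal sketch, using a well-known 20-byte example header:

```python
def internet_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words (RFC 1071), as used for the
    IPv4 header checksum. The checksum field is zero during computation."""
    if len(header) % 2:
        header += b"\x00"  # pad odd-length input
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

# Example IPv4 header with the checksum field (bytes 10-11) zeroed:
hdr = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(hdr)  # 0xb861 for this well-known example
```

A receiver verifies the header by summing it with the checksum in place; a correct header yields a result of 0.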
IP Addresses
The class A, B, and C formats allow for 126, 16,382, and 2 million networks with 16 million, 64K, and 254
hosts respectively. Network addresses were given to organizations, leading to many unused host numbers.
Special IP Addresses
IP addresses of the form 10.x.y.z (and two other ranges, 172.16.0.0/12 and 192.168.0.0/16) are intended for
use within a LAN (a company network, or nowadays a home network). They are not intended to go on the
public Internet.
Subnets
Same network number, but internal subnets for departments.
CIDR – Classless InterDomain Routing
Class A and B networks were all given out, and class C networks are often too small. The basic idea is to
allocate the remaining class C networks (more than 2 million) in variable-sized blocks; a site needing 8000
addresses then gets 32 contiguous class C networks. The world was divided up into 4 zones. A site outside
Europe that gets a packet destined for 194... or 195... can just send it to its standard European gateway. In
effect, 32 million addresses have been compressed into one routing table entry.
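The aggregation can be sketched with Python's standard ipaddress module; the base address 194.32.0.0 is a hypothetical allocation chosen so the block is properly aligned:

```python
import ipaddress

# Hypothetical allocation: 32 contiguous class C (/24) networks starting at
# 194.32.0.0 cover 32 * 256 = 8192 addresses, enough for a site needing 8000.
class_c_nets = [ipaddress.ip_network(f"194.32.{i}.0/24") for i in range(32)]

# CIDR lets a router advertise them as one summarized prefix instead of 32.
summary = list(ipaddress.collapse_addresses(class_c_nets))
print(summary)  # [IPv4Network('194.32.0.0/19')]
```

Note that aggregation only works when the block is aligned on its own size: 32 networks collapse into a single /19 only if the third octet of the base address is a multiple of 32.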
NAT – Network Address Translation
NAT in effect makes the IP network connection-oriented, as the NAT box maintains state for each
connection passing through it; a crash of the NAT box terminates every TCP connection through it. Some
protocols send IP addresses (and port numbers) in their data, to be used by the other side; they have been
adapted, or other workarounds are used.
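The per-connection state the text refers to is essentially a translation table. A minimal sketch (addresses and the port pool are illustrative):

```python
import itertools

class Nat:
    """Minimal NAT sketch: rewrite (private address, port) to (public address,
    allocated port) on the way out, and reverse the mapping on the way in."""
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.ports = itertools.count(60000)  # hypothetical port pool
        self.out_map = {}  # (private_ip, private_port) -> public_port
        self.in_map = {}   # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port):
        key = (src_ip, src_port)
        if key not in self.out_map:
            port = next(self.ports)          # allocate a public port
            self.out_map[key] = port
            self.in_map[port] = key
        return self.public_ip, self.out_map[key]

    def inbound(self, public_port):
        # This table IS the connection state: if the box crashes, the
        # mappings are lost and every translated connection dies.
        return self.in_map[public_port]

nat = Nat("198.51.100.7")
mapped = nat.outbound("10.0.0.5", 4321)  # ('198.51.100.7', 60000)
```

A reply arriving at the allocated public port is translated back to the original private address and port by the inbound lookup.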
Internet Control Message Protocol
When something unexpected occurs in a router or host, the event is reported by ICMP. The most important
messages are listed in the table. ICMP is also used by routers to test the Internet or to obtain information
for use in routing decisions. Each ICMP message is encapsulated in an IP packet.
ARP – The Address Resolution Protocol
IP addresses must be linked to data link layer addresses, such as Ethernet addresses. With ARP, a host
broadcasts a frame asking who owns a certain IP address, like E1 asking for 192.31.65.5. Host E2 alone
answers with a frame giving its IP and Ethernet addresses. Entries in the ARP cache time out to allow for
hardware changes.
Dynamic Host Configuration Protocol
When a computer boots up, what is its IP address? It could be a fixed number stored in the computer, but
this requires administrative procedures, which cost time and are error prone. DHCP (Dynamic Host
Configuration Protocol) assigns IP addresses dynamically. Older protocols for this purpose are RARP and
BOOTP.
OSPF and BGP
The Internet is made up of a large number of autonomous systems (ASes). Each AS can use its own routing
algorithm (called an interior gateway protocol) inside. The OSPF (Open Shortest Path First) algorithm
became a standard in 1990. It supports 3 kinds of connections and networks: point-to-point lines,
multi-access networks with broadcasting (most LANs), and multi-access networks without broadcasting
(most packet-switched WANs).
OSPF
OSPF distinguishes 4 classes of routers: internal routers, area border routers, backbone routers, and AS
boundary routers. The classes may overlap; e.g., all area border routers are backbone routers. Type of
service is handled by having 3 graphs, labeled with costs when delay, throughput, or reliability is the metric.
OSPF uses raw IP packets between adjacent routers. It is inefficient to have every router on a LAN talk to
every other router on that LAN, so one router is elected as the designated router, which is adjacent to all
the other routers. A backup designated router is kept up to date in case the primary designated router
crashes. Another interior gateway protocol is IS-IS.
BGP – Exterior Gateway Routing Protocol
BGP (Border Gateway Protocol) routers have to worry about policy, such as: no transit traffic through
certain ASes, never put Iraq on a route starting from the Pentagon, etc. These policies are manually
configured into each BGP router. BGP is basically a distance vector routing protocol, but instead of
maintaining just the cost to each possible destination, each BGP router keeps track of the exact path used
and sends this information periodically to its neighbors.
IPv6
The major goals of the new IPv6 protocol were: support billions of hosts, even with inefficient address
space allocation; reduce the size of the routing tables; simplify the protocol, to allow routers to process
packets faster; provide better security (authentication and privacy); pay more attention to type of service,
particularly for real-time data; aid multicasting by allowing scopes to be specified; make it possible for a
host to roam without changing its address; allow the protocol to evolve in the future; and permit the old
and new protocols to coexist for years.
The Main IPv6 Header
The flow label is still experimental but will be used to allow a source and destination to set up a
pseudoconnection with particular properties and requirements. The Traffic Class field is used to distinguish
between packets whose sources can be flow controlled (values between 0 and 7) and those that cannot
(values between 8 and 15).
Extension Headers
The use of jumbograms is important for supercomputer applications that must transfer gigabytes efficiently
across the Internet. The routing header lists up to 24 routers that must be visited on the way to the
destination. Both strict routing (the full path is supplied) and loose routing (only selected routers are
supplied) are available, and they can be combined.
Addresses
In addition to multicast, anycast is also supported: the destination is a group of addresses, but the packet is
delivered to just one of them, usually the nearest one. This can be used, for example, to contact a group of
cooperating file servers. The 16-byte addresses are written as 8 groups of 4 hexadecimal digits with colons
between the groups; leading 0s can be left out, and one or more groups of 16 zero bits can be replaced by a
pair of colons: 8000::123:4567:89AB:CDEF.
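These notation rules are implemented by Python's standard ipaddress module, which can be used to move between the full and compressed forms (note it normalizes hex digits to lowercase):

```python
import ipaddress

# The example address from the text, written out in full:
addr = ipaddress.ip_address("8000:0000:0000:0000:0123:4567:89AB:CDEF")

print(addr.compressed)  # leading zeros dropped, zero groups become "::"
print(addr.exploded)    # all 8 groups of 4 hex digits restored
```

Both forms parse back to the same address, so either can be used interchangeably in configuration.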
Network Layer
DHCP Dynamic Host Configuration Protocol
DVMRP Distance Vector Multicast Routing Protocol
ICMP/ICMPv6 Internet Control Message Protocol
IGMP Internet Group Management Protocol
IP Internet Protocol version 4
IPv6 Internet Protocol version 6
MARS Multicast Address Resolution Server
PIM Protocol Independent Multicast-Sparse Mode (PIM-SM)
RIP2 Routing Information Protocol version 2
RIPng for IPv6 Routing Information Protocol for IPv6
RSVP Resource ReSerVation setup Protocol
VRRP Virtual Router Redundancy Protocol
Transport Layer
ISTP
Mobile IP Mobile IP Protocol
RUDP Reliable UDP
TALI Transport Adapter Layer Interface
TCP Transmission Control Protocol
UDP User Datagram Protocol
Van Jacobson compressed TCP
XOT X.25 over TCP
Session Layer
BGMP Border Gateway Multicast Protocol
Diameter
DIS Distributed Interactive Simulation
DNS Domain Name Service
ISAKMP/IKE Internet Security Association and Key Management Protocol and
Internet Key Exchange Protocol
iSCSI Small Computer Systems Interface
LDAP Lightweight Directory Access Protocol
MZAP Multicast-Scope Zone Announcement Protocol
NetBIOS/IP NetBIOS/IP for TCP/IP Environment
Application Layer
COPS Common Open Policy Service
FANP Flow Attribute Notification Protocol
Finger User Information Protocol
FTP File Transfer Protocol
HTTP Hypertext Transfer Protocol
IMAP4 Internet Message Access Protocol rev 4
IMPPpre/IMPPmes Instant Messaging and Presence Protocols
IPDC IP Device Control
IRC Internet Relay Chat Protocol
ISAKMP Internet Security Association and Key Management Protocol
ISP
NTP Network Time Protocol
POP3 Post Office Protocol version 3
Radius Remote Authentication Dial In User Service
RLOGIN Remote Login
RTSP Real-time Streaming Protocol
SCTP Stream Control Transmission Protocol
S-HTTP Secure Hypertext Transfer Protocol
SLP Service Location Protocol
SMTP Simple Mail Transfer Protocol
SNMP Simple Network Management Protocol
SOCKS Socket Secure (Server)
TACACS+ Terminal Access Controller Access Control System
TELNET TCP/IP Terminal Emulation Protocol
TFTP Trivial File Transfer Protocol
WCCP Web Cache Coordination Protocol
X-Window X Window
Routing
BGP-4 Border Gateway Protocol
EGP Exterior Gateway Protocol
EIGRP Enhanced Interior Gateway Routing Protocol
HSRP Cisco Hot Standby Router Protocol
IGRP Interior Gateway Routing Protocol
NARP NBMA Address Resolution Protocol
NHRP Next Hop Resolution Protocol
OSPF Open Shortest Path First
TRIP Telephony Routing over IP
Tunneling
ATMP Ascend Tunnel Management Protocol
L2F The Layer 2 Forwarding Protocol
L2TP Layer 2 Tunneling Protocol
PPTP Point to Point Tunneling Protocol
Security
AH Authentication Header
ESP Encapsulating Security Payload
TLS Transport Layer Security Protocol
The TCP/IP suite is illustrated here in relation to the OSI model:
At almost regular intervals the debate of RISC vs. CISC sweeps through the comp.arch newsgroup. John
Mashey of SGI has compiled an extensive article in multiple parts on that subject that he reposts on these
occasions. I have compiled one of his reposts into this HTML document for easier reading, especially of the
tables. I've tried to preserve the original text and formatting if possible, only a few obvious typos have been
fixed.
The original usenet posting from which I derived this document should be available on Google from these
search results.
This article is made available on the WWW by permission of the author, John Mashey.
Achim Gratz
RISC vs. CISC
Path: irz401!fu-berlin.de!news.apfel.de!cpk-news-hub1.bbnplanet.com!su-news-feed4.bbnplanet.com!
news.bbnplanet.com!enews.sgi.com!news.corp.sgi.com!mash.engr.sgi.com!mash
From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: RISC vs CISC ... [the big posting: for the N+1st time]
Date: 3 Jun 1997 20:18:01 GMT
Organization: Silicon Graphics, Inc.
Lines: 823
Message-ID: <5n1u5p$5hn$2@murrow.corp.sgi.com>
References: <5qsxov$gkm@skyway.bridge.net> <5mnh29$72p$1@murrow.corp.sgi.com>
<5mnksb$btk@tooting.netapp.com> <338F9902.5016@OntheNet.com.au>
<5mtanv$en2$1@sue.cc.uregina.ca>
NNTP-Posting-Host: mash.engr.sgi.com
Xref: irz401 comp.arch:39141
Long-time readers of this newsgroup can stop here, you've seen it before, but once again, confusion is
breaking out.
Note: for a shorter and more coherent form of this, read Microprocessor Report, March 26, 1992, "CISCs are
Not RISCs, and Not Converging Either."
Oh? Consider the problem of designing exception handling for the P6. It's pipelining, out-of-order and
speculative execution that make exception handling tough, and both the CISC and RISC people are doing as
much of this as they can. It has little to do with the instruction set.
True, but, as you point out, the distinction between RISC and CISC is disappearing, to the point where the main
distinction is psychological. And that's the biggest problem: The RISC designers can't be convinced of the
importance of designing an architecture suited for something other than benchmarks, whereas the CISC
designers realize that accommodating operating system requirements is a critical part of their job.
1. Attached is the Nth repost of the discussion of RISC-vs-CISC non-convergence, i.e., architecture !=
implementation, one more time. If you've followed this newsgroup for a while, you've seen this before. I have
included some followon discussions, and done minuscule editing.
2. The comments about RISC designers above are simply wrong; "All generalizations are false", but
especially this one.
3. re: Exception-handling: as I've noted for years in public talks (the "Car" talk especially), tricky
exception-handling is the price for machines with increased parallelism and overlap. Errata sheets are
customarily filled with bugs around exception cases. Oddly enough, speculative-execution, o-o-o machines
may actually fare better than you'd think, given that they already need mechanisms for undoing (more-or-less)
completed instructions. From history, recall that the 360/91, circa 1967, had some imprecise exceptions, as did
the 360/67 (early 360 with virtual memory), so exciting exception handling has been with us for a while.
====REPOST====
Article 22850 of comp.arch:
Path: mips!mash
Subject: Nth re-posting of CISC vs RISC (or what is RISC, really)
Message-ID: <2419@spim.mips.COM>
Most of you have seen most of this several times before; there is a little editing, nothing substantial. Some
followup comments have been added.
PART I - ARCHITECTURE, IMPLEMENTATION, DIFFERENCES
PART II - ADDRESSING MODES
PART III - MORE ON TERMINOLOGY; WOULD YOU CALL THE CDC 6600 A RISC?
PART IV - RISC, VLIW, STACKS
PART I - ARCHITECTURE, IMPLEMENTATION, DIFFERENCES
WARNING: you may want to print this one to read it...
(from preceding discussion):
Anyway, it is not a fair comparison. Not by a long stretch. Let's see how the Nth generation SPARC, MIPS, and
88K's do (assuming they last) compared to some new design from scratch.
Well, there is baggage and there is BAGGAGE. One must be careful to distinguish between ARCHITECTURE and
IMPLEMENTATION:
A. Architectures persist longer than implementations, especially user-level Instruction-Set Architecture.
B. The first member of an architecture family is usually designed with the current implementation
constraints in mind, and if you're lucky, software people had some input.
C. If you're really lucky, you anticipate 5-10 years of technology trends, and that modifies your idea of
the ISA you commit to.
D. It's pretty hard to delete anything from an ISA, except where:
I. You can find that NO ONE uses a feature (the 68020->68030 deletions mentioned by someone
else).
II. You believe that you can trap and emulate the feature "fast enough", e.g., microVAX support
for decimal ops, 68040 support for transcendentals.
Now, one might claim that the i486 and 68040 are RISC implementations of CISC architectures.... and I think
there is some truth to this, but I also think that it can confuse things badly:
Anyone who has studied the history of computer design knows that high-performance designs have used many
of the same techniques for years, for all of the natural reasons, that is:
1.
a. They use as much pipelining as they can, in some cases, if this means a high gate-count, then
so be it.
b. They use caches (separate I & D if convenient).
c. They use hardware, not micro-code for the simpler operations.
(For instance, look at the evolution of the S/360 products. Recall that the 360/85 used caches, back around
1969, and within a few years, so did any mainframe or supermini.)
Question: So, what difference is there among machines if similar implementation ideas are used?
Answer: There is a very specific set of characteristics shared by most machines labeled RISCs, most of which
are not shared by most CISCs. The RISC characteristics:
2.
a. Are aimed at more performance from current compiler technology (i.e., enough registers).
OR
b. Are aimed at fast pipelining in a virtual-memory environment with the ability to still survive
exceptions without inextricably increasing the number of gate delays (notice that I say gate delays, NOT just
how many gates).
Even though various RISCs have made various decisions, most of them have been very careful to omit those
things that CPU designers have found difficult and/or expensive to implement, and especially, things that are
painful, for relatively little gain.
I would claim, that even as RISCs evolve, they may have certain baggage that they'd wish weren't there.... but
not very much. In particular, there are a bunch of objective characteristics shared by RISC ARCHITECTURES that
clearly distinguish them from CISC architectures.
I'll give a few examples, followed by the detailed analysis: MOST RISCs:
3.
a. Have 1 size of instruction in an instruction stream
b. And that size is 4 bytes
c. Have a handful (1-4) of addressing modes (* it is VERY hard to count these things; will discuss
later).
d. Have NO indirect addressing in any form (i.e., where you need one memory access to get the
address of another operand in memory)
4.
a. Have NO operations that combine load/store with arithmetic, i.e., like add from memory, or
add to memory. (note: this means especially avoiding operations that use the value of a load as input to an
ALU operation, especially when that operation can cause an exception. Loads/stores with address modification
can often be OK as they don't have some of the bad effects)
b. Have no more than 1 memory-addressed operand per instruction
5.
a. Do NOT support arbitrary alignment of data for loads/stores
b. Use an MMU for a data address no more than once per instruction
c. Have >=5 bits per integer register specifier
d. Have >= 4 bits per FP register specifier
These rules provide a rather distinct dividing line among architectures, and I think there are rather strong
technical reasons for this, such that there is one more interesting attribute: almost every architecture whose
first instance appeared on the market from 1986 onward obeys the rules above...
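For illustration, the checklist above can be turned into a toy scoring function. This is only a sketch: the field names and the sample MIPS entry are my own paraphrase of the rules, not any real tool or database.

```python
# Toy sketch: score an ISA description against the rules 3a-5d above.
# Field names and the sample entry are illustrative only.
def risc_score(isa):
    checks = [
        isa["instr_sizes"] == 1,       # 3a: one instruction size
        isa["max_instr_bytes"] == 4,   # 3b: and that size is 4 bytes
        isa["addr_modes"] <= 4,        # 3c: a handful of addressing modes
        not isa["indirect"],           # 3d: no indirect addressing
        not isa["mem_arith"],          # 4a: no load/store+arithmetic combos
        isa["max_mem_operands"] <= 1,  # 4b: at most 1 memory operand
        not isa["unaligned"],          # 5a: no arbitrary alignment
        isa["mmu_uses"] <= 1,          # 5b: MMU at most once per data address
        isa["int_reg_bits"] >= 5,      # 5c: >= 5 bits of integer reg specifier
        isa["fp_reg_bits"] >= 4,       # 5d: >= 4 bits of FP reg specifier
    ]
    return sum(checks)

mips = {"instr_sizes": 1, "max_instr_bytes": 4, "addr_modes": 1,
        "indirect": False, "mem_arith": False, "max_mem_operands": 1,
        "unaligned": False, "mmu_uses": 1, "int_reg_bits": 5, "fp_reg_bits": 5}
print(risc_score(mips))  # 10 - obeys all ten rules
```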
Note that I didn't say anything about counting the number of instructions...
So, here's a table:
C: number of years since first implementation sold in this family (or first machine with which this is binary
compatible). Note: this table was first done in 1991, so year = 1991-(age in table).
3a: # instruction sizes
3b: maximum instruction size in bytes
3c: number of distinct addressing modes for accessing data (not jumps). I didn't count register
or literal, but only ones that referenced memory, and I counted different formats with different offset sizes
separately. This was hard work... Also, even when a machine had different modes for register-relative and
PC-relative addressing, I counted them only once.
3d: indirect addressing (0 - no, 1 - yes)
4a: load/store combined with arithmetic (0 - no, 1 - yes)
4b: maximum number of memory operands
5a: unaligned addressing of memory references allowed in load/store, without specific instructions (0 -
no, never [MIPS, SPARC, etc], 1 - sometimes [as in RS/6000], 2 - just about any time)
5b: maximum number of MMU uses for data operands in an instruction
6a: number of bits for integer register specifier
6b: number of bits for 64-bit or more FP register specifier, distinct from integer registers
Note that all of these are ARCHITECTURE issues, and it is usually quite difficult to either delete a feature (3a-
5b) or increase the number of real registers (6a-6b) given an initial instruction set design. (yes, register
renaming can help, but...) Now:
items 3a, 3b, and 3c are an indication of the decode complexity,
3d-5b hint at the ease or difficulty of pipelining, especially in the presence of virtual-memory
requirements, and need to go fast while still taking exceptions sanely
items 6a and 6b are more related to ability to take good advantage of current compilers.
There are some other attributes that can be useful, but I couldn't imagine how to create metrics for them
without being very subjective; for example "degree of sequential decode", "number of writebacks that you
might want to do in the middle of an instruction, but can't, because you have to wait to make sure you see all
of the instruction before committing any state, because the last part might cause a page fault," or
"irregularity/asymmetry of register use", or "irregularity/complexity of instruction formats". I'd love to use
those, but just don't know how to measure them. Also, I'd be happy to hear corrections for some of these.
So, here's a table of 12 implementations of various architectures, one per architecture, with the attributes
above. Just for fun, I'm going to leave the architectures coded at first, although I'll identify them later. I'm
going to draw a line between H1 and L4 (obviously, the RISC-CISC Line), and also, at the head of each column,
I'm going to put a rule, which, in that column, most of the RISCs obey. Any RISC that does not obey it is marked
with a +; any CISC that DOES obey it is marked with a *. So...
An interesting exercise is to analyze the ODD cases. First, observe that of 12 architectures, in only 2 cases does
an architecture have an attribute that puts it on the wrong side of the line.
Of the RISCs:
A1 is slightly unusual in having more integer registers, and fewer FP registers, than usual. [Actually, slightly
out of date, the 29050 is different, using an integer register bank instead, I hear.]
D1 is unusual in sharing integer and FP registers (that's what the D1:6b == 0).
E1 seems odd in having a large number of address modes. I think most of this is an artifact of the way
that I counted, as this architecture really only has a fundamentally small number of ways to create addresses,
but has several different-sized offsets and combinations, but all within 1 4-byte instruction; I believe that its
addressing mechanisms are fundamentally MUCH simpler than, for example, M2, or especially N1, O3, or P3,
but the specific number doesn't capture it very well.
F1... is not sold any more.
H1 one might argue that this processor has 2 sizes of instructions, but I'd observe that at any point in
the instruction stream, the instructions are either 4-bytes long, or 8-bytes long, with the setting done by a
mode bit, i.e., not dynamically encoded in every instruction.
Of the processors called CISCs:
L4 happens to be one in which you can tell the length of the instruction from the first few bits, has a
fairly regular instruction decode, has relatively few addressing modes, no indirect addressing. In fact, a big
subset of its instructions are actually fairly RISC-like, although another subset is very CISCy.
M2 has a myriad of instruction formats, but fortunately avoided indirect addressing, and actually,
MOST of its instructions have only 1 address, except for a small set of string operations with 2. I.e., in this case,
the decode complexity may be high, but most instructions cannot turn into multiple-memory-address-with-
side-effects things.
N1,O3, and P3 are actually fairly clean, orthogonal architectures, in which most operations can
consistently have operands in either memory or registers, and there are relatively few weirdnesses of special-
cased uses of registers. Unfortunately, they also have indirect addressing, instruction formats whose very
orthogonality almost guarantees sequential decoding, where it's hard to even know how long an instruction is
until you parse each piece, and that may have side-effects where you'd like to do a register write-back early,
but either:
o must wait until you see all of the instruction until you commit state or
o must have "undo" shadow-registers or
o must use instruction-continuation with fairly tricky exception handling to restore the state of
the machine.
It is also interesting to note that the original member of the family to which O3 belongs was rather simpler in
some of the critical areas, with only 5 instruction sizes, of maximum size 10 bytes, and no indirect addressing,
and requiring alignment (i.e., it was a much more RISC-like design, and it would be a fascinating speculation to
know if that extra complexity was useful in practice). Now, here's the table again, with the labels:
General comment: this may sound weird, but in the long term, it might be easier to deal with a really
complicated bunch of instruction formats, than with a complex set of addressing modes, because at least the
former is more amenable to pre-decoding into a cache of decoded instructions that can be pipelined
reasonably, whereas the pipeline on the latter can get very tricky (examples to follow). This can lead to the
funny effect that a relatively "clean", orthogonal architecture may actually be harder to make run fast than one
that is less clean. Obviously, every weirdness has its penalties...
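As a toy illustration of that decode contrast (a made-up encoding, not any real ISA): with sequentially decoded operand specifiers you cannot even find the start of the next instruction until you've parsed each piece, whereas a fixed 4-byte format makes lengths known up front, and hence easy to pre-decode into a cache:

```python
# Toy illustration: instruction-length discovery in a variable-length
# encoding vs a fixed-size one. The encoding is invented for this sketch:
# high nibble of the opcode byte = operand count, low 2 bits of each
# specifier byte = size of its trailing bytes.
def varlen_instruction_length(code, pc):
    n_operands = code[pc] >> 4
    length = 1
    for _ in range(n_operands):
        spec = code[pc + length]      # must read each specifier in order...
        length += 1 + (spec & 0x3)    # ...because it carries its own size
    return length

def fixed_instruction_length(code, pc):
    return 4                          # known before decoding anything

prog = bytes([0x21, 0x00, 0x03, 0xAA, 0xAA, 0xAA])
print(varlen_instruction_length(prog, 0))  # 6
```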
Now, in either case, we have a longword that has the address of the actual data.
a. Increment r1 [well, this is where you'd LIKE to do it, or in parallel with step 2a).] However,
see later why not...
b. Now, fetch the actual data from memory, using the address just obtained, doing everything
in step 2a) again, yielding the actual data, which we need to stick in a temporary buffer, since it doesn't
actually go in a register.
2. Now, decode the second operand specifier, which goes thru everything that we did in step 2, only
again, and leaves the results in a second temporary buffer. Note that we'd like to be starting this before we get
done with all of 2 (and I THINK the VAX9000 probably does that??) but you have to be careful to
bypass/interlock on potential side-effects to registers... actually, you may well have to keep shadow copies of
every register that might get written in the instruction, since every operand can use auto-
increment/decrement. You'd probably want badly to try to compute the address of the second argument and
do the MMU access interleaved with the memory access of the first, although the ability of any operand to
need 2-4 MMU accesses probably makes this tricky. [Recall that any MMU access may well cause a page
fault...]
3. Now, do the add. [could cause exception]
4. Now, do the third specifier... only, it might be a little different, depending on the nature of the cache,
that is, you cannot modify cache or memory, unless you know it will complete. (Why? well, suppose that the
location you are storing into overlaps with one of the indirect-addressing words pointed to by r1 or 4(r1), and
suppose that the store was unaligned, and suppose that the last byte of the store crossed a page boundary and
caused a page fault, and that you'd already written the first 3 bytes. If you did this straightforwardly, and then
tried to restart the instruction, it wouldn't do the same thing the second time.)
5. When you're sure all is well, and the store is on its way, then you can safely update the two registers,
but you'd better wait until the end, or else, keep copies of any modified registers until you're sure it's safe. (I
think both have been done?)
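The page-crossing hazard in step 4 is easy to see with a little arithmetic (page size assumed to be 4 KB here, purely for illustration): an aligned 4-byte access always stays within one page, while an unaligned one can straddle two, so the second half can fault after the first half was already written:

```python
PAGE = 4096  # assumed page size, for illustration

def pages_touched(addr, size):
    # Set of page numbers a `size`-byte access at `addr` touches.
    return {p // PAGE for p in range(addr, addr + size)}

# An aligned 4-byte load/store always stays within one page...
assert all(len(pages_touched(a, 4)) == 1 for a in range(0, 2 * PAGE, 4))

# ...but an unaligned 4-byte store near a page boundary straddles two pages,
# so the second half can page-fault after the first half was written.
print(sorted(pages_touched(PAGE - 2, 4)))  # [0, 1]
```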
6. You may say that this code is unlikely, but it is legal, so the CPU must do it. This style has the following
effects:
a. You have to worry about unlikely cases.
b. You'd like to do the work, with predictable uses of functional units, but instead, they can
make unpredictable demands.
c. You'd like to minimize the amount of buffering and state, but it costs you in both to go fast.
d. Simple pipelining is very, very tough: for example, it is pretty hard to do much about the next
instruction following the ADDL, (except some early decode, perhaps), without a lot of gates for special-casing.
(I've always been amazed that CVAX chips are fast as they are, and VAX 9000s are REALLY impressive...)
e. EVERY memory operand can potentially cause 4 MMU uses, and hence 4 MMU faults that
might actually be page faults...
f. AND there are even worse cases, like the addp6 instruction, that can require *40* pages to
be resident to complete...
7. Consider how "lazy" RISC designers can be:
a. Every load/store uses exactly 1 MMU access.
b. The compilers are often free to re-arrange the order, even across what would have been the
next instruction on a CISC. This gets rid of some stalls that the CISC may be stuck with (especially memory
accesses).
c. The alignment requirement avoids especially the problem with sending the first part of a
store on the way before you're SURE that the second part of it is safe to do.
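Point b above can be sketched as a compiler transformation: hoist a load as early as its source registers allow, so its latency overlaps independent work. The three-tuple IR here is invented for illustration (and, as a toy, it ignores anti-dependences on the destination register):

```python
# Toy load hoisting: move each load up to the earliest slot after the last
# definition of any of its source registers. Invented IR, illustration only.
def hoist_loads(instrs):
    scheduled = []
    for op, dst, srcs in instrs:
        if op == "load":
            pos = 0
            for i, (_, d, _) in enumerate(scheduled):
                if d in srcs:
                    pos = i + 1       # must stay after this definition
            scheduled.insert(pos, (op, dst, srcs))
        else:
            scheduled.append((op, dst, srcs))
    return scheduled

prog = [("add", "r1", ["r2", "r3"]),
        ("mul", "r4", ["r2", "r5"]),
        ("load", "r6", ["r1"])]       # depends only on r1
print(hoist_loads(prog))              # load hoisted above the independent mul
```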
Finally, to be fair, let me add the two cases that I knew of that were more on the borderline: i960 and Clipper:
(I think an ARM would be in this area as well; I think somebody once sent me an ARM-entry, but I can't find it
again; sorry.)
Note: slight modification (I'll integrate this sometime):
From jfc@MIT.EDU Mon Nov 29 12:59:55 1993
Subject: Re: Why are Motorola's slower than Intel's ? [really what's a RISC]
Newsgroups: comp.arch
Organization: Massachusetts Institute of Technology
Since you made your table IBM has released a couple chips that support unaligned accesses in hardware even
across cache line boundaries and may store part of an unaligned object before taking a page fault on the
second half, if the object crosses a page boundary.
These are the RSC (single chip POWER) and PPC 601 (based on RSC core).
John Carr (jfc@mit.edu)
(Back to me; jfc's comments are right; if I had time, I'd add another line to do PPC... which, in some sense
replays the S/360 -> S/370 history of relaxing alignment restrictions somewhat. I conjecture that at least some
of this was done to help Apple s/w migration.)
SUMMARY
1. RISCs share certain architectural characteristics, although there are differences, and some of those
differences matter a lot.
2. However, the RISCs, as a group, are much more alike than the CISCs as a group.
3. At least some of these architectural characteristics have fairly serious consequences on the
pipelinability of the ISA, especially in a virtual-memory, cached environment.
4. Counting instructions turns out to be fairly irrelevant:
1. It's HARD to actually count instructions in a meaningful way... (if you disagree, I'll claim that
the VAX is RISCier than any RISC, at least for part of its instruction set :-) Why: VAX has a MOV opcode,
whereas RISCs usually have a whole set of opcodes for {LOAD/STORE} {BYTE, HALF, WORD}
2. More instructions aren't what REALLY hurts you, anywhere near as much as features that are
hard to pipeline
3. RISCs can perfectly well have string-support, or decimal arithmetic support, or graphics
transforms... or lots of strange register-register transforms, and it won't cause problems..... but compare that
with the consequence of adding a single instruction that has 2-3 memory operands, each of which can go
indirect, with auto-increments, and unaligned data...
PART II - ADDRESSING MODES
Article: 30346 of comp.arch
Path: odin!mash.wpd.sgi.com!mash
Subject: Updated addressing mode table
Message-ID: < C52tAM.K4B@odin.corp.sgi.com>
Nntp-Posting-Host: mash.wpd.sgi.com
I promised to repost this with fixes, and people have been asking for it, so here it is again: if you saw it before,
all that's really different is some fixes in the table, and a few clarified explanations:
THE GIANT ADDRESSING MODE TABLE (Corrections happily accepted) This table goes with the higher-level
table of general architecture characteristics.
Shown below are 22 distinct addressing modes [you can argue whether these are right categories]. In the table
are the *number* of different encodings/variations [and this is a little fuzzy; you can especially argue about the
4 in the HP PA column, I'm not even sure that's right]. For example, I counted as different variants on a mode
the case where the structure was the same, but there were different-sized displacements that had to be
decoded. Note that meaningfully counting addressing modes is *at least as bad* as meaningfully counting
opcodes; I did the best I could, and I spent a lot of hours looking at manuals for the chips I hadn't programmed
much, and in some cases, even after hours, it was hard for me to figure out meaningful numbers... *Most* of
these architectures are used in general-purpose systems and *most* have at least one version that uses
caches: those are important because many of the issues in thinking about addressing modes come from their
interactions with MMUs and caches...
COLUMN NOTES:
1. Columns 1-7 are addressing modes used by many machines, but very few, if any clearly-RISC
architectures use anything else. They are all characterized by what they don't have:
o 2 adds needed before generating the address
o indirect addressing
o variable-sized decoding
2. Columns 13-15 include fairly simple-looking addressing modes, which however, *may* require 2 back-
to-back adds before the address is available. [*may* because some of them use index-register=0 or something
to avoid indexing, and usually in such machines, you'll see variable timing figures, depending on use of
indexing.]
3. Columns 16-22 use indirect addressing.
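In code, the three column groups differ in how much work precedes the address (a schematic sketch; `regs` and `mem` here stand in for the register file and memory):

```python
# Schematic effective-address calculations for the three column groups above.
def ea_simple(regs, base, disp):
    # Columns 1-7: one add, address available immediately.
    return regs[base] + disp

def ea_two_adds(regs, base, index, disp):
    # Columns 13-15: two back-to-back adds before the address exists.
    return (regs[base] + regs[index]) + disp

def ea_indirect(mem, regs, base, disp):
    # Columns 16-22: a memory (and possibly MMU) access is needed just
    # to fetch the address of the actual operand.
    return mem[regs[base] + disp]

regs = {1: 0x1000}
mem = {0x1004: 0x2000}
print(hex(ea_indirect(mem, regs, 1, 4)))  # 0x2000 - one load before the real access
```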
ROW NOTES
1. Clipper & i960, of current chips, are more on the RISC-CISC border, or are sort of "modern CISCs".
ARM was also so characterized by ARM people (Hot Chips IV: "ARM is not a pure RISC").
2. ROMP has a number of characteristics different from the rest of the RISCs, you might call it "early
RISC", and it is of course no longer made.
3. You might consider HP PA a little odd, as it appears to have more addressing modes, in the same way
that CISCs do, but I don't think this is the case: it's an issue of whether you call something several modes or
one mode with a modifier, just as there is trouble counting opcodes (with & without modifiers). From my view,
neither PA nor POWER has truly "CISCy" addressing modes.
4. Notice difference between 68000 and 68020 (and later 68Ks): a bunch of incredibly-general &
complex modes got added...
5. Note that the addressing on the S/360 is actually pretty simple, mostly base+displacement, although
RX-addressing does take 2 regs+offset.
6. A dimension *not* shown on this particular chart, but also highly relevant, is that this chart shows the
different *types* of modes, *not* how many addresses can be found in each instruction. That may be worth
noting also:
AMD - i960: 1 (one address per instruction)
S/360 - MC68020: 2 (up to 2 addresses)
VAX: 6 (up to 6)
By looking at alignment, indirect addressing, and looking only at those chips that have MMUs, consider the
number of times an MMU *might* be used per instruction for data address translations:
AMD - Clipper: 2 [Swordfish & i960KB: no TLB]
S/360 - NSC32K: 4
MC68Ks (all): 8
VAX: 24
When RS/6000 does unaligned, it must be in the same cache line (and thus also in same MMU page), and traps
to software otherwise, thus avoiding numerous ugly cases.
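That RS/6000 rule amounts to a single line-boundary check (cache line size assumed to be 64 bytes purely for illustration; the actual RS/6000 line size may differ):

```python
LINE = 64  # assumed cache line size, for illustration only

def unaligned_ok(addr, size):
    # Hardware handles the access only if it stays within one cache line
    # (and therefore within one MMU page); otherwise trap to software.
    return addr // LINE == (addr + size - 1) // LINE

print(unaligned_ok(62, 4))  # False - crosses a line boundary, traps
print(unaligned_ok(32, 4))  # True
```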
Note: in some sense, S/360s & VAXen can use an arbitrary number of translations per instruction, with MOVE
CHARACTER LONG, or similar operations & I don't count them as more, because they're defined to be
interruptable/restartable, saving state in general-purpose registers, rather than hidden internal state.
SUMMARY
1. Computer design styles mostly changed from machines with:
o 2-6 addresses per instruction, with variable-sized encoding
o address specifiers usually "orthogonal", so that any could go anywhere in an instruction
o sometimes indirect addressing
o sometimes need 2 adds *before* effective address is available
o sometimes many potential MMU accesses (and possible exceptions) per instruction, often
buried in the middle of the instruction, and often *after* you'd normally want to commit state
because of auto-increment or other side effects.
to machines with:
o 1 address per instruction
o address specifiers encoded in small # of bits in 32-bit instruction
o no indirect addressing
o never need 2 adds before address available
o use MMU once per data access
and we usually call the latter group RISCs. I say "changed" because if you put this table together with the
earlier one, which has the age in years, the older ones were one way, and the newer ones are different.
2. Now, ignoring any other features, but looking at this single attribute (architectural addressing features
and implementation effects thereof), it ought to be clear that the machines in the first part of the table are
doing something *technically* different from those in the second part of the table. Thus, people may
sometimes call something RISC that isn't, for marketing reasons, but the people calling the first batch RISC
really did have some serious technical issues at heart.
3. One more time: this is *not* to say that RISC is better than CISC, or that the few in the middle are
bad, or anything like that... but that there are clear technical characteristics...
PART III - MORE ON TERMINOLOGY; WOULD YOU CALL THE CDC 6600 A RISC?
Article: 39495 of comp.arch
Newsgroups: comp.arch
From: mash@mash.engr.sgi.com (John R. Mashey)
Subject: Re: Why CISC is bad (was P6 and Beyond)
Organization: Silicon Graphics, Inc.
Date: Wed, 6 Apr 94 18:35:01 PDT
In article <2nii0d$kkn@crl2.crl.com>, dbennett@crl.com (Andrea Chen) writes:
You may be correct on the creation of the term, but RISC does refer to a school of computer design that dates
back to the early seventies.
This is all getting fairly fuzzy and subjective, but it seems very confusing to label RISC as a school of thought
that dates back to the early 1970s.
1. One can say that RISC is a school of thought that got popular in the early-to-mid 80's, and got
widespread commercial use then.
2. One can say that there were a few people (like John Cocke & co at IBM) who were doing RISC-style
research projects in the mid-70s.
3. But if you want to go back, as has been discussed in this newsgroup often, a lot of people go back to
the CDC 6600, whose design started in 1960, and was delivered in 4Q 1964. Now, while this wouldn't exactly fit
the exact parameters of current RISCs, a great deal of the RISC-style approach was there in the central
processor ISA:
a. Load/store architecture.
b. 3-address register-register instructions
c. Simply-decoded instruction set
d. Early use of instruction scheduling by the compiler, expectation that you'd usually program in
high-level language and not often in assembler, as you'd expect the compiler to do well.
e. More registers than common at the time
f. ISA designed to make decode/issue easy
Note that the 360/91 (1967) offered a good example of building a CISC-architecture into a high-performance
machine, and was an interesting comparison to the 6600.
4. Maybe there is some way to claim that RISC goes back to the 1950s, but in general, most machines of
the 1950s and 1960s don't feel very RISCy (to me). Consider Burroughs B5000s; IBM 709x, 707x, 1401s; Univac
110x; GE 6xx, etc, and of course, S/360s. Simple load/store architectures were hard to find; there were often
exciting instruction decodings required; indirect addressing was popular; machines often had very few
accumulators.
5. If you want to try sticking this in the matrix I've published before, as best as I recall, the 6600 ISA
generally looked like:
6. That is:
a. 2: it has 2 instruction sizes (not 1), 15 & 30 bits (however, they were packed into 60-bit words, so
if you had 15, 30, 30, the second 30-bitter would not cross a word boundary, but would start in the second
word.)
b. *: 15-and-30 bit instructions, not 32-bit.
c. 1: 1 addressing mode [Note: Tim McCaffrey emailed me that one might consider there to be
more, i.e., you could set the address register to combinations of the others to give
autoincrement/decrement/index+offset, etc.] In any case, you compute an address as a simple combination of
1-2 registers, and then use the address, without further side-effects.
d. 0: no indirect addressing
e. 1: have one memory operand per instruction
f. 0: do NOT support arbitrary alignment of operands in memory (well, it was a word-addressed
machine :-)
g. 1: use an MMU for data translation no more than once per instruction (MMU used loosely
here)
h. 3,3: had 3-bit fields for addressing registers, both index and FP
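The word-packing rule in point (a) can be sketched directly; this is my own toy model of the packing, not CDC documentation:

```python
# Toy sketch of CDC 6600-style packing: 15- and 30-bit parcels go into
# 60-bit words, and an instruction may not straddle a word boundary.
def pack(sizes):
    words, word, used = [], [], 0
    for s in sizes:
        if used + s > 60:        # would cross the word boundary:
            words.append(word)   # leave the rest of this word unused
            word, used = [], 0   # and start a new word
        word.append(s)
        used += s
    if word:
        words.append(word)
    return words

print(pack([15, 30, 30]))  # [[15, 30], [30]] - second 30-bitter starts the next word
```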
Now, of the 10 ISA attributes I'd proposed for identifying typical RISCs, the CDC 6600 obeys 6. It varies in
having 2 instruction formats, and in having only 3 bits for register fields, but it had simple packing of the
instructions into fixed-size words, and register/accumulators were pretty expensive in those days (some
popular machines only had one accumulator and a few index registers, so 8 of each was a lot). Put another
way: it had about as many registers as you'd conveniently build in a high-speed machine, and while they
packed 2-4 operations into a 60-bit word, the decode was pretty straightforward. Anyway, given the caveats, I'd
claim that the 6600 would fit much better in the RISC part of the original table...
PART IV - RISC, VLIW, STACKS
Article: 43173 of comp.arch
Newsgroups: comp.sys.amiga.advocacy,comp.arch
From: mash@mash.engr.sgi.com (John R. Mashey)
Subject: Re: PG: RISC vs. CISC was: Re: MARC N. BARR
Date: Thu, 15 Sep 94 18:33:14 PDT
In article <35a1a3$mlb@doc.armltd.co.uk>, Clive.Jones@armltd.co.uk writes:
Really? The Venerable John Mashey's table appears to contain as many exceptions to the rule about number of
GP registers as most others. I'm sure if one were to look at the various less conventional processors, there
would be some clearly RISC processors that didn't have a load-store architecture - stack and VLIW processors
spring to mind.
I'm not sure I understand the point. One can believe any of several things:
a. One can believe RISC is some marketing term without technical meaning whatsoever. OR
b. One can believe that RISC is some collection of implementation ideas. This is the most common
confusion.
c. One can believe that RISC has some ISA meaning (such as RISC == small number of opcodes)... but
have a different idea of RISC than do most chip architects who build them. If you want to pay words extra
money every Friday to mean something different than what they mean to practitioners... then you are free to
do so, but you will have difficulty communicating with practitioners if you do so.
EX: I'm not sure how stack architectures are "clearly RISC" (?) Maybe CRISP, sort of. Burroughs B5000 or
Tandem's original ISA: if those are defined as RISC, the term has been rendered meaningless.
EX: VLIWs: I don't know any reason why I'd call VLIWs, in general, either clearly RISC or clearly not. VLIW is a
technique for issuing instructions to more functional units than you have the die space/cycle time to decode
more dynamically. There gets to be a fuzzy line between:
A. A VLIW, especially if it compresses instructions in memory, then expands them out when
brought into the cache.
B. A superscalar RISC, which does some predecoding on the way from memory->cache, adding
"hint" bits or rearranging what it keeps there, speeding up cache->decode->issue.
At least some VLIWs are load/store architectures, and the operations they do usually look like typical RISC
operations. OR, you can believe that:
d. RISC is a term used to characterize a class of relatively-similar ISAs mostly developed in the 1980s.
Thus, if a knowledgeable person looks at ISAs, they will tend to cluster various ISAs as:
A. Obvious RISC, fits the typical rules with few exceptions.
B. Obviously not-RISC, fits the inverse of the RISC rules with relatively few exceptions.
Sometimes people call this CISC... but whereas RISCs, as a group, have relatively similar ISAs,
the CISC label is sometimes applied to a widely varying set of ISAs.
C. Hybrid / in-the-middle cases, that either look like CISCy RISCs, or RISCy CISCs. There are a few
of these.
Cases A-C may apply to reasonably contemporaneous processors, and make some sense. And then:
D. CPUs for which RISC/CISC is probably not a very relevant classification. I.e., one can apply the
set of rules I've suggested, and get an exception-count, but it may not mean much in practice, especially when
applied to older CPUs created with vastly different constraints than current ones, or embedded processors, or
specialized ones. Sometimes an older CPU might have been designed with some similar philosophies (i.e., like
CDC 6600 & RISC, sort of) whether or not it happened to fit the rules. Sometimes, die-space constraints may have
led to "simple" chips, without making them fit the suggested criteria either. Personally, I find that torturous
arguments about whether a 6502, or a PDP-8, or a 360/44, or an XDS Sigma 7, etc, are RISC or CISC do not
usually lead to great insight. After a while such arguments are counting angels dancing on pinheads ("Ahh, only
10 angels, must be RISC" :-).
In this belief space, one tends to follow Hennessy & Patterson's comment in E.9 that "In the history of
computing, there has never been such widespread agreement on computer architecture." None of this is
pejorative of earlier architectures, just the observation that the ISAs newly-developed in the 1980s were far
more similar than the earlier groups of ISAs. [I recall a 2-year period in which I used IBM 1401, IBM 7074, IBM
7090, Univac 1108, and S/360, of which only the 7090 and 1108 bore even the remotest resemblance to each
other, i.e., at least they both had 36-bit words.]
SUMMARY
RISC is a label most commonly used for a set of ISA characteristics chosen to ease the use of aggressive
implementation techniques found in high-performance processors (regardless of RISC, CISC, or irrelevant). This
is a convenient shorthand, but that's all, although it probably makes sense to use the term the way it's usually
meant by people who do chips for a living.