

Plotting the Internet’s Future
Calling All Smartbots

Multithreading for Mere Mortals
Virtualization Comes of Age
The Hennessy-Patterson Interview


Queue December/January 2006-2007 Vol. 4 No. 10

Architecture’s Renaissance


Unlocking Concurrency 24
Ali-Reza Adl-Tabatabai, Intel; Christos Kozyrakis, Stanford University; and Bratin Saha, Intel
Can transactional memory ease the pain of multicore programming?


The Virtualization Reality 34
Simon Crosby, XenSource, and David Brown, Sun Microsystems
A look at hypervisors and the future of virtualization.

Better, Faster, More Secure 42
Brian Carpenter, IBM and IETF
What’s ahead for the Internet’s fundamental technologies? The IETF chair prognosticates.

2 December/January 2006-2007 ACM QUEUE

rants: feedback@acmqueue.com




Forward Thinking Charlene O’Hanlon, ACM Queue


NEWS 2.0 10
Taking a second look at the news so you don’t have to.


What’s on Your Hard Drive?
Visitors to our Web site are invited to tell us about the tools they love—and the tools they hate.


A Conversation with John Hennessy and David Patterson
The Berkeley-Stanford duo who wrote the book (literally) on computer architecture discuss current innovations and future challenges.

Peerless P2P George V. Neville-Neil, Consultant




Will the Real Bots Stand Up? Stan Kelly-Bootle, Author



Publisher and Editor: Charlene O’Hanlon, cohanlon@acmqueue.com

Editorial Staff
Executive Editor: Jim Maurer, jmaurer@acmqueue.com
Managing Editor: John Stanik, jstanik@acmqueue.com
Copy Editor: Susan Holly
Art Director: Sharon Reuter
Production Manager: Lynn D’Addesio-Kraus
Editorial Assistant: Michelle Vangen
Copyright: Deborah Cotton

Editorial Advisory Board
Eric Allman, Charles Beeler, Steve Bourne, David J. Brown, Terry Coatta, Mark Compton, Stu Feldman, Ben Fried, Jim Gray, Wendy Kellogg, Marshall Kirk McKusick, George Neville-Neil

Guest Expert: Kunle Olukotun
ACM Queue (ISSN 1542-7730) is published ten times per year by the ACM, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. POSTMASTER: Please send address changes to ACM Queue, 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA Printed in the U.S.A. The opinions expressed by ACM Queue authors are their own, and are not necessarily those of ACM or ACM Queue. Subscription information available online at www.acmqueue.com.

ACM Headquarters
Executive Director and CEO: John White
Director, ACM U.S. Public Policy Office: Cameron Wilson

Sales Staff
National Sales Director: Ginny Pohlman, 415-383-0203, gpohlman@acmqueue.com
Regional Eastern Manager: Walter Andrzejewski, 207-763-4772, walter@acmqueue.com

Contact Points
Queue editorial: queue-ed@acm.org
Queue advertising: queue-ads@acm.org
Copyright permissions: permissions@acm.org
Queue subscriptions: orders@acm.org
Change of address: acmcoa@acm.org

ACM U.S. Public Policy Office: Cameron Wilson, Director, 1100 17th Street, NW, Suite 507, Washington, DC 20036 USA; +1-202-659-9711 (office), +1-202-667-1066 (fax), wilson_c@acm.org
ACM Copyright Notice: Copyright © 2006 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to republish from: Publications Dept. ACM, Inc. Fax +1 (212) 869-0481 or e-mail <permissions@acm.org> For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, 508-750-8500, 508-750-4470 (fax).

Deputy Executive Director and COO: Patricia Ryan
Director, Office of Information Systems: Wayne Graves
Director, Financial Operations Planning: Russell Harris
Director, Office of Membership: Lillian Israel
Director, Office of Publications: Mark Mandelbaum
Deputy Director, Electronic Publishing: Bernard Rous
Deputy Director, Magazine Development: Diane Crawford
Publisher, ACM Books and Journals: Jono Hardjowirogo
Director, Office of SIG Services: Donna Baglio
Assistant Director, Office of SIG Services: Erica Johnson

Executive Committee
President: Stuart Feldman
Vice-President: Wendy Hall
Secretary/Treasurer: Alain Chesnais
Past President: Dave Patterson
Chair, SIG Board: Joseph Konstan

For information from Headquarters: (212) 869-7440


Forward Thinking from the editors
Charlene O’Hanlon, ACM Queue




I am of the opinion that humans are not flexible creatures. We resist change the way oil resists water. Even if a change is made for the good of humankind, if it messes around with our daily routine, then our natural instinct is to fight the change as if it were a virus. Let’s face it, all of us thrive on routine—what time we get up, how we brush our teeth, where we sit on the train, what we eat for lunch—and for some it takes a lot to break the routine.

If you don’t agree, take a look at your life. How many of you regularly perform some task that you dislike (backing up your hard drive, going to the same boring job, eating liver every Tuesday night) simply because you don’t want to face the alternative (a hard-drive crash, no extra money for new CDs, the chance that your iron level will dip so low you’ll end up in the hospital getting mass blood transfusions)?

I grew up in a household in which Saturday was cleaning day and everyone was forced to pitch in, so there was a time not too long ago when I was absolutely stringent about keeping a perfectly clean house. As I’ve gotten older and somewhat wiser, however, I’ve started slacking off in the housecleaning department. A creature of habit, I used to begin my picking-up process in earnest every night at 9:30, darting in and out of every room in the house like a dervish and cleaning up the detritus of the day. Then one night, out of pure exhaustion, I just didn’t. And I woke up the next morning still alive and healthy. My house was a little out of order, but it wasn’t anything I couldn’t handle. Since then I’ve cut my dervish episodes down to three a week, and it suits me well (I’m also a little calmer now). Baby steps, I know. But for some of us, baby steps must precede the big steps.

Because this is December and a new year—and the chance to make those dreaded New Year’s resolutions—is just weeks away, I’ve decided that 2007 will be the year I make some real changes in my life.
I don’t just mean switching laundry detergents, but real change. And if I fail in my attempts, then I will work harder to make my changes successful. I know there will be difficulties, both internal and external, but I will face the changes and the challenges head on, embracing the changes rather than fighting them.

Can we say the same for our industry in the next year? Can technology face the changes and adapt accordingly? Can we force an evolution, or will it come naturally? Charles Darwin said living things must adapt or die, but I wonder whether the same applies to technology. Indeed, we humans are the ones forcing the change—after all, technology does not create itself—but are we moving along a path in which one day technology will be responsible for its own evolution? It’s a thought that is both thrilling and scary—the kind of stuff that Michael Crichton novels are made of.

Some may scoff and say that humans ultimately have control over the amount of intelligence any machine has, and that we will always be superior. But I would point out that humans are often held back by the one thing that technology knows nothing about: fear. A certain amount of fear is healthy; fear is what keeps us from jumping off a cliff without a bungee cord just to see what it feels like. But too much fear can prevent us from discovering our true talents and best assets—fear of the unknown, fear of being ridiculed, fear of failure. Call me crazy, but I’m sure a Web server doesn’t care whether it is being laughed at.

I can envision a day when technology becomes smarter than humans. I think we will reach that threshold when man and machine possess equal intelligence, and then technology will evolve to surpass man simply because we humans can’t get past our fears. That may be a good thing, depending on how one looks at it. I, for one, would never wish humankind to lose its humanity for the sake of lightning-fast decisions or a better way to build a widget. Fear, along with all our myriad emotions, is what makes us human. You can’t say that about a Web server. Q

CHARLENE O’HANLON, editor of Queue, is in for some big adventures in 2007. Stick around and see for yourself. Meanwhile, send your comments to her at cohanlon@acmqueue.com.



news 2.0
Taking a second look at the news so you don’t have to.

Fox and the Weasel
Capitalizing on the growing popularity of Mozilla’s Firefox, many Linux distributors now package the open source Web browser with their Linux code. According to Mozilla’s licensing policies, distributors may package the Firefox code with the Firefox name and logo, provided that Mozilla approves any changes made to the code. Mozilla wants to protect its trademark and prevent the confusion that might ensue if there were many separate forks of Firefox that all used the Firefox name and logo. Debian, a Linux distribution closely aligned with the free software movement, is butting heads with Mozilla over these requirements. The folks at Debian want to package a version of Firefox, but they object to using the logo because it’s trademarked and therefore conflicts with Debian’s free-use ethos. They also object to Mozilla’s code approval process, which could disqualify Debian’s browser from any association with the Firefox brand. So what’s a self-respecting free software advocate to do? One solution would be for Debian to adopt the GNU fork of Firefox, which, in obvious tribute to its parent, is cutely named IceWeasel. Another option would be for Debian to apply the IceWeasel name and logo, which are not trademarked, to its own Firefox code.
http://www.internetnews.com/dev-news/article.php/3636651

Down on the Wireless Farm
As Queue reported in its September 2006 issue, compliance is a growing challenge for enterprises that’s creating business opportunities for those savvy enough to sort it out. Lest we get too bogged down in SOX and HIPAA and Basel II, however, we must remember that compliance with government mandates is a challenge for all industries. For example, farmers across the globe must comply with government reporting requirements to verify the safety of the food they produce. European Union farmers must keep detailed records about their cattle—everything from where they’re grazing to their health problems. Farmers are turning to technology to help them comply. Companies such as Ireland’s FarmWizard are seizing the opportunity to provide solutions. FarmWizard allows cattle farmers to manage important farming data right from the cow pasture. Farmers can input, view, and manage information using wireless devices equipped with a Web browser. FarmWizard’s wirelessly accessed hosted service shows that this new breed of “Agri-IT” applications closely aligns with computing trends seen in other sectors.
http://www.vnunet.com/computing/news/2167254/handhelds-collect-farming

Second-Life Commerce Meets First-Life IRS
It’s becoming increasingly difficult to draw boundaries between the imaginary and the real. Immersive online simulations such as Second Life and World of Warcraft have evolved virtual exchange systems that closely resemble real-world commerce. Players looking for an edge in these games can head to eBay, where valuable items can be bought and sold with real currency, with the actual exchange of goods occurring in the online gaming world. Congress has noticed all this commerce and is evaluating its policies for governing these virtual-to-real-world transactions. After all, any transaction occurring in a real marketplace using real money reasonably could be subject to taxation, regardless of whether the goods exchanged are tangible or imaginary. But things become complex when you consider the potential real-world value of virtual goods traded in cyberspace. If one person sells a deed to some Second-Life property on eBay, while someone else, acting as an avatar online, completes the same transaction using Second Life’s internal Linden dollars, is the first transaction taxable and the second one not taxable? The problem for the IRS is that while these games are quite sophisticated, their economic systems lack the structures and institutions, such as a stock market, that real-world tax law relies on. If the lack of these features is what’s keeping taxes out of virtual worlds, it seems unlikely game developers will add them anytime soon.


http://today.reuters.com/news/ArticleNews.aspx?type=technologyNews&storyID=2006-10-16T121700Z_01_N15306116_RTRUKOC_0_US-LIFE-SECONDLIFE-TAX.xml Q


What’s on Your Hard Drive? reader files


As the year draws to an end, we would like to thank all of our readers who have submitted to WOYHD. Over the past 12 months we’ve seen a wide variety of tools mentioned, and, come 2007, we would like to see a lot more of the same. So log on to our Web site at http://www.acmqueue.com and send us your rants, raves, and more new tools that you absolutely can’t live without—or can’t stand to use. As further incentive, if we publish your submission, you’ll be starting off the New Year with a brand new Queue coffee mug!

Who: Charles Moore
What industry: Consulting and systems integrator
Job title: Software engineer
Flavor: Develops on Windows for Windows
Tool I love! Beyond Compare. I love the abilities of this program. I use it all the time to reconcile multiple versions of files. It’s particularly useful for determining what changed between different releases—better than most versioning systems’ compares.
Tool I hate! Microsoft Office. It never works the way you want it to. Any option that is available to make it do what you want (and how do you find out about it?) is usually several layers down under some unrelated menu—sometimes even in another system application! Many features were incorrectly implemented, not thought out, incompatible with other features, or just plain don’t work. And you don’t have—or aren’t allowed—any other option.

Who: Leon Woestenberg
What industry: Broadcasting
Job title: Senior designer
Flavor: Develops on Linux for Linux
Tool I love! OpenEmbedded. The world of cross-compilation is cruel, especially if your system is a complex of different external open source tools together with your own tools. OpenEmbedded is the platform that solves the subtleties that would have cost me a lot of time to get right. I know—I have been there before.
Tool I hate! VisualWhatever. I do not think dragging components into a view, setting some attributes, and then hooking them up is anything like the way systems should be designed. If anyone thinks this is top-down design, think again.

Who: Shyam Santhanam
What industry: ISP/telecommunications, energy, cable, utilities
Job title: Software engineer
Flavor: Develops on Linux for Linux
Tool I love! Eclipse. I love the extensibility! It’s a jack-of-all-trades and a master of each. Also, the intuitive UI and stability are great. Eclipse is doing things the way IDEs should, instead of how they “have been” in the past.
Tool I hate! Make. The complex and clumsy syntax of Makefiles and their widespread acceptance in Linux (Unix) development is horrible. If you’ve ever tried tracing down a linker error originating in a 500-line Makefile with nth-level nested expansions, then you know what I mean.

Who: Mark Westwood
What industry: Oil and gas
Job title: Principal software engineer
Flavor: Develops on Linux for Linux
Tool I love! XEmacs. We have a longtime love affair; she is so much more than an editor to me. My wife doesn’t understand me this well! Compilers come and go, debuggers are transient, but an editor is for life.
Tool I hate! Anything GUI. I can write a finite-difference time-domain Maxwell equation solver in Fortran in a quarter of the time it takes my users to make up their minds whether a dialog box should be shifted two pixels to the left or three pixels down. Computers are for number crunching!

more queue: www.acmqueue.com


Peerless P2P kode vicious
A koder with attitude, KV answers your questions.


Peer-to-peer networking (better known as P2P) has two faces: the illegal file-sharing face and the legitimate group collaboration face. While the former, illegal use is still quite prevalent, it gets an undue amount of attention, often hiding the fact that there are developers out there trying to write secure, legitimate P2P applications that provide genuine value in the workplace. While KV probably has a lot to say about file sharing’s dark side, it is to the legal, less controversial incarnation of P2P that he turns his attention this month. Take it away, Vicious…

Dear KV, I’ve just started on a project working with P2P software, and I have a few questions. Now, I know what you’re thinking, and no this isn’t some copyright-violating piece of kowboy kode. It’s a respectable corporate application for people to use to exchange data such as documents, presentations, and work-related information. My biggest issue with this project is security—for example, accidentally exposing our users’ data or leaving them open to viruses. There must be more things to worry about, but those are the top two. So, I want to ask, “What would KV do?”

Unclear Peer

Got a question for Kode Vicious? E-mail him at kv@acmqueue.com—if you dare! And if your letter appears in print, he may even send you a Queue coffee mug, if he’s in the mood. And oh yeah, we edit letters for content, style, and for your own good!

Dear UP,
What would KV do? KV would run, not walk, to the nearest bar and find a lawyer. You can always find lawyers in bars, or at least I do; they’re the only ones drinking faster than I am. The fact that you believe your users will use your software only for your designated purpose makes you either naive or stupid, and since I’m feeling kind today, I’ll assume naive. So let’s assume your company has lawyers to protect them from the usual charges of providing a system whereby people can exchange material that perhaps certain other people, who also have lawyers, consider it wrong to exchange.

What else is there to worry about? Plenty. At the crux of all file-sharing systems—whether they are peer-to-peer, client/server, or what have you—is the type of publish/subscribe paradigm they follow. The publish/subscribe model defines how users share data. The models follow a spectrum from low to high risk. A high-risk model is one in which the application attempts to share as much data as possible, such as sharing all the data on your disk with everyone as the basic default setting. Laugh if you like, but you’ll cry when you find out that lots of companies have built just such systems, or systems that are close to being that permissive.

Here are some suggestions for building a low-risk peer-to-peer file-sharing system. First of all, the default mode of all such software should be to deny access. Immediately after installing the software, no new files should be available to anyone. There are several cases in which software did not obey this simple rule, so when a nefarious person wanted to steal data, he or she would trick someone into downloading and installing the file-sharing software. This is often referred to as a “drive-by install.” The attacker would then have free access to the victim’s computer, or at least to the My Documents or similar folder.

Second, the person sharing the files—that is, the sharer—should have the most control over the data. The person connecting to the sharer’s computer should be able to see and copy only the files that the sharer wishes that person to see and copy. In a reasonably low-risk system, the sharing of data would have a timeout such that unless the requester got the data by a certain time (say, 24 hours), the data would no longer be available. Such timeouts can be implemented by having the sharer’s computer generate a one-time-use token containing a timeout that the requester’s computer must present to get a particular file.
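The expiring, one-time-use token KV describes could be sketched along these lines. This is a minimal illustration, not any real system’s API: the class and method names, and the 24-hour window, are assumptions for the example.

```python
import secrets
import time

TOKEN_LIFETIME = 24 * 60 * 60  # KV's suggested 24-hour window, in seconds

class Sharer:
    """Toy model of the sharer's side of the token scheme."""

    def __init__(self):
        self._tokens = {}  # token -> (filename, expiry timestamp)

    def grant(self, filename):
        """Issue a single-use token for one file, valid for 24 hours."""
        token = secrets.token_urlsafe(32)  # unguessable, cryptographically random
        self._tokens[token] = (filename, time.time() + TOKEN_LIFETIME)
        return token

    def redeem(self, token, filename):
        """Return True at most once, if the token matches the file and is unexpired."""
        entry = self._tokens.pop(token, None)  # pop makes the token single-use
        if entry is None:
            return False
        granted_file, expiry = entry
        return granted_file == filename and time.time() < expiry

sharer = Sharer()
t = sharer.grant("report.pdf")
print(sharer.redeem(t, "report.pdf"))  # True: first, timely use
print(sharer.redeem(t, "report.pdf"))  # False: token already consumed
```

Because the sharer pops the token on first use, a stolen or replayed token is worthless, and an expired one fails the time check even if it was never used.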


Third, the system should be slow to open up access. Although we don’t want the user to have to say OK to everything—because eventually the user will just click OK without thinking—you do want a system that requires user intervention to grant more access.

Fourth, files should not be stored in a known or easily guessable default location. Sharing a well-known folder such as My Documents has gotten plenty of people into trouble. The best way to store downloaded or shared files is to have the file-sharing application create and track randomly named folders beneath a well-known location in the file system. Choosing a reasonably sized random string of letters and digits as a directory name is a good practice. This makes it harder for virus and malware writers to know where to go to steal important information.

Fifth, and last for this particular letter, the sharing should be one-to-one, not one-to-many. Many systems share data one-to-many, including most file-swapping applications, such that anyone who can find your machine can get at the data you are willing to share. Global sharing should be the last option a user has, not the first. The first option should be sharing with a single person, the second with a group of people, and the last, global.

You may note that a lot of this advice is in direct conflict with some of the more famous file-sharing, peer-to-peer systems that have been created in the past few years. That is because I have been trying to show you a system that allows for data protection while data is being shared. If you want to create an application that is as open—and as dangerous—as Napster or its errant children were and are, then that’s a different story. From the sound of your letter, however, that is not what you want.
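The fourth suggestion—randomly named folders beneath a well-known location—might look something like the following sketch. The base-path layout, function name, and 16-character length are assumptions for illustration only.

```python
import os
import secrets
import string
import tempfile

# Letters and digits, as KV suggests for unguessable directory names.
ALPHABET = string.ascii_letters + string.digits

def random_share_dir(base, length=16):
    """Create and return a randomly named folder under a well-known base path."""
    name = "".join(secrets.choice(ALPHABET) for _ in range(length))
    path = os.path.join(base, name)
    os.makedirs(path)  # the application would record this path for later lookup
    return path

# Demo: create one such folder under a throwaway base directory.
share_root = tempfile.mkdtemp()
shared = random_share_dir(share_root)
```

The application, not the user, tracks the mapping from shared files to these folders, so malware scanning for a fixed path such as My Documents comes up empty.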

Other things you will have to worry about include the security of the application itself. A program that is designed to take files from other computers is a perfect vector for attacks by virus writers. It would be unwise—well, actually, it would be incredibly stupid—to write such a program so that it executes or displays files immediately after transfer without asking the user first. I have to admit that answering yes to the question, “Would you like to run this .exe file?” on Windows is about the same as answering yes to, “Would you like me to pull the trigger?” in a game of Russian roulette.

Another open research area, er, I mean, big headache, which I’ll not get into here, is the authentication system itself. Outside of all the other advice I just gave, this problem is itself quite thorny. How do I know that you are you? How do you know that I am me? Perhaps I am the Walrus, except, wait, the Walrus was Paul.

Well, I believe you have enough to think about now. I suggest you sleep on it and wake up screaming, just like...

KODE VICIOUS, known to mere mortals as George V. Neville-Neil, works on networking and operating system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are code spelunking, operating systems, and rewriting your bad code (OK, maybe not that last one). He earned his bachelor’s degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. He is an avid bicyclist and traveler who has made San Francisco his home since 1990.
© 2006 ACM 1542-7730/06/1200 $5.00


Coming in February: Secure Open Source
Open Source vs. Closed Source
Security Vulnerability Management
Updates That Don’t Go Boom!



A Conversation with John Hennessy and David Patterson interview
Photography by Jacob Leverich


They wrote the book on computer architecture.

As authors of the seminal textbook, Computer Architecture: A Quantitative Approach (4th Edition, Morgan Kaufmann, 2006), John Hennessy and David Patterson probably don’t need an introduction. You’ve probably read them in college or, if you were lucky enough, even attended one of their classes. Since rethinking, and then rewriting, the way computer architecture is taught, both have remained committed to educating a new generation of engineers with the skills to tackle today’s tough problems in computer architecture, Patterson as a professor at Berkeley and Hennessy as a professor, dean, and now president of Stanford University. In addition to teaching, both have made significant contributions to computer architecture research, most notably in the area of RISC (reduced instruction set computing). Patterson pioneered the RISC project at Berkeley, which produced research on which Sun’s Sparc processors (and many others) would later be based. Meanwhile, Hennessy ran a similar RISC project at Stanford in the early 1980s called MIPS. Hennessy would later commercialize this research and found MIPS Computer Systems, whose RISC designs eventually made it into the popular game consoles of Sony and Nintendo.

Interviewing Hennessy and Patterson this month is Kunle Olukotun, associate professor of electrical engineering and computer science at Stanford University. Olukotun led the Stanford Hydra single-chip multiprocessor



research project, which pioneered multiple processors on a single silicon chip. Technology he helped develop and commercialize is now used in Sun Microsystems’s Niagara line of multicore CPUs.

KUNLE OLUKOTUN I want to start by asking why you decided to write Computer Architecture: A Quantitative Approach.

DAVID PATTERSON Back in the 1980s, as RISC was just getting under way, I think John and I kept complaining to each other about the existing textbooks. I could see that I was going to become the chair of the computer science department, which I thought meant I wouldn’t have any time. So we said, “It’s now or never.”

JOHN HENNESSY As we thought about the courses we were teaching in computer architecture—senior undergraduate and first-level graduate courses—we were very dissatisfied with what resources were out there. The common method of teaching a graduate-level, even an introductory graduate-level computer architecture course, was what we referred to as the supermarket approach. The course would consist of selected readings—sometimes a book, but often selected readings. Many people used [Dan] Siewiorek, [Gordon] Bell, and [Allen] Newell (authors of Computer Structures, McGraw-Hill, 1982), which were essentially selected readings. Course curricula looked as though someone had gone down the aisle and picked one selection from each aisle, without any notion of integration of the material, without thinking about the objective, which in the end was to teach people how to design computers that would be faster or cheaper, and with better cost performance.

KO This quantitative approach has had a significant impact on the way that the industry has designed computers and especially the way that computer research has been done. Did you expect your textbook to have the wide impact that it had?

JH The publisher’s initial calculation was that we needed to sell 7,000 copies just to break even, and they thought we had a good shot at getting to maybe 10,000 or 15,000.
As it turned out, the first edition sold well over 25,000. We didn’t expect that.

DP This was John’s first book, but I had done several books before, none of which was in danger of making me money. So I had low expectations, but I think we were shooting for artistic success, and it turned out to be a commercial success as well.

JH The book captured a lot of attention both among academics using it in classroom settings and among practicing professionals in the field. Microsoft actually stocked
16 December/January 2006-2007 ACM QUEUE

it in its company store for employees. I think what also surprised us is how quickly it caught on internationally. We’re now in at least eight languages.

DP I got a really great compliment the other day when I was giving a talk. Someone asked, “Are you related to the Patterson, of Patterson and Hennessy?” I said, “I’m pretty sure, yes, I am.” But he said, “No, you’re too young.” So I guess the book has been around for a while.

JH Another thing I’d say about the book is that it wasn’t until we started on it that I developed a solid and complete quantitative explanation of what had happened in the RISC developments. By using the CPI formula

Execution Time/Program = Instructions/Program × Clocks/Instruction × Time/Clock

we could show that there had been a real breakthrough in terms of instruction throughput, and that it overwhelmed any increase in instruction count. With a quantitative approach, we should be able to explain such insights quantitatively. In doing so, it also became clear how to explain it to other people.

DP The subtitle, Quantitative Approach, was not just a casual additive. This was a turn away from, amazingly, people spending hundreds of millions of dollars on somebody’s hunch of what a good instruction set would be—somebody’s personal taste. Instead, there should be engineering and science behind what you put in and what you leave out. So, we worked on that title. We didn’t quite realize—although I had done books before—what we had set ourselves up for. We both took sabbaticals, and we said, “Well, how hard can it be? We can just use the lecture notes from our two courses.” But, boy, then we had a long way to go.

JH We had to collect data. We had to run simulations. There was a lot of work to be done in that first book. In the more recent edition, the book has become sufficiently well known that we have been able to enlist other people to help us collect data and get numbers, but in the first one, we did most of the work ourselves.
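The CPI formula Hennessy cites can be illustrated with a small sketch. The numbers below are invented for illustration, not measurements from the interview: they show how a lower CPI can overwhelm a higher instruction count, which is the breakthrough he describes.

```python
def execution_time(instructions, cpi, clock_hz):
    """Seconds per program: instructions x clocks/instruction x seconds/clock."""
    return instructions * cpi / clock_hz

# Hypothetical complex-instruction design: fewer instructions, high CPI.
cisc = execution_time(instructions=1.0e9, cpi=5.0, clock_hz=50e6)
# Hypothetical RISC design: ~30% more instructions, far lower CPI.
risc = execution_time(instructions=1.3e9, cpi=1.5, clock_hz=50e6)

print(cisc)  # 100.0 seconds
print(risc)  # 39.0 seconds: the CPI win overwhelms the instruction-count loss
```

Even though the RISC program executes more instructions, the product of the three factors is what matters, so its throughput advantage dominates.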
DP We spent time at the DEC Western Research Lab, where we hid out three days a week to get together and talk. We would write in between, and then we would go there and spend a lot of time talking through the ideas. We made a bunch of decisions that I think are unchanged in the fourth edition of the book. For example, an idea has to be in some commercial product before we put it into the book. There are thousands of ideas, so how do you pick? If no one has bothered to use it yet, then we’ll wait till it gets used before we describe it.
rants: feedback@acmqueue.com

KO Do you think that limits the forward-looking nature of the book?

DP I think it probably does, but on the other hand, we’re less likely to put a bunch of stuff in that ends up being thrown away.

JH On balance, our approach has probably benefited us more often than it has hurt us. There are a lot of topics that became very faddish in the architecture research community but never really emerged. For example, we didn’t put trace caches in the book when they were first just an academic idea; by the time we did put them in, it was already clear that they were going to be of limited use, and we put in a small amount of coverage. That’s a good example of not jumping the gun too early.

DP I think value prediction was another. There was tremendous excitement about its potential, and it ended up having limited applicability.

KO You delayed the third edition for Itanium, right?

JH I think our timing worked out right. It just goes to show the value of the quantitative approach. I think you can make a lot of pronouncements about an architecture, but when the rubber meets the road, does it perform or not?

DP One of the fallacies and pitfalls to consider is that you shouldn’t be comparing your performance to computers of today, given Moore’s law. You should be comparing yourself to performances at the time the computers come out. That relatively straightforward observation was apparently, to many people in marketing departments and to executives at computer companies, a surprising observation. The Itanium was very late, which is one of its problems.

KO You’ve made a commitment to keeping the text up-to-date, so will there be a fifth edition?

DP It’s actually a lot more than four editions. We originally wanted to write the book for graduate students, and then our publisher said, “You need to make this information available for undergraduates.”

JH We thought somebody else would write an undergraduate textbook, but nobody did.
DP So we’ve now done three editions of the undergraduate book and four editions of the senior/graduate book.

JH What makes it so much work is that in each edition, 60 percent of the pages are essentially new. Something like 75 percent are substantially new, and 90 percent of them are touched in that process, not counting appendices. We replan each book every single time. It’s not a small undertaking.
more queue: www.acmqueue.com

DP I’m pretty proud of this latest edition. We felt really good about the first edition, but then I think some of the editions just got really big. This one, we’ve put on a diet and tried to concentrate on what we think is the essence of what’s going on, and moved the rest of the stuff into the CD and appendices.

KO How would you characterize the current state of computer architecture? Could you talk about the pace of innovation, compared with what it was in the past?

JH I think this is nothing less than a giant inflection point, if you look strictly from an architectural viewpoint—not a technology viewpoint. Gordon Bell has talked eloquently about defining computers in terms of what I might think of as technology-driven shifts. If you look at architecture-driven shifts, then this is probably only the fourth. There’s the first-generation electronic computers. Then I would put a sentinel at the IBM 360, which was really the beginning of the notion of an instruction-set architecture that was independent of implementation. I would put another sentinel marking the beginning of the pipelining and instruction-level parallelism movement. Now we’re into the explicit parallelism multiprocessor era, and this will dominate for the foreseeable future. I don’t see any technology or architectural innovation on the horizon that might be competitive with this approach.

DP Back in the ’80s, when computer science was just learning about silicon and architects were able to understand chip-level implementation and the instruction set, I think the graduate students at Berkeley, Stanford, and elsewhere could genuinely build a microprocessor that was faster than what Intel could make, and that was amazing. Now, I think today this shift toward parallelism is being forced not by somebody with a great idea, but because we don’t know how to build hardware the conventional way anymore. This is another brand-new opportunity for graduate students at Berkeley and Stanford and other schools to build a microprocessor that’s genuinely better than what Intel can build. And once again, that is amazing.

JH In some ways it’s déjà vu, much as the early RISC days relied on collaboration between compiler writers and architects and implementers and even operating-system people in the cases of commercial projects. It’s the same thing today because this era demands a level of collaboration and cross-disciplinary problem solving and design. It’s absolutely mandatory. The architects can’t do it alone. Once ILP (instruction-level parallelism) got rolling, at least in the implicit ILP approaches, the architects could do most of the work. That’s not going to be true going forward.
DP This parallelism challenge involves a much broader community, and we have to get into applications and language design, and maybe even numerical analysis, not just compilers and operating systems. God knows who should be sitting around the table—but it’s a big table. Architects can’t do it by themselves, but I also think you can’t do it without the architects.

KO One of the things that was nice about RISC is that with a bunch of graduate students, you could build a 30,000- or 40,000-transistor design, and that was it. You were done.

DP By the way, that was a lot of work back then. Computers were a lot slower!

JH We were working with hammers and chisels.

DP We were cutting Rubylith with X-acto knives, as I remember.

KO Absolutely. So today, if you really want to make an impact, it’s very difficult to actually do VLSI (very large scale integration) design in an academic setting.

JH I don’t know that that’s so true. It may have gotten easier again. One could imagine designing some novel multiprocessor starting with a commercial core, assuming that commercial core has sufficient flexibility. You can’t design something like a Pentium 4, however. It’s completely out of the range of what’s doable.

DP We recently painfully built a large microprocessor. At the ISCA (International Symposium on Computer Architecture) conference in 2005, a bunch of us were in the hallway talking about exactly this issue. How in the world are architects going to build things when it’s so hard to build chips? We absolutely have to innovate, given what has happened in the industry and the potential of this switch to parallelism. That led to a project involving 10 of us from several leading universities, including Berkeley, Carnegie-Mellon, MIT, Stanford, Texas, and Washington. The idea is to use FPGAs (field programmable gate arrays). The basic bet is that FPGAs are so large we could fit a lot of simple processors on an FPGA. If we just put, say, 50 of them together, we could build 1,000-processor systems from FPGAs. FPGAs are close enough to the design effort of hardware, so the results are going to be pretty convincing. People will be able to innovate architecturally in this FPGA and will be able to demonstrate ideas well enough that we could change what industry wants to do. We call this project Research Accelerator for Multiple Processors, or RAMP. There’s a RAMP Web site (http://ramp.eecs.berkeley.edu).

KO Do you have industry partners?

DP Yes, we’ve got IBM, Sun, Xilinx, and Microsoft.
Chuck Thacker, Technical Fellow at Microsoft, is getting Microsoft back into computer architecture, which is another reflection that architecture is exciting again. RAMP is one of his vehicles for doing architecture research.

JH I think it is time to try. There are challenges, clearly, but the biggest challenge by far is coming up with sufficiently new and novel approaches. Remember that this era is going to be about exploiting some sort of explicit parallelism, and if there’s a problem that has confounded computer science for a long time, it is exactly that. Why did the ILP revolution take off so quickly? Because programmers didn’t have to know about it. Well, here’s an approach where I suspect any way you encode parallelism, even if you embed the parallelism in a programming language, programmers are going to have to be aware of it, and they’re going to have to be aware that memory has a distributed model and synchronization is expensive and all these sorts of issues.

DP That’s one of the reasons we’re excited about what the actual RAMP vision is: Let’s create this thing where the architects supply the logic design, and it’s inexpensive and runs not as fast as the real chip but fast enough to run real software, so we can put it in everybody’s hands and they can start getting experience with a 1,000-processor system or a lot bigger than you can buy from Intel. Not only will it enable research, it will enable teaching. We’ll be able to take a RAMP design, put it in the classroom, and say, “OK, today it’s a shared multiprocessor. Tomorrow it has transactional memory.” The plus side with FPGAs is that if somebody comes up with a great idea, we don’t have to wait four years for the chips to get built before we can start using it. We can FTP the designs overnight and start trying it out the next day.

KO I think FPGAs are going to enable some very interesting architecture projects.

DP Architecture is interesting again. From my perspective, parallelism is the biggest challenge since high-level programming languages. It’s the biggest thing in 50 years because industry is betting its future that parallel programming will be useful. Industry is building parallel hardware, assuming people can use it. And I think there’s a chance they’ll fail since the software is not necessarily in place. So this is a gigantic challenge facing the computer science community. If we miss this opportunity, it’s going to be bad for the industry. Imagine if processors stop getting faster, which is not impossible. Parallel programming has proven to be a really hard concept. Just because you need a solution doesn’t mean you’re going to find it.


JH If anything, a bit of self-reflection on what happened in the last decade shows that we—and I mean collectively the companies, research community, and government funders—became too seduced by the ease with which instruction-level parallelism was exploited, without thinking that the road had an ending. We got there very quickly—more quickly than I would have guessed—but now we haven’t laid the groundwork. So I think Dave is right. There’s a lot of work to do without great certainty that we will solve those problems in the near future.

KO One of the things that we had in the days when you were doing the RISC research was a lot of government funding for this work. Do we have the necessary resources to make parallelism what we know it has to be in order to keep computer performance going?

DP I’m worried about funding for the whole field. As ACM’s president for two years, I spent a large fraction of my time commenting about the difficulties facing our field, given the drop in funding by certain five-letter government agencies. They just decided to invest it in little organizations like IBM and Sun Microsystems instead of the proven successful path of universities.

JH DARPA spent a lot of money pursuing parallel computing in the ’90s. I have to say that they did help achieve some real advances. But when we start talking about parallelism and ease of use of truly parallel computers, we’re talking about a problem that’s as hard as any that computer science has faced. It’s not going to be conquered unless the research program has a level of long-term commitment and has sufficiently significant segments of strategic funding to allow people to do large experiments and try ideas out.

DP For a researcher, this is an exciting time. There are huge opportunities. If you discover how to efficiently program a large number of processors, the world is going to beat a path to your door. It’s not such an exciting time to be in industry, however, where you’re betting the company’s future that someone is going to come up with the solution.

KO Do you see closer industry/academic collaboration to solve this problem? These things wax and wane, but given the fact that industry needs new ideas, then clearly there’s going to be more interest in academic research to try to figure out where to go next.

JH I would be panicked if I were in industry. Now I’m forced into an approach that I haven’t laid the groundwork for, it requires a lot more software leverage than the previous approaches, and the microprocessor manufacturers don’t control the software business, so you’ve got a very difficult situation. It’s far more important now to be engaging the universities and working on these problems than it was, let’s say, helping find the next step in ILP. Unfortunately, we’re not going to find a quick fix.

DP RAMP will help us get to the solution faster than without it, but it’s not like next year when RAMP is available, we’ll solve the problem six months later. This is going to take a while. For RISC, the big controversy was whether or not to change the instruction set. Parallelism has changed the programming model. It’s way beyond changing the instruction set. At Microsoft in 2005, if you said, “Hey, what do you guys think about parallel computers?” they would reply, “Who cares about parallel computers? We’ve had 15 or 20 years of doubling every 18 months. Get lost.” You couldn’t get anybody’s attention inside Microsoft by saying that the future was parallelism. In 2006, everybody at Microsoft is talking about parallelism. Five years ago, if you had this breakthrough idea in parallelism, industry would show you the door. Now industry is highly motivated to listen to new ideas. So they are a ready market, but I just don’t think industry is set up to be a research funding agency.

The one organization that might come to the rescue would be the SRC (Semiconductor Research Corporation), which is a government/semiconductor industry joint effort that funnels monies to some universities. That type of an organization is becoming aware of what’s facing the microprocessor and, hence, semiconductor industry. They might be in position to fund some of these efforts.

KO There are many other issues beyond performance that could impact computer architecture. What ideas are there in the architecture realm, and what sort of impact are these other nonperformance metrics going to have on computing?

JH Well, power is easy. Power is performance. Completely interchangeable. How do you achieve a level of improved efficiency in the amount of power you use? If I can improve performance per watt, I can add more power and be assured of getting more performance.

DP It’s something that has been ignored so far, at least in the data center.

JH I agree with that. What happened is we convinced ourselves that we were on a long-term road with respect to ILP that didn’t have a conceivable end, ignoring the fact that with every step on the road we were achieving lower levels of efficiency and hence bringing the end of that road closer and closer. Clearly, issues of reliability matter a lot, but as the work at Berkeley and other places has shown, it’s a far more complicated metric than just looking at a simple notion of processor reliability.

DP Yes, I guess what you’re saying is, performance per watt is still a quantitative and benchmarkable goal. Reliability is a lot harder. We haven’t successfully figured out thus far how to insert bugs and things and see how things work. Now, that’s something we talked about at Berkeley and never found a good vehicle for. I’m personally enthusiastic about the popularity of virtual machines for a bunch of reasons. In fact, there’s a new section on virtual machines in our latest book.

JH Whether it’s reliability or security, encapsulation in some form prevents a failure from rippling across an entire system. In security, it’s about containment. It’s about ensuring that whenever or wherever attacks occur, they’re confined to a relatively small area.

DP We could use virtual machines to do fault insertion. What we’re doing right now at Berkeley is looking into using virtual machines to help deal with power. We’re interested in Internet services. We know that with Internet services, the workload varies by time of day and day of the week. Our idea is when the load goes down, move the stuff off some of the machines and turn them off. When the load goes up, turn them on and move stuff to them, and we think there will be surprisingly substantial power savings with that simple policy.

KO People will come up with new ideas for programming parallel computers, but how will they know whether these ideas are better than the old ideas?

DP We always think of the quantitative approach as pertaining to hardware and software, but there are huge fractions of our respective campuses that do quantitative work all the time with human beings. There are even elaborate rules on human-subject experiments. It would be new to us to do human-subject experiments on the ease of programming, but there is a large methodology that’s popular on campuses that computer science uses only in HCI (human-computer interaction) studies. There are ways to do that kind of work. It will be different, but it’s not unsolvable.

KO Would you advocate more research in this area of programmability?

DP Yes. I think if you look at the history of parallelism, computer architecture often comes up with the wild idea of how to get more peak performance out of a certain

fixed hardware budget. Then five or 10 years go by where a bunch of software people try to figure out how to make that thing programmable, and then we’re off to the next architecture idea when the old one doesn’t turn out. Maybe we should put some science behind this, trying to evaluate what worked and what didn’t work before we go onto the next idea. My guess is that’s really the only way we’re going to solve these problems; otherwise, it will just be that all of us will have a hunch about what’s easier to program. Even shared memory versus message passing—this is not a new trade-off. It has been around for 20 years. I’ll bet all of us in this conversation have differing opinions about the best thing to do. How about some experiments to shed some light on what the trade-offs are in terms of ease of programming of these approaches, especially as we scale? If we just keep arguing about it, it’s possible it will never get solved; and if we don’t solve it, we won’t be able to rise up and meet this important challenge facing our field.

KO Looking back in history at the last big push in parallel computing, we see that we ended up with message passing as a de facto solution for developing parallel software. Are we in danger of that happening again? Will we end up with the lowest common denominator—whatever is easiest to do?

JH The fundamental problem is that we don’t have a really great solution. Many of the early ideas were motivated by observations of what was easy to implement in the hardware rather than what was easy to use: how we’re going to change our programming languages; what we can do in the architecture to mitigate the cost of various things, communication in particular, but synchronization as well. Those are all open questions in my mind. We’re really in the early stages of how we think about this. If it’s the case that the amount of parallelism that programmers will have to deal with in the future will not be just two or four processors but tens or hundreds and thousands for some applications, then that’s a very different world than where we are today.

DP On the other hand, there’s exciting stuff happening in software right now. In the open source movement, there are highly productive programming environments that are getting invented at pretty high levels. Everybody’s example is Ruby on Rails, a pretty different way to learn how to program. This is a brave new world where you can rapidly create an Internet service that is dealing with lots of users. There is evidence of tremendous advancement in part of the programming community—not particularly the academic part. I don’t know if academics are paying attention to this kind of work or not in the language community, but there’s hope of very different ways of doing things than we’ve done in the past. Is there some way we could leverage that kind of innovation in making it compatible with this parallel future that we’re sure is out there? I don’t know the answer to that, but I would say nothing is off the table. Any solution that works, we’ll do it.

KO Given that you won’t be able to buy a microprocessor with a single core in the near future, you might be optimistic that the proliferation of these multicore parallel architectures will enable the open source community to come up with something interesting. Is that likely?

DP Certainly. What I’ve been doing is to tell all my colleagues in theory and software, “Hey, the world has changed. The La-Z-Boy approach isn’t going to work anymore. You can’t just sit there, waiting for your single processor to get a lot faster and your software to get faster, and then you can add the feature sets. That era is over. If you want things to go faster, you’re going to have to do parallel computing.” The open source community is a real nuts-and-bolts community. They need to get access to parallel machines to start innovating. One of our tenets at RAMP is that the software people don’t do anything until the hardware shows up.

JH The real change that has occurred is the free software movement. If you have a really compelling idea, your ability to get to scale rapidly has been dramatically changed.

DP In the RAMP community, we’ve been thinking about how to put this in the hands of academics. Maybe we should be putting a big RAMP box out there on the Internet for the open source community, to let them play with a highly scalable processor and see what ideas they can come up with.
I guess that’s the right question: What can we do to engage the open source community to get innovative people, such as the authors of Ruby on Rails and other innovative programming environments? The parallel solutions may not come from academia or from research labs as they did in the past. Q

© 2006 ACM 1542-7730/06/1200 $5.00



Computer Architecture




Multicore architectures are an inflection point in mainstream software development because they force developers to write parallel programs. In a previous article in Queue, Herb Sutter and James Larus pointed out, “The concurrency revolution is primarily a software revolution. The difficult problem is not building multicore hardware, but programming it in a way that lets mainstream applications benefit from the continued exponential growth in CPU performance.”1 In this new multicore world, developers must write explicitly parallel applications that can take advantage of the increasing number of cores that each successive multicore generation will provide.

Parallel programming poses many new challenges to the developer, one of which is synchronizing concurrent access to shared memory by multiple threads. Programmers have traditionally used locks for synchronization, but lock-based synchronization has well-known pitfalls. Simplistic coarse-grained locking does not scale well, while more sophisticated fine-grained locking risks introducing deadlocks and data races. Furthermore, scalable libraries written using fine-grained locks cannot be easily composed in a way that retains scalability and avoids deadlock and data races.

TM (transactional memory) provides a new concurrency-control construct that avoids the pitfalls of locks and significantly eases concurrent programming. It brings to mainstream parallel programming proven concurrency-control concepts used for decades by the database community. Transactional-language constructs are easy to use and can lead to programs that scale. By avoiding deadlocks and automatically allowing fine-grained concurrency, transactional-language constructs enable the programmer to compose scalable applications safely out of thread-safe libraries.

Although TM is still in a research stage, it has increasing momentum pushing it into the mainstream. The recently defined HPCS (high-productivity computing system) languages—Fortress from Sun, X10 from IBM, and Chapel from Cray—all propose new constructs for transactions in lieu of locks. Mainstream developers who are early adopters of parallel programming technologies have paid close attention to TM because of its potential for improving programmer productivity; for example, in his keynote address at the 2006 POPL (Principles of Programming Languages) symposium, Tim Sweeney of Epic Games pointed out that “manual synchronization…is hopelessly intractable” for dealing with concurrency in game-play simulation and claimed that “transactions are the only plausible solution to concurrent mutable state.”2

Despite its momentum, bringing transactions into the mainstream still faces many challenges. Even with transactions, programmers must overcome parallel programming challenges, such as finding and extracting parallel tasks and mapping these tasks onto a parallel architecture for efficient execution. In this article, we describe how

transactions ease some of the challenges programmers face using locks, and we look at the challenges system designers face implementing transactions in programming languages.


A memory transaction is a sequence of memory operations that either executes completely (commits) or has no effect (aborts).3 Transactions are atomic, meaning they are an all-or-nothing sequence of operations. If a transaction commits, then all of its memory operations appear to take effect as a unit, as if all the operations happened instantaneously. If a transaction aborts, then none of its stores appear to take effect, as if the transaction never happened. A transaction runs in isolation, meaning it executes as if it’s the only operation running on the system and as if all other threads are suspended while it runs. This means that the effects of a memory transaction’s stores are not visible outside the transaction until the transaction commits; it also means that there are no other conflicting stores by other transactions while it runs.
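These all-or-nothing semantics can be sketched in a few lines. The toy class below (our own illustration, not from the article, and nowhere near a real TM system) buffers a transaction’s stores and publishes them to shared memory only on commit; conflict detection between concurrent transactions, which a real implementation must provide, is omitted entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of transactional stores: nothing reaches shared memory until commit.
class ToyTransaction {
    private final Map<String, Integer> sharedMemory;
    private final Map<String, Integer> writeBuffer = new HashMap<>();

    ToyTransaction(Map<String, Integer> sharedMemory) {
        this.sharedMemory = sharedMemory;
    }

    void store(String address, int value) {
        writeBuffer.put(address, value); // invisible outside this transaction for now
    }

    int load(String address) {
        // A transaction sees its own buffered stores first.
        Integer buffered = writeBuffer.get(address);
        return buffered != null ? buffered : sharedMemory.getOrDefault(address, 0);
    }

    void commit() {
        sharedMemory.putAll(writeBuffer); // all stores take effect as a unit
        writeBuffer.clear();
    }

    void abort() {
        writeBuffer.clear(); // no stores take effect, as if the transaction never ran
    }
}
```

An aborted transaction leaves shared memory untouched; a committed one publishes every buffered store at once, which is exactly the atomicity guarantee described above.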

Transactions give the illusion of serial execution to the programmer, and they give the illusion that they execute as a single atomic step with respect to other concurrent operations in the system. The programmer can reason serially because no other thread will perform any conflicting operation. Of course, a TM system doesn’t really execute transactions serially; otherwise, it would defeat the purpose of parallel programming. Instead, the system “under the hood” allows multiple transactions to execute concurrently as long as it can still provide atomicity and isolation for each transaction. Later in this article, we cover how an implementation provides atomicity and isolation while still allowing as much concurrency as possible.

The best way to provide the benefits of TM to the programmer is to replace locks with a new language construct such as atomic { B } that executes the statements in block B as a transaction. A first-class language construct not only provides syntactic convenience for the programmer, but also enables static analyses that provide compile-time safety guarantees and enables compiler optimizations to improve performance, which we touch on later in this article. Figure 1 illustrates how an atomic statement could be introduced and used in an object-oriented language such as Java. The figure shows two different implementations of a thread-safe map data structure. The code in section A of the figure shows a lock-based map using Java’s synchronized statement. The get() method simply delegates the call to an underlying non-thread-safe map

Figure 1: Lock-based vs. Transactional Map Data Structure

(A) Lock-based:

    class LockBasedMap implements Map {
      Object mutex;
      Map m;
      LockBasedMap(Map m) {
        this.m = m;
        mutex = new Object();
      }
      public Object get() {
        synchronized (mutex) {
          return m.get();
        }
      }
      // other Map methods ...
    }

(B) Transactional:

    class AtomicMap implements Map {
      Map m;
      AtomicMap(Map m) {
        this.m = m;
      }
      public Object get() {
        atomic {
          return m.get();
        }
      }
      // other Map methods ...
    }

implementation, first wrapping the call in a synchronized statement. The synchronized statement acquires a lock represented by a mutex object held in another field of the synchronized hash map. This same mutex object guards all the other calls to this hash map. Using locks, the programmer has explicitly forced all threads to execute any call through this synchronized wrapper serially. Only one thread at a time can call any method on this hash map. This is an example of coarse-grained locking. It’s easy to write thread-safe programs in this way—you simply guard all calls through an interface with a single lock, forcing threads to execute inside the interface one at a time.

Part B of figure 1 shows the same code, using transactions instead of locks. Rather than using a synchronized statement with an explicit lock object, this code uses a new atomic statement. This atomic statement declares that the call to get() should be done atomically, as if it were done in a single execution step with respect to other threads. As with coarse-grained locking, it’s easy for the programmer to make an interface thread safe by simply wrapping all the calls through the interface with an atomic statement. Rather than explicitly forcing one thread at a time to execute any call to this hash map, however, the programmer has instead declared to the system that the call should execute atomically. The system now assumes responsibility for guaranteeing atomicity and implements concurrency control under the hood.

Figure 2: Performance of Transactions vs. Locks. The plot shows execution time in seconds versus the number of threads for the three versions of HashMap: synch (fine), atomic, and synch (coarse).

more queue: www.acmqueue.com

Unlike coarse-grained locking, transactions can provide scalability as long as the data-access patterns allow transactions to execute concurrently. The transaction system can provide good scalability in two ways:

• It can allow concurrent read operations to the same variable. In a parallel program, it’s safe to allow two or more threads to read the same variable concurrently. Basic mutual exclusion locks don’t permit concurrent readers; to allow concurrent readers, the programmer has to use special reader-writer locks, increasing the program’s complexity.

• It can allow concurrent read and write operations to different variables. In a parallel program, it’s safe to allow two or more threads to read and write disjoint variables concurrently. A programmer can explicitly code fine-grained disjoint-access concurrency by associating different locks with different fine-grained data elements. This is usually a tedious and difficult task, however, and risks introducing bugs such as deadlocks and data races. Furthermore, as we show in a later example, fine-grained locking does not lend itself to modular software engineering practices: In general, a programmer can’t take software modules that use fine-grained locking and compose them together in a manner that safely allows concurrent access to disjoint data.

Transactions can be implemented in such a way that they allow both concurrent read accesses and concurrent accesses to disjoint, fine-grained data elements (e.g., different objects or different array elements). Using transactions, the programmer gets these forms of concurrency without having to code them explicitly in the program. It is possible to write a concurrent hash-map data structure using locks so that you get both concurrent read accesses and concurrent accesses to disjoint data. In fact, the recent Java 5 libraries provide a version of HashMap, called ConcurrentHashMap, that does exactly this.
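The flavor of such fine-grained locking can be conveyed with a small sketch. The following lock-striped map is a hypothetical illustration (far simpler than the real ConcurrentHashMap algorithm): each key hashes to one of 16 stripes, so operations on keys in different stripes proceed concurrently, while keys in the same stripe still serialize.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative lock striping (hypothetical; the real ConcurrentHashMap is far more subtle).
class StripedMap {
    private static final int STRIPES = 16;
    private final Object[] locks = new Object[STRIPES];
    private final List<Map<Object, Object>> buckets = new ArrayList<>();

    StripedMap() {
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new Object();
            buckets.add(new HashMap<>());
        }
    }

    private int stripe(Object key) {
        return (key.hashCode() & 0x7fffffff) % STRIPES;  // non-negative stripe index
    }

    // Only the stripe owning this key is serialized; operations on keys
    // in other stripes proceed concurrently.
    public Object put(Object key, Object value) {
        int s = stripe(key);
        synchronized (locks[s]) {
            return buckets.get(s).put(key, value);
        }
    }

    public Object get(Object key) {
        int s = stripe(key);
        synchronized (locks[s]) {
            return buckets.get(s).get(key);
        }
    }
}
```

Even this toy version forces the programmer to reason about lock placement; iterating over the whole map, or composing operations across stripes, would require additional care.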
The code for ConcurrentHashMap, however, is significantly longer and more complicated than the version


Computer Architecture

using coarse-grained locking. The algorithm was designed by threading experts and went through a comprehensive public review process before it was added to the Java standard. In general, writing highly concurrent lock-based code such as ConcurrentHashMap is complicated and bug-prone, and it adds complexity to the software development process.

Figure 2 compares the performance of the three different versions of HashMap. It plots the time it takes to complete a fixed set of insert, delete, and update operations on a 16-way SMP (symmetric multiprocessing) machine.4 As the numbers show, the performance of coarse-grained locking does not improve as the number of processors increases, so coarse-grained locking does not scale. The performance of fine-grained locking and transactional memory, however, improves as the number of processors increases. So for this data structure, transactions give you the same scalability and performance as fine-grained locking but with significantly less programming effort. As these numbers demonstrate, transactions delegate to the runtime system the hard task of allowing as much concurrency as possible.

Although highly concurrent libraries built using fine-grained locking can scale well, a developer doesn’t necessarily retain scalability after composing larger applications out of these libraries. As an example, assume the programmer wants to perform a composite operation that moves a value from one concurrent hash map to another, while maintaining the invariant that threads always see a key in either one hash map or the other, but never in neither. Implementing this requires that the programmer

resort to coarse-grained locking, thus losing the scalability benefits of a concurrent hash map (figure 3A). To implement a scalable solution to this problem, the programmer must somehow reuse the fine-grained locking code hidden inside the implementation of the concurrent hash map. Even if the programmer had access to this implementation, building a composite move operation out of it risks introducing deadlocks and data races, especially in the presence of other composite operations.

Transactions, on the other hand, allow the programmer to compose applications out of libraries safely and still achieve scalability. The programmer can simply wrap a transaction around the composite move operation (figure 3B). The underlying TM system will allow two threads to perform a move operation concurrently as long as the two threads access different hash-table buckets in both underlying hash-map structures. So transactions allow a programmer to take separately authored scalable software components and compose them into larger components in a way that still provides as much concurrency as possible, without risking deadlocks caused by concurrency control.

By providing a mechanism to roll back side effects, transactions enable a language to provide failure atomicity. In lock-based code, programmers must make sure that exception handlers properly restore invariants before releasing locks. This requirement often leads to complicated exception-handling code, because the programmer must not only make sure that a critical section catches and handles all exceptions, but also track the state of the data structures used inside the critical section so that the exception handlers can properly restore invariants. In a transaction-based language, the atomic statement can roll back all the side effects of the transaction (automatically restoring invariants) if an uncaught exception propagates out of its block.
This significantly reduces the amount of exception-handling code and improves robustness, as uncaught exceptions inside a transaction won’t compromise a program’s invariants.
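The lock-based burden can be made concrete with a hedged sketch (a hypothetical account-transfer example, not from the article): the catch block must manually undo the partial update to restore the invariant that the total balance is conserved, which is exactly the bookkeeping an atomic block would perform automatically on abort.

```java
// Hypothetical example: preserving an invariant (total balance is conserved)
// under exceptions, using locks. A transactional version would simply wrap
// the two updates in an atomic block and let the TM system roll them back.
class Accounts {
    int from = 100, to = 0;

    // Lock-based version: the programmer must restore the invariant by hand.
    synchronized void transferWithLocks(int amount) {
        from -= amount;                    // first half of the transfer
        try {
            if (amount > 50) throw new IllegalArgumentException("limit exceeded");
            to += amount;                  // second half of the transfer
        } catch (RuntimeException e) {
            from += amount;                // manual rollback to restore the invariant
            throw e;
        }
    }
}
```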

Figure 3: Thread-safe Composite Operation

A. Using coarse-grained locking:

    move(Object key) {
        synchronized (mutex) {
            map2.put(key, map1.remove(key));
        }
    }

B. Using a transaction:

    move(Object key) {
        atomic {
            map2.put(key, map1.remove(key));
        }
    }




Transactional memory transfers the burden of concurrency management from the application programmers to the system designers. Under the hood, a combination of software and hardware must guarantee that concurrent transactions from multiple threads execute atomically and in isolation. The key mechanisms for a TM system are data versioning and conflict detection.

As transactions execute, the system must simultaneously manage multiple versions of data. A new version, produced by one of the pending transactions, becomes globally visible only if the transaction commits. The old version, produced by a previously committed transaction, must be preserved in case the pending transaction aborts.

With eager versioning, a write access within a transaction immediately writes the new data version to memory. The old version is buffered in an undo log. If the transaction later commits, no further action is necessary to make the new versions globally visible. If the transaction aborts, the old versions must be restored from the undo log, causing some additional delay. To prevent other code from observing the uncommitted new versions (a loss of atomicity), eager versioning requires the use of locks or an equivalent hardware mechanism for the duration of the transaction.

Lazy versioning stores all new data versions in a write buffer until the transaction completes. If the transaction commits, the new versions become visible by copying them from the write buffer to the actual memory addresses. If the transaction aborts, no further action is needed, as the new versions were isolated in the write buffer. In contrast to eager versioning, the lazy approach is subject to loss of atomicity only during the commit process. The challenges with lazy versioning, particularly for software implementations, are the delay introduced on transaction commits and the need to search the write buffer first on transaction reads in order to access the latest data versions.
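The two policies can be modeled in a few lines of Java (a single-threaded toy model; the class and method names are ours, not a real TM API):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model of data versioning; 'memory' maps addresses to values.
class Versioning {
    Map<String, Integer> memory = new HashMap<>();

    // Eager: update memory in place, remember the old value in an undo log.
    Deque<Object[]> undoLog = new ArrayDeque<>();
    void eagerWrite(String addr, int value) {
        undoLog.push(new Object[] { addr, memory.get(addr) });
        memory.put(addr, value);            // new version visible immediately
    }
    void eagerAbort() {                      // restore old versions, newest first
        while (!undoLog.isEmpty()) {
            Object[] e = undoLog.pop();
            if (e[1] == null) memory.remove((String) e[0]);
            else memory.put((String) e[0], (Integer) e[1]);
        }
    }

    // Lazy: buffer new versions; memory is untouched until commit.
    Map<String, Integer> writeBuffer = new HashMap<>();
    void lazyWrite(String addr, int value) { writeBuffer.put(addr, value); }
    int lazyRead(String addr) {              // reads must search the buffer first
        return writeBuffer.getOrDefault(addr, memory.get(addr));
    }
    void lazyCommit() { memory.putAll(writeBuffer); writeBuffer.clear(); }
    void lazyAbort()  { writeBuffer.clear(); }
}
```

Eager versioning makes commits cheap and aborts expensive; lazy versioning reverses the trade-off and additionally pays a write-buffer lookup on every read.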
A conflict occurs when two or more transactions operate concurrently on the same data, with at least one transaction writing a new version. Conflict detection and resolution are essential to guarantee atomic execution. Detection relies on tracking the read set and write set for each transaction, which contain, respectively, the addresses it read from and wrote to during its execution. We add an address to the read set on the first read of it within the transaction. Similarly, we add an address to the write set on the first write access.

Under pessimistic conflict detection, the system checks for conflicts progressively as transactions read and write data. Conflicts are detected early and can be handled

either by stalling one of the transactions in place or by aborting one transaction and retrying it later. In general, the performance of pessimistic detection depends on the set of policies used to resolve conflicts, typically referred to as contention management. A challenging issue is the detection of recurring or circular conflicts between multiple transactions that can block all transactions from committing (lack of forward progress).

The alternative is optimistic conflict detection, which assumes conflicts are rare and postpones all checks until the end of each transaction. Before committing, a transaction validates that no other transaction is reading the data it wrote or writing the data it read. The drawback to optimistic detection is that conflicts are detected late, past the point where a transaction reads or writes the data. Hence, stalling in place is not a viable option for conflict resolution, and more work may be wasted as a result of aborts. On the other hand, optimistic detection guarantees forward progress in all cases by simply giving priority to the committing transaction on a conflict. It also allows additional concurrency for reads, as conflict checks for writes are performed toward the end of each transaction. Note that optimistic conflict detection does not work with eager versioning.

The granularity of conflict detection is also an important design parameter. Object-level detection is close to the programmer’s reasoning in object-oriented environments. Depending on the size of objects, it may also reduce the space and time overhead of conflict detection. Its drawback is that it may lead to false conflicts, when two transactions operate on different fields within a large object such as a multidimensional array. Word-level detection eliminates false conflicts but requires more space and time to track and compare read sets and write sets.
Cache-line-level detection provides a compromise between the frequency of false conflicts and the time and space overhead. Unfortunately, cache lines and words are not language-level entities, which makes it difficult for programmers to optimize conflicts in their code, particularly with managed runtime environments that hide data placement from the user.

A final challenge for TM systems is the handling of nested transactions. Nesting may occur frequently, given the trend toward library-based programming and the fact that transactions can be composed easily and safely. Early systems automatically flattened nested transactions by subsuming any inner transactions within the outermost one. While simple, the flattening approach prohibits explicit transaction aborts, which are useful for failure atomicity on exceptions. The alternative is to support partial rollback to the beginning of the nested transaction when a conflict or an abort occurs during its execution. This requires that the version management and conflict detection for a nested transaction be independent from those for the outermost transaction. In addition to allowing explicit aborts, such support for nesting provides a powerful mechanism for performance tuning and for controlling the interaction between transactions and runtime or operating system services.5

It is unclear which of these options leads to an optimal design. Further experience with prototype implementations and a wide range of applications is needed to quantify the trade-offs among performance, ease of use, and complexity. In some cases, a combination of design options leads to the best performance. For example, some TM systems use optimistic detection for reads and pessimistic detection for writes, while detecting conflicts at the word level for arrays and at the object level for other data types.6 Nevertheless, any TM system must provide efficient implementations for the key structures (read set, write set, undo log, write buffer) and must facilitate integration with optimizing compilers, managed runtimes, and existing libraries. The following sections discuss how these challenges are addressed with software and hardware techniques.

STM (software transactional memory) implements transactional memory entirely in software so that it runs on stock hardware. An STM implementation uses read and write barriers (that is, it inserts instrumentation) for all shared-memory reads and writes inside transactional code blocks. The instrumentation is inserted by a compiler and allows the runtime system to maintain the metadata required for data versioning and conflict detection.

Figure 4 shows an example of how an atomic construct could be translated by a compiler in an STM implementation. Part A shows an atomic code block written by the programmer, and part B shows the compiler instrumenting the code in the transactional block. We use a simplified control flow to ease the presentation. The setjmp function checkpoints the current execution context so that the transaction can be restarted on an abort. The stmStart function initializes the runtime data structures. Accesses to the global variables a and b are mediated through the barrier functions stmRead and stmWrite. The stmCommit function completes the transaction and makes its changes visible to other threads. The transaction gets validated periodically during its execution, and if a conflict is detected, the transaction is aborted. On an abort, the STM library rolls back all the updates performed by the transaction, uses a longjmp to restore the context saved at the beginning of the transaction, and reexecutes the transaction.

Since TM accesses need to be instrumented, a compiler needs to generate an extra copy of any function that may be called from inside a transaction. This copy contains instrumented accesses and is invoked when the function is called from within a transaction. The transactional code can be heavily optimized by a compiler—for example, by eliminating barriers to the same address or to immutable variables.7

Figure 4: Translating an Atomic Construct for STM

A. User code:

    int foo(int arg)
    {
        …
        atomic
        {
            b = a + 5;
        }
        …
    }

B. Compiled code:

    int foo(int arg)
    {
        jmpbuf env;
        …
        do {
            if (setjmp(&env) == 0) {
                stmStart();
                temp = stmRead(&a);
                temp1 = temp + 5;
                stmWrite(&b, temp1);
                stmCommit();
                break;
            }
        } while (1);
        …
    }
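The optimistic read-validation scheme discussed earlier can be sketched with per-location version numbers (a toy, single-threaded model; the names are ours): a transaction records the version of each location on first read and, before committing, re-checks that none of those versions has changed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of optimistic conflict detection at word-level granularity.
class OptimisticTxn {
    static Map<String, Integer> versions = new HashMap<>();  // global version per location

    Map<String, Integer> readSet = new HashMap<>();          // location -> version observed

    void read(String addr) {
        // record the version seen on the first read of this location
        readSet.putIfAbsent(addr, versions.getOrDefault(addr, 0));
    }

    // A concurrent transaction committing a write bumps the location's version.
    static void commitWrite(String addr) {
        versions.merge(addr, 1, Integer::sum);
    }

    // Validation: commit succeeds only if nothing we read changed underneath us.
    boolean validate() {
        for (Map.Entry<String, Integer> e : readSet.entrySet())
            if (!e.getValue().equals(versions.getOrDefault(e.getKey(), 0)))
                return false;  // conflict detected late; transaction must abort
        return true;
    }
}
```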


The read and write barriers operate on transaction records, pointer-size metadata associated with every piece of data that a transaction may access. The runtime system also maintains a transaction descriptor for each transaction. The descriptor contains the transaction’s state, such as the read set, the write set, and the undo log for eager versioning (or the write buffer for lazy versioning). The STM runtime exports an API that allows other components of the language runtime, such as the garbage collector, to inspect and modify the contents of the descriptor, such as the read set, write set, or undo log. The descriptor also contains metadata that allows the runtime system to infer the nesting depth at which data was read or written. This allows the STM to partially roll back a nested transaction.8

The write barrier implements different forms of data versioning and conflict detection for writes. For eager versioning (pessimistic writes), the write barrier acquires an exclusive lock on the transaction record corresponding to the updated memory location, remembers the location’s old value in the undo log, and updates the memory location in place. For lazy versioning (optimistic writes), the write barrier stores the new value in the write buffer; at commit time, the transaction acquires an exclusive lock on all the required transaction records and copies the values to memory.

The read barrier also operates on transaction records to detect conflicts and to implement pessimistic or optimistic forms of read concurrency. For pessimistic reads, the read barrier simply acquires a read lock on the corresponding transaction record before reading the data item.
Optimistic reads are implemented by using data versioning; the transaction record holds the version number for the associated data.9 STM implementations detect conflicts in two cases: the read or write barrier finds that a transaction record is locked by some other transaction; or, in a system with optimistic read concurrency, the transaction finds during periodic validation that the version number of some transaction record in its read set has changed. On a conflict, the STM can use a variety of sophisticated conflict-resolution schemes, such as having transactions back off randomly, or aborting and restarting some set of conflicting transactions.

STMs allow transactions to be integrated with the rest of the language environment, such as a garbage collector. They allow transactions to be integrated with tools such as debuggers. They also allow accurate diagnostics for performance tuning. Finally, STMs avoid baking TM semantics prematurely into hardware.

STM implementations can incur a 40-50 percent overhead compared with lock-based code on a single thread. Moreover, STM implementations incur additional overhead if they have to guarantee isolation between transactional and nontransactional code. Reducing this overhead is an active area of research. Like other forms of TM, STMs don’t have a satisfactory way of handling irrevocable actions such as I/O and system calls, nor can they execute arbitrary precompiled binaries within a transaction.
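One common realization of a transaction record packs a version number and a lock bit into a single word. The sketch below is a simplified illustration in the spirit of such designs; the exact encoding (even value = version, odd value = locked) is ours, not a specific system’s format.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy transaction record: even word value = version number, odd = locked.
class TxnRecord {
    private final AtomicLong word = new AtomicLong(0);  // starts at version 0

    boolean isLocked() { return (word.get() & 1L) == 1L; }
    long version()     { return word.get(); }

    // Write barrier (eager versioning): try to acquire exclusive ownership.
    boolean tryLock() {
        long v = word.get();
        return (v & 1L) == 0L && word.compareAndSet(v, v | 1L);
    }

    // On commit, release the lock and publish the next version number.
    void unlockAndBumpVersion() {
        long v = word.get();            // odd (locked) value
        word.set((v & ~1L) + 2L);       // next even version
    }

    // Optimistic read barrier: sample the version if unlocked, else signal conflict.
    long readVersionOrConflict() {
        long v = word.get();
        if ((v & 1L) == 1L) return -1L; // locked by a writer: conflict
        return v;
    }
}
```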


Transactional memory can also be implemented in hardware, an approach referred to as HTM (hardware transactional memory). An HTM system requires no read or write barriers within the transaction code. The hardware manages data versions and tracks conflicts transparently as the software performs ordinary read and write accesses. Apart from reducing the overhead of instrumentation, HTM systems do not require two versions of the functions used in transactions, and they work with programs that call uninstrumented library routines.

HTM systems rely on the cache hierarchy and the cache coherence protocol to implement versioning and conflict detection. Caches observe all reads and writes issued by the processors, can buffer a significant amount of data, and are fast to search because of their associative organization. All HTM systems modify the first-level caches, but the approach extends to lower-level caches, both private and shared. To track the read set and write set for a transaction, each cache line is annotated with R and W tracking bits that are set on the first read or write to the line, respectively. When a transaction commits or aborts, all tracking bits are cleared simultaneously using a gang, or flash, reset operation.

Caches implement data versioning by storing the working set for the undo log or the write buffer for the transactions. Before a cache write under eager versioning, we check whether this is the first update to the cache line within this transaction (W bit reset). If so, the cache line and its address are added to the undo log using additional writes to the cache. If the transaction aborts, a hardware or software mechanism must traverse the log and restore the old data versions.10

In lazy versioning, a cache line written by the transaction becomes part of the write buffer by setting its W bit.11 If the transaction aborts, the write buffer is instantaneously flushed by invalidating all cache lines with the W bit set.
If the transaction commits, the data in the write buffer becomes instantaneously visible to the rest of the system by resetting the W bits in all cache lines.
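The tracking-bit machinery can be modeled in software (a toy model, not a hardware description; the names are ours): each line carries R and W bits, remote coherence requests are checked against them, and commit or abort clears every bit at once, mimicking the gang reset.

```java
// Toy model of per-cache-line R/W tracking bits for HTM.
class HtmCache {
    static final int LINES = 8;
    boolean[] rBit = new boolean[LINES];
    boolean[] wBit = new boolean[LINES];

    void read(int line)  { rBit[line] = true; }   // set on first transactional read
    void write(int line) { wBit[line] = true; }   // set on first transactional write

    // Conflict check against a remote coherence request for one line.
    boolean conflicts(int line, boolean remoteWantsExclusive) {
        if (wBit[line]) return true;               // any remote request hits our write set
        return remoteWantsExclusive && rBit[line]; // exclusive request vs. our read set
    }

    // Commit/abort: gang-clear all tracking bits in one step.
    void gangClear() {
        rBit = new boolean[LINES];
        wBit = new boolean[LINES];
    }
}
```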



To detect conflicts, the caches must communicate their read sets and write sets using the cache coherence protocol implemented in multicore chips. Pessimistic conflict detection uses the same coherence messages exchanged in existing systems.12 On a read or write access within a transaction, the processor requests shared or exclusive access to the corresponding cache line. The request is transmitted to all other processors, which look up their caches for copies of this cache line. A conflict is signaled if a remote cache has a copy of the same line with the R bit set (for an exclusive access request) or the W bit set (for either request type). Optimistic conflict detection operates similarly but delays the requests for exclusive access to cache lines in the write set until the transaction is ready to commit. A single bulk message is sufficient to communicate all requests.13

Even though HTM systems eliminate most sources of overhead for transactional execution, they nevertheless introduce additional challenges. The modifications HTM requires in the cache hierarchy and the coherence protocol are nontrivial. Processor vendors may be reluctant to implement them before transactional programming becomes pervasive. Moreover, the caches used to track the read set, write set, and write buffer for transactions have finite capacity and may overflow on a long transaction. Long transactions may be rare, but they still must be handled in a manner that preserves atomicity and isolation. Placing implementation-dependent limits on transaction sizes is unacceptable from the programmer’s perspective. Finally, it is challenging to handle the transaction state in caches for deeply nested transactions or when interrupts, paging, or thread migration occur.14

Several proposed mechanisms virtualize these finite resources and simplify their organization in HTM systems. One approach is to track read sets and write sets using signatures based on Bloom filters.
The signatures provide a compact but inexact (pessimistic) representation of the sets that can be easily saved, restored, or communicated if necessary. The drawback is that the inexact representation leads to additional false conflicts, which may degrade performance. Another approach is to map read sets, write sets, and write buffers to virtual memory and use

hardware or firmware mechanisms to move data between caches and memory on cache overflows.

An alternative virtualization technique is to use a hybrid HTM-STM implementation. Transactions start in the HTM mode. If hardware resources are exceeded, the transactions are rolled back and restarted in the STM mode.15 The challenge with hybrid TM is conflict detection between software and hardware transactions. To avoid the need for two versions of the code, the software mode of a hybrid system can be provided through the operating system, with conflict detection at the granularity of memory pages.16

A final implementation approach is to start with an STM system and provide a small set of key hardware mechanisms that target its main sources of overhead.17 This approach is called HASTM (hardware-accelerated STM). HASTM introduces two basic hardware primitives: support for detecting the first use of a cache line, and support for detecting possible remote updates to a cache line. The two primitives can significantly reduce the general read-barrier instrumentation overhead as well as the read-set validation time in the case of optimistic reads.
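The Bloom-filter signatures mentioned above can be sketched in a few lines (the filter size and hash functions here are arbitrary illustrations): a signature never misses an address that was inserted, but it may report an address that never was, and those false positives surface as false conflicts.

```java
// Toy Bloom-filter signature for read-set or write-set tracking.
class Signature {
    private long bits;  // 64-bit filter; real hardware signatures are larger

    // Two cheap multiplicative hashes, each selecting one of 64 bit positions.
    private int h1(int addr) { return (addr * 0x9E3779B9) >>> 26; }
    private int h2(int addr) { return (addr * 0x85EBCA6B) >>> 26; }

    void insert(int addr) {
        bits |= (1L << h1(addr)) | (1L << h2(addr));
    }

    // May return true for an address never inserted (a false conflict),
    // but never returns false for one that was (no missed conflicts).
    boolean mayContain(int addr) {
        long mask = (1L << h1(addr)) | (1L << h2(addr));
        return (bits & mask) == mask;
    }

    void clear() { bits = 0; }  // cheap to reset on commit or abort
}
```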


Composing scalable parallel applications using locks is difficult and full of pitfalls. Transactional memory avoids many of these pitfalls and allows the programmer to compose applications safely and in a manner that scales. Transactions improve the programmer’s productivity by shifting difficult concurrency-control problems from the application developer to the system designer.

In the past three years, TM has attracted a great deal of research activity, resulting in significant progress.18 Nevertheless, before transactions can make it into the mainstream as first-class language constructs, there are many open challenges to address. Developers will want to protect their investments in existing software, so transactions must be added incrementally to existing languages, and tools must be developed to help migrate existing code from locks to transactions. This means transactions must compose with existing concurrency features such as locks and threads. System calls and I/O must be allowed inside transactions, and transactional memory must integrate with other transactional resources in the environment. Debugging and tuning tools for transactional code are also challenges, as transactions still require tuning to achieve scalability, and concurrency bugs are still possible using transactions.

Transactions are not a panacea for all parallel programming challenges. Additional technologies are needed to address issues such as task decomposition and mapping. Nevertheless, transactions take a concrete step toward making parallel programming easier. This is a step that will clearly benefit from new software and hardware technologies. Q

AUTHORS’ NOTE
For extended coverage on the topic, refer to the slides from the PACT ’06 (Parallel Architectures and Compilation Techniques) tutorial, “Transactional Programming in a Multicore Environment,” available at http://csl.stanford.edu/~christos/publications/tm_tutorial_pact2006.zip.

REFERENCES
1. Sutter, H., Larus, J. 2005. Software and the concurrency revolution. ACM Queue 3 (7).
2. Sweeney, T. 2006. The next mainstream programming languages: A game developer’s perspective. Keynote speech, Symposium on Principles of Programming Languages. Charleston, SC (January).
3. Herlihy, M., Moss, E. 1993. Transactional memory: Architectural support for lock-free data structures. In Proceedings of the 20th Annual International Symposium on Computer Architecture. San Diego, CA (May).
4. Adl-Tabatabai, A., Lewis, B.T., Menon, V.S., Murphy, B.M., Saha, B., Shpeisman, T. 2006. Compiler and runtime support for efficient software transactional memory. In Proceedings of the Conference on Programming Language Design and Implementation. Ottawa, Canada (June).
5. McDonald, A., Chung, J., Carlstrom, B.D., Cao Minh, C., Chafi, H., Kozyrakis, C., Olukotun, K. 2006. Architectural semantics for practical transactional memory. In Proceedings of the 33rd International Symposium on Computer Architecture. Boston, MA (June).
6. Saha, B., Adl-Tabatabai, A., Hudson, R., Cao Minh, C., Hertzberg, B. 2006. McRT-STM: A high-performance software transactional memory system for a multicore runtime. In Proceedings of the Symposium on Principles and Practice of Parallel Programming. New York, NY (March).
7. See reference 4.
8. See reference 6.
9. See reference 6.
10. Moore, K., Bobba, J., Moravan, M., Hill, M., Wood, D. 2006. LogTM: Log-based transactional memory. In Proceedings of the 12th International Conference on High-Performance Computer Architecture. Austin, TX (February).
more queue: www.acmqueue.com

11. Hammond, L., Carlstrom, B., Wong, V., Chen, M., Kozyrakis, C., Olukotun, K. 2004. Transactional coherence and consistency: Simplifying parallel hardware and software. IEEE Micro 24 (6).
12. See reference 10.
13. See reference 11.
14. Chung, J., Cao Minh, C., McDonald, A., Skare, T., Chafi, H., Carlstrom, B., Kozyrakis, C., Olukotun, K. 2006. Tradeoffs in transactional memory virtualization. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems. San Jose, CA (October).
15. Damron, P., Fedorova, A., Lev, Y., Luchangco, V., Moir, M., Nussbaum, D. 2006. Hybrid transactional memory. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems. San Jose, CA (October).
16. See reference 14.
17. Saha, B., Adl-Tabatabai, A., Jacobson, Q. 2006. Architectural support for software transactional memory. In Proceedings of the 39th International Symposium on Microarchitecture. Orlando, FL (December).
18. Transactional Memory Online Bibliography; http://www.cs.wisc.edu/trans-memory/biblio/.

LOVE IT, HATE IT? LET US KNOW
feedback@acmqueue.com or www.acmqueue.com/forums

ALI-REZA ADL-TABATABAI is a principal engineer in the Programming Systems Lab at Intel Corporation. He leads a team developing compilers and scalable runtimes for future Intel architectures. His current research concentrates on language features supporting parallel programming for future multicore architectures.

CHRISTOS KOZYRAKIS (http://csl.stanford.edu/~christos) is an assistant professor of electrical engineering and computer science at Stanford University. His research focuses on architectures, compilers, and programming models for parallel computer systems. He is working on transactional memory techniques that can greatly simplify parallel programming for the average developer.

BRATIN SAHA is a senior staff researcher in the Programming Systems Lab at Intel Corporation.
He is one of the architects for synchronization and locking in the next-generation IA-32 processors. He is involved in the design and implementation of a highly scalable runtime for multicore processors. As part of this work he has been looking at language features, such as transactional memory, to ease parallel programming.
© 2006 ACM 1542-7730/06/1200 $5.00




The Virtualization Reality
A number of important challenges are associated with the deployment and configuration of contemporary computing infrastructure. Given the variety of operating systems and their many versions—including the often-specific configurations required to accommodate the wide range of popular applications—it has become quite a conundrum to establish and manage such systems. Significantly motivated by these challenges, but also owing to several other important opportunities it offers, virtualization has recently become a principal focus for computer systems software. It enables a single computer to host multiple different operating system stacks, decreasing server count and reducing overall system complexity. VMware, now part of EMC, was the earliest and most visible entrant in this space, but more recently XenSource, Parallels, and Microsoft have introduced virtualization solutions. Many of the major systems vendors, such as IBM, Sun, and Microsoft, have efforts under way to exploit virtualization.

Virtualization appears to be far more than just another ephemeral marketplace trend. It is poised to deliver profound changes to the way that both enterprises and consumers use computer systems. What problems does virtualization address, and moreover, what will you need to know or do differently to take advantage of the innovations it delivers?

In this article we provide an overview of system virtualization, taking a closer look at the Xen hypervisor and its paravirtualization architecture. We then review several challenges in deploying and exploiting computer systems and software applications, examine IT infrastructure management today, and show how virtualization can help address some of these challenges.


All modern computers are sufficiently powerful to use virtualization to present the illusion of many smaller VMs (virtual machines), each running a separate operating system instance. An operating system virtualization environment provides each virtualized operating system (or guest) the illusion that it has exclusive access to the underlying hardware platform on which it runs. Of course, the virtual machine itself can offer the guest a different view of the hardware from what is really available, including CPU, memory, I/O, and restricted views of devices. Virtualization has a long history, starting in the mainframe environment and arising from the need to provide isolation between users. The basic trend started with time-sharing systems (enabling multiple users to share a single expensive computer system), aided by innovations in operating system design to support the idea of processes that belong to a single user. The addition of user and supervisor modes on most commercially relevant


rants: feedback@acmqueue.com

Are hypervisors the new foundation for system software?

more queue: www.acmqueue.com



processors meant that the operating system code could be protected from user programs, using a set of so-called “privileged” instructions reserved for the operating system software running in supervisor mode. Memory protection and, ultimately, virtual memory were invented so that separate address spaces could be assigned to different processes to share the system’s physical memory and ensure that its use by different applications was mutually segregated. These initial enhancements could all be accommodated within the operating system, until the day arrived when different users, or different applications on the same physical machine, wanted to run different operating systems. This requirement could be satisfied only by supporting multiple VMs, each capable of running its own operating system. The virtualization era (marked by IBM’s release of VM for the System/360 in 1972) had dawned.

Operating system virtualization is achieved by inserting a layer of system software—often called the hypervisor or VMM (virtual machine monitor)—between the guest operating system and the underlying hardware. This layer is responsible for allowing multiple operating system images (and all their running applications) to share the resources of a single hardware server. Each operating system believes that it has the resources of the entire machine under its control, but beneath its feet the virtualization layer, or hypervisor, transparently ensures that resources are properly and securely partitioned between different operating system images and their applications. The hypervisor manages all hardware structures, such as the MMU (memory management unit), I/O devices, and DMA (direct memory access) controllers, and presents a virtualized abstraction of those resources to each guest operating system.

EMULATED VIRTUALIZATION The most direct method of achieving virtualization is to provide a complete emulation of the underlying hardware platform’s architecture in software, particularly involving the processor’s instruction set architecture. For the x86 processor, the privileged instructions—used exclusively by the operating system (for interrupt handling, reading and writing to devices, and virtual memory)—form the dominant class of instructions requiring emulation. By definition, a user program cannot execute these instructions. One technique to force emulation of these instructions is to execute all of the code within a virtual machine, including the operating system being virtualized, as user code. The resident VMM then handles the exception produced by the attempt to execute a privileged instruction and performs the desired action on behalf of the operating system. While some CPUs were carefully architected with operating system virtualization in mind (the IBM 360 is one such example), many contemporary commodity processor architectures evolved from earlier designs, which did not anticipate virtualization. Providing full virtualization of a processor in such cases is a challenging problem, often resulting in so-called “virtualization holes.” Virtualization of the x86 processor is no exception. For example, certain instructions execute in both user mode and supervisor mode but produce different results, depending on the execution mode. A common approach to overcome these problems is to scan the operating system code and modify the offending instruction sequences, either to produce the intended behavior or to force a trap into the VMM. Unfortunately, this patching and trapping approach can cause significant performance penalties.

PARAVIRTUALIZATION An alternative way of achieving virtualization is to present a VM abstraction that is similar but not identical to the underlying hardware. This approach has been called paravirtualization. In lieu of a direct software emulation of the underlying hardware architecture, the concept of paravirtualization is that a guest operating system and an underlying hypervisor collaborate closely to achieve optimal performance. Many guest operating system instances (of different configurations and types) may run atop the one hypervisor on a given hardware platform. This offers improved performance, although it does require modifications to the guest operating system. It is important to note, however, that it does not require any change to the ABI (application binary interface) offered by the guest system; hence, no modifications are required to the guest operating system’s applications. In many ways this method is similar to the operating system virtualization approach of VM for the IBM 360 and 370 mainframes.1,2 Under pure virtualization, you can run an unmodified operating-system binary and unmodified application binaries, but resource consumption management and performance isolation are problematic—one guest operating system and/or its apps could consume all physical memory and/or cause thrashing, for example. The paravirtualization approach requires some work to port each guest operating system, but rigorous allocation of hardware resources can then be done by the hypervisor, ensuring proper performance isolation and guarantees. The use of paravirtualization and the complementary innovation of processor architecture extensions to support it (particularly those recently introduced in both the Intel and AMD processors, which eliminate the need to “trap and emulate”) now permit high-performance virtualization of the x86 architecture.
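The trap-and-emulate technique can be pictured with a toy model: guest code runs deprivileged, any "privileged instruction" raises a trap, and the VMM catches the trap and performs the operation on virtual state instead. This is only a sketch; the instruction names, exception class, and virtual-state layout here are invented for illustration and are not Xen's or any real VMM's interface.

```python
# Toy model of trap-and-emulate virtualization (all names are invented).

class PrivilegedInstructionTrap(Exception):
    """Raised when guest code executes a privileged instruction in user mode."""
    def __init__(self, op, operand):
        self.op, self.operand = op, operand

class GuestCPU:
    """Executes guest code; privileged ops trap instead of running directly."""
    def __init__(self):
        self.interrupts_enabled = True

    def execute(self, op, operand=None):
        if op in ("cli", "sti", "out"):      # privileged: trap to the VMM
            raise PrivilegedInstructionTrap(op, operand)
        return operand                        # unprivileged ops run natively

class VMM:
    """Catches traps and emulates each privileged op on virtual state."""
    def __init__(self):
        self.cpu = GuestCPU()
        self.io_log = []

    def run(self, program):
        for op, operand in program:
            try:
                self.cpu.execute(op, operand)
            except PrivilegedInstructionTrap as trap:
                self.emulate(trap)

    def emulate(self, trap):
        if trap.op == "cli":
            self.cpu.interrupts_enabled = False  # update the *virtual* flag only
        elif trap.op == "sti":
            self.cpu.interrupts_enabled = True
        elif trap.op == "out":
            self.io_log.append(trap.operand)     # emulate the device write

vmm = VMM()
vmm.run([("mov", 1), ("cli", None), ("out", "byte-to-device"), ("sti", None)])
print(vmm.cpu.interrupts_enabled, vmm.io_log)    # True ['byte-to-device']
```

The patching approach mentioned above differs only in when the substitution happens: instead of taking a runtime trap on every privileged instruction, the offending sequences are rewritten ahead of execution.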



An example of paravirtualization as applied on the x86 architecture is the Xen hypervisor (figure 1). Xen was initially developed by Ian Pratt and a team at the University of Cambridge in 2001-02, and has subsequently evolved into an open source project with broad involvement. Any hypervisor (whether it implements full hardware emulation or paravirtualization) must provide virtualization for the following system facilities: • CPUs (including multiple cores per device) • Memory system (memory management and physical memory) • I/O devices • Asynchronous events, such as interrupts Let’s now briefly examine Xen’s approach to each of these facilities. (For further detail, we recommend the excellent introduction to and comprehensive treatment of Xen’s design and principles presented in Pratt et al.’s paper.3) CPU AND MEMORY VIRTUALIZATION In Xen’s paravirtualization, virtualization of CPU and memory and low-level hardware interrupts are provided by a low-level efficient hypervisor layer that is implemented in about 50,000 lines of code. When the operating system updates hardware data structures, such as the page table, or initiates a DMA operation, it collaborates with the hypervisor by making calls into an API that is offered by the hypervisor. This, in turn, allows the hypervisor to keep track of all changes made by the operating system and to optimally
decide how to manage the state of hardware data structures on context switches. The hypervisor is mapped into the address space of each guest operating system, meaning that there is no context-switch overhead between the operating system and the hypervisor on a hypercall. Finally, by cooperatively working with the guest operating systems, the hypervisor gains insight into the intentions of the operating system and can make it aware that it has been virtualized. This can be a great advantage to the guest operating system—for example, the hypervisor can tell the guest that real time has passed between its last run and its present run, permitting it to make smarter rescheduling decisions to respond appropriately to a rapidly changing environment. Xen makes a guest operating system (running on top of the VMM) virtualization-aware and presents it with a slightly modified x86 architecture, provided through the so-called hypercall API. This removes any difficult and costly-to-emulate privileged instructions and provides equivalent, although not identical, functionality with explicit calls into the hypervisor. The operating system must be modified to deal with this change, but in a well-structured operating system, these changes are limited to its architecture-dependent modules, most typically a fairly small subset of the complete operating system implementation. Most importantly, the bulk of the operating system and the entirety of application programs remain unmodified.
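For Linux, this hypercall substitution takes the form of a jump table populated at kernel load time with either native operations or Xen hypercalls, so the same kernel binary runs in both settings. The sketch below models that idea with a dictionary of function objects; the operation and hypercall names are invented stand-ins (the real interface is a C structure of function pointers), not Xen's actual API.

```python
# Sketch of a load-time jump table choosing native vs. paravirtual ops
# (hypothetical names; illustrative only).

hypervisor_calls = []                 # stands in for the real hypercall path

def native_write_page_table(page_tables, vaddr, entry):
    page_tables[vaddr] = entry        # a native kernel writes directly

def xen_write_page_table(page_tables, vaddr, entry):
    # A paravirtualized kernel asks the hypervisor to validate and apply
    # the update instead of touching the hardware structure itself.
    hypervisor_calls.append(("mmu_update", vaddr, entry))
    page_tables[vaddr] = entry

def populate_ops(running_on_xen):
    """Populate the jump table once, when the kernel is loaded."""
    return {"write_page_table": xen_write_page_table if running_on_xen
                                else native_write_page_table}

# The same "kernel" code calls through the table either way:
ops = populate_ops(running_on_xen=True)
page_tables = {}
ops["write_page_table"](page_tables, 0x1000, "pte")
print(hypervisor_calls)               # [('mmu_update', 4096, 'pte')]
```

The point of the indirection is that callers never know which implementation they got, which is what lets one kernel image serve both native and virtualized deployments.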

[Figure 1. Paravirtualization: the Xen hypervisor. A small hypervisor runs directly on the hardware. Guest operating systems (e.g., Linux and Windows, each with its user applications) cooperate with the hypervisor for resource management and I/O through the hypercall API. Device drivers and management code run outside the hypervisor in domain 0 (the root partition), reached through a management API.]


For Linux, the Xen hypercall API takes the form of a jump table populated at kernel load time. When the kernel is running in a native implementation (i.e., not atop a paravirtualizing hypervisor), the jump table is populated with default native operations; when the kernel is running on Xen, the jump table is populated with the Xen hypercalls. This enables the same kernel to run in both native and virtualized forms, with the performance benefits of paravirtualization but without the need to recertify applications against the kernel. Isolation between virtual machines (hence, the respective guest operating systems running within each) is a particularly important property that Xen provides. The physical resources of the hardware platform (such as CPU, memory, etc.) are rigidly divided between VMs to ensure that they each receive a guaranteed portion of the platform’s overall capacity for processing, memory, I/O, and so on. Moreover, as each guest is running on its own set of virtual hardware, applications in separate operating systems are protected from one another to almost the same degree that they would be were they installed on separate physical hosts. This property is particularly appealing in light of the inability of current operating systems to provide protection against spyware, worms, and viruses. In a system such as Xen, nontrusted applications considered to pose such risks (perhaps such as Web browsers) may be seconded to their own virtual machines and thus completely separated from both the underlying system software and other more trusted applications. I/O VIRTUALIZATION I/O virtualization in a paravirtualizing VMM such as Xen is achieved via a single set of drivers. The Xen hypervisor exposes a set of clean and simple device abstractions, and a set of drivers for all hardware on the physical platform is implemented in a special domain (VM) outside the core hypervisor. 
These drivers are offered via the hypervisor’s abstracted I/O interface for use within other VMs, and thus are used by all guest operating systems.4 In each Xen guest operating system, simple paravirtualizing device drivers replace hardware-specific drivers for the physical platform. Paravirtualizing drivers are independent of all physical hardware but represent each type of device (e.g., block I/O, Ethernet). These drivers enable high-performance, virtualization-safe I/O to be accomplished by transferring control of the I/O to the hypervisor, with no additional complexity in the guest operating system. It is important to note that the drivers in the Xen architecture run outside the base hypervisor, at a lower level of protection than the core of the hypervisor itself. The hypervisor is thus protected from bugs and crashes in device drivers (they cannot crash the Xen VMM) and can use any device drivers available on the market. Also, the virtualized operating system image is much more portable across hardware, since the low levels of the driver and hardware management are modules that run under control of the hypervisor. In full-virtualization (emulation) implementations, the platform’s physical hardware devices are emulated, and the unmodified binary for each guest operating system is run, including the native drivers it contains. In those circumstances it is difficult to restrict the respective operating system’s use of the platform’s physical hardware, and one virtual machine’s runtime behaviors can significantly impact the performance of the others. Since all physical access to hardware is managed centrally in Xen’s approach to I/O virtualization, resource access by each guest can be marshaled. This provides the consequential benefit of performance isolation for each of the guest operating systems. Those who have experience with microkernels will likely find this approach to I/O virtualization familiar. One significant difference between Xen and historical work on microkernels, however, is that Xen has relaxed the constraint of achieving a complete and architecturally pure emulation of the x86 processor’s I/O architecture. Xen uses a generalized, shared-memory, ring-based I/O communication primitive that is able to achieve very high throughputs by batching requests. This I/O abstraction has served well in ports to other processor architectures, including the IA-64 and PowerPC.
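The batching benefit of such a ring comes from publishing many requests with a single notification to the consumer. The toy model below shows the producer/consumer index scheme in that spirit; it is deliberately simplified (Xen's real rings live in memory shared between domains, carry responses as well as requests, and use event channels for notification), and the class and method names are invented.

```python
# Minimal sketch of a producer/consumer I/O ring with request batching.

RING_SIZE = 8                      # power of two, so index % RING_SIZE wraps

class IORing:
    def __init__(self):
        self.slots = [None] * RING_SIZE
        self.req_prod = 0          # advanced by the guest (producer)
        self.req_cons = 0          # advanced by the driver domain (consumer)
        self.notifications = 0

    def push_request(self, req):
        if self.req_prod - self.req_cons == RING_SIZE:
            raise BufferError("ring full")
        self.slots[self.req_prod % RING_SIZE] = req
        self.req_prod += 1         # publish only after the slot is filled

    def notify(self):
        self.notifications += 1    # one "kick" covers the whole batch

    def drain(self):
        batch = []
        while self.req_cons != self.req_prod:
            batch.append(self.slots[self.req_cons % RING_SIZE])
            self.req_cons += 1
        return batch

ring = IORing()
for block in (17, 18, 19):                # queue three block-read requests...
    ring.push_request(("read", block))
ring.notify()                             # ...but notify the backend only once
print(ring.notifications, ring.drain())   # 1 [('read', 17), ('read', 18), ('read', 19)]
```

Amortizing one notification over many requests is exactly what makes the ring cheap at high throughput: the per-request cost approaches a couple of memory writes.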
It also affords an innovative means to add features into the I/O path, by plumbing in additional modules between the guest virtual device and the real device driver. One example in the network stack is the support of full OSI layer 2 switching, packet filtering, and even intrusion detection. HARDWARE SUPPORT FOR VIRTUALIZATION Recent innovations in hardware, particularly in CPU, MMU, and memory components (notably the hardware
virtualization support presently available in the Intel VT-x and AMD-V architectures, offered in both client and server platforms), provide some direct platform-level architectural support for operating system virtualization. This has enabled near bare-metal performance for virtualized guest operating systems. Xen provides a common HVM (hardware virtual machine) abstraction to hide the minor differences between the Intel and AMD technologies and their implementations. HVM offers two key features: First, for unmodified guest operating systems, it avoids the need to trap and emulate privileged instructions in the operating system, by enabling guests to run at their native privilege levels, while providing a hardware vector (called a VM EXIT) into the hypervisor whenever the guest executes a privileged instruction that would unsafely modify the machine state. The hypervisor begins execution with the full state of the guest available to it and can rapidly decide how best to deal with the reason for the VM EXIT. Today’s hardware takes about 1,000 clock cycles to save the state of the currently executing guest and to transition into the hypervisor, which offers good, though not outstanding, performance. A second feature of the HVM implementations is that they offer guest operating systems running with a paravirtualizing hypervisor (in particular, their device drivers) new instructions that call directly into the hypervisor. These can be used to ensure that guest I/O takes the fastest path into the hypervisor. Paravirtualizing device drivers, inserted into each guest operating system, can then achieve optimal I/O performance, even though neither Intel’s nor AMD’s virtualization extension for the x86 (Intel VT and AMD-V, respectively) offers particular performance benefits to I/O virtualization.


A number of chronic challenges are associated with deployment and management of computer systems and their applications, especially in the modern context of larger-scale, commercial, and/or enterprise use. Virtualization provides an abstraction from the physical hardware, which breaks the constraint that only a single instance of an operating system may run on a single hardware platform. Because it encapsulates the operating environment, virtualization is a surprisingly powerful abstraction. SERVER VIRTUALIZATION The past decade has witnessed a revolutionary reduction in hardware costs, as well as a significant increase in both capacity and performance of many of the basic hardware
platform constituents (processors, storage, and memory). Ironically, in spite of the corresponding widespread adoption of these now relatively inexpensive, x86-based servers, most enterprises have seen their IT costs and complexity escalate rapidly. While the steady march of Moore’s law has markedly decreased hardware’s cost of acquisition, the associated proliferation of this inexpensive computing has led to tremendous increases in complexity—with the costs of server configuration, management, power, and maintenance dwarfing the basic cost of the hardware. Each server in the data center costs an enterprise on average $10,000 per year to run when all of its costs—provisioning, maintenance, administration, power, real estate, hardware, and software—are considered. In addition, the artifacts of current operating-system and system-software architecture result in most servers today running at under 10 percent utilization. Several opportunities arise directly from the rapid performance and capacity increase seen in the contemporary commodity hardware platforms. Last decade’s trend in commercial IT infrastructure was an expanding hardware universe: achieving performance and capacity by “horizontal” scaling of the hardware. Given the dramatic performance available on a single commodity box today, we may now be witnessing a contraction of this universe—still a horizontal trend, but in reverse. Whereas it may have required many servers to support enterprisewide or even department-wide computing just five years ago, virtualization allows many large application loads to be placed on one hardware platform, or a smaller number of platforms. This can cut both per-server capital cost and the overall lifetime operational costs significantly. 
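The consolidation arithmetic implied by these figures can be sketched in a few lines. The $10,000-per-server annual cost and the sub-10-percent utilization are from the text; the 70 percent target utilization (headroom for load spikes) is an assumption introduced here for illustration.

```python
# Back-of-the-envelope server-consolidation arithmetic
# (target_utilization is an assumed safe ceiling, not from the article).
import math

annual_cost_per_server = 10_000     # provisioning, power, admin, etc.
servers = 100
avg_utilization = 0.10              # most servers run at under 10% utilization
target_utilization = 0.70           # assumed ceiling per consolidated host

work = servers * avg_utilization                      # total load, in "server units"
consolidated = math.ceil(work / target_utilization)   # hosts needed afterward

print(consolidated)                                        # 15
print((servers - consolidated) * annual_cost_per_server)   # 850000 (dollars/year)
```

Even with generous headroom, a hundred lightly loaded machines collapse to roughly fifteen hosts, which is where the "tenfold savings" intuition below comes from.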
The 10-percent utilization statistic reveals that server consolidation can achieve a tenfold savings in infrastructure cost, not simply through reduced CPU count but more importantly through its consequent reductions in switching, communication, and storage infrastructure, and power and management costs. Since virtualization allows multiple operating system images (and the applications associated with each that constitute software services) to share a single hardware server, it is a basic enabler for server consolidation. The virtual I/O abstraction is another important component of server virtualization. In the past, when multiple servers and/or multiple hardware interfaces per server were used to support scalability, physical hardware devices could be individually allotted to guarantee a certain performance, specific security properties, and/or other configuration aspects to individual operating-system and application loads. Nowadays, a single device may have significantly higher performance (e.g., the transition from Fast Ethernet to inexpensive Gigabit or even 10-Gigabit network interface cards), and just one or a much smaller number of physical devices will likely be present on a single server or server configuration. In such configurations, where individual physical hardware devices are shared by multiple hosted VMs on a single server, ensuring that there is proper isolation between their respective demands upon the shared hardware is critical. Strict allocation of the shared CPU, memory, and I/O resources, as well as the assurance of the security of both the platform and the guests, are key requirements that fall on the hypervisor. Beyond its immediate application for server consolidation, server virtualization offers many further benefits that derive from the separation of virtual machines (an operating system and its applications) from physical hardware. These benefits (several of which have yet to be exploited fully in application) include dynamic provisioning, high availability, fault tolerance, and a “utility computing” paradigm in which compute resources are dynamically assigned to virtualized application workloads.

VIRTUAL APPLIANCES Once an operating system and its applications have been encapsulated into a virtual machine, the VM can be run on any computer with a hypervisor. The ability to encapsulate all states, including application and operating-system configuration, into a single, portable, instantly runnable package provides great flexibility. For a start, the application can be provisioned and the VM saved in a “suspended” state, which makes it instantly runnable without further configuration. The image of one or more applications that have been properly configured in a VM and are ready to run can be saved, and this may then be used as a highly portable distribution format for a software service.
The administrative tasks of installing and configuring an operating system and the necessary applications prior to instantiating and launching a software service on a platform are no longer needed. The preconfigured and saved VM image is simply loaded and launched. VMware led the industry with its appliance concept, which aims
to use packaged VMs as a new software distribution technique. VMware offers more than 200 prepackaged appliances from its Web site. Within the enterprise, the packaged VM offers additional benefits: Software delivered by an engineering group can be packaged with the operating system it requires and can be staged for testing and production as a VM. Easily and instantly provisioned onto testing equipment, the application and the operating system against which it is certified can be quickly tested in a cost-efficient environment before being made available as a packaged VM, ready for deployment into production. A key problem in the data center is the ability to get new applications quickly into production. New applications typically take 60 to 90 days to qualify. To make it from testing into the data center, IT staff must acquire a new machine, provision it with an operating system, install the application, configure and test the setup for the service in question, and only then, once satisfied, rack the resulting server in the data center. This packaging approach provides an avenue to a solution. Once new software has been packaged as an appliance, it can be deployed and run instantly on any existing server in the data center that has sufficient capacity to run it. Any final testing or qualification can still be done before the service is made available for production use if required, but the lead times to acquire, install, and/or customize new hardware at its point of use are removed. LIVE RELOCATION Virtual appliances accelerate software provisioning and portability. Live relocation—the ability to move a running VM dynamically from one server to another, without stopping it—offers another benefit: When coupled with load-balancing and server resource optimization software, this provides a powerful tool for enabling a “utility computing” paradigm. When a VM is short of resources, it can be relocated dynamically to another machine with more resources. 
When capacities are stretched, additional copies of an existing VM can be cloned rapidly and deployed to other available hardware resources to increase overall service capacity. Instantaneous load considerations are a notorious challenge in the IT administrative world. Grid engines, as applied on distributed virtualized
servers, where spare resources are held in reserve, can be used to spawn many instances of a given application dynamically to meet increased load or demand.

CLIENT SECURITY AND MOBILITY On the client, virtualization offers various opportunities for enhanced security, manageability, greater worker mobility, and increased robustness of client devices. Virtualization of clients is also made possible through the hosting of multiple client operating system instances on a modern server-class system. Offering each client environment as a virtualized system instance located on a server in the data center provides the user with a modern-day equivalent of the thin client. Mobility of users is a direct result of their ability to access their virtualized workload remotely from any client endpoint. Sun’s Sun Ray system is an example of one such implementation. Increased security of data, applications and their context of use, and reduced overall cost of administration for client systems are important aspects of this technology. Enhanced reliability and security can be achieved, for example, by embedding function-specific, hidden VMs on a user’s PC, where the VM has been designed to monitor traffic, implement “embedded IT” policies, or the like. The packaging of applications and operating-system images into portable appliances also provides a powerful metaphor for portability of application state: Simply copying a suspended VM to a memory stick allows the user to carry running applications to any virtualization-ready device. VMware’s free Player application is a thin, client-side virtualization “player” that has the ability to execute a packaged VM. Examples include prepackaged secure Web browsers that can be discarded after per-session use (to obtain greater security) and secured, user-specific or enterprise-specific applications.

The use of virtualization portends many further opportunities for security and manageability on the client. The examples presented here only begin to illustrate the ways in which virtualization can be applied. Virtualization represents a basic change in the architecture of both systems software and the data center. It offers some important opportunities for cost savings and efficiency in computing infrastructure, and for centralized administration and management of that infrastructure for both servers and clients. We expect it to change the development, testing, and delivery of software fundamentally, with some immediate application in the commercial and enterprise context. Q

ACKNOWLEDGMENTS
We are particularly indebted to the team at the University of Cambridge, including Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Derek McAuley, Rolf Neugebauer, Ian Pratt, Andrew Warfield, and Matt Williamson, who have developed and evolved the Xen system. This article reports on their work.

REFERENCES
1. Gum, P. H. 1983. System/370 extended architecture: Facilities for virtual machines. IBM Journal of Research and Development 27(6): 530-544.
2. Seawright, L., MacKinnon, R. 1979. VM/370—a study of multiplicity and usefulness. IBM Systems Journal 18(1): 4-17.
3. Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A. 2003. Xen and the art of virtualization. In Proceedings of the 19th ACM SOSP (October): 164-177.
4. Fraser, K., Hand, S., Neugebauer, R., Pratt, I., Warfield, A., Williamson, M. 2004. Safe hardware access with the Xen virtual machine monitor. Cambridge, UK: University of Cambridge Computer Laboratory; www.cl.cam.ac.uk/research/srg/netos/papers/2004-oasis-ngio.pdf.

SIMON CROSBY is CTO of XenSource, where he is responsible for XenEnterprise R&D, technology leadership, and product management, and for maintaining a close affiliation with the Xen project run by Ian Pratt, the founder of XenSource. Crosby was a principal engineer at Intel, where he led research in distributed autonomic computing and platform security and trust. Before Intel, he founded CPlane Inc., a network optimization software vendor. He was a tenured faculty member at the University of Cambridge, where he led research on network performance and control, and multimedia operating systems.

DAVID BROWN is a member of the Solaris Engineering group at Sun Microsystems. He led the Solaris ABI compatibility program and more recently has worked on several projects to support Sun’s AMD x64- and Intel-based platforms. Earlier he was a founder of Silicon Graphics and the Workstation Systems Engineering group at Digital Equipment Corporation. He introduced and described the unified memory architecture approach for high-performance graphics hardware in his Ph.D. dissertation at the University of Cambridge.

© 2006 ACM 1542-7730/06/1200 $5.00


Who’s in charge of the Internet’s future?


Since I started a stint as chair of the IETF (Internet Engineering Task Force) in March 2005, I have frequently been asked, “What’s coming next?” but I have usually declined to answer. Nobody is in charge of the Internet, which is a good thing, but it makes predictions difficult (and explains why this article starts with a disclaimer: It represents my views alone and not those of my colleagues at either IBM or the IETF). The reason the lack of central control is a good thing is that it has allowed the Internet to be a laboratory for innovation throughout its life—and it’s a rare thing for a major operational system to serve as its own development lab. As the old metaphor goes, we frequently change some of the Internet’s engines in flight. This is possible because of a few of the Internet’s basic goals:
• Universal connectivity—anyone can send packets to anyone.
• Applications run at the edge—so anyone can install and offer services.
• “Cheap and cheerful” core technology—so transmission is cheap.
• Natural selection—no grand plan, but good technology survives and the rest dies.
Of course, this is an idealistic view. In recent years, firewalls and network address translators have made universal connectivity sticky. Some telecommunications operators would like to embed services in the network. Some transmission technologies try too hard, so they are not cheap. Until now, however, the Internet has remained a highly competitive environment and natural selection has prevailed, even though there have been attempts to protect incumbents by misguided regulation. In this environment of natural selection, predicting technology trends is very hard. The scope is broad—the IETF considers specifications for how
IP runs over emerging hardware media, maintenance and improvements to IP itself and to transport protocols including the ubiquitous TCP, routing protocols, basic application protocols, network management, and security. A host of other standards bodies operate in parallel with the IETF. To demonstrate the difficulty of prediction, let’s consider only those ideas that get close enough to reality to be published within the IETF; that’s about 1,400 new drafts per year, of which around 300 end up being published as IETF requests for comments (RFCs). By an optimistic rough estimate, at most 100 of these specifications will be in use 10 years later (i.e., 7 percent of the initial proposals). Of course, many other ideas are floated in other forums such as ACM SIGCOMM. So, anyone who agrees to write about emerging protocols has at least a 93 percent probability of writing nonsense. What would I have predicted 10 years ago? As a matter of fact, I can answer that question. In a talk in May 1996 I cautiously quoted Lord Kelvin, who stated in 1895 that “heavier-than-air flying machines are impossible,” and I incautiously predicted that CSCW (computer-supported collaborative work), such as packet videoconferencing and shared whiteboard, would be the next killer application after the Web, in terms of bandwidth and real-time requirements. I’m still waiting. A little earlier, speaking to an IBM user meeting in 1994 (before I joined IBM), I made the following specific predictions:
• Desktop client/server is the whole of computing. The transaction processing model is unhelpful.
• Cost per plug of LAN will increase.
• Internet and IPX will merge and dominate.
• Desktop multimedia is more than a gimmick, but only part of desktop computing.
• Wireless mobile PCs will become very important.
• Network management (including manageable equipment and cabling) is the major cost.
Well, transaction processing is more important in 2006 than it has ever been, and IPX has just about vanished.
The rest, I flatter myself, was reasonably accurate.

All of this should make it plain that predicting the future of the Internet is a mug’s game. This article therefore focuses on the challenges and trends that are observable today.


The original Internet goal that anyone could send a packet to anyone at any time was the root of the extraordinary growth observed in the mid-1990s. To quote Tim Berners-Lee, “There’s a freedom about the Internet: As long as we accept the rules of sending packets around, we can send packets containing anything to anywhere.”1

As with all freedoms, however, there is a price. It’s trivial to forge the origin of a data packet or of an e-mail message, so the vast majority of traffic on the Internet is unauthenticated, and the notion of identity on the Internet is fluid. Anonymity is easy. When the Internet user community was small, it exerted enough social pressure on miscreants that this was not a major problem area. Over the past 10 years, however, spam, fraud, and denial-of-service attacks have become significant social and economic problems.

Thus far, service providers and enterprise users have responded largely in a defensive style: firewalls to attempt to isolate themselves, filtering to eliminate unwanted or malicious traffic, and virtual private networks to cross the Internet safely. These mechanisms are not likely to go away, but what seems to be needed is a much more positive approach to security: identify and authenticate the person or system you are communicating with, authorize certain actions accordingly, and if needed, account for usage. The term of art is AAA (authentication, authorization, accounting).

AAA is needed in many contexts and may be needed at several levels for the same user session. For example, a user may first need to authenticate to the local network provider. A good example is a hotel guest using the hotel’s wireless network: the first attempt to access the Internet may require the user to enter a code supplied by the front desk. In an airport, a traveler may have to supply a credit card number to access the Internet or use a preexisting account with one of the network service providers that offer connectivity.
A domestic ADSL customer normally authenticates to a service provider, too. IETF protocols such as EAP (Extensible Authentication Protocol) and RADIUS (Remote Authentication Dial-in User Service) are used to mediate these AAA interactions. This form of AAA, however, authenticates the user only as a sender and receiver of IP packets, and it isn’t used at all where free service is provided (e.g., in a coffee shop).

Often (e.g., for a credit card transaction) the remote server needs a true identity, which must be authenticated by some secret token (in a simple solution, a PIN code transmitted over a secure channel). But the merchant who takes the money may not need to know that true identity, as long as a trusted financial intermediary verifies it. Thus, authentication is not automatically the enemy of privacy.

Cryptographic authentication is a powerful tool. Just as it can be used to verify financial transactions, it can in theory be used to verify any message on the Internet. Why, then, do we still have spoofed e-mail and even spoofed individual data packets?

For individual data packets, there is a solution known as IPsec (IP security), which is defined in a series of IETF specifications and widely (but not universally) implemented. It follows the basic Internet architecture known as the end-to-end principle: do not implement a function inside the network that can be better implemented in the two end systems of a communication. For two systems to authenticate (or encrypt) the packets they send to each other, they have only to use IPsec and to agree on the secret cryptographic keys. So why is this not in universal usage? There are at least three reasons:

• Cryptographic calculations take time during the sending and receiving of every packet. This overhead is unacceptable for all but very sensitive applications.
• Management of cryptographic keys has proved to be a hard problem and usually requires some sort of preexisting trust relationship between the parties.
• Traversing firewalls and network address translators adds complexity and overhead to IPsec.

Thus, IPsec deployment today is limited mainly to virtual private networks, where the overhead is considered acceptable, the two ends are part of the same company so key management is feasible, and firewall traversal is considered part of the overhead.
More general usage of IPsec may occur as concern about malware within enterprise networks rises and as the deployment of IPv6 reduces the difficulties caused by network address translation.

For e-mail messages, mechanisms for authentication or encryption of whole messages (known as S/MIME and PGP) have existed for years. Most people don’t use them. Again, the need for a preexisting trust relationship appears to be the problem. Despite the annoyance of spam, people want to be able to receive mail from anybody without prior arrangement. Operators of Internet services want to receive unsolicited traffic from unknown parties; that’s how they get new customers. A closed network may be good for some purposes, but it’s not the Internet.
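The end-to-end idea is easy to demonstrate. The sketch below is a minimal, illustrative stand-in for what an IPsec integrity check does: a freshly generated key substitutes for the secrets that IKEv2 would negotiate, and all names are my own invention. Each payload gets an HMAC tag appended, and any forged or corrupted packet is rejected; the per-packet cryptographic cost mentioned above is exactly these two hash computations.

```python
import hashlib
import hmac
import os

# Hypothetical shared secret; in real IPsec this would be negotiated
# between the two end systems via IKEv2, not generated locally.
KEY = os.urandom(32)
TAG_LEN = 32  # SHA-256 digest size

def protect(payload: bytes) -> bytes:
    """Append an authentication tag, as an ESP-like integrity check would."""
    tag = hmac.new(KEY, payload, hashlib.sha256).digest()
    return payload + tag

def verify(packet: bytes) -> bytes:
    """Strip and check the tag; raise if the packet was forged or corrupted."""
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return payload

pkt = protect(b"hello, world")
assert verify(pkt) == b"hello, world"

# Flip one bit of the tag: verification must fail.
tampered = pkt[:-1] + bytes([pkt[-1] ^ 1])
try:
    verify(tampered)
except ValueError:
    pass  # forgery detected, as intended
```

Note how the check requires nothing from the network in between; only the two end systems hold the key, which is the end-to-end principle in miniature.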

It’s worth understanding that whereas normal end users can at worst send malicious traffic (such as denial-of-service attacks, viruses, and fraudulent mail), an ISP can in theory spy on or reroute traffic, or make one server simulate another. A hardware or software maker can in theory insert “back doors” in a product that would defeat almost any security or privacy mechanism. Thus, we need trustworthy service providers and manufacturers, and we must be very cautious about downloaded software.

To summarize the challenges in this area:

• How can identity be defined, authenticated, and kept private?
• How can trust relationships be created between arbitrary sets of parties?
• How can cryptographic keys be agreed upon between the parties in a trust relationship?
• How can packet origins be protected against spoofing at line speed?
• How can we continue to receive messages from unknown parties without continuing to receive unwanted messages?

The IETF is particularly interested in the last three questions. Work on key exchange has resulted in the IKEv2 (Internet Key Exchange, version 2) standard, and work continues on profiles for use of public-key cryptography with IPsec and IKEv2. At the moment, the only practical defense against packet spoofing is to encourage ingress filtering by ISPs; simply put, an ISP should discard packets from a customer’s line unless they come from an IP address assigned to that customer. This eliminates spoofing only if every ISP in the world plays the game, however. Finally, the problem of spam prevention remains extremely hard; there is certainly no silver bullet that will solve it. At the moment the IETF’s contribution is to develop the DKIM (DomainKeys Identified Mail) specification.
If successful, this effort will allow a mail-sending domain to take responsibility, using digital signatures, for having taken part in the transmission of an e-mail message and to publish “policy” information about how it applies those signatures. Taken together, these measures will assist receiving domains in detecting (or ruling out) certain forms of spoofing as they pertain to the signing domain. We are far from done on Internet security. We can expect new threats to emerge constantly, and old threats will mutate as defenses are found.
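The ingress-filtering rule just described is simple enough to sketch directly. In this toy illustration the line names and address blocks are invented; a real deployment follows the BCP 38 recommendation and enforces the check in the router’s forwarding path, not in Python:

```python
import ipaddress

# Hypothetical delegations: customer line -> address block(s) the ISP
# has assigned to that line.
ASSIGNED = {
    "line-42": [ipaddress.ip_network("192.0.2.0/24")],
    "line-43": [ipaddress.ip_network("198.51.100.0/25")],
}

def accept(line: str, src: str) -> bool:
    """Forward a packet only if its source address falls within a
    prefix assigned to the customer line it arrived on."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in ASSIGNED.get(line, []))

assert accept("line-42", "192.0.2.7")        # legitimate source address
assert not accept("line-42", "203.0.113.9")  # spoofed source: dropped
```

As the article notes, the check is local to each access line, which is exactly why it works only if every ISP applies it.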


The Internet has never promised to deliver packets; technically, it is an “unreliable” datagram network, which may and does lose a (hopefully small) fraction of all packets. By the end-to-end principle, end systems are required to detect and compensate for missing packets. For reliable data transmission, that means retransmission, normally performed by the TCP half of TCP/IP. Users will see such retransmission, if they notice it at all, as a performance glitch. For media streams such as VoIP, packet loss will often be compensated for by a codec, but a burst of packet loss will result in broken speech or patchy video. For this reason, the issue of QoS (quality of service) came to the fore some years ago, when audio and video codecs first became practical. It remains a challenge.

One aspect of QoS is purely operational: the more competently a network is designed and managed, the better the service will be, with more consistent performance and fewer outages. Although unglamorous, this is probably the most effective way of providing good QoS. Beyond that, there are three more approaches to QoS, which can be summarized as:

• Throw bandwidth at the problem.
• Reserve bandwidth.
• Operate multiple service classes.

The first approach is based on the observation that both in the core of ISP networks and in properly cabled business environments, raw bandwidth is cheap (even without considering the now-historical fiber glut). In fact, the only place where bandwidth is significantly limited is in the access networks (local loops and wireless networks). Thus, most ISPs and businesses have solved the bulk of their QoS problem by overprovisioning their core bandwidth. This limits the QoS problem to access networks and any other specific bottlenecks. The question then is how to provide QoS management at those bottlenecks, which is where bandwidth reservations or service classes come into play.

In the reservation approach, a session asks the network to assign bandwidth all along its path.
In this context, a session could be a single VoIP call, or it could be a semipermanent path between two networks. This approach has been explored in the IETF for more than 10 years under the name of “Integrated Services,” supported by RSVP (Resource Reservation Protocol). Even with the rapid growth of VoIP recently, RSVP has not struck oil; deployment seems too clumsy. A related approach, however, building virtual paths with guaranteed bandwidth across the network core, is embodied in the use of MPLS (Multiprotocol Label Switching). In fact, a derivative of RSVP known as RSVP-TE (for traffic engineering) can be used to build MPLS paths with specified bandwidth. Many ISPs are using MPLS technology.

MPLS does not solve the QoS problem in the access networks, which by their very nature are composed of a rapidly evolving variety of technologies (ADSL, CATV, various forms of Wi-Fi, etc.). Only one technology is common to all these networks: IP itself. Therefore, the final piece of the QoS puzzle works at the IP level. Known as Differentiated Services, it is a simple way of marking every packet for an appropriate service class, so that VoIP traffic can be handled with less jitter than Web browsing, for example. Obviously, this is desirable from a user viewpoint, and it’s ironic that the more extreme legislative proposals for so-called “net neutrality” would ostensibly outlaw it, as well as outlawing priority handling for VoIP calls to 911.

The challenge for service providers is how to knit the four QoS tools (competent operation, overprovisioning of bandwidth, traffic engineering, and differentiated services) into a smooth service offering for users. This challenge is bound up with the need for integrated network management systems, where not only the IETF but also the DMTF (Distributed Management Task Force), TMF (TeleManagement Forum), ITU (International Telecommunication Union), and other organizations are active. This is an area where we have plenty of standards; the practical challenge is integrating them. However, the Internet’s 25-year-old service model, which allows any packet to be lost without warning, remains, and transport and application protocols still have to be designed accordingly.
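For the curious, requesting a Differentiated Services class is a one-line operation at the socket level. This sketch marks outgoing UDP packets with the EF (Expedited Forwarding) code point, 46, commonly used for voice traffic; whether any network along the path honors the mark is, of course, entirely up to the operators:

```python
import socket

def dscp_to_tos(dscp: int) -> int:
    """The 6-bit DSCP occupies the upper bits of the old 8-bit TOS byte."""
    return dscp << 2

EF = 46  # Expedited Forwarding: low-loss, low-jitter handling

# Mark an outgoing UDP socket (e.g., for an RTP media stream).
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(EF))
sock.close()

assert dscp_to_tos(EF) == 0xB8
```

The simplicity is the point: Differentiated Services pushes all the per-packet complexity to the routers, which need only inspect one byte of the IP header.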


As previously mentioned, MPLS allows operators to create virtual paths, typically used to manage traffic flows across an ISP backbone or between separate sites in a large corporate network. At first glance, this revives an old controversy in network engineering—the conflict between datagrams and virtual circuits. More than three decades ago this was a major issue. At that time, conventional solutions depended on end-to-end electrical circuits (hardwired or switched, and multiplexed where convenient).

The notion of packet switching, or datagrams, was introduced in 1962 by Paul Baran and became practicable from about 1969 when the ARPANET started up. To the telecommunications industry, it seemed natural to combine the two concepts (i.e., send the packets along a predefined path, which became known as a virtual circuit). The primary result was the standard known as X.25, developed by the ITU. To the emerging computer networking community, this seemed to add pointless complexity and overhead. In effect, this controversy was resolved by the market, with the predominance of TCP/IP and the decline of X.25 from about 1990 onward. Why then has the virtual circuit approach reappeared in the form of MPLS?

First, you should understand that MPLS was developed by the IETF, the custodian of the TCP/IP standards, and it is accurate to say that the primary target for MPLS is the transport of IP packets. Data still enters, crosses, and leaves the Internet encased in IP packets. Within certain domains, however, typically formed by single ISPs, the packets will be routed through a preexisting MPLS path. This has two benefits:

• At intermediate switches along the path, switching will take place at full hardware speed (today electronically, tomorrow optically), without any complex routing decisions being made at line speed.
• The path itself is established with whatever security, bandwidth, and QoS characteristics it is considered to need, using network management techniques known collectively as “traffic engineering.”

Note that not all experts are convinced by these benefits. Modern IP routers are hardly slow, and as previously noted, QoS may in practice not be a problem in the network core. Most ISPs insist on the need for such traffic engineering, however. Even with MPLS virtual paths, the fundamental unit of transmission on the Internet remains a single IP packet.

THE EVERLASTING PROBLEM
In 1992, the IETF’s steering group published a request for comments2 that focused on two severe problems facing the Internet at that time (just as its escape from the research community into the general economy was starting): first, IP address space exhaustion; and second, routing table explosion. The first has been contained by a hack (network address translation) and is being solved by the growing deployment of IPv6 with its vastly increased address space. The second problem is still with us. The number of entries in the backbone routing tables of the Internet was below 20,000 in 1992 and is above 250,000 today (see figure 1).

[Figure 1: Growth of the BGP table, 1994 to present. Active BGP entries (FIB), rising to more than 250,000 by 2006.]

This is a tough problem. Despite the optimistic comment about router speed, such a large routing table needs to be updated dynamically on a worldwide basis. Furthermore, it currently contains not only one entry for each address block assigned to any ISP anywhere in the world, but also an entry for every user site that needs to be “multihomed” (connected simultaneously to more than one ISP for backup or load-sharing). As more businesses come to rely on the network, the number of multihomed sites is expected to grow dramatically, with a serious estimate of 10 million by 2050. A routing table of this size is not considered feasible. Even if Moore’s law solves the storage and processing challenge at reasonable cost, the rate of change in a table of that size could greatly exceed the rate at which routing updates could be distributed worldwide. Although we have known about this problem for more than 10 years, we are still waiting for the breakthrough ideas that will solve it.
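To see why table size hurts, consider what a router must do for every packet: find the longest prefix that matches the destination address. Here is a toy version with invented prefixes; a real router holds hundreds of thousands of entries and uses specialized data structures (typically tries in hardware) rather than this linear scan:

```python
import ipaddress

# A toy forwarding table: prefix -> next hop. Entry names are invented.
TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "upstream",            # default route
    ipaddress.ip_network("203.0.113.0/24"): "customer-A",
    ipaddress.ip_network("203.0.113.128/25"): "customer-A-multihomed",
}

def next_hop(dst: str) -> str:
    """Longest-prefix match: the most specific entry containing dst wins."""
    addr = ipaddress.ip_address(dst)
    best = max((net for net in TABLE if addr in net),
               key=lambda net: net.prefixlen)
    return TABLE[best]

assert next_hop("203.0.113.200") == "customer-A-multihomed"  # /25 beats /24
assert next_hop("203.0.113.5") == "customer-A"
assert next_hop("192.0.2.1") == "upstream"                   # default route
```

Every multihomed site adds another prefix that every backbone router must carry and keep current, which is exactly the scaling problem described above.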


The telecommunications industry was fundamentally surprised by the Internet’s success in the 1990s and then fundamentally shaken by its economic consequences. Only now is the industry delivering a coherent response, in the form of the ITU’s NGN (Next Generation Networks) initiative launched in 2004. NGN is to a large extent founded on IETF standards, including IP, MPLS, and SIP (Session Initiation Protocol), which is the foundation of standardized VoIP and IMS (IP Multimedia Subsystem). IMS was developed for third-generation cellphones but is now the basis for what the ITU calls “fixed-mobile convergence.” The basic principles of NGN are:3

• IP packet-based transport using MPLS
• QoS-enabled
• Embedded service-related functions, layered on top of transport or based on IMS
• User access to competing service providers
• Generalized mobility

At this writing, the standardization of NGN around these principles is well advanced. Although it is new for the telecommunications industry to layer services on top rather than embedding them in the transport network, there is still a big contrast with the Internet here: Internet services are by definition placed at the edges and are not normally provided by ISPs as such. The Internet has a history of avoiding monopoly deployments; it grows by spontaneous combustion, which allows natural selection of winning applications by the end users. Embedding service functions in the network has never worked in the past (except for directories). Why will it work now?

It should be clear from this superficial and partial personal survey that we are still having fun developing the technology of the Internet, and that the party is far from over.

The Internet technical community has succeeded by being open, and open-minded. Any engineer who wants to join in can do so. The IETF has no membership requirements; anyone can join the mailing list of any working group, and anyone who pays the meeting fee can attend IETF meetings. Decisions are made by rough consensus, not by voting. The leadership committees in the IETF are drawn from the active participants by a community nomination process. Apart from meeting fees, the IETF is supported by the Internet Society.

There are several ways to get involved: by supporting the Internet Society (http://www.isoc.org), by joining IETF activities of interest (http://www.ietf.org), or by contributing to research activities (http://www.irtf.org and, of course, ACM SIGCOMM at http://www.acm.org/sigs/sigcomm/). Q

REFERENCES
1. Berners-Lee, T. 1999. Weaving the Web. San Francisco: HarperCollins.
2. Gross, P., Almquist, P. 1992. IESG deliberations on routing and addressing. RFC 1380 (November). DDN Network Information Center; http://www.rfc-archive.org/getrfc.php?rfc=1380.
3. Based on a talk by Keith Knightson. 2005. Basic NGN architecture principles and issues; http://www.itu.int/ITU-T/worksem/ngn/200505/program.html.

ACKNOWLEDGMENTS
Thanks to Bernard Aboba and Stu Feldman for valuable comments on a draft of this article.

BRIAN E. CARPENTER is an IBM Distinguished Engineer working on Internet standards and technology. Based in Switzerland, he became chair of the IETF (Internet Engineering Task Force) in March 2005. Before joining IBM, he led the networking group at CERN, the European Laboratory for Particle Physics, from 1985 to 1996. He served from March 1994 to March 2002 on the Internet Architecture Board, which he chaired for five years.
He also served as a trustee of the Internet Society and was chairman of its board of trustees for two years until June 2002. He holds a first degree in physics and a Ph.D. in computer science, and is a chartered engineer (UK) and a member of the IBM Academy of Technology.
© 2006 ACM 1542-7730/06/1200 $5.00



book reviews
Sustainable Software Development: An Agile Perspective
Kevin Tate, Addison-Wesley Professional, 2005, $39.99, ISBN: 0321286081

Our software engineering community has for decades flirted with the idea of applying the rigor of other engineering disciplines to the development of software. This book boldly argues against this metaphor. Buildings are expensive to modify and typically static, whereas software is cheap to modify and evolves over its lifetime. Instead, author Kevin Tate argues that an appropriate metaphor is a coral reef: an ecosystem of developers, customers, suppliers, distributors, and competitors that live on top of the software, in the same way that a reef’s organisms live around the coral. Both the coral and the software evolve with their surrounding ecosystems.

The book distinguishes itself from other agile programming books by taking a wider view of the field, covering not only the project management side of agile practices, but also developer collaboration and technical excellence. It starts by arguing that the goal of sustainability comes into play by recognizing that a project’s progress depends on the competition between negative stresses (user requirements, disruptive technologies and business models, external dependencies, competition, and cost management) and positive controls (collaboration, methodology, expertise, decision making, leadership, culture, and simplicity). When the negative stresses outweigh the counteraction of a project’s controls, the project enters a death spiral of diminishing productivity.

The remainder of the book is organized around a chapter for each of the four principles that should guide sustainable development: defect prevention, a working product, emphasis on design, and continual refinement. With another apt metaphor, Tate advises developers to juggle the four principles of sustainable development while working on product features and fixing bugs.
The text is full of interesting ideas and illuminating sidebars discussing real-world cases, but as a result the reader can occasionally get lost among them, losing focus on the argument and the course of thought. Nevertheless, this is a book that both developers and managers will appreciate and value. Its advice is important, understandable, and practical: a gift to the software engineering community. —D. Spinellis

Hacking Exposed: Web Applications, 2nd edition
Joel Scambray, Mike Shema, Caleb Sima, McGraw-Hill Osborne Media, 2006, $49.99, ISBN: 0072262990

Many years ago, the “Hacking Exposed” book series started covering security from a hacker’s perspective. Since the security landscape has become more complex, the series now covers the multiple facets of network and system security, and includes books on specific systems (Linux, Windows, Cisco), as well as wireless networks (forthcoming in 2007). This book is dedicated to the security of Web applications and associated service deployment architectures.

It is written from an attacker’s point of view and follows the basic steps that an attacker takes. It starts with a description of the reconnaissance phase (fingerprinting the application and the supporting Web server) and moves on to more intrusive attacks, roughly divided into those against authentication methods, bypassing authorization mechanisms, abusing the input-validation procedures, and stealing sensitive information. These subjects are presented well, both from a conceptual point of view and through examples drawn from real-world cases. There is also a relatively short chapter addressing the security and vulnerabilities of Web services.

Hacking on the server side is only one viewpoint of Web security. The typical end user may also get hacked just by visiting malicious or compromised Web sites. One chapter in the book reviews the most famous exploits and vulnerabilities related to these issues.

This book meets high expectations. It is fun and easy to read. It covers in sufficient depth the technical details and underlying system- and software-specific issues. The technical level is somewhere between intermediate and advanced, thus appealing to a broad range of readers.
Webmasters will learn how to check their servers for the most common security flaws, programmers will appreciate the contents on securing their code, and typical readers will get a comprehensive picture of the status of Web security today. I recommend this truly exceptional book to all of these readers. —Radu State
Reprinted from Computing Reviews, © 2006 ACM, http://www.reviews.com


LISA (Large Installation System Administration) Conference
Dec. 3-8, 2006, Washington, D.C.
http://www.usenix.org/events/lisa06/

Web Builder 2.0
Dec. 4-6, 2006, Las Vegas, Nevada
http://www.ftponline.com/conferences/webbuilder/2006/

ICSOC (International Conference on Service-Oriented Computing)
Dec. 4-7, 2006, Chicago, Illinois
http://www.icsoc.org/

Search Engine Strategies
Dec. 4-7, 2006, Chicago, Illinois
http://searchenginestrategies.com/sew/chicago06/

XML Conference
Dec. 5-7, 2006, Boston, Massachusetts
http://2006.xmlconference.org/

The Spring Experience
Dec. 7-10, 2006, Hollywood, Florida
http://thespringexperience.com/

Web Design World
Dec. 11-13, 2006, Boston, Massachusetts
http://www.ftponline.com/conferences/webdesignworld/2006/boston/

Macworld
Jan. 8-12, 2007, San Francisco, California
http://www.macworldexpo.com/launch/

Symposium on POPL (Principles of Programming Languages)
Jan. 17-19, 2007, Nice, France
http://www.cs.ucsd.edu/popl/07/

IUI (International Conference on Intelligent User Interfaces)
Jan. 28-31, 2007, Honolulu, Hawaii
http://www.iuiconf.org/

To announce an event, E-MAIL
QUEUE-ED@ACM.ORG OR FAX +1-212-944-1318

Gartner Business Process Management Summit
Feb. 26-28, 2007, San Diego, California
http://www.gartner.com/2_events/conferences/bpm3.jsp

Black Hat Briefings and Trainings
Feb. 26-Mar. 1, 2007, Washington, D.C.
http://www.blackhat.com/html/bh-dc-07/bh-dc-07-index.html

ETel (Emerging Telephony Conference)
Feb. 27-Mar. 1, 2007, Burlingame, California
http://conferences.oreillynet.com/etel2007/


Designing and Building Ontologies
Feb. 5-8, 2007, Washington, D.C.
http://www.wilshireconferences.com/seminars/Ontologies/

RSA Conference
Feb. 5-9, 2007, San Francisco, California
http://www.rsaconference.com/2007/us/

SCALE 5x (Southern California Linux Expo)
Feb. 10-11, 2007, Los Angeles, California
http://www.socallinuxexpo.org/scale5x/

FAST (Usenix Conference on File and Storage Technologies)
Feb. 13-16, 2007, San Jose, California
http://www.usenix.org/events/fast07/

LinuxWorld OpenSolutions Summit
Feb. 14-15, 2007, New York, New York
http://www.linuxworldexpo.com/live/14/


DAMA (Data Management Association) International Symposium and Wilshire Meta-Data Conference
Mar. 4-8, 2007, Boston, Massachusetts
http://www.wilshireconferences.com/MD2007/

Game Developers Conference
Mar. 5-9, 2007, San Francisco, California
http://www.gdconf.com/

TheServerSide Java Symposium
Mar. 21-23, 2007, Las Vegas, Nevada
http://javasymposium.techtarget.com/lasvegas/



St. Mary’s College of Maryland Tenure-Track Assistant Professor Positions
Two assistant-level tenuretrack positions in Computer Science at St. Mary’s College of Maryland—a Public Liberal Arts College—starting Fall 2007. Industrial experience and a demonstrated ability to attract and retain students from underrepresented groups are desired. Further details at: http://www.smcm.edu/nsm/ mathcs/cs07.html. AA/EOE.

Netflix Netflix is looking for great engineers!
Do you want to work with talented people who are motivated by making a difference and interested in solving tough problems? Are you a web-savvy software engineer, developer, or designer? No matter how your mind works, we have a job opening for you. For more details, please visit our website at http://www.netflix.com/Jobs and submit an application.

aai Services corporation Sr Software Engineer
AAI Services Corporation is seeking a Software Engineer 4 at its McLean, VA location. BS and 10 years of C++, plus some Linux, required. OpenGL or 3D graphics experience preferred. US citizenship required. EOE. Comprehensive benefits, competitive salary.

The D. E. Shaw group Software Developer
The D. E. Shaw group is looking for top-notch, innovative software developers to help it expand its tech venture and proprietary trading activities. We’re a global investment and technology development firm with approximately US $25 billion in aggregate investment capital and a decidedly different approach to doing business. The application of advanced technology is an integral part of virtually everything we do, from developing computationally intensive strategies for trading in securities markets around the globe to designing a supercomputer intended to fundamentally transform the process of drug discovery. Developers at the firm work on a variety of interesting technical projects including real-time data analysis, distributed system development, and the creation of tools for mathematical modeling. They also enjoy access to some of the most advanced computing resources in the world. If you’re interested in applying your intellect to challenging problems of software architecture and engineering in a stimulating, fast-paced environment, then we’d love to see your resume. To apply, e-mail your resume to ACM-SNowak@career.deshaw.com. EOE.

Winona State University Computer Science Department
The Computer Science Department at Winona State University invites applications for a tenure-track faculty position on its Rochester campus, to begin Fall 2007. A PhD or ABD in Computer Science or a closely related field is required. We are particularly interested in candidates with specialization in bioinformatics, biomedical informatics, databases, and data mining; candidates in all areas of CS will be considered and are encouraged to apply. Rochester, a vibrant, diverse city located in SE Minnesota, offers many opportunities for collaboration and/or employment for partners at Mayo Clinic, 20+ software companies including IBM, and Rochester Community and Technical College, among others. Review begins 1/16/07. For a full position description and application procedure, see http://www.winona.edu/humanresources. AA/EOE

American University of Beirut Department of Computer Science
The Department of Computer Science at the American University of Beirut invites applications for faculty positions at all levels. Candidates should have a Ph.D. in computer science or a related discipline, and a strong research record. All positions are normally at the Assistant Professor level to begin September 15, 2007, but appointments at higher ranks and/or visiting appointments may also be considered. Appointments are for an initial period of three years. The usual teaching load is not more than nine hours a week. Sabbatical visitors are welcome. The language of instruction is English. For more information please visit http://www.aub.edu.lb/~webfas/. Interested applicants should send a letter of application and a CV, and arrange for three letters of reference to be sent to: Dean, Faculty of Arts and Sciences, American University of Beirut, c/o New York Office, 3 Dag Hammarskjold Plaza, 8th Floor, New York, NY 10017-2303, USA, or Dean, Faculty of Arts and Sciences, American University of Beirut, P.O. Box 11-0236, Riad El-Solh, Beirut 1107 2020, Lebanon. Electronic submissions may be sent to: as_dean@aub.edu.lb. All application materials should be received by December 29, 2006. The American University of Beirut is an Affirmative Action, Equal Opportunity Employer.

University of North Carolina at Charlotte Department of Software and Information Systems
Two tenure-track faculty positions are available at the associate/assistant professor level. The Department is dedicated to research and education in computing, with emphasis in Information Security & Assurance and Information Integration & Environments. The Department offers degrees at the Bachelor's, Master's, and Ph.D. levels. Faculty candidates with strong research expertise in Software Engineering, Trusted Software Development, Trusted Information Infrastructures, and Information Security and Privacy are encouraged to apply. Highly qualified candidates in other areas will also be considered. Salary will be highly competitive. Applicants must have a Ph.D. in Computer Science, Information Technology, Software Engineering, or a related field, as well as a strong commitment to research and education. For further details please visit http://www.sis.uncc.edu. Application review will start in January 2007. Please send a detailed CV together with four references, copies of scholarly publications, and other supporting documents to search-sis@uncc.edu. All materials must be submitted electronically as separate PDF attachments. References must be sent directly. Women, minorities, and individuals with disabilities are encouraged to apply. UNC Charlotte is an Equal Opportunity/Affirmative Action employer.


Continued from page 56

with what seem now to be ridiculously frugal resources. Maurice Wilkes, David Wheeler, and Stan Gill had written the first book on programming.4 This revered, pioneering trio are generally acknowledged as the co-inventors of the subroutine and relocatable code. As with all the most sublime of inventions, it’s difficult to imagine the world without a call/return mechanism. Indeed, I meet programmers, whose parasitic daily bread is earned by invoking far-flung libraries, who have never paused to ponder with gratitude that the subroutine concept needed the brightest heaven of invention. Although no patents for the basic subroutine mechanism were sought (or even available) back then, a further sign of changing times is that patents are now routinely [sic] awarded for variations on the call/return mechanism, as well as for specific subroutines.5

David Wheeler died suddenly in 2004 after one of his daily bicycle rides to the Cambridge Computer Labs. It’s quite Cantabrigian to “die with your clips on.” I had the sad pleasure of attending David’s memorial service and learning more of his extensive work in many areas of computing.6

Other innovations from the Cambridge Mathematical Laboratory in the early 1950s included Wilkes’s paper introducing the concept of microprogramming. On a more playful note was the XOX program written by my supervisor A. S. (Sandy) Douglas. This played (and never lost!) tic-tac-toe (also known as OXO)—a seemingly trivial pursuit, yet one with enormous, unpredicted consequences. XOX was the very first computer game with an interactive CRT display, the challenge being not the programming logic, of course, but the fact that the CRT was designed and wired for entirely different duties. Little could anyone guess then that games and entertainment would become the dominant and most demanding applications for computers. Can anyone gainsay this assertion?
One would need to add up all the chips, MIPS, terabytes, and kid-hours (after defining kid), so I feel safe in my claim. Discuss! If you insist, I can offer a weaselly cop-out: Games and entertainment are now among the most dominant and demanding applications for computers.

Cue in some computer-historic Pythonesque clichés: “We had it tough in them days, folks. The heat, mud, dust, and flies. Try telling the young’uns of today—they just don’t believe yer. And did I mention the heat? 3,000 red-hot Mullard valves. All of us stripped down t’ waist—even t’ men!” Then an older old soldier would intervene: “512 words? You were lucky! All we had were two beads on a rusty abacus!” More ancient cries of disbelief: “An abacus? Sheer luxury! We had to dig out our own pebbles from t’ local quarry. We used to dream of having an abacus...”

In truth, adversity did bring its oft-touted if not so sweet usages. Programming at the lower levels with limited memory constantly “focused the mind”—you were nearer the problem, every cycle had to earn its keep, and every bit carry [sic!] its weight in expensive mercury, as it were. The programming cycle revolved thus: handwrite the code on formatted sheets; punch your tape on a blind perforator (the “prayer” method of verification was popular, whence quips about the trademark Creed); select and collate any subroutines (the library was a set of paper tapes stored in neat white boxes); wait in line at the tape reader (this was before the more efficient “cafeteria” services were introduced); then finally collect and print your output tape (if any). All of which combined to impose a stricter discipline on what we now call software development. More attention perhaps than in these agile, interactive days was given to the initial formulation of the program, including “dry runs” on Brunsviga hand calculators. Indeed, the name of our discipline was numerical analysis and automatic computing, only later to be called computer science.7

EDSAC designer Professor (now Sir) Maurice Wilkes was quoted by the Daily Mail, October 1947: “The brain will carry out mathematical research. It may make sensational discoveries in engineering, astronomy, and atomic physics. It may even solve economic and philosophical problems too complicated for the human mind. There are millions of vital questions we wish to put to it.” A few years later, the Star (June 1949) was reporting: “The future? The ‘brain’ may one day come down to our level and help with our income-tax and bookkeeping calculations.
But this is speculation and there is no sign of it so far.” Allowing for journalistic license, one can spot early differences between how computing was expected to evolve and how, in fact, things turned out. The enormous impact on scientific research did come about and continues to grow, but the relative pessimism about commercial applications quickly vanished. Indeed, soon after the June 1949 quote (“no sign of it so far”), the UK’s leading caterers, food manufacturers, and tea-shop chain, J. (Joe) Lyons & Co., embarked on its LEO (Lyons Electronic Office) project, a business computer based directly on Wilkes’s EDSAC designs (with appropriate financial support). I recall visits by the Joe Lyons “suits,” who

explained that there was more to their business computing needs than multiplying NumberOfBuns by UnitBunPrice. LEO was running complex valuation, inventory, and payroll applications by 1951, an important root of the global expansion of commercial IT subsequently dominated by IBM.

I now move from praising the pleasantly unpredicted iPod-in-my-pocket to complaining about the overconfident predictions that continue to elude us. Otherwise, this column will fail to qualify as “curmudgeonly.” Reconsider the 1947 proposition: “It [meaning IT!] may even solve economic and philosophical problems too complicated for the human mind.” One can agree that many complicated “economic” problems have succumbed. Here one must think beyond cutting payroll checks and the inventory control of tea cakes. Although such tasks can be quite complex because of sheer size and the quirks of a volatile real-world rule-set, they remain essentially mechanizable given the certifiable [sic] patience of nitpicking programmers.8

Moving to the wider field of “economic” problems that are truly “too complicated for the human mind,” we can acknowledge the progress made in the computer-modeling of such domains as global trade, energy consumption, and climate change. These systems stretch the normal physical laws of cause and effect by having to cope with chaos (the susceptibility to small input changes and measurement accuracies) and the nondeterministic influences of human behavior. Nevertheless, these models are useful in testing possible outcomes for given policy decisions. Interestingly, Professor Steve Rayner (director of the James Martin Institute, Oxford University) calls the global environmental problem “wicked,” a technical term for those complex problems that demand “clumsy” solutions. Clumsy is also a technical term!
Briefly, clumsy solutions combine the diverse solutions hinted at by the various competing models in an undogmatic way (see http://www.unsw.edu.au/news/pad/articles/2006/jul/jack_beale_lecture.html). To this extent, the prediction that computers “may even solve economic problems too complicated for the human mind” is unfolding with promise, although the emphasis should really be on “helping us to solve.”

Finally, what can we say about the 1947 hope that computers “may even solve philosophical problems too complicated for the human mind”? Je dis que NON! The AI and robotics dreams continue to frustrate us even in the reduced realm of simulating what all human minds (and bodies) do naturally as a matter of course without

any “complicated” introspection! In particular, I mean the sine-qua-non acquisition and command of natural language. The claimed progress certainly excites some Aivatarians (the folks who apply AI techniques to construct artificial avatars). Chomsky’s analogy is telling: The long-jumpers increase their leaps each year, and claim that one day soon they will achieve bird-like flight. Or the parrots, taught to... er... parrot an increasing number of words, will one day (to paraphrase Bertrand Russell) tell you that their parents were poor but honest.

The ACM TechNews (Oct. 20, 2006) reports the creditable achievements of the aivatar (AI avatar) George, programmed by Rollo Carpenter of Televisual. George is a chatbot, a term adding to the glut of bots derived from the Czech Karel Čapek’s play R.U.R. (Rossum’s Universal Robots). Robot (coined by Čapek’s brother) comes from the Slavonic root for work, but we must not overlook the political satire: robota meant statute labor in the days of

Uncle Joe. We now have as many bots in the computer laxicon as wares. Thus, genetically modified veggies are grobots, and humans are knobots (compare with wetware for that mushy stuff between our ears). George is said to “speak” some 40 languages, having had more than 11 million online text “conversations” with humans via http://www.jabberwacky.com (note the deliberate variant spelling of Lewis Carroll’s poem “Jabberwocky.” The hint may be that WACI is emerging as an acronym for wild and crazy ideas). My own brief exchange with George reminds me of those early versions of Doctor but clearly with the richer range you might expect from a larger database. George’s replies lack any real signs of human intelligence. Discussing Democratic presidential candidates (George’s idea), I entered “Hillary Clinton.” George replied, “Hilary Clinton is a senator.” I said, “She’s also a candidate for president.” George replied, “Oh yes, I know!!!” But when I asked, “Is she Hilary or Hillary?”, George answered, “This does not make sense—how about math?” I said, “Nyet,” and George answered: “You’re right, that’s Russian.” The site has a disclaimer: “Jabberwacky learns the behavior and words of its users. It may use language and produce apparent meanings that some will consider inappropriate. Use this site with discretion, and entirely at your own risk.”

The point has long been made that human knowledge and learning rely deeply on our having a corporeal entity able to explore three-dimensional space and some cognitive “devices” to acquire the notions of both place and time. Chatbots that reside in static hardware are rather limited in this respect. Hence the need for a mobile bot, whether or not it has humanoid features such as Rossum’s original robots. From “embedded systems” to “embodied”? For limited, repetitive actions in tightly controlled environments, tremendous progress has been made, as in motor-car assembly. As a devout soccer fan, I’m intrigued by the possibilities of two teams of 11 robots playing the Beautiful Game. In 1997, Hiroaki Kitano launched the annual Robot World Cup, or RoboCup for short.9 (The name Ballbot has been taken up elsewhere for a robot that moves around rather like a gymnast, walking while balanced on a large sphere; see http://www.post-gazette.com/pg/06235/715415-96.stm.) By 2002, 29 different countries had entered the RoboCup staged in Fukuoka, Japan, attracting a total audience of 120,000 fans. Peter Seddon describes the game between the Baby Tigers (Japan) and the Dirty Dozen (Germany) as “looking remarkably like a contest between toasters on wheels, while the Four-Legged League (RoboMutts) appeared to spend most of the time sniffing each other’s shorts.”

Kitano remains optimistic that by 2050 a team of autonomous bots will beat the human World Cup champions. That’s a prediction that’s difficult to gainsay. Kitano points out that 50 years after EDSAC, IBM’s Deep Blue beat world chess champion Garry (or, some prefer, Gary) Kasparov. Seddon argues that playing a symbolic “computable” game like chess cannot be compared with the physical complexities of soccer, where the rules appear simple but defy algorithmic precision. “He was bleedin’ off-side!” “Oh no, he bleedin’ wasn’t!” Seddon reckons that the chances of Kitano’s prophecy coming true are about the same as Beckham ever becoming world chess champion. That honor, by the way, has just been achieved by the Russian Vladimir Kramnik, but not without some all-too-human, sordid altercations. His rival, the Bulgarian Veselin Topalov, objected to Kramnik’s frequent trips to the restroom (or, in chess notation, K x P?). The ultimate insult was the allegation that Kramnik had been consulting a computer hidden in the Gents. This was convincingly disproved, but one further irony ensued. Since the normal games ended in a points tie, a soccer-like extra time had to be played: four games of nail-biting rapid play, which, to the relief of all fair-chess lovers, was won by Kramnik. Q

REFERENCES
1. You, too, can relive those heroic days. Martin Campbell-Kelly (no relation) of Warwick University offers an EDSAC simulator for the PC and Mac; http://www.dcs.warwick.ac.uk/~edsac/Software/EdsacTG.pdf. This site will also point you to the vast EDSAC bibliography.
2. I’m lying for a cheap laugh. In fact, I’ve never knowingly stolen a file of any kind. As a member of ASCAP (American Society of Composers, Authors, and Publishers), I urge you all to obey the IP (intellectual property) protocols.
3. EOF, possibly overloaded from EndOfFile to ExtremelyOldFart, started life as plain OF (OldFart) in the Jargon File and subsequent versions of the Eric Raymond/Guy Steele Hacker’s Dictionary. In the 1980s OF was generally applied (with pride or sarcasm) to those with more than about 25 years in the trenches. It now seems appropriate to define EOFs by stretching the time served to “more than about 50 years.”
4. Wilkes, M. V., Wheeler, D. J., Gill, S. 1951. The Preparation of Programs for an Electronic Digital Computer. New York: Addison-Wesley.
5. For the effective nullification in 1994 of the Supreme Court’s 1972 Gottschalk v. Benson decision, which had excluded mathematical algorithms from patent applications, see http://www.unclaw.com/chin/scholarship/software.htm. Andrew Chin discusses the vital topic of computational complexity and its impact on patent-law complexity! Claims made for “effective” algorithms can run afoul of well-known computer-scientific theorems.
6. http://www.cl.cam.ac.uk/UoCCL/misc/obit/wheeler.html.
7. Professor P. B. Fellgett once asked the possibly rhetorical question, “Is computer science?” Professor Dijkstra was dubious, posing the counter-riddle, “Is typewriter science?” thereby proving the correctness of the name “computing science.”
8. Readers may have their own counter-anecdotes where apparently trivial business functions turn out to be provably noncomputable. I exclude the epistemological problems of Lex Coddonis: the axioms for normalizing a 100-percent-conforming relational database. I recall programming a UK County Council payroll on the IBM 650, where the police were given a supposedly tax-free clothing allowance that another rule declared taxable, leading to a later refund of the tax paid that was itself taxed and later refunded, ad almost infinitum.
9. Seddon, P. 2005. The World Cup’s Strangest Moments. Chrysalis Books.

LOVE IT, HATE IT? LET US KNOW
feedback@acmqueue.com or www.acmqueue.com/forums

STAN KELLY-BOOTLE (http://www.feniks.com/skb/; http://www.sarcheck.com), born in Liverpool, England, read pure mathematics at Cambridge in the 1950s before tackling the impurities of computer science on the pioneering EDSAC I. His many books include The Devil’s DP Dictionary (McGraw-Hill, 1981), Understanding Unix (Sybex, 1994), and the recent e-book Computer Language—The Stan Kelly-Bootle Reader (http://tinyurl.com/ab68). Software Development Magazine has named him as the first recipient of the new annual Stan Kelly-Bootle Eclectech Award for his “lifetime achievements in technology and letters.” Neither Nobel nor Turing achieved such prized eponymous recognition. Under his nom-de-folk, Stan Kelly, he has enjoyed a parallel career as a singer and songwriter.
© 2006 ACM 1542-7730/06/1200 $5.00


curmudgeon

Will the Real Bots Stand Up?
Stan Kelly-Bootle, Author


From EDSAC to iPod—Predictions Elude Us

When asked which advances in computing technology have most dazzled me since I first coaxed the Cambridge EDSAC 1 1 into fitful leaps of calculation in the 1950s, I must admit that Apple’s iPod sums up the many unforeseen miracles in one amazing, iconic gadget. Unlike those electrical nose-hair clippers and salt ’n’ pepper mills (batteries not included) that gather dust after a few shakes, my iPod lives literally near my heart, on and off the road, in and out of bed like a versatile lover—except when it’s recharging and downloading in the piracy of my own home.2 I was an early iPod convert and remain staggered by the fact that I can pop 40 GB of mobile plug-and-play music and words in my shirt pocket. I don’t really mind if the newer models are 80 GB or slightly thinner or can play movies; 40 GB copes easily with my music and e-lecture needs. Podcasts add a touch of potluck and serendipity-doo-dah. Broadcasts from the American public radio stations that I’ve missed since moving back to England now reach my iPod automatically via free subscriptions and Apple’s iTunes software. I’ve learned to live with that pandemic of “i-catching” prefixes to the point where I’ve renamed Robert Graves’s masterwork “iClaudius,” but I digress.

The functional “completeness” of the audio iPod stems from its ideal marriage of hardware and software. The compactness is just right, respecting the scale of human manipulations. The Dick Tracy wristwatch vade mecum failed through over-cram and under-size. The iPod succeeds with a legible alphanumeric screen and that senile-proof, uncluttered, almost minimal, click-wheel user interface. This avoids the input plague of most portable gadgets such as phones, calculators, and PDAs: the minuscule keyboards and buttons. I hasten to deflect the wrath of my daughter-in-law Peggy Sadler and all who have mastered and swear by the Palm Pilot stylus! The click wheel offers circular, serial access to and selection of your titles, but that’s a decent compromise when you ponder the problems of searching by keywords. Spoken commands remain, as always, waiting for the next reassuring “breakthrough.” I’ll return anon to other Next-Big-Fix-Release promises.

Meanwhile, adding still-life pictures, such as cover art, may retain the iPod’s simple “completeness,” but pushing the device to TV seems to me to break the spell of sound gimcrackery [sic]. Peering at tiny moving pictures is a pointless pain, whereas even modestly priced earphones provide the superb hi-fi we used to dream about when growing up.

The near-exponential improvement of every computing power-performance parameter—physical size, clock speed, storage capacity, and bandwidth, to name the obvious features—is now a cliché of our fair trade. Yet even my older readers3 may need reminding just how bleak things were almost 60 years ago as the world’s first stored-program machine (note the Cambridge-chauvinistic singular) moved into action. The house-size EDSAC was effectively a single-user personal computer—a truly general computing factotum, but as Rossini’s Figaro warns: Ahime, che furia! Ahime, che folla! Uno alla volta, per carità! (Heavens, what mayhem! Goodness, what crowds! One at a time, for pity’s sake!)

Originally (1947) EDSAC boasted [sic] 512 words of main memory stored in 16 ultrasonic mercury-delay-line tanks, cleverly known as “long” tanks because they were longer than the short tanks used for registers. On the bright side, as we used to quip, each of the 512 words was 18 bits! Forget the word count, feel the width! Alas, for technical reasons, only 17 of the 18 bits were accessible. By 1952, the number of long tanks had doubled, providing a dizzy total of 1K words. Input/output was via five-track paper tape, which therefore also served as mass [sic again] storage. Subject only to global timber production, one might see this as virtually unlimited mass storage, although access was strictly slow-serial via 20-characters-per-second tape readers and 10-characters-per-second teletype printers. (By 1958, with EDSAC 2 taking over, paper tape and printer speeds had risen and magnetic tapes had become the standard backup and mass storage medium.)

Although hindsight and nostalgia can distort, one still looks back with an old soldier’s pride at the feats achieved Continued on page 52




