Coding with Lisp


‘A hip, burgeoning free software community is currently pushing Lisp, arguably the oldest programming language in common use, to the fore as a next-generation coding platform. Self-confessed Lisp newbie Martin Howse presents the first part of an accessible guide to the culture of this flexible language, and practical implementations under GNU/Linux’

Lambda The Ultimate
The art of Lisp, or coding as philosophy

case for Lisp than coder and writer Paul Graham (see Hackers and Painters, LU&D 42), whose own work with Viaweb is good testament both to the power of Lisp as an expressive language and to Lisp’s relevance within enterprise-level, large-scale projects. Graham’s essays and books provide a singular practical and conceptual resource, elegantly tying real-world arguments into a neat conceptual bundle, well wrapped up by his strong theoretical grounding and knowledge of the language. He speaks from experience, and those who paraphrase Graham without his depth of understanding will always sound a bit hollow. Nevertheless, Lisp’s features and advantages must be outlined, and these can readily be pinned down both to a decent level of abstraction and to highly usable abstractions themselves, such as closures and macros. It’s worth remembering that Lisp belongs within the family of functional languages, which implies

complete theory of computation within Lisp, is a rich seam to mine, with demanding papers on this subject furnished by the likes of Graham, Sussman and Guy Steele, another major player in the early life of Lisp, and co-inventor of the intriguing Connection Machine.
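To make the code-as-data point concrete, here is a minimal sketch of the kind of thing any Common Lisp top level will accept (the variable name is ours, purely for illustration):

;; an expression captured as plain list data, not yet evaluated
(defparameter *expr* '(+ 1 2 3))
(first *expr*)                     ; => +   (just the head of a list)
(eval *expr*)                      ; => 6   (the same list treated as code)
(eval (list '* 2 (eval *expr*)))   ; => 12  (build new code from data, then run it)

The same list is inert data one moment and executable code the next, which is precisely the property the rest of this article keeps returning to.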

MORE TO LISP THAN LISTS
Extendibility here means both extending the Lisp language itself in a completely transparent manner, thus building a domain-specific language for the application, and providing the facility for others to readily extend and customise an application. In the first instance, the equivalence between software and data allows for coding custom languages with powerful abstractions, and in the latter case, this form of what Graham calls bottom-up programming naturally results in extensible software, with GNU Emacs as the prime example. From this perspective, Lisp isn’t about writing applications; it’s about writing languages. And, given the growing complexity of both hardware and contemporary software systems, this heavily modular, high-level approach to programming and systems architecture is seriously compelling. Indeed, pioneers Sussman and Abelson preface their essential SICP volume with clear indications of how to control complexity: through establishing conventional interfaces and establishing new languages. And, alongside those who claim that Lisp is unusable due to an arcane syntax which multiplies parentheses, some would still maintain that Lisp is slow, arguing that the need for speed has lost out in the battle with complexity. But, given that nearly all modern

Coder’s dream or user’s nightmare? The Symbolics Genera environment was a Lisp hacker’s paradise, allowing coders to dig deep into Lisp hardware

Lisp has come in from the cold. After years enduring the icy Artificial Intelligence (AI) winter, a period in the late eighties when Lisp was dumped by many thanks to associations with over-hyped AI research, and a frosty decade in academic obscurity, the open source community has dusted down this unique language, and freshened her up with lemon odour. Or rather Lemonodor.com, a high-profile Web site which actively promotes and describes Lisp use and can readily be considered the Slashdot of the Lisp world, has performed this service for the community. Like a proud mother, Lisp, probably the oldest language still in use today, can view her children and remark on how their best features all stem from her genes, or rather memes. Flexibility and speed of development are her watchwords, and key enabling features of the language, such as run-time typing and an interactive approach to interpretation or compilation, have been adopted by newer languages such as Python. Yet Python misses the killer core or inner beauty which so many celebrate in Lisp. To approach this core takes more than simply enumerating Lisp’s

good points. To hijack Python guru Tim Peters’ words, “Good languages aren’t random collections of interchangeable features: they have a philosophy and internal coherence that’s never profitably confused with their surface features.” Understanding Lisp is a gradual process, and involves some degree of immersion in the history, philosophy and culture of Lisp as well as a gentle involvement with coding to further elucidate core concepts such as recursion and macros. Yet Lisp, with its array of dialects, libraries and implementations, is far from the easiest language to get up and running with, in contrast to Python, which obviously presents a one-stop shop. And although a clear conceptual understanding can readily be sought in printed works such as The Little Lisper or in Abelson and Sussman’s seminal Structure and Interpretation of Computer Programs (SICP), practical advice which would further support this knowledge is sorely lacking within the community. As we’ll see, certain initiatives are under way to resolve these issues, but aside from evangelising Lisp, an effort which is close to achieving its goals, more work does need to be done in this direction, and this short article only aims to provide an overview for the intelligent and necessarily intrigued beginner. No-one has been more instrumental in persuasively arguing the

modularity, an essential requirement for large-scale, complex projects. Such abstractions, and features such as automatic memory management, courtesy of built-in garbage collection, readily enable new ways of programming, and this is Lisp’s great advantage. Sure, the semi-interpreted nature of Lisp, with functions able to be tried and tested at the interactive REPL (Read Eval Print Loop) prompt, or so-called top level, assists in rapid development, and Graham proffers amusing examples of improving server software on the fly, but Lisp’s real advantage remains its extensibility, which can readily be seen as stemming from a core feature of the language: the fact that Lisp programs are expressed as Lisp data structures. Indeed, John McCarthy, a key figure within AI and inventor and primary implementor of Lisp, remarks in an essay on the early history of Lisp that “One can even conjecture that Lisp owes its survival specifically to the fact that its programs are lists, which everyone, including me, has regarded as a disadvantage.” Hacker Pascal Costanza argues that this core feature makes Lisp much more than just another Turing-complete language, and this notion, which embeds a

implementations compile during interpretation, following a just-in-time model, or allow specifically for interactive compilation, speed is rarely an issue. And the speed of prototyping or coding is a further factor which should also enter the equation. With Lisp, at least, one has greater options for exploration and prototyping, and, if need be, optimisations can be furnished later in the day. Such notions of Lisp as sluggish belong to an old-fashioned view which focuses on syntax and on lists, arguing that LISP stands simply for List Processing, and which corrals this powerful language within an academic prison. The new view of Lisp is that, given automatic highlighting and indentation, parentheses and other syntactical issues disappear. Lisp is a far more flexible language than the acronym would suggest.

GETTING META
A further consequence of flattening the distinction between software and data is that Lisp really does live in the land of the meta, and that’s a place where a good few sophisticated coders and theorists like to hang out. Douglas Hofstadter, in his seminal mathematical and meta-mathematical work, Gödel, Escher, Bach: An Eternal Golden Braid, provides many mind-stimulating adventures at the meta-level, and Lisp makes for a very natural fit here. And yet another consequence of the much

and Sussman’s work. Indeed, in their heyday, these pioneers were churning out meta-circular evaluators for subsets and dialects of Lisp at an alarming rate, and their work forms an important link between the more exciting aspects of mathematics, philosophy and computer science. Another valuable starting point here would be the common assertion that the proof of Gödel’s incompleteness theorem, which is essential to an understanding of AI, would have been easier had he invented Lisp first, given Lisp’s predilection for the meta-circular. And just before any unthinking C coders chime in, a C compiler written in C, which can be used for bootstrapping, does not belong in the realm of the meta-circular, which further specifies that precise semantics must

not be defined in the evaluator. The common parallel is with looking up a word in the dictionary and finding that the definition uses the original word. That is how things work with a Lisp written in Lisp: eval, which quite obviously evaluates expressions, is implemented by calling eval. In contrast, a C compiler must specify detailed and precise semantics for each and every construct, and take care of boring old parsing. The REPL defines all that is needed to build a Lisp interpreter: read an expression, evaluate it and then print the result. It has to be admitted that there’s a certain beauty and simplicity at work here, and Lisp is certainly unique in this respect. A good deal of this simplicity stems from Lisp’s roots in the theoretical work of John McCarthy in the 1950s, which touches on all the rich thematics wrapped up by Gödel’s work in the sphere of mathematics. Both McCarthy and Graham write well on this early history of the language, and their texts make for essential reading. McCarthy did not set out to design and create a programming language to meet specific programming needs or satisfy a problem domain; rather, he was interested in mathematical notation and expressing theory. This makes Lisp unique in the field of programming, and quite distinct from the utilitarian functionality associated with C or C++. Lisp is a flexible, theoretical language which is primarily expressive. Rather than jogging through the history of Lisp, which is well rehearsed elsewhere by the likes of McCarthy and Steele, Graham’s The Roots of Lisp paper presents a conceptual walk-through of the birth of Lisp, with McCarthy’s notation translated into Common Lisp code, and along the way he provides a good description of the primitive Lisp forms, which are function calls or macros, before arriving at the

L-Lisp, running with an OpenGL Lisp library, puts some life back into a code-heavy subject, with Lindenmayer systems readily simulating plants and fractals

vaunted data/software equivalence is the unusual defining quality that Lisp can be written in itself. In practice, this can be achieved in very few lines of code, and the resulting beast is the rather frightening metacircular interpreter, or metacircular evaluator. This creature lies at the very heart of an understanding of the history and conceptual underpinnings of Lisp, and writing such an interpreter forms a useful exercise for the novice. Again this is another rich seam well worthy of further investigation, and the intrigued could start with Graham’s excellent paper, The Roots of Lisp, or plough straight into Abelson



The Lemonodor effect
The Common Lisp community has certainly gone from strength to strength in the last two years, with blogs and wikis as the primary medium for the sharing of enthusiasm, inspiration and information. There are blogs by and for newbies, and blogs from old hands such as Rainer Joswig and sometime LU&D contributor Daniel Barlow. However, the Slashdot of all Lisp weblogs, if such a thing can be imagined, is surely Lemonodor, clearly signalling that Lisp is the new rock and roll with a mix of hardcore news from the Lisp community, John Wiseman’s artistic LA lifestyle reports, and a marked emphasis on Lisp within the field of robotics. Lushly produced, with intriguing illustrations often loosely associated with the topic at hand, Lemonodor is a legend in the land of Lisp. And a mention by Wiseman of a site surely results in record traffic, if not of Slashdotting proportions. Even one of the best free Common Lisp IDEs, SLIME, makes mention of achieving Lemonodor fame in its startup routines. Inspired by Planet GNOME, Planet Lisp acts as a meta blog, collecting essential content from a huge number of Lisp-related weblogs. Despite sometimes straying from the Lisp path into territory which may indeed rival the poor signal-to-noise ratios of Slashdot, Planet Lisp does make for a decent daily immersion in Lisp culture. With plentiful RSS feeds and much cross-linking, the Lisp world is well connected, and the greatest resource of them all, cliki.net, collects many of these links and blogs and provides further essential resources. Cliki.net, powered by Daniel Barlow’s CLiki engine, a Common Lisp wiki, is well worth checking out changelog-wise on a daily basis. The practical lisp page is a useful starting point for newbies, if somewhat dated with respect to SLIME use, but cliki.net does provide links or information on practically every aspect of Lisp culture and practice. Parodying or mirroring the evangelism of a religious Web site, another CLiki, the ALU (Association of Lisp Users) wiki, is packed with so-called “Road to Lisp” pages which demonstrate the fervent admiration which Lisp provokes. Unfortunately, the ALU wiki has recently been the subject of prolonged spam attacks, and its active life does appear to be in danger. Other CLikis of note include the *hyper-cliki*, an annotatable Common Lisp reference, and the wonderfully funky TUNES project CLiki, which outlines a range of projects and resources towards the creation of a free reflective computing system, or Lisp-based OS. Other essential online resources linked from cliki.net include the Common Lisp Cookbook, an attempt to create a community resource paralleling the Perl Cookbook approach, and both Successful Lisp and Practical Common Lisp, two essential online works.

detailed explication of an eval function written in Lisp. Graham describes Lisp’s elegant syntax and notation, and key terms such as expression, form, list and atom. Alongside six primitive operators, the quote operator, which has obvious parallels with quotation in the English language, is well described as functioning to distinguish data from code. The lambda notation for denoting functions is

of the huge On Lisp volume to macros, which he considers of great importance within the paradigm of bottom-up programming, and macros are certainly an essential, if hard to learn, feature which allows for writing programs that write programs. Macros are quite simply functions which transform expressions, and they can themselves call other functions and make use of other macros; a heady brew indeed, whose power of transformation is unheard of elsewhere. To clear up any confusion, macros under Lisp have little to do with their C-based namesakes, which perform simple textual substitution. Macros allow the language to play with its own readily accessible internals as data, and a good many Common Lisp operators are implemented as macros themselves. Understanding macros is one thing, and making use of them perhaps even more complex, but given the definition that macros are simply operators implemented by transformation, noting a few example expansions, which can readily be tested with the macroexpand-1 function, should set the beginner on the right track.
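By way of a hedged illustration (the macro name and behaviour here are ours, not an example drawn from On Lisp), a toy macro and its expansion might look like this:

;; a toy macro: it rewrites one expression into another before compilation
(defmacro unless-zero (n &body body)
  `(if (zerop ,n)
       nil
       (progn ,@body)))

;; ask Lisp to show the transformation rather than guessing at it
(macroexpand-1 '(unless-zero x (print "non-zero") (* x 2)))
;; => (IF (ZEROP X) NIL (PROGN (PRINT "non-zero") (* X 2)))

The expansion, not the macro call, is what the compiler eventually sees, which is the sense in which macros are programs that write programs.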

than Common Lisp, more elegant and crystalline, as opposed to the baroque, fully featured Common Lisp. But there are great similarities between the languages, and core features of Scheme such as continuations, which freeze the state of a computation for later use, can readily be achieved with macros under Common Lisp. Once again, it seems that languages cannot so readily be boiled down to a feature set. The relative size of Scheme is also an issue. Given that Lisp doesn’t really bother about the difference between built-in functions and user-defined functions, it’s a tough call to decide


can fire up, say, CLISP, a pleasant, simple Common Lisp interpreter, straight from the command line and throw a few Lisp expressions at it, but to get some decent work done you’ll need a more powerful implementation which integrates well with an editor such as GNU Emacs to form an efficient IDE. In part two, we’ll pit SBCL against CMUCL, the two favoured free implementations, integrate these tightly with GNU Emacs using a touch of SLIME, throw packaging and packages into the mix and touch on embedding and extending with Lisps such as librep and functionalities such as FFI (Foreign Function Interface).
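By way of a taster in the meantime, a quick hypothetical session at the CLISP prompt might run something like this (startup banner trimmed, prompt style approximate):

$ clisp
[1]> (+ 1 2 3)
6
[2]> (defun square (x) (* x x))
SQUARE
[3]> (square 12)
144
[4]> (quit)

Nothing here needs an IDE, which is exactly the point: the top level is always a couple of keystrokes away for trying out an idea.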

clearly elaborated, and with great elegance, Graham whips out a surprise Lisp eval, written using functions built from only seven primitives. Further functions can be elaborated and evaluated using this eval, which can readily be transformed towards a contemporary Lisp, and thence bent towards implementations which can easily furnish abstractions such as object-oriented programming (OOP). Indeed, in his excellent ANSI Common Lisp, Graham shows as an exercise how a minimal OOP system can be implemented in Common Lisp without using CLOS (Common Lisp Object System) features. His preliminary language is implemented

SCHEMING
Though Lisp’s history post-McCarthy does make for interesting reading, with colourful anecdotes peppering the story of computer science’s most philosophical language and furnishing a classic narrative of riches to rags and back again, there is little here that is totally relevant to the contemporary Lisper, aside perhaps from intriguing material covering the hardware-implemented Lisp Machines and their associated development environments, such as Genera, which few contemporary IDEs can even dream of competing with. It’s also worth bearing in mind that, given Lisp’s flexibility and extendibility, which make it easy to create quite radically different dialects, Lisp should really be considered a family of languages rather than a language in its own right. And until the early to mid-80s the Lisp world was seriously splintered, with competing dialects and implementations proliferating. To address these issues, hardcore Lisp hackers gathered to standardise a new language, Common Lisp, which is the main Lisp in use today, alongside Scheme, an unusual, elegant dialect created by Sussman and Steele in the late 70s. Common Lisp is well specified in Common Lisp the Language, or CLtL for those in the know, authored by Guy Steele. ANSI standardisation for Common Lisp followed a few years later. Thus, one of the first choices facing the novice Lisp coder, before even considering free implementations, is whether to ride with Common Lisp or Scheme. There can be no easy answer, and the question has probably fed more flame wars in both communities than any other issue. Researching the culture of both dialects can throw interesting light on theoretical issues under both languages, and it’s relatively easy to grasp the fundamental differences in feel and approach. Scheme does have a particularly interesting history, and its creation is considered of seminal importance within the history of computing, resulting as it does from an attempt to understand Actors, Carl Hewitt’s message-passing model of computation. Scheme can be viewed as a more minimal language

Debugging is a fine art under Common Lisp, and along with packaging and macros can readily confuse the beginner

SLIME on GNU Emacs makes for a contemporary IDE which matches up to the power and sheer flexibility of Lisp languages

where the core language ends and library functions begin. Under ANSI Common Lisp, the piece of string is certainly seen as being a good deal longer than under Scheme, but to do any useful work Schemers may have to take on board some supplementary libraries. It is perhaps more fitting to investigate specific implementations, with PLT Scheme as a hot favourite on that side of the Lisp fence. Scheme does have a lot of things going for it, and contrary to the argument that Scheme’s simplicity and beauty don’t play well with real-world issues, it is totally possible to produce enterprise-level apps under Scheme. That said, the free software Common Lisp community, grouped around hearty resources such as cliki.net and Lemonodor, just seems much more active, though there is a good deal of crossover, with seasoned hackers turning their hands to both dialects. For the sake of simplicity, Scheme will remain outside the scope of these articles, with the caveat that an understanding of the conceptual underpinnings of Scheme, and of the lazy evaluation which continuations facilitate, can prove suitably enriching for any hacker. Beyond this essential conceptual background, what the newbie sorely needs to know are the specifics of respected implementations and how to get up and running with these most efficiently. Sure, you

Key Links
Lemonodor www.lemonodor.com
SICP mitpress.mit.edu/sicp
Paul Graham www.paulgraham.com
John McCarthy www-formal.stanford.edu/jmc
Guy Steele library.readscheme.org/page1.html
PLT Scheme www.plt-scheme.org
Planet Lisp planet.lisp.org
*hyper-cliki* lisp.tech.coop/index
ALU wiki alu.cliki.net/index
CLiki www.cliki.net/index
TUNES project tunes.org
Successful Lisp www.psg.com/~dlamkins/sl/contents.html
Practical Common Lisp www.gigamonkeys.com/book
Common Lisp Cookbook cl-cookbook.sourceforge.net

in just eight lines of code. Under this mini embedded OOP language, when it comes down to improving the syntax of message calls to make them read more like Lisp, the rather more complex meta-world of macros is encountered. Once again macros come courtesy of the uniform treatment of code and data as forms for manipulation. Graham has devoted the whole


Sensor networks


Microkernels and TinyOS
For anyone with any knowledge of recent Unix history, the debate as to whether microkernels or so-called monolithic kernels are to be preferred is fairly old hat. Embedded systems do sometimes rely on microkernels, even though they might not run much more than a rudimentary kernel providing only essential services. Sensor systems tend to run real-time operating systems, usually without prejudice towards the kernel architecture. The design decisions for TinyOS are quite interesting in this regard, since many RTOS companies tend to be quite fond of microkernel architectures. RTOS developers like to be able to strip down a kernel to its most essential functions so that they can add services only as they are needed. Very small RTOS systems like VxWorks and a number of Linux-based RTOS flavours can scale down to 32K quite easily, but they tend to be quite unsuitable for sensor nodes. Sensor nodes need to be able to process concurrent data bursts, but where most systems still use a quite complex process model, with traditional multithreading able to interrupt threads whenever necessary, it is vital here that individual data processing tasks always run right through to the end: no interruption of data processing tasks is allowed. This does not mean that tasks cannot run concurrently, but it does mean that time constraints are not quite as important as in other RTOS applications. Sensor nodes also need to wake up and go back to sleep without draining the energy reserves needed by the node. These are devices in the millimetre range, and energy density plus energy efficiency are extremely important. Microkernels are designed for modularity, not for energy efficiency. Although portability is a major concern for TinyOS - there are large numbers of “mote” architectures out there - modularity is a condition for the design of TinyOS, not the major design goal. Microkernels come earlier in the evolution of kernel design; TinyOS is far more concerned with running the node while avoiding energy waste. Finally, microkernels tend to factor networking out of the kernel and provide separate services to deal with networking and routing. TinyOS has to provide simple networking support (“single-hop networking”) inside the kernel, since routing algorithms require more time and resources than the average node can deliver. 2K of RAM and, say, a 5 MHz CPU do not allow for that.

Global warming has become the new ghoul of modern politics. Since the Cold War, global terrorism, Islam and various dictators have been made to serve as the new counterpoints of global political strategy. But the frontiers have changed. Global warming and its attendant consequences have suddenly become a more important issue than the war in Iraq or the catastrophic Indian Ocean tsunami. The nations of the Indian Ocean and Europe have always been slightly more attuned to global environmental concerns. Many reasons can be adduced: Europe’s global dominance until the end of the Second World War and the spectacular implosion of imperial pretensions in the post-war period left many Europeans with a clear image of global political and economic realities, but little power to influence either. The nations of the Indian Ocean were mostly subject to the political will of others until well after 1945. The late 20th century, however, afforded countries from Iran to Indonesia, with a

booming nuclear-powered India in the middle, sudden prospects of economic prosperity and even regional dominance.

EARLY WARNING SYSTEMS ECONOMICS
While global data networks are providing timely warnings of tsunamis in the Pacific, the presence of the US, Canada, Japan and New Zealand there is a factor relevant not only to geographers but also to the budget allocations of the respective governments. The Indian Ocean contains few zones of cooperation and many of conflict: India and Pakistan, Burma and Thailand, Sri Lanka, Indonesia and Malaysia have all eyed each other across lengthy borders and through decades of conflict. Asking them to cooperate is often a question of political tact as much as of global economic realities. Where should the money and the experts come from to finance and operate an early-warning network shared by all Indian Ocean countries?

GLOBAL NETWORKS = US NETWORKS?
The USA developed the global reach, but none of the global economic and social presence so essential to experiencing divergent political cultures and their environmental and political concerns. Technical and scientific expertise was available to the US political elites, but as recent political events have borne out, the very thought of global political engagement with other countries on equal terms is anathema to the current US administration. The very size of the scientific establishment in the US, and of its rather smaller but still influential counterparts in the European Union and Japan, leads to similar problems. More than 50 per cent of all scientists globally are employed inside the US. If one adds the number of scientists employed by EU countries and organizations, the predominance becomes simply staggering. Japan is turning into a major source of scientific expertise, although logistical and cultural issues are inhibiting progress.

JASON AND THE ARGONAUTS
There are precedents for networks that already provide global oceanographic data: one of them is the Argo project, which consists


Predicting global warming and natural phenomena, volcanic tremors and shifts in the earth beneath our feet, may depend on global networks of tiny sensors around the world’s oceans. The extent to which the fate of this planet is in our hands is overwhelming, but the complexities of our ecology are still defeating the resources and people researching it. Frank Pohlmann investigates

Embedding Sensors in Global Politics


of more than 1600 robotic sensors floating a mile below the ocean surface to collect data about water salinity and water temperature changes. In a few years, 3000 sensors will be stationed in the world’s oceans. This is complemented by the altimetry satellite Jason-1 and the radar altimeter platform TOPEX/Poseidon, which provide surface topography data for almost all global oceans. Ocean currents change the surface height and curvature of the sea, which in turn can be measured to two-inch accuracy from a height of 1300 km. Combined with temperature

resources and people researching it. The Argo project is just one example. What we are talking about here is a large number of subsurface floats taking fairly simple measurements. Imagine having to take meaningful measurements in biotopes and around small lakes a few hundred yards across. Now imagine having to accomplish that very same task in several hundred thousand locations globally. Collecting and collating the data locally and reliably and according to accepted scientific standards is difficult and has hardly begun.

cope with sudden increases in the data stream. The configuration of networks would have to change quickly if the need arises. If the sensors are mobile, a greater concentration of sensors in one area might be regarded as necessary. If some of them were destroyed, the wider mesh of sensor nodes should not create problems for data transmission, like increased latency. The operating systems and programs running the sensors should be stable and follow soft real-time standards. There are a number of projects using precisely such a technology. Wireless

gradient data, a fairly accurate picture of the “weather” below the surface of the sea can be gleaned. Recent events in the Indian Ocean and the brouhaha around global warming - the latter being old news which was common knowledge 25 years ago - suddenly brought global early warning systems, and the evaluation of scientific data collected all over the world, to the forefront. Luckily, oceanography and climatology are not yet threatened by patent lawyers, although cloud patterns or whale songs may become subject to copyright or patent law in the future. This is somewhat less frivolous than it sounds, since, as we all know, songs are being copyrighted, and to our knowledge whale songs are unique to the whales singing them. Let there be no doubt that data relating to the global climate are in the public domain. The extent to which the fate of this planet is in our hands is overwhelming, but the complexities of our ecology are still defeating the

RAM might sound positively excessive. Java for mobile platforms is not comparable, for architectural reasons: it is not as heavily componentized as TinyOS, nor does it include a concurrent programming language - Java expresses concurrency through library APIs, not in the language itself. It would also need an OS running the Java virtual machine and the code, conditions which the TinyOS architecture does not impose. TinyOS is written in a C-like language called nesC that provides both a structured, componentized development model and a concurrency paradigm. Since continuous data collection is the major purpose of this operating system, it is extremely important that there are facilities for programmers to check their code for race conditions; the nesC compiler reports them. Programming in nesC also includes primitives that allow control over data collection from sensor nodes.

NETWORK ZOOLOGY
The sensor nodes, also known as “motes” when just the radio board underlying the sensor hardware is referred to, are connected and controlled via an RF network; a network consisting of TinyOS-run sensor nodes would need to be extremely efficient in transmitting data. The motes would not be able to store much information in RAM, given that RAM sizes from 0.5K to 8K are common, and secondary storage is extremely limited. The so-called mica2 motes reached a size of 7 mm a few years ago. Traditional TCP/IP networking is not suitable, since wireless mesh networks have to be routed ad hoc and the communication overhead would tax the limited power budget motes have to run on. Traditional addressing modes like IPv6 would be suitable were it not for the fact that connection establishment and reliability are major issues for IP-based networking: the energy expenditure for IPv6 stacks would be excessive, and the need for routing servers imposes networking overhead that is impossible to maintain. Recent models have run for 945 days on two AA batteries. Not bad for a data

YES OR NO? IT IS RANDOM, SIR!
One simple observation with regard to global climate and local ecology is that, with our methods of observation, we are unlikely to be able to predict the global climate. Randomness and unpredictability are probably features inherent in large, complex systems: there are no hidden variables whose elaboration would make the weather or ocean currents fully transparent. This also means that the collection of ecological data on any scale has to respond to rapid, seemingly random changes in the environment. There might be little interesting data one day, and the next day a storm or an earthquake might deluge observing scientists with interesting data.

sensor networks have been under development for a number of years, and it seems that micro-sensor technologies and some related networking products have left the embrace of academia and entered industry. In some cases the reception of sensor networks has been coloured by some very negative political perceptions: sensors the size of flies (“motes”) have military and civilian surveillance applications whose potential has civil rights campaigners up in arms. The fears are justified, and legislation is going to be needed. But there are uses to which sensor networks can be put that might ensure mankind’s longevity, rather than stoking the fires of global paranoia.

acquisition board combined with a small motherboard with processor and radio running in standby mode 24 hours a day. Wireless sensor networks usually have to be self-organising, but when it comes to networks the size of the Argo network there is another element to be taken into account: since wireless sensor networks consisting of motes cannot transmit data across long distances, they have to employ precisely the same principles of mesh networking and ad-hoc routing, yet provide the same kind of coverage for environmental monitoring with a less dense sensor population. Argo nodes send data directly to satellites, which, given the coverage and stability of the nodes, makes a lot of sense, but requires too much investment if somewhat more local information is needed. The kind of coverage, quick data collection and information evaluation necessary to prevent, say, earthquakes from killing hundreds of thousands is not quite available yet, but first attempts are in the works. TinyOS and various types of motes were developed largely at UC Berkeley. A small number of companies tried to establish a number of application areas, like biological habitat monitoring, building monitoring in case of seismic events and patient monitoring in long-term care facilities. By now, other researchers working with TinyOS have tried to simplify and extend TinyOS and mote networks and to apply them to new applications. Harvard-based researchers succeeded in testing motes for a network that was monitoring a volcano for seismic events and possible eruptions.

another mote synchronized all data transmissions that were coming from the motes and a stationary seismic detector a few miles away. The sensor nodes ran TinyOS of course and data processing was accomplished using Java and Perl routines. This is just an experiment, although many other applications can be thought of. The technology to collect data all over the world exists and advances in the miniaturization of networking and CPU technology have made global environmental monitoring possible.

RELATIONS BETWEEN NODES AND POLITICIANS
There is another problem that the tsunami in December 2004 threw into sharp relief: even if the data are present and have been evaluated, they have to be presented to decision makers in a comprehensible form. The Indian, Sri Lankan and Indonesian governments were faced with the need to make quick decisions; Sri Lanka and India would have had sufficient time to warn their populations and cut down the number of dead substantially. Unfortunately, even the best technology, whether as software available for free under the GPL or as hardware for fairly little money, is not sufficient to make those countries able to respond to national emergencies quickly. In the case of the Indian Ocean tsunami some data was available, but the respective governments had few systems available to respond quickly to the emergency in question. In undergraduate essays it is considered vital to come up with a conclusion that draws on the various answers given to the essay question. Embedded networking, and indeed the availability of sensor networks, are political issues, not only due to the importance of the data they collect, but also due to their potential ubiquity. It is one thing to observe that surveillance tools can become almost invisible. It is another to increase our expertise concerning geological and biological systems. Pervasive computing is slowly becoming reality, and sensor networks are probably the most important part of it. They can save tens of thousands of lives, but they can also make Orwell’s 1984 look like political naiveté. And we shouldn’t forget that, for the moment, many countries still do not have the resources to instrument their coastlines and seabeds with sensors and data centres. Science and technology, owing to their institutional history, are still settled around the Atlantic and the Pacific.

Key Links
Argo www.argo.ucsd.edu/ www.primidi.com/2004/12/03.html
TinyOS www.tinyos.net/ www.xbow.com/General_info/eventdetails.aspx?eid=54&localeid=3&addressid=54
Sensor Networks www.intel.com/research/exploratory/heterogeneous.htm lternet.edu/technology/sensors/arrays.htm
Medical Monitoring www.eecs.harvard.edu/~mdw/proj/vitaldust/
Volcano Monitoring www.eecs.harvard.edu/~werner/projects/volcano/

VOLCANO SURVEILLANCE
When volcanoes are about to erupt, they emit seismic tremors as well as infrasonic pulses. A sensor array tailored to the work of volcanologists would include two sensors detecting infrasonic waves and tremors. In this particular case, the sensor array consisted of infrasonic detectors, since other stations were able to pick up seismic tremors already. Three mica2 motes collected the data, one transmitted the infrasonic data over a point-to-point link and


HEY TINY
TinyOS is a GPLed operating system running the nodes of embedded wireless networks. The networking components are not vital to the OS, which is built to run on extremely small processors whose ROM might not exceed 256K and for which 128K of

SENSOR NETWORKS AND SENSOR POLITICS
Obviously, the sensors would have to be hardy enough to survive a battering by the elements, and the networks linking the sensors have to be flexible enough to


Advertising Feature

UK Office: Telephone: 01295 756102 UK Office Fax: 01295 276133 Email: transtec.uk@transtec.co.uk http://www.transtec.co.uk/

transtec 2500 Opteron Servers The ideal entry to 64-bit computing
transtec AG - THE EUROPEAN IT FACTORY. The right product, for the right price, at the right time and the right place. A simple message. A high demand. A challenge we fulfil. That is our purpose. Our customers do not want off-the-rack products. They expect tailor-made. That is why they need a competent partner. Competence that transtec has built up over two decades. And we work to strengthen it every day. The newest software, the fastest processors, the largest memory chips. The technology scouts from transtec track down the latest inventions in the IT industry. In combination with proven technology, the engineers in Tuebingen develop faster and more stable IT systems. In this way, the most innovative technology flows directly into the computer networks of the transtec customer. Only in its own production facilities can transtec fulfil customers’ requests. From thousands of variations and combination possibilities, companies can select the technology that exactly suits their needs. transtec builds the equipment precisely as ordered in the built-to-order process. This makes many of transtec’s 45,000 annually delivered computer systems unique.

The AMD Opteron™ is based on AMD64 technology, which makes it possible to have 64-bit technology on an x86 platform. Other important innovations in Opteron™ processors include an integrated memory controller, to lower the number of bottlenecks in memory access, and HyperTransport technology. Opteron-based systems are ideal for use in database servers or for complex applications that need more than 4 GB of memory per process. Thanks to the complete 32-bit compatibility of Opteron™ processors, these servers are also perfect if an upgrade to 64-bit technology has already been planned, but not all of the applications are available yet in a 64-bit edition.

● Tower, optional 19“ rackmount, 4U
● One AMD Opteron™ 242 1.6 GHz processor
● Max. 2x AMD Opteron™ 250 2.4 GHz
● 1 GB registered ECC DDR-SDRAM (max. 16 GB)
● Onboard dual 10/100/1000BASE-T
● Hotswap 36 GB Ultra320 SCSI disk
● Max. 10 hotswap Ultra320 SCSI disks

£1595.00

transtec-solutions in hardware
transtec-solutions in hardware

Regular column
Poles apart
Welcome to my new column. I’m Jason Kitcat and I’ll be your host over the coming months as we explore Free Libre Open Source Software (FLOSS), e-government and life in the digital lane. Enjoy the ride. Just so you know that I am qualified to take you on this journey, I’ve been tooling around with the Internet since around 1996. I’ve run a dial-in bulletin board system and started several tech companies. Since 2000 I’ve been building online communities at Swing Digital, a Brighton-based company I co-founded. We host all our communities on custom versions of Linux and Apache. So what’s the first stop on our journey? Poland, as it happens. I spent some time in the south-eastern corner of Poland during the Christmas period and was curious to know how Linux is doing in this region. Poles are bursting with pride in their country, rightfully so; still, it has its fair share of challenges. In a country with high unemployment and low wages relative to the rest of the EU, I was expecting FLOSS to play a major part in Polish digital life. I was very wrong. I didn’t see a single copy of Linux running in any of the offices or homes I visited; everyone had Windows. Chatting with Marek Ogrodnik, an employee at a major technology retailer, I learnt

that he reckoned Windows has 99 per cent market share in Poland. Piled up next to Marek as we chatted were boxes and boxes of Mandrake Linux on DVD for £15. It wasn’t selling, but OEM versions of Windows XP for £65 were. Consider that, for those with jobs in Poland, an excellent salary is £300 per month. With Windows XP Home Upgrade selling for £190, Office going for £360 and the academic licence for Office a cheeky £70, I couldn’t understand why Linux wasn’t more popular. Stepping over discarded flyers for Mandrake, I asked Marek why. His answer was similar to those I often hear back in the UK: Linux is very hard to use - he couldn’t get it to even install on the showroom computer - and when it does work you can’t play games, watch movies or buy software the way you can with Windows. The network effect is in full force in Poland: because everyone else has Windows, you are better off having Windows. Poland has, like most countries, a technical elite who can and do ignore Microsoft-shaped bandwagons, and inevitably they are playing with Linux. PLD is the Polish Linux Distribution (www.pld.org.pl), made by Poles. Another Polish distribution is Aurox (www.aurox.org). There is also a Polish Linux User Group (www.linux.org.pl). It’s a small group, but they seem to have influence. Poland caused a storm and ingratiated itself with the FLOSS community by stopping the EU Software Patent Directive in its tracks. This is the sign of a well-informed technorati, but unfortunately the average man in the street, or computer shop for that matter, is still ignorant of the benefits Linux could offer. According to the Linux Counter Project (counter.li.org/bycountry/PL.php) there are just over 8,400 Linux users in Poland, a country of nearly 40 million people. While the project’s methodology is hard to verify, the results still give us an idea of the low adoption Linux has had. Before accepting Microsoft’s overwhelming dominance I wanted to double-check the market share figures. After much digging I found some more statistics. Ranking.pl (www.ranking.pl) shows that 98.8 per cent of those accessing the Internet in Poland are on Windows, with only 1.1 per cent on all forms of Unix, including Linux and MacOS X; 64.6 per cent of Polish Internet users are on Windows XP. Sadly these figures are probably, if anything, underestimating the Windows share, as Linux and MacOS X users are far more likely to go online than the average user, especially those stuck with Windows 3.1 or 95. Historically a significant portion of the copies of Windows and associated software have been shared rather than purchased legally. However, since a very strict government crackdown on piracy three years ago, the majority of software is grudgingly purchased. Coming from a Communist era where scarcity forced ingenuity, Poland is culturally ripe for the FLOSS approach to digital life. Unfortunately Windows got there first, and it’s going to be very tough overcoming Microsoft’s huge installed base. All is not lost, though. As in many other countries, the government can use procurement policies to force competition. The Polish Ministry of Finance is using Linux for some backend functions. This move gained considerable publicity and is probably only the first of many gains for Linux in the Polish government. On the other hand, government has also been unhelpful: in at least one case commercial installations of Debian Linux were regarded by the local tax office as ‘gifts’, so a 30 per cent tax was applied on the assumption that Debian cost as much as Windows. Oops. While Windows’ dominance in markets such as Poland can be depressing, it isn’t the whole picture. The key thing to remember is that whether we look to Eastern Europe, Central America or India, most people don’t have a computer at all. So we don’t need to wean them off Windows; we need to reach out to them and explain why FLOSS can help to make our digital lives more fair and open.



Jason Kitcat (jason@swingdigital.com) is Managing Director of online community consultancy, Swing Digital


Hacking the kernel

complex remote kgdb hacking - we’ll get to that after we have dealt with the basics and the merits of using debuggers at all. Kernel debuggers, like all software debuggers, allow us to peek into the internals of a running system and discern specific information about the way in which it is operating at run time. This helps us to track down difficult-to-locate bugs, but also serves as a mechanism for understanding the correct, normal operation of a real system. In last month’s column we devoted a small section to using UML, but the time is now right to explore the workings of the kernel in more detail using a combination of available tools. The first of these is the venerable GNU Debugger, gdb. gdb represents one of the most popular and widely used GNU tools after the GNU Compiler Collection (gcc) and GNU C Library (glibc). It is certainly one of the most ubiquitous debuggers in widespread use today and is well understood by many programmers. Using gdb could form the subject matter of a rather large book (it does), and would also divert us from our chosen topic, so a thorough exploration of the available features is left up to the reader. It is assumed that the reader has a working environment with appropriate development tools installed, and that the gdb documentation has been consulted where necessary. gdb is typically used to debug a standalone program or running process, but it can also utilise a memory image contained within a “core dump” - produced as a side effect of, for example, a “segmentation fault”, during which this file is dumped unless the user has configured the system not to do so. The Linux kernel provides a special procfs entry, /proc/kcore, which is a dynamically generated “core” file with a difference. By hooking gdb up to use this file, we can peek at the state of the running system. We won’t be able to change anything through this initial interface, but we can look at certain kernel counters and check on its general mental wellbeing to a certain extent. To begin debugging a live

Jon Masters provides his regular look at Linux Kernel development

Taking the reins. Tracing the source
Debugging the Linux kernel (continued)

kernel, fire up gdb with the appropriate kernel image and core file:

jcm@perihelion:~$ sudo gdb /lib/modules/2.6.10/build/vmlinux /proc/kcore
GNU gdb 6.1-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
Core was generated by `auto BOOT_IMAGE=linux ro root=2103'.
#0  0x00000000 in ?? ()
(gdb)

Note the use of sudo to run the gdb command as the root user. Those without sudo appropriately installed and configured can use a root shell instead. Also note that gdb was passed the full kernel vmlinux image from the top level of the kernel source directory. This is the full kernel binary prior to any stripping and with full debugging symbols already built in. The kernel was booted with the parameters shown in the output above. It is possible to interrogate the state of various kernel variables from this prompt, but it is not possible to change the kernel state - so calling kernel functions is out.
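For instance, something along these lines will read a well-known kernel counter; the values shown are obviously hypothetical and will differ on every system, but the counter should tick upwards if printed twice in succession, since /proc/kcore reflects live kernel memory:

(gdb) print jiffies
$1 = 76302436
(gdb) print jiffies
$2 = 76303519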

Kernel Hacking is a regular feature on the inner workings of the Linux kernel. Each article is split into several sections, which vary in technical detail as well as intended audience, although the goal is to provide an accessible alternative to existing introductory material. In this issue, we continue our investigations with various kernel debugging techniques in preparation for building a Python-based GUI kernel memory browser

ways to make the system crash horribly. Debugging provides a mechanism for tracing the execution of a running Linux system and watching as each line of kernel source results in a series of operations which form a cohesive whole. Let’s have some fun with gdb and kgdb, throwing in a little debugging of Linux on Fedora Core 3 within a VMware virtual machine. To follow the instructions in this next section, readers will benefit from having a kernel image built with debug symbols - see the “Kernel Debugging” submenu under “Kernel Hacking” when building a kernel from source.
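As a rough sketch of where this is heading - the exact incantation depends on the kgdb patch in use, and is covered properly next time - a remote session against a kernel waiting on a (real or VMware-emulated) serial port looks something like this:

jcm@perihelion:~$ gdb /lib/modules/2.6.10/build/vmlinux
(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0
(gdb) break sys_open
(gdb) continue

Under VMware the guest’s serial port is typically mapped to a pipe or pseudo-terminal on the host, and that device name is what gets passed to target remote. Unlike the /proc/kcore approach, breakpoints and single-stepping are available here, because the target kernel is genuinely stopped while gdb is in control.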

Waiting for gdb to connect to a remote target

in the coming issues, investigate some of the more recent changes to the kernel and check out the latest news from the various mailing lists. Kernel security is increasingly becoming an issue of interest as numerous new exploits have recently surfaced, and we’ll look at what’s being done to tackle these kinds of kinks in the new development process. Your feedback is most welcome, especially any feature suggestions - drop us a line at letters@linuxuser.co.uk. Learning about the kernel is almost as important as writing code itself, sometimes more so. Debugging isn’t just a mechanism for fixing bugs in kernel drivers and discovering new and spectacular

USING GDB FOR FUN AND PROFIT
In the last issue, readers were briefly introduced to the concept of kernel debugging using UML (User Mode Linux) as a mechanism for running test kernels within a virtual machine on a regular desktop PC. This time around, we will look at using the GNU command line debugger, gdb, to peek at the running kernel on a regular Linux machine, and then consider the options for interactive, real-time kernel debugging. None of these options requires a second PC, since it’s possible to use a virtual machine PC emulator such as VMware (even an evaluation version, if just for a single test run) for some of the more


Debugging the kernel init process with the aid of a remote gdb client. The virtual machine running within VMware is completely controlled by a remote gdb, which can issue commands to set breakpoints and modify the behaviour of the target.

Kernel Personalities
Linux wouldn't be nearly as fun and interesting as it is, were it not for the efforts of a select group of gifted hackers who busily code away in the wee hours of the morning, powered solely by adrenaline and unholy amounts of caffeine - well, some of them, at least. The LKML (Linux Kernel Mailing List) is a hive of activity on a daily basis. Recent discussions have included a new "inotify" mechanism in 2.6.10, a reworking of the internal representation of pipes, and a treatise on "Getting Started With Kernel Hacking" in which Jonathan Corbet reminded readers that the new edition of Linux Device Drivers should be out, possibly during Boston LinuxWorld.

Amongst the various mailing list posts came a replacement for "dnotify". Dnotify has historically been a bit of a hack of a solution to the problem of tracking changes to directories: rather than having a program continuously poll a directory for changes, it is possible to register to receive a signal when such events occur. Unfortunately, dnotify doesn't adequately solve the problem of watching individual files for changes, and can result in large amounts of wasted code and resources to achieve a relatively simple task. Inotify instead uses a new device node through which programs can ask the kernel for notification of events on a file, and on which they can then wait with a regular poll or select syscall within their main loop.

Perhaps one of the most disturbing issues with kernel development at the moment is the handling of critical security issues. We have recently seen several remotely exploitable defects in the kernel which could be used by malicious crackers to gain unauthorised access or to launch a Denial of Service (DoS) attack. Fortunately, fixes for such problems are typically a mere few lines of code and quickly make it into vendor updates - but as development is taking place in the mainstream kernel source tree, these patches can be delayed by several days getting into the stable source. For example, as of this writing, there remains a critical exploit in the current stable kernel which has not yet resulted in a public release of a fixed stable kernel. At FOSDEM in Brussels at the end of February, Alan Cox will give what promises to be an interesting talk on the new development process - which, it has to be said, he doesn't seem all too fond of.

As the effort to bring out the 2.6.11 kernel gets underway, various -rc prereleases have started to become available. Kernel releases can be tracked most effectively by maintaining a daily update of Linus' BitKeeper tree. The graphical "revtool" allows one to track a tree of changes from one release of the kernel to the next, and to see a complete changelog history for arbitrary points along the timeline. Readers would be well advised to perform a daily pull of Linus's tree and watch as new patches are released unto the unsuspecting masses. Consult http://linux.bkbits.net/, the Documentation/BK-usage directory within the kernel source itself, and the website accompanying this series for more specific usage information.

Robert Love has released a new edition of "Linux Kernel Development", revised and updated for kernels up to and including 2.6.10. This edition is published by his new employer, Novell, and now includes additional chapters on "Getting Started with the Kernel", "Modules", and the kernel events layer ("kobjects and sysfs") that Robert has helped work on.
All in all, this promises to be a very interesting read and worthy of a place on the reader's bookshelf. Unfortunately, the title is not yet available in electronic form and has yet to become widely available in European countries (we'll do a review just as soon as we can get hold of it!). In the words of the book's author: "You should buy a copy or two and carry them with you everywhere you go." The 2005 FOSDEM (Free and Open Source Developers European Meeting) takes place on the weekend of February 26 and 27. Your author will be there to witness the Kernel and Embedded Linux fora, as well as the general talks held at this event.
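For readers who want to act on the daily-pull advice above, the BitKeeper workflow of the time looked roughly like the following. This is a sketch from memory rather than anything reproduced from the mailing lists - in particular, treat the repository path as an assumption and check the Documentation/BK-usage directory for the canonical instructions:

$ bk clone bk://linux.bkbits.net/linux-2.5 linux-2.6
$ cd linux-2.6
$ bk pull

A daily bk pull then brings in whatever Linus has merged since the previous run, and revtool can be pointed at the resulting tree to browse the changelog graphically.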

Print the value of jiffies:

(gdb) print jiffies
$1 = 51656788

The kernel maintains an internal counter, known as the jiffy, which represents the number of timer ticks (timer interrupts) that have taken place since the system was booted. It is what effectively defines the "uptime". We can use this mechanism to interrogate any global kernel variable, but must be aware of gdb's internal caching, which may result in the same value of jiffies being repeated if it is asked for again immediately afterwards. gdb supports full tab completion, which, in combination with a Linux source code cross reference such as LXR (see Resources), allows for limited, but entirely safe, perusal of the kernel state. We'll build on this example in the next issue to create a graphical browser of kernel state information.
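As a further illustration - a speculative example rather than one from the session above, and assuming the symbol is present in the reader's kernel build - the kernel's version banner string can be printed in exactly the same read-only fashion, and hexadecimal output can be requested with the /x format modifier:

(gdb) print linux_banner
(gdb) print/x jiffies

Tab completion on a partial symbol name (for example typing "print jif" and pressing TAB) is a convenient way to discover variable names without leaving the debugger.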

KGDB - TAKING THE REINS
It's all very well using gdb to pick at the running kernel on a desktop machine, but how does that help us in our quest to learn more about the innards of Linux? The answer comes in the form of patches to the kernel source which add remote debugging capabilities. Using a second PC (or a VMware virtual machine in this case), one can remotely take control of the execution of each line of the kernel source. There are two primary kernel debuggers used in such cases: kdb and kgdb. We'll focus on kgdb, since it allows us to remotely debug a kernel at the source level rather than interrupt a running machine and issue limited diagnostic commands. Kgdb was originally the creation of SGI hackers but is now a SourceForge project in its own right. It comes in the form of various patches which must be applied to the kernel sources in the documented order. The kernel boot command line must be altered (in /boot/grub/menu.lst or /etc/lilo.conf, depending upon your platform and bootloader) to include a serial port for kgdb to listen on for a remote gdb. In the example case, kernel 2.6.7 has been patched using the kgdb sources and built to run on an AMD Athlon. The author chose to use a VMware virtual machine rather than a real physical PC to prove the point that even a real machine is not required for our experimentation purposes. VMware is an example of an Intel architecture machine emulator, which in this case runs Fedora Core 3 as if it were standalone. The modified kernel was placed in /boot/vmlinuz-2.6.7, and /boot/grub/menu.lst was modified to accommodate the new kernel image. An entry of the following form was added to the grub boot menu:

title Kernel Development
    root (hd0,0)
    kernel /boot/vmlinuz-2.6.7 ro root=/dev/hda1 kgdbwait kgdb8250=0,115200

This instructs kgdb to halt the kernel as early as possible and wait for a remote connection from a gdb client via the first serial port (ttyS0).
Now, another machine connected to this serial line can take remote control thusly:

(gdb) target remote /dev/ttyp0
Remote debugging using /dev/ttyp0
breakpoint () at kernel/kgdb.c:1212
1212   atomic_set(&kgdb_setting_breakpoint, 0);

It is now possible to control the kernel as if it were any other program being debugged. For example, in the following sequence we elect to insert a breakpoint in the boot sequence, stopping execution at prepare_namespace (which is called near the end of the boot sequence from init/main.c), and then to view the console ringbuffer (the buffer to which printk saves its output before it is sent to the console) and a backtrace of the functions called along the way:

(gdb) break prepare_namespace
Breakpoint 6 at 0xc05511f6: file init/do_mounts.c, line 391.
(gdb) c
Continuing.
Breakpoint 6, prepare_namespace () at init/do_mounts.c:391
391    if (saved_root_name[0]) {
(gdb) p log_buf
$3 = 0xc0588aa0 "<4>Linux version 2.6.7 (jcm@perihelion) (gcc version 3.3.5 (Debian 1:3.3.5-5)) #1 Tue Jan 1 00:00:00 GMT 1970\n<6>BIOS-provided physical RAM map:\n<4> BIOS-e820: ", '0' <repeats 16 times>, " - ", '0' <repeats 11 times>, "9f800 (usa"...
(gdb) bt
#0  prepare_namespace () at init/do_mounts.c:391
#1  0xc01006fd in init (unused=0x0) at init/main.c:638
#2  0xc01032b1 in kernel_thread_helper () at arch/i386/kernel/process.c:229
(gdb)

This SGI tool is both extremely powerful and available for several architectures at this point. While it is technically impressive and certainly of wide use, it is unlikely that tools such as kgdb will be integrated into the mainline kernel source tree without a radical change of heart on the part of Linus Torvalds. Linus believes debuggers do more harm than good, altering the characteristic performance of the system under test. He's right, but us mere mortals exercising appropriate care and due attention probably benefit more than we lose out. [VMware no longer supports direct access to the virtual machine's serial port in newer versions. It is now necessary to use a wrapper (e.g. "vmware-p2s") in combination with the "named pipe" option in the VMware configuration to connect VMware to a spare pseudo terminal (e.g. /dev/ptyp0) on the reader's computer. This is quite straightforward - see the Resources box.]
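To recap the host side of such a session - a sketch based on the steps described above rather than a verbatim transcript, and assuming the VMware named pipe has been wired to the /dev/ttyp0 end of a spare pseudo terminal pair as just described - the remote gdb is started against the patched kernel's vmlinux and pointed at that device:

jcm@perihelion:~/linux-2.6.7$ gdb vmlinux
(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyp0

From that point on, breakpoints, single stepping and variable inspection work just as they would for an ordinary userspace program.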

One of the major advantages of a relational database is the ability to join more than one table when executing queries. In this month's article, David Tansley looks at joins, using MySQL with a touch of PHP to show how they can work on a web page


PHP and Joins
While there are plenty of in-depth resources available elsewhere on relational database design, the LAMP developer requires at least a basic grasp of the subject. A true relational database design keeps repeated information to a minimum. If changes are required to table attributes, then the design should allow the database to scale gracefully with the additional data load. Breaking data into many tables allows for more manageable storage of the information within the database. To be able to select related data from these tables, we must be able to select from more than one table in a single SQL statement. In other words, we must be able to join these tables in order to extract the data and turn it into useful information. The actual join of the tables goes on behind the scenes, but to inform MySQL that we wish to extract from more than one table, a particular SQL syntax is required.

CREATING THE TABLES
To demonstrate how joins can be accomplished we first need to create two tables, where the first table has a relationship to the second. In my very simplified example we have a customers table and an orders table. The customer table (see Listing 1) contains a unique number for each customer (customer_id), which is the primary key for that table. It also contains the customer's full name.

Listing 1. Customer Table
customer_id   full_name
1             Peter Jones
2             Lucy Bling
3             David Grass
4             Pauline Neave
5             Louise Christine

TABLE CREATION
To create the database and the customer table, the following SQL is used:

[dxtans@bumper dxtans]$ mysql -u root
mysql> CREATE DATABASE customer_ord;
mysql> USE customer_ord;
mysql> CREATE TABLE customer (
    -> full_name varchar(45) NOT NULL default '',
    -> customer_id int(10) unsigned NOT NULL auto_increment,
    -> PRIMARY KEY (customer_id));

Resources
kgdb: kgdb.sourceforge.net
The website accompanying this series: www.jonmasters.org/kernel
Kernel newbies: www.kernelnewbies.org, wiki.kernelnewbies.org and the #kernelnewbies IRC discussion channel
Tracking the Linux kernel mailing lists: www.kerneltraffic.org
Archives of the lkml: www.lkml.org
Linux Weekly News Kernel Page: www.lwn.net
Linux Kernel Cross Reference: www.lxr.linux.no

To create the orders table, the following SQL is used:

mysql> CREATE TABLE orders (
    -> product_id int(10) unsigned NOT NULL default '0',
    -> order_id varchar(45) NOT NULL default '',
    -> customer_id int(10) unsigned NOT NULL default '0',
    -> PRIMARY KEY (product_id));
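At this point it is worth sanity-checking the schema. This quick check is our own addition rather than part of the original walkthrough; a DESCRIBE confirms the column definitions before any data goes in:

mysql> DESCRIBE customer;
mysql> DESCRIBE orders;

Each command prints the column names, types, keys and defaults for the table in question.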

The orders table (see Listing 2) contains a product_id, which is the primary key for the table. A product name is also provided (held in the order_id column), as well as the customer_id. You may have noticed that the customer_id in the orders table contains some of the values from the customer_id in the customers table; this forms the relationship between the two tables (customer.customer_id to orders.customer_id). It allows us to know which customer has ordered what, and the link back to the customer's name is the customer_id.

Listing 2. Orders Table
product_id   order_id      customer_id
1000         Karate GI     1
1012         Brown Belt    3
1014         Mouth Guard   3
1016         Mitts         5

The next task is to insert some values into the tables. For the customer table I have taken the long route and inserted them one at a time like so:
mysql> insert into customer (full_name,customer_id) values ('Peter Jones','');
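The remaining rows can be entered the same way. As an aside - this alternative isn't part of the original walkthrough - MySQL also accepts several rows in a single INSERT, which saves some typing:

mysql> insert into customer (full_name) values
    -> ('Lucy Bling'),('David Grass'),('Pauline Neave'),('Louise Christine');

Because customer_id is an auto_increment column, it can simply be omitted here and MySQL will fill in the next available number for each row.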


In the SQL used here each table name is comma separated. This informs MySQL that we will be extracting from two tables, and that this is therefore a join query. Notice that there are 20 rows returned; though this is a join, it does not provide any really useful information. Using a WHERE clause we can produce something far more useful for decision making. Let's extract all customers that have ordered a product:
mysql> select customer.full_name, orders.order_id
    -> from customer, orders
    -> where customer.customer_id=orders.customer_id;

The same result can be obtained with the explicit 'inner join ... on' syntax:

mysql> select customer.full_name, orders.order_id
    -> from customer
    -> inner join orders
    -> on customer.customer_id=orders.customer_id;


By the time I got to the orders table I was bored with entering data manually, and created a file (orders_in.txt) to import the orders rows. The file has one row per line, with a <tab> separating each field, like so:

1000	Karate GI	1
1012	Brown Belt	3
1014	Mouth Guard	3
1016	Mitts	5


LISTING 3. CUST_INC.PHP
<?php
# cust_inc.php
$SERVER="localhost";
$DB="customer_ord";
$USER="root";
$CUST_TABLE="customer";
# holds initial connection and changing to working database
$connection=mysql_connect($SERVER,$USER);
if (!$connection) {
  echo "Could not connect to MySQL server!";
  exit;
}
$db=mysql_select_db($DB,$connection);
if (!$db) {
  echo "Could not change into the database $DB";
  exit;
}
?>

Then after making sure we are back at the database prompt, run the following command:
mysql> load data local infile "orders_in.txt" into table orders;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Deleted: 0  Skipped: 0  Warnings: 0

Both tables should now be loaded up and ready to go. Look at the tables with these commands:
mysql> select * from customer; mysql> select * from orders;

JOINING
A join creates a virtual table by pairing every row of one table with every row of the other. A WHERE clause containing a condition is then used to filter out the unwanted rows for the query. Please note that we have five rows in the customer table and four rows in the orders table. The most basic join is the cross join, which is fairly useless in its most basic form:
mysql> select * from customer, orders;
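Since the customer table has five rows and the orders table four, the cross join produces every combination: 5 x 4 = 20 rows, which is the 20-row table shown later in this article. That can be confirmed without wading through the output - this quick check is our own addition rather than part of the original walkthrough:

mysql> select count(*) from customer, orders;
+----------+
| count(*) |
+----------+
|       20 |
+----------+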


+------------------+-------------+ | full_name | order_id | +------------------+-------------+ | Peter Jones | Karate GI | | David Grass | Brown Belt | | David Grass | Mouth Guard | | Louise Christine | Mitts | +------------------+-------------+

In the SQL used earlier to extract the customers who have ordered a product, notice that I specified the columns I want extracted using their tablename.column_name form. First we select the 'full_name' from the customer table as well as the 'order_id' from the orders table. We then filter it through a WHERE condition, which equates to "if the 'customer_id' from the customers table matches the 'customer_id' from the orders table, then print the results". We could also see who has ordered a pair of mitts from our orders database by appending AND to the query; this means "only print results if both sides of the condition are true". As we only want to print the name (full_name) of the customer this time, we only need to select customer.full_name from the customers table:

mysql> select customer.full_name
    -> from customer, orders
    -> where customer.customer_id=orders.customer_id
    -> and orders.order_id="Mitts";

Notice that 'inner join' is now used to indicate we are performing an inner join in the query, and the WHERE is replaced by the ON keyword. Two other joins are the left and right join. The left join will extract all rows from the customer table, even if there are rows in 'customer' that do not have a match in the orders table. All will become clear with a query example. Notice that in the following query we use a 'left join', and more rows are returned this time, because customers 'Lucy Bling' and 'Pauline Neave' have not ordered any items. This is a non-match between the two tables, but because we are using a left join these rows still get printed, with a NULL value:
mysql> select customer.full_name, orders.order_id -> from customer -> left join orders -> on customer.customer_id=orders.customer_id;

+-------------+------------------+------------+-------------+-------------+ | customer_id | full_name | product_id | order_id | customer_id | +-------------+------------------+------------+-------------+-------------+ | 1 | Peter Jones | 1000 | Karate GI | 1 | | 2 | Lucy Bling | 1000 | Karate GI | 1 | | 3 | David Grass | 1000 | Karate GI | 1 | | 4 | Pauline Neave | 1000 | Karate GI | 1 | | 5 | Louise Christine | 1000 | Karate GI | 1 | | 1 | Peter Jones | 1012 | Brown Belt | 3 | | 2 | Lucy Bling | 1012 | Brown Belt | 3 | | 3 | David Grass | 1012 | Brown Belt | 3 | | 4 | Pauline Neave | 1012 | Brown Belt | 3 | | 5 | Louise Christine | 1012 | Brown Belt | 3 | | 1 | Peter Jones | 1014 | Mouth Guard | 3 | | 2 | Lucy Bling | 1014 | Mouth Guard | 3 | | 3 | David Grass | 1014 | Mouth Guard | 3 | | 4 | Pauline Neave | 1014 | Mouth Guard | 3 | | 5 | Louise Christine | 1014 | Mouth Guard | 3 | | 1 | Peter Jones | 1016 | Mitts | 5 | | 2 | Lucy Bling | 1016 | Mitts | 5 | | 3 | David Grass | 1016 | Mitts | 5 | | 4 | Pauline Neave | 1016 | Mitts | 5 | | 5 | Louise Christine | 1016 | Mitts | 5 | +-------------+------------------+------------+-------------+-------------+


The right join behaves similarly to the left join, except that all rows are taken from the second table (orders) instead. Practically any question a user is likely to ask can be answered using joins; however, I have covered only the most basic joins and queries here, and there is a lot more to it. Extracting data from tables at a command line client is not much fun. Since this is a LAMP tutorial it would be better to web enable our database, and what better tool to do this with than PHP?
+------------------+-------------+ | full_name | order_id | +------------------+-------------+ | Peter Jones | Karate GI | | Lucy Bling | NULL | | David Grass | Brown Belt | | David Grass | Mouth Guard | | Pauline Neave | NULL | | Louise Christine | Mitts | +------------------+-------------+
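For completeness - this example is ours rather than the author's - the right join described above looks like this; with this particular data set it returns the same four rows as the inner join, because every order refers to an existing customer:

mysql> select customer.full_name, orders.order_id
    -> from customer
    -> right join orders
    -> on customer.customer_id=orders.customer_id;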

LISTING 4. CUST_SELECT.PHP
<HTML>
<BODY>
<CENTER><B> Select A Customer To View Their Order(s)</B></CENTER>
<?php
# cust_select.php
# include file
include ("cust_inc.php");
$sql="SELECT full_name,customer_id FROM $CUST_TABLE";
$mysql_result=mysql_query($sql,$connection);
$num_rows=mysql_num_rows($mysql_result);
if ( $num_rows == 0 )
{
  echo "Sorry there is no information";
}
else
{
  # we have records
  echo "<FORM METHOD=GET ACTION=\"cust_view.php\">";
  echo "Please select a Customer<BR>";
  echo "<SELECT NAME=\"customer_id\">";
  while ($row=mysql_fetch_array($mysql_result))
  {
    $full_name=$row["full_name"];
    $customer_id=$row["customer_id"];
    # display results
    echo "<OPTION VALUE=\"$customer_id\">$full_name";
  }
  echo "<OPTION VALUE=\"all\">All Customers";
  echo "</SELECT>";
} # end else
echo "<BR><BR>";
echo "<INPUT TYPE=\"SUBMIT\" VALUE=\"See Order\">";
echo "<BR><BR><INPUT TYPE=\"RESET\" VALUE=\"Clear selection\">";
mysql_close($connection);
?>

So far we have only used the WHERE join method; this method was pushed more heavily in the early MySQL documentation. Now the documentation actually uses the word 'join', and it is sometimes called the 'join on' method. One common type of join is the inner join, or equi-join. Like our earlier example, this join is based on extracting data using common or equivalent matching columns. If there is a match then results are printed; nothing is printed for a non-match. So to print all customers that have ordered a product, as we did earlier, we could use the explicit 'inner join ... on' form shown above.
+------------------+ | full_name | +------------------+ | Louise Christine | +------------------+

QUERYING WITH PHP
Creating a rudimentary reporting interface is fairly straightforward. I will present a drop down menu form that contains the customer names. These names will be pulled from the customer table, and the user will then select one for the report to be generated. Of course in our example the menu will only contain a handful of names, but the concept is the same no matter how many rows the query returns. Running a query against the orders table will then generate the report, which will display the orders for that given customer. PHP for common MySQL tasks, such as connecting to the database and checking for connection errors, is reusable code, so it is put into an include file which can be called from other PHP scripts; Listing 3, cust_inc.php, contains that code. One thing we will need to give the user is the choice of whether to select an individual customer or to see all orders for all customers. This is easily accomplished by adding another OPTION VALUE, called 'all', after the customer records have been read in from the customers table to populate the pull down menu. See Listing 4, cust_select.php.

Once the user has selected a customer to view, or 'all' to see every customer's orders, the submit button will send them on to cust_view.php - see Listing 5. A check is initially carried out to determine whether the user has loaded this page directly without first going to cust_select.php; if this is the case, then a hyperlink points them back. This check is carried out by determining whether the variable passed from the cust_select.php form, customer_id, has a value. If not, then we know no selection has taken place. Figure 1, customer_select, shows the initial web page, where the menu is populated with the customer details. The screen shot below shows the result of a query.




Freestyle
regular column

Clare's enclosure
Strange anomalies of copyright law (Part 2)
Richard Hillesley




The poet John Clare died in the Northampton General Lunatic Asylum in 1864. During his lifetime he was known as 'the peasant poet', though he was not, strictly speaking, a peasant, and he lived in an era of social upheaval, enclosures and landless labour, when a landless labourer had even fewer rights than a peasant. Clare came from a rural labouring family and had little education, but is now recognised as the greatest English poet of nature, as important as his contemporaries Keats, Byron and Shelley, of whom everybody has heard. As the critic Geoffrey Grigson wrote as long ago as 1949, "Clare has gradually been transformed from 'peasant poet' into poet, from cottage rushlight into what indeed he is, a star of considerable and most unique coruscation." Clare ignored the rules of punctuation and spelling, probably as a consequence of the slightness of his education, and wrote: "I am gennerally understood tho I do not use that awkward squad of pointings called commas colons semicolons etc and for the very reason that altho they are drilled hourly daily and weekly by every boarding school Miss who pretends to gossip in correspondence they do not know their proper exercise for they even set grammarians at loggerheads and no one can assign them their proper places for give each a sentence to point and both shall point it differently."

The fashion for removing punctuation from modern poetry began with the French poets, Apollinaire and Cendrars, in the early decades of the twentieth century, but Clare had his own untutored, and strangely modern, take on the rhythms and nuances of poetry and punctuation. Clare's punctuation, and his feelings about it, are at the heart of the academic disputes that surround his work. Clare's first book sold well by the standards of his time, and outsold Keats by some margin, but his relative commercial success was shortlived, and he depended for his income on patrons, who sometimes over-edited and censored his material. Much of his better poetry was written during the 23 years he spent in the lunatic asylum - "the purgatorial hell and French bastile of English liberty, where harmless people are trapped and tortured until they die". Most of his work, some 2700 poems, remained unpublished during his lifetime, and at the time of his death he was all but forgotten, despite the best efforts of his publisher, John Taylor. But due to a quirk in English copyright law, 140 years after Clare's death, the ownership of the copyright to Clare's unpublished writings is still claimed as the sole property of one individual, Professor Eric Robinson, who purchased the "rights" for £1 in July 1965. This has been a point of contention among Clare scholars and publishers for the last 40 years. At the time of Clare's death the ownership of his unpublished work passed to James Whitaker, best known as the creator of Whitaker's Almanac, with the intention of producing a posthumous collection.

According to John Goodridge, writing in the Guardian in July 2000, Whitaker made a "provisional bargain", presumably verbal, with the dying John Taylor in May 1864, to pay Clare's widow, who could neither read nor write, £10 a year for Clare's manuscripts and publication rights. Later the same year he signed an agreement with Clare's widow and her children for the transfer of their rights to Clare's copyrights. This agreement, discovered in the archive of the Whitaker publishing house in 1932, was destroyed with the rest of the archive during the London blitz eight years later. Whitaker's edition of Clare never appeared, and he transferred the bulk of the surviving manuscripts (and, some presumed, the copyright) to the care of the Peterborough Museum Society before his death in 1895. No edition of Clare published between Whitaker's death and Robinson's purchase of "all rights whatsoever possessed by the company in the published and unpublished works of John Clare" in 1965 acknowledged any copyright holder. Under the 1842 Copyright Act, an author, or after his death his personal representative, retained perpetual control over his work as long as it remained unpublished. This clause remained in force until it was finally replaced in the 1988 Act with a finite, 50-year term of protection (made potentially extendable by a further 25 years in a 1996 Act). Although Robinson has contributed much to Clare scholarship over the last half-century, his claim to the ownership of Clare's legacy has caused much controversy. According to Goodridge, Robinson "has enforced this claim, demanding acknowledgement and often payment from anyone who wishes to publish Clare material. In the view of Tim Chilcott, the effect has been 'the impoverishment of editorial debate compared with other Romantic writers, the absence of challenging alternative views, the deadening hand of the authorised definitive version.'" A core concern of Clare's poetry was the disruption to the traditional patterns of life caused by the enclosures of the English commons. Like Gerard Winstanley, Clare believed the earth to be a "common treasury for all". It seems unlikely that Clare would approve of a situation where the rights to his work were enclosed and claimed as the property of one individual 140 years after his death.

"And me they turned me inside out
For sand and grit and stones
And turned my old green hills about
And pickt my very bones."

LISTING 5. CUST_VIEW.PHP
<HTML>
<BODY>
<?php
# cust_view.php
# did the user load this page directly?
if (empty( $_GET['customer_id'] )) {
  echo "You need to make a selection first";
  echo "<A HREF=\"cust_select.php\"> Customer Menu</A>";
  exit;
}
# include file
include ("cust_inc.php");
# this query is used if the user selects a customer name
$sql_single="select customer.full_name, orders.order_id from customer
  inner join orders on customer.customer_id=orders.customer_id
  AND customer.customer_id='{$_GET['customer_id']}'";
# this query is used if the user selects all customers
$sql_all="select customer.full_name, orders.order_id from customer
  inner join orders on customer.customer_id=orders.customer_id";
# determine which query to use
if ( $_GET['customer_id'] == 'all' )
{
  $mysql_result=mysql_query($sql_all,$connection);
  $num_rows=mysql_num_rows($mysql_result);
}
else
{
  $mysql_result=mysql_query($sql_single,$connection);
  $num_rows=mysql_num_rows($mysql_result);
}
# apologise if there's nothing to show
if ( $num_rows == 0 )
{
  echo "Sorry there is no information";
}
else
{
  # we have results
  echo "<TABLE ALIGN=\"CENTER\" BORDER=\"8\" WIDTH=\"40%\">";
  echo "<TR><TH>Full Name</TH><TH> Item(s) Ordered</TH></TR>";
  while ($row=mysql_fetch_array($mysql_result))
  {
    $full_name=$row["full_name"];
    $order_id=$row["order_id"];
    # display results
    echo "<TR><TD><CENTER>$full_name</CENTER></TD><TD><CENTER>$order_id</CENTER></TD></TR>";
  }
}
# end the script
mysql_close($connection);
?>
</TABLE>
<BR><A HREF="cust_select.php">Back</A>
</BODY>
</HTML>

Using joins and PHP provides a powerful means of generating business intelligence in a user friendly manner. Being web enabled makes for the ultimate user interface: since no installation of client applications is required, reports can be made available to anyone with a browser.


Regular features

Otherwords
“The absolute transformation of everything that we ever thought about music will take place within 10 years, and nothing is going to be able to stop it. I see absolutely no point in pretending that it’s not going to happen. I’m fully confident that copyright, for instance, will no longer exist in 10 years, and authorship and intellectual property is in for such a bashing” David Bowie
