
The Nature of Lisp

Monday, May 8, 2006

Introduction
When I first stumbled into Lisp advocacy on various corners of the web I was already an experienced programmer. At that point I had grokked what seemed at the time a wide range of programming languages. I was proud to have the usual suspects (C++, Java, C#, etc.) on my service record and was under the impression that I knew everything there is to know about programming languages. I couldn't have possibly been more wrong.

My initial attempt to learn Lisp came to a crashing halt as soon as I saw some sample code. I suppose the same thought ran through my mind that ran through thousands of other minds who were ever in my shoes: "Why on Earth would anyone want to use a language with such horrific syntax?!" I couldn't be bothered to learn a language if its creators couldn't be bothered to give it a pleasant syntax. After all, I was almost blinded by the infamous Lisp parentheses!

The moment I regained my sight I communicated my frustrations to some members of the Lisp sect. Almost immediately I was bombarded by a standard set of responses: Lisp's parentheses are only a superficial matter, Lisp has a huge benefit of code and data being expressed in the same manner (which, obviously, is a huge improvement over XML), Lisp has tremendously powerful metaprogramming facilities that allow programs to write code and modify themselves, Lisp allows for creation of mini-languages specific to the problem at hand, Lisp blurs the distinction between run time and compile time, Lisp, Lisp, Lisp... The list was very impressive. Needless to say none of it made sense. Nobody could illustrate the usefulness of these features with specific examples because these techniques are supposedly only useful in large software systems. After many hours of debating that conventional programming languages do the job just fine, I gave up. I wasn't about to invest months into learning a language with a terrible syntax in order to understand obscure features that had no useful examples. My time had not yet come.
For many months the Lisp advocates pressed on. I was baffled. Many extremely intelligent people I knew and had much respect for were praising Lisp with almost religious dedication. There had to be something there, something I couldn't afford not to get my hands on! Eventually my thirst for knowledge won me over. I took the plunge, bit the bullet, got my hands dirty, and began months of mind bending exercises. It was a journey on an endless lake of frustration. I turned my mind inside out, rinsed it, and put it back in place. I went through seven rings of hell and came back. And then I got it.

The enlightenment came instantaneously. One moment I understood nothing, and the next moment everything clicked into place. I've achieved nirvana. Dozens of times I heard Eric Raymond's statement quoted by different people: "Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot." I never understood this statement. I never believed it could be true. And finally, after all the pain, it made sense! There was more truth to it than I ever could have imagined. I've achieved an almost divine state of mind, an instantaneous enlightenment experience that turned my view of computer science on its head in less than a single second.

That very second I became a member of the Lisp cult. I felt something a ninjitsu master must feel: I had to spread my newfound knowledge to at least ten lost souls in the course of my lifetime. I took the usual path. I was rehashing the same arguments that were given to me for years (only now they actually made sense!), hoping to convert unsuspecting bystanders. It didn't work. My persistence sparked a few people's interest but their curiosity dwindled at the mere sight of sample Lisp code. Perhaps years of advocacy would forge a few new Lispers, but I wasn't satisfied. There had to be a better way.

I gave the matter careful thought. Is there something inherently hard about Lisp that prevents very intelligent, experienced programmers from understanding it? No, there isn't. After all, I got it, and if I can do it, anybody can. Then what is it that makes Lisp so hard to understand? The answer, as such things usually do, came unexpectedly. Of course! Teaching anybody anything involves building advanced concepts on top of concepts they already understand! If the process is made interesting and the matter is explained properly the new concepts become as intuitive as the original building blocks that aided their understanding. That was the problem! Metaprogramming, code and data in one representation, self-modifying programs, domain specific mini-languages, none of the explanations for these concepts referenced familiar territory. How could I expect anyone to understand them! No wonder people wanted specific examples. I could as well have been speaking in Martian!

I shared my ideas with fellow Lispers. "Well, of course these concepts aren't explained in terms of familiar territory," they said. "They are so different, they're unlike anything these people have learned before." This was a poor excuse. "I do not believe this to be true," I said. The response was unanimous: "Why don't you give it a try?" So I did. This article is a product of my efforts. It is my attempt to explain Lisp in familiar, intuitive concepts. I urge brave souls to read on. Grab your favorite drink. Take a deep breath. Prepare to be blown away. Oh, and may the Force be with you.

XML Reloaded
A thousand mile journey starts with a single step. A journey to enlightenment is no exception and our first step just happens to be XML. What more could possibly be said about XML that hasn't already been said? It turns out, quite a bit. While there's nothing particularly interesting about XML itself, its relationship to Lisp is fascinating. XML is the all too familiar concept that Lisp advocates need so much. It is our bridge to conveying understanding to regular programmers. So let's revive the dead horse, take out the stick, and venture into XML wilderness that no one dared venture into before us. It's time to see the all too familiar moon from the other side.

Superficially XML is nothing more than a standardized syntax used to express arbitrary hierarchical data in human readable form. To-do lists, web pages, medical records, auto insurance claims, configuration files are all examples of potential XML use. Let's use a simple to-do list as an example (in a couple of sections you'll see it in a whole new light):
<todo name="housework">
    <item priority="high">Clean the house.</item>
    <item priority="medium">Wash the dishes.</item>
    <item priority="medium">Buy more soap.</item>
</todo>

What happens if we unleash our favorite XML parser on this to-do list? Once the data is parsed, how is it represented in memory? The most natural representation is, of course, a tree - a perfect data structure for hierarchical data. After all is said and done, XML is really just a tree serialized to a human readable form. Anything that can be represented in a tree can be represented in XML and vice versa. I hope you understand this idea. It's very important for what's coming next.

Let's take this a little further. What other type of data is often represented as a tree? At this point the list is as good as infinite so I'll give you a hint at what I'm getting at - try to remember your old compiler course. If you have a vague recollection that source code is stored in a tree after it's parsed, you're on the right track. Any compiler inevitably parses the source code into an abstract syntax tree. This isn't surprising since source code is hierarchical: functions contain arguments and blocks of code. Blocks of code contain expressions and statements. Expressions contain variables and operators. And so it goes.

Let's apply our corollary that any tree can easily be serialized into XML to this idea. If all source code is eventually represented as a tree, and any tree can be serialized into XML, then all source code can be converted to XML, right? Let's illustrate this interesting property by a simple example. Consider the function below:
int add(int arg1, int arg2)
{
    return arg1 + arg2;
}

Can you convert this function definition to its XML equivalent? Turns out, it's reasonably simple. Naturally there are many ways to do this. Here is one way the resulting XML can look:
<define-function return-type="int" name="add">
    <arguments>
        <argument type="int">arg1</argument>
        <argument type="int">arg2</argument>
    </arguments>
    <body>
        <return>
            <add value1="arg1" value2="arg2" />
        </return>
    </body>
</define-function>
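Going back from the XML to source code is just as mechanical. Here is a minimal sketch in Python, assuming the XML layout shown above with the closing tag written as `</define-function>` so that a standard parser accepts it. The helper names `function_to_c` and `expression_to_c` are made up for this illustration, and only the single `add` expression is handled.

```python
import xml.etree.ElementTree as ET

# The XML form of the "add" function (closing tag matched to the opening one).
ADD_XML = """
<define-function return-type="int" name="add">
    <arguments>
        <argument type="int">arg1</argument>
        <argument type="int">arg2</argument>
    </arguments>
    <body>
        <return>
            <add value1="arg1" value2="arg2" />
        </return>
    </body>
</define-function>
"""


def expression_to_c(node):
    """Turn an expression element into C text (only <add> is supported)."""
    if node.tag == "add":
        return f"{node.get('value1')} + {node.get('value2')}"
    raise ValueError(f"unsupported expression: {node.tag}")


def function_to_c(root):
    """Turn a <define-function> tree back into C-style source text."""
    params = ", ".join(
        f"{arg.get('type')} {arg.text}" for arg in root.find("arguments")
    )
    statements = []
    for stmt in root.find("body"):
        if stmt.tag != "return":
            raise ValueError(f"unsupported statement: {stmt.tag}")
        statements.append(f"    return {expression_to_c(stmt[0])};")
    header = f"{root.get('return-type')} {root.get('name')}({params})"
    return "\n".join([header, "{", *statements, "}"])


c_source = function_to_c(ET.fromstring(ADD_XML))
print(c_source)
```

A real converter would need one such rule per language construct, but each rule is this small: walk the tree, emit text.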

We can go through this relatively simple exercise with any language. We can turn any source code into XML, and we can transform the resulting XML back to original source code. We can write a converter that turns Java into XML and a converter that turns XML back to Java. We could do the same for C++. (In case you're wondering if anyone is crazy enough to do it, take a look at GCC-XML.) Furthermore, for languages that share common features but use different syntax (which to some extent is true about most mainstream languages) we could convert source code from one language to another using XML as an intermediary representation. We could use our Java2XML converter to convert a Java program to XML. We could then run an XML2CPP converter on the resulting XML and turn it into C++ code. With any luck (if we avoid using features of Java that don't exist in C++) we'll get a working C++ program. Neat, eh?

All this effectively means that we can use XML for generic storage of source code. We'd be able to create a whole class of programming languages that use uniform syntax, as well as write transformers that convert existing source code to XML. If we were to actually adopt this idea, compilers for different languages wouldn't need to implement parsers for their specific grammars - they'd simply use an XML parser to turn XML directly into an abstract syntax tree.

By now you're probably wondering why I've embarked on the XML crusade and what it has to do with Lisp (after all, Lisp was created about thirty years before XML). I promise that everything will become clear soon enough. But before we take our second step, let's go through a small philosophical exercise. Take a good look at the XML version of our "add" function above. How would you classify it? Is it data or code? If you think about it for a moment you'll realize that there are good reasons to put this XML snippet into both categories. It's XML and it's just information encoded in a standardized format. We've already determined that it can be generated from a tree data structure in memory (that's effectively what GCC-XML does). It's lying around in a file with no apparent way to execute it. We can parse it into a tree of XML nodes and do various transformations on it. It's data. But wait a moment! When all is said and done it's the same "add" function written with a different syntax, right? Once parsed, its tree could be fed into a compiler and we could execute it. We could easily write a small interpreter for this XML code and we could execute it directly. Alternatively, we could transform it into Java or C++ code, compile it, and run it. It's code.

So, where are we? Looks like we've just arrived at an interesting point. A concept that has traditionally been so hard to understand is now amazingly simple and intuitive. Code is also always data! Does it mean that data is also always code? As crazy as this sounds this very well might be the case. Remember how I promised that you'll see our to-do list in a whole new light? Let me reiterate that promise. But we aren't ready to discuss this just yet. For now let's continue walking down our path.

A little earlier I mentioned that we could easily write an interpreter to execute our XML snippet of the add function. Of course this sounds like a purely theoretical exercise. Who in their right mind would want to do that for practical purposes? Well, it turns out quite a few people would disagree. You've likely encountered and used their work at least once in your career, too. Do I have you out on the edge of your seat? If so, let's move on!
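For the curious, the "small interpreter" really is small. The sketch below is one possible version in Python, assuming the XML layout of the "add" function from this section (with a matching `</define-function>` closing tag). The function names `call` and `evaluate` are invented for the illustration, and only `return` and `add` are understood.

```python
import xml.etree.ElementTree as ET

ADD_XML = """
<define-function return-type="int" name="add">
    <arguments>
        <argument type="int">arg1</argument>
        <argument type="int">arg2</argument>
    </arguments>
    <body>
        <return>
            <add value1="arg1" value2="arg2" />
        </return>
    </body>
</define-function>
"""


def evaluate(node, env):
    """Evaluate an expression element against a name -> value mapping."""
    if node.tag == "add":
        return env[node.get("value1")] + env[node.get("value2")]
    raise ValueError(f"unsupported expression: {node.tag}")


def call(function_node, *args):
    """Bind the arguments to the declared names and run the body."""
    names = [arg.text for arg in function_node.find("arguments")]
    if len(names) != len(args):
        raise TypeError(f"expected {len(names)} arguments, got {len(args)}")
    env = dict(zip(names, args))
    for stmt in function_node.find("body"):
        if stmt.tag == "return":
            return evaluate(stmt[0], env)
        raise ValueError(f"unsupported statement: {stmt.tag}")
    return None


add = ET.fromstring(ADD_XML)  # the "data"
result = call(add, 2, 3)      # ...executed as "code"
print(result)
```

The same parsed tree that a converter would print back out as source text is here simply walked and executed.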

Ant Reloaded
Now that we've made the trip to the dark side of the moon, let's not leave quite yet. We may still learn something by exploring it a little more, so let's take another step. We begin by closing our eyes and remembering a cold rainy night in the winter of 2000. A prominent developer by the name of James Duncan Davidson[1] was hacking his way through Tomcat servlet container. As the time came to build the changes he carefully saved all his files and ran make. Errors. Lots of errors. Something was wrong. After careful examination James exclaimed: "Is my command not executing because I have a space in front of my tab?!" Indeed, this was the problem. Again. James had had enough. He could sense the full moon through the clouds and it made him adventurous. He created a fresh Java project and quickly hacked together a simple but surprisingly useful utility. This spark of genius used Java property files for information on how to build the project. James could now write the equivalent of the makefile in a nice format without worrying about the damned spaces ever again. His utility did all the hard work by interpreting the property file and taking appropriate actions to build the project. It was neat. Another Neat Tool. Ant.

After using Ant to build Tomcat for a few months it became clear that Java property files are not sufficient to express complicated build instructions. Files needed to be checked out, copied, compiled, sent to another machine, and unit tested. In case of failure e-mails needed to be sent out to appropriate people. In case of success "Bad to the Bone" needed to be played at the highest possible volume. At the end of the track volume had to be restored to its original level. Yes, Java property files didn't cut it anymore. James needed a more flexible solution. He didn't feel like writing his own parser (especially since he wanted an industry standard solution). XML seemed like a reasonable alternative. In a couple of days Ant was ported to XML.
It was the best thing since sliced bread.

So how does Ant work? It's pretty simple. It takes an XML file with specific build instructions (you decide if they're data or code) and interprets them by running specialized Java code for each XML element. It's actually much simpler than it sounds. A simple XML instruction like the one below causes a Java class with an equivalent name to be loaded and its code to be executed.
<copy todir="../new/dir">
    <fileset dir="src_dir"/>
</copy>

The snippet above copies a source directory to a destination directory. Ant locates a "copy" task (a Java class, really), sets appropriate parameters (todir and fileset) by calling appropriate Java methods and then executes the task. Ant comes with a set of core tasks and anyone can extend it with tasks of their own simply by writing Java classes that follow certain conventions. Ant finds these classes and executes them whenever XML elements with appropriate names are encountered. Pretty simple. Effectively Ant accomplishes what we were talking about in the previous section: it acts as an interpreter for a language that uses XML as its syntax by translating XML elements to appropriate Java instructions. We could write an "add" task and have Ant execute it when it encounters the XML snippet for addition presented in the previous section! Considering that Ant is an extremely popular project, the ideas presented in the previous section start looking more sane. After all, they're being used every day in what probably amounts to thousands of companies!

So far I've said nothing about why Ant actually goes through all the trouble of interpreting XML. Don't try to look for the answer on its website either - you'll find nothing of value. Nothing relevant to our discussion, anyway. Let's take another step. It's time to find out why.
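The dispatch described above (element name picks a class, attributes become setter calls, then the task runs) can be sketched in a few lines. This is an illustration in Python, not Ant's actual implementation: the classes `CopyTask` and `Fileset`, the `REGISTRY` table, and the `set_<name>` convention are all assumptions made for the example, and `execute` only reports what it would do instead of touching the file system.

```python
import xml.etree.ElementTree as ET


class Fileset:
    def __init__(self):
        self.dir = None

    def set_dir(self, value):
        self.dir = value


class CopyTask:
    def __init__(self):
        self.todir = None
        self.fileset = None

    def set_todir(self, value):
        self.todir = value

    def set_fileset(self, value):
        self.fileset = value

    def execute(self):
        # A real task would copy files; this sketch only describes the action.
        return f"copy {self.fileset.dir} -> {self.todir}"


# Hypothetical lookup table from element names to classes.
REGISTRY = {"copy": CopyTask, "fileset": Fileset}


def build(element):
    """Create the object for an element and configure it from the XML."""
    obj = REGISTRY[element.tag]()
    for name, value in element.attrib.items():
        getattr(obj, f"set_{name}")(value)          # attribute -> setter call
    for child in element:
        getattr(obj, f"set_{child.tag}")(build(child))  # nested element -> setter call
    return obj


def run(xml_text):
    return build(ET.fromstring(xml_text)).execute()


message = run('<copy todir="../new/dir"><fileset dir="src_dir"/></copy>')
print(message)
```

Adding a new task in this scheme means writing one more class and registering it; the interpreter itself does not change.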

Why XML?
Sometimes right decisions are made without full conscious understanding of all the issues involved. I'm not sure if James knew why he chose XML - it was likely a subconscious decision. At the very least, the reasons I saw on Ant's website for using XML are all the wrong reasons. It appears that the main concerns revolved around portability and extensibility. I fail to see how XML helps advance these goals in Ant's case. What is the advantage of using interpreted XML over simple Java source code? Why not create a set of classes with a nice API for commonly used tasks (copying directories, compiling, etc.) and use those directly from Java source code? This would run on every platform that runs Java (which Ant requires anyway), it's infinitely extensible, and it has the benefit of having a more pleasant, familiar syntax. So why XML? Can we find a good reason for using it?

It turns out that we can (although as I mentioned earlier I'm not sure if James was consciously aware of it). XML has the property of being far more flexible in terms of introduction of semantic constructs than Java could ever hope to be. Don't worry, I'm not falling into the trap of using big words to describe incomprehensible concepts. This is actually a relatively simple idea, though it may take some effort to explain. Buckle your seat-belt. We're about to make a giant leap towards achieving nirvana.

How can we represent the 'copy' example above in Java code? Here's one way to do it:
CopyTask copy = new CopyTask();
Fileset fileset = new Fileset();

fileset.setDir("src_dir");
copy.setToDir("../new/dir");
copy.setFileset(fileset);

copy.execute();

The code is almost the same, albeit a little longer than the original XML. So what's different? The answer is that the XML snippet introduces a special semantic construct for copying. If we could do it in Java it would look like this:
copy("../new/dir")
{
    fileset("src_dir");
}

Can you see the difference? The code above (if it were possible in Java) is a special operator for copying files - similar to a for loop or a new foreach construct introduced in Java 5. If we had an automatic converter from XML to Java it would likely produce the above gibberish. The reason for this is that Java's accepted syntax tree grammar is fixed by the language specification - we have no way of modifying it. We can add packages, classes, methods, but we cannot extend Java to make addition of new operators possible. Yet we can do it to our heart's content in XML - its syntax tree isn't restricted by anything except our interpreter! If the idea is still unclear, consider introducing a special operator 'unless' to Java:
unless(someObject.canFly())
{
    someObject.transportByGround();
}

In the previous two examples we extend the Java language to introduce an operator for copying files and a conditional operator unless. We would do this by modifying the abstract syntax tree grammar that the Java compiler accepts. Naturally we cannot do it with standard Java facilities, but we can easily do it in XML. Because our XML interpreter parses the abstract syntax tree that results from it, we can extend it to include any operator we like.

For complex operators this ability provides tremendous benefits. Can you imagine writing special operators for checking out source code, compiling files, running unit tests, sending email? Try to come up with some. If you're dealing with a specialized problem (in our case it's building projects) these operators can do wonders to decrease the amount of code you have to type and to increase clarity and code reuse. Interpreted XML makes this extremely easy to accomplish because it's a simple data file that stores hierarchical data. We do not have this option in Java because its hierarchical structure is fixed (as you will soon find out, we do have this option in Lisp). Perhaps this is one of the reasons why Ant is so successful?

I urge you to take a look at recent evolution of Java and C# (especially the recently released specification for C# 3.0). The languages are being evolved by abstracting away commonly used functionality and adding it in the form of operators. New C# operators for built-in queries are one example. This is accomplished by relatively traditional means: language creators modify the accepted abstract syntax tree and add implementations of certain features. Imagine the possibilities if the programmer could modify the abstract syntax tree himself! Whole new sub-languages could be built for specialized domains (for example a language for building projects, like Ant). Can you come up with other examples? Think about these concepts for a bit, but don't worry about them too much.
We'll come back to these issues after introducing a few more ideas. By then things will be a little more clear.
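To see how cheap a new operator is when the interpreter is ours, here is a minimal sketch in Python of an XML interpreter with a table of operators. Everything in it is hypothetical and chosen for the illustration: the `OPERATORS` table, the `echo` element, the `test` attribute, and the convention that conditions are flags looked up in a dictionary.

```python
import xml.etree.ElementTree as ET

OPERATORS = {}  # element name -> function implementing it


def operator(name):
    """Register a function as the implementation of an XML element."""
    def register(func):
        OPERATORS[name] = func
        return func
    return register


def run(element, env, out):
    OPERATORS[element.tag](element, env, out)


@operator("echo")
def echo(element, env, out):
    out.append(element.get("message"))


# The new "special operator": one small function, no grammar to change.
@operator("unless")
def unless(element, env, out):
    # Run the nested elements only when the named flag is false.
    if not env.get(element.get("test"), False):
        for child in element:
            run(child, env, out)


SCRIPT = '<unless test="can_fly"><echo message="transport by ground"/></unless>'


def execute(script, env):
    out = []
    run(ET.fromstring(script), env, out)
    return out


grounded = execute(SCRIPT, {"can_fly": False})
airborne = execute(SCRIPT, {"can_fly": True})
print(grounded, airborne)
```

The `unless` construct controls whether its body runs at all, which is exactly what an ordinary Java method call cannot do with a block of code handed to it in this form.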

Almost Lisp
Let's forget about the operator business for the moment and try to expand our horizons beyond the constraints of Ant's design. I mentioned earlier that Ant can be extended by writing conventional Java classes. Ant interpreter then attempts to match XML elements to appropriately named Java classes and if the match is found the task is executed. An interesting question begs to be asked. Why not extend Ant in Ant itself? After all, core tasks contain a lot of conventional programming language constructs ('if' being a perfect example). If Ant provided constructs to develop tasks in Ant itself we'd reach a higher degree of portability. We'd be dependent on a core set of tasks (a standard library, if you will) and we wouldn't care if Java runtime is present: the core set could be implemented in anything. The rest of the tasks would be built on top of the core using Ant-XML itself. Ant would then become a generic, extensible, XML-based programming language. Consider the possibilities:
<task name="Test">
    <echo message="Hello World!"/>
</task>
<Test />

If Ant supported the "task" construct, the example above would print "Hello World!". In fact, we could write a "task" task in Java and make Ant able to extend itself using Ant-XML! Ant would then be able to build more complicated primitives on top of simple ones, just like any other programming language! This is an example of the "XML" based programming language we were talking about in the beginning of this tutorial. Not very useful (can you tell why?) but pretty damn cool.

By the way, take a look at our 'Test' task once again. Congratulations. You're looking at Lisp code. What on Earth am I talking about? It doesn't look anything like Lisp? Don't worry, we'll fix that in a bit. Confused? Good. Let's clear it all up!
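A "task" task is short enough to sketch. The Python below is an illustration under stated assumptions rather than anything Ant provides: `make_interpreter` and its inner functions are invented names, the only built-in tasks are `echo` and `task`, and the snippet is wrapped in a `<program>` element because an XML document needs a single root.

```python
import xml.etree.ElementTree as ET


def make_interpreter():
    output = []
    tasks = {}  # element name -> callable taking the element

    def run(element):
        tasks[element.tag](element)

    def echo(element):
        output.append(element.get("message"))

    def define_task(element):
        # Remember the nested elements and register them under the new name.
        body = list(element)

        def user_task(_call_site):
            for child in body:
                run(child)

        tasks[element.get("name")] = user_task

    tasks["echo"] = echo
    tasks["task"] = define_task
    return run, output


PROGRAM = """
<program>
    <task name="Test">
        <echo message="Hello World!"/>
    </task>
    <Test />
</program>
"""

run, output = make_interpreter()
for statement in ET.fromstring(PROGRAM):
    run(statement)
print(output)
```

After the first statement runs, `Test` is as much a part of the language as `echo` is, even though it was defined in the XML itself.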

A Better XML
I mentioned in the previous section that self-extending Ant wouldn't be very useful. The reason for that is XML's verbosity. It's not too bad for data files but the moment you try writing reasonably complex code the amount of typing you have to do quickly starts to get in the way and progresses to becoming unusable for any real project. Have you ever tried writing Ant build scripts? I have, and once they get complex enough having to do it in XML becomes really annoying. Imagine having to type almost everything in Java twice because you have to close every element. Wouldn't that drive you nuts?

The solution to this problem involves using a less verbose alternative to XML. Remember, XML is just a format for representing hierarchical data. We don't have to use XML's angle brackets to serialize trees. We could come up with many other formats. One such format (incidentally, the one Lisp uses) is called an s-expression. S-expressions accomplish the same goals as XML. They're just a lot less verbose, which makes them much better suited for typing code. I will explain s-expressions in a little while, but before I do I have to clear up a few things about XML. Let's consider our XML example for copying files:
<copy todir="../new/dir">
    <fileset dir="src_dir"/>
</copy>

Think of what the parse tree of this snippet would look like in memory. We'd have a 'copy' node that contains a fileset node. But what about attributes? How do they fit into our picture? If you've ever used XML to describe data and wondered whether you should use an element or an attribute, you're not alone. Nobody can really figure this out and doing it right tends to be black magic rather than science. The reason for that is that attributes are really subsets of elements. Anything attributes can do, elements can do as well. The reason attributes were introduced is to curb XML's verbosity. Take a look at another version of our 'copy' snippet:
<copy>
    <todir>../new/dir</todir>
    <fileset>
        <dir>src_dir</dir>
    </fileset>
</copy>

The two snippets hold exactly the same information. However, we use attributes to avoid typing the same thing more than once. Imagine if attributes weren't part of the XML specification. Writing anything in XML would drive us nuts!

Now that we got attributes out of the way, let's look at s-expressions. The reason we took this detour is that s-expressions do not have attributes. Because they're a lot less verbose, attributes are simply unnecessary. This is one thing we need to keep in mind when transforming XML to s-expressions. Let's take a look at an example. We could translate the above snippet to s-expressions like this:
(copy
    (todir "../new/dir")
    (fileset (dir "src_dir")))

Take a good look at this representation. What's different? Angle brackets seem to be replaced by parentheses. Instead of enclosing each element into a pair of parentheses and then closing each element with a "(/element)", we simply skip the second parenthesis in "(element" and proceed. The element is then closed like this: ")". That's it! The translation is natural and very simple. It's also a lot easier to type. Do parentheses blind first time users? Maybe, but now that we understand the reasoning behind them they're a lot easier to handle. At the very least they're better than the arthritis inducing verbosity of XML. After you get used to s-expressions writing code in them is not only doable but very pleasant. And they provide all the benefits of writing code in XML (many of which we're yet to explore). Let's take a look at our 'task' code in something that looks a lot more like Lisp:
(task (name "Test")
    (echo (message "Hello World!")))

(Test)

S-expressions are called lists in Lisp lingo. Consider our 'task' element above. If we rewrite it without a line break and with commas instead of spaces it's starting to look surprisingly like a list of elements and other lists (the formatting is added to make it easier to see nested lists):
(task, (name, "test"), (echo, (message, "Hello World!")))

We could do the same with XML. Of course the line above isn't really a list, it's a tree, just like its XML-alternative. Don't let references to lists confuse you, it's just that lists that contain other lists and trees are effectively the same thing. Lisp may stand for List Processing, but it's really tree processing - no different than processing XML nodes.

Whew. After much rambling we finally got to something that looks like Lisp (and is Lisp, really). By now the mysterious Lisp parentheses as well as some claims made by Lisp advocates should become more clear. But we still have a lot of ground to cover. Ready? Let's move on!
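The XML to s-expression translation used in this section is mechanical enough to automate. Here is a minimal sketch in Python, assuming that attributes are written out as nested lists exactly as in the 'copy' example; the function name `to_sexp` is made up, and quoting of special characters inside strings is ignored to keep it short.

```python
import xml.etree.ElementTree as ET


def to_sexp(element):
    """Serialize an XML element tree as an s-expression string."""
    parts = [element.tag]
    # Attributes become nested lists, since s-expressions have no attributes.
    for name, value in element.attrib.items():
        parts.append(f'({name} "{value}")')
    # Text content becomes a string (no escaping is attempted in this sketch).
    text = (element.text or "").strip()
    if text:
        parts.append(f'"{text}"')
    parts.extend(to_sexp(child) for child in element)
    return "(" + " ".join(parts) + ")"


with_attributes = ET.fromstring(
    '<copy todir="../new/dir"><fileset dir="src_dir"/></copy>'
)
with_elements = ET.fromstring(
    "<copy><todir>../new/dir</todir><fileset><dir>src_dir</dir></fileset></copy>"
)

sexp_from_attributes = to_sexp(with_attributes)
sexp_from_elements = to_sexp(with_elements)
print(sexp_from_attributes)
```

Both XML spellings of the 'copy' snippet come out as the same s-expression, which is another way of seeing that attributes add nothing elements could not express.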

C Macros Reloaded
By now you must be tired of all the XML talk. I'm tired of it as well. It's time to take a break from all the trees, s-expressions, and Ant business. Instead, let's go back to every programmer's roots. It's time to talk about C preprocessor. What's C got to do with anything, I hear you ask? Well, we now know enough to get into metaprogramming and discuss code that writes other code. Understanding this tends to be hard since all tutorials discuss it in terms of languages that you don't know. But there is nothing hard about the concept. I believe that a metaprogramming discussion based on C will make the whole thing much easier to understand. So, let's see (pun intended).

Why would anyone want to write a program that writes programs? How can we use something like this in the real world? What on Earth is metaprogramming, anyway? You already know all the answers, you just don't know it yet. In order to unlock the hidden vault of divine knowledge let's consider a rather mundane task of simple database access from code. We've all been there. Writing SQL queries all over the code to modify data within tables turns into repetitive hell soon enough. Even with the new C# 3.0 LINQ stuff this is a huge pain. Writing a full SQL query (albeit with a nice built in syntax) to get someone's name or to modify someone's address isn't exactly a programmer's idea of comfort. What do we do to solve these problems? Enter data access layers.

The idea is simple enough. You abstract database access (at least trivial queries, anyway) by creating a set of classes that mirror the tables in the database and use accessor methods to execute actual queries. This simplifies development tremendously - instead of writing SQL queries we make simple method calls (or property assignments, depending on your language of choice). Anyone who has ever used even the simplest of data access layers knows how much time it can save. Of course anyone who has ever written one knows how much time it can kill - writing a set of classes that mirror tables and convert accessors to SQL queries takes a considerable chunk of time. This seems especially silly since most of the work is manual: once you figure out the design and develop a template for your typical data access class you don't need to do any thinking. You just write code based on the same template over and over and over and over again. Many people figured out that there is a better way - there are plenty of tools that connect to the database, grab the schema, and write code for you based on a predefined (or a custom) template.

Anyone who has ever used such a tool knows what an amazing time saver it can be. In a few clicks you connect the tool to the database, get it to generate the data access layer source code, add the files to your project and voilà - ten minutes worth of work do a better job than hundreds of man-hours that were required previously. What happens if your database schema changes? Well, you just have to go through this short process again. Of course some of the best tools let you automate this - you simply add them as a part of your build step and every time you compile your project everything is done for you automatically. This is perfect! You barely have to do anything at all. If the schema ever changes your data access layer code updates automatically at compile time and any obsolete access in your code will result in compiler errors!

Data access layers are one good example, but there are plenty of others. From boilerplate GUI code, to web code, to COM and CORBA stubs, to MFC and ATL - there are plenty of examples where the same code is written over and over again. Since writing this code is a task that can be automated completely and a programmer's time is far more expensive than CPU time, plenty of tools have been created that generate this boilerplate code automatically. What are these tools, exactly? Well, they are programs that write programs. They perform a simple task that has a mysterious name of metaprogramming. That's all there is to it.

We could create and use such tools in millions of scenarios but more often than not we don't. What it boils down to is a subconscious calculation - is it worth it for me to create a separate project, write a whole tool to generate something, and then use it, if I only have to write these very similar pieces about seven times? Of course not. Data access layers and COM stubs are written hundreds, thousands of times. This is why there are tools for them.
For similar pieces of code that repeat only a few times, or even a few dozen times, writing code generation tools isn't even considered. The trouble to create such a tool more often than not far outweighs the benefit of using one. If only creating such tools was much easier, we could use them more often, and perhaps save many hours of our time. Let's see if we can accomplish this in a reasonable manner.

Surprisingly, the C preprocessor comes to the rescue. We've all used it in C and C++. On occasion we all wish Java had it. We use it to execute simple instructions at compile time to make small changes to our code (like selectively removing debug statements). Let's look at a quick example:
#define triple(X) X + X + X

What does this line do? It's a simple instruction written in the preprocessor language that instructs it to replace all instances of triple(X) with X + X + X. For example all instances of 'triple(5)' will be replaced with '5 + 5 + 5' and the resulting code will be compiled by the C compiler. We're really doing a very primitive version of code generation here.

If only the C preprocessor were a little more powerful and included ways to connect to the database and a few more simple constructs, we could use it to develop our data access layer right there, from within our program! Consider the following example that uses an imaginary extension of the C preprocessor:
#get-db-schema("127.0.0.1, un, pwd");
#iterate-through-tables
#for-each-table
class #table-name
{
};
#end-for-each

We've just connected to the database schema, iterated through all the tables, and created an empty class for each. All in a couple of lines right within our source code! Now every time we recompile the file where the above code appears we'll get a freshly built set of classes that automatically update based on the schema. With a little imagination you can see how we could build a full data access layer straight from within our program, without the use of any external tools!

Of course this has a certain disadvantage (aside from the fact that such an advanced version of the C preprocessor doesn't exist) - we'd have to learn a whole new "compile-time language" to do this sort of work. For complex code generation this language would have to be very complex as well; it would have to support many libraries and language constructs. For example, if our generated code depended on some file located on some ftp server the preprocessor would have to be able to connect to ftp. It's a shame to create and learn a new language just to do this. Especially since there are so many nice languages already out there.

Of course if we add a little creativity we can easily avoid this pitfall. Why not replace the preprocessor language with C/C++ itself? We'd have the full power of the language at compile time and we'd only need to learn a few simple directives to differentiate between compile time and runtime code!
<%
    cout << "Enter a number: ";
    cin >> n;
%>
for(int i = 0; i < <%= n %>; i++)
{
    cout << "hello" << endl;
}

Can you see what happens here? Everything that's between the <% and %> tags runs when the program is compiled. Anything outside of these tags is normal code. In the example above you'd start compiling your program in the development environment. The code between the tags would be compiled and then run. You'd get a prompt to enter a number. You'd enter one and it would be placed inside the for loop. The for loop would then be compiled as usual and you'd be able to execute it. For example, if you'd enter 5 during the compilation of your program, the resulting code would look like this:
for(int i = 0; i < 5; i++)
{
    cout << "hello" << endl;
}

Simple and effective. No need for a special preprocessor language. We get the full power of our host language (in this case C/C++) at compile time. We could easily connect to a database and generate our data access layer source code at compile time in the same way JSP or ASP generate HTML! Creating such tools would also be tremendously quick and simple. We'd never have to create new projects with specialized GUIs. We could inline our tools right into our programs. We wouldn't have to worry about whether writing such tools is worth it because writing them would be so fast - we could save tremendous amounts of time by creating simple bits of code that do mundane code generation for us!
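The tag scheme described here does not exist for C/C++, but its mechanics are easy to imitate. The following is a rough sketch in Python, assuming a hypothetical expand function: code inside <% ... %> is executed at "build time", and <%= ... %> is replaced by the value of its expression. To keep the sketch non-interactive, the number is assigned directly instead of being read from a prompt.

```python
import re

def expand(template, env):
    """Run <% ... %> blocks now; splice the value of <%= ... %> into the text."""
    def run(match):
        is_expression, body = match.group(1) == "=", match.group(2).strip()
        if is_expression:
            return str(eval(body, env))   # value is placed into the output
        exec(body, env)                   # statement runs, leaves no output
        return ""
    # re.sub calls `run` for each tag from left to right, so earlier
    # blocks can define names that later tags use.
    return re.sub(r"<%(=?)(.*?)%>", run, template, flags=re.S)

template = ('<% n = 5 %>'
            'for(int i = 0; i < <%= n %>; i++) { cout << "hello" << endl; }')
generated = expand(template, {})
print(generated)
```

The output is ordinary C++ source that would then go to the real compiler. Using eval and exec on template text is acceptable for a sketch like this one, but it is not something to run on untrusted input.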

Hello Lisp!
Everything we've learned about Lisp so far can be summarized by a single statement: Lisp is executable XML with a friendlier syntax. We haven't said a single word about how Lisp actually operates. It's time to fill this gap.

Lisp has a number of built in data types. Integers and strings, for example, aren't much different from what you're used to. The meaning of 71 or "hello" is roughly the same in Lisp as in C++ or Java. What is of more interest to us are symbols, lists, and functions. I will spend the rest of this section describing these data types as well as how a Lisp environment compiles and executes the source code you type into it (this is called evaluation in Lisp lingo). Getting through this section in one piece is important for understanding the true potential of Lisp's metaprogramming, the unity of code and data, and the notion of domain specific languages. Don't think of this section as a chore though, I'll try to make it fun and accessible. Hopefully you can pick up a few interesting ideas on the way. Ok. Let's start with Lisp's symbols.

A symbol in Lisp is roughly equivalent to C++ or Java's notion of an identifier. It's a name you can use to access a variable (like currentTime, arrayCount, n, etc.). The difference is that a symbol in Lisp is a lot more liberal than its mainstream identifier alternative. In C++ or Java you're limited to alphanumeric characters and an underscore. In Lisp, you are not. For example + is a valid symbol. So is -, =, hello-world, hello+world, *, etc. (you can find the exact definition of valid Lisp symbols online). You can assign to these symbols any data-type you like. Let's ignore Lisp syntax and use pseudo-code for now. Assume that a function set assigns some value to a symbol (like = does in Java or C++). The following are all valid examples:
set(test, 5)            // symbol 'test' will equal an integer 5
set(=, 5)               // symbol '=' will equal an integer 5
set(test, "hello")      // symbol 'test' will equal a string "hello"
set(test, =)            // at this point symbol '=' is equal to 5
                        // therefore symbol 'test' will equal to 5
set(*, "hello")         // symbol '*' will equal a string "hello"

"t this point somethin must smell wron ! If we can assi n strin s and inte ers to symbols li#e #, how does Lisp do multiplication. "fter all, # means multiply, ri ht. 0he answer is pretty simple! 6unctions in Lisp aren*t special! 0here is a data2type, function, 5ust li#e inte er and strin , that you assi n to symbols! " multiplication function is built into Lisp and is assi ned to a symbol #! Aou can reassi n a different value to # and you*d lose the multiplication function! 7r you can store the value of the

function in some other variable! " ain, usin pseudo2code+


*(3, 4)          // multiplies 3 by 4, resulting in 12
set(temp, *)     // symbol '*' is equal to the multiply function
                 // so temp will equal to the multiply function
set(*, 3)        // sets symbol '*' to equal to 3
*(3, 4)          // error, symbol '*' no longer equals to a function
                 // it's equal to 3
temp(3, 4)       // temp equals to a multiply function
                 // so Lisp multiplies 3 by 4 resulting in 12
set(*, temp)     // symbol '*' equals multiply function again
*(3, 4)          // multiplies 3 by 4, resulting in 12

You can even do wacky stuff like reassigning plus to minus:
set(+, -)      // the value of '-' is a built in minus function
               // so now symbol '+' equals to a minus function
+(5, 4)        // since symbol '+' is equal to the minus function
               // this results in 1
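The pseudo-code above can be approximated in Python by treating the set of symbols as a dictionary. This is only a sketch of the idea, not how any particular Lisp stores its symbols; the names env, set_symbol and call are made up for the illustration.

```python
import operator

# The environment maps symbol names to values; functions are just values.
env = {"*": operator.mul, "+": operator.add, "-": operator.sub}

def set_symbol(name, value):
    """Bind a symbol to any value: a number, a string, or a function."""
    env[name] = value

def call(name, *args):
    """Look the symbol up and call whatever it is currently bound to."""
    value = env[name]
    if not callable(value):
        raise TypeError(f"symbol {name!r} is bound to {value!r}, not a function")
    return value(*args)

results = [call("*", 3, 4)]            # multiply function is bound to '*'
set_symbol("temp", env["*"])           # temp now holds the multiply function
set_symbol("*", 3)                     # '*' is now just the integer 3
try:
    call("*", 3, 4)
except TypeError:
    results.append("error")            # '*' no longer names a function
results.append(call("temp", 3, 4))     # still works through temp
set_symbol("*", env["temp"])           # restore '*'
results.append(call("*", 3, 4))
set_symbol("+", env["-"])              # reassign plus to minus
results.append(call("+", 5, 4))
print(results)  # [12, 'error', 12, 12, 1]
```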

I've used functions quite liberally in these examples but I didn't describe them yet. A function in Lisp is just a data-type like an integer, a string, or a symbol. A function doesn't have a notion of a name like in Java or C++. Instead, it stands on its own. Effectively it is a pointer to a block of code along with some information (like the number of parameters it accepts). You only give the function a name by assigning it to a symbol, just like you assign an integer or a string. You can create a function by using a built in function for creating functions, assigned to a symbol 'fn'. Using pseudo-code:
fn [a]
{
    return *(a, 2);
}

This returns a function that takes a single parameter named 'a' and doubles it. Note that the function has no name but you can assign it to a symbol:
set(times-two, fn [a] { return *(a, 2); })

We can now call this function+


times-two(5)   // returns 10
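Python's lambda behaves much like the fn described above, so it can serve as a rough analogy: the function value exists without a name, and any number of symbols can be bound to it afterwards. The dictionary env below is a stand-in for Lisp's symbols and is an assumption of this sketch.

```python
env = {}

# The lambda itself is nameless; binding it to a symbol gives us a way to call it.
env["times-two"] = lambda a: a * 2
env["twice"] = env["times-two"]        # a second symbol for the same function

print((lambda a: a * 2)(5))            # called without ever being named: 10
print(env["times-two"](5))             # 10
print(env["twice"] is env["times-two"])  # True, both symbols hold one function
print(env["times-two"].__name__)       # <lambda>
```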

Now that we went over symbols and functions, what about lists? Well, you already know a lot about them. Lists are simply pieces of XML written in s-expression form. A list is specified by parentheses and contains Lisp data-types (including other lists) separated by a space. For example (this is real Lisp, note that we use semicolons for comments now):
()                      ; an empty list
(1)                     ; a list with a single element, 1
(1 "test")              ; a list with two elements
                        ; an integer 1 and a string "test"
(test "hello")          ; a list with two elements
                        ; a symbol test and a string "hello"
(test (1 2) "hello")    ; a list with three elements, a symbol test
                        ; a list of two integers 1 and 2
                        ; and a string "hello"

When a Lisp system encounters lists in the source code it acts exactly like Ant does when it encounters XML - it attempts to execute them. In fact, Lisp source code is only specified using lists, just like Ant source code is only specified using XML. Lisp executes lists in the following manner. The first element of the list is treated as the name of a function. The rest of the elements are treated as the function's parameters. If one of the parameters is another list it is executed using the same principles and the result is passed as a parameter to the original function. That's it. We can write real code now:
(* 3 4)                 ; equivalent to pseudo-code *(3, 4).
                        ; Symbol '*' is a function
                        ; 3 and 4 are its parameters.
                        ; Returns 12.
(times-two 5)           ; returns 10
(3 4)                   ; error: 3 is not a function
(times-two)             ; error, times-two expects one parameter
(times-two 3 4)         ; error, times-two expects one parameter
(set + -)               ; sets symbol '+' to be equal to whatever symbol '-'
                        ; equals to, which is a minus function
(+ 5 4)                 ; returns 1 since symbol '+' is now equal
                        ; to the minus function
(* 3 (* 2 2))           ; multiplies 3 by the second parameter
                        ; (which is a function call that returns 4).
                        ; Returns 12.
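The evaluation rule just described is small enough to sketch in a few lines of Python. This is an illustration under simplifying assumptions, not a real Lisp: nested Python lists stand in for s-expressions, a hypothetical Symbol class marks names so they are not confused with string literals, and set is handled as a special case because its first argument is a name rather than a value.

```python
import operator

class Symbol(str):
    """A name to look up, as opposed to a string literal such as "hello"."""

S = Symbol

def evaluate(expr, env):
    if isinstance(expr, Symbol):            # symbol: look up its current value
        return env[expr]
    if isinstance(expr, list) and expr:     # non-empty list: a function call
        if expr[0] == "set":                # special case: (set name value)
            _, name, value = expr
            env[name] = evaluate(value, env)
            return env[name]
        fn = evaluate(expr[0], env)
        if not callable(fn):
            raise TypeError(f"{expr[0]!r} is not a function")
        args = [evaluate(arg, env) for arg in expr[1:]]  # nested lists run first
        return fn(*args)
    return expr                             # numbers, strings and () are themselves

env = {"*": operator.mul, "+": operator.add, "-": operator.sub,
       "times-two": lambda a: a * 2}

program = [
    [S("*"), 3, 4],                  # (* 3 4)
    [S("times-two"), 5],             # (times-two 5)
    [S("*"), 3, [S("*"), 2, 2]],     # (* 3 (* 2 2))
    [S("set"), S("+"), S("-")],      # (set + -)
    [S("+"), 5, 4],                  # (+ 5 4)
]
results = [evaluate(form, env) for form in program]
print(results[0], results[1], results[2], results[4])  # 12 10 12 1
```

Evaluating `[3, 4]` raises a TypeError, since 3 is not a function, and calling times-two with no arguments or with two fails inside Python's own argument check.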

Note that so far every list we've specified was treated by a Lisp system as code. But how can we treat a list as data? Again, imagine an Ant task that accepts XML as one of its parameters. In Lisp we do this using a quote operator ' like so:
(set test '(1 2))       ; test is equal to a list of two integers, 1 and 2
(set test (1 2))        ; error, 1 is not a function
(set test '(* 3 4))     ; sets test to a list of three elements,
                        ; a symbol *, an integer 3, and an integer 4

We can use a built in function head to return the first element of the list, and a built in function tail to return the rest of the list's elements:
(head '(* 3 4))         ; returns a symbol '*'
(tail '(* 3 4))         ; returns a list (3 4)
(head (tail '(* 3 4)))  ; (tail '(* 3 4)) returns a list (3 4)
                        ; and (head '(3 4)) returns 3.
(head test)             ; test was set to a list in previous example
                        ; returns a symbol '*'
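A quoted list is simply a list that has not been executed yet, which means a program can take it apart or change it before running it. Here is a small Python sketch of that idea, with a plain Python list standing in for the quoted form and a hypothetical run helper that only handles flat lists.

```python
import operator

def head(lst):
    """Return the first element of a list."""
    return lst[0]

def tail(lst):
    """Return everything after the first element."""
    return lst[1:]

# Stands in for the quoted list '(* 3 4): plain data, nothing has been executed.
quoted = ["*", 3, 4]

OPERATORS = {"*": operator.mul, "+": operator.add}

def run(lst):
    """Execute a flat list by applying its head to its tail."""
    return OPERATORS[head(lst)](*tail(lst))

rewritten = ["+"] + tail(quoted)     # swap the operator before running anything

print(head(quoted), tail(quoted), head(tail(quoted)))  # * [3, 4] 3
print(run(quoted), run(rewritten))                     # 12 7
```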

You can think of built in Lisp functions as you think of Ant tasks. The difference is that we don't have to extend Lisp in another language (although we can), we can extend it in Lisp itself as we did with the times-two example. Lisp comes with a very compact set of built in functions - the necessary minimum. The rest of the language is implemented as a standard library in Lisp itself.

Lisp Macros
So far we've looked at metaprogramming in terms of a simple templating engine similar to JSP. We've done code generation using simple string manipulations. This is generally how most code generation tools go about doing this task. But we can do much better. To get on the right track, let's start off with a question. How would we write a tool that automatically generates Ant build scripts by looking at source files in the directory structure?

We could take the easy way out and generate Ant XML by manipulating strings. Of course a much more abstract, expressive and extensible way is to work with XML processing libraries to generate XML nodes directly in memory. The nodes can then be serialized to strings automatically. Furthermore, our tool would be able to analyze and transform existing Ant build scripts by loading them and dealing with the XML nodes directly. We would abstract ourselves from strings and deal with higher level concepts which let us get the job done faster and easier.

Of course we could write Ant tasks that allow dealing with XML transformations and write our generation tool in Ant itself. Or we could just use Lisp. As we saw earlier, a list is a built in Lisp data structure and Lisp has a number of facilities for processing lists quickly and effectively (head and tail being the simplest ones). Additionally Lisp has no semantic constraints - you can have your code (and data) have any structure you want.

Metaprogramming in Lisp is done using a construct called a "macro". Let's try to develop a set of macros that transform data like, say, a to-do list (surprised?), into a language for dealing with to-do lists.

Let's recall our to-do list example. The XML looks like this:
<todo name="housework">
    <item priority="high">Clean the house.</item>
    <item priority="medium">Wash the dishes.</item>
    <item priority="medium">Buy more soap.</item>
</todo>

The corresponding s-expression version looks like this:


(todo "housework"
    (item (priority high) "Clean the house.")
    (item (priority medium) "Wash the dishes.")
    (item (priority medium) "Buy more soap."))

Suppose we're writing a to-do manager application. We keep our to-do items serialized in a set of files and when the program starts up we want to read them and display them to the user. How would we do this with XML and some other language (say, Java)? We'd parse our XML files with the to-do lists using some XML parser, write the code that walks the XML tree and converts it to a Java data structure (because frankly, processing DOM in Java is a pain in the neck), and then use this data structure to display the data. Now, how would we do the same thing in Lisp?

If we were to adopt the same approach we'd parse the files using Lisp libraries responsible for parsing XML. The XML would then be presented to us as a Lisp list (an s-expression) and we'd walk the list and present relevant data to the user. Of course if we used Lisp it would make sense to persist the data as s-expressions directly as there's no reason to do an XML conversion. We wouldn't need special parsing libraries since data persisted as a set of s-expressions is valid Lisp and we could use the Lisp compiler to parse it and store it in memory as a Lisp list. Note that the Lisp compiler (much like the .NET compiler) is available to a Lisp program at runtime.

But we can do better. Instead of writing code to walk the s-expression that stores our data we could write a macro that allows us to treat data as code! How do macros work? Pretty simple, really. Recall that a Lisp function is called like this:
(function-name arg1 arg2 arg3)

Where each argument is a valid Lisp expression that's evaluated and passed to the function. For example if we replace arg1 above with (+ 4 5), it will be evaluated and 9 would be passed to the function. A macro works the same way as a function, except its arguments are not evaluated.
(macro-name (+ 4 5))

In this case, (+ 4 5) is not evaluated and is passed to the macro as a list. The macro is then free to do what it likes with it, including evaluating it. The return value of a macro is a Lisp list that's treated as code. The original place with the macro is replaced with this code. For example, we could define a macro plus that takes two arguments and puts in the code that adds them.

What does it have to do with metaprogramming and our to-do list problem? Well, for one, macros are little bits of code that generate code using a list abstraction. Also, we could create macros named to-do and item that replace our data with whatever code we like, for instance code that displays the item to the user.

What benefits does this approach offer? We don't have to walk the list. The compiler will do it for us and will invoke appropriate macros. All we need to do is create the macros that convert our data to appropriate code!

For example, a macro similar to our triple C macro we showed earlier looks like this:
(defmacro triple (x)
    '(+ ~x ~x ~x))

The quote prevents evaluation while the tilde allows it. Now every time triple is encountered in Lisp code:
(triple 4)

it is replaced with the followin code+


(+ 4 4 4)
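Seen from the outside, a macro is a function from lists to lists that runs before evaluation. The following Python sketch imitates that expansion step only (nothing is evaluated). It assumes nested Python lists as s-expressions, with plain strings standing for symbols, and a hypothetical MACROS table that plays the role of defmacro.

```python
def triple(x):
    """Macro body: receives its argument unevaluated and returns new code."""
    return ["+", x, x, x]

MACROS = {"triple": triple}

def expand(form):
    """Recursively replace every macro call in `form` with its expansion."""
    if not isinstance(form, list) or not form:
        return form                              # atoms and () are left alone
    first, *args = form
    if isinstance(first, str) and first in MACROS:
        return expand(MACROS[first](*args))      # arguments passed as raw lists
    return [expand(part) for part in form]

print(expand(["triple", 4]))
# ['+', 4, 4, 4]
print(expand(["*", 2, ["triple", ["+", 1, 2]]]))
# ['*', 2, ['+', ['+', 1, 2], ['+', 1, 2], ['+', 1, 2]]]
```

In the second call the argument (+ 1 2) is copied into the result three times as a list rather than as the number 3, which shows that the macro saw code, not a value.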

We can create macros for our to-do list items that will get called by the Lisp compiler and will transform the to-do list into code. Now our to-do list will be treated as code and will be executed. Suppose all we want to do is print it to standard output for the user to read:
(defmacro item (priority note)
    '(block
        (print stdout tab "Priority: " ~(head (tail priority)) endl)
        (print stdout tab "Note: " ~note endl endl)))
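To see the whole data-as-code idea run end to end, here is a sketch in Python under the same stand-in conventions (nested lists for s-expressions, strings for symbols). The item macro mirrors the one above; the todo macro, the block and print forms, and the run function are assumptions added so the example can execute, since the article does not define them.

```python
TODO = ["todo", "housework",
        ["item", ["priority", "high"], "Clean the house."],
        ["item", ["priority", "medium"], "Wash the dishes."],
        ["item", ["priority", "medium"], "Buy more soap."]]

def item_macro(priority, note):
    # `priority` arrives unevaluated, e.g. ["priority", "high"].
    return ["block",
            ["print", "\tPriority: ", priority[1]],
            ["print", "\tNote: ", note]]

def todo_macro(name, *items):
    # Hypothetical wrapper: the article only shows the `item` macro.
    return ["block", ["print", "To-do: ", name], *items]

MACROS = {"item": item_macro, "todo": todo_macro}

def expand(form):
    """Expand one macro call; anything else is returned unchanged."""
    if (isinstance(form, list) and form
            and isinstance(form[0], str) and form[0] in MACROS):
        return MACROS[form[0]](*form[1:])
    return form

def run(form, out):
    """Execute expanded code, appending each printed line to `out`."""
    operator_name, *args = expand(form)
    if operator_name == "block":
        for sub_form in args:
            run(sub_form, out)
    elif operator_name == "print":
        out.append("".join(args))
    else:
        raise ValueError(f"unknown form: {operator_name!r}")

lines = []
run(TODO, lines)
print("\n".join(lines))
```

Nothing here walks the to-do structure by hand: the data itself names the macros that handle it, and the generic run loop does the rest.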

We've just created a very small and limited language for managing to-do lists embedded in Lisp. Such languages are very specific to a particular problem domain and are often referred to as domain specific languages or DSLs.

Domain Specific Languages


In this article we've already encountered two domain specific languages: Ant (specific to dealing with project builds) and our unnamed mini-language for dealing with to-do lists. The difference is that Ant was written from scratch using XML, an XML parser, and Java while our language is embedded into Lisp and is easily created within a couple of minutes.

We've already discussed the benefits of DSLs, mainly why Ant is using XML, not Java source code. Lisp lets us create as many DSLs as we need for our problem. We can create domain specific languages for creating web applications, writing massively multiplayer games, doing fixed income trading, solving the protein folding problem, dealing with transactions, etc. We can layer these languages on top of each other and create a language for writing web-based trading applications by taking advantage of our web application language and bond trading language. Every day we'd reap the benefits of this approach, much like we reap the benefits of Ant.

Using DSLs to solve problems results in much more compact, maintainable, flexible programs. In a way we create them in Java by creating classes that help us solve the problem. The difference is that Lisp allows us to take this abstraction to the next level: we're not limited by Java's parser. Think of writing build scripts in Java itself using some supporting library. Compare it to using Ant. Now apply this same comparison to every single problem you've ever worked on and you'll begin to glimpse a small share of the benefits offered by Lisp.

What's next?
Learning Lisp is an uphill battle. Even though in Computer Science terms Lisp is an ancient language, few people to date have figured out how to teach it well enough to make it accessible. Despite great efforts by many Lisp advocates, learning Lisp today is still hard. The good news is that this won't remain the case forever since the amount of Lisp-related resources is rapidly increasing. Time is on Lisp's side.

Lisp is a way to escape mediocrity and to get ahead of the pack. Learning Lisp means you can get a better job today, because you can impress any reasonably intelligent interviewer with fresh insight into most aspects of software engineering. It also means you're likely to get fired tomorrow because everyone is tired of you constantly mentioning how much better the company could be doing if only its software was written in Lisp. Is it worth the effort? Everyone who has ever learned Lisp says yes. The choice, of course, remains yours.
