
Anatomy of a Paper: Part I, Inspiration

By Sean Carroll | July 30, 2007 11:43 am

How does theoretical physics get done? I had my first exposure to research doing observational astronomy as an undergrad; it was fascinating, following the process all the way from spending freezing nights at the telescope collecting photons, to reducing the data, seeing what the light curves taught you about the stars, to finally writing a paper. But I knew all along that I really wanted to be a theorist. Looking at those papers with their incomprehensible Greek indices filled me with anticipation for the day it would all finally make sense. (Eventually you realize that more and more of it does make sense, but it never all makes sense, or anywhere close. Most of your time is spent thinking about the parts you don't understand.) But I had no idea how such papers were actually produced: where did you start?

When I was looking at grad schools, I took the train up to Princeton to visit the physics department and knock on people's doors, rather less planned out than I would advise anyone else to do. (You couldn't Google people back then.) I found one guy who was sitting in his office, a faint smell of cigar smoke in the background, scribbling equations on a legal pad. Looked promising. I introduced myself and asked a few silly questions, among which was "How do you do research?" He leaned back, propping his sneaker-clad feet onto the desk, fixed me with a look and said "I don't know. You just have an idea, and then do research about it." As advice goes, it was more Delphic than practical. I didn't know at the time that this guy would later be my boss for a while, and eventually win the Nobel Prize.

So I thought it would be fun to describe the process in a bit more detail, using a worked example. It is no exaggeration to say that every paper is different, but there might be some useful lessons in there somewhere.
I recently finished a paper with Lotty Ackerman and Mark Wise that is a pretty canonical example: a solid paper, not something earth-shattering that will change the face of science as we know it, but a meaningful contribution with some good ideas and some useful equations. Well, it was recently finished when I began writing this monstrously long post, which by now was many months ago. So I've decided to divide it into pieces; this will be the first of a three-part series.

Lotty is a grad student here at Caltech; she had previously worked with Mark, who is a respectable particle theorist in the office next to mine. He knew that she was cosmologically inclined, so he introduced Lotty and me to each other even before I officially arrived. I suggested to Lotty that we begin to think about density perturbations in inflation (the hypothetical period of accelerated expansion in the early universe), as much because I wanted to learn more about the subject as for any more focused research goal. I'm not the best advisor in the world; I have lots of ideas, but they inevitably start out rather ill-formed, and most of them stay that way. Occasionally one of them coalesces out of the fog into something substantial, and a paper gets written. It's a harrowing way to operate, especially from the grad-student perspective.

One day Lotty was having lunch with Jonathan Pritchard, another grad student here, and they wondered out loud what would happen if inflation didn't happen the same way in every direction in space. That is, what the consequences would be if there were some direction picked out throughout the universe, so that inflation occurred at a different rate (or something) parallel to that direction than perpendicular to it. Presumably there would be something different we could observe about the density fluctuations if we looked along that particular direction than if we looked in another direction, but what exactly? How could we tell? And is there some physical mechanism we could imagine introducing that would actually pick out a direction during inflation, and then (just to keep things simple) disappear afterwards so that we wouldn't notice it today? Don't ask me why they thought of it. Just the kind of thing you chat about at lunch all the time, if you happen to be a theoretical cosmologist.

This kind of meandering speculation is one way papers get started. You (if you're like me; I can't speak for other people) never sit down and say, "Let's have an idea." Some people are fortunate enough to have programmatic, focused research agendas; when I was a postdoc at MIT in the early Nineties, Ed Bertschinger had collected around him an amazing set of postdocs and grad students, all focused on understanding temperature anisotropies in the cosmic microwave background and what they could tell us about the universe. It was a great moment to be thinking about those issues, and a lot of those students are now high-powered faculty members with groups of their own. But most theorists are not quite so systematic. You noodle over problems, talk to other people with similar interests (or complementary skill sets), make connections between different ideas. Occasionally a flash of insight will hit just before you fall asleep, or while you're waiting for the barista to make your latte.
(I should make clear that this particular "What if?" question is not completely unmotivated speculation. Inflation is a great theory, and is likely to be right in some yet-to-be-defined sense, but it's not something that anyone should think we more or less understand. We're extrapolating well beyond known physics, so it pays to keep an open mind. One way of forcing yourself to keep an open mind is to ask specific and testable questions about the space of possibilities encompassed by your ideas.)

We're passing over pretty quickly what is the key step in the whole process of paper-writing: asking a good question, and equally importantly, realizing that it's an interesting question, and that there is a way to answer it. The rest is the straightforward part, staying up late and solving equations. Unfortunately, there's no known way to formalize this process of recognizing good questions. You'd be surprised at how often, once you've had your basic training, you read someone else's paper and think "I totally should have thought of that first." But you didn't.

A quick glance at papers appearing on arxiv.org shows that the vast majority of them are collaborative, rather than single-author. That's a reflection both of how ideas arise in the first place (friends chit-chatting over lunch, or via email, or at conferences) and how the work actually gets done; often enough, one person will have an idea but someone else will have the expertise necessary to bring it to fruition. I will never understand how people can suggest replacing conferences or seminar visits with talks broadcast over the internet. That's like trying to improve a restaurant experience by making sure the plates and cutlery are really shiny, and doing away with the food entirely. Conferences aren't about talks, although those are occasionally interesting. (David Lodge, in Small World, holds up the ideal conference as one in which there aren't any talks at all.) They're about the ongoing low-level interaction between the participants at meals and coffee breaks. That's where the ideas get created! Then you can each go home and apply yourself to the nitty-gritty work of turning those ideas into papers.

Speaking of which: the answer to the inflation-with-a-preferred-direction question wasn't obvious, so Lotty asked Mark about it. (Who knows where I was; off traveling, probably.) He didn't know either, but it sounded like an interesting question.
So (as one will do) he started scribbling down some models of inflation that might behave that way. Basically, trying to invent a way to allow the negative pressure associated with the inflaton field (the hypothetical field whose energy drives the hypothetical accelerated expansion) to be direction-dependent. We have some general pre-existing ideas about how inflation might conceivably work, and a good field theorist has a bag full of models that can be shaped into different forms depending on the problem under consideration, so it was a matter of asking how easy it would be to tweak those models to give them a preferred direction.

When I did eventually drop by my office, Mark mentioned the idea to me. It sounded interesting, but I didn't have anything insightful to add off the top of my head. But that afternoon there was a physics colloquium, during which my mind wandered, and I started thinking of different ways the inflaton might get a direction-dependent pressure. After the talk, I went to Mark's office to say "Your idea is crazy, but here's an idea that might work." The next day, Mark gathered Lotty and me into his office to explain why my idea was crazy, but he had a new idea that might work. That process continued for a while, back and forth between the three of us: suggesting models, finding reasons why they should be discarded, realizing that a previously-discarded model might be able to sidestep the previous objections, and so on.

Along the way, we had the good idea of thinking about the problem in a completely phenomenological, model-independent way. In other words, there is one particular final product that we get out of inflation: a power spectrum that tells us how strong the density fluctuations are at any given length scale. From this there is a well-understood procedure to predict temperature fluctuations in the cosmic microwave background, which are the most directly observable consequences of the primordial density fluctuations. So, the phenomenological approach is to forget about particular models of inflation, and simply ask what kind of impact a preferred direction could possibly have on the power spectrum (and thus the CMB).

In principle, the answer is "all sorts of impacts." The perturbations are generally described in terms of an amplitude defined at each length scale and each direction on the sky. (More technically, we express the power spectrum in Fourier space as a function of the wavevector.) In the usual description, every direction is deep down the same as every other, so really the power spectrum is just a function of the length scale. Even better, to a good approximation inflation predicts that the amplitude of the fluctuations should be scale-invariant: a constant value, the same at every length. So really the complete power spectrum is specified by just one number! That's the amplitude of the primordial density fluctuations. Having only one parameter makes your theory extremely predictive, which is why we can squeeze so much information out of the data we get from WMAP, for example. (Of course, we immediately start adding new parameters, but that's another story.)

But now that we have a preferred direction, we can imagine that the perturbation amplitude really does depend on the direction we look in on the sky. (The fluctuations might be a bit stronger [or weaker] if we happen to be looking exactly along the direction that was picked out as special during inflation, in other words.) Furthermore, in principle it could have a different impact at every different length scale! So, not very predictive.
On the other hand, there's a good physical reason why the perturbations from ordinary inflation are scale-invariant: the process of inflation itself is basically the same during most of its duration. While inflation is going on, the universe is expanding at an approximately constant rate, and stretching tiny quantum fluctuations into large-scale density perturbations. Because the process of inflation is uniform, the amplitude of the resulting perturbations is (basically) uniform. Therefore, we should (even in the absence of any particular model) be able to apply similar reasoning once we stick in a preferred direction. Our idea was that there would be some violation of rotational invariance (either inflation would happen slightly more rapidly along some particular direction, or the decay of the inflaton field into ordinary matter and radiation would be more efficient along some direction, or whatever) but it would have a constant magnitude while inflation was happening. So we should expect that the new effect (which we were imagining to be small, given that it's certainly not bloody obvious in the existing CMB observations) would also be scale-invariant!

That makes life much simpler. We're now suggesting that, instead of the primordial perturbations just having a single amplitude that is independent of both direction and length scale, there is a tiny extra modulation of the amplitude that depends on one new pure number (to specify how big the effect is) and one direction on the sky (corresponding to the preferred direction). In other words, three new parameters: one magnitude of the effect, and one position on the sky (specified by right ascension and declination, for example).
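The modulation described above can be written compactly. What follows is a sketch, not a formula quoted from the post: it assumes the leading direction dependence is quadratic in the cosine of the angle between the wavevector and the preferred axis (a term linear in that cosine would change sign under k → -k, which the power spectrum of a real field cannot do). The symbols g_* (the "pure number") and n-hat (the preferred direction) are labels introduced here for illustration.

```latex
P'(\mathbf{k}) \;=\; P(k)\,\Bigl[\,1 + g_*\,\bigl(\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}\bigr)^{2}\Bigr],
\qquad g_* \ \text{independent of}\ k .
```

Here P(k) is the usual isotropic spectrum, and the statement that the new effect is scale-invariant is the statement that g_* does not depend on k. The three new parameters are then g_* together with the two angles that fix the unit vector n-hat on the sky.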

In the exciting cliffhanger that was Part One, we saw how the idea behind a paper came to be nurtured from a meandering speculation into a somewhat well-defined calculational question. In particular, Lotty Ackerman and Mark Wise and I were asking what would happen if there were a preferred direction during inflation: an axis in the sky along which primordial perturbations were just a little bit different than in the perpendicular plane. We guessed, even in the absence of a specific model, that such a statistical anisotropy would show up as a nearly scale-invariant modulation of the power spectrum. Now we need to turn such ideas into something more concrete.

In fact, our phenomenological guess was enough to go and start calculating how this new effect will show up on the CMB, and we all set about doing exactly that. None of us (Mark, Lotty, and I) are really experts at this sort of thing, but that's why they make books and review articles. (Without Scott Dodelson's book, I would have been in trouble.) As it turns out, many years ago Mark had written one of the very first papers on deriving CMB anisotropies from inflationary perturbations, so he had a head start on calculating things. But the analysis that he and Larry Abbott had done way back when had concentrated on the gravitational redshift/blueshift of the CMB (the Sachs-Wolfe effect), which is only the most important contribution on large angular scales. Lotty and I realized that we should be able to calculate the effect at every scale all at once, which turned out to be right. It's true that messy astrophysical effects (acoustic oscillations) become important at medium and small scales, and it would take a real cosmologist to understand them. But all we were doing was changing the initial amplitude of the perturbations, in a direction-dependent way.
The eventual effect is simply a product of the initial amplitude and a transfer function that encodes the messy fluid dynamics once and for all; since our new primordial power spectrum left the transfer function unaffected, we didn't have to worry about it. (More generally, Lotty and I were full contributors when it came to ideas, but Mark is very fast when it comes to calculations. We would have to occasionally distract him with something shiny while we sat down to catch up with the equations.)
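Schematically, the "initial amplitude times transfer function" statement can be sketched as below. Overall normalization is deliberately left as a proportionality, since conventions differ between references. The symbols are labels assumed here rather than taken from the post: Θ_ℓ(k) stands for the transfer function, P'(k) for a possibly direction-dependent primordial spectrum, and a_ℓm for the spherical-harmonic coefficients of the temperature map.

```latex
\bigl\langle a_{\ell m}\, a^{*}_{\ell' m'} \bigr\rangle \;\propto\;
(-i)^{\ell-\ell'} \int d^{3}k \;\, P'(\mathbf{k})\;
\Theta_{\ell}(k)\,\Theta_{\ell'}(k)\;
Y^{*}_{\ell m}(\hat{\mathbf{k}})\, Y_{\ell' m'}(\hat{\mathbf{k}})
```

If P' depends only on the magnitude of k, orthonormality of the spherical harmonics collapses the angular integral to δ_ℓℓ' δ_mm', leaving the familiar C_ℓ ∝ ∫ dk k² P(k) Θ_ℓ(k)². A direction-dependent P' leaves Θ_ℓ untouched, but the angular integral can now couple different ℓ and m, which is why the standard calculation has to be redone rather than reused.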

So we read up on calculating CMB anisotropies, and applied it to our model. Since everyone usually assumes that all directions are created equal, we couldn't simply plug and chug; we had to re-do the usual calculations from the start, keeping the extra degree of complexity introduced by our preferred direction. That provided a good excuse to educate ourselves about some of the nitty-gritty involved in turning primordial density perturbations into a signal on the CMB sky. In particular, we had to play with spherical harmonics, which are the conventional way to encode information spread over a sphere (for example, the temperature of the microwave background as a function of position on the sky). Every good physicist knows the basic properties of spherical harmonics, but we had to do some particular integrals that were not that common. I don't know about you, but when I'm faced with a nontrivial integral, I try Mathematica first, ask questions later. But Mathematica didn't know these integrals, so actual work was required. At some point it dawned on me that we could use a recursion equation relating one spherical harmonic to a set of others to turn the integral into something doable. No special points for me; my collaborators figured it out independently. Still, it's always fun to crack a knotty calculational problem.

A few amusing footnotes to the recursion-equation episode. First footnote: I figured it out while sampling a martini at the Hilton Checkers lounge in downtown L.A. This was last fall, while I was still relatively new to the area, and was spending time checking out the various local establishments. Verdict: a pretty good martini, I must say. The bartender was intrigued by all the equations I was happily scribbling, and asked me what was going on. I explained just a bit about the CMB etc., and she was genuinely interested. But then, alas, she mentioned something about astrology. So I had to explain that this was actually very different etc. I got the impression that she ultimately did appreciate the difference between astronomy and astrology, once it was laid right out there. Now if only we could replace the horoscopes in daily newspapers with charts of the night sky.

Second footnote: it is one thing to think "maybe a recursion equation would be useful here"; it's another thing to actually remember the damn equation. I'm generally not a good equation-rememberer, and I wasn't lugging any reference books with me. (I was in a bar, remember?) But I was lugging my laptop with me, and there was wireless internet. So naturally I looked up the equation on Wikipedia, and there it was! I checked it against some more conventionally reliable resource once I got home, but the Wikipedia page was perfectly accurate. (Nobody finds it worthwhile to vandalize pages on special mathematical functions.)
Final footnote: something interesting is revealed about the nature of a technical education. The point is, I didn't know much about the particular recursion equation that I ended up using; I wasn't sure that an appropriate equation even existed. There was just a vague feeling in my mind that Legendre polynomials (which I did know how to relate to spherical harmonics) were the kind of thing that probably obeyed some recurrence relations. But the last time that I actually had dealt directly with Legendre polynomials, if ever, was most likely when I took an undergraduate class in quantum mechanics or mathematical physics, a good twenty years (half my life) ago. In other words, my physics education worked exactly as it is supposed to: it stuck a vague idea into my head that persisted undisturbed for a couple of decades, that was there when I needed it, and provided me with just enough information that I knew where to turn when the occasion arose.

This is one of the reasons I feel such antipathy toward GREs and grad-school qualifying exams and the like: they set up a testing environment that bears absolutely no relationship to the way that real research is done, and end up valorizing a certain kind of cleverness and calculational speed over real insight and creativity. On the other hand, they do provide a way to quantify something, even if it's not something very important, and we can then proceed to deploy these scientific-looking numbers to separate the men from the sheep, secure in the knowledge that our quantifications are highly precise, if completely inaccurate. Okay, rant over.
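The post does not say which recurrence was actually used, so the following is only an illustration of the general trick, using the standard Bonnet recurrence for Legendre polynomials, (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x). Rearranged, it says that multiplying P_n by x just produces a combination of its two neighbors, so an integral that looks nontrivial reduces to orthogonality. The function names below are hypothetical, and a brute-force Simpson rule stands in for "trying Mathematica first."

```python
def legendre_values(n_max, x):
    """Return [P_0(x), ..., P_n_max(x)] built with Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    values = [1.0, x]  # P_0 and P_1 seed the recurrence
    for n in range(1, n_max):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]


def simpson(f, a, b, intervals=2000):
    """Composite Simpson rule on [a, b]; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def x_overlap_numeric(n, m):
    """Integral of x * P_n(x) * P_m(x) over [-1, 1], by brute force."""
    def integrand(x):
        p = legendre_values(max(n, m), x)
        return x * p[n] * p[m]
    return simpson(integrand, -1.0, 1.0)


def x_overlap_from_recurrence(n, m):
    """Same integral, with no integration at all.

    The recurrence gives x P_n = ((n + 1) P_{n+1} + n P_{n-1}) / (2n + 1),
    and orthogonality gives: integral of P_a P_b = 2 / (2a + 1) if a == b, else 0.
    """
    if m == n + 1:
        return 2.0 * (n + 1) / ((2 * n + 1) * (2 * n + 3))
    if m == n - 1:
        return 2.0 * n / ((2 * n - 1) * (2 * n + 1))
    return 0.0  # every other pair is killed by orthogonality


if __name__ == "__main__":
    for n in range(4):
        for m in range(4):
            print(n, m, x_overlap_numeric(n, m), x_overlap_from_recurrence(n, m))
```

The closed form vanishes unless m = n ± 1. The analogous relation for spherical harmonics ties each one to its neighbors in ℓ in the same way, which is what turns a direction-dependent integral into something doable.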

So we all managed to turn an interesting collection of cranks, showing how to convert a set of direction-dependent primordial density perturbations into a set of quantities one could observe in the cosmic microwave background. All in all, an impressive-looking bunch of equations resulted, and that was basically half of the paper we ended up writing. The other half dealt with the question we had started with in the first place: is there some compelling, or at least plausible, physical model that would actually lead to such perturbations?

Rotational invariance is a subset of Lorentz invariance, a cornerstone of relativity and thus of all of modern physics. Fortunately, however, violating Lorentz invariance is one of the things I am especially good at. I've already blogged about a paper I wrote with Eugene Lim on the cosmological consequences of Lorentz-violating vector fields. The big difference is that Eugene and I, following in the footsteps of Ted Jacobson and David Mattingly and others, had taken advantage of the fact that the real universe already has a preferred cosmological reference frame: the one in which the CMB is statistically isotropic. We imagined that there were some hypothetical vector field that had a nonzero value in empty space, but which (basically) pointed along a timelike direction, orthogonal to hypersurfaces of constant cosmological time. What Lotty, Mark and I needed was a vector field that pointed in some preferred direction in space, i.e. a spacelike vector. But that's not so hard; just take the theory with a timelike vector and change some minus signs to plus signs. We didn't put too much tender loving care into constructing the world's most compelling theory, because it wasn't the theory that was our primary concern; it was the robust predictions that theories of that sort might end up making. But we were able to write down a model that seemed to have all the properties we wanted.
Within the assumptions of that model, we could make a very specific calculation of the predicted density perturbations, and compare them with the model-independent guess we had started with. There was pretty good agreement; our guess was that the perturbations would be basically scale-invariant, and the particular model we considered produced perturbations that only varied by about 10% over phenomenologically interesting scales.

I should mention that, while working on the vector-field idea, I found myself in another bar, this one across the puddle: a neighborhood pub in London. Guinness this time, not a martini. And wouldn't you know it, the bartender sees my equations spread out there and asks what it is I'm doing. (By the time I retire, every bartender in the Western hemisphere is going to have at least a passing acquaintance with the basics of contemporary cosmology.) This guy was really into it, and wanted to write down not just the title but also the ISBN number of the book I was reading. Since it was Dodelson's cosmology text, which is a gripping read but full of equations, I scribbled a short list of more accessible books he could check out, about which he seemed truly excited. Now if only the London pubs would stay open past ten p.m., we'd have an excellent situation all around.

After being inspired in Part One and sweating through some calculations in Part Two, we've assembled all the ingredients of a good paper. We have an interesting question: "What would happen if there were a preferred spatial direction during inflation?" We have suggested a robust answer (an expression for the generalized power spectrum of density fluctuations) and calculated its observable effects. And then we proposed one specific model, lending credence to the idea that this is a sensible scenario to contemplate. Next it's time to write the paper up, and then it's cocoa and schnapps all around.

Which we proceeded to do, of course. Except that, as we were writing, there was something nagging at the back of my brain. We were thinking like field theorists, coming up with an idea (a preferred direction during inflation) and exploring how it could be constrained by data. But weren't there people out there engaged in the converse: looking at the data and asking what it implies? Why, yes, there were. In fact, it gradually occurred to me, there was already a claim on the market that the actual CMB data were indicating a preferred direction in space! This had totally slipped my mind, in the excitement of exploring our little idea. (As the professional cosmologist of the collaboration, remembering such things was implicitly my job.)

The claim that there actually is evidence for a preferred direction in the CMB goes by the clever name of the "axis of evil." If one looks closely at the observed anisotropies on the very largest scales, two interesting facts present themselves. First, there is less anisotropy than one would expect, on very large angular scales. Second, and somewhat more controversially, the anisotropy that does exist seems to be oriented along a certain plane in the sky, defining a preferred direction perpendicular to that plane. This preferred direction has been dubbed the axis of evil.

Is the axis of evil real? That depends on what one means by "real." It does seem to be there in the data. On the other hand, maybe it's just a fluke. Nobody has a theory that predicts CMB anisotropy directly as a function of position on the sky; rather, theories like inflation probabilistically predict the amplitude of anisotropy on each angular scale.
But at each scale there are only a fixed number of independent observations one can make, implying an irreducible uncertainty in one's predictions; that was the original definition of "cosmic variance," before we re-purposed the phrase. For what it's worth, the actual plane in the sky defined by the large-scale anisotropy seems to coincide with the ecliptic, the plane in which the various planets orbit the Sun. Many people believe it's just some local effect, or an artifact of a particular way of reducing data, or just a fluke; to be honest, nobody knows.

What's relevant to the present discussion is that the very existence of the axis of evil phenomenon meant that other people had already been asking about preferred spatial directions in the CMB, even before our seminal work that didn't yet quite exist. This fact dawned on me in the middle of our writing, and I started digging through the A of E literature. Lo and behold, I found closely related calculations, especially in the work of Gumrukcuoglu, Contaldi, and Peloso. They had, in fact, derived a few of the equations of which we were justifiably proud. But not all of them! We had, in other words, been partially scooped, although not entirely so.

This is a remarkably frequent occurrence: you think you're working on some project for esoteric reasons that are of importance only to you, only to find that similar tendencies had been floating around in the air, either recently or some number of years prior. Occasionally the scoopage is so dramatic that you really have nothing new to add; in that case the only respectable thing is to suck it up and move on to another project. Very often, the overlap is noticeable but far from complete, and you still have something interesting to contribute; that turned out to be the case this time. So we soldiered on, giving credit in our paper to those who blazed trails before us, and highlighting those roads which we had traversed all by ourselves.

At the end of the process (from meandering speculation, focusing in on an interesting question, gathering the necessary technical tools, performing the relevant calculation, comparing with the existing literature, and finally writing up the useful results) you have a paper. Considering all the work you have put into it, the actual paper is annoyingly slight as a physical artifact, even if it's one of the longer ones. Unless you are really lucky (and perhaps also good), the amount of work you really do and stuff you figure out is much more than shows up in the distilled and polished final product. Nevertheless, I always finish the paper-writing process with a feeling of accomplishment and a degree of surprise that it seemed to work yet again.

As often as not, however, the nominal capstone of the process (submitting the paper to the arxiv) is not the final step. There is, of course, the matter of submitting it to a journal and going through the refereeing process, but in this gilded electronic age that's usually somewhat anticlimactic. No, the real action comes in the couple of days after you've put the paper online, in which time you collect helpful emails from your colleagues around the world: "Dear Dr. X: I enjoyed reading your new paper astro-ph/0701357. You might be interested in the related work I did years ago, astro-ph/yymmnnn. Warmly, Dr. Y." Translation: "I want a citation, you bastard." Although it's often not at all impolite, and can even be quite helpful; there's a lot of science out there, and no way for any one person to keep track of it all.
Perhaps the most useful feature of the electronic age (as far as scientists go) is the set of citation services like SPIRES, which put at your fingertips a network of related papers, connected through a web of referencing and being-referenced-by. It's by no means unreasonable, when one has written a potentially relevant paper, to make some token effort to see that it is included in that network; otherwise it could easily be lost forever. It can be annoying to get too many requests for citations, especially irrelevant ones, but I essentially always go ahead and add them into a revised version of the paper; it doesn't kill any electrons.

In our case, we didn't get any of the dreaded emails that point out that someone else had done exactly what our paper had done. There were a couple of confused exchanges back and forth with friends who figured that we must have been motivated by the axis-of-evil stuff, and were trying to convince us that our particular model (with an effect that was scale-invariant, rather than concentrated on the largest scales) wasn't the right way to tackle that problem; we had to explain that we really weren't trying to solve any sort of pre-existing puzzle, we were just trying to probe the fundamental robustness of the usual assumptions about inflation. In fact, we didn't go so far in our paper as to actually compare to any CMB data sets; we are possessed of sufficient self-knowledge to understand that there are other people out there much better equipped than we are to carry out such a task. Hopefully it will be carried out soon!

But, although we didn't get any deal-breaking emails from the outside world, we did get a post-submission insight from the inside world, namely me. As we were writing, there was another thing that had been nibbling at the edges of my consciousness: isn't there some good reason why inflation usually doesn't pick out a preferred direction, but rather is completely isotropic? And finally I recalled what it was: something called the Cosmic No-Hair Theorem. This is a result, due to Bob Wald, which essentially says that in a universe filled with positive vacuum energy plus some other stuff, the vacuum energy always wins out. The other stuff always dilutes away, leaving you with the usual isotropic expansion. Which was interesting, since we had just written a paper featuring a model that did have vacuum energy plus some other (vector) fields, in which the other stuff did not dilute away, but rather imprinted a direction-dependent effect at all scales. Did we goof, or take advantage of a loophole?

Happily, it was the latter. In general relativity, you can prove almost nothing without assuming something reasonable about the energy-momentum sources, in the form of "energy conditions" that restrict the energy density and pressure of the stuff you are considering. The cosmic no-hair theorem assumed the Dominant Energy Condition, which is perfectly reasonable; without assuming the DEC, you can't be sure that your theory is stable. But our vector fields, it turns out, were more clever than we were. Our theory violates the dominant energy condition, so it is consistent with the results of the theorem.
But it is not unstable; if we divide the fields into a homogeneous background plus a set of small perturbations, the background (which is effecting inflation) violates the DEC, but the fluctuations (which would possibly lead to instabilities) actually obey the DEC. So we managed to find a theory that was well-behaved, but which sidestepped the crucial assumption of the cosmic no-hair theorem. That's why it had been a little tricky to find a good model in which inflation was anisotropic for an extended period; there was a theorem that said it couldn't happen, and we had to find a clever loophole, even though we didn't know that's what we were doing. So we updated the paper to include a few sentences about that situation.

And now, I think, we're truly done. The paper has been accepted and published and all that. But of course, one good project suggests all sorts of others. If we explored whether the cherished assumption of rotational invariance could be violated during inflation, are there other cherished assumptions we could contemplate violating? Answer: sure, there are plenty. But there are also plenty of unconnected science ideas that are worth pursuing. So we have to decide whether we should continue to move forward with this kind of investigation, or switch to something completely different. (I tend to go with the latter, more often than not.) One of the happy things about being a professional scientist is that there's no worry that one day you will wake up and all of the good questions will be answered. On to whatever comes next.