**Atomic and Electronic Structure of Solids**

This text is a modern treatment of the theory of solids. The core of the book deals with the physics of electron and phonon states in crystals and how they determine the structure and properties of the solid. The discussion uses the single-electron picture as a starting point and covers electronic and optical phenomena, magnetism and superconductivity. There is also an extensive treatment of defects in solids, including point defects, dislocations, surfaces and interfaces. A number of modern topics where the theory of solids applies are also explored, including quasicrystals, amorphous solids, polymers, metal and semiconductor clusters, carbon nanotubes and biological macromolecules. Numerous examples are presented in detail and each chapter is accompanied by problems and suggested further readings. An extensive set of appendices provides the necessary background for deriving all the results discussed in the main body of the text. The level of theoretical treatment is appropriate for first-year graduate students of physics, chemistry and materials science and engineering, but the book will also serve as a reference for scientists and researchers in these fields.

Efthimios Kaxiras received his PhD in theoretical physics at the Massachusetts Institute of Technology, and worked as a Postdoctoral Fellow at the IBM T. J. Watson Research Laboratory in Yorktown Heights. He joined Harvard University in 1991, where he is currently a Professor of Physics and the Gordon McKay Professor of Applied Physics. He has worked on theoretical modeling of the properties of solids, including their surfaces and defects; he has published extensively in refereed journals, as well as several invited review articles and book chapters. He has co-organized a number of scientific meetings and co-edited three volumes of conference proceedings. He is a member of the American Physical Society, the American Chemical Society, the Materials Research Society, Sigma Xi-Scientific Research Society, and a Chartered Member of the Institute of Physics (London).

**Atomic and Electronic Structure of Solids**

Efthimios Kaxiras

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press
The Edinburgh Building, Cambridge, United Kingdom

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521810104

© Efthimios Kaxiras 2003

This book is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2003

Editions: eBook (NetLibrary), hardback, paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this book, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

I dedicate this book to three great physics teachers: Evangelos Anastassakis, who inspired me to become a physicist, John Joannopoulos, who taught me how to think like one, and Lefteris Economou, who vastly expanded my physicist’s horizon.

Contents

Preface
Acknowledgments

I Crystalline solids

1 Atomic structure of crystals
  1.1 Building crystals from atoms
    1.1.1 Atoms with no valence electrons
    1.1.2 Atoms with s valence electrons
    1.1.3 Atoms with s and p valence electrons
    1.1.4 Atoms with s and d valence electrons
    1.1.5 Atoms with s, d and f valence electrons
    1.1.6 Solids with two types of atoms
    1.1.7 Hydrogen: a special one-s-valence-electron atom
    1.1.8 Solids with many types of atoms
  1.2 Bonding in solids
  Further reading
  Problems

2 The single-particle approximation
  2.1 The hamiltonian of the solid
  2.2 The Hartree and Hartree–Fock approximations
    2.2.1 The Hartree approximation
    2.2.2 Example of a variational calculation
    2.2.3 The Hartree–Fock approximation
  2.3 Hartree–Fock theory of free electrons
  2.4 The hydrogen molecule
  2.5 Density Functional Theory
  2.6 Electrons as quasiparticles
    2.6.1 Quasiparticles and collective excitations
    2.6.2 Thomas–Fermi screening
  2.7 The ionic potential
  Further reading
  Problems

3 Electrons in crystal potential
  3.1 Periodicity – Bloch states
  3.2 k-space – Brillouin zones
  3.3 Dynamics of crystal electrons
  3.4 Crystal electrons in an electric field
  3.5 Crystal symmetries beyond periodicity
  3.6 Groups and symmetry operators
  3.7 Symmetries of the band structure
  3.8 Symmetries of 3D crystals
  3.9 Special k-points
  Further reading
  Problems

4 Band structure of crystals
  4.1 The tight-binding approximation
    4.1.1 Example: 1D linear chain with s or p orbitals
    4.1.2 Example: 2D square lattice with s and p orbitals
    4.1.3 Generalizations of the TBA
  4.2 General band-structure methods
  4.3 Band structure of representative solids
    4.3.1 A 2D solid: graphite – a semimetal
    4.3.2 3D covalent solids: semiconductors and insulators
    4.3.3 3D metallic solids
  Further reading
  Problems

5 Applications of band theory
  5.1 Density of states
  5.2 Tunneling at metal–semiconductor contact
  5.3 Optical excitations
  5.4 Conductivity and dielectric function
  5.5 Excitons
  5.6 Energetics and dynamics
    5.6.1 The total energy
    5.6.2 Forces and dynamics
  Further reading
  Problems

6 Lattice vibrations
  6.1 Phonon modes
  6.2 The force-constant model
    6.2.1 Example: phonons in 2D periodic chain
    6.2.2 Phonons in a 3D crystal
  6.3 Phonons as harmonic oscillators
  6.4 Application: the specific heat of crystals
    6.4.1 The classical picture
    6.4.2 The quantum mechanical picture
    6.4.3 The Debye model
    6.4.4 Thermal expansion coefficient
  6.5 Application: phonon scattering
    6.5.1 Phonon scattering processes
    6.5.2 The Debye–Waller factor
    6.5.3 The Mössbauer effect
  Problems

7 Magnetic behavior of solids
  7.1 Magnetic behavior of insulators
  7.2 Magnetic behavior of metals
    7.2.1 Magnetization in Hartree–Fock free-electron gas
    7.2.2 Magnetization of band electrons
  7.3 Heisenberg spin model
    7.3.1 Ground state of the Heisenberg ferromagnet
    7.3.2 Spin waves in the Heisenberg ferromagnet
    7.3.3 Heisenberg antiferromagnetic spin model
  7.4 Magnetic order in real materials
  7.5 Crystal electrons in an external magnetic field
    7.5.1 de Haas–van Alphen effect
    7.5.2 Classical and quantum Hall effects
  Further reading
  Problems

8 Superconductivity
  8.1 Overview of superconducting behavior
  8.2 Thermodynamics of the superconducting transition
  8.3 BCS theory of superconductivity
    8.3.1 Cooper pairing
    8.3.2 BCS ground state
    8.3.3 BCS theory at finite temperature
    8.3.4 The McMillan formula for Tc
  8.4 High-temperature superconductors
  Further reading
  Problems

II Defects, non-crystalline solids and finite structures

9 Defects I: point defects
  9.1 Intrinsic point defects
    9.1.1 Energetics and electronic levels
    9.1.2 Defect-mediated diffusion
  9.2 Extrinsic point defects
    9.2.1 Impurity states in semiconductors
    9.2.2 Effect of doping in semiconductors
    9.2.3 The p–n junction
    9.2.4 Metal–semiconductor junction
  Further reading
  Problems

10 Defects II: line defects
  10.1 Nature of dislocations
  10.2 Elastic properties and motion of dislocations
    10.2.1 Stress and strain fields
    10.2.2 Elastic energy
    10.2.3 Peierls–Nabarro model
  10.3 Brittle versus ductile behavior
    10.3.1 Stress and strain under external load
    10.3.2 Brittle fracture – Griffith criterion
    10.3.3 Ductile response – Rice criterion
    10.3.4 Dislocation–defect interactions
  Further reading
  Problems

11 Defects III: surfaces and interfaces
  11.1 Experimental study of surfaces
  11.2 Surface reconstruction
    11.2.1 Dimerization: the Si(001) surface
    11.2.2 Relaxation: the GaAs(110) surface
    11.2.3 Adatoms and passivation: the Si(111) surface
  11.3 Growth phenomena
  11.4 Interfaces
    11.4.1 Grain boundaries
    11.4.2 Hetero-interfaces
  Further reading
  Problems

12 Non-crystalline solids
  12.1 Quasicrystals
  12.2 Amorphous solids
    12.2.1 Continuous random network
    12.2.2 Radial distribution function
    12.2.3 Electron localization due to disorder
  12.3 Polymers
    12.3.1 Structure of polymer chains and solids
    12.3.2 The glass and rubber states
  Further reading
  Problems

13 Finite structures
  13.1 Clusters
    13.1.1 Metallic clusters
    13.1.2 Carbon clusters
    13.1.3 Carbon nanotubes
    13.1.4 Other covalent and mixed clusters
  13.2 Biological molecules and structures
    13.2.1 The structure of DNA and RNA
    13.2.2 The structure of proteins
    13.2.3 Relationship between DNA, RNA and proteins
    13.2.4 Protein structure and function
  Further reading
  Problems

III Appendices

Appendix A Elements of classical electrodynamics
  A.1 Electrostatics and magnetostatics
  A.2 Fields in polarizable matter
  A.3 Electrodynamics
  A.4 Electromagnetic radiation
  Further reading

Appendix B Elements of quantum mechanics
  B.1 The Schrödinger equation
  B.2 Bras, kets and operators
  B.3 Solution of the TISE
    B.3.1 Free particles
    B.3.2 Harmonic oscillator potential
    B.3.3 Coulomb potential
  B.4 Spin angular momentum
  B.5 Stationary perturbation theory
    B.5.1 Non-degenerate perturbation theory
    B.5.2 Degenerate perturbation theory
  B.6 Time-dependent perturbation theory
  B.7 The electromagnetic field term
  Further reading
  Problems

Appendix C Elements of thermodynamics
  C.1 The laws of thermodynamics
  C.2 Thermodynamic potentials
  C.3 Application: phase transitions
  Problems

Appendix D Elements of statistical mechanics
  D.1 Average occupation numbers
    D.1.1 Classical Maxwell–Boltzmann statistics
    D.1.2 Quantum Fermi–Dirac statistics
    D.1.3 Quantum Bose–Einstein statistics
  D.2 Ensemble theory
    D.2.1 Definition of ensembles
    D.2.2 Derivation of thermodynamics
  D.3 Applications of ensemble theory
    D.3.1 Equipartition and the Virial
    D.3.2 Ideal gases
    D.3.3 Spins in an external magnetic field
  Further reading
  Problems

Appendix E Elements of elasticity theory
  E.1 The strain tensor
  E.2 The stress tensor
  E.3 Stress-strain relations
  E.4 Strain energy density
  E.5 Applications of elasticity theory
    E.5.1 Isotropic elastic solid
    E.5.2 Plane strain
    E.5.3 Solid with cubic symmetry
  Further reading
  Problems

Appendix F The Madelung energy
  F.1 Potential of a gaussian function
  F.2 The Ewald method
  Problems

Appendix G Mathematical tools
  G.1 Differential operators
  G.2 Power series expansions
  G.3 Functional derivatives
  G.4 Fourier and inverse Fourier transforms
  G.5 The δ-function and its Fourier transform
    G.5.1 The δ-function and the θ-function
    G.5.2 Fourier transform of the δ-function
    G.5.3 The δ-function sums for crystals
  G.6 Normalized gaussians

Appendix H Nobel prize citations
Appendix I Units and symbols
References
Index

Preface

This book is addressed to first-year graduate students in physics, chemistry, materials science and engineering. It discusses the atomic and electronic structure of solids. Traditional textbooks on solid state physics contain a large amount of useful information about the properties of solids, as well as extensive discussions of the relevant physics, but tend to be overwhelming as introductory texts. This book is an attempt to introduce the single-particle picture of solids in an accessible and self-contained manner. The theoretical derivations start at a basic level and go through the necessary steps for obtaining key results, while some details of the derivations are relegated to problems, with proper guiding hints. The exposition of the theory is accompanied by worked-out examples and additional problems at the end of chapters.

The book addresses mostly theoretical concepts and tools relevant to the physics of solids; there is no attempt to provide a thorough account of related experimental facts. This choice was made in order to keep the book within a limit that allows its contents to be covered in a reasonably short period (one or two semesters; see more detailed instructions below). There are many sources covering the experimental side of the field, which the student is strongly encouraged to explore if not already familiar with it. The suggestions for further reading at the end of chapters can serve as a starting point for exploring the experimental literature. There are also selected references to original research articles that laid the foundations of the topics discussed, as well as to more recent work, in the hope of exciting the student’s interest for further exploration. Instead of providing a comprehensive list of references, the reader is typically directed toward review articles and monographs which contain more advanced treatments and a more extended bibliography.

As already mentioned, the treatment is mostly restricted to the single-particle picture. The meaning of this is clarified and its advantages and limitations are described in great detail in the second chapter. Briefly, the electrons responsible for the cohesion of a solid interact through long-range Coulomb forces both with the nuclei of the solid and with all the other electrons. This leads to a very complex many-electron state which is difficult to describe quantitatively. In certain limits, and for certain classes of phenomena, it is feasible to describe the solid in terms of an approximate picture involving “single electrons”, which interact with the other electrons through an average field. In fact, these “single-electron” states do not correspond to physical electron states (hence the quotes). This picture, although based on approximations that cannot be systematically improved, turns out to be extremely useful and remarkably realistic for many, but not all, situations. There are several phenomena – superconductivity and certain aspects of magnetic phenomena being prime examples – where the collective behavior of electrons in a solid is essential in understanding the nature of the beast (or beauty). In these cases the “single-electron” picture is not adequate, and a full many-body approach is necessary. The phenomena involved in the many-body picture require an approach and a theoretical formalism beyond what is covered here; typically, these topics constitute the subject of a second course on the theory of solids.

The book is divided into two parts. The first part, called Crystalline solids, consists of eight chapters and includes material that I consider essential in understanding the physics of solids. The discussion is based on crystals, which offer a convenient model for studying macroscopic numbers of atoms assembled to form a solid. In this part, the first five chapters develop the theoretical basis for the single-electron picture and give several applications of this picture, for solids in which atoms are frozen in space. Chapter 6 develops the tools for understanding the motion of atoms in crystals through the language of phonons. Chapters 7 and 8 are devoted to magnetic phenomena and superconductivity, respectively. The purpose of these last two chapters is to give a glimpse of interesting phenomena in solids which go beyond the single-electron picture. Although more advanced, these topics have become an essential part of the physics of solids and must be included in a general introduction to the field. I have tried to keep the discussion in these two chapters at a relatively simple level, avoiding, for example, the introduction of tools like second quantization, Green’s functions and Feynman diagrams. The logic of this approach is to make the material accessible to a wide audience, at the cost of not employing a more elegant language familiar to physicists.

The second part of the book consists of five chapters, which contain discussions of defects in crystals (chapters 9, 10 and 11), of non-crystalline solids (chapter 12) and of finite structures (chapter 13). The material in these chapters is more specific than that in the first part of the book, and thus less important from a fundamental point of view. This material, however, is relevant to real solids, as opposed to idealized theoretical concepts such as a perfect crystal. I must make here a clarification on why the very last chapter is devoted to finite structures, a topic not traditionally discussed in the context of solids. Such structures are becoming increasingly important, especially in the field of nanotechnology, where the functional components may be measured in nanometers. Prime examples of such objects are clusters or tubes of carbon (the fullerenes and the carbon nanotubes) and biological structures (the nucleic acids and proteins), which are studied by ever increasing numbers of traditional physicists, chemists and materials scientists, and which are expected to find their way into solid state applications in the not too distant future. Another reason for including a discussion of these systems in a book on solids is that they do have certain common characteristics with traditional crystals, such as a high degree of order. After all, what could be a more relevant example of a regular one-dimensional structure than the human DNA chain which extends for three billion base-pairs with essentially perfect stacking, even though it is not rigid in the traditional sense?

This second part of the book contains material closer to actual research topics in the modern theory of solids. In deciding what to include in this part, I have drawn mostly from my own research experience. This is the reason for omitting some important topics, such as the physics of metal alloys. My excuse for such omissions is that the intent was to write a modern textbook on the physics of solids, with representative examples of current applications, rather than an encyclopedic compilation of research topics. Despite such omissions, I hope that the scope of what is covered is broad enough to offer a satisfactory representation of the field.

Finally, a few comments about the details of the contents. I have strived to make the discussion of topics in the book as self-contained as possible. For this reason, I have included unusually extensive appendices in what constitutes a third part of the book. Four of these appendices, on classical electrodynamics, quantum mechanics, thermodynamics and statistical mechanics, contain all the information necessary to derive from very basic principles the results of the first part of the book. The appendix on elasticity theory contains the background information relevant to the discussion of line defects and the mechanical properties of solids. The appendix on the Madelung energy provides a detailed account of an important term in the total energy of solids, which was deemed overly technical to include in the first part. Finally, the appendix on mathematical tools reviews a number of formulae, techniques and tricks which are used extensively throughout the text. The material in the second part of the book could not be made equally self-contained by the addition of appendices, because of its more specialized nature. I have made an effort to provide enough references for the interested reader to pursue in more detail any topic covered in the second part. An appendix at the end includes Nobel prize citations relevant to work mentioned in the text, as an indication of how vibrant the field has been and continues to be. The appendices may seem excessively long by usual standards, but I hope that a good fraction of the readers will find them useful.

Some final comments on notation and figures: I have made a conscious effort to provide a consistent notation for all the equations throughout the text. Given the breadth of topics covered, this was not a trivial task and I was occasionally forced to make unconventional choices in order to avoid using the same symbol for two different physical quantities. Some of these are: the choice of Ω for the volume so that the more traditional symbol V could be reserved for the potential energy; the choice of Θ for the enthalpy so that the more traditional symbol H could be reserved for the magnetic field; the choice of Y for Young’s modulus so that the more traditional symbol E could be reserved for the energy; the introduction of a subscript in the symbol for the divergence, ∇r or ∇k, so that the variable of differentiation would be unambiguous even if, on certain occasions, this is redundant information. I have also made extensive use of superscripts, which are often in parentheses to differentiate them from exponents, in order to make the meaning of symbols more transparent. Lastly, I decided to draw all the figures “by hand” (using software tools), rather than to reproduce figures from the literature, even when discussing classic experimental or theoretical results. The purpose of this choice is to maintain, to the extent possible, the feeling of immediacy in the figures as I would have drawn them on the blackboard, pointing out important features rather than being faithful to details. I hope that the result is not disagreeable, given my admittedly limited drawing abilities. Exceptions are the set of figures on electronic structure of metals and semiconductors in chapter 4 (Figs. 4.6–4.12), which were produced by Yannis Remediakis, and the figure of the KcsA protein in chapter 13 (Fig. 13.30), which was provided by Pavlos Maragakis.

The book has been constructed to serve two purposes. (a) For students with adequate background in the basic fields of physics (electromagnetism, quantum mechanics, thermodynamics and statistical mechanics), the first part represents a comprehensive introduction to the single-particle theory of solids and can be covered in a one-semester course. As an indication of the degree of familiarity with basic physics expected of the reader, I have included sample problems in the corresponding appendices; the readers who can tackle these problems easily can proceed directly to the main text covered in the first part. My own teaching experience indicates that approximately 40 hours of lectures (roughly five per chapter) are adequate for a brisk, but not unreasonable, covering of this part. Material from the second part can be used selectively as illustrative examples of how the basic concepts are applied to realistic situations. This can be done in the form of special assignments, or as projects at the end of the one-semester course. (b) For students without graduate level training in the basic fields of physics mentioned above, the entire book can serve as the basis for a full-year course. The material in the first part can be covered at a more leisurely pace, with short introductions of the important physics background where needed, using the appendices as a guide. The material of the second part of the book can then be covered, selectively or in its entirety as time permits, in the remainder of the full-year course.

Acknowledgments

The discussion of many topics in this book, especially the chapters that deal with symmetries of the crystalline state and band structure methods, was inspired to a great extent by the lectures of John Joannopoulos who first introduced me to this subject. I hope the presentation of these topics here does justice to his meticulous and inspired teaching.

In my two-decade-long journey through the physics of solids, I had the good fortune to interact with a great number of colleagues, from all of whom I have learned a tremendous amount. In roughly chronological order in which I came to know them, they are: John Joannopoulos, Ken Shih, Bob Westervelt, Bert Halperin, Larry Boyer, Uzi Landman, Mark Pederson, Sokrates Pantelides, Phaedon Avouris, Frans Spaepen, Oscar Alerhand, Dan Branton, Yaneer Bar-Yam, Dave Weitz, Roy Gordon, Steven Louie, Warren Pickett, Marvin Cohen, Peter Blöchl, Venky Narayanamurti, Pantelis Kelires, Jeremy Broughton, Eugen Tarnow, In-When Lyo, Jerry Tersoff, Rob Phillips, Bill Curtin, David Singh, Joe Demuth, Lloyd Whitman, David Vanderbilt, John Hutchinson, Jene Golovchenko, Karin Rabe, Jim Chelikowsky, John Weeks, John Wilkins, Koblar Jackson, Kosal Pandey, Dimitri Maroudas, David Nelson, Michael Tinkham, Dung-Hai Lee, Jim Rice, Steve Erwin, Sidney Yip, Bill Paul, Charlie Lieber, Mike Aziz, Michael Ortiz, Daryl Hess, Henry Ehrenreich, Jenna Zink, Ike Silvera, Alan Needleman, Peter Pershan, Randy Feenstra, Dimitri Vvedensky, Peter Feibelman, Bill Carter, Martin Karplus, Norton Lang, Matt Copel, Barry Klein, Rajiv Kalia, Shi-Yu Wu, Jim Chadi, Nick Kioussis, Daniel Fisher, Stratos Manousakis, Alex Antonelli, Russ Cafflisch, George Whitesides, Farid Abraham, Joe Feldman, Michael Duesbery, Bob Meade, Paul Martin, Eric Mazur, Emily Carter, Ellen Williams, Russ Hemley, Ruud Tromp, Franz Himpsel, Mark Gyure, Eugene Demler, Charlie Marcus, Bob Hamers, Dimitri Papaconstantopoulos, Cynthia Friend, Stan Williams, Eric Heller, Joe Serene, Howard Stone, Leo Kouwenhoven, George Turner, Michael Payne, Priya Vashishta, John Smith, Klaus Kern, Oliver Leifeld, Lefteris Economou, Nikos Flytzanis, Stavros Farantos, George Tsironis, Grigoris Athanasiou, Panos Tzanetakis, Kostas Fotakis, George Theodorou, José Soler, Thomas Frauenheim, Riad Manaa, Doros Theodorou, Vassilis Pontikis and Sauro Succi. Certain of these individuals played not only the role of a colleague or collaborator, but also the role of a mentor at various stages of my career: they are John Joannopoulos, Kosal Pandey, Dimitri Papaconstantopoulos, Henry Ehrenreich, Bert Halperin and Sidney Yip; I am particularly indebted to them for guidance and advice, as well as for sharing with me their deep knowledge of physics.

I was also very fortunate to work with many talented graduate and undergraduate students, including Yumin Juan, Linda Zeger, Normand Modine, Martin Bazant, Noam Bernstein, Greg Smith, Nick Choly, Ryan Barnett, Sohrab Ismail-Beigi, Jonah Erlebacher, Melvin Chen, Tim Mueller, Yuemin Sun, Joao Justo, Maurice de Koning, Yannis Remediakis, Helen Eisenberg and Trevor Bass, and with a very select group of Postdoctoral Fellows and Visiting Scholars, including Daniel Kandel, Laszlo Barabàsi, Gil Zumbach, Umesh Waghmare, Ellad Tadmmor, Vasily Bulatov, Kyeongjae Cho, Marcus Elstner, Ickjin Park, Hanchul Kim, Olivier Politano, Paul Maragakis, Dionisios Margetis, Daniel Orlikowski, Qiang Cui and Gang Lu. I hope that they have learned from me a small fraction of what I have learned from them over the last dozen years.

Last but not least, I owe a huge debt of gratitude to my wife, Eleni, who encouraged me to turn my original class notes into the present book and supported me with patience and humor throughout this endeavor.

The merits of the book, to a great extent, must be attributed to the generous input of friends and colleagues, while its shortcomings are the exclusive responsibility of the author. Pointing out these shortcomings to me would be greatly appreciated.

Cambridge, Massachusetts, October 2001

and only one sentence passed on to the next generation of creatures. Nevertheless. such as diamonds. This was ﬁrst postulated by the ancient Greek philosopher Demokritos. (R. but. In that one sentence. the atoms are arranged in a regular three-dimensional periodic pattern. but repelling upon being squeezed into one another. Many solids are crystalline in nature. attracting each other when they are a little distance apart. that is. diamonds represent the hardest substance known. if just a little imagination and thinking are applied. what statement would contain the most information in the fewest words? I believe it is the atomic hypothesis that all things are made of atoms – little particles that move around in perpetual motion. all of scientiﬁc knowledge were to be destroyed. Feynman. but was established scientiﬁcally in the 20th century.Part I Crystalline solids If. In some cases it has taken geological times and pressures to form certain crystalline solids. there is an enormous amount of information about the world. this postulate is one of the greatest feats of the human intellect. There is a wide variety of crystal structures formed by different elements and by different combinations of elements. in some cataclysm. Consisting of carbon and found in mines. There are also many ordinary solids we encounter in everyday life 1 . There is an amazing degree of regularity in the structure of solids. near perfect macroscopic crystals can be formed by simply melting and then slowly cooling a substance in the laboratory. especially since it was not motivated by direct experimental evidence but was the result of pure logical deduction. the mere fact that a number of atoms of order 1024 (Avogadro’s number) in a solid of size 1 cm3 are arranged in essentially a perfect periodic array is quite extraordinary. The Feynman Lectures on Physics) Solids are the physical objects with which we come into contact continuously in our everyday life. 
they do not represent the ground state equilibrium structure of this element. Solids are composed of atoms. The atoms (ατ oµα = indivisible units) that Demokritos conceived bear no resemblance to what we know today to be the basic units from which all solids and molecules are built. P. However. In many other cases. surprisingly.

in which there exists a surprising degree of crystallinity. For example, a bar of soap, a chocolate bar, candles, sugar or salt grains, even bones in the human body, are composed of crystallites of sizes between 0.5 and 50 µm. In these examples, what determines the properties of the material is not so much the structure of individual crystallites but their relative orientation and the structure of boundaries between them. Even in this case, however, the nature of a boundary between two crystallites is ultimately dictated by the structure of the crystal grains on either side of it, as we discuss in chapter 11.

The existence of crystals has provided a tremendous boost to the study of solids, since a crystalline solid can be analyzed by considering what happens in a single unit of the crystal (referred to as the unit cell), which is then repeated periodically in all three dimensions to form the idealized perfect and infinite solid. The unit cell contains typically one or a few atoms, which are called the basis. The points in space that are equivalent by translations form the so-called Bravais lattice. The Bravais lattice and the basis associated with each unit cell determine the crystal. This regularity has made it possible to develop powerful analytical tools and to use clever experimental techniques to study the properties of solids.

Real solids obviously do not extend to infinity in all three dimensions; they terminate on surfaces, which in a sense represent two-dimensional defects in the perfect crystalline structure. For all practical purposes the surfaces constitute a very small perturbation in typical solids, since the ratio of atoms on the surface to atoms in the bulk is typically 1 : 10⁸. The idealized picture of atoms in the bulk behaving as if they belonged to an infinite periodic solid is therefore a reasonable one. In fact, even very high quality crystals contain plenty of one-dimensional or zero-dimensional defects in their bulk as well. It is actually the presence of such defects that renders solids useful, because the manipulation of defects makes it possible to alter the properties of the ideal crystal, which in perfect form would have a much more limited range of properties. Nevertheless, these defects exist in relatively small concentrations in the host crystal, and as such can be studied in a perturbative manner, with the ideal crystal being the base or, in a terminology that physicists often use, the “vacuum” state. If solids lacked any degree of order in their structure, study of them would be much more complicated. There are many solids that are not crystalline, with some famous examples being glasses, or amorphous semiconductors. Even in these solids, there exists a high degree of local order in their structure, often very reminiscent of the local arrangement of atoms in their crystalline counterparts. As a consequence, many of the notions advanced to describe disordered solids are extensions of, or use as a point of reference, ideas developed for crystalline solids. All this justifies the prominent role that the study of crystals plays in the study of solids.
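The surface-to-bulk ratio of 1 : 10⁸ quoted above can be made plausible with a rough count. The sketch below is only an illustration under stated assumptions: it takes a cube-shaped crystal with atoms on a simple cubic grid, n atoms along each edge (a geometry chosen purely because it is easy to count, not a model used in the text), and an assumed nearest-neighbor spacing of 2.5 Å. The function names are hypothetical.

```python
def surface_to_bulk_ratio(n):
    """Surface atoms divided by interior atoms for an n x n x n cube of atoms (n >= 3)."""
    total = n ** 3
    interior = (n - 2) ** 3          # atoms not on any of the six faces
    surface = total - interior
    return surface / interior


def crystal_edge_cm(target_ratio, spacing_angstrom):
    """Edge length (cm) of the cube whose surface/bulk ratio is target_ratio.

    Uses the large-n estimate surface/interior ~ 6/n.
    """
    n_edge = 6.0 / target_ratio
    return n_edge * spacing_angstrom * 1e-8   # 1 angstrom = 1e-8 cm


if __name__ == "__main__":
    for n in (10, 1000, 600_000_000):
        print(n, surface_to_bulk_ratio(n))
    # Assumed spacing of 2.5 angstrom, within the 1.5-3 angstrom range typical of solids
    print(crystal_edge_cm(1e-8, 2.5), "cm")
```

Under these assumptions a ratio of 1 : 10⁸ corresponds to a crystal with an edge of order ten centimeters, that is, a macroscopic sample; for microscopic grains the surface fraction is correspondingly larger.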

It is a widely held belief that the crystalline state represents the ground state structure of solids, even though there is no theoretical proof of this statement. A collection of 10²⁴ atoms has an immense number of almost equivalent ordered or disordered metastable states in which it can exist, but only one lowest energy crystalline state, and the atoms can find this state in relatively short time scales and with relatively few mistakes! If one considers the fact that atomic motion is quite difficult and rare in the dense atomic packing characteristic of crystals, so that the atoms have little chance to correct an error in their placement, the existence of crystals becomes even more impressive.

The above discussion emphasizes how convenient it has proven for scientists that atoms like to form crystalline solids. Accordingly, we will use crystals as the basis for studying general concepts of bonding in solids, and we will devote the first part of the book to the study of the structure and properties of crystals.

1 Atomic structure of crystals

Solids exhibit an extremely wide range of properties, which is what makes them so useful and indispensable to mankind. While our familiarity with many different types of solids makes this fact seem unimpressive, it is indeed extraordinary when we consider its origin. The origin of all the properties of solids is nothing more than the interaction between electrons in the outer shells of the atoms, the so-called valence electrons. These electrons interact among themselves and with the nuclei of the constituent atoms. In this first chapter we will give a general description of these interactions and their relation to the structure and the properties of solids.

The extremely wide range of the properties of solids is surprising because most of them are made up from a relatively small subset of the elements in the Periodic Table: about 20 or 30 elements, out of more than 100 total, are encountered in most common solids. Moreover, most solids contain only very few of these elements, from one to half a dozen or so. Despite this relative simplicity in composition, solids exhibit a huge variety of properties over ranges that differ by many orders of magnitude. It is quite extraordinary that even among solids which are composed of single elements, physical properties can differ by many orders of magnitude.

One example is the ability of solids to conduct electricity, which is measured by their electrical resistivity. Some typical single-element metallic solids (such as Ag, Cu, Al) have room-temperature resistivities of 1–5 µΩ·cm, while some metallic alloys (like nichrome) have resistivities of 10² µΩ·cm. All these solids are considered good conductors of electrical current. Certain single-element solids (like C, Si, Ge) have room-temperature resistivities ranging from 3.5 × 10³ µΩ·cm (for graphitic C) to 2.3 × 10¹¹ µΩ·cm (for Si), and they are considered semimetals or semiconductors. Finally, certain common solids like wood (with a rather complex structure and chemical composition) or quartz (with a rather simple structure and composed of two elements, Si and O), have room-temperature resistivities of 10¹⁶–10¹⁹ µΩ·cm (for wood) to 10²⁵ µΩ·cm (for quartz). These solids are considered insulators. The range of electrical resistivities covers an astonishing 25 orders of magnitude!

Another example has to do with the mechanical properties of solids. Solids are classified as ductile when they yield plastically when stressed, or brittle when they do not yield easily, but instead break when stressed. A useful measure of this behavior is the yield stress σY, which is the stress up to which the solid behaves as a linear elastic medium when stressed, that is, it returns to its original state when the external stress is removed. Yield stresses in solids, measured in units of MPa, range from 40 in Al, a rather soft and ductile metal, to 5 × 10⁴ in diamond, the hardest material, a brittle insulator. The yield stresses of common steels range from 200 to 2000 MPa. Again we see an impressive range of more than three orders of magnitude in how a solid responds to an external agent, in this case a mechanical stress.

Naively, one might expect that the origin of the widely different properties of solids is related to great differences in the concentration of atoms, and correspondingly that of electrons. This is far from the truth. Concentrations of atoms in a solid range from 10²² cm⁻³ in Cs, a representative alkali metal, to 17 × 10²² cm⁻³ in C, a representative covalently bonded solid. Anywhere from one to a dozen valence electrons per atom participate actively in determining the properties of solids. These considerations give a range of atomic concentrations spanning a factor of roughly 20, and of electron concentrations¹ spanning a factor of roughly 100. These ranges are nowhere close to the ranges of yield stresses and electrical resistivities mentioned above. Rather, the variation of the properties of solids has to do with the specific ways in which the valence electrons of the constituent atoms interact when these atoms are brought together at distances of a few angstroms (1 Å = 10⁻¹⁰ m = 10⁻¹ nm). Typical distances between nearest neighbor atoms in solids range from 1.5 to 3 Å. The way in which the valence electrons interact determines the atomic structure, and this in turn determines all the other properties of the solid, including mechanical, electrical, optical, thermal and magnetic properties.

¹ The highest concentration of atoms does not correspond to the highest number of valence electrons per atom.

1.1 Building crystals from atoms

The structure of crystals can be understood to some extent by taking a close look at the properties of the atoms from which they are composed. We can identify several broad categories of atoms, depending on the nature of electrons that participate actively in the formation of the solid. The electrons in the outermost shells of the isolated atom are the ones that interact strongly with similar electrons in neighboring atoms; as already mentioned these are called valence electrons. The remaining electrons of the atom are tightly bound to the nucleus, their wavefunctions (orbitals) do not extend far from the position of the nucleus, and they are very little affected when the atom is surrounded by its neighbors in the solid. These are called the core electrons. For most purposes it is quite reasonable to neglect the presence of the core electrons as far as the solid is concerned, and consider how the valence electrons behave. We will discuss below the crystal structure of various solids based on the properties of electronic states of the constituent atoms. We are only concerned here with the basic features of the crystal structures that the various atoms form, such as number of nearest neighbors, without paying close attention to details; these will come later. Finally, we will only concern ourselves with the low-temperature structures, which correspond to the lowest energy static configuration; dynamical effects, which can produce a different structure at higher temperatures, will not be considered [1]. We begin our discussion with those solids formed by atoms of one element only, called elemental solids, and then proceed to more complicated structures involving several types of atoms. Some basic properties of the elemental solids are collected in the Periodic Table (pp. 8, 9), where we list:

• The crystal structure of the most common phase. The acronyms for the crystal structures that appear in the Table stand for: BCC = body-centered cubic, FCC = face-centered cubic, HCP = hexagonal-close-packed, GRA = graphite, TET = tetragonal, DIA = diamond, CUB = cubic, MCL = monoclinic, ORC = orthorhombic, RHL = rhombohedral. Selected shapes of the corresponding crystal unit cells are shown in Fig. 1.1.

Figure 1.1. Shapes of the unit cells in some lattices that appear in the Periodic Table. Top row: cubic, tetragonal, orthorhombic. Bottom row: rhombohedral, monoclinic, triclinic. The corners in thin lines indicate right angles between edges.

• The covalent radius in units of angstroms, Å, which is a measure of the typical distance of an atom to its neighbors; specifically, the sum of covalent radii of two nearest neighbor atoms gives their preferred distance in the solid.
• The melting temperature in millielectronvolts (1 meV = 10⁻³ eV = 11.604 K). The melting temperature provides a measure of how much kinetic energy is required to break the rigid structure of the solid. This unconventional choice of units for the melting temperature is meant to facilitate the discussion of cohesion and stability of solids. Typical values of the cohesive energy of solids are in the range of a few electronvolts (see Tables 5.4 and 5.5), which means that the melting temperature is only a small fraction of the cohesive energy, typically a few percent.
• The atomic concentration of the most common crystal phase in 10²² cm⁻³.
• The electrical resistivity in units of micro-ohm-centimeters, µΩ·cm; for most elemental solids the resistivity is of order 1–100 in these units, except for some good insulators which have resistivities 10³ (k), 10⁶ (M) or 10⁹ (G) times higher.

The natural units for various physical quantities in the context of the structure of solids and the names of unit multiples are collected in two tables at the end of the book (see Appendix I).

The columns of the Periodic Table correspond to different valence electron configurations, which follow a smooth progression as the s, p, d and f shells are being filled. There are a few exceptions in this progression, which are indicated by asterisks denoting that the higher angular momentum level is filled in preference to the lower one (for example, the valence electronic configuration of Cu, marked by one asterisk, is s¹d¹⁰ instead of s²d⁹; that of Pd, marked by two asterisks, is s⁰d¹⁰ instead of s²d⁸, etc.).

1.1.1 Atoms with no valence electrons

The first category consists of those elements which have no valence electrons. These are the atoms with all their electronic shells completely filled, which in gaseous form are very inert chemically, i.e. the noble elements He, Ne, Ar, Kr and Xe. When these atoms are brought together to form solids they interact very weakly. Their outer electrons are not disturbed much since they are essentially core electrons, and the weak interaction is the result of slight polarization of the electronic wavefunction in one atom due to the presence of other atoms around it. Fortunately, the interaction is attractive. This interaction is referred to as “fluctuating dipole” or van der Waals interaction. Since the interaction is weak, the solids are not very stable and they have very low melting temperatures, well below room temperature. The main concern of the atoms in forming such solids is to have as many neighbors as possible, in order to maximize the cohesion since all interactions are attractive. The crystal structure that corresponds to this atomic arrangement is one of the close-packing geometries, that is, arrangements which allow the closest packing of hard spheres.
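Because melting temperatures are quoted in millielectronvolts, a small conversion helper makes statements such as "well below room temperature" and "a few percent of the cohesive energy" easy to check. The sketch below uses only the conversion factor given in the text (1 meV = 11.604 K); the function names and the sample values in the usage lines (a melting temperature of 100 meV and a cohesive energy of 3 eV) are illustrative assumptions, not data from the Periodic Table.

```python
K_PER_MEV = 11.604  # conversion factor quoted in the text: 1 meV = 11.604 K


def kelvin_to_mev(t_kelvin):
    """Express a temperature given in kelvin as an energy in meV."""
    return t_kelvin / K_PER_MEV


def mev_to_kelvin(e_mev):
    """Express an energy given in meV as a temperature in kelvin."""
    return e_mev * K_PER_MEV


def melting_fraction_of_cohesion(t_melt_mev, e_coh_ev):
    """Melting temperature (meV) as a fraction of the cohesive energy (eV)."""
    return t_melt_mev / (e_coh_ev * 1000.0)


if __name__ == "__main__":
    # Room temperature (taken here as 300 K) on the meV scale
    print(kelvin_to_mev(300.0))
    # Hypothetical solid: melts at 100 meV, cohesive energy 3 eV
    print(mev_to_kelvin(100.0))
    print(melting_fraction_of_cohesion(100.0, 3.0))
```

With these numbers, room temperature is about 26 meV, so any solid whose tabulated melting temperature is below that value is not solid under ambient conditions.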

[Periodic Table of the elemental solids (pp. 8, 9). The columns I-A, II-A, III-B to VIII, I-B, II-B, III-A to VII-A and the noble elements, together with the f-shell row from Ce to Lu, are labeled by their valence electron configurations. Each entry gives the symbol, name, atomic number, crystal structure, covalent radius (Å), melting point (meV), atomic concentration (10²² cm⁻³) and resistivity (µΩ·cm).]

The particular crystal structure that noble-element atoms assume in solid form is called face-centered cubic (FCC), which is shown in Fig. 1.2. Each atom has 12 equidistant nearest neighbors in this structure. Thus, in the simplest case, atoms that have no valence electrons at all behave like hard spheres which attract each other with weak forces, but are not deformed. They form weakly bonded solids in the FCC structure, in which the attractive interactions are optimized by maximizing the number of nearest neighbors in a close packing arrangement. The only exception to this rule is He, in which the attractive interaction between atoms is so weak that it is overwhelmed by the zero-point motion of the atoms. Unless we apply external pressure to enhance this attractive interaction, He remains a liquid. This is also an indication that in some cases it will prove unavoidable to treat the nuclei as quantum particles (see also the discussion below about hydrogen).

Figure 1.2. Left: one atom and its 12 neighbors in the face-centered cubic (FCC) lattice; the size of the spheres representing atoms is chosen so as to make the neighbors and their distances apparent. Right: a portion of the three-dimensional FCC lattice; the size of the spheres is chosen so as to indicate the close-packing nature of this lattice.

The other close-packing arrangement of hard spheres is the hexagonal structure (HCP for hexagonal-close-packed), with 12 neighbors which are separated into two groups of six atoms each: the first group forms a planar six-member ring surrounding an atom at the center, while the second group consists of two equilateral triangles, one above and one below the six-member ring, with the central atom situated above or below the geometrical center of each equilateral triangle, as shown in Fig. 1.3. The HCP structure bears a certain relation to FCC: we can view both structures as planes of spheres closely packed in two dimensions, which gives a hexagonal lattice; for close packing in three dimensions the successive planes must be situated so that a sphere in one plane sits at the center of a triangle formed by three spheres in the previous plane. There are two ways to form such a stacking of hexagonal close-packed planes: ...ABCABC..., and ...ABABAB..., where A, B, C represent the three possible relative positions of spheres in successive planes according to the rules of close packing, as illustrated in Fig. 1.4. The first sequence corresponds to the FCC lattice, the second to the HCP lattice.

Figure 1.3. Left: one atom and its 12 neighbors in the hexagonal-close-packed (HCP) lattice; the size of the spheres representing atoms is chosen so as to make the neighbors and their distances apparent. Right: a portion of the three-dimensional HCP lattice; the size of the spheres is chosen so as to indicate the close-packing nature of this lattice.

Figure 1.4. The two possible close packings of spheres: Left: the ...ABCABC... stacking corresponding to the FCC crystal. Right: the ...ABABAB... stacking corresponding to the HCP crystal. The lattices are viewed along the direction of stacking of the hexagonal-close-packed planes.

An interesting variation of the close-packing theme of the FCC and HCP lattices is the following: consider two interpenetrating such lattices, that is, two FCC or two HCP lattices, arranged so that in the resulting crystal the atoms in each sublattice have as nearest equidistant neighbors atoms belonging to the other sublattice. These arrangements give rise to the diamond lattice or the zincblende lattice (when the two original lattices are FCC) and to the wurtzite lattice (when the two original lattices are HCP). This is illustrated in Fig. 1.5. Interestingly, in both cases each atom finds itself at the center of a tetrahedron with exactly four nearest neighbors. Since the nearest neighbors are exactly the same, these two types of lattices differ only in the relative positions of second (or farther) neighbors. It should be evident that the combination of two close-packed lattices cannot produce another close-packed lattice. Consequently, the diamond, zincblende and wurtzite lattices are encountered in covalent or ionic structures in which four-fold coordination is preferred. For example: tetravalent group IV elements such as C, Si, Ge form the diamond lattice; combinations of two different group IV elements or complementary elements (such as group III–group V, group II–group VI, group I–group VII) form the zincblende lattice; certain combinations of group III–group V elements form the wurtzite lattice. These structures are discussed in more detail below. A variation of the wurtzite lattice is also encountered in ice and is due to hydrogen bonding.

Figure 1.5. Top: illustration of two interpenetrating FCC (left) or HCP (right) lattices; these correspond to the diamond (or zincblende) and the wurtzite lattices, respectively. The lattices are viewed from the side, with the vertical direction corresponding to the direction along which close-packed planes of the FCC or HCP lattices would be stacked (see Fig. 1.4). The two original lattices are denoted by sets of white and shaded circles. All the circles of medium size would lie on the plane of the paper, while the circles of slightly smaller and slightly larger size (which are superimposed in this view) lie on planes behind and in front of the plane of the paper. Lines joining the circles indicate covalent bonds between nearest neighbor atoms. Bottom: a perspective view of a portion of the diamond (or zincblende) lattice, showing the tetrahedral coordination of all the atoms; this is the area enclosed by the dashed rectangle in the top panel, left side (a corresponding area can also be identified in the wurtzite lattice, upon reflection).

Yet another version of the close-packing arrangement is the icosahedral structure. In this case an atom again has 12 equidistant neighbors, which are at the apexes of an icosahedron. The icosahedron is one of the Platonic solids in which all the faces are perfect planar shapes; in the case of the icosahedron, the faces are 20 equilateral triangles. The icosahedron has 12 apexes arranged in five-fold symmetric rings,² as shown in Fig. 1.6. In fact, it turns out that the icosahedral arrangement is optimal for close packing of a small number of atoms, but it is not possible to fill three-dimensional space in a periodic fashion with icosahedral symmetry. This fact is a simple geometrical consequence (see also chapter 3 on crystal symmetries). Based on this observation it was thought that crystals with perfect five-fold (or ten-fold) symmetry could not exist, unless defects were introduced to allow for deviations from the perfect symmetry [2–4]. The discovery of solids that exhibited five-fold or ten-fold symmetry in their diffraction patterns, in the mid 1980s [5], caused quite a sensation. These solids were named “quasicrystals”, and their study created a new exciting subfield in condensed matter physics. They are discussed in more detail in chapter 12.

² An n-fold symmetry means that rotation by 2π/n around an axis leaves the structure invariant.

Figure 1.6. Left: one atom and its 12 neighbors in the icosahedral structure; the size of the spheres representing atoms is chosen so as to make the neighbors and their distances apparent. Right: a rendition of the icosahedron that illustrates its close-packing nature; this structure cannot be extended to form a periodic solid in three-dimensional space.

1.1.2 Atoms with s valence electrons

The second category consists of atoms that have only s valence electrons. These are Li, Na, K, Rb and Cs (the alkalis) with one valence electron, and Be, Mg, Ca, Sr and Ba with two valence electrons. The wavefunctions of valence electrons of all these elements extend far from the nucleus. In solids, the valence electron wavefunctions at one site have significant overlap with those at the nearest neighbor sites. Since the s states are spherically symmetric, the wavefunctions of valence electrons do not exhibit any particular preference for orientation of the nearest neighbors in space. For the atoms with one and two s valence electrons a simplified picture consists of all the valence electrons overlapping strongly, and thus being shared by all the atoms in the solid forming a “sea” of negative charge. The nuclei with their core electrons form ions, which are immersed in this sea of valence electrons. The ions have charge +1 for the alkalis and +2 for the atoms with two s valence electrons. The resulting crystal structure is the one which optimizes the electrostatic repulsion of the positively charged ions with their attraction by the sea of electrons. The actual structures are body-centered cubic (BCC) for all the alkalis, and FCC or HCP for the two-s-valence-electron atoms, except Ba, which prefers the BCC structure. In the BCC structure each atom has eight equidistant nearest neighbors as shown in Fig. 1.7, which is the second highest number of nearest neighbors in a simple crystalline structure, after FCC and HCP.

Figure 1.7. Left: one atom and its eight neighbors in the body-centered cubic (BCC) lattice; the size of the spheres representing atoms is chosen so as to make the neighbors and their distances apparent. Right: a portion of the three-dimensional BCC lattice; the size of the spheres is chosen so as to indicate the almost close-packing nature of this lattice.

One point deserves further clarification: we mentioned that the valence electrons have significant overlap with the electrons in neighboring atoms, and thus they are shared by all atoms in the solid, forming a sea of electrons. It may seem somewhat puzzling that we can jump from one statement (the overlap of electron orbitals in nearby atoms) to the other (the sharing of valence electrons by all atoms in the solid). The physical symmetry which allows us to make this jump is the periodicity of the crystalline lattice. This symmetry is the main feature of the external potential that the valence electrons feel in the bulk of a crystal: they are subjected to a periodic potential in space, in all three dimensions, which for all practical purposes extends to infinity, an idealized situation we discussed earlier. Just like in any quantum mechanical system, the electronic wavefunctions must obey the symmetry of the external potential, which means that the wavefunctions themselves must be periodic up to a phase. The mathematical formulation of this statement is called Bloch’s theorem and will be considered in detail later. A periodic wavefunction implies that if two atoms in the crystal share an electronic state due to overlap between atomic orbitals, then all equivalent atoms of the crystal share the same state equally, that is, the electronic state is delocalized over the entire solid. This behavior is central to the physics of solids, and represents a feature that is qualitatively different from what happens in atoms and molecules, where electronic states are localized (except in certain large molecules that possess symmetries akin to lattice periodicity).

1.1.3 Atoms with s and p valence electrons

The next level of complexity in crystal structure arises from atoms that have both s and p valence electrons. The individual p states are not spherically symmetric so they can form linear combinations with the s states that have directional character: a single p state has two lobes of opposite sign pointing in diametrically opposite directions. The s and p states, illustrated in Fig. 1.8, can then serve as the new basis for representing electron wavefunctions, and their overlap with neighboring wavefunctions of the same type can lead to interesting ways of arranging the atoms into a stable crystalline lattice (see Appendix B on the character of atomic orbitals).

Figure 1.8. Representation of the character of s, p, d atomic orbitals (panels: $s$, $p_x$, $p_y$, $p_z$, $d_{3z^2-r^2}$, $d_{x^2-y^2}$, $d_{xy}$). The lobes of opposite sign in the $p_x$, $p_y$, $p_z$ and $d_{x^2-y^2}$, $d_{xy}$ orbitals are shown shaded black and white. The $d_{yz}$, $d_{zx}$ orbitals are similar to the $d_{xy}$ orbital, but lie on the $yz$ and $zx$ planes.

In the following we will use the symbols $s(\mathbf{r})$, $p_l(\mathbf{r})$, $d_m(\mathbf{r})$, to denote atomic orbitals as they would exist in an isolated atom, which are functions of $\mathbf{r}$. When they are related to an atom A at position $\mathbf{R}_A$, these become functions of $\mathbf{r} - \mathbf{R}_A$ and are denoted by $s^A(\mathbf{r})$, $p_l^A(\mathbf{r})$, $d_m^A(\mathbf{r})$. We use $\phi_i^A(\mathbf{r})$ ($i = 1, 2, \ldots$) to denote linear combinations of the atomic orbitals at site A, and $\psi^n(\mathbf{r})$ ($n = a, b$) for combinations of $\phi_i^X(\mathbf{r})$'s ($X = A, B, \ldots$; $i = 1, 2, \ldots$) which are appropriate for the description of electronic states in the crystal.
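The statement that combining a p state with an s state produces directional character can be illustrated with the angular parts of the orbitals alone. The sketch below is a simplified illustration under stated assumptions: it uses the standard real angular functions ($1/\sqrt{4\pi}$ for s, $\sqrt{3/4\pi}\,x/r$ for $p_x$, and similarly for $p_y$, $p_z$), ignores the radial dependence entirely (in a real atom the s and p radial functions differ), and mixes s and $p_x$ with equal weights chosen only for illustration. The function names are hypothetical.

```python
import math


def s_angular(direction):
    """Angular part of an s orbital: the same in every direction."""
    return 1.0 / math.sqrt(4.0 * math.pi)


def p_angular(axis, direction):
    """Angular part of a real p orbital along axis 0 (x), 1 (y) or 2 (z)."""
    r = math.sqrt(sum(c * c for c in direction))
    return math.sqrt(3.0 / (4.0 * math.pi)) * direction[axis] / r


def hybrid_angular(c_s, c_p, direction):
    """Angular amplitude of c_s*s + c_p[0]*p_x + c_p[1]*p_y + c_p[2]*p_z."""
    value = c_s * s_angular(direction)
    for axis, coeff in enumerate(c_p):
        value += coeff * p_angular(axis, direction)
    return value


if __name__ == "__main__":
    plus_x, minus_x = (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)
    # A pure p_x state: two lobes of equal size and opposite sign
    print(p_angular(0, plus_x), p_angular(0, minus_x))
    # An equal-weight s + p_x mixture (illustrative choice of coefficients)
    c = 1.0 / math.sqrt(2.0)
    print(hybrid_angular(c, (c, 0.0, 0.0), plus_x),
          hybrid_angular(c, (c, 0.0, 0.0), minus_x))
```

In this simplified picture the mixture has a large lobe along +x and a much smaller lobe of opposite sign along −x, which is the sense in which such combinations "point" toward a neighbor.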

φ1 Now imagine that we place atoms A and B next to each other along the x axis. φ1 A separated by 120°. ﬁrst atom A and to its right atom B . φ2 . are called sp 2 orbitals. We arrange the distance so that there A B and φ1 .1) A A A . It is easy to show that. while φ1 ˆ direction.2) B B B .9. pzA which are orthonormal. we consider ﬁrst the linear combinations which constitute a new orthonormal basis of atomic orbitals: √ 1 A 2 A A φ1 = √ s + √ p x 3 3 1 1 1 A A A φ2 = √ s A − √ px + √ py 3 6 2 1 1 1 A A A φ3 = √ s A − √ px − √ py 3 6 2 A φ4 = pzA (1. We can form two b a B B =1 (φ A + φ1 ) and ψ1 =1 (φ A − φ1 ) of which the ﬁrst linear combinations. For A B ˆ direction. φ3 point along three directions on the x y plane The ﬁrst three orbitals. if the atomic orbitals are orthonormal. Let us assume that in the neutral isolated state of the atom we can occupy each of these orbitals by one electron. 1. 2. piA (i = x . thereby maximizing the interaction between these two orbitals. ψ1 2 1 2 1 .16 1 Atomic structure of crystals The possibility of combining these atomic orbitals to form covalent bonds in a crystal is illustrated by the following two-dimensional example. φ3 also point along three directions on the x y plane separated The orbitals φ1 by 120°. with the following linear combinations: √ 1 B 2 B B φ1 = √ s − √ px 3 3 1 1 1 B B B φ2 = √ s B + √ px − √ py 3 6 2 1 1 1 B B B φ3 = √ s B + √ px + √ py 3 6 2 B B φ4 = pz (1. 3) have energy ( s + 2 p )/3. points in a direction perpendicular to the x y plane. y . which we label B . since they are composed of one s and two p atomic orbitals. py . px . φ4 . φ2 . Imagine now a second identical atom. while the last one. A A labeled A. z ) have energies s and p . then the states φk (k = 1. which are pointing toward each is signiﬁcant overlap between orbitals φ1 other. but in the opposite sense (rotated by 180°) from those of atom A. with states s A . For an atom. at a distance a . points along the +x points along the −x example. 
as shown in Fig. these states. and the states s A . note that this is not the ground state of the atom.

(1. the energy level diagram for the s .1.1)). A (the arrows connect equivalent atoms). .9.1 Building crystals from atoms y y y z 17 2π /3 x y x φ1 φ2 φ3 2π /3 x φ4 x A’’ φ4 B ψ4a ψ1a ψ2a ψ3a A B B pB pB p x y z φ4 ψ4b ψ1b ψ2b ψ3b A A pA pA p x y z φB φB φB 1 2 3 φA φA φA 1 2 3 A’ sB sA Figure 1. Bottom: the graphitic plane (honeycomb lattice) and the C60 molecule. p atomic states. A . their sp 2 linear combinations (φiA and φiB ) and the bonding (ψib ) and antibonding (ψia ) states (up– down arrows indicate electrons spins). Illustration of covalent bonding in graphite. Top: the sp 2 linear combinations of s and p atomic orbitals (deﬁned in Eq. Middle: the arrangement of atoms on a plane with B at the center of an equilateral triangle formed by A .

maximizes the overlap and the second has a node at the midpoint between atoms A and B. As usual, we expect the symmetric linear combination of single-particle orbitals (called the bonding state) to have lower energy than the antisymmetric one (called the antibonding state) in the system of the two atoms; this is a general feature of how combinations of single-particle orbitals behave (see Problem 2). The exact energy of the bonding and antibonding states will depend on the overlap of the orbitals $\phi_1^A$, $\phi_1^B$. We can place two electrons, one from each atomic orbital, in the symmetric linear combination because of their spin degree of freedom; this is based on the assumption that the spin wavefunction of the two electrons is antisymmetric (a spin singlet), so that the total wavefunction, the product of the spatial and spin parts, is antisymmetric upon exchange of their coordinates, as it should be due to their fermionic nature. Through this exercise we have managed to lower the energy of the system, since the energy of $\psi^b$ is lower than the energy of $\phi_1^A$ or $\phi_1^B$. This is the essence of the chemical bond between two atoms, which in this case is called a covalent σ bond.

Imagine next that we repeat this exercise: we take another atom with the same linear combinations of orbitals as A, which we will call A′, and place it in the direction of the vector $\frac{1}{2}\hat{x} - \frac{\sqrt{3}}{2}\hat{y}$ relative to the position of atom B, and at the same distance $a$ as atom A from B. Due to our choice of orbitals, $\phi_2^B$ and $\phi_2^{A'}$ will be pointing toward each other. We can form symmetric and antisymmetric combinations from them, occupy the symmetric (lower energy) one with two electrons as before and create a second σ bond between atoms B and A′. Finally we repeat this procedure with a third atom A″ placed along the direction of the vector $\frac{1}{2}\hat{x} + \frac{\sqrt{3}}{2}\hat{y}$ relative to the position of atom B, and at the same distance $a$ as the previous two neighbors. Through the same procedure we can form a third σ bond between atoms B and A″, by forming the symmetric and antisymmetric linear combinations of the orbitals $\phi_3^B$ and $\phi_3^{A''}$. Now, as far as atom B is concerned, its three neighbors are exactly equivalent, so we consider the vectors that connect them as the repeat vectors at which equivalent atoms in the crystal should exist. If we place atoms of type A at all the possible integer multiples of these vectors, we form a lattice. To complete the lattice we have to place atoms of type B also at all the possible integer multiples of the same vectors, relative to the position of the original atom B. The resulting lattice is called the honeycomb lattice. Each atom of type A is surrounded by three atoms of type B and vice versa, as illustrated in Fig. 1.9.

Though this example may seem oversimplified, it actually corresponds to the structure of graphite, one of the most stable crystalline solids. In graphite, planes of C atoms in the honeycomb lattice are placed on top of each other to form a three-dimensional solid, but the interaction between planes is rather weak (similar to the van der Waals interaction). An indication of this weak bonding between planes compared to the in-plane bonds is that the distance between nearest neighbor atoms on a plane is 1.42 Å, whereas the distance between successive planes is 3.35 Å, a factor of 2.36 larger.
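The $sp^2$ construction can be checked numerically. The following is a minimal sketch, assuming NumPy and illustrative atomic levels `eps_s`, `eps_p` (placeholder values, not taken from the text): it verifies that the orbitals of Eq. (1.1) are orthonormal, that the first three have energy $(\epsilon_s + 2\epsilon_p)/3$, and that the neighbors of atom B lie along the three directions quoted above. All variable names are hypothetical.

```python
import numpy as np

# Rows: orbitals phi_1..phi_4 of atom A, Eq. (1.1).
# Columns: coefficients on the orthonormal atomic basis (s, p_x, p_y, p_z).
r2, r3, r6 = np.sqrt(2.0), np.sqrt(3.0), np.sqrt(6.0)
phi_A = np.array([
    [1 / r3,  r2 / r3,  0.0,     0.0],
    [1 / r3, -1 / r6,   1 / r2,  0.0],
    [1 / r3, -1 / r6,  -1 / r2,  0.0],
    [0.0,     0.0,      0.0,     1.0],
])
# Atom B, Eq. (1.2): the same orbitals with the signs of p_x and p_y reversed.
phi_B = phi_A * np.array([1.0, -1.0, -1.0, 1.0])

# Orthonormality: the overlap matrix of the new basis should be the identity.
overlap_A = phi_A @ phi_A.T

# Orbital energies for assumed (illustrative) atomic levels, in eV.
eps_s, eps_p = -17.5, -9.0
H_atom = np.diag([eps_s, eps_p, eps_p, eps_p])
energies_A = np.diag(phi_A @ H_atom @ phi_A.T)

def in_plane_direction(orbital):
    """Unit vector along the (p_x, p_y) part of an orbital."""
    v = orbital[1:3]
    return v / np.linalg.norm(v)

# The neighbors A, A', A'' of atom B sit along the directions in which
# phi_1^B, phi_2^B, phi_3^B point, at the bond length a.
a = 1.42  # in-plane nearest-neighbor distance in graphite (angstrom), from the text
neighbors_of_B = a * np.array([in_plane_direction(phi_B[k]) for k in range(3)])

# Repeat vectors connect equivalent atoms (A to A', and A to A'').
repeat_1 = neighbors_of_B[1] - neighbors_of_B[0]
repeat_2 = neighbors_of_B[2] - neighbors_of_B[0]

print("sp2 energies:", energies_A[:3], "expected:", (eps_s + 2 * eps_p) / 3)
print("neighbor directions of B:\n", neighbors_of_B / a)
print("repeat vector length / a:", np.linalg.norm(repeat_1) / a)
print("interplane / in-plane distance:", 3.35 / a)
```

In this sketch the repeat vectors come out with length $\sqrt{3}\,a$, the spacing between equivalent atoms on the honeycomb lattice.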

What about the orbitals $p_z$ (or $\phi_4$), which so far have not been used? If each atom had only three valence electrons, then these orbitals would be left empty since they have higher energy than the orbitals $\phi_1$, $\phi_2$, $\phi_3$, which are linear combinations of $s$ and $p$ orbitals (the original $s$ atomic orbitals have lower energy than $p$). In the case of C, each atom has four valence electrons so there is one electron left per atom when all the σ bonds have been formed. These electrons remain in the $p_z$ orbitals, which are perpendicular to the $xy$ plane and thus parallel to each other. Symmetric and antisymmetric combinations of neighboring $p_z^A$ and $p_z^B$ orbitals can also be formed (the states $\psi_4^b$, $\psi_4^a$, respectively), and the energy can be lowered by occupying the symmetric combination. In this case the overlap between neighboring $p_z$ orbitals is significantly smaller and the corresponding gain in energy significantly less than in σ bonds. This is referred to as a π bond, which is generally weaker than a σ bond. Carbon is a special case, in which the π bonds are almost as strong as the σ bonds.

An intriguing variation of this theme is a structure that contains pentagonal rings as well as the regular hexagons of the honeycomb lattice, while maintaining the three-fold coordination and bonding of the graphitic plane. The presence of pentagons introduces curvature in the structure, and the right combination of pentagonal and hexagonal rings produces the almost perfect sphere, shown in Fig. 1.9. This structure actually exists in nature! It was discovered in 1985 and it has revolutionized carbon chemistry and physics; its discoverers, R. F. Curl, H. W. Kroto and R. E. Smalley, received the 1996 Nobel prize for Chemistry. Many more interesting variations of this structure have also been produced, including "onions" (spheres within spheres) and "tubes" (cylindrical arrangements of three-fold coordinated carbon atoms). The tubes in particular seem promising for applications in technologically and biologically relevant systems. These structures have been nicknamed after Buckminster Fuller, an American scientist and practical inventor of the early 20th century, who designed architectural domes based on pentagons and hexagons; the nicknames are buckminsterfullerene or bucky-ball for C$_{60}$, bucky-onions, and bucky-tubes. The physics of these structures will be discussed in detail in chapter 13.

There is a different way of forming bonds between C atoms: consider the following linear combinations of the $s$ and $p$ atomic orbitals for atom A:

$$
\begin{aligned}
\phi_1^A &= \tfrac{1}{2}\left[ s^A - p_x^A - p_y^A - p_z^A \right] \\
\phi_2^A &= \tfrac{1}{2}\left[ s^A + p_x^A - p_y^A + p_z^A \right] \\
\phi_3^A &= \tfrac{1}{2}\left[ s^A + p_x^A + p_y^A - p_z^A \right] \\
\phi_4^A &= \tfrac{1}{2}\left[ s^A - p_x^A + p_y^A + p_z^A \right]
\end{aligned}
\tag{1.3}
$$

Figure 1.10. Illustration of covalent bonding in diamond. Top panel: representation of the $sp^3$ linear combinations of $s$ and $p$ atomic orbitals appropriate for the diamond structure, as defined in Eq. (1.3), using the same convention as in Fig. 1.8. Bottom panel: on the left side, the arrangement of atoms in the three-dimensional diamond lattice; an atom A is at the center of a regular tetrahedron (dashed lines) formed by equivalent B, B′, B″, B‴ atoms; the three arrows are the vectors that connect equivalent atoms. On the right side, the energy level diagram for the $s$, $p$ atomic states, their $sp^3$ linear combinations ($\phi_i^A$ and $\phi_i^B$) and the bonding ($\psi_i^b$) and antibonding ($\psi_i^a$) states. The up-down arrows indicate occupation by electrons in the two possible spin states. For a perspective view of the diamond lattice, see Fig. 1.5.

It is easy to show that the energy of these states, which are degenerate, is equal to $(\epsilon_s + 3\epsilon_p)/4$, where $\epsilon_s$, $\epsilon_p$ are the energies of the original $s$ and $p$ atomic orbitals; the new states, which are composed of one $s$ and three $p$ orbitals, are called $sp^3$ orbitals. These orbitals point along the directions from the center to the corners of a regular tetrahedron, as illustrated in Fig. 1.10. We can now imagine placing atoms B, B′, B″, B‴ at the corners of the tetrahedron, with which we associate linear combinations of $s$ and $p$ orbitals just like those for atom A, but having all the signs of the $p$ orbitals reversed:

$$
\begin{aligned}
\phi_1^B &= \tfrac{1}{2}\left[ s^B + p_x^B + p_y^B + p_z^B \right] \\
\phi_2^B &= \tfrac{1}{2}\left[ s^B - p_x^B + p_y^B - p_z^B \right] \\
\phi_3^B &= \tfrac{1}{2}\left[ s^B - p_x^B - p_y^B + p_z^B \right] \\
\phi_4^B &= \tfrac{1}{2}\left[ s^B + p_x^B - p_y^B - p_z^B \right]
\end{aligned}
\tag{1.4}
$$
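As with the $sp^2$ case, the properties claimed for the $sp^3$ orbitals can be verified directly from their coefficients. The sketch below assumes NumPy and the same placeholder atomic levels `eps_s`, `eps_p` (illustrative values, not from the text). It checks orthonormality, the degenerate energy $(\epsilon_s + 3\epsilon_p)/4$, and the angle between any two orbital directions, which is the tetrahedral angle of about 109.47°.

```python
import numpy as np

# Signs of (s, p_x, p_y, p_z) in each sp3 orbital of atom A, Eq. (1.3).
signs_A = np.array([
    [1, -1, -1, -1],
    [1,  1, -1,  1],
    [1,  1,  1, -1],
    [1, -1,  1,  1],
], dtype=float)
phi_A = 0.5 * signs_A
# Atom B, Eq. (1.4): all p-orbital signs reversed.
phi_B = phi_A * np.array([1.0, -1.0, -1.0, -1.0])

# Orthonormality of the new basis.
overlap_A = phi_A @ phi_A.T

# Energies for assumed (illustrative) atomic levels, in eV.
eps_s, eps_p = -17.5, -9.0
H_atom = np.diag([eps_s, eps_p, eps_p, eps_p])
energies_A = np.diag(phi_A @ H_atom @ phi_A.T)

def directions(phi):
    """Unit vectors along the p part (p_x, p_y, p_z) of each orbital."""
    p_part = phi[:, 1:]
    return p_part / np.linalg.norm(p_part, axis=1, keepdims=True)

dirs_A = directions(phi_A)
dirs_B = directions(phi_B)

# Angle between two different orbitals on the same atom.
cosines = dirs_A @ dirs_A.T
tetrahedral_angle = np.degrees(np.arccos(cosines[0, 1]))

print("sp3 energies:", energies_A, "expected:", (eps_s + 3 * eps_p) / 4)
print("angle between orbitals (deg):", tetrahedral_angle)
print("B orbitals point opposite to A orbitals:", np.allclose(dirs_B, -dirs_A))
```

Because each $\phi_k^B$ points opposite to $\phi_k^A$, placing a B atom along the direction of $\phi_k^A$ makes the two orbitals point toward each other.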

Then we will have a situation where the $\phi$ orbitals on neighboring A and B atoms will be pointing toward each other, and we can form symmetric and antisymmetric combinations of those, $\psi^b$, $\psi^a$, respectively, to create four σ bonds around atom A. The exact energy of the $\psi$ orbitals will depend on the overlap between the $\phi^A$ and $\phi^B$ orbitals; for sufficiently strong overlap, we can expect the energy of the $\psi^b$ states to be lower than the original $s$ atomic orbitals and those of the $\psi^a$ states to be higher than the original $p$ atomic orbitals, as shown schematically in Fig. 1.10. The vectors connecting the equivalent B, B′, B″, B‴ atoms define the repeat vectors at which atoms must be placed to form an infinite crystal. By placing both A-type and B-type atoms at all the possible integer multiples of these vectors we create the diamond lattice, shown in Fig. 1.10. This is the other stable form of bulk C. Since C has four valence electrons and each atom at the center of a tetrahedron forms four σ bonds with its neighbors, all electrons are taken up by the bonding states. This results in a very stable and strong three-dimensional crystal. Surprisingly, graphite has a somewhat lower internal energy than diamond, that is, the thermodynamically stable solid form of carbon is the soft, black, cheap graphite rather than the very strong, brilliant and very expensive diamond crystal!

The diamond lattice, with four neighbors per atom, is relatively open compared to the close-packed lattices. Its stability comes from the very strong covalent bonds formed between the atoms. Two other elements with four valence $s$ and $p$ electrons, namely Si and Ge, also crystallize in the diamond, but not the graphite, lattice. There are two more elements with four valence $s$ and $p$ electrons in the Periodic Table, Sn and Pb. Sn forms crystal structures that are distorted variants of the diamond lattice, since its σ bonds are not as strong as those of the other group-IV-A elements, and it can gain some energy by increasing the number of neighbors (from four to six) at the expense of perfect tetrahedral σ bonds. Pb, on the other hand, behaves more like a metal, preferring to optimize the number of neighbors, and forms the FCC crystal (see also below). Interestingly, elements with only three valence $s$ and $p$ electrons, like B, Al, Ga, In and Tl, do not form the graphite structure, as alluded to above. They instead form more complex structures in which they try to optimize bonding given their relatively small number of valence electrons per atom. Some examples: the common structural unit for B is the icosahedron, shown in Fig. 1.6, and such units are close packed to form the solid; Al forms the FCC crystal and is the representative metal with $s$ and $p$ electrons and a close-packed structure; Ga forms quite complicated crystal structures with six or seven near neighbors (not all of them at the same distance); In forms a distorted version of the cubic close packing in which the 12 neighbors are split into a group of four and another group of eight equidistant atoms. None of these structures can be easily described in terms of the notions introduced above to handle $s$ and $p$ valence electrons, demonstrating the limitations of this simple approach.
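The statement about "sufficiently strong overlap" in the discussion of the $\psi^b$ and $\psi^a$ states can be made concrete with a two-level model. The sketch below is an illustration under stated assumptions, not the treatment used in the text: the two hybrids pointing toward each other are taken as orthogonal, each with energy $\epsilon_h = (\epsilon_s + 3\epsilon_p)/4$, and their interaction is represented by a single hypothetical coupling `t`. The values of `eps_s`, `eps_p` and `t` are placeholders.

```python
import numpy as np

def bond_levels(eps_h, t):
    """Bonding and antibonding energies of two coupled hybrids.

    Assumes orthogonal hybrids of equal energy eps_h and a coupling -|t|
    between them (a hypothetical parameter standing in for the overlap).
    """
    H = np.array([[eps_h, -abs(t)],
                  [-abs(t), eps_h]])
    energies, states = np.linalg.eigh(H)  # eigenvalues in ascending order
    return energies[0], energies[1], states

# Illustrative (assumed) atomic levels in eV.
eps_s, eps_p = -17.5, -9.0
eps_h = (eps_s + 3 * eps_p) / 4

# In this model psi^b drops below eps_s once |t| exceeds eps_h - eps_s,
# and psi^a rises above eps_p once |t| exceeds eps_p - eps_h.
t_for_bonding_below_s = eps_h - eps_s
t_for_antibonding_above_p = eps_p - eps_h

for t in (1.0, 7.0):
    e_b, e_a, states = bond_levels(eps_h, t)
    print(f"t = {t}: E_b = {e_b:.3f}, E_a = {e_a:.3f}, "
          f"below s: {e_b < eps_s}, above p: {e_a > eps_p}")
```

Within this model the bonding state is the symmetric combination and lies at $\epsilon_h - |t|$, the antibonding state at $\epsilon_h + |t|$; only the larger coupling places both outside the window between the atomic $s$ and $p$ levels.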

Of the other elements in the Periodic Table with $s$ and $p$ valence electrons, those with five electrons, N, P, As, Sb and Bi, tend to form complex structures where atoms have three σ bonds to their neighbors but not in a planar configuration. A characteristic structure is one in which the three $p$ valence electrons participate in covalent bonding while the two $s$ electrons form a filled state which does not contribute much to the cohesion of the solid; this filled state is called the "lone pair" state. If the covalent bonds were composed of purely $p$ orbitals the bond angles between nearest neighbors would be 90°; instead, the covalent bonds in these structures are a combination of $s$ and $p$ orbitals with predominant $p$ character, and the bond angles are somewhere between 120° ($sp^2$ bonding) and 90° (pure $p$ bonding), as illustrated in Fig. 1.11. The structure of solid P is represented by this kind of atomic arrangement. In this structure, the covalent bonds are arranged in puckered hexagons which form planes, and the planes are stacked on top of each other to form the solid. The interaction between planes is much weaker than that between atoms on a single plane: an indication of this difference in bonding is the fact that the distance between nearest neighbors in a plane is 2.17 Å while the closest distance between atoms on successive planes is 3.87 Å, almost a factor of 2 larger. The structures of As, Sb and Bi follow the same general pattern with three-fold bonded atoms, but in those solids there exist additional covalent bonds between the planes of puckered atoms so that the structure is not clearly planar as is the case for P. An exception to this general tendency is nitrogen, the lightest element with five valence electrons which forms a crystal composed of nitrogen molecules; the N$_2$ unit is particularly stable.

The elements with six $s$ and $p$ valence electrons, O, S, Se, Te and Po, tend to form molecular-like ring or chain structures with two nearest neighbors per atom, which are then packed to form three-dimensional crystals. These rings or chains are puckered and form bonds at angles that try to satisfy bonding requirements analogous to those described for the solids with four $s$ and $p$ valence electrons. Examples of such units are shown in Fig. 1.12. Since these elements have a valence of 6, they tend to keep four of their electrons in one filled $s$ and one filled $p$ orbital and form covalent bonds to two neighbors with their other two $p$ orbitals. This picture is somewhat oversimplified, since significant hybridization takes place between $s$ and $p$ orbitals that participate in bonding, so that the preferred angle between the bonding orbitals is not 90°, as pure $p$ bonding would imply, but ranges between 102° and 108°. Typical distances between nearest neighbor atoms in the rings or the chains are 2.06 Å for S, 2.32 Å for Se and 2.86 Å for Te, while typical distances between atoms in successive units are 3.50 Å for S, 3.46 Å for Se and 3.74 Å for Te; that is, the ratio of distances between atoms across bonding units and within a bonding unit is 1.7 for S, 1.5 for Se and 1.3 for Te. An exception

to this general tendency is oxygen, the lightest element with six valence electrons which forms a crystal composed of oxygen molecules; the O$_2$ unit is particularly stable. The theme of diatomic molecules as the basic unit of the crystal, already mentioned for nitrogen and oxygen, is common in elements with seven $s$ and $p$ valence electrons also: chlorine, bromine and iodine form solids by close packing of diatomic molecules.

Figure 1.11. The layers of buckled atoms that correspond to the structure of group-V elements: all atoms are three-fold coordinated as in a graphitic plane, but the bond angles between nearest neighbors are not 120° and hence the atoms do not lie on the plane. For illustration two levels of buckling are shown: in the first structure the bond angles are 108°, in the second 95°. The planes are stacked on top of each other as in graphite to form the 3D solids.

Figure 1.12. Characteristic units that appear in the solid forms of S, Se and Te: six-fold rings (S), eight-fold rings (Se) and one-dimensional chains (Se and Te). The solids are formed by close packing of these units.
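The bond angles quoted for the group-V and group-VI elements, which fall between the pure $p$ value of 90° and the $sp^2$ value of 120°, can be related to the amount of $s$ character in the bonding orbitals. The relation used below is not stated in the text; it is the standard orthogonality condition for equivalent hybrids of the form $\sqrt{\alpha}\, s + \sqrt{1-\alpha}\, p_{\hat{n}}$, which gives $\cos\theta = -\alpha/(1-\alpha)$ for the angle $\theta$ between two of them. The function names are hypothetical, and the sketch assumes all bonding hybrids on an atom are equivalent.

```python
import numpy as np

def s_fraction(bond_angle_deg):
    """s-character fraction of equivalent s-p hybrids separated by the given angle.

    Follows from requiring two hybrids sqrt(alpha)*s + sqrt(1-alpha)*p_n
    to be orthogonal: alpha + (1 - alpha)*cos(theta) = 0.
    """
    c = np.cos(np.radians(bond_angle_deg))
    return c / (c - 1.0)

def bond_angle(s_frac):
    """Inverse relation: angle in degrees between equivalent hybrids."""
    return np.degrees(np.arccos(-s_frac / (1.0 - s_frac)))

# Angles mentioned in the text and in the caption of Fig. 1.11.
for angle in (90.0, 95.0, 102.0, 108.0, 120.0):
    print(f"{angle:6.1f} deg -> s fraction {s_fraction(angle):.3f}")

# Reference points: sp2 (alpha = 1/3) and sp3 (alpha = 1/4).
print("sp2 angle:", bond_angle(1.0 / 3.0))
print("sp3 angle:", bond_angle(0.25))
```

Under this assumption, angles between 95° and 108° correspond to an $s$ fraction below the $sp^3$ value of 1/4, consistent with the description of bonds with predominant $p$ character.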

1.1.4 Atoms with s and d valence electrons

This category includes all the atoms in the middle columns of the Periodic Table, that is, columns I-B–VII-B and VIII. The $d$ orbitals in atoms have directional nature like the $p$ orbitals. However, since there are five $d$ orbitals it is difficult to construct linear combinations with $s$ orbitals that would neatly point toward nearest neighbors in three-dimensional space and produce a crystal with simple σ bonds. Moreover, the $d$ valence orbitals typically lie lower in energy than the $s$ valence orbitals and therefore do not participate as much in bonding (see for example the discussion about Ag, in chapter 4). Note that $d$ orbitals can form strong covalent bonds by combining with $p$ orbitals of other elements, as we discuss in a subsequent section. Thus, elements with $s$ and $d$ valence electrons tend to form solids where the $s$ electrons are shared among all atoms in the lattice, just like elements with one or two $s$ valence electrons. These elements form space-filling close-packed crystals, of the FCC, HCP and BCC type. There are very few exceptions to this general tendency, namely Mn, which forms a very complex structure with a cubic lattice and a very large number of atoms (58) in the unit cell, and Hg, which forms a low-symmetry rhombohedral structure. Even those structures, however, are slight variations of the basic close-packing structures already mentioned. For instance, the Mn structure, in which atoms have from 12 to 16 neighbors, is a slight variation of the BCC structure. The crystals formed by most of these elements typically have metallic character.

1.1.5 Atoms with s, d and f valence electrons

The same general trends are found in the rare earth elements, which are grouped in the lanthanides (atomic numbers 58–71) and the actinides (atomic numbers 90 and beyond). Of those we discuss briefly the lanthanides as the more common of the rare earths that are found in solids. In these elements the $f$ shell is gradually filled as the atomic number increases, starting with an occupation of two electrons in Ce and completing the shell with 14 electrons in Lu. The $f$ orbitals have directional character which is even more complicated than that of $p$ or $d$ orbitals. The solids formed by these elements are typically close-packed structures such as FCC and HCP, with a couple of exceptions (Sm which has rhombohedral structure and Eu which has BCC structure). They are metallic solids with high atomic densities. However, more interesting are structures formed by these elements and other elements of the Periodic Table, in which the complex character of the $f$ orbitals can be exploited in combination with orbitals from neighboring atoms to form strong bonds. Alternatively, these elements are used as dopants in complicated crystals, where they donate some of their electrons to

states formed by other atoms. One such example is discussed in the following sections.

1.1.6 Solids with two types of atoms

Some of the most interesting and useful solids involve two types of atoms, which in some ways are complementary. One example that comes immediately to mind are solids composed of atoms in the first (group I-A) and seventh (group VII-A) columns of the Periodic Table, which have one and seven valence electrons, respectively. Solids composed of such elements are referred to as "alkali halides". It is natural to expect that the atom with one valence electron will lose it to the more electronegative atom with the seven valence electrons, which then acquires a closed electronic shell, completing the $s$ and $p$ levels. This of course leads to one positively and one negatively charged ion, which are repeated periodically in space to form a lattice. The easiest way to arrange such atoms is at the corners of a cube, with alternating corners occupied by atoms of the opposite type. This arrangement results in the sodium chloride (NaCl) or rock-salt structure, one of the most common crystals. Many combinations of group I-A and group VII-A atoms form this kind of crystal. In this case each ion has six nearest neighbors of the opposite type. A different way to arrange the ions is to have one ion at the center of a cube formed by ions of the opposite type. This arrangement forms two interpenetrating cubic lattices and is known as the cesium chloride (CsCl) structure. In this case each ion has eight nearest neighbors of the opposite type. Several combinations of group I-A and group VII-A atoms crystallize in this structure. Since in all these structures the group I-A atoms lose their $s$ valence electron to the group VII-A atoms, this type of crystal is representative of ionic bonding. Both of these ionic structures are shown in Fig. 1.13.

Another way of achieving a stable lattice composed of two kinds of ions with opposite sign is to place them in the two interpenetrating FCC sublattices of the diamond lattice, described earlier. In this case each ion has four nearest neighbors of the opposite type, as shown in Fig. 1.14. Many combinations of atoms in the I-B column of the Periodic Table and group VII-B atoms crystallize in this structure, which is called the zincblende structure from the German term for ZnS, the representative solid with this lattice.

The elements in the I-B column have a filled $d$-shell (ten electrons) and one extra $s$ valence electron, so it is natural to expect them to behave in some ways similar to the alkali metals. However, the small number of neighbors in this structure, as opposed to the rock-salt and cesium chloride structures, suggests that the cohesion of these solids cannot be attributed to simple ionic bonding alone. In fact, this becomes more pronounced when atoms from the second (group II-B) and sixth (group VI-A)

columns of the Periodic Table form the zincblende structure (ZnS itself is the prime example). In this case we would have to assume that the group II atoms lose their two electrons to the group VI atoms, but since the electronegativity difference is not as great between these two types of elements as between group I-A and group VII-A elements, something more than ionic bonding must be involved. Indeed the crystals of group II and group VI atoms in the zincblende structure are good examples of mixed ionic and covalent bonding. This trend extends to one more class of solids: group III-A and group V-A atoms also form zincblende crystals, for example AlP, GaAs, InSb, etc. In this case, the bonding is even more tilted toward covalent character, similar to the case of group IV atoms which form the diamond lattice. Finally, there are combinations of two group IV atoms that form the zincblende structure; some interesting examples are SiC and GeSi alloys.

Figure 1.13. Left: the rock-salt, NaCl, structure, in which the ions form a simple cubic lattice with each ion surrounded by six neighbors of the opposite type. Right: the CsCl structure, in which the ions form a body-centered cubic lattice with each ion surrounded by eight neighbors of the opposite type.

Figure 1.14. Left: the zincblende lattice in which every atom is surrounded by four neighbors of the opposite type, in mixed ionic and covalent bonding; several III-V, II-VI and IV-IV solids exist in this lattice. Right: a representative SiO$_2$ structure, in which each Si atom has four O neighbors, and each O atom has two Si neighbors.
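The coordination numbers quoted for the rock-salt, CsCl and zincblende structures (six, eight and four neighbors of the opposite type) can be reproduced by direct counting. The sketch below assumes NumPy, a cubic cell of unit edge, and the usual description of each structure as one sublattice of ions plus a second sublattice displaced by a fixed offset; the function name and the `reach` parameter are hypothetical.

```python
import itertools
import numpy as np

def coordination(lattice_vectors, offset, reach=2):
    """Count nearest opposite-type ions around an ion at the origin.

    The opposite-type ions sit at every lattice point plus `offset`.
    Returns (number of nearest neighbors, their distance).
    """
    shifts = np.array(list(itertools.product(range(-reach, reach + 1), repeat=3)))
    positions = shifts @ np.asarray(lattice_vectors, dtype=float) + np.asarray(offset)
    distances = np.linalg.norm(positions, axis=1)
    d_min = distances.min()
    return int(np.isclose(distances, d_min).sum()), float(d_min)

# Primitive vectors (rows) in units of the cubic cell edge.
simple_cubic = np.eye(3)
fcc = 0.5 * np.array([[0, 1, 1],
                      [1, 0, 1],
                      [1, 1, 0]], dtype=float)

structures = {
    "rock-salt":  (fcc,          [0.5, 0.5, 0.5]),     # two FCC sublattices
    "CsCl":       (simple_cubic, [0.5, 0.5, 0.5]),     # two simple cubic sublattices
    "zincblende": (fcc,          [0.25, 0.25, 0.25]),  # the two sublattices of diamond
}

for name, (vectors, offset) in structures.items():
    count, distance = coordination(vectors, offset)
    print(f"{name}: {count} nearest neighbors at distance {distance:.4f}")
```

The only difference between rock-salt and zincblende in this description is the offset between the two FCC sublattices, which is what reduces the number of opposite-type neighbors from six to four.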

A variation on this theme is a class of solids composed of Si and O. In these solids, each Si atom has four O neighbors and is situated at the center of a tetrahedron, while each O atom has two Si neighbors, as illustrated in Fig. 1.14. In this manner the valences of both Si and O are perfectly satisfied, so that the structure can be thought of as covalently bonded. Due to the large electronegativity of O, the covalent bonds are polarized to a large extent, so that the two types of atoms can be considered as partially ionized. This results again in a mixture of covalent and ionic bonding. The tetrahedra of Si–O atoms are very stable units and the relative positions of atoms in a tetrahedron are essentially fixed. The position of these tetrahedra relative to each other, however, can be changed with little cost in energy, because this type of structural distortion involves only a slight bending of bond angles, without changing bond lengths. This freedom in relative tetrahedron orientation produces a very wide variety of solids based on this structural unit, including amorphous structures, such as common glass, and structures with many open spaces in them, such as the zeolites.

1.1.7 Hydrogen: a special one-s-valence-electron atom

So far we have left H out of the discussion. This is because H is a special case: it has no core electrons. Its interaction with the other elements, as well as between H atoms, is unusual, because when H tries to share its one valence $s$ electron with other atoms, what is left is a bare proton rather than a nucleus shielded partially by core electrons. The proton is an ion much smaller in size than the other ions produced by stripping the valence electrons from atoms: its size is $10^{-15}$ m, five orders of magnitude smaller than typical ions, which have a size of order 1 Å. It also has the smallest mass, which gives it a special character: in all other cases (except for He) we can consider the ions as classical particles, due to their large mass, while in the case of hydrogen, its light mass implies a large zero-point motion which makes it necessary to take into account the quantum nature of the proton's motion. Yet another difference between hydrogen and all other elements is the fact that its $s$ valence electron is very strongly bound to the nucleus: the ionization energy is 13.6 eV, whereas typical ionization energies of valence electrons in other elements are in the range 1–2 eV. Due to its special character, H forms a special type of bond called "hydrogen bond". This is encountered in many structures composed of molecules that contain H atoms, such as organic molecules and water.

The solid in which hydrogen bonding plays the most crucial role is ice. Ice forms many complex phases [6]; in its ordinary phase called I$_h$, the H$_2$O molecules are placed so that the O atoms occupy the sites of a wurtzite lattice (see Fig. 1.5), while the H atoms are along lines that join O atoms [7]. There are two H atoms attached to each O atom by short covalent bonds (of length 1.00 Å), while the distance between O atoms is 2.75 Å. There is one H atom along each line joining two O atoms. The

bond between a H atom and an O atom to which it is not covalently bonded is called a hydrogen bond, and, in this system, has length 1.75 Å; it is these hydrogen bonds that give stability to the crystal. This is illustrated in Fig. 1.15. The hydrogen bond is much weaker than the covalent bond between H and O in the water molecule: the energy of the hydrogen bond is 0.3 eV, while that of the covalent H–O bond is 4.8 eV. There are many ways of arranging the H atoms within these constraints for a fixed lattice of O atoms, giving rise to a large configurational entropy. Other forms of ice have different lattices, but this motif of local bonding is common.

Figure 1.15. Left: illustration of hydrogen bonding between water molecules in ice: the O atom is at the center of a tetrahedron formed by other O atoms, and the H atoms are along the directions joining the center to the corners of the tetrahedron. The O–H covalent bond distance is $a = 1.00$ Å, while the H–O hydrogen bond distance is $b = 1.75$ Å. The relative position of atoms is not given to scale, in order to make it easier to visualize which H atoms are attached by covalent bonds to the O atoms. Right: illustration of the structure of I$_h$ ice: the O atoms sit at the sites of a wurtzite lattice (compare with Fig. 1.5) and the H atoms are along the lines joining O atoms; there is one H atom along each such line, and two H atoms are bonded by short covalent bonds to each O atom.

Within the atomic orbital picture discussed earlier for solids with $s$ and $p$ electrons, we can construct a simple argument to rationalize hydrogen bonding in the case of ice. The O atom has six valence electrons in its $s$ and $p$ shells and therefore needs two more electrons to complete its electronic structure. The two H atoms that are attached to it to form the water molecule provide these two extra electrons, at the cost of an anisotropic bonding arrangement (a completed electronic shell should be isotropic, as in the case of Ne which has two more electrons than O). The cores of the H atoms (the protons), having lost their electrons to O, experience a Coulomb repulsion. The most favorable structure for the molecule which optimizes this repulsion would be to place the two H atoms in diametrically opposite positions relative to the O atom, but this would involve only one $p$ orbital of the O atom to which both H atoms would bond. This is an unfavorable situation as far as formation of covalent bonds is concerned, because it is not possible to

form two covalent bonds with only one $p$ orbital and two electrons from the O atom. A compromise between the desire to form strong covalent bonds and the repulsion between the H cores is the formation of four $sp^3$ hybrids from the orbitals of the O atom, two of which form covalent bonds with the H atoms, while the other two are filled by two electrons each. This produces a tetrahedral structure with two lobes which have more positive charge (the two $sp^3$ orbitals to which the H atoms are bonded) than the other two lobes (the two $sp^3$ orbitals which are occupied by two electrons each). It is natural to expect that bringing similar molecular units together would produce some attraction between the lobes of opposite charge in neighboring units. This is precisely the arrangement of molecules in the structure of ice discussed above and shown in Fig. 1.15. This rationalization, however, is somewhat misleading as it suggests that the hydrogen bond, corresponding to the attraction between oppositely charged lobes of the H$_2$O tetrahedra, is essentially ionic. In fact, the hydrogen bond has significant covalent character as well: the two types of orbitals pointing toward each other form bonding (symmetric) and antibonding (antisymmetric) combinations leading to covalent bonds between them. This point of view was originally suggested by Pauling [8] and has remained controversial until recently, when sophisticated scattering experiments and quantum mechanical calculations provided convincing evidence in its support [9].

The solid phases of pure hydrogen are also unusual. At low pressure and temperature, H is expected to form a crystal composed of H$_2$ molecules in which every molecule behaves almost like an inert unit, with very weak interactions to the other molecules. At higher pressure, H is supposed to form an atomic solid when the molecules have approached each other enough so that their electronic distributions are forced to overlap strongly [10]. However, the conditions of pressure and temperature at which this transition occurs, and the structure of the ensuing atomic solid, are still a subject of active research [11–13]. The latest estimates are that it takes more than 3 Mbar of pressure to form the atomic H solid, which can only be reached under very special conditions in the laboratory, and which has been achieved only in the 1990s. There is considerable debate about what the crystal structure at this pressure should be, and although the BCC structure seems to be the most likely phase, by analogy to all other alkalis, this has not been unambiguously proven to date.

1.1.8 Solids with many types of atoms

If we allow several types of atoms to participate in the formation of a crystal, many more possibilities open up. There are indeed many solids with complex composition, but the types of bonding that occur in these situations are variants of

the types we have already discussed: metallic, covalent, ionic, van der Waals and hydrogen bonding. In many situations, several of these types of bonding are present simultaneously.

One interesting example of such complex structures is the class of ceramic materials in which high-temperature superconductivity (HTSC) was observed in the mid-1980s (this discovery, by J. G. Bednorz and K. A. Müller, was recognized by the 1987 Nobel prize for Physics). In these materials strong covalent bonding between Cu and O forms one-dimensional or two-dimensional structures where the basic building block is oxygen octahedra; rare earth atoms are then placed at hollow positions of these backbone structures, and become partially ionized giving rise to mixed ionic and covalent bonding (see, for example, Fig. 1.16).

Figure 1.16. Left: a Cu atom surrounded by six O atoms, which form an octahedron; the Cu–O atoms are bonded by strong covalent bonds. Right: a set of corner-sharing O octahedra, forming a two-dimensional square lattice. The octahedra can also be joined at the remaining apexes to form a fully three-dimensional lattice. The empty spaces between the octahedra can accommodate atoms which are easily ionized, to produce a mixed covalent–ionic structure.

The motif of oxygen octahedra with a metal atom at the center to which the O atoms are covalently bonded, supplemented by atoms which are easily ionized, is also the basis for a class of structures called "perovskites". The chemical formula of perovskites is ABO$_3$, where A is the easily ionized element and B the element which is bonded to the oxygens. The basic unit is shown in Fig. 1.17: bonding in the $xy$ plane is accomplished through the overlap between the $p_x$ and $p_y$ orbitals of the first (O$_1$) and second (O$_2$) oxygen atoms, respectively, and the $d_{x^2-y^2}$ orbital of B; bonding along the $z$ axis is accomplished through the overlap between the $p_z$ orbital of the third (O$_3$) oxygen atom and the $d_{3z^2-r^2}$ orbital of B (see Fig. 1.8 for the nature of these $p$ and $d$ orbitals). The A atoms provide the necessary number of electrons to satisfy all the covalent bonds. Thus, the overall bonding involves

both strong covalent character between B and O, as well as ionic character between the B–O units and the A atoms. The complexity of the structure gives rise to several interesting properties, such as ferroelectricity, that is, the ability of the solid to acquire and maintain an internal dipole moment. The dipole moment is associated with a displacement of the B atom away from the center of the octahedron, which breaks the symmetry of the cubic lattice.

Figure 1.17. The basic structural unit of perovskites ABO$_3$ (upper left) and the atomic orbitals that contribute to covalent bonding. The three distinct oxygen atoms in the unit cell are labeled O$_1$, O$_2$, O$_3$ (shown as the small open circles in the structural unit); the remaining oxygen atoms are related to those by the repeat vectors of the crystal, indicated as $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$. The six oxygen atoms form an octahedron at the center of which sits the B atom. The thin lines outline the cubic unit cell, while the thicker lines between the oxygen atoms and B represent the covalent bonds in the structural unit. The $p_x$, $p_y$, $p_z$ orbitals of the three O atoms and the $d_{x^2-y^2}$, $d_{3z^2-r^2}$ orbitals of the B atoms that participate in the formation of covalent bonds in the octahedron are shown schematically.

These solids have very intriguing behavior: when external pressure is applied on them it tends to change the shape of the unit cell of the crystal and therefore produces an electrical response since it affects the internal dipole moment; conversely, an external electric field can also affect the internal dipole moment and the solid changes its shape to accommodate it. This coupling of mechanical and electrical responses is very useful for practical applications, such as sensors and actuators and non-volatile memories. The solids that exhibit this behavior are called piezoelectrics; some examples are CaTiO$_3$

(calcium titanate), PbTiO$_3$ (lead titanate), BaTiO$_3$ (barium titanate), PbZrO$_3$ (lead zirconate).

Another example of complex solids is the class of crystals formed by fullerene clusters and alkali metals: there is strong covalent bonding between C atoms in each fullerene cluster, weak van der Waals bonding between the fullerenes, and ionic bonding between the alkali atoms and the fullerene units. The clusters act just like the group VII atoms in ionic solids, taking up the electrons of the alkali atoms and becoming ionized. It is intriguing that these solids also exhibit superconductivity at relatively high temperatures!

1.2 Bonding in solids

In our discussion on the formation of solids from atoms we encountered five general types of bonding in solids:

(1) Van der Waals bonding, which is formed by atoms that do not have valence electrons available for sharing (the noble elements), and is rather weak; the solids produced in this way are not particularly stable.
(2) Metallic bonding, which is formed when electrons are shared by all the atoms in the solid, producing a uniform "sea" of negative charge; the solids produced in this way are the usual metals.
(3) Covalent bonding, which is formed when electrons in well defined directional orbitals, which can be thought of as linear combinations of the original atomic orbitals, have strong overlap with similar orbitals in neighboring atoms; the solids produced in this way are semiconductors or insulators.
(4) Ionic bonding, which is formed when two different types of atoms are combined, one that prefers to lose some of its valence electrons and become a positive ion, and one that prefers to grab electrons from other atoms and become a negative ion. Combinations of such elements are I–VII, II–VI, and III–V. In the first case bonding is purely ionic, in the other two there is a degree of covalent bonding present.
(5) Hydrogen bonding, which is formed when H is present, due to its lack of core electrons, its light mass and high ionization energy.

For some of these cases, it is possible to estimate the strength of bonding without involving a detailed description of the electronic behavior. Specifically, for van der Waals bonding and for purely ionic bonding it is sufficient to assume simple classical models. For van der Waals bonding, one assumes that there is an attractive potential between the atoms which behaves like $\sim r^{-6}$ with distance $r$ between atoms (this behavior can actually be derived from perturbation theory, see Problem 4). The potential must become repulsive at very short range, as the electronic densities of the two atoms start overlapping, but electrons have no incentive to form bonding states (as was the case in covalent bonding) since all electronic shells are already

filled. For convenience the repulsive part is taken to be proportional to $r^{-12}$, which gives the famous Lennard–Jones 6–12 potential:

$$V_{LJ}(r) = 4\epsilon \left[ \left(\frac{a}{r}\right)^{12} - \left(\frac{a}{r}\right)^{6} \right] \tag{1.5}$$

with $\epsilon$ and $a$ constants that determine the energy and length scales. These have been determined for the different elements by referring to the thermodynamic properties of the noble gases; the parameters for the usual noble gas elements are shown in Table 1.1. Use of this potential can then provide a quantitative measure of cohesion in these solids. One measure of the strength of these potentials is the vibrational frequency that would correspond to a harmonic oscillator potential with the same curvature at the minimum; this is indicative of the stiffness of the bond between atoms. In Table 1.1 we present the frequencies corresponding to the Lennard–Jones potentials of the common noble gas elements (see following discussion and Table 1.2 for the relation between this frequency and the Lennard–Jones potential parameters). For comparison, the vibrational frequency of the H$_2$ molecule, the simplest type of covalent bond between two atoms, is about 500 meV, more than two orders of magnitude larger; the Lennard–Jones potentials for the noble gases correspond to very soft bonds indeed!

Table 1.1. Parameters for the Lennard–Jones potential for noble gases. For the calculation of $\hbar\omega$ using the Lennard–Jones parameters see the following discussion and Table 1.2. Original sources: see Ashcroft and Mermin [14].

| | Ne | Ar | Kr | Xe |
|---|---|---|---|---|
| $\epsilon$ (meV) | 3.1 | 10.4 | 14.0 | 20.0 |
| $a$ (Å) | 2.74 | 3.40 | 3.65 | 3.98 |
| $\hbar\omega$ (meV) | 2.213 | 2.310 | 1.722 | 1.510 |

A potential of similar nature, also used to describe effective interactions between atoms, is the Morse potential:

$$V_{M}(r) = \epsilon \left[ e^{-2(r-r_0)/b} - 2e^{-(r-r_0)/b} \right] \tag{1.6}$$

where again $\epsilon$ and $b$ are the constants that determine the energy and length scales and $r_0$ is the position of the minimum energy. It is instructive to compare these two potentials with the harmonic oscillator potential, which has the same minimum and

curvature, given by:

$$V_{HO}(r) = -\epsilon + \frac{1}{2} m \omega^2 (r - r_0)^2 \tag{1.7}$$

with $\omega$ the frequency, $m$ the mass of the particle in the potential and $r_0$ the position of the minimum. The definitions of the three potentials are such that they all have the same value of the energy at their minimum, namely $-\epsilon$. The relations between the values of the other parameters which ensure that the minimum in the energy occurs at the same value of $r$ and that the curvature at the minimum is the same are given in Table 1.2; a plot of the three potentials with these parameters is given in Fig. 1.18.

Table 1.2. Comparison of the three effective potentials, Lennard–Jones $V_{LJ}(r)$, Morse $V_{M}(r)$, and harmonic oscillator $V_{HO}(r)$. The relations between the parameters that ensure the three potentials have the same minimum and curvature at the minimum are also given (the parameters of the Morse and harmonic oscillator potentials are expressed in terms of the Lennard–Jones parameters).

| | $V_{LJ}(r)$ | $V_{M}(r)$ | $V_{HO}(r)$ |
|---|---|---|---|
| Potential | $4\epsilon\left[\left(\frac{a}{r}\right)^{12} - \left(\frac{a}{r}\right)^{6}\right]$ | $\epsilon\left[e^{-2(r-r_0)/b} - 2e^{-(r-r_0)/b}\right]$ | $-\epsilon + \frac{1}{2} m\omega^2 (r-r_0)^2$ |
| $V_{min}$ | $-\epsilon$ | $-\epsilon$ | $-\epsilon$ |
| $r_{min}$ | $2^{1/6} a$ | $r_0$ | $r_0$ |
| $V''(r_{min})$ | $(72/2^{1/3})(\epsilon/a^2)$ | $2(\epsilon/b^2)$ | $m\omega^2$ |
| Relations | | $r_0 = 2^{1/6} a$, $b = (2^{1/6}/6)\,a$ | $r_0 = 2^{1/6} a$, $\omega = (432)^{1/3} \sqrt{\epsilon/(m a^2)}$ |

The harmonic oscillator potential is what we would expect near the equilibrium of any normal interaction potential. The other two potentials extend the range far from the minimum; both potentials have a much sharper increase of the energy for distances shorter than the equilibrium value, and a much weaker increase of the energy for distances larger than the equilibrium value, relative to the harmonic oscillator. The overall behavior of the two potentials is quite similar. One advantage of the Morse potential is that it can be solved exactly, by analogy to the harmonic oscillator potential (see Appendix B). This allows a comparison between the energy levels associated with this potential and the corresponding energy levels of the harmonic oscillator; the latter are given by:

$$E_n^{HO} = \left(n + \frac{1}{2}\right) \hbar\omega \tag{1.8}$$

with $n$ the integer index of the levels, whereas those of the Morse potential are given by:

$$E_n^{M} = \left(n + \frac{1}{2}\right) \hbar\omega \left[ 1 - \frac{\hbar\omega}{4\epsilon} \left(n + \frac{1}{2}\right) \right] \tag{1.9}$$

for the parameters defined in Table 1.2. We thus see that the spacing of levels in the Morse potential is smaller than in the corresponding harmonic oscillator, and that it becomes progressively smaller as the index of the levels increases. This is expected from the behavior of the potential mentioned above, and in particular from its asymptotic approach to zero for large distances. Since the Lennard–Jones potential has an overall shape similar to the Morse potential, we expect its energy levels to behave in the same manner.

Figure 1.18. The three effective potentials discussed in the text, Lennard–Jones Eq. (1.5), Morse Eq. (1.6) and harmonic oscillator Eq. (1.7), with same minimum and curvature at the minimum. The energy is given in units of $\epsilon$ and the distance in units of $a$, the two parameters of the Lennard–Jones potential.

For purely ionic bonding, one assumes that what keeps the crystal together is the attractive interaction between the positively and negatively charged ions, again in a purely classical picture. For the ionic solids with rock-salt, cesium chloride and zincblende lattices we have discussed already, it is possible to calculate the cohesive energy, which only depends on the ionic charges, the crystal structure and the distance between ions. This is called the Madelung energy. The only difficulty is that the summation converges very slowly, because the interaction potential

(Coulomb) is long range. In fact, formally this sum does not converge, and any simple way of summing successive terms gives results that depend on the choice of terms. The formal way for treating periodic structures, which we will develop in chapter 3, makes the calculation of the Madelung energy through the Ewald summation trick much more efficient (see Appendix F).

The other types of bonding, metallic, covalent and mixed bonding, are much more difficult to describe quantitatively. For metallic bonding, even if we think of the electrons as a uniform sea, we need to know the energy of this uniform "liquid" of fermions, which is not a trivial matter. This will be the subject of the next chapter. In addition to the electronic contributions, we have to consider the energy of the positive ions that exist in the uniform negative background of the electron sea. This is another Madelung sum, which converges very slowly. As far as covalent bonding is concerned, although the approach we used by combining atomic orbitals is conceptually simple, much more information is required to render it a realistic tool for calculations, and the electron interactions again come into play in an important way. This will also be discussed in detail in subsequent chapters.

The descriptions that we mentioned for the metallic and covalent solids are also referred to by more technical terms. The metallic sea of electrons paradigm is referred to as the "jellium" model in the extreme case when the ions (atoms stripped of their valence electrons) are considered to form a uniform positive background; in this limit the crystal itself does not play an important role, other than it provides the background for forming the electronic sea. The description of the covalent bonding paradigm is referred to as the Linear Combination of Atomic Orbitals (LCAO) approach, since it relies on the use of a basis of atomic orbitals in linear combinations that make the bonding arrangement transparent, as was explained above for the graphite and diamond lattices. We will revisit these notions in more detail.

Further reading

We collect here a number of general books on the physics of solids. Material in these books goes well beyond the topics covered in the present chapter and is relevant to many other topics covered in subsequent chapters.

1. Solid State Physics, N.W. Ashcroft and N.D. Mermin (Saunders College Publishing, Philadelphia, 1976). This is a comprehensive and indispensable source on the physics of solids; it provides an inspired coverage of most topics that had been the focus of research up to its publication.
2. Introduction to Solid State Theory, O. Madelung (Springer-Verlag, Berlin, Heidelberg, 1981).

Guinier and R. L. 10. Wiley. 1974). Oxford. London. Kittel (7th edn. with an emphasis on topics relevant to materials science and technological applications. This is one of the standard introductory texts in the physics of solids with a wealth of useful information. 1963).W. Sutton (Oxford University Press. because they include two equivalent sites per unit cell which can be occupied by the different ions. New York. The Nature of the Chemical Bond and the Structure of Molecules and Solids. rock-salt. A. P. Crystal Structures. emphasizing both the physical and chemical aspects of bonding in solids and molecules. The Solid State. Donohue (J. R.A. The Structure of the Elements. Wiley. 1993). Wiley. W. C. 8. New York. 1989). This is a fresh look at the physics of condensed matter. New York. Wiley. 5. Quantum Theory of Matter: A Novel Introduction. Harrison (McGraw-Hill. Oxford. cesium chloride and zincblende. This is an older treatment with many useful physical insights. 1972). J. 1984). so that each ion type is completely surrounded by the other. Wyckoff (J. Solid State Theory. This is an older but extensive treatment of the physics of solids. 1960). Pauling (Cornell University Press. This is an older but very comprehensive two-volume work. Jullien (Oxford University Press. Oxford. 1995). Theoretical Solid State Physics. Bonding and Structure of Molecules and Solids. Quantum Theory of Solids. Jones and N. New York. 15. 11. 14. 6. Menlo Park. 7. Ithaca. This is a classic treatment of the nature of bonding between atoms. 12. Introduction to Solid State Physics.P. Describe the corresponding bipartite . D. J. This book is a modern account of the physics of solids. are called bipartite lattices. New York.G. This is a useful compilation of crystal structures for elemental solids. 1996). Electronic Structure of Materials. 1963). Kittel (J. Cambridge. covering a very broad range of topics at an introductory level. 1970). W. Ziman (Cambridge University Press. Problems 1. 
This is an older text with an advanced treatment of the physics of solids. C. J. Basic Notions of Condensed Matter Physics. Wiley. 9. covering many interesting topics. it covers both the single-particle and the many-body pictures. New York.M. Pettifor (Oxford University Press. 1973). Anderson (Benjamin-Cummings Publishing.H. This is a very useful compilation of all the structures of elemental solids and a wide variety of common compounds.Problems 37 3. Wiley. 13. It discusses extensively bonding in molecules but there is also a rich variety of topics relevant to the bonding in solids.W. This book represents a balanced introduction to the theoretical formalism needed for the study of solids at an advanced level. presenting the physics of solids. Principles of the Theory of Solids. 4. Modinos (J. which we have discussed in this chapter. The three ionic lattices. March (J. 1996). A. A.

lattices in two dimensions. Are they all different from each other? Try to obtain the Madelung energy for one of them, and show how the calculation is sensitive to the way in which the infinite sum is truncated.

2. We wish to demonstrate in a simple one-dimensional example that symmetric and antisymmetric combinations of single-particle orbitals give rise to bonding and antibonding states (though seemingly oversimplified, this example is relevant to the hydrogen molecule, which is discussed at length in the next chapter). We begin with an atom consisting of an ion of charge $+e$ and a single valence electron: the electron–ion interaction potential is $-e^2/|x|$, arising from the ion which is situated at $x = 0$; we will take the normalized wavefunction for the ground state of the electron to be

$$\phi_0(x) = \frac{1}{\sqrt{a}}\, e^{-|x|/a}$$

where $a$ is a constant. We next consider two such atoms, the first ion at $x = -b/2$ and the second at $x = +b/2$, with $b$ the distance between them (also referred to as the "bond length"). From the two single-particle states associated with the electrons in each atom,

$$\phi_1(x) = \frac{1}{\sqrt{a}}\, e^{-|x - b/2|/a}, \qquad \phi_2(x) = \frac{1}{\sqrt{a}}\, e^{-|x + b/2|/a}$$

we construct the symmetric (+) and antisymmetric (−) combinations:

$$\phi^{(\pm)}(x) = \frac{1}{\sqrt{\lambda^{(\pm)}}} \left[ e^{-|x - b/2|/a} \pm e^{-|x + b/2|/a} \right]$$

with $\lambda^{(\pm)}$ the normalization factors

$$\lambda^{(\pm)} = 2a \left[ 1 \pm e^{-b/a} \left( 1 + \frac{b}{a} \right) \right].$$

Show that the difference between the probability of the electron being in the symmetric or the antisymmetric state rather than in the average of the states $\phi_1(x)$ and $\phi_2(x)$, is given by:

$$\delta n^{(\pm)}(x) = \pm \frac{1}{\lambda^{(\pm)}} \left[ 2\phi_1(x)\phi_2(x) - e^{-b/a} \left( 1 + \frac{b}{a} \right) \left( |\phi_1(x)|^2 + |\phi_2(x)|^2 \right) \right]$$

A plot of the probabilities $|\phi^{(\pm)}(x)|^2$ and the differences $\delta n^{(\pm)}(x)$ is given in Fig. 1.19 for $b = 2.5a$. Using this plot, interpret the bonding character of state $\phi^{(+)}(x)$ and the antibonding character of state $\phi^{(-)}(x)$, taking into account the enhanced Coulomb attraction between the electron and the two ions in the region $-b/2 < x < +b/2$. Make sure to take into account possible changes in the kinetic energy and show that they do not affect the argument about the character of these two states. A common approximation is to take the symmetric and antisymmetric combinations to be defined as:

$$\phi^{(\pm)}(x) = \frac{1}{\sqrt{2a}} \left[ \phi_1(x) \pm \phi_2(x) \right],$$

3 Ϫ0. In this example b = 2. In order to derive the attractive part of the Lennard–Jones potential. and the differences δ n (±) (x ) between them and the average occupation of states φ1 (x ). we consider two atoms with Z electrons each and ﬁlled electronic shells. is always negative for φ (+) and always positive for φ (−) . φ2 (x ) by thinner solid lines. When they are brought closer together. In the ground state. the two electronic charge distributions will be polarized because each will feel the effect of the ions and electrons of the other. as described in section 1. that is.1. and therefore we can neglect exchange of electrons between them. Thus.6 Ϫ4 Ϫ2 Ϫb/2 0 +b/2 2 4 Figure 1. |x − b/2| 3. λ(±) = 2a . Describe how the different orbitals combine to form hydrogen bonds in the solid structure of ice. 4. left panel) and antisymmetric ((−).3 Ϫ0. right panel) linear combinations of single-particle orbitals: the probability densities |φ (±) (x )|2 are shown by thick solid lines in each case. In this limit.6 Ϫ4 Ϫ2 Ϫb/2 0 +b/2 2 4 Ϫ0. the atoms will have spherical electronic charge distributions and. this again justiﬁes our assertion that φ (+) corresponds to a bonding state and φ (−) to an antibonding state. Symmetric ((+). Provide an argument of how different combinations of orbitals than the ones discussed in the text would not produce as favorable a covalent bond between H and O. Produce an energy level diagram for the orbitals involved in the formation of the covalent bonds in the water molecule.5a and x is given in units of a .6 0.7.Problems 0.3 0 0 Ϫ0.6 39 0. show by 1 [ φ1 |V1 (x )|φ1 + φ2 |V2 (x )|φ2 ] 2 V2 (x ) = −e2 |x + b/2| where V1 (x ) = −e2 . when sufﬁciently far from each other. which is reasonable in the limit b numerical integration that the gain in potential energy V ≡ φ (±) | [V1 (x ) + V2 (x )] |φ (±) − a . they will not interact.3 0. it is the . 
We are assuming that the two atoms are still far enough from each other so that their electronic charge distributions do not overlap. the dashed lines show the attractive potential of the two ions located at ±b/2 (positions indicated by thin vertical lines).19.

polarization that gives rise to an attractive potential; for this reason this interaction is sometimes also referred to as the "fluctuating dipole interaction". To model the polarization effect, we consider the interaction potential between the two neutral atoms:

$$V_{int} = \frac{Z^2 e^2}{|\mathbf{R}_1 - \mathbf{R}_2|} - \sum_i \frac{Z e^2}{|\mathbf{r}_i^{(1)} - \mathbf{R}_2|} - \sum_j \frac{Z e^2}{|\mathbf{r}_j^{(2)} - \mathbf{R}_1|} + \sum_{ij} \frac{e^2}{|\mathbf{r}_i^{(1)} - \mathbf{r}_j^{(2)}|}$$

where $\mathbf{R}_1$, $\mathbf{R}_2$ are the positions of the two nuclei and $\mathbf{r}_i^{(1)}$, $\mathbf{r}_j^{(2)}$ are the sets of electronic coordinates associated with each nucleus. In the above equation, the first term is the repulsion between the two nuclei, the second term is the attraction of the electrons of the first atom to the nucleus of the second, the third term is the attraction of the electrons of the second atom to the nucleus of the first, and the last term is the repulsion between the two sets of electrons in the two different atoms. From second order perturbation theory, the energy change due to this interaction is given by:

$$\Delta E = \langle \Psi_0^{(1)} \Psi_0^{(2)} | V_{int} | \Psi_0^{(1)} \Psi_0^{(2)} \rangle + \sum_{nm} \frac{1}{E_0 - E_{nm}} \left| \langle \Psi_0^{(1)} \Psi_0^{(2)} | V_{int} | \Psi_n^{(1)} \Psi_m^{(2)} \rangle \right|^2 \tag{1.10}$$

where $\Psi_0^{(1)}$, $\Psi_0^{(2)}$ are the ground-state many-body wavefunctions of the two atoms, $\Psi_n^{(1)}$, $\Psi_m^{(2)}$ are their excited states, and $E_0$, $E_{nm}$ are the corresponding energies of the two-atom system in their unperturbed states. We define the electronic charge density associated with the ground state of each atom through:

$$n_0^{(I)}(\mathbf{r}) = Z \int \left| \Psi_0^{(I)}(\mathbf{r}, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_Z) \right|^2 d\mathbf{r}_2\, d\mathbf{r}_3 \cdots d\mathbf{r}_Z = \sum_{i=1}^{Z} \int \delta(\mathbf{r} - \mathbf{r}_i) \left| \Psi_0^{(I)}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_Z) \right|^2 d\mathbf{r}_1\, d\mathbf{r}_2 \cdots d\mathbf{r}_Z \tag{1.11}$$

with $I = 1, 2$ (the expression for the density $n(\mathbf{r})$ in terms of the many-body wavefunction $|\Psi\rangle$ is discussed in detail in Appendix B). Show that the first order term in $\Delta E$ corresponds to the electrostatic interaction energy between the charge density distributions $n_0^{(1)}(\mathbf{r})$, $n_0^{(2)}(\mathbf{r})$. Assuming that there is no overlap between these two charge densities, show that this term vanishes (the two charge densities in the unperturbed ground state are spherically symmetric). The wavefunctions involved in the second order term in $\Delta E$ will be negligible, unless the electronic coordinates associated with each atom are within the range of nonvanishing charge density. This implies that the distances $|\mathbf{r}_i^{(1)} - \mathbf{R}_1|$ and $|\mathbf{r}_j^{(2)} - \mathbf{R}_2|$ should be small compared with the distance between the atoms $|\mathbf{R}_2 - \mathbf{R}_1|$, which defines the distance at which interactions between the two charge densities become negligible. Show that expanding the interaction potential in the small quantities

gives. show that the leading order term in the energy difference E behaves like |R2 − R1 |−6 and is negative.Problems |ri(1) − R1 |/|R2 − R1 | and |r(2) j − R2 |/|R2 − R1 |. This establishes the origin of the attractive term in the Lennard–Jones potential.12) Using this expression. to lowest order: − + e2 |R2 − R1 | e2 |R2 − R1 | 3 ij (2) (ri(1) − R1 ) · (R2 − R1 ) (r j − R2 ) · (R2 − R1 ) · (R2 − R1 )2 (R2 − R1 )2 41 (ri(1) − R1 ) · (r(2) j − R2 ) ij (R2 − R1 )2 (1. .

2 The single-particle approximation

In the previous chapter we saw that except for the simplest solids, like those formed by noble elements or by purely ionic combinations which can be described essentially in classical terms, in all other cases we need to consider the behavior of the valence electrons. The following chapters deal with these valence electrons (we will also refer to them as simply "the electrons" in the solid); we will study how their behavior is influenced by, and in turn influences, the ions.

Our goal in this chapter is to establish the basis for the single-particle description of the valence electrons. We will do this by starting with the exact hamiltonian for the solid and introducing approximations in its solution, which lead to sets of single-particle equations for the electronic degrees of freedom in the external potential created by the presence of the ions. Each electron also experiences the presence of other electrons through an effective potential in the single-particle equations; this effective potential encapsulates the many-body nature of the true system in an approximate way. In the last section of this chapter we will provide a formal way for eliminating the core electrons from the picture, while keeping the important effect they have on valence electrons.

2.1 The hamiltonian of the solid

An exact theory for a system of ions and interacting electrons is inherently quantum mechanical, and is based on solving a many-body Schrödinger equation of the form

$$\mathcal{H} \Psi(\{\mathbf{R}_I; \mathbf{r}_i\}) = E \Psi(\{\mathbf{R}_I; \mathbf{r}_i\}) \tag{2.1}$$

where $\mathcal{H}$ is the hamiltonian of the system, containing the kinetic energy operators

$$-\sum_I \frac{\hbar^2}{2M_I} \nabla^2_{\mathbf{R}_I} - \sum_i \frac{\hbar^2}{2m_e} \nabla^2_{\mathbf{r}_i} \tag{2.2}$$

and the potential energy due to interactions between the ions and the electrons. In the above equations: $\hbar$ is Planck's constant divided by $2\pi$; $M_I$ is the mass of ion $I$;

$m_e$ is the mass of the electron; $E$ is the energy of the system; $\Psi(\{\mathbf{R}_I; \mathbf{r}_i\})$ is the many-body wavefunction that describes the state of the system; $\{\mathbf{R}_I\}$ are the positions of the ions; and $\{\mathbf{r}_i\}$ are the variables that describe the electrons. Two electrons at $\mathbf{r}_i$, $\mathbf{r}_j$ repel one another, which produces a potential energy term

$$\frac{e^2}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{2.3}$$

where $e$ is the electronic charge. An electron at $\mathbf{r}$ is attracted to each positively charged ion at $\mathbf{R}_I$, producing a potential energy term

$$-\frac{Z_I e^2}{|\mathbf{R}_I - \mathbf{r}|} \tag{2.4}$$

where $Z_I$ is the valence charge of this ion (nucleus plus core electrons). The total external potential experienced by an electron due to the presence of the ions is

$$V_{ion}(\mathbf{r}) = -\sum_I \frac{Z_I e^2}{|\mathbf{R}_I - \mathbf{r}|} \tag{2.5}$$

Two ions at positions $\mathbf{R}_I$, $\mathbf{R}_J$ also repel one another giving rise to a potential energy term

$$\frac{Z_I Z_J e^2}{|\mathbf{R}_I - \mathbf{R}_J|} \tag{2.6}$$

Typically, we can think of the ions as moving slowly in space and the electrons responding instantaneously to any ionic motion, so that $\Psi$ has an explicit dependence on the electronic degrees of freedom alone: this is known as the Born–Oppenheimer approximation. Its validity is based on the huge difference of mass between ions and electrons (three to five orders of magnitude), making the former behave like classical particles. The only exception to this, noted in the previous chapter, are the lightest elements (especially H), where the ions have to be treated as quantum mechanical particles. We can then omit the quantum mechanical term for the kinetic energy of the ions, and take their kinetic energy into account as a classical contribution. If the ions are at rest, the hamiltonian of the system becomes

$$\mathcal{H} = -\sum_i \frac{\hbar^2}{2m_e} \nabla^2_{\mathbf{r}_i} - \sum_{iI} \frac{Z_I e^2}{|\mathbf{R}_I - \mathbf{r}_i|} + \frac{1}{2} \sum_{ij\,(j \neq i)} \frac{e^2}{|\mathbf{r}_i - \mathbf{r}_j|} + \frac{1}{2} \sum_{IJ\,(J \neq I)} \frac{Z_I Z_J e^2}{|\mathbf{R}_I - \mathbf{R}_J|} \tag{2.7}$$

In the following we will neglect for the moment the last term, which as far as the electron degrees of freedom are concerned is simply a constant. We discuss how this constant can be calculated for crystals in Appendix F (you will recognize in this term the Madelung energy of the ions, mentioned in chapter 1). The hamiltonian then takes the form

$$\mathcal{H} = -\sum_i \frac{\hbar^2}{2m_e} \nabla^2_{\mathbf{r}_i} + \sum_i V_{ion}(\mathbf{r}_i) + \frac{e^2}{2} \sum_{ij\,(j \neq i)} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{2.8}$$

with the ionic potential that every electron experiences $V_{ion}(\mathbf{r})$ defined in Eq. (2.5).

Even with this simplification, however, solving for $\Psi(\{\mathbf{r}_i\})$ is an extremely difficult task, because of the nature of the electrons. If two electrons of the same spin interchange positions, $\Psi$ must change sign; this is known as the "exchange" property, and is a manifestation of the Pauli exclusion principle. Moreover, each electron is affected by the motion of every other electron in the system; this is known as the "correlation" property. It is possible to produce a simpler, approximate picture, in which we describe the system as a collection of classical ions and essentially single quantum mechanical particles that reproduce the behavior of the electrons: this is the single-particle picture. It is an appropriate description when the effects of exchange and correlation are not crucial for describing the phenomena we are interested in. Such phenomena include, for example, optical excitations in solids, the conduction of electricity in the usual ohmic manner, and all properties of solids that have to do with cohesion (such as mechanical properties). Phenomena which are outside the scope of the single-particle picture include all the situations where electron exchange and correlation effects are crucial, such as superconductivity, transport in high magnetic fields (the quantum Hall effects), etc.

In developing the one-electron picture of solids, we will not neglect the exchange and correlation effects between electrons, we will simply take them into account in an average way; this is often referred to as a mean-field approximation for the electron–electron interactions. To do this, we have to pass from the many-body picture to an equivalent one-electron picture. We will first derive equations that look like single-particle equations, and then try to explore their meaning.

2.2 The Hartree and Hartree–Fock approximations

2.2.1 The Hartree approximation

The simplest approach is to assume a specific form for the many-body wavefunction which would be appropriate if the electrons were non-interacting particles, namely

$$\Psi^{H}(\{\mathbf{r}_i\}) = \phi_1(\mathbf{r}_1) \phi_2(\mathbf{r}_2) \cdots \phi_N(\mathbf{r}_N) \tag{2.9}$$

with the index $i$ running over all electrons. The wavefunctions $\phi_i(\mathbf{r}_i)$ are states in which the individual electrons would be if this were a realistic approximation. These are single-particle states, normalized to unity. This is known as the Hartree approximation (hence the superscript $H$). With this approximation, the total energy of the system becomes

$$E^H = \langle \Psi^H | \mathcal{H} | \Psi^H \rangle = \sum_i \langle \phi_i | \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) | \phi_i \rangle + \frac{e^2}{2} \sum_{i} \sum_{j (j \neq i)} \langle \phi_i \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_i \phi_j \rangle \qquad (2.10)$$

Using a variational argument, we obtain from this the single-particle Hartree equations:

$$\left[ \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) + e^2 \sum_{j \neq i} \langle \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_j \rangle \right] \phi_i(\mathbf{r}) = \epsilon_i \phi_i(\mathbf{r}) \qquad (2.11)$$

where the constants $\epsilon_i$ are Lagrange multipliers introduced to take into account the normalization of the single-particle states $\phi_i$ (the bra $\langle \phi_i |$ and ket $| \phi_i \rangle$ notation for single-particle states and its extension to many-particle states constructed as products of single-particle states is discussed in Appendix B). Each orbital $\phi_i(\mathbf{r}_i)$ can then be determined by solving the corresponding single-particle Schrödinger equation, if all the other orbitals $\phi_j(\mathbf{r}_j)$, $j \neq i$ were known. In principle, this problem of self-consistency, i.e. the fact that the equation for one $\phi_i$ depends on all the other $\phi_j$'s, can be solved iteratively. We assume a set of $\phi_i$'s, use these to construct the single-particle hamiltonian, which allows us to solve the equations for each new $\phi_i$; we then compare the resulting $\phi_i$'s with the original ones, and modify the original $\phi_i$'s so that they resemble more the new $\phi_i$'s. This cycle is continued until input and output $\phi_i$'s are the same up to a tolerance $\delta_{tol}$, as illustrated in Fig. 2.1 (in this example, the comparison of input and output wavefunctions is made through the densities, as would be natural in Density Functional Theory, discussed below).

The more important problem is to determine how realistic the solution is. We can make the original trial $\phi$'s orthogonal, and maintain the orthogonality at each cycle of the self-consistency iteration to make sure the final $\phi$'s are also orthogonal. Then we would have a set of orbitals that would look like single particles, each $\phi_i(\mathbf{r})$ experiencing the ionic potential $V_{ion}(\mathbf{r})$ as well as a potential due to the presence of all other electrons, $V_i^H(\mathbf{r})$ given by

$$V_i^H(\mathbf{r}) = +e^2 \sum_{j \neq i} \langle \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_j \rangle \qquad (2.12)$$
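The self-consistency cycle described above (assume an input, construct the hamiltonian, solve, compare, mix) can be sketched on a deliberately small problem. The sketch below is not the Hartree equations for a solid: it assumes a hypothetical two-site model in which an electron sees site energies -delta and +delta, a coupling t between the sites, and a mean-field repulsion u proportional to the occupation of each site by one other electron in the same orbital. The names `site_density`, `scf`, `delta`, `t`, `u`, `mix` and `tol` are illustrative choices, not quantities defined in the text.

```python
import math

def site_density(n1, delta=0.5, t=1.0, u=2.0):
    """Construct the effective hamiltonian from the input occupation n1 of
    site 1, solve it, and return the output occupation of site 1.

    Effective 2x2 hamiltonian (toy model, an assumption of this sketch):
        [[-delta + u*n1,        -t              ],
         [-t,                   +delta + u*(1-n1)]]
    The occupation of site 1 in its lowest eigenvector is (1 - d/s)/2.
    """
    d = -delta + u * (n1 - 0.5)      # half the difference of the diagonal terms
    s = math.sqrt(d * d + t * t)
    return 0.5 * (1.0 - d / s)

def scf(n1_in=0.5, mix=0.5, tol=1e-10, max_iter=200):
    """Repeat the cycle until input and output occupations agree within tol."""
    for iteration in range(1, max_iter + 1):
        n1_out = site_density(n1_in)
        if abs(n1_out - n1_in) < tol:
            return n1_out, iteration
        # modify the input so that it resembles the output more closely
        n1_in = (1.0 - mix) * n1_in + mix * n1_out
    raise RuntimeError("self-consistency not reached")

n1, steps = scf()
print(f"converged occupation of site 1: {n1:.6f} after {steps} iterations")
```

Mixing input and output, rather than replacing the input outright, is a common way to keep such an iteration from oscillating; the value 0.5 used here is an arbitrary choice for this toy model.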

The potential of Eq. (2.12) is known as the Hartree potential and includes only the Coulomb repulsion between electrons. The potential is different for each particle. It is a mean-field approximation to the electron–electron interaction, taking into account the electronic charge only, which is a severe simplification.

Figure 2.1. Schematic representation of iterative solution of coupled single-particle equations. This kind of operation is easily implemented on the computer.
1. Choose $\phi_i^{(in)}(\mathbf{r})$.
2. Construct $\rho^{(in)}(\mathbf{r}) = \sum_i |\phi_i^{(in)}(\mathbf{r})|^2$ and $V^{sp}(\mathbf{r}, \rho^{(in)}(\mathbf{r}))$.
3. Solve $\left[ -\frac{\hbar^2}{2m_e} \nabla_{\mathbf{r}}^2 + V^{sp}(\mathbf{r}, \rho^{(in)}(\mathbf{r})) \right] \phi_i^{(out)}(\mathbf{r}) = \epsilon_i^{(out)} \phi_i^{(out)}(\mathbf{r})$.
4. Construct $\rho^{(out)}(\mathbf{r}) = \sum_i |\phi_i^{(out)}(\mathbf{r})|^2$.
5. Compare $\rho^{(out)}(\mathbf{r})$ to $\rho^{(in)}(\mathbf{r})$. If $|\rho^{(in)}(\mathbf{r}) - \rho^{(out)}(\mathbf{r})| < \delta_{tol}$, STOP; else set $\phi_i^{(in)}(\mathbf{r}) = \phi_i^{(out)}(\mathbf{r})$ and GOTO 2.

2.2.2 Example of a variational calculation

We will demonstrate the variational derivation of single-particle states in the case of the Hartree approximation, where the energy is given by Eq. (2.10), starting with the many-body wavefunction of Eq. (2.9). We assume that this state is a stationary state of the system, so that any variation in the wavefunction will give a zero variation in the energy (this is equivalent to the statement that the derivative of a function at an extremum is zero). We can take the variation in the wavefunction to be of the form $\langle \delta\phi_i |$, subject to the constraint that $\langle \phi_i | \phi_i \rangle = 1$, which can be taken into account by introducing a Lagrange multiplier $\epsilon_i$:

$$\delta \left[ E^H - \sum_i \epsilon_i \left( \langle \phi_i | \phi_i \rangle - 1 \right) \right] = 0 \qquad (2.13)$$

Notice that the variations of the bra and the ket of $\phi_i$ are considered to be independent of each other; this is allowed because the wavefunctions are complex quantities, so varying the bra and the ket independently is equivalent to varying the real and

imaginary parts of a complex variable independently, which is legitimate since they represent independent components (for a more detailed justification of this see, for example, Ref. [15]). The above variation then produces

$$\langle \delta\phi_i | \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) | \phi_i \rangle + e^2 \sum_{j \neq i} \langle \delta\phi_i \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_i \phi_j \rangle - \epsilon_i \langle \delta\phi_i | \phi_i \rangle$$
$$= \langle \delta\phi_i | \left[ \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) + e^2 \sum_{j \neq i} \langle \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_j \rangle - \epsilon_i \right] | \phi_i \rangle = 0$$

Since this has to be true for any variation $\langle \delta\phi_i |$, we conclude that

$$\left[ \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) + e^2 \sum_{j \neq i} \langle \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_j \rangle \right] \phi_i(\mathbf{r}) = \epsilon_i \phi_i(\mathbf{r})$$

which is the Hartree single-particle equation, Eq. (2.11).

2.2.3 The Hartree–Fock approximation

The next level of sophistication is to try to incorporate the fermionic nature of electrons in the many-body wavefunction $\Psi(\{\mathbf{r}_i\})$. To this end, we can choose a wavefunction which is a properly antisymmetrized version of the Hartree wavefunction, that is, it changes sign when the coordinates of two electrons are interchanged. This is known as the Hartree–Fock approximation. For simplicity we will neglect the spin of electrons and keep only the spatial degrees of freedom. This does not imply any serious restriction; in fact, at the Hartree–Fock level it is a simple matter to include explicitly the spin degrees of freedom, by considering electrons with up and down spins at position $\mathbf{r}$. Combining then Hartree-type wavefunctions to form a properly antisymmetrized wavefunction for the system, we obtain the determinant (first introduced by Slater [16]):

$$\Psi^{HF}(\{\mathbf{r}_i\}) = \frac{1}{\sqrt{N!}} \begin{vmatrix} \phi_1(\mathbf{r}_1) & \phi_1(\mathbf{r}_2) & \cdots & \phi_1(\mathbf{r}_N) \\ \phi_2(\mathbf{r}_1) & \phi_2(\mathbf{r}_2) & \cdots & \phi_2(\mathbf{r}_N) \\ \vdots & \vdots & & \vdots \\ \phi_N(\mathbf{r}_1) & \phi_N(\mathbf{r}_2) & \cdots & \phi_N(\mathbf{r}_N) \end{vmatrix} \qquad (2.14)$$

where $N$ is the total number of electrons. This has the desired property, since interchanging the position of two electrons is equivalent to interchanging the corresponding columns in the determinant, which changes its sign.
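The sign change of the determinant in Eq. (2.14) under exchange of two electron positions can be checked numerically. The following is a minimal sketch for three particles in one dimension; the orbitals are hypothetical functions chosen only for illustration (they are not normalized and are not solutions of any equation in the text), and the helper names `parity` and `slater` are assumptions of this example.

```python
import math
from itertools import permutations

def parity(perm):
    """Sign (+1 or -1) of a permutation given as a tuple of indices."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]   # each swap flips the sign
            sign = -sign
    return sign

def slater(orbitals, positions):
    """Determinant of the matrix orbitals[row](positions[col]), divided by
    sqrt(N!), evaluated with the explicit sum over permutations."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = parity(perm)
        for row, col in enumerate(perm):
            term *= orbitals[row](positions[col])
        total += term
    return total / math.sqrt(math.factorial(n))

# Hypothetical one-dimensional orbitals, for illustration only
orbitals = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
    lambda x: (2 * x * x - 1) * math.exp(-x * x),
]

r = [0.3, -0.7, 1.1]
r_swapped = [-0.7, 0.3, 1.1]          # positions of electrons 1 and 2 interchanged
print(slater(orbitals, r), slater(orbitals, r_swapped))
print(slater(orbitals, [0.3, 0.3, 1.1]))  # two electrons at the same position
```

The two values on the first printed line differ only in sign, and the wavefunction vanishes when two positions coincide, which is how the determinant encodes the Pauli exclusion principle. The sum over permutations scales as N!, so this form is useful only for very small N.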

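The total energy with the Hartree–Fock wavefunction is

$$E^{HF} = \langle \Psi^{HF} | \mathcal{H} | \Psi^{HF} \rangle = \sum_i \langle \phi_i | \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) | \phi_i \rangle + \frac{e^2}{2} \sum_{i} \sum_{j (j \neq i)} \langle \phi_i \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_i \phi_j \rangle - \frac{e^2}{2} \sum_{i} \sum_{j (j \neq i)} \langle \phi_i \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_j \phi_i \rangle \qquad (2.15)$$

and the single-particle Hartree–Fock equations, obtained by a variational calculation, are

$$\left[ \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) + V_i^H(\mathbf{r}) \right] \phi_i(\mathbf{r}) - e^2 \sum_{j \neq i} \langle \phi_j | \frac{1}{|\mathbf{r} - \mathbf{r}'|} | \phi_i \rangle \phi_j(\mathbf{r}) = \epsilon_i \phi_i(\mathbf{r}) \qquad (2.16)$$

This equation has one extra term compared with the Hartree equation, the last one, which is called the "exchange" term. The exchange term describes the effects of exchange between electrons, which we put in the Hartree–Fock many-particle wavefunction by construction. This term has the peculiar character that it cannot be written simply as $V_i^X(\mathbf{r}_i)\phi_i(\mathbf{r}_i)$ (in the following we use the superscript $X$ to denote "exchange"). It is instructive to try to put this term in such a form, by multiplying and dividing by the proper factors. First we express the Hartree term in a different way: define the single-particle and the total densities as

$$\rho_i(\mathbf{r}) = |\phi_i(\mathbf{r})|^2 \qquad (2.17)$$
$$\rho(\mathbf{r}) = \sum_i \rho_i(\mathbf{r}) \qquad (2.18)$$

so that the Hartree potential takes the form

$$V_i^H(\mathbf{r}) = e^2 \sum_{j \neq i} \int \frac{\rho_j(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' = e^2 \int \frac{\rho(\mathbf{r}') - \rho_i(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \qquad (2.19)$$

Now construct the single-particle exchange density to be

$$\rho_i^X(\mathbf{r}, \mathbf{r}') = \sum_{j \neq i} \frac{\phi_i(\mathbf{r}') \phi_i^*(\mathbf{r}) \phi_j(\mathbf{r}) \phi_j^*(\mathbf{r}')}{\phi_i(\mathbf{r}) \phi_i^*(\mathbf{r})} \qquad (2.20)$$

Then the single-particle Hartree–Fock equations take the form

$$\left[ \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} + V_{ion}(\mathbf{r}) + V_i^H(\mathbf{r}) + V_i^X(\mathbf{r}) \right] \phi_i(\mathbf{r}) = \epsilon_i \phi_i(\mathbf{r}) \qquad (2.21)$$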

with the exchange potential, in analogy with the Hartree potential, given by

$$V_i^X(\mathbf{r}) = -e^2 \int \frac{\rho_i^X(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \qquad (2.22)$$

The Hartree and exchange potentials give the following potential for electron–electron interaction in the Hartree–Fock approximation:

$$V_i^{HF}(\mathbf{r}) = e^2 \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' - e^2 \int \frac{\rho_i(\mathbf{r}') + \rho_i^X(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \qquad (2.23)$$

which can be written, with the help of the Hartree–Fock density

$$\rho_i^{HF}(\mathbf{r}, \mathbf{r}') = \sum_{j} \frac{\phi_i(\mathbf{r}') \phi_i^*(\mathbf{r}) \phi_j(\mathbf{r}) \phi_j^*(\mathbf{r}')}{\phi_i(\mathbf{r}) \phi_i^*(\mathbf{r})} \qquad (2.24)$$

as the following expression for the total electron–electron interaction potential:

$$V_i^{HF}(\mathbf{r}) = e^2 \int \frac{\rho(\mathbf{r}') - \rho_i^{HF}(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \qquad (2.25)$$

The first term is the total Coulomb repulsion potential of electrons common for all states $\phi_i(\mathbf{r})$, while the second term is the effect of fermionic exchange, and is different for each state $\phi_i(\mathbf{r})$.

2.3 Hartree–Fock theory of free electrons

To elucidate the physical meaning of the approximations introduced above we will consider the simplest possible case, that is one in which the ionic potential is a uniformly distributed positive background. This is referred to as the jellium model. In this case, the electronic states must also reflect this symmetry of the potential, which is uniform, so they must be plane waves:

$$\phi_i(\mathbf{r}) = \frac{1}{\sqrt{\Omega}} e^{i \mathbf{k}_i \cdot \mathbf{r}} \qquad (2.26)$$

where $\Omega$ is the volume of the solid and $\mathbf{k}_i$ is the wave-vector which characterizes state $\phi_i$. Since the wave-vectors suffice to characterize the single-particle states, we will use those as the only index, i.e. $\phi_i \rightarrow \phi_{\mathbf{k}}$. Plane waves are actually a very convenient and useful basis for expressing various physical quantities. In particular, they allow the use of Fourier transform techniques, which simplify the calculations. In the following we will be using relations implied by the Fourier transform method, which are proven in Appendix G. We also define certain useful quantities related to the density of the uniform electron gas: the wave-vectors have a range of values from zero up to some maximum

magnitude $k_F$, the Fermi momentum, which is related to the density $n = N/\Omega$ through

$$n = \frac{k_F^3}{3\pi^2} \qquad (2.27)$$

(see Appendix D, Eq. (D.10)). The Fermi energy is given in terms of the Fermi momentum

$$\epsilon_F = \frac{\hbar^2 k_F^2}{2m_e} \qquad (2.28)$$

It is often useful to express equations in terms of another quantity, $r_s$, which is defined as the radius of the sphere whose volume corresponds to the average volume per electron:

$$\frac{4\pi}{3} r_s^3 = \frac{\Omega}{N} = n^{-1} = \frac{3\pi^2}{k_F^3} \qquad (2.29)$$

and $r_s$ is typically measured in atomic units (the Bohr radius, $a_0 = 0.529177$ Å). This gives the following expression for $k_F$:

$$k_F = \frac{(9\pi/4)^{1/3}}{r_s} \Longrightarrow k_F a_0 = \frac{(9\pi/4)^{1/3}}{(r_s/a_0)} \qquad (2.30)$$

where the last expression contains the dimensionless combinations of variables $k_F a_0$ and $r_s/a_0$. If the electrons had only kinetic energy, the total energy of the system would be given by

$$E^{kin} = \frac{\Omega}{\pi^2} \frac{\hbar^2 k_F^5}{10 m_e} \Longrightarrow \frac{E^{kin}}{N} = \frac{3}{5} \epsilon_F \qquad (2.31)$$

(see Appendix D, Eq. (D.12)). Finally, we introduce the unit of energy rydberg (Ry), which is the natural unit for energies in solids,

$$\frac{e^2}{2a_0} = \frac{\hbar^2}{2m_e a_0^2} = 1 \ \mathrm{Ry} \qquad (2.32)$$

With the electrons represented by plane waves, the electronic density must be uniform and equal to the ionic density. These two terms, the uniform positive ionic charge and the uniform negative electronic charge of equal density, cancel each other. The only terms remaining in the single-particle equation are the kinetic energy and the part of $V_i^{HF}(\mathbf{r})$ corresponding to exchange, which arises from $\rho_i^{HF}(\mathbf{r}, \mathbf{r}')$:

$$\left[ \frac{-\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} - e^2 \int \frac{\rho_{\mathbf{k}}^{HF}(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \right] \phi_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}} \phi_{\mathbf{k}}(\mathbf{r}) \qquad (2.33)$$
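The relations between $r_s$, $k_F$, $\epsilon_F$ and the kinetic energy per electron are easy to evaluate numerically. The sketch below works in the dimensionless variables of Eq. (2.30) and in rydbergs, using Eq. (2.32) to write $\hbar^2 k_F^2 / 2m_e = (k_F a_0)^2$ Ry; the function names are illustrative, and the sample values of $r_s/a_0$ are arbitrary.

```python
import math

def fermi_wavevector(rs):
    """k_F * a_0 for a given r_s / a_0, from Eq. (2.30)."""
    return (9.0 * math.pi / 4.0) ** (1.0 / 3.0) / rs

def fermi_energy_ry(rs):
    """Fermi energy in Ry: (k_F a_0)^2, from Eqs. (2.28) and (2.32)."""
    return fermi_wavevector(rs) ** 2

def kinetic_energy_per_electron_ry(rs):
    """E_kin / N = (3/5) * Fermi energy, from Eq. (2.31), in Ry."""
    return 0.6 * fermi_energy_ry(rs)

def density(rs):
    """Electron density n * a_0^3 from k_F, Eq. (2.27)."""
    return fermi_wavevector(rs) ** 3 / (3.0 * math.pi ** 2)

for rs in (1.0, 2.0, 4.0):
    print(rs, fermi_wavevector(rs), fermi_energy_ry(rs),
          kinetic_energy_per_electron_ry(rs))
```

For $r_s/a_0 = 1$ the kinetic energy per electron evaluates to about 2.21 Ry, so in general $E^{kin}/N \approx 2.21/(r_s/a_0)^2$ Ry.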

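We have asserted above that the behavior of electrons in this system is described by plane waves; we prove this statement next. Plane waves are of course eigenfunctions of the kinetic energy operator:

$$-\frac{\hbar^2 \nabla^2}{2m_e} \frac{1}{\sqrt{\Omega}} e^{i\mathbf{k}\cdot\mathbf{r}} = \frac{\hbar^2 k^2}{2m_e} \frac{1}{\sqrt{\Omega}} e^{i\mathbf{k}\cdot\mathbf{r}} \qquad (2.34)$$

so that all we need to show is that they are also eigenfunctions of the second term in the hamiltonian of Eq. (2.33). Using Eq. (2.24) for $\rho_{\mathbf{k}}^{HF}(\mathbf{r}, \mathbf{r}')$ we obtain

$$-e^2 \int \frac{\rho_{\mathbf{k}}^{HF}(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \, \phi_{\mathbf{k}}(\mathbf{r}) = \frac{-e^2}{\sqrt{\Omega}} \int \frac{\rho_{\mathbf{k}}^{HF}(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \, e^{i\mathbf{k}\cdot\mathbf{r}}$$
$$= \frac{-e^2}{\sqrt{\Omega}} \sum_{\mathbf{k}'} \int \frac{\phi_{\mathbf{k}}(\mathbf{r}') \phi_{\mathbf{k}}^*(\mathbf{r}) \phi_{\mathbf{k}'}(\mathbf{r}) \phi_{\mathbf{k}'}^*(\mathbf{r}')}{\phi_{\mathbf{k}}(\mathbf{r}) \phi_{\mathbf{k}}^*(\mathbf{r})} \frac{1}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \, e^{i\mathbf{k}\cdot\mathbf{r}}$$
$$= \frac{-e^2}{\sqrt{\Omega}} \sum_{\mathbf{k}'} \frac{1}{\Omega} \int \frac{e^{-i(\mathbf{k} - \mathbf{k}')\cdot(\mathbf{r} - \mathbf{r}')}}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \, e^{i\mathbf{k}\cdot\mathbf{r}}$$

Expressing $1/|\mathbf{r} - \mathbf{r}'|$ in terms of its Fourier transform provides a convenient way for evaluating the last sum once it has been turned into an integral using Eq. (D.8). The inverse Fourier transform of $1/|\mathbf{r} - \mathbf{r}'|$ turns out to be

$$\frac{1}{|\mathbf{r} - \mathbf{r}'|} = \int \frac{d\mathbf{q}}{(2\pi)^3} \frac{4\pi}{q^2} e^{i\mathbf{q}\cdot(\mathbf{r} - \mathbf{r}')} \qquad (2.35)$$

as proven in Appendix G. Substituting this expression into the previous equation we obtain

$$\frac{-e^2}{\sqrt{\Omega}} \int_{k' < k_F} \frac{d\mathbf{k}'}{(2\pi)^3} \int e^{-i(\mathbf{k} - \mathbf{k}')\cdot(\mathbf{r} - \mathbf{r}')} \int \frac{d\mathbf{q}}{(2\pi)^3} \frac{4\pi}{q^2} e^{i\mathbf{q}\cdot(\mathbf{r} - \mathbf{r}')} d\mathbf{r}' \, e^{i\mathbf{k}\cdot\mathbf{r}}$$
$$= \frac{-4\pi e^2}{\sqrt{\Omega}} \int_{k' < k_F} \frac{d\mathbf{k}'}{(2\pi)^3} \int d\mathbf{q} \, \frac{1}{q^2} \left[ \frac{1}{(2\pi)^3} \int e^{-i(\mathbf{k} - \mathbf{k}' - \mathbf{q})\cdot(\mathbf{r} - \mathbf{r}')} d\mathbf{r}' \right] e^{i\mathbf{k}\cdot\mathbf{r}} \qquad (2.36)$$

At this point it will be necessary to employ the Fourier transform representation of the $\delta$-function, which is derived in Appendix G; this representation allows us to identify the quantity in square brackets in the last expression with a $\delta$-function in momentum space,

$$\frac{1}{(2\pi)^3} \int e^{-i(\mathbf{k} - \mathbf{k}' - \mathbf{q})\cdot(\mathbf{r} - \mathbf{r}')} d\mathbf{r}' = \delta(\mathbf{q} - (\mathbf{k} - \mathbf{k}')) \qquad (2.37)$$

which upon integration over $\mathbf{q}$ gives

$$\frac{-4\pi e^2}{\sqrt{\Omega}} \int_{k' < k_F} \frac{d\mathbf{k}'}{(2\pi)^3} \frac{1}{|\mathbf{k} - \mathbf{k}'|^2} e^{i\mathbf{k}\cdot\mathbf{r}} = -\frac{e^2}{\pi} k_F F(k/k_F) \frac{1}{\sqrt{\Omega}} e^{i\mathbf{k}\cdot\mathbf{r}} \qquad (2.38)$$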

with $k = |\mathbf{k}|$, where the function $F(x)$ is defined as

$$F(x) = 1 + \frac{1 - x^2}{2x} \ln \left| \frac{1 + x}{1 - x} \right| \qquad (2.39)$$

This completes the proof that plane waves are eigenfunctions of the single-particle hamiltonian in Eq. (2.33). With this result, the energy of single-particle state $\phi_{\mathbf{k}}(\mathbf{r})$ is given by

$$\epsilon_{\mathbf{k}} = \frac{\hbar^2 k^2}{2m_e} - \frac{e^2}{\pi} k_F F(k/k_F) \qquad (2.40)$$

which, using the variable $r_s$ introduced in Eq. (2.29) and the definition of the Ry for the energy unit given in Eq. (2.32), can be rewritten in the following form:

$$\epsilon_{\mathbf{k}} = \left[ \left( \frac{(9\pi/4)^{1/3}}{r_s/a_0} \right)^2 (k/k_F)^2 - \frac{2}{\pi} \left( \frac{(9\pi/4)^{1/3}}{r_s/a_0} \right) F(k/k_F) \right] \mathrm{Ry} \qquad (2.41)$$

The behavior of the energy $\epsilon_{\mathbf{k}}$ as function of the momentum (in units of $k_F$) is illustrated in Fig. 2.2 for two different values of $r_s$.

Figure 2.2. Energy (in rydbergs) of individual single-particle states as a function of momentum $k/k_F$ (with $k_F$ = Fermi momentum), as given by Eq. (2.41), for two different values of $r_s = 1, 2$ (in $a_0$). The dashed curves give the kinetic energy contribution (first term on the right-hand side of Eq. (2.41)).

This is an intriguing result: it shows that, even though plane waves are eigenstates of this hypothetical system, due to the exchange interaction the energy of state $\phi_{\mathbf{k}}$

is not simply $\hbar^2 k^2 / 2m_e$, as might be expected for non-interacting particles; it also contains the term proportional to $F(k/k_F)$ in Eq. (2.40). This term has interesting behavior at $|\mathbf{k}| = k_F$, as is evident in Fig. 2.2. It also gives a lower energy than the non-interacting electron case for all values of $k$, an effect which is more pronounced for small values of $r_s$ (see Fig. 2.2). Thus, the electron–electron interaction included at the Hartree–Fock level lowers the energy of the system significantly. We can calculate the total energy of this system by summing the single-particle energies over $\mathbf{k}$ up to momentum $k_F$:

$$E^{HF} = 2 \sum_{k < k_F} \frac{\hbar^2 k^2}{2m_e} - \frac{e^2 k_F}{\pi} \sum_{k < k_F} \left[ 1 + \frac{k_F^2 - k^2}{2 k k_F} \ln \left| \frac{k_F + k}{k_F - k} \right| \right] \qquad (2.42)$$

Notice that we must include a factor of 2 for the spin of the electrons in both summations. This is indeed explicitly done for the kinetic energy part (see Appendix D where the expression for $E^{kin}$ was derived). But for the second term, which represents the effective electron-electron interaction due to exchange, this factor of 2 is canceled by a factor of 1/2 needed to compensate double counting of the effective interaction in the sum of $\epsilon_{\mathbf{k}}$'s: remember that this effective interaction is contained in the HF single-particle equations Eq. (2.16) as the sum over all states other than state $i$, so if we simply sum all these contributions contained in the $\epsilon_{\mathbf{k}}$'s we will be counting each contribution twice. Turning the second term in the above equation into an integral through the usual procedure, we can evaluate the sum to find

$$\frac{E^{HF}}{N} = \frac{3}{5} \epsilon_F - \frac{3}{4} \frac{e^2 k_F}{\pi} \qquad (2.43)$$

which quantifies by how much the effective electron–electron interaction due to exchange lowers the energy of the system relative to the kinetic energy alone. Using the expression of $k_F$ in terms of $r_s$, Eq. (2.30), and expressing everything in rydbergs with the help of Eq. (2.32), we obtain

$$\frac{E^{HF}}{N} = \left[ \frac{2.21}{(r_s/a_0)^2} - \frac{0.916}{(r_s/a_0)} \right] \mathrm{Ry} \qquad (2.44)$$

This result should be compared with the expansion for the exact energy of the electron gas in the high-density limit (low $r_s/a_0$ values), first obtained by Gell-Mann and Brueckner (for details and original references see Ref. [17]),

$$\frac{E}{N} = \left[ \frac{2.21}{(r_s/a_0)^2} - \frac{0.916}{(r_s/a_0)} + 0.0622 \ln(r_s/a_0) - 0.096 + O(r_s/a_0) \right] \mathrm{Ry} \qquad (2.45)$$

It is quite remarkable that the Hartree–Fock approximation, based on an ad hoc expression for the many-body wavefunction, captures the first two terms in the

exact expansion of the total energy. Of course in real situations this may not be very helpful, since in typical metals $(r_s/a_0)$ varies between 2 and 6.

Another interesting point is that we can express the potential due to exchange in a way that involves the density. This potential will give rise to the second term on the right-hand side of Eq. (2.44), namely

$$\frac{E^X}{N} = -\frac{0.916}{(r_s/a_0)} \ \mathrm{Ry} \qquad (2.46)$$

which, using the expressions for $r_s$ discussed earlier, can be written as

$$\frac{E^X}{N} = -\frac{3e^2}{4} \left( \frac{3}{\pi} \right)^{1/3} n^{1/3} = -1.477 \, [a_0^3 n]^{1/3} \ \mathrm{Ry} \qquad (2.47)$$

One of the most insightful proposals in the early calculations of the properties of solids, due to Slater [18], was to generalize this term for situations where the density is not constant, that is, a system with non-homogeneous distribution of electrons. In this case the exchange energy would arise from a potential energy term in the single-particle hamiltonian which will have the form

$$V^X(\mathbf{r}) = -\frac{3e^2}{2} \left( \frac{3}{\pi} \right)^{1/3} [n(\mathbf{r})]^{1/3} = -\frac{3e^2}{2\pi} k_F(\mathbf{r}) = -2.954 \, [a_0^3 n(\mathbf{r})]^{1/3} \ \mathrm{Ry} \qquad (2.48)$$

where an extra factor of 2 is introduced to account for the fact that a variational derivation gives rise to a potential term in the single-particle equations which is twice as large as the corresponding energy term; conversely, when one calculates the total energy by summing terms in the single-particle equations, a factor of 1/2 must be introduced to account for double-counting of interactions. In the last equation, the density, and hence the Fermi momentum, have become functions of $\mathbf{r}$, i.e. they can be non-homogeneous. There is actually good justification to use such a term in single-particle equations in order to describe the exchange contribution, although the values of the constants involved are different from Slater's (see also section 2.5 on Density Functional Theory).

2.4 The hydrogen molecule

In order to demonstrate the difficulty of including explicitly all the interaction effects in a system with more than one electron, we discuss briefly a model of the hydrogen molecule. This molecule consists of two protons and two electrons, so it is the simplest possible system for studying electron–electron interactions in a

realistic manner. (This is a conveniently simple model for illustrating electron exchange and correlation effects. It is discussed in several of the textbooks mentioned in chapter 1.) We begin by defining the hamiltonian for a single hydrogen atom:

$$h_1(\mathbf{r}_1) = -\frac{\hbar^2 \nabla_{\mathbf{r}_1}^2}{2m_e} - \frac{e^2}{|\mathbf{r}_1 - \mathbf{R}_1|} \qquad (2.49)$$

where $\mathbf{R}_1$ is the position of the first proton. The wavefunction for this hamiltonian is $s(\mathbf{r}_1 - \mathbf{R}_1) = s_1(\mathbf{r}_1)$, the ground state of the hydrogen atom with energy $\epsilon_0$. Similarly, an atom at position $\mathbf{R}_2$ (far from $\mathbf{R}_1$) will have the hamiltonian

$$h_2(\mathbf{r}_2) = -\frac{\hbar^2 \nabla_{\mathbf{r}_2}^2}{2m_e} - \frac{e^2}{|\mathbf{r}_2 - \mathbf{R}_2|} \qquad (2.50)$$

and the wavefunction $s(\mathbf{r}_2 - \mathbf{R}_2) = s_2(\mathbf{r}_2)$. When the two protons are very far away, the two electrons do not interact and the two electronic wavefunctions are the same, only centered at different points in space. When the atoms are brought together, the new hamiltonian becomes

$$\mathcal{H}(\mathbf{r}_1, \mathbf{r}_2) = -\frac{\hbar^2 \nabla_{\mathbf{r}_1}^2}{2m_e} - \frac{e^2}{|\mathbf{r}_1 - \mathbf{R}_1|} - \frac{\hbar^2 \nabla_{\mathbf{r}_2}^2}{2m_e} - \frac{e^2}{|\mathbf{r}_2 - \mathbf{R}_2|} - \frac{e^2}{|\mathbf{R}_1 - \mathbf{r}_2|} - \frac{e^2}{|\mathbf{r}_1 - \mathbf{R}_2|} + \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} + \frac{e^2}{|\mathbf{R}_1 - \mathbf{R}_2|} \qquad (2.51)$$

where the last four terms represent electron–proton attraction between the electron in one atom and the proton in the other (the cross terms), and electron–electron, and proton–proton repulsion. As we have done so far, we will ignore the proton–proton repulsion (last term in Eq. (2.51)), since it is only a constant term as far as the electrons are concerned, and it does not change the character of the electronic wavefunction. This is equivalent to applying the Born–Oppenheimer approximation to the problem and neglecting the quantum nature of the protons, even though we mentioned in chapter 1 that this may not be appropriate for hydrogen. The justification for using this approximation here is that we are concentrating our attention on the electron–electron interactions in the simplest possible model rather than attempting to give a realistic picture of the system as a whole.

Solving for the wavefunction $\Psi(\mathbf{r}_1, \mathbf{r}_2)$ of this new hamiltonian analytically is already an impossible task. We will attempt to do this approximately, using the orbitals $s_1(\mathbf{r})$ and $s_2(\mathbf{r})$ as a convenient basis. If we were dealing with a single electron, then this electron would see the following hamiltonian, in the presence of the two protons:

$$\mathcal{H}^{sp}(\mathbf{r}) = -\frac{\hbar^2 \nabla_{\mathbf{r}}^2}{2m_e} - \frac{e^2}{|\mathbf{r} - \mathbf{R}_1|} - \frac{e^2}{|\mathbf{r} - \mathbf{R}_2|} \qquad (2.52)$$

We define the expectation value of this hamiltonian in the state $s_1(\mathbf{r})$ or $s_2(\mathbf{r})$ to be

$$\epsilon \equiv \langle s_1 | \mathcal{H}^{sp} | s_1 \rangle = \langle s_2 | \mathcal{H}^{sp} | s_2 \rangle \qquad (2.53)$$

Notice that $s_1(\mathbf{r})$ or $s_2(\mathbf{r})$ are not eigenstates of $\mathcal{H}^{sp}(\mathbf{r})$, and $\epsilon \neq \epsilon_0$. Also notice that we can write the total hamiltonian as

$$\mathcal{H}(\mathbf{r}_1, \mathbf{r}_2) = \mathcal{H}^{sp}(\mathbf{r}_1) + \mathcal{H}^{sp}(\mathbf{r}_2) + \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \qquad (2.54)$$

We call the very last term in this expression, the electron–electron repulsion, the "interaction" term. It will prove convenient within the $s_1(\mathbf{r}_1)$, $s_2(\mathbf{r}_2)$ basis to define the so called "hopping" matrix elements

$$t \equiv -\langle s_1 | \mathcal{H}^{sp} | s_2 \rangle = -\langle s_2 | \mathcal{H}^{sp} | s_1 \rangle \qquad (2.55)$$

where we can choose the phases in the wavefunctions $s_1(\mathbf{r})$, $s_2(\mathbf{r})$ to make sure that $t$ is a real positive number. These matrix elements describe the probability of one electron "hopping" from state $s_1(\mathbf{r})$ to $s_2(\mathbf{r})$ (or vice versa), within the single-particle hamiltonian $\mathcal{H}^{sp}(\mathbf{r})$. A different term we can define is the "on-site" repulsive interaction between two electrons, which arises from the interaction term when the two electrons are placed at the same orbital:

$$U \equiv \langle s_1 s_1 | \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} | s_1 s_1 \rangle = \langle s_2 s_2 | \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} | s_2 s_2 \rangle \qquad (2.56)$$

where $U$ is also taken to be a real positive quantity. A model based on these physical quantities, the hopping matrix element and the on-site Coulomb repulsion energy, was introduced originally by Hubbard [18–20]. The model contains the bare essentials for describing electron–electron interactions in solids, and has found many applications, especially in highly correlated electron systems. Despite its apparent simplicity, the Hubbard model has not been solved analytically, and research continues today to try to understand its physics.

Now we want to construct single-particle orbitals for $\mathcal{H}^{sp}(\mathbf{r})$, using as a basis $s_1(\mathbf{r})$ and $s_2(\mathbf{r})$, which reflect the basic symmetry of the hamiltonian, that is, inversion relative to the midpoint of the distance between the two protons (the center of the molecule). There are two such possibilities:

$$\phi_0(\mathbf{r}) = \frac{1}{\sqrt{2}} [s_1(\mathbf{r}) + s_2(\mathbf{r})], \qquad \phi_1(\mathbf{r}) = \frac{1}{\sqrt{2}} [s_1(\mathbf{r}) - s_2(\mathbf{r})] \qquad (2.57)$$

the first being a symmetric and the second an antisymmetric wavefunction, upon inversion with respect to the center of the molecule; these are illustrated in Fig. 2.3.
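The role of the symmetric and antisymmetric combinations of Eq. (2.57) can be illustrated by writing the single-particle hamiltonian as a 2 × 2 matrix in the basis $s_1$, $s_2$, with diagonal elements $\epsilon$ from Eq. (2.53) and off-diagonal elements $-t$ from Eq. (2.55). The sketch below assumes, as an approximation made only for this example, that $s_1$ and $s_2$ are orthonormal (their overlap is neglected); the numerical values of `eps` and `t` are hypothetical.

```python
import math

def apply(matrix, vector):
    """Matrix-vector product for small lists of floats."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def expectation(matrix, vector):
    """<v|M|v> for a real, normalized vector v."""
    return sum(v * mv for v, mv in zip(vector, apply(matrix, vector)))

eps, t = -1.0, 0.3          # hypothetical values, arbitrary energy units
h_sp = [[eps, -t],
        [-t, eps]]          # single-particle hamiltonian in the (s1, s2) basis

phi0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # symmetric combination
phi1 = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # antisymmetric combination

print(expectation(h_sp, phi0), eps - t)
print(expectation(h_sp, phi1), eps + t)
```

Within this approximation the symmetric combination has the lower energy, $\epsilon - t$, and the antisymmetric one the higher energy, $\epsilon + t$, because $t$ has been chosen positive.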

Figure 2.3. Schematic representation of the hydrogen wavefunctions for isolated atoms (top panel), and the linear combinations that preserve the inversion symmetry with respect to the center of the molecule: a symmetric combination (bottom panel, left) and an antisymmetric combination (bottom panel, right). The latter two are the states defined in Eq. (2.57).

The expectation values of $\mathcal{H}^{sp}$ in terms of the $\phi_i$ ($i = 0, 1$) are

$$\langle \phi_0 | \mathcal{H}^{sp}(\mathbf{r}) | \phi_0 \rangle = \epsilon - t, \qquad \langle \phi_1 | \mathcal{H}^{sp}(\mathbf{r}) | \phi_1 \rangle = \epsilon + t \qquad (2.58)$$

Using the single-particle orbitals of Eq. (2.57), we can write three possible Hartree wavefunctions:

$$\Psi_0^H(\mathbf{r}_1, \mathbf{r}_2) = \phi_0(\mathbf{r}_1) \phi_0(\mathbf{r}_2) \qquad (2.59)$$
$$\Psi_1^H(\mathbf{r}_1, \mathbf{r}_2) = \phi_1(\mathbf{r}_1) \phi_1(\mathbf{r}_2) \qquad (2.60)$$
$$\Psi_2^H(\mathbf{r}_1, \mathbf{r}_2) = \phi_0(\mathbf{r}_1) \phi_1(\mathbf{r}_2) \qquad (2.61)$$

Notice that both $\Psi_0^H(\mathbf{r}_1, \mathbf{r}_2)$ and $\Psi_1^H(\mathbf{r}_1, \mathbf{r}_2)$ place the two electrons in the same state; this is allowed because of the electron spin. The expectation values of the two-particle hamiltonian $\mathcal{H}(\mathbf{r}_1, \mathbf{r}_2)$ which contains the interaction term, in terms of the $\Psi_i^H$ ($i = 0, 1$), are

$$\langle \Psi_0^H | \mathcal{H}(\mathbf{r}_1, \mathbf{r}_2) | \Psi_0^H \rangle = 2(\epsilon - t) + \frac{1}{2} U \qquad (2.62)$$
$$\langle \Psi_1^H | \mathcal{H}(\mathbf{r}_1, \mathbf{r}_2) | \Psi_1^H \rangle = 2(\epsilon + t) + \frac{1}{2} U \qquad (2.63)$$

Next we try to construct the Hartree–Fock approximation for this problem. We will assume that the total wavefunction has a spin-singlet part, that is, the spin degrees of freedom of the electrons form a totally antisymmetric combination,

which multiplies the spatial part of the wavefunction. This tells us that the spatial part of the wavefunction should be totally symmetric. One possible choice is

$$\Psi_0^{HF}(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2}} [s_1(\mathbf{r}_1) s_2(\mathbf{r}_2) + s_1(\mathbf{r}_2) s_2(\mathbf{r}_1)] \qquad (2.64)$$

The wavefunction $\Psi_0^{HF}(\mathbf{r}_1, \mathbf{r}_2)$ is known as the Heitler–London approximation. Two other possible choices for totally symmetric spatial wavefunctions are

$$\Psi_1^{HF}(\mathbf{r}_1, \mathbf{r}_2) = s_1(\mathbf{r}_1) s_1(\mathbf{r}_2) \qquad (2.65)$$
$$\Psi_2^{HF}(\mathbf{r}_1, \mathbf{r}_2) = s_2(\mathbf{r}_1) s_2(\mathbf{r}_2) \qquad (2.66)$$

Using the three functions $\Psi_i^{HF}$ ($i = 0, 1, 2$), we can construct matrix elements of the hamiltonian $\mathcal{H}_{ij} = \langle \Psi_i^{HF} | \mathcal{H} | \Psi_j^{HF} \rangle$ and we can diagonalize this 3 × 3 matrix to find its eigenvalues and eigenstates. This exercise shows that the ground state energy is

$$E_{GS} = 2\epsilon + \frac{1}{2} U - \sqrt{4t^2 + \frac{1}{4} U^2} \qquad (2.67)$$

and that the corresponding wavefunction is

$$\Psi_{GS}(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2} \mathcal{N}} \Psi_0^{HF}(\mathbf{r}_1, \mathbf{r}_2) + \frac{1}{2\mathcal{N}} \left[ \sqrt{1 + \left( \frac{U}{4t} \right)^2} - \frac{U}{4t} \right] \left[ \Psi_1^{HF}(\mathbf{r}_1, \mathbf{r}_2) + \Psi_2^{HF}(\mathbf{r}_1, \mathbf{r}_2) \right] \qquad (2.68)$$

where $\mathcal{N}$ is a normalization constant. To the extent that $\Psi_{GS}(\mathbf{r}_1, \mathbf{r}_2)$ involves several Hartree–Fock type wavefunctions, and the corresponding energy is lower than all other approximations we tried, this represents the optimal solution to the problem within our choice of basis, including correlation effects. A more accurate description should include the excited states of electrons in each atom, which increases significantly the size of the matrices involved. Extending this picture to more complex systems produces an almost exponential increase of the computational difficulty. A study of the ground state as a function of the parameter $(U/t)$ elucidates the effects of correlation between the electrons in this simple model (see Problem 5).

2.5 Density Functional Theory

In a series of seminal papers, Hohenberg, Kohn and Sham developed a different way of looking at the problem, which has been called Density Functional

Theory (DFT). The basic ideas of Density Functional Theory are contained in the two original papers of Hohenberg, Kohn and Sham [22, 23], and are referred to as the Hohenberg–Kohn–Sham theorem. This theory has had a tremendous impact on realistic calculations of the properties of molecules and solids, and its applications to different problems continue to expand. A measure of its importance and success is that its main developer, W. Kohn (a theoretical physicist), shared the 1998 Nobel prize for Chemistry with J. A. Pople (a computational chemist). We will review here the essential ideas behind Density Functional Theory.

The basic concept is that instead of dealing with the many-body Schrödinger equation, Eq. (2.1), which involves the many-body wavefunction $\Psi(\{\mathbf{r}_i\})$, one deals with a formulation of the problem that involves the total density of electrons $n(\mathbf{r})$. This is a huge simplification, since the many-body wavefunction need never be explicitly specified, as was done in the Hartree and Hartree–Fock approximations. Thus, instead of starting with a drastic approximation for the behavior of the system (which is what the Hartree and Hartree–Fock wavefunctions represent), one can develop the appropriate single-particle equations in an exact manner, and then introduce approximations as needed.

In the following discussion we will make use of the density $n(\mathbf{r})$ and the one-particle and two-particle density matrices, denoted by $\gamma(\mathbf{r}, \mathbf{r}')$, $\Gamma(\mathbf{r}, \mathbf{r}' | \mathbf{r}, \mathbf{r}')$, respectively, as expressed through the many-body wavefunction:

$$n(\mathbf{r}) = N \int \Psi^*(\mathbf{r}, \mathbf{r}_2, \ldots, \mathbf{r}_N) \Psi(\mathbf{r}, \mathbf{r}_2, \ldots, \mathbf{r}_N) \, d\mathbf{r}_2 \cdots d\mathbf{r}_N$$
$$\gamma(\mathbf{r}, \mathbf{r}') = N \int \Psi^*(\mathbf{r}, \mathbf{r}_2, \ldots, \mathbf{r}_N) \Psi(\mathbf{r}', \mathbf{r}_2, \ldots, \mathbf{r}_N) \, d\mathbf{r}_2 \cdots d\mathbf{r}_N$$
$$\Gamma(\mathbf{r}, \mathbf{r}' | \mathbf{r}, \mathbf{r}') = \frac{N(N-1)}{2} \int \Psi^*(\mathbf{r}, \mathbf{r}', \mathbf{r}_3, \ldots, \mathbf{r}_N) \Psi(\mathbf{r}, \mathbf{r}', \mathbf{r}_3, \ldots, \mathbf{r}_N) \, d\mathbf{r}_3 \cdots d\mathbf{r}_N$$

These quantities are defined in detail in Appendix B, Eqs. (B.13)–(B.15), where their physical meaning is also discussed.

First, we will show that the density $n(\mathbf{r})$ is uniquely defined given an external potential $V(\mathbf{r})$ for the electrons (this of course is identified with the ionic potential). To prove this, suppose that two different external potentials, $V(\mathbf{r})$ and $V'(\mathbf{r})$, give rise to the same density $n(\mathbf{r})$. We will show that this is impossible. We assume that $V(\mathbf{r})$ and $V'(\mathbf{r})$ are different in a non-trivial way, that is, they do not differ merely by a constant. Let $E$ and $\Psi$ be the total energy and wavefunction and $E'$ and $\Psi'$ be the total energy and wavefunction for the systems with hamiltonians $\mathcal{H}$ and $\mathcal{H}'$, respectively, where the first hamiltonian contains $V(\mathbf{r})$ and the second $V'(\mathbf{r})$ as external potential:

$$E = \langle \Psi | \mathcal{H} | \Psi \rangle \qquad (2.69)$$
$$E' = \langle \Psi' | \mathcal{H}' | \Psi' \rangle \qquad (2.70)$$

Then we will have, by the variational principle,

$$E < \langle \Psi' | \mathcal{H} | \Psi' \rangle = \langle \Psi' | \mathcal{H}' + V - V' | \Psi' \rangle = \langle \Psi' | \mathcal{H}' | \Psi' \rangle + \langle \Psi' | (V - V') | \Psi' \rangle = E' + \langle \Psi' | (V - V') | \Psi' \rangle \qquad (2.71)$$

where the strict inequality is a consequence of the fact that the two potentials are different in a non-trivial way. Similarly we can prove

$$E' < E - \langle \Psi | (V - V') | \Psi \rangle \qquad (2.72)$$

Adding Eqs. (2.71) and (2.72), we obtain

$$(E + E') < (E + E') + \langle \Psi' | (V - V') | \Psi' \rangle - \langle \Psi | (V - V') | \Psi \rangle \qquad (2.73)$$

But the last two terms on the right-hand side of Eq. (2.73) give

$$\int n'(\mathbf{r}) [V(\mathbf{r}) - V'(\mathbf{r})] d\mathbf{r} - \int n(\mathbf{r}) [V(\mathbf{r}) - V'(\mathbf{r})] d\mathbf{r} = 0 \qquad (2.74)$$

because by assumption the densities $n(\mathbf{r})$ and $n'(\mathbf{r})$ corresponding to the two potentials are the same. This leads to the relation $E + E' < E + E'$, which is obviously wrong; therefore we conclude that our assumption about the densities being the same cannot be correct. This proves that there is a one-to-one correspondence between an external potential $V(\mathbf{r})$ and the density $n(\mathbf{r})$. But the external potential determines the wavefunction, so that the wavefunction must be a unique functional of the density. If we denote as $T + W$ the terms in the hamiltonian other than $V$, with $T$ representing the kinetic energy and $W$ the electron–electron interaction, we conclude that the expression

$$F[n(\mathbf{r})] = \langle \Psi | (T + W) | \Psi \rangle \qquad (2.75)$$

must be a universal functional of the density, since the terms $T$ and $W$, the kinetic energy and electron–electron interactions, are common to all solids, and therefore this functional does not depend on anything else other than the electron density (which is determined uniquely by the external potential $V$ that differs from system to system).

From these considerations we conclude that the total energy of the system is a functional of the density, and is given by

$$E[n(\mathbf{r})] = \langle \Psi | \mathcal{H} | \Psi \rangle = F[n(\mathbf{r})] + \int V(\mathbf{r}) n(\mathbf{r}) d\mathbf{r} \qquad (2.76)$$

From the variational principle we can deduce that this functional attains its minimum for the correct density $n(\mathbf{r})$ corresponding to $V(\mathbf{r})$, since for a given $V(\mathbf{r})$ and any

other density $n'(\mathbf{r})$ we would have

$$E[n'(\mathbf{r})] = \langle \Psi' | \mathcal{H} | \Psi' \rangle = F[n'(\mathbf{r})] + \int V(\mathbf{r}) n'(\mathbf{r}) d\mathbf{r} > \langle \Psi | \mathcal{H} | \Psi \rangle = E[n(\mathbf{r})] \qquad (2.77)$$

Using our earlier expressions for the one-particle and the two-particle density matrices, we can obtain explicit expressions for $E[n]$ and $F[n]$:

$$E[n(\mathbf{r})] = \langle \Psi | \mathcal{H} | \Psi \rangle = -\frac{\hbar^2}{2m_e} \int \nabla_{\mathbf{r}'}^2 \gamma(\mathbf{r}, \mathbf{r}') \big|_{\mathbf{r}' = \mathbf{r}} d\mathbf{r} + \int \int \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} \Gamma(\mathbf{r}, \mathbf{r}' | \mathbf{r}, \mathbf{r}') d\mathbf{r} d\mathbf{r}' + \int V(\mathbf{r}) \gamma(\mathbf{r}, \mathbf{r}) d\mathbf{r} \qquad (2.78)$$

Now we can attempt to reduce these expressions to a set of single-particle equations, as before. The important difference in the present case is that we do not have to interpret these single-particle states as corresponding to electrons. They represent fictitious fermionic particles with the only requirement that their density is identical to the density of the real electrons. These particles can be considered to be non-interacting: this is a very important aspect of the nature of the fictitious particles, which will allow us to simplify things considerably, since their behavior will not be complicated by interactions.

The assumption that we are dealing with non-interacting particles can be exploited to express the many-body wavefunction $\Psi(\{\mathbf{r}_i\})$ in the form of a Slater determinant, as in Eq. (2.14). We can then express the various physical quantities in terms of the single-particle orbitals $\phi_i(\mathbf{r})$ that appear in the Slater determinant. We obtain

$$n(\mathbf{r}) = \sum_i |\phi_i(\mathbf{r})|^2 \qquad (2.79)$$
$$\gamma(\mathbf{r}, \mathbf{r}') = \sum_i \phi_i^*(\mathbf{r}) \phi_i(\mathbf{r}') \qquad (2.80)$$
$$\Gamma(\mathbf{r}, \mathbf{r}' | \mathbf{r}, \mathbf{r}') = \frac{1}{2} \left[ n(\mathbf{r}) n(\mathbf{r}') - |\gamma(\mathbf{r}, \mathbf{r}')|^2 \right] \qquad (2.81)$$

With the help of Eqs. (2.79)–(2.81), we can express the various terms in the energy functional, which take the form

$$F[n(\mathbf{r})] = T^S[n(\mathbf{r})] + \frac{e^2}{2} \int \int \frac{n(\mathbf{r}) n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r} d\mathbf{r}' + E^{XC}[n(\mathbf{r})] \qquad (2.82)$$

In this expression, the first term represents the kinetic energy of the states in the Slater determinant (hence the superscript $S$). Since the fictitious particles are non-interacting, we can take the kinetic energy to be given by

$$T^S[n(\mathbf{r})] = \sum_i \langle \phi_i | -\frac{\hbar^2}{2m_e} \nabla_{\mathbf{r}}^2 | \phi_i \rangle \qquad (2.83)$$
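Equations (2.79)–(2.81) can be made concrete with a small numerical example. The sketch below assumes a hypothetical set of orthonormal real orbitals, those of a particle in a one-dimensional box of unit length, occupied by three spinless fermions; the orbitals, the grid and all function names are illustrative choices and do not come from the text.

```python
import math

def orbital(k, x):
    """Hypothetical orthonormal orbitals: particle in a 1D box of unit length."""
    return math.sqrt(2.0) * math.sin(k * math.pi * x)

def density(x, n_orb):
    """n(x): sum over occupied orbitals of |phi_k(x)|^2, Eq. (2.79)."""
    return sum(orbital(k, x) ** 2 for k in range(1, n_orb + 1))

def gamma1(x, xp, n_orb):
    """One-particle density matrix, Eq. (2.80), for real orbitals."""
    return sum(orbital(k, x) * orbital(k, xp) for k in range(1, n_orb + 1))

def gamma2(x, xp, n_orb):
    """Two-particle density matrix of a Slater determinant, Eq. (2.81)."""
    return 0.5 * (density(x, n_orb) * density(xp, n_orb)
                  - gamma1(x, xp, n_orb) ** 2)

n_orb = 3
m = 2000
h = 1.0 / m
# Midpoint-rule integral of the density over the box: should equal n_orb
total = sum(density((i + 0.5) * h, n_orb) for i in range(m)) * h
print(total)
print(gamma1(0.3, 0.3, n_orb), density(0.3, n_orb))   # gamma(r, r) = n(r)
print(gamma2(0.3, 0.3, n_orb))                        # vanishes at r = r'
```

The last line illustrates that the probability of finding two of these same-spin particles at the same position is zero, which is the exchange effect carried by the $|\gamma|^2$ term in Eq. (2.81).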

Notice that this expression could not be written down in a simple form had we been dealing with interacting particles. The second term in Eq. (2.82) is the Coulomb interaction, which we separate out from the electron-electron interaction term in the functional $F[n(\mathbf{r})]$. What remains is, by definition, the "exchange-correlation" term, $E^{XC}[n(\mathbf{r})]$. This term includes all the effects of the many-body character of the true electron system; we will deal with it separately below. We can now consider a variation in the density, which we choose to be

$$\delta n(\mathbf{r}) = \delta\phi_i^*(\mathbf{r})\, \phi_i(\mathbf{r}) \tag{2.84}$$

with the restriction that

$$\int \delta n(\mathbf{r})\, d\mathbf{r} = \int \delta\phi_i^*(\mathbf{r})\, \phi_i(\mathbf{r})\, d\mathbf{r} = 0 \tag{2.85}$$

so that the total number of particles does not change; note that $\phi_i(\mathbf{r})$ and $\phi_i^*(\mathbf{r})$ are treated as independent, as far as their variation is concerned. With this choice, and taking the restriction into account through a Lagrange multiplier $\epsilon_i$, we arrive at the following single-particle equations, through a variational argument:

$$\left[ -\frac{\hbar^2}{2m_e} \nabla^2_{\mathbf{r}} + V^{eff}(\mathbf{r}, n(\mathbf{r})) \right] \phi_i(\mathbf{r}) = \epsilon_i\, \phi_i(\mathbf{r}) \tag{2.86}$$

where the effective potential is given by

$$V^{eff}(\mathbf{r}, n(\mathbf{r})) = V(\mathbf{r}) + e^2 \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}' + \frac{\delta E^{XC}[n(\mathbf{r})]}{\delta n(\mathbf{r})} \tag{2.87}$$

with $V(\mathbf{r})$ the external potential due to the ions; the last term is the variational functional derivative [2] of the as yet unspecified functional $E^{XC}[n(\mathbf{r})]$. The single-particle equations Eq. (2.86) are referred to as Kohn–Sham equations and the single-particle orbitals $\phi_i(\mathbf{r})$ that are their solutions are called Kohn–Sham orbitals.

Since the effective potential is a function of the density, which is obtained from Eq. (2.79) and hence depends on all the single-particle states, we will need to solve these equations by iteration until we reach self-consistency. As mentioned earlier, this is not a significant problem. A more pressing issue is the exact form of $E^{XC}[n(\mathbf{r})]$, which is unknown. We can consider the simplest situation, in which the true electronic system is endowed with only one aspect of electron interactions (beyond Coulomb repulsion), that is, the exchange property. As we saw in the case of the Hartree–Fock approximation, which takes into account exchange explicitly, in a uniform system the contribution of exchange to the total energy is

$$E^X = -\frac{3}{4} \frac{e^2}{\pi} k_F N \tag{2.88}$$

[2] For the definition of this term in the context of the present theory see Appendix G.
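To get a feeling for the size of the exchange energy in Eq. (2.88), the following sketch evaluates it per electron for a few densities. The assumptions, none of which are spelled out at this point in the text, are: atomic units ($e^2 = 1$, energies in hartree, lengths in $a_0$), the standard free-electron relation $k_F = (3\pi^2 n)^{1/3}$, and the usual definition of $r_s$ through $(4\pi/3) r_s^3 = 1/n$. The function names are illustrative only.

```python
import math

def fermi_wavevector(n):
    """k_F of a uniform electron gas of density n (both spins): (3 pi^2 n)^(1/3)."""
    return (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)

def exchange_energy_per_electron(n, e2=1.0):
    """E^X / N = -(3/4) (e^2 / pi) k_F, from Eq. (2.88)."""
    return -0.75 * e2 / math.pi * fermi_wavevector(n)

def density_from_rs(rs):
    """Density corresponding to r_s, assuming (4 pi / 3) r_s^3 = 1 / n."""
    return 3.0 / (4.0 * math.pi * rs ** 3)

if __name__ == "__main__":
    for rs in (1.0, 2.0, 4.0):
        ex = exchange_energy_per_electron(density_from_rs(rs))
        # The product ex * rs is the same for every density (about -0.458 hartree * a0).
        print(f"r_s = {rs:.1f} a0: E^X/N = {ex:.4f} hartree, E^X/N * r_s = {ex * rs:.4f}")
```

Since $k_F \propto n^{1/3}$, the exchange energy per electron scales as $1/r_s$, which is why the product printed in the last column does not change with density.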

Since the total number of electrons in the system can be written as $N = \int n\, d\mathbf{r}$, we can write

$$E^X[n] = -\frac{3}{4} \frac{e^2}{\pi} \int k_F\, n\, d\mathbf{r} = -\frac{3}{4} e^2 \left( \frac{3}{\pi} \right)^{1/3} \int [n]^{1/3}\, n\, d\mathbf{r} \tag{2.89}$$

We can now generalize this to situations where the density is not uniform, and assume that the same expression holds, obtaining:

$$E^X[n(\mathbf{r})] = \int \epsilon^X[n(\mathbf{r})]\, n(\mathbf{r})\, d\mathbf{r} \tag{2.90}$$

$$\epsilon^X[n(\mathbf{r})] = -\frac{3}{4} e^2 \left( \frac{3}{\pi} \right)^{1/3} [n(\mathbf{r})]^{1/3} \tag{2.91}$$

This allows us to calculate the expression for $\delta E^{XC}[n]/\delta n$ in the case where we are considering only the exchange aspect of the many-body character. We obtain

$$\frac{\delta E^X[n(\mathbf{r})]}{\delta n(\mathbf{r})} = \frac{\partial}{\partial n(\mathbf{r})} \left[ \epsilon^X[n(\mathbf{r})]\, n(\mathbf{r}) \right] = \frac{4}{3} \epsilon^X[n(\mathbf{r})] = -e^2 \left( \frac{3}{\pi} \right)^{1/3} [n(\mathbf{r})]^{1/3} \tag{2.92}$$

This is remarkably similar to Slater's exchange potential, Eq. (2.48), which was based on an ad hoc assumption. In fact it differs from the Slater exchange potential by only a factor of 2/3.

This analysis shows that, if electrons interacted only through the Pauli exclusion principle (hence only the exchange interaction would be needed), we could adopt the exchange potential as derived above for use in the single-particle equations; this would provide an exact solution in terms of the single-particle wavefunctions $\phi_i(\mathbf{r})$. Recall that these are not true electron wavefunctions, but they give rise to exactly the same density as the true electrons. Since the density is the same, we could calculate from the energy functional the total energy of the system. However, we know that electrons are not interacting merely through the exclusion principle, but that they also experience the long-range Coulomb repulsion from each other, as well as the Coulomb attraction to the ions, represented by the external potential $V(\mathbf{r})$; the latter does not complicate the many-body character of the wavefunction $\Psi$. One part of the Coulomb interaction is included in the effective potential, the second term in Eq. (2.87). The electrons themselves, being interacting particles, experience the effects of this long-range interaction since the motion of one affects the motion of each and every other one; this is known as the correlation aspect of the many-body wavefunction. Thus, we cannot assume that the full potential which the fictitious particles experience due to many-body effects, embodied in the last term of the effective potential in Eq. (2.87), can be described by the exchange part we have discussed so far. In an early attempt to account for this, Slater introduced

a "fudge factor" in his expression for the exchange potential, denoted by $\alpha$ (hence the expression $X$–$\alpha$ potential). This factor is usually taken to be close to, but somewhat smaller than, 1 ($\alpha = 0.7$ is a typical choice). If $\alpha \neq 2/3$, the value required for the potential arising from pure exchange, it is thought that the $X$–$\alpha$ expression includes in some crude way the effects of both exchange and correlation. It is easy to extract the part that corresponds to correlation in the $X$–$\alpha$ expression, by comparing Slater's exchange potential, multiplied by the fudge factor $\alpha$, to the potential involved in the single-particle equations derived from Density Functional Theory (see Table 2.1).

Table 2.1. Correlation energy functionals $\epsilon^{cor}[n(\mathbf{r})]$ and exchange-correlation potentials $V^{XC}[n(\mathbf{r})]$ in various models. H–L = Hedin–Lundqvist [24], P–Z = Perdew–Zunger [25]. $\epsilon^X$ is the pure exchange energy from Eq. (2.91). $r_s$ is measured in units of $a_0$ and the energy is in rydbergs. The numerical constants have units which depend on the factor of $r_s$ involved with each.

| Model | $\epsilon^{cor}[n(\mathbf{r})]$ | $V^{XC}[n(\mathbf{r})]$ |
| --- | --- | --- |
| Exchange | $0$ | $\frac{4}{3}\epsilon^X$ |
| Slater | $\left(\frac{3}{2}\alpha - 1\right)\epsilon^X$ | $2\alpha\,\epsilon^X$ |
| Wigner | $A(B + r_s)^{-1}$; $A = 0.884$, $B = 7.8$ | |
| H–L | | $\frac{4}{3}\epsilon^X \left[ 1 + B r_s \ln\left(1 + A r_s^{-1}\right) \right]$; $A = 21$, $B = 0.0368$ |
| P–Z, $r_s < 1$ | $A_1 + A_2 r_s + [A_3 + A_4 r_s] \ln(r_s)$; $A_1 = -0.096$, $A_2 = -0.0232$, $A_3 = 0.0622$, $A_4 = 0.004$ | |
| P–Z, $r_s \geq 1$ | $B_1 \left[ 1 + B_2 \sqrt{r_s} + B_3 r_s \right]^{-1}$; $B_1 = -0.2846$, $B_2 = 1.0529$, $B_3 = 0.3334$ | |

What should $E^{XC}[n(\mathbf{r})]$ be to capture all the many-body effects? This question is the holy grail in electronic structure calculations. So far no completely satisfactory answer has emerged. There are many interesting models, of which we mention a few so that the reader can get a feeling of what is typically involved, but the problem remains an area of active research. In fact it is not likely that any expression which depends on $n(\mathbf{r})$ in a local fashion will suffice, since the exchange and correlation effects are inherently non-local in an interacting electron system (see also the discussion below). A collection of proposed expressions for the correlation energies and exchange-correlation potentials is given in Table 2.1. In these expressions the

exchange-correlation functional is written as

$$E^{XC}[n(\mathbf{r})] = \int \left[ \epsilon^X[n(\mathbf{r})] + \epsilon^{cor}[n(\mathbf{r})] \right] n(\mathbf{r})\, d\mathbf{r} \tag{2.93}$$

and the exchange-correlation potential that appears in the single-particle equations is defined as

$$V^{XC}[n(\mathbf{r})] = \frac{\delta E^{XC}[n(\mathbf{r})]}{\delta n(\mathbf{r})} \tag{2.94}$$

while the pure exchange energy $\epsilon^X[n]$ is the expression given in Eq. (2.91). These expressions are usually given in terms of $r_s$, which is related to the density through Eq. (2.29). The expression proposed by Wigner extrapolates between known limits in $r_s$, obtained by series expansions (see Problem 7). The parameters that appear in the expression proposed by Hedin and Lundqvist [24] are determined by fitting to the energy of the uniform electron gas, obtained by numerical methods at different densities. A similar type of expression was proposed by Perdew and Zunger [25], which captures the more sophisticated numerical calculations for the uniform electron gas at different densities performed by Ceperley and Alder [26].

The common feature in all these approaches is that $E^{XC}$ depends on $n(\mathbf{r})$ in a local fashion, that is, $n$ needs to be evaluated at one point in space at a time. For this reason they are referred to as the Local Density Approximation to Density Functional Theory. This is actually a severe restriction, because even at the exchange level, the functional should be non-local, that is, it should depend on $\mathbf{r}$ and $\mathbf{r}'$ simultaneously (recall, for example, the expressions for the exchange density, $\rho_i^X(\mathbf{r}, \mathbf{r}')$, Eq. (2.20)). It is a much more difficult task to develop non-local exchange-correlation functionals. More recently, a concentrated effort has been directed toward producing expressions for $E^{XC}$ that depend not only on the density $n(\mathbf{r})$, but also on its gradients [27]. These expansions tend to work better for finite systems (molecules, surfaces, etc.), but still represent a local approximation to the exchange-correlation functional. Including correlation effects in a realistic manner is exceedingly difficult, as we have already demonstrated for the hydrogen molecule.

2.6 Electrons as quasiparticles

So far we have examined how one can justify the reduction of the many-body equation Eq. (2.1) to a set of single-particle equations. This can be done by introducing certain approximations. In the case of the Hartree and Hartree–Fock approximations, one starts with a guess for the many-body wavefunction, expressed in terms of single-particle states. The resulting single-particle equations are supposed to

describe the behavior of electrons as independent particles in an external potential defined by the ions, as well as an external field produced by the presence of all other electrons. In the case of DFT, one can derive exactly the single-particle equations for non-interacting, fictitious particles, whose density is the same as the density of real electrons. However, these equations cannot be solved exactly, because the exchange-correlation functional which appears in them is not known explicitly. One can construct approximations to this functional, by comparison with the results of numerical calculations for the electron gas. Once this functional has been determined (in an approximate way), the equations can be solved, and the wavefunctions of the fictitious particles can be determined.

What do the Kohn–Sham single-particle states actually represent? We know that they are not necessarily associated with electrons. A better interpretation is to consider them as quasiparticles. This general notion was first introduced by L. D. Landau. Landau's basic idea was that in a complicated system of strongly interacting particles, it may still be possible to describe the properties of the system in terms of weakly interacting particles, which represent some type of collective excitations of the original particles. Dealing with these weakly interacting particles is much simpler, so describing the system in this language can be very advantageous. In particular, some type of perturbation theory treatment can be formulated when one deals with a weakly interacting system of particles, beginning with the non-interacting particles as the unperturbed state. The fictitious non-interacting particles involved in the Kohn–Sham equations Eq. (2.86) possess some flavor of the quasiparticle notion: in the non-interacting Kohn–Sham picture all the important effects, including kinetic energy, Coulomb repulsion, exchange energy, and of course the ionic potential, are properly taken into account; correlation effects are then added approximately. Although this is not expressed as a systematic perturbative approach, it embodies certain features of such an approach because correlation effects are typically the weakest contribution to the total energy. It is quite extraordinary that many of the measurable properties of real systems can be identified with the behavior of the Kohn–Sham quasiparticles in a direct way. We will see some examples of this in later chapters.

Why is the single-particle approximation successful? We can cite the following three general arguments:

• The variational principle: Even when the wavefunctions are not accurate, as in the case of the Hartree–Fock approximation, the total energy is not all that bad. Energy differences between states of the system (corresponding to different atomic configurations, for which the single-particle equations are solved self-consistently each time) turn out to be remarkably good. This is because the optimal set of single-particle states contains most of the physics related to the motion of ions. The case of coherent many-body states

for the electrons, where the single-particle approximation fails, concerns much more delicate phenomena in solids. For example, the phenomenon of superconductivity involves energy scales of a few degrees kelvin (or at most ∼ 100 K), whereas the motion of atoms in solids (involving changes in atomic positions) involves energies of the order of 1 eV = 11 604 K.

• The exclusion principle: The single-particle states are filled up to the Fermi level in order to satisfy the Pauli exclusion principle; these are the optimal states as far as the energy is concerned in the single-particle picture. In order to build a more accurate description of the system, we need to consider virtual excitations of this ground state and include their contribution to the total energy. Since it is the interaction between electrons that we want to capture more accurately, we should try to include in the correction to the total energy the effect of excitations through the Coulomb interaction. By its nature this interaction involves a pair of particles, therefore the lowest order non-vanishing terms in the correction must involve the excitation of two electrons (with energies $\epsilon_i, \epsilon_j > \epsilon_F$) and two holes (with energies $\epsilon_m, \epsilon_n \leq \epsilon_F$); the contribution of such excitations to the total energy will be

$$\Delta E \sim \frac{\left| \langle \phi_i \phi_j | \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} | \phi_m \phi_n \rangle \right|^2}{\epsilon_i + \epsilon_j - \epsilon_m - \epsilon_n} \tag{2.95}$$

However, the single-particle states are orthogonal, so that there is no overlap between such states: $\langle \phi_i | \phi_m \rangle = \langle \phi_j | \phi_n \rangle = 0$ (except when $i = m$, $j = n$ or $i = n$, $j = m$, which do not correspond to an excitation at all). This indicates that the matrix elements in the numerator of Eq. (2.95) are very small, and the corresponding corrections to the energy are very small. That is, in the single-particle picture, the energy of the system is reasonably well described and corrections to it tend to be small.

• Screening: The Coulomb interaction between real electrons is "screened" by the correlated motion of all other electrons: each electron is surrounded by a region where the density of electronic charge is depleted; this is referred to as the "exchange-correlation hole". This forms an effective positive charge cloud which screens an electron from all its neighbors. There is no real hole in the system; the term is just a figurative way of describing the many-body effects in the single-particle picture. The net effect is a weakened interaction between electrons; the weaker the interaction, the closer the system is to the picture of non-interacting single particles. These particles, however, can no longer be identified with individual electrons, since they carry with them the effects of interaction with all other electrons (the exchange-correlation hole). Thus, not only are the energies in the single-particle picture quite good, but the description of electronic behavior is also reasonable.

To illustrate the concept of screening and the exchange-correlation hole, we refer again to the simple example of free electrons in the Hartree–Fock approximation. In this picture, the potential experienced by each electron due to electron–electron interactions is given by Eq. (2.25), which, with plane waves as the single-particle

states Eq. (2.26), becomes

$$V_{\mathbf{k}}^{HF}(\mathbf{r}) = e^2 \int \frac{(N/\Omega) - \rho_{\mathbf{k}}^{HF}(\mathbf{r}, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' \tag{2.96}$$

In this potential, the effect of exchange between electrons is expressed by the second term in the numerator of the integrand, which arises from the Hartree–Fock density defined in Eq. (2.24). For the present simple example, the Hartree–Fock density takes the form:

$$\rho_{\mathbf{k}}^{HF}(\mathbf{r}, \mathbf{r}') = \frac{1}{\Omega} \sum_{\mathbf{k}'} e^{-i(\mathbf{k}-\mathbf{k}')\cdot(\mathbf{r}-\mathbf{r}')} \tag{2.97}$$

With the help of Eqs. (D.8) and (G.64), we obtain

$$\rho_{\mathbf{k}}^{HF}(\mathbf{r}, \mathbf{r}') = e^{-i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}\, \delta(\mathbf{r} - \mathbf{r}') \tag{2.98}$$

that is, the electrons in this case experience an exchange hole which is a $\delta$-function centered at the electron position. This is an extreme case of the exchange-correlation hole, in which correlation effects have been neglected and the hole has infinite strength and is located right on the electron itself with zero extent around it. In more realistic treatments, where the electron wavefunctions are not simply plane waves and correlation effects are taken into account, the exchange-correlation hole has finite strength and finite extent around the electron position.

2.6.1 Quasiparticles and collective excitations

There is another notion which is also very important for the description of the properties of solids, that of "collective excitations". In contrast to quasiparticles, these are bosons, they bear no resemblance to constituent particles of a real system, and they involve collective (that is, coherent) motion of many physical particles. We summarize here the most common quasiparticles and collective excitations encountered in solids:

(a) Electron: As we discussed above already, this is a quasiparticle consisting of a real electron and the exchange-correlation hole, a cloud of effective charge of opposite sign due to exchange and correlation effects arising from interaction with all other electrons in the system. The electron is a fermion with spin 1/2. The Fermi energy (highest occupied state) is of order 5 eV, and the Fermi velocity ($v_F = \hbar k_F / m_e$) is ∼ $10^8$ cm/sec, that is, it can be treated as a non-relativistic particle. Notice that the mass of this quasiparticle can be different from that of the free electron.

(b) Hole: This is a quasiparticle, like the electron, but of opposite charge; it corresponds to the absence of an electron from a single-particle state which lies below the Fermi level. The notion of a hole is particularly convenient when the reference state consists of quasiparticle states that are fully occupied and are separated by an energy gap from

the unoccupied states. Perturbations with respect to this reference state, such as missing electrons, can be conveniently discussed in terms of holes. This is, for example, the situation in p-doped semiconductor crystals.

(c) Polaron: This is a quasiparticle, like the electron, tied to a distortion of the lattice of ions. Polarons are invoked to describe polar crystals, where the motion of a negatively charged electron distorts the lattice of positive and negative ions around it. Because its motion is coupled to the motion of ions, the polaron has a different mass than the electron.

(d) Exciton: This is a collective excitation, corresponding to a bound state of an electron and a hole. The binding energy is of order $e^2/(\varepsilon a)$, where $\varepsilon$ is the dielectric constant of the material (typically of order 10) and $a$ is the distance between the two quasiparticles (typically a few lattice constants, of order 10 Å), which give for the binding energy ∼ 0.1 eV.

(e) Phonon: This is a collective excitation, corresponding to coherent motion of all the atoms in the solid. It is a quantized lattice vibration, with a typical energy scale of $\hbar\omega$ ∼ 0.1 eV.

(f) Plasmon: This is a collective excitation of the entire electron gas relative to the lattice of ions; its existence is a manifestation of the long-range nature of the Coulomb interaction. The energy scale of plasmons is $\hbar\omega \sim \hbar\sqrt{4\pi n e^2/m_e}$, where $n$ is the density; for typical densities this gives an energy of order 5–20 eV.

(g) Magnon: This is a collective excitation of the spin degrees of freedom on the crystalline lattice. It corresponds to a spin wave, with an energy scale of $\hbar\omega$ ∼ 0.001–0.1 eV.

In the following chapters we will explore in detail the properties of some of these quasiparticles and collective excitations.

2.6.2 Thomas–Fermi screening

To put a more quantitative expression to the idea of electrons as screened quasiparticles, consider the following simple model. First, suppose that the electrons are maximally efficient in screening each other's charge. This means that the Coulomb potential of an electron, $(-e)/r$, would be quickly suppressed as we move away from the electron's position (here taken to be the origin of the coordinate system). The most efficient way to do this is to multiply the Coulomb potential of the bare electron by a decaying exponential:

$$\Phi^{scr}(r) = \frac{(-e)}{r} e^{-k_s r} \tag{2.99}$$

where $k_s$ is the screening inverse length: the potential is negligible at distances larger than ∼ $k_s^{-1}$. The response of a system of charged particles to an external field is typically described by the dielectric function, defined as

$$\varepsilon = \frac{\Phi^{ext}}{\Phi} = \frac{\Phi - \Phi^{ind}}{\Phi} = 1 - \frac{\Phi^{ind}}{\Phi} \tag{2.100}$$

where $\Phi^{ext}$ is the external field, $\Phi^{ind}$ is the field induced by the response of the charged particles, and $\Phi = \Phi^{ext} + \Phi^{ind}$ is the total field. This general definition holds both in real space (with the appropriate expressions in terms of integral equations) and in Fourier space. Now, if the bare Coulomb potential $\Phi^{ext}(r) = (-e)/r$, whose Fourier transform is $(-e)4\pi/k^2$, is screened by an exponential factor $e^{-k_s r}$ to become the total potential $\Phi(r) = (-e)e^{-k_s r}/r$, its Fourier transform becomes $\Phi(k) = (-e)4\pi/(k^2 + k_s^2)$ (see Appendix G), and therefore the dielectric function takes the form

$$\varepsilon(k) = \frac{\Phi^{ext}(k)}{\Phi(k)} = \frac{(-e)4\pi/k^2}{(-e)4\pi/(k^2 + k_s^2)} = 1 + \left( \frac{k_s}{k} \right)^2 \tag{2.101}$$

We can also define the induced charge $\rho^{ind}(\mathbf{r})$ through the Poisson equation:

$$-\nabla^2 \Phi^{ind}(\mathbf{r}) = 4\pi \rho^{ind}(\mathbf{r}) \implies \Phi^{ind}(\mathbf{k}) = \frac{4\pi}{|\mathbf{k}|^2} \rho^{ind}(\mathbf{k}) \tag{2.102}$$

which gives for the dielectric function in Fourier space

$$\varepsilon(\mathbf{k}) = 1 - \frac{4\pi}{k^2} \frac{\rho^{ind}(\mathbf{k})}{\Phi(\mathbf{k})} \tag{2.103}$$

For sufficiently weak fields, we take the response of the system to be linear in the total field:

$$\rho^{ind}(\mathbf{k}) = \chi(\mathbf{k})\, \Phi(\mathbf{k}) \tag{2.104}$$

where the function $\chi(\mathbf{k})$ is the susceptibility or response function. This gives for the dielectric function

$$\varepsilon(\mathbf{k}) = 1 - \frac{4\pi}{k^2} \chi(\mathbf{k}) \tag{2.105}$$

As can be shown straightforwardly using perturbation theory (see Problem 8), for a system of single particles with energy $\epsilon_{\mathbf{k}} = \hbar^2 k^2 / 2m_e$ and Fermi occupation numbers

$$n(\mathbf{k}) = \frac{1}{e^{(\epsilon_{\mathbf{k}} - \epsilon_F)/k_B T} + 1} \tag{2.106}$$

where $\epsilon_F$ is the Fermi energy and $T$ the temperature, the response function $\chi(\mathbf{k})$ takes the form

$$\chi(\mathbf{k}) = -e^2 \int \frac{d\mathbf{k}'}{(2\pi)^3} \frac{n(\mathbf{k}' - \mathbf{k}/2) - n(\mathbf{k}' + \mathbf{k}/2)}{\hbar^2 (\mathbf{k} \cdot \mathbf{k}')/2m_e} \tag{2.107}$$

This is called the Lindhard dielectric response function. Notice that, at $T = 0$, in order to have a non-vanishing integrand, one of the occupation numbers must correspond to a state below the Fermi level, and the other to a state above the

Fermi level; that is, we must have $|\mathbf{k}' - \mathbf{k}/2| < k_F$ and $|\mathbf{k}' + \mathbf{k}/2| > k_F$ (or vice versa), since otherwise the occupation numbers are either both 0 or both 1. These considerations indicate that the contributions to the integral will come from electron–hole excitations with total momentum $\mathbf{k}$. At $T = 0$, the integral over $\mathbf{k}'$ in the Lindhard dielectric function can be evaluated to yield

$$\chi_0(k) = -e^2 \frac{m_e k_F}{2\hbar^2 \pi^2} F(k/2k_F) \tag{2.108}$$

where $F(x)$ is the same function as the one encountered earlier in the discussion of Hartree–Fock single-particle energies, defined in Eq. (2.39). The study of this function provides insight into the behavior of the system of single particles (see Problem 8).

A special limit of the Lindhard response function is known as the Thomas–Fermi screening. This is based on the limit of the Lindhard function for $|\mathbf{k}| \ll k_F$. Consistent with our earlier remarks, this corresponds to electron–hole excitations with small total momentum relative to the Fermi momentum. In this case, expanding the occupation numbers $n(\mathbf{k}' \pm \mathbf{k}/2)$ through their definition in Eq. (2.106) about $\mathbf{k} = 0$, we obtain to first order in $\mathbf{k}$

$$n(\mathbf{k}' \pm \mathbf{k}/2) = n(\mathbf{k}') \mp \frac{\hbar^2}{2m_e} \frac{\partial n(\mathbf{k}')}{\partial \epsilon_F} (\mathbf{k}' \cdot \mathbf{k}) \tag{2.109}$$

This expression, when substituted in the equation for the dielectric constant, gives

$$\varepsilon(k) = 1 + \left( \frac{l_s^{TF}}{k} \right)^2, \qquad l_s^{TF}(T) = \left[ 4\pi e^2 \frac{\partial n_T}{\partial \epsilon_F} \right]^{1/2} \tag{2.110}$$

where $l_s^{TF}$ is the so-called Thomas–Fermi screening inverse length, which depends on the temperature $T$ through the occupation $n_T$, defined as

$$n_T = 2 \int \frac{d\mathbf{k}}{(2\pi)^3} \frac{1}{e^{(\epsilon_{\mathbf{k}} - \epsilon_F)/k_B T} + 1} \tag{2.111}$$

This quantity represents the full occupation of states in the ground state of the single-particle system at temperature $T$, and is evidently a function of $\epsilon_F$. By comparing the expression for the dielectric function in the Thomas–Fermi case, Eq. (2.110), with the general result of the exponentially screened potential, Eq. (2.101), we conclude that this approximation corresponds to an exponentially screened potential of the form given in Eq. (2.99) with the Thomas–Fermi screening inverse length $l_s^{TF}$.
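The Thomas–Fermi result can be made concrete with a short numerical sketch at $T = 0$. The assumptions here are not taken from the text: atomic units ($\hbar = m_e = e^2 = 1$), the zero-temperature limit of Eq. (2.111), $n_T \to k_F^3/(3\pi^2)$ with $k_F = \sqrt{2\epsilon_F}$, and a derivative with respect to $\epsilon_F$ evaluated by a central finite difference. The function names are hypothetical.

```python
import math

def electron_density(e_fermi):
    """T = 0 limit of Eq. (2.111) in atomic units: n = k_F^3 / (3 pi^2), k_F = sqrt(2 eps_F)."""
    k_f = math.sqrt(2.0 * e_fermi)
    return k_f ** 3 / (3.0 * math.pi ** 2)

def tf_inverse_length(e_fermi, step=1e-6):
    """l_s^TF = sqrt(4 pi e^2 dn/d eps_F) with e^2 = 1, as in Eq. (2.110).

    The derivative is approximated by a central difference with the given step.
    """
    dn = (electron_density(e_fermi + step) - electron_density(e_fermi - step)) / (2.0 * step)
    return math.sqrt(4.0 * math.pi * dn)

def dielectric_tf(k, l_s):
    """Thomas-Fermi dielectric function, eps(k) = 1 + (l_s / k)^2."""
    return 1.0 + (l_s / k) ** 2

def screened_potential(r, l_s, charge=-1.0):
    """Exponentially screened Coulomb potential of Eq. (2.99), charge * exp(-l_s r) / r."""
    return charge * math.exp(-l_s * r) / r

if __name__ == "__main__":
    r_s = 2.0                                        # illustrative density parameter, in a0
    k_f = (9.0 * math.pi / 4.0) ** (1.0 / 3.0) / r_s
    l_s = tf_inverse_length(0.5 * k_f ** 2)
    print("k_F =", k_f, " l_s^TF =", l_s, " screening length 1/l_s =", 1.0 / l_s)
    for r in (0.5, 1.0, 2.0, 4.0):
        bare = -1.0 / r
        print(f"r = {r:.1f}: bare = {bare:.4f}, screened = {screened_potential(r, l_s):.4f}")
```

In these units the finite-difference derivative reproduces the closed form $\partial n/\partial \epsilon_F = k_F/\pi^2$, so that $(l_s^{TF})^2 = 4k_F/\pi$. For metallic densities the screening length comes out comparable to $a_0$, which is why the screened potential falls far below the bare one within a few $a_0$.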

2.7 The ionic potential

So far we have justified why it is possible to turn the many-body Schrödinger equation for electrons in a solid into a set of single-particle equations, which with the proper approximations can be solved to determine the single-particle wavefunctions. When actually solving the single-particle equations we need to specify the ionic potential. As we have emphasized before, we are only dealing with the valence electrons of atoms, while the core electrons are largely unaffected when the atoms are placed in the solid. The separation of the atomic electron density into its core and valence parts is shown in Fig. 2.4 for four elements of column IV of the Periodic Table, namely C, Si, Ge and Pb. It is clear from these figures that the contribution of the valence states to the total electron density is negligible within the core region and dominant beyond it. Because of this difference between valence and core electrons, a highly effective approach has been developed to separate the two sets of states. This approach, known as the pseudopotential method, allows us to take the core electrons out of the picture, and at the same time to create a smoother potential for the valence electrons; the work of Phillips and Kleinman first established the theoretical basis of the pseudopotential method [28]. Since this approach is one of the pillars of modern electronic structure theory, we will describe its main ideas here.

In order to develop the pseudopotential for a specific atom we consider it as isolated, and denote by $|\psi^{(n)}\rangle$ the single-particle states which are the solutions of the single-particle equations discussed earlier, as they apply to the case of an isolated atom. In principle, we need to calculate these states for all the electrons of the atom, using as an external potential that of its nucleus. Let us separate explicitly the single-particle states into valence and core sets, identified as $|\psi^{(v)}\rangle$ and $|\psi^{(c)}\rangle$ respectively. These satisfy the Schrödinger type equations

$$H^{sp} |\psi^{(v)}\rangle = \epsilon^{(v)} |\psi^{(v)}\rangle \tag{2.112}$$

$$H^{sp} |\psi^{(c)}\rangle = \epsilon^{(c)} |\psi^{(c)}\rangle \tag{2.113}$$

where $H^{sp}$ is the appropriate single-particle hamiltonian for the atom: it contains a potential $V^{sp}$ which includes the external potential due to the nucleus, as well as all the other terms arising from electron-electron interactions. Now let us define a new set of single-particle valence states $|\tilde{\phi}^{(v)}\rangle$ through the following relation:

$$|\psi^{(v)}\rangle = |\tilde{\phi}^{(v)}\rangle - \sum_c \langle \psi^{(c)} | \tilde{\phi}^{(v)} \rangle\, |\psi^{(c)}\rangle \tag{2.114}$$

Applying the single-particle hamiltonian $H^{sp}$ to this equation, we obtain

$$H^{sp} |\tilde{\phi}^{(v)}\rangle - \sum_c \langle \psi^{(c)} | \tilde{\phi}^{(v)} \rangle\, H^{sp} |\psi^{(c)}\rangle = \epsilon^{(v)} \left[ |\tilde{\phi}^{(v)}\rangle - \sum_c \langle \psi^{(c)} | \tilde{\phi}^{(v)} \rangle\, |\psi^{(c)}\rangle \right] \tag{2.115}$$

which, taking into account that $H^{sp} |\psi^{(c)}\rangle = \epsilon^{(c)} |\psi^{(c)}\rangle$, gives

$$\left[ H^{sp} - \sum_c \epsilon^{(c)} |\psi^{(c)}\rangle \langle \psi^{(c)}| \right] |\tilde{\phi}^{(v)}\rangle = \epsilon^{(v)} \left[ 1 - \sum_c |\psi^{(c)}\rangle \langle \psi^{(c)}| \right] |\tilde{\phi}^{(v)}\rangle$$

$$\implies \left[ H^{sp} + \sum_c (\epsilon^{(v)} - \epsilon^{(c)}) |\psi^{(c)}\rangle \langle \psi^{(c)}| \right] |\tilde{\phi}^{(v)}\rangle = \epsilon^{(v)} |\tilde{\phi}^{(v)}\rangle \tag{2.116}$$

[Figure 2.4: four panels, C, Si, Ge and Pb, each showing $n(r)$ versus $r$.]

Figure 2.4. Electron densities $n(r)$ as a function of the radial distance from the nucleus $r$ in angstroms, for four elements of column IV of the Periodic Table: C ($Z = 6$, $[1s^2]2s^2 2p^2$), Si ($Z = 14$, $[1s^2 2s^2 2p^6]3s^2 3p^2$), Ge ($Z = 32$, $[1s^2 2s^2 2p^6 3s^2 3p^6 3d^{10}]4s^2 4p^2$) and Pb ($Z = 82$, $[1s^2 2s^2 2p^6 3s^2 3p^6 3d^{10} 4s^2 4p^6 4d^{10} 4f^{14} 5s^2 5p^6 5d^{10}]6s^2 6p^2$); the core states are given inside square brackets. In each case, the dashed line with the shaded area underneath it represents the density of core electrons, while the solid line represents the density of valence electrons and the total density of electrons (core plus valence). The core electron density for C is confined approximately below 1.0 Å, for Si below 1.5 Å, for Ge below 2.0 Å, and for Pb below 2.5 Å. In all cases the valence electron density extends well beyond the range of the core electron density and is relatively small within the core. The wiggles that develop in the valence electron densities for Si, Ge and Pb are due to the nodes of the corresponding wavefunctions, which acquire oscillations in order to become orthogonal to core states.

Therefore, the new states $|\tilde{\phi}^{(v)}\rangle$ obey a single-particle equation with a modified potential, but have the same eigenvalues $\epsilon^{(v)}$ as the original valence states $|\psi^{(v)}\rangle$. The modified potential for these states is called the "pseudopotential", given by

$$V^{ps} = V^{sp} + \sum_c (\epsilon^{(v)} - \epsilon^{(c)}) |\psi^{(c)}\rangle \langle \psi^{(c)}| \tag{2.117}$$

and, correspondingly, the $|\tilde{\phi}^{(v)}\rangle$'s are called "pseudo-wavefunctions".

Why is this a useful approach? First, consider the definition of the pseudo-wavefunctions through Eq. (2.114): what this definition amounts to is projecting out of the valence wavefunctions any overlap they have with the core wavefunctions. In fact, the quantity

$$\sum_c |\psi^{(c)}\rangle \langle \psi^{(c)}| \tag{2.118}$$

is a projection operator that achieves exactly this result. So the new valence states defined through Eq. (2.114) have zero overlap with core states, but they have the same eigenvalues as the original valence states. Moreover, the potential that these states experience includes, in addition to the regular potential $V^{sp}$, the term

$$\sum_c (\epsilon^{(v)} - \epsilon^{(c)}) |\psi^{(c)}\rangle \langle \psi^{(c)}| \tag{2.119}$$

which is strictly positive, because $\epsilon^{(v)} > \epsilon^{(c)}$ (valence states have by definition higher energy than core states). Thus, this term is repulsive and tends to push the corresponding states $|\tilde{\phi}^{(v)}\rangle$ outside the core. In this sense, the pseudopotential represents the effective potential that valence electrons feel, if the only effect of core electrons were to repel them from the core region. Therefore the pseudo-wavefunctions experience an attractive Coulomb potential which is shielded near the position of the nucleus by the core electrons, so it should be a much smoother potential without the $1/r$ singularity due to the nucleus at the origin. Farther away from the core region, where the core states die exponentially, the potential that the pseudo-wavefunctions experience is the same as the Coulomb potential of an ion, consisting of the nucleus plus the core electrons. In other words, through the pseudopotential formulation we have created a new set of valence states, which experience a weaker potential near the atomic nucleus, but the proper ionic potential away from the core region. Since it is this region in which the valence electrons interact to form bonds that hold the solid together, the pseudo-wavefunctions preserve all the important physics relevant to the behavior of the solid. The fact that they also have exactly the same eigenvalues as the original valence states also indicates that they faithfully reproduce the behavior of true valence states.

There are some aspects of the pseudopotential, at least in the way that was formulated above, that make it somewhat suspicious. First, it is a non-local potential:

applying it to the state $|\tilde{\phi}^{(v)}\rangle$ gives

$$\sum_c (\epsilon^{(v)} - \epsilon^{(c)}) |\psi^{(c)}\rangle \langle \psi^{(c)} | \tilde{\phi}^{(v)} \rangle = \int V^{ps}(\mathbf{r}, \mathbf{r}')\, \tilde{\phi}^{(v)}(\mathbf{r}')\, d\mathbf{r}'$$

$$\implies V^{ps}(\mathbf{r}, \mathbf{r}') = \sum_c (\epsilon^{(v)} - \epsilon^{(c)})\, \psi^{(c)*}(\mathbf{r}')\, \psi^{(c)}(\mathbf{r}) \tag{2.120}$$

This certainly complicates things. The pseudopotential also depends on the energy $\epsilon^{(v)}$, as the above relationship demonstrates, which is an unknown quantity if we view Eq. (2.116) as the Schrödinger equation that determines the pseudo-wavefunctions $|\tilde{\phi}^{(v)}\rangle$ and their eigenvalues. Finally, the pseudopotential is not unique. This can be demonstrated by adding any linear combination of $|\psi^{(c)}\rangle$ states to $|\tilde{\phi}^{(v)}\rangle$ to obtain a new state $|\hat{\phi}^{(v)}\rangle$:

$$|\hat{\phi}^{(v)}\rangle = |\tilde{\phi}^{(v)}\rangle + \sum_{c'} \alpha_{c'} |\psi^{(c')}\rangle \tag{2.121}$$

where $\alpha_{c'}$ are numerical constants. Using $|\tilde{\phi}^{(v)}\rangle = |\hat{\phi}^{(v)}\rangle - \sum_{c'} \alpha_{c'} |\psi^{(c')}\rangle$ in Eq. (2.116), we obtain

$$\left[ H^{sp} + \sum_c (\epsilon^{(v)} - \epsilon^{(c)}) |\psi^{(c)}\rangle \langle \psi^{(c)}| \right] \left[ |\hat{\phi}^{(v)}\rangle - \sum_{c'} \alpha_{c'} |\psi^{(c')}\rangle \right] = \epsilon^{(v)} \left[ |\hat{\phi}^{(v)}\rangle - \sum_{c'} \alpha_{c'} |\psi^{(c')}\rangle \right]$$

We can now use $\langle \psi^{(c)} | \psi^{(c')} \rangle = \delta_{cc'}$ to reduce the double sum $\sum_c \sum_{c'}$ on the left-hand side of this equation to a single sum, and eliminate common terms from both sides to arrive at

$$\left[ H^{sp} + \sum_c (\epsilon^{(v)} - \epsilon^{(c)}) |\psi^{(c)}\rangle \langle \psi^{(c)}| \right] |\hat{\phi}^{(v)}\rangle = \epsilon^{(v)} |\hat{\phi}^{(v)}\rangle \tag{2.122}$$

This shows that the state $|\hat{\phi}^{(v)}\rangle$ obeys exactly the same single-particle equation as the state $|\tilde{\phi}^{(v)}\rangle$, which means it is not uniquely defined, and therefore the pseudopotential is not uniquely defined. All these features may cast a long shadow of doubt on the validity of the pseudopotential construction in the mind of the skeptic (a trait not uncommon among physicists). Practice of this art, however, has shown that these features can actually be exploited to define pseudopotentials that work very well in reproducing the behavior of the valence wavefunctions in the regions outside the core, which are precisely the regions of interest for the physics of solids.
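The algebra leading to Eqs. (2.116) and (2.122) can be checked in a small finite-dimensional model. The sketch below is an illustration under assumptions of our own, not a calculation for a real atom: the "hamiltonian" is a 3×3 matrix built from three chosen orthonormal vectors playing the roles of one core state, one valence state and one extra state, with made-up eigenvalues. All names are hypothetical.

```python
import math

# Minimal vector/matrix helpers on plain Python lists (real-valued states).
def scale(a, v): return [a * x for x in v]
def add(u, v): return [x + y for x, y in zip(u, v)]
def dot(u, v): return sum(x * y for x, y in zip(u, v))
def outer(u, v): return [[x * y for y in v] for x in u]
def mat_add(a, b): return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
def mat_scale(s, a): return [[s * x for x in row] for row in a]
def mat_vec(a, v): return [dot(row, v) for row in a]

# Orthonormal model states and assumed eigenvalues (arbitrary units).
psi_c = scale(1.0 / math.sqrt(3.0), [1.0, 1.0, 1.0])    # core state
psi_v = scale(1.0 / math.sqrt(2.0), [1.0, -1.0, 0.0])   # valence state
psi_x = scale(1.0 / math.sqrt(6.0), [1.0, 1.0, -2.0])   # another, higher state
eps_c, eps_v, eps_x = -10.0, -1.0, 2.0

# H^sp = sum_n eps_n |psi_n><psi_n| has these states as eigenvectors by construction.
h_sp = mat_scale(eps_c, outer(psi_c, psi_c))
h_sp = mat_add(h_sp, mat_scale(eps_v, outer(psi_v, psi_v)))
h_sp = mat_add(h_sp, mat_scale(eps_x, outer(psi_x, psi_x)))

# Operator on the left-hand side of Eq. (2.116): H^sp + (eps_v - eps_c)|psi_c><psi_c|.
h_ps = mat_add(h_sp, mat_scale(eps_v - eps_c, outer(psi_c, psi_c)))

def residual_norm(op, state, eigenvalue):
    """Norm of (op - eigenvalue) applied to state; zero if state is an eigenvector."""
    r = add(mat_vec(op, state), scale(-eigenvalue, state))
    return math.sqrt(dot(r, r))

if __name__ == "__main__":
    for alpha in (0.0, 0.5, -2.0):
        # Any admixture of the core state gives an equally valid pseudo-state, Eq. (2.121).
        phi = add(psi_v, scale(alpha, psi_c))
        print(f"alpha = {alpha:+.1f}: residual with pseudo-hamiltonian = "
              f"{residual_norm(h_ps, phi, eps_v):.2e}, with H^sp alone = "
              f"{residual_norm(h_sp, phi, eps_v):.2e}")
```

For every value of `alpha` the state is an eigenvector of the modified operator with the valence eigenvalue, while it is an eigenvector of the original hamiltonian only for `alpha = 0`. This is the non-uniqueness expressed by Eq. (2.122).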

[Figure 2.5: the wavefunctions $\psi(r)$ and $\phi(r)$ and the potentials $V^{Coul}(r)$ and $V^{ps}(r)$ plotted against $r$, with the cutoff radius $r_c$ marked.]

Figure 2.5. Schematic representation of the construction of the pseudo-wavefunction $\phi(r)$ and pseudopotential $V^{ps}(r)$, beginning with the real valence wavefunction $\psi(r)$ and Coulomb potential $V^{Coul}(r)$; $r_c$ is the cutoff radius beyond which the wavefunction and potential are not affected.

As an example, we discuss next how typical pseudopotentials are constructed for modern calculations of the properties of solids [29]. The entire procedure is illustrated schematically in Fig. 2.6. We begin with a self-consistent solution of the single-particle equations for all the electrons in an atom (core and valence). For each valence state of interest, we take the calculated radial wavefunction and keep the tail starting at some point slightly before the last extremum. When atoms are placed at usual interatomic distances in a solid, these valence tails overlap significantly, and the resulting interaction between the corresponding electrons produces binding between the atoms. We want therefore to keep this part of the valence wavefunction as realistic as possible, and we identify it with the tail of the calculated atomic wavefunction. We call the radial distance beyond which this tail extends the "cutoff radius" $r_c$, so that the region $r < r_c$ corresponds to the core. Inside the core, the behavior of the wavefunction is not as important for the properties of the solid. Therefore, we can construct the pseudo-wavefunction to be a smooth function which has no nodes and goes to zero at the origin, as shown in Fig. 2.5. We can achieve this by taking some combination of smooth functions which we can fit to match the true wavefunction and its first and second derivative at $r_c$, and approach smoothly to zero at the origin. This hypothetical wavefunction must be normalized properly. Having defined the pseudo-wavefunction, we can invert the Schrödinger equation to obtain the potential which would produce such a wavefunction. This is by definition the desired pseudopotential: it is guaranteed by construction to produce a wavefunction which matches exactly the real atomic

wavefunction beyond the core region ($r > r_c$), and is smooth and nodeless inside the core region, giving rise to a smooth potential. We can then use this pseudopotential as the appropriate potential for the valence electrons in the solid.

Solve $\mathcal{H}^{sp}\psi^{(v)}(r) = [\hat{F} + V^{Coul}(r)]\,\psi^{(v)}(r) = \epsilon^{(v)}\psi^{(v)}(r)$
↓
Fix pseudo-wavefunction $\phi^{(v)}(r) = \psi^{(v)}(r)$ for $r \geq r_c$
↓
Construct $\phi^{(v)}(r)$ for $0 \leq r < r_c$, under the following conditions: $\phi^{(v)}(r)$ smooth, nodeless; $d\phi^{(v)}/dr$, $d^2\phi^{(v)}/dr^2$ continuous at $r_c$
↓
Normalize pseudo-wavefunction $\phi^{(v)}(r)$ for $0 \leq r < \infty$
↓
Invert $[\hat{F} + V^{ps}(r)]\,\phi^{(v)}(r) = \epsilon^{(v)}\phi^{(v)}(r)$
↓
$V^{ps}(r) = \epsilon^{(v)} - [\hat{F}\phi^{(v)}(r)]/\phi^{(v)}(r)$

Figure 2.6. The basic steps in constructing a pseudopotential. $\hat{F}$ is the operator in the single-particle hamiltonian $\mathcal{H}^{sp}$ that contains all other terms except the ionic (external) potential, that is, $\hat{F}$ consists of the kinetic energy operator, the Hartree potential term and the exchange-correlation term. $V^{Coul}$ and $V^{ps}$ are the Coulomb potential and the pseudopotential of the ion.

We note here two important points:
(i) The pseudo-wavefunctions can be chosen to be nodeless inside the core, due to the non-uniqueness in the definition of the pseudopotential and the fact that their behavior inside the core is not relevant to the physics of the solid. The true valence wavefunctions have many nodes in order to be orthogonal to core states.
(ii) The nodeless and smooth character of the pseudo-wavefunctions guarantees that the pseudopotentials produced by inversion of the Schrödinger equation are finite and smooth near the origin, instead of having a $1/r$ singularity like the Coulomb potential.

Of course, each valence state will give rise to a different pseudopotential, but this is not a serious complication as far as actual calculations are concerned. All the pseudopotentials corresponding to an atom will have tails that behave like $Z_v/r$, where $Z_v$ is the valence charge of the atom, that is, the ionic charge for an

ion consisting of the nucleus and the core electrons. The huge advantage of the pseudopotential is that now we have to deal with the valence electrons only in the solid (the core electrons are essentially frozen in their atomic wavefunctions), and the pseudopotentials are smooth so that standard numerical methods can be applied (such as Fourier expansions) to solve the single-particle equations. There are several details of the construction of the pseudopotential that require special attention in order to obtain potentials that work and actually simplify calculations of the properties of solids, but we will not go into these details here. Suffice to say that pseudopotential construction is one of the arts of performing reliable and accurate calculations for solids, but through the careful work of many physicists in this field over the last couple of decades there exist now very good pseudopotentials for essentially all elements of interest in the Periodic Table [30]. The modern practice of pseudopotentials has strived to produce in a systematic way potentials that are simultaneously smoother, more accurate and more transferable, for a wide range of elements [30–32].

Further reading

1. Density Functional Theory, E.K.U. Gross and R.M. Dreizler, eds. (Plenum Press, New York, 1995). This book provides a detailed account of technical issues related to DFT.
2. Electron Correlations in Molecules and Solids, P. Fulde (Springer-Verlag, Berlin, 1991). This is a comprehensive discussion of the problem of electron correlations in condensed matter, with many useful insights and detailed accounts of theoretical tools.
3. Pseudopotential Methods in Condensed Matter Applications, W.E. Pickett, Computer Physics Reports, vol. 9, pp. 115–198 (North Holland, Amsterdam, 1989). This is an excellent review of the theory and applications of pseudopotentials.

Problems

1. Use a variational calculation to obtain the Hartree–Fock single-particle equations Eq. (2.16) from the Hartree–Fock many-body wavefunction defined in Eq. (2.14).
2. Show that the quantities $\epsilon_i$ appearing in the Hartree–Fock equations, which were introduced as the Lagrange multipliers to preserve the normalization of state $\phi_i(\mathbf{r}_i)$, have the physical meaning of the energy required to remove this state from the system. To do this, find the energy difference between two systems, one with and one without the state $\phi_i(\mathbf{r}_i)$, which have different numbers of electrons, $N$ and $N - 1$, respectively; you may assume that $N$ is very large, so that removing the electron in state $\phi_i(\mathbf{r}_i)$ does not affect the other states $\phi_j(\mathbf{r}_j)$.
3. Consider a simple excitation of the ground state of the free-electron system, consisting of taking an electron from a state with momentum $\mathbf{k}_1$ and putting it in a state with

momentum $\mathbf{k}_2$; since the ground state of the system consists of filled single-particle states with momentum up to the Fermi momentum $k_F$, we must have $|\mathbf{k}_1| \leq k_F$ and $|\mathbf{k}_2| > k_F$. Removing the electron from state $\mathbf{k}_1$ leaves a "hole" in the Fermi sphere, so this excitation is described as an "electron–hole pair". Discuss the relationship between the total excitation energy and the total momentum of the electron–hole pair; show a graph of this relationship in terms of reduced variables, that is, the excitation energy and momentum in units of the Fermi energy $\epsilon_F$ and the Fermi momentum $k_F$. (At this point we are not concerned with the nature of the physical process that can create such an excitation and with how momentum is conserved in this process.)
4. The bulk modulus $B$ of a solid is defined as

$$B = -\Omega\frac{\partial P}{\partial \Omega} = \Omega\frac{\partial^2 E}{\partial \Omega^2} \tag{2.123}$$

where $\Omega$ is the volume, $P$ is the pressure, and $E$ is the total energy; this quantity describes how the solid responds to external pressure by changes in its volume. Show that for the uniform electron gas with the kinetic energy and exchange energy terms only, Eq. (2.31) and Eq. (2.46), respectively, the bulk modulus is given by

$$B = \left[\frac{5}{6\pi}\frac{2.21}{(r_s/a_0)^5} - \frac{2}{6\pi}\frac{0.916}{(r_s/a_0)^4}\right](\mathrm{Ry}/a_0^3) \tag{2.124}$$

or equivalently, in terms of the kinetic and exchange energies,

$$B = \frac{1}{6\pi (r_s/a_0)^3}\left[5\frac{E^{kin}}{N} + 2\frac{E^{X}}{N}\right](1/a_0^3) \tag{2.125}$$

Discuss the physical implications of this result for a hypothetical solid that might be reasonably described in terms of the uniform electron gas, and in which the value of $(r_s/a_0)$ is relatively small ($(r_s/a_0) < 1$).
5. We will investigate the model of the hydrogen molecule discussed in the text.
(a) Consider first the single-particle hamiltonian given in Eq. (2.52); show that its expectation values in terms of the single-particle wavefunctions $\phi_i$ ($i = 0, 1$) defined in Eq. (2.57) are those given in Eq. (2.58).
(b) Consider next the two-particle hamiltonian $\mathcal{H}(\mathbf{r}_1, \mathbf{r}_2)$, given in Eq. (2.54), which contains the interaction term; show that its expectation values in terms of the Hartree wavefunctions $\Psi_i^{H}$ ($i = 0, 1$) defined in Eqs. (2.59) and (2.60), are those given in Eqs. (2.62) and (2.63), respectively. To derive these results certain matrix elements of the interaction term need to be neglected; under what assumptions is this a reasonable approximation? What would be the expression for the energy if we were to use the wavefunction $\Psi_2^{H}(\mathbf{r}_1, \mathbf{r}_2)$ defined in Eq. (2.61)?
(c) Using the Hartree–Fock wavefunctions for this model defined in Eqs. (2.64)–(2.66), construct the matrix elements of the hamiltonian $\mathcal{H}_{ij} = \langle\Psi_i^{HF}|\mathcal{H}|\Psi_j^{HF}\rangle$ and diagonalize this 3 × 3 matrix to find the eigenvalues and eigenstates; verify that the ground state energy and wavefunction are those given in Eq. (2.67) and Eq. (2.68), respectively. Here we will assume that the same approximations as those involved in part (b) are applicable.

(d) Find the probability that the two electrons in the ground state, defined by Eq. (2.68), are on the same proton. Give a plot of this result as a function of $(U/t)$ and explain the physical meaning of the answer for the behavior at the small and large limits of this parameter.
6. We want to determine the physical meaning of the quantities $\epsilon_i$ in the Density Functional Theory single-particle equations Eq. (2.86). To do this, we express the density as

$$n(\mathbf{r}) = \sum_i n_i\,|\phi_i(\mathbf{r})|^2 \tag{2.126}$$

where the $n_i$ are real numbers between 0 and 1, called the "filling factors". We take a partial derivative of the total energy with respect to $n_i$ and relate it to $\epsilon_i$. Then we integrate this relation with respect to $n_i$. What is the physical meaning of the resulting equation?
7. In the extremely low density limit, a system of electrons will form a regular lattice, with each electron occupying a unit cell; this is known as the Wigner crystal. The energy of this crystal has been calculated to be

$$E^{Wigner} = \left[-\frac{3}{(r_s/a_0)} + \frac{3}{(r_s/a_0)^{3/2}}\right]\mathrm{Ry} \tag{2.127}$$

This can be compared with the energy of the electron gas in the Hartree–Fock approximation Eq. (2.44), to which we must add the electrostatic energy (this term is canceled by the uniform positive background of the ions, but here we are considering the electron gas by itself). The electrostatic energy turns out to be

$$E^{es} = -\frac{6}{5}\frac{1}{(r_s/a_0)}\,\mathrm{Ry} \tag{2.128}$$

Taking the difference between the two energies, $E^{Wigner}$ and $E^{HF} + E^{es}$, we obtain the correlation energy, which is by definition the interaction energy after we have taken into account all the other contributions, kinetic, electrostatic and exchange. Show that the result is compatible with the Wigner correlation energy given in Table 2.1, in the low density (high $(r_s/a_0)$) limit.
8. We wish to derive the Lindhard dielectric response function for the free-electron gas, using perturbation theory. The charge density is defined in terms of the single-particle wavefunctions as

$$\rho(\mathbf{r}) = (-e)\sum_{\mathbf{k}} n(\mathbf{k})\,|\phi_{\mathbf{k}}(\mathbf{r})|^2$$

with $n(\mathbf{k})$ the Fermi occupation numbers. From first order perturbation theory (see Appendix B), the change in wavefunction of state $\mathbf{k}$ due to a perturbation represented by the potential $V^{int}(\mathbf{r})$ is given by

$$|\delta\phi_{\mathbf{k}}\rangle = \sum_{\mathbf{k}'} \frac{\langle\phi_{\mathbf{k}'}^{(0)}|V^{int}|\phi_{\mathbf{k}}^{(0)}\rangle}{\epsilon_{\mathbf{k}}^{(0)} - \epsilon_{\mathbf{k}'}^{(0)}}\,|\phi_{\mathbf{k}'}^{(0)}\rangle$$

with $|\phi_{\mathbf{k}}^{(0)}\rangle$ the unperturbed wavefunctions and $\epsilon_{\mathbf{k}}^{(0)}$ the corresponding energies. These changes in the wavefunctions give rise to the induced charge density $\rho^{ind}(\mathbf{r})$ to first order in $V^{int}$.
(a) Derive the expression for the Lindhard dielectric response function, given in Eq. (2.107), for free electrons with energy $\epsilon_{\mathbf{k}}^{(0)} = \hbar^2|\mathbf{k}|^2/2m_e$ and with Fermi energy $\epsilon_F$, by keeping only first order terms in $V^{int}$ in the perturbation expansion.
(b) Evaluate the zero-temperature Lindhard response function, Eq. (2.108), at $k = 2k_F$, and the corresponding dielectric constant $\varepsilon = 1 - 4\pi\chi/k^2$; interpret their behavior in terms of the single-particle picture.
9. Show that at zero temperature the Thomas–Fermi inverse screening length $l_s^{TF}$, defined in Eq. (2.110), with the total occupation $n_T$ given by Eq. (2.111), takes the form

$$l_s^{TF}(T = 0) = \frac{2}{\sqrt{\pi}}\sqrt{\frac{k_F}{a_0}}$$

with $k_F$ the Fermi momentum and $a_0$ the Bohr radius.
10. Consider a fictitious atom which has a harmonic potential for the radial equation:

$$\left[-\frac{\hbar^2}{2m_e}\frac{d^2}{dr^2} + \frac{1}{2}m_e\omega^2 r^2\right]\phi_i(r) = \epsilon_i\,\phi_i(r) \tag{2.129}$$

and has nine electrons. The harmonic oscillator potential is discussed in detail in Appendix B. The first four states, $\phi_0(r)$, $\phi_1(r)$, $\phi_2(r)$, $\phi_3(r)$ are fully occupied core states, and the last state $\phi_4(r)$ is a valence state with one electron in it. We want to construct a pseudopotential which gives a state $\psi_4(r)$ that is smooth and nodeless in the core region. Choose as the cutoff radius $r_c$ the position of the last extremum of $\phi_4(r)$, and use the simple expressions

$$\psi_4(r) = A z^2 e^{-Bz^2} \quad (r \leq r_c), \qquad \psi_4(r) = \phi_4(r) \quad (r > r_c) \tag{2.130}$$

for the pseudo-wavefunction, where $z = r\sqrt{m_e\omega/\hbar}$. Determine the parameters $A$, $B$ so that the pseudo-wavefunction $\psi_4(r)$ and its derivative are continuous at $r_c$. Then invert the radial Schrödinger equation to obtain the pseudopotential which has $\psi_4(r)$ as its solution. Plot the pseudopotential you obtained as a function of $r$. Does this procedure produce a physically acceptable pseudopotential?

3 Electrons in crystal potential

In chapter 2 we provided the justification for the single-particle picture of electrons in solids. We saw that the proper interpretation of single particles involves the notion of quasiparticles: these are fermions which resemble real electrons, but are not identical to them since they also embody the effects of the presence of all other electrons, as in the exchange-correlation hole. Here we begin to develop the quantitative description of the properties of solids in terms of quasiparticles and collective excitations for the case of a perfectly periodic solid, i.e., an ideal crystal.

3.1 Periodicity – Bloch states

A crystal is described in real space in terms of the primitive lattice vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ and the positions of atoms inside a primitive unit cell (PUC). The lattice vectors $\mathbf{R}$ are formed by all the possible combinations of primitive lattice vectors, multiplied by integers:

$$\mathbf{R} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3, \qquad n_1, n_2, n_3: \text{integers} \tag{3.1}$$

The lattice vectors connect all equivalent points in space; this set of points is referred to as the "Bravais lattice". The PUC is defined as the volume enclosed by the three primitive lattice vectors:

$$\Omega_{PUC} = |\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)| \tag{3.2}$$

This is a useful definition: we only need to know all relevant real-space functions for $\mathbf{r}$ within the PUC since, due to the periodicity of the crystal, these functions have the same value at an equivalent point of any other unit cell related to the PUC by a translation $\mathbf{R}$. There can be one or many atoms inside the primitive unit cell, and the origin of the coordinate system can be located at any position in space; for convenience, it is often chosen to be the position of one of the atoms in the PUC.
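As a concrete illustration of Eqs. (3.1) and (3.2), the following minimal sketch builds a lattice vector and evaluates the PUC volume for an FCC lattice. The function names, the choice of lattice constant $a = 1$ and the particular integers $(n_1, n_2, n_3)$ are assumptions made for the example; the primitive vectors are the standard FCC choice.

```python
# Sketch of Eqs. (3.1)-(3.2) for an FCC lattice; helper names are illustrative.

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def puc_volume(a1, a2, a3):
    """Eq. (3.2): volume of the primitive unit cell, |a1 . (a2 x a3)|."""
    return abs(dot(a1, cross(a2, a3)))


def lattice_vector(n, primitive):
    """Eq. (3.1): R = n1 a1 + n2 a2 + n3 a3 for integers n = (n1, n2, n3)."""
    return tuple(sum(n_j * a_j[i] for n_j, a_j in zip(n, primitive)) for i in range(3))


a = 1.0  # lattice constant in arbitrary units (assumed)
fcc = [(a / 2, a / 2, 0.0), (a / 2, 0.0, a / 2), (0.0, a / 2, a / 2)]

volume = puc_volume(*fcc)            # a^3 / 4: the conventional cube holds four PUCs
R = lattice_vector((1, -2, 3), fcc)  # one Bravais lattice point
print(volume, R)
```

The volume $a^3/4$ reflects the fact that the conventional FCC cube of side $a$ contains four lattice points, so each PUC occupies a quarter of it.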

Table 3.1. Vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ that define the primitive unit cell of simple crystals. Only crystals with one or two atoms per unit cell are considered; the position of one atom in the PUC is always assumed to be at the origin; when there are two atoms in the PUC, the position of the second atom $\mathbf{t}_2$ is given with respect to the origin. All vectors are given in cartesian coordinates and in terms of the standard lattice parameter $a$, the side of the conventional cube or parallelepiped. For the HCP lattice, a second parameter is required, namely the $c/a$ ratio. For graphite, only the two-dimensional honeycomb lattice of a single graphitic plane is defined. $d_{NN}$ is the distance between nearest neighbors in terms of the lattice constant $a$. These crystals are illustrated in Fig. 3.1.

| Lattice | $\mathbf{a}_1$ | $\mathbf{a}_2$ | $\mathbf{a}_3$ | $\mathbf{t}_2$ | $c/a$ | $d_{NN}$ |
|---|---|---|---|---|---|---|
| Cubic | $(a, 0, 0)$ | $(0, a, 0)$ | $(0, 0, a)$ | | | $a$ |
| BCC | $(\frac{a}{2}, -\frac{a}{2}, -\frac{a}{2})$ | $(\frac{a}{2}, \frac{a}{2}, -\frac{a}{2})$ | $(\frac{a}{2}, \frac{a}{2}, \frac{a}{2})$ | | | $\frac{\sqrt{3}a}{2}$ |
| FCC | $(\frac{a}{2}, \frac{a}{2}, 0)$ | $(\frac{a}{2}, 0, \frac{a}{2})$ | $(0, \frac{a}{2}, \frac{a}{2})$ | | | $\frac{a}{\sqrt{2}}$ |
| Diamond | $(\frac{a}{2}, \frac{a}{2}, 0)$ | $(\frac{a}{2}, 0, \frac{a}{2})$ | $(0, \frac{a}{2}, \frac{a}{2})$ | $(\frac{a}{4}, \frac{a}{4}, \frac{a}{4})$ | | $\frac{\sqrt{3}a}{4}$ |
| HCP | $(\frac{a}{2}, \frac{\sqrt{3}a}{2}, 0)$ | $(\frac{a}{2}, -\frac{\sqrt{3}a}{2}, 0)$ | $(0, 0, c)$ | $(\frac{a}{2}, \frac{a}{2\sqrt{3}}, \frac{c}{2})$ | $\sqrt{\frac{8}{3}}$ | $a$ |
| Graphite | $(\frac{a}{2}, \frac{\sqrt{3}a}{2}, 0)$ | $(\frac{a}{2}, -\frac{\sqrt{3}a}{2}, 0)$ | | $(\frac{a}{2}, \frac{a}{2\sqrt{3}}, 0)$ | | $\frac{a}{\sqrt{3}}$ |

Figure 3.1. The crystals defined in Table 3.1. Top: simple cubic, BCC, FCC. Bottom: diamond, HCP, graphite (single plane). In all cases the lattice vectors are indicated by arrows (the lattice vectors for the diamond lattice are identical to those for the FCC lattice). For the diamond, HCP and graphite lattices two different symbols, gray and black circles, are used to denote the two atoms in the unit cell. For the diamond and graphite lattices, the bonds between nearest neighbors are also shown as thicker lines.

The foundation for describing the behavior of electrons in a crystal is the reciprocal lattice, which is the inverse space of the real lattice. The reciprocal primitive lattice vectors are defined by

$$\mathbf{b}_1 = \frac{2\pi(\mathbf{a}_2 \times \mathbf{a}_3)}{\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)}, \quad \mathbf{b}_2 = \frac{2\pi(\mathbf{a}_3 \times \mathbf{a}_1)}{\mathbf{a}_2 \cdot (\mathbf{a}_3 \times \mathbf{a}_1)}, \quad \mathbf{b}_3 = \frac{2\pi(\mathbf{a}_1 \times \mathbf{a}_2)}{\mathbf{a}_3 \cdot (\mathbf{a}_1 \times \mathbf{a}_2)} \tag{3.3}$$

with the obvious consequence

$$\mathbf{a}_i \cdot \mathbf{b}_j = 2\pi\delta_{ij} \tag{3.4}$$

The vectors $\mathbf{b}_i$, $i = 1, 2, 3$ define a cell in reciprocal space which also has useful consequences, as we describe below. The volume of that cell in reciprocal space is given by

$$|\mathbf{b}_1 \cdot (\mathbf{b}_2 \times \mathbf{b}_3)| = \frac{(2\pi)^3}{|\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)|} = \frac{(2\pi)^3}{\Omega_{PUC}} \tag{3.5}$$

We can construct vectors which connect all equivalent points in reciprocal space, which we call $\mathbf{G}$, by analogy to the Bravais lattice vectors defined in Eq. (3.1):

$$\mathbf{G} = m_1\mathbf{b}_1 + m_2\mathbf{b}_2 + m_3\mathbf{b}_3, \qquad m_1, m_2, m_3: \text{integers} \tag{3.6}$$

By construction, the dot product of any $\mathbf{R}$ vector with any $\mathbf{G}$ vector gives

$$\mathbf{R} \cdot \mathbf{G} = 2\pi l, \qquad l = n_1 m_1 + n_2 m_2 + n_3 m_3 \tag{3.7}$$

where $l$ is always an integer. This relationship can serve to define one set of vectors in terms of the other set. This also gives

$$e^{i\mathbf{G}\cdot\mathbf{R}} = 1 \tag{3.8}$$

for all $\mathbf{R}$ and $\mathbf{G}$ vectors defined by Eqs. (3.1) and (3.6). Any function that has the periodicity of the Bravais lattice can be written as

$$f(\mathbf{r}) = \sum_{\mathbf{G}} e^{i\mathbf{G}\cdot\mathbf{r}} f(\mathbf{G}) \tag{3.9}$$

with $f(\mathbf{G})$ the Fourier Transform (FT) components. Due to the periodicity of the lattice, any such function need only be studied for $\mathbf{r}$ within the PUC. This statement applied to the single-particle wavefunctions is known as "Bloch's theorem".

Bloch's theorem: When the potential in the single-particle hamiltonian has the translational periodicity of the Bravais lattice

$$V^{sp}(\mathbf{r} + \mathbf{R}) = V^{sp}(\mathbf{r}) \tag{3.10}$$

the single-particle wavefunctions have the same symmetry, up to a phase factor:

$$\psi_{\mathbf{k}}(\mathbf{r} + \mathbf{R}) = e^{i\mathbf{k}\cdot\mathbf{R}}\,\psi_{\mathbf{k}}(\mathbf{r}) \tag{3.11}$$

A different formulation of Bloch's theorem is that the single-particle wavefunctions must have the form

$$\psi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\,u_{\mathbf{k}}(\mathbf{r}), \qquad u_{\mathbf{k}}(\mathbf{r} + \mathbf{R}) = u_{\mathbf{k}}(\mathbf{r}) \tag{3.12}$$

that is, the wavefunctions $\psi_{\mathbf{k}}(\mathbf{r})$ can be expressed as the product of the phase factor $\exp(i\mathbf{k}\cdot\mathbf{r})$ multiplied by the functions $u_{\mathbf{k}}(\mathbf{r})$ which have the full translational periodicity of the Bravais lattice. The two formulations of Bloch's theorem are equivalent. For any wavefunction $\psi_{\mathbf{k}}(\mathbf{r})$ that can be put in the form of Eq. (3.12), the relation of Eq. (3.11) must obviously hold. Conversely, if Eq. (3.11) holds, we can factor out of $\psi_{\mathbf{k}}(\mathbf{r})$ the phase factor $\exp(i\mathbf{k}\cdot\mathbf{r})$, in which case the remainder

$$u_{\mathbf{k}}(\mathbf{r}) = \frac{\psi_{\mathbf{k}}(\mathbf{r})}{e^{i\mathbf{k}\cdot\mathbf{r}}}$$

must have the translational periodicity of the Bravais lattice by virtue of Eq. (3.11). The states $\psi_{\mathbf{k}}(\mathbf{r})$ are referred to as Bloch states. At this point $\mathbf{k}$ is just a subscript index for identifying the wavefunctions.

Proof of Bloch's theorem: A convenient way to prove Bloch's theorem is through the definition of translation operators, whose eigenvalues and eigenfunctions can be easily determined. We define the translation operator $T_{\mathbf{R}}$ which acts on any function $f(\mathbf{r})$ and changes its argument by a lattice vector $-\mathbf{R}$:

$$T_{\mathbf{R}} f(\mathbf{r}) = f(\mathbf{r} - \mathbf{R}) \tag{3.13}$$

This operator commutes with the hamiltonian $\mathcal{H}^{sp}$: it obviously commutes with the kinetic energy operator, and it leaves the potential energy unaffected since this potential has the translational periodicity of the Bravais lattice. Consequently, we can choose all eigenfunctions of $\mathcal{H}^{sp}$ to be simultaneous eigenfunctions of $T_{\mathbf{R}}$:

$$\mathcal{H}^{sp}\psi_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}}\psi_{\mathbf{k}}(\mathbf{r}), \qquad T_{\mathbf{R}}\psi_{\mathbf{k}}(\mathbf{r}) = c_{\mathbf{R}}\psi_{\mathbf{k}}(\mathbf{r}) \tag{3.14}$$

with $c_{\mathbf{R}}$ the eigenvalue corresponding to the operator $T_{\mathbf{R}}$. Our goal is to determine the eigenfunctions of $T_{\mathbf{R}}$ so that we can use them as the basis to express the eigenfunctions of $\mathcal{H}^{sp}$. To this end, we will first determine the eigenvalues of $T_{\mathbf{R}}$. We notice that

$$T_{\mathbf{R}} T_{\mathbf{R}'} = T_{\mathbf{R}'} T_{\mathbf{R}} = T_{\mathbf{R}+\mathbf{R}'} \;\Rightarrow\; c_{\mathbf{R}+\mathbf{R}'} = c_{\mathbf{R}}\,c_{\mathbf{R}'} \tag{3.15}$$
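The equivalence of the two formulations of Bloch's theorem can be illustrated numerically in one dimension. The sketch below is an assumption-laden toy: the periodic function `u`, the lattice constant and the value of $k$ are arbitrary choices made for the example. It builds $\psi_k(x) = e^{ikx}u(x)$ as in Eq. (3.12) and checks that it satisfies Eq. (3.11) for several lattice translations.

```python
import cmath
import math

A = 1.0  # lattice constant in arbitrary units (assumed for the example)


def u(x, a=A):
    """An arbitrary lattice-periodic function, u(x + a) = u(x)."""
    return 1.0 + 0.3 * math.cos(2 * math.pi * x / a) + 0.1 * math.sin(4 * math.pi * x / a)


def psi(x, k, a=A):
    """Bloch state in the form of Eq. (3.12): plane-wave phase times periodic part."""
    return cmath.exp(1j * k * x) * u(x, a)


def bloch_residual(x, k, n, a=A):
    """|psi(x + R) - exp(i k R) psi(x)| for R = n a; vanishes if Eq. (3.11) holds."""
    R = n * a
    return abs(psi(x + R, k, a) - cmath.exp(1j * k * R) * psi(x, k, a))


k = 0.37 * 2 * math.pi / A  # an arbitrary wave-vector, not a reciprocal lattice vector
worst = max(bloch_residual(i / 10.0, k, n) for i in range(10) for n in range(-3, 4))
print(worst)  # expected to be at the level of floating-point round-off
```

Note that $\psi_k$ itself is not periodic for this $k$; only the combination in Eq. (3.11), with the phase factor included, holds for every lattice translation.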

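Considering $c_{\mathbf{R}}$ as a function of $\mathbf{R}$, we conclude that it must be an exponential in $\mathbf{R}$, which is the only function that satisfies the above relation. Without loss of generality, we define

$$c_{\mathbf{a}_j} = e^{-i2\pi\kappa_j} \quad (j = 1, 2, 3) \tag{3.16}$$

where $\kappa_j$ is an unspecified complex number, so that $c_{\mathbf{a}_j}$ can take any complex value. By virtue of Eq. (3.4), the definition of $c_{\mathbf{a}_j}$ produces for the eigenvalue $c_{\mathbf{R}}$:

$$c_{\mathbf{R}} = e^{-i\mathbf{k}\cdot\mathbf{R}}, \qquad \mathbf{k} = \kappa_1\mathbf{b}_1 + \kappa_2\mathbf{b}_2 + \kappa_3\mathbf{b}_3 \tag{3.17}$$

where now the index $\mathbf{k}$, introduced earlier to label the wavefunctions, is expressed in terms of the reciprocal lattice vectors $\mathbf{b}_j$ and the complex constants $\kappa_j$. Having established that the eigenvalues of the operator $T_{\mathbf{R}}$ are $c_{\mathbf{R}} = \exp(-i\mathbf{k}\cdot\mathbf{R})$, we find by inspection that the eigenfunctions of this operator are $\exp(i(\mathbf{k} + \mathbf{G})\cdot\mathbf{r})$, since

$$T_{\mathbf{R}}\,e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} = e^{i(\mathbf{k}+\mathbf{G})\cdot(\mathbf{r}-\mathbf{R})} = e^{-i\mathbf{k}\cdot\mathbf{R}}\,e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} = c_{\mathbf{R}}\,e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} \tag{3.18}$$

because $\exp(-i\mathbf{G}\cdot\mathbf{R}) = 1$. Then we can write the eigenfunctions of $\mathcal{H}^{sp}$ as an expansion over all eigenfunctions of $T_{\mathbf{R}}$ corresponding to the same eigenvalue of $T_{\mathbf{R}}$:

$$\psi_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}} \alpha_{\mathbf{k}}(\mathbf{G})\,e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} = e^{i\mathbf{k}\cdot\mathbf{r}}\,u_{\mathbf{k}}(\mathbf{r}), \qquad u_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}} \alpha_{\mathbf{k}}(\mathbf{G})\,e^{i\mathbf{G}\cdot\mathbf{r}} \tag{3.19}$$

which proves Bloch's theorem, since $u_{\mathbf{k}}(\mathbf{r} + \mathbf{R}) = u_{\mathbf{k}}(\mathbf{r})$ for $u_{\mathbf{k}}(\mathbf{r})$ defined in Eq. (3.19).

When this form of the wavefunction is inserted in the single-particle Schrödinger equation, we obtain the equation for $u_{\mathbf{k}}(\mathbf{r})$:

$$\left[\frac{1}{2m_e}\left(\frac{\hbar}{i}\nabla_{\mathbf{r}} + \hbar\mathbf{k}\right)^2 + V^{sp}(\mathbf{r})\right]u_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}}\,u_{\mathbf{k}}(\mathbf{r}) \tag{3.20}$$

Solving this last equation determines $u_{\mathbf{k}}(\mathbf{r})$, which with the factor $\exp(i\mathbf{k}\cdot\mathbf{r})$ makes up the solution to the original single-particle equation. The great advantage is that we only need to solve this equation for $\mathbf{r}$ within a PUC of the crystal, since $u_{\mathbf{k}}(\mathbf{r} + \mathbf{R}) = u_{\mathbf{k}}(\mathbf{r})$, where $\mathbf{R}$ is any vector connecting equivalent Bravais lattice points. This result can also be thought of as equivalent to changing the momentum operator in the hamiltonian $\mathcal{H}(\mathbf{p}, \mathbf{r})$ by $+\hbar\mathbf{k}$, when dealing with the states $u_{\mathbf{k}}$ instead of the states $\psi_{\mathbf{k}}$:

$$\mathcal{H}(\mathbf{p}, \mathbf{r})\,\psi_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}}\psi_{\mathbf{k}}(\mathbf{r}) \;\Longrightarrow\; \mathcal{H}(\mathbf{p} + \hbar\mathbf{k}, \mathbf{r})\,u_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}}u_{\mathbf{k}}(\mathbf{r}) \tag{3.21}$$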
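The proof leans on the relation $\mathbf{a}_i \cdot \mathbf{b}_j = 2\pi\delta_{ij}$ of Eq. (3.4), which turns $\mathbf{k}\cdot\mathbf{R}$ into $2\pi(\kappa_1 n_1 + \kappa_2 n_2 + \kappa_3 n_3)$, and on $\exp(i\mathbf{G}\cdot\mathbf{R}) = 1$. A minimal numerical sketch of these facts for an FCC lattice follows; the helper names, the lattice constant and the sample values of $\kappa_j$, $n_j$ and $m_j$ are assumptions chosen only for illustration.

```python
import cmath
import math


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def reciprocal_vectors(a1, a2, a3):
    """Eq. (3.3): b1 = 2 pi (a2 x a3) / [a1 . (a2 x a3)] and cyclic permutations."""
    scale = 2 * math.pi / dot(a1, cross(a2, a3))  # the triple product is cyclic-invariant
    return tuple(tuple(scale * c for c in cross(u, v))
                 for u, v in ((a2, a3), (a3, a1), (a1, a2)))


def combine(coefficients, vectors):
    """Linear combination sum_j coefficients[j] * vectors[j]."""
    return tuple(sum(c * v[i] for c, v in zip(coefficients, vectors)) for i in range(3))


a = 1.0  # lattice constant, arbitrary units (assumed)
fcc = ((a / 2, a / 2, 0.0), (a / 2, 0.0, a / 2), (0.0, a / 2, a / 2))
b = reciprocal_vectors(*fcc)

# Eq. (3.4): a_i . b_j = 2 pi delta_ij
orthogonality_error = max(abs(dot(fcc[i], b[j]) - (2 * math.pi if i == j else 0.0))
                          for i in range(3) for j in range(3))

kappa = (0.2, -0.4, 0.1)   # sample components of k along b_1, b_2, b_3
n = (2, -1, 3)             # sample integers defining R
k = combine(kappa, b)
R = combine(n, fcc)
phase_error = abs(dot(k, R) - 2 * math.pi * sum(kj * nj for kj, nj in zip(kappa, n)))

G = combine((1, -2, 4), b)  # a sample reciprocal lattice vector
unit_phase_error = abs(cmath.exp(1j * dot(G, R)) - 1.0)
print(orthogonality_error, phase_error, unit_phase_error)
```

All three printed quantities should be at the level of round-off. The second check is what allows the eigenvalue of $T_{\mathbf{R}}$ to be written compactly as $\exp(-i\mathbf{k}\cdot\mathbf{R})$.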

For future use, we derive another relation between the two forms of the single-particle hamiltonian; multiplying the first expression from the left by $\exp(-i\mathbf{k}\cdot\mathbf{r})$ we get

$$e^{-i\mathbf{k}\cdot\mathbf{r}}\,\mathcal{H}(\mathbf{p}, \mathbf{r})\,\psi_{\mathbf{k}}(\mathbf{r}) = e^{-i\mathbf{k}\cdot\mathbf{r}}\,\epsilon_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,u_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}}u_{\mathbf{k}}(\mathbf{r}) = \mathcal{H}(\mathbf{p} + \hbar\mathbf{k}, \mathbf{r})\,u_{\mathbf{k}}(\mathbf{r})$$

and comparing the first and last term, we conclude that

$$e^{-i\mathbf{k}\cdot\mathbf{r}}\,\mathcal{H}(\mathbf{p}, \mathbf{r})\,e^{i\mathbf{k}\cdot\mathbf{r}} = \mathcal{H}(\mathbf{p} + \hbar\mathbf{k}, \mathbf{r}) \tag{3.22}$$

This last expression will prove useful in describing the motion of crystal electrons under the influence of an external electric field.

3.2 k-space – Brillouin zones

In the previous section we introduced $\mathbf{k} = \kappa_1\mathbf{b}_1 + \kappa_2\mathbf{b}_2 + \kappa_3\mathbf{b}_3$ as a convenient index to label the wavefunctions. Here we will show that this index actually has physical meaning. Consider that the crystal is composed of $N_j$ unit cells in the direction of vector $\mathbf{a}_j$ ($j = 1, 2, 3$), where we think of the values of $N_j$ as macroscopically large. $N = N_1 N_2 N_3$ is equal to the total number of unit cells in the crystal (of order Avogadro's number, $6.022 \times 10^{23}$). We need to specify the proper boundary conditions for the single-particle states within this crystal. Consistent with the idea that we are dealing with an infinite solid, we can choose periodic boundary conditions, also known as the Born–von Karman boundary conditions,

$$\psi_{\mathbf{k}}(\mathbf{r}) = \psi_{\mathbf{k}}(\mathbf{r} + N_j\mathbf{a}_j) \tag{3.23}$$

with $\mathbf{r}$ lying within the first PUC. Bloch's theorem and Eq. (3.23) imply that

$$e^{i\mathbf{k}\cdot(N_j\mathbf{a}_j)} = 1 \;\Rightarrow\; e^{i2\pi\kappa_j N_j} = 1 \;\Rightarrow\; \kappa_j = \frac{n_j}{N_j} \tag{3.24}$$

where $n_j$ is any integer. This shows two important things.
(1) The vector $\mathbf{k}$ is real because the parameters $\kappa_j$ are real. Since $\mathbf{k}$ is defined in terms of the reciprocal lattice vectors $\mathbf{b}_j$, it can be thought of as a wave-vector; $\exp(i\mathbf{k}\cdot\mathbf{r})$ represents a plane wave of wave-vector $\mathbf{k}$. The physical meaning of this result is that the wavefunction does not decay within the crystal but rather extends throughout the crystal like a wave modified by the periodic function $u_{\mathbf{k}}(\mathbf{r})$. This fact was first introduced in chapter 1.
(2) The number of distinct values that $\mathbf{k}$ may take is $N = N_1 N_2 N_3$, because $n_j$ can take $N_j$ inequivalent values that satisfy Eq. (3.24), which can be any $N_j$ consecutive integer values. Values of $n_j$ beyond this range are equivalent to values within this range, because they correspond to adding integer multiples of $2\pi i$ to the argument of the exponential in Eq. (3.24). Values of $\mathbf{k}$ that differ by a reciprocal lattice vector $\mathbf{G}$ are equivalent,

since adding a vector $\mathbf{G}$ to $\mathbf{k}$ corresponds to a difference of an integer multiple of $2\pi i$ in the argument of the exponential in Eq. (3.24). This statement is valid even in the limit when $N_j \to \infty$, that is, in the case of an infinite crystal when the values of $\mathbf{k}$ become continuous. The second statement has important consequences: it restricts the inequivalent values of $\mathbf{k}$ to a volume in reciprocal space, which is the analog of the PUC in real space. This volume in reciprocal space is known as the first Brillouin Zone (BZ in the following). By convention, we choose the first BZ to correspond to the following $N_j$ consecutive values of the index $n_j$:

$$n_j = -\frac{N_j}{2}, \ldots, 0, \ldots, \frac{N_j}{2} - 1 \quad (j = 1, 2, 3) \tag{3.25}$$

where we assume $N_j$ to be an even integer (since we are interested in the limit $N_j \to \infty$ this assumption does not impose any significant restrictions).

Figure 3.2. Schematic representation of Bragg scattering from atoms on successive atomic planes. Some families of parallel atomic planes are identified by sets of parallel dashed lines, together with the lattice vectors that are perpendicular to the planes and join equivalent atoms; notice that the closer together the planes are spaced the longer is the corresponding perpendicular lattice vector. The difference in path for two scattered rays from the horizontal family of planes is indicated by the two inclined curly brackets ({,}).

To generalize the concept of the BZ, we first introduce the notion of Bragg planes. Consider a plane wave of incident radiation and wave-vector $\mathbf{q}$, which is scattered by the planes of atoms in a crystal to a wave-vector $\mathbf{q}'$. For elastic scattering $|\mathbf{q}| = |\mathbf{q}'|$. As the schematic representation of Fig. 3.2 shows, the difference in paths along

Eq.28) for a given G. This corresponds to the ﬁrst BZ. this difference must be equal to l λ.26) ˆ the unit vector along q (q ˆ = q/|q|) and d = |d|.29) n 2 b2 × N2 where we have used n j = 1. and second. we obtain ˆ = 1 |G| q·G 2 (3. the differential volume change in k is 3 k = = k1 · ( k2 × n 1 b1 · N1 k3 ) n 3 b3 N3 ⇒ |dk| = (2π )3 N PU C (3. Now consider the origin of reciprocal space and around it all the points that can be reached without crossing a Bragg plane. so that the spacing of k values becomes inﬁnitesimal and k becomes a continuous variable.28) This is the deﬁnition of the Bragg plane: it is formed by the tips of all the vectors q which satisfy Eq. and N = N1 N2 N3 is the total number of unit cells in the crystal.5) for the volume of the basic cell in reciprocal space.3. (3. i. and from those all the R vectors. Using q = (2π/λ)q interference. we can determine all the G vectors. For constructive interference between incident and reﬂected waves.2 k-space – Brillouin zones 89 incident and reﬂected radiation from two consecutive planes is ˆ ˆ −d·q d cos θ + d cos θ = d · q (3.e. For a crystal with N j unit cells in the direction a j ( j = 1. we obtain the condition for constructive λ is the wavelength. This relation determines all vectors q that lead to constructive interference.27) where we have made use of two facts: ﬁrst. For an inﬁnite crystal N → ∞. R · (q − q ) = 2π l ⇒ q − q = G (3. (3. 3). that d = R since d represents a distance between equivalent lattice points in neighboring atomic planes. by scanning the values of the angle of incidence and the magnitude of q.28) serves to identify all the families of planes that can reﬂect radiation constructively.7). 2. From the above equation we ﬁnd q = q − G. the Bravais lattice of the crystal. d being a vector that with q connects equivalent lattice points. where l is an integer and ˆ . Since the angle of incidence and the magnitude of the wave-vector q can be varied arbitrarily. Therefore. 
as shown in Eq. we have also made use of Eq. that the reciprocal lattice vectors are deﬁned through the relation G · R = 2π l . (3. By squaring both sides of this equation and using the fact that for elastic scattering |q| = |q |. (3. The .

condition in Eq. (3.28) means that the projection of $\mathbf{q}$ on $\mathbf{G}$ is equal to half the length of $\mathbf{G}$, indicating that the tip of the vector $\mathbf{q}$ must lie on a plane perpendicular to $\mathbf{G}$ that passes through its midpoint. This gives a convenient recipe for defining the first BZ: draw all reciprocal lattice vectors $\mathbf{G}$ and the planes that are perpendicular to them at their midpoints, which by the above arguments are identified as the Bragg planes; the volume enclosed by the first such set of Bragg planes around the origin is the first BZ. It also provides a convenient definition for the second, third, ..., BZs: the second BZ is the volume enclosed between the first set of Bragg planes and the second set of Bragg planes, going outward from the origin. A more rigorous definition is that the first BZ is the set of points that can be reached from the origin without crossing any Bragg planes; the second BZ is the set of points that can be reached from the origin by crossing only one Bragg plane, excluding the points in the first BZ, etc. The construction of the first three BZs for a two-dimensional square lattice, a case that is particularly easy to visualize, is shown in Fig. 3.3.

Figure 3.3. Illustration of the construction of Brillouin Zones in a two-dimensional crystal with $\mathbf{a}_1 = \hat{\mathbf{x}}$, $\mathbf{a}_2 = \hat{\mathbf{y}}$. The first two sets of reciprocal lattice vectors ($\mathbf{G} = \pm 2\pi\hat{\mathbf{x}}, \pm 2\pi\hat{\mathbf{y}}$ and $\mathbf{G} = 2\pi(\pm\hat{\mathbf{x}} \pm \hat{\mathbf{y}})$) are shown, along with the Bragg planes that bisect them. The first BZ, shown in white and labeled 1, is the central square; the second BZ, shown hatched and labeled 2, is composed of the four triangles around the central square; the third BZ, shown in lighter shade and labeled 3, is composed of the eight smaller triangles around the second BZ.

The usefulness of BZs is that they play an analogous role in reciprocal space as the primitive unit cells do in real space. We saw above that due to crystal periodicity, we only need to solve the single-particle equations inside the PUC. We also saw that values of $\mathbf{k}$ are equivalent if one adds to them any vector $\mathbf{G}$. Thus, we only need to solve the single-particle equations for values of $\mathbf{k}$ within the first BZ, or within any single BZ: points in other BZs are related by $\mathbf{G}$ vectors which make

them equivalent. Of course, within any single BZ and for the same value of $\mathbf{k}$ there may be several solutions of the single-particle equation, corresponding to the various allowed states for the quasiparticles. Therefore we need a second index to identify fully the solutions to the single-particle equations, which we denote by a superscript: $\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$. The superscript index is discrete (it takes integer values), whereas, as argued above, for an infinite crystal the subscript index $\mathbf{k}$ is a continuous variable. The corresponding eigenvalues, also identified by two indices $\epsilon_{\mathbf{k}}^{(n)}$, are referred to as "energy bands". A plot of the energy bands is called the band structure. Keeping only the first BZ is referred to as the reduced zone scheme, keeping all BZs is referred to as the extended zone scheme. The eigenvalues and eigenfunctions in the two schemes are related by relabeling of the superscript indices. For a system of free electrons in one dimension, with $\epsilon_k = \hbar^2 k^2/2m_e$ ($\mathbf{a} = a\hat{\mathbf{x}}$, $\mathbf{b} = 2\pi\hat{\mathbf{x}}/a$, $\mathbf{G}_m = m\mathbf{b}$, where $m$ is an integer) the band structure in the reduced and the extended zone schemes is shown in Fig. 3.4.

Figure 3.4. Left: the one-dimensional band structure of free electrons illustrating the reduced zone and extended zone schemes. Right: the one-dimensional band structure of electrons in a weak potential in the reduced and extended zone schemes, with splitting of the energy levels at BZ boundaries (Bragg planes): $V_2 = |V(2\pi/a)|$, $V_4 = |V(4\pi/a)|$. In both panels the energy $\epsilon_k$ is plotted against $ka$ from $-3\pi$ to $3\pi$; the gaps $2V_2$ and $2V_4$ are marked in the right panel.

It turns out that every BZ has the same volume, given by

$$\Omega_{BZ} = |\mathbf{b}_1\cdot(\mathbf{b}_2 \times \mathbf{b}_3)| = \frac{(2\pi)^3}{|\mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)|} = \frac{(2\pi)^3}{\Omega_{PUC}} \tag{3.30}$$

By comparing this with Eq. (3.29) we conclude that in each BZ there are $N$ distinct values of $\mathbf{k}$, where $N$ is the total number of PUCs in the crystal. This is a very useful observation: if there are $n$ electrons in the PUC (that is, $nN$ electrons in the crystal), then we need exactly $nN/2$ different $\psi_{\mathbf{k}}(\mathbf{r})$ states to accommodate them, taking into account spin degeneracy (two electrons with opposite spins can coexist in state $\psi_{\mathbf{k}}(\mathbf{r})$). Since the first BZ contains $N$ distinct values of $\mathbf{k}$, it can accommodate up to $2N$ electrons. Similarly, each subsequent BZ can accommodate $2N$ electrons

because it has the same volume in reciprocal space. For $n$ electrons per unit cell, we need to fill completely the states that occupy a volume in k-space equivalent to $n/2$ BZs. Which states will be filled is determined by their energy: in order to minimize the total energy of the system the lowest energy states must be occupied first. In the extended zone scheme, we need to occupy states that correspond to the lowest energy band and take up the equivalent of $n/2$ BZs. In the reduced zone scheme, we need to occupy a number of states that corresponds to a total of $n/2$ bands per k-point inside the first BZ. The Fermi level is defined as the value of the energy below which all single-particle states are occupied. For $n = 2$ electrons per PUC in the one-dimensional free-electron model discussed above, the Fermi level must be such that the first band in the first BZ is completely full. In the free-electron case this corresponds to the value of the energy at the first BZ boundary.

The case of electrons in the weak periodic potential poses a more interesting problem which is discussed in detail in the next section: in this case, the first and the second bands are split in energy at the BZ boundary. For $n = 2$ electrons per PUC, there will be a gap between the highest energy of occupied states (the top of the first band) and the lowest energy of unoccupied states (the bottom of the second band); this gap, denoted by $2V_2$ in Fig. 3.4, is referred to as the "band gap". Given the above definition of the Fermi level, its position could be anywhere within the band gap. A more detailed examination of the problem reveals that actually the Fermi level is at the middle of the gap (see chapter 9).

We consider next another important property of the energy bands.

Theorem. Since the hamiltonian is real, that is, the system is time-reversal invariant, we must have:

$$\epsilon_{\mathbf{k}}^{(n)} = \epsilon_{-\mathbf{k}}^{(n)} \tag{3.31}$$

for any state; this is known as Kramers' theorem.

Proof: We take the complex conjugate of the single-particle Schrödinger equation (in the following we drop the band index $(n)$ for simplicity):

$$\mathcal{H}^{sp}\psi_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}}\psi_{\mathbf{k}}(\mathbf{r}) \;\Rightarrow\; \mathcal{H}^{sp}\psi^*_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}}\psi^*_{\mathbf{k}}(\mathbf{r}) \tag{3.32}$$

that is, the wavefunctions $\psi_{\mathbf{k}}$ and $\psi^*_{\mathbf{k}}$ have the same (real) eigenvalue $\epsilon_{\mathbf{k}}$. However, we can identify $\psi^*_{\mathbf{k}}(\mathbf{r})$ with $\psi_{-\mathbf{k}}(\mathbf{r})$ because

$$\psi_{-\mathbf{k}}(\mathbf{r}) = e^{-i\mathbf{k}\cdot\mathbf{r}}\sum_{\mathbf{G}}\alpha_{-\mathbf{k}}(\mathbf{G})e^{i\mathbf{G}\cdot\mathbf{r}}, \qquad \psi^*_{\mathbf{k}}(\mathbf{r}) = e^{-i\mathbf{k}\cdot\mathbf{r}}\sum_{\mathbf{G}}\alpha^*_{\mathbf{k}}(\mathbf{G})e^{-i\mathbf{G}\cdot\mathbf{r}} \tag{3.33}$$

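and the only requirement for these two wavefunctions to be the same is: $\alpha_{-\mathbf{k}}(\mathbf{G}) = \alpha^*_{\mathbf{k}}(-\mathbf{G})$, which we take as the definition of the $\alpha_{-\mathbf{k}}(\mathbf{G})$'s. Then the wavefunction $\psi_{-\mathbf{k}}(\mathbf{r})$ is a solution of the single-particle equation, with the proper behavior $\psi_{-\mathbf{k}}(\mathbf{r} + \mathbf{R}) = \exp(-i\mathbf{k}\cdot\mathbf{R})\,\psi_{-\mathbf{k}}(\mathbf{r})$, and the eigenvalue $\epsilon_{-\mathbf{k}} = \epsilon_{\mathbf{k}}$.

A more detailed analysis which takes into account spin states explicitly reveals that, for spin 1/2 particles, Kramers' theorem becomes

$$\epsilon_{-\mathbf{k},\uparrow} = \epsilon_{\mathbf{k},\downarrow}, \qquad \psi_{-\mathbf{k},\uparrow}(\mathbf{r}) = i\sigma_y\psi^*_{\mathbf{k},\downarrow}(\mathbf{r}) \tag{3.34}$$

where $\sigma_y$ is a Pauli matrix (see Problem 3). For systems with equal numbers of up and down spins, Kramers' theorem amounts to inversion symmetry in reciprocal space.

A simple and useful generalization of the free-electron model is to consider that the crystal potential is not exactly vanishing but very weak. Using the Fourier expansion of the potential and the Bloch states,

$$V(\mathbf{r}) = \sum_{\mathbf{G}} V(\mathbf{G})e^{i\mathbf{G}\cdot\mathbf{r}}, \qquad \psi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\sum_{\mathbf{G}}\alpha_{\mathbf{k}}(\mathbf{G})e^{i\mathbf{G}\cdot\mathbf{r}} \tag{3.35}$$

in the single-particle Schrödinger equation, we obtain the following equation:

$$\sum_{\mathbf{G}}\left[\frac{\hbar^2}{2m_e}(\mathbf{k}+\mathbf{G})^2 - \epsilon_{\mathbf{k}} + \sum_{\mathbf{G}'}V(\mathbf{G}')e^{i\mathbf{G}'\cdot\mathbf{r}}\right]\alpha_{\mathbf{k}}(\mathbf{G})e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} = 0 \tag{3.36}$$

Multiplying by $\exp(-i(\mathbf{G}' + \mathbf{k})\cdot\mathbf{r})$, integrating over $\mathbf{r}$, and relabeling the reciprocal lattice vectors gives

$$\left[\frac{\hbar^2}{2m_e}(\mathbf{k}+\mathbf{G})^2 - \epsilon_{\mathbf{k}}\right]\alpha_{\mathbf{k}}(\mathbf{G}) + \sum_{\mathbf{G}'}V(\mathbf{G}-\mathbf{G}')\alpha_{\mathbf{k}}(\mathbf{G}') = 0 \tag{3.37}$$

where we have used the relation $\int \exp(i\mathbf{G}\cdot\mathbf{r})\,d\mathbf{r} = \Omega_{PUC}\,\delta(\mathbf{G})$. This is a linear system of equations in the unknowns $\alpha_{\mathbf{k}}(\mathbf{G})$, which can be solved to determine the values of these unknowns and hence find the eigenfunctions $\psi_{\mathbf{k}}(\mathbf{r})$. Now if the potential is very weak, $V(\mathbf{G}) \approx 0$ for all $\mathbf{G}$, which means that the wavefunction cannot have any components $\alpha_{\mathbf{k}}(\mathbf{G})$ for $\mathbf{G} \neq 0$, since these components can only arise from corresponding features in the potential. In this case we take $\alpha_{\mathbf{k}}(0) = 1$ and obtain

$$\psi_{\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{\Omega}}e^{i\mathbf{k}\cdot\mathbf{r}}, \qquad \epsilon_{\mathbf{k}} = \frac{\hbar^2\mathbf{k}^2}{2m_e} \tag{3.38}$$

as we would expect for free electrons (here we are neglecting electron–electron interactions for simplicity).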
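Eq. (3.37) is a matrix eigenvalue problem for the coefficients $\alpha_{\mathbf{k}}(\mathbf{G})$, so it can also be diagonalized numerically. The following is a minimal sketch for a one-dimensional crystal, not taken from the text: it assumes units with $\hbar = m_e = a = 1$, a plane-wave cutoff `n_max`, and a potential with a single pair of nonzero Fourier components $V(\pm 2\pi/a) = v$. The function name `bands_1d` and all parameter values are hypothetical.

```python
import numpy as np

def bands_1d(k, v_fourier, n_max=5, a=1.0):
    """Eigenvalues of Eq. (3.37) in 1D, assuming units with hbar = m_e = 1.

    v_fourier maps an integer m to V(G_m), with G_m = 2*pi*m/a;
    components that are not listed are taken to be zero.
    """
    m = np.arange(-n_max, n_max + 1)
    G = 2.0 * np.pi * m / a
    H = np.zeros((m.size, m.size), dtype=complex)
    for i in range(m.size):
        H[i, i] = 0.5 * (k + G[i]) ** 2  # kinetic term (hbar^2/2m_e)(k+G)^2
        for j in range(m.size):
            if i != j:
                # off-diagonal term V(G - G')
                H[i, j] = v_fourier.get(int(m[i] - m[j]), 0.0)
    return np.linalg.eigvalsh(H)  # real eigenvalues in ascending order

# Illustrative (assumed) potential strength; V(-G) = V*(G) holds for real v.
v = 0.2
free = bands_1d(np.pi, {})              # free electrons at the BZ boundary
weak = bands_1d(np.pi, {1: v, -1: v})   # weak potential at the BZ boundary
gap = weak[1] - weak[0]                 # splitting of the two lowest bands
```

With these assumed values the two lowest free-electron levels at $k = \pi/a$ are degenerate, while in the weak potential they are split by an amount close to $2|V(2\pi/a)|$, as in the right panel of Fig. 3.4.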

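Now suppose that all components of the potential are negligible except for one, $V(\mathbf{G}_0)$, which is small but not negligible, and consequently all coefficients $\alpha_{\mathbf{k}}(\mathbf{G})$ are negligible, except for $\alpha_{\mathbf{k}}(\mathbf{G}_0)$, and we take as before $\alpha_{\mathbf{k}}(0) = 1$, assuming $\alpha_{\mathbf{k}}(\mathbf{G}_0)$ to be much smaller. Then Eq. (3.37) reduces to

$$\alpha_{\mathbf{k}}(\mathbf{G}_0) = \frac{V(\mathbf{G}_0)}{\frac{\hbar^2}{2m_e}\left[\mathbf{k}^2 - (\mathbf{k}+\mathbf{G}_0)^2\right]} \tag{3.39}$$

where we have used the zeroth order approximation for $\epsilon_{\mathbf{k}}$. Given that $V(\mathbf{G}_0)$ is itself small, $\alpha_{\mathbf{k}}(\mathbf{G}_0)$ is indeed very small as long as the denominator is finite. The only chance for the coefficient $\alpha_{\mathbf{k}}(\mathbf{G}_0)$ to be large is if the denominator is vanishingly small, which happens for

$$(\mathbf{k}+\mathbf{G}_0)^2 = \mathbf{k}^2 \;\Rightarrow\; \mathbf{k}\cdot\hat{\mathbf{G}}_0 = -\frac{1}{2}|\mathbf{G}_0| \tag{3.40}$$

and this is the condition for Bragg planes! In this case, in order to obtain the correct solution we have to consider both $\alpha_{\mathbf{k}}(0)$ and $\alpha_{\mathbf{k}}(\mathbf{G}_0)$, without setting $\alpha_{\mathbf{k}}(0) = 1$. Since all $V(\mathbf{G}) = 0$ except for $\mathbf{G}_0$, we obtain the following linear system of equations:

$$\left[\frac{\hbar^2\mathbf{k}^2}{2m_e} - \epsilon_{\mathbf{k}}\right]\alpha_{\mathbf{k}}(0) + V^*(\mathbf{G}_0)\,\alpha_{\mathbf{k}}(\mathbf{G}_0) = 0$$
$$\left[\frac{\hbar^2(\mathbf{k}+\mathbf{G}_0)^2}{2m_e} - \epsilon_{\mathbf{k}}\right]\alpha_{\mathbf{k}}(\mathbf{G}_0) + V(\mathbf{G}_0)\,\alpha_{\mathbf{k}}(0) = 0 \tag{3.41}$$

where we have used $V(-\mathbf{G}) = V^*(\mathbf{G})$. Solving this system, we obtain

$$\epsilon_{\mathbf{k}} = \frac{\hbar^2\mathbf{k}^2}{2m_e} \pm |V(\mathbf{G}_0)| \tag{3.42}$$

for the two possible solutions. Thus, at Bragg planes (i.e. at the boundaries of the BZ), the energy of the free electrons is modified by the terms $\pm|V(\mathbf{G})|$ for the nonvanishing components of $V(\mathbf{G})$. This is illustrated for the one-dimensional case in Fig. 3.4.

3.3 Dynamics of crystal electrons

It can be shown straightforwardly, using second order perturbation theory, that if we know the energy $\epsilon_{\mathbf{k}}^{(n)}$ for all $n$ at some point $\mathbf{k}$ in the BZ, we can obtain the energy at nearby points. The result is

$$\epsilon_{\mathbf{k}+\mathbf{q}}^{(n)} = \epsilon_{\mathbf{k}}^{(n)} + \frac{\hbar}{m_e}\,\mathbf{q}\cdot\mathbf{p}^{(nn)}(\mathbf{k}) + \frac{\hbar^2\mathbf{q}^2}{2m_e} + \frac{\hbar^2}{m_e^2}\sum_{n'\neq n}\frac{|\mathbf{q}\cdot\mathbf{p}^{(nn')}(\mathbf{k})|^2}{\epsilon_{\mathbf{k}}^{(n)} - \epsilon_{\mathbf{k}}^{(n')}} \tag{3.43}$$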

where the quantities $\mathbf{p}^{(nn')}(\mathbf{k})$ are defined as

$$\mathbf{p}^{(nn')}(\mathbf{k}) = \frac{\hbar}{i}\langle\psi_{\mathbf{k}}^{(n')}|\nabla_{\mathbf{r}}|\psi_{\mathbf{k}}^{(n)}\rangle \tag{3.44}$$

Because of the appearance of terms $\mathbf{q}\cdot\mathbf{p}^{(nn')}(\mathbf{k})$ in the above expressions, this approach is known as $\mathbf{q}\cdot\mathbf{p}$ perturbation theory. The quantities defined in Eq. (3.44) are elements of a two-index matrix ($n$ and $n'$); the diagonal matrix elements are simply the expectation value of the momentum operator in state $\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$. We can also calculate the same quantity from

$$\nabla_{\mathbf{k}}\epsilon_{\mathbf{k}}^{(n)} = \lim_{\mathbf{q}\to 0}\frac{\partial\epsilon_{\mathbf{k}+\mathbf{q}}^{(n)}}{\partial\mathbf{q}} = \frac{\hbar}{m_e}\mathbf{p}^{(nn)}(\mathbf{k}) \tag{3.45}$$

which shows that the gradient of $\epsilon_{\mathbf{k}}^{(n)}$ with respect to $\mathbf{k}$ (multiplied by the factor $m_e/\hbar$) gives the expectation value of the momentum for the crystal states.

Let us consider the second derivative of $\epsilon_{\mathbf{k}}^{(n)}$ with respect to components of the vector $\mathbf{k}$, denoted by $k_i$, $k_j$:

$$\frac{1}{m_{ij}^{(n)}(\mathbf{k})} \equiv \frac{1}{\hbar^2}\frac{\partial^2\epsilon_{\mathbf{k}}^{(n)}}{\partial k_i\,\partial k_j} = \lim_{\mathbf{q}\to 0}\frac{1}{\hbar^2}\frac{\partial^2\epsilon_{\mathbf{k}+\mathbf{q}}^{(n)}}{\partial q_i\,\partial q_j} = \frac{1}{m_e}\delta_{ij} + \frac{1}{m_e^2}\sum_{n'\neq n}\frac{p_i^{(nn')}p_j^{(n'n)} + p_j^{(nn')}p_i^{(n'n)}}{\epsilon_{\mathbf{k}}^{(n)} - \epsilon_{\mathbf{k}}^{(n')}} \tag{3.46}$$

The dimensions of this expression are 1/mass. This can then be directly identified as the inverse effective mass of the quasiparticles, which is no longer a simple scalar quantity but a second rank tensor. It is important to recognize that, as the expression derived above demonstrates, the effective mass of a crystal electron depends on the wave-vector $\mathbf{k}$ and band index $n$ of its wavefunction, as well as on the wavefunctions and energies of all other crystal electrons with the same k-vector. This is a demonstration of the quasiparticle nature of electrons in a crystal. Since the effective mass involves complicated dependence on the direction of the k-vector and the momenta and energies of many states, it can have different magnitude and even different signs along different crystallographic directions!

We wish next to derive expressions for the evolution with time of the position and velocity of a crystal electron in the state $\psi_{\mathbf{k}}^{(n)}$, that is, figure out its dynamics. To this end, we will need to allow the crystal momentum to acquire a time dependence, $\mathbf{k} = \mathbf{k}(t)$, and include its time derivatives where appropriate. Since we are dealing with a particular band, we will omit the band index $n$ for simplicity.
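The second-derivative definition in Eq. (3.46) can be evaluated numerically for any model dispersion. The sketch below is only an illustration under stated assumptions: the anisotropic two-dimensional cosine band `band`, its parameters `tx` and `ty`, the finite-difference step `h`, and the units $\hbar = a = 1$ are all hypothetical choices that do not appear in the text.

```python
import numpy as np

def inverse_mass_tensor(energy, k, h=1e-4, hbar=1.0):
    """Central-difference estimate of (1/hbar^2) d^2(energy)/dk_i dk_j, Eq. (3.46)."""
    k = np.asarray(k, dtype=float)
    n = k.size
    inv_m = np.zeros((n, n))
    for i in range(n):
        di = np.zeros(n)
        di[i] = h
        for j in range(n):
            dj = np.zeros(n)
            dj[j] = h
            inv_m[i, j] = (energy(k + di + dj) - energy(k + di - dj)
                           - energy(k - di + dj) + energy(k - di - dj)) / (4.0 * h * h)
    return inv_m / hbar**2

def band(k, tx=-1.0, ty=-0.25, a=1.0):
    # Hypothetical model band, chosen only to show direction dependence.
    return 2.0 * tx * np.cos(k[0] * a) + 2.0 * ty * np.cos(k[1] * a)

bottom = inverse_mass_tensor(band, [0.0, 0.0])    # band minimum
saddle = inverse_mass_tensor(band, [np.pi, 0.0])  # BZ boundary along k_x
```

For this assumed band the diagonal elements at the band minimum are both positive but unequal, while at $(\pi/a, 0)$ the $xx$ element is negative and the $yy$ element is positive, which illustrates how the effective mass can differ in magnitude and sign along different directions.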

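Considering the time-dependent position of a crystal electron $\mathbf{r}(t)$ as a quantum mechanical operator, we have from the usual formulation in the Heisenberg picture (see Appendix B):

$$\frac{d\mathbf{r}(t)}{dt} = \frac{i}{\hbar}[\mathcal{H}, \mathbf{r}] \;\Longrightarrow\; \langle\psi_{\mathbf{k}}|\frac{d\mathbf{r}(t)}{dt}|\psi_{\mathbf{k}}\rangle = \frac{i}{\hbar}\langle\psi_{\mathbf{k}}|[\mathcal{H}, \mathbf{r}]|\psi_{\mathbf{k}}\rangle \tag{3.47}$$

with $[\mathcal{H}, \mathbf{r}]$ the commutator of the hamiltonian with the position operator. Now we can take advantage of the following identity:

$$\nabla_{\mathbf{k}}\left(e^{-i\mathbf{k}\cdot\mathbf{r}}\mathcal{H}e^{i\mathbf{k}\cdot\mathbf{r}}\right) = i e^{-i\mathbf{k}\cdot\mathbf{r}}[\mathcal{H}, \mathbf{r}]e^{i\mathbf{k}\cdot\mathbf{r}} \tag{3.48}$$

whose proof involves simple differentiations of the exponentials with respect to $\mathbf{k}$, and of Eq. (3.22), to rewrite the right-hand side of the previous equation as

$$\frac{i}{\hbar}\langle\psi_{\mathbf{k}}|[\mathcal{H}, \mathbf{r}]|\psi_{\mathbf{k}}\rangle \equiv \frac{i}{\hbar}\int u^*_{\mathbf{k}}(\mathbf{r})\,e^{-i\mathbf{k}\cdot\mathbf{r}}[\mathcal{H}, \mathbf{r}]e^{i\mathbf{k}\cdot\mathbf{r}}\,u_{\mathbf{k}}(\mathbf{r})\,d\mathbf{r}$$
$$= \frac{1}{\hbar}\int u^*_{\mathbf{k}}(\mathbf{r})\left[\nabla_{\mathbf{k}}\left(e^{-i\mathbf{k}\cdot\mathbf{r}}\mathcal{H}e^{i\mathbf{k}\cdot\mathbf{r}}\right)\right]u_{\mathbf{k}}(\mathbf{r})\,d\mathbf{r}$$
$$= \frac{1}{\hbar}\int u^*_{\mathbf{k}}(\mathbf{r})\left[\nabla_{\mathbf{k}}\mathcal{H}(\mathbf{p}+\hbar\mathbf{k}, \mathbf{r})\right]u_{\mathbf{k}}(\mathbf{r})\,d\mathbf{r}$$

Next, we move the differentiation with respect to $\mathbf{k}$, $\nabla_{\mathbf{k}}$, outside the integral and subtract the additional terms produced by this change, which leads to

$$\frac{i}{\hbar}\langle\psi_{\mathbf{k}}|[\mathcal{H}, \mathbf{r}]|\psi_{\mathbf{k}}\rangle = \frac{1}{\hbar}\nabla_{\mathbf{k}}\int u^*_{\mathbf{k}}(\mathbf{r})\mathcal{H}(\mathbf{p}+\hbar\mathbf{k}, \mathbf{r})u_{\mathbf{k}}(\mathbf{r})\,d\mathbf{r}$$
$$- \frac{1}{\hbar}\left[\int\left(\nabla_{\mathbf{k}}u^*_{\mathbf{k}}(\mathbf{r})\right)\mathcal{H}(\mathbf{p}+\hbar\mathbf{k}, \mathbf{r})u_{\mathbf{k}}(\mathbf{r})\,d\mathbf{r} + \int u^*_{\mathbf{k}}(\mathbf{r})\mathcal{H}(\mathbf{p}+\hbar\mathbf{k}, \mathbf{r})\left(\nabla_{\mathbf{k}}u_{\mathbf{k}}(\mathbf{r})\right)d\mathbf{r}\right]$$

We deal with the last two terms in the above expression separately: recalling that $u_{\mathbf{k}}(\mathbf{r})$ is an eigenfunction of the hamiltonian $\mathcal{H}(\mathbf{p}+\hbar\mathbf{k}, \mathbf{r})$ with eigenvalue $\epsilon_{\mathbf{k}}$, we obtain for these two terms:

$$\langle\nabla_{\mathbf{k}}u_{\mathbf{k}}|\mathcal{H}(\mathbf{p}+\hbar\mathbf{k})|u_{\mathbf{k}}\rangle + \langle u_{\mathbf{k}}|\mathcal{H}(\mathbf{p}+\hbar\mathbf{k})|\nabla_{\mathbf{k}}u_{\mathbf{k}}\rangle = \epsilon_{\mathbf{k}}\left(\langle\nabla_{\mathbf{k}}u_{\mathbf{k}}|u_{\mathbf{k}}\rangle + \langle u_{\mathbf{k}}|\nabla_{\mathbf{k}}u_{\mathbf{k}}\rangle\right) = \epsilon_{\mathbf{k}}\nabla_{\mathbf{k}}\langle u_{\mathbf{k}}|u_{\mathbf{k}}\rangle = 0 \tag{3.49}$$

since $\langle u_{\mathbf{k}}|u_{\mathbf{k}}\rangle = 1$ for properly normalized wavefunctions.^1

(Footnote 1: This is a special case of the more general Hellmann–Feynman theorem, which we will encounter again in chapter 5.)

This leaves the following result:

$$\langle\psi_{\mathbf{k}}|\frac{d\mathbf{r}(t)}{dt}|\psi_{\mathbf{k}}\rangle = \frac{1}{\hbar}\nabla_{\mathbf{k}}\int\psi^*_{\mathbf{k}}(\mathbf{r})\mathcal{H}\psi_{\mathbf{k}}(\mathbf{r})\,d\mathbf{r} \;\Longrightarrow\; \langle\mathbf{v}_{\mathbf{k}}\rangle = \frac{1}{\hbar}\nabla_{\mathbf{k}}\epsilon_{\mathbf{k}} \tag{3.50}$$

where we have identified the velocity $\langle\mathbf{v}_{\mathbf{k}}\rangle$ of a crystal electron in state $\psi_{\mathbf{k}}$ with the expectation value of the time derivative of the operator $\mathbf{r}(t)$ in that state. This result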

is equivalent to Eq. (3.45) which we derived above for the momentum of a crystal electron in state $\psi_{\mathbf{k}}$. Taking a derivative of the velocity in state $\psi_{\mathbf{k}}$ with respect to time, and using the chain rule for differentiation with respect to $\mathbf{k}$, we find

$$\frac{d\langle\mathbf{v}_{\mathbf{k}}\rangle}{dt} = \frac{1}{\hbar}\left(\frac{d\mathbf{k}}{dt}\cdot\nabla_{\mathbf{k}}\right)\nabla_{\mathbf{k}}\epsilon_{\mathbf{k}} \tag{3.51}$$

which we can write in terms of cartesian components ($i, j = x, y, z$) as

$$\frac{d\langle v^i_{\mathbf{k}}\rangle}{dt} = \sum_j\left(\hbar\frac{dk_j}{dt}\right)\left[\frac{1}{\hbar^2}\frac{\partial^2\epsilon_{\mathbf{k}}}{\partial k_j\,\partial k_i}\right] = \sum_j\left(\hbar\frac{dk_j}{dt}\right)\frac{1}{m_{ji}(\mathbf{k})} \tag{3.52}$$

where we have identified the term in the square brackets as the inverse effective mass tensor derived in Eq. (3.46). With this identification, this equation has the form of acceleration = force/mass, as might be expected, but the mass is not a simple scalar quantity as in the case of free electrons; the mass is now a second rank tensor corresponding to the behavior of crystal electrons. The form of Eq. (3.52) compels us to identify the quantities in parentheses on the right-hand side as the components of the external force acting on the crystal electron:

$$\hbar\frac{d\mathbf{k}}{dt} = \mathbf{F} \tag{3.53}$$

We should note two important things. First, the identification of the time derivative of $\hbar\mathbf{k}$ with the external force at this point cannot be considered a proper proof; it is only an inference from the dimensions of the different quantities that enter into Eq. (3.52). In particular, so far we have discussed the dynamics of electrons in the crystal without the presence of any external forces, which could potentially change the wavefunctions and the eigenvalues of the single-particle hamiltonian; we examine this issue in detail in the next section. Second, to the extent that the above relation actually holds, it is a state-independent equation, so the wave-vectors evolve in the same manner for all states!

3.4 Crystal electrons in an electric field

We consider next what happens to crystal electrons when they are subjected to a constant external electric field $\mathbf{E}$, which gives rise to an electrostatic potential $\Phi(\mathbf{r}) = -\mathbf{E}\cdot\mathbf{r}$. External electric fields are typically used to induce electron transport, for instance in electronic devices. The periodicity of the crystal has profound effects on the transport properties of electrons, and influences strongly their response to external electric fields.

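For the present discussion we will include the effects of the external field from the beginning, starting with the new hamiltonian in the presence of the external electric field, which is

$$\mathcal{H}_E = \mathcal{H}_0 + q\Phi(\mathbf{r}) = \mathcal{H}_0 - q\mathbf{E}\cdot\mathbf{r}$$

where $\mathcal{H}_0$ is the hamiltonian of the crystal in zero field and $q$ the charge of the particles (for electrons $q = -e$). To make a connection to the results of the previous section, we will need to construct the proper states, characterized by wave-vectors $\mathbf{k}$, that are relevant to the new hamiltonian: as we saw at the end of the previous section, we expect the wave-vectors themselves to acquire a time dependence. From the relation

$$\nabla_{\mathbf{k}}\psi_{\mathbf{k}}(\mathbf{r}) = \nabla_{\mathbf{k}}\left(e^{i\mathbf{k}\cdot\mathbf{r}}u_{\mathbf{k}}(\mathbf{r})\right) = i\mathbf{r}\psi_{\mathbf{k}}(\mathbf{r}) + e^{i\mathbf{k}\cdot\mathbf{r}}\nabla_{\mathbf{k}}u_{\mathbf{k}}(\mathbf{r})$$
$$\Longrightarrow\; \mathbf{r}\psi_{\mathbf{k}}(\mathbf{r}) = -i\nabla_{\mathbf{k}}\psi_{\mathbf{k}}(\mathbf{r}) + i e^{i\mathbf{k}\cdot\mathbf{r}}\nabla_{\mathbf{k}}\left(e^{-i\mathbf{k}\cdot\mathbf{r}}\psi_{\mathbf{k}}(\mathbf{r})\right) \tag{3.54}$$

we conclude that the action of the hamiltonian $\mathcal{H}_E$ on state $\psi_{\mathbf{k}}$ is equivalent to the action of

$$\mathcal{H}_0 - iq\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\mathbf{E}\cdot\nabla_{\mathbf{k}}\,e^{-i\mathbf{k}\cdot\mathbf{r}} + iq\,\mathbf{E}\cdot\nabla_{\mathbf{k}}$$

The first new term in this hamiltonian has non-vanishing matrix elements between states of the same $\mathbf{k}$ only. To show this, consider the matrix element of this term between two states of different $\mathbf{k}$:

$$\langle\psi_{\mathbf{k}'}|\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\mathbf{E}\cdot\nabla_{\mathbf{k}}\,e^{-i\mathbf{k}\cdot\mathbf{r}}|\psi_{\mathbf{k}}\rangle = \mathbf{E}\cdot\int u^*_{\mathbf{k}'}(\mathbf{r})e^{-i\mathbf{k}'\cdot\mathbf{r}}e^{i\mathbf{k}\cdot\mathbf{r}}\,\nabla_{\mathbf{k}}\left(e^{-i\mathbf{k}\cdot\mathbf{r}}e^{i\mathbf{k}\cdot\mathbf{r}}u_{\mathbf{k}}(\mathbf{r})\right)d\mathbf{r} = \mathbf{E}\cdot\int e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}}\left[u^*_{\mathbf{k}'}(\mathbf{r})\nabla_{\mathbf{k}}u_{\mathbf{k}}(\mathbf{r})\right]d\mathbf{r}$$

Now, in this last expression, the square bracket includes only terms which have the full periodicity of the lattice, as we have discussed about the functions $u_{\mathbf{k}}(\mathbf{r})$:

$$u^*_{\mathbf{k}'}(\mathbf{r})\nabla_{\mathbf{k}}u_{\mathbf{k}}(\mathbf{r}) = \mathbf{f}(\mathbf{r}) \;\rightarrow\; \mathbf{f}(\mathbf{r}+\mathbf{R}) = \mathbf{f}(\mathbf{r})$$

Therefore, this term can be expressed through its Fourier transform, which will take the form

$$\mathbf{f}(\mathbf{r}) = \sum_{\mathbf{G}}\mathbf{f}(\mathbf{G})e^{-i\mathbf{G}\cdot\mathbf{r}}$$

with $\mathbf{G}$ the reciprocal lattice vectors; inserting this into the previous equation, after integrating over $\mathbf{r}$, we obtain terms which reduce to

$$\mathbf{f}(\mathbf{G})\,\delta((\mathbf{k}-\mathbf{k}')-\mathbf{G}) \;\rightarrow\; \mathbf{k}-\mathbf{k}' = \mathbf{G}$$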

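for non-vanishing matrix elements, or, if we choose to work with $\mathbf{k}$, $\mathbf{k}'$ in the first BZ, we must have $\mathbf{k} = \mathbf{k}'$. This establishes that the first new term in the hamiltonian has non-vanishing matrix elements only between states of the same $\mathbf{k}$. Consequently, we can choose to work with basis states that are eigenfunctions of the hamiltonian

$$\tilde{\mathcal{H}}_E = \mathcal{H}_0 - iq\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\mathbf{E}\cdot\nabla_{\mathbf{k}}\,e^{-i\mathbf{k}\cdot\mathbf{r}}$$

and are characterized by wave-vectors $\mathbf{k}$. These states, which we call $\tilde{\psi}_{\mathbf{k}}$, must have the form

$$\tilde{\psi}_{\mathbf{k}}(\mathbf{r}) = e^{-(i/\hbar)\int\tilde{\epsilon}_{\mathbf{k}}\,dt}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\tilde{u}_{\mathbf{k}}(\mathbf{r})$$

where we have introduced the eigenvalues $\tilde{\epsilon}_{\mathbf{k}}$ of the hamiltonian $\tilde{\mathcal{H}}_E$ explicitly, with $\tilde{\epsilon}_{\mathbf{k}}$ a time-dependent quantity. The $\tilde{\psi}_{\mathbf{k}}$ states will form the basis for the solution of the time-dependent Schrödinger equation of the full hamiltonian, therefore:

$$i\hbar\frac{\partial\tilde{\psi}_{\mathbf{k}}}{\partial t} = \left[\tilde{\mathcal{H}}_E + iq\,\mathbf{E}\cdot\nabla_{\mathbf{k}}\right]\tilde{\psi}_{\mathbf{k}} = \left[\tilde{\epsilon}_{\mathbf{k}} + iq\,\mathbf{E}\cdot\nabla_{\mathbf{k}}\right]\tilde{\psi}_{\mathbf{k}} \tag{3.55}$$

while from the expression we wrote above for $\tilde{\psi}_{\mathbf{k}}$ we find for its time derivative

$$\frac{\partial\tilde{\psi}_{\mathbf{k}}}{\partial t} = \left[-\frac{i}{\hbar}\tilde{\epsilon}_{\mathbf{k}} + \frac{d\mathbf{k}(t)}{dt}\cdot\nabla_{\mathbf{k}}\right]\tilde{\psi}_{\mathbf{k}} \tag{3.56}$$

Comparing the right-hand side of these two last equations term by term we find

$$\hbar\frac{d\mathbf{k}}{dt} = -e\mathbf{E} \tag{3.57}$$

This result gives the time evolution of the wave-vector $\mathbf{k}$ for the state $\tilde{\psi}_{\mathbf{k}}$, and is consistent with what we expected from Eq. (3.53).

Let us consider a simple example to illustrate this behavior: suppose we have a one-dimensional crystal with a single energy band,

$$\epsilon_k = \epsilon + 2t\cos(ka) \tag{3.58}$$

with $\epsilon$ a reference energy and $t$ a negative constant.^2 This is shown in Fig. 3.5: the lattice constant is $a$, so that the first BZ for this crystal extends from $-\pi/a$ to $\pi/a$, and the energy ranges from a minimum of $\epsilon + 2t$ (at $k = 0$) to a maximum of $\epsilon - 2t$ (at $k = \pm\pi/a$). The momentum for this state is given by

$$p(k) = \frac{m_e}{\hbar}\frac{d\epsilon_k}{dk} = -\frac{2m_e t a}{\hbar}\sin(ka) \tag{3.59}$$

(Footnote 2: In chapter 4 we discuss the physical system that can give rise to such an expression for the single-particle energy $\epsilon_k$.)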
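Eqs. (3.57) to (3.59) can be combined into a short numerical sketch of how a single electron moves through this band in a constant field. The code below is an illustration under assumed units $\hbar = e = a = 1$ with assumed parameter values; the helper names `energy`, `velocity` and `wave_vector` are hypothetical and not part of the text. The wave-vector is folded back into the first BZ, since values of $k$ that differ by a reciprocal lattice vector are equivalent.

```python
import numpy as np

HBAR = E_CHARGE = A = 1.0   # assumed units: hbar = e = a = 1
T_HOP = -1.0                # t < 0, as in Eq. (3.58); value is illustrative
EPS0 = 0.0                  # reference energy (illustrative)

def energy(k):
    """Band energy of Eq. (3.58)."""
    return EPS0 + 2.0 * T_HOP * np.cos(k * A)

def velocity(k):
    """(1/hbar) d(energy)/dk, i.e. p(k)/m_e with p(k) from Eq. (3.59)."""
    return -2.0 * T_HOP * A * np.sin(k * A) / HBAR

def wave_vector(t, field, k0=0.0):
    """Solution of hbar dk/dt = -e E (Eq. 3.57) for constant E, folded into the first BZ."""
    k = k0 - E_CHARGE * field * t / HBAR
    return (k + np.pi / A) % (2.0 * np.pi / A) - np.pi / A

field = -0.1   # E = -E0 with E0 = 0.1 (assumed value)
period = 2.0 * np.pi * HBAR / (E_CHARGE * abs(field) * A)  # time to cross one BZ
times = np.linspace(0.0, period, 2001)
k_t = wave_vector(times, field)
mean_velocity = velocity(k_t).mean()   # average over one traversal of the BZ
```

Under these assumptions the wave-vector grows linearly in time and the velocity averaged over one full traversal of the BZ vanishes to numerical precision, because $\sin(ka)$ averages to zero over a period.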

[Figure 3.5. Single-particle energy eigenvalues $\epsilon_k$ for the simple example that illustrates the dynamics of electrons and holes in a one-band, one-dimensional crystal. Axes: $\epsilon_k$ versus $ka$ from $-\pi$ to $\pi$; the band extends from $\epsilon + 2t$ to $\epsilon - 2t$, with electrons indicated near the bottom of the band and holes near the top.]

When the system is in an external electric field $\mathbf{E} = -E_0\hat{x}$ with $E_0$ a positive constant, the time evolution of the wave-vector will be

$$\frac{d\mathbf{k}}{dt} = -\frac{e}{\hbar}\mathbf{E} \;\Rightarrow\; \frac{dk}{dt} = \frac{e}{\hbar}E_0 \tag{3.60}$$

We will consider some limiting cases for this idealized system.

(1) A single electron in this band would start at $k = 0$ at $t = 0$ (before the application of the external field it occupies the lowest energy state), then its $k$ would increase at a constant rate ($eE_0/\hbar$) until the value $\pi/a$ is reached, and then it would re-enter the first BZ at $k = -\pi/a$ and continue this cycle. The same picture would hold for a few electrons initially occupying the bottom of the band.

(2) If the band were completely full, then all the wave-vectors would be changing in the same way, and all states would remain completely full. Since this creates no change in the system, we conclude that no current would flow: a full band cannot contribute to current flow in an ideal crystal!

(3) If the band were mostly full, with only a few states empty at wave-vectors near the boundaries of the first BZ (at $k = \pm\pi/a$) where the energy is a maximum, then the total current would be

$$\mathbf{I} = -e\sum_{k\leq k_F}\mathbf{v}(k) = -e\sum_{k\leq k_F}\frac{p(k)}{m_e}\hat{x} = -e\sum_{k\in BZ}\frac{p(k)}{m_e}\hat{x} + e\sum_{k>k_F}\frac{p(k)}{m_e}\hat{x} = e\sum_{k>k_F}\frac{p(k)}{m_e}\hat{x} \tag{3.61}$$

where in the first two sums the summation is over $k$ values in the first BZ, restricted to $k \leq k_F$, that is, over occupied states only, whereas in the last sum it is restricted over $k > k_F$, that is, over unoccupied states only. The last equality follows from the fact

that the sum over all values of $k \in$ BZ corresponds to the full band, which as explained above does not contribute to the current. Thus, in this case the system behaves like a set of positively charged particles, referred to as holes (unoccupied states). In our simple one-dimensional example we can use the general expression for the effective mass derived in Eq. (3.46), to find near the top of the band:

$$\frac{1}{m} = \frac{1}{\hbar^2}\frac{d^2\epsilon_k}{dk^2} \;\Longrightarrow\; m = \hbar^2\left[\frac{d^2\epsilon_k}{dk^2}\right]^{-1}_{k=\pm\pi/a} = \frac{\hbar^2}{2ta^2} \tag{3.62}$$

which is a negative quantity (recall that $t < 0$). Using the Taylor expansion of cos near $k_0 = \pm\pi/a$, we can write the energy near the top of the band as

$$\epsilon_{\pm(\pi/a)+k} = \epsilon + 2t\left[-1 + \frac{1}{2}(ka)^2 + \cdots\right] = (\epsilon - 2t) + ta^2k^2 = (\epsilon - 2t) + \frac{\hbar^2k^2}{2m} \tag{3.63}$$

with the effective mass $m$ being the negative quantity found in Eq. (3.62). From the general expression Eq. (3.61), we find the time derivative of the current in our one-dimensional example to be

$$\frac{d\mathbf{I}}{dt} = e\sum_{k>k_F}\frac{1}{m_e}\frac{dp(k)}{dt}\hat{x} = e\sum_{k>k_F}\frac{d}{dt}\left[\frac{1}{\hbar}\frac{d\epsilon_k}{dk}\right]\hat{x} = \frac{e}{m}\sum_{k>k_F}\hbar\frac{dk}{dt}\hat{x} \tag{3.64}$$

where we have used Eq. (3.45) to obtain the second equality and Eq. (3.63) to obtain the third equality. Now, assuming that we are dealing with a single hole state at the top of the band, and using the general result of Eq. (3.57), we obtain

$$\frac{d\mathbf{I}}{dt} = -\frac{e^2}{m}\mathbf{E} = \frac{e^2}{|m|}\mathbf{E} \tag{3.65}$$

which describes the response of a positively charged particle of charge $+e$ and positive mass $|m|$ to an external field $\mathbf{E}$, as we expect for holes.

3.5 Crystal symmetries beyond periodicity

In the previous sections we discussed the effects of lattice periodicity on the single-particle wavefunctions and the energy eigenvalues. The one-dimensional examples we presented there can have only this type of symmetry. In two- and three-dimensional cases, a crystal can also have symmetries beyond the translational periodicity, such as rotations around axes, reflections on planes, and combinations of these operations among themselves and with translations by vectors that are not lattice vectors. All these symmetry operations are useful in calculating and analyzing the physical properties of a crystal. There are two basic advantages

to using the symmetry operations of a crystal in describing its properties. First, the volume in reciprocal space for which solutions need to be calculated is further reduced, usually to a small fraction of the first Brillouin Zone, called the irreducible part; for example, in the FCC crystals with one atom per unit cell, the irreducible part is 1/48 of the full BZ. Second, certain selection rules and compatibility relations are dictated by symmetry alone, leading to a deeper understanding of the physical properties of the crystal as well as to simpler ways of calculating these properties in the context of the single-particle picture; for example, using symmetry arguments it is possible to identify the allowed optical transitions in a crystal, which involve excitation or de-excitation of electrons by absorption or emission of photons, thereby elucidating its optical properties.

Taking full advantage of the crystal symmetries requires the use of group theory. This very interesting branch of mathematics is particularly well suited to reduce the amount of work by effectively using group representations in the description of single-particle eigenfunctions. Although conceptually straightforward, the theory of group representations requires a significant amount of discussion, which is beyond the scope of the present treatment. Here, we will develop some of the basic concepts of group theory and employ them in simple illustrative examples.

To illustrate the importance of taking into account all the crystal symmetries we discuss a simple example. Consider a two-dimensional square lattice with lattice constant $a$, and one atom per unit cell. Assume that this atom has $Z$ electrons, so that there are $Z$ electrons per unit cell (a density of $Z/a^2$ electrons per unit area), that is, a total of $NZ$ electrons in the crystal of volume $Na^2$, where $N$ is the number of unit cells. The simplest case is that of free electrons. We want to find the behavior of energy eigenvalues as a function of the wave-vector $\mathbf{k}$, for the various possible solutions. Let us first try to find the Fermi momentum and Fermi energy for this system. The Fermi momentum is obtained by integrating over all k-vectors until we have enough states to accommodate all the electrons. Taking into account a factor of 2 for spin, the total number of states we need in reciprocal space to accommodate all the electrons of the crystal is given by

$$2\sum_{\mathbf{k},|\mathbf{k}|<k_F} \rightarrow 2\frac{(Na^2)}{(2\pi)^2}\int_{|\mathbf{k}|<k_F}d\mathbf{k} = NZ \;\Rightarrow\; \frac{2}{(2\pi)^2}\int_{|\mathbf{k}|<k_F}d\mathbf{k} = \frac{Z}{a^2} \tag{3.66}$$

where we have used $d\mathbf{k} = (2\pi)^2/(Na^2)$ for the two-dimensional square lattice, by analogy to the general result for the three-dimensional lattice $d\mathbf{k} = (2\pi)^3/(N\Omega_{PUC})$ (see Appendix D). Using polar coordinates for the integration over $\mathbf{k}$, we obtain:

$$\frac{2}{(2\pi)^2}\,2\pi\int_0^{k_F}k\,dk = \frac{1}{2\pi}k_F^2 = \frac{Z}{a^2} \;\Rightarrow\; k_F = \left(\frac{2\pi Z}{a^2}\right)^{1/2} \tag{3.67}$$

and from this we can obtain the Fermi energy $\epsilon_F$:

$$\epsilon_F = \frac{\hbar^2k_F^2}{2m_e} = \frac{\hbar^2}{2m_e}\frac{2\pi Z}{a^2} \tag{3.68}$$

The value of $k_F$ determines the so called "Fermi sphere" in reciprocal space, which contains all the occupied states for the electrons. This Fermi sphere corresponds to a certain number of BZs with all states occupied by electrons, and a certain number of partially occupied BZs with interestingly shaped regions for the occupied portions. The number of full BZs and the shapes of occupied regions in partially filled BZs depend on $k_F$ through $Z$; an example for $Z = 4$ is shown in Fig. 3.6.

[Figure 3.6. Left: the first six Brillouin Zones of the two-dimensional square lattice. The position of the Fermi sphere (Fermi surface) for a crystal with $Z = 4$ electrons per unit cell is indicated by a shaded circle. Right: the shape of occupied portions of the various Brillouin Zones for the 2D square lattice with $Z = 4$ electrons per unit cell.]

Suppose that we want to plot the energy levels as a function of the k-vector as we did in the one-dimensional case: what do these look like, and what does the occupation of states in the various BZs mean? The answer to these questions will help us visualize and understand the behavior of the physical system. First, we realize that the k-vectors are two-dimensional so we would need a three-dimensional plot to plot the energy eigenvalues, two axes for the $k_x$, $k_y$ components and one for the energy. Moreover, in the extended zone scheme each component of $\mathbf{k}$ will span different ranges of values in different BZs, making the plot very complicated. Using the reduced zone scheme is equally complicated now that we are dealing with a two-dimensional system. With the tools we have developed so far, there is no simple solution to the problem. We will discuss next how we can take advantage of symmetry to reduce the problem to something manageable.
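The expressions for $k_F$ and $\epsilon_F$ in Eqs. (3.67) and (3.68) are easy to evaluate for different values of $Z$. The following sketch assumes units $\hbar = m_e = a = 1$; the function names and the simple classification of the first BZ by comparing $k_F$ to the edge and corner distances of the square zone are illustrative choices, not part of the text.

```python
import math

def fermi_wavevector_2d(z, a=1.0):
    """k_F of Eq. (3.67) for a 2D square lattice with z free electrons per unit cell."""
    return math.sqrt(2.0 * math.pi * z) / a

def fermi_energy_2d(z, a=1.0, hbar=1.0, m_e=1.0):
    """Fermi energy of Eq. (3.68); hbar = m_e = 1 are assumed units."""
    return hbar**2 * fermi_wavevector_2d(z, a) ** 2 / (2.0 * m_e)

def occupied_area_in_bz_units(z, a=1.0):
    """Area of the Fermi circle divided by the area (2*pi/a)^2 of one BZ."""
    return math.pi * fermi_wavevector_2d(z, a) ** 2 / (2.0 * math.pi / a) ** 2

def first_bz_status(z, a=1.0):
    """Compare k_F with the edge and corner distances of the square first BZ."""
    k_f = fermi_wavevector_2d(z, a)
    edge = math.pi / a                      # distance to the nearest Bragg line
    corner = math.sqrt(2.0) * math.pi / a   # distance to the zone corner
    if k_f <= edge:
        return "inside"             # Fermi circle lies within the first BZ
    if k_f < corner:
        return "partially filled"   # circle crosses edges, corners still empty
    return "full"                   # first BZ is completely occupied

for z in (1, 2, 4):
    print(z, fermi_wavevector_2d(z), fermi_energy_2d(z), first_bz_status(z))
```

The occupied area always equals $Z/2$ BZ areas, consistent with the counting of two electrons per k-point, and for $Z = 4$ the Fermi circle extends beyond the corners of the first BZ, in line with Fig. 3.6.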

3.6 Groups and symmetry operators

We first define the notion of a group: it is a finite or infinite set of operations, which satisfy the following four conditions:

1. Closure: if $A$ and $B$ are elements of the group so is $C = AB$.
2. Associativity: if $A$, $B$, $C$ are members of a group then $(AB)C = A(BC)$.
3. Unit element: there is one element $E$ of the group for which $EA = AE = A$, where $A$ is every other element of the group.
4. Inverse element: for every element $A$ of the group, there exists another element called its inverse $A^{-1}$, for which $AA^{-1} = A^{-1}A = E$.

A subgroup is a subset of operations of a group which form a group by themselves (they satisfy the above four conditions). Finally, two groups $\mathcal{A}$ and $\mathcal{B}$ are called isomorphic when there is a one-to-one correspondence between their elements $A_1, A_2, \ldots, A_n$ and $B_1, B_2, \ldots, B_n$, such that

$$A_1A_2 = A_3 \;\Longrightarrow\; B_1B_2 = B_3 \tag{3.69}$$

For a crystal we can define the following groups:

(a) S: the space group, which contains all symmetry operations that leave the crystal invariant.
(b) T: the translation group, which contains all the translations that leave the crystal invariant, that is, all the lattice vectors $\mathbf{R}$. T is a subgroup of S.
(c) P: the point group, which contains all space group operations with the translational part set equal to zero. P is not necessarily a subgroup of S, because when the translational part is set to zero, some operations may no longer leave the crystal invariant if they involved a translation by a vector not equal to any of the lattice vectors.

The space group can contain the following types of operations:

• lattice translations (all the Bravais lattice vectors $\mathbf{R}$), which form T;
• proper rotations by $\pi$, $2\pi/3$, $\pi/2$, $\pi/3$, around symmetry axes;
• improper rotations, like inversion $I$, or reflection on a plane $\sigma$;
• screw axes, which involve a proper rotation and a translation by $\mathbf{t} \neq \mathbf{R}$;
• glide planes, which involve a reflection and a translation by $\mathbf{t} \neq \mathbf{R}$.

To apply symmetry operations on various functions, we introduce the operators corresponding to the symmetries. These will be symbolized by $\{U|\mathbf{t}\}$, where $U$ corresponds to a proper or improper rotation and $\mathbf{t}$ to a translation. These operators act on real space vectors according to the rule:

$$\{U|\mathbf{t}\}\mathbf{r} = U\mathbf{r} + \mathbf{t} \tag{3.70}$$

In terms of these operators, the translations can be described as $\{E|\mathbf{R}\}$ (where $E$ is the identity, or unit element), while the pure rotations (proper or improper) are

described as $\{U|0\}$. The first set corresponds to the translation group T, the second to the point group P. For any two symmetry operators belonging to the space group,

$$\{U_1|\mathbf{t}_1\}\{U_2|\mathbf{t}_2\}\mathbf{r} = \{U_1|\mathbf{t}_1\}[U_2\mathbf{r} + \mathbf{t}_2] = U_1U_2\mathbf{r} + U_1\mathbf{t}_2 + \mathbf{t}_1 \tag{3.71}$$

which means that the operator $\{U_1U_2|U_1\mathbf{t}_2 + \mathbf{t}_1\}$ corresponds to an element of the space group. Using this rule for multiplication of operators, we can easily show that the inverse of $\{U|\mathbf{t}\}$ is

$$\{U|\mathbf{t}\}^{-1} = \{U^{-1}|-U^{-1}\mathbf{t}\} \tag{3.72}$$

The validity of this can be verified by applying $\{U^{-1}|-U^{-1}\mathbf{t}\}\{U|\mathbf{t}\}$ on $\mathbf{r}$, whereupon we get back $\mathbf{r}$.

Some general remarks on group theory applications to crystals are in order. We saw that P consists of all space group operators with the translational part set equal to zero. Therefore generally P × T ≠ S, because several of the group symmetries may involve non-lattice vector translations $\mathbf{t} \neq \mathbf{R}$. The point group leaves the Bravais lattice invariant but not the crystal invariant (recall that the crystal is defined by the Bravais lattice and the atomic basis in each unit cell). If the crystal has glide planes and screw rotations, that is, symmetry operations that involve $\{U|\mathbf{t}\}$ with $\mathbf{t} \neq 0$, $\{E|\mathbf{t}\} \notin$ T, then P does not leave the crystal invariant. Groups that include such symmetry operations are referred to as non-symmorphic groups. A symmorphic group is one in which a proper choice of the origin of coordinates eliminates all non-lattice translations, in which case P is a true subgroup of S. Finally, all symmetry operators in the space group commute with the hamiltonian, since they leave the crystal, and therefore the external potential, invariant.

In three dimensions there are 14 different types of Bravais lattices, 32 different types of point groups, and a total of 230 different space groups, of which 73 are symmorphic and 157 are non-symmorphic. We will give a more detailed account of symmetry groups in three dimensions after we have considered a simple example in two dimensions.

3.7 Symmetries of the band structure

In the following we will need to apply symmetry operations to functions of the space variable $\mathbf{r}$, so we define a new set of operators whose effect is to change $\mathbf{r}$:

$$O_{\{U|\mathbf{t}\}}f(\mathbf{r}) = f(\{U|\mathbf{t}\}^{-1}\mathbf{r}) \tag{3.73}$$

In this definition of $O_{\{U|\mathbf{t}\}}$, the action is always on the vector $\mathbf{r}$ itself. We will prove that the group formed by the operators $O_{\{U|\mathbf{t}\}}$ is isomorphic to the group of

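operators $\{U|\mathbf{t}\}$:

$$O_{\{U_1|\mathbf{t}_1\}}O_{\{U_2|\mathbf{t}_2\}}f(\mathbf{r}) = O_{\{U_1|\mathbf{t}_1\}}f(\{U_2|\mathbf{t}_2\}^{-1}\mathbf{r}) = O_{\{U_1|\mathbf{t}_1\}}g(\mathbf{r}) = g(\{U_1|\mathbf{t}_1\}^{-1}\mathbf{r})$$
$$= f(\{U_2|\mathbf{t}_2\}^{-1}\{U_1|\mathbf{t}_1\}^{-1}\mathbf{r}) = f([\{U_1|\mathbf{t}_1\}\{U_2|\mathbf{t}_2\}]^{-1}\mathbf{r}) = O_{\{U_1|\mathbf{t}_1\}\{U_2|\mathbf{t}_2\}}f(\mathbf{r})$$
$$\Rightarrow\; O_{\{U_1|\mathbf{t}_1\}}O_{\{U_2|\mathbf{t}_2\}} = O_{\{U_1|\mathbf{t}_1\}\{U_2|\mathbf{t}_2\}} \tag{3.74}$$

where we have defined the intermediate function $g(\mathbf{r}) = f(\{U_2|\mathbf{t}_2\}^{-1}\mathbf{r})$ to facilitate the proof. We have also used the relation

$$\{U_2|\mathbf{t}_2\}^{-1}\{U_1|\mathbf{t}_1\}^{-1} = [\{U_1|\mathbf{t}_1\}\{U_2|\mathbf{t}_2\}]^{-1}$$

which can be easily proved from the definition of the inverse element Eq. (3.72). The last equality in Eq. (3.74) proves the isomorphism between the groups $\{U|\mathbf{t}\}$ and $O_{\{U|\mathbf{t}\}}$, since their elements satisfy the condition of Eq. (3.69) in an obvious correspondence between elements of the two groups.

Having defined the basic formalism for taking advantage of the crystal symmetries, we will now apply it to simplify the description of the eigenfunctions and eigenvalues of the single-particle hamiltonian. We will prove that for any element of the space group $\{U|\mathbf{t}\} \in$ S, we have

$$\epsilon_{U\mathbf{k}}^{(n)} = \epsilon_{\mathbf{k}}^{(n)} \tag{3.75}$$
$$\psi_{U\mathbf{k}}^{(n)}(\mathbf{r}) = O_{\{U|\mathbf{t}\}}\psi_{\mathbf{k}}^{(n)}(\mathbf{r}) \tag{3.76}$$

To show this we need to prove: first, that $\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$ and $O_{\{U|\mathbf{t}\}}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$ are eigenstates of the hamiltonian with the same eigenvalue; and second, that $O_{\{U|\mathbf{t}\}}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$ is a Bloch state of wave-vector $U\mathbf{k}$. We first note that the symmetry operators of the space group commute with the hamiltonian, by their definition (they leave the crystal, hence the total potential invariant):

$$O_{\{U|\mathbf{t}\}}[\mathcal{H}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})] = O_{\{U|\mathbf{t}\}}[\epsilon_{\mathbf{k}}^{(n)}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})] \;\Rightarrow\; \mathcal{H}[O_{\{U|\mathbf{t}\}}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})] = \epsilon_{\mathbf{k}}^{(n)}[O_{\{U|\mathbf{t}\}}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})]$$

which shows that $\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$ and $O_{\{U|\mathbf{t}\}}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$ are eigenfunctions with the same eigenvalue, proving the first statement. To prove the second statement, we apply the operator $O_{\{E|-\mathbf{R}\}}$ to the function $O_{\{U|\mathbf{t}\}}\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$: this is simply a translation operator ($E$ is the identity), and therefore changes the argument of a function by $+\mathbf{R}$. We expect to find that this function is then changed by a phase factor $\exp(iU\mathbf{k}\cdot\mathbf{R})$, if indeed it is a Bloch state of wave-vector $U\mathbf{k}$. To prove this we note that

$$\{U|\mathbf{t}\}^{-1}\{E|-\mathbf{R}\}\{U|\mathbf{t}\} = \{U|\mathbf{t}\}^{-1}\{U|\mathbf{t}-\mathbf{R}\} = \{U^{-1}|-U^{-1}\mathbf{t}\}\{U|\mathbf{t}-\mathbf{R}\} = \{E|U^{-1}\mathbf{t} - U^{-1}\mathbf{R} - U^{-1}\mathbf{t}\} = \{E|-U^{-1}\mathbf{R}\}$$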

we have found that the symmetry properties of the energy spectrum and the single-particle wavefunctions simplify the task of obtaining the band structure considerably. where U is any symmetry in the point group P. This offers great savings in computation.76). we have shown that the two Bloch states are related by (n ) (n ) (n ) (n ) −1 −1 ψU r − U −1 t ) k (r) = O{U |t} ψk (r) = ψk ({U |t} r) = ψk (U (n ) (n ) and eigenfunctions ψk (r) at a point k we so that if we knew the eigenvalues k could immediately ﬁnd all the eigenvalues and eigenfunctions at points U k. When we apply the operators (n ) in this last equation to ψk (r). Furthermore. the corresponding eigenvalue must be U k . The set of vectors U k for {U |0} ∈ P is called the star of k. which is equal to k since (n ) (n ) we have already proven that the two Bloch states ψk (r) and O{U |t} ψk (r) have the same eigenvalue. which can then be used to obtain all the remaining solutions. which can be “unfolded” by the symmetry operations of the point group to give us all the solutions in the entire Brillouin Zone. both the energy and the wavefunctions are smooth functions of k which in the limit of an inﬁnite crystal is a continuous variable.7 Symmetries of the band structure 107 and multiplying the initial and ﬁnal parts of this equation by {U |t} from the left. (n ) (n ) Therefore. (n ) These results indicate that the energy spectrum k has the full symmetry of the point group P = [{U |0}]. We have then proven the two symmetry statements of Eq. (3. that is. To summarize the discussion so far. When this last result is substituted into the previous equation. since we need only solve the single-particle equations in a small portion of the Brillouin Zone. by requiring that we solve the singleparticle Schr¨ odinger equations in the IBZ only. so we can understand the behavior of the energy bands by . Furthermore. This portion of the Brillouin Zone. 
we get: (n ) (n ) (r)] = eiU k·R [O{U |t} ψk (r)] O{ E |−R} [O{U |t} ψk (n ) which proves that O{U |t} ψk (r) is indeed a Bloch state of wave-vector U k. we obtain { E | − R}{U |t} = {U |t}{ E | − U −1 R} ⇒ O{ E |−R}{U |t} = O{U |t}{ E |−U −1 R} due to isomorphism of the groups {U |t} and O{U |t} . we ﬁnd (n ) (n ) (n ) O{ E |−R} O{U |t} ψk (r) = O{U |t} O{ E |−U −1 R} ψk (r) = O{U |t} ψk (r + U −1 R) (n ) Using the fact that ψk (r) is a Bloch state. is called the Irreducible Brillouin Zone (IBZ).3. a dot product of two vectors does not change if we rotate both vectors by U . we obtain (n ) ψk (r + U −1 R) = eik·U −1 R (n ) (n ) ψk (r) = eiU k·R ψk (r) where we have used the fact that k · U −1 R = U k · R.
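The star of $\mathbf{k}$ and the relation $\mathbf{k}\cdot U^{-1}\mathbf{R} = U\mathbf{k}\cdot\mathbf{R}$ can be checked with a few lines of code. The sketch below is only an illustration under stated assumptions: it uses the eight operations of the 2D square lattice point group as the example group, takes the free-electron energy in units where $\hbar^2/2m_e = 1$, and the names `POINT_GROUP`, `apply`, `star_of_k` and `free_electron_energy` are hypothetical, chosen for this example.

```python
from fractions import Fraction

# Eight point-group operations of the 2D square lattice as 2x2 matrices
# acting on the column vector (x, y). This choice of group is an assumption
# made for illustration.
POINT_GROUP = [
    ((1, 0), (0, 1)),    # identity
    ((0, -1), (1, 0)),   # rotation by pi/2
    ((-1, 0), (0, -1)),  # rotation by pi
    ((0, 1), (-1, 0)),   # rotation by 3*pi/2
    ((1, 0), (0, -1)),   # reflection on the x axis
    ((-1, 0), (0, 1)),   # reflection on the y axis
    ((0, 1), (1, 0)),    # reflection on the axis at pi/4
    ((0, -1), (-1, 0)),  # reflection on the axis at 3*pi/4
]

def apply(U, v):
    """Return the vector U v."""
    return (U[0][0] * v[0] + U[0][1] * v[1], U[1][0] * v[0] + U[1][1] * v[1])

def transpose(U):
    """For an orthogonal matrix the transpose is the inverse."""
    return ((U[0][0], U[1][0]), (U[0][1], U[1][1]))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def star_of_k(k):
    """Set of distinct vectors U k for all U in the point group."""
    return {apply(U, k) for U in POINT_GROUP}

def free_electron_energy(k):
    """|k|^2, in units where hbar^2 / (2 m_e) = 1."""
    return dot(k, k)

k = (Fraction(3, 10), Fraction(1, 10))  # a generic point, exact arithmetic
R = (2, -1)                             # a lattice vector in units of a

star = star_of_k(k)
energies = {free_electron_energy(q) for q in star}
dot_invariant = all(
    dot(k, apply(transpose(U), R)) == dot(apply(U, k), R) for U in POINT_GROUP
)
print(len(star), len(energies), dot_invariant)
```

A generic point has as many members in its star as there are operations in the group, while points on symmetry axes have fewer, which is what `star_of_k((Fraction(1, 2), 0))` illustrates.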

We retake our simple example of the free-electron model in the 2D square lattice to illustrate these concepts. The symmetries for this model, consisting of one atom per PUC, and having lattice vectors $\mathbf{a}_1 = a\hat{x}$, $\mathbf{a}_2 = a\hat{y}$, are:

- the identity $E$;
- rotation by $(2\pi)/4$ around an axis perpendicular to the plane, $C_4$;
- rotation by $(2\pi)/2$ around an axis perpendicular to the plane, $C_2$;
- rotation by $3(2\pi)/4$ around an axis perpendicular to the plane, $C_4^3$;
- reflection on the $x$ axis, $\sigma_x$;
- reflection on the $y$ axis, $\sigma_y$;
- reflection on the axis at $\theta = \pi/4$ from the $x$ axis, $\sigma_1$;
- reflection on the axis at $\theta = 3\pi/4$ from the $x$ axis, $\sigma_3$.

These symmetries constitute the point group P for this physical system. We assume that the origin is at the position of the atom, so that the group is symmorphic. This point group has the group multiplication table given in Table 3.2: an entry in this Table is the result of multiplying the element at the top of its column with the element at the left end of its row. The definition of each element in terms of its action on an arbitrary vector $(x, y)$ on the plane is also given on the left half of Table 3.2. This group multiplication table can be used to prove that all the group properties are satisfied.

Table 3.2. Group multiplication table for symmetries of the 2D square lattice.

| Element | $(x, y) \to$ | $E$ | $C_4$ | $C_2$ | $C_4^3$ | $\sigma_x$ | $\sigma_y$ | $\sigma_1$ | $\sigma_3$ |
|---|---|---|---|---|---|---|---|---|---|
| $E$ | $(x, y)$ | $E$ | $C_4$ | $C_2$ | $C_4^3$ | $\sigma_x$ | $\sigma_y$ | $\sigma_1$ | $\sigma_3$ |
| $C_4$ | $(-y, x)$ | $C_4$ | $C_2$ | $C_4^3$ | $E$ | $\sigma_3$ | $\sigma_1$ | $\sigma_x$ | $\sigma_y$ |
| $C_2$ | $(-x, -y)$ | $C_2$ | $C_4^3$ | $E$ | $C_4$ | $\sigma_y$ | $\sigma_x$ | $\sigma_3$ | $\sigma_1$ |
| $C_4^3$ | $(y, -x)$ | $C_4^3$ | $E$ | $C_4$ | $C_2$ | $\sigma_1$ | $\sigma_3$ | $\sigma_y$ | $\sigma_x$ |
| $\sigma_x$ | $(x, -y)$ | $\sigma_x$ | $\sigma_1$ | $\sigma_y$ | $\sigma_3$ | $E$ | $C_2$ | $C_4$ | $C_4^3$ |
| $\sigma_y$ | $(-x, y)$ | $\sigma_y$ | $\sigma_3$ | $\sigma_x$ | $\sigma_1$ | $C_2$ | $E$ | $C_4^3$ | $C_4$ |
| $\sigma_1$ | $(y, x)$ | $\sigma_1$ | $\sigma_y$ | $\sigma_3$ | $\sigma_x$ | $C_4^3$ | $C_4$ | $E$ | $C_2$ |
| $\sigma_3$ | $(-y, -x)$ | $\sigma_3$ | $\sigma_x$ | $\sigma_1$ | $\sigma_y$ | $C_4$ | $C_4^3$ | $C_2$ | $E$ |

For this system the point group is a subgroup of the space group because it is a symmorphic group, and therefore the space group S is given by S = P × T where T is the translation group. Using these symmetries we deduce that the symmetries of the point group give an IBZ which is one-eighth that of the first BZ, shown in Fig. 3.7.

Figure 3.7. Left: the symmetry operations of the two-dimensional square lattice; the thick straight lines indicate the reflections (labeled $\sigma_x$, $\sigma_y$, $\sigma_1$, $\sigma_3$) and the curves with arrows the rotations (labeled $C_4$, $C_2$, $C_4^3$). Right: the Irreducible Brillouin Zone for the two-dimensional square lattice with full symmetry; the special points are labeled $\Gamma$, $\Sigma$, $\Delta$, $M$, $Z$, $X$.

There are a few high-symmetry points in this IBZ (in $(k_x, k_y)$ notation):

(1) $\Gamma = (0, 0)$, which has the full symmetry of the point group;
(2) $M = (1, 1)(\pi/a)$, which also has the full symmetry of the point group;
(3) $X = (1, 0)(\pi/a)$, which has the symmetries $E$, $C_2$, $\sigma_x$, $\sigma_y$;
(4) $\Delta = (k, 0)(\pi/a)$, $0 < k < 1$, which has the symmetries $E$, $\sigma_x$;
(5) $\Sigma = (k, k)(\pi/a)$, $0 < k < 1$, which has the symmetries $E$, $\sigma_1$;
(6) $Z = (1, k)(\pi/a)$, $0 < k < 1$, which has the symmetries $E$, $\sigma_y$.

Any point inside the IBZ that is not one of the high-symmetry points mentioned above does not have any symmetry, other than the identity $E$.

One important point to keep in mind is that Kramers' theorem always applies, which for systems with equal numbers of spin-up and spin-down electrons implies inversion symmetry in reciprocal space. Thus, even if the crystal does not have inversion symmetry as one of the point group elements, we can always use inversion symmetry, in addition to all the other point group symmetries imposed by the lattice, to reduce the size of the irreducible portion of the Brillouin Zone. In the example we have been discussing, $C_2$ is actually equivalent to inversion symmetry, so we do not have to add it explicitly in the list of symmetry operations when deriving the size of the IBZ.

As already described above, since both $\epsilon^{(n)}_{\mathbf{k}}$ and $\psi^{(n)}_{\mathbf{k}}(\mathbf{r})$ are smooth functions of $\mathbf{k}$, we need only calculate their values for the high-symmetry points mentioned, which will then provide essentially a full description of the physics. For the case of the free-electron system in the two-dimensional square lattice, the energy for the first nine bands at these high-symmetry points is shown in Fig. 3.8. In this simple model, the energy of single-particle levels is given by

$$\epsilon^{(n)}_{\mathbf{k}} = \frac{\hbar^2 |\mathbf{k}|^2}{2 m_e}$$
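Both Table 3.2 and the symmetries listed for the high-symmetry points of the IBZ can be reproduced mechanically from the action of each operation on $(x, y)$. The following is a minimal sketch, not the author's procedure: the short names (`sx` for $\sigma_x$, and so on) and the helpers `matmul`, `multiplication_table` and `little_group` are hypothetical. It assumes that a table entry equals the column operation applied after the row operation, a convention inferred here by matching the printed table, and that a point $\mathbf{k}$ (in units of $\pi/a$) is left invariant when $U\mathbf{k} - \mathbf{k}$ is a reciprocal lattice vector, that is, a pair of even integers in these units.

```python
from fractions import Fraction

# Matrices read off the left half of Table 3.2: (x, y) -> (a x + b y, c x + d y)
OPS = {
    "E":    ((1, 0), (0, 1)),
    "C4":   ((0, -1), (1, 0)),
    "C2":   ((-1, 0), (0, -1)),
    "C4^3": ((0, 1), (-1, 0)),
    "sx":   ((1, 0), (0, -1)),
    "sy":   ((-1, 0), (0, 1)),
    "s1":   ((0, 1), (1, 0)),
    "s3":   ((0, -1), (-1, 0)),
}
NAME_OF = {matrix: name for name, matrix in OPS.items()}

def matmul(A, B):
    """2x2 matrix product A B (B acts first on a vector)."""
    return tuple(
        tuple(sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )

def multiplication_table():
    """table[row][col]: column operation applied after the row operation
    (assumed convention, chosen to match the printed table)."""
    return {
        row: {col: NAME_OF[matmul(OPS[col], OPS[row])] for col in OPS}
        for row in OPS
    }

def little_group(k):
    """Operations U with U k - k equal to a reciprocal lattice vector.
    k is given in units of pi/a, so reciprocal lattice vectors are pairs
    of even integers."""
    members = []
    for name, U in OPS.items():
        image = (U[0][0] * k[0] + U[0][1] * k[1], U[1][0] * k[0] + U[1][1] * k[1])
        if all((image[i] - k[i]) % 2 == 0 for i in range(2)):
            members.append(name)
    return members

table = multiplication_table()
half = Fraction(1, 2)  # a representative value of k with 0 < k < 1
points = {
    "Gamma": (0, 0), "M": (1, 1), "X": (1, 0),
    "Delta": (half, 0), "Sigma": (half, half), "Z": (1, half),
}
for label, k in points.items():
    print(label, little_group(k))
```

Because every product of two operations is found in `NAME_OF`, the same loop also confirms closure of the group.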

Figure 3.8. The band structure for the two-dimensional free-electron model at the high-symmetry points in the IBZ, for the first nine bands coming from the first nine BZs, which are shown on the right. The position of the Fermi level $\epsilon_F$ for $Z = 4$ electrons per unit cell is indicated. At the bottom we reproduce the filling of the first four BZs, as shown in Fig. 3.6.

The free-electron energy depends only on the magnitude of the wave-vector $\mathbf{k}$. The bands are obtained by scanning the values of the wave-vector $\mathbf{k}$ along the various directions in the first, second, third, etc. BZ, and then folding them within the first BZ. To illustrate the origin of the bands along the high-symmetry directions in the BZ, we show in Fig. 3.8 the wave-vectors that correspond to equivalent points in the first nine BZs, along the $\Gamma - M$ direction. All these points differ by reciprocal lattice vectors, so in the reduced zone scheme they are mapped to the same point in the first BZ. This makes it evident why the bands labeled 2 and 3 are degenerate along the $\Gamma - M$ line: the wave-vectors corresponding to these points in the BZs with labels 2 and 3 have equal magnitude. For the same reason, the pairs of bands labeled 5 and 6, or 7 and 8, are also degenerate along the $\Gamma - M$ line. Analogous degeneracies are found along the $\Gamma - X$ and $M - X$ lines, but they involve different pairs of bands. The labeling of the bands is of course arbitrary and serves only to identify electronic states with different energy at the same wave-vector $\mathbf{k}$; the only requirement is that the label be kept consistent for the various parts of the BZ. In this case, it is convenient to choose band labels that make the connection to the various BZs transparent, as is done in Fig. 3.8.
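The folding procedure can be sketched in a few lines. This is an illustration under stated assumptions rather than the method used to produce Fig. 3.8: wave-vectors are measured in units of $\pi/a$, so that reciprocal lattice vectors are $(2m, 2n)$ with integer $m, n$; energies are in units of $(\hbar^2/2m_e)(\pi/a)^2$; bands are labeled simply by increasing energy at each $\mathbf{k}$; and the function name `folded_bands` and the cutoff `gmax` are hypothetical.

```python
from fractions import Fraction

def folded_bands(k, nbands=9, gmax=4):
    """Lowest free-electron energies |k + G|^2 at a point k of the first BZ.

    k is in units of pi/a; G = (2m, 2n) runs over reciprocal lattice vectors
    with |m|, |n| <= gmax, which is more than enough for the first nine bands.
    Energies are in units of (hbar^2 / 2 m_e) (pi/a)^2.
    """
    energies = sorted(
        (k[0] + 2 * m) ** 2 + (k[1] + 2 * n) ** 2
        for m in range(-gmax, gmax + 1)
        for n in range(-gmax, gmax + 1)
    )
    return energies[:nbands]

gamma_bands = folded_bands((0, 0))
x_bands = folded_bands((1, 0))
m_bands = folded_bands((1, 1))
# A point half-way along the Gamma-M line, in exact arithmetic
sigma_bands = folded_bands((Fraction(1, 2), Fraction(1, 2)))

for label, bands in [("Gamma", gamma_bands), ("X", x_bands), ("M", m_bands)]:
    print(label, bands)
```

At the point half-way along $\Gamma - M$ the second and third entries of `sigma_bands` coincide, as do the fifth and sixth and the seventh and eighth, which is the degeneracy pattern described above.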

Using this band structure, we can now interpret the filling of the different BZs in Fig. 3.6. The first band is completely full being entirely below the Fermi level, and so is the first BZ. The second band is full along the direction $MX$, but only partially full in the directions $\Gamma X$ and $\Gamma M$, giving rise to the almost full second BZ with an empty region in the middle, which corresponds to the Fermi surface shape depicted. The third band has only a small portion occupied, in the $MX$ and $M\Gamma$ directions, and the third BZ has occupied slivers near the four corners. Finally, the fourth band has a tiny portion occupied around the point $M$, corresponding to the so called "pockets" of occupied states at the corners. As this discussion exemplifies, the behavior of the energy bands along the high-symmetry directions of the BZ provides essentially a complete picture of the eigenvalues of the single-particle hamiltonian throughout reciprocal space in the reduced zone scheme.

3.8 Symmetries of 3D crystals

We revisit next the symmetry groups associated with three-dimensional solids. In three dimensions there are 14 different Bravais lattices. They are grouped in six crystal systems called triclinic, monoclinic, orthorhombic, tetragonal, hexagonal and cubic, in order of increasing symmetry. The hexagonal system is often split into two parts, called the trigonal and hexagonal subsystems. The definition of these systems in terms of the relations between the sides of the unit cell $(a, b, c)$ and the angles between them (with $\alpha$ = the angle between sides $b, c$; $\beta$ = the angle between sides $a, c$; and $\gamma$ = the angle between sides $a, b$) is given in Table 3.3. Fig. 3.9 shows the conventional unit cells for the 14 Bravais lattices. Each corner of the cell is occupied by sites that are equivalent by translational symmetry.

Table 3.3. The six crystal systems and the associated 32 point groups for crystals in 3D. The relations between the cell sides $(a, b, c)$ and cell angles $(\alpha, \beta, \gamma)$ are shown and the corresponding lattices are labeled P = primitive, I = body-centered, C = side-centered, F = face-centered, R = rhombohedral.

| System | Cell sides | Cell angles | Lattices | Point groups |
|---|---|---|---|---|
| Triclinic | $a \neq b \neq c$ | $\alpha \neq \beta \neq \gamma$ | P | $C_1$, $C_i$ |
| Monoclinic | $a \neq b \neq c$ | $\alpha = \beta = \frac{\pi}{2} \neq \gamma$ | P, C | $C_2$, $C_s$, $C_{2h}$ |
| Orthorhombic | $a \neq b \neq c$ | $\alpha = \beta = \gamma = \frac{\pi}{2}$ | P, I, C, F | $D_2$, $D_{2h}$, $C_{2v}$ |
| Tetragonal | $a = b \neq c$ | $\alpha = \beta = \gamma = \frac{\pi}{2}$ | P, I | $C_4$, $S_4$, $C_{4h}$, $D_4$, $C_{4v}$, $D_{2d}$, $D_{4h}$ |
| Trigonal | $a = b \neq c$ | $\alpha = \beta = \frac{\pi}{2}$, $\gamma = \frac{2\pi}{3}$ | R | $C_3$, $C_{3i}$, $C_{3v}$, $D_3$, $D_{3d}$ |
| Hexagonal | $a = b \neq c$ | $\alpha = \beta = \frac{\pi}{2}$, $\gamma = \frac{2\pi}{3}$ | P | $C_6$, $C_{3h}$, $C_{6h}$, $C_{6v}$, $D_{3h}$, $D_6$, $D_{6h}$ |
| Cubic | $a = b = c$ | $\alpha = \beta = \gamma = \frac{\pi}{2}$ | P, I, F | $T$, $T_h$, $O$, $T_d$, $O_h$ |

Figure 3.9. The conventional unit cells of the 14 Bravais lattices in three dimensions (triclinic: P; monoclinic: P, C; orthorhombic: P, I, C, F; tetragonal: P, I; hexagonal: P, R; cubic: P, I, F). The small gray circles indicate equivalent sites in the unit cell: those at the corners are equivalent by translational symmetry, the others indicate the presence of additional symmetries, implying a primitive unit cell smaller than the conventional.

When there are no other sites equivalent to the corners, the conventional cell is the same as the primitive cell for this lattice and it is designated P. When the cell has an equivalent site at its geometric center (a body-centered cell) it is designated I for the implied inversion symmetry; the primitive unit cell in this case is one-half the conventional cell. When the cell has an equivalent site at the center of one face it is designated C (by translational symmetry it must also have an equivalent site at the opposite face); the primitive unit cell in this case is one-half the conventional cell. When the cell has an equivalent site at the center of each face (a face-centered cell) it is designated F; the primitive unit cell in this case is one-quarter the conventional cell. Finally, in the hexagonal system there exists a lattice which has two more equivalent sites inside the conventional cell, at height $c/3$ and $2c/3$ along the main diagonal (see Fig. 3.9), and it is designated R for rhombohedral; the primitive unit cell in this case is one-third the conventional cell.

A number of possible point groups are associated with each of the 14 Bravais lattices, depending on the symmetry of the basis. There are 32 point groups in all, denoted by the following names:

- C for "cyclic", when there is a single axis of rotation; a number subscript $m$ indicates the $m$-fold symmetry around this axis ($m$ can be 2, 3, 4, 6);
- D for "dihedral", when there are two-fold axes at right angles to another axis;
- T for "tetrahedral", when there are four sets of rotation axes of three-fold symmetry, as in a tetrahedron (see Fig. 3.11);
- O for "octahedral", when there are four-fold rotation axes combined with perpendicular two-fold rotation axes, as in an octahedron (see Fig. 3.11).

In all of these cases, the existence of additional mirror plane symmetries is denoted by a second subscript which can be $h$ for "horizontal", $v$ for "vertical", or $d$ for "diagonal" planes relative to the rotation axes. Inversion symmetry is denoted by the letter subscript $i$. In the case of a one-fold rotation axis, when inversion symmetry is present the notation $C_i$ is adopted (instead of what would normally be called $C_{1i}$), while when a mirror plane symmetry is present the notation $C_s$ is used (instead of what would normally be called $C_{1h}$). Finally, the group generated by the symmetry operation $\pi/2$-rotation followed by reflection on a vertical plane, the so called "roto-reflection" group, is denoted by $S_4$ (this is different than $C_{4v}$ in which the $\pi/2$-rotation and the reflection on a vertical plane are independently symmetry operations). This set of conventions for describing the 32 crystallographic groups is referred to as the Schoenflies notation; the names of the groups in this notation are given in Table 3.3. There is a somewhat more rational set of conventions for naming crystallographic point groups described in the International Tables for Crystallography. This scheme is more complicated; we refer the interested reader to these Tables for further details.

There exists a useful way of visualizing the symmetry operations of the various point groups, referred to as "stereograms". These consist of two-dimensional projections of a point on the surface of a sphere and its images generated by acting on the sphere with the various symmetry operations of the point group. In order to illustrate the action of the symmetry operations, the initial point is chosen as some arbitrary point on the sphere, that is, it does not belong to any special axis or plane of symmetry. The convention for the equivalent points on the sphere is that the ones in the northern hemisphere are drawn as open circles, while the ones in the southern hemisphere are shown as solid dots and the sphere is viewed from the northern pole. In Fig. 3.10 we show the stereograms for 30 point groups belonging to five crystal systems; the stereograms for the triclinic system (point groups $C_1$ and $C_i$) are not shown because they are trivial: they contain only one point each, since these groups do not have any symmetry operations other than the trivial ones (identity and inversion).

As an illustration, we discuss briefly the symmetries of the cube, which has the highest degree of symmetry compatible with three-dimensional periodicity. When only two-fold, three-fold and four-fold rotations are allowed, as required by three-dimensional periodicity (see problem 1), there are 24 images of an arbitrary point in space $(x, y, z)$: these are given in Table 3.4.

Figure 3.10. Stereograms for 30 point groups in 3D (monoclinic: $C_2$, $C_s$, $C_{2h}$; orthorhombic: $C_{2v}$, $D_2$, $D_{2h}$; tetragonal: $C_4$, $S_4$, $C_{4h}$, $C_{4v}$, $D_{2d}$, $D_4$, $D_{4h}$; trigonal: $C_3$, $C_{3i}$, $C_{3v}$, $D_{3d}$, $D_3$; hexagonal: $C_6$, $C_{3h}$, $C_{6h}$, $C_{6v}$, $D_6$, $D_{3h}$, $D_{6h}$; cubic: $T$, $T_h$, $T_d$, $O$, $O_h$). The lines within the circles are visual aids, which in several cases correspond to reflection planes. The trigonal and hexagonal subsystems are shown separately, and the triclinic system (groups $C_1$ and $C_i$) is not shown at all since it has trivial representations.

Table 3.4. The 24 symmetry operations of the cube. These operations involve rotations by $\pi$ (the classes $C_4^2$ and $C_2$), $2\pi/3$ (the class $C_3$), $\pi/2$ (the class $C_4$) and $2\pi$ (the identity $E$). Twenty-four more symmetry operations can be generated by combining each of the rotations with inversion $I$ which corresponds to $(-x, -y, -z)$, i.e., a change of all signs.

| Class | Operation | Image of $x$ | Image of $y$ | Image of $z$ |
|---|---|---|---|---|
| $E$ | $E$ | $x$ | $y$ | $z$ |
| $C_4^2$ | $R_{x/2}$ | $x$ | $-y$ | $-z$ |
| $C_4^2$ | $R_{y/2}$ | $-x$ | $y$ | $-z$ |
| $C_4^2$ | $R_{z/2}$ | $-x$ | $-y$ | $z$ |
| $C_3$ | $R_1$ | $y$ | $z$ | $x$ |
| $C_3$ | $\bar{R}_1$ | $z$ | $x$ | $y$ |
| $C_3$ | $R_2$ | $z$ | $-x$ | $-y$ |
| $C_3$ | $\bar{R}_2$ | $-y$ | $-z$ | $x$ |
| $C_3$ | $R_3$ | $-z$ | $x$ | $-y$ |
| $C_3$ | $\bar{R}_3$ | $y$ | $-z$ | $-x$ |
| $C_3$ | $R_4$ | $-y$ | $z$ | $-x$ |
| $C_3$ | $\bar{R}_4$ | $-z$ | $-x$ | $y$ |
| $C_4$ | $R_{x/4}$ | $x$ | $z$ | $-y$ |
| $C_4$ | $\bar{R}_{x/4}$ | $x$ | $-z$ | $y$ |
| $C_4$ | $R_{y/4}$ | $-z$ | $y$ | $x$ |
| $C_4$ | $\bar{R}_{y/4}$ | $z$ | $y$ | $-x$ |
| $C_4$ | $R_{z/4}$ | $y$ | $-x$ | $z$ |
| $C_4$ | $\bar{R}_{z/4}$ | $-y$ | $x$ | $z$ |
| $C_2$ | $R'_{x/2}$ | $-x$ | $z$ | $y$ |
| $C_2$ | $R''_{x/2}$ | $-x$ | $-z$ | $-y$ |
| $C_2$ | $R'_{y/2}$ | $z$ | $-y$ | $x$ |
| $C_2$ | $R''_{y/2}$ | $-z$ | $-y$ | $-x$ |
| $C_2$ | $R'_{z/2}$ | $y$ | $x$ | $-z$ |
| $C_2$ | $R''_{z/2}$ | $-y$ | $-x$ | $-z$ |

Figure 3.11. Illustration of symmetry axes of the tetrahedron and the octahedron. (a) A three-fold rotation axis of the regular tetrahedron, denoted by a double-headed arrow, passes through one of its corners and the geometric center of the equilateral triangle directly across it; there are four such axes in the regular tetrahedron, each passing through one of its corners. (b) Two tetrahedra (in this case not regular ones) with a common axis of three-fold rotational symmetry in the cube are identified by thicker lines; there are four pairs of such tetrahedra in the cube, whose three-fold rotational symmetry axes are the main diagonals of the cube. (c) Three different symmetry axes, the vertical and horizontal double-headed arrows, of an octahedron; when the edges of the horizontal square are not of the same length as the other edges, the vertical axis corresponds to four-fold rotation, while the two horizontal axes correspond to two-fold rotations. When all edges are of equal length all three axes correspond to four-fold rotational symmetry. (d) One such octahedron, with all edges equal and three four-fold rotation axes, is identified within the cube by thicker lines.

The associated 24 operations are separated into five classes:

1. the identity $E$;
2. two-fold ($\pi$) rotations around the axes $x$, $y$ and $z$, denoted as a class by $C_4^2$ and as individual operations by $R_{\nu/2}$ ($\nu = x, y, z$);
3. four-fold ($2\pi/4$) rotations around the axes $x$, $y$ and $z$, denoted as a class by $C_4$ and as individual operations by $R_{\nu/4}$ ($\nu = x, y, z$) for counter-clockwise or by $\bar{R}_{\nu/4}$ for clockwise rotation;
4. three-fold ($2\pi/3$) rotations around the main diagonals of the cube, denoted as a class by $C_3$ and as individual operations by $R_n$ ($n = 1, 2, 3, 4$) for counter-clockwise or by $\bar{R}_n$ for clockwise rotation;
5. two-fold ($\pi$) rotations around axes that are perpendicular to the $x$, $y$, $z$ axes and bisect the in-plane angles between the cartesian axes, denoted as a class by $C_2$ and as individual operations by $R'_{\nu/2}$ ($\nu = x, y, z$) or $R''_{\nu/2}$, with the subscript indicating the cartesian axis perpendicular to the rotation axis.

Of these classes of operations the first three are trivial to visualize, while the last two are represented schematically in Fig. 3.12.

Figure 3.12. Illustration of two classes of symmetry operations of the cube. (a) The $C_3$ class consisting of three-fold rotation axes which correspond to the main diagonals of the cube, indicated by double-headed arrows and labeled 1–4. (b) and (c) The $C_2$ class consisting of two-fold rotation axes that are perpendicular to, and bisect the angles between, the cartesian axes; the rotation axes are shown as single-headed arrows and the points at which they intersect the sides of the cube are indicated by small circles.

When inversion is added to these operations, the total number of images of the arbitrary point becomes 48, since inversion can be combined with any rotation to produce a new operation. It is easy to rationalize this result: there are 48 different ways of rearranging the three cartesian components of the arbitrary point, including changes of sign: the first coordinate can be any of six possibilities ($\pm x$, $\pm y$, $\pm z$), the second coordinate can be any of four possibilities, and the last coordinate any of two possibilities. If these 48 operations are applied with respect to the center of the cube, the cube remains invariant. Therefore, in a crystal with a cubic Bravais lattice and a basis that does not break the cubic symmetry (such as a single atom per unit cell referred to as simple cubic, or an FCC or a BCC lattice), all 48 symmetry operations of the cube will leave the crystal invariant, and therefore all these operations form the point group of the crystal.
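The counting argument for the cube is easy to verify by enumeration. The sketch below is an illustration only: it generates every rearrangement of $(x, y, z)$ with optional sign changes as a $3 \times 3$ matrix, keeps the proper rotations (determinant $+1$), and sorts them into the five classes using the trace, which equals $1 + 2\cos\theta$ for a rotation by angle $\theta$. The function names and the class labels as strings are hypothetical choices for this example.

```python
from collections import Counter
from itertools import permutations, product

def signed_permutation_matrices():
    """All 3x3 matrices that permute (x, y, z) and optionally flip signs."""
    matrices = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            matrices.append(tuple(
                tuple(signs[row] if col == perm[row] else 0 for col in range(3))
                for row in range(3)
            ))
    return matrices

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def rotation_class(m):
    """Class of a proper rotation, from trace = 1 + 2 cos(theta)."""
    trace = m[0][0] + m[1][1] + m[2][2]
    if trace == 3:
        return "E"      # theta = 0
    if trace == 1:
        return "C4"     # theta = pi/2
    if trace == 0:
        return "C3"     # theta = 2 pi/3
    # trace == -1 means theta = pi: the axis is a cartesian axis when the
    # matrix is diagonal, and a face diagonal otherwise.
    diagonal = all(m[i][j] == 0 for i in range(3) for j in range(3) if i != j)
    return "C4^2" if diagonal else "C2"

all_operations = signed_permutation_matrices()
rotations = [m for m in all_operations if det3(m) == 1]
class_sizes = Counter(rotation_class(m) for m in rotations)
print(len(all_operations), len(rotations), dict(class_sizes))
```

The remaining operations, those with determinant $-1$, are the rotations combined with inversion.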

The point group of the cube, with its 48 operations, is the largest point group compatible with three-dimensional periodicity. There are other larger point groups in three dimensions but they are not compatible with three-dimensional periodicity. For example, the icosahedral group has a total of 120 operations, but its five-fold symmetry is not compatible with translational periodicity in three dimensions.

3.9 Special k-points

Another useful application of group theory arguments is in the simplification of reciprocal-space integrations. Very often we need to calculate quantities of the type

$$\bar{g} = \frac{1}{N}\sum_{\mathbf{k}\in BZ} g(\mathbf{k}) = \Omega_{PUC}\int \frac{d\mathbf{k}}{(2\pi)^3}\, g(\mathbf{k}) \tag{3.77}$$

where $\mathbf{k} \in BZ$ stands for all values of the wave-vector $\mathbf{k}$ inside a Brillouin Zone (it is convenient to assume that we are dealing with the reduced zone scheme and the first BZ). For example, when calculating the electronic density in the single-particle picture we have to calculate the quantities

$$\rho_{\mathbf{k}}(\mathbf{r}) = \sum_{n,\,\epsilon^{(n)}_{\mathbf{k}} < \epsilon_F} |\psi^{(n)}_{\mathbf{k}}(\mathbf{r})|^2, \qquad \rho(\mathbf{r}) = \int \frac{d\mathbf{k}}{(2\pi)^3}\, \rho_{\mathbf{k}}(\mathbf{r}) \tag{3.78}$$

with the single-particle wavefunctions $\psi^{(n)}_{\mathbf{k}}(\mathbf{r})$ normalized to unity in the PUC. We can take advantage of the crystal symmetry to reduce the summation over $\mathbf{k}$ values inside the IBZ. In doing this, we need to keep track of the multiplicity of each point in the IBZ, that is, the number of equivalent points to which a particular k-point in the IBZ is mapped when the different symmetry operations are applied to it. While this significantly simplifies our task, there is an even greater simplification: with the use of a very few $\mathbf{k}$ values we can obtain an excellent approximation to the sum in Eq. (3.77). These are called special k-points [33–35], and we discuss them next.

We begin by defining the function

$$f(\mathbf{k}) = \frac{1}{N_P}\sum_{P\in\mathcal{P}} g(P\mathbf{k}) \tag{3.79}$$

where $\mathcal{P}$ is the point group, $P$ is an operation in the point group, and $N_P$ is the total number of operations in $\mathcal{P}$. Then we will have $f(\mathbf{k}) = f(P\mathbf{k})$, since applying $P$ to any operation in $\mathcal{P}$ gives another operation in $\mathcal{P}$ due to closure, and the definition of $f(\mathbf{k})$ includes already a summation over all the operations in $\mathcal{P}$. We also can deduce that the sum of $f(\mathbf{k})$ over all k-points in the BZ is equal to the sum of $g(\mathbf{k})$, because

$$\sum_{\mathbf{k}\in BZ} f(\mathbf{k}) = \sum_{\mathbf{k}\in BZ}\frac{1}{N_P}\sum_{P\in\mathcal{P}} g(P\mathbf{k}) = \frac{1}{N_P}\sum_{P\in\mathcal{P}}\;\sum_{P\mathbf{k}\in BZ} g(P\mathbf{k}) \tag{3.80}$$
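The symmetrization of Eq. (3.79), the equality of the BZ sums stated with Eq. (3.80), and the bookkeeping of multiplicities in the IBZ can be illustrated numerically. The sketch below makes several assumptions that are not in the text: it uses the 2D square lattice with its eight point-group operations, a uniform $N \times N$ grid of k-points labeled by integers $(i, j)$ with $k_x = 2\pi i/(Na)$ and $k_y = 2\pi j/(Na)$, and an arbitrary periodic test function `g` with no particular symmetry. All names are hypothetical.

```python
import math

# Point-group operations of the 2D square lattice (assumed example group)
OPS = [
    ((1, 0), (0, 1)), ((0, -1), (1, 0)), ((-1, 0), (0, -1)), ((0, 1), (-1, 0)),
    ((1, 0), (0, -1)), ((-1, 0), (0, 1)), ((0, 1), (1, 0)), ((0, -1), (-1, 0)),
]
N = 8  # grid points per direction (an arbitrary choice)

def image(U, i, j):
    """Grid indices of U k, folded back into the BZ (indices modulo N)."""
    return ((U[0][0] * i + U[0][1] * j) % N, (U[1][0] * i + U[1][1] * j) % N)

def g(i, j):
    """A periodic test function of k without point-group symmetry."""
    kx, ky = 2 * math.pi * i / N, 2 * math.pi * j / N
    return 1.0 + math.cos(kx) + 0.3 * math.sin(ky) + 0.2 * math.cos(kx + 2 * ky)

def f(i, j):
    """Symmetrized function: average of g over the images P k."""
    return sum(g(*image(U, i, j)) for U in OPS) / len(OPS)

bz = [(i, j) for i in range(N) for j in range(N)]
sum_g = sum(g(i, j) for i, j in bz)
sum_f = sum(f(i, j) for i, j in bz)

# One representative per set of symmetry-equivalent points, with its
# multiplicity as the weight. The representative is the smallest index pair,
# which need not lie in the conventional IBZ wedge but is equivalent to it.
weights = {}
for i, j in bz:
    representative = min(image(U, i, j) for U in OPS)
    weights[representative] = weights.get(representative, 0) + 1
sum_irreducible = sum(w * f(i, j) for (i, j), w in weights.items())

print(len(weights), sum_g, sum_f, sum_irreducible)
```

The weighted sum over the irreducible points requires far fewer evaluations of `f` than the sum over the full grid, which is the saving described in the text.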

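In Eq. (3.80) we have used the fact that summation over $\mathbf{k} \in BZ$ is the same as summation over $P\mathbf{k} \in BZ$. Now, summation over $P\mathbf{k} \in BZ$ of $g(P\mathbf{k})$ is the same for all $P$, so that doing this summation for all $P \in \mathcal{P}$ gives $N_P$ times the same result:

$$\frac{1}{N_P}\sum_{P\in\mathcal{P}}\;\sum_{P\mathbf{k}\in BZ} g(P\mathbf{k}) = \sum_{P\mathbf{k}\in BZ} g(P\mathbf{k}) = \sum_{\mathbf{k}\in BZ} g(\mathbf{k}) \tag{3.81}$$

Combining the results of the previous two equations, we obtain the desired result for the sums $\bar{f}$, $\bar{g}$:

$$\bar{f} = \frac{1}{N}\sum_{\mathbf{k}\in BZ} f(\mathbf{k}) = \frac{1}{N}\sum_{\mathbf{k}\in BZ} g(\mathbf{k}) = \bar{g} \tag{3.82}$$

We can expand the function $f(\mathbf{k})$ in a Fourier expansion with coefficients $\tilde{f}(\mathbf{R})$ as follows:

$$f(\mathbf{k}) = \sum_{\mathbf{R}} \tilde{f}(\mathbf{R})\, e^{-i\mathbf{k}\cdot\mathbf{R}} \;\rightarrow\; \tilde{f}(\mathbf{R}) = \frac{1}{N}\sum_{\mathbf{k}\in BZ} f(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{R}} \tag{3.83}$$

which gives for the sum $\bar{f}$

$$\bar{f} = \frac{1}{N}\sum_{\mathbf{k}\in BZ}\sum_{\mathbf{R}} \tilde{f}(\mathbf{R})\, e^{-i\mathbf{k}\cdot\mathbf{R}} = \frac{1}{N}\sum_{\mathbf{R}} \tilde{f}(\mathbf{R}) \sum_{\mathbf{k}\in BZ} e^{-i\mathbf{k}\cdot\mathbf{R}} \tag{3.84}$$

Now we can use the $\delta$-function relations that result from summing the complex exponential $\exp(\pm i\mathbf{k}\cdot\mathbf{R})$ over real-space lattice vectors or reciprocal-space vectors within the BZ, which are proven in Appendix G, to simplify Eq. (3.84). With these relations, the sum over $\mathbf{k}$ of the exponential equals $N\delta(\mathbf{R})$ and $\bar{f}$ takes the form

$$\bar{f} = \frac{1}{N}\sum_{\mathbf{R}} \tilde{f}(\mathbf{R})\, N\delta(\mathbf{R}) = \tilde{f}(0) = \bar{g} = \frac{1}{N}\sum_{\mathbf{k}\in BZ} g(\mathbf{k}) \tag{3.85}$$

so that we have found the sum of $g(\mathbf{k})$ in the BZ to be equal to the Fourier coefficient $\tilde{f}(\mathbf{R})$ of $f(\mathbf{k})$ evaluated at $\mathbf{R} = 0$. But we also have

$$f(\mathbf{k}) = \sum_{\mathbf{R}} \tilde{f}(\mathbf{R})\, e^{-i\mathbf{k}\cdot\mathbf{R}} = \tilde{f}(0) + \sum_{\mathbf{R}\neq 0} \tilde{f}(\mathbf{R})\, e^{-i\mathbf{k}\cdot\mathbf{R}} \tag{3.86}$$

We would like to find a value of $\mathbf{k}_0$ that makes the second term on the right side of this equation vanishingly small, because then

$$\sum_{\mathbf{R}\neq 0} \tilde{f}(\mathbf{R})\, e^{-i\mathbf{k}_0\cdot\mathbf{R}} \approx 0 \;\Rightarrow\; \bar{g} = \tilde{f}(0) \approx f(\mathbf{k}_0) \tag{3.87}$$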
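The idea behind Eq. (3.87) can be seen in the simplest possible setting. The following sketch is a toy example and an assumption of this illustration, not a case worked out in the text: a 1D lattice with lattice constant $a = 1$, and a test function $g(k)$ that is already even in $k$ (so that $f = g$) with hand-picked Fourier coefficients. For this lattice the point $k_0 = \pi/2a$ makes the sum of $e^{-ik_0 R}$ over $R = \pm a$ vanish, while the sum over $R = \pm 2a$ does not.

```python
import math

A = 1.0  # lattice constant of the 1D toy lattice (assumed)

def g(k):
    """Even periodic test function with coefficients chosen by hand:
    constant term 2.0, first-shell term 0.8, second-shell term 0.05."""
    return 2.0 + 0.8 * math.cos(k * A) + 0.05 * math.cos(2 * k * A)

def bz_average(func, n=64):
    """Average of func over a uniform grid of n points in the first BZ."""
    ks = [(-math.pi + 2 * math.pi * (i + 0.5) / n) / A for i in range(n)]
    return sum(func(k) for k in ks) / n

k0 = math.pi / (2 * A)                  # candidate special point
first_shell = 2 * math.cos(k0 * A)      # sum of exp(-i k0 R) over R = +a, -a
second_shell = 2 * math.cos(2 * k0 * A) # same sum over R = +2a, -2a

exact = bz_average(g)
estimate = g(k0)
print(exact, estimate, first_shell, second_shell)
```

The single evaluation at `k0` misses the exact average only by the contribution of the second-shell term, which is small because the coefficients were chosen to decay with distance.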

The value $\mathbf{k}_0$ is a special point which allows us to approximate the sum of $g(\mathbf{k})$ over the entire BZ by calculating a single value of $f(\mathbf{k})$! In practice it is not possible to make the second term in Eq. (3.86) identically zero, so we must find a reasonable way of determining values of the special k-point. To this end we notice that

$$\tilde{f}(P\mathbf{R}) = \frac{1}{N}\sum_{\mathbf{k}\in BZ} f(\mathbf{k})\, e^{i\mathbf{k}\cdot(P\mathbf{R})} = \frac{1}{N}\sum_{\mathbf{k}\in BZ} f(\mathbf{k})\, e^{i(P^{-1}\mathbf{k})\cdot\mathbf{R}} = \frac{1}{N}\sum_{\mathbf{k}\in BZ} f(P^{-1}\mathbf{k})\, e^{i(P^{-1}\mathbf{k})\cdot\mathbf{R}} = \tilde{f}(\mathbf{R}) \tag{3.88}$$

where we have taken advantage of the fact that $f(\mathbf{k}) = f(P\mathbf{k})$ for any operation $P \in \mathcal{P}$; we have also used the expression for the inverse Fourier transform $\tilde{f}(\mathbf{R})$ from Eq. (3.83) (see also the discussion of Fourier transforms in Appendix G). Then for all $\mathbf{R}$ of the same magnitude which are connected by $P$ operations we will have $\tilde{f}(\mathbf{R}) = \tilde{f}(|\mathbf{R}|)$, and consequently we can break the sum over $\mathbf{R}$ values into a sum over $|\mathbf{R}|$ and a sum over $P \in \mathcal{P}$ which connect $\mathbf{R}$ values of the same magnitude:

$$f(\mathbf{k}_0) = \tilde{f}(0) + \sum_{|\mathbf{R}|} \tilde{f}(|\mathbf{R}|) \left[\sum_{P\in\mathcal{P}} e^{-i\mathbf{k}_0\cdot(P\mathbf{R})}\right]_{|\mathbf{R}|} \tag{3.89}$$

where the summation inside the square brackets is done at constant $|\mathbf{R}|$. These sums are called "shells of $\mathbf{R}$", and their definition depends on the Bravais lattice. This last equation gives us a practical means of determining $\mathbf{k}_0$: it must make as many shells of $\mathbf{R}$ vanish as possible. Typically $\tilde{f}(|\mathbf{R}|)$ falls fast as its argument increases, so it is only necessary to make sure that the first few shells of $\mathbf{R}$ vanish for a given $\mathbf{k}_0$, to make it a reasonable candidate for a special k-point. A generalization of this is to consider a set of special k-points for which the first few shells of $\mathbf{R}$ vanish, in which case we need to evaluate $f(\mathbf{k})$ at all of these points to obtain a good approximation for $\bar{g}$ (for details see Refs. [35, 36]).

Further reading

1. Group Theory and Quantum Mechanics, M. Tinkham (McGraw-Hill, New York, 1964).
2. Group Theory in Quantum Mechanics, V. Heine (Pergamon Press, New York, 1960).
3. The Structure of Materials, S.M. Allen and E.L. Thomas (J. Wiley, New York, 1998). This book contains a detailed discussion of crystal symmetries as they apply to different materials.

Problems

1. (a) Calculate the reciprocal lattice vectors for the crystal lattices given in Table 3.1.
   (b) Determine the lattice vectors, the positions of atoms in the primitive unit cell, and the reciprocal lattice vectors, for the NaCl, CsCl, and zincblende crystal structures discussed in chapter 1.
2. (a) Show that the single-particle equation obeyed by $u_{\mathbf{k}}(\mathbf{r})$, the part of the wavefunction which has full translational symmetry, is Eq. (3.20).
   (b) Show that the coefficients $\alpha_{\mathbf{k}}(\mathbf{G})$ in the Fourier expansion of $u_{\mathbf{k}}(\mathbf{r})$ obey Eq. (3.37).
3. (a) Show that the time-reversal operator for a spin-$\frac{1}{2}$ particle can be chosen to be $T = i\sigma_y C$, where $C$ is complex conjugation and $\sigma_y$ is a Pauli spin matrix (defined in Appendix B).
   (b) Kramers degeneracy: show that if the hamiltonian is time-reversal invariant, then the following relations hold for the energy and wavefunction of single-particle states:

   $$\epsilon^{(n)}_{\downarrow -\mathbf{k}} = \epsilon^{(n)}_{\uparrow \mathbf{k}}, \qquad \psi^{(n)}_{\downarrow -\mathbf{k}}(\mathbf{r}) = i\sigma_y C\, \psi^{(n)}_{\uparrow \mathbf{k}}(\mathbf{r}) \tag{3.90}$$

4. Use second order perturbation theory to derive the expression for the energy at wave-vector $\mathbf{k} + \mathbf{q}$ in terms of the energy and wavefunctions at wave-vector $\mathbf{k}$, as given in Eq. (3.43). From that, derive the expression for the inverse effective mass, given in Eq. (3.46).
5. Show that the only rotations compatible with 3D periodicity are multiples of $2\pi/4$ and $2\pi/6$.
6. Find the symmetries and construct the group multiplication table for a 2D square lattice model, with two atoms per unit cell, at positions $\mathbf{t}_1 = 0$, $\mathbf{t}_2 = 0.5a\hat{x} + 0.3a\hat{y}$, where $a$ is the lattice constant. Is this group symmorphic or non-symmorphic?
7. Draw the occupied Brillouin Zones for the 2D square lattice with $n = 2$ electrons per unit cell, by analogy to Fig. 3.6.
8. Consider the 2D honeycomb lattice with two atoms per unit cell, defined by the lattice vectors:

   $$\mathbf{a}_1 = \frac{\sqrt{3}}{2}a\hat{x} - \frac{1}{2}a\hat{y}, \qquad \mathbf{a}_2 = \frac{\sqrt{3}}{2}a\hat{x} + \frac{1}{2}a\hat{y} \tag{3.91}$$

   (these form the 2D hexagonal lattice), and atomic positions $\mathbf{t}_1 = 0$, $\mathbf{t}_2 = \frac{1}{\sqrt{3}}(a\hat{x})$. Determine all the symmetries of this lattice, and construct its group multiplication table. Draw the Irreducible Brillouin Zone and indicate the high-symmetry points and the symmetry operations for each. Draw the first seven bands for the high-symmetry points and find the Fermi level for a system in which every atom has two valence electrons.
9. Find a special k-point for the 2D square lattice and the 2D hexagonal lattice. How many shells can you make vanish with a single special k-point in each case?

4 Band structure of crystals

In the previous two chapters we examined in detail the effects of crystal periodicity and crystal symmetry on the eigenvalues and wavefunctions of the single-particle equations. The models we used to illustrate these effects were artificial free-electron models, where the only effect of the presence of the lattice is to impose the symmetry restrictions on the eigenvalues and eigenfunctions. We also saw how a weak periodic potential can split the degeneracies of certain eigenvalues at the Bragg planes (the BZ edges). In realistic situations the potential is certainly not zero, as in the free-electron model, nor is it necessarily weak. Our task here is to develop methods for determining the solutions to the single-particle equations for realistic systems. We will do this by discussing first the so-called tight-binding approximation, which takes us in the most natural way from electronic states that are characteristic of atoms (atomic orbitals) to states that correspond to crystalline solids. We will then discuss briefly more general methods for obtaining the band structure of solids, whose application typically involves a large computational effort. Finally, we will conclude the chapter by discussing the electronic structure of several representative crystals, as obtained by elaborate computational methods; we will also attempt to interpret these results in the context of the tight-binding approximation.

4.1 The tight-binding approximation

The simplest method for calculating band structures, both conceptually and computationally, is the so-called Tight-Binding Approximation (TBA), also referred to as Linear Combination of Atomic Orbitals (LCAO). The latter term is actually used in a wider sense, as we will explain below. The basic assumption in the TBA is that we can use orbitals that are very similar to atomic states (i.e. wavefunctions tightly bound to the atoms, hence the term "tight-binding") as a basis for expanding the crystal wavefunctions. We will deal with the general theory of the TBA first, and then we will illustrate how it is applied through a couple of examples.

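Suppose then that we start with a set of atomic wavefunctions

$$\phi_l(\mathbf{r} - \mathbf{t}_i) \tag{4.1}$$

where $\mathbf{t}_i$ is the position of the atom with label $i$ in the PUC, and $\phi_l(\mathbf{r})$ is one of the atomic states associated with this atom. The index $l$ can take the usual values for an atom, that is, the angular momentum character $s, p, d, \ldots$. The state $\phi_l(\mathbf{r} - \mathbf{t}_i)$ is centered at the position of the atom with index $i$. It is assumed that we need as many orbitals as the number of valence states in the atom (this is referred to as the "minimal basis").

Our first task is to construct states which can be used as the basis for expansion of the crystal wavefunctions. These states must obey Bloch's theorem, and we call them $\chi_{\mathbf{k}li}(\mathbf{r})$:

$$\chi_{\mathbf{k}li}(\mathbf{r}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{R}'} e^{i\mathbf{k}\cdot\mathbf{R}'} \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}') \tag{4.2}$$

with the summation running over all the $N$ unit cells in the crystal (the vectors $\mathbf{R}'$), for a given pair of indices $i$ (used to denote the position $\mathbf{t}_i$ of the atom in the PUC) and $l$ (used for the type of orbital). We first verify that these states have Bloch character:

$$\begin{aligned}
\chi_{\mathbf{k}li}(\mathbf{r} + \mathbf{R}) &= \frac{1}{\sqrt{N}} \sum_{\mathbf{R}'} e^{i\mathbf{k}\cdot(\mathbf{R}' - \mathbf{R})} e^{i\mathbf{k}\cdot\mathbf{R}} \phi_l((\mathbf{r} + \mathbf{R}) - \mathbf{t}_i - \mathbf{R}') \\
&= e^{i\mathbf{k}\cdot\mathbf{R}} \frac{1}{\sqrt{N}} \sum_{\mathbf{R}'} e^{i\mathbf{k}\cdot(\mathbf{R}' - \mathbf{R})} \phi_l(\mathbf{r} - \mathbf{t}_i - (\mathbf{R}' - \mathbf{R})) \\
&= e^{i\mathbf{k}\cdot\mathbf{R}} \frac{1}{\sqrt{N}} \sum_{\mathbf{R}''} e^{i\mathbf{k}\cdot\mathbf{R}''} \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}'') = e^{i\mathbf{k}\cdot\mathbf{R}} \chi_{\mathbf{k}li}(\mathbf{r})
\end{aligned} \tag{4.3}$$

that is, Bloch's theorem is satisfied for our choice of $\chi_{\mathbf{k}li}(\mathbf{r})$, with the obvious definition $\mathbf{R}'' = \mathbf{R}' - \mathbf{R}$, which is another lattice vector. Now we can expand the crystal single-particle eigenstates in this basis:

$$\psi_{\mathbf{k}}^{(n)}(\mathbf{r}) = \sum_{l,i} c_{\mathbf{k}li}^{(n)} \chi_{\mathbf{k}li}(\mathbf{r}) \tag{4.4}$$

and all that remains to do is determine the coefficients $c_{\mathbf{k}li}^{(n)}$, assuming that the $\psi_{\mathbf{k}}^{(n)}(\mathbf{r})$ are solutions to the appropriate single-particle equation:

$$H^{\rm sp} \psi_{\mathbf{k}}^{(n)}(\mathbf{r}) = \epsilon_{\mathbf{k}}^{(n)} \psi_{\mathbf{k}}^{(n)}(\mathbf{r}) \;\Rightarrow\; \sum_{l,i} \left[ \langle \chi_{\mathbf{k}mj} | H^{\rm sp} | \chi_{\mathbf{k}li} \rangle - \epsilon_{\mathbf{k}}^{(n)} \langle \chi_{\mathbf{k}mj} | \chi_{\mathbf{k}li} \rangle \right] c_{\mathbf{k}li}^{(n)} = 0 \tag{4.5}$$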
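The Bloch property verified in Eq. (4.3) can also be checked numerically. The following is a minimal sketch for a 1D lattice, under assumptions that are not part of the text: the orbital is modeled by a Gaussian, the lattice constant is set to 1, the normalization factor is dropped, and the lattice sum is truncated at a finite number of cells (which is harmless here because the model orbital decays quickly). The names `phi_s`, `bloch_sum` and `bloch_error` are illustrative.

```python
import cmath
import math

A = 1.0        # lattice constant in arbitrary units (assumption of this sketch)
N_CELLS = 60   # cells kept on each side of the origin in the truncated lattice sum


def phi_s(x, width=0.3):
    """Model s-like orbital: a Gaussian (illustrative choice, not from the text)."""
    return math.exp(-(x / width) ** 2)


def bloch_sum(k, x, orbital=phi_s):
    """Unnormalized 1D analog of Eq. (4.2): sum_n exp(i k n a) * phi(x - n a)."""
    return sum(cmath.exp(1j * k * n * A) * orbital(x - n * A)
               for n in range(-N_CELLS, N_CELLS + 1))


def bloch_error(k, x, m=1):
    """|chi_k(x + m a) - exp(i k m a) * chi_k(x)|, which Eq. (4.3) requires to vanish."""
    lhs = bloch_sum(k, x + m * A)
    rhs = cmath.exp(1j * k * m * A) * bloch_sum(k, x)
    return abs(lhs - rhs)


if __name__ == "__main__":
    for k in (0.0, 0.25 * math.pi / A, 0.7 * math.pi / A):
        print(f"k = {k:.3f}: deviation from Bloch's theorem = {bloch_error(k, 0.37):.2e}")
```

The deviation is at the level of floating-point rounding for points near the middle of the truncated chain; it would grow for points close to the ends of the chain, where the truncation of the sum matters.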

In the above equation we only need to consider matrix elements of states with the same $\mathbf{k}$ index, because

$$\langle \psi_{\mathbf{k}}^{(n)} | \psi_{\mathbf{k}'}^{(n')} \rangle \sim \delta(\mathbf{k} - \mathbf{k}') \tag{4.6}$$

where we are restricting the values of $\mathbf{k}, \mathbf{k}'$ to the IBZ. In Eq. (4.5) we have a secular equation of size equal to the total number of atomic orbitals in the PUC: the sum is over the number of different types of atoms and the number of orbitals associated with each type of atom. This is exactly the number of solutions (bands) that we can expect at each k-point. In order to solve this linear system we need to be able to evaluate the following integrals:

$$\begin{aligned}
\langle \chi_{\mathbf{k}mj} | \chi_{\mathbf{k}li} \rangle &= \frac{1}{N} \sum_{\mathbf{R}',\mathbf{R}''} e^{i\mathbf{k}\cdot(\mathbf{R}' - \mathbf{R}'')} \langle \phi_m(\mathbf{r} - \mathbf{t}_j - \mathbf{R}'') | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}') \rangle \\
&= \frac{1}{N} \sum_{\mathbf{R},\mathbf{R}'} e^{i\mathbf{k}\cdot\mathbf{R}} \langle \phi_m(\mathbf{r} - \mathbf{t}_j) | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle \\
&= \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} \langle \phi_m(\mathbf{r} - \mathbf{t}_j) | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle
\end{aligned} \tag{4.7}$$

where we have used the obvious definition $\mathbf{R} = \mathbf{R}' - \mathbf{R}''$, and we have eliminated one of the sums over the lattice vectors with the factor $1/N$, since the summand in Eq. (4.7) has no explicit dependence on $\mathbf{R}'$. We call the brackets in the last expression the "overlap matrix elements" between atomic states. In a similar fashion we obtain:

$$\langle \chi_{\mathbf{k}mj} | H^{\rm sp} | \chi_{\mathbf{k}li} \rangle = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} \langle \phi_m(\mathbf{r} - \mathbf{t}_j) | H^{\rm sp} | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle \tag{4.8}$$

and we call the brackets on the right-hand side of Eq. (4.8) the "hamiltonian matrix elements" between atomic states. At this point we introduce an important approximation: in the spirit of the TBA, we take the overlap matrix elements in Eq. (4.7) to be non-zero only for the same orbitals on the same atom, i.e. only for $m = l$, $j = i$, $\mathbf{R} = 0$, which is expressed by the relation

$$\langle \phi_m(\mathbf{r} - \mathbf{t}_j) | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle = \delta_{lm} \delta_{ij} \delta(\mathbf{R}) \tag{4.9}$$

This is referred to as an "orthogonal basis", since any overlap between different orbitals on the same atom or orbitals on different atoms is taken to be zero.$^1$

(Footnote 1: If the overlap between the $\phi_m(\mathbf{r})$ orbitals was strictly zero, there would be no interactions between nearest neighbors; this is only a convenient approximation.)
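For the smallest non-trivial case, two basis states at a given $\mathbf{k}$, the secular equation of Eq. (4.5) reduces to $\det(H - \epsilon S) = 0$ for 2 × 2 hermitian matrices and can be solved with the quadratic formula. The sketch below is only an illustration of that structure; the function name `secular_2x2` and the numerical values in the usage lines are assumptions, not taken from the text. With the default arguments the overlap matrix is the unit matrix, which corresponds to the orthogonal basis of Eq. (4.9).

```python
import math


def secular_2x2(h11, h22, h12, s11=1.0, s22=1.0, s12=0.0):
    """Roots of det(H - e S) = 0 for 2x2 hermitian H and S (two basis states).

    h11, h22, s11, s22 are the real diagonal elements; h12 and s12 may be complex.
    Expanding the determinant gives a*e^2 + b*e + c = 0 with the coefficients below.
    S is assumed to be positive definite, so that a > 0.
    """
    a = s11 * s22 - abs(s12) ** 2
    b = -(h11 * s22 + h22 * s11) + 2.0 * (h12 * complex(s12).conjugate()).real
    c = h11 * h22 - abs(h12) ** 2
    disc = b * b - 4.0 * a * c
    root = math.sqrt(max(disc, 0.0))   # guard against tiny negative rounding errors
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)


if __name__ == "__main__":
    # Orthogonal basis: two levels at -8 and 0 coupled by an imaginary element 2i.
    print(secular_2x2(-8.0, 0.0, 2j))
    # Same hamiltonian elements with a small (hypothetical) overlap between the states.
    print(secular_2x2(-8.0, 0.0, -2.1, s12=0.2))
```

For larger bases the same equation is a generalized hermitian eigenvalue problem, for which a numerical linear-algebra library would normally be used instead of a closed form.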

Table 4.1. Equations that define the TBA model. The first three equations are general, based on the atomic orbitals $\phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R})$ of type $l$ centered at an atom situated at the position $\mathbf{t}_i$ of the unit cell with lattice vector $\mathbf{R}$. The last three correspond to an orthogonal basis of orbitals and nearest neighbor interactions only, which define the on-site and hopping matrix elements of the hamiltonian.

Bloch basis: $\chi_{\mathbf{k}li}(\mathbf{r}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R})$
Crystal states: $\psi_{\mathbf{k}}^{(n)}(\mathbf{r}) = \sum_{l,i} c_{\mathbf{k}li}^{(n)} \chi_{\mathbf{k}li}(\mathbf{r})$
Secular equation: $\sum_{l,i} \left[ \langle \chi_{\mathbf{k}mj} | H^{\rm sp} | \chi_{\mathbf{k}li} \rangle - \epsilon_{\mathbf{k}}^{(n)} \langle \chi_{\mathbf{k}mj} | \chi_{\mathbf{k}li} \rangle \right] c_{\mathbf{k}li}^{(n)} = 0$
Orthogonal orbitals: $\langle \phi_m(\mathbf{r} - \mathbf{t}_j) | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle = \delta_{lm} \delta_{ij} \delta(\mathbf{R})$
On-site elements: $\langle \phi_m(\mathbf{r} - \mathbf{t}_j) | H^{\rm sp} | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle = \delta_{lm} \delta_{ij} \delta(\mathbf{R}) \epsilon_l$
Hopping elements: $\langle \phi_m(\mathbf{r} - \mathbf{t}_j) | H^{\rm sp} | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle = \delta((\mathbf{t}_j - \mathbf{t}_i - \mathbf{R}) - \mathbf{d}_{nn}) V_{lm,ij}$

Similarly, we will take the hamiltonian matrix elements in Eq. (4.8) to be non-zero only if the orbitals are on the same atom, i.e. for $j = i$, $\mathbf{R} = 0$, which are referred to as the "on-site energies":

$$\langle \phi_m(\mathbf{r} - \mathbf{t}_j) | H^{\rm sp} | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle = \delta_{lm} \delta_{ij} \delta(\mathbf{R}) \epsilon_l \tag{4.10}$$

or, if the orbitals are on different atoms but situated at nearest neighbor sites, denoted in general as $\mathbf{d}_{nn}$:

$$\langle \phi_m(\mathbf{r} - \mathbf{t}_j) | H^{\rm sp} | \phi_l(\mathbf{r} - \mathbf{t}_i - \mathbf{R}) \rangle = \delta((\mathbf{t}_j - \mathbf{t}_i - \mathbf{R}) - \mathbf{d}_{nn}) V_{lm,ij} \tag{4.11}$$

The $V_{lm,ij}$ are also referred to as "hopping" matrix elements. When the nearest neighbors are in the same unit cell, $\mathbf{R}$ can be zero; when they are across unit cells $\mathbf{R}$ can be one of the primitive lattice vectors. The equations that define the TBA model, with the approximation of an orthogonal basis and nearest neighbor interactions only, are summarized in Table 4.1. Even with this drastic approximation, we still need to calculate the values of the matrix elements that we have kept. The parametrization of the hamiltonian matrix in an effort to produce a method with quantitative capabilities has a long history, starting with the work of Harrison [37] (see the Further reading section), and continues to be an active area of research. In principle, these matrix elements can be calculated using one of the single-particle hamiltonians we have discussed in chapter 2 (this approach is being actively pursued as a means of performing fast and reliable electronic structure calculations [38]). However, it is often more convenient to consider these matrix elements as parameters, which are fitted to reproduce certain properties and can then be used to calculate other properties of the solid (see, for instance, Refs. [39, 40]). We illustrate

these concepts through two simple examples, the first concerning a 1D lattice, the second a 2D lattice of atoms.

4.1.1 Example: 1D linear chain with s or p orbitals

We consider first the simplest possible case, a linear periodic chain of atoms. Our system has only one type of atom and only one orbital associated with each atom. The first task is to construct the basis for the crystal wavefunctions using the atomic wavefunctions, as was done for the general case in Eq. (4.2). We notice that because of the simplicity of the model, there are no summations over the indices $l$ (there is only one type of orbital for each atom) and $i$ (there is only one atom per unit cell). We keep the index $l$ to identify different types of orbitals in our simple model. Therefore, the basis for the crystal wavefunctions in this case will be simply

$$\chi_{kl}(x) = \sum_{n=-\infty}^{\infty} e^{ikna} \phi_l(x - na) \tag{4.12}$$

where we have further simplified the notation since we are dealing with a 1D example, with the position vector $\mathbf{r}$ set equal to the position $x$ on the 1D axis and the reciprocal-space vector $\mathbf{k}$ set equal to $k$, while the lattice vectors $\mathbf{R}$ are given by $na$, with $a$ the lattice constant and $n$ an integer. We will consider atomic wavefunctions $\phi_l(x)$ which have either $s$-like or $p$-like character. The real parts of the wavefunction $\chi_{kl}(x)$, $l = s, p$ for a few values of $k$ are shown in Fig. 4.1.

With these states, we can now attempt to calculate the band structure for this model. The TBA with an orthogonal basis and nearest neighbor interactions only implies that the overlap matrix elements are non-zero only for orbitals $\phi_l(x)$ on the same atom, that is,

$$\langle \phi_l(x) | \phi_l(x - na) \rangle = \delta_{n0} \tag{4.13}$$

Similarly, nearest neighbor interactions require that the hamiltonian matrix elements are non-zero only for orbitals that are on the same or neighboring atoms. If the orbitals are on the same atom, then we define the hamiltonian matrix element to be

$$\langle \phi_l(x) | H^{\rm sp} | \phi_l(x - na) \rangle = \epsilon_l \delta_{n0} \tag{4.14}$$

while if they are on neighboring atoms, that is $n = \pm 1$, we define the hamiltonian matrix element to be

$$\langle \phi_l(x) | H^{\rm sp} | \phi_l(x - na) \rangle = t_l \delta_{n,\pm 1} \tag{4.15}$$

where $\epsilon_l$ is the on-site hamiltonian matrix element and $t_l$ is the hopping matrix element. We expect this interaction between orbitals on neighboring atoms to contribute to the cohesion of the solid, which implies that $t_s < 0$ for $s$-like orbitals and $t_p > 0$ for $p$-like orbitals, as we explain in more detail below.

Figure 4.1. Real parts of the crystal wavefunctions $\chi_{kl}(x)$ for $k = 0, 0.25, 0.5, 0.75$ and $1$ (in units of $\pi/a$). Top five: $s$-like state; bottom five: $p$-like state. The solid dots represent the atoms in the one-dimensional chain; the dashed lines represent the term $\cos(kx)$, which determines the value of the phase factor when evaluated at the atomic sites $x = na$.
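The on-site and hopping elements of Eqs. (4.14) and (4.15) already fix the hamiltonian of the chain in the basis of atomic orbitals. As a hedged illustration, the sketch below builds this matrix for a finite ring of atoms (a periodic chain, an assumption made here so that the number of sites is finite) and checks that the coefficients $c_n = e^{ikna}$ form an eigenvector with energy $\epsilon_l + 2t_l\cos(ka)$, which is the form derived analytically in Eq. (4.17) below. The function names and the numerical values of $\epsilon_l$ and $t_l$ are illustrative choices.

```python
import cmath
import math


def ring_hamiltonian(n_sites, eps, t):
    """Nearest-neighbor hamiltonian of a periodic chain (ring) with n_sites >= 3 atoms.

    Diagonal entries are the on-site energy eps (Eq. (4.14)); entries that link
    neighboring sites are the hopping element t (Eq. (4.15)).
    """
    h = [[0.0] * n_sites for _ in range(n_sites)]
    for n in range(n_sites):
        h[n][n] = eps
        h[n][(n + 1) % n_sites] = t
        h[n][(n - 1) % n_sites] = t
    return h


def bloch_residual(h, k, a=1.0):
    """Return (energy, residual) for the trial vector c_n = exp(i k n a).

    energy is <c|H|c>/<c|c>; residual is max_n |(H c)_n - energy * c_n|,
    which vanishes when c is an exact eigenvector of h.
    """
    n_sites = len(h)
    c = [cmath.exp(1j * k * n * a) for n in range(n_sites)]
    hc = [sum(h[m][n] * c[n] for n in range(n_sites)) for m in range(n_sites)]
    energy = (sum(c[m].conjugate() * hc[m] for m in range(n_sites)) / n_sites).real
    residual = max(abs(hc[m] - energy * c[m]) for m in range(n_sites))
    return energy, residual


if __name__ == "__main__":
    eps_l, t_l, n_sites = -8.0, -2.0, 12      # illustrative values
    h = ring_hamiltonian(n_sites, eps_l, t_l)
    for m in range(n_sites):
        k = 2.0 * math.pi * m / n_sites       # k-values compatible with the ring
        energy, residual = bloch_residual(h, k)
        print(f"ka = {k:6.3f}  energy = {energy:8.4f}  residual = {residual:.1e}")
```

Only the values $k = 2\pi m/(Na)$ are compatible with the periodic boundary condition of a ring of $N$ sites; for other values of $k$ the trial vector is not an eigenvector and the residual is non-zero.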

We are now ready to use the $\chi_{kl}(x)$ functions as the basis to construct crystal wavefunctions and with these calculate the single-particle energy eigenvalues, that is, the band structure of the model. The crystal wavefunctions are obtained from the general expression Eq. (4.4):

$$\psi_k(x) = c_k \chi_{kl}(x) \tag{4.16}$$

where only the index $k$ has survived due to the simplicity of the model (the index $l$ simply denotes the character of the atomic orbitals). Inserting these wavefunctions into the secular equation, Eq. (4.5), we find that we have to solve a 1 × 1 matrix, because we have only one orbital per atom and one atom per unit cell. With the above definitions of the hamiltonian matrix elements between the atomic orbitals $\phi_l$, we obtain

$$\begin{aligned}
&\left[ \langle \chi_{kl}(x) | H^{\rm sp} | \chi_{kl}(x) \rangle - \epsilon_k \langle \chi_{kl}(x) | \chi_{kl}(x) \rangle \right] c_k = 0 \\
&\Rightarrow \sum_n e^{ikna} \langle \phi_l(x) | H^{\rm sp} | \phi_l(x - na) \rangle = \epsilon_k \sum_n e^{ikna} \langle \phi_l(x) | \phi_l(x - na) \rangle \\
&\Rightarrow \sum_n e^{ikna} \left[ \epsilon_l \delta_{n0} + t_l \delta_{n,\pm 1} \right] = \epsilon_k \sum_n e^{ikna} \delta_{n0}
\end{aligned}$$

The solution to the last equation is straightforward, giving the energy band for this simple model:

$$\text{1D chain}: \quad \epsilon_k = \epsilon_l + 2 t_l \cos(ka) \tag{4.17}$$

The behavior of the energy in the first BZ of the model, that is, for $-\pi/a \le k \le \pi/a$, is shown in Fig. 4.2 for the $s$ and $p$ orbitals. Since the coefficient $c_k$ is undefined by the secular equation, we can take it to be unity, in which case the crystal wavefunctions $\psi_k(x)$ are the same as the basis functions $\chi_{kl}(x)$, which we have already discussed above (see Fig. 4.1).

We elaborate briefly on the sign of the hopping matrix elements and the dispersion of the bands. It is assumed that the single-particle hamiltonian is spherically symmetric. The $s$ orbitals are spherically symmetric and have everywhere the same sign,$^2$ so that the overlap between $s$ orbitals situated at nearest neighbor sites is positive. In order to produce an attractive interaction between these orbitals, the hopping matrix element must be negative:

$$t_s \equiv \int \phi_s^*(x) H^{\rm sp}(x) \phi_s(x - a) \, {\rm d}x < 0.$$

(Footnote 2: We are concerned here with the sign implied only by the angular momentum character of the wavefunction and not by the radial part; of the latter part, only the tail beyond the core is involved, as discussed in chapter 2.)

[Figure 4.2: two panels showing $\epsilon_k$ as a function of $ka$ for $-\pi \le ka \le \pi$. In the left panel the band runs between $\epsilon_s + 2t_s$ and $\epsilon_s - 2t_s$; in the right panel it runs between $\epsilon_p - 2t_p$ and $\epsilon_p + 2t_p$.]

Figure 4.2. Single-particle energy eigenvalues $\epsilon_k$ in the first BZ ($-\pi \le ka \le \pi$), for the 1D infinite chain model, with one atom per unit cell and one orbital per atom, in the tight-binding approximation with nearest neighbor interactions. Left: $s$-like state; right: $p$-like state. $\epsilon_s, \epsilon_p$ are the on-site hamiltonian matrix elements and $t_s < 0$ and $t_p > 0$ are the hopping matrix elements. The sketches at the bottom of each panel illustrate the arrangement of the orbitals in the 1D lattice with the positions of the atoms shown as small black dots and the positive and negative lobes of the $p$ orbitals shown in white and black. Due to larger overlap we expect $|t_s| < |t_p|$, which leads to larger dispersion for the $p$ band.

We conclude that the negative sign of this matrix element is due to the hamiltonian, since the product of the wavefunctions is positive. On the other hand, the $p$ orbitals have a positive and a negative lobe (see Appendix B); consequently, the overlap between $p$ orbitals situated at nearest neighbor sites and oriented in the same sense, as required by translational periodicity, is negative, because the positive lobe of one is closest to the negative lobe of the next. Therefore, in order to produce an attractive interaction between these orbitals, and since the hamiltonian is the same as in the previous case, the hopping matrix element must be positive

$$t_p \equiv \int \phi_p^*(x) H^{\rm sp}(x) \phi_p(x - a) \, {\rm d}x > 0.$$

Thus, the band structure for a 1D model with one $s$-like orbital per unit cell will have a maximum at $k = \pm\pi/a$ and a minimum at $k = 0$, while that of the $p$-like orbital will have the positions of the extrema reversed, as shown in Fig. 4.2. Moreover, we expect that in general there will be larger overlap between the neighboring $p$ orbitals than between the $s$ orbitals, due to the directed lobes of the former, and therefore $|t_p| > |t_s|$, leading to larger dispersion for the $p$ bands.

The generalization of the model to a two-dimensional square lattice with either one $s$-like orbital or one $p$-like orbital per atom and one atom per unit cell is

straightforward; the energy eigenvalues are given by

$$\text{2D square}: \quad \epsilon_{\mathbf{k}} = \epsilon_l + 2 t_l \left[ \cos(k_x a) + \cos(k_y a) \right] \tag{4.18}$$

with the two-dimensional reciprocal-space vector defined as $\mathbf{k} = k_x \hat{\mathbf{x}} + k_y \hat{\mathbf{y}}$. Similarly, the generalization to the three-dimensional cubic lattice with either one $s$-like orbital or one $p$-like orbital per atom and one atom per unit cell leads to the energy eigenvalues:

$$\text{3D cube}: \quad \epsilon_{\mathbf{k}} = \epsilon_l + 2 t_l \left[ \cos(k_x a) + \cos(k_y a) + \cos(k_z a) \right] \tag{4.19}$$

where $\mathbf{k} = k_x \hat{\mathbf{x}} + k_y \hat{\mathbf{y}} + k_z \hat{\mathbf{z}}$ is the three-dimensional reciprocal-space vector. From these expressions, we can immediately deduce that for this simple model the band width of the energy eigenvalues is given by

$$W = 4d |t_l| = 2z |t_l| \tag{4.20}$$

where $d$ is the dimensionality of the model ($d = 1, 2, 3$ in the above examples), or, equivalently, $z$ is the number of nearest neighbors ($z = 2, 4, 6$ in the above examples). We will use this fact in chapter 12, in relation to disorder-induced localization of electronic states.

4.1.2 Example: 2D square lattice with s and p orbitals

We next consider a slightly more complex case, the two-dimensional square lattice with one atom per unit cell. We assume that there are four atomic orbitals per atom, one $s$-type and three $p$-type ($p_x, p_y, p_z$). We work again within the orthogonal basis of orbitals and nearest neighbor interactions only, as described by the equations of Table 4.1. The overlap matrix elements in this case are

$$\begin{aligned}
&\langle \phi_m(\mathbf{r}) | \phi_l(\mathbf{r} - \mathbf{R}) \rangle = \delta_{lm} \delta(\mathbf{R}) \\
&\Rightarrow \langle \chi_{\mathbf{k}m} | \chi_{\mathbf{k}l} \rangle = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} \langle \phi_m(\mathbf{r}) | \phi_l(\mathbf{r} - \mathbf{R}) \rangle = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} \delta_{lm} \delta(\mathbf{R}) = \delta_{lm}
\end{aligned} \tag{4.21}$$

while the hamiltonian matrix elements are

$$\begin{aligned}
&\langle \phi_m(\mathbf{r}) | H^{\rm sp} | \phi_l(\mathbf{r} - \mathbf{R}) \rangle \neq 0 \ \text{only for} \ [\mathbf{R} = \pm a\hat{\mathbf{x}}, \pm a\hat{\mathbf{y}}, 0] \\
&\Rightarrow \langle \chi_{\mathbf{k}m} | H^{\rm sp} | \chi_{\mathbf{k}l} \rangle = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} \langle \phi_m(\mathbf{r}) | H^{\rm sp} | \phi_l(\mathbf{r} - \mathbf{R}) \rangle \neq 0 \ \text{only for} \ [\mathbf{R} = \pm a\hat{\mathbf{x}}, \pm a\hat{\mathbf{y}}, 0]
\end{aligned} \tag{4.22}$$

There are a number of different on-site and hopping matrix elements that are generated from all the possible combinations of $\phi_m(\mathbf{r})$ and $\phi_l(\mathbf{r})$ in Eq. (4.22),

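which we define as follows:

$$\begin{aligned}
\epsilon_s &= \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r}) \rangle \\
\epsilon_p &= \langle \phi_{p_x}(\mathbf{r}) | H^{\rm sp} | \phi_{p_x}(\mathbf{r}) \rangle = \langle \phi_{p_y}(\mathbf{r}) | H^{\rm sp} | \phi_{p_y}(\mathbf{r}) \rangle = \langle \phi_{p_z}(\mathbf{r}) | H^{\rm sp} | \phi_{p_z}(\mathbf{r}) \rangle \\
V_{ss} &= \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r} \pm a\hat{\mathbf{x}}) \rangle = \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r} \pm a\hat{\mathbf{y}}) \rangle \\
V_{sp} &= \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_{p_x}(\mathbf{r} - a\hat{\mathbf{x}}) \rangle = -\langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_{p_x}(\mathbf{r} + a\hat{\mathbf{x}}) \rangle \\
V_{sp} &= \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_{p_y}(\mathbf{r} - a\hat{\mathbf{y}}) \rangle = -\langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_{p_y}(\mathbf{r} + a\hat{\mathbf{y}}) \rangle \\
V_{pp\sigma} &= \langle \phi_{p_x}(\mathbf{r}) | H^{\rm sp} | \phi_{p_x}(\mathbf{r} \pm a\hat{\mathbf{x}}) \rangle = \langle \phi_{p_y}(\mathbf{r}) | H^{\rm sp} | \phi_{p_y}(\mathbf{r} \pm a\hat{\mathbf{y}}) \rangle \\
V_{pp\pi} &= \langle \phi_{p_y}(\mathbf{r}) | H^{\rm sp} | \phi_{p_y}(\mathbf{r} \pm a\hat{\mathbf{x}}) \rangle = \langle \phi_{p_x}(\mathbf{r}) | H^{\rm sp} | \phi_{p_x}(\mathbf{r} \pm a\hat{\mathbf{y}}) \rangle \\
V_{pp\pi} &= \langle \phi_{p_z}(\mathbf{r}) | H^{\rm sp} | \phi_{p_z}(\mathbf{r} \pm a\hat{\mathbf{x}}) \rangle = \langle \phi_{p_z}(\mathbf{r}) | H^{\rm sp} | \phi_{p_z}(\mathbf{r} \pm a\hat{\mathbf{y}}) \rangle
\end{aligned} \tag{4.23}$$

The hopping matrix elements are shown schematically in Fig. 4.3. By the symmetry of the atomic orbitals we can deduce:

$$\begin{aligned}
\langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_{p_\alpha}(\mathbf{r}) \rangle &= 0 \quad (\alpha = x, y, z) \\
\langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_{p_\alpha}(\mathbf{r} \pm a\hat{\mathbf{x}}) \rangle &= 0 \quad (\alpha = y, z) \\
\langle \phi_{p_\alpha}(\mathbf{r}) | H^{\rm sp} | \phi_{p_\beta}(\mathbf{r} \pm a\hat{\mathbf{x}}) \rangle &= 0 \quad (\alpha, \beta = x, y, z; \ \alpha \neq \beta) \\
\langle \phi_{p_\alpha}(\mathbf{r}) | H^{\rm sp} | \phi_{p_\beta}(\mathbf{r}) \rangle &= 0 \quad (\alpha, \beta = x, y, z; \ \alpha \neq \beta)
\end{aligned} \tag{4.24}$$

as can be seen by the diagrams in Fig. 4.3, with the single-particle hamiltonian $H^{\rm sp}$ assumed to contain only spherically symmetric terms.

Figure 4.3. Schematic representation of hamiltonian matrix elements between $s$ and $p$ states. Left: elements that do not vanish ($V_{ss}$, $V_{sp}$, $V_{pp\sigma}$, $V_{pp\pi}$); right: elements that vanish due to symmetry. The two lobes of opposite sign of the $p_x, p_y$ orbitals are shaded black and white.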

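Having defined all these matrix elements, we can calculate the matrix elements between crystal states that enter in the secular equation; we find for our example

$$\begin{aligned}
\langle \chi_{\mathbf{k}s}(\mathbf{r}) | H^{\rm sp} | \chi_{\mathbf{k}s}(\mathbf{r}) \rangle &= \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r}) \rangle \\
&\quad + \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r} - a\hat{\mathbf{x}}) \rangle e^{i\mathbf{k}\cdot a\hat{\mathbf{x}}} + \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r} + a\hat{\mathbf{x}}) \rangle e^{-i\mathbf{k}\cdot a\hat{\mathbf{x}}} \\
&\quad + \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r} - a\hat{\mathbf{y}}) \rangle e^{i\mathbf{k}\cdot a\hat{\mathbf{y}}} + \langle \phi_s(\mathbf{r}) | H^{\rm sp} | \phi_s(\mathbf{r} + a\hat{\mathbf{y}}) \rangle e^{-i\mathbf{k}\cdot a\hat{\mathbf{y}}} \\
&= \epsilon_s + 2V_{ss} \left[ \cos(k_x a) + \cos(k_y a) \right]
\end{aligned} \tag{4.25}$$

and similarly for the rest of the matrix elements

$$\begin{aligned}
\langle \chi_{\mathbf{k}s}(\mathbf{r}) | H^{\rm sp} | \chi_{\mathbf{k}p_x}(\mathbf{r}) \rangle &= 2i V_{sp} \sin(k_x a) \\
\langle \chi_{\mathbf{k}s}(\mathbf{r}) | H^{\rm sp} | \chi_{\mathbf{k}p_y}(\mathbf{r}) \rangle &= 2i V_{sp} \sin(k_y a) \\
\langle \chi_{\mathbf{k}p_z}(\mathbf{r}) | H^{\rm sp} | \chi_{\mathbf{k}p_z}(\mathbf{r}) \rangle &= \epsilon_p + 2V_{pp\pi} \left[ \cos(k_x a) + \cos(k_y a) \right] \\
\langle \chi_{\mathbf{k}p_x}(\mathbf{r}) | H^{\rm sp} | \chi_{\mathbf{k}p_x}(\mathbf{r}) \rangle &= \epsilon_p + 2V_{pp\sigma} \cos(k_x a) + 2V_{pp\pi} \cos(k_y a) \\
\langle \chi_{\mathbf{k}p_y}(\mathbf{r}) | H^{\rm sp} | \chi_{\mathbf{k}p_y}(\mathbf{r}) \rangle &= \epsilon_p + 2V_{pp\pi} \cos(k_x a) + 2V_{pp\sigma} \cos(k_y a)
\end{aligned} \tag{4.26}$$

With these we can now construct the hamiltonian matrix for each value of $\mathbf{k}$, and obtain the eigenvalues and eigenfunctions by diagonalizing the secular equation. For a quantitative discussion of the energy bands we will concentrate on certain portions of the BZ, which correspond to high-symmetry points or directions in the IBZ. Using the results of chapter 3 for the IBZ for the high-symmetry points for this lattice, we conclude that we need to calculate the band structure along $\Gamma - \Delta - X - Z - M - \Sigma - \Gamma$. We find that at $\Gamma = (k_x, k_y) = (0, 0)$, the matrix is already diagonal and the eigenvalues are given by

$$\epsilon_\Gamma^{(1)} = \epsilon_s + 4V_{ss}, \quad \epsilon_\Gamma^{(2)} = \epsilon_p + 4V_{pp\pi}, \quad \epsilon_\Gamma^{(3)} = \epsilon_\Gamma^{(4)} = \epsilon_p + 2V_{pp\pi} + 2V_{pp\sigma} \tag{4.27}$$

The same is true for the point $M = (1, 1)(\pi/a)$, where we get

$$\epsilon_M^{(1)} = \epsilon_M^{(3)} = \epsilon_p - 2V_{pp\pi} - 2V_{pp\sigma}, \quad \epsilon_M^{(2)} = \epsilon_p - 4V_{pp\pi}, \quad \epsilon_M^{(4)} = \epsilon_s - 4V_{ss} \tag{4.28}$$

Finally, at the point $X = (1, 0)(\pi/a)$ we have another diagonal matrix with eigenvalues

$$\epsilon_X^{(1)} = \epsilon_p + 2V_{pp\pi} - 2V_{pp\sigma}, \quad \epsilon_X^{(2)} = \epsilon_p, \quad \epsilon_X^{(3)} = \epsilon_s, \quad \epsilon_X^{(4)} = \epsilon_p - 2V_{pp\pi} + 2V_{pp\sigma} \tag{4.29}$$

We have chosen the labels of those energy levels to match the band labels as displayed on p. 135 in Fig. 4.4(a). Notice that there are doubly degenerate states at $\Gamma$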

and at $M$, dictated by symmetry, that is, by the values of $\mathbf{k}$ at those points and the form of the hopping matrix elements within the nearest neighbor approximation. For the three other high-symmetry points, $\Delta, Z, \Sigma$, we obtain matrices of the type

$$\begin{pmatrix} A_k & B_k & 0 & 0 \\ B_k^* & C_k & 0 & 0 \\ 0 & 0 & D_k & 0 \\ 0 & 0 & 0 & E_k \end{pmatrix} \tag{4.30}$$

The matrices for $\Delta$ and $Z$ can be put in this form straightforwardly, while the matrix for $\Sigma$ requires a change of basis in order to be brought into this form, namely

$$\begin{aligned}
\chi_{\mathbf{k}1}(\mathbf{r}) &= \frac{1}{\sqrt{2}} \left[ \chi_{\mathbf{k}p_x}(\mathbf{r}) + \chi_{\mathbf{k}p_y}(\mathbf{r}) \right] \\
\chi_{\mathbf{k}2}(\mathbf{r}) &= \frac{1}{\sqrt{2}} \left[ \chi_{\mathbf{k}p_x}(\mathbf{r}) - \chi_{\mathbf{k}p_y}(\mathbf{r}) \right]
\end{aligned} \tag{4.31}$$

with the other two functions, $\chi_{\mathbf{k}s}(\mathbf{r})$ and $\chi_{\mathbf{k}p_z}(\mathbf{r})$, the same as before. The different high-symmetry k-points result in the matrix elements tabulated in Table 4.2. These matrices are then easily solved for the eigenvalues, giving:

$$\epsilon_k^{(1,2)} = \frac{1}{2} \left[ (A_k + C_k) \pm \sqrt{(A_k - C_k)^2 + 4|B_k|^2} \right], \quad \epsilon_k^{(3)} = D_k, \quad \epsilon_k^{(4)} = E_k \tag{4.32}$$

We have then obtained the eigenvalues for all the high-symmetry points in the IBZ. All that remains to be done is to determine the numerical values of the hamiltonian matrix elements.

Table 4.2. Matrix elements for the 2D square lattice with $s, p_x, p_y, p_z$ orbitals at the high-symmetry points $\Delta, Z, \Sigma$. In all cases, $0 < k < 1$. The entries list the hopping contributions only; the on-site energies, $\epsilon_s$ for $A_k$ and $\epsilon_p$ for $C_k, D_k, E_k$, are to be added on the diagonal, in accordance with Eqs. (4.25) and (4.26).

$$\begin{array}{llll}
k & \Delta = (k, 0)(\pi/a) & Z = (1, k)(\pi/a) & \Sigma = (k, k)(\pi/a) \\
A_k & 2V_{ss}(\cos(k\pi) + 1) & 2V_{ss}(\cos(k\pi) - 1) & 4V_{ss}\cos(k\pi) \\
B_k & 2iV_{sp}\sin(k\pi) & 2iV_{sp}\sin(k\pi) & 2\sqrt{2}\,iV_{sp}\sin(k\pi) \\
C_k & 2V_{pp\sigma}\cos(k\pi) + 2V_{pp\pi} & 2V_{pp\sigma}\cos(k\pi) - 2V_{pp\pi} & 2(V_{pp\sigma} + V_{pp\pi})\cos(k\pi) \\
D_k & 2V_{pp\sigma} + 2V_{pp\pi}\cos(k\pi) & 2V_{pp\pi}\cos(k\pi) - 2V_{pp\sigma} & 2(V_{pp\sigma} + V_{pp\pi})\cos(k\pi) \\
E_k & 2V_{pp\pi}(\cos(k\pi) + 1) & 2V_{pp\pi}(\cos(k\pi) - 1) & 4V_{pp\pi}\cos(k\pi)
\end{array}$$

In principle, one can imagine calculating the values of the hamiltonian matrix elements using one of the single-particle hamiltonians we discussed in chapter 2. There is a question as to what exactly the appropriate atomic basis functions $\phi_l(\mathbf{r})$ should be. States associated with free atoms are not a good choice, because in

the solid the corresponding single-particle states are more compressed due to the presence of other electrons nearby. One possibility then is to solve for atomic-like states in fictitious atoms where the single-particle wavefunctions are compressed, by imposing for instance a constraining potential (typically a harmonic well) in addition to the Coulomb potential of the nucleus. Alternatively, one can try to guess the values of the hamiltonian matrix so that they reproduce some important features of the band structure, which can be determined independently from experiment. Let us try to predict at least the sign and relative magnitude of the hamiltonian matrix elements, in an attempt to guess a set of reasonable values.

First, the diagonal matrix elements $\epsilon_s, \epsilon_p$ should have a difference approximately equal to the energy difference of the corresponding eigenvalues in the free atom. Notice that if we think of the atomic-like functions $\phi_l(\mathbf{r})$ as corresponding to compressed wavefunctions then the corresponding eigenvalues $\epsilon_l$ are not identical to those of the free atom, but we could expect the compression of eigenfunctions to have similar effects on the different eigenvalues. Since the energy scale is arbitrary, we can choose $\epsilon_p$ to be the zero of energy and $\epsilon_s$ to be lower in energy by approximately the energy difference of the corresponding free-atom states. The choice $\epsilon_s = -8$ eV is representative of this energy difference for several second row elements in the Periodic Table.

The matrix element $V_{ss}$ represents the interaction of two $\phi_s(\mathbf{r})$ states at a distance $a$, the lattice constant of our model crystal. We expect this interaction to be attractive, that is, to contribute to the cohesion of the solid. Therefore, by analogy to our earlier analysis for the 1D model, we expect $V_{ss}$ to be negative. The choice $V_{ss} = -2$ eV for this interaction would be consistent with our choice of the difference between $\epsilon_s$ and $\epsilon_p$. Similarly, we expect the interaction of two $p$ states to be attractive in general. In the case of $V_{pp\sigma}$ we are assuming the neighboring $\phi_{p_x}(\mathbf{r})$ states to be oriented along the $x$ axis in the same sense, that is, with positive lobes pointing in the positive direction as required by translational periodicity. This implies that the negative lobe of the state to the right is closest to the positive lobe of the state to the left, so that the overlap between the two states will be negative. Because of this negative overlap, $V_{pp\sigma}$ should be positive so that the net effect is an attractive interaction, by analogy to what we discussed earlier for the 1D model. We expect this matrix element to be roughly of the same magnitude as $V_{ss}$ and a little larger in magnitude, to reflect the larger overlap between the directed lobes of $p$ states. A reasonable choice is $V_{pp\sigma} = +2.2$ eV. In the case of $V_{pp\pi}$, the two $p$ states are parallel to each other at a distance $a$, so we expect the attractive interaction to be a little weaker than in the previous case, when the orbitals were pointing toward each other. A reasonable choice is $V_{pp\pi} = -1.8$ eV. Finally, we define $V_{sp}$ to be the matrix element with $\phi_{p_x}(\mathbf{r})$ to the left of $\phi_s(\mathbf{r})$, so that the positive lobe of the $p$ orbital is closer to the $s$ orbital and their overlap is positive. As a consequence

Table 4.3. Values of the on-site and hopping matrix elements for the band structure of the 2D square lattice with an orthogonal $s$ and $p$ basis and nearest neighbor interactions. $\epsilon_p$ is taken to be zero in all cases. All values are in electronvolts. (a)–(f) refer to parts in Fig. 4.4.

                   (a)     (b)     (c)     (d)     (e)     (f)
$\epsilon_s$      −8.0   −16.0    −8.0    −8.0    −8.0    −8.0
$V_{ss}$          −2.0    −2.0    −4.0    −2.0    −2.0    −2.0
$V_{pp\sigma}$    +2.2    +2.2    +2.2    +4.4    +2.2    +2.2
$V_{pp\pi}$       −1.8    −1.8    −1.8    −1.8    −3.6    −1.8
$V_{sp}$          −2.1    −2.1    −2.1    −2.1    −2.1    −4.2

of this definition, this matrix element, which also contributes to attraction, must be negative; we expect its magnitude to be somewhere between the $V_{ss}$ and $V_{pp\sigma}$ matrix elements. A reasonable choice is $V_{sp} = -2.1$ eV. With these choices, the model yields the band structure shown in Fig. 4.4(a). Notice that in addition to the doubly degenerate states at $\Gamma$ and $M$ which are expected from symmetry, there is also a doubly degenerate state at $X$; this is purely accidental, due to our choice of parameters, as the following discussion also illustrates.

In order to elucidate the influence of the various matrix elements on the band structure we also show in Fig. 4.4 a number of other choices for their values. To keep the comparisons simple, in each of the other choices we increase one of the matrix elements by a factor of 2 relative to its value in the original set and keep all other values the same; the values for each case are given explicitly in Table 4.3. The corresponding Figs. 4.4(b)–(f) provide insight into the origin of the bands. To facilitate the comparison we label the bands 1–4, according to their order in energy near $\Gamma$.

Comparing Figs. 4.4(a) and (b) we conclude that band 1 arises from interaction of the $s$ orbitals in neighboring atoms: a decrease of the corresponding eigenvalue $\epsilon_s$ from −8 to −16 eV splits this band off from the rest, by lowering its energy throughout the BZ by 8 eV, without affecting the other three bands, except for some minor changes in the neighborhood of $M$ where bands 1 and 3 were originally degenerate. Since in plot (b) band 1 has split from the rest, now bands 3 and 4 have become degenerate at $M$, because there must be a doubly degenerate eigenvalue at $M$ independent of the values of the parameters, as we found in Eq. (4.28). An increase of the magnitude of $V_{ss}$ by a factor of 2, which leads to the band structure of plot (c), has as a major effect the increase of the dispersion of band 1; this confirms that band 1 is primarily due to the interaction between $s$ orbitals. There are also some changes in band 4, which at $M$ depends on the value of $V_{ss}$, as found in Eq. (4.28).
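The eigenvalues at the high-symmetry points can be reproduced directly from the matrix elements of Eqs. (4.25) and (4.26). The sketch below assembles the 4 × 4 hamiltonian in the basis ($s, p_x, p_y, p_z$) and reads off the levels at $\Gamma$, $X$ and $M$, where the matrix is diagonal, using parameter set (a) of Table 4.3. The function and variable names are illustrative, and only the matrix elements listed in Eqs. (4.25) and (4.26) are included; all others are taken to vanish, as in the nearest neighbor model of the text.

```python
import math

# Parameter set (a) of Table 4.3, in eV (eps_p is the zero of energy).
SET_A = {"eps_s": -8.0, "eps_p": 0.0, "v_ss": -2.0,
         "v_pp_sigma": 2.2, "v_pp_pi": -1.8, "v_sp": -2.1}


def hamiltonian(kx_a, ky_a, p):
    """4x4 hamiltonian in the basis (s, px, py, pz), from Eqs. (4.25) and (4.26).

    kx_a and ky_a are the dimensionless products k_x * a and k_y * a.
    """
    cx, cy = math.cos(kx_a), math.cos(ky_a)
    sx, sy = math.sin(kx_a), math.sin(ky_a)
    h = [[0j] * 4 for _ in range(4)]
    h[0][0] = p["eps_s"] + 2 * p["v_ss"] * (cx + cy)
    h[1][1] = p["eps_p"] + 2 * p["v_pp_sigma"] * cx + 2 * p["v_pp_pi"] * cy
    h[2][2] = p["eps_p"] + 2 * p["v_pp_pi"] * cx + 2 * p["v_pp_sigma"] * cy
    h[3][3] = p["eps_p"] + 2 * p["v_pp_pi"] * (cx + cy)
    h[0][1] = 2j * p["v_sp"] * sx          # s-px coupling
    h[0][2] = 2j * p["v_sp"] * sy          # s-py coupling
    h[1][0] = h[0][1].conjugate()          # hermitian conjugates
    h[2][0] = h[0][2].conjugate()
    return h


def diagonal_levels(kx_a, ky_a, p, tol=1e-12):
    """Sorted eigenvalues at a k-point where the matrix is diagonal (Gamma, X, M)."""
    h = hamiltonian(kx_a, ky_a, p)
    off = max(abs(h[i][j]) for i in range(4) for j in range(4) if i != j)
    if off > tol:
        raise ValueError("matrix is not diagonal at this k-point")
    return sorted(h[i][i].real for i in range(4))


if __name__ == "__main__":
    points = {"Gamma": (0.0, 0.0), "X": (math.pi, 0.0), "M": (math.pi, math.pi)}
    for name, (kx_a, ky_a) in points.items():
        levels = ", ".join(f"{e:6.2f}" for e in diagonal_levels(kx_a, ky_a, SET_A))
        print(f"{name:5s}: {levels}")
```

Up to rounding, the levels are −16, −7.2, 0.8, 0.8 eV at $\Gamma$; −8, −8, 0, 8 eV at $X$; and −0.8, −0.8, 0, 7.2 eV at $M$, in agreement with Eqs. (4.27)–(4.29). The double root at $X$ is the accidental degeneracy mentioned in the text: it disappears as soon as $\epsilon_s$ differs from $\epsilon_p + 2V_{pp\pi} - 2V_{pp\sigma}$.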

4. p y . The values of the parameters for the six different plots are given in Table 4. pz orbitals with nearest neighbor interactions.4.1 The tight-binding approximation 12 12 135 (a) 6 6 (b) 0 4 0 3 2 Ϫ6 Ϫ6 Ϫ12 Ϫ12 Ϫ18 1 Χ ∆ Γ Σ Μ Ζ Χ ∆ Γ Ϫ18 Ϫ24 12 Ϫ24 12 Χ ∆ Γ Σ Μ Ζ Χ ∆ Γ (c) 6 6 0 0 (d) Ϫ6 Ϫ6 Ϫ12 Ϫ12 Ϫ18 Ϫ18 Ϫ24 12 Χ ∆ Γ Σ Μ Ζ Χ ∆ Γ Ϫ24 12 Χ ∆ Γ Σ Μ Ζ Χ ∆ Γ (e) 6 6 0 0 (f) Ϫ6 Ϫ6 Ϫ12 Ϫ12 Ϫ18 Ϫ18 Ϫ24 Χ ∆ Γ Σ Μ Ζ Χ ∆ Γ Ϫ24 Χ ∆ Γ Σ Μ Ζ Χ ∆ Γ Figure 4. . px . The band structure of the 2D square lattice with one atom per unit cell and an orthogonal basis consisting of s .3.

Increasing the magnitude of $V_{pp\sigma}$ by a factor of 2 affects significantly bands 3 and 4, somewhat less band 1, and not at all band 2, as seen from the comparison between plots (a) and (d). This indicates that bands 3 and 4 are essentially related to $\sigma$ interactions between the $p_x$ and $p_y$ orbitals on neighboring atoms. This is also supported by plot (e), in which increasing the magnitude of $V_{pp\pi}$ by a factor of 2 has as a major effect the dramatic increase of the dispersion of band 2; this leads to the conclusion that band 2 arises from $\pi$-bonding interactions between $p_z$ orbitals. The other bands are also affected by this change in the value of $V_{pp\pi}$, because they contain $\pi$-bonding interactions between $p_x$ and $p_y$ orbitals, but the effect is not as dramatic, since in the other bands there are also contributions from $\sigma$-bonding interactions, which lessen the importance of the $V_{pp\pi}$ matrix element. Finally, increasing the magnitude of $V_{sp}$ by a factor of 2 affects all bands except band 2, as seen in plot (f); this is because all other bands except band 2 involve orbitals $s$ and $p$ interacting through $\sigma$ bonds.

Two other features of the band structure are also worth mentioning. The first is that bands 1 and 3 in Figs. 4.4(a) and (b) are nearly parallel to each other throughout the BZ. This is an accident related to our choice of parameters for these two plots, as the other four plots prove. This type of behavior has important consequences for the optical properties, as discussed in chapter 5, particularly when the lower band is occupied (it lies entirely below the Fermi level) and the upper band is empty (it lies entirely above the Fermi level). The second interesting feature is that the lowest band is parabolic near $\Gamma$, in all plots of Fig. 4.4 except for (f). The parabolic nature of the lowest band near the minimum is also a feature of the simple 1D model discussed in section 4.1.1, as well as of the free-electron model discussed in chapter 3. In all these cases, the lowest band near the minimum has essentially pure $s$ character, and its dispersion is dictated by the periodicity of the lattice rather than interaction with other bands. Only for the choice of parameters in plot (f) is the parabolic behavior near the minimum altered; in this case the interaction between $s$ and $p$ orbitals ($V_{sp}$) is much larger than the interaction between $s$ orbitals, so that the nature of the band near the minimum is not pure $s$ any longer but involves also the $p$ states. This last situation is unusual. Far more common is the behavior exemplified by plots (a)–(d), where the nature of the lowest band is clearly associated with the atomic orbitals with the lowest energy. This is demonstrated in more realistic examples later in this chapter.

4.1.3 Generalizations of the TBA

The examples we have discussed above are the simplest version of the TBA, with only orthogonal basis functions and nearest neighbor interactions, as defined in Eq. (4.21) and Eq. (4.22), respectively. We also encountered matrix elements in

which the $p$ wavefunctions are either parallel or point toward one another along the line that separates them; the case when they are perpendicular results in zero matrix elements by symmetry, see Fig. 4.3. It is easy to generalize all this to a more flexible model, as we discuss next. A comprehensive treatment of the tight-binding method and its application to elemental solids is given in the book by Papaconstantopoulos [41]. It is also worth mentioning that the TBA methods are increasingly employed to calculate the total energy of a solid. This practice is motivated by the desire to have a reasonably fast method for total-energy and force calculations while maintaining the flexibility of a quantum mechanical treatment as opposed to resorting to effective interatomic potentials (for details on how TBA methods are used for total-energy calculations see the original papers in the literature [42], [43]).

(1) Arbitrary orientation of orbitals

First, it is straightforward to include configurations in which the $p$ orbitals are not just parallel or lie on the line that joins the atomic positions. We can consider each $p$ orbital to be composed of a linear combination of two perpendicular $p$ orbitals, one lying along the line that joins the atomic positions, the other perpendicular to it. This then leads to the general description of the interaction between two $p$-type orbitals oriented in random directions $\theta_1$ and $\theta_2$ relative to the line that joins the atomic positions where they are centered, as shown in Fig. 4.5:

$$\begin{aligned}
\phi_{p1}(\mathbf{r}) &= \phi_{p1x}(\mathbf{r}) \cos\theta_1 + \phi_{p1y}(\mathbf{r}) \sin\theta_1 \\
\phi_{p2}(\mathbf{r}) &= \phi_{p2x}(\mathbf{r}) \cos\theta_2 + \phi_{p2y}(\mathbf{r}) \sin\theta_2 \\
\langle \phi_{p1} | H^{\rm sp} | \phi_{p2} \rangle &= \langle \phi_{p1x} | H^{\rm sp} | \phi_{p2x} \rangle \cos\theta_1 \cos\theta_2 + \langle \phi_{p1y} | H^{\rm sp} | \phi_{p2y} \rangle \sin\theta_1 \sin\theta_2 \\
&= V_{pp\sigma} \cos\theta_1 \cos\theta_2 + V_{pp\pi} \sin\theta_1 \sin\theta_2
\end{aligned} \tag{4.33}$$

where the line joining the atomic centers is taken to be the $x$ axis, the direction perpendicular to it the $y$ axis, and from symmetry we have $\langle \phi_{p1x} | H^{\rm sp} | \phi_{p2y} \rangle = 0$ and $\langle \phi_{p1y} | H^{\rm sp} | \phi_{p2x} \rangle = 0$.

Figure 4.5. Left: two $p$ orbitals oriented at arbitrary directions $\theta_1, \theta_2$, relative to the line that joins their centers. Right: an $s$ orbital and a $p$ orbital which lies at an angle $\theta$ relative to the line that joins their centers.

The matrix elements between an $s$ and a $p$ orbital with arbitrary orientation relative to the line joining their centers are handled by the same

procedure, leading to

$$
\begin{aligned}
\phi_{p}(\mathbf{r}) &= \phi_{px}(\mathbf{r})\cos\theta + \phi_{py}(\mathbf{r})\sin\theta \\
\langle\phi_{p}|\mathcal{H}^{sp}|s\rangle &= \langle\phi_{px}|\mathcal{H}^{sp}|s\rangle\cos\theta + \langle\phi_{py}|\mathcal{H}^{sp}|s\rangle\sin\theta = V_{sp}\cos\theta
\end{aligned}
$$

for the relative orientation of the p and s orbitals shown in Fig. 4.5.

(2) Non-orthogonal overlap matrix

A second generalization is to consider that the overlap matrix is not orthogonal. This is especially meaningful when we are considering contracted orbitals that are not true atomic orbitals, which is more appropriate in describing the wavefunctions of the solid. Then we will have

$$
\langle\phi_m(\mathbf{r}-\mathbf{R}'-\mathbf{t}_j)|\phi_l(\mathbf{r}-\mathbf{R}-\mathbf{t}_i)\rangle = S_{\mu'\mu}
\tag{4.34}
$$

where we use the index $\mu$ to denote all three indices associated with each atomic orbital, that is, $\mu \to (li\mathbf{R})$ and $\mu' \to (mj\mathbf{R}')$. This new matrix is no longer diagonal,

$$
S_{\mu'\mu} \neq \delta_{ml}\,\delta_{ji}\,\delta(\mathbf{R}-\mathbf{R}')
\tag{4.35}
$$

as we had assumed earlier, Eq. (4.21). Then we need to solve the general secular equation (Eq. (4.5)) with the general definitions of the hamiltonian (Eq. (4.8)) and overlap (Eq. (4.7)) matrix elements. A common approximation is to take

$$
S_{\mu'\mu} = f(|\mathbf{R}-\mathbf{R}'|)\,S_{mj,li}
\tag{4.36}
$$

where the function $f(r)$ falls fast with the magnitude of the argument or is cut off to zero beyond some distance $r > r_c$. In this case, consistency requires that the hamiltonian matrix elements are also cut off for $r > r_c$, where $r$ is the distance between the atomic positions where the atomic-like orbitals are centered. Of course the larger $r_c$, the more matrix elements we will need to calculate (or fit), and the approximation becomes more computationally demanding.

(3) Multi-center integrals

The formulation of the TBA up to this point has assumed that the TBA hamiltonian matrix elements depend only on two single-particle wavefunctions centered at two different atomic sites. For example, we assumed that the hamiltonian matrix elements depend only on the relative distance and orientation of the two atomic-like orbitals between which we calculate the expectation value of the single-particle hamiltonian. This is referred to as the two-center approximation, but it is obviously another implicit approximation, on top of restricting the basis to the atomic-like wavefunctions. In fact, it is plausible that, in the environment of the solid, the presence of other electrons nearby will affect the interaction of any two given

atomic-like wavefunctions. In principle we should consider all such interactions. An example of such terms is a three-center matrix element of the hamiltonian in which one orbital is centered at some atomic site, a second orbital is centered at a different atomic site, and a term in the hamiltonian (the ionic potential) includes the position of a third atomic site. One way of taking into account these types of interactions is to make the hamiltonian matrix elements environment dependent. In this case, the value of a two-center hamiltonian matrix element, involving explicitly the positions of only two atoms, will depend on the position of all other atoms around it and on the type of atomic orbitals on these other atoms. To accomplish this, we need to introduce more parameters to allow for the flexibility of having several possible environments around each two-center matrix element, making the approach much more complicated. The increase in realistic representation of physical systems is always accompanied by an increase in complexity and computational cost.

(4) Excited-state orbitals in basis

Finally, we can consider our basis as consisting not only of the valence states of the atoms, but including unoccupied (excited) atomic states. This is referred to as going beyond the minimal basis. The advantages of this generalization are obvious since including more basis functions always gives a better approximation (by the variational principle). This, however, presents certain difficulties: the excited states tend to be more diffuse in space, with tails extending farther away from the atomic core. This implies that the overlap between such states will not fall off fast with distance between their centers, and it will be difficult to truncate the non-orthogonal overlap matrix and the hamiltonian matrix at a reasonable distance. To avoid this problem, we perform the following operations. First we orthogonalize the states in the minimal basis. This is accomplished by diagonalizing the non-orthogonal overlap matrix and using as the new basis the linear combination of states that corresponds to the eigenvectors of the non-orthogonal overlap matrix. Next, we orthogonalize the excited states to the states in the orthogonal minimal basis, and finally we orthogonalize the new excited states among themselves. Each orthogonalization involves the diagonalization of the corresponding overlap matrix. The advantage of this procedure is that with each diagonalization, the energy of the new states is raised (since they are orthogonal to all previous states), and the overlap between them is reduced. In this way we create a basis that gives rise to a hamiltonian which can be truncated at a reasonable cutoff distance. Nevertheless, the increase in variational freedom that comes with the inclusion of excited states increases computational complexity, since we will have a larger basis and a correspondingly larger number of matrix elements that we need to calculate or obtain from fitting to known results.
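The orthogonalization step described above (diagonalize the overlap matrix, use its eigenvectors to build a new basis, then diagonalize the hamiltonian in that basis) can be illustrated with a minimal numerical sketch. This is not the procedure of any particular tight-binding code: the function name `orthogonalize_and_solve`, the scaling of each eigenvector by the inverse square root of its eigenvalue (so that the new basis is normalized as well as orthogonal), and the two-orbital parameters `e0`, `t`, `s` are all assumptions made here for illustration.

```python
import numpy as np

def orthogonalize_and_solve(H, S):
    """Solve H c = eps S c by first orthogonalizing the basis.

    H, S: Hermitian hamiltonian and overlap matrices in a
    non-orthogonal basis (S is assumed positive definite).
    Returns (eps, C): eigenvalues in ascending order and the
    eigenvectors expressed in the original non-orthogonal basis.
    """
    s, U = np.linalg.eigh(S)         # diagonalize the overlap matrix
    X = U / np.sqrt(s)               # column j of U scaled by s_j^(-1/2)
    H_orth = X.conj().T @ H @ X      # hamiltonian in the orthonormal basis
    eps, C_orth = np.linalg.eigh(H_orth)
    return eps, X @ C_orth

# Hypothetical two-orbital model: on-site energy e0, hopping t, overlap s.
e0, t, s = -1.0, -0.5, 0.2
H = np.array([[e0, t], [t, e0]])
S = np.array([[1.0, s], [s, 1.0]])
eps, C = orthogonalize_and_solve(H, S)
```

For this model the two eigenvalues can be checked by hand against $(e_0 + t)/(1 + s)$ and $(e_0 - t)/(1 - s)$, the bonding and antibonding levels of two identical overlapping orbitals.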

This extension is suggestive of a more general approach: we can use an arbitrary set of functions centered at atomic sites to express the hamiltonian matrix elements. A popular set is composed of normalized gaussian functions (see Appendix 1) multiplied by the appropriate spherical harmonics to resemble a set of atomic-like orbitals. We can then calculate the hamiltonian and overlap matrix elements using this set of functions and diagonalize the resulting secular equation to obtain the desired eigenvalues and eigenfunctions. To the extent that these functions represent accurately all possible electronic states (if the original set does not satisfy this requirement we can simply add more basis functions), we can then consider that we have a variationally correct description of the system. The number of basis functions is no longer determined by the number of valence states in the constituent atoms, but rather by variational requirements. This is the more general Linear Combination of Atomic Orbitals (LCAO) method. It is customary in this case to use an explicit form for the single-particle hamiltonian and calculate the hamiltonian and overlap matrix elements exactly, either analytically, if the choice of basis functions permits it, or numerically.

4.2 General band-structure methods

Since the early days of solid state theory, a number of approaches have been introduced to solve the single-particle hamiltonian and obtain the eigenvalues (band structure) and eigenfunctions. These methods were the foundation on which modern approaches for electronic structure calculations have been developed. We review the basic ideas of these methods next.

Cellular or Linearized Muffin-Tin Orbital (LMTO) method

This approach, originally developed by Wigner and Seitz [44], considers the solid as made up of cells (the Wigner–Seitz or WS cells), which are the analog of the Brillouin Zones in real space. In each cell, the potential felt by the electrons is the atomic potential, which is spherically symmetric around the atomic nucleus, but its boundaries are those of the WS cell, whose shape is dictated by the crystal. Due to the Bloch character of wavefunctions, the following boundary conditions must be obeyed at the boundary of the WS cell, denoted by $\mathbf{r}_b$:

$$
\begin{aligned}
\psi_{\mathbf{k}}(\mathbf{r}_b) &= e^{-i\mathbf{k}\cdot\mathbf{R}}\,\psi_{\mathbf{k}}(\mathbf{r}_b+\mathbf{R}) \\
\hat{\mathbf{n}}(\mathbf{r}_b)\cdot\nabla\psi_{\mathbf{k}}(\mathbf{r}_b) &= -e^{-i\mathbf{k}\cdot\mathbf{R}}\,\hat{\mathbf{n}}(\mathbf{r}_b+\mathbf{R})\cdot\nabla\psi_{\mathbf{k}}(\mathbf{r}_b+\mathbf{R})
\end{aligned}
\tag{4.37}
$$

where $\hat{\mathbf{n}}(\mathbf{r}_b)$ is the vector normal to the surface of the WS cell. Since the potential inside the WS cell is assumed to be spherical, we can use the standard expansion in spherical harmonics $Y_{lm}(\hat{\mathbf{r}})$ and radial wavefunctions $\rho_{kl}(r)$ (for details see

Appendix B) which obey the following equation:

$$
\left[\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} + \left(\frac{\hbar^2}{2m_e}\right)^{-1}\left(\epsilon_{\mathbf{k}} - V(r) - \frac{\hbar^2 l(l+1)}{2m_e r^2}\right)\right]\rho_{kl}(r) = 0
\tag{4.38}
$$

where the dependence of the radial wavefunction on $\mathbf{k}$ enters through the eigenvalue $\epsilon_{\mathbf{k}}$. In terms of these functions the crystal wavefunctions become:

$$
\psi_{\mathbf{k}}(\mathbf{r}) = \sum_{lm}\alpha_{\mathbf{k}lm}\,Y_{lm}(\hat{\mathbf{r}})\,\rho_{kl}(r)
\tag{4.39}
$$

Taking matrix elements of the hamiltonian between such states creates a secular equation which can be solved to produce the desired eigenvalues. Since the potential cannot be truly spherical throughout the WS cell, it is reasonable to consider it to be spherical within a sphere which lies entirely within the WS cell, and to be zero outside that sphere. This gives rise to a potential that looks like a muffin-tin, hence the name of the method, Linearized Muffin-Tin Orbitals (LMTO). This method is in use for calculations of the band structure of complex solids. The basic assumption of the method is that a spherical potential around the nuclei is a reasonable approximation to the true potential experienced by the electrons in the solid.

Augmented Plane Waves (APW)

This method, introduced by Slater [45], consists of expanding the wavefunctions in plane waves in the regions between the atomic spheres, and in functions with spherical symmetry within the spheres. Then the two expressions must be matched at the sphere boundary so that the wavefunctions and their first and second derivatives are continuous. For core states, the wavefunctions are essentially unchanged within the spheres. It is only valence states that have significant weight in the regions outside the atomic spheres.

Both the LMTO and the APW methods treat all the electrons in the solid, that is, valence as well as core electrons. Accordingly, they are referred to as "all-electron" methods. The two methods share the basic concept of separating space in the solid into the spherical regions around the nuclei and the interstitial regions between these spheres. In the APW method, the spheres are touching while in the LMTO method they are overlapping. In many cases, especially in crystal structures other than the close-packed ones (FCC, HCP, BCC), this separation of space leads to inaccuracies, which can be corrected by elaborate extensions described as "full potential" treatment. There is also an all-electron electronic structure method based on multiple scattering theory known as the Korringa–Kohn–Rostoker (KKR) method. A detailed exposition of these methods falls beyond the scope of this book; they are discussed in specialized articles or books (see for example the book by Singh [46]), often accompanied by descriptions of computer codes which are necessary for their application. In the remainder of this section we will examine other band structure

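methods in which the underlying concept is a separation between the electronic core and valence states. This separation makes it possible to treat a larger number of valence states, with a relatively small sacrifice in accuracy. The advantage is that structures with many more atoms in the unit cell can then be studied efficiently.

Orthogonalized Plane Waves (OPW)

This method, due to Herring [47], is an elaboration on the APW approach. The trial valence wavefunctions are written at the outset as a combination of plane waves and core-derived states:

$$
\phi^{(v)}_{\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{\Omega}}\,e^{i\mathbf{k}\cdot\mathbf{r}} + \sum_{c'}\beta^{(c')}\,\psi^{(c')}_{\mathbf{k}}(\mathbf{r})
\tag{4.40}
$$

where the $\psi^{(c)}_{\mathbf{k}}(\mathbf{r})$ are Bloch states formed out of atomic core states:

$$
\psi^{(c)}_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}\,\phi^{(c)}(\mathbf{r}-\mathbf{R})
\tag{4.41}
$$

With the choice of the parameters

$$
\beta^{(c)} = -\langle\psi^{(c)}_{\mathbf{k}}|\mathbf{k}\rangle, \qquad \langle\mathbf{r}|\mathbf{k}\rangle = \frac{1}{\sqrt{\Omega}}\,e^{i\mathbf{k}\cdot\mathbf{r}}
$$

we make sure that the wavefunctions $\phi^{(v)}_{\mathbf{k}}(\mathbf{r})$ are orthogonal to core Bloch states:

$$
\langle\phi^{(v)}_{\mathbf{k}}|\psi^{(c)}_{\mathbf{k}}\rangle = \langle\mathbf{k}|\psi^{(c)}_{\mathbf{k}}\rangle - \sum_{c'}\langle\mathbf{k}|\psi^{(c')}_{\mathbf{k}}\rangle\langle\psi^{(c')}_{\mathbf{k}}|\psi^{(c)}_{\mathbf{k}}\rangle = 0
\tag{4.42}
$$

where we have used $\langle\psi^{(c')}_{\mathbf{k}}|\psi^{(c)}_{\mathbf{k}}\rangle = \delta_{cc'}$ to reduce the sum to a single term. We can then use these trial valence states as the basis for the expansion of the true valence states:

$$
\psi^{(v)}_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}}\alpha_{\mathbf{k}}(\mathbf{G})\,\phi^{(v)}_{\mathbf{k}+\mathbf{G}}(\mathbf{r})
\tag{4.43}
$$

Taking matrix elements between such states produces a secular equation which can be diagonalized to obtain the eigenvalues of the energy.

Pseudopotential Plane Wave (PPW) method

We can manipulate the expression in Eq. (4.43) to obtain something more familiar. First notice that

$$
\psi^{(c)}_{\mathbf{k}+\mathbf{G}}(\mathbf{r}) = \sum_{\mathbf{R}} e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{R}}\,\phi^{(c)}(\mathbf{r}-\mathbf{R}) = \psi^{(c)}_{\mathbf{k}}(\mathbf{r})
\tag{4.44}
$$

and with this we obtain

$$
\begin{aligned}
\psi^{(v)}_{\mathbf{k}}(\mathbf{r}) &= \sum_{\mathbf{G}}\alpha_{\mathbf{k}}(\mathbf{G})\left[\frac{1}{\sqrt{\Omega}}\,e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} - \sum_{c}\langle\psi^{(c)}_{\mathbf{k}+\mathbf{G}}|(\mathbf{k}+\mathbf{G})\rangle\,\psi^{(c)}_{\mathbf{k}+\mathbf{G}}(\mathbf{r})\right] \\
&= \frac{1}{\sqrt{\Omega}}\sum_{\mathbf{G}}\alpha_{\mathbf{k}}(\mathbf{G})\,e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} - \sum_{c}\langle\psi^{(c)}_{\mathbf{k}}|\left(\sum_{\mathbf{G}}\alpha_{\mathbf{k}}(\mathbf{G})\,|(\mathbf{k}+\mathbf{G})\rangle\right)\psi^{(c)}_{\mathbf{k}}(\mathbf{r})
\end{aligned}
$$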

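which, with the definition

$$
\tilde{\phi}^{(v)}_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}}\frac{1}{\sqrt{\Omega}}\,\alpha_{\mathbf{k}}(\mathbf{G})\,e^{i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}} = \frac{1}{\sqrt{\Omega}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\sum_{\mathbf{G}}\alpha_{\mathbf{k}}(\mathbf{G})\,e^{i\mathbf{G}\cdot\mathbf{r}}
\tag{4.45}
$$

can be rewritten as

$$
\psi^{(v)}_{\mathbf{k}}(\mathbf{r}) = \tilde{\phi}^{(v)}_{\mathbf{k}}(\mathbf{r}) - \sum_{c}\langle\psi^{(c)}_{\mathbf{k}}|\tilde{\phi}^{(v)}_{\mathbf{k}}\rangle\,\psi^{(c)}_{\mathbf{k}}(\mathbf{r})
\tag{4.46}
$$

This is precisely the type of expression we saw in chapter 2, for the states we called pseudo-wavefunctions, Eq. (2.114), which contain a sum that projects out the core part. So the construction of the orthogonalized plane waves has led us to consider valence states from which the core part is projected out, which in turn leads to the idea of pseudopotentials, discussed in detail in chapter 2. The crystal potential that the pseudo-wavefunctions experience is then given by

$$
V^{ps}_{cr}(\mathbf{r}) = \sum_{\mathbf{R},i} V^{ps}_{at}(\mathbf{r}-\mathbf{t}_i-\mathbf{R})
\tag{4.47}
$$

where $V^{ps}_{at}(\mathbf{r}-\mathbf{t}_i)$ is the pseudopotential of a particular atom in the unit cell, at position $\mathbf{t}_i$. We can expand the pseudopotential in the plane wave basis of the reciprocal lattice vectors:

$$
V^{ps}_{cr}(\mathbf{r}) = \sum_{\mathbf{G}} V^{ps}_{cr}(\mathbf{G})\,e^{i\mathbf{G}\cdot\mathbf{r}}
\tag{4.48}
$$

As we have argued before, the pseudopotential is much smoother than the true Coulomb potential of the ions, and therefore we expect its Fourier components $V^{ps}_{cr}(\mathbf{G})$ to fall fast with the magnitude of $\mathbf{G}$. To simplify the situation, we will assume that we are dealing with a solid that has several atoms of the same type in each unit cell; this is easily generalized to the case of several types of atoms. Using the expression from above in terms of the atomic pseudopotentials, the Fourier components take the form (with $N\Omega_{PUC}$ the volume of the crystal)

$$
\begin{aligned}
V^{ps}_{cr}(\mathbf{G}) &= \int\frac{d\mathbf{r}}{N\Omega_{PUC}}\,V^{ps}_{cr}(\mathbf{r})\,e^{-i\mathbf{G}\cdot\mathbf{r}} = \sum_{\mathbf{R},i}\int\frac{d\mathbf{r}}{N\Omega_{PUC}}\,V^{ps}_{at}(\mathbf{r}-\mathbf{t}_i-\mathbf{R})\,e^{-i\mathbf{G}\cdot\mathbf{r}} \\
&= \frac{1}{N}\sum_{\mathbf{R},i}\left[\int\frac{d\mathbf{r}}{\Omega_{PUC}}\,V^{ps}_{at}(\mathbf{r})\,e^{-i\mathbf{G}\cdot\mathbf{r}}\right]e^{i\mathbf{G}\cdot\mathbf{t}_i} = V^{ps}_{at}(\mathbf{G})\sum_{i}e^{i\mathbf{G}\cdot\mathbf{t}_i}
\end{aligned}
\tag{4.49}
$$

where we have eliminated a factor of $N$ (the number of PUCs in the crystal) with a summation $\sum_{\mathbf{R}}$, since the summand at the end does not involve an explicit dependence on $\mathbf{R}$. We have also defined the Fourier transform of the atomic pseudopotential $V^{ps}_{at}(\mathbf{G})$ as the content of the square brackets in the next to last expression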

in the above equation. The sum appearing in the last step of this equation,

$$
S(\mathbf{G}) = \sum_{i}e^{i\mathbf{G}\cdot\mathbf{t}_i}
\tag{4.50}
$$

is called the "structure factor". Depending on the positions of atoms in the unit cell, this summation can vanish for several values of the vector $\mathbf{G}$. This means that the values of the crystal pseudopotential for these values of $\mathbf{G}$ are not needed for a band-structure calculation.

From the above analysis, we conclude that relatively few Fourier components of the pseudopotential survive, since those corresponding to large $|\mathbf{G}|$ are negligible because of the smoothness of the pseudopotential, whereas among those with small $|\mathbf{G}|$, several may be eliminated due to vanishing values of the structure factor $S(\mathbf{G})$. The idea then is to use a basis of plane waves $\exp(i\mathbf{G}\cdot\mathbf{r})$ to expand both the pseudo-wavefunctions and the pseudopotential, which will lead to a secular equation with a relatively small number of non-vanishing elements. Solving this secular equation produces the eigenvalues (band structure) and eigenfunctions for a given system.

To put these arguments in quantitative form, consider the single-particle equations which involve a pseudopotential (we neglect for the moment all the electron interaction terms, which are anyway isotropic; the full problem is considered in more detail in chapter 5):

$$
\left[-\frac{\hbar^2}{2m_e}\nabla^2 + V^{ps}_{cr}(\mathbf{r})\right]\tilde{\phi}^{(n)}_{\mathbf{k}}(\mathbf{r}) = \epsilon^{(n)}_{\mathbf{k}}\,\tilde{\phi}^{(n)}_{\mathbf{k}}(\mathbf{r})
\tag{4.51}
$$

These must be solved by considering the expansion for the pseudo-wavefunction in terms of plane waves, Eq. (4.45). Taking matrix elements of the hamiltonian with respect to plane wave states, we arrive at the following secular equation:

$$
\sum_{\mathbf{G}'}H^{sp}_{\mathbf{k}}(\mathbf{G},\mathbf{G}')\,\alpha^{(n)}_{\mathbf{k}}(\mathbf{G}') = \epsilon^{(n)}_{\mathbf{k}}\,\alpha^{(n)}_{\mathbf{k}}(\mathbf{G})
\tag{4.52}
$$

where the hamiltonian matrix elements are given by

$$
H^{sp}_{\mathbf{k}}(\mathbf{G},\mathbf{G}') = \frac{\hbar^2(\mathbf{k}+\mathbf{G})^2}{2m_e}\,\delta(\mathbf{G}-\mathbf{G}') + V^{ps}_{at}(\mathbf{G}-\mathbf{G}')\,S(\mathbf{G}-\mathbf{G}')
\tag{4.53}
$$

Diagonalization of the hamiltonian matrix gives the eigenvalues of the energy $\epsilon^{(n)}_{\mathbf{k}}$ and corresponding eigenfunctions $\tilde{\phi}^{(n)}_{\mathbf{k}}(\mathbf{r})$. Obviously, Fourier components of $V^{ps}_{at}(\mathbf{r})$ which are multiplied by vanishing values of the structure factor $S(\mathbf{G})$ will not be of use in the above equation.
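The structure factor of Eq. (4.50) and the secular matrix of Eq. (4.53) are simple to set up numerically. The sketch below is only an illustration under stated assumptions, not the method used for the calculations in this chapter: it works in units where $\hbar^2/2m_e = 1$, takes the diamond structure with atoms at $\pm(a/8)(1,1,1)$, assumes the atomic form factor depends only on $|\mathbf{G}|$, and uses a made-up smooth form factor `v_model`. All function names are hypothetical.

```python
import itertools
import numpy as np

def fcc_reciprocal_vectors(a, n_max):
    """Reciprocal lattice vectors of FCC, integer coefficients in [-n_max, n_max]."""
    b = (2 * np.pi / a) * np.array([[-1.0, 1.0, 1.0],
                                    [1.0, -1.0, 1.0],
                                    [1.0, 1.0, -1.0]])
    coeffs = itertools.product(range(-n_max, n_max + 1), repeat=3)
    return np.array([np.dot(c, b) for c in coeffs])

def structure_factor(G, basis):
    """S(G) = sum_i exp(i G . t_i), as in Eq. (4.50)."""
    return np.sum(np.exp(1j * (basis @ G)))

def ppw_hamiltonian(k, Gs, basis, v_at):
    """Matrix H_k(G, G') of Eq. (4.53), in units where hbar^2 / (2 m_e) = 1."""
    n = len(Gs)
    H = np.zeros((n, n), dtype=complex)
    for i, G in enumerate(Gs):
        for j, Gp in enumerate(Gs):
            dG = G - Gp
            H[i, j] = v_at(np.linalg.norm(dG)) * structure_factor(dG, basis)
        H[i, i] += np.dot(k + G, k + G)      # kinetic term, diagonal in G
    return H

a = 2 * np.pi                                # chosen so that 2*pi/a = 1
diamond_basis = (a / 8) * np.array([[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]])
Gs = fcc_reciprocal_vectors(a, n_max=1)      # 27 plane waves

# Structure factor for a few short reciprocal lattice vectors (units of 2*pi/a).
S_111 = structure_factor(np.array([1.0, 1.0, 1.0]), diamond_basis)
S_200 = structure_factor(np.array([2.0, 0.0, 0.0]), diamond_basis)
S_220 = structure_factor(np.array([2.0, 2.0, 0.0]), diamond_basis)

# Empty-lattice check: with zero pseudopotential the bands are free-electron like.
k_gamma = np.zeros(3)
free_bands = np.linalg.eigvalsh(
    ppw_hamiltonian(k_gamma, Gs, diamond_basis, lambda q: 0.0))

# Hypothetical smooth form factor, for illustration only.
def v_model(q):
    return -0.3 * np.exp(-q**2 / 4.0)

H_model = ppw_hamiltonian(k_gamma, Gs, diamond_basis, v_model)
model_bands = np.linalg.eigvalsh(H_model)
```

With this choice of basis the structure factor vanishes for $\mathbf{G} = (2\pi/a)(2,0,0)$, so the corresponding Fourier component of the pseudopotential never enters the matrix, which is the point made in the text. A converged calculation would need many more plane waves than the 27 used here.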

4.3 Band structure of representative solids

Having established the general methodology for the calculation of the band structure, we will now apply it to examine the electronic properties of several representative solids. In the following we will rely on the PPW method to do the actual calculations, since it has proven one of the most versatile and efficient approaches for calculating electronic properties. In the actual calculations we will employ the density functional theory single-particle hamiltonian, which we diagonalize numerically using a plane wave basis. We should emphasize that this approach does not give accurate results for semiconductor band gaps. This is due to the fact that the spectrum of the single-particle hamiltonian cannot describe accurately the true eigenvalue spectrum of the many-body hamiltonian. Much theoretical work has been devoted to develop accurate calculations of the eigenvalue spectrum with the use of many-body techniques; this is known as the GW approach [48, 49]. A simpler approach, based on extensions of DFT, can often give quite reasonable results [50]; it is this latter approach that has been used in the calculations described below (for details see Ref. [51]). We will also rely heavily on ideas from the TBA to interpret the results.

4.3.1 A 2D solid: graphite – a semimetal

As we have mentioned in chapter 3, graphite consists of stacked sheets of three-fold coordinated carbon atoms. On a plane of graphite the C atoms form a honeycomb lattice, that is, a hexagonal Bravais lattice with a two-atom basis. The interaction between planes is rather weak, of the van der Waals type, and the overlap of wavefunctions on different planes is essentially non-existent. We present in Fig. 4.6 the band structure for a single, periodic, infinite graphitic sheet. In this plot, we easily recognize the lowest band as arising from a bonding state of s character; this is band n = 1 at Γ counting from the bottom, and corresponds to σ bonds between C atoms. The next three bands intersect each other at several points in the BZ. The two bands that are degenerate at Γ, labeled n = 3, 4, represent a p-like bonding state. There are two p states participating in this type of bonding, the two p orbitals that combine to form the sp² hybrids involved in the σ bonds on the plane. The single band intersecting the other two is a state with p character, arising from the $p_z$ orbitals that contribute to the π-bonding; it is the symmetric (bonding) combination of these two $p_z$ orbitals, labeled n = 2. The antisymmetric (antibonding) combination, labeled n = 8, has the reverse dispersion and lies higher in energy; it is almost the mirror image of the π-bonding state with respect to the Fermi level. All these features are identified in the plots of the total charge densities and eigenfunction magnitudes shown in Fig. 4.7. Notice that because we are dealing with pseudo-wavefunctions, which vanish at the position of the ion, the charge is

mostly concentrated in the region between atoms. This is a manifestation of the expulsion of the valence states from the core region, which is taken into account by the pseudopotential. The s-like part (n = 1) of the σ-bonding state has uniform distribution in the region between the atoms; the center of the bond corresponds to the largest charge density. The p-like part (n = 3, 4) of the σ-bonding state has two pronounced lobes, and a dip in the middle of the bond. The π-bonding state (n = 2) shows the positive overlap between the two $p_z$ orbitals. The π-antibonding state (n = 8) has a node in the region between the atoms.

Figure 4.6. Band structure of a graphite sheet, calculated with the PPW method (vertical axis: E in eV; horizontal axis: k). The zero of the energy scale is set arbitrarily to the value of the highest σ-bonding state at Γ. The dashed line indicates the position of the Fermi level, $\epsilon_F$. The small diagram above the band structure indicates the corresponding Brillouin Zone with the special k points Γ, P, Q identified. (Based on calculations by I.N. Remediakis.)

Figure 4.7. (a) Total electronic charge density of graphite, on the plane of the atoms (left) and on a plane perpendicular to it (right). (b) The wavefunction magnitude of states at Γ; panels are labeled n = 1, n = 2, n = 3, 4 and n = 8. (Based on calculations by I.N. Remediakis.)
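The behavior of the two π bands in this section can be reproduced qualitatively with a nearest-neighbor, orthogonal tight-binding model of the $p_z$ orbitals on the honeycomb lattice. This is a swapped-in simplification used here only for interpretation, not the PPW calculation behind Fig. 4.6. The function name `pi_bands`, the hopping strength `t = 2.7` eV, the zero on-site energy, and the lattice constant `a = 1` are all assumptions of this sketch.

```python
import numpy as np

def pi_bands(k, t=2.7, a=1.0):
    """Energies (bonding, antibonding) of the pi bands at 2D wavevector k.

    Nearest-neighbor, orthogonal tight-binding model with the on-site
    energy set to zero; t is an assumed hopping strength in eV.
    """
    a1 = a * np.array([1.0, 0.0])
    a2 = a * np.array([0.5, np.sqrt(3) / 2])
    # Sum of Bloch phases over the three nearest neighbors of one atom.
    f = 1 + np.exp(-1j * np.dot(k, a1)) + np.exp(-1j * np.dot(k, a2))
    return -abs(t) * abs(f), abs(t) * abs(f)

gamma_point = np.array([0.0, 0.0])
# Corner of the hexagonal BZ (the point called P in the text), for a = 1.
p_point = np.array([2 * np.pi / 3, 2 * np.pi / np.sqrt(3)])

e_gamma = pi_bands(gamma_point)
e_p = pi_bands(p_point)
```

In this model the two bands are exact mirror images of each other about zero energy and are separated by $6|t|$ at Γ, while at the BZ corner the three phase factors cancel and both energies vanish, so the bands touch at a single point rather than overlapping.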

Since in this system there are two atoms per unit cell with four valence electrons each, that is, a total of eight valence electrons per unit cell, we need four completely filled bands in the BZ to accommodate them. Indeed, the Fermi level must be at a position which makes the three σ-bonding states (one s-like and two p-like) and the π-bonding state completely full, while the π-antibonding state is completely empty. Similarly, antibonding states arising from antisymmetric combinations of s and $p_x$, $p_y$ orbitals lie even higher in energy and are completely empty. The bonding and antibonding combinations of the $p_z$ states are degenerate at the P high-symmetry point of the BZ. At zero temperature, electrons obey a Fermi distribution with an abrupt step cutoff at the Fermi level $\epsilon_F$. At finite temperature T, the distribution will be smoother around the Fermi level, with a width at the step of order $k_B T$. This means that some states below the Fermi level will be unoccupied and some above will be occupied. This is the hallmark of metallic behavior, that is, the availability of states immediately below and immediately above the Fermi level, which makes it possible to excite electrons thermally. Placing electrons in unoccupied states at the bottom of empty bands allows them to move freely in these bands, as we discussed for the free-electron model. In the case of graphite, the number of states immediately below and above the Fermi level is actually very small: the π-bonding and antibonding bands do not overlap but simply touch each other at the P point of the BZ. Accordingly, graphite is considered a semimetal, barely exhibiting the characteristics of metallic behavior, even though, strictly speaking according to the definition given above, it cannot be described as anything else.

4.3.2 3D covalent solids: semiconductors and insulators

Using the same concepts we can discuss the band structure of more complicated crystals. We consider first four crystals that have the following related structures: the diamond crystal, which consists of two interpenetrating FCC lattices and a two-atom basis in the PUC, with the two atoms being of the same kind, and the zincblende crystal in which the lattice is similar but the two atoms in the PUC are different. The crystals we will consider are Si, C, SiC and GaAs. The first two are elemental solids, the third has two atoms of the same valence (four valence electrons each) and the last consists of a group-III (Ga) and a group-V (As) atom. Thus, there are eight valence electrons per PUC in each of these crystals. The first two crystals are characteristic examples of covalent bonding, whereas the third and fourth have partially ionic and partially covalent bonding character (see also the discussion in chapter 1). The band structure of these four crystals is shown in Fig. 4.8. The energy of the highest occupied band is taken to define the zero of the energy scale. In all four

cases there is an important characteristic of the band structure, namely there is a range of energies where there are no electronic states across the entire BZ; this is the band gap, denoted by $\epsilon_{gap}$. The ramifications of this feature are very important.

Figure 4.8. Band structure of four representative covalent solids: Si, C, SiC, GaAs. The first and the last are semiconductors, the other two are insulators. The small diagram above the band structure indicates the Brillouin Zone for the FCC lattice, with the special k-points X = (1, 0, 0), L = (0.5, 0.5, 0.5), K = (0.75, 0.75, 0), W = (1, 0.5, 0), U = (1, 0.25, 0.25) identified and expressed in units of 2π/a, where a is the lattice constant; Γ is the center of the BZ. The energy scale is in electronvolts and the zero is set at the Valence Band Maximum. (Based on calculations by I.N. Remediakis.)

We notice first that there are four bands below zero, which means that all four bands are fully occupied, since there are eight valence electrons in the PUC of each of

these solids. Naively, we might expect that the Fermi level can be placed anywhere within the band gap, since for any such position all states below it remain occupied and all states above remain unoccupied. A more detailed analysis (see chapter 9) reveals that for an ideal crystal, that is, one in which there are no intrinsic defects or impurities, the Fermi level is at the middle of the gap. This means that there are no states immediately above or below the Fermi level, for an energy range of $\pm\epsilon_{gap}/2$. Thus, it will not be possible to excite thermally appreciable numbers of electrons from occupied to unoccupied bands, until the temperature reaches $\sim\epsilon_{gap}/2$. Since the band gap is typically of order 1 eV for semiconductors (1.2 eV for Si, 1.5 eV for GaAs, see Fig. 4.8), and 1 eV corresponds to a temperature of 11 604 K, we conclude that for all practical purposes the states above the Fermi level remain unoccupied (these solids melt well below 5800 K). For insulators the band gap is even higher (2.5 eV for SiC, 5 eV for C-diamond, see Fig. 4.8). This is the hallmark of semiconducting and insulating behavior, that is, the absence of any states above the Fermi level to which electrons can be thermally excited. This makes it difficult for these solids to respond to external electric fields, since as we discussed in chapter 3 a filled band cannot carry current. In a perfect crystal it would take the excitation of electrons from occupied to unoccupied bands to create current-carrying electronic states. Accordingly, the states below the Fermi level are called "valence bands" while those above the Fermi level are called "conduction bands". Only when imperfections (defects) or impurities (dopants) are introduced into these crystals do they acquire the ability to respond to external electric fields; all semiconductors in use in electronic devices are of this type, that is, crystals with impurities (see detailed discussion in chapter 9).

The specific features of the band structure are also of interest. First we note that the highest occupied state (Valence Band Maximum, VBM) is always at the Γ point. The lowest unoccupied state (Conduction Band Minimum, CBM) can be at different positions in the BZ. For Si and C it is somewhere between the Γ and X points, while in SiC it is at the X point. Only for GaAs is the CBM at Γ: this is referred to as a direct gap; all the other cases discussed here are indirect gap semiconductors or insulators. The nature of the gap has important consequences for optical properties, as discussed in chapter 5. It is also of interest to consider the "band width", that is, the range of energies covered by the valence states. In Si and GaAs it is about 12.5 eV, in SiC it is about 16 eV, and in C it is considerably larger, about 23 eV. There are two factors that influence the band width: the relative energy difference between the s and p atomic valence states, and the interaction between the hybrid orbitals in the solid. For instance, in C, where we are dealing with 2s and 2p atomic states, both their energy difference and the interaction of the hybrid orbitals are large, giving a large band width, almost twice that of Si.
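The argument that thermal excitation across the gap is negligible can be made concrete with a few lines of arithmetic. The sketch below evaluates the Fermi occupation of a state at the CBM, assuming (as stated in the text for an ideal crystal) that the Fermi level sits at mid-gap, so the CBM lies $\epsilon_{gap}/2$ above it. The function name `cbm_occupation` and the choice of 300 K are assumptions made here; the gap values are the ones quoted in the text.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K (1 eV / K_B is about 11 604 K)

def cbm_occupation(gap_ev, temperature_k):
    """Fermi occupation of a state at the CBM, with the Fermi level at mid-gap."""
    x = gap_ev / (2 * K_B * temperature_k)
    return 1.0 / (math.exp(x) + 1.0)

# Band gaps in eV, as quoted in the text.
gaps = {"Si": 1.2, "GaAs": 1.5, "SiC": 2.5, "C": 5.0}
occupations = {name: cbm_occupation(gap, 300.0) for name, gap in gaps.items()}

for name, occ in occupations.items():
    print(f"{name}: occupation of CBM state at 300 K ~ {occ:.1e}")
```

At room temperature the occupation is many orders of magnitude below one even for the smallest gap in the list, and it drops roughly exponentially as the gap grows, which is why the distinction between semiconductors and insulators is one of degree.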

Figure 4.9. Origin of the bands in sp³ bonded solids: the levels $s^A$, $p^A$ and $s^B$, $p^B$ correspond to the atoms A, B; the sp³ hybrids in each case are denoted by $\phi_i^A$, $\phi_i^B$ (i = 1, 2, 3, 4), and the bonding and antibonding states are denoted by $\psi_i^b$, $\psi_i^a$ (i = 1, 2, 3, 4) respectively; in the crystal, the bonding and antibonding states acquire dispersion, which leads to formation of the valence and conduction energy bands. The gap between the two manifolds of states, $\epsilon_{gap}$, is indicated.

In all the examples considered here it is easy to identify the lowest band at Γ as the s-like state of the bonding orbitals, which arise from the interaction of sp³ hybrids in nearest neighbor atoms. Since the sp³ orbitals involve one s and three p states, the corresponding p-like states of the bonding orbitals are at the top of the valence manifold at Γ. This is illustrated in Fig. 4.9: in this example we show the relative energy of atomic-like s and p orbitals for two different tetravalent elements (for example Si and C) which combine to form a solid in the zincblende lattice. A similar diagram, but with different occupation of the atomic orbitals, would apply to GaAs, which has three electrons in the Ga orbitals and five electrons in the As orbitals. This illustration makes it clear why the states near the VBM have p bonding character and are associated with the more electronegative element in the solid, while those near the CBM have p antibonding character and are associated with the less electronegative element in the solid: the character derives from the hybrid states which are closest in energy to the corresponding bands. In the case of a homopolar solid, the two sets of atomic-like orbitals are the same and the character of bonding and antibonding states near the VBM and CBM is not differentiated among the two atoms in the unit cell. The VBM is in all cases three-fold degenerate at Γ. These states disperse and their s and p character becomes less clear away from Γ. It is also interesting that the bottom s-like band is split off from the other three valence bands in the solids with two types of atoms, SiC and GaAs. In these cases the s-like state bears more resemblance to the corresponding atomic state in

Figure 4.10. Top: charge densities of four covalent solids, Si, C, SiC, GaAs, on the (110) plane of the diamond or zincblende lattice. The diagrams on the left indicate the atomic positions on the (110) plane in the diamond and zincblende lattices. Bottom: wavefunction magnitude for the eight lowest states (four below and four above the Fermi level) for Si and GaAs, on the same plane as the total charge density; panels are labeled n = 1, n = 2, 3, 4, n = 5, 6, 7 and n = 8 for Si, and n = 1, n = 2, 3, 4, n = 5 and n = 6, 7, 8 for GaAs. (Based on calculations by I.N. Remediakis.)

the more electronegative element (C in the case of SiC, As in the case of GaAs).

All these features can be identified in the charge density plots, shown in Fig. 4.10; a detailed comparison of such plots to experiment can be found in Ref. [52]. In all cases the valence electronic charge is concentrated between atomic positions. The high concentration of electrons in the regions between the nearest neighbor atomic sites represents the covalent bonds between these atoms. Regions far from the bonds are completely devoid of charge. Moreover, there are just enough of these bonds to accommodate all the valence electrons. Specifically, there are four covalent bonds emanating from each atom (only two can be seen on the plane of Fig. 4.10), and since each bond is shared by two atoms, there are two covalent bonds per atom. Since each bonding state can accommodate two electrons due to spin degeneracy, these covalent bonds take up all the valence electrons in the solid. In the case of Si and C this distribution is the same relative to all atomic positions, whereas in SiC and GaAs it is polarized closer to the more electronegative atoms (C and As, respectively).

The magnitude of the wavefunctions at Γ for Si and GaAs, shown in Fig. 4.10, reveals features that we would expect from the preceding analysis: the lowest state (n = 1) has s-like bonding character around all atoms in Si and around mostly the As atoms in GaAs, as the energy level diagram of Fig. 4.9 suggests. The next three states, which are degenerate (n = 2, 3, 4), have p-like bonding character, with lobes pointing in the direction of the nearest neighbors. In GaAs, the states n = 2, 3, 4 have As p-like bonding character. The next three unoccupied degenerate states in Si (n = 5, 6, 7) have p antibonding character, with pronounced lobes pointing away from the direction of the nearest neighbors and nodes in the middle of the bonds; the next unoccupied state in Si (n = 8) has s antibonding character, also pointing away from the direction of the nearest neighbors. In GaAs, the next state (n = 5) has clearly antibonding character, with a node in the middle of the bond and significant weight at both the As and Ga sites. Finally, the next three unoccupied degenerate states (n = 6, 7, 8) have clearly antibonding character with nodes in the middle of the bonds and have large weight at the Ga atomic sites.

4.3.3 3D metallic solids

As a last example we consider two metals, Al and Ag. The first is a simple metal, in the sense that only s and p orbitals are involved: the corresponding atomic states are 3s, 3p. Its band structure, shown in Fig. 4.11, has the characteristics of the free-electron band structure in the corresponding 3D FCC lattice. Indeed, Al is the prototypical solid with behavior close to that of free electrons. The dispersion near the bottom of the lowest band is nearly a perfect parabola, as would be expected for

free electrons. Since there are only three valence electrons per atom and one atom per unit cell, we expect that on average 1.5 bands will be occupied throughout the BZ. As seen in Fig. 4.11 the Fermi level is at a position which makes the lowest band completely full throughout the BZ, and small portions of the second band full, especially along the X–W–L high symmetry lines.

Figure 4.11. Band structure of two representative metallic solids: Al, a free-electron metal, and Ag, a d-electron metal. The zero of energy denotes the Fermi level. Both panels are plotted along the high-symmetry path L–K–W–Γ–X–W–L–Γ–K. (Based on calculations by I.N. Remediakis.)

The total charge density and magnitude of the wavefunctions for the lowest states at Γ, shown in Fig. 4.12, reveal features expected from the analysis above. Specifically, the total charge density is evenly distributed throughout the crystal.

Although there is more concentration of electrons in the regions between nearest neighbor atomic positions, this charge concentration cannot be interpreted as a covalent bond, because there are too many neighbors (12) sharing the three valence electrons. As far as individual states are concerned, the lowest band (n = 1) is clearly of s character, uniformly distributed around each atomic site, and with large weight throughout the crystal. This corresponds to the metallic bonding state. The next three degenerate states (n = 2, 3, 4) are clearly of p character, with pronounced lobes pointing toward nearest neighbors. At Γ these states are unoccupied, but at other points in the BZ they are occupied, thus contributing to bonding. These states give rise to the features of the total charge density that appear like directed bonds.

Figure 4.12. Electronic charge densities on a (100) plane of the FCC lattice for Al and Ag. The total charge density and the magnitude of the wavefunctions for the three lowest states (not counting degeneracies) at Γ are shown. The atomic positions are at the center and the four corners of the square. Panels for Al: total; n = 1; n = 2, 3, 4; n = 5, 6, 7. Panels for Ag: total; n = 1; n = 2, 3, 4; n = 5, 6. (Based on calculations by I.N. Remediakis.)

Finally, the next set of unoccupied states (n = 5, 6, 7) are clearly

of antibonding character with nodes in the direction of the nearest neighbors; their energy is above the Fermi level throughout the BZ.

The second example, Ag, is more complicated because it involves s and d electrons: the corresponding atomic states are 4d, 5s. In this case we have 11 valence electrons, and we expect 5.5 bands to be filled on average in the BZ. Indeed, we see five bands with little dispersion near the bottom of the energy range, all of which are filled states below the Fermi level. There is also one band with large dispersion, which intersects the Fermi level at several points, and is on average half-filled. The five low-energy occupied bands are essentially bands arising from the 4d states. Their low dispersion is indicative of weak interactions among these orbitals. The next band can be identified with the s-like bonding band. This band interacts and hybridizes with the d bands, as the mixing of the spectrum near Γ suggests. In fact, if we were to neglect the five d bands, the rest of the band structure looks remarkably similar to that of Al. In both cases, the Fermi level intersects bands with high dispersion at several points, and thus there are plenty of states immediately below and above the Fermi level for thermal excitation of electrons. This indicates that both solids will act as good metals, being able to carry current when placed in an external electric field.

The total valence charge density and the magnitude of wavefunctions for the few lowest energy states at Γ, shown in Fig. 4.12, reveal some interesting features. First, notice how the total charge density is mostly concentrated around the atoms, and there seems to be little interaction between these atoms. This is consistent with the picture we had discussed of the noble metals, namely that they have an essentially full electronic shell (the 4d shell in Ag), and one additional s electron which is shared among all the atoms in the crystal. Indeed the wavefunction of the lowest energy state at Γ (n = 1) clearly exhibits this character: it is uniformly distributed in the regions between atoms and thus contributes to bonding. The distribution of charge corresponding to this state is remarkably similar to the corresponding state in Al. This state is strongly repelled from the atomic cores, leaving large holes at the positions where the atoms are. The next five states are in two groups, one three-fold degenerate, and one two-fold degenerate. All these states have clearly d character, with the characteristic four lobes emanating from the atomic sites. They seem to be very tightly bound to the atoms, with very small interaction across neighboring sites, as one would expect for core-like completely filled states. When the charge of these states is added up, it produces the completely spherically symmetric distribution shown in the total-density panel, as expected for non-interacting atoms. The lack of interaction among these states is reflected in the lack of dispersion of their energies in the band structure plot, Fig. 4.11. Only the s state, which is shared among all atoms, shows significant dispersion and contributes to the metallic bonding in this solid.
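The band-counting estimates quoted for the two metals (1.5 filled bands for Al, 5.5 for Ag) follow from the fact that each band accommodates two electrons per unit cell because of spin degeneracy. The short sketch below spells out that arithmetic; the function name `average_filled_bands` and its arguments are illustrative choices, not notation from the text.

```python
def average_filled_bands(valence_electrons_per_atom, atoms_per_cell=1):
    """Average number of filled bands over the BZ.

    Each band holds 2 electrons per unit cell (spin degeneracy), so the
    average filling is (electrons per unit cell) / 2.
    """
    return valence_electrons_per_atom * atoms_per_cell / 2.0


# FCC metals with one atom per unit cell, valence counts as quoted in the text.
for name, n_val in [("Al", 3), ("Ag", 11)]:
    print(name, average_filled_bands(n_val))

# Si in the diamond lattice: 4 valence electrons, 2 atoms per unit cell.
print("Si", average_filled_bands(4, atoms_per_cell=2))
```

For Si the same count gives four filled bands, in line with the four occupied states at Γ discussed for Fig. 4.10.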

Further reading

1. Electronic Structure and the Properties of Solids, W.A. Harrison (W.H. Freeman, San Francisco, 1980). This book contains a general discussion of the properties of solids based on the tight-binding approximation.
2. Handbook of the Band Structure of Elemental Solids, D.A. Papaconstantopoulos (Plenum Press, New York, 1989). This is a comprehensive account of the tight-binding approximation and its application to the electronic structure of elemental solids.
3. Planewaves, Pseudopotentials and the LAPW Method, D. Singh (Kluwer Academic, Boston, 1994). This book contains a detailed account of the augmented plane wave approach and the plane wave approach and their applications to the electronic structure of solids.
4. Electronic Structure and Optical Properties of Semiconductors, M.L. Cohen and J.R. Chelikowsky (Springer-Verlag, Berlin, 1988). This book is a thorough compilation of information relevant to semiconductor crystals, a subfield of great importance to the theory of solids.
5. Electronic States and Optical Transitions in Solids, F. Bassani and G. Pastori Parravicini (Pergamon Press, Oxford, 1975).
6. Calculated Electronic Properties of Metals, V.L. Moruzzi, J.F. Janak and A.R. Williams (Pergamon Press, New York, 1978). This book is a thorough compilation of the electronic structure of elemental metallic solids, as obtained with the APW method.

Problems

1. Prove the orthogonality relation, Eq. (4.6).

2. The relationship between the band width and the number of nearest neighbors, Eq. (4.20), was derived for the simple chain, square and cubic lattices in one, two and three dimensions, using the simplest tight-binding model with one atom per unit cell and one s-like orbital per atom. For these lattices, the number of neighbors z is always 2d, where d is the dimensionality. Consider the same simple tight-binding model for the close-packed lattices in two and three dimensions, that is, the simple hexagonal lattice in 2D and the FCC lattice in 3D, and derive the corresponding relation between the band width and the number of nearest neighbors.

3. Consider a single plane of the graphite lattice, defined by the lattice vectors $\mathbf{a}_1$, $\mathbf{a}_2$ and atomic positions $\mathbf{t}_1$, $\mathbf{t}_2$:

$$ \mathbf{a}_1 = \frac{\sqrt{3}}{2} a \hat{x} + \frac{1}{2} a \hat{y}, \quad \mathbf{a}_2 = \frac{\sqrt{3}}{2} a \hat{x} - \frac{1}{2} a \hat{y}, \quad \mathbf{t}_1 = 0, \quad \mathbf{t}_2 = \frac{1}{\sqrt{3}} a \hat{x} \tag{4.54} $$

where a is the lattice constant of the 2D hexagonal plane; this plane is referred to as graphene.
(a) Take a basis for each atom which consists of four orbitals, $s$, $p_x$, $p_y$, $p_z$. Determine the hamiltonian matrix for this system at each high-symmetry point of the IBZ, with nearest neighbor interactions in the tight-binding approximation, assuming

an orthogonal overlap matrix. Use proper combinations of the atomic orbitals to take advantage of the symmetry of the problem (see also chapter 1).
(b) Choose the parameters that enter into the hamiltonian by the same arguments that were used in the 2D square lattice, scaled appropriately to reflect the interactions between carbon orbitals. How well does your choice of parameters reproduce the important features of the true band structure, shown in Fig. 4.6?
(c) Show that the π bands can be described reasonably well by a model consisting of a single orbital per atom, with on-site energy $\epsilon_0$ and nearest neighbor hamiltonian matrix element $t$, which yields the following expression for the bands:

$$ \epsilon_{\mathbf{k}}^{(\pm)} = \epsilon_0 \pm t \left[ 1 + 4 \cos\left(\frac{\sqrt{3} a}{2} k_x\right) \cos\left(\frac{a}{2} k_y\right) + 4 \cos^2\left(\frac{a}{2} k_y\right) \right]^{1/2} \tag{4.55} $$

Choose the values of the parameters in this simple model to obtain as close an approximation as possible to the true bands of Fig. 4.6. Comment on the differences between these model bands and the true π bands of graphite.

Figure 4.13. Model for the CuO2 plane band structure: the Cu $d_{x^2-y^2}$ and the O $p_x$ and $p_y$ orbitals are shown, with positive lobes in white and negative lobes in black. The lattice vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ are offset from the atomic positions for clarity.

4. Consider the following two-dimensional model: the lattice is a square of side $a$ with lattice vectors $\mathbf{a}_1 = a\hat{x}$ and $\mathbf{a}_2 = a\hat{y}$; there are three atoms per unit cell, one Cu atom and two O atoms at distances $\mathbf{a}_1/2$ and $\mathbf{a}_2/2$, as illustrated in Fig. 4.13. We will assume that the atomic orbitals associated with these atoms are orthogonal.
(a) Although there are five d orbitals associated with the Cu atom and three p orbitals associated with each O atom, only one of the Cu orbitals (the $d_{x^2-y^2}$ one) and two

of the O orbitals (the $p_x$ one on the O atom at $\mathbf{a}_1/2$ and the $p_y$ one on the O atom at $\mathbf{a}_2/2$) are relevant, because of the geometry; the remaining Cu and O orbitals do not interact with their neighbors. Explain why this is a reasonable approximation, and in particular why the Cu–$d_{3z^2-r^2}$ orbitals do not interact with their nearest neighbor O–$p_z$ orbitals.
(b) Define the hamiltonian matrix elements between the relevant Cu–d and O–p nearest neighbor orbitals and take the on-site energies to be $\epsilon_d$ and $\epsilon_p$, with $\epsilon_p < \epsilon_d$. Use these matrix elements to calculate the band structure for this model.
(c) Discuss the position of the Fermi level for the case when there is one electron in each O orbital and one or two electrons in the Cu orbital.
Historical note: even though this model may appear as an artificial one, it has been used extensively to describe the basic electronic structure of the copper-oxide–rare earth materials which are high-temperature superconductors [53].

5. Calculate the free-electron band structure in 3D for an FCC lattice and compare it with the band structure of Al given in Fig. 4.11. To what extent are the claims made in the text, about the resemblance of the two band structures, valid?
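As a possible starting point for Problem 5, the sketch below evaluates the free-electron ("empty lattice") levels $\hbar^2 |\mathbf{k} + \mathbf{G}|^2 / 2m_e$ at a few high-symmetry points of the FCC Brillouin zone. It assumes the conventional cubic axes, in which the FCC reciprocal lattice vectors are $(2\pi/a)(h, k, l)$ with $h, k, l$ all even or all odd; wave-vectors are given in units of $2\pi/a$ and energies in units of $(\hbar^2/2m_e)(2\pi/a)^2$. The function names and the cutoff `nmax` are illustrative choices, and the cutoff is only adequate for the lowest few levels.

```python
import itertools
from collections import Counter


def fcc_reciprocal_vectors(nmax=3):
    """FCC reciprocal lattice vectors in units of 2*pi/a: integer triples
    (h, k, l) whose entries are all even or all odd (a BCC lattice)."""
    vectors = []
    for g in itertools.product(range(-nmax, nmax + 1), repeat=3):
        if len({x % 2 for x in g}) == 1:  # all entries share the same parity
            vectors.append(g)
    return vectors


def empty_lattice_levels(k, n_levels=3, nmax=3):
    """Lowest free-electron levels at wave-vector k (units of 2*pi/a),
    returned as (energy, degeneracy) pairs sorted by energy, with energy
    in units of (hbar^2 / 2 m_e) (2*pi/a)^2. Spin is not counted."""
    levels = Counter()
    for g in fcc_reciprocal_vectors(nmax):
        energy = sum((ki + gi) ** 2 for ki, gi in zip(k, g))
        levels[round(float(energy), 10)] += 1
    return sorted(levels.items())[:n_levels]


points = {"Gamma": (0.0, 0.0, 0.0), "X": (1.0, 0.0, 0.0), "L": (0.5, 0.5, 0.5)}
for label, k in points.items():
    print(label, empty_lattice_levels(k))
```

Degeneracies obtained this way are the ones that a weak crystal potential would split, which is one way to frame the comparison with the Al bands of Fig. 4.11.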

5 Applications of band theory

In the previous chapter we examined in detail methods for solving the single-particle equations for electrons in solids. The resulting energy eigenvalues (band structure) and corresponding eigenfunctions provide insight into how electrons are arranged, both from an energetic and from a spatial perspective, to produce the cohesion between atoms in the solid. The results of such calculations can be useful in several other ways. The band structure of the solid can elucidate the way in which the electrons will respond to external perturbations, such as absorption or emission of light. This response is directly related to the optical and electrical properties of the solid. For example, using the band structure one can determine the possible optical excitations which in turn determine the color, reflectivity, and dielectric response of the solid. A related effect is the creation of excitons, that is, bound pairs of electrons and holes, which are also important in determining optical and electrical properties. Finally, the band structure can be used to calculate the total energy of the solid, from which one can determine a wide variety of thermodynamic and mechanical properties. In the present chapter we examine the theoretical tools for calculating all these aspects of a solid's behavior.

5.1 Density of states

A useful concept in analyzing the band structure of solids is the density of states as a function of the energy. To illustrate this concept we consider first the free-electron model. The density of states $g(\epsilon)\,d\epsilon$ for energies in the range $[\epsilon, \epsilon + d\epsilon]$ is given by a sum over all states with energy in that range. Since the states are characterized by their wave-vector $\mathbf{k}$, we simply need to add up all states with energy in the interval of interest. Taking into account the usual factor of 2 for spin degeneracy and normalizing by the volume of the solid $\Omega$, we get

$$ g(\epsilon)\,d\epsilon = \frac{1}{\Omega} \sum_{\mathbf{k},\; \epsilon_{\mathbf{k}} \in [\epsilon, \epsilon + d\epsilon]} 2 = \frac{2}{(2\pi)^3} \int_{\epsilon_{\mathbf{k}} \in [\epsilon, \epsilon + d\epsilon]} d\mathbf{k} = \frac{1}{\pi^2} k^2\, dk \tag{5.1} $$

where we have used spherical coordinates in k-space to obtain the last result as well as the fact that in the free-electron model the energy does not depend on the angular orientation of the wave-vector:

$$ \epsilon_{\mathbf{k}} = \frac{\hbar^2 |\mathbf{k}|^2}{2 m_e} \Longrightarrow k\, dk = \frac{m_e}{\hbar^2}\, d\epsilon, \quad k = \left( \frac{2 m_e \epsilon}{\hbar^2} \right)^{1/2} \tag{5.2} $$

for $\epsilon_{\mathbf{k}} \in [\epsilon, \epsilon + d\epsilon]$. These relations give for the density of states in this simple model in 3D:

$$ g(\epsilon) = \frac{1}{2\pi^2} \left( \frac{2 m_e}{\hbar^2} \right)^{3/2} \sqrt{\epsilon} \tag{5.3} $$

In a crystal, instead of this simple relationship between energy and momentum which applies to free electrons, we have to use the band-structure calculation for $\epsilon_{\mathbf{k}}$. Then the expression for the density of states becomes

$$ g(\epsilon) = \frac{1}{\Omega} \sum_{n,\mathbf{k}} 2\,\delta(\epsilon - \epsilon_{\mathbf{k}}^{(n)}) = \frac{2}{(2\pi)^3} \sum_n \int \delta(\epsilon - \epsilon_{\mathbf{k}}^{(n)})\, d\mathbf{k} = \frac{2}{(2\pi)^3} \sum_n \int_{\epsilon_{\mathbf{k}}^{(n)} = \epsilon} \frac{1}{|\nabla_{\mathbf{k}} \epsilon_{\mathbf{k}}^{(n)}|}\, dS_{\mathbf{k}} \tag{5.4} $$

where the last integral is over a surface in k-space on which $\epsilon_{\mathbf{k}}^{(n)}$ is constant and equal to $\epsilon$. In this final expression, the roots of the denominator are of first order and therefore contribute a finite quantity to the integration over a smooth two-dimensional surface represented by $S_{\mathbf{k}}$; these roots introduce sharp features in the function $g(\epsilon)$, which are called "van Hove singularities". For the values $\mathbf{k}_0$ where $\nabla_{\mathbf{k}} \epsilon_{\mathbf{k}} = 0$, we can expand the energy in a Taylor expansion (from here on we consider the contribution of a single band and drop the band index for simplicity):

$$ [\nabla_{\mathbf{k}} \epsilon_{\mathbf{k}}]_{\mathbf{k} = \mathbf{k}_0} = 0 \Rightarrow \epsilon_{\mathbf{k}} = \epsilon_{\mathbf{k}_0} + \sum_{i=1}^{d} \alpha_i (k_i - k_{0,i})^2 \tag{5.5} $$

The expansion is over as many principal axes as the dimensionality of our system: in three dimensions ($d = 3$) there are three principal axes, characterized by the symbols $\alpha_i$, $i = 1, 2, 3$. Depending on the signs of these coefficients, the extremum can be a minimum (zero negative coefficients, referred to as "type 0 critical point"), two types of saddle point ("type 1 and 2 critical points") or a maximum ("type 3 critical point"). There is a useful theorem that tells us exactly how many critical points of each type we can expect.

Theorem. Given a function of $d$ variables periodic in all of them, there are $d!/l!(d - l)!$ critical points of type $l$, where $l$ is the number of negative coefficients in the Taylor expansion of the energy, Eq. (5.5).
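The counting in the theorem can be checked on the simplest periodic function, $\epsilon_{\mathbf{k}} = -\sum_i \cos k_i$. This is a simple cubic tight-binding band with the lattice constant and hopping element set to unity, chosen here only as an illustration; it is an assumption of this sketch, not a band discussed in the text. Its gradient vanishes where every $k_i$ is 0 or $\pi$, and the Taylor coefficient along axis $i$ is $\alpha_i = \frac{1}{2} \cos k_{0,i}$, so the critical points can be classified by direct enumeration.

```python
import itertools
from math import comb, cos, pi


def critical_point_counts(d=3):
    """Count the critical points of eps(k) = -sum_i cos(k_i) in one period,
    grouped by type l = number of negative Taylor coefficients alpha_i."""
    counts = [0] * (d + 1)
    # The gradient components sin(k_i) all vanish only if each k_i is 0 or pi.
    for k0 in itertools.product((0.0, pi), repeat=d):
        # Second-order coefficient along axis i: alpha_i = cos(k0_i) / 2.
        alphas = [0.5 * cos(ki) for ki in k0]
        l = sum(1 for a in alphas if a < 0)
        counts[l] += 1
    return counts


d = 3
print(critical_point_counts(d))             # counts for l = 0, 1, ..., d
print([comb(d, l) for l in range(d + 1)])   # d! / (l! (d - l)!) from the theorem
```

For this model band the enumeration reproduces the binomial multiplicities; bands of real solids can have additional critical points beyond this count.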

Table 5.1. Critical points in d = 3 dimensions. Symbols ($l$, $M_l$), multiplicity = $d!/l!(d - l)!$, type, and characteristic behavior of the coefficients $\alpha_i$, $i = 1, 2, 3$ along the principal axes.

l    Symbol    Multiplicity     Type            Coefficients
0    $M_0$     3!/0!3! = 1      minimum         $\alpha_1, \alpha_2, \alpha_3 > 0$
1    $M_1$     3!/1!2! = 3      saddle point    $\alpha_1, \alpha_2 > 0$, $\alpha_3 < 0$
2    $M_2$     3!/2!1! = 3      saddle point    $\alpha_1 > 0$, $\alpha_2, \alpha_3 < 0$
3    $M_3$     3!/3!0! = 1      maximum         $\alpha_1, \alpha_2, \alpha_3 < 0$

With the help of this theorem, we can obtain the number of each type of critical point in d = 3 dimensions, which are given in Table 5.1. Next we want to extract the behavior of the density of states (DOS) explicitly near each type of critical point. Let us first consider the critical point of type 0, $M_0$, in which case $\alpha_i > 0$, $i = 1, 2, 3$. In order to perform the k-space integrals involved in the DOS we first make the following changes of variables: in the neighborhood of the critical point at $\mathbf{k}_0$,

$$ \epsilon_{\mathbf{k}} - \epsilon_{\mathbf{k}_0} = \alpha_1 k_1^2 + \alpha_2 k_2^2 + \alpha_3 k_3^2 \tag{5.6} $$

where $\mathbf{k}$ is measured relative to $\mathbf{k}_0$. We can choose the principal axes so that $\alpha_1$, $\alpha_2$ have the same sign; we can always do this since there are always at least two coefficients with the same sign (see Table 5.1). We rescale these axes so that $\alpha_1 = \alpha_2 = \beta$ after the scaling and introduce cylindrical coordinates for the rescaled variables $k_1$, $k_2$:

$$ q^2 = k_1^2 + k_2^2, \quad \theta = \tan^{-1}(k_2 / k_1) \tag{5.7} $$

With these changes, the DOS function takes the form

$$ g(\epsilon) = \frac{\lambda}{(2\pi)^3} \int \delta(\epsilon - \epsilon_{\mathbf{k}_0} - \beta q^2 - \alpha_3 k_3^2)\, d\mathbf{k} \tag{5.8} $$

where the factor $\lambda$ comes from rescaling the principal axes 1 and 2. We can rescale the variables $q$, $k_3$ so that their coefficients become unity, to obtain

$$ g(\epsilon) = \frac{\lambda}{(2\pi)^2 \beta \alpha_3^{1/2}} \int \delta(\epsilon - \epsilon_{\mathbf{k}_0} - q^2 - k_3^2)\, q\, dq\, dk_3 \tag{5.9} $$

Now we can consider the expression in the argument of the δ-function as a function of $k_3$:

$$ f(k_3) = \epsilon - \epsilon_{\mathbf{k}_0} - q^2 - k_3^2 \Rightarrow f'(k_3) = -2 k_3 \tag{5.10} $$

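and we can integrate over $k_3$ with the help of the expression for δ-function integration Eq. (G.60) (derived in Appendix G), which gives

$$ g(\epsilon) = \lambda_0 \int_0^Q \frac{1}{(\epsilon - \epsilon_{\mathbf{k}_0} - q^2)^{1/2}}\, q\, dq \tag{5.11} $$

where the factor $\lambda_0$ embodies all the constants in front of the integral from rescaling and integration over $k_3$. The upper limit of integration $Q$ for the variable $q$ is determined by the condition

$$ k_3^2 = \epsilon - \epsilon_{\mathbf{k}_0} - q^2 \geq 0 \Rightarrow Q = (\epsilon - \epsilon_{\mathbf{k}_0})^{1/2} \tag{5.12} $$

and with this we can now perform the final integration to obtain

$$ g(\epsilon) = -\lambda_0 \left[ (\epsilon - \epsilon_{\mathbf{k}_0} - q^2)^{1/2} \right]_0^{(\epsilon - \epsilon_{\mathbf{k}_0})^{1/2}} = \lambda_0 (\epsilon - \epsilon_{\mathbf{k}_0})^{1/2} \tag{5.13} $$

This result holds for $\epsilon > \epsilon_{\mathbf{k}_0}$. For $\epsilon < \epsilon_{\mathbf{k}_0}$, the δ-function cannot be satisfied for the case we are investigating, with $\alpha_i > 0$, $i = 1, 2, 3$, so the DOS must be zero.

By an exactly analogous calculation, we find that for the maximum $M_3$, with $\alpha_i < 0$, $i = 1, 2, 3$, the DOS behaves as

$$ g(\epsilon) = \lambda_3 (\epsilon_{\mathbf{k}_0} - \epsilon)^{1/2} \tag{5.14} $$

for $\epsilon < \epsilon_{\mathbf{k}_0}$ and it is zero for $\epsilon > \epsilon_{\mathbf{k}_0}$.

For the other two cases, $M_1$, $M_2$, we can perform a similar analysis. We outline briefly the calculation for $M_1$: in this case, we have after rescaling $\alpha_1 = \alpha_2 = \beta > 0$ and $\alpha_3 < 0$, which leads to

$$ g(\epsilon) = \frac{\lambda}{(2\pi)^2 \beta |\alpha_3|^{1/2}} \int \delta(\epsilon - \epsilon_{\mathbf{k}_0} - q^2 + k_3^2)\, q\, dq\, dk_3 \rightarrow \lambda_1 \int_{Q_1}^{Q_2} \frac{1}{(q^2 + \epsilon_{\mathbf{k}_0} - \epsilon)^{1/2}}\, q\, dq \tag{5.15} $$

and we need to specify the limits of the last integral from the requirement that $q^2 + \epsilon_{\mathbf{k}_0} - \epsilon \geq 0$. There are two possible situations:

(i) For $\epsilon < \epsilon_{\mathbf{k}_0}$ the condition $q^2 + \epsilon_{\mathbf{k}_0} - \epsilon \geq 0$ is always satisfied, so that the lower limit of $q$ is $Q_1 = 0$, and the upper limit is any positive value $Q_2 = Q > 0$. Then the DOS becomes

$$ g(\epsilon) = \lambda_1 \left[ (q^2 + \epsilon_{\mathbf{k}_0} - \epsilon)^{1/2} \right]_0^Q = \lambda_1 (Q^2 + \epsilon_{\mathbf{k}_0} - \epsilon)^{1/2} - \lambda_1 (\epsilon_{\mathbf{k}_0} - \epsilon)^{1/2} \tag{5.16} $$

For $\epsilon \rightarrow \epsilon_{\mathbf{k}_0}$, expanding in powers of the small quantity $(\epsilon_{\mathbf{k}_0} - \epsilon)$ gives

$$ g(\epsilon) = \lambda_1 Q - \lambda_1 (\epsilon_{\mathbf{k}_0} - \epsilon)^{1/2} + O(\epsilon_{\mathbf{k}_0} - \epsilon) \tag{5.17} $$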

(ii) For $\epsilon > \epsilon_{\mathbf{k}_0}$ the condition $q^2 + \epsilon_{\mathbf{k}_0} - \epsilon \geq 0$ is satisfied for a lower limit $Q_1 = (\epsilon - \epsilon_{\mathbf{k}_0})^{1/2}$ and an upper limit being any positive value $Q_2 = Q > (\epsilon - \epsilon_{\mathbf{k}_0})^{1/2} > 0$. Then the DOS becomes

$$ g(\epsilon) = \lambda_1 \left[ (q^2 + \epsilon_{\mathbf{k}_0} - \epsilon)^{1/2} \right]_{(\epsilon - \epsilon_{\mathbf{k}_0})^{1/2}}^{Q} = \lambda_1 (Q^2 + \epsilon_{\mathbf{k}_0} - \epsilon)^{1/2} \tag{5.18} $$

For $\epsilon \rightarrow \epsilon_{\mathbf{k}_0}$, expanding in powers of the small quantity $(\epsilon_{\mathbf{k}_0} - \epsilon)$ gives

$$ g(\epsilon) = \lambda_1 Q + \frac{\lambda_1}{2Q} (\epsilon_{\mathbf{k}_0} - \epsilon) + O[(\epsilon_{\mathbf{k}_0} - \epsilon)^2] \tag{5.19} $$

By an exactly analogous calculation, we find that for the other saddle point $M_2$, with $\alpha_1, \alpha_2 < 0$, $\alpha_3 > 0$, the DOS behaves as

$$ g(\epsilon) = \lambda_2 Q + \frac{\lambda_2}{2Q} (\epsilon - \epsilon_{\mathbf{k}_0}) + O[(\epsilon - \epsilon_{\mathbf{k}_0})^2] \tag{5.20} $$

for $\epsilon < \epsilon_{\mathbf{k}_0}$ and

$$ g(\epsilon) = \lambda_2 Q - \lambda_2 (\epsilon - \epsilon_{\mathbf{k}_0})^{1/2} + O(\epsilon - \epsilon_{\mathbf{k}_0}) \tag{5.21} $$

for $\epsilon > \epsilon_{\mathbf{k}_0}$. The behavior of the DOS for all the critical points is summarized graphically in Fig. 5.1. We should caution the reader that very detailed calculations are required to resolve these critical points. In particular, methods must be developed that allow the inclusion of eigenvalues at a very large number of sampling points in reciprocal space, usually by interpolation between points at which electronic eigenvalues are actually calculated (for detailed treatments of such methods see Refs. [54, 55]).

Figure 5.1. The behavior of the DOS near critical points of different type in three dimensions. Panels, each showing $g(\epsilon)$ versus $\epsilon$: minimum, maximum, saddle point 1, saddle point 2.

As an example, we show in Fig. 5.2 the DOS for three real solids: a typical semiconductor (Si), a free-electron metal (Al) and a transition (d-electron) metal (Ag). In each case the DOS shows the characteristic features we would expect from the detailed discussion of the band structure of these solids, presented in chapter 4. For instance, the valence bands in Si show a low-energy hump associated with the s-like states and a broader set of features associated with the p-like states which have larger dispersion; the DOS also reflects the presence of the band gap, with valence and conduction bands clearly separated. Al has an almost featureless DOS, corresponding to the behavior of free electrons with the characteristic $\epsilon^{1/2}$

dependence at the bottom of the energy range. Finally, Ag has a large DOS with significant structure in the range of energies where the d-like states lie, but has a very low and featureless DOS beyond that range, corresponding to the s-like state which has free-electron behavior.

Figure 5.2. Examples of calculated electronic density of states of real solids: silicon (Si), a semiconductor with the diamond crystal structure, aluminum (Al), a free-electron metal with the FCC crystal structure, and silver (Ag), a transition (d-electron) metal also with the FCC crystal structure. The Fermi level is denoted in each case by $\epsilon_F$ and by a vertical dashed line. Several critical points are identified by arrows for Si (a minimum, a maximum and a saddle point of each kind) and for the metals (a minimum in each case). The density of states scale is not the same for the three cases, in order to bring out the important features.

5.2 Tunneling at metal–semiconductor contact

We consider next a contact between a metal and a semiconductor which has a band gap $\epsilon_{gap}$. For an intrinsic semiconductor without any impurities, the Fermi level will be in the middle of the band gap (see chapter 9). At equilibrium the Fermi level on both sides of the contact must be the same. We will assume that the gap is sufficiently

small to consider the density of states in the metal as being constant over a range of energies at least equal to $\epsilon_{gap}$ around the Fermi level. We will also take the metal side to be under a bias voltage $V$. In general, the current flowing from side 2 to side 1 of a contact will be given by

$$ I_{2 \rightarrow 1} = - \sum_{\mathbf{k}\mathbf{k}'} |T_{\mathbf{k}\mathbf{k}'}|^2 \left( 1 - n_{\mathbf{k}}^{(1)} \right) n_{\mathbf{k}'}^{(2)} \tag{5.22} $$

where $T_{\mathbf{k}\mathbf{k}'}$ is the tunneling matrix element between the relevant single-particle states and $n_{\mathbf{k}}^{(1)}$, $n_{\mathbf{k}'}^{(2)}$ are the Fermi occupation numbers on each side; the overall minus sign comes from the fact that the expression under the summation accounts for electrons transferred from side 2 to 1, whereas the standard definition of the current involves transfer of positive charge. We will assume that, to a good approximation, the tunneling matrix elements are independent of $\mathbf{k}$, $\mathbf{k}'$. We can then turn the summations over $\mathbf{k}$ and $\mathbf{k}'$ into integrals over the energy by introducing the density of states associated with each side. When a voltage bias $V$ is applied to the metal side, all its states are shifted by $+eV$ so that at a given energy $\epsilon$ the density of states sampled is $g^{(m)}(\epsilon - eV)$. With these assumptions, applying the above expression for the current to the semiconductor-metal contact (with each side now identified by the corresponding superscript) we obtain

$$ I_{m \rightarrow s} = -|T|^2 \int_{-\infty}^{\infty} g^{(s)}(\epsilon) \left[ 1 - n^{(s)}(\epsilon) \right] g^{(m)}(\epsilon - eV)\, n^{(m)}(\epsilon - eV)\, d\epsilon $$

$$ I_{s \rightarrow m} = -|T|^2 \int_{-\infty}^{\infty} g^{(s)}(\epsilon)\, n^{(s)}(\epsilon)\, g^{(m)}(\epsilon - eV) \left[ 1 - n^{(m)}(\epsilon - eV) \right] d\epsilon $$

Subtracting the two expressions to obtain the total current flowing through the contact we find

$$ I = -|T|^2 \int_{-\infty}^{\infty} g^{(s)}(\epsilon)\, g^{(m)}(\epsilon - eV) \left[ n^{(s)}(\epsilon) - n^{(m)}(\epsilon - eV) \right] d\epsilon \tag{5.23} $$

What is usually measured experimentally is the so-called differential conductance, given by the derivative of the current with respect to applied voltage:

$$ \frac{dI}{dV} = |T|^2 \int_{-\infty}^{\infty} g^{(s)}(\epsilon)\, g^{(m)}(\epsilon - eV)\, \frac{\partial n^{(m)}(\epsilon - eV)}{\partial V}\, d\epsilon \tag{5.24} $$

where we have taken $g^{(m)}(\epsilon - eV)$ to be independent of $V$ in the range of order $\epsilon_{gap}$ around the Fermi level, consistent with our assumption about the behavior of the metal density of states. At zero temperature the Fermi occupation number has the form

$$ n^{(m)}(\epsilon) = 1 - \theta(\epsilon - \epsilon_F) $$

where $\theta(\epsilon)$ is the Heaviside step-function, and therefore its derivative is a δ-function (see Appendix G):

$$ \frac{\partial n^{(m)}(\epsilon - eV)}{\partial V} = -e\, \frac{\partial n^{(m)}(\epsilon')}{\partial \epsilon'} = e\, \delta(\epsilon' - \epsilon_F) $$

where we have introduced the auxiliary variable $\epsilon' = \epsilon - eV$. Using this result in the expression for the differential conductance we find

$$ \left[ \frac{dI}{dV} \right]_{T=0} = e |T|^2 g^{(s)}(\epsilon_F + eV)\, g^{(m)}(\epsilon_F) \tag{5.25} $$

which shows that by scanning the voltage $V$ one samples the density of states of the semiconductor. Thus, the measured differential conductance will reflect all the features of the semiconductor density of states, including the gap.

5.3 Optical excitations

Let us consider what can happen when an electron in a crystalline solid absorbs a photon and jumps to an excited state. The transition rate for such an excitation is given by Fermi's golden rule (see Appendix B, Eq. (B.63)):

$$ P_{i \rightarrow f}(\omega) = \frac{2\pi}{\hbar} \left| \langle \psi_{\mathbf{k}'}^{(n')} | H^{int} | \psi_{\mathbf{k}}^{(n)} \rangle \right|^2 \delta(\epsilon_{\mathbf{k}'}^{(n')} - \epsilon_{\mathbf{k}}^{(n)} - \hbar\omega) \tag{5.26} $$

where $\langle f| = \langle \psi_{\mathbf{k}'}^{(n')}|$, $|i\rangle = |\psi_{\mathbf{k}}^{(n)}\rangle$ are the final and initial states of the electron in the crystal with the corresponding eigenvalues $\epsilon_{\mathbf{k}'}^{(n')}$, $\epsilon_{\mathbf{k}}^{(n)}$, and the interaction hamiltonian is given by (see Appendix B):

$$ H^{int}(\mathbf{r}, t) = \frac{e}{m_e c} \left[ e^{i(\mathbf{q} \cdot \mathbf{r} - \omega t)} \mathbf{A}_0 \cdot \mathbf{p} + \mathrm{c.c.} \right] \tag{5.27} $$

with $\hbar\mathbf{q}$ the momentum, $\omega$ the frequency and $\mathbf{A}_0$ the vector potential of the photon radiation field; $\mathbf{p} = (\hbar/i)\nabla_{\mathbf{r}}$ is the momentum operator, and c.c. stands for complex conjugate. Of the two terms in Eq. (5.27), the one with $-i\omega t$ in the exponential corresponds to absorption, while the other, with $+i\omega t$, corresponds to emission of radiation. With the expression for the interaction hamiltonian of Eq. (5.27), the probability for absorption of light becomes:

$$ P_{i \rightarrow f} = \frac{2\pi}{\hbar} \left( \frac{e}{m_e c} \right)^2 \left| \langle \psi_{\mathbf{k}'}^{(n')} | e^{i\mathbf{q} \cdot \mathbf{r}} \mathbf{A}_0 \cdot \mathbf{p} | \psi_{\mathbf{k}}^{(n)} \rangle \right|^2 \delta(\epsilon_{\mathbf{k}'}^{(n')} - \epsilon_{\mathbf{k}}^{(n)} - \hbar\omega) \tag{5.28} $$

In order to have non-vanishing matrix elements in this expression, the initial state $|\psi_{\mathbf{k}}^{(n)}\rangle$ must be occupied (a valence state), while the final state $\langle \psi_{\mathbf{k}'}^{(n')}|$ must be unoccupied (a conduction state). Since all electronic states in the crystal can be expressed

in our familiar Bloch form,

$$ \psi_{\mathbf{k}}^{(n)}(\mathbf{r}) = e^{i\mathbf{k} \cdot \mathbf{r}} \sum_{\mathbf{G}} \alpha_{\mathbf{k}}^{(n)}(\mathbf{G})\, e^{i\mathbf{G} \cdot \mathbf{r}} \tag{5.29} $$

we find the following expression for the matrix element:

$$ \langle \psi_{\mathbf{k}'}^{(n')} | e^{i\mathbf{q} \cdot \mathbf{r}} \mathbf{A}_0 \cdot \mathbf{p} | \psi_{\mathbf{k}}^{(n)} \rangle = \sum_{\mathbf{G}\mathbf{G}'} \left[ \alpha_{\mathbf{k}'}^{(n')}(\mathbf{G}') \right]^* \alpha_{\mathbf{k}}^{(n)}(\mathbf{G}) \left[ \hbar \mathbf{A}_0 \cdot (\mathbf{k} + \mathbf{G}) \right] \int e^{i(\mathbf{k} - \mathbf{k}' + \mathbf{q} + \mathbf{G} - \mathbf{G}') \cdot \mathbf{r}}\, d\mathbf{r} $$

The integral over real space produces a δ-function in reciprocal-space vectors (see Appendix G):

$$ \int e^{i(\mathbf{k} - \mathbf{k}' + \mathbf{q} + \mathbf{G}) \cdot \mathbf{r}}\, d\mathbf{r} \sim \delta(\mathbf{k} - \mathbf{k}' + \mathbf{q} + \mathbf{G}) \Rightarrow \mathbf{k}' = \mathbf{k} + \mathbf{q} $$

where we have set $\mathbf{G} = 0$ in the argument of the δ-function because we consider only values of $\mathbf{k}$, $\mathbf{k}'$ within the first BZ. However, taking into account the relative magnitudes of the three wave-vectors involved in this condition reveals that it boils down to $\mathbf{k}' = \mathbf{k}$: the momentum of radiation for optical transitions is $|\mathbf{q}| = (2\pi/\lambda)$ with a wavelength $\lambda \sim 10^4$ Å, while the crystal wave-vectors have typical wavelength values of order the interatomic distances, that is, $\sim 1$ Å. Consequently, the difference between the wave-vectors of the initial and final states due to the photon momentum is negligible. Taking $\mathbf{q} = 0$ in the above equation leads to so-called "direct" transitions, that is, transitions at the same value of $\mathbf{k}$ in the BZ. These are the only allowed optical transitions when no other excitations are present. When other excitations that can carry crystal momentum are present, such as phonons (see chapter 6), the energy and momentum conservation conditions can be independently satisfied even for indirect transitions, in which case the initial and final electron states can have different momenta. The direct and indirect transitions are illustrated in Fig. 5.3.

Now suppose we are interested in calculating the transition probability for absorption of radiation of frequency $\omega$. Then we would have to sum over all the possible pairs of states that have an energy difference $\epsilon_{\mathbf{k}}^{(n')} - \epsilon_{\mathbf{k}}^{(n)} = \hbar\omega$, where we are assuming that the wave-vector is the same for the two states, as argued above. We have also argued that the final state $\langle \psi_{\mathbf{k}}^{(n')}|$ is a conduction state (for which we will use the symbol $c$ for the band index) and the initial state $|\psi_{\mathbf{k}}^{(n)}\rangle$ is a valence state (for which we will use the symbol $v$ for the band index). The matrix elements involved in the transition probability can be approximated as nearly constant, independent of the wave-vector $\mathbf{k}$ and the details of the initial and final state wavefunctions. With this

approximation, we obtain for the total transition probability

$$ P(\omega) = P_0 \sum_{\mathbf{k}, c, v} \delta(\epsilon_{\mathbf{k}}^{(c)} - \epsilon_{\mathbf{k}}^{(v)} - \hbar\omega) \tag{5.30} $$

where $P_0$ contains the constant matrix elements and the other constants that appear in Eq. (5.28). This expression is very similar to the expressions that we saw earlier for the DOS. The main difference here is that we are interested in the density of pairs of states that have energy difference $\hbar\omega$, rather than the density of states at given energy $\epsilon$. This new quantity is called the "joint density of states" (JDOS), and its calculation is exactly analogous to that of the DOS. The JDOS appears in expressions of experimentally measurable quantities such as the dielectric function of a crystal, as we discuss in the next section.

Figure 5.3. Illustration of optical transitions. Left: interband transitions in a semiconductor, between valence and conduction states: 1 is a direct transition at the minimal direct gap, 2 is another direct transition at a larger energy, and 3 is an indirect transition across the minimal gap $\epsilon_{gap}$. Right: intraband transitions in a metal across the Fermi level $\epsilon_F$.

5.4 Conductivity and dielectric function

The response of a crystal to an external electric field $\mathbf{E}$ is described in terms of the conductivity $\sigma$ through the relation $\mathbf{J} = \sigma \mathbf{E}$, where $\mathbf{J}$ is the induced current (for details see Appendix A). Thus, the conductivity is the appropriate response function which relates the intrinsic properties of the solid to the effect of the external perturbation that the electromagnetic field represents. The conductivity is actually inferred from experimental measurements of the dielectric function. Accordingly, our first task is to establish the relationship between these two quantities. Before we do this, we discuss briefly how the dielectric function is measured experimentally. The fraction $R$ of reflected power for normally incident radiation on a solid with dielectric constant $\varepsilon$ is given by the classical theory of electrodynamics

ω)ei(q·r−ωt ) . We next derive the relation between the dielectric function and the conductivity. when the explicit variable in J is the time t rather than the frequency ω. ω) = ωρ ind (q. t ) ⇒ El (q.32) known as the Kramers–Kronig relations. ω) = − 1 iq 2 σ (q. Jt ) and the electric ﬁeld (El . In essence. J(r.34) We can separate out the longitudinal and transverse parts of the current (Jl . Assuming the reﬂectivity R can be measured over a wide range of frequencies.42)–(A. ω) = σ (q. both ε1 and ε2 can then be determined.35) In the time domain. t ) = J(q. (A. π w−ω ε2 (ω ) = − P ∞ −∞ dw ε1 (w) − 1 π w−ω (5.53) and accompanying discussion). t ) = E(q. Using the plane wave expressions for the charge and current densities and the electric ﬁeld. Eqs. t ) = ρ ind (q.33) With the use of this relation. This gives for the induced charge ρ ind (q. this implies that there is only one unknown function (either ε1 or ε2 ).170 5 Applications of band theory (see Appendix A. ω) ∂t (5. ω) where (r. ω)ei(q·r−ωt ) the deﬁnition of the conductivity in the frequency domain takes the form1 J(q. ω) = −i q (q.41)) as √ 1− ε R= √ 1+ ε ∞ −∞ 2 (5. ω)q · E(q. t ) = −∇r (r. ω) ω (5. that is. In these expressions the P in front of the integrals stands for the principal value. ω)ei(q·r−ωt ) . the right-hand side of this relation is replaced by a convolution integral. ρ ind (r. . ω)E(q. ω) = σ (q. ω) (5. and using the Kramers–Kronig relations. E(r. t ) is the scalar potential (see Appendix A. (A. are related by ε1 (ω ) = 1 + P dw ε2 (w ) . Eq. Et ).31) The real and imaginary parts of the dielectric function ε = ε1 + iε2 . with the longitudinal component of the electric ﬁeld given by El (r. the continuity equation connecting the current J to the induced charge ρ ind gives ∇r · J + ∂ρ ind = 0 ⇒ q · J(q. ω) (q.
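As a small numerical illustration of the reflectivity formula, Eq. (5.31), the sketch below evaluates $R$ for a complex dielectric constant. The sample values of $\varepsilon$ are arbitrary test inputs chosen for this example, not data for any particular material, and the function name is illustrative.

```python
import cmath


def reflectivity(eps: complex) -> float:
    """Normal-incidence reflectivity R = |(1 - sqrt(eps)) / (1 + sqrt(eps))|**2."""
    root = cmath.sqrt(eps)  # principal branch of the complex square root
    return abs((1.0 - root) / (1.0 + root)) ** 2


# Arbitrary test inputs (assumptions of this sketch, not measured values).
samples = {
    "vacuum-like, eps = 1": 1.0 + 0.0j,
    "real positive, eps = 4": 4.0 + 0.0j,
    "real negative, eps = -4": -4.0 + 0.0j,
    "complex, eps = 2 + 3i": 2.0 + 3.0j,
}

for label, eps in samples.items():
    print(f"{label:26s} R = {reflectivity(eps):.4f}")
```

A real negative $\varepsilon$ gives a purely imaginary $\sqrt{\varepsilon}$ and hence $R = 1$: the radiation is totally reflected, which is the situation in a metal below the plasma frequency.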

As we have discussed in chapter 2, for weak external fields we can use the linear response expression: $\rho^{ind}(\mathbf{q},\omega) = \chi(\mathbf{q},\omega) \Phi(\mathbf{q},\omega)$, where $\chi(\mathbf{q},\omega)$ is the susceptibility or response function. This gives for the conductivity

$$\sigma(\mathbf{q},\omega) = \frac{i\omega}{q^2} \chi(\mathbf{q},\omega) \tag{5.36}$$

Comparing this result with the general relation between the response function and the dielectric function

$$\varepsilon(\mathbf{q},\omega) = 1 - \frac{4\pi}{q^2} \chi(\mathbf{q},\omega) \tag{5.37}$$

we obtain the desired relation between the conductivity and the dielectric function:

$$\sigma(\mathbf{q},\omega) = \frac{i\omega}{4\pi} \left[ 1 - \varepsilon(\mathbf{q},\omega) \right] \;\Longrightarrow\; \varepsilon(\mathbf{q},\omega) = 1 - \frac{4\pi}{i\omega} \sigma(\mathbf{q},\omega) \tag{5.38}$$

Having established the connection between conductivity and dielectric function, we will next express the conductivity in terms of microscopic properties of the solid (the electronic wavefunctions and their energies and occupation numbers); as a final step, we will use the connection between conductivity and dielectric function to obtain an expression of the latter in terms of the microscopic properties of the solid. This will provide the direct link between the electronic structure at the microscopic level and the experimentally measured dielectric function, which captures the macroscopic response of the solid to an external electromagnetic field.

Before proceeding with the detailed derivations, it is worth mentioning two simple expressions which capture much of the physics. The first, known as the Drude model, refers to the frequency-dependent dielectric function for the case where only transitions within a single band, intersected by the Fermi level, are allowed:

$$\varepsilon(\omega) = 1 - \frac{\omega_p^2}{\omega(\omega + i/\tau)} \tag{5.39}$$

where $\omega_p$ and $\tau$ are constants (known as the plasma frequency and relaxation time, respectively). The second, known as the Lorentz model, refers to the opposite extreme, that is, the case where the only allowed transitions are between occupied and unoccupied bands separated by a band gap:

$$\varepsilon(\omega) = 1 - \frac{\omega_p^2}{(\omega^2 - \omega_0^2) + i\eta\omega} \tag{5.40}$$

with $\omega_p$ the same constant as in the previous expression and $\omega_0$, $\eta$ two additional constants. The expressions we will derive below can be ultimately reduced to these simple expressions; this exercise provides some insight to the meaning of the constants involved in the Drude and Lorentz models.

For the calculation of the conductivity we will rely on the general result for the expectation value of a many-body operator $O(\{\mathbf{r}_i\})$, which can be expressed as a sum of single-particle operators $o(\mathbf{r}_i)$,

$$O(\{\mathbf{r}_i\}) = \sum_i o(\mathbf{r}_i)$$

in terms of the matrix elements in the single-particle states, as derived in Appendix B. For simplicity we will use a single index, the subscript $k$, to identify the single-particle states in the crystal, with the understanding that this index represents in a short-hand notation both the band index and the wave-vector index. From Eq. (B.21) we find that the expectation value of the many-body operator $O$ in the single-particle states labeled by $k$ is

$$\langle O \rangle = \sum_{k,k'} o_{k,k'} \gamma_{k',k} \tag{5.41}$$

where $o_{k,k'}$ and $\gamma_{k,k'}$ are the matrix elements of the operator $o(\mathbf{r})$ and the single-particle density matrix $\gamma(\mathbf{r},\mathbf{r}')$. We must therefore identify the appropriate single-particle density matrix and single-particle operator $o(\mathbf{r})$ for the calculation of the conductivity. As derived by Ehrenreich and Cohen [56], to first order perturbation theory the interaction term of the hamiltonian gives for the single-particle density matrix

$$\gamma^{int}_{k',k} = \frac{n_0(\epsilon_{k'}) - n_0(\epsilon_k)}{\epsilon_{k'} - \epsilon_k - \hbar\omega - i\hbar\eta} H^{int}_{k',k} \tag{5.42}$$

where $n_0(\epsilon_k)$ is the Fermi occupation number for the state with energy $\epsilon_k$ in the unperturbed system, and $\eta$ is an infinitesimal positive quantity with the dimensions of frequency.

For the calculation of the conductivity, the relevant interaction term is that for absorption or emission of photons, the carriers of the electromagnetic field, which was discussed earlier, Eq. (5.27) (see Appendix B for more details). In the usual Coulomb gauge we have the following relation between the transverse electric field $\mathbf{E}_t$ ($\nabla_{\mathbf{r}} \cdot \mathbf{E}_t = 0$), and the vector potential $\mathbf{A}$:

$$\mathbf{E}_t(\mathbf{r},t) = -\frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} \;\Rightarrow\; \mathbf{E}_t(\mathbf{r},t) = \frac{i\omega}{c} \mathbf{A}(\mathbf{r},t) \;\Rightarrow\; \mathbf{A}(\mathbf{r},t) = \frac{c}{i\omega} \mathbf{E}_t(\mathbf{r},t) \tag{5.43}$$

With this, the interaction term takes the form

$$H^{int}(\mathbf{r},t) = -\frac{e\hbar}{m_e \omega} \mathbf{E}_t \cdot \nabla_{\mathbf{r}} \;\Rightarrow\; H^{int}_{k',k} = -\frac{e\hbar}{m_e \omega} \mathbf{E}_t \cdot \langle \psi_{k'} | \nabla_{\mathbf{r}} | \psi_k \rangle \tag{5.44}$$

which is the expression to be used in $\gamma^{int}_{k',k}$.
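The Drude and Lorentz expressions quoted before the derivation, Eqs. (5.39) and (5.40), are simple enough to evaluate directly. The following is a minimal sketch with NumPy; the parameter values are arbitrary dimensionless numbers chosen for illustration, not fitted to any material.

```python
import numpy as np


def eps_drude(omega, omega_p, tau):
    """Drude dielectric function, Eq. (5.39): intraband (free-electron) response."""
    return 1.0 - omega_p**2 / (omega * (omega + 1j / tau))


def eps_lorentz(omega, omega_p, omega_0, eta):
    """Lorentz dielectric function, Eq. (5.40): interband (bound-electron) response."""
    return 1.0 - omega_p**2 / ((omega**2 - omega_0**2) + 1j * eta * omega)


# Illustrative parameters in arbitrary frequency units (assumptions of this sketch).
omega = np.linspace(0.5, 8.0, 16)
drude = eps_drude(omega, omega_p=2.0, tau=50.0)
lorentz = eps_lorentz(omega, omega_p=2.0, omega_0=4.0, eta=0.2)

for w, d, l in zip(omega, drude, lorentz):
    print(f"omega={w:5.2f}  Drude: {d.real:8.3f} + {d.imag:6.3f}i   "
          f"Lorentz: {l.real:7.3f} + {l.imag:6.3f}i")
```

With these forms the real part of the Drude function changes sign near $\omega = \omega_p$ when $\tau$ is large, while the Lorentz function has its absorption peak (maximum of the imaginary part) near $\omega = \omega_0$; both imaginary parts are positive for $\omega > 0$.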

As far as the relevant single-particle operator $o(\mathbf{r})$ is concerned, it must describe the response of the physical system to the external potential, which is of course the induced current. The single-particle current operator is

$$\mathbf{j}(\mathbf{r}) = -e\mathbf{v} = -\frac{e}{m_e} \mathbf{p} = -\frac{e\hbar}{i m_e} \nabla_{\mathbf{r}}$$

Using this expression as the single-particle operator $o(\mathbf{r})$, and combining it with the expression for the single-particle density matrix $\gamma^{int}_{k',k}$ derived above, we obtain for the expectation value of the total current:

$$\langle \mathbf{J} \rangle = \sum_{k,k'} \mathbf{j}_{k,k'} \gamma^{int}_{k',k} = \sum_{k,k'} \langle \psi_k | \left( -\frac{e\hbar}{i m_e} \nabla_{\mathbf{r}} \right) | \psi_{k'} \rangle \frac{n_0(\epsilon_{k'}) - n_0(\epsilon_k)}{\epsilon_{k'} - \epsilon_k - \hbar\omega - i\hbar\eta} H^{int}_{k',k}$$

$$= -\frac{i e^2 \hbar^2}{m_e^2} \frac{1}{\omega} \sum_{k,k'} \frac{n_0(\epsilon_{k'}) - n_0(\epsilon_k)}{\epsilon_{k'} - \epsilon_k - \hbar\omega - i\hbar\eta} \langle \psi_k | \nabla_{\mathbf{r}} | \psi_{k'} \rangle \langle \psi_{k'} | \nabla_{\mathbf{r}} | \psi_k \rangle \cdot \mathbf{E}_t \tag{5.45}$$

From the relation between current and electric field, Eq. (5.33), expressed in tensor notation

$$\mathbf{J} = \sigma \mathbf{E} \;\rightarrow\; J_\alpha = \sum_\beta \sigma_{\alpha\beta} E_\beta$$

we obtain, for the real and imaginary parts of the conductivity,

$$\mathrm{Re}[\sigma_{\alpha\beta}] = \frac{\pi e^2 \hbar^2}{m_e^2} \frac{1}{\omega} \sum_{k,k'} [n_0(\epsilon_{k'}) - n_0(\epsilon_k)] \langle \psi_k | \frac{\partial}{\partial x_\alpha} | \psi_{k'} \rangle \langle \psi_{k'} | \frac{\partial}{\partial x_\beta} | \psi_k \rangle\, \delta(\epsilon_{k'} - \epsilon_k - \hbar\omega)$$

$$\mathrm{Im}[\sigma_{\alpha\beta}] = -\frac{e^2 \hbar^2}{m_e^2} \frac{1}{\omega} \sum_{k,k'} [n_0(\epsilon_{k'}) - n_0(\epsilon_k)] \langle \psi_k | \frac{\partial}{\partial x_\alpha} | \psi_{k'} \rangle \langle \psi_{k'} | \frac{\partial}{\partial x_\beta} | \psi_k \rangle \frac{1}{\epsilon_{k'} - \epsilon_k - \hbar\omega} \tag{5.46}$$

where we have used the mathematical identity Eq. (G.55) (proven in Appendix G) to obtain the real and imaginary parts of $\sigma$. This expression for the conductivity is known as the Kubo–Greenwood formula [57]. These expressions can then be used to obtain the real and imaginary parts of the dielectric function, from Eq. (5.38). The result is precisely the type of expression we mentioned above: of the two states involved in the excitation $|\psi_k\rangle$, $|\psi_{k'}\rangle$, one must be occupied and the other empty, otherwise the difference of the Fermi occupation numbers $[n_0(\epsilon_{k'}) - n_0(\epsilon_k)]$ will lead to a vanishing contribution. Notice that if $\epsilon_{k'} - \epsilon_k = \hbar\omega$ and $n_0(\epsilon_{k'}) = 0$, $n_0(\epsilon_k) = 1$, then we have absorption of a photon with energy $\hbar\omega$, whereas if $n_0(\epsilon_{k'}) = 1$, $n_0(\epsilon_k) = 0$, then we have emission of a photon with energy $\hbar\omega$; the two processes give contributions of opposite sign to the conductivity. In the case of the real part of the conductivity which is related to the imaginary part of the dielectric function, Eq. (5.38), if we assume that the matrix elements are approximately independent of $k$, $k'$, then we obtain a sum over $\delta$-functions that ensure conservation of energy upon absorption or emission of radiation with frequency $\omega$, which leads to the JDOS, precisely as derived in the previous section, Eq. (5.30).

In order to obtain the dielectric function from the conductivity, we will first assume an isotropic solid, in which case $\sigma_{\alpha\beta} = \sigma \delta_{\alpha\beta}$, and then use the relation we derived in Eq. (5.38),

$$\varepsilon(\mathbf{q},\omega) = 1 + \frac{4\pi e^2}{m_e^2 \Omega} \frac{1}{\omega^2} \sum_{\mathbf{k}\mathbf{k}',nn'} \left| \langle \psi^{(n')}_{\mathbf{k}'} | (-i\hbar\nabla_{\mathbf{r}}) | \psi^{(n)}_{\mathbf{k}} \rangle \right|^2 \frac{n_0(\epsilon^{(n')}_{\mathbf{k}'}) - n_0(\epsilon^{(n)}_{\mathbf{k}})}{\epsilon^{(n')}_{\mathbf{k}'} - \epsilon^{(n)}_{\mathbf{k}} - \hbar\omega - i\hbar\eta} \tag{5.47}$$

with $\mathbf{q} = \mathbf{k}' - \mathbf{k}$ and $\Omega$ the volume of the solid. In the above expression we have introduced again explicitly the band indices $n$, $n'$. To analyze the behavior of the dielectric function, we will consider two different situations: we will examine first transitions within the same band, $n = n'$, but at slightly different values of the wave-vector, $\mathbf{k}' = \mathbf{k} + \mathbf{q}$, in the limit $\mathbf{q} \to 0$; and second, transitions at the same wave-vector $\mathbf{k}' = \mathbf{k}$ but between different bands $n \neq n'$. The first kind are called intraband or free-electron transitions: they correspond to situations where an electron makes a transition by absorption or emission of a photon across the Fermi level by changing slightly its wave-vector, as illustrated in Fig. 5.3. The second kind correspond to interband or bound-electron transitions: they correspond to situations where an electron makes a direct transition by absorption or emission of a photon across the band gap in insulators or semiconductors, also as illustrated in Fig. 5.3. Since in both cases we are considering the limit of $\mathbf{k}' \to \mathbf{k}$, we will omit from now on the dependence of the dielectric function on the wave-vector difference $\mathbf{q} = \mathbf{k}' - \mathbf{k}$ with the understanding that all the expressions we derive concern the limit $\mathbf{q} \to 0$.

We begin with the intraband transitions. To simplify the notation, we define

$$\hbar\omega^{(n)}_{\mathbf{k}}(\mathbf{q}) = \epsilon^{(n)}_{\mathbf{k}+\mathbf{q}} - \epsilon^{(n)}_{\mathbf{k}} = \frac{\hbar}{m_e} \mathbf{q} \cdot \mathbf{p}^{(nn)}(\mathbf{k}) + \frac{\hbar^2 q^2}{2 m^{(n)}(\mathbf{k})} \tag{5.48}$$

where we have used the expressions of Eqs. (3.43) and (3.46) for the effective mass of electrons and, consistent with our assumption of an isotropic solid, we have taken the effective mass to be a scalar rather than a tensor. We will also use the result of Eq. (3.45), to relate the matrix elements of the momentum operator $\mathbf{p}^{(nn)}(\mathbf{k}) = \langle \psi^{(n)}_{\mathbf{k}} | (-i\hbar\nabla_{\mathbf{r}}) | \psi^{(n)}_{\mathbf{k}} \rangle$ to the gradient of the band energy $\nabla_{\mathbf{k}} \epsilon^{(n)}_{\mathbf{k}}$.
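The reduction of the real part of the conductivity to a sum of energy-conserving $\delta$-functions, that is, to the JDOS of Eq. (5.30), can be made concrete with a histogram of direct-transition energies. The sketch below assumes a hypothetical one-dimensional two-band model with cosine bands, chosen only because it is easy to write down; the band parameters and the function name are illustrative and do not come from the text.

```python
import numpy as np


def joint_density_of_states(e_cond, e_val, bin_edges):
    """Histogram estimate of the JDOS for direct (same-k) transitions.

    Each k-point contributes one delta function at e_cond(k) - e_val(k);
    the result is normalized to unit area over the binned energy range.
    """
    diff = np.asarray(e_cond) - np.asarray(e_val)
    counts, edges = np.histogram(diff, bins=bin_edges)
    widths = np.diff(edges)
    return counts / (diff.size * widths)


# Hypothetical model bands (assumption): gap = 1, band-width parameter = 1.
k = np.linspace(-np.pi, np.pi, 20001, endpoint=False)
gap, hop = 1.0, 1.0
e_cond = 0.5 * gap + hop * (1.0 - np.cos(k))
e_val = -0.5 * gap - hop * (1.0 - np.cos(k))

edges = np.linspace(0.0, 6.0, 61)
jdos = joint_density_of_states(e_cond, e_val, edges)

centers = 0.5 * (edges[:-1] + edges[1:])
for e, j in zip(centers[8:14], jdos[8:14]):
    print(f"hbar*omega = {e:4.2f}   JDOS = {j:6.3f}")
```

In this model the JDOS vanishes for photon energies below the gap and is largest near the energies where the two bands are parallel (here at the band edges), in line with the remark that parallel bands produce the major features of the dielectric function.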

With these considerations, we obtain

$$\varepsilon(\omega) = 1 - \frac{4\pi e^2}{\Omega \hbar^3} \sum_{\mathbf{k},n} \frac{1}{\omega^2} \left| \nabla_{\mathbf{k}} \epsilon^{(n)}_{\mathbf{k}} \right|^2 \left[ \frac{2}{\omega - \omega^{(n)}_{\mathbf{k}}(\mathbf{q}) + i\eta} - \frac{2}{\omega + \omega^{(n)}_{\mathbf{k}}(\mathbf{q}) + i\eta} \right]$$

$$= 1 - \frac{4\pi e^2}{\Omega \hbar^2} \sum_{\mathbf{k},n} \left| \nabla_{\mathbf{k}} \epsilon^{(n)}_{\mathbf{k}} \right|^2 \left[ \frac{4}{\hbar^2 \omega^2} \mathbf{q} \cdot \nabla_{\mathbf{k}} \epsilon^{(n)}_{\mathbf{k}} + \frac{2 q^2}{\omega^2} \frac{1}{m^{(n)}(\mathbf{k})} \right] \frac{1}{(\omega + i\eta)^2 - \left( \omega^{(n)}_{\mathbf{k}}(\mathbf{q}) \right)^2}$$

where in the first equation we have written explicitly the two contributions from the absorption and emission terms, averaging over the two spin states, and in the second equation we have made use of the expression introduced in Eq. (5.48). At this point, we can take advantage of the fact that $q$ is the momentum of the photon to write $\omega^2 = q^2 c^2$ in the second term in the square brackets and, in the limit $q \to 0$, we will neglect the first term, as well as the term $\omega^{(n)}_{\mathbf{k}}(\mathbf{q})$ compared to $(\omega + i\eta)$ in the denominator, to obtain

$$\varepsilon(\omega) = 1 - \left( \frac{4\pi e^2 N}{m_e \Omega} \right) \left[ \frac{2 m_e}{\hbar^2 c^2} \frac{1}{N} \sum_{\mathbf{k},n} \frac{1}{m^{(n)}(\mathbf{k})} \left| \nabla_{\mathbf{k}} \epsilon^{(n)}_{\mathbf{k}} \right|^2 \right] \frac{1}{(\omega + i\eta)^2} \tag{5.49}$$

with $N$ the total number of unit cells in the crystal, which we take to be equal to the number of valence electrons for simplicity (the two numbers are always related by an integer factor). The quantity in the first parenthesis has the dimensions of a frequency squared, called the plasma frequency, $\omega_p$:

$$\omega_p \equiv \left( \frac{4\pi e^2 N}{m_e \Omega} \right)^{1/2} \tag{5.50}$$

This is the characteristic frequency of the response of a uniform electron gas of density $N/\Omega$ in a uniform background of compensating positive charge (see problem 5); the modes describing this response are called plasmons (mentioned in chapter 2). Using the standard expressions for the radius $r_s$ of the sphere corresponding to the average volume per electron, we can express the plasma frequency as

$$\omega_p = 0.716 \left( \frac{1}{r_s/a_0} \right)^{3/2} \times 10^{17} \ \mathrm{Hz}$$

and since typical values of $r_s/a_0$ in metals are 2–6, we find that the frequencies of plasmon oscillations are in the range of $10^3$–$10^4$ THz. Returning to Eq. (5.49), the quantity in the square brackets depends exclusively on the band structure and fundamental constants; once we have summed over all the relevant values of $\mathbf{k}$ and $n$, this quantity takes real positive values which we denote by $\Lambda^2$, with $\Lambda$ a real positive constant.
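The numerical form of the plasma frequency quoted above can be cross-checked against the defining expression, Eq. (5.50). The sketch below does this in SI units, where $\omega_p^2 = n e^2 / (\epsilon_0 m_e)$ is the equivalent of the Gaussian-unit formula, with $n = 3/(4\pi r_s^3)$ the electron density; the function names are illustrative and the constants are standard CODATA values.

```python
import math

# Physical constants in SI units (CODATA values).
E_CHARGE = 1.602176634e-19      # electron charge, C
E_MASS = 9.1093837015e-31       # electron mass, kg
EPS0 = 8.8541878128e-12         # vacuum permittivity, F/m
BOHR = 5.29177210903e-11        # Bohr radius a0, m


def plasma_frequency_quoted(rs_over_a0: float) -> float:
    """Plasma frequency from the numerical formula in the text, in s^-1."""
    return 0.716e17 * rs_over_a0 ** -1.5


def plasma_frequency_si(rs_over_a0: float) -> float:
    """Plasma frequency from omega_p^2 = n e^2 / (eps0 m_e), with n = 3 / (4 pi r_s^3)."""
    r_s = rs_over_a0 * BOHR
    density = 3.0 / (4.0 * math.pi * r_s**3)
    return math.sqrt(density * E_CHARGE**2 / (EPS0 * E_MASS))


for rs in (2.0, 3.0, 4.0, 6.0):
    quoted = plasma_frequency_quoted(rs)
    direct = plasma_frequency_si(rs)
    print(f"r_s/a0 = {rs:3.1f}   quoted: {quoted:.3e} s^-1   from constants: {direct:.3e} s^-1")
```

The two columns agree to within the three significant figures of the quoted prefactor, and for $r_s/a_0$ between 2 and 6 the values fall between roughly $5 \times 10^{15}$ and $2.5 \times 10^{16}$ s$^{-1}$.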

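The remaining term in the expression for the dielectric function is the one which determines its dependence on the frequency. We note that the relevant values of the wave-vector and band index are those that correspond to crossings of the Fermi level, since we have assumed transitions across the Fermi level within the same band.² Extracting the real and imaginary parts from this term, with $\Lambda^2$ the value of the quantity in the square brackets of Eq. (5.49), we obtain

$$\varepsilon_1(\omega) = 1 - \frac{(\Lambda\omega_p)^2}{\omega^2 + \eta^2} \;\Rightarrow\; \lim_{\eta \to 0} [\varepsilon_1(\omega)] = 1 - \frac{(\Lambda\omega_p)^2}{\omega^2} \tag{5.51}$$

$$\varepsilon_2(\omega) = \frac{(\Lambda\omega_p)^2}{\omega} \frac{\eta}{\omega^2 + \eta^2} \;\Rightarrow\; \lim_{\eta \to 0} [\varepsilon_2(\omega)] = \pi \frac{(\Lambda\omega_p)^2}{\omega} \delta(\omega) \tag{5.52}$$

These expressions describe the behavior of the dielectric function for a material with one band only, which is intersected by the Fermi level, that is, a one-band metal.

² At non-zero temperatures, these values must be extended to capture all states which have partial occupation across the Fermi level.

We next consider the effect of direct interband transitions, that is, transitions for which $\mathbf{k} = \mathbf{k}'$ and $n \neq n'$. To ensure direct transitions we introduce a factor $(2\pi)^3 \delta(\mathbf{k} - \mathbf{k}')$ in the double summation over wave-vectors in the general expression for the dielectric function, Eq. (5.47), from which we obtain

$$\varepsilon(\omega) = 1 + \frac{4\pi e^2}{\Omega\hbar} \frac{1}{\omega^2} \sum_{\mathbf{k},n \neq n'} \left| \langle \psi^{(n')}_{\mathbf{k}} | \frac{\mathbf{p}}{m_e} | \psi^{(n)}_{\mathbf{k}} \rangle \right|^2 \frac{n_0(\epsilon^{(n')}_{\mathbf{k}}) - n_0(\epsilon^{(n)}_{\mathbf{k}})}{\omega^{(n'n)}_{\mathbf{k}} - \omega - i\eta} \tag{5.53}$$

where we have defined

$$\hbar\omega^{(n'n)}_{\mathbf{k}} = \epsilon^{(n')}_{\mathbf{k}} - \epsilon^{(n)}_{\mathbf{k}}$$

We notice that the matrix elements of the operator $\mathbf{p}/m_e$ can be put in the following form:

$$\frac{\mathbf{p}}{m_e} = \mathbf{v} = \frac{d}{dt} \mathbf{r} \;\Rightarrow\; \left| \langle \psi^{(n')}_{\mathbf{k}} | \frac{\mathbf{p}}{m_e} | \psi^{(n)}_{\mathbf{k}} \rangle \right|^2 = \left( \omega^{(n'n)}_{\mathbf{k}} \right)^2 \left| \langle \psi^{(n')}_{\mathbf{k}} | \mathbf{r} | \psi^{(n)}_{\mathbf{k}} \rangle \right|^2$$

where we have used Eq. (3.47) to obtain the last expression. With this result, writing out explicitly the contributions from the absorption and emission terms and averaging over the two spin states, we obtain for the real and imaginary parts of the dielectric function in the limit $\eta \to 0$

$$\varepsilon_1(\omega) = 1 - \frac{8\pi e^2}{\Omega\hbar} \sum_{\mathbf{k},v,c} \frac{2\omega^{(vc)}_{\mathbf{k}}}{\omega^2 - \left( \omega^{(vc)}_{\mathbf{k}} \right)^2} \left| \langle \psi^{(c)}_{\mathbf{k}} | \mathbf{r} | \psi^{(v)}_{\mathbf{k}} \rangle \right|^2 \tag{5.54}$$

$$\varepsilon_2(\omega) = \frac{8\pi^2 e^2}{\Omega\hbar} \sum_{\mathbf{k},v,c} \left[ \delta(\omega - \omega^{(vc)}_{\mathbf{k}}) - \delta(\omega + \omega^{(cv)}_{\mathbf{k}}) \right] \left| \langle \psi^{(c)}_{\mathbf{k}} | \mathbf{r} | \psi^{(v)}_{\mathbf{k}} \rangle \right|^2 \tag{5.55}$$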

where we have made explicit the requirement that one of the indices must run over valence bands ($v$) and the other over conduction bands ($c$), since one state is occupied and the other empty. This would be the behavior of the dielectric function of a material in which only interband transitions are allowed, that is, a material with a band gap.

From the preceding derivations we conclude that the dielectric function of a semiconductor or insulator will derive from interband contributions only, as given by Eqs. (5.54) and (5.55), while that of a metal with several bands will have interband contributions as well as intraband contributions, described by Eqs. (5.51) and (5.52) (see also Ref. [61]). A typical example for a multi-band $d$-electron metal, Ag, is shown in Fig. 5.4. For more details the reader is referred to the review articles and books mentioned in the Further reading section. For semiconductors, it is often easy to identify the features of the band structure which are responsible for the major features of the dielectric function. The latter are typically related to transitions between occupied and unoccupied bands which happen to be parallel, that is, they have a constant energy difference over a large portion of the BZ, because this produces a large joint density of states (see problem 8).

Figure 5.4. Examples of dielectric functions $\varepsilon(\omega)$ as a function of the frequency $\omega$ (in eV). Left: the real ($\varepsilon_1$) and imaginary ($\varepsilon_2$) parts of the dielectric function of Ag; center: an analysis of the real part into its bound-electron (interband) and free-electron (intraband) components. Right: the real and imaginary parts of the dielectric function of Si. Source: Refs. [57–59].

5.5 Excitons

Up to this point we have been discussing excitations of electrons from an occupied to an unoccupied state. We have developed the tools to calculate the response of solids to this kind of external perturbation, which can apply to any situation. Often we are interested in applying these tools to situations where there is a gap between occupied and unoccupied states, as in semiconductors and insulators, in which case the difference in energy between the initial and final states must be larger than or equal to the band gap. This implies a lower cutoff in the energy of the photons that can be absorbed or emitted, equal to the band gap energy. However, there are cases where optical excitations can occur for energies smaller than the band gap, because they involve the creation of an electron-hole pair in which the electron and hole are bound by their Coulomb attraction. These are called excitons.

There are two types of excitons. The first consists of an electron and a hole that are tightly bound with a large binding energy of order 1 eV. This is common in insulators, such as ionic solids (for example, SiO2). These excitons are referred to as Frenkel excitons [62]. The second type consists of weakly bound excitons, delocalized over a range of several angstroms, and with a small binding energy of order 0.001–0.01 eV. This is common in small band gap systems, especially semiconductors. These excitons are referred to as Mott–Wannier excitons [63]. These two limiting cases are well understood, while intermediate cases are more difficult to treat. In Table 5.2 we give some examples of materials that have Frenkel and Mott–Wannier excitons and the corresponding binding energies.

Table 5.2. Examples of Frenkel excitons in ionic solids and Mott–Wannier excitons in semiconductors. Binding energies of the excitons are given in units of electronvolts. Source: C. Kittel [63].

Frenkel excitons          Mott–Wannier excitons
KI      0.48              Si      0.015
KCl     0.40              Ge      0.004
KBr     0.40              GaAs    0.004
RbCl    0.44              CdS     0.029
LiF     1.00              CdSe    0.015

The presence of excitons is a genuine many-body effect, so we need to invoke the many-body hamiltonian to describe the physics. For the purposes of the following treatment, we will find it convenient to cast the problem in the language of Slater determinants composed of single-particle states, so that solving it becomes an exercise in dealing with single-particle wavefunctions. We begin by writing the total many-body hamiltonian $\mathcal{H}$ in the form

$$\mathcal{H}(\{\mathbf{r}_i\}) = \sum_i h(\mathbf{r}_i) + \frac{1}{2} \sum_{i \neq j} \frac{e^2}{|\mathbf{r}_i - \mathbf{r}_j|}, \qquad h(\mathbf{r}) = -\frac{\hbar^2}{2 m_e} \nabla^2_{\mathbf{r}} + \sum_{\mathbf{R},I} \frac{-Z_I e^2}{|\mathbf{r} - \mathbf{R} - \mathbf{t}_I|} \tag{5.56}$$

where $\mathbf{R}$ are the lattice vectors and $\mathbf{t}_I$ are the positions of the ions in the unit cell.

In this manner, we separate the part that can be dealt with strictly in the single-particle framework, namely $h(\mathbf{r})$, which is the non-interacting part, and the part that contains all the complications of electron-electron interactions. For simplicity, in the following we will discuss the case where there is only one atom per unit cell, thus eliminating the index $I$ for the positions of atoms within the unit cell. The many-body wavefunction will have Bloch-like symmetry:

$$\Psi_{\mathbf{K}}(\mathbf{r}_1 + \mathbf{R}, \mathbf{r}_2 + \mathbf{R}, \ldots) = e^{i\mathbf{K}\cdot\mathbf{R}}\, \Psi_{\mathbf{K}}(\mathbf{r}_1, \mathbf{r}_2, \ldots) \tag{5.57}$$

We will discuss in some detail only the case of Frenkel excitons. We will assume that we are dealing with atoms that have two valence electrons, giving rise to a simple band structure consisting of a fully occupied valence band and an empty conduction band; generalization to more bands is straightforward. The many-body wavefunction will be taken as a Slater determinant in the positions $\mathbf{r}_1, \mathbf{r}_2, \ldots$ and spins $s_1, s_2, \ldots$ of the electrons: there are $N$ atoms in the solid, hence $2N$ electrons in the full valence band, with one spin-up and one spin-down electron in each state, giving the many-body wavefunction

$$\Psi_{\mathbf{K}}(\mathbf{r}_1, s_1, \mathbf{r}_2, s_2, \ldots) = \frac{1}{\sqrt{(2N)!}} \begin{vmatrix} \psi^{(v)}_{\mathbf{k}_1\uparrow}(\mathbf{r}_1) & \psi^{(v)}_{\mathbf{k}_1\uparrow}(\mathbf{r}_2) & \cdots & \psi^{(v)}_{\mathbf{k}_1\uparrow}(\mathbf{r}_{2N}) \\ \psi^{(v)}_{\mathbf{k}_1\downarrow}(\mathbf{r}_1) & \psi^{(v)}_{\mathbf{k}_1\downarrow}(\mathbf{r}_2) & \cdots & \psi^{(v)}_{\mathbf{k}_1\downarrow}(\mathbf{r}_{2N}) \\ \vdots & \vdots & & \vdots \\ \psi^{(v)}_{\mathbf{k}_N\uparrow}(\mathbf{r}_1) & \psi^{(v)}_{\mathbf{k}_N\uparrow}(\mathbf{r}_2) & \cdots & \psi^{(v)}_{\mathbf{k}_N\uparrow}(\mathbf{r}_{2N}) \\ \psi^{(v)}_{\mathbf{k}_N\downarrow}(\mathbf{r}_1) & \psi^{(v)}_{\mathbf{k}_N\downarrow}(\mathbf{r}_2) & \cdots & \psi^{(v)}_{\mathbf{k}_N\downarrow}(\mathbf{r}_{2N}) \end{vmatrix} \tag{5.58}$$

In the ground state, all the states corresponding to $\mathbf{k}$ values in the first BZ will be occupied, and the total wave-vector will be equal to zero, since for every occupied $\mathbf{k}$-state there is a corresponding $-\mathbf{k}$-state with the same energy. Similarly the total spin will be zero, because of the equal occupation of single-particle states with up and down spins. For localized Frenkel excitons, it is convenient to use a unitary transformation to a new set of basis functions which are localized at the positions of the ions in each unit cell of the lattice. These so called Wannier functions are defined in terms of the usual band states $\psi^{(v)}_{\mathbf{k},s}(\mathbf{r})$ through the following relation:

$$\phi^{(v)}_s(\mathbf{r} - \mathbf{R}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{k}} e^{-i\mathbf{k}\cdot\mathbf{R}}\, \psi^{(v)}_{\mathbf{k},s}(\mathbf{r}) \tag{5.59}$$
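The unitary transformation of Eq. (5.59) can be illustrated on the simplest possible case. The sketch below assumes a hypothetical one-dimensional chain of $N$ sites with one orbital per site and periodic boundary conditions, for which the Bloch states sampled on the lattice sites are plane waves; this model is an assumption of the example, not the system treated in the text.

```python
import numpy as np


def bloch_states(n_sites: int):
    """Bloch states of a one-orbital periodic chain (lattice constant 1).

    Row m holds psi_k(R_j) = exp(i k_m R_j) / sqrt(N) on the sites R_j = 0..N-1.
    """
    k = 2.0 * np.pi * np.arange(n_sites) / n_sites
    sites = np.arange(n_sites)
    psi = np.exp(1j * np.outer(k, sites)) / np.sqrt(n_sites)
    return k, psi


def wannier_function(k, psi, r_index: int):
    """phi(r - R) = N^(-1/2) * sum_k exp(-i k R) psi_k(r), following Eq. (5.59)."""
    phases = np.exp(-1j * k * r_index)
    return phases @ psi / np.sqrt(len(k))


n_sites = 8
k, psi = bloch_states(n_sites)

# Wannier function centered on site 3: should be localized entirely on that site.
phi_3 = wannier_function(k, psi, 3)
print(np.round(np.abs(phi_3), 6))

# All Wannier functions together form an orthonormal (unitary) set.
all_wannier = np.array([wannier_function(k, psi, r) for r in range(n_sites)])
overlap = all_wannier @ all_wannier.conj().T
print(np.allclose(overlap, np.eye(n_sites)))
```

In this toy model each Wannier function is non-zero on a single site, the extreme limit of localization; in a real crystal the Wannier functions have a finite spread, but they remain orthonormal because the transformation is unitary.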

Using this new basis we can express the many-body wavefunction as

$$\Psi_0(\mathbf{r}_1, s_1, \mathbf{r}_2, s_2, \ldots) = \frac{1}{\sqrt{(2N)!}} \begin{vmatrix} \phi^{(v)}_{\uparrow}(\mathbf{r}_1 - \mathbf{R}_1) & \cdots & \phi^{(v)}_{\uparrow}(\mathbf{r}_{2N} - \mathbf{R}_1) \\ \phi^{(v)}_{\downarrow}(\mathbf{r}_1 - \mathbf{R}_1) & \cdots & \phi^{(v)}_{\downarrow}(\mathbf{r}_{2N} - \mathbf{R}_1) \\ \vdots & & \vdots \\ \phi^{(v)}_{\uparrow}(\mathbf{r}_1 - \mathbf{R}_N) & \cdots & \phi^{(v)}_{\uparrow}(\mathbf{r}_{2N} - \mathbf{R}_N) \\ \phi^{(v)}_{\downarrow}(\mathbf{r}_1 - \mathbf{R}_N) & \cdots & \phi^{(v)}_{\downarrow}(\mathbf{r}_{2N} - \mathbf{R}_N) \end{vmatrix} \tag{5.60}$$

In order to create an exciton wavefunction we remove a single electron from state $\phi^{(v)}_{s_h}(\mathbf{r} - \mathbf{R}_h)$ (the subscript $h$ standing for "hole") and put it in state $\phi^{(c)}_{s_p}(\mathbf{r} - \mathbf{R}_p)$ (the subscript $p$ standing for "particle"). This is the expected excitation, from an occupied valence to an unoccupied conduction state, by the absorption of a photon as discussed earlier in this chapter. When this is done, the total momentum and total spin of the many-body wavefunction must be preserved, because there are no terms in the interaction hamiltonian to change these values (we are assuming as before that the wave-vector of the incident radiation is negligible compared with the wave-vectors of electrons).

Because of the difference in the nature of holes and particles, we need to pay special attention to the possible spin states of the entire system. When the electron is removed from state $\phi^{(v)}_{s_h}(\mathbf{r} - \mathbf{R}_h)$, the many-body state has a total spin $z$-component $S_z = -s_h$, since the original ground state had spin 0. Therefore, the new state created by adding a particle in state $\phi^{(c)}_{s_p}(\mathbf{r} - \mathbf{R}_p)$ produces a many-body state with total spin $z$-component $S_z = s_p - s_h$. This reveals that when we deal with hole states, we must take their contribution to the spin as the opposite of what a normal particle would contribute. Taking into consideration the fact that the hole has opposite wave-vector of a particle in the same state, we conclude that the hole corresponds to the time-reversed particle state, since the effect of the time-reversal operator $T$ on the energy and the wavefunction is:

$$T \epsilon^{(n)}_{\mathbf{k}\uparrow} = \epsilon^{(n)}_{-\mathbf{k}\downarrow}, \qquad T |\psi^{(n)}_{\mathbf{k}\uparrow}\rangle = |\psi^{(n)}_{-\mathbf{k}\downarrow}\rangle$$

as discussed in chapter 3. Notice that for the particle–hole system, if we start as usual with the highest $S_z$ state, which in this case is $\uparrow\downarrow$, and proceed to create the rest by applying spin-lowering operators (see Appendix B), there will be an overall minus sign associated with the hole spin-lowering operator due to complex conjugation implied by time reversal. The resulting spin states for the particle–hole system, as identified by the total spin $S$ and its $z$-component $S_z$, are given in Table 5.3 and contrasted against the particle–particle spin states.
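The construction of the particle–hole spin states by repeated lowering can be sketched numerically. The code below is one possible reading of the rule stated above, under two assumptions made for this example: the hole carries the time-reversed spin label, so that lowering its contribution to $S_z$ turns the label $\downarrow$ into $\uparrow$, and this hole operator enters with an overall minus sign. The basis ordering and variable names are illustrative.

```python
import numpy as np

# Single-spin basis: index 0 is the label "up", index 1 is the label "down".
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
identity = np.eye(2)

# Particle: ordinary spin-lowering operator, up -> down.
lower_particle = np.array([[0.0, 0.0],
                           [1.0, 0.0]])
# Hole (assumption of this sketch): label down -> up, with an overall minus sign.
lower_hole = -np.array([[0.0, 1.0],
                        [0.0, 0.0]])

# Total lowering operator on the product space (first factor particle, second hole).
s_minus = np.kron(lower_particle, identity) + np.kron(identity, lower_hole)


def normalize(vec):
    return vec / np.linalg.norm(vec)


# Product basis ordering: up-up, up-down, down-up, down-down.
triplet_plus = np.kron(up, down)                 # highest S_z state of the pair
triplet_zero = normalize(s_minus @ triplet_plus)
triplet_minus = normalize(s_minus @ triplet_zero)
singlet = normalize(np.kron(up, up) + np.kron(down, down))

for name, state in [("S=1, Sz=+1", triplet_plus), ("S=1, Sz=0", triplet_zero),
                    ("S=1, Sz=-1", triplet_minus), ("S=0, Sz=0", singlet)]:
    print(f"{name:11s} {np.round(state, 4)}")
```

Up to an overall sign, the $S_z = 0$ triplet state comes out as the antisymmetric-looking combination of $\uparrow\uparrow$ and $\downarrow\downarrow$, and the singlet as the symmetric-looking one, which is the reverse of the familiar particle–particle pattern.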

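Table 5.3. Spin configurations for a particle–particle pair and a particle–hole pair for spin-1/2 particles. In the latter case the first spin refers to the particle, the second to the hole.

Spin state            Particle–particle                                              Particle–hole
$S = 1$, $S_z = 1$    $\uparrow\uparrow$                                             $\uparrow\downarrow$
$S = 1$, $S_z = 0$    $\frac{1}{\sqrt{2}}(\uparrow\downarrow + \downarrow\uparrow)$  $\frac{1}{\sqrt{2}}(\uparrow\uparrow - \downarrow\downarrow)$
$S = 1$, $S_z = -1$   $\downarrow\downarrow$                                         $\downarrow\uparrow$
$S = 0$, $S_z = 0$    $\frac{1}{\sqrt{2}}(\uparrow\downarrow - \downarrow\uparrow)$  $\frac{1}{\sqrt{2}}(\uparrow\uparrow + \downarrow\downarrow)$

Now the first task is to construct Bloch states from the proper basis. Let us denote by

$$\Phi^{(S,S_z)}([\mathbf{r}_p, s_p; \mathbf{r}_h, s_h], \mathbf{r}_2, s_2, \ldots, \mathbf{r}_{2N}, s_{2N})$$

the many-body wavefunction produced by exciting one particle from $\phi^{(v)}_{s_h}(\mathbf{r}_h)$ to $\phi^{(c)}_{s_p}(\mathbf{r}_p)$. This many-body state has total spin and $z$-projection $(S, S_z)$, produced by the combination of $s_p$, $s_h$, in the manner discussed above. The real-space variable associated with the particle and the hole states is the same, but we use different symbols ($\mathbf{r}_p$ and $\mathbf{r}_h$) to denote the two different states associated with this excitation. Then the Bloch state obtained by appropriately combining such states is

$$\Psi^{(S,S_z)}_{\mathbf{K}}(\mathbf{r}_1, s_1, \ldots, \mathbf{r}_{2N}, s_{2N}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{R},\mathbf{R}'} F_{\mathbf{R}}(\mathbf{R}')\, e^{i\mathbf{K}\cdot\mathbf{R}}\, \Phi^{(S,S_z)}_{\mathbf{R},\mathbf{R}'}(\mathbf{r}_1, s_1, \ldots, \mathbf{r}_{2N}, s_{2N}) \tag{5.61}$$

where the wavefunction $\Phi^{(S,S_z)}_{\mathbf{R},\mathbf{R}'}$ is defined by the following relation:

$$\Phi^{(S,S_z)}_{\mathbf{R},\mathbf{R}'}(\mathbf{r}_1, s_1, \ldots, \mathbf{r}_{2N}, s_{2N}) = \Phi^{(S,S_z)}([\mathbf{r}_p - \mathbf{R}, s_p; \mathbf{r}_h - \mathbf{R} - \mathbf{R}', s_h], \mathbf{r}_2 - \mathbf{R}, s_2, \ldots, \mathbf{r}_{2N} - \mathbf{R}, s_{2N}) \tag{5.62}$$

that is, all particle variables in this many-body wavefunction, which is multiplied by the phase factor $\exp(i\mathbf{K}\cdot\mathbf{R})$, are shifted by $\mathbf{R}$ but the hole variable is shifted by any other lattice vector $\mathbf{R}'' = \mathbf{R} + \mathbf{R}'$, since it is not explicitly involved in the new many-body wavefunction. We need to invoke a set of coefficients $F_{\mathbf{R}}(\mathbf{R}')$ for this possible difference in shifts, as implemented in Eq. (5.61). The values of these coefficients will determine the wavefunction for this excited many-body state.

Let us consider a simple example, in which $F_{\mathbf{R}}(\mathbf{R}') = \delta(\mathbf{R} - \mathbf{R}')$. The physical meaning of this choice of coefficients is that the wavefunction does not vanish only when the electron and the hole are localized at the same lattice site, which represents the extreme case of a localized Frenkel exciton. The energy of the state corresponding to this choice of coefficients is

$$E^{(S,S_z)}_{\mathbf{K}} = \langle \Psi^{(S,S_z)}_{\mathbf{K}} | \mathcal{H} | \Psi^{(S,S_z)}_{\mathbf{K}} \rangle = \frac{1}{N} \sum_{\mathbf{R},\mathbf{R}'} e^{i\mathbf{K}\cdot(\mathbf{R} - \mathbf{R}')} \langle \Phi^{(S,S_z)}_{\mathbf{R}',\mathbf{R}'} | \mathcal{H} | \Phi^{(S,S_z)}_{\mathbf{R},\mathbf{R}} \rangle \tag{5.63}$$

Now we can define the last expectation value as $E^{(S,S_z)}_{\mathbf{R}',\mathbf{R}}$, and obtain for the energy:

$$E^{(S,S_z)}_{\mathbf{K}} = \frac{1}{N} \sum_{\mathbf{R},\mathbf{R}'} e^{i\mathbf{K}\cdot(\mathbf{R} - \mathbf{R}')} E^{(S,S_z)}_{\mathbf{R}',\mathbf{R}} = \frac{1}{N} \sum_{\mathbf{R}} E^{(S,S_z)}_{\mathbf{R},\mathbf{R}} + \frac{1}{N} \sum_{\mathbf{R}} \sum_{\mathbf{R}' \neq \mathbf{R}} e^{i\mathbf{K}\cdot(\mathbf{R} - \mathbf{R}')} E^{(S,S_z)}_{\mathbf{R}',\mathbf{R}} = E^{(S,S_z)}_{0,0} + \sum_{\mathbf{R} \neq 0} e^{-i\mathbf{K}\cdot\mathbf{R}} E^{(S,S_z)}_{\mathbf{R},0} \tag{5.64}$$

where we have taken advantage of the translational symmetry of the hamiltonian to write $E^{(S,S_z)}_{\mathbf{R}',\mathbf{R}} = E^{(S,S_z)}_{\mathbf{R}'-\mathbf{R},0}$; we have also eliminated summations over $\mathbf{R}$ when the summand does not depend explicitly on $\mathbf{R}$, together with a factor of $(1/N)$. We can express the quantities $E^{(S,S_z)}_{0,0}$, $E^{(S,S_z)}_{\mathbf{R},0}$ in terms of the single-particle states $\phi^{(v)}(\mathbf{r})$ and $\phi^{(c)}(\mathbf{r})$ as follows:

$$E^{(S,S_z)}_{0,0} = E_0 + \epsilon^{(c)} - \epsilon^{(v)} + V^{(c)} - V^{(v)} + U^{(S)}$$

$$E_0 = \langle \Psi_0 | \mathcal{H} | \Psi_0 \rangle, \qquad \epsilon^{(c)} = \langle \phi^{(c)} | h | \phi^{(c)} \rangle, \qquad \epsilon^{(v)} = \langle \phi^{(v)} | h | \phi^{(v)} \rangle$$

$$V^{(c)} = \sum_{\mathbf{R}} \left[ 2 \langle \phi^{(c)} \phi^{(v)}_{\mathbf{R}} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(c)} \phi^{(v)}_{\mathbf{R}} \rangle - \langle \phi^{(c)} \phi^{(v)}_{\mathbf{R}} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(v)}_{\mathbf{R}} \phi^{(c)} \rangle \right]$$

$$V^{(v)} = \sum_{\mathbf{R}} \left[ 2 \langle \phi^{(v)} \phi^{(v)}_{\mathbf{R}} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(v)} \phi^{(v)}_{\mathbf{R}} \rangle - \langle \phi^{(v)} \phi^{(v)}_{\mathbf{R}} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(v)}_{\mathbf{R}} \phi^{(v)} \rangle \right]$$

$$U^{(S)} = 2\delta_{S,0} \langle \phi^{(c)} \phi^{(v)} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(v)} \phi^{(c)} \rangle - \langle \phi^{(c)} \phi^{(v)} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(c)} \phi^{(v)} \rangle \tag{5.65}$$

where we have used the short-hand notation $\langle \mathbf{r} | \phi^{(n)}_{\mathbf{R}} \rangle = \phi^{(n)}(\mathbf{r} - \mathbf{R})$ with $n = v$ or $c$ for the valence or conduction states; the states with no subscript correspond to $\mathbf{R} = 0$. In the above equations we do not include spin labels in the single-particle states $\phi^{(n)}(\mathbf{r})$ since the spin degrees of freedom have explicitly been taken into account to arrive at these expressions.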
$\mathcal{H}(\{\mathbf{r}_i\})$, $h(\mathbf{r})$ are the many-body and single-particle hamiltonians defined in Eq. (5.56), and $\Psi_0$ is the ground-state wavefunction of Eq. (5.60). The interpretation of these terms is straightforward: the energy of the system with the exciton is given, relative to the energy of the ground state $E_0$, by adding the particle energy $\epsilon^{(c)}$, subtracting the hole energy $\epsilon^{(v)}$ and taking into account all the Coulomb interactions, consisting of the interaction energy of the particle $V^{(c)}$, which is added, the interaction energy of the hole $V^{(v)}$, which is subtracted, and the particle–hole interaction $U^{(S)}$, which depends on the total spin $S$. The interaction terms are at the Hartree–Fock level, including the direct and exchange contribution, which is a natural consequence of our choice of a Slater determinant for the many-body wavefunction. With the same conventions, the last term appearing under the summation over $\mathbf{R}$ in Eq. (5.64) takes the form:

$$E^{(S,S_z)}_{\mathbf{R},0} = \langle \Phi^{(S,S_z)}_{\mathbf{R},\mathbf{R}} | \mathcal{H} | \Phi^{(S,S_z)}_{0,0} \rangle = 2\delta_{S,0} \langle \phi^{(v)}_{\mathbf{R}} \phi^{(c)} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(c)}_{\mathbf{R}} \phi^{(v)} \rangle - \langle \phi^{(v)}_{\mathbf{R}} \phi^{(c)} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \phi^{(v)} \phi^{(c)}_{\mathbf{R}} \rangle$$

This term describes an interaction between the particle and hole states, in addition to their local Coulomb interaction $U^{(S)}$, which arises from the periodicity of the system.

How do these results change the interaction of the solid with light? The absorption probability will be given as before by matrix elements of the interaction hamiltonian $\mathcal{H}^{int}$ with eigenfunctions of the many-body system:

$$P(\omega) = \frac{2\pi}{\hbar} \left| \langle \Psi_0 | \mathcal{H}^{int} | \Psi^{(S,S_z)}_{\mathbf{K}} \rangle \right|^2 \delta(E^{(S,S_z)}_{\mathbf{K}} - E_0 - \hbar\omega) = \frac{2\pi}{\hbar} \left( \frac{e}{m_e c} \right)^2 \left| \langle \phi^{(v)} | \mathbf{A}_0 \cdot \mathbf{p} | \phi^{(c)} \rangle \right|^2 \delta_{S,0}\, \delta(\mathbf{K})\, \delta(E^{(S,S_z)}_{\mathbf{K}} - E_0 - \hbar\omega)$$

where we have used the appropriate generalization of the interaction hamiltonian to a many-body system,

$$\mathcal{H}^{int}(\{\mathbf{r}_i\}, t) = \sum_i \frac{e}{m_e c} \left[ e^{i(\mathbf{q}\cdot\mathbf{r}_i - \omega t)} \mathbf{A}_0 \cdot \mathbf{p}_i + \mathrm{c.c.} \right] \tag{5.66}$$

with the summation running over all electron coordinates. The total-spin conserving $\delta$-function is introduced because there are no terms in the interaction hamiltonian which can change the spin, and the wave-vector conserving $\delta$-function is introduced by arguments similar to what we discussed earlier for the conservation of $\mathbf{k}$ upon absorption or emission of optical photons. In this expression we can now use the results from above, giving for the argument of the energy-conserving $\delta$-function, after we take into account that only terms with $\mathbf{K} = 0$, $S = S_z = 0$ survive:

$$\hbar\omega = E^{(0,0)}_{\mathbf{0}} - E_0 = [\epsilon^{(c)} + V^{(c)}] - [\epsilon^{(v)} + V^{(v)}] + U^{(0)} + \sum_{\mathbf{R} \neq 0} E^{(0,0)}_{\mathbf{R},0} \tag{5.67}$$

Taking into account the physical origin of the various terms as described above, we conclude that the two terms in square brackets on the right-hand side of Eq. (5.67) will give the band gap,

$$\epsilon_{gap} = [\epsilon^{(c)} + V^{(c)}] - [\epsilon^{(v)} + V^{(v)}]$$

Notice that the energies $\epsilon^{(c)}$, $\epsilon^{(v)}$ are eigenvalues of the single-particle hamiltonian $h(\mathbf{r})$ which does not include electron–electron interactions, and therefore their difference is not equal to the band gap; we must also include the contributions from electron–electron interactions in these single-particle energies, which are represented by the terms $V^{(c)}$, $V^{(v)}$ at the Hartree–Fock level, to obtain the proper quasiparticle energies whose difference equals the band gap. The last two terms in Eq. (5.67) give the particle–hole interaction, which is overall negative (attractive between two oppositely charged particles). This implies that there will be absorption of light for photon energies smaller than the band gap energy, as expected for an electron–hole pair that is bound by the Coulomb interaction. In our simple model with one occupied and one empty band, the absorption spectrum will have an extra peak at energies just below the band gap energy, corresponding to the presence of excitons, as illustrated in Fig. 5.5. In more realistic situations there will be several exciton peaks reflecting the band structure features of valence and conduction bands.

Figure 5.5. Modification of the absorption spectrum in the presence of excitons (solid line) relative to the spectrum in the absence of excitons (dashed line); in the latter case absorption begins at exactly the value of the band gap, $\epsilon_{gap}$. Axes: absorption versus energy $E$.

For Mott–Wannier excitons, the treatment is similar, only instead of the localized Wannier functions $\phi^{(v)}_{s_h}(\mathbf{r} - \mathbf{R}_h)$, $\phi^{(c)}_{s_p}(\mathbf{r} - \mathbf{R}_p)$ for the single-particle states we will need to use the Bloch states $\psi^{(v)}_{\mathbf{k}_h,s_h}(\mathbf{r})$, $\psi^{(c)}_{\mathbf{k}_p,s_p}(\mathbf{r})$, in terms of which we must express the many-body wavefunction.

In this case, momentum conservation implies $\mathbf{k}_p - \mathbf{k}_h = \mathbf{K}$, so that the total momentum $\mathbf{K}$ is unchanged. This condition is implemented by choosing $\mathbf{k}_p = \mathbf{k} + \mathbf{K}/2$, $\mathbf{k}_h = \mathbf{k} - \mathbf{K}/2$ and allowing $\mathbf{k}$ to take all possible values in the BZ. Accordingly, we have to construct a many-body wavefunction which is a summation over all such possible states. The energy corresponding to this wavefunction can be determined by steps analogous to the discussion for the Frenkel excitons. The energy of these excitons exhibits dispersion just like electron states, and lies within the band gap of the crystal.

5.6 Energetics and dynamics

In the final section of the present chapter we discuss an application of band theory which is becoming a dominant component in the field of atomic and electronic structure of solids: it is the calculation of the total energy of the solid as a function of the arrangement of the atoms. The ability to obtain accurate values for the total energy as a function of atomic configuration is crucial in explaining a number of thermodynamic and mechanical properties of solids. For example,

• phase transitions as a function of pressure can be predicted if the total energy of different phases is known as a function of volume;
• alloy phase diagrams as a function of temperature and composition can be constructed by calculating the total energy of various structures with different types of elements;
• the relative stability of competing surface structures can be determined through their total energies, and from those one can predict the shape of solids;
• the dynamics of atoms in the interior and on the surface of a solid can be described by calculating the total energy of relevant configurations, which can elucidate complex phenomena like bulk diffusion and surface growth;
• the energetics of extended deformations, like shear or cleavage of a solid, are crucial in understanding its mechanical response, such as brittle or ductile behavior;
• the properties of defects of dimensionality zero, one and two, can be elucidated through total-energy calculations of model structures, which in turn provides insight into complex phenomena like fracture, catalysis, corrosion, adhesion, etc.

Calculations of this type have proliferated since the early 1980s, providing a wealth of useful information on the behavior of real materials. We will touch upon some topics where such calculations have proven particularly useful in chapters 9–11 of this book. It is impossible to provide a comprehensive review of such applications here, which are being expanded and refined at a very rapid pace by many practitioners worldwide. The contributions of M.L. Cohen, who pioneered this type of application and produced a large number of scientific descendants responsible for extensions of this field in many directions, deserve special mention. For a glimpse of the range of possible applications, the reader may consult the review article of Chelikowsky and Cohen [65].

5.6.1 The total energy

We will describe the calculation of the total energy and its relation to the band structure in the framework of Density Functional Theory (DFT in the following); see chapter 2. The reason is that this formulation has proven the most successful compromise between accuracy and efficiency for total-energy calculations in a very wide range of solids. In the following we will also adopt the pseudopotential method for describing the ionic cores, which, as we discussed in chapter 2, allows for an efficient treatment of the valence electrons only. Furthermore, we will assume that the ionic pseudopotentials are the same for all electronic states in the atom, an approximation known as the "local pseudopotential"; this will help keep the discussion simple. In realistic calculations the pseudopotential typically depends on the angular momentum of the atomic state it represents, which is known as a "non-local pseudopotential".

The total energy of a solid for a particular configuration of the ions, $E^{tot}(\{\mathbf{R}\})$, is given by

$$E^{tot} = T + U^{ion-el} + U^{el-el} + U^{ion-ion} \qquad (5.68)$$

where $T$ is the kinetic energy of electrons, $U^{ion-el}$ is the energy due to the ion–electron attractive interaction, $U^{el-el}$ is the energy due to the electron–electron interaction including the Coulomb repulsion and exchange and correlation effects, and $U^{ion-ion}$ is the energy due to the ion–ion repulsive interaction. This last term is the Madelung energy:

$$U^{ion-ion} = \frac{1}{2} \sum_{I \neq J} \frac{Z_I Z_J e^2}{|\mathbf{R}_I - \mathbf{R}_J|} \qquad (5.69)$$

with $Z_I$ the valence charge of the ion at position $\mathbf{R}_I$, which can be calculated by the Ewald method, as discussed in Appendix F. Using the theory described in chapter 2, the rest of the terms take the form

$$T = \sum_k \langle \psi_k | -\frac{\hbar^2 \nabla^2_{\mathbf{r}}}{2 m_e} | \psi_k \rangle \qquad (5.70)$$

$$U^{ion-el} = \sum_k \langle \psi_k | V^{ps}(\mathbf{r}) | \psi_k \rangle \qquad (5.71)$$

$$U^{el-el} = \frac{1}{2} \sum_{k k'} \langle \psi_k \psi_{k'} | \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} | \psi_k \psi_{k'} \rangle + E^{XC}[n(\mathbf{r})] \qquad (5.72)$$

where $|\psi_k\rangle$ are the single-particle states obtained from a self-consistent solution of the set of single-particle Schrödinger equations and $n(\mathbf{r})$ is the electron density; for simplicity, we use a single index $k$ to identify these single-particle wavefunctions, with the understanding that it encompasses both the wave-vector and the band index.

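In terms of these states, the density is given by

$$n(\mathbf{r}) = \sum_{k (\epsilon_k < \epsilon_F)} |\psi_k(\mathbf{r})|^2$$

with the summation running over all occupied states with energy $\epsilon_k$ below the Fermi level $\epsilon_F$. $V^{ps}(\mathbf{r})$ is the external potential that each valence electron in the solid experiences due to the presence of ions described by pseudopotentials, and $E^{XC}[n(\mathbf{r})]$ is the exchange and correlation contribution to the total energy, which, in the framework of DFT, depends on the electron density only. We will adopt the local density approximation (LDA in the following) for the exchange correlation functional in terms of the electron density:

$$E^{XC}[n(\mathbf{r})] = \int \epsilon^{XC}[n(\mathbf{r})] \, n(\mathbf{r}) \, d\mathbf{r} \qquad (5.73)$$

with $\epsilon^{XC}[n]$ the local function of the density that accounts for exchange and correlation effects (see also chapter 2). The potential due to exchange and correlation effects that appears in the single-particle equations is then

$$V^{XC}(\mathbf{r}) = \frac{\partial}{\partial n(\mathbf{r})} \left\{ \epsilon^{XC}[n(\mathbf{r})] \, n(\mathbf{r}) \right\} = \epsilon^{XC}[n(\mathbf{r})] + \frac{\partial \epsilon^{XC}[n(\mathbf{r})]}{\partial n(\mathbf{r})} n(\mathbf{r}) \qquad (5.74)$$

and the single-particle equations take the form

$$\left[ -\frac{\hbar^2 \nabla^2_{\mathbf{r}}}{2 m_e} + V^{ps}(\mathbf{r}) + e^2 \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' + V^{XC}(\mathbf{r}) \right] |\psi_k\rangle = \epsilon_k |\psi_k\rangle \qquad (5.75)$$

Multiplying this equation by $\langle \psi_k |$ from the left and summing over all occupied states, we obtain

$$\sum_{k (\epsilon_k < \epsilon_F)} \left[ \langle \psi_k | -\frac{\hbar^2 \nabla^2_{\mathbf{r}}}{2 m_e} | \psi_k \rangle + \langle \psi_k | V^{ps}(\mathbf{r}) | \psi_k \rangle \right] + e^2 \int\!\!\int \frac{n(\mathbf{r}) n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r} \, d\mathbf{r}' + \int V^{XC}(\mathbf{r}) n(\mathbf{r}) \, d\mathbf{r} = \sum_{k (\epsilon_k < \epsilon_F)} \epsilon_k \qquad (5.76)$$

By comparing this with our earlier expression for the total energy, Eqs. (5.68) and (5.72), we find

$$E^{tot} = \sum_{k (\epsilon_k < \epsilon_F)} \epsilon_k - \frac{1}{2} \int V^{Coul}(\mathbf{r}) n(\mathbf{r}) \, d\mathbf{r} - \int \Delta V^{XC}(\mathbf{r}) n(\mathbf{r}) \, d\mathbf{r} + U^{ion-ion} \qquad (5.77)$$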

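where we have defined the Coulomb potential as

$$V^{Coul}(\mathbf{r}) = e^2 \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}' \qquad (5.78)$$

and the difference in exchange and correlation potential as

$$\Delta V^{XC}(\mathbf{r}) = V^{XC}(\mathbf{r}) - \epsilon^{XC}[n(\mathbf{r})] \qquad (5.79)$$

For a periodic solid, the Fourier transforms of the density and the potentials provide a particularly convenient platform for evaluating the various terms of the total energy. We follow here the original formulation of the total energy in terms of the plane waves $\exp(i\mathbf{G} \cdot \mathbf{r})$, defined by the reciprocal-space lattice vectors $\mathbf{G}$, as derived by Ihm, Zunger and Cohen [66]. The Fourier transforms of the density and potentials are given by

$$n(\mathbf{r}) = \sum_{\mathbf{G}} e^{i\mathbf{G} \cdot \mathbf{r}} n(\mathbf{G}), \qquad V^{Coul}(\mathbf{r}) = \sum_{\mathbf{G}} e^{i\mathbf{G} \cdot \mathbf{r}} V^{Coul}(\mathbf{G}), \qquad \Delta V^{XC}(\mathbf{r}) = \sum_{\mathbf{G}} e^{i\mathbf{G} \cdot \mathbf{r}} \Delta V^{XC}(\mathbf{G}) \qquad (5.80)$$

In terms of these Fourier transforms, we obtain for the exchange and correlation term in Eq. (5.77)

$$\int \Delta V^{XC}(\mathbf{r}) n(\mathbf{r}) \, d\mathbf{r} = \Omega \sum_{\mathbf{G}} \Delta V^{XC}(\mathbf{G}) n(\mathbf{G}) \qquad (5.81)$$

with $\Omega$ the total volume of the solid. For the Coulomb term, the calculation is a bit more tricky. First, we can use the identity

$$\int \frac{e^{i\mathbf{q} \cdot \mathbf{r}}}{r} d\mathbf{r} = \frac{4\pi}{|\mathbf{q}|^2} \qquad (5.82)$$

and the relation between electrostatic potential and electrical charge, that is, the Poisson equation

$$\nabla^2_{\mathbf{r}} V^{Coul}(\mathbf{r}) = -4\pi e^2 n(\mathbf{r}) \;\Rightarrow\; V^{Coul}(\mathbf{G}) = \frac{4\pi e^2 n(\mathbf{G})}{|\mathbf{G}|^2} \qquad (5.83)$$

to rewrite the Coulomb term in Eq. (5.77) as

$$\int V^{Coul}(\mathbf{r}) n(\mathbf{r}) \, d\mathbf{r} = \Omega \sum_{\mathbf{G}} V^{Coul}(\mathbf{G}) n(\mathbf{G}) = 4\pi e^2 \Omega \sum_{\mathbf{G}} \frac{[n(\mathbf{G})]^2}{|\mathbf{G}|^2} \qquad (5.84)$$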
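The reciprocal-space Poisson relation of Eq. (5.83) and the Coulomb term of Eq. (5.84) can be checked numerically on a grid. The following is a minimal sketch, not the formulation of Ref. [66]: it assumes a cubic periodic box, units with $e^2 = 1$, a hypothetical single-cosine density, and it simply sets the $\mathbf{G} = 0$ component of the potential to zero. The function name and all parameters are illustrative.

```python
import numpy as np

def coulomb_potential_from_density(density, box_length, e2=1.0):
    """Apply V(G) = 4*pi*e2*n(G)/|G|^2 on a cubic periodic grid.

    Assumption of this sketch: the G = 0 component is set to zero.
    """
    n_grid = density.shape[0]
    # Fourier coefficients normalised so that n(r) = sum_G exp(iG.r) n(G)
    n_g = np.fft.fftn(density) / density.size
    g_1d = 2.0 * np.pi * np.fft.fftfreq(n_grid, d=box_length / n_grid)
    gx, gy, gz = np.meshgrid(g_1d, g_1d, g_1d, indexing="ij")
    g2 = gx**2 + gy**2 + gz**2
    v_g = np.zeros_like(n_g)
    nonzero = g2 > 0.0
    v_g[nonzero] = 4.0 * np.pi * e2 * n_g[nonzero] / g2[nonzero]
    # Back to real space: V(r) = sum_G exp(iG.r) V(G)
    v_r = np.fft.ifftn(v_g * density.size).real
    return v_r, v_g, n_g

# Hypothetical test density: uniform background plus one cosine along x
box_length, n_grid, amplitude = 10.0, 16, 0.2
g0 = 2.0 * np.pi / box_length
x = np.arange(n_grid) * box_length / n_grid
density = 0.5 + amplitude * np.cos(g0 * x)[:, None, None] * np.ones((n_grid,) * 3)

v_r, v_g, n_g = coulomb_potential_from_density(density, box_length)
volume = box_length**3

# Analytic potential for this density (G = 0 part omitted)
v_expected = 4.0 * np.pi * amplitude * np.cos(g0 * x)[:, None, None] / g0**2 * np.ones((n_grid,) * 3)

# Coulomb term evaluated in reciprocal space and directly on the real-space grid
energy_reciprocal = volume * np.sum(v_g * np.conj(n_g)).real
energy_real = np.sum(v_r * density) * volume / density.size
```

For this density the reciprocal-space sum contains only the two components at $\pm g_0\hat{\mathbf{x}}$, so the two energy estimates should agree to rounding error.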

Here we need to be careful with handling of the infinite terms. We note that in a solid which is overall neutral, the total positive charge of the ions is canceled by the total negative charge of the electrons. This means that the average potential is zero. The $\mathbf{G} = 0$ term in the Fourier transform corresponds to the average over all space, which implies that due to charge neutrality we can omit the $\mathbf{G} = 0$ term from all the Coulomb contributions. However, we have introduced an alteration of the physical system by representing the ions with pseudopotentials in the solid: the pseudopotential matches exactly the Coulomb potential beyond the cutoff radius, but deviates from it inside the core. When accounting for the infinite terms, we have to compensate for the alteration introduced by the pseudopotential. This is done by adding to the total energy the following term:

$$U^{ps} = \sum_I Z_I \int \left[ V^{ps}_I(r) + \frac{Z_I e^2}{r} \right] d\mathbf{r} \qquad (5.85)$$

where the summation is over all the ions and $V^{ps}_I(r)$ is the pseudopotential corresponding to ion $I$; the integrand in this term does not vanish inside the core region, where the pseudopotential is different from the Coulomb potential. With these contributions, the total energy of the solid takes the form

$$E^{tot} = \sum_{k (\epsilon_k < \epsilon_F)} \epsilon_k - 2\pi e^2 \Omega \sum_{\mathbf{G} \neq 0} \frac{[n(\mathbf{G})]^2}{|\mathbf{G}|^2} - \Omega \sum_{\mathbf{G}} \Delta V^{XC}(\mathbf{G}) n(\mathbf{G}) + U^{ps} + U^{ion-ion} \qquad (5.86)$$

This final expression has an appealing form: the total energy is essentially the sum of all occupied single-particle states, the first term in Eq. (5.86), corrected for double-counting the Coulomb and exchange-correlation contributions, second and third terms in Eq. (5.86), and adjusted for the infinite terms due to Coulomb interactions between the various charges, fourth term in Eq. (5.86). The very last term is of course the classical contribution of the ions as point charges, the Madelung energy, which is treated explicitly in Appendix F.

This expression for the total energy has inspired the derivation of semi-empirical schemes, which involve much smaller computational cost than fully self-consistent methods like DFT. A standard approach is to use a tight-binding hamiltonian which, when properly parametrized, can provide an accurate estimate of the first term in the total energy, the sum of single-particle energies. This is actually the dominant term in the total energy. The three other terms are then viewed as corrections which can be approximated by an empirical term in the form of a classical interatomic potential that depends only on the distance between atoms. This "correction potential" is fitted to reproduce the exact energies of a few possible structures, and can then be used to calculate the energy of a variety of other structures.
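The structure of such a semi-empirical scheme (a sum of occupied tight-binding eigenvalues plus a classical pair potential) can be illustrated with a toy model. This is only a sketch under loud assumptions: one s-like orbital per atom, an exponentially decaying hopping, and an exponential pair repulsion whose functional forms and parameters are entirely hypothetical and not fitted to any material.

```python
import numpy as np

def hopping(d, t0=1.0, d0=1.0, decay=0.5):
    """Hypothetical distance-dependent hopping matrix element (energy units)."""
    return -t0 * np.exp(-(d - d0) / decay)

def pair_repulsion(d, phi0=0.8, d0=1.0, decay=0.25):
    """Hypothetical classical 'correction potential' between two atoms."""
    return phi0 * np.exp(-(d - d0) / decay)

def total_energy(positions, n_electrons):
    """Return (total, band, pair) energies for atoms at the given positions."""
    positions = np.asarray(positions, dtype=float)
    n_atoms = len(positions)
    h = np.zeros((n_atoms, n_atoms))
    e_pair = 0.0
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            d = np.linalg.norm(positions[i] - positions[j])
            h[i, j] = h[j, i] = hopping(d)
            e_pair += pair_repulsion(d)
    levels = np.linalg.eigvalsh(h)  # single-particle energies, ascending
    # Fill the levels from the bottom with up to two electrons each
    e_band, remaining = 0.0, n_electrons
    for level in levels:
        occupation = min(2, remaining)
        e_band += occupation * level
        remaining -= occupation
    return e_band + e_pair, e_band, e_pair

# Dimer at the reference distance d0 with two electrons
dimer = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
e_tot, e_band, e_pair = total_energy(dimer, n_electrons=2)
```

For the dimer the two levels are $\pm|t|$, so the band energy is $-2|t|$ and the quantum mechanical part of the bonding is carried entirely by the eigenvalue sum, with the pair term acting as a short-range correction.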

Although not as accurate as a fully self-consistent approach, this method is quite useful because it makes feasible the application of total-energy calculations to systems involving large numbers of atoms. In particular, this method captures the essential aspects of bonding in solids because it preserves the quantum mechanical nature of bond formation and destruction through the electronic states with energies $\epsilon_k$. There is certainly a loss in accuracy due to the very approximate treatment of the terms other than the sum of the electronic eigenvalues; this can be tolerated for certain classes of problems, where the exact changes in the energy are not crucial, but a reasonable estimate of the energy differences associated with various processes is important.

As an example of the use of total-energy calculations in elucidating the properties of solids, we discuss the behavior of the energy as a function of volume for different phases of the same element. This type of analysis using DFT calculations was pioneered by Yin and Cohen [67]. The classic case is silicon which, as we have mentioned before, has as its ground state the diamond lattice and is a semiconductor (see chapter 1). The question we want to address is: what is the energy of other possible crystalline phases of silicon relative to its ground state? The answer to this question is shown in Fig. 5.6, where we present the energy per atom of the simple cubic lattice with six nearest neighbors for each atom, the FCC and BCC high-coordination lattices with 12 and eight nearest neighbors, respectively, a body-centered tetragonal (BCT) structure with five nearest neighbors [68], the so-called beta-tin structure which has six nearest neighbors, and the ground state diamond lattice which has four nearest neighbors. We must compare the energy of the various structures as a function of the volume per atom, since we do not know a priori at which volume the different crystalline phases will have their minimum value. The minimum in each curve corresponds to a different volume, as the variations of coordination in the different phases suggest. At their minimum, all other phases have higher energy than the diamond lattice. This comparison shows that it would not be possible to form any of these phases under equilibrium conditions. However, if the volume per atom of the solid were somehow reduced, for example by applying pressure, the results of Fig. 5.6 indicate that the lowest energy structure would become the beta-tin phase. It turns out that when pressure is applied to silicon, the structure indeed becomes beta-tin. A Maxwell (common tangent) construction between the energy-versus-volume curves of the two lowest energy phases gives the critical pressure at which the diamond phase begins to transform to the beta-tin phase; the value of this critical pressure predicted from the curves in Fig. 5.6 is in excellent agreement with experimental measurements (for more details see Ref. [67]).
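The common tangent construction can be sketched numerically. The example below assumes, purely for illustration, that each phase is described by a quadratic energy-versus-volume curve near its minimum; the parameters are hypothetical and are not the silicon data of Fig. 5.6. A common tangent of slope $-P$ is equivalent to equal enthalpies $E + P\Omega$ for the two phases, which is the condition solved here by bisection.

```python
EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/angstrom^3 expressed in GPa

def enthalpy(p, e0, v0, k):
    """Minimum over V of E(V) + p*V for E(V) = e0 + 0.5*k*(V - v0)**2."""
    return e0 + p * v0 - p * p / (2.0 * k)

def volume_at_pressure(p, e0, v0, k):
    """Volume at which the quadratic curve has slope -p."""
    return v0 - p / k

def transition_pressure(phase_a, phase_b, p_lo=0.0, p_hi=1.0, tol=1e-12):
    """Bisect for the pressure at which the two enthalpies are equal."""
    def difference(p):
        return enthalpy(p, *phase_a) - enthalpy(p, *phase_b)
    f_lo = difference(p_lo)
    if f_lo * difference(p_hi) > 0:
        raise ValueError("no enthalpy crossing inside the bracket")
    while p_hi - p_lo > tol:
        p_mid = 0.5 * (p_lo + p_hi)
        f_mid = difference(p_mid)
        if f_lo * f_mid <= 0:
            p_hi = p_mid
        else:
            p_lo, f_lo = p_mid, f_mid
    return 0.5 * (p_lo + p_hi)

# Hypothetical phases: (e0 in eV/atom, v0 in A^3/atom, k in eV/A^6)
low_density = (0.00, 20.0, 0.030)
high_density = (0.25, 15.0, 0.040)

p_t = transition_pressure(low_density, high_density)
p_t_gpa = p_t * EV_PER_A3_TO_GPA
v_a = volume_at_pressure(p_t, *low_density)
v_b = volume_at_pressure(p_t, *high_density)
```

At `p_t` the straight line joining the points at `v_a` and `v_b` on the two curves has slope `-p_t` and is tangent to both, which is the geometric statement of the Maxwell construction.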

Figure 5.6. Total energy (in electronvolts per atom) as a function of volume (in angstroms cubed) for various crystal phases of silicon: diamond (a semiconducting phase with four nearest neighbors), BCC, FCC, BCT, beta-tin and cubic (all metallic phases with higher coordination). The zero of the energy scale corresponds to isolated silicon atoms. (Axes: energy versus volume.)

As the example just discussed demonstrates, total-energy calculations allow the determination of the optimal lattice constant of any crystalline phase, its cohesive energy and its bulk modulus. Since these are all very basic properties of solids, we elaborate on this issue briefly. First, as is evident from the curves of Fig. 5.6, the behavior of the energy near the minimum of each curve is smooth and very nearly quadratic. A simple polynomial fit to the calculated energy as a function of volume near the minimum readily yields estimates for the volume, and hence the lattice constant, that corresponds to the lowest energy.³ The bulk modulus $B$ of a solid is defined as the change in the pressure $P$ when the volume $\Omega$ changes, normalized by the volume (in the present calculation we are assuming zero temperature conditions):

$$B \equiv -\Omega \frac{\partial P}{\partial \Omega} \qquad (5.87)$$

³ The standard way to fit the total-energy curve near its minimum is by a polynomial in powers of $\Omega^{1/3}$, where $\Omega$ is the volume.
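A polynomial fit of this kind, followed by the evaluation of the bulk modulus from its definition in Eq. (5.87), can be sketched as follows. The sketch assumes the zero-temperature relation $P = -\partial E / \partial \Omega$, fits a cubic polynomial directly in the volume (rather than in powers of $\Omega^{1/3}$), and uses synthetic energies generated from hypothetical parameters instead of calculated total energies.

```python
import numpy as np

EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/angstrom^3 expressed in GPa

def equilibrium_from_fit(volumes, energies, degree=3):
    """Fit E(V) by a polynomial; return (V0, E(V0), B) at the fitted minimum."""
    energy = np.poly1d(np.polyfit(volumes, energies, degree))
    slope = energy.deriv(1)
    curvature = energy.deriv(2)
    # Keep real stationary points inside the sampled range that are minima
    minima = [root.real for root in slope.roots
              if abs(root.imag) < 1e-9
              and min(volumes) <= root.real <= max(volumes)
              and curvature(root.real) > 0]
    if not minima:
        raise ValueError("the fit has no minimum inside the sampled range")
    v0 = minima[0]
    dp_dv = -curvature(v0)          # from P = -dE/dV
    bulk_modulus = -v0 * dp_dv      # Eq. (5.87): B = -V dP/dV
    return v0, energy(v0), bulk_modulus

# Synthetic data from hypothetical parameters (eV, cubic angstroms)
V0_TRUE, E0_TRUE, B_TRUE, C3 = 20.0, -4.6, 0.6, -0.0005
volumes = np.linspace(17.0, 23.0, 13)
energies = (E0_TRUE + 0.5 * (B_TRUE / V0_TRUE) * (volumes - V0_TRUE) ** 2
            + C3 * (volumes - V0_TRUE) ** 3)

v0, e0, bulk = equilibrium_from_fit(volumes, energies)
bulk_gpa = bulk * EV_PER_A3_TO_GPA
```

Because the synthetic curve is itself a cubic, the fit recovers the input minimum and bulk modulus; with real data the result depends on the range of volumes sampled and on the degree of the polynomial.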

Using the relation between the pressure and the total energy, we obtain for the bulk modulus

$$P = -\frac{\partial E}{\partial \Omega} \;\Rightarrow\; B = \Omega \frac{\partial^2 E}{\partial \Omega^2} \qquad (5.88)$$

which is easily calculated from the total energy versus volume curves, as the curvature at the minimum. Finally, the cohesive energy of the solid is given by the difference between the total energy per atom at the minimum of the curve and the total energy of a free atom of the same element (for elemental solids). While this is a simple definition, it actually introduces significant errors because of certain subtleties: the calculation of the total energy of the free atom cannot be carried out within the framework developed earlier for solids, because in the case of the free atom it is not possible to define a real-space lattice and the corresponding reciprocal space of $\mathbf{G}$ vectors for expanding the wavefunctions and potentials, as was done for the solid. The calculation of the total energy of the free atom has to be carried out in a real-space approach, without involving reciprocal-space vectors and summations. The difference in computational methods for the solid and the free atom total energies introduces numerical errors which are not easy to eliminate. Even if both total-energy calculations were carried out with the same computational approach,⁴ the underlying formalism does not give results of equal accuracy for the two cases. This is because the DFT/LDA formalism is well suited for solids, where the variations in the electron density are not too severe. In the case of a free atom the electron density goes to zero at a relatively short distance from the nucleus (the electronic wavefunctions decay exponentially), which presents a greater challenge to the formalism. Modifications of the formalism that include gradient corrections to the density functionals, referred to as Generalized Gradient Approximations (GGAs), can improve the situation considerably (see Ref. [27] and chapter 2).

In light of the above discussion, it is worthwhile mentioning two simple phenomenological theories, which predict with remarkable accuracy some important quantities related to the energetics of solids. The first, developed by M.L. Cohen [69] and based on the arguments of J.C. Phillips [70] about the origin of energy bands in semiconductors (see the Further reading section), deals with the bulk modulus of covalently bonded solids. The second, developed by Rose, Smith, Guinea and Ferrante [71], concerns the energy-versus-volume relation and asserts that, when scaled appropriately, it is a universal relation for all solids; this is referred to as the Universal Binding Energy Relation (UBER). We discuss these phenomenological theories in some detail, because they represent important tools, as they distil the results of elaborate calculations into simple, practical expressions.

⁴ This can be done, for example, by placing an atom at the center of a large box and repeating this box periodically in 3D space, thus artificially creating a periodic solid that approximates the isolated atom.

Cohen's theory gives the bulk modulus of a solid which has average coordination $N_c$, as:

$$B = \frac{N_c}{4} (1971 - 220\lambda) \, d^{-3.5} \ \mathrm{GPa} \qquad (5.89)$$

where $d$ is the bond length in Å and $\lambda$ is a dimensionless number which describes the ionicity: $\lambda = 0$ for the homopolar crystals C, Si, Ge, Sn; $\lambda = 1$ for III–V compounds (where the valence of each element differs by 1 from the average valence of 4); and $\lambda = 2$ for II–VI compounds (where the valence of each element differs by 2 from the average valence of 4). The basic physics behind this expression is that the bulk modulus is intimately related to the average electron density in a solid where the electrons are uniformly distributed, as was discussed in chapter 2, Problem 4. For covalently bonded solids, the electronic density is concentrated between nearest neighbor sites, forming the covalent bonds. Therefore, in this case the crucial connection is between the bulk modulus and the bond length, which provides a measure of the electron density relevant to bonding strength in the solid. Cohen's argument [69] produced the relation $B \sim d^{-3.5}$, which, dressed with the empirical constants that give good fits for a few representative solids, leads to Eq. (5.89). As can be checked from the values given in Table 5.5, actual values of bulk moduli are captured with an accuracy which is remarkable, given the simplicity of the expression in Eq. (5.89). In fact, this expression has been used to predict theoretically solids with very high bulk moduli [72], possibly exceeding those of naturally available hard materials such as diamond.

The UBER theory gives the energy-versus-volume relation as a two-parameter expression:

$$E(a) = E_{coh} E^*(a^*), \qquad E^*(a^*) = -\left[ 1 + a^* + 0.05 (a^*)^3 \right] \exp(-a^*), \qquad a^* = (a - a_0)/l \qquad (5.90)$$

where $a$ is the lattice constant (which has a very simple relation to the volume per atom, depending on the lattice), $a_0$ is its equilibrium value, $l$ is a length scale and $E_{coh}$ is the cohesive energy; the latter two quantities are the two parameters that are specific to each solid. This curve has a single minimum and goes asymptotically to zero for large distances; the value of its minimum corresponds to the cohesive energy. For example, we show in Fig. 5.7 the fit of this expression to DFT/GGA values for a number of representative elemental solids, including metals with s (Li, K, Ca) and sp (Al) electrons, sd electron metals with magnetic (Fe, Ni) or nonmagnetic (Cu, Mo) ground states, and semiconductors (Si) and insulators (C). As seen from this example, the fit to this rather wide range of solids is quite satisfactory; the values of the lattice constant and cohesive energy, $a_0$ and $E_{coh}$, extracted from this fit differ somewhat from the experimental values, but this is probably more related to limitations of the DFT/GGA formalism rather than the UBER theory.
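Both phenomenological expressions, Eqs. (5.89) and (5.90), are simple enough to evaluate directly. The sketch below is illustrative only: the bond lengths used for diamond and silicon (1.54 Å and 2.35 Å) are commonly quoted values supplied here as assumptions, and the UBER parameters are hypothetical rather than fitted to any solid.

```python
import math

def cohen_bulk_modulus(bond_length, ionicity=0, coordination=4):
    """Eq. (5.89): bulk modulus in GPa for a bond length given in angstroms."""
    return coordination / 4.0 * (1971.0 - 220.0 * ionicity) * bond_length ** -3.5

def uber_energy(a, a0, length_scale, e_coh):
    """Eq. (5.90): binding energy at lattice constant a (units of e_coh)."""
    a_star = (a - a0) / length_scale
    e_star = -(1.0 + a_star + 0.05 * a_star ** 3) * math.exp(-a_star)
    return e_coh * e_star

# Homopolar crystals (ionicity 0) with assumed bond lengths in angstroms
b_diamond = cohen_bulk_modulus(1.54)
b_silicon = cohen_bulk_modulus(2.35)

# Hypothetical UBER parameters: a0 and l in angstroms, E_coh in eV
A0, L_SCALE, E_COH = 4.0, 0.5, 3.0
curve = [(A0 + 0.1 * step, uber_energy(A0 + 0.1 * step, A0, L_SCALE, E_COH))
         for step in range(-5, 31)]
```

With these inputs Eq. (5.89) gives values of roughly 435 GPa for diamond and 99 GPa for silicon, and the tabulated UBER curve has its minimum, equal to $-E_{coh}$, at $a = a_0$.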

The degree to which this simple expression fits theoretically calculated values for a wide range of solids, as well as the energy of adhesion, molecular binding and chemisorption of gas molecules on surfaces of metals, is remarkable. The UBER expression is very useful for determining, with a few electronic structure calculations of the total energy, a number of important quantities such as the cohesive energy (or absorption, chemisorption, adhesion energies, when applied to surface phenomena), bulk modulus, critical stresses, etc.

Figure 5.7. Fit of calculated total energies of various elemental solids to the Universal Binding Energy Relation; the calculations are based on DFT/GGA. (Axes: scaled binding energy versus scaled separation. Solids shown: Li (BCC), C (DIA), Al (FCC), Si (DIA), K (BCC), Ca (FCC), Fe (BCC), Ni (FCC), Cu (FCC), Mo (BCC).)

5.6.2 Forces and dynamics

Having calculated the total energy of a solid, it should be feasible to calculate forces on the ions by simply taking the derivative of the total energy with respect to individual ionic positions. This type of calculation is indeed possible, and opens up a very broad field, that is, the study of the dynamics of atoms in the solid state. Here, by dynamics we refer to the behavior of ions as classical particles, obeying Newton's equations of motion, with the calculated forces at each instant in time.
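Integrating Newton's equations of motion with forces supplied at each instant can be sketched with the velocity Verlet scheme, a standard integrator that is used here as an assumption since the text does not prescribe one. The example treats a single degree of freedom and a hypothetical harmonic force standing in for forces obtained from a total-energy calculation.

```python
def velocity_verlet(position, velocity, mass, force, dt, n_steps):
    """Integrate m * x'' = F(x) for one degree of freedom.

    `force` is any callable returning the force at a given position;
    in a real simulation it would come from a total-energy calculation.
    """
    trajectory = [(position, velocity)]
    f = force(position)
    for _ in range(n_steps):
        velocity_half = velocity + 0.5 * dt * f / mass
        position = position + dt * velocity_half
        f = force(position)                      # force at the new position
        velocity = velocity_half + 0.5 * dt * f / mass
        trajectory.append((position, velocity))
    return trajectory

# Hypothetical harmonic "ion": F = -k x with k = 1, m = 1 (period 2*pi)
SPRING, MASS = 1.0, 1.0
trajectory = velocity_verlet(1.0, 0.0, MASS, lambda x: -SPRING * x,
                             dt=0.01, n_steps=628)

def total_energy(state):
    x, v = state
    return 0.5 * MASS * v * v + 0.5 * SPRING * x * x

energy_drift = abs(total_energy(trajectory[-1]) - total_energy(trajectory[0]))
```

The small energy drift over one period is the usual check that the time step is adequate for the forces involved.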

We discuss next the details of the calculation of forces in the context of DFT. First, we note that there are two contributions, one from the ion–ion interaction energy, and one from the ion–electron interaction energy. For the contribution from the ion–ion interaction energy, the derivative of $U^{ion-ion}$, defined in Eq. (5.69), with respect to the position of a particular ion, $\mathbf{R}_I$, gives for the force on this ion

$$\mathbf{F}^{ion-ion}_I = -\frac{\partial}{\partial \mathbf{R}_I} U^{ion-ion} = \sum_{J \neq I} \frac{Z_I Z_J e^2}{|\mathbf{R}_I - \mathbf{R}_J|^3} (\mathbf{R}_I - \mathbf{R}_J) \qquad (5.91)$$

which can be evaluated by methods analogous to those used for the Madelung energy, discussed in Appendix F. In order to calculate the contribution from the ion–electron interaction, we first prove a very useful general theorem, known as the Hellmann–Feynman theorem. This theorem states that the force on an atom $I$ at position $\mathbf{R}_I$ is given by

$$\mathbf{F}^{ion-el}_I = -\langle \Psi_0 | \frac{\partial H}{\partial \mathbf{R}_I} | \Psi_0 \rangle \qquad (5.92)$$

where $H$ is the hamiltonian of the system and $|\Psi_0\rangle$ is the normalized ground state wavefunction (we assume that the hamiltonian does not include ion–ion interactions, which have been taken into account explicitly with the $\mathbf{F}^{ion-ion}$ term). To prove this theorem, we begin with the standard definition of the force on ion $I$, as the derivative of the total energy $\langle \Psi_0 | H | \Psi_0 \rangle$ with respect to $\mathbf{R}_I$:

$$\mathbf{F}^{ion-el}_I = -\frac{\partial \langle \Psi_0 | H | \Psi_0 \rangle}{\partial \mathbf{R}_I} = -\langle \Psi_0 | \frac{\partial H}{\partial \mathbf{R}_I} | \Psi_0 \rangle - \langle \frac{\partial \Psi_0}{\partial \mathbf{R}_I} | H | \Psi_0 \rangle - \langle \Psi_0 | H | \frac{\partial \Psi_0}{\partial \mathbf{R}_I} \rangle$$

Now using $H | \Psi_0 \rangle = E_0 | \Psi_0 \rangle$ and $\langle \Psi_0 | H = E_0 \langle \Psi_0 |$, where $E_0$ is the energy corresponding to the eigenstate $| \Psi_0 \rangle$, we can rewrite the second and third terms in the expression for the ion–electron contribution to the force as

$$\langle \frac{\partial \Psi_0}{\partial \mathbf{R}_I} | \Psi_0 \rangle E_0 + E_0 \langle \Psi_0 | \frac{\partial \Psi_0}{\partial \mathbf{R}_I} \rangle = E_0 \frac{\partial \langle \Psi_0 | \Psi_0 \rangle}{\partial \mathbf{R}_I} = 0 \qquad (5.93)$$

where the last result is due to the normalization of the wavefunction, $\langle \Psi_0 | \Psi_0 \rangle = 1$. This then proves the Hellmann–Feynman theorem, as stated in Eq. (5.92). The importance of the Hellmann–Feynman theorem is the following: the only terms needed to calculate the contribution of ion–electron interactions to the force are those terms in the hamiltonian that depend explicitly on the atomic positions $\mathbf{R}$.
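The Hellmann–Feynman theorem of Eq. (5.92) can be verified numerically on a small model. The sketch below assumes a hypothetical two-level hamiltonian that depends on a single coordinate `r` (standing in for an ionic position) and compares the expectation value of $\partial H / \partial r$ in the ground state with a finite-difference derivative of the ground state energy.

```python
import numpy as np

def hamiltonian(r):
    """Hypothetical symmetric 2x2 hamiltonian depending on one coordinate r."""
    return np.array([[r ** 2, 0.5 * r],
                     [0.5 * r, 1.0 - r]])

def hamiltonian_derivative(r):
    """Explicit derivative of the hamiltonian above with respect to r."""
    return np.array([[2.0 * r, 0.5],
                     [0.5, -1.0]])

def ground_state(r):
    """Lowest eigenvalue and its normalized eigenvector."""
    energies, vectors = np.linalg.eigh(hamiltonian(r))
    return energies[0], vectors[:, 0]

def hellmann_feynman_force(r):
    """Force from Eq. (5.92): minus the expectation value of dH/dr."""
    _, psi0 = ground_state(r)
    return -psi0 @ hamiltonian_derivative(r) @ psi0

def finite_difference_force(r, step=1e-5):
    """Force from a central difference of the ground state energy."""
    return -(ground_state(r + step)[0] - ground_state(r - step)[0]) / (2.0 * step)

R_TEST = 0.7
force_hf = hellmann_feynman_force(R_TEST)
force_fd = finite_difference_force(R_TEST)
```

The two forces agree to within the finite-difference error, even though the eigenvector itself changes with `r`; only the explicit dependence of the hamiltonian on `r` enters the Hellmann–Feynman expression.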

From our analysis of the various terms that contribute to the total energy, it is clear that there is only one such term in the hamiltonian, namely $U^{ion-el}$, defined in Eq. (5.72). The expectation value of this term in the ground state wavefunction, expressed in terms of the single-particle wavefunctions $|\psi_k\rangle$, is given by

$$\sum_{k, I} \langle \psi_k | V^{ps}_I(\mathbf{r} - \mathbf{R}_I) | \psi_k \rangle \qquad (5.94)$$

Taking the derivative of this term with respect to $\mathbf{R}_I$, we obtain the force contribution $\mathbf{F}^{ion-el}_I$:

$$\mathbf{F}^{ion-el}_I = -\frac{\partial}{\partial \mathbf{R}_I} \sum_{k, I} \langle \psi_k | V^{ps}_I(\mathbf{r} - \mathbf{R}_I) | \psi_k \rangle \qquad (5.95)$$

We will derive the ion–electron contribution to the force explicitly for the case where there is only one atom per unit cell, and the origin of the coordinate system is chosen so that the atomic positions coincide with the lattice vectors. Following the derivation of the expression for the total energy, we use again the Fourier transforms of the potentials and single-particle wavefunctions

$$\psi_k(\mathbf{r}) = \sum_{\mathbf{G}} e^{i(\mathbf{k} + \mathbf{G}) \cdot \mathbf{r}} \alpha_k(\mathbf{G}), \qquad V^{ps}(\mathbf{r}) = \sum_{\mathbf{G}} e^{i\mathbf{G} \cdot \mathbf{r}} V^{ps}(\mathbf{G})$$

where we have dropped the index $I$ from the pseudopotential since there is only one type of ion in our example. With these definitions, we obtain for the ion–electron contribution to the force:

$$\mathbf{F}^{ion-el}_I = -\frac{\partial}{\partial \mathbf{R}_I} \Omega_{at} \sum_{k, \mathbf{G}, \mathbf{G}'} \alpha^*_k(\mathbf{G}) \alpha_k(\mathbf{G}') V^{ps}(\mathbf{G} - \mathbf{G}') S(\mathbf{G} - \mathbf{G}') = -i \Omega_{at} \sum_{\mathbf{G}} \mathbf{G} e^{i\mathbf{G} \cdot \mathbf{R}_I} V^{ps}(\mathbf{G}) \rho(\mathbf{G}) \qquad (5.96)$$

where $S(\mathbf{G}) = (1/N) \sum_I e^{i\mathbf{G} \cdot \mathbf{R}_I}$ is the structure factor, $\Omega_{at}$ is the atomic volume $\Omega / N$ ($N$ being the total number of atoms in the solid), and $\rho(\mathbf{G})$ is defined as

$$\rho(\mathbf{G}) = \sum_{k, \mathbf{G}'} \alpha^*_k(\mathbf{G}') \alpha_k(\mathbf{G}' - \mathbf{G}) \qquad (5.97)$$

We see from Eq. (5.96) that when the single-particle wavefunctions have been calculated (i.e. the coefficients $\alpha_k(\mathbf{G})$ have been determined, when working with the basis of plane waves $\mathbf{G}$), we have all the necessary information to calculate the forces on the ions.

This last statement can be exploited to devise a scheme for simulating the dynamics of ions in a computationally efficient manner. This scheme was first proposed by Car and Parrinello [73], and consists of evolving simultaneously the electronic and ionic degrees of freedom. The big advantage of this method is that the single-particle electronic wavefunctions, from which the forces on the ions can be calculated, do not need to be obtained from scratch through a self-consistent calculation when the ions have been moved. This is achieved by coupling the electron and ion dynamics through an effective lagrangian, defined as

$$L = \sum_i \frac{1}{2} \mu \langle \frac{d\psi_i}{dt} | \frac{d\psi_i}{dt} \rangle + \sum_I \frac{1}{2} M_I \left( \frac{d\mathbf{R}_I}{dt} \right)^2 - E[\{\psi_i\}, \{\mathbf{R}_I\}] + \sum_{ij} \Lambda_{ij} \left( \langle \psi_i | \psi_j \rangle - \delta_{ij} \right) \qquad (5.98)$$

Here the first two terms represent the kinetic energy of electrons and ions ($M_I$ and $\mathbf{R}_I$ are the mass and position of ion $I$), the third term is the total energy of the electron–ion system which plays the role of the potential energy in the lagrangian, and the last term contains the Lagrange multipliers $\Lambda_{ij}$ which ensure the orthogonality of the wavefunctions $|\psi_i\rangle$. It is important to note that the kinetic energy of the electron wavefunctions in the Car–Parrinello lagrangian is a fictitious term, introduced for the sole purpose of coupling the dynamics of ionic and electronic degrees of freedom; it has nothing to do with the true quantum-mechanical kinetic energy of electrons, which is of course part of the total energy $E[\{\psi_i\}, \{\mathbf{R}_I\}]$. Accordingly, the "mass" $\mu$ associated with the kinetic energy of the electron wavefunctions is a fictitious mass, and is essentially a free parameter that determines the coupling between electron and ion dynamics. When the velocities associated with both the electronic and ionic degrees of freedom are reduced to zero, the system is in its equilibrium state with $E[\{\psi_i\}, \{\mathbf{R}_I\}]$ a minimum at zero temperature. This method can be applied to the study of solids in two ways.

(a) It can be applied under conditions where the velocities of electrons and ions are steadily reduced so that the true ground state of the system is reached, in the sense that both ionic and electronic degrees of freedom have been optimized; in this case the temperature is reduced to zero.
(b) It can be applied at finite temperature to study the time evolution of the system, which is allowed to explore the phase space consistent with the externally imposed temperature and pressure conditions.

The value of the electron effective mass $\mu$, together with the value of the time step $dt$ used to update the electronic wavefunctions and the ionic positions, are adjusted to make the evolution of the system stable as well as computationally efficient.
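The idea behind case (a), in which fictitious velocities are steadily reduced until the electronic degrees of freedom settle into the ground state, can be illustrated with a toy model. This is not the Car–Parrinello algorithm itself: the sketch assumes fixed ions, a single "wavefunction" represented by a real two-component vector, a hypothetical 2×2 hamiltonian, a simple friction term, and normalization re-imposed by rescaling after every step. The names `mu`, `dt` and `friction` are illustrative parameters.

```python
import numpy as np

def damped_fictitious_dynamics(h, c_start, mu=1.0, dt=0.2, friction=0.2,
                               n_steps=2000):
    """Relax coefficients c toward the ground state of h by damped dynamics.

    Energy functional: E(c) = c.h.c with the constraint c.c = 1.
    """
    c = np.asarray(c_start, dtype=float)
    c = c / np.linalg.norm(c)
    c_prev = c.copy()                      # zero initial fictitious velocity
    for _ in range(n_steps):
        hc = h @ c
        multiplier = c @ hc                # Lagrange multiplier for c.c = 1
        force = -2.0 * (hc - multiplier * c)
        # Verlet-type update with friction acting on the fictitious velocity
        c_next = c + (1.0 - friction) * (c - c_prev) + (dt ** 2 / mu) * force
        c_next /= np.linalg.norm(c_next)   # re-impose normalization exactly
        c_prev, c = c, c_next
    return c @ h @ c, c

# Hypothetical model hamiltonian and starting guess
H_MODEL = np.array([[0.0, -1.0],
                    [-1.0, 1.0]])
energy, coefficients = damped_fictitious_dynamics(H_MODEL, [1.0, 0.0])
exact_ground_energy = np.linalg.eigvalsh(H_MODEL)[0]
```

Increasing `dt` or decreasing `mu` speeds up the relaxation until the update becomes unstable, which mirrors the trade-off between stability and efficiency discussed in the text.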

Specifically, the time step $dt$ should be as large as possible in order either to reach the true ground state of the system in the fewest possible steps, or to simulate the evolution of the system for the largest possible time interval at finite temperature, with the total number of steps in the simulation determined by the available computational resources. However, a large time step might throw the system out of balance, in the sense that the time-evolved wavefunctions may not correspond to the ground state of the ionic system. For small enough values of $dt$, the balance between the time-evolved wavefunctions and positions of the ions is maintained; this is referred to as the system remaining on the Born–Oppenheimer surface.

Table 5.4. Basic structural and electronic properties of common metals. $a_0$ = lattice constant in angstroms (and $c/a$ ratio for HCP crystals), $B$ = bulk modulus in gigapascals, $E_{coh}$ = cohesive energy per atom in electronvolts, $g(\epsilon_F)$ = density of states /eV-atom at the Fermi energy $\epsilon_F$ (values marked by ∗ are for the FCC structure). All values are from experiment, except for $g(\epsilon_F)$ (source: Moruzzi, Janak and Williams [75]). † This is not the lowest energy crystal structure, but one for which theoretical data are available. Source: C. Kittel [64].
Solid: Li, Na, K, Rb, Be, Mg, Ca, Sr, Sc, Y, Ti, Zr, V, Nb, Cr, Mo, Fe, Ru, Co, Rh, Ni, Pd, Cu, Ag, Zn, Cd, Al, Ga, In
Crystal: BCC, BCC, BCC, BCC, HCP, HCP, FCC, FCC, HCP, HCP, HCP, HCP, BCC, BCC, BCC, BCC, BCC, HCP, HCP, FCC, FCC, FCC, FCC, FCC, HCP, HCP, FCC, FCC†, FCC†

In this way, a fully self-consistent solution of the electronic problem for a given ionic configuration, which is the most computationally demanding part of the calculation, is avoided except at the initial step. The ensuing dynamics of the ions are always governed by accurate quantum mechanical forces, obtained by the general method outlined above. This produces very realistic simulations of the dynamic evolution of complicated physical systems.

The total-energy and force calculations which are able to determine the lowest energy configuration of a system, taken together with the electronic structure information inherent in these calculations, provide a powerful method for a deeper understanding of the properties of solids.

Table 5.5. Basic structural and electronic properties of common semiconductors. The crystal structures are denoted as DIA = diamond, ZBL = zincblende, WRZ = wurtzite; $a_0$ = lattice constant in angstroms (and $c/a$ ratio for the wurtzite structures); $B$ = bulk modulus in gigapascals; $E_{coh}$ = cohesive energy per atom in electronvolts; $\epsilon_{gap}$ = minimal band gap in electronvolts and position of the CBM (the VBM is always at $\Gamma$, the center of the Brillouin Zone); $\varepsilon$ = static dielectric function ($\varepsilon_2$ at $\omega = 0$), measured at room temperature. All values are from experiment. Sources: [64, 74, 76].
Solid: C, Si, Ge, SiC, BN, BP, AlN, AlP, AlAs, AlSb, GaN, GaP, GaAs, GaSb, InN, InP, InAs, InSb, ZnS, ZnSe, ZnTe, CdS, CdSe, CdSe, CdTe
Crystal: DIA, DIA, DIA, ZBL, ZBL, ZBL, WRZ, ZBL, ZBL, ZBL, ZBL, ZBL, ZBL, ZBL, WRZ, ZBL, ZBL, ZBL, ZBL, ZBL, ZBL, ZBL, ZBL, WRZ, ZBL

information inherent in these calculations, provide a powerful method for a deeper understanding of the properties of solids. We have already mentioned above some examples where such calculations can prove very useful. The computational approach we have discussed, based on DFT/LDA and its extensions using gradient corrections to the exchange-correlation density functional, in conjunction with atomic pseudopotentials to eliminate core electrons, has certain important advantages. Probably its biggest advantage is that it does not rely on any empirical parameters: the only input to the calculations is the atomic masses of constituent atoms and the number of valence electrons, that is, the atomic number of each element. These calculations are not perfect: for example, calculated lattice constants differ from experimental ones by a few percent (typically 1–2%), while calculated phonon frequencies, bulk moduli and elastic constants may differ from experimental values by 5–10%. Moreover, in the simplest version of the computations, the calculated band gap of semiconductors and insulators can be off from the experimental value by as much as 50%, but this can be corrected, either within the same theoretical framework or by more elaborate calculations. Nevertheless, the ability of these calculations to address a wide range of properties for systems composed of almost any element in the periodic table, without involving any empirical parameters, makes them a truly powerful tool for the study of the properties of solids.

For reference, we collect in Tables 5.4 and 5.5 the basic structural and electronic properties of a variety of common metals and semiconductors, as obtained either from experiment or theory.

Further reading

1. Bonds and Bands in Semiconductors, J.C. Phillips (Academic Press, New York, 1973). This is a classic work, containing a wealth of physical insight into the origin and nature of energy bands in semiconductors.
2. Electromagnetic Transport in Solids: Optical Properties and Plasma Effects, H. Ehrenreich, in Optical Properties of Solids, ed. J. Tauc (Course 34 of E. Fermi International Summer School, Varenna, Italy, 1966). This article presents a fundamental theory for solids and includes several examples, some of which were discussed in the text.
3. Ultraviolet Optical Properties, H.R. Philipp and H. Ehrenreich, in Semiconductors and Semimetals, vol. 3, eds. R.K. Willardson and A.C. Beer (Academic Press, 1967), p. 93. This is a general review article, including the theoretical background within the one-electron framework, but with emphasis on the experimental side.
4. Optical Properties of Solids, F. Wooten (Academic Press, 1972). This book contains a comprehensive treatment of optical properties of solids.
5. R.O. Jones and O. Gunnarsson, Rev. Mod. Phys. 61, 689 (1989). This article is an excellent early review of density functional theory methods with emphasis on applications to molecules.
6. D.C. Allan, T.A. Arias, J.D. Joannopoulos, M.C. Payne and M.P. Teter, Rev. Mod. Phys. 64, 1045 (1992).

This article is a concise review of total-energy methods based on density functional theory and their applications to the physics of solids.
7. H. Ehrenreich, Science 235, 1029 (1987). This article provides a penetrating perspective on the importance of electronic structure calculations for materials science.

Problems

1. Prove the last equality in Eq. (5.4).
2. Derive the density of states for the free-electron model in one and two dimensions.
3. Derive the type, multiplicity and behavior of the DOS at critical points in one and two dimensions.
4. Calculate the DOS for the 2D square lattice with one atom per unit cell, with $s$, $p_x$, $p_y$, $p_z$ orbitals, using the TBA (the example problem solved explicitly in chapter 4). For this calculation it is not sufficient to obtain the bands along the high-symmetry points only, but the bands for points in the entire IBZ are needed. Make sure you include enough points to be able to see the characteristic behavior at the critical points.
5. (a) Prove that if a uniform electron gas is displaced by a small amount in one direction relative to the uniform background of compensating positive charge and then is allowed to relax, it will oscillate with the plasma frequency, Eq. (5.50).
   (b) Prove the expressions for the real and imaginary parts of the dielectric function as a function of frequency due to intraband transitions, given in Eqs. (5.51) and (5.52), starting with the expression of Eq. (5.49).
   (c) Prove the expressions for the real and imaginary parts of the dielectric function as a function of frequency due to interband transitions, given in Eqs. (5.54) and (5.55), starting with the expression of Eq. (5.53).
6. Compare the expression for the dielectric function due to intraband transitions, Eq. (5.49), to the Drude result, Eq. (5.39), and identify the constants in the latter in terms of fundamental constants and the microscopic properties of the solid. Provide a physical interpretation for the constants in the Drude expression.
7. Compare the expression for the dielectric function due to interband transitions, Eq. (5.53), to the Lorentz result, Eq. (5.40), and identify the constants in the latter in terms of fundamental constants and the microscopic properties of the solid. Provide a physical interpretation for the constants in the Lorentz expression. Apply this analysis to the case of Si, whose band structure is shown in Fig. 4.8.
8. Use the band structure of Si, Fig. 4.8, to interpret the main features of the dielectric function in Fig. 5.4, that is, the major peak of $\varepsilon_2(\omega)$ and the node of $\varepsilon_1(\omega)$, both appearing near $\hbar\omega \sim 4$ eV.
9. Prove the Bloch statement for the many-body wavefunction, Eq. (5.57), and relate the value of the wave-vector $\mathbf{K}$ to the single-particle wave-vectors.
10. (a) Prove that the transformation from the band states to the Wannier states, Eq. (5.59), is a unitary one, that is, the Wannier functions are orthonormal:

$$\psi^{(n)}_{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{R}} \phi^{(n)}_{\mathbf{R}}(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{R}} \;\Longrightarrow\; \langle \phi^{(m)}_{\mathbf{R}'} | \phi^{(n)}_{\mathbf{R}} \rangle = \delta_{nm}\,\delta(\mathbf{R}-\mathbf{R}') \tag{5.99}$$

   Discuss the uniqueness of Wannier functions.

   (b) Estimate the degree of localization of a Wannier function in an FCC cubic lattice: in order to do this assume that the BZ is almost spherical and take the periodic part of the Bloch state $u^{(n)}_{\mathbf{k}}(\mathbf{r})$ to be nearly independent of $\mathbf{k}$.
   (c) Consider a perturbation $V(\mathbf{r}, t)$, not necessarily periodic, and use the Wannier functions as a basis. Derive the equation that the expansion coefficients obey, and comment on the physical situations where such an expansion is appropriate.
11. Prove the expressions for the various terms in the energy of a Frenkel exciton represented by a Slater determinant of Wannier functions, given in Eq. (5.65).
12. Derive the equations of motion for the electronic and ionic degrees of freedom from the Car–Parrinello lagrangian, Eq. (5.98).

6 Lattice vibrations

At a finite temperature the atoms that form a crystalline lattice vibrate about their equilibrium positions, with an amplitude that depends on the temperature. Because a crystalline solid has symmetries, these thermal vibrations can be analyzed in terms of collective modes of motion of the ions. These modes correspond to collective excitations, which can be excited and populated just like electronic states. These excitations are called phonons.[1] Unlike electrons, phonons are bosons: their total number is not fixed, nor is there a Pauli exclusion principle governing the occupation of any particular phonon state. This is easily rationalized, if we consider the real nature of phonons, that is, collective vibrations of the atoms in a crystalline solid which can be excited arbitrarily by heating (or hitting) the solid. In this chapter we discuss phonons and how they can be used to describe thermal properties of solids.

[1] The word phonon derives from the Greek noun φωνη, "phoni", for sound; its actual meaning is more restrictive, specifically a human-generated sound, i.e. voice.

6.1 Phonon modes

The nature and physics of phonons are typically described in the so-called harmonic approximation. Suppose that the positions of the ions in the crystalline solid at zero temperature are determined by the vectors

$$\mathbf{R}_{ni} = \mathbf{R}_n + \mathbf{t}_i \tag{6.1}$$

where the $\mathbf{R}_n$ are the Bravais lattice vectors and the $\mathbf{t}_i$ are the positions of ions in one PUC, with the convention that $|\mathbf{t}_i| < |\mathbf{R}_n|$ for all non-zero lattice vectors. Then the deviation of each ionic position at finite temperature from its zero-temperature position can be denoted as

$$\mathbf{S}_{ni} = \delta\mathbf{R}_{ni} \tag{6.2}$$

with $n$ running over all the PUCs of the crystal and $i$ running over all ions in the PUC. In terms of these vectors the kinetic energy $K$ of the ions will be

$$K = \sum_{n,i} \frac{1}{2} M_i \left(\frac{d\mathbf{S}_{ni}}{dt}\right)^2 = \sum_{n,i,\alpha} \frac{1}{2} M_i \left(\frac{dS_{ni\alpha}}{dt}\right)^2 \tag{6.3}$$

where $M_i$ is the mass of ion $i$ and $\alpha$ labels the cartesian coordinates of the vectors $\mathbf{S}_{ni}$ ($\alpha = x, y, z$ in 3D). The potential energy of the system can be written as a Taylor series expansion in powers of $\mathbf{S}_{ni}$. The harmonic approximation consists of keeping only the second order terms in this Taylor series, with the zeroth order term being an arbitrary constant (set to zero for convenience) and the first order terms taken to be zero since the system is expected to be in an equilibrium configuration at zero temperature, which represents a minimum in the total energy. Higher order terms in the expansion are considered negligible. In this approximation, the potential energy $V$ is then given by

$$V = \frac{1}{2} \sum_{n,i,\alpha}\sum_{m,j,\beta} S_{ni\alpha}\, \frac{\partial^2 E}{\partial R_{ni\alpha}\,\partial R_{mj\beta}}\, S_{mj\beta} \tag{6.4}$$

where the total energy $E$ depends on all the atomic coordinates $\mathbf{R}_{ni}$. We define the so-called force-constant matrix by

$$F_{ni\alpha,mj\beta} = \frac{\partial^2 E}{\partial R_{ni\alpha}\,\partial R_{mj\beta}} \tag{6.5}$$

in terms of which the potential energy becomes

$$V = \frac{1}{2} \sum_{n,i,\alpha}\sum_{m,j,\beta} F_{ni\alpha,mj\beta}\, S_{ni\alpha} S_{mj\beta} \tag{6.6}$$

The size of the force-constant matrix is $d \times \nu \times N$, where $d$ is the dimensionality of space (the number of values for $\alpha$), $\nu$ is the number of ions in the PUC, and $N$ is the number of PUCs in the crystal. We notice that the following relations hold:

$$\frac{\partial^2 V}{\partial S_{ni\alpha}\,\partial S_{mj\beta}} = F_{ni\alpha,mj\beta} = \frac{\partial^2 E}{\partial R_{ni\alpha}\,\partial R_{mj\beta}} \tag{6.7}$$

$$\frac{\partial V}{\partial S_{ni\alpha}} = \sum_{m,j,\beta} F_{ni\alpha,mj\beta}\, S_{mj\beta} = \frac{\partial E}{\partial R_{ni\alpha}} \tag{6.8}$$

where the first equation is a direct consequence of the definition of the force-constant matrix and the harmonic approximation for the energy, while the second equation is a consequence of Newton's third law, since the left-hand side represents the negative of the $\alpha$ component of the total force on ion $i$ in the unit cell labeled by $\mathbf{R}_n$.
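The relations between the total energy, the force-constant matrix and the harmonic potential energy, Eqs. (6.5), (6.6) and (6.8), can be checked numerically on a small toy system. The sketch below is only an illustration under stated assumptions: the ring of $N$ ions, the bond energy with a harmonic constant `kappa` and a quartic constant `g`, and all function names are hypothetical choices made here, not part of the text. In this 1D example with one ion per cell, $d = \nu = 1$, so the force-constant matrix is $N \times N$.

```python
import numpy as np

# Hypothetical toy system: N ions on a 1D ring with spacing a, joined by
# nearest-neighbor bonds with a harmonic (kappa) and a quartic (g) term.
N, a, kappa, g = 6, 1.0, 2.0, 5.0

def total_energy(R):
    """Total energy E as a function of all ionic coordinates R."""
    stretch = np.roll(R, -1) - R - a   # change in length of bond n -> n+1
    stretch[-1] += N * a               # the last bond wraps around the ring
    return np.sum(0.5 * kappa * stretch**2 + 0.25 * g * stretch**4)

def force_constants(R0, h=1e-4):
    """Force-constant matrix, Eq. (6.5), by central finite differences."""
    F = np.zeros((N, N))
    for p in range(N):
        for q in range(N):
            def E(dp, dq):
                R = R0.copy()
                R[p] += dp
                R[q] += dq
                return total_energy(R)
            F[p, q] = (E(h, h) - E(h, -h) - E(-h, h) + E(-h, -h)) / (4 * h * h)
    return F

R0 = a * np.arange(N, dtype=float)     # equilibrium (zero-temperature) positions
F = force_constants(R0)

rng = np.random.default_rng(0)
S = 1e-3 * rng.standard_normal(N)      # small displacements S_n = delta R_n

V_harmonic = 0.5 * S @ F @ S                          # Eq. (6.6)
dE_exact = total_energy(R0 + S) - total_energy(R0)    # full energy change
gradient = F @ S                                      # Eq. (6.8): minus the force

print(np.round(F, 3))
print(V_harmonic, dE_exact)
```

For displacements this small the harmonic energy and the full energy change agree closely, because the quartic term enters only at fourth order in the displacements. Each row of `F` sums to zero, which reflects the fact that a rigid translation of all the ions costs no energy.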

The motion of the ions will be governed by the following equations:

$$M_i \frac{d^2 S_{ni\alpha}}{dt^2} = -\frac{\partial E}{\partial R_{ni\alpha}} = -\sum_{m,j,\beta} F_{ni\alpha,mj\beta}\, S_{mj\beta} \tag{6.9}$$

where we have used Eq. (6.8). We can try to solve the equations of motion by assuming sinusoidal expressions for the time dependence of their displacements:

$$S_{ni\alpha}(t) = \frac{1}{\sqrt{M_i}}\, \tilde{u}_{ni\alpha}\, e^{-i\omega t} \tag{6.10}$$

where $\omega$ is the frequency of oscillation and we have explicitly introduced the mass of the ions in the definition of the new variables $\tilde{u}_{ni\alpha}$. This gives, when substituted into the equations of motion,

$$\omega^2 \tilde{u}_{ni\alpha} = \sum_{m,j,\beta} F_{ni\alpha,mj\beta}\, \frac{1}{\sqrt{M_i M_j}}\, \tilde{u}_{mj\beta} \tag{6.11}$$

We define a new matrix, which we will call the dynamical matrix, through

$$\tilde{D}_{ni\alpha,mj\beta} = \frac{1}{\sqrt{M_i M_j}}\, F_{ni\alpha,mj\beta} \tag{6.12}$$

In terms of this matrix, the equations of motion can be written as

$$\sum_{m,j,\beta} \tilde{D}_{ni\alpha,mj\beta}\, \tilde{u}_{mj\beta} = \omega^2 \tilde{u}_{ni\alpha} \;\Longrightarrow\; \tilde{\mathbf{D}} \cdot \tilde{\mathbf{u}} = \omega^2 \tilde{\mathbf{u}} \tag{6.13}$$

where we have used bold symbols for the dynamical matrix and the vector of ionic displacements in the last expression. This is an eigenvalue equation, the solution of which gives the values of the frequency and the vectors that describe the corresponding ionic displacements. The size of the dynamical matrix is the same as the size of the force-constant matrix, i.e. $d \times \nu \times N$. Obviously, it is impossible to diagonalize such a matrix for a crystal in order to find the eigenvalues and eigenfunctions when $N \to \infty$. We need to reduce this eigenvalue equation to a manageable size, so that it can be solved. To this end, we note that from the definition of the dynamical matrix

$$\tilde{D}_{ni\alpha,mj\beta} = \frac{1}{\sqrt{M_i M_j}}\, F_{ni\alpha,mj\beta} = \frac{1}{\sqrt{M_i M_j}}\, \frac{\partial^2 E}{\partial R_{ni\alpha}\,\partial R_{mj\beta}} \tag{6.14}$$

if both positions $R_{ni\alpha}$, $R_{mj\beta}$ were to be shifted by the same lattice vector $\mathbf{R}'$, the result of differentiation of the energy with ionic positions must be the same because of the translational invariance of the hamiltonian. This leads to the conclusion that the dynamical matrix can only depend on the distance $\mathbf{R}_n - \mathbf{R}_m$ and not on the

specific values of $n$ and $m$, that is,

$$\tilde{D}_{ni\alpha,mj\beta} = \tilde{D}_{i\alpha,j\beta}(\mathbf{R}_n - \mathbf{R}_m) \tag{6.15}$$

Accordingly, we can define the ionic displacements as follows:

$$\tilde{u}_{ni\alpha} = u_{i\alpha}\, e^{i\mathbf{k}\cdot\mathbf{R}_n} \tag{6.16}$$

which gives for the eigenvalue equation

$$\sum_{j,\beta}\sum_{m} \tilde{D}_{i\alpha,j\beta}(\mathbf{R}_n - \mathbf{R}_m)\, e^{-i\mathbf{k}\cdot(\mathbf{R}_n - \mathbf{R}_m)}\, u_{j\beta} = \omega^2 u_{i\alpha} \tag{6.17}$$

and with the definition

$$D_{i\alpha,j\beta}(\mathbf{k}) = \sum_{\mathbf{R}} \tilde{D}_{i\alpha,j\beta}(\mathbf{R})\, e^{-i\mathbf{k}\cdot\mathbf{R}} = \sum_{n} e^{-i\mathbf{k}\cdot\mathbf{R}_n}\, \frac{1}{\sqrt{M_i M_j}}\, \frac{\partial^2 V}{\partial S_{ni\alpha}\,\partial S_{0j\beta}} \tag{6.18}$$

where the last expression is obtained with the use of Eq. (6.7), the eigenvalue equation takes the form:

$$\sum_{j,\beta} D_{i\alpha,j\beta}(\mathbf{k})\, u_{j\beta} = \omega^2 u_{i\alpha} \;\Longrightarrow\; \mathbf{D}(\mathbf{k}) \cdot \mathbf{u} = \omega^2 \mathbf{u} \tag{6.19}$$

Since $D_{i\alpha,j\beta}(\mathbf{k})$ has a dependence on the wave-vector $\mathbf{k}$, so will the eigenvalues $\omega$ and eigenvectors $u_{i\alpha}$. The size of the new matrix is $d \times \nu$, which is a manageable size for crystals that typically contain few atoms per PUC. In transforming the problem to a manageable size, we realize that we now have to solve this eigenvalue problem for all the allowed values of $\mathbf{k}$, which of course are all the values in the first BZ. There are $N$ distinct values of $\mathbf{k}$ in the first BZ, where $N$ is the number of PUCs in the crystal, that is, no information has been lost in the transformation. Arguments similar to those applied to the solution of the single-particle hamiltonian for the electronic states can also be used here, to reduce the reciprocal-space volume where we need to obtain a solution down to the IBZ only. The solutions to the eigenvalue equation will need to be labeled by two indices, $\mathbf{k}$ for each value of the wave-vector in the BZ, and $l$, which takes $d \times \nu$ values, for all the different ions in the PUC and the cartesian coordinates. The solution for the displacement of ion $j$ in the PUC at lattice vector $\mathbf{R}_n$ will then be given by

$$\mathbf{S}^{(l)}_{nj}(\mathbf{k}, t) = \frac{1}{\sqrt{M_j}}\, \hat{\mathbf{e}}^{(l)}_{\mathbf{k}j}\, e^{i(\mathbf{k}\cdot\mathbf{R}_n - \omega^{(l)}_{\mathbf{k}} t)} \tag{6.20}$$

where $\hat{\mathbf{e}}^{(l)}_{\mathbf{k}j}$ is the set of $d$ components of the eigenvector that denote the displacement of ion $j$ in $d$ dimensions. The eigenvectors can be chosen to be orthonormal:

$$\sum_{j} \left(\hat{\mathbf{e}}^{(l)}_{\mathbf{k}j}\right)^{*} \cdot \hat{\mathbf{e}}^{(l')}_{\mathbf{k}j} = \delta_{ll'} \tag{6.21}$$

In terms of these displacements, the most general ionic motion of the crystal can

be expressed as

$$\mathbf{S}_{nj}(t) = \sum_{l,\mathbf{k}} c^{(l)}_{\mathbf{k}}\, \frac{1}{\sqrt{M_j}}\, \hat{\mathbf{e}}^{(l)}_{\mathbf{k}j}\, e^{i(\mathbf{k}\cdot\mathbf{R}_n - \omega^{(l)}_{\mathbf{k}} t)} \tag{6.22}$$

where the coefficients $c^{(l)}_{\mathbf{k}}$ correspond to the amplitude of oscillation of the mode with frequency $\omega^{(l)}_{\mathbf{k}}$. Finally, we note that the eigenvalues of the frequency $\omega^{(l)}_{\mathbf{k}}$ obey the symmetry $\omega^{(l)}_{\mathbf{k}+\mathbf{G}} = \omega^{(l)}_{\mathbf{k}}$, where $\mathbf{G}$ is any reciprocal lattice vector. This symmetry is a direct consequence of the property $D_{i\alpha,j\beta}(\mathbf{k}) = D_{i\alpha,j\beta}(\mathbf{k} + \mathbf{G})$, which is evident from the definition of the dynamical matrix, Eq. (6.18). The symmetry of the eigenvalue spectrum allows us to solve the eigenvalue equations for $\omega^{(l)}_{\mathbf{k}}$ in the first BZ only, just as we did for the electronic energies. In both cases the underlying symmetry of the crystal that leads to this simplification is the translational periodicity.

6.2 The force-constant model

We discuss next a simple model for calculating the dynamical matrix, which is the analog of the tight-binding approximation for electronic states. In the present case, we will use the bonds between the ions as the basic units, by analogy to the atomic-like orbitals for the electronic states. In its ideal, zero-temperature configuration, the crystal can be thought of as having bonds between each pair of nearest neighbor atoms. We assume that there are two types of distortions of the bonds relative to their ideal positions and orientations: there is bond stretching (elongation or contraction of the length of bonds) and bond bending (change of the orientation of bonds relative to their original position). For bond stretching, we will take the energy to be proportional to the square of the amount of stretching. For bond bending, we will take the energy to be proportional to the square of the angle that describes the change in bond orientation. These choices are consistent with the harmonic approximation.

To determine the contributions of bond stretching and bond bending to the potential energy, we will use the diagram shown in Fig. 6.1. In the following analysis we employ the lower case symbols $\mathbf{r}_i$, $\mathbf{s}_i$ for the ionic positions and their displacements and omit any reference to the lattice vectors for simplicity. We restore the full notation, including lattice vectors, to the final expressions when we apply them to a model periodic system. The bond stretching energy is given by

$$V_r = \frac{1}{2} \kappa_r \sum_{\langle ij \rangle} \left(\Delta|\mathbf{r}_{ij}|\right)^2 \tag{6.23}$$

where $\kappa_r$ is the force constant for bond stretching, $\langle ij \rangle$ stands for summation over all the distinct nearest neighbor pairs, and $\Delta|\mathbf{r}_{ij}|$ is the change in the length of the

bond between atoms $i$ and $j$. If we assume that all the bonds are equal to $b$ (this can be easily generalized for systems with several types of bonds), then the change in bond length between atoms $i$ and $j$ is given by $\Delta|\mathbf{r}_{ij}| = |\mathbf{r}'_{ij}| - b$, where $|\mathbf{r}'_{ij}|$ is the new distance between atoms $i$ and $j$. We will denote by $\mathbf{s}_i$ the departure of ion $i$ from its equilibrium ideal position, which gives for the new distance between displaced atoms $i$ and $j$:

$$\mathbf{r}'_{ij} = \mathbf{s}_j - \mathbf{s}_i + \mathbf{r}_{ij} \;\Longrightarrow\; |\mathbf{r}'_{ij}| = \left[\left(\mathbf{s}_j - \mathbf{s}_i + \mathbf{r}_{ij}\right)^2\right]^{1/2} \tag{6.24}$$

The displacements $\mathbf{s}_i$ will be taken to be always much smaller in magnitude than the bond distances $\mathbf{r}_{ij}$, consistent with the assumption of small deviations from equilibrium, in which case the harmonic approximation makes sense. Using this fact, we can expand in powers of the small quantity $|\mathbf{s}_i - \mathbf{s}_j|/b$, which gives to lowest order

$$|\mathbf{r}'_{ij}| = b\left[1 + \frac{\mathbf{r}_{ij}\cdot(\mathbf{s}_j - \mathbf{s}_i)}{b^2}\right] \;\Longrightarrow\; \Delta|\mathbf{r}_{ij}| = |\mathbf{r}'_{ij}| - b = \hat{\mathbf{r}}_{ij}\cdot(\mathbf{s}_j - \mathbf{s}_i) \tag{6.25}$$

which in turn gives for the bond stretching energy

$$V_r = \frac{1}{2} \kappa_r \sum_{\langle ij \rangle} \left[(\mathbf{s}_j - \mathbf{s}_i)\cdot\hat{\mathbf{r}}_{ij}\right]^2 \tag{6.26}$$

Figure 6.1. Schematic representation of bond stretching and bond bending in the force-constant model. The filled circles denote the positions of atoms in the ideal configuration and the open circles represent their displaced positions. $\mathbf{s}_i$, $\mathbf{s}_j$ are the atomic displacements, $\mathbf{r}_{ij}$ is the original bond length, $\mathbf{r}'_{ij}$ the distorted bond length, and $\Delta\theta_{ij}$ the change in orientation of the bond (the angular distortion).

By similar arguments we can obtain an expression in terms of the variables $\mathbf{s}_i$, $\mathbf{s}_j$ for the bond bending energy, defined to be

$$V_\theta = \frac{1}{2} \kappa_\theta\, b^2 \sum_{\langle ij \rangle} \left(\Delta\theta_{ij}\right)^2 \tag{6.27}$$

with $\kappa_\theta$ the bond bending force constant and $\Delta\theta_{ij}$ the change in the orientation of the bond between atoms $i$ and $j$. Using the same notation for the new positions

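of the ions (see Fig. 6.1), we obtain for the bond bending term

$$\mathbf{r}'_{ij}\cdot\mathbf{r}_{ij} = |\mathbf{r}'_{ij}|\, b \cos(\Delta\theta_{ij}) \approx \left[1 - \frac{1}{2}(\Delta\theta_{ij})^2\right] |\mathbf{r}'_{ij}|\, b \;\Longrightarrow\; \frac{\mathbf{r}'_{ij}\cdot\mathbf{r}_{ij}}{|\mathbf{r}'_{ij}|\, b} = 1 - \frac{1}{2}(\Delta\theta_{ij})^2 \tag{6.28}$$

For the left-hand side of this last equation, we find

$$\frac{\mathbf{r}'_{ij}\cdot\mathbf{r}_{ij}}{|\mathbf{r}'_{ij}|\, b} = \frac{1}{b^2}\left(\mathbf{s}_j - \mathbf{s}_i + \mathbf{r}_{ij}\right)\cdot\mathbf{r}_{ij} \left[1 + \frac{(\mathbf{s}_i - \mathbf{s}_j)^2}{b^2} - \frac{2\,\mathbf{r}_{ij}\cdot(\mathbf{s}_i - \mathbf{s}_j)}{b^2}\right]^{-1/2}$$

$$= \left[1 + \frac{\mathbf{r}_{ij}\cdot(\mathbf{s}_j - \mathbf{s}_i)}{b^2}\right] \left[1 + \frac{\mathbf{r}_{ij}\cdot(\mathbf{s}_i - \mathbf{s}_j)}{b^2} - \frac{1}{2}\frac{(\mathbf{s}_i - \mathbf{s}_j)^2}{b^2} + \frac{3}{2}\frac{|\mathbf{r}_{ij}\cdot(\mathbf{s}_i - \mathbf{s}_j)|^2}{b^4}\right]$$

$$= 1 - \frac{|\mathbf{s}_j - \mathbf{s}_i|^2}{2b^2} + \frac{\left[(\mathbf{s}_j - \mathbf{s}_i)\cdot\mathbf{r}_{ij}\right]^2}{2b^4}$$

where we have used the Taylor expansion of $(1 + x)^{-1/2}$ (see Appendix G) and kept only terms up to second order in the small quantities $|\mathbf{s}_i - \mathbf{s}_j|/b$. This result, when compared to the right-hand side of the previous equation, gives

$$(\Delta\theta_{ij})^2\, b^2 = |\mathbf{s}_j - \mathbf{s}_i|^2 - \left[(\mathbf{s}_j - \mathbf{s}_i)\cdot\hat{\mathbf{r}}_{ij}\right]^2 \tag{6.29}$$

which leads to the following expression for the bond bending energy:

$$V_\theta = \frac{1}{2} \kappa_\theta \sum_{\langle ij \rangle} \left\{ |\mathbf{s}_j - \mathbf{s}_i|^2 - \left[(\mathbf{s}_j - \mathbf{s}_i)\cdot\hat{\mathbf{r}}_{ij}\right]^2 \right\} \tag{6.30}$$

Combining the two contributions, Eqs. (6.26) and (6.30), we obtain for the total potential energy

$$V = \frac{1}{2} \sum_{\langle ij \rangle} \left\{ (\kappa_r - \kappa_\theta)\left[(\mathbf{s}_j - \mathbf{s}_i)\cdot\hat{\mathbf{r}}_{ij}\right]^2 + \kappa_\theta\, |\mathbf{s}_j - \mathbf{s}_i|^2 \right\} \tag{6.31}$$

where the sum runs over all pairs of atoms in the crystal that are bonded, that is, over all pairs of nearest neighbors.

6.2.1 Example: phonons in 2D periodic chain

We next apply the force-constant model to a simple example that includes all the essential features needed to demonstrate the behavior of phonons. Our example is based on a system that is periodic in one dimension (with lattice vector $\mathbf{a}_1 = (a/\sqrt{2})[\hat{\mathbf{x}} + \hat{\mathbf{y}}]$), but exists in a two-dimensional space, and has two atoms per unit cell at positions $\mathbf{t}_1 = (a/2\sqrt{2})\hat{\mathbf{x}}$, $\mathbf{t}_2 = -(a/2\sqrt{2})\hat{\mathbf{x}}$. This type of atomic arrangement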

will generate phonons of different nature, as we will see below. Since there are two atoms per unit cell ($\nu = 2$), and the system exists in two dimensions ($d = 2$), the size of the dynamical matrix will be $2 \times 2 = 4$. The physical system is shown schematically in Fig. 6.2.

Figure 6.2. Definition of the model for the phonon calculation: it consists of a chain of atoms in 2D space.

To identify the atoms, we use two indices as in the general formulation developed above, the lattice vector index ($n$), and the index identifying the position of the atom inside the unit cell ($i$). By convention, we take the unit cell of interest to correspond to the zero lattice vector. We denote the potential energy corresponding to a pair of atoms by $V_{ni,mj}$. We will take into account nearest neighbor interactions only, in the spirit of using local force constants. Consequently, for each unit cell we only need to take into account the interaction between atoms within the unit cell and the interaction of these atoms with their nearest neighbors in the adjacent unit cells. In our example this means that we need only the following three terms in the expression for the potential energy:

$$V_{01,02} = \frac{1}{2}(S_{01x} - S_{02x})^2(\kappa_r - \kappa_\theta) + \frac{1}{2}\kappa_\theta\left[(S_{01x} - S_{02x})^2 + (S_{01y} - S_{02y})^2\right]$$

$$V_{01,12} = \frac{1}{2}(S_{01y} - S_{12y})^2(\kappa_r - \kappa_\theta) + \frac{1}{2}\kappa_\theta\left[(S_{01x} - S_{12x})^2 + (S_{01y} - S_{12y})^2\right]$$

$$V_{\bar{1}1,02} = \frac{1}{2}(S_{\bar{1}1y} - S_{02y})^2(\kappa_r - \kappa_\theta) + \frac{1}{2}\kappa_\theta\left[(S_{\bar{1}1x} - S_{02x})^2 + (S_{\bar{1}1y} - S_{02y})^2\right]$$

where we have reverted to the original notation for the displacement of ions $S_{ni\alpha}$, with $n$ labeling the unit cell (in our case $n = 0, \pm 1$, with $\bar{1} = -1$), $i$ labeling the atoms in the unit cell (in our case $i = 1, 2$) and $\alpha$ labeling the spatial coordinates (in our case $\alpha = x, y$). Using the definition of the dynamical matrix, Eq. (6.18), and taking the masses of the two ions to be the same, $M_1 = M_2 = M$, we obtain for the various matrix elements of the dynamical matrix

$$D_{i\alpha,j\beta}(\mathbf{k}) = \frac{1}{M} \sum_{\mathbf{R}} e^{-i\mathbf{k}\cdot\mathbf{R}}\, \frac{\partial^2 V}{\partial S_{\mathbf{R}i\alpha}\,\partial S_{0j\beta}} \tag{6.32}$$

We notice that, since we have adopted the approximation of nearest neighbor interactions, for the diagonal elements $D_{i\alpha,i\alpha}$ ($i = 1, 2$, $\alpha = x, y$) we must only use

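the $\mathbf{R} = 0$ contribution because there are no nearest neighbor atoms with the same index $i$, whereas for the off-diagonal elements $D_{1x,2x}$, $D_{2x,1x}$, $D_{1y,2y}$, $D_{2y,1y}$ we need to take into account contributions from $\mathbf{R} = 0$, $\mathbf{R} = \pm\mathbf{a}_1$, as appropriate for the nearest neighbors. The remaining matrix elements are identically equal to zero because there are no contributions to the potential energy that involve the variables $S_{nix}$ and $S_{mjy}$ together. The explicit expressions for the matrix elements are tabulated in Table 6.1.

Table 6.1. Dynamical matrix for the linear chain in 2D. All elements have been multiplied by $M$.

$$\begin{array}{c|cccc}
 & 1,x & 2,x & 1,y & 2,y \\ \hline
1,x & \kappa_r + \kappa_\theta & -\kappa_r - \kappa_\theta e^{i\mathbf{k}\cdot\mathbf{a}_1} & 0 & 0 \\
2,x & -\kappa_r - \kappa_\theta e^{-i\mathbf{k}\cdot\mathbf{a}_1} & \kappa_r + \kappa_\theta & 0 & 0 \\
1,y & 0 & 0 & \kappa_r + \kappa_\theta & -\kappa_\theta - \kappa_r e^{i\mathbf{k}\cdot\mathbf{a}_1} \\
2,y & 0 & 0 & -\kappa_\theta - \kappa_r e^{-i\mathbf{k}\cdot\mathbf{a}_1} & \kappa_r + \kappa_\theta
\end{array}$$

The solution of this matrix gives for the frequency eigenvalues

$$\omega^{(\pm)}_{\mathbf{k}} = \frac{1}{\sqrt{M}} \left[ (\kappa_r + \kappa_\theta) \pm \sqrt{\kappa_r^2 + \kappa_\theta^2 + 2\kappa_r\kappa_\theta \cos(\mathbf{k}\cdot\mathbf{a}_1)} \right]^{1/2} \tag{6.33}$$

with each value of $\omega_{\mathbf{k}}$ doubly degenerate (each value coming from one of the two submatrices). A plot of the two different eigenvalues, denoted in the following by $\omega^{(+)}_{\mathbf{k}}$, $\omega^{(-)}_{\mathbf{k}}$ from the sign in front of the square root, is given in Fig. 6.3. We have used values of the force constants such that $\kappa_r > \kappa_\theta$ since the cost of stretching bonds is always greater than the cost of bending bonds. It is instructive to analyze the behavior of the eigenvalues and the corresponding eigenvectors near the center ($\mathbf{k} = 0$) and near the edge of the BZ ($\mathbf{k} = (\pi/a)\hat{\mathbf{x}}$). This gives the following results:

$$\omega^{(-)}_0 = \frac{1}{\sqrt{M}} \left[\frac{1}{2}\frac{\kappa_r\kappa_\theta}{\kappa_r + \kappa_\theta}\right]^{1/2} a k, \qquad \omega^{(+)}_0 = \frac{1}{\sqrt{M}} \left[2(\kappa_r + \kappa_\theta) - \frac{1}{2}\frac{\kappa_r\kappa_\theta}{\kappa_r + \kappa_\theta}\, a^2 k^2\right]^{1/2}$$

$$\omega^{(-)}_1 = \frac{1}{\sqrt{M}} \left[2\kappa_\theta\right]^{1/2}, \qquad \omega^{(+)}_1 = \frac{1}{\sqrt{M}} \left[2\kappa_r\right]^{1/2} \tag{6.34}$$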

where we have used the subscript 0 for values of $|\mathbf{k}| \approx 0$ and the subscript 1 for values of $|\mathbf{k}| \approx \pi/a$. We see that the lowest frequency at $\mathbf{k} \approx 0$, $\omega^{(-)}_0$, is linear in $k = |\mathbf{k}|$, in contrast to the standard behavior of the energy of electrons at the bottom of the valence band, which was quadratic in $k$. If we examine the eigenvector of this eigenvalue in the limit $\mathbf{k} \to 0$, we find that it corresponds to a normal mode in which all the ions move in the same direction and by the same amount, that is, uniform translation of the crystal. This motion does not involve bond bending or bond stretching distortions. The eigenvector of the lower branch eigenvalue $\omega^{(-)}_1$ at $|\mathbf{k}| = (\pi/a)$ corresponds to motion of the two ions which is in phase within one unit cell, but $\pi$ out of phase in neighboring unit cells. This motion involves bond bending distortions only, hence the frequency eigenvalue $\omega^{(-)}_1 \sim \sqrt{\kappa_\theta}$. For the eigenvalue $\omega^{(+)}_0$, we find an eigenvector which corresponds to motion of the ions against each other within one unit cell, while the motion of equivalent ions in neighboring unit cells is in phase. This mode involves both bond stretching and bond bending, hence the frequency eigenvalue $\omega^{(+)}_0 \sim \sqrt{\kappa_r + \kappa_\theta}$. Finally, for the eigenvalue $\omega^{(+)}_1$, we find an eigenvector which corresponds to motion of the ions against each other in the same unit cell, while the motion of equivalent ions in neighboring unit cells is $\pi$ out of phase. This distortion involves bond stretching only, hence the frequency eigenvalue $\omega^{(+)}_1 \sim \sqrt{\kappa_r}$. The motion of ions corresponding to these four modes is illustrated in Fig. 6.3. There are four other modes which are degenerate to these and involve similar motion of the ions in the $\hat{\mathbf{y}}$ direction.

Figure 6.3. Solution of the chain model for phonons in 2D. Left: the frequency eigenvalues. Right: the motion of ions corresponding to the eigenvalues at the center and at the edge of the BZ.
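The chain model can also be solved numerically, which gives a quick check of the eigenvalue formula, Eq. (6.33), and of the character of the modes at the zone edge. The following is a minimal sketch, assuming the dynamical matrix of Table 6.1 with rows and columns ordered as ($1x$, $2x$, $1y$, $2y$). The function names, the illustrative values $\kappa_r = 4$, $\kappa_\theta = 1$, $M = 1$, and the use of the single scalar `phase` to stand for $\mathbf{k}\cdot\mathbf{a}_1$ (running from 0 at the zone center to $\pi$ at the zone edge) are choices made here for illustration.

```python
import numpy as np

def dynamical_matrix(phase, kr, kt, M=1.0):
    """Dynamical matrix of Table 6.1 in the basis (1x, 2x, 1y, 2y);
    phase stands for the scalar product k.a1."""
    e = np.exp(1j * phase)
    D = (kr + kt) * np.eye(4, dtype=complex)
    D[0, 1] = -kr - kt * e
    D[1, 0] = -kr - kt * np.conj(e)
    D[2, 3] = -kt - kr * e
    D[3, 2] = -kt - kr * np.conj(e)
    return D / M

def analytic_frequencies(phase, kr, kt, M=1.0):
    """Lower and upper branch of Eq. (6.33)."""
    root = np.sqrt(kr**2 + kt**2 + 2.0 * kr * kt * np.cos(phase))
    lower = np.sqrt(max(kr + kt - root, 0.0) / M)
    upper = np.sqrt((kr + kt + root) / M)
    return lower, upper

kr, kt = 4.0, 1.0   # illustrative values with kr > kt
max_error = 0.0
for phase in np.linspace(0.0, np.pi, 9):
    w2 = np.linalg.eigvalsh(dynamical_matrix(phase, kr, kt))
    numeric = np.sqrt(np.clip(w2, 0.0, None))   # ascending: (-), (-), (+), (+)
    lower, upper = analytic_frequencies(phase, kr, kt)
    max_error = max(max_error, np.max(np.abs(numeric - [lower, lower, upper, upper])))

# Eigenvectors of the x block at the zone edge, where k.a1 = pi
w2_edge, vecs = np.linalg.eigh(dynamical_matrix(np.pi, kr, kt)[:2, :2])
in_phase_lower = (vecs[0, 0] * np.conj(vecs[1, 0])).real > 0   # ions 1, 2 move together
out_of_phase_upper = (vecs[0, 1] * np.conj(vecs[1, 1])).real < 0

print(max_error, np.sqrt(w2_edge), in_phase_lower, out_of_phase_upper)
```

The eigenvalues come out in degenerate pairs, one from each $2 \times 2$ block. At the zone edge the squared frequencies of the $x$ block are $2\kappa_\theta/M$ and $2\kappa_r/M$, with the two ions of the cell moving in phase in the lower mode and against each other in the upper mode.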

The lower branch of the eigenvalues of the frequency, in which the motion of ions within one unit cell is in phase, is referred to as "acoustic", while the upper branch, in which the motion of ions within one unit cell is $\pi$ out of phase, is referred to as "optical". We note that in this example we get two branches because we have two atoms per unit cell, so the in-phase and out-of-phase motion of atoms in the unit cell produce two distinct modes. Moreover, there are doubly degenerate modes at each value of $\mathbf{k}$ ($\omega^{(1)}_{\mathbf{k}} = \omega^{(2)}_{\mathbf{k}} = \omega^{(-)}_{\mathbf{k}}$ and $\omega^{(3)}_{\mathbf{k}} = \omega^{(4)}_{\mathbf{k}} = \omega^{(+)}_{\mathbf{k}}$) because we are dealing with a 2D crystal which has symmetric couplings in the $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ directions; the degeneracy will be broken if the couplings between ions are not symmetric in the different directions. In 3D there would be three acoustic and three optical modes for a crystal with two atoms per unit cell, and their frequencies would not be degenerate, except when required by symmetry. In cases where there is only one atom per unit cell there are only acoustic modes. For crystals with $\nu$ atoms per unit cell, there are $3 \times \nu$ modes for each value of $\mathbf{k}$.

6.2.2 Phonons in a 3D crystal

As an illustration of the above ideas in a more realistic example we discuss briefly the calculation of phonon modes in Si. Si is the prototypical covalent solid, where both bond bending and bond stretching terms are very important. If we wanted to obtain the phonon spectrum of Si by a force-constant model, we would need parameters that can describe accurately the relevant contributions of the bond bending and bond stretching forces. For example, fitting the values of $\kappa_r$ and $\kappa_\theta$ to reproduce the experimental values for the highest frequency optical mode at $\Gamma$ (labeled LTO($\Gamma$))[2] and the lowest frequency acoustic mode at $X$ (labeled TA($X$)) gives the spectrum shown in Fig. 6.4 (see also Problem 1). Although through such a procedure it is possible to obtain estimates for the bond stretching and bond bending parameters, the model is not very accurate: for instance, the values of the two other phonon modes at $X$ are off by +5% for the mode labeled TO($X$) and −12% for the mode labeled LOA($X$), compared with experimental values. Interactions to several neighbors beyond the nearest ones are required in order to obtain frequencies closer to the experimentally measured spectrum throughout the BZ. However, establishing the values of force constants for interactions beyond nearest neighbors is rather complicated and must rely on experimental input. Moreover, with such extensions the model loses its simplicity and the transparent physical meaning of the parameters.

[2] In phonon nomenclature "L" stands for longitudinal, "T" for transverse, "O" for optical and "A" for acoustic.

Alternatively, one can use the formalism discussed in chapters 4 and 5 (DFT/LDA and its refinements) for calculating total energies or forces, as follows. The atomic displacements that correspond to various phonon modes can be established

either by symmetry (for instance, using group theory arguments) or in conjunction with simple calculations based on a force-constant model, as in the example discussed above.

Figure 6.4. Top: the phonon spectrum of Si (frequency $\omega$ in THz; branches labeled LTO, TO, LOA, TA) along $L - \Gamma - X$, the high-symmetry directions, calculated within the force-constant model with $\sqrt{\kappa_r/M_{Si}} = 8.828$ THz, $\sqrt{\kappa_\theta/M_{Si}} = 2.245$ THz. Bottom: atomic displacements associated with the phonon modes TA($X$), LOA($X$), TO($X$) and LTO($\Gamma$) in Si at the $\Gamma$ and $X$ points in the BZ; inequivalent atoms are labeled 1–4.

For instance, we show in Fig. 6.4 the atomic displacements corresponding to certain high-symmetry phonon modes for Si, as obtained from the

32 TO( X ) 13. Higher order terms give the anharmonic contributions which correspond to phonon–phonon interactions. or the restoring forces on the atoms.2 we compare the calculated frequencies for the phonon modes of Fig.16 15. for a review and more details.16 11.6.4 involves only two atoms. The atomic displacements corresponding to these modes are shown in Fig. 6. [65. LTO( ) DFT energy calculation DFT force calculation Force-constant model Experiment 15. 67].45 4. see Ref.48 13.53 LOA( X ) 12. and the coefﬁcient of the second order term gives the phonon frequency (in the case of a force calculation the ﬁt starts at ﬁrst order in the phonon amplitude).98 10. For instance. These energy differences are then ﬁtted by a second or higher order polynomial in the phonon amplitude. For a given atomic displacement. Its limitation is that only phonons of relatively high symmetry can be calculated. In Table 6.49 force-constant model. Frequencies of four high-symmetry modes are given (in THz) at the center ( ) and the boundary (X ) of the Brillouin Zone.4 (from Yin and Cohen [77]). the unit cell for the LTO( ) mode in Fig. making computations for large unit cells with many atoms intractable. This straightforward method. whereas the supercell for all the modes at X involves four atoms. in order to represent this motion. twice as many as in the PUC of the perfect crystal.2 The force-constant model 215 Table 6.37 4.2. The computational cost of the quantum mechanical calculations of energy and forces increases sharply with the size of the unit cell.4. called the “frozen phonon” approach. gives results that are remarkably close to experimental values. the information from the few high-symmetry points can usually be interpolated to yield highly accurate phonon frequencies throughout the entire BZ. Phonon modes in Si.14 15.53∗ 15. However. 6. DFT results are from Refs.49∗ 4. . can be calculated as a function of the phonon amplitude. 6. 
as obtained by theoretical calculations and by experimental measurements.65 13.90 TA( X ) 4. the boundaries and a few special points within the Brillouin Zone. [65].51 14. the energy of the system relative to its equilibrium structure.83 12. such as at the center. the same as in the PUC of the perfect crystal. that is. The reason is that phonon modes with low symmetry involve the coherent motion of atoms across many unit cells of the ideal crystal. a large supercell (a multiple of the ideal crystal unit cell) that contains all the inequivalent atoms in a particular phonon mode must be used. with values of κr and κθ chosen to reproduce exactly the experimental frequencies marked by asterisks. The force-constant model is based on nearest neighbor interactions only. to experimentally measured ones.
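The fitting step of the frozen-phonon approach can be sketched in a few lines. The example below is only an illustration under stated assumptions: `model_energy_change` is a hypothetical stand-in for a total-energy calculation (a quadratic plus a small quartic term, in arbitrary model units, not Si data), and the fit uses only even powers of the amplitude, which presumes a mode whose energy is symmetric in the displacement. The harmonic coefficient of the fit gives the frequency; the quartic coefficient is the leading anharmonic term.

```python
import math

def fit_even_polynomial(amplitudes, energy_changes):
    """Least-squares fit of dE(u) = a2*u**2 + a4*u**4; returns (a2, a4)."""
    # Sums entering the 2x2 normal equations of the least-squares problem.
    s22 = sum(u**4 for u in amplitudes)
    s24 = sum(u**6 for u in amplitudes)
    s44 = sum(u**8 for u in amplitudes)
    b2 = sum(de * u**2 for u, de in zip(amplitudes, energy_changes))
    b4 = sum(de * u**4 for u, de in zip(amplitudes, energy_changes))
    det = s22 * s44 - s24 * s24
    a2 = (b2 * s44 - b4 * s24) / det
    a4 = (s22 * b4 - s24 * b2) / det
    return a2, a4

def frozen_phonon_frequency(amplitudes, energy_changes, mass):
    """Angular frequency from the harmonic term, dE = (1/2)*mass*omega**2*u**2."""
    a2, _ = fit_even_polynomial(amplitudes, energy_changes)
    return math.sqrt(2.0 * a2 / mass)

# Hypothetical model parameters (assumptions for illustration only).
MODEL_MASS = 1.0
MODEL_OMEGA = 3.0
MODEL_QUARTIC = 0.7

def model_energy_change(u):
    """Energy relative to equilibrium for a frozen displacement of amplitude u."""
    return 0.5 * MODEL_MASS * MODEL_OMEGA**2 * u**2 + MODEL_QUARTIC * u**4

amplitudes = [0.02, 0.04, 0.06, 0.08, 0.10]
energy_changes = [model_energy_change(u) for u in amplitudes]
a2_fit, a4_fit = fit_even_polynomial(amplitudes, energy_changes)
omega_fit = frozen_phonon_frequency(amplitudes, energy_changes, MODEL_MASS)
print(f"fitted omega = {omega_fit:.6f}, quartic coefficient = {a4_fit:.6f}")
```

In an actual calculation the list `energy_changes` would come from supercell total energies at each frozen amplitude, and `mass` would be the appropriate effective mass for the displacement pattern of the mode.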

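6.3 Phonons as harmonic oscillators

We next draw the analogy between the phonon hamiltonian and that of a collection of independent harmonic oscillators. We begin with the most general expression for the displacement of ions, Eq. (6.22). In that equation, we combine the amplitude $c_{\mathbf{k}}^{(l)}$ with the time-dependent part of the exponential $\exp(-i\omega_{\mathbf{k}}^{(l)} t)$ into a new variable $Q_{\mathbf{k}}^{(l)}(t)$, and express the most general displacement of ions as:

$$\mathbf{S}_{nj}(t) = \sum_{l,\mathbf{k}} Q_{\mathbf{k}}^{(l)}(t)\, \frac{1}{\sqrt{M_j}}\, \hat{\mathbf{e}}_{\mathbf{k}j}^{(l)}\, e^{i\mathbf{k}\cdot\mathbf{R}_n} \tag{6.35}$$

These quantities must be real, since they describe ionic displacements in real space. Moreover, for the time dependence we should only consider the real part, that is, $\cos(\omega_{\mathbf{k}}^{(l)} t)$. Therefore, the coefficients of factors $\exp(i\mathbf{k}\cdot\mathbf{R}_n)$ and $\exp(i(-\mathbf{k})\cdot\mathbf{R}_n)$ that appear in the expansion must be complex conjugate, leading to the relation

$$\left[ Q_{\mathbf{k}}^{(l)}(t)\, \hat{\mathbf{e}}_{\mathbf{k}j}^{(l)} \right]^* = Q_{-\mathbf{k}}^{(l)}(t)\, \hat{\mathbf{e}}_{-\mathbf{k}j}^{(l)} \tag{6.36}$$

and since this relation must hold for any time $t$, we deduce that

$$\left[ Q_{\mathbf{k}}^{(l)}(t) \right]^* = Q_{-\mathbf{k}}^{(l)}(t), \qquad \left[ \hat{\mathbf{e}}_{\mathbf{k}j}^{(l)} \right]^* = \hat{\mathbf{e}}_{-\mathbf{k}j}^{(l)} \tag{6.37}$$

The kinetic energy of the system of ions will be given in terms of the displacements $\mathbf{S}_{nj}(t)$ as

$$K = \sum_{n,j} \frac{M_j}{2} \left( \frac{d\mathbf{S}_{nj}}{dt} \right)^2 = \frac{1}{2} \sum_{n,j} \sum_{\mathbf{k}l,\,\mathbf{k}'l'} \frac{dQ_{\mathbf{k}}^{(l)}}{dt}\, \frac{dQ_{\mathbf{k}'}^{(l')}}{dt}\; \hat{\mathbf{e}}_{\mathbf{k}j}^{(l)} \cdot \hat{\mathbf{e}}_{\mathbf{k}'j}^{(l')}\; e^{i(\mathbf{k}+\mathbf{k}')\cdot\mathbf{R}_n} \tag{6.38}$$

Using the relations of Eq. (6.37) and $(1/N)\sum_n \exp(i\mathbf{k}\cdot\mathbf{R}_n) = \delta(\mathbf{k} - \mathbf{G})$ from Eq. (G.69), we obtain for the kinetic energy

$$K = \frac{1}{2} \sum_{\mathbf{k},l,l',j} \frac{dQ_{\mathbf{k}}^{(l)}}{dt}\, \frac{dQ_{\mathbf{k}}^{(l')*}}{dt}\; \hat{\mathbf{e}}_{\mathbf{k}j}^{(l)} \cdot \hat{\mathbf{e}}_{\mathbf{k}j}^{(l')*} = \frac{1}{2} \sum_{\mathbf{k},l} \frac{dQ_{\mathbf{k}}^{(l)}}{dt}\, \frac{dQ_{\mathbf{k}}^{(l)*}}{dt} \tag{6.39}$$

where we have set $\mathbf{G} = 0$ in the argument of the δ-function since the wave-vectors $\mathbf{k}$, $\mathbf{k}'$ lie in the first BZ, and the last equality was obtained with the help of Eq. (6.21). The potential energy can also be expressed in terms of the atomic displacements and the force-constant matrix:

$$V = \frac{1}{2} \sum_{n,i,\alpha}\, \sum_{m,j,\beta}\, \sum_{\mathbf{k},l;\,\mathbf{k}',l'} Q_{\mathbf{k}}^{(l)}\, \hat{e}_{\mathbf{k}i\alpha}^{(l)}\, e^{i\mathbf{k}\cdot\mathbf{R}_n}\, \frac{1}{\sqrt{M_i M_j}}\, F_{ni\alpha,mj\beta}\; Q_{\mathbf{k}'}^{(l')}\, \hat{e}_{\mathbf{k}'j\beta}^{(l')}\, e^{i\mathbf{k}'\cdot\mathbf{R}_m} \tag{6.40}$$

which, with the help of the same relations that were used in the derivation of the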

kinetic energy, becomes

$$V = \frac{1}{2} \sum_{\mathbf{k},l} \left( \omega_{\mathbf{k}}^{(l)} \right)^2 Q_{\mathbf{k}}^{(l)}\, Q_{\mathbf{k}}^{(l)*} \tag{6.41}$$

To obtain the last expression we have also used Eqs. (6.10) and (6.11) to relate the displacements to the force-constant matrix and the frequency eigenvalues $\omega_{\mathbf{k}}^{(l)}$. Combining the expressions for the kinetic and potential energies, we obtain the total energy of a system of phonons:

$$E^{phon} = \frac{1}{2} \sum_{\mathbf{k},l} \left[ \left| \frac{dQ_{\mathbf{k}}^{(l)}}{dt} \right|^2 + \left( \omega_{\mathbf{k}}^{(l)} \right)^2 \left| Q_{\mathbf{k}}^{(l)} \right|^2 \right] \tag{6.42}$$

which is formally identical to the total energy of a collection of independent harmonic oscillators with frequencies $\omega_{\mathbf{k}}^{(l)}$, where $Q_{\mathbf{k}}^{(l)}$ is the free variable describing the motion of the harmonic oscillator. This expression also makes it easy to show that the kinetic and potential contributions to the total energy are equal, as expected for harmonic oscillators (see Problem 2).

This analysis shows that a solid in which atoms are moving is equivalent to a number of phonon modes that are excited, with the atomic motion given as a superposition of the harmonic modes corresponding to the excited phonons. The total energy of the excitation is that of a collection of independent harmonic oscillators with the proper phonon frequencies. Notice that the $Q_{\mathbf{k}}^{(l)}$ contain the amplitude of the vibration, as seen from Eq. (6.35), and consequently, the total energy involves the absolute value squared of the amplitude of every phonon mode that is excited. Since the atomic motion in the solid must be quantized, the harmonic oscillators describing this motion should be treated as quantum mechanical ones. Indeed, in Eq. (6.42), $Q_{\mathbf{k}}^{(l)}$ can be thought of as the quantum mechanical position variable and $dQ_{\mathbf{k}}^{(l)}/dt$ as the conjugate momentum variable of a harmonic oscillator identified by index $l$ and wave-vector $\mathbf{k}$. The amplitude of the vibration, contained in $Q_{\mathbf{k}}^{(l)}$, can be interpreted as the number of excited phonons of this particular mode. In principle, there is no limit for this amplitude, therefore an arbitrary number of phonons of each frequency can be excited; all these phonons contribute to the internal energy of the solid. The fact that an arbitrary number of phonons of each mode can be excited indicates that phonons must be treated as bosons. With this interpretation, $Q_{\mathbf{k}}^{(l)}$ and its conjugate variable $dQ_{\mathbf{k}}^{(l)}/dt$ obey the proper commutation relations for bosons. This implies that the total energy due to the phonon excitations must be given by

$$E_s^{phon} = \sum_{\mathbf{k},l} \left( n_{\mathbf{k}s}^{(l)} + \frac{1}{2} \right) \hbar\omega_{\mathbf{k}}^{(l)} \tag{6.43}$$

where $n_{\mathbf{k}s}^{(l)}$ is the number of phonons of frequency $\omega_{\mathbf{k}}^{(l)}$ that have been excited in a particular state (denoted by $s$) of the system. This expression is appropriate for quantum harmonic oscillators, with $n_{\mathbf{k}s}^{(l)}$ allowed to take any non-negative integer value. One interesting aspect of this expression is that, even in the ground state of the system when none of the phonon modes is excited, that is, $n_{\mathbf{k}0}^{(l)} = 0$, there is a certain amount of energy in the system due to the so called zero-point motion associated with quantum harmonic oscillators; this arises from the factors of $\frac{1}{2}$ which are added to the phonon occupation numbers in Eq. (6.43); we will see below that this has measurable consequences (see section 6.5).

In practice, if the atomic displacements are not too large, the harmonic approximation to phonon excitations is reasonable. For large displacements anharmonic terms become increasingly important, and a more elaborate description is necessary which takes into account phonon–phonon interactions arising from the anharmonic terms. Evidently, this places a limit on the number of phonons that can be excited before the harmonic approximation breaks down.

6.4 Application: the specific heat of crystals

We can use the concept of phonons to determine the thermal properties of crystals, and in particular their specific heat. This is especially interesting at low temperatures, where the quantum nature of excitations becomes important, and gives behavior drastically different from the classical result. We discuss this topic next, beginning with a brief review of the result of the classical theory.

6.4.1 The classical picture

The internal energy per unit volume of a collection of classical particles at inverse temperature $\beta = 1/k_B T$ ($k_B$ is Boltzmann's constant) contained in a volume $\Omega$ is given by

$$\frac{E}{\Omega} = \frac{1}{\Omega}\, \frac{\int \{d\mathbf{r}\}\{d\mathbf{p}\}\, E(\{\mathbf{r}\},\{\mathbf{p}\})\, e^{-\beta E(\{\mathbf{r}\},\{\mathbf{p}\})}}{\int \{d\mathbf{r}\}\{d\mathbf{p}\}\, e^{-\beta E(\{\mathbf{r}\},\{\mathbf{p}\})}} = -\frac{1}{\Omega}\, \frac{\partial}{\partial\beta} \ln \int \{d\mathbf{r}\}\{d\mathbf{p}\}\, e^{-\beta E(\{\mathbf{r}\},\{\mathbf{p}\})} \tag{6.44}$$

where $\{\mathbf{r}\}$ and $\{\mathbf{p}\}$ are the coordinates and momenta of the particles. In the harmonic approximation, the energy is given in terms of the displacements $\delta\mathbf{r}_n$ from ideal

positions as

$$E = E_0 + \frac{1}{2} \sum_{n,n'} \delta\mathbf{r}_n \cdot \mathbf{F}(\mathbf{r}_n - \mathbf{r}_{n'}) \cdot \delta\mathbf{r}_{n'} + \frac{1}{2} \sum_n \frac{\mathbf{p}_n^2}{M_n} \tag{6.45}$$

where $E_0$ is the energy of the ground state and $\mathbf{F}(\mathbf{r}_n - \mathbf{r}_{n'})$ is the force-constant matrix, defined in Eq. (6.5); the indices $n$, $n'$ run over all the particles in the system. Since both the potential and kinetic parts involve quadratic expressions, we can rescale all coordinates by $\beta^{1/2}$, in which case the integral in the last expression in Eq. (6.44) is multiplied by a factor of $\beta^{-3N_{at}}$ where $N_{at}$ is the total number of atoms in 3D space. What remains is independent of $\beta$, giving for the internal energy per unit volume:

$$\frac{E}{\Omega} = \frac{E_0}{\Omega} + \frac{3N_{at}}{\Omega}\, k_B T \tag{6.46}$$

from which the specific heat per unit volume can be calculated:

$$c(T) = \frac{\partial}{\partial T} \left( \frac{E}{\Omega} \right) = 3k_B\, (N_{at}/\Omega) \tag{6.47}$$

and turns out to be a constant independent of temperature. This behavior is referred to as the Dulong–Petit law, which is valid at high temperatures.

6.4.2 The quantum mechanical picture

The quantum mechanical calculation of the internal energy in terms of phonons gives the following expression:

$$\frac{E}{\Omega} = \frac{1}{\Omega}\, \frac{\sum_s E_s\, e^{-\beta E_s}}{\sum_s e^{-\beta E_s}} \tag{6.48}$$

where $E_s$ is the energy corresponding to a particular state of the system, which involves a certain number of excited phonons. We will take $E_s$ from Eq. (6.43) that we derived earlier. Just as in the classical discussion, we express the total internal energy as

$$\frac{E}{\Omega} = -\frac{1}{\Omega}\, \frac{\partial}{\partial\beta} \ln \sum_s e^{-\beta E_s} \tag{6.49}$$

There is a neat mathematical trick that allows us to express this in a more convenient form. Consider the expression

$$\sum_s e^{-\beta E_s} = \sum_s \exp\left[ -\sum_{\mathbf{k},l} \left( n_{\mathbf{k}s}^{(l)} + \frac{1}{2} \right) \hbar\omega_{\mathbf{k}}^{(l)} \beta \right] \tag{6.50}$$

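which involves a sum of all exponentials containing terms $(n_{\mathbf{k}}^{(l)} + \frac{1}{2})\beta\hbar\omega_{\mathbf{k}}^{(l)}$ with all possible non-negative integer values of $n_{\mathbf{k}}^{(l)}$. Now consider the expression

$$\prod_{\mathbf{k},l}\, \sum_{n=0}^{\infty} e^{-\beta (n + \frac{1}{2}) \hbar\omega_{\mathbf{k}}^{(l)}} = \left( e^{-\beta\hbar \frac{1}{2} \omega_{\mathbf{k}_1}^{(l_1)}} + e^{-\beta\hbar \frac{3}{2} \omega_{\mathbf{k}_1}^{(l_1)}} + \cdots \right) \left( e^{-\beta\hbar \frac{1}{2} \omega_{\mathbf{k}_2}^{(l_2)}} + e^{-\beta\hbar \frac{3}{2} \omega_{\mathbf{k}_2}^{(l_2)}} + \cdots \right) \cdots \tag{6.51}$$

It is evident that when these products are expanded out we obtain exponentials with exponents $(n + \frac{1}{2})\beta\hbar\omega_{\mathbf{k}}^{(l)}$ with all possible non-negative values of $n$. Therefore the expressions in these two equations are the same, and we can substitute the second expression for the first one in Eq. (6.49). Notice further that the geometric series summation gives

$$\sum_{n=0}^{\infty} e^{-\beta (n + \frac{1}{2}) \hbar\omega_{\mathbf{k}}^{(l)}} = \frac{e^{-\beta\hbar\omega_{\mathbf{k}}^{(l)}/2}}{1 - e^{-\beta\hbar\omega_{\mathbf{k}}^{(l)}}} \tag{6.52}$$

which gives for the total internal energy per unit volume

$$\frac{E}{\Omega} = -\frac{1}{\Omega}\, \frac{\partial}{\partial\beta} \ln \prod_{\mathbf{k},l} \frac{e^{-\beta\hbar\omega_{\mathbf{k}}^{(l)}/2}}{1 - e^{-\beta\hbar\omega_{\mathbf{k}}^{(l)}}} = \frac{1}{\Omega} \sum_{\mathbf{k},l} \hbar\omega_{\mathbf{k}}^{(l)} \left( \bar{n}_{\mathbf{k}}^{(l)} + \frac{1}{2} \right) \tag{6.53}$$

where we have defined $\bar{n}_{\mathbf{k}}^{(l)}$ as

$$\bar{n}_{\mathbf{k}}^{(l)}(T) = \frac{1}{e^{\beta\hbar\omega_{\mathbf{k}}^{(l)}} - 1} \tag{6.54}$$

This quantity represents the average occupation of the phonon state with frequency $\omega_{\mathbf{k}}^{(l)}$, at temperature $T$, and is appropriate for bosons (see Appendix D). From Eq. (6.53) we can now calculate the specific heat per unit volume:

$$c(T) = \frac{\partial}{\partial T} \left( \frac{E}{\Omega} \right) = \frac{1}{\Omega} \sum_{\mathbf{k},l} \hbar\omega_{\mathbf{k}}^{(l)}\, \frac{\partial}{\partial T}\, \bar{n}_{\mathbf{k}}^{(l)}(T) \tag{6.55}$$

We examine first the behavior of this expression in the low-temperature limit. In this limit, $\beta\hbar\omega_{\mathbf{k}}^{(l)}$ becomes very large, and therefore $\bar{n}_{\mathbf{k}}^{(l)}(T)$ becomes negligibly small, except when $\omega_{\mathbf{k}}^{(l)}$ happens to be very small. We saw in an earlier discussion that $\omega_{\mathbf{k}}^{(l)}$ goes to zero linearly near the center of the BZ ($\mathbf{k} \approx 0$), for the acoustic branches. We can then write that near the center of the BZ $\omega_{\mathbf{k}}^{(l)} = v_{\hat{\mathbf{k}}}^{(l)} k$, where $k = |\mathbf{k}|$ is the wave-vector magnitude and $v_{\hat{\mathbf{k}}}^{(l)}$ is the sound velocity which depends on the direction of the wave-vector $\hat{\mathbf{k}}$ and the acoustic branch label $l$. For large $\beta$ we can then approximate all frequencies as $\omega_{\mathbf{k}}^{(l)} = v_{\hat{\mathbf{k}}}^{(l)} k$ over the entire BZ, since for large values of $k$ the contributions are negligible because of the factor $\exp(\hbar v_{\hat{\mathbf{k}}}^{(l)} k \beta)$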

which appears in the denominator. Turning the sum into an integral as usual, we obtain for the specific heat at low temperature

$$c(T) = \frac{\partial}{\partial T} \sum_l \int \frac{d\mathbf{k}}{(2\pi)^3}\, \frac{\hbar v_{\hat{\mathbf{k}}}^{(l)} k}{e^{\hbar v_{\hat{\mathbf{k}}}^{(l)} k \beta} - 1} \tag{6.56}$$

We will change variables as in the case of the classical calculation to make the integrand independent of inverse temperature $\beta$, by taking $\hbar v_{\hat{\mathbf{k}}}^{(l)} k \beta = t_{\hat{\mathbf{k}}}^{(l)}$, which gives a factor of $\beta^{-4}$ in front of the integral, while the integral now has become independent of temperature. In fact, we can use the same argument about the negligible contributions of large values of $k$ to extend the integration to infinity in the variable $k$, so that the result is independent of the BZ shape. Denoting by $c_0$ the value of the constant which is obtained by performing the integration and encompasses all other constants in front of the integral, we then find that

$$c(T) = c_0\, \frac{\partial}{\partial T}\, T^4 = 4 c_0 T^3 \tag{6.57}$$

that is, the behavior of the specific heat at low temperature is cubic in the temperature and not constant as the Dulong–Petit law suggests from the classical calculation.

In the high-temperature limit, i.e. when $k_B T \gg \hbar\omega_{\mathbf{k}}^{(l)}$, the Bose occupation factors take the form

$$\frac{1}{e^{\beta\hbar\omega_{\mathbf{k}}^{(l)}} - 1} \approx \frac{k_B T}{\hbar\omega_{\mathbf{k}}^{(l)}} \tag{6.58}$$

and with this, the expression for the specific heat becomes

$$c(T) = \frac{1}{\Omega} \sum_{\mathbf{k},l} k_B = 3k_B\, (N_{at}/\Omega) \tag{6.59}$$

which is the same result as in the classical calculation. Thus, at sufficiently high temperatures the Dulong–Petit law is recovered.

6.4.3 The Debye model

A somewhat more quantitative discussion of the behavior of the specific heat as a function of the temperature is afforded by the Debye model. In this model we assume all frequencies to be linear in the wave-vector magnitude, as acoustic modes near $k \approx 0$ are. For simplicity we assume we are dealing with an isotropic solid, so we can take $\omega = vk$ for all the different acoustic branches, with $v$, the sound velocity, being the same in all directions. The total number of phonon modes is equal to $3N_{at}$, where $N_{at}$ is the total number of atoms in the crystal, and for each value of $\mathbf{k}$ there are $3\nu$ normal modes in 3D ($\nu$ being the number of atoms in

the PUC). Strictly speaking, the Debye model makes sense only for crystals with one atom per PUC, because it assumes all modes to be acoustic in nature (they behave like $\omega \sim k$ for $k \to 0$). Just as in the case of electronic states, we can use a normalization argument to relate the density of the crystal, $n = N_{at}/\Omega$, to the highest value of the wave-vector that phonons can assume (denoted by $k_D$):

$$\sum_{\mathbf{k}} 3\nu = 3N_{at} \;\Longrightarrow\; 3\Omega \int_{k=0}^{k=k_D} \frac{d\mathbf{k}}{(2\pi)^3} = 3N_{at} \;\Longrightarrow\; n = \frac{N_{at}}{\Omega} = \frac{k_D^3}{6\pi^2} \tag{6.60}$$

from which the value of $k_D$ is determined in terms of $n$. Notice that $k_D$ is determined by considering the total number of phonon modes, not the actual number of phonons present in some excited state of the system, which of course can be anything since phonons are bosons. We can also define the Debye frequency $\omega_D$ and Debye temperature $\Theta_D$, through

$$\omega_D = v k_D, \qquad \Theta_D = \frac{\hbar\omega_D}{k_B} \tag{6.61}$$

With these definitions, we obtain the following expression for the specific heat at any temperature:

$$c(T) = 9 n k_B \left( \frac{T}{\Theta_D} \right)^3 \int_0^{\Theta_D/T} \frac{t^4 e^t}{(e^t - 1)^2}\, dt \tag{6.62}$$

In the low-temperature limit, with $\Theta_D$ much larger than $T$, the upper limit of the integral is approximated by ∞, and there is no temperature dependence left in the integral which gives a constant when evaluated explicitly. This reduces to the expression we discussed above, Eq. (6.57). What happens for higher temperatures? Notice first that the integrand is always a positive quantity. As the temperature increases, the upper limit of the integral becomes smaller and smaller, and the value obtained from integration is lower than the value obtained in the low-temperature limit. Therefore, as the temperature increases the value of the specific heat increases slower than $T^3$, as shown in Fig. 6.5. As we saw above, at sufficiently high temperature the specific heat eventually approaches the constant value given by the Dulong–Petit law.

The physical meaning of the Debye temperature is somewhat analogous to that of the Fermi level for electrons. For $T$ above the Debye temperature all phonon modes are excited, that is, $\bar{n}_{\mathbf{k}}^{(l)}(T) > 0$ for all frequencies, while for $T$ below the Debye temperature, the high-frequency phonon modes are frozen, that is, $\bar{n}_{\mathbf{k}}^{(l)}(T) \approx 0$ for those modes. The Debye model is oversimplified, because it treats the frequency spectrum as linearly dependent on the wave-vectors, $\omega = vk$ for all modes. For a realistic calculation we need to include the actual frequency spectrum, which produces a density of states with features that depend on the structure of the solid,

analogous to what we had discussed for the electronic energy spectrum. An example (Al) is shown in Fig. 6.5.

[Figure 6.5. Left: behavior of the specific heat $c$ as a function of temperature $T$ in the Debye model; the curve rises as $\sim T^3$ at low $T$. The asymptotic value for large $T$, $3nk_B$, corresponds to the Dulong–Petit law. Right: comparison of the density of states in the Debye model as given by Eq. (6.65), shown by the thick solid line, and a realistic calculation (adapted from Ref. [78]), shown by the shaded curve, for Al. The axes of the right panel are $g(\omega)\omega_D/n$ (from 0 to 9) versus $\omega/\omega_D$ (from 0 to 1).]

Within the Debye model, the density of phonon modes at frequency $\omega$ per unit volume of the crystal, $g(\omega)$, can be easily obtained starting with the general expression

$$g(\omega) = \frac{1}{\Omega} \sum_{\mathbf{k},l} \delta(\omega - \omega_{\mathbf{k}}^{(l)}) = \sum_l \int \frac{d\mathbf{k}}{(2\pi)^3}\, \delta(\omega - \omega_{\mathbf{k}}^{(l)}) = \sum_l \int_{\omega = \omega_{\mathbf{k}}^{(l)}} \frac{1}{(2\pi)^3}\, \frac{dS_{\mathbf{k}}}{|\nabla_{\mathbf{k}}\, \omega_{\mathbf{k}}^{(l)}|}$$

where the last expression involves an integral over a surface in k-space on which $\omega_{\mathbf{k}}^{(l)} = \omega$, the argument of the density of states. In the Debye model we have $\omega_{\mathbf{k}}^{(l)} = v|\mathbf{k}|$, which gives $|\nabla_{\mathbf{k}}\, \omega_{\mathbf{k}}^{(l)}| = v$, with $v$ the average sound velocity. For fixed value of $\omega = \omega_{\mathbf{k}}^{(l)} = v|\mathbf{k}|$, the integral over a surface in k-space that corresponds to this value of $\omega$ is equal to the surface of a sphere of radius $|\mathbf{k}| = \omega/v$, and since there are three phonon branches in 3D space (all having the same average sound velocity within the Debye model), we find for the density of states

$$g(\omega) = \frac{3}{(2\pi)^3}\, \frac{1}{v}\, 4\pi \left( \frac{\omega}{v} \right)^2 = \frac{3\omega^2}{2\pi^2 v^3} \tag{6.63}$$

that is, a simple quadratic expression in $\omega$. If we calculate the total number of available phonon modes per unit volume $N^{ph}/\Omega$ up to the Debye frequency $\omega_D$ we find

$$\frac{N^{ph}}{\Omega} = \int_0^{\omega_D} g(\omega)\, d\omega = \frac{1}{2\pi^2 v^3}\, \omega_D^3 = 3\, \frac{1}{6\pi^2}\, k_D^3 \tag{6.64}$$

which from Eq. (6.60) is equal to $3(N_{at}/\Omega)$, exactly as we would expect for a 3D crystal containing $N_{at}$ atoms. Using the definition of the Debye frequency we can express the density of states as

$$\frac{g(\omega)\, \omega_D}{n} = 9 \left( \frac{\omega}{\omega_D} \right)^2 \tag{6.65}$$

which is a convenient relation between dimensionless quantities. In Fig. 6.5 we show a comparison of the density of states in the Debye model as given by Eq. (6.65), and as obtained by a realistic calculation [78], for Al. This comparison illustrates how simple the Debye model is relative to real phonon modes. Nevertheless, the model is useful in that it provides a simple justification for the behavior of the specific heat as a function of temperature as well as a rough measure of the highest phonon frequency, $\omega_D$. In fact, the form of the specific heat predicted by the Debye model, Eq. (6.62), is often used to determine the Debye temperature, and through it the Debye frequency. Since this relation involves temperature dependence, a common practice is to fit the observed specific heat at given $T$ to one-half the value of the Dulong–Petit law as obtained from Eq. (6.62) [79], which determines $\Theta_D$. Results of this fitting approach for the Debye temperatures of several elemental solids are given in Table 6.3.

Table 6.3. Debye temperatures $\Theta_D$ and frequencies $\omega_D$ for elemental solids. Values of $\Theta_D$ (in K) and $\omega_D$ (in THz) were determined by fitting the observed value of the specific heat at a certain temperature to half the value of the Dulong–Petit law through Eq. (6.62). Source: Ref. [78].

Element   ΘD (K)   ωD (THz)     Element   ΘD (K)   ωD (THz)     Element   ΘD (K)   ωD (THz)
Li         400      8.33        Na         150      3.12        K          100      2.08
Be        1000     20.84        Mg         318      6.63        Ca         230      4.79
B         1250     26.05        Al         394      8.21        Ga         240      5.00
C (a)     1860     38.76        Si         625     13.02        Ge         350      7.29
As         285      5.94        Sb         200      4.17        Bi         120      2.50
Cu         315      6.56        Ag         215      4.48        Au         170      3.54
Zn         234      4.88        Cd         120      2.50        Hg         100      2.08
Cr         460      9.58        Mo         380      7.92        W          310      6.46
Mn         400      8.33        Fe         420      8.75        Co         385      8.02
Ni         375      7.81        Pd         275      5.73        Pt         230      4.79

(a) In the diamond phase.
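The Debye expression for the specific heat, Eq. (6.62), is easy to evaluate numerically, which makes its two limits and the half-Dulong–Petit fitting point concrete. The sketch below is an illustration only: the function names are hypothetical, the integral is done with a plain composite Simpson rule, and the low-temperature check uses the standard value $4\pi^4/15$ of the integral with infinite upper limit. The last function converts a Debye temperature to a frequency as $k_B\Theta_D/h$; this is an assumption about the convention, made because it reproduces the tabulated frequencies for $\Theta_D$ values in Table 6.3 (for example 400 K gives about 8.33 THz), which suggests the tabulated numbers are ordinary frequencies $\omega_D/2\pi$.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant in J/K (exact SI value)
H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact SI value)

def debye_integrand(t):
    """t^4 e^t / (e^t - 1)^2 of Eq. (6.62), written with e^{-t} to avoid overflow."""
    if t == 0.0:
        return 0.0  # the integrand behaves like t^2 near the origin
    one_minus_exp = -math.expm1(-t)  # 1 - e^{-t}
    return t**4 * math.exp(-t) / (one_minus_exp * one_minus_exp)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def specific_heat_ratio(t_over_theta):
    """c(T) / (3 n k_B) in the Debye model, as a function of T / Theta_D."""
    upper = 1.0 / t_over_theta
    return 3.0 * t_over_theta**3 * simpson(debye_integrand, 0.0, upper)

def half_dulong_petit_temperature(lo=0.05, hi=1.0, tol=1e-8):
    """Bisection for the T / Theta_D at which c(T) is half the Dulong-Petit value."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if specific_heat_ratio(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def debye_frequency_thz(theta_d_kelvin):
    """k_B * Theta_D / h in THz (assumed convention: an ordinary frequency)."""
    return K_B * theta_d_kelvin / H_PLANCK / 1e12

high_t = specific_heat_ratio(50.0)                   # approaches 1 (Dulong-Petit)
low_t = specific_heat_ratio(0.02)                    # follows the T^3 law
low_t_law = (4.0 * math.pi**4 / 5.0) * 0.02**3       # (12 pi^4 / 5)(T/Theta)^3 / 3
t_half = half_dulong_petit_temperature()
print(f"high T: {high_t:.5f}, low T ratio to T^3 law: {low_t / low_t_law:.6f}")
print(f"c is half the Dulong-Petit value at T/Theta_D = {t_half:.4f}")
```

Given a measured temperature at which the specific heat reaches half the Dulong–Petit value, dividing it by `t_half` would give an estimate of $\Theta_D$ in the spirit of the fitting procedure described above.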

6.4.4 Thermal expansion coefficient

One physical effect which is related to phonon modes and to the specific heat is the thermal expansion of a solid. The thermal expansion coefficient $\alpha$ is defined as the rate of change of the linear dimension $L$ of the solid with temperature $T$, normalized by this linear dimension, at constant pressure $P$:

$$\alpha \equiv \frac{1}{L} \left( \frac{\partial L}{\partial T} \right)_P \tag{6.66}$$

The linear dimension of the solid $L$ is related to its volume $\Omega$ through $L = \Omega^{1/3}$, which gives for the thermal expansion coefficient in terms of the volume

$$\alpha = \frac{1}{3\Omega} \left( \frac{\partial \Omega}{\partial T} \right)_P \tag{6.67}$$

This can also be expressed in terms of the bulk modulus $B$, the negative inverse of which is the rate of change of the volume with pressure, normalized by the volume, at constant temperature:

$$B^{-1} \equiv -\frac{1}{\Omega} \left( \frac{\partial \Omega}{\partial P} \right)_T \tag{6.68}$$

(see also the equivalent definition given in chapter 5, Eq. (5.87)). With the help of standard thermodynamic relations (see Appendix C) we obtain for the thermal expansion coefficient, in terms of the bulk modulus and the pressure

$$\alpha = \frac{1}{3B} \left( \frac{\partial P}{\partial T} \right)_\Omega \tag{6.69}$$

The pressure is given by the negative rate of change of the internal energy $E$ with volume, at constant temperature

$$P = -\left( \frac{\partial E}{\partial \Omega} \right)_T \tag{6.70}$$

When changes in the internal energy $E$ are exclusively due to phonon excitations, we can use $E$ from Eq. (6.53) to obtain for the thermal expansion coefficient

$$\alpha = \frac{1}{3B} \sum_{\mathbf{k},l} \left( -\frac{\partial \hbar\omega_{\mathbf{k}}^{(l)}}{\partial \Omega} \right)_T \left( \frac{\partial \bar{n}_{\mathbf{k}}^{(l)}}{\partial T} \right)_\Omega \tag{6.71}$$

which involves the same quantities as the expression for the specific heat $c$, Eq. (6.55), that is, the phonon frequencies $\omega_{\mathbf{k}}^{(l)}$ and the corresponding average occupation numbers $\bar{n}_{\mathbf{k}}^{(l)}$ at temperature $T$. In fact, it is customary to express the thermal

expansion coefficient in terms of the specific heat as

$$\alpha = \frac{\gamma c}{3B} \tag{6.72}$$

where the coefficient $\gamma$ is known as the Grüneisen parameter. This quantity is given by

$$\gamma = \frac{1}{c} \sum_{\mathbf{k},l} \left( -\frac{\partial \hbar\omega_{\mathbf{k}}^{(l)}}{\partial \Omega} \right) \frac{\partial \bar{n}_{\mathbf{k}}^{(l)}}{\partial T} = -\Omega \left[ \sum_{\mathbf{k},l} \frac{\partial \hbar\omega_{\mathbf{k}}^{(l)}}{\partial \Omega}\, \frac{\partial \bar{n}_{\mathbf{k}}^{(l)}}{\partial T} \right] \left[ \sum_{\mathbf{k},l} \hbar\omega_{\mathbf{k}}^{(l)}\, \frac{\partial \bar{n}_{\mathbf{k}}^{(l)}}{\partial T} \right]^{-1} \tag{6.73}$$

We can simplify the notation in this equation by defining the contributions to the specific heat from each phonon mode as

$$c_{\mathbf{k}}^{(l)} = \hbar\omega_{\mathbf{k}}^{(l)}\, \frac{\partial \bar{n}_{\mathbf{k}}^{(l)}}{\partial T} \tag{6.74}$$

and the mode-specific Grüneisen parameters as

$$\gamma_{\mathbf{k}}^{(l)} = -\frac{\Omega}{\omega_{\mathbf{k}}^{(l)}}\, \frac{\partial \omega_{\mathbf{k}}^{(l)}}{\partial \Omega} = -\frac{\partial (\ln \omega_{\mathbf{k}}^{(l)})}{\partial (\ln \Omega)} \tag{6.75}$$

in terms of which the Grüneisen parameter takes the form

$$\gamma = \left[ \sum_{\mathbf{k},l} \gamma_{\mathbf{k}}^{(l)}\, c_{\mathbf{k}}^{(l)} \right] \left[ \sum_{\mathbf{k},l} c_{\mathbf{k}}^{(l)} \right]^{-1} \tag{6.76}$$

The Grüneisen parameter is a quantity that can be measured directly by experiment.

Important warning. We should alert the reader to the fact that so far we have taken into account only the excitation of phonons as contributing to changes in the internal energy of the solid. This is appropriate for semiconductors and insulators where electron excitation across the band gap is negligible for usual temperatures: for these solids, for temperatures as high as their melting point, the thermal energy is still much lower than the band gap energy, the minimum energy required to excite electrons from their ground state. For metals, on the other hand, the excitation of electrons is as important as that of phonons; when it is included in the picture, it gives different behavior of the specific heat and the thermal expansion coefficient at low temperature. These effects can be treated explicitly by modeling the electrons as a Fermi liquid, which requires a many-body picture; a detailed description of the thermodynamics of the Fermi liquid is given in Fetter and Walecka [17].
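Equation (6.76) says that the Grüneisen parameter is an average of the mode-specific parameters weighted by each mode's contribution to the specific heat. A minimal sketch of that bookkeeping is given below; the function names and the sample numbers are hypothetical and chosen only to exercise the formulas, not taken from any material. The mode heat capacity follows from Eqs. (6.54) and (6.74), which give $c_{\mathbf{k}}^{(l)}/k_B = x^2 e^x/(e^x - 1)^2$ with $x = \hbar\omega_{\mathbf{k}}^{(l)}/k_B T$.

```python
import math

def mode_heat_capacity(x):
    """c_k / k_B from Eqs. (6.54) and (6.74), with x = hbar*omega / (k_B T) > 0."""
    one_minus_exp = -math.expm1(-x)  # 1 - e^{-x}, avoids overflow for large x
    return x * x * math.exp(-x) / (one_minus_exp * one_minus_exp)

def gruneisen_parameter(mode_gammas, mode_heat_capacities):
    """Eq. (6.76): heat-capacity-weighted average of the mode Grueneisen parameters."""
    weighted = sum(g * c for g, c in zip(mode_gammas, mode_heat_capacities))
    return weighted / sum(mode_heat_capacities)

def thermal_expansion(gamma, specific_heat, bulk_modulus):
    """Eq. (6.72): alpha = gamma * c / (3 B), in whatever consistent units are used."""
    return gamma * specific_heat / (3.0 * bulk_modulus)

# Hypothetical two-mode example: a soft mode (x = 0.5) and a stiff mode (x = 3.0).
reduced_frequencies = [0.5, 3.0]
mode_gammas = [1.0, 2.0]
mode_cs = [mode_heat_capacity(x) for x in reduced_frequencies]
gamma = gruneisen_parameter(mode_gammas, mode_cs)
print(f"mode heat capacities (units of k_B): {mode_cs}")
print(f"weighted Grueneisen parameter: {gamma:.4f}")
```

Because the stiff mode is partly frozen out at this temperature, its weight is smaller and the average sits closer to the soft mode's value; as the temperature rises all weights approach $k_B$ and the result tends to the plain average.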

6.5 Application: phonon scattering

We consider next the effects of scattering of particles or waves incident on a crystal, in the presence of phonon excitations. In fact, phonon frequencies are measured experimentally by inelastic neutron scattering. Neutrons are typically scattered by atomic nuclei in the solid; an inelastic collision involves changes in the total energy and total momentum, which imply a change in the number of phonons that are excited in the solid. Specifically, the change in energy will be given by

$$\Delta E = \sum_{\mathbf{k},l} \hbar\omega_{\mathbf{k}}^{(l)}\, \Delta n_{\mathbf{k}}^{(l)} \tag{6.77}$$

where $n_{\mathbf{k}}^{(l)}$ is the number of phonons of frequency $\omega_{\mathbf{k}}^{(l)}$ that are excited. Similarly, the change in momentum is given by

$$\Delta\mathbf{p} = \sum_{\mathbf{k},l} \hbar(\mathbf{k} + \mathbf{G})\, \Delta n_{\mathbf{k}}^{(l)} \tag{6.78}$$

For processes that involve a single phonon we will have for the energy and momentum before ($E$, $\mathbf{p}$) and after ($E'$, $\mathbf{p}'$) a collision:

$$\begin{aligned} E' &= E + \hbar\omega_{\mathbf{k}}^{(l)}, & \mathbf{p}' &= \mathbf{p} + \hbar(\mathbf{k} + \mathbf{G}) & &\text{(absorption)} \\ E' &= E - \hbar\omega_{\mathbf{k}}^{(l)}, & \mathbf{p}' &= \mathbf{p} - \hbar(\mathbf{k} + \mathbf{G}) & &\text{(emission)} \end{aligned} \tag{6.79}$$

Using the fact that $\omega_{\mathbf{k} \pm \mathbf{G}}^{(l)} = \omega_{\mathbf{k}}^{(l)}$, we obtain the following relation from conservation of energy:

$$\frac{\mathbf{p}'^2}{2M_n} = \frac{\mathbf{p}^2}{2M_n} \pm \hbar\omega_{(\mathbf{p}' - \mathbf{p})/\hbar}^{(l)} \tag{6.80}$$

with (+) corresponding to absorption and (−) to emission of a phonon. In experiment $\mathbf{p}$ is the momentum of the incident neutron beam and $\mathbf{p}'$ is the momentum of the scattered beam, and $M_n$ the mass of the neutron. Eq. (6.80) has solutions only if $\omega_{(\mathbf{p} - \mathbf{p}')/\hbar}^{(l)}$ corresponds to a phonon frequency $\omega_{\mathbf{k}}^{(l)}$. In reality it is impossible to separate the single-phonon events from those that involve many phonons, for which we cannot apply the energy and momentum conservation equations separately. Thus, for every value of the energy and momentum of neutrons there will be a broad background, corresponding to multiphonon processes. However, the flux of scattered neutrons as a function of their energy (which is given by $\mathbf{p}'^2/2M_n$) will exhibit sharp peaks in certain directions $\hat{\mathbf{p}}'$ which correspond to phonon frequencies that satisfy Eq. (6.80). From these peaks, one determines the phonon spectrum by scanning the energy of the scattered neutron beam along different directions. The type of signal obtained from such experiments is shown schematically in Fig. 6.6.
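The single-phonon bookkeeping of Eqs. (6.79) and (6.80) can be written out explicitly: from the incident and scattered neutron wave-vectors one obtains the phonon energy, whether the phonon was absorbed or emitted, and the wave-vector transfer, which fixes the phonon wave-vector up to a reciprocal-lattice vector. The sketch below is an illustration under stated assumptions: the function names and the sample wave-vectors are hypothetical, units are meV and inverse Ångström, and the physical constants are the SI values.

```python
HBAR = 1.054571817e-34            # reduced Planck constant in J s
NEUTRON_MASS = 1.67492749804e-27  # neutron mass in kg
MEV_IN_JOULE = 1.602176634e-22    # 1 meV expressed in J
PER_ANGSTROM = 1.0e10             # 1/Angstrom expressed in 1/m

def neutron_energy_mev(k):
    """Kinetic energy p^2 / (2 M_n) in meV for a wave-vector k given in 1/Angstrom."""
    k_squared = sum(component * component for component in k) * PER_ANGSTROM**2
    return HBAR**2 * k_squared / (2.0 * NEUTRON_MASS) / MEV_IN_JOULE

def single_phonon_event(k_in, k_out):
    """Classify a single-phonon event following Eqs. (6.79) and (6.80).

    Returns (process, phonon_energy_mev, transfer), where transfer = (p' - p)/hbar
    equals +(k + G) for absorption and -(k + G) for emission of a phonon.
    """
    energy_change = neutron_energy_mev(k_out) - neutron_energy_mev(k_in)
    process = "absorption" if energy_change > 0 else "emission"
    transfer = tuple(b - a for a, b in zip(k_in, k_out))
    return process, abs(energy_change), transfer

# Hypothetical geometry: the neutron enters along x and leaves along y with less energy.
k_incident = (2.0, 0.0, 0.0)
k_scattered = (0.0, 1.0, 0.0)
process, phonon_energy, transfer = single_phonon_event(k_incident, k_scattered)
print(process, f"{phonon_energy:.3f} meV", transfer)
```

A peak in the measured flux at this scattered energy and direction would then assign the frequency `phonon_energy / hbar` to the phonon whose wave-vector is the transfer reduced to the first BZ.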

[Figure 6.6. Schematic representation of the cross-section $\sigma(\omega)$ of inelastic neutron scattering experiments as a function of $\omega$. The broad background is due to multiphonon processes; the individual peaks correspond to single-phonon events from which the phonon frequencies are determined. The width of the single-phonon peaks is due to anharmonic processes.]

6.5.1 Phonon scattering processes

In order to provide a more quantitative picture of phonon scattering we consider the scattering of an electron in a plane wave state from a solid in the presence of phonons. We will denote the incident wave-vector of the electron by $\mathbf{k}$ and the scattered wave-vector by $\mathbf{k}'$. The matrix element $M_{\mathbf{k}\mathbf{k}'}$ for the scattering process will be given by the expectation value of the scattering potential, that is, the potential of all the ions in the crystal, $V_{cr}(\mathbf{r})$, between initial and final states:

$$M_{\mathbf{k}\mathbf{k}'} = \frac{1}{N\Omega_{PUC}} \int e^{-i\mathbf{k}'\cdot\mathbf{r}}\, V_{cr}(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{r}, \qquad V_{cr}(\mathbf{r}) = \sum_{nj} V_{at}(\mathbf{r} - \mathbf{t}_j - \mathbf{R}_n) \tag{6.81}$$

where $V_{at}(\mathbf{r} - \mathbf{t}_j - \mathbf{R}_n)$ is the ionic potential of an atom situated at $\mathbf{t}_j$ in the unit cell of the crystal at the lattice vector $\mathbf{R}_n$; $N$ is the total number of unit cells in the crystal, each of volume $\Omega_{PUC}$. For simplicity, we will assume that the solid contains only one type of atoms, so there is only one type of ionic potential $V_{at}(\mathbf{r})$. Inserting the expression for the crystal potential in the scattering matrix element, we obtain

$$M_{\mathbf{k}\mathbf{k}'} = V_{at}(\mathbf{q})\, \frac{1}{N} \sum_{nj} e^{-i\mathbf{q}\cdot(\mathbf{t}_j + \mathbf{R}_n)} \tag{6.82}$$

$$V_{at}(\mathbf{q}) = \frac{1}{\Omega_{PUC}} \int V_{at}(\mathbf{r})\, e^{-i\mathbf{q}\cdot\mathbf{r}}\, d\mathbf{r} \tag{6.83}$$

with $V_{at}(\mathbf{q})$ the Fourier transform of the ionic potential $V_{at}(\mathbf{r})$. In the above expressions we have introduced the vector $\mathbf{q} = \mathbf{k}' - \mathbf{k}$. If this vector happens to be equal

to a reciprocal-lattice vector $\mathbf{G}$, the scattering matrix element takes the form

$$M_{\mathbf{k}\mathbf{k}'} = V_{at}(\mathbf{G})\, \frac{1}{N} \sum_{nj} e^{-i\mathbf{G}\cdot\mathbf{t}_j} = V_{at}(\mathbf{G})\, S(\mathbf{G}) \tag{6.84}$$

where we have identified the sum over the atoms in the unit cell $j$ as the structure factor $S(\mathbf{G})$, the quantity we had defined in chapter 4, Eq. (4.50); the other sum over the crystal cells $n$ is canceled in this case by the factor $1/N$. If $\mathbf{q} \neq \mathbf{G}$, we can generalize the definition of the structure factor as follows:

$$S(\mathbf{q}) \equiv \frac{1}{N} \sum_{nj} e^{-i\mathbf{q}\cdot(\mathbf{t}_j + \mathbf{R}_n)} \tag{6.85}$$

in which case the scattering matrix element takes the form

$$M_{\mathbf{k}\mathbf{k}'} = V_{at}(\mathbf{q})\, S(\mathbf{q}) \tag{6.86}$$

We are now interested in determining the behavior of this matrix element when, due to thermal motion, the ions are not at their ideal crystalline positions. Obviously, only the structure factor $S(\mathbf{q})$ is affected by this departure of the ions from their ideal positions, so we examine its behavior. The thermal motion leads to deviations from the ideal crystal positions, which we denote by $\mathbf{S}_{nj}$. With these deviations, the structure factor takes the form

$$S(\mathbf{q}) = \frac{1}{N} \sum_{n,j} e^{-i\mathbf{q}\cdot(\mathbf{t}_j + \mathbf{R}_n)}\, e^{-i\mathbf{q}\cdot\mathbf{S}_{nj}} \tag{6.87}$$

For simplicity, we will assume from now on that there is only one atom in each unit cell, which allows us to eliminate the summation over the index $j$. We can express the deviations of the ions from their crystalline positions $\mathbf{S}_n$ in terms of the amplitudes of phonons $Q_{\mathbf{q}'}^{(l)}$ and the corresponding eigenvectors $\hat{\mathbf{e}}_{\mathbf{q}'}^{(l)}$, with $l$ denoting the phonon branch:

$$\mathbf{S}_n = \sum_{l,\mathbf{q}'} Q_{\mathbf{q}'}^{(l)}\, \hat{\mathbf{e}}_{\mathbf{q}'}^{(l)}\, e^{i\mathbf{q}'\cdot\mathbf{R}_n} \tag{6.88}$$

and with this expression the structure factor becomes

$$S(\mathbf{q}) = \frac{1}{N} \sum_n e^{-i\mathbf{q}\cdot\mathbf{R}_n} \exp\left[ -i\mathbf{q} \cdot \sum_{l,\mathbf{q}'} \left( Q_{\mathbf{q}'}^{(l)}\, \hat{\mathbf{e}}_{\mathbf{q}'}^{(l)} \right) e^{i\mathbf{q}'\cdot\mathbf{R}_n} \right] \tag{6.89}$$

We will take the deviations from ideal positions to be small quantities and the corresponding phonon amplitudes to be small. To simplify the calculations, we define the vectors $\mathbf{f}_{n\mathbf{q}'}^{(l)}$ as

$$\mathbf{f}_{n\mathbf{q}'}^{(l)} = Q_{\mathbf{q}'}^{(l)}\, \hat{\mathbf{e}}_{\mathbf{q}'}^{(l)}\, e^{i\mathbf{q}'\cdot\mathbf{R}_n} \tag{6.90}$$

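which, by our assumption above, will have small magnitude. Using this fact, we can expand the exponential with square brackets in Eq. (6.89) as follows:

$$\exp\left[ -i\mathbf{q} \cdot \sum_{l,\mathbf{q}'} \mathbf{f}_{n\mathbf{q}'}^{(l)} \right] = \prod_{l,\mathbf{q}'} \exp\left[ -i\mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right] = \prod_{l,\mathbf{q}'} \left[ 1 - i\mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} - \frac{1}{2} \left( \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right)^2 + \cdots \right] \tag{6.91}$$

Keeping only terms up to second order in $|\mathbf{f}_{n\mathbf{q}'}^{(l)}|$ in the last expression gives the following result:

$$1 - i\mathbf{q} \cdot \sum_{l,\mathbf{q}'} \mathbf{f}_{n\mathbf{q}'}^{(l)} - \frac{1}{2} \sum_{l,\mathbf{q}'} \left( \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right)^2 - \sum_{l\mathbf{q}' < l'\mathbf{q}''} \left( \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right) \left( \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}''}^{(l')} \right) = 1 - i\mathbf{q} \cdot \sum_{l,\mathbf{q}'} \mathbf{f}_{n\mathbf{q}'}^{(l)} - \frac{1}{2} \sum_{ll',\mathbf{q}'\mathbf{q}''} \left( \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right) \left( \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}''}^{(l')} \right) \tag{6.92}$$

Let us consider the physical meaning of these terms by order, when the above expression is substituted into Eq. (6.89).

(0) The zeroth order term in Eq. (6.92) gives

$$S_0(\mathbf{q}) = \frac{1}{N} \sum_n e^{-i\mathbf{q}\cdot\mathbf{R}_n} \tag{6.93}$$

which is the structure factor for a crystal with the atomic positions frozen at the ideal crystal sites, that is, in the absence of any phonon excitations.

(1) The first order term in Eq. (6.92) gives

$$-i\mathbf{q} \cdot \sum_{\mathbf{q}'l} \frac{1}{N} \sum_n \mathbf{f}_{n\mathbf{q}'}^{(l)}\, e^{-i\mathbf{q}\cdot\mathbf{R}_n} = -i\mathbf{q} \cdot \sum_{\mathbf{q}'l} Q_{\mathbf{q}'}^{(l)}\, \hat{\mathbf{e}}_{\mathbf{q}'}^{(l)}\, \frac{1}{N} \sum_n e^{i(\mathbf{q}' - \mathbf{q})\cdot\mathbf{R}_n} = -i\mathbf{q} \cdot \sum_{\mathbf{q}'l} Q_{\mathbf{q}'}^{(l)}\, \hat{\mathbf{e}}_{\mathbf{q}'}^{(l)}\, \delta(\mathbf{q}' - \mathbf{q} - \mathbf{G}) = -i \sum_l \left( \mathbf{q}\cdot\hat{\mathbf{e}}_{\mathbf{q}+\mathbf{G}}^{(l)} \right) Q_{\mathbf{q}+\mathbf{G}}^{(l)} \tag{6.94}$$

This is an interesting result: it corresponds to the scattering from a single-phonon mode, with wave-vector $\mathbf{q}$ and amplitude $Q_{\mathbf{q}+\mathbf{G}}^{(l)}$, and involves the projection of the scattering wave-vector $\mathbf{q}$ onto the polarization $\hat{\mathbf{e}}_{\mathbf{q}+\mathbf{G}}^{(l)}$ of this phonon. If $\mathbf{G} = 0$, these processes are called normal single-phonon scattering processes; if $\mathbf{G} \neq 0$ they are called Umklapp processes. The latter usually contribute less to scattering, so we will ignore them in the following.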

(6. the corresponding phonon amplitudes and projections of the scattering wave-vector onto the phonon polarizations are also involved. k’=k+q k’=k+q q-q’ q k q’ k’’=k+q’ k Figure 6.7. with q the total scattering wave-vector. For example. .6. while the wave-vectors of the phonons are represented by wavy lines of the proper direction and magnitude to satisfy momentum conservation by vector addition. as illustrated in Fig.and two-phonon scattering processes.7. 6. This set of processes can be represented in a more graphical manner in terms of diagrams.7. one of wave-vector q . Diagrams for the one. the other of wave-vector q − q . 6. The incident and scattered wave-vectors are represented by normal vectors.95) This expression can be interpreted as the scattering from two phonons.92) gives 231 − 1 1 2 ll q q N =− =− =− l) (l ) −iq·Rn (q · f( n q )(q · fn q )e n 1 l ) (l ) l ) (l ) 1 ˆ q )(q · Q ( ˆq ) (q · Q ( q e q e 2 ll q q N ei(q +q n −q)·Rn 1 l ) (l ) l ) (l ) ˆ q )(q · Q ( ˆ q )δ (q + q − q) (q · Q ( q e q e 2 ll q q 1 2 l ) (l ) l) (l ) ˆ q )(q · Q ( ˆq (q · Q ( q e q−q e −q ) ll q (6. and hence the scattering cross-section. (i) Every vertex where a phonon wave-vector q intersects other wave-vectors in- troduces a factor of (−i) l l) l) ˆ( Q( q q·e q . Simple rules can be devised to make the connection between such diagrams and the expressions in the equations above that give their contribution to the structure factor.5 Application: phonon scattering (2) The second order term in Eq. the following simple rules would generate the terms we calculated above from the diagrams shown in Fig.

(ii) A factor of $(1/m!)$ accompanies the term of order $m$ in $\mathbf{q}$.

(iii) When an intermediate phonon wave-vector $\mathbf{q}'$, other than the final scattering phonon wave-vector $\mathbf{q}$, appears in a diagram, it is accompanied by a sum over all allowed values.

These rules can be applied to generate diagrams of higher orders, which we have not discussed here. The use of diagrammatic techniques simplifies many calculations of the type outlined above. For perturbative calculations in many-body interacting systems such techniques are indispensable.

6.5.2 The Debye–Waller factor

For an arbitrary state of the system, we are not interested in individual phonon scattering processes but rather in the thermal average over such events. We are also typically interested in the absolute value squared of the scattering matrix element, which enters in the expression for the cross-section, the latter being the experimentally measurable quantity. When we average over the sums that appear in the absolute value squared of the structure factor, the contribution that survives is

$$|S(\mathbf{q})|^2 = |S_0(\mathbf{q})|^2 \left[ 1 - \frac{1}{N} \sum_{l\mathbf{q}'} \left| \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right|^2 \right] \tag{6.96}$$

Taking advantage again of the fact that the magnitude of the $\mathbf{f}_{n\mathbf{q}'}^{(l)}$ vectors is small, we can approximate the factor in square brackets by

$$e^{-2W} \approx 1 - \frac{1}{N} \sum_{l\mathbf{q}'} \left| \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right|^2 \;\Longrightarrow\; W = \frac{1}{2N} \sum_{l\mathbf{q}'} \left| \mathbf{q}\cdot\mathbf{f}_{n\mathbf{q}'}^{(l)} \right|^2 \tag{6.97}$$

The quantity $W$ is called the Debye–Waller factor. Note that in the expression for $W$ the index $n$ of $\mathbf{f}_{n\mathbf{q}'}^{(l)}$ is irrelevant since it only appears in the complex exponential $\exp(i\mathbf{q}'\cdot\mathbf{R}_n)$ which is eliminated by the absolute value; thus, $W$ contains no dependence on $n$. Substituting the expression of $\mathbf{f}_{n\mathbf{q}'}^{(l)}$ from Eq. (6.90) we find

$$W = \frac{1}{2N} \sum_{l\mathbf{q}'} \left| \mathbf{q}\cdot Q_{\mathbf{q}'}^{(l)}\, \hat{\mathbf{e}}_{\mathbf{q}'}^{(l)} \right|^2 \tag{6.98}$$

We will assume next that the amplitudes $|Q_{\mathbf{q}'}^{(l)}|^2$ are independent of the phonon mode $l$, which leads to

$$\sum_{l\mathbf{q}'} \left| \mathbf{q}\cdot Q_{\mathbf{q}'}^{(l)}\, \hat{\mathbf{e}}_{\mathbf{q}'}^{(l)} \right|^2 = \sum_{\mathbf{q}'} |Q_{\mathbf{q}'}|^2 \sum_l \left| \mathbf{q}\cdot\hat{\mathbf{e}}_{\mathbf{q}'}^{(l)} \right|^2 = |\mathbf{q}|^2 \sum_{\mathbf{q}'} |Q_{\mathbf{q}'}|^2 \tag{6.99}$$

The last equality in Eq. (6.99) holds because in our simple example of one atom per unit cell the polarization vectors $\hat{\mathbf{e}}^{(l)}_{\mathbf{q}'}$ cover the same space as the cartesian coordinates; therefore,

$$
\sum_{l}\left|\mathbf{q}\cdot\hat{\mathbf{e}}^{(l)}_{\mathbf{q}'}\right|^2=|\mathbf{q}|^2
$$

With these simplifications, $W$ takes the form

$$
W=\frac{1}{2N}|\mathbf{q}|^2\sum_{\mathbf{q}'}|Q_{\mathbf{q}'}|^2
\tag{6.100}
$$

As we discussed earlier, phonons can be viewed as harmonic oscillators, for which the average kinetic and potential energies are equal and each is one-half of the total energy. This gives

$$
M\omega_{\mathbf{q}'}^2|Q_{\mathbf{q}'}|^2=\left(n_{\mathbf{q}'}+\tfrac{1}{2}\right)\hbar\omega_{\mathbf{q}'}
\;\Longrightarrow\;
|Q_{\mathbf{q}'}|^2=\frac{\left(n_{\mathbf{q}'}+\tfrac{1}{2}\right)\hbar}{M\omega_{\mathbf{q}'}}
\tag{6.101}
$$

with $\omega_{\mathbf{q}'}$ the phonon frequencies, $M$ the mass of the atoms and $n_{\mathbf{q}'}$ the phonon occupation numbers. When this is substituted in the expression for $W$ it leads to

$$
W=\frac{1}{2N}|\mathbf{q}|^2\sum_{\mathbf{q}'}\frac{\left(n_{\mathbf{q}'}+\tfrac{1}{2}\right)\hbar}{M\omega_{\mathbf{q}'}}
=\frac{\hbar^2|\mathbf{q}|^2}{2MN}\sum_{\mathbf{q}'}\left(n_{\mathbf{q}'}+\tfrac{1}{2}\right)\frac{1}{\hbar\omega_{\mathbf{q}'}}
\tag{6.102}
$$

As usual, we will turn the sum over $\mathbf{q}'$ into an integral over the first BZ. We will also employ the Debye approximation, in which all phonon frequencies are given by $\omega_{\mathbf{q}'}=v|\mathbf{q}'|$ with the same average $v$; there is a maximum frequency $\omega_D$, and related to it is the Debye temperature, $k_B\Theta_D=\hbar\omega_D$. This approximation leads to the following expression for the Debye–Waller factor:

$$
W=\frac{3\hbar^2|\mathbf{q}|^2T^2}{2Mk_B\Theta_D^3}\int_0^{\Theta_D/T}\left[\frac{1}{e^t-1}+\frac{1}{2}\right]t\,dt
\tag{6.103}
$$

with $T$ the temperature. The limits of high temperature and low temperature are interesting. At high temperature, the upper limit of the integral in Eq. (6.103) is a very small quantity, so that we can expand the exponential in the integrand in powers of $t$ and evaluate the integral to obtain

$$
W_\infty\equiv\lim_{T\gg\Theta_D}W=\frac{3\hbar^2|\mathbf{q}|^2T}{2Mk_B\Theta_D^2}
\tag{6.104}
$$

which has a strong linear dependence on the temperature. Thus, at high enough temperature when all the phonon modes are excited, the effect of the thermal motion will be manifested strongly in the structure factor.
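The two limits of Eq. (6.103) can be checked numerically. The sketch below is not part of the original derivation: it measures $W$ in units of $3\hbar^2|\mathbf{q}|^2/(2Mk_B\Theta_D)$, so that the only variable is the reduced temperature $\tau=T/\Theta_D$, and it uses a plain midpoint rule for the integral. The function names and the choice of units are assumptions made for illustration.

```python
import math

def debye_waller_integral(x, steps=20000):
    """Midpoint-rule estimate of the integral in Eq. (6.103):
    I(x) = integral from 0 to x of [1/(e^t - 1) + 1/2] * t dt."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoints avoid evaluating 1/(e^t - 1) at t = 0
        total += (1.0 / math.expm1(t) + 0.5) * t
    return total * h

def reduced_w(tau):
    """W in units of 3*hbar^2*|q|^2 / (2*M*k_B*Theta_D), with tau = T/Theta_D."""
    return tau**2 * debye_waller_integral(1.0 / tau)

high = reduced_w(20.0)  # T >> Theta_D: approaches tau, i.e. W grows linearly with T
low = reduced_w(0.02)   # T << Theta_D: approaches 1/4, the zero-point contribution

print(f"tau = 20   : W/W_unit = {high:.4f}")
print(f"tau = 0.02 : W/W_unit = {low:.5f}")
```

In these units the high-temperature result of Eq. (6.104) is simply $W=\tau$, while at low temperature the $\tfrac{1}{2}$ term in the integrand leaves a finite value close to $1/4$.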

In the high-temperature limit the structure factor, in absolute value squared, will be multiplied by a factor

$$
e^{-2W_\infty}=\exp\left[-\frac{3\hbar^2|\mathbf{q}|^2T}{Mk_B\Theta_D^2}\right]
\tag{6.105}
$$

For low temperatures, the upper limit in the integral in Eq. (6.103) is a very large number; the exact result of the integration contains a complicated dependence on temperature, but we see that the term $\tfrac{1}{2}$ inside the bracket in the integrand is now significant. For $T\to 0$, this term provides the dominant contribution to the integral,

$$
W_0\equiv\lim_{T\to 0}W=\frac{3\hbar^2|\mathbf{q}|^2}{8Mk_B\Theta_D}
\tag{6.106}
$$

which is comparable to the value of $W_\infty$ at $T=\Theta_D$. This suggests that even in the limit of zero temperature there will be a significant modification of the structure factor by $\exp(-2W_0)$ relative to its value for a frozen lattice of ions, due to the phonon degrees of freedom; this modification arises from the zero-point motion associated with phonons, i.e. the $\tfrac{1}{2}$ term added to the phonon occupation numbers. This effect is another manifestation of the quantum mechanical nature of phonon excitations in a crystal.

6.5.3 The Mössbauer effect

Another interesting application of phonon scattering is the emission of recoil-less radiation from crystals, known as the Mössbauer effect. Consider a nucleus that can emit γ-rays due to nuclear transitions. In free space, the energy and momentum conservation requirements produce a shift in the γ-ray frequency and a broadening of the spectrum. Both of these changes obscure the nature of the nuclear transition. We can estimate the amount of broadening by the following argument: the momentum of the emitted photon of wave-vector $\mathbf{K}$ is $\hbar\mathbf{K}$, which must be equal to the recoil of the nucleus $M\mathbf{v}$, with $M$ the mass of the nucleus (assuming that the nucleus is initially at rest). The recoil energy of the nucleus is $E_R=M\mathbf{v}^2/2=\hbar^2\mathbf{K}^2/2M$, and the photon frequency is given by $\hbar\omega_0/c=\hbar|\mathbf{K}|$, so that $E_R=\hbar^2\omega_0^2/2Mc^2$. The broadening is then obtained by $E_R=\hbar\Delta\omega$, which gives

$$
\frac{\Delta\omega}{\omega_0}=\left(\frac{E_R}{2Mc^2}\right)^{1/2}
\tag{6.107}
$$

This can be much greater than the natural broadening which comes from the finite lifetime of nuclear processes that give rise to the γ-ray emission in the first place. When the emission of γ-rays takes place within a solid, the presence of phonons can change things significantly. Specifically, much of the photon momentum can be carried by phonons with low energy, near the center of the BZ.

The emission can then be recoil-less, in which case the γ-rays will have a frequency and width equal to their natural values. We examine this process in some detail next.

The transition probability between initial ($|i\rangle$) and final ($\langle f|$) states is given by

$$
P_{i\to f}=\left|\langle f|O(\mathbf{K},\{p\})|i\rangle\right|^2
\tag{6.108}
$$

where $O(\mathbf{K},\{p\})$ is the operator for the transition, $\mathbf{K}$ is the wave-vector of the radiation and $\{p\}$ are nuclear variables that describe the internal state of the nucleus. Due to translational and Galilean invariance, we can write the operator for the transition as $O(\mathbf{K},\{p\})=\exp(-i\mathbf{K}\cdot\mathbf{r})\,O(\{p\})$, where the last operator involves nuclear degrees of freedom only. We exploit this separation to write

$$
P_{i\to f}=\left|\langle f|O(\{p\})|i\rangle\right|^2P_{0\to 0}
\tag{6.109}
$$

where the states $\langle f|$, $|i\rangle$ now refer to nuclear states, and $P_{0\to 0}$ is the probability that the crystal will be left in its ground state, with only phonons of low frequency $\omega_{\mathbf{q}\to 0}$ emitted. The hamiltonian for the crystal degrees of freedom involved in the transition is simply the phonon energy, as derived above, Eq. (6.42). For simplicity, in the following we assume that there is only one type of atom in the crystal and one atom per unit cell, so that all atoms have the same mass $M_i=M$. This allows us to exclude the factor of $1/\sqrt{M}$ from the definition of the coordinates $Q^{(l)}_{\mathbf{q}}(t)$ (since the mass now is the same for all phonon modes), in distinction to what we had done above, see Eq. (6.35); with this new definition, the quantities $Q_s$ have the dimension of length. To simplify the notation, we also combine the $\mathbf{q}$ and $l$ indices into a single index $s$. With these definitions, we obtain the following expression for the phonon hamiltonian:

$$
\mathcal{H}^{phon}=\frac{1}{2}\sum_s\left[M\omega_s^2Q_s^2+M\left(\frac{dQ_s}{dt}\right)^2\right]
\tag{6.110}
$$

which, as discussed before, corresponds to a set of independent harmonic oscillators. The harmonic oscillator coordinates $Q_s$ are related to the atomic displacements $\mathbf{S}$ by

$$
\mathbf{S}=\sum_sQ_s\hat{\mathbf{e}}_s
\tag{6.111}
$$

where $\hat{\mathbf{e}}_s$ is the eigenvector corresponding to the phonon mode with frequency $\omega_s$. The wavefunction of each independent harmonic oscillator in its ground state is

$$
\phi_0^{(s)}(Q_s)=\left(\frac{a_s^2}{\pi}\right)^{1/4}e^{-a_s^2Q_s^2/2}
\tag{6.112}
$$

where the constant $a_s^2=M\omega_s/\hbar$.

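We are now in a position to calculate the probability of recoil-less emission in the ground state of the crystal, at $T=0$, so that all oscillators are in their ground state before and after the transition. This probability is given by

$$
P_{0\to 0}\sim\left|\langle 0|e^{i\mathbf{K}\cdot\mathbf{S}}|0\rangle\right|^2
=\prod_s\left|\langle\phi_0^{(s)}|e^{i\mathbf{K}\cdot\hat{\mathbf{e}}_sQ_s}|\phi_0^{(s)}\rangle\right|^2
=\prod_s\left|\left(\frac{a_s^2}{\pi}\right)^{1/2}\int e^{-a_s^2Q_s^2}e^{i\mathbf{K}\cdot\hat{\mathbf{e}}_sQ_s}\,dQ_s\right|^2
=\prod_se^{-(\mathbf{K}\cdot\hat{\mathbf{e}}_s)^2/2a_s^2}
\tag{6.113}
$$

But there is a simple relation between $1/2a_s^2$ and the average value of $Q_s^2$ in the ground state of the harmonic oscillator:

$$
\frac{1}{2a_s^2}=\left(\frac{a_s^2}{\pi}\right)^{1/2}\int Q_s^2e^{-a_s^2Q_s^2}\,dQ_s
=\langle\phi_0^{(s)}|Q_s^2|\phi_0^{(s)}\rangle=\langle Q_s^2\rangle
\tag{6.114}
$$

with the help of which we obtain

$$
\left|\langle 0|e^{i\mathbf{K}\cdot\mathbf{S}}|0\rangle\right|^2=\exp\left[-\sum_s(\mathbf{K}\cdot\hat{\mathbf{e}}_s)^2\langle Q_s^2\rangle\right]
\tag{6.115}
$$

Now we will assume that all the phonon modes $l$ have the same average $\langle(Q^{(l)}_{\mathbf{q}})^2\rangle$, which, following the same steps as in Eq. (6.99), leads to

$$
\left|\langle 0|e^{i\mathbf{K}\cdot\mathbf{S}}|0\rangle\right|^2=\exp\left[-\sum_{\mathbf{q}}\mathbf{K}^2\langle Q_{\mathbf{q}}^2\rangle\right]
\tag{6.116}
$$

By a similar argument we obtain

$$
\langle\mathbf{S}^2\rangle=\left\langle\Big|\sum_{\mathbf{q},l}Q^{(l)}_{\mathbf{q}}\hat{\mathbf{e}}^{(l)}_{\mathbf{q}}\Big|^2\right\rangle=3\sum_{\mathbf{q}}\langle Q_{\mathbf{q}}^2\rangle
\tag{6.117}
$$

and from this

$$
P_{0\to 0}\sim e^{-\mathbf{K}^2\langle\mathbf{S}^2\rangle/3}
\tag{6.118}
$$

We invoke the same argument as in Eq. (6.101) to express $\langle Q_s^2\rangle$ in terms of the corresponding occupation number $n_s$ and frequency $\omega_s$:

$$
M\omega_s^2\langle Q_s^2\rangle=\left(n_s+\tfrac{1}{2}\right)\hbar\omega_s
\;\Longrightarrow\;
\langle Q_s^2\rangle=\frac{\left(n_s+\tfrac{1}{2}\right)\hbar}{M\omega_s}
\tag{6.119}
$$

This result holds in general for a canonical distribution, from which we can obtain

$$
\frac{1}{3}\mathbf{K}^2\langle\mathbf{S}^2\rangle=\frac{\hbar^2\mathbf{K}^2}{2M}\,\frac{2}{3}\sum_s\frac{n_s+\tfrac{1}{2}}{\hbar\omega_s}
=\frac{2}{3}E_R\sum_s\frac{n_s+\tfrac{1}{2}}{\hbar\omega_s}
\tag{6.120}
$$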
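To get a feeling for the magnitudes involved, the recoil energy $E_R=\hbar^2\omega_0^2/2Mc^2$ can be evaluated directly and inserted in the $T=0$ Debye-model estimate that this derivation arrives at, $P_{0\to 0}\sim\exp(-3E_R/2k_B\Theta_D)$. The sketch below is only illustrative: the γ-ray energy (14.4 keV), nuclear mass (57 amu) and Debye temperature (470 K) are assumed inputs chosen for the example and do not come from the text, and the function names are hypothetical.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K
AMU_EV = 931.494e6     # one atomic mass unit times c^2, in eV

def recoil_energy_ev(e_gamma_ev, mass_amu):
    """E_R = (hbar*omega_0)^2 / (2 M c^2), with hbar*omega_0 the photon energy."""
    return e_gamma_ev**2 / (2.0 * mass_amu * AMU_EV)

def recoilless_probability(e_r_ev, theta_d_kelvin):
    """T = 0 Debye-model estimate: P ~ exp(-3 E_R / (2 k_B Theta_D))."""
    return math.exp(-1.5 * e_r_ev / (K_B_EV * theta_d_kelvin))

# Assumed illustrative inputs (not taken from the text).
e_r = recoil_energy_ev(14.4e3, 57.0)
p00 = recoilless_probability(e_r, 470.0)

print(f"E_R = {e_r * 1e3:.3f} meV, k_B*Theta_D = {K_B_EV * 470.0 * 1e3:.1f} meV")
print(f"P_00 ~ {p00:.3f}")
```

With these inputs $E_R$ is a couple of meV, much smaller than $k_B\Theta_D$, so the estimated probability of recoil-less emission is close to unity.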

In Eq. (6.120), $E_R$ is the recoil energy as before; the expression is valid for a system with a canonical distribution at temperature $T$. At $T=0$, $n_s$ vanishes and if we use the Debye model with $\omega_s=vk$ (i.e. we take an average velocity $v_s=v$), we obtain

$$
\frac{1}{3}\mathbf{K}^2\langle\mathbf{S}^2\rangle=\frac{1}{3}E_R\sum_{\mathbf{q}}\frac{1}{\hbar vk}
\tag{6.121}
$$

from which we can calculate explicitly the sum over $\mathbf{q}$ by turning it into an integral with upper limit $k_D$ (the Debye wave-vector), to obtain

$$
P_{0\to 0}\sim\exp\left[-\frac{3}{2}\frac{E_R}{k_B\Theta_D}\right]
\tag{6.122}
$$

with $\Theta_D$ the Debye temperature. As we have seen before, energy conservation gives $E_R=\hbar^2\omega_0^2/2Mc^2$, which is a fixed value since the value of $\omega_0$ is the natural frequency of the emitted γ-rays. This last expression shows that if the Debye temperature is high enough ($k_B\Theta_D\gg E_R$) then $P_{0\to 0}\sim 1$; in other words, the probability of the transition involving only nuclear degrees of freedom, without having to balance the recoil energy, will be close to unity. This corresponds to recoil-less emission.

Problems

1. Use the Born force-constant model to calculate the phonon frequencies for silicon along the $L-\Gamma-X$ directions, where $L=\frac{\pi}{a}(1,1,1)$ and $X=\frac{2\pi}{a}(1,0,0)$. Take the ratio of the bond stretching and bond bending force constants to be $\kappa_r/\kappa_\theta=16$. Fit the value of $\kappa_r$ to reproduce the experimental value for the highest optical mode at $\Gamma$, which is 15.53 THz, and use this value to obtain the frequencies of the various modes at $X$; compare these with the values given in Table 6.2. Determine the atomic displacements for the normal modes at $\Gamma$, the lowest acoustic branch at $X$ and $L$, and the highest optical branch at $X$ and $L$.
2. Show that the kinetic and potential energy contributions to the energy of a simple harmonic oscillator are equal. Show explicitly (i.e., not by invoking the analogy to a set of independent harmonic oscillators) that the same holds for the kinetic and potential energy contributions to the energy of a collection of phonons, as given in Eq. (6.42).
3. Do phonons carry physical momentum? What is the meaning of the phonon wave-vector $\mathbf{q}$? How does the phonon wave-vector $\mathbf{q}$ enter selection rules in scattering processes, such as a photon of momentum $\mathbf{k}$ scattering off a crystal and emerging with momentum $\mathbf{k}'$ after creating a phonon of momentum $\mathbf{q}$?
4. Show that in the Debye model all mode-specific Grüneisen parameters $\gamma^{(l)}_{\mathbf{k}}$, defined in Eq. (6.75), are equal. Using this fact, find the behavior of the thermal expansion coefficient $\alpha$ as a function of temperature $T$, in the low- and high-temperature limits ($T\to 0$ and $T\gg\Theta_D$, respectively); assume that the bulk modulus $B$ has a very weak temperature dependence.
5. Provide the steps in the derivation of Eq. (6.103) from Eq. (6.102).

7 Magnetic behavior of solids

We begin with a brief overview of magnetic behavior in different types of solids. We first define the terms used to describe the various types of magnetic behavior. A system is called paramagnetic if it has no inherent magnetization, but when subjected to an external field it develops magnetization which is aligned with the field; this corresponds to situations where the microscopic magnetic moments (like the spins) tend to be oriented in the same direction as the external magnetic field. A system is called diamagnetic if it has no inherent magnetization, but when subjected to an external field it develops magnetization that is opposite to the field; this corresponds to situations where the induced microscopic magnetic moments tend to shield the external magnetic field. Finally, a system may exhibit magnetic order even in the absence of an external field. If the microscopic magnetic moments tend to be oriented in the same direction, the system is described as ferromagnetic. A variation on this theme is the situation in which microscopic magnetic moments tend to have parallel orientation but they are not necessarily equal at neighboring sites, which is described as ferrimagnetic behavior. If magnetic moments at neighboring sites tend to point in opposite directions, the system is described as antiferromagnetic. In the latter case there is inherent magnetic order due to the orientation of the microscopic magnetic moments, but the net macroscopic magnetization is zero.

Magnetic order, induced either by an external field or inherent in the system, may be destroyed by thermal effects, that is, the tendency to increase the entropy by randomizing the direction of microscopic magnetic moments. Thus, magnetic phenomena are usually observable at relatively low temperatures where the effect of entropy is not strong enough to destroy magnetic ordering.

A natural way to classify magnetic phenomena is by the origin of the microscopic magnetic moments. This origin can be ascribed to two factors: the intrinsic magnetic moment (spin) of quantum mechanical particles, or the magnetic moment arising from the motion of charged particles in an external electromagnetic field (orbital moment).

In real systems the two sources of magnetic moment, spin and orbital moment, are actually coupled. In particular, the motion of electrons in the electric field of the nuclei leads to what is known as spin-orbit coupling. This coupling produces typically a net moment (total spin) which characterizes the entire atom or ion. In solids, there is another effect that leads to magnetic behavior, namely the motion in an external field of itinerant (metallic) electrons, which are not bound to any particular nucleus.

In order to make the discussion of magnetic phenomena more coherent, we consider first the behavior of systems in which the microscopic magnetic moments are individual spins, whether these are due to single electrons (which have spin $\tfrac{1}{2}$) or to compound objects such as atoms or ions (which can have total spin of various values). Such phenomena include the behavior of magnetic atoms or ions as non-interacting spins in insulators, the behavior of non-interacting electron spins in metals, and the behavior of interacting quantum spins on a lattice; each of these topics is treated in a separate section. In the final section of this chapter we consider the behavior of crystal electrons in an external magnetic field, that is, orbital moments, which gives rise to very interesting classical and quantum phenomena.

7.1 Magnetic behavior of insulators

The magnetic behavior of insulators can usually be discussed in terms of the physics that applies to isolated atoms or ions which have a magnetic moment. In such situations the most important consideration is how the spin and angular momentum of electrons in the electronic shells combine to produce the total spin of the atom or ion. The rules that govern this behavior are the usual quantum mechanical rules for combining spin and angular momentum vectors, supplemented by Hund's rules that specify which states are preferred energetically. There are three quantum numbers that identify the state of an atom or ion, the total spin $S$, the orbital angular momentum $L$ and the total angular momentum $J$ (this assumes that spin-orbit coupling is weak, so that $S$ and $L$ can be independently used as good quantum numbers). Hund's rules determine the values of $S$, $L$ and $J$ for a given electronic shell and a given number of electrons. The first rule states that the spin state is such that the total spin $S$ is maximized. The second rule states that the occupation of angular momentum states $l_z$ in the shell (there are $2l+1$ of them for a shell with nominal angular momentum $l$) is such that the orbital angular momentum $L$ is maximized, for those situations consistent with the first rule. The third rule states that $J=|L-S|$ when the electronic shell is less than half filled ($n\le 2l+1$) and $J=L+S$ when it is more than half filled ($n\ge 2l+1$), where $n$ is the number of electrons in the shell; when the shell is exactly half filled the two expressions happen to give the same result for $J$. Of course, the application of all rules must also be consistent with the Pauli exclusion principle.
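Hund's three rules, as stated above, are mechanical enough to be written as a short routine. The following is a minimal sketch, not taken from the text: the function names `hund_ground_state` and `term_symbol` are hypothetical, half-integer values are kept exact with `fractions.Fraction`, and the routine assumes a single partially filled shell with $l\le 3$.

```python
from fractions import Fraction

TERM_LETTERS = "SPDFGHI"  # spectroscopic letter X for L = 0, 1, ..., 6

def hund_ground_state(l, n):
    """Return (S, L, J) for n electrons in a shell of angular momentum l,
    with 1 <= n <= 2(2l+1), following Hund's three rules."""
    half = 2 * l + 1
    if not 1 <= n <= 2 * half:
        raise ValueError("n must satisfy 1 <= n <= 2(2l+1)")
    # Rule 1: maximize S. Electrons stay unpaired up to half filling;
    # beyond that, each added electron pairs up with an existing one.
    unpaired = n if n <= half else 2 * half - n
    S = Fraction(unpaired, 2)
    # Rule 2: maximize L by occupying l_z = l, l-1, ... in turn.
    # A half-filled shell contributes zero, so only the electrons beyond
    # half filling count in the second half of the shell.
    k = n if n <= half else n - half
    L = sum(l - i for i in range(k))
    # Rule 3: J = |L - S| up to half filling, J = L + S beyond it.
    J = abs(L - S) if n <= half else L + S
    return S, L, J

def term_symbol(l, n):
    """Spectroscopic symbol (2S+1) X J written as plain text, e.g. '4F3/2'."""
    S, L, J = hund_ground_state(l, n)
    return f"{2 * S + 1}{TERM_LETTERS[L]}{J}"

for n in range(1, 7):
    print(n, hund_ground_state(1, n), term_symbol(1, n))
```

Calling `hund_ground_state(1, n)` for `n = 1..6` walks through the filling of a $p$ shell, while `l = 2` and `l = 3` cover the $d$ and $f$ shells.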

In F. L = 0). a pair with opposite spins and the remaining two with the same spin in a total spin S = 1 state. In C. the 2 p shell. 7. consider the elements in the second row of the Periodic Table. l z = 0 and l z = −1 of the 2 p shell are singly . F:( S = 1 2 To illustrate how Hund’s rules work. Ne:( S = 0. in a total-spin S = 1 2 the angular momentum state l z = 1 of the 2 p shell is singly occupied. the corresponding orbital angular momentum is L = 0 and the total angular momentum is J = 0. as shown in Fig. two of the four valence electrons a state with L = 1 and J = | L − S | = 1 2 are in a ﬁlled 2s shell and the other two are in the 2 p shell. two of the three valence electrons are in a ﬁlled 2s shell and one is in state.1. In B. two of the ﬁve valence electrons are in a ﬁlled 2s shell and the other three state. L = 1). Schematic representation of the ﬁlling of the 2s and 2 p electronic shells by electrons according to Hund’s rules. all of them with the same spin in a total spin S = 3 2 angular momentum states l z = 1. L = 0). the angular momentum states l z = 1 and l z = 0 of the 2 p shell are singly occupied. the corresponding orbital angular momentum is L = 0 and the . two of the seven valence electrons are in a ﬁlled 2s shell and the other ﬁve are in the 2 p shell. both with the same spin in a total spin S = 1 state. values are: Li:( S = 1 2 2 3 . The resulting total spin and orbital angular momentum . resulting in a state with L = 1 and J = | L − S | = 0. the angular momentum states l z = 0 and l z = −1 of the 2 p shell are singly occupied and the state l z = +1 is doubly occupied. In Be. two occupied.1. resulting in . C:( S = 1. Be:( S = 0. L = 1). the state. this being the only possibility for the spin.240 lz +1 2p 2s 0 -1 7 Magnetic behavior of solids +1 0 -1 +1 0 -1 +1 0 -1 Li 2p 2s Be B C N O F Ne Figure 7. 
resulting in a state with L = 0 and J = | L − S | = L + S = 3 2 of the six valence electrons are in a ﬁlled 2s shell and the other four are in the 2 p shell. O:( S = 1. L = 0). resulting in a state with L = 1 and J = L + S = 2. L = 0). the opposite spins and the remaining one giving rise to a total spin S = 1 2 . columns I-A through to VII-A. In N. the two valence total angular momentum is J = | L − S | = | L + S | = 1 2 electrons ﬁll the 2s shell with opposite spins in a total spin S = 0 state. the are in the 2 p shell. this being one valence electron is in the 2s shell in a total spin S = 1 2 the only possibility. L = 1). L = 1). in two pairs with state. B:( S = 1 . In Li. In O. N:( S = 2 .

The angular momentum states $l_z=+1$ and $l_z=0$ of the $2p$ shell of F are doubly occupied and the state $l_z=-1$ is singly occupied, resulting in a state with $L=1$ and $J=L+S=\tfrac{3}{2}$. Ne, with closed $2s$ and $2p$ electronic shells, has $S=0$, $L=0$ and therefore also $J=0$.

It is straightforward to derive the corresponding results for the $d$ and $f$ shells, which are given in Table 7.1. We also provide in this table the standard symbols for the various states, as traditionally denoted in spectroscopy: $^{(2S+1)}X_J$, with $X=S,P,D,F,G,H,I$ for $L=0,1,2,3,4,5,6$, by analogy to the usual notation for the nominal angular momentum of the shell, $s,p,d,f,g,h,i$ for $l=0,1,2,3,4,5,6$.

From the above discussion we conclude that the B, C, N and O isolated atoms would have non-zero magnetic moment due to electronic spin states alone. However, when they are close to other atoms, their valence electrons form hybrid orbitals from which bonding and antibonding states are produced which are filled by pairs of electrons of opposite spin in the energetically favorable configurations (see the discussion in chapter 1 of how covalent bonds are formed between such atoms). Li can donate its sole valence electron to become a positive ion with closed electronic shell and F can gain an electron to become a negative ion with closed electronic shell, while Be is already in a closed electronic shell configuration. Thus, none of these atoms would lead to magnetic behavior in solids due to electronic spin alone.

Insulators in which atoms have completely filled electronic shells, such as the noble elements or the purely ionic alkali halides (composed of atoms from columns I and VII of the Periodic Table), are actually the simplest cases: the atoms or ions in these solids differ very little from isolated atoms or ions, because of the stability of the closed electronic shells. The presence of the crystal produces a very minor perturbation to the atomic configuration of electronic shells. The magnetic behavior of noble element solids and alkali halides predicted by the analysis at the individual atom or ion level is in excellent agreement with experimental measurements.

There also exist insulating solids that contain atoms with partially filled $4f$ electronic shells (the lanthanides series, see chapter 1). The electrons in these shells are largely unaffected by the presence of the crystal, because they are relatively tightly bound to the nucleus: for the lanthanides there are no core $f$ electrons and therefore the wavefunctions of these electrons penetrate close to the nucleus. Consequently, insulators that contain such atoms or ions will exhibit the magnetic response of a collection of individual atoms or ions. It is tempting to extend this description to the atoms with partially filled $3d$ electronic shells (the fourth row of the Periodic Table, containing the magnetic elements Fe, Co and Ni); these $d$ electrons also can penetrate close to the nucleus since there are no $d$ electrons in the core. However, these electrons mix more strongly with the other valence electrons and therefore the presence of the crystal has a significant influence on their behavior. The predictions of the magnetic behavior of these solids based on the behavior of the constituent atoms are not very close to experimental observations.

Table 7.1. Atomic spin states according to Hund's rules. Total spin $S$, orbital angular momentum $L$ and total angular momentum $J$ numbers for the $l=2$ ($d$ shell) and $l=3$ ($f$ shell) as they are being filled by $n$ electrons, where $1\le n\le 2(2l+1)$. Of the two expressions for $J$, the $-$ sign applies for $n\le(2l+1)$ and the $+$ sign applies for $n\ge(2l+1)$. The standard spectroscopic symbols, $^{(2S+1)}X_J$, are also given with $X=S,P,D,F,G,H,I$ for $L=0,1,2,3,4,5,6$. For the $3d$-shell transition metals and the $4f$-shell rare earth elements we give the calculated ($p_{th}$) and experimental ($p_{exp}$) values of the effective Bohr magneton number; note the ionization of the various elements (right superscript) which makes them correspond to the indicated state. Source: [74].

| $l$ | $n$ | $S=\lvert\sum s_z\rvert$ | $L=\lvert\sum l_z\rvert$ | $J=\lvert L\mp S\rvert$ | $^{(2S+1)}X_J$ | Ion | $p_{th}$ | $p_{exp}$ |
|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 1/2 | 2 | 3/2 | $^{2}D_{3/2}$ | Ti$^{3+}$ | 1.73$^a$ | 1.8 |
| 2 | 2 | 1 | 3 | 2 | $^{3}F_{2}$ | V$^{3+}$ | 2.83$^a$ | 2.8 |
| 2 | 3 | 3/2 | 3 | 3/2 | $^{4}F_{3/2}$ | Cr$^{3+}$ | 3.87$^a$ | 3.8 |
| 2 | 4 | 2 | 2 | 0 | $^{5}D_{0}$ | Mn$^{3+}$ | 4.90$^a$ | 4.9 |
| 2 | 5 | 5/2 | 0 | 5/2 | $^{6}S_{5/2}$ | Fe$^{3+}$ | 5.92$^a$ | 5.9 |
| 2 | 6 | 2 | 2 | 4 | $^{5}D_{4}$ | Fe$^{2+}$ | 4.90$^a$ | 5.4 |
| 2 | 7 | 3/2 | 3 | 9/2 | $^{4}F_{9/2}$ | Co$^{2+}$ | 3.87$^a$ | 4.8 |
| 2 | 8 | 1 | 3 | 4 | $^{3}F_{4}$ | Ni$^{2+}$ | 2.83$^a$ | 3.2 |
| 2 | 9 | 1/2 | 2 | 5/2 | $^{2}D_{5/2}$ | Cu$^{2+}$ | 1.73$^a$ | 1.9 |
| 2 | 10 | 0 | 0 | 0 | $^{1}S_{0}$ | Zn$^{2+}$ | 0.00 | |
| 3 | 1 | 1/2 | 3 | 5/2 | $^{2}F_{5/2}$ | Ce$^{3+}$ | 2.54 | 2.4 |
| 3 | 2 | 1 | 5 | 4 | $^{3}H_{4}$ | Pr$^{3+}$ | 3.58 | 3.5 |
| 3 | 3 | 3/2 | 6 | 9/2 | $^{4}I_{9/2}$ | Nd$^{3+}$ | 3.62 | 3.5 |
| 3 | 4 | 2 | 6 | 4 | $^{5}I_{4}$ | Pm$^{3+}$ | 2.68 | |
| 3 | 5 | 5/2 | 5 | 5/2 | $^{6}H_{5/2}$ | Sm$^{3+}$ | 0.84 | 1.5 |
| 3 | 6 | 3 | 3 | 0 | $^{7}F_{0}$ | Eu$^{3+}$ | 0.00 | 3.4 |
| 3 | 7 | 7/2 | 0 | 7/2 | $^{8}S_{7/2}$ | Gd$^{3+}$ | 7.94 | 8.0 |
| 3 | 8 | 3 | 3 | 6 | $^{7}F_{6}$ | Tb$^{3+}$ | 9.72 | 9.5 |
| 3 | 9 | 5/2 | 5 | 15/2 | $^{6}H_{15/2}$ | Dy$^{3+}$ | 10.63 | 10.6 |
| 3 | 10 | 2 | 6 | 8 | $^{5}I_{8}$ | Ho$^{3+}$ | 10.60 | 10.4 |
| 3 | 11 | 3/2 | 6 | 15/2 | $^{4}I_{15/2}$ | Er$^{3+}$ | 9.59 | 9.5 |
| 3 | 12 | 1 | 5 | 6 | $^{3}H_{6}$ | Tm$^{3+}$ | 7.57 | 7.3 |
| 3 | 13 | 1/2 | 3 | 7/2 | $^{2}F_{7/2}$ | Yb$^{3+}$ | 4.54 | 4.5 |
| 3 | 14 | 0 | 0 | 0 | $^{1}S_{0}$ | Lu$^{3+}$ | 0.00 | |

$^a$ Values obtained from the quenched total angular momentum expression ($L=0\Rightarrow J=S$), which are in much better agreement with experiment than those from the general expression, Eq. (7.8).

A more elaborate description than one based on the behavior of the constituent atoms, along the lines presented in the following sections, is required for these solids.

In insulating solids whose magnetic behavior arises from individual ions or atoms, there is a common feature in the response to an external magnetic field $H$, known as the Curie law. This response is measured through the magnetic susceptibility, which according to the Curie law is inversely proportional to the temperature. The magnetic susceptibility is defined as

$$
\chi=\frac{\partial M}{\partial H},\qquad M=-\frac{\partial F}{\partial H}
$$

where $M$ is the magnetization and $F$ is the free energy (see Appendix D for a derivation of these expressions from statistical mechanics). In order to prove the Curie law we use a simple model. We assume that the ion or atom responsible for the magnetic behavior has a total angular momentum $J$, so that there are $(2J+1)$ values of its $J_z$ component, $J_z=-J,\ldots,+J$, with the axis $z$ defined by the direction of the external magnetic field, which give rise to the following energy levels for the system:

$$
E_{J_z}=m_0J_zH,\qquad J_z=-J,\ldots,+J
\tag{7.1}
$$

with $m_0$ the magnetic moment of the atoms or ions. The canonical partition function for this system, consisting of $N$ atoms or ions, is given by

$$
Z_N=\left(\sum_{J_z=-J}^{J}e^{-\beta m_0J_zH}\right)^N
=\left(\frac{e^{\beta m_0(J+\frac{1}{2})H}-e^{-\beta m_0(J+\frac{1}{2})H}}{e^{\beta m_0\frac{1}{2}H}-e^{-\beta m_0\frac{1}{2}H}}\right)^N
$$

where $\beta=1/k_BT$ is the inverse temperature. We define the variable

$$
w=\frac{\beta m_0H}{2}
$$

in terms of which the free energy becomes

$$
F=-\frac{1}{\beta}\ln Z_N=-\frac{N}{\beta}\ln\left[\frac{\sinh(2J+1)w}{\sinh w}\right]
$$

which gives for the magnetization $M$ and the susceptibility $\chi$ per unit volume

$$
M=\frac{Nm_0}{2\Omega}\left[(2J+1)\coth(2J+1)w-\coth w\right]
$$

$$
\chi=\frac{Nm_0^2}{4\Omega}\beta\left[\frac{1}{\sinh^2w}-\frac{(2J+1)^2}{\sinh^2(2J+1)w}\right]
$$

with $\Omega$ the total volume containing the $N$ magnetic ions or atoms.
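A quick numerical check makes the Curie behavior of this model concrete. The sketch below evaluates the exact susceptibility per ion, $\chi\Omega/N$, from the $\sinh$ expression above and compares it with the small-$w$ Curie form $J(J+1)m_0^2/3k_BT$. Units are arbitrary but consistent, the parameter values are assumed for illustration only, and the function names are hypothetical.

```python
import math

def susceptibility_per_ion(J, m0, H, kT):
    """Exact chi*Omega/N for the levels E = m0*Jz*H, Jz = -J, ..., +J."""
    beta = 1.0 / kT
    w = 0.5 * beta * m0 * H
    a = 2 * J + 1
    return 0.25 * m0**2 * beta * (1.0 / math.sinh(w) ** 2
                                  - a**2 / math.sinh(a * w) ** 2)

def curie_per_ion(J, m0, kT):
    """Small-w limit of the same quantity: J(J+1) m0^2 / (3 k_B T)."""
    return J * (J + 1) * m0**2 / (3.0 * kT)

# Assumed illustrative parameters (arbitrary units): weak field, m0*H << 2*kT.
J, m0, H = 2.5, 1.0, 0.01
ratios = [susceptibility_per_ion(J, m0, H, kT) / curie_per_ion(J, m0, kT)
          for kT in (1.0, 10.0)]
print(ratios)
```

Both ratios come out very close to one, and the agreement improves as $k_BT$ grows relative to $m_0H$, which is the regime in which the $1/T$ law holds.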

In the limit of small $w$, which we justify below, the lowest few terms in the series expansions of $\coth x$ and $\sinh x$ (see Appendix G) give

$$
M=\frac{N}{\Omega}\,\frac{J(J+1)m_0^2}{3}\,\frac{1}{k_BT}\,H
\tag{7.2}
$$

$$
\chi=\frac{N}{\Omega}\,\frac{J(J+1)m_0^2}{3}\,\frac{1}{k_BT}
\tag{7.3}
$$

where $N/\Omega$ is the number of magnetic ions or atoms per unit volume. This is exactly the form of the Curie law, that is, the susceptibility is inversely proportional to the temperature. In order for this derivation to be valid, we have to make sure that $w$ is much smaller than unity. From the definition of $w$ we obtain the condition

$$
m_0H\ll 2k_BT
\tag{7.4}
$$

Typical values of the magnetic moment $m_0$ (see following discussion) are of order the Bohr magneton, which is defined as

$$
\mu_B\equiv\frac{e\hbar}{2m_ec}=0.579\times 10^{-8}\ \mathrm{eV/G}
$$

The Bohr magneton may also be written as

$$
\mu_B=\frac{e\hbar}{2m_ec}=\frac{e}{2\pi\hbar c}\,\frac{\pi\hbar^2}{m_e}=\frac{2\pi a_0^2}{\phi_0}\ \mathrm{Ry},\qquad\phi_0\equiv\frac{hc}{e}
\tag{7.5}
$$

where $h=2\pi\hbar$ is Planck's constant and $\phi_0$ is the value of the flux quantum, which involves only the fundamental constants $h$, $c$, $e$. From the units involved in the Bohr magneton we see that, even for very large magnetic fields of order $10^4$ G, the product $m_0H$ is of order $10^{-4}$ eV $\sim 1$ K, so the condition of Eq. (7.4) is satisfied reasonably well except at very low temperatures (below 1 K) and very large magnetic fields (larger than $10^4$ G).

The values of the total spin $S$, orbital angular momentum $L$ and total angular momentum $J$ for the state of the atom or ion determine the exact value of the magnetic moment $m_0$. The interaction energy of an electron in a state of total spin $\mathbf{S}$ and orbital angular momentum $\mathbf{L}$, with an external magnetic field $\mathbf{H}$, to lowest order in the field is given by

$$
\langle\mu_B(\mathbf{L}+g_0\mathbf{S})\cdot\mathbf{H}\rangle
\tag{7.6}
$$

where $g_0$ is the electronic g-factor

$$
g_0=2\left[1+\frac{\alpha}{2\pi}+O(\alpha^2)+\cdots\right]=2.0023
$$

In the expression for $g_0$, $\alpha=e^2/\hbar c=1/137$ is the fine-structure constant. The angular brackets in Eq. (7.6) denote the expectation value of the operator in the electron state. Choosing the direction of the magnetic field to be the $z$ axis, we conclude that, in order to evaluate the effect of the magnetic field, we must calculate the expectation values of the operator $(L_z+g_0S_z)$, which, in the basis of states with definite $J$, $L$, $S$ and $J_z$ quantum numbers, are given by

$$
\mu_BH\langle(L_z+g_0S_z)\rangle=\mu_BH\,g(JLS)\,J_z
\tag{7.7}
$$

with $g(JLS)$ the Landé g-factors

$$
g(JLS)=\frac{1}{2}(g_0+1)+\frac{1}{2}(g_0-1)\frac{S(S+1)-L(L+1)}{J(J+1)}
$$

(see Appendix B, Eq. (B.45) with $\lambda=g_0$). Comparing Eq. (7.7) with Eq. (7.1) we conclude that the magnetic moment is given by

$$
m_0=g(JLS)\mu_B\approx\left[\frac{3}{2}+\frac{S(S+1)-L(L+1)}{2J(J+1)}\right]\mu_B
\tag{7.8}
$$

where we have used $g_0\approx 2$, a very good approximation. When this expression is used for the theoretical prediction of the Curie susceptibility, Eq. (7.3), the results compare very well with experimental measurements for rare earth $4f$ ions. Typically, the comparison is made through the so called "effective Bohr magneton number", defined in terms of the magnetic moment $m_0$ as

$$
p=\frac{m_0\sqrt{J(J+1)}}{\mu_B}=g(JLS)\sqrt{J(J+1)}
$$

Values of $p$ calculated from the expression of Eq. (7.8) and obtained from experiment by fitting the measured susceptibility to the expression of Eq. (7.3) are given in Table 7.1. There is only one blatant disagreement between theoretical and experimental values for these rare earth ions, namely Eu, for which $J=0$; in this case, the vanishing value of the total angular momentum means that the linear (lowest order) contribution of the magnetic field to the energy assumed in Eq. (7.6) is not adequate, and higher order contributions are necessary to capture the physics.

When it comes to the effective Bohr magneton number for the $3d$ transition metal ions, the expression of Eq. (7.8) is not successful in reproducing the values measured experimentally. In that case, much better agreement between theory and experiment is obtained if instead we use for the magnetic moment the expression $m_0=g_0\mu_B\approx 2\mu_B$, and take $J=S$ in the expression for the magnetic susceptibility Eq. (7.3), that is, if we set $L=0$. The reason is that in the $3d$ transition metals the presence of the crystal environment strongly affects the valence electrons due to the nature of their wavefunctions, as discussed above, and consequently their total angular momentum quantum numbers are not those determined by Hund's rules, which apply to spherical atoms or ions.
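The effective Bohr magneton numbers follow directly from the Landé g-factor, so the two prescriptions discussed here (the general expression of Eq. (7.8), and the spin-only expression with $L=0$, $J=S$) are easy to compare numerically. This is a minimal sketch with hypothetical function names; it takes $g_0=2$ and treats the quantum numbers as given inputs.

```python
import math
from fractions import Fraction as F

def lande_g(J, L, S, g0=2.0):
    """Lande factor g(JLS) = (g0+1)/2 + (g0-1)/2 * [S(S+1)-L(L+1)]/[J(J+1)], J > 0."""
    ratio = float(S * (S + 1) - L * (L + 1)) / float(J * (J + 1))
    return 0.5 * (g0 + 1.0) + 0.5 * (g0 - 1.0) * ratio

def p_effective(J, L, S):
    """Effective Bohr magneton number p = g(JLS) * sqrt(J(J+1)); zero when J = 0."""
    if J == 0:
        return 0.0
    return lande_g(J, L, S) * math.sqrt(float(J * (J + 1)))

def p_spin_only(S, g0=2.0):
    """Quenched orbital moment: L = 0, J = S, m0 = g0 * mu_B."""
    return g0 * math.sqrt(float(S * (S + 1)))

# Rare earth ions, general expression: Ce3+ (4f1) and Gd3+ (4f7).
print(round(p_effective(F(5, 2), 3, F(1, 2)), 2))   # Ce3+
print(round(p_effective(F(7, 2), 0, F(7, 2)), 2))   # Gd3+
# Cr3+ (3d3): general expression versus spin-only value.
print(round(p_effective(F(3, 2), 3, F(3, 2)), 2), round(p_spin_only(F(3, 2)), 2))
```

For Cr$^{3+}$ the general expression gives a value well below the spin-only one, which is the kind of gap that separates the two prescriptions for the $3d$ ions.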

Apparently, for the $3d$ ions the total spin quantum number is still good, that is, Hund's first rule still holds, and it can be used to describe the magnetic behavior of ions in the crystalline environment, but the orbital angular momentum number is no longer a good quantum number. This phenomenon is referred to as "quenching" of the orbital angular momentum due to crystal field splitting of the levels.

7.2 Magnetic behavior of metals

We consider next the magnetic behavior of metallic electrons. We provide first the general picture and then examine in some detail specific models that put this picture on a quantitative basis.

In the simplest possible picture, we can view the electrons in metals as free fermions of spin $\tfrac{1}{2}$ in an external magnetic field. This leads to paramagnetic behavior which is referred to as Pauli paramagnetism, a phenomenon discussed in detail in Appendix D. All results from that analysis are directly applicable to the electron gas, taking the magnetic moment of each particle to be $m_0=g_0\mu_B$. We quote here some basic results from the discussion of Appendix D: the high-temperature and low-temperature limits of the susceptibility per unit volume are given by

$$
\chi(T\to\infty)=\frac{g_0^2\mu_B^2}{k_BT}\,n,\qquad
\chi(T=0)=\frac{3}{2}\,\frac{g_0^2\mu_B^2}{\epsilon_F}\,n
$$

with $n=N/\Omega$ the density of particles. We see that in the high-temperature limit the susceptibility exhibits behavior consistent with the Curie law. In this case, the relevant scale for the temperature is the Fermi energy, which gives for the condition at which the Curie law should be observed in the free-electron gas,

$$
\epsilon_F=\left(\frac{9\pi}{4}\right)^{2/3}\frac{1}{(r_s/a_0)^2}\ \mathrm{Ry}<k_BT
$$

For typical values of $r_s$ in metals, $(r_s/a_0)\sim 2$–$6$, we find that the temperature must be of order $10^4$–$10^5$ K in order to satisfy this condition, which is too high to be relevant for any real solids. In the opposite extreme, at $T=0$, it is natural to expect that the only important quantities are those related to the filling of the Fermi sphere, that is, the density $n$ and Fermi energy $\epsilon_F$. Using the relation between the density and the Fermi momentum, $k_F^3=3\pi^2n$, which applies to the uniform electron gas in 3D, we can rewrite the susceptibility of the free-electron gas at $T=0$ as

$$
\chi=g_0^2\mu_B^2\,\frac{m_ek_F}{\pi^2\hbar^2}=g_0^2\mu_B^2\,g_F
\tag{7.9}
$$
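The temperature scale quoted for the Curie regime of the free-electron gas can be reproduced in a few lines from $\epsilon_F=(9\pi/4)^{2/3}(r_s/a_0)^{-2}$ Ry. The sketch below is only a numerical illustration with hypothetical function names; the values used for the rydberg and the Boltzmann constant are standard constants supplied here, not taken from the text.

```python
import math

RYDBERG_EV = 13.605693   # 1 Ry in eV
K_B_EV = 8.617333e-5     # Boltzmann constant in eV/K

def fermi_energy_ry(rs_over_a0):
    """Free-electron Fermi energy in Ry: (9*pi/4)^(2/3) / (r_s/a_0)^2."""
    return (9.0 * math.pi / 4.0) ** (2.0 / 3.0) / rs_over_a0**2

def fermi_temperature_k(rs_over_a0):
    """Temperature T_F = eps_F / k_B, in kelvin."""
    return fermi_energy_ry(rs_over_a0) * RYDBERG_EV / K_B_EV

for rs in (2.0, 4.0, 6.0):
    print(f"r_s/a_0 = {rs:.0f}: eps_F = {fermi_energy_ry(rs):.3f} Ry, "
          f"T_F = {fermi_temperature_k(rs):.2e} K")
```

Over the range $r_s/a_0=2$ to $6$ the resulting temperatures span roughly $10^4$ to $10^5$ K.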

where $g_F = g(\epsilon_F)$ is the density of states evaluated at the Fermi level (recall that the density of states in the case of free fermions of spin $\frac{1}{2}$ in 3D has the form

$$g(\epsilon) = \frac{\sqrt{2}}{\pi^2}\left(\frac{m_e}{\hbar^2}\right)^{3/2}\epsilon^{1/2},$$

as shown in chapter 5, Eq. (5.3)). Eq. (7.9) is a general form of the susceptibility, useful in several contexts (see, for example, the following discussion on magnetization of band electrons).

The orbital motion of electrons in an external electromagnetic field always leads to diamagnetic response because it tries to shield the external field. This behavior is referred to as Landau diamagnetism; for a treatment of this effect see Ref. [80]. Pauli paramagnetism and Landau diamagnetism are effects encountered in solids that are very good conductors, that is, the behavior of their valence electrons is close to that of the free-electron gas. The two effects are of similar magnitude but opposite sign, as mentioned above, and they are both much weaker than the response of individual atoms or ions. Thus, in order to measure these effects one must be able to separate the effect that comes from the ions in the solid, which is not always easy. In the following we discuss in more detail the theory of the magnetic response of metallic solids in certain situations where it is feasible to identify its source.

We specialize the discussion to models that provide a more detailed account of the magnetic behavior of metals. The first step in making the free-electron picture more realistic is to include exchange effects explicitly, that is, to invoke a Hartree–Fock picture, but without the added complications imposed by band-structure effects. We next analyze a model that takes into account band-structure effects in an approximate manner. These models are adequate to introduce the important physics of magnetic behavior in metals.

7.2.1 Magnetization in Hartree–Fock free-electron gas

In our discussion of the free-electron gas in chapter 2 we had assumed that there are equal numbers of up and down spins, and therefore we took the ground state of the system to have non-magnetic character. This is not necessarily the case. We show here that, depending on the density of the electron gas, the ground state can have magnetic character. This arises from considering explicitly the exchange interaction of electrons due to their fermionic nature. The exchange interaction is taken into account by assuming a Slater determinant of free-particle states for the many-body wavefunction, as we did in the Hartree–Fock approximation (see chapter 2). In that case, after averaging over the spin states, we had found for the ground state energy
per particle, Eq. (2.43):

$$\frac{E^{HF}}{N} = \frac{3}{5}\frac{\hbar^2 k_F^2}{2m_e} - \frac{3e^2}{4\pi}k_F = \left[\frac{3}{5}(k_F a_0)^2 - \frac{3}{2\pi}(k_F a_0)\right]\ \mathrm{Ry} \tag{7.10}$$

where $k_F$ is the Fermi momentum; in the last expression we have used the rydberg as the unit of energy and the Bohr radius $a_0$ as the unit of length. In the following discussion we will express all energies in units of rydbergs, and for simplicity we do not include that unit in the equations explicitly. The first term in the total energy $E^{HF}$ is the kinetic energy, while the second represents the effect of the exchange interactions. When the two spin states of the electrons are averaged, the Fermi momentum is related to the total number of fermions $N$ in volume $\Omega$ by

$$n = \frac{N}{\Omega} = \frac{k_F^3}{3\pi^2}$$

as shown in Appendix D, Eq. (D.10). By a similar calculation as the one that led to the above result, we find that if we had $N_\uparrow$ ($N_\downarrow$) electrons with spin up (down), occupying states up to Fermi momentum $k_{F\uparrow}$ ($k_{F\downarrow}$), the relation between this Fermi momentum and the number of fermions in volume $\Omega$ would be

$$N_\uparrow = \Omega\,\frac{k_{F\uparrow}^3}{6\pi^2}, \qquad N_\downarrow = \Omega\,\frac{k_{F\downarrow}^3}{6\pi^2}$$

Notice that for $N_\uparrow = N_\downarrow = N/2$, we would obtain $k_{F\uparrow} = k_{F\downarrow} = k_F$, as expected for the non-magnetic case. For each set of electrons with spin up or spin down considered separately, we would have a total energy given by an equation analogous to the spin-averaged case:

$$\frac{E^{HF}_\uparrow}{N_\uparrow} = \frac{3}{5}(k_{F\uparrow}a_0)^2 - \frac{3}{2\pi}(k_{F\uparrow}a_0), \qquad \frac{E^{HF}_\downarrow}{N_\downarrow} = \frac{3}{5}(k_{F\downarrow}a_0)^2 - \frac{3}{2\pi}(k_{F\downarrow}a_0)$$

The coefficients for the kinetic energy and exchange energy terms in the spin-polarized cases are exactly the same as for the spin-averaged case, because the factors of 2 that enter into the calculation of the spin-averaged results are compensated for by the factor of 2 difference in the definition of $k_{F\uparrow}$, $k_{F\downarrow}$, relative to that of $k_F$.

We next consider a combined system consisting of a total number of $N$ electrons, with $N_\uparrow$ of them in the spin-up state and $N_\downarrow$ in the spin-down state. We define the magnetization number $M$ and the magnetization per particle $m$ as

$$M = N_\uparrow - N_\downarrow, \qquad m = \frac{M}{N} \tag{7.11}$$

From these definitions, and from the fact that $N = N_\uparrow + N_\downarrow$, we find

$$N_\uparrow = (1+m)\frac{N}{2}, \qquad N_\downarrow = (1-m)\frac{N}{2} \tag{7.12}$$

and using these expressions in the total energies $E^{HF}_\uparrow$, $E^{HF}_\downarrow$, we obtain for the total energy per particle

$$\frac{E^{HF}_\uparrow + E^{HF}_\downarrow}{N} = \frac{3a_0^2}{10}\left(\frac{3\pi^2 N}{\Omega}\right)^{2/3}\left[(1+m)^{5/3} + (1-m)^{5/3}\right] - \frac{3a_0}{4\pi}\left(\frac{3\pi^2 N}{\Omega}\right)^{1/3}\left[(1+m)^{4/3} + (1-m)^{4/3}\right]$$

Notice that in the simple approximation we are considering, the free-electron gas, there are no other terms involved in the total energy since we neglect the Coulomb interactions and there is no exchange interaction between spin-up and spin-down electrons. We recall the definition of the average volume per particle through the variable $r_s$, the radius of the sphere that encloses this average volume:

$$\frac{4\pi}{3}r_s^3 = \frac{\Omega}{N} \Longrightarrow \left(\frac{3\pi^2 N}{\Omega}\right)^{1/3} = \left(\frac{9\pi}{4}\right)^{1/3}\frac{1}{r_s}$$

which, when used in the expression for the total energy of the general case, gives

$$\frac{E^{HF}_\uparrow + E^{HF}_\downarrow}{N} = \frac{3}{10}\left(\frac{9\pi}{4}\right)^{2/3}\frac{(1+m)^{5/3} + (1-m)^{5/3}}{(r_s/a_0)^2} - \frac{3}{4\pi}\left(\frac{9\pi}{4}\right)^{1/3}\frac{(1+m)^{4/3} + (1-m)^{4/3}}{(r_s/a_0)}$$

Using this expression for the total energy we must determine what value of the magnetization corresponds to the lowest energy state for a given value of $r_s$. From the powers of $r_s$ involved in the two terms of the total energy, it is obvious that for low $r_s$ the kinetic energy dominates while for high $r_s$ the exchange energy dominates. We expect that when exchange dominates it is more likely to get a magnetized ground state. Indeed, if $m \neq 0$ there will be more than half of the particles in one of the two spin states, and that will give larger exchange energy than the $m = 0$ case with exactly half of the electrons in each spin state. The extreme case of this is $m = \pm 1$, when all of the electrons are in one spin state. Therefore, we expect that for low enough $r_s$ (high-density limit) the energy will be dominated by the kinetic part and the magnetization will be zero, while for large enough $r_s$ (low-density limit) the energy will be dominated by the exchange part and the magnetization will be $m = \pm 1$.
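The competition between the kinetic and exchange terms can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the Hartree–Fock energy per particle as a function of $r_s$ (in units of $a_0$) and $m$, with energies in Ry, and scans a grid of $m$ values for the minimum. The function names, the grid resolution and the bisection bracket are illustrative assumptions.

```python
from math import pi

# Prefactors of the kinetic and exchange terms (energies in Ry, r_s in units of a_0).
C_KIN = 0.3 * (9 * pi / 4) ** (2 / 3)
C_EX = 3 / (4 * pi) * (9 * pi / 4) ** (1 / 3)


def hf_energy(rs, m):
    """Hartree-Fock energy per particle of the polarized electron gas, 0 <= m <= 1."""
    kinetic = C_KIN * ((1 + m) ** (5 / 3) + (1 - m) ** (5 / 3)) / rs**2
    exchange = C_EX * ((1 + m) ** (4 / 3) + (1 - m) ** (4 / 3)) / rs
    return kinetic - exchange


def ground_state_m(rs, n_grid=1001):
    """Magnetization in [0, 1] with the lowest energy at fixed r_s (simple grid scan)."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    return min(grid, key=lambda m: hf_energy(rs, m))


def transition_rs(lo=4.0, hi=7.0, tol=1e-6):
    """Bisect on r_s for the point where the ground state leaves m = 0.

    Assumes (illustrative choice) that the bracket [lo, hi] contains one transition.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ground_state_m(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for rs in (2.0, 4.0, 5.0, 6.0, 8.0):
        m0 = ground_state_m(rs)
        print(f"r_s = {rs:4.1f}  m = {m0:4.2f}  E = {hf_energy(rs, m0):+.5f} Ry")
    print("transition near r_s =", round(transition_rs(), 3))
```

Because the energy is even in $m$, scanning $0 \le m \le 1$ is sufficient; the mirror solutions with $m < 0$ have the same energy.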

This is exactly what is found by scanning the values of $r_s$ and determining for each value the lowest energy state as a function of $m$: the ground state has $m = 0$ for $r_s \le 5.450$ and $m = \pm 1$ for $r_s > 5.450$, as shown in Fig. 7.2. It is interesting that the transition from the unpolarized to the polarized state is abrupt and complete (from zero polarization, $m = 0$, to full polarization, $m = \pm 1$). It is also interesting that the kinetic energy and exchange energy contributions to the total energy are discontinuous at the transition point, but that the total energy is continuous. The transition point is a local maximum in the total energy, with two local minima of equal energy on either side of it, the one on the left (at $r_s = 4.824$) corresponding to a non-magnetic ground state and the one on the right (at $r_s = 6.077$) to a ferromagnetic ground state. The presence of the two minima of equal energy is indicative of how difficult it would be to stabilize the ferromagnetic ground state based on exchange interactions, as we discuss in more detail next.

Figure 7.2. The total energy multiplied by a factor of 10 (solid line) and the magnetization $m$ (dashed line) of the polarized electron gas, as functions of $r_s$ (in units of $a_0$). The thinner dotted lines indicate the contributions of the kinetic part (positive) and the exchange part (negative). For low values of $r_s$ the kinetic energy dominates and the total energy is positive ($r_s \le 2.411$), while for larger values of $r_s$ the exchange energy dominates and the total energy is negative. Notice that both contributions are discontinuous at $r_s = 5.450$, the transition point at which the magnetization jumps from a value of $m = 0$ to a value of $m = 1$, but the total energy itself is continuous.

While the above discussion is instructive on how exchange interactions alone can lead to non-zero magnetization, this simple model is not adequate to describe realistic systems. There are several reasons for this, having to do with the limitations of both the Hartree–Fock approximation and the uniform electron gas approximation. First, the Hartree–Fock approximation is not realistic because of screening and
correlation effects (see the discussion in chapter 2); attempts to fix the Hartree–Fock approximation destroy the transition to the non-zero magnetization state. Moreover, there are actually other solutions beyond the uniform density solution, which underlies the preceding discussion, that have lower energy; these are referred to as spin density waves. Finally, for very low densities where the state with non-zero magnetization is supposed to be stable, there is a different state in which electrons maximally avoid each other to minimize the total energy: this state is known as the “Wigner crystal”, with the electrons localized at crystal sites to optimize the Coulomb repulsion. Thus, it is not surprising that a state with non-zero magnetization, arising purely from exchange interactions in a uniform electron gas, is not observed in real solids. The model is nevertheless useful in motivating the origin of interactions that can produce ferromagnetic order in metals: the essential idea is that, neglecting all other contributions, the exchange effect can lead to polarization of the spins at low enough density, resulting in a ferromagnetic state.

7.2.2 Magnetization of band electrons

We next develop a model, in some sense a direct extension of the preceding analysis, which includes the important aspects of the band structure in magnetic phenomena. The starting point for this theory is the density of states derived from the band structure, $g(\epsilon)$. We assume that the electronic states can be filled up to the Fermi level $\epsilon_F$ for a spin-averaged configuration with $N_\uparrow = N_\downarrow = N/2$. We then perturb this state of the system by moving a few electrons from the spin-down state to the spin-up state, so that $N_\uparrow > N/2$, $N_\downarrow < N/2$. The energy of the electrons that have been transferred to the spin-up state will be slightly higher than the Fermi level, while the highest energy of electrons in the depleted spin state will be somewhat lower than the Fermi energy, as shown in Fig. 7.3. We will take the number of electrons with spin up and spin down to be given by

$$N_\uparrow = \int_{-\infty}^{\epsilon_F + \Delta} g(\epsilon)\,d\epsilon, \qquad N_\downarrow = \int_{-\infty}^{\epsilon_F - \Delta} g(\epsilon)\,d\epsilon \tag{7.13}$$

where we have introduced $\Delta$ to denote the deviation of the highest occupied state in each case from the Fermi level (a positive deviation in the case of spin-up electrons and a negative one in the case of spin-down electrons). Notice that, in the small $\Delta$ limit, in order to conserve the total number of electrons when we transfer some from the spin-down state to the spin-up state, the density of states must be symmetric around its value at $\epsilon_F$, which means that $g(\epsilon)$ is constant in the neighborhood of $\epsilon_F$ or it is a local extremum at $\epsilon = \epsilon_F$; in either case, the derivative of the density of states $g'(\epsilon)$ must vanish at $\epsilon = \epsilon_F$.

Figure 7.3. Illustration of the occupation of spin-up electron states $g_\uparrow(\epsilon)$ and spin-down electron states $g_\downarrow(\epsilon)$, with corresponding Fermi levels $\epsilon_F + \Delta$ and $\epsilon_F - \Delta$, where $\epsilon_F$ is the Fermi energy for the spin-averaged state. In this example we have chosen $g(\epsilon)$ to have a symmetric local maximum over a range of $2\Delta$ around $\epsilon = \epsilon_F$.

Expanding the integrals over the density of states as Taylor series in $\Delta$ through the expressions given in Eqs. (G.24) and (G.40), and using the definition of the magnetization number from Eq. (7.11), we obtain

$$M = 2\Delta g_F\left[1 + O(\Delta^2)\right] \tag{7.14}$$

where we have used the symbol $g_F = g(\epsilon_F)$ for the density of states at the Fermi level. We can also define the band energy of the system, which in this case can be identified as the Hartree energy $E^H$, that is, the energy due to electron-electron interactions other than those from exchange and correlation; this contribution comes from summing the energy of electronic levels associated with the two spin states:

$$E^H = E_\uparrow + E_\downarrow = \int_{-\infty}^{\epsilon_F + \Delta} \epsilon\, g(\epsilon)\,d\epsilon + \int_{-\infty}^{\epsilon_F - \Delta} \epsilon\, g(\epsilon)\,d\epsilon \tag{7.15}$$

Through the same procedure of expanding the integrals as Taylor series in $\Delta$, this expression gives for the band energy

$$E^H = 2\int_{-\infty}^{\epsilon_F} \epsilon\, g(\epsilon)\,d\epsilon + \Delta^2\left[g_F + \epsilon_F\, g'(\epsilon_F)\right] \tag{7.16}$$

where we have kept terms only up to second order in $\Delta$. Consistent with our earlier assumptions about the behavior of the density of states near the Fermi level, we will take $g'(\epsilon_F) = 0$. The integral in the above equation represents the usual band energy for the spin-averaged system, with the factor of 2 coming from the two spin states associated with each electronic state with energy $\epsilon$.
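The small-$\Delta$ expansions of the magnetization and of the band energy can be compared with direct numerical integration. The sketch below is an illustration under stated assumptions: the Gaussian model density of states, its parameters (`EPS_F`, `WIDTH`, `G_F`) and the Simpson integrator are hypothetical choices made only so that $g'(\epsilon_F) = 0$ holds; they do not come from the text.

```python
from math import exp

# Hypothetical model parameters in arbitrary units (not taken from the text).
EPS_F, WIDTH, G_F = 1.0, 0.5, 2.0


def dos(eps):
    """Model density of states per spin with a symmetric maximum at EPS_F."""
    return G_F * exp(-((eps - EPS_F) / WIDTH) ** 2)


def integrate(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def magnetization(delta):
    """M = N_up - N_down: states between EPS_F - delta and EPS_F + delta."""
    return integrate(dos, EPS_F - delta, EPS_F + delta)


def band_energy_change(delta):
    """Band energy of the polarized state minus that of the spin-averaged state."""
    def weighted(eps):
        return eps * dos(eps)
    gained = integrate(weighted, EPS_F, EPS_F + delta)   # electrons added to spin up
    lost = integrate(weighted, EPS_F - delta, EPS_F)     # electrons removed from spin down
    return gained - lost


if __name__ == "__main__":
    for delta in (0.1, 0.03, 0.01):
        print(delta,
              magnetization(delta) / (2 * delta * G_F),      # tends to 1 as delta -> 0
              band_energy_change(delta) / (delta**2 * G_F))  # tends to 1 as delta -> 0
```

Both ratios approach 1 as $\Delta$ decreases, with deviations of relative order $\Delta^2$ that depend on the curvature of the chosen model density of states.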

The remaining term is the change in the band energy due to spin polarization:

$$\delta E^H = \Delta^2 g_F \tag{7.17}$$

Our goal is to compare this term, which is always positive, with the change in the exchange energy due to spin polarization. We will define the exchange energy in terms of a parameter $-J$, which we refer to as the “exchange integral”: this is the contribution to the energy due to exchange of a pair of particles. We will take $J > 0$, that is, we assume that the system gains energy due to exchange interactions, as we proved explicitly for the Hartree–Fock free-electron model in chapter 3. Since exchange applies only to particles of the same spin, the total contribution to the exchange energy from the spin-up and the spin-down sets of particles will be given by

$$E^X = -J\left[\frac{N_\uparrow(N_\uparrow - 1)}{2} + \frac{N_\downarrow(N_\downarrow - 1)}{2}\right] \tag{7.18}$$

Using the expressions for $N_\uparrow$ and $N_\downarrow$ in terms of the magnetization, Eqs. (7.11) and (7.12), we find that the exchange energy is

$$E^X = -\frac{J}{2}\left(\frac{N}{2}\right)^2\left[(1+m)\left(1 + m - \frac{2}{N}\right) + (1-m)\left(1 - m - \frac{2}{N}\right)\right]$$

If we subtract from this expression the value of the exchange energy for $m = 0$, corresponding to the spin-averaged state with $N_\uparrow = N_\downarrow = N/2$, we find that the change in the exchange energy due to spin polarization is

$$\delta E^X = -J\frac{M^2}{4} \tag{7.19}$$

which is always negative (recall that $J > 0$). The question then is, under what conditions will the gain in exchange energy due to spin polarization be larger than the cost in band energy, Eq. (7.17)? Using our result for the magnetization from Eq. (7.14), and keeping only terms to second order in $\Delta$, we find that the two changes in the energy become equal for $J = 1/g_F$ and the gain in exchange energy dominates over the cost in band energy for

$$J > \frac{1}{g_F}$$

which is known as the “Stoner criterion” for spontaneous magnetization [81]. This treatment applies to zero temperature; for finite temperature, in addition to the density of states in Eq. (7.13), we must also take into account the Fermi occupation numbers (see Problem 3).

To complete the discussion, we examine the magnetic susceptibility $\chi$ of a system in which spontaneous magnetization arises when the Stoner criterion is met.
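Before turning to the susceptibility, the energy balance behind the Stoner criterion can be checked with a few lines of arithmetic. This is a minimal sketch with hypothetical function names and illustrative numbers: it evaluates the exchange energy of Eq. (7.18) directly for given $N_\uparrow$, $N_\downarrow$, and uses $\Delta = M/(2g_F)$ from Eq. (7.14) to express the band-energy cost of Eq. (7.17) in terms of $M$.

```python
def exchange_energy(n_up, n_dn, J):
    """E_X = -J [N_up (N_up - 1)/2 + N_dn (N_dn - 1)/2], Eq. (7.18)."""
    return -J * (n_up * (n_up - 1) / 2 + n_dn * (n_dn - 1) / 2)


def delta_exchange(N, M, J):
    """Exchange energy of the polarized state minus that of the state with N/2 per spin."""
    n_up, n_dn = (N + M) / 2, (N - M) / 2
    return exchange_energy(n_up, n_dn, J) - exchange_energy(N / 2, N / 2, J)


def delta_band(M, g_F):
    """Band-energy cost Delta^2 g_F written with Delta = M / (2 g_F)."""
    return M**2 / (4 * g_F)


def delta_total(N, M, J, g_F):
    """Net energy change due to spin polarization (negative means polarization is favored)."""
    return delta_band(M, g_F) + delta_exchange(N, M, J)


def stoner_criterion_met(J, g_F):
    """True when J > 1/g_F."""
    return J * g_F > 1


if __name__ == "__main__":
    N, M, g_F = 1000, 40, 2.0          # illustrative numbers, arbitrary units
    for J in (0.3, 0.5, 0.6):
        print(J, stoner_criterion_met(J, g_F), delta_total(N, M, J, g_F))
```

The direct difference of Eq. (7.18) reproduces $-JM^2/4$ exactly, since the terms linear in $N_\uparrow$ and $N_\downarrow$ cancel when $N$ is fixed, so the sign of the net change is controlled entirely by $1 - g_F J$.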

For example. 7. a more elaborate theory which takes into account the realistic band structure is needed (see. A model that captures the physics of such systems is the so . by deﬁnition. Refs. which ranges from zero to some ﬁnal value H when the magnetization M ranges from zero to its ﬁnal value M : δ E tot = − 0 H ( M µ B )d H = −χ −1 µ2 B M 0 1 M d M = − χ −1 M 2 µ2 B 2 where we have included an overall minus sign to indicate that the spontaneous magnetization lowers the total energy. When these assumptions are not valid. the value of this factor for Pd. This change can also be calculated from the magnetization in the presence of the induced ﬁeld H . extracted from measurements of the speciﬁc heat. (7. The preceding discussion of magnetic effects in band electrons is based on the assumption of a particularly simple behavior of the band structure near the Fermi level.254 7 Magnetic behavior of solids will assume that the magnetization develops adiabatically. [83]–[85]). Eq.9): the susceptibility of band electrons is enhanced by the factor |1 − g F J |−1 . which involves explicitly the density of states at the Fermi energy g F and the exchange integral J .3 Heisenberg spin model The discussion of the previous section concerned non-interacting microscopic magnetic moments.20) which should be compared with the Pauli susceptibility. producing an effective magnetic ﬁeld H whose change is proportional to the magnetization change d H = χ −1 d M µ B where we have also included the Bohr magneton µ B to give the proper units for the ﬁeld. From the preceding discussion we ﬁnd that the change in total energy due to the spontaneous magnetization is given by δ E tot = δ E H + δ E X = 1 − gF J 2 M 4g F and is a negative quantity for J > 1/g F (the system gains energy due to the magnetization). and the constant of proportionality is. the inverse of the susceptibility. for example. is |1 − g F J | ∼ 13 (for more details see chapter 4. 
From the two expressions for the change in total energy we ﬁnd χ= 2g F µ2 |1 − g F J | B (7. Many aspects of magnetic behavior involve magnetic moments that are strongly coupled. volume 1 of Jones and March [82]).

A model that captures the physics of such systems is the so-called Heisenberg spin model. This model consists of interacting spins on lattice sites, $\mathbf{S}(\mathbf{R})$, with $\mathbf{S}$ denoting a quantum mechanical spin variable and $\mathbf{R}$ the lattice point where it resides. A pair of spins at lattice sites $\mathbf{R}$ and $\mathbf{R}'$ interact by the so-called exchange term $J$, which depends on the relative distance $\mathbf{R} - \mathbf{R}'$ between the spins. The Heisenberg spin model hamiltonian is:

$$\mathcal{H}_H = -\frac{1}{2}\sum_{\mathbf{R}\neq\mathbf{R}'} J(\mathbf{R}-\mathbf{R}')\,\mathbf{S}(\mathbf{R})\cdot\mathbf{S}(\mathbf{R}'), \qquad J(\mathbf{R}-\mathbf{R}') = J(\mathbf{R}'-\mathbf{R}) \tag{7.21}$$

If the exchange integral $J$ is positive, then the model describes ferromagnetic order because the spins will tend to be oriented in the same direction to give a positive value for $\mathbf{S}(\mathbf{R})\cdot\mathbf{S}(\mathbf{R}')$ as this minimizes the energy. In the opposite case, a negative value for $J$ will lead to antiferromagnetic order, where nearest neighbor spins will tend to be oriented in opposite directions. We discuss first the physics of the Heisenberg ferromagnet, which is the simpler case.

7.3.1 Ground state of the Heisenberg ferromagnet

As a first step in the study of the Heisenberg ferromagnet, we express the dot product of two spin operators, with the help of the spin raising $S_+$ and lowering $S_-$ operators (for details see Appendix B), as

$$S_+ \equiv S_x + iS_y, \quad S_- \equiv S_x - iS_y \Longrightarrow S_x = \frac{1}{2}(S_+ + S_-), \quad S_y = \frac{1}{2i}(S_+ - S_-)$$

which give for a pair of spins situated at $\mathbf{R}$ and $\mathbf{R}'$

$$\mathbf{S}(\mathbf{R})\cdot\mathbf{S}(\mathbf{R}') = S_x(\mathbf{R})S_x(\mathbf{R}') + S_y(\mathbf{R})S_y(\mathbf{R}') + S_z(\mathbf{R})S_z(\mathbf{R}') = \frac{1}{2}\left[S_+(\mathbf{R})S_-(\mathbf{R}') + S_-(\mathbf{R})S_+(\mathbf{R}')\right] + S_z(\mathbf{R})S_z(\mathbf{R}')$$

When this result is substituted into the Heisenberg spin model, Eq. (7.21), taking into account that raising and lowering operators at different lattice sites commute with each other because they operate on different spins, we find

$$\mathcal{H}_H = -\frac{1}{2}\sum_{\mathbf{R}\neq\mathbf{R}'} J(\mathbf{R}-\mathbf{R}')\left[S_-(\mathbf{R})S_+(\mathbf{R}') + S_z(\mathbf{R})S_z(\mathbf{R}')\right] \tag{7.22}$$

where the presence of a sum over all lattice vectors and the relation $J(\mathbf{R}-\mathbf{R}') = J(\mathbf{R}'-\mathbf{R})$ conspire to eliminate the two separate appearances of the lowering–raising operator product at sites $\mathbf{R}$ and $\mathbf{R}'$. If the lowering–raising operator product were neglected, and only the $S_z(\mathbf{R})S_z(\mathbf{R}')$ part were retained, the model would be equivalent to the classical Ising spin model; the physics of the Ising model with nearest neighbor interactions is discussed in Appendix D.
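The operator identity used to rewrite the dot product can be verified with explicit spin matrices. The sketch below is an illustration, not the book's own procedure: it builds $S_z$, $S_+$, $S_-$ for a given spin from the standard ladder-operator matrix elements, forms two-spin operators as Kronecker products, and compares the Cartesian and ladder forms of $\mathbf{S}(\mathbf{R})\cdot\mathbf{S}(\mathbf{R}')$. The helper names and the nested-list matrix representation are assumptions made to keep the example free of external libraries.

```python
from math import sqrt


def spin_matrices(two_s):
    """Return (Sz, S+, S-) as nested lists for spin S = two_s / 2,
    in the basis |Sz = S>, |S - 1>, ..., |-S>."""
    S = two_s / 2
    dim = two_s + 1
    m = [S - i for i in range(dim)]
    Sz = [[m[i] if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    Sp = [[0.0] * dim for _ in range(dim)]
    for col in range(1, dim):
        # S+ |m> = sqrt((S - m)(S + 1 + m)) |m + 1>
        Sp[col - 1][col] = sqrt((S - m[col]) * (S + 1 + m[col]))
    Sm = [[Sp[j][i] for j in range(dim)] for i in range(dim)]  # transpose of S+
    return Sz, Sp, Sm


def kron(A, B):
    """Kronecker product: A acts on the spin at R, B on the spin at R'."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def combine(*terms):
    """Linear combination of matrices given as (coefficient, matrix) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(cols)]
            for i in range(rows)]


def dot_product_two_ways(two_s):
    """Return S(R).S(R') built from Cartesian components and from ladder operators."""
    Sz, Sp, Sm = spin_matrices(two_s)
    Sx = combine((0.5, Sp), (0.5, Sm))
    Sy = combine((-0.5j, Sp), (0.5j, Sm))            # (S+ - S-) / (2i)
    cartesian = combine((1, kron(Sx, Sx)), (1, kron(Sy, Sy)), (1, kron(Sz, Sz)))
    ladder = combine((0.5, kron(Sp, Sm)), (0.5, kron(Sm, Sp)), (1, kron(Sz, Sz)))
    return cartesian, ladder


def max_difference(A, B):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


if __name__ == "__main__":
    for two_s in (1, 2, 3):
        cartesian, ladder = dot_product_two_ways(two_s)
        print("S =", two_s / 2, "max difference:", max_difference(cartesian, ladder))
```

The two forms agree to rounding error for each spin value tried, and the first diagonal element, which corresponds to both spins having $S_z = S$, equals $S^2$.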

Even though we developed this model with electrons in mind, we will allow the spin of the particles to be arbitrary. The $z$ direction is determined by an external magnetic field which is taken to be vanishingly small, with its sole purpose being to break the symmetry and define a direction along which the spins tend to orient. If the particles were classical, the $z$ component of the spin would be a continuous variable with values $S_z = S$ to $S_z = -S$. For quantum mechanical particles the $S_z$ values are quantized: $S_z = S, S-1, \ldots, -S+1, -S$. In the classical case, it is easy to guess the ground state of this system: it would be the state with all spins pointing in the same direction and $S_z$ for each spin assuming the highest value it can take, $S_z = S$. It turns out that this is also the ground state of the quantum mechanical system. We prove this statement in two stages. First, we will show that this state is a proper eigenfunction of the hamiltonian and calculate its energy; second, we will show that any other state we can construct cannot have lower energy than it.

We define the state with all $S_z = S$ as

$$|S_z(\mathbf{R}_1) = S, \ldots, S_z(\mathbf{R}_N) = S\rangle = |S, \ldots, S\rangle \equiv |S^{(N)}\rangle$$

and apply the hamiltonian $\mathcal{H}_H$ in the form of Eq. (7.22) to it: the product of operators $S_-(\mathbf{R})S_+(\mathbf{R}')$ applied to $|S^{(N)}\rangle$ gives zero, because the raising operator applied to the maximum $S_z$ value gives zero. Thus, the only terms in the hamiltonian that give non-vanishing contributions are the $S_z(\mathbf{R})S_z(\mathbf{R}')$ products, which give $S^2$ for each pair of spins and leave the spins unchanged. Therefore, the state $|S^{(N)}\rangle$ is indeed an eigenfunction of the hamiltonian with eigenvalue:

$$E_0^{(N)} = -\frac{1}{2}S^2\sum_{\mathbf{R}\neq\mathbf{R}'} J(\mathbf{R}-\mathbf{R}') \tag{7.23}$$

Next we try to construct eigenstates of the hamiltonian that have different energy than $E_0^{(N)}$. Since the hamiltonian contains only pair interactions, we can focus on the possible configurations of a pair of spins situated at the given sites $\mathbf{R}$, $\mathbf{R}'$ and consider the spins at all the other sites fixed. We are searching for the lowest energy states, so it makes sense to consider only changes of one unit in the $S_z$ value of the spins, starting from the configuration

$$|S_z(\mathbf{R}) = S, S_z(\mathbf{R}') = S\rangle \equiv |S, S\rangle$$

There are two configurations that can be constructed with one $S_z$ value lowered by one unit,

$$|S_z(\mathbf{R}) = S, S_z(\mathbf{R}') = S-1\rangle \equiv |S, S-1\rangle, \qquad |S_z(\mathbf{R}) = S-1, S_z(\mathbf{R}') = S\rangle \equiv |S-1, S\rangle$$

Since the hamiltonian contains a sum over all values of $\mathbf{R}$, $\mathbf{R}'$, both $S_-(\mathbf{R})S_+(\mathbf{R}')$ and $S_-(\mathbf{R}')S_+(\mathbf{R})$ will appear in it. When applied to the two configurations defined above, these operators will produce

$$S_-(\mathbf{R}')S_+(\mathbf{R})|S-1, S\rangle = 2S|S, S-1\rangle, \qquad S_-(\mathbf{R})S_+(\mathbf{R}')|S, S-1\rangle = 2S|S-1, S\rangle$$

where we have used the general expression for the action of raising or lowering operators on a spin state of total spin $S$ and $z$ component $S_z$,

$$S_\pm|S_z\rangle = \sqrt{(S \mp S_z)(S + 1 \pm S_z)}\,|S_z \pm 1\rangle \tag{7.24}$$

as discussed in Appendix B. Thus, we see that the raising–lowering operators turn $|S, S-1\rangle$ to $|S-1, S\rangle$ and vice versa, which means that both of these configurations need to be included in the wavefunction in order to have an eigenstate of the hamiltonian. Moreover, the coefficients of the two configurations must have the same magnitude in order to produce an eigenfunction. Two simple choices are to take the coefficients of the two configurations to be equal, which, forgetting normalization for the moment, we can choose to be $+1$, or to be opposite, which we can choose to be $\pm 1$. These choices lead to the following two states:

$$|S^{(+)}\rangle = \left[|S, S-1\rangle + |S-1, S\rangle\right], \qquad |S^{(-)}\rangle = \left[|S, S-1\rangle - |S-1, S\rangle\right]$$

We also notice that these states are eigenfunctions of the product of operators $S_z(\mathbf{R})S_z(\mathbf{R}')$ or $S_z(\mathbf{R}')S_z(\mathbf{R})$, with eigenvalue $S(S-1)$. Ignoring all other spins which are fixed to $S_z = S$, we can then apply the hamiltonian to the states $|S^{(+)}\rangle$, $|S^{(-)}\rangle$, to find that they are both eigenstates with energies:

$$\epsilon^{(+)} = -S^2 J(\mathbf{R}-\mathbf{R}'), \qquad \epsilon^{(-)} = -(S^2 - 2S)\,J(\mathbf{R}-\mathbf{R}')$$

These energies should be compared with the contribution to the ground state energy of the corresponding state $|S, S\rangle$, which from Eq. (7.23) is found to be

$$\epsilon^{(0)} = -S^2 J(\mathbf{R}-\mathbf{R}')$$

Thus, we conclude that of the two states we constructed, $|S^{(+)}\rangle$ has the same energy as the ground state, while $|S^{(-)}\rangle$ has higher energy by $2S\,J(\mathbf{R}-\mathbf{R}')$ (recall that all $J(\mathbf{R}-\mathbf{R}')$ are positive in the ferromagnetic case). This analysis shows that when the $S_z$ component of individual spins is reduced by one unit, only a special state has the same energy as the ground state; this state is characterized by equal coefficients for the spin pairs which involve one reduced spin. All other states which have unequal coefficients for pairs of spins, with one of the two spins reduced, have higher energy. Notice that both states $|S^{(+)}\rangle$, $|S^{(-)}\rangle$ have only a single spin with $S_z = S-1$ and all other spins with $S_z = S$, so they represent the smallest possible change from
state $|S^{(N)}\rangle$. Evidently, lowering the $z$ component of spins by more than one unit or lowering the $z$ component of several spins simultaneously will produce states of even higher energy. These arguments lead to the conclusion that, except for the special state which is degenerate with the state $|S^{(N)}\rangle$, all other states have higher energy than $E_0^{(N)}$. It turns out that the state with the same energy as the ground state corresponds to the infinite wavelength, or zero wave-vector ($\mathbf{k} = 0$), spin-wave state, while the states with higher energy are spin-wave states with wave-vector $\mathbf{k} \neq 0$.

7.3.2 Spin waves in the Heisenberg ferromagnet

The above discussion also leads us naturally to the low-lying excitations starting from the ground state $|S^{(N)}\rangle$. These excitations will consist of a linear superposition of configurations with the $z$ component of only one spin reduced by one unit. In order to produce states that reflect the crystal periodicity, just as we did for the construction of states based on atomic orbitals in chapter 4, we must multiply the spin configuration which has the reduced spin at site $\mathbf{R}$ by the phase factor $\exp(i\mathbf{k}\cdot\mathbf{R})$. The resulting states are

$$|S_{\mathbf{k}}^{(N-1)}\rangle \equiv \frac{1}{\sqrt{N}}\sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}\frac{1}{\sqrt{2S}}S_-(\mathbf{R})|S^{(N)}\rangle \tag{7.25}$$

where the factors $1/\sqrt{N}$ and $1/\sqrt{2S}$ are needed to ensure proper normalization, assuming that $|S^{(N)}\rangle$ is a normalized state (recall the result of the action of $S_-$ on a state with total spin $S$ and $z$ component $S_z = S$, Eq. (7.24)). These states are referred to as “spin waves” or “magnons”. We will show that they are eigenstates of the Heisenberg hamiltonian and calculate their energy and their properties.

To apply the Heisenberg ferromagnetic hamiltonian to the spin-wave state $|S_{\mathbf{k}}^{(N-1)}\rangle$, we start with the action of the operator $S_z(\mathbf{R}')S_z(\mathbf{R}'')$ on configuration $S_-(\mathbf{R})|S^{(N)}\rangle$:

$$S_z(\mathbf{R}')S_z(\mathbf{R}'')\,S_-(\mathbf{R})|S^{(N)}\rangle = \left[S(S-1)(\delta_{\mathbf{R}'\mathbf{R}} + \delta_{\mathbf{R}''\mathbf{R}}) + S^2(1 - \delta_{\mathbf{R}'\mathbf{R}} - \delta_{\mathbf{R}''\mathbf{R}})\right]S_-(\mathbf{R})|S^{(N)}\rangle = \left[S^2 - S\delta_{\mathbf{R}\mathbf{R}'} - S\delta_{\mathbf{R}\mathbf{R}''}\right]S_-(\mathbf{R})|S^{(N)}\rangle \tag{7.26}$$

so that the $S_z(\mathbf{R}')S_z(\mathbf{R}'')$ part of the hamiltonian when applied to state $|S_{\mathbf{k}}^{(N-1)}\rangle$ gives

$$\left[-\frac{1}{2}\sum_{\mathbf{R}'\neq\mathbf{R}''} J(\mathbf{R}'-\mathbf{R}'')\,S_z(\mathbf{R}')S_z(\mathbf{R}'')\right]|S_{\mathbf{k}}^{(N-1)}\rangle = \left[-\frac{1}{2}S^2\sum_{\mathbf{R}'\neq\mathbf{R}''} J(\mathbf{R}'-\mathbf{R}'') + S\sum_{\mathbf{R}\neq 0} J(\mathbf{R})\right]|S_{\mathbf{k}}^{(N-1)}\rangle \tag{7.27}$$

We next consider the action of the operator $S_-(\mathbf{R}')S_+(\mathbf{R}'')$ on configuration $S_-(\mathbf{R})|S^{(N)}\rangle$:

$$S_-(\mathbf{R}')S_+(\mathbf{R}'')\,S_-(\mathbf{R})|S^{(N)}\rangle = 2S\,\delta_{\mathbf{R}\mathbf{R}''}\,S_-(\mathbf{R}')|S^{(N)}\rangle \tag{7.28}$$

which gives for the action of the $S_-(\mathbf{R}')S_+(\mathbf{R}'')$ part of the hamiltonian on state $|S_{\mathbf{k}}^{(N-1)}\rangle$

$$\left[-\frac{1}{2}\sum_{\mathbf{R}'\neq\mathbf{R}''} J(\mathbf{R}'-\mathbf{R}'')\,S_-(\mathbf{R}')S_+(\mathbf{R}'')\right]|S_{\mathbf{k}}^{(N-1)}\rangle = -S\sum_{\mathbf{R}\neq 0} J(\mathbf{R})\,e^{i\mathbf{k}\cdot\mathbf{R}}\,|S_{\mathbf{k}}^{(N-1)}\rangle \tag{7.29}$$

Combining the two results, Eqs. (7.27) and (7.29), we find that the state $|S_{\mathbf{k}}^{(N-1)}\rangle$ is an eigenfunction of the hamiltonian with energy

$$E_{\mathbf{k}}^{(N-1)} = E_0^{(N)} + S\sum_{\mathbf{R}\neq 0}\left[1 - e^{i\mathbf{k}\cdot\mathbf{R}}\right]J(\mathbf{R}) \tag{7.30}$$

with the ground state energy $E_0^{(N)}$ defined in Eq. (7.23). Since we have assumed the $J(\mathbf{R})$'s to be positive, we see that the spin wave energies are higher than the ground state energy, except for $\mathbf{k} = 0$, which is degenerate with the ground state, as anticipated from our analysis based on a pair of spins.

We wish next to analyze the behavior of the system in a spin-wave state. To this end, we define the transverse spin–spin correlation operator which measures the correlation between the non-$z$ components of two spins at sites $\mathbf{R}$, $\mathbf{R}'$; from its definition, this operator is

$$S_\perp(\mathbf{R})S_\perp(\mathbf{R}') \equiv S_x(\mathbf{R})S_x(\mathbf{R}') + S_y(\mathbf{R})S_y(\mathbf{R}') = \frac{1}{2}\left[S_+(\mathbf{R})S_-(\mathbf{R}') + S_-(\mathbf{R})S_+(\mathbf{R}')\right]$$

The expectation value of this operator in state $|S_{\mathbf{k}}^{(N-1)}\rangle$ will involve the application of the lowering and raising operators $S_+(\mathbf{R})S_-(\mathbf{R}')$ or $S_-(\mathbf{R})S_+(\mathbf{R}')$ on configuration $S_-(\mathbf{R}'')|S^{(N)}\rangle$, which is similar to what we calculated above, Eq. (7.28), leading to

$$\langle S_{\mathbf{k}}^{(N-1)}|S_\perp(\mathbf{R})S_\perp(\mathbf{R}')|S_{\mathbf{k}}^{(N-1)}\rangle = \frac{2S}{N}\cos[\mathbf{k}\cdot(\mathbf{R}-\mathbf{R}')] \tag{7.31}$$

This shows that spins which are apart by $\mathbf{R}-\mathbf{R}'$ have a difference in transverse orientation of $\cos[\mathbf{k}\cdot(\mathbf{R}-\mathbf{R}')]$, which provides a picture of what the spin-wave state means: spins are mostly oriented in the $z$ direction but have a small transverse component which changes orientation by an angle $\mathbf{k}\cdot(\mathbf{R}-\mathbf{R}')$ for spins separated by a distance $\mathbf{R}-\mathbf{R}'$. This angle obviously depends on the wave-vector
First. . 7. the projection on the x y plane. k of the spin wave. where 6 6 a a is the lattice constant. An illustration for spins on a 2D square lattice is shown in Fig. We will assume that we can treat the spin waves like phonons. which means that ( N −1) the z component spin of each conﬁguration and therefore of the entire | Sk state is N S − 1.4. we calculate the magnetization in the system with spin waves. (7. Illustration of a spin wave in the 2D square lattice with k = ( 1 . Now suppose there are several spin-wave states excited at some temperature T . with boson occupation numbers N −1) = n( k 1 exp[ ( N −1) / kB T ] k −1 where the excitation energies are measured with respect to the ground state energy (N ) : E0 ( N −1) k ( N −1) (N ) = Ek − E0 = 2S R=0 J (R) sin2 1 k·R 2 In the above expression we have used the fact that J (R) = J (−R) to retain only ( N −1) the real part of the complex exponential in the expression for E k .4. we notice that in a spin-wave state deﬁned as | S− (R) S ( N ) only one out of N Sz spin components has been lowered by one unit in each conﬁguration.260 7 Magnetic behavior of solids Figure 7. Finally.30). that is. so we can take a superposition of them to describe the state of the system. 1 ) π . Only the transverse component of the spins is shown. Eq.

We must emphasize that treating the system as a superposition of spin waves is an approximation, because they cannot be described as independent bosons, as was the case for phonons (see chapter 6). Within this approximation, the total $z$ component of the spin, which we identify with the magnetization of the system, will be given by

$$M(T) = NS - \sum_{\mathbf{k}} n_{\mathbf{k}}^{(N-1)} = NS\left[1 - \frac{1}{NS}\sum_{\mathbf{k}}\frac{1}{\exp[\epsilon_{\mathbf{k}}^{(N-1)}/k_BT] - 1}\right] \tag{7.32}$$

since each state $|S_{\mathbf{k}}^{(N-1)}\rangle$ reduces the total $z$ component of the ground state spin, $NS$, by one unit. In the small $|\mathbf{k}|$ limit, turning the sum over $\mathbf{k}$ into an integral as usual, and defining the new variable $\mathbf{q} = \mathbf{k}/\sqrt{2k_BT/S}$, we obtain for the magnetization per particle $m(T) = M(T)/N$

$$m(T) = S - \left(\frac{2k_BT}{n^{2/3}S}\right)^{3/2}\int\frac{d\mathbf{q}}{(2\pi)^3}\left[\exp\left(\sum_{\mathbf{R}\neq 0} J(\mathbf{R})(\mathbf{q}\cdot\mathbf{R})^2\right) - 1\right]^{-1} \tag{7.33}$$

with $n$ the density of particles. This expression gives the temperature dependence of the magnetization as being $\sim -T^{3/2}$, which is known as the Bloch $T^{3/2}$ law. This behavior is actually observed in experiments performed on isotropic ferromagnets, that is, systems in which $J(\mathbf{R}-\mathbf{R}')$ is the same for all operators appearing in the Heisenberg spin hamiltonian, as we have been assuming so far. There exist also anisotropic systems in which the value of $J_\perp(\mathbf{R}-\mathbf{R}')$ corresponding to the transverse operators $S_\perp(\mathbf{R})S_\perp(\mathbf{R}') = S_x(\mathbf{R})S_x(\mathbf{R}') + S_y(\mathbf{R})S_y(\mathbf{R}')$ is different than the value $J_z(\mathbf{R}-\mathbf{R}')$ corresponding to the operator $S_z(\mathbf{R})S_z(\mathbf{R}')$, in which case the Bloch $T^{3/2}$ law does not hold.

One last thing worth mentioning, which is a direct consequence of the form of the magnetization given in Eq. (7.33), is that for small enough $|\mathbf{q}| = q$ we can expand the exponential to obtain

$$\left[\exp\left(\sum_{\mathbf{R}\neq 0} J(\mathbf{R})(\mathbf{q}\cdot\mathbf{R})^2\right) - 1\right]^{-1} \approx \frac{1}{q^2}\left[\sum_{\mathbf{R}\neq 0} J(\mathbf{R})(\hat{\mathbf{q}}\cdot\mathbf{R})^2\right]^{-1}$$

If we assume for the moment (we will analyze this assumption below) that only small values of $|\mathbf{q}|$ contribute to the integral, then the integrand contains the factor $1/q^2$, which in 3D is canceled by the factor $q^2$ in the infinitesimal $d\mathbf{q} = q^2\,dq\,d\hat{\mathbf{q}}$, but in 2D and 1D is not canceled, because in those cases $d\mathbf{q} = q\,dq\,d\hat{\mathbf{q}}$ and $d\mathbf{q} = dq\,d\hat{\mathbf{q}}$, respectively. Since the integration includes the value $q = 0$, we conclude that in 3D the integral gives finite value but in 2D and 1D it gives an infinite value. The interpretation of this result is that in 2D and 1D the presence of excitations above the ground state is so disruptive to the magnetic order in the system that the
spin-wave picture itself breaks down at finite temperature. Another way to express this is that there can be no magnetic order at finite temperature in 2D and 1D within the isotropic Heisenberg model, a statement known as the Hohenberg–Mermin–Wagner theorem [86].

To complete the argument, we examine the validity of its basic assumption, namely the conditions under which only small values of $|\mathbf{k}|$ are relevant: from Eq. (7.32) we see that for large values of $\epsilon_{\mathbf{k}}^{(N-1)}/k_BT$ the exponential in the denominator of the integrand leads to vanishing contributions to the integral. This implies that for large $T$ spin-wave states with large energies will contribute, but for small $T$ only states with very small energies can contribute to the integral. We have seen earlier that in the limit $\mathbf{k} \to 0$ the energy $\epsilon_{\mathbf{k}}^{(N-1)} \to 0$, so that the states with the smallest $\mathbf{k}$ also have the lowest energy above the ground state. Therefore, at low $T$ only the states with lowest $|\mathbf{k}|$, and consequently lowest $|\mathbf{q}|$, will contribute to the magnetization. Thus, the Bloch $T^{3/2}$ law will hold for low temperatures. The scale over which the temperature can be considered low is set by the factor

$$\frac{S}{2}\sum_{\mathbf{R}\neq 0} J(\mathbf{R})(\mathbf{k}\cdot\mathbf{R})^2$$

which obviously depends on the value of the exchange integral $J(\mathbf{R})$.

7.3.3 Heisenberg antiferromagnetic spin model

The Heisenberg spin model with negative exchange integral $J$ can be used to study the physics of antiferromagnetism. A useful viewpoint for motivating the Heisenberg antiferromagnetic spin model is based on the Hubbard model [18–20], which we have discussed in the context of the hydrogen molecule in chapter 2, and we revisit next. The Hubbard model assumes that the system can be described by electrons which, in the single-particle language, are localized at particular atomic sites and are allowed to hop between such sites. Due to Coulomb repulsion the electrons will avoid being on the same site, which can only happen if they have opposite spins due to the Pauli exclusion principle. The model is defined by two parameters, the energy gain $t$ due to hopping of the electrons between nearest neighbor sites, which in effect represents the kinetic energy, and the energy cost $U$ to have two electrons at the same site, which is the dominant contribution to the Coulomb energy. For simplicity, we assume that there is only one type of atom in the crystal, forming a simple Bravais lattice with lattice vectors $\mathbf{R}$, so that all the hopping energies are the same and all the on-site Coulomb repulsion energies are the same. We also ignore the interaction between electrons and ions, assuming that its main effect is to produce the localized single-particle states $|\psi_{\mathbf{R}}\rangle$ centered at the lattice sites $\mathbf{R}$, which coincide with the atomic sites. To be specific, the hopping and

To be specific, the hopping and Coulomb repulsion energies can be defined in terms of the corresponding operators of the hamiltonian and the single-particle states $|\psi_{\mathbf{R}}\rangle$:

$$t \equiv -\langle \psi_{\mathbf{R}} | \hat{T} | \psi_{\mathbf{R} \pm \mathbf{a}_i} \rangle, \qquad U \equiv e^2 \langle \psi_{\mathbf{R}} \psi_{\mathbf{R}} | \frac{\exp(-k |\mathbf{r}_1 - \mathbf{r}_2|)}{|\mathbf{r}_1 - \mathbf{r}_2|} | \psi_{\mathbf{R}} \psi_{\mathbf{R}} \rangle \tag{7.34}$$

In these expressions, $\mathbf{a}_i$ stands for any of the three primitive lattice vectors, $\hat{T}$ is the kinetic energy operator, and we have assumed a screened Coulomb potential between the electrons; $t$ and $U$ are both taken to be positive constants. In the limit $t \ll U$ the system, to the extent possible, will avoid having double occupancy of any site due to the very high cost in Coulomb energy. If we consider the hopping part of the hamiltonian as a small perturbation on the Coulomb interaction term, we can classify the unperturbed states according to how many sites are doubly occupied: the lowest state with zero energy will have no doubly occupied sites, the next state with energy $U$ will have one doubly occupied site, etc. In this spirit, it is possible to show using perturbation theory [87–89] that, to second order in $t$, the original hamiltonian $\mathcal{H}$ is equivalent to an effective hamiltonian $\mathcal{H}_{eff}$ which has three terms:

$$\mathcal{H}_{eff} = \mathcal{H}_1 + \mathcal{H}_2 + \mathcal{H}_3$$

The first term $\mathcal{H}_1$, of order $t$, is simply the hopping term; it requires the presence of an unoccupied site next to the site where the electron is, into which the electron can hop. The second term $\mathcal{H}_2$, of order $t^2/U$, leads to spin exchange between electrons at neighboring sites when they happen to have opposite spins. The third term $\mathcal{H}_3$, also of order $t^2/U$, leads to electron-pair hopping by one lattice vector, which may or may not be accompanied by spin exchange; this last term applies only to situations where two electrons at nearest neighbor sites have opposite spins and there is an unoccupied site next to one of them. $\mathcal{H}_2$ and $\mathcal{H}_3$ involve the virtual occupation of a site by two electrons of opposite spin, which is the origin of the energy cost $U$; the physical meaning of these three terms is illustrated in Fig. 7.5.

From this analysis, we can now have an estimate of the exchange integral $J$: in the context of the Hubbard model and in the limit of strong on-site Coulomb repulsion, $J$ must be of order $t^2/U$. In fact, it can be rigorously shown [87, 90] that the second term in the effective hamiltonian discussed above, in the case of one electron per crystal site, can be expressed as

$$\mathcal{H}_2 = -\frac{J}{2} \sum_{\langle ij \rangle} \mathbf{S}(\mathbf{R}_i) \cdot \mathbf{S}(\mathbf{R}_j)$$

where the exchange integral is given by $J = -(8t^2/U)$, and $\mathbf{S}(\mathbf{R}_i)$ is the spin operator for site $i$; in this example, $i, j$ must be nearest neighbor indices. Since $J < 0$, we have arrived at a Heisenberg antiferromagnetic hamiltonian. It should be emphasized that this derivation applies to a special case, an insulating crystal with exactly one electron per crystal site.
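The $t^2/U$ scaling of the exchange energy can be checked numerically on the smallest possible system. The following is a minimal sketch, not part of the original derivation: it assumes a two-site Hubbard model with two electrons, restricted to the $S_z = 0$ sector, and uses an illustrative basis ordering and the function names `two_site_hubbard_levels` and `singlet_triplet_gap`, which are hypothetical. For $t \ll U$ the splitting between the lowest singlet and the triplet approaches $4t^2/U$, which would correspond to $|J|/2$ for $J = -(8t^2/U)$ if each nearest-neighbor pair is counted once in the sum (an assumption about the summation convention).

```python
import numpy as np

def two_site_hubbard_levels(t, U):
    """Energies of two electrons on two sites in the S_z = 0 sector.

    Basis (an illustrative choice): |up,down>, |down,up>, |updown,0>, |0,updown>.
    The signs of the hopping entries depend on the operator-ordering
    convention, but the eigenvalues do not.
    """
    H = np.array([
        [0.0, 0.0,  -t,  -t],
        [0.0, 0.0,   t,   t],
        [ -t,   t,   U, 0.0],
        [ -t,   t, 0.0,   U],
    ])
    return np.linalg.eigvalsh(H)  # eigenvalues in ascending order

def singlet_triplet_gap(t, U):
    """Triplet energy (exactly zero) minus the lowest singlet energy."""
    levels = two_site_hubbard_levels(t, U)
    return levels[1] - levels[0]

t = 1.0
for U in (5.0, 20.0, 100.0):
    gap = singlet_triplet_gap(t, U)
    print(f"U/t = {U:6.1f}   gap = {gap:.5f}   4 t^2 / U = {4 * t * t / U:.5f}")
```

The agreement improves as $U/t$ grows, as expected for a result obtained to second order in $t$.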

However, there are several physical systems of interest which fall within, or can be considered small perturbations of, this special case; an example is the doped copper-oxide perovskites, which are high-temperature superconductors (see chapter 8).

Figure 7.5. The three terms in the effective second order hamiltonian obtained from the Hubbard model in the limit $t \ll U$. (a) The hopping term, which allows spin-up or spin-down electrons to move by one site when the neighboring site is unoccupied. (b) The exchange term, which allows electrons to exchange spins or remain in the same configuration, at the cost of virtual occupation of a site by two electrons of opposite spin (intermediate configuration). (c) The pair hopping term, which allows a pair of electrons with opposite spins to move by one site, either with or without spin exchange, at the cost of virtual occupation of a site by two electrons of opposite spin (intermediate configuration). Adapted from Ref. [87].

The Heisenberg antiferromagnetic spin model is much more complicated than the ferromagnetic model. The basic problem is that it is considerably more difficult to find the ground state and the excitations of spins with antiferromagnetic interactions. In fact, in this case the model is only studied in its nearest neighbor interaction version, because it is obviously impossible to try to make all spins on a lattice oriented opposite from each other: two spins that are antiparallel to the same third spin must be parallel to each other. This is in stark contrast to the ferromagnetic case, where it is possible to make all spins on a lattice parallel, and this is actually the ground state of the model with $J > 0$. For nearest neighbor interactions only, it is possible to create a state in which every spin is surrounded by nearest neighbor spins pointing in the opposite direction, which is called the "Néel state". This is not true for every lattice, but only for lattices that can be split into two interpenetrating sublattices; these lattices are called bipartite. Some examples of bipartite lattices in 3D are the simple cubic lattice (with two interpenetrating FCC sublattices), the BCC lattice (with two interpenetrating simple cubic sublattices), and the diamond lattice (with two interpenetrating FCC sublattices); in 2D examples of bipartite lattices are the square lattice (with two interpenetrating square sublattices) and the graphitic lattice (with two interpenetrating hexagonal sublattices).
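Whether a lattice can be split into two interpenetrating sublattices can be tested by trying to assign one of two labels to every site so that nearest neighbors always carry different labels. The sketch below is only an illustration under stated assumptions: it uses a finite patch with periodic boundary conditions (the linear size must be even so that the boundaries do not spoil the square-lattice pattern), and the names `SQUARE`, `TRIANGULAR` and `is_bipartite` are hypothetical. The triangular lattice is modeled as a square lattice with one extra diagonal bond per site, which produces closed loops of three bonds.

```python
from collections import deque

# Nearest-neighbor offsets; the triangular lattice is represented as a
# square lattice with one additional diagonal bond direction.
SQUARE = [(1, 0), (-1, 0), (0, 1), (0, -1)]
TRIANGULAR = SQUARE + [(1, 1), (-1, -1)]

def is_bipartite(size, neighbor_offsets):
    """Try to two-color a size x size periodic lattice by breadth-first search.

    Returns True if every pair of nearest neighbors can be placed on
    different sublattices, False as soon as a conflict is found.
    """
    color = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        for dx, dy in neighbor_offsets:
            site = ((x + dx) % size, (y + dy) % size)
            if site not in color:
                color[site] = 1 - color[(x, y)]  # opposite sublattice
                queue.append(site)
            elif color[site] == color[(x, y)]:
                return False  # two neighbors forced onto the same sublattice
    return True

print("square lattice bipartite:    ", is_bipartite(4, SQUARE))
print("triangular lattice bipartite:", is_bipartite(4, TRIANGULAR))
```

The two labels play the role of the two spin orientations in the Néel state: a successful labeling is exactly an arrangement in which every spin has only oppositely oriented nearest neighbors.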

The lattices that cannot support a Néel state are called "frustrated lattices", some examples being the FCC lattice in 3D and the hexagonal (triangular) lattice in 2D. The difficulty with solving the Heisenberg antiferromagnet is that, even in cases where the lattice supports such a state, the Néel state is not an eigenstate of the hamiltonian. This can be easily seen from the fact that the operators $S_+(i) S_-(j)$ or $S_-(i) S_+(j)$ which appear in the hamiltonian, when applied to the pair of antiparallel spins at nearest neighbor sites $\langle ij \rangle$, flip both spins and as a result destroy the alternating spin pattern and create a different configuration. This is illustrated in Fig. 7.6. Much theoretical work has been directed toward understanding the physics of the Heisenberg antiferromagnet, especially after it was suggested by Anderson that this model may be relevant to the copper-oxide high-temperature superconductors [91]. Interestingly, spin waves may also be used to describe the physics of this model, at least in the limit of large spin $S$, assuming that deviations from the Néel state are small [92, 93].

Figure 7.6. Illustration of the Néel state on a 2D square lattice. Two spins, shown by a circle, have been flipped from their ideal orientation, destroying locally the antiferromagnetic order: neighboring pairs of spins with wrong orientations are highlighted by thicker dashed lines.

7.4 Magnetic order in real materials

Having developed the theoretical background for describing magnetic order in ferromagnetic and antiferromagnetic ideal solids, we discuss next how this order is actually manifested in real materials. First, a basic consideration is the temperature dependence of magnetic order.

As we noted already in the introduction to this chapter, magnetic order is relatively weak and can be easily destroyed by thermal fluctuations. The critical temperature above which there is no magnetic order is called the Curie temperature ($T_C$) for ferromagnets and the Néel temperature ($T_N$) for antiferromagnets. Table 7.2 provides examples of elementary and compound magnetic solids and their Curie and Néel temperatures. In certain solids, these critical temperatures can be rather large, exceeding 1000 K, such as in Fe and Co. The Néel temperatures are generally much lower than the Curie temperatures, the antiferromagnetic state being more delicate than the ferromagnetic one.

Table 7.2. Examples of ferromagnets and antiferromagnets. The Curie temperature $T_C$ and Néel temperature $T_N$ are given in kelvin, $M_0$ is the saturation magnetization in gauss, $n_B$ is the number of Bohr magnetons per atom. Source: Ref. [73].

Elemental ferromagnets
Solid   T_C (K)   n_B   M_0 (G)
Fe      1043      2.2   1752
Co      1388      1.7   1446
Ni       627      0.6    510
Gd       293      7.0   1980

Elemental antiferromagnets
Solid   T_N (K)     Solid   T_N (K)
Cr      311         Sm      106
Mn      100         Eu       91
Ce       13         Dy      176
Nd       19         Ho      133

Compound ferromagnets
Solid   T_C (K)     Solid   T_C (K)
MnB      152        Fe3C    483
MnAs     670        FeP     215
MnBi     620        CrTe    339
MnSb     710        CrBr3    37
FeB      598        CrI3     68
Fe2B    1043        CrO2    386

Compound antiferromagnets
Solid    T_N (K)    Solid   T_N (K)
MnO      122        FeKF3   115
FeO      198        CoKF3   125
CoO      291        MnF2     67
NiO      600        MnCl2     2
MnRbF3    54        FeF2     78
MnKF3     88        CoF2     38

Another interesting characteristic feature of ferromagnets is the number of Bohr magnetons per atom, $n_B$. This quantity is a measure of the difference in occupation of spin-up and spin-down states in the solid; it is determined by the band structure of the spin-polarized system of electrons in the periodic solid (values of $n_B$ can be readily obtained from band-structure calculations of the type described in chapter 4). The higher the value of $n_B$ the more pronounced is the magnetic behavior of the solid.

However, there is no direct correlation between $n_B$ and the Curie temperature. Finally, the behavior of ferromagnets is characterized by the saturation magnetization $M_0$, that is, the highest value that the magnetization can attain. This value corresponds to all the spins being aligned in the same direction; values of this quantity for elemental ferromagnets are given in Table 7.2. The saturation magnetization is not attained spontaneously in a real ferromagnet but requires the application of an external field. This is related to interesting physics: in a real ferromagnet the dominant magnetic interactions at the atomic scale tend to orient the magnetic moments at neighboring sites parallel to each other, but this does not correspond to a globally optimal state from the energetic point of view. Other dipoles placed in the field of a given magnetic dipole would be oriented at various angles relative to the original dipole in order to minimize the magnetic dipole–dipole interactions. The dipole–dipole interactions are much weaker than the short range interactions responsible for magnetic order. However, over macroscopically large distances the sheer number of dipole–dipole interactions dictates that their contribution to the energy should also be optimized. The system accommodates this by breaking into regions of different magnetization called domains. Within each domain the spins are oriented parallel to each other, as dictated by the atomic scale interactions (the magnetization of a domain is the sum of all the spins in it). The magnetizations of different domains are oriented in a pattern that tries to minimize the magnetic dipole–dipole interactions over large distances. An illustration of this effect in a simple 2D case is shown in Fig. 7.7. The domains of different magnetization are separated by boundaries called domain walls. In a domain wall the orientation of spins is gradually changed from the one which dominates on one side of the wall to that of the other side. This change in spin orientation takes place over many interatomic distances in order to minimize the energy cost due to the disruption of order at the microscopic scale induced by the domain wall.

Figure 7.7. Illustration of magnetic domains in a ferromagnet. The large arrows indicate the magnetization in each domain. The pattern of domain magnetizations follows the field of a magnetic dipole situated at the center of each figure. Left: this corresponds to a zero external magnetic field, with the total magnetization averaging to zero. Right: this corresponds to a magnetic field in the same direction as the magnetization of the domain at the center, which has grown in size at the expense of the other domains; domains with magnetization opposite to the external field have shrunk the most. Inset: the domain wall structure at microscopic scale in terms of individual spins.

If the extent of the domain wall were of order an interatomic distance the change in spin orientation would be drastic across the wall leading to high energy cost. Spreading the spin-orientation change over many interatomic distances minimizes this energy cost, very much like in the case of spin waves.

In an ideal solid the domain sizes and distribution would be such that the total magnetization is zero. In real materials the presence of a large number of defects (see Part II, especially chapters 9, 10 and 11) introduces limitations in the creation and placement of magnetic domain walls. The magnetization $M$ can be increased by the application of an external magnetic field $H$. The external field will favor energetically the domains of magnetization parallel to it, and the domain walls will move to enlarge the size of these domains at the expense of domains with magnetization different from the external field. The more the magnetization of a certain domain deviates from the external field the higher its energy will be and consequently the more this domain will shrink, as illustrated in Fig. 7.7. For relatively small fields this process is reversible, since the domain walls move only by a small amount and they can revert to their original configuration upon removal of the field. For large fields this process becomes irreversible, because if the walls have to move far, eventually they will cross regions with defects which will make their return to the original configuration impossible after the field is removed. Thus, when the field is reduced to zero after saturation, the magnetization does not return to its original zero value but has a positive value because of limitations in the mobility of domain walls due to defects in the solid. It is therefore necessary to apply a large field in the opposite direction, denoted as $-H_c$ in Fig. 7.8, to reduce the magnetization $M$ back to zero; this behavior is referred to as "hysteresis".¹ Continuing to increase the field in the negative direction will again lead to a state with saturated magnetization, as indicated in Fig. 7.8. When the field is removed, the magnetization now has a non-zero value in the opposite direction, which requires the application of a positive field to reduce it to zero. This response of the system to the external field is called a "hysteresis loop".

¹ From the Greek word υστέρησις, which means lagging behind.

7.5 Crystal electrons in an external magnetic field

In this section we consider the response of electrons in an external magnetic field. This response is particularly dramatic when the electrons are confined in two dimensions and are subjected to a very strong perpendicular magnetic field at very low temperature.

Figure 7.8. Hysteresis curve of the magnetization $M$ in a ferromagnet upon application of an external field $H$. The original magnetization curve starts at $H = 0$, $M = 0$ and extends to the saturation point, $M = M_0$. When the field is reduced back to zero the magnetization has a finite value; zero magnetization is obtained again for the reversed field value $-H_c$. The circular insets next to the origin, the end-points and the arrows along the hysteresis curve indicate schematically the changes in magnetization due to domain wall motion.

We will therefore concentrate the discussion mostly on those conditions. To begin, let us explore how electrons can be confined in two dimensions. In certain situations it is possible to form a flat interface between two crystals. We will take $x, y$ to be the coordinate axes on the interface plane and $z$ the direction perpendicular to it; the magnetic field will be $\mathbf{H} = H\hat{\mathbf{z}}$. A particular arrangement of an insulator and doped semiconductor produces a potential well in the direction perpendicular to the interface plane which quantizes the motion of electrons in this direction. This is illustrated in Fig. 7.9: the confining potential $V_{conf}(z)$ can be approximated as nearly linear on the doped semiconductor side and by a hard wall on the insulator side. The electron wavefunctions $\psi(\mathbf{r})$ can be factored into two parts:

$$\psi(\mathbf{r}) = \bar{\psi}(x, y) f(z) \tag{7.35}$$

where the first part $\bar{\psi}(x, y)$ describes the motion on the interface plane and the second part $f(z)$ describes motion perpendicular to it. The latter part experiences the confining potential $V_{conf}(z)$, which gives rise to discrete levels. When these levels are well separated in energy and the temperature is lower than the separation between levels, only the lowest level is occupied. When all electrons are at this lowest level corresponding to the wavefunction $f_0(z)$, their motion in the $z$ direction is confined, making the system of electrons essentially a two-dimensional one. This arrangement is called a quantum well, and the system of electrons confined in two dimensions in the semiconductor is called an inversion layer. A more detailed discussion of electronic states in doped semiconductors and the inversion layer is given in chapter 9.

Figure 7.9. The confining potential in the $z$ direction $V_{conf}(z)$, and the energies $\epsilon_0$, $\epsilon_1$, and wavefunctions $f_0(z)$, $f_1(z)$, of the two lowest states. The Fermi level lies between $\epsilon_0$ and $\epsilon_1$.

In the following discussion we will ignore the crystal momentum $\mathbf{k}$ and band index normally associated with electronic states in the crystal, because the density of confined electrons in the inversion layer is such that only a very small fraction of one band is occupied. Specifically, the density of confined electrons is of order $n = 10^{12}$ cm$^{-2}$, whereas a full band corresponding to a crystal plane can accommodate of order $(10^{24}\ \mathrm{cm}^{-3})^{2/3} = 10^{16}$ cm$^{-2}$ states. Therefore, the confined electrons occupy a small fraction of the band near its minimum. It should be noted that the phenomena we will describe below can also be observed for a system of holes at similar density. For both the electron and the hole systems the presence of the crystal interface is crucial in achieving confinement in two dimensions, but is otherwise not important in the behavior of the charge carriers (electrons or holes) under the conditions considered here.

7.5.1 de Haas–van Alphen effect

Up to this point we have treated the electrons as a set of classical charge carriers, except for the confinement in the $z$ direction which introduces quantization of the motion along $z$, and localization in the $xy$ plane. Now we introduce the effects of quantization in the remaining two dimensions. The behavior of electrons in the plane exhibits quantization in two ways, which are actually related (see Problem 4). Here we give only a qualitative discussion of these two levels of quantization.

1. In the presence of a perpendicular magnetic field $\mathbf{H} = H\hat{\mathbf{z}}$ the electron orbits are quantized in units of the cyclotron frequency $\omega_c = eH/mc$.

The corresponding energy levels are equally spaced, separated by intervals $\hbar\omega_c$; these are called Landau levels. This is intuitively understood as the quantization of circular motion due to a harmonic oscillator potential.

2. The number of electronic states that each Landau level can accommodate is determined by the quantization of magnetic flux. This is intuitively understood as a consequence of the fact that there exist spatial limitations in the placement of the circular orbits. The total magnetic flux through the plane where the electrons are confined is $H(WL)$, where $W$ is the width and $L$ the length of the plane; we define the area $A$ of the plane as $A = WL$. The number of flux quanta corresponding to this total flux is

$$f = \frac{HA}{\phi_0}$$

with $\phi_0$ the flux quantum defined in Eq. (7.5).

Suppose that in order to accommodate the total number of electrons $N$ we need to fill the Landau levels up to index $l$, that is, we need exactly $l + 1$ Landau levels starting with level 0. Then we will have

$$N = (l+1) f = (l+1) \frac{HA}{\phi_0} \Rightarrow l + 1 = \frac{N\phi_0}{AH} = n \frac{ch}{eH} \Rightarrow H = n \frac{ch}{e(l+1)} \tag{7.36}$$

where $n = N/A$ is the density of electrons in the plane. Thus, if the field $H$ has precisely the value $H = nch/e(l+1)$ exactly $l + 1$ Landau levels will be filled with index numbers $0, 1, \ldots, l$. If the electrons cannot be accommodated exactly by an integer number of Landau levels, then levels up to index $l$ will be completely filled and the level with index $l + 1$ will be partially filled, as illustrated in Fig. 7.10. In this case we will have

$$(l+1) f < N < (l+2) f \Longrightarrow \frac{1}{l+2} < b = \frac{AH}{N\phi_0} < \frac{1}{l+1}$$

where we have defined the new variable $b$ for convenience. The total energy of the system in this situation will be given by

$$E(H) = \sum_{j=0}^{l} f \hbar\omega_c \left( j + \frac{1}{2} \right) + [N - (l+1) f]\, \hbar\omega_c \left( l + \frac{3}{2} \right)$$

Using the definition of the frequency $\omega_c$, the Bohr magneton $\mu_B$ and the variable $b$, we obtain for the total energy per particle

$$\frac{E(H)}{N} = \mu_B \frac{N\phi_0}{A} \left[ b(2l+3) - b^2 (l^2 + 3l + 2) \right], \qquad \frac{1}{l+2} < b < \frac{1}{l+1},\ l \geq 0$$
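As a consistency check on the closed-form energy per particle, the level-by-level sum can be evaluated directly and compared with it. This is a sketch under stated assumptions: energies are measured in units of $\mu_B N\phi_0/A$, each level holds $f = bN$ states, and the level spacing is taken as $\hbar\omega_c = 2\mu_B H$ (from $\mu_B = e\hbar/2mc$), which equals $2b$ in these units. The function names are hypothetical.

```python
def energy_from_levels(b, l):
    """E/N in units of mu_B * N * phi_0 / A, summed level by level.

    Levels 0..l are full (a fraction b of the electrons each); the
    remaining fraction 1 - (l + 1) * b sits in level l + 1.
    """
    hw = 2.0 * b  # hbar * omega_c = 2 mu_B H, expressed in the chosen units
    filled = sum(b * hw * (j + 0.5) for j in range(l + 1))
    partial = (1.0 - (l + 1) * b) * hw * (l + 1.5)
    return filled + partial

def energy_closed_form(b, l):
    """Closed-form expression b(2l+3) - b^2 (l^2 + 3l + 2)."""
    return b * (2 * l + 3) - b * b * (l * l + 3 * l + 2)

# b must satisfy 1/(l+2) < b < 1/(l+1) for the chosen l.
for l, b in [(0, 0.7), (1, 0.4), (2, 0.3)]:
    print(l, b, energy_from_levels(b, l), energy_closed_form(b, l))
```

The two columns agree for every admissible pair $(l, b)$, which confirms the algebra that leads from the sum to the closed form.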

Figure 7.10. Schematic representation of filling of Landau levels by flux quanta $\phi_0$ (shown as shaded circles), for $H = nhc/e(l+1)$, for $nhc/e(l+2) < H < nhc/e(l+1)$ and for $H = nhc/e(l+2)$: at some value of the magnetic field $H$ each Landau level can accommodate $HA/\phi_0$ quanta (with $A$ the area of the plane) and the total number of electrons $N$ fills all levels up to index $l$, that is, a total of $l + 1$ levels; as the field decreases each level can accommodate fewer flux quanta and some electrons move to the next level, with index $l + 1$; eventually, this level also becomes filled when the value of the field decreases enough. Notice that the spacing of the levels, given by $\hbar\omega_c = \hbar(eH/mc)$, also decreases as the field decreases. Between fillings of entire levels the Fermi energy is pinned at the energy of the partially filled level.

For large enough field, $H > N\phi_0/A$, that is, $b > 1$, only the lowest Landau level ($l = 0$) will be occupied, giving for the energy per particle

$$\frac{E(H)}{N} = \mu_B \frac{N\phi_0}{A}\, b, \qquad b > 1,\ l = 0$$

We can now calculate the magnetization $M$ and susceptibility $\chi$ per unit volume (with $\Omega$ the volume), as

$$M \equiv -\frac{1}{\Omega} \frac{\partial E}{\partial H} = -\frac{N}{\Omega} \mu_B, \qquad b > 1,\ l = 0$$

$$\phantom{M} = \frac{N}{\Omega} \mu_B \left[ 2(l^2 + 3l + 2)\, b - (2l + 3) \right], \qquad \frac{1}{l+2} < b < \frac{1}{l+1},\ l \geq 0$$

$$\chi \equiv -\frac{1}{\Omega} \frac{\partial^2 E}{\partial H^2} = 0, \qquad b > 1,\ l = 0$$

$$\phantom{\chi} = \frac{1}{\Omega} \frac{2A}{\phi_0} \mu_B\, (l^2 + 3l + 2), \qquad \frac{1}{l+2} < b < \frac{1}{l+1},\ l \geq 0$$

A plot of the magnetization in units of $(N/\Omega)\mu_B$ as a function of the variable $b$, which is the magnetic field in units of $N\phi_0/A$, is shown in Fig. 7.11: the magnetization shows oscillations between positive and negative values as a function of the magnetic field! This behavior is known as the de Haas–van Alphen effect. Although we have shown how it arises in a 2D system, the effect is observed in 3D systems as well, in which case the discontinuities in the magnetization are rounded off. A number of other physical quantities exhibit the same type of oscillations as a function of the magnetic field $H$; examples include the conductivity (this is known as the Shubnikov–de Haas effect and is usually easier to measure than the de Haas–van Alphen effect), the sound attenuation, the magnetoresistance and the magnetostriction (strain induced on the sample by the magnetic field).
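The sawtooth character of the magnetization is easy to see by evaluating the piecewise expression numerically. The sketch below assumes the expressions for $M$ given in this section, with $M$ in units of $(N/\Omega)\mu_B$ and $b = HA/N\phi_0$; the helper names `landau_index` and `magnetization` are hypothetical. Within each interval $1/(l+2) < b < 1/(l+1)$ the magnetization rises linearly from $-1$ to $+1$, then jumps back.

```python
import math

def landau_index(b):
    """Index l of the highest completely filled level for 0 < b <= 1.

    Chosen so that 1/(l+2) < b <= 1/(l+1); values of b that are exactly
    the inverse of an integer sit on a discontinuity of M and may be
    sensitive to floating-point rounding.
    """
    return math.floor(1.0 / b) - 1

def magnetization(b):
    """M in units of (N/Omega) mu_B as a function of b = H A / (N phi_0)."""
    if b > 1.0:
        return -1.0  # only the lowest level is occupied
    l = landau_index(b)
    return 2 * (l * l + 3 * l + 2) * b - (2 * l + 3)

for b in (2.0, 0.9, 0.6, 0.49, 0.4, 0.34, 0.3, 0.26):
    print(f"b = {b:4.2f}   M = {magnetization(b):+.3f}")
```

Scanning $b$ downward from large values reproduces the oscillation between negative and positive magnetization described in the text, with the period of the oscillations shrinking as $b$ decreases.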

Figure 7.11. Illustration of the de Haas–van Alphen effect in a two-dimensional electron gas: the magnetization $M$, measured in units of $(N/\Omega)\mu_B$, exhibits oscillations as a function of the magnetic field $H$, measured in units of $N\phi_0/A$.

The oscillations of the magnetization as a function of the magnetic field can be used to map out the Fermi surface of metals, through the following argument, due to Onsager [94]. When a 3D metal is in an external magnetic field, the plane perpendicular to the field will intersect the Fermi surface of the metal producing a cross-sectional area that depends on the shape of the Fermi surface. Of course there are many cross-sections of a 3D Fermi surface along a certain direction, but the relevant ones are those with the largest and smallest areas (the extremal values). These cross-sectional areas will play a role analogous to the area of the plane on which the electrons were confined in the above example. Thus, our simple treatment of electrons confined in 2D becomes relevant to the behavior of electrons with wavevectors lying on a certain plane that intersects the Fermi surface. Although additional complications arise from the band structure, in principle we can use the oscillations of the magnetization to determine the extremal cross-sectional areas enclosed by the Fermi surface on this plane. By changing the orientation of the metal with respect to the direction of the field, different cross-sections of the Fermi surface come into play, which makes it possible to determine the entire 3D Fermi surface by reconstructing it from its various cross-sections. Usually, a band-structure calculation of the type discussed in detail in chapter 4 is indispensable in reconstructing the exact Fermi surface, in conjunction with magnetic measurements which give the precise values of the extremal areas on selected planes.

7.5.2 Classical and quantum Hall effects

We consider next what happens if in addition to the perpendicular magnetic field we apply also an electric field in the $x$ direction, $E_x$, as shown in Fig. 7.12. The applied electric field will induce a current $j_x = (-e) n v_x$ in the same direction, where $v_x$ is the velocity and $n$ is the density of electrons in the plane. Due to the presence of the magnetic field, the moving electrons will experience a force in the $y$ direction given by

$$F_y = (-e) \frac{v_x}{c} H \tag{7.37}$$

Figure 7.12. Geometry of the two-dimensional electrons in the Hall effect, with the external magnetic field $H$ and the $x$, $y$ components of the electric field $E_x$, $E_y$, and the current $j_x$, $j_y$.

Using the expression for the current, and the fact that the motion in the $y$ direction can be associated with an effective electric field $E_y$, we obtain

$$E_y = \frac{F_y}{(-e)} = \frac{j_x}{(-e)nc} H \Rightarrow \frac{j_x}{E_y} = \frac{(-e)nc}{H} \tag{7.38}$$

The expression derived in Eq. (7.38) is defined as the Hall conductivity $\sigma_H$: it is the ratio of the current in the $x$ direction to the effective electric field $E_y$ in the perpendicular direction, as usually defined in the classical Hall effect in electrodynamics. In more general terms, this ratio can be viewed as one of the off-diagonal components of the conductivity tensor $\sigma_{ij}$ ($i, j = x, y$), which relates the current to the electric field:

$$j_x = \sigma_{xx} E_x + \sigma_{xy} E_y, \qquad j_y = \sigma_{yx} E_x + \sigma_{yy} E_y \tag{7.39}$$

The electric field can be expressed in terms of the current using the resistivity tensor $\rho_{ij}$ ($i, j = x, y$):

$$E_x = \rho_{xx} j_x + \rho_{xy} j_y, \qquad E_y = \rho_{yx} j_x + \rho_{yy} j_y \tag{7.40}$$

which is the inverse of the conductivity tensor. Assuming that $\rho_{yx} = -\rho_{xy}$, $\rho_{xx} = \rho_{yy}$, that is, an isotropic solid, it is easy to find the relation between the two tensors:²

$$\sigma_{xx} = \frac{\rho_{xx}}{\rho_{xy}^2 + \rho_{xx}^2} = \sigma_{yy}, \qquad \sigma_{xy} = \frac{-\rho_{xy}}{\rho_{xy}^2 + \rho_{xx}^2} = -\sigma_{yx} \tag{7.41}$$

² In three dimensions the two indices of the resistivity and conductivity tensors take three values each ($x$, $y$, $z$).
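The relation between the two tensors can be verified by inverting the resistivity matrix numerically. The following is a minimal sketch, assuming the isotropic form $\rho_{yx} = -\rho_{xy}$, $\rho_{xx} = \rho_{yy}$ used in the text; the function name and the sample values are arbitrary illustrations.

```python
import numpy as np

def conductivity_from_resistivity(rho_xx, rho_xy):
    """Invert the isotropic 2D resistivity tensor to get the conductivity tensor."""
    rho = np.array([[rho_xx, rho_xy],
                    [-rho_xy, rho_xx]])
    return np.linalg.inv(rho)

rho_xx, rho_xy = 0.3, 1.2            # arbitrary sample values
sigma = conductivity_from_resistivity(rho_xx, rho_xy)
denom = rho_xy**2 + rho_xx**2

print("sigma_xx:", sigma[0, 0], "expected", rho_xx / denom)
print("sigma_xy:", sigma[0, 1], "expected", -rho_xy / denom)
print("sigma_yx:", sigma[1, 0], "expected", rho_xy / denom)

# With rho_xx = 0 the diagonal conductivity also vanishes and
# sigma_xy reduces to -1 / rho_xy.
print(conductivity_from_resistivity(0.0, rho_xy))
```

The last call illustrates the somewhat counterintuitive point that, in the presence of a finite off-diagonal component, vanishing diagonal resistivity and vanishing diagonal conductivity occur together.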

We assume next that the conditions in the system of confined electrons are such that $\sigma_{xx} = 0 \Rightarrow \rho_{xx} = 0$; the meaning of this assumption will become evident shortly. We then find for the Hall conductivity

$$\sigma_H = \sigma_{xy} = -\frac{1}{\rho_{xy}} = \frac{j_x}{E_y} \tag{7.42}$$

Combining this with the expression from Eq. (7.38), we obtain

$$\sigma_H = \frac{-enc}{H} = -\frac{e^2}{h} (l + 1) \tag{7.43}$$

where $l + 1$ is the number of filled Landau levels (up to index $l$). Equivalently, we can obtain the Hall resistivity (the off-diagonal component of the resistivity tensor) as

$$\rho_{xy} = -\frac{1}{\sigma_{xy}} = \frac{h}{e^2} \frac{1}{l + 1} \tag{7.44}$$

These relations suggest that the Hall conductivity and resistivity are quantized. This conclusion rests on the assumption that the value of the magnetic field and the density of electrons in the plane are such that exactly $l + 1$ Landau levels are filled, as illustrated in Fig. 7.10. However, it is obvious that this need not be always the case, since for a given density of electrons $n$, a small change in the value of the magnetic field will violate the condition of Eq. (7.36) that $n(ch/eH)$ is an integer. In 1980, experiments by von Klitzing and coworkers [95] showed that for very high magnetic fields and low temperatures, the Hall resistivity $\rho_{xy}$ as a function of $H$ has wide plateaus that correspond to integer values of $l + 1$ in Eq. (7.44). When $\rho_{xy}$ has the quantized values suggested by Eq. (7.44), the diagonal resistivity $\rho_{xx}$ vanishes. This is shown schematically in Fig. 7.13. This fascinating observation was called the integer quantum Hall effect (IQHE); for its discovery, K. von Klitzing was awarded the 1985 Nobel prize for Physics.

Why does the IQHE exist? To put it differently, what might we expect for values of $n$ and $H$ that do not satisfy the condition of an integer number of filled Landau levels? Suppose that we start the experiment with a value of the magnetic field that corresponds to exactly one filled Landau level. According to Eq. (7.36) this level can accommodate a density of electrons $n = eH/ch$. Let us consider the density $n$ to be fixed by the number of electrons and the geometry of the system. We expect that in this case there is no current, since a filled Landau level corresponds to a saturated system analogous to the filled band discussed in chapter 3. Then we expect $\rho_{xy}$ to have the value given by Eq. (7.44) with $l = 0$ and $\rho_{xx} = \sigma_{xx} = 0$, as we had to assume earlier in order to arrive at the quantized values of $\rho_{xy}$. As soon as the value of the magnetic field is decreased slightly, the total magnetic flux through the plane will decrease, and the first Landau level will not be able to accommodate all the electrons in the system, so that a few electrons will have to go to the next Landau level.
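To get a feeling for the magnitudes involved, the quantized Hall resistance and the fields at which an integer number of Landau levels is exactly filled can be evaluated numerically. The sketch below makes assumptions that go beyond the text: it works in SI units, where the condition $H = nch/e(l+1)$ becomes $B = nh/e(l+1)$, it ignores spin and valley degeneracy, and it uses the exact SI values of $h$ and $e$. The function names are hypothetical, and the density is the order-of-magnitude value $n = 10^{12}$ cm$^{-2}$ quoted earlier for an inversion layer.

```python
H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact in the 2019 SI)
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact in the 2019 SI)

def hall_resistance(l):
    """Quantized Hall resistance (h/e^2)/(l+1) in ohms for l+1 filled levels."""
    return H_PLANCK / E_CHARGE**2 / (l + 1)

def filling_field(n_per_m2, l):
    """Field in tesla (SI form, B = n h / (e (l+1))) at which l+1 levels are exactly full."""
    return n_per_m2 * H_PLANCK / (E_CHARGE * (l + 1))

n = 1.0e16  # electrons per m^2, i.e. 10^12 per cm^2
for l in range(4):
    print(f"l + 1 = {l + 1}:  rho_xy = {hall_resistance(l):10.3f} ohm,"
          f"  B = {filling_field(n, l):6.2f} T")
```

For this density the single-filled-level condition falls at a few tens of tesla, which is consistent with the statement that very high magnetic fields are required.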

Figure 7.13. Schematic representation of the integer quantum Hall effect. The plateaus in the Hall resistivity $\rho_{xy}$ as a function of the magnetic field $H$, where its values are given by $(h/e^2)/(l+1)$ with $l + 1$ an integer, are identified. The diagonal resistivity $\rho_{xx}$ vanishes for the values of $H$ corresponding to the plateaus.

This next level will have only a very small occupation, so that it will be able to carry current, in which case $\rho_{xx} \neq 0$, and the value of $\rho_{xy}$ does not have to remain the same as before. Indeed, for an ideal system we would expect $\rho_{xy}$ to take the quantized values of Eq. (7.44) only for the specific values of $H$ that satisfy this equation (with fixed $n$), and $\rho_{xx}$ to vanish at exactly these values only. For the rest of the values of $H$, the system should behave like the classical system of electrons, with linear Hall resistivity in $H$. The Fermi energy will jump between the values of the Landau levels and will be pinned at these values, as successive levels are being filled one after the other with decreasing magnetic field, as illustrated in Fig. 7.14.

What leads to the existence of plateaus in the value of $\rho_{xy}$ is the fact that not all states in the system are extended Bloch states. In a real system, the Landau levels are not $\delta$-functions in energy but are broadened due to the presence of impurities, as indicated in Fig. 7.15. Only those states with energy very close to the ideal Landau levels are extended and can carry current, while the states lying between ideal Landau levels are localized and cannot carry current. As the magnetic field is varied and localized states with energy away from an ideal Landau level are being filled, there can be no change in the conductivity of the system which remains stuck to its quantized value corresponding to the ideal Landau level. This produces the plateaus in the Hall resistivity $\rho_{xy}$.

localized states extended states εF l-1 l l+1 l+2 ε Figure 7. This has led to advances in metrology.7. plateaus in the Hall resistivity ρx y . the ﬁne-structure . denoted by a vertical dashed line. . with l = 0. When the magnetic ﬁeld takes the values H = nch /e(l + 1). The spacing between levels is h ¯ ωc = h ¯ (eH / mc). as a function of the magnetic ﬁeld H . and the unshaded narrow regions around the ideal levels represent the extended states. 1. the grey areas represent the localized states due to the presence of impurities. Position of the Fermi level E F in the spectrum of Landau levels.. For more detailed arguments we refer the reader to the original theoretical papers by Laughlin [96] and by Halperin [97]. 2. For example. which can dominate the electronic behavior. exactly l + 1 Landau levels are ﬁlled and the Fermi level. We should emphasize that the above discussion is an oversimpliﬁed version of very careful theoretical analysis.5 Crystal electrons in an external magnetic ﬁeld 277 εF H= chn e l= 0 1 2 3 ε H= chn 2e l= 0 1 2 3 4 5 6 ε H= chn 3e l= 0 1 2 3 4 5 6 7 8 9 ε Figure 7.15.14. since the ratio e2 / h appears in several fundamental units. reaching one part in 107 . lies between levels with index l and l + 1. the levels are shown as vertical lines representing inﬁnitely sharp δ functions. . . When the Fermi level F lies between levels l and l + 1 only localized states are available and these do not contribute to conduction. This fascinating phenomenon demonstrates the importance of defects in the crystal. Landau levels in the two-dimensional electron gas: the solid lines represent the ideal levels (no impurities). We should also note that the plateaus in Hall resistivity can be measured with unprecedented precision.
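The plateau values discussed above can be made concrete with a short numerical sketch. The text works in Gaussian units; the snippet below instead assumes SI units, in which the plateau resistivity has the same form $h/[e^2(l+1)]$ and comes out in ohms. The function name `hall_plateau_ohm` is a label chosen here for illustration, and the constants are the exact SI defining values of $h$ and $e$.

```python
# Exact SI defining constants (2019 redefinition of the SI).
PLANCK_H = 6.62607015e-34          # J s
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def hall_plateau_ohm(l):
    """Hall resistivity on the plateau with l + 1 filled Landau levels.

    Implements rho_xy = (h / e^2) / (l + 1), the form of Eq. (7.44),
    evaluated in SI units so that the result is in ohms.
    """
    if l < 0:
        raise ValueError("Landau level index l must be non-negative")
    return PLANCK_H / (ELEMENTARY_CHARGE ** 2 * (l + 1))


if __name__ == "__main__":
    # Print the first few plateau values; the first is about 25.8 kilo-ohm.
    for l in range(4):
        print(f"l + 1 = {l + 1}: rho_xy = {hall_plateau_ohm(l):.3f} ohm")
```

Each plateau is the $l = 0$ value divided by an integer, which is why a resistance measurement on a plateau gives direct access to the ratio $h/e^2$.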

For example, the fine-structure constant is given by

$$\alpha = \frac{e^2}{\hbar c} = \frac{e^2}{h}\,\frac{2\pi}{c} \qquad (7.45)$$

and the speed of light $c$ is known to one part in $10^9$ from independent measurements. Thus, using phenomena related to the behavior of electrons in a solid, one can determine the value of a fundamental constant that plays a central role in high energy physics!

Fascinating though the discovery of the IQHE may have seemed, an even more remarkable observation was in store: a couple of years after the experiments of von Klitzing and co-workers, experiments on samples of very high purity at high magnetic fields and low temperatures by Tsui, Störmer and Gossard [98] revealed that the plateaus in the Hall resistivity can also occur for values larger than $(h/e^2)$. In particular, they occur at values $3, 3/2, 5, 5/2, 5/3, \ldots$, in units of $(h/e^2)$, corresponding to fillings of $\nu = 1/3, 2/3, 1/5, 2/5, 3/5, \ldots$. This discovery was called the fractional quantum Hall effect (FQHE). According to Eq. (7.44), this implies that when specific fractions of the Landau levels are filled, the system again behaves like a saturated band. The states corresponding to the partial fillings are referred to as "incompressible quantum fluids". The explanation of the FQHE involves very interesting many-body physics which lies beyond the scope of the present treatment. Suffice to say that the FQHE has generated enormous interest in the condensed matter physics community. A hierarchy of states at fractions

$$\nu = \frac{p}{2qp + 1} \qquad (7.46)$$

with $p$, $q$ integers, that lead to plateaus in the Hall conductivity has been predicted theoretically [99] and observed experimentally. Evidently, all these fractions have odd denominators. Laughlin originally postulated a many-body wavefunction to describe the states with $\nu = 1/(2q+1)$ where $q$ is a positive integer [100]; this wavefunction for $N$ electrons has the form

$$\Psi_q(\mathbf{r}_1, \ldots, \mathbf{r}_N) = \prod_{j<k}^{N} (z_j - z_k)^{(2q+1)} \exp\left( -\sum_{k=1}^{N} |z_k|^2 \right), \qquad z_k = x_k + i y_k \qquad (7.47)$$

where only the in-plane coordinates $(x, y)$ of each electron are involved in a special linear combination that gives the complex coordinate $z = x + iy$. Evidently, this is an antisymmetric wavefunction upon exchange of any pair of electron positions. The magnitude squared of this wavefunction can be written as

$$|\Psi_q(\mathbf{r}_1, \ldots, \mathbf{r}_N)|^2 = \exp\left[ \Phi_q(z_1, \ldots, z_N)/(q + 1/2) \right]$$

$$\Phi_q(z_1, \ldots, z_N) = (2q+1)^2 \sum_{j<k}^{N} \ln |z_j - z_k| - (2q+1) \sum_{k=1}^{N} |z_k|^2$$

where $\Phi_q(z_1, \ldots, z_N)$ has the form of an electrostatic potential energy between charged particles in 2D: the first term represents the mutual repulsion between particles of charge $(2q+1)$, and the second term the attraction to a uniform background of opposite charge. This analogy can be used to infer the relation of this wavefunction to a system of quasiparticles with fractional charge. Many interesting theoretical ideas have been developed to account for the behavior of the incompressible quantum fluids, which involves the highly correlated motion of all the electrons in the system. More recently, theorists have also addressed the properties of partially filled Landau levels with even denominator fractions (such as $\nu = 1/2$), at which there is no quantized Hall effect; it is intriguing that those states, despite the highly correlated motion of the underlying physical particles, seem to be described well in terms of independent fermions, each of which carries an even number of fictitious magnetic flux quanta [101]. R.B. Laughlin, D.C. Tsui and H.L. Störmer were awarded the 1998 Nobel prize for Physics for their work on the FQHE, a testament to the great excitement which these experimental discoveries and theoretical developments have created.

Further reading

1. For a thorough review of the Heisenberg antiferromagnetic model we refer the reader to the article by E. Manousakis, Rev. Mod. Phys. 63, 1 (1991).
2. An extensive treatment of theoretical models for magnetic behavior can be found in Theoretical Solid State Physics, vol. 1, W. Jones and N.H. March (J. Wiley, London, 1973).
3. The Quantum Hall Effect, R.E. Prange and S.M. Girvin, eds. (2nd edn, Springer-Verlag, New York, 1990). This is a collection of interesting review articles on the quantum Hall effect.
4. The Fractional Quantum Hall Effect, T. Chakraborty and P. Pietiläinen (Springer-Verlag, Berlin, 1988).
5. Perspectives in Quantum Hall Effects, S. Das Sarma and A. Pinczuk, eds. (J. Wiley, New York, 1997).
6. 'The quantum Hall effect', T. Chakraborty in Handbook on Semiconductors, vol. 1, pp. 977–1038 (ed. P.T. Landsberg, North Holland, 1992).

Problems

1. Verify the assignment of the total spin state $S$, angular momentum state $L$ and total angular momentum state $J$ of the various configurations for the $d$ and $f$ electronic shells, according to Hund's rules, given in Table 7.1.
2. Prove the expressions for the magnetization Eq. (7.14) and band energy Eq. (7.16) using the spin-up and spin-down filling of the two spin states defined in Eq. (7.13).
3. In the Stoner theory of spontaneous magnetization of a system consisting of $N$ electrons, the energy of an electron with magnetic moment $\pm\mu_B$ is modified by the term $\mp\mu_B(\mu_B M)$, where $M$ is the magnetization number (the difference of spin-up and spin-down numbers of electrons). We introduce a characteristic temperature $\Theta_S$

for the onset of spontaneous magnetization by defining the magnetic contribution to the energy as

$$E_{mag} = \bar{s}\, k_B \Theta_S$$

with $k_B$ the Boltzmann constant and $\bar{s} = M/N$ the dimensionless magnetic moment per particle. At finite temperature $T$, the magnetization number $M$ is given by

$$M = \Omega \int_{-\infty}^{\infty} \left[ n(\epsilon - \bar{s} k_B \Theta_S) - n(\epsilon + \bar{s} k_B \Theta_S) \right] g(\epsilon)\, d\epsilon$$

while the total number of electrons is given by

$$N = \Omega \int_{-\infty}^{\infty} \left[ n(\epsilon - \bar{s} k_B \Theta_S) + n(\epsilon + \bar{s} k_B \Theta_S) \right] g(\epsilon)\, d\epsilon$$

where $\Omega$ is the system volume, $n(\epsilon)$ is the Fermi occupation number,

$$n(\epsilon) = \frac{1}{e^{(\epsilon - \mu)/k_B T} + 1}$$

$g(\epsilon)$ is the density of states and $\mu$ is the chemical potential. For simplicity we will consider the free-electron model or, equivalently, a single band with parabolic energy dependence,

$$\epsilon_{\mathbf{k}} = \frac{\hbar^2}{2m} k^2$$

with $m$ the effective mass of electrons. We also define the Fermi integral function:

$$F_{1/2}(y) = \int_0^{\infty} \frac{x^{1/2}}{e^{x-y} + 1}\, dx$$

Show that the number of electrons per unit volume $n = N/\Omega$ and the magnetic moment per unit volume $m = M/\Omega$ are given by

$$n = \frac{2\pi}{\lambda^3} \left[ F_{1/2}(y+z) + F_{1/2}(y-z) \right]$$

$$m = \frac{2\pi}{\lambda^3} \left[ F_{1/2}(y+z) - F_{1/2}(y-z) \right]$$

where the variables $y$, $z$, $\lambda$ are defined as

$$y = \frac{\mu}{k_B T}, \qquad z = \frac{\Theta_S}{T}\,\bar{s}, \qquad \lambda = \left( \frac{4\pi^2\hbar^2}{2 m k_B T} \right)^{1/2}$$

Next, show that the equations derived above lead to

$$\frac{2 \bar{s}\, k_B \Theta_S}{\epsilon_F} = (1 + \bar{s})^{2/3} - (1 - \bar{s})^{2/3}$$

with $\epsilon_F$ the Fermi energy of the unpolarized electrons. From this result, derive the condition for the existence of spontaneous magnetization,

$$k_B \Theta_S > \frac{2}{3}\, \epsilon_F.$$

Find the upper limit of $\Theta_S$. (Hint: show that the Fermi integral function for large values of its argument can be approximated by

$$F_{1/2}(y) = \frac{2}{3}\, y^{3/2} + \frac{\pi^2}{12}\, y^{-1/2} + \cdots$$

The results quoted above are obtained by keeping only the leading order term in this expansion.)

4. We want to derive the quantization of energy and flux of the Landau levels in the quantum Hall effect. Assume that the magnetic field is generated by a vector potential $\mathbf{A} = (-yH, 0, 0)$, which is known as the Landau gauge. The Schrödinger equation for electrons confined in the $xy$ plane is

$$\frac{1}{2m} \left( \mathbf{p} - \frac{e}{c}\mathbf{A} \right)^2 \bar{\psi}(x, y) = \bar{\epsilon}\, \bar{\psi}(x, y) \qquad (7.48)$$

(a) Take $\bar{\psi}(x, y) = e^{ikx}\tilde{\psi}(y)$ and show that $\tilde{\psi}(y)$ satisfies a Schrödinger equation with a harmonic oscillator potential, with frequency $\omega_c = eH/mc$ (the cyclotron frequency). The free variable in this equation is $y$, scaled by $1/\lambda$ and shifted by $k\lambda$, where $\lambda^2 = \hbar/(\omega_c m)$. Find the eigenvalues and eigenfunctions of this Schrödinger equation. These are the energies of the Landau levels and the corresponding wavefunctions.

(b) Assume that the extent of the system is $L$ in the $x$ direction and $W$ in the $y$ direction. The eigenfunctions of the Schrödinger equation we solved above are centered at $y_0 = -\lambda^2 k$, which implies $0 < |k| \leq W/\lambda^2$. Use Born–von Karman boundary conditions for $x$,

$$\bar{\psi}(x + L, y) = \bar{\psi}(x, y)$$

to find the allowed values for $k$, and from those determine the number of states per unit area that each Landau level can accommodate. The answer must be the same as the one suggested by flux quantization arguments, Eq. (7.36), with $l = 0$.
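For readers who want to check their analytical answers to Problem 3 numerically, the following is a minimal sketch. It assumes the zero-temperature relation quoted in the problem, written in terms of the dimensionless ratio $\theta = k_B\Theta_S/\epsilon_F$, and finds the magnetic moment per particle $\bar{s}$ by bisection. The names `stoner_ratio` and `stoner_moment` are chosen here for illustration only, and the two limiting values of $\theta$ used in the code follow from the small-$\bar{s}$ expansion and from setting $\bar{s} = 1$ in that relation.

```python
def stoner_ratio(s):
    """Right-hand side of the Problem 3 relation divided by 2*s.

    The relation 2*s*theta = (1+s)^(2/3) - (1-s)^(2/3) is equivalent
    to theta = stoner_ratio(s) for 0 < s <= 1.
    """
    return ((1.0 + s) ** (2.0 / 3.0) - (1.0 - s) ** (2.0 / 3.0)) / (2.0 * s)


def stoner_moment(theta, tol=1e-12):
    """Moment per particle s for a given theta = k_B * Theta_S / eps_F.

    Assumption of this sketch: stoner_ratio(s) grows monotonically from
    2/3 (s -> 0) to 2**(-1/3) (s = 1), so the nonzero root is unique.
    """
    onset = 2.0 / 3.0               # below this, no spontaneous moment
    saturation = 2.0 ** (-1.0 / 3.0)  # above this, fully polarized
    if theta <= onset:
        return 0.0
    if theta >= saturation:
        return 1.0
    low, high = 1e-9, 1.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if stoner_ratio(mid) < theta:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    for theta in (0.60, 0.70, 0.75, 0.80):
        print(f"theta = {theta:.2f}: s = {stoner_moment(theta):.6f}")
```

Bisection is used because it needs only the monotonic behavior of the relation between $\bar{s}$ and $\theta$, not its derivative.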

8 Superconductivity

Superconductivity was discovered in 1911 by Kamerlingh Onnes [102] and remains one of the most actively studied aspects of the behavior of solids. It is a truly remarkable phenomenon of purely quantum mechanical nature; it is also an essentially many-body phenomenon which cannot be described within the single-particle picture. Because of its fascinating nature and of its many applications, superconductivity has been the focus of intense theoretical and experimental investigations ever since its discovery. Studies of superconductivity have gained new vigor since the discovery of high-temperature superconductors in 1987.

8.1 Overview of superconducting behavior

Superconductivity is mostly characterized by a vanishing electrical resistance below a certain temperature $T_c$, called the critical temperature. Below $T_c$ there is no measurable DC resistance in a superconductor and, if a current is set up in it, it will flow without dissipation practically forever: experiments trying to detect changes in the magnetic field associated with current in a superconductor give estimates that it is constant for $10^6$–$10^9$ years! Thus, the superconducting state is not a state of merely very low resistance, but one with a truly zero resistance. This is different from the case of very good conductors. In fact, materials which are very good conductors in their normal state typically do not exhibit superconductivity. The reason is that in very good conductors there is little coupling between phonons and electrons, since it is scattering by phonons which gives rise to resistance in a conductor, whereas electron-phonon coupling is crucial for superconductivity. The drop in resistance from its normal value above $T_c$ to zero takes place over a range of temperatures of order $10^{-2}$–$10^{-3}\,T_c$, as illustrated in Fig. 8.1(a), that is, the transition is rather sharp.

For typical superconductors $T_c$ is in the range of a few degrees kelvin, which has made it difficult to take advantage of this extraordinary behavior in practical applications, because cooling the specimen to within a few degrees of absolute zero

is quite difficult (it requires the use of liquid He, the only non-solid substance at such low temperatures, as the coolant, which is expensive and cumbersome). In 1986 a new class of superconducting materials was discovered by Bednorz and Müller, dubbed high-temperature superconductors, in which the $T_c$ is much higher than in typical superconductors: in general it is in the range of $\sim 90$ K, but in certain compounds it can exceed 130 K (see Table 8.2). This is well above the boiling point of N$_2$ (77 K), so that this much more abundant and cheap substance can be used as the coolant to bring the superconducting materials below their critical point. The discovery of high-temperature superconductors has opened the possibility of many practical applications, but these materials are ceramics and therefore more difficult to utilize than the classical superconductors, which are typically elemental metals or simple metallic alloys. It also re-invigorated theoretical interest in superconductivity, since it seemed doubtful that the microscopic mechanisms responsible for low-temperature superconductivity could also explain its occurrence at such high temperatures.

The new superconductors have complex crystal structures characterized by Cu–O octahedra arranged in various ways as in the perovskites (see discussion of these types of structures in chapter 1) and decorated by various other elements. In the following we will refer to the older, low-temperature superconductors as the conventional superconductors, to distinguish them from the high-temperature kind. We give examples of both conventional and high-temperature superconductors in Tables 8.1 and 8.2, respectively.

Table 8.1. Elemental conventional superconductors (Al, Cd, Gd, Hg, In, Ir, Mo, Sn, W, Pb, Re, Ru, Nb, Ta, Tc, Th, Ti, Tl, V, Os, Zn, Zr). Critical temperature $T_c$ (in kelvin), critical field $H_0$ (in oersted), and Debye frequency $\hbar\omega_D$ (in millielectronvolts). Source: Ref. [74].

J.G. Bednorz and K.A. Müller were awarded the 1987 Nobel prize for Physics for their discovery, which has sparked an extraordinary activity,

both in experiment and in theory, to understand the physics of the high-temperature superconductors.

Table 8.2. High-temperature superconductors. Critical temperature $T_c$ (in kelvin). In several cases there is fractional occupation of the dopant atoms, denoted by $x$ (with $0 < x < 1$), or there are equivalent structures with two different elements, denoted by (X,Y); in such cases the value of the highest $T_c$ is given. Source: Ref. [74].

| Material | $T_c$ | Material | $T_c$ |
|---|---|---|---|
| La$_2$CuO$_{4+\delta}$ | 39 | SmBaSrCu$_3$O$_7$ | 84 |
| La$_{2-x}$(Sr,Ba)$_x$CuO$_4$ | 35 | EuBaSrCu$_3$O$_7$ | 88 |
| La$_2$Ca$_{1-x}$Sr$_x$Cu$_2$O$_6$ | 60 | GdBaSrCu$_3$O$_7$ | 86 |
| YBa$_2$Cu$_3$O$_7$ | 93 | DyBaSrCu$_3$O$_7$ | 90 |
| YBa$_2$Cu$_4$O$_8$ | 80 | HoBaSrCu$_3$O$_7$ | 87 |
| Y$_2$Ba$_4$Cu$_7$O$_{15}$ | 93 | YBaSrCu$_3$O$_7$ | 84 |
| Tl$_2$Ba$_2$CuO$_6$ | 92 | ErBaSrCu$_3$O$_7$ | 82 |
| Tl$_2$CaBa$_2$Cu$_2$O$_8$ | 119 | TmBaSrCu$_3$O$_7$ | 88 |
| Tl$_2$Ca$_2$Ba$_2$Cu$_3$O$_{10}$ | 128 | HgBa$_2$CuO$_4$ | 94 |
| TlCaBa$_2$Cu$_2$O$_7$ | 103 | HgBa$_2$CaCu$_2$O$_6$ | 127 |
| TlSr$_2$Y$_{0.5}$Ca$_{0.5}$Cu$_2$O$_7$ | 90 | HgBa$_2$Ca$_2$Cu$_3$O$_8$ | 133 |
| TlCa$_2$Ba$_2$Cu$_3$O$_8$ | 110 | HgBa$_2$Ca$_3$Cu$_4$O$_{10}$ | 126 |
| Bi$_2$Sr$_2$CuO$_6$ | 10 | Pb$_2$Sr$_2$La$_{0.5}$Ca$_{0.5}$Cu$_3$O$_8$ | 70 |
| Bi$_2$CaSr$_2$Cu$_2$O$_8$ | 92 | Pb$_2$(Sr,La)$_2$Cu$_2$O$_6$ | 32 |
| Bi$_2$Ca$_2$Sr$_2$Cu$_3$O$_{10}$ | 110 | (Pb,Cu)Sr$_2$(La,Ca)Cu$_2$O$_7$ | 50 |

A number of other properties are also characteristic of typical superconductors. These are summarized in Fig. 8.1. The AC resistance in the superconducting state is also zero below a certain frequency $\omega_g$, which is related to the critical temperature by $\hbar\omega_g \approx 3.5\,k_B T_c$. This is indicative of a gap in the excitation spectrum of the superconductor that can be overcome only above a sufficiently high frequency to produce a decrease in the superconductivity and consequently an increase in the resistance, which then approaches its normal state value as shown in Fig. 8.1(b).

Superconductors can expel completely an external magnetic field $H$ imposed on them, which is known as the Meissner effect [103]. The total magnetic field $B$ inside the superconductor is given by $B = H + 4\pi m$, where $m$ is the magnetic moment. $B$ is zero inside the superconductor, that is, $m = -(1/4\pi)H$. In other words, the superconductor is a perfect diamagnet, a behavior that can be easily interpreted as perfect shielding of the external field by the dissipationless currents that can be set up inside the superconductor. This is true up to a certain critical value of the external field $H_c$, above which the field is too strong for the superconductor to resist: the field abruptly penetrates into the superconductor and the magnetic moment becomes zero, as shown in Fig. 8.1(c). Materials that exhibit this behavior

are called "type I" superconductors.

Figure 8.1. Experimental portrait of a typical superconductor: s denotes the superconducting (shaded), v the vortex and n the normal (unshaded) state. (a) The DC resistivity $\rho_{DC}$ as a function of temperature. (b) The AC resistivity $\rho_{AC}^{(s)}$ in the superconducting state as a function of frequency, which approaches the normal state value $\rho_{AC}^{(n)}$ above $\omega_g$. (c) and (d) The total magnetic field $B$ and the magnetic moment $4\pi m$ as a function of the external field $H$ for type I and type II superconductors. (e) The critical field $H_c$ as a function of temperature. (f) The specific heat as a function of temperature.

The expulsion of magnetic field costs energy because it bends the magnetic field lines around the superconductor. Depending on the shape of the specimen, even when the magnetic field is smaller than $H_c$ it may still penetrate some regions which revert to the normal state, while the rest

remains in the superconducting state; this is called the "intermediate" state, as illustrated in Fig. 8.2. In type I superconductors, the normal or superconducting regions in the intermediate state are of macroscopic dimensions.

Figure 8.2. Illustration of expulsion of magnetic field (Meissner effect) in a type I superconductor. Left: a specimen of spherical shape. Center: a specimen of rectangular shape with its long dimension perpendicular to the direction of the field lines. It is shown in the intermediate state, with some regions in the superconducting (shaded) and others in the normal (unshaded) state. Right: magnetic field $B(x)$ and the magnitude of the order parameter $|\psi(x)|$ as functions of the distance $x$ from the normal-superconducting interface, located at $x = 0$. The behavior of $B(x)$ and $\psi(x)$ determines the penetration length $\lambda$ and coherence length $\xi$.

There exists a different class of superconductors, in which the Meissner effect is also observed up to a critical field $H_{c1}$ but then a gradual penetration of the field into the superconductor begins that is completed at a higher critical field $H_{c2}$, beyond which the magnetic moment is again zero as illustrated in Fig. 8.1(d). These materials are called "type II" superconductors. The phase in which the external field partially penetrates in the superconductor is called the "vortex" state and has very interesting physics in itself. A vortex consists of a cylindrical core where the external field has penetrated the specimen, that is, the material is in the normal state, and is surrounded by material in the superconducting state. The flux contained in the core of the vortex is equal to the fundamental flux quantum, Eq. (7.5). The core of the vortex has microscopic dimensions in cross-section and is not rigid but can change direction subject to line-tension forces. The ground state of the vortex state is a regular array of the vortex lines, which on a plane perpendicular to the lines forms a triangular lattice; this is called the Abrikosov lattice. Vortices can move in response to external forces or thermal fluctuations. The motion of vortices can be quite elaborate; for instance, vortices can form bundles, the motion of which depends on the presence and distribution of defects in the crystal, as first suggested by Anderson [104].

In type I superconductors, the critical field $H_c$ is a function of temperature and vanishes at $T_c$, as shown in Fig. 8.1(e); the behavior of $H_c$ with temperature is

described well by the expression

$$H_c(T) = H_0 \left[ 1 - \left( \frac{T}{T_c} \right)^2 \right] \qquad (8.1)$$

with $H_0$ a constant in the range of $10^2$–$10^3$ Oe. An interesting consequence of this behavior is that the slope of the critical field with temperature, $dH_c/dT$, vanishes at $T = 0$ but not at $T = T_c$, where it takes the value $-2H_0/T_c$; this is related to the nature of the phase transition between the superconducting and normal state, as discussed below. Finally, the electronic specific heat$^1$ of a superconductor has a discontinuity at $T_c$, falling by a finite amount from its superconducting value below $T_c$ to its normal value above $T_c$, as shown in Fig. 8.1(f); this behavior is another indication of a gap in the excitation spectrum of the superconducting state. One last important feature of superconductivity, not illustrated in Fig. 8.1, is the so called "isotope effect": for superconductors made of different isotopes of the same element, it is found that the critical temperature $T_c$ depends on the isotope mass $M$. In general this dependence is given by the relation

$$T_c \sim M^{\alpha} \qquad (8.2)$$

with $\alpha = -0.5$ in the simplest cases. This fact led Fröhlich to suggest that there is a connection between electron–phonon interactions and superconductivity [105], since the phonon frequencies naturally introduce a mass dependence of the form $M^{-1/2}$ (see chapter 6).

$^1$ In this entire section, by specific heat we mean the electronic contribution only.

The theoretical explanation of superconductivity remained a big puzzle for almost a half century after its discovery. Although extremely useful phenomenological theories were proposed to account for the various aspects of superconductivity, due to London and London [106], Pippard [107], Ginzburg and Landau [108], Abrikosov [109] and others, a theory based on microscopic considerations was lacking until 1957, when it was developed by J. Bardeen, L.N. Cooper and J.R. Schrieffer [110], who were awarded the 1972 Nobel prize for Physics for their contribution; this is referred to as BCS theory. A common feature of the phenomenological theories is the presence of two important length scales, the so called "penetration length" denoted by $\lambda$, and the "coherence length", denoted by $\xi$. The penetration length gives the scale over which the magnetic field inside the superconductor is shielded by the supercurrents: if $x$ measures the distance from the surface of the superconducting sample towards its interior, the total magnetic field behaves like

$$B(x) = B(0)\, e^{-x/\lambda}.$$
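The two simple laws just quoted, Eq. (8.1) for the critical field and the exponential screening of $B(x)$, lend themselves to a quick numerical illustration. The sketch below evaluates both and checks the slope of $H_c(T)$ at the two ends of the temperature range. The function names are chosen here for illustration; the values $H_0 = 803$ Oe and $T_c = 7.196$ K are assumed as representative of Pb, and the penetration length used in the demonstration is a purely hypothetical number.

```python
import math


def critical_field(T, H0, Tc):
    """Eq. (8.1): Hc(T) = H0 * [1 - (T/Tc)^2] for T < Tc, zero otherwise."""
    if T >= Tc:
        return 0.0
    return H0 * (1.0 - (T / Tc) ** 2)


def critical_field_slope(T, H0, Tc):
    """Analytic derivative dHc/dT = -2 * H0 * T / Tc^2 of Eq. (8.1)."""
    return -2.0 * H0 * T / Tc ** 2


def screened_field(x, B0, lam):
    """Field at depth x inside the superconductor: B(x) = B(0) exp(-x/lam)."""
    return B0 * math.exp(-x / lam)


if __name__ == "__main__":
    H0, Tc = 803.0, 7.196   # Oe, K: assumed illustrative values (roughly Pb)
    lam = 40.0              # hypothetical penetration length, arbitrary units
    for T in (0.0, 0.5 * Tc, Tc):
        print(f"T = {T:6.3f} K: Hc = {critical_field(T, H0, Tc):8.2f} Oe, "
              f"slope = {critical_field_slope(T, H0, Tc):8.2f} Oe/K")
    for x in (0.0, lam, 3.0 * lam):
        print(f"x = {x:6.1f}: B/B(0) = {screened_field(x, 1.0, lam):.4f}")
```

The slope vanishes at $T = 0$ and equals $-2H_0/T_c$ at $T = T_c$, and the field falls to $1/e$ of its surface value at a depth of one penetration length.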

The field thus decays exponentially over a distance of order $\lambda$. It is natural that such a length scale should exist, because the field cannot drop to zero immediately inside the superconductor as this would imply an infinite surface current. The coherence length determines the scale over which there exist strong correlations which stabilize the superconducting state, or, equivalently, it is a measure of the distance over which the superconducting state is affected by fluctuations in the external field or other variables. If the length scale of these fluctuations is much larger than $\xi$, the superconducting state is not affected (the correlations responsible for its stability are not destroyed); if it is of order $\xi$, it is affected. In phenomenological theories the superconducting state is typically described by a quantity called the "order parameter", $\psi(x)$, whose absolute value is between unity and zero: at these two extremes the superconducting state is either at full strength or has been completely destroyed. The typical behavior of $B(x)$ and $|\psi(x)|$ in a type I superconductor is shown in Fig. 8.2.

Both $\lambda$ and $\xi$ are temperature-dependent quantities. Their ratio, $\kappa = \lambda/\xi$, serves to distinguish type I from type II superconductors: $\kappa < 1/\sqrt{2}$ corresponds to type I and $\kappa > 1/\sqrt{2}$ corresponds to type II superconductors, as shown by Abrikosov [109]. The temperature dependence of $\lambda$ and $\xi$ is essentially the same, which produces a temperature-independent value for $\kappa$. The distinction between the two types of superconductivity in terms of $\kappa$ has to do with the interface energy between the normal and superconducting parts. In type I superconductors the interface energy is positive and the sample tries to minimize it by splitting into macroscopic normal or superconducting domains with the smallest possible interface area. In type II superconductors the interface energy is negative and it is actually energetically favorable to create the microscopic regions which contain a quantum of flux, the vortices. This behavior can be understood from the point of view of phenomenological theories of superconductivity, and especially through the so called Ginzburg–Landau theory. We present here a brief heuristic argument for the reasons behind these two types of behavior.

Anticipating the discussion of the microscopic mechanism for superconductivity, we will assume that the superconducting state consists of bound pairs of electrons with opposite momenta and spins. We can use the distance between such a pair of electrons as a measure for $\xi$, the coherence length. The electron pairs that create the superconducting state cannot exist within a region of size $\lambda$ near the boundary between the normal and superconducting states; within this region the superconducting state and the magnetic field both decay to zero, the first going from superconducting toward the normal region, the second going in the opposite direction, as illustrated in Fig. 8.2. If $\xi$ is considerably larger than $\lambda$, the electron pairs are awkwardly rigid on the scale of $\lambda$ and the superconducting state is too "inflexible" to shield the magnetic field as it tries to penetrate the sample. When the magnetic field becomes very strong and its expulsion from the sample is

energetically costly, the only option for the system is to reduce the volume of the superconducting sample and yield some volume to the normal state within which the magnetic field can exist; this implies a positive interface energy. For sufficiently large magnetic field the superconducting state is abruptly destroyed altogether: this is the behavior characteristic of type I superconductors. On the other hand, if $\xi$ is considerably smaller than $\lambda$, the electron pairs are quite nimble and the superconducting state is more "flexible" and able to shield small regions of normal state. This allows the creation of vortices of microscopic size where the magnetic field penetrates the sample. The generation of vortices can be continued to much larger values of the magnetic field, before their density becomes such that the superconducting state is destroyed; this is the behavior characteristic of type II superconductors. This picture is represented by Fig. 8.3. It is then reasonable to expect that the ratio $\kappa = \lambda/\xi$ can characterize the two types of behavior, with a critical value $\kappa_c$ separating them. Abrikosov's theory established that $\kappa_c = 1/\sqrt{2}$.

In the following we provide first a simple thermodynamic treatment of the superconducting transition, which sheds some light onto the behavior of the physical system, and then we give an elementary account of the BCS theory of superconductivity. We conclude this section with a short discussion of the high-temperature superconductors and their differences from the conventional ones.

8.2 Thermodynamics of the superconducting transition

In order to discuss the thermodynamics of the superconducting transition as a function of the external magnetic field $H$ and temperature $T$ we consider a simplified description in terms of two components, the normal (denoted by the superscript $(n)$) and superconducting (denoted by the superscript $(s)$) parts. The system attains equilibrium under given $H$ and $T$ by transferring some amount from one to the other part. We will assume that the normal part occupies a volume $\Omega_n$ and the superconducting part occupies a volume $\Omega_s$, which add up to the fixed total volume of the specimen $\Omega = \Omega_n + \Omega_s$. For simplicity we consider a type I superconductor in a homogeneous external field, in which case the magnetic moment in the superconducting part is $m = -(1/4\pi)H$ and vanishes in the normal part. The magnetization $M$ is given as usual by

$$M = \int m(\mathbf{r})\, d\mathbf{r} = m\,\Omega_s = -\frac{1}{4\pi} H\, \Omega_s$$

The total magnetic field is given by

$$B(\mathbf{r}) = H + 4\pi m(\mathbf{r})$$

which vanishes in the superconducting part and is equal to $H$ in the normal part, as shown in Fig. 8.1(c).

Figure 8.3. Schematic illustration of the behavior of type I (top) and type II (bottom) superconductors. The superconducting state, occupying the shaded areas, consists of pairs of electrons with opposite momenta and spins, shown as small arrows pointing in opposite directions and linked by a wavy line; the size of the electron pairs is a measure of the coherence length $\xi$. The normal state is shown as white and the black dots indicate the lines of magnetic field (perpendicular to the plane of the paper in this example) which penetrate the normal but not the superconducting state. The dashed area at the interface between normal and superconducting regions has length scale $\lambda$, over which the superconducting state and the field decay to zero from opposite directions. In type I superconductors the system responds to increasing magnetic field by shrinking the superconducting region and minimizing the interface. In type II superconductors it responds by creating microscopic vortices which contain one quantum of magnetic flux each (a white circle with a single magnetic line) which increases the interface area.

The Gibbs free energy of the total system is given by

$$G = F_0 + \frac{1}{8\pi} \int [B(\mathbf{r})]^2\, d\mathbf{r} - HM = F_0 + \frac{1}{8\pi} H^2\, \Omega_n - H \left( -\frac{1}{4\pi} H\, \Omega_s \right)$$

where we have defined $F_0$ to be the Helmholtz free energy arising from the nonmagnetic contributions. We denote the non-magnetic contributions to the Helmholtz free energy per unit volume as $f_0^{(n)}$, $f_0^{(s)}$, for the normal and superconducting parts, respectively, with which the Gibbs free energy becomes

$$G = f_0^{(n)}\, \Omega_n + f_0^{(s)}\, \Omega_s + \frac{1}{8\pi} H^2\, \Omega + 2\pi m^2\, \Omega_s$$

where we have also used the fact that $H = -4\pi m$ in the superconducting part and $\Omega = \Omega_n + \Omega_s$. At equilibrium, the derivative of $G$ with respect to the volume of either part will be zero, indicating a balance between the amount of superconducting and normal volumes consistent with the external conditions (the values of $H$ and $T$). Since the total volume of the specimen is fixed, we must have $d\Omega_n = -d\Omega_s$, which gives

$$\frac{\partial G}{\partial \Omega_s} = f_0^{(s)} - f_0^{(n)} + 2\pi m^2 = 0 \;\Longrightarrow\; f_0^{(s)} - f_0^{(n)} = -2\pi m^2 = -\frac{1}{8\pi} H^2$$

Differentiating the last expression with respect to $T$ we obtain

$$\frac{\partial f_0^{(s)}}{\partial T} - \frac{\partial f_0^{(n)}}{\partial T} = -\frac{H}{4\pi} \frac{dH}{dT}$$

From the definition of $f_0^{(n)}$, $f_0^{(s)}$, we can view the two partial derivatives on the left-hand side of the above equation as derivatives of Helmholtz free energies at constant magnetization $M$ (in this case zero). For the normal part, this is always the case, since the magnetic moment in the normal part is zero. For the superconducting part, this condition is satisfied at $H = H_c$, the critical field at which superconductivity is destroyed. Therefore, using standard thermodynamic relations (see Appendix C), we can equate these derivatives to the entropies per unit volume of the normal and superconducting parts, $s^{(n)}$, $s^{(s)}$:

$$\frac{\partial f_0^{(n)}}{\partial T} = \left( \frac{\partial f^{(n)}}{\partial T} \right)_{M=0} = -s_0^{(n)}, \qquad \frac{\partial f_0^{(s)}}{\partial T} = \left( \frac{\partial f^{(s)}}{\partial T} \right)_{M=0} = -s_0^{(s)}$$

with the subscripts 0 in the entropies serving as a reminder of the validity of these expressions only at zero magnetization, whereupon, with this connection to the entropy, the earlier equation relating the partial derivatives of the Helmholtz free energies per unit volume to the external field becomes

$$s^{(n)} - s^{(s)} = -\frac{H_c}{4\pi} \frac{dH_c}{dT} \qquad (8.3)$$

where we have dropped the subscripts 0 from the entropies and specialized the relation to the values of the critical field only. Eq. (8.3) is an interesting result: if we define the latent heat per unit volume $L = T\Delta s$, we can rewrite this equation

in the following manner:

\[ \frac{dH_c}{dT} = -\frac{4\pi L}{H_c T} \]

which has the familiar form of the Clausius–Clapeyron equation for a first order phase transition (see Appendix C). We note that $dH_c/dT \le 0$ as suggested by the plot in Fig. 8.1(e), which means that when going from the normal to the superconducting state the latent heat is a positive quantity, that is, the superconducting state is more ordered (has lower entropy) than the normal state. Differentiating both sides of Eq. (8.3) with respect to temperature, and using the standard definition of the specific heat $C = dQ/dT = T(dS/dT)$, we find for the difference in specific heats per unit volume:

\[ c^{(n)}(T) - c^{(s)}(T) = -\frac{T}{4\pi}\left[\left(\frac{dH_c}{dT}\right)^2 + H_c\frac{d^2H_c}{dT^2}\right] \tag{8.4} \]

To explore the consequences of this result, we will assume that $H_c$ as a function of temperature is given by Eq. (8.1), as indicated in Fig. 8.1(e), which leads to the following expression for the difference in specific heats between the two states:

\[ c^{(s)}(T) - c^{(n)}(T) = \frac{H_0^2}{2\pi T_c}\,\frac{T}{T_c}\left[3\left(\frac{T}{T_c}\right)^2 - 1\right] \]

At $T = T_c$ this expression reduces to

\[ c^{(s)}(T_c) - c^{(n)}(T_c) = \frac{H_0^2}{\pi T_c} \]

that is, the specific heat has a discontinuity at the critical temperature, dropping by a finite amount from its value in the superconducting state to its value in the normal state, as the temperature is increased through the transition point. This behavior is indicated in Fig. 8.1(f). Eq. (8.4) also shows that the specific heat of the superconducting state is higher than the specific heat of the normal state for $T_c/\sqrt{3} < T < T_c$, which is also indicated schematically in Fig. 8.1(f).

The above arguments are valid as long as $H_c > 0$, but break down at the critical temperature where $H_c = 0$. At this point, $dH_c/dT$ is not zero, which implies that the latent heat must vanish there since the ratio $L/H_c$ must give a finite value equal to $-(T_c/4\pi)(dH_c/dT)_{T=T_c}$. From this argument we conclude that at $T_c$ the transition is second order.
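The specific-heat results above can be checked numerically with a short sketch. It assumes the parabolic form $H_c(T) = H_0[1 - (T/T_c)^2]$ for the critical field, which is the form consistent with the closed expressions quoted above; the function names and the finite-difference step are illustrative choices, not part of the text.

```python
import math

def critical_field(T, H0=1.0, Tc=1.0):
    """Assumed parabolic critical field H_c(T) = H0 * [1 - (T/Tc)^2]."""
    return H0 * (1.0 - (T / Tc) ** 2)

def specific_heat_difference(T, H0=1.0, Tc=1.0, step=1e-4):
    """c_s - c_n = (T / 4 pi) * [(dHc/dT)^2 + Hc * d2Hc/dT2], from Eq. (8.4),
    with the derivatives of H_c estimated by central finite differences."""
    h_minus = critical_field(T - step, H0, Tc)
    h_zero = critical_field(T, H0, Tc)
    h_plus = critical_field(T + step, H0, Tc)
    first = (h_plus - h_minus) / (2.0 * step)
    second = (h_plus - 2.0 * h_zero + h_minus) / step ** 2
    return T / (4.0 * math.pi) * (first ** 2 + h_zero * second)

def closed_form_difference(T, H0=1.0, Tc=1.0):
    """Closed expression (H0^2 / 2 pi Tc) * (T/Tc) * [3 (T/Tc)^2 - 1]."""
    t = T / Tc
    return H0 ** 2 / (2.0 * math.pi * Tc) * t * (3.0 * t ** 2 - 1.0)

if __name__ == "__main__":
    for T in (0.3, 1.0 / math.sqrt(3.0), 0.8, 1.0):
        print(T, specific_heat_difference(T), closed_form_difference(T))
```

In units where $H_0 = T_c = 1$, the difference changes sign at $T = T_c/\sqrt{3}$ and equals $H_0^2/(\pi T_c)$ at $T = T_c$, in line with the discussion above.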

8.3 BCS theory of superconductivity

There are two main ingredients in the microscopic theory of superconductivity developed by Bardeen, Cooper and Schrieffer. The first is an effective attractive interaction between two electrons that have opposite momenta (larger in magnitude than the Fermi momentum) and opposite spins, which leads to the formation of the so-called "Cooper pairs". The second is the condensation of the Cooper pairs into a single coherent quantum state which is called the "superconducting condensate"; this is the state responsible for all the manifestations of superconducting behavior. We discuss both ingredients in some detail.

8.3.1 Cooper pairing

The Coulomb interaction between two electrons is of course repulsive. For two free electrons at a distance $\mathbf{r}$, the interaction potential in real space is

\[ V(\mathbf{r}) = \frac{e^2}{|\mathbf{r}|} \]

We can think of the interaction between two free electrons as a scattering process corresponding to the exchange of photons, the carriers of the electromagnetic field: an electron with initial momentum $\hbar\mathbf{k}$ scatters off another electron by exchanging a photon of momentum $\hbar\mathbf{q}$. Due to momentum conservation, the final momentum of the electron will be $\hbar\mathbf{k}' = \hbar(\mathbf{k} - \mathbf{q})$. We can calculate the matrix element corresponding to this scattering of an electron, taken to be in a plane wave state $\langle\mathbf{r}|\mathbf{k}\rangle = (1/\sqrt{\Omega})\exp(i\mathbf{k}\cdot\mathbf{r})$, as

\[ \langle\mathbf{k}'|V|\mathbf{k}\rangle = \frac{1}{\Omega}\int e^{-i(\mathbf{k}-\mathbf{q})\cdot\mathbf{r}}\, V(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{r} = \frac{4\pi e^2}{\Omega|\mathbf{q}|^2} \]

the last expression being simply the Fourier transform $V(\mathbf{q})$ of the bare Coulomb potential (see Appendix G), with $\mathbf{q} = \mathbf{k} - \mathbf{k}'$. In the solid, this interaction is screened by all the other electrons, with the dielectric function describing the effect of screening. We have seen in chapter 3 that in the simplest description of metallic behavior, the Thomas–Fermi model, screening changes the bare Coulomb interaction to a Yukawa potential:

\[ V(\mathbf{r}) = \frac{e^2 e^{-k_s|\mathbf{r}|}}{|\mathbf{r}|} \]

with $k_s$ the inverse screening length; the Fourier transform of this potential is (see Appendix G)

\[ V(\mathbf{q}) = \frac{4\pi e^2}{\Omega\left(|\mathbf{q}|^2 + k_s^2\right)}. \]
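The Fourier transform of the screened potential can be verified by direct numerical integration. The sketch below is illustrative only: it uses the fact that for a spherically symmetric potential the angular integration leaves a one-dimensional radial integral, $(4\pi e^2/q)\int_0^\infty e^{-k_s r}\sin(qr)\,dr$, and it returns the transform multiplied by the normalization volume $\Omega$ (equivalently, it sets $\Omega = 1$). The function names, the cutoff radius and the number of integration points are assumptions made for this example.

```python
import math

def yukawa_ft_numeric(q, ks, e2=1.0, n_points=40000):
    """Omega * V(q) for V(r) = e2 * exp(-ks r) / r, by midpoint integration
    of (4 pi e2 / q) * integral_0^r_max exp(-ks r) sin(q r) dr."""
    r_max = 40.0 / ks               # the integrand is negligible beyond this radius
    step = r_max / n_points
    total = 0.0
    for i in range(n_points):
        r = (i + 0.5) * step
        total += math.exp(-ks * r) * math.sin(q * r)
    return 4.0 * math.pi * e2 / q * total * step

def yukawa_ft_exact(q, ks, e2=1.0):
    """Closed form 4 pi e2 / (q^2 + ks^2), again with Omega = 1."""
    return 4.0 * math.pi * e2 / (q * q + ks * ks)

if __name__ == "__main__":
    for q in (0.1, 1.0, 2.0):
        print(q, yukawa_ft_numeric(q, 1.0), yukawa_ft_exact(q, 1.0))
```

For $q \to 0$ the screened result stays finite at $4\pi e^2/k_s^2$, whereas the bare Coulomb form $4\pi e^2/q^2$ diverges.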

Figure 8.4. Illustration of attractive effective interaction between two electrons mediated by phonons. Left: the distortion that the first electron induces to the lattice of ions. Right: the second electron with opposite momentum at the same position but at a later time. The lattice distortion favors energetically the presence of the second electron in that position.

The presence of the screening term $\exp(-k_s|\mathbf{r}|)$ thus eliminates the singularity at $|\mathbf{q}| \to 0$ of the bare Coulomb potential (this singularity is a reflection of the infinite range of the bare potential). The strength of the interaction is characterized by the constant $e^2$, the charge of the electrons. These considerations indicate that, when the interaction is viewed as a scattering process, the relevant physical quantity is the Fourier transform of the real-space interaction potential. Accordingly, in the following we will only consider the interaction potential in this form.

The preceding discussion concerned the interaction between two electrons in a solid assuming that the ions are fixed in space. When the ions are allowed to move, a new term in the interaction between electrons is introduced. The physical origin of this new term is shown schematically in Fig. 8.4: an electron moving through the solid attracts the positively charged ions which come closer to it as it approaches and then return slowly to their equilibrium positions once the electron has passed them by. We can describe this motion in terms of phonons emitted by the traveling electron. It is natural to assume that the other electrons will be affected by this distortion of the ionic positions; since the electrons themselves are attracted to the ions, the collective motion of ions toward one electron will translate into an effective attraction of other electrons toward the first one. Fröhlich [111] and Bardeen and Pines [112] showed that the effective interaction between electrons due to exchange of a phonon is given in reciprocal space by

\[ V^{\rm phon}_{\mathbf{k}\mathbf{k}'}(\mathbf{q}) = \frac{g^2\hbar\omega_{\mathbf{q}}}{(\epsilon_{\mathbf{k}} - \epsilon_{\mathbf{k}'})^2 - (\hbar\omega_{\mathbf{q}})^2} \tag{8.5} \]

where $\mathbf{k}$, $\mathbf{k}'$ and $\epsilon_{\mathbf{k}}$, $\epsilon_{\mathbf{k}'}$ are the incoming and outgoing wave-vectors and energies of the electrons, and $\mathbf{q}$, $\hbar\omega_{\mathbf{q}}$ are the wave-vector and energy of the exchanged phonon; $g$ is a constant that describes the strength of the electron-phonon interaction. From this expression, it is evident that if the energy difference $(\epsilon_{\mathbf{k}} - \epsilon_{\mathbf{k}'})$ is smaller in magnitude than the phonon energy $\hbar\omega_{\mathbf{q}}$ the effective interaction is attractive.

To show that an attractive interaction due to phonons can actually produce binding between electron pairs, we consider the following simple model [113]: the

interaction potential in reciprocal space is taken to be a constant $V_{\mathbf{k}\mathbf{k}'} = -V_0 < 0$, independent of $\mathbf{k}$, $\mathbf{k}'$, only when the single-particle energies $\epsilon_{\mathbf{k}}$, $\epsilon_{\mathbf{k}'}$ lie within a narrow shell of width $t_0$ above the Fermi level, and is zero otherwise. Moreover, we will take the pair of interacting electrons to have opposite wave-vectors $\pm\mathbf{k}$ in the initial and $\pm\mathbf{k}'$ in the final state, both larger in magnitude than the Fermi momentum $k_F$. We provide a rough argument to justify this choice; the detailed justification has to do with subtle issues related to the optimal choice for the superconducting ground state, which lie beyond the scope of the present treatment. As indicated in Fig. 8.4, the distortion of the lattice induced by an electron would lead to an attractive interaction with any other electron put in the same position at a later time. The delay is restricted to times of order $1/\omega_D$, where $\omega_D$ is the Debye frequency, or the distortion will decay away. There is, however, no restriction on the momentum of the second electron from these considerations. It turns out that the way to maximize the effect of the interaction is to take the electrons in pairs with opposite momenta because this ensures that no single-particle state is doubly counted or left out of the ground state.

For simplicity, we will assume that the single-particle states are plane waves, as in the jellium model. We denote the interaction potential in real space as $V^{\rm phon}(\mathbf{r})$, where $\mathbf{r}$ is the relative position of the two electrons (in an isotropic solid $V^{\rm phon}$ would be a function of the relative distance $r = |\mathbf{r}|$ only). Taking the center of mass of the two electrons to be fixed, consistent with the assumption that they have opposite momenta, we arrive at the following Schrödinger equation for their relative motion:

\[ \left[-\frac{\hbar^2\nabla^2_{\mathbf{r}}}{2\mu} + V^{\rm phon}(\mathbf{r})\right]\psi(\mathbf{r}) = E^{\rm pair}\psi(\mathbf{r}) \]

where $\mu = m_e/2$ is the reduced mass of the pair. We expand the wavefunction $\psi(\mathbf{r})$ of the relative motion in plane waves with momentum $\mathbf{k}'$ larger in magnitude than the Fermi momentum $k_F$,

\[ \psi(\mathbf{r}) = \frac{1}{\sqrt{\Omega}}\sum_{|\mathbf{k}'|>k_F}\alpha_{\mathbf{k}'}\, e^{-i\mathbf{k}'\cdot\mathbf{r}} \]

and insert this expansion in the above equation to obtain, after multiplying through by $(1/\sqrt{\Omega})\exp(i\mathbf{k}\cdot\mathbf{r})$ and integrating over $\mathbf{r}$, the following relation for the coefficients $\alpha_{\mathbf{k}}$:

\[ (2\epsilon_{\mathbf{k}} - E^{\rm pair})\,\alpha_{\mathbf{k}} + \sum_{|\mathbf{k}'|>k_F}\alpha_{\mathbf{k}'} V^{\rm phon}_{\mathbf{k}\mathbf{k}'} = 0 \tag{8.6} \]

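where $V^{\rm phon}_{\mathbf{k}\mathbf{k}'}$ is the Fourier transform of the phonon interaction potential

\[ V^{\rm phon}_{\mathbf{k}\mathbf{k}'} = \frac{1}{\Omega}\int e^{-i\mathbf{k}'\cdot\mathbf{r}}\, V^{\rm phon}(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{r} \]

and $\epsilon_{\mathbf{k}}$ is the single-particle energy of the electrons (in the present case equal to $\hbar^2\mathbf{k}^2/2m_e$). We now employ the features of the simple model outlined above, which gives for the summation in Eq. (8.6)

\[ \sum_{|\mathbf{k}'|>k_F}\alpha_{\mathbf{k}'} V^{\rm phon}_{\mathbf{k}\mathbf{k}'} = -V_0\sum_{|\mathbf{k}'|>k_F}\alpha_{\mathbf{k}'}\,\theta(\epsilon_F + t_0 - \epsilon_{\mathbf{k}'}) \]

where the step function is introduced to ensure that the single-particle energy lies within $t_0$ of the Fermi energy $\epsilon_F$ (recall that by assumption we also have $\epsilon_{\mathbf{k}'} > \epsilon_F$). The last equation leads to the following expression for $\alpha_{\mathbf{k}}$:

\[ \alpha_{\mathbf{k}} = V_0\left[\sum_{|\mathbf{k}'|>k_F}\alpha_{\mathbf{k}'}\,\theta(\epsilon_F + t_0 - \epsilon_{\mathbf{k}'})\right]\frac{1}{2\epsilon_{\mathbf{k}} - E^{\rm pair}} \]

which, upon summing both sides over $\mathbf{k}$ with $|\mathbf{k}| > k_F$, produces the following relation:

\[ \sum_{|\mathbf{k}|>k_F}\alpha_{\mathbf{k}} = V_0\sum_{|\mathbf{k}|>k_F}\left[\sum_{|\mathbf{k}'|>k_F}\alpha_{\mathbf{k}'}\,\theta(\epsilon_F + t_0 - \epsilon_{\mathbf{k}'})\right]\frac{1}{2\epsilon_{\mathbf{k}} - E^{\rm pair}} \]

Since the model interaction vanishes unless $\epsilon_{\mathbf{k}}$ also lies within the shell, the coefficients $\alpha_{\mathbf{k}}$ are non-zero only there, so the sum on the left-hand side coincides with the sum in the square brackets and the remaining sum over $\mathbf{k}$ is restricted by the same step function. Assuming that the sum in the brackets does not vanish identically, we obtain

\[ 1 = V_0\sum_{|\mathbf{k}|>k_F}\theta(\epsilon_F + t_0 - \epsilon_{\mathbf{k}})\,\frac{1}{2\epsilon_{\mathbf{k}} - E^{\rm pair}} \]

The sum over $\mathbf{k}$ is equivalent to an integral over the energy $\epsilon$ that also includes the density of states $g(\epsilon)$; in addition, taking the range $t_0$ to be very narrow, we can approximate the density of states over such a narrow range by its value at the Fermi energy, $g_F = g(\epsilon_F)$. These considerations lead to the replacement

\[ \sum_{|\mathbf{k}|>k_F} \;\to\; g_F\int_{\epsilon_F}^{\epsilon_F + t_0} d\epsilon. \]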
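The condition $1 = V_0\sum_{\mathbf{k}}\theta(\epsilon_F + t_0 - \epsilon_{\mathbf{k}})/(2\epsilon_{\mathbf{k}} - E^{\rm pair})$ can also be solved numerically before any analytic integration. The sketch below is a minimal illustration under stated assumptions: the shell is represented by equally spaced levels (so that $g_F$ is the number of levels divided by $t_0$), energies are measured from $\epsilon_F$, the unknown is the binding energy $2\epsilon_F - E^{\rm pair}$, and the result is compared with the closed form $2t_0/[\exp(2/V_0 g_F) - 1]$ that follows from doing the energy integral analytically. All function names and default parameters are hypothetical.

```python
import math

def shell_sum(binding, t0=1.0, n_levels=10000):
    """(1/g_F) * sum over shell levels of 1 / (2 eps_k - E_pair).

    Levels are equally spaced in eps_F < eps_k < eps_F + t0, so g_F = n_levels / t0.
    With energies measured from eps_F, 2 eps_k - E_pair = 2 x + binding."""
    spacing = t0 / n_levels
    return sum(spacing / (2.0 * (j + 0.5) * spacing + binding)
               for j in range(n_levels))

def binding_energy_discrete(coupling, t0=1.0, n_levels=10000):
    """Solve coupling * shell_sum = 1 by bisection; coupling stands for V0 * g_F.

    Reliable only when the level spacing is small compared with the binding
    energy, which for these defaults means coupling of roughly 0.3 or larger."""
    lo, hi = 1e-6 * t0, 10.0 * t0
    for _ in range(60):
        mid = math.sqrt(lo * hi)          # bisect on a logarithmic scale
        if coupling * shell_sum(mid, t0, n_levels) > 1.0:
            lo = mid                      # sum too large: binding must increase
        else:
            hi = mid
    return math.sqrt(lo * hi)

def binding_energy_closed_form(coupling, t0=1.0):
    """2 t0 / (exp(2 / coupling) - 1), from the analytic energy integral."""
    return 2.0 * t0 / math.expm1(2.0 / coupling)

if __name__ == "__main__":
    for coupling in (0.3, 0.5, 1.0):
        print(coupling, binding_energy_discrete(coupling),
              binding_energy_closed_form(coupling))
```

A positive solution exists for every attractive coupling tried here, however weak, which is the essential point of the Cooper argument; the bisection bracket and the number of levels would need adjusting for much smaller couplings.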

With this identification, the preceding condition on $E^{\rm pair}$ becomes

\[ 1 = V_0\, g_F\int_{\epsilon_F}^{\epsilon_F + t_0}\frac{d\epsilon}{2\epsilon - E^{\rm pair}} \]

in which the range of the integral automatically satisfies the step function appearing in the summation. This is easily solved for $E^{\rm pair}$:

\[ 1 = \frac{1}{2}V_0\, g_F\ln\left[\frac{2(\epsilon_F + t_0) - E^{\rm pair}}{2\epsilon_F - E^{\rm pair}}\right] \;\Longrightarrow\; E^{\rm pair} = 2\epsilon_F - \frac{2t_0}{\exp(2/V_0 g_F) - 1} \]

In the so-called "weak coupling" limit $V_0 g_F \ll 1$, which is justified a posteriori since it is reasonably obeyed by classical superconductors, the above expression reduces to

\[ E^{\rm pair} = 2\epsilon_F - 2t_0\, e^{-2/V_0 g_F} \]

This relation proves that the pair of electrons forms a bound state with binding energy $E_b$ given by

\[ E_b \equiv 2\epsilon_F - E^{\rm pair} = 2t_0\, e^{-2/V_0 g_F} \tag{8.7} \]

A natural choice for $t_0$ is $\hbar\omega_D$, where $\omega_D$ is the Debye frequency: this is an approximate upper limit of the frequency of phonons in the solid (see chapter 6), so with both $\epsilon_{\mathbf{k}}$ and $\epsilon_{\mathbf{k}'}$ within a shell of this thickness above the Fermi level, their difference is likely to be smaller than the phonon energy $\hbar\omega_{\mathbf{q}}$ and hence their interaction, given by Eq. (8.5), attractive, as assumed in the Cooper model. Eq. (8.7) was a seminal result: it showed that the effective interaction can be very weak and still lead to a bound pair of electrons; it established that electrons close to the Fermi level are the major participants in the pairing; and it indicated that their binding energy is a non-analytic function of the effective interaction $V_0$, which implied that a perturbative approach to the problem would not work.

Cooper also showed that the radius $R$ of the bound electron pair is

\[ R \sim \frac{\hbar^2 k_F}{m_e E_b} \]

For typical values of $k_F$ and $E_b$, this radius is $R \sim 10^4$ Å, which is a very large distance on the atomic scale.

8.3.2 BCS ground state

The fact that creation of Cooper pairs is energetically favorable (it has a positive binding energy) naturally leads to the following question: what is the preferred state of the system under conditions where Cooper pairs can be formed? A tempting

k ↑. Moreover. but it is helpful in establishing that this wavefunction describes a coherent state of the entire system of electrons. |vk |2 is the probability that a Cooper pair of wave-vector k is present in the ground state and |u k |2 the probability that it is not.9) In Eq. To address these issues. which is a consequence of the fact that at zero temperature all states with wave-vectors up to the Fermi momentum kF are ﬁlled. The ground state is further restricted to only two of those states in each subspace. We explore next the implications of the BCS ground state wavefunction. removing the basis for the stability of Cooper pairs.298 8 Superconductivity answer would be to construct a state with the maximum possible number of Cooper pairs. each pair can be thought of as a composite particle with bosonic character. Each vector subspace contains a total of 16 states. A different scheme must be invented. |vk | = 0 for |k| > kF . in which the Fermi surface can survive.8). v−k = vk and u −k = u k .8) where |ψk ψ−k represents the presence of a pair of electrons with opposite spins and momenta and |0k 0−k represents the absence of such a pair. (8. but as a whole it has zero total momentum and zero total spin. in terms of these quantities the normal (non-superconducting) ground state is described by |vk | = 1. −k ↑. In this sense. This is reminiscent of the Bose-Einstein condensation of bosons at low enough temperature (see Appendix D). . BCS proposed that the manybody wavefunction for the superconducting state can be chosen from the restricted Hilbert space consisting of a direct product of four-dimensional vector spaces: each such space includes the states generated by a pair of electrons with opposite momenta. because of the special nature of the composite particles in the BCS wavefunction. Each pair of electrons retains the antisymmetric nature of two fermions due to the spin component which is a singlet state. We note that from the deﬁnition of these quantities. 
this cannot be done for all available electrons since then the Fermi surface would collapse. and the total wavefunction as a coherent state of all these bosons occupying a zeromomentum state. the ones with paired electrons of opposite spin in a singlet conﬁguration: (s ) 0 = k [u k |0k 0−k + vk |ψk ψ−k ] (8. depending on whether the individual single-particle states are occupied or not. We discuss ﬁrst the physical meaning of this wavefunction. −k ↓. k ↓. yet a sufﬁciently large number of Cooper pairs can be created to take maximum advantage of the beneﬁts of pairing. This analogy cannot be taken literally. For properly normalized pair states we require that ∗ uku∗ k + vk vk = 1 (8. However. |u k | = 0 for |k| < kF and |u k | = 1.

The hamiltonian for this system contains two terms,

\[ \mathcal{H} = \mathcal{H}_0^{sp} + \mathcal{H}^{\rm phon} \]

The first term describes the electron interactions in the single-particle picture with the ions frozen, and the second term describes the electron interactions mediated by the exchange of phonons, when the ionic motion is taken into account. In principle, we can write the first term as a sum over single-particle terms and the second as a sum over pair-wise interactions:

\[ \mathcal{H}_0^{sp} = \sum_i h^{sp}(\mathbf{r}_i), \qquad \mathcal{H}^{\rm phon} = \frac{1}{2}\sum_{ij}V^{\rm phon}(\mathbf{r}_i - \mathbf{r}_j) \tag{8.10} \]

The single-particle wavefunctions $|\psi_{\mathbf{k}}\rangle$ are eigenfunctions of the first term, with eigenvalues $\epsilon_{\mathbf{k}}$,

\[ \mathcal{H}_0^{sp}|\psi_{\mathbf{k}}\rangle = \epsilon_{\mathbf{k}}|\psi_{\mathbf{k}}\rangle \]

and form a complete orthonormal set. It is customary to refer to $\epsilon_{\mathbf{k}}$ as the "kinetic energy",$^2$ even though in a single-particle picture it also includes all the electron–ion as well as the electron-electron interactions, as discussed in chapter 2. Furthermore, it is convenient to measure all single-particle energies $\epsilon_{\mathbf{k}}$ relative to the Fermi energy $\epsilon_F$, a convention we adopt in the following. This is equivalent to having a variable number of electrons in the system, with the chemical potential equal to the Fermi level. When we take matrix elements of the first term in the hamiltonian with respect to the ground state wavefunction defined in Eq. (8.8), we find that it generates the following types of terms:

\[ u_{\mathbf{k}}u^*_{\mathbf{k}'}\langle 0_{\mathbf{k}'}0_{-\mathbf{k}'}|\mathcal{H}_0^{sp}|0_{\mathbf{k}}0_{-\mathbf{k}}\rangle, \qquad v_{\mathbf{k}}u^*_{\mathbf{k}'}\langle 0_{\mathbf{k}'}0_{-\mathbf{k}'}|\mathcal{H}_0^{sp}|\psi_{\mathbf{k}}\psi_{-\mathbf{k}}\rangle, \]
\[ u_{\mathbf{k}}v^*_{\mathbf{k}'}\langle \psi_{\mathbf{k}'}\psi_{-\mathbf{k}'}|\mathcal{H}_0^{sp}|0_{\mathbf{k}}0_{-\mathbf{k}}\rangle, \qquad v_{\mathbf{k}}v^*_{\mathbf{k}'}\langle \psi_{\mathbf{k}'}\psi_{-\mathbf{k}'}|\mathcal{H}_0^{sp}|\psi_{\mathbf{k}}\psi_{-\mathbf{k}}\rangle \]

Of all these terms, only the last one gives a non-vanishing contribution, because the terms that include a $|0_{\mathbf{k}}0_{-\mathbf{k}}\rangle$ state give identically zero when $\mathcal{H}_0^{sp}$ is applied to them. The last term gives

\[ v_{\mathbf{k}}v^*_{\mathbf{k}'}\langle \psi_{\mathbf{k}'}\psi_{-\mathbf{k}'}|\mathcal{H}_0^{sp}|\psi_{\mathbf{k}}\psi_{-\mathbf{k}}\rangle = |v_{\mathbf{k}}|^2\delta(\mathbf{k} - \mathbf{k}')\,\epsilon_{\mathbf{k}} + |v_{\mathbf{k}}|^2\delta(\mathbf{k} + \mathbf{k}')\,\epsilon_{\mathbf{k}} \]

where we have used the facts that $v_{\mathbf{k}} = v_{-\mathbf{k}}$ and $\epsilon_{\mathbf{k}} = \epsilon_{-\mathbf{k}}$ (the latter from the discussion of electronic states in chapter 3). Summing over all values of $\mathbf{k}$, $\mathbf{k}'$, we

$^2$ The reason for this terminology is that in the simplest possible treatment of metallic electrons, they can be considered as free particles in the positive uniform background of the ions (the jellium model), in which case $\epsilon_{\mathbf{k}}$ is indeed the kinetic energy; in the following we adopt this terminology to conform to literature conventions.

find that the contribution of the first term in the hamiltonian to the total energy is given by

\[ \langle\Psi_0^{(s)}|\mathcal{H}_0^{sp}|\Psi_0^{(s)}\rangle = \sum_{\mathbf{k}} 2\epsilon_{\mathbf{k}}|v_{\mathbf{k}}|^2 \]

Figure 8.5. Scattering between electrons of wave-vectors $\pm\mathbf{k}$ (initial) and $\pm\mathbf{k}'$ (final) through the exchange of a phonon of wave-vector $\mathbf{q}$: in the initial state the Cooper pair of wave-vector $\mathbf{k}$ is occupied and that of wave-vector $\mathbf{k}'$ is not, while in the final state the reverse is true. The shaded region indicates the Fermi sphere of radius $k_F$.

Turning our attention next to the second term in the hamiltonian, we see that it must describe the types of processes illustrated in Fig. 8.5. If a Cooper pair of wave-vector $\mathbf{k}$ is initially present in the ground state, the interaction of electrons through the exchange of phonons will lead to this pair being kicked out of the ground state and being replaced by another pair of wave-vector $\mathbf{k}'$ which was not initially part of the ground state. This indicates that the only non-vanishing matrix elements of the second term in the hamiltonian must be of the form

\[ \left(v^*_{\mathbf{k}'}\langle\psi_{\mathbf{k}'}\psi_{-\mathbf{k}'}|\; u^*_{\mathbf{k}}\langle 0_{\mathbf{k}}0_{-\mathbf{k}}|\right)\mathcal{H}^{\rm phon}\left(v_{\mathbf{k}}|\psi_{\mathbf{k}}\psi_{-\mathbf{k}}\rangle\; u_{\mathbf{k}'}|0_{\mathbf{k}'}0_{-\mathbf{k}'}\rangle\right) \]

In terms of the potential that appears in Eq. (8.10), which describes the exchange of phonons, these matrix elements will take the form

\[ V_{\mathbf{k}\mathbf{k}'} = \langle\psi_{\mathbf{k}'}\psi_{-\mathbf{k}'}|V^{\rm phon}|\psi_{\mathbf{k}}\psi_{-\mathbf{k}}\rangle \]

from which we conclude that the contribution of the phonon interaction hamiltonian to the total energy will be

\[ \langle\Psi_0^{(s)}|\mathcal{H}^{\rm phon}|\Psi_0^{(s)}\rangle = \sum_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\,u^*_{\mathbf{k}}v^*_{\mathbf{k}'}u_{\mathbf{k}'}v_{\mathbf{k}} \]

Finally, putting together the two contributions, we find that the total energy of the ground state defined in Eq. (8.8) is

\[ E_0^{(s)} = \langle\Psi_0^{(s)}|\mathcal{H}_0^{sp} + \mathcal{H}^{\rm phon}|\Psi_0^{(s)}\rangle = \sum_{\mathbf{k}}2\epsilon_{\mathbf{k}}|v_{\mathbf{k}}|^2 + \sum_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\,u^*_{\mathbf{k}}v^*_{\mathbf{k}'}u_{\mathbf{k}'}v_{\mathbf{k}} \tag{8.11} \]

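where in the final expression we have omitted for simplicity the phonon wave-vector dependence of the matrix elements $V_{\mathbf{k}\mathbf{k}'}$. In order to determine the values of $u_{\mathbf{k}}$, $v_{\mathbf{k}}$, which are the only parameters in the problem, we will employ a variational argument: we will require that the ground state energy be a minimum with respect to variations in $v^*_{\mathbf{k}}$ while keeping $v_{\mathbf{k}}$ and $u_{\mathbf{k}}$ fixed. This is the usual procedure of varying only the bra of a single-particle state (in the present case a single-pair state) while keeping the ket fixed, as was done in chapter 2 for the derivation of the single-particle equations from the many-body ground state energy. This argument leads to

\[ \frac{\partial E_0^{(s)}}{\partial v^*_{\mathbf{k}}} = 2\epsilon_{\mathbf{k}}v_{\mathbf{k}} + \sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\,u^*_{\mathbf{k}'}u_{\mathbf{k}}v_{\mathbf{k}'} + \sum_{\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\,\frac{\partial u^*_{\mathbf{k}}}{\partial v^*_{\mathbf{k}}}\,v^*_{\mathbf{k}'}u_{\mathbf{k}'}v_{\mathbf{k}} = 0 \]

From the normalization condition, Eq. (8.9), and the assumptions we have made above, we find that

\[ \frac{\partial u^*_{\mathbf{k}}}{\partial v^*_{\mathbf{k}}} = -\frac{v_{\mathbf{k}}}{u_{\mathbf{k}}} \]

which we can substitute into the above equation, and with the use of the definitions

\[ \Delta_{\mathbf{k}} = \sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\,u^*_{\mathbf{k}'}v_{\mathbf{k}'} \;\to\; \Delta^*_{\mathbf{k}} = \sum_{\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\,u_{\mathbf{k}'}v^*_{\mathbf{k}'} \tag{8.12} \]

we arrive at the following relation:

\[ 2\epsilon_{\mathbf{k}}v_{\mathbf{k}}u_{\mathbf{k}} + \Delta_{\mathbf{k}}u^2_{\mathbf{k}} - \Delta^*_{\mathbf{k}}v^2_{\mathbf{k}} = 0 \]

In the above derivation we have used the fact that $V^*_{\mathbf{k}'\mathbf{k}} = V_{\mathbf{k}\mathbf{k}'}$, since $V_{\mathbf{k}\mathbf{k}'}$ represents the Fourier transform of a real potential. Next, we will assume an explicit form for the parameters $u_{\mathbf{k}}$, $v_{\mathbf{k}}$ which automatically satisfies the normalization condition, Eq. (8.9), and allows for a phase difference $w_{\mathbf{k}}$ between them:

\[ u_{\mathbf{k}} = \cos\frac{\theta_{\mathbf{k}}}{2}\,e^{iw_{\mathbf{k}}/2}, \qquad v_{\mathbf{k}} = \sin\frac{\theta_{\mathbf{k}}}{2}\,e^{-iw_{\mathbf{k}}/2} \tag{8.13} \]

When these expressions are substituted into the previous equation we find

\[ 2\epsilon_{\mathbf{k}}\sin\frac{\theta_{\mathbf{k}}}{2}\cos\frac{\theta_{\mathbf{k}}}{2} = \Delta^*_{\mathbf{k}}\sin^2\frac{\theta_{\mathbf{k}}}{2}\,e^{-iw_{\mathbf{k}}} - \Delta_{\mathbf{k}}\cos^2\frac{\theta_{\mathbf{k}}}{2}\,e^{iw_{\mathbf{k}}} \tag{8.14} \]

In this relation the left-hand side is a real quantity, therefore the right-hand side must also be real; if we express the complex number $\Delta_{\mathbf{k}}$ as its magnitude times a phase factor,

\[ \Delta_{\mathbf{k}} = |\Delta_{\mathbf{k}}|\,e^{-i\tilde{w}_{\mathbf{k}}} \]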

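and substitute this expression into the previous equation, we find that the right-hand side becomes

\[ |\Delta_{\mathbf{k}}|\sin^2\frac{\theta_{\mathbf{k}}}{2}\,e^{-i(w_{\mathbf{k}} - \tilde{w}_{\mathbf{k}})} - |\Delta_{\mathbf{k}}|\cos^2\frac{\theta_{\mathbf{k}}}{2}\,e^{i(w_{\mathbf{k}} - \tilde{w}_{\mathbf{k}})} \]

which must be real, and therefore its imaginary part must vanish:

\[ \left[|\Delta_{\mathbf{k}}|\sin^2\frac{\theta_{\mathbf{k}}}{2} + |\Delta_{\mathbf{k}}|\cos^2\frac{\theta_{\mathbf{k}}}{2}\right]\sin(w_{\mathbf{k}} - \tilde{w}_{\mathbf{k}}) = 0 \;\Longrightarrow\; w_{\mathbf{k}} = \tilde{w}_{\mathbf{k}} \]

With this result, Eq. (8.14) simplifies to

\[ \epsilon_{\mathbf{k}}\sin\theta_{\mathbf{k}} + |\Delta_{\mathbf{k}}|\cos\theta_{\mathbf{k}} = 0 \tag{8.15} \]

This equation has two possible solutions:

\[ \sin\theta_{\mathbf{k}} = -\frac{|\Delta_{\mathbf{k}}|}{\zeta_{\mathbf{k}}}, \qquad \cos\theta_{\mathbf{k}} = \frac{\epsilon_{\mathbf{k}}}{\zeta_{\mathbf{k}}} \tag{8.16} \]

\[ \sin\theta_{\mathbf{k}} = \frac{|\Delta_{\mathbf{k}}|}{\zeta_{\mathbf{k}}}, \qquad \cos\theta_{\mathbf{k}} = -\frac{\epsilon_{\mathbf{k}}}{\zeta_{\mathbf{k}}} \tag{8.17} \]

where we have defined the quantity

\[ \zeta_{\mathbf{k}} \equiv \sqrt{\epsilon^2_{\mathbf{k}} + |\Delta_{\mathbf{k}}|^2} \tag{8.18} \]

Of the two solutions, the first one leads to the lowest energy state while the second leads to an excited state (see Problem 1). In the following discussion, which is concerned with the ground state of the system, we will use the values of the parameters $u_{\mathbf{k}}$, $v_{\mathbf{k}}$ given in Eq. (8.16). From the definition of $\zeta_{\mathbf{k}}$, Eq. (8.18), and using Eq. (8.13), it is straightforward to derive the relations

\[ |u_{\mathbf{k}}|^2 = \frac{1}{2}\left(1 + \frac{\epsilon_{\mathbf{k}}}{\zeta_{\mathbf{k}}}\right), \qquad |v_{\mathbf{k}}|^2 = \frac{1}{2}\left(1 - \frac{\epsilon_{\mathbf{k}}}{\zeta_{\mathbf{k}}}\right), \qquad |u_{\mathbf{k}}||v_{\mathbf{k}}| = \frac{1}{2}\left(1 - \frac{\epsilon^2_{\mathbf{k}}}{\zeta^2_{\mathbf{k}}}\right)^{1/2} = \frac{1}{2}\frac{|\Delta_{\mathbf{k}}|}{\zeta_{\mathbf{k}}} \tag{8.19} \]

The quantity $\zeta_{\mathbf{k}}$ is actually the excitation energy above the BCS ground state, associated with breaking the pair of wave-vector $\mathbf{k}$ (see Problem 2). For this reason, we refer to $\zeta_{\mathbf{k}}$ as the energy of quasiparticles associated with excitations above the superconducting ground state.

In order to gain some insight into the meaning of this solution, we consider a simplified picture in which we neglect the $\mathbf{k}$-dependence of all quantities and use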

$\epsilon$ as the only relevant variable. In this picture, the magnitudes of the parameters $u_{\mathbf{k}}$, $v_{\mathbf{k}}$ reduce to

\[ |u|^2 = \frac{1}{2}\left(1 + \frac{\epsilon}{\sqrt{\epsilon^2 + |\Delta|^2}}\right), \qquad |v|^2 = \frac{1}{2}\left(1 - \frac{\epsilon}{\sqrt{\epsilon^2 + |\Delta|^2}}\right) \]

which are plotted as functions of $\epsilon$ in Fig. 8.6.

Figure 8.6. Features of the BCS model: Left: Cooper pair occupation variables $|u|^2$ and $|v|^2$ as a function of the energy $\epsilon$. Right: superconducting density of states $g(\zeta)$ as a function of the energy $\zeta$; $g_F$ is the density of states in the normal state at the Fermi level.

We see that the parameters $u$, $v$ exhibit behavior similar to Fermi occupation numbers, with $|v|^2$ being close to unity for $\epsilon \ll 0$ and approaching zero for $\epsilon \gg 0$, and $|u|^2$ taking the reverse values. At $\epsilon = 0$, which coincides with $\epsilon_F$, $|u|^2 = |v|^2 = 0.5$. The change in $|v|^2$, $|u|^2$ from one asymptotic value to the other takes place over a range in $\epsilon$ of order $|\Delta|$ around zero (the exact value of this range depends on the threshold below which we take $|u|$ and $|v|$ to be essentially equal to their asymptotic values). Thus, the occupation of Cooper pairs is significant only for energies around the Fermi level, within a range of order $|\Delta|$. It is instructive to contrast this result to the normal state which, as discussed earlier, corresponds to the occupations $|v| = 1$, $|u| = 0$ for $\epsilon < \epsilon_F$ and $|u| = 1$, $|v| = 0$ for $\epsilon > \epsilon_F$. We see that the formation of the superconducting state changes these step functions to functions similar to Fermi occupation numbers.

We can also use this simplified picture to obtain the density of states in the superconducting state. Since the total number of electronic states must be conserved in going from the normal to the superconducting state, we will have

\[ g(\zeta)\,d\zeta = g(\epsilon)\,d\epsilon \;\Rightarrow\; g(\zeta) = g(\epsilon)\frac{d\epsilon}{d\zeta} \]

which, from the definition of $\zeta_{\mathbf{k}}$, Eq. (8.18), viewed here as a function of $\epsilon$, gives

\[ g(\zeta) = \begin{cases} g_F\,\dfrac{|\zeta|}{\sqrt{\zeta^2 - |\Delta|^2}}, & \text{for } |\zeta| > |\Delta| \\[2ex] 0, & \text{for } |\zeta| < |\Delta| \end{cases} \]
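The two quantities of this simplified picture are easy to tabulate. The following sketch evaluates the occupation variables $|u|^2$, $|v|^2$ and the superconducting density of states exactly as written above; the function names are illustrative, energies are measured from the Fermi level, and the gap magnitude is passed in as a plain positive number.

```python
import math

def coherence_factors(eps, gap):
    """Return (|u|^2, |v|^2) for single-particle energy eps (relative to eps_F)
    and gap magnitude |Delta| = gap."""
    zeta = math.hypot(eps, gap)          # quasiparticle energy sqrt(eps^2 + gap^2)
    u2 = 0.5 * (1.0 + eps / zeta)
    v2 = 0.5 * (1.0 - eps / zeta)
    return u2, v2

def superconducting_dos(zeta, gap, g_f=1.0):
    """g(zeta) = g_F |zeta| / sqrt(zeta^2 - gap^2) outside the gap, zero inside."""
    if abs(zeta) < gap:
        return 0.0
    if abs(zeta) == gap:
        return math.inf                  # square-root divergence at the gap edge
    return g_f * abs(zeta) / math.sqrt(zeta * zeta - gap * gap)

if __name__ == "__main__":
    gap = 1.0
    for eps in (-5.0, -1.0, 0.0, 1.0, 5.0):
        u2, v2 = coherence_factors(eps, gap)
        print(f"eps={eps:+.1f}  |u|^2={u2:.4f}  |v|^2={v2:.4f}")
    for zeta in (0.5, 1.1, 2.0, 10.0):
        print(f"zeta={zeta:.1f}  g/g_F={superconducting_dos(zeta, gap):.4f}")
```

Far from the gap edge the density of states returns to its normal-state value $g_F$, while the states removed from the interval $|\zeta| < |\Delta|$ pile up just outside it.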

In the expression for $g(\zeta)$ we have approximated $g(\epsilon)$ by its value at the Fermi level $g_F$. The function $g(\zeta)$ is also plotted in Fig. 8.6. This is an intriguing result: it shows that the superconducting state opens a gap in the density of states around the Fermi level equal to $2|\Delta|$, within which there are no quasiparticle states. This is known as the "superconducting gap" and is directly observable in tunneling experiments.

To illustrate how the superconducting gap is observed experimentally we consider a contact between a superconductor and a metal. At equilibrium the Fermi level on both sides must be the same, leading to the situation shown in Fig. 8.7. Typically, the superconducting gap is small enough to allow us to approximate the density of states in the metal as a constant over a range of energies at least equal to $2|\Delta|$ around the Fermi level. With this approximation, the situation at hand is equivalent to a metal-semiconductor contact, which was discussed in detail in chapter 5. We can therefore apply the results of that discussion directly to the metal–superconductor contact: the measured differential conductance at $T = 0$ will be given by Eq. (5.25),

\[ \left[\frac{dI}{dV}\right]_{T=0} = e|\mathcal{T}|^2\, g^{(s)}(\epsilon_F + eV)\, g_F^{(m)} \]

with $\mathcal{T}$ the tunneling matrix element and $g^{(s)}(\epsilon)$, $g^{(m)}(\epsilon)$ the density of states of the superconductor and the metal, the latter evaluated at the Fermi level, $g_F^{(m)} = g^{(m)}(\epsilon_F)$.

Figure 8.7. Left: density of states at a metal–superconductor contact: the shaded regions represent occupied states, with the Fermi level in the middle of the superconducting gap. When a voltage bias $V$ is applied to the metal side the metal density of states is shifted in energy by $eV$ (dashed line). Right: differential conductance $dI/dV$ as a function of the bias voltage $eV$ at $T = 0$ (solid line): in this case the measured curve should follow exactly the features of the superconductor density of states $g^{(s)}(\epsilon)$. At $T > 0$ (dashed line) the measured curve is a smoothed version of $g^{(s)}(\epsilon)$.

This result shows that by scanning the bias voltage $V$ one samples the

density of states of the superconductor. Thus, the measured differential conductance will reflect all the features of the superconductor density of states, including the superconducting gap, as illustrated in Fig. 8.7. However, making measurements at $T = 0$ is not physically possible. At finite temperature $T > 0$ the occupation number $n^{(m)}(\epsilon)$ is not a sharp step function but a smooth function whose derivative is an analytic representation of the $\delta$-function (see Appendix G). In this case, when the voltage $V$ is scanned the measured differential conductance is also a smooth function representing a smoothed version of the superconductor density of states, as shown in Fig. 8.7: the superconducting gap is still clearly evident, although not as an infinitely sharp feature.

The physical interpretation of the gap is that it takes at least this much energy to create excitations above the ground state; since an excitation is related to the creation of quasiparticles as a result of Cooper pair breaking, we may conclude that the lowest energy of each quasiparticle state is half the value of the gap, that is, $|\Delta|$. This can actually be derived easily in a simple picture where the single-particle energies $\epsilon_{\mathbf{k}}$ are taken to be the kinetic energy of electrons in a jellium model. Consistent with our earlier assumptions, we measure these energies relative to the Fermi level $\epsilon_F$, which gives

\[ \epsilon_{\mathbf{k}} = \frac{\hbar^2}{2m_e}(\mathbf{k}^2 - k_F^2) = \frac{\hbar^2}{2m_e}(k - k_F)(k + k_F) \approx \frac{\hbar^2 k_F}{m_e}(k - k_F) \]

where in the last expression we have assumed that the single-particle energies $\epsilon_{\mathbf{k}}$ (and the wave-vectors $k$) lie within a very narrow range of the Fermi energy $\epsilon_F$ (and the Fermi momentum $k_F$). With this expression, the quasiparticle energies $\zeta_{\mathbf{k}}$ take the form

\[ \zeta_{\mathbf{k}} = \sqrt{\left(\frac{\hbar^2 k_F}{m_e}\right)^2 (k - k_F)^2 + |\Delta|^2} \]

This result proves our assertion that the lowest quasiparticle energy is $|\Delta|$ (see also Problem 2).

These considerations reveal that the quantity $\Delta_{\mathbf{k}}$ plays a key role in the nature of the superconducting state. We therefore consider its behavior in more detail. From Eqs. (8.13) and (8.16) and the definition of $\Delta_{\mathbf{k}}$, Eq. (8.12), we find

\[ \Delta_{\mathbf{k}} = \sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\,u^*_{\mathbf{k}'}v_{\mathbf{k}'} = \sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\cos\frac{\theta_{\mathbf{k}'}}{2}\sin\frac{\theta_{\mathbf{k}'}}{2}\,e^{-iw_{\mathbf{k}'}} = -\sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\frac{|\Delta_{\mathbf{k}'}|}{2\zeta_{\mathbf{k}'}}\,e^{-iw_{\mathbf{k}'}} \]

\[ \Rightarrow\; \Delta_{\mathbf{k}} = -\sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\frac{\Delta_{\mathbf{k}'}}{2\zeta_{\mathbf{k}'}} \tag{8.20} \]

which is known as the "BCS gap equation". Within the simple model defined in relation to Cooper pairs, we find

\[ |\Delta| = \frac{\hbar\omega_D}{\sinh(1/g_F V_0)} \]

(see Problem 3). In the weak coupling limit we find

\[ |\Delta| = 2\hbar\omega_D\, e^{-1/g_F V_0} \tag{8.21} \]

which is an expression similar to the binding energy of a Cooper pair, Eq. (8.7). As was mentioned in that case, this type of expression is non-analytic in $V_0$, indicating that a perturbative approach in the effective interaction parameter would not be suitable for this problem. The expression we have found for $|\Delta|$ shows that the magnitude of this quantity is much smaller than $\hbar\omega_D$, which in turn is much smaller than the Fermi energy, as discussed in chapter 6:

\[ |\Delta| \ll \hbar\omega_D \ll \epsilon_F \]

Thus, the superconducting gap in the density of states is a minuscule fraction of the Fermi energy.

We next calculate the total energy of the superconducting ground state using the values of the parameters $u_{\mathbf{k}}$, $v_{\mathbf{k}}$ and $\Delta_{\mathbf{k}}$ that we have obtained in terms of $\epsilon_{\mathbf{k}}$, $\zeta_{\mathbf{k}}$. From Eq. (8.11) and the relevant expressions derived above we find

\[ E_0^{(s)} = \sum_{\mathbf{k}}2\epsilon_{\mathbf{k}}\,\frac{1}{2}\left(1 - \frac{\epsilon_{\mathbf{k}}}{\zeta_{\mathbf{k}}}\right) + \sum_{\mathbf{k}}|\Delta_{\mathbf{k}}|\,e^{iw_{\mathbf{k}}}\cos\frac{\theta_{\mathbf{k}}}{2}\,e^{-iw_{\mathbf{k}}/2}\sin\frac{\theta_{\mathbf{k}}}{2}\,e^{-iw_{\mathbf{k}}/2} \]

\[ = \sum_{\mathbf{k}}\epsilon_{\mathbf{k}}\left(1 - \frac{\epsilon_{\mathbf{k}}}{\zeta_{\mathbf{k}}}\right) + \sum_{\mathbf{k}}\frac{1}{2}|\Delta_{\mathbf{k}}|\sin\theta_{\mathbf{k}} = \sum_{\mathbf{k}}\epsilon_{\mathbf{k}}\left(1 - \frac{\epsilon_{\mathbf{k}}}{\zeta_{\mathbf{k}}}\right) - \sum_{\mathbf{k}}\frac{|\Delta_{\mathbf{k}}|^2}{2\zeta_{\mathbf{k}}} \tag{8.22} \]

This last expression for the total energy has a transparent physical interpretation. The first term is the kinetic energy cost to create the Cooper pairs; this term is always positive since $\epsilon_{\mathbf{k}} \le \zeta_{\mathbf{k}}$ and comes from the fact that we need to promote some electrons to energies higher than the Fermi energy in order to create Cooper pairs. The second term is the gain in potential energy due to the binding of the Cooper pairs, which is evidently always negative corresponding to an effective attractive interaction. In the BCS model it is possible to show that this energy is always lower than the corresponding energy of the normal, non-superconducting state (see Problem 3).
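The two expressions for the zero-temperature gap can be compared directly. The sketch below evaluates the full result $\hbar\omega_D/\sinh(1/g_F V_0)$ and the weak coupling form of Eq. (8.21). It also checks the full result against a reduced gap equation, $1 = g_F V_0\int_0^{\hbar\omega_D} d\epsilon/\sqrt{\epsilon^2 + |\Delta|^2}$, which is an assumption made here: it corresponds to a constant interaction $-V_0$ and constant density of states $g_F$ in a shell $|\epsilon| < \hbar\omega_D$ on both sides of the Fermi level, and the integral equals $\sinh^{-1}(\hbar\omega_D/|\Delta|)$. Function names are illustrative and `coupling` stands for $g_F V_0$.

```python
import math

def gap_full(coupling, debye_energy=1.0):
    """|Delta| = hbar omega_D / sinh(1 / (g_F V0))."""
    return debye_energy / math.sinh(1.0 / coupling)

def gap_weak_coupling(coupling, debye_energy=1.0):
    """|Delta| = 2 hbar omega_D exp(-1 / (g_F V0)), Eq. (8.21)."""
    return 2.0 * debye_energy * math.exp(-1.0 / coupling)

def reduced_gap_equation_residual(gap, coupling, debye_energy=1.0):
    """coupling * asinh(hbar omega_D / gap) - 1 for the assumed reduced gap
    equation; zero when gap solves it."""
    return coupling * math.asinh(debye_energy / gap) - 1.0

if __name__ == "__main__":
    for coupling in (0.1, 0.2, 0.3, 0.5, 1.0):
        full = gap_full(coupling)
        weak = gap_weak_coupling(coupling)
        print(f"g_F V0={coupling:.1f}  full={full:.3e}  weak={weak:.3e}  "
              f"weak/full={weak / full:.4f}")
```

The ratio of the two expressions is $1 - e^{-2/g_F V_0}$, so the weak coupling form is already accurate to better than a percent for $g_F V_0$ below about 0.4.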

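8.3.3 BCS theory at finite temperature

Finally, we consider the situation at finite temperature. In this case, we will take the temperature-dependent occupation number of the single-particle state $\mathbf{k}$ to be $n_{\mathbf{k}}$ for each spin state. This leads to the assignment of $(1 - 2n_{\mathbf{k}})$ as the Cooper pair occupation number of the same wave-vector. Consequently, each occurrence of a Cooper pair will be accompanied by a factor $(1 - 2n_{\mathbf{k}})$, while the occurrence of an electronic state that is not part of a Cooper pair will be accompanied by a factor $2n_{\mathbf{k}}$, taking into account spin degeneracy. With these considerations, the total energy of the ground state at finite temperature will be given by

\[ E_0^{(s)} = \sum_{\mathbf{k}}\left[2n_{\mathbf{k}}\epsilon_{\mathbf{k}} + 2(1 - 2n_{\mathbf{k}})|v_{\mathbf{k}}|^2\epsilon_{\mathbf{k}}\right] + \sum_{\mathbf{k}\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\,u^*_{\mathbf{k}}v^*_{\mathbf{k}'}u_{\mathbf{k}'}v_{\mathbf{k}}\,(1 - 2n_{\mathbf{k}})(1 - 2n_{\mathbf{k}'}) \tag{8.23} \]

To this energy we must add the entropy contribution, to arrive at the free energy. The entropy of a gas of fermions consisting of spin-up and spin-down particles with occupation numbers $n_{\mathbf{k}}$ is given by (see Appendix D)

\[ S^{(s)} = -2k_B\sum_{\mathbf{k}}\left[n_{\mathbf{k}}\ln n_{\mathbf{k}} + (1 - n_{\mathbf{k}})\ln(1 - n_{\mathbf{k}})\right] \tag{8.24} \]

By analogy to our discussion of the zero-temperature case, we define the quantities $\Delta_{\mathbf{k}}$, $\Delta^*_{\mathbf{k}}$ as

\[ \Delta_{\mathbf{k}} = \sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\,u^*_{\mathbf{k}'}v_{\mathbf{k}'}(1 - 2n_{\mathbf{k}'}) \;\to\; \Delta^*_{\mathbf{k}} = \sum_{\mathbf{k}'}V_{\mathbf{k}\mathbf{k}'}\,u_{\mathbf{k}'}v^*_{\mathbf{k}'}(1 - 2n_{\mathbf{k}'}) \tag{8.25} \]

We can now apply a variational argument to the free energy of the ground state $F^{(s)} = E_0^{(s)} - TS^{(s)}$ which, following exactly the same steps as in the zero-temperature case, leads to

\[ \Delta_{\mathbf{k}} = -\sum_{\mathbf{k}'}V_{\mathbf{k}'\mathbf{k}}\frac{\Delta_{\mathbf{k}'}}{2\zeta_{\mathbf{k}'}}(1 - 2n_{\mathbf{k}'}) \tag{8.26} \]

with $\zeta_{\mathbf{k}}$ defined by the same expression as before, Eq. (8.18), only now it is a temperature-dependent quantity through the dependence of $\Delta_{\mathbf{k}}$ on $n_{\mathbf{k}}$. Eq. (8.26) is the BCS gap equation at finite temperature. We can also require that the free energy $F^{(s)}$ is a minimum with respect to variations in the occupation numbers $n_{\mathbf{k}}$, which leads to

\[ \frac{\partial F^{(s)}}{\partial n_{\mathbf{k}}} = 0 \;\Rightarrow\; 2\epsilon_{\mathbf{k}}(1 - 2|v_{\mathbf{k}}|^2) - 2\Delta^*_{\mathbf{k}}u^*_{\mathbf{k}}v_{\mathbf{k}} - 2\Delta_{\mathbf{k}}u_{\mathbf{k}}v^*_{\mathbf{k}} + 2k_B T\ln\frac{n_{\mathbf{k}}}{1 - n_{\mathbf{k}}} = 0 \]

All the relations derived for $u_{\mathbf{k}}$, $v_{\mathbf{k}}$, $\Delta_{\mathbf{k}}$ in terms of the variables $\theta_{\mathbf{k}}$, $w_{\mathbf{k}}$ are still valid with the new definition of $\Delta_{\mathbf{k}}$, Eq. (8.25); when these relations are substituted into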

the above equation, they lead to

$$n_\mathbf{k} = \frac{1}{1 + e^{\zeta_\mathbf{k}/k_B T}} \tag{8.27}$$

This result reveals that the occupation numbers $n_\mathbf{k}$ have the familiar Fermi function form but with the energy $\zeta_\mathbf{k}$, rather than the single-particle energy $\epsilon_\mathbf{k}$, as the relevant variable. Using this expression for $n_\mathbf{k}$ in the BCS gap equation at finite temperature, Eq. (8.26), we find

$$\Delta_\mathbf{k} = -\sum_{\mathbf{k}'} V_{\mathbf{k}'\mathbf{k}} \frac{\Delta_{\mathbf{k}'}}{2\zeta_{\mathbf{k}'}} \tanh\left(\frac{\zeta_{\mathbf{k}'}}{2k_B T}\right)$$

If we analyze this equation in the context of the BCS model (see Problem 4), in the weak coupling limit we obtain

$$k_B T_c = 1.14\, \hbar\omega_D\, e^{-1/g_F V_0} \tag{8.28}$$

This relation provides an explanation for the isotope effect in its simplest form: from our discussion of phonons in chapter 6, we can take the Debye frequency to be

$$\omega_D \sim \left(\frac{\kappa}{M}\right)^{1/2}$$

where $\kappa$ is the relevant force constant and $M$ the mass of the ions; this leads to $T_c \sim M^{-1/2}$, the relation we called the isotope effect, Eq. (8.2), with $\alpha = -0.5$. Moreover, combining our earlier result for $|\Delta|$ at zero temperature, Eq. (8.21), with Eq. (8.28), we find

$$2|\Delta_0| = 3.52\, k_B T_c \tag{8.29}$$

where we have used the notation $2|\Delta_0|$ for the zero-temperature value of the superconducting gap. This relation is referred to as the “law of corresponding states” and is obeyed quite accurately by a wide range of conventional superconductors, confirming the validity of BCS theory.

8.3.4 The McMillan formula for $T_c$

We close this section with a brief discussion of how the $T_c$ in conventional superconductors, arising from electron-phonon coupling as described by the BCS theory, can be calculated. McMillan [114] proposed a formula to evaluate $T_c$ as

$$T_c = \frac{\Theta_D}{1.45} \exp\left[ -\frac{1.04(1 + \lambda)}{\lambda - \mu^* - 0.62\lambda\mu^*} \right] \tag{8.30}$$

where $\Theta_D$ is the Debye temperature, $\lambda$ is a constant describing electron–phonon coupling strength³ and $\mu^*$ is another constant describing the repulsive Coulomb interaction strength. This expression is valid for $\lambda < 1.25$. The value of $\mu^*$ is difficult to obtain from calculations, but in any case the value of this constant tends to be small. For sp metals like Pb, its value has been estimated from tunneling measurements to be $\mu^* \sim 0.1$; for other cases this value is scaled by the density of states at the Fermi level, $g_F$, taking the value of $g_F$ for Pb as the norm. The exponential dependence of $T_c$ on the other parameter, $\lambda$, necessitates a more accurate estimate of its value.

³ It is unfortunate that the same symbol is used in the literature for the electron–phonon coupling constant and the penetration length, but we will conform to this convention.

It turns out that $\lambda$ can actually be obtained from electronic structure calculations, a procedure which has even led to predictions of superconductivity [115]. We outline the calculation of $\lambda$ following the treatment of Chelikowsky and Cohen [65]. $\lambda$ is expressed as the average over all phonon modes of the constants $\lambda_\mathbf{k}^{(l)}$:

$$\lambda = \sum_l \int \lambda_\mathbf{k}^{(l)}\, d\mathbf{k} \tag{8.31}$$

which describe the coupling of an electron to a particular phonon mode identified by the index $l$ and the wave-vector $\mathbf{k}$ (see chapter 6):

$$\lambda_\mathbf{k}^{(l)} = \frac{2g_F}{\hbar\omega_\mathbf{k}^{(l)}} \left\langle |f(n, \mathbf{q}, n', \mathbf{q}'; l, \mathbf{k})|^2 \right\rangle_F \tag{8.32}$$

where $f(n, \mathbf{q}, n', \mathbf{q}'; l, \mathbf{k})$ is the electron-phonon matrix element and $\langle \cdots \rangle_F$ is an average over the Fermi surface. The electron-phonon matrix elements are given by

$$f(n, \mathbf{q}, n', \mathbf{q}'; l, \mathbf{k}) = \sum_j \sqrt{\frac{\hbar}{2M_j\omega_\mathbf{k}^{(l)}}}\, \left\langle \psi_\mathbf{q}^{(n)} \right| \hat{\mathbf{e}}_{\mathbf{k}j}^{(l)} \cdot \frac{\delta V}{\delta \mathbf{t}_j} \left| \psi_{\mathbf{q}'}^{(n')} \right\rangle \delta(\mathbf{q} - \mathbf{q}' - \mathbf{k}) \tag{8.33}$$

where the summation on $j$ is over the ions of mass $M_j$ at positions $\mathbf{t}_j$ in the unit cell, $\psi_\mathbf{q}^{(n)}$, $\psi_{\mathbf{q}'}^{(n')}$ are electronic wavefunctions in the ideal crystal, and $\hat{\mathbf{e}}_{\mathbf{k}j}^{(l)}$ is the phonon polarization vector (more precisely, the part of the polarization vector corresponding to the position of ion $j$). The term $\delta V/\delta \mathbf{t}_j$ is the change in the crystal potential due to the presence of the phonon. This term can be evaluated as

$$\sum_j \hat{\mathbf{e}}_{\mathbf{k}j}^{(l)} \cdot \frac{\delta V}{\delta \mathbf{t}_j} = \frac{V_\mathbf{k}^{(l)} - V_0}{u_\mathbf{k}^{(l)}} \tag{8.34}$$

where $V_0$ is the ideal crystal potential, $V_\mathbf{k}^{(l)}$ is the crystal potential in the presence of the phonon mode identified by $l$, $\mathbf{k}$, and $u_\mathbf{k}^{(l)}$ is the average atomic displacement corresponding to this phonon mode. The terms in Eq. (8.34) can be readily evaluated through the computational methods discussed in chapter 5, by introducing atomic

displacements $\mathbf{u}_{\mathbf{k}j}^{(l)}$ corresponding to a phonon mode and evaluating the crystal potential difference resulting from this distortion, with

$$u_\mathbf{k}^{(l)} = \left( \frac{1}{N_{at}} \sum_j |\mathbf{u}_{\mathbf{k}j}^{(l)}|^2 \right)^{1/2}$$

the atomic displacement averaged over the $N_{at}$ atoms in the unit cell. Using this formalism, it was predicted and later verified experimentally that Si under high pressure would be a superconductor [115].

8.4 High-temperature superconductors

In this final section we give a short discussion of the physics of high-temperature superconductors, mostly in order to bring out their differences from the conventional ones. We first review the main points of the theory that explains the physics of conventional superconductors, as derived in the previous section. These are as follows.

(a) Electrons form pairs, called Cooper pairs, due to an attractive interaction mediated by the exchange of phonons. In a Cooper pair the electrons have opposite momenta and opposite spins in a spin-singlet configuration.
(b) Cooper pairs combine to form a many-body wavefunction which has lower energy than the normal, non-superconducting state, and represents a coherent state of the entire electron gas. Excitations above this ground state are represented by quasiparticles which correspond to broken Cooper pairs.
(c) In order to derive simple expressions relating the superconducting gap $|\Delta_0|$ to the critical temperature (or other experimentally accessible quantities) we assumed that there is no dependence of the gap, the quasiparticle energies, and the attractive interaction potential, on the electron wave-vectors $\mathbf{k}$. This assumption implies an isotropic solid and a spherical Fermi surface. This model is referred to as “s-wave” pairing, due to the lack of any spatial features in the physical quantities of interest (similar to the behavior of an s-like atomic wavefunction, see Appendix B).
(d) We also assumed that we are in the limit in which the product of the density of states at the Fermi level with the strength of the interaction potential, $g_F V_0$, which is a dimensionless quantity, is much smaller than unity; we called this the weak-coupling limit.

The high-temperature superconductors based on copper oxides (see Table 8.2, p. 284) conform to some but not all of these points. Points (a) and (b) apparently apply to those systems as well, with one notable exception: there seems to exist ground for doubting that electron-phonon interactions alone can be responsible for the pairing of electrons. Magnetic order plays an important role in the physics of these materials, and the presence of strong antiferromagnetic interactions may be

intimately related to the mechanism(s) of electron pairing. On the other hand, some variation of the basic theme of electron-phonon interaction, taking also into account the strong anisotropy in these systems (which is discussed next), may be able to capture the reason for electron pairing. Point (c) seems to be violated by these systems in several important ways. The copper-oxide superconductors are strongly anisotropic, having as a main structural feature planes of linked Cu–O octahedra which are decorated by other elements, as shown in Fig. 8.8 (see also chapter 1 for the structure of perovskites). In addition to this anisotropy of the crystal in real space, there exist strong indications that important physical quantities such as the superconducting gap are not featureless, but possess structure and hence have a strong dependence on the electron wave-vectors. These indications point to what is called “d-wave” pairing, that is, dependence of the gap on $\mathbf{k}$, similar to that exhibited by a d-like atomic wavefunction. Finally, point (d) also appears to be strongly violated by the high-temperature superconductors, which seem to be in the “strong-coupling” limit. Indications to this effect come from the very short coherence length in these systems, which is of order a few lattice constants as opposed to $\sim 10^4$ Å, and from the fact that the ratio $2|\Delta_0|/k_B T_c$ is not 3.52, as the weak-coupling limit of BCS theory, Eq. (8.29), predicts but is in the range 4 to 7. These departures from the behavior of conventional superconductors have sparked much theoretical debate on what are the microscopic mechanisms responsible for this exotic behavior. At the present writing, the issue remains unresolved.

Figure 8.8. Representation of the structure of two typical high-temperature superconductors: La2CuO4 (left) is a crystal with tetragonal symmetry and Tc = 39 K, while YBa2Cu3O7 (right) is a crystal with orthorhombic symmetry and Tc = 93 K. For both structures the conventional unit cell is shown, with the dimensions of the orthogonal axes and the positions of inequivalent atoms given in angstroms; in the case of La2CuO4 this cell is twice as large as the primitive unit cell. Notice the Cu–O octahedron, clearly visible at the center of the La2CuO4 conventional unit cell. Notice also the puckering of the Cu–O planes in the middle of the unit cell of YBa2Cu3O7 and the absence of full Cu–O octahedra at its corners.

Further reading

1. Theory of Superconductivity, J.R. Schrieffer (Benjamin/Cummings, Reading, 1964). This is a classic account of the BCS theory of superconductivity.
2. Superfluidity and Superconductivity, D.R. Tilley and J. Tilley (Adam Hilger Ltd., Bristol, 1986). This book contains an interesting account of superconductivity as well as illuminating comparisons to superfluidity.
3. Superconductivity, C.P. Poole, H.A. Farach and R.J. Creswick (Academic Press, San Diego, 1995). This is a modern account of superconductivity, including extensive coverage of the high-temperature copper-oxide superconductors.
4. Introduction to Superconductivity, M. Tinkham (2nd edn, McGraw-Hill, New York, 1996). This is a standard reference, with comprehensive coverage of all aspects of superconductivity, including the high-temperature copper-oxide superconductors.
5. Superconductivity in Metals and Alloys, P.G. de Gennes (Addison-Wesley, Reading, MA, 1966).
6. “The structure of YBCO and its derivatives”, R. Beyers and T.M. Shaw, in Solid State Physics, vol. 42, pp. 135–212 (eds. H. Ehrenreich and D. Turnbull, Academic Press, Boston, 1989). This article reviews the structure of representative high-temperature superconductor ceramic crystals.
7. “Electronic structure of copper-oxide superconductors”, K.C. Hass, in Solid State Physics, vol. 42, pp. 213–270 (eds. H. Ehrenreich and D. Turnbull, Academic Press, Boston, 1989). This article reviews the electronic properties of HTSC ceramic crystals.
8. “Electronic structure of the high-temperature oxide superconductors”, W.E. Pickett, Rev. Mod. Phys., 61, p. 433 (1989). This is a thorough review of experimental and theoretical studies of the electronic properties of the oxide superconductors.

Problems

1. Show that the second solution to the BCS equation, Eq. (8.16), corresponds to an energy higher than the ground state energy, Eq. (8.22), by $2\zeta_\mathbf{k}$.
2. Show that the energy cost of removing the pair of wave-vector $\mathbf{k}$ from the BCS ground state is:

$$E_\mathbf{k}^{pair} = 2\epsilon_\mathbf{k}|v_\mathbf{k}|^2 + \Delta_\mathbf{k} u_\mathbf{k} v^*_\mathbf{k} + \Delta^*_\mathbf{k} u^*_\mathbf{k} v_\mathbf{k}$$

where the first term comes from the kinetic energy loss and the two other terms come from the potential energy loss upon removal of this pair. Next, show that the excitation energy to break this pair, by having only one of the two single-particle states with wave-vectors $\pm\mathbf{k}$ occupied, is given by

$$\epsilon_\mathbf{k} - E_\mathbf{k}^{pair}$$

where $\epsilon_\mathbf{k}$ is the energy associated with occupation of these single-particle states. Finally, using the expressions for $u_\mathbf{k}$, $v_\mathbf{k}$, $\Delta_\mathbf{k}$ in terms of $\theta_\mathbf{k}$ and $w_\mathbf{k}$, and the BCS solution Eq. (8.17), show that this excitation energy is equal to $\zeta_\mathbf{k}$, as defined in Eq. (8.18).

3. The BCS model consists of the following assumptions, in the context of BCS theory and Cooper pairs. A pair consists of two electrons with opposite spins and wave-vectors $\mathbf{k}$, $-\mathbf{k}$, with the single-particle energy $\epsilon_\mathbf{k}$ lying within a shell of $\pm\hbar\omega_D$ around the Fermi level $\epsilon_F$. The gap $\Delta_\mathbf{k}$ and the effective interaction potential $V_{\mathbf{k}\mathbf{k}'}$ are considered independent of $\mathbf{k}$ and are set equal to the constant values $\Delta$ and $-V_0$ (with $V_0$ and $\Delta > 0$). The density of single-particle states is considered constant in the range $\epsilon_F - \hbar\omega_D < \epsilon_\mathbf{k} < \epsilon_F + \hbar\omega_D$ and is set equal to $g_F = g(\epsilon_F)$. Within these assumptions, and the convention that single-particle energies $\epsilon_\mathbf{k}$ are measured with respect to the Fermi level, the summation over wave-vectors $\mathbf{k}$ reduces to:

$$\sum_\mathbf{k} \rightarrow g_F \int_{-\hbar\omega_D}^{\hbar\omega_D} d\epsilon$$

(a) Show that in this model the BCS gap equation, Eq. (8.20), gives

$$\Delta = \frac{\hbar\omega_D}{\sinh(1/g_F V_0)}$$

and that, in the weak-coupling limit $g_F V_0 \ll 1$, this reduces to Eq. (8.21).
(b) Show that in this model the energy difference $E_0^{(s)} - E_0^{(n)}$ between the normal ground state energy given by

$$E_0^{(n)} = \sum_{|\mathbf{k}| < k_F} 2\epsilon_\mathbf{k}$$

and the superconducting ground state energy $E_0^{(s)}$ given in Eq. (8.22), is always negative. Find the value of this energy difference in the weak-coupling limit and interpret its physical meaning.

4. Starting with the expressions for the energy and the entropy of the BCS superconducting state at finite temperature, Eqs. (8.23), (8.24), and the definition of the temperature-dependent gap, Eq. (8.25), use a variational argument on the free energy to prove the finite-temperature gap equation, Eq. (8.26). Then use the BCS model defined in the previous problem, to show that:
(a) The gap equation at finite temperature gives

$$\int_0^{\hbar\omega_D} \frac{1}{(\Delta^2 + \epsilon^2)^{1/2}} \tanh\left[ \frac{(\Delta^2 + \epsilon^2)^{1/2}}{2k_B T} \right] d\epsilon = \frac{1}{g_F V_0}$$

with the gap being a temperature-dependent quantity $\Delta(T)$. From this expression prove that the gap $\Delta(T)$ is a monotonically decreasing function of $T$, which becomes zero at some temperature that we identify as the transition temperature $T_c$.
(b) Show that for $T = T_c$ in the weak-coupling limit the above equation yields Eq. (8.28). By comparing this with the result at zero temperature, Eq. (8.21), prove the law of corresponding states, Eq. (8.29).
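The finite-temperature gap equation of Problem 4(a) can also be explored numerically before attempting the analytic proofs. The sketch below is an illustration under stated assumptions rather than part of the original text: it works in units where $\hbar\omega_D = 1$ and $k_B = 1$, evaluates the integral with a simple midpoint rule, and finds roots by bisection. The function names, the grid size and the bracketing intervals are choices made here for illustration; the brackets are adequate for $g_F V_0$ roughly between 0.2 and 1.

```python
import math


def gap_integral(delta, temperature, hbar_omega_d=1.0, n=4000):
    """Midpoint-rule estimate of the left-hand side of the gap equation in Problem 4(a)."""
    h = hbar_omega_d / n
    total = 0.0
    for i in range(n):
        eps = (i + 0.5) * h                 # midpoints avoid eps = 0 when delta = 0
        zeta = math.hypot(eps, delta)       # (delta^2 + eps^2)^(1/2)
        total += math.tanh(zeta / (2.0 * temperature)) / zeta
    return total * h


def bisect(func, lo, hi, iterations=50):
    """Root of func on [lo, hi], assuming func(lo) > 0 > func(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_temperature(coupling, hbar_omega_d=1.0):
    """Temperature at which the gap equation is satisfied with delta = 0."""
    target = 1.0 / coupling
    return bisect(lambda t: gap_integral(0.0, t, hbar_omega_d) - target,
                  1e-4 * hbar_omega_d, hbar_omega_d)


def gap_at_temperature(coupling, temperature, hbar_omega_d=1.0):
    """Gap Delta(T) solving the gap equation; returns 0.0 at or above T_c."""
    target = 1.0 / coupling
    delta_zero = hbar_omega_d / math.sinh(target)   # zero-temperature gap of the model
    residual = lambda d: gap_integral(d, temperature, hbar_omega_d) - target
    if residual(0.0) <= 0.0:
        return 0.0
    return bisect(residual, 0.0, 1.01 * delta_zero)


if __name__ == "__main__":
    coupling = 0.3                                   # illustrative value of g_F V_0
    tc = critical_temperature(coupling)
    delta_zero = 1.0 / math.sinh(1.0 / coupling)
    print("k_B T_c / (hbar omega_D) =", tc)
    print("2 Delta_0 / (k_B T_c)    =", 2.0 * delta_zero / tc)
    for fraction in (0.25, 0.5, 0.9):
        print("Delta(T)/Delta_0 at T =", fraction, "T_c:",
              gap_at_temperature(coupling, fraction * tc) / delta_zero)
```

For a coupling of this size the computed ratio $2\Delta_0/k_B T_c$ should come out close to the value quoted in Eq. (8.29), and the computed $\Delta(T)$ should decrease as $T$ approaches $T_c$, in line with what Problem 4 asks the reader to prove.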


Part II
Defects, non-crystalline solids and finite structures

The crystalline structure of solids forms the basis for our understanding of their properties. Most real systems, however, are not perfect crystals. To begin with, even in those real solids which come close to the definition of a perfect crystal there are a large number of defects. For instance, in the best quality crystals produced with great care for the exacting requirements of high-technology applications, such as Si wafers used to manufacture electronic devices, the concentration of defects is rarely below one per billion; this corresponds to roughly one defect per cube of only 1000 atomic distances on each side! These imperfections play an essential role in determining the electronic and mechanical properties of the real crystal. Moreover, there are many solids whose structure has none of the characteristic symmetries of crystals (amorphous solids or glasses) as well as many interesting finite systems whose structure has some resemblance to crystals but they are not of infinite extent. This second part of the book is concerned with all these types of solids and structures.

A natural way to classify defects in crystalline structure is according to their dimensionality:

(i) Zero-dimensional or “point” defects consist of single atomic sites, or complexes of very few atomic sites, which are not in the proper crystalline positions; examples of point defects are missing atoms called vacancies, or extra atoms called interstitials. Point defects may also consist of crystalline sites occupied by atoms foreign to the host crystal; these are called substitutional impurities.
(ii) One-dimensional defects consist of lines of atomic sites perturbed from their ideal positions; these can extend for distances comparable to the linear dimension of the solid. The linear defects are called dislocations.
(iii) Two-dimensional defects consist of a plane of atomic sites where the crystalline lattice is terminated; this is a true surface of a crystal. Alternatively, a two-dimensional defect may correspond to the intersection of two crystallites (grains); these are called grain boundaries and interfaces.

In the following three chapters we will take up the properties of point defects, line defects and surfaces and interfaces of crystals. Chapters 12 and 13 are devoted to solids that lack crystalline order and to structures which are finite (that is, not of macroscopic extent and therefore not representable as periodic) in one or more dimensions. These structures are becoming increasingly important in practical applications and give rise to a wide variety of interesting physical phenomena.

9
Defects I: point defects

Point defects can be either intrinsic, that is, atoms of the same type as the host crystal but in wrong positions, or extrinsic, that is, atoms of a different type than those of the ideal crystal. The most common intrinsic point defects are the vacancy (a missing atom from a crystalline site) and the interstitial (an extra atom in a non-crystalline site). The most common extrinsic defects are substitutional impurities, that is, atoms which are foreign to the crystal but are situated at crystalline sites, substituting for a regular crystal atom. The presence of defects often distorts significantly the atomic structure of the crystal in the immediate neighborhood of the defects. An illustration of these defects in a two-dimensional square lattice is given in Fig. 9.1; in this figure the types of distortions introduced by various defects are exaggerated to make them obvious. More realistic examples, consisting of the vacancy and interstitial defects in bulk Si, are shown in Fig. 9.2.

9.1 Intrinsic point defects

9.1.1 Energetics and electronic levels

Vacancies are quite common in many crystals, both in close-packed metallic structures as well as in open covalent structures. The creation of a vacancy costs some energy, due to the breaking of bonds, which is called the formation energy $\epsilon_f^{(vac)}$. This is defined as the energy of the crystal containing a vacancy, that is, one atom fewer than the ideal crystal which has $N$ atoms, denoted as $E^{(vac)}(N - 1)$, plus the energy of an atom in the ideal crystal, minus the energy of an equal number of atoms in a perfect crystal (with $N \rightarrow \infty$ for an infinite crystal):

$$\epsilon_f^{(vac)} = \lim_{N \rightarrow \infty} \left[ E^{(vac)}(N - 1) + \frac{E_0(N)}{N} - E_0(N) \right] \tag{9.1}$$

where $E_0(N)$ is the ground state energy of a crystal consisting of $N$ atoms at ideal positions. Vacancies can move in the crystal, jumping from one crystalline

site to the next; what actually happens is that an atom jumps from one of the sites surrounding the vacancy into the vacant site. This is a thermally activated process that has to overcome an energy barrier, called the migration energy $\epsilon_m^{(vac)}$; this energy corresponds to the so-called activated configuration, or saddle point, and is given relative to the energy of the equilibrium configuration of the defect, which corresponds to the formation energy $\epsilon_f^{(vac)}$. The concentration of vacancies per site in the crystal at temperature $T$ under equilibrium conditions is determined by the

Figure 9.1. Illustration of point defects in a two-dimensional square lattice. V = vacancy, I = interstitial, S1, S2 = substitutional impurities of size smaller and larger than the host atoms, respectively. In all cases the defects are surrounded by a faintly drawn circle which includes their nearest neighbors. In the lower left corner an ideal atom and its nearest neighbors are shown for comparison, also surrounded by a faint circle of the same radius. The thin lines and small dots in the neighborhood of defects serve to identify the position of neighboring atoms in the ideal, undistorted crystal.

Figure 9.2. Examples of vacancy and interstitial defects in bulk Si. Left: the ideal bulk structure of Si, with one of the tetrahedrally bonded atoms highlighted. Center: the vacancy configuration, with the highlighted tetrahedrally bonded atom missing (note the relaxation of its four neighbors). Right: an interstitial atom placed in an empty region of the diamond lattice; the crystal is viewed at an angle that makes the open spaces evident. The actual lowest energy configuration of the Si interstitial is different and involves a more complicated arrangement of atoms.

vacancy formation energy, and is given by

$$c^{(vac)}(T) = \exp\left(-\epsilon_f^{(vac)}/k_B T\right) \tag{9.2}$$

as can be easily shown by standard statistical mechanics arguments (see Problem 1). Such arguments prove that the concentration of defects is determined by the balance between the entropy gain due to the various ways of arranging the defects in the crystal and the energy cost to introduce each defect.

Interstitial atoms are extra atoms that exist in the crystal at positions that do not coincide with regular crystalline sites. These defects are common in crystals with open structures since in close-packed crystals there simply is not enough room to accommodate extra atoms: it costs a great amount of energy to squeeze an extra atom into a close-packed crystal. The energy to create an interstitial, its formation energy $\epsilon_f^{(int)}$, by analogy to the formation energy of the vacancy, is given by

$$\epsilon_f^{(int)} = \lim_{N \rightarrow \infty} \left[ E^{(int)}(N + 1) - \frac{E_0(N)}{N} - E_0(N) \right] \tag{9.3}$$

where $E^{(int)}(N + 1)$ is the energy of the solid containing a total of $N + 1$ atoms, $N$ of them at regular crystalline positions plus one at the interstitial position. Interstitials in crystals with open structures can also move, undergoing a thermally activated process by which they are displaced from one stable position to another, either through a direct jump, or by exchanging positions with an atom of the host lattice. By analogy to the vacancy, there is a thermal activation energy associated with the motion of interstitials, called the interstitial migration energy $\epsilon_m^{(int)}$, and the concentration of interstitials per site in the crystal at temperature $T$ under equilibrium conditions is given by

$$c^{(int)}(T) = \exp\left(-\epsilon_f^{(int)}/k_B T\right) \tag{9.4}$$

The presence of intrinsic defects in a crystal has an important consequence: it introduces electronic states in the solid beyond those of the ideal crystal. If the energy of such defect-related states happens to be in the same range of energy as that of states of the perfect crystal, the presence of the defect does not make a big difference to the electronic properties of the solid. However, when the defect states happen to have energy different than that of the crystal states, for example, in the band gap of semiconductors, their effect can be significant. To illustrate this point, consider the vacancy in Si: the presence of the vacancy introduces four broken bonds (called “dangling bonds”), the energy of which is in the middle of the band gap, that is, outside the range of energy that corresponds to crystal states. By counting the available electrons, it becomes clear that each dangling bond state is occupied by one electron, since two electrons are required to form a covalent bond. If the

neighbors of the vacant site are not allowed to distort from their crystalline positions, symmetry requires that the four dangling bond states are degenerate in energy, and are each half filled (every electronic level can accommodate two electrons with opposite spin). The actual physical situation is interesting: the four dangling bonds combine in pairs, forming two bonding and two antibonding combinations. The four electrons end up occupying the two bonding states, leaving the antibonding ones empty. This is achieved by a slight distortion in the position of the immediate neighbors of the vacancy, which move in pairs closer to each other as shown in Fig. 9.2. The net effect is to produce a more stable structure, with lower energy and a new band gap between the occupied bonding and unoccupied antibonding states, as illustrated schematically in Fig. 9.3. This is called a Jahn–Teller distortion; it is common in defect configurations when the undistorted defect structure corresponds to partially occupied, degenerate electronic states.

Figure 9.3. Schematic representation of electronic states associated with the Si vacancy. The shaded regions represent the conduction and valence states of the bulk. Left (“Ideal Vacancy”): the vacancy configuration before any relaxation of the neighbors; each broken bond contains one electron denoted by an up or a down arrow for the spin states, and all levels are degenerate, coincident with the Fermi level $\epsilon_F$ which is indicated by a dashed line. Right (“Relaxed Vacancy”): the reconstructed vacancy, with pairs of broken bonds forming bonding and antibonding states; the former are fully occupied, the latter empty, and the two sets of states are separated by a small gap.

9.1.2 Defect-mediated diffusion

One of the most important effects of intrinsic point defects is enhanced atomic diffusion. Vacancies and interstitials can move much more easily through the crystal than atoms residing at ideal lattice sites in the absence of any defects; the latter are fully bonded to their neighbors by strong bonds in the case of covalent or ionic solids, or have very little room to move in the case of close-packed metallic solids. An illustration of the motion of point defects in a two-dimensional square lattice is shown in Fig. 9.4. Diffusion is described in terms of the changes in the local concentration $c(\mathbf{r}, t)$ with respect to time $t$ due to the current

$$\mathbf{j}(\mathbf{r}, t) = -\nabla_\mathbf{r} c(\mathbf{r}, t)$$

flowing into or out of a volume element. The mass conservation equation applied to this situation gives

$$\frac{\partial c(\mathbf{r}, t)}{\partial t} + D(T)\, \nabla_\mathbf{r} \cdot \mathbf{j}(\mathbf{r}, t) = 0 \;\Longrightarrow\; \frac{\partial c(\mathbf{r}, t)}{\partial t} = D(T)\, \nabla_\mathbf{r}^2 c(\mathbf{r}, t) \tag{9.5}$$

The last equation is known as Fick’s law. $D(T)$ is the diffusion constant; it has dimensions [length]$^2$[time]$^{-1}$ and depends on the temperature $T$.

Figure 9.4. Illustration of motion of point defects in a two-dimensional square lattice. Top: the migration of the vacancy (indicated by a smaller circle in dashed line) between two equilibrium configurations. Notice that in the case of the vacancy what is actually moving is one of its neighbors into the vacant site; the energy cost of breaking the bonds of this moving atom corresponds to the migration energy. Bottom: the migration of the interstitial (indicated by a circle in thicker line) between two equilibrium configurations. The interstitial must displace some of its neighbors significantly in order to move; the energy cost of this distortion corresponds to the migration energy. The larger dashed circles indicate the region which is significantly distorted due to the defect; in both cases, the saddle point configuration induces larger distortions.

If motion of vacancies and interstitials were the only possible means for diffusion in a solid, then the diffusion constant would be

$$D(T) = D^{(vac)}(T) \exp\left[-\epsilon_m^{(vac)}/k_B T\right] + D^{(int)}(T) \exp\left[-\epsilon_m^{(int)}/k_B T\right] \tag{9.6}$$
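The content of Fick’s law, Eq. (9.5), can be made concrete with a small numerical experiment in one dimension. The following is a minimal sketch and not part of the original text: it uses an explicit forward-time, centered-space update on a periodic grid, and the function names, grid size and parameter values are illustrative assumptions. Two properties are worth checking: the total amount of diffusing material is conserved, and the variance of an initially localized concentration profile grows as $2Dt$.

```python
def diffuse_1d(concentration, d_const, dx, dt, steps):
    """Explicit update of dc/dt = D d2c/dx2 on a periodic one-dimensional grid."""
    r = d_const * dt / dx ** 2
    if r > 0.5:
        # The explicit scheme is numerically unstable beyond this limit.
        raise ValueError("time step too large: need D*dt/dx^2 <= 0.5")
    c = list(concentration)
    n = len(c)
    for _ in range(steps):
        # c[i - 1] wraps around for i = 0, giving periodic boundaries.
        c = [c[i] + r * (c[(i + 1) % n] - 2.0 * c[i] + c[i - 1]) for i in range(n)]
    return c


def variance(concentration, dx, center):
    """Second moment of a normalized profile about the grid point `center`."""
    return sum(((i - center) * dx) ** 2 * ci for i, ci in enumerate(concentration))


if __name__ == "__main__":
    n_sites, center = 201, 100
    start = [0.0] * n_sites
    start[center] = 1.0                    # all material initially on one site
    d_const, dx, dt, steps = 1.0, 1.0, 0.2, 100
    final = diffuse_1d(start, d_const, dx, dt, steps)
    print("total amount:", sum(final))
    print("variance:", variance(final, dx, center),
          "compared with 2 D t =", 2.0 * d_const * dt * steps)
```

The linear growth of the variance with time is the macroscopic signature of the random hopping of individual defects that underlies the microscopic expressions for $D(T)$.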

The pre-exponentials for the vacancy and the interstitial mechanisms contain the concentrations per site $c^{(vac)}(T)$ or $c^{(int)}(T)$, as well as three additional factors: the attempt frequency $\nu^{(vac)}$ or $\nu^{(int)}$, that is, the number of attempts to jump out of the stable configuration per unit time, the hopping length $a^{(vac)}$ or $a^{(int)}$, that is, the distance by which an atom moves in the elementary hop associated with the defect mechanism, and a geometrical factor, $f^{(vac)}$ or $f^{(int)}$, related to the defect structure which accounts for the different ways of moving from the initial to the final configuration during an elementary hop. In the most general case, the diffusion constant is given by

$$D = \sum_n f^{(n)} \left(a^{(n)}\right)^2 \nu^{(n)}\, e^{\Delta S^{(n)}/k_B}\, e^{-\epsilon_a^{(n)}/k_B T} \tag{9.7}$$

where the sum runs over all contributing mechanisms $n$, $f^{(n)}$ is the geometric factor, $a^{(n)}$ is the hopping length, $\nu^{(n)}$ is the attempt frequency, and $\Delta S^{(n)}$, $\epsilon_a^{(n)}$ are the entropy change and activation energy associated with a particular mechanism. The activation energy is the sum of formation and migration energies:

$$\epsilon_a^{(n)} = \epsilon_f^{(n)} + \epsilon_m^{(n)} \tag{9.8}$$

The attempt frequency $\nu^{(n)}$ and entropy term $\exp(\Delta S^{(n)}/k_B)$ give the rate of the mechanism. Within classical thermodynamics this rate can be calculated through Vineyard’s Transition State Theory (TST) [116]:

$$\Gamma^{(n)} \equiv \nu^{(n)} e^{\Delta S^{(n)}/k_B} = \left(\frac{k_B T}{2\pi}\right)^{1/2} \left[ \int e^{-\tilde{E}^{(n)}/k_B T}\, dA^{(n)} \right] \left[ \int e^{-E^{(n)}/k_B T}\, d\Omega^{(n)} \right]^{-1} \tag{9.9}$$

where the first integral is over a $(d - 1)$-dimensional area $A^{(n)}$ called the “saddle point surface” and the second integral is over a $d$-dimensional volume $\Omega^{(n)}$ around the equilibrium configuration. The dimensionality of these two spaces is determined by the number of independent degrees of freedom involved in the diffusion mechanism; for example, if there are $N$ atoms involved in the mechanism, then $d = 3N$. In these two integrals, the spatial variables have been scaled by the square roots of the corresponding reduced masses:

$$d\Omega^{(n)} = d\left(\sqrt{m_1^{(n)}}\, x_1\right) d\left(\sqrt{m_2^{(n)}}\, x_2\right) \cdots d\left(\sqrt{m_d^{(n)}}\, x_d\right), \quad dA^{(n)} = d\left(\sqrt{\tilde{m}_1^{(n)}}\, \tilde{x}_1\right) d\left(\sqrt{\tilde{m}_2^{(n)}}\, \tilde{x}_2\right) \cdots d\left(\sqrt{\tilde{m}_{d-1}^{(n)}}\, \tilde{x}_{d-1}\right) \tag{9.10}$$

The saddle point surface passes through the saddle point configuration and separates the two stable configurations of the mechanism. This surface is locally perpendicular to constant energy contours (see Fig. 9.5). The energy $E^{(n)}$ is measured from the ground state energy, which is taken to define the zero of the energy scale, so that $E^{(n)}$ is always positive. The energy $\tilde{E}^{(n)}$ is measured from the activated state, $\tilde{E}^{(n)} = E^{(n)} - \epsilon_a^{(n)}$,

so that the energy $\tilde{E}^{(n)}$ is always positive. The rate $\Gamma^{(n)}$ given by Eq. (9.9) is based on a calculation of the current of trajectories that depart from the equilibrium configuration and reach the saddle point with just enough velocity to cross to the other side of the saddle point surface. The first integral in the TST rate is a measure of the successful crossings of the saddle point surface, while the second is a normalization by the number of attempts of the system to leave the equilibrium configuration. One of the limitations of TST is that it does not take into account the backflow of trajectories, which in certain systems can be substantial.

Figure 9.5. Energy surface corresponding to an activated diffusion mechanism. The equilibrium region, the saddle point configuration, the saddle point surface and the migration path are indicated.

In many situations it is convenient and adequately accurate to use the harmonic approximation to TST: this involves an expansion of the energy over all the normal modes of the system near the equilibrium and the saddle point configuration of the mechanism:

$$E^{(n)} \approx \frac{1}{2} \sum_{i=1}^{d} m_i^{(n)} \left(2\pi\nu_i^{(n)}\right)^2 \left(x_i^{(n)}\right)^2, \quad \tilde{E}^{(n)} \approx \frac{1}{2} \sum_{i=1}^{d-1} \tilde{m}_i^{(n)} \left(2\pi\tilde{\nu}_i^{(n)}\right)^2 \left(\tilde{x}_i^{(n)}\right)^2 \tag{9.11}$$

With these expressions, the contribution of mechanism $n$ to the diffusion constant becomes

$$D^{(n)} = f^{(n)} \left(a^{(n)}\right)^2 \left[ \prod_{i=1}^{d} \nu_i^{(n)} \right] \left[ \prod_{i=1}^{d-1} \tilde{\nu}_i^{(n)} \right]^{-1} e^{-\epsilon_a^{(n)}/k_B T} \tag{9.12}$$

This form of the diffusion constant helps elucidate the relative importance of the various contributions. We note first that the geometric factor is typically a constant of order unity, while the hopping length is fixed by structural considerations (for

example, in the case of the vacancy it is simply equal to the distance between two nearest neighbor sites in the ideal crystal). The remaining factors are related to the features of the energy surface associated with a particular mechanism. The dominant feature of the energy surface, which essentially determines its contribution, is evidently the activation energy $\epsilon_a^{(n)}$ which enters exponentially in the expression for the diffusion constant. To illustrate how the other features of the energy surface affect the diffusion rate, we consider a very simple example of a mechanism where there are only two relevant degrees of freedom; in this case, the expression for the diffusion constant reduces to

$$D = f a^2\, \frac{\nu_1 \nu_2}{\tilde{\nu}_1}\, e^{-\epsilon_a/k_B T}$$

In this expression we identify $\nu_2$ as the attempt frequency and we relate the ratio $\nu_1/\tilde{\nu}_1$ to the change in entropy through

$$\frac{\nu_1}{\tilde{\nu}_1} = e^{\Delta S/k_B} \;\Rightarrow\; \Delta S = k_B \ln\left(\frac{\nu_1}{\tilde{\nu}_1}\right)$$

From this example we see that, for a given activation energy $\epsilon_a$, diffusion is enhanced when the curvature of the energy at the saddle point is much smaller than the curvature in the corresponding direction around the equilibrium configuration, that is, when $\tilde{\nu}_1 \ll \nu_1$ which implies a large entropy $\Delta S$. Equally important is the value of the attempt frequency $\nu_2$: diffusion is enhanced if the curvature of the energy surface in the direction of the migration path is large, giving rise to a large attempt frequency.

When the measured diffusion constant is plotted on a logarithmic scale against $1/T$, the slope of the resulting line identifies the activation energy of the diffusion mechanism; this is known as an Arrhenius plot. Often, a single atomistic process has an activation energy $\epsilon_a^{(n)}$ compatible with experimental measurements and is considered the dominant diffusion mechanism. Thus, when trying to identify the microscopic processes responsible for diffusion, the first consideration is to determine whether or not a proposed mechanism has an activation energy compatible with experimental measurements. This issue can be established reasonably well in many cases of bulk and surface diffusion in solids, through the methods discussed in chapter 5 for calculating the total energy of various atomic configurations. If several hypothetical mechanisms have activation energies that cannot be distinguished within the experimental uncertainty, then the expression for the pre-exponential factor, Eq. (9.12), can help identify the dominant processes: the ones with a very flat saddle point surface and a very steep equilibrium well, and hence low frequencies $\tilde{\nu}_i^{(n)}$ and high frequencies $\nu_i^{(n)}$, will give the largest contribution to diffusion (for an application of this approach see, for example, Ref. [117]).

harmonic approximation is not appropriate, the entire energy landscape, that is, the energy variation with respect to defect motion near the equilibrium configuration and near the saddle point configuration, is needed to calculate the rate $\Gamma^{(n)}$ from Eq. (9.9). It is also possible that more sophisticated theories than TST are required, especially when backflow across the saddle point surface is important, but such situations tend to be rare in the context of diffusion in solids.

Vacancies, interstitials and point defects in general introduce significant distortions to the crystal positions in their neighborhood, both in the equilibrium configuration and in the activated configuration. These distortions can be revealed experimentally through the changes they induce on the volume of the crystal and their effect on the diffusion constant, using hydrostatic or non-hydrostatic pressure experiments. This necessitates the generalization of the above expressions for the diffusion constant to the Gibbs free energy $G = E - TS + P\Omega$, with $P$ the external pressure and $\Omega$ the volume.¹ Experimental measurements of the effect of pressure on the diffusion rate can help identify the type of defect that contributes most to diffusion, and the manner in which it moves in the crystal through the distortions it induces to the lattice [118].

¹ The pressure–volume term is replaced by a stress–strain term in the case of non-hydrostatic pressure.

9.2 Extrinsic point defects

The most common type of extrinsic point defect is a substitutional impurity, that is, a foreign atom that takes the place of a regular crystal atom at a crystal site. When the natural size of the impurity is compatible with the volume it occupies in the host crystal (i.e. the distance between the impurity and its nearest neighbors in the crystal is not too different from its usual bond length), the substitution results in a stable structure with interesting physical properties. In certain cases, the impurity behaves essentially like an isolated atom in an external field possessing the point group symmetries of the lattice. This external field usually induces splitting of levels that would normally be degenerate in a free atom. The impurity levels are in general different from the host crystal levels. Electronic transitions between impurity levels, through absorption or emission of photons, reveal a signature that is characteristic of, and therefore can uniquely identify, the impurity. Such transitions are often responsible for the color of crystals that otherwise would be transparent. This type of impurity is called a color center.

9.2.1 Impurity states in semiconductors

A common application of extrinsic point defects is the doping of semiconductor crystals. This is a physical process of crucial importance to the operation of modern

electronic devices, so we discuss its basic aspects next. We begin with a general discussion of the nature of impurity states in semiconductors. The energy of states introduced by the impurities, which are relevant to doping, lies within the band gap of the semiconductor. If the energy of the impurity-related states is near the middle of the band gap, these are called "deep" states; if it lies near the band extrema (valence band maximum, VBM, or conduction band minimum, CBM) they are called "shallow" states. It is the latter type of state that is extremely useful for practical applications, so we will assume from now on that the impurity-related states of interest have energies close to the band extrema. If the impurity atom has more valence electrons than the host atom, in which case it is called a donor, these extra electrons are in states near the CBM. In the opposite case, the impurity creates states near the VBM, which when occupied, leave empty states in the VBM of the crystal called "holes"; this type of impurity is called an acceptor. These features are illustrated in Fig. 9.6.

Figure 9.6. Schematic illustration of shallow donor and acceptor impurity states in a semiconductor with direct gap. The light and heavy electron and hole states are also indicated; the bands corresponding to the light and heavy masses are split in energy for clarity. (Plot of $\epsilon_{\mathbf{k}}$ versus $k$, showing the conduction bands above the CBM, the valence bands below the VBM, the gap $\epsilon_{gap}$, and the donor and acceptor states within the gap.)

One important aspect of the impurity-related states is the effective mass of charge carriers (extra electrons or holes): due to band-structure effects, these can have an effective mass which is smaller or larger than the electron mass, described as "light" or "heavy" electrons or holes. We recall from our earlier discussion of the behavior of electrons in a periodic potential (chapter 3) that an electron in a crystal has an inverse effective mass which is a tensor, written as $[(\bar{m})^{-1}]_{\alpha\beta}$. The inverse effective mass depends on matrix elements of the momentum operator at the point of the BZ where it is calculated. At a given point in the BZ we can always identify the principal axes which make the inverse effective mass a diagonal tensor, i.e. $[(\bar{m})^{-1}]_{\alpha\beta} = \delta_{\alpha\beta}(1/\bar{m}_\alpha)$, as discussed in chapter 3. The energy of electronic states around this point will be given by quadratic expressions in $\mathbf{k}$, from q · p perturbation theory (see chapter 3). Applying this to the VBM and the CBM of a semiconductor yields, for the energy of states near the highest valence ($\epsilon^{(v)}_{\mathbf{k}}$) and

the lowest conduction ($\epsilon^{(c)}_{\mathbf{k}}$) states,

$$\epsilon^{(v)}_{\mathbf{k}} = \epsilon_{vbm} + \frac{\hbar^2}{2}\left(\frac{k_1^2}{\bar{m}_{v1}} + \frac{k_2^2}{\bar{m}_{v2}} + \frac{k_3^2}{\bar{m}_{v3}}\right), \qquad \epsilon^{(c)}_{\mathbf{k}} = \epsilon_{cbm} + \frac{\hbar^2}{2}\left(\frac{k_1^2}{\bar{m}_{c1}} + \frac{k_2^2}{\bar{m}_{c2}} + \frac{k_3^2}{\bar{m}_{c3}}\right) \qquad (9.13)$$

where the effective masses for the valence states $\bar{m}_{v1}, \bar{m}_{v2}, \bar{m}_{v3}$ are negative, and those for the conduction states $\bar{m}_{c1}, \bar{m}_{c2}, \bar{m}_{c3}$ are positive.

We will assume the impurity has an effective charge $\bar{Z}^{(I)}$ which is the difference between the valence of the impurity and the valence of the crystal atom it replaces; with this definition $\bar{Z}^{(I)}$ can be positive or negative. The potential that the presence of the impurity creates is then

$$V^{(I)}(\mathbf{r}) = -\frac{\bar{Z}^{(I)} e^2}{\varepsilon r} \qquad (9.14)$$

where $\varepsilon$ is the dielectric constant of the solid. We expect that in such a potential, the impurity-related states will be similar to those of an atom with effective nuclear charge $\bar{Z}^{(I)} e/\varepsilon$. In order to examine the behavior of the extra electrons or the holes associated with the presence of donor or acceptor dopants in the crystal, we will employ what is referred to as "effective-mass theory" [119]. We will consider the potential introduced by the impurity to be a small perturbation to the crystal potential, and use the eigenfunctions of the perfect crystal as the basis for expanding the perturbed wavefunctions of the extra electrons or the holes. In order to keep the notation simple, we will assume that we are dealing with a situation where there is only one band in the host crystal, the generalization to many bands being straightforward. The wavefunction of the impurity state will then be given by

$$\psi^{(I)}(\mathbf{r}) = \sum_{\mathbf{k}} \beta_{\mathbf{k}} \psi_{\mathbf{k}}(\mathbf{r}) \qquad (9.15)$$

where $\psi_{\mathbf{k}}(\mathbf{r})$ are the relevant crystal wavefunctions (conduction states for the extra electrons, valence states for the holes); we elaborate below on what are the relevant values of the index $\mathbf{k}$. This wavefunction will obey the single-particle equation

$$\left[H_0^{sp} + V^{(I)}(\mathbf{r})\right] \psi^{(I)}(\mathbf{r}) = \epsilon^{(I)} \psi^{(I)}(\mathbf{r}) \qquad (9.16)$$

with $H_0^{sp}$ the single-particle hamiltonian for the ideal crystal, which of course, when applied to the crystal wavefunctions $\psi_{\mathbf{k}}(\mathbf{r})$, gives

$$H_0^{sp} \psi_{\mathbf{k}}(\mathbf{r}) = \epsilon_{\mathbf{k}} \psi_{\mathbf{k}}(\mathbf{r}) \qquad (9.17)$$
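As a brief numerical aside on the quadratic expansion of Eq. (9.13), the sketch below extracts an effective mass from the curvature of a band at its extrema by finite differences. The one-dimensional tight-binding band, its hopping parameter `t_ev` and lattice constant `a` are hypothetical choices made only for illustration; they do not come from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electron-volt in joules

A_LATTICE = 5.0e-10      # hypothetical lattice constant, m
T_HOP_EV = 1.0           # hypothetical hopping energy, eV


def band_energy(k, t_ev=T_HOP_EV, a=A_LATTICE):
    """Hypothetical 1D tight-binding band eps(k) = -2 t cos(k a), in joules."""
    return -2.0 * t_ev * EV * math.cos(k * a)


def effective_mass(eps, k0, dk):
    """Effective mass hbar^2 / (d^2 eps / dk^2) at k0 (central difference)."""
    curvature = (eps(k0 + dk) - 2.0 * eps(k0) + eps(k0 - dk)) / dk**2
    return HBAR**2 / curvature


dk = 1.0e-3 / A_LATTICE
m_min = effective_mass(band_energy, 0.0, dk)                  # band minimum
m_max = effective_mass(band_energy, math.pi / A_LATTICE, dk)  # band maximum
m_exact = HBAR**2 / (2.0 * T_HOP_EV * EV * A_LATTICE**2)      # analytic value at k = 0

print(f"mass at minimum: {m_min / M_E:.4f} m_e")
print(f"mass at maximum: {m_max / M_E:.4f} m_e")
```

In this model the mass is positive at the band minimum and negative at the band maximum, which mirrors the sign convention stated for the conduction and valence masses in Eq. (9.13).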

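Substituting the expansion (9.15) into Eq. (9.16) and using the orthogonality of the crystal wavefunctions we obtain

$$\epsilon_{\mathbf{k}} \beta_{\mathbf{k}} + \sum_{\mathbf{k}'} \langle \psi_{\mathbf{k}} | V^{(I)} | \psi_{\mathbf{k}'} \rangle \beta_{\mathbf{k}'} = \epsilon^{(I)} \beta_{\mathbf{k}} \qquad (9.18)$$

If we express the crystal wavefunctions in the familiar Bloch form,

$$\psi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} \sum_{\mathbf{G}} \alpha_{\mathbf{k}}(\mathbf{G}) e^{i\mathbf{G}\cdot\mathbf{r}} \qquad (9.19)$$

we can obtain the following expression for the matrix elements of the impurity potential that appear in Eq. (9.18):

$$\langle \psi_{\mathbf{k}} | V^{(I)} | \psi_{\mathbf{k}'} \rangle = \sum_{\mathbf{G}} \gamma_{\mathbf{k}\mathbf{k}'}(\mathbf{G}) V^{(I)}(\mathbf{G} - \mathbf{k} + \mathbf{k}') \qquad (9.20)$$

where we have defined the quantity $\gamma_{\mathbf{k}\mathbf{k}'}(\mathbf{G})$ as

$$\gamma_{\mathbf{k}\mathbf{k}'}(\mathbf{G}) = \sum_{\mathbf{G}'} \alpha^*_{\mathbf{k}}(\mathbf{G}') \alpha_{\mathbf{k}'}(\mathbf{G}' + \mathbf{G}) \qquad (9.21)$$

and $V^{(I)}(\mathbf{G})$ is the Fourier transform of the impurity potential,

$$V^{(I)}(\mathbf{G}) = \int V^{(I)}(\mathbf{r}) e^{i\mathbf{G}\cdot\mathbf{r}} \, d\mathbf{r} \qquad (9.22)$$

We will assume that the impurity potential is a very smooth one, which implies that its Fourier transform components fall very fast with the magnitude of the reciprocal-space vectors, and therefore the only relevant matrix elements are those for $\mathbf{G} = 0$ and $\mathbf{k} \approx \mathbf{k}'$. We will also assume that the expansion of the impurity wavefunction onto crystal wavefunctions needs only include wave-vectors near a band extremum at $\mathbf{k}_0$ (the VBM for holes or the CBM for extra electrons). But for $\mathbf{G} = 0$ and $\mathbf{k} = \mathbf{k}' = \mathbf{k}_0$, $\gamma_{\mathbf{k}\mathbf{k}'}(\mathbf{G})$ becomes

$$\gamma_{\mathbf{k}_0\mathbf{k}_0}(0) = \sum_{\mathbf{G}'} \alpha^*_{\mathbf{k}_0}(\mathbf{G}') \alpha_{\mathbf{k}_0}(\mathbf{G}') = \sum_{\mathbf{G}'} |\alpha_{\mathbf{k}_0}(\mathbf{G}')|^2 = 1 \qquad (9.23)$$

for properly normalized crystal wavefunctions. Using these results, we arrive at the following equation for the coefficients $\beta_{\mathbf{k}}$:

$$\epsilon_{\mathbf{k}} \beta_{\mathbf{k}} + \sum_{\mathbf{k}'} V^{(I)}(\mathbf{k}' - \mathbf{k}) \beta_{\mathbf{k}'} = \epsilon^{(I)} \beta_{\mathbf{k}} \qquad (9.24)$$

Rather than trying to solve this equation for each coefficient $\beta_{\mathbf{k}}$, we will try to construct a linear combination of these coefficients which embodies the important physics. To this end, we rewrite the wavefunction for the impurity state in Eq. (9.15) using the expressions for the crystal Bloch functions that involve the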

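periodic functions $u_{\mathbf{k}}(\mathbf{r})$:

$$\psi^{(I)}(\mathbf{r}) = \sum_{\mathbf{k}} \beta_{\mathbf{k}} e^{i\mathbf{k}\cdot\mathbf{r}} u_{\mathbf{k}}(\mathbf{r})$$

Since the summation extends only over values of $\mathbf{k}$ close to $\mathbf{k}_0$, and for these values $u_{\mathbf{k}}(\mathbf{r})$ does not change significantly, we can take it as a common factor outside the summation; this leads to

$$\psi^{(I)}(\mathbf{r}) \approx u_{\mathbf{k}_0}(\mathbf{r}) \sum_{\mathbf{k}} \beta_{\mathbf{k}} e^{i\mathbf{k}\cdot\mathbf{r}} = \psi_{\mathbf{k}_0}(\mathbf{r}) \sum_{\mathbf{k}} \beta_{\mathbf{k}} e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} = \psi_{\mathbf{k}_0}(\mathbf{r}) f(\mathbf{r})$$

where in the last expression we have defined the so-called "envelope function",

$$f(\mathbf{r}) = \sum_{\mathbf{k}} \beta_{\mathbf{k}} e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} \qquad (9.25)$$

This expression is the linear combination of the coefficients $\beta_{\mathbf{k}}$ that we were seeking: since $\mathbf{k}$ is restricted to the neighborhood of $\mathbf{k}_0$, $f(\mathbf{r})$ is a smooth function because it involves only Fourier components of small magnitude $|\mathbf{k} - \mathbf{k}_0|$. We see from this analysis that the impurity state $\psi^{(I)}(\mathbf{r})$ is described well by the corresponding crystal state $\psi_{\mathbf{k}_0}(\mathbf{r})$, modified by the envelope function $f(\mathbf{r})$. Our goal then is to derive the equation which determines $f(\mathbf{r})$. If we multiply both sides of Eq. (9.24) by $\exp[i(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{r}]$ and sum over $\mathbf{k}$ in order to create the expression for $f(\mathbf{r})$ on the right-hand side, we obtain

$$\sum_{\mathbf{k}} \epsilon_{\mathbf{k}} \beta_{\mathbf{k}} e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} + \sum_{\mathbf{k},\mathbf{k}'} V^{(I)}(\mathbf{k}' - \mathbf{k}) e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} \beta_{\mathbf{k}'} = \epsilon^{(I)} f(\mathbf{r}) \qquad (9.26)$$

We rewrite the second term on the left-hand side of this equation as

$$\sum_{\mathbf{k}'} \left[ \sum_{\mathbf{k}} V^{(I)}(\mathbf{k}' - \mathbf{k}) e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}} \right] e^{i(\mathbf{k}'-\mathbf{k}_0)\cdot\mathbf{r}} \beta_{\mathbf{k}'}$$

In the square bracket of this expression we recognize the familiar sum of the inverse Fourier transform (see Appendix G). In order to have a proper inverse Fourier transform, the sum over $\mathbf{k}$ must extend over all values in the BZ, whereas in our discussion we have restricted $\mathbf{k}$ to the neighborhood of $\mathbf{k}_0$. However, since the potential $V^{(I)}(\mathbf{k})$ is assumed to vanish for large values of $|\mathbf{k}|$ we can extend the summation in the above equation to all values of $\mathbf{k}$ without any loss of accuracy. With this extension we find

$$\sum_{\mathbf{k}} V^{(I)}(\mathbf{k}' - \mathbf{k}) e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}} = V^{(I)}(\mathbf{r}) \qquad (9.27)$$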

and with this, Eq. (9.26) takes the form

$$\sum_{\mathbf{k}} \epsilon_{\mathbf{k}} \beta_{\mathbf{k}} e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} + V^{(I)}(\mathbf{r}) f(\mathbf{r}) = \epsilon^{(I)} f(\mathbf{r}) \qquad (9.28)$$

We next use the expansion of Eq. (9.13) for the energy of crystal states near the band extremum,

$$\epsilon_{\mathbf{k}} = \epsilon_{\mathbf{k}_0} + \sum_i \frac{\hbar^2}{2\bar{m}_i} (k_i - k_{0,i})^2 \qquad (9.29)$$

with $i$ identifying the principal directions along which the effective mass tensor is diagonal with elements $\bar{m}_i$; with the help of this expansion, the equation for the coefficients $\beta_{\mathbf{k}}$ takes the form

$$\sum_{\mathbf{k}} \sum_i \frac{\hbar^2}{2\bar{m}_i} (k_i - k_{0,i})^2 \beta_{\mathbf{k}} e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} + V^{(I)}(\mathbf{r}) f(\mathbf{r}) = \left[\epsilon^{(I)} - \epsilon_{\mathbf{k}_0}\right] f(\mathbf{r}) \qquad (9.30)$$

Finally, using the identity

$$(k_i - k_{0,i})^2 e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} = -\frac{\partial^2}{\partial x_i^2} e^{i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}} \qquad (9.31)$$

we arrive at the following equation for the function $f(\mathbf{r})$:

$$\left[ -\sum_i \frac{\hbar^2}{2\bar{m}_i} \frac{\partial^2}{\partial x_i^2} + V^{(I)}(\mathbf{r}) \right] f(\mathbf{r}) = \left[\epsilon^{(I)} - \epsilon_{\mathbf{k}_0}\right] f(\mathbf{r}) \qquad (9.32)$$

This equation is equivalent to a Schrödinger equation for an atom with a nuclear potential $V^{(I)}(\mathbf{r})$, precisely the type of equation we were anticipating; in this case, not only is the effective charge of the nucleus modified by the dielectric constant of the crystal, but the effective mass of the particles $\bar{m}_i$ also bears the signature of the presence of the crystal. With the impurity potential $V^{(I)}(\mathbf{r})$ given by Eq. (9.14), and assuming for simplicity an average effective mass $\bar{m}$ for all directions $i$ (an isotropic crystal), the equation for the envelope function takes the form

$$\left[ -\frac{\hbar^2 \nabla^2_{\mathbf{r}}}{2\bar{m}} - \frac{\bar{Z}^{(I)} e^2}{\varepsilon r} \right] f_n(\mathbf{r}) = \left[\epsilon^{(I)}_n - \epsilon_{\mathbf{k}_0}\right] f_n(\mathbf{r}) \qquad (9.33)$$

where we have added the index $n$ to allow for different possible solutions to this equation. This is identical to the equation for an electron in an atom with nuclear charge $\bar{Z}^{(I)} e/\varepsilon$. If the charge is negative, i.e. the impurity state is a hole, the effective mass is also negative, so the equation is formally the same as that of an electron

associated with a positive ion. Typically the impurity atoms have an effective charge of $\bar{Z}^{(I)} = \pm 1$, so the solutions are hydrogen-like wavefunctions and energy eigenvalues, scaled appropriately. Specifically, the eigenvalues of the extra electron or the hole states are given by

$$\epsilon^{(I)}_n = \epsilon_{\mathbf{k}_0} - \frac{\bar{m} e^4}{2\hbar^2 \varepsilon^2 n^2} \qquad (9.34)$$

while the corresponding solutions for the envelope function are hydrogen-like wavefunctions with a length scale $\bar{a} = \hbar^2 \varepsilon / \bar{m} e^2$; for instance, the wavefunction of the first state is

$$f_1(\mathbf{r}) = A_1 e^{-r/\bar{a}} \qquad (9.35)$$

with $A_1$ a normalization factor. Thus, the energy of impurity-related states is scaled by a factor

$$\frac{\bar{m} (e^2/\varepsilon)^2}{m_e e^4} = \frac{\bar{m}/m_e}{\varepsilon^2}$$

while the characteristic radius of the wavefunction is scaled by a factor

$$\frac{m_e e^2}{\bar{m} (e^2/\varepsilon)} = \frac{\varepsilon}{\bar{m}/m_e}$$

As is evident from Eq. (9.34), the energy of donor electron states relative to the CBM is negative ($\bar{m}$ being positive in this case), while the energy of holes relative to the VBM is positive ($\bar{m}$ being negative in this case); both of these energy differences, given by $\epsilon^{(I)}_n - \epsilon_{\mathbf{k}_0}$ in Eq. (9.34), are referred to as the "binding energy" of the electron or hole related to the impurity. Typical values of $\bar{m}/m_e$ in semiconductors are of order $10^{-1}$ (see Table 9.1), and typical values of $\varepsilon$ are of order 10. Using these values, we find that the binding energy of electrons or holes is scaled by a factor of $\sim 10^{-3}$ (see examples in Table 9.1), while the radius of the wavefunction is scaled by a factor of $\sim 10^2$. This indicates that in impurity-related states the electrons or the holes are loosely bound, both in energy and in wavefunction-spread around the impurity atom.

9.2.2 Effect of doping in semiconductors

To appreciate the importance of the results of the preceding discussion, we calculate first the number of conduction electrons or holes that exist in a semiconductor at finite temperature $T$, in the absence of any dopant impurities. The number of conduction electrons, denoted by $n_c(T)$, will be given by the sum over occupied states with energy above the Fermi level $\epsilon_F$. In a semiconductor, this will be equal

to the integral over all conduction states of the density of conduction states $g_c(\epsilon)$ multiplied by the Fermi occupation number at temperature $T$:

$$n_c(T) = \int_{\epsilon_{cbm}}^{\infty} g_c(\epsilon) \frac{1}{e^{(\epsilon - \epsilon_F)/k_B T} + 1} \, d\epsilon \qquad (9.36)$$

Similarly, the number of holes, denoted by $p_v(T)$, will be equal to the integral over all valence states of the density of valence states $g_v(\epsilon)$ multiplied by one minus the Fermi occupation number:

$$p_v(T) = \int_{-\infty}^{\epsilon_{vbm}} g_v(\epsilon) \left[ 1 - \frac{1}{e^{(\epsilon - \epsilon_F)/k_B T} + 1} \right] d\epsilon = \int_{-\infty}^{\epsilon_{vbm}} g_v(\epsilon) \frac{1}{e^{(\epsilon_F - \epsilon)/k_B T} + 1} \, d\epsilon \qquad (9.37)$$

Table 9.1. Impurity states in semiconductors. The effective masses at the CBM and VBM of representative elemental (Si and Ge) and compound semiconductors (GaAs) are given in units of the electron mass $m_e$, and are distinguished in longitudinal $\bar{m}_L$ and transverse $\bar{m}_T$ states. The energy of the CBM relative to the VBM, given in parentheses, is the band gap. $E_b$ is the binding energy of donor and acceptor states from the band extrema for various impurity atoms, in meV. In the case of compound semiconductors we also indicate which atom is substituted by the impurity. Source: Ref. [73].

| Band extremum | $\bar{m}_L/m_e$ | $\bar{m}_T/m_e$ | Impurity | $\bar{Z}^{(I)}$ | $E_b$ (meV) |
|---|---|---|---|---|---|
| Si: CBM (1.153 eV) | 0.98 | 0.19 | P | +1 | 45 |
| | | | As | +1 | 54 |
| | | | Sb | +1 | 39 |
| Si: VBM | 0.52 | 0.16 | B | −1 | 46 |
| | | | Al | −1 | 67 |
| | | | Ga | −1 | 72 |
| Ge: CBM (0.744 eV) | 1.60 | 0.08 | P | +1 | 12.0 |
| | | | As | +1 | 12.7 |
| | | | Sb | +1 | 9.6 |
| Ge: VBM | 0.34 | 0.04 | B | −1 | 10.4 |
| | | | Al | −1 | 10.2 |
| | | | Ga | −1 | 10.8 |
| GaAs: CBM (1.53 eV) | 0.066 | 0.066 | S/As | +1 | 6 |
| | | | Se/As | +1 | 6 |
| | | | Te/As | +1 | 30 |
| | | | Si/Ga | +1 | 6 |
| | | | Sn/Ga | +1 | 6 |
| GaAs: VBM | 0.80 | 0.12 | Be/Ga | −1 | 28 |
| | | | Mg/Ga | −1 | 28 |
| | | | Zn/Ga | −1 | 31 |
| | | | Cd/Ga | −1 | 35 |
| | | | Si/As | −1 | 35 |
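The hydrogen-like scaling of Eq. (9.34) and of the length scale $\bar{a}$ can be checked with a few lines of arithmetic. The sketch below uses the GaAs conduction-band mass from Table 9.1; the dielectric constant of 12.9 is an assumed, commonly quoted value for GaAs that does not appear in the text, and the function name is hypothetical.

```python
RYDBERG_EV = 13.605693      # binding energy of the hydrogen 1s electron, eV
BOHR_ANGSTROM = 0.529177    # Bohr radius of hydrogen, angstrom


def hydrogenic_impurity(mass_ratio, dielectric):
    """Scale the hydrogen values by the factors that follow from Eq. (9.34).

    mass_ratio -- effective mass in units of the electron mass
    dielectric -- static dielectric constant of the host crystal
    Returns (binding energy in meV, envelope-function radius in angstrom).
    """
    binding_mev = 1000.0 * RYDBERG_EV * mass_ratio / dielectric**2
    radius_angstrom = BOHR_ANGSTROM * dielectric / mass_ratio
    return binding_mev, radius_angstrom


# GaAs donor: mass ratio 0.066 from Table 9.1; dielectric constant 12.9 is an assumption.
binding, radius = hydrogenic_impurity(0.066, 12.9)
print(f"binding energy ~ {binding:.1f} meV, radius ~ {radius:.0f} angstrom")
```

Under these assumptions the estimate comes out at a few meV, the same order as the donor entries for GaAs in Table 9.1, with an envelope function that extends over roughly a hundred angstroms. The isotropic single-mass model is too crude for a quantitative comparison in Si or Ge, where the longitudinal and transverse masses differ strongly.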

In the above equations we have used the traditional symbols $n_c$ for electrons and $p_v$ for holes, which come from the fact that the former represent negatively charged carriers (with the subscript $c$ to indicate they are related to conduction bands) and the latter represent positively charged carriers (with the subscript $v$ to indicate they are related to valence bands). We have mentioned before (see chapter 5) that in a perfect semiconductor the Fermi level lies in the middle of the band gap, a statement that can be easily proved from the arguments that follow. For the moment, we will use this fact to justify an approximation for the number of electrons or holes. Since the band gap of semiconductors is of order 1 eV and the temperatures of operation are of order 300 K, i.e. $\sim 1/40$ eV, we can use the approximations

$$\frac{1}{e^{(\epsilon_{cbm} - \epsilon_F)/k_B T} + 1} \approx e^{-(\epsilon_{cbm} - \epsilon_F)/k_B T}, \qquad \frac{1}{e^{(\epsilon_F - \epsilon_{vbm})/k_B T} + 1} \approx e^{-(\epsilon_F - \epsilon_{vbm})/k_B T}$$

These approximations are very good for the values $\epsilon = \epsilon_{cbm}$ and $\epsilon = \epsilon_{vbm}$ in the expressions for the number of electrons and holes, Eqs. (9.36) and (9.37), respectively, and become even better for values larger than $\epsilon_{cbm}$ or lower than $\epsilon_{vbm}$, so that we can use them for the entire range of values in each integral. This gives

$$n_c(T) = \bar{n}_c(T) \, e^{-(\epsilon_{cbm} - \epsilon_F)/k_B T}, \qquad \bar{n}_c(T) = \int_{\epsilon_{cbm}}^{\infty} g_c(\epsilon) \, e^{-(\epsilon - \epsilon_{cbm})/k_B T} \, d\epsilon$$

$$p_v(T) = \bar{p}_v(T) \, e^{-(\epsilon_F - \epsilon_{vbm})/k_B T}, \qquad \bar{p}_v(T) = \int_{-\infty}^{\epsilon_{vbm}} g_v(\epsilon) \, e^{-(\epsilon_{vbm} - \epsilon)/k_B T} \, d\epsilon$$

where the quantities $\bar{n}_c(T)$ and $\bar{p}_v(T)$ are obtained by extracting from the integrand a factor independent of the energy $\epsilon$; we will refer to these quantities as the reduced number of electrons or holes. The exponential factors in the integrals for $\bar{n}_c(T)$ and $\bar{p}_v(T)$ fall very fast when the values of $\epsilon$ are far from $\epsilon_{cbm}$ and $\epsilon_{vbm}$; accordingly, only values of $\epsilon$ very near these end-points contribute significantly to each integral. We have seen above that near these values we can use q · p perturbation theory to express the energy near the CBM and VBM as quadratic expressions in $\mathbf{k}$. Moreover, from our analysis of the density of states near a minimum (such as the CBM) or a maximum (such as the VBM), Eqs. (5.13) and (5.14), we can use for $g_c(\epsilon)$ and $g_v(\epsilon)$ expressions which are proportional to $(\epsilon - \epsilon_{cbm})^{1/2}$ and $(\epsilon_{vbm} - \epsilon)^{1/2}$, respectively. These considerations make it possible to evaluate explicitly the integrals in the above expressions for $\bar{n}_c(T)$ and $\bar{p}_v(T)$, giving

$$\bar{n}_c(T) = \frac{1}{4} \left( \frac{2 \bar{m}_c k_B T}{\pi \hbar^2} \right)^{3/2}, \qquad \bar{p}_v(T) = \frac{1}{4} \left( \frac{2 \bar{m}_v k_B T}{\pi \hbar^2} \right)^{3/2} \qquad (9.38)$$

where $\bar{m}_c$, $\bar{m}_v$ are the appropriate effective masses at the CBM and VBM. Now we can derive simple expressions for the number of electrons or holes under equilibrium conditions at temperature $T$ in a crystalline semiconductor without

dopants. We note that the product of electrons and holes at temperature $T$ is

$$n_c(T) \, p_v(T) = \frac{1}{16} \left( \frac{2 k_B T}{\pi \hbar^2} \right)^{3} (\bar{m}_c \bar{m}_v)^{3/2} \, e^{-(\epsilon_{cbm} - \epsilon_{vbm})/k_B T} \qquad (9.39)$$

and the two numbers must be equal because the number of electrons thermally excited to the conduction band is the same as the number of holes left behind in the valence band. Therefore, each number is equal to

$$n_c(T) = p_v(T) = \frac{1}{4} \left( \frac{2 k_B T}{\pi \hbar^2} \right)^{3/2} (\bar{m}_c \bar{m}_v)^{3/4} \, e^{-\epsilon_{gap}/2 k_B T} \qquad (9.40)$$

where $\epsilon_{gap} = \epsilon_{cbm} - \epsilon_{vbm}$. Thus, the number of electrons or holes is proportional to the factor $\exp(-\epsilon_{gap}/2k_B T)$, which for temperatures of order room temperature is extremely small, the gap being of order 1 eV = 11 604 K.

It is now easy to explain why the presence of dopants is so crucial for the operation of real electronic devices. This operation depends on the presence of carriers that can be easily excited and made to flow in the direction of an external electric field (or opposite to it, depending on their sign). For electrons and holes which are due to the presence of impurities, we have seen that the binding energy is of order $10^{-3}$ of the binding energy of electrons in the hydrogen atom, that is, of order a few meV (the binding energy of the 1s electron in hydrogen is 13.6 eV). The meaning of binding energy here is the amount by which the energy of the donor-related electron state is below the lowest unoccupied crystal state, the CBM, or the amount by which the energy of the acceptor-related electron state is above the highest occupied crystal state, the VBM. Excitation of electrons from the impurity state to a conduction state gives rise to a delocalized state that can carry current; this excitation is much easier than excitation of electrons from the VBM to the CBM across the entire band gap, in the undoped crystal. Thus, the presence of donor impurities in semiconductors makes it possible to have a reasonable number of thermally excited carriers at room temperature. Similarly, excitation of electrons from the top of the valence band into the acceptor-related state leaves behind a delocalized hole state that can carry current.

It is instructive to calculate the average occupation of donor and acceptor levels in thermal equilibrium. In the following discussion we assume for simplicity that donor and acceptor impurities have an effective charge of $|\bar{Z}^{(I)}| = 1$, i.e. they have one electron more or one electron fewer than the crystal atoms they replace. From statistical mechanics, the average number of electrons in a donor state is

$$\langle n_e \rangle = \frac{\sum_i n_i g_i \exp[-(E_i^{(d)} - n_i \epsilon_F)/k_B T]}{\sum_i g_i \exp[-(E_i^{(d)} - n_i \epsilon_F)/k_B T]} \qquad (9.41)$$

where $n_i$ is the number of electrons in state $i$ whose degeneracy and energy are $g_i$ and $E_i^{(d)}$; $\epsilon_F$ is the Fermi level, which we take to be the chemical potential for

electrons. There are three possible states associated with a donor impurity level: a state with no electrons (degeneracy 1), a state with one electron (degeneracy 2) and a state with two electrons (degeneracy 1). The energy for the no-electron state is $E_0^{(d)} = 0$, while the state with one electron has energy $E_1^{(d)} = \epsilon^{(d)}$, with $\epsilon^{(d)}$ the energy of the donor level. In the case of the two-electron state, an additional term needs to be included to account for the Coulomb repulsion between the two electrons of opposite spin which reside in the same localized state. We call this term $U^{(d)}$, and the energy of the two-electron state is given by $E_2^{(d)} = 2\epsilon^{(d)} + U^{(d)}$. Typically, the Coulomb repulsion is much greater than the difference $(\epsilon^{(d)} - \epsilon_F)$, so that the contribution of the two-electron state to the thermodynamic average is negligible. With these values the average number of electrons in the donor state takes the form

$$\langle n_e \rangle = \frac{2 \exp[-(\epsilon^{(d)} - \epsilon_F)/k_B T]}{1 + 2 \exp[-(\epsilon^{(d)} - \epsilon_F)/k_B T]} = \frac{1}{\frac{1}{2} \exp[(\epsilon^{(d)} - \epsilon_F)/k_B T] + 1} \qquad (9.42)$$

The corresponding argument for an acceptor level indicates that the no-electron state has energy $E_0^{(a)} = U^{(a)}$ due to the repulsion between the two holes (equivalent to no electrons), the one-electron state has energy $E_1^{(a)} = \epsilon^{(a)}$, and the two-electron state has energy $E_2^{(a)} = 2\epsilon^{(a)}$, where $\epsilon^{(a)}$ is the energy of the acceptor level; the degeneracies of these states are the same as those of the donor states. We will again take the Coulomb repulsion energy between the holes to be much larger than the difference $(\epsilon^{(a)} - \epsilon_F)$, so that the no-electron state has negligible contribution to the thermodynamic averages. With these considerations, the average number of holes in the acceptor state takes the form

$$\langle p_h \rangle = 2 - \langle n_e \rangle = 2 - \frac{2 \exp[-(\epsilon^{(a)} - \epsilon_F)/k_B T] + 2 \exp[-(2\epsilon^{(a)} - 2\epsilon_F)/k_B T]}{2 \exp[-(\epsilon^{(a)} - \epsilon_F)/k_B T] + \exp[-(2\epsilon^{(a)} - 2\epsilon_F)/k_B T]} = \frac{1}{\frac{1}{2} \exp[(\epsilon_F - \epsilon^{(a)})/k_B T] + 1} \qquad (9.43)$$

This expression has a similar appearance to the one for the average number of electrons in the donor state. Because the above assignments of energy may seem less than obvious, we present a more physical picture of the argument. For the donor state the situation is illustrated in Fig. 9.7: the presence of the donor impurity creates a localized state which lies just below the CBM. We define the zero of the energy scale to correspond to the state with no electrons in the donor state, in which case the valence bands are exactly full as in the perfect crystal. Taking an electron from the reservoir and placing it in the donor state costs energy of an amount $(\epsilon^{(d)} - \epsilon_F)$, while placing two electrons in it costs twice as much energy plus the Coulomb repulsion energy $U^{(d)}$ experienced

7. 9. respectively. with regular crystal atoms shown as small black circles. but a convenient way to describe the collective behavior of the electrons in the solid. if we consider an isolated system. However. which corresponds to a ﬁlled valence band. the localized holes experience a Coulomb repulsion U (a ) . A hole is not a physical object. the natural value of the chemical potential is F = cbm . that with one electron localized near the defect in the middle panel and that with two electrons in the right panel. Electron occupation of of a donor-related state. singly hatched and cross-hatched regions represent occupation by one electron and two electrons. Z are shown as large shaded areas. Notice that. since this is the reservoir of electrons for occupation of the donor state. which is equivalent to having two holes. (d ) − F < 0. assuming U (d ) For the acceptor state the situation is illustrated in Fig. with no additional effect. is shown in the left panel. Electron states has Z (d ) = 3 (that is. if there are no electrons in this state. equivalent to having one hole. Bottom row: this indicates the occupation of electronic levels in the band structure. Top row: this indicates schematically the occupation in real space. This is another manifestation of the quasiparticle nature of electrons and holes in the solid. Taking an electron from the reservoir and placing it in this state. state being occupied by one electron. costs energy of an amount (a ) − F . which means that the optimal situation corresponds to the donor | (d ) − F |. The situation with no electrons in the defect state.8: the presence of the acceptor impurity creates a localized state which lies just above the VBM. while placing two electrons in this state costs twice this energy. The perfect crystal has two electrons per atomic site corresponding to Z = 2 and the donor ¯ (d ) = +1) and is shown as a slightly larger black circle. by the two electrons localized in the defect state. in this case. 
The defect-related states are shown as more extended than regular crystal states.336 9 Defects I: point defects Figure 9. The Fermi level is at the bottom of the conduction band and the defect state is slightly below it. The binding of the .

hole at the acceptor level means that electrons collectively avoid the neighborhood of the acceptor defect, which has a weaker ionic potential than a regular crystal atom at the same site (equivalent to a negative effective charge, $\bar{Z}^{(a)} = -1$). Thus, the occupation of this state by zero electrons is equivalent to having two localized holes, which repel each other by an energy $U^{(a)}$; this statement expresses the cost in energy to have the electrons completely expelled from the neighborhood of the acceptor defect. If we consider an isolated system, the natural value of the chemical potential is $\epsilon_F = \epsilon_{vbm}$, since this is the reservoir of electrons for occupation of the acceptor state; in this case, $\epsilon^{(a)} - \epsilon_F > 0$, which means that the optimal situation corresponds to the acceptor state being occupied by one electron, or equivalently by one hole, assuming $U^{(a)} \gg |\epsilon^{(a)} - \epsilon_F|$. These results are summarized in Table 9.2. For both the donor and the acceptor state, the optimal situation has one electron in the defect state which corresponds to a locally neutral system. In both cases, when the system is in contact with an external reservoir which determines the chemical potential of electrons (see, for instance, the following sections on the metal–semiconductor and metal–oxide–semiconductor junctions), the above arguments still hold with the position of the Fermi level not at the band extrema, but determined by the reservoir.

Figure 9.8. Electron occupation of an acceptor-related state; the symbols are the same as in Fig. 9.7, except that the acceptor has $Z^{(a)} = 1$ (that is, $\bar{Z}^{(a)} = -1$) and is shown as a smaller gray circle. The Fermi level is at the top of the valence band and the defect state is slightly above it. The situation with no electrons in the defect state, which corresponds to two electrons missing from the perfect crystal (equivalent to two holes), is shown in the left panel, that with one electron localized near the defect in the middle panel and that with two electrons in the right panel.
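The occupation averages of Eqs. (9.41)–(9.43) can be verified numerically by summing over the three occupation states directly. The sketch below is only an illustration: the level positions, the value of $k_B T$ and the Coulomb energies (taken as 1 eV so that they dominate) are hypothetical numbers, and the function names are not from the text.

```python
import math


def average_electrons(states, eps_f, kt):
    """Thermal average of Eq. (9.41); states is a list of (n_i, g_i, E_i)."""
    weights = [(n, g * math.exp(-(e - n * eps_f) / kt)) for n, g, e in states]
    partition = sum(w for _, w in weights)
    return sum(n * w for n, w in weights) / partition


def donor_states(eps_d, u_d):
    """Occupations 0, 1, 2 of a donor level: energies 0, eps_d, 2 eps_d + U."""
    return [(0, 1, 0.0), (1, 2, eps_d), (2, 1, 2.0 * eps_d + u_d)]


def acceptor_states(eps_a, u_a):
    """Occupations 0, 1, 2 of an acceptor level: energies U, eps_a, 2 eps_a."""
    return [(0, 1, u_a), (1, 2, eps_a), (2, 1, 2.0 * eps_a)]


KT = 0.025      # roughly room temperature, eV
EPS_F = 0.0     # Fermi level taken as the energy reference (assumption)
EPS_D = 0.03    # hypothetical donor level relative to the Fermi level, eV
EPS_A = -0.03   # hypothetical acceptor level relative to the Fermi level, eV
U = 1.0         # hypothetical Coulomb repulsion, eV

n_donor = average_electrons(donor_states(EPS_D, U), EPS_F, KT)
n_donor_closed = 1.0 / (0.5 * math.exp((EPS_D - EPS_F) / KT) + 1.0)      # Eq. (9.42)

p_acceptor = 2.0 - average_electrons(acceptor_states(EPS_A, U), EPS_F, KT)
p_acceptor_closed = 1.0 / (0.5 * math.exp((EPS_F - EPS_A) / KT) + 1.0)   # Eq. (9.43)

print(n_donor, n_donor_closed)
print(p_acceptor, p_acceptor_closed)
```

With a Coulomb energy much larger than $k_B T$ and than the level offsets, the full three-state sums agree with the closed forms to high accuracy; lowering `U` toward the level offsets shows where the approximation starts to fail.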

Table 9.2. The energies of the donor and acceptor states for different occupations. $U^{(d)}$, $U^{(a)}$ is the Coulomb repulsion between two localized electrons or holes, respectively.

| State $i$ | Occupation $n_i$ | Degeneracy $g_i$ | Energy (donor) $E_i^{(d)} - n_i \epsilon_F$ | Energy (acceptor) $E_i^{(a)} - n_i \epsilon_F$ |
|---|---|---|---|---|
| 1 | no electrons | 1 | 0 | $U^{(a)}$ |
| 2 | one electron | 2 | $(\epsilon^{(d)} - \epsilon_F)$ | $(\epsilon^{(a)} - \epsilon_F)$ |
| 3 | two electrons | 1 | $2(\epsilon^{(d)} - \epsilon_F) + U^{(d)}$ | $2(\epsilon^{(a)} - \epsilon_F)$ |

9.2.3 The p–n junction

Finally, we will discuss in very broad terms the operation of electronic devices which are based on doped semiconductors (for details see the books mentioned in the Further reading section). The basic feature is the presence of two parts, one that is doped with donor impurities and has an excess of electrons, that is, it is negatively (n) doped, and one that is doped with acceptor impurities and has an excess of holes, that is, it is positively (p) doped. The two parts are in contact, as shown schematically in Fig. 9.9. Because electrons and holes are mobile and can diffuse in the system, some electrons will move from the n-doped side to the p-doped side leaving behind positively charged donor impurities. Similarly, some holes will diffuse from the p-doped side to the n-doped side leaving behind negatively charged acceptor impurities. An alternative way of describing this effect is that the electrons which have moved from the n-doped side to the p-doped side are captured by the acceptor impurities, which then lose their holes and become negatively charged; the reverse applies to the motion of holes from the p-doped to the n-doped side. In either case, the carriers that move to the opposite side are no longer mobile. Once enough holes have passed to the n-doped side and enough electrons to the p-doped side, an electric field is set up due to the imbalance of charge, which prohibits further diffusion of electric charges. The potential $\Phi(x)$ corresponding to this electric field is also shown in Fig. 9.9. The region near the interface from which holes and electrons have left to go to the other side, and which is therefore depleted of carriers, is called the "depletion region". This arrangement is called a p–n junction.

The effect of the p–n junction is to rectify the current: when the junction is hooked to an external voltage bias with the plus pole connected to the p-doped side and the minus pole connected to the n-doped side, an arrangement called "forward bias", current flows because the holes are attracted to the negative pole and the electrons are attracted to the positive pole, freeing up the depletion region for additional charges to move into it. In this case, the external potential introduced by the bias

Figure 9.9. Schematic representation of p–n junction elements. Left: charge distribution, with the p-doped, n-doped and depletion regions identified. The positive and negative signs represent donor and acceptor impurities that have lost their charge carriers (electrons and holes, respectively) which have diffused to the opposite side of the junction; the resulting potential $\Phi(x)$ prohibits further diffusion of carriers. Right: operation of a p–n junction in the reverse bias and forward bias modes; the actual potential difference (solid lines) between the p-doped and n-doped regions relative to the zero-bias case (dashed lines) is enhanced in the reverse bias mode, which further restricts the motion of carriers, and is reduced in the forward bias mode, which makes it possible for current to flow. The rectifying behavior of the current $J$ as a function of applied voltage $V$ is also indicated; the small residual current for reverse bias is the saturation current.

If, on the other hand, the positive pole of the external voltage is connected to the n-doped region and the negative pole to the p-doped region, an arrangement called “reverse bias”, then current cannot flow because the motion of charges would be against the potential barrier. In this case, the external potential introduced by the bias voltage enhances the potential due to the depletion region, as indicated in Fig. 9.9. In reality, even in a reverse biased p–n junction there is a small amount of current that can flow due to thermal generation of carriers in the doped regions; this is called the saturation current. For forward bias, by convention taken as positive applied voltage, the current flow increases with applied voltage, while for reverse bias, by convention taken as negative applied voltage, the current is essentially constant and very small (equal to the saturation current). Thus, the p–n junction preferentially allows current flow in one bias direction, leading to rectification, as shown schematically in Fig. 9.9.

The formation of the p–n junction has interesting consequences on the electronic structure. We consider first the situation when the two parts, p-doped and n-doped, are well separated.

In this case, the band extrema (VBM and CBM) are at the same position for the two parts, but the Fermi levels are not: the Fermi level in the p-doped part, $\epsilon_F^{(p)}$, is near the VBM, while that of the n-doped part, $\epsilon_F^{(n)}$, is near the CBM (these assignments are explained below). When the two parts are brought into contact, the two Fermi levels must be aligned, since charge carriers move across the interface to establish a common Fermi level. When this happens, the bands on the two sides of the interface are distorted to accommodate the common Fermi level and maintain the same relation of the Fermi level to the band extrema on either side far from the interface, as shown in Fig. 9.10. This distortion of the bands is referred to as “band bending”. The reason behind the band bending is the presence of the potential $\Phi(x)$ in the depletion region. In fact, the amount by which the bands are bent upon forming the contact between the p-doped and n-doped regions is exactly equal to the potential difference far from the interface,

$$V_C = \Phi(+\infty) - \Phi(-\infty)$$

which is called the “contact potential”.[2] The difference in the band extrema on the two sides far from the interface is then equal to $eV_C$.

It is instructive to relate the width of the depletion region and the amount of band bending to the concentration of dopants on either side of the p–n junction and the energy levels they introduce in the band gap. From the discussion in the preceding subsections, we have seen that an acceptor impurity introduces a state whose energy $\epsilon^{(a)}$ lies just above the VBM and a donor impurity introduces a state whose energy $\epsilon^{(d)}$ lies just below the CBM (see Fig. 9.6). Moreover, in equilibrium these states are occupied by single electrons, assuming that the effective charge of the impurities is $\bar{Z}^{(I)} = \pm 1$.

Figure 9.10. Band bending associated with a p–n junction. Left: the bands of the p-doped and n-doped parts when they are separated, with different Fermi levels $\epsilon_F^{(p)}$ and $\epsilon_F^{(n)}$. Right: the bands when the two sides are brought together, with a common Fermi level $\epsilon_F$; the band bending in going from the p-doped side to the n-doped side is shown, with the energy change due to the contact potential $eV_C$.

[2] Sometimes this is also referred to as the “bias potential”, but we avoid this term here in order to prevent any confusion with the externally applied bias potential.

If the concentration of the dopant impurities, denoted here as $N^{(a)}$ or $N^{(d)}$ for acceptors and donors, respectively, is significant then the presence of electrons in the impurity-related states actually determines the position of the Fermi level in the band gap. Thus, in n-doped material the Fermi level will coincide with $\epsilon^{(d)}$ and in p-doped material it will coincide with $\epsilon^{(a)}$:

$$\epsilon_F^{(n)} = \epsilon^{(d)}, \qquad \epsilon_F^{(p)} = \epsilon^{(a)}$$

From these considerations, and using the diagram of Fig. 9.10, we immediately deduce that

$$eV_C = \epsilon_F^{(n)} - \epsilon_F^{(p)} = \epsilon_{gap} - \left(\epsilon_F^{(p)} - \epsilon_{vbm}^{(p)}\right) - \left(\epsilon_{cbm}^{(n)} - \epsilon_F^{(n)}\right) = \epsilon_{gap} - \left(\epsilon^{(a)} - \epsilon_{vbm}^{(p)}\right) - \left(\epsilon_{cbm}^{(n)} - \epsilon^{(d)}\right)$$

since this is the amount by which the position of the Fermi levels in the p-doped and n-doped parts differ before contact.

We can also determine the lengths over which the depletion region extends into each side, $l_p$ and $l_n$, respectively, by assuming uniform charge densities $\rho_p$ and $\rho_n$ in the p-doped and n-doped sides. In terms of the dopant concentrations, these charge densities will be given by

$$\rho_n = eN^{(d)}, \qquad \rho_p = -eN^{(a)}$$

assuming that within the depletion region all the dopants have been stripped of their carriers. The assumption of uniform charge densities is rather simplistic, but leads to correct results which are consistent with more realistic assumptions (see Problem 6). We define the direction perpendicular to the interface as the $x$ axis and take the origin to be at the interface, as indicated in Fig. 9.9. We also define the zero of the potential $\Phi(x)$ to be at the interface, $\Phi(0) = 0$. We then use Poisson’s equation, which for this one-dimensional problem gives

$$\frac{d^2\Phi(x)}{dx^2} = -\frac{4\pi}{\varepsilon}\rho_n, \quad x > 0; \qquad \frac{d^2\Phi(x)}{dx^2} = -\frac{4\pi}{\varepsilon}\rho_p, \quad x < 0 \tag{9.44}$$

where $\varepsilon$ is the dielectric constant of the material. Integrating once and requiring that $d\Phi/dx$ vanishes at $x = +l_n$ and at $x = -l_p$, the edges of the depletion region where the potential has reached its asymptotic value and becomes constant, we find:

$$\frac{d\Phi(x)}{dx} = -\frac{4\pi}{\varepsilon}\rho_n (x - l_n), \quad x > 0; \qquad \frac{d\Phi(x)}{dx} = -\frac{4\pi}{\varepsilon}\rho_p (x + l_p), \quad x < 0 \tag{9.45}$$

The derivative of the potential, which is related to the electric field, must be continuous at the interface since there is no charge build up there and hence no discontinuity in the electric field (see Appendix A). This condition gives

$$N^{(d)} l_n = N^{(a)} l_p \tag{9.46}$$

where we have also used the relation between the charge densities and the dopant concentrations mentioned above. Integrating the Poisson equation once again, and requiring the potential to vanish at the interface, leads to

$$\Phi(x) = -\frac{2\pi}{\varepsilon}\rho_n (x - l_n)^2 + \frac{2\pi}{\varepsilon}\rho_n l_n^2, \quad x > 0; \qquad \Phi(x) = -\frac{2\pi}{\varepsilon}\rho_p (x + l_p)^2 + \frac{2\pi}{\varepsilon}\rho_p l_p^2, \quad x < 0$$

From this expression we can calculate the contact potential as

$$V_C = \Phi(l_n) - \Phi(-l_p) = \frac{2\pi e}{\varepsilon}\left[N^{(d)} l_n^2 + N^{(a)} l_p^2\right]$$

and using the relation of Eq. (9.46) we can solve for $l_n$ and $l_p$:

$$l_n = \left[\frac{\varepsilon V_C}{2\pi e}\,\frac{N^{(a)}}{N^{(d)}}\,\frac{1}{N^{(a)} + N^{(d)}}\right]^{1/2}, \qquad l_p = \left[\frac{\varepsilon V_C}{2\pi e}\,\frac{N^{(d)}}{N^{(a)}}\,\frac{1}{N^{(a)} + N^{(d)}}\right]^{1/2} \tag{9.47}$$

From these expressions we find that the total size of the depletion layer $l_D$ is given by

$$l_D = l_p + l_n = \left[\frac{\varepsilon V_C}{2\pi e}\,\frac{N^{(a)} + N^{(d)}}{N^{(a)} N^{(d)}}\right]^{1/2} \tag{9.48}$$

It is interesting to consider the limiting cases in which one of the two dopant concentrations dominates:

$$N^{(a)} \gg N^{(d)} \Rightarrow l_D = \left[\frac{\varepsilon V_C}{2\pi e}\right]^{1/2} \left(N^{(d)}\right)^{-1/2}; \qquad N^{(d)} \gg N^{(a)} \Rightarrow l_D = \left[\frac{\varepsilon V_C}{2\pi e}\right]^{1/2} \left(N^{(a)}\right)^{-1/2}$$

which reveals that in either case the size of the depletion region is determined by the lowest dopant concentration.

Up to this point we have been discussing electronic features of semiconductor junctions in which the two doped parts consist of the same material. It is also possible to create p–n junctions in which the two parts consist of different semiconducting materials; these are called “heterojunctions”. In these situations the band gap and position of the band extrema are different on each side of the junction. For doped semiconductors, the Fermi levels on the two sides of a heterojunction will also be at different positions before contact.
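As a numerical check on the uniform-charge results for the same-material junction, Eqs. (9.46)–(9.48), the following minimal Python sketch evaluates the contact potential energy and the depletion lengths. The function names, the Gaussian-unit constants and the sample values (a dielectric constant of 11.7, a contact potential of 0.7 V, and the two dopant concentrations) are illustrative assumptions, not values taken from the text.

```python
import math

# Assumed Gaussian-unit constants (the text's formulas carry factors of 4*pi).
E_ESU = 4.803e-10           # electron charge in esu
STATVOLT_PER_VOLT = 1.0 / 299.792458

def contact_potential_energy(e_gap, acceptor_above_vbm, donor_below_cbm):
    """eV_C = gap - (eps_a - eps_vbm) - (eps_cbm - eps_d); all inputs in one energy unit."""
    return e_gap - acceptor_above_vbm - donor_below_cbm

def depletion_widths(v_c, n_a, n_d, eps, e=E_ESU):
    """Return (l_n, l_p, l_D) in cm from Eqs. (9.47) and (9.48).

    v_c is in statvolts, n_a and n_d in cm^-3, eps is the dielectric constant.
    """
    pref = eps * v_c / (2.0 * math.pi * e)
    l_n = math.sqrt(pref * (n_a / n_d) / (n_a + n_d))
    l_p = math.sqrt(pref * (n_d / n_a) / (n_a + n_d))
    return l_n, l_p, l_n + l_p

if __name__ == "__main__":
    # Hypothetical sample: heavily p-doped, lightly n-doped junction.
    v_c = 0.7 * STATVOLT_PER_VOLT
    l_n, l_p, l_d = depletion_widths(v_c, n_a=1e18, n_d=1e16, eps=11.7)
    print(f"l_n = {l_n:.3e} cm, l_p = {l_p:.3e} cm, l_D = {l_d:.3e} cm")
```

With these sample inputs the depletion region lies almost entirely on the lightly doped side, which is the limiting behavior noted above: the lower dopant concentration sets the size of the depletion region.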

Figure 9.11. Illustration of band bending in doped heterojunctions, before and after contact. The left side in each case represents a material with smaller band gap than the right side. Top: a situation with p-doped small-gap material and n-doped large-gap material. Bottom: the situation with the reverse doping of the two sides. The band bending produces energy wells for electrons in the first case and for holes in the second case.

Two typical situations are shown in Fig. 9.11: in both cases the material on the left has a smaller band gap than the material on the right (this could represent, for example, a junction between GaAs on the left and Al$_x$Ga$_{1-x}$As on the right). When the Fermi levels of the two sides are aligned upon forming the contact, the bands are bent as usual to accommodate the common Fermi level. However, in these situations the contact potential is not the same for the electrons (conduction states) and the holes (valence states). As indicated in Fig. 9.11, in the case of a heterojunction with p-doping in the small-gap material and n-doping in the large-gap material, the contact potential for the conduction and valence states will be given by

$$eV_C^{(c)} = \delta\epsilon_F + \delta\epsilon_{cbm}, \qquad eV_C^{(v)} = \delta\epsilon_F + \delta\epsilon_{vbm}$$

whereas in the reverse case of n-doping in the small-gap material and p-doping in the large-gap material, the two contact potentials will be given by

$$eV_C^{(c)} = \delta\epsilon_F - \delta\epsilon_{cbm}, \qquad eV_C^{(v)} = \delta\epsilon_F - \delta\epsilon_{vbm}$$

where in both cases $\delta\epsilon_F$ is the difference in Fermi level positions and $\delta\epsilon_{cbm}$, $\delta\epsilon_{vbm}$ are the differences in the positions of the band extrema before contact.
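The two doping cases for the heterojunction contact potentials can be collected in one small helper. This is only a sketch: the function name and the `"p"`/`"n"` flag are hypothetical, and the three energy differences are taken as given inputs because the text does not spell out their sign conventions beyond what Fig. 9.11 shows.

```python
def heterojunction_contact_potentials(d_fermi, d_cbm, d_vbm, small_gap_doping):
    """Return (eV_C for conduction states, eV_C for valence states).

    d_fermi, d_cbm, d_vbm: differences in the Fermi level and band-extrema
    positions before contact, all in the same energy unit.
    small_gap_doping: "p" for a p-doped small-gap side (first case above),
    "n" for the reverse doping (second case above).
    """
    if small_gap_doping == "p":
        return d_fermi + d_cbm, d_fermi + d_vbm
    if small_gap_doping == "n":
        return d_fermi - d_cbm, d_fermi - d_vbm
    raise ValueError("small_gap_doping must be 'p' or 'n'")
```

In either case the two returned values differ by the difference of the two band-offset inputs, which expresses the statement that electrons and holes see different contact potentials.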

It is evident from Fig. 9.11 that in both situations there are discontinuities in the potential across the junction due to the different band gaps on the two sides. Another interesting and very important feature is the presence of energy wells due to these discontinuities: such a well for electron states is created on the p-doped side in the first case and a similar well for hole states is created on the n-doped side in the second case. The states associated with these wells are discrete in the $x$ direction, so if charge carriers are placed in the wells they will be localized in these discrete states and form a 2D gas parallel to the interface. Indeed, it is possible to populate these wells by additional dopant atoms far from the interface or by properly biasing the junction. The 2D gas of carriers can then be subjected to external magnetic fields, giving rise to very interesting quantum behavior. This phenomenon was discussed in more detail in chapter 7.

In real devices, the arrangement of n-doped and p-doped regions is more complicated. The most basic element of a device is the so called Metal–Oxide–Semiconductor-Field-Effect-Transistor (MOSFET). This element allows the operation of a rectifying channel with very little loss of power. A MOSFET is illustrated in Fig. 9.12. There are two n-doped regions buried in a larger p-doped region. The two n-doped regions act as source (S) and drain (D) of electrons. An external voltage is applied between the two n-doped regions with the two opposite poles attached to the source and drain through two metal electrodes which are separated by an insulating oxide layer. A different bias voltage is connected to an electrode placed at the bottom of the p-doped layer, called the body (B), and to another electrode placed above the insulating oxide layer, called the gate (G).

Figure 9.12. The basic features of a MOSFET: the source and drain, both n-doped regions, buried in a larger p-doped region and connected through two metal electrodes and an external voltage. The metal electrodes are separated by the oxide layer. Two additional metal electrodes, the gate and body, are attached to the oxide layer and to the bottom of the p-doped layer and are connected through the bias voltage. The conducting channel is between the two n-doped regions.

When a sufficiently large bias voltage is applied across the body and the gate electrodes, the holes in a region of the p-doped material below the gate are repelled, leaving a channel through which the electrons can travel from the source to the drain. The advantage of this arrangement is that no current flows between the body and the gate, even though it is this pair of electrodes to which the bias voltage is applied. Instead, the current flow is between the source and drain, which takes much less power to maintain, with correspondingly lower generation of heat. In modern devices there are several layers of this and more complicated arrangements of p-doped and n-doped regions interconnected by complex patterns of metal wires.

9.2.4 Metal–semiconductor junction

The connection between the metal electrodes and the semiconductor is of equal importance to the p–n junction for the operation of an electronic device. In particular, this connection affects the energy and occupation of electronic states on the semiconductor side, giving rise to effective barriers for electron transfer between the two sides. A particular model of this behavior, the formation of the so called Schottky barrier [120], is shown schematically in Fig. 9.13. When the metal and semiconductor are well separated, each has a well-defined Fermi level denoted by $\epsilon_F^{(m)}$ for the metal and $\epsilon_F^{(s)}$ for the semiconductor. For an n-doped semiconductor, the Fermi level lies close to the conduction band, as discussed above. The energy difference between the vacuum level, which is common to both systems, and the Fermi level is defined as the work function (the energy cost of removing electrons from the system); it is denoted by $\phi_m$ for the metal and $\phi_s$ for the semiconductor.

Figure 9.13. Band alignment in a metal–semiconductor junction: $\phi_m$, $\phi_s$ are the work functions of the metal and the semiconductor, respectively; $\chi_s$ is the electron affinity; $\epsilon_{vbm}$, $\epsilon_{cbm}$, $\epsilon_F^{(s)}$, $\epsilon_F^{(m)}$ represent the top of the valence band, the bottom of the conduction band, and the Fermi levels in the semiconductor and the metal, respectively; $\Delta\phi = \phi_m - \phi_s$ is the shift in work function; and $V_S$ is the potential (Schottky) barrier.

The energy difference between the conduction band minimum, denoted by $\epsilon_{cbm}$, and the vacuum level is called the electron affinity, denoted by $\chi_s$. When the metal and the semiconductor are brought into contact, the Fermi levels on the two sides have to be aligned. This is done by moving electrons from one side to the other, depending on the relative position of Fermi levels. For the case illustrated in Fig. 9.13, when the two Fermi levels are aligned, electrons have moved from the semiconductor (which originally had a higher Fermi level) to the metal. This creates a layer near the interface which has fewer electrons than usual on the semiconductor side, and more electrons than usual on the metal side, creating a charge depletion region on the semiconductor side. The presence of the depletion region makes it more difficult for electrons to flow across the interface. This corresponds to a potential barrier $V_S$, called the Schottky barrier. In the case of a junction between a metal and an n-type semiconductor, the Schottky barrier is given by

$$eV_S^{(n)} = \phi_m - \chi_s$$

as is evident from Fig. 9.13. Far away from the interface the relation of the semiconductor bands to the Fermi level should be the same as before the contact is formed. To achieve this, the electron energy bands of the semiconductor must bend, just like in the case of the p–n junction, since at the interface they must maintain their original relation to the metal bands. In the case of a junction between a metal and a p-type semiconductor, the band bending is in the opposite sense and the corresponding Schottky barrier is given by

$$eV_S^{(p)} = \epsilon_{gap} - (\phi_m - \chi_s)$$

Combining the two expressions for the metal/n-type and metal/p-type semiconductor contacts, we obtain

$$e\left[V_S^{(n)} + V_S^{(p)}\right] = \epsilon_{gap}$$

Two features of this picture of metal–semiconductor contact are worth emphasizing. First, it assumes there are no changes in the electronic structure of the metal or the semiconductor due to the presence of the interface between them, other than the band bending which comes from equilibrating the Fermi levels on both sides. Second, the Schottky barrier is proportional to the work function of the metal. Neither of these features is very realistic. The interface can induce dramatic changes in the electronic structure, as discussed in more detail in chapter 11, which alter the simple picture described above. Moreover, experiments indicate that measured Schottky barriers are indeed roughly proportional to the metal work function for large-gap semiconductors (ZnSe, ZnS), but they tend to be almost independent of the metal work function for small-gap semiconductors (Si, GaAs) [121].
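The two barrier expressions of the simple Schottky picture, and the fact that they sum to the band gap, can be sketched in a few lines of Python. The function name and the sample numbers (a metal work function of 5.1 eV, an electron affinity of 4.05 eV and a gap of 1.12 eV) are illustrative assumptions; as noted above, measured barriers for small-gap semiconductors often deviate from this idealized model.

```python
def schottky_barriers(phi_m, chi_s, e_gap):
    """Return (eV_S for an n-type contact, eV_S for a p-type contact).

    phi_m: metal work function, chi_s: semiconductor electron affinity,
    e_gap: semiconductor band gap; all in the same energy unit (e.g. eV).
    """
    ev_s_n = phi_m - chi_s            # metal / n-type semiconductor
    ev_s_p = e_gap - (phi_m - chi_s)  # metal / p-type semiconductor
    return ev_s_n, ev_s_p

if __name__ == "__main__":
    # Hypothetical sample values in eV, for illustration only.
    barrier_n, barrier_p = schottky_barriers(phi_m=5.1, chi_s=4.05, e_gap=1.12)
    print(f"n-type: {barrier_n:.2f} eV, p-type: {barrier_p:.2f} eV, "
          f"sum: {barrier_n + barrier_p:.2f} eV")
```

The sum of the two returned values always equals the gap, independent of the metal, which is the content of the combined relation above.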

The situation is further complicated by the presence of the insulating oxide layer between the metal and the semiconductor. Band bending occurs in this case as well, as shown for instance in Fig. 9.14 for a p-doped semiconductor. The interesting new feature is that the oxide layer can support an externally applied bias voltage, which we will refer to as the gate voltage $V_G$ (see also Fig. 9.12). The gate voltage moves the electronic states on the metal side down in energy by $eV_G$ relative to the common Fermi level. This produces additional band bending, which lowers both the valence and the conduction bands of the semiconductor in the immediate neighborhood of the interface. When the energy difference $eV_G$ is sufficiently large, it can produce an “inversion region”, that is, a narrow layer near the interface between the semiconductor and the oxide where the bands have been bent to the point where some conduction states of the semiconductor have moved below the Fermi level. When these states are occupied by electrons, current can flow from the source to the drain in the MOSFET. In the inversion layer the electrons in the occupied semiconductor conduction bands form a two-dimensional system of charge carriers, because the confining potential created by the distorted bands can support only one occupied level below the Fermi level. In such systems, interesting phenomena which are particular to two dimensions, such as the quantum Hall effect (integer and fractional) can be observed (see chapter 7).

Figure 9.14. Band alignment in a metal–oxide–semiconductor junction, for a p-doped semiconductor. $V_G$ is the gate (bias) voltage, which lowers the energy of electronic states in the metal by $eV_G$ relative to the common Fermi level. Compare the band energy in the inversion region with the confining potential in Fig. 7.9.

Further reading

1. Physics of Semiconductor Devices, S.M. Sze (J. Wiley, New York, 1981). This is a standard reference with extensive discussion of all aspects of semiconductor physics from the point of view of application in electronic devices.
2. The Physics of Semiconductors, K.F. Brennan (Cambridge University Press, Cambridge, 1999). This is a modern account of the physics of semiconductors, with extensive discussion of the basic methods for studying solids in general.

3. Interfaces in Crystalline Materials, A.P. Sutton and R.W. Balluffi (Clarendon Press, Oxford, 1995). This is a thorough and detailed discussion of all aspects of crystal interfaces, including a treatment of metal–semiconductor and semiconductor–semiconductor interfaces.

Problems

1. We wish to prove the relations for the point defect concentrations per site, given in Eqs. (9.2) and (9.4) for vacancies and interstitials. We consider a general point defect whose formation energy is $\epsilon_f$ relative to the ideal crystal, the energy of the latter being defined as the zero energy state. We assume that there are $N$ atoms and $N'$ defects in the crystal, occupying a total of $N + N'$ sites.[3] The number of atoms $N$ will be considered fixed, while the number of defects $N'$, and therefore the total number of crystal sites involved, will be varied to obtain the state of lowest free energy. We define the ratio of defects to atoms as $x = N'/N$.
   (a) Show that in the microcanonical ensemble the entropy of the system is given by
   $$S = k_B N \left[(1 + x)\ln(1 + x) - x \ln x\right]$$
   (b) Show that the free energy at zero external pressure, $F = E - TS$, is minimized for
   $$x = \frac{1}{e^{\epsilon_f / k_B T} - 1}$$
   (c) Using this result, show that the concentration of the defect per site of the crystal is given by
   $$c(T) \equiv \frac{N'}{N + N'} = e^{-\epsilon_f / k_B T}$$
   as claimed for the point defects discussed in the text.
2. The formation energy of vacancies in Si is approximately 3.4 eV, and that of interstitials is approximately 3.7 eV. Determine the relative concentration of vacancies and interstitials in Si at room temperature and at a temperature of 100 °C.
3. Prove the expressions for the reduced number of electrons $\bar{n}_c(T)$ or holes $\bar{p}_v(T)$ given by Eq. (9.38).
4. Using the fact that the number of electrons in the conduction band is equal to the number of holes in the valence band for an intrinsic semiconductor (containing no dopants), show that in the zero-temperature limit the Fermi level lies exactly at the middle of the band gap. Show that this result holds also for finite temperature if the densities of states at the VBM and CBM are the same.
5. Calculate the number of available carriers at room temperature in undoped Si and compare it with the number of carriers when it is doped with P donor impurities at a concentration of $10^{16}$ cm$^{-3}$ or with As donor impurities at a concentration of $10^{18}$ cm$^{-3}$.

[3] In this simple model we assume that the defect occupies a single crystal site, which is common for typical point defects, but can easily be extended to more general situations.

6. We will analyze the potential at a p–n junction employing a more realistic set of charge distributions than the uniform distributions assumed in the text. Our starting point will be the following expressions for the charge distributions in the n-doped and p-doped regions:
   $$\rho_n(x) = \tanh\left(\frac{x}{l_n}\right)\left[1 - \tanh^2\left(\frac{x}{l_n}\right)\right] eN^{(d)}, \quad x > 0$$
   $$\rho_p(x) = \tanh\left(\frac{x}{l_p}\right)\left[1 - \tanh^2\left(\frac{x}{l_p}\right)\right] eN^{(a)}, \quad x < 0$$
   (a) Plot these functions and show that they correspond to smooth distributions with no charge build up at the interface, $x = 0$.
   (b) Integrate Poisson’s equation once to obtain the derivative of the potential $d\Phi/dx$ and determine the constants of integration by physical considerations. Show that from this result the relation of Eq. (9.46) follows.
   (c) Integrate Poisson’s equation again to obtain the potential $\Phi(x)$ and determine the constants of integration by physical considerations. Calculate the contact potential by setting a reasonable cutoff for the asymptotic values, for example, the point at which 99% of the charge distribution is included in either side. From this, derive expressions for the total size of the depletion region $l_D = l_p + l_n$, analogous to Eq. (9.48).
7. Describe in detail the nature of band bending at a metal–semiconductor interface for all possible situations: there are four possible cases, depending on whether the semiconductor is n-doped or p-doped and on whether $\epsilon_F^{(s)} > \epsilon_F^{(m)}$ or $\epsilon_F^{(s)} < \epsilon_F^{(m)}$; of these only one was discussed in the text, shown in Fig. 9.13.

10 Defects II: line defects

Line defects in crystals are called dislocations. Dislocations had been considered in the context of the elastic continuum theory of solids, beginning with the work of Volterra [122], as a one-dimensional mathematical cut in a solid. Although initially viewed as useful but abstract constructs, dislocations became indispensable in understanding the mechanical properties of solids and in particular the nature of plastic deformation. In 1934, Orowan [123], Polanyi [124] and Taylor [125], each independently, made the connection between the atomistic structure of crystalline solids and the nature of dislocations; this concerned what is now called an “edge dislocation”. A few years later, Burgers [126] introduced the concept of a different type of dislocation, the “screw dislocation”. The existence of dislocations in crystalline solids is confirmed experimentally by a variety of methods. The most direct observation of dislocations comes from transmission electron microscopy, in which electrons pass through a thin slice of the material and their scattering from atomic centers produces an image of the crystalline lattice and its defects (see, for example, Refs. [127, 128]). A striking manifestation of the presence of dislocations is the spiral growth pattern on a surface produced by a screw dislocation. The field of dislocation properties and their relation to the mechanical behavior of solids is enormous. Suggestions for comprehensive reviews of this field, as well as some classic treatments, are given in the Further reading section.

10.1 Nature of dislocations

The simplest way to visualize a dislocation is to consider a simple cubic crystal consisting of two halves that meet on a horizontal plane, with the upper half containing one more vertical plane of atoms than the lower half. This is called an edge dislocation and is shown in Fig. 10.1. The points on the horizontal plane where the extra vertical plane of atoms ends form the dislocation line.

Figure 10.1. Illustration of an edge dislocation in a simple cubic crystal. The extra plane of atoms (labeled 0) is indicated by a vertical line terminated at an inverted T. The path shows the Burgers vector construction: starting at the lower right corner, we take six atomic-spacing steps in the $+y$ direction, six in the $-x$ direction, six in the $-y$ direction and six in the $+x$ direction; in this normally closed path, the end misses the beginning by the Burgers vector. The Burgers vector for this dislocation is along the $x$ axis, indicated by the small arrow labeled $\mathbf{b}$, and is perpendicular to the dislocation line which lies along the $z$ axis.

The region around the dislocation line is called the dislocation core, and involves significant distortion of the atoms from their crystalline positions in order to accommodate the extra plane of atoms: atoms on the upper half are squeezed closer together while atoms on the lower half are spread farther apart than they would be in the ideal crystal. Far away from the dislocation core the vertical planes on either side of the horizontal plane match smoothly.

A dislocation is characterized by the Burgers vector and its angle with respect to the dislocation line. The Burgers vector is the vector by which the end misses the beginning when a path is formed around the dislocation core, consisting of steps that would have led to a closed path in the perfect crystal. The Burgers vector of the edge dislocation is perpendicular to its line, as illustrated in Fig. 10.1. The Burgers vector of a full dislocation is one of the Bravais lattice vectors. The energetically preferred dislocations have as a Burgers vector the shortest lattice vector, for reasons which will be discussed in detail below.

There are different types of dislocations, depending on the crystal structure and the Burgers vector. Another characteristic example is a screw dislocation, which has a Burgers vector parallel to its line, as shown in Fig. 10.2. Dislocations in which the Burgers vector lies between the two extremes (parallel or perpendicular to the dislocation line) are called mixed dislocations. A dislocation is characterized by the direction of the dislocation line, denoted by $\hat{\xi}$, and its Burgers vector $\mathbf{b}$.

Figure 10.2. Illustration of a screw dislocation in a simple cubic crystal, in two views, top view on the left (along the dislocation line) and side view on the right. The path shows the Burgers vector construction: in the top view, starting at the lower right corner, we take five atomic-spacing steps in the $+y$ direction, five in the $-x$ direction, five in the $-y$ direction and five in the $+x$ direction; in this normally closed path, the end misses the beginning by the Burgers vector. The Burgers vector, shown in the side view, is indicated by the small arrow labeled $\mathbf{b}$ which lies along the $z$ axis, parallel to the dislocation line. The shaded circles in the top view represent atoms that would normally lie on the same plane with white circles but are at higher positions on the $z$ axis due to the presence of the dislocation.

For the two extreme cases, edge and screw dislocation, the following relations hold between the dislocation direction and Burgers vector:

$$\text{edge}: \ \hat{\xi}_e \cdot \mathbf{b}_e = 0; \qquad \text{screw}: \ \hat{\xi}_s \cdot \mathbf{b}_s = \pm b_s$$

as is evident from Figs. 10.1 and 10.2. In general, dislocations can combine to form dislocations of a different type. For example, two edge dislocations of opposite Burgers vector can cancel each other; this so called “dislocation annihilation” can be easily rationalized if we consider one of them corresponding to an extra plane on the upper half of the crystal and the other to an extra plane on the lower half of the crystal: when the two extra planes are brought together, the defect in the crystal is annihilated. This can be generalized to the notion of reactions between dislocations, in which the resulting dislocation has a Burgers vector which is the vector sum of the two initial Burgers vectors. We return to this issue below.

Individual dislocations cannot begin or end within the solid without introducing additional defects. As a consequence, dislocations in a real solid must extend all the way to the surface, or form a closed loop, or form nodes at which they meet with other dislocations. Examples of a dislocation loop and dislocation nodes are shown in Fig. 10.3.
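The edge and screw relations, and the vector-sum rule for dislocation reactions, translate directly into a short Python sketch. The function names, the tolerance and the sample vectors (in units of the lattice constant of a simple cubic crystal) are assumptions made for illustration.

```python
import math

def dislocation_character(b, xi, tol=1e-9):
    """Classify a dislocation from its Burgers vector b and line direction xi.

    Uses the relations xi . b = 0 (edge) and xi . b = +/- |b| (screw);
    anything in between is reported as mixed.
    """
    dot = sum(bi * xi_i for bi, xi_i in zip(b, xi))
    b_norm = math.sqrt(sum(bi * bi for bi in b))
    xi_norm = math.sqrt(sum(x * x for x in xi))
    cos_theta = dot / (b_norm * xi_norm)
    if abs(cos_theta) < tol:
        return "edge"
    if abs(abs(cos_theta) - 1.0) < tol:
        return "screw"
    return "mixed"

def react(b1, b2):
    """Burgers vector of the dislocation formed when two dislocations combine."""
    return tuple(x + y for x, y in zip(b1, b2))

if __name__ == "__main__":
    line = (0, 0, 1)                                # dislocation line along z
    print(dislocation_character((1, 0, 0), line))   # b along x, as in Fig. 10.1
    print(dislocation_character((0, 0, 1), line))   # b along z, as in Fig. 10.2
    print(react((1, 0, 0), (-1, 0, 0)))             # opposite edge dislocations
```

The last call returns a zero Burgers vector, which corresponds to the annihilation of two edge dislocations of opposite Burgers vector described above.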

Figure 10.3. Illustration of a dislocation loop (left) and a network of dislocations meeting at nodes (right), with $\mathbf{b}_i$ denoting Burgers vectors. In the dislocation loop there exist segments of edge character ($\mathbf{b}$ perpendicular to dislocation line), screw character ($\mathbf{b}$ parallel to dislocation line) and mixed character ($\mathbf{b}$ at some intermediate angle to dislocation line). The Burgers vectors at each node sum to zero.

Since a dislocation is characterized by a unique Burgers vector, in a dislocation loop the Burgers vector and the dislocation line will be parallel in certain segments, perpendicular in other segments and at some intermediate angle at other segments. Thus, the dislocation loop will consist of segments with screw, edge and mixed character, as indicated in Fig. 10.3. Dislocation nodes are defect structures at which finite segments of dislocations begin or end without creating any other extended defects in the crystal. Therefore, a path enclosing all the dislocations that meet at a node will not involve a Burgers vector displacement, or, equivalently, the sum of Burgers vectors of dislocations meeting at a node must be zero. Dislocations can form regular networks of lines and nodes, as illustrated in Fig. 10.3.

One of the most important features of dislocations is that they can move easily through the crystal. A closer look at the example of the edge dislocation demonstrates this point: a small displacement of the atomic columns near the dislocation core would move the dislocation from one position between adjacent lower-half vertical planes to the next equivalent position, as illustrated in Fig. 10.4. The energy cost for such a displacement, per unit length of the dislocation, is very small because the bonds of atoms in the dislocation core are already stretched and deformed. The total energy for moving the entire dislocation is of course infinite for an infinite dislocation line, but then the energy of the dislocation itself is infinite even at rest, due to the elastic distortion it induces to the crystal, without counting the energy cost of forming the dislocation core (see discussion in later sections of this chapter). These infinities are artifacts of the infinite crystal and the single, infinite dislocation line, which are both idealizations. In reality dislocations start and end at other defects, as mentioned already, while their motion involves the sequential displacement of small sections of the dislocation line.

It is also easy to see how the displacement of an edge dislocation by steps similar to the one described above eventually leads to a permanent deformation of the solid: after a sufficient number of steps the dislocation will be expelled from the crystal and the upper half will differ from the lower half by a half plane, forming a ledge at the far end of the crystal, as shown in Fig. 10.4. This process is the main mechanism for plastic deformation of crystals.

Figure 10.4. Illustration of how an edge dislocation can be moved by one lattice vector to the right by the slight displacement of a few atomic columns near the core (left and middle panels), as indicated by the small arrows. Repeating this step eventually leads to a deformation of the solid, where the upper half and the lower half differ by one half plane (right panel), and the dislocation has been expelled from the crystal. $\tau$ is the external shear stress that forces the dislocation to move in the fashion indicated over a width $w$, and $b$ is the Burgers vector of the dislocation; the length of the dislocation is $l$ in the direction perpendicular to the plane of the figure.

Within the idealized situation of a single dislocation in an infinite crystal, it is possible to obtain the force per unit length of the dislocation due to the presence of external stress. If the width of the crystal is $w$ and the length of the dislocation is $l$, then the work $W$ done by an external shear stress $\tau$ to deform the crystal by moving an edge dislocation through it, in the configuration of Fig. 10.4, is

$$W = (\tau w l)\, b$$

If we assume that this is accomplished by a constant force, called the "Peach–Koehler" force $F_{PK}$, acting uniformly along the dislocation line, then the work done by this force will be given by

$$W = F_{PK}\, w$$

Equating the two expressions for the work, we find that the force per unit length of the dislocation due to the external stress $\tau$ is given by

$$f_{PK} = \frac{1}{l} F_{PK} = \tau b$$
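The work argument above can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates $f_{PK} = \tau b$ from the two expressions for the work, and compares it with the standard vector form of the Peach–Koehler force, $\mathbf{f}_{PK} = (\sigma \cdot \mathbf{b}) \times \hat{\boldsymbol{\xi}}$, for the geometry of Fig. 10.4 (dislocation line along $z$, Burgers vector along $x$, shear stress $\sigma_{xy} = \tau$). The numerical values and the function names are illustrative assumptions.

```python
def mat_vec(sigma, b):
    """Return the vector sigma . b for a 3x3 stress tensor and a 3-vector."""
    return [sum(sigma[i][j] * b[j] for j in range(3)) for i in range(3)]


def cross(u, v):
    """Cross product u x v of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def peach_koehler(sigma, b, xi):
    """Force per unit length f = (sigma . b) x xi on a dislocation line."""
    return cross(mat_vec(sigma, b), xi)


# Illustrative (assumed) numbers in SI units; not taken from the text.
tau = 2.0e7             # applied shear stress sigma_xy, Pa
b_mag = 2.5e-10         # magnitude of the Burgers vector, m
w, l = 1.0e-3, 1.0e-3   # crystal width and dislocation length, m

# Work argument: W = (tau * w * l) * b = F_PK * w, hence f_PK = F_PK / l.
work = (tau * w * l) * b_mag
f_scalar = work / (w * l)

# Vector form for the same geometry: line along z, b along x.
sigma = [[0.0, tau, 0.0],
         [tau, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
f_vec = peach_koehler(sigma, [b_mag, 0.0, 0.0], [0.0, 0.0, 1.0])

print("f_PK from work argument:", f_scalar, "N/m")
print("f_PK from vector form  :", f_vec, "N/m")
```

In this configuration the force points along $x$, that is, within the glide plane and perpendicular to the dislocation line, with magnitude $\tau b$.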

This expression can be generalized to an arbitrary dislocation line described by the vector $\hat{\boldsymbol{\xi}}$, and external stress described by the tensor $\sigma$, to

$$\mathbf{f}_{PK} = (\sigma \cdot \mathbf{b}) \times \hat{\boldsymbol{\xi}} \tag{10.1}$$

This force is evidently always perpendicular to the direction of the dislocation line $\hat{\boldsymbol{\xi}}$, and it is non-zero if there is a component of the stress tensor $\sigma$ parallel to the Burgers vector $\mathbf{b}$, as in the example of the edge dislocation under shear stress $\tau$, shown in Fig. 10.4.

The ease with which dislocations move, and the distortions they induce throughout the crystal, make them very important defects for mediating the mechanical response of solids. Dislocations are also important in terms of the electronic properties of solids. In semiconductors, dislocations induce states in the gap which act like traps for electrons or holes. When the dislocation line lies in a direction that produces a short circuit, its effect can be disastrous to the operation of the device. Because dislocations induce long-range strain in crystals, and therefore respond to externally applied macroscopic stresses, they are typically described in the context of continuum elasticity theory. In this context, the atomic structure of the dislocation core does not enter directly. The basic concepts of elasticity theory are reviewed in Appendix E. Of the many important effects of dislocations we will only discuss briefly their mobility and its relation to mechanical behavior, such as brittle or ductile response to external loading. For more involved treatments we refer the reader to the specialized books mentioned at the end of the chapter.

10.2 Elastic properties and motion of dislocations

Although many aspects of dislocation shape and motion depend on the crystal structure and the material in which the dislocation exists, some general features can be described by phenomenological models without specifying those details. The values that enter into these phenomenological models can then be fitted to reproduce, to the extent possible, the properties of dislocations in specific solids. A widely used phenomenological model is due to Peierls and Nabarro [129, 130]. This model actually provides some very powerful insights into dislocation properties, so we will examine its basic features in this section. It is interesting that although almost 60 years old, this model still serves as the basis of many contemporary quantitative attempts to describe dislocations in various solids. Before delving into the Peierls–Nabarro model we discuss some general results concerning the stress field and elastic energy of a dislocation.

10.2.1 Stress and strain fields

We examine first the stress fields for infinite straight edge and screw dislocations. To this end, we define the coordinate system to be such that the dislocation line is the $z$ axis, the horizontal axis is the $x$ axis and the vertical axis is the $y$ axis (see Figs. 10.1 and 10.2). We also define the glide plane through its normal vector $\hat{\mathbf{n}}$ which is given by

$$\hat{\mathbf{n}} = \frac{\hat{\boldsymbol{\xi}} \times \mathbf{b}}{|\mathbf{b}|}$$

For the edge dislocation shown in Fig. 10.1, the glide plane (also referred to as the slip plane) is the $xz$ plane. For the screw dislocation, the Burgers vector is $\mathbf{b}_s = b_s \hat{\mathbf{z}}$, while for the edge dislocation it is $\mathbf{b}_e = b_e \hat{\mathbf{x}}$. The magnitudes of the Burgers vectors, $b_s$, $b_e$, depend on the crystal. It is convenient to use the cylindrical coordinate system $(r, \theta, z)$ to express the stress fields of dislocations. The stress components for the screw and edge dislocations in these coordinates, as well as in the standard cartesian coordinates, are given in Table 10.1 for an isotropic solid. The derivation of these expressions is a straightforward application of continuum elasticity theory, with certain assumptions for the symmetry of the problem and the long-range behavior of the stress fields (see Problems 1 and 2 for details).

Table 10.1. The stress fields for the screw and edge dislocations. The components of the fields are given in polar $(r, \theta, z)$ and cartesian $(x, y, z)$ coordinates.

Polar coordinates

| $\sigma_{ij}$ | Screw | Edge |
|---|---|---|
| $\sigma_{rr}$ | $0$ | $-K_e b_e \dfrac{\sin\theta}{r}$ |
| $\sigma_{\theta\theta}$ | $0$ | $-K_e b_e \dfrac{\sin\theta}{r}$ |
| $\sigma_{zz}$ | $0$ | $-2\nu K_e b_e \dfrac{\sin\theta}{r}$ |
| $\sigma_{r\theta}$ | $0$ | $K_e b_e \dfrac{\cos\theta}{r}$ |
| $\sigma_{\theta z}$ | $K_s b_s \dfrac{1}{r}$ | $0$ |
| $\sigma_{zr}$ | $0$ | $0$ |

Cartesian coordinates

| $\sigma_{ij}$ | Screw | Edge |
|---|---|---|
| $\sigma_{xx}$ | $0$ | $-K_e b_e \dfrac{(3x^2 + y^2)\,y}{(x^2 + y^2)^2}$ |
| $\sigma_{yy}$ | $0$ | $K_e b_e \dfrac{(x^2 - y^2)\,y}{(x^2 + y^2)^2}$ |
| $\sigma_{zz}$ | $0$ | $-2\nu K_e b_e \dfrac{y}{x^2 + y^2}$ |
| $\sigma_{xy}$ | $0$ | $K_e b_e \dfrac{x\,(x^2 - y^2)}{(x^2 + y^2)^2}$ |
| $\sigma_{yz}$ | $K_s b_s \dfrac{x}{x^2 + y^2}$ | $0$ |
| $\sigma_{zx}$ | $-K_s b_s \dfrac{y}{x^2 + y^2}$ | $0$ |

All components include the appropriate Burgers vectors $b_s$, $b_e$ and the corresponding elastic constants, which for the screw and edge dislocations, respectively, are given by

$$K_s = \frac{\mu}{2\pi}, \qquad K_e = \frac{\mu}{2\pi(1 - \nu)} \tag{10.2}$$

The elastic constants $K_s$ and $K_e$ naturally involve the shear modulus $\mu$ and Poisson's ratio $\nu$, defined for an isotropic solid (see Appendix E). Plots of constant stress contours for various components of the stress tensors are shown in Fig. 10.5.

Figure 10.5. The contours of constant stress for the various stress components of the screw (top panel: $\sigma_{xz}$, $\sigma_{yz}$) and edge (bottom panel: $\sigma_{xx}$, $\sigma_{xy}$, $\sigma_{yy}$) dislocations, as given by the expressions of Table 10.1 in cartesian coordinates; white represents large positive values and black represents large negative values. The $\sigma_{zz}$ component of the edge dislocation is identical in form to the $\sigma_{xz}$ component of the screw dislocation.

An interesting consequence of these results is that the hydrostatic component of the stress, $\sigma = (\sigma_{rr} + \sigma_{\theta\theta} + \sigma_{zz})/3$, is zero for the screw dislocation, but takes the value

$$\sigma = -\frac{2}{3} K_e b_e (1 + \nu) \frac{\sin\theta}{r} \tag{10.3}$$

for the edge dislocation, that is, it is a compressive stress.
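As a consistency check on Table 10.1 and Eq. (10.3), the following minimal sketch evaluates the cartesian stress components of the edge dislocation at an arbitrary point and compares one third of their trace with the polar expression for the hydrostatic stress. It also verifies that the cartesian screw components combine into $\sigma_{\theta z} = K_s b_s / r$. The function names and the dimensionless parameter values are assumptions made for illustration only.

```python
import math


def elastic_constants(mu, nu):
    """K_s and K_e of Eq. (10.2) for shear modulus mu and Poisson's ratio nu."""
    k_s = mu / (2.0 * math.pi)
    k_e = mu / (2.0 * math.pi * (1.0 - nu))
    return k_s, k_e


def edge_stress_cartesian(x, y, k_e, b_e, nu):
    """Non-zero cartesian stress components of the edge dislocation (Table 10.1)."""
    r2 = x * x + y * y
    return {
        "xx": -k_e * b_e * (3.0 * x * x + y * y) * y / r2 ** 2,
        "yy": k_e * b_e * (x * x - y * y) * y / r2 ** 2,
        "zz": -2.0 * nu * k_e * b_e * y / r2,
        "xy": k_e * b_e * x * (x * x - y * y) / r2 ** 2,
    }


def screw_stress_cartesian(x, y, k_s, b_s):
    """Non-zero cartesian stress components of the screw dislocation (Table 10.1)."""
    r2 = x * x + y * y
    return {"yz": k_s * b_s * x / r2, "zx": -k_s * b_s * y / r2}


def edge_hydrostatic(r, theta, k_e, b_e, nu):
    """Hydrostatic stress of the edge dislocation, Eq. (10.3)."""
    return -(2.0 / 3.0) * k_e * b_e * (1.0 + nu) * math.sin(theta) / r


# Dimensionless illustrative values (assumed): mu = 1, b = 1, nu = 0.25.
mu, nu, b = 1.0, 0.25, 1.0
k_s, k_e = elastic_constants(mu, nu)

r, theta = 3.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)

edge = edge_stress_cartesian(x, y, k_e, b, nu)
trace_third = (edge["xx"] + edge["yy"] + edge["zz"]) / 3.0  # trace is invariant

screw = screw_stress_cartesian(x, y, k_s, b)
# Rotate the cartesian shear components into the theta-z component.
sigma_theta_z = -math.sin(theta) * screw["zx"] + math.cos(theta) * screw["yz"]

print("hydrostatic (cartesian trace):", trace_third)
print("hydrostatic (Eq. 10.3)       :", edge_hydrostatic(r, theta, k_e, b, nu))
print("screw sigma_theta_z * r      :", sigma_theta_z * r, "vs K_s b_s =", k_s * b)
```

The trace of the stress tensor does not depend on the coordinate system, which is why the cartesian and polar entries of the table can be compared directly.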

For mixed dislocations with a screw and an edge component, the corresponding quantities are a combination of the results discussed so far. For instance, for a mixed dislocation in which the angle between the dislocation line and the Burgers vector is $\theta$, so that $b\cos\theta$ is the screw component and $b\sin\theta$ is the edge component, the corresponding elastic constant is given by

$$K_{mix} = \frac{\mu}{2\pi}\left[\frac{1}{1 - \nu}\sin^2\theta + \cos^2\theta\right] \tag{10.4}$$

It is also interesting to analyze the displacement fields for the two types of dislocations. For the screw dislocation, the displacement field on the $xy$ plane is given by

$$u_x(x, y) = u_y(x, y) = 0, \qquad u_z(x, y) = \frac{b_s}{2\pi}\tan^{-1}\frac{y}{x} \tag{10.5}$$

a result derived from simple physical considerations, namely that far from the dislocation core $u_z$ goes uniformly from zero to $b_s$ as $\theta$ ranges from zero to $2\pi$, while the other two components of the displacement field vanish identically (see also Problem 1). For the edge dislocation, we can use for the stress field the stress–strain relations for an isotropic solid to obtain the strain field and from that, by integration, the displacement field. From the results given above we find for the diagonal components of the strain field of the edge dislocation

$$\epsilon_{xx} = \frac{-b_e}{4\pi(1 - \nu)}\left[\frac{2x^2 y}{(x^2 + y^2)^2} + (1 - 2\nu)\frac{y}{x^2 + y^2}\right] \tag{10.6}$$

$$\epsilon_{yy} = \frac{b_e}{4\pi(1 - \nu)}\left[\frac{y\,(x^2 - y^2)}{(x^2 + y^2)^2} + 2\nu\frac{y}{x^2 + y^2}\right] \tag{10.7}$$

and integrating the first with respect to $x$ and the second with respect to $y$, we obtain

$$u_x(x, y) = \frac{b_e}{4\pi(1 - \nu)}\left[2(1 - \nu)\tan^{-1}\frac{y}{x} + \frac{xy}{x^2 + y^2}\right] \tag{10.8}$$

$$u_y(x, y) = \frac{-b_e}{4\pi(1 - \nu)}\left[\frac{1 - 2\nu}{2}\ln(x^2 + y^2) + \frac{x^2 - y^2}{2(x^2 + y^2)}\right] \tag{10.9}$$

There are two constants of integration involved in obtaining these results: the one for $u_x$ is chosen so that $u_x(x, 0) = 0$ and the one for $u_y$ is chosen so that $u_y(x, y)$ is a symmetric expression in the variables $x$ and $y$; these choices are of course not unique. Plots of $u_z(x, y)$ for the screw dislocation and of $u_x(x, y)$ for the edge dislocation are given in Fig. 10.6; for these examples we used a typical average value $\nu = 0.25$ (for most solids $\nu$ lies in the range 0.1–0.4, see Appendix E).
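The relation between the strain and displacement fields of the edge dislocation can be verified numerically. The sketch below, written under the assumption that $\tan^{-1}$ denotes the principal value (as returned by `math.atan`), checks that the derivative of $u_x$ in Eq. (10.8) with respect to $x$ reproduces $\epsilon_{xx}$ of Eq. (10.6), and evaluates the jumps in $u_x$ and $u_z$ implied by the $\tan^{-1}(y/x)$ term. The function names, the sample point and the step size are illustrative choices.

```python
import math


def u_x_edge(x, y, b_e, nu):
    """x component of the edge-dislocation displacement, Eq. (10.8); x must be non-zero."""
    pref = b_e / (4.0 * math.pi * (1.0 - nu))
    return pref * (2.0 * (1.0 - nu) * math.atan(y / x) + x * y / (x * x + y * y))


def eps_xx_edge(x, y, b_e, nu):
    """xx component of the edge-dislocation strain, Eq. (10.6)."""
    r2 = x * x + y * y
    pref = -b_e / (4.0 * math.pi * (1.0 - nu))
    return pref * (2.0 * x * x * y / r2 ** 2 + (1.0 - 2.0 * nu) * y / r2)


def u_z_screw(x, y, b_s):
    """z component of the screw-dislocation displacement, Eq. (10.5); x must be non-zero."""
    return b_s / (2.0 * math.pi) * math.atan(y / x)


b, nu = 1.0, 0.25      # lengths in units of the Burgers vector (assumed)
x0, y0, h = 0.7, 0.4, 1.0e-6

# Central finite difference of u_x with respect to x.
dux_dx = (u_x_edge(x0 + h, y0, b, nu) - u_x_edge(x0 - h, y0, b, nu)) / (2.0 * h)

# Jump of u_x across x = 0 at fixed y > 0, and shift of u_z along y at fixed x > 0.
jump_u_x = u_x_edge(1.0e-9, 1.0, b, nu) - u_x_edge(-1.0e-9, 1.0, b, nu)
shift_u_z = u_z_screw(1.0, 1.0e9, b) - u_z_screw(1.0, -1.0e9, b)

print("d(u_x)/dx :", dux_dx, " eps_xx:", eps_xx_edge(x0, y0, b, nu))
print("jump in u_x across x = 0 (units of b_e):", jump_u_x)
print("shift in u_z along y (units of b_s)    :", shift_u_z)
```

Both the jump in $u_x$ and the shift in $u_z$ come out as half a Burgers vector, with the other half supplied by the opposite side of the Burgers circuit.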

[Figure 10.6: two panels, "screw, $u_z(x, y)$" plotted against $y$ for fixed values of $x$, and "edge, $u_x(x, y)$" plotted against $x$ for fixed values of $y$.]

Figure 10.6. The $u_z$ and $u_x$ components of the displacement fields for the screw and edge dislocations, in units of the Burgers vectors. The values of the $x$ and $y$ variables are also scaled by the Burgers vectors. For negative values of $x$ in the case of the screw dislocation, or of $y$ in the case of the edge dislocation, the curves are reversed: they become their mirror images with respect to the vertical axis. For these examples we used a typical value of $\nu = 0.25$.

Certain features of these displacement fields are worth pointing out. Specifically, for the screw dislocation, we note first that there is a shift in $u_z$ by $b_s/2$ when going from $-\infty$ to $+\infty$ along the $y$ axis for a given value of $x > 0$; there is a similar shift of $b_s/2$ for the corresponding $x < 0$ value, thus completing a total shift by $b_s$ along a Burgers circuit. The displacement field $u_z$ is a smooth function of $y$ and tends to a step function when $x \to 0$: this is sensible in the context of continuum elasticity, since for $x = 0$ there must be a jump in the displacement to accommodate the dislocation core, as is evident from the schematic representation of Fig. 10.2. For large values of $x$ the displacement $u_z$ is very gradual and the shift by $b_s/2$ takes place over a very wide range of $y$ values.

For the edge dislocation, the displacement $u_x$ is a discontinuous function of $x$. The discontinuity in this component occurs at $x = 0$ and is exactly $b_e/2$ for a given value of $y > 0$. The total shift by $b_e$ is completed when the corresponding path at $y < 0$ is included in the Burgers circuit. For $y \to \pm\infty$, the displacement $u_x$ becomes a one-step function, going abruptly from $\mp b_e/4$ to $\pm b_e/4$ at $x = 0$. For $y \to 0^+$ it becomes a three-step function, being zero all the way to $x \to 0^-$, then jumping to $-b_e/4$, next jumping to $+b_e/4$ and finally jumping back to zero for $x \to 0^+$ (the reverse jumps occur for $y \to 0^-$). These values follow from Eq. (10.8) and are consistent with the jump of $b_e/2$ at $x = 0$. The discontinuity is entirely due to the $\tan^{-1}(y/x)$ term in Eq. (10.8), the other term being equal to zero for $x = 0$. The other component of the displacement, $u_y$, is even more problematic at the origin, $(x, y) = (0, 0)$, due to the $\ln(x^2 + y^2)$ term which blows up. This pathological behavior is a reflection of the limitations of continuum elasticity: the $u_y$ component must describe the presence of an extra plane of atoms in going from $y < 0$ to $y > 0$ at $x = 0$.

Since there is no explicit information about atomic sites in continuum elasticity, the presence of the extra plane of atoms at $x = 0$, indicated in the schematic representation of the edge dislocation in Fig. 10.1, is reflected by the pathological behavior of the $u_y$ component of the displacement field. In other words, the description of the physical system based on continuum elasticity fails when we approach the core region of the edge dislocation; in fact, the expressions for the stresses in this case were actually derived with the assumption that $r = \sqrt{x^2 + y^2}$ is very large on the scale of interatomic distances (see Problem 2), and thus are valid only far away from the dislocation core. A more realistic treatment, which takes into account the discrete nature of the atomic planes but retains some of the appealing features of the continuum approach, the Peierls–Nabarro model, is discussed below.

10.2.2 Elastic energy

We next examine the energy associated with the elastic distortion that an infinite straight dislocation induces in the crystal. We will consider first a simplified model to obtain the essential behavior of the elastic energy, that is, to motivate the presence of two terms: one arising from the strain far from the core and the other from the dislocation core. It turns out that the contribution of the first term is infinite. This has to do with the fact that both the strain and the stress induced by the dislocation fall off slowly, like $\sim 1/r$ (see Table 10.1 for the stress, while the strain is proportional to the stress in an isotropic solid, as shown in Appendix E). The contribution to the elastic energy is given by the product of the stress and strain which, when integrated over the plane perpendicular to the dislocation line, gives a logarithmically divergent term. The idealized model we will consider is illustrated in Fig. 10.7: it consists of an edge dislocation in the same geometry as shown earlier in Figs. 10.1 and 10.4.

Figure 10.7. Idealized model of an edge dislocation for the calculation of the elastic energy: the displacement field, which spreads over many atomic sites in the $x$ and $y$ directions around the core (shaded area on left panel), is assumed to be confined on the glide plane ($xz$), as shown schematically on the right panel. The infinitesimal dislocation at $x'$ gives rise to a shear stress $\sigma_{xy}(x, 0)$ at another point $x$, where the displacement is $u(x)$.

In this idealized model we will assume that the displacement is confined to the glide plane, identified as the $xz$ plane in the geometry of Fig. 10.1. With this assumption, symmetry considerations lead to the conclusion that the displacement is a scalar function, which we will denote as $u(x)$ and take to be a continuous function of the position $x$, with the dislocation core at $x = 0$. We will also find it useful to employ another function, the dislocation density $\rho(x)$, defined as

$$\rho(x) = -2\frac{du(x)}{dx} \tag{10.10}$$

This quantity is useful because it describes the disregistry across the glide plane, which must integrate to the Burgers vector, $b_e$. The disregistry (also called the misfit) across the glide plane can be defined as

$$\Delta u(x) = \lim_{y \to 0}\left[u_x(x, y) - u_x(x, -y)\right] = 2u_x(x, 0) \to 2u(x) \tag{10.11}$$

where we have used the expression for the $x$ component of the displacement of an edge dislocation derived earlier, Eq. (10.8), which in the end we identified with the scalar displacement of the present idealized model. The definition of the disregistry leads to the equivalent definition of the dislocation density,

$$\rho(x) = -\frac{d\Delta u(x)}{dx} \tag{10.12}$$

Integrating the dislocation density over all $x$ we find

$$\int_{-\infty}^{\infty}\rho(x)\,dx = -2\lim_{\epsilon \to 0}\left[\int_{-\infty}^{-\epsilon}\frac{du(x)}{dx}dx + \int_{\epsilon}^{\infty}\frac{du(x)}{dx}dx\right] = -2\left[\int_{0}^{-b_e/4}du + \int_{b_e/4}^{0}du\right] = b_e \tag{10.13}$$

where we have again used Eq. (10.8) to determine the displacement of an edge dislocation on the glide plane. The final result is what we expected from the definition of the dislocation density.

With the displacement a continuous function of $x$, we will treat the dislocation as if it were composed of a sequence of infinitesimal dislocations [131]: the infinitesimal dislocation between $x'$ and $x' + dx'$ has a Burgers vector

$$db_e(x') = -2\left(\frac{du}{dx}\right)_{x = x'}dx' = \rho(x')\,dx' \tag{10.14}$$

This infinitesimal dislocation produces a shear stress at some other point $x$ which, from the expressions derived earlier (see Table 10.1), is given by

$$\sigma_{xy}(x, 0) = K_e\frac{db_e(x')}{x - x'}$$

Here we think of the "core" of the infinitesimal dislocation as being located at $x'$. The shear stress at the point $x$ is a force per unit area on the $xz$ plane whose surface-normal unit vector is $\hat{\mathbf{y}}$ (see definition of the stress tensor in Appendix E). The displacement $u(x)$ necessary to create the infinitesimal dislocation at $x$ takes place in the presence of this force from the dislocation at $x'$, giving the following contribution to the elastic energy from the latter infinitesimal dislocation:

$$dU_e^{(el)} = K_e\frac{db_e(x')}{x - x'}u(x)$$

Integrating this expression over all values of $x$ from $-L$ to $L$ (with $L$ large enough to accommodate the range of the displacement field), and over $db_e(x')$ to account for the contributions from all infinitesimal dislocations, we obtain for the elastic energy of the edge dislocation

$$U_e^{(el)}[u(x)] = \frac{K_e}{2}\int_{-L}^{L}\int_{-b_e/2}^{b_e/2}u(x)\frac{1}{x - x'}\,db_e(x')\,dx$$

In the above expression, we have introduced a factor of 1/2 to account for the double-counting of the interactions between infinitesimal dislocations. We next employ the expression given in Eq. (10.14) for $db_e(x')$, we perform an integration by parts over the variable $x$, and we use the expression for the dislocation density from Eq. (10.10), to arrive at the following expression for the elastic energy:

$$U_e^{(el)}[u(x)] = \frac{K_e b_e^2}{2}\ln(L) - \frac{K_e}{2}\int_{-L}^{L}\int_{-L}^{L}\rho(x)\rho(x')\ln|x - x'|\,dx\,dx' \tag{10.15}$$

This result clearly separates the contribution of the long-range elastic field of the dislocation, embodied in the first term, from the contribution of the large distortions at the dislocation core, embodied in the second term. The first term of the elastic energy in Eq. (10.15) is infinite for $L \to \infty$, which is an artifact of the assumption of a single, infinitely long straight dislocation in an infinite crystal. In practice, there are many dislocations in a crystal and they are not straight and infinitely long. A typical dislocation density is $10^5$ cm of dislocation line per cm$^3$ of the crystal, expressed as $10^5$ cm$^{-2}$. The dislocations tend to cancel the elastic fields of each other, providing a natural cutoff for the extent of the elastic field of any given segment of a dislocation. Thus, the first term in the elastic energy expression does not lead to an unphysical picture. Since the contribution of this term is essentially determined by the density of dislocations in the crystal, it is not an interesting term from the point of view of the atomic structure of the dislocation core. Accordingly, we will drop this first term, and concentrate on the second term, which includes the energy due to the dislocation core, as it depends exclusively on the distribution of the dislocation displacement $u(x)$ (or, equivalently, the disregistry $\Delta u(x)$), through the dislocation density $\rho(x)$.

We outline yet another way to obtain an expression for the elastic energy associated with the dislocation, using the expressions for the stress fields provided in Table 10.1, and assuming that the dislocation interacts with its own average stress field as it is being created. This approach can be applied to both screw and edge dislocations. Imagine for example that we create a screw dislocation by cutting the crystal along the glide plane ($xz$ plane) for $x > 0$ and then displacing the part above the cut relative to the part below, by an amount equal to the Burgers vector $b_s$ along $z$. During this procedure the average stress at a distance $r$ from the dislocation line will be half of the value $\sigma_{\theta z}$ for the screw dislocation, that is, $\frac{1}{2}K_s b_s/r$. In order to obtain the energy per unit length corresponding to this distortion, given by the stress $\times$ the corresponding strain, we must integrate over all values of the radius. We take as a measure of the strain the displacement $b_s$, by which the planes of atoms are misplaced. We will also need to introduce two limits for the integration over the radius, an inner limit $r_c$ (called the core radius) and an outer limit $L$. The integration then gives

$$U_s^{(el)} = \frac{1}{2}K_s b_s\int_{r_c}^{L}\frac{b_s}{r}\,dr = \frac{1}{2}K_s b_s^2\ln(L) - \frac{1}{2}K_s b_s^2\ln(r_c) \tag{10.16}$$

in which the first term is identical to the one in Eq. (10.15), and the second includes the core energy due to the disregistry $\Delta u$. In the case of the edge dislocation, the cut is again on the glide plane ($xz$ plane) for $x > 0$, and the part above this plane is displaced relative to the part below by an amount equal to the Burgers vector $b_e$ along $x$. Since the misfit is along the $x$ axis we only need to integrate the value of the $\sigma_{r\theta}$ component for $\theta = 0$ along $x$, from $r_c$ to $L$. We obtain

$$U_e^{(el)} = \frac{1}{2}K_e b_e\int_{r_c}^{L}\frac{b_e}{r}\,dr = \frac{1}{2}K_e b_e^2\ln(L) - \frac{1}{2}K_e b_e^2\ln(r_c) \tag{10.17}$$

a result identical in form to that for the screw dislocation. If the dislocation is not pure edge or pure screw it can be thought of as having an edge component and a screw component. The Burgers vector is composed of the screw and edge components, which are orthogonal, lying along the dislocation line and perpendicular to it, respectively. With the angle between the dislocation line and the Burgers vector defined as $\theta$, the elastic energy of the mixed dislocation will be given by

$$U_{mix}^{(el)} = \frac{b^2}{2}K_{mix}\ln\left(\frac{L}{r_c}\right) = \frac{b^2}{2}\,\mu\left[\frac{1}{2\pi}\cos^2\theta + \frac{1}{2\pi(1 - \nu)}\sin^2\theta\right]\ln\left(\frac{L}{r_c}\right) \tag{10.18}$$

where we have used the expression for the elastic constant of the mixed dislocation, Eq. (10.4).
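To get a feeling for the magnitudes involved in Eq. (10.18), the following minimal sketch evaluates the elastic energy per unit length for screw, edge and mixed dislocations. All numerical inputs are assumptions chosen for illustration: a shear modulus of 30 GPa, $\nu = 0.25$, a Burgers vector of 2.5 Å, a core radius equal to $b$, and an outer cutoff equal to the mean dislocation spacing for a density of $10^5$ cm$^{-2}$.

```python
import math


def k_mixed(mu, nu, theta):
    """Elastic constant of a mixed dislocation, Eq. (10.4); theta in radians."""
    return (mu / (2.0 * math.pi)) * (
        math.cos(theta) ** 2 + math.sin(theta) ** 2 / (1.0 - nu)
    )


def elastic_energy(mu, nu, b, theta, outer, r_c):
    """Elastic energy per unit length, Eq. (10.18), with cutoffs r_c and outer."""
    return 0.5 * k_mixed(mu, nu, theta) * b * b * math.log(outer / r_c)


# Assumed, illustrative inputs (SI units); none of these values is from the text.
mu = 30.0e9                      # shear modulus, Pa
nu = 0.25                        # Poisson's ratio
b = 2.5e-10                      # Burgers vector magnitude, m
r_c = b                          # core radius, taken equal to b
outer = 1.0 / math.sqrt(1.0e9)   # mean spacing for 10^5 cm^-2 = 10^9 m^-2, in m

u_screw = elastic_energy(mu, nu, b, 0.0, outer, r_c)
u_edge = elastic_energy(mu, nu, b, 0.5 * math.pi, outer, r_c)
u_mixed = elastic_energy(mu, nu, b, math.pi / 3.0, outer, r_c)

ev_per_angstrom = 1.602176634e-19 / 1.0e-10   # J/m corresponding to 1 eV per angstrom
for name, value in (("screw", u_screw), ("mixed 60 deg", u_mixed), ("edge", u_edge)):
    print(f"{name:>12s}: {value:.3e} J/m = {value / ev_per_angstrom:.2f} eV/angstrom")
```

The ratio of the edge to the screw energy is $1/(1 - \nu)$, independent of the cutoffs, and the dependence on the cutoffs themselves is only logarithmic.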

In all cases, edge, screw and mixed dislocations, the elastic energy is proportional to $\mu b^2$. This result justifies our earlier claim that the Burgers vector is the shortest lattice vector, since this corresponds to the lowest energy dislocation. This result is also useful in interpreting the presence of so called "partial dislocations". These are dislocations which from very far away look like a single dislocation, but locally they look like two separate dislocations. Their Burgers vectors $\mathbf{b}_1$, $\mathbf{b}_2$ have magnitudes which are shorter than any lattice vector, but they add up to a total Burgers vector which, as usual, is a lattice vector, $\mathbf{b}_1 + \mathbf{b}_2 = \mathbf{b}$. The condition for the existence of partial dislocations is

$$b_1^2 + b_2^2 < b^2$$

because then they are energetically preferred over the full dislocation. Since $b^2 = b_1^2 + b_2^2 + 2\,\mathbf{b}_1 \cdot \mathbf{b}_2$, this condition is equivalent to $\mathbf{b}_1 \cdot \mathbf{b}_2 > 0$: in the triangle formed by $\mathbf{b}_1$, $\mathbf{b}_2$ and $\mathbf{b}$, the interior angle between the sides $\mathbf{b}_1$ and $\mathbf{b}_2$ must be greater than $90^\circ$, as illustrated in Fig. 10.8. When this condition is met, a full dislocation will be split into two partials, as long as the energy of the stacking fault which appears between the split dislocations, as shown in Fig. 10.8, does not overcompensate for the energy gain due to splitting. More specifically, if the stacking fault energy per unit area is $\gamma_{sf}$, the partial dislocations will be split by a distance $d$ which must satisfy

$$K(b_1^2 + b_2^2) + \gamma_{sf}\,d \le K b^2$$

with $K$ the relevant elastic constant. If the values of $d$ in the above expression happen to be small on the interatomic scale because of a large $\gamma_{sf}$, then we cannot speak of splitting of partial dislocations; this situation occurs, for instance, in Al. An example of Burgers vectors for full and partial dislocations in an FCC crystal is shown in Fig. 10.8.

Figure 10.8. Left: splitting of a full dislocation with Burgers vector $\mathbf{b}$ into two partials with Burgers vectors $\mathbf{b}_1$ and $\mathbf{b}_2$, where $\mathbf{b}_1 + \mathbf{b}_2 = \mathbf{b}$; the partial dislocations are connected by a stacking fault (hatched area). Right: the conventional cubic unit cell of an FCC crystal with the slip plane shaded and the vectors $\mathbf{b}$, $\mathbf{b}_1$, $\mathbf{b}_2$ indicated for this example: note that $\mathbf{b}$ is the shortest lattice vector but $\mathbf{b}_1$, $\mathbf{b}_2$ are not lattice vectors.
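The energy criterion for splitting can be illustrated with a concrete FCC example. The sketch below assumes the standard choice of a full dislocation with $\mathbf{b} = \frac{a}{2}[110]$ splitting into two partials $\mathbf{b}_1 = \frac{a}{6}[211]$ and $\mathbf{b}_2 = \frac{a}{6}[12\bar{1}]$; these specific vectors are an assumption made here for illustration and are not spelled out in the text. Exact rational arithmetic is used, with lengths in units of the cubic lattice constant $a$. The helper `max_split_distance` simply rearranges the inequality $K(b_1^2 + b_2^2) + \gamma_{sf} d \le K b^2$ for $d$.

```python
from fractions import Fraction as F


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def max_split_distance(k, b_sq, b1_sq, b2_sq, gamma_sf):
    """Largest d allowed by K (b1^2 + b2^2) + gamma_sf d <= K b^2."""
    return k * (b_sq - b1_sq - b2_sq) / gamma_sf


# Burgers vectors in units of the cubic lattice constant a (assumed example).
b_full = [F(1, 2), F(1, 2), F(0)]     # a/2 [1 1 0], shortest FCC lattice vector
b1 = [F(2, 6), F(1, 6), F(1, 6)]      # a/6 [2 1 1]
b2 = [F(1, 6), F(2, 6), F(-1, 6)]     # a/6 [1 2 -1]

b_sum = [p + q for p, q in zip(b1, b2)]
b_sq, b1_sq, b2_sq = dot(b_full, b_full), dot(b1, b1), dot(b2, b2)

print("b1 + b2 == b      :", b_sum == b_full)
print("b^2, b1^2 + b2^2  :", b_sq, b1_sq + b2_sq)
print("splitting favored :", b1_sq + b2_sq < b_sq)
print("b1 . b2           :", dot(b1, b2))

# Dimensionless illustration of the bound on the splitting distance.
print("max d for K = 1, gamma_sf = 1/100:",
      max_split_distance(F(1), b_sq, b1_sq, b2_sq, F(1, 100)))
```

For this example $b_1^2 + b_2^2 = a^2/3$ is smaller than $b^2 = a^2/2$, so splitting lowers the elastic energy, and the bound on $d$ grows as the stacking fault energy decreases.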

The above argument assumes that the partial dislocations are not interacting and that the only contributions to the total energy of the system come from isolated dislocations. Since the interaction between the partial dislocations is repulsive, if the energetic conditions are met then splitting will occur.

10.2.3 Peierls–Nabarro model

The Peierls–Nabarro (PN) model relies on two important concepts.¹ The first concept is that the dislocation can be described in terms of a continuous displacement distribution $u(\mathbf{r})$, as was done above in deriving the expressions for the elastic energy. The second concept is that a misfit $u$ between two planes of the crystal corresponds to an energy cost of $\gamma(u)$ per unit area on the plane of the misfit. This is an important quantity, called the generalized stacking fault energy or $\gamma$-surface, originally introduced by Vitek [132]. It has proven very useful in studying dislocation core properties from first-principles calculations [132–134]. In the PN theory, the $\gamma(u)$ energy cost is identified with the energy cost of displacing two semi-infinite halves of a crystal relative to each other uniformly by $u$, across a crystal plane. The crucial argument in the PN theory is that the elastic energy of the dislocation core is balanced by the energy cost of introducing the misfit in the lattice.

In the following discussion we will drop, for the reasons we mentioned earlier, the infinite term $\frac{1}{2}Kb^2\ln(L)$ which appears in all the expressions we derived above. We will also adopt the expression of Eq. (10.15) for the elastic energy of the dislocation core, which was derived above for an edge dislocation. Thus, our discussion of the PN model strictly speaking applies only to an edge dislocation, but the model can be generalized to other types of dislocations (see, for example, Eshelby's generalization to screw dislocations [131]). For simplicity of notation we will drop the subscript "e" denoting the edge character of the dislocation. Moreover, we can take advantage of the relation between the misfit $\Delta u(x)$ and the displacement $u(x)$, Eq. (10.11), to express all quantities in terms of either the misfit or the displacement field. The energy cost due to the misfit will be given by an integral of $\gamma(u)$ over the range of the misfit. This leads to the following expression for the total energy of the dislocation:

$$U^{(tot)}[u(x)] = -\frac{K}{2}\int\!\!\int\rho(x)\rho(x')\ln|x - x'|\,dx\,dx' + \int\gamma(u(x))\,dx \tag{10.19}$$

¹ The treatment presented here does not follow the traditional approach of guessing a sinusoidal expression for the shear stress (see for example the treatment by Hirth and Lothe, mentioned in the Further reading section). Instead, we adopt a more general point of view based on a variational argument for the total energy of the dislocation, and introduce the sinusoidal behavior as a possible simple choice for the displacement potential. The essence of the resulting equations is the same as in traditional approaches.

A variational derivative of the total energy, Eq. (10.19), with respect to the dislocation density $\rho(x)$ leads to the PN integro-differential equation:

$$2K\int_{-b/2}^{b/2}\frac{1}{x - x'}\,du(x') + \frac{d\gamma(u(x))}{du(x)} = 0 \tag{10.20}$$

The first term is the elastic stress at point $x$ due to the infinitesimal dislocation $\rho(x')dx'$ at point $x'$; the second term represents the restoring stress due to the non-linear misfit potential acting across the slip plane. This potential must be a periodic function of $u(x)$ with a period equal to the Burgers vector of the dislocation. As the simplest possible model, we can assume a sinusoidal function for the misfit potential, which is referred to as the Frenkel model [136]. One choice is (see Problem 4 for a different possible choice)

$$\gamma(u) = \frac{\gamma_{us}}{2}\left[1 - \cos\frac{2\pi u}{b}\right] \tag{10.21}$$

where $\gamma_{us}$ is the amplitude of the misfit energy variation (this is called the "unstable stacking energy" [137] and is a quantity important in determining the brittle or ductile character of the solid, as discussed in the following section). This form of the potential ensures that it vanishes when there is no misfit ($u = 0$) or the misfit is an integral multiple of the Burgers vector (the latter case corresponds to having passed from one side of the dislocation core to the other). Using the sinusoidal potential of Eq. (10.21) in the PN integro-differential Eq. (10.20), we can obtain an analytic solution for the misfit. Taking the misfit to decrease from $b$ to zero across the core, so that the dislocation density defined in Eq. (10.12) is positive, the solution is

$$u(x) = \frac{b}{2} - \frac{b}{\pi}\tan^{-1}\frac{x}{\zeta}, \qquad \zeta = \frac{Kb^2}{2\pi\gamma_{us}}$$

$$\rho(x) = -\frac{du}{dx} = \frac{b\,\zeta}{\pi(\zeta^2 + x^2)} \tag{10.22}$$

A typical dislocation profile is shown in Fig. 10.9. In this figure, we also present the dislocation profiles that each term in Eq. (10.20) would tend to produce, if acting alone. The elastic stress term would produce a very broad dislocation to minimize the elastic energy, while the restoring stress would produce a very narrow dislocation to minimize the misfit energy. The resulting dislocation profile is a compromise between these two tendencies.

One of the achievements of the PN theory is that it provides a reasonable estimate of the dislocation size. The optimal size of the dislocation core, characterized by the value of $\zeta$, is a result of the competition between the two energy terms in Eq. (10.20), as shown schematically in Fig. 10.9.
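The analytic PN profile lends itself to a quick numerical check. The following minimal sketch assumes the sign convention in which the misfit decreases from $b$ to zero across the core, so that $\rho = -du/dx$ is positive; the parameter values are dimensionless and purely illustrative. It verifies that the Frenkel potential vanishes at $u = 0$ and $u = b$, that $-du/dx$ reproduces the Lorentzian density, and that half of the total misfit is accommodated within $|x| \le \zeta$.

```python
import math


def pn_half_width(k, b, gamma_us):
    """Core half-width zeta = K b^2 / (2 pi gamma_us) of the PN solution."""
    return k * b * b / (2.0 * math.pi * gamma_us)


def misfit_potential(u, b, gamma_us):
    """Sinusoidal (Frenkel) misfit potential, Eq. (10.21)."""
    return 0.5 * gamma_us * (1.0 - math.cos(2.0 * math.pi * u / b))


def pn_misfit(x, b, zeta):
    """PN misfit profile, decreasing from b (x -> -inf) to 0 (x -> +inf)."""
    return 0.5 * b - (b / math.pi) * math.atan(x / zeta)


def pn_density(x, b, zeta):
    """Dislocation density rho(x) = -du/dx of the PN solution (a Lorentzian)."""
    return b * zeta / (math.pi * (zeta * zeta + x * x))


# Dimensionless illustrative parameters (assumed).
k, b, gamma_us = 1.0, 1.0, 0.1
zeta = pn_half_width(k, b, gamma_us)

# Finite-difference check that rho = -du/dx at an arbitrary point.
x0, h = 0.3, 1.0e-6
minus_du_dx = -(pn_misfit(x0 + h, b, zeta) - pn_misfit(x0 - h, b, zeta)) / (2.0 * h)

total_misfit = pn_misfit(-1.0e12, b, zeta) - pn_misfit(1.0e12, b, zeta)
core_misfit = pn_misfit(-zeta, b, zeta) - pn_misfit(zeta, b, zeta)

print("zeta                      :", zeta)
print("-du/dx vs rho at x0       :", minus_du_dx, pn_density(x0, b, zeta))
print("total misfit (should be b):", total_misfit)
print("misfit within |x| <= zeta :", core_misfit)
```

Because $\zeta$ scales as $K/\gamma_{us}$, raising the unstable stacking energy in this sketch narrows the profile, while raising the elastic constant broadens it.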

If the unstable stacking energy $\gamma_{us}$ is high or the elastic moduli $K$ are low, the misfit energy dominates and the dislocation becomes narrow ($\zeta$ is small) in order to minimize the misfit energy, i.e. the second term in Eq. (10.20). In the opposite case, if $\gamma_{us}$ is low or $K$ is large, the dislocation spreads out in order to minimize the elastic energy, i.e. the first term in Eq. (10.20), which is dominant. In either case, the failures of the treatment based strictly on continuum elasticity theory discussed earlier are avoided, and there are no unphysical discontinuities in the displacement fields.

Figure 10.9. Profile of an edge dislocation. Top: the disregistry or misfit $u(x)$ as dictated by the minimization of the elastic energy (solid line) or the misfit energy (dashed line) and the corresponding densities $\rho(x)$, given by Eq. (10.10). Bottom: the disregistry and density as obtained from the Peierls–Nabarro model, which represents a compromise between the two tendencies.

Yet another important achievement of PN theory is that it makes possible the calculation of the shear stress required to move a dislocation in a crystal. This, however, requires a modification of what we have discussed so far. The expression for the total energy of the dislocation, Eq. (10.19), is invariant with respect to arbitrary translation of the dislocation density $\rho(x) \to \rho(x + t)$. The dislocation described by the PN solution Eq. (10.22) does not experience any resistance as it moves through the lattice. This is clearly unrealistic, and is a consequence of neglecting the discrete nature of the lattice: the PN model views the solid as an isotropic continuous medium. The only effect of the lattice periodicity so far comes from the periodicity of the misfit potential with period $b$, which corresponds to a lattice vector of the crystal.

a resistance to the motion of dislocations through the lattice, the PN model was modified so that the misfit potential is not sampled continuously but only at the positions of the atomic planes. This amounts to the following modification of the second term in the total energy of the dislocation in Eq. (10.19):

$$\int \gamma(u(x))\,dx \;\rightarrow\; \sum_{n=-\infty}^{\infty} \gamma(u(x_n))\,\Delta x \qquad (10.23)$$

where $x_n$ are the positions of atomic planes and $\Delta x$ is the spacing of atomic rows in the lattice. With this modification, when the PN solution, Eq. (10.22), is translated through the lattice, the energy will have a periodic variation with period equal to the distance between two equivalent atomic rows in the crystal (this distance can be different from the Burgers vector $b$). The amplitude of this periodic variation in the energy is called the Peierls energy.

Having introduced an energy variation as a function of the dislocation position, which leads to an energy barrier to the motion of the dislocation, we can obtain the stress required to move the dislocation without any thermal activation. This stress can be defined as the maximum slope of the variation in the energy as a function of the translation.² Using this definition, Peierls and Nabarro showed that the shear stress for dislocation motion, the so-called Peierls stress $\sigma_P$, is given by

$$\sigma_P = \frac{2\mu}{1-\nu}\exp\left[-\frac{2\pi a}{(1-\nu)b}\right] \qquad (10.24)$$

with $b$ the Burgers vector and $a$ the lattice spacing across the glide plane. While this is a truly oversimplified model for dislocation motion, it does provide some interesting insight. Experimentally, the Peierls stress can be estimated by extrapolating to zero temperature the critical resolved yield stress, i.e. the stress beyond which plastic deformation (corresponding to dislocation motion) sets in. This gives Peierls stress values measured in terms of the shear modulus ($\sigma_P/\mu$) of order $10^{-5}$ for close-packed FCC and HCP metals, $5 \times 10^{-3}$ for BCC metals, $10^{-5}$–$10^{-4}$ for ionic crystals, and $10^{-2}$–$1$ for compound and elemental semiconductors in the zincblende and diamond lattices. It therefore explains why in some crystals it is possible to have plastic deformation for shear stress values several orders of magnitude below the shear modulus: it is all due to dislocation motion! In particular, it is interesting to note that in covalently bonded crystals where dislocation activity is restricted by the strong directional bonds between atoms, the ratio $\sigma_P/\mu$ is of order unity, which implies that these crystals do not yield plastically, that is, they are brittle solids (see also the discussion in section 10.3).

² An alternative definition of this stress is the shear stress which, when applied to the crystal, makes the energy barrier to dislocation motion vanish. Finding the stress through this definition relies on computational approaches and is not useful for obtaining an analytical expression.
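The effect of sampling the misfit potential only at the atomic planes, Eq. (10.23), and the exponential sensitivity of Eq. (10.24) to the ratio $a/b$ can be checked numerically. The following is a minimal Python sketch, not part of the original treatment. It assumes the sinusoidal misfit potential with period $b$ together with an arctangent disregistry profile $u(x) = (b/\pi)\arctan(x/\zeta) + b/2$ of half-width $\zeta$ (an assumed form for the PN solution, which is not written out in this section); under these assumptions $\gamma(u(x))$ reduces to $\gamma_{us}\zeta^2/(x^2+\zeta^2)$. The function names, the cutoff `n_max` and the numerical values are illustrative choices only.

```python
import math


def peierls_stress_ratio(a_over_b, nu):
    """Return sigma_P / mu from Eq. (10.24) for a given a/b and Poisson ratio nu."""
    return 2.0 / (1.0 - nu) * math.exp(-2.0 * math.pi * a_over_b / (1.0 - nu))


def sampled_misfit_energy(shift, zeta, dx, gamma_us=1.0, n_max=20000):
    """Discrete misfit energy, right-hand side of Eq. (10.23).

    Assumption: sinusoidal misfit potential and arctangent profile of
    half-width zeta centred at x = shift, so that
    gamma(u(x)) = gamma_us * zeta**2 / ((x - shift)**2 + zeta**2).
    The infinite sum is truncated at |n| <= n_max.
    """
    total = 0.0
    for n in range(-n_max, n_max + 1):
        x = n * dx - shift  # position of plane n relative to the dislocation centre
        total += gamma_us * zeta ** 2 / (x ** 2 + zeta ** 2)
    return total * dx


# Illustrative (hypothetical) parameters: plane spacing 1, half-width 0.5.
dx, zeta = 1.0, 0.5
continuum = math.pi * zeta  # integral of gamma(u(x)) dx, independent of the shift
for shift in (0.0, 0.25, 0.5, 1.0):
    energy = sampled_misfit_energy(shift, zeta, dx)
    print(f"shift = {shift:4.2f}  sampled = {energy:.5f}  continuum = {continuum:.5f}")

# Exponential dependence of the Peierls stress on a/b at fixed nu.
for a_over_b in (0.5, 1.0, 1.5):
    print(f"a/b = {a_over_b:3.1f}  sigma_P/mu = {peierls_stress_ratio(a_over_b, 0.3):.3e}")
```

In this sketch the continuum integral does not depend on the position of the dislocation, whereas the sampled sum varies periodically with the shift, with period equal to the plane spacing; the difference between its largest and smallest values plays the role of the Peierls energy.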

(10. Thus. and subsequent motion of the ensuing kink–antikink in the direction of the dislocation line. where the energy is minimized. through formation of kink–antikink pairs. This is illustrated in Fig. for ﬁxed values of the elastic constants µ. the value of the Peierls stress is extremely sensitive (exponential dependence) to the ratio (a /b). Therefore. ν .10: the dislocation line in the equilibrium conﬁguration resides in the Peierls valley. In contrast to this. and these are the crystals exhibiting easy dislocation motion. The actual motion of dislocations is believed to take place through small segments of the dislocation moving over the Peierls barrier. In close-packed metallic systems. Notice that there are two aspects to this criterion: the spacing between atomic planes across the glide plane a . crystals with more complex unit cells have relatively large Burgers vectors and small spacing between atomic planes across the glide plane. the value of (a /b) is large. 10.24). according to this simple theory.10. the dislocations corresponding to the smallest Burgers vector and to the glide plane with the largest possible spacing between successive atomic planes will dominate. in a given crystal the motion of dislocations corresponding to the largest value of (a /b) will dominate. A section of the dislocation may overcome the Peierls energy barrier by creating a kink–antikink dislocation line kink Peierls barrier antikink Peierls energy Peierls valley Figure 10.2 Elastic properties and motion of dislocations 369 As can be easily seen from Eq.10. and the Burgers vector of the dislocation b. the shear stress for dislocation motion cannot be overcome before fracturing the solid. Dislocation motion in the Peierls energy landscape. giving large Peierls stress. In these solids. .

pair and moving into the next Peierls valley. The kinks can then move along the dislocation line, eventually displacing the entire dislocation over the Peierls energy barrier. Presumably it is much easier to move the kinks in the direction of the dislocation line rather than the entire line all at once in the direction perpendicular to it. The influence of the core structure, the effect of temperature, and the presence of impurities, all play an important role in the mobility of dislocations, which is central to the mechanical behavior of solids.

10.3 Brittle versus ductile behavior

The broadest classification of solids in terms of their mechanical properties divides them into two categories, brittle and ductile. Brittle solids fracture under the influence of external stresses. Ductile solids respond to external loads by deforming plastically. Typical stress–strain curves for brittle and ductile solids are shown in Fig. 10.11: a brittle solid is usually characterized by a large Young's modulus (see Appendix E), but can withstand only limited tensile strain, beyond which it fractures; it also remains in the elastic regime (linear stress–strain relation) up to the fracture point: if the external load is released the solid returns to its initial state. A ductile solid, on the other hand, has lower Young's modulus but does not break until much larger strain is applied. Beyond a certain amount of strain the solid starts deforming plastically, due to the introduction and motion of dislocations, as discussed in the previous section. The point beyond which the ductile solid is no longer elastic is called the yield point, characterized by the yield stress, $\sigma_y$, and yield strain, $\epsilon_y$. If the external load is released after the yield point has been passed the solid does not return to its original state but has a permanent deformation. Often, the stress just above the yield point exhibits a dip as a function of strain, because dislocations at this point multiply fast, so that a smaller stress is needed to maintain a constant strain rate. This behavior is illustrated in Fig. 10.11.

Figure 10.11. Stress $\sigma$ versus strain $\epsilon$ relationships for typical brittle or ductile solids. The asterisks indicate the fracture points. The triangles in the elastic regime, of fixed length in the strain, indicate the corresponding Young's moduli, $Y_b$, $Y_d$. The yield point of the ductile solid is characterized by the yield stress $\sigma_y$ and the yield strain $\epsilon_y$.

Formulating a criterion to discriminate between brittle and ductile response based on atomistic features of the solid has been a long sought after goal in the mechanics of solids. At the phenomenological level, theories have been developed that characterize the two types of behavior in terms of cleavage of the crystal or the nucleation and motion of dislocations [138, 139]. We review here the basic elements of these notions.

10.3.1 Stress and strain under external load

We begin with some general considerations of how a solid responds to external loading. In all real solids there exist cracks of different sizes. The question of brittle or ductile response reduces to what happens to the cracks under external loading. The basic idea is that if the crack propagates into the solid under the influence of the external stress, the response is described as brittle, whereas if the crack blunts and absorbs the external load by deforming plastically, the response is described as ductile. The propagation of the crack involves the breaking of bonds between atoms at the very tip of the crack in a manner that leads to cleavage. The blunting of the crack requires the generation and motion of dislocations in the neighborhood of the crack tip; these are the defects that can lead to plastic deformation. Thus, the brittle or ductile nature of the solid is related to what happens at the atomistic level near the tip of pre-existing cracks under external loading.

The manner in which the external forces are applied to the crack geometry can lead to different types of loading, described as mode I, II and III; this is illustrated in Fig. 10.12. In mode I the applied stress is pure tension, while in modes II and III the applied stress is pure shear.

Figure 10.12. Definition of the three loading modes of a crack: mode I involves pure tension, and modes II and III involve pure shear in the two possible directions on the plane of the crack.

Before we examine the phenomenological models that relate microscopic features to brittle or ductile behavior, we will present the continuum elasticity picture

of a loaded crack. The first solution of this problem was produced for an idealized 2D geometry consisting of a very narrow crack of length $2a$ ($x$ direction), infinitesimal height ($y$ direction) and infinite width ($z$ direction), as illustrated in Fig. 10.13; this is usually called a "penny crack". The solid is loaded by a uniform stress $\sigma_\infty$ very far from the penny crack, in what is essentially mode I loading for the crack. The solution to this problem gives a stress field in the direction of the loading and along the extension of the crack:

$$\sigma_{yy}\big|_{y=0,\,x>a} = \frac{\sigma_\infty\, x}{\sqrt{x^2 - a^2}}$$

with the origin of the coordinate axes placed at the center of the crack. Letting $x = r + a$ and expanding the expression for $\sigma_{yy}$ in powers of $r$, we find to lowest order

$$\sigma_{yy}\big|_{y=0,\,r>0} = \frac{\sigma_\infty \sqrt{\pi a}}{\sqrt{2\pi r}}$$

This is an intriguing result, indicating that very near the crack tip, $r \rightarrow 0$, the stress diverges as $1/\sqrt{r}$. The general solution for the stress near the crack has the form

$$\sigma_{ij} = \frac{K}{\sqrt{2\pi r}}\, f_{ij}(\theta) + \sum_{n=0}^{N} \alpha_{ij}^{(n)}(\theta)\, r^{n/2} \qquad (10.25)$$

where $f_{ij}(\theta)$ is a universal function normalized so that $f_{ij}(0) = 1$. The constant $K$ is called the "stress intensity factor". The higher order terms in $r^{n/2}$, involving the constants $\alpha_{ij}^{(n)}$, are bounded for $r \rightarrow 0$ and can be neglected in analyzing the behavior in the neighborhood of the crack tip.

Figure 10.13. Left: definition of stress components at a distance $r$ from the crack tip in polar coordinates, $(r, \theta)$. The crack runs along the $z$ axis. Right: the penny crack geometry in a solid under uniform external loading $\sigma_\infty$ very far from the crack.

For the geometry of Fig. 10.13, which is referred to as "plane strain", since by symmetry the strain is confined to the $xy$ plane ($u_z = 0$), the above expression put

in polar coordinates produces to lowest order in $r$

$$\sigma_{rr} = \frac{K_I}{\sqrt{2\pi r}}\left[2 - \cos^2\frac{\theta}{2}\right]\cos\frac{\theta}{2} \qquad (10.26)$$

$$\sigma_{r\theta} = \frac{K_I}{\sqrt{2\pi r}}\cos^2\frac{\theta}{2}\,\sin\frac{\theta}{2} \qquad (10.27)$$

$$\sigma_{\theta\theta} = \frac{K_I}{\sqrt{2\pi r}}\cos^3\frac{\theta}{2} \qquad (10.28)$$

while the displacement field $\mathbf{u}$ to lowest order in $r$ is

$$u_x = \frac{K_I(\kappa - \cos\theta)}{2\mu}\sqrt{\frac{r}{2\pi}}\,\cos\frac{\theta}{2} \qquad (10.29)$$

$$u_y = \frac{K_I(\kappa - \cos\theta)}{2\mu}\sqrt{\frac{r}{2\pi}}\,\sin\frac{\theta}{2} \qquad (10.30)$$

where $\kappa = 3 - 4\nu$ for plane strain (see Problem 5). The expression for the stress, evaluated at the plane which is an extension of the crack and ahead of the crack (($x > 0$, $y = 0$) or $\theta = 0$, see Fig. 10.13), gives

$$\sigma_{yy}\big|_{\theta=0} = \frac{K_I}{\sqrt{2\pi r}} \qquad (10.31)$$

Comparing this result to that for the penny crack loaded in mode I, we find that in the latter case the stress intensity factor is given by $K_I = \sigma_\infty\sqrt{\pi a}$. The expression for the displacement field, evaluated on either side of the opening behind the crack (($x < 0$, $y = 0$) or $\theta = \pm\pi$, see Fig. 10.13), gives

$$\Delta u_y = u_y\big|_{\theta=\pi} - u_y\big|_{\theta=-\pi} = \frac{4(1-\nu)}{\mu}\,K_I\sqrt{\frac{r}{2\pi}} \qquad (10.32)$$

The generalization of these results to mode II and mode III loading gives

$$\text{mode II}: \quad \sigma_{yx} = \frac{K_{II}}{\sqrt{2\pi r}}, \qquad \Delta u_x = \frac{4(1-\nu)}{\mu}\,K_{II}\sqrt{\frac{r}{2\pi}}$$

$$\text{mode III}: \quad \sigma_{yz} = \frac{K_{III}}{\sqrt{2\pi r}}, \qquad \Delta u_z = \frac{4}{\mu}\,K_{III}\sqrt{\frac{r}{2\pi}}$$

where $\sigma_{yx}$, $\sigma_{yz}$ are the dominant stress components on the plane which is an extension of the crack and ahead of the crack ($\theta = 0$), and $\Delta u_x$, $\Delta u_z$ refer to the displacement discontinuity behind the crack ($\theta = \pm\pi$).

These results have interesting physical implications. First, the divergence of the stress near the crack tip as $1/\sqrt{r}$ for $r \rightarrow 0$, Eq. (10.31), means that there are very large forces exerted on the atoms in the neighborhood of the crack tip. Of course in a real solid the forces on atoms cannot diverge, because beyond a certain

point the bonds between atoms are broken and there is effectively no interaction between them. This bond breaking can lead to cleavage, that is, separation of the two surfaces on either side of the crack plane, or to the creation of dislocations, that is, plastic deformation of the solid. These two possibilities correspond to brittle or ductile response as already mentioned above. Which of the two possibilities will be preferred is dictated by the microscopic structure and bonding in the solid. Second, the displacement field is proportional to $r^{1/2}$, Eq. (10.32), indicating that the distortion of the solid can indeed be large far from the crack tip while right at the crack tip it is infinitesimal.

10.3.2 Brittle fracture – Griffith criterion

What remains to be established is a criterion that will differentiate between the tendency of the solid to respond to external loading by brittle fracture or ductile deformation. This is not an easy task, because it implies a connection between very complex processes at the atomistic level and the macroscopic response of the solid; in fact, this issue remains one of active research at present. Nevertheless, phenomenological theories do exist which capture the essence of this issue to a remarkable extent. In an early work, Griffith developed a criterion for the conditions under which brittle fracture will occur [140]. He showed that the critical rate³ of energy per unit area $G_b$ required to open an existing crack by an infinitesimal amount in mode I loading is given by

$$G_b = 2\gamma_s \qquad (10.33)$$

where $\gamma_s$ is the surface energy (energy per unit area on the exposed surface of each side of the crack). This result can be derived from a simple energy-balance argument. Consider a crack of length $a$ and infinite width in the perpendicular direction, as illustrated in Fig. 10.14. The crack is loaded in mode I. If the load is $P$ and the extension of the solid in the direction of the load is $\delta y$, then the change in internal energy $U$ per unit width of the crack, for a fixed crack length, will be given by

$$\delta U = P\,\delta y, \qquad \delta a = 0$$

This can be generalized to the case of a small extension of the crack by $\delta a$, by introducing the energy release rate $G_b$ related to this extension:

$$\delta U = P\,\delta y - G_b\,\delta a, \qquad \delta a \neq 0 \qquad (10.34)$$

³ The word rate here is used not in a temporal sense but in a spatial sense, as in energy per unit crack area; this choice conforms with the terminology in the literature.

Figure 10.14. Schematic representation of key notions in brittle and ductile response. Left: the top panel illustrates how the crack opens in mode I loading (pure tension) by cleavage; the thin solid lines indicate the original position of the crack, the thicker solid lines indicate its final position after it has propagated by a small amount $\delta a$. The middle panel illustrates cleavage of the crystal along the cleavage plane. The bottom panel indicates the behavior of the energy during cleavage. Right: the top panel illustrates how the crack blunts in mode II loading (pure shear) by the emission of an edge dislocation (inverted T); the thin solid lines indicate the original position of the crack, the thicker solid lines indicate its final position. The middle panel illustrates sliding of two halves of the crystal on the glide plane (in general different from the cleavage plane). The bottom panel indicates the behavior of the energy during sliding.

The total energy of the solid $E$ is given by the internal energy plus any additional energy cost introduced by the presence of surfaces on either side of the crack plane, which, per unit width of the crack, is

$$E = U + 2\gamma_s a$$

with $\gamma_s$ the surface energy. From this last expression we obtain

$$\delta E = \delta U + 2\gamma_s\,\delta a = P\,\delta y - G_b\,\delta a + 2\gamma_s\,\delta a$$

where we have also used Eq. (10.34). At equilibrium, the total change in the energy must be equal to the total work by the external forces, $\delta E = P\,\delta y$, which leads directly to the Griffith criterion, Eq. (10.33).

Griffith's criterion for brittle fracture involves a remarkably simple expression. It is straightforward to relate the energy release rate to the stress intensity factors introduced above to describe the stresses and displacements in the neighborhood of the crack. In particular, for mode I loading in plane strain, the energy release rate $G_I$ and the stress intensity factor $K_I$ are related by

$$G_I = \frac{1-\nu}{2\mu}\,K_I^2$$

(see Problem 6). The Griffith criterion is obeyed well by extremely brittle solids, such as silica or bulk silicon. For most other solids, the energy required for fracture is considerably larger than $2\gamma_s$. The reason for the discrepancy is that, in addition to bond breaking at the crack tip, there is also plastic deformation of the solid ahead of the crack, which in this picture is not taken into account. The plastic deformation absorbs a large fraction of the externally imposed load and as a consequence a much larger load is required to actually break the solid.

10.3.3 Ductile response – Rice criterion

These considerations bring us to the next issue, that is, the formulation of a criterion for ductile response. As already mentioned, ductile response is related to dislocation activity. Nucleation of dislocations at a crack tip, and their subsequent motion away from it, is the mechanism that leads to blunting rather than opening of existing cracks, as shown schematically in Fig. 10.14. The blunting of the crack tip is the atomistic level process by which plastic deformation absorbs the external load, preventing the breaking of the solid. Formulating a criterion for the conditions under which nucleation of dislocations at a crack tip will occur is considerably more complicated than for brittle fracture. This issue has been the subject of much theoretical analysis. Early work by Rice and Thomson [141] attempted to put this process on a quantitative basis. The criteria they derived involved features of the dislocation such as the core radius $r_c$ and the Burgers vector $b$; the core radius, however, is not a uniquely defined parameter. More recent work has been based on the Peierls framework for describing dislocation properties. This allowed the derivation of expressions that do not involve arbitrary parameters such as the dislocation core radius [137]. We briefly discuss the work of J.R. Rice and coworkers, which provides an appealingly simple and very powerful formulation of the problem, in the context of the Peierls framework [142].
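The relation between the stress intensity factor and the energy release rate, combined with the Griffith criterion $G = 2\gamma_s$, gives an estimate of the remote stress at which a crack of given size starts to propagate in a perfectly brittle solid. The Python sketch below is an illustration rather than part of the original text. It assumes the mode I plane-strain results quoted in this chapter, namely $K_I = \sigma_\infty\sqrt{\pi a}$ for a crack of half-length $a$ and $G_I = (1-\nu)K_I^2/2\mu$; the function names are arbitrary and the material parameters are hypothetical round numbers, not data for any specific solid.

```python
import math


def stress_intensity_mode_I(sigma_inf, a):
    """K_I for a crack of half-length a under remote tension sigma_inf."""
    return sigma_inf * math.sqrt(math.pi * a)


def sigma_yy_exact(x, sigma_inf, a):
    """Stress ahead of the crack on the plane y = 0, valid for x > a."""
    return sigma_inf * x / math.sqrt(x * x - a * a)


def sigma_yy_near_tip(r, k_one):
    """Leading-order near-tip stress K_I / sqrt(2 pi r) at distance r from the tip."""
    return k_one / math.sqrt(2.0 * math.pi * r)


def energy_release_rate(k_one, mu, nu):
    """Mode I plane-strain energy release rate G_I = (1 - nu) K_I^2 / (2 mu)."""
    return (1.0 - nu) * k_one ** 2 / (2.0 * mu)


def griffith_critical_stress(gamma_s, a, mu, nu):
    """Remote stress at which G_I reaches the Griffith value 2 gamma_s."""
    return math.sqrt(4.0 * mu * gamma_s / ((1.0 - nu) * math.pi * a))


# Hypothetical parameters (SI units): shear modulus, Poisson ratio,
# surface energy and crack half-length.
mu, nu, gamma_s, a = 60.0e9, 0.25, 1.5, 1.0e-6

sigma_c = griffith_critical_stress(gamma_s, a, mu, nu)
k_c = stress_intensity_mode_I(sigma_c, a)
print(f"critical remote stress: {sigma_c:.3e} Pa")
print(f"G_I at that stress: {energy_release_rate(k_c, mu, nu):.3f} J/m^2")

# The near-tip form approaches the exact stress as r -> 0.
for r in (1.0e-1 * a, 1.0e-2 * a, 1.0e-4 * a):
    ratio = sigma_yy_exact(a + r, sigma_c, a) / sigma_yy_near_tip(r, k_c)
    print(f"r/a = {r / a:.0e}  exact / near-tip = {ratio:.6f}")
```

Because the critical stress scales as $1/\sqrt{a}$ in this estimate, longer cracks propagate at lower applied stress; the estimate neglects plastic deformation ahead of the crack, so it is meaningful only for the extremely brittle solids for which the Griffith criterion is obeyed.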

The critical energy release rate $G_d$ for dislocation nucleation at a crack tip, according to Rice's criterion [137], is given by

$$G_d = \alpha\,\gamma_{us} \qquad (10.35)$$

$\gamma_{us}$ is the unstable stacking energy, defined as the lowest energy barrier that must be overcome when one-half of an infinite crystal slides over the other half, while the crystal is brought from one equilibrium configuration to another equivalent one; $\alpha$ is a factor that depends on the geometry. For mode I and mode III loading $\alpha = 1$; for more general loading geometries $\alpha$ depends on two angles, the angle between the dislocation slip plane and the crack plane, and the angle between the Burgers vector and the crack line [142].

Rice's criterion can be rationalized in the special case of pure mode II loading, illustrated in Fig. 10.14: when the two halves of the crystal slide over each other on the slip plane, the energy goes through a maximum at a relative displacement $b/2$, where $b$ is the Burgers vector for dislocations on this plane; this maximum value of the energy is $\gamma_{us}$. The variation in the energy is periodic with period $b$. Rice showed that the energy release rate in this case, which in general is given by

$$G_{II} = \frac{1-\nu}{2\mu}\,K_{II}^2$$

(see Problem 6), is also equal to the elastic energy associated with slip between the two halves of the crystal, $U(d_{tip})$, where $d_{tip}$ is the position of the crack tip. When the crack tip reaches $d_{tip} = b/2$, the energy is at a local maximum in the direction of tip motion (this is actually a saddle point in the energy landscape associated with any relative displacement of the two halves of the crystal on the slip plane). Before this local maximum has been reached, if allowed to relax the solid will return to its original configuration. However, once this local maximum has been reached, the solid will relax to the next minimum of the energy situated at $d_{tip} = b$. In terms of structural changes in the solid, this corresponds to the creation of an edge dislocation. Under the external shear stress, the dislocation will then move further into the solid through the types of processes we discussed in the previous section. In this manner, the work done by external forces on the solid is absorbed by the creation of dislocations at the tip of the crack and their motion away from it; this process leads to a local change in the neighborhood of the crack tip but no breaking. In general, the dislocation will be created and will move on a plane which does not coincide with the crack plane, producing blunting of the crack as shown in Fig. 10.14. For a given crystal structure, a number of different possible glide planes and dislocation types (identified by their Burgers vectors) must be considered in order to determine the value of $\gamma_{us}$. For example, it is evident from the representation

of Fig. 10.14 that in mode II loading the relevant dislocation is an edge one, but for mode III loading it is a screw dislocation.

The tendency for brittle versus ductile behavior can then be viewed as a competition between the $G_b$ and $G_d$ terms: their ratio will determine whether the crystal, when externally loaded, will undergo brittle fracture (high $\gamma_{us}/\gamma_s$), or whether it will absorb the load by creation and motion of dislocations (low $\gamma_{us}/\gamma_s$). There is, however, no guidance provided by these arguments as to what value of $\gamma_{us}/\gamma_s$ differentiates between the tendency for brittle or ductile response. In fact, even when compared with atomistic simulations where the ratio $\gamma_{us}/\gamma_s$ can be calculated directly and the response of the system is known, this ratio cannot be used as a predictive tool. The reason is that a number of more complicated issues come into play in realistic situations. Examples of such issues are the coupling of different modes of loading, the importance of thermal activation of dislocation processes, ledge effects, lattice trapping, dislocation loops, and other effects of the atomically discrete nature of real solids, which in the above picture of brittle or ductile response are not taken into account. All these issues are the subject of recent and on-going investigations (see, for example, Refs. [142–147]).

The real power of the admittedly oversimplified picture described above lies in its ability to give helpful hints and to establish trends of how the complex macroscopic phenomena we are considering can be related to atomistic level structural changes. In particular, both the surface energy $\gamma_s$ for different cleavage planes of a crystal, as well as the unstable stacking energy $\gamma_{us}$ for different glide planes, are intrinsic properties of the solid which can be calculated with high accuracy using modern computational methods of the type discussed in chapter 5. Changes in these quantities due to impurities, alloying with other elements, etc., can then provide an indication of how these structural alterations at the microscopic level can affect the large-scale mechanical behavior of the solid (see, for example, Ref. [149], where such an approach was successfully employed to predict changes in the brittleness of specific materials). At present, much remains to be resolved before these theories are able to capture all the complexities of the competition between crack blunting versus brittle fracture tendencies in real materials.

10.4 Dislocation–defect interactions

Up to this point we have been treating dislocations as essentially isolated line defects in solids. Obviously, dislocations in real solids coexist with other defects. The interaction of dislocations with defects is very important for the overall behavior of the solid and forms the basis for understanding several interesting phenomena. While a full discussion of these interactions is not possible in the context of the present treatment, some comments on the issue are warranted. We will consider

selectively certain important aspects of these interactions, which we address in the order we adopted for the classification of defects: interaction between dislocations and zero-dimensional defects, interactions between dislocations themselves, and interactions between dislocations and two-dimensional defects (interfaces).

In the first category, we can distinguish between zero-dimensional defects which are of microscopic size, such as the point defects we encountered in chapter 9, and defects which have finite extent in all dimensions but are not necessarily of atomic-scale size. Atomic-scale point defects, such as vacancies, interstitials and impurities, typically experience long-range interactions with dislocations because of the strain field introduced by the dislocation to which the motion of point defects is sensitive. As a consequence, point defects can be either drawn toward the dislocation core or repelled away from it. If the point defects can diffuse easily in the bulk material, so that their equilibrium distribution can follow the dislocation as it moves in the crystal, they will alter significantly the behavior of the dislocation. A classic example is hydrogen impurities: H atoms, due to their small size, can indeed diffuse easily even in close-packed crystals and can therefore maintain the preferred equilibrium distribution in the neighborhood of the dislocation as it moves. This, in turn, can affect the overall response of the solid. It is known, for instance, that H impurities lead to embrittlement of many metals, including Al, the prototypical ductile metal. The actual mechanisms by which this effect occurs at the microscopic level remain open to investigation, but there is little doubt that the interaction of the H impurities with the dislocations is at the heart of this effect.

We turn next to the second type of zero-dimensional defects, those which are not of atomic-scale size. The interaction of a dislocation with such defects can be the source of dislocation multiplication. This process, known as the Frank–Read source, is illustrated in Fig. 10.15: a straight dislocation anchored at two such defects is made to bow out under the influence of an external stress which would normally make the dislocation move. At some point the bowing is so severe that two points of the dislocation meet and annihilate. This breaks the dislocation into two portions, the shorter of which shrinks to the original dislocation configuration between the defects while the larger moves away as a dislocation loop. The process can be continued indefinitely as long as the external stress is applied to the system, leading to multiple dislocations emanating from the original one.

As far as interactions of dislocations among themselves are concerned, these can take several forms. One example is an intersection, formed when two dislocations whose lines are at an angle meet. Depending on the type of dislocations and the angle at which they meet, this can lead to a junction (permanent lock between the two dislocations) or a jog (step on the dislocation line). The junctions make it difficult for dislocations to move past each other, hence they restrict the motion of the dislocation. We mentioned earlier how the motion of a dislocation through the crystal

leads to plastic deformation (see Fig. 10.4 and accompanying discussion). When dislocation motion is inhibited, the plasticity of the solid is reduced, an effect known as hardening. The formation of junctions which correspond to attractive interactions between dislocations is one of the mechanisms for hardening. The junctions or jogs can be modeled as a new type of particle which has its own dynamics in the crystal environment. Another consequence of dislocation interactions is the formation of dislocation walls, that is, arrangements of many dislocations on a planar configuration. This type of interaction also restricts the motion of individual dislocations and produces changes in the mechanical behavior of the solid. This is one instance of a more general phenomenon, known as dislocation microstructure, in which interacting dislocations form well defined patterns. Detailed simulations of how dislocations behave in the presence of other dislocations are a subject of active research and reveal quite complex behavior at the atomistic level [150, 151].

Figure 10.15. Illustration of Frank–Read source of dislocations: under external stress the original dislocation (labeled 1) between two zero-dimensional defects is made to bow out. After sufficient bowing, two parts of the dislocation meet and annihilate (configuration 6), at which point the shorter portion shrinks to the original configuration and the larger portion moves away as a dislocation loop (configuration 7, which has two components).

Finally, we consider the interaction of dislocations with two-dimensional obstacles such as interfaces. Most materials are composed of small crystallites whose interfaces are called grain boundaries. Interestingly, grain boundaries themselves can be represented as arrays of dislocations (this topic is discussed in more detail in chapter 11). Here, we will examine what happens when dislocations which exist

within the crystal meet a grain boundary. When this occurs, the dislocations become immobile and pile up at the boundary. As a consequence, the ability of the crystal to deform plastically is again diminished and the material becomes harder. It is easy to imagine that since the smaller the crystallite the higher the surface-to-volume ratio, materials composed of many small crystallites will be harder than materials composed of few large crystallites. This is actually observed experimentally, and is known as the Hall–Petch effect [152]. Interestingly, the trend continues down to a certain size of order 10 nm, below which the effect is reversed, that is, the material becomes softer as the grain size decreases. The reason for this reversal in behavior is that, for very small grains, sliding between adjacent grains at the grain boundaries becomes easy and this leads to a material that yields sooner to external stress [153].

Putting all the aspects of dislocation behavior together, from the simple motion of an isolated dislocation which underlies plastic deformation, to the mechanisms of dislocation nucleation at crack tips, to the interaction of dislocations with other zero-, one- and two-dimensional defects, is a daunting task. It is, nevertheless, an essential task in order to make the connection between the atomistic level structure and dynamics to the macroscopic behavior of materials as exemplified by fascinating phenomena like work hardening, fatigue, stress-induced corrosion, etc. To this end, recent computational approaches have set their sights at a more realistic connection between the atomistic and macroscopic regimes in what has become known as "multiscale modeling of materials" (for some representative examples see Refs. [154, 155]).

Further reading

1. Theory of Dislocations, J.P. Hirth and J. Lothe (Krieger, Malabar, 1992). This is the standard reference for the physics of dislocations, containing extensive and detailed discussions of every aspect of dislocations.
2. The Theory of Crystal Dislocations, F.R.N. Nabarro (Oxford University Press, Oxford, 1967). An older but classic account of dislocations by one of the pioneers in the field.
3. Introduction to Dislocations, D. Hull and D.J. Bacon (Pergamon Press, 1984). An accessible and thorough introduction to dislocations.
4. Dislocations, J. Friedel (Addison-Wesley, Reading, MA, 1964). A classic account of dislocations, with many insightful discussions.
5. Elementary Dislocation Theory, J. Weertman and J.R. Weertman (Macmillan, New York, 1964).
6. Theory of Crystal Dislocations, A.H. Cottrell (Gordon and Breach, New York, 1964).
7. Dislocations in Crystals, W.T. Read (McGraw-Hill, New York, 1953).
8. Dislocation Dynamics and Plasticity, T. Suzuki, S. Takeuchi, H. Yoshinaga (Springer-Verlag, Berlin, 1991). This book offers an insightful connection between dislocations and plasticity.
9. "The dislocation core in crystalline materials", M.S. Duesbery and G.Y. Richardson, in Solid State and Materials Science, vol. 17, pp. 1–46 (CRC Press, 1991). This is a thorough, modern discussion of the properties of dislocation cores.

Problems

1. In order to obtain the stress field of a screw dislocation in an isotropic solid, we can define the displacement field as

$$u_x = u_y = 0, \qquad u_z = \frac{b_s}{2\pi}\,\theta = \frac{b_s}{2\pi}\tan^{-1}\frac{y}{x} \tag{10.36}$$

with $b_s$ the Burgers vector along the z axis. This is justified by the schematic representation of the screw dislocation in Fig. 10.2: sufficiently far from the dislocation core, $u_z$ goes uniformly from zero to $b_s$ as $\theta$ ranges from zero to $2\pi$, while the other two components of the strain field vanish identically. Find the strain components of the screw dislocation in cartesian and polar coordinates. Then, using the stress–strain relations for an isotropic solid, Eq. (E.32), find the stress components of the screw dislocation in cartesian and polar coordinates and compare the results to the expressions given in Table 10.1. (Hint: the shear stress components in cartesian and polar coordinates are related by:

$$\sigma_{rz} = \sigma_{xz}\cos\theta + \sigma_{yz}\sin\theta, \qquad \sigma_{\theta z} = \sigma_{yz}\cos\theta - \sigma_{xz}\sin\theta \tag{10.37}$$

Similar relations hold for the shear strains.)

2. In order to obtain the stress field of an edge dislocation in an isotropic solid, we can use the equations of plane strain, discussed in detail in Appendix E. The geometry of Fig. 10.1 makes it clear that a single infinite edge dislocation in an isotropic solid satisfies the conditions of plane strain, with the strain $\epsilon_{zz}$ along the axis of the dislocation vanishing identically. The stress components for plane strain are given in terms of the Airy stress function, $A(r, \theta)$, by Eq. (E.49). We define the function

$$B = \sigma_{xx} + \sigma_{yy} = \nabla^2_{xy} A$$

where the laplacian with subscript $xy$ indicates that only the in-plane components are used. The function $B$ must obey Laplace’s equation, since the function $A$ obeys Eq. (E.50). Laplace’s equation for $B$ in polar coordinates in 2D reads

$$\left[\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\right] B(r, \theta) = 0$$

which is separable, that is, $B(r, \theta)$ can be written as a product of a function of $r$ with a function of $\theta$.

(a) Show that the four possible solutions to the above equation are

$$c, \qquad \ln r, \qquad r^{\pm n}\sin n\theta, \qquad r^{\pm n}\cos n\theta$$

with $c$ a real constant and $n$ a positive integer. Of these, $c$ and $\ln r$ must be rejected since they have no $\theta$ dependence, and the function $B$ must surely depend on the variable $\theta$ from physical considerations. The solutions with positive powers of $r$ must also be rejected since they blow up at large distances from the dislocation core, which is unphysical. Of the remaining solutions, argue that the dominant one for large $r$ which makes physical sense from the geometry of the edge
dislocation is

$$B(r, \theta) = \beta_1 \frac{\sin\theta}{r}$$

(b) With this expression for $B(r, \theta)$, show that a solution for the Airy function is

$$A(r, \theta) = \alpha_1 r \sin\theta \ln r$$

Discuss why the solutions to the homogeneous equation $\nabla^2_{xy} A = 0$ can be neglected. From the above solution for $A(r, \theta)$, obtain the stresses as determined by Eq. (E.49) and use them to obtain the strains from the general strain–stress relations for an isotropic solid, Eq. (E.28). Then use the normalization of the integral of $\epsilon_{xx}$ to the Burgers vector $b_e$,

$$u_x(+\infty) - u_x(-\infty) = \int_{-\infty}^{\infty} \left[\epsilon_{xx}(x, -\delta y) - \epsilon_{xx}(x, +\delta y)\right] dx = b_e$$

where $\delta y \to 0^+$, to determine the value of the constant $\alpha_1$. This completes the derivation of the stress field of the edge dislocation.

(c) Express the stress components in both polar and cartesian coordinates and compare them to the expressions given in Table 10.1.

3. Derive the solution for the shape of the dislocation, Eq. (10.22), which satisfies the PN integro-differential equation, Eq. (10.20), with the assumption of a sinusoidal misfit potential, Eq. (10.21).

4. The original lattice restoring stress considered by Frenkel [136] was

$$\frac{d\gamma(u)}{du} = F_{max} \sin\left(\frac{2\pi u(x)}{b}\right)$$

where $b$ is the Burgers vector and $F_{max}$ the maximum value of the stress. Does this choice satisfy the usual conditions that the restoring force should obey? Find the solution for $u(x)$ when this restoring force is substituted into the PN integro-differential equation, Eq. (10.20). From this solution obtain the dislocation density $\rho(x)$ and compare $u(x)$ and $\rho(x)$ to those of Eq. (10.22), obtained from the choice of potential in Eq. (10.21). What is the physical meaning of the two choices for the potential, and what are their differences?

5. We wish to derive the expressions for the stress, Eqs. (10.26)–(10.28), and the displacement field, Eqs. (10.29)–(10.30), in the neighborhood of a crack loaded in mode I. We will use the results of the plane strain situation, discussed in detail in Appendix E. We are interested in solutions for the Airy function,

$$A(r, \theta) = r^2 f(r, \theta) + g(r, \theta)$$

such that the resulting stress has the form $\sigma_{ij} \sim r^q$ near the crack tip: this implies $A \sim r^{q+2}$.

(a) Show that the proper choices for mode I symmetry are

$$f(r, \theta) = f_0 r^q \cos q\theta, \qquad g(r, \theta) = g_0 r^{q+2} \cos (q+2)\theta$$

With these choices, obtain the stress components $\sigma_{\theta\theta}$, $\sigma_{r\theta}$, $\sigma_{rr}$.

(b) Determine the allowed values of $q$ and the relations between the constants $f_0$, $g_0$ by requiring that $\sigma_{\theta\theta} = \sigma_{r\theta} = 0$ for $\theta = \pm\pi$.

(c) Show that by imposing the condition of bounded energy

$$\int_0^{2\pi}\int_0^{R} \sigma^2\, r\, dr\, d\theta < \infty$$

and by discarding all terms $r^q$ which give zero stress at the crack tip $r = 0$, we arrive at the solution given by Eqs. (10.26)–(10.28), as the only possibility.

(d) From the solution for the stress, obtain the displacement field given in Eqs. (10.29) and (10.30), using the standard stress–strain relations for an isotropic solid (see Appendix E).

6. Show that the energy release rate for the opening of a crack, per unit crack area, due to elastic forces is given by

$$G = \lim_{\delta a \to 0} \frac{1}{\delta a}\int_0^{\delta a} \frac{1}{2}\,\sigma_{yj} u_j\, dr = \frac{1-\nu}{2\mu}\left(K_I^2 + K_{II}^2\right) + \frac{1}{2\mu} K_{III}^2$$

where $\sigma_{yj}$, $u_j$ ($j = x, y, z$) are the dominant stress components and displacement discontinuities behind the crack, in mode I, II and III loading and $K_I$, $K_{II}$, $K_{III}$ are the corresponding stress intensity factors.
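
As a quick check on units and magnitudes for the expression in Problem 6 (not a derivation of it), the right-hand side can be evaluated directly. The sketch below is only illustrative: the function name and the material parameters (a shear modulus of 26 GPa, a Poisson ratio of 0.33 and a mode I stress intensity factor of 1 MPa m^1/2) are assumptions chosen for the example, not values from the text.

```python
def energy_release_rate(K_I, K_II, K_III, shear_modulus, poisson_ratio):
    """Right-hand side of the Problem 6 expression.

    Stress intensity factors in Pa*sqrt(m) and shear modulus in Pa
    give an energy per unit crack area in J/m^2.
    """
    in_plane = (1.0 - poisson_ratio) / (2.0 * shear_modulus) * (K_I**2 + K_II**2)
    anti_plane = K_III**2 / (2.0 * shear_modulus)
    return in_plane + anti_plane

# Illustrative (assumed) numbers: pure mode I loading of a metal-like solid
G_mode_I = energy_release_rate(1.0e6, 0.0, 0.0,
                               shear_modulus=26.0e9, poisson_ratio=0.33)
print(f"G = {G_mode_I:.2f} J/m^2")
```

Since $K^2/\mu$ has units of Pa m, the result comes out directly as an energy per unit area.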

11 Defects III: surfaces and interfaces

Two-dimensional defects in crystals consist of planes of atomic sites where the solid terminates or meets a plane of another crystal. We refer to the first type of defects as surfaces, to the second as interfaces. Interfaces can occur between two entirely different solids or between two grains of the same crystal, in which case they are called grain boundaries.

Surfaces and interfaces of solids are extremely important from a fundamental as well as from a practical point of view. At the fundamental level, surfaces and interfaces are the primary systems where physics in two dimensions can be realized and investigated, opening a different view of the physical world. We have already seen that the confinement of electrons at the interface between a metal and a semiconductor or two semiconductors creates the conditions for the quantum Hall effects (see chapters 7 and 9). There exist several other phenomena particular to 2D: one interesting example is the nature of the melting transition, which in 2D is mediated by the unbinding of defects [156, 157]. Point defects in 2D are the equivalent of dislocations in 3D, and consequently have all the characteristics of dislocations discussed in chapter 10. In particular, dislocations in two dimensions are mobile and have long-range strain fields which lead to their binding in pairs of opposite Burgers vectors. Above a certain temperature (the melting temperature), the entropy term in the free energy wins and it becomes favorable to generate isolated dislocations; this produces enough disorder to cause melting of the 2D crystal.

At a more practical level, there are several aspects of surfaces and interfaces that are extremely important for applications. For instance, grain boundaries, a type of interface very common in crystalline solids, are crucial to mechanical strength. Similarly, chemical reactions mediated by solid surfaces are the essence of catalysis, a process of huge practical significance. Surfaces are the subject of a very broad and rich field of study called surface science, to which entire research journals are devoted (including Surface Science and Surface Review and Letters). It would not be possible to cover all the interesting
phenomena related to crystal surfaces in a short chapter. Our aim here is to illustrate how the concepts and techniques we developed in earlier chapters, especially those dealing with the link between atomic and electronic structure, can be applied to study representative problems of surface and interface physics.

11.1 Experimental study of surfaces

We begin our discussion of surfaces with a brief review of experimental techniques for determining their atomic structure. The surfaces of solids under usual circumstances are covered by a large amount of foreign substances. This is due to the fact that a surface of a pure crystal is usually chemically reactive and easy to contaminate. For this reason, it has been very difficult to study the structure of surfaces quantitatively. The detailed study of crystalline surfaces became possible with the advent of ultra high vacuum (UHV) chambers, in which surfaces of solids could be cleaned and maintained in their pure form. Real surfaces of solids, even when they have no foreign contaminants, are not perfect two-dimensional planes. Rather, they contain many imperfections, such as steps, facets, islands, etc., as illustrated in Fig. 11.1. With proper care, flat regions on a solid surface can be prepared, called terraces, which are large on the atomic scale consisting of thousands to millions of interatomic distances on each side. These terraces are close approximations to the ideal 2D surface of an infinite 3D crystal. It is this latter type of surface that we discuss here, by analogy to the ideal, infinite, 3D crystal studied in earlier chapters.

Figure 11.1. Various features of real surfaces shown in cross-section: reasonably flat regions are called terraces; changes in height of order a few atomic layers are called steps; terraces of a different orientation relative to the overall surface are referred to as facets; small features are referred to as islands. In the direction perpendicular to the plane of the figure, terraces, steps and facets can be extended over distances that are large on the atomic scale, but islands are typically small in all directions.

Typically, scattering techniques, such as low-energy electron diffraction (referred to as LEED), reflection high-energy electron diffraction (RHEED), X-ray scattering, etc., have been used extensively to determine the structure of surfaces (see, for
example, articles in the book by van Hove and Tong, mentioned in the Further reading section). Electron or X-ray scattering methods are based on the same principles used to determine crystal structure in 3D (see chapter 3). These methods, when combined with detailed analysis of the scattered signal as a function of incident-radiation energy, can be very powerful tools for determining surface structure. A different type of scattering measurement involves ions, which bounce off the sample atoms in trajectories whose nature depends on the incident energy; these methods are referred to as low-, medium-, or high-energy ion scattering (LEIS, MEIS and HEIS, respectively). Since surface atoms are often in positions different than those of a bulk-terminated plane (see section 11.2 for details), the pattern of scattered ions can be related to the surface structure. Yet a different type of measurement that reveals the structure of the surface on a local scale is referred to as “field ion microscope” (FIM). In this measurement a sample held at high voltage serves as an attraction center for ions, which bounce off it in a pattern that reflects certain aspects of the surface structure.

The methods mentioned so far concern measurements of the surface atomic structure, which can often be very different than the structure of a bulk plane of the same orientation, as discussed in section 11.2. Another interesting aspect of the surface is its chemical composition, which can also differ significantly from the composition of a bulk plane. This is usually established through a method called “Auger analysis”, which consists of exciting core electrons of an atom and measuring the emitted X-ray spectrum when other electrons fall into the unoccupied core state: since core-state energies are characteristic of individual elements and the wavefunctions of core states are little affected by neighboring atoms, this spectrum can be used to identify the presence of specific atom types on the surface. The Auger signal of a particular atomic species is proportional to its concentration on the surface, and consequently it can be used to determine the chemical composition of the surface with great accuracy and sensitivity. The reason why this technique is particularly effective on surfaces is that the emitted X-rays are not subjected to any scattering when they are emitted by surface atoms.

Since the mid-1980s, a new technique called scanning tunneling microscopy (STM) has revolutionized the field of surface science by making it possible to determine the structure of surfaces by direct imaging of atomistic level details. This ingenious technique (its inventors, G. Binnig and H. Rohrer, were recognized with the 1986 Nobel prize for Physics) is based on a simple scheme (illustrated in Fig. 11.2). When an atomically sharp tip approaches a surface, and is held at a bias voltage relative to the sample, electrons can tunnel from the tip to the surface, or in the opposite direction, depending on the sign of the bias voltage. The tunneling current is extremely sensitive to the distance from the surface, with exponential dependence. Thus, in order to achieve constant tunneling current, a constant distance from the surface must be maintained.
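
The link between constant current and constant distance can be made concrete with a short numerical sketch. It assumes the simple decay law $I \sim e^{-2\kappa d}$ for tunneling through a constant barrier, which is obtained later in this section, with $\kappa = \sqrt{2 m_e \phi}/\hbar$ for an effective barrier height $\phi$. The barrier height of 4 eV and the function names are illustrative assumptions, not values taken from the text.

```python
import math

# hbar^2 / (2 m_e) in eV * Angstrom^2 (standard value, rounded)
HBAR2_OVER_2ME = 3.81

def decay_constant(barrier_ev):
    """Inverse decay length kappa (1/Angstrom) for an effective barrier in eV."""
    return math.sqrt(barrier_ev / HBAR2_OVER_2ME)

def current_ratio(delta_d, barrier_ev=4.0):
    """I(d + delta_d) / I(d), assuming I ~ exp(-2 kappa d)."""
    return math.exp(-2.0 * decay_constant(barrier_ev) * delta_d)

def height_change_for_ratio(ratio, barrier_ev=4.0):
    """Tip displacement (Angstrom) that changes the current by the given factor."""
    return -math.log(ratio) / (2.0 * decay_constant(barrier_ev))

kappa = decay_constant(4.0)  # assumed 4 eV barrier
print(f"kappa = {kappa:.3f} 1/Angstrom")
print(f"current ratio for +1 Angstrom: {current_ratio(1.0):.3f}")
print(f"displacement for a 1% current drop: {height_change_for_ratio(0.99):.4f} Angstrom")
```

Under these assumptions, retracting the tip by a single angstrom reduces the current by nearly an order of magnitude, so holding the current fixed to within a percent pins the tip height to a small fraction of an atomic spacing.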

Figure 11.2. Schematic representation of the scanning tunneling microscope: a metal tip is held at a voltage bias V relative to the surface, and is moved up or down to maintain constant current of tunneling electrons. This produces a topographical profile of the electronic density associated with the surface.

A feedback loop can ensure this by moving the tip in the direction perpendicular to the surface while it is scanned over the surface. This leads to a scan of the surface at constant height. This height is actually determined by the electron density on the surface, since it is between electronic states of the sample and the tip that electrons can tunnel to and from. We provide here an elementary discussion of how STM works, in order to illustrate its use in determining the surface structure. The theory of STM was developed by Tersoff and Hamann [158] and further extended by Chen [159]. The starting point is the general expression for the current $I$ due to electrons tunneling between two sides, identified as left ($L$) and right ($R$):

$$I = \frac{2\pi e}{\hbar} \sum_{i,j} \left\{ n^{(L)}(\epsilon_i)\left[1 - n^{(R)}(\epsilon_j)\right] - n^{(R)}(\epsilon_j)\left[1 - n^{(L)}(\epsilon_i)\right] \right\} |T_{ij}|^2\, \delta(\epsilon_i - \epsilon_j) \tag{11.1}$$

where $n^{(L)}(\epsilon)$ and $n^{(R)}(\epsilon)$ are the Fermi filling factors of the left and right sides, respectively, and $T_{ij}$ is the tunneling matrix element between electronic states on the two sides identified by the indices $i$ and $j$. The actual physical situation is illustrated in Fig. 11.3: for simplicity, we assume that both sides are metallic solids, with Fermi levels and work functions (the difference between the Fermi level and the vacuum level, which is common to both sides) defined as $\epsilon_F^{(L)}$, $\epsilon_F^{(R)}$ and $\phi^{(L)}$, $\phi^{(R)}$, respectively, when they are well separated and there is no tunneling current. When the two sides are brought close together and allowed to reach equilibrium with a
common Fermi level, $\epsilon_F$, there will be a barrier to tunneling due to the difference in work functions, given by

$$\delta\phi = \phi^{(R)} - \phi^{(L)}$$

This difference gives rise to an electric field, which in the case illustrated in Fig. 11.3 inhibits tunneling from the left to the right side. When a bias voltage $V$ is applied, the inherent tunneling barrier can be changed. The energy shift introduced by the bias potential,

$$\delta\epsilon = eV - \delta\phi = eV - \phi^{(R)} + \phi^{(L)}$$

gives rise to an effective electric field, which in the case illustrated in Fig. 11.3 enhances tunneling relative to the zero-bias situation. Reversing the sign of the bias potential would have the opposite effect on tunneling.

Figure 11.3. Energy level diagram for electron tunneling between two sides, identified as left ($L$) and right ($R$). Left: the situation corresponds to the two sides being far apart and having different Fermi levels denoted by $\epsilon_F^{(L)}$ and $\epsilon_F^{(R)}$; $\phi^{(L)}$ and $\phi^{(R)}$ are the work functions of the two sides. Center: the two sides are brought closer together so that equilibrium can be established by tunneling, which results in a common Fermi level $\epsilon_F$ and an effective electric potential generated by the difference in work functions, $\delta\phi = \phi^{(R)} - \phi^{(L)}$. Right: one side is biased relative to the other by a potential difference $V$, resulting in an energy level shift and a new effective electric potential, generated by the energy difference $\delta\epsilon = eV - \delta\phi$.

At finite bias voltage $V$, one of the filling factor products appearing in Eq. (11.1) will give a non-vanishing contribution and the other one will give a vanishing contribution, as shown in Fig. 11.4. For the case illustrated in Fig. 11.3, the product that gives a non-vanishing contribution is $n^{(L)}(\epsilon_i)[1 - n^{(R)}(\epsilon_j)]$. In order to fix ideas, in the following we will associate the left side with the sample, the surface of which is being studied, and the right side with the tip. In the limit of very small bias voltage and very low temperature, conditions which apply to most situations for STM experiments, the non-vanishing filling factor product divided by $eV$ has the characteristic behavior of a $\delta$-function (see Appendix G).
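
The $\delta$-function character of the filling factor product can be checked numerically. The sketch below is a minimal illustration under stated assumptions: both sides are described by Fermi-Dirac filling factors at the same temperature, the left side is shifted up in energy by $eV$, and the particular values of $k_B T$ and $eV$ are chosen only for the example. The helper names are hypothetical.

```python
import math

def fermi(energy, mu, kT):
    """Fermi-Dirac filling factor, written to avoid overflow for large arguments."""
    x = (energy - mu) / kT
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def normalized_product(energy, eV, kT, e_fermi=0.0):
    """n_L(e) [1 - n_R(e)] / eV, with the left side shifted up by eV (assumption)."""
    n_left = fermi(energy, e_fermi + eV, kT)
    n_right = fermi(energy, e_fermi, kT)
    return n_left * (1.0 - n_right) / eV

def integrate(func, lo, hi, steps=20000):
    """Plain trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

kT = 0.001   # eV, illustrative low temperature
eV = 0.010   # eV, illustrative small bias
area = integrate(lambda e: normalized_product(e, eV, kT), -0.2, 0.2)
far_from_fermi_level = normalized_product(0.1, eV, kT)
print(f"integrated weight = {area:.4f}")
print(f"value 0.1 eV above the Fermi level = {far_from_fermi_level:.3e}")
```

The integrated weight stays close to unity while the function is negligible away from the narrow window between the two Fermi levels, which is the behavior expected of a $\delta$-function as the bias and the temperature are reduced.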

Thus, in this limit we can write

$$I = \frac{2\pi e^2 V}{\hbar} \sum_{i,j} |T_{ij}|^2\, \delta(\epsilon_i - \epsilon_F)\, \delta(\epsilon_j - \epsilon_F) \tag{11.2}$$

where we have used a symmetrized expression for the two $\delta$-functions appearing under the summation over electronic states. The general form of the tunneling matrix element, as shown by Bardeen [160], is

$$T_{ij} = \frac{\hbar^2}{2m_e} \int \left[ \psi_i^*(\mathbf{r}) \nabla_{\mathbf{r}} \psi_j(\mathbf{r}) - \psi_j(\mathbf{r}) \nabla_{\mathbf{r}} \psi_i^*(\mathbf{r}) \right] \cdot \mathbf{n}_s\, dS \tag{11.3}$$

where $\psi_i(\mathbf{r})$ are the sample and $\psi_j(\mathbf{r})$ the tip electronic wavefunctions. The integral is evaluated on a surface $S$ with surface-normal vector $\mathbf{n}_s$, which lies entirely between the tip and the sample.

Figure 11.4. The filling factors entering in the expression for tunneling between the left, $n^{(L)}(\epsilon)$, and right, $n^{(R)}(\epsilon)$, sides, when there is a bias potential $V$ on the left side: the product $n^{(L)}(\epsilon)[1 - n^{(R)}(\epsilon)]$ gives a non-vanishing contribution (shaded area), while $n^{(R)}(\epsilon)[1 - n^{(L)}(\epsilon)]$ gives a vanishing contribution.

Tersoff and Hamann showed that, assuming a point source for the tip and a simple wavefunction of $s$ character associated with it, the tunneling current takes the form

$$\lim_{V, T \to 0} I = I_0 \sum_i |\psi_i(\mathbf{r}_t)|^2\, \delta(\epsilon_i - \epsilon_F) \tag{11.4}$$

where $\mathbf{r}_t$ is the position of the tip and $I_0$ is a constant which contains parameters describing the tip, such as the density of tip states at the Fermi level, $g_t(\epsilon_F)$, and its work function. The meaning of Eq. (11.4) is that the spatial dependence of the tunneling current is determined by the magnitude of electronic wavefunctions of the sample evaluated at the tip position. In other words, the tunneling current gives an exact topographical map of the sample electronic charge density evaluated at the position of a point-like tip. For a finite value of the bias voltage $V$, and in the limit of zero temperature, the corresponding expression for the tunneling current would involve
a sum over sample states within $eV$ of the Fermi level, as is evident from Fig. 11.3:

$$\lim_{T \to 0} I = I_0 \sum_i |\psi_i(\mathbf{r}_t)|^2\, \theta(\epsilon_i - \epsilon_F)\, \theta(\epsilon_F + eV - \epsilon_i) \tag{11.5}$$

where $\theta(x)$ is the Heaviside step function (see Appendix G). It is worthwhile mentioning that this expression is valid for either sign of the bias voltage, that is, whether one is probing occupied sample states (corresponding to $V > 0$, as indicated by Fig. 11.3), or unoccupied sample states (corresponding to $V < 0$); in the former case electrons are tunneling from the sample to the tip, in the latter from the tip to the sample.

In the preceding discussion we were careful to identify the electronic states on the left side as “sample” states without any reference to the surface. What remains to be established is that these sample states are, first, localized in space near the surface and, second, close in energy to the Fermi level. In order to argue that this is the case, we will employ simple one-dimensional models, in which the only relevant dimension is perpendicular to the surface plane. We call the free variable along this dimension $z$. In the following, since we are working in one dimension, we dispense with vector notation. First, we imagine that on the side of the sample, the electronic states are free-particle-like with wave-vector $k$, $\psi_k(z) \sim \exp(ikz)$, and energy $\epsilon_k = \hbar^2 k^2/2m_e$, and are separated by a potential barrier $V_b$ (which we will take to be a constant) from the tip side. The solution to the single-particle Schrödinger equation in the barrier takes the form (see Appendix B)

$$\psi_k(z) \sim e^{\pm\kappa z}, \qquad \kappa = \sqrt{\frac{2m_e(V_b - \epsilon_k)}{\hbar^2}} \tag{11.6}$$

The tunneling current is proportional to the magnitude squared of this wavefunction evaluated at the tip position, assumed to be a distance $d$ from the surface, which gives that

$$I \sim e^{-2\kappa d}$$

where we have chosen the “−” sign for the solution along the positive $z$ axis which points away from the surface, as the physically relevant solution that decays to zero far from the sample. This simple argument indicates that the tunneling current is exponentially sensitive to the sample–tip separation $d$. Thus, only states that have significant magnitude at the surface are relevant to tunneling.

We will next employ a slightly more elaborate model to argue that surface states are localized near the surface and have energies close to the Fermi level. We consider again a 1D model but this time with a weak periodic potential, that is, a nearly-free-electron model. The weak periodic potential in the sample will be taken to have only two non-vanishing components, the constant term $V_0$ and a term which
involves the $G$ reciprocal-lattice vector (see also the discussion of this model for the 3D case in chapter 3):

$$V(z) = V_0 + 2V_G \cos(Gz), \qquad z < 0 \tag{11.7}$$

Solving the 1D Schrödinger equation for this model gives for the energy $\epsilon_k$ and wavefunction $\psi_k(z < 0)$ (see Problem 1)

$$\epsilon_k = V_0 + \frac{\hbar^2}{2m_e}\left(\frac{G}{2}\right)^2 + \frac{\hbar^2 q^2}{2m_e} \pm \sqrt{V_G^2 + \left(\frac{\hbar^2 q G}{2m_e}\right)^2} \tag{11.8}$$

$$\psi_k(z < 0) = c\, e^{iqz} \cos\left(\frac{G}{2} z + \phi\right), \qquad e^{2i\phi} = \frac{\epsilon_k - \hbar^2 k^2/2m_e}{V_G} \tag{11.9}$$

where we have introduced the variable $q = k - G/2$ which expresses the wave-vector relative to the Brillouin Zone edge corresponding to $G/2$; in the expression for the wavefunction, $c$ is a constant that includes the normalization. For the potential outside the sample, we will assume a simple barrier of height $V_b$, which gives for the wavefunction

$$\psi_k(z > 0) = c'\, e^{-\kappa z} \tag{11.10}$$

with $c'$ another constant of normalization and $\kappa$ defined in Eq. (11.6). The problem can then be solved completely by matching at $z = 0$ the values of the wavefunction and its derivative as given by the expressions for $z < 0$ and $z > 0$. The potential $V(z)$ and the wavefunction $\psi_k(z)$ as functions of $z$ and the energy $\epsilon_k$ as a function of $q^2$ are shown in Fig. 11.5.

Figure 11.5. Illustration of the features of the nearly-free-electron model for surface states. Left: the potential $V(z)$ (thick black line), and wavefunction $\psi_k(z)$ of a surface state with energy $\epsilon_k$, as functions of the variable $z$ which is normal to the surface. The black dots represent the positions of ions in the model. Right: the energy $\epsilon_k$ as a function of $q^2$, where $q = k - G/2$. The energy gap is $2V_G$ at $q = 0$. Imaginary values of $q$ give rise to the surface states.

The interesting feature of this solution is that it allows $q$ to take imaginary values, $q = \pm i|q|$. For such values, the wavefunction decays exponentially, both outside the sample ($z > 0$) as $\sim \exp(-\kappa z)$, as well as inside the sample ($z < 0$) as $\sim \exp[|q|z]$, that is, the state is spatially confined to the surface.
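
The behavior of the energy as a function of $q^2$ in Eq. (11.8) can be explored with a few lines of code. The sketch below simply evaluates that expression for real $q$ ($q^2 > 0$) and imaginary $q$ ($q^2 < 0$). The lattice constant of 3 Å, the value $V_G = 0.5$ eV, the choice $V_0 = 0$ and the function name are illustrative assumptions, not parameters from the text.

```python
import math

HBAR2_OVER_2ME = 3.81  # hbar^2 / (2 m_e) in eV * Angstrom^2 (rounded)

def band_energy(q_squared, G, V0, VG, branch):
    """Energy of Eq. (11.8) as a function of q^2 (negative q^2 = imaginary q).

    branch is +1 for the upper sign and -1 for the lower sign.
    """
    free_part = HBAR2_OVER_2ME * ((G / 2.0) ** 2 + q_squared)
    # (hbar^2 q G / 2 m_e)^2 changes sign together with q^2
    under_root = VG ** 2 + (HBAR2_OVER_2ME * G) ** 2 * q_squared
    if under_root < 0.0:
        raise ValueError("q^2 is too negative: the energy would be complex")
    return V0 + free_part + branch * math.sqrt(under_root)

# Illustrative parameters (assumptions, not taken from the text)
a = 3.0                      # lattice constant in Angstrom
G = 2.0 * math.pi / a        # reciprocal-lattice vector
V0, VG = 0.0, 0.5            # potential components in eV

lower_edge = band_energy(0.0, G, V0, VG, -1)
upper_edge = band_energy(0.0, G, V0, VG, +1)
print(f"gap at q = 0: {upper_edge - lower_edge:.3f} eV")

for q2 in (-0.003, -0.001, 0.0, 0.001, 0.003):
    e_minus = band_energy(q2, G, V0, VG, -1)
    e_plus = band_energy(q2, G, V0, VG, +1)
    print(f"q^2 = {q2:+.3f}: {e_minus:.3f} eV, {e_plus:.3f} eV")
```

For these assumed parameters the two branches move apart for positive $q^2$, while for negative $q^2$ both lie between the band edges found at $q = 0$, until the square root becomes imaginary and no real energy exists.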

Moreover, the imaginary values of $q$ correspond to energies that lie within the forbidden energy gap for the 3D model (which is equal to $2V_G$ and occurs at $q = 0$, see Fig. 11.5), since those values of $q$ would produce wavefunctions that grow exponentially for $z \to \pm\infty$. In other words, the surface states have energies within the energy gap. Assuming that all states below the energy gap are filled, when the surface states are occupied the Fermi level will intersect the surface energy band described by $\epsilon_k$. Therefore, we have established both facts mentioned above, that is, the sample states probed by the STM tip are spatially localized near the surface and have energies close to the Fermi level. Although these facts were established within a simple 1D model, the results carry over to the 3D case in which the electronic wavefunctions have a dependence on the $(x, y)$ coordinates of the surface plane as well, because their essential features in the direction perpendicular to the surface are captured by the preceding analysis.

Having established the basic picture underlying STM experiments, it is worthwhile considering the limitations of the theory as developed so far. This theory includes two important approximations. First, the use of Bardeen’s expression for the tunneling current, Eq. (11.3), assumes that tunneling occurs between undistorted states of the tip and the sample. Put differently, the two sides between which electrons tunnel are considered to be far enough apart to not influence each other’s electronic states. This is not necessarily the case for realistic situations encountered in STM experiments. In fact, the tip and sample surface often come to within a few angstroms of each other, in which case they certainly affect each other’s electronic wavefunctions. The second important approximation is that the tip electronic states involved in the tunneling are simple $s$-like states. Actually, the tips commonly employed in STM experiments are made of transition ($d$-electron) metals, which are reasonably hard, making it possible to produce stable, very sharp tips (the STM signal is optimal when the tunneling takes place through a single atom at the very edge of the tip). Examples of metals typically used in STM tips are W, Pt and Ir. In all these cases the relevant electronic states of the tip at the Fermi level have $d$ character. Chen [159] developed an extension of the Tersoff–Hamann theory which takes into account these features by employing Green’s function techniques which are beyond the scope of the present treatment. The basic result is that the expression for the tunneling current, instead of being proportional to the surface wavefunction magnitude as given in Eqs. (11.4) and (11.5), is proportional to the magnitude of derivatives of the wavefunctions, which depend on the nature of the tip electronic state. A summary of the relevant expressions is given in Table 11.1. Thus, the Tersoff–Hamann–Chen theory establishes that the STM essentially produces an image of the electron density associated with the surface of the sample, constructed from sample electronic states within $eV$ of the Fermi level. This was a crucial step toward a proper interpretation of the images produced by STM experiments.
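
To make the role of the tip orbital more tangible, the sketch below evaluates the $s$ and $p_\alpha$ entries of Table 11.1, that is, $|\psi_i|^2$ and $3\kappa^{-2}|\partial_\alpha \psi_i|^2$, for a model sample wavefunction. It is only an illustration: the function names, the finite-difference derivatives, the value of $\kappa$ and the laterally uniform, exponentially decaying sample state are all assumptions made for the example.

```python
import math

def tip_contributions(psi, position, kappa, h=1e-4):
    """Contributions |psi|^2 (s tip) and 3 kappa^-2 |d psi / d alpha|^2 (p tips).

    psi is any callable psi(x, y, z) returning a real or complex value;
    derivatives are taken by central finite differences with step h.
    """
    x, y, z = position
    value = psi(x, y, z)
    grads = {
        "x": (psi(x + h, y, z) - psi(x - h, y, z)) / (2 * h),
        "y": (psi(x, y + h, z) - psi(x, y - h, z)) / (2 * h),
        "z": (psi(x, y, z + h) - psi(x, y, z - h)) / (2 * h),
    }
    result = {"s": abs(value) ** 2}
    for axis, grad in grads.items():
        result["p_" + axis] = 3.0 * abs(grad) ** 2 / kappa ** 2
    return result

kappa = 1.0  # 1/Angstrom, illustrative value

def sample_state(x, y, z):
    """Hypothetical laterally uniform sample state decaying into the vacuum."""
    return math.exp(-kappa * z)

contrib = tip_contributions(sample_state, (0.0, 0.0, 5.0), kappa)
for orbital, weight in contrib.items():
    print(orbital, weight / contrib["s"])
```

For this particular model state only the $p_z$ tip orbital responds to the decay into the vacuum, and its contribution is a fixed multiple of the $s$ contribution; a sample state with lateral structure would also produce non-zero $p_x$ and $p_y$ terms.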

Table 11.1. Contribution of different types of orbitals associated with the STM tip to the tunneling current $I$. The $\psi_i$’s are sample wavefunctions and $\partial_\alpha \equiv \partial/\partial\alpha$ ($\alpha = x, y, z$). All quantities are evaluated at the position of the tip orbital, and $\kappa$ is given by Eq. (11.6).

| Tip orbital | Contribution to $I$ |
| --- | --- |
| $s$ | $\lvert\psi_i\rvert^2$ |
| $p_\alpha$ ($\alpha = x, y, z$) | $3\kappa^{-2}\lvert\partial_\alpha \psi_i\rvert^2$ |
| $d_{\alpha\beta}$ ($\alpha, \beta = x, y, z$) | $15\kappa^{-4}\lvert\partial^2_{\alpha\beta} \psi_i\rvert^2$ |
| $d_{x^2-y^2}$ | $\frac{15}{4}\kappa^{-4}\lvert\partial_x^2 \psi_i - \partial_y^2 \psi_i\rvert^2$ |
| $d_{3z^2-r^2}$ | $\frac{5}{4}\kappa^{-4}\lvert 3\partial_z^2 \psi_i - \kappa^2\psi_i\rvert^2$ |

To the extent that these states correspond to atomic positions on the surface, as they usually (but not always) do, the STM image can indeed be thought of as a “picture” of the atomic structure of the surface. The exceptions to this simple picture have to do with situations where the electronic states of the surface have a structure which is more complex than the underlying atomic structure. It is not hard to imagine how this might come about, since electrons are distributed so as to minimize the total energy of the system according to the rules discussed in chapter 2: while this usually involves shielding of the positive ionic cores, it can often produce more elaborate patterns of the electronic charge distribution which are significantly different from a simple envelope of the atomic positions. For this reason, it is important to compare STM images with theoretical electronic structure simulations, in order to establish the actual structure of the surface. This has become a standard approach in interpreting STM experiments, especially in situations which involve multicomponent systems (for recent examples of such studies see, for example, Ref. [161]). In general, STM images provide an unprecedented amount of information about the structural and electronic properties of surfaces. STM techniques have even been used to manipulate atoms or molecules on a surface [161–163], to affect their chemical bonding [165], and to observe standing waves of electron density on the surface induced by microscopic structural features [166, 167].

11.2 Surface reconstruction

The presence of the surface produces an abrupt change in the external potential that the electrons feel: the potential is that of the bulk crystal below the surface
11. they are quite different in metals and semiconductors). but its nature in real systems is more complex than the simple picture we presented here based on the jellium model. Illustration of surface features of the jellium model.6. The + and − signs inside circles indicate the charge imbalance which gives rise to the surface dipole. The thicker wavy line is the electronic charge density n (z ). At the simplest level. 11. The need of the system to minimize the energy given this change in the external potential leads to interesting effects both in the atomic and electronic structure near the surface. The electronic charge distribution. The charge imbalance leads to the formation of a dipole moment associated with the presence of the surface. These are known as “Friedel oscillations” and have a characteristic wavelength of π/ kF . related to the average density n of the n(z) n nI(z) 0 z Figure 11.2 Surface reconstruction 395 and zero above the surface. but involves oscillations which extend well into the bulk. The presence of the surface is thus marked by an abrupt step in the ionic potential. we consider some general features of the surface electronic structure. with a slightly positive net charge just below the surface and a slightly negative net charge just above the surface. which changes smoothly near the surface and exhibits Friedel oscillations within the bulk. the change of the electronic charge density is not monotonic at the surface step. The solid line is the ionic density n I (z ). Before embarking on this description. where kF is the Fermi momentum. This so called surface dipole is a common feature of all surfaces. this is the generalization of the jellium model of the bulk crystal (see chapter 2). as illustrated in Fig. as illustrated in Fig. In addition to the surface dipole. However. we can think of the external potential due to the ions as being constant within the crystal and zero outside. 
These changes in structure depend on the nature of the physical system (for instance.6. which is constant in the bulk and drops abruptly to zero outside the solid. will undergo a change from its constant value far into the bulk to zero far outside it. 11. .6. This smooth transition gives rise to a total charge imbalance. the electronic charge distribution does not change abruptly near the surface but goes smoothly from one limiting value to the other [168]. and are discussed below in some detail for representative cases. which in general must follow the behavior of the ionic potential.

(D. with vectors b1 . the (111) surface of the diamond lattice corresponds to ˆ +y ˆ +z ˆ direction. the (001) surface of the diamond lattice corresponds to a plane perpendicular to the z axis of the cube. a n 1 × n 2 reconstruction on the (klm ) plane is one in which the lattice vectors on the plane are n 1 and n 2 times the primitive lattice vectors of the ideal. and a2 = a2x x ˆ + a2 y y ˆ . for example.10) in Appendix D). that is. rather than the primitive unit cell which has shorter vectors but not along cubic directions (see chapter 3). The differences can be small. which is referred to as “surface relaxation”. Since FCC and BCC crystals are part of the cubic system. Similarly. These vectors are multiples of lattice a1 = a1x x vectors of the three-dimensional crystal. An ideal crystal surface is characterized by two lattice vectors on the surface plane. For instance. Surfaces are identiﬁed by the bulk plane to which they correspond.396 11 Defects III: surfaces and interfaces jellium model by kF = (3π 2 n )1/3 (see Eq. which is referred to as “surface reconstruction”. Surfaces of lattices with more complex structure (such as the diamond or zincblende lattices which are FCC lattices with a two-atom basis). ˆ + a1 y y ˆ . The standard notation for this is the Miller indices of the conventional lattice. where c stands for “centered”. Simple integer multiples of the primitive lattice vectors in the bulk-terminated plane often are not adequate to describe the reconstruction. The existence of the Friedel oscillations is a result of the plane wave nature of electronic states associated with the jellium model: the sharp feature in the ionic potential at the surface. or large. which the electrons try to screen. to have reconstructions √ √ of the form n 1 × n 2 . It is possible. The characteristic feature of crystal surfaces is that the atoms on the surface assume positions different from those on a bulk-terminated plane. bulk-terminated (klm ) plane. or c(n 1 × n 2 ). 
which is a multiple of the PUC. For example. unreconstructed. the (001) surface of a simple cubic crystal corresponds to a plane perpendicular to the z axis of the cube. The changes in atomic positions can be such that the periodicity of the surface differs from the periodicity of atoms on a bulk-terminated plane of the same orientation. producing a structure that differs drastically from what is encountered in the bulk. with the components corresponding to the Fermi momentum giving the largest contributions. surfaces of these lattices are denoted with respect to the conventional cubic cell. The standard way to describe the new periodicity of the surface is by multiples of the lattice vectors of the corresponding bulk-terminated plane. The cube actually contains four PUCs of the diamond lattice and eight atoms. are also described by the Miller indices of the cubic lattice. one of the main diagonals a plane perpendicular to the x of the cube. mentioned in the Further reading section. For a detailed discussion of the surface physics of the jellium model see the review article by Lang. b2 such that bi · a j = 2πδi j . We turn our attention next to speciﬁc examples of real crystal surfaces. For example. has Fourier components of all wavevectors. . The corresponding reciprocal space is also two dimensional.
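The two formulas quoted in this part of the section, the jellium relation $k_F = (3\pi^2 n)^{1/3}$ with its Friedel wavelength $\pi/k_F$, and the condition $\mathbf{b}_i \cdot \mathbf{a}_j = 2\pi\delta_{ij}$ for the two-dimensional reciprocal vectors, can be checked numerically. The following is only a minimal sketch: the function names, the density value and the choice of a hexagonal surface cell with unit lattice constant are illustrative assumptions, not taken from the text.

```python
import math

def fermi_wavevector(n):
    """Jellium Fermi wavevector k_F = (3 pi^2 n)^(1/3) for electron density n."""
    return (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)

def friedel_wavelength(n):
    """Characteristic wavelength pi / k_F of the Friedel oscillations."""
    return math.pi / fermi_wavevector(n)

def dot(u, v):
    """Dot product of two 2D vectors given as (x, y) tuples."""
    return u[0] * v[0] + u[1] * v[1]

def reciprocal_vectors_2d(a1, a2):
    """Return (b1, b2) such that b_i . a_j = 2 pi delta_ij."""
    det = a1[0] * a2[1] - a1[1] * a2[0]  # signed area of the surface unit cell
    if det == 0:
        raise ValueError("a1 and a2 must not be parallel")
    f = 2.0 * math.pi / det
    b1 = (f * a2[1], -f * a2[0])
    b2 = (-f * a1[1], f * a1[0])
    return b1, b2

# Illustrative inputs (assumptions): a density in inverse cubic angstroms
# and a hexagonal surface cell with lattice constant 1.
n_example = 0.0265
a1 = (1.0, 0.0)
a2 = (0.5, math.sqrt(3.0) / 2.0)
b1, b2 = reciprocal_vectors_2d(a1, a2)

print("k_F =", fermi_wavevector(n_example))
print("Friedel wavelength =", friedel_wavelength(n_example))
print("b1 . a1 =", dot(b1, a1), " b1 . a2 =", dot(b1, a2))
```

The wavelength comes out in the inverse of whatever length unit is used for the density, so a density in inverse cubic angstroms gives a wavelength in angstroms.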

Surface reconstructions are common in both metal surfaces and semiconductor surfaces. The driving force behind the reconstruction is the need of the system to repair the damage done by the introduction of the surface, which severs the bonds of atoms on the exposed plane.

In metals with a close-packed bulk crystal structure, the surface atoms try to regain the number of neighbors they had in the bulk through the surface relaxation or reconstruction. Typically it is advantageous to undergo a surface reconstruction when the packing of atoms on the surface plane is not optimal. This is illustrated in Fig. 11.7. For simplicity and clarity we consider a two-dimensional example. We define as surface atoms all those atoms which have fewer neighbors than in the bulk structure. We examine a particular cleaving of this 2D crystal which exposes a surface that does not correspond to close packing of the atoms on the surface plane. The surface atoms in our example have two-fold coordination in the cleaved surface, rather than their usual four-fold coordination in the bulk. It is possible to increase the average coordination of surface atoms by introducing a reconstruction: removal of every other row of surface atoms (every other surface atom in our 2D example) leaves the rest of the atoms two-fold or three-fold coordinated; it also increases the size of the unit cell by a factor of 2, as shown in Fig. 11.7. The new surface unit cell contains one two-fold and two three-fold coordinated atoms, giving an average coordination of 8/3, a significant improvement over the unreconstructed bulk-terminated plane. What has happened due to the reconstruction is that locally the surface looks as if it were composed of smaller sections of a close-packed plane, on which every surface atom has three-fold coordination. This is actually a situation quite common on surfaces of FCC metals, in which case the close-packed plane is the (111) crystallographic plane while a plane with lower coordination of the surface atoms is the (001) plane. This is known as the “missing row” reconstruction.

Figure 11.7. The missing row reconstruction in close-packed surfaces, illustrated in a 2D example. Left: the unreconstructed, bulk-terminated plane with surface atoms two-fold coordinated. The horizontal dashed line denotes the surface plane (average position of surface atoms) with surface unit cell vector $\mathbf{a}_1$. Right: the reconstructed surface with every second atom missing and the remaining atoms having either two-fold or three-fold coordination. The horizontal dashed line denotes the surface plane with surface unit cell vector $2\mathbf{a}_1$, while the inclined one indicates a plane of close-packed atoms. The labels of surface normal vectors denote the corresponding surfaces in the 3D FCC structure.

When the 2D packing of surface atoms is already optimal, the effect of reconstruction cannot be very useful. In such cases, the surface layer simply recedes toward the bulk in an attempt to enhance the interactions of surface atoms with their remaining neighbors. This results in a shortening of the first-to-second layer spacing. To compensate for this distortion in bonding below the surface, the second-to-third layer spacing is expanded, but to a lesser extent. This oscillatory behavior continues for a few layers and eventually dies out. In general, the behavior of metal surfaces, as exhibited by their chemical and physical properties, is as much influenced by the presence of imperfections, such as steps and islands on the surface plane, as by the surface reconstruction.

In semiconductors, the surface atoms try to repair the broken covalent bonds to their missing neighbors by changing positions and creating new covalent bonds where possible. This may involve substantial rearrangement of the surface atoms, giving rise to interesting and characteristic patterns on the surface. Semiconductor surface reconstructions have a very pronounced effect on the chemical and physical properties of the surface. The general tendency is that the reconstruction restores the semiconducting character of the surface, which had been disturbed by the breaking of covalent bonds when the surface was created. There are a few simple and quite general structural patterns that allow semiconductor surfaces to regain their semiconducting character. We discuss next some examples of semiconductor surface reconstructions to illustrate these general themes.

11.2.1 Dimerization: the Si(001) surface

We begin with what is perhaps the most common and most extensively studied semiconductor surface, the Si(001) surface. The reason for its extensive study is that most electronic devices made out of silicon are built on crystalline substrates with the (001) surface. The (001) bulk-terminated plane consists of atoms that have two covalent bonds to the rest of the crystal, while the other two bonds on the surface side have been severed (see Fig. 11.8). The severed bonds are called dangling bonds. Each dangling bond is half-filled, containing one electron (since a proper covalent bond contains two electrons). If we consider a tight-binding approximation of the electronic structure, with an $sp^3$ basis of four orbitals associated with each atom, it follows that the dangling bond states have an energy in the middle of the band gap. This energy is also the Fermi level, since these are the highest occupied states.
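One way to see why a dangling bond state falls in the middle of the gap is the simplest bond-orbital version of the tight-binding argument. This is a sketch under stated assumptions rather than the derivation used in the text: each atom carries four $sp^3$ hybrids of energy $(\epsilon_s + 3\epsilon_p)/4$, two hybrids pointing at each other couple through a single matrix element and split symmetrically into bonding and antibonding levels, and a hybrid with no partner stays at the hybrid energy. The numerical values below are hypothetical and chosen only for illustration.

```python
def sp3_hybrid_energy(e_s, e_p):
    """Energy of an sp3 hybrid built from one s and three p orbitals."""
    return (e_s + 3.0 * e_p) / 4.0

def bond_levels(e_s, e_p, v_hybrid):
    """Levels of a two-hybrid bond and of an unpaired (dangling) hybrid.

    v_hybrid is the coupling between two hybrids pointing at each other;
    only its magnitude matters for the splitting.
    """
    e_h = sp3_hybrid_energy(e_s, e_p)
    split = abs(v_hybrid)
    return {
        "bonding": e_h - split,      # broadens into the valence band
        "dangling": e_h,             # no partner hybrid, so no splitting
        "antibonding": e_h + split,  # broadens into the conduction band
    }

# Hypothetical on-site energies and coupling (in eV), for illustration only.
levels = bond_levels(e_s=-13.0, e_p=-7.0, v_hybrid=4.0)
midpoint = 0.5 * (levels["bonding"] + levels["antibonding"])
print(levels)
print("dangling level sits at the bonding/antibonding midpoint:",
      levels["dangling"] == midpoint)
```

In a real crystal the bonding and antibonding levels broaden into bands, so the dangling bond level ends up near, not exactly at, the center of the gap.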

In reality, the unit cell of the surface reconstruction is (2 × 1) or multiples of that, which means that there are at least two surface atoms per surface unit cell (see Fig. 11.8). These atoms come together in pairs, hence the (2 × 1) periodicity, to form new bonds, called dimer bonds, with each bonded pair of atoms called a dimer. The formation of a dimer bond eliminates two of the dangling bonds in the unit cell, one per dimer atom. This leaves two dangling bonds, one per dimer atom, which for symmetric dimers are degenerate and half-filled each (they can accommodate two electrons of opposite spin but are occupied only by one). The energy of these states determines the position of the Fermi level, since they are the highest occupied states. The dimers, however, do not have to be symmetric: there is no symmetry of the surface that requires this. Indeed, in the lowest energy configuration of the system, the dimers are tilted: one of the atoms is a little higher and the other a little lower relative to the average height of surface atoms, which is taken as the macroscopic definition of the surface plane. This tilting has an important effect on the electronic levels, which we analyze through the lens of the tight-binding approximation and is illustrated schematically in Fig. 11.8.

Figure 11.8. Top panel: reconstruction of the Si(001) surface. Left to right: the bulk-terminated (001) plane, with every surface atom having two broken (dangling) bonds; the dimerized surface, in which the surface atoms have come closer together in pairs to form the (2 × 1) dimer reconstruction, with symmetric dimers; the tilted dimer reconstruction in the (2 × 1) pattern. Bottom panel: schematic representation of the bands associated with the Si(001) dimer reconstruction. The shaded regions represent the projections of valence and conduction bands, while the up and down arrows represent single electrons in the two spin states. Left to right: the states associated with the bulk-terminated (001) plane, i.e. the degenerate, half-filled states of the dangling bonds in the mid-gap region which are coincident with the Fermi level, indicated by the dashed line; the states of the symmetric dimers, with the bonding (fully occupied) and antibonding (empty) combinations well within the valence and conduction bands, respectively, and the remaining half-filled dangling bond states in the mid-gap region; the states of the tilted dimer reconstruction, with the fully occupied and the empty surface state, separated by a small gap.

The up-atom of the dimer has three bonds which form angles between them close to 90°. Therefore, it is in a bonding configuration close to $p^3$, that is, it forms covalent bonds through its three p orbitals, while its s orbital does not participate in bonding. At the same time, the down-atom of the dimer has three bonds which are almost on a plane. Therefore, it is in a bonding configuration close to $sp^2$, that is, it forms three bonding orbitals with its one s and two of its p states, while its third p state, the one perpendicular to the plane of the three bonds, does not participate in bonding. This situation is similar to graphite, discussed in chapters 1 and 4. Of the two orbitals that do not participate in bonding, the s orbital of the up-atom has lower energy than the p orbital of the down-atom. Consequently, the two remaining dangling bond electrons are accommodated by the up-atom s orbital, which becomes filled, while the down-atom p orbital is left empty. The net effect is that the surface has semiconducting character again, with a small band gap between the occupied up-atom s state and the unoccupied down-atom p state. The Fermi level is now situated in the middle of the surface band gap, which is smaller than the band gap of the bulk.

This example illustrates two important effects of surface reconstruction: a change in bonding character called rehybridization, and a transfer of electrons from one surface atom to another, in this case from the down-atom to the up-atom. The tilting of the dimer is another manifestation of the Jahn–Teller effect, which we discussed in connection to the Si vacancy in chapter 9. In this effect, a pair of degenerate, half-filled states are split to produce a filled and an empty state. All these changes in bonding geometry lower the total energy of the system, leading to a stable configuration.

Highly elaborate first-principles calculations based on Density Functional Theory verify this simplified picture, as far as the atomic relaxation on the surface, the hybrids involved in the bonding of surface atoms, and the electronic levels associated with these hybrids, are concerned. We should also note that not all the dimers need to be tilted in the same way. In fact, alternating tilting of the dimers leads to more complex reconstruction patterns, which can become stable under certain conditions. The above picture is also verified experimentally. The reconstruction pattern and periodicity are established through scattering experiments, while the atomistic structure is established directly through STM images.

11.2.2 Relaxation: the GaAs(110) surface

We discuss next the surface structure of a compound semiconductor, GaAs, to illustrate the common features and the differences from elemental semiconductor surfaces. We examine the (110) surface of GaAs, which contains equal numbers of Ga and As atoms; other surface planes in this crystal, such as the (001) or (111) planes, contain only one of the two species of atoms present in the crystal. The former type of surface is called non-polar, the latter is called polar. The ratio of the two species of atoms is called stoichiometry; in non-polar planes the stoichiometry is the same as in the bulk, while in polar planes the stoichiometry deviates from its bulk value. Top and side views of the GaAs(110) surface are shown in Fig. 11.9.

Figure 11.9. Structure of the GaAs(110) surface: Left: top view of the bulk-terminated (110) plane, containing equal numbers of Ga and As atoms. Right: side views of the surface before relaxation (below) and after relaxation (above). The surface unit cell remains unchanged after relaxation, the same as the unit cell in the bulk-terminated plane.

The first interesting feature of this non-polar, compound semiconductor surface in its equilibrium state is that its unit cell remains the same as that of the bulk-terminated plane. Moreover, the number of atoms in the surface unit cell is the same as in the bulk-terminated plane. Accordingly, we speak of surface relaxation, rather than surface reconstruction, for this case. The relaxation can be explained in simple chemical terms, as was done for the rehybridization of the Si(001) tilted-dimer surface. In the bulk-terminated plane, each atom has three bonds to other surface or subsurface atoms, and one of its bonds has been severed. The broken bonds of the Ga and As surface atoms are partially filled with electrons, in a situation similar to the broken bonds of the Si(001) surface. In the GaAs case, however, there is an important difference, namely that the electronic levels corresponding to the two broken bonds are not degenerate. This is because the levels, in a tight-binding sense, originate from $sp^3$ hybrids associated with the Ga and As atoms, and these are not equivalent. The hybrids associated with the As atoms lie lower in energy since As is more electronegative, and has a higher valence charge, than Ga. Consequently, we expect the electronic charge to move from the higher energy level, the Ga dangling bond, to the lower energy level, the As dangling bond.
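The level-filling argument used for the Si(001) dimers, and again for the pair of inequivalent dangling bonds on GaAs(110), can be sketched as a small bookkeeping exercise: two electrons are placed in two surface levels, lowest first, with degenerate levels sharing them equally. This is an illustrative sketch only. The function names are hypothetical and the level energies are arbitrary numbers measured from mid-gap, not values from the text.

```python
def fill_levels(levels, n_electrons, tol=1e-9):
    """Fill levels (name -> energy) from the bottom, two electrons per level.

    Levels closer in energy than tol are treated as degenerate and share
    the available electrons equally.
    """
    occupations = {name: 0.0 for name in levels}
    remaining = float(n_electrons)
    names = sorted(levels, key=levels.get)
    i = 0
    while i < len(names) and remaining > 0:
        group = [names[i]]
        while (i + len(group) < len(names)
               and abs(levels[names[i + len(group)]] - levels[names[i]]) <= tol):
            group.append(names[i + len(group)])
        share = min(remaining, 2.0 * len(group))
        for name in group:
            occupations[name] = share / len(group)
        remaining -= share
        i += len(group)
    return occupations

def surface_gap(levels, occupations):
    """Energy between the highest occupied and lowest not-full level."""
    occupied = [levels[k] for k, occ in occupations.items() if occ > 0]
    not_full = [levels[k] for k, occ in occupations.items() if occ < 2]
    if not occupied or not not_full:
        return None
    return min(not_full) - max(occupied)

# Arbitrary illustrative energies (assumptions), relative to mid-gap.
symmetric = {"dimer_atom_1": 0.0, "dimer_atom_2": 0.0}
tilted = {"up_atom_s": -0.4, "down_atom_p": 0.4}

for label, lv in (("symmetric", symmetric), ("tilted", tilted)):
    occ = fill_levels(lv, 2)
    print(label, occ, "gap =", surface_gap(lv, occ))
```

With degenerate levels each state is half-filled and the gap is zero, which is the metallic situation of the symmetric dimers. Once the two levels differ, the lower one takes both electrons and a gap opens, which is the behavior described for the tilted dimers and for the Ga and As dangling bonds.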

A relaxation of the two species of atoms on the surface enhances this difference. Specifically, the As atom is puckered upward from the mean height of surface atoms, while the Ga atom recedes toward the bulk. This places the As atom in a $p^3$-like bonding arrangement, and the Ga atom in an almost planar, $sp^2$-like bonding arrangement. As a result, the non-bonding electronic level of the As atom is essentially an s level, which lies even lower in energy than the As $sp^3$ hybrid corresponding to the unrelaxed As dangling bond. By the same token, the non-bonding electronic level of the Ga atom is essentially a p level, perpendicular to the plane of the three $sp^2$ bonding orbitals of Ga, which lies even higher in energy than the Ga $sp^3$ hybrid corresponding to the unrelaxed Ga dangling bond. These changes make the transfer of charge from the partially filled Ga dangling bond to the partially filled As dangling bond even more energetically favorable than in the unrelaxed surface. Moreover, this charge transfer is enough to induce semiconducting character to the surface, since it widens the gap between the occupied As level and the unoccupied Ga level. In fact, the two states, after relaxation, are separated enough to restore the full band gap of the bulk! In other words, after relaxation, the occupied As s level lies below the top of the bulk valence band, whereas the unoccupied Ga p level lies above the bottom of the bulk conduction band, as indicated in Fig. 11.10.

Figure 11.10. Schematic representation of the bands associated with the GaAs(110) surface relaxation. The shaded regions represent the projections of valence and conduction bands, while the up and down arrows represent partially occupied states in the two spin states. Left: the states associated with the bulk-terminated (110) plane; the Ga dangling bond state is higher in energy than the As dangling bond state, and both states lie inside the bulk band gap. Right: the states of the relaxed surface, with the fully occupied As s state below the top of the valence band, and the empty Ga p state above the bottom of the conduction band; the bulk band gap of the semiconductor has been fully restored by the relaxation. (Panel labels: Unrelaxed Surface, with Ga (3/4)e and As (5/4)e; Relaxed Surface.)

This picture needs to be complemented by a careful counting of the number of electrons in each state, to make sure that there are no unpaired electrons. The standard method for doing this is to associate a number of electrons per dangling bond equal to the valence of each atom divided by four, since each atom participates in the formation of four covalent bonds in the bulk structure of GaAs, the zincblende crystal (see chapter 1). With this scheme, each Ga dangling bond is assigned 3/4 of an electron since Ga has a valence of 3, whereas each As dangling bond is assigned 5/4 of an electron since As has a valence of 5. With these assignments, the above analysis of the energetics of individual electronic levels suggests that 3/4 of an electron leaves the surface Ga dangling bond and is transferred to the As surface dangling bond, which becomes fully occupied containing 3/4 + 5/4 = 2 electrons. At the same time, rehybridization due to the relaxation drives the energy of these states beyond the limits of the bulk band gap. Indeed, this simple picture is verified by elaborate quantum mechanical calculations, which show that there are no surface states in the bulk band gap, and that there is a relaxation of the surface atoms as described above. Confirmation comes also from experiment: STM images clearly indicate that the occupied states on the surface are associated with As atoms, while the unoccupied states are associated with Ga atoms [169].

11.2.3 Adatoms and passivation: the Si(111) surface

Finally, we discuss one last example of surface reconstruction which is qualitatively different from what we have seen so far, and will also help us introduce the idea of chemical passivation of surfaces. This example concerns the Si(111) surface. For this surface, the ideal bulk-terminated plane consists of atoms that have three neighbors on the side of the substrate and are missing one of their neighbors on the vacuum side (see Fig. 11.11). By analogy to what we discussed above for the Si(001) surface, the dangling bonds must contain one electron each in the unreconstructed surface. The energy of the dangling bond state lies in the middle of the band gap, and since this state is half filled, its energy is coincident with the Fermi level. This is a highly unstable situation, with high energy. There are two ways to remedy this and restore the semiconducting character of the surface: the first and simpler way is to introduce a layer of foreign atoms with just the right valence, without changing the basic surface geometry; the second and more complex way involves extra atoms called adatoms, which can be either intrinsic (Si) or extrinsic (foreign atoms), and which drastically change the surface geometry and the periodicity of the surface. We examine these two situations in turn.

Figure 11.11. Top and side views of the surface bilayer of Si(111): the larger shaded circles are the surface atoms; the smaller shaded circles are the subsurface atoms which are bonded to the next layer below. The large open circles represent the adatoms in the two possible configurations, the H3 and T4 positions.

If the atoms at the top layer on the Si(111) surface did not have four electrons but three, then there would not be a partially filled surface state due to the dangling bonds. This simple argument actually works in practice: under the proper conditions, it is possible to replace the surface layer of Si(111) with group-III atoms (such as Ga) which have only three valence electrons, and hence can form only three covalent bonds, which is all that is required of the surface atoms. The resulting structure has the same periodicity as the bulk-terminated plane and has a surface-related electronic level which is empty. This level lies somewhat higher in energy than the Si dangling bond, since the Ga $sp^3$ hybrids have higher energy than the Si $sp^3$ hybrids. A different way of achieving the same effect is to replace the surface layer of Si atoms by atoms with one more valence electron (group-V atoms such as As). For these atoms the presence of the extra electron renders the dangling bond state full. The As-related electronic state lies somewhat lower in energy than the Si dangling bond, since the As $sp^3$ hybrids have lower energy than the Si $sp^3$ hybrids. In both of these cases, slight relaxation of the foreign atoms on the surface helps move the surface-related states outside the gap region. The Ga atoms recede toward the bulk and become almost planar with their three neighbors in an $sp^2$ bonding arrangement, which makes the unoccupied level have p character and hence higher energy; this pushes the energy of the unoccupied level higher than the bottom of the conduction band, leaving the bulk band gap free. Similarly, the As atoms move away from the bulk and become almost pyramidal with their three neighbors in a $p^3$-like bonding arrangement, which makes the non-bonding occupied level have s character and hence lower energy; this pushes the energy of the occupied level lower than the top of the valence band, leaving the bulk band gap free. In both cases, the net result is a structure of low energy and much reduced chemical reactivity, that is, a passivated surface.
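The electron counting described in this section is simple enough to check with exact fractions. The sketch below covers two bookkeeping rules taken from the text: the valence-divided-by-four assignment used for the GaAs(110) dangling bonds, and the count of electrons left on a three-fold bonded surface atom on Si(111), where each of the three bonds uses one of the atom's valence electrons. The function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction

def electrons_per_dangling_bond(valence, bonds_in_bulk=4):
    """Electrons assigned to one dangling bond: valence / number of bulk bonds."""
    return Fraction(valence, bonds_in_bulk)

def leftover_electrons(valence, bonds_formed=3):
    """Electrons left in the non-bonding orbital after forming covalent bonds,
    assuming the atom contributes one electron to each bond."""
    return valence - bonds_formed

# GaAs(110): Ga (valence 3) and As (valence 5) dangling bonds.
ga = electrons_per_dangling_bond(3)
as_ = electrons_per_dangling_bond(5)
print("Ga dangling bond:", ga, " As dangling bond:", as_, " total:", ga + as_)

# Si(111) top layer occupied by Si, Ga, or As, each bonded to three neighbors.
for element, valence in (("Si", 4), ("Ga", 3), ("As", 5)):
    print(element, "non-bonding electrons:", leftover_electrons(valence))
```

The Ga and As assignments add up to exactly two electrons, so the As level can be completely filled with none left unpaired. On Si(111), the leftover count of one for Si, zero for Ga and two for As reproduces the half-filled, empty and full surface states discussed above.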

We can hypothesize that yet a different way of achieving chemical passivation of this surface is to saturate the surface Si dangling bonds through formation of covalent bonds to elements that prefer to have exactly one covalent bond. Such elements are the alkali metals (see chapter 1), because they have only one s valence electron. It turns out that this simple hypothesis works well for H: when each surface dangling bond is saturated by a H atom, a stable, chemically passive surface is obtained. Since it takes exactly one H atom to saturate one Si surface dangling bond, the resulting structure of the Si surface has the same periodicity, and in fact has the same atomic structure as the bulk-terminated plane. The Si–H bond is even stronger than the Si–Si bond in the bulk. Thus, by adding H, the structure of the bulk-terminated plane can be restored and maintained as a stable configuration. Similar effects can be obtained in other surfaces with the proper type of adsorbate atoms, which have been called “valence mending adsorbates” (see, for example, Ref. [170]). The simple hypothesis of saturating the Si(111) surface dangling bonds with monovalent elements does not work for the other alkalis, which have more complex interactions with the surface atoms, leading to more complicated surface reconstructions.

There is a way to passivate the Si(111) surface by adding extra atoms on the surface, at special sites. These atoms are called adatoms and can be either native Si atoms (intrinsic adatoms) or foreign atoms (extrinsic adatoms). There are two positions that adatoms can assume to form stable or metastable structures, as illustrated in Fig. 11.11. The first position involves placing the adatom directly above a second-layer Si atom, and bonding it to the three surface Si atoms which surround this second-layer atom; this position is called the T4 site for being on Top of a second-layer atom and having four nearest neighbors, the three surface atoms to which it is bonded, and the second-layer atom directly below it. The second position involves placing the adatom above the center of a six-fold ring formed by three first-layer and three second-layer Si atoms, and bonding it to the three first-layer atoms of the ring; this position is called the H3 site for being at the center of a Hexagon of first- and second-layer atoms, and having three nearest neighbors. In both cases the adatom is three-fold bonded while the surface atoms to which it is bonded become four-fold coordinated. Thus, each adatom saturates three surface dangling bonds by forming covalent bonds with three surface Si atoms. Now, if the adatom is of a chemical type that prefers to form exactly three covalent bonds like the group-III elements Al, Ga and In, then placing it at one of the two stable positions will result in a chemically passive and stable structure. This will be the case if the entire surface is covered by adatoms, which corresponds to a reconstruction with a unit cell containing one adatom and three surface atoms. The resulting periodicity is designated (√3 × √3), shown in Fig. 11.12, since the new surface lattice vectors are larger by a factor of √3 compared with the original lattice vectors of the bulk-terminated plane. The new lattice vectors are also rotated by 30° relative to the original lattice vectors, so this reconstruction is sometimes designated (√3 × √3)R30°, but this is redundant, since there is only one way to form lattice vectors larger than the original ones by a factor of √3. Indeed, under certain deposition conditions, the Si(111) surface can be covered by group-III adatoms (Al, Ga, In) in a (√3 × √3) reconstruction involving one adatom per reconstructed surface unit cell. It turns out that the T4 position has in all cases lower energy, so that it is the stable adatom position, whereas the H3 position is metastable: it is higher in energy than the T4 position, but still a local minimum of the energy.

Figure 11.12. Adatom reconstructions of the Si(111) surface viewed from above. The small and medium sized shaded circles represent the atoms in the first bilayer, the large open circles represent the adatoms. The reconstructed unit cell vectors are indicated by arrows. Left: the (2 × 2) reconstruction with one adatom (at the T4 site) and four surface atoms (the three bonded to the adatom and a restatom); this version is appropriate for Si adatoms. Right: the (√3 × √3) reconstruction, with one adatom (at the T4 site) and three surface atoms per unit cell; this version is appropriate for trivalent adatoms (e.g. Al, Ga, In).

We mentioned that native Si adatoms can also exist on this surface. These adatoms have four valence electrons, so even though they saturate three surface dangling bonds, they are left with a dangling bond of their own after forming three bonds to surface atoms. This dangling bond will contain an unpaired electron. Thus, we would expect that this situation is not chemically passive. If all the surface dangling bonds were saturated by Si adatoms, that is, if a (√3 × √3) reconstruction with Si adatoms were formed, the structure would be unstable for another reason. The Si adatoms prefer, just like group-III adatoms, the T4 position; in order to form good covalent bonds to the surface atoms, i.e. bonds of the proper length and at proper angles among themselves, the adatoms pull the three surface atoms closer together, thereby inducing large compressive strain on the surface. This distortion is energetically costly. Both the imbalance of electronic charge due to the unpaired electron in the Si adatom dangling bond, and the compressive strain due to the presence of the adatom, can be remedied by leaving one out of every four surface atoms not bonded to any adatom. This surface atom, called the restatom, will also have a dangling bond with one unpaired electron in it. The two unpaired electrons, one on the restatom and one on the adatom, can now be paired through charge transfer, ending up in the state with lower energy, and leaving the other state empty. As in the examples discussed above, this situation restores the semiconducting character of the surface by opening up a small gap between filled and empty surface states. Moreover, the presence of the restatom has beneficial effects on the surface strain. Specifically, the restatom relaxes by receding toward the bulk, which means that it pushes its three neighbors away and induces tensile strain to the surface. The amount of tensile strain introduced in this way is close to what is needed to compensate for the compressive strain due to the presence of the adatom, so that the net strain is very close to zero [171]. By creating a surface unit cell that contains one adatom and one restatom, we can then achieve essentially a perfect structure, in what concerns both the electronic features and the balance of strain on the reconstructed surface. This unit cell consists of four surface atoms plus the adatom. A natural choice for this unit cell is one with lattice vectors twice as large in each direction as the original lattice vectors of the bulk-terminated plane, that is, a (2 × 2) reconstruction, as shown in Fig. 11.12.

It turns out that the actual reconstruction of the Si(111) surface is more complicated, and has in fact a (7 × 7) reconstruction, i.e. a unit cell 49 times larger than the original bulk-terminated plane! This reconstruction is quite remarkable, and consists of several interesting features, such as a stacking fault, dimers and a corner hole, as shown in Fig. 11.13: the stacking fault involves a 60° rotation of the surface layer relative to the subsurface layer, the dimers are formed by pairs of atoms in the subsurface layer which come together to make bonds, and at the corner holes three surface and one subsurface atoms of the first bilayer are missing. However, the main feature is the set of adatoms (a total of 12 in the unit cell), accompanied by restatoms (a total of six in the unit cell), which are locally arranged in a (2 × 2) pattern. Thus, the dominating features of the actual reconstruction of Si(111) are indeed adatom–restatom pairs, as discussed above. The reconstruction of the Ge(111) surface, another tetravalent element with the diamond bulk structure, is c(2 × 8), which is a simple variation of the (2 × 2) pattern, having as its dominant feature the adatom–restatom pair in each unit cell. Due to its large unit cell, it took a very long time to resolve the atomic structure of the Si(111) (7 × 7) reconstruction. In fact, the first STM images were crucial in establishing the atomic structure of this important reconstruction [172, 173]. The STM images of this surface basically pick out the adatom positions in both positive and negative bias [174]. Large-scale STM images of this surface give the impression of an incredibly intricate and beautiful lace pattern, as weaved by Nature, adatom by adatom (see Fig. 11.13).

and the large open circles represent the adatoms. the dimers. respectively).408 11 Defects III: surfaces and interfaces Stacking fault dimer adatom Corner hole restatom Figure 11. The reconstructed unit cell vectors are indicated by arrows. the 12 adatoms. This is in part sustained by the desire to control the growth of technologically important . The main features of the reconstruction are the corner holes.13. Bottom: a pattern of adatoms in an area containing several unit cells. as it would be observed in STM experiments. Top: the unit cell of the (7 × 7) reconstruction of the Si(111) surface viewed from above.3 Growth phenomena Great attention has also been paid to the dynamic evolution of crystal surfaces. The small and medium sized shaded circles represent the atoms in the ﬁrst bilayer (the subsurface and surface atoms. and the stacking fault on one-half of the unit cell (denoted by dashed lines). 11. one of the unit cells is outlined in white dashed lines.

t ). To the extent that such surfaces are not useful in technological applications.11. In macroscopic treatments of growth. the dimer reconstruction produces long rows of dimers. This in turn leads to highly anisotropic islands and anisotropic growth [178].3 Growth phenomena 409 materials. the main physical quantity is the height of the surface h (r. the relevant processes in growth are the deposition of atoms on the surface. driven mostly by technological demands for growing ever smaller structures with speciﬁc features. At the microscopic level. The precise way in which atoms move on the surface depends on the atomic structure of the terraces and steps. This justiﬁes the attention paid to surface reconstructions. For instance. which depends on the position on the surface r and evolves with . There is a different approach to the study of surface growth. A great deal of attention has been devoted to the statistical mechanical aspects of growth. such structures are expected to become the basis for next generation electronic and optical devices. which leads to a surface with very irregular features on all length scales. the roughening transition is to be avoided during growth. in which the microscopic details are coarse-grained and attention is paid only to the macroscopic features of the evolving surface proﬁle during growth. usually described as an average ﬂux F . motion along the dimer rows is much faster than motion across rows [174–176]). which is determined by the surface reconstruction. This motion involves diffusion of the atoms on the surface. which can play a decisive role in determining the growth mechanisms. on the Si(001) surface. and the subsequent motion of the atoms on the surface that allows them to be incorporated at lattice sites. which requires detailed knowledge of growth mechanisms starting at the atomistic level and reaching to the macroscopic scale. The basic phenomenon of interest in such studies is the so called roughening transition. 
Studying the dynamics of atoms on surfaces and trying to infer the type of ensuing growth has become a cottage industry in surface physics. Such approaches are based on statistical mechanics theories. We describe brieﬂy some of these aspects. and aim at describing the surface evolution on a scale that can be directly compared to macroscopic experimental observations. both on ﬂat surface regions (terraces) as well as around steps between atomic planes. Here we will give a very brief introduction to some basic concepts that are useful in describing growth phenomena of crystals (for more details see the books and monographs mentioned in the Further reading section). understanding the physics of this transition is as important as understanding the microscopic mechanisms responsible for growth. Accordingly. Equally important is the fundamental interest in dynamical phenomena on surfaces: these systems provide the opportunity to use elaborate theoretical tools for their study in an effort to gain deeper understanding of the physics.

The average height $\bar{h}(t)$ is defined as an integral over the surface $S$:

$$\bar{h}(t) = \frac{1}{A} \int_S h(\mathbf{r}, t)\, d\mathbf{r} \tag{11.11}$$

where $A$ is the surface area. In terms of the average height $\bar{h}(t)$ and the local height $h(\mathbf{r}, t)$, the surface width is given by

$$w(L, t) = \left[ \frac{1}{L^2} \int_S \left[ h(\mathbf{r}, t) - \bar{h}(t) \right]^2 d\mathbf{r} \right]^{1/2} \tag{11.12}$$

where we have defined the length scale $L$ of the surface as $L^2 = A$.

In a macroscopic description of the surface height during growth, we assume that the surface width scales with the time $t$ for some initial period, beyond which it scales with the system size $L$. This is based on empirical observations of how simple models of surface growth behave. The time at which the scaling switches from one to the other is called the crossover time, $T_X$. We therefore can write for the two regimes of growth

$$w(L, t) \sim t^{\beta} \quad (t \ll T_X), \qquad w(L, t) \sim L^{\alpha} \quad (t \gg T_X) \tag{11.13}$$

where we have introduced the so-called growth exponent β and roughness exponent α (both non-negative numbers), to describe the scaling in the two regimes. The meaning of the latter expression for large α is that the width of the surface becomes larger and larger with system size, which corresponds to wide variations in the height, comparable to the in-plane linear size of the system, that is, a very rough surface. For a system of fixed size, the width will saturate to the value $L^{\alpha}$.

The time it takes for the system to cross over from the time-scaling regime to the size-scaling regime depends on the system size, because the larger the system is, the longer it takes for the inhomogeneities in height to develop to the point where they scale with the linear system dimension. Thus, there is a relation between the crossover time $T_X$ and the linear system size $L$, which we express by introducing another exponent, the so-called dynamic exponent $z$,

$$T_X \sim L^{z} \tag{11.14}$$

From the above definitions, we can deduce that if the crossover point is approached from the time-scaling regime we will obtain

$$w(L, T_X) \sim T_X^{\beta} \tag{11.15}$$

whereas if it is approached from the size-scaling regime we will obtain

$$w(L, T_X) \sim L^{\alpha} \tag{11.16}$$


which, together with the definition of the dynamic exponent $z$, produce the relation

$$z = \frac{\alpha}{\beta} \tag{11.17}$$

From a physical point of view, the existence of a relationship between $w$ and $L$ implies that there must be some process (for instance, surface diffusion) which allows atoms to “explore” the entire size of the system, thereby linking its extent in the different directions to the surface width.

With these preliminary definitions we can now discuss some simple models of growth. The purpose of these models is to describe in a qualitative manner the evolution of real surfaces and to determine the three exponents we introduced above; actually only two values are needed, since they are related by Eq. (11.17). The simplest model consists of a uniform flux of atoms being deposited on the surface. Each atom sticks wherever it happens to fall on the surface. For this, the so-called random deposition model, the evolution of the surface height is given by

$$\frac{\partial h(\mathbf{r}, t)}{\partial t} = F(\mathbf{r}, t) \tag{11.18}$$

where $F(\mathbf{r}, t)$ is the deposition flux. While the flux is uniform on average, it will have fluctuations on some length scale, so we can write it as

$$F(\mathbf{r}, t) = F_0 + \eta(\mathbf{r}, t) \tag{11.19}$$

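where $F_0$ represents the constant average flux and $\eta(\mathbf{r}, t)$ represents a noise term with zero average value and no correlation in space or time:

$$\langle \eta(\mathbf{r}, t) \rangle = 0, \qquad \langle \eta(\mathbf{r}, t)\, \eta(\mathbf{r}', t') \rangle = F_n\, \delta^{(d)}(\mathbf{r} - \mathbf{r}')\, \delta(t - t') \tag{11.20}$$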

where the brackets indicate averages over the entire surface and all times. In the above equation we have denoted the dimensionality $d$ of the spatial δ-function explicitly as a superscript. This is done to allow for models with different spatial dimensionalities. Now we can integrate the random deposition model with respect to time to obtain

$$h(\mathbf{r}, t) = F_0 t + \int_0^t \eta(\mathbf{r}, t')\, dt' \tag{11.21}$$

from which we find, using the properties of the noise term, Eq. (11.20),

$$\langle h(\mathbf{r}, t) \rangle = F_0 t, \qquad w^2(t) = \langle h^2(\mathbf{r}, t) \rangle - \langle h(\mathbf{r}, t) \rangle^2 = F_n t \tag{11.22}$$
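The linear growth of $w^2$ with time in Eq. (11.22) can be checked with a small lattice simulation. The following is only a minimal sketch under stated assumptions: a one-dimensional lattice with periodic labeling of sites, unit-height atoms, and time measured in monolayers of deposited material; the function name `random_deposition_widths` and all parameter values are illustrative choices, not taken from the text.

```python
import numpy as np

def random_deposition_widths(n_sites, coverages, seed=0):
    """Surface width w of a 1D random-deposition surface at the given coverages (in ML).

    Each atom lands on a uniformly random column and sticks there,
    so the columns grow independently of one another.
    """
    rng = np.random.default_rng(seed)
    heights = np.zeros(n_sites, dtype=np.int64)
    deposited = 0
    widths = []
    for theta in coverages:
        # Deposit the atoms needed to reach a total coverage of theta monolayers.
        n_new = int(theta) * n_sites - deposited
        sites = rng.integers(0, n_sites, size=n_new)
        heights += np.bincount(sites, minlength=n_sites)
        deposited += n_new
        # Width = root-mean-square deviation of the height from its average.
        widths.append(heights.std())
    return np.array(widths)

coverages = [1, 2, 4, 8, 16, 32, 64, 128]
widths = random_deposition_widths(20000, coverages)

# Slope of log(w) versus log(t) estimates the growth exponent beta.
beta_fit = np.polyfit(np.log(coverages), np.log(widths), 1)[0]
print(f"fitted growth exponent: {beta_fit:.3f}")
```

Because the columns are uncorrelated, the fitted slope should come out close to one half regardless of the lattice size, which is the numerical counterpart of the absence of saturation in this model.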

The result in Eq. (11.22) immediately gives the value of the growth exponent ($w \sim t^{\beta}$) for this model, β = 0.5. The other two exponents cannot be determined, because in this model there is no correlation in the noise term, which is the only term that can give rise to roughening. Indeed, simulations of this model show that the width of the surface profile keeps growing indefinitely and does not saturate (see Fig. 11.14), as the definition of the roughness exponent, Eq. (11.13), would require. Consequently, for this case the roughness exponent α and the dynamic exponent z = α/β cannot be defined.

Figure 11.14. Simulation of one-dimensional growth models (the plot shows Height(x, t) as a function of position x). The highly irregular lines correspond to the surface profile of the random deposition model. The smoother, thicker lines correspond to a model which includes random deposition plus diffusion to next neighbor sites, if this reduces the surface curvature locally. The two sets of data correspond to the same time instances in the two models, i.e. the same amount of deposited material, as indicated by the average height. It is evident how the diffusion step leads to a much smoother surface profile.

The next level of sophistication in growth models is the so-called Edwards–Wilkinson (EW) model [179], defined by the equation

$$\frac{\partial h(\mathbf{r}, t)}{\partial t} = \nu \nabla_{\mathbf{r}}^2 h(\mathbf{r}, t) + \eta(\mathbf{r}, t) \tag{11.23}$$

In this model, the first term on the right-hand side represents the surface tension, which tends to smooth out the surface: this term leads to a relative increase in the height locally where the curvature is positive and large, and to a relative decrease in the height locally where the curvature is negative. The net result is that points of positive curvature (valleys) will fill in, while points of negative curvature (hillocks) will decrease in relative height as the surface evolves, producing a smoother surface profile. The second term on the right-hand side in Eq. (11.23) is a noise term with the same properties as the noise term included in the random deposition model. Notice that for simplicity we have omitted the term corresponding to the uniform flux, because we can simply change variables $h \to h + F_0 t$, which eliminates the $F_0$ term from both sides of the equation. In other words, the EW model as defined in Eq. (11.23) deals with the variations in the moving average height.

This equation can be solved, at least as far as the values of the exponents are concerned, by a rescaling argument. Specifically, we assume that the length variable is rescaled by a factor $b$: $\mathbf{r} \to \mathbf{r}' = b\mathbf{r}$. By our scaling assumption, when the growth is in the size-scaling regime, we should have $h \to h' = b^{\alpha} h$. This is known as a “self-affine” shape; if α = 1, so that the function is rescaled by the same factor as one of its variables, the shape is called “self-similar”. We must also rescale the time variable in order to be able to write an equation similar to Eq. (11.23) for the rescaled function $h'$. Since space and time variables are related by the dynamic exponent $z$, Eq. (11.14), we conclude that the time variable must be rescaled as $t \to t' = b^{z} t$. With these changes, and taking into account the general property of the δ-function in $d$ dimensions, $\delta^{(d)}(b\mathbf{r}) = b^{-d} \delta^{(d)}(\mathbf{r})$, we conclude that the scaling of the noise term correlations will be given by

$$\langle \eta(b\mathbf{r}, b^{z} t)\, \eta(b\mathbf{r}', b^{z} t') \rangle = F_n\, b^{-(d+z)}\, \delta^{(d)}(\mathbf{r} - \mathbf{r}')\, \delta(t - t') \tag{11.24}$$

which implies that the noise term η should be rescaled by a factor $b^{-(d+z)/2}$. Putting all these together, we obtain that the equation for the height, after rescaling, will take the form

$$b^{\alpha - z}\, \partial_t h = \nu\, b^{\alpha - 2}\, \nabla^2 h + b^{-(d+z)/2}\, \eta \;\Longrightarrow\; \partial_t h = \nu\, b^{z - 2}\, \nabla^2 h + b^{-d/2 + z/2 - \alpha}\, \eta$$

where we have used the short-hand notation $\partial_t$ for the partial derivative with respect to time. If we require this to be identical to the original equation (11.23), and assuming that the values of the constants involved ($\nu$, $F_n$) are not changed by the rescaling, we conclude that the exponents of $b$ on the right-hand side of the equation must vanish, which gives

$$\alpha = \frac{2 - d}{2}, \qquad \beta = \frac{2 - d}{4}, \qquad z = 2 \tag{11.25}$$

These relations fully determine the exponents for a given dimensionality $d$ of the model. For example, in one dimension, α = 0.5, β = 0.25, while in two dimensions α = β = 0.
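The bookkeeping in this rescaling argument is easy to get wrong by a sign, so it may help to verify it with exact rational arithmetic. The sketch below is illustrative only: the helper names `rescaled_powers` and `ew_exponents` are hypothetical, and the code simply encodes the two conditions that the powers of $b$ in the rescaled equation vanish, together with β = α/z from Eq. (11.17).

```python
from fractions import Fraction

def rescaled_powers(alpha, z, d):
    """Powers of b multiplying the surface-tension and noise terms after rescaling."""
    tension_power = Fraction(z) - 2
    noise_power = Fraction(-d, 2) + Fraction(z) / 2 - alpha
    return tension_power, noise_power

def ew_exponents(d):
    """EW exponents obtained by requiring both powers of b to vanish."""
    z = Fraction(2)            # from z - 2 = 0
    alpha = (z - d) / 2        # from -d/2 + z/2 - alpha = 0
    beta = alpha / z           # from z = alpha / beta, Eq. (11.17)
    return alpha, beta, z

for d in (1, 2):
    alpha, beta, z = ew_exponents(d)
    print(f"d = {d}: alpha = {alpha}, beta = {beta}, z = {z}, "
          f"residual powers = {rescaled_powers(alpha, z, d)}")
```

Using `Fraction` rather than floating point keeps the check exact, so a vanishing residual power is a genuine zero rather than a rounding accident.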


Let us consider briefly the implications of this model. We discuss the two-dimensional case (d = 2), which is closer to physical situations since it corresponds to a two-dimensional surface of a three-dimensional solid. In this case, the roughness exponent α is zero, meaning that the width of the surface profile does not grow as a power of the system size; this does not mean that the width does not increase, only that it increases more slowly than any power of the linear size of the system. The only possibility then is that the width increases logarithmically with system size, $w \sim \ln L$. This, however, is a very weak divergence of the surface profile width, implying that the surface is overall quite flat. The flatness of the surface in this model is a result of the surface tension term, which has the effect of smoothing out the surface, as we discussed earlier.

How can we better justify the surface tension term? We saw that it has the desired effect, namely it fills in the valleys and flattens out the hillocks, leading to a smoother surface. But where can such an effect arise from, on a microscopic scale? We can assign a chemical potential to characterize the condition of atoms at the microscopic level. We denote by $\mu(\mathbf{r}, t)$ the relative chemical potential between vapor and surface. It is natural to associate the local rate of change in concentration of atoms $C(\mathbf{r}, t)$ with minus the local chemical potential, $C(\mathbf{r}, t) \sim -\mu(\mathbf{r}, t)$: an attractive chemical potential increases the concentration, and vice versa. It is also reasonable to set the local chemical potential proportional to minus the local curvature of the surface, as first suggested by Herring [180],

$$\mu \sim -\nabla_{\mathbf{r}}^2 h(\mathbf{r}, t) \tag{11.26}$$

because at the bottom of valleys, where the curvature is positive, the surface atoms will tend to have more neighbors around them, and hence feel an attractive chemical potential; conversely, at the top of hillocks, where the curvature is negative, the atoms will tend to have fewer neighbors, and hence feel a repulsive chemical potential. A different way to express this is the following: adding material at the bottom of a valley reduces the surface area, which is energetically favorable, implying that the atoms will prefer to go toward the valley bottom, a situation described by a negative chemical potential; the opposite is true for the top of hillocks. This gives for the rate of change in concentration

$$C(\mathbf{r}, t) \sim -\mu(\mathbf{r}, t) \sim \nabla_{\mathbf{r}}^2 h(\mathbf{r}, t) \tag{11.27}$$
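The sign convention in Eqs. (11.26) and (11.27) can be made concrete with a discretized profile. The sketch below assumes a one-dimensional periodic surface sampled on a uniform grid and the standard three-point finite-difference Laplacian; the function name `laplacian_1d` and the cosine test profile are illustrative choices only.

```python
import numpy as np

def laplacian_1d(h, dx=1.0):
    """Three-point finite-difference Laplacian on a periodic 1D grid."""
    return (np.roll(h, -1) - 2.0 * h + np.roll(h, 1)) / dx**2

n = 64
x = np.arange(n)
h = np.cos(2.0 * np.pi * x / n)   # one hillock (x = 0) and one valley (x = n/2)

# Eq. (11.27): relative rate of change in concentration C ~ Laplacian of h.
rate = laplacian_1d(h)

hill, valley = np.argmax(h), np.argmin(h)
print("rate at hillock top  :", rate[hill])     # negative: atoms leave
print("rate at valley bottom:", rate[valley])   # positive: atoms arrive
print("net rate over surface:", rate.sum())     # vanishes up to rounding
```

The net rate summed over the periodic surface vanishes, which is consistent with $C$ describing variations relative to the average rate rather than the average itself.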

Notice that by $C(\mathbf{r}, t)$ we refer to variations of the rate of change in concentration relative to its average value, so that, depending on the sign of $\mu(\mathbf{r}, t)$, this relative rate of change can be positive or negative. The rate of change in concentration can be related directly to changes in the surface height,

$$\frac{\partial h(\mathbf{r}, t)}{\partial t} = \nu\, C(\mathbf{r}, t) = \nu \nabla_{\mathbf{r}}^2 h(\mathbf{r}, t) \tag{11.28}$$

which produces the familiar surface tension term in the EW model, with the positive factor ν providing the proper units. In this sense, the mechanism that leads to a smoother surface in the EW model is the desorption of atoms from hillocks and the deposition of atoms in valleys. In this model, the atoms can come off the surface (they desorb) or attach to the surface (they are deposited) through the presence of a vapor above the surface, which acts as an atomic reservoir.

We can take this argument one step further by considering other possible atomistic level mechanisms that can have an effect on the surface profile. One obvious mechanism is surface diffusion. In this case, we associate a surface current with the negative gradient of the local chemical potential

$$\mathbf{j}(\mathbf{r}, t) \sim -\nabla_{\mathbf{r}}\, \mu(\mathbf{r}, t) \tag{11.29}$$

by an argument analogous to that discussed above. Namely, atoms move away from places with repulsive chemical potential (the hillocks, where they are weakly bound) and move toward places with attractive chemical potential (the valleys, where they are strongly bound). The change in the surface height will be proportional to the negative divergence of the current, since the height decreases when there is positive change in the current and increases when there is negative change in the current, giving

$$\frac{\partial h(\mathbf{r}, t)}{\partial t} \sim -\nabla_{\mathbf{r}} \cdot \mathbf{j}(\mathbf{r}, t) \tag{11.30}$$

Using again the relationship between chemical potential and surface curvature, given by Eq. (11.26), we obtain for the effect of diffusion

$$\frac{\partial h(\mathbf{r}, t)}{\partial t} = -q \nabla_{\mathbf{r}}^4 h(\mathbf{r}, t) \tag{11.31}$$

with $q$ a positive factor that provides the proper units. Adding the usual noise term with zero average and no correlations in space or time, Eq. (11.20), leads to another statistical/stochastic model for surface growth, referred to as the Wolf–Villain (WV) model [181]. This model leads to a smooth surface under the proper conditions, but through a different mechanism than the EW model. The growth, roughness and dynamic exponents for the WV model can be derived through a simple rescaling argument, analogous to what we discussed for the EW model, which gives

$$\alpha = \frac{4 - d}{2}, \qquad \beta = \frac{4 - d}{8}, \qquad z = 4 \tag{11.32}$$

For a two-dimensional surface of a three-dimensional crystal (d = 2 in the above equation¹), the roughness exponent is α = 1, larger than in the EW model. Thus, the surface profile in the WV model is rougher than in the EW model, that is, surface diffusion is not as effective in reducing the roughness as the desorption/deposition mechanism. However, the surface profile will still be quite a bit smoother than in the random deposition model (see Fig. 11.14 for an example).

A model that takes this approach to a higher level of sophistication and, arguably, a higher level of realism, referred to as the Kardar–Parisi–Zhang (KPZ) model [182], assumes the following equation for the evolution of the surface height:

$$\frac{\partial h(\mathbf{r}, t)}{\partial t} = \nu \nabla_{\mathbf{r}}^2 h(\mathbf{r}, t) + \lambda \left[ \nabla_{\mathbf{r}} h(\mathbf{r}, t) \right]^2 + \eta(\mathbf{r}, t) \tag{11.33}$$
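To compare the deterministic terms of the three continuum models introduced so far (EW, WV and KPZ), one can discretize them on a grid. What follows is a minimal sketch and not a production integrator: it assumes a one-dimensional periodic surface, central finite differences, an explicit Euler time step (stable only for small `dt`), and Gaussian noise whose variance per cell and per step is taken as $F_n/(\Delta x\, \Delta t)$, the usual lattice counterpart of the δ-correlated noise of Eq. (11.20). The function names and parameter values are hypothetical.

```python
import numpy as np

def laplacian(h, dx=1.0):
    """Periodic three-point finite-difference Laplacian."""
    return (np.roll(h, -1) - 2.0 * h + np.roll(h, 1)) / dx**2

def gradient(h, dx=1.0):
    """Periodic central-difference first derivative."""
    return (np.roll(h, -1) - np.roll(h, 1)) / (2.0 * dx)

def deterministic_rate(h, nu=0.0, q=0.0, lam=0.0, dx=1.0):
    """nu * lap(h) - q * lap(lap(h)) + lam * (grad h)^2.

    EW: nu > 0 only; WV: q > 0 only; KPZ: nu > 0 and lam != 0.
    """
    lap = laplacian(h, dx)
    return nu * lap - q * laplacian(lap, dx) + lam * gradient(h, dx) ** 2

def euler_step(h, dt, rng, noise_strength=0.0, dx=1.0, **coeffs):
    """One explicit Euler step with uncorrelated Gaussian noise (assumed discretization)."""
    eta = rng.normal(0.0, np.sqrt(noise_strength / (dx * dt)), size=h.shape)
    return h + dt * (deterministic_rate(h, dx=dx, **coeffs) + eta)

rng = np.random.default_rng(1)
x = np.arange(128)
h0 = np.sin(2.0 * np.pi * x / 128)

models = {"EW": dict(nu=1.0), "WV": dict(q=1.0), "KPZ": dict(nu=1.0, lam=1.0)}
for name, coeffs in models.items():
    h1 = euler_step(h0, 0.05, rng, **coeffs)   # noise switched off here
    print(name, "peak-to-valley:", h1.max() - h1.min(),
          "shift of mean height:", h1.mean() - h0.mean())
```

With the noise switched off, the two linear terms reduce the peak-to-valley height of the sinusoid without moving the average height, whereas the gradient-squared term shifts the average height, which is one simple way to see that it is of a different character from the other two.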

The KPZ model has the familiar surface tension and noise terms, the first and third terms on the right-hand side, respectively, plus a new term which involves the square of the surface gradient. The presence of this term can be justified by a careful look at how the surface profile grows: if we assume that the surface grows along the local surface normal, rather than along a constant growth direction, then the gradient squared term emerges as the lowest order term in a Taylor expansion of the surface height. This term has a significant effect on the behavior of the surface height. In fact, this extra term introduces a complex coupling with the other terms, so that it is no longer feasible to extract the values of the growth and roughness exponents through a simple rescaling argument as we did for the EW and WV models. This can be appreciated from the observation that the KPZ model is a non-linear model since it involves the first and second powers of $h$ on the right-hand side of the equation, whereas the models we considered before are all linear models involving only linear powers of $h$ on the right-hand side of the equation. Nevertheless, the KPZ model is believed to be quite realistic, and much theoretical work has gone into analyzing its behavior. Variations on this model have also been applied to other fields.

It turns out that the WV model is relevant to a type of growth that is used extensively in the growth of high-quality semiconductor crystals. This technique is called Molecular Beam Epitaxy (MBE);² it consists of sending a beam of atoms or molecules, under ultra high vacuum conditions and with very low kinetic energy, toward a surface. The experimental conditions are such that the atoms stick to the surface with essentially 100% probability (the low kinetic energy certainly enhances this tendency). Once on the surface, the atoms can diffuse around, hence the diffusion term ($\nabla^4 h$) is crucial, but cannot desorb, hence the deposition/desorption term ($\nabla^2 h$) is not relevant. The noise term is justified in terms of the random manner in which the atoms arrive at the surface both in position and in time.

¹ This is referred to in the literature as a “2+1” model, indicating that there are two spatial dimensions in addition to the growth dimension.
² The word epitaxy comes from two Greek words, επί, which means “on top”, and τάξις, which means “order”.

MBE has proven to be the technique of choice for growing high-quality crystals of various semiconductor materials, and especially for growing one type of crystal on top of a different type (the substrate); this is called “heteroepitaxy”. Such combinations of materials joined by a smooth interface are extremely useful in optoelectronic devices, but are very difficult to achieve through any other means. Achieving a smooth interface between two high-quality crystals is not always possible: depending on the interactions between newly deposited atoms and the substrate, growth can proceed in a layer-by-layer mode, which favors a smooth interface, or in a 3D island mode, which is detrimental to a smooth interface. In the first case, called the Frank–van der Merwe mode, the newly deposited atoms wet the substrate, that is, they prefer to cover as much of the substrate area as possible for a given amount of deposition, the interaction between deposited and substrate atoms being favorable. Moreover, this favorable interaction must not be adversely affected by the presence of large strain in the film, which implies that the film and substrate lattice constants must be very close to each other. In the second case, called the Volmer–Weber mode, the newly deposited atoms do not wet the substrate, that is, they prefer to form 3D islands among themselves and leave large portions of the substrate uncovered, even though enough material has been deposited to cover the substrate. When the 3D islands, which nucleate randomly on the substrate, eventually coalesce to form a film, their interfaces represent defects (see also section 11.4), which can destroy the desirable characteristics of the film.

There is actually an intermediate case, in which growth begins in a layer-by-layer mode but quickly switches to the 3D island mode. This case, called the Stranski–Krastanov mode, arises from favorable chemical interaction between deposited and substrate atoms at the initial stages of growth, which is later overcome by the energy cost due to strain in the growing film. This happens when the lattice constants of the substrate and the film are significantly different. The three epitaxial modes of growth are illustrated in Fig. 11.15. In these situations the stages of growth are identified quantitatively by the amount of deposited material, usually measured in units of the amount required to cover the entire substrate with one layer of deposited atoms, called a monolayer (ML).

Figure 11.15. Illustration of the three different modes of growth in Molecular Beam Epitaxy (panels labeled layer-by-layer, intermediate and island, each shown at θ = 0.5, 1, 2 and 4). Growth is depicted at various stages, characterized by the total amount of deposited material, which is measured in monolayer (ML) coverage θ. Left: layer-by-layer (Frank–van der Merwe) growth at θ = 0.5, 1, 2 and 4 ML; in this case the deposited material wets the substrate and the film is not adversely affected by strain, continuing to grow in a layered fashion. Center: intermediate (Stranski–Krastanov) growth at θ = 0.5, 1, 2 and 4 ML; in this case the deposited material wets the substrate but begins to grow in island mode after a critical thickness $h_c$ (in the example shown, $h_c = 2$). Right: 3D island (Volmer–Weber) growth at θ = 0.5, 1, 2 and 4 ML; in this case the deposited material does not wet the substrate and grows in island mode from the very beginning, leaving a finite portion of the substrate uncovered, even though enough material has been deposited to cover the entire substrate.

The problem with MBE is that it is rather slow and consequently very costly: crystals are grown at a rate of order a monolayer per minute, so this method of growth can be used only for the most demanding applications. From the theoretical point of view, MBE is the simplest technique for studying crystal growth phenomena, and has been the subject of numerous theoretical investigations. The recent trend in such studies is to attempt to determine all the relevant atomistic level processes (such as atomic diffusion on terraces and steps, attachment–detachment of atoms from islands, etc.) using first-principles quantum mechanical calculations, and then to use this information to build stochastic models of growth, which can eventually be coarse-grained to produce continuum equations of the type discussed in this section.

It should be emphasized that in the continuum models discussed here, the surface height $h(\mathbf{r}, t)$ is assumed to vary slowly on the scale implied by the equation, so that asymptotic expansions can be meaningful. Therefore, this scale must be orders of magnitude larger than the atomic scale, since on this latter scale height variations can be very dramatic. It is in this sense that atomistic models must be coarse-grained in order to make contact with the statistical models implied by the continuum equations. The task of coarse-graining the atomistic behavior is not trivial, and the problem of how to approach it remains an open one. Nevertheless, the statistical


models of the continuum equations can be useful in elucidating general features of growth. Even without the benefit of atomistic scale processes, much can be deduced about the terms that should enter into realistic continuum models, either through simple physical arguments as described above, or from symmetry principles.

11.4 Interfaces

An interface is a plane which joins two semi-infinite solids. Interfaces between two crystals exhibit some of the characteristic features of surfaces, such as the broken bonds suffered by the atoms on either side of the interface plane, and the tendency of atoms to rearrange themselves to restore their bonding environment. We can classify interfaces in two categories: those between two crystals of the same type, and those between two crystals of different types. The first type are referred to as grain boundaries, because they usually occur between two finite-size crystals (grains), in solids that are composed of many small crystallites. The second type are referred to as hetero-interfaces. We discuss some of the basic features of grain boundaries and hetero-interfaces next.

11.4.1 Grain boundaries

Depending on the orientation of equivalent planes in the two grains on either side of the boundary, it may or may not be possible to match the atoms easily at the interface. This is illustrated in Fig. 11.16, for two simple cases involving cubic crystals. The plane of the boundary is defined as the (y, z) plane, with x the direction perpendicular to it. The cubic crystals we are considering, in projection on the (x, y) plane, are represented by square lattices. In the first example, equivalent planes on either side of the boundary meet at a 45° angle, which makes it impossible to match atomic distances across more than one row at the interface. All atoms along the interface have missing or stretched bonds. This is referred to as an asymmetric tilt boundary, since the orientation of the two crystallites involves a relative tilt around the z axis.
In the second example, the angle between equivalent planes on either side of the boundary is 28°. In this example we can distinguish four different types of sites on the boundary plane, labeled A, B, C, D in Fig. 11.16. An atom occupying site A will be under considerable compressive stress and will have five close neighbors. There is no atom at site B, because the neighboring atoms on either side of the interface are already too close to each other; each of these atoms has five nearest neighbors, counting the atom at site A as one, but their bonds are severely distorted. At site C there is plenty of space for an atom, which again will have five neighbors, but its bonds will be considerably stretched, so it is under tensile stress. The environment of the atom at site D seems less distorted, and this atom will have four neighbors at almost regular distances, but will also have two more neighbors at slightly larger distance (the atoms closest to site A). This example is referred to as a symmetric tilt boundary.

Figure 11.16. Examples of grain boundaries between cubic crystals for which the projection along one of the crystal axes, z, produces a 2D square lattice (the figure marks the tilt angle θ, the x, y, z axes and the sites A, B, C, D). Left: asymmetric tilt boundary at θ = 45° between two grains of a square lattice. Right: symmetric tilt boundary at θ = 28° between two grains of a square lattice. Four different sites are identified along the boundary, labeled A, B, C, D, which are repeated along the grain boundary periodically; the atoms at each of those sites have different coordination. This grain boundary is equivalent to a periodic array of edge dislocations, indicated by the T symbols.

The above examples illustrate some of the generic features of grain boundaries, namely: the presence of atoms with broken bonds, as in surfaces; the existence of sites with fewer or more neighbors than a regular site in the bulk, as in point defects; and the presence of local strain, as in dislocations. In fact, it is possible to model grain boundaries as a series of finite segments of a dislocation, as indicated in Fig. 11.16; this makes it possible to apply several of the concepts introduced in the theory of dislocations (see chapter 10). In particular, depending on how the two grains are oriented relative to each other at the boundary, we can distinguish between tilt boundaries, involving a relative rotation of the two crystallites around an axis parallel to the interface as in the examples of Fig. 11.16, and twist boundaries, involving a rotation of the two crystallites around an axis perpendicular to the interface. These two situations are reminiscent of the distinction between edge and screw dislocations. More generally, a grain boundary may be formed by rotating the two crystallites around axes both parallel and perpendicular to the interface, reminiscent of a mixed dislocation. Equally well, one can apply concepts from the


theory of point defects, such as localized gap states (see chapter 9), or from the theory of surfaces, such as surface reconstruction (see the discussion in section 11.2). In general, due to the imperfections associated with them, grain boundaries provide a means for enhanced diffusion of atoms or give rise to electronic states in the band gap of doped semiconductors that act like traps for electrons or holes. As such, grain boundaries can have a major influence both on the mechanical properties and on the electronic behavior of real materials. We elaborate briefly on certain structural aspects of grain boundaries here and on the electronic aspects of interfaces in the next subsection. For more extensive discussions we refer the reader to the comprehensive treatment of interfaces by Sutton and Balluffi (see the Further reading section).

As indicated above, a tilt boundary can be viewed as a periodic array of edge dislocations, and a twist boundary can be viewed as a periodic array of screw dislocations. We discuss the first case in some detail; an example of a tilt boundary with angle θ = 28° between two cubic crystals is illustrated in Fig. 11.16. Taking the grain-boundary plane as the yz plane and the edge dislocation line along the z axis, and using the results derived in chapter 10 for the stress and strain fields of an isolated edge dislocation, we can obtain the grain-boundary stress field by adding the fields of an infinite set of ordered edge dislocations at a distance d from each other along the y axis (see Problem 7):

$$\sigma_{xx}^{\mathrm{tilt}} = \frac{\pi K_e b_e}{d}\,\frac{\sin(\bar{y})\,[\cos(\bar{y}) - \cosh(\bar{x}) - \bar{x}\sinh(\bar{x})]}{[\cosh(\bar{x}) - \cos(\bar{y})]^2} \tag{11.34}$$

$$\sigma_{yy}^{\mathrm{tilt}} = \frac{\pi K_e b_e}{d}\,\frac{\sin(\bar{y})\,[\cos(\bar{y}) - \cosh(\bar{x}) + \bar{x}\sinh(\bar{x})]}{[\cosh(\bar{x}) - \cos(\bar{y})]^2} \tag{11.35}$$

$$\sigma_{xy}^{\mathrm{tilt}} = \frac{\pi K_e b_e}{d}\,\frac{\bar{x}\,[\cosh(\bar{x})\cos(\bar{y}) - 1]}{[\cosh(\bar{x}) - \cos(\bar{y})]^2}. \tag{11.36}$$
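The summation behind Eqs. (11.34)–(11.36) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the isolated edge-dislocation stresses in their standard isotropic-elasticity form (assumed here to coincide with the chapter 10 expressions), places dislocations at y = nd, and compares a symmetric truncated sum with the closed forms written in terms of 2πx/d and 2πy/d. The function names and the choice K_e b_e = 1 are hypothetical.

```python
import math

def isolated_edge_stress(x, y, kb=1.0):
    """(sxx, syy, sxy) of one edge dislocation at the origin, Burgers vector along x.
    kb stands for the product K_e * b_e (assumed standard isotropic form)."""
    r4 = (x * x + y * y) ** 2
    sxx = -kb * y * (3 * x * x + y * y) / r4
    syy = kb * y * (x * x - y * y) / r4
    sxy = kb * x * (x * x - y * y) / r4
    return sxx, syy, sxy

def summed_stress(x, y, d=1.0, kb=1.0, n_max=20000):
    """Symmetric sum over dislocations located at y = n*d, n = -n_max..n_max."""
    sxx = syy = sxy = 0.0
    for n in range(-n_max, n_max + 1):
        a, b, c = isolated_edge_stress(x, y - n * d, kb)
        sxx += a
        syy += b
        sxy += c
    return sxx, syy, sxy

def tilt_boundary_stress(x, y, d=1.0, kb=1.0):
    """Closed forms of Eqs. (11.34)-(11.36) in the reduced variables xb, yb."""
    xb, yb = 2 * math.pi * x / d, 2 * math.pi * y / d
    den = (math.cosh(xb) - math.cos(yb)) ** 2
    pre = math.pi * kb / d
    sxx = pre * math.sin(yb) * (math.cos(yb) - math.cosh(xb) - xb * math.sinh(xb)) / den
    syy = pre * math.sin(yb) * (math.cos(yb) - math.cosh(xb) + xb * math.sinh(xb)) / den
    sxy = pre * xb * (math.cosh(xb) * math.cos(yb) - 1.0) / den
    return sxx, syy, sxy

if __name__ == "__main__":
    point = (0.3, 0.2)  # arbitrary test point, in units of d
    print("truncated sum:", summed_stress(*point))
    print("closed form:  ", tilt_boundary_stress(*point))
```

The symmetric summation matters for the xx and yy components, whose individual terms fall off only as 1/y, so that the sum converges only when the terms at +n and -n are paired.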

In these expressions the reduced variables are defined as $\bar{x} = 2\pi x/d$, $\bar{y} = 2\pi y/d$, and the Burgers vector $b_e$ lies along the x axis. It is interesting to consider the asymptotic behavior of the stress components as given by these expressions. Far from the boundary in the direction perpendicular to it, $x \to \pm\infty$, all the stress components of the tilt grain boundary decay exponentially because of the presence of the $\cosh^2(\bar{x})$ term in the denominator.

11.4.2 Hetero-interfaces

As a final example of interfaces we consider a planar interface between two different crystals. We will assume that the two solids on either side of the interface have the same crystalline structure, but different lattice constants. This situation, called a hetero-interface, is illustrated in Fig. 11.17. The similarity between the two crystal structures makes it possible to match smoothly the two solids along a crystal plane.


Figure 11.17. Schematic representation of hetero-interface. Left: the newly deposited ﬁlm, which has height h , wets the substrate and the strain is relieved by the creation of misﬁt dislocations (indicated by a T), at a distance d from each other. Right: the newly deposited material does not wet the substrate, but instead forms islands which are coherently bonded to the substrate but relaxed to their preferred lattice constant at the top.

The difference in lattice constants, however, will produce strain along the interface. There are two ways to relieve the strain.

(i) Through the creation of what appear to be dislocations on one side of the interface, assuming that both crystals extend to infinity on the interface plane. These dislocations will be at regular intervals, dictated by the difference in the lattice constants; they are called "misfit dislocations".

(ii) Through the creation of finite-size structures on one side of the interface, which at their base are coherently bonded to the substrate (the crystal below the interface), but at their top are relaxed to their preferred lattice constant. These finite-size structures are called islands.

The introduction of misfit dislocations is actually related to the height h of the deposited film, and is only made possible when h exceeds a certain critical value, the critical height $h_c$. The reason is that without misfit dislocations the epilayer is strained to the lattice constant of the substrate, which costs elastic energy. This strain is relieved and the strain energy reduced by the introduction of the misfit dislocations, but the presence of these dislocations also costs elastic energy. The optimal situation is a balance between the two competing terms. To show this effect in a quantitative way, we consider a simple case of cubic crystals with an interface along one of the high-symmetry planes (the xy plane), perpendicular to one of the crystal axes (the z axis). In the absence of any misfit dislocations, the in-plane strain due to the difference in lattice constants of the film, $a_f$, and the substrate, $a_s$, is given by

$$\epsilon_m \equiv \frac{a_s - a_f}{a_f} = \epsilon_{xx} = \epsilon_{yy} \tag{11.37}$$


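and involves only diagonal components but no shear. For this situation, with cubic symmetry and only diagonal strain components, the stress–strain relations are

$$\begin{pmatrix} \sigma_{xx} \\ \sigma_{yy} \\ \sigma_{zz} \end{pmatrix} = \begin{pmatrix} c_{11} & c_{12} & c_{12} \\ c_{12} & c_{11} & c_{12} \\ c_{12} & c_{12} & c_{11} \end{pmatrix} \begin{pmatrix} \epsilon_{xx} \\ \epsilon_{yy} \\ \epsilon_{zz} \end{pmatrix} \tag{11.38}$$

However, since the film is free to relax in the direction perpendicular to the surface, $\sigma_{zz} = 0$, which, together with Eq. (11.37), leads to the following relation:

$$2c_{12}\,\epsilon_m + c_{11}\,\epsilon_{zz} = 0$$

From this, we obtain the strain in the z direction, $\epsilon_{zz}$, and the misfit stress in the xy plane, $\sigma_m = \sigma_{xx} = \sigma_{yy}$, in terms of the misfit strain $\epsilon_m$:

$$\epsilon_{zz} = -\frac{2c_{12}}{c_{11}}\,\epsilon_m \tag{11.39}$$

$$\sigma_m = \left(c_{11} + c_{12} - \frac{2c_{12}^2}{c_{11}}\right)\epsilon_m = \mu_m\,\epsilon_m \tag{11.40}$$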

where we have defined the constant of proportionality between $\sigma_m$ and $\epsilon_m$ as the effective elastic modulus for the misfit, $\mu_m$. When misfit edge dislocations are introduced, as shown in Fig. 11.17, the strain is reduced by $b_e/d$, where $b_e$ is the Burgers vector and d is the distance between the dislocations. The resulting strain energy per unit area of the interface is

$$\gamma_m = \mu_m\left(\epsilon_m - \frac{b_e}{d}\right)^2 h \tag{11.41}$$

The dislocation energy per unit length of dislocation is

$$U_d = \frac{K_e b_e^2}{2}\,\ln\left(\frac{h}{b_e}\right) = \frac{\mu\, b_e^2}{4\pi(1-\nu)}\,\ln\left(\frac{h}{b_e}\right) \tag{11.42}$$

with $K_e$ the relevant elastic constant, $K_e = \mu/2\pi(1-\nu)$. In order to arrive at this result, we have used the expression derived in chapter 10 for the elastic energy of an edge dislocation, Eq. (10.17), with $\mu$ the effective shear modulus at the interface where the misfit dislocations are created, which is given by

$$\mu = \frac{2\mu_f\,\mu_s}{\mu_f + \mu_s},$$

where $\mu_f$ and $\mu_s$ are the shear moduli of the film and the substrate, respectively. For the relevant length, L, over which the dislocation field extends, we have taken $L \sim h$, while for the dislocation core radius, $r_c$, we have taken $r_c \sim b_e$, both reasonable approximations; the constants of proportionality in these approximations are combined into a new constant which we will take to be unity (an assumption which is justified a posteriori by the final result, as explained below), giving $L/r_c = (h/b_e)$. What remains to be determined is the total misfit dislocation length per unit area of the interface, which is

$$l = \frac{2d}{d^2} = \frac{2}{d}$$

as illustrated in Fig. 11.18. With this, the dislocation energy per unit area takes the form

$$\gamma_d = l\,U_d = \frac{\mu\, b_e^2}{2\pi(1-\nu)\,d}\,\ln\left(\frac{h}{b_e}\right) \tag{11.43}$$

which, combined with the misfit energy per unit area, $\gamma_m$, gives for the total energy per unit area of the interface

$$\gamma_{\mathrm{int}}(\zeta) = \mu_m(\epsilon_m - \zeta)^2 h + \frac{\mu\, b_e}{2\pi(1-\nu)}\,\ln\left(\frac{h}{b_e}\right)\zeta \tag{11.44}$$

where we have introduced the variable $\zeta = b_e/d$. The limit of no misfit dislocations at the interface corresponds to $\zeta \to 0$, because then their spacing $d \to \infty$ while the Burgers vector $b_e$ is fixed. The condition for making misfit dislocations favorable is that the total interface energy decreases by their introduction, which can be expressed by

$$\left[\frac{d\gamma_{\mathrm{int}}(\zeta)}{d\zeta}\right]_{\zeta=0} \le 0.$$

Figure 11.18. Illustration of a regular 2D array of misfit edge dislocations at the interface of two cubic crystals. The distance between dislocations in each direction is d. The total dislocation length in an area $d^2$, outlined by the dashed square, is 2d.

Consequently, the critical thickness of the film at which the introduction of misfit dislocations becomes energetically favorable is determined by the condition

$$\left[\frac{d\gamma_{\mathrm{int}}(\zeta)}{d\zeta}\right]_{\zeta=0} = 0 \;\Longrightarrow\; \frac{\tilde{h}_c}{\ln(\tilde{h}_c)} = \frac{\mu}{4\pi(1-\nu)\,\mu_m}\,\frac{1}{\epsilon_m} \tag{11.45}$$

where in the last equation we have expressed the critical thickness in units of the Burgers vector, $\tilde{h}_c = h_c/b_e$. All quantities on the right-hand side of the last equation are known for a given type of film deposited on a given substrate, thus the critical thickness $\tilde{h}_c$ can be uniquely determined. Here we comment on our choice of the constant of proportionality in the relation $L/r_c \sim (h/b_e)$, which we took to be unity: the logarithmic term in $(h/b_e)$ makes the final value of the critical height insensitive to the precise value of this constant, as long as it is of order unity, which is the expected order of magnitude since $L \approx h$ and $r_c \approx b_e$ to within an order of magnitude. For film thickness $h < \tilde{h}_c b_e$ misfit dislocations are not energetically favorable, and thus are not stable; for $h > \tilde{h}_c b_e$ misfit dislocations are stable, but whether they form or not could also depend on kinetic factors.

The example of an interface between two cubic crystals with a regular array of misfit dislocations may appear somewhat contrived. As it turns out, it is very realistic and relevant to nanoscale structures for advanced electronic devices. As mentioned in section 11.3, crystals can be grown on a different substrate by MBE. When the substrate and the newly deposited material have the same crystal structure, it is possible to nucleate the new crystal on the substrate even if their lattice constants are different. As the newly deposited material grows into crystal layers, the strain energy due to the lattice constant difference also grows and eventually must be relieved. If the interactions between substrate and newly deposited atoms are favorable, so that the deposited atoms wet the substrate, then the first of the two situations described above arises where the strain is relieved by the creation of misfit dislocations beyond a critical film thickness determined by Eq. (11.45). If the chemical interactions do not allow wetting of the substrate, then finite-size islands are formed, which naturally relieve some of the strain through relaxation. These phenomena play a crucial role in the morphology and stability of the film that is formed by deposition on the substrate. In certain cases, the strain effects can lead to islands of fairly regular shapes and sizes. The resulting islands are called quantum dots and have sizes in the range of a few hundred to a few thousand angstroms. This phenomenon is called self-assembly of nanoscale quantum dots. Because of their finite size, the electronic properties of the dots are different from those of the corresponding bulk material. In particular, the confinement of electrons within the dots gives rise to levels whose energy depends sensitively on the size and shape of the dots. The hope that this type of structure will prove very useful in new electronic devices has generated great excitement recently (see, for example, the book by Barnham and Vvedensky, mentioned in the Further reading section).
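Equation (11.45) is transcendental in the reduced critical thickness, so in practice it has to be solved numerically. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, the elastic constants in the usage lines are purely illustrative numbers (not data for any real material), and the code assumes that the physically relevant root is the one with reduced thickness larger than e, where h/ln(h) increases monotonically.

```python
import math

def misfit_modulus(c11, c12):
    """Effective misfit modulus mu_m of Eq. (11.40)."""
    return c11 + c12 - 2.0 * c12 ** 2 / c11

def interface_shear_modulus(mu_f, mu_s):
    """Effective interface shear modulus 2*mu_f*mu_s/(mu_f + mu_s)."""
    return 2.0 * mu_f * mu_s / (mu_f + mu_s)

def critical_thickness(mu, mu_m, nu, eps_m):
    """Solve h/ln(h) = mu/(4*pi*(1-nu)*mu_m*|eps_m|) for h = h_c/b_e by bisection.
    Only the branch h > e is searched (an assumption of this sketch)."""
    rhs = mu / (4.0 * math.pi * (1.0 - nu) * mu_m * abs(eps_m))
    if rhs <= math.e:
        raise ValueError("no root with h > e: h/ln(h) never drops below e")
    lo, hi = math.e, 2.0 * math.e
    while hi / math.log(hi) < rhs:   # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):             # plain bisection on the increasing branch
        mid = 0.5 * (lo + hi)
        if mid / math.log(mid) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Illustrative (hypothetical) inputs, moduli in arbitrary but consistent units.
    mu = interface_shear_modulus(40.0, 60.0)
    mu_m = misfit_modulus(120.0, 50.0)
    for eps_m in (0.01, 0.005):
        print(eps_m, critical_thickness(mu, mu_m, nu=0.3, eps_m=eps_m))
```

With these inputs a smaller misfit strain gives a larger critical thickness, as expected from the 1/ε_m factor on the right-hand side of Eq. (11.45).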

We conclude this section on hetero-interfaces with a brief discussion of electronic states that are special to these systems. Of particular interest are interfaces between metals and semiconductors, structures that are of paramount importance in the operation of electronic devices. We will consider then an ideal metal surface and an ideal semiconductor surface in intimate contact, that is, as planes which are flat on the atomic scale and at a distance apart which is of order typical interatomic distances, without the presence of any contaminants such as oxide layers. It was first pointed out by Heine [183] that these conditions lead to the creation of Bloch states with energies in the gap of the semiconductor and complex wave-vectors in the direction perpendicular to the interface. These are called "metal induced gap states" (MIGS). The MIGS decay exponentially into the semiconductor, a behavior similar to that of the surface states discussed earlier in relation to the 1D surface model, but they do not decay into the metal side, in contrast to the situation of the 1D surface model. A study of the nature of such states for a prototypical system, that between the (111) surfaces of Al and Si, in the context of DFT calculations, has been reported by Louie and Cohen [184].

The presence of these states will have important consequences for the electronic structure of the interface. To illustrate this, consider a junction between a metal and an n-type semiconductor, as discussed in chapter 9. Since the MIGS have energies within the semiconductor gap, they will be occupied when the junction is formed. The occupation of MIGS will draw electronic charge from the semiconductor side, but only in the immediate neighborhood of the interface since they decay exponentially on that side. This will give rise to a dipole moment at the interface, creating the conditions necessary to produce equilibrium between the Fermi levels on the two sides. However, since the dipole moment is due to MIGS, whose energy is fixed within the semiconductor gap, the induced barrier to electron transfer across the interface will not depend on the work function of the metal. This is in contrast to the Schottky model, where the barrier is directly proportional to the metal work function (see chapter 9). We had remarked there that the measured barrier is roughly proportional to the metal work function for semiconductors with large gap but essentially independent of the metal work function for semiconductors with small gap. This behavior can be interpreted in the context of MIGS: the wavefunctions of these states decay much more rapidly into the semiconductor side in a large-gap semiconductor, which makes them much less effective in creating the interface dipole. Thus, in the large-gap case the MIGS are not enough to produce the conditions necessary for electronic equilibrium, and the physics of the Schottky model is relevant. In the small-gap case, the MIGS are sufficient for creating equilibrium conditions.

The picture outlined above is certainly an oversimplified view of real metal–semiconductor interfaces. Some aspects of the oversimplification are the presence of point defects at the interface and the fact that real surfaces are not atomically flat planes but contain features like steps and islands. All these issues further complicate the determination of the barrier to electron transfer across the interface, which is the physical quantity of interest from the practical point of view (it is relevant to the design of electronic devices). For more detailed discussion and extensive references the reader is directed to the book by Sutton and Balluffi (see the Further reading section).

Further reading

1. Physics at Surfaces, A. Zangwill (Cambridge University Press, Cambridge, 1988). This is an excellent general book for the physics of surfaces, including many insightful discussions.
2. Atomic and Electronic Structure of Surfaces, M. Lannoo and P. Friedel (Springer-Verlag, Berlin, 1991). An interesting book on surfaces in general, with extensive coverage of their electronic structure.
3. The Structure of Surfaces, M.A. Van Hove and S.Y. Tong, eds. (Springer-Verlag, Berlin, 1985). A collection of review articles on experimental and theoretical techniques for determining the atomic structure of surfaces, with emphasis on scattering methods such as LEED.
4. Surface Science, The First Thirty Years, C. Duke, ed. (Elsevier, Amsterdam, 1994). This is a collection of classic papers published in the journal Surface Science in its first 30 years.
5. "The density functional formalism and the electronic structure of metal surfaces", N.D. Lang, in Solid State Physics, vol. 28, pp. 225–300 (F. Seitz, D. Turnbull and H. Ehrenreich, eds., Academic Press, New York, 1973).
6. Semiconductor Surfaces and Interfaces, W. Mönch (Springer-Verlag, Berlin, 1995). This is a useful account of the structure and properties of semiconductor surfaces and interfaces.
7. "Basic mechanisms in the early stages of epitaxy", R. Kern, G. Le Lay and J.J. Metois, in Current Topics in Materials Science, vol. 3 (E. Kaldis, ed., North-Holland Publishing Co., Amsterdam, 1979). This is a very thorough treatment of early experimental work on epitaxial growth phenomena.
8. "Theory and simulation of crystal growth", A.C. Levi and M. Kotrla, J. Phys.: Cond. Matt., 9, pp. 299–344 (1997). This is a useful article describing the techniques and results of computer simulations of growth phenomena.
9. Fractal Concepts in Crystal Growth, A.L. Barabasi and H.E. Stanley (Cambridge University Press, Cambridge, 1995). A useful modern introduction to the statistical mechanics of growth phenomena on surfaces.
10. Physics of Crystal Growth, A. Pimpinelli and J. Villain (Cambridge University Press, Cambridge, 1998). A thorough account of modern theories of crystal growth phenomena.
11. Low-dimensional Semiconductor Structures, K. Barnham and D. Vvedensky, eds. (Cambridge University Press, Cambridge, 2000). A collection of review articles highlighting problems in semiconductor devices based on low-dimensional structures, where surfaces and interfaces play a key role.
12. Interfaces in Crystalline Materials, A.P. Sutton and R.W. Balluffi (Clarendon Press, Oxford, 1995). This is a thorough and detailed discussion of all aspects of interfaces in crystals.

Problems

1. We wish to solve the 1D surface model defined by Eq. (11.7). For this potential, the wavefunction will have components that involve the zero and G reciprocal-lattice vectors. A convenient choice for the wavefunction is $\psi_k(z<0) = e^{ikz}\left[c_0 + c_1 e^{-iGz}\right]$. Write the Schrödinger equation for this model for z < 0 and produce a linear system of two equations in the unknown coefficients $c_0$, $c_1$. From this system show that the energy takes the values given in Eq. (11.8), while the wavefunction is given by the expression in Eq. (11.9).

2. For the simple cubic, body-centered cubic, face-centered cubic and diamond lattices, describe the structure of the bulk-terminated planes in the (001), (110) and (111) crystallographic directions: find the number of in-plane neighbors that each surface atom has, determine whether these neighbors are at the same distance as nearest neighbors in the bulk or not, show the surface lattice vectors, and determine how many "broken bonds" or "missing neighbors" each surface atom has compared to the bulk structure. Discuss which of these surfaces might be expected to undergo significant reconstruction and why.

3. Construct the schematic electronic structure of the Si(111) reconstructed surface, by analogy to what was done for Si(001) (see Fig. 11.8):
(a) for the $(\sqrt{3} \times \sqrt{3})$ reconstruction with an adatom and three surface atoms per unit cell
(b) for the (2 × 2) reconstruction with an adatom, three surface atoms, and one restatom per unit cell.
Explain how the $sp^3$ states of the original bulk-terminated plane and those of the adatom and the restatom combine to form bonding and antibonding combinations, and how the occupation of these new states determines the metallic or semiconducting character of the surface.

4. Give an argument that describes the charge transfer in the (2 × 2) reconstruction of Si(111), with one adatom and one restatom per unit cell, using a simple tight-binding picture.
(a) Assume that the adatom is at the lowest energy T4 position, and is in a pyramidal bonding configuration with its three neighbors; what does that imply for its bonding?
(b) Assume that the restatom is in a planar bonding configuration with its three neighbors; what does that imply for its bonding?
(c) Based on these assumptions, deduce what kind of electronic charge transfer ought to take place in the Si(111) (2 × 2) adatom–restatom reconstruction.
(d) It turns out that in a real surface the electronic charge transfer takes place from the adatom to the restatom, by analogy to what we discussed in the case of the Si(100) (2 × 1) tilted-dimer reconstruction. This is also verified by detailed first-principles electronic structure calculations. Can you speculate on what went wrong in the simple tight-binding analysis?

5. Derive the roughness, growth and dynamical exponents given in Eq. (11.33), for the Wolf–Villain model through a rescaling argument, by analogy to what was done for the Edwards–Wilkinson model.

6. We will attempt to demonstrate the difficulty of determining, through the standard rescaling argument, the roughness and growth exponents in the KPZ model, Eq. (11.32).
(a) Suppose we use a rescaling argument as in the EW model; we would then obtain three equations for α, β, z which, together with the definition of the dynamical exponent Eq. (11.14), overdetermine the values of the exponents. Find these three equations and discuss the overdetermination problem.
(b) Suppose we neglect one of the terms in the equation to avoid the problem of overdetermination. Specifically, neglect the surface tension term, and derive the exponents for this case.
(c) Detailed computer simulations for this model give that, for d = 1, to a very good approximation α = 1/2, β = 1/3. Is this result compatible with your values for α and β obtained from the rescaling argument? Discuss what may have gone wrong in this derivation.

7. Using the expression for the stress field of an isolated edge dislocation, given in chapter 10, Table 10.1, show that the stress field of an infinite array of edge dislocations along the y axis, separated by a distance d from each other, with the dislocation lines along the z axis, is given by Eqs. (11.34)–(11.36). Obtain the behavior of the various stress components far from the boundary in the direction perpendicular to it, that is, x → ±∞. Also, using the expressions that relate stress and strain fields in an isotropic solid (see Appendix 1), calculate the strain field for this infinite array of edge dislocations. Comment on the physical meaning of the strain field far from the boundary.

12 Non-crystalline solids

While the crystalline state is convenient for describing many properties of real solids, there are a number of important cases where this picture cannot be used, even as a starting point for more elaborate descriptions, as was done for defects in chapters 9–11. As an example, we look at solids in which certain symmetries such as rotations, reflections, or an underlying regular pattern are present, but these symmetries are not compatible with three-dimensional periodicity. Such solids are called quasicrystals. Another example involves solids where the local arrangement of atoms, embodied in the number of neighbors and the preferred bonding distances, has a certain degree of regularity, but there is no long-range order of any type. Such solids are called amorphous. Amorphous solids are very common in nature; glass, based on SiO2, is a familiar example.

In a different class of non-crystalline solids, the presence of local order in bonding leads to large units which underlie the overall structure and determine its properties. In such cases, the local structure is determined by strong covalent interactions, while the variations in large-scale structure are due to other types of interactions (ionic, hydrogen bonding, van der Waals) among the larger units. These types of structures are usually based on carbon, hydrogen and a few other elements, mostly from the first and second rows of the Periodic Table (such as N, O, P, S). This is no accident: carbon is the most versatile element in terms of forming bonds to other elements, including itself. For instance, carbon atoms can form a range of bonds among themselves, from single covalent bonds, to multiple bonds, to van der Waals interactions. A class of widely used and quite familiar solids based on such structures are plastics; in plastics, the underlying large units are polymer chains.

12.1 Quasicrystals

As was mentioned in chapter 1, the discovery of five-fold and ten-fold rotational symmetry in certain metal alloys [5] was a shocking surprise. This is because perfect periodic order in three (or, for that matter, two) dimensions, which these unusual solids appeared to possess, is not compatible with five-fold or ten-fold rotational symmetry (see Problem 5 in chapter 3). Accordingly, these solids were named "quasicrystals". Theoretical studies have revealed that it is indeed possible to create, for example, structures that have five-fold rotational symmetry and can tile the two-dimensional plane, by combining certain simple geometric shapes according to specific local rules. This means that such structures can extend to infinity without any defects, maintaining the perfect rotational local order while having no translational long-range order. These structures are referred to as Penrose tilings (in honor of R. Penrose who first studied them). We will discuss a one-dimensional version of this type of structure, in order to illustrate how they give rise to a diffraction pattern with features reminiscent of a solid with crystalline order.

Our quasicrystal example in one dimension is called a Fibonacci sequence, and is based on a very simple construction: consider two line segments, one long and one short, denoted by L and S, respectively. We use these to form an infinite, perfectly periodic solid in one dimension, as follows:

LSLSLSLSLS ···

This sequence has a unit cell consisting of two parts, L and S; we refer to it as a two-part unit cell. Suppose now that we change this sequence using the following rule: we replace L by LS and S by L. The new sequence is

LSLLSLLSLLSLLSL ···

This sequence has a unit cell consisting of three parts, L, S and L; we refer to it as a three-part unit cell. If we keep applying the replacement rule to each new sequence, in the next iteration we obtain

LSLLSLSLLSLSLLSLSLLSLSLLS ···

i.e. a sequence with a five-part unit cell (LSLLS), which generates the following sequence:

LSLLSLSLLSLLSLSLLSLLSLSLLSLLSLSLLSLLSLSL ···

i.e. a sequence with an eight-part unit cell (LSLLSLSL), and so on ad infinitum. It is easy to show that in the sequence with an n-part unit cell, with n → ∞, the ratio of L to S segments tends to the golden mean value, $\tau = (1 + \sqrt{5})/2$. Using this result, we can determine the position of the nth point on the infinite line (a point determines the beginning of a new segment, either long or short), as

$$x_n = n + \frac{1}{\tau}\,\mathrm{NINT}\!\left[\frac{n}{\tau}\right]. \tag{12.1}$$
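The replacement rule and the limiting ratio of L to S segments are easy to explore numerically. The sketch below is illustrative only: the function names are hypothetical, and NINT in Eq. (12.1) is taken to be the floor function (largest integer less than or equal to its argument). It builds successive unit cells by the rule L → LS, S → L and evaluates the point positions of Eq. (12.1), whose spacings take only the two values 1 and τ.

```python
import math

TAU = (1.0 + math.sqrt(5.0)) / 2.0  # golden mean

def inflate(cell):
    """Apply the replacement rule L -> LS, S -> L to every segment of a unit cell."""
    return "".join("LS" if seg == "L" else "L" for seg in cell)

def unit_cells(n_iter, seed="LS"):
    """Return the seed unit cell followed by n_iter successive inflations."""
    cells = [seed]
    for _ in range(n_iter):
        cells.append(inflate(cells[-1]))
    return cells

def positions(n_points):
    """Point positions x_n = n + floor(n/tau)/tau for n = 0..n_points-1 (Eq. (12.1))."""
    return [n + math.floor(n / TAU) / TAU for n in range(n_points)]

if __name__ == "__main__":
    for cell in unit_cells(4):
        print(len(cell), cell)
    big = unit_cells(20)[-1]
    print("L/S ratio:", big.count("L") / big.count("S"), "tau:", TAU)
```

The unit-cell lengths produced this way (2, 3, 5, 8, 13, ...) are consecutive Fibonacci numbers, which is where the sequence gets its name.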

In Eq. (12.1), NINT[x] denotes the largest integer less than or equal to x. We can generalize this expression to

$$x_n = \lambda\left(n + a + \frac{1}{\tau}\,\mathrm{NINT}\!\left[\frac{n}{\tau} + b\right]\right) \tag{12.2}$$

where a, b, λ are arbitrary numbers. This change simply shifts the origin to a, alters the ordering of the segments through b, and introduces an overall scale factor λ. The shifting and scaling does not alter the essential feature of the sequence, namely that the larger the number of L and S segments in the unit cell, the more it appears that any finite section of the sequence is a random succession of the two components. We know, however, that by construction the sequence is not random at all, but was produced by a very systematic augmentation of the original perfectly periodic sequence. In this sense, the Fibonacci sequence captures the characteristics of quasicrystals in one dimension.

The question which arises now is, how can the quasicrystalline structure be distinguished from a perfectly ordered or a completely disordered one? In particular, what is the experimental signature of the quasicrystalline structure? We recall that the experimental hallmark of crystalline structure was the Bragg diffraction planes which scatter the incident waves coherently. Its signature is the set of spots in reciprocal space at multiples of the reciprocal-lattice vectors, which are the vectors that satisfy the coherent diffraction condition. Specifically, the Fourier Transform (FT) of a Bravais lattice is a set of δ-functions in reciprocal space with arguments (k − G), where $\mathbf{G} = m_1\mathbf{b}_1 + m_2\mathbf{b}_2 + m_3\mathbf{b}_3$, $m_1, m_2, m_3$ are integers and $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3$ are the reciprocal-lattice vectors. Our aim is to explore what the FT of the Fibonacci sequence is.

In order to calculate the FT of the Fibonacci sequence we will use a trick: we will generate it through a construction in a two-dimensional square lattice on the xy plane, as illustrated in Fig. 12.1. Starting at some lattice point $(x_0, y_0)$ (which we take to be the origin of the coordinate system), we draw a line at an angle θ to the x axis, where $\tan\theta = 1/\tau$ (this also implies $\sin\theta = 1/\sqrt{1+\tau^2}$, and $\cos\theta = \tau/\sqrt{1+\tau^2}$). We define this new line as the x′ axis, and the line perpendicular to it as the y′ axis. Next, we define the distance w between $(x_0, y_0)$ and a line parallel to the x′ axis passing through the point diametrically opposite $(x_0, y_0)$ in the lattice, on the negative x and positive y sides. We draw two lines parallel to the x′ axis at distances ±w/2 along the y′ axis. We consider all the points of the original square lattice that lie between these last two lines (a strip of width w centered at the x′ axis), and project them onto the x′ axis. The projected points form a Fibonacci sequence on the x′ axis.
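The strip-and-project construction just described can be tried out directly. In the sketch below (an illustration with hypothetical names, not the author's procedure) the origin is taken as the starting lattice point, so the diametrically opposite point on the negative x and positive y sides is (−1, 1) and the strip width is w = sin θ + cos θ. The code collects the lattice points inside the strip, projects them onto the rotated axis, and also compares the result with the closed form sin θ (N + NINT[N/τ + 1/2]/τ), with NINT taken as the floor function.

```python
import math

TAU = (1.0 + math.sqrt(5.0)) / 2.0
THETA = math.atan(1.0 / TAU)
SIN_T, COS_T = math.sin(THETA), math.cos(THETA)
WIDTH = SIN_T + COS_T  # distance from the origin to the line through (-1, 1)

def projected_points(n_max):
    """Project the square-lattice points inside the strip onto the rotated x' axis."""
    pts = []
    for m in range(-n_max, n_max + 1):        # lattice x coordinate
        for n in range(-n_max, n_max + 1):    # lattice y coordinate
            y_rot = -m * SIN_T + n * COS_T    # coordinate along y'
            if abs(y_rot) < WIDTH / 2.0:
                pts.append(m * COS_T + n * SIN_T)  # coordinate along x'
    # discard the ends, where the finite lattice patch truncates the strip
    return sorted(p for p in pts if abs(p) < n_max * COS_T - 2.0)

def closed_form(big_n):
    """sin(theta) * (N + floor(N/tau + 1/2)/tau), the expected projected position."""
    return SIN_T * (big_n + math.floor(big_n / TAU + 0.5) / TAU)

if __name__ == "__main__":
    pts = projected_points(40)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    word = "".join("L" if g > 0.5 * (SIN_T + COS_T) else "S" for g in gaps)
    print(word[:40])
```

Only two spacings occur, sin θ (short) and cos θ (long), and their ratio is τ.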

Construction of the Fibonacci sequence by using a square lattice in two dimensions: the thick line and corresponding rotated axes x .1 Quasicrystals 433 w y y’ x’ w x Figure 12.4) h(y ) = (12. positions of the projections on the x axis are given by xn = sin θ n + 1 n 1 N I NT + τ τ 2 (12. y ).12. y ) = g nm δ ( x − m )δ ( y − n ) 0 for | y | < w/2 1 for | y | ≥ w/2 (12. are projected onto the thick line to produce the Fibonacci sequence. λ = sin θ . The second function identiﬁes all the points that lie within the strip of width w around the rotated x axis. y )) g (x . the points on the x axis that constitute the Fibonacci sequence are given by f (x ) = g (x . b = 1/2. The square lattice points within the strip of width w . y axes. with a = 0. outlined by the two dashed lines.5) where m . n are integers.2). y ) = g . The ﬁrst function produces all the points of the original square lattice on the x y plane. y are at angle θ (tan θ = 1/τ ) with respect to the original x .1. which we do next.6) ¯ (x (x .3) which is identical to Eq. This way of creating the Fibonacci sequence is very useful for deriving its FT. y )h ( y )d y (12. We begin by deﬁning the following functions: ¯ (x . In terms of these two functions. (12. y (x .

In Eq. (12.6), $x$ and $y$ have become functions of $x'$, $y'$ through

$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta$$

The FT, $\tilde{f}(p)$, of $f(x')$ is given in terms of the FTs, $\tilde{g}(p, q)$ and $\tilde{h}(q)$, of $\bar{g}(x', y')$ and $h(y')$, respectively, as

$$\tilde{f}(p) = \frac{1}{2\pi} \int \tilde{g}(p, q)\, \tilde{h}(-q)\, dq \tag{12.7}$$

so all we need to do is calculate the FTs $\tilde{g}(p, q)$ and $\tilde{h}(p)$ of the functions $\bar{g}(x', y')$ and $h(y')$. These are obtained from the standard definition of the FT and the functions defined in Eqs. (12.4) and (12.5):

$$\tilde{g}(p, q) = \frac{1}{2\pi} \sum_{nm} \delta\big(p - 2\pi(m\cos\theta + n\sin\theta)\big)\, \delta\big(q - 2\pi(-m\sin\theta + n\cos\theta)\big), \qquad \tilde{h}(p) = w\, \frac{\sin(wp/2)}{wp/2} \tag{12.8}$$

which, when inserted into Eq. (12.7), give

$$\begin{aligned} \tilde{f}(p) &= \frac{1}{2\pi} \sum_{nm} w \left[ \frac{\sin\big(\pi w (m\sin\theta - n\cos\theta)\big)}{\pi w (m\sin\theta - n\cos\theta)} \right] \delta\big(p - 2\pi(m\cos\theta + n\sin\theta)\big) \\ &= \frac{1}{2\pi} \sum_{nm} w \left[ \frac{\sin\!\left(\dfrac{\pi w (m - n\tau)}{\sqrt{1+\tau^2}}\right)}{\dfrac{\pi w (m - n\tau)}{\sqrt{1+\tau^2}}} \right] \delta\!\left(p - \frac{2\pi (m\tau + n)}{\sqrt{1+\tau^2}}\right) \end{aligned} \tag{12.9}$$

where in the last expression we have substituted the values of $\sin\theta$ and $\cos\theta$ in terms of the slope of the $x'$ axis, as mentioned earlier (for a proof of the above relations see Problem 1).

The final result is quite interesting: the FT of the Fibonacci sequence contains two independent indices, $n$ and $m$, as opposed to what we would expect for the FT of a periodic lattice in one dimension; the latter is the sum of $\delta$-functions over the single index $m$, with arguments $(p - b_m)$, where $b_m = m(2\pi/a)$, with $a$ the lattice vector in real space. Moreover, at the values of $p$ for which the argument of the $\delta$-functions in $\tilde{f}(p)$ vanishes, the magnitude of the corresponding term is not constant, but is given by the factor in square brackets in front of the $\delta$-function in Eq. (12.9). This means that some of the spots in the diffraction pattern will be prominent (those corresponding to large values of the pre-$\delta$-function factor), while others will be vanishingly small (those corresponding to vanishing values of the pre-$\delta$-function factor). Finally, because of the two independent integer indices that enter into Eq. (12.9), there will be a dense sequence of spots, but, since not all of them are prominent, only a few values of $p$ will stand out in a diffraction pattern.
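To see which spots of Eq. (12.9) stand out, one can tabulate the peak positions and the pre-$\delta$-function factors for a range of index pairs $(m, n)$. The sketch below is illustrative only: the function names, the prominence threshold of 0.5 and the index cutoff are assumptions introduced here, and the overall prefactor $1/2\pi$ is left out because it does not affect the relative weights.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
NORM = math.sqrt(1 + TAU ** 2)     # sin(theta) = 1/NORM, cos(theta) = TAU/NORM


def peak_position(m, n):
    """Value of p at which the delta-function labelled (m, n) is centred."""
    return 2 * math.pi * (m * TAU + n) / NORM


def peak_weight(m, n, w=1.0):
    """Pre-delta-function factor w*sin(x)/x with x = pi*w*(m - n*tau)/sqrt(1 + tau^2)."""
    x = math.pi * w * (m - n * TAU) / NORM
    return w if x == 0 else w * math.sin(x) / x


def prominent_peaks(p_max, w=1.0, threshold=0.5, index_max=30):
    """Peaks with 0 < p <= p_max whose |weight| is at least threshold*w."""
    peaks = []
    for m in range(-index_max, index_max + 1):
        for n in range(-index_max, index_max + 1):
            p = peak_position(m, n)
            weight = peak_weight(m, n, w)
            if 0 < p <= p_max and abs(weight) >= threshold * w:
                peaks.append((p, weight, m, n))
    return sorted(peaks)


for p, weight, m, n in prominent_peaks(50.0):
    print(f"p = {p:7.3f}   weight = {weight:6.3f}   (m, n) = ({m}, {n})")
```

The strong peaks are those for which $m - n\tau$ is small, that is, for which $m/n$ is close to $\tau$ (for example ratios of successive Fibonacci numbers such as 13/8). Pairs far from this condition carry little weight even though their positions fill the $p$ axis densely.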

The few prominent spots give the appearance of order in the structure, which justifies the name “quasicrystal”. This is illustrated in Fig. 12.2.

Figure 12.2. The Fourier Transform $\tilde{f}(p)$ of the Fibonacci sequence as given by Eq. (12.9), with $w = 1$ and the $\delta$-function represented by a window of width 0.02 (cf. Eq. (G.53)).

Figure 12.3. Construction of 2-dimensional quasicrystal pattern with “fat” ($A$, $B$, $C$, $D$, $E$) and “skinny” ($a$, $b$, $c$, $d$, $e$) rhombuses. A seed consisting of the five fat and five skinny rhombuses is shown labeled on the right-hand side. The pattern can be extended to cover the entire two-dimensional plane without defects (overlaps or gaps); this maintains the five-fold rotational symmetry of the seed but has no long-range translational periodicity.

Similar issues arise for quasicrystals in two and three dimensions, the only difference being that the construction of the corresponding sequences requires more complicated building blocks and more elaborate rules for creating the patterns. An example in two dimensions is shown in Fig. 12.3: the elemental building blocks here are rhombuses (so called “fat” and “skinny”) with equal sides and angles which are multiples of $\theta = 36°$. This is one scheme proposed by R. Ammann and R. Penrose.

Other schemes have also been proposed by Penrose for this type of two-dimensional pattern. Note how the elemental building blocks can be arranged in a structure that has perfect five-fold symmetry, and this structure can be extended to tile the infinite two-dimensional plane without any overlaps or gaps in the tiling. This, however, is only achieved when strict rules are followed on how the elemental building blocks should be arranged at each level of extension, otherwise mistakes are generated which make it impossible to continue the extension of the pattern. The properly constructed pattern maintains the five-fold rotational symmetry but has no long-range translational periodicity. In real solids the elementary building blocks are three-dimensional structural units with five-fold or higher symmetry (like the icosahedron, see chapter 1).

An important question that arose early on was: can the perfect patterns be built based on local rules for matching the building blocks, or are global rules necessary? The answer to this question has crucial implications for the structure of real solids: if global rules were necessary, it would be difficult to accept that the elementary building blocks in real solids are able to communicate their relative orientation across distances of thousands or millions of angstroms. This would cast doubt on the validity of using the quasicrystal picture for real solids. Fortunately, it was demonstrated through simulations that essentially infinite structures can be built using local rules only. It is far easier to argue that local rules can be obeyed by the real building blocks: such rules correspond, for instance, to a preferred local relative orientation for bonding, which a small unit of atoms (the building block) can readily find.

Another interesting aspect of the construction of quasicrystals is that a projection of a regular lattice from a higher dimensional space, as was done above to produce the one-dimensional Fibonacci sequence from the two-dimensional square lattice, can also be devised in two and three dimensions.

12.2 Amorphous solids

Many real solids lack any type of long-range order in their structure, even the type of order we discussed for quasicrystals. We refer to these solids as amorphous. We can distinguish two general types of amorphous solids.

(i) Solids composed of one type of atom (or a very small number of atom types) which locally see the same environment throughout the solid, just like in simple crystals. The local environment cannot be identical for all atoms (this would produce a crystal), so it exhibits small deviations from site to site, which are enough to destroy the long-range order. Amorphous semiconductors and insulators (such as Si, Ge, SiO2), chalcogenide glasses (such as As2S3, As2Se3), and amorphous metals are some examples.

(ii) Solids composed of many different types of atoms which have complicated patterns of bonding. The class of structures based on long polymeric chains, that we refer to as plastics, is a familiar example.

We will examine in some detail representative examples of both types of amorphous solids. We begin with a discussion of amorphous solids of the first category, those in which the local bonding environment of all atoms is essentially the same, with small deviations. This makes it possible to define a regular type of site as well as defects in structure, akin to the definitions familiar from the crystal environment. Recalling the discussion of chapter 1, we can further distinguish the solids of this category into two classes, those with close-packed structures (like the crystalline metals), and those with open, covalently bonded structures (like crystalline semiconductors and insulators).

A model that successfully represents the structure of close-packed amorphous solids is the so-called random close packing (RCP) model. The idea behind this model is that atoms behave essentially like hard spheres and try to optimize the energy by having as many close neighbors as possible. We know from earlier discussion that for crystalline arrangements this leads to the FCC (with 12 nearest neighbors) or to the BCC (with eight nearest neighbors) lattices. In amorphous close-packed structures, the atoms are not given the opportunity (due, for instance, to rapid quenching from the liquid state) to find the perfect crystalline close-packed arrangement, so they are stuck in some arbitrary configuration, as close to the perfect structure as is feasible within the constraints of how the structure was created. Analogous considerations hold for the open structures that resemble the crystal structure of covalently bonded semiconductors and insulators; the corresponding model is the so-called continuous random network (CRN). The random packing of hard spheres, under conditions that do not permit the formation of the close-packed crystals, is an intuitive idea which will not be examined further. The formation of the CRN from covalently bonded atoms is more subtle, and will be discussed in some detail next.

12.2.1 Continuous random network

Amorphous Si usually serves as the prototypical covalently bonded amorphous solid. In this case, the bonding between atoms locally resembles that of the ideal crystalline Si, i.e. the diamond lattice, with tetrahedral coordination. The resemblance extends to the number of nearest neighbors (four), the bond length (2.35 Å) and the angles between the bonds (109°); of these, the most important is the number of nearest neighbors, while the values of the bond lengths and bond angles can deviate somewhat from the ideal values (a few percent for the bond length, significantly more for the bond angle). The resemblance to crystalline Si ends there, however, since there is no long-range order. Thus, each atom sees an environment very similar to that in the ideal crystalline structure, but the small distortions allow a random arrangement of the atomic units so that the crystalline order is destroyed. The reason for this close resemblance between the ideal crystalline structure and the amorphous structure is the strong preference of Si to form exactly four covalent bonds at tetrahedral directions with its neighbors.
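The ideal values quoted above follow directly from the geometry of the diamond lattice. As a small worked check, the sketch below computes the bond length and the bond angles from the four nearest-neighbor directions. The lattice constant of 5.43 Å for crystalline Si is an assumption brought in for this example (it is not quoted in this section), and the helper names are hypothetical.

```python
import math

# Four nearest-neighbor directions in the diamond lattice, in units of a/4.
NEIGHBOR_DIRECTIONS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

A_SI = 5.43  # assumed cubic lattice constant of crystalline Si, in angstroms


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
    return math.degrees(math.acos(cos_angle))


def ideal_bond_length(a):
    """Nearest-neighbor distance sqrt(3)*a/4 of the diamond lattice."""
    return norm(NEIGHBOR_DIRECTIONS[0]) * a / 4


def ideal_bond_angles():
    """All six angles between pairs of bonds at one tetrahedral site."""
    dirs = NEIGHBOR_DIRECTIONS
    return [angle_deg(u, v) for i, u in enumerate(dirs) for v in dirs[i + 1:]]


def percent_deviation(value, ideal):
    """Relative deviation of a measured bond length or angle from its ideal value."""
    return 100.0 * (value - ideal) / ideal


print(ideal_bond_length(A_SI))   # bond length in angstroms
print(ideal_bond_angles())       # six identical tetrahedral angles, arccos(-1/3)
```

A function such as `percent_deviation` could then be applied to the bond lengths and angles of a network model to quantify the distortions discussed in the text.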

This preference of Si for four covalent bonds at tetrahedral directions arises from its electronic structure (the four $sp^3$ orbitals associated with each atom, see the discussion in chapter 1). The idea that an amorphous solid has a structure locally similar to the corresponding crystalline structure seems reasonable. It took, however, a long time to establish the validity of this idea with quantitative arguments. The first convincing evidence that a local bonding arrangement resembling that of the crystal could be extended to large numbers of atoms, without imposing the regularity of the crystal, came from hand-built models by Polk [185]. What could have gone wrong in forming such an extended structure? It was thought that the strain in the bonds as the structure was extended (due to deviations from the ideal value) might make it impossible for it to continue growing. Polk’s hand-built model and subsequent computer refinement showed that this concern was not justified, at least for reasonably large, free-standing structures consisting of several hundred atoms.

Modern simulations attempt to create models of the tetrahedral continuous random network by quenching a liquid structure very rapidly, in supercells with periodic boundary conditions. Such simulations, based on empirical interatomic potentials that capture accurately the physics of covalent bonding, can create models consisting of several thousand atoms (see, for example, Ref. [186], and Fig. 12.4). A more realistic picture of bonding (but not necessarily of the structure itself) can be obtained by using a smaller number of atoms in the periodic supercell (fewer than 100) and employing first-principles electronic structure methods for the quenching simulation [187]. Alternatively, models have been created by neighbor-switching in the crystal and relaxing the distorted structure with empirical potentials [188]. These structures are quite realistic, in the sense that they are good approximations of the infinite tetrahedral CRN.

The difference from the original idea for the CRN is that the simulated structures contain some defects. The defects consist of mis-coordinated atoms, which can be either three-fold bonded (they are missing one neighbor, and are called “dangling bond” sites), or five-fold bonded (they have an extra bond, and are called “floating bond” sites). This is actually in agreement with experiment, which shows of order ∼1% mis-coordinated atoms in pure amorphous Si. Very often amorphous Si is built in an atmosphere of hydrogen, which has the ability to saturate the single missing bonds, and reduces dramatically the number of defects (to the level of 0.01%). There is on-going debate as to what exactly the defects in amorphous Si are: there are strong experimental indications that the three-fold coordinated defects are dominant, but there are also compelling reasons to expect that five-fold coordinated defects are equally important [189]. It is intriguing that all modern simulations of pure amorphous Si suggest a predominance of five-fold coordinated defects.
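The classification of sites into regular, dangling-bond and floating-bond types amounts to counting neighbors within a bonding cutoff in a periodic supercell. The following is a minimal sketch of that bookkeeping, not the analysis used in the cited simulations. It is tested on a perfect diamond-lattice supercell with one atom removed rather than on an actual amorphous model. The lattice constant (5.43 Å), the cutoff (2.8 Å, chosen between the first- and second-neighbor distances) and all function names are assumptions made for illustration.

```python
from collections import Counter

# Fractional coordinates of the eight atoms in the cubic diamond unit cell.
DIAMOND_BASIS = [
    (0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0),
    (0.25, 0.25, 0.25), (0.25, 0.75, 0.75), (0.75, 0.25, 0.75), (0.75, 0.75, 0.25),
]


def diamond_supercell(n_cells, a):
    """Atomic positions of an n_cells^3 supercell of the diamond lattice."""
    positions = []
    for i in range(n_cells):
        for j in range(n_cells):
            for k in range(n_cells):
                for bx, by, bz in DIAMOND_BASIS:
                    positions.append(((i + bx) * a, (j + by) * a, (k + bz) * a))
    return positions


def coordination_numbers(positions, box, cutoff):
    """Number of neighbors within cutoff for each atom (cubic box, minimum image)."""
    counts = [0] * len(positions)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dist2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box * round(d / box)    # nearest periodic image
                dist2 += d * d
            if dist2 < cutoff * cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


def classify_sites(counts):
    """Tally sites by coordination: 3 = dangling bond, 4 = regular, 5 = floating bond."""
    labels = {3: "dangling bond", 4: "regular", 5: "floating bond"}
    return Counter(labels.get(c, "other") for c in counts)


A_SI, CUTOFF = 5.43, 2.8                     # assumed values, in angstroms
crystal = diamond_supercell(3, A_SI)         # 216 atoms
box = 3 * A_SI
perfect = classify_sites(coordination_numbers(crystal, box, CUTOFF))
with_vacancy = classify_sites(coordination_numbers(crystal[1:], box, CUTOFF))
print(perfect)
print(with_vacancy)
```

Removing one atom leaves its four former neighbors three-fold coordinated, which is the simplest way to produce dangling-bond sites in this toy setting. Floating-bond sites would only appear in a distorted network.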

Figure 12.4. Example of a continuous random network of tetrahedrally coordinated atoms. This particular structure is a model for amorphous silicon; it contains 216 atoms and has only six mis-coordinated atoms (five-fold bonded), which represent fewer than 3% defects.

So far we have been discussing the tetrahedral version of the continuous random network (we will refer to it as t-CRN). This can be generalized to model amorphous solids with related structures. In particular, if we consider putting an oxygen atom at the center of each Si–Si bond in a t-CRN and allowing the bond lengths to relax to their optimal length by uniform expansion of the structure, then we would have a rudimentary model for the amorphous SiO2 structure. In this structure each Si atom is at the center of a tetrahedron of O atoms and is therefore tetrahedrally coordinated, assuming that the original t-CRN contains no defects. At the same time, each O atom has two Si neighbors, so that all atoms have exactly the coordination and bonding they prefer. This structure is actually much more flexible than the original t-CRN, because the Si–O–Si angles can be easily distorted with very small energy cost, which allows the Si-centered tetrahedra to move considerable distances relative to each other. This flexibility gives rise to a very stable structure, which can exhibit much wider diversity in the arrangement of the tetrahedral units than what would be allowed in the oxygen-decorated t-CRN. The corresponding amorphous SiO2 structures are good models of common glass. Analogous extensions of the t-CRN can be made to model certain chalcogenide glasses.
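The oxygen-decoration step described above for SiO2 is easy to express in code: find every Si–Si bond and place an O atom at its midpoint. The sketch below does this for a defect-free four-fold network, using a small diamond-lattice supercell as a stand-in for a t-CRN (an assumption, since no amorphous coordinates are given here). The uniform expansion that relaxes the bond lengths is not performed. The lattice constant, the cutoff and the function names are illustrative choices.

```python
import math

DIAMOND_BASIS = [
    (0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0),
    (0.25, 0.25, 0.25), (0.25, 0.75, 0.75), (0.75, 0.25, 0.75), (0.75, 0.75, 0.25),
]


def diamond_supercell(n_cells, a):
    """Atomic positions of an n_cells^3 supercell of the diamond lattice."""
    return [((i + bx) * a, (j + by) * a, (k + bz) * a)
            for i in range(n_cells) for j in range(n_cells) for k in range(n_cells)
            for bx, by, bz in DIAMOND_BASIS]


def minimum_image(ri, rj, box):
    """Vector from ri to the nearest periodic image of rj (cubic box)."""
    return tuple((b - a) - box * round((b - a) / box) for a, b in zip(ri, rj))


def find_bonds(positions, box, cutoff):
    """List of (i, j, bond_vector) for all pairs closer than cutoff."""
    bonds = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = minimum_image(positions[i], positions[j], box)
            if math.sqrt(sum(c * c for c in d)) < cutoff:
                bonds.append((i, j, d))
    return bonds


def decorate_with_oxygen(positions, bonds, box):
    """One O atom at the midpoint of each bond, wrapped back into the box."""
    return [tuple((r + 0.5 * c) % box for r, c in zip(positions[i], d))
            for i, _, d in bonds]


A_SI, CUTOFF = 5.43, 2.8              # assumed values, in angstroms
silicon = diamond_supercell(2, A_SI)
box = 2 * A_SI
bonds = find_bonds(silicon, box, CUTOFF)
oxygen = decorate_with_oxygen(silicon, bonds, box)
print(len(silicon), len(oxygen))      # numbers of Si and O atoms
```

Each Si has four bonds and each bond is shared by two Si atoms, so the decorated network has two O atoms per Si. This reproduces the SiO2 stoichiometry, with every O bonded to exactly two Si neighbors by construction.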

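These chalcogenide glasses contain covalently bonded group-V and group-VI atoms (for more details see the book by Zallen, mentioned in the Further reading section).

12.2.2 Radial distribution function

How could the amorphous structure be characterized in a way that would allow comparison to experimentally measurable quantities? In the case of crystals and quasicrystals we considered scattering experiments in order to determine the signature of the structure. We will do the same here. Since there is no periodic or regular structure of any form in the solid, we now have to treat each atom as a point from which the incident radiation is scattered. We consider that the detector is situated at a position $\mathbf{R}$ well outside the solid; in this case the directions of $\mathbf{R}$ and of the scattered wave-vector $\mathbf{q}'$ must be the same. Due to the lack of order, we have to assume that the incident wave, which has an amplitude $\exp(i\mathbf{q}\cdot\mathbf{r}_n)$ at the position $\mathbf{r}_n$ of an atom, is scattered into a spherical wave $\exp(i|\mathbf{q}'||\mathbf{R} - \mathbf{r}_n|)/|\mathbf{R} - \mathbf{r}_n|$. This procedure gives

$$A(\mathbf{q}, \mathbf{q}', \mathbf{R}) = \sum_n e^{i\mathbf{q}\cdot\mathbf{r}_n}\, \frac{e^{i|\mathbf{q}'||\mathbf{R} - \mathbf{r}_n|}}{|\mathbf{R} - \mathbf{r}_n|} \tag{12.10}$$

The directions of $\mathbf{R}$ and $\mathbf{q}'$ are the same, therefore we can write

$$|\mathbf{q}'||\mathbf{R} - \mathbf{r}_n| = |\mathbf{q}'||\mathbf{R}| \left[ 1 - \frac{2}{|\mathbf{R}|^2}\, \mathbf{R}\cdot\mathbf{r}_n + \frac{1}{|\mathbf{R}|^2}\, |\mathbf{r}_n|^2 \right]^{1/2} \approx |\mathbf{q}'||\mathbf{R}| - |\mathbf{q}'|\, \hat{\mathbf{R}}\cdot\mathbf{r}_n = |\mathbf{q}'||\mathbf{R}| - \mathbf{q}'\cdot\mathbf{r}_n \tag{12.11}$$

where we have neglected the term $(|\mathbf{r}_n|/|\mathbf{R}|)^2$ since the vector $\mathbf{r}_n$, which lies within the solid, has a much smaller magnitude than the vector $\mathbf{R}$, the detector being far away from the solid. Similarly, in the denominator of the spherical wave we can neglect the vector $\mathbf{r}_n$ relative to $\mathbf{R}$, to obtain for the amplitude in the detector:

$$A(\mathbf{q}, \mathbf{q}', \mathbf{R}) \approx \frac{e^{i|\mathbf{q}'||\mathbf{R}|}}{|\mathbf{R}|} \sum_n e^{i(\mathbf{q} - \mathbf{q}')\cdot\mathbf{r}_n} = \frac{e^{i|\mathbf{q}'||\mathbf{R}|}}{|\mathbf{R}|} \sum_n e^{-i\mathbf{k}\cdot\mathbf{r}_n} \tag{12.12}$$

where we have defined the scattering vector $\mathbf{k} = \mathbf{q}' - \mathbf{q}$. The signal $f(\mathbf{k}, \mathbf{R})$ at the detector will be proportional to $|A(\mathbf{q}, \mathbf{q}', \mathbf{R})|^2$, which gives

$$f(\mathbf{k}, \mathbf{R}) = \frac{A_0}{|\mathbf{R}|^2} \sum_{nm} e^{-i\mathbf{k}\cdot(\mathbf{r}_n - \mathbf{r}_m)} \tag{12.13}$$

We then have to s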