http://www.pitt.edu/~jdnorton/teaching/HPS_0410/index.html[28/04/2010 08:17:32 ﺹ]
Lectures Assignments
Course Description
Schedule
Old Schedule before snow closing.
Term paper
Clock
Sign in sheet
Title page, Preface and Table of Contents for Einstein for Everyone
Introduction: the questions
Special relativity: the basics
Special relativity: adding velocities
Special relativity: the relativity of simultaneity
Is special relativity paradoxical?
E=mc²
Origins of Special Relativity
Einstein's Pathway to Special Relativity
Spacetime
Spacetime and the Relativity of Simultaneity
Spacetime, Tachyons, Twins and Clocks
What is a four dimensional space like?
Philosophical Significance of the Special Theory of Relativity.
Euclidean Geometry: The First Great Science
Non-Euclidean Geometry: A Sample Construction
Spaces of Constant Curvature
Spaces of Variable Curvature
General Relativity
Gravity Near a Massive Body
Einstein's Pathway to General Relativity
Relativistic Cosmology
Big Bang Cosmology
Black Holes
A Better Picture of Black Holes
Atoms and the Quantum
1. Principle of Relativity
2. Adding Velocities Einstein's Way
3. Relativity of Simultaneity
4. Origins of Special Relativity
5. Spacetime
6. Philosophical Significance
7. Non-Euclidean Geometry
8. Curvature
9. General Relativity
10. Relativistic Cosmology
11. Big Bang Cosmology
12. Black Holes (Not required for submission)
13. Origins of Quantum Theory
14. Problems of Quantum Theory
HPS 0410 Einstein for Everyone Spring 2010
Origins of Quantum Theory
Quantum Theory of Waves and Particles
The Measurement Problem
Einstein on the Completeness of Quantum Theory
Einstein as the Greatest of the Nineteenth Century Physicists
For documents relating to the Fall 2008 offering of this class, click here.
For documents relating to the Spring 2008 offering of this class, click here.
For documents related to the Spring 2007 offering of this class, click here.
HPS 0410 Course Description
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/description.html[28/04/2010 08:17:34 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Back to main course page
Lectures
Monday/Wednesday 1:00 pm - 1:50 pm, CL 232 (John D. Norton)
Recitations
(Register for one.)
Monday 3-3:50 pm, CL 216 (Julia Bursten)
Monday 5-5:50 pm, CL 229 (Emi Iwatani)
Tuesday 12-12:50 pm, CL 327 (Julia Bursten)
Tuesday 1-1:50 pm, CL 327 (Emi Iwatani)
Tuesday 3-3:50 pm, CL 129 (Julia Bursten)
Tuesday 4-4:50 pm, CL 129 (Emi Iwatani)
Instructors
John D. Norton, 412-624-1051, jdnorton@pitt.edu
Room 817 CL. Office hours: Monday 2-3 pm, Wednesday 2-3 pm.
Julia Bursten, jrb135@pitt.edu
Room 901H CL. Office hours: Tuesday 1-2, Wednesday 12-1.
Emi Iwatani, emi8@pitt.edu
Room 901M CL. Office hours: Monday 2-3 pm, Tuesday 2-3 pm.
Course website
Course materials will be posted at the course website
http://www.pitt.edu/~jdnorton/teaching/HPS_0410
We will communicate grades through the Blackboard website at
https://courseweb.pitt.edu/
These websites will be the primary means of obtaining course material. To take this
course, you must have access to the internet.
Topics
Special relativity: The two postulates and their strange consequences: rods and clocks run amuck. The light barrier.
Relativity of simultaneity: the confusion of when and where and the puzzles it solves. Spacetime: time as the fourth
dimension. Origins of special relativity: how did Einstein do it? Puzzles and paradoxes. The most famous equation:
E=mc². The philosophical dividend.
General relativity: Straightening out Euclid. Acceleration provides the clue: gravitation is just spacetime bent. General
relativity passes the tests. Applications of general relativity: Goedel universes and the like: could we take a journey
into the past? Cosmology: the biggest picture possible; a beginning and end for time? Black holes: when the fabric
of spacetime collapses.
Quantum theory: The puzzle of black body radiation: light comes in lumps. The Bohr atom: where electrons jump.
The perversity of matter in the small: both particle and wave. The uncertainty principle. The failure of determinism.
The puzzle of Schrödinger's cat: neither alive nor dead.
Assessment
Short tests (35%)
There will be 6 short in-class tests, roughly one every two weeks. (Schedule) The
grade is the best 5 of 6.
Recitation (35%)
The grade is divided between assignments (25%) and recitation participation
(10%).
An assignment is due each week in the recitation. The assignment grade is the
best 11 of 14.
After cancellation of classes February 8-10, the assignment grade is reset at the best 10 of 13.
Term paper (30%)
The term paper is by electronic submission to your recitation instructor on the day
of the final lecture, Wednesday April 21.
Short Test
The short tests will examine material covered roughly in the preceding two weeks. They
will be held in the first 15 minutes of class and consist of a series of 3-4 related questions
requiring a few sentences each as answers.
Policy on Missed Tests and Late Assignments
No make up tests will be offered. Since the test grade is the best 5 of 6, one missed test
is automatically forgiven. It is strongly recommended that this one forgiven test be used
only when illness or emergencies preclude class attendance.
Assignments are due each week at the start of the recitation. Late assignments are not
accepted. Since the assignment grade is the best 11 of 14, three missed assignments are
automatically forgiven. It is strongly recommended that these forgiven assignments be
used only when illness or emergencies preclude class attendance.
(An exception is made for students who add the course after the start of term. Assignments due prior to the date on
which the class was added may be submitted at the next scheduled recitation.)
For added flexibility, a universal makeup assignment is offered to all students. The
makeup assignment is a second term paper conforming to the term paper guidelines, but
only 500 words in length, due on the day of the last lecture, Wednesday April 21.
What do I do if a university break cancels a recitation in which an
assignment is due?
There will be no recitation held on Martin Luther King Day, Monday, January 18.
Assignment 2, due in these cancelled recitations, may be submitted to the recitation
instructor at the beginning of the lecture that immediately follows the cancelled recitation
on Wednesday January 20.
Texts
The primary text for the class is available on this website as the online text Einstein for
Everyone.
Supplementary readings are:
J. Schwartz and M. McGuinness, Einstein for Beginners. New York: Pantheon.
J. P. McEvoy and O. Zarate, Introducing Stephen Hawking. Totem.
J. P. McEvoy, Introducing Quantum Theory. Totem.
Special Needs
If you have a disability for which you are or may be requesting an accommodation, you
are encouraged to contact both your instructor and Disability Resources and Services, 216
William Pitt Union, 412-648-7890 or 412-383-7355 (TTY) as early as possible in the term.
For more information, see http://www.drs.pitt.edu/
The Undergraduate Dean of Arts and Sciences has requested instructors to alert all students to University of
Pittsburgh Policy 09-10-01, "Email Communications Policy."
HPS 0410 Schedule
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/schedule.html[28/04/2010 08:17:36 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Back to main course page
Clock
Schedule
Schedule as revised after snowstorm closings of February 8-10. Old schedule here.
Each week lists the lecture dates and topics, the recitation dates with the assignment due, and any test.

Week 1
Lecture, Wed. Jan. 6: Introduction: the questions.

Week 2
Lecture, Mon. Jan. 11: Special relativity: the basics.
Recitation, Mon. Jan. 11 / Tues. Jan. 12. Assignment due: 1. Principle of Relativity
Lecture, Wed. Jan. 13: Special relativity: adding velocities. Relativity of simultaneity

Week 3
NO CLASS Mon. Jan. 18. Martin Luther King Day
Tues. Jan. 19: Add/drop ends
Submitting assignments due on Monday
Recitation, Tues. Jan. 19. Assignment due: 2. Adding Velocities Einstein's Way
Lecture, Wed. Jan. 20: Is special relativity paradoxical?

Week 4
Lecture, Mon. Jan. 25: E=mc²
Recitation, Mon. Jan. 25 / Tues. Jan. 26. Assignment due: 3. Relativity of Simultaneity
Lecture, Wed. Jan. 27: Origins of special relativity. Einstein's Pathway to Special Relativity
Test 1, Wed. Jan. 27

Week 5
Lecture, Mon. Feb. 1: Spacetime. Spacetime and the Relativity of Simultaneity
Recitation, Mon. Feb. 1 / Tues. Feb. 2. Assignment due: 4. Origins of Special Relativity
Lecture, Wed. Feb. 3: Spacetime and the Relativity of Simultaneity. Spacetime, Tachyons, Twins and Clocks

Week 6
Mon. Feb. 8, Tues. Feb. 9, Wed. Feb. 10: Classes cancelled this week because of snowstorm. This is a revised schedule of classes. Old schedule here.

Week 7
Lecture, Mon. Feb. 15: What is a four dimensional space like? Philosophical significance of relativity
Recitation, Mon. Feb. 15 / Tues. Feb. 16. Assignment due: 5. Spacetime
Lecture, Wed. Feb. 17: Philosophical significance of relativity
Test 2, Wed. Feb. 17

Week 8
Lecture, Mon. Feb. 22: Euclidean Geometry: The First Great Science. Non-Euclidean Geometry: A Sample Construction
Recitation, Mon. Feb. 22 / Tues. Feb. 23. Assignment due: 6. Philosophical Significance
Lecture, Wed. Feb. 24: Non-Euclidean Geometry: A Sample Construction. Spaces of Constant Curvature

Week 9
Lecture, Mon. Mar. 1: Spaces of Constant Curvature. Spaces of Variable Curvature
Recitation, Mon. Mar. 1 / Tues. Mar. 2. Assignment due: 7. Non-Euclidean Geometry
Lecture, Wed. Mar. 3: General relativity
Test 3, Wed. Mar. 3

SPRING BREAK

Week 10
Lecture, Mon. Mar. 15: General relativity
Recitation, Mon. Mar. 15 / Tues. Mar. 16. Assignment due: 8. Curvature
Lecture, Wed. Mar. 17: Gravity Near a Massive Body. Einstein's Pathway to General Relativity

Week 11
Lecture, Mon. Mar. 22: Relativistic cosmology
Recitation, Mon. Mar. 22 / Tues. Mar. 23. Assignment due: 9. General Relativity
Lecture, Wed. Mar. 24: Relativistic cosmology
Test 4, Wed. Mar. 24

Week 12
Lecture, Mon. Mar. 29: Big bang cosmology
Recitation, Mon. Mar. 29 / Tues. Mar. 30. Term paper topic submitted. Assignment due: 10. Relativistic Cosmology
Lecture, Wed. Mar. 31: Big bang cosmology / Black holes

Week 13
Lecture, Mon. Apr. 5: Black holes. Optional: A Better Picture of Black Holes
Recitation, Mon. Apr. 5 / Tues. Apr. 6. Assignment due: 11. Big Bang Cosmology
Lecture, Wed. Apr. 7: Origins of Quantum Theory
Test 5, Wed. Apr. 7

Week 14
Lecture, Mon. Apr. 12: Origins of Quantum Theory
Recitation, Mon. Apr. 12 / Tues. Apr. 13. Assignment due: 13. Origins of Quantum Theory
Lecture, Wed. Apr. 14: Quantum Theory of Waves and Particles

Week 15
Lecture, Mon. Apr. 19: The Measurement Problem
Recitation, Mon. Apr. 19 / Tues. Apr. 20. Assignment due: 14. Problems of Quantum Theory
Lecture, Wed. Apr. 21: Einstein on the Completeness of Quantum Theory. Term paper due
Test 6, Wed. Apr. 21

Test 1. Wednesday January 27. The test will be in the first 15 minutes of class and will consist of 3-4
questions requiring answers of a few sentences each. The material examinable is the content of the chapters
"Special relativity: the basics," "Special relativity: adding velocities," "Relativity of simultaneity," "Is special
relativity paradoxical?" and the assignments 1-3.
Test 2. Wednesday February 17. The material examinable is the content of the chapters "E=mc²", "Origins
of Special Relativity," "Einstein's Pathway to Special Relativity," the three "Spacetime" chapters and the
assignments 4 and 5.
Test 3. Wednesday March 3. The material examinable is the content of the chapters "Philosophical
Significance of Relativity," the chapters on Euclidean and Non-Euclidean Geometry and Spaces of Constant
Curvature; and the assignments 6 and 7.
Test 4. Wednesday March 24. The material examinable is the content of the chapters "Spaces of Variable
Curvature," "General Relativity," "Gravity Near a Massive Body" and "Einstein's Pathway to General
Relativity"; and the assignments 8 and 9.
Test 5. Wednesday April 7. The material examinable is the content of the chapters "Relativistic Cosmology"
and "Big Bang Cosmology" and the assignments 10 and 11.
Test 6. Wednesday April 21. The material examinable is the content of the chapters "Black Holes," "Origins
of Quantum Theory," as much as we have covered of "Quantum Theory of Waves and Particles," "The
Measurement Problem," "Einstein on the Completeness of Quantum Theory" and the assignments 13 and
14.
HPS 0410 Schedule
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/schedule_old.html[28/04/2010 08:17:38 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Back to main course page
Clock
Schedule
This is the term's OLD schedule, which has been modified as a result of the cancellation of classes on
February 8-10 due to snowstorms. The new schedule is here.
Each week lists the lecture dates and topics, the recitation dates with the assignment due, and any test.

Week 1
Lecture, Wed. Jan. 6: Introduction: the questions.

Week 2
Lecture, Mon. Jan. 11: Special relativity: the basics.
Recitation, Mon. Jan. 11 / Tues. Jan. 12. Assignment due: 1. Principle of Relativity
Lecture, Wed. Jan. 13: Special relativity: adding velocities. Relativity of simultaneity

Week 3
NO CLASS Mon. Jan. 18. Martin Luther King Day
Tues. Jan. 19: Add/drop ends
Submitting assignments due on Monday
Recitation, Tues. Jan. 19. Assignment due: 2. Adding Velocities Einstein's Way
Lecture, Wed. Jan. 20: Is special relativity paradoxical?

Week 4
Lecture, Mon. Jan. 25: E=mc²
Recitation, Mon. Jan. 25 / Tues. Jan. 26. Assignment due: 3. Relativity of Simultaneity
Lecture, Wed. Jan. 27: Origins of special relativity. Einstein's Pathway to Special Relativity
Test 1, Wed. Jan. 27

Week 5
Lecture, Mon. Feb. 1: Spacetime. Spacetime and the Relativity of Simultaneity
Recitation, Mon. Feb. 1 / Tues. Feb. 2. Assignment due: 4. Origins of Special Relativity
Lecture, Wed. Feb. 3: Spacetime and the Relativity of Simultaneity. Spacetime, Tachyons, Twins and Clocks

Week 6
Lecture, Mon. Feb. 8: What is a four dimensional space like? Philosophical significance of relativity
Recitation, Mon. Feb. 8 / Tues. Feb. 9. Assignment due: 5. Spacetime
Lecture, Wed. Feb. 10: Philosophical significance of relativity
Test 2, Wed. Feb. 10

Week 7
Lecture, Mon. Feb. 15: Euclidean Geometry: The First Great Science. Non-Euclidean Geometry: A Sample Construction
Recitation, Mon. Feb. 15 / Tues. Feb. 16. Assignment due: 6. Philosophical Significance
Lecture, Wed. Feb. 17: Non-Euclidean Geometry: A Sample Construction. Spaces of Constant Curvature

Week 8
Lecture, Mon. Feb. 22: Spaces of Constant Curvature. Spaces of Variable Curvature
Recitation, Mon. Feb. 22 / Tues. Feb. 23. Assignment due: 7. Non-Euclidean Geometry
Lecture, Wed. Feb. 24: General relativity
Test 3, Wed. Feb. 24

Week 9
Lecture, Mon. Mar. 1: General relativity. Gravity Near a Massive Body. Einstein's Pathway to General Relativity
Recitation, Mon. Mar. 1 / Tues. Mar. 2. Assignment due: 8. Curvature
Lecture, Wed. Mar. 3: General relativity

SPRING BREAK

Week 10
Lecture, Mon. Mar. 15: Relativistic cosmology
Recitation, Mon. Mar. 15 / Tues. Mar. 16. Assignment due: 9. General Relativity
Lecture, Wed. Mar. 17: Relativistic cosmology
Test 4, Wed. Mar. 17

Week 11
Lecture, Mon. Mar. 22: Big bang cosmology
Recitation, Mon. Mar. 22 / Tues. Mar. 23. Assignment due: 10. Relativistic Cosmology
Lecture, Wed. Mar. 24: Big bang cosmology / Black holes

Week 12
Lecture, Mon. Mar. 29: Black holes
Recitation, Mon. Mar. 29 / Tues. Mar. 30. Term paper topic submitted. Assignment due: 11. Big Bang Cosmology
Lecture, Wed. Mar. 31: A Better Picture of Black Holes
Test 5, Wed. Mar. 31

Week 13
Lecture, Mon. Apr. 5: A Better Picture of Black Holes
Recitation, Mon. Apr. 5 / Tues. Apr. 6. Assignment due: 12. Black Holes
Lecture, Wed. Apr. 7: Origins of Quantum Theory

Week 14
Lecture, Mon. Apr. 12: Origins of Quantum Theory
Recitation, Mon. Apr. 12 / Tues. Apr. 13. Assignment due: 13. Origins of Quantum Theory
Lecture, Wed. Apr. 14: Problems of Quantum Theory

Week 15
Lecture, Mon. Apr. 19: Problems of Quantum Theory
Recitation, Mon. Apr. 19 / Tues. Apr. 20. Assignment due: 14. Problems of Quantum Theory
Lecture, Wed. Apr. 21: Problems of Quantum Theory. Term paper due
Test 6, Wed. Apr. 21

Test 1. Wednesday January 27. The test will be in the first 15 minutes of class and will consist of 3-4
questions requiring answers of a few sentences each. The material examinable is the content of the chapters
"Special relativity: the basics," "Special relativity: adding velocities," "Relativity of simultaneity," "Is special
relativity paradoxical?" and the assignments 1-3.
Test 2. Wednesday February 10. The material examinable is the content of the chapters "E=mc²", "Origins
of Special Relativity," "Einstein's Pathway to Special Relativity," the three "Spacetime" chapters and the
assignments 4 and 5.
Test 3. Wednesday February 24. The material examinable is the content of the chapters "Philosophical
Significance of Relativity" and "Non-Euclidean Geometry" and the assignments 6 and 7.
Test 4. Wednesday March 17. The material examinable is the content of the chapters "Spaces of Variable
Curvature" and "General Relativity" and the assignments 8 and 9.
Test 5. Wednesday March 31. The material examinable is the content of the chapters "Relativistic
Cosmology" and "Big Bang Cosmology" and the assignments 10 and 11.
Test 6. Wednesday April 21. The material examinable is the content of the chapters "Black Holes," "A Better
Picture of Black Holes" and "Origins of Quantum Theory" and the assignments 12 and 13.
HPS 0410 Term Paper
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/paper.html[28/04/2010 08:17:39 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Term Paper
An Amazing Scientific Discovery
Due by final lecture: Wednesday April 21
Submit in electronic form to recitation instructor
1000 words
Topic selection
Due in recitation: Mon., Mar. 29/ Tues., Mar. 30
Project
This course is a parade of amazing scientific discoveries. They are things that would never
occur to us ordinarily: that there may be no fact as to whether two events are simultaneous;
that energy and matter are the same thing; that gravity is just funny geometry; that time had a
beginning; and more. What makes these all the more amazing is that they are not conjurings of
fiction. They are our best attempts to describe how our world really is; and science can tell us a
cogent and compelling story as to why we should believe them.
For your term paper, you are to identify and describe an amazing idea. Your text should
contain:
1. A clear explanation of the amazing scientific discovery.
2. An account of how the discovery was made.
Your amazing idea must be drawn from standard science. The goal is not to report on wild
speculation that someone, someday thinks might become regular science. You are to seek an
amazing discovery that has already become regular science. If you are unsure whether an
amazing idea is drawn from standard science, ask if it has experimental or observational
evidence in its favor. If it doesn't, it is speculation!
Your paper must present material not already covered in lectures and recitations. For this
reason you are best advised to write about an amazing idea not already covered in the class. If
you do choose one we have covered in class, note that your grade will depend entirely on the
extent to which you go beyond class material.
Your paper must present novel text written specifically for this class. Because of the breadth of
the assignment, you may find you already have something written for another class that suits
the assignment. You may not "recycle" text written for another class. The point of this
assignment is for you to do new research and write new text.
Focus on the rational basis of the discovery. Your account of how the discovery was made
should focus on what led the scientist or scientists to the discovery and the reasons that they
found to believe in its correctness. You need not distract yourself with incidental biographical or
other background facts unless they are important to understanding the grounding of the
discovery.
Keep the discovery narrow. It is easy to tackle too big a topic. Modern cosmology as a theory is
far too big for this project. One discovery in it, such as the presence of dark matter in galaxies,
is already quite a big enough topic for this paper. If in doubt, narrow the topic.
The discovery must be in science and not technology. While the achievements of modern
technology are amazing, they are not our concern in this paper. You should be looking at
things we know, not things we make. Sometimes the latest technology has an amazing
scientific discovery behind it; that discovery could be the focus of a paper. If you do decide to
pursue a scientific discovery that lies behind some new advance in technology, be careful; very
often those discoveries are complicated and can make the paper hard to write.
Selection of Topic
A brief statement of the amazing idea selected is due in the recitation, Monday, March 29/
Tuesday, March 30. Submit it as one paragraph, on paper. 1/10th of the term paper grade is
assigned for submitting a suitable statement on time. (These are easy points earned just for
being on time!)
Consult with your recitation instructor if you are uncertain over the idea or need assistance in
locating a suitable one.
Presentation
The paper should be headed with your name, the title of the paper and the course to which it is
being submitted. The paper should have an introduction and conclusion and be divided into
appropriately headed sections. A standard system for footnoting and for referencing your
sources must be adopted and used consistently throughout. Consult a guide on writing term
papers if you are unsure of such systems.
We expect your writing to be clear and simple. That applies both to the thoughts expressed
and the words used. The thoughts should develop naturally in small, clear steps. The wording
should be plain and direct and the sentences short. There is no gain in a big word, when a little
one will do. We expect proper grammar and correct spelling and will penalize major excursions.
Submission
Your paper is to be submitted to us in electronic form via turnitin.com, a plagiarism prevention
web resource. Here are the instructions for submitting your paper:
1. Visit http://turnitin.com.
2. Click “New Users” in the upper right corner.
3. Contact your recitation instructor to obtain the appropriate Turnitin Class ID number
and Class Enrollment Password.
4. Finish the registration process.
5. Click on the “Einstein for Everyone” class link.
6. Click on the “Submit” icon in the row marked “Paper.”
7. Upload your paper.
Acceptable formats for your paper are MS Word, WordPerfect, PostScript, PDF, HTML, RTF,
and plain text. You should also submit your extra credit paper, if you choose to do one, by
clicking on the “Submit” icon in the row marked “Extra Credit Paper.” All papers (including extra
credit papers) must be submitted by midnight of the due date.
Use of Sources
As is standard in all academic writing, the wording of your paper should be your own; it should
not be copied or paraphrased even loosely from another source. If you are uncertain over the
correct use of sources, see this Guide.
Clock
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/clock.html[28/04/2010 08:17:44 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Einstein's Time is ...
Main course page Schedule
HPS 0410 Sign In
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/sign_in.html[28/04/2010 08:17:45 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Name:_______________________________
Major:________________________________
Level:________________________________
Is there anything in particular you would like to cover in this course?
Einstein for Everyone
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/index.html[28/04/2010 08:17:47 ﺹ]
Einstein for Everyone
JOHN D. NORTON
Nullarbor Press
2007
revisions 2008, 2010
Copyright © 2007, 2008, 2010 by Nullarbor Press
Published by Nullarbor Press, 500 Fifth Avenue, Pittsburgh, Pennsylvania 15260
with offices in Liberty Ave., Pittsburgh, Pennsylvania, 15222
All Rights Reserved
John D. Norton
Center for Philosophy of Science
Department of History and Philosophy of Science
University of Pittsburgh
Pittsburgh PA USA 15260
An advanced sequel is planned in this series:
Einstein for Almost Everyone
2 4 6 8 9 7 5 3 1
ePrinted in the United States of America
no trees were harmed
web*book™
Preface
For over a decade I have taught an introductory, undergraduate class, "Einstein for
Everyone," at the University of Pittsburgh to anyone interested enough to walk through
the door. The course is aimed at people who have a strong sense that what Einstein did
changed everything. However, they do not know enough physics to understand what he
did and why it was so important. The course presents just enough of Einstein's physics
to give students an independent sense of what he achieved and what he did not achieve.
The latter is almost as important as the former. For almost everyone with some
foundational axe to grind finds a way to argue that what Einstein did vindicates their
view. They certainly cannot all be right. Some independent understanding of Einstein's
physics is needed to separate the real insights from the never ending hogwash that
seems to rain down on us all.
With each new offering of the course, I had the chance to find out what content worked
and which of my ever so clever pedagogical inventions were failures. By this slow
process of trial and error, indulging the indefinitely elastic patience of the students at
the University of Pittsburgh, the course has grown to be something that works pretty
well, or so it seems from my side of the lectern.
At the same time, my lecture notes have evolved. They began as chaotic pencil jottings.
Over time they solidified into neater pencil script and overhead transparencies; and then
into summaries that I posted on my website; and then finally those summaries were
expanded into a full text that can be read independently. That text is presented here.
Its content reflects the fact that my interest lies in history and philosophy of science and
that I teach in a Department of History and Philosophy of Science. There is a lot of
straight exposition of Einstein's physics and the physics it inspired. However, there is
also a serious interest in the history of Einstein's science. A great deal of my
professional life has been spent poring over Einstein's manuscripts, trying to discern
how he found what he found. The results of those studies have crept in. In other places I
try to show how a professional philosopher approaches deeply intractable foundational
issues. The temptation in such cases is to let one's standard of rigor drop, since otherwise
it seems impossible to arrive at any decision. That is exactly the wrong reaction. When
the problems are intractable, we must redouble our commitment to rigor in thought and
I have tried to show how we can do this.
This text owes a lot to many. It came about because once Peter Machamer, then chair
of the Department of HPS, urged a meandering junior professor to do a course that
"did" Einstein and black holes and all that stuff. The text is indebted to the University of
Pittsburgh, which has the real wisdom to see that it gets the most from its faculty by
letting them do what fascinates them, for they will surely do that best. It owes the
greatest debt to the infinite patience of the students who have taken this class, told me
what works and what does not, and each year allow me at least indirectly to experience
anew that inescapable sense of wonder when one first grasps the beauty of what
Einstein did.
Contents
Preface
1. Introduction
2. Special Relativity: The Basics
3. Special Relativity: Adding Velocities
4. Special Relativity: Relativity of Simultaneity
5. Is Special Relativity Paradoxical?
6. E=mc²
7. Origins of Special Relativity
8. Einstein's Pathway to Special Relativity
9. Spacetime
10. Spacetime and the Relativity of Simultaneity
11. Spacetime, Tachyons, Twins and Clocks
12. What is a Four Dimensional Space Like?
13. Philosophical Significance of the Special Theory of Relativity
14. Euclidean Geometry: The First Great Science
15. Non-Euclidean Geometry: A Sample Construction
16. Spaces of Constant Curvature
17. Spaces of Variable Curvature
18. General Relativity
19. Gravity Near a Massive Body
20. Einstein's Pathway to General Relativity
21. Relativistic Cosmology
22. Big Bang Cosmology
23. Black Holes
24. A Better Picture of Black Holes
25. Atoms and the Quanta
26. Origins of Quantum Theory
27. Quantum Theory of Waves and Particles
28. The Measurement Problem
29. Einstein on the Completeness of Quantum Theory
30. Einstein as the Greatest of the Nineteenth Century Physicists
Questions
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/Questions/index.html[28/04/2010 08:17:54 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
Questions
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Do astronauts age more slowly?
Can a finite universe have no edge?
Can time have a beginning?
Is time travel possible?
Does the moon change because a mouse looks at it?
Here are the questions that were asked in the description
in the course catalog... Answered.
Do astronauts age more slowly?
YES
According to Einstein's special theory of relativity, all processes
slow down when a system moves at high speed. The result
applies to astronauts since they are moving rapidly. The amount of
slowing is so slight as to be imperceptible for ordinary speeds. It
becomes very significant when we get close to the speed of light:
An astronaut is really just a
quick way of saying
"someone who travels away
from the earth at high speed
and returns."
From small effect to large effect:
Car at 100 miles per hour: lose 0.35 seconds in 1,000,000 years.
Rocket at earth's escape velocity (7 miles per second): lose 0.022 seconds in 1 year. (Astronaut is 0.022 seconds younger on returning after a one year trip.)
Rocket at 100,000 miles per second (53% speed of light): astronaut metabolism slows to 84% of normal.
Rocket at 185,800 miles per second (99% speed of light): astronaut metabolism slows to 4.5% of normal. (One year journey = aging 16 days.)
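The figures above can be checked with a short calculation. The sketch below is not part of the original text. It assumes the standard special-relativistic slowing factor sqrt(1 - v²/c²), the rounded value of 186,000 miles per second for the speed of light used in this chapter, and a year of 365.25 days. The function names are made up for illustration.

```python
import math

# Speed of light in miles per second, the rounded figure used in the text.
C_MILES_PER_SECOND = 186_000.0
# Assumption: one year is taken to be 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def slowing_factor(speed_miles_per_second):
    """Fraction of the normal rate at which a moving clock runs: sqrt(1 - v^2/c^2)."""
    beta = speed_miles_per_second / C_MILES_PER_SECOND
    return math.sqrt(1.0 - beta * beta)


def seconds_lost(speed_miles_per_second, earth_years):
    """Seconds by which the moving clock falls behind during the given earth time."""
    beta = speed_miles_per_second / C_MILES_PER_SECOND
    # 1 - sqrt(1 - beta^2), rewritten so that round-off does not swamp tiny speeds.
    fraction_lost = beta * beta / (1.0 + slowing_factor(speed_miles_per_second))
    return fraction_lost * earth_years * SECONDS_PER_YEAR


if __name__ == "__main__":
    print("Car, 100 mph, 1,000,000 years:", seconds_lost(100 / 3600, 1_000_000), "s lost")
    print("Rocket, 7 mi/s, 1 year:", seconds_lost(7.0, 1), "s lost")
    print("Rocket, 100,000 mi/s: rate factor", slowing_factor(100_000.0))
    print("Rocket, 185,800 mi/s: rate factor", slowing_factor(185_800.0))
```

Under these assumptions the results come out at roughly 0.35 seconds, 0.022 seconds, 84% and a little under 5%, close to the values quoted above. Small differences in the last figure depend on the exact value taken for the speed of light.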
How can special relativity know that these effects will
happen? They arise directly from the basic supposition
of the theory: all uniformly moving observers must
measure the same speed for light, 186,000 miles
per second.
At first this seems impossible. Say I send out a light
signal from earth. I measure its speed at 186,000 miles
per second.
What about another observer who chases after the light
signal at, say, half the speed of light? Shouldn't that
observer see the light signal slowed to half its speed? All
our common sense says yes. Special relativity says no.
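The contrast between the two answers can be put in numbers. The sketch below is an illustration added here, not part of the original text. It compares the common sense rule (subtract the speeds) with the standard relativistic rule for combining speeds along one line, (u - v) / (1 - uv/c²). That formula is not stated in this passage, and the function names are made up.

```python
import math

# Speed of light in miles per second, the rounded figure used in the text.
C = 186_000.0


def common_sense_speed(u, v):
    """Speed of a signal moving at u, as judged by a chaser moving at v, by simple subtraction."""
    return u - v


def relativistic_speed(u, v, c=C):
    """The same speed according to the relativistic rule for combining speeds."""
    return (u - v) / (1.0 - u * v / (c * c))


if __name__ == "__main__":
    chaser = C / 2  # observer chasing the light signal at half the speed of light
    print("Common sense:", common_sense_speed(C, chaser), "miles per second")
    print("Special relativity:", relativistic_speed(C, chaser), "miles per second")
```

For the light signal the relativistic rule returns the full speed of light whatever the chaser's speed, while for everyday speeds the two rules agree almost exactly.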
How can that be? Something in our common sense
assumptions must be wrong. There is not much room to
look for the mistake. We find the speed of the light
signal with just two instruments: a measuring rod to
determine how far the light signal goes; and a clock to
measure how long it takes to go that far. Classically we
assume that neither is affected by rapid motion. At least
one of these assumptions must be wrong if the speed of
light is to remain constant. When we work through the
details we find that both are: the rod shrinks in the
direction of motion and the clock slows.
So rapidly moving clocks slow. How does that lead to a
rapidly moving astronaut aging more slowly? An
astronaut's metabolism is a clock. You can use your
pulse to time things if you like. So that metabolism clock
must slow too. The legend is that Galileo used his pulse
to time the period of a slowly swinging lamp while not
attending to a cathedral mass and thereby arrived at the
famous result of the isochrony of the pendulum, which
just says that the period of a pendulum is fixed by its
length. His pulse was the simple clock used to time the
pendulum.
Can a finite universe have no
edge?
YES
What is this question asking?
It is asking whether we could have a universe with a
finite volume. That means if I ask "How many
cubic miles of space are there?" the answer is
not "infinity" but some definite number. It might be a
big number. Say 63 kazillion cubic miles. But it is
still a definite number, so that if you started to count
off the cubic miles in space, you would eventually
come to an end.
At the same time it is asking if this finite universe
could have no edge. An edge is just what you think.
It is a place you get to where you run out of
space.
Can both be possible at the same time? Can you run out
of space in the sense that you count off all the cubic
miles, but you never come to an edge?
Both can indeed happen in a more restricted way in a
very familiar example. Consider motions on the
surface of the earth. If you start in Pittsburgh, choose
any direction you like and keep moving straight ahead,
you will eventually come back to where you started.
There will be no edge for you to fall off. So the surface of
the earth has the sort of properties we are looking for. It
is finite in area. It is just 196,000,000 square miles. But it
has no edge.
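The area figure is easy to check. The sketch below assumes the earth is a perfect sphere with a mean radius of about 3,959 miles (a value not given in the text) and uses the usual formulas for a sphere; the names are illustrative only.

```python
import math

EARTH_RADIUS_MILES = 3_959.0  # assumed mean radius of the earth

def sphere_area(radius):
    """Surface area of a sphere: finite, yet the surface has no edge."""
    return 4.0 * math.pi * radius ** 2

def great_circle_length(radius):
    """Distance traveled before a 'straight ahead' trip returns to its start."""
    return 2.0 * math.pi * radius

area = sphere_area(EARTH_RADIUS_MILES)
trip = great_circle_length(EARTH_RADIUS_MILES)
print(f"{area:,.0f} square miles")   # a little under 200 million
print(f"{trip:,.0f} miles")          # roughly 25 thousand
```

Both numbers are finite, and yet nowhere on the trip does the traveler meet an edge.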
Of course the example seems strained. While we come
back to where we started, we are really not going in a
straight line, but in a big circle. While the two
dimensional surface of the earth is finite without edge, it
gets these properties because it is really curved into a
third dimension.
Does that fact really make such a difference to the
possibility of a surface of finite area but no edge? What
if we were flat beings trapped in the two
dimensional surface of the earth, unable to sense
the existence of this third dimension? All we would know about
the surface of the earth would be what can be read off our
two dimensional maps. Then all we would know is that
we lived in a finite two dimensional space with no edge.
That a third dimension might have something to do with
this would, to us, be speculation of little practical
importance. We would have no way of accessing this
third dimension.
Could the analogous thing happen for a three
dimensional space? One of the big discoveries of
19th century geometry was that this is entirely possible.
To get us started, imagine that there is a fourth
dimension of space into which our three dimensions
curve. Then we might end up with a three dimensional
space which has finite volume but no edge. No matter
which way you voyage in a spaceship, you will eventually
come back to where you started, without hitting an edge.
We satisfied ourselves that this is possible by imagining
a fourth dimension of space. How seriously should
we take this fourth dimension? Our two dimensional
surface dwellers could ignore the possibility of a third
dimension in doing their geometry. All that mattered to
them were the geometrical facts of the earth's surface
that they could measure. In the three dimensional case,
it is the same. All that matters are the geometrical facts
about our three dimensional space that are accessible to
us three dimensional beings. In the end, this fourth
dimension of space becomes a comfortable fable to help
us get used to the idea that a finite three dimensional
space without edge is entirely possible.
In the 19th century, this sort of space was an interesting
mathematical curiosity. In 1917, shortly after Einstein had
completed his general theory of relativity, he proposed
that our cosmic space was really like this. This was the
first relativistic cosmology. Whether space has this
structure remains one of the most interesting of the open
questions of modern cosmology. In Einstein's original
universe, space had a finite volume:
1,000,000,000,000,000,000,000,000,000,000 cubic
light years
That's a one followed by 30 zeros. But there is no edge.
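To get a feel for how big such a space is, we can ask how far a traveler would go before returning home. The sketch below assumes the space is the three dimensional analog of a sphere's surface (a "3-sphere"), whose volume is 2 x pi^2 x R^3 for curvature radius R. The radius and round trip distance are my own illustration and are not figures from the text.

```python
import math

VOLUME_CUBIC_LIGHT_YEARS = 1e30  # the volume quoted in the text

def three_sphere_radius(volume):
    """Radius R of a 3-sphere of the given volume, from V = 2 * pi^2 * R^3."""
    return (volume / (2.0 * math.pi ** 2)) ** (1.0 / 3.0)

radius = three_sphere_radius(VOLUME_CUBIC_LIGHT_YEARS)
round_trip = 2.0 * math.pi * radius  # length of a closed "straight" path

print(f"radius     ~ {radius:.2e} light years")      # a few billion
print(f"round trip ~ {round_trip:.2e} light years")  # a few tens of billions
```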
Can time have a
beginning?
YES
At first this seems impossible. If time has a beginning,
there must be a first event or at least a clustering of
events near it. Surely something must have happened
before them?
Einstein's cosmology of 1917 was the first of many ever
stranger cosmologies to be devised on the basis of his
general theory of relativity. Einstein's first universe was
static in time. The cosmologies that followed, starting in
the 1920s, were not. They portrayed space itself as
continually expanding. We can think of Einstein's
universe as a three dimensional analog of a two
dimensional spherical surface, somewhat like a balloon.
Then this expansion simply corresponds to the inflation
of the balloon.
Here is a picture of this
expansion. The universe
is represented by a
sphere and time
advances up the screen.
So the small universe of
long ago grows up the
page to the universe of
the present.
Now imagine this
expansion in reverse.
As we look further and
further back in time, the
balloon gets smaller and
smaller. In the typical
cosmologies considered
nowadays, not too long
into the past the balloon
would have shrivelled to
nothing. At that point in
our story, space would
have ceased to be.
One might try to imagine
times before that
moment. But it would be
futile, since there is no
space associated with
the time. Indeed there is
something highly
suspect about the
moment at which the
balloon shrivels to a
point. Then the curvature
of the space becomes
infinite and the basic
equations of Einstein's
theory break down.
This first moment is not
really a moment in time
at all. It really amounts to
a lower bound on our
projections into the past.
If we think of it as
"time=0," then only
moments with a time
coordinate greater than 0
have physical meaning.
It is the beginning of time
and is otherwise known
as the "big bang."
To see why it is the "big bang", let us return our
imagination to the forward direction and imagine what
happens around the beginning of the expansion. Take
any moment you like, as close as you like to the big
bang. By choosing that moment closer and closer to the
big bang, you can make space shrivel up as close to a
point in size as you like. From that moment, everything,
space and all its matter, explodes outwards. All
this happened not so long ago. It was roughly 14 billion
years ago.
Is time travel possible?
YES
The "yes" is intriguing, but there is a catch. The question
did not ask if there really is time travel; it asked
only if it is possible. Something can be possible without
actually happening. It is possible for our earth to have
two moons. In fact it has only one.
While we have no evidence that time
travel actually occurs, all our latest
work in theories of space and time tells
us that it is entirely possible.
Broadly speaking, there are two
senses of time travel, both possible.
1. The first sense is the H. G. Wells sense. This one is
named after the author of the most famous story about time travel
in which a voyager hops into a machine and travels about in time.
Special relativity has room for something close. If we had things
that traveled faster than light, then, for some observers, they would
travel backwards in time. These faster than light objects are
"tachyons." For some observers, they would leave today and arrive
yesterday.
The effects that bring this
about are closely related to
those that lead to the slower
aging of rapidly moving
astronauts. That effect
depended on rapidly moving
clocks not behaving as we
expected. The time travel
effect arises from anomalies
in how observers in rapid
motion set their clocks at
different places in space.
Of course how we could get ourselves to travel faster than light is an
unsolved problem! We cannot accelerate through the speed of light. But is
there some way to recreate ourselves traveling faster than light? If so,
some observers would judge us to be traveling backwards in time.
"There was a young lady named Bright,
Whose speed was far faster than light.
She set out one day
In a relative way,
And returned home the previous night."
Arthur Henry Reginald Buller.
2. The second sense is more topological and has been
called "Goedelian" (by John Earman) in honor of the
great logician Kurt Goedel, who was a friend of
Einstein's and did pioneering work on spacetimes that
admit time travel.
We can imagine space and time as forming
a huge sheet of paper. The vertical line is
the complete history through time of a
person, experiencing the years ..., 1980,
1981, ... etc.
What Einstein did in 1917 was to get us to
wrap up the sheet of paper in the spatial
direction so travel in the direction "left" is
wrapped around to meet travel in the
direction "right". That way we always end
up where we started.
What Einstein's theory also allows is that
travel into the future of time can be
wrapped around to connect with the
past, so that if we persist long enough in
time we end up back at the present.
This is less a type of time travel that we create with a
machine. The best advice to someone who wants to
travel in time this way is that they should be sure to be
born into the right universe! However there are special
circumstances that might bring it about. It might happen
near black holes generated by gravitational collapse. It
also may happen if we get very dense, very rapidly
rotating matter.
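For the first, faster than light sense of time travel described above, a short calculation shows how "leave today, arrive yesterday" can come about. The sketch assumes the standard Lorentz transformation for the time between two events, t' = (t - vx/c^2)/sqrt(1 - v^2/c^2), in units where c = 1. The tachyon's speed of twice that of light is a made-up value, and the function name is illustrative.

```python
import math

def elapsed_time_for_observer(dt, dx, v):
    """Time between departure and arrival as judged by an observer moving at v.

    dt and dx are the time and distance between the two events for us;
    all speeds are fractions of the speed of light (c = 1).
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (dt - v * dx)

# Hypothetical tachyon: travels at twice light speed for one unit of our time.
tachyon_speed = 2.0
dt = 1.0
dx = tachyon_speed * dt

print(elapsed_time_for_observer(dt, dx, 0.0))  # us: arrives after it leaves
print(elapsed_time_for_observer(dt, dx, 0.4))  # slower observer: still positive
print(elapsed_time_for_observer(dt, dx, 0.8))  # fast observer: negative
```

A negative result means that, for that observer, the tachyon arrives before it departs. Nothing of the sort can happen for an object moving slower than light, since then v times dx is always less than dt.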
Does the moon change
because a mouse looks at it?
YES
This "yes" depends upon quantum mechanics, in whose
founding Einstein played a major role. It is our best
theory of matter and is usually applied to deal with
matter in the very small, that is, little particles like
electrons. It tells us that matter in the very small has
properties quite unlike the ones we are used to with
ordinary objects.
We are used to the idea that ordinary objects are either
particles or waves. It turns out that in the small, particles
are both particles and waves. They have a dual
character that is quite perplexing when you first learn of
it and, as far as I can tell, that perplexity never really
goes away, even if you know a lot about them.
Take electrons, for example.
They are familiar to us from
old-fashioned television
tubes. The electrons are
fired from a glowing element
at the back of the tube. They
are formed into a beam by
deflecting magnetic fields.
When the electron is in flight
in the beam, it behaves
just like a wave. It is
spread out in space, has a
wavelength and frequency
and can produce all sorts of
wavelike phenomena, like
interference patterns. These
are just like the rippled
patterns that water waves
make on the surface of a
pond when pebbles are
dropped in. We can only get
them because the waves are
spread out in space.
When these electrons strike
the screen of the TV tube,
they behave very differently.
According to the standard
text book accounts of
quantum mechanics, they
instantly cease to be a wave.
They collapse to a point, so
they are now behaving like
a particle. We see that
localization through the
emitting of a brief flash of
light from just one point on
the screen. (Many of those
flashes combine to make the
images we watch.)
So sometimes an electron behaves like a wave; and
sometimes like a particle. So what? The odd part is what
decides whether the electron behaves like a wave or a
particle. In the standard text book treatments, we
decide by the act of observing the electron. An
electron left to itself behaves like a wave. The moment
we observe it (for example, by having it smash into the
screen of a TV tube so that we can see where it is from
the flash of light produced), it behaves like a
particle.
That is the odd part. Standard, text book quantum
mechanics tells us that the act of our observing the
electron has caused it to collapse to a point. This
astonishing idea troubled Einstein very greatly and
he could never accept it. What difference does it make to
the electron if we observe it or not?
What Einstein also saw was that the difficulty could not
be confined to minute objects like electrons. If individual
particles have this dual wave-particle character, then so do
collections of particles. Our observing of them will also
cause them to collapse. Big objects like steam
locomotives, moons and planets are just many, many
particles all in one place. They will also have a slight
wave character, too small for us to notice, but there
nonetheless. And when we observe them, they will
collapse!
His collaborator and biographer Abraham Pais reports
"...during one walk, Einstein suddenly stopped, turned to
me, and asked whether I really believed that the moon
exists only when I look at it."
The famous physicist (and inventor of the name "black
hole") John Wheeler also reported of Einstein
"...No one can forget how he expressed his discomfort
about the role of the observer, 'When a mouse
observes, does that change the state of the universe?'"
The question above is a combination of these two
remarks and the answer of yes is just standard text book
physics.
Copyright John D. Norton. February, 2002; July 2006; January 3, 2007.
Special Relativity Basics
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/Special_relativity_basics/index.html[28/04/2010 08:18:00 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
Special Theory of Relativity: The Basics
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Inertial and Accelerated Motion
Absolute versus Relative Motion
I. The Principle of Relativity
II. The Light Postulate
A Light Clock
Light Clocks are Slowed by Motion
All Moving Clocks Are Slowed by Motion
Moving Rods Shrink in the Direction of Their Motion
What you need to know:
Background reading: J. Schwartz and M. McGuinness, Einstein
for Beginners. New York: Pantheon. pp. 66-151.
"On the Electrodynamics of Moving Bodies"
In June 1905, when Albert Einstein was still a patent examiner
in Bern, Switzerland, he sent a paper with this title to the
journal Annalen der Physik. It contained his special theory of
relativity. He argued that altering our understanding of the
behavior of space and time could resolve certain problems in
electrodynamics. (See page one in German or English.)
To understand what these alterations were, we need some
preliminary notions.
Inertial and Accelerated Motion
There is a preferred motion in space
known as inertial motion. Any body left to
itself in space will default to an inertial
motion, which is just motion at uniform
speed in a straight line. The easiest
example to visualize is a huge spaceship
with the engines turned off, gliding through
space. At any point in space, many inertial
motions are possible. They will be pointed
in different directions and will be at
different speeds.
Any other motion is accelerated. This
includes motion at uniform speed in a
circle. While the speed stays the same, the
direction does not. So the motion is
accelerated.
Sometimes we will talk of an "inertial
observer," which is just an observer
moving inertially.
Such an observer might set up an
elaborate system of measuring rods and
other physical devices to fix the positions of
events; and an elaborate system of clocks
to fix their timing. Such a system is an
inertial frame of reference.
Absolute versus
Relative Motion
Relative motion arises when one body moves with
respect to another. For example, our spaceship might
move relative to a nearby planet.
Correspondingly the planet moves relative to the
spaceship.
Prior to Einstein, it was generally thought that there was
another sense of motion, absolute motion. According to this
sense, there is a fact of the matter as to whether the spaceship
is moving, without regard to whether it moves relative to
another object, such as a planet. There is an absolute state of
rest in space, according to this earlier view. Either the
spaceship is in this state and at rest; or it is not and it is
moving.
Einstein found it most convenient to base his theory of relativity
on two postulates; once they were assumed it became an
exercise in logic to develop the whole theory. The two
postulates are
I. The Principle of Relativity and
II. The Light Postulate.
I. The Principle of Relativity
All inertial observers find the same laws of
physics.
What this says is just this: imagine two spaceships, each
moving inertially in space but with different velocities. If we
conduct experiments on either ship aimed at determining a law
of physics, we will end up with the same law no matter which
spaceship we are on.
Or, more simply, the laws of physics simply tell us which
physical processes can happen and which cannot. So if all inertial
observers find the same laws, that just means that any process
that can happen for one inertial observer can happen for any
other.
Here are some important consequences of the principle:
No experiment aimed at detecting a law of nature can
reveal the inertial motion of the observer.
Absolute velocity has no place in any law of nature.
No experiment can reveal absolute motion.
Notice that the principle of relativity is limited to inertial
motions. In special relativity, this relativity of motion does
not extend to accelerated motion. If something
accelerates, then it does so absolutely; there is no need
to say that it "accelerates with respect to..." A traditional
indicator of acceleration is inertial forces. If you are in an
airplane that flies uniformly in a straight line, you have no
sense of motion. If the airplane hits turbulence and
accelerates, you sense immediately the acceleration as
inertial forces throw things around in the cabin.
II. The Light Postulate
All inertial observers find the same speed for
light.
That speed is 186,000 miles per second or 300,000
kilometers per second. Because this speed crops up so often in
relativity theory, it is represented by the letter "c".
That Einstein should believe the principle of relativity should not
come as such a surprise. We are moving rapidly on planet
earth through space. But our motion is virtually invisible to us,
as the principle of relativity requires.
Why Einstein should believe the light postulate is a little harder
to see. We would expect that a light signal would slow
down relative to us if we chased after it. The light postulate
says no. No matter how fast an inertial observer is traveling in
pursuit of the light signal, that observer will always see the light
signal traveling at the same speed, c.
The principal reason for his acceptance of the light
postulate was his lengthy study of electrodynamics, the theory
of electric and magnetic fields. The theory was the most
advanced physics of the time. Some 50 years before, Maxwell
had shown that light was merely a ripple propagating in an
electromagnetic field. Maxwell's theory predicted that the speed
of the ripple was a quite definite number: c.
The speed of a light signal was quite unlike the speed of a
pebble, say. The pebble could move at any speed, depending
on how hard it was thrown. It was different with light in
Maxwell's theory. No matter how the light signal was made and
projected, its speed always came out the same.
The principle of relativity assured Einstein that the laws of
nature were the same for all inertial observers. That light
always propagated at the same speed was a law within
Maxwell's theory. If the principle of relativity was applied to it,
the light postulate resulted immediately.
A Light Clock
One cannot have both of Einstein's postulates and leave
everything else unchanged. We can only retain both without
contradiction if we make systematic changes throughout
our physics. Let us begin investigating these changes, which
include our basic, classical presumptions about space and
time. One of them is that a moving clock runs
slower.
To see how this comes about, we could undertake a detailed analysis of a real
clock, like a wristwatch or a pendulum clock. That would be difficult and
complicated, and unnecessarily so. All we need is to demonstrate the effect for
just one clock and that will be enough, as we shall see shortly, to give it to us for
all clocks. So let us pick the simplest design of clock imaginable, one
specifically chosen to make our analysis easy.
A light clock is an idealized clock that consists of a rod of length 186,000 miles
with a mirror at each end. A light signal is reflected back and forth between the
mirrors. Each arrival of the light signal at a mirror is a "tick" of the clock. Since
light moves at 186,000 miles per second, it ticks once per second.
Light Clocks are Slowed by
Motion
To see the effect of motion on this light clock, imagine that it
has been set into rapid motion. To begin, we will assume that
the motion is perpendicular to the rod and that it is very
fast: 99.5% the speed of light. (We'll write this compactly as
"0.995c.") An observer traveling with the clock will still see the
light signal bounce backwards and forwards between the
mirrors as before. Let us view this process from the
perspective of an observer who stays behind and does not
move with the clock.
That observer sees a light signal leave one end of the rod and
arrive at the other end. But that end is now rushing away from
the light signal at 99.5% the speed of light. A quick calculation
shows that the signal will now take 10 seconds to
reach the other end of the rod.
To see this, note that in ten seconds the rod will move
1,850,700 miles, as shown in the figure above. So to get to the
end of the rod, the light signal must traverse the diagonal
path shown. A little geometry tells us that a right angle triangle
with sides 186,000 miles and 1,850,700 miles will have a
diagonal of 1,860,000 miles.
Pythagoras' theorem tells us the diagonal is
1,860,000 miles since
(1,860,000 miles)^2 = (1,850,700 miles)^2 + (186,000 miles)^2
Since light moves at 186,000 miles per second, it will need ten
seconds to traverse the diagonal.
Setting the arithmetic aside, the result is simple. Since the
light signal must travel so much farther to traverse the rod of a
moving clock, it takes much longer to do it. So a moving light
clock ticks slower. In this case, for a clock moving at 99.5% the
speed of light, it ticks once each ten seconds instead of once
each second.
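The ten second figure is rounded. If the tick takes time t, the diagonal has length c x t, the rod has moved v x t, and Pythagoras' theorem gives (c t)^2 = (rod length)^2 + (v t)^2, so t = rod length / sqrt(c^2 - v^2). The sketch below just evaluates that; the names are illustrative and c is taken to be 186,000 miles per second as in the text.

```python
import math

C = 186_000.0           # miles per second
ROD_LENGTH = 186_000.0  # miles, so the clock at rest ticks once per second

def tick_time(v):
    """Seconds per tick for a light clock moving at v perpendicular to its rod."""
    return ROD_LENGTH / math.sqrt(C ** 2 - v ** 2)

v = 0.995 * C
t = tick_time(v)
diagonal = math.hypot(ROD_LENGTH, v * t)  # path the light actually travels

print(tick_time(0.0))   # clock at rest: one second per tick
print(t)                # a little over ten seconds
print(diagonal / C)     # light needs the same time to cover the diagonal
```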
All Moving Clocks Are
Slowed by Motion
A simple application of the principle of relativity shows that all
clocks must be slowed by motion, not just light clocks. We set a clock
of any construction next to a light clock at rest in an inertial laboratory.
We notice that they both tick at the same rate.
That must remain true when we set the laboratory into a different
state of inertial motion.
But since the light clock has slowed with the motion, the other clock
must also slow if it is to keep ticking at the same rate as the light
clock.
You might be tempted to say that the other clock would not
keep pace with the light clock. But then you would have
devised a device that detects absolute motion, in
contradiction with the principle of relativity. That device would pick
out absolute rest as the only state in which the two clocks run
at the same rate.
Moving Rods Shrink in the
Direction of Their Motion
So far, we have considered a light clock whose rod is
perpendicular to the direction of its motion. If we now consider
a light clock whose rod is oriented parallel to the direction
of motion, we will end up concluding that its rod must shrink in
the direction of its motion. To get to this result, we need two
steps:
First Step: Light clocks oriented perpendicular to one another
run at the same speed.
Take the light clock considered above. Imagine a second, identical light clock with
its rod oriented parallel to the direction of the motion. Once again the principle of
relativity requires that both clocks run at the same speed. We could just
leave it at that, as an application of the earlier result. However it is reassuring to go
through it from scratch.
To begin, we don't need the principle of relativity to see that the clocks at rest run
at the same rate. They will run at the same rate simply because they are the same
clocks oriented in different directions. That just follows from the isotropy of
space. All its directions are equivalent. So the orientation of the clock cannot
affect its speed.
Now imagine that we take the entire system of the two clocks
and set it into rapid motion at, say, 99.5% the speed of
light, in the direction of one of the light clocks.
An observer moving with the two light clocks must see them
continue to run at the same rate. We now do need the
principle of relativity to establish this. Our earlier symmetry
argument doesn't work anymore, since the two directions of the
clocks are intrinsically different. One is perpendicular to the
direction of motion; the other is parallel to it. The principle of
relativity requires that they run at the same rate. For, if they ran
at different rates, the device would be an experiment that could
detect absolute motion.
We could detect absolute motion just by
taking two light clocks perpendicular to each
other and checking if they run at the same
rate. Only when we are at rest would they run at
the same rate. If they do not run at the same
rate we would know we are moving absolutely.
The principle of relativity prohibits an
experiment that can do this. So the two clocks
must run at the same rate.
Second Step: The rod oriented in the direction of motion must
shrink.
We know from the earlier analysis that a light clock (indeed any
clock) moving at 99.5% the speed of light is slowed so that it
ticks only once in ten seconds. So now we know that the
light clock oriented parallel to the direction of motion must tick
once each ten seconds. But that cannot happen if everything is
just as we describe it. Imagine the outward bound journey of
the light signal.
The light signal has to go from one end to the other of a
186,000 mile rod. The light moves at 186,000 miles per
second. But the rod is also moving in the same direction at
99.5% the speed of light. So the light has to chase after a
rapidly fleeing end and will need much more than a second
to catch it. With a little arithmetic it turns out that the light
will need 200 seconds to make the trip.
How do I get this? If you have to know, here are the
details. The light signal chases at 100% c after the
leading end of the rod. That end is initially 186,000
miles away and moving at 99.5% c. So the light signal
approaches the end of the rod at 0.5% c, which is
930 miles per second. The distance to cover is
186,000 miles, so it takes 186,000/930 = 200
seconds.
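The 200 second figure is just a chase calculation: a head start divided by the speed at which the gap closes. A minimal sketch, with illustrative names and the rod assumed (wrongly, as it turns out) to keep its full length:

```python
C = 186_000.0           # miles per second
ROD_LENGTH = 186_000.0  # miles, assumed unshrunk

def chase_time(head_start, v):
    """Seconds for light to catch a target that starts head_start miles ahead
    and flees at v miles per second."""
    closing_speed = C - v  # how fast the gap shrinks, as judged by us
    return head_start / closing_speed

print(chase_time(ROD_LENGTH, 0.995 * C))  # about 200 seconds
```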
But the light clock has to tick once every
ten seconds! Something has gone badly
wrong. What has gone wrong is our
assumption that the rod parallel to the
direction of motion retains its length.
That is incorrect. That rod actually
shrinks to 10% of its original length, so
the moving pair of clocks really looks
more like:
Now the light signal has time to get from one end of the
rod to the other and keep the clock ticking at once each ten
seconds as expected. The signal just has far less distance to
travel so now it can maintain the rate of ticking expected.
There are more details in this last calculation that I don't want to bother you
with. But since some of you will ask, here they are, but only for those who really
want them.
Overall it will turn out that the light signal now needs 20 seconds to complete the
journey from the trailing end of the rod to the front and then back. That is what we
expect. The round trip journey is "two ticks" and should take 2x10=20 seconds. The
catch is that virtually all of the 20 seconds will be spent in the forward trip and
virtually none of it in the rearward trip. This effect actually figures in the relativity of
simultaneity which we will discuss at some length later.
If you want to see this for yourself you should redo the calculations. If you do,
you'll need to undo my rounding off. The rod is not contracted to exactly 10%; I
rounded things off to keep life simple. It is 9.987%. The ticks are not exactly 10
seconds apart, but 10.0125 seconds. The forward trip will take 19.9750 seconds.
The rearward trip will take 0.05 seconds. That gives a total round trip of 20.025
seconds = 2x10.0125 as expected.
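For those who would rather let a computer redo the unrounded calculation, here is a sketch. It assumes the rod shrinks by the same factor, sqrt(1 - v^2/c^2), by which the clock slows; the light then chases the leading end at closing speed c - v and meets the trailing end at closing speed c + v. The variable names are illustrative.

```python
import math

C = 186_000.0            # miles per second
REST_LENGTH = 186_000.0  # miles
V = 0.995 * C            # speed of the clock along its own rod

shrink = math.sqrt(1.0 - (V / C) ** 2)  # fraction of rest length remaining
moving_length = REST_LENGTH * shrink

forward = moving_length / (C - V)   # light chases the fleeing front end
rearward = moving_length / (C + V)  # rear end rushes to meet the light
round_trip = forward + rearward

tick = 1.0 / shrink                 # seconds per tick of the slowed clock

print(f"rod shrinks to {100 * shrink:.3f}% of its rest length")
print(f"forward trip  {forward:.4f} s")
print(f"rearward trip {rearward:.4f} s")
print(f"round trip    {round_trip:.4f} s, two ticks = {2 * tick:.4f} s")
```

The round trip agrees with two ticks of the slowed clock, which is just what the principle of relativity demands.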
The analysis is now complete. We have learned that a clock
moving at 99.5% the speed of light slows by a factor of ten. It
ticks once each ten seconds instead of once each second. A
rod, oriented in the direction of motion, shrinks to 10% of its
length. Rods perpendicular to the direction of motion are
unaffected.
The two effects are not noticeable as long as our speeds are
far from that of light. They become marked when we get close
to the speed of light. The closer we get to the speed of
light, the closer clocks come to stopping completely and rods
come to shrinking to no length in the direction of motion. For
more details of how the effects depend on speed, see What
Happens at High Speeds.
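To get a feel for how the effects depend on speed, here is a small Python sketch. It assumes the standard factor 1/sqrt(1 - v²/c²), which this chapter uses only through its value of ten at 99.5% the speed of light; the function name and the sample speeds are chosen here for illustration.

```python
import math

def slowing_factor(fraction_of_c):
    """Factor by which a moving clock slows and a rod along the motion shrinks."""
    return 1.0 / math.sqrt(1.0 - fraction_of_c ** 2)

# Sample speeds, expressed as fractions of the speed of light.
for fraction in (0.0001, 0.1, 0.5, 0.866, 0.995, 0.9999):
    factor = slowing_factor(fraction)
    print(f"{fraction:>7} of c: clocks slow by {factor:9.4f}, "
          f"rods shrink to {100 / factor:8.4f}% of rest length")
```

At everyday speeds the factor is indistinguishable from 1; it only grows large as the speed approaches that of light.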
What you need to know:
Inertial and accelerated motion.
Absolute versus relative motion.
Einstein's two postulates and how to apply them.
What a light clock is and how it is affected by motion.
Moving rods are shrunk in the direction of their motion.
Copyright John D. Norton. January 2001, August 30, 2002, July 20, 2006; January 8 2007, January 3, August 21, 27, 2008.
Special Relativity Adding Velocities
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/Special_relativity_adding/index.html[28/04/2010 08:18:06 ﺹ]
HPS 0410 Einstein for Everyone
Special Theory of Relativity: Adding
Velocities
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Nothing Can Be Accelerated Through the Speed of Light
Setting up the Challenge
Prohibited by the Principle of Relativity
Adding Velocities Einstein's Way
Light?
What you need to know
Nothing Can Be Accelerated
Through the Speed of Light
The speed of light clearly has a special place in this theory. If
something is traveling at the speed of light c, then all observers
will find it to be traveling at exactly the same speed.
A similar thing happens to things traveling at less than the
speed of light. If one observer finds an object to be traveling at
less than light, say, then so must every other. There is no way
that observers can change their states of motion so as to find the
object traveling at faster than the speed of light. And there is a
similar result for objects traveling at faster than the speed of light
if such things exist. If one observer finds them traveling at faster
than the speed of light, then so must all.
One of light's most important roles as a limiting velocity follows
from this: no matter how hard we try, it is impossible to
accelerate something through the speed of light. More
generally, the speeds of things are divided into three groups:
things that travel slower than light,
things that travel at exactly the speed of light,
and things that travel faster than the speed of light.
We cannot slow down or speed up anything so that it crosses
the barrier of the speed of light.
Yet it looks like it would be pretty easy to violate the
limiting character of the speed of light by accelerating something
through the speed of light. We might have a gun that can fire
particles at, say, 2,000 miles per second. That is well below the
speed of light. We put the gun on a spaceship that we accelerate
up to 185,000 miles per second, a mere 1,000 miles per second
short of the speed of light. If we fire the gun in the direction of
motion, would it not accelerate the particle through the speed of
light?
The limiting character of the speed of light is sufficiently striking
for it to be worth seeing how it follows from the principle of
relativity.
Setting up the Challenge
To see it, let us set up the challenge quite solidly. Imagine a
machine that can fire particles at 100,000 miles per second, which is
more than half the speed of light, 186,000 miles per second.
Now we will try to push things
past the speed of light. Imagine
that the machine is placed on a
spaceship that also moves at
100,000 miles per second in the
direction that the machine fires the
particles; that is, it moves at this
speed with respect to a second
observer on the earth.
So, let us ask the obvious question. What will the earthbound observer find for the speed of the particle?
The calculation seems irresistible. The spaceship moves at 100,000 miles per second with respect to the earthbound observer; and the particle moves at 100,000 miles per second with respect to the spaceship. So...
100,000 + 100,000 = 200,000 ??
But that would be faster than the speed of light, 186,000
miles per second.
Prohibited by the
Principle of Relativity
To see that the principle of relativity prohibits
this faster-than-light outcome, imagine that
a light signal passes the particle-emitting
machine at the moment that the particle is
emitted. The observer moving with the
machine would (obviously) judge that the light
signal overtakes the particle.
Now imagine this same process viewed by the earthbound observer. That observer must also see the light signal overtake the particle.
It is just the one experiment, so both observers must judge the same outcome.
What else would you expect? Might it be that the light signal
would overtake a particle emitted by the machine when the
machine is on earth, but that when the machine emits on a rapidly
moving spaceship, the particle overtakes the light?
That is exactly what the principle of
relativity prohibits! For then we have an
experiment that can detect absolute motion. The
resting machine emits particles that don't
overtake light; the rapidly moving machine emits
particles that do overtake light. The principle of
relativity demands that the experiment must
proceed in the same way when carried out on
earth or a rapidly moving spaceship.
(For experts) Those who have read ahead might worry that each
observer might find a different outcome, perhaps as an artefact of the
relativity of simultaneity (below). That won't happen. Whether light
overtakes the particle or not can be reduced to local facts independent of
judgments of simultaneity. Imagine that the light signal and the particle
are to traverse the same interval in space AB. Both depart A at the same
moment, judged locally. If light outstrips the particle, it will arrive at B
before the particle. That earlier arrival is once again a local fact that
obtains just at point B.
Adding Velocities Einstein's Way
What this shows is that the principle of relativity prohibits us
adding velocities in the usual way. We cannot add velocities by
the ordinary rule 100,000 + 100,000 = 200,000. More generally,
the classical rule for the composition of velocities fails:
(Velocity of A with respect to C) = (Velocity of A with respect to B) + (Velocity of B with respect to C)
In its place we need a new rule for the composition of velocities.
It ought to look like the ordinary rule as long as velocities are
small; we do know that the ordinary rule works for slow moving
things like cars on freeways and trains. But it must look very
different at high speeds. If we use it to add two velocities close to
light, we must get a resultant that is still less than the velocity of
light. Einstein found that the principle of relativity forces a
particular rule. For the case of velocities oriented in the same
direction in space, the relativistic rule for composition of
velocities is:
(Velocity of A with respect to C) = [(Velocity of A with respect to B) + (Velocity of B with respect to C)] / (reduction factor)
All the work is done in this new rule by the reduction factor. When the velocities are small,
this factor is close to 1. So it is as if it isn't really there and Einstein's rule just behaves like the
classical rule. But when the velocities get to be close to that of light, the factor starts to get
larger and larger and in just the right way to prevent any composition of velocities less than
light exceeding that of light.
(For experts only) Click here to see the complete formula.
If we use the rule to add 100 mph to 100 mph, the reduction
factor is almost exactly one, so the ordinary rule works: 100 +
100 = 200.
If we use the rule for adding 100,000 miles per second to
100,000 miles per second, we are now dealing with velocities that
are 100,000/186,000 = 0.54 the speed of light. For that sum, the
reduction factor is 1.29, so the composition yields:
(100,000 + 100,000)/1.29 = 200,000/1.29 = 155,000
which is still less than the speed of light.
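The complete formula is not reproduced on this page. The standard special-relativistic form of the reduction factor, for two velocities u and v in the same direction, is 1 + (u x v)/c². Here is a minimal Python sketch under that assumption, with the speed of light taken as exactly 186,000 miles per second as in the text. The function names are chosen here just for illustration.

```python
C = 186_000.0  # speed of light in miles per second, rounded as in the text

def reduction_factor(u, v, c=C):
    """Standard relativistic factor 1 + u*v/c**2 for velocities in one direction."""
    return 1.0 + (u * v) / c ** 2

def compose(u, v, c=C):
    """Einstein's rule: the ordinary sum divided by the reduction factor."""
    return (u + v) / reduction_factor(u, v, c)

# 100 mph is about 0.028 miles per second: the factor is almost exactly one.
slow = 100.0 / 3600.0
print(reduction_factor(slow, slow))

# Two velocities of 100,000 miles per second.
print(round(reduction_factor(100_000.0, 100_000.0), 2))
print(round(compose(100_000.0, 100_000.0)))
```

The second and third printed values should match the 1.29 and the roughly 155,000 miles per second quoted above. Composing any velocity with that of light itself returns the speed of light again, which is the light postulate at work.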
What is most instructive is to see what happens if we start with a
velocity of 100,000 miles per second; and add 100,000 miles per second to
it; and add it again; and again; and again.
To picture physically what we are doing, imagine that we start
with our base machine "I" that happens already to be moving at
100,000 miles per second. From it we shoot out a second
smaller version of the same machine, call it "II," at 100,000
miles per second with respect to "I."
Now let's repeat the operation. From the smaller machine "II,"
we'll shoot out a yet smaller version of the same machine at
100,000 miles per second with respect to "II." Call it "III."
Then machine "III" will shoot out machine "IV"; and so on; and so
on. As we pass through the series of machines "I," "II," "III," "IV,"
etc., we are boosting each with a speed of 100,000 miles per
second with respect to the one before.
The cumulative effect of the repeated boosting by 100,000
miles per second is shown below. The total speed of the last
boosted machine increases as we proceed along the sequence
"I," "II," etc. But the increases become smaller and smaller.
No matter how often we add 100,000 miles per second, we never
get past the speed of light, here set at exactly 186,000 miles per
second. We get closer and closer to it. But never past it.
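The sequence of boosts can be sketched in a few lines of Python. This assumes the standard relativistic composition formula (the page itself gives only the reduction factor in words) and the text's value of 186,000 miles per second for the speed of light. The names `compose` and `boosted_speeds` are illustrative.

```python
C = 186_000.0      # speed of light in miles per second, as set in the text
BOOST = 100_000.0  # each machine is fired at this speed relative to the last

def compose(u, v, c=C):
    """Relativistic composition of velocities in one direction (standard formula)."""
    return (u + v) / (1.0 + (u * v) / c ** 2)

def boosted_speeds(count, boost=BOOST, c=C):
    """Speeds of machines I, II, III, ... relative to the original observer."""
    speeds = [boost]  # machine "I" already moves at the boost speed
    for _ in range(count - 1):
        speeds.append(compose(speeds[-1], boost, c))
    return speeds

for number, speed in enumerate(boosted_speeds(8), start=1):
    print(f"machine {number}: {speed:11.1f} miles per second "
          f"({C - speed:9.1f} short of light)")
```

Each line of output shows a larger total speed than the one before, but the shortfall from the speed of light never reaches zero.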
One way to think of it is as an "Einstein tax," that copies the
way a very severe progressive taxation might increase the
amount of tax paid as we get more income. We keep adding
100,000 miles per second to the speed, but the Einstein tax
implemented through the reduction factor precludes our total
speed ever exceeding that of light.
That the ordinary addition rule fails follows from the principle of
relativity. Why should the ordinary rule fail? Here's a way to
get comfortable with the failure. In the original example, the
spaceship observer uses rods and clocks that move with the
spaceship to measure the speed of the emitted particle as
100,000 miles per second. The earthbound observer now wants
to find the speed of the emitted particle. That observer, however,
cannot directly use measurements made with the spaceship rods
and clocks, for the earthbound observer thinks that they have
shrunk and slowed. The earthbound observer must correct the
spaceship observer's measurements for effects such as these.
The result of these corrections is Einstein's formula!
Light?
This special role for the speed of light sometimes arouses
special wonder. What is so special about light, we may be
drawn to ask, that everything else takes such special note of it?
Once one starts along this path, all sorts of confusions may
arise. Is it that light is used for communication and finding things
out? Does everything somehow respond to how we find things
out? Does special relativity still work in the dark?
Well, you can forget all this mystical mumbo-jumbo, if ever it
attracted you. There is nothing special about light. It's space and
time that are special. They have properties we don't expect. Space
and time are such that rapidly moving objects shrink and their
processes slow down. For a long time, we didn't notice these
effects because we did not have a thorough account of a probe
of space and time that moves very fast. That changed in the
nineteenth century when we developed good theories of light. It
is the probe that moves very fast and, for the first time, begins to
reveal to us that space and time are not quite what we thought.
There is one further fact about space and time. It harbors a
special velocity, one that is the same for all inertial observers. It
is an invariant (="unchanging") velocity. Light is just something
that happens to go as fast as it possibly can and thereby ends up
going at that speed.
There's nothing special about light. What is special is the speed
at which it goes.
What you need to know:
Nothing can be accelerated through the speed of light.
Adding velocities Einstein's way.
Copyright John D. Norton. January 2001, August 30, 2002, July 20, 2006; January 8 2007, January 3, August 21, 27, 2008, January 13, 2010.
Special Relativity Basics
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/Special_relativity_rel_sim/index.html[28/04/2010 08:18:16 ﺹ]
HPS 0410 Einstein for Everyone
Special Theory of Relativity
Relativity of Simultaneity
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Using Light Signals to Judge the Time Order of Events
What the Relativity of Simultaneity is NOT
What you need to know:
When Einstein first hit upon special relativity, he thought one effect
of special importance, so much so that it fills the first section of his
"On the Electrodynamics of Moving Bodies." It is the relativity of
simultaneity. According to it, inertial observers in relative
motion disagree on the timing of events at different places.
If one observer thinks that two events are simultaneous, another
might not. At first this will seem like just another of the many novel
effects relativity brings. However, as we explore more deeply, you
will see that this is the central adjustment Einstein made to our
understanding of space and time in special relativity. Once you
grasp it, everything else makes sense. (And until you do, nothing
quite makes sense!)
Using Light Signals to Judge
the Time Order of Events
There is a quick way to see how this comes about. Imagine a long
platform with an observer located at its midpoint. At either end, at the
places marked A and B, there are two momentary flashes of light.
The light propagates from these events to the observer. Let us
imagine that they arrive at the same moment, as they do in the
animation below. Noticing that they arrive at the same moment and
that they come from places equal distances away, the observer will
decide that the two events happened simultaneously.
Another outcome is closely related. Imagine also that there are
clocks located at A and B. If both clocks show the same reading at
the events of the two flashes, then we would judge the two clocks
to be properly synchronized. That is what the platform observer
judges since, as the animation shows, both clocks read "0" when the
flashes occur at each location.
Here's a version that isn't animated.
So far, nothing remarkable has happened. That is about to change.
Now consider this process from the point of view of an observer
who moves relative to the platform along its length. For that new
observer, the platform moves rapidly and, in the animation, in the
direction from A towards B. Once again there will be two flashes and
light from them will propagate towards the observer at the midpoint
of the platform. However the midpoint is in motion. It is rushing away
from light coming from A; and rushing toward the light coming from
B. Nonetheless, the two signals arrive at the midpoint at the same
moment.
Here's a version that isn't animated.
What is the new observer to make of this? For the new observer, the light from A must
cover a greater distance to catch up with the receding midpoint; and the light from B
must cover a lesser distance to arrive at the midpoint rushing towards it. So if the two
arrive at the same moment, the light from A must have left earlier than the light
from B to give it greater time to cover the greater distance to get to the midpoint. That is,
the flash at A happened earlier than the flash at B. The two events were not
simultaneous, according to the new observer.
Notice that the reasoning
requires the light
postulate: both light
flashes must move at the
same speed; that is, each
must require the same
time to cover the same
distance.
The reasoning extends to the clocks. The clocks at A and B show
the same time when the flash events happen at each. These two
events are not simultaneous for the new observer. Therefore the
new observer will judge the clocks at A and B not to be properly
synchronized. In fact clock A is set ahead of clock B.
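The new observer's reasoning can be put into numbers. The Python sketch below works entirely from the new observer's standpoint: the light from A chases a receding midpoint, the light from B meets an approaching one, and both move at the same speed. The units (speed of light 1, platform one light-second long at rest) and the sample platform speed of 0.6 are assumptions made here for illustration. So is the use of the length contraction factor from the earlier chapter.

```python
import math

def flash_lead_time(v, rest_length=1.0, c=1.0):
    """How much earlier flash A occurs than flash B for the new observer.

    The platform moves at speed v from A towards B, so the midpoint recedes
    from A's light and approaches B's light. Both signals travel at c and
    reach the midpoint at the same moment.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    half = (rest_length / gamma) / 2.0  # contracted half-platform
    time_from_a = half / (c - v)        # light chasing the midpoint
    time_from_b = half / (c + v)        # light meeting the midpoint
    return time_from_a - time_from_b

lead = flash_lead_time(0.6)
print(f"flash A precedes flash B by {lead:.4f} s")
```

For a platform at rest relative to the observer the function returns zero, which is just the platform observer's judgment that the flashes were simultaneous.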
In short, the platform observer will say that the two flashes
happened simultaneously and that the two clocks are properly
synchronized; the new observer will say the A-flash happened first
and that the A-clock is set ahead of the B-clock. It is not a matter
that one or other of them is somehow misinformed. They are both
using the same information. Rather it is that judgements of the
simultaneity of spatially separated events depend on the observer,
just as the rate of clocks and lengths of bodies depend on the
observer in special relativity.
That a moving clock slows and moving rod shrinks is something
most of us get used to with a little thought. The same is not true of
the relativity of simultaneity. It is harder to get used to it, since it
amounts to a more fundamental breakdown. It tells us that there
is no absolute fact about the relative timing of events at distant
places. Imagine that you have candles on a birthday cake in
Pittsburgh and on one in far away Sydney. You plan to have them
blown out at exactly the same moment. The relativity of simultaneity
tells you that there is no absolute fact to whether you succeed.
Relative to an earthbound observer, you may succeed. But that can
mean that relative to an observer on the moon, who moves relative
to the earth, you did not succeed.
The relativity of simultaneity adds to the repertoire of quantities
that are relative and not absolute. There is no absolute fact to
whether a spaceship is moving uniformly or is at rest. It can only be
said to be at rest relative to another body. There is no absolute fact
as to whether a rod is a foot long or a process lasts for one minute.
These can only be true with respect to an observer with a definite state of
motion. To this list we add that there is no absolute fact to whether
two spatially separated events are simultaneous; or whether two
spatially separated clocks are synchronous. These can only be true
relative to an observer with a definite state of motion.
What the Relativity of Simultaneity is NOT
There is a quite benign way in which observers can disagree on the
simultaneity of events. It is not the effect at issue. To see the benign
way, imagine that a flash of lightning strikes the tree you are
standing under. Let us say the strike comprises two events: the
flash of the light and the boom of the thunder. For you
standing under the tree, if you survive, the two events are
simultaneous. It would not appear so for someone standing on a
distant hill top watching the lightning strike. That observer would see
the flash and then, several seconds later, hear the boom of the
thunder. For you the flash and the boom are simultaneous. For the
distant observer they are not simultaneous; or, more precisely, they
do not appear simultaneous.
This same effect can arise in more abstruse settings. When we look
at a distant galaxy 10 million light years away, we are seeing it as it
appeared 10 million years ago. So if we see some event
occurring now, such as a star in the galaxy exploding, that event
really happened 10 million years ago. It will appear to us that it
happened now, at the same time as the events of the present day. In
fact it did not. We know that and we correct for the time the starlight
took to reach us in judging the timing of the event.
These two examples illustrate the oddities of what we can call
"appearance simultaneity." Events are simultaneous in this
sense, merely if our sensations of them happen at the same
moment. Or they fail to be simultaneous in this sense if our
sensations of them happen at different times.
That sort of simultaneity is not the sort that is at issue in the relativity
of simultaneity. The idea is that we correct for differences in
appearance simultaneity. For example, when we hear the boom of
the thunder coming after we see the flash of the lightning, we
routinely allow for the fact that light travels very rapidly, but sound
travels slowly, roughly one mile in five seconds. So even though we
sense the flash and boom at different times, we judge the two
originating events to be simultaneous.
Here's another case. Two lightning bolts strike at points D and E,
where D is farther away from the observer. Let's say that the strikes
are timed so that light signals from the bolts arrive at the same
moment at the observer. The observer would see both flashes at the
same time. The bolts would appear simultaneous. But the observer
would then correct for the greater distance that the light signal
from D must travel. So that the observer sees the flashes at the
same time means the observer judges the D bolt to have struck
earlier.
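The correction for appearance simultaneity is simple arithmetic: subtract the signal's travel time from its arrival time. Here is a small Python sketch. The distances to D and E and the ten-second thunder delay are made-up values for illustration; the speeds of light and sound are the rounded figures used in the text.

```python
C_MILES_PER_SECOND = 186_000.0  # speed of light, rounded as in the text

def emission_time(arrival_time, distance_miles, signal_speed=C_MILES_PER_SECOND):
    """Time at which a signal left its source, given when and from how far it arrived."""
    return arrival_time - distance_miles / signal_speed

# Hypothetical distances: bolt D is 20 miles away, bolt E is 5 miles away.
# Both flashes are seen at the same moment, t = 0.
struck_d = emission_time(0.0, 20.0)
struck_e = emission_time(0.0, 5.0)
print(f"D struck {struck_e - struck_d:.2e} s before E")

# Thunder: sound covers roughly one mile in five seconds (figure from the text).
# The light's own travel time is neglected here, since it is tiny by comparison.
SOUND_MILES_PER_SECOND = 1.0 / 5.0
boom_delay = 10.0  # hypothetical: boom heard ten seconds after the flash is seen
print(f"strike was about {boom_delay * SOUND_MILES_PER_SECOND:.0f} miles away")
```

Once each arrival has been corrected in this way, what remains is the observer's judgment of when the originating events happened.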
The relativity of simultaneity of relativity theory arises after we have
corrected for the oddities of appearance simultaneity. Even after
those corrections have been made, it turns out that observers in
relative motion will not agree on the timing of spatially separated
events. In the thought experiment above with the A and B clocks, it
turns out that no corrections for appearance simultaneity are
needed. Since the observer is located at the midpoint of the
platform, the flashes of light at A and B are delayed equally. That is
why the observer was placed there.
What you need to know:
What the relativity of simultaneity is.
What the relativity of simultaneity is not.
Copyright John D. Norton. January 2001, August 30, 2002, July 20, 2006; January 8 2007, January 3, August 21, 27, 2008; January 13, 2010.
Problem of Reciprocity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/Reciprocity/index.html[28/04/2010 08:18:21 ﺹ]
HPS 0410 Einstein for Everyone
Is Special Relativity Paradoxical?
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
What's the Problem?
The Car and the Garage
Relativity of Simultaneity...
...Solves the Problem
Relativity of Simultaneity and the Measurement of
Lengths
Relativity of Simultaneity and the Measurement of
the Rates of Clocks
Are the Relativistic Effects Illusory Artefacts of
Measurement?
What You Need to Know
Background reading: J. Schwartz and M. McGuinness,
Einstein for Beginners. New York: Pantheon. pp. 109-116.
What's the Problem?
Relativity theory tells us that a moving clock is slowed
down and a moving rod is shrunk in the direction of its
motion. If I am an inertial observer, I will find the effect
to come about for the clocks and rods of a spaceship
moving past at rapid speed. But if that spaceship is
moving inertially, then, by the principle of relativity, the
spaceship's observer must find the same thing
for my clocks and rods. Relative to that observer,
my clocks and rods move past at great speed. So that
observer would find my clocks to be slowed and my
rods to be shrunk in the direction of my motion.
Each finds the other's clocks slowed and rods shrunk.
How can both be possible? Is there an
inconsistency in the theory? If I am bigger than you,
then you must be smaller than me. You cannot also be
bigger than me. That's the problem.
The Car and the Garage
That each finds the other's clocks slowed and rods
shrunk is troubling. But it is not immediately obvious
that there is a serious problem. If I walk away from
you, simple perspective effects make it look to each of
us that the other is getting smaller. That perspectival
effect should not worry anyone. The car in the garage
problem is an attempt to show that the relativistic
effects are more serious than this simple perspectival
effect. There is, it tries to show, a real contradiction;
and we should not tolerate contradictions in a physical
theory.
Here is how we might try to get a contradiction out of the
relativistic effect of each observer judging the other to have
shrunk. Imagine a car that fits perfectly into a garage. The
garage is a small free standing shed that is just as long as the
car. There is a door at the right and a door at the left of the
garage. The car fits exactlyas long as it is at rest.
Now imagine that we drive the car at 86.6% the speed
of light through the garage from right to left. The
doors have been opened at the right and the left of the
garage to allow passage of the car. There is a garage
attendant, who stands at rest with respect to the
garage. Can the garage attendant close both doors so
that, at least for a few brief moments, the car is fully
enclosed within the garage?
According to the garage
attendant, there is no problem
achieving this. At 86.6% the
speed of light, the car has
shrunk to half of its length at
rest. It fits in the garage handily.
The garage attendant can close
both doors and trap the car
inside.
According to the car driver, however, matters are
quite different. The car is at rest and the garage moves.
The garage approaches the car at 86.6% the speed of
light. So the car driver finds that it is the garage and
not the car that has shrunk to half its length. The
garage is now half as long as the car. The car driver
says that there is no way the garage attendant can
shut both doors and trap the car fully inside.
Now this is a serious problem. Either the car can or
cannot be trapped fully within the garage, but not both.
(Or so it would seem.)
Relativity of Simultaneity...
There is a solution. It depends upon our remembering
that there is more in special relativity than the
slowing of clocks and the shrinking of rods. We have
already seen the relativity of simultaneity, which will
take on greater and greater importance in our
assessment of the theory. It tells us that observers in
relative motion can disagree on the timing of
spatially separated events.
Note that an "event" in the context of relativity theory has a narrow meaning. It is something that happens at one place and at one time. Events are not spread out in space and time as might be the sort of events that we talk about in everyday talk. In relativity theory, an event happens at just one moment and one spot.
...Solves the
Problem
The possibility of that disagreement is the key to the
problem of the car and the garage. A judgment of the
simultaneity of events is essential to any judgment of
whether the car was trapped in the garage by the
closing of doors. The car driver and the garage
attendant disagree on whether the car is ever fully
enclosed in the garage simply because they disagree
on the time order of two events.
The garage attendant says:
There are two events:
"Left door shut": I closed the left door before the car
struck it.
"Right door shut": I closed the right door after the car
passed.
And these events happened at the same time.
Therefore the car was fully enclosed.
The car driver says:
There are two events:
"Left door shut": You closed the left door
before the car struck it.
"Right door shut": You closed the right door
after the car passed.
But these events did not happen at the
same time.
You closed the left door first.
Then, later, you closed the right door after
the front of the car had already burst through
the closed left door.
Therefore the car was never fully enclosed.
Both agree that the two events "left door shut" and
"right door shut" happened. They disagree on the
time order in which they happened. But that time
order is what is needed to decide whether the car was
fully enclosed in the garage. In a nutshell:
• The car can only be said to have been fully enclosed
in the garage if both doors were shut at the same time.
• There is no observer-independent fact of the matter
as to timing of these events.
• Therefore there is no observer-independent fact as to
whether the car was ever fully enclosed in the garage.
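The disagreement over time order can be checked with the standard Lorentz transformation for time, t' = gamma x (t - u·x/c²). That formula is not derived in this chapter and is brought in here as an assumption. The Python sketch below uses units in which the speed of light is 1, takes the garage to be one light-second long (a fanciful size chosen only to keep the numbers simple), and places the left door at x = 0 and the right door at x = 1. All of these choices, and the function name, are illustrative.

```python
import math

def time_in_moving_frame(t, x, u, c=1.0):
    """Lorentz-transformed time of an event (t, x) for an observer moving at velocity u."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return gamma * (t - u * x / c ** 2)

GARAGE_LENGTH = 1.0    # rest length of garage (and of car), in light-seconds
CAR_VELOCITY = -0.866  # car drives right to left, so its velocity is negative

# Garage attendant's frame: both doors shut at t = 0.
left_door = (0.0, 0.0)             # (t, x): left door at x = 0
right_door = (0.0, GARAGE_LENGTH)  # (t, x): right door at x = 1

t_left = time_in_moving_frame(*left_door, CAR_VELOCITY)
t_right = time_in_moving_frame(*right_door, CAR_VELOCITY)
print(f"car driver: left door shut at {t_left:.3f}, right door shut at {t_right:.3f}")
```

The two events share one time for the garage attendant, but for the car driver the right door shuts later than the left door, just as the car driver reports above.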
Relativity of Simultaneity and
the Measurement of Lengths
The problem of the car and the garage shows how
judgments of lengths are entangled with judgments of
simultaneity. This entanglement runs throughout
special relativity. Indeed, one can understand all the
odd kinematical effects as derived from it; for this
reason, it was the first effect Einstein discussed in his
1905 paper.
For example, the relativity of simultaneity lies behind
relativistic length contraction. To see this, consider
how we might measure the length of a moving object.
Take a car moving along a freeway at fancifully
high speeds, so that relativistic effects come into play.
I am standing by the roadside and want to know the
car's length, or at least its length relative to me.
I cannot just hold up a measuring rod and proceed
in the normal way: that is, check which marks on the
rod align with each end of the car. For the car is
zooming past. By the time I have noted the alignment
of the front of the car with, say, the 0 mark on the
measuring rod, the car has long since zoomed off into
the distance. I will have had no chance to check where the
rear of the car aligned. I need a more refined procedure.
Here's one: as the car zooms by, I stand with a friend
at the roadside, each of us holding a raised flag, ready
to plant into the roadside. As the front of the car
passes, I plant my flag into the roadside; as the rear
of the car passes my friend, my friend plants his flag
into the roadside. The car zooms away. But that
doesn't matter anymore. I have the information I need
in the locations of the flags. I can use my measuring
rod to determine the distance between the flags. That
is the length of the moving car.
What is essential to this procedure is that I and my
friend plant our flags at the same time. Otherwise
the distance between the two marks will not properly
reflect the length of the car.
But there's the catch. The car driver will disagree with my judgments of which events are simultaneous. The car driver will agree, of course, that there are two events, the planting of the two flags. But the car driver will not agree that I and my friend placed the marks simultaneously. Rather the car driver will find my friend and me to be rushing toward the car and the two flag plantings to have happened at different times. As the figure shows, the car driver will judge the planting of my flag at the front to have happened first; and the planting of my friend's flag at the rear to have happened later.
Here's an animated version of this process.
Since my friend delayed the planting of the flag at the
rear (in the car driver's judgment), the rear of the car
advanced for some short time after I'd planted my flag
at the front. Therefore (in the car driver's judgment) the
distance we staked out with the flags is shorter than
the length of the car and our determination of the
length of the car is wrong. Hence we end up
disagreeing about the length of the car.
The important point is that neither of us (driver
and roadside observer) has made an error. There
is no absolute fact as to which of us is really
moving. Therefore there is no absolute fact as
to which of our judgements of the timing of the two
events is correct. Just as in the case of the car
and the garage, we each judge the other as
shrunken because we judge the simultaneity of
events differently.
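Here's a minimal numerical sketch of the flag-planting story. It assumes the standard Lorentz transformation and the standard contraction factor, neither of which is written out in this chapter, and the car's speed (0.6c) and rest length (10 units) are purely illustrative choices.

```python
import math

C = 1.0  # speed of light; units are chosen so that c = 1 (an assumption)

def to_car_frame(t, x, v):
    """Lorentz-transform a roadside event (t, x) into the car's frame."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2), gamma * (x - v * t)

v = 0.6             # car speed as a fraction of c (illustrative)
rest_length = 10.0  # car length as the driver measures it (illustrative)
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

# Roadside frame: both flags go in at t = 0, one at each end of the car.
road_length = rest_length / gamma  # distance the roadside observers stake out
t_rear, x_rear = to_car_frame(0.0, 0.0, v)
t_front, x_front = to_car_frame(0.0, road_length, v)

# Car frame: the front flag is planted first, the rear flag later.
delay = t_rear - t_front

# During that delay the road, carrying the front flag, slides back at speed v.
flag_gap_for_driver = (x_front - v * delay) - x_rear

print(f"roadside judgment of car length: {road_length:.2f}")
print(f"driver: front flag planted {delay:.2f} time units before rear flag")
print(f"driver's judgment of the gap between flags: {flag_gap_for_driver:.2f}")
print(f"driver's judgment of car length: {x_front - x_rear:.2f}")
```

With these numbers the driver finds the flags closer together than the two ends of the car, just as described above, while the roadside observers find the car shorter than the driver does.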
Relativity of Simultaneity and
the Measurement of the Rates
of Clocks
Similar considerations arise in judgments of the
slowing of moving clocks. To see how the relativity of
simultaneity underlies the relativistic slowing of clocks,
we attend to a procedure we might use to measure the
effect.
To judge the rate of a clock that passes me I need
to be able to compare its reading with my wristwatch
now and then compare its reading again later with my
wristwatch after some time has passed. If the clock is
running slow, I'll notice that its rate lags behind my
wristwatch.
The catch in this simple procedure is that the clock is
moving. I might find that both it and my wristwatch
read the same time now, at the moment the clock
passes. But the clock is moving rapidly. So after some
time has elapsed, it has moved off into the distance.
How can I find out what the moving clock reads an
hour from now when it is no longer anywhere near
me? Here's one procedure: I set up many clocks at
rest with respect to me throughout space. Then, one
hour later, as the moving clock passes one of those
clocks, a friend notes what the moving clock reads and
what the local resting clock reads. From my friend's
report, I can figure out whether the moving clock has
slowed or not.
The figure shows the bare essentials of the moving
clock and all the other clocks spread out through
space. The moving clock agrees with the reading of
the leftmost clock (my wristwatch) as it passes by.
However, when it passes the rightmost clock, it reads
much less than that clock does. So I judge it to have slowed.
This procedure seems quite sound. So does that mean
an observer who travels with the moving clock
would agree and judge the moving clock to have
slowed? No! We have seen that relativity theory
requires that observer to judge my array of clocks to be
running more slowly! How can that be?
By now you know the answer. An essential part of the
procedure is that all the clocks I laid out through space
must be synchronized. That means that the events
of each clock reading say "12 noon" must be
simultaneous events. The relativity of simultaneity tells
us that observers in relative motion may disagree on
whether those events are simultaneous. Therefore
observers in relative motion may disagree on whether
clocks separated in space are properly synchronized.
And that is what happens in this case.
The moving observer will judge my clocks not to be
properly synchronized. As a result, the moving
observer will regard my judgments of the rate of the
moving clock to be defective. As before, there is no
absolute fact as to whether the clocks are properly
synchronized. Therefore there is no absolute fact as to
whether the moving clock slows with respect to my
clocks; or whether my clocks slow with respect to the
moving clock.
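The reciprocity of clock slowing can be checked with a few lines of arithmetic. This is only a sketch: it assumes the standard Lorentz transformation (not stated in this chapter), units in which c = 1, and an illustrative speed of 0.6c.

```python
import math

def to_moving_frame(t, x, v):
    """Lorentz-transform one of my events (t, x); units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return gamma * (t - v * x), gamma * (x - v * t)

v = 0.6  # speed of the moving clock as a fraction of c (illustrative)
T = 1.0  # hours shown on my clocks at the second comparison

# The moving clock leaves my wristwatch (x = 0) at t = 0 reading 0 and
# reaches my friend's clock at x = v*T when that clock reads T.
x_friend = v * T
moving_clock_reading, _ = to_moving_frame(T, x_friend, v)
rate_i_judge = moving_clock_reading / T

# For the traveler, my friend's clock did not read 0 when the trip began.
# The event "friend's clock reads v * x_friend" has traveler time 0.
head_start = v * x_friend
t_check, _ = to_moving_frame(head_start, x_friend, v)

# So the traveler says my friend's clock advanced only T - head_start
# while the traveler's own clock advanced by moving_clock_reading.
rate_traveler_judges = (T - head_start) / moving_clock_reading

print(f"I judge the moving clock to run at {rate_i_judge:.2f} of my rate")
print(f"the traveler judges my clocks to run at {rate_traveler_judges:.2f}")
```

Both rates come out the same. Each of us judges the other's clocks to run slow, and the whole difference lies in the head start that the traveler attributes to my badly synchronized clocks.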
Are the Relativistic
Effects Illusory
Artefacts of
Measurement?
Once you recognize how fully the relativity of
simultaneity is bound up in the relativistic length
contraction and clock slowing effects, it is easy to fall
into a new misunderstanding. One might think that the
effects are not really part of the world at all, but
that they somehow come about solely because of the
way we set our clocks.
An analogy: it is possible to board a transpacific flight
in Sydney, Australia, on one day and, after 16 hours of
travel, disembark in Los Angeles the day before! Is this
time travel? Of course not. During the flight, you
crossed the international date line. That the calendar
reads a day earlier in Los Angeles is purely an artefact
of how we set our clocks and calendars across the
world.
In the early 1910s, this issue entered the physics
literature in discussion of the geometry of a rotating
disk. In 1911, V. Varicak offered the following
diagnosis of the origin of relativistic length
contraction:
It "is only an illusory, subjective appearance,
caused by the manner of our regulation of clocks and
measurement of length"
"a psychological and not physical effect"
The rotating disk has some odd properties. Its
circumference is relativistically contracted but
its radius is not, resulting in "Ehrenfest's
paradox." But this is a topic for another time.
Einstein's reply of the same year read:
"The question of whether the Lorentz contraction really
exists or not is misleading.
...[it is] not real in so far as it does not exist for a
co-moving observer.
...[it is] real in so far as it can be demonstrated in
principle by physical means by an observer that is not
co-moving"
What I think Einstein is getting at is this. He is
accusing Varicak of conflating two distinctions:
Real versus unreal: That we age is real. That we travel backwards in time when flying from Sydney to Los Angeles is unreal.
Observer independent versus observer dependent: That an object spins on its axis is observer independent; it is verified by the presence of inertial forces. That an asteroid moves uniformly in space must be judged relative to another object.
Varicak's point seems to be that being observer
dependent makes an effect unreal. Einstein's
response is that observer dependent effects can be
both, according to the observer. You cannot infer from
observer dependence to unreality. That the asteroid
moves relative to us is both real and relative to us.
What You Need to Know
The relativity of simultaneity
How it solves the car in the garage problem.
How the relativity of simultaneity is involved in judgments of the length of moving
bodies and rates of clocks.
Why this doesn't mean that the relativistic effects are illusions.
Copyright John D. Norton. February 2001, September 2002; July 2006; January 2, 2007, January 10, August 21, 2008.
E= mc^2
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/E=mcsquared/index.html[28/04/2010 08:18:26 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
E=mc^2
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Kinematics and Dynamics
The Basic Concepts of Dynamics: Two Relations
Conservation of Energy and Momentum
Achieving Unlimited Velocities in Classical Physics
The Unlimited Momentum Loophole Closed
Simple Redescription of the Growth of Mass
E = mc^2 at last
Hear Einstein Explain It
What You Need to Know
Linked documents:
The World's Quickest Derivation of E=mc^2
Resolving Collisions in Classical and Relativistic Physics
Einstein's famous equation has grown into one of the
great symbols of the 20th century. It is the one equation
in science that people recognize, if any is. It has a kind
of iconic status and dual connotations: the brilliance
and insight of Einstein and the darkness of atomic
bombs. Images.
Kinematics and Dynamics
So far we have looked at kinematics, the study of
motions in space and time, and we have seen how
Einstein's special theory of relativity has affected it.
What is kinematics? In kinematics we might look at the trajectories of tennis balls
in flight. We learned from Galileo that they move in parabolic arcs. That means that
their trajectories are symmetric about their apex and that, for the same starting
speed, the tennis ball will go the longest horizontal distance if its initial direction is
pointed 45 degrees to the horizontal. We learn nothing of the causes of these
motions.
We now look at dynamics, the study of the causes that
affect motion, where those causes are forces. E=mc^2
arises as part of the modification to dynamics brought
about by Einstein's theory.
What is dynamics? In the case of tennis balls in flight, dynamics treats how
forces acting on the tennis balls lead to these parabolic trajectories. A tennis
ball acquires its initial speed because it is struck by a tennis racket; that is,
the tennis racket applies a brief, strong force to it. The resulting trajectory
curves downward in a parabolic arc, since, as Newton told us, the earth's
gravity is applying a constant downward force to it.
In kinematics, special relativity changes our normal
expectations. We find that we can no longer
accelerate anything through the speed of light and we
have to adjust our ideas about space and time to
accommodate this result. Since all these motions have
causes treated in dynamics, we must make
corresponding changes in our theories of these causes.
When we make these changes in dynamics, E=mc^2
results. We shall now see how that comes about.
The Basic Concepts of
Dynamics: Two Relations
Classical physics and special relativity agree in
the following framework of basic concepts. That is, they
both employ the concepts of energy, momentum and
force and they both respect the two relations stated
below that obtain between them.
Energy: I know of no useful definition for energy. It is
understood by example. When systems interact,
they exchange energy. For example, a moving car has a
certain energy of motion that is converted to heat energy
in the brakes when the car is slowed. That energy
originally came from chemical energy stored in the
gasoline fuel, which in turn was supplied as light energy
to the plants that became petroleum.
Momentum: The momentum of a moving body is a
measure of the quantity of motion. It is defined by
momentum = mass x velocity
for a mass moving at the nominated velocity. The
formula contains both mass and velocity since the
quantity of the motion increases with both. (Which is the
greater motion: an ounce of lead moving at 100mph or a
pound of lead moving at 100mph?)
Force: When two bodies interact, the force measures
the rate of transfer of momentum and energy, such as
through the two relations below. That is, it measures the
intensity of an interaction. It is roughly equivalent to the
prescientific notion of muscular effort. A push can set a
heavy cart in motion because it applies a force to it. Its
size is given by the rate of transfer of energy and
momentum.
If a constant force acts on a body, force, energy and momentum are related by the
simple relations:
Momentum gained by body = Force x Time during which force acts
Energy gained by body = Force x Distance through which force acts
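Here's a small worked check of the two relations in the classical case. It is only a sketch: the mass, force and duration are illustrative, the units are left generic, and the classical energy of motion formula (one half mass x velocity squared) is an assumption brought in for comparison since it is not stated in this chapter.

```python
# A constant force acts on a body that starts at rest (classical physics).
mass = 200.0     # illustrative
force = 400.0    # illustrative
duration = 5.0   # time during which the force acts

# Relation 1: momentum gained = force x time
momentum_gained = force * duration
velocity = momentum_gained / mass  # since momentum = mass x velocity

# Distance covered under constant acceleration from rest
acceleration = force / mass
distance = 0.5 * acceleration * duration**2

# Relation 2: energy gained = force x distance
energy_gained = force * distance

# Assumed classical formula for energy of motion, for comparison only
energy_of_motion = 0.5 * mass * velocity**2

print(f"momentum gained: {momentum_gained}")
print(f"final velocity:  {velocity}")
print(f"energy gained:   {energy_gained}")
print(f"1/2 m v^2:       {energy_of_motion}")
```

The energy computed from force x distance agrees with the familiar classical energy of motion, which is a useful sanity check that the two relations hang together.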
These relations obtain in both classical and relativistic
physics. We shall see, however, that in the relativistic
context they lead to E=mc^2.
Conservation of Energy and
Momentum
The most important laws in dynamics are those that
state the conservation of energy and of momentum.
These two laws can be applied whenever we have a
closed system; that is, a system that does not interact
with its surroundings. They assert that for such systems
and any process they may undergo:
Total Energy at start = Total Energy at end
Total Momentum at start = Total Momentum at end
An isolated spaceship in deep space is a good
example to consider. Imagine some interaction that
takes place within that system. These laws tell us that
the total of energy before the interaction equals the total
of energy afterwards; and the total of momentum before
the interaction equals the total of momentum afterwards.
Let's look at an example of such a
process. A spacewalker stands on
the spaceship and both are at
rest. They have no velocity, so their
total momentum is zero.
Momentum = mass x velocity = mass x 0 = 0
Now imagine that the spacewalker vigorously
pushes off from the spaceship and floats off into space.
The spacewalker has gained some momentum. If the
spacewalker has mass of 200 pounds and moves off at
10 feet per second, he has gained 2,000 units of
momentum. The law of conservation of momentum
demands that the total momentum of the two systems
stays constant. That is, the total momentum of the
spaceship plus spacewalker system must remain zero.
That can only happen if the spaceship gains a
negative momentum exactly opposite to the
momentum gained by the spacewalker. That is a
momentum of -2,000 units. Then the sum of the two will
be zero.
2,000 + (-2,000) = 0
And that will only happen if the spaceship gains a
velocity in the direction exactly opposite to that of the
spacewalker's motion; that is, if the spaceship recoils.
So conservation of momentum demands a recoil.
If you want to keep doing the sums, we can figure
out just how big the velocity of recoil must be. If
the spaceship has a mass of 2,000 lb, then the
recoil is just -1 ft/sec. For then its momentum is
2,000 x (-1) = -2,000 ft lb/sec
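The recoil sum can be written as a tiny calculation. This is just a sketch using the numbers from the example; the function name is made up for illustration.

```python
def recoil_velocity(mass_a, velocity_a, mass_b):
    """Velocity body B must acquire so that total momentum stays zero."""
    return -(mass_a * velocity_a) / mass_b

walker_mass = 200.0      # lb
walker_velocity = 10.0   # ft/sec
ship_mass = 2000.0       # lb

ship_velocity = recoil_velocity(walker_mass, walker_velocity, ship_mass)
total_momentum = walker_mass * walker_velocity + ship_mass * ship_velocity

print(f"spaceship recoil velocity: {ship_velocity} ft/sec")
print(f"total momentum afterwards: {total_momentum} ft lb/sec")
```

The minus sign on the recoil velocity records that the spaceship moves in the direction opposite to the spacewalker.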
Analogous considerations apply to the combined energy
of the spaceship and spacewalker. After the interaction
both spaceship and spacewalker have some energy of
motion. That energy originated as chemical energy
stored in the muscles of the spacewalker, before the
spacewalker used muscle power to push off the
spaceship. The energy of motion gained by the system
must match the chemical energy lost from the
spacewalker's muscles so that the total energy stays
constant.
Achieving Unlimited Velocities
in Classical Physics
Unlike relativity theory, classical physics allows us to
accelerate bodies to arbitrarily high speeds. There is a
simple mechanism for achieving these unlimited speeds
in classical physics. If we keep applying a constant
force to a body, the body will keep gaining energy and
momentum and its velocity will rise accordingly.
How can we go about applying a constant force to a body
over a long enough time period to achieve very high
velocities? If the body is very small, it turns out to be
much easier than you might imagine. If the body is a very
small particle, an electron say, then the forces of electric
and magnetic fields can quite quickly accelerate the
particle to close to the speed of light.
Over a hundred years ago, this happened in the first
cathode ray tubes, that is, in devices like old fashioned
TV tubes. If a high voltage is applied between two metal
plates in a near evacuated vessel, the electric field
resulting from the voltage is quite capable of pulling
electrons off one electrode at very high speeds. This
same technique is still used today in particle
accelerators in which various particles are accelerated to
close to the speed of light by combinations of electric and
magnetic fields.
What about accelerating ordinary objects up to very high
speeds by ordinary means? What about the most
familiar method of all, swatting a ball with a bat or a
club? You might suspect that this procedure is
self-defeating. If I want to swat a ball to get it to move
quickly, wouldn't I need a faster moving bat to swat
it with? And once it is moving fast, wouldn't I need a yet
faster moving bat to get it to move still faster? So can
the method only yield high speeds if I already have
something moving at even higher speeds?
These worries turn out to be misplaced. It is easy to see
that a small ball at rest, hit by a much heavier bat, will be
accelerated to twice the speed of the bat. This, for
example, is pretty much what happens when a golf club
hits a golf ball. The greatest speed the ball can achieve
is twice that of the club's head. The argument that shows
it is much easier than you would expect and uses the
principle of relativity to make it simple. Look here for
details.
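If you'd like to see the "twice the speed" claim in numbers, here is a minimal sketch. It assumes a perfectly elastic, head-on collision and uses the standard classical result for that case, which is not derived in this chapter; the club speed of 100 mph is illustrative.

```python
def ball_speed_after_hit(bat_mass, bat_speed, ball_mass):
    """Speed of a ball, initially at rest, after a head-on elastic hit.

    Standard classical result from conserving momentum and energy
    (an assumption here; the chapter links to a separate argument).
    """
    return 2.0 * bat_mass * bat_speed / (bat_mass + ball_mass)

club_speed = 100.0  # mph, illustrative
speeds = {
    ratio: ball_speed_after_hit(ratio, club_speed, 1.0)
    for ratio in (1, 10, 100, 1000)
}

for ratio, speed in speeds.items():
    print(f"bat {ratio:>4} times heavier than ball: ball leaves at {speed:.1f} mph")
```

As the bat gets heavier relative to the ball, the ball's speed creeps up toward twice the bat's speed but never exceeds it.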
It also turns out that it is quite easy to imagine systems
that use repeated collisions that would accelerate
bodies, according to classical physics, to arbitrarily high
speeds. Here's one simple setup. We have two very
massive blocks rolling towards one another on rails. A
small elastic body is trapped in between, perhaps
suspended by a rope from a high support. It is set in
motion by a collision with one of the blocks. The body
then bounces back and forth between the two approaching
blocks. With each bounce, according to classical
physics, the body gains the same increment of speed
and same increment in the magnitude of its momentum.
By choosing the sizes and distances carefully, we can
set things up so that there are as many collisions as we
like. While the mass bounces back and forth between
the approaching blocks, the collisions happen more
and more rapidly and the mass goes faster and faster.
According to classical physics, this arrangement is quite
able to accelerate the mass past the speed of light, as
long as the blocks are massive enough and the
materials strong enough not to break in the violent
collisions.
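Here is a rough classical tally of the bouncing scheme. It is a sketch under idealizing assumptions: the blocks are treated as so massive that they never slow down, every bounce is perfectly elastic (so each one adds twice the block speed to the body), and the finite distance between the blocks is ignored. The block speed of 10 miles/sec is purely illustrative.

```python
C = 186_000.0  # speed of light in miles/sec, the round figure used in this chapter

def bounces_to_exceed(block_speed, limit=C):
    """Count classical bounces until the body's speed first exceeds `limit`.

    Each elastic bounce off a very massive block approaching at
    block_speed raises the body's speed by 2 * block_speed.
    """
    speed = 0.0
    bounces = 0
    while speed <= limit:
        speed += 2.0 * block_speed
        bounces += 1
    return bounces, speed

bounces, final_speed = bounces_to_exceed(block_speed=10.0)
print(f"{bounces} bounces give a classical speed of {final_speed:,.0f} miles/sec")
```

Nothing in the classical sums stops the tally at the speed of light, which is exactly the loophole described here.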
These schemes illustrate how it is
possible, according to classical
physics, to impart unlimited
momentum to a body and, as a
result, to boost it to unlimited
velocities, including those greater
than that of light.
We have seen that relativity theory prohibits boosting
bodies past the speed of light. Therefore we must pay
attention to how this boosting arises in classical physics. Then we
can decide how we must modify classical dynamics, so
that it does not allow us to accelerate objects through
the speed of light.
Momentum = mass x velocity
Momentum: increases without limit
mass: fixed
velocity: increases without limit
Recall that momentum is mass x velocity. Since the
mass is a fixed number, characteristic of the body, if the
body's momentum increases, so must its
velocity. If its momentum grows without limit, then its
velocity also increases without limit. As a result, classical
physics tells us that we can accelerate masses through
the speed of light.
The Unlimited
Momentum Loophole
Closed
In relativistic physics, we can also supply unlimited
momentum to a body. Indeed we can use the same
mechanism as in classical physics: just set up a small
object to collide with a larger one. And with successive
collisions, we can supply more and more momentum to
the small object. However, as we saw in the case of just
one collision, these processes will never accelerate the
small object past the speed of light.
Somehow we have to make sense of this prohibition on
accelerating objects through the speed of light. We still
have the relation momentum = mass x velocity. We can
increase the momentum without limit. So why doesn't
the velocity also increase without limit?
In the classical context, getting to the conclusion of
unlimited velocity depended on an assumption: the
mass of the object is constant. That is the only
assumption we have left to adjust. That is how Einstein
modified dynamics in 1905. The mass of the object
increases with its velocity. Schematically:
Momentum = mass x velocity
Momentum: increases without limit
velocity: increases only as far as c
SO... mass must increase when velocities get close to c
So as we put more and more momentum into the body,
the velocity ceases to rise without limit; the mass starts
to rise instead. Eventually, once the velocity has
gotten close to that of light, all the increase is associated
with the mass. This effect on a mass, as we repeatedly
double its momentum, is shown in the table:
Momentum             Mass      Velocity (in units of 1,000 miles/sec)
0                    1         0
118.6                1.186     100
237.2 (=118.6x2)     1.621     146.4
474.4 (=237.2x2)     2.740     173.2
948.8 (=474.4x2)     5.198     182.5
1897.6 (=948.8x2)    10.251    185.1
3795 (=1897.6x2)     20.43     185.8
7590 (=3795x2)       40.82     185.9
...                  ...       ...
∞                    ∞         c=186
Only for those of you who have to know, the
formula used to determine the mass m is... (click).
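For those who'd like to regenerate the table, here is a minimal sketch. The linked formula is not reproduced in this text, so the code assumes the standard relativistic relations: with rest mass 1 and c = 186 (in units of 1,000 miles/sec), the mass at momentum p is the square root of 1 + (p/c)^2, and the velocity is momentum divided by mass.

```python
import math

C = 186.0        # speed of light in units of 1,000 miles/sec, as in the table
REST_MASS = 1.0  # mass of the body at rest, as in the table's first row

def mass_and_velocity(momentum):
    """Relativistic mass and velocity for a given momentum (assumed formula)."""
    mass = math.sqrt(REST_MASS**2 + (momentum / C) ** 2)
    return mass, momentum / mass  # velocity = momentum / mass

rows = []
momentum = 118.6
for _ in range(7):
    mass, velocity = mass_and_velocity(momentum)
    rows.append((momentum, mass, velocity))
    momentum *= 2  # double the momentum, as in the table

print(f"{'Momentum':>10} {'Mass':>8} {'Velocity':>9}")
for p, m, v in rows:
    print(f"{p:>10.1f} {m:>8.3f} {v:>9.1f}")
```

The computed values agree with the table to its rounding: the velocity crowds up against 186 while the mass roughly doubles with each doubling of momentum.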
In sum, according to relativity theory, a force, no matter
how big or long acting, cannot accelerate a body
through the speed of light. The closer the body gets to
the speed of light, the greater its mass becomes and the
harder it gets to accelerate. The mass grows without
limit.
In 1905, this was not such a shocking way to view
things. It was then known that when electrons moved
close to the speed of light in cathode ray tubes, they got
harder and harder to accelerate, as if they were
becoming more massive. Prior to Einstein, this was
explained as a complicated interaction between the
electron and its electromagnetic field. Einstein now just
said that the electron's mass wasn't merely appearing to
increase; it was increasing.
For more on how relativity blocks a scheme for using
collisions to boost things through the speed of light, see
this account of the Resolution of Collisions in Classical
and Relativistic Physics.
Simple Redescription of the
Growth of Mass
When a body is accelerated, we add momentum to it
and its mass increases. We also add energy to the body.
There turns out to be an especially simple rule that
connects the energy added with the mass gained:
Add 1 unit of energy -> Add 1/c^2 units of mass
Add 2 units of energy -> Add 2/c^2 units of mass
etc.
More generally:
Add E units of energy -> Add E/c^2 units of mass
Turning this around, we can say:
1 unit of mass -> c^2 units of energy
2 units of mass -> 2c^2 units of energy
etc.
More generally:
m units of mass -> mc^2 units of energy
This is Einstein's celebrated equation, E = mc^2,
seen in one application. It turns out that the relation can
be derived in this case with very little more fuss merely
by combining the two relations we saw above for energy,
momentum and force. For the brave: show me.
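Short of the full derivation, the rule can at least be checked numerically. The sketch below assumes the standard relativistic relation between momentum, mass and velocity (not stated in this text) and adds up force x distance in many small steps. In each small step, force x time is the momentum added, and distance is velocity x time, so force x distance comes to velocity x (momentum added). The step count and final momentum are arbitrary choices.

```python
import math

C = 186.0        # speed of light in units of 1,000 miles/sec
REST_MASS = 1.0  # mass of the body at rest

def mass(momentum):
    """Relativistic mass at a given momentum (assumed standard formula)."""
    return math.sqrt(REST_MASS**2 + (momentum / C) ** 2)

def energy_added(final_momentum, steps=100_000):
    """Sum force x distance = velocity x (momentum added) over small steps."""
    dp = final_momentum / steps
    total = 0.0
    for i in range(steps):
        p_mid = (i + 0.5) * dp            # momentum at the middle of the step
        velocity = p_mid / mass(p_mid)    # velocity = momentum / mass
        total += velocity * dp
    return total

p = 948.8
energy = energy_added(p)
mass_gain = mass(p) - REST_MASS

print(f"energy added / mass gained = {energy / mass_gain:.1f}")
print(f"c squared                  = {C**2:.1f}")
```

The ratio of energy added to mass gained comes out as c squared, which is the rule stated above.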
E = mc^2 at last
This famous equation asserts an equivalence of energy
and mass. Whenever a body gains or loses mass or
energy it gains or loses a corresponding amount of
energy or mass according to the conversion formula
E = mc^2. We have seen this for mass gain under an applied
force. That is, we have seen the result for one particular
form of energy, energy of motion. When a body loses
or gains energy of motion, it loses or gains mass
according to E = mc^2.
What about other forms of energy? What about heat
energy, chemical energy, electrical energy, etc.? Do
bodies also lose and gain mass according to E = mc^2
when they lose or gain these forms of energy? Yes they
do, but it takes a little bit more argumentation to
establish the result.
The argument that establishes this is a little complicated, so it is included here for the brave only.
We know that energy of motion has mass. The conservation of momentum requires that if this holds for one form of
energy, it must hold for all. To see this, imagine that we have some conversion of energy of motion into another form of
energy. For example, we are in an isolated spaceship with a rapidly spinning flywheel. The flywheel has considerable
energy of motion and thus a corresponding mass. We now use the motion of the flywheel to turn an electrical generator
that then charges a battery. The battery stores the electrical energy as chemical energy. The flywheel has lost some
energy of motion; the battery has gained the corresponding amount of chemical energy. The flywheel has also lost some
mass, according to E = mc^2. Will the battery gain the corresponding amount of mass as well?
To see that it must, view the entire process from another spaceship that sees this system of flywheel and battery moving
with uniform speed. The process will not alter the velocity of the flywheel plus battery system. Conservation of momentum
demands that the total momentum of the flywheel plus battery remain the same. Therefore, since momentum = mass x
velocity, the total mass of the system must stay the same. But that can only happen if the mass lost by the flywheel
reappears as mass in the battery and it does so exactly in accord with E = mc^2. The example can be repeated with a
conversion of energy of motion to any other type of energy.
Put most briefly, Einstein's equation says that energy
and mass are really just two different names for
the same thing. They rise and fall together because
they are at heart the same thing. We like to call that
thing mass when it is in hard, lumpy forms like bricks.
We prefer to call it energy when it is in the form of
radiation. But the one is just a form of the other.
What is most important for practical purposes is that the
conversion factor c^2 is huge. That means that a
small amount of mass under conversion yields a huge
amount of energy. This is the principle behind nuclear
weapons and nuclear power. In a series of nuclear
reactions, Uranium atoms "fission" (that is, split) into
atoms of smaller size and other particles such as
neutrons. It turns out that the total mass of these decay
products is just slightly less than the mass of the
Uranium we started with. This mass defect is around a
tenth of one percent of the mass. This missing mass has
been converted into energy. Because c^2 is so large, the
result of converting even a small part of the mass into
another form is the release of a huge amount of energy.
If the release is uncontrolled, the result is a catastrophic
explosion, an atomic bomb. If it is controlled in a power
plant, the result is useful power.
One gram of matter, about 20 drops of water, if it were
fully converted into electrical energy, would yield
25,000,000 kilowatt hours of electrical energy. That is
enough energy to power a 100 watt light bulb for
250,000,000 hours or 28,500 years. Recorded history
extends only about 12,000 years. At 5 cents a kilowatt
hour, it would cost $1,250,000 if purchased from a utility
company. The energy of that same gram, if released in
an explosion, would be equivalent to 21,000 tons of
TNT.
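These figures are easy to reproduce. The sketch below assumes the usual metric value of the speed of light, the standard conversion of 3.6 million joules per kilowatt hour, and the conventional 4.184 billion joules per ton of TNT; none of these conversion factors appears in the text above.

```python
C = 2.998e8                 # speed of light in meters/sec (assumed value)
JOULES_PER_KWH = 3.6e6      # one kilowatt hour in joules
JOULES_PER_TON_TNT = 4.184e9  # conventional energy of one ton of TNT

mass_kg = 0.001                       # one gram
energy_joules = mass_kg * C**2        # E = mc^2

kilowatt_hours = energy_joules / JOULES_PER_KWH
bulb_hours = kilowatt_hours / 0.1     # a 100 watt bulb draws 0.1 kilowatts
bulb_years = bulb_hours / (24 * 365.25)
cost_dollars = kilowatt_hours * 0.05  # at 5 cents a kilowatt hour
tnt_tons = energy_joules / JOULES_PER_TON_TNT

print(f"energy:        {kilowatt_hours:,.0f} kilowatt hours")
print(f"100 watt bulb: {bulb_hours:,.0f} hours, or {bulb_years:,.0f} years")
print(f"cost:          ${cost_dollars:,.0f}")
print(f"TNT:           {tnt_tons:,.0f} tons")
```

The results land within rounding of the figures quoted above.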
In 1905, Einstein did not expect this sort of application of
his result, which then seemed to be purely of theoretical
interest.
Hear Einstein Explain It
Click here.
What You Need to Know
The notions of energy, momentum and force and how they are related.
The conservation of energy and momentum and how to apply them.
How speeds greater than c are achieved in classical physics and why these methods fail in
relativity theory.
What happens to the mass of a body as c is approached.
What E=mc^2 says and how it is applied.
Copyright John D. Norton. January 2001, September 2002. July 2006, January 11, September 23, 2008.
Origins of Special Relativity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/origins/index.html[28/04/2010 08:18:39 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
Origins of Special Relativity
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Origins of the Principle of Relativity
Light
Ether Current Experiments Fail
Fresnel Ether Drag
Tuning the Fresnel Ether Drag
Michelson Morley Experiment
The Failures are Explained by H. A. Lorentz
What you should know
Background reading: J. Schwartz and M. McGuinness,
Einstein for Beginners. New York: Pantheon. pp. 1-82.
We now take Einstein's special theory of relativity for
granted. The evidence in its favor is quite massive, so
that there is little license for skepticism. Our real task is
to learn the theory and there are many text books that
develop it in an easy to understand fashion.
In 1905, however, when Einstein first introduced it, it
was a strange and even shocking theory. Then Einstein
did not have the luxury of a simple text book on special
relativity from which he could learn the theory.
Somehow he had to see that such a theory was needed.
And then he had to devise the theory and know it was
not crazy speculation. How did he do it? That is the
present topic: the history of Einstein's discovery of
special relativity. We shall see that Einstein had no
crystal ball. He worked with resources and methods
available to everyone. That is the fascination of the
episode. We shall see how he took the same pieces
everyone had and assembled a masterpiece where
everyone else faltered.
Before we look at Einstein's deliberations, we need to
see what came before. That provided Einstein with
the foundation upon which he could build the special
theory of relativity.
Origins of the Principle
of Relativity
The principle of relativity tells us that we cannot detect
our uniform motion. That idea became important to
physics in the seventeenth century. After Copernicus, it
gradually became accepted that the earth was not
motionless at the center of the universe. Instead it
spun on its axis and orbited the sun. Yet, as the ancient
Greeks were quick to point out, if the earth moved, why
didn't we have some sensation of the movement?
Nicholas Copernicus
Earth in the center, sun orbits: replaced by sun in the center, earth orbits.
Isaac Newton
If Copernicus' idea was to survive, physics would have to be
renewed so that one's own motion would be undetectable; that
is, so that it satisfied a principle of relativity. As far as
observable things were concerned, the physics Newton
developed in the seventeenth century satisfied this principle.
For example, he associated forces with acceleration and not
simply motion. So, no matter how fast a body moved, as long
as it was not accelerating, no force acted on it.
Light
What altered this happy arrangement in the
nineteenth century were advances in the theory of
light. Newton had supposed that light consisted of
rapidly moving corpuscles; they obeyed the
principle of relativity as much as anything else in
his universe. Following work of Fresnel and others
early in the nineteenth century, this account was
replaced by one of light as a propagating wave.
Newton splits light into its component colors
One of the most important indications that light was
a wavelike process was the discovery of
interference, shown below in Thomas Young's
famous two slit experiment. Two light sources
produce the characteristic interference patterns
familiar to anyone who has thrown two pebbles into
a calm pond.
If light was a wave, it was assumed that the wave must
be carried by some medium, just as sound waves are
carried by air. That medium was known as the
luminiferous (=light bearing) ether. So the moving
earth was now supposed to be moving through a
medium that must stream past the earth, much as water
streams past a boat moving through the ocean.
Ether Current Experiments Fail
This ether now made it plausible that our planet's absolute
motion might be detectable by experiments on the earth.
All we had to do was to seek to see the current of ether
flowing past. It proved quite easy to devise experiments to
do this. Recall that the ether carries light waves, much as
air carries sound waves or water, water waves. So if the
ether is flowing past us, that flow ought to be revealed in
measurements on light.
A series of experiments were devised in the 19th century to detect
this ether current. They were experiments on light. Typically they
involved the passing of light through a combination of prisms, lenses
and the like, creating interference fringes and then looking for an effect
in these fringes. The striking result of all these experiments was that
the flow of ether had no effect on optical experiments. In that sense,
all the experiments failed. Curiously, it was as though the earth
just happened to be at perfect rest in the ether. In retrospect, this is
a puzzling outcome. At the time, however, there was nothing like
the sense of crisis you might expect. Rather it had become a simple
regularity of experiment that the ether drift was invisible to us.
In some ways the attitude was not so different from what we now take
to be a reasonable attitude to atoms. We know that they are there. Yet
at the same time we know that they are so small that no (19th century)
instrument will allow us to see them individually.
The experiments could be catalogued according to their sensitivity.
The least sensitive and easiest to conduct were so-called "first order"
experiments. Many were undertaken and all failed to demonstrate an
ether current.
Note for the techies who have to know what "first order" means: First
order experiments produced results that are proportional to the speed
of the ether, as a fraction of the speed of light. Second order
experiments produced effects that vary with the square of this
fractional quantity. Since this fraction is very small, its square is
smaller still. That means that second order experiments produce
effects that are very much harder to detect than first order
experiments.
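To get a feel for the sizes involved, here is a minimal sketch in Python. It assumes that the ether current flows at the earth's orbital speed, taken as a round 30 km/s; that figure and the variable names are illustrative assumptions, not numbers given in the text.

```python
# Rough sizes of first and second order effects.
# Assumption: the ether current speed equals the earth's orbital
# speed, taken here as a round 30 km/s.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in km/s
ether_speed_km_s = 30.0             # assumed speed of the ether current

first_order = ether_speed_km_s / SPEED_OF_LIGHT_KM_S   # the fraction v/c
second_order = first_order ** 2                        # its square

print(f"first order  v/c     = {first_order:.1e}")
print(f"second order (v/c)^2 = {second_order:.1e}")
```

Under this assumption a first order effect is roughly one part in ten thousand, while a second order effect is roughly one part in a hundred million.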
Fresnel Ether Drag
That all first order experiments failed to reveal the
earth's motion should, you might expect, have been very
puzzling. However it soon ceased to be mysterious. It
could be explained by a single hypothesis, the Fresnel
"ether drag" hypothesis. It supposed that the ether
was dragged partially by optically dense media (the
lenses and other media used in optical experiments) by
an amount tuned directly to the medium's refractive
index. It turned out that amount could be selected so
that it would exactly cancel out any possible first order
effect of an ether current.
What is the refractive index? When light enters a
dense optical medium like glass, it slows down. The
refractive index measures the amount of slowing. A
refractive index of 1.5, a common figure for ordinary
glass, means that light moves at 1/1.5 = 2/3 as fast as
light in a vacuum. The greater the refractive index, the
more the light is slowed and, as a result, the more the
light is bent when it enters the medium.
Here's how the drag hypothesis worked. Light waves
are carried by the medium of the ether, just as water
waves are carried by water and sound waves by air. If
the water or the air is moved at some speed, then that
speed will be added to the speed of the water or sound
waves. The same would be expected in the case of light
if the ether is moved. The motion of the ether must
be added to the motion of the light it carried.
But what does it take to move the ether?
Consider a glass block. Since light waves
pass through it, there must be ether inside it
to carry the waves. If the block moves, does
the ether move with it? The simplest case is
that it does not. Then, it is as if the glass
block is a perfectly porous sieve that lets the
ether flow freely through it.
This is the case of no ether drag illustrated
opposite. A light wave propagates in the
ether of empty space horizontally from the
left towards the block, which is moving
vertically. The light passes through the block
without any deflection from the vertical
motion of the block. That is because the
ether is undragged; it is left behind fully by
the moving block and takes on none of the
block's motion.
Now take the opposite case. It arises when
the ether is fully trapped by the glass block
and moves with it, much as air trapped
inside a closed car moves with the car. In
this case, the ether moves vertically with the
glass block, with the same speed as the
glass block. As a result, the horizontal
light wave is deflected vertically with the full
motion of the glass block. This is full ether
drag.
Finally, there are a myriad of intermediate
cases, in which the ether is only partially
dragged by the glass block. In these cases,
the glass block acts as a more or less
porous sieve communicating less or more of
its motion to the ether. These are the cases
of partial ether drag. In these cases, the
light wave is only partially deflected from its
horizontal motion.
Assuming just the right amount of partial
drag tuned exactly to the glass' refractive
index was enough to eradicate any positive
sign of our apparatus' motion through the
ether in first order experiments.
Tuning the Fresnel
Ether Drag
But what is just the right amount of partial drag? And
why should it be tuned so precisely to the refractive
index of the optical medium? We can see how this
comes about if we pursue just one simple
experiment that we might try to use to detect the
earth's motion through the ether. It is just one
experiment. However things work out the same in many
other experiments.
To begin, imagine that we are on an earth that is perfectly at
rest in the ether and that we receive light from a distant star
that is exactly overhead. That starlight would penetrate a
glass block as shown in the figure. The light would descend
vertically and keep moving vertically in the block.
Now take the same case but add the fact that the earth
we are standing on moves horizontally.
In the ether frame of reference, the light will continue to
descend vertically towards the block. But what
happens to the light when it enters the moving block?
The possible effects of the motion of the block on
the propagation of the light in the block are shown in
the figure. The light in the block may be either
undragged, partially dragged or fully dragged. Which
trajectory the light follows depends on the amount of
ether drag.
Now transform our viewpoint to that of the observer
moving with the block. The figure shows the same system,
just redescribed by the moving observer. The three
possible effects of the block's motion on the light are
shown again.
There is a second effect. If we change our point of view
to one that moves with the block, there is a
corresponding alteration in the light ray outside the block.
The vertically propagating light acquires an extra motion
opposite to that of our motion. The light that descended
vertically in the ether is now found to be descending
obliquely as a result of this acquired horizontal motion.
This effect is widely recognized in astronomy and was
observed in starlight in the 18th century. It is known as
"stellar aberration" and is manifested in a slight
angular shift in the apparent positions of stars, in
coordination with the earth's motion.
The effect is familiar. Imagine rain falling vertically. If you
drive through the rain in a car, the vertically falling rain
will acquire a component of horizontal motion towards
you and splash onto the windscreen.
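The tilt can be estimated with simple classical velocity addition. The sketch below is only illustrative: the earth's orbital speed of about 30 km/s, the rain speed and the car speed are assumed round numbers, and the function name is hypothetical.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458
earth_speed_km_s = 30.0   # assumed round figure for the earth's orbital speed

def aberration_angle(observer_speed, signal_speed):
    """Tilt from the vertical of a vertically falling signal, as judged
    by an observer moving horizontally (classical velocity addition)."""
    return math.atan2(observer_speed, signal_speed)

# Starlight seen from the moving earth, converted to arcseconds.
starlight_tilt = aberration_angle(earth_speed_km_s, SPEED_OF_LIGHT_KM_S)
tilt_arcsec = math.degrees(starlight_tilt) * 3600

# Rain analogy: rain falling at 9 m/s seen from a car moving at 25 m/s
# (both numbers are hypothetical).
rain_tilt_deg = math.degrees(aberration_angle(25.0, 9.0))

print(f"starlight tilt: {tilt_arcsec:.1f} arcseconds")
print(f"rain tilt:      {rain_tilt_deg:.0f} degrees")
```

With these assumed speeds the starlight is tilted by roughly 20 arcseconds, a slight shift, while the rain is tilted steeply towards the windscreen.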
The pressing question is whether we can use this
effect of stellar aberration to determine that we on earth
are moving in the ether. That is, can we distinguish this
case from one in which we are at rest in the ether and
the star is moving towards us with the same relative
velocity? We could use this effect to determine our
absolute motion in the ether if the incident ray of light
differed in any behavior from a ray of light arriving
obliquely at the glass block when the block is at rest in
the ether.
The behavior of a light ray obliquely incident onto a glass
block is well understood from the study of refraction in
elementary optics. The incident ray is bent towards a line
perpendicular to the block's surface. The amount the
refracted ray is bent depends upon the refractive index of
the glass according to Snell's law. The greater the refractive
index, the greater the deflection.
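Snell's law can be stated as sin(i) = n sin(r), where i is the angle of the incident ray and r the angle of the refracted ray, both measured from the perpendicular. The following sketch assumes light entering the glass from a vacuum; the incident angle and the refractive indices are illustrative values only.

```python
import math

def refracted_angle_deg(incident_angle_deg, refractive_index):
    """Angle of the refracted ray from the perpendicular, for light
    entering the medium from vacuum: sin(i) = n * sin(r)."""
    sin_r = math.sin(math.radians(incident_angle_deg)) / refractive_index
    return math.degrees(math.asin(sin_r))

incident = 30.0                 # illustrative incident angle in degrees
for n in (1.3, 1.5, 1.7):       # illustrative refractive indices
    r = refracted_angle_deg(incident, n)
    print(f"n = {n}: refracted ray at {r:.1f} degrees, "
          f"bent by {incident - r:.1f} degrees")
```

The loop shows the trend described above: the larger the refractive index, the closer the refracted ray lies to the perpendicular and so the more it has been bent.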
We cannot infer our motion through the ether from the light
striking a moving glass block, as long as the light incident
on the moving block bends in just the same way as incident
light is refracted by a block at rest in the ether. That means
that the partial drag of the ether must simulate this
refractive effect exactly, so that the partially dragged ray
above must be bent through just the same angle as it is in
ordinary refraction.
This is how the Fresnel drag has to be tuned exactly to
the refractive index of the optical medium. The greater the
refractive index, the more the refracted ray is bent and, as a
result, the greater the amount of ether drag needed to
simulate it.
For those of you who have to know the formula that specifies the tuning, it
is just this. The amount of drag is the velocity of the optical medium in the
ether multiplied by (1 - 1/n^2), where n is the refractive index.
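The tuning can be checked numerically for the starlight experiment. The sketch below is a hedged illustration, not Fresnel's own calculation. It takes the drag fraction to be 1 - 1/n^2, treats velocities classically, and uses illustrative values for n and for the block's speed as a fraction of the speed of light. All function names are hypothetical.

```python
import math

def fresnel_drag_fraction(n):
    """Fraction of the medium's velocity imparted to the ether inside it."""
    return 1.0 - 1.0 / n**2

def angle_in_block_from_drag(v_over_c, n):
    """Tilt of the ray inside the moving block, judged from the block.
    In the ether frame the light drifts sideways at (drag * v) and
    descends at c/n; the block's own sideways speed v is then subtracted."""
    sideways = (1.0 - fresnel_drag_fraction(n)) * v_over_c
    downward = 1.0 / n
    return math.atan2(sideways, downward)

def angle_in_block_from_snell(v_over_c, n):
    """Tilt of the ray if the aberrated starlight is simply refracted
    by a block at rest in the ether, using Snell's law."""
    incident = math.atan2(v_over_c, 1.0)   # stellar aberration angle
    return math.asin(math.sin(incident) / n)

n, v_over_c = 1.5, 1.0e-4   # illustrative values
drag_angle = angle_in_block_from_drag(v_over_c, n)
snell_angle = angle_in_block_from_snell(v_over_c, n)

print(f"drag fraction:        {fresnel_drag_fraction(n):.4f}")
print(f"angle from drag:      {drag_angle:.6e} rad")
print(f"angle from Snell:     {snell_angle:.6e} rad")
```

The two angles agree up to terms far smaller than first order in the speed, which is the sense in which the partially dragged ray imitates ordinary refraction.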
We see here for the first time something that we will see
again. We have an experiment that we first expect to be
able to reveal the earth's motion through the ether. We
might expect that the light of distant stars would behave
differently in optical media that move in the ether.
However a second effect arises, partial ether drag, and
it exists in exactly the amount needed to cancel out
any positive result that would affirm motion in the ether.
There was a complication. A widely known property of
glass is that it refracts light differently for different colors.
That is, its refractive index varies with the frequency of
the light. This is what enables a prism to split light into
its different colors and is responsible for the chromatic
aberration of lenses that lens designers try so hard to
avoid. The odd outcome of this fact is that light of
different frequencies will be associated with different
amounts of ether drag, according to Fresnel's formula.
In effect that means that each frequency of light has its
own ether. That was a troubling thought even in the 19th
century.
Michelson-Morley Experiment
After first order experiments came second order experiments. These were
a great deal more sensitive to any ether current. They were, however, also
a great deal harder to carry out. There was only one successfully
executed in the 19th century, the celebrated experiment of Albert A.
Michelson and Edward W. Morley of 1887 that completed Michelson's
earlier efforts at such an experiment. Indeed the experiment was so
difficult that Michelson won the Nobel prize principally for his highly
sensitive optical interferometer used in the experiment.
Their original paper is quite readable.
Pages from Michelson and Morley's paper in the American Journal of Science (Third Series).
The basic idea of the experiment is that light moves
differently on a moving earth according to whether it
propagates transverse to the direction of the earth's
motion or parallel to the direction of the earth's motion.
In the first case the ether current flows across the
propagating light, slowing it a little. In the second case,
it provides a kind of head wind that slows the light more
or a tail wind that speeds it up.
Here is a schematic picture of the way the experiment
sought to look for these differences.
A light source sends a beam of light to a half silvered
mirror that splits the beam in two. One half continues in
the same direction; the other is sent off at 90 degrees.
They both strike mirrors at equal distances which reflect
them back to a place where they can be viewed. That
the mirrors are placed at equal distances from the half
silvered mirror is represented by the two rods of equal
length in the figure that connect them.
You can grasp the way the experiment works most
simply if you imagine not a beam of light, but merely a
pulse of light, as shown in the figure. Since the
distances to the two mirrors are the same, the two
pulses will require the same time to traverse the
distance out and back and they will be detected at the
same time.
In practice, pulses are not used. A steady light beam is
used. Any difference in propagation time will be
manifested by the peaks and troughs of the waves
misaligning when they are combined at the detecting
screen. The combining of these two waves produces
interference fringes at the detecting screen. So any
change in the alignment is revealed as a change in
the interference fringes.
In use, the apparatus is turned very slowly so that the
ether current passes over it from successively different
directions. During this turning, the ether current affects
the light traveling in the two directions differently and
these changes are expected to be manifested as
changes in the observed interference patterns.
Imagine, for example, that the horizontal direction in the figure below
aligns with the direction of motion of the earth in the ether. Then,
thinking classically, we expect the ether current to lengthen the travel
time of a light pulse making the round trip in the direction transverse
to the ether current. The net effect of the ether current on the
pulse that makes the round trip parallel to the ether current is
an even greater slowing. So, as the figure shows, by the time the
transverse pulse reaches the detector, the longitudinal pulse is still
traversing the apparatus.
These differences in arrival times will change as the apparatus rotates
and they will be manifested as changes in the observable
interference fringes.
For example, in an extreme and unrealistic case, if the apparatus moves
at .866c, then the transit time for the transverse pulse is doubled;
and it is quadrupled for the longitudinal pulse.
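The doubling and quadrupling at .866c can be checked with the standard classical travel-time formulas for light carried by an ether current. The text does not state these formulas, so the sketch below, with hypothetical function names and units in which c = 1, is an assumption about how the figures arise.

```python
import math

def transverse_round_trip(arm_length, v, c=1.0):
    """Classical round-trip time across the ether current."""
    return (2 * arm_length / c) / math.sqrt(1 - (v / c) ** 2)

def longitudinal_round_trip(arm_length, v, c=1.0):
    """Classical round-trip time along the ether current:
    out against the head wind at c - v, back with the tail wind at c + v."""
    return arm_length / (c - v) + arm_length / (c + v)

arm = 1.0                     # arm length, arbitrary units
v = math.sqrt(3) / 2          # about .866c, in units where c = 1
at_rest = 2 * arm             # round-trip time with no ether current

transverse_ratio = transverse_round_trip(arm, v) / at_rest
longitudinal_ratio = longitudinal_round_trip(arm, v) / at_rest

print(f"transverse time grows by a factor of   {transverse_ratio:.3f}")
print(f"longitudinal time grows by a factor of {longitudinal_ratio:.3f}")
```

At realistic speeds both factors differ from one only at second order in v/c, which is why the experiment needed such a sensitive interferometer.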
The result was negative. Michelson and Morley
found shifts in the interference fringes, but they were
very much smaller than the size of the effect expected
from the known orbital motion of the earth.
The Failures are Explained by
H. A. Lorentz
The outcome of the 19th century tradition of
experiments aimed at detecting the ether current was
negative. The wave theory of light of the 19th century
depended upon this ether. It was what carried the light
wave, just as air carries sound waves. Yet no
experiment could show the direction or magnitude of the
ether current.
The puzzle was deepened and broadened by the end of
the 19th century through the assimilation of optics into
Maxwell's theory of electric and magnetic fields. In the
1860's, Maxwell showed that a light wave is really a
wave of electric and magnetic fields, an
electromagnetic wave. So now the luminiferous ether
was also the ether that carried these fields.
How is it possible for Maxwell's electrodynamics to
be based fundamentally upon the notion of an ether, yet
no experiment can reveal the magnitude and direction of
the ether current? This was the problem taken up and
solved brilliantly by the great Dutch physicist H. A.
Lorentz.
Lorentz first simplified Maxwell's theory into the form
that it is routinely taught today. All matter, he proposed,
simply consists of electric charges (called "ions" or
"electrons") in the empty space of the ether. He then
proceeded to show how electrodynamical theory could
explain the failure of the experiments to produce a
result.
If an optical medium just consists of such charges, Lorentz could show
that an electromagnetic wave propagating through it would be affected in
exactly the way Fresnel's ether drag hypothesis required. The ether was
not really dragged in Lorentz's account. His was a fixed, immobile ether.
Rather the charges that made up the medium were excited by the light wave
as it passed through. They absorbed energy from the light and re-emitted
it. When the incident and re-emitted light were combined, the net effect
was a slowing of the propagation of light that matched exactly the effect
of Fresnel's hypothesis. The ether was not dragged; it just looked like it
was. The amount that light slowed in media in Fresnel's hypothesis was no
longer a supposition but a demonstrated result in electrodynamics. That
explained why all first order experiments failed.
The second order Michelson-Morley experiment was a little harder.
There was a solution suggested by the fact that classically light needs
more time to make the longitudinal round trip than the transverse one.
So what if the apparatus contracted in length longitudinally?
Then the longitudinal pulses would need less time to make the round
trip and the negative result could be restored. The result would look
something like this:
The figure shows the extreme and unrealistic case of motion at .866c.
The apparatus would have to contract 50% longitudinally.
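A short calculation shows how a longitudinal contraction restores equal travel times. It is a sketch under stated assumptions: the classical travel-time formulas and the contraction factor sqrt(1 - v^2/c^2) are the standard textbook expressions rather than formulas quoted in the text, the function names are hypothetical, and units are chosen so that c = 1.

```python
import math

def contraction_factor(v, c=1.0):
    """Factor by which the arm along the motion would have to shrink."""
    return math.sqrt(1 - (v / c) ** 2)

def transverse_round_trip(arm_length, v, c=1.0):
    """Classical round-trip time across the ether current."""
    return (2 * arm_length / c) / math.sqrt(1 - (v / c) ** 2)

def longitudinal_round_trip(arm_length, v, c=1.0):
    """Classical round-trip time along the ether current."""
    return arm_length / (c - v) + arm_length / (c + v)

arm = 1.0                     # uncontracted arm length, arbitrary units
v = math.sqrt(3) / 2          # about .866c, in units where c = 1

factor = contraction_factor(v)
t_transverse = transverse_round_trip(arm, v)
t_longitudinal = longitudinal_round_trip(arm * factor, v)  # contracted arm

print(f"contraction factor: {factor:.3f}")
print(f"transverse time:    {t_transverse:.3f}")
print(f"longitudinal time:  {t_longitudinal:.3f}")
```

With the longitudinal arm shortened by this factor, the two pulses return together and no shift in the fringes is expected as the apparatus rotates.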
What Lorentz was able to show was that Maxwell's
theory of electromagnetism predicted precisely
this much longitudinal contraction. To get this result,
Lorentz modeled matter composing a body as a large
collection of electric charges, all held together in
equilibrium by electric and magnetic forces.
The equilibrium was disturbed if the entire object was set in
motion. Moving electric charges create magnetic fields that
in turn act back on electric charges. All these changes settle
out into a new equilibrium configuration. What Lorentz
could show was that the new configuration consists in a
contraction of the body in the direction of motion in just the
amount needed to eradicate a possible result from the
Michelson-Morley experiment.
The catch was that matter probably couldn't consist just of electric
charges held by electric and magnetic forces. There had to be
other forces as well. They had to be there, for example, to prevent
Lorentz's electrons blowing themselves apart under the mutual
repulsion of the like charges in different parts of an electron. So
Lorentz simply supposed that these other forces would
behave just like electric and magnetic forces and yield the
same result.
This type of reasoning was later denounced as "ad hoc"; that is, an
hypothesis cooked up specifically to solve one problem, but with no
independent support from anywhere else. For example, an ad hoc
explanation of why I cannot find my keys is that an evil key-hiding
gremlin has hidden them.
The 20th century opened with the Maxwell-Lorentz
theory of electrodynamics as the most successful
physical theory of the era. While that theory was based
essentially on the existence of an ether, the failure to
detect ether currents was no longer a puzzle, but a
prediction of the theory. Lorentz showed that the
theory entailed effects whose combined import was to
make the ether current invisible and the absolute motion
of the earth undetectable by us. We might be moving
through the ether at some definite speed and in some
definite direction. But the physics of electrodynamics
conspired to prevent us ever measuring that speed and
direction.
At the time this seemed like a perfectly satisfactory
resolution of the puzzle of the failure of all ether drift
experiments. It is only if you know what is coming next
that you find the resolution awkward. Or, if you are
Einstein, you see more in the resolution than others then
did.
A final remark: the schematic drawing of the Michelson-Morley
experiment above may seem oddly familiar. In fact we have
already seen its essential content before. The two arms of
the apparatus are light clocks. You will recall that we
computed the relativistic contraction effect from the condition
that moving light clocks, one transverse to and one parallel to
the direction of motion, must tick at the same rate. This is the
same contraction that figures in Lorentz's account.
What you should know
What is the luminiferous ether?
What were ether current experiments of the nineteenth century and what were their
outcomes?
What was Fresnel's ether drag hypothesis?
How was the Michelson-Morley experiment set up?
How did H. A. Lorentz explain these outcomes?
Copyright John D. Norton. January 2001, September 2002; July 2006; January 2, 2007; January 21, February 4, 2008; January 15, 17, 27, 2010.
Einstein's Pathway to Special Relativity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/origins_pathway/index.html[28/04/2010 08:19:02 ﺹ]
HPS 0410 Einstein for Everyone
Einstein's Pathway to Special Relativity
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Chasing a beam of light
Magnet and conductor
Emission theories of light
Crisis: the relativity of simultaneity
The turn to principles
Three Components
Einstein's 1905 "On the Electrodynamics of Moving Bodies"
What you should know
Background reading: J. Schwartz and M. McGuinness,
Einstein for Beginners. New York: Pantheon. pp. 1-82.
We have now reviewed the developments in the physics of
moving bodies, of light, of electricity and magnetism that
produced the physics Einstein found when he began to
think about ether, electricity, magnetism and motion.
It was pondering these developments that led Einstein to
discover the special theory of relativity in 1905. The discovery
was not momentary. The theory was the outcome of, in
Einstein's own reckoning, seven and more years of work.
He even places one of his early landmarks in a thought
experiment he had at the age of 16, in 1896, nine years before
the year of miracles of 1905. Unfortunately we have only
fragmentary sources to document the years of this struggle.
Below I identify a few of the major ones.
The story of Einstein's discovery of special relativity has
exercised an almost irresistible fascination on many, in spite
of the dearth of sources. So, if you read more widely, you
will see much speculation over how to fill in the blanks
between the known landmarks and even over which are the
important landmarks. Some of it is responsible; some is not.
Chasing a beam of light
Einstein in high school
Writing a half century later in 1946 in his Autobiographical
Notes, Einstein recounted a thought experiment conducted
while he was a 16 year old student in 1896 that marked his
first steps towards special relativity.
"...a paradox upon which I had already hit at the age of
sixteen:
If I pursue a beam of light with the velocity c (velocity of light in
a vacuum), I should observe such a beam of light as an
electromagnetic field at rest though spatially oscillating.
There seems to be no such thing, however, neither on the
basis of experience nor according to Maxwell's equations.
From the very beginning it appeared to me intuitively clear
that, judged from the standpoint of such an observer,
everything would have to happen according to the same laws
as for an observer who, relative to the earth, was at rest. For
how should the first observer know or be able to determine,
that he is in a state of fast uniform motion?
One sees in this paradox the germ of the special relativity
theory is already contained."
The basic thought is clear. If Einstein were to chase after a
propagating beam of light at c
he would see a frozen light wave
and that Einstein deemed impossible.
At first it seems that it will be simple to figure out just what is
worrying Einstein. He states a few simple reasons. I don't
want to go into them here since they actually turn out to be
rather hard to disentangle. My best effort to disentangle
them is given at "Chasing a Beam of Light: Einstein's Most
Famous Thought Experiment,"
http://www.pitt.edu/~jdnorton/Goodies/Chasing_the_light
Magnet and conductor
Einstein's thinking evolved from this early, youthful flight into
richer and technically more detailed scrutiny of motion in
Maxwell's electrodynamics. Einstein initially took the idea of
an ether state of rest seriously and conceived experiments
that were designed to reveal the earth's motion through the
ether.
These thoughts eventually took a very different turn with
Einstein deciding that the ether state of rest had no place
in electrodynamics and that the principle of relativity was to be
upheld. The decisive moment seems to have come with a
thought experiment, the magnet and conductor, that is
recounted in the opening paragraph of Einstein's 1905 paper.
This is a version of that thought experiment that is modified slightly
from the way Einstein sets it up. (Caution!)
The simple idea behind the thought experiment is that Maxwell's
electrodynamics treats a magnet at rest in the ether very differently
from one that moves in the ether. A magnet at rest is surrounded
by a magnetic field only.
However, if the magnet moves through
the ether, things are very different. In
addition to the magnetic field, a new entity
comes into being around the magnet, an
induced electric field.
The creation of the electric field draws on details of Maxwell's theory
that need not distract us here. Briefly, as the magnet moves past a
fixed point in the ether, the magnetic field strength changes with time
at that point. That change in field strength, according to Maxwell's
theory, creates an electric field.
This difference between the two cases seems to provide an
unequivocal marker of motion through the ether, or so
it would seem. To determine if a magnet is moving absolutely
through the ether or not, one merely needs to look for that
induced electric field. That is easy to do. An electric field
accelerates electric charges, such as the conducting electrons
in a piece of wire, a conductor. So all that has to be done is to
place a conductor near the magnet, as the figures show, and
to look for an induced electric current. If there is one, then
there is an induced electric field and the magnet is moving; if
there isn't one, then the magnet is at rest in the ether.
It all seems so straightforward. But it doesn't work. The
simplest situation arises if we attach the conductor to the
magnet so that it moves or rests with the magnet. If the
magnet is at rest in the ether, then there will be no current in
the conductor. So far, it is as expected. But if the magnet and
conductor move together, an extra complication enters.
Because the conductor is now moving absolutely in a
magnetic field, another part of Maxwell's theory tells us that a
second electric current will be induced in the conductor.
Remarkably that second current flows in the opposite direction
to the one produced by the electric field and it turns out to
cancel it out exactly.
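The cancellation can be sketched in symbols. This is a sketch in modern vector notation, not Einstein's own formulation. The symbols are assumptions introduced here for illustration: v is the common velocity of magnet and conductor through the ether, B the magnetic field, B_0 the field of the magnet at rest, E the induced electric field and q the charge of a conduction electron.

```latex
\begin{align}
  \mathbf{B}(\mathbf{r},t) &= \mathbf{B}_0(\mathbf{r}-\mathbf{v}t)
    \quad\Rightarrow\quad
    \frac{\partial \mathbf{B}}{\partial t} = -(\mathbf{v}\cdot\nabla)\,\mathbf{B}, \\
  \nabla\times\mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t}
    \quad\text{is satisfied by}\quad
    \mathbf{E} = -\,\mathbf{v}\times\mathbf{B}, \\
  \mathbf{F} &= q\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) = 0 .
\end{align}
```

The first term in the force is the effect of the induced electric field. The second is the effect of the conductor's motion through the magnetic field. They are equal and opposite, so no current flows.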
The upshot is that checking for an electric current in the
conductor fails as a means of distinguishing the absolute rest
of the magnet from its motion. In both cases, the current is
the same: no current at all. So an Einstein riding with an
absolutely moving magnet would detect no current and find
the situation to be indistinguishable from absolute rest as far
as the observable currents were concerned.
More curiously, it is as if the electric field just isn't there for an
observer moving with the magnet. But one at rest in the ether
would say there is an electric field present.
Einstein later described how this realization had affected
him quite profoundly:
"In setting up the special theory of relativity, the following ...
idea concerning Faraday’s magnet-electric induction
[experiment] played a guiding role for me...
[magnet conductor thought experiment described].
...The idea, however, that these were two, in principle different
cases was unbearable for me. The difference between the two,
I was convinced, could only be a difference in choice of
viewpoint and not a real difference. Judged from the [moving]
magnet, there was certainly no electric field present. Judged
from the [ether state of rest], there certainly was one present.
Thus the existence of the electric field was a relative one,
according to the state of motion of the coordinate system
used, and only the electric and magnetic field together could
be ascribed a kind of objective reality, apart from the state of
motion of the observer or the coordinate system. The
phenomenon of magnetoelectric induction compelled me to
postulate the (special) principle of relativity.
[Footnote] The difficulty to be overcome lay in the constancy of
the velocity of light in a vacuum, which I first believed had to
be given up. Only after years of [jahrelang] groping did I notice
that the difficulty lay in the arbitrariness of basic kinematical
concepts."
Einstein, Albert (1920) “Fundamental
Ideas and Methods of the theory of
Relativity, Presented in Their
Development,” Collected Papers of
Albert Einstein, Vol. 7, Doc. 31.
Einstein in 1920
In sum, Einstein's lesson was this. Maxwell's theory
employed an ether state of rest; but that state of rest could not
be revealed by observation. So somehow the principle of
relativity needed to be upheld.
In retrospect, this relativity of the induced
electric field had, in effect, committed Einstein to
the relativity of simultaneity, although he certainly
did not know it at the time. A simple thought
experiment shows that it can only be reconciled
with Maxwell's electrodynamics if we give up the
absoluteness of simultaneity. See From the
Magnet and Conductor to the Relativity of
Simultaneity on my "Goodies" page.
And a second moral was an unexpected relativity.
Prior to Einstein, it had been thought that whether an
electric field is present at some place is an absolute
fact. Einstein now concluded that it is observer
dependent: some observers will judge an electric field
to be present; others in a different state of motion will
not. This was the first of Einstein's reorganizations of our
ideas of which quantities are absolute and which
relative.
Emission theories of light
The magnet and conductor thought experiment marked the
way forward for Einstein. He was to uphold the principle of
relativity in electrodynamics. The only obvious way of doing
that was to modify electrodynamical theory. As the
concluding footnote in Einstein's quote from 1920 above
suggests, Einstein could already know one element that must
be in the modification. According to Maxwell's theory, light
always propagates at c with respect to the ether. That result
must change if the theory conforms to the principle of relativity
since there will no longer be an ether state of rest against
which the motion of the light can be judged.
Walther Ritz
We know from later recollections what one of Einstein's modified
versions of electrodynamics looked like. In that version, the velocity of
light is a constant, not with respect to the ether, but with respect to the
source that emits the light. Such a theory is called an "emission"
theory of light and, if the other parts of the theory are well behaved,
will satisfy the principle of relativity.
Einstein later recalled that the theory he developed was
essentially that developed later by Walther Ritz in 1908.
In Ritz's theory, and thus probably also in Einstein's
theory, all electrodynamic action, not just light,
propagated in a vacuum at c with respect to the action's
source. The essential change is shown in the animation:
For experts: the way to build the theory was
actually very easy. If Maxwell's theory is
formulated in terms of retarded potentials, one
needs only to tinker with the formula for the
retardation time to bring the whole theory into
the form of an emission theory. Everything else
can stay the same.
In Maxwell's theory, all electrodynamic action, generated by a
source charge at some moment, propagates at c from the fixed point
in the ether occupied by the source at that moment.
In a Ritz-style emission theory, all electrodynamic action,
generated by a moving source, propagates at c from a point that
moves at uniform velocity with the source.
Here is a non-animated version:
My own best effort to reconstruct the details of Einstein's
theory can be found in "Einstein's Investigations of Galilean
Covariant Electrodynamics prior to 1905," Archive for History
of Exact Sciences, 59 (2004), pp. 45-105.
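The contrast between Maxwell's theory and a Ritz-style emission theory can be sketched numerically. The following is only an illustrative sketch in one spatial dimension, not a reconstruction of Einstein's or Ritz's actual theory. The function names and the labels "maxwell" and "emission" are hypothetical, and positions and velocities are taken relative to the ether frame.

```python
C = 299_792_458.0  # speed of light in m/s


def wavefront_center(x_emit, v_source, elapsed, theory):
    """Center of the spherical shell of action, `elapsed` seconds after emission.

    x_emit   -- position of the source in the ether frame at emission (m)
    v_source -- constant velocity of the source through the ether (m/s)
    theory   -- "maxwell" or "emission" (hypothetical labels for this sketch)
    """
    if theory == "maxwell":
        # The shell stays centered on the fixed ether point of emission.
        return x_emit
    if theory == "emission":
        # The center is carried along at the uniform velocity of the source.
        return x_emit + v_source * elapsed
    raise ValueError(f"unknown theory: {theory!r}")


def leading_edge(x_emit, v_source, elapsed, theory):
    """Position of the forward edge of the shell along the line of motion."""
    return wavefront_center(x_emit, v_source, elapsed, theory) + C * elapsed
```

In this sketch the forward edge of the shell advances through the ether at c in the Maxwell case, whatever the source is doing, and at c plus the source velocity in the emission case.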
Crisis: the relativity of
simultaneity
It was a lovely theory. But it didn't work. We can only guess what the
problems were. But we know he found many. Indeed Einstein seems to
have expended considerable energy trying to figure out if any emission
theory might work. His later recollections are littered with different
reasons for why no emission theory at all could do justice to
electrodynamics.
My own conjectures on how
these arguments may have
worked are discussed in
part in my "Chasing a Beam
of Light: Einstein's Most
Famous Thought
Experiment."
An emission theory fails. So Einstein would have found
himself in an impossible position. The speed of light cannot
vary with the speed of the emitter; presumably it must be a
constant, as Maxwell's theory had urged all along. Yet in
addition, Einstein was convinced that the principle of relativity
must obtain in electrodynamic theory. How can both
obtain? Together they require the speed of light to be the same
for all inertial observers.
The footnote already quoted above points us to Einstein's next
step.
"The difficulty to be overcome lay in the constancy of the
velocity of light in a vacuum, which I first believed had to be
given up. Only after years of [jahrelang] groping did I notice
that the difficulty lay in the arbitrariness of basic
kinematical concepts."
The key to the puzzle is the relativity of simultaneity. If Einstein
gives up the absoluteness of simultaneity, then the principle of
relativity and the constancy of the speed of light are
compatible after all. The price paid for the compatibility is
that we must allow that space and time behave rather
differently than Newton told us.
More importantly for Einstein's struggles of that time is an
extra bonus: it turns out that within the new theory of space
and time of special relativity, Maxwell's electrodynamics does
not need to be modified at all. It turns out to be compatible
with the principle of relativity just as it is. That would have been a
very satisfactory outcome for Einstein.
Einstein recounted later the moment of discovery. In a
lecture in Kyoto on December 14, 1922, he is reported by
Ishiwara, who took notes in Japanese, to have said:
"Why are these two things
inconsistent with each other? I felt
that I was facing an extremely
difficult problem. I suspected that
Lorentz’s ideas had to be modified
somehow, but spent almost a year on
fruitless thoughts. And I felt that was a
puzzle not to be easily solved.
But a friend of mine living in
Bern (Switzerland) [Michele
Besso] helped me by chance. One
beautiful day, I visited him and said to
him: ‘I presently have a problem that
I have been totally unable to solve.
Today I have brought this “struggle”
with me.’ We then had extensive
discussions, and suddenly I realized
the solution. The very next day, I
visited him again and immediately
said to him: ‘Thanks to you, I have
completely solved my problem.’
My solution actually concerned the
concept of time. Namely, time cannot
be absolutely defined by itself, and
there is an unbreakable connection
between time and signal velocity.
Using this idea, I could now resolve
the great difficulty that I previously
felt. After I had this inspiration, it took
only five weeks to complete what
is now known as the special theory of
relativity."
Translation from Stachel, John (2002) Einstein
from ‘B’ to ‘Z.’: Einstein Studies, Volume 9.
Boston: Birkhäuser, p. 185.
Einstein taking sake
A portrait of Einstein by the cartoonist Okamoto Ippei (1886-1948), done in
December of 1922 in Sendai, Miyagi Prefecture, Japan
This moment of recognition of the relativity of simultaneity is
one of the great moments of discovery in science and, at this
moment, philosophical reflections played a key role. Absolute
simultaneity seems an uncontroversial part of the world. How
could we give it up? Einstein had been reading many
philosophers, including Hume and Mach. They had
stressed that concepts are our servants, not our masters, and
they are warranted only in so far as they might be grounded
in experience. So was absolute simultaneity grounded
properly in experience? Einstein began to think about the
experiences that we use to establish simultaneity of events
and he realized that it was not. Reading these philosophers
gave him the courage to continue and to abandon absolute
simultaneity. In its place came the relativity of simultaneity.
David Hume
Ernst Mach
For an account of how reading Hume and Mach helped, see
my "How Hume and Mach Helped Einstein Find Special
Relativity."
The turn to principles
The moment of the recognition of the relativity of simultaneity
came, in the above account, 5 weeks prior to Einstein's
completion of the 1905 paper (and, in another account, 5 to 6
weeks prior). In
these five to six weeks in which he pulled together the pieces
of the finished theory, Einstein made one more very significant
methodological advance that would forever color how we
see relativity theory.
Einstein's pathway to discovery amounted to the recognition
that if you take Maxwell's electrodynamics seriously you have
to see that built into it is both the principle of relativity and a
new kinematics of space and time that supports it. Yet Einstein
does not simply argue it that way in the finished paper.
The reason is not hard to see. Just a few
months before completing his 1905 special relativity
paper, Einstein had published a paper in which he had
foreshadowed the demise of Maxwell's
electrodynamics! In his earlier light quantum paper,
Einstein had advanced the astonishing assertion that
sometimes light does not behave like a wave as
Maxwell's theory demanded; sometimes it behaved like
a spatially localized collection of energy.
So how could Einstein now base a new theory of space and
time on Maxwell's theory? He knew something was very right
about Maxwell's theory. There was also something very wrong
about it. How could one theorize in such an unstable
environment? The answer came to Einstein, as he reported
in his Autobiographical Notes, in a distinction of what he
called constructive theories from theories of principle.
"Reflections of this type made it clear to me as long ago as
shortly after 1900, i.e., shortly after Planck's trailblazing work,
that neither mechanics nor electrodynamics could (except in
limiting cases) claim exact validity. Gradually I despaired of
the possibility of discovering the true laws by means of
constructive efforts based on known facts. The longer and the
more desperately I tried, the more I came to the conviction
that only the discovery of a universal formal principle could
lead us to assured results. The example I saw before me was
thermodynamics. The general principle was there given in
the theorem: The laws of nature are such that it is impossible
to construct a perpetuum mobile (of the first and second kind).
How, then, could such a universal principle be found?"
In effect, what Einstein saw was that he did not really need all
of Maxwell's theory for his new account of space and time. He
needed only a few core ideas robust enough to survive
the coming quantum revolution. Following the model of
thermodynamics, these few core ideas would be advanced as
principles from which the entire theory could be deduced.
What could those principles be? The principle of relativity
itself was an obvious choice. He also needed something that
distilled the relevant essence of Maxwell's electrodynamics.
What about the hardest-won lesson of his years of work
towards the final theory: the recognition that an emission
theory of light must fail? That is, that Maxwell's theory was
right after all in demanding that light always propagates at
c, no matter how fast the emitter may be moving? That
became the second principle, the light postulate. Those two
principles proved to be sufficient to allow the entire theory to
be deduced. Einstein laid out both as his postulates and the
theory adopted its now familiar form.
Three Components
We have seen three components in Einstein's discovery:
Astute analysis of new and surprising experiments.
Deeply reflective philosophical analysis of the nature of
time and physical theories.
Solving an incongruous and overlooked problem in the
foundations of electricity and magnetism.
While all three had a role in Einstein's discovery, the last was
the most decisive. Unfortunately this is often overlooked in
accounts of the origins of Einstein's theory. Einstein's
engagement with current experiments and his facility in
philosophical analysis are important. However, special relativity
would not have come about at all were it not for the particular
problems in electrodynamics that Einstein addressed and that
demanded a radical solution.
Einstein's 1905 "On the
Electrodynamics of Moving
Bodies"
Einstein arrived at his "On the electrodynamics of
moving bodies," which is my best candidate for the
most famous scientific paper ever written.
An online version of this paper is here. Beware of a
famous misrendering in this standard edition as
noted in this version of the first two sections.
The paper has several parts. First there is an introduction. It
commences with the recounting of the magnet and
conductor thought experiment. It then announces the project
of solving the resulting problem with a new theory of space
and time based on the principle of relativity and the light
postulate.
[Figure: first page of Einstein's 1905 paper, "Zur Elektrodynamik
bewegter Körper; von A. Einstein," p. 891.]
In the first "Kinematical Part"
of the paper, Einstein develops
the parts of the theory devoted
only to space and time. In its first
section, "Definition of
Simultaneity," Einstein gives his
celebrated analysis of the
relativity of simultaneity. It is one
of the most celebrated
conceptual analyses of the
century and a model very many
others tried to follow.
The second "Electrodynamical
Part" proceeds to what must have
seemed for Einstein in 1905 to be
the real benefit of the paper. He
proceeded to show how Maxwell's
electrodynamics was already a
theory that conformed to the
principle of relativity and noted that
this fact made solution of some
problems in electrodynamics very
easy.
For a problem concerning moving
systems, such as the reflection of
light off a moving mirror, was really
the same as another much easier
problem with resting bodies, such
as the reflection of light off a
resting mirror. If you could solve
the easy problem, then the
principle of relativity let you write
down a solution to the harder one
almost immediately, just by
transforming your viewpoint from
one frame of reference to another.
For more on Einstein's discoveries of 1905, see my website.
What you should know
What Einstein at age 16 imagined it would be like to chase light.
His magnet and conductor thought experiment and what he learned from it.
How he tried to use emission theories of light.
The importance of his insight on simultaneity.
Why he chose to formulate the special theory in terms of two principles.
Copyright John D. Norton. January 2001, September 2002; July 2006; January 2, 2007; January 21, February 4, 2008; January 15, 2010.
Spacetime
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/spacetime/index.html[28/04/2010 08:19:48 ﺹ]
HPS 0410 Einstein for Everyone
Spacetime
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Why Spacetime?
Building a Spacetime
Light Cones
The Right Terminology
What you should know:
Why Spacetime?
So far all our discussion in special relativity has
involved the motion of bodies in space over time. If you
haven't already noticed, these motions can become
rather complicated to visualize. Recall how tough
it is to keep track of what the different ends of a
moving rod are doing as a light signal bounces back
and forth between them.
In 1907 the mathematician Hermann Minkowski
explored a way of visualizing these processes that
proved to be especially well suited to disentangling
relativistic effects. This was their representation in
spacetime. Quite puzzling relativistic effects could be
comprehended with ease within the spacetime
representation and work in the theory of relativity
started to be transformed into work on the geometry of
spacetime.
Building a Spacetime
We build a spacetime by taking instantaneous
snapshots of space at successive instants of time and
stacking them up. It is easiest to imagine this if we
start with a two dimensional space. The snapshots
taken at different times are then stacked up to give us
a three dimensional spacetime. In this spacetime, a
small body at rest will be represented by a vertical line.
To see why it is vertical, recall that it has to intersect
each instantaneous space at the same spot. A vertical
line will do this. If it is moving, it will intersect each
instantaneous space at a different spot; a moving body
is represented by a line inclined to the vertical.
A standard convention (that I will usually use) is to
represent trajectories of light signals by lines at 45° to
the vertical.
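The construction can be sketched in a few lines of code. This is a minimal illustration, assuming one spatial dimension and units in which the speed of light is 1, so that light signals come out at 45° as in the convention just described. The function names are hypothetical.

```python
import math


def worldline(velocity, times, x0=0.0):
    """Events (t, x) on the worldline of a body moving at constant velocity.

    Each time t labels one instantaneous snapshot of space; x is the spot
    where the body intersects that snapshot. Units are chosen so that c = 1.
    """
    return [(t, x0 + velocity * t) for t in times]


def angle_to_vertical_deg(velocity):
    """Angle between the worldline and the vertical (time) axis, with c = 1."""
    return math.degrees(math.atan(velocity))
```

A body at rest (velocity 0) meets every snapshot at the same spot, so its worldline is vertical. A light signal (velocity 1 in these units) is inclined at 45°, and slower bodies lie in between.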
In the figure, a moving rod is represented by the
trajectories in spacetime of its ends. The zigzag line is
a light signal bouncing back and forth between these
two ends.
Here's another example. Take snapshots of the earth
orbiting the sun in the three dimensional space
around the sun in the course of a year, which will look
like:
Now we stack them up
into the third dimension.
When we clean things up a little,
we have a spacetime.
So far we have described how a two dimensional
space is combined with the one extra dimension of
time to generate a three dimensional spacetime, such
as shown above in the figures. Our space is three
dimensional. So when we add the extra dimension of
time we generate a four dimensional spacetime.
There is no easy way to draw a picture of a four
dimensional spacetime. Visualizing it can be very
hard. But that does not make it mysterious. It is just
another sort of space that happens to transcend simple
visualization. In physics, four dimensions are actually
quite modest. In statistical mechanics, we routinely
deal with phase spaces whose dimension is 6 x the number
of molecules in a gas sample. For even small samples of
gas, that can come to 10^25, a space with
10000000000000000000000000 dimensions. So we
should not be too awed by a mathematical space with
only four dimensions!
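The count is simple arithmetic and can be sketched as follows. The function name and the sample size of 2 x 10^24 molecules are assumptions chosen for illustration; the factor of 6 is the "6 x number of molecules" mentioned above, three position and three momentum coordinates per molecule.

```python
def phase_space_dimension(n_molecules):
    """Dimension of the phase space of a gas of n_molecules point molecules.

    Each molecule contributes 3 position and 3 momentum coordinates.
    """
    return 6 * n_molecules


# Hypothetical sample: 2 x 10^24 molecules gives a dimension above 10^25.
sample_dimension = phase_space_dimension(2 * 10**24)
```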
Light Cones
That the speed of light is a constant is one of the
most important facts about space and time in special
relativity. That fact gets expressed geometrically in
spacetime geometry through the existence of light
cones, or, as it is sometimes said, the "light cone
structure" of spacetime.
To see that structure, we imagine an event at which
there is an explosion. Light will propagate out from it in
an expanding spherical shell. In a two dimensional
space, it will look like an expanding circle, as shown
below.
An animation makes the motion more visible.
Now stack up these spatial snapshots to make a
spacetime. The spacetime diagram that corresponds to
it looks like a cone. As we proceed up the cone, we
look in each instantaneous space to see how far the
light has propagated. Each intersection of the cone
with the space will be a circle.
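In symbols, the cone can be written as a single equation. This is a sketch under assumed labels: the explosion event is placed at the origin, x and y are coordinates in the two dimensional space, t is the time of the snapshot and c is the speed of light.

```latex
% Explosion event placed at the origin: x = y = 0, t = 0 (assumed labels).
\begin{equation}
  x^{2} + y^{2} = c^{2} t^{2}
\end{equation}
% Slicing at a fixed time t gives a circle of radius r(t) = c\,|t|.
```

Each instantaneous space corresponds to one fixed value of t, and the equation then describes a circle whose radius grows in proportion to the time elapsed since the explosion.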
In the figure, the expanding circle of light is
represented by the top half of the cone. It is customary
to draw in the bottom half of the cone, although it is
not part of the expansion of the light. In fact it
represents the opposite. It depicts a circle of light
collapsing in towards the original event at the apex of
the cone. Here is that collapse, presented also as an
animation.
A final animation now shows the association between
the different stages of the collapsing and expanding
light shell and the cross-sections of the light cone.
The Right
Terminology
There is much potential for confusion in talking about
spacetimes. As a result a fairly precise vocabulary has
been built up and it is important to use it correctly.
Pay attention to the following terms:
Spacetime: When we add the extra dimension of time to a space,
we produce a spacetime.
Minkowski spacetime: There is nothing special about a
spacetime. They can arise in classical physics. So if we mean a
spacetime that also behaves the way special relativity demands,
then we have a Minkowski spacetime. (Note for later: when we
look at general relativity, we will meet spacetimes that are
relativistic but not Minkowski spacetimes.)
Event: These are the individual points of a spacetime. They
represent points in space at a particular time.
Timelike Worldline: This is the trajectory of a point moving
less than the speed of light. These curves are contained within
the light cone. They represent the trajectories of massive
particles.
Lightlike curve: This is the trajectory of a point moving at
the speed of light, a light signal. They lie on the surface of
the light cone.
Spacelike curve: This is a curve that lies outside the light
cone. If an object is to make this curve its trajectory, it
would need to travel faster than light.
Spacelike hypersurfaces: These are the instantaneous spatial
snapshots of spacetime. They are three dimensional in the case
of a four dimensional spacetime.
Past and future light cones: All the lightlike curves through
an event form the light cone at that event. The part of the
cone to the future of that event is the future light cone. The
part to the past is the past light cone.
Light cone structure: Since the speed of light is generally
taken to be the fastest that causes can propagate their
effects, once we know how the light cones are distributed in
space we can say a great deal about what is possible and
impossible causally in the spacetime. So this distribution is
of great interest to us. It is called the light cone structure
of the space.
Timelike geodesic: This term will be defined later.
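The distinction between timelike, lightlike and spacelike can be sketched as a small test on the separation between two events. This is an illustrative sketch for straight segments only, assuming units in which c = 1 by default. The function name and its arguments are hypothetical, and the exact comparison with zero is only reliable for simple inputs such as those below.

```python
def classify_separation(dt, dx, dy=0.0, dz=0.0, c=1.0):
    """Classify the straight segment joining two events.

    dt is the time difference between the events; dx, dy, dz are the
    differences in spatial position. The segment lies inside the light
    cone when light could outrun it, on the cone when it moves at c,
    and outside when it would need to move faster than light.
    """
    interval = (c * dt) ** 2 - (dx**2 + dy**2 + dz**2)
    if interval > 0:
        return "timelike"
    if interval < 0:
        return "spacelike"
    return "lightlike"
```

For example, a separation of 2 units of time and 1 unit of space is timelike, 1 and 1 is lightlike, and 1 unit of time with 2 units of space is spacelike.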
What you should know:
What a spacetime is.
The correct use of the particular terms associated with spacetimes.
Copyright John D. Norton. January 2001, September 2002; July 2006; February 3, 2007; January 23, September 24, 2008.
Spacetime and the Relativity of Simultaneity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/spacetime_rel_sim/index.html[28/04/2010 08:20:03 ﺹ]
HPS 0410 Einstein for Everyone
Spacetime and the Relativity of
Simultaneity
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Slicing Up Spacetime
Propagating Times through Space
Why Tilt?
Relativity of Simultaneity and the Speed of Light
The novelty of Minkowski spacetime
What you should know:
What use is spacetime? It turns out to make
visualizing and understanding the relativity of
simultaneity a great deal easier. The judgments of
simultaneity of different inertial observers correspond
to slicing the spacetime up into different stacks
of spaces with each space formed from a set of
simultaneous events.
Slicing up Spacetime
To see how this works, here are three observers in
relative motion in a spacetime.
First we have an observer whose worldline
runs vertically up the page.
The next observer moves to the right with
respect to the first.
The third observer moves to the left with respect
to the first.
Notice how differently they slice up the spacetime into
spaces of simultaneous events. That difference
simply is the relativity of simultaneity. It is expressed
in the tilting of the hypersurfaces of simultaneity as we
move the judgments of simultaneity of events from
inertial observer to inertial observer.
In looking at the three slicings as they are drawn
above, it is easy to fall into the trap of imagining that
the first slicing is somehow the "right" one and the
second and the third are distortions due to the
observers' motion. That would be a mistake. The
principle of relativity assures that all three observers
are equally good. The judgments of simultaneity of
any one are just as good as those of the other two and
each of the figures is an equally good way of dividing
the spacetime into sets of simultaneous events.
The fact that one observer's worldline is drawn as a
vertical line and the others are oblique is just an
accident of the way we chose to draw the diagram.
Correspondingly, the fact that one observer's
hypersurfaces are perfectly horizontal and the others
are tilted is again an accident of the way we drew the
figure. We could redraw the figures so that the third
observer's worldline, say, is vertical. Then the third
observer's hypersurfaces of simultaneity would be
drawn as horizontal; the worldlines of the other two
observers would be diagonals; and their hypersurfaces
of simultaneity would be tilted.
Two points to watch when
you are drawing this tilting of
hypersurfaces.
First, setting an observer into
motion to the right will tilt the
observer's world line to the
right; and the hypersurface of
simultaneity will also tilt up on
the right side to meet it.
Second, if one follows the
usual convention of drawing
light lines at 45°, then the
angle of the observer's
worldline to the vertical will be
the same as the angle of the
hypersurface of simultaneity
to the horizontal.
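The second point can be checked numerically. The following is only a sketch under stated assumptions: it uses the standard Lorentz transformation (which this chapter does not write out), units in which the speed of light is 1 so that light lines sit at 45°, and hypothetical function names chosen for illustration.

```python
import math

def to_diagram_coordinates(t_moving, x_moving, v):
    """Map the moving observer's (t', x') to the diagram's (t, x), with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t = gamma * (t_moving + v * x_moving)
    x = gamma * (x_moving + v * t_moving)
    return t, x

def tilt_angles_deg(v):
    """Return (worldline angle to the vertical, hypersurface angle to the horizontal)."""
    # A point on the observer's worldline: x' = 0 at t' = 1.
    t_w, x_w = to_diagram_coordinates(1.0, 0.0, v)
    # A point on the hypersurface of simultaneity through the origin: t' = 0 at x' = 1.
    t_h, x_h = to_diagram_coordinates(0.0, 1.0, v)
    worldline_angle = math.degrees(math.atan2(x_w, t_w))
    hypersurface_angle = math.degrees(math.atan2(t_h, x_h))
    return worldline_angle, hypersurface_angle
```

For any velocity v between -1 and 1 the two returned angles agree, and both are positive for motion to the right, which matches the two drawing rules above.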
Propagating Times
through Space
The tilting of the hypersurfaces gives us a simple
picture of how inertial observers in relative motion
assign times to events.
An inertial observer carries a
clock that marks the time of
events along the observer's
worldline as "1," "2," "3," ... That
settles the time of events only
on the worldline for the
observer. What time should be
assigned to events not on the
observer's worldline? The
observer's hypersurfaces of
simultaneity answer.
Consider the hypersurface that
passes through the event of the
clock showing "1." All these
events are simultaneous in the
judgement of the observer.
Therefore all these events are
assigned time "1."
The same applies for the
remaining hypersurfaces that
pass through the events of the
clock ticking "2" and "3." All the
events on those hypersurfaces
are assigned times "2" and "3,"
respectively.
The same analysis
obtains for a new
inertial observer who
moves relative to the
original observer. The
new observer's clock
assigns times to
events on the
observer's world line.
The observer's
hypersurfaces of
simultaneity are then
used to propagate the
times throughout the
spacetime.
Clearly the original and
new observer will
differ on the times
each assigns to the
same event in almost
every case. Is there a
sense in which one is
assigning times
correctly and the other
not? There cannot be.
The principle of
relativity requires each
observer's frame to be
equivalent. If the
procedure is good in
one inertial frame, then
it is equally good in all.
This reminds us once
again that there is no
frame independent
notion of simultaneity
in a Minkowski
spacetime.
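As a rough illustration of how two observers propagate times to the same event, here is a small sketch. It assumes the standard Lorentz transformation with the speed of light set to 1; the function name, the sample event and the velocity 0.6 are illustrative choices, not values from the text.

```python
import math

def assigned_time(t, x, v):
    """Time an inertial observer moving at velocity v (c = 1) assigns to the event (t, x).

    (t, x) are the coordinates of the event as judged by the original observer.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x)

event = (2.0, 3.0)                       # an event off both worldlines
original = assigned_time(*event, v=0.0)  # time assigned by the original observer
moving = assigned_time(*event, v=0.6)    # time assigned by the new observer
```

The two observers assign different times to the one event, and neither assignment is privileged. All events with t = 0.6 x lie on one of the new observer's tilted hypersurfaces and receive the same time from that observer.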
Why Tilt?
Just why is it that hypersurfaces of simultaneity tilt
when we change the state of motion of the observer or
reference frame? A simple construction shows how
it comes about.
Imagine that some inertial
observer wants to determine
which events in spacetime are
simultaneous with some event O
on the observer's world line. The
simple way to do it is with light
signals. Following Einstein's
original idea of 1905, the
observer sends out light signals,
reflects them off positions in
space and then notes when they
return.
In the figure opposite, there is a
light signal leaving the observer's
worldline, reflecting at event A
and then arriving back at the
observer's worldline. Since the
event O is exactly midway in
time between the departure and
arrival events, the observer judges
event A to be simultaneous with
O.
Also it is obvious that light signals
reflected at A' and A must arrive
back at the observer at the same
time, since they departed the
observer at the same time. It is
just symmetry.
This same reasoning applies to all the remaining
events shown in the figure: B', C', B and C. In each
case, there are arrival and departure events at the
observer's worldline for light signals that are reflected
at B', C', B and C. The event O is midway between the
arrival and departure events in each case. Therefore,
the observer judges each of B', C', B and C to be
simultaneous with O. The totality of these events will
form a flat plane perpendicular to the observer's
worldline.
Now let us consider a second case in which a new
inertial observer moves relative to the original inertial
observer. The new observer's worldline will be drawn
as a tilted line. Which events will that observer judge
as simultaneous with an event O on that observer's
worldline? Although the worldline is now tilted, the
same procedure just described can also be used to
pick out the events simultaneous with O. Indeed the
principle of relativity requires this, for otherwise we
would have some intrinsic difference between the first
inertial frame and the second; only in the first could this
procedure be used.
The construction
proceeds as before
and it is drawn for you
at left. As long as we
adhere to the light
postulate and draw
our lightlike curves at
45 degrees to the
vertical, we will end up
plotting out events A',
B', C', A, B and C that
lie on a tilted
hypersurface.
This happens because
the departure and
arrival events for the
light have been
displaced to the left
and right respectively.
We now need to locate
the bends in the
lightlike curves (the
reflection events) in
such a way that, as
before, the light
signals from A and
A' return at the same
arrival event (and so
on for B and B' and for
C and C'). That can
only happen if we
displace the reflection
events into the tilted
hypersurface shown.
If you are having any
trouble seeing this, the
simplest remedy is to
draw the figure for
yourself, being careful to
keep all light signals
propagating along lines
that are at 45 degrees to
the vertical.
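For readers who would rather compute than draw, the light signal construction can be sketched as follows. The assumptions are mine: the speed of light is 1, the observer's worldline is x = v t through the event O at the origin, and the function name and parameters are hypothetical. Because the worldline is straight, an event midway in diagram time between departure and arrival is also midway on the observer's own clock.

```python
def reflection_event(v, s, direction):
    """Reflection event for a light signal that leaves the observer at diagram
    time -s and returns at diagram time +s, so that O = (0, 0) is midway.

    v is the observer's velocity (c = 1); direction is +1 for a signal sent to
    the right and -1 for one sent to the left. Returns (t, x) on the diagram.
    """
    t_dep, x_dep = -s, -v * s   # departure event on the worldline x = v t
    t_arr, x_arr = s, v * s     # arrival event on the same worldline
    # Outgoing ray:  x = x_dep + direction * (t - t_dep)
    # Returning ray: x = x_arr - direction * (t - t_arr)
    # Setting the two equal and solving for t gives:
    t = (x_arr - x_dep + direction * (t_dep + t_arr)) / (2.0 * direction)
    x = x_dep + direction * (t - t_dep)
    return t, x
```

Every reflection event found this way satisfies t = v x. For v = 0 that is the horizontal line of the first construction; for any other v it is a tilted line, rising on the side toward which the observer moves.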
Relativity of Simultaneity and
the Speed of Light
Special relativity requires us to believe something that
at first seems unbelievable: that two inertial observers
in relative motion will judge the same speed for the
same light signal. We know now that the relativity of
simultaneity solves the problem. The two can judge the
same speed for light since, through the relativity of
simultaneity, they set the clocks used to measure the
speed of light differently.
Visualizing just how the relativity of simultaneity
enables the light postulate to hold for all inertial
observers is not easy as long as we try to picture
things in ordinary space. It does become dramatically
simpler once we depict them in spacetime and use
the simple geometric picture of the relativity of
simultaneity that it affords.
To see how this comes about, let us first sharpen the
problem by describing the difficulty in a quite
concrete case. Once we see how the relativity of
simultaneity resolves this one case, others are
obviously analogous. Imagine that we have an
inertially moving rod and a light signal that bounces
back and forth between its two ends.
For an observer at rest on the rod, the light signal will
take the same time for the forward and return journey.
Now imagine that we redescribe the process
from the perspective of an observer on a
nearby planet, who judges the rod to be in
uniform motion along its length. That planet
observer would judge the light to need more
time to traverse the rod in the direction of the
rod's motion and less in the direction opposite
to the rod's motion.
Remember how it goes? When the light signal
moves in the direction of the rod, it chases
after a fleeing end and needs more time to
catch it. When the light signal moves in the
opposite direction, its destination is an end that
approaches; so it needs less time to reach it.
Note that the speed of light is assumed to be
the same in both directions.
If the light postulate is to hold for both inertial
observers, somehow both have to be right. The
forward and return journeys should take the same time
for the rod observer; and they should take different
times for the planet observer.
Let's now look at the spacetime diagrams for this
process.
First here's a spacetime
diagram that depicts the rod
observer's judgments. In
particular, the hypersurfaces
of simultaneity reflect the rod
observer's judgments of
simultaneity of events.
The equal spacing of the
hypersurfaces reflects the rod
observer's judgment that
equal times are needed for
the forward and return
journey of the light signal.
More precisely, the events in
question are the arrivals of the light
signal at either end of the rod. The
hypersurfaces reflect how the rod
observer associates these events
with events simultaneous with them
on the rod observer's own world line.
The rod observer will then use a
single clock carried with the observer
to judge the equality of times elapsed
between these latter events.
Here's the spacetime
diagram that depicts
the planet
observer's
judgments. In
particular, the
hypersurfaces of
simultaneity reflect
the judgments of
simultaneity of events
by the planet
observer.
It is clear that a
greater time is
needed for the light to
traverse the rod when
the light propagates
in the direction of the
rod's motion; and less
time is needed for the
return trip.
The speed of light in
both directions
remains the same.
That is captured by
the fact that the
lightlike curves in both
directions are at 45°
to the vertical.
How can both views
cohere? That
becomes apparent
immediately if we now
depict how the planet
observer judges the
rod observer's
hypersurfaces of
simultaneity to be
spread over the
spacetime.
The planet observer
notices that the rod
observer's judgments
of simultaneity differ
from the planet
observer's. The
difference lies in a
tilting of the rod
observer's
hypersurfaces of
simultaneity. Indeed
that tilting is precisely
what is needed to
restore the equality of
times for the forward
and return trips of the
light signal.
In comparing the last
two figures, the key
element to notice is
that the rod and
bouncing light
signal remain the
same. All that
changes is the way
that the two
observers slice up
the spacetime into
hypersurfaces of
simultaneity.
The planet observer's
slicing leads to the
judgment that the
light's forward journey
takes longer.
The rod observer's
slicing leads to the
judgment that the
light's forward and
return journeys take
the same time.
The animation at left
shows the two figures
overlaid and that
the rod and light
signal world lines are
the same in both.
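The rod and light signal example can also be put into numbers. This is a sketch only, assuming the speed of light is 1, the standard Lorentz transformation, and the length contraction of the moving rod described in earlier chapters. The function name, the rod's rest length and the speed 0.6 are illustrative choices.

```python
import math

def light_trip_times(rest_length, v):
    """Forward and return light travel times along a rod, for both observers.

    rest_length is the rod's length for the rod observer; v is the rod's
    velocity for the planet observer (c = 1).
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    length = rest_length / gamma         # contracted length for the planet observer
    t_forward = length / (1.0 - v)       # light chases the fleeing front end
    t_return = length / (1.0 + v)        # the rear end approaches the returning light

    # The three events in planet coordinates (t, x): emission at the rear end,
    # reflection at the front end, return to the rear end.
    emission = (0.0, 0.0)
    reflection = (t_forward, t_forward)  # light has traveled at speed 1
    back = (t_forward + t_return, v * (t_forward + t_return))

    def rod_time(event):
        # Time assigned using the rod observer's tilted hypersurfaces.
        t, x = event
        return gamma * (t - v * x)

    rod_forward = rod_time(reflection) - rod_time(emission)
    rod_return = rod_time(back) - rod_time(reflection)
    return {"planet": (t_forward, t_return), "rod": (rod_forward, rod_return)}
```

With a rod of rest length 1 moving at 0.6, the planet observer's times are unequal while the rod observer's times come out equal, even though the events and light signals are the same in both descriptions. Only the slicing differs.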
The novelty of Minkowski spacetime
Just what is so special about a Minkowski spacetime?
One might think that it is the idea of representing
space and time together in a four dimensional
geometry, where the four dimensionality of the
geometry outstrips our immediate powers of
visualization. It is certainly the case that the four
dimensionality is both interesting and hard to visualize.
But there is nothing inherently relativistic about it.
One can take all the physics of Newton and reexpress
it in four dimensional terms.
The big difference between Newtonian and
relativistic spacetimes lies in how they are sliced up
into three dimensional spaces. That slicing is done by
picking out sets of simultaneous events to form three
dimensional spaces.
In Newtonian spacetimes, there is only one way to do
this, so a Newtonian spacetime unstacks into a unique
set of spaces. In this sense, space and time remain
distinct even if we represent the physics in a
spacetime.
In a relativistic (i.e. Minkowski) spacetime, the
relativity of simultaneity tells us that there are many
ways to do this; there is no unique, preferred
unstacking. In this sense, space and time get fused
together and this fusion is the real novelty of the
spacetime approach in relativity theory.
This novelty is surely what Hermann Minkowski had in
mind when he wrote in the introduction to his famous
lecture "Space and Time" of 1908:
"The views of space and time which I wish to
lay before you have sprung from the soil of
experimental physics and therein lies their
strength. They are radical. Henceforth space
by itself and time by itself, are doomed to fade
away into mere shadows, and only a kind of
union of the two will preserve an independent
reality."
What you should know:
How the relativity of simultaneity is represented in a spacetime
diagram.
How it can be used to resolve puzzles in the way observers in relative
motion see light propagation and the rates of moving clocks.
Copyright John D. Norton. January 2001, September 2002; July 2006; February 3, 2007; January 23, September 24, 2008; February 2, 2010.
Spacetime, Tachyon, Twins, ...
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/spacetime_tachyon/index.html[28/04/2010 08:20:10 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
Spacetime, Tachyons, Twins and Clocks
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Tachyons
Tachyon Paradoxes
Symmetry of Clock Slowing: Half Twin Effect
Symmetry of Length Contraction
The Twin Effect ("Paradox")
Timelike Geodesics
What you should know:
Once we have the notion of spacetime and the simple
picture it brings of the relativity of simultaneity, we find that
other processes and effects in special relativity become a
great deal easier to understand. Here is a collection of a
few of them.
Tachyons
Among the most intriguing entities in relativity theory are
tachyons, hypothetical particles that travel faster than light.
They are distinguished from "bradyons," particles that travel
at less than the speed of light. While bradyons are familiar
and include protons, electrons and neutrons, tachyons
have never been observed.
Bradyons                                Tachyons
travel slower than light                travel faster than light
ordinary matter                         exotic matter (not found)
Add energy and momentum                 Add energy and momentum
and they speed up.                      and they slow down.
c is the upper limit to their speeds    c is the lower limit to their speeds
For present purposes, the interesting fact is a curious
property: for some observers they travel backwards in
time. With the spacetime representation of the relativity of
simultaneity it is now very easy to see how this comes
about. The figures below show a tachyon being created
and propagating into space; and how three different
observers would judge the same tachyon.
Observer A judges it to be moving forward in time
from its creation at the instant marked "now." It
propagates from the "now" hypersurfaces of
simultaneity towards the "later" hypersurfaces of
simultaneity.
Observer B moves in the direction of
propagation of the tachyon. Observer B finds
the tachyon to lie fully within one of B's
hypersurfaces of simultaneity, the "now"
hypersurface that contains the event of the
tachyon's creation. Indeed the "now"
hypersurface contains all the events in the
tachyon's history. So the tachyon exists only
"now" for observer B. That is, for B, the
tachyon has infinite speed (it covers distance
in no time) and it has disappeared to spatial
infinity in the same instant "now."
Finally observer C, who moves even faster
in the same direction, judges the tachyon
traveling into the past. It is created on the
"now" hypersurface of simultaneity and
propagates towards "earlier" hypersurfaces
of simultaneity. It arrives at the earlier ones
before it was created; that means it is
traveling backwards in time.
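The three verdicts can be reproduced with a short calculation. What follows is a sketch under assumptions of my own: the speed of light is 1, the standard Lorentz transformation applies, the tachyon moves at twice the speed of light for observer A, and observers B and C move at 0.5 and 0.8 relative to A. None of these numbers come from the figures.

```python
import math

def tachyon_elapsed_time(u, v, dt=1.0):
    """Time an observer moving at velocity v judges to elapse along the tachyon's path.

    u > 1 is the tachyon's speed for observer A (c = 1); dt is the time A judges
    to elapse between the tachyon's creation and some later event on its path.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    dx = u * dt                      # distance covered, as judged by A
    return gamma * (dt - v * dx)

def verdict(u, v):
    """Describe the tachyon's direction in time for the observer moving at v."""
    elapsed = tachyon_elapsed_time(u, v)
    if abs(elapsed) < 1e-12:
        return "infinite speed"
    return "forward in time" if elapsed > 0 else "backward in time"

observer_a = verdict(2.0, 0.0)
observer_b = verdict(2.0, 0.5)
observer_c = verdict(2.0, 0.8)
```

The sign of the elapsed time flips once the observer's velocity exceeds 1/u. That is the algebraic counterpart of the hypersurfaces tilting past the tachyon's worldline.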
All three figures above are drawn with the tachyon
moving up the page. So it is easy to fall into the
trap of imagining that the figure with observer A
shows what is really happening: that the tachyon is
really propagating forward in time and that the
other two figures represent distorted reporting
from observers B and C in motion. That is not how
it works. The principle of relativity assures us that
the reports of all of the observers are equally
good. C's reporting of the tachyon traveling back in
time is as good as A's reporting of the tachyon
traveling forward in time. That the figure showing
observer A looks more natural is just an accident of
the way the figures have been drawn. We could
equally well draw the figures so that C's worldline is
vertical. Then, as shown at left, the natural reading
would be to say that the tachyon propagates
backwards in time.
None of the figures is any more correct than any
other. The principle of relativity assures us that all
the observers are equally good. Since they
disagree on whether the tachyon propagates
forward in time, the best we can say is that there is
no observer independent fact of the direction
of propagation, just as there is no fact as to which
observer is really at rest.
Tachyon Paradoxes
For some observers, tachyons travel backwards in time,
that is, into the past. Does that mean that they can be used
to affect the past, that is, to change the past? Does that
mean that we can use them to create paradoxical situations?
The standard time travel paradox is the one in which a
time traveler travels back in time and kills his or her
grandfather; so that the time traveler is never born; so the
time traveler doesn't travel back in time! This closed loop
produces a contradiction. The time traveler both exists at
some time and does not exist at the same time.
It proves to be quite easy to conceive situations in which
tachyons are emitted and absorbed in such a way as to
produce similar closed, paradoxical cycles. The standard
tachyonic paradox employs spaceships run entirely by
robots, programmed to behave in certain ways according to
whether or not a tachyonic signal has been received by
them.
In the figure below, the robot controlled spaceship A is
programmed simply to self destruct if it receives a
tachyonic signal; and if it still exists later to emit a tachyonic
signal into the past.
The robot controlled spaceship B is programmed to
switch into an "activated" mode upon receipt of a tachyonic
signal; and to transmit a tachyonic signal later only if it is in
the activated mode. In addition the motions of the
spaceships and timing of the emissions are all carefully
preprogrammed so that the spaceships are in just the right
positions for the sending of a signal the other will receive;
and the receiving of a signal the other will send.
Start the cycle with the sending of a tachyonic signal by
spaceship A. Tracing through the effects of that signal soon
leads to the outcome that spaceship A self-destructs prior
to sending the tachyonic signal. So the signal is not sent.
And if the signal is not sent, tracing through the effects
leads to the conclusion that spaceship A does not
self-destruct. So A's tachyonic signal is sent. We have a
contradiction:
A's tachyonic signal is sent
if and only if
A's tachyonic signal is NOT sent.
If you know a little logic, you will find it easy from this to
infer to a contradiction in the traditional form: (A's signal is
sent AND A's signal is NOT sent.)
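The tracing of effects can be mimicked in a few lines. This is a toy sketch, assuming the programming of the two robot spaceships is reduced to true/false steps; the function names are hypothetical.

```python
def a_signal_sent_according_to_program(a_signal_sent):
    """Trace one pass around the cycle, starting from a supposition about A's signal."""
    b_activated = a_signal_sent        # B activates only on receipt of A's signal
    b_signal_sent = b_activated        # B transmits only if activated
    a_destroyed_early = b_signal_sent  # A self-destructs on receipt of B's signal
    return not a_destroyed_early       # A sends its signal only if it still exists

def consistent_histories():
    """Suppositions about A's signal that agree with what the programming then yields."""
    return [
        supposed
        for supposed in (True, False)
        if a_signal_sent_according_to_program(supposed) == supposed
    ]
```

Neither supposition survives the trace, so the list of consistent histories is empty. That is the computational shadow of "sent if and only if NOT sent."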
Since tachyons are candidates for serious science and not
imaginings of science fiction, we cannot tolerate such an
outright contradiction. Somehow it must be resolved. The
most obvious resolution is the most severe. We could just
suppose that these paradoxes show that there are no
tachyons. That seems too severe to me since other
weaker resolutions are possible.
The simplest resolution is just to suppose that the emission
of tachyons is just not something that can be
controlled by us. Just as the receipt of a signal is
something that happens to us, the emission (or receipt) of a
tachyon is again just something that happens to us. What
makes this resolution plausible is that there is no absolute
distinction between emission and receipt of a tachyon.
What one observer counts as an emission another may
count as a receipt. So we might expect the one rule over
control to cover both emission and receipt.
We could look to other more fanciful resolutions. Perhaps
tachyons exist but they don't interact with normal matter.
Most people find that dubious. Since we are normal matter,
that means we never interact with them and so we can
never know they are there.
Symmetry of Clock
Slowing: Half Twin Effect
We can also use spacetime diagrams to give us a much
simpler, geometric picture of how it is possible that two
moving observers can each judge the other's clocks to
have slowed. The set up employs just half of the
construction that is used in the twin effect to be described
below. So I call it the "half twin effect."
We imagine that twins A and B set off at the same speed
but in opposite directions with respect to our vantage point
on earth. Each twin carries a clock. As they move away
from each other, each twin judges the rate of the
other's clock. How this is done is a detail we need not
fuss with. They might use light signals, for example, and
correct for the time of flight of the signals to figure out what
was the other twin's clock reading at each instant.
The figure shows the essential result. Even before we do
any detailed analysis, a quick glance at the figure shows a
perfect symmetry between the twins A and B. So we
already expect that whatever decision A may make of B's
clock, then B will make the same decision of A's clock. (The
clock readings are the numbers next to each twin's worldline.)
Twin A wants to determine the rate of B's clock. So twin A
asks: when A's clock reads 1, 2, 3, 4, ... what does B's
clock read? Answering requires that A make judgments
of distant simultaneity. To do this, A must use the
hypersurfaces of simultaneity appropriate to A's motion.
Because of the way those hypersurfaces tilt, A will judge
B's clock to read earlier times. For example, when A's clock
reads "4," twin A will judge that B's clock reads only "3." As
a result, twin A will judge that B's clock runs slower.
Twin B will give the same analysis. However, since twin
B is in motion relative to A, twin B's analysis will make
different judgments concerning which events are
simultaneous. These different judgments are represented
by B's hypersurfaces of simultaneity. When B's clock reads
"4," twin B will judge A's clock to read only "3." As a result,
twin B will determine that A's clock runs slower.
Thus each infers the other twin's clock is running
slower. Of course, an observer on earth with our
vantage point finds the twins to recede with equal speed in
opposite directions and would judge the rates of both
twins' clocks to be the same.
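The geometry of the half twin effect can be worked through numerically. This sketch assumes the speed of light is 1 and does the calculation in the earth's frame, with twin A moving at -v and twin B at +v. The text does not state the twins' speed; the value 1/√7 used in the example is my own choice, picked only because it reproduces the readings "4" and "3" quoted above.

```python
import math

def other_clock_reading(own_reading, v):
    """Reading that twin A judges B's clock to show when A's own clock shows own_reading.

    Both clocks read 0 at departure; A moves at -v and B at +v relative to earth (c = 1).
    By the symmetry of the setup, the same function gives B's judgment of A's clock.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    # Event on A's worldline, in earth coordinates, where A's clock shows own_reading.
    t_a = gamma * own_reading
    x_a = -v * t_a
    # A's hypersurface of simultaneity through that event: t - t_a = -v * (x - x_a).
    # B's worldline: x = v * t. Solving for the intersection gives:
    t_b = (t_a + v * x_a) / (1.0 + v * v)
    return t_b / gamma               # B's clock reading at that event

example_speed = 1.0 / math.sqrt(7.0)  # assumed speed, chosen to match "4" and "3"
b_reading_judged_by_a = other_clock_reading(4.0, example_speed)
```

The earth observer, by contrast, finds both clocks slowed by the same factor, so on the earth's horizontal hypersurfaces the two clocks always show equal readings.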
Symmetry of Length Contraction
Essentially the same analysis applies to relativistic length
contraction. Each of the twins A and B carries a rod, where
the rods have the same length when compared in one
frame of reference. Each twin will judge the other's rod
to have contracted. Each of these judgments makes
essential use of a judgment of simultaneity. Once again,
when we take into account how the twins' judgments of
simultaneity differ, we can see how each ends up judging
the other's rod to have shrunk.
The spacetime diagram below depicts the essential ideas. It
shows twin A and twin B's rods receding from one another;
and it shows a hypersurface of simultaneity for each. As
before, the symmetry of the figure shows that whatever
twin A finds for the B-rod, twin B will find for the A-rod.
Twin A determines the length of the A-rod by judging the
distance between the two ends of the A-rod at one
instant of time. The events of A's hypersurface of
simultaneity constitute one instant of time and the
distances between them are distributed uniformly across
the hypersurface. In the diagram, A's rod extends from
position 1 to position 2; as a result it is judged to be of unit
length by twin A. The ends of the B-rod, however, intersect
the same hypersurface at events that lie between positions
6 and 7. As a result, twin A judges the B-rod to be of less
than unit length.
Twin B will carry out an analogous analysis and judge
that the B-rod is of unit length, but the A-rod is of less than
unit length.
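A minimal numerical sketch of twin A's judgment follows. It assumes the speed of light is 1 and the standard Lorentz transformation; the function name and the relative velocity 0.6 are illustrative, since the text gives no speeds for the rods.

```python
import math

def judged_length(rest_length, w):
    """Length twin A judges for a rod at rest with twin B.

    w is B's velocity relative to A (c = 1). The length is the distance between
    the rod's two ends on one of A's hypersurfaces of simultaneity (here t = 0).
    """
    gamma = 1.0 / math.sqrt(1.0 - w * w)
    # In B's frame the rod's ends sit at x' = 0 and x' = rest_length at all times.
    # Since x' = gamma * (x - w * t), on A's hypersurface t = 0 an end is at x = x' / gamma.
    near_end = 0.0 / gamma
    far_end = rest_length / gamma
    return far_end - near_end

b_rod_for_a = judged_length(1.0, 0.6)
```

Swapping the roles of A and B changes only the sign of w, which leaves the result unchanged. So twin B judges the A-rod to be shortened by the same factor.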
The Twin Effect
("Paradox")
That inertial observers in relative motion will each judge the
others' clocks to run slower is, by now, a quite familiar and
readily understandable outcome of relativity theory. It does
take a little while to get used to the idea, of course. When
you first hear it, it seems strange and even paradoxical.
How can each be correct in judging the other's clock to
have slowed? What would happen if the two observers
meet and compare their clocks? If relativity is right, each
would have to read a time earlier than the other; and surely that
is impossible. Or is it?
We now know that these concerns are misplaced. The
clocks cannot start out from the same place and then be
reunited without one or both accelerating; and those
accelerations so interfere with the analysis that no
contradiction arises. When either accelerates, they cease to
be inertial observers.
However an enduring literature has tried to generate some
sort of paradox from the effect of relativistic clock slowing.
The most famous of these attempts is associated with a
story of two twins. One stays on the earth: the "stay-at-home
twin." The stay-at-home twin's motion is inertial
throughout. The other travels off rapidly into space,
journeys far and fast and then returns home. The traveling
twin must accelerate to complete this journey.
The story of what happens is readily told from the point
of view of the stay-at-home twin. The traveling twin's
clocks will slow due to the twin's rapid motion. That slowing
encompasses all processes related to time. So the
traveling twin's metabolism will slow as well. When the
traveling twin returns to earth, the traveler will have aged
significantly less. If the traveling twin had maintained a
speed of 86.6% of the speed of light, the internal clock would
slow to 50% of the normal rate. Let us say the traveling
twin returns, after what the stay-at-home twin finds to be 8
days. The traveling twin will experience merely the passing
of 4 days.
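The figures quoted here follow from the usual clock slowing factor, the square root of 1 - v²/c². A minimal sketch, assuming the traveler keeps the same speed on both legs and the turnaround takes negligible time (the function names are hypothetical):

```python
import math

def clock_rate(v):
    """Rate of a clock moving at speed v, as a fraction of the normal rate (c = 1)."""
    return math.sqrt(1.0 - v * v)

def traveler_elapsed(home_elapsed_days, v):
    """Days elapsed for the traveling twin while home_elapsed_days pass at home."""
    return home_elapsed_days * clock_rate(v)

rate = clock_rate(0.866)                       # very nearly one half
traveler_days = traveler_elapsed(8.0, 0.866)   # very nearly 4 days
```

The rate is exactly one half at a speed of √3/2 of the speed of light, which 86.6% approximates.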
All this can be depicted in a simple spacetime
diagram. It shows the worldline of the stay-at-home
twin. The numbers 1, 2, ... , 8 are the days
of time elapsed on the stay-at-home twin's clock.
It also shows the worldline of the traveling twin,
who moves away from earth, travels inertially for
four days of stay-at-home twin time; abruptly
turns around; and then takes another 4 days to
return. Because the traveling twin is moving so
fast, the traveler's clock and the traveling twin's
metabolism run at half the speed of the stay-at-home
twin's. The numbers 1, 2, 3, 4 represent the
days of time elapsed on the traveling twin's clock
and metabolism.
The twins have set their clocks to zero at the
start when the traveling twin leaves. When the
traveling twin returns, the stay-at-home twin has
aged 8 days, but the traveler has aged only 4
days.
So far, the analysis is straightforward. Where the
problems enter, is if we try to recount the story
from the perspective of the traveling twin
and do it badly. The temptation is to say that,
from the traveling twin's perspective, everything
looks the same. The stay-at-home twin recedes,
then turns around and comes back. So, we are
tempted to ask, should the stay-at-home twin
have aged less? Since the stay-at-home twin
cannot have both aged more and less, do we
have a paradox, often called the "twin paradox"?
We should not expect the stay-at-home twin to
age less. The error made was to assume a
symmetry in the two twins. The stay-at-home
twin maintains inertial motion throughout the
process. The traveling twin must at some point
turn around and return. Even if very briefly, the
traveling twin must accelerate. That acceleration
makes a big difference and enables us to
maintain different results for each twin.
Relativity theory is able to give a
consistent account of times
elapsed for both twins. The
clearest account shows the
judgments of simultaneity made
by each twin.
First, the diagram opposite shows
the judgments of simultaneity
of the stay-at-home twin. As
time passes on the clock of the
stay-at-home twin, we can trace
out the corresponding times on
the clock of the traveling twin.
The two clocks are both set to 0
at the outset. Then, the traveler's
clock starts to lag.
After 2 days have elapsed for the
stay-at-home twin, we follow the
hypersurface of simultaneity from
the worldline of the stay-at-home
twin to the traveling twin to find
that just one day has elapsed for
the traveler.
Repeating for the remaining
times, we see that for each time
elapsed on the stay-at-home
twin's clock (2, 4, 6, 8 days), half
the time has elapsed on the
traveling twin's clock (1, 2, 3, 4 days).
Now consider the judgments of
simultaneity of the traveling twin, as
shown in the spacetime diagram
opposite. Since the traveling twin is
moving very rapidly, the traveler's
hypersurfaces of simultaneity are quite
tilted.
Two hypersurfaces of simultaneity are
shown in the lower part of the diagram for
the outward part of the traveler's journey.
These are the hypersurfaces that pass
through the event at which the clock
reads 1 day and the event just before the
turnaround at the traveler's clock time of 2
days.
We read from these hypersurfaces that
the traveling twin judges the stay-at-home
twin's clock to be running at half
the speed of the traveler's. When the
traveler's clock reads 1 day, the stay-at-home
twin's reads 1/2 day; just before the
turnaround, when the traveler's clock is
almost at 2 days, the stay-at-home twin's
clock is almost at 1 day.
Then, at the end of the outward leg, the
traveler abruptly changes motion,
accelerating sharply to adopt a new
inertial motion directed back to earth.
What comes now is the key part of the
analysis. The effect of the change of
motion is to alter completely the
traveler's judgment of simultaneity.
The traveler's hypersurfaces of
simultaneity now flip up dramatically.
Moments after the turnaround, when the
traveler's clock reads just after 2 days,
the traveler will judge the stay-at-home
twin's clock to read just after 7 days.
That is, the traveler will judge the
stay-at-home twin's clock to have jumped
suddenly from reading 1 day to reading 7
days. This huge jump puts the
stay-at-home twin's clock so far ahead of the
traveler's that it is now possible for the
stay-at-home twin's clock to be ahead of
the traveler's when they reunite.
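The numbers in this account (1/2 day, 1 day, then the jump to 7 days) can be checked with a short calculation. What follows is only a sketch under stated assumptions: the speed is taken to be the one that gives a slowing factor of 2, units are chosen so that c = 1, and the function names are invented for illustration.

```python
import math

V = math.sqrt(3) / 2             # assumed traveler speed as a fraction of c (slowing factor 2)
GAMMA = 1 / math.sqrt(1 - V**2)  # close to 2

def home_reading_outbound(traveler_days):
    """Stay-at-home clock reading the outbound traveler judges simultaneous with his own."""
    # Traveler's event in the stay-at-home frame: t = GAMMA * tau, x = V * t.
    t = GAMMA * traveler_days
    x = V * t
    # Outbound hypersurfaces of simultaneity satisfy t - V*x = constant.
    return t - V * x

def home_reading_inbound(traveler_days, turnaround_days=2.0):
    """The same judgment on the return leg, where the traveler's velocity is -V."""
    t_turn = GAMMA * turnaround_days
    x_turn = V * t_turn
    t = t_turn + GAMMA * (traveler_days - turnaround_days)
    x = x_turn - V * (t - t_turn)
    # Inbound hypersurfaces of simultaneity satisfy t + V*x = constant.
    return t + V * x

for tau in (1.0, 2.0):
    print("outbound", tau, round(home_reading_outbound(tau), 6))
for tau in (2.0, 3.0, 4.0):
    print("inbound ", tau, round(home_reading_inbound(tau), 6))
```

On the return leg the stay-at-home clock again advances at half the rate of the traveler's, from 7 days to 8 days while the traveler's clock goes from 2 to 4. The whole difference at the reunion comes from the jump at the turnaround.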
Careful attention to the differing judgments of simultaneity
of the two twins shows that there is nothing paradoxical
in the twin effect. The brief moment of acceleration of the
traveling twin completely alters the traveler's judgments of
simultaneity and this alteration is key to seeing how
relativity provides a consistent account of the effect.
Nevertheless, many still get confused by the twin effect.
The traps they fall into go something like this.
Question: If the stay-at-home twin judges the other twin's
clock to slow, does not the principle of relativity require that
the traveling twin see the same thing for the stay-at-home
twin? Otherwise, could we not use the difference to detect
the absolute motion of the traveling twin?
Answer: The principle of relativity applies to inertial
motion. Only the stay-at-home twin moves inertially. So the
principle of relativity is applied to that twin only. There is no
problem in the traveling twin deciding that he is moving
from the relative slowing of his clock, since what he is really
inferring is that he is accelerating.
Question: OK, forget the principle of relativity. Is there
not a symmetry in the situation? Each sees the other
moving, so if one sees the other's clock slow, should not
both?
Answer: There is not a perfect symmetry in the two twins.
One moves inertially; the other accelerates. So there is no
basis for expecting symmetrical effects and we do not get
them.
Timelike Geodesics
The reason I have gone into such detail on the story of the
twin effect is that it turns out to be especially simple to
understand when we relate it to the geometry of a
Minkowski spacetime.
The result that will interest us is one of the most
fundamental results of Euclidean geometry, that is, of
the ordinary geometry of our space. If one has two points in
space, which of all possible curves is the straight line that
connects them? The answer is that the straightest is the
shortest.
This shortest curve is called a "geodesic." That
the straight lines are the shortest is a very familiar fact
of experience. If I need to go from one side of a large hall
to the other quickly, I choose the straight path since that
is the shortest. The figure shows a straight line as the
shortest curve connecting two points A and B.
There is an analogous notion in Minkowski
spacetime (and in all relativistic spacetimes). Think of all
the timelike trajectories that might represent the motion of
some physical system. How do we distinguish those that
are inertial? In the spacetime diagrams, they are drawn as
straight lines since they are straight in several senses. The
one that matters to us here is that they are geodesics
analogous to the geodesics of Euclidean geometry.
In a Euclidean space, every curve has a length. If we drive
a car on some trip, the length of the road we traverse is
measured by the car's odometer. In spacetime there is a
similar notion. As you or I traverse some timelike worldline
in spacetime, we carry an instrument that measures the
curve's "length" in spacetime, analogous to the car's
odometer. That instrument is our wristwatch or any other
clock we carry with us. The length of a timelike curve in
spacetime is just the time elapsed as read by a comoving
clock.
So now we can say which of all timelike trajectories
connecting two events A and B in a Minkowski spacetime is
the inertial trajectory. It is just the timelike geodesic, where:
Timelike geodesic: The timelike curve of greatest proper
time connecting two events.
The definition is exactly like
that of the geodesic of
Euclidean space, except that
we have replaced shortest
spatial length by greatest
proper time. It tells us that we
proceed from event A to
event B with greatest elapsed
time if we follow an inertial
trajectory.
But that fact is just the result of the twin effect! The
stay-at-home twin travels to some event in his future along an
inertial trajectory. The traveling twin follows an accelerated
trajectory along which less proper time elapses.
So we see that the twin effect is as fundamental to the
geometry of a Minkowski spacetime as is the simple idea in
ordinary geometry that a straight line is the shortest
distance between two points.
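To make the odometer analogy concrete, here is a minimal sketch in Python that adds up the proper time along worldlines built from straight segments. The events, the units (days and light-days, so that c = 1) and the function name are assumptions chosen to match the twin example; they are not taken from the original figures.

```python
import math

def proper_time(events):
    """Time elapsed on a clock carried along straight segments joining `events`.

    Each event is a pair (t, x) in days and light-days, so that c = 1.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        if abs(dx) >= dt:
            raise ValueError("segment is not timelike")
        # The spacetime 'length' of a timelike segment.
        total += math.sqrt(dt**2 - dx**2)
    return total

A, B = (0.0, 0.0), (8.0, 0.0)
turnaround = (4.0, 4.0 * math.sqrt(3) / 2)   # traveler moving at about 0.866 c

stay_at_home = proper_time([A, B])             # straight, inertial worldline
traveler = proper_time([A, turnaround, B])     # bent, accelerated worldline

print(stay_at_home, round(traveler, 6))
```

The straight worldline accumulates 8 days and the bent one 4 days. Moving the turnaround event anywhere else off the straight line still gives less than 8 days, which is the sense in which the inertial trajectory is the curve of greatest proper time.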
What you should know:
How the relativity of simultaneity is used to infer that tachyons travel backwards
in time for some observers.
How the relativity of simultaneity makes it easy to see that inertial observers
judge each other's clocks to slow and rods to shrink.
What the twin effect is and why it isn't paradoxical.
What a timelike geodesic is and how it relates to the twin effect.
Copyright John D. Norton. January 2001, September 2002; July 2006; February 3, 2007; January 23, September 24, 2008; January 21, 2010.
Four dimensions
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/four_dimensions/index.html[28/04/2010 08:20:18 ﺹ]
HPS 0410 Einstein for Everyone
What is a four dimensional space like?
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
The one dimensional interval
The two dimensional square
The three dimensional cube
The four dimensional cube: the tesseract
Stereovision
Summary table
A roomy challenge
A knotty challenge
Using colors to visualize the extra dimension
What you should know
We have already seen that there is nothing terribly
mysterious about adding one dimension to space to form
a spacetime. Nonetheless it is hard to resist a lingering
uneasiness about the idea of a four dimensional
spacetime. The problem is not the time part of a four
dimensional spacetime; it is the four. One can readily
imagine the three axes of a three dimensional space: up-down,
across and back-to-front. But where are we to put
the fourth axis to make a four dimensional space?
My present purpose is to show you that there is nothing
at all mysterious in the four dimensions of a spacetime.
To do this, I will drop the time part completely. I will
just consider a four dimensional space; that is, a space
just like our three dimensional space, but with one extra
dimension. What would it be like?
With no effort whatever, I can visualize a three
dimensional space and you can too. What would it be
like to live in a three dimensional cube? To be asked to
visualize that is like being asked to breathe or blink. It is
effortless. There we sit in the cube with its six square
walls and eight corners. Our mind's eye lets us hover
about inside.
Can I visualize what it would be like to live in the four
dimensional analog of a cube, a four dimensional cube or
"tesseract"? I cannot visualize this with the same
effortless immediacy. I doubt that you can either. But
that is just about the only thing we cannot do. Otherwise
we can determine all the properties of a tesseract
and just what it would be like to live in one. There are
many techniques for doing this. I will show you one
below. It involves progressing through the sequence of
dimensions, extrapolating the natural inferences at each
step up to the fourth dimension. Once you have seen
how this is done for the special case of a tesseract, you
will have no trouble applying it to other cases.
The door to the fourth dimension is opening.
The one dimensional interval
The one dimensional analog of a cube is an interval. It is
formed by taking a dimensionless point and dragging it
through a distance. That distance could be 2 inches or 3
feet or anything. Let us call the distance "L".
The interval has length L. It is bounded by 2 points as its
faces: the two points at either end of the interval.
The two dimensional square
The two dimensional analog of a cube is a square. It is
formed by dragging the one dimensional interval through
a distance L in the second dimension.
The square has area L^2. It is bounded by faces on 4
sides. The faces are intervals of length L. We know there
are four of them since its two dimensional axes must be
capped on either end by faces.
So we have 2 dimensions x 2 faces each = 4 faces. The
faces together form a perimeter of 4xL in length.
The three dimensional cube
To form a cube, we take the square and drag it a
distance L in the third dimension.
The cube has volume L^3. It is bounded by faces on 6
sides. The faces are squares of area L^2. We know there
are 6 of them since its three dimensional axes must be
capped on either end by faces.
So we have 3 dimensions x 2 faces each = 6 faces. The
faces together form a surface of 6xL^2 in area. Drawing a
picture of a three dimensional cube on a two dimensional
surface is equally easy. We take two of its faces, two
squares, and connect the corners.
There are several ways of doing the drawing that
correspond to looking at the cube from different angles.
The figure shows two ways of doing it. The first gives an
oblique view; the second looks along one of the axes.
The four dimensional cube: the tesseract
So far I hope you have found our constructions entirely
unchallenging. The next step into four dimensions can be
done equally mechanically. We just systematically repeat
every step above. The only difference is that this time we
cannot readily form a mental picture of what we are
building. But we can know all its properties!
To form a tesseract, we take the cube and drag it a
distance L in the fourth dimension.
The tesseract has volume L^4. It is bounded by faces on 8
sides. The faces are cubes of volume L^3. We know there
are 8 of them since its four dimensional axes must be
capped on either end by faces.
So we have 4 dimensions x 2 faces each = 8 faces. The
faces together form a "surface" (really a three
dimensional volume) of 8xL^3 in volume. Drawing a
picture of a four dimensional tesseract in a three
dimensional space is straightforward. We take two of its
faces, two cubes, and connect the corners.
There are several ways of doing the drawing that
correspond to looking at the tesseract from different
angles. The figure shows two ways of doing it. The first
gives an oblique view; the second looks along one of the
axes.
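The drawing recipe (take two copies of the lower dimensional figure and connect corresponding corners) is mechanical enough to write down as a procedure. The following Python sketch is one way to do it; the coordinates 0 and 1 stand for the two ends of each axis, and the function name is invented for illustration.

```python
def hypercube(n):
    """Corners and edges of the n dimensional cube, built by repeated dragging."""
    corners = [()]   # start from a dimensionless point
    edges = []
    for _ in range(n):
        # Two copies of the current figure, at 0 and at 1 along the new dimension.
        low = [c + (0,) for c in corners]
        high = [c + (1,) for c in corners]
        copied = [(a + (s,), b + (s,)) for s in (0, 1) for a, b in edges]
        # Connect each corner to its dragged copy.
        connecting = list(zip(low, high))
        corners, edges = low + high, copied + connecting
    return corners, edges

for n, name in enumerate(["point", "interval", "square", "cube", "tesseract"]):
    corners, edges = hypercube(n)
    print(name, len(corners), "corners,", len(edges), "edges")
```

Each step doubles the corners and adds one new edge per old corner, so the cube comes out with 8 corners and 12 edges and the tesseract with 16 corners and 32 edges.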
So now we seem to know everything there is to know
about the tesseract! We know its volume in four
dimensional space, how it is put together out of eight
cubes as surfaces and even what the volume of its
surface is (8xL^3).
Stereovision
The "drawings" of the tesseract are hard to see clearly.
That is because they are really supposed to be three
dimensional models in a three dimensional space. So
what we have above are two dimensional drawings
of three dimensional models of a four dimensional
tesseract. No wonder it is getting messy!
The images below are stereo pairs. If you are familiar
with how to view them, you will see that they give you a
nice stereo view of the three dimensional model. If these
are new to you, they take practice to see. You need to
relax your view until your left eye looks at the left image
and the right eye looks at the right image.
But how can you learn to do this? I find it easiest to start
if I sit far away from the screen and gaze out into the
distance over the top of the screen. I see the two
somewhat blurred images on the edge of my field of
vision. As long as I don't focus on them, they start to drift
together. That is the motion you want. The more they drift
together the better. I try to reinforce the drift as best I can
while carefully moving my view toward the images. The
goal is to get the two images to merge. When they do, I
keep staring at the merged images, the focus improves
and the full three dimensional stereo effect snaps in
sharply. The effect is striking and worth a little effort.
This pair is easier to fuse:
and this one is a little harder:
Summary table
We can summarize the development of the properties of
a tesseract as follows:
Dimension  Figure     Face      Volume  Number of faces  Volume of surface/perimeter
1          interval   point     L       1x2=2            two points
2          square     interval  L^2     2x2=4            4L
3          cube       square    L^3     3x2=6            6L^2
4          tesseract  cube      L^4     4x2=8            8L^3
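The pattern in the table can be written as a single rule that works in any number of dimensions. Here is a minimal sketch in Python; the function name and the choice L = 2 are only for illustration.

```python
def cube_properties(n, L):
    """Volume, number of faces and total 'surface' of an n dimensional cube of side L."""
    volume = L**n
    faces = 2 * n               # each of the n axes is capped at both ends
    face_volume = L**(n - 1)    # each face is an (n-1) dimensional cube
    return volume, faces, faces * face_volume

for n, name in [(1, "interval"), (2, "square"), (3, "cube"), (4, "tesseract"), (5, "5-cube")]:
    print(n, name, cube_properties(n, L=2))
```

For the interval the "surface" comes out as the number 2, which simply counts the two end points. The last row shows that nothing stops the extrapolation at four dimensions.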
A roomy challenge
If you were to live in a tesseract, you might choose to live
in its three dimensional surface, much as a two
dimensional person might choose to live in the 6 square
rooms that form the two dimensional surface of a cube.
So your house would be the eight cubes that form the
surface of the tesseract. Imagine that there are doors
wherever two of these cubes meet. If you are in one of
these rooms, how many doors would you see? What
would the next room look like if you passed through one
of the doors? How many doors must you pass through to
get to the farthest room? How many paths lead to that
farthest room? Could you have any windows to outside
the tesseract? What about windows to inside the
tesseract?
Some of these questions are not easy. To answer them,
go back to the easy case of a three dimensional cube
with faces consisting of squares. Ask the analogous
questions there and just extrapolate the answers to the
tesseract.
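If you would like to check your extrapolation, the easy case can be set up as a small counting exercise. This Python sketch labels each room by the axis it caps and the end it sits on; that labeling and the function names are my own devices for illustration, and the example runs only the cube.

```python
from itertools import product

def rooms(n):
    """Faces of the n dimensional cube, each labeled (axis, end) with end = 0 or 1."""
    return list(product(range(n), (0, 1)))

def share_a_wall(a, b):
    """Two faces meet unless they cap the same axis."""
    return a[0] != b[0]

def doors_seen(n):
    """Number of rooms adjoining any one room."""
    start = (0, 0)
    return sum(share_a_wall(start, r) for r in rooms(n))

def two_door_routes_to_opposite(n):
    """Routes from a room to the opposite room through one intermediate room."""
    start, opposite = (0, 0), (0, 1)
    return sum(share_a_wall(start, r) and share_a_wall(r, opposite) for r in rooms(n))

# The easy case: the six square rooms on the surface of a cube.
print(doors_seen(3), two_door_routes_to_opposite(3))
```

For the cube this reports 4 doors per room and 4 two-door routes to the opposite room. Calling the same functions with 4 in place of 3 gives the corresponding counts for the tesseract.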
A knotty challenge
Access to a fourth dimension makes many things
possible that would otherwise be quite impossible. To
see how this works, we'll use the strategy of thinking out
a process in a three dimensional space. Then we
replicate it in a four dimensional space.
Consider a coin lying in a frame on a table top.
There is no way the coin can be removed from
the frame within the confines of the two
dimensional surface of the table. Now recall
that we have access to a third dimension. The
coin is easily removed merely by lifting it into
the third dimension, the height above the table.
We are then free to move the coin as we
please in the higher layer and then lower it back
to the tabletop outside the frame.
The thing to notice about the lifting is that the motion
does not move the coin at all in the two horizontal
directions of the two dimensional space. So the motion
never brings it near the frame and there is no danger
of collision with the frame.
Now repeat this analysis for its analog in one higher dimension, a
marble trapped within a three dimensional box.
The marble can be removed in exactly the same way by "lifting"
it, this time into the fourth dimension. As with the coin in the
frame, the key thing to note is that in this lifting motion, the
marble's position in the three spatial directions of the box is
unchanged. The marble never comes near the walls and
there is no danger of colliding with them.
Once it is lifted into a new three dimensional space, it can be
moved around freely in that space and lowered back into the
original three dimensional space, but now outside the box.
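The coin and the marble escape in the same way, and the same few lines of Python can check both. This is a sketch under simplifying assumptions that are mine rather than the text's: the coin or marble is treated as a point, the frame or box is centered on the origin with walls about one unit out, and the walls exist only where the extra coordinate is zero.

```python
def hits_wall(point, inner=0.9, outer=1.1, tol=1e-9):
    """True if `point` lies in the wall of a box centered on the origin.

    The wall exists only in the original space, where the last (extra)
    coordinate is zero, and occupies the shell where the largest of the
    other coordinates lies between `inner` and `outer`.
    """
    *position, extra = point
    reach = max(abs(c) for c in position)
    return abs(extra) < tol and inner <= reach <= outer

def path_is_clear(waypoints, steps=100):
    """Sample straight moves between waypoints and check that none touches the wall."""
    for start, end in zip(waypoints, waypoints[1:]):
        for i in range(steps + 1):
            f = i / steps
            point = [s + f * (e - s) for s, e in zip(start, end)]
            if hits_wall(point):
                return False
    return True

# Coin in a frame: coordinates (x, y, height).
direct_coin = [(0, 0, 0), (2, 0, 0)]
lifted_coin = [(0, 0, 0), (0, 0, 1), (2, 0, 1), (2, 0, 0)]
# Marble in a box: coordinates (x, y, z, fourth).
direct_marble = [(0, 0, 0, 0), (2, 0, 0, 0)]
lifted_marble = [(0, 0, 0, 0), (0, 0, 0, 1), (2, 0, 0, 1), (2, 0, 0, 0)]

print(path_is_clear(direct_coin), path_is_clear(lifted_coin))
print(path_is_clear(direct_marble), path_is_clear(lifted_marble))
```

Both lines print False then True: the direct path runs into the wall, while the lifted path never does. The only difference between the two cases is one more coordinate in each point.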
Now finally consider two linked rings in some
three dimensional space. Can we separate them
using access to a fourth dimension?
It can be done by exactly the same process of
lifting one of the rings into the fourth dimension. As
before, note that the lifting does not move the ring
in any of the three directions of the three
dimensional space holding the initially linked rings.
So the motion risks no collision of the moved ring
with the other. The lifting simply elevates the
moved ring to a new three dimensional layer of the
four dimensional space in which no part of the
other ring is found. The moved ring can then be
freely relocated in that new layer and, if we
please, lowered back into the original three
dimensional space in quite a different location.
Now comes the knotty challenge. We are familiar in our
three dimensional space with tying knots in a rope. Some
knots are just apparent tangles that can come apart
pretty easily. Others are real and can only be undone by
threading the end of the rope through a loop. So take this
to be a real knot: one that cannot be undone by any
manipulation of the rope if we cannot get hold of the
ends. (Imagine, if you like, that they are each anchored
to a wall and cannot be removed.)
The challenge is to convince yourself that there are no
real knots in ropes in a four dimensional space. The
principal aid you will need is the manipulation above of
the linked rings. To get yourself started, imagine how you
would use a fourth dimension to untie some simple knot
you can easily imagine.
Using colors to visualize the extra dimension
Does the general idea of "lifting" an object into the fourth
dimension still seem elusive? If so, here's a technique for
visualizing it that may just help. The trick is to imagine
that differences in position in the extra dimension of
space can be represented by differences of colors.
Here's how it works when we start with a two dimensional space
and lift into the third dimension. The objects in the original two
dimensional space are black. As we lift through the third
dimension, they successively take on the colors blue, green and
red.
Now let's apply this colored layer trick to
the earlier example of lifting a coin out of a
frame. The coin starts in the same two
dimensional space as the frame. We lift it
up into the third dimension into a higher
spatial layer that we have color-coded
red. In this higher layer, the coin can move
freely left/right and front/back without
intersecting the frame. We move it to the
right until it passes over the frame. Then
we lower it back down outside.
Now imagine that we cannot perceive the
third dimension directly. Here's how we'd picture
the coin's escape. It starts out inside the frame in
the space of the frame. It is then lifted out of the
frame into the third dimension. At that moment, it
is indicated by a ghostly red coin. Its spatial
position in the left/right and front/back direction
has not changed. All that has changed is its
height. It is now in the red height layer. If we
move the coin left or right, or front and back, in
this red layer, it no longer intersects the frame
and can move right over it. We won't see it move
over the frame, however. As far as we are
concerned it will just move through it.
The motion of the coin in this third dimensional
escape passage is illustrated by the ghostly red
coin.
This last analysis of the coin in the frame is the template for
dealing with the real case of a marble trapped inside a three
dimensional box. If the marble moves in any of the three familiar
dimensions (up/down, left/right and front/back), its motion
intersects the walls of the box and it cannot escape. So we lift
the marble into the fourth dimension, without changing its
position in the three familiar dimensions. In the figure, this is
shown by the marble turning ghostly red. In the red space, the
marble is free to move up/down, left/right and front/back, without
intersecting the box's walls. The marble then moves so that it
passes over one of the walls. It is then lowered out of the red
space back to the original three dimensional space of the box,
but now outside the walls.
The same analysis applies to the linked rings. One ring
is lifted out of the three dimensional space of the original
set up. In this red space, the ring can move freely without
intersecting the other ring. We move it well away from the
other ring and then drop it back into the original three
dimensional space. It is now unlinked from the other ring.
What you should know
The properties of squares, cubes and tesseracts.
How to arrive at the properties of a tesseract and
other four dimensional figures by extrapolating the
methods used to get the properties of a cube.
Copyright John D. Norton. February 2001; July 2006, February 2, 2008.
Significance
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/significance/index.html[28/04/2010 08:20:53 ﺹ]
HPS 0410 Einstein for Everyone
Philosophical Significance of the
Special Theory of Relativity
or
What does it all mean?
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
The Project
The Method
Candidate Morals
1a. Skepticism about common sense ideas
1b. Skepticism about science
2. A General Relativism
3. Time is the fourth dimension
4. Verificationism
Historical background to Verificationism
What is my view of all this?
5. Operationism (P.W.Bridgman)
What is my view of this?
6. The use of evidence: common causes and
common origins
7. Change is illusion
8. Causal Theory of Time (H. Reichenbach):
My Picks
What you should know
The Project
Special relativity has changed our understanding of the
nature of space, time, energy and other physical
quantities. There is a very widespread feeling that the
advent of special relativity has somehow changed the
way we look at things in a sense that goes beyond
these narrow physical results. What might that sense
be?
The problem in answering is that there is scarcely a
viewpoint or movement in modern philosophical thought
that does not claim support in one way or another from
Einstein's achievement. Clearly they cannot all be
right. Quite often radically opposed viewpoints claim
support from Einstein's achievement. In the end it is up
to you to decide, since the issue remains controversial.
You should use your knowledge of Einstein's theory and
the circumstances surrounding its emergence to assist
you.
The Method
Deciding what this significance might be is a
philosophical problem of no small interest. It must be
resolved by the standard methods of philosophical
analysis. These methods are simple to describe and not
so difficult to learn. To begin, we need to keep two
notions in mind:
The thesis or claim.
Just what is it that is being claimed? This must be
stated in as simple and clear a manner as possible.
No real progress can be made until we know what
this is in precise and unambiguous terms. Often
merely finding the clear statement is an advance in
itself.
The arguments that support the claim.
A thesis or claim by itself is only of so much value.
What now matters is what reasons can be given to
believe the claim. These reasons should be laid out
in as cogent a form as possible. Typically these
reasons will take the form of an explicit argument.
To support a claim we might try to show that if you
believe the widely accepted X, Y and Z, then logic
forces you to accept the new claim C. Or the
argument might be that if you fail to believe the
thesis you will be caught up in some undesirable
consequence, even an outright contradiction.
What is meant by "logic forces"? I just mean that the
inference from X, Y and Z to C is a valid argument, where
"argument" means what it means in any introductory logic
class. If you haven't had such a class, you should. It will
help you clarify your thinking a great deal. There you will
learn that an argument is not a shouting match. It is a
sequence of propositions. A valid argument is one in which
each proposition of the sequence is either introduced as a
premise or inferred from propositions earlier in the
sequence. A valid argument is one in which the truth of
the premises necessitates the truth of conclusions inferred.
For example:
1. All men are mortal. (Premise)
2. Socrates is a man. (Premise)
3. Socrates is mortal. (Inferred from 1,2.)
is a valid argument since, if premises 1. and 2. are true,
then the conclusion 3. must also be true.
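The idea that the premises "necessitate" the conclusion can be tested mechanically in simple cases. The sketch below, in Python, uses a simplified propositional version of the Socrates argument in which premise 1 is narrowed to "if Socrates is a man, then Socrates is mortal"; that simplification and the function names are mine. An argument form counts as valid if no assignment of truth values makes all the premises true and the conclusion false.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars):
    """True if no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# man: "Socrates is a man"; mortal: "Socrates is mortal".
if_man_then_mortal = lambda man, mortal: (not man) or mortal
is_man = lambda man, mortal: man
is_mortal = lambda man, mortal: mortal

# The argument in the text: valid.
print(is_valid([if_man_then_mortal, is_man], is_mortal, 2))
# Swapping premise 2 and the conclusion: not valid.
print(is_valid([if_man_then_mortal, is_mortal], is_man, 2))
```

The first check prints True and the second False. The second form fails because something other than a man could be mortal, so true premises do not force the conclusion.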
These two ideas are easy to state and
look rather simple to satisfy. That is true
as long as the problems dealt with are
themselves easy. However once we
start to entertain the traditionally
intractable problems of philosophy,
holding to them can become quite
demanding. Success at it may be a
significant achievement and the best
work in philosophy is distinguished by
its success in holding to them in
adverse circumstances.
Our goal is to take something that is puzzling, vague
and elusive and make it precise and definite. If we do it
right, the resolution of the puzzle should seem so
straightforward that we wonder why it ever seemed
otherwise.
For a discussion of philosophical morals that can be drawn
from relativity theory concerning space and time, see my
paper PITT PHILSCI00000138 on philsci-archive. Beware.
The discussion is at a more advanced level than presumed in
this class, so it is only for the adventurous.
Candidate Morals
The obvious candidate is just the basic content of the
theory itself. It tells us some pretty surprising things
about space and time and the matter they contain: that c
is a fundamental barrier to all motions; that moving
clocks slow; that simultaneity is relative; that energy and
mass are equivalent; and so on. In so far as a perennial
problem of philosophy has been to discern the nature of
space and time, this is a reasonable answer. However it
is usually thought that the advent of relativity somehow
changed something fundamental, perhaps in how we
see ourselves in the universe, or, more narrowly, in how
we conduct our scientific investigations of that universe.
Our quest is for morals of that type.
Here are some candidate morals of this broader type. I
will give you my reaction to them to give you an
example of how these claims may be analyzed and also
to let you know what I think. Do you agree? Make up
your own mind as you proceed. Your decision will be
reported in the assignment.
1a. Skepticism about common sense ideas
Relativity shows us that we cannot expect
our common sense ideas about the physical
world to be reliable.
On its face this is acceptable. This claim is clear
enough. The argument for it also takes no imagination
to see. We had commonsense ideas about rods, clocks,
simultaneity and more. We believed them because they
seemed, well, commonsensical. Relativity showed that
they were incorrect. Therefore commonsense ideas are
untrustworthy, at least on some occasions.
This argument is acceptable. The main part that I don't
like is the suggestion that we needed relativity theory
to tell us this. Anyone who has ever attended to
developments in science will find numerous examples of
science revealing the fragility of commonsense ideas.
Copernicus did just that to our commonsense idea that
we are at rest; he showed that we hurtle through space at
great speed, spinning all the while.
There is a more important connection between
scientific breakthroughs and common sense.
Here's a list of things that we know through
commonsense:
The sky cannot fall down.
Cows cannot jump over the moon.
The earth is spherical.
People venturing to the other side will not fall off.
The earth spins on its axis and orbits the sun.
Matter is made of atoms too small to see.
Nothing goes faster than light.
The items of the list become successively more
sophisticated. Indeed inspecting them reveals that each
item of today's commonsense is a major result of
yesterday's science.
What this suggests is that there is no independent
notion of a common sense idea that somehow sits
outside what we know through systematic
investigations. Rather commonsense is a byproduct of
those investigations. The broad acceptance of common
sense ideas about our physical world is merely the final
stage of absorption of the results of scientific
investigation. That is why today's common sense is
yesterday's scientific breakthrough.
From this we can infer a more subtle moral: there is
a kind of reliability in common sense ideas since they
are ultimately, though indirectly, grounded in something
more solid. Rather than needing a blanket skepticism
about common sense ideas, the real thing to guard
against is common sense that does not keep pace with
newer investigations.
For example, it still seems to be a part of common
sense that "airs" can be good for you. Don't we know of
the benefits of clear mountain air? Similarly the wrong
"airs" were thought to be unhealthy. The disease of
malaria, literally "bad air" (mal aria), was thought to be
caused by them. Of course we now know that malaria is
really caused by infection from mosquito-borne
parasites, with the mosquitos coming from swamps that
might also emit bad smells. So the idea of avoiding bad
smelling places to avoid the disease was right, but only
indirectly.
1b. Skepticism about science
Relativity shows us that even the best of our
theories (classical mechanics) is unreliable.
Why should we believe any of the theories of
modern science? Should we not expect the
Einsteins of tomorrow to overturn them all?
Alchemist searching for the philosopher's stone that will convert base metals into gold.
The thesis is clear. The argument is also clear.
Relativity is just the latest of many instances of new
science overturning old theories we thought secure. So
we should expect our latest theories will eventually also
be overturned, so don't believe them.
In my view this is a lamentable argument, defective
because it rests on a false premise: the idea that
relativity theory simply wiped away all the physics that
went before. It did not. The bulk of that physics stays
intact. Classical physics only needs relativistic
corrections when we deal with velocities close to the
speed of light. In virtually all applications, from designing
bridges to launching Apollo astronauts, classical physics
suffices.
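To give a sense of scale for the claim that relativistic corrections matter only near the speed of light, here is a minimal sketch that computes the Lorentz factor gamma = 1/sqrt(1 - v^2/c^2) and the fractional correction gamma - 1. The function names and the sample speeds (a car at about 30 m/s, a spacecraft at about 11,000 m/s) are illustrative assumptions, not figures taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v):
    """Return gamma = 1 / sqrt(1 - v^2/c^2) for a speed v in m/s."""
    if abs(v) >= C:
        raise ValueError("speed must be below the speed of light")
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def fractional_correction(v):
    """Return gamma - 1, the fractional size of the relativistic correction.

    Uses beta^2 / (root * (1 + root)) with root = sqrt(1 - beta^2), which is
    algebraically equal to gamma - 1 but avoids subtracting two nearly equal
    numbers when the speed is small.
    """
    if abs(v) >= C:
        raise ValueError("speed must be below the speed of light")
    beta2 = (v / C) ** 2
    root = math.sqrt(1.0 - beta2)
    return beta2 / (root * (1.0 + root))


# Hypothetical sample speeds, chosen only for illustration.
samples = {
    "car on a highway": 30.0,
    "spacecraft leaving the earth": 11_000.0,
    "half the speed of light": 0.5 * C,
}
for label, v in samples.items():
    print(f"{label}: gamma - 1 = {fractional_correction(v):.3e}")
```

For the spacecraft speed assumed here the correction is less than one part in a billion, which is why classical physics suffices for such applications.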
The real pattern is that, once a science reaches some
level of maturity, it becomes a fixture in the domains in
which it was developed. The much publicized
revolutions that eventually do arise supply adjustments
outside of those domains. Here are some examples:
| Science | Maturity achieved | Where it eventually fails |
|---|---|---|
| Geometry | Ancient Greece, Euclid, 3rd century BC | On cosmic scales |
| Solar system astronomy | Heliocentrism, Copernicus, Kepler, 16th and 17th century | Very precise measurements correct their predictions but leave the heliocentric layout intact. |
| Dynamics | Newtonian mechanics, 17th century | Domains of: very fast (special relativity); very heavy (general relativity); very large (relativistic cosmology); very small (quantum theory) |
There is a much more benign moral in all this: do not
trust theories in domains remote from those in which
they were devised. The persistence of the skeptical
argument is a puzzle to me. It simply rests on defective
history of science, yet it remains popular among many
historians of science who should know better.
2. A General Relativism
Einstein has shown us that the fundamental
quantities of physics are relative. Is this not
a quite general moral? Is not what is true or
false or what is right or wrong relative to the
individual? Should we not say "Everything is
relative"?
This argument is defective. First, that certain quantities
in relativity theory are relative to the observer or, better
said, state of motion of the observer, has no real
bearing on whether there is one true standard for the
good or the morally right. The same word, "relativism," is
used in all cases, but the similarity of meaning is so
superficial as not to allow success in one domain to
carry to another.
Second, it is not true in relativity theory that
"everything is relative". Only certain quantities are,
albeit more than in classical physics. Some quantities
are not relative. The simplest examples are the so
called "rest" quantities: rest mass, rest length etc. These
are by definition the masses and lengths measured by a
comoving observer. They are characteristic properties
of bodies and are of fundamental physical importance;
(obviously) all observers must agree on their values.
They are an absolute.
It is something of an accident of history to do with
Einstein's way of thinking about relativity theory that we
stress the "relative" aspect of the theory. In the more
mathematical approach to the theory, what draws most
attention is what is not relative, the so-called
"invariants." So early in the history Einstein agreed with
the great mathematician Felix Klein that a better name
for the theory would have been "theory of invariants."
Working in that mathematical tradition, Hermann
Minkowski, who introduced the notion of spacetime,
wrote in his 1908 lecture "Space and Time":
"...the word[s] relativity-postulate for the requirement of
an invariance with the group G_c seem to me feeble.
Since the postulate comes to mean that only the
four-dimensional world in space and time is given by
phenomena, but that the projection in space and in time
may still be undertaken with a certain degree of
freedom, I prefer to call it the postulate of the absolute
world (or briefly the world-postulate)."
Had history paid more attention to Minkowski's
advocacy of the absolute world, might I now be
lamenting the fallacy of inferring that "everything is
absolute" from Einstein's theory?!
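The contrast drawn in this section between relative quantities and invariants can be made concrete with a small numerical sketch. It uses the standard Lorentz transformation in one spatial dimension, in units where c = 1; the function names and the sample event are illustrative assumptions. The time and position assigned to an event change from observer to observer, while the combination t^2 - x^2 (the squared spacetime interval from the origin) does not.

```python
import math


def boost(t, x, v):
    """Lorentz-transform the event (t, x) to a frame moving at velocity v.

    Units are chosen so that c = 1, so v is a fraction of the speed of light.
    """
    if abs(v) >= 1.0:
        raise ValueError("relative speed must be below c = 1")
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)


def interval_squared(t, x):
    """Squared spacetime interval between the origin and the event (t, x)."""
    return t * t - x * x


# A hypothetical event, described by observers in different states of motion.
event = (5.0, 3.0)
for v in (0.0, 0.3, 0.6, -0.8):
    t_new, x_new = boost(event[0], event[1], v)
    print(f"v = {v:+.1f}: t = {t_new:.4f}, x = {x_new:.4f}, "
          f"t^2 - x^2 = {interval_squared(t_new, x_new):.4f}")
```

Each observer reports different values of t and x, yet all agree on t^2 - x^2, which is the sort of quantity the "theory of invariants" label points to.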
3. Time is the fourth dimension
With the transition to relativity theory, we no
longer conduct our physics in a three-dimensional
space; we now employ the four-dimensional
spacetime introduced by
Minkowski.
This slogan "time is the fourth dimension" is a
mischievous slogan, used, as far as I can tell, to
intimidate novices. They are supposed to be awed by
the apparent profundity of the claim while at the same
time never being able quite to grasp its content at the
insightful depth apparently accessible to the
mischief-making sloganeer. If you meet such a sloganeer, you
should ask "what precisely do you mean?" Keep in
mind the confusion favored by sloganeers sketched
below and insist on a precise answer!
The power of the slogan comes
from what it suggests but does not say. It
suggests something like: "In 1903,
the Wright brothers liberated us from
the two dimensions of the space of
the earth's surface and opened a
new, third dimension, altitude. In
1905, Einstein did it again with a
new dimension, time." Spelled out
bluntly like this, the suggestion is
obviously nonsense.
There is no interesting content to the claim. The
problem lies in the vagueness of the statement of the
thesis. There are two readings possible for it and
neither yields results of importance.
In a trivial and true reading, we allow that space and
time taken together form a manifold of four dimensions.
That just means that four numbers are needed to
locate an event in spacetime. Three of them are the
usual spatial coordinates and the last is a time
coordinate. That is true and was always true in classical
physics as well. There is nothing of novel interest in this
reading beyond the usual banalities about how things
change with time. The idea that this sort of spatial
representation of time is possible is as old as a pocket
book calendar in which the passage of time is
represented by a sequence of boxes or list of dates.
There is a profound but false version of the slogan.
What if time were a fourth dimension just like the three
dimensions of space? That would be extraordinary. It
would mean that we could move about in the time dimension
just as we move about in the space dimension. But time
is not just like space in relativity theory. The theory
keeps the timelike direction in spacetime quite distinct
from the spacelike; the light cone structure does this
quite effectively. So relativity theory contradicts this
profound reading.
Underlying the profound reading is a simple fallacy.
We note that in a spacetime formulation of relativity
theory, time is usefully represented spatially in a
diagram. So we can infer time must be like space in
some aspects or this device would fail. It does not follow
that time is like space in all aspects. Analogously, we
can represent the spectrum of colors spatially with color
wheels and rainbows. That does not mean that colors
are spatial. Red is not the fifth dimension of space.
There is an interesting entanglement of space and time
in relativity theory captured in the relativity of
simultaneity. But the slogan of time as the fourth
dimension is a defective and misleading way of
expressing it.
4. Verificationism
Einstein eliminated the ether from physics
since there were no observable
circumstances in which our motion through it
could be revealed. This is compatible with a
verificationist approach to all propositions.
According to it, a proposition is meaningless
unless there are circumstances conceivable
under which it could be proven true (verified)
or at least confirmed.
Einstein's establishment of special relativity has been
judged by many to embody the core insight of a strong
movement in philosophy from the earlier part of the 20th
century. Hans Reichenbach was a German philosopher
who learned relativity theory from Einstein in Berlin in the
1910s and became one of his principal philosophical
interpreters. He wrote in his contribution to the 1949
volume Albert Einstein: Philosopher-Scientist, which
celebrated Einstein, in his chapter "The Philosophical
Significance of the Theory of Relativity" (pp. 290-91) [with
my added paragraph breaks]:
"To advocate the philosophical significance of Einstein's
theory, however, does not mean to make Einstein a
philosopher; or, at least, it does not mean that Einstein
is a philosopher of primary intent. Einstein's primary
objectives were all in the realm of physics.
But he saw that certain physical problems could not be
solved unless the solutions were preceded by a logical
analysis of the fundamentals of space and time, and he
saw that this analysis, in turn, presupposed a
philosophic readjustment of certain familiar conceptions
of knowledge.
The physicist who wanted to understand the Michelson
experiment had to commit himself to a philosophy for
which the meaning of a statement is reducible to
its verifiability, that is, he had to adopt the verifiability
theory of meaning if he wanted to escape a maze of
ambiguous questions and gratuitous complications.
It is this positivist, or let me rather say, empiricist
commitment which determines the philosophical position
of Einstein. It was not necessary for him to elaborate on
it to any great extent; he merely had to join a trend of
development characterized, within the generation of
physicists before him, by such names as Kirchhoff,
Hertz, Mach, and to carry through to its ultimate
consequences a philosophical evolution documented at
earlier stages in such principles as Occam's razor and
Leibniz' identity of indiscernibles."
Reichenbach's analysis depends upon comparing
Einstein's view with that of his contemporaries:
| Two theories in 1905 | Einstein's special theory of relativity | H. A. Lorentz's electron theory |
|---|---|---|
| Agreed on what could be observed | moving rods contract, moving clocks slow, ..., no observably distinguishable state of rest. | moving rods contract, temporal processes of moving bodies slow, ..., no observably distinguishable state of rest. |
| Disagreed on the unobserved things posited by theory | no such thing as an ether with a preferred ether state of rest | Motion through the ether with respect to its preferred state of rest causes rods to contract and clocks to slow, etc. They do so in just the right amount to prevent us distinguishing which inertial state of motion coincides with rest in the ether. |
One sees in this comparison the essential intuition
that guides Reichenbach's analysis: something seems
to be wrong with Lorentz's theory. It has an extra
element, the ether with its state of rest, that is not
present in Einstein's theory, even though both theories
make the same predictions.
The elusive nature of this ether
state of rest and Einstein's
reaction to it was later captured in
the slogan "the difference that
makes no difference is no
difference." That slogan seems to
capture an obvious and simple
view.
To illustrate the idea, imagine that I insist that there are pixies in
the mountains, but that you will never see them, no matter how
hard you search, since they hide perfectly behind the trees
whenever you come near. You would surely doubt my assurance
and properly suspect that there really are no pixies in the
mountains. If their presence leaves no observable trace, my
insistence that they really are there looks like a delusion.
Historical background to Verificationism
Reichenbach located Einstein's thought in a
tradition that was built around the intuition
captured in this slogan. The doctrine of positivism
is associated with Auguste Comte (1798-1857) and Ernst Mach
(1838-1916). One of its central ideas was that a
theory in science is nothing more than a
compact summary of experience.
Auguste Comte
Ernst Mach
For example, Galileo noted that, on many
occasions, the distance a body fell in times 1, 2,
3, ... was proportional to the square of these times
1, 4, 9, ... He then announced his law of fall, that
the distance of fall is proportional to the
square of the time. All he was announcing was a
compact summary of these experimental results.
In so far as our assertions in science go beyond
these compact summaries they are disparaged as
idle metaphysics.
What about the pixies in the mountains?
We would no longer have to worry about
whether they are really there. For the
positivist, a proper theory of living beings in
the mountains would only include a register
of what we have seen. There would be no
pixies since there are no experiences
associated with them.
The central themes of positivism were picked up and
developed in the 1920s by a tradition that came to be
known as logical positivism. It added to positivism a
serious engagement with the use of the machinery of
formal logic. The hope was that formalizing language by
its machinery would reduce all disagreement to issues
that could be settled by its precise techniques. Its leading
figures took Mach as their patron spirit and met in Vienna
as the celebrated "Vienna Circle," with leading members
including M. Schlick, R. Carnap, O. Neurath and F.
Waismann. A comparable movement developed in Berlin
under Hans Reichenbach and used the label "logical
empiricism."
For example, imagine that I insist
that the latest sage of my favorite
cult is immortal. You disagree
and think he is mortal. We
resolve our dispute by finding
premises on which we agree.
You might check whether I agree
that 1. All men are mortal; and 2.
The sage is a man. If so, then
logic forces the conclusion that 3.
The sage is mortal. And if we
don't agree on these premises,
we just move things back a step
and see what grounds them.
The logical structure of
arguments can be represented
symbolically:
1. If A then B. 2. A. 3. Therefore
B.
So the hope was that this entire
procedure of resolution could be
conducted symbolically.
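As a minimal illustration of what checking an argument symbolically could look like, the sketch below tests an argument form by brute force over truth values: a form counts as valid when no assignment of truth values makes every premise true and the conclusion false. The helper names (implies, is_valid) are hypothetical, and this truth-table method is just one simple technique, not a reconstruction of the logical positivists' own formal systems.

```python
from itertools import product


def implies(a, b):
    """Material conditional: 'if a then b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion, n_vars):
    """Return True if no truth assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True


# 1. If A then B.  2. A.  3. Therefore B.
modus_ponens = is_valid(
    premises=[lambda a, b: implies(a, b), lambda a, b: a],
    conclusion=lambda a, b: b,
    n_vars=2,
)

# For contrast, an invalid form: 1. If A then B.  2. B.  3. Therefore A.
affirming_the_consequent = is_valid(
    premises=[lambda a, b: implies(a, b), lambda a, b: b],
    conclusion=lambda a, b: a,
    n_vars=2,
)

print("modus ponens valid:", modus_ponens)
print("affirming the consequent valid:", affirming_the_consequent)
```

The first form passes the check and the second fails it, since A false and B true makes both of its premises true and its conclusion false.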
Its core slogan was initially formulated by Friedrich
Waismann in 1930 and then developed by Carnap,
Schlick and Neurath. It is the "verifiability principle,"
according to which
"The meaning of a proposition is the means
of verification."
where verification is just the demonstration of the
proposition's truth.
At first, the principle seems to make no sense. How
can the meaning of something be a "means," that is,
a way of doing something? The meaning of the
proposition "There are three marbles in the box," one
would think, is just what the proposition says:
somewhere there is a box and it has three marbles in it.
The means of verification of the proposition is something
different. It is whatever technique we may use to locate
the box, open it and count up the marbles inside.
Odd as this definition is, the motivation for it becomes
immediately clear if we apply it to the problem cases
we've been looking at. Take the proposition "there are
exactly three pixies in the mountains." We have seen
that there is no means of verifying this. So the
verifiability principle immediately gives us a comfortable
result. The proposition has no meaning.
One of the famous hoax photos of the Cottingley fairies, taken in 1917.
They fooled Sir Arthur Conan Doyle, author of the Sherlock Holmes stories.
So a more straightforward version of the principle
makes this interest in the meaningfulness of
propositions directly apparent. It asserts:
A proposition is meaningful if and only if it is possible to
  verify or falsify it (strong version), or
  confirm or disconfirm it (weak version).
To verify or falsify is to demonstrate truth or falsity.
Finding that an electron has negative charge falsifies
the proposition that all electrons are uncharged. But it
does not verify the proposition that all electrons have
negative charge, since it is still possible that other
electrons have no charge. To confirm or disconfirm is
to display evidence that increases or decreases the
probability of the proposition. So the finding of an
electron with negative charge confirms to some small
degree that all electrons have negative charge.
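One common way to make "increases the probability" precise is Bayes' theorem, and the sketch below uses it to contrast confirmation with falsification. The text does not commit to this framework; the prior of 0.5 and the rival hypothesis (that only 90% of electrons are negatively charged) are made-up numbers chosen purely for illustration.

```python
def posterior(prior_h, likelihood_h, likelihood_rival):
    """Return P(H | E) by Bayes' theorem.

    prior_h          -- P(H), probability of the hypothesis before the evidence
    likelihood_h     -- P(E | H), probability of the evidence if H is true
    likelihood_rival -- P(E | not H), probability of the evidence otherwise
    """
    evidence = prior_h * likelihood_h + (1.0 - prior_h) * likelihood_rival
    if evidence == 0.0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return prior_h * likelihood_h / evidence


PRIOR = 0.5  # assumed starting probability, for illustration only

# Confirmation: H = "all electrons have negative charge".
# Finding one negative electron is certain under H and (by assumption)
# 90% likely under the rival hypothesis.
confirmed = posterior(PRIOR, likelihood_h=1.0, likelihood_rival=0.9)

# Falsification: H = "all electrons are uncharged".
# Finding one negative electron is impossible under H.
falsified = posterior(PRIOR, likelihood_h=0.0, likelihood_rival=0.9)

print(f"all negative:  {PRIOR:.3f} -> {confirmed:.3f}")
print(f"all uncharged: {PRIOR:.3f} -> {falsified:.3f}")
```

Under these assumed numbers a single observation raises the probability of the universal claim only slightly, while it drops the probability of the contrary universal claim to zero.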
The verifiability principle demonstrated great power to
cut off long-standing philosophical disputes.
Proposition after proposition fails the principle's test. So
they are judged meaningless (well-disguised forms of
babble) and thus no longer worthy of philosophical
scrutiny and debate. Here are examples of propositions
all beaten to meaninglessness by the cudgel of the
verifiability principle.
Reality is spiritual.
The moral rightness of an action is a nonempirical
property.
Beauty is significant form.
God created the world for the fulfillment of his purpose.
(These from the Encyclopedia of Philosophy.)
What distinguishes live from dead matter is more than
chemistry; it is the presence of a life force.
(Carnap)
Against this background, one can see immediately why
Reichenbach could mount such enthusiasm for
Einstein's work in special relativity. Typical applications
of the verifiability principle are located in long-standing
philosophical debates. But here is Einstein using
reasoning in a signal scientific breakthrough that
looks just the same.
Take the proposition: there is an ether with a unique
state of rest. What Einstein found in developing his
special theory of relativity was that no observation could
distinguish it. So Einstein banished it from physics, and,
Reichenbach in effect notes, with essentially the same
reasoning as led the logical positivists to discard the
spirituality of reality and life forces.
Some of you may notice a similarity between these
ideas and Karl Popper's celebrated analysis of what it
is to be scientific. While Popper energetically defended
his priority and creativity, it is not hard to see that his
formulation is a minor variation of the logical positivists'
views. Where they say to be meaningful is to be
verifiable or falsifiable, Popper says that to be scientific
is to be falsifiable. In retrospect, these are small
differences that only a true zealot could muster the
energy to debate fiercely.
What is my view of all this?
There is a lot that is right in this approach. Einstein
found a circumstance in which something was claimed
to exist (an ether state of rest) while at the same time
our best theories predicted that we could never detect it.
Such a circumstance ought to be troubling and signal to
us that something has gone seriously awry in our
theorizing. We have created a physical notion that is by
construction shielded from all possibility of physical test.
However I also believe that the verificationists went too
far. They urged not just that the proposal for things like
the ether state of rest was defective. They urged that it
was meaningless. That goes too far. The proposition
"There is an ether state of rest." is judged by them to be
meaningless blather, cognitively equivalent to a grunt or
a drool. Surely the proposition is perfectly meaningful;
we understand just what it says and presumably so did
Einstein. The problem, as I suggested above, is that we
have no good reason to believe it. Our best judgement
would be to say it is probably false.
5. Operationism (P.W.Bridgman)
By recognizing that the meanings of all
concepts are fixed solely by the operations
needed to verify them, we avoid smuggling
arbitrary preconceptions into our conceptual
systems that may prevent us learning new
things from experience.
Percy William Bridgman (1882-1961) was a Nobel Prize winning
experimental physicist who also wrote about scientific methodology,
especially in his 1927 The Logic of Modern Physics. He believed that
one could learn an important moral about the nature of concepts
in scientific theories by attending to what Einstein did. Here is
his review of what Einstein did and the morals we should draw from
it.
| What Einstein did | Bridgman's moral |
|---|---|
| Einstein learned the principle of relativity and the light postulate. | New experience is always possible. |
| Einstein could not initially accept them. They appeared irreconcilable because of Einstein's tacit, but erroneous, presumption of absolute simultaneity. | We may not be able to accommodate new experience in our conceptual system because of false presumptions hidden in it. |
| Einstein defined the concept of simultaneity through operations with light signals and revealed the falsity of the presumption of absolute simultaneity. | If we define all our concepts operationally, we purge our conceptual system of harmful, false assumptions. |
| Einstein's revised concepts of space and time are now able to accept new experiences, including relativistic length contraction and time dilation. | Our conceptual system is now prepared for new experiences. |
In sum, Bridgman's goal was to revise our system of
concepts so that we might never again face a
revolution triggered by concepts that had false
presumptions buried in them. Had we realized that
different operations are used to measure the length of
moving bodies and the length of resting bodies, we
might have been prepared for the possibility that the two
might not be the same. He proclaimed:
"We must remain aware of these joints in our
conceptual system if we hope to render unnecessary
the services of the unborn Einsteins."
Bridgman used the length of a rod as a way of
illustrating his basic idea. Before operationalism, we just
talked of the length of a rod, assuming that there is just
one length for it. So we were ill prepared to learn that
moving rods have a length different from rods at rest.
Bridgman presumed that a concept was meaningful just
up to the operations used to determine it. That meant
that if different operations were used, one had
different concepts. So we might measure lengths by
repeatedly laying out rulers; that would give one
notion of length; call it the "ruler length." Or we might
measure length by an operation that times light signals;
that would give another notion; call it "light length."
Had we attended to the operations used to measure the
length of a rod, we would have realized that different
operations are used to measure the length of a rod at
rest and the length of a moving rod. That means they
are different concepts and, in principle, may have
different values. We are prepared for the possibility of
different values, which turns out to be the result relativity
delivers.
Bridgman formulated his operationism in a way similar to
the verificationists.
His central claim was:
"In general, we mean by any concept nothing more than
a set of operations; the concept is synonymous with the
corresponding set of operations."
It does seem peculiar to say that a concept is a set of
operations and the idea does not seem very attractive
now. What made it attractive to Bridgman is that it
immediately gave him the results he wanted. If two
quantities are measured by different operations, then
their concepts are automatically different. And if we
have a quantity, such as our velocity through the ether,
that no operation can measure, then there is no physical
basis for the concept. It is an illicit concept as far as proper
physical theorizing is concerned.
What is my view of this?
Once again there is something right and important in
Bridgman's ideas. If we have a concept, especially a
quantitative one, but no clear idea of the operations
needed to fix it or its magnitude, we may have
something defective in our concept. This is a warning
that must be heeded.
What is wrong about Bridgman's system is that it is too
strict. We may well avoid being surprised again by a
false assumption buried in some concept if we become
operationists. But, as Hempel pointed out, the cost will
be that science becomes unworkable. Every distinct
operation will yield a new concept. Even rest length
would cease to be a single concept; there would be as
many variants as ways we can devise to measure it:
ruler length; light length; ruler length measured on
Wednesdays; ruler length with steel rulers; ruler length
with brass rulers; etc. Our theories would need to leave
open the question as to whether all of these are the
same.
For better or worse, a workable science must presume,
even if provisionally, that the different operations are
measuring the same concept.
6. The use of evidence: common causes
and common origins
Einstein's rejection of Lorentz's ether-based
electrodynamics in favor of a novel theory of
space and time is a paradigm example of the
appropriate use of evidence.
The simplest form of this idea has already been
developed in the context of the verificationist moral.
Objects moving and at rest in the ether differed in their
relation to the ether state of rest. But it was a difference
that made no difference. So we had good reason to
believe that there really was no difference.
That is, the invisibility of the ether state of rest is
simply good evidence that there is no ether state of rest.
Recent work has brought to light a stronger way of
understanding how Einstein used what he found as
evidence. Recall the difference between Einstein's and
Lorentz's theories:
| Lorentz | Einstein |
|---|---|
| There is an ether state of rest, but all matter shrinks, all processes slow, etc., so as to make it invisible. Every theory of matter must predict these processes to assure this invisibility. | Space and time are such that lengths shrink and clocks slow with motion. Every theory of matter must predict this since every theory of matter is about substances that reside in the one space and time. |
Lorentz' theory depends on what, in retrospect, seems
to be an extraordinary coincidence. Maxwell's
electrodynamics predicts the length contraction and time
dilation effects and so must every theory of every other
form of matter.
Einstein's theory requires no such coincidence.
Space and time are the way they are as described in
relativity theory. That explains why every theory of
matter must predict these effects.
So Einstein's theory explains better because it posits
fewer arbitrary coincidences and therefore is better
supported by the evidence of these effects. To use a
notion pioneered by Wes Salmon, we might say that
the spacetime is the single, common cause of these
effects in all matter theories. Or, to use the expression
preferred by Michel Janssen, who has developed these
ideas, Einstein displayed a common origin for all these
effects. So he calls the related inferences "COIs":
common origin inferences.
That we find common origins to explain better and so to
be better supported by evidence is really a
commonplace. Imagine that there is suddenly a series
of burglaries in an otherwise quiet street. We are much
more likely to infer to a common cause (one burglar
robbing repeatedly) than to many independent causes
(many burglars who by chance all happen to be robbing
at the same time).
These inferences also appear throughout science. A famous
example is Copernicus' inference that the earth moves. He
noted that the motion of the outer planets, Mars, Jupiter and
Saturn, when viewed from the earth, each had a wobble
superimposed upon them. What was curious about the wobbles
was that they were perfectly synchronized with each other and
also the motion we see for the sun around the earth. He inferred
that the apparently coincidental synchronization of the wobble has
a common origin. The earth was really moving around the sun
and the wobble was merely the superimposition of our motion on
the planets.
The situation is not
so different from what
someone on a pogo
stick might see.
Everything around
them is jumping up
and down in
synchronized
bounces. Of course
all they are really
seeing is the
superimposition of
their own bouncing
on the things around
them.
7. Change is illusion
The relativity of simultaneity establishes that
the future is as determinate as the past and
present.
This moral is intended to negate a common sense idea
we have about the future. It is the idea that the future
is unresolved, whereas the present and past are or
have happened and so are fixed. The notion is captured
well enough by comparing the outcome of the last
presidential election with the next. The outcome of the
last election is known and fixed; it is a part of the
determinate past. The outcome of the next election is
open; it is a part of the indeterminate future.
We popularly imagine that the moment of the now
advances through history converting the indeterminate
possibilities of the future into the fixed actualities of the
present and the determinate facts of the past.
The Moving Finger writes; and, having writ,
Moves on: nor all thy Piety nor Wit
Shall lure it back to cancel half a Line,
Nor all thy Tears wash out a Word of it.
It is sometimes thought that merely employing a
four-dimensional spacetime in physics is already enough to
overturn the idea that the future is indeterminate. For in
a spacetime diagram, we see both past and future laid
out as equally real. This argument is flawed. It depends
essentially on confusing the reality of a picture of a thing
with the reality of the thing. My diary has equally real
squares in it for yesterday and tomorrow. We would not
infer from that, that yesterday and tomorrow are equally
real (or squares).
The argument from spacetime is also less relevant in
the present context since spacetime could also be used
with classical physics. So whatever moral we might get
from it is equally available from classical physics.
Putnam, Rietdijk and others have tried to use what is
distinctive about the Minkowski spacetime of relativity
theory, the relativity of simultaneity, to get a stronger
result about the determinateness of the future. They
combine the way the relativity of simultaneity tangles up
future and past with two observers in relative motion to
get the result.
In brief, their argument goes as follows. Consider some possible
event in our future: will there be a blizzard next February 1? We
can always find a position and motion for a possible observer who
would, in our present, judge next February 1 to be in his present.
For that observer, whether or not there is a blizzard here on
February 1 is a present fact; it is determinate. Since that is true
now of that observer, should we not also assume that the blizzard
(or otherwise) of next February 1 is determinate?
Being "determinate" is the key notion. It is somewhat vague. I
understand a future event to be determinate if it has whatever it is
that past events have that makes them immutable.
The figure shows the spacetime diagram that goes with
the argument.
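The geometry of that diagram can be checked numerically with the Lorentz transformation. What follows is only a minimal sketch, not part of the original argument. It assumes one dimension of space, units in which the speed of light is 1 (years and light years), Earth at x = 0, and made-up numbers for the spaceship; the function and variable names are illustrative.

```python
import math

C = 1.0  # speed of light in light years per year (assumed units)

def time_in_moving_frame(t, x, v):
    """Time t' = gamma * (t - v*x/c^2) that an observer moving at velocity v
    along x (relative to Earth) assigns to the event at Earth time t, place x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C ** 2)

def earth_time_simultaneous_with(distance, v):
    """Earth-frame time T of the event on Earth (x = 0) that the spaceship
    observer judges simultaneous with "Spaceship now" at (t = 0, x = distance).
    Setting t'(T, 0) equal to t'(0, distance) gives T = -v * distance / c^2."""
    return -v * distance / C ** 2

# Hypothetical numbers: a spaceship 2.5 million light years away,
# approaching Earth (negative v) at one millionth of the speed of light.
distance = 2.5e6
v = -1.0e-6
T = earth_time_simultaneous_with(distance, v)
print(f"Earth later lies {T:.3f} years in Earth's future")
```

With these assumed numbers, the event "Earth later" that the spaceship observer judges simultaneous with "Spaceship now" lies about two and a half years in the Earth observer's future, even though the spaceship moves very slowly. The large distance does the work.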
The argument is:
Earth observer:
Event "Spaceship now" is simultaneous with respect to
event "Earth now."
Therefore Event "Spaceship now" is determinate with
respect to event "Earth now."
Spaceship observer:
Event "Earth later" is simultaneous with respect to event
"Spaceship now."
Therefore Event "Earth later" is determinate with
respect to event "Spaceship now."
Combining:
Event "Earth later" is determinate with respect to event
"Earth now."
There are two weaknesses in the argument.
First, we must accept that simultaneity and
determinateness go hand in hand. That is, we must
accept that
"Spaceship now" is simultaneous with respect to event
"Earth now."
entails that
"Spaceship now" is determinate with respect to event
"Earth now."
I see no good reason to accept this. In part the problem
is that I don't really know what "determinateness" is.
Second, it is not clear that determinateness is
transitive. Transitivity is the property that allows us to
chain together judgements of determinateness as is
done in the little argument above. Again, whether it is
admissible depends on what "determinate" means and I
am unsure. Certainly simultaneity judgments from
different observers cannot be chained together. We
cannot infer that the events "Earth later" and "Earth
now" are simultaneous. Why should it be different with
determinateness?
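The failure of chaining for simultaneity can be seen in a small numerical sketch. It is an illustration under stated assumptions, not part of the original text: one dimension of space, units with c = 1, and hypothetical coordinates for the three events.

```python
import math

def time_difference(event_a, event_b, v):
    """Time of event_b minus time of event_a, as judged by an observer
    moving at velocity v (units with c = 1). Events are (t, x) pairs."""
    (ta, xa), (tb, xb) = event_a, event_b
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * ((tb - ta) - v * (xb - xa))

def simultaneous(event_a, event_b, v, tol=1e-9):
    """True if the observer moving at v assigns the two events the same time."""
    return abs(time_difference(event_a, event_b, v)) < tol

# Hypothetical coordinates (t, x) in the Earth frame.
earth_now = (0.0, 0.0)
spaceship_now = (0.0, 10.0)
earth_later = (5.0, 0.0)
v_earth = 0.0    # the Earth observer is at rest in this frame
v_ship = -0.5    # the spaceship approaches Earth at half the speed of light

print(simultaneous(earth_now, spaceship_now, v_earth))   # True
print(simultaneous(spaceship_now, earth_later, v_ship))  # True
print(simultaneous(earth_now, earth_later, v_earth),
      simultaneous(earth_now, earth_later, v_ship))      # False False
```

Each observer's judgment holds, yet neither observer judges "Earth now" and "Earth later" to be simultaneous. The two judgments cannot be chained.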
8. Causal Theory of Time (H.
Reichenbach):
Einstein defined simultaneity in terms of a
light signaling operation. We can generalize
his procedure to define the nature of time in
terms of signaling with any causal process.
To say that "an event P is earlier than an
event Q" simply means that it would be
possible for some causal process to pass
from P to Q.
Reichenbach here attempted to solve an old problem in
philosophy, rather nicely expressed in a lament by
Augustine:
"What, then, is time?
If no one asks me, I know:
if I wish to explain it to one that asketh,
I know not."
This traditional problem is already captured in the dictionary game. You
want to know what time is? Look up the definition of time in the
dictionary. And then look up the definition of the definition and soon
enough you are back at time, in a closed circuit. There seems to be no
simple, noncircular way to finish the defining sentence "Time is..."
In my Concise Oxford English Dictionary, "time" is defined as "duration"; and "duration" as "continuance in, length of, time."
Definition of time
Definition of duration
Reichenbach's causal theory of time aims to solve this
problem. It will complete the "Time is..." sentence with
talk of causes. To be more precise, it looks at the time
order of events, the notions of earlier and later. Just what
does it mean to say that two events are separated in
time? Reichenbach's answer is in terms of causal
connectibility.
Event P is earlier than event Q
just means that
event P could causally affect event Q by, for example, the
transmission of a light or other signal from P to Q.
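The definition can be sketched in a few lines of code. This is only an illustration under assumptions not stated in the definition itself: one dimension of space, units with c = 1, and the special relativistic restriction that no causal process travels faster than light. The function names are hypothetical.

```python
C = 1.0  # speed of light (assumed units)

def could_causally_affect(p, q):
    """True if a signal travelling no faster than light could pass
    from event p to event q. Events are (t, x) pairs."""
    (tp, xp), (tq, xq) = p, q
    dt = tq - tp
    dx = abs(xq - xp)
    return dt > 0 and dx <= C * dt

def earlier_than(p, q):
    """The causal theory's definition: p is earlier than q
    just if p could causally affect q."""
    return could_causally_affect(p, q)

p = (0.0, 0.0)
q = (2.0, 1.0)   # reachable from p by a slower-than-light signal
r = (1.0, 5.0)   # too far away for any signal from p to reach it

print(earlier_than(p, q))                       # True
print(earlier_than(p, r), earlier_than(r, p))   # False False
```

On this definition, the events p and r, which no signal can connect, stand in no time order at all.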
The inspiration for this approach is Einstein's 1905
treatment of simultaneity. In Einstein's special theory of
relativity, it is true that two events A and B are
simultaneous if they are hit by light signals emitted at
the same moment from their midpoint. Einstein turned
this truth into a definition. Two events are defined as
simultaneous if they could be hit by such light signals.
That definition was the centerpiece of the first section of
Einstein's paper.
Reichenbach extended this thinking to all the time
relations between events, being before and being after.
It is a truth that P is earlier than Q just if a causal signal
could pass from P to Q. Reichenbach now proposed
that this truth be a definition.
There is something important and right about the
approach. We cannot allow notions like time to become
too distant from the physical processes of the world.
Special relativity has reminded us that our notions of
time must respond to those processes and the physical
theories that govern them. Time is deeply entangled
with causation. We will see just how much more
profound that entanglement is when we deal with the
spacetimes of general relativity.
However, in my view, Reichenbach's approach goes
too far. We do not just see the entanglement of space
and time in his theory. We see the reduction of time
order to causal order. Causation becomes the
fundamental idea and time order is derived from it. The
difficulty is that we end up with a primitive notion,
causation, that we seem to understand less well than
the thing we started with, time order. So now we must
ask "what is causation?" We will have a harder time
answering. Theories of the nature of causation remain
diverse and controversial. (For my diatribe on causation see
"Causation as Folk Science.") Time remains far less
problematic; our theories of time are some of the best
developed of all physics. A theory that reduces the less
problematic to the more problematic seems to me to be
most problematic.
My Picks
Everyone will find their own favorites, although it can be
quite hard to make the selection. For what it is worth,
here are my picks. They have actually mostly been
embedded in the earlier critical discussion.
Common sense tracks the latest science. That is,
common sense lags behind our latest science, which is
very slowly incorporated into that nebulous "what
everyone knows." Doesn't everyone now know that
matter is made of atoms; or that the air is part oxygen
and that oxygen is the bit that matters for our survival?
Yet all this was once the most advanced science. The
process seems to be continuing with special relativity.
Many people somehow know that "nothing goes faster
than light" but they are not sure where it comes from.
The moral is not solely derived from special relativity,
but special relativity does supply a nice instance of it.
Mature theories are very stable in the domains for
which they were devised. They are fragile elsewhere.
This is what I think should be learned from the long
history of fragility of scientific theories, with the advent of
special relativity an excellent example. While theories
do not retain unqualified validity when we move to new
domains, the mature theories remain essentially
unaltered in their original domains. We need relativity
theory for motions close to the speed of light, yet we still
use ordinary Newtonian theory for motions at ordinary
speeds. That does not seem likely to change.
Beware of theories or parts of theories that are
designed to escape experimental or observational test.
This is the part the verificationists got right. There is
something very fishy about theoretical entities with
properties so perfectly contrived that we cannot ever put
them to observational or experimental test. We should
treat them with the highest suspicion. Asking for the
means of verification or falsification is a good test if one
is suspicious. Finding clear conditions for verification or
falsification is an assurance that a healthy connection
between the theory and experience is possible.
Be ready to abandon concepts that hide empirical
content. This is the part that the operationists got right.
One cannot develop conceptual schemes without
making presumptions about the world, yet those very
presumptions can be contradicted by emerging science,
making acceptance or even formulation of appropriate
new theories difficult. A related concern is that some
concepts may have no real basis in experience at all
(e.g. ether state of rest!). Asking for an operational
definition of the concept is a healthy but not final test. If
it admits an operational definition, then at least we know
it has a connection to possible experience.
Infer to common causes. When you have the choice,
the better explanation is the one that posits fewer
coincidences and that is the one you should infer to.
Space isolates us causally. The novel results about
space and time themselves are some of the most
interesting results of special relativity. If we try to look
beyond the theory and still have outcomes that pertain
to space and time, I think the most important is simply
the idea that the speed of light is an upper limit to the
speed of causal interactions. That tells us that we are quite powerfully
causally isolated from other parts of the universe.
Nearby galaxies are already millions of light years away.
That means that just sending a signal from our galaxy to
another will require eons of time. Conversely, something
happening there now will not affect us for the
corresponding eons.
If one wishes to press further, special relativity has
revealed a relatedness of space and time that we did
not formerly suspect. It is hard to know how best to
express this entanglement. I think the best way is still
our familiar relativity of simultaneity.
What you should know
The various philosophical morals people have tried to draw from relativity theory.
How to identify a clear thesis and the argument that supports it.
How to criticize the statement of a thesis and the argument that supports it.
Your own view of which philosophical morals can be drawn from relativity theory.
Copyright John D. Norton. February 2001; October 2002; July 2006; February 2, 13, September, 23, 2008; February 1, 2010.
Euclidean Geometry
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/non_Euclid_Euclid/index.html[28/04/2010 08:21:20 ﺹ]
HPS 0410 Einstein for Everyone
Euclidean Geometry
The First Great Science
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Euclid and his Elements
How Do We Organize Our Knowledge?
Knowing with Certainty
To Come
Euclid's Postulates
Deriving a Theorem
The Fifth Postulate
Attempts to Eliminate the Odd Man Out
What you should know
Linked documents:
Euclid's Postulates and Some Non-Euclidean Alternatives
The definitions, axioms, postulates and propositions of Book I of Euclid's Elements.
Euclid and his Elements
Here's an introductory puzzle. In the totality of our intellectual heritage, which book is most
studied and most edited? The answer is obvious: the Bible. But which is the most studied and
edited work after it? That is a little harder to say. The answer comes from a branch of science that
we now take for granted, geometry. The work is Euclid's Elements. This is the work that codified
geometry in antiquity. It was written by Euclid, who lived in the Greek city of Alexandria in Egypt
around 300 BC, where he founded a school of mathematics. Since 1482, there have been more
than a thousand editions of Euclid's Elements printed. It has been the standard source for
geometry for millennia. It is only in recent decades that we have started to separate geometry from
Euclid. In living memory (my memory of high school), geometry was still taught using the
development of Euclid: his definitions, axioms and postulates and his numbering of them.
Oxyrhynchus papyrus showing fragment of Euclid's Elements, AD 75-125 (estimated)
Title page of Sir Henry Billingsley's first English version of Euclid's Elements, 1570
Oliver Byrne's 1847 edition of the first 6 books of Euclid's Elements used as little text as possible and replaced labels by colors.
A recent edition from Dover.
This long history of one book reflects the immense importance of geometry in science. We now
often think of physics as the science that leads the way. In the seventeenth century, Newton found
one simple system of physics that worked for both the heavens and the earth. That set a
standard of achievement that the other sciences sought to emulate. Newton, however, was
learning from another science that already set an enduring standard of achievement: geometry.
We can identify two reasons for the importance of Euclid's Elements in our understanding of the
foundations of science: its structure and the certitude of its results.
How Do We Organize Our Knowledge?
First, Euclid's Elements solved an important problem. When we have a large body of
knowledge, such as we have in geometry, how are we to organize it? We know many simple
things in geometry: the sum of the angles of a triangle is always 180 degrees. And we know
more complicated things. A triangle with sides 3, 4 and 5 is a right angled triangle. And even more
complicated things. As Pythagoras found, in a right angled triangle, the sum of the areas of the
squares erected on the two shorter sides is equal to the area of the square erected on the
hypotenuse.
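These three facts can be checked numerically for the 3, 4, 5 triangle. The sketch below is an illustration only. It assumes ordinary Euclidean plane geometry and uses the law of cosines, which the text does not mention, to recover the angles from the side lengths; the function names are hypothetical.

```python
import math

def is_right_triangle(a, b, c):
    """Pythagoras' test: the squares on the two shorter sides
    together equal the square on the longest side."""
    x, y, z = sorted((a, b, c))
    return math.isclose(x * x + y * y, z * z)

def angle_sum_degrees(a, b, c):
    """Sum of the three interior angles, each recovered from the side
    lengths with the law of cosines (Euclidean geometry assumed)."""
    def angle(opposite, s1, s2):
        cosine = (s1 * s1 + s2 * s2 - opposite * opposite) / (2 * s1 * s2)
        return math.degrees(math.acos(cosine))
    return angle(a, b, c) + angle(b, a, c) + angle(c, a, b)

print(is_right_triangle(3, 4, 5))            # True
print(round(angle_sum_degrees(3, 4, 5), 6))  # 180.0
```

A numerical check of this kind confirms single cases only. Euclid's method, described next, aims to establish such results for all triangles by deduction.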
So, as our knowledge grows, how are we to organize it so that we capture in it all the truths that
we want and do not let in things that don't properly belong there? Euclid employed a quite
profound method, deductive systematization. His Elements were structured according to a series
of propositions:
Definitions.
This is the response to the simple injunction: "define your terms," else you cannot know
precisely what you are talking about. There are 35 definitions. They include such familiar
ideas as:
1. A point is that which has no part.
2. A line is a breadthless length.
3. The extremities of lines are points.
...
22. Quadrilateral figures are bounded by four straight lines.
...
and so on.
Axioms or Common Notions
These are general statements, not specific to geometry, whose truth is obvious or
self-evident. There are 12. For example:
1. Things which are equal to the same thing are equal to one another.
2. If equals be added to equals, the wholes are equal.
and so on.
Postulates
These are the basic suppositions of geometry. They reflect its constructive character; that is,
they are assertions about what exists in geometry. The first of the five simply asserts that you
can always draw a straight line between any two points.
Theorems or Propositions
These are the consequences deduced logically from the definitions, axioms and postulates.
They form the bulk of geometrical knowledge and include Pythagoras' famous result above
concerning the areas of squares on the sides of right angled triangles.
All the definitions, axioms, postulates and propositions of Book I of Euclid's Elements are here.
Once this structure is adopted, the problem of knowing just what really belongs in geometry is
reduced to matters of deductive inference. Is this or that a truth of geometry? The question is
answered by determining whether it can be deduced from Euclid's postulates and axioms. Do you
doubt that this is a truth of geometry? Then you must show where Euclid's proof broke down.
Eventually, as you trace the proofs back to their sources, you end up seeing that the truth of the
result derives ultimately from the truth of postulates and axioms. And their truth is so obvious
as to admit no doubt. Who wants to say that you cannot always draw a straight line between any
two given points?
In the seventeenth century, with newfound confidence, natural philosophers rebuilt all learning
from scratch, discarding the wisdom of antiquity as flawed. In that effusion of new investigation,
one achievement stood unchallenged. That was Euclid's Elements. Indeed its premier position
was reinforced when the structure it gave to geometrical knowledge was adopted by Newton to
codify his new mechanics. Like Euclid, Newton listed definitions and, where Euclid gave axioms
and postulates, Newton gave his celebrated three laws of motion. Euclid's Elements became the
template for organizing knowledge, be it a new science such as Newton's or even knowledge
outside science.
Title pages of Robert Simson's edition of The Elements of Euclid (Philadelphia: Desilver, Thomas & Co., 1838) and of the English translation of Sir Isaac Newton's The Mathematical Principles of Natural Philosophy (London: Benjamin Motte, 1729).
The opening Definitions of Book I of The Elements of Euclid, shown beside Definition I of Newton's Mathematical Principles of Natural Philosophy.
Euclid's Postulates and Axioms, shown beside Newton's "Axioms or Laws of Motion" (Law I and Law II).
Section II, Proposition I, Theorem I of Newton's Mathematical Principles of Natural Philosophy.
Book I of The Elements of Euclid: Proposition I ("To describe an equilateral triangle upon a given finite straight line") and Proposition II ("From a given point to draw a straight line equal to a given straight line").
Knowing with Certainty
Second, the enduring success of Euclid's Elements assured us that some things could be
known with certainty. While the knowledge of antiquity collapsed, geometry thrived as the
method central to Newton's discovery and also the template for his organization of his new
mechanics. The idea of that sort of certainty is familiar today. We are used to the idea that some
branches of study do not need fragile experiments to verify them. There is no point in counting two
apples and then two apples into a basket to verify that 2+2=4; and then doing it again with pears or
pineapples, just to be sure. Arithmetic does not need it. 2+2 does not just happen to be 4; it has to
be 4. There is no other possibility.
So Euclid's geometry and Newton's physics bequeathed to thinkers the problem of understanding
just how this level of certitude was possible. Our modern minds are steeped in the idea that
knowledge of the world comes from experience and new experience can always overthrow old
learning. By the eighteenth century, the sense was widespread that Euclid and Newton had found
the final truths of geometry and mechanics. The philosophical problem was to determine how this
was possible.
One of the most influential thinkers of all time, the eighteenth century philosopher Immanuel Kant, provided an enduring answer.
There are some types of knowledge that are both synthetic and a priori, he declared. They are synthetic in the sense that they say
more of their subjects than is given by the subject's definition; they are a priori in the sense that they can be known prior to experience
of the subject. Arithmetic and geometry were Kant's premier examples of synthetic a priori knowledge. According to Kant, it is a
synthetic, a priori truth that 7+5=12; and it is a synthetic, a priori truth that the sum of the angles of all triangles is 180 degrees.
"A triangle has three sides." is analytic, since the definition of triangle includes the idea that it has three sides.
"A triangle's angles sum to 180 degrees." is synthetic, since this summation to 180 degrees is not part of the definition of a triangle. It is an addition.
Immanuel Kant
To Come
These ideas provide our starting point. We shall see in later chapters that matters take a very
different turn in the nineteenth century.
Nineteenth century mathematicians realized that the eighteenth century certainty of geometry was
mistaken. Geometry was an empirical science. It reported the way our space happened to be,
not the way it had to be. If that was so, other geometries were possible and our experience of
space might well have been different. In the nineteenth century, these were regarded as
possibilities that were unrealized. Nature had many choices but, they thought, she chose Euclid's
system.
This realization of the mere possibility of geometries other than Euclid's was shocking. Greater
shocks were in store. In the twentieth century, Einstein delivered the final insult to Euclid. He
found through his general theory of relativity that a non-Euclidean geometry is not just a possibility
that Nature happens not to use. In the presence of strong gravitational fields, Nature chooses
these geometries.
All this is coming in later chapters. Now, it's back to Euclid.
Euclid's Postulates
For a compact summary of these and other postulates, see Euclid's Postulates and Some Non-Euclidean Alternatives.
The geometry of Euclid's Elements is based on five postulates. They assert what may be constructed in
geometry. Let us start by reviewing the first four postulates. The first postulate is:
1. To draw a straight line from any point to any point.
This postulate simply says that if you have any two points, A and B, say, then you can always
connect them with a straight line.
It is tempting to think that there is no real content in this assertion. That is not so. This postulate is
telling us a lot of important material about space. Any two points in space can be connected; so
space does not divide into unconnected parts. And there are no holes in space such as might
obstruct efforts to connect two points.
The second postulate is:
2. To produce a finite straight line continuously in a straight line.
It tells us that we can always make a line segment longer. That means that we never run out of
space; that is, space is infinite.
The third postulate is:
3. To describe a circle with any center and distance.
It allows for the existence of circles of any size and center, say center A and radius AB.
Note that this sort of postulate is not superfluous. A definition can tell us what a circle is, so we
know one if ever we find one. But the definition does not assert their existence. Analogously,
we can give a definition of a unicorn; that doesn't mean they exist. This postulate says circles
exist, just as the first two postulates allow for the existence of straight lines.
The fourth postulate says:
4. That all right angles are equal to one another.
It just says that whenever we create a right angle by erecting
perpendiculars, the angles so created are always the same.
Definition 10. When a straight line standing on another straight line, makes the adjacent angles equal to one another, each of
the angles is called a right angle; and the straight line which stands on the other is called perpendicular to it.
Sameness here means that were we to manipulate the angles by sliding them over the page, they
would coincide.
It may seem that a postulate like this is superfluous. Isn't it completely obvious
that all right angles made this way are the same? Yes it is, but that is the
essence of the postulates, to assert what is so unproblematic as to make them
unchallengeable. Nonetheless, the equality of all right angles still does need
to be asserted, since it will be assumed throughout everything that is to
follow in Euclid's Elements.
Could it really fail? Yes. The size of a right angle is the arc of a circle it subtends divided by the radius
in a neighborhood close to the point. The postulate depends upon the ratio of the circumference of the
circle enclosing the point to the radius being the same everywhere in the limit of arbitrarily small circles.
While this sameness obtains in all the geometries we are about to look at, once it is stated this simply,
one can see that in principle it could fail. In one part of space, the ratio might be the familiar 2π; in
other parts, it may be more or less.
Deriving a Theorem
So far everything is going very well. The postulates we have seen are utterly innocuous and
readily accepted. But once they are accepted, a lot follows. The simplest is the existence of
equilateral triangles. Their construction is the burden of the first proposition of Book 1 of the
thirteen books of Euclid's Elements.
The problem is to draw an equilateral triangle on a given straight line AB.
Postulate 3 assures us that we can draw a circle with center A
and radius AB.
Analogously, Postulate 3 also assures us that we can draw a circle with center B
and radius BA.
Now consider both circles together.
They intersect at some point. Let us call
it C.
The assumption that they meet is not guaranteed by Euclid's postulates. It is an additional assumption that tacitly presupposes that the surface is an ordinary
two dimensional surface. This is one of several well known points in Euclid's system where the deductions are less rigorous than we would expect.
Now connect A and C with a straight line; and B and C with a straight line. That each straight line
can be drawn is asserted by Postulate 1.
Consider the triangle ABC. From the Definitions 15 and 16 of a circle,
we know that the two radii AB and AC of the circle centered at A are
equal.
AB=AC
Similarly, we know that the two radii AB and CB of the circle centered at B
are equal.
AB=CB
So, by Axiom 1, we know that all three are equal
AB=AC=BC
and the triangle is equilateral.
QED
Definition 15. A circle is a plane figure contained by one line, which is called the circumference, and is such
that all straight lines drawn from a certain point within the figure to the circumference are equal to one
another;
Definition 16. And this point is called the center of the circle.
Axiom 1. Things that are equal to the same thing are equal to one another.
QED = quod erat demonstrandum = "which was to be proved"
This illustrates the power of Euclid's system. Every step is guaranteed by an axiom or a
postulate, so that one cannot accept the axioms and postulates without also accepting the
proposition.
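The construction above can also be checked numerically. The following is a minimal sketch, not Euclid's method: it assumes Cartesian coordinates and the Pythagorean theorem, both of which lie far downstream of Proposition 1, and the names `dist` and `circle_intersection` are hypothetical helpers introduced only for illustration. It finds a point C where the two circles meet and compares the three sides.

```python
import math

def dist(p, q):
    """Euclidean distance between two points given as (x, y) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def circle_intersection(a, b):
    """Return one point C common to the circle with center a and radius AB
    and the circle with center b and radius BA."""
    r = dist(a, b)                      # both circles have radius AB
    half = r / 2                        # C lies above the midpoint of AB
    h = math.sqrt(r ** 2 - half ** 2)   # height of C above the line AB
    ux, uy = (b[0] - a[0]) / r, (b[1] - a[1]) / r   # unit vector along AB
    mx, my = a[0] + half * ux, a[1] + half * uy     # midpoint of AB
    # Step from the midpoint perpendicular to AB by the height h.
    return (mx - h * uy, my + h * ux)

A, B = (0.0, 0.0), (2.0, 0.0)
C = circle_intersection(A, B)
print(dist(A, B), dist(A, C), dist(B, C))   # the three sides of triangle ABC
```

A check of this kind illustrates the proposition for one particular segment AB; only the deductive proof establishes it for every segment.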
The Fifth Postulate
So far everything has been going very well. However these first four postulates are not enough to
do the geometry Euclid knew. Something extra was needed. Euclid settled upon the following
as his fifth and final postulate:
5. That, if a straight line falling on two straight lines make the interior angles on the same side less
than two right angles,
the two straight lines, if produced indefinitely, meet on that side on which are the angles less than
the two right angles.
It is very clear that there is something quite different about this fifth postulate. The first four were
simple assertions that few would be inclined to doubt. Far from being instantly self-evident, the fifth
postulate was even hard to read and understand.
5. That, if a straight line falling on two straight lines...
... make the interior angles on the same side less than two right angles... [in this case, side on the
right]
...the two straight lines, if produced indefinitely, meet on that side on which are the angles less
than the two right angles.
Or, in an animation:
Attempts to Eliminate the Odd Man Out
From antiquity, there had been discomfort with this fifth postulate, an odd man out among the
postulates. The obvious remedy was to find a way to deduce the fifth postulate from the other four.
If that could be done, then the fifth postulate would become a theorem and the awkwardness of
needing to postulate it would evaporate.
Many tried. The famous astronomer Ptolemy of the second century AD tried. The great mathematician
John Wallis tried in the 17th century. The most famous of all attempts was published by Girolamo
Saccheri in 1733, Euclides ab Omni Naevo Vindicatus ("Euclid Cleared of Every Defect").
Yet even this massive work did not achieve its goal, so the efforts continued.
The eighteenth century closed with Euclid's geometry justly celebrated as one of the great
achievements of human thought. The awkwardness of the fifth postulate remained a blemish in a
work that, otherwise, was of immortal perfection. We knew the geometry of space with certainty
and Euclid had revealed it to us.
What you should know
How Euclid organized geometry into a deductive structure.
An idea of what his definitions, axioms, postulates and theorems look like.
A sense of how Euclidean proofs work.
The sense of certainty scholars of earlier eras assigned to Euclid's geometry.
Why the fifth postulate is an awkwardness for Euclid's geometry.
Copyright John D. Norton. December 28, 2006, February 28, 2007; February 2, 9, 14, September 22, 2008; February 2, 2010
NonEuclidean Geometry
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/non_Euclid_construction/index.html[28/04/2010 08:21:29 ﺹ]
HPS 0410 Einstein for Everyone
NonEuclidean Geometry
A Sample Construction
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
From the Eighteenth to the Nineteenth Century
Alternative Formulations of Euclid's Fifth Postulate
Exploring the Geometry of 5_NONE
A Trip Around Space
Circles and Triangles
Einstein's Moral
What you should know
Linked document:
Euclid's Postulates and Some NonEuclidean Alternatives
From the Eighteenth to the Nineteenth Century
We saw in the last chapter that the earlier centuries brought the nearly
perfect geometry of Euclid to nineteenth century geometers. The one
blemish was the artificiality of the fifth postulate. Unlike the other four
postulates, the fifth postulate just did not look like a self-evident truth.
In the eighteenth century, as in the centuries before, the project had
been to rid Euclid's geometry of this flaw. The goal was to derive the fifth
postulate from the other four. Then, geometry would need only to posit
the first four postulates; the fifth would be deduced from them.
An indirect strategy was used in the efforts to derive the fifth postulate
from the other four. The procedure was to take the first four postulates
and add the negation of the fifth to them. Then the geometer would
proceed to explore the consequences of these five assumptions. The
goal was to demonstrate that a contradiction followed. Arriving at a
contradiction would show that a false presumption had been made
somewhere. The candidates for the false presumption were the five
assumptions of the starting point. Four of these were just the first four of
Euclid's postulates, which were taken to be secure. So the false
presumption had to be the negation of the fifth postulate.
Euclid's first four postulates and the negation of Euclid's fifth postulate lead to a contradiction.
Conclude: This assumption (the negation of the fifth postulate) must be false.
That is, the finding of a contradiction showed that the negation of the
fifth postulate was false. Stripping out the double negative
("...negation...false") we just have that the fifth postulate is true. Or,
more carefully, as long as the first four postulates are true, then
the fifth is true. And that just means that we have inferred the truth of
the fifth postulate from the other four. The postulates needed for Euclid's
geometry would thereby be reduced to the first four.
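The logical shape of this indirect strategy can be written compactly. As a sketch, let P_1, ..., P_4 stand for the first four postulates, P_5 for the fifth, and A for any sentence; this symbolism is introduced here only for illustration and is not Euclid's:

```latex
\[
\text{If}\quad P_1, P_2, P_3, P_4, \neg P_5 \;\vdash\; A \wedge \neg A
\quad\text{then}\quad P_1, P_2, P_3, P_4 \;\vdash\; P_5 .
\]
```

The geometers' hope was to establish the left-hand side; the right-hand side would then follow as a matter of logic.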
The work was both encouraging and maddening. It was encouraging in
that all sorts of very odd results followed. It was maddening in that
none of the results, no matter how odd, was actually a flat-out
contradiction. None flatly asserted "A and not A." It was as if the
geometers had struggled past many dangers but were perpetually
trapped one step short of the end of their journey.
In the nineteenth century, the reason for this frustrating failure was
finally recognized by Gauss, Riemann, Bolyai, Lobachevsky and others.
When the earlier geometers had posited an alternative to Euclid's fifth
postulate, they were not creating a contradiction. Rather they were
defining a new geometry. The conclusions they drew were merely facts
in the new geometry. These facts seemed odd simply because they
belonged in a geometry different from that of Euclid.
Portraits: Gauss, Riemann, Bolyai, Lobachevsky
The import of this realization was profound. It gradually became clear
that geometry did not have to be Euclidean. The success of
Euclidean geometry was something to be discovered. It certainly worked
wherever we looked. But would it still work if we surveyed volumes of
space on the cosmic scale? And what are we to make of Kant's
assurance that space has to be Euclidean, a synthetic a priori fact?
If one has a prior background in Euclidean geometry, it takes a little
while to be comfortable with the idea that space does not have to be
Euclidean and that other geometries are quite possible. In this
chapter, we will give an illustration of what it is like to do geometry in a
space governed by an alternative to Euclid's fifth postulate.
Alternative Formulations of Euclid's Fifth Postulate
One of the most important byproducts of the efforts to derive Euclid's fifth
postulate was a set of simpler, alternative formulations of the postulate that
could be used in place of Euclid's original. Many were found, including:
There exists a pair of coplanar straight lines, everywhere equidistant
from one another.
There exists a pair of similar, noncongruent triangles.
If in a quadrilateral a pair of opposite sides are equal and if the angles
adjacent to the third side are right angles, then the other two angles are
also right angles. (Saccheri)
There is no upper bound to the area of a triangle.
Of all the reformulations, one proves to be most useful. It was stated by
an 18th century mathematician and physicist, Playfair. His postulate,
equivalent to Euclid's fifth, was:
5_ONE. Through any given point can be drawn exactly one straight line
parallel to a given line.
This formulation made it easy to state what the alternatives were. In
place of ONE, we could have NONE or MORE than one.
5_MORE. Through any given point MORE than one straight line can be
drawn parallel to a given line.
The idea behind this alternative is easy to say but hard to draw. The figure below
shows its import. All the lines drawn through the point are straight and parallel to
the line not passing through the point. The picture cannot really show that, of course,
since the screen is a surface that conforms to Euclid's postulates.
And just what does "parallel" mean? Euclid tells us:
Definition 35. Parallel straight lines are such as are in the same plane, and which being
produced ever so far both ways do not meet.
The other possibility is:
5_NONE. Through any given point NO straight lines can be drawn parallel
to a given line.
Once again the import of the alternative postulate is hard to draw since
the screen is a Euclidean surface. In the figure, the line through the
point is a straight line. The postulate tells us that no matter which
straight line we pick through the point, the outcome is the same. It is not
parallel to the line not on the point. If extended it will eventually meet the
other line.
Exploring the Geometry of 5_NONE
Let us join the explorers of the nineteenth century and take the first
steps into the new space of these odd geometries. Let us explore the
space of 5_NONE.
A Trip Around Space
To begin, select ANY STRAIGHT LINE at all in
our space with two points A and B on it. At each of A
and B, we will erect perpendicular, straight lines.
It will be important for what follows that the line selected be any straight line at
all. However we shall see that the analysis below can only be carried out if the
two points A and B are selected so that they are quite close together.
The alternative postulate, 5_NONE, assures us that these
perpendiculars, if projected, will eventually meet at some point. Let
us call that point O.
There is a perfect symmetry in the figure; we could switch A and B
and nothing would change, so we can infer that
AO = BO
Note that in the figure the lines AO and BO are straight lines. In
a Euclidean geometry, they could not possibly meet. However this
is not a Euclidean geometry, so odd things will happen. This is just
the beginning...
Now find the midpoint of AB and call it Q. Erect a perpendicular to
AB at Q. Project it until it eventually meets AO and BO. (It must
meet them since there are no parallels in this geometry.)
Where will it meet them? It cannot be to either side of the point O
since then there would be an asymmetry. The midpoint Q and its
perpendicular do not favor either side.
So the perpendicular must pass through the point O.
What can we say about the length of OQ? In the figure, it looks as
if OQ is shorter than OA. Of course little in the figure is really quite
as it looks. Both OA and OB are straight lines, for example, but
they look curved.
It turns out that OA and OQ are the same length:
OA = OQ
To see this, just consider the triangle OAQ. It is constructed in
exactly the same way as the triangle OAB; that is, we erect two
perpendiculars and project them until they meet. So the triangle
OAQ has the same symmetries that led us to conclude that
OA=OB in triangle OAB.
The same reasoning leads us to conclude that OQ and OB are the
same length. So:
OA = OQ = OB
Now repeat the construction.
Bisect AQ and from that point erect a perpendicular that will pass
through O.
Bisect QB and from that point erect a perpendicular that will pass
through O.
By repeating this process indefinitely, we can divide the original
interval AB into as many equal sized parts as we like.
Perpendiculars raised from each of these points will all pass
through the point O.
As before, all these perpendiculars will have the same length. If we
measure distance along these perpendiculars, we conclude that
the point O is the same distance from every point on the line
AB.
Clearly the point O has a special significance for the entire straight
line through AB.
Recall that every line in the figure to the right is a straight line!
Let us now do essentially the same construction but in a way
that extends past AB.
As before, we have points A and B on the line we chose earlier
with the two perpendiculars erected at A and B.
On AB produced through B we pick a point C such that AB=BC.
We now erect a perpendicular at C.
As before, it must intersect the perpendiculars at A and B at the
same point O and the perpendiculars OA, OB and OC will have the
same length
OA = OB = OC
The argument is essentially the same as before. If the perpendicular at C did not
pass through O, it would intersect the perpendicular at A at some other point O' on
AO. But that would now mean that the perpendicular at B no longer respects the
symmetry in the large triangle AO'C. Continuing the same arguments above gives us
the equality of the lengths of all the perpendiculars.
This construction could be continued with points D, E, F, ... each a
distance AB advanced from the point before. At each we erect a
perpendicular, which will intersect the others at the same point O. Since
the triangles produced by this construction, OAB, OBC, OCD, ODE and
OEF are congruent, the angles at the apex are all the same:
angle AOB = angle BOC = angle COD = angle DOE = angle EOF.
That is, by extending the base of the triangle, AB to AC to AD etc. we
can make the angle at the apex grow as large as we like. So we can
certainly make it as big as a right angle. Let us say that this happens
with a base AG. And we can keep extending the base to G' until we
have a second right angle at GOG'. And we can extend to G" so that we
have a third right angle at G'OG''. And finally we can extend the base to
G''' so we have a fourth right angle at G''OG'''.
We have arrived at something remarkable in this figure. It is not just that
all these lines are straight lines. It is more. The angle AOG''' is four right
angles. How can that be? Rather than tell you right away, let me give
you a clue. We don't need to draw all the lines as straight in the figure.
We just need to remember which are straight: in this case, all of them.
So we can redraw the figure as:
Think once again what it means for the angle AOG''' to be four right
angles. Consider the line OA as it sweeps around O. It passes one right
angle to reach OG; two right angles to reach OG'; three right angles to
reach OG''; four right angles to reach OG'''. But if a radial arm sweeps
four right angles, it has returned to its starting point. That is, the line
OG''' has returned to OA; that is OG''' is OA. Or the point G''' just is the
same point as A. So the figure is more correctly drawn as:
Notice what has happened. We started with a straight line AB and
extended it to G, G', G'' and then finally back to itself. So the straight line
on which points A and B lie is actually a straight line that wraps back
onto itself.
Now recall that there was nothing special about this line. We started
with ANY STRAIGHT LINE at all. It follows that all straight lines in the
new geometry wrap back onto themselves. Since these straight lines fill
all of space, it follows that this space wraps back onto itself in
every direction.
Circles and Triangles
This last figure has more surprises. To begin, recall that all the lines in it
are straight. So it follows that one of the quarter wedges, AOG'' say, is
actually a triangle, since it is a figure bounded by three straight lines.
Moreover, the angles at each corner are the same: a right angle. That
means that we have a triangle the sum of whose angles is three right
angles, one more than we are used to for all triangles in Euclidean
geometry.
Also it is clear from the symmetry of the three angles, that each side is
the same length. This triangle is also an equilateral triangle. So it is
more accurately drawn as the triangle on the right.
There is also a circle in the figure. While the line AGG'G''G''' is a
straight line, it also has the important property of being the
circumference of a circle centered on O. Every point on AGG'G''G''' is
the same distance from O. That is the defining property of a circle.
And what an unusual circle it is. It has radius AO. That radius AO is
equal in length to each of the four segments AG, GG', G'G'', G''A that
make up the circumference.
Radius = AO
Circumference = AG + GG' + G'G'' + G''A
AO = AG = GG' = G'G'' = G''A
That means that the circle AGG'G''G''' has the curious property that
Circumference = 4 x Radius
Contrast that with the properties familiar to us from circles in Euclidean
geometry
Circumference = 2π x Radius
A longer analysis would tell us that the area of the circle AGG'G''G'''
stands in an unexpected relationship with the radius AO. Specifically
Area = (8/π) x Radius²
In Euclidean geometry, the area of a circle relates to its radius by
Area = π x Radius²
Einstein's Moral
Let us return to our starting point. Euclid's achievement appeared
unshakeable to the mathematicians and philosophers of the eighteenth
century. The great philosopher Immanuel Kant declared Euclid's
geometry to be the repository of synthetic, a priori truths, that is,
propositions that were about the world and yet could also be known
true prior to any experience of the world. His ingenious means of
justifying their privileged status came from his view about how we
interact with what is really in the world. In our perceiving of the world, we
impose an order and structure on what we perceive; one manifestation
of that is geometry.
The discovery of new geometries in the nineteenth century showed that
we ought not to be so certain that our geometry must be Euclidean. In
the early twentieth century Einstein showed that our actual geometry
was not Euclidean. So what are we to make of Kant's certainty?
Einstein gave this diagnosis in his 1921 essay "Geometry and
Experience."
"... an enigma presents itself which in all ages has agitated inquiring minds. How can it be that
mathematics, being after all a product of human thought which is independent of experience, is
so admirably appropriate to the objects of reality? Is human reason, then, without experience,
merely by taking thought, able to fathom the properties of real things?
In my opinion the answer to this question is, briefly, this: as far as the propositions of
mathematics refer to reality, they are not certain; and as far as they are certain, they do not
refer to reality..."
To restate Einstein's point in terms closer to Kant's terminology: in so far
as geometry is synthetic its propositions are not certain; they are
empirical claims about the world to be investigated by science like any
other claim and we can never be absolutely certain of them. In so far as
a geometry's propositions are a priori, they are not factual claims about
the world; they are "if then" statements of logic within some logical
system whose initial propositions are the postulates of the geometry.
What you should know
Different versions of Euclid's fifth postulate.
The alternatives to the fifth postulate that yield alternative geometries.
How to derive results from the alternative postulate 5_NONE in simple geometric constructions.
How these new geometries changed our view of the certitude of geometrical knowledge.
Copyright John D. Norton. December 28, 2006, February 28, 2007; February 2, 9, 14, September 22, 2008; February 2, 2010.
Constant Curvature
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/non_Euclid_constant/index.html[28/04/2010 08:21:41 ﺹ]
HPS 0410 Einstein for Everyone
Spaces of Constant Curvature
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Unfamiliar Geometries Become Familiar
The New Geometry of 5_NONE is Spherical Geometry
Circles and Triangles
Corrections to the Other Postulates
Is the Geometry of 5_NONE Consistent?
The Geometry of 5_MORE
The Geometries Generalized To Three Dimensions of Space
Dropping the Embedding Space
What you should know
Linked documents:
Euclid's Postulates and Some NonEuclidean Alternatives
Unfamiliar Geometries Become Familiar
In the last chapter, we explored the geometry induced
by the postulate 5_NONE by means of the traditional
construction techniques of geometry familiar to Euclid.
We drew lines and found points only as allowed by the
various postulates. The outcome was a laborious
construction of circles and triangles with some quite
peculiar properties. We constructed a circle with center
O and circumference G, G', G'', G'''.
Its circumference is only 4 times its radius (and not
the 2π times its radius dictated by Euclid's geometry).
Its circumference is both a circle and a straight line at
the same time. Each of its quadrants is a triangle with
odd properties. The triangle OGG', for example, has
three angles, each of one right angle. So the sum of its
angles is three right angles (and not the two right
angles dictated by Euclid's geometry).
You would be forgiven for thinking that the new
geometry of 5_NONE is a very peculiar and unfamiliar
geometry and that there is no easy way to
comprehend it as a whole. The surprising thing is that
this is not so. The geometry of 5_NONE and the
geometry of the other postulate 5_MORE turn out to be
the geometries that arise naturally in surfaces of
constant curvature. Recognizing that fact makes it
easy to visualize these new geometries and one
rapidly develops a sense of the sorts of results that will
be demonstrable in them.
We will see in this chapter how this arises. Indeed, it
makes the visualization too easy; the danger is that
we overlook the fact that we are really dealing with
new and different geometries.
The New Geometry of 5_NONE is Spherical Geometry
The geometry of 5_NONE proves to be very familiar; it is
just the geometry that is natural to the surface of a
sphere, such as is our own earth, to very good
approximation. The surface of a sphere has constant
curvature. That just means that the curvature is
everywhere the same. To see how the connection to
the geometry of 5_NONE works, we need only identify
the line AGG'G'' with the equator. The perpendiculars
we erected to it in the last chapter then just become
lines of longitude, all of which intersect at the North
Pole, that is, at O.
It isn't quite that simple. We do need to adjust our
notion of what a straight line is. The essential idea
remains the same. A straight line between two points A
and B is still the shortest distance between two points.
But now we are forced to remain on the surface of the
sphere in finding the shortest distance. There is no
burrowing into the earth to get a shorter distance
between two points. The curve that implements the
shortest distance in the surface is known as a
"geodesic".
There is a simple way of creating geodesics on the
surface of a sphere. They are the "great circles."
That is, they are the circles produced by the
intersection of the sphere with a plane that passes
through the center of the sphere.
In short, the new geometry of 5_NONE is just the
geometry of great circles on spheres.
In such a geometry, there are no parallel lines. All
pairs of great circles intersect somewhere. That this is
so is sometimes overlooked. People sometimes
mistake a parallel of latitude for a great circle. In the
figure below, points A and B of the same latitude are
connected by a parallel of latitude. The parallel of
latitude is a parallel to the equator. However it is not
the analog of a straight line in this geometry, a geodesic.
For geodesics are produced by the intersection of the
sphere with planes that pass through the center of the
sphere. The great circle passing through points A and
B is shown in the second figure. It connects A and B
by a path that deviates to the North. Since it is the
great circle, it is the curve of least distance in the
surface of the sphere between A and B.
The great circles are the routes taken by ships and
airlines over the surface of the earth, whenever
possible, since they are the paths of least distance.
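The difference between the two routes can be put in numbers. The sketch below assumes an idealized, perfectly spherical earth of unit radius and uses the spherical law of cosines, a standard result not derived in this text; the function names are hypothetical and chosen only for illustration. It compares the path along a parallel of latitude with the great circle arc joining the same two points.

```python
import math

def parallel_distance(lat_deg, dlon_deg, radius=1.0):
    """Length of the path along a parallel of latitude between two points
    at latitude lat_deg separated by dlon_deg (at most 180) of longitude."""
    return radius * math.cos(math.radians(lat_deg)) * math.radians(dlon_deg)

def great_circle_distance(lat_deg, dlon_deg, radius=1.0):
    """Length of the great circle arc between the same two points,
    from the spherical law of cosines."""
    lat = math.radians(lat_deg)
    dlon = math.radians(dlon_deg)
    cos_angle = math.sin(lat) ** 2 + math.cos(lat) ** 2 * math.cos(dlon)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return radius * math.acos(cos_angle)

# Two points 90 degrees of longitude apart, at several latitudes.
for lat in (0, 30, 60):
    print(lat, parallel_distance(lat, 90), great_circle_distance(lat, 90))
```

On the equator the two lengths agree, since the equator is itself a great circle. At higher latitudes the great circle arc comes out shorter than the path along the parallel.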
Circles and Triangles
We can now return to the triangles and circles visited
earlier. Their properties were radically different from
Euclidean triangles and circles. The triangle's angles
summed to three right angles and the circle's
circumference was only four times the radius. It is now
easy to see that these deviations from Euclidean
expectations arise only for very large figures on the
surface of the sphere. A very small patch of the
surface of a sphere is very close to being a Euclidean
plane. The calm surface of a small lake on the Earth is
very nearly a flat plane; the surface of an ocean is
markedly curved. In those very small patches, circles
and triangles are very nearly Euclidean in their
properties.
The figure below shows a very small equilateral
triangle A''B''C''. The sum of its angles will meet
Euclidean expectation near enough and be two right
angles. As the triangle grows larger, passing through
triangle A'B'C' to the huge ABC, the sum of its angles
will grow until they are three right angles at ABC.
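The growth of the angle sum can be traced numerically. This is a minimal sketch assuming a sphere of unit radius; the relation cos A = cos a / (1 + cos a) between the side a and the angle A of an equilateral spherical triangle follows from the spherical law of cosines, a standard result not derived in this text. The function name is a hypothetical one introduced for illustration.

```python
import math

def equilateral_angle_sum(side, radius=1.0):
    """Sum of the three angles (in radians) of an equilateral triangle
    drawn with great circle arcs of length `side` on a sphere."""
    a = side / radius                     # side as an angle at the center
    angle = math.acos(math.cos(a) / (1.0 + math.cos(a)))
    return 3 * angle

right_angle = math.pi / 2
quarter = math.pi / 2   # a side that is one quarter of a great circle
for side in (0.01, 0.5, 1.0, quarter):
    # Report the angle sum in units of right angles.
    print(side, equilateral_angle_sum(side) / right_angle)
```

For the smallest side the sum is very nearly two right angles; when each side is a quarter of a great circle, as in the huge triangle ABC, it reaches three.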
The situation is the same with circles. The circle
around the North Pole below with very small radius OA
will meet Euclidean expectations, near enough, and
have its circumference 2π times its radius. As the
circle grows with radius increasing through OB to OC,
the formula will mutate. When the radius is OC, so the
circle now coincides with the equator, the
circumference will have dropped to being only four
times the radius.
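How the formula mutates can be followed step by step. The sketch below assumes a sphere of unit radius and uses the standard expressions for a circle of radius r measured along the surface from the pole: circumference 2π sin(r) and enclosed area 2π(1 - cos(r)). These expressions are not derived in this text, and the function names are hypothetical.

```python
import math

def circle_circumference(r, sphere_radius=1.0):
    """Circumference of a circle whose radius r is measured along the
    surface of the sphere from its center (here, the North Pole)."""
    return 2 * math.pi * sphere_radius * math.sin(r / sphere_radius)

def circle_area(r, sphere_radius=1.0):
    """Area of the spherical cap enclosed by that circle."""
    return 2 * math.pi * sphere_radius ** 2 * (1 - math.cos(r / sphere_radius))

pole_to_equator = math.pi / 2   # radius OC on a unit sphere
for r in (0.001, 0.5, 1.0, pole_to_equator):
    # Ratios that would be 2π and π for every circle in Euclidean geometry.
    print(r, circle_circumference(r) / r, circle_area(r) / r ** 2)
```

For a tiny circle the two ratios are very close to the Euclidean 2π and π. At the equator they have dropped to 4 and 8/π, the values found for the circle AGG'G''G''' in the last chapter.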
Corrections to the Other Postulates
Now that we have identified our geometry of 5_NONE as
the geometry of great circles on spheres, two small
corrections are needed. The first postulate allows us to
draw a straight line between any two points. In the new
geometry, there are two ways of connecting any
two nearby points by a great circle. One goes the short
way; the other goes the long way all around the other
side of the sphere.
The second correction is for the second postulate
which allows us to produce a straight line indefinitely.
That is not possible for great circles. They are already
maximally extended. One part of the original notion
of the second postulate was that a straight line never
really comes to an end. Any point that looks like an
end is only a temporary terminus and the line can be
extended past it. That lack of a boundary point is all
we need for the revised second postulate.
The two modified forms of the first and second
postulates that accommodate these two alterations
are:
1'. Two distinct points determine at least one
straight line.
2'. A straight line is boundless (i.e. has no
end).
The modified postulates are illustrated by the geodesic
drawn through two points A and B:
Is the Geometry of 5NONE Consistent?
Consider the geometry of 5NONE; that is, the
geometry that is deducible from the fifth postulate
5NONE and the other four postulates, suitably
adjusted. The expectation of the mathematicians of
the eighteenth century and earlier had been that one
would eventually be able to deduce a contradiction
from them. That is, they expected them to be
inconsistent. We started deducing consequences
from the postulates but found only odd results, not
contradictions.
By contradiction, I mean "A and
not A," for A some sentence. So if
one's theory allows contradictions
to be deduced, the theorist has a
very serious problem. It may mean
that someone working in dynamics
can infer that a system both
conserves energy ("A") and does
not conserve energy ("not A").
Which ought the theorist to
believe?!
How do we know that a more imaginative, more
thorough analysis might not eventually produce a
contradiction? That is, how do we know that the
new geometry is consistent?
The question could be answered by a proof of the
consistency of the geometry. Alas, advances in
twentieth century mathematics have shown that
proving the consistency of a rich system in
mathematics is typically impossible. However the
geometers of the nineteenth century had already
supplied us with something that, for practical purposes,
is good enough.
In showing that the geometry of 5NONE is really the
geometry of great circles on spheres, they provided a
relative consistency proof. The idea is simple
enough. In a three dimensional Euclidean space, we
can recreate, or simulate, the different geometry of
5NONE by constructing a sphere. Imagine that
somehow we could generate a contradiction within the
geometry of 5NONE. That would then mean that we
could generate a contradiction within the geometry of
great circles on spheres. And that would mean that
there must be a contradiction recoverable within the
geometry of three dimensional Euclidean spaces.
To get a more concrete sense of how this works,
imagine that there is a way of deducing an
inconsistency in the geometry of 5NONE. A geometer
sits down and begins the steps of the construction that
leads to a contradiction. Perhaps the geometer draws
a straight line AB; and then a perpendicular to it; and
so on. Now imagine a second geometer who works in
Euclidean space. That geometer clones exactly
everything the first geometer does, but now
replaces the first geometer's straight line AB by a great
circle through AB on some sphere. The two
constructions will proceed analogously for the original
geometer working in the space of 5NONE and the clone
geometer working in the Euclidean space.
Geometer working with straight lines  | Geometer working with great circles on
in geometry of 5NONE.                 | spheres in Euclidean geometry.
--------------------------------------+---------------------------------------
Select any two points A and B.        | Select any two points A and B.
Connect them with a straight line.    | Connect them with a great circle.
...                                   | ...
...                                   | ...
...                                   | ...
Contradiction!                        | Contradiction!
If the first geometer finds the construction leads to a
contradiction, then so must the clone geometer. But
that clone geometer is working fully within Euclidean
geometry. That is, if the first geometer finds a
contradiction in the geometry of 5NONE, then the
second must find a contradiction in Euclidean
geometry.
So, if the geometry of 5NONE is inconsistent, then
Euclidean geometry must be inconsistent. Or turning it
around, if Euclidean geometry is consistent, then
so must be the geometry of 5NONE. Of course the big
catch is that we cannot prove that Euclidean geometry
is consistent. However we can take some comfort that
millennia of investigations have failed to find an
inconsistency in it. The relative consistency proof
assures us that we are no worse off in the geometry of
5NONE.
The Geometry of 5MORE
What of the geometry of 5MORE? One might imagine
that there are many distinct versions according to how
many parallels can be drawn through a point not on
the original straight line. One can quickly see,
however, that there is only one possibility for this
number. Imagine, for example, that the geometry
allows two parallels AA' and BB' through the point but
no more.
Then we can always bisect the angle between AA' and BB' with a third
line CC'. Now AA' and BB' are parallel to the original
line in the sense that they never intersect it, no matter
how far they are projected. Since CC' is sandwiched
between AA' and BB', the same must be true of it.
The basic idea generalizes. Any attempt to limit the
maximum number of parallels allowed by 5MORE fails;
we can always add one more. So the geometry of
5MORE is the geometry that arises when we may draw
infinitely many parallels through the point not on
the original line.
We could continue the exercise of discovering the
geometry of 5MORE through step by step inference. Since
we've seen it done once for the geometry of 5NONE,
let us just skip to the final result. It turns out that the
geometry of 5MORE is the geometry of a negatively
curved surface of constant curvature like a saddle or
potato chip.
In this geometry, lines can have infinite length, just as
in familiar Euclidean geometry.
However there are differences that are analogous to
those of the geometry of a spherical space:
In very small parts of the space, circles and triangles
behave like Euclidean circles and triangles, near
enough.
As the circles and triangles get larger, deviations from
Euclidean behavior emerge. The circumference of
circles becomes more than 2π times the radius; and
the sum of the angles of a triangle becomes less than
two right angles.
The perpendiculars to the equator on the surface of a
sphere converge to a single point, the North Pole. On
this surface of negative curvature, perpendiculars to a
straight line diverge.
The Geometries Generalized To Three Dimensions of Space
So far, we have explored the geometries of 5NONE and
5MORE for the case of two dimensional spaces. We
can also consider each in three dimensional spaces.
The results we would arrive at are summarized in the
table (duplicated in Euclid's Postulates and Some
Non-Euclidean Alternatives).
                            | Spherical Geometry       | Euclidean Geometry   | Hyperbolic Geometry
                            | Positive curvature       | Flat                 | Negative curvature
                            | Postulate 5NONE          | Euclid's Postulate 5 | Postulate 5MORE
----------------------------+--------------------------+----------------------+--------------------------
Straight lines              | Finite length; connect   | Infinite length      | Infinite length
                            | back onto themselves     |                      |
Sum of angles of a triangle | More than 2 right angles | 2 right angles       | Less than 2 right angles
Circumference of a circle   | Less than 2π times       | 2π times radius      | More than 2π times
                            | radius                   |                      | radius
Area of a circle            | Less than π(radius)²     | π(radius)²           | More than π(radius)²
Surface area of a sphere    | Less than 4π(radius)²    | 4π(radius)²          | More than 4π(radius)²
Volume of a sphere          | Less than                | (4π/3)(radius)³      | More than
                            | (4π/3)(radius)³          |                      | (4π/3)(radius)³
In very small regions of space, the three geometries
are indistinguishable. For small triangles, the sum of
the angles is very close to 2 right angles in both
spherical and hyperbolic geometries.
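The two rows of the table for circles can be checked with a short Python sketch. It assumes two dimensional spaces of constant curvature with curvature radius R, and it borrows the standard formulas for the circumference and area of a circle of radius r measured within the surface; these formulas are not derived in this chapter. The function name and the labels "spherical", "euclidean" and "hyperbolic" are illustrative choices only.

```python
import math

def circle_measures(r, kind, R=1.0):
    """Return (circumference, area) of a circle of radius r measured in the surface.

    kind is "spherical", "euclidean" or "hyperbolic"; R sets the curvature scale.
    The formulas are assumed standard results for constant curvature surfaces.
    """
    if kind == "spherical":
        return (2 * math.pi * R * math.sin(r / R),
                2 * math.pi * R**2 * (1 - math.cos(r / R)))
    if kind == "hyperbolic":
        return (2 * math.pi * R * math.sinh(r / R),
                2 * math.pi * R**2 * (math.cosh(r / R) - 1))
    if kind == "euclidean":
        return (2 * math.pi * r, math.pi * r**2)
    raise ValueError("unknown geometry: " + kind)

for r in (0.001, 1.0):
    for kind in ("spherical", "euclidean", "hyperbolic"):
        c, a = circle_measures(r, kind)
        print(r, kind, c / r, a / r**2)
```

For the tiny circle, all three geometries give ratios very close to 2π and π. For the large circle, the spherical values fall below the Euclidean ones and the hyperbolic values rise above them, as the table says.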
Dropping the Embedding Space
What made visualizing these non-Euclidean
geometries easy was that we embedded the non-Euclidean
space in a higher dimensioned Euclidean
space. That took an unfamiliar and even disquieting
geometry and made it familiar. However in the end, we
must dispense with these higher dimensional
embedding spaces and simply take the new
geometries as worthy geometries in their own right.
There are three problems if we do not dispense with
the embedding space.
One is technical. Sometimes the embedding cannot
be implemented fully. The two dimensional negatively
curved saddle shape can only be embedded into a
three dimensional space in pieces; the full surface
cannot be embedded.
Another is practical. The real gain of an embedding is to our
imagination. Imagine a three dimensional curved
space that is curving into the fourth dimension of a four
dimensional Euclidean space. Well, that's the problem.
You cannot imagine it. So the practical gain to
visualization is lost in this case. It is replaced by a new
problem: how are we to visualize the curving of the
three dimensional space into a four dimensional
space?
The final problem is the most serious. If our geometry
turns out to be factually one of the curved geometries,
then the supposition of a higher dimensioned
Euclidean space is a falsehood and a potentially very
misleading one. For, if we take it seriously, we end up
believing that space is really Euclidean after all, but
only in some higher dimension to which we have no
access. If all we know is the three dimensions of space
in which we measure, then we have no license to
conjure up an otherwise inaccessible higher
dimensioned Euclidean space for it to curve into. What
makes us think such a higher dimensioned space
exists?
What you should know
How the geometries of 5NONE and 5MORE are realized in surfaces of constant positive
and negative curvature.
How each of these geometries differs in its treatment of ordinary figures from
Euclidean geometry.
How the geometries generalize to dimensions higher than two.
Why and how you should "drop the embedding space."
How the consistency of the non-Euclidean geometries is assured through a relative
consistency proof.
Copyright John D. Norton. December 28, 2006, February 28, 2007; February 2, 9, 14, September 22, 2008; February 3, March 1, 2010.
Spaces of Variable Curvature
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/non_Euclid_variable/index.html[28/04/2010 08:21:47 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
Spaces of Variable Curvature
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Spaces of Variable Curvature
Geodesic Deviation
Intrinsic versus Extrinsic Curvature
Geodesic Deviation in Spaces of Variable Curvature
Curvature in Different Directions of a Higher Dimensioned Space
A Space with Different Curvature in Different Directions
In Sum
What you should know
Linked document:
Euclid's Postulates and Some Non-Euclidean Alternatives
Spaces of Variable Curvature
So far, we have examined the geometry of
homogeneous spaces. That is, we have been
examining spaces that are everywhere the same,
geometrically. This means that if a space is flat in one
place, we have assumed it is flat everywhere. Or if it
has positive curvature in one place, we have assumed
it has the same positive curvature everywhere else.
A simple example is just the surface of a sphere. Its intrinsic
geometry has positive curvature and that curvature is the same
everywhere. This means that the geometry of each little patch of
the sphere's surface is the same as that of every other little patch.
Nothing requires us to restrict ourselves
to surfaces like the surface of a sphere.
We can investigate surfaces that have
curvatures that vary from place to place:
no curvature here; positive curvature
there; even more positive curvature
somewhere else; and negative
curvature in yet another place. A
surface with that sort of geometry is not
hard to visualize. It is just the dimpled
surface shown here, with flat parts,
dome-like parts with positive curvature
and saddle-like parts with negative
curvature.
If we have a space of variable curvature, how do we
determine the curvature at each place? Since the
curvature varies from place to place, the methods that
we learned in the last chapter will be of limited use. We
could draw very big triangles to check how the sums of
their angles differ from 180°. But if the curvature of the
space varies a great deal over the region of space
covered by the triangle, that sum will not tell us much
of any use.
We will need a means that works locally, that is, it
works in tiny little patches of the space. We will
develop these means below, first for the familiar case of
spaces of constant curvature.
Geodesic Deviation
If we are in a curved space, here is a convenient way
to determine the curvature without having to resort to a
higher dimensioned embedding space. We start with a
straight (geodesic) line and erect straight (geodesic)
perpendiculars on it. We then proceed along the
perpendiculars, noticing whether they converge or
diverge (or neither). That tells us immediately what
sort of space we are in. There are three cases
corresponding to the three geometries.
In the Euclidean case,
the perpendiculars neither
converge nor diverge.
In the case of a spherical
geometry of positive curvature,
the perpendiculars converge.
In the case of a hyperbolic
geometry of negative curvature,
the perpendiculars diverge.
So far, we have just considered the simplest case of
geodesic deviation. In that simplest case, we started
out with a family of geodesic curves perpendicular to
some base line and checked whether they
converged or diverged. We can still use geodesic
deviation if we start out with a family of geodesic
curves that are not perpendicular to some
baseline. All we check for is whether they converge
or diverge faster or slower than a straight line
projection would indicate.
Just how much convergence or
divergence should we expect with a
"straight line projection"? Under it,
the distances by which the
geodesics approach or recede are
just directly proportional to the
distance we move along the
geodesics. If they are converging,
when we go twice as far, for
example, then the geodesics approach
each other exactly twice as much.
Here's how things work out for the case of
positive curvature. We start out with a family of
geodesic curves that are initially diverging. The
dotted lines show how they would continue to
diverge under straight line expectations. What
they actually do is to diverge slower than
these expectations. So we have a case of
positive curvature.
Now here's the other case of positive curvature.
The family of geodesic curves are initially
converging. The dotted lines show how they
would continue to converge under straight line
expectations. What they actually do is to
converge faster than these expectations. So
we have a case of positive curvature again.
Note that in both cases of positive curvature, the
deviation from linear expectations is inwards.
All this is reversed for the case of negative curvature.
Deviations from the linear trend are always outwards.
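The inward and outward deviations can be made concrete with a small Python sketch. It rests on an assumption that is not derived in this chapter: in a surface of constant curvature K, the separation s between two neighboring geodesics changes with the distance t travelled along them according to s'' = -K s. The function names and the sample numbers are illustrative only, and the sketch is meant for distances short enough that the geodesics have not yet crossed.

```python
import math

def separation(t, s0, v0, K):
    """Separation of two neighboring geodesics after moving a distance t along them.

    s0 is the initial separation; v0 is its initial rate of change
    (positive: initially diverging, negative: initially converging);
    K is the constant curvature. Solves the assumed equation s'' = -K * s.
    """
    if K > 0:
        k = math.sqrt(K)
        return s0 * math.cos(k * t) + (v0 / k) * math.sin(k * t)
    if K < 0:
        k = math.sqrt(-K)
        return s0 * math.cosh(k * t) + (v0 / k) * math.sinh(k * t)
    return s0 + v0 * t

def straight_line_projection(t, s0, v0):
    """Separation expected if the geodesics kept their initial trend."""
    return s0 + v0 * t

def deviation(t, s0, v0, K):
    """Negative means inwards, positive means outwards, zero means flat."""
    return separation(t, s0, v0, K) - straight_line_projection(t, s0, v0)

for K in (1.0, 0.0, -1.0):
    for v0 in (0.5, -0.5):
        print("K =", K, "v0 =", v0, "deviation =", deviation(1.0, 1.0, v0, K))
```

With positive K, the deviation is negative whether the geodesics start out diverging or converging. With negative K it is positive in both cases, and with K equal to zero it vanishes.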
Intrinsic versus Extrinsic Curvature
The notion of geodesic deviation enables us to
distinguish two types of curvature in geometry.
The first is most familiar to us, extrinsic curvature. It
arises whenever we have a surface that curves into a
higher dimension. We have seen many examples. One
of the simplest arises when a flat sheet of paper is bent
or rolled up into a cylinder. A more interesting case
arises when the surface is dome-like, such as a
hemisphere.
In this last case of the hemisphere, the curvature of the
surface into the higher dimension is associated with a
failure of ordinary Euclidean geometry in the surface of
the sphere. This failure of Euclidean geometry arises
fully within the surface; it is a manifestation of intrinsic
curvature. To summarize:
A surface exhibits extrinsic
curvature when that surface
curves into a higher dimension
in an embedding space.
A surface exhibits intrinsic curvature when
the geometry within the surface differs from
flat, Euclidean geometry. It is revealed by
geodesic deviation.
You might think that extrinsic curvature and intrinsic
curvature must go hand in hand; whenever you have
one, you have the other. That is not so. It is easy to
have a surface that has extrinsic curvature, but no
intrinsic curvature. The example is a familiar one.
Take a flat Euclidean surface. The geometry on its
surface will be Euclidean, obviously. That means, if
we draw a triangle on the surface, its angles will sum
to 180 degrees.
Now roll that surface up into a cylinder. That
means the surface has now acquired extrinsic
curvature. However its intrinsic curvature has not
changed; it is still intrinsically flat. To see this
consider any figure that you might have drawn on
the surface. Within the surface, nothing about
the figure is disturbed. If the figure conformed to
Euclidean geometry before being rolled up, it will
conform to Euclidean geometry after being rolled
up.
For example, when the surface was rolled up, the
sides of the triangle shown are bent into the higher
dimension by the rolling up of the surface. But
within the surface, they remain straight lines in the
sense relevant to the intrinsic geometry. That is,
they remain geodesics, the shortest distances in
the surface between the corners of the triangle.
Correspondingly, by measures taken within the
surface, the angles of the triangle will still sum to
180 degrees.
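Here is a minimal Python sketch of the point that rolling up the sheet leaves lengths within the surface unchanged. It assumes a particular way of rolling: the point (u, v) of the flat sheet is carried to the point (R cos(u/R), R sin(u/R), v) of a cylinder of radius R. The function names are illustrative. The length within the surface is estimated by adding up many short steps along the rolled-up line.

```python
import math

def roll_onto_cylinder(u, v, R=1.0):
    """Carry the point (u, v) of the flat sheet to a cylinder of radius R."""
    return (R * math.cos(u / R), R * math.sin(u / R), v)

def length_in_surface(p, q, R=1.0, steps=10000):
    """Length, measured along the cylinder, of the rolled-up straight line from p to q.

    p and q are (u, v) points of the flat sheet. The rolled-up line is
    approximated by many short chords whose lengths are summed.
    """
    total = 0.0
    previous = roll_onto_cylinder(p[0], p[1], R)
    for i in range(1, steps + 1):
        t = i / steps
        u = p[0] + t * (q[0] - p[0])
        v = p[1] + t * (q[1] - p[1])
        current = roll_onto_cylinder(u, v, R)
        total += math.dist(previous, current)
        previous = current
    return total

# A triangle with sides 3, 4 and 5 drawn on the flat sheet.
a, b, c = (0.0, 0.0), (3.0, 0.0), (3.0, 4.0)
print(length_in_surface(a, b))  # very nearly 3
print(length_in_surface(b, c))  # very nearly 4
print(length_in_surface(a, c))  # very nearly 5

# The straight shortcut through the embedding space is shorter,
# but it leaves the surface.
print(math.dist(roll_onto_cylinder(*a), roll_onto_cylinder(*c)))
```

The three sides keep their flat lengths when measured within the rolled-up surface. Only the shortcut through the higher dimension notices the bending, and that shortcut is not available to a geometer confined to the surface.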
Geodesic Deviation in Spaces of Variable Curvature
One of the special benefits of geodesic deviation
is that it works in surfaces whose curvature varies
from place to place. It is the local measure of
curvature we need. We may have a surface with
positive curvature, but the amount of curvature
varies from place to place. The rate of
convergence will tell us how much curvature we
have at each place. It will also tell us if the
curvature varies to zero (flat) or becomes
negative. Here is a surface whose curvature
varies from place to place. Geodesic deviation
allows us to track how the curvature changes.
When we looked at spaces of constant
curvature, we defined geodesics as
curves of shortest distance between two
points. That definition remains when we
move to spaces of variable curvature.
Geodesics are still the curves of
shortest length. In familiar terms,
imagine that you are hiking over a rocky
terrain whose surface has a curvature
that varies from place to place. If you
walk along the shortest route you can
find, you have just traced out a
geodesic of that surface.
To begin, on the left, there is no
divergence or convergence of the
perpendicular geodesics. The
geometry within the surface at the
leftmost part is flat.
How can that be? The surface is curved into a cylinder! That
only means that the surface has extrinsic curvature. It is bent
into a higher dimension. There is no intrinsic curvature;
that is, there is none that manifests geometrically within the
surface.
In the central part, the surface adopts the negative
curvature of a saddle shape. The perpendiculars there
diverge outwards from the central part of the saddle.
On the right, the surface is positively curved. So
there the perpendiculars converge as we move away
from the central portion of the curved surface.
Curvature in Different Directions of a Higher Dimensioned Space
So far we have seen that the curvature of a surface
can vary from place to place. It might be zero here,
positive there and negative over there. However, at
any one place in the surface, there has only been
one curvature. That turns out to be a special case that
arises in the simple example of a two dimensional
surface.
When we consider spaces of three or more
dimensions, the sort of curvature we have can vary
according to the direction in which we are looking.
More precisely, we can slice up a three or higher
dimensional space into two dimensional sheets with
different orientations. In making the sheets, we make
them as straight as we can; that is, we make sure that
they are built from straight lines running in two
directions. As before, we use geodesic deviation to
determine the curvature intrinsic to each sheet. In
general, we will find that different two dimensional
sheets going through the same point in the space will
have different curvatures.
Let us start with a simple example in
which there is no difference in curvature
according to the direction considered.
Here's a three dimensional Euclidean
space sliced into sheets that run front-back
and left-right. (As required above,
these sheets are built from straight lines
that run front-back and left-right, so they
are as flat as we can make them.) We use
geodesic deviation to find the curvature in
the sheet at some point P. There is none;
the sheet is flat.
We might slice up the very same space into
sheets that run front-back and up-down. We
can use geodesic deviation again to find the
curvature at the same point P, but in this
new sheet. There is none; the sheet is flat.
Things worked out simply in this last example. But it is
entirely possible that they do not: a sheet that runs in a
different direction through the same point P may have
a different curvature. You are
probably wondering how this can come about. The
next section constructs a three dimensional space that
has different curvatures in different sheets.
A Space with Different Curvature in Different Directions
The construction of this section is a
little taxing until you are used to
visualizing curved spaces of various
dimensions. So work through it if
you can. Or, if you prefer, just accept
that in three and higher dimensions,
the curvature of space can vary
according to the direction of the sheet
considered.
To begin, recall that we can always add an extra
dimension to a space by "extruding" it, that is,
stacking up repeated copies in a new direction. That
adds to the space we started with. For example, start
with an ordinary, flat two dimensional surface that runs
front-back and left-right. Extrude it in the obvious way
in the up-down direction and we have constructed an
ordinary, flat, three dimensional space.
We need not start with an ordinary flat surface. We might
start with a one dimensional circle that runs left-right and
extrude it up-down. The resulting two dimensional
surface is a cylinder. So far nothing unusual has happened
as far as curvature is concerned. The intrinsic curvature of
the surface of the cylinder is flat. (We already saw this
above. Just imagine that the cylinder is slit vertically and
unrolled. We end up with a flat sheet, while we haven't
changed any of the geometry intrinsic to the surface.)
Now let us do something a little fancier. We start with a two
dimensional space that is the surface of a sphere. In that
surface, we have two directions: east-west and north-south.
We form a three dimensional space by extruding the
sphere in a third up-down direction. The picture shows
roughly what the resulting space would be like. It is only a
rough picture since the extruded space has a geometry that
makes it impossible to draw faithfully on a two dimensional
Euclidean page. The figure is trying to show a three
dimensional, non-Euclidean space that consists of the
surfaces of many spheres all stacked up on top of each
other in an additional dimension.
At any point in this space we can take different slices. For
example, we can take an east-west north-south slice.
The resulting sheet is just the original two dimensional
sphere. Geodesic deviation will tell us that the sheet has
positive curvature.
We could also slice the space in the east-west up-down
direction. That defines a sheet that might coincide with the
east-west equator of the sphere and what that equator
extrudes into in the up-down direction. That is just a
cylinder. We can use geodesic deviation to determine the
curvature in this sheet. Since the cylinder is just a flat
surface rolled up, we will find zero curvature.
Therefore, in this space, if we form a sheet by slicing in one
direction, we end up with a sheet that has positive
curvature. If we form a sheet by slicing in a different
direction at the same point, we end up with a sheet that has
zero curvature.
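For readers who like to see numbers, here is a hedged Python sketch of how the curvature of a sheet depends on its direction in this extruded space. It relies on a standard result for such product spaces that is not derived in this chapter: the curvature of a sheet equals the sphere's curvature, 1/R², multiplied by the squared fraction of the sheet's area that lies within the sphere's own two directions. Directions are given as components (east-west, north-south, up-down) in perpendicular unit directions at the point. The function name is illustrative.

```python
def sheet_curvature(a, b, R=1.0):
    """Curvature of the sheet through a point spanned by directions a and b.

    a and b are (east-west, north-south, up-down) components.
    Assumes the space is a sphere of radius R extruded in the up-down direction.
    """
    def squared_area(x, y):
        # Squared area of the parallelogram spanned by x and y.
        xx = sum(i * i for i in x)
        yy = sum(i * i for i in y)
        xy = sum(i * j for i, j in zip(x, y))
        return xx * yy - xy * xy

    full = squared_area(a, b)
    if full == 0:
        raise ValueError("the two directions must be independent")
    # Keep only the components lying within the sphere's surface.
    within_sphere = squared_area(a[:2], b[:2])
    return (1.0 / R**2) * within_sphere / full

east, north, up = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(sheet_curvature(east, north))      # the original sphere: positive curvature
print(sheet_curvature(east, up))         # the cylinder: zero curvature
print(sheet_curvature(east, (0, 1, 1)))  # a tilted sheet: something in between
```

The first two sheets reproduce the positive and zero curvatures found above. The tilted sheet shows that intermediate directions give intermediate curvatures.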
In Sum
These last two sections show just how complicated
curvature can be in geometry. Curvature can vary from
place to place in a space; and at one place it can vary
according to the direction considered. That capacity for
complexity is going to prove very useful. It turns out
to be just what Einstein needed to represent gravity
geometrically. But now we are getting ahead of
ourselves; that will be our topic in the next chapter.
What you should know
How to use geodesic deviation to detect the curvature of a surface's geometry from
within the surface.
How curvature can vary from place to place in a space.
How curvature can differ at one place in space if we consider different two dimensional
sheets there.
Copyright John D. Norton. February 14, September 22, October 13, 2008; March 1, 2010.
General Relativity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/general_relativity/index.html[28/04/2010 08:21:58 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
General Relativity
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Special and General Relativity
In a Nutshell: Gravitation is Curvature of Spacetime
Geodesic Deviation: a Refresher
Free Fall inside the Earth...
...Reinterpreted
Uniqueness of Free Fall
Gravity Above the Surface of the Earth
Masses Distributed Vertically
Masses Distributed Horizontally
Summed Curvature and Matter Density
From Curvature in Space-Time Sheets to Space-Space Sheets
Representing Matter Density
Representing Summed Curvature
Einstein's Gravitational Field Equations
What You Should Know
Background Reading: J. P. McEvoy and O. Zarate, Introducing
Stephen Hawking. Totem Books. pp. 9-46.
Special and General Relativity
The special theory of relativity was a first step for Einstein. The
fuller development of his goal of relativizing physics came with his
general theory of relativity. That theory was completed in its most
important elements in November of 1915. By many measures, the
special theory was a smaller achievement. Its final creative phase
took Einstein some 5 to 6 weeks. Of all the new theories of 20th
century physics it is usually regarded as the most conservative.
Had Einstein not published the theory in 1905, we have good
reason to think that it would have emerged in one form or another.
Both Lorentz and Poincaré had developed the essential equations;
they just put a different interpretation on them than did Einstein.
The general theory of relativity took seven years of work by
Einstein, the final two to three being years of intense and
exhausting labor. No one else was even close to Einstein's ideas.
Had he not worked on them, they would most probably not have
emerged then. We may not even have them today. In some ways,
Einstein's theory is conservative. It is the last "classical" field
theory in the sense that "classical" can mean "non-quantum." In
another sense, it is anything but conservative. The theory is quite
different from any theory before or after. It treats a force by means
of geometry and eventually leads to startling notions: black holes,
other universes and the bridges to them and even the possibility of
time travel. All other theories of forces have been readily swept
into quantum theory. General relativity has resisted and the
problem of bringing general relativity and quantum theory together
remains one of the most difficult, outstanding puzzles of modern
physics.
In a Nutshell: Gravitation is Curvature of Spacetime
Before we start to delve into the theory in greater detail, we should
just state its basic idea. The theory is based on a single,
luminous, dominant idea.
In Newton's classical account of
gravitation, the earth wants to move
inertially, that is, uniformly in a straight
line. A gravitational force from the sun
deflects it and causes it to move in an
elliptical orbit around the sun.
In Einstein's theory, the presence of the sun disturbs
(that is, curves) the very fabric of space and time. The earth
then merely moves inertially in this new disturbed
spacetime. It follows an inertial trajectory, but that
trajectory has been distorted so that it ends up as an
ellipse in the space around the sun; or, more precisely, a
helical trajectory winding around the sun's worldline in
spacetime.
General relativity combines the two major theoretical transitions
that we have seen so far. These two transitions are depicted in the
table below. The first is represented in the vertical direction by the
transition from space to spacetime. We learned from
Minkowski that special relativity can be developed as the geometry
of a spacetime. The analogy is quite close. The trajectories of
bodies in inertial motion are straight lines in spacetime in the
sense that they are curves of greatest proper time, that is, timelike
geodesics. That makes them the analogs of the straight lines of
Euclidean geometry, which are also called geodesics, the curves
of shortest distance.
The second transition is represented in the horizontal direction in
the table. It is the transition from flat to curved geometry. In
the context of ordinary spatial geometry, that transition takes us
from the venerable geometry of Euclid to the geometry of curved
surfaces of the nineteenth century. In the context of spacetime
theories, that same transition takes us from the geometry of a flat
spacetime, the Minkowski spacetime of special relativity, to the
geometry of the curved spacetimes of general relativity. The
central idea of Einstein's general theory of relativity is that this
curvature of spacetime is what we traditionally know as gravitation.
                Flat geometry            Curved geometry
Space           Euclidean geometry       Non-Euclidean geometry
Spacetime       Special relativity       General relativity
                (Minkowski spacetime)    (semi-Riemannian spacetimes)
This makes learning Einstein's general theory of relativity much
easier, for we have already done much of the ground work. The
mathematics needed to develop the theory is just the
mathematics of curved spaces, but with the one addition shown:
it is transported from space to spacetime.
There is a great deal more that could be said, and some of it will
be. Einstein himself gave a rather detailed account of the theory as
generalizing the principle of relativity to accelerated motion. In
first approaching the theory, I will say little about that. I will take
you along a different pathway that avoids many of the unnecessary
pitfalls of Einstein's account. The problem is that, in retrospect, it is
very far from clear just how that generalization was brought about
or even if it was done at all. So let's concentrate on curvature. The
royal road to curvature is geodesic deviation.
Geodesic Deviation: a Refresher
Let us recall how geodesic deviation allows us to detect the
positive curvature of a spherical surface.
A number of observers all start at the equator of a sphere. They
proceed in the same direction, due North. As they proceed, the
paths converge, eventually crossing at the North Pole. The
familiar view of the surface of the two dimensional sphere
embedded in a three dimensional space is shown in the first figure.
How would this appear to someone trapped in the surface, without the
higher viewpoint of the third dimension? They would map out the
trajectories as shown.
We detect the positive curvature of the surface in the convergence of the
paths of the travelers. Had they diverged, we would have diagnosed
negative curvature.
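The convergence of the travelers can be put in numbers. The following is a minimal sketch, assuming an ideal sphere and two travelers who start a small distance apart on the equator; the function names and the unit radius are illustrative choices, not part of the original text. On the sphere the separation shrinks as the cosine of the latitude reached, whereas on a flat surface parallel straight paths keep their separation.

```python
import math

def meridian_separation(initial_separation, distance_travelled, radius):
    """Separation (measured along the circle of latitude) between two
    travelers who start side by side on the equator of a sphere and
    both head due north along their meridians."""
    latitude = distance_travelled / radius  # latitude reached, in radians
    return initial_separation * math.cos(latitude)

def flat_separation(initial_separation, distance_travelled):
    """On a flat surface, parallel straight paths neither converge nor diverge."""
    return initial_separation

if __name__ == "__main__":
    R = 1.0  # sphere radius in arbitrary units (assumed)
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        s = fraction * (math.pi / 2) * R  # distance covered toward the pole
        print(f"{fraction:4.2f} of the way to the pole: "
              f"sphere {meridian_separation(0.1, s, R):.4f}, "
              f"flat {flat_separation(0.1, s):.4f}")
```

The shrinking separation in the first column is the geodesic deviation that signals positive curvature; the constant second column is what a flat surface would show.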
Free Fall inside the Earth...
Can we find similar effects in spacetime? Then we would have
found curvature. So what we seek is a sheet of spacetime in which
we find converging or diverging curves. As we shall see, that will
be easy to find. A collection of masses in free fall in a gravitational
field will provide exactly the sort of curves we need.
To get us started, we will take the simplest case as far as the
curvature is concerned, although the set up physically is a bit
messier.
Imagine that we drill a hole through to the center of the earth and
out to the other side. It will be about 7,900 miles long. We evacuate the
resulting tube and cap it so that bodies dropped in the hole can fall
without any air resistance at all.
A small ball dropped from the surface would fall to the center, arriving there in
21 minutes, rush past and head towards the other side, arriving another 21
minutes later. It would then fall back towards the side it started. If nothing
intervened it would continue to oscillate back and forth, taking 42 minutes to
complete each journey from one side to the other.
Of course we must ignore
the rotation of the earth.
Otherwise that rotation
might bring the walls of the
tube into collision with the
falling ball.
One of the oddities of gravity is that this period of 84 minutes (=42
minutes there + 42 minutes back) is fixed, no matter where the
ball may first be released. Imagine, for example, that it is released
from rest halfway between the surface and center. It would take
the same 42 minutes to cross the center and come momentarily to
rest at the corresponding point on the other side of the center; and
then another 42 minutes to make the trip back to its starting point.
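These times can be checked with a short calculation. The sketch below assumes an idealized earth of uniform density, so that the acceleration in the tube is proportional to the distance from the center; the numerical values for surface gravity and radius are standard round figures supplied here as assumptions, and the function names are hypothetical. It computes the period analytically and then follows a ball numerically from two different release points.

```python
import math

G_SURFACE = 9.81    # m/s^2, surface gravity (assumed value)
R_EARTH = 6.371e6   # m, mean radius of the earth (assumed value)

def analytic_period_minutes():
    """Period of r'' = -(g/R) r, the motion in a uniform-density earth."""
    return 2 * math.pi * math.sqrt(R_EARTH / G_SURFACE) / 60.0

def crossing_time_minutes(release_fraction, dt=0.1):
    """Follow a ball released from rest at release_fraction * R until it
    comes momentarily to rest on the far side of the center."""
    omega_sq = G_SURFACE / R_EARTH
    r = release_fraction * R_EARTH
    v = 0.0
    t = 0.0
    while True:
        # one velocity Verlet step for the linear restoring acceleration
        v_half = v - 0.5 * dt * omega_sq * r
        r = r + dt * v_half
        v = v_half - 0.5 * dt * omega_sq * r
        t += dt
        if v >= 0.0:  # the ball has stopped moving away and turns back
            return t / 60.0

if __name__ == "__main__":
    print(f"full period: {analytic_period_minutes():.1f} minutes")
    for fraction in (1.0, 0.5, 0.1):
        print(f"released at {fraction:.1f} R: "
              f"{crossing_time_minutes(fraction):.1f} minutes to the far side")
```

Under these assumptions the crossing time comes out the same for every release point, which is the fixed period described above.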
Here's an animation that
shows balls starting at
different places in the hole.
Imagine that the balls are so
small that they pass by one
another without interference.
(That is hard to draw, so the
animation just shows them
passing through each other.)
Now let's plot the motions
through time over the 42
minutes needed for a ball to
fall past the center and
come to rest at the other
side:
This plot has now given us a
spacetime diagram of the
motions of the balls. It is just:
If you compare this spacetime diagram to the earlier figure of the
travelers on the earth's surface, you will see that they agree in the
essential aspect. They both show converging trajectories, the
hallmark of positive curvature. This allows us to interpret the
gravitational motions in a novel way.
The temptation is to call this convergence "geodesic deviation."
We need to be a little cautious here since the trajectories in the
spacetime are not necessarily geodesics, that is, curves of
shortest distance. They are not in Newtonian theory. In relativity,
both special and general, they are timelike geodesics. That is, they
are curves of greatest proper time, which is the analog of the
straight lines of Euclidean geometry, the curves of shortest
distance. So we can yield to the temptation and, in so doing, arrive
at the essential idea of Einstein's theory.
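To see what "greatest proper time" means in the simplest relativistic case, here is a small sketch for the flat Minkowski spacetime of special relativity. It assumes units in which the speed of light is 1 and worldlines built from straight segments; the function name and the sample events are illustrative only. The straight worldline between two events accumulates more proper time than a worldline that takes a detour.

```python
import math

def proper_time(events):
    """Total proper time along a worldline made of straight timelike
    segments. `events` is a list of (t, x) pairs, in units with c = 1."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval_sq = dt * dt - dx * dx
        if interval_sq <= 0:
            raise ValueError("segment is not timelike")
        total += math.sqrt(interval_sq)
    return total

# Inertial motion: a straight line between the two end events.
straight = [(0.0, 0.0), (10.0, 0.0)]
# A detour: out to x = 3 and back, arriving at the same final event.
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]

straight_time = proper_time(straight)  # 10.0
detour_time = proper_time(detour)      # 4.0 + 4.0 = 8.0
```

The straight worldline plays the role of the straight line of Euclidean geometry, except that it is a curve of greatest rather than least length.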
...Reinterpreted
Here's how we pass to the essential idea of Einstein's theory.
Newton's theory: These
motions are due to the force of
gravity deflecting the bodies
from the trajectories they want to
follow into the oscillations we
see.
Reinterpreted theory: the sheet of space-time
displayed in the spacetime diagram is
intrinsically curved. The trajectories followed by
the bodies in free fall are simply the straightest
lines of this new curved geometry.
We'll call this a sheet of space-time, with a hyphen, to
indicate that the sheet has one spatial dimension
and one temporal dimension.
Don't even try to imagine this as extrinsic curvature,
the bending of a surface into a higher dimensioned
space. That way leads to madness! Think of the
curvature intrinsically, that is, as a geometrical effect
arising entirely within the surface.
This case of free fall inside the earth turns out to be an especially
simple case as far as curvature is concerned in two ways.
First, the curvature of the
spacetime sheets explored by
these falling masses proves to
be constant throughout the
sheet. That follows since the
rate at which neighboring balls
in the sheet converge is the
same throughout the sheet.
This constancy is not too hard to see. If one works out the Newtonian gravitation theory, it turns
out that the acceleration due to gravity of a ball in the tube grows linearly with distance from
the center of the earth. That means that with each additional mile's distance, we add the
same increment to the acceleration of the ball. Therefore if two balls are separated by one mile
in height anywhere in the tube, their relative acceleration will be the same. Since the relative
acceleration fixes the rate of convergence, that rate will be the same everywhere. Since the
rate of convergence fixes the curvature, it follows that the curvature is the same everywhere.
Second, the magnitude of the curvature does not depend on the
mass or size of the earth; it depends only on the mass density of
the earth. (This is not obvious. An easy calculation in Newtonian theory can show
it, however.)
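The Newtonian calculation behind both points can be sketched in a few lines. It assumes an idealized earth of uniform mass density; the symbols ρ (density), G (Newton's constant), M(r) (mass within radius r) and a(r) (acceleration at distance r from the center) are introduced here for illustration.

```latex
% Mass enclosed within distance r of the center of a uniform sphere
M(r) = \tfrac{4}{3}\pi \rho\, r^{3}

% Acceleration of a ball at distance r: it grows linearly with r
a(r) = -\frac{G\,M(r)}{r^{2}} = -\tfrac{4}{3}\pi G \rho\, r

% Relative acceleration of two balls separated by \Delta r
a(r + \Delta r) - a(r) = -\tfrac{4}{3}\pi G \rho\, \Delta r
```

The last line contains neither r nor the total mass or radius of the earth. The rate of convergence of neighboring balls is therefore the same everywhere in the tube and is fixed by the density alone.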
These last two points are important enough to be stated in a
relation that is close to (but not quite) one that holds very
generally:
Curvature of spacetime sheet within the earth   is proportional to   matter density of the earth
In this formula Newtonian "mass density" has been replaced by the
vaguer "matter density" in anticipation of what will transpire in
general relativity, where the density of matter is a more
complicated quantity that embraces energy and momentum
densities as well as stresses.
The analysis can be generalized. We considered just one space-time
sheet, the one swept out through time by the hole we
imagined drilled through the earth. Nothing in the analysis
depended upon where we drilled the hole. We could have drilled
many holes. Each would sweep out a different sheet in spacetime
to which this analysis would apply. In general, there are three
independent spatial directions we could have chosen,
corresponding to the three axes of a three dimensional space.
Finding the curvature in the three resulting sheets would be
enough to fix the curvature in all possible sheets generated by
holes we may dig.
Uniqueness of Free Fall
We have reinterpreted gravitational accelerations as
manifestations of an intrinsic curvature of spacetime.
So far, we have actually posited nothing new physically,
beyond "geometrizing away gravitational
forces." Everything said so far could be carried through
in Newton's theory of gravity without affecting any of
the observationally testable predictions of the theory.
We have just repackaged an old theory in an unfamiliar
wrapping.
There is one complication in this repackaging of Newtonian
gravitation theory. Free falls are not definable as curves in
spacetime of greatest time elapsed, as they are in special
relativity and will be in general relativity. So a more
complicated construction is needed to make sense of the
straightness of trajectories of free fall in Newtonian gravitation
theory. Its details need not trouble us here. From this
perspective, however, Newtonian theory turns out to be more
complicated than general relativity!
One special fact about gravity makes it an especially apt
redescription. There is a uniqueness in free fall trajectories that is
peculiar to gravity. If we drop a one pound ball in the tube, it will
take 42 minutes to pass to the other side of the earth. The same is
true of a two pound ball; or a three pound ball; or a ball of any
mass. They all take 42 minutes to pass to the other side of the
earth. While they do it, they follow exactly the same trajectory. So
if we release a one, two and three pound ball at the same moment,
they will remain together as they traverse the hole to the other
side. This is the uniqueness of free fall.
It is just the latest version of the
result Galileo made famous when
he wrote of dropping objects of
different mass from a tower and
noting that they would fall alike.
In Newtonian theory, the result is given more complicated
expression. The quantity that measures how a gravitational force
will act on a body in some gravitational field is its gravitational
mass. The quantity that measures how much a given body will
accelerate when acted on by a force is the body's inertial mass. It
is an unexplained coincidence in Newtonian theory that these two
masses are equal. The result is the uniqueness of free fall. A two
pound mass feels twice the gravitational force felt by a one
pound mass in the same gravitational field, since it has twice the
gravitational mass. But the two pound mass is still accelerated by
the same amount, since it has twice the inertial mass and so
resists acceleration twice as much.
Notice that if electric forces were pulling the balls through the
tube, this uniqueness of fall would fail. There is no coupling of
inertial mass and electric charge. So if we drop one body which
carries twice the charge of a second, there is no assurance that
the inertial mass is also doubled; and so no assurance that the two
will fall alike.
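The contrast between gravity and electricity can be made concrete with a toy calculation. This is only a sketch in arbitrary units: the function names are hypothetical, and the field strengths, masses and charges are made-up illustrative values.

```python
def gravitational_acceleration(inertial_mass, gravitational_mass, field_strength):
    """Acceleration = force / inertial mass, with force = gravitational mass x field."""
    force = gravitational_mass * field_strength
    return force / inertial_mass

def electric_acceleration(inertial_mass, charge, field_strength):
    """Acceleration = force / inertial mass, with force = charge x field."""
    force = charge * field_strength
    return force / inertial_mass

GRAVITY_FIELD = 9.8    # arbitrary illustrative field strength
ELECTRIC_FIELD = 9.8   # arbitrary illustrative field strength

# Gravity: gravitational mass equals inertial mass, so the mass cancels.
gravity_results = [gravitational_acceleration(m, m, GRAVITY_FIELD)
                   for m in (1.0, 2.0, 3.0)]

# Electricity: doubling the charge need not double the inertial mass.
single_charge = electric_acceleration(inertial_mass=1.0, charge=1.0,
                                      field_strength=ELECTRIC_FIELD)
double_charge = electric_acceleration(inertial_mass=1.0, charge=2.0,
                                      field_strength=ELECTRIC_FIELD)
```

All three entries of `gravity_results` agree, while `double_charge` is twice `single_charge`: the charged bodies do not fall alike.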
This remarkable result of the uniqueness of free fall is what makes
the reinterpretation very comfortable. We can think of the
spacetime sheet as having a natural spacetime geometry
revealed to us by masses. That geometry is largely independent of
the masses. For all masses, big and small, reveal the same
trajectories. The masses are more like probes exploring an
independently existing structure.
Finally, Einstein's reinterpretation eradicates an awkwardness of
Newtonian theory. That theory had to posit that increases in
gravitational mass in bodies are perfectly and exactly
compensated by corresponding increases in inertial mass, so that
the uniqueness of free fall can be preserved. Einstein's
redescription does away with that coincidence and even the very
idea of distinct inertial and gravitational masses. In his theory,
bodies now just have mass, or, in the light of special relativity,
mass-energy. For Einstein the primitive notion is the geometrical
structure of spacetime with the curved trajectories traced out by all
freely falling bodies, independently of their mass.
Gravity Above the Surface of the
Earth
So far, we have dealt with an especially simple case in which the
curvature of the spacetime sheet is everywhere the same. More
generally curvature in spacetime will vary from event to event
and, even at one event, it will be different according to the
particular space-time sheet considered. This is simply a
re-expression in curvature language of the more familiar fact that
gravitation varies from place to place and acts differently in
different directions.
We can explore this variability by considering masses falling under
the action of gravity above the surface of the earth. (As before,
we will ignore the rotation of the earth.)
Masses Distributed
Vertically
To begin, consider masses 100 miles above the surface, stacked
up 1 mile apart as shown and initially at rest with respect to earth.
Now let them fall freely. The masses closer to the earth will feel a
slightly stronger pull of gravity, so they will fall slightly faster. It
is easy to compute the effect. In the course of 18.3 seconds, the
masses will fall roughly one mile. A mass that is one mile closer,
however, will fall 1.6 feet more than a mass starting one mile
higher.
If we plot these motions through time on a spacetime diagram, we
recover a familiar figure.
This is just a spacetime sheet showing diverging trajectories; that
is, this particular spacetime sheet has negative curvature.
Masses Distributed
Horizontally
We will get a different outcome if we consider masses aligned
horizontally. As before they are 100 miles above the surface of the
earth and spaced one mile apart, but now they are distributed
horizontally.
In this case, each mass will feel the same gravitational force.
However those forces will pull in a slightly different direction for
each mass. The forces are all directed towards the center of the
earth. If the masses start from rest and go into free fall, after 18.3
seconds they will have fallen one mile. Since they are being pulled
by forces that converge to one point, the center of the earth, the
masses will have converged slightly in the course of falling. It turns
out that each mass will be 0.8 feet closer to its neighboring
masses as a result of the motion.
These motions can be plotted on a spacetime diagram, from which
we recover a familiar figure.
This is just a space-time sheet showing converging trajectories;
that is, this particular spacetime sheet has positive curvature.
To sum up, we can identify three spacetime sheets passing
through an event 100 miles above the surface of the earth in which
free fall motions are plotted. In the sheet spatially oriented in the
up-down direction, we find a negative curvature. The remaining
two sheets are spatially oriented east-west and north-south. In
each of those we found a positive curvature.
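The pattern of one diverging and two converging directions can be reproduced with a rough Newtonian estimate. The sketch below assumes the small-separation (linear) approximation to the inverse square law and a mean earth radius of 3,959 miles; both are assumptions supplied here, and the function names are hypothetical. The absolute sizes scale inversely with the distance from the earth's center, so they depend on the radius assumed and need not reproduce the figures quoted in the text. The two-to-one ratio and the opposite signs do not depend on it.

```python
FEET_PER_MILE = 5280.0
EARTH_RADIUS_MILES = 3959.0   # assumed mean radius of the earth
ALTITUDE_MILES = 100.0        # height of the masses above the surface

def vertical_divergence_ft(separation_mi, fall_mi, distance_from_center_mi):
    """Extra distance fallen by the lower of two vertically stacked masses.
    Gravity varies as 1/r^2, so the fractional difference in pull is 2 * dr / r."""
    return 2.0 * separation_mi * fall_mi / distance_from_center_mi * FEET_PER_MILE

def horizontal_convergence_ft(separation_mi, fall_mi, distance_from_center_mi):
    """Approach of two side-by-side masses that both fall toward the center.
    Their separation shrinks in proportion to their distance from the center."""
    return separation_mi * fall_mi / distance_from_center_mi * FEET_PER_MILE

r = EARTH_RADIUS_MILES + ALTITUDE_MILES
up_down = -vertical_divergence_ft(1.0, 1.0, r)        # divergence counted as negative
east_west = horizontal_convergence_ft(1.0, 1.0, r)    # convergence counted as positive
north_south = horizontal_convergence_ft(1.0, 1.0, r)
total = up_down + east_west + north_south
```

Whatever radius is used, `total` comes out as zero: the divergence in the up-down sheet exactly offsets the convergence in the other two.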
Summed Curvature and Matter
Density
By considering masses in free fall within a tube bored through the
earth, we saw a connection between the curvature of the space-time
sheet and the matter density. Since we know that matter
produces gravity and that gravity is now to be represented by a
curvature of spacetime, you might suppose that this is a general
relation. Could the spacetime curvature just be proportional to
matter density everywhere? That clearly doesn't work. Above the
surface of the earth, there is no matter density, but there certainly
is gravity and, as we have just seen, curvature of the spacetime
sheets as well.
So we need a weaker relationship between curvature of the
spacetime sheets and matter density. That relationship turns out
to be easy to see if we just tabulate the cases we've seen so far.
The curvature revealing deviations in the space-time sheets, as
mapped out by masses in free fall, will be measured by the
convergence or divergence of masses one mile apart when they
fall one mile.
Convergence (+) or divergence (-) of bodies one mile apart in
free fall for one mile.

               Inside the earth    Above the earth
Up-down        +0.8 ft             -1.6 ft
East-west      +0.8 ft             +0.8 ft
North-south    +0.8 ft             +0.8 ft
Total          +2.4 ft              0
The table suggests the correct result. For the case of spacetime
sheets above the earth where there is no matter density, the
curvature revealing deviations in each is nonzero. But their sum is
zero. We define the summed curvature to be the sum of the
curvatures of the spacetime sheets for the three different spatial
directions. Then we can write the connection between curvature
and matter density as:
Summed curvature of spacetime sheets   is proportional to   matter density
This relation amounts to a natural relaxing of the too stringent
condition that curvature must be proportional to matter density. For
with the relaxed condition, we can still have curvature in the
individual space-time sheets at events where the matter density
vanishes. For example, we can have negative curvature in the
sheets aligned with the up-down spatial direction. But to keep the
summed curvature zero, we must have positive curvature in
sheets aligned in the other directions.
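In Newtonian terms this relaxed relation is a familiar equation. As a sketch, write Φ for the Newtonian gravitational potential and ρ for the mass density (symbols introduced here, not used in the text). The relative acceleration of neighboring falling bodies in each direction is governed by the second derivative of Φ in that direction, and the sum of the three is tied to the density:

```latex
% Sum over the three spatial directions (Poisson's equation)
\frac{\partial^{2}\Phi}{\partial x^{2}}
+ \frac{\partial^{2}\Phi}{\partial y^{2}}
+ \frac{\partial^{2}\Phi}{\partial z^{2}}
= 4\pi G \rho

% Outside a spherical mass M, with \Phi = -GM/r, at a point on the vertical z-axis:
\frac{\partial^{2}\Phi}{\partial z^{2}} = -\frac{2GM}{r^{3}},
\qquad
\frac{\partial^{2}\Phi}{\partial x^{2}}
= \frac{\partial^{2}\Phi}{\partial y^{2}}
= +\frac{GM}{r^{3}}
```

Above the earth the density is zero, so the three terms must cancel: one vertical term of one sign and twice the size, two horizontal terms of the other sign. That is the same pattern as the "Above the earth" column of the table.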
From Curvature in Space-Time
Sheets to Space-Space Sheets
All our considerations so far apply equally to Newton's theory of
gravity (with the notion of free fall trajectory given a suitable geometric
reinterpretation) as to Einstein's new theory. They have dealt only with
curvature of spacetime sheets, that is, in two dimensional
surfaces in spacetime that are spacelike in one direction and
timelike in another.
As the figure shows, the spacetime will also have space-space
sheets. These are just the ordinary two dimensional slices of
three dimensional space. Within the context of Newton's theory,
reinterpreted as a theory of spacetime curvature, curvature does
not extend to them.
That they should be treated differently makes sense in the
Newtonian context. For there space and time are treated very
differently.
It is quite another matter when we move to relativity theory. The
core innovation of Einstein's special theory of relativity was a
mixing together of space and time, manifested most vividly as the
relativity of simultaneity. In a Minkowski spacetime, there are many
ways to slice up the spacetime into spaces that persist through
time. So, in the relativistic context, it is no longer so natural to
have one rule for space-time sheets and another for space-space
sheets.
Where Einstein's general theory of relativity deviates sharply from
Newton's is that Einstein requires the curvature associated with
gravity to extend from space-time sheets to space-space sheets as
well; and for all to be governed by the same relationship. This
difference can be summarized as:
Newton's theory of gravitation rendered as a curved spacetime theory:
  Summed curvature of space-time sheets is proportional to matter density.
  No curvature in purely spatial space-space sheets.

Einstein's general theory of relativity:
  Summed curvature of all sheets of spacetime, space-time and space-space, is proportional to matter density.
This table summarizes the core ideas but avoids a lot of very
messy technical and mathematical issues. Let us just consider
Einstein's theory.
Representing Matter Density
What ought to represent matter density? In Newtonian
mechanics, that would just be mass density. We learned from
special relativity that mass is not such a simple quantity. The real
concept is mass-energy and what complicates things further is that
the amount of mass-energy a body has will vary with the frame of
reference. These are just the beginning of a series of
complications.
It turns out that an adequate representation of the matter density
at an event in spacetime requires a catalog of a lot of information:
energy density, momentum densities, energy fluxes and all the
various forms of stress that may also be present. The synopsis of
all this information in a 4x4 table is known as the stress-energy
tensor. The quantities just listed are usually represented by a
capital T and two subscript numbers. That is, they are 16 numbers:
$T_{00}$, $T_{01}$, $T_{02}$, ..., $T_{33}$. Laid out in a table, they look like this:
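$$
\begin{pmatrix}
T_{00} & T_{01} & T_{02} & T_{03} \\
T_{10} & T_{11} & T_{12} & T_{13} \\
T_{20} & T_{21} & T_{22} & T_{23} \\
T_{30} & T_{31} & T_{32} & T_{33}
\end{pmatrix}
=
\begin{pmatrix}
\text{mass-energy density} & \text{energy flux = momentum density} & \text{energy flux = momentum density} & \text{energy flux = momentum density} \\
\text{energy flux = momentum density} & \text{normal pressure} & \text{shear stress} & \text{shear stress} \\
\text{energy flux = momentum density} & \text{shear stress} & \text{normal pressure} & \text{shear stress} \\
\text{energy flux = momentum density} & \text{shear stress} & \text{shear stress} & \text{normal pressure}
\end{pmatrix}
$$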
I've included a decoding of what each of $T_{00}$, $T_{01}$, ... means, but
you should not worry too much about these details. It would take a
much longer exposition than that given here to make sense of it all. All
you need to see is that the quantity known as the stress-energy
tensor is really a bag that holds a lot of information about what is
present in spacetime at some point: its energy density, its
momentum densities, pressures, stresses and so on.
Representing Summed
Curvature
There is a similar problem in determining precisely which quantity
should represent the summed curvature. There is a single
4x4x4x4 table, known as the Riemann curvature tensor, that
represents all the curvature information pertaining to the different
sheets in spacetime. The entries in the table are represented by
the numbers $R_{0000}$, $R_{0001}$, $R_{0002}$, ..., $R_{3333}$. Somehow we need
to extract an appropriate sum of curvature quantities from it. There
are several ways to do this.
Deciding which was the right way proved to be a special stumbling
block for Einstein. The final answer, however, became so strongly
connected with Einstein's theory that it is now named after him. It
is called the Einstein tensor. As with the stress-energy tensor,
the Einstein tensor can be written down as a 4x4 table of numbers
computed from the numbers in the 4x4x4x4 table of the Riemann
curvature tensor.
The 16 numbers that form this table are written as $G_{00}$, $G_{01}$, $G_{02}$,
..., $G_{33}$. Laid out as a table, they look like

$$
\begin{pmatrix}
G_{00} & G_{01} & G_{02} & G_{03} \\
G_{10} & G_{11} & G_{12} & G_{13} \\
G_{20} & G_{21} & G_{22} & G_{23} \\
G_{30} & G_{31} & G_{32} & G_{33}
\end{pmatrix}
$$
Einstein's Gravitational Field
Equations
The precise mathematical expression of the connection between
summed spacetime curvature and matter density just sets the two
tables equal to each other. It is done term by term in the tables:
$G_{00} = T_{00}$, $G_{01} = T_{01}$, ..., $G_{33} = T_{33}$. The resulting set of
equations is one of the most famous and most important in physics
and is known as:
Einstein's gravitational field equations
EINSTEIN TENSOR equals STRESS-ENERGY TENSOR
These are the core equations of Einstein's theory and the crowning
glory of Einstein's discovery. In one set of equations they
embrace the entirety of gravitational phenomena as well as the
geometry of space.
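For readers who meet these equations elsewhere, it may help to see how they are usually written. In the standard form (without the cosmological term), and with conventional units, a constant of proportionality appears; the symbol $G_N$ for Newton's gravitational constant and $c$ for the speed of light are notations supplied here. With units chosen so that this constant equals one, the two tables are simply set equal, as in the text.

```latex
% Einstein's gravitational field equations in conventional units
G_{\mu\nu} = \frac{8\pi G_{N}}{c^{4}}\, T_{\mu\nu},
\qquad \mu, \nu = 0, 1, 2, 3

% With units chosen so that 8\pi G_N / c^4 = 1:
G_{\mu\nu} = T_{\mu\nu}
```

Each choice of the two indices gives one of the sixteen term by term equalities.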
These equations decide which spacetimes (that is, which
universes) are admissible according to Einstein's theory. The
admissible ones will be those spacetimes in which the spacetime
curvature and matter density are related appropriately.
While this is easy to say, the mathematical difficulty of finding a
spacetime that satisfies Einstein's equations is immense. Success
is so hard that we usually celebrate it by naming the spacetime
after the person who first shows that it satisfies the gravitational
field equations.
It turns out that there is just one example that is simple enough for us to
see without any hard calculations. Consider a flat Minkowski spacetime of
special relativity and imagine that it is completely empty of all matter. Since it
is flat, its curvature is zero at every event; therefore its summed curvature at
every event is zero; therefore the Einstein tensor is zero. Since it is empty of
all matter, its stress-energy tensor is also zero everywhere. Combining, we
see that Einstein's gravitational field equations are satisfied: the Einstein
tensor equals the stress-energy tensor since both are zero.
"The Einstein tensor is
zero." "The stress-energy
tensor is zero." What can
that mean when neither is a
number? Each is a 4x4
table of numbers. It just
means that each number in
the 4x4 table is zero. That
is, it means just what you
thought it means.
Note that we cannot turn this around. If we have a spacetime
in which the stress-energy tensor is zero, so that the Einstein
tensor is zero, it does not now follow that the curvature is also
zero. We have already seen that one can have a nonzero
curvature that yields a zero summed curvature.
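Both points can be illustrated with a toy bookkeeping exercise. This is only a sketch: it treats the tensors as bare 4x4 tables of numbers, as in the text, and it borrows the earlier convergence and divergence figures for the sheets above the earth as stand-ins for the individual curvatures. The variable and function names are hypothetical.

```python
def zero_table():
    """A 4x4 table in which every entry is zero."""
    return [[0.0] * 4 for _ in range(4)]

def tables_equal(g, t, tol=1e-12):
    """True if the two 4x4 tables agree entry by entry."""
    return all(abs(g[i][j] - t[i][j]) <= tol
               for i in range(4) for j in range(4))

# Empty Minkowski spacetime: no curvature at all and no matter at all.
einstein_tensor = zero_table()
stress_energy_tensor = zero_table()
field_equations_hold = tables_equal(einstein_tensor, stress_energy_tensor)

# The converse fails: above the earth the individual sheets are curved
# (figures in feet, taken from the earlier table), yet their sum is zero.
sheet_deviations_ft = {"up-down": -1.6, "east-west": 0.8, "north-south": 0.8}
summed_deviation = sum(sheet_deviations_ft.values())
some_sheet_is_curved = any(v != 0 for v in sheet_deviations_ft.values())
```

The first check shows "zero equals zero" entry by entry; the second shows a zero sum built from nonzero parts.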
You have now seen the basic suppositions of Einstein's theory.
Our situation is rather like what happens after you have read the
first few pages of Euclid's Elements of Geometry. Once you know
the five postulates, there is a sense in which you know the whole
geometry: you have enough information to infer all the theorems
by simple logic. Of course the project of finding all those theorems
is enormous. It is the same with general relativity. The basic
ideas of the theory have been given to you, but finding out
what the theory really says is possible is an enormous and difficult
project. Only a small part of it will occupy many more chapters. We
will be building entire worlds.
What You Should Know
The difference between special and general relativity.
How geodesic deviation reveals curvature.
How free fall motion in a gravitational field can be reinterpreted as a curvature of a spacetime sheet.
The difference between the curvature of a space-time sheet and a space-space sheet and how each reveals itself
to us.
Einstein's gravitational field equations: The connection between summed curvature and matter density.
Copyright John D. Norton. February 2001; January 2, 2007, February 15, August 23, October 16, 27, 2008; February 5, 2010.
Gravity near a massive body
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/general_relativity_massive/index.html[28/04/2010 08:22:16 ﺹ]
HPS 0410 Einstein for Everyone
Gravity Near a Massive Body
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
The Geometry of Space
Causal Structure
The Three Tests
Mercury
Light bending.
Red Shift
What You Should Know
In the last chapter, we learned the barest elements of
Einstein's general theory of relativity. We now need to
understand what those elements entail for gravity. The
first place to start is the most familiar, the gravitational
effects arising near a massive object like our earth or
sun. These were the first applications of Einstein's
new theory.
The Geometry of Space
Einstein's theory allows that the geometry of space
can become curved as well in the vicinity of very
massive objects. That is true for the space we know that
is close to both the great masses of the earth and sun.
However the deviation from flatness in these spaces is
so slight that no ordinary measurement can detect it.
For this reason, we believed for millennia that our space
is exactly Euclidean, whereas it is only very nearly so.
The deviation of spatial geometry from the Euclidean
becomes more noticeable once we consider very
intense gravitational fields or the enormous distances of
cosmology.
To get a sense of just how close our local geometry is to
Euclidean, let us estimate the disturbance to it due to
the presence of the sun. Consider a huge circle
around the sun that roughly coincides with our earth's
orbit. Euclidean geometry tells us that the circumference
of this circle is 2π x radius of the orbit.
Imagine that we now approach the sun one mile at a
time and draw a new circle centered on the sun at each
step. The Euclidean result tells us that for each mile
we come closer to the sun, the circumference of the
circle is diminished by 2π miles.
That is the Euclidean result. Because of the presence of
the sun, space around the sun is not exactly Euclidean.
According to general relativity, for each mile that we
come closer to the sun, the circle does not lose 2π
miles in circumference; it loses only (0.99999999)x2π
miles.
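The size of this factor can be estimated with a short calculation. The sketch below assumes the standard Schwarzschild description of the space outside a spherical mass, which the text does not name, and uses round textbook values for the constants; the function name is hypothetical. In that description, a circle's circumference shrinks by 2π times the square root of (1 - r_s/r) for each unit of distance measured inward, where r_s = 2GM/c² is a characteristic length fixed by the mass.

```python
import math

G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2, Newton's constant (assumed value)
C_LIGHT = 2.998e8      # m/s, speed of light (assumed value)
M_SUN = 1.989e30       # kg, mass of the sun (assumed value)
R_ORBIT = 1.496e11     # m, radius of the earth's orbit (assumed value)

def circumference_loss_factor(mass_kg, radius_m):
    """Ratio of the actual loss in circumference per unit inward distance
    to the Euclidean value 2*pi, in the Schwarzschild geometry."""
    r_s = 2.0 * G_NEWTON * mass_kg / C_LIGHT ** 2   # about 3 km for the sun
    return math.sqrt(1.0 - r_s / radius_m)

factor = circumference_loss_factor(M_SUN, R_ORBIT)

if __name__ == "__main__":
    print(f"loss per mile = {factor:.10f} x 2*pi miles")
```

Under these assumptions the factor differs from 1 by about one part in a hundred million, in line with the figure quoted above.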
If we tried to build a model out of paper or plastic that
had this property, it could not lie flat in the Euclidean
space of our model builder's room. Instead as we added
the portions of the surface that lie closer to the sun,
those portions would pop out of the surface. That
popping out is a kind of embedding diagram and one of
the most frequently built models in the context of
general relativity.
The model captures an important geometrical fact about
the space around our sun, namely that it is no longer exactly
Euclidean. However it is misleading in two ways.
First, since it is an embedding diagram, we should not
be misled into assigning any physical reality to the
higher dimensioned space in which the surface is
modeled. It is introduced solely for our ease of
visualization. In fact the diagram is a step backwards in
that it is a return to the old way of visualizing curvature as
a bending of a surface into a higher dimensioned space.
While it might be a useful aid to visualization, it is
factually false. There is, as far as we know, no higher
dimensioned space into which the surface bends.
Second, a common way of encapsulating
Einstein's theory is to roll marbles across the
model and suggest that gravitational
attraction somehow comes from the resulting
deflection of the marble's roll. From the
discussion above, you can see why that is
misleading. The gravitational deflection of
ordinary objects falling in the vicinity of the
sun is due to the curvature of the space-time
sheets. What the model shows is the
curvature of the space-space sheets and that
curvature is so small as to have negligible
effects on the motions of ordinary objects.
The model is often described as a rubber
membrane model, and the picture is of a massive
object sitting on a rubber membrane and distorting
it. Just about the only thing right in the
model is that the surface of the membrane is
similar to the surface of the embedding diagram.
Almost everything else is misleading and has to
be imagined away. There is no gravity outside the
membrane, for example, pulling the mass down so
it distorts the membrane. Most importantly, there is
no curvature of the space-time sheets of
spacetime represented, even though that curvature
is responsible for the familiar gravitational effects.
Causal Structure
One of the consequences of Einstein's theory will have
special importance to us. Gravity is a curvature of
spacetime that affects all free fall motions. Light
propagating is one of those motions. So just as
massive bodies like planets and comets are deflected
toward the sun, so also is light.
One of the characteristics of a Minkowski spacetime and
the more general spacetimes of Einstein's theory
is that they have a light cone structure that is
usually taken to map out the fastest trajectories for
causal interactions. Since gravity affects light, it will also
affect this causal structure. The effect of gravitation is to
tip the light cones in the direction of the gravitational
attraction.
This can have some very interesting consequences,
such as new regions of spacetime causally isolated from
our region. These arise in the theory of black holes and
we will see more of them later.
The Three Tests
Shortly after Einstein completed his theory, he
announced three empirical tests that he believed
established the theory. Two had yet to be done. They
were:
Mercury
According to Newton's theory, planets orbit the sun
along elliptical paths. Here's a picture of the orbital
motion according to Newton's theory; and an animation:
Einstein's theory predicted the same, but added that the
axis of the ellipses of the planetary orbits would
advance very slightly. That means the axis would
rotate slowly in the same direction as the planet's
motion. In Mercury's case, the advance would be about
43 seconds of arc per century. This amount of advance
is really very small. To see this, note that there are 60
minutes in one degree and 60 seconds in one minute.
So 43 seconds of arc is very much less than a single
degree. It would be impossible to use a sharp pencil and
a big sheet of paper to draw two intersecting straight
lines that intersect at 43 seconds of arc. They would be
so close that they would appear like one line. Yet this is
the extra advance Einstein's theory predicts over the
time of 100 years.
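To get a feel for these numbers, the sketch below converts 43 seconds of arc to degrees and then estimates the advance from the standard general relativistic formula, 6πGM/(c^2 a(1 - e^2)) radians per orbit. The formula, the orbital elements and the variable names are assumptions supplied for illustration; they are not given in the chapter.

```python
import math

# Assumed reference values (not given in the chapter).
GM_SUN = 1.32712440018e20    # gravitational parameter of the sun, m^3/s^2
C = 299_792_458.0            # speed of light, m/s
A_MERCURY = 5.7909e10        # semi-major axis of Mercury's orbit, m
E_MERCURY = 0.2056           # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969         # Mercury's orbital period, days
DAYS_PER_CENTURY = 36525.0
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

# 60 minutes per degree and 60 seconds per minute: 3600 seconds per degree.
degrees_in_43_arcsec = 43.0 / 3600.0

# Advance of the axis of the ellipse in one orbit, in radians.
advance_per_orbit = 6.0 * math.pi * GM_SUN / (
    C**2 * A_MERCURY * (1.0 - E_MERCURY**2)
)
orbits_per_century = DAYS_PER_CENTURY / PERIOD_DAYS
advance_per_century = advance_per_orbit * orbits_per_century * ARCSEC_PER_RADIAN

print(f"43 seconds of arc = {degrees_in_43_arcsec:.4f} degrees")
print(f"advance per century = {advance_per_century:.1f} seconds of arc")
```

With these inputs the estimate lands close to 43 seconds of arc per century, or roughly a tenth of a second of arc in each orbit.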
Here is a picture of this advance, with the size of the
advance greatly exaggerated, and an animation:
That so-called "anomalous" advance had already been
observed but no final explanation had been agreed on
for it. When Einstein discovered that his theory
predicted this elusive 43 seconds of arc, it might well
have been the greatest scientific moment of his life. He
recalled having heart palpitations, being unable to sleep
and a sense that something inside snapped.
Of course the matter was more complicated than the
above gloss suggests. Even in Newtonian theory, the
ellipse of Mercury's orbit was expected to move by over
400 seconds per century due to the perturbations of the
other planets. That means that the gravitational
attraction of the other planets pulls Mercury off the
simple elliptical orbit computed in their absence. Adding
in the effects of these perturbations, Newtonian theory
could account for all but about 40 seconds of the motion
of the axis of Mercury's orbit. Until Einstein was able to
explain it exactly with his general theory of relativity in
late 1915, this small discrepancy did not seem to be
very worrisome. It was only afterwards that explaining it
became a sine qua non for any new gravitation theory.
Here's a contemporary account from Simon
Newcomb's authoritative The Elements of the Four
Inner Planets and the Fundamental Constants of
Astronomy: Supplement to the American Ephemeris and
Nautical Almanac for 1897. Washington: Government
Printing Office, 1895, p. 184.
The motion of the perihelion to be actually used in the tables
is equal to the motion of the node from the mean equinox, plus
the increase of the arc of the orbit between the node and
perihelion. The adopted value of this quantity is found by
increasing the motion of π_1 by the following quantities:
1. The change due to the motion of the plane of the orbit,
2. The change due to the motion of the ecliptic,
The formulae for these two quantities are
(1), (2): [formulae not legible in this copy]
3. The excess of motion shown by observations in the case
of Mercury and Mars, and computed for all four planets as if
they gravitated toward the Sun with a force proportional to
r^(-n), where
n = 2.000 000 16120
The values of this correction are
Mercury: D_t π = 43.37
Venus: 16.98
Earth: 10.45
Mars: 5.55
4. The general precession.
5. In the case of the Earth, the motion arising from the
action of the Moon, of which the amount is
D_t π" = 7".68
But the first two corrections drop out in this case.
The preceding transformations of the secular variations are
made with the original values of the elements θ and i, as given
in Astronomical Papers, Vol. V, Part IV, pp. 337, 338.

Note that Newcomb allows that the anomalous motion
of Mercury could be accommodated if Newton's law of
gravitation was not exactly an inverse square law.
That is, he considers the possibility that the force of
gravity does not dilute in inverse proportion with
(distance)^2 but with (distance)^2.00000016120. We might
wonder if this is an admission that no hypothesis within
the existing system is expected to accommodate the
anomaly so that an alteration of fundamental law has to
be contemplated. Or, more likely, is it just a working
astronomer noting the simplest way to develop a rule
that will allow prediction of planetary motion?
Light bending.
According to Einstein's theory, light, just like any other
form of matter, is affected by gravity. That is, light also
"falls" in a gravitational field. Just as a comet's
trajectory is deflected by the sun when it passes
nearby, a ray of starlight grazing the sun would also be
deflected. Einstein computed that the deflection would
be about 1.75 seconds of arc. The deflection had two
components. Half of the deflection is due to the
curvature of space near the sun. The other half arises
merely from the light falling towards the sun. This
deflection was verified by expeditions in 1919 that took
photos of the stars near the sun at the time of a solar
eclipse.
What complicates the measurement is that one gets half
of Einstein's predicted deflection in Newtonian theory.
One merely needs to assume that light is a form of
matter that falls in a gravitational field in Newtonian
theory, just as every other form of matter falls. That is
sufficient to give half the deflection of Einstein's theory.
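The two figures can be compared with a short calculation. The sketch below assumes the standard expression for the general relativistic deflection of a ray grazing the sun, 4GM/(c^2 R), and takes the Newtonian value to be half of it, as described above. The formula, the constants and the variable names are assumptions added for illustration.

```python
import math

# Assumed reference values (not given in the chapter).
GM_SUN = 1.32712440018e20   # gravitational parameter of the sun, m^3/s^2
C = 299_792_458.0           # speed of light, m/s
R_SUN = 6.957e8             # radius of the sun, m
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

# Deflection of a ray of starlight grazing the sun, general relativity.
einstein_deflection = 4.0 * GM_SUN / (C**2 * R_SUN) * ARCSEC_PER_RADIAN

# Light treated as matter falling in a Newtonian field gives half of that.
newtonian_deflection = einstein_deflection / 2.0

print(f"Einstein:  {einstein_deflection:.2f} seconds of arc")
print(f"Newtonian: {newtonian_deflection:.2f} seconds of arc")
```

The eclipse measurements therefore had to distinguish a deflection of about 1.75 seconds of arc from one of about 0.88 seconds of arc.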
One of Eddington's eclipse photos
From the New York Times,
November 10, 1919.
Full article.
A minor variation on this effect arises if the deflecting
body is massive enough to bring together the light that
passes on either side of it from a luminous body behind
it. Then the deflecting body acts as a kind of lens, focusing
the light. In the figure, the observer would see two
images of the same object. In the case of perfect
alignment, the observer would see a ring of duplicated
images. This effect, known as "gravitational lensing,"
has only recently been observed. While Einstein did not
discuss the effect in his publications, it turns out that he
had computed it in a private notebook in 1913.
Here's a spectacular image of gravitational lensing:
Downloaded from
http://hubblesite.org/newscenter/archive/releases/1995/14/image/a/format/web_print/
February 15, 2007.
Red Shift
According to Einstein's theory, informally speaking, time
runs slower closer to massive bodies. That means that
natural clocks in the sun run slower than the same
clocks on earth. Of course there are no ordinary clocks
in the sun. But there is something much better. Excited
atoms emit light in very specific frequencies and our
measuring the frequency of that light is akin to our
measuring the frequency of ticking of a clock. Any
slowing of those atomic clocks would result in a change
in the frequency of light emitted from the sun.
Einstein's theory predicts a very small degree of slowing
of clocks in the sun. It manifests in the light from the sun
being slightly reddened for observers watching from
far afield on the earth. The red shift for light from the
sun is merely about 0.0002%, which proved extremely
difficult to detect. The effect was found later in the light
from stars far more massive than the sun. The figure
shows light climbing out of the stronger gravitational
field of the sun towards the earth.
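The size of the effect can be estimated with the weak-field approximation, in which the fractional shift in frequency for light leaving the sun's surface is GM/(Rc^2). This is a sketch under stated assumptions: the formula, the constants and the names are supplied here for illustration, and the much smaller contributions of the earth's own field and motion are ignored.

```python
# Assumed reference values (not given in the chapter).
GM_SUN = 1.32712440018e20   # gravitational parameter of the sun, m^3/s^2
C = 299_792_458.0           # speed of light, m/s
R_SUN = 6.957e8             # radius of the sun, m

# Fractional red shift of light climbing from the sun's surface to a
# distant observer (weak-field approximation).
fractional_shift = GM_SUN / (R_SUN * C**2)

print(f"fractional red shift: {fractional_shift:.2e}")
print(f"as a percentage:      {fractional_shift * 100:.5f}%")
```

The shift comes out near two parts in a million, small enough to be masked by other effects on the sun's surface.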
What You Should Know
The difference between the curvature of a space-time sheet and a space-space sheet and how
each reveals itself to us.
What the rubber sheet-like embedding diagram shows (and does not show).
The three famous tests of general relativity.
Copyright John D. Norton. February 2001; January 2, 2007, February 15, August 23, October 16, 27, 2008; February 5, 2010.
Einstein's Pathway
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/general_relativity_pathway/index.html[28/04/2010 08:22:30 ﺹ]
HPS 0410 Einstein for Everyone
Einstein's Pathway to General Relativity
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
The Starting Point
Adjusting Newton's Theory of Gravitation
"The Happiest Thought of My Life"
The Principle of Equivalence
Relativity of Inertia ("Mach's Principle")
Learning About Gravitation
Gravitational Slowing of Clocks
Gravitational Bending of Light
The Rotating Disk
Assembling the Pieces
What You Should Know
We have followed a simple pathway to the main ideas of the
general theory of relativity. We started with the geometrical
notion of the curvature of space and saw how that
geometrical notion can be extended from space to spacetime.
We then found the resulting theory of curved spacetime not just
to cover a curved geometry of space, but gravitational
phenomena as well.
This pathway to the theory was not Einstein's. His was more
indirect, more inspired, more tortured and more fallible. The
final theory emerged after Einstein struggled for seven years
with many things: strong hunches about what the theory should
say physically, vivid thought experiments to support the
hunches, lengthy explorations into new mathematics, errors
and confusions that thoroughly derailed him and a final insight
that rescued him from exhaustion and desperation.
The seven years of work divides loosely into two phases. The
earlier phase of his work was governed by powerful physical
intuitions that seemed as much rationally as instinctively
based. He felt a compelling need to generalize the principle of
relativity from inertial motion to accelerated motion. He was
transfixed by the ability of acceleration to mimic gravity and by
the idea that inertia is a gravitational effect. As Einstein
struggled to incorporate these ideas into a new physical theory,
he was drawn to use the mathematics of curvature as a means
of formulating the new theory.
As the mathematics of curvature took a more controlling
position in the later phase, his work began to change. The
theorizing was governed increasingly by notions of
mathematical simplicity and naturalness. When the theory was
completed, Einstein's starting point was quite distant. It
remains a matter of controversy today whether Einstein
succeeded in realizing his original ambitions.
It is impractical in this chapter to review all these
considerations. Einstein's intricate mathematical struggles in
the later years cannot easily be described in informal terms.
However some of his earlier physical reflections are so famous
and so characteristic of Einstein that they must be
mentioned. You should treat these as interesting reports on
Einstein's intellectual biography. You may well find it hard to
connect some of the ideas to be laid out with the final theory.
The Starting Point
Einstein's first concrete steps on his pathway to general
relativity came in 1907 when he was commissioned by
Johannes Stark to write a review article on relativity theory
for Stark's journal Jahrbuch der Radioaktivitaet und Elektronik.
The exercise was, apparently, quite straightforward. In his
1905 theory, Einstein had offered a new account of space and
time. Since the theories of physics were all set in space and
time, physicists needed to be assured that these theories could
be maintained; or, if not, shown how they should be adjusted to
fit with Einstein's new theory.
The exercise proceeded well. Electrodynamics actually
needed no adjustment. Einstein's 1905 theory of relativity had
been created to fit with the existing theory. The mechanics of
bodies required adjustments to the notions of energy,
momentum and mass. The most prominent of these was the
famous equivalence E=mc^2. Einstein also sketched out a
relativistic treatment of thermodynamics, the theory of heat and
work.
Then came gravity. Newton's celebrated theory of gravitation
presumed instantaneous action at a distance. The sun now
exerts a gravitational force on the earth now with a magnitude
set by Newton's inverse square law. The key part was the
"now." If the sun were to move slightly, the resulting alteration
in the force it exerts on the earth would be felt by us
instantaneously according to Newtonian theory.
That means that Newton's theory depends upon a notion of
absolute simultaneity. A change there is felt here at the
same moment. However Einstein's 1905 theory had banished
absolute simultaneity from physics. Different observers would
judge different pairs of events to be simultaneous. Newton's
theory had to be adjusted to accommodate this new relativity.
Adjusting Newton's Theory
of Gravitation
The change needed was, apparently, straightforward. In the
revised theory, a change in the sun should not be felt here on
earth instantly, but only after a time lag of around 8 1/3
minutes, the approximate time light takes to propagate from the
sun to the earth. Then absolute simultaneity would no longer
be needed in the theory.
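The time lag is simply the distance from the sun to the earth divided by the speed of light. A minimal check, assuming the standard value of the mean earth-sun distance (a figure not given in the chapter):

```python
# Assumed reference values (not given in the chapter).
EARTH_SUN_DISTANCE = 1.495978707e11   # mean earth-sun distance, m
C = 299_792_458.0                     # speed of light, m/s

lag_seconds = EARTH_SUN_DISTANCE / C
lag_minutes = lag_seconds / 60.0

print(f"time lag: {lag_seconds:.0f} seconds, or {lag_minutes:.2f} minutes")
```

The result is about 499 seconds, which is the 8 1/3 minutes quoted above.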
This meant that Newton's theory needed to be adjusted to look
more like electrodynamics. In the latter theory, effects do not
propagate instantly in the electromagnetic field; they propagate
in waves at the speed of light. There were many ways to make
the adjustments Newton's theory needed. All of them produced
very small changes in the predictions of the theory. While one
might not be sure precisely which of the many adjustments was
the right one to pick, there didn't seem to be any major
problem. Rather the issue was a surfeit of good solutions. Or
so believed other leading thinkers of Einstein's time, such as
the great French mathematician, Henri Poincaré, and the
inventor of spacetime, Hermann Minkowski.
Einstein, however, did not see it that way. He examined
gravitation theories, modified to allow for a finite time of
propagation of effects, and found a result that aroused great
suspicions in him. In the modified theories, the distance fallen
by a body varies according to its sideways motion. In the
simplest case, the body would fall a shorter distance if it has
some sideways velocity.
The differences in the distances fallen were very small and
not likely to be detectible in an experiment. Nonetheless they
bothered Einstein. They contradicted the exact correctness of
Galileo's old observation that all bodies fall alike, even though
the differences were far too small to be detectible by the
methods available to Galileo.
Other physicists of the time were aware of this effect, but
discounted it as too small to be of any concern. Einstein did
not. It meant that the way a body fell would depend on the
energy of the body. We can only guess now why that
bothered Einstein so much. It might be that Einstein
imagined that a hot body, consisting of many small atoms in
thermal motion, might fall differently from a cold one according
to these theories.
Einstein was still a clerk in the Bern patent office in 1907. Yet
he came to the extraordinary conclusion that an adequate
theory of gravitation could not be devised within the
confines of his existing theory of relativity.
"The Happiest Thought of My
Life"
It was while pondering this problem that Einstein hit upon what
he later described as "the happiest thought of my life." It began
when he suddenly saw new significance in a commonplace of
Newtonian gravity. A body in free fall in Newtonian gravity does
not feel its own weight. This effect is very familiar to us
now. We have all watched spacewalkers floating weightlessly
outside their capsules. They are in free fall above the earth,
orbiting with their space stations and that free fall cancels their
weight.
This effect came about from an apparently accidental agreement of two
quantities in Newtonian theory: the inertial mass of a body happens to equal its
gravitational mass exactly. Einstein now believed that this equality could be
no accident. He needed to find a gravitation theory in which this equality is a
necessity.
The inertial mass of a body measures its resistance to acceleration when a force is applied to it.
The gravitational mass of a body measures its power to produce a gravitational field.
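Why the equality of the two masses matters for free fall can be shown in a few lines of Newtonian arithmetic. In this sketch the gravitational mass is used in its role of fixing the force a body feels in a given field; the function name and the field strength of 9.81 m/s^2 are hypothetical choices for illustration.

```python
def free_fall_acceleration(inertial_mass: float,
                           gravitational_mass: float,
                           field_strength: float = 9.81) -> float:
    """Newtonian acceleration of a body falling in a gravitational field."""
    force = gravitational_mass * field_strength   # pull of the field
    return force / inertial_mass                  # resistance to that pull


# When the two masses are equal, the mass cancels and all bodies fall alike.
for mass in (0.1, 1.0, 1000.0):
    print(mass, free_fall_acceleration(mass, mass))

# If the two masses could differ, bodies would fall at different rates.
print(free_fall_acceleration(inertial_mass=2.0, gravitational_mass=1.0))
```

In Newtonian theory nothing forces the two arguments to be equal; that they always are is the apparent accident Einstein wanted his new theory to explain.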
The immediate outcome of this reflection was Einstein's
"principle of equivalence." It formed the basis of the concluding
Part V of his 1907 Jahrbuch article. There he suggested that
gravitation required an extension of special relativity based on
the principle of equivalence.
The Principle of
Equivalence
There are very many formulations of the principle of
equivalence in the literature. Most of them pick up directly on
the idea of weightlessness in free fall. They assert that free fall
transforms away a gravitational field in some tiny volume of
space. While this is a common formulation of the principle in
text books, it is troubled. Free fall transforms away gross
effects of gravitation. But, in Einstein's final theory, it does not
transform away the effects of spacetime curvature. In that
sense, free fall does not transform away gravity in the final
theory.
Einstein later complained about this version of the principle,
objecting that one could not in general transform away an
arbitrary gravitational field over an extended region of space.
His original formulation and the one to which he adhered for
his entire life proceeded differently. He turned around the
original idea of free fall eradicating gravitation. Acceleration
can also produce a gravitational field.
For more, see John D. Norton, "What was
Einstein's Principle of Equivalence?"
Studies in History and Philosophy of
Science, 16 (1985), pp. 203-246; reprinted
in D. Howard and J. Stachel (eds.), Einstein
and the History of General Relativity:
Einstein Studies Vol. I, Boston: Birkhäuser,
1989, pp. 5-47.
More specifically, Einstein took the case of special relativity
without gravitation. He now imagined a uniformly accelerated
observer, in relation to whom all free objects would accelerate.
That state of space found by the observer, Einstein asserted in
his principle of equivalence, is a homogeneous
gravitational field. In this case, uniform acceleration and
homogeneous gravitation are equivalent.
Einstein developed the idea in one of his best known thought
experiments. He asked us to imagine a physicist who
awakens in a box. Unknown to the physicist, the box is in a
distant part of the space of special relativity and is being
accelerated uniformly in one direction by the tug of some agent. If
the physicist were to release objects in the box, they would be
left behind by the accelerating box; they would move inertially,
while the box accelerated. This figure shows this for two bodies
of different mass at rest and a third body that has a horizontal
inertial motion.
The physicist inside the box would find that the released
masses accelerate in a direction opposite to the box's
acceleration. The physicist would judge there to be a field
inside the box pulling on all free bodies.
Now comes the key point. All bodies released by the physicist
would fall exactly alike, no matter what their mass or
composition. So the field found by the physicist inside the box
would manifest the signature property of a gravitational field:
it would accelerate all bodies exactly alike.
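The thought experiment can be mimicked with a small calculation. The sketch below uses ordinary low-speed kinematics as an approximation: in the inertial frame each released body keeps its velocity while the floor of the box accelerates towards it. The class, the function and the numerical values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

BOX_ACCELERATION = 9.81   # m/s^2, hypothetical acceleration of the box


@dataclass
class Body:
    name: str
    mass_kg: float
    sideways_velocity: float = 0.0   # m/s, constant inertial motion


def position_in_box(body: Body, release_height: float, t: float) -> tuple:
    """Sideways position and height above the floor, as judged in the box.

    The body's mass never enters: the body moves inertially and only
    the floor accelerates, covering 0.5 * a * t**2 after time t.
    """
    sideways = body.sideways_velocity * t
    floor_travel = 0.5 * BOX_ACCELERATION * t**2
    return sideways, release_height - floor_travel


bodies = [
    Body("light", 0.1),
    Body("heavy", 100.0),
    Body("moving sideways", 1.0, sideways_velocity=2.0),
]
heights = [position_in_box(b, release_height=2.0, t=0.5)[1] for b in bodies]
for body, height in zip(bodies, heights):
    print(f"{body.name}: {height:.5f} m above the floor")
```

All three bodies are at the same height at every moment, whatever their mass or sideways motion, which is just the behavior expected of bodies falling in a homogeneous gravitational field.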
One might be tempted to say that the field inside the box is just
an "inertial field," some sort of fake gravitational field.
Einstein's assertion was otherwise. The field created by
motion in the box just is a full-blown, authentic homogeneous
gravitational field.
Principle of Equivalence
The inertial effects inside a uniformly accelerated box in
gravitation-free space are equivalent to those of a
homogeneous gravitational field; more tersely, uniform
acceleration creates a homogeneous gravitational field.
The equivalence just asserted may seem benign. It seems just
to codify an equivalence in the way bodies fall in two cases. In
fact the assertion is strong, for it asserts that the equivalence
applies to all processes, not just the fall of bodies. That
means that it applies also to all processes involving fields, such
as electric and magnetic fields.
You will see why Einstein found this principle attractive. His
efforts to produce a relativistic theory of gravity had failed since
he could find no theory in which all bodies fell alike, no matter
what their mass or composition. The gravitational field
delivered by the principle of equivalence was assured to have
this property. In particular, the sideways motion of a body
would have no effect on its rate of fall. The field generated in
this thought experiment did not have the defect of the earlier
theories.
Relativity of Inertia ("Mach's
Principle")
What also attracted Einstein in this analysis was that it
promised to remedy a defect he perceived in both Newton's
physics and in special relativity. In both, you will recall, it is just
a brute fact that certain motions are distinguished as inertial.
This, in Einstein's view, was worrisome. It was no better than
the original idea that there is an ether state of absolute rest.
There seemed to Einstein no good reason for why one state
should be the absolute rest state rather than another.
Correspondingly, Einstein saw no good reason for why some
motions should be singled out as inertial and others as
accelerating.
In 1916, Einstein formulated this worry in a thought experiment.
He imagined two fluid bodies in a distant part of space.
These bodies, the reader quickly infers, are like stars or
planets, which form roughly spherical shapes under their own
gravity. Einstein further imagines that there is relative rotation
between the two bodies about the axis that joins them. This
relative rotation is verifiable by observers on each body, who
can trace out the motion of the other body. Each would judge
the other to be rotating.
It can happen in ordinary Newtonian physics that one of these
bodies is not rotating with respect to an inertial frame and the
other one is. In that case, the second rotating body will
bulge. This effect arises on the earth. It rotates about the axis
of its north and south poles. It bulges slightly at the equator as
a result of centrifugal forces that seek to fling the matter of
earth away from this axis.
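How slight the effect is on the earth can be seen by comparing the centrifugal acceleration at the equator with the acceleration of gravity there. The values below are assumed standard figures, not taken from the chapter, and the calculation is only a rough Newtonian estimate.

```python
import math

# Assumed reference values (not given in the chapter).
SIDEREAL_DAY = 86164.0905      # rotation period of the earth, s
EQUATORIAL_RADIUS = 6.378137e6  # m
GRAVITY = 9.80                 # approximate acceleration of gravity, m/s^2

angular_velocity = 2.0 * math.pi / SIDEREAL_DAY
centrifugal_acceleration = angular_velocity**2 * EQUATORIAL_RADIUS
ratio = centrifugal_acceleration / GRAVITY

print(f"centrifugal acceleration: {centrifugal_acceleration:.4f} m/s^2")
print(f"fraction of gravity:      {ratio:.4f}")
```

The centrifugal effect is about a third of one percent of gravity, which fits with the smallness of the earth's equatorial bulge.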
It would be entirely unacceptable, Einstein now asserted, were
this to happen to two spheres in an otherwise empty space. For
there is no difference in the observable relations between the
two spheres. Each rotates with respect to the other. So why
should just one bulge? Newton's absolute space or inertial
systems, Einstein protested, was an inadequate
explanation. Einstein demanded something observable to
make the difference.
Einstein was an avid reader of the physicist-philosopher
Ernst Mach and, in Mach's writings, he had found what
seemed to be a solution to the problem. Mach seemed to
be proposing, Einstein thought, that the privileging of
certain states of motion is due to the distribution of
matter in the universe. Why is our frame of reference
inertial? It is because the stars are at rest in our frame.
Why is my wording so careful here? It is not
clear that what Einstein read in Mach is what
Mach actually said. For more, see John D.
Norton, "Mach's Principle before Einstein," in J.
Barbour and H. Pfister, eds., Mach's Principle:
From Newton's Bucket to Quantum Gravity:
Einstein Studies, Vol. 6. Boston: Birkhäuser,
1995, pp. 9-57.
When we try to accelerate, we feel inertial forces. These are
the forces that make us dizzy when we spin in a fun fair; or
they are the forces that throw our coffee in the air when our
airplane hits an air pocket.
These forces, Einstein understood Mach to assert, arise from
an interaction between the mass of our body (and our coffee)
and all the other masses of the universe, distributed in the
stars. Einstein first called this idea the "relativity of inertia" and
later, in 1918, "Mach's Principle."
In the case of Einstein's two fluid spheres, the bulge of one
of them would now be explained by the fact that this bulging
sphere was rotating with respect to all the other masses of the
universe, whereas the other sphere was not. That would be
the observable difference between the two fluid bodies.
This analysis was clearly inspired by Mach's famous account of
Newton's bucket experiment. Newton had noted that water
in a spinning bucket adopts a concave surface, as a result,
Newton urged, of its rotation with respect to absolute space.
No, Mach had responded some two hundred years later, all one
has in the case of Newton's bucket is rotation with respect to
the stars.
The weakness of this analysis is that there is no account of
how rotation with respect to distant masses could produce
these inertial forces. In 1907, Einstein hoped that his
emerging theory of gravity would provide the mechanism. It
could then satisfy Mach's Principle and, through it, generalize
the principle of relativity to acceleration. For in a theory that
satisfies Mach's Principle, no state of motion is intrinsically
inertial or accelerating. When we see something accelerating, it
is not accelerating absolutely in such a theory; it is merely
accelerating with respect to the stars. Preferred inertial motions
need not enter into the account any more. All motion,
accelerated or inertial, would be relative.
To deliver this sort of account of inertial forces, Einstein's
theory would need to break down the strict division between
inertial and accelerated motion of his special theory of relativity.
The principle of equivalence promised to weaken this
division. According to it, whether the physicist in the box was to
be judged accelerating or not depended on your point of view.
An inertial observer would judge the physicist to be
accelerating uniformly in a gravitation-free space. The physicist
would judge him or herself to be unaccelerated in a
gravitational field. It was a first step towards generalizing the
principle of relativity to acceleration, Einstein believed.
Learning About Gravitation
By his own later judgment, Einstein did not, in the end, find a
theory that fully satisfied Mach's Principle. The immediate
benefit of his new principle of equivalence, however, was that it
let Einstein learn a lot about gravitation. For the principle
delivered to Einstein one special case of a gravitational field
that, he believed, conformed with relativity theory and in which
all bodies truly fell alike. Einstein's program of research on
gravity in the five years following 1907 was simply to examine
the properties of this one special case and to try to generalize
them to recover a full theory. His early hope was that the
generalization of the principle of relativity would somehow
emerge in the course of those investigations.
Gravitational Slowing of Clocks
Two properties of this special case of the gravitational field
were noteworthy. First, Einstein recognized that clocks run at
different rates in the box of his thought experiment according to
their location. A clock placed lower in the created field runs
slower.
Einstein immediately generalized that effect to all gravitational
fields. Clocks deeper in a gravitational field run slower. A clock
in the sun would run slower than one on earth if only we
could have a clock in the sun without it being destroyed by
the heat of the sun. It turns out we can find clocks in the sun.
Radiating atoms radiate in very definite frequencies according
to which element they are. That means that they behave like
little clocks. Their running slower is manifested in a slight
reddening of the light they emit. Einstein computed an effect on
the wavelength of sunlight of about two parts in a million.
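The size of this effect can be checked with a rough estimate. The sketch below assumes the standard weak-field formula for the fractional wavelength shift, GM/(Rc²), and uses modern reference values for the constants; neither the formula nor the constants are taken from Einstein's 1907 paper, and the much smaller gravitational potential at the earth is neglected. The function name is only illustrative.

```python
# Weak-field estimate of the gravitational redshift of sunlight.
# The constants are standard reference values, not figures from the text.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # mass of the sun, kg
R_SUN = 6.957e8    # radius of the sun, m
C = 2.998e8        # speed of light, m/s


def fractional_redshift(mass, radius):
    """Fractional wavelength shift GM/(R c^2) for light leaving a body's surface."""
    return G * mass / (radius * C**2)


shift = fractional_redshift(M_SUN, R_SUN)
print(f"fractional shift in wavelength: {shift:.2e}")
```

The result is roughly two parts in a million, which is why the reddening is so hard to detect in practice.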
While Einstein did not use spacetime diagrams in 1907, they
provide an easy way to see that clocks run at different rates
according to their position when they accelerate in a Minkowski
spacetime. The effect is driven almost entirely by the relativity
of simultaneity.
The spacetime diagram shows two
clocks A and B accelerating together
towards the right in a Minkowski
spacetime. The numbers show the
proper time elapsed along each clock's
worldline and thus the time each clock
reads. The hypersurfaces of simultaneity
are those of the inertial observer on
the left of the figure. According to that
inertial observer, the two clocks run at
roughly the same speed, at least for the initial
portion of their acceleration.
Why don't the two clocks run at exactly the same speed? This is an
artifact of how uniform acceleration arises in a Minkowski spacetime.
Observers on the clocks judge the distance between them to stay the
same. Therefore an inertial observer will judge this distance to contract. As
a result, the inertial observer will judge the two clocks to accelerate at
slightly different rates; the difference will be just enough to give the
length contraction effect. This means that, in the same time, the A clock
will achieve a greater speed than the B clock, according to the inertial
observer's judgments of simultaneity. Hence the inertial observer will judge
the A clock's reading to start to lag slightly behind that of the B clock. This
effect is shown in the figure, which has been drawn carefully to scale.
If you really have to see more details, see uniform acceleration in a
Minkowski spacetime.
Now consider an observer who accelerates with the rightmost
"B" clock, that is, the clock higher up in the created field. As the
clock changes speed, that observer's hypersurfaces of
simultaneity will tilt so that the B observer will judge the A clock
to be lagging successively more behind. When B's clock reads
2, B will judge the A clock to read 1; when B's clock reads 4, B
will judge the A clock to read 2. Overall, B will judge A's clock
to be running at half the B clock's speed entirely because of the
relativity of simultaneity.
The geometry of uniform acceleration in a Minkowski
spacetime turns out to be especially simple. The hypersurfaces
of simultaneity of an observer accelerating with the B clock
turn out to coincide with the hypersurfaces of simultaneity of
an observer accelerating with the A clock. Hence the observer
moving with clock A will agree that the A clock is running
slower and the B clock faster. When the A observer's clock
reads 1, A will judge B's clock to read 2. When the A observer's
clock reads 2, A will judge B's clock to read 4.
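The readings quoted above follow from the simple geometry of uniform acceleration. The sketch below is an illustration under stated assumptions, not a reconstruction of the figure: it uses units in which c = 1, places each clock on a hyperbolic worldline at a distance X from a common pivot event, and picks X = 1 for clock A and X = 2 for clock B so that the two-to-one ratio comes out. The function names are hypothetical.

```python
import math


def event_on_worldline(X, tau):
    """Inertial-frame (t, x) of a uniformly accelerated clock when it reads tau.

    X is the clock's distance from the pivot event (units with c = 1);
    the clock's proper acceleration is 1/X.
    """
    eta = tau / X
    return X * math.sinh(eta), X * math.cosh(eta)


def simultaneous_reading(X_other, X_self, tau_self):
    """Reading of the clock at X_other that the clock at X_self judges simultaneous."""
    eta = tau_self / X_self   # the shared hypersurface is the line t = x * tanh(eta)
    return X_other * eta


X_A, X_B = 1.0, 2.0   # assumed positions; B is higher in the created field

for tau_B in (2.0, 4.0):
    tau_A = simultaneous_reading(X_A, X_B, tau_B)
    t_a, x_a = event_on_worldline(X_A, tau_A)
    t_b, x_b = event_on_worldline(X_B, tau_B)
    # Both events lie on one straight line through the pivot: t/x agrees.
    print(tau_B, tau_A, round(t_a / x_a, 6), round(t_b / x_b, 6))
```

Because the hypersurface of simultaneity is the same line for both clocks, swapping the roles of A and B gives the matching judgment from A's side: when A reads 1, B reads 2.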
Gravitational Bending of Light
The second important effect pertained to light. An
unaccelerated observer finds that light propagates in a
straight line in Minkowski spacetime. Here, for example, is
such a light flash propagating across the box of Einstein's
thought experiment.
For the physicist accelerating with the box, however, the light
will be judged to fall, just like everything else in the box. As a
result, the physicist will find the light's path to be bent
downward by the gravitational field.
Einstein generalized this result to arbitrary gravitational fields.
This generalization enabled him to make one of the most
celebrated predictions of his theory. A ray of starlight grazing
the sun would be bent as the light fell into the sun's
gravitational field. This bending would be manifested as a
displacement of the star's apparent position in the sky and this
displacement would be visible at the time of solar eclipse.
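To get a feel for how small the displacement is, here is a rough numerical sketch. It assumes two standard textbook formulas that are not stated in this chapter: a deflection of 2GM/(Rc²) when light is simply treated as falling, as in the accelerating box, and 4GM/(Rc²) in the final theory of 1915. The constants are modern reference values and the function name is illustrative.

```python
import math

# Standard reference values, not figures from the text.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # mass of the sun, kg
R_SUN = 6.957e8    # radius of the sun, m
C = 2.998e8        # speed of light, m/s


def deflection_arcsec(factor):
    """Deflection factor * GM/(R c^2) for a ray grazing the sun, in seconds of arc."""
    radians = factor * G * M_SUN / (R_SUN * C**2)
    return math.degrees(radians) * 3600


falling_light_only = deflection_arcsec(2)   # light treated as falling in the field
final_theory = deflection_arcsec(4)         # value in general relativity

print(f"falling light only: {falling_light_only:.2f} seconds of arc")
print(f"final theory:       {final_theory:.2f} seconds of arc")
```

Both values are under two seconds of arc, which is why the measurement needs stars seen very close to the darkened sun.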
In 1907, Einstein had predicted the gravitational bending of
light. But he did not realize that it might actually be tested at
the time of a solar eclipse. After his 1907 Jahrbuch article,
Einstein's efforts were redirected towards the puzzle of the
quantum. In 1911, however, he returned to theorize about
gravity. He realized then that his prediction of the gravitational
bending of light could be tested at a solar eclipse. He wrote
another paper developing this idea and also other aspects of
his theory.
Einstein was keen to see this test undertaken. The greatest
difficulty was that it required a solar eclipse and that meant that
astronomers must place themselves precisely in its path. That
need eventually led to astronomers traveling to the Crimea and
to both Southern Africa and South America. In 1913, Einstein
wrote to the American astronomer G. E. Hale asking whether
the test could be undertaken without an eclipse.
Hale responded that it could not. The brightness of the sky
near an uneclipsed sun is just too great.
Gravitational Slowing of Light
In 1907, Einstein had also concluded that the speed of light,
and not just its direction, would also be affected by the
gravitational field. The effect was closely connected with the
gravitational slowing of clocks and is almost entirely a
consequence of the relativity of simultaneity. One can see how
it comes about with a similar set of spacetime diagrams. The
clocks A, A', B and B' all accelerate uniformly in a Minkowski
spacetime and in a way that ensures that the distance from A
to A' remains the same as from B to B'. A light signal propagates
from A to A' and a second light signal propagates from B to B'.
The figure shows the hypersurfaces of simultaneity of an
inertial observer. Of course the inertial observer will judge
the two light signals to propagate at the same speed. That is
just familiar special relativity.
We notice also that, initially, the four clocks A, A', B, B' run in
synchrony according to the judgments of simultaneity of the
inertial observer. Hence using the readings of these clocks
directly, we will infer that the two light signals propagate at the
same speed. In more detail, we note that the distance from A to
A' equals the distance from B to B'; and each light signal takes
the same time to traverse the distance. Both light signals leave
when the local clocks read 0 and arrive when the local clocks
read 3. Hence using these local clock readings, we infer
that the two light signals travel at the same speed.
Now consider how these processes are judged by an observer
who accelerates with the clocks. All that changes in the
analysis that follows is that we use different judgments of
simultaneity. That leads to the judgment of differing speeds
for the propagation of light.
Let us take the observer who accelerates with clock B. That
observer's hypersurfaces of simultaneity will tilt more and more
as clock B gains speed from the acceleration. This was the
effect that led observer B to judge that the A clock was running
slower than the B clock. This same tilting will lead observer B to
judge that the AA' light signal propagates at roughly half
the speed of the BB' light signal. Both signals traverse the
same distance. However, the AA' signal leaves A when the
B clock reads 0 and arrives at A' when the B clock reads 4. The
BB' signal leaves B when the B clock reads 0 and arrives at B'
when the B clock reads a little over 2.
Recall that the judgments of simultaneity of accelerating
observers who move with the clocks agree, since they agree
on the hypersurfaces of simultaneity. So we can choose any
one of the accelerating observers and get the same outcome.
Each of the accelerating observers will judge the transit time
for BB' to be roughly half that of AA'. They will agree that light
propagates slower on the left side of the figure, that is, deeper
in the created field.
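The "roughly half" can be made quantitative with a small calculation. This is a sketch under assumptions that do not come from the figure: units with c = 1, clock A at distance 1 and clock B at distance 2 from the common pivot event of the acceleration, both light signals traveling in the direction of the acceleration, and a common separation d = 0.2. In these coordinates a rightward light signal starting at position x takes a time X_clock · ln(1 + d/x) on the clock at X_clock to cover the distance d. The names are illustrative.

```python
import math


def transit_time_on_clock(X_clock, x_start, d):
    """Time elapsed on the accelerating clock at X_clock while a light
    signal travels from x_start to x_start + d (units with c = 1)."""
    eta = math.log1p(d / x_start)   # light path satisfies x = x_start * exp(eta)
    return X_clock * eta


X_A, X_B = 1.0, 2.0   # assumed positions of clocks A and B
d = 0.2               # assumed common distance A to A' and B to B'

time_AA = transit_time_on_clock(X_B, X_A, d)   # judged with the B clock
time_BB = transit_time_on_clock(X_B, X_B, d)

ratio = time_BB / time_AA
print(f"BB' transit time / AA' transit time = {ratio:.3f}")
```

With these numbers the ratio is about 0.52, and it approaches exactly one half as the distance d shrinks. Using the A clock instead of the B clock rescales both times by the same factor, so the ratio is unchanged.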
Applying the principle of equivalence, we now conclude
that the same slowing manifests in a gravitational field. A light
signal deeper in the gravitational field at A propagates slower
than a light signal higher in the gravitational field at B.
The conclusion that gravity slows the speed of light caused
Einstein some trouble with unkind contemporary critics.
Einstein had first based his theory of 1905 on the striking idea
of the constancy of the speed of light, but he now seemed to
be retracting it.
By 1912, Einstein had developed all these ideas into a fairly
complete theory of static gravitational fields, that is gravitational
fields that do not vary with time and admit well defined spaces.
The most striking characteristic of the theory was that the
intensity of the gravitational field, the gravitational potential,
was given by the speed of light. So as one moved to different
parts of space, the intensity of the gravitational field would vary
in concert with the changes in the speed of light. As late as
1912, some five years after Minkowski's work, Einstein was
loath to use spacetime methods. While I have developed the
clock slowing and light slowing effects using spacetime
diagrams, Einstein did not do this. His method of analysis was
algebraic. He represented the processes by equations in
which speeds and times appeared as variables. He rarely if
ever drew diagrams such as given above.
What Einstein now needed was a way to extend these
results to the more general case of gravitational fields that
vary with time. That, it turned out, required Einstein to move
well beyond the mathematics he knew. Another thought
experiment pointed the way.
The Rotating Disk
If one has a circular disk at rest in some inertial reference
system in special relativity, the geometry of its surface is
Euclidean. That is quite obvious, but it will be useful to spell out
what that means in terms of the outcomes of measuring
operations. If the disk is ten feet in diameter, then it means that
we can lay ten foot-long rulers across a diameter. Euclidean
geometry tells us that the circumference is π × 10 feet, which is
about 31 feet. That means that we can traverse the full
circumference of the disk by laying 31 rulers round the outer
rim of the disk.
What if we have a disk of the same
diameter of 10 feet but in rapid uniform
rotation with respect to the first disk? Things
will go rather differently. Assume that this
rotating disk is covered with foot-long rulers
that move with it. These rulers are just like the
ones that were used to survey the non-rotating
disk. (That means that an observer moving
with a ruler on the rotating disk would find it
to be identical to one of the rulers used to
survey the non-rotating disk.) What will be
the outcome of surveying the geometry of this
rotating disk with those rulers?
An observer who is not rotating with the disk
would judge all these rulers to have shrunk in
the direction of their motion. That means that,
according to this new observer, the surveying
of the disk would proceed differently. Ten
rulers would still be needed to span the
diameter of the disk. Since the motion of the
disk is perpendicular to the rulers laid out
along a diameter, the length of these rulers
would be unaffected by the rotation. That is
not so for the rulers laid along the
circumference. They lie in the direction of
rapid motion. As a result, they shorten and
more are needed to cover the full
circumference of the disk.
Note what was not said in this account. It did not say that we
take the first disk and set it into rotation. The reason is that it is
impossible in relativity theory to take a disk made out of stiff
material and set it into rotation. If one were to try to do this, the
disk would contract in the circumferential direction but not in the
radial direction. As a result, a disk made of stiff material would
break apart. If we want a rotating disk made of stiff material, we
need to create it already rotating. Once in a letter on the subject,
Einstein remarked that a way to get a disk of stiff material into
rotation is first to melt it, set the molten material into rotation and
then allow it to harden. The rotating disk problem has created a
rather large and fruitless literature that suggests some sort of
paradox is at hand. Most of it derives from a failure to recognize
that a stiff disk cannot be set into uniform rotation without
destroying it.
Another little trap to avoid: While we have used the judgments of
an observer not on the disk to infer the outcome of the surveying
operations on the disk, the outcomes of those operations are
independent of the observer's state of motion. Either a diameter
can be covered with ten rulers or it cannot; either the circumference
can be spanned by 31 rulers or it cannot. Once one observer has
found which is the case, we know the result for all observers.
Thus we measure the circumference of the rotating disk to be
greater than 31 feet, the Euclidean value. In other words, we
find that the geometry of the disk is not Euclidean. The
circumference of the disk is more than the Euclidean value of π
times its diameter.
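How many more rulers are needed depends on how fast the rim moves. The sketch below assumes the standard length contraction factor of special relativity, so that each rim ruler is judged shortened by √(1 − v²/c²) and the count of rulers grows by the inverse of that factor. The rim speeds are arbitrary illustrative choices and the function name is hypothetical.

```python
import math


def rulers_around_rim(diameter_ft, rim_speed_over_c):
    """Number of foot-long co-rotating rulers needed to cover the circumference."""
    gamma = 1.0 / math.sqrt(1.0 - rim_speed_over_c**2)   # contraction factor is 1/gamma
    return math.pi * diameter_ft * gamma


for beta in (0.0, 0.6, 0.8):
    count = rulers_around_rim(10, beta)
    print(f"rim speed {beta:.1f} c: about {count:.1f} rulers")
```

At rest the count is the Euclidean 31.4; at a rim speed of 0.6c it rises to about 39, while the ten rulers along the diameter are unaffected.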
The significance of this thought experiment was great for
Einstein. Through his principle of equivalence, Einstein had
found that linear acceleration produces a gravitational field.
Now he found that another sort of acceleration, rotation,
produces geometry that is not Euclidean.
Assembling the Pieces
Einstein had all this in place by the summer of 1912. He
knew that gravitation could bend light and slow clocks. He
expected that the final theory would somehow involve
accelerations in a new way and that such accelerations came
with a breakdown of Euclidean geometry. He also knew that
the natural arena in which to conduct relativity theory is
Minkowski's spacetime.
To us, the final step does not seem like such a great leap.
Assemble the pieces and infer that gravitation is a curvature
of spacetime! All that is needed is nice mathematical clothing to
dress this idea. For Einstein in 1912 it was far from easy. He
first needed the assistance of his mathematician friend Marcel
Grossmann to find his way in the new and difficult mathematics
the theory required.
Then he took a series of wrong turnings and ended up with
the wrong gravitational field equations, not the celebrated
Einstein equations that appear in all the modern textbooks. It
required three years of painful work first to recognize that
something had gone wrong and then to find the final equations.
The precise causes underpinning these wrong turnings remain a
point of debate in the literature on the history of general relativity.
Two elements, however, played a role in misleading Einstein.
First, in 1912 and 1913, Einstein had recognized the need to
employ a geometry of variable curvature in spacetime in his
theory of gravity. However he was convinced that this curvature
would not be manifested in the space-space slices of
spacetime in certain simple cases. These were the cases of a
static gravitational field and also a very weak gravitational field.
Both of these are realized in the gravitational field of the sun.
Einstein expected space around the sun to be exactly
Euclidean. Alas, as we have seen, Einstein's final theory
required curvature in the space-space slices even in this
simple case. That meant that Einstein could not accept the
equations of the final theory for they would entail a curvature of
space when Einstein believed there was none.
Second, Einstein used a different style of theorizing to the one
largely used in these chapters. Here, we have used a
geometrical approach, emphasizing the picturing of
gravitational effects in geometric diagrams. Einstein, however,
labeled events in spacetime with arbitrary coordinate numbers
and expressed all his results in terms of equations relating
these coordinates. Einstein knew that this labeling of spacetime
events with coordinates was purely arbitrary and that all his
results had to be independent of the particular
coordinate system used. However knowing this in the
abstract and carrying through the demand in all details are two
different things. By his own later admission, Einstein found it
hard to purge his coordinate systems of independent reality.
One of the low points in his struggle with coordinate systems
came when Einstein used an ingenious argument, the "hole
argument," to show that gravitational field equations like the
ones of his final theory are inadmissible on physical grounds.
While the hole argument did not warrant that conclusion, it has
been rehabilitated in recent work in philosophy of space and
time, where it now lives a good life. (See "The Hole Argument,"
Stanford Encyclopedia of Philosophy.)
What made the last phase of these three years especially urgent
was the fact that David Hilbert, the greatest mathematician of
the era, had also become interested in the theory and had
started to formulate the gravitational field equations in a
mathematically more elegant form.
For a glimpse into Einstein's private notebook to see his
calculations during the decisive phase of the discovery of
general relativity, see "A Peek into Einstein's Zurich Notebook"
on my Goodies page. Here's one page on which Einstein
writes down the Riemann curvature tensor for the first time and
finds it hard to see how it can be used in his gravitational field
equations.
In November 1915, Einstein published his final version of the
theory, complete with the gravitational field equations so
distinctive of his theory. Here are those equations as he wrote
them at that time, in a 1916 review article:
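For comparison, the equations are usually written in modern textbooks in the form below. This is the modern notation, not Einstein's own 1916 notation, and sign and unit conventions vary from book to book. The left side is built from the summed curvature R_{μν}, its trace R and the metric g_{μν}; the right side is the matter term T_{μν}, scaled by Newton's constant G and the speed of light c.

```latex
R_{\mu\nu} - \tfrac{1}{2}\, R\, g_{\mu\nu} = \frac{8\pi G}{c^{4}}\, T_{\mu\nu}
```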
What You Should Know
• What first led Einstein to work on what became his general theory of relativity.
• The principle of equivalence.
• How Einstein used it to infer the properties of gravitational fields.
• The relativity of inertia.
• Einstein's transition to the mathematics of spacetime curvature.
Copyright John D. Norton. February 2001; January 2, 2007, February 15, August 23, October 16, 27, 2008; February 5, 19, 2010.
Relativistic Cosmology
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/relativistic_cosmology/index.html[28/04/2010 08:22:41 ﺹ]
HPS 0410 Einstein for Everyone
Relativistic Cosmology
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Einstein's Great Book of Universes
Minkowski Spacetime
Solving Einstein's Gravitational Field Equations
The Schwarzschild Spacetime
The Einstein Universe
The Trouble with a Schwarzschild Cosmology
Abolishing Infinity
The Cosmological Constant "λ"
λ Lives On
Time Travel Universes
The Cylinder Universe
Grandfather Paradox
Global Constraints
The Goedel Universe
What you should know
Einstein's Great Book of Universes
The arrival of Einstein's general theory of relativity
marked a rebirth of interest and work in cosmology,
the study of the universe on the largest scale.
Within Newtonian theory, cosmology had reduced
essentially to one question: just how is matter
distributed within an infinite, Euclidean space? Einstein's
theory made the question much more interesting. For
now cosmologists had to contend with many possible
matter distributions, many possible geometries and
many possible dynamics for both.
What connected all these together was Einstein's
gravitational field equations. They specified just which
matter distributions could go with just which geometries
and how the whole system might evolve over time. One
way to think of these equations is as a law that our
universe must satisfy. Another is to imagine them as a
selection rule. Among all conceivable universes, only
some will satisfy Einstein's equations. These universes
are the ones that we designate as possible universes,
where "possible" now just means "licensed by
Einstein's theory."
I like to think of these possible universes as
each comprising a page of a great book.
Metaphorically, Einstein's gravitational field
equations are that book. We shall now turn to
reading that book. Just what sorts of universes
are possible according to Einstein's theory? As
we flip from page to page we will see some
quite interesting universes. Among them we
hope to find our own.
Minkowski Spacetime
Minkowski spacetime is the spacetime in which special
relativity holds. This is the simplest of all solutions of
Einstein's equations. It arises when the unsummed
curvature of spacetime is zero. It is the case of no
gravity. If the unsummed curvature is zero, then the
summed curvature must also be zero. Einstein's
equations will be satisfied if the universe is free of
mass-energy. For this reason, Minkowski spacetime is
an unrealistic candidate for our universe. We know our
universe has matter in it!
That does not make the structure of a Minkowski
spacetime uninteresting. The situation is similar to one
we encounter on our earth's surface. We know that this
surface is a sphere in the large and that a non-Euclidean
geometry must be used to analyze it. Yet in
any small patch, such as an area of land the size of
a city, we can ignore the curvature and apply Euclidean
geometry without appreciable error. Correspondingly,
for many applications, Minkowski spacetime is a
sufficiently good approximation in the small.
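The analogy with the earth's surface can be made quantitative. The sketch below assumes a perfectly spherical earth of radius 6371 km and compares the circumference of a circle drawn on the sphere, 2πR sin(r/R), with the Euclidean value 2πr. The radii chosen for the "city" and the "continent" are illustrative, as is the function name.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean radius; the earth is treated as a perfect sphere


def circumference_deficit(r_km, sphere_radius_km=EARTH_RADIUS_KM):
    """Fractional shortfall of a circle's circumference on a sphere
    compared with the Euclidean value 2*pi*r, for a circle of radius r
    measured along the surface."""
    x = r_km / sphere_radius_km
    return 1.0 - math.sin(x) / x


city = circumference_deficit(10.0)         # circle of radius 10 km
continent = circumference_deficit(5000.0)  # circle of radius 5000 km

print(f"city-sized circle:      {city:.1e}")
print(f"continent-sized circle: {continent:.1e}")
```

For the city-sized circle the departure from Euclidean geometry is less than one part in a million; for the continent-sized circle it is about ten percent. Minkowski spacetime plays the role of the flat city map.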
It will fail, however, whenever gravitation is strong;
that is, when the curvature is large. In the vicinity of the
sun, gravitation is not strong. So there we manage to
extend the use of Minkowski spacetime by pretending
that curvature effects of gravitation are really due to a
new field, the gravitational field.
Solving Einstein's Gravitational Field Equations
The more interesting spacetimes come from other
pages of Einstein's book. We find them by "solving"
Einstein's gravitational field equations. That just means
that we find spacetimes whose summed curvature
matches the matter density in the precise way that
Einstein's field equations demand. Minkowski
spacetime is the simplest solution. Finding others is a
formidable mathematical challenge and only the
simplest of solutions are easy to find. So when a new
solution is found, the new solution is generally named
after whoever found it. Finding a solution is really
discovering a new, possible universe. So the
discoverer's name is then attached to that universe: an
Einstein universe, a Reissner-Nordstroem spacetime,
and so on.
While the activity of solving Einstein's equations is very
hard, the process in conceptual form is quite easy to
describe. The Einstein equations specify how
spacetime can be locally, that is, at any one point. They
say this much curvature always goes with this much
matter. To solve the Einstein equations is merely
to find a way of distributing curvature and matter over
spacetime so that neighboring points mesh correctly.
A rather good analogy is to the
solving of a jigsaw puzzle. The
Einstein equations give us an
endless supply of small pieces.
They are how spacetime can be in
infinitesimally small patches. Each
piece has the right combination of
curvature and matter.
Solving the Einstein equations corresponds to finding a combination of
pieces from the supply that can be fitted together.
What is it for two pieces to fit properly together? Each little patch of
spacetime will have varying curvature, matter density and other
geometrical and physical quantities. In a good solution of Einstein's
equations, these quantities must change continuously as we move from
one patch to another. They cannot jump suddenly as we cross the
boundary between patches; and they are usually not even allowed to
change their rate of growth abruptly.
In the jigsaw analogy, this condition of continuity between the adjacent
patches of spacetime is represented by the requirement that the edge
shapes of adjacent pieces agree, so that they can be connected without
gaps.
The simplest possible solutions are "homogeneous"; that means that they
are everywhere the same. In the jigsaw puzzle analogy, that means that
the spacetime has to be put together from repetitions of the same piece
over and over and over. In the case of a Minkowski spacetime, that piece
is just one that has no curvature and no matter.
The Schwarzschild Spacetime
This was one of the first interesting, exact solutions
given for Einstein's equations and was computed by the
German astronomer Karl Schwarzschild. (He died
shortly afterwards in World War I at the front.) It is of a
universe which looks like a Minkowski spacetime as
you get close to infinity in space. It has a central point
in space around which all the curvature is distributed
symmetrically. It is unchanging in time.
This spacetime is taken to be a good approximation for
the spacetime around the sun (as long as we neglect
that the sun rotates and has some electric charge).
In the jigsaw puzzle analogy, the solution is put
together from pieces that are flat like those of the
Minkowski spacetime in regions that are far from the
central point. The pieces get more curved as we
approach the central point, but their curvature always
respects the rotational symmetry of the space around
the sun.
The Einstein Universe
The Trouble with a Schwarzschild Cosmology
The Schwarzschild spacetime is a good approximation
of the spacetime around our sun. But does it work for
the whole universe? It would work if all the matter of
the universe were located in just one island in an
otherwise empty space. However, when Einstein
started to contemplate these possibilities shortly after
completing his general theory of relativity in 1916, he
did not like this possibility for two reasons.
First, the astronomical information available to Einstein
at that time indicated that the universe was filled with a
roughly uniform distribution of stars. So empirically, the
model was wrong.
Second, there was a deeper theoretical worry.
General relativity had shown how matter fixes the
(summed) curvature of spacetime. Einstein liked that
notion a lot. It reduced the arbitrariness of a spacetime.
Why did the geometry curve this way here and that way
there? It did so because the matter distribution went
this way here and that way there. The theory reduced
the number of arbitrary stipulations that needed to be
made in building a picture of our spacetime.
Einstein liked this idea so much that he elevated the
idea to a principle. He demanded that the whole
geometry of spacetime must be fixed by the matter
distribution. That is stronger than what he had up to
then. For the Einstein equations only required the
matter distribution to fix the summed curvature at each
event in spacetime. Einstein called this stronger
requirement "Mach's principle," since it reminded him
of epistemological analyses of space and time
undertaken by the physicist-philosopher Ernst Mach.
The difficulty is that a Schwarzschild spacetime
has a property that is not fixed by the matter
distribution. That is its flatness at spatial infinity. There
is no matter in realms remote from the center of the
spacetime, so there is no matter to determine that
flatness. It is something that we have to demand in
addition.
In the jigsaw puzzle analogy, the problem is this:
near the center, we know that we need to lay down
pieces of spacetime that respect the rotational
symmetry of spacetime around the central mass. But
when we get far away from that central island, what
sorts of pieces are we supposed to lay down? In a
Schwarzschild spacetime, we lay down pieces that look
more and more like those found in a Minkowski
spacetime. But that is now a choice we are making.
Nothing about the matter in the central island forced us
to make it. That arbitrariness is what worried Einstein.
If you have been reading attentively, you will have noticed that I have
been careful to say only that this arbitrariness worried Einstein. It is
not clear that we should be worried about this arbitrariness.
Historically, Einstein's imagination was grasped by the idea of the
matter distribution fixing spacetime geometry completely. It would be a
pretty outcome and one that Einstein found very helpful in guiding his
construction of new theories. If those theories survive empirical
testing, we might even discover that his idea happens to be true.
Somewhere, somehow Einstein came to believe not just that it might
happen to be true. He came to believe that it has to be true. This is
a much stronger claim. In retrospect, it is hard to see how to justify it.
What if the matter of the universe were all collected in one big central
island? What is wrong with spacetime remote from the island being
flat? The universe has contingent features. Why can't that be one of
them?
Abolishing Infinity
How could one avoid the need to stipulate the
properties of spacetime at infinity? In 1917, Einstein
came up with an ingenious escape: obliterate spatial
infinity! By adding an extra term to his gravitational field
equations, Einstein found a simple solution of his
augmented field equations. ("Augmented"? What is that about?
It refers to the famous λ. See below for an explanation.) It contains
a uniform matter distribution that approximates a
uniform distribution of stars. That matter is at rest and
the geometry of a spatial slice is unchanging with time.
Space, however, curves back onto itself so that it is
spherical. That is, space has the geometry of 5NONE
with positive curvature. In such a space, there is no
infinity at which to stipulate the properties of space and
time.
If one pictures just one dimension of space, then the
universe looks like a cylinder. Spacetime resides
just in the surface of the cylinder. The vertical lines are
the world lines of the stars at rest. The one spatial
dimension is wrapped back onto itself; the time
dimension is not. Each spatial slice at a particular time
appears as a circle; if we could represent all three
dimensions of space, we would somehow have to
replace the circle by a complete sphere of three
dimensional space.
The Einstein universe is an especially simple universe.
It is homogeneous. That means that, like Minkowski
spacetime, it is geometrically the same at every event.
It is also spatially isotropic, which means that it is the
same in every spatial direction. In the jigsaw puzzle
analogy this homogeneity means that the spacetime is
assembled from just one sort of piece, used repeatedly to
build the entire spacetime.
The Cosmological Constant "λ"
Something very important passed by very quickly just
now. The Einstein universe turns out not to solve
Einstein's gravitational field equations of 1915. In
order to accommodate the new cosmology, Einstein
had to make what appeared to be a somewhat arbitrary
adjustment to his gravitational field equations.
In their original form, they said
    summed curvature of spacetime = matter density
We saw in the chapter on general relativity what these
equations require inside a uniform matter distribution.
That was our first illustration: a mass falling in an
evacuated tube inside the matter distribution of the
earth. There we saw that there is a positive curvature
in the space-time sheets of spacetime.
The situation is the same with objects within the uniform
matter distribution of the Einstein universe. Einstein's
original gravitational field equations call for positive
curvature in the spacetime sheets. That means that his
field equations are calling for the same sort of dynamics
as we saw for masses falling freely in a tube drilled
through the center of the earth. All the matter of the
universe should be accelerating towards the other
neighboring pieces of matter, just as the neighboring
masses in the tube accelerate towards each other. That
is, all the matter of this universe should be undergoing
everywhere an inward gravitational collapse, perhaps
delayed only by an initial outward velocity.
The trouble is that there is no curvature in those
sheets in an Einstein universe; the spatial slices remain
unchanged through time. There is no convergence or
divergence of the points of the matter distribution.
Einstein's resolution was to modify his field
equations in a way that would no longer call for this
particular curvature. That is, he put another term into
the equations that supplied the missing curvature. The
real justification was essentially only that it gave him the
result he wanted, the admissibility of his new universe.
The term added was Einstein's celebrated
"cosmological term" or just "lambda" λ. It is a constant
term added to the equations, which means that it is the
same at every event.
His gravitational field equations now read:
    summed curvature of spacetime + λ = matter density
As before, each of these quantities is really a 4x4 table of numbers.
The λ table is constant in the sense that it is the same at every event
in spacetime.
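For readers who want to see the symbols behind the word equation, here is the form in which the augmented equations are usually written in modern textbooks. The notation is an addition and does not appear in this chapter: G_{\mu\nu} stands for the "summed curvature" table, T_{\mu\nu} for the "matter density" table, g_{\mu\nu} for the metric of spacetime, \Lambda for what is here called λ, and \kappa for a fixed constant of proportionality. Sign and unit conventions vary between authors.

```latex
G_{\mu\nu} + \Lambda\, g_{\mu\nu} = \kappa\, T_{\mu\nu},
\qquad \kappa = \frac{8\pi G}{c^{4}}
```

Each index runs over four values, which is why each quantity is a 4x4 table of numbers.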
At the time, it seemed like a good idea. But Einstein
very soon came to regret the addition, which he saw
as harming the formal beauty and simplicity of the
equations.
Einstein also almost immediately became embroiled in
a dispute with the Dutch astronomer, de Sitter. Einstein
had hoped that augmenting his gravitational field
equations with the cosmological term would preclude
empty universes without matter. De Sitter showed that
the augmented equations admitted a cosmology with no
matter density, contrary to Einstein's expectations. It
was an odd spacetime now called "de Sitter
spacetime" that is everywhere expanding although
there is no matter in it. That means that any two tiny
test masses that somehow found their way into the
universe would accelerate away from each other,
wherever they were located.
De Sitter's spacetime may seem an elaborate
construction. It turns out, however, to be the simplest
spacetime after the flat Minkowski spacetime. It has
constant curvature; that is, it has the same curvature at
every event.
To see how simple it is, recall our original recipe for generating curved
spaces. The simplest case was a flat Euclidean surface. We then
generated a two dimensional spherical space by looking at the surface
of a three dimensional sphere in a three dimensional space; and we
generated a three dimensional spherical space by looking at the surface
of a four dimensional sphere embedded in a four dimensional space.
This procedure in the context of a Minkowski spacetime gives us a de
Sitter spacetime. We take a five dimensional Minkowski spacetime (one
time dimension, four spatial dimensions). In it, we construct the analog
of a sphere; it is the four dimensional hyperboloid shown in the figure. In
a Minkowski spacetime, it is a surface of constant curvature, the analog
of a sphere in Euclidean space. The four dimensional surface of that
hyperboloid is the de Sitter spacetime.
One can then easily see how the de Sitter spacetime solves Einstein's
gravitational field equations augmented with the cosmological term λ.
Since de Sitter spacetime has constant curvature, its summed curvature
is everywhere the same. So we generate a solution of Einstein's
augmented equation merely by picking that de Sitter spacetime whose
summed curvature equals the negative of λ. Then the left hand side of
Einstein's equation is zero; the right hand side is also zero since we
assume the spacetime to be matter free.
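In standard textbook notation, the construction and the cancellation just described can be sketched as follows. The symbols are assumptions added here, not the chapter's own: x_0 is the time coordinate and x_1, ..., x_4 the spatial coordinates of the five dimensional Minkowski spacetime, \alpha is the "radius" of the hyperboloid, G_{\mu\nu} is the summed curvature, g_{\mu\nu} the metric, T_{\mu\nu} the matter density and \Lambda the cosmological constant λ.

```latex
-x_0^{2} + x_1^{2} + x_2^{2} + x_3^{2} + x_4^{2} = \alpha^{2}
\quad\Longrightarrow\quad
G_{\mu\nu} = -\Lambda\, g_{\mu\nu}
\quad\text{with}\quad \Lambda = \frac{3}{\alpha^{2}},
\qquad\text{so that}\qquad
G_{\mu\nu} + \Lambda\, g_{\mu\nu} = 0 = T_{\mu\nu}.
```

Choosing the radius \alpha is the same as "picking that de Sitter spacetime" whose summed curvature cancels λ.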
λ Lives On
In retrospect, the extra term Einstein added to his
equations had a simple interpretation. A uniform mass
distribution, if left to itself, ought to collapse under
gravitational self-attraction. That is the physical
interpretation of the curvature of the spacetime sheets
that the equations of 1915 were calling for. In adding
the cosmological term, Einstein was, in effect, adding a
cosmic force of repulsion that would cancel out this
natural gravitational self-attraction. That way the matter
distribution could remain static.
When de Sitter forms a universe without matter, no
gravitational self-attraction of matter opposes λ's
powers of repulsion. We can still insert minute test
masses into this otherwise empty universe to plumb its
properties. With only λ's repulsive powers in effect, we
find a universe in which test masses flee everywhere
from each other.
In 1917, when Einstein proposed his universe, the
natural supposition was that the matter of the universe
is static on the largest scale. In the 1920s and 1930s, it
became clear that this was not so. In fact the matter of
the universe is everywhere expanding rapidly and that
expansion was adequate to counter temporarily the
gravitational self-attraction called for by Einstein's
theory. (A good analogy is a stone tossed into the air.
Its initial, upward velocity overcomes the downward pull
of gravity, but only temporarily.) When these dynamic
cosmologies emerged, Einstein renounced the
cosmological term.
Einstein's renunciation of the cosmological term has not
proven to be fatal to the idea, however. It gives
cosmologists, eager to match their models to the latest
astronomical data, an extra parameter to adjust, so
that they can get a fit of their model to new, recalcitrant
data.
In that context, there is a popular reinterpretation of
the cosmological term. To see it, take Einstein's
augmented gravitational field equations in the case in
which there is no ordinary matter, so the term "matter
density" is 0.
    summed curvature of spacetime + λ = 0
Now just take the λ term and move it to the other side of
the equal sign:
    summed curvature of spacetime = −λ
So, where Einstein's original equations used to say
"matter density," they now say "−λ." What that means is
that, up to a sign, Einstein's λ is behaving like an extra sort
of matter distributed through space, according to the original
equations.
Since we know that λ corresponds to a force of
repulsion between matter, it behaves like an odd
sort of matter that accelerates the expansion of
matter in space. What is odd about it is that all
ordinary matter generates attractive gravitational
forces. That was the fundamental idea of Newton's
original notion of "universal gravitation." As noted
above, this now gives some understanding of why
the matter-free de Sitter universe is expanding. It is
being driven by the repulsions inherent in the
cosmological term.
The cosmological constant λ has proven especially
useful to recent work in cosmology, for the observed
motions of distant matter incorporate accelerated
recessions greater than Einstein's original equations
allowed. Einstein would not be pleased.
Time Travel Universes
Before we turn to pursue the spacetime that best
resembles our own in the next chapter, it is interesting
to review how Einstein's theory allows us to describe
universes in which time travel is possible.
The Cylinder Universe
The easiest type of time travel universe looks like a trick
that is stipulated into existence. However, there is
nothing illicit about it. And its great simplicity enables
us to refine our intuitions about just how time travel can
arise.
Einstein showed us through the Einstein universe that
we can curve space back onto itself and thus produce a
closed space.
That construction proved a little complicated for
Einstein since there are three dimensions of space that
need to be accommodated. If we want to do it in the
time direction, it is much easier. There is only one
dimension of time. The simplest case arises if we wrap
up the time direction of a Minkowski spacetime. As
before, if we consider only one dimension of space, we
recover a cylinder. The spacetime is on the surface of
the cylinder. For the new case the cylinder is wrapped
up in the timelike direction.
You might wonder if a trick like this is really
allowed by Einstein's gravitational field
equations. It is. Recall that Einstein's
gravitational field equations merely fix how
each little patch of spacetime must look. A
solution is admissible if each patch
connects properly with those next to it. That
will happen in this spacetime. In any not too
big piece, this cylinder universe is exactly
the same as a Minkowski spacetime; each
piece connects with the one next to it just
as they do in a Minkowski spacetime. That
is all that is needed for the spacetime to
count as a solution of Einstein's
gravitational field equations.
The timelike curve on the spacetime represents the life
of a traveler who stays at one point in space, but
passes through time merely by being. Eventually that
worldline will wrap all the way around the spacetime
and reconnect. At that point, the traveler will meet his or
her former self.
Grandfather Paradox
The traditional "grandfather" paradox of time travel
arises if the latest stage of the traveler (now imagined
to be the grandson of the original traveler) were to kill
the original one (the grandfather). A contradiction
would ensue. With the assassination complete, there
would be no traveler to pass through time and commit it.
So the assassination happening entails that it doesn't
happen.
Global Constraints
The possibility of such paradoxes has led some to
conclude that time travel universes are logical
impossibilities. That is too hasty. There is an obvious
loophole in the paradox. If the assassination attempt
fails, then there is no contradiction.
So that is what must happen in a time travel universe.
The grandson's bullet must miss; or the gun must misfire;
or the grandfather must duck; or who knows what. For if the
assassination attempt didn't fail, there would be no
assassin to attempt it.
That resolution is, as far as I know, admissible. Many
find it objectionable since there seems to be no reason
in the physics itself that forces the failure of the
assassination attempt. What if the grandson takes all
due care, aims carefully with a new gun, and so on?
How can we be so sure that the attempt will fail?
We can. The intuitions that tell us it will not fail are
honed in a type of universe that is quite different from a
time travel universe. In the ordinary time travel free
universes, such as we presume we inhabit, local
constraints prevail. If the gun misfired, for example, it
was because something in the state of the gun
immediately prior to the assassination attempt
intervened. Perhaps the grandson passed through a
rain shower and a component of the gun began to rust.
In a time travel universe, in addition to these sorts of
local constraints, we have a new type: global
constraints. These are extra constraints that all
processes must conform to in order that distant future
and distant past mesh when they meet. These global
constraints do not arise in time travel free universes.
They are what assure us that the assassination
attempt must fail.
We can get an idea of how they work from the jigsaw
puzzle analogy for solving Einstein's equations.
First consider a universe without time
travel. We start with a row of pieces that
represents space in the present instant.
Then we add successive rows that
correspond with space in successive
future times. The pieces we add are
constrained only by the local
requirement that each piece mesh with
those immediately before and after it in
time; and those around it in space.
Now take the case of a time travel
universe. All these constraints apply.
But, in addition, as we keep adding
the successive rows, we will
eventually end up going all the way
round the space and then the new
and powerful constraint will come
into force. The last row we add has
to be built so perfectly that it
meshes with the past edge of the
first row we put in place. That is a
global constraint. It means that in our
planning of which pieces to lay
down, we had to worry about the
local meshing of the pieces; and, in
addition, we had to select pieces
now so that eventually the final
meshing of last and first row
would work out.
Here's a simple example in a different arena of how these
sorts of global constraints can work. It is the arithmetic
puzzle, "99." In the puzzle, you are to start at zero and
may add or subtract any number you like between 0 and 10,
as many times as you like, provided that the numbers that
you are adding or subtracting are always even numbers. Is
there some combination of additions and subtractions that will
get you to 99?
Locally, there is no obstacle to getting to 99. If you could
somehow get your sum to 97 or to 95, you could complete
the task by adding 2 or 4. Just looking locally at the numbers
around 99 reveals no problems.
Globally, however, there is a constraint that necessarily
defeats your attempts to arrive at 99. Since you start with
zero and may add or subtract even numbers only, your sum
must always be an even number. So you can never get to 97
or 95 or any other number that is an even number removed
from 99. This global constraint assures your failure to solve
the puzzle.
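The global constraint in the "99" puzzle can be checked by brute force. The sketch below is an illustration added here, not part of the original puzzle: the function name `reachable_sums` and the search window from -20 to 120 are assumptions chosen only to keep the search finite. It collects every sum that can be reached from zero with the allowed even moves and confirms that all of them are even.

```python
from collections import deque

def reachable_sums(moves, start=0, low=-20, high=120):
    """Breadth-first search for every sum reachable from `start`
    by adding or subtracting the numbers in `moves`, staying in [low, high]."""
    seen = {start}
    queue = deque([start])
    while queue:
        total = queue.popleft()
        for step in moves:
            for nxt in (total + step, total - step):
                if low <= nxt <= high and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

# The puzzle's rule: only even numbers between 0 and 10 may be used.
EVEN_MOVES = [0, 2, 4, 6, 8, 10]
sums = reachable_sums(EVEN_MOVES)

print(99 in sums)                       # is the target ever reached?
print(all(s % 2 == 0 for s in sums))    # is every reachable sum even?
print(98 in sums, 100 in sums)          # the even neighbors of 99
```

No single move looks forbidden, yet the search never arrives at 99. The obstacle is visible only when the whole sequence of moves is considered at once.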
A simple illustration shows just how powerful these
global restrictions on a spacetime can be. Consider just
about the simplest possible time travel universe: a
universe empty of all matter except for just one mass.
Now pick some time slice. What configurations of the
mass are possible?
In an ordinary time travel free universe, at some
initial moment of time, we can have the worldline of the
mass with any initial velocity.
If the spacetime is a time travel, cylinder universe,
we are strangely restricted in the possibilities for this
time slice. We could choose a mass at rest. That
corresponds to the case of a single worldline that
eventually wraps back onto itself. But if we have the
mass initially moving, then we must also stipulate that
clone masses be distributed in space at uniform
intervals. These will be the repeated returns of the
single mass as it travels all the way round spacetime
and back to the present.
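The uniform spacing of the clone masses follows from simple arithmetic, which the sketch below works through. It rests on assumptions that are not in the text: the mass moves at constant velocity along the worldline x = x0 + velocity * t, the time direction closes up after a fixed `period`, and the function name `clone_positions` is hypothetical. Each trip around the cylinder brings the mass back to the chosen time slice displaced by velocity times period.

```python
def clone_positions(x0, velocity, period, copies=3):
    """Positions at which the worldline x = x0 + velocity * t crosses the
    time slice t = 0 when time t is identified with t + period.

    Units are an assumption: velocity as a fraction of the speed of light,
    period in years, positions in light years.
    """
    if velocity == 0:
        # A mass at rest returns to where it started: one closed worldline.
        return [x0]
    # Each circuit of the time direction shifts the mass by velocity * period.
    return [x0 + n * velocity * period for n in range(-copies, copies + 1)]

print(clone_positions(0.0, 0.0, 10.0))            # mass at rest
print(clone_positions(0.0, 0.5, 10.0, copies=2))  # moving mass and its clones
```

Only a finite number of copies is listed here. In the cylinder universe itself the pattern continues indefinitely in both spatial directions.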
The global constraint says that if we have a moving
mass here and now, we must also have a moving mass
there and now; and there and now; and so on. That sort
of constraint would be incomprehensible in a universe
without time travel. What reason of physics, we would
exclaim, requires it? That is just what we asked of the
grandson's assassination attempt: what reason of
physics requires it to fail?
The Goedel Universe
There is something that looks just a little fishy about
the way time travel is arrived at in the cylinder universe.
It does not seem to arise from the physics of the
spacetime. It comes from a stipulation on our part that
the future wrap back onto the past. Einstein's theory seems
only to get involved in so far as it raises no objection.
There is nothing wrong with this way of introducing time
travel, of course.
It is nice to know, however, that time travel also can
arise more naturally. The Goedel universe is one such
example. This solution to the Einstein equations was
arrived at by the famous logician, Kurt Goedel, in
the 1940s, when he was a colleague of Einstein's at the
Institute for Advanced Study in Princeton, and
published in 1949.
The Goedel universe is a solution of Einstein's
equations with the cosmological term. Its signature
property is that it contains closed timelike
worldlines. As a result, it is hard to pick out a single
timelike direction globally in the spacetime. Rather, we
can get a feel for its spacetime properties by taking just
a single two dimensional slice of it. It will become clear
that this is not a spacelike slice.
If we consider some observer in the middle of this slice,
the observer will find all the matter in a great cosmic
rotation around them. (For this reason, the Goedel
universe cannot be ours. We don't see such rotation.)
The reason for the rotation lies with the structure of
spacetime itself. As we consider positions in the slice
further away from the observer, the light cones start
to tip over. So if we consider a large enough chunk of
the slice, we can find a timelike curve that loops back
onto itself. It forms a closed timelike curve, the hallmark
of universes that admit time travel.
The timelike curve is not a geodesic; it represents the
trajectory of an accelerating spaceship. To achieve time
travel, the spaceship would need to accelerate quite
considerably. Most interestingly, the Goedel universe
uses no stipulations about the past wrapping back onto the
future to achieve the possibility of time travel.
There are other universes that admit time travel.
Often rotation is involved. Spacetime around an
infinitely long, very dense, rapidly rotating tube of matter
admits closed timelike curves, for example. Some of the
most fascinating of the time travel universes are those
in which one part of spacetime is connected to another
by a wormhole. That is just a tunnel of spacetime that
provides an alternative route from one part of spacetime
to another.
What you should know
The analogy between solving Einstein's gravitational field equations and the solving of a jigsaw puzzle.
How Einstein's cosmological constant λ modifies his gravitational field equations.
The basic characteristics of various solutions of the Einstein equations:
    Minkowski spacetime
    Schwarzschild spacetime
    Einstein universe
    de Sitter universe
    cylinder universe
    Goedel spacetime
How time travel can arise in relativistic cosmologies.
Copyright John D. Norton. March 2001; January 2007; February 16, October 15, 27, 2008.
Big Bang Cosmology
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/big_bang/index.html[28/04/2010 08:22:57 ﺹ]
HPS 0410 Einstein for Everyone
Big Bang Cosmology
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Our Universe: The Observed Hubble Expansion
Matter Distribution on the Largest Scale
Motion on the Largest Scale
Hubble's Law, Hubble's Constant and the Age of the Universe
Friedmann-Robertson-Walker Spacetimes
What the Big Bang really is
Cosmological Red Shift
Cosmic Dynamics: Three Possibilities
A Newtonian Analogy for the Dynamics
And Ours Is...
Is a Big Bang Inevitable?
A Theorem by Stephen Hawking
What you should know
Background reading: J. P. McEvoy and O. Zarate, Introducing
Stephen Hawking. Totem. pp. 46-105.
Our Universe: The Observed Hubble Expansion
None of the universes discussed so far are ours. To determine
which universe in Einstein's great book is our universe we need to
know a little more about ours. Two facts have proven decisive in
selecting our universe: the distribution of matter in the universe
and its motion.
Matter Distribution on the Largest
Scale
How is matter distributed in our universe on the largest scale? To
answer we need to get a sense of just what that largest scale is.
Let us step up to it:
Within our solar system, the distance from the sun to the earth
is 93 million miles; light requires 8.3 minutes to propagate from the
sun to the earth. Pluto is much farther away from the sun, 2700 to
4500 million miles depending on the position in its orbit.
Our sun is just one of the hundreds of billions of stars that
form our galaxy, the Milky Way. It is vastly bigger than our
solar system. Its main disk is 80,000 to 100,000 light years in
diameter. It is worth pausing to imagine what that means. A light
year is the distance light travels in one year: 5,880,000,000,000
miles. Just one light year is already enormous. If we decide to
send a light signal to some randomly chosen star in the Milky
Way, it will require many tens of thousands of years to get there.
That is already longer than recorded history. If some being there
decides to send a signal in response, will there be anyone here to
receive it?
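The distances quoted above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the speed of light is taken as about 186,282 miles per second and a year as 365.25 days, neither of which is given in the text, and the variable names are illustrative only.

```python
SPEED_OF_LIGHT_MILES_PER_S = 186_282       # approximate value, an assumption
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # a year of 365.25 days

# Distance light travels in one year, in miles.
light_year_miles = SPEED_OF_LIGHT_MILES_PER_S * SECONDS_PER_YEAR

# Light travel time from the sun to the earth (93 million miles), in minutes.
sun_to_earth_minutes = 93e6 / SPEED_OF_LIGHT_MILES_PER_S / 60

# Diameter of the Milky Way's main disk at the upper figure of 100,000 light years.
milky_way_miles = 100_000 * light_year_miles

print(f"one light year: {light_year_miles:.3e} miles")
print(f"sun to earth:   {sun_to_earth_minutes:.1f} light minutes")
print(f"Milky Way disk: {milky_way_miles:.3e} miles")
```

The first two results should land close to the figures quoted in the text, 5.88 trillion miles and 8.3 minutes.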
Here's how the Milky Way looks to us from the inside: a broad
luminous band made up of many stars spread across the sky.
Here's an artist's conception of how it looks from the outside:
The remaining stars of the universe are grouped into other
galaxies. Here's our nearest large galaxy, the Andromeda galaxy,
M31, which is about 2 million light years away:
Finally, on the largest scale, luminous matter is roughly
uniformly distributed through space in galaxies separated by
millions of light years. Here's an image from the Hubble telescope:
The images above were drawn from the NASA website, http://www.nasa.gov/, January 14,
2007. NASA provides these images copyright free subject to the restrictions on
http://www.simlabs.arc.nasa.gov/copyright_info/copyright.html
This familiar picture of the universe on the largest scale is a quite
recent discovery. As late as 1920, it remained unclear whether all
the matter of the universe was collected in one place, the Milky
Way; or whether the Milky Way was just one galaxy of many
scattered through space. This question was the subject of what
came to be known as the "Great Debate" that happened in the
Baird auditorium of the Smithsonian Museum of Natural History on
April 26, 1920. There two astronomers battled. Harlow Shapley
defended the view that the Milky Way is a single island containing
all the matter of the universe, and Heber Curtis argued
for the many galaxies of stars view that ultimately prevailed.
Harlow Shapley
Heber Curtis
These galaxies are the basic units of matter of modern
cosmology. They are the molecules of the cosmic gas that is the
subject of modern cosmology. The theory proceeds by assuming
that they form a continuous fluid, much as we routinely
assume that water or air is a continuous fluid, even though we
know it is made of molecules; or that sand dunes are continuous,
even though they are made of grains of sand. As long as we take
a distant enough view of galaxies, molecules or sand grains, they
blend into their neighbors and appear to form a continuous
distribution of matter.
The galaxies form the luminous part of the matter of the universe.
Recent investigations are showing that there is a lot more matter
in the cosmos. It is prefixed by "dark...". Dark energy permeates
all space and plays a major role in cosmic dynamics. Dark matter
provides the additional gravitational pull needed to hold galaxies
together.
Motion on the Largest Scale
Einstein, in 1917, presumed that on the largest scale we would
see a uniform distribution of stars all roughly at rest. In the course
of the 1920s, in the aftermath of the Great Debate, it became clear
that the basic unit of cosmic matter would be the galaxy and not
the star. That by itself changed little at the fundamental level of
theory. What did change our cosmic theorizing a lot was an
observation about light from distant galaxies pursued most
famously by Edwin Hubble towards the end of the 1920s. That
observation became the single most important observational
fact of modern cosmology.
Edwin Hubble
Mount Wilson telescope
What Hubble observed was that light from distant galaxies was
redder than light from nearby galaxies.
More importantly, there was a linear relationship between the
distance to the galaxy and the amount of reddening. Double the
distance and you double the reddening; triple the distance and
you triple the reddening; and so on.
How was this reddening to be interpreted? Hubble inferred that it
revealed a velocity of recession of the galaxies. The redder the
light the faster the galaxies were receding.
Hubble arrived at this interpretation through an effect familiar from
optics and acoustics, the Doppler effect. Every sound or light
wave has a particular frequency and wavelength. In sound, they
determine the pitch; in light they determine the color. Here's a light
wave and an observer.
If the observer were to hurry towards the source of the light, the
observer would now pass wavecrests more frequently than the
resting observer.
That would mean that the moving observer would find the frequency
of the light to have increased (and, correspondingly, the
wavelength, the distance between crests, to have decreased).
That increase in frequency is a shifting of the light towards the
blue end of the spectrum.
The converse effect would happen if the observer were to recede
from the light source. The light's frequency would diminish and
the light would redden.
For light, this effect depends only on the
relative motion of observer and source.
So if the observer were at rest and the light
source moved, exactly the same thing
would happen.
This is no longer true in the case of sound. Then there is a medium that
carries the sound waves, the air, and we get slightly different results
according to which of the observer or sound emitter is moving with respect
to the air. There is nothing analogous to the air for light; there is no
luminiferous ether!
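The claim that, for light, only the relative motion matters can be made concrete with the standard special relativistic Doppler formula for motion along the line of sight. The formula is not stated in this chapter and is supplied here as an assumption; the function name `doppler_factor` and the sign convention (positive `beta` for recession) are illustrative choices.

```python
import math

def doppler_factor(beta):
    """Ratio of observed to emitted frequency for light.

    beta is the relative velocity of source and observer as a fraction of
    the speed of light: positive when they recede from one another,
    negative when they approach.
    """
    if not -1 < beta < 1:
        raise ValueError("relative speed must be less than the speed of light")
    return math.sqrt((1 - beta) / (1 + beta))

for beta in (-0.1, 0.0, 0.1, 0.6):
    factor = doppler_factor(beta)
    shift = "blue" if factor > 1 else "red" if factor < 1 else "none"
    print(f"beta = {beta:+.1f}: frequency ratio {factor:.3f}, shift: {shift}")
```

Only the single relative velocity `beta` enters the function. There is no separate input for the velocity of the source or of the observer with respect to a medium, which is the point of the contrast with sound.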
The Doppler effect is familiar from everyday life. When an
ambulance approaches us with its siren on, we hear a higher
pitch because it is approaching. As it passes and then recedes,
we hear the pitch suddenly drop. There has been no change in
the sound emitted by the siren. The ambulance driver hears no
change in the siren pitch. All these changes happen as a result of
the relative motion between you and the ambulance siren by
means of the Doppler effect.
Hubble inferred from the red shift of light from distant galaxies
a velocity of recession of the galaxies. The further a galaxy
is from us, the faster it recedes. The relationship is linear, a fact to
be explored in a moment.
Hubble arrived at the basic fact that all modern cosmologies try to
accommodate: the universe is undergoing a massive expansion.
I'll mention here for later reference that the use of Doppler's
principle as a way of interpreting the red shift has limited
application. When we have developed a full cosmological model
using general relativity, we'll see that the presumptions above of a
static space with observers and galaxies moving in it will fail.
Instead we shall see that the reddening of light from distant
galaxies comes from a stretching of space itself while the light
propagates to us. Doppler's principle provides a useful, classical
approximation of the effect.
Hubble's Law, Hubble's Constant and the Age of the Universe
Hubble found a linear relationship between the velocity of
recession and the distance to the galaxy. What that means can be
seen in the table:
Distance to galaxy (light years)    Velocity of recession (kilometers/second)
1,000,000                           20
2,000,000                           40
3,000,000                           60
4,000,000                           80
5,000,000                           100
There is an obvious rule built into this table and it is known as
"Hubble's Law":
    Velocity of recession (kilometers/second) = 20 x Distance to galaxy (millions of light years)
The magic number of 20 in this formula carries a lot of the
content. In effect it is telling us that we need to assign 20
kilometers per second of velocity of recession for every million
light years of distance between us and the galaxy. This number,
which is one of the most important cosmic parameters, is known
as Hubble's constant.
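The rule can be written out as a short calculation. This is only a sketch in Python; the function name and the constant's name are my own choices for illustration, and the value 20 is the round figure used in the table above.

```python
# Hubble's constant in the round units used in the text:
# kilometers per second of recession per million light years of distance.
HUBBLE_CONSTANT = 20.0

def recession_velocity(distance_light_years):
    """Velocity of recession in km/s for a galaxy at the given distance."""
    distance_in_millions = distance_light_years / 1_000_000
    return HUBBLE_CONSTANT * distance_in_millions

# Reproduce the rows of the table.
for distance in (1_000_000, 2_000_000, 3_000_000, 4_000_000, 5_000_000):
    print(f"{distance:>9,} light years -> {recession_velocity(distance):5.0f} km/s")
```

Doubling the distance doubles the velocity, which is all that "linear" means here.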
Built into Hubble's law is also a notion of the age of the
universe. To see it, consider a galaxy a million light years distant
from us. If its speed of recession was the same throughout history, we
can compute how long ago the matter of that galaxy was here.
Similarly we can compute how long ago the matter of a galaxy two
million light years distant was here. And we can compute how long
ago the matter of a galaxy three million light years distant was
here.
A remarkable fact follows from the linearity of Hubble's law. All the
times computed will come out to be the same. They will simply
be one divided by Hubble's constant (with the units appropriately
adjusted). The time we have computed is a time at which all the
matter of the universe was coincident. That marks the beginning of
the universe, which we now call the "big bang." This is very pretty. We
proceed from observations about galaxies to Hubble's law with its
constant on to the age of the universe.
Age of Universe = 1 / Hubble's constant
The Hubble age of the universe is roughly 14 billion years.
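Here is a sketch of the computation in Python, using the round numbers of the table. The function name is hypothetical and the speed of light is the only figure not taken from the text. Dividing each distance by its velocity gives the same time for every row. With the rounded value of 20 for Hubble's constant it comes out near 15 billion years, in the neighborhood of the figure just quoted.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # exact, by definition of the meter

def time_since_coincidence_years(distance_light_years, velocity_km_per_s):
    """Years needed to cover the distance at a constant recession velocity.

    Light covers one light year per year, so a body moving at a fraction
    v/c of light speed needs c/v years for each light year.
    """
    return distance_light_years * SPEED_OF_LIGHT_KM_PER_S / velocity_km_per_s

# (distance in light years, velocity in km/s), as in the table of the text.
table = [(1_000_000, 20), (2_000_000, 40), (3_000_000, 60),
         (4_000_000, 80), (5_000_000, 100)]
ages = [time_since_coincidence_years(d, v) for d, v in table]
for (d, v), age in zip(table, ages):
    print(f"{d:>9,} light years at {v:>3} km/s: {age / 1e9:.1f} billion years")
```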
Friedmann-Robertson-Walker Spacetimes
Which solutions of Einstein's gravitational field equations can
accommodate a universe with a Hubble expansion? Or, more
figuratively, which pages in Einstein's great book might belong to
our universe?
The answer lies in a class of solutions of Einstein's equations
picked out by a few simple conditions, the Friedmann-Robertson-Walker
spacetimes. These spacetimes can be sliced up into
spaces that evolve into each other over time.
(It isn't automatic that a spacetime can be cut up into nice spatial slices. A Goedel
universe cannot be sliced up nicely into spaces that evolve into each other over
time.)
What character should the spaces have? Our observations of our
cosmos tell us that on the largest scale space is homogeneous
and isotropic. So we ask that the solutions have a
homogeneous, isotropic space, that is, a space that is the
same in every place (homogeneous) and in all directions
(isotropic); and that this space simply evolves with time.
This condition that space is homogeneous and isotropic on the
largest scale has been called the "cosmological principle."
That strikes me as a little dangerous. Naming the condition the
cosmological principle does no harm, of course, as long as we
realize that it is just a name. However, when the term "principle" is
used, it is easy to get the impression that the condition is
somehow unchallengeable. That is risky. Whether the universe is
roughly homogeneous and isotropic is something that should be
determined by observation. It should not be elevated to a priori
heights.
The condition that the space is homogeneous and isotropic
restricts it to three general possibilities. Such a space must have
constant spatial curvature. We know from earlier that the three
possibilities are:
Spherical: positive curvature
Flat, Euclidean: zero curvature
Hyperbolic: negative curvature
A space of one of these three types will be the instantaneous
snapshots that comprise the "now" of the cosmology.
Each of these snapshots of space will be filled with a uniform
matter distribution. Its composition is not fixed. It may be
ordinary matter, such as makes up planets and stars; or it may be
radiation; or it may be a mixture of the two. At present we have a
mixture that is heavily skewed towards ordinary matter. In the
past, radiation was dominant.
Finally the spaces of the cosmology cannot remain static. They
are either expanding or contracting. The first case of expansion
is the one that interests us most since it is what we observe. As
time passes, the space expands, its curvature, if it has any,
decreases, and the distance between the galaxies increases. The
figure shows the worldlines of the galaxies with the spatial slices.
The most interesting feature of this figure is what happens when
we project the worldlines of the galaxies into the past. The
galaxies get closer and closer. Eventually, they converge onto a
state of infinite curvature and density. This is the initial state, the
so-called "big bang."
What the Big Bang
really is
It is easy to misunderstand the nature of the big bang and the
expansion of the universe.
The popular image called to mind by the name big bang is
something like this. There is a huge empty space, with an infinitely
dense nugget of matter containing all future matter of the universe.
At the moment of the big bang, this nugget explodes. Fragments
of this primeval nugget are scattered into space, progressively
filling it with an expanding cloud of matter. This is NOT the
modern big bang model.
Rather the expansion is the expansion of space itself. The
most helpful picture is of the rubber surface of a balloon
expanding. The galaxies are like dots drawn on the surface. They
move as the rubber sheet stretches. The galaxies fly apart
because space expands. At any instant, space is always full of
matter; there is no island of fragments expanding into an empty
space.
If we now project back to the big bang, we project back to a time
at which all matter and space were somehow compressed into a
state of infinite density. Einstein's gravitational field equations tell
us the matter density equals the summed spacetime curvature.
So, if the matter density is infinite, the curvature of spacetime
has become infinite as well.
That last statement cannot be literally correct. According to
Einstein's general theory of relativity, spacetime at every event has
definite curvature. If that curvature is everywhere infinite, we
define no spacetime at all. If we try to imagine the time of the big
bang itself as one of the times of the cosmology, we are saying
that there is a time at which spacetime is not properly defined. So
there can be no time in the cosmology corresponding to the big
bang. We describe the big bang as a "singularity," a breakdown
in the laws that govern space and time.
The term singularity, roughly speaking, designates a point in a mathematical
structure where a quantity fails to be well defined, even though the quantity is well
defined at all neighboring points. The simplest and best known example arises with
the inverse function, 1/x. As long as x is nonzero, 1/x is well defined. For
x = 10, 5, 1, 0.5, 0.1, 0.01, ...,
1/x = 0.1, 0.2, 1, 2, 10, 100, ...
For negative values
x = -10, -5, -1, -0.5, -0.1, -0.01, ...,
1/x = -0.1, -0.2, -1, -2, -10, -100, ...
The system has a singularity at x=0, for then 1/x = 1/0 and, as we all learn in our
arithmetic classes, "you cannot divide by zero." There is a temptation to say that
1/0 is "infinity." But that is dangerous. As we have just seen, if we approach x=0
from positive values of x, the inverse 1/x grows without limit towards +infinity (i.e.
"plus infinity"). If we approach x=0 from negative values of x, the inverse 1/x grows
negatively without limit towards -infinity (i.e. "minus infinity"). If we insist on giving
1/0 a value, which do we give? "Plus infinity" or "minus infinity"? The safer course
is just to say that we have a singularity at x=0 and not try to give it any value.
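The two-sided behavior is easy to check numerically. The following is a small illustrative sketch in Python (the function name `inverse` is my own). It shows 1/x running off in opposite directions on the two sides of x=0, and the arithmetic simply refusing to supply a value at x=0 itself.

```python
def inverse(x):
    """Return 1/x; undefined (raises ZeroDivisionError) at x = 0."""
    return 1 / x

positive_side = [inverse(x) for x in (0.1, 0.01, 0.001)]
negative_side = [inverse(x) for x in (-0.1, -0.01, -0.001)]
print(positive_side)  # grows without limit in the positive direction
print(negative_side)  # grows without limit in the negative direction

try:
    inverse(0)
except ZeroDivisionError:
    print("1/x has no value at x = 0")
```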
What we can say is this. The universe has an age or time, its age
after the big bang. The spacetime of the universe exists for every
age greater than zero: 1 million years, one hundred years, one
second, one half second, one tenth second, and so on. No matter
how small we make the age, there is a corresponding spacetime,
as long as the age is greater than zero. But nothing corresponds
to the zero age.
This moment of zero age is a fictitious moment in the history of
the universe. In that regard, it is like the fictitious point "at infinity"
on the horizon where parallel lines meet. Of course we all know
that there is no such point, although we see it drawn routinely in
perspective drawings.
Cosmological Red Shift
We can now return to the red shift that figures in the Hubble
expansion and give a more precise account of its origin. It is not a
traditional Doppler shift, but something more subtle. A distant
galaxy emits light towards us. The light waves with their crests are
carried by space towards us. For a distant galaxy, it can take a
very long time for the light to reach us. During that time, the
cosmic expansion of space proceeds. The effect is that the
waves of the light signal get stretched with space. So the
wavelength of the light increases and its frequency decreases. It
becomes red shifted.
To get a sense of the process, imagine a column of ants setting
off to walk across a rubber sheet. They may enter the sheet at a
rate of one ant per second. If the rubber sheet is stretched while
the ants walk, each ant will need to go further to get to the other
side than the one before. So the ants will arrive less frequently at
the other side than the original rate of one ant per second.
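The ant picture can be turned into a toy calculation. What follows is only a sketch under assumptions that are mine, not the text's: the sheet stretches uniformly, its length grows linearly in time, and each ant walks at a fixed speed relative to the rubber under its feet. All parameter names and values are hypothetical. Under those assumptions the arrival time has a closed form, and the gap between arrivals is longer than the one-second gap between departures.

```python
import math

def arrival_time(entry_time, initial_length=10.0, stretch_rate=0.05, ant_speed=1.0):
    """Time at which an ant entering at entry_time reaches the far side.

    Assumed model (not from the text): the sheet has length
    initial_length * (1 + stretch_rate * t) and stretches uniformly,
    while the ant walks at ant_speed relative to the rubber.
    The fraction of the sheet crossed then grows at the rate
    ant_speed / length(t), which integrates to the closed form below.
    """
    growth = math.exp(stretch_rate * initial_length / ant_speed)
    return ((1 + stretch_rate * entry_time) * growth - 1) / stretch_rate

entries = [0, 1, 2, 3, 4]                      # one ant per second
arrivals = [arrival_time(t) for t in entries]
gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
print(gaps)  # every gap is longer than the one-second gap at entry
```

The ants play the role of wave crests: a longer gap between arrivals is a lower frequency, which is the red shift.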
Cosmic Dynamics: Three Possibilities
What is the overall dynamics of spacetime? Einstein's gravitational
field equations applied to the Friedmann-Robertson-Walker
spacetimes give us three possibilities, cataloged below as I, II
or III.
What decides between them is the density of matter. The
so-called "critical density" of matter is the deciding value. It is a
minute average density of 10^-29 grams per cubic centimeter. Our
cosmology will be one of the three shown in the table below
according to whether the actual average density of matter in our
universe is greater than, equal to or less than this critical density.
Cosmology              I                                     II                                III
Average mass density   Greater than critical                 Critical                          Less than critical
Geometry of space      Spherical, positive curvature         Flat, Euclidean, zero curvature   Hyperbolic, negative curvature
Dynamics               Expands and collapses to big crunch   Expands indefinitely              Expands indefinitely
The table gives the broad features. In cases I and III, space is
curved. The scale factor R, the radius of curvature of the space,
determines the extent of curvature. (The radius of curvature of a
three dimensional space is the three dimensional analog of the
radius of a two-dimensional sphere.) The value of R differs greatly
according to the particular matter density at hand. However a
rough estimate is this:
Scale factor R very roughly equals (Hubble age of universe) x (speed of light)
So by this estimate the scale factor is roughly 14 billion light
years. This value only obtains exactly for special cases. In
cosmology I, it obtains exactly if the average matter density is
twice the critical.
We can also get a sense of the dynamics by plotting how the
scale factor R changes with time in typical examples of the three
cosmologies. In the case of cosmology II with Euclidean
geometry, the scale factor R is simply set to be the distance
between two conveniently placed galaxies. As the cosmic
expansion proceeds, R grows in response.
In general there are no simple formulae for these curves. One
case proves to be simple. In cosmology II, if all the matter is
what is called "dust" in the jargon (i.e. ordinary matter like our
earth), then R increases in direct proportion with (time)^(2/3). Or in
that cosmology, if all the matter is radiation, R increases in direct
proportion with (time)^(1/2).
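To get a feel for these two power laws, here is a short illustrative sketch in Python (the names are my own). It asks by what factor R grows when the age of the universe grows eightfold. Because only ratios matter, the unknown constant of proportionality drops out.

```python
def scale_factor_ratio(time_ratio, exponent):
    """Growth factor of R when the age grows by time_ratio, for R proportional to time**exponent."""
    return time_ratio ** exponent

DUST_EXPONENT = 2 / 3       # cosmology II filled with ordinary matter ("dust")
RADIATION_EXPONENT = 1 / 2  # cosmology II filled with radiation

for label, exponent in (("dust", DUST_EXPONENT), ("radiation", RADIATION_EXPONENT)):
    growth = scale_factor_ratio(8.0, exponent)
    print(f"{label}: age x 8 -> R x {growth:.2f}")
```

In both cases R grows more slowly than the age itself, so the expansion is decelerating.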
A Newtonian Analogy for the Dynamics
At first the dynamics seems arbitrary. Why should the different
universes have the properties they do? Why, for example, should
only a universe with greater mass density have a big crunch? And
why, with lesser mass density, will the expansion continue
indefinitely? We can make some sense of this with an analogy
from Newtonian theory.
There is a reason Newtonian theory can tell us something. Recall
that general relativity turns back into Newtonian theory as
long as we consider ordinary conditions: nothing moves quickly,
there are no strong gravitational fields and, most important here,
we consider small distances, not cosmic distances.
So it turns out that a tiny chunk of the cosmic fluid of a
Friedmann-Robertson-Walker spacetime is governed by Newtonian
principles. The easiest way to see those principles in action is to
consider a closely analogous system in Newtonian theory.
Imagine that we have a bomb in space that explodes. It will
spread debris into space. Each fragment in the debris will attract
all the others according to Newton's inverse square law of gravity.
The ultimate fate of the debris cloud depends on the balancing of
the initial magnitude of the explosion with the strength of the
gravitational attraction within the debris cloud.
If there is a greater amount of matter in the original lump, the
explosion will produce a denser cloud of debris. Its internal forces
of gravitational attraction will be strong enough to slow and halt
the initial outward motion of the explosion and draw the fragments
back together, bringing about a collapse. It corresponds to the
dynamics of cosmology I; there is a big bang and a big crunch.
If there is a lesser amount of matter in the original lump, the
explosion will produce a more dilute debris cloud whose internal
forces of attraction will not be sufficient to halt the initial outward
motion of the blast. That outward motion will continue indefinitely.
It corresponds to cosmology III; there is a big bang and no big
crunch.
We could imagine an intermediate case in which the explosion
is just energetic enough to fling the debris out of the reach of the
gravitational forces; any weaker explosion would fail
to prevent collapse. This corresponds to the intermediate
case of cosmology II.
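The three cases can be told apart by a simple energy balance. The sketch below, in Python, makes a simplification that is mine and not the text's: it follows a single fragment flung outward from a fixed central mass, rather than a whole self-gravitating cloud. The function names and the sample numbers are hypothetical. The fragment falls back, just escapes, or escapes with speed to spare according to whether its launch speed is below, at or above the Newtonian escape speed.

```python
G = 6.674e-11  # Newton's gravitational constant, m^3 kg^-1 s^-2

def escape_speed(mass_kg, radius_m):
    """Speed at which a fragment's kinetic energy just balances gravity."""
    return (2 * G * mass_kg / radius_m) ** 0.5

def fate(mass_kg, radius_m, launch_speed):
    """Classify a fragment flung outward from radius_m around mass_kg."""
    # Energy per unit fragment mass: kinetic minus gravitational binding.
    energy = 0.5 * launch_speed ** 2 - G * mass_kg / radius_m
    tolerance = 1e-9 * G * mass_kg / radius_m  # allow for rounding error
    if energy < -tolerance:
        return "I: expands, then collapses"
    if energy > tolerance:
        return "III: expands indefinitely"
    return "II: just escapes"

# Hypothetical lump: 10^12 kg, fragments launched from 100 m out.
v_escape = escape_speed(1e12, 100.0)
for factor in (0.9, 1.0, 1.1):
    print(factor, fate(1e12, 100.0, factor * v_escape))
```

For a fixed launch speed, adding mass to the lump raises the escape speed, which is why the denser cloud is the one that collapses.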
The Newtonian analogy is useful in so far as it gives us a nice
picture for the dynamics. But it omits a lot. There is no account
of the different spatial geometries and the big bang is the
explosion of a nugget of matter into a preexisting space. That is
not what is portrayed by relativistic cosmologies.
And Ours Is...
Which of these three cosmologies is ours? The question is not
without some interest. If it is the first cosmology I, then we live in a
finite space. If we point in any direction, after some finite
distance, we are pointing at the back of our own heads! Further,
just as the universe has a finite past bounded by the big bang, so
there is also an end in our future. The entire universe will collapse
down onto itself in a "big crunch". In cosmologies II and III neither
of these results obtains. Cosmology II, however, is the only one in
which the geometry of space on cosmic scales is Euclidean.
The value of the critical density is extremely small: 10^-29 grams
per cubic centimeter of space. That is
0.00000000000000000000000000001 grams per cubic
centimeter. That is very little indeed! It corresponds to only about 5
hydrogen atoms in a cubic meter of space. That sort of
vacuum is extremely hard to achieve with laboratory equipment on
earth.
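The count of hydrogen atoms can be checked in a few lines. This is a rough sketch in Python. The mass of a hydrogen atom is a standard value I have supplied, not one given in the text, and the variable names are my own. With it the count comes out at about six atoms per cubic meter, the same handful as the rough figure above.

```python
CRITICAL_DENSITY_G_PER_CM3 = 1e-29    # value quoted in the text
HYDROGEN_ATOM_MASS_G = 1.67e-24       # assumed standard value, not from the text
CM3_PER_M3 = 100 ** 3                 # cubic centimeters in a cubic meter

grams_per_cubic_meter = CRITICAL_DENSITY_G_PER_CM3 * CM3_PER_M3
atoms_per_cubic_meter = grams_per_cubic_meter / HYDROGEN_ATOM_MASS_G
print(f"{atoms_per_cubic_meter:.1f} hydrogen atoms per cubic meter")
```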
Here's another measure of how small it is. Take one fifth
of a teaspoon of water, which is roughly 20 drops. (That
amounts to one gram.) How widely spread must it be in
order to match the critical density? Guess! What if we
take those 20 drops and spread them over the volume of
the Astrodome? Not even close. Think bigger. What about
20 drops spread over the volume of earth? Better, but still
too small. That density is still 100 times too big. Those 20
drops of water need to spread over one hundred earth
volumes if their dilution is to match the critical density!
That, at least, is what my sums show. The radius of the earth is about
6,366,000 meters. So its volume is 1.08 x 10^21 m^3, which comes to 1.08
x 10^27 cubic centimeters. So one gram spread over this volume is still
roughly 100 times too dense.
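Those sums can be redone as follows. This is a sketch in Python that uses only the radius and the critical density quoted in the text and treats the earth as a perfect sphere; the variable names are my own.

```python
import math

EARTH_RADIUS_M = 6_366_000            # radius used in the text
CRITICAL_DENSITY_G_PER_CM3 = 1e-29    # value quoted in the text

volume_m3 = (4 / 3) * math.pi * EARTH_RADIUS_M ** 3
volume_cm3 = volume_m3 * 100 ** 3     # 10^6 cubic centimeters per cubic meter
density_one_gram = 1.0 / volume_cm3   # grams per cubic centimeter
excess = density_one_gram / CRITICAL_DENSITY_G_PER_CM3

print(f"volume = {volume_m3:.3g} m^3 = {volume_cm3:.3g} cm^3")
print(f"one gram over the earth's volume is {excess:.0f} times the critical density")
```

The excess comes out a little under 100, which is the "roughly 100 times too dense" of the text.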
Since this critical density is so small, you might think that our
universe must have an average density more than critical. That
would be jumping to conclusions. What counts is the density of
matter averaged over all space. So we need to take the matter
of earth and spread it over the vast emptiness of space between
stars and galaxies. And then the calculation gets more
complicated because of the steady accumulation of evidence that
a very substantial portion of the energy of our universe is "dark,"
so its existence is actually inferred indirectly from the gravitational
effects it produces.
The upshot is that the average density of matter comes out very
close to the critical. Indeed the astonishing and maddening result
is that the more accurately it is measured, the closer our density
gets to the critical value. So we remain unable to say which of the
cosmic scenarios above is our own.
The suspicion is growing that our density may be exactly the
critical density. It seems too much of a coincidence that of all
values our matter density could have, it just turns out to be so
close to the critical density. So the supposition is that there might
be some cosmic process that has driven the matter density to this
value. So-called "inflationary" cosmologies posit an early phase of
very rapid cosmic expansion that would have the effect of driving
the matter density towards the critical.
Is a Big Bang Inevitable?
The distinctive feature of big bang cosmology is the big
bang. We know it is there because, when we project
back the trajectories of the expanding galaxies, we see
that they all converge onto one point. It is somewhat
like a lens focussing the rays of light of the sun. The
rays emerge from the lens just perfectly aligned to focus
to a single infinitely bright spot.
Or that is what they would do in ideal circumstances. That is, if
the light rays falling onto the lens were perfectly parallel and the
lens perfectly constructed. In the real world, there are always
slight unevennesses and neither of these assumptions holds.
All that will happen is that the light is focussed to a very bright
spot, not a point of infinite intensity.
Might the same be true of the big bang? Friedmann-Robertson-Walker
spacetimes presume a perfectly
uniform matter distribution. That means that all the
motions of matter now are presumed symmetrically
arranged so that, from the uniformity alone, if we
project back into the past, we eventually come to a
singular point.
That perfect uniformity is an idealization we know is not
true of our world. While the matter of the universe might
be nearly uniform when seen on some cosmic scale,
locally it is far from uniform. When we allow for these
nonuniformities, might we have not a big bang, a true
singularity, but merely a region of spacetime with a lot of
near misses? So instead of the big bang, might we have a
temporary region of very high density? If that
happened, the big bang would no longer be the
beginning of time. There would be time and matter and
space before the big bang. The big bang would merely
be an extremely hot phase of highly compressed matter
and space in the overall history of the cosmos.
Might the big bang merely be an unrealistic artefact of
an unrealistic symmetry assumption? For the big bang
to be physically interesting to us, its existence must be
assured robustly by physical principles, not fragile
assumptions of uniformity.
A Theorem by Stephen Hawking
It was long supposed that nonuniformities might preclude a true
singularity. In the 1960s, when a group of mathematical physicists
turned to this and related issues, it was soon shown that this
supposition was wrong. There was a great deal more inevitability to
the big bang. The results were delivered in the form of
mathematical theorems within the framework of general relativity.
They show that, under quite broad conditions, even with
nonuniformity, a singularity is inevitable. Here is one of a number
of these theorems proved by Stephen Hawking.
IF
(a) the universe is expanding at some instant ("now");
(b) the rate of expansion is now everywhere greater than some fixed positive amount "K";
Note that this expansion need not be uniform. It can be high in one
place and low in another. It just must be everywhere greater than
some positive amount K. This "K" can be anything, but it must be
greater than zero.
(c) Einstein's gravitational field equations (without λ) hold with the "strong energy condition";
Recall that the notion of "matter density" in general relativity is
complicated. Energy contributes to it, but so do stresses. Indeed
stresses can be sources of the gravitational field. The "strong
energy condition" requires that the contribution to matter from
ordinary energy is greater than that from stresses. So this condition
says "not too much funny matter."
Notice that there is no condition that the matter distribution be
uniform or even that there be matter everywhere.
(d) there are no causally isolated pockets of spacetime in the universe;
The technical requirement is that the spatial slice "now" be a Cauchy
surface. A Cauchy surface has the property that every non-terminating
physical process, propagating at or less than the speed
of light, must pass through it exactly once. So any process in the surface's
past will end up intersecting it; and any process in its future must have passed through
it. Specifying the state of a Cauchy surface fixes the future and past geometric state of
the spacetime fully. That a spacetime have a Cauchy surface is a strong condition not
satisfied by many spacetimes. The hypersurfaces of simultaneity of a Minkowski
spacetime and the natural spatial slices of a Friedmann-Robertson-Walker spacetime
representing the "nows" are Cauchy surfaces.
THEN
no timelike world line can be extended indefinitely into the past; that is, no timelike curve has greater temporal length than 3/K.
The "THEN" conclusion in effect tells us that there is something
pathological going on in our past. If we try to trace the history of
a galaxy (one case of a timelike world line) indefinitely into the
past, something blocks it. Such world lines cannot extend back
arbitrarily. You might complain that this is an oblique way of
characterizing a singularity. That is true, but that is how these
matters are dealt with. Recall that a singularity is not a point in the
spacetime, so its identification will have to be indirect.
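To see what sort of number the bound 3/K delivers, here is an illustrative sketch in Python. The function name is hypothetical. The sample value of K rests on an assumption that is not stated in the text: that in a perfectly uniform model the rate of expansion K is three times Hubble's constant. On that assumption the bound reduces to the Hubble age.

```python
def past_length_bound(expansion_rate_k):
    """Upper bound 3/K on the temporal length of any timelike curve into the past."""
    if expansion_rate_k <= 0:
        raise ValueError("the theorem requires K to be greater than zero")
    return 3.0 / expansion_rate_k

# Assumption for illustration only: K is taken to be three times Hubble's
# constant, so a Hubble age of 14 billion years gives K = 3 / (14 billion years).
HUBBLE_AGE_YEARS = 14e9
k_per_year = 3.0 / HUBBLE_AGE_YEARS
print(f"bound = {past_length_bound(k_per_year) / 1e9:.0f} billion years")
```

A larger K means a faster expansion now and so a tighter bound on how far back any world line can reach.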
We can get a more intuitive sense of how the theorems work by
recalling the jigsaw puzzle analogy for solving Einstein's
equations. We specify the "now" part of spacetime in accord with
the "IF" conditions above. We then try to reconstruct the
spacetime of the past by solving Einstein's gravitational field
equations; that is, we start to put in the pieces of spacetime that fill
out the past. What we discover is that we can only keep adding in
pieces for some finite distance in time to the past. Then we can go
no further.
The strength of a theorem is that it is a mathematically proven
result. As long as the "IF" conditions are met, the "THEN"
conclusion is forced by mathematics alone. That is also the
weakness. The "IF" conditions may fail. Indeed stating a
theorem like this is an invitation to troublemakers to find failures of
the "IF" conditions.
They certainly can be found. If there are black holes, then the
causal niceness condition (d) is violated. Also, as we discover
more and more exotic forms of matter in the cosmos, we may
worry about the energy condition (c). Or, if Einstein's cosmological
constant λ is small but nonzero in a world with spherical spatial
geometry, then it turns out that we can have a cosmology with no
big bang and no big crunch. Space slowly collapses over time to
some minimum size and then expands out again. It is a single
gentle bounce.
What you should know
What our universe looks like on the largest scale.
The expansion of the galaxies and the Hubble law.
How Friedmann-Robertson-Walker spacetimes form the basis of modern big bang cosmology.
The three types of universes in FRW cosmology and what decides between them.
The Newtonian analogs for big bang cosmology.
The conditions under which a big bang singularity is inevitable.
Copyright John D. Norton. March 2001; January 2007, February 16, 23, October 16, November 10, 2008, March 31, 2010.
Black Holes
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/black_holes/index.html[28/04/2010 08:23:08 ﺹ]
HPS 0410 Einstein for Everyone
Black Holes
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
The Basic Idea
From Doubt to Observation
Stable and Unstable Systems
Gravitational Collapse is Not Self-Limiting
So what stops it?
Newtonian Black Holes and Escape Velocity
The Collapse of Stars
Newtonian and Relativistic Black Holes
Forming a Black Hole in General Relativity
Falling into a Black Hole
Tidal Forces
What You Should Know
The Basic Idea
Black holes are some of the most interesting pathologies in
space and time delivered by Einstein's general theory of
relativity. They form when matter collapses gravitationally
onto itself, such as when massive stars burn out. They are a
region of space where the gravitational pull is so strong that
nothing, not even light, can escape. Hence John Wheeler
called them "black holes." There is more. They incorporate
singularities in spacetime structure: points where Einstein's
theory breaks down, since the curvature of spacetime
becomes infinite. And they can supply bridges to new
universes.
We tend to associate black holes with Einstein's general
theory of relativity. Yet their origins lie firmly in classical,
Newtonian physics. They depend on a potentially
catastrophic instability that resides merely in the fact that
masses attract gravitationally and attract more strongly
the closer together they are. So once gravitational collapse
starts, it gets harder and harder to stop. This means that
fully collapsed bodies that don't allow light to escape are
already possible in Newtonian theory.
So far we only have a singular gravitational field. Newton's
space and time are unaffected. Now add the idea brought
by general relativity that gravity goes with a curvature of
spacetime. We find that a singularity in the gravitational field
corresponds to a singularity in the structure of space and time
itself.
From Doubt to
Observation
Black holes tend nowadays to be accepted as a routine part
of physical theory. That certainly was not always so.
Theorists of earlier decades viewed them skeptically. Peter
Bergmann, one of Einstein's assistants, remarked that
through such singularities, general relativity contains the
seeds of its own destruction. Einstein himself tried to
argue (unsuccessfully) that they could not form.
Now we are so confident that there are black holes that the
issue is not so much whether they exist, but where we should
point our telescopes to see one. They arise either as
collapsed stars or as the massive centers of galaxies. The
object Cygnus X-1 has long been a strong candidate for a
black hole. It is the unseen companion of a visible star, HDE
226868, that is 33 times as massive as our sun. The star orbits
around a second object so massive and compact that it must
be a black hole. That object is Cygnus X-1.
Here is an image of HDE 226868 and its invisible
companion taken with an optical telescope at the Palomar
Observatory (http://imagine.gsfc.nasa.gov/YBA/cygX1mass/cygX1image.html)
Here's an artist's impression of Cygnus X-1, whose
powerful gravity drags matter from its companion star into the
black hole accretion disk.
(http://agile.gsfc.nasa.gov/docs/objects/binaries/cygx1_artists.html)
The process is more dynamic. The two objects orbit each other with a period of
5.6 days, as this sped-up animation shows (http://heasarc.gsfc.nasa.gov/docs/binary.html).
The images above were drawn from the NASA website, http://www.nasa.gov/, January
21, 2007. NASA provides these images copyright free subject to the restrictions on
http://www.simlabs.arc.nasa.gov/copyright_info/copyright.html
Stable and Unstable Systems
Stable systems are inherently self-correcting. If something
disturbs them from their equilibrium state, they naturally
correct themselves and return to the equilibrium.
Consider a heavy office desk. Lift the corner and release it.
It falls back to its regular position. Its weight is so positioned
that it corrects deflections from the normal position. Things do
not spontaneously get hot. If we heat something, it tends to
lose heat faster as it gets hotter. Heat a potato in the oven and
we can only keep it hot by leaving it in the oven; it
spontaneously cools once we take it out.
An example closer to what is to come is electric charges.
They naturally tend to distribute themselves. If we clump a lot
of positive charges together, they repel, opposing the
clumping. The more we force them together, the stronger the
forces of repulsion become. The more they resist.
Unstable systems are not self-correcting under
perturbations. They are the opposite. If we disturb them, their
natural dynamics magnifies the disturbance.
Consider a tall, thin bookcase. It is not stable. If
we tip it away from its normal upright position
sufficiently far, it will fall. Once the center of the
weight passes the support, the weight no longer
acts to oppose the deflection; it reinforces it. The
deflection grows and the greater the deflection, the
faster it grows.
If temperature behaved similarly, our world would be very
different. Imagine that, once a body got hot enough, that
would somehow trigger it getting yet hotter, and so on
indefinitely. We would be standing forever on the brink of
possible runaway processes that yield infinite temperatures.
Gravitational Collapse is Not Self
Limiting
Now consider gravitational collapse. As Newton first told us
over 300 years ago, all bodies gravitationally attract all
other bodies. So matter naturally wants to clump together in
ever denser, smaller clumps. This process is gravitational
collapse.
It is not a self limiting process. The more matter clumps
together, the stronger become the forces that drive the
clumping. That is an immediate consequence of Newton's
inverse square law. The gravitational force between bodies
varies inversely with the square of the distance between
them.
As two bodies near and the distance between them
reduces from 3 to 2 to 1, the gravitational force pulling them
together increases ninefold: from 1/9 to 1/4 to 1.
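The arithmetic can be checked with a minimal sketch. It works in arbitrary units in which the force at distance 1 equals 1; the function name `relative_force` is chosen here purely for illustration.

```python
from fractions import Fraction

def relative_force(distance):
    """Inverse square law in units where the force at distance 1 equals 1."""
    return Fraction(1, distance ** 2)

distances = [3, 2, 1]
forces = [relative_force(d) for d in distances]

for d, f in zip(distances, forces):
    print(f"distance {d}: relative force {f}")

# Ratio of the final force to the initial force as the distance shrinks from 3 to 1.
growth = forces[-1] / forces[0]
print("growth factor:", growth)
```

Exact fractions are used so that the ninefold ratio comes out without rounding noise.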
The same thing happens if we have a large sphere of
matter collapsing gravitationally.
The figure shows the force on a unit of mass
located at the surface of the sphere. As the
size of the sphere reduces from 3 to 2 to 1, the
force on the unit mass at the surface increases
ninefold: 1/9 to 1/4 to 1.
For experts only: It is not immediately obvious that this
follows from Newton's inverse square law since the force
on the mass at the surface will be the total force due to all
the other masses in the sphere. These other masses are at
many different distances from the unit mass and the forces
due to each must be summed. A familiar theorem in
Newtonian mechanics tells us that the gravitational force of
masses in a sphere on bodies outside it is the same as the
force due to a point with equal mass located at the
sphere's center. So we can figure out the force on the unit
mass by pretending that the masses of the sphere are all
located at its center.
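The theorem can be checked numerically for a single thin shell. The sketch below is only an illustration under stated assumptions: units with G = 1 and shell mass 1, a unit mass strictly outside the shell, and a simple midpoint-rule sum over rings of the shell. The function name and parameters are hypothetical.

```python
import math

def shell_force_on_unit_mass(shell_radius, distance, shell_mass=1.0, G=1.0, steps=20000):
    """Sum the pull of a thin uniform shell on a unit mass located outside it.

    The shell is cut into rings labelled by the polar angle theta, measured from
    the line joining the shell's center to the unit mass. By symmetry only the
    force component along that line survives.
    """
    total = 0.0
    d_theta = math.pi / steps
    for k in range(steps):
        theta = (k + 0.5) * d_theta                      # midpoint of each ring
        ring_mass = shell_mass * 0.5 * math.sin(theta) * d_theta
        along_axis = distance - shell_radius * math.cos(theta)
        separation_sq = (distance ** 2 + shell_radius ** 2
                         - 2 * distance * shell_radius * math.cos(theta))
        total += G * ring_mass * along_axis / separation_sq ** 1.5
    return total

for d in (2.0, 3.0):
    summed = shell_force_on_unit_mass(1.0, d)
    point_mass = 1.0 / d ** 2                            # all mass placed at the center
    print(f"distance {d}: summed {summed:.6f}, point mass {point_mass:.6f}")
```

A solid sphere is a nest of such shells, so agreement for each shell carries over to the whole sphere.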
In short, if we have a cluster of masses that fall together
under their mutual gravitational attraction, those forces of
attraction will grow stronger as the masses come closer
together. There is nothing in the properties of gravity to
prevent the continued collapse. In this sense, gravitation
forever threatens a catastrophic, runaway collapse.
So what stops it?
The process of gravitational collapse just described is the
process by which stars, galaxies and planets are formed.
Cosmic debris, hydrogen or other elements, coalesce under
their own gravity to produce these celestial objects. What
prevents their collapsing to a point? Other forces intervene.
There are three types of forces that halt continuing
collapse. Each force has its limit.
Gravitational collapse of... | is halted by... | But that halting force can be overcome by...
Galaxies and planetary systems. Their matter forms great, orbiting swirls as they collapse together. | The orbital motion of stars in galaxies and planets in solar systems leads to centrifugal forces that prevent the stars and planets falling to the centers of the systems. | If these motions are lost due to collisions, collapse can ensue.
Stars. They become very hot as they form from gravitationally collapsing clouds of cosmic matter. | Their high temperatures yield high pressures that balance the continuing pull of gravity. | Stars are radiating away their heat. Even though nuclear reactions contribute more heat, eventually they will burn out and the stars will cool.
Planets. | The mechanical rigidity of the rocks and incompressibility of the molten core of rocky planets prevent further collapse. The gas pressure of gas giants prevents their collapse. | If the gravitational forces are strong enough because a lot of matter is collapsing, these mechanical forces can be overcome.
The table summarizes how three different effects prevent
complete gravitational collapse. The case of stars
needs a little more explanation.
Stars are huge spheres of gases, heated to very high
temperatures by the energy released in gravitational collapse
and then by thermonuclear reactions ignited by the rising
temperatures. Those high temperatures cause the gases to
expand. If those expanding gases were somehow trapped in
a chamber so the expansion was halted, very high
pressures would result. The gravitation of the star itself
provides just such a chamber. The gravitational attraction of
each part of the star for all other parts pulls back on the other
parts, stopping the expansion and producing high pressure.
The stability of the star consists in a perfect balance of the
outward pressure forces and the inward gravitational forces.
It is a temporary balance, since the nuclear fuel of the stars
will eventually burn out.
Newtonian Black Holes and
Escape Velocity
What makes black holes black is that light cannot escape
from them. In this regard, there is a simple analog to the
black holes of general relativity in Newtonian physics.
One way to gauge the intensity of the gravitational field of a
collapsing body is to determine its "escape velocity." If we
have an object on the surface of one of these entities and
hurl it straight up, the escape velocity is the minimum
speed it would need in a vertical direction to escape the
gravitational pull of the entity. On the surface of our earth,
that escape velocity is 11 kilometers per second vertically, a
quite prodigious speed.
That means that a body hurled directly upward at more
than 11 kilometers per second would escape the pull of the
earth's gravity with some speed to spare. An object hurled
upward at less than 11 kilometers per second would always
fall back. The closer the speed gets to 11 kilometers per
second, the higher the object would rise before falling back.
At 11 kilometers per second, the object has exactly the
minimum velocity needed to escape the earth; once it was far
away from the earth, all its velocity would be used up by the
escape and it would approach rest.
If the earth were to undergo gravitational collapse, that
escape velocity would increase as the collapsing earth's
size decreases. The increase is not as fast as you might
expect. It grows only with the square root of the change in
size. The table illustrates this growing.
Decrease size by ratio: | Increase escape velocity by ratio:
100 | 10
10,000 | 100
1,000,000 | 1,000
So reduce the radius of the earth by a factor of 100
(from 6500 kilometers to 65 kilometers) and the
escape velocity increases by a factor of the square
root of 100; that is, by a factor of 10. So it becomes
110 kilometers per second.
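The square-root scaling can be reproduced with a short sketch. It assumes the standard Newtonian expression for escape velocity, the square root of 2GM/r, and textbook values for the constants; these values and the function name are assumptions added here, not figures from the text, which rounds the earth's radius to 6500 kilometers.

```python
import math

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
EARTH_MASS = 5.972e24    # kg (assumed value)
EARTH_RADIUS = 6.371e6   # m (assumed value)

def escape_velocity(mass, radius):
    """Newtonian escape velocity in m/s from the surface of a sphere."""
    return math.sqrt(2 * G * mass / radius)

v_now = escape_velocity(EARTH_MASS, EARTH_RADIUS)
print(f"today: {v_now / 1000:.1f} km/s")

# Shrink the radius by the ratios used in the table and compare escape velocities.
for shrink in (100, 10_000, 1_000_000):
    v = escape_velocity(EARTH_MASS, EARTH_RADIUS / shrink)
    print(f"radius / {shrink:>9,}: escape velocity x {v / v_now:,.0f}")
```

The mass is held fixed while the radius shrinks, which is what collapse means here; only the ratio of radii matters for the growth factor.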
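The precise formula, if you want to know, is just $v_{esc} = \sqrt{2GM/r}$, where G is the universal constant of gravitation, M is the mass of the body and r is its radius.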
To see the formation of a Newtonian black hole, just continue
this collapse process, as in the figure. The Newtonian black
hole forms when the collapsing earth's radius passes 1/3".
That is an astonishingly small size into which all the matter of
the earth must be squeezed. But if it is done, the gravity at
the surface is so strong that an object hurled upward at
c=300,000 km/sec=186,000 miles/sec could only just escape.
Nothing traveling any slower could escape. If the collapse
continued to anything smaller than 1/3", then not even
objects moving at c could escape.
The object formed with radius 1/3"
from the earth is a Newtonian black
hole. It has many properties in
common with the black holes of
general relativity. Most importantly,
they both have the same size.
As the collapse proceeds, energy is
released in larger and larger
amounts. If the collapse continued to
a point, an infinite amount of energy
would be released (but would not be
carried off by light moving at c!).
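To see why the Newtonian energy release has no upper bound, here is a rough sketch under an added assumption that the text does not make: the collapsing body is treated as a sphere of uniform density with mass M and radius R. The gravitational energy released in bringing its matter together is then

```latex
\[
E_{\text{released}}(R) = \frac{3}{5}\,\frac{G M^{2}}{R},
\qquad
E_{\text{released}}(R) \to \infty \quad \text{as } R \to 0 .
\]
```

Halving the radius doubles the energy released, so a collapse that continues all the way to a point releases an unbounded amount.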
For experts only. The formula for the size $r_{bh}$ of a Newtonian black hole is easy to compute. The potential energy of a mass m on the surface of a sphere of mass M is just $-GMm/r_{bh}$, where G is the universal constant of gravitation. If the mass m moves at speed c directly upwards, it has kinetic energy $(1/2)mc^2$. If the mass is just able to escape, these two energies must sum to zero. Solving we find
$r_{bh} = 2GM/c^2$
Curiously, this is the same formula as general relativity gives for the "Schwarzschild radius" that designates the event horizon of a general relativistic black hole.
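The formula r_bh = 2GM/c^2 is easy to evaluate. The sketch below uses textbook values for G, c and the masses of the earth and the sun; those constants and the function name are assumptions supplied here for illustration.

```python
G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8             # speed of light, m/s (assumed value)
EARTH_MASS = 5.972e24   # kg (assumed value)
SUN_MASS = 1.989e30     # kg (assumed value)
METERS_PER_INCH = 0.0254

def black_hole_radius(mass):
    """Radius r_bh = 2GM/c^2 in meters for a body of the given mass in kg."""
    return 2 * G * mass / C ** 2

earth_r = black_hole_radius(EARTH_MASS)
sun_r = black_hole_radius(SUN_MASS)
print(f"earth: {earth_r / METERS_PER_INCH:.2f} inches")
print(f"sun:   {sun_r / 1000:.2f} km")
```

With these inputs the earth's radius comes out at roughly a third of an inch and the sun's at close to 3 kilometers.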
The existence of the Newtonian black holes and their
similarities to the black holes of general relativity is striking.
However the significance of their similarities should not be
overestimated.
Light cannot escape a Newtonian black hole only if a
particular way of escaping is chosen and particular
assumptions are made about light: that the light can only
escape if it is shot directly upwards like a stone from a
catapult and if the maximum speed it has when released from
the catapult is c. Newtonian physics would allow things to
escape the Newtonian black hole by gentler means.
Imagine a rocket ship that fires its motors so as to generate
an upward acceleration that is greater than the attraction of
gravity. As long as that upward acceleration just exceeds that
of gravity, the rocket ship would gently rise and escape. That
is not possible, as we shall see, for a black hole in general
relativity.
The Collapse of Stars
We need not fear that our planet earth will undergo
gravitational collapse. The mechanical incompressibility of
rock is an enduring feature.
It is not so for stars, such as our sun. The gas pressure
that resists collapse depends on the high temperature of the
star. Stars radiate and constantly lose the energy that
sustains the high temperature. That energy is resupplied by
nuclear reactions in the star. When stars form initially from
collapsing clouds of hydrogen gas, the hydrogen fuses to
form Helium, liberating vast amounts of energy. After that,
more fusion reactions occur producing more elements.
These reactions cannot proceed indefinitely. Eventually the
nuclear fuel will be spent and the gases will start to cool.
When they do, the pressure produced by the high
temperature will drop as well. And when that happens, the
balance of inward gravitational force and outward pressure
will be disrupted in favor of the gravitational forces. The star
will begin to collapse in onto itself. The stability of stars is
only a temporary circumstance.
What happens next is not so simple. There are many
possibilities and astrophysicists have developed detailed
histories of how different stars will fare under gravitational
collapse.
The most important factor in deciding their fate is the mass of
the star. The unit commonly used is "solar mass": the mass
of the star in relation to our sun's mass. "Two solar masses"
means twice the mass of our sun. Smaller stars tend to burn
out quietly, larger stars are more likely to collapse
catastrophically and produce a black hole. The table
summarizes some of the major trends.
Stars of... | In gravitational collapse...
Less than 1.3 solar masses | Form black dwarves.
2 to 3 solar masses | Form neutron stars ("pulsars"); or may fragment in supernova explosions.
More than three solar masses | Nothing halts gravitational collapse; black holes form.
While the eventual fate of our sun is clear, we are in no
immediate danger. The time required for these processes is
of the order of billions of years. Our sun has been gently
burning its hydrogen for 4.5 billion years and will continue to
do so for another 5 billion.
One might expect that larger stars would live longer since
they have more fuel. However the reverse is true. They burn
their nuclear fuel even faster so they have shorter lives.
Newtonian and
Relativistic
Black Holes
Black holes can form in both Newtonian theory and general
relativity. However there are significant differences between
them.
Newtonian black hole | Singular point of infinite matter density and field strength. | Space and time unaffected. | Infinite energy is released in the collapse that forms the black hole.
General relativistic black hole | Singularity in spacetime curvature. | Causal structure of space and time affected; there are causally isolated regions of space and time. | Finite energy is released in the collapse that forms the black hole.
A Newtonian black hole is less radical than a relativistic black
hole in so far as the Newtonian black hole involves no
disturbance to space and time. Such a disturbance is
inevitable in general relativity, since Einstein's gravitational
field equations connect matter density and gravitation to
spacetime geometry. So if the matter density and
gravitational field become singular, we should expect similar
pathologies in space and time.
A Newtonian black hole, however, is far more radical than a
relativistic hole in another sense. In formation, a fully
collapsed Newtonian black hole must shed an infinite
amount of energy. While we talk of infinities all the time, we
should not be casual about such an amount. The release of
an infinity of energy in our neighborhood would overwhelm
everything. In general relativity, the formation of a black hole
does not call for an infinity of energy to be released.
Forming a Black Hole in General
Relativity
Let us now trace how spacetime is affected by the formation
of a black hole in general relativity. The spacetime diagram
below shows a sphere of matter undergoing gravitational
collapse. It is the simplest case of an uncharged, non-rotating
sphere of matter and produces a so-called
"Schwarzschild" black hole.
At the bottom of the figure is a spatial slice of a
fairly ordinary spacetime, in which a sphere of
matter begins its gravitational collapse.
The collapse continues as we proceed up the
figure. The sphere becomes smaller and
smaller, until eventually it is so small and
dense and its gravity so strong that not even
light can escape its surface. That is the
formation of a black hole and it happens at a
radial position known as the "Schwarzschild
radius." For an object with the mass of the earth, we
already saw that radius is 1/3". For an object
with the mass of the sun, it is 2.95 km. (Note that neither
the earth nor the sun has enough mass to overcome stabilizing
forces and produce a black hole.)
The radial position from where light can no longer
escape is called the "event horizon." It is an
important boundary in spacetime. Outside the
event horizon, rapidly moving bodies that have
strayed too close to the black hole can still
escape, if they can move fast enough. Once
they stray within the event horizon, no escape
is possible. The fastest speed relativity theory
admits, that of light, is no longer enough to
allow escape.
Once the collapsing matter has collapsed within
the event horizon, the collapse continues all the
way to zero size. What results is a point of
infinite matter density and therefore a point of
infinite spacetime curvature. It is a singularity.
Once these two quantities have become infinite,
Einstein's gravitational field equations have
ceased to function; the theory breaks down.
Within the event horizon, all motion of matter
and light is towards that singularity. It is
everyone's future. In this sense, the directions
of space and time are switched within the
event horizon. Time now points towards the
singularity, for that is everyone's future.
In a Minkowski spacetime, the light cones mapped out the
possible motions and the possibilities for causal connections. In
that spacetime, the lightcones were uniformly distributed in
spacetime, with no regions of spacetime causally distinct from
others. In a black hole, it is otherwise.
In a black hole spacetime, light cones far away from the event horizon are oriented as expected. As we near the event horizon, the light cones tip over to face the singularity. At the event horizon itself, the light cones have tipped over so far that only motions faster than light can escape falling into the singularity. Within the event horizon, the futures of the light cones all point towards the singularity.
[Figure: light cones in a black hole spacetime; axes labeled "space" and "time."]
A presumption in the literature on black holes is that nothing
travels faster than light. We noted earlier that relativity
theory does in principle admit faster than light motions; a
particle executing them would be a tachyon. However no
such particle has been detected.
When we look at these diagrams, it is clear that the event
horizon marks a special boundary in spacetime. It marks the
point of no return for travelers falling into the black hole.
However there is nothing special, locally, at the event horizon
that is different from neighboring events. As the traveler
passes the event horizon, there are no special flags or
markers that the traveler sees. Spacetime around the event
horizon will be highly curved but otherwise no different from
the spacetime on either side. In brief, the traveler "feels no
bump" when the event horizon is passed.
The event horizon gets its special properties from its relation
to the global structure of the spacetime and specifically to
the singularity and the exterior of the black hole. For it marks
the boundary past which a traveler's future must lie in the
singularity and can no longer lie in the exterior of the black
hole. It is something like the position computed by
demographers called the "mean center of the US population."
When you stand at that position, which is somewhere in
Missouri, the average displacement to all people in the US,
counting direction as well as distance, comes out to zero.
Move an inch to the west and you are now
on average closer to people in the west; move an inch to the
east and you are now on average closer to people in the
east. Of course there is nothing local about the position in
Missouri that gives it this property. It is the relation between
that position and all the people spread out over the US.
Falling into a Black Hole
What would it be like to fall into a black hole? In brief, it would
be a mistake one would not want to repeat, although you
would not get the chance to repeat it!
The figure shows the worldline of a planet near a black hole
and the worldline of a spaceship that imprudently came
too close to the black hole and fell in.
The first part of the curve represents that portion of the
spaceship's motion that an observer outside the black hole
would see. The spaceship would fall rapidly at first towards
the black hole. However, as it got closer, it would slow and
eventually freeze just outside the event horizon. In the entire
lifetime of the outside observer, the body would never
actually reach the event horizon. That would be true even if
the planet observer lived and observed indefinitely.
All this would be hard to see. Other objects falling into the
black hole would be emitting bursts of radiation that would
blanket the observer's view. Also, as the spaceship gets
closer to the event horizon, light from it would be ever more
red shifted and thus weakened and dimmed.
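The slowing and reddening can be given a rough numerical feel. The sketch below uses the standard general relativistic result for a clock held at a fixed radius r outside a Schwarzschild black hole: it runs at a rate of sqrt(1 - r_s/r) relative to a distant clock, where r_s is the Schwarzschild radius. This formula is not stated in the text, and it is a simplification of the case described: the spaceship is falling rather than hovering, and its motion adds a further Doppler shift that the sketch ignores.

```python
import math

def static_clock_rate(r_over_rs):
    """Rate of a clock hovering at radius r relative to a distant clock.

    r_over_rs is the radius in units of the Schwarzschild radius (must be > 1).
    """
    if r_over_rs <= 1:
        raise ValueError("a clock can hover only outside the event horizon")
    return math.sqrt(1 - 1 / r_over_rs)

for r in (10, 2, 1.1, 1.01, 1.0001):
    rate = static_clock_rate(r)
    # Light emitted at r arrives far away with its wavelength stretched by 1 / rate.
    print(f"r = {r:>7} r_s: clock rate {rate:.4f}, wavelength stretched x {1 / rate:.1f}")
```

The rate drops towards zero as r approaches the event horizon, which is the sense in which the distant observer sees the motion freeze and the light dim.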
The view of this journey from the spaceship falling in would be
quite different. The outside world would appear to speed up and
huge amounts of outside time would elapse in the short time
the spaceship would take to reach the event horizon. The
observer would pass the event horizon feeling no bump in
spacetime at all. The final stage of the journey would be
completed extremely rapidly as the spaceship reached the
singularity.
The diagram is not a good representation of the journey. The
spaceship's trajectory is shown in two disconnected parts.
It is not merely that the figure is truncated at the top. If we
drew extensions of the figure upward, the two parts of the
trajectory would never connect, no matter how big the
extension. This is a seriously misleading aspect of the figure.
In fact the worldline is a continuous trajectory in spacetime
with no break.
Tidal Forces
The experience for the occupants within the spaceship falling
in would be rapid. But it would not be pleasant. The reason
is tidal forces.
When we stand on the earth, our feet are slightly closer to
the center than our heads. So the gravitational force on our
feet is slightly greater than on our heads. These differences
of forces are known as tidal forces since they also produce
the earth's tides. On earth the effect is so slight that we
cannot perceive it.
If we were to fall feet first into a black hole, it
would be quite different. Once we are close to
the black hole, the gravitational pull at our feet
would be very much greater than at our heads.
We would literally be pulled apart by the
difference, as cruelly as if we were placed on a
medieval torture rack that stretched our feet from
our heads.
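The contrast between the two cases can be estimated with Newtonian gravity alone. The sketch below is a rough illustration, not a relativistic calculation: it assumes a 2 meter tall person, textbook values for the constants, and a hypothetical collapsed body of three solar masses approached to within 100 kilometers of its center. All of these figures are assumptions added here.

```python
G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
EARTH_MASS = 5.972e24   # kg (assumed value)
EARTH_RADIUS = 6.371e6  # m (assumed value)
SUN_MASS = 1.989e30     # kg (assumed value)

def head_to_foot_difference(mass, feet_distance, height=2.0):
    """Difference in gravitational acceleration (m/s^2) between feet and head.

    feet_distance is the distance of the feet from the center of the body;
    the head is farther out by the person's height.
    """
    pull_at_feet = G * mass / feet_distance ** 2
    pull_at_head = G * mass / (feet_distance + height) ** 2
    return pull_at_feet - pull_at_head

on_earth = head_to_foot_difference(EARTH_MASS, EARTH_RADIUS)
near_hole = head_to_foot_difference(3 * SUN_MASS, 100_000.0)

print(f"standing on earth:        {on_earth:.2e} m/s^2")
print(f"100 km from 3 solar mass: {near_hole:.2e} m/s^2")
print(f"ratio: {near_hole / on_earth:.1e}")
```

On earth the difference is a few millionths of a meter per second squared; near the collapsed body it is more than a hundred billion times larger.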
In science fiction movies of spaceships falling
into black holes, these tidal forces are usually
represented as rather interesting optical effects,
somewhat like looking at a distorted reflection in
a fun house mirror. The voyagers emerge merely
ruffled as if they had enjoyed a rather energetic
roller coaster ride. In gruesome reality, they
would be lucky to be anything other than a bloody
pulp.
These tidal forces may seem familiar. Recall that we gauged
the curvature of spacetime sheets by the extent to which
freely falling bodies converged or diverged. These
convergences and divergences are experienced by the falling
bodies as driven by external forces and they are the tidal
forces in question here.
They are called "tidal" because they are responsible for the ocean tides on
earth. Our planet is in free fall toward the moon. The portion of the ocean's
water closer to the moon experiences a stronger attraction than the portion
furthest away. That elongates the waters that jacket the earth into two lobes.
The earth rotates under these two lobes, once every day, producing two high
tides.
What You Should Know
The self-accelerating character of gravitational collapse
and why stars are prone to it.
What black hole formation looks like in Newtonian
gravitation theory and how that differs from the
relativistic case.
The layout of a Schwarzschild black hole: the singularity,
the event horizon and how the light cones are arranged
around them.
How the event horizon marks a point of no return.
What it would be like to watch someone fall into a black
hole.
What it would be like to fall in.
Copyright John D. Norton. March 2001, October 2002; February 8, 2007, February 23, 2008; April 20, 2010.
Picturing Black Holes
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/black_holes_picture/index.html[28/04/2010 08:23:26 ﺹ]
HPS 0410 Einstein for Everyone
A Better Picture of Black Holes
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Why We Need a Better Diagram for Visualizing Black Holes:
Conformal Diagrams
An Analogy to Perspective Drawings
A Conformal Diagram of a Minkowski Spacetime
A Conformal Diagram of a Black Hole formed from
Collapsing Matter
Conformal Diagram of a Fully Extended, Schwarzschild
Black Hole
Einstein-Rosen Bridges
More...
What You Should Know
Why We Need a Better Diagram for
Visualizing Black Holes: Conformal
Diagrams
The spacetime diagram we used so far for visualizing black
holes is not a very good representation of a black hole. It
cannot represent the continuous spacetime trajectory of a body
falling in as a continuous curve. There is no point in it at which
the body is at an event on the event horizon. It does not even
show all the structures present in a black hole. There are other
parts to spacetime we do not see on it.
The diagram also breaks with our familiar slogan "time
goes up, space goes across." Inside the event horizon for this
figure, time goes across in the sense that horizontal lines
pointing towards the singularity are future directed timelike
curves.
There is a better way of representing the black hole. It is to
use a conformal diagram that brings in infinities and represents
them as points on the diagram. These diagrams will include
purely fictitious points like the end points of the timelike
worldlines of objects that persist for infinite time. There is no
such end point, but they will be on the diagram and prove very
useful to us.
An Analogy to
Perspective
Drawings
At first the idea of a diagram that represents such fictitious
points at infinity might seem a little mysterious. But the idea is
actually quite familiar in another context, that of perspective
drawing. Imagine an infinite, two dimensional Euclidean plane
crisscrossed by a grid of lines. An ordinary drawing, looking
straight down from overhead, cannot capture more than a small
portion of the plane.
We know some properties of the grid. All the north-south lines
are parallel and never meet. They just go off to a north and a
south infinity. Sometimes we say that these parallel lines
"meet" at infinity. The talk suggests a kind of Valhalla for
valiant, but departed lines where they all finally meet to
celebrate battles lost and won.
Of course we don't intend that "meet at infinity" talk to be
taken literally. There is no place at infinity where the lines meet.
All we really mean is that the lines go off indefinitely in the same
northerly (or southerly) direction. The place at infinity where we
imagine them meeting is just a reification of the idea that they
persist indefinitely in going in the same direction.
The north-south lines are crossed by east-west lines. An
analogous story can be told for them. They are parallel and
"meet" at a different infinity. This "meet at infinity" talk is just a
way of reifying the different directions in which two sets of lines
persist in moving.
Now imagine that we move our
gaze towards the horizon from
our overhead position. We can now
see an infinite portion of the plane.
If we make a drawing of what we
see, we can see the entire length of
infinite lines, or at least one half of
them extending infinitely.
In this new perspective drawing, we can actually see the
points at infinity at which the parallel lines meet. These
points at infinity lie on the horizon. All the north-south lines
meet at one point on the horizon. It is the North vanishing point,
but let us call it the "north infinity." All the east-west lines meet
at a different point on the horizon. It is the East vanishing point,
but let us call it the "east infinity."
Of course no one thinks these points represent a real place on
the plane. The fact that a line in the figure actually connects to
the north infinity point just encodes the fact that the line really
keeps going north indefinitely. Analogously, the fact that a line
in the figure connects to a different point at infinity, the east
infinity, just encodes the fact that the line really keeps going
indefinitely in a different direction, east. In short, the point on
the horizon is a fictitious point that represents the infinity
never actually reached by the lines.
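How the fictitious points arise can be sketched with a simple pinhole projection. The setup is an assumption made here for illustration: the viewer stands at the origin of the plane at height 1, looks towards the northeast, and each point of the plane is drawn at picture coordinates (u, v), with the horizon along v = 0. The function name `project` is hypothetical.

```python
import math

def project(east, north, camera_height=1.0, heading_deg=45.0):
    """Picture coordinates (u, v) of a ground point seen by a viewer at the origin.

    heading_deg is the viewing direction measured from north towards east.
    The horizon is the line v = 0; points on the ground have v < 0.
    """
    phi = math.radians(heading_deg)
    depth = east * math.sin(phi) + north * math.cos(phi)    # distance ahead of the viewer
    if depth <= 0:
        raise ValueError("point is behind the viewer")
    lateral = east * math.cos(phi) - north * math.sin(phi)  # distance to the viewer's right
    return lateral / depth, -camera_height / depth

# Follow two different north-south lines (east = 2 and east = -3) ever farther north.
for north in (10.0, 1e3, 1e6):
    print(north, project(2.0, north), project(-3.0, north))

# Follow an east-west line (north = 5) ever farther east.
for east in (10.0, 1e3, 1e6):
    print(east, project(east, 5.0))
```

Both north-south lines close in on the same picture point on the horizon, the "north infinity," while the east-west line closes in on a different one, the "east infinity." No point of the plane is ever drawn exactly there.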
A traditional perspective drawing does
not show us the full, infinite plane. For
that, we need an even more distorted
image, an overhead view such as
produced by a camera with a "fisheye"
lens.
Finally perspective drawings must pay a price for being able
to show the infinite length of a line. In the overhead view, a
drawing can be scaled properly. That means that one inch on
the drawing can always correspond to one mile in the real
plane. This cannot be done in a perspective drawing. A length
of one inch in one part of the perspective drawing might
correspond to one mile in the real plane. But one inch in a part
of the drawing close to the points at infinity might represent a
much greater, even infinite, distance in the real plane.
A Conformal Diagram of a
Minkowski Spacetime
Just like the overhead view of the north-south and east-west lines, the
spacetime diagrams we have used so far for a Minkowski spacetime
only show a small portion of the spacetime. We can also have a diagram
in which the points at infinity become visible. We used a
perspective transformation before. This time we shall use a "conformal"
transformation to produce a "conformal diagram."
A conformal transformation has the property of leaving lightlike curves unaffected, but stretching and shrinking times and spatial distances. We need not pursue the messy details here.
Recall that a timelike geodesic
is just a point moving inertially
and a spacelike geodesic is just
the familiar straight line of
ordinary geometry. A lightlike or
null geodesic is the curve traced
by a light pulse moving freely.
A Minkowski spacetime has many different sorts of infinities. They
come from the types of curves in the spacetime. It has timelike, spacelike
and lightlike geodesics. Each has its own infinity. Note as before
that these infinities in the diagram are fictitious points. There is no
point in spacetime corresponding to them, just as there is no point in
space corresponding to the vanishing point of a perspective drawing.
First, here is the conformal
diagram of a Minkowski
spacetime. This is the
complete spacetime. It includes
all of the infinity of space and
the infinity of time through
which things persist.
This diagram gives the simplest
case in which we consider just
one dimension of space.
Note the three types of
infinities: timelike, lightlike and
spacelike. They correspond to
the different vanishing points in
an ordinary perspective
drawing.
Let us investigate each in turn.
Here is an ordinary Minkowski spacetime with
the timelike geodesics that stretch from the
infinite past to the infinite future. The diagram can
only show a finite portion of each geodesic.
Here are the same timelike geodesics shown on the conformal
diagram. We can now see them in their entirety, stretching from past
timelike infinity $i^-$ to future timelike infinity $i^+$.
It is important to note that the distances along the curves do not
represent properly scaled times elapsed. A 1/4" length of the
timelike geodesic in the middle of the curve might represent a day of
elapsed time. The final 1/4", at the end of the geodesic where it joins
$i^+$, corresponds to an infinity of time elapsed.
Note that we can only be assured that timelike geodesics corresponding to
unaccelerated motions will stretch from past timelike infinity i⁻ to future
timelike infinity i⁺. If a timelike curve has sufficient acceleration, it can
originate or terminate in the lightlike infinities. While such timelike curves
are possible, they are exceptional cases. Most ordinary timelike curves that
represent ordinary motions will terminate in the two timelike infinities.
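The squeezing of an infinite time into a finite length can be tried out numerically. The sketch below assumes one standard way of building such a diagram (applying arctan to the light-cone coordinates t + x and t - x, in units where c = 1); the figures in this chapter may have been drawn with a different rescaling, and the function name compactify is made up for this illustration.

```python
import math

def compactify(t, x):
    """Map a Minkowski event (t, x), in units with c = 1, to diagram coordinates (T, X).

    Assumed map (one standard choice, not necessarily the one used for the
    figures): squash the light-cone coordinates t + x and t - x with arctan.
    """
    u = math.atan(t + x)
    v = math.atan(t - x)
    return (u + v) / 2.0, (u - v) / 2.0

# A point at rest at x = 0: equal steps of elapsed time t take up less and
# less height T on the diagram as t grows.
for t in [0.0, 1.0, 10.0, 100.0, 1000.0]:
    T, X = compactify(t, 0.0)
    print(f"t = {t:7.1f}  ->  T = {T:.4f}, X = {X:.4f}")

# With this map, future timelike infinity i+ sits at the finite height T = pi/2.
print("height of i+ on the diagram:", math.pi / 2)
```

With this choice of map, every inertial worldline creeps toward the same finite point as t grows without limit, while a light ray keeps T - X fixed and so stays on a 45° line.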
Here are lightlike geodesics on a
spacetime diagram. They stretch from the
infinite past to the infinite future. Only a
small portion of each full curve can be
shown.
Here are the same lightlike geodesics displayed in their entirety on
a conformal diagram. They extend from past lightlike infinity to future
lightlike infinity. Note that these infinities are not just points, but a
complete line, rather like the line of the horizon in a perspective
drawing.
These infinities are sometimes called "null" infinity and lightlike curves,
"null" curves, since the time elapsed along a lightlike curve is zero,
that is, "null."
The symbol for lightlike infinity is a script i, which looks like ℐ. To
some people this looks like a curly J. Since a script i is hard to render
in html, it is often called "scri."
Finally, a Minkowski spacetime is populated
with spacelike curves as well. They form
the spacelike hypersurfaces that are the
"nows" of the spacetime.
They extend from infinity to infinity and only a
small portion of each surface can be shown.
Here are spacelike curves taken from these spatial hypersurfaces.
They are shown in their entirety and stretch from one spacelike infinity
i⁰ to another spacelike infinity i⁰.
Distances in the figure no longer correspond to properly scaled
distances in space. 1/4" at the center of the figure on one of these curves
may correspond to a mile; the last 1/4" of the curves, where they
connect to i⁰, corresponds to an infinite distance.
That light always travels at the same speed c is
encoded into the diagrams by the particular
fact that all lightlike geodesics are oriented at
45° to the vertical. This important
geometric fact is shown in the figure.
That timelike geodesics represent points
moving at less than the speed of light has a
similar geometric expression.
Such curves are, at every event, pointed in a
direction that makes an angle of less than
45° with the vertical.
Since timelike geodesics generally change
their direction from event to event in a
conformal diagram, we need to be a little
clearer about what this means.
It means this. Take any event on the
worldline, such as shown in the figure. Bring a
straight edge to the event so that the straight
edge is tangent to the curve at that event.
Then that straight edge must make an angle
of less than 45° with the vertical.
This must be true of every event on the
timelike geodesic.
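The straight-edge test can be imitated on a sampled curve. The following is only a sketch: it treats each short segment between neighboring sample points as a stand-in for the tangent at an event, and the sample curves and function names are invented for the illustration.

```python
import math

def angles_from_vertical(points):
    """Angle in degrees between each segment of a sampled worldline and the vertical.

    `points` is a list of (X, T) diagram positions ordered by increasing T.
    Each short segment stands in for the tangent "straight edge" at an event.
    """
    angles = []
    for (x0, t0), (x1, t1) in zip(points, points[1:]):
        angles.append(math.degrees(math.atan2(abs(x1 - x0), t1 - t0)))
    return angles

def is_timelike(points):
    """True if every segment is tilted less than 45 degrees from the vertical."""
    return all(angle < 45.0 for angle in angles_from_vertical(points))

# Hypothetical sample curves, made up for illustration.
slow_traveler = [(0.0, 0.0), (0.1, 1.0), (0.4, 2.0), (0.5, 3.0)]
too_fast = [(0.0, 0.0), (0.1, 1.0), (1.6, 2.0)]

print(is_timelike(slow_traveler))  # every segment is within 45 degrees
print(is_timelike(too_fast))       # the last segment tilts past 45 degrees
```

A finer sampling of the curve gives a better approximation to the tangent at each event.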
The important properties of a conformal diagram are
threefold:
Time once again always goes up in the figure; and space
goes across.
Lightlike curves are always at 45°. The light cones no longer
tip over in the figure. Timelike curves are always directed at
less than 45° with the vertical; and spacelike curves are always
at greater than 45° with the vertical.
The same intervals on the figure no longer correspond to the
same times elapsed and spaces covered. An interval of, say,
one inch on a timelike curve in the middle of the diagram might
correspond to one day of elapsed time. The last inch of the
timelike curve terminating in i⁺ corresponds to an infinite time
elapsed.
A Conformal Diagram of a Black Hole formed from Collapsing Matter
It is not at all obvious that we have gained anything with the
conformal diagram of a Minkowski spacetime; we just seem to
be saying what we already knew in a new, unfamiliar way. In
the case of a black hole, however, the conformal diagram
makes it easy to see properties of the spacetime that were
formerly quite hard to see.
Here is the conformal diagram of a black hole that has been
produced by collapsing matter. This is a black hole of the
simplest type, one associated with a Schwarzschild
spacetime. This black hole has no electric charge and no
angular momentum (i.e. it isn't spinning). You'll see
immediately how unrealistic that is. Any collapsing cloud of
matter is likely to have a very complicated structure and
certainly not be so perfectly symmetric that it collapses without
turning. However the simplicity makes it easy for us to see its
properties.
To read the diagram, start at the bottom.
The worldlines of collapsing matter come out
of past timelike infinity, i⁻. As they proceed
upward through time, they collapse onto
themselves.
They have formed a black hole when they have
collapsed sufficiently to generate an event
horizon, which is indicated as the line at 45° in
the upper part of the diagram. Then all the
matter ends in the singularity, which is a
horizontal line at the top of the figure.
The novelty is that the spacetime is now
divided into two regions, marked I and II.
Region I is an ordinary spacetime, in fact the
familiar Schwarzschild spacetime we've spent
so much time looking at. It has the familiar
timelike, lightlike and spacelike infinities.
Region II is the region inside the black
hole, past the event horizon.
Let us now add the worldlines of a planet and a traveler
that leaves the planet and falls into the black hole.
The planet's worldline originates in past timelike infinity, i⁻,
and terminates in future timelike infinity, i⁺. The planet has a
calm, infinite life.
The traveler leaves the planet and passes from region I into
the black hole region II and into the singularity.
The event horizon marks the boundary of no
return. We can see that by recalling that a traveler
must always travel at less than the speed of light.
That is, the traveler's worldline must always be at
less than 45° to the vertical.
That means that the traveler can always pull away
from the black hole, as long as the event horizon
has not been passed.
If the traveler strays close to the event horizon at A, or
closer at A', or even closer at A'', it is evident from
the geometry of the figure that the traveler can
always find a trajectory that will end in future
timelike infinity. For the event horizon itself is
a boundary curve at 45° to the vertical.
Once the event horizon is passed (say the traveler
is at event B) it is too late. All trajectories at less
than 45° with the vertical terminate in the
singularity.
Take a moment; stare at the figure; and convince
yourself that all this is correct. Remember the key
fact: the traveler's worldline must everywhere make
an angle of less than 45° with the vertical.
Finally we can use the conformal diagram
to trace the signals sent by the traveler
back to the planet.
The diagram shows the traveler sending
out light signals, which propagate along the
45° trajectories light follows.
The diagram shows that only light signals
emitted before the traveler passes the
event horizon will make it to the planet.
Once the traveler has passed the event
horizon, all the light signals will end up in
the singularity.
Turning it around, if a planet observer waits
and watches while the traveler falls in,
the rate of signals received will slow down.
The planet observer would need to live for
all eternity to intercept all the
signals sent out by the traveler before the
traveler reaches the event horizon.
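The reasoning about signals can be mimicked with a little coordinate geometry. What follows is a toy sketch only. It assumes a schematic layout that is not taken from the figures: the event horizon is the line T = X, the singularity is the horizontal line T = 1, and future lightlike infinity of region I is the line T = 2 - X, so that i⁺ sits at (1, 1). The function name and labels are invented, and the inputs are assumed to lie below the singularity.

```python
def outgoing_signal_fate(X0, T0):
    """Follow a light signal sent up and to the right at 45 degrees from the
    event (X0, T0) on a schematic diagram.

    Assumed layout, for illustration only: the event horizon is the line
    T = X, the singularity is the horizontal line T = 1, and future
    lightlike infinity of region I is the line T = 2 - X.
    """
    offset = X0 - T0  # stays constant along a 45 degree ray
    if offset > 0:
        # Emitted outside the horizon: the ray meets the line T = 2 - X.
        X_end = (2 + offset) / 2
        return "reaches region I", (X_end, 2 - X_end)
    # Emitted on or inside the horizon: the ray meets T = 1 first
    # (a ray emitted exactly on the horizon runs along it to (1, 1)).
    return "trapped", (1 + offset, 1.0)

# Hypothetical events on a traveler's worldline, two outside and one inside.
for event in [(0.6, 0.3), (0.5, 0.45), (0.4, 0.5)]:
    print(event, outgoing_signal_fate(*event))
```

In this toy layout, the closer to the horizon a signal is emitted, the closer to i⁺ it ends up, which is the geometric counterpart of the planet observer having to wait ever longer for the last signals.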
Conformal Diagram of a Fully
Extended, Schwarzschild Black
Hole
The black hole formed by collapsing matter is the simplest
black hole. It is far from the most interesting. By solving
Einstein's field equations (that is, by consulting his book of
universes) we find a closely related black hole. It is like the one
we've just seen, but was not formed by collapsing matter. It has
existed for all time. It is known as the "fully extended,
Schwarzschild black hole."
We can construct one mathematically by starting with just the matter free part of
the Schwarzschild spacetime. We then extend that piece of spacetime with
more matter free spacetime by means of Einstein's gravitational field equations.
If we keep extending the spacetime in that way, we end up with a new and
interesting black hole. Here is a conformal diagram of it.
Recall the jigsaw
puzzle analogy.
The extension is
like adding more
pieces to an
incomplete jigsaw
puzzle.
The great novelty of the new black hole is that it is twice the
size of the old one. On the "other side" of the event horizon is a
complete duplicate of the exterior of the black hole, the
region III. Just as region I is an infinite space surrounding the
black hole, region III is another infinite space just like it.
Everything that happens in region I can happen in region III.
Both can have planets and moons and space travelers. In
region I we can have a planet that passes from past timelike
infinity to future timelike infinity and sends out a traveler who
falls into the black hole. And we have the same thing for
region III: a planet that passes from past timelike infinity to
future timelike infinity and sends out a traveler who falls into the
black hole.
The second duplication is equally striking. The counterpart of
the singularity in the future is a new singularity in the past. It is
surrounded by a region that duplicates region II, the inside of
the black hole. This new singularity/region IV behaves like the
reverse of the future singularity. Just as things fall into the
future singularity, things fall out of the past singularity and
into the spacetime. For this reason, the structure is called a
"white hole." The diagram shows timelike worldlines of things
that are ejected by the past singularity into the regular
spacetime regions I and III.
What can come out of the singularity? Ejection from it is the
reverse process in time of falling into the future singularity. So
anything that can fall into the future singularity can be ejected
from the past singularity. That means anything: dinosaurs;
the socks you lost; TVs showing reruns of "I Love Lucy";
and so on. You might expect that the theory would say that
ejecting odd things like that is vastly improbable. The awkward
thing is that the theory assigns no probabilities to these
possibilities. It just says that this is possible and that is
impossible. Ejecting dinosaurs is as possible as ejecting the
chaotic gush of hot particles and radiation that seems most
natural.
The past singularity is a "naked
singularity." That means it is not
hidden behind an event horizon,
like the future singularity, and
things that come out of it can reach
us.
Before you dismiss the dinosaurs and TVs as crazy, recall that we have one clear
example of the existence of a naked singularity. That is the big bang of
cosmology. That singularity certainly did eventually eject dinosaurs and TVs,
although it did take a while for them to form from the material ejected! There's no
news on your socks, however. Did you look behind the dryer?
What about traveling from region I to the other world of
region III? The idea is hugely appealing (if we set aside worries
about tidal forces). We would throw ourselves into a black hole,
which would then prove to be the portal to another world! Alas,
it is clear from the conformal diagram that passing over into the
other world is prohibited to beings like us who cannot travel
faster than light. Here is a worldline of a traveler who makes the
passage. You'll see that it is inevitable that, at some point, the
curve must make an angle of more than 45° with the vertical.
That is, the traveler must at some point exceed the speed of
light. Any attempt by travelers who cannot do this would result
in a one-way trip into the singularity.
Proceeding in this way, the conformal diagram enables us to
recover lots of information about the fully extended black
hole. Is it possible for beings from region I and III to meet
somewhere? Yes, in region II. If a traveler falls from a planet
into the black hole, how much of the planet's worldline will be
visible through light signals to the traveler in the brief moments
that remain before the traveler meets the future singularity?
Depending on how the traveler falls, an arbitrarily large amount
will be visible.
Einstein-Rosen Bridges
The diagrams shown above are limited in one aspect. A three
dimensional space has two of its dimensions suppressed.
The space appears merely as a single line, a one dimensional
space, marked below as a "spatial hypersurface."
If we restore the two missing dimensions, each point on
the hypersurface is really a two dimensional sphere
of space enclosing the black hole. As we proceed from
one side to the other, the enclosing spheres get smaller
and smaller. However, since the geometry is not
Euclidean, the spheres do not lose area as fast as you'd
expect when we move to spheres successively closer to
the singularity. We already saw this effect in the
Schwarzschild spacetime of massive bodies like the sun.
In this case, however, the effect is stronger. The spheres
reach a minimum size and then expand.
The effect can be seen in an embedding diagram in which we show only two
of the three dimensions of the spheres. The spheres are now represented
by circles. The circles become smaller as we proceed from region I to III.
However once they reach a minimum size, they begin to expand. The
diagram is very suggestive. It has been called an Einstein-Rosen bridge
that connects the two worlds of regions I and III. In a sense it is a bridge, but it
is one that only travelers who can go faster than light can cross.
The figure below of the
Einstein-Rosen bridge
is an extension of the
familiar embedding
diagram of the space
around the sun, in
which the space
appears to be stretched
like a rubber
membrane.
More...
This short introduction does not even begin to exhaust all the
novel and interesting ideas associated with black holes. We
have looked only at the simplest case. If we allow that the black
hole can have some angular momentum (i.e. it spins) and that
it can carry charge, the associated conformal diagram becomes
very much more complicated. Many new regions corresponding
to new worlds appear. It does turn out to be possible for us to
visit them, if we fall into a black hole and somehow survive the
pummeling of tidal forces.
It also turns out that the types of black holes are limited by
the factors just mentioned. Once the mass, charge and angular
momentum of a black hole are fixed, then all its properties are
also determined. That gravitational collapse will always
produce a black hole has also been demonstrated in theorems
akin to those that demonstrate the inevitability of a big bang
singularity. And it has been suggested that when gravitational
collapse produces a singularity, that singularity is always
hidden behind an event horizon; this is known as "cosmic
censorship."
Yet further complications arise if we allow for the quantum
nature of matter. It turns out that black holes, especially small
ones, become unstable, emit particles and can evaporate!
What You Should Know
How to read conformal diagrams for Minkowski spacetime.
How to read conformal diagrams for a Schwarzschild
black hole.
How to use those conformal diagrams to determine what
happens to travelers and signals exploring the spacetime.
Copyright John D. Norton. March 2001, October 2002; February 8, 2007, February 23, 2008.
Atoms and the Quantum
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/atoms_quantum/index.html[28/04/2010 08:23:37 ﺹ]
HPS 0410 Einstein for Everyone
Atoms and the Quantum
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
This document is a duplication of "Atoms
Entropy Quanta: Einstein's Statistical Physics
of 1905," found in the Goodies section of
my website at www.pitt.edu/~jdnorton/goodies .
1. The Three Statistical Papers of 1905
2. A Mini-Tutorial on Ideal Gases
3. Einstein's Doctoral Dissertation
4. The Statistical Physics of Dilute Sugar Solutions
5. Einstein's Brownian Motion Paper
6. The Importance of Einstein's Analysis of
Brownian Motion
7. The Light Quantum Paper: Einstein's Astonishing
Idea
8. A New Atomic Signature
9. Conclusion
Einstein's work in statistical physics of 1905 is unified
by a single insight: Physical systems that consist of
many, spatially localized, independent micro-components
have distinctive macro-properties. These
macro-properties provide a signature that reveals the
system's microscopic nature. Einstein used this insight
in two ways. It enabled him to treat many, apparently
distinct systems alike, simply because their micro-components
are localized and independent. And he
used the measurable macro signature to reveal the
micro-constitution of physical systems. In the case of
heat radiation, the result was revolutionary.
1. The Three Statistical
Papers of 1905
In his annus mirabilis of 1905, Einstein published three
papers in statistical physics that appeared to be only
loosely connected. They were:
Einstein's doctoral dissertation
"A New Determination of Molecular Dimensions"
Buchdruckerei K. J. Wyss, Bern, 1905. (30 April 1905)
Also: Annalen der Physik, 19 (1906), pp. 289-305.
Einstein used known physical properties of sugar
solution (viscosity, diffusion) to determine the size of
sugar molecules.
"Brownian motion paper."
"On the motion of small particles suspended in liquids
at rest required by the molecular kinetic theory of
heat."
Annalen der Physik, 17 (1905), pp. 549-560. (May 1905; received 11
May 1905)
Einstein predicted that the thermal energy of small
particles would manifest as a jiggling motion, visible
under the microscope.
"Light quantum/photoelectric effect paper"
"On a heuristic viewpoint concerning the production
and transformation of light."
Annalen der Physik, 17 (1905), pp. 132-148. (17 March 1905)
Einstein inferred from the thermal properties of high
frequency heat radiation that it behaves
thermodynamically as if constituted of spatially
localized, independent quanta of energy.
These three papers were intimately connected by a
single insight that Einstein used and developed as the
content of the papers unfolded. Take a system that
consists of very many, spatially localized, independent
microscopic components. That constitution can be
read from the thermal properties of the system, as long
as one knows how to read the signs. The most familiar
example is a very dilute kinetic gas; its component
molecules move independently. This constitution is
directly expressed in the fact that the pressure,
temperature and volume of the gas conform to the
ideal gas law.
Einstein was not the first to see these sorts of
possibilities. However he used them with greater
fluidity and reach than ever before.
2. A Mini-Tutorial on
Ideal Gases
For a very gentle warm up exercise, see "How big is
an atom?"
To illustrate this insight, let us look at this most familiar
case of ideal gases. This is the case of most ordinary
gases, just like the air, when they are at ordinary
temperatures that are not too cold and pressures that
are not too high, so that they remain very dilute.
Here is an ideal gas
trapped in a cylinder by a
weighted piston. That it
obeys the ideal gas law
means that the following
calculation always works.
Take the pressure P and
multiply it by the volume
V of the gas. Whatever
you get will always be
exactly the same as what
you get when you take the
number of molecules
n, multiply it by
Boltzmann's constant
k and the temperature
T.
Or, to put it more simply:
PV = nkT
This result is so simple that it is easy to miss what is
quite remarkable about it. What is remarkable is
exactly that it is so simple. Gases come in many
different forms. We might have a very light gas like
helium, the gas used to lift balloons, whose molecules
are little spheres. Or we might have a denser gas like
the oxygen of the air, whose molecules are dumbbell
shaped. Or we might have a vaporized liquid, like
water vapor, whose molecules are shaped something
like little Mickey Mouse heads. In every case, the
same law holds, even if the oxygen or water vapor are
mixed up with another gas like nitrogen in the air. Yet
nothing in the law takes note of all these differences.
All that enters the law are the volume, the
temperature, the number of molecules and a single
universal constant, Boltzmann's constant k. From
them, using a little easy arithmetic, the law tells you
what the gas pressure P will be.
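The "little easy arithmetic" can be written out in a few lines. This is a minimal sketch; the function name and the example numbers (one mole of molecules at 0 degrees Celsius in 22.4 liters) are chosen here for illustration and do not come from the text.

```python
BOLTZMANN_K = 1.380649e-23  # joules per kelvin (exact SI value)

def ideal_gas_pressure(n_molecules, temperature_kelvin, volume_m3):
    """Pressure in pascals from PV = nkT, with n the number of molecules."""
    return n_molecules * BOLTZMANN_K * temperature_kelvin / volume_m3

# Illustrative inputs: one mole of molecules at 0 degrees Celsius in 22.4 liters.
n = 6.02214076e23
pressure = ideal_gas_pressure(n, 273.15, 0.0224)
print(f"{pressure:.0f} Pa")  # close to one atmosphere, about 101,000 Pa
```

Notice that nothing about the kind of molecule enters the function: helium, oxygen and water vapor all give the same answer for the same n, T and V.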
How can the ideal gas law do this? It can do it
because the truth of the law does not depend upon
the detailed physical properties of the gas. Rather it
depends only on a single fact shared by all dilute
gases: they consist of many independent, spatially
localized molecules. The law needs this and nothing
more; as a result it does not need to ask if the gas
molecules are heavy or light, this shape or that; or
even if the molecules are alone in space or surrounded
by molecules of another type. This fact also
foreshadows the far broader application of the ideal
gas law than just to ideal gases.
Exactly how this law comes about is a somewhat
technical issue, although not that technical. In its very
simplest form it goes like this. The single most
important result of the statistical physics of Maxwell
and Boltzmann for a thermal system is that the
probability that one of its molecules is in some state is
fixed by that state's energy. Specifically, the probability
of a state with energy E is proportional to an
exponential factor exp(-E/kT). So, for the gas in the
above cylinder, we can ask for the probability that one
of its molecules will be found at some height h. Now its
energy at height h is its energy of motion plus the
energy of height, mgh, where m is the molecule's
mass and g the acceleration of gravity. This formula
assumes the essential thing, that the molecules are
independent of each other. For the energy of the
molecule depends on its height and not on the position
of any other molecules.
What this means is that the probability of finding
some given molecule at height h decays exponentially
with height h according to the factor exp(-mgh/kT).
Now the gas is more dense where there are more
molecules; or more precisely, the probability of finding
a molecule at height h is proportional to the density of
the gas at height h. Therefore the density of the gas
decays exponentially with height according to the
same factor exp(-mgh/kT). So this means that the gas
is more dense lower down and less dense higher up.
All that seems reasonable enough.
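To get a feel for how quickly the density falls off, the exponential factor can be evaluated directly. The sketch below assumes illustrative values that are not in the text: a nitrogen molecule of mass about 4.65e-26 kg, a temperature of 300 K, and g = 9.81 m/s^2.

```python
import math

BOLTZMANN_K = 1.380649e-23   # joules per kelvin
G = 9.81                     # m/s^2, near the Earth's surface

def relative_density(height_m, molecule_mass_kg, temperature_kelvin):
    """Density at height h divided by density at h = 0: exp(-mgh/kT)."""
    return math.exp(-molecule_mass_kg * G * height_m
                    / (BOLTZMANN_K * temperature_kelvin))

# Assumed example values: a nitrogen molecule (about 4.65e-26 kg) at 300 K.
m_nitrogen = 4.65e-26
for h in [0.0, 1.0, 1000.0, 8000.0]:
    print(h, relative_density(h, m_nitrogen, 300.0))
```

Over the height of a laboratory cylinder the density barely changes; the thinning only becomes noticeable over heights of kilometers.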
But you might also quite reasonably
ask why the force of gravity just
doesn't pull all the gas molecules
down to the bottom of the cylinder,
so that they lie in a big heap at the
bottom of the cylinder, like a pile of
dust. The simple answer looks at
the gas microscopically and calls
upon the thermal motions of the
molecules to scatter them through
the chamber. The relevant effect of
these microscopic motions can be
redescribed macroscopically as a
pressure. The many microscopic
collisions of the molecules with the
piston, for example, appear
macroscopically as a smooth
pressure exerted by the gas on it.
Correspondingly the tendency of the gas to scatter
upward because of the microscopic motions appears
macroscopically as a pressure gradient in the gas.
There is a higher pressure lower in the cylinder and
that higher pressure tends to push the gas upward.
Now different pressure gradients in the gas will lead to
different density distributions, with equilibrium arising
when the pressure gradient exactly balances the
weight of the gas and piston above. Which pressure
gradient will lead to a distribution proportional to
exp(-mgh/kT) in every case? Well, you know the answer. It
is exactly the pressure gradient given by the ideal gas
law, PV=nkT!
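For readers who want to see the balance worked out, here is a short sketch of the calculation. It introduces notation that is not in the text: n(h) is the number of molecules per unit volume at height h and P(h) is the pressure at that height. It also assumes that the temperature T is the same throughout the gas and that the ideal gas law is applied locally, in the form P = n k T.

```latex
% Equilibrium: the pressure difference across a thin layer supports its weight.
\frac{dP}{dh} = -\,m g\, n(h)
% Ideal gas law applied locally, with n(h) molecules per unit volume:
P(h) = n(h)\, k T
% Substituting the second equation into the first:
kT\,\frac{dn}{dh} = -\,m g\, n(h)
\quad\Longrightarrow\quad
n(h) = n(0)\,\exp\!\left(-\frac{m g h}{kT}\right)
```

So a gas whose pressure obeys the ideal gas law settles into exactly the exponential density distribution described above.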
To summarize, the assumption that a gas consists of
many, independent, localized molecules leads to the
ideal gas law. And it should come as no surprise that
the argumentation can be reversed. If we have any
gas in the context of Maxwell-Boltzmann statistical
physics that satisfies the ideal gas law, then it consists
of many, independent molecules.
There remains one subtle point that will become of
central importance. The ideal gas law follows from the
assumption that the gas consists of many,
independent, localized molecules. Notice what is not
assumed. It is not assumed that the molecules move in
straight lines at uniform speed between collisions with
other molecules; or that the molecules are the only
matter present. The ideal gas law is a much more
general result. It holds for any thermal system
consisting of many, independent, localized
components; and the notion of component and its
context can be quite broad.
All this can be made precise mathematically with only
a little more effort. See how here.
3. Einstein's Doctoral
Dissertation
Of his statistical papers of 1905, the light quantum
paper was published first. However in terms of the
development of their ideas, Einstein's doctoral
dissertation presents the natural starting point. The
common ideas of the three papers appear in it in their
simplest form and they are developed adventurously in
the other two papers.
The point of Einstein's doctoral dissertation, "A New
Determination of Molecular Dimensions," was clearly
stated in its title. It was to determine how large
molecules are. The answer was given in a particular
way. A basic result of chemical atomism is that there
are always the same number of molecules in one gram
mole of any substance, such as 2g of hydrogen gas,
or 18g of water, or 32g of oxygen gas. That number is
N. It is called Avogadro's number in the English
tradition and Loschmidt's number in Einstein's German
tradition. Finding N then automatically tells us the
mass of hydrogen molecules, water molecules and
oxygen molecules.
The method Einstein hit upon was simple in
conception. Pure water has a certain viscosity that
measures how readily it flows. Water's viscosity is very
much less than that of honey, for example, which flows much
less readily. The addition of sugar to water to make a
syrup like honey increases the viscosity. Einstein
proposed that, at least in the case of dilute sugar
solutions, the increase in viscosity is simply due to the
bulk of the sugar molecules obstructing the free flow of
the dissolving water. Einstein's project was to model
this obstructive effect as a mathematical problem in
fluid flow; and to compare the results with
experimentally determined viscosities of dilute sugar
solutions; and thereby to estimate N. The idea was
simple, but its execution was not.
Einstein managed to reduce the
problem to computing the flow that
results in the situation shown opposite.
Water flows inward on one axis and
then diverges outward on others. That
flow will be impeded by the presence of
a sugar molecule at the center, where
the molecule is presumed to be a
perfect sphere. That impeding of the
flow, Einstein assumed, would manifest
as an increase in the viscosity of the
solution.
After a long and hard calculation, after Einstein had
made many special assumptions just so that the
computation could be done at all, Einstein arrived at
his result. The apparent viscosity mu of the water was
increased to mu* of the solution in direct relation to the
fraction of the volume phi of the solution taken up by
the sugar:
(1) mu* = mu . (1 + phi)
And the fraction of the volume taken up by the sugar
could be determined by simple geometry from rho the
sugar density, m the molecular weight of the sugar, P
the radius of the sugar molecule and N:
(2) phi = (rho/m) . N . (4pi/3) . P^3
Well, it was a little more complicated. Einstein made an error in the
calculation and the correct result was
mu* = mu . (1 + (5/2)phi). The examiners did not notice. Einstein was
awarded his PhD and years later corrected the mistake.
Don't be put off by all the terms in equations (1) and
(2). All that really matters is that Einstein has
equations that relate things that can be measured
(viscosity of sugar solutions, etc.) to the thing he wants
to know, N. So Einstein could take equations (1) and
(2), combine them and turn the outcome inside out.
The result is
(3) N = (3m/4 pi rho) . (mu*/mu - 1) . 1/P^3
Or, if we express it in terms that matter:
(3) N = (things that can be measured) x 1/P^3
You'll immediately see the
problem with equation (3).
N and the radius of the
sugar molecule P are
both things that we don't
know (and want to know).
So Einstein has that old
foe of algebra homework:
ONE equation in TWO
unknowns. And we all
learned in school that you
cannot solve that. In
effect we have a rule
such that if we know the
value of one unknown (P,
say) we can figure out
the other (in this case N).
That is shown in the plot.
We have a curve that
displays all the values of
P and the corresponding
values of N that go with
them.
What Einstein needed
was a second equation,
so he would have TWO
equations in TWO
unknowns. Then he would
have a second curve on
the plot and where the
two curves crossed he
would find the unique
values of both N and P.
But where could Einstein get his second equation?
He found it by looking at how sugar diffuses in water.
How he analyzed this diffusion process will be our real
focus. So let me just state his result for the moment. It
uses the diffusion coefficient D that determines how
fast sugar diffuses and is measurable directly in
experiment, and the ideal gas constant R.
(4) N= (RT/6 pi mu D) . 1/P
or in terms of what matters
(4) N = (things that can be measured) x 1/P
So Einstein now had two equations (3) and (4) in his
two unknowns, N and P, and they could be solved. He
found N = 2.1 x 10^23. Later, after he corrected his
calculation for his error, he had N = 6.6 x 10^23, which
is much closer to the modern value of 6.02 x 10^23.
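The step from two equations to two numbers can be sketched in a few lines of Python. This is only an illustration: it assumes the equations have the forms N = A/P^3 and N = B/P, where A and B stand for the two "things that can be measured," and the values given to A and B below are invented for the example. They are not Einstein's data.

```python
import math

def solve_two_unknowns(A, B):
    """Solve N = A / P**3 and N = B / P simultaneously for (N, P).

    Setting the two right-hand sides equal gives P**2 = A / B.
    """
    P = math.sqrt(A / B)
    N = B / P
    return N, P

# Hypothetical measured combinations (illustrative values only).
A = 6.0e-4
B = 6.0e14
N, P = solve_two_unknowns(A, B)
print(f"P = {P:.3e}, N = {N:.3e}")
```

The point where the two curves cross is the pair (P, N) returned by the function; both equations give the same N there.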
4. The Statistical Physics of Dilute Sugar Solutions
Diffusion is a familiar process. The smell of last night's
pepperoni pizza soon fills the refrigerator as the aroma
diffuses into every corner. Similarly a spoonful of sugar
syrup carefully placed at the bottom of a cup of water
(and not stirred!) will slowly diffuse over a period of
days and weeks through the water making a (roughly)
uniform sugar solution. The microscopic
mechanism of diffusion is simply the scattering of
sugar molecules under their random thermal motion.
Indeed in dilute solutions, the sugar molecules form a
system of a large number of molecules that do not
interact with one another; they are widely spaced in the
water because of the high dilution.
A large number of molecules that do not
interact?! This is exactly the condition
that we saw the molecules of an ideal
gas had to obey in order for the ideal
gas law to obtain. So it should hold here
as well. And it does!
The random, microscopic motions of
sugar molecules that lead to diffusion
can be redescribed on a macroscopic
level as a pressure, just as is the case
with an ideal gas. This pressure is the
familiar osmotic pressure so
important in cell biology. Consider a
semipermeable membrane that can
pass water but not sugar, such as the
membrane in the figure opposite or a
cell wall. The (gray) water can pass
freely through it, but sugar molecules
(the little white spheres) cannot.
Through their collisions with the
membrane, the sugar molecules exert a
pressure on the membrane and the
considerations that fix the size of the
ideal gas pressure are exactly the same
as those that fix the size of the osmotic
pressure.
The osmotic pressure P exerted by n
sugar molecules in a volume V of water
in dilute solution obeys the ideal gas law
(here P is a pressure, not the molecular
radius of equations (3) and (4)):
PV = nkT
This osmotic pressure became central to
Einstein's derivation of the result (4) for
sugar diffusing in solution. To generate it,
he imagined the same set up as I have
described above, dissolved sugar
molecules in a gravitational field. There
are two processes acting on the sugar
molecules.
First, the effect of gravity is to pull the
molecules downward. So they fall, as
shown. A standard law in fluid mechanics,
Stokes' law, expresses just how fast they
fall under the pull of gravity.
Second, a diffusion process scatters the
falling sugar molecules. Its net effect is to
send the sugar molecules from regions of
high concentration to regions of low
concentration. That precludes the falling
molecules accumulating too much at the
bottom of the vessel.
Einstein used the fact that dissolved sugar
exerts an osmotic pressure to determine
the magnitude of this effect. The falling
sugar forms a density gradient. The ideal
gas law asserts that pressure is proportional
to density, so there is an osmotic pressure
gradient. And that pressure gradient drives
the sugar back up.
An equilibrium between the processes
will be established when the amounts of
sugar transported by the two processes in
opposite directions are equal. The
equation that sets those two rates of
transport equal turns out to be just the
second equation Einstein needed for the
argument of his doctoral dissertation:
(4) N= (RT/6 pi mu D) . 1/P
or in terms of what matters
(4) N = (things that can be measured) x 1/P
5. Einstein's Brownian Motion Paper
The argument and method of Einstein's dissertation
was indirect and cumbersome. Since the original
project of examining the viscosity of sugar solutions
yielded one equation in two unknowns, he needed to
introduce analysis of a second sort of physical
process, diffusion, in order to get a result. To recall, he
ended up with TWO equations in TWO unknowns, N
and P, the radius of a sugar molecule:
(3) N = (things that can be measured) x 1/P^3
(4) N = (other things that can be measured) x 1/P
We could well imagine Einstein examining these two
unknowns, N and P, and lamenting that both are
inaccessible to direct measurement. In the case of
sugar solutions, of course, the problem is inescapable.
To know one is to know the other; if we are ignorant of
one we do not know the other. But wait: what if we
were to apply this same analysis not to sugar solutions
but to other solutions whose "molecules" are so big
that we might measure their size directly under the
microscope? That could be done. All we are really
considering is a suspension in water of very finely
divided particles, perhaps even like the tiny pollen
grains Brown had observed under the microscope
earlier in the 19th century. For these systems, there
is now only ONE unknown, N. Thermal motions would
lead such particles to diffuse through water and, using
equation (4) alone, Einstein could determine N from
the measured rate of their diffusion.
I do not know if this is the reasoning that brought
Einstein from the reflections of his doctoral dissertation
to the Brownian motion paper. But I can say that the
path is obvious and direct, just as it leads to a very
much more adventurous result. Einstein is no longer
computing the size of molecules, he has found a
process which it seems that only a molecular kinetic
theory of heat can accommodate!
The remarkable fact is that
Einstein could use exactly the
same analysis for this process
as he had used for the
diffusion of sugar. The
suspended particles consist of
a large number of independent
components; that you can see
them under the microscope
does not alter that fact. So they
will exhibit thermal motions
which in turn exert a pressure
on a membrane that does not
allow them to pass.
At this point, no more
calculation is needed. The
particles will establish an
equilibrium distribution in the
gravitational field exactly as did
the sugar molecules. Once
again we can characterize that
equilibrium by equating the rate
at which the particles fall under
gravity with the rate at which
diffusion scatters them back
up. The result is:
(4) N= (RT/6 pi mu D) . 1/P
as before. Since P is now
observed, all Einstein needs is
to measure the rate of diffusion
of the particles to recover D
and then use (4) to compute
N.
This last step of the computation of N
proved the most interesting. The thermal
diffusion of these particles would manifest
under the microscope as a random
jiggling motion. Indeed Einstein
conjectured that this was just the motion
Brown had noted for pollen grains,
although in this first paper Einstein
lamented that he did not have enough data
to be sure.
For particles of size 0.001mm, Einstein
predicted a displacement of approximately
6 microns in one minute.
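A rough check of this figure can be sketched in Python. The sketch turns equation (4) around to get D from N, and then uses the relation, stated later in this chapter, that the mean square displacement is 2.D.t. The inputs are assumptions that do not appear in the text above: "size 0.001mm" is read as a diameter (so P is 0.0005mm), the liquid is taken to be water at about 17 degrees Celsius with a viscosity of 1.35 x 10^-3 Pa s, and N is set to 6 x 10^23.

```python
import math

R_GAS = 8.314462618   # ideal gas constant, J/(mol K)

def diffusion_coefficient(N, T, mu, P):
    """Equation (4) solved for D: D = R*T / (6*pi*mu*N*P)."""
    return R_GAS * T / (6.0 * math.pi * mu * N * P)

def avogadro_from_diffusion(D, T, mu, P):
    """Equation (4) as written: N = (R*T / (6*pi*mu*D)) * 1/P."""
    return R_GAS * T / (6.0 * math.pi * mu * D) / P

def rms_displacement(D, t):
    """Square root of the mean square displacement 2*D*t."""
    return math.sqrt(2.0 * D * t)

# Assumed inputs (not taken from the text above).
N = 6.0e23        # molecules per mole
T = 290.0         # kelvin, roughly 17 degrees Celsius
mu = 1.35e-3      # viscosity of water in Pa s
P = 0.5e-6        # particle radius in m, reading "size 0.001mm" as diameter

D = diffusion_coefficient(N, T, mu, P)
x = rms_displacement(D, 60.0)
print(f"D = {D:.2e} m^2/s, displacement in one minute = {x * 1e6:.1f} microns")
```

Under these assumptions the displacement comes out at about 6 microns. Running the calculation the other way, a measured D and an observed P give N through avogadro_from_diffusion.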
6. The Importance of Einstein's Analysis of Brownian Motion
Following the easy logic of the pathway from his
dissertation, we may overlook the momentous
importance of what has just transpired. Einstein had
found an effect that settled one of the major debates of
the early 20th century!
In the course of the latter part of the 19th century,
Maxwell, Boltzmann and others had struggled to
establish that their statistical treatment of thermal
processes deserved a place in physics. It was a
difficult struggle. For their statistical accounts seemed
to be at odds with established thermodynamics,
grounded squarely in experiment. Most notoriously,
there were (then) two fundamental laws of
thermodynamics. The second law, the entropy
principle, expressed the notion that thermodynamic
processes were directed in time. Gases spontaneously
expand to fill space. They do not spontaneously
contract. In the statistical approach, however, they do
spontaneously contract, but with very small probability.
(We will see more of this shortly!) So Boltzmann
struggled to establish that this basic law of
thermodynamics only held with very high probability.
For Maxwell and Boltzmann, the project was to catch
up with thermodynamics and show that they could do
what the thermodynamicists were already doing
without calling upon stories about atoms. Seen in this
light, the opposition of energeticists like Ostwald at the
start of the 20th century to atoms is quite
understandable. They did not seem to need atoms
to do their physics; and presuming atoms required
compromising the basic laws of thermodynamics. So
why play with the notion of atoms when it brought pain
but no gain?
Einstein now had found a way to turn the tables. The
strength of the thermodynamicists was their grounding
in experiment. Yet here was an experimental effect,
the random thermal motions of suspended particles,
that could not be accounted for by ordinary
thermodynamic means. One had to resort to something
like a molecular kinetic account. Einstein pointed to
this momentous outcome in rather dry language in the
introduction to his paper:
"If it is really possible to observe the motion
discussed here ... then classical
thermodynamics can no longer be viewed as
strictly valid even for microscopically
distinguishable spaces, and an exact
determination of the real size of a mole
becomes possible."
Here I follow Anne Kox's analysis of Einstein's "eine exakte
Bestimmung der wahren Atomgroesse" and translate Atomgroesse
as size of a mole.
In addition to this foundational issue, there was a
second theoretical bounty emerging from Einstein's
analysis of Brownian motion. In order to determine N,
Einstein needed to estimate the diffusion coefficient
associated with the random motion of the suspended
particles. This required a statistical analysis of the
random jiggling of the particles.
The analysis had to be
probabilistic. If a
particle starts at some
known position, we can at
best specify the
probabilities of it straying
ever further from that
initial point. The curve
representing these
probabilities is the familiar
bell curve. As time t
passes it becomes more
and more flattened,
capturing the greater
probability of the particle
straying from its initial
position.
Einstein showed that this
flattening of the curve is
directly related to the
diffusion coefficient D.
That is, the mean square
displacement is 2.D.t.
Through this analysis,
Einstein's paper became
one of the first treatments
of the problem of the
"random walk" and one of
the founding documents
in the new field of
stochastic processes.
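The spreading of the bell curve can be seen in a toy simulation. The sketch below is not Einstein's analysis. It is a simple one-dimensional random walk in which each walker takes steps of +1 or -1 with equal probability, with the number of walkers and the random seed chosen arbitrarily. In this toy model the mean square displacement after n steps should come out close to n, which plays the role of 2.D.t.

```python
import random

def mean_square_displacement(n_steps, n_walkers=2000, seed=0):
    """Average of x**2 over many one-dimensional random walks.

    Each walker starts at x = 0 and takes n_steps steps of +1 or -1
    with equal probability.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_walkers):
        x = 0
        for _ in range(n_steps):
            x += rng.choice((-1, 1))
        total += x * x
    return total / n_walkers

for n in (25, 100, 400):
    print(n, mean_square_displacement(n))
```

Quadrupling the number of steps roughly quadruples the mean square displacement, so the typical distance strayed only doubles.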
Finally there were some interesting subtleties in this
random motion. First, the jiggles observed under the
microscope were not the result of collisions with
individual water molecules. You might presume that
the effect of very many collisions with water molecules
would rapidly average out to no effect at all. That turns
out to be mistaken. The statistical analysis shows that
even very many molecular collisions leave a residual
jiggle. Second, it is futile to try to find the average
speed of the jiggling particles. Speed is
displacement/time. Einstein's analysis shows that the
average displacement is proportional to the square
root of time. So the ratio of displacement/time varies
as 1/(square root of time) and so goes to zero as time
gets large. So if we try to average out the jiggles to find
an average speed, we end up with averages that will
get closer and closer to zero the longer the time period
we consider.
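The vanishing of the average speed is easy to illustrate numerically. The sketch below uses the root mean square displacement, the square root of 2.D.t, as the measure of displacement, and divides it by the elapsed time. The value of D is a made-up illustrative number.

```python
import math

def apparent_speed(D, t):
    """RMS displacement sqrt(2*D*t) divided by the elapsed time t."""
    return math.sqrt(2.0 * D * t) / t

D = 3.0e-13  # hypothetical diffusion coefficient in m^2/s
for t in (1.0, 100.0, 10000.0):
    print(t, apparent_speed(D, t))
```

Each hundredfold increase in the time interval cuts the apparent speed by a factor of ten, so no fixed "speed of jiggling" emerges.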
7. The Light Quantum Paper: Einstein's Astonishing Idea
The great triumph of 19th century physics had been
Maxwell's electrodynamics. It established definitively
the wave character of light, identifying it as
propagation in the electromagnetic field. It seemed
impossible in the face of Maxwell's great achievement
that we could ever go back to a view of light such as
Newton held, that light consists of little corpuscles. Yet
exactly this was the astonishing idea of Einstein's 1905
light quantum paper.
Einstein had several bases for this idea. Some were
grounded directly in experiment. For example, he
argued that we could best account for the
photoelectric effect if we assumed that the energy of
propagating light was spatially localized in little packets
of size hf, where h is Planck's constant and f is the
frequency of the light. This explanation of the
photoelectric effect was cited in the awarding of the
Nobel Prize to Einstein in 1921: "for his services to
Theoretical Physics, and especially for his discovery of
the law of the photoelectric effect."
The core argument of Einstein's paper was different,
however. It drew on the thermodynamic behavior of
high frequency heat radiation. What Einstein noticed
was that there was an atomistic signature in its
macroscopically measurable thermal properties. He
noted that high frequency heat radiation behaved
thermodynamically as if it consisted of independent,
spatially localized quanta of energy of size hf. This
remark was the light quantum hypothesis.
The idea that the macroscopic properties of a system
may reveal its microscopic properties is not new.
Indeed it has been present throughout the discussion
so far. That the system exerts a pressure governed by
the ideal gas law is just such a signature. It tells us
that the system consists of many, independent
components and this signature can be found in ideal
gases, in dilute solutions and in systems of suspended
particles. It actually turns out to be present in high
frequency heat radiation as well. However its presence
is harder to see. Heat radiation does exert a
pressure, known as radiation pressure. That pressure
is a function of the temperature and frequency of the
radiation only. So we may well wonder how the ideal
gas law PV=nkT could apply to it, for the ideal gas law
clearly allows a volume dependence through the
presence of the term V.
It turns out that the ideal gas law still does apply to
high frequency heat radiation. That fact is obscured by
a novelty of heat radiation. The number of quanta in
heat radiation is not fixed in the way the number of
components is fixed for other systems such as an
ideal gas. If we correct for that effect, compatibility with
the ideal gas law is restored.
When an ideal gas undergoes a constant
temperature expansion, the ideal gas law
PV=nkT tells us that the product of
pressure and volume PV stays the same.
That is, the pressure decreases as
the volume increases. This is how we are
used to seeing the ideal gas law
manifested.
When a system of high frequency heat
radiation expands at constant
temperature, new energy quanta are
created in direct proportion to the volume
V. That is, n/V remains constant. The ideal
gas law now tells us that the pressure P
remains constant since we may write the
law as P=(n/V)kT. The immediate effect is
that the satisfaction of the ideal gas law is
obscured since we are so used to the law
telling us that pressure P decreases in a
constant temperature expansion. The
atomic signature is there; but it is in an
unfamiliar form.
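The contrast between the two expansions can be put in a short Python sketch. It simply evaluates P = (n/V)kT twice: once with n held fixed, as for an ordinary gas, and once with n growing in proportion to V, as described for high frequency heat radiation. The temperature, volumes and starting number of components are arbitrary illustrative values.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant k, J/K

def pressure(n, V, T):
    """Ideal gas law written as P = (n/V) k T."""
    return (n / V) * K_BOLTZMANN * T

T = 300.0            # kelvin, illustrative
V1, V2 = 1.0, 2.0    # cubic metres, illustrative
n1 = 1.0e20          # illustrative number of components at volume V1

# Ordinary ideal gas: n is fixed during the expansion.
gas_before = pressure(n1, V1, T)
gas_after = pressure(n1, V2, T)

# High frequency heat radiation: the number of quanta grows with V.
n2 = n1 * (V2 / V1)
rad_before = pressure(n1, V1, T)
rad_after = pressure(n2, V2, T)

print(gas_before, gas_after)   # pressure halves
print(rad_before, rad_after)   # pressure unchanged
```

The same formula is used in both cases; only the behavior of n differs.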
8. A New Atomic Signature
Einstein did not mention the ideal gas law as an
atomic signature for heat radiation. He did however
demonstrate the existence of another atomic signature
to which high frequency heat radiation did conform. He
first illustrated that signature for the familiar case of an
ideal gas.
The statistical approach to gases differed from a purely
thermodynamic one, as noted above, in that it allows
for gases to spontaneously recompress, albeit with
very small probability. The analysis is very simple.
Consider an ideal gas with just four molecules. The
molecules will move randomly through the chamber
shown and will mostly be spread throughout it.
There is a probability of 1/2 that any given molecule
will be in the left half of the chamber when we check.
So the probability that all four of them will be there is
just
(1/2) x (1/2) x (1/2) x (1/2) = (1/2)^4.
The key fact of independence is what allows us just to
multiply all four probabilities together to get the result.
If we had n molecules, the probability would be
(1/2) x (1/2) x (1/2) x ...(n times) ... x (1/2) = (1/2)^n.
Since ordinary samples of gas will have of the order of
n = 10^24 molecules, this probability is fantastically
small and we have no chance of observing this
fluctuation in ordinary life. (And that is fortunate, for otherwise
our lives in the air would be like a small cork tossed about on a stormy
sea!)
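To get a feel for how small (1/2)^n becomes, it helps to work with its logarithm, since the number itself is far too small for ordinary floating point arithmetic. A minimal sketch, with a function name invented for this example:

```python
import math

def log10_prob_all_in_left_half(n):
    """log10 of (1/2)**n, the chance that n independent molecules
    are all found in the left half of the chamber."""
    return n * math.log10(0.5)

print(0.5 ** 4)                            # four molecules: 0.0625
print(log10_prob_all_in_left_half(1e24))   # roughly -3e23
```

For n = 10^24 the probability is a decimal point followed by about 3 x 10^23 zeros before the first nonzero digit.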
However the probability of this fluctuation is still quite
definite. An ideal gas can spontaneously compress to
half its volume with minuscule probability (1/2)^n.
Statistical physics happens to give us another way to
determine this probability, without us actually having to
see the spontaneous recompression. The probability of
the transition is related to a macroscopic
thermodynamic quantity, entropy. We need not here
go into many details of the nature of this quantity. All
that matters for us is that entropy is a thermodynamic
property of thermal systems, just as is energy, and its
value is routinely given in tables of thermal properties
of substances. I will not pause here to rant about the unfortunate
mythology of mystery that surrounds the notion. A good part of it is
due to plain old foggy thinking. See my website,
http://www.pitt.edu/~jdnorton for details.
The Simplest Version of the Argument
The details of the next steps of Einstein's argument are
a little messy for people who don't like logarithms. So
here's the very simplest version without logarithms.
The thermodynamic quantity entropy tells us what
sorts of transformations thermal systems will undergo.
The basic rule is that thermal systems will tend to
states of higher entropy. So the entropy difference
between two states of a system gives us information
on the tendency of the system to move between the
states. Indeed the "tendency" can be given a quite
precise measure as a probability. If we know the
entropy difference between two states of a system, we
know the probability that the system will spontaneously
move between those two states.
Now recall that the entropy of a
system is an ordinary
thermodynamic quantity like
energy. Just as you can measure
the energy content of some volume
of radiation by a suitable
experiment, you can also
measure the entropy content of
that system.
To get a sense of how it works, imagine that you slowly heat
some system which is initially at some absolute temperature
T. You can figure out how much the energy of the system
changes with each unit of heat you add: one unit of energy is
added for each unit of heat. The corresponding calculation for
entropy is almost as simple. For each unit of heat energy you
add when the system is at T, you add 1/T units of entropy.
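The bookkeeping rule "1/T units of entropy for each unit of heat added at T" can be tried out in a small Python sketch. It assumes, purely for illustration, a system with a constant heat capacity, so that each small parcel of heat raises the temperature by a fixed amount. The heat capacity and the temperatures are made-up values. For this special case the running total should agree with the closed form C.log(T_end/T_start).

```python
import math

def entropy_gain(T_start, T_end, heat_capacity, steps=100_000):
    """Add heat in small parcels dQ and accumulate dQ/T units of entropy.

    Assumes a constant heat capacity, so each parcel dQ raises the
    temperature by dQ / heat_capacity.
    """
    dT = (T_end - T_start) / steps
    S = 0.0
    T = T_start
    for _ in range(steps):
        dQ = heat_capacity * dT
        S += dQ / (T + 0.5 * dT)   # temperature at the middle of the step
        T += dT
    return S

C = 10.0                               # hypothetical heat capacity, J/K
S_numeric = entropy_gain(300.0, 600.0, C)
S_exact = C * math.log(600.0 / 300.0)  # closed form for constant C
print(S_numeric, S_exact)
```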
That is just what Einstein did for heat radiation. More
precisely, he took other people's measures of entropy
and used them to figure out the entropy difference
between two states: a quantity of heat radiation of
energy E at one, particular high frequency f and a
second quantity of heat radiation of the same energy E
and frequency f, but half the volume.
From the entropy change between those two states,
Einstein could infer that the probability of the
quantity of radiation spontaneously fluctuating to half
its volume is just (1/2)^(E/hf). Written out more fully that
is
(1/2) x (1/2) x (1/2) x ...(E/hf times) ... x (1/2) = (1/2)^(E/hf)
Comparing this formula to the corresponding formula
for n molecules, it is almost impossible to avoid
concluding that this quantity of high frequency radiation
consists of E/hf spatially localized radiation molecules
(Einstein called them "light quanta") that move
independently through the volume.
The picture to have in mind is:
The best part is that the probability formula tells us
directly how big these light quanta are. The probability
comes from multiplying E/hf factors of (1/2) together.
So we infer that the total energy E of the radiation is
divided into that many quanta of energy, each of size
hf.
The Fancier Version of the Argument
Now here's the fancier version.
The entropy change S between two states is related to
the logarithm of the probability W of a spontaneous
transition between the two states by the formula
S = k log W.
Einstein judged this result so important that he named
it "Boltzmann's Principle." That wonderful formula
was engraved on Boltzmann's gravestone; it is the
bridge we need between the macroscopic and the
microscopic.
Apply this principle to the case of the ideal gas of n
molecules that spontaneously compresses to half its
volume with probability W = (1/2)^n. We find that the
difference in entropy between the gas and that same
gas occupying one half the volume is given by
(5) S = k.log W = k.log (1/2)^n = -nk.log 2
While we arrived at this entropy difference by thinking
about extremely improbable fluctuations in the gas'
volume, it can also be found in standard
thermodynamic treatises, derived entirely from
macroscopic properties of ideal gases, without any
mention of microscopic properties and very unlikely
events. (In particular, you do not need to know the size of N to get
this formula. For nk = n_m.R, where n_m is the number of moles and R
is the ideal gas constant.) But now that we know how to
read the logarithmic dependence of entropy on volume
of (5), we can recognize it as a macroscopic signature
of the spatially localized, independent atoms in the
ideal gas.
Einstein recognized this same signature in a single
frequency cut of high frequency heat radiation. By
drawing directly on experimental measurements of the
thermal properties of high frequency heat radiation, he
noted that the entropy difference between two
quantities of radiation of energy E and frequency f, one
at the full volume and one at the half volume, is just:
(6) S = -(E/hf).k.log 2 = k log (1/2)^(E/hf)
The analogy between formulae (5) and (6) is obvious.
Einstein had now found the macroscopic signature of
atoms in high frequency heat radiation. Comparing
equations (5) and (6), we immediately see that the
heat radiation is governed by a formula appropriate to
a system consisting of E/hf independent components.
Or, to put it another way, it is as if the energy E of the
radiation is divided into independent, spatially localized
components of energy hf. This, you will recall, is just
Einstein's light quantum hypothesis, but now read from
equations (5) and (6).
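The way the number of components is read off from the entropy can be sketched in Python. The sketch takes formulae (5) and (6) at face value, with "log" understood as the natural logarithm. The energy and frequency are arbitrary illustrative values, and the function names are invented for this example.

```python
import math

K_BOLTZMANN = 1.380649e-23   # Boltzmann's constant k, J/K
H_PLANCK = 6.62607015e-34    # Planck's constant h, J s

def entropy_change_on_halving(components):
    """Formulae (5) and (6): S = -components * k * log 2."""
    return -components * K_BOLTZMANN * math.log(2.0)

def components_from_entropy(S):
    """Read the number of independent components back off S."""
    return -S / (K_BOLTZMANN * math.log(2.0))

# Illustrative values: 1 joule of radiation at a frequency of 1e15 Hz.
E = 1.0
f = 1.0e15
S = entropy_change_on_halving(E / (H_PLANCK * f))
n_quanta = components_from_entropy(S)
energy_per_quantum = E / n_quanta   # comes back as h*f
print(n_quanta, energy_per_quantum)
```

Going from the measured entropy difference to a count of components, and from there to an energy hf per component, is the inference from the macroscopic signature to the microscopic constitution.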
You should note how carefully hedged Einstein's
statement of the light quantum hypothesis is. Its most
careful formulation from his 1905 paper is:
"Monochromatic radiation of low density behaves, as
long as Wien's radiation formula is valid [i.e. at high
values of frequency/temperature], in a thermodynamic
sense, as if it consisted of mutually independent
energy quanta of magnitude [hf]."
Einstein is very careful to add many conditions: high
frequency/temperature, low density, "as if" and "in a
thermodynamic sense."
That caution is very prudent. Einstein had not
explained away the quite prodigious body of evidence
from the 19th century all pointing to the wavelike
character of light. Indeed that evidence will never go
away. What Einstein eventually decided a few years
later is that both wave and particle characters are
needed for a full account of light. Sometimes light will
behave like a wave; sometimes like a localized
particle; and sometimes both. That we now know as
"wave-particle" duality.
Modern readers often find it irresistible to jump from
these light quanta of 1905 to modern photons; that
is, to imagine that Einstein was just proposing that light
really consists of particles or corpuscles after all. That
would be a risky jump for all the reasons just given. In
addition, an essential part of the notion of a photon is
that it carries momentum. Nothing in Einstein's
arguments so far has established that his light quanta
of 1905 also carry momentum. That conclusion had to
be established by further analysis and it came with
time.
9. Conclusion
Einstein published three papers in statistical physics in
1905. By any measure, their content is
extraordinary. In one form or another they contained
the seeds of the new theorizing in statistical physics of
the twentieth century. They provided a new method of
estimating the size of molecules, a treatment of the
diffusion of solutes and small particles in viscous
media, the identification of a phenomenon that turned
the tide of resistance to molecular kinetic methods in
physics, a foundational analysis in the new field of
stochastic processes and the demonstration of the
granular character of electromagnetic radiation.
When faced with this wealth, it is hard not to be awed,
let alone to find a unifying theme that permeates the
work. My goal has been to display just such a theme,
even if the theme does not pass through the heart of
every aspect of Einstein's achievement. That theme is
the simple idea that thermal systems consisting of
many, spatially localized, independent components
have the same macroscopic properties, most notably
the satisfaction of the ideal gas law. This fact simplifies
analysis of many systems, since once the
independence of the components is known, the ideal
gas law must follow, whether the system is a gas,
dilute solution or microscopically visible particles in
suspension. And the inference can be inverted. Once
an atomic signature is seen, one can infer back to the
constitution of the system. In the case of high
frequency heat radiation, the presence of the atomic
signature was so definite that it emboldened Einstein
to overthrow the great achievement of 19th century
physics. He rejected Maxwell's electrodynamics and its
wave theory of light, in favor of a new and still
ill-formed quantum account of radiation.
Copyright John D. Norton, May 8, 2005. Minor corrections, May 15, 2005; link to "How big is an atom?" June 17, 2006. Section 8 revised April 11, 2007.
1. Principle of Relativity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/01_P_of_R/index.html[28/04/2010 08:23:39 AM]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 1. Principle of Relativity
For submission Mon Jan. 11, Tues. Jan. 12.
According to the principle of relativity, no experiment conducted within a laboratory can
reveal its uniform (=inertial) motion; all that can be revealed is the uniform motion of the
laboratory with respect to other bodies.
1. Special relativity tells us that moving rods shrink and moving clocks slow down. The
page shows you how to calculate how big these effects are. Two rows for 10,000 mi/sec
and 93,000 mi/sec have been left blank. Fill in the blanks.
2. You have equipped your spaceship laboratory with the finest of instruments. You have
a pure platinum yardstick, machined to be exactly one yard in length, and an atomic clock
that ticks off the seconds with unimaginable accuracy. Your spaceship laboratory is set in
motion at 99.5% of the speed of light with you inside, carefully observing what your rod and
clock do. Special relativity tells us that your rod shrinks to 10% of its length and your clock
runs ten times slower. You check to see if this is so. You know that the distance from your
nose to the tip of your outstretched arm is about one yard; your yardstick still tells you it is a
yard. You know your resting pulse rate is roughly one beat per second; your atomic clock
agrees. Your pulse still beats at roughly one beat per second.
Why do these attempts to detect rod shrinking and clock slowing fail?
If they did not fail, why would your success at measuring rods shrinking and clocks slowing
amount to a violation of the principle of relativity?
For discussion in the recitation.
A. What is inertial motion? An inertial observer? Accelerated motion? Absolute motion?
Relative motion? A light clock?
B. You are in a uniformly moving spaceship that enters an asteroid field. You observe the
asteroids of the field rushing past your window (and fear a collision with one). Does this
observation constitute an experiment that violates the principle of relativity? Explain.
C. You are inside an airplane drinking coffee. The airplane strikes turbulent air. Your
stomach falls and the coffee flies out of the cup. You have no doubt now that you are
moving. Does this observation constitute an experiment that violates the principle of
relativity? Explain.
D. We saw in the chapter that a light clock moving at 99.5% c slows by a factor of 10. We
also know from computing "beta" factors that a clock moving at 86.6% c slows by a factor of
2. Convince yourself of this second result by considering a light clock which moves
transverse to its length at 86.6% c.
E. Use the principle of relativity and the result of D to show that any clock moving at
86.6%c slows by a factor of 2.
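The two slowing factors quoted in D can be checked with a few lines of Python. The sketch assumes the standard special relativistic formula, 1/sqrt(1 - v^2/c^2), for the factor by which a moving clock slows; the function name is invented for this example.

```python
import math

def slowing_factor(v_over_c):
    """Factor by which a clock moving at v_over_c times the speed
    of light slows: 1 / sqrt(1 - (v/c)**2)."""
    return 1.0 / math.sqrt(1.0 - v_over_c ** 2)

print(round(slowing_factor(0.866), 2))   # 2.0
print(round(slowing_factor(0.995), 2))   # 10.01
```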
02 Adding Velocities
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/02_P_of_R_II/index.html[28/04/2010 08:23:41 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 2: Adding Velocities
Einstein's Way
For submission Tues. Jan. 19
What do I do if my recitation is on Monday, Martin Luther King Day?
1. Two spaceships pass a planet, moving in opposite directions. A planet observer judges
each to be moving at 100,000 miles per second. An observer on one of the spaceships
measures the speed of the other spaceship.
(a) According to classical physics, what speed will that spaceship observer measure for the
other spaceship? Is this speed faster than light?
(b) According to relativity theory, what speed will that spaceship observer measure for the
other spaceship? Is this speed faster than light?
2. The planet observer of question 1. above watches the first spaceship observer measure
the speed of the second spaceship by means of a procedure that uses rods and clocks.
Would the planet observer judge that measuring procedure to be a fair one that gives the
correct result?
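For readers who want to check their arithmetic for question 1, here is a minimal sketch comparing the classical rule with Einstein's rule for adding velocities, w = (u + v) / (1 + uv/c^2). The rounded value of c in miles per second and the function names are assumptions made for this illustration.

```python
C = 186_000.0  # speed of light in miles per second (rounded, assumed value)


def classical_sum(u, v):
    """Classical rule: velocities simply add."""
    return u + v


def relativistic_sum(u, v, c=C):
    """Einstein's rule for composing two velocities along the same line."""
    return (u + v) / (1.0 + u * v / c ** 2)


u = v = 100_000.0  # each spaceship's speed relative to the planet
for name, rule in (("classical", classical_sum), ("relativistic", relativistic_sum)):
    w = rule(u, v)
    print(f"{name}: {w:,.0f} miles per second, faster than light: {w > C}")
```

Only the classical sum exceeds c. Composing any speed with c itself under the relativistic rule returns c again.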
For discussion in the recitation.
A. Imagine that you have a gun that can fire a particle at 100,000 miles per second. You
are in a spaceship moving at 100,000 miles per second with respect to the earth. You point
the gun in the direction of your motion and fire. Would an earthbound observer judge the
particle to travel at 200,000=100,000+100,000 miles per second? Show that the
earthbound observer could not since that would violate the principle of relativity, when that
principle is combined with the light postulate. How rapidly would you (the spaceship
observer) judge the particle to be moving?
B. The arguments we have investigated show that relativity theory prohibits us
accelerating an object past the speed of light. Do any of them rule out objects that have
always been traveling faster than light (or, possibly, were created initially already moving
faster than light)?
03 Relativity of Simultaneity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/03_rel_sim/index.html[28/04/2010 08:23:45 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 3: Relativity of
Simultaneity
For submission Mon. Jan. 25, Tues. Jan 26.
For Einstein, the big breakthrough in his work on special relativity came when he found a
way to reconcile the principle of relativity and the light postulate. He recognized that these
principles only seemed irreconcilable because of an unwarranted assumption that we
routinely make about space and time. We assume that all observers should agree on which
events are simultaneous. Instead, Einstein noticed, we may allow for the possibility that
observers in relative motion may disagree about which spatially separated events are
simultaneous. This assumption of the relativity of simultaneity allowed him to retain both the
principle of relativity and the light postulate. This assignment will help you to see how.
1. An observer is at the midpoint of a long spaceship. At the same instant he sends light
signals to both front and rear of the spaceship. Event A is the arrival of the signal at the
rear; event B is the arrival of the signal at the front.
(a) Are the two events A and B simultaneous according to the spaceship observer?
(b) Imagine that there are two good clocks located at the front and the rear of the spaceship
and the arrival of the signals is used to reset each clock to the same time. Are the clocks
now properly synchronized according to the spaceship observer?
(c) The spaceship is moving rapidly in the direction of its length past a planet. An observer
on the planet watches the signaling procedure described above. Does the planet observer
judge events A and B to be simultaneous? If not, which happens first?
(d) Does the planet observer judge the two clocks to be set in proper synchrony? If not,
which is set ahead of the other?
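A numerical sketch of the situation in question 1 may help. It assumes the standard Lorentz transformation for time, units in which c = 1, and illustrative values for the spaceship's length and speed; none of these numbers come from the assignment.

```python
import math


def planet_time(t_ship, x_ship, v, c=1.0):
    """Time the planet observer assigns to an event at (t_ship, x_ship) in the ship frame.

    The ship moves at speed v in the +x direction relative to the planet.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t_ship + v * x_ship / c ** 2)


length = 2.0   # ship length in its own frame (hypothetical value)
speed = 0.6    # ship speed as a fraction of c (hypothetical value)
arrival = (length / 2.0) / 1.0  # ship time at which both signals arrive

t_rear = planet_time(arrival, -length / 2.0, speed)   # event A
t_front = planet_time(arrival, +length / 2.0, speed)  # event B
print(f"planet time of A (rear):  {t_rear:.3f}")
print(f"planet time of B (front): {t_front:.3f}")
```

The two events share one time in the ship frame but receive different times in the planet frame, which is the relativity of simultaneity in numbers.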
2. A light signal flashes back and forth between the two ends of the same spaceship. If
the light postulate is to hold for the spaceship observer, then the spaceship observer must
judge that the light travels at the same speed in all directions. That is, according to the
spaceship observer, the signal must take the same time to travel from front to back as from
back to front. Assume this transit time is one minute. Then the arrival times of the light
signal must be registered as 12:00, 12:02, 12:04, ...etc. at the rear of the ship and 12:01,
12:03, 12:05, ... etc. at the front.
(a) Assume the light postulate also holds for the planet observer. Will the planet observer
judge the transit time for the forward trip of the light signal to be the same as the transit
time for the backward trip? If not, which is longer?
(b) How can the planet observer reconcile the answer to 2.(a) with the readings on the
clocks of the moving spaceship that record the transit times for the light signal?
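The planet observer's reckoning in 2.(a) can be sketched as follows. The sketch assumes the light postulate for the planet observer, units in which c = 1, and a spaceship whose length and speed (as judged from the planet) are illustrative values only.

```python
def planet_transit_times(length, v, c=1.0):
    """Transit times judged by the planet observer.

    length: ship length as judged by the planet observer
    v: ship speed relative to the planet, in the direction rear-to-front
    """
    forward = length / (c - v)   # light chases the receding front of the ship
    backward = length / (c + v)  # the rear of the ship rushes to meet the light
    return forward, backward


forward, backward = planet_transit_times(length=1.0, v=0.5)
print(f"forward trip:  {forward:.3f}")
print(f"backward trip: {backward:.3f}")
```

The two times differ for the planet observer whenever the ship is moving, even though the ship's own clocks record equal transit times.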
For discussion in the recitation.
A. Two observers I and II both stand on a large platform. There are two lightning strikes,
A and B.
The observer I is located at the midpoint of the spatial locations of the strikes A and B. Light
signals coming from the strikes A and B arrive at this observer I at the same time.
The observer II is located much closer to the strike A. As a result, the light signal from strike
A arrives at observer II much earlier than the light signal from strike B.
Observer I sees the signals at the same time; observer II sees them at different times.
Is this difference the relativity of simultaneity of relativity theory? If not, why not?
B. Two identical spaceships pass one another, moving rapidly in opposite directions at
the same speed according to an observer on a nearby planet. The planet observer judges
that both spaceships have shrunk the same amount due to relativistic length contraction.
So they are the same length and, in conformity with this expectation, the planet observer
notes that the two spaceships line up perfectly as they pass.
An observer on one of the spaceships, however, finds the other spaceship to be moving
rapidly. So that spaceship observer judges the other spaceship to have shrunk relative to
the first spaceship. And an observer on the second spaceship comes to the reverse
judgment, that the first spaceship has shrunk.
How is it possible for all of them to come to such different judgments?
C. Judgments of simultaneity are involved in any procedure that measures the length of
moving bodies or the times elapsed for processes on them. Consider some procedures for
measuring such lengths and times and show how judgments of simultaneity are hidden in
them. What if, for example, we measure the length of a moving body by timing how long it takes to pass a single
observation point, where we use just one clock to time its passage? Its length is then just its speed multiplied by the time
measured.
4. Origins
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/04_origins/index.html[28/04/2010 08:23:46 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 4: Origins of Special
Relativity
For submission Mon. Feb. 1, Tues. Feb 2.
Read the introduction and first two sections of Einstein's paper "On the electrodynamics of
moving bodies." Read it slowly and reverently. This text is to modern physics what Genesis
is to modern Judeo-Christianity and the Declaration of Independence is to US history.
1. Compare what is moving with respect to what in the magnet and conductor thought
experiment in the two accounts you have: the one Einstein gives in his paper and the one in
the chapter section, Magnet and Conductor. How do the two accounts differ?
2. What is the "definition of simultaneity" that Einstein describes in the first section of his
paper? That is, what must be stipulated by definition according to Einstein if we are to be
able to compare the timing of events at a point A and a point B of space?
For discussion in the recitation.
A. In the introduction, what is established by the magnet and conductor thought
experiment? How do ether current experiments enter the discussion? What is "apparently
irreconcilable" and why is it so? How is Einstein suggesting that he will solve the problem?
B. In Section 2, how does Einstein establish that observers in relative motion may
disagree on the lengths of rods and the synchrony of clocks?
C. If the synchrony of different clocks is set by a definition, presumably freely chosen,
then it would seem that any velocities measured by them are also a matter of freely chosen
definition. So how can Einstein at the end of Section 1 say that the speed
of light is a universal constant "in agreement with experience"?
Here are some questions about E=mc²:
D. What does the law of conservation of mass say? What does the law of conservation of
energy say? In classical physics, these are two separate laws. What becomes of them in
relativity physics?
E. When an electric battery is charged, what happens to its mass?
When a hot body cools, what happens to its mass?
When a spring is compressed, what happens to its mass?
Inside a completely isolated space station, an electric battery is used to warm the hands of
an astronaut and to run a motor that winds a spring. What happens to the total energy of
the space station? What happens to the total mass of the space station?
F. When an atom of Uranium-235 undergoes fission and breaks into parts, the total mass
of the parts is less than the mass of the original atom. What happens to the missing mass?
Why is this missing mass important in modern life?
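To get a feel for the sizes involved in these questions, the following sketch converts an energy change into the corresponding mass change using m = E / c^2. The energy figure is a hypothetical illustration, not a value from the course text.

```python
C = 299_792_458.0  # speed of light in metres per second


def mass_equivalent(energy_joules):
    """Mass in kilograms that corresponds to the given energy, via m = E / c^2."""
    return energy_joules / C ** 2


# Hypothetical example: a body that gains or loses one million joules of energy.
energy = 1.0e6
print(f"{energy:.0e} J corresponds to a mass change of {mass_equivalent(energy):.2e} kg")
```

The mass change is far too small to notice in everyday processes, which is one reason the two conservation laws could be treated separately in classical physics.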
5 Spacetime
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/05_spacetime/index.html[28/04/2010 08:23:49 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 5: Spacetime
For submission
1. Draw a spacetime diagram with the following elements. Be sure to label each one clearly.
An event O.
A worldline of an observer A that passes through O.
The light cone at O.
The hypersurface of all events simultaneous with O (for observer A).
An event E_past which is in the past of O and can causally affect O.
An event E_future which is in the future of O and can be causally affected by O.
An event E_elsewhere which is outside the light cone of O and cannot be causally affected by
O.
A timelike curve through O.
A spacelike curve through O.
A lightlike curve through O.
2. On the spacetime diagrams below:
(a) An observer A judges the two events E_1 and E_2 to be simultaneous.
Draw the worldline of the observer A and a hypersurface of events that A will judge to be simultaneous.
How does this hypersurface support A's judgment of the simultaneity of E_1 and E_2?
(b) An observer B moves relative to A and judges E_1 to be later than E_2.
Draw the worldline of observer B and a hypersurface of events that B will judge to be simultaneous.
How does this hypersurface support B's assessment of the time order of E_1 and E_2?
(c) An observer C moves relative to A and judges E_1 to be earlier than E_2.
Draw the worldline of observer C and a hypersurface of events that C will judge to be simultaneous.
How does this hypersurface support C's assessment of the time order of E_1 and E_2?
(d) If C judges a tachyon to have travelled from E_1 to E_2, what would A and B say about it?
For discussion in the recitation
A. The relativity of simultaneity is revealed most simply in the following thought experiment in
which two observers in relative motion judge the timing of two explosions by means of the light
signals they produce:
Draw a spacetime diagram of this experiment, indicating:
The planet observer's worldline and associated hypersurfaces of simultaneity.
The spaceship observer's worldline and associated hypersurfaces of simultaneity.
The worldlines of the front and rear of the spaceship.
The two explosion events.
The light signals emitted by the explosions.
B. At sunrise of Day 1, a monk commences a long walk up the narrow, winding road from the
monastery in the valley to the mountain top. It is a hard, tiring climb, so he stops frequently to
rest and even reverses his direction from time to time. He arrives at the mountain top just at the
moment of sunset. At sunrise on Day 2, the monk commences the return journey. This time the
journey is far easier. Rather than hurry to complete it quickly, the monk decides to pause
frequently to admire the wildflowers, inhale the mountain air and absorb the splendor of the
view. He arrives in the valley at the moment of sunset.
Is there any moment on the two days at which the monk is in exactly the same position on the
road?
At first it seems impossible to determine an answer to this question from the information given.
Whether there is such a moment seems to depend on the details of the monk's progress up and
down the mountain. Drawing spacetime diagrams rapidly solves the problem, however. To see
how, draw plausible world lines for the monk's two journeys on the spacetime diagrams here.
Explain how they make it obvious that the moment specified in the question must always exist
no matter what the details of the monk's progress. (Hint: To see this, imagine the two spacetime
diagrams superimposed.)
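The superposition argument can also be tried numerically. The sketch below invents two world lines for the monk (both functions are hypothetical, chosen only to include pauses and a reversal) and searches by bisection for the time of day at which the two positions agree. Time runs from 0 (sunrise) to 1 (sunset) and position from 0 (monastery) to 1 (mountain top).

```python
import math


def up(t):
    """Hypothetical Day 1 world line: climbs overall, sometimes backtracking."""
    return t + 0.1 * math.sin(6.0 * math.pi * t)


def down(t):
    """Hypothetical Day 2 world line: lingers near the top, then descends."""
    return 1.0 - t ** 3


def meeting_time(f, g, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection on f - g, which changes sign between lo and hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if (f(lo) - g(lo)) * (f(mid) - g(mid)) <= 0.0:
            hi = mid  # the sign change lies in the lower half
        else:
            lo = mid  # the sign change lies in the upper half
    return (lo + hi) / 2.0


t = meeting_time(up, down)
print(f"same position at time {t:.4f}: up = {up(t):.4f}, down = {down(t):.4f}")
```

Any other pair of continuous world lines with the same endpoints could be substituted for `up` and `down`; the search still succeeds, because the difference of the two positions must pass through zero.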
6. Significance
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/06_significance/index.html[28/04/2010 08:23:51 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 6: Philosophical
Significance of the Special Theory of
Relativity
For submission
Consider the candidate morals in the chapter, The Philosophical Significance of the
Special Theory of Relativity.
1. Which, if any, do you find most convincing? If you answer "none of the above," propose
an alternative.
2. In your own words, give a clear statement of the moral.
3. State clearly the argument in favor of the moral.
For discussion in the recitation
A. Consider the two challenges in "What is a four dimensional space like?" The
second is to show that there are no knots in a four dimensional space. Use the techniques
described to show that if the knot shown were in a four dimensional space, the knot could
be untied without detaching the ends of the rope from the walls.
Hint: consider the section of the rope marked "XXXXX." What if it were lifted into the fourth
dimension?
B. An equilateral triangle is a plane figure bounded by three lines of equal length. It is
drawn by taking a line AB and a point C not on AB. The points A and B are connected to C
with straight lines. C is selected so that all three lines AB, AC and BC are equal in length.
A regular tetrahedron is a three dimensional solid bounded by four equilateral triangles. It
is drawn by taking an equilateral triangle ABC and a fourth point D. The points A, B and C
are connected to D by straight lines. D is selected so that each of the triangles ABC, ABD,
BCD and ACD is equilateral.
Continuing in this pattern, what does a four dimensional tetrahedron look like? How is it
constructed? Draw one.
(For the brave to tackle outside the recitation: Compute the area and volume of an equilateral triangle
and a regular tetrahedron. Continue to compute the four dimensional volume of the figure drawn in
B. Warning: This is a hard problem. I have not found a simple way of doing it!)
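For those who attempt the computation, one way to check an answer is the standard closed-form expression for the n-dimensional volume of a regular simplex with edge length a: V_n = (a^n / n!) * sqrt((n + 1) / 2^n). This formula is brought in from standard geometry and is not derived in the chapter; the sketch below simply evaluates it.

```python
import math


def regular_simplex_volume(n, edge=1.0):
    """n-dimensional volume of a regular simplex with the given edge length."""
    return (edge ** n / math.factorial(n)) * math.sqrt((n + 1) / 2 ** n)


# n = 2: equilateral triangle (area), n = 3: regular tetrahedron (volume),
# n = 4: the four dimensional tetrahedron of question B.
for n in (2, 3, 4):
    print(f"n = {n}: volume for unit edge = {regular_simplex_volume(n):.6f}")
```

For unit edge length the values agree with the familiar sqrt(3)/4 for the triangle and 1/(6 sqrt(2)) for the tetrahedron; the four dimensional case gives sqrt(5)/96.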
07 Non-Euclidean Geometry
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/07_Non_Euclidean/index.html[28/04/2010 08:23:53 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 7: Non-Euclidean
Geometry
For submission
1. Consider a geometry in which Euclid's 5th postulate is replaced by:
Through any point NO straight line can be drawn parallel to a given line.
Show that there is at least one triangle in this geometry whose angles sum to more than
two right angles.
Hint: On a line PQ, select two points A and B. Construct lines AC and BD perpendicular to
PQ. What happens if AC and BD are extended in both directions?
2. In a Euclidean space, what is
(a) the sum of the angles of any triangle;
(b) the circumference of a circle with radius 10,000 km;
(c) the area of a right angled triangle if the length of the sides enclosing the right angle are
both 10,000 km?
3. The geometry of 1. above, suitably treated, is the geometry of the surface of a sphere.
The Earth is, to good approximation, a sphere of circumference 40,000 km.
(a) On this sphere, what is the sum of the angles of a triangle all of whose sides are 10,000
km? (An example of such a triangle is shown as triangle ABC. It has one vertex at the
North Pole and extends down to the equator.)
(b) What is the circumference of a circle of radius 10,000 km in this surface?
(c) The triangle ABC is a right angled triangle all of whose sides are 10,000 km long. What
is its area? (Reminder: The area of the Earth is 509,300,000 sq. km.) Compare your
answers to questions 2 and 3.
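The comparison between questions 2 and 3 can be checked with a short computation. The sketch assumes a perfect sphere of circumference 40,000 km and uses two standard results of spherical geometry: a circle of geodesic radius r on a sphere of radius R has circumference 2 pi R sin(r/R), and a triangle's area is its angular excess (in radians) times R^2. The triangle ABC, with a vertex at the pole and its base on the equator, has three right angles.

```python
import math

EARTH_CIRCUMFERENCE_KM = 40_000.0
R = EARTH_CIRCUMFERENCE_KM / (2.0 * math.pi)  # radius of the sphere
side = 10_000.0                               # circle radius and triangle side, km

# Euclidean plane (question 2)
flat_circumference = 2.0 * math.pi * side
flat_area = 0.5 * side * side

# Surface of the sphere (question 3)
sphere_circumference = 2.0 * math.pi * R * math.sin(side / R)
angle_sum_degrees = 270.0                      # three right angles
excess = math.radians(angle_sum_degrees - 180.0)
sphere_area = excess * R ** 2

print(f"circumference: flat {flat_circumference:,.0f} km, sphere {sphere_circumference:,.0f} km")
print(f"triangle area: flat {flat_area:,.0f} sq. km, sphere {sphere_area:,.0f} sq. km")
```

On the sphere the circle comes out as the equator itself, and the triangle covers one eighth of the sphere's surface.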
4. If you had before you a two dimensional surface of constant curvature, how could you
determine whether the curvature was positive, negative or zero by measuring
(a) the sum of angles of a triangle;
(b) the circumference of a circle of known radius?
5. How could you check whether our three dimensional space (if it has constant
curvature) has a positive, negative or zero curvature by measuring
(a) the sum of angles of a triangle;
(b) the surface area of a sphere of known radius?
For discussion in the recitation.
A. Does it make sense to say that a space has a curved geometry if there is no higher
dimensioned space into which the space can curve?
B. In the context of Question 5, how might we go about measuring the sum of the angles
of a triangle in our actual space? Remember, ordinary measurements of things in our actual
space conform closely to Euclidean geometry. Architects routinely build skyscrapers using
Euclidean geometry. Therefore, if our actual space turns out not to be Euclidean, the
amount of curvature would have to be very, very small. We would need a very, very
accurate way of measuring the angles of a triangle for this test. Only then would we be able
to tell if the sum is really 179.9999 degrees or 180.00001 degrees. What might such a very,
very accurate means be?
Hint: If our space has some curvature, the deviation from 180 degrees in the sum of the angles of a
triangle becomes greater the larger the triangle. That makes the detection of deviations from
Euclidean geometry easier.
C. The discovery of non-Euclidean geometries eventually precipitated a crisis in our
understanding of what has to be and what just might be the case. At one extreme are
necessities, such as the truths of logic; they have to be true. At the other extreme are
mundane factual matters: contingent statements that may or may not be true. Somewhere
in between is a transition. Locating that transition has traditionally been of great importance
in philosophy and philosophy of science. For if something is necessarily true, we need
harbor no doubt over it. If something is contingent, the mainstream empiricist philosophy
says we can only learn it from experience. Sometimes the contingent proposition is very
broad. For example, consider the proposition that there never has been and never will be a
magnet with only one pole. We may come to believe this proposition with ever greater
confidence. But we can never be absolutely certain of it. We never know whether tomorrow
will bring the counterexample.
Just where should the transition between necessity and contingency come?
Here is a list of propositions that begins with logical truths and bleeds off into ordinary
contingent propositions. Sort them into necessary truths and contingent propositions. How
are you deciding which is which?
If A and B are both true, then A is true.
If one of A or B is true and A is false, then B is true.
For any proposition A, either A is true or A is not true.
1 + 1 = 2
7 + 5 = 12
There are an infinity of prime numbers.
Every circle has one center.
The sum of the angles of a triangle is two right angles.
Only the fittest survive.
Every effect has a cause.
Every occurrence has a cause.
No effect comes before its cause.
Improbable events are rare.
Energy is always conserved.
Force equals mass times acceleration.
The earth has one moon.
08 Curvature
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/08_Non_Euc_GR/index.html[28/04/2010 08:23:56 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 8: Curvature
For submission
1. What is the difference between extrinsic and intrinsic curvature?
2. Imagine that you are a two dimensional being trapped in a flat two dimensional surface.
(a) How would you use geodesic deviation to confirm the flatness of your surface?
(b) Imagine that a three dimensional being picks up your surface and bends it into a cylinder,
without in any way stretching your surface. (This is just what happens when someone takes
a piece of paper and rolls it into a cylinder.) You are still trapped in the surface. If you now
use geodesic deviation to determine the curvature of your surface, would you get the same
result as in (a)? Explain why.
3. In antiquity, it was observed that the position of the northern pole star changed as the
observer's position changed in the north-south direction. Specifically, for each 69 miles =
111 km that the observer moved northward, the pole star rose in elevation by one degree.
(a) Explain how this observation enabled ancient astronomers to argue that the surface of
the earth is curved. (Note that the ancient astronomers knew that the pole star was so far
away that no change of position on the earth's surface brings us appreciably closer to it.)
(b) Use it to estimate the circumference of the earth.
(c) Explain why this observation enables the establishing of extrinsic curvature.
(d) Explain why this observation, by itself, does not enable us to infer the intrinsic curvature
of the earth's surface. (Hint: Is there a shape with extrinsic curvature, but no intrinsic
curvature that exhibits the effect?)
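The estimate asked for in 3.(b) is a one-line scaling: if one degree of elevation corresponds to a fixed distance travelled northward, a full circle of 360 degrees corresponds to 360 times that distance. The sketch assumes the earth is a sphere and uses the figures given in the question.

```python
MILES_PER_DEGREE = 69.0
KM_PER_DEGREE = 111.0


def circumference(distance_per_degree):
    """Circumference of a sphere on which one degree spans the given distance."""
    return 360.0 * distance_per_degree


print(f"{circumference(MILES_PER_DEGREE):,.0f} miles")  # 24,840 miles
print(f"{circumference(KM_PER_DEGREE):,.0f} km")        # 39,960 km
```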
4. In a space with three or more dimensions, the curvature need not be the same in every
two dimensional sheet that passes through some point in the space. Of course sometimes
things are simple and the curvature does work out the same. Here's an example. Imagine
that you are in an ordinary, three dimensional Euclidean space. You slice the space up into
the flattest two dimensional sheets you can find, all built out of intersecting straight lines.
The first set of sheets run left-right and up-down. The second set of sheets run left-right
and front-back. The third set of sheets run up-down and front-back. You use geodesic
deviation to determine the curvature of the sheets in each set. What is the curvature of:
(a) The left-right and up-down sheets?
(b) The left-right and front-back sheets?
(c) The up-down and front-back sheets?
(d) Things need not work out so simply. In what space discussed in the chapter would the
results be different?
For discussion in the recitation.
A. Here's an exercise that shows how geodesic deviation can be used to determine how
much curvature a surface has, not just whether it is zero, positive or negative. Geodesic
deviation can be used by observers on the surface of a planet to determine whether they
are on an earth sized planet or on one twice its size with correspondingly different
curvature.
(a) Two observers stand on the earth's equator 100 miles apart. They begin to move
northward. After traveling 100 miles they find that they are closer by 169 feet. How is this
effect related to the curvature of the earth's surface?
(b) If they had started 200 miles apart and moved 100 miles due north, by how much would
they have approached each other? Convince yourself that your answer is correct by
drawing a figure.
(c) Imagine that, before the observers start their motions, the earth is inflated to twice its
size so that its radius of curvature has doubled. The observers of (a) are carried along with
the inflation, like two ants sitting on a balloon. They now start 200 miles apart. After they
have moved 200 miles due north, by how much would they have converged? (Hint: get the
answer just by scaling up everything in (a)!)
(d) Use your answer to (b) to convince yourself that the result of (c) could not happen on an
earth of the original size, so that the amount of convergence can be used to determine if
the surface is the earth's or a planet of twice its size.
Technical note: What makes these computations messy is that the amount of convergence increases
with the square of the distance the observers travel north. The formula is
Convergence = (1/2) x (east-west distance at equator) x (distance moved north)^2 / (radius of earth)^2
where the formula holds only as long as the two distances are very small compared to the radius of
the earth. This formula can be inverted to determine the radius of the earth from local measurements
of the other distances in the formula.
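The formula in the technical note can be checked against the 169 feet quoted in (a). The sketch below assumes a spherical earth of radius about 3,959 miles (a value not given in the assignment) and compares the small-distance formula with the exact result for two travellers moving north along meridians, whose separation shrinks with the cosine of the latitude reached.

```python
import math

EARTH_RADIUS_MILES = 3959.0  # assumed mean radius of the earth
FEET_PER_MILE = 5280.0


def convergence_approx(separation, north, radius):
    """Small-distance formula from the technical note."""
    return 0.5 * separation * north ** 2 / radius ** 2


def convergence_exact(separation, north, radius):
    """Exact convergence on a sphere: separation scales with cos(latitude)."""
    return separation * (1.0 - math.cos(north / radius))


approx_feet = convergence_approx(100.0, 100.0, EARTH_RADIUS_MILES) * FEET_PER_MILE
exact_feet = convergence_exact(100.0, 100.0, EARTH_RADIUS_MILES) * FEET_PER_MILE
print(f"approximate: {approx_feet:.1f} feet, exact: {exact_feet:.1f} feet")
```

Both values land within about a foot of the figure in (a). Changing the arguments lets you try the cases in (b) and (c).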
B. Here's an example that illustrates how curvature can vary in different directions.
Consider the extruded spherical space discussed in the chapter on Spaces of Variable
Curvature. Imagine that somehow you have been transported into this space. You want to
figure out which are the east-west, north-south and up-down directions in this space. To do so,
you label three perpendicular directions "X," "Y" and "Z." You slice the space into three
different types of two dimensional sheets. The XY sheets contain the directions X and Y;
and so on for XZ and YZ. You now have three sorts of sheets in which you can carry out
geodesic deviation measurements. Let us say you end up with the following results:
XY sheet: geodesics converge
XZ sheet: geodesics neither converge nor diverge
YZ sheet: geodesics neither converge nor diverge
(a) What sort of curvature does each of the three sheets have?
(b) Which of the X, Y and Z directions can correspond to east-west, north-south and up-down?
Explain how you arrived at this identification.
09 General Relativity
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/09_general_relativity/index.html[28/04/2010 08:23:59 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 9: General Relativity
For submission
1. The effect of geodesic deviation can be used to detect curvature in spacetime.
(a) The simplest case is the gravitation free Minkowski spacetime. Consider four objects
arranged at equal distances apart in a straight line in Minkowski spacetime and initially at
rest. Draw a spacetime diagram of their ensuing worldlines. Use the notion of geodesic
deviation to conclude that the sheet of the spacetime that they are exploring is flat.
(b) Now imagine that the same four bodies are momentarily at rest, high above the surface
of a planet, such as our earth, all lined up at the same altitude. They are released and
begin to fall towards the planet. Draw a spacetime diagram of the ensuing worldlines. Use
the notion of geodesic deviation to conclude that the sheet of spacetime they are exploring
is curved.
2. (a) What is the essential idea of Einstein's gravitational field equations?
(b) Why is it plausible that the Minkowski spacetime of special relativity conforms to them in
case the spacetime's matter density is everywhere zero?
(c) Does this mean that a Minkowski spacetime is the only possibility where the matter
density is zero? Why not?
3. (a) What consequence does the equality of inertial and gravitational mass of Newtonian
theory have for bodies in free fall?
(b) How is this consequence important to Einstein's new theory of gravity, which depicts
gravitational effects as resulting from a curvature of spacetime?
For discussion in the recitation.
A. According to general relativity, there is noticeable curvature in the space-time sheets of
spacetime in the vicinity of the earth. That curvature is manifested as gravitational effects.
General relativity also tells us that the geometry of space above the surface of the earth
has a very, very slight curvature as well. That would be manifested as a curvature in a
"space-space" sheet of spacetime. How could geodesic deviation be used to detect it,
assuming that precise enough measurements could be made?
B. Einstein first hit upon the idea that gravitation slows clocks through a thought
experiment conducted fully within a Minkowski spacetime of special relativity. He imagined
an observer with two clocks all enclosed within a box and accelerating uniformly in a
Minkowski spacetime. He then showed that, according to special relativity, the clocks run at
different rates, according to their position in the box. The farther forward they are in the
direction of the acceleration, the faster they run. Einstein's principle of equivalence then
added the assertion that the inertial field appearing in the box was nothing other than a
special form of a gravitational field. So he concluded that clocks run at different rates
according to their altitude in a gravitational field. The higher clocks run faster and the lower
ones slower.
The relative slowing of the clocks can be recovered fully from the spacetime geometry of a
Minkowski spacetime. Here is a spacetime diagram of two clocks accelerating. The
acceleration is in the direction from the A clock to the B clock. Draw in hypersurfaces of
simultaneity for observers located with the clocks and moving with them. Show that the B
clock observer judges the A clock to run slower; and the A clock observer judges the B
clock to run faster.
C. Einstein took a radically new approach to gravity by declaring it to coincide with a
curvature of spacetime. However, as we have seen in the chapter, the same thing can be
done with Newtonian gravitation theory, so that all its gravitational effects can be
associated with a curvature in some parts of spacetime. So what is new with Einstein's
proposal?
D. You can take a flat sheet of paper and wrap it into a cylinder, so that its rightmost edge
coincides with its leftmost edge. That operation does not affect the intrinsic flatness of the
paper. One can do the same thing in imagination with a cubical chunk of Minkowski
spacetime to create a very odd, new spacetime. Take the chunk's rightmost edge and
declare that it coincides with its leftmost edge. That means that anyone traveling past the
surface marking the rightmost edge of this space would simply pop back at the surface marking
the leftmost edge. Use geodesic deviation to convince yourself that the wrapping up of this
spacetime has not changed the flatness of the spacetime.
10 Relativistic Cosmology
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/10_cosmology/index.html[28/04/2010 08:24:01 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 10: Relativistic
Cosmology
For submission
1. Name a spacetime that has the following properties:
(a) It is uniformly filled with matter that is everywhere at rest.
(b) It is empty of matter but space collapses and then expands everywhere.
(c) It has a special center in the geometry of its space.
(d) It has no matter and no gravitational effects anywhere.
2. (a) What are Einstein's gravitational field equations of 1915? How does Einstein's
cosmological constant λ modify them?
(b) Show how the cosmological term can be reinterpreted as representing a form of matter in space.
(c) Why is the form of matter odd?
3. Imagine a time-travel, cylinder universe which is empty except for one mass.
(a) Draw in the worldline of the mass when it remains at rest in the space and reconnects
with itself.
(b) Draw the worldline of the mass when the mass moves to the right.
(c) The mass can collide with its future self. The collision is such that the mass gets
deflected by just the right amount to come back as the later self of the collision. Draw the
worldline that shows this, recalling that aside from collisions the mass moves inertially, i.e.
in a straight line in the space.
Hint: Here's a way of resolving collisions in a spacetime diagram. The diagram
opposite shows what happens in spacetime when a body A approaches a
body B at rest and with equal mass. If the collision is elastic, body A comes to
rest and body B moves off with the same velocity that A had initially.
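The claim in the hint can be checked with a little arithmetic. The sketch below assumes ordinary Newtonian mechanics in one dimension, with momentum and kinetic energy both conserved. The function name elastic_collision_1d and the sample speed are illustrative choices, not part of the assignment.

```python
def elastic_collision_1d(m_a, v_a, m_b, v_b):
    """Final velocities (v_a', v_b') after a 1D elastic collision.

    Assumption: Newtonian mechanics; momentum and kinetic energy are conserved.
    """
    total = m_a + m_b
    v_a_final = ((m_a - m_b) * v_a + 2 * m_b * v_b) / total
    v_b_final = ((m_b - m_a) * v_b + 2 * m_a * v_a) / total
    return v_a_final, v_b_final


# Body A approaches at speed 0.5 (arbitrary units); body B has equal mass and is at rest.
print(elastic_collision_1d(1.0, 0.5, 1.0, 0.0))  # (0.0, 0.5)
```

For equal masses the two bodies simply exchange velocities, which is why the worldlines in the diagram appear to swap slopes at the collision.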
For discussion in the recitation.
A. (a) If time travel were possible, the familiar paradox tells us that we could travel back
in time, assassinate our grandfather in his youth, thereby precluding our birth. A
contradiction ensues, since it now follows both that you traveled back in time and that you
did not travel back in time. Good physical theories cannot tolerate contradictions. Does this
mean we should abandon any theory that tells us that time travel is possible?
(b) In an old movie, a time traveler enters William Shakespeare's room just at the moment
he is writing Hamlet's famous soliloquy. Shakespeare, however, is completely stumped and
cannot find the right line. "To be or not to be." the time traveler whispers impatiently in
Shakespeare's ear. "An excellent line," Shakespeare exclaims as he dutifully writes it in his
manuscript. The puzzle is this: who thought up the line? More generally, is this the same
sort of paradox as the "grandfather paradox"? Or is there something significantly different
about it?
(c) Here's another version of the paradox of (b). A time traveler steals Michelangelo's
famous statue of David from its gallery in Florence and transports it back to Michelangelo's
workshop in 1501, just as the sculptor is about to start work on the statue. The time traveler
kidnaps the sculptor, keeps him trapped for the 3 years needed to sculpt the masterpiece
and places the stolen statue in Michelangelo's workshop. When he is released,
Michelangelo is too embarrassed to admit that he did not make the statue. Who made the
statue?
B. Imagine a Minkowski spacetime wrapped
up in one spatial direction. A space traveler
synchronizes his clock with one on earth and
then leaves earth. The traveler moves
inertially, eventually coming back to earth
without ever changing direction. When the
traveler's clock and the earth clock are
compared, the traveler's clock will be found to
have been slowed by the motion and will read
less than the earth clock. Is this a violation of
the principle of relativity? Shouldn't the
traveler expect the earth clock to have run
slower? Note that this version of the "twin"
problem is unlike the familiar one in so far as
the traveler moves inertially at all times; there
is no turning around and thus no acceleration.
(Hint: this space has a preferred state of
motion! To find it, try drawing in the
hypersurfaces of simultaneity of the earth and
of the space traveler.)
11 Big Bang Cosmology
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/11_big bang/index.html[28/04/2010 08:24:04 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 11: Big Bang Cosmology
For submission
We can use Hubble's law to arrive at a crude estimate of the age of the universe. That is,
we will calculate how long ago all the galaxies were crammed into our neighborhood of
space. This time will be our estimate of how long ago the big bang happened. We will
assume that each galaxy has moved at a constant speed for all time, although this speed
will vary from galaxy to galaxy.
We will use the value of 20 km/sec per 1,000,000 lightyears for Hubble's constant.
1. (a) If a galaxy is 1,000,000 lightyears away from us now, according to Hubble's law,
how fast is it receding from us?
(b) A galaxy traveling at 1 km/sec will travel one lightyear in 300,000 years. How long does
the galaxy of (a) require to travel a lightyear?
(c) How long did it take the galaxy of (a) to get to its position 1,000,000 lightyears distant
from us?
2. Repeat the calculation of 1. for a galaxy now 2,000,000 light years distant from us.
3. Repeat the calculation of 1. for a galaxy now 3,000,000 lightyears distant from us.
The final result of 1., 2., and 3. should be the same. At the time calculated, all the matter of
the universe would have been compressed into our neighborhood. This is our estimate of the
age of the universe, often called the "Hubble age."
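The chain of steps in questions 1 to 3 can be sketched in a few lines. The sketch uses only the two numbers given above (Hubble's constant of 20 km/sec per 1,000,000 lightyears, and 300,000 years per lightyear at 1 km/sec). The function names are illustrative, and the constant-speed assumption is the crude one stated in the assignment.

```python
HUBBLE_KM_S_PER_MLY = 20.0          # km/sec per 1,000,000 lightyears (value from the text)
YEARS_PER_LY_AT_1_KM_S = 300_000.0  # years needed to cover one lightyear at 1 km/sec


def recession_speed(distance_ly):
    """Hubble's law: speed (km/sec) of a galaxy now distance_ly lightyears away."""
    return HUBBLE_KM_S_PER_MLY * distance_ly / 1_000_000.0


def travel_time_years(distance_ly):
    """Time to reach distance_ly at the galaxy's present speed, held constant."""
    years_per_lightyear = YEARS_PER_LY_AT_1_KM_S / recession_speed(distance_ly)
    return years_per_lightyear * distance_ly


for distance in (1_000_000, 2_000_000, 3_000_000):
    print(distance, recession_speed(distance), travel_time_years(distance))
```

A galaxy twice as far away recedes twice as fast, so the distance cancels out and every galaxy yields the same travel time.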
4. The dynamics that drive standard relativistic cosmologies are somewhat hard to
understand. It turns out that this relativistic dynamics is mimicked in several important
aspects by some simple dynamical systems in Newtonian theory. Those systems consist of
a quantity of matter concentrated into a point in an empty Newtonian universe. That point
explodes violently throwing out fragments of matter in all directions, producing an
expanding cloud of debris. In Newtonian gravitation theory, every fragment of matter exerts
an attractive gravitational force on every other fragment. These attractive forces act to pull
the fragments of the cloud back together, slowing the rate of expansion of the cloud of
debris.
There are three different types of histories for the cloud, according to the energy of the
initial explosion:
I. Low energy explosion. The energy of the explosion is not great enough to overcome the
attractive forces of gravitation and the cloud collapses back onto itself under gravitational
forces.
II. High energy explosion. The energy of the explosion is sufficient to overcome the
attractive forces of gravitation. The fragments continue to move apart without limit. The
cloud is spread more and more thinly over time and never collapses back to a point. Only a
part of the total energy of the explosion is needed to overcome the attractive forces of
gravitation. The remainder fuels a continuing rapid expansion.
III. Critical energy explosion. The energy of the explosion is the exact minimum needed to
prevent recollapse. Over time all of the energy of the explosion is used up in counteracting
the attractive forces of gravitation. The critical energy level lies exactly on the boundary
between the energies of I. and II.
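The three cases can be told apart by comparing kinetic energy with gravitational binding energy. The sketch below makes a simplifying assumption that is not in the text: it follows a single fragment at the edge of a spherical cloud of total mass M and radius r, so that the fragment escapes only if (1/2)v^2 is at least GM/r. The function names and the sample mass and radius are illustrative.

```python
import math

G = 6.674e-11  # Newton's gravitational constant, m^3 kg^-1 s^-2


def escape_speed(radius, mass):
    """Speed at which kinetic energy per unit mass equals G*M/r."""
    return math.sqrt(2.0 * G * mass / radius)


def classify_explosion(speed, radius, mass, rel_tol=1e-9):
    """Return 'I', 'II' or 'III' for a fragment thrown outward at the given speed."""
    kinetic = 0.5 * speed ** 2   # kinetic energy per unit mass
    binding = G * mass / radius  # energy per unit mass needed to escape
    if math.isclose(kinetic, binding, rel_tol=rel_tol):
        return "III"  # critical: exactly enough energy to prevent recollapse
    return "I" if kinetic < binding else "II"


# Hypothetical cloud: one solar mass within a radius of 1e9 metres.
v_crit = escape_speed(1.0e9, 2.0e30)
print(classify_explosion(0.5 * v_crit, 1.0e9, 2.0e30))  # I
print(classify_explosion(2.0 * v_crit, 1.0e9, 2.0e30))  # II
print(classify_explosion(v_crit, 1.0e9, 2.0e30))        # III
```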
(a) Which Newtonian model is associated with which relativistic cosmology?
(b) While these Newtonian models are remarkably good in mimicking the relativistic
dynamics, the Newtonian models differ from the relativistic cosmologies in several very
important ways. What are they?
For discussion in the recitation.
A. It may seem that Hubble's law conflicts with the basic supposition of
Friedman-Robertson-Walker cosmology that the universe is homogeneous and isotropic in space. For
Hubble's law tells us that everything is rushing away uniformly from our particular galaxy.
Does not that make our galaxy some sort of special center of galactic motion, different from
every other galaxy? The following calculations show that the galactic motions of Hubble's
law look the same from every galaxy.
Consider (0) our galaxy and galaxies (I) 1,000,000, (II) 2,000,000, (III) 3,000,000
and (IV) 4,000,000 light years distant from us, all in the same direction. Compute the
velocities of recession of the galaxies (I)-(IV) from us.
Now imagine that you are an observer located on galaxy (I). Recompute the velocities of
recession of the other galaxies. Find that Hubble's law still holds. That means that the
expansion looks the same to an observer on galaxy (I) as it does from our galaxy. It is not
hard to see that the same result will hold for all observers, no matter which galaxy is their
home.
(In computing these velocities, use the ordinary Newtonian rule for composing velocities.)
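The recomputation can be sketched as follows. The sketch assumes the value of Hubble's constant used in the assignment above (20 km/sec per 1,000,000 light years) and the Newtonian rule that relative velocity is a simple difference. Distances and velocities are signed along the common line, so a negative distance paired with a negative velocity still means recession. The names used are illustrative.

```python
import math

H = 20.0  # km/sec per million light years (value from the assignment above)

# Positions along one direction, in millions of light years, measured from our galaxy (0).
POSITIONS_MLY = {"0": 0.0, "I": 1.0, "II": 2.0, "III": 3.0, "IV": 4.0}


def velocities_seen_from(home):
    """Map each other galaxy to (signed distance, signed velocity) as judged from home."""
    home_position = POSITIONS_MLY[home]
    home_velocity = H * home_position  # velocity of home relative to our galaxy
    result = {}
    for name, position in POSITIONS_MLY.items():
        if name == home:
            continue
        distance = position - home_position
        velocity = H * position - home_velocity  # Newtonian subtraction of velocities
        result[name] = (distance, velocity)
    return result


def hubble_law_holds(home):
    """True if velocity = H * distance for every galaxy as seen from home."""
    return all(math.isclose(v, H * d) for d, v in velocities_seen_from(home).values())


print(velocities_seen_from("I"))
print(all(hubble_law_holds(g) for g in POSITIONS_MLY))  # True
```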
B. If the universe turns out to have an open geometry so that space is infinite, then all of
our observations are showing us only the tiniest part of space. It is a finite fragment of an
infinite expanse. Given that tiny sample, are we justified in asserting that the universe is
spatially homogeneous, that is, the same in every place? Or is this fundamental hypothesis of
cosmology mere supposition?
C. Some theorists find a singularity, such as the big bang, an affront to science and feel a
strong need to find reformulated theories that will eliminate them. Are singularities to be
avoided or eliminated from theories if possible? Why?
D. The adoption of big bang cosmology triggered a long-standing debate in theology.
Should we take the big bang to vindicate the theistic claim of divine creation of the
universe? Theists like to point out the similarity between the creation account in Genesis
("Let there be light.") and big bang cosmology's assertion of a finite past that was dominated
by radiation as we approach the big bang. Atheists, however, reply that nowhere in big
bang cosmology do we find any agent outside of space, time and matter with creative
powers; we just have matter and space expanding in time.
12 Black Holes
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/12_black_holes/index.html[28/04/2010 08:24:06 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 12: Black Holes
For submission: NO LONGER REQUIRED DUE TO SNOWSTORM CLASS
CANCELLATIONS.
1. Why do black holes result from gravitation and not, say, from electric or magnetic
attractions?
2. What sorts of objects in our universe are candidates for collapse into a black hole?
3. In the context of a black hole, what are (a) the singularity; (b) the event horizon; (c)
tidal forces?
4. Here are conformal diagrams of a Minkowski spacetime and a fully extended
Schwarzschild black hole.
Using both words and the appropriate symbols, label:
future timelike infinity
past timelike infinity
future lightlike infinity
past lightlike infinity
spacelike infinity
future singularity
past singularity
event horizon
5. (a) Use the conformal diagram below to show that a traveler cannot pass from the
world of region III to that of region I; and that light signals also cannot pass from region III to
I.
(b) Use the conformal diagram below to show that travelers from the worlds of regions I and
III can meet.
For discussion in the recitation.
A. If black holes let no light escape, how is it possible for us to identify candidate black
holes among what our telescopes see in the sky?
B. (a) What prevents the gravitational collapse of planets?
(b) What prevents the gravitational collapse of stars?
C. What are three differences between a Newtonian and a relativistic black hole?
D. Minkowski spacetimes are well behaved in so far as there are no inaccessible regions.
Illustrate this by picking any event in the conformal diagram of the Minkowski spacetime
and showing that it can always be reached by some space traveller, who proceeds from
past timelike infinity at less than the speed of light.
E. Use a conformal diagram of a black hole to show that an outside observer can only
see a portion of the trajectory of a traveler who falls into the black hole.
F. We have learned repeatedly to be suspicious of things that are supposed to exist but
whose properties are so set up as to make our detecting them impossible. Is the world of
region III such a thing? We, in region I, cannot visit it or receive signals from it; and no one
from region III can visit or signal us in region I.
13 Origins of Quantum Theory
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/13_quantum_th_origins/index.html[28/04/2010 08:24:08 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 13: Origins of Quantum
Theory
For submission
1. (a) What experiment gives us good reason to think that light consists of waves? How
does it lead to that result?
(b) What experiment gives us good reason to think that high frequency light has its energy
localized at points in space, like a particle? How does it lead to that result?
2. (a) What model of the atom tells us that electrons could be found anywhere in the
vicinity of an atom's very small nucleus? On what physical theory is that model based?
(b) How does the theory of atomic spectra suggest that the theory of (a) is wrong?
(c) What theory of the atom results from taking the atomic spectra seriously?
3. (a) How does de Broglie's theory of matter waves connect the energy and momentum
of particles with the frequency and wavelength of waves?
(b) How does this theory make sense of the theory of the atom of 2.(c)?
For discussion in the recitation
A. Consider the sequence of theories that set us on the way to modern quantum theory.
They mixed together components of classical physics with new quantum notions and, to
use the "old quantum theory" one had to invoke both classical and quantum notions at the
same time:
• Planck's analysis of heat radiation assumed that heat radiation was generated by
emission and absorption of light from classically described electric resonators. His analysis
seemed to require that electric resonators only be allowed to adopt discrete energy levels,
although classical physics told us that they could adopt a continuous range of energies.
• Einstein's 1905 light quantum hypothesis held that high frequency light energy is localized
at points in space. Yet at the same time Einstein still allowed that interference phenomena
were possible for light and that requires that the light be spread out in space.
• Bohr's 1913 theory of the atom took the classical theory of electron orbits in which
electrons may orbit at any distance from the nucleus, but cannot do so stably. To it he
added the assumption that these electrons can orbit stably, but only at very few discrete
distances from the nucleus.
In all these cases, the theorists seem to make essential use of logically incompatible
assumptions. Electrons cannot both be stable and not be stable, for example. The
presence of a logical inconsistency is usually taken to be fatal to a physical theory. Yet here
were successful theories that seemed to depend essentially on contradictory assumptions.
(a) Should we require our physical theories to be consistent?
(b) Do you know any examples of theories that were discarded when they were found to be
based on contradictory assumptions?
(c) Are there other examples of successful theories that are based on inconsistent
assumptions?
B. To sharpen the problems above, consider this. If a theory is contradictory, then it allows
both the truth of some proposition A and also the truth of its negation not-A. In classical
logic, one can deduce anything at all from a contradiction. Here's the proof. (If you have had a
logic class, this will seem entirely trivial. If not, you may be a bit startled by how easy it is to infer anything from a
contradiction.) The inference combines two standard argument forms:
Addition
C
Therefore, C or D
Disjunctive syllogism
C or D
not-C
Therefore, D
To prove any proposition B from a contradiction (A and not-A):
1. A (Assumption)
2. not-A (Assumption)
3. A or B (From 1 by Addition)
4. B (From 2, 3 by Disjunctive Syllogism)
For example:
1. Electron orbits are stable. (Assumption)
2. Electron orbits are not stable. (Assumption)
3. Electron orbits are stable OR bananas are high in potassium. (From 1 by Addition)
4. Bananas are high in potassium. (From 2, 3 by Disjunctive Syllogism)
What this tells us is that, in an inconsistent theory, we can deduce anything. So should we
be so surprised that Planck, Einstein and Bohr can deduce their results from inconsistent
premises? From inconsistent premises, we could deduce that planets orbit in squares; or
that everything is made of licorice!
Or is there something more subtle at work? Planck, Einstein and Bohr seem to have found
some deep truths about the world. How can they be extracted from the snake pit of logical
inconsistency?
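For readers who have not had a logic class, the validity of the two argument forms, and of the inference from a contradiction to an arbitrary B, can be checked by brute force over all truth values. This is only a sketch in classical two-valued logic; the helper names are illustrative.

```python
from itertools import product


def implies(premise, conclusion):
    """Material implication: false only when the premise is true and the conclusion false."""
    return (not premise) or conclusion


def addition_is_valid():
    """C, therefore C or D."""
    return all(implies(c, c or d) for c, d in product([False, True], repeat=2))


def disjunctive_syllogism_is_valid():
    """C or D, not-C, therefore D."""
    return all(implies((c or d) and (not c), d)
               for c, d in product([False, True], repeat=2))


def explosion_is_valid():
    """A and not-A, therefore B, for any B whatever."""
    return all(implies(a and (not a), b) for a, b in product([False, True], repeat=2))


print(addition_is_valid(), disjunctive_syllogism_is_valid(), explosion_is_valid())
# True True True
```

The last check succeeds vacuously. No assignment of truth values makes both A and not-A true, so there is no case in which the premises hold and B fails.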
14 Problems of Quantum Theory
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/assignments/14_quantum_th_problems/index.html[28/04/2010 08:24:10 ﺹ]
HPS 0410 Einstein for Everyone Spring 2010
Assignment 14: Problems of Quantum
Theory
For submission
1. Consider a wave packet used in de Broglie's theory to represent a particle. How is the
particle's momentum affected if we make the spatial extent of the wave packet bigger or
smaller? How does this difference relate to the "Heisenberg Uncertainty Principle"?
2. What is the difference between interpreting the uncertainty of Heisenberg's principle as
ignorance as opposed to indeterminateness?
3. What is the "Schroedinger evolution" of a matter wave? What is "the collapse of the
wavepacket"?
4. In the standard analysis of the Schroedinger cat thought experiment, what leads to the
definite survival or definite death of the cat?
For discussion in the recitation.
A. Quantum theory is an indeterministic theory. That means that a complete specification
of the present state of some atomic system does not fix its future. Here's how we apply this
idea to radioactive decay. If you have a single atom of neptunium-231 (Np, atomic number 93), there is a one
in two chance that it will decay over the next 53 minutes. According to standard quantum
theory, that is all you can know. There is no way to know ahead of time whether the atom
will decay. Do you really believe that? Might it be that, if we had a more complete picture of the
complicated, hidden recesses of this atom, we'd see some tiny difference between
those atoms that end up decaying and those that do not? Ought we expect some future
theory of the insides of atoms to tell us about these sorts of hidden properties? Ought we to
demand such a theory before we can say we really understand radioactive decay? Or
should we be comfortable with the idea that some processes just are indeterministic?
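To make the "one in two chance" concrete, here is a minimal sketch of how the chance of decay grows with waiting time. It takes the 53 minute figure from the text and assumes the standard exponential decay law, in which an undecayed atom always has the same chance of decaying in the next interval no matter how long it has already survived. The function name is illustrative.

```python
HALF_LIFE_MINUTES = 53.0  # figure quoted in the text


def decay_probability(minutes):
    """Chance that a single atom has decayed within the given number of minutes.

    Assumption: exponential decay law, so the chance of surviving halves every half-life.
    """
    return 1.0 - 0.5 ** (minutes / HALF_LIFE_MINUTES)


print(decay_probability(53.0))   # 0.5
print(decay_probability(106.0))  # 0.75
```

The law fixes only these probabilities. Nothing in it says which individual atom will decay, and that is exactly the feature the question asks you to weigh.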
B. To get a sense of how the Heisenberg uncertainty principle applies, consider the
problem of balancing a pencil perfectly on its tip. Here is what is needed for success
in the balancing operation: you have to align the center of mass of the pencil exactly
over the pencil's tip; and, as you take your fingers off the pencil after doing this, you
need to leave the pencil perfectly at rest. What does Heisenberg's uncertainty
principle tell you about your chances of success?
C. The "measurement problem" remains a lingering difficulty for quantum theory. Yet
modern quantum theory remains an extremely successful theory of matter that has given us
many fascinating insights into the nature of matter and makes many quantitative predictions
that have been borne out by experience. How is this possible?
D. Consider the Schroedinger cat thought experiment. According to the textbook account
of quantum measurement, immediately prior to our opening the box, the cat is in a 50-50
superposition of alive and dead states; when we open the box and look at the cat, we
trigger a collapse into just one of those states. Most people find that instinctively
implausible. However our instincts have misled us often enough. We all felt instinctively
that there is a universal fact over whether two events are simultaneous; or that the sum of
the angles of a triangle has to be 180 degrees. Both proved to be false. Should we
believe our instincts in this case? If so, why? If not, why not?
Origins of Quantum Theory
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/quantum_theory_origins/index.html[28/04/2010 08:24:25 ﺹ]
HPS 0410 Einstein for Everyone
Origins of Quantum Theory
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
In a Nutshell
Theories of Matter at the End of the Nineteenth
Century
Max Planck and the Problem of Black Body
Radiation
Heat Radiation
Planck's Analysis of 1900
Albert Einstein and the Light Quantum
The Proposal
Photoelectric Effect
Wave-Particle Duality
Niels Bohr and Atomic Spectra
Atomic Spectra
Failure of Rutherford's Nuclear Model
Bohr's Theory
de Broglie and Schroedinger's Matter Waves
Matter Waves
Discreteness of Atom Electron Energies
The New Quantum Theory
What you should know
Background Reading: J. P. McEvoy, Introducing
Quantum Theory. Totem. This book covers very similar
ground to this chapter, but in greater detail. Read as
much as you like!
In a Nutshell
Each of the theories we have dealt with so far shows us
how classical theories break down when we proceed to
realms remote from common experience. Classical
Newtonian physics fails when we have systems that travel
very fast, or we journey into very strong gravity, or we
consider cosmic expanses of space. Special relativity
prevails in domains of very high speeds; general
relativity in domains of very strong gravitation; relativistic
cosmology over enormous distances.
Classical Newtonian physics also breaks down when we
consider very small systems, such as individual atoms
and the particles from which they are made. Quantum
theory gives us our best account of nature in the very
small. The standard quantum theory we shall consider
here makes no changes to the ideas of space and time
of relativity theory. Most standard quantum theories are
formulated within spaces and times that conform to
Einstein's special theory of relativity or even just to
Newton's account. While some versions of quantum
theory are set within the spacetimes of general
relativity, a complete adaptation of quantum theory and
Einstein's general theory of relativity remains beyond
our grasp.
Quantum theory is a theory of matter; or more
precisely it is a theory of the small components that
comprise familiar matter. The ordinary matter of tables
and chairs, omelettes and elephants is made up of
particles, like electrons, protons and neutrons. Quantum
theory provides us our best account of these particles. It
also provides us with an account of matter in the form of
radiation, such as light. It is commonly known that light
somehow consists both of light waves and also
particle-like photons. The notion of these photons comes from
quantum theory (and from Einstein directly, who first
introduced them in 1905 as "light quanta").
The central novelty of quantum theory lies in the
description of the state of these particles. It turns out
that this state does not coincide perfectly with any state
we are familiar with from classical physics. In some
ways, the particles of quantum theory are like little tiny
points of matter, as the name "particle" suggests. In
others, they are like little bundles of waves. A full
account requires us to see that fundamental particles
have properties of both at the same time. There is no
easy way to visualize this necessary combination;
indeed there may be no fully admissible image at all.
The problem of arriving at it remains a challenge today.
That problem, however, has proved to be no obstacle to
the theory itself. Modern quantum theory has enjoyed
enormous empirical success, accounting for a huge
array of phenomena and making striking predictions.
It is possible to describe the basic posits of quantum
theory compactly. However these posits are very likely
to appear arbitrary and even a little bewildering on first
acquaintance. What is needed is some understanding of
why those posits were chosen and what problems they
are intended to solve. The best way to arrive at this
understanding is to review the historical
developments in the course of the first quarter of the
twentieth century that led to quantum theory. For in that
historical development one can see a naturally growing
sequence of problems and solutions that eventually
issues in the modern theory.
Unlike relativity theory, the birth of quantum theory was
slow and required many hands. It emerged in the
course of the first quarter of the twentieth century with
contributions from many physicists, including Einstein.
Theories of Matter at the
End of the Nineteenth
Century
At the end of the nineteenth century, matter was
understood to come in two forms.
One was particles, localized lumps of stuff
that flew about like little bullets. The best
investigated of the fundamental particles was
the electron. Thomson had found in 1897 that
the cathode rays found in cathode tubes (the
precursor of old-fashioned glass TV tubes)
were deflected by electric and magnetic fields
just as if they were tiny little lumps of
electrically charged matter. Atoms, a bound
collection of various particles, were also
particulate in character.
The other form was wavelike matter. The one well
investigated form was light or, more generally,
electromagnetic waves. Newton, along with many others
in the seventeenth century, had given accounts of light
as consisting of a shower of tiny corpuscles. Although
wave accounts had then also been pursued, Newton's
corpuscular view remained dominant. That changed at
the beginning of the nineteenth century with the
exploration of interference effects by Thomas Young
and others.
The most celebrated interference
effect arises in the two slit
experiment. Waves of light (depicted
as parallel wavefronts moving up the
screen) strike a barrier with two holes
in it. Secondary waves radiate out from
the two slits and interfere with each
other, forming the characteristic cross
hatching pattern of interference. These
are the same patterns seen on the
surface of a calm pond in the ripples
cast off by two pebbles dropped in the
water.
The essential thing in these interference experiments is
the way the waves combine. The patterns arise because
the waves can add up two ways.
In constructive interference, the phases of the waves
are such that they add to form a combined wave of
greater amplitude. The figure shows the greatest possible
effect of constructive interference. All the parts of the two
waves line up to interfere constructively everywhere.
In destructive interference, the phases are such that
the waves subtract to cancel out. The figure shows the
greatest possible effect of destructive interference. All
parts of the two waves line up in such a way as to interfere
destructively everywhere.
In ordinary cases of interference, such as the two slit
experiments, both destructive and constructive interference
happen in different parts of the region where the waves
intersect. That leads to the complicated interference
patterns seen.
Interference effects are readily understandable if one
thinks of a wave as some sort of displacement in a
medium. A water wave in the ocean, for example,
consists of peaks and troughs where the sea water is
displaced above and below the mean sea level. If two
waves meet and both peaks coincide, the result is a
peak with their combined height. That is constructive
interference. If a peak and trough coincide, then the two
can cancel out. That is destructive interference.
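The adding of displacements can be made concrete with two sine waves of equal amplitude. In this sketch, which is an illustration and not taken from the chapter's figures, the second wave is shifted in phase relative to the first and the two displacements are added point by point. The function names and the number of sample points are arbitrary choices.

```python
import math


def superpose(phase_shift, samples=360):
    """Sum of two unit-amplitude sine waves, sampled over one full cycle."""
    xs = [2.0 * math.pi * k / samples for k in range(samples)]
    return [math.sin(x) + math.sin(x + phase_shift) for x in xs]


def peak_displacement(phase_shift, samples=360):
    """Largest displacement, up or down, of the combined wave."""
    return max(abs(value) for value in superpose(phase_shift, samples))


print(peak_displacement(0.0))      # peaks meet peaks: close to 2
print(peak_displacement(math.pi))  # peaks meet troughs: close to 0
```

With no phase shift the combined wave has twice the height of either wave alone. With a shift of half a cycle the two cancel everywhere, up to rounding error in the arithmetic.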
In the nineteenth century, Maxwell found that
explanation of interference so compelling that he
thought it provided good evidence for an ether.
Light, he urged, must be a displacement in something if
it is to have peaks and troughs that can cancel out. That
something, the carrier of the light wave, is the ether. If
light were made up of corpuscles, it seemed impossible
that one could combine two corpuscles and have them
annihilate.
With the demise of the ether theory, it became clear that
something more interesting was at hand. The matter of
light itself somehow came in a form such that it could locally
cancel other light waves. That sort of interaction was an
early indication of the sorts of interactions that would
become commonplace in quantum theory.
This neat division of matter into particle-like and
wave-like would not persist. The story of the coming of
quantum theory is the story of the breakdown of this
division. In the sections to come, we shall see how
various clues in the observed physical properties of
matter showed that this simple division must fail.
Form of matter | Ordinary matter (gases, liquids, solids) | Radiative matter (light, radio waves, heat radiation)
View at the end of the nineteenth century | Particles | Waves
Clue that this was too simple | Discreteness of atomic spectra (and more) | Thermal properties of heat radiation (and more)
View with the completion of quantum theory | Both wave and particle properties | Both wave and particle properties
Max Planck and the Problem of Black Body Radiation
Heat Radiation
The first clue that radiation might also have particle-like
properties came in 1900. It came in apparently
innocuous work on heat radiation. This sort of radiation
is familiar to everyone. It is the radiation that warms
our hands in front of a fire, that burns the toast and that
provides the intense glare of a furnace. Physicists had
been measuring how much energy is found in each of
the different frequencies (i.e. colors) that make up heat
radiation. That distribution varies with the temperature of
the radiation. As a body that emits radiation passes
from red to orange to white heat, the frequencies with
the greatest energy change correspondingly.
In 1900, as the latest data came in,
Max Planck in Berlin was working on understanding the
physical processes that led to these distributions of
energy. His model of heat radiation was of a jumble of
many frequencies of electromagnetic waves that have
come to equilibrium in a cavity. The waves are absorbed
and emitted by oscillating charges in the walls of the
cavity. That way, the temperature of the walls could be
conveyed to the radiation itself. The cavity really is just
an oven that fills the space inside it with heat
radiation. This radiation inside the cavity was known as
"cavity radiation."
If a tiny window were opened in the walls of the cavity,
the radiation released would also have the temperature
of the cavity. Some clever thermodynamic arguments
showed that it had exactly the same composition as
radiation re-emitted by a body at that same temperature
if that body had the special property that it absorbed
perfectly all radiation that fell on it, before re-radiating it.
Such bodies are called "black"; so that form of radiation
is known as "black body radiation."
Planck's Analysis of 1900
Planck found a very simple formula
that fitted the latest experimental
results very well. His problem was
to tell a theoretical story about how
that formula came about. After
some hesitation, he found such a
story. However, the essential
computation in his story depended
upon a very odd assumption.
(Debate continues today over
whether Planck actually realized
how radical this assumption was
and how crucial it was to his
account.) Planck modelled the
heat radiation as coming from
energized electric resonators.
Ordinary resonators of classical
physics are just masses vibrating
on springs, as shown in the figure.
They can take on a continuous
range of energies.
Planck's story required that these
resonators not be energized over a
continuous range of energies.
Instead they might take energies
of, say, 0, 1, 2, 3, ... units, but
nothing in between. Energies of
say 1.2 units or 3.7 units were
prohibited.
Deciding what those units were proved to be important.
The units of energy were tied to the resonant frequency
of the resonator. They were given by Planck's formula:
Energy = h x frequency
That means that the allowed energies are (h x
frequency), twice (h x frequency), thrice (h x frequency),
and so on.
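The rule that only whole units of energy are allowed can be sketched in a few lines of code. This is a minimal illustration, assuming the value of h used in this chapter (in erg seconds) and a hypothetical resonator frequency; the function names are invented for the example.

```python
H_ERG_S = 6.62e-27  # Planck's constant in erg seconds (value used in this chapter)

def energy_unit(frequency_hz):
    """One unit of energy for a resonator: h x frequency."""
    return H_ERG_S * frequency_hz

def allowed_energies(frequency_hz, how_many=4):
    """The first few allowed energies: 0, 1, 2, 3, ... units."""
    return [n * energy_unit(frequency_hz) for n in range(how_many)]

def is_allowed(energy_erg, frequency_hz, tolerance=1e-9):
    """True if the energy is a whole number of units (within rounding)."""
    units = energy_erg / energy_unit(frequency_hz)
    return units > -tolerance and abs(units - round(units)) < tolerance

FREQUENCY_HZ = 5.0e14  # hypothetical resonator frequency, chosen for illustration
unit = energy_unit(FREQUENCY_HZ)

print(allowed_energies(FREQUENCY_HZ))        # 0, 1, 2 and 3 units, in ergs
print(is_allowed(3 * unit, FREQUENCY_HZ))    # a whole number of units
print(is_allowed(1.2 * unit, FREQUENCY_HZ))  # an in-between energy
```

Notice how small one unit is for this frequency: a few times 10^-12 erg. That is why the granularity goes unnoticed in ordinary experience.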
The letter h stands for a new constant of nature
introduced by Planck and now called "Planck's
constant." This new constant plays the same sort of role in
quantum theory that the speed of light plays in relativity
theory; it tells us when quantum effects will be
important. The number is very small, suggesting that
quantum effects are to be expected in the small; for
example, for ordinary frequencies, units of energy given
by Planck's formula will be very small, so we will not
notice the granularity it requires when we look at the
larger energies of systems of ordinary experience. (h =
6.62 x 10^-27 erg seconds.)
Planck's original formula applied to the energy of the
resonators. He tried hard to confine the discontinuity it
suggested to these resonators and even just to the
interaction between radiation and the resonators. Over
the next decade, other physicists began to see that the
discontinuity could not be confined. Computations
analogous to those of Planck from 1900 could be
applied to heat radiation directly. They led to the
conclusion that Planck's formula applied directly to heat
radiation as well. In each frequency, the energy of heat
radiation must come in whole units of h x frequency.
That conclusion is hard to reconcile with the idea that
heat radiation is purely a wave phenomenon.
Albert Einstein and the Light Quantum
The Proposal
While Planck may not have recognized how radical his
work of 1900 was, Einstein realized that something very
odd was afoot with high frequency light and he did so
apparently independently of Planck. In 1905 he argued
that we needed to change our basic picture of the
constitution of radiation.
High frequency light behaves in certain circumstances
as if it were made up of spatially localized bundles
of energy, with Planck's formula (once the notation is
adjusted) giving the amount of energy in each bundle. So
once again light could be seen, in some ways, as a
shower of corpuscles, each corpuscle now with energy
equal to h x (frequency of light).
The traditional picture inherited from the
great achievements of nineteenth century
physics was that light is a propagating wave.
What Einstein now urged
was that high frequency light
sometimes behaved as if it
were made up of spatially
localized bundles of energy.
Planck's formula gave the
amount of energy in each
bundle. So once again light
was said to consist of a
shower of corpuscles,
each corpuscle now with
energy equal to h x
(frequency of light).
While this seems like a return to a Newtonian particle
view, the return was not and could not be complete.
For the wave-based notion of frequency was part of
Einstein's hypothesis. And whatever else may come, the
experiments on the interference of light remained.
Einstein's core argument was ingenious. He looked at
the observed properties of high frequency light and
noticed they were governed in certain aspects by
exactly the same laws that govern ordinary gases. By
reverse engineering those gas laws, Einstein could
show that they depended essentially on gases
consisting of very many little, spatially localized
lumps of matter, their molecules. He supposed that it
was no accident that light and gases obeyed the same
laws; they did, he urged, because the light really was
made of little localized units, called "quanta," of energy.
For a more detailed account of Einstein's core
argument, see the chapter "Atoms and the Quantum,"
Section 7, "The Light Quantum Paper: Einstein's
Astonishing Idea."
The word "quantum" (plural "quanta") was then just
used as a label for a unit of some quantity. In 1905 talk
of a light quantum would be understood to be nothing
more than talk of a "light unit."
Photoelectric Effect
The best known part of Einstein's
1905 paper on the light quantum was
an observation made towards the
end of the paper. Einstein had been
following experiments on the so-called
"photoelectric effect." In it,
light is used to kick electrons out of
an electrically charged cathode.
According to the wave theory of light,
the intensity of the light ought to
determine if the light can generate
these "photoelectrons." For more
intense light has more energy and
energy is what is needed to liberate
the electrons held in the cathode's
surface.
It is easy to diminish the intensity of light. We can,
for example, just move the light source far away so that
the light energy it emits is spread over a great area. The
expectation from the wave theory is that this dimmed
light will lose its ability to liberate photoelectrons.
Origins of Quantum Theory
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/quantum_theory_origins/index.html[28/04/2010 08:24:25 ﺹ]
Experiment had shown, however, that the intensity did
not matter to the ability of light to produce
photoelectrons. All that mattered was the frequency of
the light. If the light was of low frequency, it could not
generate photoelectrons, even if the light was very
intense. If the light had a high frequency, it could
produce photoelectrons, even if the light was of very low
intensity.
This, Einstein observed triumphantly, is just what one
would expect if light energy were localized in quanta
with energy given by Planck's formula. All one had to
assume was that a single quantum was all that was
needed to generate each photoelectron.
If the light was of low frequency, its
individual quanta would be of low
energy, so no single quantum would be
energetic enough to knock electrons
out of the cathode. Increasing the
intensity of the light did nothing more
than increase the number of light
quanta showering on the cathode, all of
them too weak in energy to liberate a
photoelectron.
If the light was of high frequency,
then each light quantum was
individually energetic enough to
liberate a single photoelectron. The
intensity of the light did not matter.
Low intensity meant that there were
not many light quanta incident on the
cathode. But since only one light
quantum is needed to liberate just
one photoelectron, the effect would
be there for high frequency light, no
matter how weak the intensity of the
light.
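Einstein's reasoning about frequency and intensity can be put into a short sketch. It is idealized: it assumes that every sufficiently energetic quantum liberates exactly one electron, and the binding energy and the two frequencies below are hypothetical values chosen only for illustration.

```python
H_EV_S = 4.135667696e-15  # Planck's constant in electron-volt seconds

def quantum_energy_ev(frequency_hz):
    """Energy of one light quantum: h x frequency."""
    return H_EV_S * frequency_hz

def photoelectrons_per_second(frequency_hz, quanta_per_second, binding_energy_ev):
    """Idealized count: each sufficiently energetic quantum frees one electron."""
    if quantum_energy_ev(frequency_hz) < binding_energy_ev:
        return 0  # no single quantum can do the job, however many arrive
    return quanta_per_second

BINDING_EV = 2.3     # hypothetical energy needed to free one electron
RED_HZ = 4.3e14      # low frequency light (about 1.8 eV per quantum)
ULTRAVIOLET_HZ = 1.0e15  # high frequency light (about 4.1 eV per quantum)

# Very intense low frequency light: no photoelectrons at all.
print(photoelectrons_per_second(RED_HZ, 10**20, BINDING_EV))

# Very dim high frequency light: photoelectrons are still produced.
print(photoelectrons_per_second(ULTRAVIOLET_HZ, 10, BINDING_EV))
```

In this sketch intensity only sets how many quanta arrive each second; whether any electrons are liberated at all is decided by the frequency alone.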
About fifteen years later in 1921, Einstein won the
Nobel Prize. His work on the photoelectric effect
attracted special mention in the award. The citation read
"for his services to Theoretical Physics, and especially
for his discovery of the law of the photoelectric effect."
Wave-Particle Duality
If this corpuscular view of light is so successful, do we
need the wave view at all? In 1909, Einstein showed
that certain phenomena could only be successfully
explained if we used both wave and particle views;
the full observed effect came from the sum of two terms,
one a particle term, the other a wave term. The need for
both is sometimes called "wave-particle duality."
Many of you will want to use the word "photon"
interchangeably with Einstein's "light quantum." There is
probably not much harm in doing that as long as you
realize that the word "photon" comes from a later era in
quantum theory. It was introduced by G. N. Lewis in
1926, 21 eventful years later.
When we use the word photon, the natural presumption
is that we are referring to the entity that derives from the
completed quantum theory of the 1920s and 1930s.
When Einstein proposed his light quanta, not even an
Einstein could anticipate quite how radically the
emerging quantum theory would diverge from classical
ideas. Einstein's proposal of 1905 was quite
restricted; he posited that the energy of high frequency
light was spatially localized into the little lumps he called
light quanta. He could not then know how things would
transpire for low frequency light. And his proposal of
1905 did not say anything about the momentum of the
light quanta. That light quanta also carry momentum
was inferred later.
Niels Bohr and Atomic Spectra
Atomic Spectra
The analysis of heat radiation and the power of light to
generate photoelectrons provided the first clues that this
wave-like form of matter was not merely wave-like, but
also had particle-like aspects. What of the
particles that make up matter? What of the electrons
that Thomson had found in 1897? The clue that they
also had wave-like aspects eventually derived from
observations in atomic spectra.
If gases are energized by heating or passing an electric
discharge through them, they emit light. The orange
sodium vapor lamps or bright white mercury vapor
lamps used in parking lots employ this mechanism in its
simplest form. The reverse process also occurs. Gases
will absorb light; that is how they can block transmission
of light.
One might expect that such emissions (and absorptions)
contain all frequencies (colors), a perfect rainbow,
even if the intensity of light across the spectrum might
vary. They do not. Gases are very selective in the
frequencies they emit and absorb. They will emit and
absorb only a few very particular frequencies. The
frequencies emitted form what is called the atomic
emission spectrum of the element; and those absorbed
form the absorption spectrum. The frequencies in them
are so distinctive that they can be used as a characteristic
signature for identifying an otherwise unknown gas.
Here is the emission spectrum of
hydrogen gas. The light emitted by
excited hydrogen has been spread
out into its component frequencies
by passing it through a prism or
diffraction grating. The light then
darkens a photographic emulsion in
different places according to its
frequency. The series of lines shown
is the so-called "Balmer series" that
appears in the visible and near-visible
frequencies of light.
(Wavelengths are shown in units of
Angstroms.) From Gerhard Herzberg, Atomic
Spectra and Atomic Structure. Prentice-Hall, 1937.
Failure of Rutherford's Nuclear Model
In 1913, Niels Bohr reported on his efforts to devise a
model of the process of light emission from the atoms of
elements that would explain the very particular
frequencies emitted. The problem proved to be far
harder than one would expect. At the time, the best model of
an atom was Rutherford's nuclear model. According
to it, an atom is like a little solar system. It has a
massive, but tiny, positively charged nucleus. That
nucleus exerts an attractive force on lighter negatively
charged electrons that orbit it, rather like the way the
planets orbit the very massive sun.
In the Rutherford model, exciting a gas by passing high
voltage electricity through it would energize the electrons,
which could then move further away from the attractive
pull of the nucleus. When they fell back towards the
nucleus, the energy they gained would be lost as light
energy; that emitted light forms the emission spectrum.
The first difficulty was that, as they fell back to the
nucleus, they would pass through a continuous range of
orbital frequencies and thus emit a continuous range of
frequencies of light. There was no way to limit the emitted
light to just a few special frequencies.
The second difficulty was more serious. Nothing stops
the emission of energy by the electrons through this
process of light emission. They would continue to do it
until they crashed into the nucleus. According to classical
electrodynamics, this would happen very quickly. It was
not clear that Rutherford's model allowed matter made of
atoms to exist at all.
Bohr's Theory
Bohr solved both problems with
a proposal of breathtaking
audacity. Classical
electrodynamics was quite clear:
an electron orbiting the nucleus
is accelerating and therefore
must radiate energy. It would be
like a little radio transmitter,
broadcasting electromagnetic
waves. In the process, it must
lose energy, fall deeper into the
attractive pull of the nucleus and
eventually crash into the nucleus.
Bohr simply posited that this
was not true. Rather, he
asserted that there are stable
orbits arrayed around the
nucleus in which an electron
could orbit indefinitely without
losing any energy.
Next, Bohr supposed that electrons
can jump up and down between these
allowed orbits. If an electron is to
jump up, away from the nucleus to a
higher energy orbit, it needs to gain
energy to be able to climb away from
the pull of the atom's positively
charged nucleus. It gets that extra
energy by being struck by a quantum
of light, which excites the jump. The
quantum of light must deliver exactly
the right amount of energy to make up
the difference between the energy of the
orbit left and the one to which the electron
jumps.
In addition Bohr assumed that the
energy of the exciting light quantum
obeys Planck's formula, so that its
energy is just h times its frequency.
The outcome is that light only of a very
specific frequency can excite the
jumps between two specific orbits.
Bohr's theory also allows for the
reverse process. Once an electron has
jumped up to a higher energy orbit, it
will not stay there. It will jump back
down to a lower energy orbit. In the
process, it will re-emit the energy it
gained in jumping up as a quantum of
light energy. Once again, the energy of
the light emitted will conform to
Planck's formula and be equal to h
times its frequency.
As a result, when an electron jumps
down between two orbits, it emits light
of a definite frequency that is
characteristic of exactly that jump.
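The link between a jump and the frequency of the light emitted can be sketched numerically. The text here gives only the rule that the energy lost in the jump equals h times the frequency; the orbit energies used below (about -13.6 electron volts divided by the square of the orbit number, for hydrogen) are an assumption brought in from standard accounts of Bohr's model, and the function names are invented for the example.

```python
H_EV_S = 4.135667696e-15  # Planck's constant in electron-volt seconds
C_M_PER_S = 2.99792458e8  # speed of light in meters per second
RYDBERG_EV = 13.605693    # assumed binding energy of hydrogen's lowest orbit, in eV

def orbit_energy_ev(n):
    """Assumed energy of the n-th allowed orbit of hydrogen (n = 1, 2, 3, ...)."""
    return -RYDBERG_EV / n**2

def emitted_frequency_hz(n_upper, n_lower):
    """Frequency of the light quantum emitted in a jump down: energy lost / h."""
    energy_lost = orbit_energy_ev(n_upper) - orbit_energy_ev(n_lower)
    return energy_lost / H_EV_S

def wavelength_angstrom(frequency_hz):
    """Convert a frequency of light to a wavelength in Angstroms."""
    return C_M_PER_S / frequency_hz * 1e10

# Jumps that end on the second orbit give the lines of the Balmer series.
balmer = [wavelength_angstrom(emitted_frequency_hz(n, 2)) for n in (3, 4, 5, 6)]
for n, line in zip((3, 4, 5, 6), balmer):
    print(f"jump {n} -> 2: about {line:.0f} Angstroms")
```

Each jump yields one definite wavelength, so a handful of allowed orbits can only ever produce a handful of sharp lines.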
Having made those assumptions, Bohr could read off
the oddest result from the observed atomic spectra.
Since only very few frequencies of light were present, it
followed that only very few jumps were possible, so that
only very few orbits were permitted for the electron.
It was as though our sun allowed a planet to orbit where
Venus is and where the Earth is; but it prohibited any
planet in between.
All that remained was to figure out just which of the
many possible orbits are found in this favored set of
stable orbits. That was relatively easy to do. The
observed spectra gave a complete catalog of the
energy differences between these allowed, stable
orbits. Each line in the observed spectra resulted from
electrons jumping between two specific orbits. It is a
numerical exercise to determine precisely which those
few orbits are. The calculation was not so different from
this exercise in geography. If we are given the distances
between every pair of cities in a country, we can use
those data to figure out where on the map each city is
found. Atomic spectra gave Bohr the energetic
distances between his allowed orbits. From those data
he could determine the energies and thus locations of
those allowed orbits.
When Bohr did that, he found a very simple way to
summarize just which of the orbits were allowed.
They were those whose orbital angular
momentum came in whole units of h/2π. Just as
Planck's relation told us that radiant energy comes
in whole units of h x frequency, Bohr now found that
orbiting electrons always must have whole units of
angular momentum: one h/2π, two h/2π, three h/2π, ...
and nothing in between.
We have seen that the ordinary (linear)
momentum of a body is just its mass
times its velocity. Angular momentum is
an analogous quantity that plays an
important role in the dynamics of rotating
or orbiting systems. For a small mass like
a classical electron orbiting a nucleus in a circle, it
is the electron's mass x radius
of orbit x speed of the electron along its orbit.
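Bohr's condition can be written out as a small check. This is only a sketch: it takes the angular momentum of a circular orbit to be mass x radius x speed along the orbit, and the electron mass and orbit radius below are hypothetical round values supplied for illustration.

```python
import math

H_ERG_S = 6.62e-27                   # Planck's constant in erg seconds
H_OVER_2PI = H_ERG_S / (2 * math.pi)  # the unit of angular momentum, h/2π

def allowed_angular_momenta(how_many=3):
    """The first few allowed values: one h/2π, two h/2π, three h/2π, ..."""
    return [n * H_OVER_2PI for n in range(1, how_many + 1)]

def orbital_angular_momentum(mass_g, radius_cm, speed_cm_per_s):
    """Angular momentum of a small mass moving in a circular orbit."""
    return mass_g * radius_cm * speed_cm_per_s

def is_allowed_orbit(mass_g, radius_cm, speed_cm_per_s, tolerance=1e-6):
    """True if the orbit carries a whole number (at least one) of units h/2π."""
    units = orbital_angular_momentum(mass_g, radius_cm, speed_cm_per_s) / H_OVER_2PI
    return units > 1 - tolerance and abs(units - round(units)) < tolerance

ELECTRON_MASS_G = 9.109e-28  # assumed electron mass in grams
RADIUS_CM = 2.1e-8           # hypothetical orbit radius

# Speeds chosen so the orbit carries exactly 2 units, and then 2.5 units.
speed_two_units = 2 * H_OVER_2PI / (ELECTRON_MASS_G * RADIUS_CM)
speed_in_between = 2.5 * H_OVER_2PI / (ELECTRON_MASS_G * RADIUS_CM)

print(is_allowed_orbit(ELECTRON_MASS_G, RADIUS_CM, speed_two_units))
print(is_allowed_orbit(ELECTRON_MASS_G, RADIUS_CM, speed_in_between))
```

The first orbit passes the test; the second, with two and a half units, is one of the "in between" cases that Bohr's condition prohibits.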
Bohr's theory was puzzling, even maddening. Just as
with Einstein's hypothesis of the light quantum, it
seemed to require that classical physical notions both
hold and fail at the same time. That was not a
comfortable situation. Those discomforts were eclipsed
by a brighter fact. Bohr's theory worked, and it worked
very well. Observational spectroscopy was providing
theorists with an expansive catalog of spectra of many
substances under many different conditions. Starting
from Bohr's theory, physicists were able to develop an
increasingly rich and successful account of them. While
it was clear that something was not right, in the face of
these successes, it was tempting to postpone asking too
pointedly how this goose could keep laying golden
eggs.
The central result of Bohr's theory of 1913 was that
the angular momentum of orbiting electrons came
in whole multiples (quanta) of h/2π. In the years
immediately following, that simple condition was
expanded into a broader condition that a quantity
known as "action" came only in whole multiples for
physical systems that returned periodically to the
same initial condition. As a result the term
"quantum of action" entered the physicists'
vocabulary.
This sidebar should contain a brief
sentence that gives you a useful idea of
the physical quantity "action." Alas, I've
been unable to figure out what that
sentence might be. It probably doesn't help
too much if I tell you that the trajectories of
bodies obeying classical physical laws can
be picked out as those that render
extremal the action added up along the
trajectories. Did that help? I didn't think it
would.
de Broglie and Schroedinger's Matter Waves
Matter Waves
Bohr's theory of 1913 and its later elaboration gave a
wonderfully rich repertoire of methods for accounting for
atomic spectra. They depended on a contradictory mix
of classical and non-classical notions. By the early
1920s, the limits of this system began to show and
theorists also turned to the task of making some
coherent sense of this body of theory that soon came to
be known as "the old quantum theory."
The major breakthroughs to the "new quantum theory"
came in the middle of the 1920s. A number of different
theorists found ways of developing coherent theories of
the quantum domain; and they all eventually proved to
be different versions of the same new theory.
Heisenberg, Born and Jordan first developed matrix
mechanics. Its basic quantities were infinite tables of
numbers (matrices) drawn as directly as possible
from observed quantities like atomic spectra.
Another approach proved equivalent and is easier to
picture. It was based on a supposition by de Broglie of
1923 and developed by Schroedinger in 1926. Einstein
had shown that a wave phenomenon, light, also had
particle-like properties. Might the reverse be true also?
Might particle-like electrons also have wave properties?
The hypothesis answered yes. It associated a wave of a
particular wavelength with a particle of some definite
momentum.
Here is de Broglie's formula that tells us which
wavelength goes with which momentum:
momentum = h / wavelength
Notice how similar it is to Planck's formula which relates
energy and frequency. Here is Planck's formula again:
energy = h x frequency
The two together form the foundation of the matter
wave approach. They tell us how to assign a wave of
some definite frequency and wavelength to a particle of
some given energy and momentum.
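The two formulas can be tried out on a concrete case. The sketch below works in SI units rather than the erg seconds used earlier, treats the electron's momentum as simply mass x velocity, and uses a hypothetical speed and a hypothetical energy; all of these are assumptions made for illustration.

```python
H_J_S = 6.62607015e-34         # Planck's constant in joule seconds
ELECTRON_MASS_KG = 9.1093837e-31  # assumed electron mass in kilograms

def de_broglie_wavelength_m(momentum_kg_m_per_s):
    """de Broglie's formula rearranged: wavelength = h / momentum."""
    return H_J_S / momentum_kg_m_per_s

def planck_frequency_hz(energy_j):
    """Planck's formula rearranged: frequency = energy / h."""
    return energy_j / H_J_S

SPEED_M_PER_S = 1.0e6  # hypothetical electron speed, far below the speed of light
momentum = ELECTRON_MASS_KG * SPEED_M_PER_S
wavelength = de_broglie_wavelength_m(momentum)

ENERGY_J = 1.0e-18     # hypothetical energy assigned to a particle
frequency = planck_frequency_hz(ENERGY_J)

print(f"wavelength: {wavelength:.3e} m")
print(f"frequency:  {frequency:.3e} per second")
```

The wavelength comes out at a few Angstroms, comparable to the size of an atom, which is one way to see why the wave character of electrons matters on atomic scales.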
Here's a way to see the two equations in even more similar form. For a
periodic process we can write frequency = 1/period, where "period" is
the time needed for the process to recur. Then Planck's formula
becomes
energy = h / period.
Now the equations relate momentum to a length (the wavelength) and
energy to a time (the period).
Discreteness of Atom Electron Energies
The beauty of the matter wave hypothesis is that it
explained naturally why only very particular energy
states are admissible for electrons bound in atoms. The
reason that only a few energy states are admissible for
these electrons derives directly from the fundamental
differences between particles and waves. We can see
these differences by considering a very simple case, a
particle/wave trapped in a box.
To begin, imagine an ordinary, classical
particle confined to a box. It bounces back
and forth between the walls. Classical
physics allows it to move at any speed. As a
result it can have a continuous range of
different energies.
Now imagine instead that we are confining a
wave to the same box. The stable waves
that can persist within the box are so-called
"standing waves."
Anyone who plays a stringed instrument
is familiar with them. When a string is
plucked or bowed, the base note results
from a standing wave whose half-wavelength
is the length of the string. There
are overtones also formed that give the
richness of the sound. These are smaller
standing waves, whose wavelengths equal
the length of the string, 2/3 that length, half
that length, and so on. The essential
condition is that a wave can form as long as
it has nodes, the points of no displacement,
at either end of the string.
The matter waves that can form within the
box have the same structure as these tones
and overtones. They have wavelengths of
once, 1/2, 1/3, ... , times the double width of
the box. (We use the double width since standing
waves have nodes at each half wavelength.) Each of
these waves turns out to have a different
energy that depends on the wavelength of
the standing wave. Thus only very few
definite energies are permitted for the
waves trapped in the box; the many
intermediate energies between them are not
allowed.
What of de Broglie's relation, momentum = h/wavelength? Are we
to say that the standing waves in the box have momenta proportional
to h/2, h, (3/2)h, 2h, ... etc. corresponding to the above allowed
wavelengths? Well, almost. The standing wave with wavelength equal
to the width of the box could be associated with a particle moving to the
right with momentum h/(width of box) and one moving to the left with
momentum h/(width of box). But a standing wave is propagating
neither to the right nor to the left. To get the wave to stand still, we
form the superposition of these two waves. Superposition allows us to
have a wave that is moving both to the left and right at the same time,
and thus goes nowhere. See the next chapter for more on
superposition.
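The allowed standing waves in the box, and the discrete energies that go with them, can be tabulated in a short sketch. The wavelengths and momenta follow the rules just described. The step from momentum to energy uses the classical expression momentum squared / (2 x mass), which is an assumption added here; the box width and electron mass are likewise illustrative values.

```python
H_J_S = 6.62607015e-34            # Planck's constant in joule seconds
ELECTRON_MASS_KG = 9.1093837e-31  # assumed electron mass in kilograms

def standing_wavelengths(box_width_m, how_many=4):
    """Wavelengths of once, 1/2, 1/3, ... times the double width of the box."""
    return [2 * box_width_m / n for n in range(1, how_many + 1)]

def momentum_magnitudes(box_width_m, how_many=4):
    """de Broglie's relation applied to each wave: momentum = h / wavelength."""
    return [H_J_S / w for w in standing_wavelengths(box_width_m, how_many)]

def energies_j(box_width_m, mass_kg=ELECTRON_MASS_KG, how_many=4):
    """Assumed classical energy of motion: momentum squared / (2 x mass)."""
    return [p**2 / (2 * mass_kg) for p in momentum_magnitudes(box_width_m, how_many)]

BOX_WIDTH_M = 1.0e-9  # hypothetical box, about ten atoms wide

levels = energies_j(BOX_WIDTH_M)
for n, energy in enumerate(levels, start=1):
    print(f"wave {n}: energy {energy:.3e} J, {energy / levels[0]:.0f} x the lowest")
```

Under these assumptions the energies stand in the ratios 1 : 4 : 9 : 16, with nothing permitted in between.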
The situation for an electron in a hydrogen atom is
essentially the same. The electric attraction of the
positively charged nucleus forms a prison that traps the
electron, just as the box above traps the wave. The
wave in the box may persist only in a few energy states.
Correspondingly an electron wave trapped in a
hydrogen atom may persist only in a few definite energy
states. These turn out to be the energies of the
stable orbits of Bohr's theory.
While those energies survive, what does not survive
from Bohr's theory is the idea of the electron as a
spatially localized particle orbiting the nucleus in a
classical circular or elliptical orbit, but nonetheless
violating classical electrodynamics by not radiating. The
space around the atom's nucleus is filled with a standing
wave of the electron. Classical electrodynamic theory no
longer directly applies; the earlier contradiction with that
theory has evaporated.
The New Quantum Theory
In the later part of the 1920s, all these ideas
coalesced into what was called the "new quantum
theory," to distinguish it from the "old quantum theory" of
the decades before. There were the matrix-based
approaches proposed by Heisenberg, Born and Jordan;
and the matter waves of de Broglie and Schroedinger;
and Dirac introduced his c-numbers and q-numbers. It
soon became clear that all these approaches were
really just the same theory dressed up in different
mathematical clothing. The puzzling properties of light
and matter that led to this theory were now essentially
resolved. The solution lay in a new conception of the
nature of matter. Matter fundamentally is not made of
particles OR waves; it consists of a form of matter that,
roughly speaking, is both particle AND wave; and this is
true both for ordinary matter like protons and electrons,
and for radiative matter like light.
This new synthesis, however, left a legacy of
enduring problems. First, the new theory introduced
an element of probability that was unknown in classical
physics. There are many processes for whose
outcomes the theory can only give probabilities. Will this
radioactive atom decay now or later? The best the
theory can offer are probabilities. This circumstance
proved deeply troubling to many thinkers of the era,
including Einstein. They found it repugnant to think that
the fundamental laws of the universe might be
probabilistic and described the difficulty as a breakdown
of "causality." There were deeper problems. The new
quantum theory worked very well for small particles.
However it was far less clear how it should be applied to
macroscopic bodies. Tables, chairs, houses and
elephants do not obviously manifest a combination of
wave and particle-like properties. Yet the theory said
that they must. We will see in the next chapter how that
problem continues to vex us today.
What you should know
What theories of matter looked like at the end of the
nineteenth century.
How Planck's analysis of heat radiation generated
a problem for classical physics.
What is contained in Einstein's 1905 proposal of
the light quantum.
How Bohr used atomic spectra to infer a new
and strange model of the atom.
How the proposal of matter waves started to make
sense of Bohr's proposal.
Copyright John D. Norton. April 2001; March 16, August 22, November 23, 2008; April 7, 2010.
Waves and Particles
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/quantum_theory_waves/index.html[28/04/2010 08:24:35 ﺹ]
HPS 0410 Einstein for Everyone
The Quantum Theory of Waves and
Particles
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Both Wave and Particle?
Superpositions of Matter Waves
Wave Packets
Heisenberg's "Uncertainty" Principle
...Applied to a Hydrogen Atom
Complementary Pairs
Uncertain or Indefinite?
How Quantum States Change over Time
Schroedinger Evolution...
...Is Not the Whole Story
Measurement: Collapse of the Wave Packet
Indeterminism: An Unsure Future
Anxieties over Irreducible Chanciness
The Nineteenth Century View of Causation
What you should know
Both Wave and Particle?
We have seen that the essential idea of quantum
theory is that matter, fundamentally, exists in a state
that is, roughly speaking, a combination of wave
and particle-like properties. To enter into the
foundational problems of quantum theory, we will need
to look more closely at the "roughly speaking." It is
needed since it is not so easy to see how matter can
have both wave and particle properties at once. One of
the essential properties of waves is that they can be
added: take two waves, add them together and we
have a new wave. That is a commonplace for waves.
But it makes no sense for particles, classically
conceived. Just how do we "add up" two particles?
Quantum theory demands that we get some of the
properties of classical particles back into the
waves. Doing that is what is going to visit problems
upon us. It will lead us to the problem of indeterminism
and then to very serious worries about how ordinary
matter in the large is to be accommodated into
quantum theory. For the picture of matter in the small
presented by quantum theory is quite unlike our
ordinary experience of matter in the large.
Superpositions of Matter
Waves
A distinctive characteristic of waves is that we can take
two waves and add them up to form a new wave.
That adding of waves is the essence of the
phenomenon of the interference of waves. The theory
of matter waves tells us that particles like electrons are
also waves. So we should be able to add several of
them together, just as we could add several light
waves together.
When we do this, we form the "superposition" of the
individual matter waves. These superpositions turn out
to have a central role in the theory of matter waves and
in quantum theory as a whole. So let us look at a
simple example of superposition. Here are four
matter waves with wavelengths 1, 1/2, 1/3 and 1/4.
We will "add them up," that is, form their superposition,
in the same way that we add light waves.
Notice what happened when we formed the
superposition. Each of the four component waves is
uniformly spread out in space and has a definite
wavelength. That situation starts to reverse in the
superposition. The resulting wave is no longer
uniformly spread out. It tends to be more
concentrated in one place. It also no longer has a
single wavelength. The distances between adjacent
peaks and troughs differ in different parts of the wave.
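The adding up described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text gives the four wavelengths, but the choice of cosine waves, equal unit amplitudes, and peaks aligned at x = 0 is ours, as are the function names.

```python
import math

# Wavelengths of the four component waves named in the text.
WAVELENGTHS = [1.0, 1.0 / 2, 1.0 / 3, 1.0 / 4]


def component(x, wavelength):
    """One component wave: uniformly spread out, unit amplitude (assumed)."""
    return math.cos(2 * math.pi * x / wavelength)


def superposition(x, wavelengths=WAVELENGTHS):
    """Add the component waves together point by point."""
    return sum(component(x, w) for w in wavelengths)


if __name__ == "__main__":
    # Sample one full period of the longest wave.
    for i in range(11):
        x = i / 10
        print(f"x = {x:.1f}   sum = {superposition(x):+.3f}")
```

With these assumptions the sum reaches 4 at x = 0, where all four peaks line up, and cancels to 0 at x = 0.5. Because only four waves are used, the concentrated peak repeats once every unit of length.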
Wave Packets
This example of superposition will help us resolve a
little puzzle in matter wave theory. Recall de
Broglie's relation. It tells us that a matter wave with a
definite wavelength has a definite momentum.
Where is the particle? The answer can be read from
the figure. It is spread throughout space. It has no
one position in space; it has all positions.
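De Broglie's relation, momentum = h / wavelength, can be written out as a short sketch. The function names and the sample wavelength are illustrative assumptions; only the relation itself comes from the earlier chapter.

```python
PLANCK_H = 6.62607015e-34  # Planck's constant h, in joule-seconds


def de_broglie_momentum(wavelength_m):
    """Momentum (kg m/s) of a matter wave with a definite wavelength (m)."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_H / wavelength_m


def de_broglie_wavelength(momentum):
    """Wavelength (m) of a matter wave with a definite momentum (kg m/s)."""
    if momentum <= 0:
        raise ValueError("momentum must be positive")
    return PLANCK_H / momentum


if __name__ == "__main__":
    # A wavelength of one angstrom, roughly atomic size (sample value).
    print(de_broglie_momentum(1e-10))
```

Halving the wavelength doubles the momentum, so each of the four component waves of the earlier example carries a different, definite momentum.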
What wave represents a particle that is spatially
localized? Take the extreme case of a particle
localized at just one point in space. Its matter wave
is just a pulse at that point in space.
So now we come to the puzzle: what is the
momentum of this spatially localized particle?
The superposition given earlier answers the puzzle.
We found that when we took the matter waves of
particles with different momenta and added them, we
produced a matter wave that was spatially localized. If
we had been careful in choosing exactly which matter
waves to add, we could find a set that would sum to
form a perfectly localized pulse. That set turns out to
contain all possible values of momenta.
So the answer to our puzzle is that the pulse is
associated with all possible momenta.
These two cases are the extremes. We have a matter
wave with a definite momentum but all possible
positions; and we have a matter wave with a definite
position but all possible momenta. Free, propagating
particles in quantum theory are represented by an
intermediate case, a wave packet:
We arrive at a wave packet by adding matter waves
with a small range of momenta. The resulting packet
occupies a range of positions in space and is
associated with a range of momenta.
Heisenberg's
"Uncertainty" Principle
The tradeoff we have just seen between definiteness
of position and definiteness of momentum is quantified
by what is commonly known as Heisenberg's
uncertainty principle. For reasons that I will explain
shortly, I prefer to call it an "indeterminacy principle." It
depends on using a standard statistical measure, the
standard deviation, for the uncertainty or
indeterminacy or, more colloquially, the spread in a
wave packet. The principle asserts:
(indeterminacy in position) × (indeterminacy in momentum) ≥ h/2π
This principle tells us that the indeterminacy in position
and momentum when multiplied together can never
get smaller than h/2π. To see what that amounts to,
imagine that we have a wave packet that has the least
indeterminacy allowed, so that the quantities multiplied
equal h/2π. If we then somehow further reduce the
indeterminacy of the momentum of this wave packet,
it follows from the principle that we must increase the
indeterminacy of the wave packet's position. For the
two quantities multiplied together can never get
smaller than h/2π.
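A minimal sketch of this tradeoff, taking the bound h/2π exactly as stated above. (Treatments that use strict standard deviations often quote a bound half as large, which shifts the numbers but not the tradeoff.) The function name and the sample momentum spreads are illustrative assumptions.

```python
import math

PLANCK_H = 6.62607015e-34          # Planck's constant h, in joule-seconds
BOUND = PLANCK_H / (2 * math.pi)   # the bound h/2π as quoted in the text


def min_position_spread(momentum_spread, bound=BOUND):
    """Smallest position indeterminacy allowed for a given momentum
    indeterminacy, if the product must be at least `bound`."""
    if momentum_spread <= 0:
        raise ValueError("momentum spread must be positive")
    return bound / momentum_spread


if __name__ == "__main__":
    # Each halving of the momentum spread doubles the least position spread.
    for dp in (4e-24, 2e-24, 1e-24):
        print(f"dp = {dp:.1e}   least dx = {min_position_spread(dp):.3e}")
```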
Conversely, if we reduce the indeterminacy of the
wave packet's position, then we must increase the
indeterminacy of its momentum. Just this was the
process we saw when we started to form a wave
packet by superposing waves of different momentum.
As we add more waves of different momentum, we can
narrow the spatial spread of the wave packet, but only
at the cost of increasing the spread in momentum.
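The narrowing can be checked numerically. In this sketch we add cosine waves with wavelengths 1, 1/2, ..., 1/n and measure how far the central peak extends before the sum first drops to zero. The equal amplitudes, the aligned peaks, and this particular measure of width are our assumptions, chosen for simplicity.

```python
import math


def packet(x, n_waves):
    """Sum of n_waves cosine waves with wavelengths 1, 1/2, ..., 1/n_waves."""
    return sum(math.cos(2 * math.pi * k * x) for k in range(1, n_waves + 1))


def central_peak_half_width(n_waves, step=1e-4):
    """Distance from the peak at x = 0 to the first sampled point where
    the sum is zero or negative, found by a simple scan."""
    max_steps = int(0.5 / step)
    for i in range(max_steps + 1):
        x = i * step
        if packet(x, n_waves) <= 0.0:
            return x
    raise RuntimeError("no zero crossing found in half a period")


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"{n:2d} waves: half width ~ {central_peak_half_width(n):.4f}")
```

More component waves means a larger spread of momenta and, as the scan shows, a narrower central peak.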
...Applied to a Hydrogen
Atom
Since h is such a small number, the sorts of indeterminacies
arising are so small as to be unnoticeable for ordinary objects. It
is quite different on an atomic scale.
Take the case of an electron trapped in a hydrogen atom.
Let's think about it classically. If the electron is to remain bound
to the positively charged nucleus of the atom, it must have a
quite small momentum. Then it will remain in the familiar elliptical
orbit of Bohr's theory. (Or if we think fully classically, it will spiral into
the nucleus as it radiates away its energy.)
If the momentum is too big, the electron will tear itself away
from the nucleus and escape. The electrical attraction of the
nucleus will not be sufficient to hold it. This situation is essentially
the same as what happens with a very rapidly moving comet and
the sun. If the comet moves slowly enough, it will remain trapped
in an elliptical orbit around the sun. If it is moving fast enough, it
will flee off into space never to return.
Now recall that these particles are matter waves subject to
Heisenberg's principle. The indeterminacy in the
momentum of the electron must be small. For only then are we
assured that the momentum of the electron remains close
enough to zero for it to remain trapped by the attraction of the
nucleus. If the indeterminacy is large, we cannot preclude the
possibility that the electron has a sufficiently large momentum to
escape.
It is a simple computation to see how small that
indeterminacy in the electron's momentum must be. If
we then insert that smallest indeterminacy into
Heisenberg's formula, we find the least
indeterminacy of the electron's position. That
indeterminacy in position turns out to be roughly of the
size of the atom; or, more precisely, of the lowest
energy orbit of Bohr's 1913 model.
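One way to carry out that computation is sketched here. The text does not spell out its steps, so this is a reconstruction under assumptions: we take the largest tolerable momentum to be the one whose kinetic energy equals the 13.6 eV binding energy of hydrogen's lowest state, and we use the bound h/2π quoted above.

```python
import math

H_BAR = 1.054571817e-34        # h / 2π, in joule-seconds
ELECTRON_MASS = 9.1093837e-31  # kilograms
EV = 1.602176634e-19           # joules per electron volt
BINDING_ENERGY_EV = 13.6       # assumed: binding energy of the lowest state


def max_momentum_spread(binding_energy_ev=BINDING_ENERGY_EV):
    """Momentum at which the kinetic energy p^2 / 2m equals the binding
    energy; a larger momentum would let the electron escape."""
    return math.sqrt(2 * ELECTRON_MASS * binding_energy_ev * EV)


def min_position_spread(momentum_spread):
    """Least position indeterminacy for the given momentum indeterminacy."""
    return H_BAR / momentum_spread


if __name__ == "__main__":
    dp = max_momentum_spread()
    dx = min_position_spread(dp)
    print(f"momentum spread: {dp:.3e} kg m/s")
    print(f"position spread: {dx:.3e} m")
```

Under these assumptions the position spread comes out at roughly 5 × 10⁻¹¹ meters, about half an angstrom, which is the scale of the lowest orbit in Bohr's model.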
So the electron is spread over the whole
atom; it is futile to look at a particular spot
within the atom for the electron. This reflects
what we already expected from the use of a
matter wave to represent an electron in a
hydrogen atom. Bohr's troublesome classical
orbits are replaced by waves spread over the
space surrounding the nucleus.
These waves are often pictured as diffuse
"clouds." The simplest of these clouds is
pictured at right. Of course the nucleus is also
subject to quantum mechanics, so it too should be
"fuzzed out" into a little cloud.
Complementary Pairs
This reciprocal indeterminacy of position and
momentum is just one of many in quantum mechanics.
When two quantities form complementary pairs, the
two quantities will enter into analogous indeterminacy
relations. There is such a relation, for example,
between the energy and timing of a process. There is
another between the angular momentum of an object
and its angular position. (The angular position of a body is just
a specification of the direction in which it lies with respect to some
arbitrarily chosen center and axis. Is it in the zero degree position?
Or do we find it at 90 degrees? A familiar example of angular
position is a compass bearing at sea. Our port, we might judge, lies
due East, that is 90 degrees from due North.)
This last indeterminacy can be applied to the example
of the hydrogen atom. If an orbiting electron is
definitely in just one of Bohr's stationary orbits, then its
angular momentum has a definite value. As a result of
the angular momentum-angular position
indeterminacy, its angular position must be completely
indeterminate. So the angular position of the electron
about an axis used to determine the angular
momentum is completely indeterminate. That is again
just what we would expect when we replace Bohr's
pointlike electrons with waves.
Uncertain or
Indefinite?
Why am I avoiding the common talk of "uncertainty" in
association with Heisenberg's principle?
Uncertainty over some quantity suggests the quantity
has a definite value but that we just do not know what
it is. We may be uncertain, for example, about the
price of paint at the paint store before we go there to
buy paint. There is a definite price all customers are
charged; we just do not know what it is.
Now compare that with the price that some very
valuable painting may obtain in a coming auction. We
do not now know what that price will be; the auction
hasn't happened yet. We may say that we are
uncertain of the price. But it is a different sort of
uncertainty. There is no price now to know. The price
will only be determined when the auction actually
happens.
In the standard approach to quantum mechanics, the
uncertainties of Heisenberg's uncertainty principle are
of the second type. When the position of a particle is
indeterminate, that means that there is no single
position associated with the particle; its wave is spread
over many positions. It is not that the particle really has
a definite position and we just don't know which it is. It
is not that we are uncertain about the position because
there are more facts to know about the position. There
are no further facts to know.
So talk of "uncertainty" in Heisenberg's formula can be
misleading. It suggests that we are just ignorant of
something that could be known. It is easy to overlook
the second way that we can come to be uncertain: the
issue is indefinite and there is nothing more to know.
The standard approach to quantum mechanics derives
the uncertainty from indefiniteness. There are other
approaches in which this is not so. In one developed
by Louis de Broglie and David Bohm, particles always
have a definite position and the uncertainties arise
from our ignorance. These approaches represent a
minority view.
How Quantum States
Change over Time
Schroedinger
Evolution...
An essential part of quantum mechanics deals with
how matter waves change over time. Mostly, matter
waves behave just like ordinary waves. If you have
ever watched ripples spread on the surface of a
smooth pond, you have seen at least qualitatively just
what matter waves do.
Take a particle that we localize to just one place, so its
matter wave is a spatially localized pulse. Left to itself,
that pulse will spread out in all directions as
propagating waves. It is just like what happens when a
pebble hits the surface of the pond. The localized
splash immediately spreads out in broadening ripples.
That type of behavior is called "Schroedinger
evolution," because it is governed by Schroedinger's
wave equation. That equation just says that matter
waves propagate like waves.
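The spreading can be put in numbers for one special case. The formula below is the standard textbook result for a free particle whose matter wave starts as a Gaussian wave packet; it is not derived in this chapter, and the choice of an electron and of the starting width is only an illustrative assumption.

```python
import math

H_BAR = 1.054571817e-34        # h / 2π, in joule-seconds
ELECTRON_MASS = 9.1093837e-31  # kilograms


def packet_width(t, initial_width, mass=ELECTRON_MASS):
    """Position spread (m) at time t (s) of a free Gaussian wave packet
    that starts with position spread `initial_width` (m).

    Uses width(t) = width(0) * sqrt(1 + (hbar * t / (2 m width(0)^2))^2).
    """
    ratio = H_BAR * t / (2 * mass * initial_width ** 2)
    return initial_width * math.sqrt(1 + ratio ** 2)


if __name__ == "__main__":
    start = 1e-10  # about the size of an atom (assumed starting width)
    for t in (0.0, 1e-16, 1e-15, 1e-14):
        print(f"t = {t:.0e} s   width = {packet_width(t, start):.3e} m")
```

The tighter the initial localization, the faster the packet spreads, which is the pebble-in-the-pond behavior in quantitative form.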
...Is Not the Whole
Story
If Schroedinger evolution were the only way that
matter waves could change, we would have some
difficulty connecting matter waves with our ordinary
experience. Matter waves typically are spread over
many positions and are superpositions of many
momenta. Yet when we measure them, we always
find just one value for position or momentum.
For example, the simplest sort of measurement is to
intercept a matter wave with a photographic plate or a
scintillation screen that glows when struck by a
particle. In both cases, we find that the matter waves
yield just one definite position. They give us a
single spot in the photograph or a localized flash of
light on the screen.
The screen of an old-fashioned TV tube is a scintillation
screen. Electrons are fired at it from an electron gun at
the rear of the tube. While the electrons are in flight, they
retain wave-like properties. Those wave-like properties are
essential to an electron microscope, which focusses them
like an optical microscope focusses light.
When the matter wave of the electron strikes the screen,
however, the resulting flash of light reveals just a single
position.
Measurement:
Collapse of the
Wave Packet
The standard solution to this problem is to propose
that there is a second sort of time evolution for matter
waves. The first type, Schroedinger evolution, arises
when matter waves are left to themselves or when they
interact with just a few other particles.
The second type arises whenever we perform a
measurement of a quantity like position or momentum.
Then the matter wave collapses to one that has a
definite value for the quantity measured. If we are
measuring the position of the matter wave, it collapses
to a localized pulse. If we are measuring momentum, it
collapses to a wave with a definite momentum.
This second sort of time evolution is called
"measurement" or "collapse of the wave packet."
It is not easy to specify exactly when a measurement
evolution will take place. The simplest condition is that
it arises in a circumstance in which we are trying to
ascertain the value of a quantity. That condition is of
no use in theory formation. For matter waves do not
"know" what we are intending; they do not choose to
evolve in one way or another according to our wishes
or interests. The best we can come up with is a simple
rule of thumb. Matter waves left to themselves or
interacting with just a few particles undergo
Schroedinger evolution. Matter waves interacting with
macroscopic bodies (such as particle detectors)
undergo collapse.
Indeterminism: An
Unsure Future
Schroedinger evolution of a matter wave is fully
deterministic. That means that if we specify the
present state of the matter wave, its future state is
fixed completely by Schroedinger's equation.
This determinism of the theory fails when we consider
measurement. For when we measure the position
of a particle represented by a wave packet, we do not
know for sure which position will be revealed. The best
we can do is to say which are the candidate positions
and, using a standard rule, compute the probability of
each.
Thus measurement introduces indeterminism into
quantum theory. A full specification of the present state
of the matter wave and everything that will interact with
it is not enough to fix what its future state will be.
The rule that determines the probability
of each candidate outcome depends
essentially on superposition.
Consider, for example, a wave packet.
It is the superposition of many spatially
localized pulses.
The figure shows just five of them. In
general there are infinitely many.
What is important is that the amplitudes
of the component pulses vary according
to the part to which they will contribute
in the fully assembled wave packet.
A pulse contributing to the large
amplitude central section will have a
large amplitude. A pulse contributing to
the smaller amplitude edges will itself
have a smaller amplitude.
This last fact is the clue that tells us
how to compute the probability of a
measurement outcome.
We expect the measured position of the
particle to appear more probably in the
large amplitude center of the wave
packet, than in the lower amplitude
edges.
Max Born used this fact when he proposed the "Born
rule," that tells us that the amplitude of the component
fixes the probability that this component will be the
outcome of measurement.
Probability that wave packet collapses to component on measurement
= (amplitude of component)²
The slight complication in Born's rule is that the amplitudes of the
components are not real numbers. They are complex numbers
that include things like "i," the square root of minus one and other
more complicated things like 1+i and 37 - 10i. Probabilities have
to be real numbers between 0 and 1. So Born had to convert
the complex-valued amplitudes into real numbers. There are
many ways of doing this. Few give a real number that also obeys
all the rules of the probability calculus. Taking the "square" of the
amplitude turns out to be the one that works.
For experts only: of course by "square" of a complex number I
really mean its "squared norm." That is the number itself,
multiplied by its complex conjugate. For z = 1+i, the squared
norm |z|² = (1+i)(1-i) = 1 - i² = 2.
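Born's rule and the collapse it governs can be sketched as follows. The component names, the sample amplitudes, and the rescaling step that makes the probabilities sum to 1 are illustrative assumptions; the squared norm is computed as the number multiplied by its complex conjugate, as described above.

```python
import random


def born_probabilities(amplitudes):
    """Map each component's complex amplitude to a probability.

    The squared norm z * conjugate(z) is computed for each amplitude and
    the results are rescaled so that they sum to 1.
    """
    weights = {name: (z * z.conjugate()).real for name, z in amplitudes.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one amplitude must be nonzero")
    return {name: w / total for name, w in weights.items()}


def collapse(amplitudes, rng=random):
    """Pick one component at random, weighted by its Born probability."""
    probs = born_probabilities(amplitudes)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names])[0]


if __name__ == "__main__":
    # Three hypothetical component pulses of a wave packet.
    packet = {"left": 1 + 1j, "center": 2 + 0j, "right": 1 - 1j}
    print(born_probabilities(packet))
    print("measured:", collapse(packet))
```

The center pulse, with the largest amplitude, is the most probable outcome, but nothing in the present state fixes which outcome a single measurement will give.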
Anxieties over
Irreducible
Chanciness
When quantum theory first emerged as our best theory
of fundamental particles, the central role of
probabilities in the theory caused much concern. The
probabilities associated with the collapse of the wave
packet were not of the type always formerly seen.
Prior to quantum theory, the probabilities that had
crept into physics could always be thought of as
manifestations of our ignorance of the true state of
affairs.
We might not know whether a coin will come up
heads or tails when tossed, so we say there is a
probability of 1/2 on heads. But that probability merely
masks our ignorance. If we knew exactly how hard the
coin had been flipped, exactly how the air currents in
the room were laid out, and a myriad of other
details, we could in principle determine exactly
whether the coin would be heads or tails.
In quantum theory, when the wave packet
collapses, we find different probabilities for the
different outcomes. But there is no definite fact of the
matter over which we are ignorant. There is no one
true, hidden outcome prior to measurement. No further
accumulation of information could lessen our
ignorance. There is nothing more to know. The best we
can say is that each of the position measurements is
possible and that they will arise with such and such
probability.
It is now a little hard to see why this difference in the
probabilities led to so much anxiety among physicists
in the 1920s and later. All that has happened is that we
have found the world to be a little different from what
we expected. We may once have thought probabilities
to be expressions of ignorance. We now find that they
are irreducible parts of the way the world is put
together. Their appearance in theory has nothing to do
with what we may or may not know. The world just is
fundamentally chancy in certain of its aspects.
The Nineteenth Century
View of Causation
The reason, I believe, that this irreducibly chancy
character of the world created such anxiety is a legacy
of nineteenth century philosophy. In the course of
the nineteenth century, the notion of causation had
been greatly purified by philosophical analysis. The
outcome was a lean account of causation as
determinism. "This causes that" simply means that "this"
is invariably followed by "that." So for the world to be
causal, in this view, simply means that the present
state of the world fixes its future state.
It may now be hard to see that this is what the
nineteenth century scientists took causality to
be. Here is Einstein, in a speech from 1950,
describing the situation:
"...the laws of the external world were also
taken to be complete, in the following sense:
If the state of the objects is completely given
at a certain time, then their state at any other
time is completely determined by the laws of
nature. This is just what we mean when we
speak of 'causality.' Such was approximately
the framework of the physical thinking a
hundred years ago."
Albert Einstein, "Physics, Philosophy, and Scientific Progress,"
International Congress of Surgeons, Cleveland, Ohio, 1950;
printed in Physics Today, June 2005, pp. 46-48.
The irreducible probabilities of quantum theory showed
that the present state of the world does not fix its future
state. The best it does is to give probabilities for
different possible futures. Therefore, according to the
nineteenth century conception, the world is not
causal. Thus the physicists of the 1920s frequently
lamented the violation of the "principle of causality."
The consensus now is that their notion of causation was far
too narrow. There are notions of causation that cohere
perfectly well with irreducible probabilities. Quantum theory
does not present a challenge to the cogency of causation. We
now think that quantum mechanics does not present a
foundational problem in this area. However quantum theory
does present some significant foundational problems in
related areas. These problems will be the subject of the
following chapters.
That is the majority view. There is a minority view, which I
champion. It regards the 1920s failure of the principle of
causality as part of a long history of failure. In this view, the
effort to find a principle of causality in nature is actually an effort
to conceive an a priori science. Processes in nature are
interconnected. But it is not our business to legislate in advance
the nature of that connectedness. Perhaps it conforms to
something like a principle of causality; or perhaps it does not.
The long history of our failure to find any well-functioning
principle of causality suggests that there is none to be found. It
suggests that our efforts are better spent empirically examining
how things connect, broadening our conceptions to match and
not trying to force them into a mold first devised thousands of
years ago. Or that is what I argue in my "Causation as Folk
Science," in Philosophers' Imprint, Vol. 3, No. 4.
What you should know
How matter waves enter into superpositions and
how this allows wave packets to form.
How Heisenberg's uncertainty principle places a
limit on the definiteness of quantities.
The difference between uncertainty and
indefiniteness.
What is quantum measurement (collapse of the
wave packet).
How probabilities essentially enter into quantum
theory and why this was initially regarded as a
failure of causality.
Copyright John D. Norton. April 2001; March 16, August 22, December 1, 2008; March 5, April 14, 2010.
The Measurement Problem
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/quantum_theory_measurement/index.html[28/04/2010 08:24:46 ﺹ]
HPS 0410 Einstein for Everyone
The Measurement Problem
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Two Rules of Evolution in Time
Schroedinger's Cat
Schroedinger's Original Version
Einstein's Early Version of the Problem
Einstein's Later Formulation
Responses to the Measurement Problem
What you should know
Two Rules of Evolution in
Time
Quantum theory has made many demands upon us.
We need now to accept that physics is essentially
indeterministic; that particles may be somewhere
without being at any particular place; that they may
have energy and momentum without having any
particular value for them; and a host more non-classical
oddities. Most of these ideas are simply
unfamiliar conceptions and, in the end, the best thing is
just to get used to the idea that the world depicted by
quantum theory is very different from the world
delivered by our raw senses.
There are other problems in quantum theory that
should not be accommodated with this forgiving
attitude. This chapter will develop the one that is most
prominent and has proven most intractable: the
measurement problem. It depends on the fact that a
quantum system can evolve in time in two ways.
One way, you will recall from the last chapter, is
Schroedinger evolution, in which the wave of the
system propagates in the familiar manner of waves.
The other way a quantum system can evolve in time is
through the "collapse of the wave packet" that arises
when we perform a measurement.
When will a wave packet undergo Schroedinger
evolution or collapse? Earlier, we saw that there is
only a rule of thumb to guide us. Schroedinger
evolution arises when matter waves are left to
themselves or when they interact with just a few
others. Measurement arises when a matter wave
interacts with a macroscopic measuring device. That
means that a matter wave interacting with a
photographic plate collapses. Sometimes it is said that
the last collapse does not happen until an intelligent
human agent actually looks at the plate. That last
claim is extremely strange. Are we supposed to
believe that human intelligence enters into the time
evolution of fundamental particles in the same way as
perturbing fields?
The lack of a precise principle to decide which
evolution will arise has created a constellation of
puzzles known as the "measurement problem." The
best known example is "Schroedinger's cat," a thought
experiment devised by Erwin Schroedinger in 1935.
Schroedinger's Cat
To see how it arises, let us first look at how quantum
theory treats radioactive decay. The radioactive
element Neptunium-231 (Np, atomic number 93) is
extremely unstable.
It will undergo radioactive decay quite quickly.
It has a "half life" of 53 minutes. That means that if we
start with a lump of Np-231 and wait 53 minutes we
will have only half a lump left, near enough, and a lot of
radioactive decay products.
At the level of an individual atom of Np-231, that
means that there is a probability of 1/2 that each
individual atom will decay over this 53 minutes. Now,
individual atoms of Np-231 are governed by
Schroedinger evolution; the probabilities only enter
when we measure to see if the atom has decayed or
not.
So over 53 minutes the atom evolves into a half:half
superposition of undecayed and decayed atom.
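The arithmetic behind the half:half superposition can be sketched as follows. The 53 minute half life is the figure used in the text; the function names, and the choice of real, non-negative amplitudes whose squares give the probabilities, are illustrative assumptions.

```python
import math

HALF_LIFE_MIN = 53.0  # half life used in the text, in minutes


def decay_probability(minutes, half_life=HALF_LIFE_MIN):
    """Probability that a single atom is found decayed after `minutes`."""
    return 1.0 - 0.5 ** (minutes / half_life)


def superposition_amplitudes(minutes, half_life=HALF_LIFE_MIN):
    """Amplitudes (undecayed, decayed) whose squares are the probabilities.

    Taking both amplitudes real and non-negative is a simplifying assumption.
    """
    p = decay_probability(minutes, half_life)
    return math.sqrt(1.0 - p), math.sqrt(p)


if __name__ == "__main__":
    for t in (0, 26.5, 53, 106):
        undecayed, decayed = superposition_amplitudes(t)
        print(f"{t:6.1f} min: P(decayed) = {decay_probability(t):.3f}, "
              f"amplitudes = ({undecayed:.3f}, {decayed:.3f})")
```

At 53 minutes the two probabilities are each 1/2, so the two components enter the superposition with equal weight.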
The collapse into one or other of these components
only arises when we take a measurement. That
may happen when we use a Geiger counter to check
for radioactive decay products. If we find them, then
the atom collapses into the decayed component.
Otherwise it collapses into the undecayed component.
So far everything seems reasonable. What
Schroedinger realized was that there was quite some
arbitrariness in our division between Schroedinger
evolution and wave collapse. It was quite possible for
that one collapse to be magnified. The decay
products of the one decaying atom might trigger the
collapse of others. So instead of having just one atom
entering into a superposition over 53 minutes, we
might have very many atoms all coupled together
entering the superposed state after 53 minutes.
The cat paradox arises when we push this process of
amplification to an extreme. Instead of coupling the
one atom of Np-231 to a collection of other
radioactive atoms, we couple it to the 10²⁵ atoms
of cat. The coupling is simple, although cruel. A Geiger
counter is set up to sense the decay of the atom. If it
decays, the Geiger counter will trigger the opening of a
can of poison. The atom, Geiger counter, poison and
cat are all enclosed in a box.
We then wait 53 minutes. In that time, the atom
evolves into a superposition of undecayed and
decayed atom. With it, the poison evolves into a
superposition of released and unreleased poison;
and the cat into a superposition of live cat and dead
cat.
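The coupled superposition can be written down as a small data structure. This is a toy sketch, not a physical model: the labels are ours, and the equal, real amplitudes are an assumption matching the half:half superposition reached after 53 minutes.

```python
import math

AMPLITUDE = 1 / math.sqrt(2)  # equal weights, as in a half:half superposition

# Each component pairs one state of the atom with the matching states of
# the poison and the cat. Mixed combinations, such as a decayed atom with
# a live cat, do not appear in the superposition at all.
CAT_STATE = {
    ("undecayed", "unreleased", "alive"): AMPLITUDE,
    ("decayed", "released", "dead"): AMPLITUDE,
}


def outcome_probabilities(state):
    """Born rule: probability of each component is its amplitude squared."""
    return {outcome: abs(amp) ** 2 for outcome, amp in state.items()}


if __name__ == "__main__":
    for outcome, prob in outcome_probabilities(CAT_STATE).items():
        print(outcome, round(prob, 3))
```

The point of the structure is that the atom, the poison and the cat do not each have their own independent superposition; the whole system has just two components.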
At this stage, no measurement has been performed;
no human has looked at the Geiger counter or listened
for its clicks. So the cat is neither alive nor dead.
The evolution, as far as the cat is concerned, is
something like this:
What finally decides whether the cat is alive or dead is
our observation. After 53 minutes we open the box and
observe, that is, "measure," the life state of the cat.
Only then does the cat's wave collapse onto one of
dead or alive.
There is a widespread sense that there is something
wrong with a theory that allows observation to play
such an important role. Most people have an
instinctive sense that the fact of life or death for
the cat is not decided merely by our observation. After
53 minutes, the cat is definitely just one of alive or
dead; whether we look in the box does not change that
circumstance in any way.
This instinctive reaction is surely correct. However
having it really only sharpens the problem. It does not
solve it. For the inference that the cat is in a
superposition of alive and dead follows directly from
quantum theory by merely assuming that the box
contains nothing but atoms whose time evolution is
governed by Schroedinger's equation.
This paradox of Schroedinger's cat is the most
vivid expression of a lingering problem in the
foundations of quantum theory. In the last two
decades especially, there has been a huge amount of
work devoted to finding variations to standard quantum
theory or just new ways to think about the same theory
that avoid this problem. There is no consensus on
which approach is the correct one or even if some sort
of repair is needed.
Schroedinger's Original Version
Erwin Schroedinger published his "cat" thought
experiment in a lengthy paper in the November 29,
1935, issue of the journal Die
Naturwissenschaften. Here's the entirety of his
original account:
Erwin Schroedinger, "Die
gegenwaertige Situation in der
Quantenmechanik," Die
Naturwissenschaften, 23 (1935), pp.
807-12, 824-28, 844-49.
Translation from Arthur Fine, The Shaky
Game: Einstein, Realism and the
Quantum Theory. University of Chicago
Press, 1986, p. 65; excepting last two
sentences.
"One can even make quite ludicrous examples. A cat
is enclosed in a steel chamber, together with the
following infernal machine (which one must secure
against the cat's direct reach): in the tube of a Geiger
counter there is a tiny amount of a radioactive material,
so small that although one of its atoms might decay in
the course of an hour, it is just as probable that none
will. If the decay occurs, the counter tube fires and, by
means of a relay, sets a little hammer into motion that
shatters a small bottle of prussic acid. When the entire
system has been left alone for an hour, one would say
that the cat is still alive provided no atom has decayed
in the meantime. The first atomic decay would have
poisoned it. The ψ-function of the total system would
yield an expression for all this in which, in equal
measure, the living and the dead cat (sit venia verbo
["pardon the expression"]) blended or smeared out.
The characteristic of these examples is that
an indefiniteness originally limited to
atomic dimensions gets transformed into
gross macroscopic indefiniteness, which
can then be reduced by direct
observation. This prevents us from
continuing naively to give credence to a
"fuzzy model" as a picture of reality. In
itself this contains nothing unclear or
contradictory. There is a difference
between a blurred or unsharply taken
photograph and a shot of clouds and
mist."
Einstein's Early Version of the Problem
We are used to thinking of Einstein as a visionary who
brought new and challenging theories of physics to us.
However, as the new quantum theory solidified in the
late 1920s and thereafter became standard physics,
Einstein increasingly found himself playing a rather
different role, a critic of the new. He had no doubt
about the great successes of quantum theory in
exploring atomic phenomena and accommodating the
results of experiments. His concern, however, was that
the theory was only a provisional stopping point on the
path to a better theory. We shall see in a coming
chapter how Einstein elaborated these worries. He
concentrated on the idea that the quantum wave was
not a complete description of reality, but, in some way,
merely described averages.
The best known expression of these worries came in a
1935 paper Einstein co-authored with Boris Podolsky
and Nathan Rosen, known universally as the "EPR"
paper. (A. Einstein, B. Podolsky, and N. Rosen, "Can quantum-mechanical
description of physical reality be considered complete?" Phys. Rev. 47
(1935), pp. 777-80. Received March 25, 1935; published May 15, 1935.)
In the aftermath of this paper, Einstein and Schroedinger
exchanged letters in which they aired their common concerns
about quantum theory. In that correspondence, Einstein put to
Schroedinger what we now see is an early version of the cat
paradox. He outlined a "crude macroscopic example" in a letter
to Schroedinger of August 8, 1935:
Translation from
Arthur Fine, The
Shaky Game:
Einstein, Realism
and the Quantum
Theory. University
of Chicago Press,
1986, p. 78.
"The system is a substance in chemically unstable
equilibrium, perhaps a pile of gunpowder that, by
means of intrinsic forces, can spontaneously combust,
and where the average life span of the whole setup is
a year. In principle this can quite easily be represented
quantum-mechanically. In the beginning the ψ-function
characterizes a reasonably well-defined macroscopic
state. But, according to your equation, after the course
of a year this is no longer the case at all. Rather, the
ψ-function then describes a sort of blend of not-yet
and of already-exploded systems.
Through no art of interpretation can this ψ-function be
turned into an adequate description of a real state of
affairs; [for] in reality there is just no intermediary
between exploded and nonexploded."
Perhaps Einstein's remarks led Schroedinger to his cat
thought experiment; or perhaps his thinking was
already moving in that direction. In a letter back to
Einstein of August 19, 1935, he characterized his
newly conceived cat thought experiment as "very
similar to your exploding powder keg."
Einstein's Later Formulation
In 1946, Einstein wrote a scientific autobiography for a volume
dedicated to him. In the volume, he repeated his concerns
over quantum theory in the vein in which he'd conceived them
in the 1935 EPR paper. The volume included a large number
of papers authored by others in Einstein's honor. In 1949,
Einstein assembled his reactions to them. These reactions
included a more mature version of the gunpowder example,
now modified by Schroedinger's formulation.
He took the example of the decay of a radioactive atom:
Albert Einstein, "Remarks Concerning the Essays Brought
Together in this Co-operative Volume," (1949) in P. A.
Schilpp, ed., Albert Einstein: Philosopher-Scientist. 2nd ed.
New York: Tudor Publishing, 1951.
"We consider as a physical system, in the
first instance, a radioactive atom of definite
average decay time, which is practically
exactly localised at a point of the coordinate
system. The radioactive process consists in
the emission of a (comparatively light) particle.
For the sake of simplicity we neglect the
motion of the residual atom after the
disintegration process. Then it is possible for
us, following Gamow, to replace the rest of the
atom by a space of atomic order of magnitude,
surrounded by a closed potential energy
barrier which, at a time t = 0, encloses the
particle to be emitted. The radioactive process
thus schematised is then, as is well known, to
be described — in the sense of elementary
quantum mechanics — by a ψ-function in
three dimensions, which at the time t = 0 is
different from zero only inside of the barrier,
but which, for positive times, expands into the
outer space. This ψ-function yields the
probability that the particle, at some chosen
instant, is actually in a chosen part of space
(i.e., is actually found there by a measurement
of position). On the other hand, the ψ-function
does not imply any assertion concerning the
time instant of the disintegration of the
radioactive atom."
Einstein first diagnosed the difficulty as arising from a
mistaken assumption that the quantum mechanical
wave function, the ψ-function, gives a complete
description of the one case, as opposed to an
average over the descriptions of many cases. A few
pages later, he then develops the example in the
direction of Schroedinger's cat thought experiment as
a response to the idea that we cannot ascertain a
definite time of decay without making measurements
that interfere essentially with the experiment:
"As far as I know, it was E. Schrödinger who first
called attention to a modification of this consideration,
which shows an interpretation of this type to be
impracticable. Rather than considering a system which
comprises only a radioactive atom (and its process of
transformation), one considers a system which
includes also the means for ascertaining the
radioactive transformation — for example, a Geiger
counter with automatic registration mechanism. Let
this latter include a registration strip, moved by a
clockwork, upon which a mark is made by tripping the
counter. True, from the point of view of quantum
mechanics this total system is very complex and its
configuration space is of very high dimension. But
there is in principle no objection to treating this entire
system from the standpoint of quantum mechanics.
Here too the theory determines the probability of each
configuration of all its co-ordinates for every time
instant. If one considers all configurations of the
co-ordinates, for a time large compared with the
average decay time of the radioactive atom, there will
be (at most) one such registration-mark on the paper
strip. To each co-ordinate configuration corresponds a
definite position of the mark on the paper strip. But,
inasmuch as the theory yields only the relative
probability of the thinkable co-ordinate-configurations,
it also offers only relative probabilities for the positions
of the mark on the paper strip, but no definite location
for this mark."
The single mark on the recording chart identifies a
definite time of decay, analogous to the definite
survival or death of Schroedinger's cat.
Yet the quantum mechanical formalism yields no
single mark, but many marks, weighted
probabilistically.
Einstein continued to explain that he regarded the
standard quantum account of the mark as one for
which "there is hardly likely to be anyone who
would be inclined to consider it seriously."
"If we attempt [to work with] the interpretation that the
quantum-theoretical description is to be understood as
a complete description of the individual system, we are
forced to the interpretation that the location of the mark
on the strip is nothing which belongs to the system per
se, but that the existence of that location is essentially
dependent upon the carrying out of an observation
made on the registration-strip. Such an interpretation
is certainly by no means absurd from a purely logical
standpoint, yet there is hardly likely to be anyone who
would be inclined to consider it seriously. For, in the
macroscopic sphere it simply is considered certain that
one must adhere to the program of a realistic
description in space and time; whereas in the sphere
of microscopic situations one is more readily inclined to
give up, or at least to modify, this program."
Responses to the Measurement Problem
One of the largest of the recent literatures in
philosophy of quantum theory has sought to resolve
the measurement problem, and we can have only the
briefest glimpse of it here. Generally speaking,
most of those responses fall into four groups.
1. Accept the standard account.
This response essentially urges that the standard
treatment is adequate. It is intelligible only in so far as
it repeats the rule of thumb for deciding when
measurement collapse occurs. In so far as it tries to go
further and offer more principled grounds, we descend
into a darkness where we are teased by dim lights with
names like "complementarity" that are so distant as to
remain obscure.
2. Hidden variable theory.
In this response, we are told that the probabilities of
quantum theory are (as Einstein wanted) merely
expressions of our ignorance. The best known and
best elaborated of these approaches is the
de Broglie-Bohm pilot wave theory. While the theory gives
an elegant treatment of the simplest case of
non-relativistic quantum mechanics, it is strained to
accommodate the later forms of quantum theory that
emerged in the decades following the 1920s.
3. New dynamics.
In this approach, we suppose that the laws governing
matter change when we move from considering just a
few particles to the very many that comprise
macroscopic bodies. It turns out that only very slight
changes are needed to eradicate the measurement
problem completely and to give macroscopic bodies
properties that are very different from their microscopic
constituents. The principal difficulty with this approach
is that no one is able to say just which of the many
possible slight changes is the correct one.
4. No-collapse theories.
These theories propose that Schroedinger evolution is
perfectly admissible for both macroscopic and
microscopic bodies. They deny that wave packet
collapse is a real process like Schroedinger evolution.
The most popular version of this approach employs the
notion that all results of a measurement are realized.
When we see a radioactive atom decay at a definite
moment, we ourselves are really in a superposition of
observers noting many different times of decay. This is
sometimes represented figuratively as if we are split
into many observers who inhabit many parallel worlds,
all equally real. This approach requires some
dedication. We must be quite committed to the idea
that a theory devised for tiny particles applies
unchanged to macroscopic bodies. For it requires us to
give up the most fundamental aspect of our laboratory
experience, that experiments have single, definite
results.
My own feeling is that none of these responses is
satisfactory. The least defective is the third. However,
if there are new physical laws that would resolve the
measurement problem, we can be pretty sure that they
are quite exotic and not produced by a small
adjustment in our existing theories. For, if these small
adjustments are there to be found, eight decades of
work by many of the brightest minds in quantum
physics has failed to find them.
What you should know
How radioactive decay is represented under
Schroedinger evolution as a superposition of
outcomes.
How Schroedinger's cat thought experiment
amplifies this superposition to cat states.
That the outcome of Schroedinger's thought
experiment is troubling.
Einstein's earlier gunpowder and later paper chart
formulations of the thought experiment.
Some sense of the very many proposals on offer
for resolving the measurement problem.
A healthy sense of skepticism about all of them.
Copyright John D. Norton. April 2001; March 16, August 22, December 1, 2008; March 7, 2010.
Completeness of Quantum Theory
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/quantum_theory_completeness/index.html[28/04/2010 08:25:21 ﺹ]
HPS 0410 Einstein for Everyone
Back to main course page
Einstein on the Completeness of Quantum Theory
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Einstein's Principal Objection to Quantum Theory
God Does Not Play Dice
Einstein's Positive Program
Entanglement
The EPR Argument
Separability, Locality and Reality
Later Formulations
The Einstein-Bohr Debate
Einstein Loses: The Bell Inequalities
Einstein in Retrospect
What you should know
The Einstein of this chapter is a little removed from the Einstein of
popular imagination. That Einstein is the first of the modern
physicists of the 20th century. He is the genius of 1905 who
established the reality of atoms, laid out special relativity and E=mc^2,
and made the audacious proposal of the light quantum. This same
Einstein went on to conceive a theory of gravity unlike anything seen
before and to reawaken the science of cosmology.
In his later years, a different Einstein emerged. The mainstream of
physics followed the course of the quantum theory of the mid-1920s.
Einstein recognized that this new quantum theory enjoyed remarkable
empirical successes, so that it clearly had something very right.
However, he did not believe that future fundamental physics should
be built upon it. Rather he thought the way ahead was to develop
the geometrical approach of general relativity into an
all-encompassing "unified field theory" within which the results of the
new quantum theory would be derived. While he had contributed to
its development, Einstein became the most prominent critic of the
new quantum theory.
Einstein's Principal Objection to
Quantum Theory
That Einstein was uncomfortable with quantum theory attracted much
attention and there have been many accounts of his reservations,
some trying to locate their deeper sources. However these different
accounts may vary, there is no doubt of Einstein's principal objection.
He believed that the quantum wave function of some system, the
ψ-function, was not a complete description of the system. Rather, it
provided some sort of statistical summary of the properties of many
like systems. (The term "ψ-function" is just an old-fashioned term for the quantum
wave. ψ is the Greek letter "psi.")
An example, NOT Einstein's, will make this a little clearer. Consider
the air in the room. As far as ordinary measurements are concerned,
the air forms a continuous fluid. When sound propagates in air,
waves of compression and rarefaction move through the air. We can
arrive at a powerful theory of air and sound solely using the
representation of air as a continuous fluid that harbors pressure
waves.
We now know that this theory is incomplete. Air is made up of very
many, very tiny molecules. The familiar pressure waves that we use
to represent sound waves really represent the average positions of
the molecules that comprise the air. If we could zoom in on just a
small part of the sound wave, we would see something like this
(where the figure is exaggerating the granularity of air):
The perfectly regular, nicely rounded pressure waves can be so
uniform only because they smooth away all the bumps of the
individual molecules. They do however provide a serviceable theory of
air and sound waves for many, many practical purposes. But they are
ultimately an incomplete picture of any particular sound wave. Many
different distributions of molecules can be smoothed to give the same
wave. So if we are given one wave, we cannot know which particular
distribution of air molecules lies behind it. It could be one of very
many.
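[Figures: several alternative distributions of molecules, "this one, or this one, or this one...", each of which smooths to the same wave.]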
Eventually the differences between them will matter. In this example,
if we use only the pressure wave picture, it will never be possible to
trace out the trajectory of a single molecule, even though the
complete mechanical description of the system assigns a definite
trajectory to each of the very many molecules.
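The point that many molecular distributions smooth to the same wave can be made concrete with a small sketch. It is only an illustration under assumptions of my own: the function name `coarse_grain`, the bin width and the two lists of molecule positions are all invented here, and a simple count of molecules per bin stands in for the smooth pressure description.

```python
from collections import Counter

def coarse_grain(positions, bin_width=1.0):
    """Count molecules per bin: a smooth description that forgets exact positions."""
    counts = Counter(int(x // bin_width) for x in positions)
    n_bins = max(counts) + 1
    return [counts.get(i, 0) for i in range(n_bins)]

# Two hypothetical, different microscopic arrangements of six molecules.
arrangement_1 = [0.1, 0.5, 0.9, 1.2, 2.3, 2.8]
arrangement_2 = [0.3, 0.4, 0.7, 1.9, 2.1, 2.5]

profile_1 = coarse_grain(arrangement_1)
profile_2 = coarse_grain(arrangement_2)

# The smoothed profiles agree even though no molecule sits in the same place.
print(profile_1, profile_2, profile_1 == profile_2)
```

Given only the smoothed profile, there is no way to recover which of the two arrangements produced it. That is the sense in which the smooth description is incomplete.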
Einstein's attitude to the quantum wave was analogous. The
ψ-function is not a complete description of any particular system. It is a
description of the average of many similar systems. For many
purposes, this will suffice, but ultimately it will fail. He wrote:
Albert Einstein, "Remarks Concerning the Essays
Brought Together in this Co-operative Volume," (1949)
in P. A. Schilpp, ed., Albert Einstein: Philosopher-Scientist.
2nd ed. New York: Tudor Publishing, 1951,
pp. 671-72.
"Within the framework of statistical quantum theory there is no
such thing as a complete description of the individual system.
More cautiously it might be put as follows: The attempt to
conceive the quantum-theoretical description as the complete
description of the individual systems leads to unnatural
theoretical interpretations, which become immediately
unnecessary if one accepts the interpretation that the description
refers to ensembles of systems and not to individual systems...
Assuming the success of efforts to accomplish a complete
physical description, the statistical quantum theory would, within
the framework of future physics, take an approximately
analogous position to the statistical mechanics within the
framework of classical mechanics. I am rather firmly convinced
that the development of theoretical physics will be of this type;
but the path will be lengthy and difficult."
God Does Not Play Dice
If the quantum wave is not a complete description of the physical
system, then Einstein has a ready explanation of the
probabilities that have now entered into physics in quantum
measurement processes: they are merely expressions of our
ignorance. If an atom has a probability of one half of radioactive
decay over an hour, then all that really means is that its wave
function describes an ensemble of many different atomic systems,
half of which decay in an hour. Whether one particular atom in the
ensemble will decay in one hour is definitely determinable. However,
we will not be able to discern it if all we know is the quantum wave
associated with it. Whether it decays or not depends upon properties
of that system that have been smoothed away by the quantum wave
and thus are unknown to us. It is our ignorance of these smoothed
away properties that makes a probabilistic assertion the best we can
do.
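The ignorance reading of the probabilities can be sketched with a toy ensemble. This is my illustration, not Einstein's: it assumes, purely for the sake of the example, that each atom carries a definite but hidden decay time drawn from an exponential decay law with a half-life of one hour. The names `hidden_decay_times` and `fraction_decayed` are hypothetical.

```python
import math
import random

HALF_LIFE_HOURS = 1.0                        # assumed: probability 1/2 of decay in one hour
DECAY_RATE = math.log(2) / HALF_LIFE_HOURS   # decay rate that gives this half-life

def hidden_decay_times(n_atoms, rng):
    """Give every atom in the ensemble its own definite, but hidden, decay time."""
    return [rng.expovariate(DECAY_RATE) for _ in range(n_atoms)]

def fraction_decayed(decay_times, elapsed_hours):
    """All that the ensemble description reports: the share of atoms that has decayed."""
    decayed = sum(1 for t in decay_times if t <= elapsed_hours)
    return decayed / len(decay_times)

rng = random.Random(0)                       # fixed seed so the run is repeatable
ensemble = hidden_decay_times(100_000, rng)
fraction = fraction_decayed(ensemble, elapsed_hours=1.0)
print(f"fraction decayed within one hour: {fraction:.3f}")
```

In this toy model each atom's fate is fixed in advance. The "one half" describes the ensemble and reflects only our ignorance of the individual decay times.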
The alternative to this view of incompleteness was to accept that the
quantum wave is a complete description of the system. Then the
probabilities of different measurement outcomes reflect an
ineliminable underdetermination in the world. Figuratively speaking,
the decision as to which outcome is realized lies outside the physical
system. The physics tells us that any of several outcomes is possible.
Einstein referred to this situation in his oft-repeated quip that he could
not believe that God plays dice. The remark seems to have been
made frequently, but mostly in conversation. Here is how he put it,
when it was written:
To Max Born, December 4, 1926.
In Born, Born-Einstein Letters, 91.
"Quantum mechanics is very worthy of regard. But an inner voice tells me
that this is not yet the right track. The theory yields much, but it hardly brings us
closer to the Old One's secrets. I, in any case, am convinced that He does not
play dice."
To Cornelius Lanczos, March 21,
1942, Einstein Archive, 15-294.
Both quoted from Alice Calaprice,
ed., The Expanded Quotable
Einstein. Princeton University
Press, 2000. p. 245, p. 251.
"It is hard to sneak a look at God's cards. But that he would choose to play
dice with the world...is something I cannot believe for a single moment."
Einstein's Positive Program
A note of caution is needed. The analogy to pressure waves in air is
my analogy. It suggests that Einstein somehow imagined a real,
pointlike particle hiding behind the quantum wave, a picture not so
removed from the Bohm hidden variable theory. Perhaps Einstein did
entertain a picture like this in his earlier speculations. However what
is quite distinctive about his mature statements of the incompleteness
of quantum theory is that they are extremely cautious in describing
the reality that may be hidden behind the statistical wave. Einstein remains
as uncommitted on the question as he can possibly be.
We do know, however, where Einstein hoped to find the theory that
would ultimately complete and even replace quantum theory. After he
completed his general theory of relativity in the 1910s, Einstein
embarked on the program of extending it to cover electromagnetism.
The general theory of relativity had shown that gravity could be
incorporated into the geometry of spacetime if we allowed for a
curved geometry. The hope was that further generalizations of the
geometry of spacetime would allow a geometrical treatment of
electricity and magnetism. This was his famed goal of a "unified
field theory." In the process, Einstein hoped, a fuller account of
quantum processes might emerge.
Einstein pursued this project for decades, up to his death. However,
the final results were inconclusive. As he dug himself deeper into
these investigations, the mainstream of physics turned in other
directions. While Einstein was struggling to understand how to unify
two forces, gravity and electromagnetism, physics had discovered
two more fundamental forces, the weak and strong nuclear forces.
And while Einstein focussed on the geometrical approach that proved
so fruitful in the 1910s, quantum physicists were dealing with a new
theory in which the idea of an observer-independent reality was
becoming elusive.
Entanglement
Einstein was relentlessly consistent in his principal complaint
concerning quantum theory: it could not be a complete theory. And
he was correspondingly single-minded in the principal argument he
used in his efforts to establish this incompleteness. The argument
depended essentially on a highly nonclassical element of quantum
theory that Schroedinger in the 1930s called "entanglement." (He
called it "Verschränkung", in the same paper in which he presented
his cat paradox.)
When two systems become entangled, a complete account of the
properties of one of the systems is not possible if it does not include
the other system; and this will be true no matter how far apart the two
systems may be spatially.
Entanglement can be illustrated if we consider the property of
position in space of a quantum particle. If there is just one particle,
we have already seen how the position property is discerned. The
particle will in general be represented by a wave spread in space. We
measure the position of the particle and this triggers a collapse to just
one point in space.
In this simplest case, the particle wave is spread over a small interval
of space. Slightly more complicated situations are possible. The
particle may be spread into discontinuous regions of space. For
example, the wave may have two lobes and we might measure just
whether the particle is in the left or right lobe. The measurement
operation has the effect of collapsing the wave to one of its two lobes
with a probability determined by the magnitude of the two lobes. (In
the figure, the two lobes are of equal magnitude, so collapse to each
is equally probable.)
How do we get two lobes like this? It is the situation that would arise
if we confined the particle to a box that had two disconnected
chambers. The particle wave is nonvanishing only inside the two
chambers but is zero everywhere else. We do not perform a
measurement that discerns the exact position of the particle. Rather,
we merely measure whether the particle is in the left or right
chamber. The measurement will collapse the wave to one or other of
its two lobes.
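The collapse rule for the two-lobed wave can be sketched in a few lines. The sketch assumes the standard rule that the probability of each outcome is the squared magnitude of the corresponding amplitude; the names `collapse_probabilities`, `measure` and `two_lobes` are hypothetical, and each lobe is reduced to a single number.

```python
import math
import random

def collapse_probabilities(amplitudes):
    """Squared magnitude of each lobe, normalized so the probabilities sum to one."""
    weights = {lobe: abs(a) ** 2 for lobe, a in amplitudes.items()}
    total = sum(weights.values())
    return {lobe: w / total for lobe, w in weights.items()}

def measure(amplitudes, rng):
    """Pick the lobe the wave collapses to, with the probabilities given above."""
    probs = collapse_probabilities(amplitudes)
    lobes = list(probs)
    return rng.choices(lobes, weights=[probs[lobe] for lobe in lobes])[0]

# Two lobes of equal magnitude, as in the case described in the text.
two_lobes = {"left": 1 / math.sqrt(2), "right": 1 / math.sqrt(2)}
probs = collapse_probabilities(two_lobes)

rng = random.Random(1)                      # fixed seed so the run is repeatable
results = [measure(two_lobes, rng) for _ in range(10)]
print(probs)
print(results)
```

Lobes of unequal magnitude would simply shift the probabilities: amplitudes in the ratio 3 to 4 give collapse probabilities of 0.36 and 0.64.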
Now take the case of two boxes, A and B, each with its own
particle. As before the particles are spread over the two chambers.
Drawing their wave functions is a little more complicated and this
complication will be the origin of entanglement. A picture of the A particle
wave in its A space and a picture of the B particle wave in its B space
by themselves omit essential information about how the two
particles are correlated.
In the fullest quantum account, we do not have two waves, one for each
particle. There is only one wave corresponding to the two particles
and that one wave resides not in ordinary three-dimensional space,
but in the six-dimensional configuration space of the two
particles. This six-dimensional space has three axes for the possible
positions of the A particle; and another three axes for the possible
spatial positions of the B particle. Picking one point in the space
specifies a position for both the A and the B particle.
The resulting six-dimensional space is impossible to draw easily.
However, we get the essential idea if we idealize each particle as
living in a one-dimensional space: a one-dimensional A space and a
one-dimensional B space. The corresponding configuration space is a
two-dimensional space. One of its dimensions is A space; and the
other is B space. Each point in this two-dimensional space gives us
one spatial coordinate for the A particle and one spatial coordinate for
the B particle. The wave that represents both particles is a wave in
the two-dimensional configuration space.
Here is one way that the two-particle system wave can be
distributed in this AB space:
The wave is zero everywhere except for two lobes. There is a lobe in
the region that corresponds to the left chamber "L" of box A and the
left chamber "L" of box B; and there is a second lobe in the region of
the right chamber "R" of box A and the right chamber "R" of box
B. If we measure the position of the A particle, the wave will
collapse to one or other lobe. For concreteness, let us say it is the
first lobe; the A particle will now definitely be in the left chamber. That
same collapse will automatically induce collapse of the B particle to
its left lobe and thereby confer on the B particle the property of
definitely being in the left chamber.
That is the remarkable outcome. As a result of a measurement on
the A particle, the B particle has acquired a more definite position,
even though the two particles may be widely separated in space.
The A particle may be in a box on my table. The B particle may be in
a box on a space station on Mars. This is entanglement. The
properties of the B particle are simply not separate from those of the
A particle no matter how far apart they may be in space. We
measured the A particle that resides on earth; and the B particle on
Mars was affected. If you are learning of entanglement for the first
time and you are not amazed by this result, you should go back and
reread the last few paragraphs.
The case just analyzed is the case of the left-right positions of the
two particles perfectly correlated: that is, a "left" for the A particle
always goes with a "left" for the B particle; and a "right" for the A
particle always goes with a "right" for the B particle.
Other cases are possible. Here is the wave for the case of two
perfectly anticorrelated particles. You should reflect on the figure
until you are convinced that on measurement a "left" on particle A
always goes with a "right" on particle B; and conversely.
The difference between correlated and anticorrelated particles cannot
be represented in the A or B spaces of the two particles individually.
It can only be represented in their joint configuration space. That is
why the use of configuration space is so essential. It lets us
represent otherwise elusive physical properties.
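As a rough illustration of this point, here is a small Python sketch. It assumes, purely for illustration, that each lobe carries probability one half (the text does not fix the weights) and it works with probabilities rather than with the wave itself. The names correlated, anticorrelated and marginal are hypothetical. The sketch tabulates what the A space and the B space can each show on their own.

```python
# Joint probabilities over (A chamber, B chamber).
# Assumption for illustration only: each lobe has probability 0.5.
correlated = {("L", "L"): 0.5, ("R", "R"): 0.5}
anticorrelated = {("L", "R"): 0.5, ("R", "L"): 0.5}


def marginal(joint, which):
    """Sum the joint distribution over the other particle.

    which = 0 gives the distribution for particle A alone,
    which = 1 gives the distribution for particle B alone.
    """
    out = {"L": 0.0, "R": 0.0}
    for pair, probability in joint.items():
        out[pair[which]] += probability
    return out


for name, joint in [("correlated", correlated),
                    ("anticorrelated", anticorrelated)]:
    # Each particle taken alone looks the same in both cases.
    print(name, "A alone:", marginal(joint, 0), "B alone:", marginal(joint, 1))
```

In both cases each particle taken alone is equally likely to be found left or right. Only the joint table over the AB space tells the two cases apart.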
Those of you who have seen entanglement discussed elsewhere will probably
have seen it expressed differently, as an impossibility of factoring the
common wave function into the product of a separate A and a separate B wave.
This is the same idea as expressed here in the figures. If we just take one lobe
of the common wave function (the left lobe, say), it can be formed by multiplying
together the left lobes of the individual wave functions. Undoing the
multiplication is just factoring the one lobe into the two separate waves.
When we have the fully entangled state with both left and right lobes present,
we can no longer represent the combined wave as a simple product of two
waves, one from each of the A and B spaces.
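The factoring idea can be sketched in a few lines of Python, under a simplifying assumption that is mine and not the text's: each particle has just two positions, left and right, so that the joint wave is a 2x2 table of amplitudes psi[a][b]. Such a table is a product u[a]*v[b] exactly when its determinant vanishes. The names is_product and outer are hypothetical.

```python
def outer(u, v):
    """Joint table formed by multiplying an A wave u and a B wave v."""
    return [[ua * vb for vb in v] for ua in u]


def is_product(psi, tol=1e-12):
    """True if the 2x2 table psi[a][b] factors as u[a] * v[b].

    For a 2x2 table this is equivalent to a vanishing determinant.
    """
    det = psi[0][0] * psi[1][1] - psi[0][1] * psi[1][0]
    return abs(det) < tol


s = 2 ** -0.5  # equal weight in each lobe (an illustrative assumption)

one_lobe = [[1.0, 0.0], [0.0, 0.0]]   # left-left lobe only
two_lobes = [[s, 0.0], [0.0, s]]      # left-left and right-right lobes
anti_lobes = [[0.0, s], [s, 0.0]]     # left-right and right-left lobes

print(is_product(one_lobe))                      # a single lobe factors
print(is_product(outer([0.6, 0.8], [1.0, 0.0]))) # any product factors
print(is_product(two_lobes), is_product(anti_lobes))  # entangled cases do not
```

The single lobe passes the test, while both two-lobe tables fail it. That failure is the impossibility of factoring described above.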
The EPR Argument
The earliest fully developed and published version of Einstein's
argument against the completeness of quantum mechanics appeared in
a 1935 article coauthored with Boris Podolsky and Nathan Rosen and
universally known by the initials of its authors, "EPR."
A. Einstein, B. Podolsky, and N. Rosen,
"Can Quantum-Mechanical Description of
Physical Reality Be Considered Complete?"
Physical Review, 47 (1935), pp. 777–780.
The argument of the paper depends essentially on exploiting
entanglement. If one has two entangled systems, one can perform
measurements on one of the systems and thereby learn the
properties of the other. We have seen how this would work in my
example above of a particle distributed over two chambers. The key
idea is that the measurement we perform on the first system
will not disturb the second, so that whatever property we learn of the
second system must be one it possessed prior to our making the
measurement.
This is why entanglement is such a powerful idea. We can allow that
a measurement on the first particle will disturb the first particle.
However, EPR insist that a measurement on the first particle will not
disturb the second particle, which could be removed many light
years from the first in space.
The argument is then completed by noting that we could have
measured many different properties of the first system and, as a
result, discovered many properties of the second, many more than
an assumption of completeness would allow.
This can be seen in the illustration EPR give of their general
argument. We imagine two particles that are entangled in such a way
that their momenta and position coordinates are equal but opposite in
sign. The simplest way to create such an entangled pair is through an
atomic event that ejects two particles of the same type in opposite
directions.
Assume the event is such that the total momentum of the pair is zero.
Then, if the first particle has momentum +10 units, the other must
have momentum -10 units; and so on for all other possible values.
Similarly, the symmetry of the pair assures us that if the first particle
has moved to a position +100 units of distance from the creation
event, then the other must be at -100 units of distance; and so on for
all other possible positions.
It follows that we can discover the properties of the second
particle at will and without disturbing it, merely by performing
measurements on the first particle. We could, for example, discern
the second particle's momentum by measuring the momentum of the
first particle. Or we could find the position at some moment of the
second particle by measuring the position of the first.
We do not actually need to perform any of the measurements to
be assured that the second particle possesses the properties
mentioned. The mere possibility of the measurements is enough to
assure us that the properties are really there. That is, we do not need
to know the momentum and position of the second particle to be
assured that it has a definite momentum and position.
We conclude that the second particle must possess both a definite
position and a definite momentum. The wave representing the
second particle, however, will in general assign neither definite
position nor definite momentum to it. Therefore, EPR conclude, the
quantum wave is an incomplete description.
Separability, Locality and Reality
The discussion above summarizes the EPR argument. However it
does not fully expose the assumptions that it makes. For the
argument to succeed, there are two assumptions needed and both
have been subject to quite intense scrutiny in the literature.
The first is separability. EPR tacitly assume that two systems widely
separated in space have independent existences, so that the state of
one can be specified fully without consideration of the second.
The second assumption is locality. EPR assume that a
measurement here cannot affect a system there if here and there are
spacelike separated, that is, if any influence propagating from one
place to the other would have to proceed faster than light.
Both assumptions are disavowed by standard quantum
mechanics. Entangled states violate separability and measurement
collapse is instantaneous. The admissibility and persuasiveness of
the EPR argument depend essentially on the extent to which one
accepts these two assumptions. One might discard them just
because quantum theory, our most successful theory of matter, does
not adhere to them. Or one might adopt them precisely because one
senses this is the beginning of the escape from the deeper woes of
the measurement problem.
The EPR paper did clearly state one of its premises that is closely
connected with these last two ideas. It is the "criterion of reality" that
takes a definite stance on a central issue in philosophy: how do we
know what is real and what is not:
Criterion of reality
"If, without in any way disturbing a system, we can predict with
certainty (i.e. with probability equal to unity) the value of a physical
quantity, then there exists an element of physical reality
corresponding to this physical quantity."
Later Formulations
The EPR paper is the best-known expression of Einstein's argument
against the completeness of quantum theory. The logic of the paper
is a little more tangled than the sketch just given. There is clear
evidence that Einstein felt the tangles unnecessary, attributing them
to his coauthor. He wrote shortly afterwards to Schroedinger of his
concern (June 19, 1935):
"For reasons of language this [paper] was written by Podolsky after
several discussions. Still, it did not come out as well as I had
originally wanted; rather, the essential thing was, so to speak,
smothered by the formalism [Gelehrsamkeit]." (Translation from
http://plato.stanford.edu/entries/qtepr/)
Einstein's own later statements of the essential argument were
much briefer and clearer. Here is a version from his article "Physics
and Reality" (Journal of the Franklin Institute, 221, 1936).
"Consider a mechanical system consisting of two partial systems A
and B which interact with each other only during a limited time. Let
the ψ function before their interaction be given. Then the Schrödinger
equation will furnish the ψ function after the interaction has taken
place. Let us now determine the physical state of the partial system A
as completely as possible by measurements. Then quantum
mechanics allows us to determine the ψ function of the partial system
B from the measurements made, and from the ψ function of the total
system. This determination, however, gives a result which depends
upon which of the physical quantities (observables) of A have been
measured (for instance, coordinates or momenta). Since there can be
only one physical state of B after the interaction which cannot
reasonably be considered to depend on the particular measurement
we perform on the system A separated from B it may be concluded
that the ψ function is not unambiguously coordinated to the physical
state. This coordination of several ψ functions to the same physical
state of system B shows again that the ψ function cannot be
interpreted as a (complete) description of a physical state of a single
system. Here also the coordination of the ψ function to an ensemble
of systems eliminates every difficulty.*
[Footnote] * A measurement on A, for example, thus involves a
transition to a narrower ensemble of systems. The latter (hence also
its ψ function) depends upon the point of view according to which this
reduction of the ensemble of systems is carried out."
Here's the version given in Einstein's Autobiographical Notes,
written over a decade after the EPR paper. It is worth quoting at
length since it surely represents Einstein's most considered view,
expressed in the way he thought most fitting.
"There is to be a system that at the time t of our observation consists
of two component systems S₁ and S₂, which at this time are spatially
separated and (in the sense of the classical physics) interact with
each other but slightly. The total system is to be described completely
in terms of quantum mechanics by a known ψ-function, say ψ₁₂. All
quantum theoreticians now agree upon the following. If I make a
complete measurement of S₁, I obtain from the results of the
measurement and from ψ₁₂ an entirely definite ψ-function ψ₂ of the
system S₂. The character of ψ₂ then depends upon what kind of
measurement I perform on S₁.
Now it appears to me that one may speak of the real state of the
partial system S₂. To begin with, before performing the measurement
on S₁, we know even less of this real state than we know of a system
described by the ψ-function. But on one assumption we should, in my
opinion, insist without qualification: the real state of the system S₂
is independent of any manipulation of the system S₁, which is spatially
separated from the former. According to the type of measurement I
perform on S₁, I get, however, a very different ψ₂ for the second
partial system (φ₂, φ₂¹, . . . ). Now, however, the real state of S₂
must be independent of what happens to S₁. For the same real state
of S₂ it is possible therefore to find (depending on one's choice of the
measurement performed on S₁) different types of ψ-function. (One
can escape from this conclusion only by either assuming that the
measurement of S₁ (telepathically) changes the real state of S₂ or by
denying altogether that spatially separated entities possess
independent real states. Both alternatives appear to me entirely
unacceptable.)
If now the physicists A and B accept this reasoning as valid, then B
will have to give up his position that the ψ-function constitutes a
complete description of a real state. For in this case it would be
impossible that two different types of ψ-functions could be assigned
to the identical state of S₂.
The statistical character of the present theory would then follow
necessarily from the incompleteness of the description of the systems
in quantum mechanics, and there would no longer exist any ground
for the assumption that a future foundation of physics must be based
upon statistics.
It is my opinion that the contemporary quantum theory represents an
optimal formulation of the relationships, given certain fixed basic
concepts, which by and large have been taken from classical
mechanics. I believe, however, that this theory offers no useful point
of departure for future development...
"
One remark is especially noteworthy since it makes clear the
importance of locality and separability in Einstein's argument. He
canvasses two possible escapes from his conclusion of the
incompleteness of quantum theory. The first is "measurement of S₁
(telepathically) changes the real state of S₂"; that corresponds to a
violation of locality. The second is "denying altogether that spatially
separated entities possess independent real states"; that is the
violation of separability.
In plumbing the depths of Einstein's objections to quantum theory, his
concern to preserve separability seems to be the deepest and
most fundamental. Certainly separability is logically prior to locality.
One cannot require systems here to interact locally with systems
there unless one can already distinguish systems here from systems
there. That distinction requires separability.
Here's one of Einstein's remarks on the question from a paper
"Quantum Mechanics and Reality," Dialectica, 2 (1948), pp. 320–24:
"Without such an assumption of the mutually independent existence
... of spatially distant things, an assumption which originates in
everyday thought, physical thought in the sense familiar to us would
not be possible. Nor does one see how physical laws could be
formulated and tested without such a clean separation."
Translation from Don Howard, "Einstein on Locality and Separability," in Studies in History and
Philosophy of Science, 16 (1985), pp. 171–201.
The Einstein-Bohr Debate
In all this, Einstein was defending a minority view in the physics
community. The task of responding to Einstein was taken up by Niels
Bohr. The debate in which they engaged was surely one of the
monumental debates of the 20th century. Here were two titans of
modern physics with quite opposed positions, struggling to establish
their view of the meaning of the quantum.
The great difficulty in following the debate, however, is that its canonical history has been
written by Bohr in his contribution to the Schilpp Einstein volume. There one finds a story
of a farsighted Bohr, who recognizes the profound philosophical reorientation brought by
quantum theory; and a reactionary, recalcitrant Einstein unable to accommodate the novelty.
Einstein's view was, we would expect, somewhat different. Unfortunately Einstein gave no
extended, published account of his perspective on the debate. In private correspondence,
he was quite disparaging of Bohr, calling him a "talmudic philosopher [who] doesn't give a
hoot for 'reality,' which he regards as a hobgoblin of the naive..." (Einstein to Schroedinger, June 19,
1935. Translation from Don Howard, "Einstein on Locality and Separability," in Studies in History and Philosophy of Science, 16
(1985), pp. 171–201, on p. 178.)
Niels Bohr, "Discussions with Einstein on Epistemological Problems in
Atomic Physics," in P. A. Schilpp, ed., Albert Einstein:
Philosopher-Scientist. 2nd ed. New York: Tudor Publishing, 1951.
It is easier to report on Bohr's views than to justify them. So let me attempt
just to report them, with a few expressions of my own hesitations. The point of
view advocated by Bohr was labeled "complementarity" by Bohr, and its starting
point was an insistence that we must describe experiments in classical
terms:
"...it is decisive to recognise that, however far the phenomena transcend
the scope of classical physical explanation, the account of all evidence must
be expressed in classical terms. The argument is simply that by the word
"experiment" we refer to a situation where we can tell others what we have
done and what we have learned and that, therefore, the account of the
experimental arrangement and of the results of the observations must be
expressed in unambiguous language with suitable application of the
terminology of classical physics.
"
The "must be" seems to me excessive and unwarranted. Somehow, in a way
he does not describe, Bohr is able to preclude the description of my
sensing a flash of light as "I sensed a photon," where photon is a term
whose meaning is given by quantum theory. Classical terminology is
peculiarly well-adapted to ordinary sized objects since it arises in a
theory designed to describe them. So it is easy to continue to use
classical terms when we describe quantum experiments with ordinary
sized objects. We should not confuse that comfort with our having no
alternative.
This led Bohr immediately to what seems to be the central idea:
"This crucial point, which was to become a main theme of the
discussions reported in the following, implies the impossibility of any
sharp separation between the behaviour of atomic objects and the
interaction with the measuring instruments which serve to define the
conditions under which the phenomena appear. In fact, the
individuality of the typical quantum effects finds its proper expression
in the circumstance that any attempt of subdividing the phenomena
will demand a change in the experimental arrangement introducing
new possibilities of interaction between objects and measuring
instruments which in principle cannot be controlled. Consequently,
evidence obtained under different experimental conditions cannot be
comprehended within a single picture, but must be regarded as
complementary in the sense that only the totality of the phenomena
exhausts the possible information about the objects.
"
The main theme was then illustrated vividly and effectively
with a series of descriptions of the various measurement
devices, described in a "semi-serious" realistic style, in order
to make clear that performing one measurement precludes
the performing of another. For example, a position
measurement on a propagating particle might employ a slit
firmly bolted to the bench through which the particle passes.
We then know the exact height of the particle when it passes
through the slit.
A momentum measurement, however, would require a movable slit, whose recoil
under the passage of the particle would let us determine the size of a momentum
transfer to or from the particle. That essential moveability of the slit precluded the
fixed slit arrangement of the position measurement.
The mutual exclusivity of the two arrangements of measurement apparatus is
reflected in the complementarity of the quantities of position and momentum.
Bohr then developed further examples, including the celebrated
Einstein "photon in a box" thought experiment.
Overall, on Bohr's account so far, Einstein's approach was decisively
defeated in the resulting analyses. Needless to say, that did not seem
to be Einstein's view. You can read the details of Bohr's discussion
in his text and, for a suggestion on Einstein's side, see Don Howard,
"Revisiting the Einstein-Bohr Dialogue."
All this display of realistic measuring devices was a prelude to
Bohr's response to the EPR argument. In giving it, he quoted
from an earlier text he'd written that captured his central response:
"From our point of view we now see that the wording of the
above-mentioned criterion of physical reality proposed by Einstein,
Podolsky, and Rosen contains an ambiguity as regards the meaning
of the expression ' without in any way disturbing a system.' Of course
there is in a case like that just considered no question of a
mechanical disturbance of the system under investigation during the
last critical stage of the measuring procedure. But even at this stage
there is essentially the question of an influence on the very conditions
which define the possible types of predictions regarding the future
behaviour of the system. Since these conditions constitute an
inherent element of the description of any phenomenon to which the
term "physical reality" can be properly attached, we see that the
argumentation of the mentioned authors does not justify their
conclusion that quantum mechanical description is essentially
incomplete. On the contrary, this description, as appears from the
preceding discussion, may be characterised as a rational utilisation of
all possibilities of unambiguous interpretation of measurements,
compatible with the finite and uncontrollable interaction between the
objects and the measuring instruments in the field of quantum theory.
In fact, it is only the mutual exclusion of any two experimental
procedures, permitting the unambiguous definition of complementary
physical quantities, which provides room for new physical laws, the
coexistence of which might at first sight appear irreconcilable with the
basic principles of science. It is just this entirely new situation as
regards the description of physical phenomena that the notion of
complementarity aims at characterising.
"
This explanation of the purported failure of the EPR argument is not
easy to comprehend on a first reading. One expects it to become clearer
on rereading. My experience is that this does not happen and I have
been unable to find a cogent interpretation of the text. Whether we
should persevere or not remains an issue that divides the philosophy of
physics community. One part remains convinced that Bohr's insights were
profound, but poorly expressed, and that we should keep seeking their
deeper insight. Another holds that Bohr had vivid thoughts that he
believed, mistakenly, solved foundational problems; but these thoughts
were incoherent and the opacity of Bohr's writing is simply a result of
that incoherence.
I belong to the second group that finds Bohr's thought opaque. My best
efforts find Bohr advocating a kind of ultra-empiricism that entangles
epistemology (how we know things) with ontology (what things are). The
idea is that what something is, is inseparable from how we actually
happened to find out about it.
The EPR argument requires us to imagine two different measurements that
we might perform on the first system; and from their possible outcomes
we infer the properties of the second. EPR presume that it is possible
to know what would happen were two different measurements performed on
the system. Bohr's ultra-empiricism asserts that the two systems would
not be the same system if different measurements were performed on
them. For what the system is, involves essentially which measurement is
performed on it. What EPR think of as one system, explored by different
measurements, is, for Bohr's ultra-empiricism, two different systems.
It follows that EPR are mistaken in imagining that the two measurements
could be performed on the very same system. The first steps of the EPR
argument are blocked.
While this seems to be Bohr's argument, it is opaque to me why Bohr
thought this ultra-empiricism is compatible with quantum theory. It
amounts to a denial that quantum theory supports what philosophers
call "counterfactuals": statements of what would have happened were,
contrary to actual fact, some other conditions to be the case. Quantum
theory clearly supports counterfactuals. When we have a spread out
quantum wave representing some particle, standard algorithms in the
theory tell us what would happen were we to perform this
measurement, or, instead of it, had we performed that measurement.
Generally the description of what would happen is expressed in terms
of the probabilities of various outcomes. But there is no difficulty in
recovering the result. Thus, there seems no problem as far as quantum
theory is concerned when EPR assert what would happen were this
measurement or another incompatible measurement to be performed.
Einstein's direct response to Bohr's analysis in the same volume
was terse, even severe:
"...it must seem a mistake to permit theoretical description to be
directly dependent upon acts of empirical assertions, as it seems to
me to be intended [for example] in Bohr's principle of
complementarity, the sharp formulation of which, moreover, I have
been unable to achieve despite much effort which I have expended
on it. From my point of view [such] statements or measurements can
occur only as special instances, viz., parts, of physical description, to
which I cannot ascribe any exceptional position above the rest"
Not many scholars have the distinction of being told in print by
Einstein that he has been unable to discern precisely what they are
asserting "despite much effort"!
While the final outcome of the debate remains controversial in the
philosophy of physics literature, I can state my own view. In his
debate with Bohr, Einstein won. Einstein's argument is clear and
powerful. Bohr's claims are either obscure or indefensible.
Einstein Loses: The Bell Inequalities
One can win a battle, but lose the war. And that is what happened.
Einstein won the debate with Bohr, in my view. In his debate with
quantum theory, Einstein lost and unequivocally so.
The reasons for his loss did not emerge during his lifetime. They came
in the decade after his death through the work of John S. Bell. The
story of Bell's work and the flood of work it inspired is too large a
topic to treat adequately in this short section. We can see only some
preliminary fragments here.
What Bell noticed was a lacuna in Einstein's argument. Einstein
correctly noted that measurements on a system would enable the
prediction of outcomes of measurements on another system
entangled with it. By imagining different possible measurements on
the first system, one then merely used the computations of the
quantum theory to determine the outcomes of the measurements on
the second system.
Einstein presumed that all these possible outcomes reflected
properties possessed by the second system. That meant that all
possible measurements would end up revealing a single set of possessed
properties. What Bell showed was that this last assumption failed. If
one assumed that the computations of quantum theory correctly
predicted the outcomes of measurement, then there was no single set
of hidden properties consistent with all possible measurements. Or,
more precisely, if one assumed separability and locality, then there
was no such set.
The situation is not so removed from the familiar parable of the blind men
and the elephant, but with an essential twist. In it, several blind men feel
different parts of the elephant, each imagining a very different animal on the
basis of the limited portion they sensed.
We, however, recognize that each part they describe can be fitted together to
describe the one familiar animal.
In the quantum case, however, each of the different measurements yields
results that cannot be fitted together to describe a single independent reality.
So it is as if the blind men report parts that cannot be integrated consistently
into one animal.
Bell's arguments cannot be developed here. They go beyond the
ideas developed above. But they do so only in the technical details,
not in matters of basic principle. To begin, Bell set his analysis in the
context of a version of the EPR argument laid out by David Bohm.
That version was devoted specifically to the measurement of a
quantity known as "spin." It was used since, in the context of quantum
mechanics, it is actually one of the simplest magnitudes.
Bell then assumed that entangled systems do have properties that
conform with Einstein's expectations of separability and locality; and
that these hidden properties fix the probabilities of the various
outcomes that arise on measurement. The outcomes are only
constrained by these probabilities, so generally we cannot be sure
which ones will appear in any one measurement. However, in many
repeated experiments, definite trends will emerge. They will take the
form of correlations between the outcomes returned by
measurements on each of the two entangled particles. What Bell
showed is that a characteristic parameter of these correlations will
always lie in a small interval of values. The assertion that it lies in
this interval is the Bell inequality. In later treatments, this interval
spanned -2 to +2.
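The bound can be checked by brute force in a short Python sketch. It assumes, as the text does not spell out, that the parameter of the later treatments is the combination introduced by Clauser, Horne, Shimony and Holt (CHSH): each particle has two measurement settings, every outcome is +1 or -1, and each outcome is fixed in advance by hidden properties that do not depend on the setting chosen for the distant particle. The name chsh is hypothetical.

```python
from itertools import product


def chsh(a1, a2, b1, b2):
    """CHSH combination for one assignment of predetermined outcomes.

    a1, a2 are the outcomes particle A would give under its two settings;
    b1, b2 are the outcomes particle B would give under its two settings.
    Each outcome is +1 or -1.
    """
    return a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2


# Enumerate all 16 ways of fixing the four outcomes in advance.
values = [chsh(*outcomes) for outcomes in product([-1, 1], repeat=4)]
print(min(values), max(values))
```

Every assignment yields exactly -2 or +2. An average over many runs, each governed by some such assignment, therefore cannot fall outside the interval from -2 to +2.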
The parameter of Bell's inequality can also be determined by
assuming that the measurements conform to the predictions of the
quantum theory. The result of that calculation is that the parameter is
greater than 2 for the particular case treated. It follows that a theory
conforming to Einstein's expectations cannot yield the same
predictions as quantum theory. Since we have high confidence in
quantum theory's predictions, this was taken as a demonstration of
the failure of Einstein's assumptions.
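For comparison, here is a sketch of the quantum calculation in Python. It relies on a standard textbook result that is not derived in this chapter: for two spin one-half particles in the singlet state, the correlation of spin measurements along directions at angles a and b in a common plane is -cos(a - b). The particular angles are a conventional choice that maximizes the effect, and the combination of correlations is the CHSH form (an assumption about which "parameter" is meant). The name singlet_correlation is hypothetical.

```python
import math


def singlet_correlation(a, b):
    """Quantum prediction for the spin correlation of a singlet pair
    measured along directions at angles a and b (radians)."""
    return -math.cos(a - b)


# Two settings for each particle (illustrative, conventional choice).
a1, a2 = 0.0, math.pi / 2
b1, b2 = math.pi / 4, -math.pi / 4

S = (singlet_correlation(a1, b1) + singlet_correlation(a1, b2)
     + singlet_correlation(a2, b1) - singlet_correlation(a2, b2))

print(abs(S), 2 * math.sqrt(2))
```

The magnitude comes out as 2 times the square root of 2, roughly 2.83, which lies outside the interval allowed to any theory that keeps both separability and locality.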
There was a loophole. Was it possible that the predictions of
quantum theory were incorrect on this parameter? Later experiments,
such as those reported by Aspect in 1981, affirmed that the quantum
predictions were correct. The loophole was closed.
The final outcome is that the EPR argument for the incompleteness
of quantum theory fails. Whatever reality lies behind quantum
processes does not conform to the presumptions of the EPR
argument. We cannot keep both separability and locality.
Something has to be given up.
For more see Abner Shimony, "Bell's Theorem," Stanford
Encyclopedia of Philosophy.
Einstein in Retrospect
So what should our verdict be of Einstein's recalcitrance in the face of
the new quantum theory? Here is my view. Einstein was wrong in
his suppositions of separability and locality in the quantum domain.
In his time, they were entirely reasonable demands and it was very
hard to see then that they would fail. That they do fail is the lesson
we have now learned.
However, in my view, he was not wrong to resist the foundational
accounts that surrounded quantum theory in his final decades. He
was quite right to protest that no account of the quantum domain
could so glibly give up the notion of reality as they did. All was
not well then in our accounts of the quantum domain; and all is not
well now. The clearest indication of the trouble is the persistence of
the measurement problem. It shows us that there is something quite
unresolved in the foundations of quantum theory.
In the early years of the theory, as new empirical and theoretical
advances came in rapid succession, it was easy to overlook these
problems. It is not hard to image the pressures faced by any
critic of a new, rising theory. Any new theory has small problems that
Completeness of Quantum Theory
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/quantum_theory_completeness/index.html[28/04/2010 08:25:21 ﺹ]
will be resolved soon enough, we can imagine Einstein hearing from
the theory's proponents. These problems are not usually reasons for
grave concern. Why should one doubt a theory with such a prodigious
record of success? Why hold up real progress with quibbles? Is it not
expedient to suspend criticism?
It took a thinker of strong character and principle to stand up to the
pressures of this expedient view. That thinker was Einstein and he
had little company in his hesitations. He wrote to Schroedinger on
May 31, 1928, at the very start:
"The Heisenberg-Bohr tranquilizing philosophy--or religion?--is so
delicately contrived that, for the time being, it provides a gentle pillow
for the true believer from which he cannot very easily be aroused. So
let him lie there."
Quoted from Arthur Fine, The Shaky Game. University of Chicago Press, 1988, p. 18.
While we now may not agree with the nature of Einstein's positive
complaints concerning the newly emerging quantum theory, it is now
abundantly clear that something was not and is not right with the
theory. In hindsight we see that Einstein's resistance was
appropriate and should be celebrated. I can see that clearly now, but
I doubt that I would have had the clarity and character to see it in
1928. I do not have the insight and principle of Einstein.
What you should know
What Einstein meant when he asserted the incompleteness of
quantum theory.
What Einstein intended with his "dice" remark and how it relates
to nineteenth century conceptions of causation.
What quantum entanglement is.
How Einstein used it in his EPR and later arguments aimed at
establishing the incompleteness of quantum theory.
The notions of separability and locality.
How the EPR argument depends upon them.
Some sense of the Einstein-Bohr debate.
That it did not end well for Einstein when Bell's work appeared in
the 1960s.
That we should not judge Einstein harshly. Hindsight is 20/20.
Copyright John D. Norton. March 27, April 11, April 20, 2010.
Einstein Nineteenth Century Physicist
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/Einstein_young_old/index.html[28/04/2010 08:25:27 ﺹ]
HPS 0410 Einstein for Everyone
Einstein as the Greatest of the
Nineteenth Century Physicists
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
The Young and the Old Einstein
Themes of Nineteenth Century Physics
What you should know
The Young and the Old Einstein
The Einstein of popular thought is the young
Einstein. This is the intellectual rebel of 1905 who, in
one year, laid out the special theory of relativity and
E=mc², postulated the light quantum and used
Brownian motion to make the case for the reality of
atoms. These achievements were made prior to
Einstein holding an academic position. He was then
still a patent examiner in the Bern patent office. The
years that followed brought Einstein a succession of
ever more prestigious academic appointments; and, in
the mid 1910s, he delivered his masterpiece, the
general theory of relativity.
In all this, there was a real sense that Einstein was
ahead of his peers, leading the way. The special
theory of relativity was absorbed into the mainstream
of physics fairly quickly. The general theory of relativity
was not quite so readily accommodated. This was in
part due to the burdensome mathematical demands of
the theory, at least relative to the standards of
mathematical expertise then found among physicists.
But the tide was flowing with Einstein. When the
eclipse expeditions of 1919 vindicated Einstein's
theory and he became a popular hero, critics risked
being seen as unimaginative reactionaries.
Einstein's work on the light quantum did not fare so
well. It was regarded by many as an odd aberration
from an otherwise brilliant mind. Even in the early
1920s, it was doubted by Niels Bohr, who had a
decade before developed the first quantum model of
the atom.
By the end of the 1920s, however, another Einstein
began to emerge. As the quantum theory enjoyed
success after success, Einstein found himself
unconvinced. He took on the role of critic,
complaining that the new quantum theory, for all its
virtues, could not be the final theory. This was
Einstein's new place in the physics community for his
final quarter century, ending with his death in 1955. He
remained a revered figure, but he became increasingly
isolated and marginalized, as he labored on his
alternative theories with the help of a few assistants. In
the years after his death, it became clear that
Einstein's objection to quantum theory failed, but not, I
believe, for the reasons articulated by his arch
antagonist Niels Bohr.
The old Einstein is a recalcitrant Einstein, unwilling to
swim with the new quantum tide that flooded over
physics. We should not judge that harshly. No thinker
can ever think purely new thoughts. We all sit at the
junction of the old and the new. Einstein was one of
the first of the new physicists of the twentieth century. His
discoveries and methods exercised a profound,
defining influence on the development of twentieth
century physics. However, there is also a strong sense
in which he was one of the last of the nineteenth
century physicists. Perhaps he was the greatest of
them.
Themes of Nineteenth Century Physics
To see why this is not such an unreasonable
assessment, we should review the major
discoveries and themes of the nineteenth century
and then see how they came to be realized and even
fulfilled in Einstein's research.
Nineteenth Century... Einstein...
Electrodynamics
The great discovery in physics of the
nineteenth century was Maxwell's
electrodynamics and its completion
by later physicists, including H. A.
Lorentz.
Special relativity
Encoded in the theory were the
equations that provided the
mathematical structure of special
relativity, the Lorentz transformation.
Einstein saw this structure and
extracted it, in a sense providing the
natural completion of the theory.
Thermodynamics
The other significant achievement of
nineteenth century physics was the
final recognition that thermal
processes were to be understood
statistically, as the average behavior of
systems made of very many
components. The simplest case was,
of course, that ordinary matter is made
of atoms and molecules and heat
resides in the energy distributed
randomly over them. But the same
analysis could be given of heat
radiation. The many components are
the many frequencies that comprise
radiation.
Reality of atoms
When Einstein began work on thermal
physics, this statistical approach was
still struggling for mainstream
acceptance. Einstein's work of 1905
on Brownian motion was a major
advance, perhaps even the major
advance, that made acceptance of
atoms inevitable.
Light quantum
When it comes to Einstein's boldest
posit, the light quantum, it is easy to
find a prescient Einstein, somehow
anticipating all the quantum craziness
to come. Yet there is another way to
see it, as I reported in the chapter
"Atoms and the Quantum." Einstein was
working to fulfill the nineteenth century
ambition of identifying the atomic
discontinuity that lay behind the
observed continuity of ordinary matter
like gases and liquids. The notion of
the light quantum emerged from this
nineteenth century program. He found
that the same techniques as worked
for atoms also showed him an
unexpected granularity lurking behind
the apparent continuity of the radiation
field. That was the light quantum.
Geometry
If the twentieth century was the
century of novel physics, the
nineteenth century was the century of
novel mathematics. One of the
foremost achievements of the century
was a new conception of geometry. It
included the idea of non-Euclidean
geometries and their accommodation
to yet more sophisticated geometries,
notably projective geometry.
General relativity
Einstein's general theory of relativity
provided an account of gravity that
exploited these advances in geometry.
From a physical point of view,
Einstein's theory was a bold departure.
From a mathematical perspective,
however, it simply applied existing
mathematical techniques to a new and
highly interesting application.
Unification
A major idea of nineteenth century
physics overall was the theme of
unification. The conception was that
all the forces of nature were somehow
related and that the burden of physics
was to reveal those relations.
Nineteenth century physics is
punctuated by successful unifications.
Electricity, magnetism and light all fall
under one theory of electrodynamics.
The one notion of energy subsumes
heat, work and many other powers.
Unified field theory
Einstein's ambitions clearly held to this
goal of unification. He had
geometrized gravitation and the final
decades of his life were devoted to
finding a geometrized theory that
embraced both gravity and
electromagnetism, his unified field
theory.
Ether
The grounding of nineteenth century
electromagnetic theory was the ether.
Electric and magnetic fields were
merely states of this all-pervading
medium.
Einstein's metrical "ether"
Einstein famously did away with the
ether; or, more precisely, he
announced it superfluous and railed
against the preferred state of rest
attributed to it. However, in his general
theory of relativity and his unified field
theories, Einstein retained an
analogous background medium. It was
not the ether of the nineteenth century.
Rather it was a kind of geometrized
version of it: the geometry of
spacetime provided a substratum
whose properties would be manifested
as gravity and electromagnetism.
Indeed, as a concession to Lorentz,
for a short time around 1920, Einstein
talked of the metrical field, the carrier
of geometrical properties, as an ether.
Causation
The nineteenth century conception of
causation was determinism: to say
the world is causal is just to say that
conditions now fix conditions in the
future. This was a bare notion purged
of the many finer aspects routinely
assumed by a causal metaphysics.
Part of the original shock of quantum
theory was the sense that its
stochastic laws deprived the world of
its causal character in this nineteenth
century sense. There is a tendency
now to discount Einstein's complaint
against quantum theory, "God does
not play dice." However it was
repeated so often by him that we
surely must take it as heartfelt. On its
face, it is an honest expression of the
nineteenth century alarm at the loss of
causation.
Objections to quantum theory
Einstein was quite nineteenth century
in his expectation that the probabilities
of quantum theory would somehow
emerge from the supposed
incompleteness of quantum
description; that was precisely how the
probabilities of statistical physics of the
nineteenth century arose. Einstein's
positive hope was that physics would
continue along the lines of his general
theory of relativity. Somewhere in his
efforts to extend the theory to
electromagnetism, Einstein hoped, the
odd quantum phenomena would
emerge. These hopes hold the
quantum up to a nineteenth century
ideal of a field theory in which notions
of separability and locality are most
fully implemented.
Imagine that we come to a bend in the road, to use
a metaphor of Thomas Kuhn's. When we stand at the
corner, we see clearly the road that we have passed
and also the road that is to come. The bend belongs to
both parts. After we have passed the corner, all we
see is the new road and the bend that started it. We no
longer see the earlier part it completed. Einstein is the
bend in the road that joins the nineteenth and
twentieth centuries of physics.
What you should know
The differences between the work of the young
and old Einstein.
The relevant themes of nineteenth century
physics and thought.
How Einstein's work embodies these themes.
That Einstein is one of the first of the new
physicists of the 20th century; and one of the last
of the old tradition of the 19th century.
Copyright John D. Norton. April 21, 2010.
HPS 0410 Einstein for Everyone
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/2008_Fall/index.html[28/04/2010 08:25:28 ﺹ]
This is an archived copy of the course material. Some links may not work.
Lectures Assignments
Course Description
Schedule
Term paper
Sign in sheet
Title page, Preface and Table of Contents for Einstein for
Everyone
Introduction: the questions
Special relativity: the basics
Is special relativity paradoxical?
E=mc²
Origins of Special Relativity
Spacetime
What is a four dimensional space like?
Philosophical Significance of the Special Theory of
Relativity.
NonEuclidean Geometry
Euclid's Postulates and Some NonEuclidean Alternatives
Spaces of Variable Curvature
General Relativity
Relativistic Cosmology
Big Bang Cosmology
Black Holes
A Better Picture of Black Holes
Atoms and the Quantum
Origins of Quantum Theory
Problems of Quantum Theory
1. The Wild and the Wonderful
2. Principle of Relativity
3. Relativity of Simultaneity
4. Energy, Mass and Adding Velocities
5. Origins of Special Relativity
6. Spacetime
7. Philosophical Significance
8. NonEuclidean Geometry
9. Curvature
10. General Relativity
11. Relativistic Cosmology
12. Big Bang Cosmology
13. Black Holes
14. Origins of Quantum Theory
15. Problems of Quantum Theory
HPS 0410 Einstein for Everyone Fall 2008
For documents relating to the Spring 2008 offering of this class, click here.
For documents related to the Spring 2007 offering of this class, click here.
Last update: August 23, 2008.
HPS 0410 Einstein for Everyone
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/2008_Spring/index.html[28/04/2010 08:25:30 ﺹ]
Lectures Assignments
Course Description
Schedule
Term paper
Sign in sheet
Title page, Preface and Table of Contents for Einstein for
Everyone
Introduction: the questions
Special relativity: the basics
Problem of Reciprocity
E=mc²
Origins of Special Relativity
Spacetime
What is a four dimensional space like?
Philosophical Significance of the Special Theory of Relativity.
NonEuclidean Geometry
Euclid's Postulates and Some NonEuclidean Alternatives
General Relativity
Relativistic Cosmology
Big Bang Cosmology
Black Holes
A Better Picture of Black Holes
Atoms and the Quantum
Quantum Theory
1. It Isn't That Easy
2. Principle of Relativity
3. Relativity of Simultaneity
4. Origins of Special Relativity
5. Spacetime
6. Philosophical Significance
7. NonEuclidean Geometry
8. Curvature
9. General Relativity
10. Relativistic Cosmology
11. Big Bang Cosmology
12. Black Holes
13. Quantum Theory
HPS 0410 Einstein for Everyone Spring 2008
This is an archived copy of the Spring 2008 course documents. Not all links will work!
Note schedule changes for Martin Luther King birthday observation. Details.
Last update: March 21, 2008.
HPS 0410 Einstein for Everyone
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/2007/index.html[28/04/2010 08:25:31 ﺹ]
HPS 0410 Einstein for Everyone Spring 2007
These are the documents created for an earlier offering of the course HPS 0410 Einstein for
Everyone. As a result some links below and in the documents linked to will not work. For the
latest version of the course, click here.
Course Description
Schedule
Term paper
Lectures
Introduction: the questions
Special relativity: the basics
Problem of Reciprocity
E=mc²
Origins of Special Relativity
Spacetime
What is a four dimensional space like?
Philosophical Significance of the Special Theory of Relativity.
NonEuclidean Geometry
Euclid's Postulates and Some NonEuclidean Alternatives
General Relativity
Relativistic Cosmology
Black Holes
Atoms and the Quantum
Assignments
1. Principle of Relativity
2. Relativity of Simultaneity
3. Problem of Reciprocity
4. Origins of Special Relativity
5. Spacetime (new version February 1)
6. Spacetime/ Four Dimensional Spaces
7. Philosophical Significance of the Special Theory of Relativity
8. Non_Euclidean Geometry
9. Curvature and General Relativity
10. General Relativity
11. Relativistic Cosmology I
12. Relativistic Cosmology II
13. Black Holes
14. Atoms and the Quantum
Last update: February 26, 2007.
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/test_grades/test_1_grades.png[28/04/2010 08:25:33 ﺹ]
[Figure: Test 1 grades chart. Horizontal axis: Score (max. 24).]
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/test_grades/test_2_grades.jpg[28/04/2010 08:25:36 ﺹ]
[Figure: Test 2 grades chart. Horizontal axis: Score (max. 24).]
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/test_grades/test_3_grades.png[28/04/2010 08:25:38 ﺹ]
[Figure: Test 3 grades chart. Horizontal axis: Score (max. 24).]
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/test_grades/test_4_grades.jpg[28/04/2010 08:25:42 ﺹ]
[Figure: Test 4 grades chart. Horizontal axis: Score (max. 24).]
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/test_grades/test_5_grades.png[28/04/2010 08:25:48 ﺹ]
[Figure: Test 5 grades chart. Horizontal axis: Score (max. 24, half points rounded up).]
HPS 0410 Use of Sources
http://www.pitt.edu/~jdnorton/teaching/HPS_0410/sources.html[28/04/2010 08:25:49 ﺹ]
HPS 0410 Einstein for Everyone
Use Of Sources
First-time essay writers at the college level are sometimes unsure of the proper use of sources.
The general rules are:
(i) The wording you use must be your own and not copied or even loosely paraphrased from
another work. Any wording not your own must be represented as a quote by enclosing it in
quotation marks (if it is less than three lines of text) or setting it off as an indented block of text
(if it is more than three lines of text), both with appropriate footnoting.
(ii) The source of ideas which are sufficiently novel or idiosyncratic not to be taken as
commonly known must be indicated either in the text or in a footnote as is appropriate. The
same holds of any detailed account of one theory or another.
The most likely problem with (ii) is that you take it too seriously and footnote every sentence.
The satisfaction of (ii) must be tempered by the need to retain a clean and uncluttered text.
The real danger lies in (i); its violation is a serious form of plagiarism and is treated as a serious
offence within the university and academia in general. Here is an example to make the
requirements of (i) clear:
The original text says:
In astronomy, the half century from 1570 to 1620 saw a radical break with
tradition.