
İSTANBUL TECHNICAL UNIVERSITY ⋆ INFORMATICS INSTITUTE

A TIME-SYMMETRIC AND INDIVIDUAL BLOCK TIME STEP ALGORITHM
FOR N-BODY INTEGRATION

Ph.D. Thesis by

Murat KAPLAN, M.Sc.

Department : INFORMATICS INSTITUTE

Program : COMPUTATIONAL SCIENCE & ENGINEERING

NOVEMBER 2008
İSTANBUL TECHNICAL UNIVERSITY ⋆ INFORMATICS INSTITUTE

A TIME-SYMMETRIC AND INDIVIDUAL BLOCK TIME STEP ALGORITHM
FOR N-BODY INTEGRATION

Ph.D. Thesis by
Murat KAPLAN, M.Sc.
(702002004)

Date of Submission : 30 May 2008

Date of Examination : 5 November 2008

Supervisor : Prof. Dr. Hasan SAYGIN (İ.T.Ü.)

Members of the Examining Committee : Assoc.Prof. Dr. Sondan D. FEYİZ (İ.T.Ü.)

Assoc.Prof. Dr. N. Abdülbaki BAYKARA (M.Ü.)

Prof. Dr. Metin DEMİRALP (İ.T.Ü.)

Prof. Dr. M. Serdar ÇELEBİ (İ.T.Ü.)

NOVEMBER 2008
İSTANBUL TECHNICAL UNIVERSITY ⋆ INFORMATICS INSTITUTE

A TIME-SYMMETRIC AND INDIVIDUAL BLOCK TIME STEP ALGORITHM
FOR N-BODY INTEGRATION

Ph.D. THESIS
Murat KAPLAN, M.Sc.
(702002004)

Date of Submission to the Institute : 30 May 2008

Date of Defense : 5 November 2008

Thesis Supervisor : Prof. Dr. Hasan SAYGIN (İ.T.Ü.)

Other Jury Members : Assoc. Prof. Dr. Sondan D. FEYİZ (İ.T.Ü.)

Assoc. Prof. Dr. N. Abdülbaki BAYKARA (M.Ü.)

Prof. Dr. Metin DEMİRALP (İ.T.Ü.)

Prof. Dr. M. Serdar ÇELEBİ (İ.T.Ü.)

NOVEMBER 2008
ACKNOWLEDGEMENT

First of all, I would like to thank my supervisor, Hasan Saygın, for his support
and guidance throughout my graduation thesis, M.Sc., and Ph.D. studies. I am
also grateful for his philosophy of student advising, which helped me build
strong self-confidence in scientific work. During these years, his sharing,
inspiration, and help have been very important, not only for the thesis work but
also for becoming a more independent researcher. I have been lucky to have him
as my supervisor.
I would like to thank my committee members, Sondan Durukanoğlu Feyiz and
N. Abdülbaki Baykara, for their reviews and comments. Furthermore, I must
acknowledge Piet Hut and Jun Makino for their guidance and enthusiastic
support throughout an important part of the research. I also have to thank
Piet for hosting me for five weeks at the Institute for Advanced Study in
Princeton. He made it possible for me to discuss my thesis with Douglas
Heggie, Steve McMillan, Gerald Jay Sussman, and Peter Teuben. I would also
like to thank them for their invaluable discussions.
I would also like to acknowledge the members of the Informatics Institute,
starting with Metin Demiralp, Serdar Çelebi, and Hasan Dağ. I have had the
pleasure of working with all members of the Informatics Institute. Thanks to all
the graduate students, in particular my old and new roommates, starting with
Hüseyin, Oğuzhan, Fatih, and others. We shared many things...
Finally, I deeply thank all my family members for their endless support and
love.

November 2008 Murat KAPLAN

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF SYMBOLS
SUMMARY
ÖZET

1. INTRODUCTION

2. INTEGRATION METHODS AND ALGORITHMS
2.1. Integration Schemes
2.1.1. Leapfrog integration
2.1.2. Fourth-order Hermite integration
2.1.3. Sixth-order Hermite integration
2.1.4. Eighth-order Hermite integration
2.2. Variable Time Step
2.3. Time Step Criterion
2.4. Force Softening
2.5. Individual Time Steps
2.6. Block Time Steps
2.7. Time Symmetrization

3. TIME-SYMMETRIC BLOCK TIME STEP ALGORITHM
3.1. Implicit Iterative Approach
3.1.1. Two-body tests for implicit iterative approach
3.1.2. Flip-flop problem
3.1.3. Flip-flop resolution
3.1.4. A different approach for implicit iterative block time steps: extended blocks
3.1.5. A first attempt at a solution
3.1.6. A second attempt at a solution
3.2. Numerical Tests for 2-Body Problems
3.2.1. 2-body tests for leapfrog
3.2.2. 2-body tests for fourth-order Hermite
3.2.3. 2-body tests for sixth-order Hermite
3.2.4. 2-body tests for eighth-order Hermite

4. N-BODY IMPLEMENTATION
4.1. Divide and Conquer: the Concept of an Era
4.2. Description of the Algorithm with Leapfrog
4.3. Numerical Tests for N-Body Problems
4.3.1. Test results for leapfrog integration
4.3.2. Test results for fourth-order Hermite integration
4.3.3. Test results for sixth-order Hermite integration
4.3.4. Test results for eighth-order Hermite integration
4.4. Numerical Tests for Size of the Era
4.5. Dynamic Era

5. PARALLEL IMPLEMENTATION
5.1. Requirements and Algorithm of the Sequential Code
5.2. Parallel Algorithm
5.3. Load Balance and Parallel Performance

6. CONCLUSION

REFERENCES

APPENDIX
A. Plummer Model
B. Test Results for Dynamic versus Fixed Era Sizes
2.1. 100-Body Tests
2.2. 500-Body Tests

CURRICULUM VITAE

LIST OF FIGURES

Figure 1.1 : A Kepler two-body problem for an Earth-satellite system.
Figure 1.2 : A real-world example of a star cluster from the Hubble Space
Telescope: NGC6093.
Figure 1.3 : An example of numerical model initial conditions for a 10000
point particle Plummer model.
Figure 2.1 : Relative energy error for fixed time step leapfrog integration
for an elliptic Kepler orbit with eccentricity = 0.75 for 1000 time units.
Figure 2.2 : Relative energy error for variable time step leapfrog integration
for an elliptic Kepler orbit with eccentricity = 0.75 for 1000 time units.
Figure 2.3 : Relative energy errors for variable time step versus symmetric
time step with leapfrog integration for an elliptic Kepler orbit with
eccentricity = 0.75 for 1000 time units.
Figure 2.4 : Kinetic and potential energy curves for time-symmetrized
variable time step leapfrog integration for an elliptic Kepler orbit with
eccentricity = 0.75 for 4 orbital periods.
Figure 2.5 : Relative energy error for time-symmetrized variable time step
leapfrog integration for an elliptic Kepler orbit with eccentricity = 0.75
for 10 orbital periods. Only one iteration is used for time symmetrization
with variable time steps.
Figure 2.6 : Relative energy error for time-symmetrized variable time step
leapfrog integration for the same Kepler problem for 10000 time units.
Figure 2.7 : Relative energy errors for time-symmetrized variable time step
leapfrog integration using different numbers of iterations (1-4) for the
same Kepler problem. Different numbers of iterations are shown with
different colors and symbols.
Figure 3.1 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps for a Kepler problem with eccentricity = 0.75.
Figure 3.2 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps, and for the time-symmetric leapfrog scheme.
Figure 3.3 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps for an elliptic Kepler orbit with
eccentricity = 0.85. We applied different numbers of iterations to the
scheme, as in Fig. 3.1.
Figure 3.4 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps, and for the time-symmetric leapfrog scheme, for
an elliptic Kepler orbit with eccentricity = 0.85.
Figure 3.5 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps for an elliptic Kepler orbit with
eccentricity = 0.9375.
Figure 3.6 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps, and for the time-symmetric leapfrog scheme, for
an elliptic Kepler orbit with eccentricity = 0.9375.
Figure 3.7 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps for an elliptic Kepler orbit with
eccentricity = 0.9856.
Figure 3.8 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps, and for the time-symmetric leapfrog scheme, for
an elliptic Kepler orbit with eccentricity = 0.9856.
Figure 3.9 : Relative energy errors for the leapfrog scheme with implicit
iterative block time steps for an elliptic Kepler orbit with
eccentricity = 0.99. (Here, we kept the size of the time steps ten times
smaller.)
Figure 3.10: Relative energy errors for the leapfrog scheme with implicit
iterative block time steps, and for the time-symmetric leapfrog scheme, for
an elliptic Kepler orbit with eccentricity = 0.99.
Figure 3.11: Block time steps with nearest neighbors.
Figure 3.12: Time steps for the leapfrog scheme calculated by two versions of
implicit iterative block time steps for an elliptic Kepler orbit with
eccentricity = 0.99 for one orbital period.
Figure 3.13: Relative energy errors for the leapfrog scheme with two versions
of implicit iterative block time steps for an elliptic Kepler orbit with
eccentricity = 0.99 for one orbital period.
Figure 3.14: Relative energy errors for the leapfrog scheme with two versions
of implicit iterative block time steps for an elliptic Kepler orbit with
eccentricity = 0.99 for 10000 time units.
Figure 3.15: Relative energy errors for the leapfrog scheme with two versions
of implicit iterative block time steps, and for the time-symmetric leapfrog
scheme, for an elliptic Kepler orbit with eccentricity = 0.96.
Figure 3.16: Relative energy errors for the leapfrog scheme with shared block
time steps (second version), and for the time-symmetric leapfrog scheme, for
8-body runs. Here, Plummer model based initial conditions are used.
Figure 3.17: Relative energy errors for a two-body integration of a bound
orbit with eccentricity e = 0.99. The top line with the highest slope
corresponds to algorithm 1, the line with intermediate slope corresponds to
algorithm 2, and below those the two lines for algorithms 0 and 3 are
indistinguishable in this figure.
Figure 3.18: Relative energy errors at apocenter. The four lines, from top
to bottom, correspond to algorithms 1, 2, 3, and 0.
Figure 3.19: Same as Fig. 3.18, but for a duration that is ten times longer.
Figure 3.20: Relative energy errors for a Kepler problem with eccentricity
e = 0.984375 for fourth-order Hermite integration with and without the TSBTS
algorithm.
Figure 3.21: Relative energy errors for a Kepler problem with eccentricity
e = 0.984375 for fourth-order Hermite integration with the TSBTS algorithm.
Figure 3.22: Relative energy errors for a 2-body problem for fourth-order
Hermite integration with and without the TSBTS algorithm.
Figure 3.23: Relative energy errors for a 2-body problem for fourth-order
Hermite integration with only the TSBTS algorithm.
Figure 3.24: Relative energy errors for a Kepler problem with eccentricity
e = 0.984375 for sixth-order Hermite integration with and without the TSBTS
algorithm for different numbers of iterations.
Figure 3.25: Relative energy errors for a Kepler problem with eccentricity
e = 0.984375 for sixth-order Hermite integration with the TSBTS algorithm.
Figure 3.26: Relative energy errors for a 2-body problem for sixth-order
Hermite integration with the TSBTS algorithm.
Figure 3.27: Relative energy errors for a 2-body problem for sixth-order
Hermite integration with the TSBTS algorithm for 3, 5, and 7 iterations.
Figure 3.28: Relative energy errors for a Kepler problem with eccentricity
e = 0.984375 for eighth-order Hermite integration with and without the TSBTS
algorithm.
Figure 3.29: Relative energy errors for a Kepler problem with eccentricity
e = 0.984375 for eighth-order Hermite integration with the TSBTS algorithm.
Figure 3.30: Relative energy errors for a 2-body problem for eighth-order
Hermite integration with the TSBTS algorithm versus block time steps.
Figure 3.31: Relative energy errors for a 2-body problem for eighth-order
Hermite integration with the TSBTS algorithm.
Figure 4.1 : Flowchart of the algorithm.
Figure 4.2 : Growth of the relative energy error for 100-body runs, starting
from twenty different sets of initial conditions. For each set of initial
conditions, two integrations have been performed, one without and one with
time symmetrization (in the latter case, using six iterations). The twenty
lines with time symmetrization form the horizontal bundle which is slowly
spreading in square-root-of-time fashion like a random walk; the twenty
lines without time symmetrization all show a systematic, near-linear
decrease in energy.
Figure 4.3 : Growth of the relative energy error for 512-body runs, starting
from a single set of initial conditions, but using a different number of
iterations. The lowest curve presents an integration without time
symmetrization. The curve above that presents the result of time
symmetrization using only one iteration. The next two curves show the
results of using three and two iterations, respectively; initially, the
third iteration curve rises a bit above the second iteration curve.
Figure 4.4 : Growth of the relative energy error for 512-body runs, like
Fig. 4.3, but starting from a different set of initial conditions. As
before, the lowest curve presents an integration without time
symmetrization, and the curve above that presents the result of time
symmetrization using only one iteration. The second iteration curve is the
one that stays above the third iteration curve for most of the run depicted
here.
Figure 4.5 : Relative energy errors for 100-body problems. 10 different sets
of Plummer model initial conditions are used for fourth-order Hermite
integration with the TSBTS algorithm and the block time step algorithm.
Figure 4.6 : Relative energy errors for 100-body problems. 10 different sets
of Plummer model initial conditions are used for fourth-order Hermite
integration with the TSBTS algorithm.
Figure 4.7 : Relative energy errors for 500-body problems. 10 different sets
of Plummer model initial conditions are used for fourth-order Hermite
integration with the TSBTS algorithm and the block time step algorithm.
Figure 4.8 : Relative energy errors for 500-body problems. 10 different sets
of Plummer model initial conditions are used for fourth-order Hermite
integration with the TSBTS algorithm.
Figure 4.9 : Relative energy errors for 100-body problems. 10 different sets
of Plummer model initial conditions are used for sixth-order Hermite
integration with the TSBTS algorithm and the block time step algorithm.
Figure 4.10: Relative energy errors for 100-body problems. 10 different sets
of Plummer model initial conditions are used for sixth-order Hermite
integration with the TSBTS algorithm.
Figure 4.11: Relative energy errors for 500-body problems. 10 different sets
of Plummer model initial conditions are used for sixth-order Hermite
integration with the TSBTS algorithm and the block time step algorithm.
Figure 4.12: Relative energy errors for 500-body problems. 10 different sets
of Plummer model initial conditions are used for sixth-order Hermite
integration with the TSBTS algorithm.
Figure 4.13: Relative energy errors for 100-body problems. 10 different sets
of Plummer model initial conditions are used for eighth-order Hermite
integration with the TSBTS algorithm and the block time step algorithm.
Figure 4.14: Relative energy errors for 100-body problems. 10 different sets
of Plummer model initial conditions are used for eighth-order Hermite
integration with the TSBTS algorithm.
Figure 4.15: Relative energy errors for 500-body problems. 10 different sets
of Plummer model initial conditions are used for eighth-order Hermite
integration with the TSBTS algorithm and the block time step algorithm.
Figure 4.16: Relative energy errors for 500-body problems. 10 different sets
of Plummer model initial conditions are used for eighth-order Hermite
integration with the TSBTS algorithm.
Figure 4.17: Relative energy errors for 100-body problems. 5 different sets
of Plummer model initial conditions with 7 different era sizes are used
without iterations. Here, all the energy errors are within the same bounds.
Even though a growing dispersion between the curves is observed after 200
time units, small spreading over long integration times is not so important
for such large relative energy errors.
Figure 4.18: Relative energy errors for 100-body problems. 5 different sets
of Plummer model initial conditions with 5 different era sizes (0.015625,
0.03125, 0.0625, 0.125, 0.25) are used with 3 iterations for 1000 time
units. The top 5 curves (red) show linearly growing errors, which correspond
to the largest era size (0.25). The rest present the results for the other
era sizes. The smallest relative errors in the figure (black curves) show
random-walk behavior and correspond to the smallest era size (0.015625).
Figure 4.19: Relative energy errors for 100-body problems. 5 different sets
of Plummer model initial conditions are used for 5 iterations with 5
different era sizes (0.015625, 0.03125, 0.0625, 0.125, 0.25). In this
figure, all the curves show random-walk behavior instead of linearly growing
error. Also, the worst relative error is below 0.008, whereas it was 0.035
in Fig. 3.19 and 0.35 in Fig. 3.18.
Figure 4.20: Relative energy errors for 500-body problems. 5 different sets
of Plummer model initial conditions are used with 7 different era sizes
(0.015625, 0.03125, 0.0625, 0.125, 0.25, 0.5, 1) for each set. 3 iterations
were performed in the integrations. The 15 curves (red) in the center of the
figure present the results for the smaller era sizes (0.015625, 0.03125,
0.0625); the remaining 20 curves correspond to the larger era sizes (0.125,
0.25, 0.5, 1).
Figure 4.21: Relative energy errors for 10 different 100-body problems. For
each set of initial conditions, two algorithms have been run, one with a
fixed and one with a changing era size. Three iterations have been used for
both algorithms. The fixed era size was taken as 0.015625. This value was
also used as the largest allowed time step for both algorithms. Green curves
correspond to dynamic era sizes and show smaller errors than the fixed ones
in most cases.
Figure 4.22: Relative energy errors for 10 different 500-body problems.
Fixed and dynamic era sizes are used for each set of initial conditions, as
in Fig. 4.21. As before, the fixed era size and the largest allowed time
step were taken as 0.015625. The results for dynamic and fixed era sizes are
in the same error ranges.
Figure 5.1 : Load imbalance for a 1000-body problem with Plummer model
initial conditions using 12 processors for 1000 time units. Every single red
point corresponds to the load imbalance for the active particle group while
its vectors are being updated.
Figure 5.2 : Speedup vs. number of processors for 10000-body Plummer model
initial conditions, for both symmetrized and non-symmetrized individual
block time step algorithms. The continuous curve on the top corresponds to
symmetrized block time steps with 3 iterations. The discontinuous curve on
the bottom corresponds to the classical block time step algorithm.
Figure 5.3 : Efficiency vs. number of processors for 10000-body Plummer
model initial conditions, for both symmetrized and non-symmetrized
individual block time step algorithms. The continuous curve on the top
corresponds to symmetrized block time steps with 3 iterations. The
discontinuous curve on the bottom corresponds to the classical block time
step algorithm.

LIST OF SYMBOLS

$F_i$ : Force on the i-th body
$G$ : Gravitational constant
$E$ : Total energy
$m_i$ : Mass of the i-th body
$N$ : Number of bodies
$e$ : Eccentricity
$\xi$ : Phase space vector
$\eta$ : Accuracy parameter
$r_{ij}$ : Position vector from particle j to particle i
$v_{ij}$ : Velocity vector from particle j to particle i
$a_{ij}$ : Acceleration vector from particle j to particle i
$j_{ij}$ : Jerk vector from particle j to particle i
$r_i$ : Position vector of the i-th body
$v_i$ : Velocity vector of the i-th body
$a_i$ : Acceleration vector of the i-th body
$j_i$ : Jerk vector of the i-th body
$s_i$ : Snap vector of the i-th body. Second derivative of $a_i$
$c_i$ : Crackle vector of the i-th body. Third derivative of $a_i$
$p_i$ : Pop vector of the i-th body. Fourth derivative of $a_i$
$n_i$ : Nanda vector of the i-th body. Fifth derivative of $a_i$
$a_i^{(6)}$ : Sixth derivative of the acceleration of the i-th body
$a_i^{(7)}$ : Seventh derivative of the acceleration of the i-th body
$A^+, A^-, J^+, J^-, S^+, S^-, C^+, C^-$ : Sums and differences of a, j, s, and c

A TIME-SYMMETRIC AND INDIVIDUAL BLOCK TIME STEP
ALGORITHM FOR N-BODY INTEGRATION

SUMMARY

This thesis aims to develop a time-symmetric block time step (TSBTS)
algorithm. The main objective of the work is to combine the good energy
conservation of time-symmetric methods with individual block time step
algorithms. For this purpose, a time symmetrization scheme proposed by Hut,
Makino, and McMillan is used. This scheme is very good at restoring the time
symmetry that is disturbed by variable time steps, but it is not good enough
to be applied directly to individual block time step algorithms.
The N-body problem has no analytical solution except in the two-body case.
For this reason, it is easy to verify the algorithm on the Kepler problem.
Besides, it is known that contributions to the energy error are largely
generated by close encounters between two particles. For these reasons, the
two-body problem is preferred for developing the algorithm in the first
instance.
The algorithm is developed and tested on the two-body problem for accuracy
and energy conservation, using many different Kepler problems with different
numbers of iterations. In the development work, the leapfrog integration
scheme, which is a second-order time-symmetric scheme, is preferred. Fourth-,
sixth-, and eighth-order Hermite integration schemes are also used for test
runs.
An iterative scheme must be combined with the individual block time step
scheme to apply the new algorithm to the N-body problem effectively. The
concept of era-based iteration has been developed for this purpose. In this
concept, the total history of the simulation is split into a number of
smaller periods, each of which is called an era.
A number of tests are performed on the era size. The results clearly show
that the size of the era has a strong effect on energy conservation. The era
size must be chosen as small as possible, both for energy errors and for time
consumption. This work also includes a generalization of the algorithm to
enable dynamic era-based iterations.
In the last part of the work, a parallel version of the algorithm is produced
using a copy-algorithm-based parallel scheme. Speedup and efficiency results
are as expected, while the load balancing results are very good.

A TIME-SYMMETRIC AND INDIVIDUAL BLOCK TIME STEP
ALGORITHM FOR N-BODY INTEGRATION

ÖZET

This study aims to develop a time-symmetric integration scheme with
individual block time steps. The goal of the work is to use this algorithm to
carry the high degree of energy conservation of time-symmetric methods over
to individual block time step schemes. For this purpose, the symmetrization
scheme proposed by Hut, Makino, and McMillan is used. This scheme can restore
the time symmetry that is broken when variable time steps are used. However,
it is insufficient for direct application to the individual time step scheme.
Since the N-body problem has no analytical solution for more than two bodies,
it is much easier to verify the algorithm on the Kepler problem, which has an
analytical solution. In addition, it is known that most of the error in
energy conservation in N-body problems usually originates from the pairs that
form. For these reasons, the two-body problem was used first to develop the
algorithm.
The algorithm was developed on the two-body problem, and its accuracy and
high performance were tested with Kepler problems of varying difficulty and
with different numbers of iterations. During development, leapfrog
integration, a time-symmetric second-order scheme, was preferred. Fourth-,
sixth-, and eighth-order Hermite integrations were also used in the tests.
To apply the new algorithm successfully to N-body problems, an iterative
structure had to be combined with the individual block time step scheme. The
concept of era-based iteration was developed for this purpose. In era-based
iteration, the history over a given time interval, rather than over a given
number of steps, is stored for the iteration.
Detailed tests were carried out on the sizes of these sub-intervals, called
eras, and they showed that the size of the sub-intervals has a considerable
effect on energy conservation. It was concluded that the sub-intervals must
be kept small, both to achieve better energy conservation and to avoid
unnecessary time consumption. The work also includes a generalization of the
algorithm that allows variable (dynamic) era-based iterations.
In the final stage of the work, a parallel version of the algorithm was
produced using a parallel scheme based on the copy algorithm. The speedup and
efficiency plots show results at the expected level, while the load
distribution results are quite good.

1. INTRODUCTION

The gravitational N-body problem starts with a basic two-body problem. In
classical mechanics, the two-body problem is an old and well-known problem
which describes the motion of two point particles that interact only with
each other. The Kepler problem is a special case of the gravitational
two-body problem, in which the two bodies interact through a central force
that varies in strength as the inverse square of the distance between them.
The Kepler problem is one of the fundamental problems in classical mechanics,
and it has been used to develop new methods in classical mechanics, such as
Lagrangian mechanics, Hamiltonian mechanics, and the Hamilton-Jacobi
equation.
The Kepler problem is concerned with the motion of a particle in a force
field which decreases with the square of the distance from a fixed point. The
Hamiltonian is

\[
H = \frac{1}{2} v^2 - \frac{1}{r}, \tag{1.1}
\]

where $\vec{v}$ is the velocity vector, $\vec{r}$ is the position vector,
$v^2 = \vec{v} \cdot \vec{v}$, and $r^2 = \vec{r} \cdot \vec{r}$. The
equations of motion are

\[
\frac{d\vec{r}}{dt} = \vec{v}, \qquad
\frac{d\vec{v}}{dt} = -\frac{\vec{r}}{r^3}. \tag{1.2}
\]
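As a minimal sketch of Eqs. (1.1) and (1.2), the Hamiltonian and the right-hand side of the equations of motion can be written as two small functions. The function names and the restriction to two dimensions are illustrative assumptions, not part of the thesis code.

```python
import math

def kepler_hamiltonian(r, v):
    """Eq. (1.1): H = v^2/2 - 1/r for 2D position r and velocity v."""
    v2 = v[0] * v[0] + v[1] * v[1]
    dist = math.hypot(r[0], r[1])
    return 0.5 * v2 - 1.0 / dist

def kepler_rhs(r, v):
    """Eq. (1.2): returns (dr/dt, dv/dt) = (v, -r / |r|^3)."""
    dist3 = math.hypot(r[0], r[1]) ** 3
    return v, (-r[0] / dist3, -r[1] / dist3)

# Circular orbit of unit radius and unit speed (hypothetical test case).
r0, v0 = (1.0, 0.0), (0.0, 1.0)
energy = kepler_hamiltonian(r0, v0)   # kinetic 1/2 minus potential 1
drdt, dvdt = kepler_rhs(r0, v0)       # acceleration points toward the origin
```

For this circular orbit the energy equals -1/2 and the acceleration has unit magnitude, which gives a quick sanity check on the signs in Eq. (1.2).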

Fig. 1.1 shows a Kepler two-body problem. Here, $a$ is the semi-major axis,
and $c$ is the difference between the semi-major axis and the distance $r_p$
between the perihelion and the Earth. The point of closest approach to the
Earth in the satellite's orbit is called the perihelion; the furthest point,
at distance $r_a$, is called the aphelion. The eccentricity is an important
parameter of the Kepler orbit and is given by Eq. (1.3):

\[
e = \frac{c}{a} = \frac{r_a - r_p}{r_a + r_p}. \tag{1.3}
\]
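A short worked example of Eq. (1.3) follows. It assumes that the semi-major axis is half the sum of the two extreme distances, $a = (r_a + r_p)/2$, which is the standard relation for an ellipse; the function name and the sample distances are hypothetical.

```python
def orbit_shape(r_p, r_a):
    """Return (a, c, e) from the closest and furthest distances."""
    a = 0.5 * (r_a + r_p)   # semi-major axis (assumed relation for an ellipse)
    c = a - r_p             # difference between a and the closest distance
    e = c / a               # eccentricity, first form of Eq. (1.3)
    return a, c, e

# Distances chosen so that the orbit has semi-major axis 1.
a, c, e = orbit_shape(r_p=0.25, r_a=1.75)
e_alt = (1.75 - 0.25) / (1.75 + 0.25)   # second form of Eq. (1.3)
```

Both forms of Eq. (1.3) give the same value here, e = 0.75, the eccentricity used for several of the Kepler test orbits listed in the figures of Chapter 2.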

Figure 1.1: A Kepler two-body problem for an Earth-satellite system

From Newton's universal law of gravitation, we know the mutual attraction
among large-scale particles:

\[
\mathbf{F}_i = m_i \ddot{\mathbf{r}}_i
= -G m_i \sum_{\substack{j=1 \\ j \neq i}}^{N}
\frac{m_j (\mathbf{r}_i - \mathbf{r}_j)}{|\mathbf{r}_i - \mathbf{r}_j|^3},
\tag{1.4}
\]

where $\mathbf{r}_i$ is the position vector of the i-th body at time $t$,
$m_i$ is its mass, $G$ is the gravitational constant, $N$ is the number of
bodies, and dots denote differentiation with respect to $t$. If we have a
group of objects whose positions and velocities we know exactly, we can
construct a problem to determine the subsequent motions of the objects using
Eq. (1.4). This is the so-called gravitational N-body problem, and it can be
solved numerically using the equations of motion

\[
\dot{\mathbf{r}}_i = \mathbf{v}_i, \qquad
\dot{\mathbf{v}}_i = -G \sum_{\substack{j=1 \\ j \neq i}}^{N}
\frac{m_j (\mathbf{r}_i - \mathbf{r}_j)}{|\mathbf{r}_i - \mathbf{r}_j|^3},
\tag{1.5}
\]

where $\mathbf{v}_i$ is the velocity vector of the i-th body at time $t$.
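The sum in Eq. (1.5) can be sketched as a plain double loop over all particles. This is an illustrative sketch only: the function name, the list-of-lists data layout, and the default G = 1 are assumptions, and no softening is applied.

```python
def accelerations(masses, positions, G=1.0):
    """Direct summation of Eq. (1.5): acceleration of every body."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            # Separation vector r_i - r_j and its cubed length.
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            dist3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
            for k in range(3):
                acc[i][k] -= G * masses[j] * d[k] / dist3
    return acc

# Two bodies on the x-axis, separated by a distance of 2 (hypothetical data).
masses = [1.0, 2.0]
positions = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
acc = accelerations(masses, positions)
```

The two nested loops visit N(N - 1) ordered pairs. In this example the lighter body is pulled toward the heavier one twice as strongly as the reverse, so the mass-weighted accelerations cancel, as Newton's third law requires.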


The gravitational N-body problem cannot be solved analytically for N > 2.
While there are various approximate methods, such as the Monte Carlo or
Fokker-Planck methods, the most accurate way to solve the N-body problem is
direct integration of the orbits of the N bodies [1-5].
Direct integration of the N-body problem is a necessity in many cases, such
as star cluster simulations (Fig. 1.2). Direct simulations are used to study
dense stellar systems. These studies include, for example, the dynamical
evolution of star clusters, stellar evolution, stellar collisions, star
formation, planetary systems, and galactic nuclei [6, 7].
However, direct methods achieve the highest accuracy at the price of the
longest computation time, of order $O(N^2)$ per time step. In order to find
the next position of one particle, one first has to determine the
gravitational force that is exerted on this particle by all the other N - 1
particles.

Figure 1.2: A real world example for star clusters from Hubble Space Telescope:
NGC6093

The Fast Multipole Method (FMM) is more accurate than Monte Carlo-like
methods, since it does not require the extra assumptions made in approximate
methods, and it is a fast way to compute the potential field. However, as
with direct integration methods, FMM can be quite expensive when high-order
expansions are used for better accuracy [8-17].
For long-term N -body simulations, it is essential that the drift in the values of
onserved quantities is kept to a minimum. The total energy (Eq.(1.6)) whi h
is given as sum of the total kineti and potential energy, is often used as an
indi ator of su h a drift. It is known as the most sensitive quantity for monitoring
a ura y [18℄.

\[
E = \frac{1}{2} \sum_{i=1}^{N} m_i v_i^2 - G \sum_{i=1}^{N} \sum_{j>i}^{N} \frac{m_i m_j}{|\mathbf{r}_i - \mathbf{r}_j|}. \tag{1.6}
\]
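As an illustration of how Eq. (1.6) is used as an accuracy monitor, the following sketch evaluates the total energy and a relative energy error. The names `total_energy` and `relative_energy_error`, and the normalization by the initial energy, are assumptions of this sketch.

```python
import math

def total_energy(pos, vel, mass, G=1.0):
    """Total energy of Eq. (1.6): kinetic energy plus pairwise potential energy."""
    n = len(mass)
    kinetic = 0.5 * sum(mass[i] * sum(c * c for c in vel[i]) for i in range(n))
    potential = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # each pair is counted once (j > i)
            potential -= G * mass[i] * mass[j] / math.dist(pos[i], pos[j])
    return kinetic + potential

def relative_energy_error(e_now, e_start):
    """Drift of the energy relative to its initial value (assumed normalization)."""
    return (e_now - e_start) / e_start

pos = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
vel = [[0.0, -0.5, 0.0], [0.0, 0.5, 0.0]]
mass = [1.0, 1.0]
e0 = total_energy(pos, vel, mass)  # kinetic 0.25, potential -1.0
```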

During the last fifteen years, two approaches have been put forward to improve
the numerical conservation of energy and other theoretically conserved quantities:
symplectic integration schemes, where the simulated system is guaranteed to
follow a slightly perturbed Hamiltonian system, and time-symmetric integration
schemes, in which the simulated system follows the same trajectory in phase
space whether run backward or forward.
In both cases, the introduction of adaptive time steps tends to destroy the
desired properties. Symplectic schemes are perturbed to different Hamiltonians
for different choices of time step length, and therefore lose their global
symplecticity. Time-symmetric schemes typically determine their time step length
at the beginning of a step, which implies that running a step backward might give
a slightly different length for that step [14, 19-21].
Time-symmetric and symplectic integration schemes share the property that their
energy errors behave much better than those of generic integration schemes.
Allowing adaptive time steps typically leads to a loss of symplecticity. In
contrast, time symmetry can be easily maintained, at least for a continuous
choice of time step size [22].
Direct integration is the most time-consuming part of many N-body
algorithms [1]. It covers many one-step methods, such as leapfrog, and multi-step
methods, such as the Adams methods. All these schemes have to provide high
accuracy and stability in astrophysical simulations. For this reason, higher-order
schemes are to be preferred. This is computationally expensive, which implies
that a good algorithm minimizing the number of force calculations is needed
[3, 23].
The first step in converting existing schemes into a good algorithm is switching
from constant time steps to variable time steps. There is an increasing need for
adaptive time step versions of the established integration schemes. At the same
time, there is also a need to parallelize the integration schemes, so that the task
can be divided over many CPUs simultaneously to reduce the computation time
dramatically [24].
For practical applications in large-scale N-body calculations in physics and
astrophysics, however, it is desirable to allow only a fixed set of choices for the
time step size, in the form of powers of two. These so-called block time steps
allow efficient parallelization, since large numbers of particles sharing the same
block time step can then be integrated in parallel [25, 26].
In this work we aim to construct a time-symmetric block time step integration
scheme. For this purpose, we first analyze the block time step scheme
with a time symmetrization procedure for the gravitational two-body problem. It
is hard to devise an efficient algorithm for the N-body problem without a
deep and clear understanding of this fundamental problem.
A straightforward implementation of time symmetry, translated to block time steps,
faces significant hurdles for the N-body problem. For example, iteration can lead to
oscillatory behavior, and even when such behavior is suppressed, energy errors
show a linear drift in time. We present an approach that circumvents these
problems.
Since parallelization is rapidly becoming essential for any major simulation, we
also explore the possibility of extending time symmetry to the use of block time
steps for a parallel N-body integration algorithm.
Our basic integrator in this work is the leapfrog scheme, a well-known,
second-order, time-symmetric integration scheme for fixed time steps. We additionally
used fourth-, sixth-, and eighth-order Hermite schemes to test the
algorithm with higher-order integration methods.

For the programming part, we used the Ruby programming language, specifically
for two-body problems, to write codes for already existing schemes and to test
them with our new algorithmic approaches. Ruby is a very effective, fully
object-oriented scripting language. For time-consuming N-body codes, and for the
parallelization of the algorithms, we prefer the C language with the MPI-2 library.
C with MPI allows us to run our codes on different platforms. We
used Plummer model initial conditions for numerical tests, as shown in Fig. 1.3.
A brief description of the Plummer model is taken from [3] (pages 121-122) and is
given in Appendix A.
Figure 1.3: An example for numerical model initial conditions for 10000 point particle
Plummer model

Existing integration schemes, together with the basic definitions and requirements
of N-body algorithms needed for constructing a new algorithm, are given in
Chapter 2. In Chapter 3, we show the inadequacy of straightforward implementations
for two-body cases. We also discuss the capabilities of implicit iterative block time
steps. We analyze some of the problems that occur when applying existing
methods to the case of block time steps, and offer a novel solution, with a
truly time-symmetric choice of time step. In Chapter 4, we give the definition
of the era concept, the generalization of the algorithm to N-body cases, and the
possibility of choosing a dynamically changing era size. We demonstrate that the advantage
carries over to the general N-body problem for different integration schemes. In
Chapter 5, we give a parallel scheme for the developed algorithm, and numerical
test results. The last chapter includes concluding remarks.

2. INTEGRATION METHODS AND ALGORITHMS

A generally accepted view is that if the results of a method match what
happens in the real world, the method is more accurate than the
others. However, trajectories in larger systems exhibit rapid error growth even on
short time scales, and exact solutions are never known except for two-body cases.
Our main criterion for distinguishing good results from bad ones is energy conservation.
In this work, we use two widely preferred integration schemes, leapfrog and
fourth-order Hermite, and two newly developed ones, sixth- and eighth-order
Hermite, for N-body integration codes. For the schemes of order higher than
four, we need derivatives of the acceleration of second order (snap), third
order (crackle), and fourth order (pop). There are two ways to construct these
derivatives: the first is to use previous steps, as in the Aarseth scheme [3]; the
second is to use directly calculated derivatives. For higher-order integrations, we
preferred the sixth- and eighth-order schemes of [27].

2.1 Integration Schemes

2.1.1 Leapfrog integration

The leapfrog integration scheme, Eq. (2.1), is one of the important second-order
integration methods for N-body simulations. It is also known as the Verlet method in
computational physics. This simple and effective scheme is written as

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + \mathbf{v}_{i+1/2}\,\delta t, \\
\mathbf{v}_{i+3/2} &= \mathbf{v}_{i+1/2} + \mathbf{a}_{i+1}\,\delta t.
\end{aligned} \tag{2.1}
\]

Here, position vectors and acceleration vectors are defined at integer times, and
velocity vectors are defined at half-integer times. We can calculate accelerations
directly from positions, so we can write $\mathbf{a}(\mathbf{r}, \mathbf{v}) = \mathbf{a}(\mathbf{r})$ for the acceleration vectors.

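This scheme can be rewritten to define all quantities only at integer times. We
use the following Taylor series expansions for the velocity terms:

\[
\begin{aligned}
\mathbf{v}_{i+1/2} &= \mathbf{v}_i + \mathbf{a}_i \frac{\delta t}{2}, \\
\mathbf{v}_{i+3/2} &= \mathbf{v}_{i+1} + \mathbf{a}_{i+1} \frac{\delta t}{2},
\end{aligned} \tag{2.2}
\]

and our self-starting leapfrog scheme can be written as

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{\delta t^2}{2}, \\
\mathbf{v}_{i+1} &= \mathbf{v}_i + (\mathbf{a}_i + \mathbf{a}_{i+1}) \frac{\delta t}{2}.
\end{aligned} \tag{2.3}
\]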

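It is possible to obtain highly stable and accurate results with this scheme. These
features come from its time-symmetric structure for constant time steps.
Let us show that the self-starting form of the leapfrog integration scheme is time
symmetric: we first take one step forward with $\delta t$ to go from the phase space point
$(\mathbf{r}_i, \mathbf{v}_i)$ to $(\mathbf{r}_{i+1}, \mathbf{v}_{i+1})$, and then take one step backward using the same scheme
with a negative time step ($-\delta t$). If our scheme is time symmetric, the final values
of the positions and velocities, $(\mathbf{r}_f, \mathbf{v}_f)$, must equal the initial ones:

\[
\begin{aligned}
\mathbf{r}_f &= \mathbf{r}_{i+1} - \mathbf{v}_{i+1}\,\delta t + \mathbf{a}_{i+1} \frac{\delta t^2}{2} \\
&= \left[ \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{\delta t^2}{2} \right] - \left[ \mathbf{v}_i + (\mathbf{a}_i + \mathbf{a}_{i+1}) \frac{\delta t}{2} \right] \delta t + \mathbf{a}_{i+1} \frac{\delta t^2}{2} \\
&= \mathbf{r}_i, \\
\mathbf{v}_f &= \mathbf{v}_{i+1} - (\mathbf{a}_i + \mathbf{a}_{i+1}) \frac{\delta t}{2} \\
&= \left[ \mathbf{v}_i + (\mathbf{a}_i + \mathbf{a}_{i+1}) \frac{\delta t}{2} \right] - (\mathbf{a}_i + \mathbf{a}_{i+1}) \frac{\delta t}{2} \\
&= \mathbf{v}_i.
\end{aligned} \tag{2.4}
\]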

In the forward and backward steps all terms cancel, and we obtain exact time
reversibility, as in Eq. (2.4).
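The cancellation can also be checked numerically. The sketch below takes one forward and one backward step of Eq. (2.3) for the relative Kepler orbit; the acceleration function `acc`, the units G M = 1, and the step size 0.01 are assumptions chosen for illustration.

```python
import math

def acc(r):
    """Kepler acceleration of the relative orbit, in assumed units with G * M = 1."""
    d3 = math.sqrt(sum(c * c for c in r)) ** 3
    return [-c / d3 for c in r]

def leapfrog_step(r, v, dt):
    """One step of the self-starting leapfrog scheme, Eq. (2.3)."""
    a0 = acc(r)
    r1 = [r[k] + v[k] * dt + 0.5 * a0[k] * dt * dt for k in range(3)]
    a1 = acc(r1)
    v1 = [v[k] + 0.5 * (a0[k] + a1[k]) * dt for k in range(3)]
    return r1, v1

r0, v0 = [1.0, 0.0, 0.0], [0.0, 0.5, 0.0]
r1, v1 = leapfrog_step(r0, v0, 0.01)    # forward step with +dt
rf, vf = leapfrog_step(r1, v1, -0.01)   # backward step with -dt
# Largest deviation from the starting point; only round-off should remain.
err = max(abs(a - b) for a, b in zip(rf + vf, r0 + v0))
```

With a fixed step, `err` stays at the level of floating-point round-off, which is the numerical counterpart of Eq. (2.4).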
But this time-symmetric structure (Fig. 2.1) is disturbed when we use variable
time steps with the leapfrog scheme.

Figure 2.1: Relative energy error for fixed time step leapfrog integration for an elliptic
Kepler orbit with eccentricity= 0.75 for 1000 time units.

Leapfrog has several advantages over other methods. It is a second-order
symplectic integrator requiring only one costly force evaluation per time step and
only one copy of the physical state of the system. This is particularly beneficial
for N-body simulations, where force evaluations are very expensive
and memory usage is often a critical concern. The force field in an N-body
simulation is not very smooth, so higher order does not necessarily mean higher
accuracy at all times. Leapfrog also preserves properties specific to Hamiltonian systems.
Gravitational N-body systems should benefit from the use of an integrator that
conserves phase space volume and has no spurious dissipation. This is especially
important in self-gravitating systems, where a dissipation time scale that is linked
to the dynamical time can lead to a runaway to spuriously large densities [28].

2.1.2 Fourth–order Hermite integration

The leapfrog scheme is an important scheme, but it is only second order. In many
problems we need higher-order accuracy. The fourth-order Hermite scheme is one of the
important integration methods for the N-body problem. This scheme was first
introduced by Makino [26, 29]. It is constructed on Hermite interpolation
polynomials. With this scheme, we can calculate the time derivative of the
acceleration directly in the prediction phase. Interpolation polynomials of the forces
are constructed using explicitly computed high-order derivatives.
Fourth-order Hermite integration is known to be more accurate than
divided-difference methods of the same order. It has a higher calculation
cost per time step than the fourth-order Aarseth scheme [3], but it allows time steps
twice as large. For this reason, it gains in both
total calculation cost and calculation speed at the same order.
Taylor-series-based predictions are given as follows:

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{(\delta t)^2}{2} + \mathbf{j}_i \frac{(\delta t)^3}{6} + \mathbf{s}_i \frac{(\delta t)^4}{24}, \\
\mathbf{v}_{i+1} &= \mathbf{v}_i + \mathbf{a}_i\,\delta t + \mathbf{j}_i \frac{(\delta t)^2}{2} + \mathbf{s}_i \frac{(\delta t)^3}{6} + \mathbf{c}_i \frac{(\delta t)^4}{24}, \\
\mathbf{a}_{i+1} &= \mathbf{a}_i + \mathbf{j}_i\,\delta t + \mathbf{s}_i \frac{(\delta t)^2}{2} + \mathbf{c}_i \frac{(\delta t)^3}{6}, \\
\mathbf{j}_{i+1} &= \mathbf{j}_i + \mathbf{s}_i\,\delta t + \mathbf{c}_i \frac{(\delta t)^2}{2}.
\end{aligned} \tag{2.5}
\]

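In Eq. (2.5), $\mathbf{j}_i$ (jerk), $\mathbf{s}_i$ (snap), and $\mathbf{c}_i$ (crackle) are the first, second, and third
derivatives of the acceleration. We can calculate $\mathbf{s}_i$ and $\mathbf{c}_i$ using $\mathbf{a}_{i+1}$ and $\mathbf{j}_{i+1}$. First,
we rewrite the expansions of $\mathbf{a}_{i+1}$ and $\mathbf{j}_{i+1}$ as

\[
\begin{aligned}
6\mathbf{a}_{i+1} - 6\mathbf{a}_i &= 6\mathbf{j}_i\,\delta t + 3\mathbf{s}_i (\delta t)^2 + \mathbf{c}_i (\delta t)^3, \\
2\mathbf{j}_{i+1}\,\delta t &= 2\mathbf{j}_i\,\delta t + 2\mathbf{s}_i (\delta t)^2 + \mathbf{c}_i (\delta t)^3.
\end{aligned} \tag{2.6}
\]

Subtracting the second equation from the first side by side gives

\[
6\mathbf{a}_{i+1} - 2\mathbf{j}_{i+1}\,\delta t = 6\mathbf{a}_i + 4\mathbf{j}_i\,\delta t + \mathbf{s}_i (\delta t)^2, \tag{2.7}
\]

and we obtain $\mathbf{s}_i$ as

\[
\mathbf{s}_i (\delta t)^2 = -6\mathbf{a}_i + 6\mathbf{a}_{i+1} - 4\mathbf{j}_i\,\delta t - 2\mathbf{j}_{i+1}\,\delta t. \tag{2.8}
\]

Substituting $\mathbf{s}_i$ into the expression for $\mathbf{j}_{i+1}$,

\[
2\mathbf{j}_{i+1}\,\delta t = 2\mathbf{j}_i\,\delta t - 12\mathbf{a}_i + 12\mathbf{a}_{i+1} - 8\mathbf{j}_i\,\delta t - 4\mathbf{j}_{i+1}\,\delta t + \mathbf{c}_i (\delta t)^3, \tag{2.9}
\]

we obtain $\mathbf{c}_i$ as

\[
\mathbf{c}_i (\delta t)^3 = 12\mathbf{a}_i - 12\mathbf{a}_{i+1} + 6\mathbf{j}_i\,\delta t + 6\mathbf{j}_{i+1}\,\delta t. \tag{2.10}
\]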
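Eqs. (2.8) and (2.10) can be verified on a test function. The sketch below applies them to one scalar component of an acceleration that is a cubic polynomial in time, for which the truncated Taylor expansions are exact; the function name `snap_crackle` and the test polynomial are assumptions of this example.

```python
def snap_crackle(a0, j0, a1, j1, dt):
    """Snap and crackle at the start of a step, from Eqs. (2.8) and (2.10).

    Works on one scalar component; a0, j0 are the acceleration and jerk at the
    start of the step, a1, j1 the same quantities at the end.
    """
    s0 = (-6.0 * a0 + 6.0 * a1 - 4.0 * j0 * dt - 2.0 * j1 * dt) / dt ** 2
    c0 = (12.0 * a0 - 12.0 * a1 + 6.0 * j0 * dt + 6.0 * j1 * dt) / dt ** 3
    return s0, c0

# Test acceleration a(t) = 1 + 2t + 3t^2 + 4t^3, so that at t = 0:
# a = 1, jerk = 2, snap = 6, crackle = 24.
def a_of(t):
    return 1.0 + 2.0 * t + 3.0 * t ** 2 + 4.0 * t ** 3

def j_of(t):
    return 2.0 + 6.0 * t + 12.0 * t ** 2

dt = 0.5
s0, c0 = snap_crackle(a_of(0.0), j_of(0.0), a_of(dt), j_of(dt), dt)
```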

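In this formulation, $\mathbf{s}_i$ and $\mathbf{c}_i$ are the second and third time derivatives of the
acceleration, respectively. For a fourth-order scheme, terms of the Taylor expansion
of order higher than four are neglected.
According to Newton's law of gravity, we can write the acceleration as

\[
\mathbf{a}_i = G \sum_{\substack{j=1 \\ j \neq i}}^{N} \frac{M_j}{r_{ji}^2}\,\hat{\mathbf{r}}_{ji}, \tag{2.11}
\]

where $\mathbf{r}_{ji} = \mathbf{r}_j - \mathbf{r}_i$, $r_{ji} = |\mathbf{r}_{ji}|$, and $\hat{\mathbf{r}}_{ji} = \mathbf{r}_{ji}/r_{ji}$.
$\mathbf{j}_i$ is the first time derivative of the acceleration. It is known as the jerk, and is given as

\[
\mathbf{j}_i = G \sum_{\substack{j=1 \\ j \neq i}}^{N} M_j \left[ \frac{\mathbf{v}_{ji}}{r_{ji}^3} - 3\,\frac{(\mathbf{r}_{ji} \cdot \mathbf{v}_{ji})\,\mathbf{r}_{ji}}{r_{ji}^5} \right], \tag{2.12}
\]

with $\mathbf{v}_{ji} = \mathbf{v}_j - \mathbf{v}_i$.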

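In the programming part, the most time-consuming parts are the calculations of
the higher-order derivatives. Before we correct the position and velocity vectors, we use
third-order Taylor expansions for the predictions:

\[
\begin{aligned}
\mathbf{r}^p_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{(\delta t)^2}{2} + \mathbf{j}_i \frac{(\delta t)^3}{6}, \\
\mathbf{v}^p_{i+1} &= \mathbf{v}_i + \mathbf{a}_i\,\delta t + \mathbf{j}_i \frac{(\delta t)^2}{2}.
\end{aligned} \tag{2.13}
\]

The correctors are based on the third-order Hermite interpolation constructed from
$\mathbf{a}_i$ and $\mathbf{j}_i$. They are given by

\[
\begin{aligned}
\mathbf{v}^c_{i+1} &= \mathbf{v}_i + (\mathbf{a}^p_{i+1} + \mathbf{a}_i) \frac{\delta t}{2} - (\mathbf{j}^p_{i+1} - \mathbf{j}_i) \frac{(\delta t)^2}{12}, \\
\mathbf{r}^c_{i+1} &= \mathbf{r}_i + (\mathbf{v}^c_{i+1} + \mathbf{v}_i) \frac{\delta t}{2} - (\mathbf{a}^p_{i+1} - \mathbf{a}_i) \frac{(\delta t)^2}{12}.
\end{aligned} \tag{2.14}
\]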
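The predictor of Eq. (2.13) and the corrector of Eq. (2.14) can be combined into one step as in the following sketch for the relative Kepler orbit. The function names, the units G M = 1, the fixed step size, and the single corrector pass are assumptions made for illustration, not a description of the production code.

```python
import math

def acc_jerk(r, v):
    """Acceleration and jerk of the relative Kepler orbit, assuming G * M = 1."""
    r2 = sum(c * c for c in r)
    r3 = r2 * math.sqrt(r2)
    rv = sum(x * y for x, y in zip(r, v))
    a = [-x / r3 for x in r]
    j = [-(y / r3 - 3.0 * rv * x / (r3 * r2)) for x, y in zip(r, v)]
    return a, j

def hermite4_step(r, v, dt):
    """One predict-evaluate-correct step of the fourth-order Hermite scheme."""
    a0, j0 = acc_jerk(r, v)
    # Predictor, Eq. (2.13)
    rp = [r[k] + v[k] * dt + a0[k] * dt ** 2 / 2 + j0[k] * dt ** 3 / 6 for k in range(3)]
    vp = [v[k] + a0[k] * dt + j0[k] * dt ** 2 / 2 for k in range(3)]
    a1, j1 = acc_jerk(rp, vp)
    # Corrector, Eq. (2.14)
    vc = [v[k] + (a1[k] + a0[k]) * dt / 2 - (j1[k] - j0[k]) * dt ** 2 / 12 for k in range(3)]
    rc = [r[k] + (vc[k] + v[k]) * dt / 2 - (a1[k] - a0[k]) * dt ** 2 / 12 for k in range(3)]
    return rc, vc

def energy(r, v):
    """Energy per unit reduced mass of the relative orbit."""
    return 0.5 * sum(c * c for c in v) - 1.0 / math.sqrt(sum(c * c for c in r))

r, v = [1.0, 0.0, 0.0], [0.0, 0.5, 0.0]
e0 = energy(r, v)
for _ in range(1000):  # integrate for one time unit
    r, v = hermite4_step(r, v, 0.001)
rel_err = abs((energy(r, v) - e0) / e0)
```

The relative energy error `rel_err` serves as the accuracy monitor, in the same way as in the figures of this chapter.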

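The Hermite integration scheme can also be described in implicit form:

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + (\mathbf{v}_{i+1} + \mathbf{v}_i) \frac{\delta t}{2} - (\mathbf{a}_{i+1} - \mathbf{a}_i) \frac{(\delta t)^2}{10} + (\mathbf{j}_{i+1} + \mathbf{j}_i) \frac{(\delta t)^3}{120}, \\
\mathbf{v}_{i+1} &= \mathbf{v}_i + (\mathbf{a}_{i+1} + \mathbf{a}_i) \frac{\delta t}{2} - (\mathbf{j}_{i+1} - \mathbf{j}_i) \frac{(\delta t)^2}{12}.
\end{aligned} \tag{2.15}
\]
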
2.1.3 Sixth–order Hermite integration

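For the sixth-order Hermite scheme we need sixth-order predictions. Here, we use
only a fifth-order expansion for the positions. Taylor-series-based predictions for
sixth-order Hermite are given by

\[
\begin{aligned}
\mathbf{r}^p_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{(\delta t)^2}{2} + \mathbf{j}_i \frac{(\delta t)^3}{6} + \mathbf{s}_i \frac{(\delta t)^4}{24} + \mathbf{c}_i \frac{(\delta t)^5}{120}, \\
\mathbf{v}^p_{i+1} &= \mathbf{v}_i + \mathbf{a}_i\,\delta t + \mathbf{j}_i \frac{(\delta t)^2}{2} + \mathbf{s}_i \frac{(\delta t)^3}{6} + \mathbf{c}_i \frac{(\delta t)^4}{24}, \\
\mathbf{a}^p_{i+1} &= \mathbf{a}_i + \mathbf{j}_i\,\delta t + \mathbf{s}_i \frac{(\delta t)^2}{2} + \mathbf{c}_i \frac{(\delta t)^3}{6}.
\end{aligned} \tag{2.16}
\]

Sums and differences of the acceleration, jerk, and snap are defined as

\[
\begin{aligned}
\mathbf{A}^{+} &\equiv \mathbf{a}_{i+1} + \mathbf{a}_i, & \mathbf{A}^{-} &\equiv \mathbf{a}_{i+1} - \mathbf{a}_i, \\
\mathbf{J}^{+} &\equiv h(\mathbf{j}_{i+1} + \mathbf{j}_i), & \mathbf{J}^{-} &\equiv h(\mathbf{j}_{i+1} - \mathbf{j}_i), \\
\mathbf{S}^{+} &\equiv h^2(\mathbf{s}_{i+1} + \mathbf{s}_i), & \mathbf{S}^{-} &\equiv h^2(\mathbf{s}_{i+1} - \mathbf{s}_i),
\end{aligned} \tag{2.17}
\]

where $h = (t_{i+1} - t_i)/2$. Writing $\mathbf{a}^{(k)}$ for the $k$th derivative of the acceleration, the coefficients
of the interpolation polynomial at the midpoint $t_{i+1/2} = (t_i + t_{i+1})/2$ are given as

\[
\begin{aligned}
\mathbf{a}_{i+1/2} &= \frac{1}{16}\left( 8\mathbf{A}^{+} - 5\mathbf{J}^{-} + \mathbf{S}^{+} \right), \\
h\,\mathbf{j}_{i+1/2} &= \frac{1}{16}\left( 15\mathbf{A}^{-} - 7\mathbf{J}^{+} + \mathbf{S}^{-} \right), \\
\frac{h^2}{2}\,\mathbf{s}_{i+1/2} &= \frac{1}{8}\left( 3\mathbf{J}^{-} - \mathbf{S}^{+} \right), \\
\frac{h^3}{6}\,\mathbf{a}^{(3)}_{i+1/2} &= \frac{1}{8}\left( -5\mathbf{A}^{-} + 5\mathbf{J}^{+} - \mathbf{S}^{-} \right), \\
\frac{h^4}{24}\,\mathbf{a}^{(4)}_{i+1/2} &= \frac{1}{16}\left( -\mathbf{J}^{-} + \mathbf{S}^{+} \right), \\
\frac{h^5}{120}\,\mathbf{a}^{(5)}_{i+1/2} &= \frac{1}{16}\left( 3\mathbf{A}^{-} - 3\mathbf{J}^{+} + \mathbf{S}^{-} \right).
\end{aligned} \tag{2.18}
\]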

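By using Eq. (2.18) we can compute the high-order derivatives for the next step:

\[
\begin{aligned}
\mathbf{a}^{(3)}_{i+1} &= \mathbf{a}^{(3)}_{i+1/2} + h\,\mathbf{a}^{(4)}_{i+1/2} + \frac{h^2}{2}\,\mathbf{a}^{(5)}_{i+1/2}, \\
\mathbf{a}^{(4)}_{i+1} &= \mathbf{a}^{(4)}_{i+1/2} + h\,\mathbf{a}^{(5)}_{i+1/2}, \\
\mathbf{a}^{(5)}_{i+1} &= \mathbf{a}^{(5)}_{i+1/2},
\end{aligned} \tag{2.19}
\]

where $\mathbf{a}^{(3)}_{i+1}$, $\mathbf{a}^{(4)}_{i+1}$, and $\mathbf{a}^{(5)}_{i+1}$ are $\mathbf{c}_{i+1}$, $\mathbf{p}_{i+1}$, and $\mathbf{n}_{i+1}$, respectively.
The correctors for sixth-order Hermite are given as

\[
\begin{aligned}
\mathbf{v}^c_{i+1} &= \mathbf{v}_i + (\mathbf{a}^p_{i+1} + \mathbf{a}_i) \frac{\delta t}{2} - (\mathbf{j}^p_{i+1} - \mathbf{j}_i) \frac{(\delta t)^2}{10} + (\mathbf{s}_{i+1} + \mathbf{s}_i) \frac{(\delta t)^3}{120}, \\
\mathbf{r}^c_{i+1} &= \mathbf{r}_i + (\mathbf{v}^c_{i+1} + \mathbf{v}_i) \frac{\delta t}{2} - (\mathbf{a}^p_{i+1} - \mathbf{a}_i) \frac{(\delta t)^2}{10} + (\mathbf{j}^p_{i+1} + \mathbf{j}_i) \frac{(\delta t)^3}{120}.
\end{aligned} \tag{2.20}
\]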

2.1.4 Eighth–order Hermite integration

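Similarly, for the eighth-order Hermite scheme we use seventh-order predictions for
the positions. We also need two extra terms in each expansion, and a fourth
prediction, for the jerk:

\[
\begin{aligned}
\mathbf{r}^p_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{(\delta t)^2}{2} + \mathbf{j}_i \frac{(\delta t)^3}{6} + \mathbf{s}_i \frac{(\delta t)^4}{24} + \mathbf{c}_i \frac{(\delta t)^5}{120} + \mathbf{p}_i \frac{(\delta t)^6}{720} + \mathbf{n}_i \frac{(\delta t)^7}{5040}, \\
\mathbf{v}^p_{i+1} &= \mathbf{v}_i + \mathbf{a}_i\,\delta t + \mathbf{j}_i \frac{(\delta t)^2}{2} + \mathbf{s}_i \frac{(\delta t)^3}{6} + \mathbf{c}_i \frac{(\delta t)^4}{24} + \mathbf{p}_i \frac{(\delta t)^5}{120} + \mathbf{n}_i \frac{(\delta t)^6}{720}, \\
\mathbf{a}^p_{i+1} &= \mathbf{a}_i + \mathbf{j}_i\,\delta t + \mathbf{s}_i \frac{(\delta t)^2}{2} + \mathbf{c}_i \frac{(\delta t)^3}{6} + \mathbf{p}_i \frac{(\delta t)^4}{24} + \mathbf{n}_i \frac{(\delta t)^5}{120}, \\
\mathbf{j}^p_{i+1} &= \mathbf{j}_i + \mathbf{s}_i\,\delta t + \mathbf{c}_i \frac{(\delta t)^2}{2} + \mathbf{p}_i \frac{(\delta t)^3}{6} + \mathbf{n}_i \frac{(\delta t)^4}{24}.
\end{aligned} \tag{2.21}
\]

Interpolation polynomials are constructed from $(\mathbf{a}_i, \mathbf{j}_i, \mathbf{s}_i, \mathbf{c}_i)$ and
$(\mathbf{a}_{i+1}, \mathbf{j}_{i+1}, \mathbf{s}_{i+1}, \mathbf{c}_{i+1})$. The even-order coefficients are

\[
\begin{pmatrix}
\mathbf{a}_{i+1/2} \\
\frac{h^2}{2}\,\mathbf{s}_{i+1/2} \\
\frac{h^4}{24}\,\mathbf{a}^{(4)}_{i+1/2} \\
\frac{h^6}{720}\,\mathbf{a}^{(6)}_{i+1/2}
\end{pmatrix}
= \frac{1}{32}
\begin{pmatrix}
16 & -11 & 3 & -1/3 \\
0 & 15 & -7 & 1 \\
0 & -5 & 5 & -1 \\
0 & 1 & -1 & 1/3
\end{pmatrix}
\begin{pmatrix}
\mathbf{A}^{+} \\ \mathbf{J}^{-} \\ \mathbf{S}^{+} \\ \mathbf{C}^{-}
\end{pmatrix}, \tag{2.22}
\]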

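and the odd-order ones are

\[
\begin{pmatrix}
h\,\mathbf{j}_{i+1/2} \\
\frac{h^3}{6}\,\mathbf{c}_{i+1/2} \\
\frac{h^5}{120}\,\mathbf{a}^{(5)}_{i+1/2} \\
\frac{h^7}{5040}\,\mathbf{a}^{(7)}_{i+1/2}
\end{pmatrix}
= \frac{1}{32}
\begin{pmatrix}
35 & -19 & 4 & -1/3 \\
-35 & 35 & -10 & 1 \\
21 & -21 & 8 & -1 \\
-5 & 5 & -2 & 1/3
\end{pmatrix}
\begin{pmatrix}
\mathbf{A}^{-} \\ \mathbf{J}^{+} \\ \mathbf{S}^{-} \\ \mathbf{C}^{+}
\end{pmatrix}. \tag{2.23}
\]

We need the extra sum and difference definitions

\[
\begin{aligned}
\mathbf{C}^{+} &\equiv h^3(\mathbf{c}_{i+1} + \mathbf{c}_i), \\
\mathbf{C}^{-} &\equiv h^3(\mathbf{c}_{i+1} - \mathbf{c}_i).
\end{aligned} \tag{2.24}
\]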

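The correctors for eighth-order Hermite are given as

\[
\begin{aligned}
\mathbf{v}^c_{i+1} &= \mathbf{v}_i + (\mathbf{a}^p_{i+1} + \mathbf{a}_i) \frac{\delta t}{2} - (\mathbf{j}^p_{i+1} - \mathbf{j}_i) \frac{3(\delta t)^2}{28} \\
&\quad + (\mathbf{s}_{i+1} + \mathbf{s}_i) \frac{(\delta t)^3}{84} - (\mathbf{c}_{i+1} - \mathbf{c}_i) \frac{(\delta t)^4}{1680}, \\
\mathbf{r}^c_{i+1} &= \mathbf{r}_i + (\mathbf{v}^c_{i+1} + \mathbf{v}_i) \frac{\delta t}{2} - (\mathbf{a}^p_{i+1} - \mathbf{a}_i) \frac{3(\delta t)^2}{28} \\
&\quad + (\mathbf{j}^p_{i+1} + \mathbf{j}_i) \frac{(\delta t)^3}{84} - (\mathbf{s}_{i+1} - \mathbf{s}_i) \frac{(\delta t)^4}{1680}.
\end{aligned} \tag{2.25}
\]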

2.2 Variable Time Step

When two particles approach each other, the forces and velocities get bigger. If
time steps are constant, the particles can travel too far in one step and the errors can increase
when they come close together. To avoid such situations we prefer
to use variable time steps with the integration schemes. With variable time step
schemes, we can keep the time step small when the particles are close to each
other, and increase it automatically when the particles are far away from each
other.
Time step requirements for bodies can vary over a very wide range in N-body
simulations. There is no practical way to predict which constant time step we
can use for the integration. Even compared with the best choice of constant step,
variable time step schemes generally require fewer time steps for the same
accuracy. Algorithms of this kind help ensure that numerical instability
does not occur.
The easiest way to use variable time steps for N-body integrations is the shared time
step algorithm. In this scheme, time steps are calculated for each particle, and
all particles are then forced to take the smallest one. Variable time step schemes
are better than fixed ones; however, variable but shared time steps are impractical for
N-body algorithms.

2.3 Time Step Criterion

One of the easiest estimates for a time step criterion is the collisional time step. When
two particles approach each other, or move away from each other, the ratio between
their relative distance and relative velocity gives us an estimate. Even if a particle
is moving head-on toward the other, it will take a time of the order of the relative distance
divided by the relative velocity for them to hit each other. If they move in different directions,
the time scale for significant changes in their relative position will still be given
by this ratio (Eq. 2.26):

\[
\delta t = \eta \left( \frac{|\mathbf{r}_{ij}|}{|\mathbf{v}_{ij}|} \right), \tag{2.26}
\]

where $\eta$ is an accuracy parameter, and $\mathbf{r}_{ij}$ and $\mathbf{v}_{ij}$ are the relative position and velocity vectors
from particle $j$ to particle $i$, respectively.
On the other hand, if the particles happen to move at roughly the same velocity,
in both magnitude and direction, the collision time scale estimate will give an
enormously large number. In fact, it will produce infinity if the relative velocity
is exactly zero. For such cases, we need to include another criterion, which is
called the free-fall time scale:

\[
\delta t = \eta \sqrt{\frac{|\mathbf{r}_{ij}|}{|\mathbf{a}_{ij}|}}. \tag{2.27}
\]
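The two estimates can be combined by taking the smaller one, as in the following sketch. The function names are hypothetical, and the free-fall estimate is written here with a square root so that it has the dimension of time, which is an assumption of this example.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def collision_time_step(r_ij, v_ij, eta):
    """Eq. (2.26): eta * |r_ij| / |v_ij|; infinite when the relative velocity vanishes."""
    v = norm(v_ij)
    return math.inf if v == 0.0 else eta * norm(r_ij) / v

def free_fall_time_step(r_ij, a_ij, eta):
    """Free-fall estimate eta * sqrt(|r_ij| / |a_ij|); the square root is assumed here."""
    a = norm(a_ij)
    return math.inf if a == 0.0 else eta * math.sqrt(norm(r_ij) / a)

def pair_time_step(r_ij, v_ij, a_ij, eta=0.01):
    """Smaller of the two estimates, so that neither failure case dominates."""
    return min(collision_time_step(r_ij, v_ij, eta),
               free_fall_time_step(r_ij, a_ij, eta))

# Two particles at rest relative to each other: the collisional estimate is
# infinite, and the free-fall estimate takes over.
dt_rest = pair_time_step([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0])
dt_moving = pair_time_step([1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [-1.0, 0.0, 0.0])
```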

One of the important time step criteria was proposed by Aarseth (Eq. 2.28). Aarseth's
criterion is better known for predictor/corrector schemes [29], but it
needs higher-order derivatives, which makes it expensive for a second-order integration
scheme:

\[
\delta t_i = \sqrt{ \eta\, \frac{|\mathbf{a}_i|\,|\mathbf{a}^{(2)}_i| + |\mathbf{a}^{(1)}_i|^2}{|\mathbf{a}^{(1)}_i|\,|\mathbf{a}^{(3)}_i| + |\mathbf{a}^{(2)}_i|^2} }. \tag{2.28}
\]

In [30], Press et al. used another criterion, given as Eq. (2.29), compared it with
Aarseth's criterion, and found no significant difference in the step size for the
required accuracy:

\[
\delta t = \eta \left( \frac{F}{F^{(1)}} \right). \tag{2.29}
\]

This simple form was used by Gualandris [31]:

\[
\delta t = \eta \left( \frac{|\mathbf{a}_i|}{|\mathbf{a}^{(1)}_i|} \right). \tag{2.30}
\]

In this work, we use the simple collisional time step criterion (Eq. 2.26) to produce
block time steps.

2.4 Force Softening

A directly written N-body code without any softening parameter will never be
able to handle all close encounters. No matter how small a time step we give it,
sooner or later there will be particles that approach each other closely enough to
have a near miss that takes less time than the time step size. This will necessarily
lead to large numerical errors.
The singular Newtonian potential energy between two particles with positions $\mathbf{r}_i$ and
$\mathbf{r}_j$ and masses $m_i$, $m_j$ can be given as follows:

\[
U(\mathbf{r}_i, \mathbf{r}_j) = G\,\frac{m_i m_j}{|\mathbf{r}_j - \mathbf{r}_i|}. \tag{2.31}
\]

The simplest form of force softening that has been used in N-body simulations
is a modified Green's function,

\[
\gamma(r) = -\frac{1}{\sqrt{r^2 + \varepsilon^2}}, \tag{2.32}
\]

which is known as Plummer softening, with $\varepsilon$ being the characteristic softening
radius [28].
The standard softening approach is to replace the Newtonian potential energy by a
regular variant, simply by adding the square of $\varepsilon$:

\[
U(\mathbf{r}_i, \mathbf{r}_j, \varepsilon) = G\,\frac{m_i m_j}{\left( |\mathbf{r}_j - \mathbf{r}_i|^2 + \varepsilon^2 \right)^{1/2}}. \tag{2.33}
\]

When we differentiate this modified potential with respect to the position of a
particle, we obtain a modified acceleration:

\[
\frac{d^2}{dt^2}\,\mathbf{r}_i = G \sum_{\substack{j=1 \\ j \neq i}}^{N} m_j\,\frac{\mathbf{r}_j - \mathbf{r}_i}{\left( |\mathbf{r}_j - \mathbf{r}_i|^2 + \varepsilon^2 \right)^{3/2}}. \tag{2.34}
\]

In the limit $\varepsilon \to 0$, Eq. (2.34) returns to the Newtonian
gravitational acceleration.
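The effect of the softening length on a close encounter can be illustrated with a single pair term of Eq. (2.34). The function name `softened_acc` and the numerical values are assumptions chosen for this example.

```python
def softened_acc(r_i, r_j, m_j, eps, G=1.0):
    """Acceleration on particle i due to particle j, one term of Eq. (2.34)."""
    d = [b - a for a, b in zip(r_i, r_j)]        # r_j - r_i
    d2 = sum(c * c for c in d) + eps * eps       # softened squared distance
    return [G * m_j * c / d2 ** 1.5 for c in d]

# A close encounter at separation 1e-3: the Newtonian limit (eps = 0) gives an
# acceleration of about 1e6, while eps = 1e-2 keeps it near 1e3.
a_newton = softened_acc([0.0, 0.0, 0.0], [1e-3, 0.0, 0.0], 1.0, 0.0)
a_soft = softened_acc([0.0, 0.0, 0.0], [1e-3, 0.0, 0.0], 1.0, 1e-2)
```

With `eps = 0` the function reduces to the Newtonian pair acceleration, in line with the limit stated above.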

2.5 Individual Time Steps

In an N-body problem, many different time scales covering a very wide
range can be necessary. Variable but shared time steps increase the number of force calculations in such cases.
There is no need to integrate a particle with the smallest time step of the system
if its own time step could be bigger. We prefer to use the individual time step
scheme to avoid such expensive force calculations.
In this scheme, every particle takes its own time step. This
reduces the total calculation cost by a factor of $O(N^{1/3})$ with respect to a shared
time step code. When we integrate a particle, we have to include a temporary
prediction for the other particles, but the extra cost of these low-order
predictions is small. However, it is not efficient to use the individual
time step scheme in its original form on a parallel computer, since only one particle
is integrated at each step.
The integration cycle (Alg. 1) itself begins by determining the next particle, $i$, to
be advanced, i.e. the particle $j$ with the smallest value of $t_j + \delta t_j$, where $t_j$ is the
time of the last force evaluation [3].

Algorithm 1 Individual Time-Step Cycle (from Aarseth [3])
1: Determine the next particle: $i = \min_j \{ t_j + \delta t_j \}$
2: Set the new global time by $t = t_i + \delta t_i$
3: Predict all coordinates $\mathbf{r}_j$
4: Improve $\mathbf{r}_i$ and predict $\mathbf{v}_i$
5: Obtain the new force for $i$
6: Update the times $t_k$
7: Apply the corrector to $\mathbf{r}_i$ and $\mathbf{v}_i$
8: Specify the new time-step $\delta t_i$
9: Repeat the calculation from step 1
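The scheduling part of Algorithm 1 (steps 1, 2, and 6) can be sketched as follows. The prediction, force, and corrector steps are omitted, and the time steps are kept fixed instead of being recomputed in step 8; the function name `next_particle` and the three sample particles are assumptions of this sketch.

```python
def next_particle(t_last, dt):
    """Step 1 of Algorithm 1: index of the particle with the smallest t_j + dt_j."""
    return min(range(len(t_last)), key=lambda j: t_last[j] + dt[j])

t_last = [0.0, 0.0, 0.0]     # time of the last force evaluation of each particle
dt = [0.5, 0.125, 0.25]      # individual time steps (kept fixed in this sketch)
order = []
for _ in range(5):
    i = next_particle(t_last, dt)
    t = t_last[i] + dt[i]    # step 2: new global time
    # steps 3-5 and 7 (prediction, force evaluation, corrector) are omitted here
    t_last[i] = t            # step 6: update the particle's time
    order.append((i, t))     # step 8 would assign a new dt[i] at this point
```

The particle with the shortest step is advanced several times before the others move, which is why only one particle is integrated at each cycle.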

2.6 Block Time Steps

In practice, many large-scale simulations in stellar dynamics use a block time step
approach, where the only allowed values for the time step length are powers
of two [3]. In order to reduce the prediction overheads of predictor/corrector
schemes, it is advantageous to use block time steps. Schemes of this kind are
also known as hierarchical time step schemes [25]. Compared with individual time
steps, block time steps allow one to predict the positions of all
particles only once per block time step, rather than separately for each particle
that needs to be moved forward.
Let us define a block time step at level $n$ as having a length

\[
\delta t_n = \frac{\delta t_1}{2^{n-1}}, \tag{2.35}
\]

where $\delta t_1$ is the maximum time step length.
The name derives from the fact that, with this recipe, many particles will share
the same step size, which implies that their orbit integration can be performed
in parallel.
In principle, any level $n$ may be prescribed. However, it is rare for more than
about 12 levels to be populated in a realistic simulation with $N \le 1000$, increasing
by a few levels for $N \le 10^4$ [3].

2.7 Time Symmetrization

It is especially important to preserve energy conservation and stability in
long-term integrations. Here time-symmetric methods are important,
since they do not generate any secular energy error for periodic orbits. This
property makes it possible to obtain a high degree of energy conservation for the same
length of time steps. But time symmetry is disturbed (Fig. 2.2) by variable time
step schemes for well-known classical integrators. However, one can construct an
implicit time symmetrization scheme with an iterative procedure [19, 20]. This
approach seems applicable to higher-order schemes that are not structurally
time symmetric.

Figure 2.2: Relative energy error for variable time step leapfrog integration for an
elliptic Kepler orbit with eccentricity= 0.75 for 1000 time units.

It is surprisingly easy to introduce a time-symmetric version of any adaptive
self-starting integration scheme. Let $\xi = (\mathbf{r}, \mathbf{v})$ be the $2N$-dimensional phase
space vector for a system with $N$ degrees of freedom, and let $f(\xi_i, \delta t_i)$ be the
operator that maps the phase space vector of the system at time $t_i$ to a new
phase space vector at time $t_{i+1} = t_i + \delta t_i$. Any choice of self-starting integration
scheme, together with a recipe to determine the next time step $\delta t_i$ at time $t_i$ and
phase space value $\xi_i(t_i)$, defines the precise form of $f(\xi_i, \delta t_i)$.
The recipe for making any such scheme time symmetric was given by [19], as
follows:

\[
\begin{aligned}
\xi_{i+1} &= f(\xi_i, \delta t_i), \\
\delta t_i &= \frac{h(\xi_i) + h(\xi_{i+1})}{2},
\end{aligned} \tag{2.36}
\]

where $h(\xi)$ can be any time step criterion. Note that this recipe leads to an
implicit integration scheme, which can be solved most easily through iteration.
In practice, one or two iterations suffice to get excellent accuracy, but at the
cost of doubling or tripling the number of force calculations that need to be
performed. Extensions of this implicit symmetrization idea have been presented
by [19], [20], and [22].
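A minimal sketch of the iteration that solves Eq. (2.36) is given below for the leapfrog scheme and the Kepler problem. The collisional criterion is used for $h(\xi)$; the function names, the accuracy parameter, the units G M = 1, and the recomputation of the initial acceleration inside each trial step are simplifying assumptions of this example.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def acc(r):
    """Kepler acceleration of the relative orbit, assuming G * M = 1."""
    d3 = norm(r) ** 3
    return [-c / d3 for c in r]

def leapfrog_step(r, v, dt):
    """Self-starting leapfrog step, playing the role of f(xi_i, dt_i)."""
    a0 = acc(r)
    r1 = [r[k] + v[k] * dt + 0.5 * a0[k] * dt * dt for k in range(3)]
    a1 = acc(r1)
    v1 = [v[k] + 0.5 * (a0[k] + a1[k]) * dt for k in range(3)]
    return r1, v1

def h(r, v, eta=0.01):
    """Time step criterion h(xi); here the collisional estimate |r| / |v|."""
    return eta * norm(r) / norm(v)

def symmetric_step(r, v, n_iter=1):
    """Iterative solution of Eq. (2.36); n_iter = 0 gives the plain variable step."""
    dt = h(r, v)                       # first guess from the current state only
    r1, v1 = leapfrog_step(r, v, dt)
    for _ in range(n_iter):
        dt = 0.5 * (h(r, v) + h(r1, v1))
        r1, v1 = leapfrog_step(r, v, dt)
    return r1, v1, dt

r0, v0 = [1.0, 0.0, 0.0], [0.0, 0.5, 0.0]
r1, v1, dt = symmetric_step(r0, v0, n_iter=5)
# How well the implicit equation for dt is satisfied after the iterations.
residual = abs(dt - 0.5 * (h(r0, v0) + h(r1, v1)))
```

Each additional iteration costs one more trial step, which is why the number of iterations is kept small in practice.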
Here, we focus on the effect of the number of iterations used in the
symmetrization procedure (Eq. 2.36) on energy conservation. The number of
iterations used to solve Eq. (2.36) can be increased, but it is desirable to keep it
small. If a good time symmetrization is achieved, we do not see linearly growing
errors in the energy.
Our first results are obtained for the Kepler two-body problem on an elliptical
orbit with initial conditions r = [1 0 0] and v = [0 0.5 0] for the position and velocity,
respectively. The gravitational constant and the total mass of the two-body system
are chosen as unity for simplicity. In Fig. 2.3, we show the energy errors for variable
time steps versus symmetric time steps with the leapfrog integration scheme over 1000
time units. In Fig. 2.4, the kinetic and potential energies are shown over four orbital
periods. In Fig. 2.5, the integration covers ten orbital periods,
with an output interval of 0.01 time units.
For Fig. 2.6, the integration runs for 10000 time units
to show the long-term behavior of the scheme. As in Fig. 2.5, there is actually only one output
and only one curve in Fig. 2.6. In Fig. 2.6, output
is taken once per time unit instead of at every time step. The plotted points
therefore sample the periodic motion of the system, whose period is about 2.7 time units, once
per time unit.

Figure 2.3: Relative energy errors for variable time step versus symmetric time step
with leapfrog integration for an elliptic Kepler orbit with eccentricity=
0.75 for 1000 time units.
Fig. 2.7 shows the results for different numbers of iterations used in the
time symmetrization of the time steps for the same problem. We plotted the range
from 9000 to 10000 time units in order to show the small
differences between different numbers of iterations. However, most of the curves
are almost the same. Again, each result is a single curve, but we take the outputs once per
time unit to see the small differences more clearly. As seen in Fig. 2.7,
higher numbers of iterations make no significant difference in terms
of energy conservation for this two-body Kepler problem, which has a periodic
motion.

Figure 2.4: Kinetic and potential energy curves for time symmetrized variable time
step leapfrog integration for an elliptic Kepler orbit with eccentricity=
0.75 for 4 orbital periods.

Figure 2.5: Relative energy error for time symmetrized variable time step leapfrog
integration for an elliptic Kepler orbit with eccentricity= 0.75 for 10
orbital periods. Only one iteration is used for time symmetrization with
variable time steps.

Figure 2.6: Relative energy error for time symmetrized variable time step leapfrog
integration for the same Kepler problem for 10000 time units.

Figure 2.7: Relative energy errors for time symmetrized variable time step leapfrog
integration using different (1-4) number of iterations for the same Kepler
problem. Different number of iterations are shown with different colors
and symbols.

3. TIME-SYMMETRIC BLOCK TIME STEP ALGORITHM

3.1 Implicit Iterative Approach

Since we will need to examine the idea of iteration below in more detail, let us
write out the process explicitly. We start with the given state $\xi_i$ and the implicit
equation for $\xi_{i+1}$ of the form

\[
\xi_{i+1} = f(\xi_i, \delta t_i(\xi_i, \xi_{i+1})). \tag{3.1}
\]

The first guess for $\xi_{i+1}$ is

\[
\xi^{(0)}_{i+1} = f(\xi_i, \delta t_i(\xi_i, \xi_i)), \tag{3.2}
\]

and we can consider this as our zeroth-order iteration. With this guess in hand,
we can now start to iterate, finding

\[
\xi^{(1)}_{i+1} = f(\xi_i, \delta t_i(\xi_i, \xi^{(0)}_{i+1})) \tag{3.3}
\]

as our first-order iteration. This will already be much closer to the final value, as
long as the time steps are small enough and the function $\delta t_i$ does not fluctuate
too rapidly. In general, the $k$th iteration will yield a value for $\xi_{i+1}$ of

\[
\xi^{(k)}_{i+1} = f(\xi_i, \delta t_i(\xi_i, \xi^{(k-1)}_{i+1})). \tag{3.4}
\]

We will now consider the application of these techniques to block time steps. For
the purpose of illustrating the use of block time steps, it will suffice to use the
leapfrog scheme, which we present here in a self-starting, but still time-symmetric,
form:

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i (\delta t)^2/2, \\
\mathbf{v}_{i+1} &= \mathbf{v}_i + (\mathbf{a}_i + \mathbf{a}_{i+1})\,\delta t/2.
\end{aligned} \tag{3.5}
\]

All our considerations carry over to higher-order schemes, as long as the base
scheme can be made time symmetric when iterated to convergence. An example
of such a scheme is the widely used Hermite scheme [29].
To start with, we apply the recipe of [19] to block time steps $\delta t_n$ (Eq. 2.35),
starting with the continuum choice of

\[
\delta t_{c,i} = \frac{h(\xi_i) + h(\xi_{i+1})}{2}. \tag{3.6}
\]

We now force each time step to take on the block value $\delta t_i = \delta t_n$ for the smallest $n$
value that obeys the condition $\delta t_n \le \delta t_{c,i}$. In more formal terms, $\delta t_i = \delta t_i(\delta t_{c,i}) =
\delta t_n$ for the unique $n$ value for which

\[
n = \min_{k \ge 1} \left\{ k : \frac{1}{2^{k-1}} \le \frac{h(\xi_i) + h(\xi_{i+1})}{2} \right\}. \tag{3.7}
\]
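The mapping from a continuum time step to a block level in Eq. (3.7) amounts to a short search, sketched below. The function name `block_time_step` and the default maximum step of unity are assumptions; the latter matches the form of Eq. (3.7), in which the maximum step length is taken as 1.

```python
def block_time_step(dt_c, dt_max=1.0):
    """Smallest level n >= 1 with dt_max / 2**(n-1) <= dt_c, as in Eq. (3.7).

    Returns the level n and the corresponding block time step of Eq. (2.35).
    """
    if dt_c <= 0.0:
        raise ValueError("the continuum time step must be positive")
    n = 1
    while dt_max / 2 ** (n - 1) > dt_c:
        n += 1
    return n, dt_max / 2 ** (n - 1)

level_a = block_time_step(0.502)    # just above 1/2: level 2, step 0.5
level_b = block_time_step(0.4995)   # just below 1/2: level 3, step 0.25
level_c = block_time_step(3.0)      # larger than the maximum: level 1, step 1.0
```

A small change of the continuum value across a power of two changes the block step by a factor of two, which is the origin of the convergence problem discussed in Section 3.1.2.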

3.1.1 Two body tests for implicit iterative approach

Fig. 3.1 shows the results that we obtained for the same Kepler problem with the
implicit iterative block time step approach. Here, we have used 1, 2, and 3
iterations.
According to these results (Fig. 3.1 and Fig. 3.2), there is no need to use more than one
iteration for the time symmetrization of the leapfrog scheme with block time steps in the
two-body problem. Higher numbers of iterations do not produce better results
for this integration scheme on the Kepler problem.
On the other hand, even for more than two bodies, new time steps calculated with
higher numbers of iterations will fall in one of the nearest time blocks for the leapfrog
scheme with implicit iterative block time steps. For example, in the previous
Kepler two-body problem, the number of iterations does not affect the number of time blocks
or the number of time steps. We repeat our tests for the Kepler problem with different
eccentricities (Fig. 3.3-Fig. 3.10).

Figure 3.1: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps for a Kepler problem with eccentricity= 0.75.

Figure 3.2: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps, and time-symmetric leapfrog scheme.

Figure 3.3: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps for an elliptic Kepler orbit with eccentricity= 0.85. We
applied different numbers of iterations to the scheme, as in Fig. 3.1.

Figure 3.4: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps, and time-symmetric leapfrog scheme for an elliptic
Kepler orbit with eccentricity= 0.85.

Figure 3.5: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps for an elliptic Kepler orbit with eccentricity= 0.9375.

Figure 3.6: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps, and time-symmetric leapfrog scheme for an elliptic
Kepler orbit with eccentricity= 0.9375.

Figure 3.7: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps for an elliptic Kepler orbit with eccentricity= 0.9856.

Figure 3.8: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps, and time-symmetric leapfrog scheme for an elliptic
Kepler orbit with eccentricity= 0.9856.

Figure 3.9: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps for an elliptic Kepler orbit with eccentricity= 0.99.
(Here, we kept the size of the time steps ten times smaller.)

Figure 3.10: Relative energy errors for leapfrog scheme with implicit iterative
block-time steps, and time-symmetric leapfrog scheme for an elliptic
Kepler orbit with eccentricity= 0.99.

It seems that using more than one iteration is less necessary for this scheme than for the time-symmetric leapfrog scheme. Our primitive implicit iterative block-time-step scheme produces promising results for long-term integration in terms of energy conservation. So far we have only applied the idea of time symmetrization to block time steps. This scheme is not yet truly time symmetric, even for two-body cases.

3.1.2 Flip-flop problem

The problem with this approach, as can be seen from the following example, is that we are no longer guaranteed to find convergence for our iteration process. Let $h(\xi_i) = 0.502$ and let the time derivative of $h(\xi_i(t))$ along the orbit be $(d/dt)\,h(\xi_i(t)) = -0.01$. We then get the following results for our attempt at iteration.

\[
\begin{aligned}
\xi_{i+1}^{(0)} &= f(\xi_i, \delta t_i(\xi_i, \xi_i)) = f(\xi_i, \delta t_i(h(\xi_i))) \\
&= f(\xi_i, \delta t_i(0.502)) = f(\xi_i, 0.5)
\end{aligned}
\tag{3.8}
\]

\[
\begin{aligned}
\xi_{i+1}^{(1)} &= f(\xi_i, \delta t_i(\xi_i, \xi_{i+1}^{(0)})) = f(\xi_i, \delta t_i([h(\xi_i) + h(\xi_{i+1}^{(0)})]/2)) \\
&= f(\xi_i, \delta t_i([0.502 + (0.502 + 0.5 \cdot (-0.01))]/2)) \\
&= f(\xi_i, \delta t_i([0.502 + 0.497]/2)) \\
&= f(\xi_i, \delta t_i(0.4995)) = f(\xi_i, 0.25)
\end{aligned}
\tag{3.9}
\]

\[
\begin{aligned}
\xi_{i+1}^{(2)} &= f(\xi_i, \delta t_i(\xi_i, \xi_{i+1}^{(1)})) = f(\xi_i, \delta t_i([h(\xi_i) + h(\xi_{i+1}^{(1)})]/2)) \\
&= f(\xi_i, \delta t_i([0.502 + (0.502 + 0.25 \cdot (-0.01))]/2)) \\
&= f(\xi_i, \delta t_i([0.502 + 0.4995]/2)) \\
&= f(\xi_i, \delta t_i(0.50075)) = f(\xi_i, 0.5)
\end{aligned}
\tag{3.10}
\]

And from here on, $\xi_{i+1}^{(k)} = f(\xi_i, 0.25)$ for every odd value of $k$ and $\xi_{i+1}^{(k)} = f(\xi_i, 0.5)$ for every even value of $k$; the process of iteration will never converge.
Under realistic conditions, for slowly varying $h$ functions and small time steps, this flip-flop behavior will not occur often, but it will occur sometimes, for a non-negligible fraction of the time. We can see this already from the above example: for a linear decrease in the $h$ function of $(d/dt)\,h(\xi_i(t)) = -0.01$, we will get flip-flopping not only for $h(\xi_i) = 0.502$ but for any value in the finite range

\[ 0.50125 < h(\xi_i) < 0.5025. \]

Since iteration converges correctly over the rest of the interval $0.5 < h(\xi_i) < 1$, we conclude that in this particular case flip-flopping occurs about one quarter of one percent of the time over this interval. This is far too frequent to be negligible in a realistic situation.
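The flip-flop of Eqs. (3.8) to (3.10) can be reproduced numerically. The following is a minimal sketch, not the thesis code: it assumes that the block time step is the largest power of two not exceeding the averaged $h$ value (which is consistent with the worked numbers above) and that $h$ varies linearly along the orbit. The names `block_step`, `iterate_steps`, and `flip_flops` are hypothetical.

```python
import math


def block_step(h):
    """Largest power of two 2**-k that does not exceed h (assumed rounding rule)."""
    return 2.0 ** math.floor(math.log2(h))


def iterate_steps(h0, dhdt, n_iter):
    """Block steps chosen at iterations 0..n_iter for h(t) = h0 + dhdt * t."""
    steps = [block_step(h0)]                 # iteration 0 uses h at the start only
    for _ in range(n_iter):
        h_end = h0 + steps[-1] * dhdt        # h at the end of the trial step
        steps.append(block_step(0.5 * (h0 + h_end)))
    return steps


def flip_flops(h0, dhdt, n_iter=6):
    """True if the last two iterations still disagree on the step size."""
    steps = iterate_steps(h0, dhdt, n_iter)
    return steps[-1] != steps[-2]


if __name__ == "__main__":
    print(iterate_steps(0.502, -0.01, 4))    # alternates between 0.5 and 0.25
    print(iterate_steps(0.501, -0.01, 4))    # settles on 0.25
```

Scanning `flip_flops` over a fine grid of starting values in $0.5 < h(\xi_i) < 1$ gives a flip-flop fraction close to the quarter of a percent quoted above.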
Clearly, a straightforward extension of the implicit iterative time symmetrization approach does not work for block time steps, because iteration does not converge. We have to add some extra feature. Our first attempt at a solution is to take the smaller of the two values in a flip-flop situation.

3.1.3 Flip-flop resolution

The most straightforward solution of the flip-flop dilemma is like cutting the Gordian knot: we just take the lower of the two alternating values. The drawback of this solution is that in general we need at least two iterations for each time step, to make sure that we have spotted, and then correctly treated, all flip-flop situations. In general, it is only at the third iteration that it becomes obvious that a flip-flop is occurring. To see this, consider the previous example with a starting value of $h(\xi_i) = 0.501$. In that case we will get $\xi_{i+1}^{(0)} = f(\xi_i, 0.5)$ and $\xi_{i+1}^{(1)} = f(\xi_i, 0.25)$, just as when we started with $h(\xi_i) = 0.502$. The difference shows up only at the second iteration, where we now find $\xi_{i+1}^{(2)} = f(\xi_i, 0.25)$, a value that will hold for all higher iterations as well.
The original iterative approach to time symmetry in practice already gives good results when we use only one iteration. This implies a penalty, in terms of force calculations per time step, of a factor of two compared to non-time-symmetric explicit integration. The use of flip-flop resolution will now force us to always take at least two iterations per step, raising the penalty to at least a factor of three.
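The resolution rule can be sketched as follows. This is an illustration under the same assumptions as the flip-flop example (a linear $h$, and a block step equal to the largest power of two not exceeding the averaged $h$); `block_step` and `resolved_step` are hypothetical names, and the cap of five iterations is only an example value.

```python
import math


def block_step(h):
    """Largest power of two 2**-k that does not exceed h (assumed rounding rule)."""
    return 2.0 ** math.floor(math.log2(h))


def resolved_step(h0, dhdt, max_iter=5):
    """Iterate the block step; on a flip-flop return the smaller of the two values."""
    prev = block_step(h0)                         # iteration 0
    cur = block_step(h0 + 0.5 * prev * dhdt)      # iteration 1: mean of h at both ends
    for _ in range(2, max_iter + 1):              # iterations 2..max_iter
        nxt = block_step(h0 + 0.5 * cur * dhdt)
        if nxt == cur:                            # converged
            return cur
        if nxt == prev:                           # pattern A, B, A: a flip-flop
            return min(cur, nxt)
        prev, cur = cur, nxt
    return min(prev, cur)                         # no decision reached: stay safe


if __name__ == "__main__":
    print(resolved_step(0.502, -0.01))   # flip-flop case, smaller value is taken
    print(resolved_step(0.501, -0.01))   # converges without flip-flop
```

In both of the cases discussed above the decision is only possible once iteration 2 has been computed, which is the origin of the extra cost.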
However, there is a more serious problem: there is still no guarantee that taking the lowest value in a flip-flop situation leads to a time-symmetric recipe. In fact, what is even more important, we have not yet checked whether our symmetric block time-step scheme is really time symmetric in the absence of flip-flop complications.
In order to investigate these questions, let us return to the example we used above, but instead of an $h$ function that is linear in time, let us now use one that is quadratic in time for the estimate of the time step size. Rather than writing a formal definition, let us just state the values, while shifting the time scale so that $t = 0$ coincides with the particle position being $\xi_i$:

h(0.00) = 0.502

h(0.25) = 0.499

h(0.50) = 0.499 (3.11)

When we start at time $t = 0$ and integrate forward, we find:

\[
\begin{aligned}
\xi_{i+1}^{(0)} &= f(\xi_i, \delta t_i(\xi_i, \xi_i)) = f(\xi_i, \delta t_i(h(\xi_i))) \\
&= f(\xi_i, \delta t_i(0.502)) = f(\xi_i, 0.5),
\end{aligned}
\tag{3.12}
\]

\[
\begin{aligned}
\xi_{i+1}^{(1)} &= f(\xi_i, \delta t_i(\xi_i, \xi_{i+1}^{(0)})) = f(\xi_i, \delta t_i([h(\xi_i) + h(\xi_{i+1}^{(0)})]/2)) \\
&= f(\xi_i, \delta t_i([0.502 + 0.499]/2)) \\
&= f(\xi_i, \delta t_i(0.5005)) = f(\xi_i, 0.5),
\end{aligned}
\tag{3.13}
\]

and so on: every further iteration $k$ will result in $\xi_{i+1}^{(k)} = f(\xi_i, 0.5)$. There is no flip-flop situation when moving forward in time.
However, when we now turn the clock backward, after taking this step of half a time unit, we start with the value $h(0.50) = 0.499$, which leads to a first step back of $\delta t = 0.25$. The end point of the first step back is $t = 0.25$, with $h(0.25) = 0.499$. Therefore, here too there is no flip-flop situation: all iterations, while going backward, result in a time step size of $\delta t = 0.25$.
Thus we have constructed a counterexample, where forward integration would proceed with time step $\delta t = 0.5$ and subsequent backward integration would proceed with time step $\delta t = 0.25$. Clearly, our scheme is not yet time symmetric, even in the absence of a flip-flop case.
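The counterexample can be checked directly from the three tabulated values of Eq. (3.11). The sketch below is illustrative only: it assumes the same rounding rule as the worked numbers (largest power of two not exceeding the averaged $h$), and the names `H_TABLE`, `block_step`, and `converged_step` are hypothetical.

```python
import math

# Tabulated h values from Eq. (3.11); keys are times, values are h at that time.
H_TABLE = {0.0: 0.502, 0.25: 0.499, 0.5: 0.499}


def block_step(h):
    """Largest power of two 2**-k that does not exceed h (assumed rounding rule)."""
    return 2.0 ** math.floor(math.log2(h))


def converged_step(t, direction, n_iter=3):
    """Iterated block step starting at time t; direction is +1 (forward) or -1 (backward)."""
    step = block_step(H_TABLE[t])                    # iteration 0
    for _ in range(n_iter):
        t_end = t + direction * step                 # end point of the trial step
        step = block_step(0.5 * (H_TABLE[t] + H_TABLE[t_end]))
    return step


if __name__ == "__main__":
    print(converged_step(0.0, +1))   # forward step from t = 0
    print(converged_step(0.5, -1))   # backward step from t = 0.5
```

The forward and backward calls return different step sizes even though neither of them flip-flops, which is exactly the asymmetry described above.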

3.1.4 A different approach for implicit iterative block time steps: extended blocks

As we have seen, there are situations where, starting with a smaller time step choice, the next iteration requires a time step twice as long, while starting with that longer time step choice, the next iteration requires a time step half as long.
One possible solution would be to allow intermediate time step values. For example, in between the allowed values of 1/4 and 1/2 we could take the arithmetic mean 3/8, or the geometric mean $\sqrt{2}/4$, or the harmonic mean 1/3, or any other kind of intermediate value. The advantage would be that this is likely to avoid flip-flopping for sufficiently small time steps and sufficiently smooth $h$ behavior. The drawback would be that one such time step would bring a particle out of synchronization with all other particles.
Since it is clear that we have to pay some price somewhere, let us consider extending the spectrum of allowed time steps, but in a minimal way: let us make sure that we allow only one extra value in between each pair of successive powers of two. The most straightforward way to choose the intermediate time step size, in between successive power-of-two fractions of the maximal time step, is to use the arithmetic mean, just as was done in the original scheme proposed for continuous time steps by [19]. Let $h(\xi_i) = 1/2^k$ and $h(\xi_{i+1}) = 1/2^l$, where $k$ and $l$ are positive integers. The new time step $\delta t_n$ can then be defined as:

\[
\delta t_n = \frac{1}{2}\left(\frac{1}{2^k} + \frac{1}{2^l}\right) = \left(\frac{2^k}{2^l} + 1\right)\frac{1}{2^{k+1}} = \mu\,\frac{1}{2^k} \tag{3.14}
\]

where $\mu = \frac{1}{2}\left(\frac{2^k}{2^l} + 1\right)$. Apart from the factor $\mu$, the rest of the expression is still a valid power of two, as before. In principle, $\mu$ can take the range of values depicted in Fig. 3.11, namely 1, 2, 3, 4, or 5, corresponding to the one-iteration values for $\delta t_n$ as 5/8, 3/4, 1, 3/2, and 5/2, in that order.
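As a quick check of Eq. (3.14), the factor $\mu$ can be evaluated with exact rational arithmetic. The sketch below is only an illustration; the function names `mu` and `mean_step` are hypothetical. It lists $\mu$ for $l - k = 2, 1, 0, -1, -2$, where each value is the new step expressed in units of $1/2^k$.

```python
from fractions import Fraction


def mu(k, l):
    """Factor mu = (2**k / 2**l + 1) / 2 from Eq. (3.14)."""
    return (Fraction(2) ** (k - l) + 1) / 2


def mean_step(k, l):
    """Arithmetic mean of the block steps 1/2**k and 1/2**l."""
    return (Fraction(1, 2 ** k) + Fraction(1, 2 ** l)) / 2


if __name__ == "__main__":
    k = 3
    for d in (2, 1, 0, -1, -2):
        l = k + d
        # mean_step(k, l) equals mu(k, l) / 2**k for every pair (k, l)
        print(d, mu(k, l), mean_step(k, l))
```

For the pair of steps 1/2 and 1/4 (`k = 1`, `l = 2`), `mean_step` returns 3/8, the arithmetic mean mentioned earlier in this section.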

Figure 3.11: Block time steps with nearest neighbors.

In practice, for moderately smooth $h$ behavior and sufficiently small time steps, the values of $k$ and $l$ will not differ by more than one, and therefore the only non-standard $\mu$ coefficient will be 3, thus allowing only the value of 3/4 in between, say, 1/4 and 1/2. We can simplify our algorithm with these coefficients as follows:

\[
\delta t_n =
\begin{cases}
\frac{3}{2}\, h(\xi_i) & \text{if } h(\xi_{i+1}) > 2h(\xi_i) \\
\frac{3}{4}\, h(\xi_i) & \text{else if } h(\xi_{i+1}) < h(\xi_i) \\
h(\xi_i) & \text{else.}
\end{cases}
\tag{3.15}
\]

In Fig. 3.12 we show the time-symmetric block time steps calculated in two ways for an elliptic Kepler orbit with eccentricity 0.99 using the leapfrog scheme. Here, "first version" refers to the original way of computing symmetric block time steps. It is clear from the figure that non-block time steps appear at only a few points (as single ∗ symbols; the remaining symbols overlap and appear as lines).
Our second approach, on the other hand, makes possible a much smoother transition between two different block time steps. These non-block time steps are temporary compromise time steps, used only to protect time symmetry. We can restrict these compromise time steps to the places where we really need them.
For numerical tests, we first used the Kepler two-body problem with different eccentricities. Here, the gravitational constant and the total mass of the two-body system are set to unity for simplicity. Integrations are performed with the leapfrog scheme.
In Fig. 3.13 we show the energy errors of the two approaches to symmetric block time steps for the previous problem. Results are shown for only one orbital period. Even over this short period, a small energy error grows in the first version.

Figure 3.12: Time steps for leapfrog scheme calculated by two versions of implicit
iterative block-time steps for an elliptic Kepler orbit with eccentricity=
0.99 for one orbital period.

In Fig. 3.14 we plot the energy errors over 10000 time units for the same problem, to see the long-term behavior of the algorithms. Our first approach gives significantly worse errors than the second one. It is also possible to obtain acceptable results with the first approach for the same problem by using ten times smaller time steps, but such adjustments require a high computational effort. The second version does not need them.
Additionally, in Fig. 3.15 we show the energy errors for both block time step schemes and for the time-symmetric algorithm, all with the leapfrog scheme. It can easily be seen that the second approach gives better results than both the first approach and the symmetric time step leapfrog scheme. We obtained similar results in tests with different eccentricities.
After the tests for periodic two-body Kepler orbits, to illustrate the improved accuracy for the N-body problem, we applied our algorithm to an eight-body system with Plummer model initial conditions, using the leapfrog integration scheme. Fig. 3.16 illustrates the results for this eight-body system for the leapfrog scheme with block time steps and with symmetric time steps, in both cases with a variable time step shared by all particles. The improvement in accuracy is clearly seen in Fig. 3.16.

Figure 3.13: Relative energy errors for leapfrog scheme with two versions of implicit
iterative block-time steps for an elliptic Kepler orbit with eccentricity=
0.99 for one orbital period.

Figure 3.14: Relative energy errors for leapfrog scheme with two versions of implicit
iterative block-time steps for an elliptic Kepler orbit with eccentricity=
0.99 for 10000 time units.

Figure 3.15: Relative energy errors for leapfrog scheme with two versions of implicit
iterative block-time steps, and time-symmetric leapfrog scheme for an
elliptic Kepler orbit with eccentricity= 0.96.
Even though our two-body tests show a clear symmetrization for iterative block steps, and the 8-body cases show an improvement in energy errors, we have to continue our investigations to achieve fully symmetrized block time steps.

3.1.5 A first attempt at a solution

Let us rethink the whole procedure. The basic problem has been that the very first step in any of the algorithms proposed so far has not been time symmetric. The very first step moves forward, and leads to a newly evolved system at the end of the first step. Only after making such a trial integration do we look back and try to restore symmetry. However, as we have seen, there is a large danger that this trial integration is not exhaustive; it may already go too far, or not far enough, and thereby it may simply overlook a type of move that the same algorithm would make if we started out in the time-reversed direction.

Figure 3.16: Relative energy errors for leapfrog scheme with shared block-time steps
(second version), and time-symmetric leapfrog scheme for 8 body runs.
Here, Plummer model based initial conditions are used.

Formulating the problem in this way immediately suggests a solution. At any point in time, let us first try to make the largest step that is allowed. If that step turns out to be too large for our algorithm, we try a step that is half that size. If that step is still too large, we again halve the size, and so on, until we find a step size that agrees with our algorithm when evaluated in both time directions. A similar treatment has been described by [32].
This type of approach is clearly more symmetric than what we have attempted so far. Instead of using information about the physical system at the starting point of the next integration step, we only use a mathematical criterion to find the largest time step size allowed at that point, and we then apply the physical criteria symmetrically in both directions.
Let us give an example. If the largest time step size is chosen to be unity, then at time $t = 0$ we start by considering this time step. We try, in this order, $\delta t = 1$, $\delta t = 0.5$, $\delta t = 0.25$, and so on, until we find a time step for which integration starting in the forward direction and integration starting in the backward direction both result in the new time step being acceptable. Let us say that this is the case for $\delta t = 0.125$.

After taking this step, we are at time $t = 0.125$. The largest time step allowed at that point, forward or backward, is $\delta t = 0.125$. Any larger time step would result in non-alignment of the block time steps: in the backward direction it would jump over $t = 0$. So at this point we start by considering $\delta t = 0.125$ once more. If that time step is too large, we try half that time step, halving it successively until we find a satisfactory time step size.
Imagine that the second time step size is also $\delta t = 0.125$. In that case, we land at $t = 0.25$. From there on, the maximum allowed time step size is $\delta t = 0.25$, so the first try should be that size.
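The selection rule of this first attempt can be sketched as follows. This is an illustration, not the thesis implementation: times are assumed to lie on the block grid (dyadic fractions of the maximum step), and the predicate `acceptable(t, dt)` is a hypothetical stand-in for the physical time step criterion evaluated in both time directions.

```python
from fractions import Fraction


def largest_aligned_step(t, dt_max=Fraction(1), max_halvings=60):
    """Largest dt = dt_max / 2**k such that t is an integer multiple of dt."""
    t = Fraction(t)
    dt = Fraction(dt_max)
    for _ in range(max_halvings):
        if (t / dt).denominator == 1:      # t is an integer multiple of dt
            return dt
        dt /= 2
    raise ValueError("t is not on the block time grid")


def first_attempt_step(t, acceptable, dt_max=Fraction(1), max_halvings=60):
    """Start from the largest aligned step and halve until `acceptable` holds."""
    dt = largest_aligned_step(t, dt_max, max_halvings)
    for _ in range(max_halvings):
        if acceptable(t, dt):
            return dt
        dt /= 2
    raise ValueError("no acceptable step found")


if __name__ == "__main__":
    # Toy criterion (an assumption for this example): steps up to 1/8 are fine.
    ok = lambda t, dt: dt <= Fraction(1, 8)
    for t in (Fraction(0), Fraction(1, 8), Fraction(1, 4)):
        print(t, largest_aligned_step(t), first_attempt_step(t, ok))
```

Note that `largest_aligned_step(1)` returns 1 no matter how small the steps leading up to $t = 1$ were, since the rule looks only at the alignment of the current time.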
In principle, this approach seems to be really time symmetric. However, there is a huge problem with this type of scheme as we have just formulated it. Imagine the system crawling along with time steps of, say, $\delta t = 1/1024$, and reaching time $t = 1$. Our new recipe then suggests starting by trying $\delta t = 1$, a 1024-fold increase in time step. Whatever subtle physical effect it was that forced us to take such small time steps is completely ignored by the mathematical recipe that forces us to look at such a ridiculously large time step.
For example, in the case of stellar dynamics, a double star may force the stars that orbit each other to take time steps that are necessarily far shorter than the orbital period. Starting out with a trial step size that is far larger than an orbital period may or may not give spuriously safe-looking results. Clearly, we have to exclude such enormous jumps in time step.

3.1.6 A second attempt at a solution

The simplest solution to taming sudden unphysical increases in time steps is to allow at most an increase of a factor of two, in either the forward or the backward direction. This then implies that we can only allow decreases of a factor of two, and not more than two, in either direction. The reason is that a decrease of a factor of four in one direction in time would automatically translate into an increase of a factor of four in the other direction.

Note that we have to be careful with our time step criterion. If we allow time steps that are too large, we may encounter situations where our time step criterion would suggest shrinking time steps by a factor of four from one step to the next. Since our algorithm does not allow this, we can at most shrink by a factor of two, which may imply an unacceptably large step. However, if our time step criterion is sufficiently strict, allowing only reasonably small time steps to start with, it will be able to resolve the gradients in the criterion in such a way as to handle all changes gracefully through halving and doubling.
When we apply this restriction to the scheme outlined in the previous subsection, we arrive at the following compact algorithm.
First a matter of notation. Any block time step, of size $\delta t = 1/2^k$, connects two points in time, only one of which can be written as $t = Z/2^{k-1}$, with $Z$ an integer. Let us call that time value an even time, from the point of view of the given time step size, and let us call the other time value an odd time. As an example, if $\delta t = 0.125$, then $t = 0, 0.25, 0.5, 0.75, 1$ are all even times, while $t = 0.125, 0.375, 0.625, 0.875$ are all odd times.
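The even/odd distinction amounts to asking whether $t$ is an integer multiple of $2\,\delta t$. A minimal sketch, with the hypothetical helper name `is_even_time` and exact rational arithmetic to avoid rounding issues:

```python
from fractions import Fraction


def is_even_time(t, dt):
    """True if t is an integer multiple of 2*dt, i.e. t = Z / 2**(k-1) for dt = 1 / 2**k."""
    return (Fraction(t) / (2 * Fraction(dt))).denominator == 1


if __name__ == "__main__":
    dt = Fraction(1, 8)
    for n in range(9):
        t = n * dt
        print(t, "even" if is_even_time(t, dt) else "odd")
```

Whether a given time is even or odd depends on the step size: $t = 0.25$ is even with respect to $\delta t = 0.125$ but odd with respect to $\delta t = 0.25$.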

Here is our time-symmetric block time step algorithm (TSBTS):

* When we start in a given direction in time, at a given point in time, we should determine the time step size of the last step made by the system. In that way, we can determine whether the current time is even or odd with respect to that last time step.

* If the current time is odd, our only choice is between continuing with the same time step size and halving it. First, we try to continue with the same time step. If, upon iteration, that time step qualifies according to the time-symmetry criterion used before, Eq. (3.7), we continue to use the same time step size as in the previous time step. If not, we use half of the previous time step.

* If the current time is even, we have a choice between three options for the new time step size: doubling the previous time step size, keeping it the same, or halving it. We first try the largest value, given by doubling. If Eq. (3.7) shows us that this larger time step is not too large, we accept it; otherwise we consider keeping the time step size the same. If Eq. (3.7) shows us that keeping the time step size the same is acceptable, we accept that choice; otherwise we just halve the time step, in which case no further testing is needed.

Note that in this scheme, we always start with the largest possible candidate value for the time step size [33]. Subsequently, we may consider smaller values, but the direction of consideration is always from larger to smaller, never from smaller to larger. This guarantees that we do not run into the flip-flop problem mentioned above.
Algorithm 2 Symmetrization Scheme for Block Time Steps
for i = 1 to number of iterations do
    if time == odd time then
        if dt_i ≠ Δt_n then
            dt_i = dt_i / 2
        end if
    end if
    if time == even time then
        if dt_i < Δt_n then
            dt_i = dt_i ∗ 2
        end if
        if dt_i == Δt_n then
            dt_i = Δt_n
        end if
        if dt_i > Δt_n then
            dt_i = dt_i / 2
        end if
    end if
end for
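The three rules of the TSBTS algorithm can be written as a single step-selection function. The sketch below follows the prose description rather than any particular implementation: `acceptable(t, dt)` is a hypothetical placeholder for the iterated time-symmetry criterion of Eq. (3.7), and `dt_max` is an assumed upper limit on the step size.

```python
from fractions import Fraction


def is_even_time(t, dt):
    """True if t is an integer multiple of 2*dt."""
    return (Fraction(t) / (2 * Fraction(dt))).denominator == 1


def tsbts_step(t, dt_prev, acceptable, dt_max=Fraction(1)):
    """Choose the next block step: candidates are always tried from larger to smaller."""
    t, dt_prev = Fraction(t), Fraction(dt_prev)
    candidates = []
    if is_even_time(t, dt_prev) and 2 * dt_prev <= dt_max:
        candidates.append(2 * dt_prev)     # doubling is allowed only at even times
    candidates.append(dt_prev)             # keeping the step is always a candidate
    for dt in candidates:
        if acceptable(t, dt):
            return dt
    return dt_prev / 2                     # halving is accepted without further testing


if __name__ == "__main__":
    # Toy criterion (an assumption for this example): steps up to 1/4 are fine.
    ok = lambda t, dt: dt <= Fraction(1, 4)
    print(tsbts_step(Fraction(1, 4), Fraction(1, 8), ok))   # even time: may double
    print(tsbts_step(Fraction(1, 8), Fraction(1, 8), ok))   # odd time: cannot double
```

Because the step can change by at most a factor of two per call and larger candidates are always tested first, the selection cannot alternate between two values in the way the iterative scheme did.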

3.2 Numerical Tests for 2-Body Problems

Before we check how effective our algorithm is for N-body cases, it is better to check it first for two-body cases. We know what the graphs of the energy errors look like in the two-body case when energy is preserved.

3.2.1 2-body tests for leapfrog

We present here the results for a gravitational two-body integration. The relative orbit of the two point masses forms an ellipse with an eccentricity of $e = 0.99$. We have chosen a time unit such that the period of the orbit is $T = 2\pi$.
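For orientation, the test orbit can be set up as in the following sketch. It rests on assumptions that are not stated in this subsection: the product of the gravitational constant and the total mass is taken as unity (as in the earlier Kepler tests), so that a period of $2\pi$ corresponds to a semi-major axis $a = 1$, and the relative orbit is started at apocenter. All function names are hypothetical.

```python
import math


def apocenter_state(e, a=1.0, gm=1.0):
    """Relative position and velocity at apocenter of a Kepler ellipse (assumed start point)."""
    r = a * (1.0 + e)
    v = math.sqrt(gm * (1.0 - e) / (a * (1.0 + e)))
    return (r, 0.0), (0.0, v)


def specific_energy(pos, vel, gm=1.0):
    """Energy per unit reduced mass of the relative orbit."""
    r = math.hypot(pos[0], pos[1])
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) - gm / r


def orbital_period(a=1.0, gm=1.0):
    """Kepler's third law."""
    return 2.0 * math.pi * math.sqrt(a ** 3 / gm)


def relative_energy_error(energy, energy0):
    """Relative energy error with respect to the initial energy."""
    return (energy - energy0) / abs(energy0)


if __name__ == "__main__":
    pos, vel = apocenter_state(0.99)
    print(specific_energy(pos, vel), orbital_period())
```

With these choices the initial specific energy is $-1/(2a) = -0.5$, which provides the reference value for the relative energy error.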
We have implemented four different integration schemes:
0) The original time-symmetric integration scheme described by [19], where there is a continuous choice of time step size. This is the approach described in section 3.1. We have used five iterations for each step.
1) A block-time-step generalization, with a fixed number of iterations. This is the approach analyzed in section 3.1.2. Here, we chose five iterations for each step.
2) A block-time-step generalization, with a variable number of iterations. If after five iterations the fourth and the fifth iterations still give different block time step sizes, then we choose the smaller of the two. This recipe avoids flip-flop situations. It is the approach described in section 3.1.3.
The algorithm described in the next section, 3.1.5, was not implemented here, because it is guaranteed to lead to large errors in those cases where a new large time step is allowed again just before pericenter passage. We therefore switched directly to the following section:
3) The implementation of our favorite algorithm, where we start with a truly time-symmetric choice of time step, with the restrictions that we only allow changes of a factor of two in the direction of increasing and decreasing the time step, and that we only allow an increase of time step on the so-called even time boundaries. This is the approach given in section 3.1.6.
In Fig. 3.17 and Fig. 3.18 we show the results of integrating our highly eccentric binary with these four integration schemes. In each case, the largest errors are produced by algorithm 1, smaller errors are produced by algorithm 2, and even smaller errors appear with algorithm 3. Finally, algorithm 0 gives the smallest errors.

Figure 3.17: Relative energy errors for a two-body integration of a bound orbit with
eccentricity e = 0.99. The top line with highest slope corresponds to
algorithm 1, the line with intermediate slope corresponds to algorithm 2,
and below those the two lines for algorithms 0 and 3 are indistinguishable
in this figure.

Fig. 3.17 shows the energy error in the two-body integration as a function of time. As is generally the case for time-symmetric integration, the errors that occur during one orbit are far larger than the systematic error that is generated over a full orbit. To bring this out more clearly, Fig. 3.18 shows the error only once per orbit, at apocenter, the point in the orbit where the two particles are separated furthest from each other and the error is smallest.
Finally, Fig. 3.19 shows the same data as Fig. 3.18, but for a period of time that is ten times longer. In both Fig. 3.18 and Fig. 3.19, it is clear that the first two block time step algorithms, 1 and 2, both show a linear drift in energy. This is a clear sign that they violate time symmetry. Note that in both figures algorithm 3 gives rise to a time dependence that looks like a random walk. This may well be the best that can be done with block time steps when we require time symmetry.

Figure 3.18: Relative energy errors at apo-center. The four lines, from top to bottom,
correspond to algorithms 1, 2, 3, and 0.

Figure 3.19: Same as Fig.3.18, but for a duration that is ten times longer.

3.2.2 2-body tests for fourth-order Hermite

Here we use 1, 3, 5, and 7 iterations with the TSBTS algorithm. The accuracy parameter $\eta$ is chosen as 0.01, and the maximum allowed time step is 1/64. Here, 1 iteration means standard individual fourth-order Hermite integration with block time steps. The duration of each integration is 1000 time units.

Figure 3.20: Relative energy errors for a Kepler problem with eccentricity e =
0.984375 for fourth-order Hermite integration with and without TSBTS
algorithm.

In Fig. 3.20 we show the results of integrating a highly eccentric Kepler problem with the fourth-order Hermite integration scheme. The TSBTS versions are much better than the 1-iteration case, as in the leapfrog tests.
Fig. 3.21 shows the energy errors for the same Kepler problem for 3, 5, and 7 iterations at the apocenter points. As we expected, 5 and 7 iterations show clearly better conservation than 3 iterations.
Fig. 3.22 shows the energy errors for 2 bodies with Plummer-type realizations. Both bodies have initial velocities. We use standard N-body units. We have integrated each system for 1000 time units, with a maximum time step of 1/64. We used 1, 3, 5, and 7 iterations. There is a great improvement in energy errors with the TSBTS algorithm.

Figure 3.21: Relative energy errors for a Kepler problem with eccentricity e =
0.984375 for fourth-order Hermite integration with TSBTS algorithm.

Figure 3.22: Relative energy errors for a 2 body problem for fourth-order Hermite
integration with and without TSBTS algorithm.

Figure 3.23: Relative energy errors for a 2 body problem for fourth-order Hermite
integration with only TSBTS algorithm.

Finally, Fig. 3.23 shows the energy errors for the same 2 bodies with 3, 5, and 7 iterations only. This figure shows only the last 10 of the total 1000 time units, to bring out the small differences between different numbers of iterations.

3.2.3 2-body tests for sixth-order Hermite

Here, we use the sixth-order Hermite scheme with the TSBTS algorithm. We use the same Kepler and 2-body problems. Our constants, namely the maximum allowed time step, the duration of the integrations, and the accuracy parameter, are the same as in the previous tests.
In Fig. 3.24, we show the results of integrating the Kepler problem with the sixth-order Hermite integration scheme. We used 1, 3, 5, and 7 iterations. A linear drift appears in the 1-iteration case, as we expected.
Fig. 3.25 shows what happens with different numbers of iterations in more detail. Although we would prefer to see fully time-symmetric behavior as in the leapfrog and fourth-order cases, we see random-walk behavior for the different numbers of iterations after the first 100 time units. We obtain exactly the same results for 5 and 7 iterations.

Figure 3.24: Relative energy errors for a Kepler problem with eccentricity e =
0.984375 for sixth-order Hermite integration with and without TSBTS
algorithm for different number of iterations.
In Fig. 3.26, we show the energy errors for the same 2-body problem we used before. There are huge differences between the symmetrized versions and the 1-iteration case. The pure individual time step case again shows a linear drift spreading over a wide range. The results for 3, 5, and 7 iterations appear to coincide with the x axis.
Fig. 3.27 shows the energy errors for the same problem, but only for 3, 5, and 7 iterations with the TSBTS algorithm. As in the Kepler problem, 5 and 7 iterations produce the same results. In this figure the results show an almost linear drift, but the results for 5 and 7 iterations keep their random-walk behavior for the first 400 time units.
In the case of the fourth-order scheme, we use predictions with errors $O(\Delta t)^4$ for positions and $O(\Delta t)^6$ for interpolations to achieve time symmetry. Here, we use predictions with $O(\Delta t)^5$ errors and $O(\Delta t)^8$ interpolations for position vectors in a sixth-order scheme. For this reason, we lose efficiency in the prediction steps, and we cannot get fully time-symmetric behavior in the energy errors. However, we still have clear improvements in the errors with the TSBTS algorithm. Interpolation routines will be given with the N-body implementations.

Figure 3.25: Relative energy errors for a Kepler problem with eccentricity e =
0.984375 for sixth-order Hermite integration with TSBTS algorithm.

Figure 3.26: Relative energy errors for a 2 body problem for sixth-order Hermite
integration with TSBTS algorithm.

Figure 3.27: Relative energy errors for a 2 body problem for sixth-order Hermite
integration with TSBTS algorithm for 3,5, and 7 iterations.

3.2.4 2-body tests for eighth-order Hermite

Here, we use the eighth-order Hermite scheme with the TSBTS algorithm. We use the same Kepler and 2-body problems and the same constants as in the previous tests for the maximum allowed time step, the duration of the integrations, and the accuracy parameter.
Fig. 3.28 shows the energy errors for the TSBTS algorithm with 3, 5, and 7 iterations, and for the basic block time step algorithm, for a highly eccentric Kepler problem. The TSBTS algorithm with the eighth-order scheme again produces clearly better energy errors than the block time step scheme.
Fig. 3.29 shows the energy errors for the TSBTS algorithm only, with 3, 5, and 7 iterations. Here, we see an almost linear drift in the energy errors for all three numbers of iterations. These growing errors come from the predictions, whose order of accuracy is low for an eighth-order scheme.
Fig. 3.30 shows the energy errors for 2 bodies with Plummer-type realizations. We again use 3, 5, and 7 iterations with the TSBTS algorithm, and the 1-iteration case for the basic block time step scheme. We get similar improvements in energy errors as in the Kepler problem cases. Fig. 3.31 shows the results for 3, 5, and 7 iterations. We got exactly the same results for the different numbers of iterations, in the form of linearly growing errors. Higher numbers of iterations make no big difference in this case.

Figure 3.28: Relative energy errors for a Kepler problem with eccentricity e =
0.984375 for eighth-order Hermite integration with and without TSBTS
algorithm.

Figure 3.29: Relative energy errors for a Kepler problem with eccentricity e =
0.984375 for eighth-order Hermite integration with TSBTS algorithm.

Figure 3.30: Relative energy errors for a 2 body problem for eighth-order Hermite
integration with TSBTS algorithm versus block time steps.
The reason for the linearly growing errors is the order of the predictions, as in the sixth-order case. Our predictions are of fifth order for the eighth-order integration. Even though this order is sufficient for the scheme itself, we need at least seventh-order predictions to obtain fully time-symmetric results, and the interpolation routines also have to be regenerated. Using higher-order predictions and regenerating the interpolations for the sixth- and eighth-order schemes are beyond the scope of the work described in this thesis. We use these schemes only to show that the TSBTS algorithm works well with higher-order integration schemes, even though they are not fully time symmetric.

Figure 3.31: Relative energy errors for a 2 body problem for eighth-order Hermite
integration with TSBTS algorithm.

4. N-BODY IMPLEMENTATION

So far, we have discussed the implementation of our block-symmetric algorithm for individual time steps in the case of the two-body problem. Of course, for $N = 2$, it is not really necessary to use block time steps, nor is it useful to introduce individual time steps. The reason we made both of these extensions was to implement and test our basic ideas in the simplest case. In this chapter, we first describe and test our algorithm for the general N-body problem with the leapfrog scheme. After the construction of the TSBTS algorithm, we test it with higher-order schemes, namely 4th-, 6th-, and 8th-order Hermite integration.

4.1 Divide and Conquer: the Concept of an Era

The major conceptual difficulty in designing a time-symmetric block time step scheme is the global context information that is needed, with extensions toward the future as well as the past. In order to determine the time step for particle $i$ at time $t$, we need the information of all other particles $j$ at that time. In general, there will be at least some $j$ values for which the position and velocity of particle $j$ are not given at time $t$, because particle $j$ has a larger time step than particle $i$ and time $t$ happens to fall within the duration of one time step of particle $j$.

In such a case, the time step of particle $i$ at time $t$ depends on the positions and velocities of other particles $j$, which can only be determined from time-symmetric interpolation between the positions and velocities of each particle $j$ at times earlier and later than $t$. However, the future $j$ positions and velocities depend in turn on the orbit of particle $i$, and thus on the time step of particle $i$ at time $t$. In other words, there is a circular dependence between the future positions and velocities of particles $j$ and the time step of particle $i$.
To make things worse, each of the future positions and velocities of any of the particles will in turn depend on information that is given even further in the future. If we continue this logic, we would have to know the complete future of a whole simulation before we could attempt to time symmetrize that whole history. And while any simulation will stop at a finite time, so that the number of time steps for each particle will be finite, it is clearly impractical to let the very first time step depend on the positions and velocities of the particles at the very end of the simulation.

A more practical solution is to impose a maximum size for any time step, $\Delta t_{\max}$.
If we start the simulation at time $t_0$, we know that all particles will reach time
$t_1 = t_0 + \Delta t_{\max}$ by making one or more steps. At that time, all particles will be
synchronized. This means that we can focus on time-symmetric orbit integration
for all particles during the interval $[t_0, t_1]$, without the need for any information
about any particle at any time $t > t_1$.

In other words, we divide and conquer: we split the total history of our simulation
into a number of smaller periods, each of which we call an era. Each era extends
over a period of time equal to the largest allowed time step $\Delta t_{\max}$, or to an integer
multiple of $\Delta t_{\max}$, whichever turns out to be the most convenient.
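
The division into eras can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `era_boundaries` and the parameter `multiple` are hypothetical, and exact rational arithmetic is used here merely to avoid accumulating round-off in the era edges.

```python
from fractions import Fraction


def era_boundaries(t_start, t_end, dt_max, multiple=1):
    """Return the era edges [t0, t1, ...] covering [t_start, t_end].

    Each era has length multiple * dt_max (hypothetical parameter names);
    the last era is truncated if t_end is not a whole number of eras away.
    """
    era = Fraction(dt_max) * multiple
    if era <= 0:
        raise ValueError("era length must be positive")
    t_end = Fraction(t_end)
    edges = [Fraction(t_start)]
    while edges[-1] < t_end:
        # every particle is synchronized at each of these edges
        edges.append(min(edges[-1] + era, t_end))
    return edges


if __name__ == "__main__":
    # four eras of length 1/64 between t = 0 and t = 1/16
    print(era_boundaries(0, Fraction(1, 16), Fraction(1, 64)))
```

Within each pair of consecutive edges the integration can then be iterated without reference to any later time.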

4.2 Description of the Algorithm with Leapfrog

Let the beginning of a single era be $t_0$ and the end $t_1$. We can assume for simplicity
that the zero point in time has been chosen in such a way as to be compatible
with the era size, so that both $t_0$ and $t_1$ are integer multiples of the time interval
$t_1 - t_0$.

As we saw above, in order to obtain the time step size for particle $i$ at time $t$
within our era, we typically need to know the positions and velocities of some
other particles at times larger than $t$. The simplest way to provide this future
information is through iteration.

If one particle $i$ wants to step forward in time, we need to know the positions of
all particles $j$ in order to compute the force that each particle $j$ exerts on particle
$i$. Among the particles $j$, many may have a larger time step than particle $i$, so
we may have to predict the position of such a particle for the time at which
particle $i$ wants to make a step. In addition, we need to predict the velocity of
particle $j$, because the velocity difference between particles $i$ and $j$ is used in
determining the time step size, according to Eq. (4.7).
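
The predicted position $\mathbf{r}_{p,j}$ for particle $j$ is obtained with a second-order Taylor
expansion, while for the predicted velocity $\mathbf{v}_{p,j}$ a first-order expansion suffices:

\[
\begin{aligned}
\mathbf{r}_{p,j} &= \mathbf{r}_j + \mathbf{v}_j (t - t_j) + \frac{1}{2}\,\mathbf{a}_j (t - t_j)^2, \\
\mathbf{v}_{p,j} &= \mathbf{v}_j + \mathbf{a}_j (t - t_j).
\end{aligned}
\tag{4.1}
\]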
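
As a minimal sketch, the prediction of Eq. (4.1) can be written as follows. The function name `predict` and the use of plain Python lists for vectors are assumptions made only for this illustration.

```python
def predict(r_j, v_j, a_j, t_j, t):
    """Predict position and velocity of particle j at time t, as in Eq. (4.1).

    r_j, v_j, a_j are the stored position, velocity and acceleration of
    particle j at its own time t_j (lists of floats of equal length).
    """
    dt = t - t_j
    # second-order Taylor expansion for the position
    r_p = [r + v * dt + 0.5 * a * dt * dt for r, v, a in zip(r_j, v_j, a_j)]
    # first-order Taylor expansion for the velocity
    v_p = [v + a * dt for v, a in zip(v_j, a_j)]
    return r_p, v_p
```

For example, `predict([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], 0.0, 0.5)` moves the particle half a time unit along its velocity and adds the quadratic acceleration term in the second component.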

The integrated positions and velocities for each particle $i$ at each time step,
$\mathbf{r}_{new}$ and $\mathbf{v}_{new}$ in Eq. (4.2), are all stored. First, we just perform standard forward
integration, with the usual kind of non-time-symmetric block time step algorithm,
for the complete duration of our era ($t_0 < t < t_1$).

In the first calculations reported in this chapter, we have used the same
predictor-corrector form of the leapfrog algorithm as mentioned earlier in
Eq. (3.5):

\[
\begin{aligned}
\mathbf{r}_{new} &= \mathbf{r}_{old} + \mathbf{v}_{old}\,\Delta t + \frac{1}{2}\,\mathbf{a}_{old}\,\Delta t^2, \\
\mathbf{v}_{new} &= \mathbf{v}_{old} + \frac{1}{2}\,(\mathbf{a}_{old} + \mathbf{a}_{new})\,\Delta t.
\end{aligned}
\tag{4.2}
\]
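
The leapfrog step of Eq. (4.2) can be sketched for a single particle as below. The callback `acc`, which returns the acceleration at a given position, is a hypothetical stand-in for the full force calculation over all other particles.

```python
def leapfrog_step(r_old, v_old, a_old, dt, acc):
    """One predictor-corrector leapfrog step in the form of Eq. (4.2).

    acc(r) is an assumed callback returning the acceleration (a list)
    at position r; in an N-body code it would sum over all particles j.
    """
    r_new = [r + v * dt + 0.5 * a * dt * dt
             for r, v, a in zip(r_old, v_old, a_old)]
    a_new = acc(r_new)  # force evaluated at the new position
    v_new = [v + 0.5 * (a0 + a1) * dt
             for v, a0, a1 in zip(v_old, a_old, a_new)]
    return r_new, v_new, a_new


if __name__ == "__main__":
    # one step of a unit harmonic oscillator, a = -x, as a toy force law
    print(leapfrog_step([1.0], [0.0], [-1.0], 0.1, lambda r: [-x for x in r]))
```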


While simultaneously integrating the orbits of all particles, we store the positions
and velocities (and if necessary higher-order time derivatives) for all time steps
for all particles during our era. This will then allow us to obtain the position
and velocity of any particle at any arbitrary time through interpolation, to the
accuracy given by this first try, which will function as our zeroth iteration.

Next we make our first iteration. We again perform orbit integration for our
complete $N$-body system during $t_0 < t < t_1$. However, there are two differences
with respect to the first attempt. First of all, in order to calculate the force
from particle $j$ on particle $i$, we no longer extrapolate the orbit of particle $j$ to
the time requested by $i$, but instead we interpolate the position and velocity of
particle $j$ to the requested time, using stored positions and velocities of particle $j$
at slightly earlier and later times and a time-symmetric interpolation scheme.

To calculate the force at the new time for particle $i$, we use the positions of the
other particles $j$ calculated by interpolation, based on the previous iteration, in
the following way. The interpolation itself is done in a straightforward, linear
way. However, since we have a more accurate position at hand for at least the
starting point of each orbit segment for particle $j$, we may as well correct the old
orbit segment by shifting it rigidly by an amount equal to the difference between
the starting points of the current and the previous iteration.

To be specific, the interpolated position at time $t$ for particle $j$ is given by:

\[
\mathbf{r}_p = (1 - f)\,\mathbf{r}_s + f\,\mathbf{r}_e + \Delta\mathbf{r}_s, \tag{4.3}
\]

where $f = (t - t_s)/(t_e - t_s)$, $t_s$ is the largest time in the list of times for particle $j$
not exceeding $t$, and $t_e$ is the next time in that list, immediately following $t_s$. Here
both $\mathbf{r}_s$ and $\mathbf{r}_e$ are obtained from the stored results of the previous iteration.
The correction term $\Delta\mathbf{r}_s$ is defined as follows:

\[
\Delta\mathbf{r}_s = \mathbf{r}_{s,new} - \mathbf{r}_s, \tag{4.4}
\]

where $\mathbf{r}_{s,new}$ is the position of particle $j$ at time $t_s$ in the current iteration. As in
the case described above, sometimes we are unlucky, and $\mathbf{r}_{s,new}$ is not available.
In that case we just set $\Delta\mathbf{r}_s$ to zero, postponing further accuracy improvement
until the next iteration.
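
The interpolation of Eqs. (4.3) and (4.4) can be sketched as follows. The storage layout (a sorted list of times with one stored position per time) and the names `interpolate_position` and `r_s_new` are assumptions for this illustration; the stored history of the previous iteration could be organized differently in practice.

```python
from bisect import bisect_right


def interpolate_position(times, positions, t, r_s_new=None):
    """Linear interpolation with a rigid shift, Eqs. (4.3)-(4.4).

    times     : sorted step times of particle j from the previous iteration
    positions : positions[k] is the stored position at times[k]
    r_s_new   : position at t_s from the current iteration, or None if it
                is not yet available (the shift is then set to zero)
    """
    k = bisect_right(times, t) - 1        # t_s = largest stored time <= t
    k = min(max(k, 0), len(times) - 2)    # make sure t_e = times[k + 1] exists
    t_s, t_e = times[k], times[k + 1]
    r_s, r_e = positions[k], positions[k + 1]
    f = (t - t_s) / (t_e - t_s)
    if r_s_new is None:
        shift = [0.0] * len(r_s)
    else:
        shift = [new - old for new, old in zip(r_s_new, r_s)]
    return [(1.0 - f) * s + f * e + d for s, e, d in zip(r_s, r_e, shift)]
```

Passing `r_s_new=None` reproduces the fallback described above, in which the correction is postponed until the next iteration.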

To make our predictor-corrector form of the leapfrog integration scheme
consistent, we use the same interpolation scheme for the predictor part, for the
particles to be integrated. For the corrector part, we used the trapezoidal scheme,
as follows:

\[
\mathbf{v}_{c,new} = \mathbf{v}_{old} + \frac{1}{2}\,\Delta t\,(\mathbf{a}_{old} + \mathbf{a}_{new}), \tag{4.5}
\]
\[
\mathbf{r}_{c,new} = \mathbf{r}_{old} + \frac{1}{2}\,\Delta t\,(\mathbf{v}_{old} + \mathbf{v}_{c,new}). \tag{4.6}
\]

Here, the subscript $old$ refers to the value at the previous time, the subscript
$c,new$ refers to the corrected value at the new time, $\mathbf{a}_{old}$ is calculated with the old
values for the positions, and $\mathbf{a}_{new}$ is calculated with the predicted values of the
positions. Note that the first corrected quantity that can be computed is $\mathbf{v}_{c,new}$,
based on the old and predicted quantities. After that, we can also compute the
corrected quantity $\mathbf{r}_{c,new}$, based on the old quantities and $\mathbf{v}_{c,new}$.

Secondly, we can now begin to symmetrize the time step for particle $i$, in the
same way as we did for the two-body problem in the previous section, with one
exception: we now obtain the estimated time step size at the beginning of the
time step from the current iteration, and we obtain the estimated time step size
at the end of the time step from the previous iteration (in the two-body case the
iteration was done separately for each step).
(At time $t$ for particle $i$, let the time step calculated according to the criterion
(4.7) be $\delta t$, and the time step not exceeding this $\delta t$ and compatible with the
block time step criterion be $\Delta t_p$. But now we would like to know the time step
size that would be required at the end of this step, at time $t' = t + \Delta t_p$.)

Subsequent attempts, the second and higher iterations, repeat the same steps as
the first iteration.

As before, we have implemented our iteratively time-symmetric block time step
algorithm using the leapfrog algorithm as our basic integrator. Generalizations
to higher-order schemes are somewhat more complex, but they follow the same
basic logic that we outline in the following sections. A general flow chart of the
algorithm is shown in Fig. 4.1.

We have adopted the following time step criterion for particle $i$:

\[
\delta t_i = \eta \min_j \frac{|\mathbf{r}_{ij}|}{|\mathbf{v}_{ij}|}, \tag{4.7}
\]

where $\eta$ is a constant parameter and $\mathbf{r}_{ij}$ and $\mathbf{v}_{ij}$ are the relative position and
velocity between particles $i$ and $j$. To symmetrize the time step, we simply
require that the step sizes that would be determined by the above criterion at
both ends of the time step are not smaller than the actual time step used. In
other words, we take the minimum of the two time step values that our criterion
gives us at the beginning and at the end of the time step; this minimum is our
symmetrized time step. We could have taken the average, but here we have used
the minimum value, for simplicity.
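
The criterion of Eq. (4.7) and the min-based symmetrization can be sketched as below. Two points are assumptions of this illustration rather than statements from the text: block time steps are taken to be the maximum step divided by a power of two, and the additional requirement that the current time be commensurate with the chosen step is left out. All function names are hypothetical.

```python
import math


def criterion_step(i, pos, vel, eta):
    """Time step of particle i from Eq. (4.7): eta * min_j |r_ij| / |v_ij|."""
    best = math.inf
    for j in range(len(pos)):
        if j == i:
            continue
        r = math.dist(pos[i], pos[j])   # |r_ij|
        v = math.dist(vel[i], vel[j])   # |v_ij|
        if v > 0.0:
            best = min(best, r / v)
    return eta * best


def block_step(dt, dt_max):
    """Largest step dt_max / 2**k that does not exceed dt (assumed block rule)."""
    if dt <= 0.0:
        raise ValueError("time step must be positive")
    step = dt_max
    while step > dt:
        step *= 0.5
    return step


def symmetrized_step(dt_begin, dt_end, dt_max):
    """Block step based on the smaller of the criterion values at both ends."""
    return block_step(min(dt_begin, dt_end), dt_max)
```

In the iterative scheme, `dt_begin` would come from the current iteration and `dt_end` from the interpolated state of the previous iteration at the end of the step.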

Figure 4.1: Flow chart of the algorithm

4.3 Numerical Tests for N-Body Problems

4.3.1 Test results for leapfrog integration

Fig. 4.2 shows how the energy errors grow in the 100-body problem. In each case,
we started with random realizations of a Plummer model, where we used standard
$N$-body units, in which the gravitational constant $G = 1$, the total mass $M = 1$,
and the total energy is $E_{tot} = -1/4$. We have integrated each system for 50 time
units, with a maximum time step of 1/64 and $\eta = 0.1$ (see Eq. (4.7)). We have
used standard Plummer-type softening with softening length $\varepsilon = 0.01$. We have
carried out forty time integrations, starting from twenty different realizations of
the Plummer model. For each realization, we have integrated the system once
without any time symmetrization and once with time symmetrization, using six
iterations to guarantee sufficient convergence. In our experience, at least three
iterations were necessary to achieve high accuracy.

It is clear that all runs without time symmetry show a systematic drift in energy,
while no such systematic tendency is visible for the time-symmetrized runs. Among
the twenty non-symmetrized runs, even the best result came out worse than the
worst result among the symmetrized runs. We conclude that time symmetrization
can significantly improve the long-term accuracy of $N$-body simulations.

Fig. 4.3 shows similar results for 512-body runs. Here we have started from a
single Plummer model realization, and the curves show the effect of varying the
number of iterations. In this case, the second and third iterations already show a
dramatic improvement in the long-time behavior of the total energy of the system.
The softening used here is $\varepsilon = 1/512$. All other parameters are the same as for
the 100-body runs. For the 512-body case, the improvement is significantly better
than it was in the 100-body runs. Finally, Fig. 4.4 shows that the results depicted
in Fig. 4.3 are generic. Starting from a different set of initial conditions changes
the details but not the overall picture, and again the second and third iterations
show dramatic improvements over the original run and the first iteration.

Figure 4.2: Growth of the relative energy error for 100-body runs, starting from
twenty different sets of initial conditions. For each set of initial
conditions, two integrations have been performed, one without and one
with time-symmetrization (in the latter case, using six iterations). The
twenty lines with time symmetrization form the horizontal bundle which
is slowly spreading in square-root-of-time fashion like a random walk;
the twenty lines without time symmetrization all show a systematic,
near-linear decrease in energy.

We offer the following explanation for the increase in accuracy of the
time-symmetric scheme as a function of particle number. In these runs,
contributions to the error are largely generated by close encounters between two
particles, since the softening we used is relatively small. These error contributions
are dominated by weak encounters, since the softening, while small, is not
small enough to make gravitational focusing significant [34]. The number of
weak encounters that take place during one time unit, within a particle-particle
distance that is less than the inter-particle distance (of order $N^{-1/3}$), is of order
$O(N^{4/3})$. If our time symmetrization succeeds in replacing the systematic effects
of all these encounters by a random effect, the result will be a shift from a linear
to a square root drift, effectively replacing the $O(N^{4/3})$ dependence by an $O(N^{2/3})$
dependence. We conclude that the relative reduction in the total energy error,
due to time symmetrization, grows with $N$ as $N^{2/3}$.

Figure 4.3: Growth of the relative energy error for 512-body runs, starting from a
single set of initial conditions, but using a different number of iterations.
The lowest curve presents an integration without time symmetrization.
The curve above that presents the result of time symmetrization using only
one iteration. The next two curves show the results of using three and two
iterations, respectively; initially, the third iteration curve rises a bit above
the second iteration curve.
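
The scaling argument can be written out in one line. As a sketch, assume that each weak encounter contributes an energy error of typical size $\epsilon$ (a symbol introduced here only for illustration) and that the number of encounters per time unit is $n \sim N^{4/3}$; systematic errors then add linearly, while random errors add in quadrature:

```latex
\[
\Delta E_{\rm sys} \sim n\,\epsilon \propto N^{4/3}, \qquad
\Delta E_{\rm rand} \sim \sqrt{n}\,\epsilon \propto N^{2/3}, \qquad
\frac{\Delta E_{\rm sys}}{\Delta E_{\rm rand}} \sim \sqrt{n} \propto N^{2/3}.
\]
```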

Whether or not time-symmetric integration is to be preferred depends on the
number of particles and the duration of the integration. In the example case
shown in Fig. 4.2, the energy error is about 10 times smaller for time-symmetric
integration, but the same effect could be achieved in a computationally cheaper
way by reducing the step size by a factor of 10. For large $N$, however, as shown
already in Fig. 4.3, the effect of time symmetry is more pronounced.

Figure 4.4: Growth of the relative energy error for 512-body runs, like Fig. 4.3, but
starting from a different set of initial conditions. As before, the lowest
curve presents an integration without time symmetrization, and the curve
above that presents the result of time symmetrization using only one
iteration. The second iteration curve is the one that stays above the third
iteration curve for most of the run depicted here.

For very long integrations, time-symmetric integration is clearly better, since
in that case the error grows in random-walk fashion, as $\sqrt{t}$, while for any
non-time-symmetric scheme the error grows linearly, proportional to $t$. In the
case of a star-cluster simulation which covers many relaxation timescales, the
particles in the core can go through a very large number of crossing times. For
example, a simulation of a globular cluster with $10^6$ stars would need to cover at
least $10^5$ half-mass crossing times. The crossing timescale in the core is at least a
factor of 100 shorter than the half-mass crossing time. Therefore, particles
in the core need to be followed for as many as $10^7$ local crossing times. The
difference between the scaling of $\sqrt{t}$ and $t$ produces an improvement of a factor of
roughly $10^{3.5}$ in the energy error, in the case of time-symmetric integration. With
a second-order scheme, it translates into a factor of 100 difference in the necessary
step size. Even with a fourth-order scheme, it would imply a difference of a factor
of ten in the time step. Clearly, even applying five iterations will produce a gain
of a factor of two with respect to the alternative of having to decrease the time step
by a factor of ten.
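
The arithmetic behind these estimates can be reproduced with a few lines of Python. The sketch assumes, as a simplification that is not stated in the text, that the energy error of a scheme of order $p$ scales as the step size to the power $p$; under that assumption the step-size ratios come out at roughly 56 and 7.5, the same order of magnitude as the round figures of 100 and 10 quoted above.

```python
import math

n_cross = 1e7                                  # local crossing times in the core

# linear growth (~ t) versus random-walk growth (~ sqrt(t)) of the error
error_gain = n_cross / math.sqrt(n_cross)      # = sqrt(1e7), about 10**3.5


def step_ratio(order):
    """Step-size reduction needed to match error_gain, assuming error ~ dt**order."""
    return error_gain ** (1.0 / order)


if __name__ == "__main__":
    print(math.log10(error_gain))   # exponent of the error improvement
    print(step_ratio(2))            # second-order scheme
    print(step_ratio(4))            # fourth-order scheme
```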

4.3.2 Test results for fourth–order Hermite integration

When we apply our TSBTS algorithm to fourth- or higher-order Hermite
integration, we need higher-order derivatives. In the first step, we have to make
predictions for all bodies. On the first pass through the era, it would be expensive
to calculate higher-order derivatives such as $\mathbf{s}_i$ and $\mathbf{c}_i$ in Eq. (2.5). Instead of
using fourth-order Taylor expansions as in Eq. (2.5), we prefer to use third-order
predictions for the first pass:

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{(\delta t)^2}{2} + \mathbf{j}_i \frac{(\delta t)^3}{6}, \\
\mathbf{v}_{i+1} &= \mathbf{v}_i + \mathbf{a}_i\,\delta t + \mathbf{j}_i \frac{(\delta t)^2}{2}.
\end{aligned}
\tag{4.8}
\]

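We only calculate directly:

\[
\begin{aligned}
\mathbf{a}_{ij} &= m_j \frac{\mathbf{r}_{ij}}{r_{ij}^3}, \\
\mathbf{j}_{ij} &= m_j \frac{\mathbf{v}_{ij}}{r_{ij}^3} - 3\alpha\,\mathbf{a}_{ij},
\end{aligned}
\tag{4.9}
\]

where $\mathbf{a}_i$, $\mathbf{j}_i$, and $m_i$ are the total acceleration, total jerk, and mass of particle $i$,
$\mathbf{a}_{ij}$ and $\mathbf{j}_{ij}$ are the contributions from particle $j$ to particle $i$, and $\alpha$ is

\[
\alpha = \frac{\mathbf{r}_{ij} \cdot \mathbf{v}_{ij}}{r_{ij}^2}. \tag{4.10}
\]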
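
The pairwise acceleration and jerk can be sketched as follows. The sketch takes the unsoftened force law, with the mass of the attracting particle $j$ as the prefactor and $\alpha$ defined as the scalar product of the relative position and velocity divided by the squared separation; these choices and the function name `pair_acc_jerk` are assumptions of the illustration.

```python
import math


def pair_acc_jerk(m_j, r_ij, v_ij):
    """Acceleration and jerk on particle i due to particle j (no softening).

    r_ij, v_ij are the relative position and velocity (j minus i) as lists.
    """
    r2 = sum(x * x for x in r_ij)
    inv_r3 = 1.0 / (r2 * math.sqrt(r2))
    alpha = sum(x * y for x, y in zip(r_ij, v_ij)) / r2
    acc = [m_j * x * inv_r3 for x in r_ij]
    jerk = [m_j * vx * inv_r3 - 3.0 * alpha * ax for vx, ax in zip(v_ij, acc)]
    return acc, jerk
```

A quick consistency check is that the jerk should equal the time derivative of the acceleration along a straight relative orbit, which can be verified numerically with a central difference.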

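In the second pass through the era, which is the first iteration, we use interpolations
during $t_i < t < t_{i+1}$ for the complete system:

\[
\begin{aligned}
\mathbf{r}^{new}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i (t_{i+1} - t_i) + \mathbf{a}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{j}_i \frac{(t_{i+1} - t_i)^3}{6} \\
&\quad + \mathbf{s}_i \frac{(t_{i+1} - t_i)^4}{24} + \mathbf{c}_i \frac{(t_{i+1} - t_i)^5}{120}, \\
\mathbf{v}^{new}_{i+1} &= \mathbf{v}_i + \mathbf{a}_i (t_{i+1} - t_i) + \mathbf{j}_i \frac{(t_{i+1} - t_i)^2}{2} \\
&\quad + \mathbf{s}_i \frac{(t_{i+1} - t_i)^3}{6} + \mathbf{c}_i \frac{(t_{i+1} - t_i)^4}{24},
\end{aligned}
\tag{4.11}
\]

where $\mathbf{s}_i$ and $\mathbf{c}_i$ are calculated using Eq. (2.8) and Eq. (2.10). We update
only the active particles, using Eq. (4.12):

\[
\begin{aligned}
\mathbf{v}^{c}_{i+1} &= \mathbf{v}_i + \frac{\delta t}{2}\,(\mathbf{a}^{new}_{i+1} + \mathbf{a}_i) - \frac{(\delta t)^2}{12}\,(\mathbf{j}^{new}_{i+1} - \mathbf{j}_i), \\
\mathbf{r}^{c}_{i+1} &= \mathbf{r}_i + \frac{\delta t}{2}\,(\mathbf{v}^{c}_{i+1} + \mathbf{v}_i) - \frac{(\delta t)^2}{12}\,(\mathbf{a}^{new}_{i+1} - \mathbf{a}_i).
\end{aligned}
\tag{4.12}
\]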
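
A minimal sketch of one fourth-order Hermite step, combining the third-order prediction of Eq. (4.8) with the corrector of Eq. (4.12), is given below for a single one-dimensional particle. The callback `acc_jerk` is a hypothetical stand-in for the force and jerk summation; the time-symmetric iteration over the era is not included.

```python
import math


def hermite4_step(x, v, dt, acc_jerk):
    """One predict-evaluate-correct Hermite step for a 1-D particle.

    acc_jerk(x, v) is an assumed callback returning (acceleration, jerk).
    """
    a0, j0 = acc_jerk(x, v)
    # third-order prediction, Eq. (4.8)
    x_p = x + v * dt + a0 * dt**2 / 2 + j0 * dt**3 / 6
    v_p = v + a0 * dt + j0 * dt**2 / 2
    a1, j1 = acc_jerk(x_p, v_p)
    # corrector, Eq. (4.12): velocity first, then position
    v_c = v + 0.5 * dt * (a1 + a0) - dt**2 / 12 * (j1 - j0)
    x_c = x + 0.5 * dt * (v_c + v) - dt**2 / 12 * (a1 - a0)
    return x_c, v_c


if __name__ == "__main__":
    # toy force law: unit harmonic oscillator, a = -x and jerk = -v
    x1, v1 = hermite4_step(1.0, 0.0, 0.1, lambda x, v: (-x, -v))
    print(x1 - math.cos(0.1), v1 + math.sin(0.1))   # deviation from exact solution
```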

Here, we again used standard $N$-body units. We have integrated each system for
1000 time units to see the long-time behavior of the integrations. We have used 3
and 5 iterations with time symmetrization for each case, for comparison with
integrations without time symmetrization. We used the same 100- and 500-body
initial conditions for all Hermite tests.

Fig. 4.5 shows the energy errors in the 100-body problem for $\eta = 0.1$ and a maximum
time step of 1/64. There is a dramatic improvement in the energy errors for the TSBTS
algorithm. The algorithm gives results of almost the same accuracy with 3 and
with 5 iterations. Time-symmetric behavior is disturbed after 100 time
units. Fig. 4.6 shows the same results for the TSBTS algorithm only, with 3 and 5
iterations.

Fig. 4.7 shows the energy errors for 500-body problems for $\eta = 0.5$. Similar to
the 100-body cases, the TSBTS algorithm gives better results. Fig. 4.8 shows
only the TSBTS results. Time-symmetric behavior is again disturbed after 100 time
units for the 500-body tests.

Figure 4.5: Relative energy errors for 100 body problems. 10 different sets of
Plummer model initial conditions are used for fourth-order Hermite
integration with TSBTS algorithm and block time step algorithm.

Figure 4.6: Relative energy errors for 100 body problems. 10 different sets of
Plummer model initial conditions are used for fourth-order Hermite
integration with TSBTS algorithm.

Figure 4.7: Relative energy errors for 500 body problems. 10 different sets of
Plummer model initial conditions are used for fourth-order Hermite
integration with TSBTS algorithm and block time step algorithm.

Figure 4.8: Relative energy errors for 500 body problems. 10 different sets of
Plummer model initial conditions are used for fourth-order Hermite
integration with TSBTS algorithm.

4.3.3 Test results for sixth–order Hermite integration

When we use sixth-order Hermite integration with the TSBTS algorithm, we need
one more derivative than in the fourth-order scheme. In the first prediction, we have to
calculate one more term, $\mathbf{a}_{i+1}$, and we use fourth-order predictions:

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{(\delta t)^2}{2} + \mathbf{j}_i \frac{(\delta t)^3}{6} + \mathbf{s}_i \frac{(\delta t)^4}{24}, \\
\mathbf{v}_{i+1} &= \mathbf{v}_i + \mathbf{a}_i\,\delta t + \mathbf{j}_i \frac{(\delta t)^2}{2} + \mathbf{s}_i \frac{(\delta t)^3}{6}, \\
\mathbf{a}_{i+1} &= \mathbf{a}_i + \mathbf{j}_i\,\delta t + \mathbf{s}_i \frac{(\delta t)^2}{2}.
\end{aligned}
\tag{4.13}
\]

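We also calculate $\mathbf{s}_{ij}$ directly, as follows:

\[
\mathbf{s}_{ij} = m_j \frac{\mathbf{a}_j - \mathbf{a}_i}{r_{ij}^3} - 6\alpha\,\mathbf{j}_{ij} - 3\beta\,\mathbf{a}_{ij}, \tag{4.14}
\]

where

\[
\beta = \frac{|\mathbf{v}_{ij}|^2 + \mathbf{r}_{ij} \cdot (\mathbf{a}_j - \mathbf{a}_i)}{r_{ij}^2} + \alpha^2. \tag{4.15}
\]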
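
The direct evaluation of the snap can be sketched by extending the pairwise calculation by one more derivative. In this illustration the snap is taken to be the exact time derivative of the pairwise jerk for an unsoftened force, which gives the coefficients $6\alpha$ and $3\beta$ used in the code; the function name and the argument `a_rel` (the relative acceleration of the two particles) are hypothetical.

```python
import math


def dot(a, b):
    """Scalar product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def pair_acc_jerk_snap(m_j, r, v, a_rel):
    """Pairwise acceleration, jerk and snap on particle i due to particle j.

    r, v, a_rel are relative position, velocity and acceleration (j minus i).
    """
    r2 = dot(r, r)
    inv_r3 = 1.0 / (r2 * math.sqrt(r2))
    alpha = dot(r, v) / r2
    beta = (dot(v, v) + dot(r, a_rel)) / r2 + alpha * alpha
    acc = [m_j * x * inv_r3 for x in r]
    jerk = [m_j * x * inv_r3 - 3.0 * alpha * A for x, A in zip(v, acc)]
    snap = [m_j * x * inv_r3 - 6.0 * alpha * J - 3.0 * beta * A
            for x, J, A in zip(a_rel, jerk, acc)]
    return acc, jerk, snap
```

As with the jerk, the result can be checked against a central difference of the jerk along a short piece of a uniformly accelerated relative orbit.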
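
In the iteration phase, we need two more terms, $\mathbf{p}_i$ and $\mathbf{n}_i$. These are the fourth and
fifth derivatives of the acceleration, respectively, as in Eq. (2.19), and we use the following
interpolations:

\[
\begin{aligned}
\mathbf{r}^{new}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i (t_{i+1} - t_i) + \mathbf{a}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{j}_i \frac{(t_{i+1} - t_i)^3}{6} + \mathbf{s}_i \frac{(t_{i+1} - t_i)^4}{24} \\
&\quad + \mathbf{c}_i \frac{(t_{i+1} - t_i)^5}{120} + \mathbf{p}_i \frac{(t_{i+1} - t_i)^6}{720} + \mathbf{n}_i \frac{(t_{i+1} - t_i)^7}{5040}, \\
\mathbf{v}^{new}_{i+1} &= \mathbf{v}_i + \mathbf{a}_i (t_{i+1} - t_i) + \mathbf{j}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{s}_i \frac{(t_{i+1} - t_i)^3}{6} \\
&\quad + \mathbf{c}_i \frac{(t_{i+1} - t_i)^4}{24} + \mathbf{p}_i \frac{(t_{i+1} - t_i)^5}{120} + \mathbf{n}_i \frac{(t_{i+1} - t_i)^6}{720}, \\
\mathbf{a}^{new}_{i+1} &= \mathbf{a}_i + \mathbf{j}_i (t_{i+1} - t_i) + \mathbf{s}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{c}_i \frac{(t_{i+1} - t_i)^3}{6} \\
&\quad + \mathbf{p}_i \frac{(t_{i+1} - t_i)^4}{24} + \mathbf{n}_i \frac{(t_{i+1} - t_i)^5}{120}.
\end{aligned}
\tag{4.16}
\]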

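Using the updated positions, velocities, and accelerations, our new correctors are
constructed as:

\[
\begin{aligned}
\mathbf{v}^{c}_{i+1} &= \mathbf{v}_i + \frac{\delta t}{2}\,(\mathbf{a}^{new}_{i+1} + \mathbf{a}_i) - \frac{(\delta t)^2}{10}\,(\mathbf{j}^{new}_{i+1} - \mathbf{j}_i) + \frac{(\delta t)^3}{120}\,(\mathbf{s}_{i+1} + \mathbf{s}_i), \\
\mathbf{r}^{c}_{i+1} &= \mathbf{r}_i + \frac{\delta t}{2}\,(\mathbf{v}^{c}_{i+1} + \mathbf{v}_i) - \frac{(\delta t)^2}{10}\,(\mathbf{a}^{new}_{i+1} - \mathbf{a}_i) + \frac{(\delta t)^3}{120}\,(\mathbf{j}^{new}_{i+1} + \mathbf{j}_i).
\end{aligned}
\tag{4.17}
\]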

Fig. 4.9 shows the energy errors for 10 different 100-body problems for $\eta = 0.1$.
Here, the results for the TSBTS algorithm are again clearly better than the block
time step cases, as we expected. The 3- and 5-iteration results show no linearly
growing errors after 100 time units in Fig. 4.10.

Figure 4.9: Relative energy errors for 100 body problems. 10 different sets of
Plummer model initial conditions are used for sixth-order Hermite
integration with TSBTS algorithm and block time step algorithm.

Fig. 4.11 shows the energy errors for 10 different 500-body problems for $\eta = 0.5$.
Fig. 4.12 shows the energy errors only for the TSBTS algorithm with 3 and 5
iterations.

4.3.4 Test results for eighth–order Hermite integration

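Similar to the sixth-order case, we need one more derivative for the predictions. Our
new term is the jerk ($\mathbf{j}_{i+1}$), and our fifth-order predictions are:

\[
\begin{aligned}
\mathbf{r}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i\,\delta t + \mathbf{a}_i \frac{(\delta t)^2}{2} + \mathbf{j}_i \frac{(\delta t)^3}{6} + \mathbf{s}_i \frac{(\delta t)^4}{24} + \mathbf{c}_i \frac{(\delta t)^5}{120}, \\
\mathbf{v}_{i+1} &= \mathbf{v}_i + \mathbf{a}_i\,\delta t + \mathbf{j}_i \frac{(\delta t)^2}{2} + \mathbf{s}_i \frac{(\delta t)^3}{6} + \mathbf{c}_i \frac{(\delta t)^4}{24}, \\
\mathbf{a}_{i+1} &= \mathbf{a}_i + \mathbf{j}_i\,\delta t + \mathbf{s}_i \frac{(\delta t)^2}{2} + \mathbf{c}_i \frac{(\delta t)^3}{6}, \\
\mathbf{j}_{i+1} &= \mathbf{j}_i + \mathbf{s}_i\,\delta t + \mathbf{c}_i \frac{(\delta t)^2}{2}.
\end{aligned}
\tag{4.18}
\]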

Figure 4.10: Relative energy errors for 100 body problems. 10 different sets of
Plummer model initial conditions are used for sixth-order Hermite
integration with TSBTS algorithm.

Figure 4.11: Relative energy errors for 500 body problems. 10 different sets of
Plummer model initial conditions are used for sixth-order Hermite
integration with TSBTS algorithm and block time step algorithm.

Figure 4.12: Relative energy errors for 500 body problems. 10 different sets of
Plummer model initial conditions are used for sixth-order Hermite
integration with TSBTS algorithm.

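We already know how to calculate $\mathbf{a}$, $\mathbf{j}$, and $\mathbf{s}$ directly, from Eq. (4.9) and
Eq. (4.14). We need to calculate directly the third derivative of the acceleration, $\mathbf{c}$,
as:

\[
\mathbf{c}_{ij} = m_j \frac{\mathbf{j}_j - \mathbf{j}_i}{r_{ij}^3} - 9\alpha\,\mathbf{s}_{ij} - 9\beta\,\mathbf{j}_{ij} - 3\gamma\,\mathbf{a}_{ij}, \tag{4.19}
\]

where $\gamma$ is

\[
\gamma = \frac{3\,\mathbf{v}_{ij} \cdot (\mathbf{a}_j - \mathbf{a}_i) + \mathbf{r}_{ij} \cdot (\mathbf{j}_j - \mathbf{j}_i)}{r_{ij}^2} + \alpha\,(3\beta - 4\alpha^2). \tag{4.20}
\]

Our new correctors for eighth-order Hermite are:

\[
\begin{aligned}
\mathbf{v}^{c}_{i+1} &= \mathbf{v}_i + \frac{\delta t}{2}\,(\mathbf{a}^{new}_{i+1} + \mathbf{a}_i) - \frac{3(\delta t)^2}{28}\,(\mathbf{j}^{new}_{i+1} - \mathbf{j}_i) \\
&\quad + \frac{(\delta t)^3}{84}\,(\mathbf{s}_{i+1} + \mathbf{s}_i) - \frac{(\delta t)^4}{1680}\,(\mathbf{c}_{i+1} - \mathbf{c}_i), \\
\mathbf{r}^{c}_{i+1} &= \mathbf{r}_i + \frac{\delta t}{2}\,(\mathbf{v}^{c}_{i+1} + \mathbf{v}_i) - \frac{3(\delta t)^2}{28}\,(\mathbf{a}^{new}_{i+1} - \mathbf{a}_i) \\
&\quad + \frac{(\delta t)^3}{84}\,(\mathbf{j}^{new}_{i+1} + \mathbf{j}_i) - \frac{(\delta t)^4}{1680}\,(\mathbf{s}_{i+1} - \mathbf{s}_i).
\end{aligned}
\tag{4.21}
\]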
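
The corrector weights of the fourth-, sixth-, and eighth-order schemes can be checked with exact arithmetic. The sketch below treats each corrector as a two-point quadrature rule that uses the integrand and its derivatives at both ends of the step, and it assumes the sign pattern in which even derivatives enter as sums with positive weights and odd derivatives enter as differences (end minus start) with negative weights. With that convention the rule of order $2m$ integrates every polynomial of degree up to $2m - 1$ exactly. The table `WEIGHTS` and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

# weight of the d-th derivative term, d = 0, 1, 2, ... (sign included)
WEIGHTS = {
    4: [Fraction(1, 2), Fraction(-1, 12)],
    6: [Fraction(1, 2), Fraction(-1, 10), Fraction(1, 120)],
    8: [Fraction(1, 2), Fraction(-3, 28), Fraction(1, 84), Fraction(-1, 1680)],
}


def monomial_derivative(k, d, t):
    """Exact d-th derivative of t**k evaluated at t."""
    if d > k:
        return Fraction(0)
    return Fraction(factorial(k), factorial(k - d)) * Fraction(t) ** (k - d)


def corrector_integral(order, k, h=1):
    """Apply the corrector weights to f(t) = t**k on the interval [0, h]."""
    total = Fraction(0)
    for d, w in enumerate(WEIGHTS[order]):
        f0 = monomial_derivative(k, d, 0)
        f1 = monomial_derivative(k, d, h)
        # even derivatives: sum of end values; odd derivatives: end minus start
        pair = f1 + f0 if d % 2 == 0 else f1 - f0
        total += w * Fraction(h) ** (d + 1) * pair
    return total


if __name__ == "__main__":
    # the exact integral of t**k over [0, 1] is 1 / (k + 1)
    for order in (4, 6, 8):
        print(order, [corrector_integral(order, k) for k in range(order)])
```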

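
In the iteration phase, we use the following equations for the interpolations. As usual,
we use two more terms than in the sixth-order case, namely the sixth and seventh derivatives
of the acceleration, and we need an extra interpolation for the jerk:

\[
\begin{aligned}
\mathbf{r}^{new}_{i+1} &= \mathbf{r}_i + \mathbf{v}_i (t_{i+1} - t_i) + \mathbf{a}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{j}_i \frac{(t_{i+1} - t_i)^3}{6} + \mathbf{s}_i \frac{(t_{i+1} - t_i)^4}{24} \\
&\quad + \mathbf{c}_i \frac{(t_{i+1} - t_i)^5}{120} + \mathbf{p}_i \frac{(t_{i+1} - t_i)^6}{720} + \mathbf{n}_i \frac{(t_{i+1} - t_i)^7}{5040} \\
&\quad + \mathbf{a}^{(6)}_i \frac{(t_{i+1} - t_i)^8}{40320} + \mathbf{a}^{(7)}_i \frac{(t_{i+1} - t_i)^9}{362880}, \\
\mathbf{v}^{new}_{i+1} &= \mathbf{v}_i + \mathbf{a}_i (t_{i+1} - t_i) + \mathbf{j}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{s}_i \frac{(t_{i+1} - t_i)^3}{6} \\
&\quad + \mathbf{c}_i \frac{(t_{i+1} - t_i)^4}{24} + \mathbf{p}_i \frac{(t_{i+1} - t_i)^5}{120} + \mathbf{n}_i \frac{(t_{i+1} - t_i)^6}{720} \\
&\quad + \mathbf{a}^{(6)}_i \frac{(t_{i+1} - t_i)^7}{5040} + \mathbf{a}^{(7)}_i \frac{(t_{i+1} - t_i)^8}{40320}, \\
\mathbf{a}^{new}_{i+1} &= \mathbf{a}_i + \mathbf{j}_i (t_{i+1} - t_i) + \mathbf{s}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{c}_i \frac{(t_{i+1} - t_i)^3}{6} \\
&\quad + \mathbf{p}_i \frac{(t_{i+1} - t_i)^4}{24} + \mathbf{n}_i \frac{(t_{i+1} - t_i)^5}{120} \\
&\quad + \mathbf{a}^{(6)}_i \frac{(t_{i+1} - t_i)^6}{720} + \mathbf{a}^{(7)}_i \frac{(t_{i+1} - t_i)^7}{5040}, \\
\mathbf{j}^{new}_{i+1} &= \mathbf{j}_i + \mathbf{s}_i (t_{i+1} - t_i) + \mathbf{c}_i \frac{(t_{i+1} - t_i)^2}{2} + \mathbf{p}_i \frac{(t_{i+1} - t_i)^3}{6} \\
&\quad + \mathbf{n}_i \frac{(t_{i+1} - t_i)^4}{24} + \mathbf{a}^{(6)}_i \frac{(t_{i+1} - t_i)^5}{120} + \mathbf{a}^{(7)}_i \frac{(t_{i+1} - t_i)^6}{720}.
\end{aligned}
\tag{4.22}
\]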

As in the previous Hermite integration tests, Fig. 4.13 and Fig. 4.14 show
the energy errors for 10 different 100-body problems for $\eta = 0.1$. The second one
(Fig. 4.14) shows only the 3- and 5-iteration cases. Fig. 4.15 and Fig. 4.16 show the
energy errors for 10 different 500-body problems for $\eta = 0.5$, and Fig. 4.16 shows
only the 3- and 5-iteration cases.

We know from the two-body tests that eighth-order Hermite integration is not fully
time symmetric. In any case, it gives better results with TSBTS than with the block time
step scheme.

Figure 4.13: Relative energy errors for 100 body problems. 10 different sets of
Plummer model initial conditions are used for eighth-order Hermite
integration with TSBTS algorithm and block time step algorithm.

Figure 4.14: Relative energy errors for 100 body problems. 10 different sets of
Plummer model initial conditions are used for eighth-order Hermite
integration with TSBTS algorithm.

Figure 4.15: Relative energy errors for 500 body problems. 10 different sets of
Plummer model initial conditions are used for eighth-order Hermite
integration with TSBTS algorithm and block time step algorithm.

Figure 4.16: Relative energy errors for 500 body problems. 10 different sets of
Plummer model initial conditions are used for eighth-order Hermite
integration with TSBTS algorithm.

4.4 Numerical Tests for Size of the Era

The size of an era can be chosen as any integer multiple of the maximum allowed time
step. As far as the integration itself is concerned, there is no important computational
difference between dividing the simulation into small parts and treating the whole
simulation as a single era. However, some symmetrization routines, such as adjusting
the time steps and interpolating the old data, increase the computation time.
Additionally, keeping the whole history of the simulation requires a very large
amount of memory.

It is important to decide what the most convenient choice for an era is. We need
to keep enough information from the previous steps to adjust the time
steps through iterations. To avoid extra work and an unnecessarily large history, it is
not recommended to choose a large era size. On the other hand, the era size must
be large enough to catch rapid and sharp time step changes.

We made several tests with different Plummer model initial conditions using
different era sizes. Units were chosen as standard $N$-body units [4], with the
gravitational constant $G = 1$, the total mass $M = 1$, and the total energy
$E_{tot} = -1/4$. We limited the maximum time step to 1/64. The $\eta$ parameters were kept
larger than usual in order to see the error growth over shorter time periods: $\eta$ was
taken as 0.1 for 100-body problems and 0.5 for 500-body problems. The Plummer-type
softening length $\varepsilon$ was taken as 0.01. Each system was integrated for every era
size (1, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625) for 1000 time units with leapfrog
integration.

Fig. 4.17 shows the energy errors for 5 different 100-body problems with 7
different era sizes. In these test runs, iterations have not been applied for
symmetrization. It is clearly seen from the figure that, without iterations, making
the era size big or small has no influence on the energy errors. All the error
curves are in the same range. Small differences for the same initial conditions come
from the synchronization process, which is repeated at the end of each era. The
improvement in energy errors comes directly from the iteration process.

Figure 4.17: Relative energy errors for 100-body problems. 5 different sets of
Plummer model initial conditions with 7 different era sizes are used
without iterations. Here, all the energy errors lie within the same bounds.
Although a growing dispersion between the curves is observed after 200
time units, this small spread over long integration times is not
important for such large relative energy errors.

Fig. 4.18 shows the energy errors for 5 different 100-body problems with 5
different era sizes. In these test runs, time-symmetrized block time steps were
used with 3 iterations. We also performed test runs for the other era sizes (1.0 and 0.5),
but the growth of their energy errors goes beyond the scale of this figure. The
figure shows that 3 iterations are not enough to avoid linearly growing errors for
big era sizes (here 0.25).

It seems that bigger era sizes need more iterations for better energy errors. We
made the following tests to see this effect clearly. Fig. 4.19 shows the energy errors for
5 different 100-body problems with 5 different era sizes, as in the previous figure.
However, we used 5 iterations here. In this figure, the biggest era size (0.25 time
units) does not show a linearly growing error, in contrast to the case of 3 iterations.

We increased the particle number five times and used a larger $\eta$ parameter
of 0.5. The $\eta$ parameter could have been kept at 0.1 again, but we forced the algorithm
to take bigger time steps, which produce bigger energy errors over relatively
short time periods. Fig. 4.20 shows the energy errors for 5 different 500-body
problems with 7 different era sizes. Red curves show the errors for era sizes of
0.015625, 0.03125, and 0.0625 time units; black curves show the errors for era sizes of
0.125, 0.25, 0.5, and 1 time units.

Figure 4.18: Relative energy errors for 100-body problems. 5 different sets of
Plummer model initial conditions with 5 different era sizes (0.015625,
0.03125, 0.0625, 0.125, 0.25) are used with 3 iterations for 1000 time
units. The top 5 curves (red ones) show linearly growing errors, which
correspond to the biggest era size (0.25). The rest present the
results for the other era sizes. The smallest relative errors in the figure (black
curves) show a random-walk fashion, and correspond to results with the
smallest era size (0.015625).

Error contributions mostly come from close encounters [35]. This effect increases
with long running times for small softening. If the symmetrization process does
not succeed in producing time-symmetric block time steps, the total energy error
grows linearly. As a result of our tests, better symmetric block time steps can be
obtained by paying attention to two points: increasing the number of iterations, and
keeping the era size smaller.

However, the era size must be greater than the smallest time step. Otherwise we cannot
store past information for the iteration process, and the algorithm reverts to the
classical block time step scheme. Even though it is hard to give a lower limit for the
era size, it has to be chosen big enough to estimate better block time
steps.

Figure 4.19: Relative energy errors for 100-body problems. 5 different sets of
Plummer model initial conditions are used for 5 iterations with 5 different
era sizes (0.015625, 0.03125, 0.0625, 0.125, 0.25). In this figure, all
the curves show random-walk fashion instead of linearly growing error.
Also, the worst relative error is below 0.008, whereas it was 0.035 in
Fig. 3.19 and 0.35 in Fig. 3.18.

4.5 Dynamic Era

Our test results in the previous section show that making the era size big or
small has a clear effect on the energy errors with symmetrized time steps for a small
number of iterations. The amount of past information about the particles' positions
and velocities grows with the size of the era. The iteration process then needs more
iterations to converge to optimized time steps, which requires more running time.
The simplest choices for the era size range between the allowed biggest time step
and one time unit. However, the era size can be controlled dynamically if we can
find a proper criterion for changing it.
Figure 4.20: Relative energy errors for 500-body problems. 5 different sets of
Plummer model initial conditions are used with 7 different era sizes
(0.015625, 0.03125, 0.0625, 0.125, 0.25, 0.5, 1) for each set. Three
iterations were performed in the integrations. The 15 curves (red ones)
in the center of the figure present the results for the smaller era sizes
(0.015625, 0.03125, 0.0625); the remaining 20 curves correspond to the bigger
(0.125, 0.25, 0.5, 1) era sizes.

Let us recall the relation between block time steps and the era: at the end of
each era, every particle stops at the same system time and takes a new block
time step for the new blocks. The last block can take at most the maximum allowed
time step. The first block can take any block time step smaller than the maximum
allowed time step. Then, the particles are sorted according to their block time steps.
Our suggestion for choosing the size of the era is to take the difference between the
biggest and smallest block times of the particles. The difference between the last block
time and the current time of the first block gives us a dynamically changing size, which
we can use as the size of the era.

Naturally, this difference can sometimes be larger than one time unit, or smaller
than the allowed biggest time step. We can use the allowed biggest time step as the
bottom limit of the era size and a small integer multiple of this time step as the top limit.
If the biggest time step of the era is equal to the allowed biggest time step, our era
size will be equal to the allowed biggest time step. If our era size is smaller than
our biggest time step, the particles with the biggest time steps are excluded from
the integration process of the era and are left for the next era. If this happens, the
energy errors oscillate. Even though our iteration process protects
the integration from these oscillations, we can use the allowed biggest time step as the
era size in these cases.
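
One possible reading of this prescription is sketched below. It is an interpretation rather than the implementation used for the tests: the raw era size is taken as the latest scheduled block time minus the current time of the first block, it is rounded up to a whole number of maximum time steps, and it is clamped between one and `max_multiple` maximum time steps. The names `dynamic_era_size`, `block_times`, `t_first`, and `max_multiple`, as well as the rounding rule, are assumptions.

```python
import math


def dynamic_era_size(block_times, t_first, dt_max, max_multiple=64):
    """Era size from the spread of block times, limited to [dt_max, max_multiple * dt_max].

    block_times : scheduled block times of the particles (hypothetical input)
    t_first     : current time of the first (earliest) block
    """
    raw = max(block_times) - t_first
    n = math.ceil(raw / dt_max)            # whole number of maximum time steps
    n = min(max(n, 1), max_multiple)       # bottom limit dt_max, top limit a multiple of it
    return n * dt_max
```

With `dt_max = 1/64` and the default `max_multiple = 64`, the result always lies between the allowed biggest time step and one time unit.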

In these tests, we used two choices for the era: a fixed size equal to the allowed biggest
time step, and a dynamically changing size as defined above. We performed three
iterations. Green curves with ⋆ symbols show the results for the dynamic era; red curves
with ✷ symbols show the results for the fixed era. Fig. 4.21 shows the energy errors for 10
different 100-body problems. Fig. 4.22 shows the energy errors for 10 different
500-body problems. Large-scale versions of the figures are given in Appendix
(2.1).

Figure 4.21: Relative energy errors for 10 different 100-body problems. For each
set of initial conditions, two algorithms have been run, one with fixed
and one with changing era size. Three iterations have been used for both
algorithms. The fixed era size was taken as 0.015625. This value was also
used as the allowed biggest time step for both algorithms. Green curves
correspond to dynamic era sizes, and show smaller errors than the fixed ones
in most cases.

Results for the dynamic era size are in the same range as those for the fixed era size. Even though the chosen fixed era size (0.015625) appeared to be the best choice in previous tests with the same initial conditions and parameters (maximum allowed time step, softening, and accuracy parameters), in general the dynamic era gives slightly better results than the fixed era of 0.015625. We made more than 20 different test runs, and we have not seen any energy errors worse than those of the fixed era. The worst results for the dynamic era are consistent with those for fixed era sizes. Large-scale versions of the figures can be seen in Appendix (2.2).

Figure 4.22: Relative energy errors for 10 different 500-body problems. Fixed and
dynamic era sizes were run for each set of initial conditions, as in Fig. 4.21.
As before, the fixed era size and the allowed biggest time step were taken as
0.015625. The results for dynamic and fixed era sizes are in the same
error ranges.

5. PARALLEL IMPLEMENTATION

Parallelization of direct N-body codes on general purpose parallel supercomputers is still extremely challenging. The calculation/communication ratio is often dominated by a small number of particles in a dense region, such as globular clusters or the nucleus of a galaxy. Communication times between processors can exceed the time needed for force calculation, and communication latency becomes the main bottleneck with the increasing number of particles [36]. For these reasons, the scaling problem is one of the fundamental shortcomings of the N-body approach for direct simulations.

It is well known from previous work that a significant improvement in the performance of a parallel direct N-body code can be obtained by means of special-purpose computers such as GRAPE hardware [31, 37]. However, we aimed to show, as a first step, that the time-symmetric block time step method can be used with a parallel algorithm on classical clusters.

5.1 Requirements and Algorithm of the Sequential Code

Programming the direct integration of the N-body problem with hierarchical schemes has some difficulties. In every integration step, partial forces on the active group of particles must be computed and summed. Additionally, for our TSBTS algorithm (Algorithm 3), we have to store all the information about the particles in an era for every step.
At the starting point of every integration step, we have to know which particles must be updated. Instead of choosing the particle with the smallest time step, we prefer to sort the particles into blocks. According to their step size, we have to re-arrange our integration list. Even though this approach increases the complexity of the code, it gives us the ability to divide the particles into sub-blocks and send them to the computing nodes when parallelizing the algorithm.

On the programming side of the work, we have some structure arrays for the particle lists that keep position and velocity vectors, higher-order derivatives, and time information. Sorting a structured array is a very time-consuming operation if the structure has many sub-arrays and units. In our code, we used dummy structured arrays that include only times and indexes for sorting operations, to avoid such extra time-consuming work.
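
The idea of sorting a lightweight key array instead of the full particle records can be illustrated with a short Python sketch. The `Particle` class and the function `block_order` are hypothetical names chosen for this illustration; the thesis code itself is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    # Stand-in for the full record (positions, velocities, derivatives, ...).
    pos: tuple
    vel: tuple
    next_time: float


def block_order(particles):
    """Return particle indexes ordered by next block time.

    Only (time, index) pairs are sorted; the heavy particle records
    stay where they are in memory.
    """
    keys = [(p.next_time, i) for i, p in enumerate(particles)]
    keys.sort()  # ties in time are broken by index, so the order is stable
    return [i for _, i in keys]


particles = [
    Particle((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), 0.5),
    Particle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 0.125),
    Particle((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), 0.25),
]
order = block_order(particles)
```

The integration loop can then walk through `order` and touch the full records only for the particles of the active block.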
For initial conditions, our code requires approximately 170 bytes for each particle, which amounts to about 170 K-bytes for a thousand-particle system. If we accept 1000 steps for every era to keep past information in memory, the run-time memory requirement will be approximately 260 M-bytes, including acceleration vectors and time information for leapfrog integration.
As an example of the computational requirements of astrophysical N-body problems, consider the evolution of a globular cluster containing a few thousand stars: this requires some $10^{15}$ floating point calculations, equivalent to 10 G-flops-day, or several months to a year on a typical workstation. A calculation with half a million stars, resembling a typical globular star cluster, will therefore require approximately 10 P-flops-day [4].
Algorithm 3 Sequential Algorithm for TSBTS
1: Initialization:
- Read initial position and velocity vectors from the source.
- Sort particles according to time blocks.
- Allocate memory.
- Calculate initial total energy.
- Initialize particles' forces, time steps, and next block times.
2: Start the iteration for the era.
3: Start the integration for the era.
4: Start the integration for the first block of the era.
5: Predict position and velocity vectors of all particles for the current system time.
If this is the first step of the iteration, or the time of the particle is smaller than the current
time, do direct prediction; otherwise interpolate from the currently stored data.
6: Calculate partial forces on the active particles.
7: Correct position and velocity vectors of the particles in the block.
8: Update their new time steps and next block times. Symmetrize the new time steps
after the first integration of the era.
9: Sort particles according to time blocks.
10: Repeat from step 3 while the current time is ≤ the end of the era.
11: Repeat from step 2 until the number of iterations reaches the iteration limit.
12: Repeat from step 2 for the next era until the final time is reached.

5.2 Parallel Algorithm

Basically, two well-known schemes are used in direct N-body parallelization: copy and ring.
Ring or systolic algorithms are generally preferred to reduce memory usage. The performance of systolic schemes can be greatly improved by the use of non-blocking communication routines, which allow position and velocity vectors to be communicated at the same time that force calculations are being carried out. Hyper-systolic algorithms reduce the communication complexity from $O(N n_p)$, where $n_p$ is the number of processors, to $O(N \sqrt{n_p})$, at the expense of increased memory requirements [38].
Ring algorithms can be reasonable for shared time step codes, but they are not easy to use with block step schemes, and it is also well known from previous work that they give almost the same speedup as the copy algorithm [37]. The number of particles in the integrated block changes in every step. In many cases, the size of the integrated block can be smaller than the number of processors. It is hard to obtain a balanced load distribution in such cases.
We used the copy algorithm. Even though it is much easier to extend to block step schemes, the copy algorithm also has a load imbalance problem in its classical usage: at any time, the block size can be smaller than the number of processors.
We divide the partitioning strategy into two cases to avoid bad load balancing. In the first case, we divide the particles of the active groups whose size is greater than the number of nodes. This is a kind of data partitioning, even though every node has a full copy of the system. In the second case, we divide the force calculation of the active blocks, as a kind of work partitioning.
Our parallel algorithm works with the following steps, as in Algorithm (4).

Algorithm 4 Parallel TSBTS Algorithm based on SPMD model
1: Broadcast all particles. Each node has a full copy of the system.
2: Initialize the system for all particles on all nodes. Every node computes all particles'
time steps.
3: Compute and sort time blocks.
4: Integrate the particles in the first block, whose block times are the minimum for the era:
i) if the number of particles in the first block ≥ the number of nodes: every processor
calculates forces for and integrates
(number of particles in the first block)/(number of nodes) particles.
ii) if the number of particles in the first block < the number of nodes: every processor
calculates its (number of particles)/(number of nodes) share of the forces
on the particles of the first block.
5: Update integrated particles.
6: Repeat from step 3.
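
The two cases of step 4 can be sketched as an index-range computation. The following Python sketch is an assumption-laden illustration rather than the actual implementation: the helper names `split_range` and `partition_block` are hypothetical, and no message passing is shown.

```python
def split_range(n_items, n_parts, part):
    """Half-open index range [start, stop) assigned to `part` of `n_parts`."""
    base, extra = divmod(n_items, n_parts)
    # The first `extra` parts receive one additional item each.
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def partition_block(n_active, n_total, n_procs, rank):
    """Return (active_range, source_range) handled by processor `rank`.

    active_range -- which particles of the active block this rank works on
    source_range -- which particles of the whole system it sums forces from
    """
    if n_active >= n_procs:
        # Case i: data partitioning. Each rank takes a slice of the block
        # and computes the full force on it from all particles.
        return split_range(n_active, n_procs, rank), (0, n_total)
    # Case ii: work partitioning. Each rank handles every active particle,
    # but only the partial force coming from its slice of the system.
    return (0, n_active), split_range(n_total, n_procs, rank)
```

In case ii the partial forces computed by the different processors still have to be combined, presumably by a sum over processors, before the active particles can be corrected.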

5.3 Load Balance and Parallel Performance

We have performed test runs on a Linux cluster at the ITU-HPC Lab¹ with 37 dual-core 3.40 GHz Intel(R) Xeon(TM) CPUs and a Myrinet interconnect.
The compute time was measured using MPI_Wtime(). The timing for the total compute time was started before the broadcast of the system to the nodes, and ended at the end of the integration. The calculation time of every active particle sub-group of the current time block on every single computing node was taken as the work load of the processor. In the iteration process, the largest time was taken as the work load of the processor for the same time block.
The work load of processor $i$ for every active integrated particle group is denoted $w_i$, $n_p$ is the number of processors, and the mean work load $\langle W \rangle$ is

$$ \langle W \rangle = \frac{1}{n_p} \sum_{i=1}^{n_p} w_i, \tag{5.1} $$

and the load imbalance is

$$ L(w) = 1 - \frac{\langle W \rangle}{\max(w_i)}. \tag{5.2} $$
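
As a small worked illustration of Eqs. (5.1) and (5.2), the following Python sketch evaluates the load imbalance from a list of per-processor work loads. The function name and the sample timings are hypothetical and are not measurements from the thesis.

```python
def load_imbalance(work_loads):
    """Load imbalance L(w) = 1 - <W> / max(w_i), following Eqs. (5.1)-(5.2)."""
    mean_load = sum(work_loads) / len(work_loads)  # <W>, Eq. (5.1)
    return 1.0 - mean_load / max(work_loads)       # L(w), Eq. (5.2)


# Hypothetical per-processor times (seconds) for one active block.
balanced = load_imbalance([1.0, 1.0, 1.0, 1.0])
skewed = load_imbalance([1.0, 3.0])
```

A perfectly balanced block gives L(w) = 0, and the value approaches 1 as a single processor carries more and more of the work.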

Fig. (5.1) shows the load imbalance for a 1000-body problem. We used 12 processors (3 nodes, every node has 4 processors). In direct N-body simulations, 1000 bodies is actually not a big number for 12 processors [36, 37, 39]. Here, the load imbalance is generally no more than 0.1%.

¹ İstanbul Technical University High Performance Computing Lab.

Figure 5.1: Load imbalance for a 1000-body problem with Plummer model initial
conditions, using 12 processors for 1000 time units. Each red point
corresponds to the load imbalance for the active particle group while its
vectors are being updated.

$T_1$ is the running time for one processor, and $T_n$ is the running time for $n$ processors. Speedup and efficiency are given, in order, as

$$ \mathrm{speedup} = \frac{T_1}{T_n}, \tag{5.3} $$

$$ \mathrm{efficiency} = \frac{T_1}{n \, T_n}. \tag{5.4} $$
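
Eqs. (5.3) and (5.4) translate directly into code. The short Python sketch below uses hypothetical function names and made-up timings purely for illustration.

```python
def speedup(t1, tn):
    """Speedup T1 / Tn, Eq. (5.3)."""
    return t1 / tn


def efficiency(t1, tn, n):
    """Efficiency T1 / (n * Tn), Eq. (5.4)."""
    return t1 / (n * tn)


# Hypothetical timings: 100 s on one processor, 25 s on eight processors.
s = speedup(100.0, 25.0)
e = efficiency(100.0, 25.0, 8)
```

An efficiency of 1 would correspond to ideal linear scaling; values below 1 reflect communication and load imbalance overheads.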

Fig. 5.2 and Fig. 5.3 show speedup and efficiency results of symmetrized and non-symmetrized block time steps for a 10000-body problem with Plummer softening length 0.01 and accuracy parameter η = 0.1. The TSBTS algorithm with only one iteration corresponds to the individual block time step algorithm without symmetrization. The speedup for 3 iterations is clearly better than that for 1 iteration. These results show that the communication/calculation ratio decreases with the iteration process, even though iteration needs much more computation time.

Figure 5.2: Speedup vs processor number for 10000-body Plummer model initial
conditions both for symmetrized and non-symmetrized individual block
time step algorithms. Continuous curve on the top corresponds to
symmetrized block time steps with 3 iterations. Discontinuous curve on
the bottom corresponds to classical block time step algorithm.

Figure 5.3: Efficiency vs processor number for 10000-body Plummer model initial
conditions both for symmetrized and non-symmetrized individual block
time step algorithms. Continuous curve on the top corresponds to
symmetrized block time steps with 3 iterations. Discontinuous curve on
the bottom corresponds to classical block time step algorithm.

6. CONCLUSION

In this thesis, we have succeeded in constructing an algorithm for time symmetrizing block time steps that does not show a linear growth of energy errors. This is the first such algorithm that has been discovered. We expect that the algorithm will have practical value for a wide range of large-scale parallel N-body simulations.
One of the major novelties in our scheme has been the introduction of a time period, which we call an era, during which all positions and velocities of all particles are stored in memory. These values are retained from one iteration to the next. We expect that this procedure will have other advantages as well, in that it prevents sudden surprises from occurring. For example, if a particle suddenly acquires a very high speed, it may approach another particle with a long time step without any warning. As another example, a star may undergo a supernova explosion, something that other particles will normally only notice when their current time step has finished. In both cases, after the first iteration all particles will have access to full knowledge about these unexpected events, and during the iteration procedure they can automatically adapt to the new situation.
Our new scheme promises to be competitive with traditional non-symmetrized schemes, especially for very long integration times, in which the same error bounds may be reached using less computer time. To prove that this will be the case for realistic applications clearly requires further detailed investigations, beyond the scope of the current work.
The memory use of our scheme may seem formidable, and indeed, when a large value for the era size is chosen, memory use is increased significantly over traditional schemes. For those N-body calculations that are CPU time limited, this may not be much of a concern. However, for large-scale cosmological simulations and other applications for which memory is important, it is possible to choose an era size in such a way that the memory requirement of the new scheme is less than twice the memory requirement of non-symmetric schemes, at a CPU performance penalty of less than a factor of two.
One trick is to take an era size close to the harmonic mean of the time steps of all particles. In that way, half the computing cost of the non-symmetric scheme is associated with particles that have time steps shorter than this era size. Those particles with natural time steps longer than this era choice will see their step size shortened to the era size, but the total increase in time steps will be less than a factor of two.
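
The factor-of-two bound can be checked numerically. The Python sketch below is an illustration under simple assumptions: the cost is taken to be the number of steps per unit time, i.e. the sum of 1/dt over all particles, and the function names and sample time steps are hypothetical.

```python
def harmonic_mean(steps):
    """Harmonic mean H = N / sum(1/dt_i) of the particle time steps."""
    return len(steps) / sum(1.0 / dt for dt in steps)


def steps_per_unit_time(steps, era=None):
    """Number of steps per unit time; steps longer than `era` are cut to it."""
    if era is None:
        return sum(1.0 / dt for dt in steps)
    return sum(1.0 / min(dt, era) for dt in steps)


# Hypothetical block time steps, all powers of two.
steps = [2.0 ** -k for k in (3, 4, 5, 6, 7, 8)]
era = harmonic_mean(steps)
natural = steps_per_unit_time(steps)
clamped = steps_per_unit_time(steps, era)
ratio = clamped / natural
# Why ratio < 2: each clamped term is max(1/dt_i, 1/H) < 1/dt_i + 1/H,
# and summing N terms of 1/H gives N/H, which equals sum(1/dt_i).
```

For this sample set the ratio comes out at about 1.43, comfortably below the bound of two.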
For such reasons, we have analyzed the era concept in more detail for time symmetrized block time steps. Our test results show that the size of the era must be chosen carefully for stable and robust integrations. This is especially important for long-term simulations in which good energy conservation is highly desirable. The era size is also important for avoiding extra data storage and a needlessly high number of iterations, which require too much running time.
As a second gain of the work, we re-designed the previous scheme and suggested a dynamically changing size for the era. With this scheme, the iteration process of the integration can adaptively follow the needs of the simulation, and the era size will be well adjusted to the physics of the problem, given a proper criterion.
The reason that our scheme shows a dramatic improvement in accuracy is time symmetry, which suppresses linear error growth. The reason that our scheme is somewhat complicated is purely empirical: all else failed in our attempts to try simpler schemes. Whether our procedure is the simplest scheme that can actually produce time-symmetric versions of block time step codes is an open question. It may well be, but we certainly have no mathematical proof. This is an interesting question to be pursued further, for theoretical as well as practical reasons.
While we have illustrated our approach for simplicity with the leapfrog scheme, all our considerations carry over to higher-order schemes, as long as the base scheme can be made time-symmetric when iterated to convergence. The first example of such a scheme is the widely used fourth-order Hermite scheme [29]. We also applied our algorithm to the newly developed sixth- and eighth-order Hermite schemes [27] to show how effective our algorithm is with higher-order schemes. The results of these tests also show big gains in accuracy [40].
Our second aim was to show that the TSBTS scheme is as suitable as previously known ones for developing parallel N-body codes. In large-scale N-body simulations, one often uses block time steps, where all time steps are forced to take on values that are powers of two. This greatly facilitates parallelization, and hence code efficiency. Parallelization of the direct N-body problem already has some difficulties because of communication costs. Communication times increase dramatically with the number of processors. Previous work shows that using more than 10 processors does not bring a big gain for a few thousand particles [36, 37, 39]. This problem recurs in the individual time step and block time step cases, together with the load balancing problem.
As the third contribution of the thesis, we produced a parallel scheme based on the copy algorithm, combined with our time symmetrized block time step scheme. We divided the force calculation into two cases according to the number of particles being integrated, to avoid bad load balancing. If the number of particles in the integrated block is greater than the number of processors, we use the classical copy algorithm approach to calculate forces. If we have fewer particles to integrate than processors, we divide the force calculations among the processors as a form of work partitioning [41].
Even though we need to spend some extra communication effort for the second case, this approach gave us good load balancing results. Much more effort has to be spent because of the iteration process, but this effort is compensated by the gain from the highly conserved total energy. Speedup and efficiency results were as we expected. The scaling of the algorithm can be improved by using hyper-systolic or other efficient algorithms [37, 38] in future work.
After many tests for many different cases with four different integration schemes, we can say that the time symmetry disturbed by block time steps for N-body problems can be reconstructed by our algorithm. We increase calculation costs for symmetrization, but this effort is worthwhile for high accuracy and conserved energy.

One item of future work is to show how effective this algorithm is for integration schemes that are structurally not time-symmetric, such as Adams-Moulton and Runge-Kutta methods. Multi-step methods are known to be not so suitable for use with individual time step algorithms because of the need for both middle points and past information. However, we keep past information in the TSBTS algorithm. This characteristic of the algorithm gives us the ability to combine such integration schemes with it. We got very promising results with the sixth- and eighth-order Hermite schemes even though their interpolation routines were not yet fully time-symmetric. Also, the orders of their first predictions are not high enough because of the difficulties with higher-order derivatives at the starting point. We will try to handle such problems in future work. Additionally, it will be very interesting to use the TSBTS algorithm with Ahmad-Cohen-like schemes [26].

REFERENCES

[1] Makino, J. & Hut, P., 1988. Performance Analysis of Direct N-Body Calculations, Astrophysical Journal Supplement, 68, 833.
[2] Hernquist, L. and Ostriker, J., 1992. A Self-Consistent Field Method for Galactic Dynamics, Astrophysical Journal, 386, 375.
[3] Aarseth, S. J., 2003. Gravitational N-Body Simulations, Cambridge University Press.
[4] Heggie, D. and Hut, P., 2003. The Gravitational Million-Body Problem, Cambridge University Press.
[5] Kawai, A., Makino, J. and Ebisuzaki, T., 2004. Performance Analysis of High-Accuracy Tree Code Based on the Pseudoparticle Multipole Method, Astrophysical Journal Supplement Series, 151, 13-33.
[6] Hut, P., Shara, M., Aarseth, S., Klessen, R. et al., 2003. MODEST-1: Integrating Stellar Evolution and Stellar Dynamics, New Astronomy, 8, 337-370.
[7] Sills, A., Deiters, S., Eggleton, P., Freitag, M. et al., 2003. MODEST-2: A Summary, New Astronomy, 8, 605-628.
[8] Greengard, L. and Rokhlin, V., 1987. A Fast Algorithm for Particle Simulations, Journal of Comp. Phys., 73, 325.
[9] Cheng, H., Greengard, L. and Rokhlin, V., 1999. A Fast Adaptive Multipole Algorithm in Three Dimensions, Journal of Comp. Phys., 155, 468-498.
[10] Hockney, R. and Eastwood, J., 1988. Computer Simulation Using Particles, Institute of Physics Publishing.
[11] Barnes, J. and Hut, P., 1989. Error Analysis of A Tree Code, Astrophysical Journal Supplement, 70, 389-417.
[12] Hernquist, L., Hut, P. and Makino, J., 1993. Discreteness Noise Versus Force Errors in N-Body Simulations, Astrophysical Journal, 402, L85.
[13] McMillan, S. and Aarseth, S., 1993. An O(N log N) Integration Scheme for Collisional Stellar Systems, Astrophysical Journal, 414, 200-212.
[14] Quinn, T., Katz, N., Stadel, J. and Lake, G. Time stepping N-body simulations, preprint, archived at http://arxiv.org/ps/astro-ph/9710043.

[15] Quinlan, G. and Hernquist, L., 1997. The Dynamical Evolution of Massive Black Hole Binaries, II. Self-consistent N-body integrations, New Astronomy, 2, 533.
[16] Dehnen, W., 2000. A very fast and momentum-conserving tree code, Astrophysical Journal, 536, L39.
[17] Bagla, J. and Ray, S., 2003. Performance Characteristics of TreePM Codes, New Astronomy, 8, 665-677.
[18] Goodman, J., Heggie, D. and Hut, P., 1993. On the Exponential Instability of N-Body Systems, Astrophysical Journal, 415, 715.
[19] Hut, P., Makino, J. and McMillan, S., 1995. Building A Better Leapfrog, Astrophysical Journal, 443, L93-L96.
[20] Hut, P., Funato, Y., Kokubo, E. and Makino, J., 1997. Time Symmetrization Meta-Algorithms, Computational Astrophysics, 123, 26-31.
[21] Leimkuhler, B. and Reich, S., 2003. Simulating Hamiltonian Dynamics, Cambridge University Press.
[22] Funato, Y., Hut, P., McMillan, S. and Makino, J., 1996. Time-Symmetrized Kustaanheimo-Stiefel Regularization, Astrophysical Journal, 112,4, 1697.
[23] Springel, V., Yoshida, N. and White, S., 2001. GADGET: A code for collisionless and gasdynamical cosmological simulations, New Astronomy, 6, 79.
[24] Makino, J., Fukushige, T., Koga, M. and Namura, K., 2003. GRAPE-6: Massively-Parallel Special-Purpose Computer for Astrophysical Particle Simulations, Publications of the Astronomical Society of Japan, 55, 1163-1187.
[25] McMillan, S., 1986. The Vectorization of small-N integrators, The Use of Supercomputers in Stellar Dynamics, 156, 61.
[26] Makino, J. and Aarseth, S., 1992. On a Hermite Integrator with Ahmad-Cohen Scheme for Gravitational Many-Body Problems, Publications of the Astronomical Society of Japan, 44, 141.
[27] Nitadori, K. and Makino, J., 2007. 6th and 8th Order Hermite Integrator for N-Body Simulations, New Astronomy, 13, 498-507.
[28] Stadel, J., G., 2001. Cosmological N-Body Simulations and Their Analysis, Ph.D. thesis, University of Washington.
[29] Makino, J., 1991. Optimal Order and Time-step Criterion for Aarseth-type N-body Integrators, Astrophysical Journal, 369, 200-212.
[30] Press, W.H. and Spergel, D., 1988. Extrapolation Schemes for N-body Codes, Astrophysical Journal, 325, 715.
[31] Gualandris, A., 2006. Simulating Self-Gravitating Systems on Parallel Computers, Ph.D. thesis, Amsterdam University.
[32] Quinlan, G., 1997. The Dynamical Evolution of Massive Black Hole Binaries, I., New Astronomy, 1, 35.
[33] Kaplan, M., Saygın, H., Hut, P. and Makino, J., 2005. A New Time-Symmetric Block Time-Step Algorithm for N-Body Simulations, preprint, archived at http://arxiv.org/abs/astro-ph/0511304.
[34] Makino, J. & Hut, P., 1990. Bottlenecks in Simulations of Dense Stellar Systems, Astrophysical Journal, 365, 208.
[35] Makino, J., Hut, P., Kaplan, M. and Saygın, H., 2006. A Time-Symmetric Block Time-Step Algorithm for N-Body Simulations, New Astronomy, 12, 124-133.
[36] Harfst, S., Gualandris, A., Merritt, D., Spurzem, R., Zwart, S. and Berczik, P., 2007. Performance analysis of direct N-Body algorithms on special-purpose supercomputers, New Astronomy, 12, 357-377.
[37] Makino, J., 2002. An efficient parallel algorithm for O(N^2) direct summation method and its variations on distributed-memory parallel machines, New Astronomy, 7, 373-384.
[38] Dorband, E., Hemsendorf, M. and Merritt, D., 2003. Systolic and Hyper-systolic Algorithms for The Gravitational N-body Problem, with an application to Brownian motion, Journal of Computational Physics, 185,2, 484-511.
[39] P. Spinnato, G.D. van Albada, P.S., 2000. Performance Analysis of Parallel N-Body Codes, ASCI 2000, Proceedings of the sixth annual conference of the Advanced School for Computing and Imaging, 213-220.
[40] Kaplan, M. and Saygın, H., 2008. Hermite Integrations with TSBTS Algorithm for N-Body Problems, MAASE'08, International Conference on Multivariate Analysis and its Application in Science and Engineering.
[41] Kaplan, M. and Saygın, H., 2008. A Dynamic Era Based Time-Symmetric Block Time-Step Algorithm with Parallel Implementations, submitted to Computer Physics Communications (in review).

A. Plummer Model

The initial density distribution is usually taken to be fairly spherical with some degree of central concentration. Traditionally, the Plummer model [Plummer, 1911] has served this purpose well. The space density of the Plummer model is given by

$$ \rho(r) = \frac{3M}{4\pi r_0^3} \, \frac{1}{[1 + (r/r_0)^2]^{5/2}}, \tag{1.1} $$

where $r_0$ is a scale factor related to the half-mass radius by $r_h \simeq 1.3 r_0$. In the following we adopt the scaling $M = 1$, $r_0 = 1$, which gives the mass inside a sphere of radius $r$ as

$$ M(r) = r^3 (1 + r^2)^{-3/2}. \tag{1.2} $$

First a radius, $r$, is chosen by setting $M(r) = X_1$, where $X_1$ is a random number in [0, 1]. Substituting into Eq. (1.2) and simplifying, we obtain

$$ r = (X_1^{-2/3} - 1)^{-1/2}. \tag{1.3} $$

A rejection may be applied in rare cases of large distances (e.g. $r > 10 r_h$). The three spatial coordinates $x$, $y$, $z$ are now selected by choosing two normalized random numbers, $X_2$, $X_3$, and writing

$$ z = (1 - 2X_2) r, \tag{1.4} $$
$$ x = (r^2 - z^2)^{1/2} \cos 2\pi X_3, \tag{1.5} $$
$$ y = (r^2 - z^2)^{1/2} \sin 2\pi X_3. \tag{1.6} $$
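
Eqs. (1.2)-(1.6) can be turned into a short position sampler. The Python sketch below is only an illustration in the scaled units M = 1, r0 = 1; the function names and the default cut-off `r_max = 13.0` (about 10 half-mass radii) are assumptions made here.

```python
import math
import random


def enclosed_mass(r):
    """Mass inside radius r, Eq. (1.2), with M = 1 and r0 = 1."""
    return r ** 3 * (1.0 + r * r) ** -1.5


def plummer_radius(x1):
    """Invert M(r) = x1 for r, Eq. (1.3); requires 0 < x1 < 1."""
    return (x1 ** (-2.0 / 3.0) - 1.0) ** -0.5


def plummer_position(rng, r_max=13.0):
    """Draw one (x, y, z) position; radii beyond r_max are rejected."""
    while True:
        x1 = rng.random()
        # Guard the end points, where Eq. (1.3) is singular.
        if x1 <= 0.0 or x1 ** (-2.0 / 3.0) - 1.0 <= 0.0:
            continue
        r = plummer_radius(x1)
        if r <= r_max:
            break
    z = (1.0 - 2.0 * rng.random()) * r               # Eq. (1.4)
    rho = math.sqrt(max(r * r - z * z, 0.0))
    phi = 2.0 * math.pi * rng.random()
    return rho * math.cos(phi), rho * math.sin(phi), z  # Eqs. (1.5), (1.6)


rng = random.Random(1)
positions = [plummer_position(rng) for _ in range(1000)]
```

Setting x1 = 0.5 in `plummer_radius` gives the half-mass radius, which comes out close to 1.3 in these units, in line with the relation quoted above.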

Let us assume isotropic velocities. From the corresponding potential $\Phi = -(1 + r^2)^{-1/2}$ in scaled units, the escape velocity is given by $v_e = 2^{1/2} (1 + r^2)^{-1/4}$. Since the system is assumed to be in a steady state, we have $f(r, v) \propto (-E)^{7/2}$ for the distribution function, where $E$ is the specific energy. Hence the probability distribution of the dimensionless velocity ratio $q = v/v_e$ is proportional to

$$ g(q) = q^2 (1 - q^2)^{7/2}. \tag{1.7} $$

To obtain the velocities, von Neumann's rejection technique is used, noting that $g(q) < 0.1$ for $q$ in [0, 1]. Let $X_4$, $X_5$ be two normalized random numbers. If $0.1 X_5 < g(X_4)$ we take $q = X_4$; otherwise a new pair of random numbers is chosen. The isotropic velocity components $v_x$, $v_y$, $v_z$ are obtained by employing the principle of Eqs. (1.4)-(1.6) above, using two further random numbers, $X_6$, $X_7$. If a general mass function is chosen, the individual masses can also be assigned sequentially since the system has some discreteness. This introduces further statistical fluctuations in the density distribution but, as can be verified, departures from overall equilibrium during the first crossing time are small if the virial ratio ($Q_{vir}$) is equal to 0.5.
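
The rejection step for the velocities can be sketched in the same way. This Python illustration assumes equal-mass particles in scaled units; the function names are hypothetical and the code is not taken from the thesis.

```python
import math
import random


def g(q):
    """Unnormalized distribution of q = v / v_e, Eq. (1.7)."""
    return q * q * (1.0 - q * q) ** 3.5


def plummer_speed(r, rng):
    """Draw a speed at radius r with von Neumann rejection."""
    while True:
        x4 = rng.random()
        x5 = rng.random()
        if 0.1 * x5 < g(x4):   # accept; 0.1 bounds g(q) on [0, 1]
            q = x4
            break
    v_esc = math.sqrt(2.0) * (1.0 + r * r) ** -0.25
    return q * v_esc


def isotropic_vector(length, rng):
    """Random direction with the given length, as in Eqs. (1.4)-(1.6)."""
    z = (1.0 - 2.0 * rng.random()) * length
    rho = math.sqrt(max(length * length - z * z, 0.0))
    phi = 2.0 * math.pi * rng.random()
    return rho * math.cos(phi), rho * math.sin(phi), z


rng = random.Random(42)
v = plummer_speed(1.0, rng)
vx, vy, vz = isotropic_vector(v, rng)
```

The maximum of g(q) lies at q² = 2/9 and is roughly 0.092, which is why 0.1 is a safe envelope for the rejection test.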
In addition, the virial ratio is given as

$$ Q_{vir} = (T + A)/|U - 2W|, \tag{1.8} $$

where $T$ is the total kinetic energy, $A$ is the angular momentum contribution, $U$ is the potential energy, and $W$ is the tidal energy.

B. Test Results for Dynamic versus Fixed Era Sizes

2.1 100-Body Tests

[Figure: ten panels, one for each 100-body problem, each plotting Energy Error against Time (0 to 1000) for the fixed era and the dynamic era.]

2.2 500-Body Tests

[Figure: ten panels, one for each 500-body problem, each plotting the energy error against Time (0 to 1000) for the fixed era and the dynamic era.]