A THREE-DIMENSIONAL CARTESIAN TREE-CODE AND APPLICATIONS TO VORTEX SHEET ROLL-UP

by Keith Lindsay

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Mathematics) in The University of Michigan 1997

Doctoral Committee:
Professor Robert Krasny, Chair
Assistant Professor Peter Smereka
Associate Professor Grétar Tryggvason
Professor Arthur Wasserman
Professor Michael Weinstein

© Keith Lindsay 1997
All Rights Reserved

This thesis is dedicated to the memory of Bruce Lindsay. I miss you and think of you often.


ACKNOWLEDGEMENTS

There are a few people I would like to thank for their support while I have worked on this thesis. I would first like to thank my advisor, Robert Krasny. With his guidance, I have learned a great deal about fluid dynamics and numerical analysis. Without his assistance, this thesis would not have been possible. I am grateful for all that he has taught me and I look forward to working with him in the future. I would also like to thank the other members of my dissertation committee, Peter Smereka, Grétar Tryggvason, Arthur Wasserman, and Michael Weinstein, for their thoughtful comments and suggestions. I extend a special thank you to Judy Florian for all of the support that she has given me. I love you very much.


TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES

CHAPTER

1. INTRODUCTION
    1.1 Overview
    1.2 Contributions of the Thesis

2. FLUID DYNAMICS
    2.1 Governing Equations
    2.2 Vortex Sheets
        2.2.1 Parametrization
        2.2.2 Desingularization
        2.2.3 Discretization
    2.3 Vortex Rings
        2.3.1 Formation
        2.3.2 Stability
        2.3.3 Interactions

3. FAST METHODS FOR PARTICLE SIMULATIONS
    3.1 Mesh Codes
    3.2 Tree Codes
    3.3 Particle-Cluster Interactions
    3.4 Tree Construction
    3.5 Recurrences for Taylor Coefficients
    3.6 Error Analysis of Particle-Cluster Interactions
    3.7 Full Description of the Algorithm
    3.8 Complexity Analysis

4. ALGORITHM VALIDATION AND PERFORMANCE
    4.1 Convergence of Vortex Method
    4.2 Selection of Runtime Parameters
    4.3 Algorithm Performance

5. APPLICATIONS
    5.1 Vortex Ring with Azimuthal Perturbation
    5.2 Elliptical Vortex Ring
    5.3 Colliding Vortex Rings

6. CONCLUSIONS
    6.1 Summary
    6.2 Directions for Future Work

APPENDICES
BIBLIOGRAPHY

LIST OF FIGURES

Figure

2.1  A vortex sheet modeling parallel shear flow.
2.2  Vortex lines and circulation. λ1, λ2 : Lagrangian parameters, y0 : reference point, y : point on surface, C : curve for circulation integral.
2.3  Discretization of parameter space and a circular disk. λ1, λ2 : Lagrangian parameters. λ1 is a radial parameter and λ2 is a parameter around the disk.
2.4  Particle insertion along a vortex line. given data (•), new particle (◦).
2.5  Vortex line insertion. given data (•), new particle (◦).
2.6  Propagating vortex ring.
2.7  Cylindrical coordinates and basis vectors.
2.8  Dispersion relation. $\mathrm{sign}(\omega^2)|\omega|$ vs. k. R = 1, δ = 0.18, 0.15, 0.12, 0.09, 0.06. Going left to right, the peaks correspond to decreasing δ.
2.9  Colliding vortex rings.
3.1  Particle-cluster interaction. x : target particle, yj : particle in cluster, τ : cell, y : center of τ.
3.2  Subdivision of space for random points. (a) Nested subdivision of space. (b) Associated tree structure.
3.3  Subdivision of space for points on a spiral. (a) Nested subdivision of space. (b) Associated tree structure.
3.4  Computing Taylor coefficients for two-dimensional example. (•) : previous step, (◦) : current step, (◦) : future step.
4.1  Profile of rolling up vortex sheet. t = 1, δ = 0.10.
4.2  Profile of rolling up vortex sheet. t = 4, δ = 0.10, ∆t = 0.05, 1 = 0.15, 0.10, 0.05, 2 = 0.05.
4.3  Execution time (sec.) vs. N0. pmax = 6 (—), 8 (– – –), 10 (· · ·).
4.4  Memory usage (MB) vs. N0. pmax = 6 (—), 8 (– – –), 10 (· · ·).
4.5  Execution time (sec.) vs. N. pmax = 8. tol = $10^{-2}$ (—), $10^{-3}$ (– – –), $10^{-4}$ (· · ·). direct summation (–·–). actual data (o), projected data (x). (a) Execution time, (b) Direct summation time / fast algorithm time.
4.6  Memory usage (MB) vs. N. pmax = 8. fast algorithm (—), direct summation (–·–). actual data (o), projected data (x). (a) Memory usage, (b) Fast algorithm memory usage / direct summation memory usage.
4.7  Actual error vs. specified tolerance. pmax = 8, N0 = 500, N = 6284, 12708, 25572, 38444, 51276. potential error bound (—), velocity error bound (· · ·).
4.8  Execution time (sec.) vs. actual error. pmax = 8, N0 = 500, N = 6284, 12708, 25572, 38444, 51276. Connected lines are tol = $10^{-2}$, $10^{-3}$, $10^{-4}$. potential error bound (—), velocity error bound (· · ·).
5.1  Variance of perturbed vortex sheet. δ = 0.10, ρ = 0.10. k : wavenumber of perturbation, t : time.
5.2  Perturbed vortex sheet. k = 5. δ = 0.10, t = 0, 2, 4, 6.
5.3  Perturbed vortex sheet. k = 9. δ = 0.10, t = 0, 2, 4, 6.
5.4  Core of perturbed vortex sheet. k = 5, 9. δ = 0.10, t = 0, 2, 4, 6.
5.5  Elliptical vortex sheet. a = 0.8. δ = 0.10, t = 0, 2, 4, 6.
5.6  Elliptical vortex sheet. a = 0.6. δ = 0.10, t = 0, 2, 4, 6.
5.7  Elliptical vortex sheet. a = 0.5. δ = 0.10, t = 0, 2, 4, 6.
5.8  Vortex sheets modeling colliding disks. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.
5.9  Cut-away of vortex sheets modeling colliding disks. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.
5.10  Vorticity isosurfaces of colliding vortex rings, perspective view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.
5.11  Vorticity isosurfaces of colliding vortex rings, front view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.
5.12  Vorticity isosurfaces of colliding vortex rings, side view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.
5.13  Vorticity isosurfaces of colliding vortex rings, top view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.

LIST OF TABLES

Table

4.1  Machine characteristics.
4.2  Maximum point position differences for circular sheet. t = 1, δ = 0.10, $e(\Delta t) = \max_i |x_i(\Delta t) - x_i(\Delta t/2)|$.

LIST OF APPENDICES

Appendix

A.  Notation
B.  Cylindrical Coordinate Identities
C.  Details from Circular Filament Analysis
    C.1 Propagation Speed of Circular Filament
    C.2 Linearized Evolution Equations for Perturbation

CHAPTER 1

INTRODUCTION

1.1 Overview

This thesis presents an algorithm for the rapid computation of three-dimensional vortex sheet motion. A vortex sheet is a material surface in the fluid across which the tangential component of fluid velocity has a jump discontinuity. Vortex sheets are frequently used as an asymptotic model for parallel shear flow. In our study of vortex sheets, the governing equations are taken in a Lagrangian form. When these equations are discretized, a large system of ordinary differential equations results. Referring to the discretization elements as particles, this system of equations is an N-body problem, a collection of N particles with pairwise interactions. In an N-body problem, it is necessary to evaluate sums of the form
\[
\sum_{j=1}^{N} K_\delta(x_i, x_j) \times w_j, \qquad i = 1, \ldots, N, \tag{1.1}
\]

where $x_i$, $x_j$ are particle positions, $w_j$ is a vector-valued weight associated with the $j$th particle, and δ is a smoothing parameter. Computing the sums in (1.1) directly, which is referred to as direct summation, requires $O(N^2)$ operations. In our simulations, N takes on values up to $10^6$, so it is not practical to perform direct summation. The algorithm presented in this thesis evaluates the above sums to a specified tolerance with $O(N \log N)$ operations. It extends the work of Draghicescu and Draghicescu [21], who studied two-dimensional vortex sheet dynamics, to the three-dimensional case. There are three main ingredients for the efficiency of the algorithm: particle-cluster interactions, a tree-based nested subdivision of space to construct particle clusters, and adaptive strategies. In this thesis, the algorithm is used to study vortex ring dynamics with a vortex sheet model.

The layout of the thesis is as follows. Chapter 2 gives an overview of the fluid dynamics relevant to our work. Vortex sheets are discussed and an overview of vortex rings is presented. Chapter 3 presents the new algorithm. It is described in detail and is related to previous work. Chapter 4 presents a validation of the algorithm, analyzing its convergence, accuracy, and speed-up. Results are presented for the test case of axisymmetric vortex ring roll-up. Chapter 5 presents simulations for perturbed vortex rings, elliptical vortex rings, and the collision and reconnection of two vortex rings, a configuration based on experiments performed by Schatzle [55]. Chapter 6 gives a summary and discusses possible extensions to the work. Appendix A contains a table of the notation, Appendix B lists identities related to cylindrical basis vectors, and Appendix C presents details from the circular filament analysis which is performed in Section 2.3.2.

1.2 Contributions of the Thesis

The thesis makes three main contributions. First, the algorithm generalizes previously developed particle simulation algorithms. The main differences between the kernel $K_\delta$ used here and the ones previously used are that $K_\delta$ is not harmonic and it is a function of three variables. Second, we introduce new forms of adaptivity into the tree-based subdivision of space. This ensures that the algorithm’s execution time will be small compared to direct summation for a variety of particle distributions. Third, we apply the algorithm to a three-dimensional smoothed vortex sheet model to study the dynamics of vortex rings. We show that the model allows vorticity isosurfaces to reconnect, even though the material surfaces do not.

CHAPTER 2

FLUID DYNAMICS

In this chapter, we present an overview of the fluid dynamics relevant to our work. In Section 2.1 we introduce the basic equations of fluid motion. Section 2.2 contains a discussion of vortex sheets, including their applications, how we parametrize them, the behavior they exhibit, and our numerical method for studying them. In Section 2.3 we introduce vortex rings as an application of vortex sheet roll-up and describe some issues that we are interested in studying, such as stability and interactions.

2.1 Governing Equations

The motion of incompressible homogeneous (i.e. constant density) fluid is governed by the Navier-Stokes equations
\[
u_t + (u \cdot \nabla) u = -\nabla p + \nu \nabla^2 u, \tag{2.1}
\]
\[
\nabla \cdot u = 0, \tag{2.2}
\]

where u(x, t) is the fluid velocity at position x and time t, p(x, t) is the fluid pressure and ν is the viscosity. Equation (2.1) is the momentum equation, a statement of Newton’s second law that mass times acceleration is equal to force. Equation (2.2) is the continuity equation, representing conservation of mass and incompressibility.


As described by Batchelor [6], in many flows the effect of viscosity is significant only in a small region of the fluid, for example in boundary layers or thin shear layers. Away from these regions, the fluid behaves as if it were inviscid. Furthermore, as demonstrated experimentally by Brown and Roshko for a turbulent mixing layer [9], the large scale features of the flow do not change for large Reynolds numbers, which may be considered the inverse of viscosity for our present purposes. So to understand the dynamics in these portions of the flow, it is useful to study the inviscid limit ν → 0 of the Navier-Stokes equations. This yields the Euler equations
\[
u_t + (u \cdot \nabla) u = -\nabla p, \tag{2.3}
\]
\[
\nabla \cdot u = 0. \tag{2.4}
\]

In this thesis, we are considering vortex sheets, a particular type of weak solution to the Euler equations. As mentioned in Chapter 1, a vortex sheet is a surface in the fluid across which the tangential component of fluid velocity has a jump discontinuity. When analyzing weak solutions of differential equations, one difficulty that may arise is a lack of uniqueness of solutions. Thus, one must choose from among the possible solutions the one that is physically significant. We view the vortex sheet as the zero viscosity limit of smooth solutions to the Navier-Stokes equations. Delort [17] proved that the two-dimensional Euler equations with vortex sheet initial data possess global weak solutions if the vorticity is of one sign. Majda [41] extended the proof to show that in the inviscid limit, solutions of the Navier-Stokes equations, with vortex sheet initial data having vorticity of one sign, converge to weak solutions of the Euler equations. It is not known if these results extend to more general vortex sheet configurations, much less to three dimensions. Uniqueness of solutions is also not known. Discussions of these and other analytical aspects of vortex sheets are given

by Majda [40] and Caflisch [10]. In the next section, we describe vortex sheets in more detail.

The above forms of the Navier-Stokes and Euler equations, in terms of velocity and pressure, are known as primitive variable formulations. An alternative form is in terms of the vorticity, $\omega = \nabla \times u$, which measures rotation within the fluid. Taking the curl of the Euler equation (2.3), we obtain
\[
\omega_t + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u. \tag{2.5}
\]
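For readers who want the intermediate step, the following is a brief sketch of how (2.5) follows from (2.3). It uses only standard vector identities together with $\nabla \cdot u = 0$ from (2.4) and $\nabla \cdot \omega = 0$ (the divergence of a curl vanishes); nothing here goes beyond the equations already stated.

```latex
\begin{align*}
(u \cdot \nabla) u &= \nabla\bigl(\tfrac{1}{2}|u|^2\bigr) - u \times \omega, \\
\nabla \times \bigl[ u_t + (u \cdot \nabla) u \bigr]
  &= \omega_t - \nabla \times (u \times \omega)
   = -\nabla \times \nabla p = 0, \\
\nabla \times (u \times \omega)
  &= (\omega \cdot \nabla) u - (u \cdot \nabla) \omega
     + u\,(\nabla \cdot \omega) - \omega\,(\nabla \cdot u)
   = (\omega \cdot \nabla) u - (u \cdot \nabla) \omega .
\end{align*}
```

Substituting the third line into the second gives $\omega_t + (u \cdot \nabla)\omega = (\omega \cdot \nabla) u$, and the pressure drops out because the curl of a gradient is zero.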

One advantage of this form is that the pressure has been removed. To close the system of equations, the velocity is recovered from the vorticity via the Biot-Savart integral
\[
u(x, t) = -\frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{(x - y) \times \omega(y, t)}{|x - y|^3} \, dy. \tag{2.6}
\]
Equation (2.5) describes how the vorticity evolves in time and can be used together with (2.6) to form a numerical method to solve the Euler equations. Another evolution equation for the vorticity can be obtained in terms of the flow map Φ(x, t), which denotes the position of the fluid particle at time t that was initially at position x at time t = 0. The equations defining Φ are
\[
\frac{\partial \Phi}{\partial t}(x, t) = u(\Phi(x, t), t), \tag{2.7a}
\]
\[
\Phi(x, 0) = x. \tag{2.7b}
\]
The evolution equation for ω in terms of Φ is
\[
\omega(\Phi(x, t), t) = \nabla \Phi(x, t) \, \omega(x, 0). \tag{2.8}
\]
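As a consistency check, one can verify that the flow-map form of the vorticity evolution reproduces (2.5). The sketch below assumes the matrix conventions $(\nabla \Phi)_{ij} = \partial \Phi_i / \partial x_j$ and $(\nabla u)_{ij} = \partial u_i / \partial x_j$, which are stated here for illustration, and differentiates both sides of $\omega(\Phi(x,t),t) = \nabla\Phi(x,t)\,\omega(x,0)$ in time with x held fixed.

```latex
\begin{align*}
\frac{d}{dt}\, \omega(\Phi(x,t), t)
  &= \bigl[ \omega_t + (u \cdot \nabla) \omega \bigr](\Phi(x,t), t)
  && \text{by (2.7a)}, \\
\frac{\partial}{\partial t} \bigl[ \nabla \Phi(x,t) \bigr] \, \omega(x, 0)
  &= \nabla_x \bigl[ u(\Phi(x,t), t) \bigr] \, \omega(x, 0) \\
  &= (\nabla u)(\Phi, t) \, \nabla \Phi(x,t) \, \omega(x, 0)
   = \bigl[ (\omega \cdot \nabla) u \bigr](\Phi(x,t), t).
\end{align*}
```

Equating the two lines recovers (2.5) along particle paths, and at t = 0 the relation holds trivially since $\nabla \Phi(x, 0)$ is the identity by (2.7b).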

Equations (2.6), (2.7) and (2.8) form a closed system which is the basis of the numerical method used in this thesis to study the Euler equations. In the remainder of this chapter we discuss vortex sheets and then present an overview of vortex rings.

Figure 2.1: A vortex sheet modeling parallel shear flow.

2.2 Vortex Sheets

As mentioned above, a vortex sheet is a surface in the fluid across which the tangential component of the fluid velocity has a jump discontinuity. Away from the surface, the fluid is assumed to be irrotational, which means that the vorticity is zero. However, since the velocity has a jump discontinuity across the surface, the vorticity is a δ-function there. One common application of vortex sheets is as a model for parallel shear flow in which the transition region between two streams of fluid is thin, as depicted in Figure 2.1. In this situation, the sheet evolves according to the velocity given by the Biot-Savart integral (2.6) and the sheet is called a free vortex sheet. Another application, described by Lamb [36], is to model the movement of a solid body through irrotational inviscid fluid by placing a vortex sheet on the body’s boundary. In this case, the sheet is called a bound vortex sheet, since it is bound to the body’s surface. Our application of vortex sheets, described in the next section, is a model of the formation process of a vortex ring. A method of generating a vortex ring is to place a solid circular disk in a fluid, give it an impulse along its axis and then dissolve the disk away. This process can be modeled by considering a bound vortex sheet on the solid disk. When the disk is dissolved away, a free vortex sheet remains in the fluid and rolls up into a vortex ring. It is this free sheet that is represented in our computations.

To compute the induced velocity of a vortex sheet, it is necessary to consider the Biot-Savart integral (2.6) in the case where the vorticity is a δ-function on a surface. Before proceeding, we introduce some additional notation. Away from the vortex sheet, the fluid is irrotational, so a velocity potential φ exists. Thus, for x not on the sheet, $u(x, t) = \nabla \phi(x, t)$. The limit of the fluid velocity exists as the sheet is approached from either side. Choosing an orientation for the sheet, let $u^+$ and $u^-$ denote the one-sided limits of u. Similarly, let $\phi^+$ and $\phi^-$ denote the one-sided limits of φ. The jumps in u and φ across the sheet are denoted $[u] = u^+ - u^-$ and $[\phi] = \phi^+ - \phi^-$ respectively. The jump in velocity is tangential to the sheet, so we have $n \cdot [u] = 0$, where n denotes a unit vector normal to the sheet. One can show that the curl of a velocity field which has a tangential jump discontinuity across a surface and is otherwise irrotational is a surface δ-function with vector-valued strength
\[
\omega = n \times [u] = n \times [\nabla \phi]. \tag{2.9}
\]
Although it is a slight abuse of notation and terminology, we will refer to this vector-valued strength as the vorticity itself. A consequence of this relationship is that the vorticity ω is parallel to the surface and perpendicular to $[\nabla \phi]$, a result we will use later. The Biot-Savart integral is interpreted with this singular vorticity, leading to the following surface integral for the induced velocity:
\[
u(x, t) = \int_S K(x, y) \times \omega(y, t) \, dS_y, \tag{2.10}
\]
where S is the sheet, x is a point not on the sheet,
\[
K(x, y) = -\frac{1}{4\pi} \frac{x - y}{|x - y|^3} \tag{2.11}
\]
is the Biot-Savart kernel, ω(y, t) is given by (2.9) and $dS_y$ is the area element of S at y. For x on the sheet, the integral in (2.10) is interpreted as a principal value integral, because it diverges otherwise.
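As a simple illustration of the sheet strength formula (2.9), consider a flat sheet separating two parallel streams, in the spirit of Figure 2.1. The coordinate choice (sheet on the plane z = 0, normal $n = e_z$, with $u^+$ taken as the limit from z > 0) and the velocity difference U are assumptions made only for this example.

```latex
\begin{align*}
u(x) &= \begin{cases} +\tfrac{U}{2}\, e_x, & z > 0, \\[2pt] -\tfrac{U}{2}\, e_x, & z < 0, \end{cases}
\qquad n = e_z, \\
[u] &= u^+ - u^- = U\, e_x, \qquad n \cdot [u] = 0, \\
\omega &= n \times [u] = U\, e_z \times e_x = U\, e_y, \\
\nabla \times u &= \frac{\partial u_x}{\partial z}\, e_y = U\, \delta(z)\, e_y .
\end{align*}
```

Here δ(z) is the Dirac delta function. The last line, computed in the sense of distributions, agrees with the strength $U e_y$ given by (2.9): the vorticity lies in the sheet and is perpendicular to the velocity jump.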

In the next subsection, we describe the Lagrangian parametrization of the vortex sheet which is the basis of our numerical method. Then we discuss the singular behavior that vortex sheets exhibit, which leads us to desingularize their motion in order to obtain a tractable model. Finally, the discretization of the equations is described.

2.2.1 Parametrization

For computations, it is advantageous to use a Lagrangian parametrization of the vortex sheet. We do this by representing the sheet as a collection of vortex lines, parametrizing across them with circulation. The Lagrangian parametrization was presented by Caflisch [10] and Kaneda [28]. The sheet’s position is denoted $y(\lambda_1, \lambda_2, t)$, where $\lambda_1$ and $\lambda_2$ are Lagrangian parameters. The induced velocity field at a point x on the sheet is
\[
u(x, t) = \mathrm{PV} \iint K(x, y(\lambda_1, \lambda_2, t)) \times \omega(\lambda_1, \lambda_2, t) \left| \frac{\partial y}{\partial \lambda_1} \times \frac{\partial y}{\partial \lambda_2} \right| d\lambda_1 \, d\lambda_2, \tag{2.12}
\]
where the PV denotes the principal value integral. Caflisch [10] and Kaneda [28] showed that the jump $[\phi(y(\lambda_1, \lambda_2, t))]$ is independent of time. The demonstration was based on the fact that the fluid pressure is continuous across the vortex sheet, which follows from conservation of momentum. Thus, we may write
\[
\phi_J(\lambda_1, \lambda_2) = [\phi(y(\lambda_1, \lambda_2, t))]. \tag{2.13}
\]
Then, using (2.9) and some algebraic manipulations, they derived the identity
\[
\omega(\lambda_1, \lambda_2, t) \left| \frac{\partial y}{\partial \lambda_1} \times \frac{\partial y}{\partial \lambda_2} \right| = \frac{\partial \phi_J}{\partial \lambda_1} \frac{\partial y}{\partial \lambda_2} - \frac{\partial \phi_J}{\partial \lambda_2} \frac{\partial y}{\partial \lambda_1}. \tag{2.14}
\]

The specific choice of $\lambda_1$ and $\lambda_2$ is made to simplify the right-hand side of this equation. We choose $\lambda_1$ to be the circulation between a fixed reference point on the sheet and other points on the sheet and $\lambda_2$ to be a parameter along curves of constant circulation, as shown in Figure 2.2.

Figure 2.2: Vortex lines and circulation. $\lambda_1$, $\lambda_2$ : Lagrangian parameters, $y_0$ : reference point, y : point on surface, C : curve for circulation integral.

We describe $\lambda_2$ first, in terms of vorticity. Vortex lines are integral curves of the vorticity. Geometrically, they are curves which are parallel to the vorticity field ω. We choose $\lambda_2$ so that at time t = 0, $\lambda_2$ is a parameter along vortex lines, ensuring that $\partial y / \partial \lambda_2$ is parallel to $\omega(\lambda_1, \lambda_2)$. It follows that
\begin{align}
\frac{\partial \phi_J}{\partial \lambda_2}
  &= \frac{\partial}{\partial \lambda_2} \bigl( \phi^+(y, 0) - \phi^-(y, 0) \bigr) \tag{2.15} \\
  &= \nabla \phi^+(y, 0) \cdot \frac{\partial y}{\partial \lambda_2} - \nabla \phi^-(y, 0) \cdot \frac{\partial y}{\partial \lambda_2} \tag{2.16} \\
  &= [\nabla \phi(y, 0)] \cdot \frac{\partial y}{\partial \lambda_2} = 0, \tag{2.17}
\end{align}
where the last equality is due to the fact that $\partial y / \partial \lambda_2$ is parallel to ω and $[\nabla \phi]$ is perpendicular to ω, as seen from (2.9). For such a choice of $\lambda_2$, the Biot-Savart integral (2.12) reduces to
\[
u(x, t) = \mathrm{PV} \iint K(x, y(\lambda_1, \lambda_2, t)) \times \frac{\partial y}{\partial \lambda_2} \frac{\partial \phi_J}{\partial \lambda_1}(\lambda_1, \lambda_2, t) \, d\lambda_1 \, d\lambda_2. \tag{2.18}
\]
For our vortex ring application, the vortex lines are closed curves. We choose $\lambda_2$ to range from 0 to 2π, so the vortex lines are 2π-periodic functions of $\lambda_2$. In the computations, $\lambda_2$ is chosen at t = 0 to be a linear rescaling of arclength. Note that this linear relationship does not hold for t > 0, because the vortex lines stretch non-uniformly as the sheet evolves.

As mentioned above, $\lambda_1$ is chosen to be the circulation between a fixed reference point on the sheet and other points on the sheet. We fix a material point $y_0$ on the sheet. For any point y on the sheet, $\lambda_1(y)$ is the circulation
\[
\lambda_1(y) = \oint_C u \cdot ds, \tag{2.19}
\]
where C is a closed curve meeting the sheet at $y_0$ and y, and ds is a line element of arclength, as shown in Figure 2.2. Kelvin’s circulation theorem states that the circulation around a set of vortex lines moving with the flow does not change in time, ensuring that $\lambda_1$ is a Lagrangian parameter. It follows from the definition that $\lambda_1 = \phi_J + c$, where c is a constant which depends only on the reference point $y_0$. Thus, $\partial \phi_J / \partial \lambda_1 = 1$ and the Biot-Savart integral (2.18) reduces to
\[
u(x, t) = \mathrm{PV} \iint K(x, y(\lambda_1, \lambda_2, t)) \times \frac{\partial y}{\partial \lambda_2}(\lambda_1, \lambda_2, t) \, d\lambda_1 \, d\lambda_2. \tag{2.20}
\]
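To make the two reductions explicit, the following sketch collects them in a single chain. It introduces no new assumptions: it starts from the identity (2.14), applies the vanishing of $\partial \phi_J / \partial \lambda_2$ from (2.17), and then uses $\lambda_1 = \phi_J + c$. The shorthand $y_{\lambda_k} = \partial y / \partial \lambda_k$ is used only here.

```latex
\begin{align*}
\omega \left| y_{\lambda_1} \times y_{\lambda_2} \right|
  &= \frac{\partial \phi_J}{\partial \lambda_1}\, y_{\lambda_2}
   - \frac{\partial \phi_J}{\partial \lambda_2}\, y_{\lambda_1}
  && \text{by (2.14)} \\
  &= \frac{\partial \phi_J}{\partial \lambda_1}\, y_{\lambda_2}
  && \text{since } \partial \phi_J / \partial \lambda_2 = 0 \text{ by (2.17); this gives (2.18)} \\
  &= y_{\lambda_2}
  && \text{since } \lambda_1 = \phi_J + c \text{; this gives (2.20).}
\end{align*}
```

The practical consequence is that the integrand no longer contains the vorticity explicitly: only the sheet position and its derivative along vortex lines are needed.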

This parametrization and the resulting form of the Biot-Savart integral are a generalization to three dimensions of the Birkhoff-Rott equation for the motion of a vortex sheet in two dimensions [8]. The circulation distribution for a vortex sheet depends on the initial condition of the specific problem being studied. We will describe it later when we discuss the application to vortex rings.

2.2.2 Desingularization

Vortex sheets exhibit behavior that smooth shear layers do not. For example, vortex sheet instabilities have arbitrarily large growth rates, the sheets form curvature singularities [46], and they roll up into infinite spirals [48]. These features make the study of vortex sheets difficult both theoretically and numerically.

As an example of the numerical difficulties, consider the motion of a flat vortex sheet. When a small amplitude perturbation is introduced to the sheet, it is amplified at a rate proportional to the spatial wavenumber of the perturbation. This is known as Kelvin-Helmholtz instability. In a numerical simulation, round-off error introduces a perturbation to the sheet whose wavenumber is inversely proportional to the spacing of the points representing the sheet. Thus, when the computational mesh is refined, the wavenumber of the round-off error perturbation increases, the perturbation is amplified more rapidly and the computations become inaccurate.

One technique to overcome this, introduced by Krasny [34], is to filter the sheet’s position at each time step. With this technique, it is possible to extend computations to longer times. However, the sheet still develops singularities in finite time. After the singularity forms, it is not possible to use the filter and round-off error grows, overwhelming the computations. Another technique, first proposed by Chorin and Bernard [15], is to desingularize the Biot-Savart kernel K(x, y). We follow this approach, using a desingularization analogous to the one used by Krasny [33] for two-dimensional vortex sheet roll-up. Our smoothed three-dimensional Biot-Savart kernel is
\[
K_\delta(x, y) = -\frac{1}{4\pi} \frac{x - y}{(|x - y|^2 + \delta^2)^{3/2}}, \tag{2.21}
\]
where δ > 0 is the smoothing parameter. We replace the singular Biot-Savart integral (2.20) with
\[
u(x, t) = \iint K_\delta(x, y(\lambda_1, \lambda_2, t)) \times \frac{\partial y}{\partial \lambda_2}(\lambda_1, \lambda_2, t) \, d\lambda_1 \, d\lambda_2. \tag{2.22}
\]
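As a brief check on the smoothed kernel, it can be written as the gradient of a scalar function, from which its boundedness and its lack of harmonicity can be read off. The name $G_\delta$ below is introduced only for illustration and is not notation from the thesis; the calculation is elementary calculus with r = |x − y|.

```latex
\begin{align*}
G_\delta(x, y) &= \frac{1}{4\pi\, (r^2 + \delta^2)^{1/2}}, \qquad r = |x - y|, \\
\nabla_x G_\delta(x, y) &= -\frac{1}{4\pi}\, \frac{x - y}{(r^2 + \delta^2)^{3/2}} = K_\delta(x, y), \\
\nabla_x^2 G_\delta(x, y) &= -\frac{3\,\delta^2}{4\pi\, (r^2 + \delta^2)^{5/2}} \neq 0 \quad \text{for } \delta > 0, \\
\max_{r \ge 0} |K_\delta| &= \frac{1}{4\pi}\, \frac{2}{3\sqrt{3}\, \delta^2} \quad \text{attained at } r = \delta / \sqrt{2}.
\end{align*}
```

For fixed δ > 0 the kernel is therefore bounded, and for x ≠ y it reduces to the singular kernel K of (2.11) as δ → 0, but $G_\delta$ does not satisfy Laplace's equation.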

This kernel was first introduced by Rosenhead [51] in the study of vortex dynamics in the wake behind a cylinder. It is related to the Plummer potential which is used in astrophysics to model the distribution of matter in a galaxy. Note that as δ → 0, $K_\delta \to K$. The introduction of δ smooths the kernel and removes its singularity at the origin. This makes it unnecessary to treat the integral in (2.22) as a principal value integral. A consequence of the smoothing is that the kernel is no longer harmonic, which is one of the main reasons for developing the new fast computational method to be described in Chapter 3. Note that it is not possible to desingularize the kernel in such a way that the result is bounded and harmonic, which follows from the maximum principle.

The strategy for computing vortex sheet roll-up is to solve the smoothed equation for fixed δ > 0 and to investigate the behavior of these solutions as δ → 0. This is analogous to finding weak solutions of the Euler equations by taking the zero viscosity limit of smooth solutions of the Navier-Stokes equations. This analogy is supported by the work of Tryggvason, Dahm and Sbeih [59], who performed computations for the δ → 0 limit of a two-dimensional vortex sheet and the ν → 0 limit of a corresponding Navier-Stokes computation. They found that the large scale features of a δ > 0 computation agree well with the features of a ν > 0 computation, and that the δ → 0 limit and the ν → 0 limit coincide. Liu and Xin [39] have shown that the δ → 0 limit of solutions to the two-dimensional vortex-blob equations is a weak solution of the Euler equations when the vorticity is of one sign. It is not known if such results hold in three dimensions.

2.2.3 Discretization

In this subsection, we describe how the sheet’s position $y(\lambda_1, \lambda_2, t)$ and velocity (2.22) are discretized. The assumptions made about the parametrization are that $\lambda_1$ measures circulation across the vortex lines and $0 \le \lambda_2 \le 2\pi$ parametrizes along the vortex lines. We discretize the parameter space as shown in Figure 2.3. We first discretize $\lambda_1$ with a uniform grid. Each $\lambda_1$ value corresponds to a vortex line which is then discretized in $\lambda_2$ with a grid that is uniform with respect to arc-length in physical space (at t = 0). Note that there are more points on longer vortex lines, which leads to $\lambda_1$ and $\lambda_2$ being treated asymmetrically. This is done to ensure spatial resolution and accuracy of partial derivative computations along the vortex lines. With these points $x_i(t)$, we discretize the Biot-Savart integral (2.22) first in $\lambda_1$ with the trapezoid rule and then in $\lambda_2$, also with the trapezoid rule. The $\partial y / \partial \lambda_2$ term in the integrand is approximated with a 2nd order centered difference. This results in a system of ordinary differential equations
\[
\frac{dx_i}{dt} = \sum_{j=1}^{N} K_\delta(x_i, x_j) \times w_j, \tag{2.23}
\]
where $x_i(t)$, $x_j(t)$ are points on the sheet and
\[
w_j = D_{\lambda_2}(x_j) \, \Delta\lambda_1 \, \Delta\lambda_2 \tag{2.24}
\]
is the product of the finite difference $D_{\lambda_2}$ along a vortex line and the integration weights $\Delta\lambda_1$ and $\Delta\lambda_2$. The integration weights are adjusted appropriately at the λ boundaries for the trapezoid rules. From here on, we refer to the $x_j$ as particles. The system of differential equations (2.23) is solved with a 4th order Runge-Kutta method. Computing the right-hand side of (2.23) by direct summation requires $O(N^2)$ operations, where N is the number of particles discretizing the vortex sheet. In Chapter 3, we present an algorithm which computes the sums in (2.23) more rapidly, to within a specified tolerance.

Figure 2.3: Discretization of parameter space and a circular disk. $\lambda_1$, $\lambda_2$ : Lagrangian parameters. $\lambda_1$ is a radial parameter and $\lambda_2$ is a parameter around the disk.

As the sheet evolves, the vortex lines can individually stretch and can also separate from each other. This causes a loss of resolution which is overcome by inserting new particles along the lines and by inserting new lines. The first case corresponds to refining in $\lambda_2$ for fixed $\lambda_1$ and the second case corresponds to refining in $\lambda_1$ globally in $\lambda_2$. The procedure for inserting a new point along a vortex line is depicted in Figure 2.4. The $\lambda_2$ coordinate of the new particle is set to be the average of the separated particles’ coordinates. The position of the new particle is computed with a cubic polynomial in $\lambda_2$ which interpolates the positions of the four particles surrounding the new particle, two on each side.

Figure 2.4: Particle insertion along a vortex line. given data (•), new particle (◦).

The procedure for adding a new vortex line when adjacent lines become separated is analogous, and is depicted in Figure 2.5. The $\lambda_1$ coordinate of the new line is the average of the separated lines’ coordinates. The particle positions on the new line are generated as follows. The first step is to select the $\lambda_2$ values where the particles will be placed. We do this by simply choosing the $\lambda_2$ values of an adjacent line. The reasoning behind this is that once refinement along vortex lines takes place, the $\lambda_2$ values on the adjacent lines yield good spatial resolution along the new line. To compute a particle position for each of these $\lambda_2$ values, we first generate a corresponding $\lambda_2$ particle position on each of the surrounding four vortex lines, two on each side. If one of these lines does not have a particle at that $\lambda_2$ value, then one is generated with a cubic interpolant as described above for particle insertion along a line. The $\lambda_2$ particle position on the new vortex line is then computed by interpolating these four particle positions with a cubic polynomial in $\lambda_1$.

Figure 2.5: Vortex line insertion. given data (•), new particle (◦).

Figure 2.6: Propagating vortex ring.

In our computations we make a change of variable $\lambda_1 = \lambda_1(\alpha)$. This will ensure accuracy by placing more vortex lines in regions where the circulation is varying rapidly. Since $\lambda_1$ is a Lagrangian parameter, α is one as well. With this change of variable, the smoothed velocity induced by the sheet is
\[
u(x, t) = \iint K_\delta(x, y(\alpha, \lambda_2, t)) \times \frac{\partial y}{\partial \lambda_2}(\alpha, \lambda_2, t) \, \lambda_1'(\alpha) \, d\alpha \, d\lambda_2. \tag{2.25}
\]

In the computations, λ1′(α) is computed analytically at t = 0. When new vortex lines are inserted, the values of λ1′(α) are obtained using a cubic interpolant of the values of λ1′(α) at the surrounding lines.
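The cubic interpolation used for particle insertion along a vortex line can be sketched as follows. This is a minimal illustration rather than the code used for the computations in this thesis: the function names `lagrange_cubic` and `insert_between` are hypothetical, and the four neighbouring particles are assumed to be supplied in order along the line with distinct λ2 values.

```python
def lagrange_cubic(params, positions, s):
    """Evaluate at s the cubic through four (parameter, position) pairs."""
    point = [0.0, 0.0, 0.0]
    for i in range(4):
        # Lagrange basis polynomial for node i, evaluated at s.
        basis = 1.0
        for j in range(4):
            if j != i:
                basis *= (s - params[j]) / (params[i] - params[j])
        for c in range(3):
            point[c] += basis * positions[i][c]
    return tuple(point)


def insert_between(params, positions):
    """Insert a particle between the middle two of four neighbours.

    The new parameter value is the average of the two separated
    particles' values; the new position comes from the cubic interpolant.
    """
    s_new = 0.5 * (params[1] + params[2])
    return s_new, lagrange_cubic(params, positions, s_new)
```

The same routine, applied in λ1 instead of λ2, would give the interpolation step used when a new vortex line is inserted.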

2.3 Vortex Rings

A vortex ring is a flow in which vorticity is concentrated and directed around a torus, as depicted in Figure 2.6. The vorticity distribution causes the fluid to rotate around the torus and the ring propagates. In this thesis, we use a desingularized vortex sheet model to investigate vortex ring dynamics. In particular, we are interested in the formation process, stability properties, and interactions between rings. We review each of these topics in the following subsections.

2.3.1 Formation

There are various methods of creating a vortex ring, each having advantages and disadvantages for the experimentalist or numerical analyst. One technique, described by Thomson and Newall [58], is to release a drop of colored liquid into a container of water. As the drop falls through the water, it rolls up around the edges and forms a descending vortex ring. This experiment can be performed with a simple apparatus, but it is difficult to simulate numerically due to the collision of the fluid boundaries and the subsequent change in topology.

Another method commonly used is to eject fluid from a circular nozzle. A shear layer separates at the opening and rolls up into a vortex ring. As described in Shariff and Leonard’s review [56], this process can be modeled using slug flow or self-similar vortex sheet roll-up. Another model, presented by Nitsche and Krasny [47], involves the roll-up of an axisymmetric vortex sheet which is not assumed to be self-similar. In their numerical computations, the sheet was desingularized in a manner similar to the method described above and they modeled the shedding of circulation at the edge of the nozzle. Their results agreed well with experiments performed by Didden [20]. The vortex sheet model used in this thesis is an extension of their work to fully three-dimensional flow.

A simple model for vortex ring formation, described by Taylor [57], is to supply an impulse to a flat circular disk along its axis of symmetry and to then dissolve the disk. When the disk is given an impulse, the velocity field in the fluid is induced by a bound vortex sheet on the surface of the disk. When the disk is dissolved, the sheet remains in the fluid and rolls up into a vortex ring. This method is more of a thought exercise and is not practical for experiments, but it is the one on which our computations are based. One reason for selecting this flow for our computations

is the absence of solid boundaries. The circulation distribution on the initially flat sheet is given by

\[ \lambda_1 = \sqrt{1 - r^2}, \tag{2.26} \]

where r is the distance from the center of the disk. The velocity field induced by this circulation distribution is balanced so that the disk propagates. Note that in Cartesian coordinates, the circulation has a square-root singularity at the boundary of the disk. This implies that the jump in velocity becomes infinite at the edge. When the smoothing effect of δ is introduced, the singularity is removed and the balance in velocity is lost, resulting in the disk rolling up into a ring. This effect occurs in physical flow, although the smoothing is due to viscosity. In terms of the vortex sheet parametrization, the disk is given by

\[ y(\lambda_1,\lambda_2) = \Bigl(\sqrt{1-\lambda_1^2}\,\cos\lambda_2,\ \sqrt{1-\lambda_1^2}\,\sin\lambda_2,\ 0\Bigr), \tag{2.27} \]

where 0 ≤ λ1 ≤ 1 and 0 ≤ λ2 ≤ 2π. For our α reparametrization, we use λ1 = cos α, which yields

\[ y(\alpha,\lambda_2) = (\sin\alpha\cos\lambda_2,\ \sin\alpha\sin\lambda_2,\ 0), \tag{2.28} \]

where 0 ≤ α ≤ π/2 and 0 ≤ λ2 ≤ 2π. The λ1′(α) term which arises in (2.25) is given by λ1′(α) = − sin α.

2.3.2 Stability

Since our model for vortex rings consists of a collection of circular vortex lines, we consider first the stability of a single circular vortex line, referred to as a vortex filament. Let y(λ, t) denote the position of a vortex filament in three dimensions, where λ is a Lagrangian parameter along the filament, 0 ≤ λ ≤ 2π and y(0, t) = y(2π, t). The filament evolves according to the equation

\[ \frac{\partial y}{\partial t}(\lambda,t) = \frac{-1}{4\pi}\int_0^{2\pi} K_\delta\bigl(y(\lambda,t),\, y(\tilde\lambda,t)\bigr) \times \frac{\partial y}{\partial \lambda}(\tilde\lambda,t)\, d\tilde\lambda, \tag{2.29} \]

where Kδ is given in (2.21). A propagating circular filament is a steady solution of (2.29) and we are interested in its stability properties. It is convenient to perform the analysis in cylindrical coordinates, so let (r, θ, z) be cylindrical coordinates, as shown in Figure 2.7. Also shown in the figure are the basis vectors er(θ), eθ(θ), and ez associated with the point (r, θ, z). Identities pertaining to this basis are listed in Appendix B. Suppose that the filament at time t = 0 is given by

\[ y(\lambda,0) = (R, \lambda, 0). \tag{2.30} \]

Substituting into (2.29), it can be shown that the filament propagates with velocity

\[ U = \frac{e_z}{4\pi R}\int_0^{2\pi} \frac{1-\cos\tilde\lambda}{\bigl(2(1-\cos\tilde\lambda) + (\delta/R)^2\bigr)^{3/2}}\, d\tilde\lambda. \tag{2.31} \]

The derivation is presented in Appendix C. It is worthwhile to point out the effect of the smoothing parameter δ. If δ were equal to zero, the filament velocity would be

\[ U = \frac{e_z}{8\pi R}\int_0^{2\pi} \frac{1}{\bigl(2(1-\cos\tilde\lambda)\bigr)^{1/2}}\, d\tilde\lambda, \tag{2.32} \]

which is a divergent integral. The integrand is positive, so considering the integral as a principal value integral will not result in a finite value. The interpretation of this equation is that a circular vortex line propagates with infinite velocity. The problem is that in an actual fluid, even when the vorticity is concentrated into a small region, the vorticity distribution does not have line delta functions. The desingularization that we use is one approach to overcome this difficulty, and was first introduced by Rosenhead [51] in the study of vortex dynamics in the wake behind a cylinder.
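The growth of the propagation speed as δ decreases can be observed numerically. The following sketch evaluates the integral in (2.31) with the periodic trapezoid rule. The function name `ring_speed` and the default number of quadrature points are assumptions made for illustration, and δ > 0 is required so that the integrand is defined at every quadrature node.

```python
import math


def ring_speed(R, delta, n=20000):
    """Approximate the speed |U| in (2.31) with the periodic trapezoid rule."""
    eps2 = (delta / R) ** 2
    h = 2.0 * math.pi / n
    total = 0.0
    for m in range(n):
        c = 1.0 - math.cos(m * h)
        # Integrand of (2.31): (1 - cos) / (2 (1 - cos) + (delta/R)^2)^(3/2).
        total += c / (2.0 * c + eps2) ** 1.5
    return total * h / (4.0 * math.pi * R)


if __name__ == "__main__":
    for delta in (0.2, 0.1, 0.05, 0.025):
        print(delta, ring_speed(1.0, delta))
```

The number of quadrature points should be large compared with R/δ, since the integrand varies on a scale of δ/R near the point where the two arguments of the kernel coincide.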

Figure 2.7: Cylindrical coordinates and basis vectors.

Intuitively, the introduction of δ into the kernel spreads the vorticity associated with the vortex line over a region around the line with radius δ. It is this effect that leads us to call the lines filaments. Another approach to overcoming this difficulty is to cut off the integral in a small neighborhood of the point λ̃ = 0, thereby removing the singularity. This technique was used by Crow [16] and Moore [45] in their study of the stability properties of the vortex pair trailing from an airplane wing.

We now analyze the linear stability of the propagating circular vortex filament

\[ y(\lambda,t) = (R, \lambda, Ut), \tag{2.33} \]

where

\[ U = \frac{1}{4\pi R}\int_0^{2\pi} \frac{1-\cos\tilde\lambda}{\bigl(2(1-\cos\tilde\lambda) + (\delta/R)^2\bigr)^{3/2}}\, d\tilde\lambda. \tag{2.34} \]

We introduce a perturbation p(λ, t) to the solution y(λ, t), which we write in terms of the cylindrical basis at (R, λ, U t),

\[ p(\lambda,t) = p_r(\lambda,t)\,e_r(\lambda) + p_\theta(\lambda,t)\,e_\theta(\lambda) + p_z(\lambda,t)\,e_z. \tag{2.35} \]

We substitute y(λ, t) + p(λ, t) into (2.29), and obtain a system of integro-differential equations for the scalars pr(λ, t), pθ(λ, t), and pz(λ, t). Then we linearize these equations about p(λ, t) = 0, which is reasonable under the assumption that |p(λ, t)| is small. The resulting linearized equations (C.18) appear in Appendix C for reference. If the equations are written in the abstract form ∂p/∂t = L(p), then it can be shown that using the Fourier basis for p(λ) diagonalizes the operator L. So we may restrict attention to a single mode of the Fourier expansion for p(λ). Thus, we fix an integer k and substitute the expression

\[ p(\lambda,t) = e^{ik\lambda+\omega t}\bigl(A_r e_r(\lambda) + A_\theta e_\theta(\lambda) + A_z e_z\bigr) \tag{2.36} \]

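into (C.18) and after some simplifications obtain the system of linear equations

\[ \begin{pmatrix} \omega & 0 & -I_1 \\ 0 & \omega & -iI_2 \\ -I_3 & 0 & \omega \end{pmatrix} \begin{pmatrix} A_r \\ A_\theta \\ A_z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \tag{2.37} \]

where the Ij are the integrals

\[ I_1 = \frac{1}{4\pi R^2}\int_0^{2\pi} \frac{\cos\tilde\lambda\,(1-\cos k\tilde\lambda) - k\sin\tilde\lambda\,\sin k\tilde\lambda}{\bigl(2(1-\cos\tilde\lambda) + (\delta/R)^2\bigr)^{3/2}}\, d\tilde\lambda, \tag{2.38} \]

\[ I_2 = \frac{1}{4\pi R^2}\int_0^{2\pi} \frac{k(1-\cos\tilde\lambda)\cos k\tilde\lambda - \sin\tilde\lambda\,\sin k\tilde\lambda}{\bigl(2(1-\cos\tilde\lambda) + (\delta/R)^2\bigr)^{3/2}}\, d\tilde\lambda, \tag{2.39} \]

\[ I_3 = \frac{1}{4\pi R^2}\int_0^{2\pi} \left[ \frac{k\sin\tilde\lambda\,\sin k\tilde\lambda + 2\cos k\tilde\lambda - \cos\tilde\lambda\,(1+\cos k\tilde\lambda)}{\bigl(2(1-\cos\tilde\lambda) + (\delta/R)^2\bigr)^{3/2}} - 3\,\frac{(1-\cos\tilde\lambda)^2(1+\cos k\tilde\lambda)}{\bigl(2(1-\cos\tilde\lambda) + (\delta/R)^2\bigr)^{5/2}} \right] d\tilde\lambda. \tag{2.40} \]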

The unbalanced form of (2.37) is due to the fact that the integral which would have appeared in the (3, 2) entry of the matrix is zero. There are non-zero solutions of the form (2.36) only if the matrix in (2.37) is singular, which is true only if the determinant of the matrix is zero,

\[ 0 = \omega(\omega^2 - I_1 I_3). \tag{2.41} \]

Thus, we have a solution only if

\[ \omega = 0 \quad \text{or} \quad \omega^2 = I_1 I_3. \tag{2.42} \]
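The product I1 I3 can be evaluated numerically for a given wavenumber. The sketch below uses the periodic trapezoid rule on the integrals (2.38) and (2.40); it is an illustration under stated assumptions and not the Maple computation used for the figures. The function names `stability_integrals` and `omega_squared`, the default parameter values, and the number of quadrature points are all hypothetical choices.

```python
import math


def stability_integrals(k, R, delta, n=4096):
    """Approximate I1 (2.38) and I3 (2.40) with the periodic trapezoid rule."""
    eps2 = (delta / R) ** 2
    h = 2.0 * math.pi / n
    i1 = 0.0
    i3 = 0.0
    for m in range(n):
        lam = m * h
        c, s = math.cos(lam), math.sin(lam)
        ck, sk = math.cos(k * lam), math.sin(k * lam)
        d = 2.0 * (1.0 - c) + eps2
        i1 += (c * (1.0 - ck) - k * s * sk) / d ** 1.5
        i3 += ((k * s * sk + 2.0 * ck - c * (1.0 + ck)) / d ** 1.5
               - 3.0 * (1.0 - c) ** 2 * (1.0 + ck) / d ** 2.5)
    scale = h / (4.0 * math.pi * R * R)
    return i1 * scale, i3 * scale


def omega_squared(k, R=1.0, delta=0.1):
    """Return I1 * I3, the nonzero root of (2.41) for omega^2."""
    i1, i3 = stability_integrals(k, R, delta)
    return i1 * i3


if __name__ == "__main__":
    for k in range(0, 11):
        print(k, omega_squared(k))
```

A positive returned value corresponds to a growing mode and a negative value to a purely oscillatory one, since the roots of (2.41) are then real or imaginary respectively.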

This relationship between k and ω, for fixed R and δ, is called a dispersion relation. For each of these values of ω, there is a corresponding (Ar, Aθ, Az) solution to (2.37). Note that the dispersion relation does not depend on I2, though the solution (Ar, Aθ, Az) does. The solution y(λ, t) is linearly stable or unstable with respect to the perturbation p(λ, t) according to whether the real part of ω is negative or positive respectively, and if the real part of ω is zero, then y(λ, t) is linearly neutrally stable with respect to the perturbation p(λ, t). From the definitions (2.38) and (2.40), we see that the product I1 I3 is real, so the stability of y(λ, t) depends upon the sign of I1 I3. If I1 I3 is negative, then y(λ, t) is linearly neutrally stable. However, if I1 I3 is positive, then there exist solutions p(λ, t) which grow and solutions p(λ, t) which decay, so y(λ, t) is unstable in general.

An observation that can be made from the definitions of the integrals I1 and I3 is that for fixed k and δ/R, ω depends linearly on R^{-2}. In particular, the sign of ω^2 will be independent of R. Thus, whether or not a filament is unstable with respect to a perturbation depends only on k and δ/R. Figure 2.8 contains a plot of sign(ω^2)|ω| as a function of k, for R = 1 and δ = 0.18, 0.15, 0.12, 0.09, 0.06. The sign term multiplying |ω| is chosen so that positive and negative values correspond to unstable and neutrally stable modes respectively. The values of the integrals were computed numerically with Maple. For values of k larger than those depicted, sign(ω^2)|ω| continues to decrease, leveling off at a value which depends upon δ and R.

Figure 2.8: Dispersion relation. sign(ω^2)|ω| vs. k. R = 1, δ = 0.18, 0.15, 0.12, 0.09, 0.06. Going left to right, the peaks correspond to decreasing δ.

For a given R and δ, ω^2 depends on k in the following qualitative manner. For k = 0 and 1, ω^2 = 0. As k increases from 1, ω^2 first decreases and then increases, following a parabolic shaped curve. After reaching a local maximum, ω^2 then decreases, eventually leveling off. For some values of δ/R, the value of ω^2 at the local maximum is positive, and for others it is not. Recall that the filament is unstable when the peak is positive and is neutrally stable otherwise. More extensive computations than those depicted in the figure do not reveal an obvious pattern for when the mode at this peak is unstable. Also, for some values, such as δ/R = 0.18 and

0.15, there is more than one k value for which ω^2 is positive. As δ decreases, the wavenumber where the peak is located increases, and more extensive computations suggest that the wavenumber grows like O(δ^{-1}) as δ → 0.

The linear stability analysis assumes that the perturbation to the filament is small compared to δ, the nominal size of the filament’s core. However, the unstable modes, when they exist, have a wavenumber proportional to δ^{-1}, which implies that these modes have spatial oscillations with wavelengths on the order of the core size. Thus, as these oscillations grow, they quickly leave the realm where the linear stability analysis is valid. So it is not clear how to interpret the results physically.

These results are qualitatively similar to those of Widnall and Sullivan’s [62] study of vortex ring stability. They used a thin filament approximation as a model for the vortex ring and overcame the divergence of the Biot-Savart integral (2.29) by using an integral cut-off and an asymptotic matching procedure to choose the location of the cut-off. They found that for certain intervals of core sizes, there is a narrow band of modes which are unstable. Rings with core size between these intervals are neutrally stable. As the core size decreases, the band of unstable modes narrows. The wavenumber that the band is centered around grows like a^{-1}, where a is the size of the core. They compared their theoretical predictions with experimental results and found a fair agreement for their prediction for the wavenumber of the unstable mode and good agreement for the amplification rate.

One obstacle to generalizing the vortex filament stability analysis to a vortex ring is that the core structure of the ring is not generally known. Thus, one needs to provide a model for the core structure. For instance, in their work mentioned above, Widnall and Sullivan [62] used a constant core radius model and a constant local volume model.
Widnall, Bliss, and Tsai [61] modeled the vorticity in the core both as

being constant and having a continuous quartic profile across the core, peaked at the center and zero at the boundary. These later models were better able to predict the wavenumber of the unstable mode than the model used in [62]. Saffman [52], using a vorticity distribution which includes viscous effects, was able to predict the unstable wavenumber found in the experiments of Krutzsch [35] and Maxworthy [42, 43]. Another model for the core vorticity distribution is a scaling of the third-order Gaussian exp(−r^3), which was used in simulations by Knio and Ghoniem [32]. In their study, they modeled the vortex ring as a collection of smooth vortex filaments, as we do. The principal differences between their model and ours are the initial placement of the filaments, the smooth kernel that is used, and the discretization. Their filaments are initialized to form a solid torus and the filament strengths are chosen to approximate the vorticity distribution. They smooth the Biot-Savart kernel by convolving it with a third-order Gaussian. Their computational results agree well with the analytical predictions of Widnall, Bliss and Tsai [61].

Another technique for generating a vorticity distribution in the ring’s core is to numerically solve the differential equations for an exactly propagating ring. The technique was used by Lifschitz, Suters, and Beale [38] in their study of the stability of axisymmetric vortex rings with swirl. In their study, they compared growth rate predictions from short wavelength asymptotics with computations using a vortex filament model of the ring. In their computations, the Biot-Savart kernel was smoothed by convolving it with a sixth degree piecewise polynomial having compact support. Their computations agreed reasonably well with the analytical predictions, the computational growth rates being consistently 1/3 to 1/2 the predicted maximum growth rates.

Figure 2.9: Colliding vortex rings.

2.3.3 Interactions

The type of vortex ring interaction that we are interested in is the collision depicted in Figure 2.9, a configuration studied experimentally by Schatzle [55]. The resulting collision exhibits vortex ring merger and has been studied experimentally, theoretically and numerically. Near the collision, oppositely oriented vortex filaments collide and merge. Our interest in the ring configuration is to find out if the vortex sheet model for vortex rings can capture such complex dynamics, despite the various simplifying assumptions built into the model. The regions of the rings which approach closely contain oppositely oriented vorticity. Saffman [53] proposed a model to describe the dynamics of vortex reconnection. As the rings meet, viscosity causes the opposite vorticity to cancel. This decrease in vorticity causes that region of each ring to stretch away from the point of contact, due to a local increase in pressure. Thus, the fluid is pushed away from the region of contact and it appears that the rings have connected. Saffman [53] modeled this process and the predictions for time scales and strain rates agreed reasonably well with

Schatzle’s experiments [55].

Various researchers have studied the vortex reconnection problem numerically. Anderson and Greengard [1] used a Lagrangian method and discretized the rings as a collection of vortex filaments. They smoothed the Biot-Savart kernel by convolving it with a characteristic function and used a constant core vorticity model for the rings. They were able to compute the early stages of the ring merger and their results agree qualitatively with Schatzle’s experiments. Numerical simulations performed by Aref and Zawadzki [4] and Winckelmans [63] reproduced well the vortex ring collision and reconnection. Using an Eulerian-Lagrangian vortex-in-cell code, they reproduced the ring merger and subsequent reconnection into two new rings, which begin to pinch off. Kida, Takaoka and Hussain [30, 31], using an Eulerian spectral method to study the vortex ring merger problem, were able to compute to later times in the sequence. However, in their computations, the rings remain connected after the reconnection, which conflicts with experimental observations of ring separation. This disparity was attributed to the fact that the experimental results are visualized with passive scalar transport, which is different from vorticity transport. The experiments do not necessarily show where the vorticity is large, since it may be amplified by the vortex stretching term in the Navier-Stokes equations.

One reason for interest in this ring configuration is related to singularity formation in solutions to the Euler equations. As the rings begin to collide, the oppositely oriented vortex filaments that approach each other begin to stretch. In an inviscid flow, this stretching intensifies the vorticity, which is a process that plays an important part in singularity formation. For instance, interacting vortex tube computations by Pumir and Kerr [49] show significant distortion in the core of the colliding rings and vortex filament computations by Pumir and Siggia [50] for other configurations indicate the possibility of singularity formation.

CHAPTER 3

FAST METHODS FOR PARTICLE SIMULATIONS

As mentioned previously, evaluating the sums in (2.23) by direct summation, a technique also referred to as the particle-particle (PP) method, requires O(N^2) operations. For large values of N, the time required to perform these operations is excessively large. Two approaches that have been developed in the past to overcome this difficulty are mesh and tree codes. They achieve their efficiency by computing approximations to the exact particle interactions. This is in contrast to algorithms such as the fast Fourier transform, which achieve efficiency by taking advantage of exact algebraic manipulations. Thus, performance is not the only issue to consider when examining mesh and tree codes, for the execution time typically depends on the desired accuracy. A brief description of mesh codes is given before proceeding to tree codes, the approach that this thesis follows.

3.1 Mesh Codes

Efficiency is gained in a mesh code by using the fact that elliptic equations can be solved rapidly on meshes. This is done either by using iterative methods such as successive overrelaxation, conjugate gradient, or multigrid, or direct methods based on cyclic reduction or the fast Fourier transform. A comprehensive reference for this material is the book by Hockney and Eastwood [27]. Though mesh codes apply to more general settings, I will describe them as applied to problems in astrophysics. In this setting, the quantities in a simulation are star positions xi(t) and masses mi. The acceleration of the ith star due to the gravitational influence of the other stars is given by

\[ a_i = G \sum_{j=1,\ j\neq i}^{N} m_j K(x_i, x_j), \tag{3.1} \]

where

\[ K(x_i, x_j) = -\frac{x_i - x_j}{|x_i - x_j|^3}. \tag{3.2} \]

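Define the mass density ρ and gravitational potential Φ by

\[ \rho(x) = \sum_{j=1}^{N} m_j\, \delta(x - x_j), \tag{3.3} \]

\[ \Phi(x) = G \sum_{j=1}^{N} m_j\, \phi(x - x_j), \tag{3.4} \]

where φ(z) = |z|^{-1}. Then from the identities

\[ K(x_i, x_j) = \nabla\phi(x_i - x_j), \tag{3.5} \]

\[ \nabla^2\phi(z) = -4\pi\delta(z), \tag{3.6} \]

it follows that

\[ \nabla^2\Phi(x) = -4\pi G\rho(x), \tag{3.7} \]

\[ a_i = \nabla\Phi(x_i). \tag{3.8} \]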
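For reference, the particle-particle evaluation of (3.1) and (3.2) that the mesh methods are designed to avoid can be sketched as follows. The function name `accelerations` and the default value G = 1 are assumptions made for illustration; the double loop makes the O(N^2) operation count explicit.

```python
def accelerations(positions, masses, G=1.0):
    """Direct (PP) evaluation of (3.1) with the kernel (3.2)."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the sum in (3.1) excludes j = i
            d = [positions[i][c] - positions[j][c] for c in range(3)]
            r3 = sum(v * v for v in d) ** 1.5
            for c in range(3):
                # K(x_i, x_j) = -(x_i - x_j) / |x_i - x_j|^3
                acc[i][c] -= G * masses[j] * d[c] / r3
    return acc
```

For two stars on the x-axis, each acceleration points toward the other star, with magnitude proportional to the other star's mass.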

The particle-mesh (PM) method superimposes a fixed mesh over the particles and uses the auxiliary functions ρ and Φ to compute ai as follows:

1. Assign a mass function ρ to the mesh from the xj and mj.
2. Solve a discretized form of the Poisson equation (3.7) on the mesh.
3. Use (3.8) to compute accelerations on the mesh.
4. Interpolate accelerations from the mesh to the star positions xj.

There are various techniques for implementing each step mentioned above. The main drawback of this method is that the accuracy is determined by the mesh size. When the grid is refined to improve the accuracy, the execution time increases. An alternative to the PM method is the particle-particle/particle-mesh (P^3M) method, which combines the PP and PM methods. Interactions between nearby particles are computed with the PP method, and the rest of the interactions are computed with the PM method. So the functions being approximated with the mesh are smoother, resulting in a smaller error than the PM method produces with the same mesh. A full discussion of these methods is beyond the scope of this thesis and the interested reader is directed to Hockney and Eastwood’s book [27] for more details.

3.2 Tree Codes

There are two main ingredients for achieving efficiency in a tree code, particle-cluster interactions and a nested subdivision of space which is used to construct the particle clusters. A particle-cluster interaction is used to rapidly compute the influence of a particle cluster on a single target particle. This is done by approximating the cumulative influence of the particles in the cluster on the target particle with a simplified expression. Once a preprocessing phase is performed, the expression can be evaluated for multiple target particles with an operation count independent of the number of particles in the cluster. We will see below that particle-cluster interactions are only performed for particles and clusters which are separated from each other. Thus, the approximation used is referred to as a far-field approximation. The nested subdivision of space, which is used to construct the particle clusters, has a natural tree structure. The objective behind the subdivision of space is to generate particle-cluster interactions in which the particle is far from the cluster, relative to the cluster’s size. The combination of these two ingredients leads to an algorithm whose asymptotic operation count is O(N log N).

Two early examples of tree code algorithms are due to Appel [3] and Barnes and Hut [5], who used the algorithms for problems in astrophysics. In these algorithms, particle-cluster interactions were performed by approximating the cluster as a single particle located at the cluster’s center of mass. A drawback of this approximation is that it has limited accuracy. The Fast Multipole Method of Greengard and Rokhlin [24, 25] overcame this obstacle by using a series expansion to approximate particle-cluster interactions to any specified tolerance. They also introduced cluster-cluster interactions by expanding the far-field approximation into a local near-field expansion for rapid evaluation at multiple target points. The series expansions used in [24, 25] are Laurent series in two space dimensions and spherical harmonic expansions in three dimensions. Van Dommelen and Rundensteiner [60] employed a similar series approach to study two-dimensional fluid flow around a cylinder which was modeled with point vortices and a random walk simulation of diffusion effects. They used a Laurent series to approximate particle-cluster interactions, but they did not use cluster-cluster interactions.
This simplifies the algorithm and results in smaller memory requirements, though for similar error tolerances, their ratio of improvement in execution time versus direct summation is less than Greengard and

Rokhlin’s two-dimensional results. Another tree code for two- and three-dimensional problems, due to Anderson [2], does not use series expansions. Instead, the approximations for particle-cluster and cluster-cluster interactions are based on the Poisson integral formula for the solution of Laplace’s equation in the interior of a circle or sphere. For two-dimensional problems, the ratio of improvement in execution time for Anderson’s algorithm is between Van Dommelen and Rundensteiner’s and Greengard and Rokhlin’s.

All of these expansions and approximations are appropriate when the interaction kernel is harmonic, such as the Newtonian potential of electrostatic and gravitational interactions, but they are unsuitable for non-harmonic kernels, such as the kernel Kδ under consideration in this thesis. This is because they rely on the harmonicity of the kernel to ensure convergence. An expansion using Cartesian Taylor series, an idea first proposed by Zhao [65], can be used to overcome this constraint. Zhao used Taylor series for simulations with the Newtonian potential, a harmonic function, in three dimensions. The motivation was to generalize Greengard and Rokhlin’s [24] two-dimensional complex Taylor series expansion to the three-dimensional setting. The first application of this expansion to particle simulations with a non-harmonic kernel was by Draghicescu and Draghicescu [21], who computed the evolution of a desingularized vortex sheet in two space dimensions. An important contribution of their work is the introduction of recurrences to rapidly compute the expansion coefficients. One contribution of this thesis is to generalize this approach to the three-dimensional vortex blob kernel. Our algorithm and the algorithm of Draghicescu and Draghicescu are like van Dommelen and Rundensteiner’s [60], in that they do not use cluster-cluster interactions.
The reasoning for this is that converting a far-field expansion into a near-field expansion for Taylor series is a time consuming procedure,

requiring O(p^3) and O(p^4) operations in two and three dimensions respectively, where p is the order of the Taylor series being used. A recent development concerning this issue is a new version of the Fast Multipole Method for the Newtonian potential by Greengard and Rokhlin [26]. They speed up the far-field to near-field conversion by using an intermediate step of converting the expansion into plane wave expansions using Bessel functions. It is a matter for future work to see if this technique can be extended to non-harmonic kernels.

Salmon and Warren [54] discussed using Taylor series for the Newtonian potential and the non-harmonic Plummer potential, although their recurrences and expansions were used only for the Newtonian potential. Using low-order methods and error bounds, they were able to improve upon the performance of the previous low-order method of Barnes and Hut [5]. Winckelmans et al. [64], improving upon the error estimates of Salmon and Warren [54], were able to simulate the vortex wake behind an accelerated airfoil using a vortex method with a cut-off Gaussian smoothed Biot-Savart kernel.

Before going into more detail, we first briefly describe the overall structure of our tree code. The algorithm has two stages, the construction of the tree and the computation of the particle velocities. The tree construction involves the recursive subdivision of space to form the nested particle clusters, and the computation of cluster parameters which are used for particle-cluster interactions. Particle velocities are computed using a combination of particle-cluster interactions and particle-particle interactions. The decision for where these interactions are performed is based upon tolerance conditions and execution time considerations. The influence of a particle cluster on a target particle is computed with a particle-cluster interaction only when a tolerance condition is satisfied and when doing so takes less time than performing

individual particle-particle interactions. Otherwise, either the cluster acts on the target particle with particle-particle interactions, or the computation descends another level into the tree and considers interactions between the cluster’s subclusters and the target particle. This process continues until either particle-cluster interactions are performed or the leaves of the tree are reached, in which case particle-particle interactions are performed. Other algorithms which have a structure similar to ours have been shown to require O(N log N) operations. We present numerical results in the next chapter which show that our algorithm’s execution time also grows like O(N log N).

The layout for the rest of this chapter is as follows. Section 3.3 describes particle-cluster interactions where the approximation is based on Taylor series. The factors which determine the efficiency of such an approximation are discussed. This motivates the nested subdivision of space described in Section 3.4. Section 3.5 describes a method based on recurrences for computing the far-field expansion coefficients. Section 3.6 presents the error analysis upon which the adaptive order selection is based. Section 3.7 gives a full description of the algorithm. Section 3.8 discusses the execution time and memory requirements of the algorithm.
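The two-stage structure described above (tree construction, followed by a recursive choice between particle-cluster and particle-particle interactions) can be illustrated with a deliberately simplified sketch. It is not the algorithm of this thesis: it uses a scalar regularized kernel instead of Kδ, keeps only the zeroth-order term of the expansion, and replaces the tolerance and timing conditions with a single hypothetical acceptance parameter `theta` that compares the cluster radius with the regularized distance to the cell center. All names (`Cell`, `evaluate`, `direct`, `theta`, `leaf_size`) are assumptions introduced for the example.

```python
import math
import random


def kernel(x, y, delta):
    """Scalar regularized kernel 1/sqrt(|x-y|^2 + delta^2), for illustration."""
    return 1.0 / math.sqrt(math.dist(x, y) ** 2 + delta * delta)


class Cell:
    """A cubic cell holding a particle cluster and its zeroth moment."""

    def __init__(self, points, weights, lo, hi, leaf_size=8, depth=0):
        self.points, self.weights = points, weights
        self.center = tuple(0.5 * (a + b) for a, b in zip(lo, hi))
        self.radius = max(math.dist(p, self.center) for p in points)
        self.moment0 = sum(weights)
        self.children = []
        if len(points) > leaf_size and depth < 20:
            # Subdivide into octants; only non-empty octants become cells.
            buckets = {}
            for p, w in zip(points, weights):
                key = tuple(c >= m for c, m in zip(p, self.center))
                pts, wts = buckets.setdefault(key, ([], []))
                pts.append(p)
                wts.append(w)
            for key, (pts, wts) in buckets.items():
                clo = tuple(m if k else a for k, a, m in zip(key, lo, self.center))
                chi = tuple(b if k else m for k, b, m in zip(key, hi, self.center))
                self.children.append(Cell(pts, wts, clo, chi, leaf_size, depth + 1))


def evaluate(x, cell, delta, theta):
    """Sum the kernel over the cell, approximating well-separated clusters."""
    big_r = math.sqrt(math.dist(x, cell.center) ** 2 + delta * delta)
    if cell.radius <= theta * big_r:
        # Particle-cluster interaction (zeroth-order term only).
        return cell.moment0 * kernel(x, cell.center, delta)
    if cell.children:
        # Descend another level into the tree.
        return sum(evaluate(x, child, delta, theta) for child in cell.children)
    # Leaf reached: particle-particle interactions.
    return sum(w * kernel(x, p, delta) for p, w in zip(cell.points, cell.weights))


def direct(x, points, weights, delta):
    return sum(w * kernel(x, p, delta) for p, w in zip(points, weights))


rng = random.Random(0)
points = [tuple(rng.random() for _ in range(3)) for _ in range(200)]
weights = [rng.random() for _ in points]
root = Cell(points, weights, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
target = (0.5, 0.5, 0.5)
approx = evaluate(target, root, 0.05, 0.3)
exact = direct(target, points, weights, 0.05)
```

Setting `theta` to zero disables the cluster approximation, so the traversal reduces to direct summation; larger values trade accuracy for fewer kernel evaluations.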

3.3 Particle-Cluster Interactions

Consider a particle-cluster interaction between a target particle x and a collection of particles yj, where j = 1 . . . Nτ. The yj are referred to as a particle cluster and the region of space containing them is referred to as a cell and is denoted τ. This situation is depicted in Figure 3.1 in two space dimensions. When the particle-cluster interaction is performed, the cumulative influence of the yj on x is replaced with a truncated series expansion. From (2.23), the influence of the yj on x is

\[ \sum_{j=1}^{N_\tau} K_\delta(x, y_j) \times w_j. \tag{3.9} \]

Figure 3.1: Particle-cluster interaction. x: target particle, yj: particle in cluster, τ: cell, ȳ: center of τ.

We expand Kδ(x, yj) in a Taylor series in the second argument about a point ȳ, the center of τ. For the moment, we will not specify where ȳ is located except to say that two natural possibilities are: (1) the center of mass of the yj, (2) the geometrical center of τ when it is a rectangular box. Using multi-index notation, we have

\[ \begin{aligned} \sum_{j=1}^{N_\tau} K_\delta(x, y_j) \times w_j &= \sum_{j=1}^{N_\tau} K_\delta\bigl(x, \bar y + (y_j - \bar y)\bigr) \times w_j \\ &= \sum_{j=1}^{N_\tau} \sum_{k} \frac{1}{k!} D_y^k K_\delta(x, \bar y)\,(y_j - \bar y)^k \times w_j \\ &= \sum_{k} \frac{1}{k!} D_y^k K_\delta(x, \bar y) \times \Biggl( \sum_{j=1}^{N_\tau} (y_j - \bar y)^k w_j \Biggr) \\ &= \sum_{k} a_k(x, \bar y) \times b_k(\tau), \end{aligned} \tag{3.10} \]

where

\[ a_k(x, \bar y) = \frac{1}{k!} D_y^k K_\delta(x, \bar y), \qquad b_k(\tau) = \sum_{j=1}^{N_\tau} (y_j - \bar y)^k w_j. \tag{3.11} \]
The ak(x, ȳ) are the Taylor coefficients of Kδ(x, y) with respect to y about y = ȳ. The bk(τ) describe the distribution of particles in τ and are referred to as particle moments. Note that the Taylor coefficients ak(x, ȳ) are independent of the particles yj in the cell τ, and the particle moments bk(τ) are independent of the target particle x. So in a certain sense, the expansion is a separation of variables. Once the particle moments are computed for one particle-cluster interaction, they can be stored and used for subsequent particle-cluster interactions with different target particles. In practice, the influence of the yj on x is approximated by truncating the infinite series in (3.10), yielding

\[ \sum_{|k|<p} a_k(x, \bar y) \times b_k(\tau), \tag{3.12} \]

where |k| = k1 + k2 + k3 and p is chosen to ensure that the error is less than a specified tolerance. The determination of p is described in Section 3.6. Let

\[ r_\tau = \max_j |y_j - \bar y| \tag{3.13} \]

be the radius of the cluster about ȳ, and

\[ R = \bigl(|x - \bar y|^2 + \delta^2\bigr)^{1/2} \tag{3.14} \]

be the regularized distance from x to the center of the cell. It is shown in Section 3.6 that the error incurred by using the truncation (3.12) is O(hp ), where h = rτ /R is the convergence factor of the expansion. Thus, the truncation is referred to as a pth order expansion for a particle-cluster interaction. A particle-cluster interaction is performed as follows : 1. Determine the minimum value of p which ensures that the series truncation error is less than the specified tolerance. 2. If the particle moments bk (τ ) have not already been computed up to order p, then compute and store them.

3. Compute the Taylor coefficients ak (x, y) for |k| < p. 4. Compute the sum in (3.12). Step 1 is based on error bounds described in Section 3.6 and can be performed with O(pmax ) operations, where pmax is the largest admissible value for p. Step 2 requires O(Nτ p3 ) operations if the particle moments have not already been computed. However, this is a one-time cost whose relative effect on the overall execution time diminishes as τ is used in more particle-cluster interactions. The exponent on p in this operation count is three because we are in three space dimensions. Step 3 can be performed with O(p3 ) operations using a method based on recurrences that is described in Section 3.5. This is the best that can be expected, since there are O(p3 ) coefficients to compute. The sum in step 4 has O(p3 ) terms and can be computed with O(p3 ) operations. Adding these operation counts, we see that a pth order particlecluster interaction requires O(p3 ) operations, assuming that the particle moments have been computed. Using a particle-cluster interaction is not always advantageous. For instance, computing the influence of the yj on x by direct summation, i.e. (3.9), may require fewer operations than computing a pth order expansion, where p has been determined by accuracy constraints. In this situation, we do not use the expansion, opting either to use direct summation or to subdivide τ and consider particle-interactions between x and the resulting subcells. This decision is based on the following considerations. Direct summation requires O(Nτ ) operations, where Nτ is the number of particles in τ , and using the expansion requires O(p3 ) operations. So if direct summation requires fewer operations than the expansion, it may be loosely stated that either Nτ is small or p is large. If Nτ is small, then there are no alternatives to direct

summation that will reduce the operation count. This is quantified by introducing a parameter N0 and using direct summation when Nτ < N0 . If Nτ ≥ N0 and p is such that using the expansion requires more operations than direct summation, then we subdivide τ and consider particle-cluster interactions with the resulting subcells. The motivation for this is that the expansions for the particle-cluster interactions with the subcells will require lower orders (smaller value of p) to satisfy the specified tolerance. This is because the convergence factors for the new expansions are at most 0.79h, where the 0.79 factor arises from the partial bisection algorithm described in the next section. So when the execution time required to perform direct summation is less than the time required to perform a particle-cluster interaction, direct summation is performed if Nτ < N0 , and τ is subdivided if Nτ ≥ N0 . The iterative application of this leads to the nested subdivision of space which is described in detail in the next section. In practice, the comparison between the time required to perform direct summation and the time required to perform a pth order particle-cluster expansion is done as follows. A stand-alone program was written which performs direct summation between a particle and a cluster. The program was run using clusters with varying numbers of particles Nτ . The execution time was fit with a linear function of Nτ . This linear function is used to estimate how long it takes to perform direct summation with a cluster that has an arbitrary number of particles. A similar program was written which performs particle-cluster expansions and the execution time of this program as a function of p was determined. These execution times are stored and a table lookup is used to determine how long an expansion takes. The comparison between execution times is made between the linear function of Nτ and the stored execution time for a pth order expansion.
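The decision rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the linear-model parameters `intercept` and `slope`, and the timing table `expansion_times` are hypothetical stand-ins for the measured quantities, not values from the thesis.

```python
def direct_time(n_tau, intercept, slope):
    """Linear timing model for direct summation over n_tau particles."""
    return intercept + slope * n_tau


def choose_action(n_tau, p, n0, intercept, slope, expansion_times):
    """Return 'expand', 'direct', or 'subdivide' for one target/cell pair."""
    t_direct = direct_time(n_tau, intercept, slope)
    t_expand = expansion_times[p]  # table lookup indexed by the order p
    if t_expand <= t_direct:
        return "expand"
    # Direct summation is cheaper than the expansion: small cells are summed
    # directly, larger cells are subdivided.
    return "direct" if n_tau < n0 else "subdivide"


# Hypothetical timings in arbitrary units (assumption): one unit per particle
# for direct summation and p**3 units for a pth order expansion.
timings = {p: float(p ** 3) for p in range(1, 11)}
print(choose_action(1000, 4, 20, 0.0, 1.0, timings))
print(choose_action(10, 4, 20, 0.0, 1.0, timings))
print(choose_action(50, 4, 20, 0.0, 1.0, timings))
```

With these made-up timings, a large cluster favors the expansion, a small cluster falls back to direct summation, and a mid-sized cluster that is not worth expanding is subdivided.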

3.4 Tree Construction

The strategy of subdividing cells when using an expansion for a particle-cluster interaction is used to compute the velocity of every particle. A consequence of this is that every cell containing particles will be recursively subdivided until the resulting subcells have fewer than $N_0$ particles. This operation of subdividing the cells can be done independently of the target particles, so it is advantageous to do it once, at the beginning of the velocity computations. The resulting collection of cells admits a natural tree structure where nodes in the tree correspond to cells of the subdivision. A cell $\tau_2$ is a child of a cell $\tau_1$ if $\tau_2$ was obtained by subdividing $\tau_1$. The tree is constructed with the following recursive algorithm:

1. The collection of particles is enclosed with a rectangular box, which becomes the root cell of the tree and is denoted $\tau_0$. Set the current cell to $\tau_0$.
2. If the current cell $\tau$ contains fewer than $N_0$ particles then exit. The cell $\tau$ is a leaf of the tree.
3. Otherwise, subdivide $\tau$ into subcells and apply step 2 to each subcell. The resulting subcells become children of $\tau$ in the tree.

There are two aspects of this tree construction that need further explanation: the choice of $N_0$ and the method by which the cells are subdivided. The choice of $N_0$ affects the performance of the algorithm in two ways. If $N_0$ is too small, then the tree will have many levels, leading to a large memory requirement. However, if $N_0$ is too large, then the tree consists of cells having large spatial dimensions, and this increases the order $p$ needed in the expansion for particle-cluster interactions, thereby increasing execution time. Computational experiments were performed on a test case to determine an appropriate value of $N_0$. These tests are described in the next chapter.

The subdivision of a cell $\tau$ is based upon $\tau$'s bounding box, the smallest rectangular box containing $\tau$'s particles whose sides are parallel to the coordinate axes. Let $l$ be the longest edge of the bounding box. The bounding box is bisected in each direction in which its length is greater than $l/\sqrt{2}$, yielding either 2, 4, or 8 subcells, and the particles are partitioned according to which subcell they are contained in. The subcells which contain particles become children of $\tau$, and the subcells with no particles are discarded. The reason for bisecting the box only in the long directions is that bisecting in short directions does not significantly reduce the convergence factor of the particle-cluster expansion. This is because the convergence factor for the expansion is proportional to $r_\tau$, the radius of the cell. Note that when the subdivision process is applied recursively to the subcells, their bounding boxes depend only on their particles. Hence, the bounding boxes shrink to fit the particle distribution. The factor $1/\sqrt{2}$ was chosen to ensure that the child cell's aspect ratio, before shrinking, is closer to 1 than the parent cell's aspect ratio. Using a different factor was not found to improve the algorithm's performance significantly.

A byproduct of this algorithm for constructing the tree is that every cell in the tree has a bounding box computed for it. We select $\bar{y}$, the base point of the Taylor series expansions, to be the center of the bounding box. Using this expansion point and shrinking the bounding boxes yields an expansion point which is close to the particles. A consequence of this is that small values of $p$ can be used for the expansion, which reduces execution time.

Figures 3.2 and 3.3 depict the subdivisions and associated trees resulting from the application of this algorithm to a random collection of points and a sequence of points on a spiral. The rectangles shown are the bounding boxes for the points within them. The rectangle borders are drawn thinner for cells deeper in the tree. For these figures, $N_0$ was set to 20 for illustrative purposes, so cells with more than 20 particles were subdivided as described above. The spiral example demonstrates how the bounding boxes shrink to fit the particle distribution. Distributions like this occur in our computations, since we are dealing with two-dimensional surfaces embedded in $\mathbb{R}^3$.
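The subdivision rule can be illustrated with a short sketch. It assumes particles are stored as 3-tuples and cells as nested dictionaries; the names `bounding_box`, `subdivide`, `build_tree`, and `leaves` are hypothetical, and the guard against coincident points is an addition for robustness that the text does not discuss.

```python
import math


def bounding_box(points):
    """Smallest axis-aligned box containing the points, as (lo, hi)."""
    lo = [min(pt[i] for pt in points) for i in range(3)]
    hi = [max(pt[i] for pt in points) for i in range(3)]
    return lo, hi


def subdivide(points):
    """Bisect the bounding box in every direction longer than l / sqrt(2)."""
    lo, hi = bounding_box(points)
    lengths = [hi[i] - lo[i] for i in range(3)]
    longest = max(lengths)
    split = [longest > 0.0 and lengths[i] > longest / math.sqrt(2.0)
             for i in range(3)]
    mid = [0.5 * (lo[i] + hi[i]) for i in range(3)]
    groups = {}
    for pt in points:
        # Key records on which side of each bisecting plane the point lies.
        key = tuple(split[i] and pt[i] > mid[i] for i in range(3))
        groups.setdefault(key, []).append(pt)
    return list(groups.values())  # empty subcells never appear


def build_tree(points, n0):
    """Return a nested dict; cells with fewer than n0 particles are leaves."""
    lo, hi = bounding_box(points)
    node = {"points": points,
            "center": [0.5 * (lo[i] + hi[i]) for i in range(3)],
            "children": []}
    if len(points) >= n0:
        subcells = subdivide(points)
        if len(subcells) > 1:  # guard: coincident points cannot be separated
            node["children"] = [build_tree(s, n0) for s in subcells]
    return node


def leaves(node):
    """Collect the leaf cells below (and including) node."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# Points on a planar spiral, loosely in the spirit of Figure 3.3.
spiral = [(0.01 * t * math.cos(0.2 * t), 0.01 * t * math.sin(0.2 * t), 0.0)
          for t in range(1, 201)]
tree = build_tree(spiral, 20)
print(len(leaves(tree)), max(len(leaf["points"]) for leaf in leaves(tree)))
```

Because each child recomputes its own bounding box from its particles, the boxes shrink to the particle distribution, and the flat spiral is never bisected in the third coordinate direction.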

3.5 Recurrences for Taylor Coefficients

Figure 3.2: Subdivision of space for random points. (a) Nested subdivision of space. (b) Associated tree structure.

Figure 3.3: Subdivision of space for points on a spiral. (a) Nested subdivision of space. (b) Associated tree structure.

For the algorithm to be computationally efficient, it is necessary to rapidly compute the $a_k(x, \bar{y})$, the Taylor coefficients of $K_\delta(x, y)$ defined in (3.11). A method for doing this is described here. Recall that the kernel $K_\delta$ is given by

\begin{equation}
K_\delta(x, y) = -\frac{1}{4\pi} \frac{x - y}{(|x - y|^2 + \delta^2)^{3/2}} . \tag{3.15}
\end{equation}

Our first observation is that the computation of the partial derivatives of $K_\delta$ can be sped up using the fact that $K_\delta(x, y) = \nabla\psi(z) = \nabla\psi(x - y)$, where

\begin{equation}
\psi(z) = \frac{1}{4\pi} (|z|^2 + \delta^2)^{-1/2} . \tag{3.16}
\end{equation}

Note that $\psi$ is a regularized form of the fundamental solution to Laplace's equation in three dimensions. The Taylor coefficients of $K_\delta$ are

\begin{equation}
a_k(x, y) = \frac{1}{k!} D_y^k K_\delta(x, y) = \frac{(-1)^{|k|}}{k!} D^k (\nabla\psi)(x - y) . \tag{3.17}
\end{equation}

Defining the quantities

\begin{equation}
c_k(x, y) = \frac{1}{k!} D^k \psi(x - y), \tag{3.18}
\end{equation}

we have the relationship

\begin{equation}
a_k = (-1)^{|k|}
\begin{pmatrix}
(k_1 + 1)\, c_{k_1+1,k_2,k_3} \\
(k_2 + 1)\, c_{k_1,k_2+1,k_3} \\
(k_3 + 1)\, c_{k_1,k_2,k_3+1}
\end{pmatrix} . \tag{3.19}
\end{equation}

We compute the $c_k(x, y)$ with recurrences and then use (3.19) to compute the $a_k(x, y)$. Another relationship between $a_k(x, y)$ and $c_k(x, y)$ arises by considering them as functions of $x$ with $y$ fixed,

\begin{equation}
a_k(x, y) = (-1)^{|k|} \nabla_x c_k(x, y) . \tag{3.20}
\end{equation}

This equation will be useful for the error analysis in the next section. To simplify the presentation of the recurrences for the $c_k(x, y)$, we first present recurrences for the Taylor coefficients of a one-dimensional analogue of $\psi$, $\psi_1(x) = (x^2 + \delta^2)^{-1/2}$.

Proposition 1 Fix $x_0 \in \mathbb{R}$ and let $c_k = \psi_1^{(k)}(x_0)/k!$ be the $k$th order Taylor coefficient of $\psi_1(x)$ at $x = x_0$. Then the $c_k$ satisfy the recurrence

\begin{equation}
(x_0^2 + \delta^2)\, c_k + 2 x_0 \Bigl(1 - \frac{1}{2k}\Bigr) c_{k-1} + \Bigl(1 - \frac{1}{k}\Bigr) c_{k-2} = 0 \tag{3.21}
\end{equation}

for $k > 0$, with the convention that $c_k = 0$ for $k < 0$.

Proof: First observe that $\psi_1$ satisfies the differential equation

\begin{equation}
(x^2 + \delta^2)\, \psi_1'(x) + x\, \psi_1(x) = 0 . \tag{3.22}
\end{equation}

Let $k > 0$. Differentiating $(k - 1)$ times, using the Leibniz rule for differentiating a product, we obtain

\begin{equation}
\begin{aligned}
(x^2 + \delta^2)\, \psi_1^{(k)}(x) &+ (k - 1)\, 2x\, \psi_1^{(k-1)}(x) + (k - 1)(k - 2)\, \psi_1^{(k-2)}(x) \\
&+ x\, \psi_1^{(k-1)}(x) + (k - 1)\, \psi_1^{(k-2)}(x) = 0 .
\end{aligned}
\tag{3.23}
\end{equation}

Substituting $x = x_0$ and grouping similar terms yields

\begin{equation}
(x_0^2 + \delta^2)\, \psi_1^{(k)}(x_0) + 2 x_0 \Bigl(k - \frac{1}{2}\Bigr) \psi_1^{(k-1)}(x_0) + (k - 1)^2\, \psi_1^{(k-2)}(x_0) = 0 . \tag{3.24}
\end{equation}

The result (3.21) is obtained on dividing by $k!$ and using the identities

\begin{equation}
\frac{k - 1/2}{k!} = \frac{1 - 1/(2k)}{(k - 1)!}, \qquad
\frac{(k - 1)^2}{k!} = \frac{1 - 1/k}{(k - 2)!} . \tag{3.25}
\end{equation}
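As a quick numerical illustration of Proposition 1, the recurrence (3.21) can be run forward from $c_0 = \psi_1(x_0)$ and the resulting Taylor polynomial compared with $\psi_1$ evaluated directly. The function name and the test values of $x_0$, $\delta$, and the offset `s` below are arbitrary choices made for this sketch.

```python
def psi1_taylor_coeffs(x0, delta, p):
    """Taylor coefficients c_0..c_p of psi1(x) = (x^2 + delta^2)^(-1/2) at x0."""
    r2 = x0 * x0 + delta * delta
    c = [r2 ** -0.5]  # c_0 = psi1(x0)
    for k in range(1, p + 1):
        c_km1 = c[k - 1]
        c_km2 = c[k - 2] if k >= 2 else 0.0  # convention: c_k = 0 for k < 0
        # Recurrence (3.21) solved for c_k.
        c.append(-(2.0 * x0 * (1.0 - 0.5 / k) * c_km1
                   + (1.0 - 1.0 / k) * c_km2) / r2)
    return c


x0, delta, s = 1.5, 0.5, 0.2
coeffs = psi1_taylor_coeffs(x0, delta, 20)
series = sum(ck * s ** k for k, ck in enumerate(coeffs))
exact = ((x0 + s) ** 2 + delta ** 2) ** -0.5
print(abs(series - exact))
```

Each coefficient costs a fixed number of operations once the two previous ones are known, which is the property the three-dimensional recurrences exploit.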

The essential ingredient in the proof of Proposition 1 is the differential equation (3.22) that $\psi_1$ satisfies. The function $\psi(z)$ of (3.16), whose Taylor coefficients we require, satisfies three differential equations which are analogous to (3.22):

\begin{align}
(|z|^2 + \delta^2) \frac{\partial\psi}{\partial z_1}(z) + z_1 \psi(z) &= 0, \tag{3.26a} \\
(|z|^2 + \delta^2) \frac{\partial\psi}{\partial z_2}(z) + z_2 \psi(z) &= 0, \tag{3.26b} \\
(|z|^2 + \delta^2) \frac{\partial\psi}{\partial z_3}(z) + z_3 \psi(z) &= 0. \tag{3.26c}
\end{align}

Following the proof of Proposition 1, we obtain the following result.

Proposition 2 Let $z \in \mathbb{R}^3$, $R = (|z|^2 + \delta^2)^{1/2}$ and $c_k = \frac{1}{k!} D^k \psi(z)$. Then

\begin{align}
R^2 c_{k_1,k_2,k_3} &+ 2 z_1 \Bigl(1 - \frac{1}{2k_1}\Bigr) c_{k_1-1,k_2,k_3} + \Bigl(1 - \frac{1}{k_1}\Bigr) c_{k_1-2,k_2,k_3} \notag \\
&+ 2 z_2 c_{k_1,k_2-1,k_3} + c_{k_1,k_2-2,k_3} + 2 z_3 c_{k_1,k_2,k_3-1} + c_{k_1,k_2,k_3-2} = 0, \tag{3.27a} \\
R^2 c_{k_1,k_2,k_3} &+ 2 z_2 \Bigl(1 - \frac{1}{2k_2}\Bigr) c_{k_1,k_2-1,k_3} + \Bigl(1 - \frac{1}{k_2}\Bigr) c_{k_1,k_2-2,k_3} \notag \\
&+ 2 z_1 c_{k_1-1,k_2,k_3} + c_{k_1-2,k_2,k_3} + 2 z_3 c_{k_1,k_2,k_3-1} + c_{k_1,k_2,k_3-2} = 0, \tag{3.27b} \\
R^2 c_{k_1,k_2,k_3} &+ 2 z_3 \Bigl(1 - \frac{1}{2k_3}\Bigr) c_{k_1,k_2,k_3-1} + \Bigl(1 - \frac{1}{k_3}\Bigr) c_{k_1,k_2,k_3-2} \notag \\
&+ 2 z_1 c_{k_1-1,k_2,k_3} + c_{k_1-2,k_2,k_3} + 2 z_2 c_{k_1,k_2-1,k_3} + c_{k_1,k_2-2,k_3} = 0, \tag{3.27c}
\end{align}

where $k_1 > 0$ in (3.27a), $k_2 > 0$ in (3.27b), $k_3 > 0$ in (3.27c), with the convention that $c_k = 0$ when any of the indices are negative.

These recurrences are used to compute the $c_k$ for $|k| \le p$ with $O(p^3)$ operations. To demonstrate how this is done and to simplify the presentation, we first explain the process for a two-dimensional analogue, obtained by omitting the $z_3$ and $k_3$ dependence. We describe the generalization to the three-dimensional case afterward. So consider the two recurrences

\begin{align}
R^2 c_{k_1,k_2} + 2 z_1 \Bigl(1 - \frac{1}{2k_1}\Bigr) c_{k_1-1,k_2} + \Bigl(1 - \frac{1}{k_1}\Bigr) c_{k_1-2,k_2} + 2 z_2 c_{k_1,k_2-1} + c_{k_1,k_2-2} &= 0, \tag{3.28a} \\
R^2 c_{k_1,k_2} + 2 z_1 c_{k_1-1,k_2} + c_{k_1-2,k_2} + 2 z_2 \Bigl(1 - \frac{1}{2k_2}\Bigr) c_{k_1,k_2-1} + \Bigl(1 - \frac{1}{k_2}\Bigr) c_{k_1,k_2-2} &= 0. \tag{3.28b}
\end{align}

The coefficients are computed in the following 4 steps, depicted in Figure 3.4.

1. Compute $c_{0,0}$ from the definition.
2. Compute $c_{k,0}$ and $c_{0,k}$ for $k = 1, \ldots, p$.
3. Compute $c_{k,1}$ and $c_{1,k}$ for $k = 1, \ldots, p - 1$.
4. Compute $c_{k_1,k_2}$ for $k_1 + k_2 \le p$.

The coefficients obtained in step 4 are computed row-by-row. The computation is ordered this way to ensure that the coefficients needed for the recurrences are available. It also breaks the code into blocks which correspond to the cases when different coefficient indices arising in the recurrences are negative, allowing for a more understandable code. This is more of an issue in the three-dimensional case, where there are more index cases to consider.

Figure 3.4: Computing Taylor coefficients for two-dimensional example. Panels: Step 1, Step 2, Step 3, Step 4. Markers distinguish coefficients computed in a previous step, in the current step, and in a future step.

The computations for the three-dimensional case are performed in the following steps:

1. Compute $c_0$ from the definition.
2. Compute $c_k$ when two indices are 0 and the other is $\ge 1$.
3. Compute $c_k$ when one index is 0, one index is 1 and the other is $\ge 1$.
4. Compute $c_k$ when one index is 0 and the other two are both $\ge 2$.
5. Compute $c_k$ when two indices are 1 and the other is $\ge 1$.
6. Compute $c_k$ when one index is 1 and the other two are $\ge 2$.
7. Compute $c_k$ when all of the indices are $\ge 2$.

As in the two-dimensional case, this ordering of the steps ensures that coefficients needed for the recurrences are available. Once the coefficients are computed, (3.19) is used to compute the $a_k(x, \bar{y})$ for $|k| < p$ with another $O(p^3)$ operations. Thus, the overall operation count for computing the $a_k(x, \bar{y})$ for a $p$th order particle-cluster interaction is $O(p^3)$.
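The recurrences (3.27) and the relation (3.19) can be exercised with a small sketch. Note the assumption: instead of the seven-step ordering described above, this version sweeps by total degree $|k|$ and, for each multi-index, applies whichever of (3.27a,b,c) matches its first nonzero index. That ordering also guarantees that all needed coefficients are available, but it is a simplification chosen for brevity, and all function names are hypothetical.

```python
import math


def psi_taylor_coeffs(z, delta, p):
    """Return {k: c_k} for |k| <= p, where c_k = D^k psi(z) / k!."""
    r2 = z[0] ** 2 + z[1] ** 2 + z[2] ** 2 + delta ** 2
    c = {(0, 0, 0): 1.0 / (4.0 * math.pi * math.sqrt(r2))}

    def lower(k, m, s):
        # c at index k - s*e_m; negative indices are absent and count as zero
        idx = list(k)
        idx[m] -= s
        return c.get(tuple(idx), 0.0)

    for n in range(1, p + 1):
        for k1 in range(n + 1):
            for k2 in range(n - k1 + 1):
                k = (k1, k2, n - k1 - k2)
                i = next(m for m in range(3) if k[m] > 0)
                acc = 0.0
                for m in range(3):
                    if m == i:  # direction carrying the (1 - 1/(2k)) factors
                        acc += 2.0 * z[m] * (1.0 - 0.5 / k[m]) * lower(k, m, 1)
                        acc += (1.0 - 1.0 / k[m]) * lower(k, m, 2)
                    else:
                        acc += 2.0 * z[m] * lower(k, m, 1) + lower(k, m, 2)
                c[k] = -acc / r2
    return c


def kernel_taylor_coeffs(x, ybar, delta, p):
    """Return {k: a_k} for |k| < p using relation (3.19)."""
    z = [x[i] - ybar[i] for i in range(3)]
    c = psi_taylor_coeffs(z, delta, p)
    a = {}
    for k in c:
        n = sum(k)
        if n >= p:
            continue
        sign = -1.0 if n % 2 else 1.0
        k1, k2, k3 = k
        a[k] = (sign * (k1 + 1) * c[(k1 + 1, k2, k3)],
                sign * (k2 + 1) * c[(k1, k2 + 1, k3)],
                sign * (k3 + 1) * c[(k1, k2, k3 + 1)])
    return a


def kernel(x, y, delta):
    """Regularized kernel K_delta(x, y) of (3.15)."""
    z = [x[i] - y[i] for i in range(3)]
    r3 = (sum(t * t for t in z) + delta ** 2) ** 1.5
    return tuple(-t / (4.0 * math.pi * r3) for t in z)


# Compare the truncated expansion about ybar with the kernel itself.
x, ybar, yj, delta, p = (2.0, 1.0, -1.5), (0.0, 0.0, 0.0), (0.1, -0.05, 0.08), 0.1, 10
d = [yj[i] - ybar[i] for i in range(3)]
approx = [0.0, 0.0, 0.0]
for k, ak in kernel_taylor_coeffs(x, ybar, delta, p).items():
    mono = d[0] ** k[0] * d[1] ** k[1] * d[2] ** k[2]
    for i in range(3):
        approx[i] += ak[i] * mono
exact = kernel(x, yj, delta)
print(max(abs(approx[i] - exact[i]) for i in range(3)))
```

In a particle-cluster interaction the monomials would be replaced by the stored moments $b_k(\tau)$; a single source particle is used here only so that the result can be compared against the kernel directly.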

3.6 Error Analysis of Particle-Cluster Interactions

In this section, we obtain a bound for the error due to the series truncation in a particle-cluster interaction. This bound is used in the algorithm to compute an order $p$ to satisfy the specified tolerance. So consider a cell $\tau$ with particles $y_j$ acting on the target particle $x$. To simplify the analysis, we initially bound the error in the series truncation for the influence of a single particle $y_j$ in $\tau$ on $x$. The triangle inequality then ensures that the total error in the particle-cluster interaction is less than the sum of the individual errors. From (3.12), the computed influence of $y_j$ on $x$ is

\begin{equation}
\sum_{|k| < p} a_k(x, \bar{y}) \times (y_j - \bar{y})^k w_j . \tag{3.29}
\end{equation}

Because the vector weight $w_j$ is independent of $k$, it factors out of the equation, so we restrict our attention to the expression

\begin{equation}
\sum_{|k| < p} a_k(x, \bar{y}) (y_j - \bar{y})^k . \tag{3.30}
\end{equation}

To analyze the rate of convergence of this series as $p$ increases, the quantities

\begin{equation}
S_n = \sum_{|k| = n} a_k(x, \bar{y}) (y_j - \bar{y})^k \tag{3.31}
\end{equation}

are introduced, where the dependence of $S_n$ on $x$, $\bar{y}$, and $y_j$ is not explicitly displayed for notational convenience. With the series truncation that we are using, $|k| < p$, the multi-dimensional series (3.30) has been reduced to a one-dimensional series

\begin{equation}
\sum_{|k| < p} a_k(x, \bar{y}) (y_j - \bar{y})^k = \sum_{n < p} S_n . \tag{3.32}
\end{equation}

We estimate the error due to the truncation by showing that the magnitudes of the $S_n$ decrease geometrically and using a bound on the first omitted term $S_p$. The differential equation relating $a_k(x, \bar{y})$ and $c_k(x, \bar{y})$, (3.20), leads us to introduce the quantity

\begin{equation}
T_n = \sum_{|k| = n} c_k(x, \bar{y}) (y_j - \bar{y})^k , \tag{3.33}
\end{equation}

which is related to $S_n$ by

\begin{equation}
S_n = (-1)^n \nabla_x T_n . \tag{3.34}
\end{equation}

From the recurrences for $c_k(x, \bar{y})$, (3.27), we derive a recurrence for $T_n$. From this, we derive an explicit expression for $T_n$ which involves Legendre polynomials. Then using (3.34), we derive an expression for $S_n$. This expression is used to estimate the error incurred by truncating the series (3.12).

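Proposition 3 With $T_n$, $c_k(x, \bar{y})$ and $R$ defined as above,

\begin{equation}
R^2 T_n + 2\alpha \Bigl(1 - \frac{1}{2n}\Bigr) T_{n-1} + \beta^2 \Bigl(1 - \frac{1}{n}\Bigr) T_{n-2} = 0, \tag{3.35}
\end{equation}

where

\begin{equation}
\alpha = (x - \bar{y}) \cdot (y_j - \bar{y}), \qquad \beta = |y_j - \bar{y}| . \tag{3.36}
\end{equation}

Proof: Consider a term $R^2 c_k(x, \bar{y})(y_j - \bar{y})^k$ from the sum in (3.33) which makes up $R^2 T_n$. Letting $z = x - \bar{y}$, we apply a linear combination of the identities (3.27a,b,c) with weights $k_1/n$, $k_2/n$ and $k_3/n$ respectively and solve for $R^2 c_k(x, \bar{y})(y_j - \bar{y})^k$. The result is a weighted sum of $c_k(x, \bar{y})$ over $|k| = n - 1$ and $n - 2$. The weight on a particular $c_k(x, \bar{y})$ is computed as follows. If $|k| = n - 1$, then the $c_k(x, \bar{y})$ arose from identities (3.27) being applied to $R^2 c_k(x, \bar{y})$ terms with $k$ equal to $(k_1 + 1, k_2, k_3)$, $(k_1, k_2 + 1, k_3)$, or $(k_1, k_2, k_3 + 1)$. The resulting weight multiplying $c_k(x, \bar{y})$ is then (using $k_1 + k_2 + k_3 = n - 1$)

\begin{align}
& 2 z_1 \Bigl[ \frac{k_1 + 1}{n} \Bigl(1 - \frac{1}{2(k_1 + 1)}\Bigr) + \frac{k_2}{n} + \frac{k_3}{n} \Bigr] (y_j - \bar{y})^{(k_1+1,k_2,k_3)} \notag \\
& \quad + 2 z_2 \Bigl[ \frac{k_1}{n} + \frac{k_2 + 1}{n} \Bigl(1 - \frac{1}{2(k_2 + 1)}\Bigr) + \frac{k_3}{n} \Bigr] (y_j - \bar{y})^{(k_1,k_2+1,k_3)} \notag \\
& \quad + 2 z_3 \Bigl[ \frac{k_1}{n} + \frac{k_2}{n} + \frac{k_3 + 1}{n} \Bigl(1 - \frac{1}{2(k_3 + 1)}\Bigr) \Bigr] (y_j - \bar{y})^{(k_1,k_2,k_3+1)} \tag{3.37} \\
&= 2 z_1 \Bigl(1 - \frac{1}{2n}\Bigr) (y_j - \bar{y})^{(k_1+1,k_2,k_3)} + 2 z_2 \Bigl(1 - \frac{1}{2n}\Bigr) (y_j - \bar{y})^{(k_1,k_2+1,k_3)} \notag \\
& \quad + 2 z_3 \Bigl(1 - \frac{1}{2n}\Bigr) (y_j - \bar{y})^{(k_1,k_2,k_3+1)} \tag{3.38} \\
&= 2 \bigl\{ z_1 (y_{j,1} - \bar{y}_1) + z_2 (y_{j,2} - \bar{y}_2) + z_3 (y_{j,3} - \bar{y}_3) \bigr\} \Bigl(1 - \frac{1}{2n}\Bigr) (y_j - \bar{y})^k \tag{3.39} \\
&= 2 z \cdot (y_j - \bar{y}) \Bigl(1 - \frac{1}{2n}\Bigr) (y_j - \bar{y})^k \tag{3.40} \\
&= 2\alpha \Bigl(1 - \frac{1}{2n}\Bigr) (y_j - \bar{y})^k . \tag{3.41}
\end{align}

If $|k| = n - 2$, then the $c_k(x, \bar{y})$ arose from identities (3.27) being applied to $R^2 c_k(x, \bar{y})$ terms with $k$ equal to $(k_1 + 2, k_2, k_3)$, $(k_1, k_2 + 2, k_3)$, or $(k_1, k_2, k_3 + 2)$. The resulting weight multiplying $c_k(x, \bar{y})$ is then (using $k_1 + k_2 + k_3 = n - 2$)

\begin{align}
& \Bigl[ \frac{k_1 + 2}{n} \Bigl(1 - \frac{1}{k_1 + 2}\Bigr) + \frac{k_2}{n} + \frac{k_3}{n} \Bigr] (y_j - \bar{y})^{(k_1+2,k_2,k_3)} \notag \\
& \quad + \Bigl[ \frac{k_1}{n} + \frac{k_2 + 2}{n} \Bigl(1 - \frac{1}{k_2 + 2}\Bigr) + \frac{k_3}{n} \Bigr] (y_j - \bar{y})^{(k_1,k_2+2,k_3)} \notag \\
& \quad + \Bigl[ \frac{k_1}{n} + \frac{k_2}{n} + \frac{k_3 + 2}{n} \Bigl(1 - \frac{1}{k_3 + 2}\Bigr) \Bigr] (y_j - \bar{y})^{(k_1,k_2,k_3+2)} \tag{3.42} \\
&= \Bigl(1 - \frac{1}{n}\Bigr) (y_j - \bar{y})^{(k_1+2,k_2,k_3)} + \Bigl(1 - \frac{1}{n}\Bigr) (y_j - \bar{y})^{(k_1,k_2+2,k_3)} + \Bigl(1 - \frac{1}{n}\Bigr) (y_j - \bar{y})^{(k_1,k_2,k_3+2)} \tag{3.43} \\
&= |y_j - \bar{y}|^2 \Bigl(1 - \frac{1}{n}\Bigr) (y_j - \bar{y})^k \tag{3.44} \\
&= \beta^2 \Bigl(1 - \frac{1}{n}\Bigr) (y_j - \bar{y})^k . \tag{3.45}
\end{align}

Thus, when $R^2 T_n$ is expanded with identities (3.27), the result is

\begin{align}
R^2 T_n &= - \sum_{|k| = n-1} 2\alpha \Bigl(1 - \frac{1}{2n}\Bigr) c_k(x, \bar{y}) (y_j - \bar{y})^k - \sum_{|k| = n-2} \beta^2 \Bigl(1 - \frac{1}{n}\Bigr) c_k(x, \bar{y}) (y_j - \bar{y})^k \tag{3.46} \\
&= -2\alpha \Bigl(1 - \frac{1}{2n}\Bigr) T_{n-1} - \beta^2 \Bigl(1 - \frac{1}{n}\Bigr) T_{n-2} . \tag{3.47}
\end{align}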
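The recurrence (3.35) can be sanity-checked numerically. Since the $c_k(x, \bar{y})$ are the Taylor coefficients of $\psi$ at $x - \bar{y}$, expanding $\psi(x - y_j) = \psi((x - \bar{y}) - (y_j - \bar{y}))$ gives $\psi(x - y_j) = \sum_n (-1)^n T_n$ whenever $\beta < R$. The sketch below generates the $T_n$ from (3.35), starting from $T_0 = c_0 = \psi(x - \bar{y})$, and compares the alternating sum with $\psi(x - y_j)$. The function name and the sample points are arbitrary choices for illustration.

```python
import math


def t_sequence(x, ybar, yj, delta, n_max):
    """Generate T_0..T_{n_max} from recurrence (3.35), with T_0 = psi(x - ybar)."""
    z = [x[i] - ybar[i] for i in range(3)]
    d = [yj[i] - ybar[i] for i in range(3)]
    r2 = sum(t * t for t in z) + delta ** 2      # R^2
    alpha = sum(z[i] * d[i] for i in range(3))   # (x - ybar) . (y_j - ybar)
    beta2 = sum(t * t for t in d)                # |y_j - ybar|^2
    T = [1.0 / (4.0 * math.pi * math.sqrt(r2))]
    for n in range(1, n_max + 1):
        t_nm1 = T[n - 1]
        t_nm2 = T[n - 2] if n >= 2 else 0.0  # no T_{-1} term when n = 1
        T.append(-(2.0 * alpha * (1.0 - 0.5 / n) * t_nm1
                   + beta2 * (1.0 - 1.0 / n) * t_nm2) / r2)
    return T


x, ybar, yj, delta = (2.0, 1.0, -1.5), (0.0, 0.0, 0.0), (0.3, -0.2, 0.25), 0.1
T = t_sequence(x, ybar, yj, delta, 40)
series = sum((-1) ** n * tn for n, tn in enumerate(T))
dist2 = sum((x[i] - yj[i]) ** 2 for i in range(3)) + delta ** 2
exact = 1.0 / (4.0 * math.pi * math.sqrt(dist2))
print(abs(series - exact))
```

The point of the reduction is visible in the code: the whole shell $|k| = n$ of coefficients is summarized by one scalar $T_n$ that depends only on $\alpha$, $\beta$, and $R$.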

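The recurrence (3.35) is related to the one satisfied by the Legendre polynomials,

\begin{equation}
P_n(x) - 2x \Bigl(1 - \frac{1}{2n}\Bigr) P_{n-1}(x) + \Bigl(1 - \frac{1}{n}\Bigr) P_{n-2}(x) = 0, \tag{3.48}
\end{equation}

for $n \ge 2$ with $P_0(x) = 1$ and $P_1(x) = x$ [22, Chapter 10]. This observation leads to the following explicit formula for $T_n$.

Proposition 4 With $\alpha$, $\beta$, and $R$ defined as above,

\begin{equation}
T_n = \frac{h^n}{4\pi R} P_n\Bigl(-\frac{\alpha}{\beta R}\Bigr), \tag{3.49}
\end{equation}

where

\begin{equation}
h = \frac{\beta}{R} . \tag{3.50}
\end{equation}

Proof: The proof works by showing that $T_n$ and the right-hand side of (3.49), which we refer to as $\bar{T}_n$, satisfy the same two-term recurrence and have the same values for $n = 0, 1$. The recurrence for $T_n$ is given in Proposition 3. From the recurrence for $P_n$ in (3.48), we have that $P_n(-\frac{\alpha}{\beta R})$ is a solution of the recurrence

\begin{equation}
f_n + 2 \frac{\alpha}{\beta R} \Bigl(1 - \frac{1}{2n}\Bigr) f_{n-1} + \Bigl(1 - \frac{1}{n}\Bigr) f_{n-2} = 0 . \tag{3.51}
\end{equation}

Thus, $h^n P_n(-\frac{\alpha}{\beta R})$ is a solution of

\begin{equation}
f_n + 2h \frac{\alpha}{\beta R} \Bigl(1 - \frac{1}{2n}\Bigr) f_{n-1} + h^2 \Bigl(1 - \frac{1}{n}\Bigr) f_{n-2} = 0 . \tag{3.52}
\end{equation}

Multiplying this equation by $R^2$ and using $h = \beta/R$ yields

\begin{equation}
R^2 f_n + 2\alpha \Bigl(1 - \frac{1}{2n}\Bigr) f_{n-1} + \beta^2 \Bigl(1 - \frac{1}{n}\Bigr) f_{n-2} = 0, \tag{3.53}
\end{equation}

which is the recurrence for $T_n$. Since $\bar{T}_n$ is a multiple of $h^n P_n(-\frac{\alpha}{\beta R})$, it too satisfies the recurrence. So $T_n$ and $\bar{T}_n$ satisfy the same two-term recurrence. From the definition of $T_n$, (3.33), and the recurrence that the $T_n$ satisfy, (3.35), the initial values for $T_n$ are

\begin{align}
T_0 &= c_0(x, \bar{y}) = \frac{1}{4\pi} (|x - \bar{y}|^2 + \delta^2)^{-1/2} = \frac{1}{4\pi R}, \tag{3.54} \\
T_1 &= -\frac{\alpha}{R^2} T_0 = -\frac{\alpha}{4\pi R^3} . \tag{3.55}
\end{align}

Using $h = \beta/R$, the initial values for $\bar{T}_n$ are

\begin{align}
\bar{T}_0 &= \frac{h^0}{4\pi R} P_0\Bigl(-\frac{\alpha}{\beta R}\Bigr) = \frac{1}{4\pi R}, \tag{3.56} \\
\bar{T}_1 &= \frac{h^1}{4\pi R} P_1\Bigl(-\frac{\alpha}{\beta R}\Bigr) = -\frac{\alpha}{4\pi R^3} . \tag{3.57}
\end{align}

Thus, $T_n = \bar{T}_n$ for all $n \ge 0$.

To obtain an expression for $S_n$, we take the gradient of $T_n$ with respect to $x$. When written out in terms of $x$, $y_j$, and $\bar{y}$, we have from (3.49)

\begin{equation}
T_n = \frac{|y_j - \bar{y}|^n}{4\pi} \bigl(|x - \bar{y}|^2 + \delta^2\bigr)^{-(n+1)/2} P_n\Bigl(-\frac{(x - \bar{y}) \cdot (y_j - \bar{y})}{|y_j - \bar{y}| (|x - \bar{y}|^2 + \delta^2)^{1/2}}\Bigr) . \tag{3.58}
\end{equation}

Defining

\begin{equation}
\gamma = -\frac{(x - \bar{y}) \cdot (y_j - \bar{y})}{|y_j - \bar{y}| (|x - \bar{y}|^2 + \delta^2)^{1/2}}, \tag{3.59}
\end{equation}

we have

\begin{equation}
\begin{aligned}
S_n = \frac{(-1)^n |y_j - \bar{y}|^n}{4\pi} \Bigl\{ & -(n + 1)(x - \bar{y}) \bigl(|x - \bar{y}|^2 + \delta^2\bigr)^{-(n+3)/2} P_n(\gamma) \\
& + \bigl(|x - \bar{y}|^2 + \delta^2\bigr)^{-(n+1)/2} P_n'(\gamma) \Bigl[ \frac{-(y_j - \bar{y})}{|y_j - \bar{y}| (|x - \bar{y}|^2 + \delta^2)^{1/2}} + \frac{(x - \bar{y}) \cdot (y_j - \bar{y})}{|y_j - \bar{y}| (|x - \bar{y}|^2 + \delta^2)^{3/2}} (x - \bar{y}) \Bigr] \Bigr\} .
\end{aligned}
\tag{3.60}
\end{equation}

Recalling $R = (|x - \bar{y}|^2 + \delta^2)^{1/2}$ and making some rearrangements, we obtain

\begin{equation}
S_n = \frac{(-1)^n}{4\pi R^2} \Bigl(\frac{|y_j - \bar{y}|}{R}\Bigr)^n \Bigl\{ -(n + 1) \frac{x - \bar{y}}{R} P_n(\gamma) + P_n'(\gamma) \Bigl[ -\frac{y_j - \bar{y}}{|y_j - \bar{y}|} + \frac{(x - \bar{y})}{R^2} \frac{(x - \bar{y}) \cdot (y_j - \bar{y})}{|y_j - \bar{y}|} \Bigr] \Bigr\} . \tag{3.61}
\end{equation}

Each fraction inside the curly braces is at most 1 in magnitude, so we have

\begin{equation}
|S_n| \le \frac{1}{4\pi R^2} \Bigl(\frac{|y_j - \bar{y}|}{R}\Bigr)^n \bigl( (n + 1) |P_n(\gamma)| + 2 |P_n'(\gamma)| \bigr) . \tag{3.62}
\end{equation}

Using the inequalities

\begin{equation}
|P_n(\gamma)| \le 1, \qquad |P_n'(\gamma)| \le n(n + 1)/2, \tag{3.63}
\end{equation}

which rely on $|\gamma| \le 1$, we have

\begin{equation}
|S_n| \le \frac{(n + 1)^2}{4\pi R^2} \Bigl(\frac{|y_j - \bar{y}|}{R}\Bigr)^n . \tag{3.64}
\end{equation}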

So the terms of the series in (3.32) decay geometrically, which implies that the truncation error is roughly the magnitude of the first omitted term. Recall that when a $p$th order particle-cluster interaction is performed, the expansion for the influence of all of the particles is

\begin{equation}
\sum_{k} a_k(x, \bar{y}) \times b_k(\tau) = \sum_{n \ge 0} \Bigl( \sum_{|k| = n} a_k(x, \bar{y}) \times b_k(\tau) \Bigr), \tag{3.65}
\end{equation}

and this is truncated by retaining the terms for which $|k| < p$. The geometric decay of the terms ensures that the error incurred by using this truncation will be on the order of

\begin{equation}
\sum_{|k| = p} a_k(x, \bar{y}) \times b_k(\tau) . \tag{3.66}
\end{equation}

The bound (3.64) on $S_n$ for $n = p$ translates into the bound

\begin{equation}
\Bigl| \sum_{|k| = p} a_k(x, \bar{y}) \times b_k(\tau) \Bigr| \le \frac{(p + 1)^2}{4\pi R^{p+2}} \sum_{j=1}^{N_\tau} |w_j| |y_j - \bar{y}|^p . \tag{3.67}
\end{equation}

Define the quantities

\begin{equation}
\sigma_p(\tau) = (p + 1)^2 \sum_{j=1}^{N_\tau} |w_j| |y_j - \bar{y}|^p . \tag{3.68}
\end{equation}

Then the error incurred by using a $p$th order expansion to approximate a particle-cluster interaction is bounded by

\begin{equation}
\text{error} < \frac{\sigma_p(\tau)}{4\pi R^{p+2}} . \tag{3.69}
\end{equation}

The $\sigma_p(\tau)$ are computed during the construction of the tree. The first step in performing a particle-cluster expansion is to find the smallest $p$ such that the expression in (3.69) is less than the specified tolerance. This can be done with $O(p_{\max})$ operations, as stated in Section 3.3, where $p_{\max}$ is the maximum admissible value of $p$.

When an algorithm using the error bound (3.69) to compute $p$ was implemented, it was found that the actual error incurred in computing the velocity was typically three orders of magnitude smaller than the specified tolerance. Presumably, this is due to the repeated use of the triangle inequality in the analysis, which leads to overestimates. An alternative to bounding the error in the velocity is to bound the error in the velocity potential, which is achieved by using the identity (3.49), leading to the bound

\begin{equation}
|T_n| \le \frac{1}{4\pi R} \Bigl(\frac{|y_j - \bar{y}|}{R}\Bigr)^n , \tag{3.70}
\end{equation}

a bound analogous to (3.64). The cumulative error bound corresponding to (3.69) is

\begin{equation}
\text{error} < \frac{\sigma_p(\tau)}{4\pi R^{p+1}}, \tag{3.71}
\end{equation}

where $\sigma_p(\tau)$ is now defined as

\begin{equation}
\sigma_p(\tau) = \sum_{j=1}^{N_\tau} |w_j| |y_j - \bar{y}|^p . \tag{3.72}
\end{equation}

When an algorithm using this error bound to compute $p$ was implemented, the error in computing the velocity was still smaller than the specified tolerance, but only by one order of magnitude. This is the approach used for all of the runs described in Chapters 4 and 5.

For either of the error bounds, it is clear that the geometric decay rate of the truncation error depends linearly on $r_\tau$, the radius of the smallest sphere centered at $\bar{y}$ which encloses all the particles in the cell $\tau$. Thus, it is appropriate to subdivide cells so that the resulting subcells are as close to spheres as possible. As mentioned in Section 3.4, this is the motivation for the bisection technique used when cells are subdivided.
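The order selection based on (3.71) and (3.72) can be sketched as follows. The helper names and the single-particle example are illustrative assumptions; in the algorithm itself the $\sigma_p(\tau)$ are precomputed once per cell during tree construction rather than on demand.

```python
import math


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(t * t for t in v))


def sigma_table(weights, offsets, p_max):
    """sigma_p = sum_j |w_j| |y_j - ybar|^p for p = 0..p_max, as in (3.72)."""
    return [sum(norm(w) * norm(d) ** p for w, d in zip(weights, offsets))
            for p in range(p_max + 1)]


def min_order(sigmas, R, tol):
    """Smallest p with sigma_p / (4 pi R^(p+1)) < tol, or None if none qualifies."""
    for p, s in enumerate(sigmas):
        if s / (4.0 * math.pi * R ** (p + 1)) < tol:
            return p
    return None


# One particle of unit weight at distance 0.5 from the cell center, target at R = 2.
sigmas = sigma_table([(0.0, 0.0, 1.0)], [(0.5, 0.0, 0.0)], 8)
print(min_order(sigmas, 2.0, 1e-4))
```

The linear scan over at most $p_{\max} + 1$ stored values is what gives the $O(p_{\max})$ cost quoted for step 1 of a particle-cluster interaction. Returning `None` corresponds to the case in which no admissible order meets the tolerance and the cell must be handled by direct summation or subdivision.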

3.7 Full Description of the Algorithm

The algorithm for computing all interactions has two stages: constructing the tree and computing the particle velocities with the aid of the tree. There are two parameters for the program: $p_{\max}$, the maximum admissible order for expansions, and $N_0$, the maximum number of particles in an undivided cell. The tree is created with the recursive function create_tree, written here in pseudo-code, which accepts for input an array of particles associated with a cell $\tau$, and an integer $N$, the length of the array. The purpose of the function is to create and initialize a tree node for the particles which are passed to it. This includes computing the particles' bounding box, the cell's moments, and the $\sigma$'s. If there are more than $N_0$ particles, then the cell is subdivided and the function is called recursively. The function returns a pointer to the created tree node.

    function create_tree(particles, N)
    begin
        allocate memory for tree node being created
        compute particles' bounding box
        compute center of bounding box
        compute σ_p for p = 0 . . . p_max
        if N > N_0 then
            compute the directions to subdivide the cell
            partition the particles, yielding subarrays of particles
            call create_tree for each subarray of particles
            make each returned tree node a child of τ
        return τ
    end

Once the tree is created, the recursive function compute_influence is called for each target particle to compute the influence of all particles on it. The function accepts for input a target position $x$, a cell $\tau$, and a tolerance tol. The function returns the influence of the particles in the cell on the target position computed to the specified tolerance. It is initially called with the root cell of the tree $\tau_0$.

    function compute_influence(x, τ, tol)
    begin
        estimate t_0, the time for direct summation with linear model
        compute minimum p to satisfy tolerance
        if p > p_max or time for pth order expansion > t_0 then
            if τ has no children then
                compute and return influence using direct summation
            else
                call compute_influence for each child of τ
                return sum of returned influences
        else
            if τ's pth order particle moments have not been computed yet, then
                compute and store them
            compute the c_k
            compute the a_k from the c_k
            compute and return sum of expansion
    end

For each recursive call of compute_influence to itself, a local tolerance is required. We want the cumulative errors from the child computations to be less than or equal to the tolerance passed to compute_influence. We achieve this by multiplying the parent's tolerance by the ratio of the child's weights and the parent's weights. That is, if $\tau$ is the child cell, and $P(\tau)$ is $\tau$'s parent, then

\begin{equation}
\mathrm{tol}(\tau) = \frac{\sum_{y_j \in \tau} |w_j|}{\sum_{y_j \in P(\tau)} |w_j|} \, \mathrm{tol}(P(\tau)), \tag{3.73}
\end{equation}

where $\mathrm{tol}(\tau)$ denotes the local tolerance for the cell $\tau$. Then it follows that the sum of the errors from the child computations is bounded by the tolerance specified for $P(\tau)$. Note that the sums in the ratio are the values of $\sigma_0(\tau)$ and $\sigma_0(P(\tau))$, which were computed in create_tree.
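A minimal sketch of the tolerance splitting in (3.73) follows. It assumes that the children partition the parent's particles, so that the parent's $\sigma_0$ is the sum of the children's $\sigma_0$ values. The function name and the sample numbers are hypothetical.

```python
def child_tolerances(parent_tol, child_sigma0):
    """Split parent_tol among children in proportion to sigma_0 = sum |w_j|."""
    # Assumption: children partition the parent's particles, so the parent's
    # sigma_0 equals the sum of the children's sigma_0 values.
    parent_sigma0 = sum(child_sigma0)
    return [parent_tol * s / parent_sigma0 for s in child_sigma0]


tols = child_tolerances(1e-6, [3.0, 1.0, 4.0])
print(tols, sum(tols))
```

Since the local tolerances add up to the parent's tolerance, the summed child errors stay within the tolerance passed to the parent call, which is the property stated above.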

3.8 Complexity Analysis

In this section, we describe the memory and time requirements of the algorithm described above. We first show that the memory required for the algorithm is $O(N)$. The algorithm consists of two stages, tree construction and velocity computation, and we analyze them separately for their time requirements. The number of operations required for constructing the tree is $O(N \log N)$. We break the velocity computations into two parts, particle-cluster interactions and particle-particle interactions. We present heuristic reasons for why these take $O(N \log N)$ and $O(N)$ operations respectively. The bounds obtained in this section should be considered as rough guides to how the algorithm performs in practice, as opposed to sharp estimates. We expect that the asymptotic behavior of the bounds matches the algorithm's asymptotic performance, but that the constants involved may be considerably off. In the next chapter, we present data from runs on test cases which demonstrate the algorithm's performance benefit over an algorithm which uses only direct summation. It is our position that these data are more significant than the asymptotic bounds obtained here, since we desire an actual execution time improvement, not just an asymptotically fast algorithm. With that in mind, we proceed with the analysis.

We now discuss the memory requirements for the algorithm, in terms of the parameters $N$, $N_0$, and $p_{\max}$. The memory can be broken down into 2 categories: that required for the particles, and that required for the tree. It is clear that the particles require $O(N)$ words of memory. The data for a single cell that require more than $O(1)$ words are the cell moments and the $\sigma_p(\tau)$, which require $O(p_{\max}^3)$ and $O(p_{\max})$ words respectively. So a single cell uses $O(p_{\max}^3)$ words of memory. We bound the number of cells in the tree by first bounding the number of cells which are parents of leaf cells, and then using that to bound the size of the entire tree. A cell which is the parent of a leaf cell has at least $N_0$ particles. Since every particle is in exactly one such cell, there are at most $N/N_0$ parents of leaf cells. Going down the tree, we see that there are at most $8N/N_0$ leaf cells, since each cell has at most 8 children. Going up the tree, we see that there are at most $N/(2N_0)$ parents of parents of leaf cells, since each cell has at least 2 children, if it has children. Continuing up the tree, looking at parents of parents and so on, yields collections of cells with at most $N/(4N_0)$, $N/(8N_0)$, $\ldots$ cells respectively. Thus, there are at most

\begin{equation}
(8 + 1 + 1/2 + 1/4 + \ldots) N/N_0 = 10 N/N_0 \tag{3.74}
\end{equation}

cells in the tree. Thus, the memory required for the tree is $O(p_{\max}^3 N/N_0)$. Note that since each leaf cell has fewer than $N_0$ particles in it, there must be more than $N/N_0$ of them. Thus, our bound has the correct asymptotic order.

The tree code algorithm presented here requires more memory than a direct summation program. However, in the next chapter, when the algorithm is validated, the amount of memory actually used is compared to the memory used by a direct summation program. It is found that the memory required by the tree code algorithm is 1.3 to 1.6 times that required by direct summation.

For certain particle distributions and tolerance values, the algorithm will perform poorly, taking more time than an algorithm which uses only direct summation. However, this behavior has not been observed in our tests. In order for the analysis to reflect the actual performance characteristics of the algorithm, we make a simplifying assumption about the particle distribution. The assumption is that when a cell is subdivided, there is an upper bound on the percentage of particles contained in a subcell. Mathematically, this is stated as

\begin{equation}
\frac{N_\tau}{N_{P(\tau)}} \le C < 1, \tag{3.75}
\end{equation}

where τ is an arbitrary cell, P (τ ) is the parent of τ , and C is a constant independent of τ . Intuitively, this assumption is bounding how inhomogeneous the distribution of particles can be. In our computations, the maximum value of Nτ /NP (τ ) was computed at each time step and found to be less than 0.5, so the assumption is justified. For the operation count of the tree construction, we first obtain an upper bound on the number of levels in the tree. The root cell of the tree, τ0 , has N particles. Inequality (3.75) gives an upper bound on the number of particles in a cell in terms of its parent. Applying it iteratively, we find that cells at the lth level have fewer than C l N particles. Now if a cell has fewer than N0 particles, it is not subdivided. This will be guaranteed if C l N < N0 , which is true if l > log(N0 /N )/ log C = log(N/N0 )/ log C −1 . Thus, there are at most O(log(N/N0 )) levels in the tree. Consider a cell τ and let T (τ ) be the total number of operations required to

construct the tree starting with $\tau$ and including all of $\tau$’s children. In the function create_tree, the steps that require more than $O(1)$ operations are computing the particles’ bounding box and computing the $\sigma_p(\tau)$ for $p = 0, \ldots, p_{\max}$, which require $O(N_\tau)$ and $O(p_{\max} N_\tau)$ operations respectively. If $N_\tau \le N_0$, then no more operations are performed, so $T(\tau) = O(p_{\max} N_\tau)$. If $N_\tau > N_0$, then the particles are partitioned and create_tree is called for each subcell. The partitioning consists of grouping the particles according to which octant they are located in with respect to the cell’s center, a procedure that can be done in $O(N_\tau)$ operations. Then create_tree is called for each subcell, implying

$$T(\tau) = O(p_{\max} N_\tau) + \sum_{P(\tilde\tau)=\tau} T(\tilde\tau). \tag{3.76}$$

This equation, derived for $N_\tau > N_0$, is also true for $N_\tau \le N_0$, since the cell has no children and the sum is empty. Suppose $N_\tau > N_0$ and consider applying (3.76) to itself, expanding each $T(\tilde\tau)$. The first term of the expansion of $T(\tilde\tau)$ is $O(p_{\max} N_{\tilde\tau})$. Since the $\tilde\tau$ are the children of $\tau$, we have

$$N_\tau = \sum_{P(\tilde\tau)=\tau} N_{\tilde\tau}, \tag{3.77}$$

an equality which requires $N_\tau > N_0$. Thus, the first terms of the expansions of the $T(\tilde\tau)$ sum to $O(p_{\max} N_\tau)$. So if $N_\tau > N_0$, then

$$T(\tau) = 2\,O(p_{\max} N_\tau) + \sum_{P(P(\tilde\tau))=\tau} T(\tilde\tau), \tag{3.78}$$

with the last sum potentially being empty. We repeat this process and apply (3.76) recursively to the $T(\tilde\tau)$. However, it may be the case that not all children of $\tau$ have children, because some children of $\tau$ may have fewer than $N_0$ particles and thus are not subdivided. Thus, the inequality

$$\sum_{P(P(\tilde\tau))=\tau} N_{\tilde\tau} \le N_\tau, \tag{3.79}$$

analogous to (3.77), may be strict. So we obtain in general

$$T(\tau) \le O(l\,p_{\max} N_\tau) + \sum_{P^{(l)}(\tilde\tau)=\tau} T(\tilde\tau), \qquad l = 1, 2, \ldots, \tag{3.80}$$

where $P^{(l)}$ is the parent function $P$ composed with itself $l$ times. The recursion stops when $l$ is larger than the number of levels in the tree below $\tau$, since the sum in (3.80) is empty then. Thus, substituting $\tau = \tau_0$, and using the fact shown above that there are $O(\log(N/N_0))$ levels in the tree, we have

$$T(\tau_0) \le O(p_{\max} N \log(N/N_0)). \tag{3.81}$$
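The recursive structure behind (3.76) and the level bound can be illustrated with a small Python sketch. It is not the thesis implementation (which was written in C): the function below only mimics the bounding-box and octant-partitioning steps described for create_tree, omits the computation of the cell moments, and the helper `level_bound` and the random test data are assumptions introduced here for illustration.

```python
import math
import random

def create_tree(points, n0, level=0, stats=None):
    """Recursively subdivide a cell holding `points` (a list of (x, y, z)).

    Returns a dict with the deepest level reached, the number of leaf cells,
    and the largest child/parent particle ratio (the constant C in (3.75)).
    """
    if stats is None:
        stats = {"depth": 0, "leaves": 0, "max_ratio": 0.0}
    stats["depth"] = max(stats["depth"], level)

    # Leaf cell: few enough particles, so it is not subdivided.
    if len(points) <= n0:
        stats["leaves"] += 1
        return stats

    # Bounding box of the particles in this cell: O(N_tau) work.
    lo = [min(p[d] for p in points) for d in range(3)]
    hi = [max(p[d] for p in points) for d in range(3)]

    # Partition the particles by octant about the box center: O(N_tau) work.
    center = [(lo[d] + hi[d]) / 2.0 for d in range(3)]
    octants = {}
    for p in points:
        key = tuple(p[d] > center[d] for d in range(3))
        octants.setdefault(key, []).append(p)

    # Degenerate split (coincident particles or round-off): stop here.
    if len(octants) == 1:
        stats["leaves"] += 1
        return stats

    for child in octants.values():
        stats["max_ratio"] = max(stats["max_ratio"], len(child) / len(points))
        create_tree(child, n0, level + 1, stats)
    return stats


def level_bound(n, n0, c):
    """A cell at level l-1 is split only if C^(l-1) N > N0, so l < this value."""
    return 1.0 + math.log(n / n0) / math.log(1.0 / c)


random.seed(0)
N, N0 = 4000, 50
particles = [(random.random(), random.random(), random.random()) for _ in range(N)]
stats = create_tree(particles, N0)
print(stats, level_bound(N, N0, stats["max_ratio"]))
```

Here the measured `max_ratio` plays the role of $C$, so the deepest level found must lie below `level_bound`, which is the $O(\log(N/N_0))$ statement used to obtain (3.81).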

For the computation of the particle velocities, we do not have an upper bound on the number of operations that are used. The main difficulties stem from the adaptive nature of the algorithm. There is no a priori bound on the ratio of a parent’s cell size to a child’s cell size. Thus, particle-cluster interactions become significantly more advantageous when a parent cell is subdivided and shrunk, as opposed to the gradual improvement that occurs when only subdividing is performed. Also, the cells on a given level of the tree may have very different sizes. This makes it difficult to consider them together, which is a natural technique. Earlier tree codes that are not as adaptive as the current one have been shown to take $O(N \log N)$ operations. We believe that our algorithm does as well, based on heuristic considerations and actual execution times.

The heuristic argument is as follows. We would show that $O(N \log N)$ operations are required by showing that each particle takes part in $O(\log N)$ particle-cluster interactions and $O(1)$ particle-particle interactions. Each particle takes part

in $O(\log N)$ particle-cluster interactions because it takes part in $O(1)$ of them on each level of the tree and there are $O(\log N)$ levels in the tree. The reason that a particle only takes part in $O(1)$ particle-cluster interactions on a given level is that it does not interact with cells which are sufficiently far away relative to the cell’s size, because the particle would have interacted with such a cell’s parent. One obtains an upper bound on the relative distance to a cell with which a particle-cluster interaction is performed, and the upper bound is independent of the level. If the cells are of the same size on the level, this implies an upper bound on the number of cells satisfying the condition. This is one point where the analysis is heuristic and not rigorous for our algorithm. Thus, there is an upper bound on the number of particle-cluster interactions on a level, which implies an $O(N \log N)$ operation count.

To bound the number of particle-particle interactions, consider how many particles interact with a leaf cell on a particle-particle basis. If one assumes that the number of particles which do so is proportional to the number of particles in the leaf cell, then, since every particle lies in exactly one leaf cell, it follows that

$$\sum_{\text{leaf cells } \tau} \kappa N_\tau^2 < \kappa N_0 \sum_{\text{leaf cells } \tau} N_\tau = \kappa N_0 N, \tag{3.82}$$

where $\kappa$ is the constant of proportionality. One can show that the particles which interact with a leaf cell on a particle-particle basis are contained in a sphere around the cell whose radius is proportional to the size of the cell. So if the particle density in the sphere is not too different from the density in the cell, then the assumption on the particle count is justified, and the operation count bound is achieved. So with these heuristic considerations and the rigorous bounds above, we have that the overall operation count for the algorithm is $O(N \log N)$ and the memory usage is $O(N)$.
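As a small worked check of the bound (3.82), the following Python sketch evaluates both sides for a given list of leaf-cell populations. The function name and the sample numbers are hypothetical; the proportionality assumption (each leaf cell with $N_\tau$ particles interacts directly with $\kappa N_\tau$ particles) is the heuristic one stated above, not a measured property of the algorithm.

```python
def direct_interaction_bound(leaf_sizes, n0, kappa):
    """Return (estimated particle-particle interactions, bound kappa*N0*N).

    leaf_sizes: number of particles in each leaf cell (each below n0).
    """
    total = sum(kappa * n * n for n in leaf_sizes)  # sum of kappa * N_tau^2
    bound = kappa * n0 * sum(leaf_sizes)            # kappa * N0 * N
    return total, bound


total, bound = direct_interaction_bound([10, 20, 30], n0=32, kappa=2)
print(total, bound)
```

Because the bound grows linearly in $N$ for fixed $N_0$ and $\kappa$, the particle-particle work does not change the overall $O(N \log N)$ estimate.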

CHAPTER 4

ALGORITHM VALIDATION AND PERFORMANCE

In this chapter, we present a validation of the algorithm. The algorithm has two different aspects to it: the vortex method, i.e. the discretization of the vortex sheet model, and the tree-code which is used to evaluate particle velocities. The topics in the first section are related to the vortex method. We demonstrate the 4th order convergence of the Runge-Kutta method and present results showing convergence as the vortex sheet is refined. The other sections deal with the tree-code. We discuss the selection of the runtime parameters $N_0$ and $p_{\max}$ and demonstrate the algorithm’s accuracy and execution time improvement over direct summation. The algorithm was implemented in C [29], using double precision arithmetic, and runs were performed on a Silicon Graphics Power Challenge L, a Sun UltraSPARC 2, and a Sun SPARCstation 20. Relevant information about the machines is presented in Table 4.1. The computations which involved timing comparisons were performed on the Silicon Graphics machine.


Machine             RAM (MB)   CPU clock rate (MHz)
Power Challenge L   128        75
UltraSPARC 2        380        168
SPARCstation 20     32         150

Table 4.1: Machine characteristics.

4.1 Convergence of Vortex Method

The purpose of the first test case is to verify the 4th order convergence of the Runge-Kutta method for the solution of the differential equations (2.23). These runs were performed with a program using direct summation. The initial condition was a flat circular vortex sheet of radius 1 with circulation distribution $\lambda_1 = (1 - r^2)^{1/2}$. Such a sheet rolls up into a vortex ring as described in Chapter 2. The $\alpha$ change of variable employed was $\lambda_1 = \cos\alpha$, $0 \le \alpha \le \pi/2$, yielding $r = \sin\alpha$. The smoothing parameter $\delta$ was set to 0.1. The sheet was discretized with 64 circular vortex lines, uniformly spaced in $\alpha$. Each vortex line was discretized with $128(1 + r)$ particles, where $r$ is the radius of the vortex line, rounding the number of particles up to the nearest multiple of 8. The total number of particles discretizing the sheet was 13444. A profile of the sheet at time $t = 1$ is depicted in Figure 4.1.

The computations were performed with different time steps and the results from different runs were compared by computing the maximum distance between particle positions. This comparison was used because in these particular runs, no particles or lines were inserted. The values of $\Delta t$ used were 0.2, 0.1, 0.05, 0.025 and 0.0125. The position differences were computed for consecutive values of $\Delta t$ and are displayed in Table 4.2. The results are consistent with 4th order accuracy.

Figure 4.1: Profile of rolling up vortex sheet. t = 1, δ = 0.10.

∆t      e(∆t)       e(∆t)/(∆t)^4
0.2     3.321e-3    2.075
0.1     1.885e-4    1.885
0.05    1.344e-5    2.151
0.025   9.241e-7    2.366

Table 4.2: Maximum point position differences for circular sheet. t = 1, δ = 0.10, $e(\Delta t) = \max_i \lvert x_i(\Delta t) - x_i(\Delta t/2) \rvert$.
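The claim of 4th order accuracy can be checked directly from the tabulated values. The short Python sketch below, which is an illustration added here and not part of the original programs, computes the observed order $\log_2\bigl(e(\Delta t)/e(\Delta t/2)\bigr)$ between consecutive rows of Table 4.2 and recomputes the scaled column $e(\Delta t)/(\Delta t)^4$. Since $e(\Delta t)$ is a difference between two runs rather than a true error, the ratios are only expected to approach $2^4$.

```python
import math

# Values copied from Table 4.2.
dt = [0.2, 0.1, 0.05, 0.025]
err = [3.321e-3, 1.885e-4, 1.344e-5, 9.241e-7]

def observed_orders(errors):
    """log2 of the ratio of consecutive differences (step halved each time)."""
    return [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]

orders = observed_orders(err)
scaled = [e / h ** 4 for e, h in zip(err, dt)]

print([round(p, 2) for p in orders])
print([round(s, 3) for s in scaled])
```

The observed orders stay close to 4 and the scaled differences remain of order one, which is what the text means by results consistent with 4th order accuracy.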

Recall from Section 2.2.3 that points and lines are inserted during a computation to maintain resolution as the vortex sheet is stretched. There are two parameters governing this process, denoted $\epsilon_1$ and $\epsilon_2$. When the distance between two adjacent vortex lines is greater than $\epsilon_1$, a new vortex line is inserted and if two particles on a vortex line are separated by more than $\epsilon_2$, a new particle is inserted. If either of these parameters is too large, resolution is lost and the computations become inaccurate.

Figure 4.2 depicts cross sections of an axisymmetric vortex sheet rolling up into a disk for different values of $\epsilon_1$. As above, the smoothing parameter $\delta$ is set to 0.1. The cross sections are shown for $t = 4$. The time step size $\Delta t$ was set to 0.05, based on the results of the previous section. The $\epsilon_1 = 0.05$ curve is smooth and no corners are discernible. The $\epsilon_1 = 0.10$ curve is not as resolved, but the point positions are in good agreement with the better resolved $\epsilon_1 = 0.05$ curve. The same can be said of the $\epsilon_1 = 0.15$ curve, but the loss of resolution in the core of the ring is considerable.
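The particle insertion criterion can be sketched as follows in Python. This is only an illustration of the distance test on a single closed vortex line: the function name `insert_points` is hypothetical, and placing the new particle at the midpoint of the chord is an assumption made here for simplicity; the interpolation actually used is the one described in Section 2.2.3.

```python
import math

def insert_points(filament, eps2):
    """One refinement pass over a closed filament (list of (x, y, z) tuples).

    A new particle is inserted between two neighbors whose separation
    exceeds eps2. The midpoint rule is an illustrative assumption.
    """
    n = len(filament)
    refined = []
    for i in range(n):
        p = filament[i]
        q = filament[(i + 1) % n]  # closed curve: last point connects to first
        refined.append(p)
        if math.dist(p, q) > eps2:
            refined.append(tuple((a + b) / 2.0 for a, b in zip(p, q)))
    return refined


square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
once = insert_points(square, 0.6)
twice = insert_points(once, 0.6)
print(len(once), len(twice))
```

A second pass leaves the curve unchanged once every separation is below the threshold, which mirrors how the insertion parameters cap the spacing as the sheet stretches.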

4.2 Selection of Runtime Parameters

In this section, we present the results of runs which were performed to select the runtime parameters $N_0$, the upper bound on the number of particles in an undivided cell, and $p_{\max}$, the maximum admissible value of $p$. As explained previously, if $N_0$ is small, then memory usage is large because the tree will have many levels. If $N_0$ is large, then fewer particle-cluster interactions will be possible since there will be fewer cells, resulting in a large execution time. Similarly, if $p_{\max}$ is large, then there will be a large memory requirement because cell moments require $O(p_{\max}^3)$ words of memory. If $p_{\max}$ is small, then fewer particle-cluster interactions will be possible since the tolerance conditions will be satisfied less often, resulting in an increase in execution time. To find values for these parameters which ensure good performance,

Figure 4.2: Profile of rolling up vortex sheet. $t = 4$, $\delta = 0.10$, $\Delta t = 0.05$, $\epsilon_1 = 0.15, 0.10, 0.05$, $\epsilon_2 = 0.05$.

runs were performed with $N_0$ and $p_{\max}$ taking on a range of values, $N_0 = 50, \ldots, 1000$ in increments of 50, and $p_{\max} = 6, 8, 10$. Runs were performed with different $N$ values and tolerances to ensure consistent results.

The execution times of these runs are presented in Figure 4.3 as a function of $N_0$. Going up the page, $N$ increases, and going right across the page, tol, the requested tolerance, decreases. The different line patterns correspond to different values of $p_{\max}$, as described in the caption. A few observations can be made from this data. First, though there are trends in the execution time as a function of $N_0$, the timings never vary by more than a few percentage points for fixed $N$ and tolerance. The dependence on $p_{\max}$ is similar, although for the smaller tolerances, the difference in the execution time is larger. In particular, for the smallest tolerance, tol = $10^{-4}$, the $p_{\max} = 6$ times are 15 to 20 percent larger than the $p_{\max} = 8$ and 10 times, which are nearly identical. In the roll-up simulations in Chapter 5, we use tol = $10^{-3}$ for accuracy. For this tolerance, $p_{\max} = 8$ consistently has smaller execution times, so that is our choice of $p_{\max}$. We postpone selection of $N_0$ until after we discuss memory usage.

The memory used for these runs, in megabytes, is presented in Figure 4.4 as a function of $N_0$. As in Figure 4.3, $N$ increases going up the page. The amount of memory used by the algorithm is independent of the requested tolerance, so there is only one column. As $N_0$ increases, the memory usage decreases and levels out. As expected, for larger values of $p_{\max}$, the memory usage is larger.

From execution time and memory considerations, we use the value 512 for $N_0$. This means that we are potentially performing particle-particle interactions with cells that contain about 500 particles. Though this value may seem intuitively large, it is justified by the numerical data in Figures 4.3 and 4.4. If much smaller values of $N_0$ are used, then execution times as well as memory usage are larger.
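To make the $O(p_{\max}^3)$ memory statement concrete, the following Python sketch counts the coefficients of a three-dimensional Taylor expansion of order $p_{\max}$. It assumes, for illustration only, that a cell stores one double precision moment per multi-index $k = (k_1, k_2, k_3)$ with $k_1 + k_2 + k_3 \le p_{\max}$ and per vector component; the actual storage layout of the program is not specified here, and the function names are hypothetical.

```python
from math import comb

def moments_per_cell(p_max):
    """Number of multi-indices (k1, k2, k3) with k1 + k2 + k3 <= p_max."""
    return comb(p_max + 3, 3)

def tree_memory_mb(n_cells, p_max, components=3, bytes_per_word=8):
    """Rough moment storage in MB under the assumptions stated above."""
    return n_cells * components * moments_per_cell(p_max) * bytes_per_word / 2 ** 20


for p in (6, 8, 10):
    print(p, moments_per_cell(p), round(tree_memory_mb(1000, p), 2))
```

The count $(p_{\max}+1)(p_{\max}+2)(p_{\max}+3)/6$ grows cubically, which is consistent with the larger memory usage observed for larger $p_{\max}$ and with the decrease in memory as $N_0$ increases and the number of cells drops.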

Figure 4.3: Execution time (sec.) vs. $N_0$. $p_{\max}$ = 6 (solid), 8 (dashed), 10 (dotted). Rows of panels, top to bottom: N = 51276, 38444, 25572, 12708. Columns, left to right: tol = 1.0e−2, 1.0e−3, 1.0e−4.

Figure 4.4: Memory usage (MB) vs. $N_0$. $p_{\max}$ = 6 (solid), 8 (dashed), 10 (dotted). Panels, top to bottom: N = 51276, 38444, 25572, 12708.

4.3 Algorithm Performance

In this section, we compare the tree code’s performance to direct summation. Execution time and memory usage as functions of $N$ are compared for different tolerances. These comparisons are based on evaluating the velocity at points on a surface which approximates a rolled up vortex sheet; no time evolution is performed. In Figures 4.5 and 4.6, the independent variable $N$, the total number of particles, was made to vary by changing the refinement in $\lambda_1$ (i.e. $\alpha$).

Figure 4.5 displays the execution time. The different line patterns represent different requested tolerances, as described in the caption. Figure 4.5a presents the execution times in seconds and Figure 4.5b shows the ratio between the direct summation time and the tree code’s time. In our roll-up computations in Chapter 5, we use tol = $10^{-3}$, which corresponds to the dashed line. With this tolerance, the new algorithm is faster than direct summation by a factor of 10 when there are 100,000 particles, and this factor increases with $N$. The factor of improvement appears to be increasing at a rate which is slightly less than linear. This is the expected behavior for an algorithm which requires $O(N \log N)$ operations, since $N^2/(N \log N) = N/\log N$.

Figure 4.6 displays the memory used by the programs. Figure 4.6a presents the usage in megabytes and Figure 4.6b shows the factor of increase, i.e. the ratio between the new algorithm’s usage and the direct summation usage. As noted in Section 3.8, the factor of increase over the direct summation algorithm is between 1.3 and 1.6.

The actual error in the computed value of the particle velocities, which is due to series truncation, is less than the specified tolerance. The disparity is due to the application of the triangle inequality in the error estimates in Section 3.6. Figure 4.7 displays the actual error as a function of the specified tolerance. Recall from

Figure 4.5: Execution time (sec.) vs. $N$. $p_{\max} = 8$. tol = $10^{-2}$ (solid), $10^{-3}$ (dashed), $10^{-4}$ (dotted), direct summation (dash-dot). Actual data (o), projected data (x). (a) Execution time, (b) direct summation time / fast algorithm time.

Figure 4.6: Memory usage (MB) vs. $N$. $p_{\max} = 8$. Fast algorithm (solid), direct summation (dash-dot). Actual data (o), projected data (x). (a) Memory usage, (b) fast algorithm memory usage / direct summation memory usage.

Figure 4.7: Actual error vs. specified tolerance. $p_{\max} = 8$, $N_0 = 500$, $N$ = 6284, 12708, 25572, 38444, 51276. Potential error bound (solid), velocity error bound (dotted).

Section 3.6 that there are two different error bounds, one on the velocity potential (3.71) and one on the velocity (3.69). The figure contains data for programs which determine $p$ using these bounds for different values of $N$, plotted with solid and dotted lines as described in the caption. It was stated in Section 3.6 that if the choice of $p$ is based on the velocity error bound, then the actual error in the velocity is several orders of magnitude smaller than the requested tolerance, which is clearly demonstrated by the figure. The actual error is also smaller when the potential error bound is used, but by a smaller margin. Note that the actual error is not sensitive to changes in $N$.

Figure 4.8 depicts the execution time as a function of the actual error, using the two different error bounds. The plotted lines correspond to the requested tolerances tol = $10^{-2}, 10^{-3}, 10^{-4}$ for a fixed value of $N$, going up the plot as $N$ increases. A conclusion that can be drawn from the figure is that the potential error bound

Figure 4.8: Execution time (sec.) vs. actual error. $p_{\max} = 8$, $N_0 = 500$, $N$ = 6284, 12708, 25572, 38444, 51276. Connected lines are tol = $10^{-2}, 10^{-3}, 10^{-4}$. Potential error bound (solid), velocity error bound (dotted).

requires less time to obtain a given actual error for the same number of points than the velocity error bound. This observation, and the closer match of requested tolerance and actual error, are the reasons that we use the potential bound.

CHAPTER 5

APPLICATIONS

In this chapter, the results of simulations performed using our algorithm are presented. In all of the computations here, unless mentioned otherwise, the requested tolerance was tol = $10^{-3}$, and the runtime parameters for the algorithm were $N_0 = 512$ and $p_{\max} = 8$. The smoothing factor was $\delta = 0.10$.

5.1 Vortex Ring with Azimuthal Perturbation

This section presents the results of simulations of a perturbed rolling-up vortex sheet. An azimuthal instability was introduced to the sheet by perturbing a flat circular disk. In polar coordinates, the perturbation is of the form

$$p(r, \theta) = \rho\, r^2 \cos(k\theta)\, \mathbf{e}_z, \tag{5.1}$$

where $k$ is the perturbation wavenumber and $\rho$ is the magnitude of the perturbation. The $r^2$ factor is present to smooth the perturbation at the origin. The perturbation may also be considered as a function of $\alpha$ and $\theta$, its initial magnitude being proportional to $\sin^2\alpha$, since $r = \sin\alpha$. After the sheet rolls up, the radius of the ring, i.e. the position of the core, is approximately 0.8, as seen in Figure 4.2. Recall from the linear stability analysis of Section 2.3.2 that the stability of a vortex filament with respect to a perturbation

with wavenumber $k$ depends only on $k$ and $\delta/R$. For $\delta = 0.10$ and $R = 0.8$, $\delta/R = 0.125$. From Figure 2.8, a vortex filament with $\delta/R = 0.12$ has an unstable mode for $k = 9$. However, the presence of the rolls which are larger than $\delta$ presumably has the effect of spreading the vorticity out away from the core. This is analogous to increasing $\delta$, which lowers the wavenumber of the unstable mode. With this in mind, simulations were performed with wavenumbers $k$ ranging from 4 to 11. The time step used was $\Delta t = 0.10$, and the point insertion parameters $\epsilon_1$ and $\epsilon_2$ were 0.075 and 0.05 respectively. The value of $\rho$, the magnitude of the perturbation at the edge of the disk, was 0.10.

Figure 5.1 shows a measure of the variance of the rings as a function of time. The quantity plotted was obtained as follows. Each value of $\alpha$ corresponds to a filament, which in our computations is perturbed from being circular. The average radius and $z$ position of the filament are computed. For each value of $\alpha$, we compute the $L^2$ distance from the filament to the circle whose radius and $z$ position are the averages just computed. The quantity plotted in Figure 5.1 is the $L^2$ norm of this distance as a function of $\lambda_1$. The figure shows that the perturbation for the $k = 4$ and 5 modes does not grow much. For the larger wavenumbers, the disturbance has more growth, peaking with the $k = 10$ perturbation.

To visualize the sheets, we plot the sheet positions for the $k = 5$ and 9 simulations. These two values of $k$ are representative of the behavior observed for other $k$ values. The position of the sheet for the wavenumber $k = 5$ at times $t = 0, 2, 4, 6$ is shown in Figure 5.2. One can see from these images that the sheet is rolling up smoothly, the perturbation having only a marginal effect on the evolution. This is in contrast to the images in Figure 5.3, which shows the vortex sheet for the $k = 9$ simulation. In this simulation, and the other high wavenumber simulations, the outer turns of the

sheet are smooth, but the core is becoming highly distorted. A depiction of the ring’s core, for k = 5 and 9, is presented in Figure 5.4. The curves plotted are the filaments that correspond to α > 0.8. Initially, these filaments were near the outer portion of the disk. The distortion in the core for the k = 9 simulations as compared to the k = 5 is clearly evident here. The bulging behavior of the sheet around the waves is consistent with the simulations of Knio and Ghoniem [32] and the experiments of Didden [19]. The bulges are also similar to the deformations found by Meiburg, Lasheras, and Martin [44] in their study of azimuthal perturbations to a jet, which was based upon experiments and numerical simulations. It should be noted that the surfaces plotted in Figures 5.2 and 5.3 and the surface plots which appear later in this chapter are the surfaces formed by the material curves which coincided with the vortex lines of the sheets at t = 0. However, since we are using a smoothed Biot-Savart kernel, they are not the actual vortex lines for t > 0.
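The perturbation (5.1) and the per-filament deviation measure used for Figure 5.1 can be sketched in Python as follows. The function names are hypothetical, the ring axis is assumed to coincide with the $z$ axis, and the $L^2$ distance is normalized here as a root mean square over the filament's points; the normalization used for the figure is not stated in this section, so that choice is an assumption.

```python
import math

def perturbed_filament(r0, rho, k, n):
    """n points on the circle of radius r0, displaced in z by rho*r0^2*cos(k*theta)."""
    pts = []
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        pts.append((r0 * math.cos(theta),
                    r0 * math.sin(theta),
                    rho * r0 ** 2 * math.cos(k * theta)))
    return pts

def filament_deviation(pts):
    """RMS distance from the filament to the circle with its mean radius and mean z."""
    radii = [math.hypot(x, y) for x, y, _ in pts]
    r_mean = sum(radii) / len(pts)
    z_mean = sum(z for _, _, z in pts) / len(pts)
    sq = [(r - r_mean) ** 2 + (p[2] - z_mean) ** 2 for r, p in zip(radii, pts)]
    return math.sqrt(sum(sq) / len(pts))


edge = perturbed_filament(r0=1.0, rho=0.10, k=5, n=128)
print(filament_deviation(edge))
```

For the initial condition the deviation of the edge filament is $\rho/\sqrt{2}$ under this normalization, and an unperturbed circle gives zero, so growth of this quantity in time signals growth of the azimuthal disturbance.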

5.2 Elliptical Vortex Ring

In this section, results from simulations of an elliptical vortex ring are presented. The computations are similar to those of Dhanak and de Bernardinis [18] and Fernandez et al. [23]. The model used for the formation of an elliptical vortex ring is to give an impulse to an elliptical disk and then to dissolve the disk away. As with a circular disk, a free vortex sheet remains and rolls up into a vortex ring. Following Dhanak and de Bernardinis [18], the circulation distribution for an elliptical disk is taken to be

$$\lambda_1 = \sqrt{1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}}, \tag{5.2}$$

Figure 5.1: Variance of perturbed vortex sheet. δ = 0.10, ρ = 0.10. k: wavenumber of perturbation, t: time. One panel for each of k = 4, 5, 6, 7, 8, 9, 10, 11.

Figure 5.2: Perturbed vortex sheet. k = 5. δ = 0.10, t = 0, 2, 4, 6.

Figure 5.3: Perturbed vortex sheet. k = 9. δ = 0.10, t = 0, 2, 4, 6.


Figure 5.4: Core of perturbed vortex sheet. k = 5, 9. δ = 0.10, t = 0, 2, 4, 6.

where the disk is the region

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1. \tag{5.3}$$

The vortex filaments are ellipses with the same eccentricity as the elliptical disk. As before, the $\alpha$ change of variable used is $\lambda_1 = \cos\alpha$, leading to $\lambda_1'(\alpha) = -\sin\alpha$. Simulations were performed for disks with different eccentricities, which were controlled by setting $b = 1$ and allowing $a < 1$ to vary. The ratio of the minor axis length to the major axis length is $a$ and the eccentricity is $\sqrt{1 - a^2}$. We present results for $a = 0.8, 0.6, 0.5$. The insertion parameters $\epsilon_1$ and $\epsilon_2$ were both set to 0.05. The time step used was $\Delta t = 0.05$.

For values of $a$ close to 1, an elliptical ring may be considered as a small perturbation of a circular ring with wavenumber 2. From the linear stability analysis in Section 2.3.2, we expect the perturbation to oscillate with constant magnitude. This behavior is exhibited by the $a = 0.8$ computation, which is presented in Figure 5.5. Initially, the disk is narrower in the direction coming out of and to the right of the page. Thus, the filaments running along the front-right edge are stretched in comparison to the rest of the disk. This intensifies the vorticity, which is why the outer turns have wrapped up and around more along this and its opposite edge. However, the difference is not enough to disturb the core, which is rolling up smoothly.

The $a = 0.6$ and 0.5 computations are presented in Figures 5.6 and 5.7 respectively. The orientation of these disks is the same as for the $a = 0.8$ disk. In the regions where the fluid is moving most rapidly around the edge of the disk, the front-right and back-left, the fluid is forced up over the disk towards the center. As the fluid from either side approaches the center, it is forced up and away from the disk. This is the cause of the protruding spikes on the sheets. The presence of these structures makes it difficult to study the sheet’s motion, because as the sheet stretches to form the peaks, additional filaments are inserted, which increases the execution time. The $a = 0.5$ computation started with fewer than 7500 particles and had 84,000 particles at time $t = 6$.
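The relation between the elliptical filaments, the circulation distribution (5.2), and the change of variable $\lambda_1 = \cos\alpha$ can be checked with a short Python sketch. The parametrization $x = a \sin\alpha \cos\theta$, $y = b \sin\alpha \sin\theta$ is an assumption made here; it is one natural choice consistent with filaments that are ellipses of the same eccentricity as the disk, and the function names are hypothetical.

```python
import math

def ellipse_filament(a, b, alpha, n):
    """n points on the ellipse x = a sin(alpha) cos(theta), y = b sin(alpha) sin(theta)."""
    s = math.sin(alpha)
    return [(a * s * math.cos(2.0 * math.pi * j / n),
             b * s * math.sin(2.0 * math.pi * j / n),
             0.0) for j in range(n)]

def circulation(x, y, a, b):
    """lambda_1 from (5.2); tiny negative round-off at the disk edge is clamped."""
    return math.sqrt(max(0.0, 1.0 - (x / a) ** 2 - (y / b) ** 2))

def eccentricity(a, b=1.0):
    """Eccentricity of an ellipse with semi-axes a <= b."""
    return math.sqrt(1.0 - (a / b) ** 2)


pts = ellipse_filament(a=0.6, b=1.0, alpha=0.7, n=64)
print(circulation(*pts[0][:2], 0.6, 1.0), math.cos(0.7))
print([round(eccentricity(a), 3) for a in (0.8, 0.6, 0.5)])
```

Under this parametrization $\lambda_1$ is constant along each filament and equal to $\cos\alpha$, so the filaments are level curves of the circulation, as in the circular case.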

5.3 Colliding Vortex Rings

In this section, results of a simulation of obliquely colliding vortex rings are presented. The configuration of vortex rings is based on experiments performed by Schatzle [55]. In our computations, the rings are inclined from horizontal by 30 degrees. The centers of the initial circular vortex sheets are located at $(\pm 1, 0, 0)$. An adaptive time-step procedure was used, with an initial $\Delta t = 0.10$, although the time steps never went below 0.07. The point insertion parameters $\epsilon_1$ and $\epsilon_2$ were 0.075 and 0.05 respectively.

Figure 5.8 shows the vortex sheets which represent the colliding vortex rings. Figure 5.9 shows a cut-away of the same view, enabling one to see the rolling up structure which is present. In the region where the rings have merged, the windings of the sheet are flattened up against each other and are being pushed down. Because of this stretching, a large number of filaments and particles are inserted into this region, even though the vorticity amplitude is relatively low, as shown by the vorticity isosurfaces in the next figures. At time $t = 0$, there were 14984 particles representing the disks, and at time $t = 4.5$, the latest time in our runs, there were 891514 particles. For this number of particles, we estimate that our fast algorithm is performing the computations 60 times faster than direct summation. Even with the fast algorithm, the computation took 32 hours to go from $t = 4$ to $t = 4.5$, so a direct summation algorithm would take months (by this estimate, roughly $32 \times 60 = 1920$ hours, or about 80 days, for that interval alone).

Figure 5.5: Elliptical vortex sheet. a = 0.8. δ = 0.10, t = 0, 2, 4, 6.

Figure 5.6: Elliptical vortex sheet. a = 0.6. δ = 0.10, t = 0, 2, 4, 6.

Figure 5.7: Elliptical vortex sheet. a = 0.5. δ = 0.10, t = 0, 2, 4, 6.

Figures 5.10 through 5.13 show isosurfaces of the vorticity field, computed by differentiating the integral (2.25) and evaluating it for positions x on a regular grid. The values chosen for the isosurfaces are one- and two-thirds of the maximum initial computed vorticity. Each figure shows the rings from a different view point for the time sequence t = 0, 1, 2, 3, 4, 4.5. The first view is a perspective view with shading on the surfaces, and the others are orthogonal projections. As the rings approach, they initially pinch, and then they merge and this region flattens out. The connection region then begins to stretch out. This is in qualitative agreement with Schatzle’s experiment and the computations of Anderson and Greengard [1]. In Schatzle’s experiments, the connection region disconnects and there is another connection and subsequent disconnection which occurs at the bottom of the rings. Because of the stretching and reconnection of vorticity, it is an open question whether or not a vortex filament model can capture these later stages of the evolution. Our simulations appear to have effectively captured the merger of the rings. However, due to the large computational time, we were not able to explore the parameter space. For instance, it would be of interest to know how the ring merger depends upon the angle of inclination. We are also interested in knowing what happens when δ → 0.

Figure 5.8: Vortex sheets modeling colliding disks. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.

Figure 5.9: Cut-away of vortex sheets modeling colliding disks. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.

Figure 5.10: Vorticity isosurfaces of colliding vortex rings, perspective view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.

Figure 5.11: Vorticity isosurfaces of colliding vortex rings, front view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.

Figure 5.12: Vorticity isosurfaces of colliding vortex rings, side view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.

Figure 5.13: Vorticity isosurfaces of colliding vortex rings, top view. δ = 0.10, t = 0, 1, 2, 3, 4, 4.5.

CHAPTER 6

CONCLUSIONS

6.1 Summary

A new algorithm has been presented for rapidly computing three-dimensional vortex sheet motion. The main ingredients of the algorithm are the use of Taylor series for particle-cluster interactions and a nested subdivision of space to create the particle clusters. An important feature of the algorithm is the use of recurrences to compute the expansion coefficients for particle-cluster interactions. New features of the algorithm include its application to a non-harmonic three-dimensional kernel, its adaptive subdivision of space and its adaptive error control.

The majority of treecode algorithms previously developed for rapid computations in particle simulations have been restricted to applications where the particle interaction kernel is harmonic. Our algorithm overcomes this restriction by extending the Taylor series approach of Draghicescu and Draghicescu [21] to the three-dimensional vortex blob kernel $K_\delta$. The subdivision of space, to obtain smaller particle clusters, takes into account the local particle distribution by using the particles’ bounding box. When the previous algorithms subdivide cells, they do not take into account the particles’ positions. Though there have been some algorithms which have only subdivided when there are sufficiently many particles to warrant it, such as the adaptive multipole algorithm of

Carrier, Greengard, and Rokhlin [11], even these algorithms have not taken the particles’ positions within the cells into consideration when subdividing. The bounding boxes also provide for series expansion points which yield good convergence. The order of the expansion used for particle-cluster interactions, $p$, is chosen adaptively and depends upon the selection of the expansion point, so good placement yields lower values of $p$, which improves the algorithm’s performance.

The algorithm has been applied to study the dynamics of vortex rings which are modeled as rolled-up vortex sheets. With the fast algorithm we are able to perform simulations with $10^5$ to $10^6$ particles, which was not previously feasible. We performed simulations of perturbed vortex rings, elliptical vortex rings, and the collision of two vortex rings. In the simulations of the colliding vortex rings, the vorticity in the rings appears to reconnect, due to superposition, even though the model does not explicitly account for viscous effects. The sheet motion is computed with a Lagrangian numerical method, computing the sheet’s velocity with a smoothed version of the Biot-Savart integral (2.22). Discretization leads to a large system of differential equations which are solved with a Runge-Kutta method. At each time step of the computation, we use the new algorithm to compute the velocity of the discrete particles representing the sheet.

6.2 Directions for Future Work

There are a number of ways to extend this work, which fall into three categories: investigating further the dynamics of the vortex sheet model for vortex rings, enhancing the algorithm, and applying the algorithm to other systems of equations.

The vortex sheet model for vortex ring formation appears to capture the process of vortex reconnection. It would be useful to understand this better, which would require more extensive runs and an exploration of the parameter space. For instance, one issue is to determine how the dynamics depend on $\delta$. It is also of interest to extend the simulations to later times to see how the model performs.

In this thesis, the only flows that we have considered are vortex rings modeled as rolled-up vortex sheets. More general fluid flow problems can be studied using smoothed vortex filament models as introduced by Chorin [12] and other three-dimensional vortex methods as discussed by Leonard [37]. These numerical methods reconstruct the velocity field from the vorticity field using the Biot-Savart integral (2.6), which leads to an $O(N^2)$ operation count, where $N$ is the number of computational elements. Our algorithm can be used to speed up these computations, as was done for the vortex sheet problems we studied.

The motivation for using an asymptotically fast algorithm is the $O(N^2)$ operation count of direct summation. However, if $N$ is not too large, then direct summation is feasible. So it would be advantageous to use a vortex method which is more efficient in terms of the number of discretizing particles that it uses. Though we insert points and lines when the sheet stretches, we do not remove any when they concentrate in a small region. If this could be done, then the execution time could be lowered. One possibility for doing this is the removal of vortex hairpins as described by Chorin [13, 14]. Another possibility is Lagrangian reparametrization. However, it is not clear that a rolling-up vortex sheet can be resolved with a small number of points, so these options may have only limited benefit for vortex sheet motion.

In terms of enhancing the algorithm, one direction to take is to use expansions other than Taylor series for particle-cluster interactions. Two possible classes are orthogonal polynomials and wavelets. An advantage of orthogonal polynomials is that fewer terms would be needed to satisfy error tolerances. An advantage of wavelets is that the approximant can be taken to be globally continuous, as opposed to piecewise continuous as the current method yields. This may be advantageous when the system being modeled is unstable. The main aspect of the algorithm which needs to be generalized for these changes is the computation of the expansion coefficients. If the coefficients are not computed efficiently, i.e., not in linear time with respect to the number of terms in the expansion, then the performance of the algorithm will be degraded. This is because coefficient computation will then dominate the overall operation count for particle-cluster interactions.

Another way in which the algorithm can be improved is to use a better cell dividing technique. The current technique subdivides cells by bisecting the cell's bounding box. This approach does not use any information about the internal structure of the particles in the cell, such as how the particles are grouped. Thus, it may break up natural clusters which span the cell's mid-planes. An approach which detects such internal structure could be beneficial.

In terms of studying different systems, the present algorithm can be used to study other systems which are modeled with vortex sheets, or the algorithm can be generalized to study particle systems where the interaction kernel is different from $K_\delta$. One application that is of interest is the three-dimensional simulation of the wake behind an airplane, modeled as a vortex sheet. For systems with different kernels, certain aspects of the algorithm need to be modified, although the basic idea of particle-cluster interactions and the subdivision of space are independent of the kernel. The main aspect of the algorithm which would need to be generalized is the computation of the expansion coefficients. However, a recurrence similar to (3.27) exists for the Taylor coefficients of any function $\psi$ which satisfies a linear differential equation with polynomial coefficients.
When such a differential equation is differentiated $n$ times and the Leibniz rule for differentiating a product is used, low order derivatives of $\psi$ do not appear because high order derivatives of the polynomial coefficients vanish. Thus, the Taylor coefficients will satisfy a short recurrence. For instance, consider the third-order Gaussian $\psi = \exp(-r^3)$, which has been used as a convolution function to smooth the Biot-Savart kernel [7, 32]. The function $\psi$ satisfies the differential equation

$$\psi'(r) + 3r^2\,\psi(r) = 0, \tag{6.1}$$

and its Taylor coefficients $c_n = \psi^{(n)}(r)/n!$ satisfy the recurrence

$$c_n + 3\left(r^2 c_{n-1} + 2r\,c_{n-2} + c_{n-3}\right)/n = 0. \tag{6.2}$$

So the Taylor coefficients could be computed rapidly. Thus, we believe the algorithm can be extended to a wide class of systems.
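The recurrence for the Taylor coefficients of $\exp(-r^3)$ can be checked numerically with a short Python sketch. The function names are hypothetical, and the convention that coefficients with negative index are zero (with $c_0 = \psi(r)$ as the starting value) is an assumption that follows from differentiating the differential equation fewer than three times.

```python
import math

def taylor_coeffs_exp_cubic(r, nmax):
    """Coefficients c_n = psi^(n)(r)/n! of psi(r) = exp(-r**3), n = 0..nmax,
    from c_n = -3 (r^2 c_{n-1} + 2 r c_{n-2} + c_{n-3}) / n."""
    c = [math.exp(-r ** 3)]                 # c_0 = psi(r)
    for n in range(1, nmax + 1):
        c1 = c[n - 1]
        c2 = c[n - 2] if n >= 2 else 0.0    # coefficients with negative index vanish
        c3 = c[n - 3] if n >= 3 else 0.0
        c.append(-3.0 * (r * r * c1 + 2.0 * r * c2 + c3) / n)
    return c

def taylor_eval(c, h):
    """Evaluate sum_n c_n h^n by Horner's rule."""
    total = 0.0
    for cn in reversed(c):
        total = total * h + cn
    return total

# Compare the truncated series about r with the exact value at r + h.
r, h = 0.5, 0.2
coeffs = taylor_coeffs_exp_cubic(r, 30)
error = abs(taylor_eval(coeffs, h) - math.exp(-(r + h) ** 3))
```

Each new coefficient costs a fixed number of operations, so the first $p$ coefficients are obtained in $O(p)$ work, which is the linear-time property asked of the coefficient computation earlier in this section.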

APPENDICES


APPENDIX A

Notation

$a_k(x, y)$                        Taylor coefficients for particle-cluster interaction
$b_k(\tau)$                        particle moments for cell $\tau$
$c_k(x, y)$                        Taylor coefficients of $\psi$
$c_k$                              Taylor coefficients of $\psi_1$
$h$                                convergence factor for particle-cluster interaction, error $= O(h^p)$
$k$                                wavenumber of vortex filament or ring perturbation
$K(x, y)$                          Biot-Savart kernel
$K_\delta(x, y)$                   smoothed Biot-Savart kernel
$N$                                total number of particles in simulation
$N_\tau$                           number of particles in a cluster
$N_0$                              maximum $N_\tau$ in an unsplit cell
$p(x, t)$                          fluid pressure
$p$                                order of the series truncation
$p_{\max}$                         maximum admissible order
$P(\tau)$                          parent of cell $\tau$
$r_\tau$                           radius of cell $\tau$ about $y$
$R$                                smoothed distance from $y$ to target particle
$S_n$                              sum of order $n$ terms from particle-cluster expansion using $K_\delta$
$T_n$                              sum of order $n$ terms from particle-cluster expansion using $\psi$
$u(x, t)$                          fluid velocity
$w_i(t)$                           product of $\lambda_2$ finite differences and $\lambda_1 \lambda_2$ integration weights
$x(\lambda_1, \lambda_2, t)$       position of vortex sheet
$x_i(t)$                           discrete particle approximating position on vortex sheet
$y_j$                              particles making up a cluster
$y$                                center of cell $\tau$, expansion point for Taylor series
$\alpha$                           reparametrization of $\lambda_1$
$\delta$                           smoothing parameter
$\epsilon_1, \epsilon_2$           insertion parameters in $\lambda_1$, $\lambda_2$ directions respectively
$\lambda_1$                        circulation parameter across vortex lines in vortex sheet
$\lambda_2$                        parameter along a vortex line
$\nu$                              fluid viscosity
$\rho$                             magnitude of vortex ring perturbation
$\sigma_p$                         sum of absolute value of weights in a cell
$\tau$                             cell containing a cluster of particles
$\phi(x, t)$                       velocity potential
$\phi_J(\lambda_1, \lambda_2)$     jump in $\phi$ across vortex sheet
$\Phi(x, t)$                       flow map
$\psi$                             potential function for $K_\delta$
$\psi_1$                           one-dimensional analogue of $\psi$
$\omega(x, t)$                     vorticity
$[\,\cdot\,]$                      jump in $\cdot$ across vortex sheet

APPENDIX B

Cylindrical Coordinate Identities

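Change of Basis Formulas

$$e_r(\tilde\theta) = \cos(\tilde\theta - \theta)\, e_r(\theta) + \sin(\tilde\theta - \theta)\, e_\theta(\theta) \tag{B.1}$$

$$e_\theta(\tilde\theta) = -\sin(\tilde\theta - \theta)\, e_r(\theta) + \cos(\tilde\theta - \theta)\, e_\theta(\theta) \tag{B.2}$$

Derivatives of Basis Vectors

$$\frac{d}{d\theta}\, e_r(\theta) = e_\theta(\theta) \tag{B.3}$$

$$\frac{d}{d\theta}\, e_\theta(\theta) = -e_r(\theta) \tag{B.4}$$

Cross Product Relationships

$$e_r \times e_\theta = e_z, \qquad e_\theta \times e_r = -e_z \tag{B.5}$$

$$e_\theta \times e_z = e_r, \qquad e_z \times e_\theta = -e_r \tag{B.6}$$

$$e_z \times e_r = e_\theta, \qquad e_r \times e_z = -e_\theta \tag{B.7}$$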

APPENDIX C

Details from Circular Filament Analysis

This appendix contains some details of the analysis of a circular vortex filament from Section 2.3.2.

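C.1 Propagation Speed of Circular Filament

The initial conditions, in cylindrical coordinates, are

$$y(\lambda, 0) = (R, \lambda, 0). \tag{C.1}$$

The evolution equation for $y(\lambda, t)$ is

$$\frac{\partial y}{\partial t}(\lambda, t) = \int_0^{2\pi} K_\delta\big(y(\lambda, t), y(\tilde\lambda, t)\big) \times \frac{\partial y}{\partial \lambda}(\tilde\lambda, t)\, d\tilde\lambda, \tag{C.2}$$

where

$$K_\delta(x, y) = -\frac{1}{4\pi}\, \frac{x - y}{\left(|x - y|^2 + \delta^2\right)^{3/2}}. \tag{C.3}$$

To evaluate the integral in (C.2), we express the integrand in terms of the cylindrical basis at $(R, \lambda, 0)$. This ensures that the basis elements are independent of $\tilde\lambda$, the variable of integration, so that they can be factored out of the integral. The first expression to compute is $y(\lambda, 0) - y(\tilde\lambda, 0)$,

\begin{align}
y(\lambda, 0) - y(\tilde\lambda, 0) &= R\, e_r(\lambda) - R\, e_r(\tilde\lambda) \tag{C.4} \\
&= R\, e_r(\lambda) - R\left(\cos(\tilde\lambda - \lambda)\, e_r(\lambda) + \sin(\tilde\lambda - \lambda)\, e_\theta(\lambda)\right) \tag{C.5} \\
&= R\left(1 - \cos(\tilde\lambda - \lambda)\right) e_r(\lambda) - R \sin(\tilde\lambda - \lambda)\, e_\theta(\lambda). \tag{C.6}
\end{align}

Thus, the denominator in $K_\delta(y(\lambda, 0), y(\tilde\lambda, 0))$ is

\begin{align}
\left(|y(\lambda, 0) - y(\tilde\lambda, 0)|^2 + \delta^2\right)^{3/2} &= \left(R^2 (1 - \cos(\tilde\lambda - \lambda))^2 + R^2 \sin^2(\tilde\lambda - \lambda) + \delta^2\right)^{3/2} \tag{C.7} \\
&= \left(2R^2 (1 - \cos(\tilde\lambda - \lambda)) + \delta^2\right)^{3/2} \tag{C.8} \\
&= R^3 \left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{3/2}. \tag{C.9}
\end{align}

The partial derivative term in the integrand is

\begin{align}
\frac{\partial}{\partial \tilde\lambda}\, R\, e_r(\tilde\lambda) &= R\, e_\theta(\tilde\lambda) \tag{C.10} \\
&= R\left(-\sin(\tilde\lambda - \lambda)\, e_r(\lambda) + \cos(\tilde\lambda - \lambda)\, e_\theta(\lambda)\right). \tag{C.11}
\end{align}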

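Dropping the $\lambda$ dependence of the basis vectors, since all of the vectors are based at $\theta = \lambda$, we have

\begin{align}
\left(y(\lambda, 0) - y(\tilde\lambda, 0)\right) \times \frac{\partial y}{\partial \lambda}(\tilde\lambda, 0)
&= R^2 \left[(1 - \cos(\tilde\lambda - \lambda))\, e_r - \sin(\tilde\lambda - \lambda)\, e_\theta\right] \times \left[-\sin(\tilde\lambda - \lambda)\, e_r + \cos(\tilde\lambda - \lambda)\, e_\theta\right] \tag{C.12} \\
&= R^2 \left[(1 - \cos(\tilde\lambda - \lambda)) \cos(\tilde\lambda - \lambda) - \sin^2(\tilde\lambda - \lambda)\right] e_z \tag{C.13} \\
&= R^2 \left(\cos(\tilde\lambda - \lambda) - 1\right) e_z. \tag{C.14}
\end{align}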

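Thus, the velocity of the filament is

\begin{align}
U &= \frac{e_z}{4\pi R} \int_0^{2\pi} \frac{1 - \cos(\tilde\lambda - \lambda)}{\left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{3/2}}\, d\tilde\lambda \tag{C.15} \\
&= \frac{e_z}{4\pi R} \int_0^{2\pi} \frac{1 - \cos\tilde\lambda}{\left(2(1 - \cos\tilde\lambda) + (\delta/R)^2\right)^{3/2}}\, d\tilde\lambda, \tag{C.16}
\end{align}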

as stated in Section 2.3.2.
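As a numerical cross-check of (C.16), the following Python sketch evaluates the integral with the trapezoid rule and compares it with a direct evaluation of the smoothed Biot-Savart integral over the ring in Cartesian coordinates. The function names, the quadrature rule, the number of quadrature points and the unit circulation implied by (C.2) are assumptions made for illustration; this is not the code used for the thesis.

```python
import math

def ring_speed(R, delta, n=2000):
    """Trapezoid-rule value of the z-velocity in (C.16) for a ring of radius R."""
    a2 = (delta / R) ** 2
    h = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        c = 1.0 - math.cos(j * h)
        total += c / (2.0 * c + a2) ** 1.5
    return total * h / (4.0 * math.pi * R)

def ring_velocity_direct(R, delta, n=2000):
    """Velocity at the point (R, 0, 0) from the integral of K_delta x dy/dlambda,
    evaluated in Cartesian coordinates with the same trapezoid rule."""
    h = 2.0 * math.pi / n
    x = (R, 0.0, 0.0)
    u = [0.0, 0.0, 0.0]
    for j in range(n):
        lam = j * h
        y = (R * math.cos(lam), R * math.sin(lam), 0.0)      # point on the ring
        t = (-R * math.sin(lam), R * math.cos(lam), 0.0)     # dy/dlambda
        d = (x[0] - y[0], x[1] - y[1], x[2] - y[2])
        denom = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + delta ** 2) ** 1.5
        k = [-di / (4.0 * math.pi * denom) for di in d]      # smoothed kernel
        # Accumulate the cross product k x t.
        u[0] += (k[1] * t[2] - k[2] * t[1]) * h
        u[1] += (k[2] * t[0] - k[0] * t[2]) * h
        u[2] += (k[0] * t[1] - k[1] * t[0]) * h
    return tuple(u)

U = ring_speed(1.0, 0.1)
u = ring_velocity_direct(1.0, 0.1)
```

The two routines should agree to round-off, with `u` pointing purely along $e_z$, since the second simply repeats the algebra of (C.4) through (C.14) numerically. The formula also shows that $U$ depends on $\delta$ only through $\delta/R$ and scales like $1/R$.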

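C.2 Linearized Evolution Equations for Perturbation

When a perturbation $p(\lambda, t)$ is added to a circular vortex filament, the perturbation satisfies an integro-differential equation. As described in Section 2.3.2, we expand $p(\lambda, t)$ in terms of the cylindrical basis $e_r(\lambda)$, $e_\theta(\lambda)$ and $e_z$, obtaining

$$p(\lambda, t) = p_r(\lambda, t)\, e_r(\lambda) + p_\theta(\lambda, t)\, e_\theta(\lambda) + p_z(\lambda, t)\, e_z. \tag{C.17}$$

When the evolution equation for $p(\lambda, t)$ is linearized about the steady solution $p(\lambda, t) = 0$, the result is

$$\frac{\partial p_r}{\partial t}(\lambda) = \frac{1}{4\pi R^2} \int_0^{2\pi} \frac{\sin(\tilde\lambda - \lambda)\, \frac{\partial p_z}{\partial \lambda}(\tilde\lambda) + \cos(\tilde\lambda - \lambda)\left(p_z(\lambda) - p_z(\tilde\lambda)\right)}{\left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{3/2}}\, d\tilde\lambda, \tag{C.18a}$$

$$\frac{\partial p_\theta}{\partial t}(\lambda) = \frac{1}{4\pi R^2} \int_0^{2\pi} \frac{(1 - \cos(\tilde\lambda - \lambda))\, \frac{\partial p_z}{\partial \lambda}(\tilde\lambda) + \sin(\tilde\lambda - \lambda)\left(p_z(\lambda) - p_z(\tilde\lambda)\right)}{\left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{3/2}}\, d\tilde\lambda, \tag{C.18b}$$

$$\begin{aligned}
\frac{\partial p_z}{\partial t}(\lambda) = \frac{1}{4\pi R^2} \int_0^{2\pi} \Bigg[\,
& \frac{p_r(\tilde\lambda) - p_r(\lambda) + (1 - \cos(\tilde\lambda - \lambda))\left(p_r(\lambda) + p_r(\tilde\lambda) + \frac{\partial p_\theta}{\partial \lambda}(\tilde\lambda)\right)}{\left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{3/2}} \\
& - \frac{\sin(\tilde\lambda - \lambda)\left(p_\theta(\lambda) - p_\theta(\tilde\lambda) + \frac{\partial p_r}{\partial \lambda}(\tilde\lambda)\right)}{\left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{3/2}} \\
& - 3\, \frac{(1 - \cos(\tilde\lambda - \lambda))^2 \left(p_r(\lambda) + p_r(\tilde\lambda)\right)}{\left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{5/2}} \\
& + 3\, \frac{(1 - \cos(\tilde\lambda - \lambda)) \sin(\tilde\lambda - \lambda)\left(p_\theta(\lambda) - p_\theta(\tilde\lambda)\right)}{\left(2(1 - \cos(\tilde\lambda - \lambda)) + (\delta/R)^2\right)^{5/2}}
\,\Bigg]\, d\tilde\lambda.
\end{aligned} \tag{C.18c}$$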
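The structure of (C.18) can be seen from a general first-order expansion of the integrand in (C.2). What follows is a sketch under stated assumptions rather than the derivation given in the thesis; the shorthand $d$, $q$, $t_0$ and $D_0$ is introduced here only for illustration. Write the perturbed filament as $y + p$ and set

$$d = y(\lambda) - y(\tilde\lambda), \qquad q = p(\lambda) - p(\tilde\lambda), \qquad t_0 = \frac{\partial y}{\partial \lambda}(\tilde\lambda), \qquad D_0 = |d|^2 + \delta^2 .$$

Expanding $(|d + q|^2 + \delta^2)^{-3/2} = D_0^{-3/2}\left(1 - 3\,(d \cdot q)/D_0 + \cdots\right)$ and keeping the terms that are linear in $p$ gives

$$\frac{(d + q) \times \left(t_0 + \frac{\partial p}{\partial \lambda}(\tilde\lambda)\right)}{\left(|d + q|^2 + \delta^2\right)^{3/2}} - \frac{d \times t_0}{D_0^{3/2}} \approx \frac{q \times t_0 + d \times \frac{\partial p}{\partial \lambda}(\tilde\lambda)}{D_0^{3/2}} - 3\, \frac{(d \cdot q)\,(d \times t_0)}{D_0^{5/2}} .$$

By (C.14), $d \times t_0$ is parallel to $e_z$, so the $D_0^{5/2}$ term contributes only to the $e_z$ component. This appears to be why terms with exponent $5/2$ occur in (C.18c) but not in (C.18a) or (C.18b).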

BIBLIOGRAPHY

[1] C. Anderson and C. Greengard. The vortex ring merger problem at infinite Reynolds number. Comm. Pure Appl. Math., 42(8):1123–1139, 1989.

[2] C. R. Anderson. An implementation of the fast multipole method without multipoles. SIAM J. Sci. Statist. Comput., 13(4):923–947, 1992.

[3] A. Appel. An efficient program for many-body simulation. SIAM J. Sci. Statist. Comput., 6(1):85–103, 1985.

[4] H. Aref and I. Zawadzki. Linking of vortex rings. Nature, 354(6348):50–53, 1991.

[5] J. Barnes and P. Hut. A hierarchical O(N log N) force-calculation algorithm. Nature, 324(6096):446–449, 1986.

[6] G. K. Batchelor. An Introduction to Fluid Dynamics. Cambridge University Press, 1967.

[7] J. T. Beale and A. Majda. High order accurate vortex methods with explicit velocity kernels. J. Comput. Phys., 58(2):188–208, 1985.

[8] G. Birkhoff. Helmholtz and Taylor instability. In Proc. Sympos. Appl. Math., Vol. XIII, pages 55–76, 1962.

[9] G. L. Brown and A. Roshko. On density effects and large structure in turbulent mixing layers. J. Fluid Mech., 64:775–816, 1974.

[10] R. Caflisch. Mathematical analysis of vortex dynamics. In Mathematical aspects of vortex dynamics (Leesburg, VA, 1988), pages 1–24, 1989.

[11] J. Carrier, L. Greengard, and V. Rokhlin. A fast adaptive multipole algorithm for particle simulations. SIAM J. Sci. Statist. Comput., 9(4):669–686, 1988.

[12] A. J. Chorin. The evolution of a turbulent vortex. Comm. Math. Phys., 83(4):517–535, 1982.

[13] A. J. Chorin. Hairpin removal in vortex interactions. J. Comput. Phys., 91(1):1–21, 1990.

[14] A. J. Chorin. Hairpin removal in vortex interactions II. J. Comput. Phys., 107(1):1–9, 1993.

[15] A. J. Chorin and P. S. Bernard. Discretization of a vortex sheet, with an example of roll-up. J. Comput. Phys., 13(3):423–429, 1973.

[16] S. C. Crow. Stability theory for a pair of trailing vortices. AIAA J., 8(12):2172–2179, 1970.

[17] J. Delort. Existence de nappes de tourbillon en dimension deux. J. Amer. Math. Soc., 4(3):553–586, 1991.

[18] M. R. Dhanak and B. de Bernardinis. The evolution of an elliptic vortex ring. J. Fluid Mech., 109:189–216, 1981.

[19] N. Didden. Investigation of laminar, unstable vortex rings by means of laser-Doppler anemometry. Mitt. Max-Planck-Institut Strömungsforschung Aero. Versuch., 64, 1977.

[20] N. Didden. On the formation of vortex rings: rolling-up and production of circulation. Z. Angew. Math. Phys., 30:101–116, 1979.

[21] C. Draghicescu and M. Draghicescu. A fast algorithm for vortex blob interactions. J. Comput. Phys., 116(1):69–78, 1995.

[22] A. Erdélyi, W. Magnus, F. Oberhettinger, and F. Tricomi. Higher Transcendental Functions, volume II. McGraw-Hill, 1953.

[23] V. M. Fernandez, N. J. Zabusky, and V. M. Gryanik. Vortex intensification and collapse of the Lissajous-elliptic ring: single- and multi-filament Biot-Savart simulations and visiometrics. J. Fluid Mech., 299:289–331, 1995.

[24] L. Greengard and V. Rokhlin. A fast algorithm for particle simulations. J. Comput. Phys., 73(2):325–348, 1987.

[25] L. Greengard and V. Rokhlin. The rapid evaluation of potential fields in three dimensions. In C. Anderson and C. Greengard, editors, Vortex methods (Los Angeles, CA, 1987), number 1360 in Lecture Notes in Mathematics, pages 121–141. Springer-Verlag, 1988.

[26] L. Greengard and V. Rokhlin. A new version of the fast multipole method for the Laplace equation in three dimensions. Research Report 1115, Yale University Department of Computer Science, 1996.

[27] R. W. Hockney and J. W. Eastwood. Computer Simulations Using Particles. McGraw-Hill, New York, 1981.

[28] Y. Kaneda. A representation of the motion of a vortex sheet in a three-dimensional flow. Phys. Fluids A, 2(3):458–461, 1990.

[29] B. W. Kernighan and D. M. Ritchie. The C Programming Language. Prentice Hall, 2nd edition, 1988.

[30] S. Kida, M. Takaoka, and F. Hussain. Reconnection of two vortex rings. Phys. Fluids A, 1(4):630–632, 1989.

[31] S. Kida, M. Takaoka, and F. Hussain. Collision of two vortex rings. J. Fluid Mech., 230:583–646, 1991.

[32] O. Knio and A. Ghoniem. Numerical study of a three-dimensional vortex method. J. Comput. Phys., 86(1):75–106, 1990.

[33] R. Krasny. Desingularization of periodic vortex sheet roll-up. J. Comput. Phys., 65(2):292–313, 1986.

[34] R. Krasny. A study of singularity formation in a vortex sheet by the point-vortex approximation. J. Fluid Mech., 167:65–93, 1986.

[35] C. H. Krutzsch. Über eine experimentell beobachtete Erscheinung an Wirbelringen bei ihrer translatorischen Bewegung in wirklichen Flüssigkeiten. Ann. Phys., 35(5):497–523, 1939.

[36] H. Lamb. Hydrodynamics. Dover Publications, New York, 6th edition, 1945.

[37] A. Leonard. Computing three-dimensional incompressible flows with vortex elements. Ann. Rev. Fluid. Mech., 17:523–559, 1985.

[38] A. Lifschitz, W. Suters, and J. T. Beale. The onset of instability in exact vortex rings with swirl. J. Comput. Phys., 129(1):8–29, 1996.

[39] J. Liu and Z. Xin. Convergence of vortex methods for weak solutions to the 2-D Euler equations with vortex sheet data. Comm. Pure Appl. Math., 48(6):611–628, 1995.

[40] A. J. Majda. The interaction of nonlinear analysis and modern applied mathematics. In Proceedings of the International Congress of Mathematicians, Vol. I, II (Kyoto, 1990), pages 175–191, 1991.

[41] A. J. Majda. Remarks on weak solutions for vortex sheets with a distinguished sign. Indiana Univ. Math. J., 42(3):921–939, 1993.

[42] T. Maxworthy. The structure and stability of vortex rings. J. Fluid Mech., 51:15–32, 1972.

[43] T. Maxworthy. Some experimental studies of vortex rings. J. Fluid Mech., 81:465–495, 1977.

[44] E. Meiburg, J. C. Lasheras, and J. E. Martin. Experimental and numerical analysis of the three-dimensional evolution of an axisymmetric jet. In Turbulent Shear Flows 7 (Stanford University, USA, 1989), pages 195–208, 1991.

[45] D. W. Moore. Finite amplitude waves on aircraft trailing vortices. Aeronautical Quarterly, 23:307–314, 1972.

[46] D. W. Moore. The spontaneous appearance of a singularity in the shape of an evolving vortex sheet. Proc. Roy. Soc. London Ser. A, 365(1720):105–119, 1979.

[47] M. Nitsche and R. Krasny. A numerical study of vortex ring formation at the edge of a circular tube. J. Fluid Mech., 276:139–161, 1994.

[48] D. I. Pullin. The large-scale structure of unsteady self-similar rolled-up vortex sheets. J. Fluid Mech., 88(3):401–430, 1978.

[49] A. Pumir and R. M. Kerr. Numerical simulation of interacting vortex tubes. Phys. Rev. Lett., 58(16):1636–1639, 1987.

[50] A. Pumir and E. D. Siggia. Vortex dynamics and the existence of solutions to the Navier-Stokes equations. Phys. Fluids, 30(6):1606–1626, 1987.

[51] L. Rosenhead. The spread of vorticity in the wake behind a cylinder. Proc. Roy. Soc. Ser. A, 127:590–612, 1930.

[52] P. G. Saffman. The number of waves on unstable vortex rings. J. Fluid Mech., 84(4):625–639, 1978.

[53] P. G. Saffman. A model of vortex reconnection. J. Fluid Mech., 212:395–402, 1990.

[54] J. K. Salmon and M. S. Warren. Skeletons from the treecode closet. J. Comput. Phys., 111(1):136–155, 1994.

[55] P. R. Schatzle. An experimental study of fusion of vortex rings. PhD thesis, California Institute of Technology, 1987.

[56] K. Shariff and A. Leonard. Vortex rings. Ann. Rev. Fluid. Mech., 24:235–279, 1992.

[57] G. I. Taylor. Formation of a vortex ring by giving an impulse to a circular disk and then dissolving it away. J. Appl. Phys., 24(1):104, 1953.

[58] J. J. Thomson and H. F. Newall. On the formation of vortex rings by drops falling into liquids, and some allied phenomena. Proc. Roy. Soc. Ser. A, 39:417–436, 1885.

[59] G. Tryggvason, W. J. A. Dahm, and K. Sbeih. Fine structure of vortex sheet rollup by viscous and inviscid simulation. J. Fluids Eng., 113(1):31–36, 1991.

[60] L. van Dommelen and E. A. Rundensteiner. Fast, adaptive summation of point forces in the two-dimensional Poisson equation. J. Comput. Phys., 83(1):126–147, 1989.

[61] S. E. Widnall, D. B. Bliss, and C. Tsai. The instability of short waves on a vortex ring. J. Fluid Mech., 66:35–47, 1974.

[62] S. E. Widnall and J. P. Sullivan. On the stability of vortex rings. Proc. Roy. Soc. London Ser. A, 332:335–353, 1973.

[63] G. S. Winckelmans. Topics in vortex methods for the computation of three- and two-dimensional incompressible unsteady flows. PhD thesis, California Institute of Technology, 1989.

[64] G. S. Winckelmans, J. K. Salmon, A. Leonard, and M. S. Warren. Three-dimensional vortex particle and panel methods: fast tree-code solvers with active error control for arbitrary distributions/geometries. In Forum on Vortex Methods for Engineering Applications (Albuquerque, NM, 1995), pages 23–43, 1995.

[65] F. Zhao. An O(N) algorithm for three-dimensional N-body simulations. Master's thesis, Massachusetts Institute of Technology, 1987.

ABSTRACT
A THREE-DIMENSIONAL CARTESIAN TREE-CODE AND APPLICATIONS TO VORTEX SHEET ROLL-UP

by Keith Lindsay

Chair: Robert Krasny

An algorithm is presented for the rapid computation of vortex sheet motion in three-dimensional fluid flow. The equations governing vortex sheet motion, considered in Lagrangian form, are desingularized and discretized, resulting in a system of equations for the $N$ discretizing particles. Since the particles interact pairwise, evaluating the velocities by direct summation requires $O(N^2)$ operations, which becomes prohibitively expensive as $N$ increases. Based on measured execution times, the new algorithm computes the particle interactions with $O(N \log N)$ operations. The additional memory required by the algorithm is less than 60% of the memory used by a direct summation algorithm. The algorithm extends Draghicescu's algorithm from two to three space dimensions. The main ingredients are the replacement of particle-particle interactions with particle-cluster interactions which are based on Cartesian Taylor series expansions and the use of an adaptive tree-based subdivision of space to create the particle clusters. An important feature of the algorithm is the use of recurrences to compute the expansion coefficients. The recurrences are a generalization of those used by Draghicescu. The new features of the algorithm are its application to a non-harmonic three-dimensional kernel, its adaptive subdivision of space and adaptive error control. The algorithm is used to study the dynamics of vortex rings which are modeled as rolling up vortex sheets. An adaptive point insertion algorithm is used to ensure that the vortex sheets are accurately resolved as they stretch. The problems considered are azimuthal vortex ring instabilities, the evolution of an elliptical vortex ring, and the collision of two vortex rings. In the last problem, the vorticity in the rings appears to connect, due to superposition, even though the vortex sheet model does not explicitly account for viscous effects and the sheets themselves do not connect.
