
48th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition
4-7 January 2010, Orlando, Florida

AIAA 2010-450

A Three-Level Cartesian Geometry Based Implementation of the DSMC Method


Da Gao, Chonglin Zhang, and Thomas E. Schwartzentruber
Department of Aerospace Engineering and Mechanics, University of Minnesota, Minneapolis, MN 55455

Research Associate, Member AIAA. Email: dagao@aem.umn.edu.
Research Assistant, Student Member AIAA. Email: zhang993@umn.edu.
Assistant Professor, Member AIAA. Email: schwartz@aem.umn.edu.

The data structures and overall algorithms of a newly developed 3-D direct simulation Monte Carlo (DSMC) program are outlined. The code employs an embedded 3-level Cartesian mesh, accompanied by a cut-cell algorithm to incorporate triangulated surface geometry into the adaptively refined Cartesian mesh. Such an approach enables decoupling of the surface mesh from the flow field mesh, which is desirable for near-continuum flows, flows with large density variation, and also for adaptive mesh refinement (AMR). Two separate data structures are proposed in order to separate geometry data from cell and particle information. The geometry data structure requires little memory so that each partition in a parallel simulation can store the entire mesh, potentially leading to better scalability and efficient AMR for parallel simulations. A simple and efficient AMR algorithm that maintains local cell size and time step consistent with the local mean-free-path and local mean collision time is detailed. The 3-level embedded Cartesian mesh combined with AMR allows increased flexibility for precise control of local mesh size and time-step, both vital for accurate and efficient DSMC simulation. Simulations highlighting the benefits of AMR and variable local time steps will be presented along with DSMC results for 3-D flows with large density variations.

I. Introduction
DIRECT Simulation Monte Carlo (DSMC) is a particle-based numerical method¹ that simulates the Boltzmann equation. As a result, DSMC is an accurate simulation tool for modeling dilute gas flows ranging from continuum to free-molecular conditions. The DSMC method tracks a representative number of simulation particles through a computational mesh, with each simulation particle representing a large number of real gas molecules. A key aspect of the method is that molecular movement and collision processes are decoupled. Specifically, during each simulation time step, all particles are first translated along their molecular velocity vector without interaction. After the movement phase, nearby particles that lie within the same computational cell are collided in a statistical manner. This decoupling is accurate for dilute gas flows as long as the simulation time step remains less than the local mean-collision-time (Δt < τc) and the cell size dimension remains less than the local mean-free-path (Δx < λ). Unlike the Molecular Dynamics technique, precise molecular positions are not integrated using physically realistic interatomic potentials. Rather, simulation particles within each DSMC cell are randomly selected to undergo collisions. This enables the DSMC method to require only a small number of simulation particles per cell (~20) in order to obtain acceptable collision statistics and an accurate solution to the Boltzmann equation. If the correct collision rate in each cell is specified, in addition to the correct probabilities of internal energy exchange and chemical reaction during each collision, the DSMC method can accurately simulate complex dilute gas flows.

The numerical restrictions on cell size and time step become significant for full-scale 3D flows at moderate densities, where small λ and τc require large numbers of computational cells (and thus particles) and many simulation time steps. It has been shown that reducing the cell size and time step significantly below λ and τc results in a negligible improvement in accuracy.²,³ As a result, an ideal DSMC simulation (that maximizes both accuracy and efficiency) would have every computational cell adapted to λ (each containing ~20 simulation particles) and move all particles for a local time step of τc. Typically, Δx ≈ λ/2 and Δt ≈ τc/5. For flows involving little variation in density (λ ≈ constant, τc ≈ constant), a uniform mesh and simulation time step is accurate, efficient, and easy to implement. However, many low density flows are associated with high-altitude flight which, in turn, is inherently linked to high flow velocities



and significant compressibility effects. Such flows can exhibit orders-of-magnitude variation in density and therefore present a challenge for the DSMC method, both in terms of accuracy and in terms of computational efficiency.

Existing state-of-the-art DSMC codes generally adopt one of two approaches for the geometry model. Here, geometry model refers to both the computational mesh and the surface representation. For example, the MONACO code⁴ utilizes an unstructured body-fitted quadrilateral or tetrahedral geometry model. Such tetrahedral meshes are easily fitted to complex surface geometries. MONACO is also equipped with a pre-processor that enables the generation of a new mesh that is refined based on the local values from a previous solution. In addition, different particle weights and time steps can be specified in each cell. The Icarus code⁵ uses a multi-block system of algebraic meshes to define the computational domain. This grid system allows flexibility for capturing local high gradient regions without excessively gridding the entire domain because there is no requirement for side correspondence between regions.⁵

In contrast to body-fitted grids, the second approach is to use a Cartesian-based geometry model. Here a Cartesian grid is used for the flow field, and the surface mesh must be cut from the Cartesian grid in order to simulate complex geometries. For example, the DSMC Analysis Code (DAC)⁶ uses a two-level Cartesian grid where each level-1 cell can be refined into any number (in each co-ordinate direction) of level-2 Cartesian cells. A pre-processor enables a new mesh to be adapted using a previous flow solution, and different particle weights and time steps can be specified for each level-1 cell. The DAC code then employs a cut-cell method that cuts an arbitrary triangulated surface mesh from the 2-level Cartesian flow field grid. Similarly, the SMILE code⁷ also employs a 2-level Cartesian grid and a cut-cell method for simulating complex surface geometries. Finally, the DSnV codes¹ use an equally-spaced background Cartesian grid where each division is further subdivided into a large number of Cartesian elements such that the number of elements is on the order of the number of simulated molecules. Collision and sampling cells are then formed by grouping nearby elements together into cells that are adapted locally to λ and contain a constant number of particles. However, caution must be exercised, as grouping such smaller elements may lead to highly irregular-shaped collision and sampling cells.

The purpose of this article is to describe in detail a new DSMC implementation called the Molecular Gas Dynamic Simulator (MGDS) code. The MGDS code extends upon the above-mentioned DSMC approaches; of particular interest is the manner in which the geometry model and particle information are stored in memory and the potential implications for large parallel simulations. Specifically, the geometry model chosen for the MGDS code is a 3-level Cartesian flow field grid utilizing a cut-cell method to imbed arbitrary triangulated surface geometries. The MGDS code performs fully automated adaptive mesh refinement (AMR) during a DSMC simulation and automatically sets different time steps in each level-3 cell. The major reasons for development of this particular geometry model (in order of importance) are as follows.

1. The memory required to store cell-node vertices and cell-face normal vectors for unstructured tetrahedral meshes is substantially larger than the memory required to store a multi-level Cartesian grid. For large flow field meshes (> 100 million cells) an unstructured tetrahedral mesh must not only be partitioned across parallel processors during a DSMC simulation, but pre/post-processing, grid generation, and AMR routines must also be parallelized due to the large memory requirements. A Cartesian geometry model significantly alleviates these problems and may therefore be more suitable for future large-scale DSMC simulations.

2. A Cartesian geometry model necessitates a cut-cell method in order to handle complex surface geometries. Although this may seem like a drawback, a cut-cell approach is an extremely general and accurate technique for complex geometries.⁶,⁸ As discussed by LeBeau et al.,⁶ a cut-cell approach allows for complete decoupling between the flow field and surface mesh discretization. Such decoupling is especially important for the DSMC method and may be necessary for large-scale simulations of near-continuum flows, where the mean-free-path near a vehicle surface may be orders of magnitude smaller than accurate surface resolution requires. Likewise, decoupling is equally important for low density flows (such as low orbiting satellites) where the local value of λ may be much larger than fine surface geometry features.

3. The MGDS code employs a 3-level embedded Cartesian grid for the flow field. Adding the third independent level of Cartesian refinement adds considerable flexibility over previous 2-level implementations for flows with large density gradients. Precise AMR is essential for both simulation accuracy and computational efficiency.

4. Use of a Cartesian geometry model not only results in lower memory storage requirements, but also enables many DSMC calculations to be performed with fewer floating point operations. Algorithms for initial grid generation, successive AMR, particle movement, and particle re-sorting during AMR require fewer operations when using a Cartesian grid compared to a non-Cartesian grid. The efficiency of such operations, and the degree to which they can be automated, will become increasingly important as larger DSMC simulations are performed, especially for time-accurate unsteady problems where frequent AMR is required.

The spatial and temporal discretization required for accurate and efficient DSMC simulation is inherently linked to the solution itself. Thus, the DSMC method is ideal for fully automated grid generation, AMR, and variable time steps, potentially requiring little-to-no user input. As computational resources continue to grow rapidly, such automation would avoid current bottlenecks in total simulation turnover created by high demands on code users for large-scale parallel flow problems. In addition to providing a new DSMC simulation tool for complex flows over 3D geometries, development of the MGDS code will provide independent evaluation of existing DSMC implementation approaches, as well as investigation of new approaches specifically geared towards future large-scale parallel simulations.

The article is organized as follows. In Section II the data structures used to organize geometry, cell, and particle data are described, after which particle movement algorithms are outlined. In Section III the main DSMC algorithm, the AMR and variable time-step algorithm, and the cut-cell algorithm are described. Results for various 3D simulations that demonstrate the benefits of 3-level AMR and variable time steps are presented in Section IV. Section V contains a summary of the major aspects of the MGDS code and conclusions drawn from the preliminary results.

II. Data Structures


A. Geometry Data Structure

Figure 1. Geometry data structure used within the MGDS code.

The Geometry data structure used within the MGDS code is depicted in Fig. 1. It consists of a 3D array of level-one (L1) cell structures. Each L1 cell structure contains the co-ordinates of the bounding Cartesian vertices (x0, y0, z0, xE, yE, zE) and a pointer to a 3D array of level-two (L2) cell structures. This 3D array can have arbitrary dimensions, meaning each L1 cell can contain an arbitrary number of L2 cells. Each L2 cell structure is identical to an L1 cell structure, and therefore contains a pointer to a 3D array of L3 cell structures. In addition to storing the bounding vertex coordinates, each L3 cell structure contains the local values of τc, λ, Δt, and currently the maximum expected collision rate ⟨σcr⟩max. Also, each L3 cell structure contains a pointer to an array of surface structures. The majority of cells will contain no boundary surfaces and the pointer will be null. However, for cells which do contain boundary surfaces (described in Section III C.), each surface structure in the array contains vertex, surface normal, area, surface type, and sampling variables. Given the coordinates of a particle, the Geometry data structure enables efficient cell-indexing of the particle. The most important aspect of this Geometry data structure (Fig. 1) is that it contains all of the information required for the AMR, cut-cell, and variable time step (movement) procedures while requiring relatively little memory storage.
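One plausible rendering of this nesting uses Fortran 90 derived types (the language the MGDS code is written in, as noted at the end of this section). The following is only a sketch; all type and field names are hypothetical, not taken from the MGDS source:

    module geometry_sketch
       implicit none

       type :: SurfElem                         ! one triangulated surface element
          real :: v1(3), v2(3), v3(3)           ! triangle vertices
          real :: normal(3), area
          integer :: surfType                   ! wall / inflow / outflow / symmetry
       end type SurfElem

       type :: L3Cell
          real :: x0, y0, z0, xE, yE, zE        ! bounding Cartesian vertices
          real :: tau_c, lambda, dt, sigcr_max  ! local tau_c, lambda, dt, <sigma*cr>_max
          integer :: proc, cellIdx              ! partition and local index of the Cell/Particle data
          type(SurfElem), pointer :: surf(:)    ! null except for cells cut by a boundary surface
       end type L3Cell

       type :: L2Cell
          real :: x0, y0, z0, xE, yE, zE
          type(L3Cell), pointer :: sub(:,:,:)   ! arbitrary dimensions in each direction
       end type L2Cell

       type :: L1Cell
          real :: x0, y0, z0, xE, yE, zE
          type(L2Cell), pointer :: sub(:,:,:)
       end type L1Cell

       type(L1Cell), pointer :: grid(:,:,:)     ! the top-level 3D array of L1 cells
    end module geometry_sketch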

Clearly the largest contribution to memory storage requirements comes from the data stored in each L3 cell. While the current MGDS implementation stores vertex coordinates for each L3 cell, this information could simply be computed from the L2 vertex information and the Cartesian index of the L3 cell, thereby greatly reducing the memory storage requirement. Only the small percentage of L3 cells that intersect a boundary surface will contain arrays of surface structures, which clearly involve substantially larger memory requirements. Finally, each L3 cell structure contains both the processor number and local array index (on that processor) where its particle data and additional cell data are stored.

It is proposed that the entire Geometry data structure could be stored locally on each processor in a parallel simulation, even for large DSMC flow field and surface meshes. For simulations where highly complex surface geometry prohibits this, at a minimum, large portions of the Geometry data structure could be stored on each processor. The more memory-intensive cell and particle data associated with each L3 cell (detailed in the next section) must be partitioned across parallel processors. An important aspect of the MGDS data structures is that any L3 cell data can be partitioned to any processor. That is, domain decomposition is not limited to L1 or L2 cells, which is essential for load balancing of large parallel simulations. The proposed benefits of such a Geometry data structure for parallel simulations, including efficient movement of particles with variable time steps and efficient AMR, will be discussed in upcoming sections.

B. Cell/Particle Data Structure

Figure 2. Cell/Particle data structure used within the MGDS code.

A DSMC code requires the storage and book-keeping of a large amount of data. Complex data structures are often employed for data organization and can provide the flexibility required for a general DSMC implementation. Particle data is the dominant contributor to memory; however, the global number of particles, as well as the local number within each cell, can vary greatly within a DSMC simulation. Thus, statically allocated arrays of particles must either be conservatively large (unused memory) or re-sized often (computationally and memory intensive). Likewise, since the mesh resolution in DSMC depends on the solution itself, ideally the number of cells would change during a simulation, and data structures should account for this possibility as well. The MGDS code adopts the cell and particle data structures approach used in the MONACO code⁴ with certain modifications. The specific Cell/Particle data structure used by the MGDS code is depicted in Fig. 2. On each parallel partition, an array of L3 cells is maintained. This array is actually an array of pointers to L3 Cell/Particle structures, so that if the array requires resizing (during AMR), resizing an array of pointers is much less memory intensive and more efficient than resizing an array of Cell/Particle structures. Each cell-data structure contains substantial data (currently including duplicated data from the Geometry data structure). Each Cell/Particle structure also contains an array that stores sampled (cell-averaged) data of the particles within the cell. Currently each Cell/Particle structure also contains the nine integer indices of the corresponding L3 cell in the Geometry data structure. Finally, each Cell/Particle structure contains pointers to the head and tail of a doubly-linked list of particle structures. Each particle structure contains all required particle information (as shown in Fig. 2) and, in addition, also contains two integers for the processor number and index within the local cell array where the particle is located (the reason for this will be explained in Section III).
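In the same hypothetical spirit as the geometry sketch above, the Cell/Particle structures and the pointer-array idiom might look as follows (names are illustrative only):

    module cell_particle_sketch
       implicit none

       type :: Particle
          real :: x(3), v(3), Erot              ! position, velocity, rotational energy
          integer :: species
          integer :: destProc, destIdx          ! destination set during the move phase (Section III)
          type(Particle), pointer :: prev, next ! doubly-linked list
       end type Particle

       type :: CellData
          integer :: geomIdx(9)                 ! nine integer indices into the Geometry structure
          real :: samples(10)                   ! sampled (cell-averaged) particle data
          type(Particle), pointer :: head, tail ! particle linked-list
       end type CellData

       type :: CellPtr                          ! array of pointers, so AMR resizing is cheap
          type(CellData), pointer :: p
       end type CellPtr

       type(CellPtr), allocatable :: cells(:)   ! the local L3 cell array on each partition
    end module cell_particle_sketch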


It should be noted that cell-face data and cell-connectivity data are not required, since the grid is Cartesian. The MGDS code is written in Fortran 90. All data structures are defined as Fortran derived data types, which enable general and efficient packaging of data for broadcast among partitions in a parallel simulation.

C. Particle Movement and Tracking

(a) Ray-tracing movement procedure.

(b) Cartesian movement procedure.

Figure 3. Particle tracking procedures relevant to the DSMC method.

Ray-tracing is a general and efficient method of tracking particle movement through a computational mesh that is widely used in the computer graphics industry and is also used by many DSMC codes. The procedure is depicted schematically in Fig. 3(a) for movement within an unstructured 2D triangular mesh. Essentially, the time-to-hit each face of the current cell (t_hit,f in Fig. 3(a)) is computed using the particle position/velocity and the face vertices/normal-vector, where Δx_f is the normal distance between the particle and the cell face. The particle is then advanced for the minimum time t_hit,min = min(t_hit,f) and is re-located to the neighboring cell. The process is then repeated for the remaining time (t_sim − t_hit,min). Of course, if t_hit,min > t_sim, then the particle can be moved for the full time step and remains located in the current cell. This procedure is very general, it naturally sorts particles while moving them, and it naturally allows for variable time steps to be used in each cell.

Within a Cartesian grid, a particle may be moved for the full time step regardless of how many cells are crossed. The particle's new position can then be re-indexed on the Cartesian grid very efficiently compared to re-indexing on a non-Cartesian grid. After re-indexing, the particle can be re-located directly to the new cell. This procedure, called Cartesian-move, is depicted in Fig. 3(b). However, even on a Cartesian grid, there are three main drawbacks to the Cartesian-move procedure. The first is that variable time steps cannot be used in each cell. This is not a drawback for computing unsteady flows; however, it is a significant drawback when computing steady-state flows, as variable time steps increase the simulation efficiency greatly. The second drawback is that re-indexing on a multi-level Cartesian grid is rather computationally expensive, and may approach the expense of ray-tracing. Third, when simulating flow over complex geometries, ray-tracing must be used in cells close to boundary surfaces, thereby requiring the additional complexity of mixed Cartesian-move and ray-trace algorithms.

Although the MGDS code maintains the option of the Cartesian-move procedure, the default is to use the ray-tracing procedure throughout the entire simulation domain. It is important to realize, however, that the ray-tracing algorithm is more efficient on a Cartesian grid than on a non-Cartesian grid, since the dot products (shown in Fig. 3(a)) involve only a single component. Furthermore, a trend in DSMC algorithm research is to enlarge cells above the Δx < λ constraint without loss of accuracy through the use of virtual subcells.⁹ The consequence of this trend for 3D simulations will be that a large majority of simulation particles remain within the same cell during a given time step. On a Cartesian grid, the ray-trace procedure to determine whether t_hit,min < t_sim is highly efficient. Further quantitative results comparing these two movement procedures can be found in Ref. 10.
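Because each face normal of a Cartesian cell has a single non-zero component, the time-to-hit test reduces to one subtraction and one division per face. A minimal sketch of this test (illustrative names, not the MGDS routine):

    subroutine cartesian_ray_trace(x, v, x0, xE, tsim, thit_min, face)
       ! Returns the earliest face-hit time (capped at tsim) and the face index;
       ! face = 0 means the particle remains in the cell for the full step.
       implicit none
       real, intent(in)     :: x(3), v(3)       ! particle position and velocity
       real, intent(in)     :: x0(3), xE(3)     ! min/max corners of the Cartesian cell
       real, intent(in)     :: tsim             ! remaining local time step
       real, intent(out)    :: thit_min
       integer, intent(out) :: face             ! +k / -k for the upper/lower face along axis k
       integer :: k
       real :: thit
       thit_min = tsim
       face = 0
       do k = 1, 3
          if (v(k) > 0.0) then
             thit = (xE(k) - x(k)) / v(k)
          else if (v(k) < 0.0) then
             thit = (x0(k) - x(k)) / v(k)       ! both factors negative, so thit >= 0
          else
             cycle
          end if
          if (thit < thit_min) then
             thit_min = thit
             face = merge(k, -k, v(k) > 0.0)
          end if
       end do
    end subroutine cartesian_ray_trace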


III. Algorithms
A. Core DSMC Algorithm

By separating the Geometry data structure (Fig. 1) from the Cell/Particle data structure (Fig. 2), the main DSMC loop within the MGDS code is made more compact and communication between processors is potentially lowered when particles move across partitions in a parallel simulation. The resulting compact DSMC loop is shown in Fig. 4.
Loop (for each L3 cell)
    Collide Particles (update particle properties)
    Sample Properties
    Move (update particle positions)
        - recursive ray-trace algorithm
        - including surface collisions
        - record new x, y, z, proc#, local cell#
        - particles remain in original linked-list
    IF (INFLOW) Generate/Move Particles
End Loop
Package up communication
Send data to partitions
Global Sort (loop over all cells/particles)

Figure 4. Main DSMC algorithm within the MGDS code.

Each L3 Cell/Particle structure (Fig. 2) contains all necessary information to perform collisions and sample particle properties within the cell, independent of any other cell in the mesh. Next, since the entire Geometry data structure (or large portions of it) is stored on each partition, all particles within a given cell can be moved for their complete time step, including variable time steps and surface collisions. Specifically, without accessing any other cell data (which may be located on a different partition), the x, y, and z positions, the processor index, and the local cell array index of each particle can be updated solely using the Geometry data structure. Finally, if the current cell contains inflow surfaces, then particles are generated and fully-moved as just described. The result is a single loop over all L3 cells for each full simulation time step. Each cell within this loop can be processed independently of all other cells in the simulation. It is important to note that upon completion of this loop, all cells still contain linked-lists of their original particles. Only the particle coordinates, destination processor, and local cell index on that processor have been updated; that is, each particle knows exactly where to be sent. At this point the particle data is packaged up and sent to the appropriate partition. As depicted in Fig. 3(b), this approach may have the potential to limit the inter-processor communication for occasional particles that cross multiple partitions in a single time step. The final step is a global sort algorithm that removes particles from current linked-lists and adds them to destination linked-lists. This is the only step that is not independent of other cells in the simulation. Optimization of the above movement and sorting procedures is discussed in Ref. 10, along with a shared-memory parallelization technique which threads both the main DSMC loop and the global sorting algorithm over multi-core processors.
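A sketch of that global sort, reusing the hypothetical Cell/Particle types from Section II B, is given below in single-partition form for brevity (a parallel version would first exchange particles whose destProc differs):

    subroutine global_sort(cellArr)
       ! Detach each particle whose destination cell changed during the move
       ! phase and append it to the destination cell's linked list.
       use cell_particle_sketch
       implicit none
       type(CellPtr), intent(inout) :: cellArr(:)
       type(Particle), pointer :: p, nextP
       integer :: ic, id
       do ic = 1, size(cellArr)
          p => cellArr(ic)%p%head
          do while (associated(p))
             nextP => p%next
             id = p%destIdx
             if (id /= ic) then
                ! unlink p from cell ic's doubly-linked list
                if (associated(p%prev)) then
                   p%prev%next => p%next
                else
                   cellArr(ic)%p%head => p%next
                end if
                if (associated(p%next)) then
                   p%next%prev => p%prev
                else
                   cellArr(ic)%p%tail => p%prev
                end if
                ! append p at the tail of the destination cell's list
                p%prev => cellArr(id)%p%tail
                nullify(p%next)
                if (associated(cellArr(id)%p%tail)) then
                   cellArr(id)%p%tail%next => p
                else
                   cellArr(id)%p%head => p
                end if
                cellArr(id)%p%tail => p
             end if
             p => nextP
          end do
       end do
    end subroutine global_sort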

B. Adaptive Mesh Refinement (AMR)

The AMR algorithm used within the MGDS code inputs the current Geometry data structure, generates a new Geometry data structure adapted precisely to the local mean-free-path, sets local time steps in accordance with the local mean-collision-time, and also interpolates the maximum expected collision rate (⟨σcr⟩max) between old and new meshes. The algorithm then updates the Cell/Particle data structure as cells are added or removed and, finally, the global sort algorithm re-sorts all particles into the appropriate linked-lists in the revised data structure. In this manner, the MGDS simulation continues in a smooth and accurate manner. The AMR procedure is called approximately 4-10 times during a typical steady-state MGDS simulation. With the current implementation described below, one call to the AMR function requires approximately the same computational time as 10 simulation time steps, regardless of the size of the simulation. While this has a negligible effect on the overall simulation time for a steady-state solution, future efficiency improvements will be important for unsteady simulations where the AMR procedure might be called every few time steps. An overview of the AMR procedure is detailed below along with a 3-level Cartesian grid schematic in Fig. 5. It is important to note that L1 cell sizes remain fixed during the simulation. Thus, the AMR procedure is applied within each L1 cell independently of all other L1 cells according to the following steps:

Figure 5. Schematic of 3-level Cartesian mesh and AMR procedure.

1. Loop through all L3 cells contained within this L1 cell and determine λmax.

2. Set the new L2 cell size to λmax and generate a new Cartesian L2 cell grid within the L1 cell.

3. Loop through all new L2 cells and determine λmin within each. This involves determination of which old L3 cells intersect with each new L2 cell. On a 3D Cartesian grid, computing the volume of intersection between two generic cells is straight-forward and efficient (a sketch of this axis-aligned overlap calculation follows this list).

4. Set the new L3 cell size within each new L2 cell to λmin.

5. Compute the volume of intersection between new and old L3 cells within each large L1 cell and interpolate (using a volume average) values of λ, τc, and ⟨σcr⟩max into the new L3 cells. This information can then be used to accurately set an appropriate new local time step and maintain the local collision rate.

6. After processing all L1 cells via the above steps, the complete Geometry data structure is now updated.

7. The new Geometry data structure is now used to update the pointer array of Cell/Particle structures (Fig. 2) as well as modify cell data. Note that in changing the dimensions/vertices of cells, the particles contained within these cells may now be completely un-sorted.

8. Finally, the coordinates of all particles are re-indexed using the new Geometry data structure. The global sort algorithm is then called, which adds/removes particle pointers to/from the correct linked lists. The MGDS simulation is now ready to carry on with the general move/collide DSMC algorithm.
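On a Cartesian grid, the volume of intersection needed in steps 3 and 5 reduces to a per-axis overlap of two axis-aligned boxes. A minimal sketch (illustrative names, not the MGDS routine):

    pure function box_overlap(a0, a1, b0, b1) result(vol)
       ! Volume of intersection of two axis-aligned cells given min/max corners.
       implicit none
       real, intent(in) :: a0(3), a1(3), b0(3), b1(3)
       real :: vol, d(3)
       d   = max(0.0, min(a1, b1) - max(a0, b0))  ! overlap extent along each axis
       vol = d(1) * d(2) * d(3)                   ! zero if the cells do not intersect
    end function box_overlap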

C. Cut Cell Algorithm

Since the MGDS code already uses the ray-tracing technique to determine particle intersections with cell faces, arbitrary triangulated surface meshes can be naturally imbedded within the flow field grid without modification to the MGDS data structures or algorithms. The cut-cell method performs two main functions. The first is to read in a list of triangular surface elements (generated by various commercial surface triangulation packages) and sort all surface elements into the appropriate L3 cells within the Geometry data structure. Note that if a single large surface triangle cuts through multiple small L3 cells, the surface element may simply be added to multiple cells and does not need to be trimmed. The second main function is to compute the volume of each cut-cell, required to determine the collision rate and various macroscopic properties in the cut-cell. Both functions involve moderately complex geometrical calculations, but leave the basic DSMC data structures and algorithms completely unchanged.

In order to sort a list of triangular surface elements into the list of L3 Cartesian cells, the MGDS code employs a cut-cell intersection technique initially detailed in the computer graphics literature and more recently adapted for CFD simulations by Aftosmis et al.⁸ The technique is able to use computationally efficient bitwise operations and comparisons to determine intersections between triangular elements and Cartesian cells. Further details on computing the intersection between generally positioned triangles in 3D are contained in Ref. 8.
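The exact triangle-cell test of Ref. 8 is beyond the scope of a short sketch, but its standard first stage, a conservative bounding-box filter that selects candidate L3 cells for the exact test, can be illustrated as follows (hypothetical names):

    pure function bbox_candidate(v1, v2, v3, c0, c1) result(overlap)
       ! Conservative filter: does the triangle's bounding box overlap the cell?
       ! (Cells passing this test still require the exact test of Ref. 8.)
       implicit none
       real, intent(in) :: v1(3), v2(3), v3(3)  ! triangle vertices
       real, intent(in) :: c0(3), c1(3)         ! cell min/max corners
       logical :: overlap
       overlap = all(min(v1, min(v2, v3)) <= c1) .and. &
                 all(max(v1, max(v2, v3)) >= c0)
    end function bbox_candidate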

Once all surface elements are sorted into appropriate L3 cells, the volume of each of these cut-cells is computed. The MGDS code uses a simple Monte Carlo technique to compute cut-volumes. Co-ordinates are chosen at random within the un-cut Cartesian cell. The dot product between the chosen coordinates and the normal vector of a given surface element is computed. If the dot products are negative for all surface elements contained within the cell, then the point lies outside of the flow domain. The cut-volume then follows from the fraction of points determined to lie outside the domain among the total number of Monte Carlo co-ordinates considered (N). The error in such a volume calculation scales directly as 1/√N. It should be noted that this procedure is suitable only when the surface geometry cutting a given cell is either completely convex or completely concave. More complex collections of surface elements within a single L3 cell are currently not supported by the MGDS code, and advanced methods for such geometries are discussed in Ref. 8. The cut-cell algorithm is called at the beginning of each simulation and also immediately following each call to the AMR function. The current cut-cell algorithm requires less computational time than a single AMR call for all simulations presented in this article. Finally, inflow, outflow, symmetry, or wall surfaces may be specified by the user on all sides of the overall Cartesian bounding box of the simulation. These outer surfaces are added to the appropriate L3 cells in addition to the triangulated surface elements sorted by the cut-cell algorithm.
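A minimal sketch of this Monte Carlo estimate follows (illustrative names; consistent with the sign convention above, element normals are assumed to point into the flow domain, and the flow-side volume of the cell is returned):

    subroutine cut_cell_volume(x0, xE, nSurf, pOn, nrm, nSamp, volFlow)
       ! Monte Carlo estimate of the portion of an un-cut Cartesian cell that
       ! lies inside the flow domain (assumes a completely convex or concave cut).
       implicit none
       real, intent(in)    :: x0(3), xE(3)      ! corners of the un-cut cell
       integer, intent(in) :: nSurf, nSamp
       real, intent(in)    :: pOn(3, nSurf)     ! a vertex on each surface element
       real, intent(in)    :: nrm(3, nSurf)     ! element normals, pointing into the flow
       real, intent(out)   :: volFlow
       real :: r(3), p(3)
       integer :: i, s, nOut
       logical :: outside
       nOut = 0
       do i = 1, nSamp
          call random_number(r)
          p = x0 + r * (xE - x0)                ! uniform sample in the un-cut cell
          ! outside the flow domain if behind every element (all dot products negative)
          outside = .true.
          do s = 1, nSurf
             if (dot_product(p - pOn(:, s), nrm(:, s)) >= 0.0) outside = .false.
          end do
          if (outside) nOut = nOut + 1
       end do
       volFlow = (1.0 - real(nOut) / real(nSamp)) * product(xE - x0)
    end subroutine cut_cell_volume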

IV. MGDS Simulation Results


A. Effect of variable time steps

Mach 5 flow of argon over a 10 cm flat plate at 30 degrees angle of attack is simulated to steady-state with and without the use of variable time steps. The free-stream density and temperature are 7.5 × 10⁻⁵ kg/m³ and 200 K, respectively. The temperature of the flat plate is held fixed at Twall = 2000 K on the bottom and Twall = 200 K on the top. The variable hard-sphere (VHS) collision model is used with a power law value of ω = 0.81, and diffuse reflection and full thermal accommodation are imposed for surface collisions. The simulation is three-dimensional, stretching approximately 4λ in the z direction. However, since symmetry boundary conditions are imposed on the z boundary planes, the resulting flow is uniform in the z coordinate direction and only the two-dimensional flow field results are presented.

(a) Normalized density field.

(b) Number of simulation particles per cell.

Figure 6. MGDS simulation results without the use of variable time steps.

Figure 6(a) shows the resulting density field (normalized by the free-stream density) when a constant simulation time step is used. The 3-level Cartesian grid has been adapted to the local mean-free-path and the constant time step is less than τc/2 everywhere, thus producing an accurate solution. It is evident from Fig. 6(a) that the density increases by a maximum of 5 times between the free-stream and the leading edge of the plate. Although the density (and therefore the number density, n) increases towards the plate leading edge, the cell size decreases proportionately. Thus AMR should naturally provide some control in maintaining a constant number of particles per cell. However, as evident from Fig. 6(b), even with AMR, the average number of simulation particles per cell varies widely from less than 10 to more than 500.


If one considers the number of real gas molecules located within a volume of one cubic mean-free-path (Nreal), the following dependence is realized:

    Nreal = nλ³,    n ∝ ρ,    λ ∝ 1/ρ.    (1)

If constant simulation particle weights are used, the number of simulation particles per cell should therefore scale with 1/ρ² in 3D (and 1/ρ in 2D). For example, if the density drops by one order of magnitude, although the number of molecules per unit volume drops equally, the volume under consideration rises by three orders of magnitude. Therefore, even when every cell is refined precisely to the local value of λ, if there were 10 particles per cell in the high density region near the leading edge, one would expect (ρ₁/ρ₂)² = 5² = 25 times more particles in the lower density free-stream cells. This is precisely verified in Fig. 6(b), where there are approximately 250 particles per cell in the free-stream region and 10 near the leading edge of the flat plate. Note also how, in the very low density region above the plate, there is an enormous number of particles per cell. As stated earlier, only ~20 particles per cell are required for accuracy, and using additional particles per cell is inefficient. Thus, even with precise AMR for 3D simulations, large variations in the number of particles per cell will occur naturally in a DSMC simulation, resulting in a very inefficient simulation using far more particles than required for statistical accuracy.
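For a constant particle weight Wp (the number of real molecules represented by each simulation particle; Wp is introduced here only for illustration), this scaling follows directly from Eq. 1:

    Ncell = nλ³/Wp ∝ ρ(1/ρ)³ = 1/ρ²   (3D),        Ncell ∝ nλ² ∝ ρ(1/ρ)² = 1/ρ   (2D).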

(a) Local time step ratio in each L3 cell.

(b) Number of simulation particles per cell.

Figure 7. MGDS simulation results with the use of variable time steps.

As described by Kannenberg and Boyd,¹¹ the use of a variable time step in each cell significantly alleviates this problem. Compared to a global reference time step, tref, the time step used to advance particles within a given cell is increased or decreased by a factor that varies from cell to cell (Δt = tratio · tref). This factor is equal to the ratio of the local mean-collision-time to the global reference time step, tratio = S · τc/tref, and can be scaled with a globally constant factor S. At the same time, the weight for all particles within the cell is also multiplied by tratio. Essentially, where τc (and therefore λ) is large, particles move more rapidly through these larger cells. During a given instant, this reduces the number of particles found in that cell and, by adjusting the particle weight by the same factor, the number density in the cell remains correct. Since τc ∝ 1/ρ, the combination of variable time steps and AMR improves the scaling for the number of particles per cell to N ≈ constant in 2D and N ∝ 1/ρ in 3D. As detailed in Ref. 11, the new time step, number of particles, and particle weight are used in each cell to compute collision rates and outcomes without any modification to the DSMC algorithm.

The MGDS code is able to set a variable time step in each L3 cell. In addition, this local time step can be slightly adjusted to account for in-exact AMR. Even when using a 3-level adaptive grid, Δx = λ does not hold precisely in each cell. In the MGDS code, the local time step factor is set as tratio = S · (τc/tref) · (Δx/λ). Thus, if a given cell is slightly too small, the time step is lowered such that additional particles will accumulate in this smaller cell. It should be noted that adjusting the local time step in this manner is accurate as long as Δx/λ ≈ 1, which is the case when using a 3-level Cartesian grid with AMR. Figure 7(a) shows the time scale factor used in each L3 cell for the flat plate problem. The combination of precise AMR and precise local time step adjustment leads to a much more uniform distribution of particles per cell, as seen in Fig. 7(b).
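In code form, the local factor described above amounts to one expression per cell; a minimal sketch (illustrative names):

    pure function time_step_factor(S, tau_c, t_ref, dx, lambda) result(tratio)
       ! Local time-step factor with the dx/lambda correction for in-exact AMR;
       ! the weight of particles in the cell is multiplied by the same factor.
       implicit none
       real, intent(in) :: S          ! global scaling constant
       real, intent(in) :: tau_c      ! local mean collision time
       real, intent(in) :: t_ref      ! global reference time step
       real, intent(in) :: dx, lambda ! local cell size and mean-free-path
       real :: tratio
       tratio = S * (tau_c / t_ref) * (dx / lambda)
    end function time_step_factor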

The flow field solutions, surface heating rates, and surface shear stress profiles obtained when using the variable time step method are verified to reproduce the results obtained using a constant simulation time step, which required substantially more particles.

Varying the particle weight in each cell independent of the time step is an alternate method of controlling the number of particles per cell that is used in existing DSMC codes and also discussed in Ref. 11. Here, particles are either cloned or deleted in order to obtain the desired number of particles per cell, and their weights are adjusted accordingly. While the deletion of simulation particles is not thought to influence solution accuracy, the cloning of particles may correlate statistics and result in random walk errors.¹ Although techniques exist to minimize such errors, the combination of AMR and local time steps should greatly reduce the degree to which particles are cloned/destroyed.

B. Hypersonic blunt body flows

(a) Temperature field and final 3-level Cartesian grid.

(b) Close-up of the plate-tip flow region.

Figure 8. Simulation results for Mach 15 nitrogen flow over a vertical flat plate.

Mach 15 flow of diatomic nitrogen gas over a 16 cm vertical flat plate is simulated with the MGDS code using both AMR and the variable time step method. The free-stream density and temperature are 6.5 × 10⁻⁶ kg/m³ and 80 K, respectively. The temperature of the flat plate is held fixed at Twall = 1500 K on the front surface and Twall = 200 K on the back. The variable hard-sphere (VHS) collision model is used with a power law value of ω = 0.75, and diffuse reflection and full thermal accommodation are imposed for surface collisions. The Larsen-Borgnakke¹² model is used for translational-rotational energy exchange together with the variable rotational energy exchange probability model of Boyd¹³ using a maximum rotational collision number of 18.1 and a reference temperature for rotational energy exchange of 91.5 K. Vibrational energy is not considered. Again, the simulation is three-dimensional, stretching approximately 8λ in the z direction with symmetry boundary conditions. Only the two-dimensional flow field results are presented.

The MGDS simulation begins with a uniform Cartesian mesh (L3 = L2 = L1) that is sized to Δx = 0.5λ and a uniform time step set to Δt = 0.2τc. The simulation rapidly reaches steady state and the solution is then sampled for approximately 200 time steps. At this point the AMR function is called, which refines the mesh, re-sorts all particles in the new mesh, resets all local time steps, and then continues with the simulation. The translational temperature field after four AMR calls is shown in Fig. 8(a), which clearly shows the bow shock and compression region ahead of the plate and a rarefied region behind the plate. A close-up of the plate tip is shown in Fig. 8(b). Here the level of refinement between L1 cells (behind the plate) and L3 cells in front is seen to vary by a factor of 30. Indeed, without the variable time step method, the large cells in the plate wake would naturally contain 30² = 900 times more particles than the refined cells in front of the plate. Also evident in Fig. 8(b) is the increased flexibility a 3-level Cartesian mesh has over a 2-level mesh. A 2-level Cartesian mesh requires uniform cell spacing within each L1 cell, which would result in substantially more (unnecessary) cells in the high gradient region upstream of the flat plate. Not only would a 2-level Cartesian mesh require more cells and therefore more particles, but it would also introduce further variation in the number of particles per cell.


In such a 2-level mesh, some of the cells would now be much smaller than the local value of λ. As a result, these cells may no longer contain sufficient particles (~20) and the solution may become inaccurate (not just inefficient). In order to restore accuracy, either the global number of particles would have to be increased (highly inefficient) or particle cloning would become necessary.

Figure 9. Mach 15 nitrogen flow over a 3D vertical flat plate.

In order to highlight the capability of the MGDS geometry model, AMR algorithm, and cut-cell algorithm, two additional solutions are presented in Fig. 9 and Fig. 10. The validation of MGDS simulation predictions against experimental results is not addressed in the current article. Figure 9 depicts 3D flow over a vertical flat plate of finite size in the z direction. The free-stream conditions are identical to those used for the simulation displayed in Fig. 8, except Twall = 2500 K on the front of the plate. This modification results in more moderate density gradients near the plate surface and requires less refinement, thereby enabling this 3D flow to be simulated on a single processor. Figure 9 highlights how the general AMR algorithm outlined in Section III B. is capable of smoothly adapting a 3-level Cartesian mesh for 3D flows involving large density gradients. Contours of translational temperature are plotted for the flow field, and contours of the heat transfer coefficient are plotted on the plate surface. In Fig. 9, the fine grid spacing on the front of the plate and the coarse spacing immediately behind the plate are clearly evident.

Finally, Fig. 10 shows a 3D solution of hypersonic flow of argon gas over a cylinder. The solution is uniform in the z direction and only the resulting 2D flow field is displayed. The free-stream density and temperature are 6.5 × 10⁻⁶ kg/m³ and 80 K, respectively. The temperature of the cylinder surface is held fixed at Twall = 1500 K. Diffuse reflection and full thermal accommodation are imposed for surface collisions. The free-stream velocity is set to correspond to a very high Mach number of 29. The resulting temperatures (seen in Fig. 10) are unrealistically high, as no ionization reactions are enabled. However, the reason such a high Mach number was selected is to test the MGDS AMR and cut-cell algorithms with a challenging case involving large density variations. The cylinder geometry is specified as a triangulated surface and is cut from the initial uniform Cartesian flow field mesh, and also cut from each new mesh generated by the AMR function. Although the cylinder geometry may seem simple, the cut cells resulting from the surface geometry span a wide range of volumes, ranging from nearly complete cells to very small slices of the original Cartesian cells. A close-up view of the mesh refinement near the cylinder surface at approximately 45 degrees from the stagnation point is shown in Fig. 11.


Figure 10. Mach 29 argon flow over a cylinder.


Figure 11. Close-up view of the cylinder mesh at 45 degrees from the stagnation point.

V. Summary and Conclusions


In this article, the data structures and algorithms of a newly developed 3D direct simulation Monte Carlo (DSMC) program, called the Molecular Gas Dynamic Simulator (MGDS) code, are outlined. The code employs an embedded 3-level Cartesian mesh, accompanied by a cut-cell algorithm to incorporate triangulated surface geometry into the adaptively refined Cartesian mesh. This geometry model is selected for its low memory storage requirements, its ability to decouple surface and flow field discretizations, and the increased computational efficiency of DSMC particle movement procedures over non-Cartesian meshes. Two separate data structures are proposed in order to separate geometry data from cell and particle information. The geometry data structure contains all necessary information required for particle movement and AMR procedures; however, for a Cartesian mesh this still involves little memory, such that each partition in a parallel simulation could potentially store the entire geometry. A separate cell/particle data structure is maintained that holds all cell and particle data. This data structure requires significant memory storage and must be partitioned in a parallel simulation. Such separate geometry and cell/particle data structures enable particles within each cell to be moved for a full time step without requiring information from other cell data structures, including movement with variable time steps across multiple cells. Within the MGDS code, particles are moved through the 3-level Cartesian mesh using a ray-tracing technique. A simple and efficient AMR algorithm that maintains local cell size and time step consistent with the local mean-free-path and local mean collision time is detailed. Simulations are presented that demonstrate significant efficiency gains for 3D flows with the use of precise AMR and variable time steps. Finally, a cut-cell method is outlined that sorts an arbitrary list of triangulated surface elements into local Cartesian cells and computes the cut volume using a Monte Carlo technique. The ray-trace particle movement algorithm is then able to accurately detect and simulate particle collisions with complex 3D surface geometries.

Acknowledgments
This work is partially supported by a seed-grant from the University of Minnesota Supercomputing Institute (MSI). This work is also supported by the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-04-1-0341. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the AFOSR or the U.S. Government.


References
1. Bird, G. A., Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Oxford University Press, New York, 1994.
2. Rader, D. J., Gallis, M. A., Torczynski, J. R., and Wagner, W., "DSMC Convergence Behavior of the Hard-Sphere-Gas Thermal Conductivity for Fourier Heat Flow," Physics of Fluids, Vol. 18, No. 7, 2006, pp. 1-16.
3. Gallis, M. A., Torczynski, J. R., and Rader, D. J., "DSMC Convergence Behavior for Transient Flows," AIAA Paper 07-4258, June 2007, presented at the 39th AIAA Thermophysics Conference, Miami, FL.
4. Dietrich, S. and Boyd, I. D., "Scalar and Parallel Optimized Implementation of the Direct Simulation Monte Carlo Method," Journal of Computational Physics, Vol. 126, 1996, pp. 328-342.
5. Otahal, T. J., Gallis, M. A., and Bartel, T. J., "An Investigation of Two-Dimensional CAD Generated Models with Body Decoupled Cartesian Grids for DSMC," AIAA Paper 2000-2361, June 2000.
6. LeBeau, G. J., "A Parallel Implementation of the Direct Simulation Monte Carlo Method," Computer Methods in Applied Mechanics and Engineering, Vol. 174, 1999, pp. 319-337.
7. Ivanov, M. S., Markelov, G. N., and Gimelshein, S. F., "Statistical Simulation of Reactive Rarefied Flows: Numerical Approach and Applications," AIAA Paper 98-2669, 1998.
8. Aftosmis, M. J., Berger, M. J., and Melton, J. E., "Robust and Efficient Cartesian Mesh Generation for Component-Based Geometry," AIAA Journal, Vol. 36, No. 6, 1998, pp. 952-960.
9. LeBeau, G., Jacikas, K., and Lumpkin, F., "Virtual Sub-Cells for the Direct Simulation Monte Carlo Method," AIAA Paper 03-1031, Jan. 2003.
10. Gao, D. and Schwartzentruber, T. E., "Parallel Implementation of the Direct Simulation Monte Carlo Method for Shared Memory Architectures," AIAA Paper 2010-451, Jan. 2010, presented at the 48th AIAA Aerospace Sciences Meeting, Orlando, FL.
11. Kannenberg, K. C. and Boyd, I. D., "Strategies for Efficient Particle Resolution in the Direct Simulation Monte Carlo Method," Journal of Computational Physics, Vol. 157, 2000, pp. 727-745.
12. Larsen, P. S. and Borgnakke, C., "Statistical Collision Model for Monte Carlo Simulation of Polyatomic Gas Mixture," Journal of Computational Physics, Vol. 18, 1975, pp. 405-420.
13. Boyd, I. D., "Rotational-Translational Energy Transfer in Rarefied Nonequilibrium Flows," Physics of Fluids A, Vol. 2, No. 3, 1990, pp. 447-452.

