Future Work for the Grav-Sim Development Team

Balanced Octree and KDTree Space Dividers

The current implementations of Octree and KDTree are unbalanced in the sense that the space division always splits a cell into equal halves along each axis. This means that one child cell can end up with the majority of the bodies, leading to an asymmetric or unbalanced tree structure. It should be possible to produce balanced variants of each of these such that a roughly equal number of bodies is placed in each cell, for example by splitting at the median body position rather than the spatial midpoint. This technique applies particularly cleanly to the KDTree and was the basis of the original work done by Joachim Stadel.
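
As an illustration, a balanced KDTree could be built by splitting each cell at the median body position rather than the midpoint. A minimal sketch, assuming a simple Body type and a minimum-batch leaf size (illustrative only, not the existing Grav-Sim code):

    #include <algorithm>
    #include <vector>

    struct Body { double pos[3]; };

    // Recursively split bodies[lo..hi) at the median along the given axis,
    // so each child receives roughly half of the bodies.
    void buildBalancedKD(std::vector<Body>& bodies, int lo, int hi, int axis, int minBatch)
    {
        int n = hi - lo;
        if (n <= minBatch || n < 2) return;         // leaf cell

        int mid = lo + n / 2;
        std::nth_element(bodies.begin() + lo, bodies.begin() + mid, bodies.begin() + hi,
                         [axis](const Body& a, const Body& b) {
                             return a.pos[axis] < b.pos[axis];
                         });

        int nextAxis = (axis + 1) % 3;              // cycle through x, y, z
        buildBalancedKD(bodies, lo, mid, nextAxis, minBatch);
        buildBalancedKD(bodies, mid, hi, nextAxis, minBatch);
    }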

The balanced Octree and KDTree implementations could be made available in Grav-Sim as further options and tested and compared against the unbalanced variants.

Note that the alternative Hilbert-curve / tree-based space divider (as used in the Ridler-style accelerators) is implicitly balanced because the tree construction is done bottom-up rather than top-down.


Hyperbolic Regularisation

Grav-Sim version 1.11 includes a simple Kepler orbit analyser for binary stars in elliptical orbits. This works by checking pairs of stars via the collision timescale calculation and generating Kepler orbital elements if the eccentricity is less than 1.0 (i.e. a bound orbit). The calculation is done in the traditional way via the eccentric anomaly, reverse-calculated from the mean anomaly by Newton-Raphson.
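
For reference, the elliptic case solves Kepler's equation M = E - e sin E for the eccentric anomaly E by Newton-Raphson. A minimal sketch (function name and tolerance are illustrative, not the actual Grav-Sim code):

    #include <cmath>

    // Solve Kepler's equation M = E - e*sin(E) for E (elliptic case, 0 <= e < 1)
    // using Newton-Raphson iteration.
    double eccentricAnomaly(double meanAnomaly, double e, double tol = 1e-12)
    {
        double E = meanAnomaly;                          // initial guess
        for (int i = 0; i < 50; ++i) {
            double f  = E - e * std::sin(E) - meanAnomaly;   // residual
            double df = 1.0 - e * std::cos(E);               // derivative dM/dE
            double dE = f / df;
            E -= dE;
            if (std::fabs(dE) < tol) break;
        }
        return E;
    }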

The next step would be to extend the Kepler regularisation to eccentricities greater than 1.0 (i.e. hyperbolic orbits). The logic for reverse-calculating eccentric anomaly from mean anomaly is similar to the elliptic case, but uses hyperbolic sines and cosines instead. However, it is more difficult to control the numerical errors, particularly with high eccentricities and close encounters.
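
The hyperbolic analogue solves M = e sinh H - H for the hyperbolic anomaly H. A sketch along the same lines (illustrative only; for very high eccentricities a better initial guess and more careful error control would be needed):

    #include <cmath>

    // Solve the hyperbolic Kepler equation M = e*sinh(H) - H for H (e > 1)
    // using Newton-Raphson iteration.
    double hyperbolicAnomaly(double meanAnomaly, double e, double tol = 1e-12)
    {
        // Crude initial guess; sinh(H) ~ M/e for large |M|.
        double H = std::asinh(meanAnomaly / e);
        for (int i = 0; i < 100; ++i) {
            double f  = e * std::sinh(H) - H - meanAnomaly;  // residual
            double df = e * std::cosh(H) - 1.0;              // derivative dM/dH
            double dH = f / df;
            H -= dH;
            if (std::fabs(dH) < tol) break;
        }
        return H;
    }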


Parabolic Regularisation

A further case to include in the regularisation would be eccentricities close to 1.0 (i.e. parabolic orbits). These arise quite rarely in practice, but can be solved with Barker's equation in conjunction with the other orbital elements.
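
For the record, Barker's equation D + D^3/3 = Mp (with D = tan(nu/2) and Mp the parabolic mean anomaly) has a closed-form solution via a cubic, so no iteration is needed. A sketch (variable names are illustrative):

    #include <cmath>

    // Solve Barker's equation D + D^3/3 = Mp for D = tan(nu/2), where
    // Mp = sqrt(mu / (2*q^3)) * (t - tPeri) is the parabolic mean anomaly.
    double trueAnomalyParabolic(double Mp)
    {
        double A = 1.5 * Mp;
        double B = std::cbrt(A + std::sqrt(A * A + 1.0));
        double D = B - 1.0 / B;             // tan(nu/2)
        return 2.0 * std::atan(D);          // true anomaly nu
    }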


Linear Regularisation

A final case to include in the basic Kepler regularisation scheme is the simple linear orbit. These can arise in practice as the first step in cold-collapse simulations (i.e. where all initial velocities are zero).


Parallel Pairwise Interactor

The Pairwise Interactor (as used in conjunction with tree-based space dividers to produce Dehnen-style accelerators) has proved problematic to parallelise in attempts so far. The reason is that OpenMP's support for nested parallel regions is limited in practice, which forces the code to make a single decision to parallelise at an appropriate point in the tree. The trouble is that the Octree and KDTree implementations as they stand are unbalanced, which means that typically one processor ends up doing most of the work while the others finish quickly and sit idle. What is needed is either balanced implementations of Octree and KDTree or a way of measuring the degree of imbalance in the existing implementations. Once that is done, the details of the parallelisation will be similar to the existing Brute-Force pairwise-parallel algorithm.
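
One possible shape for the parallelisation, once the imbalance is dealt with, is to descend to a tree level with many more cells than threads and run a single parallel loop over those cells. A rough sketch assuming a hypothetical Cell type and per-subtree work function (not the current Grav-Sim interface):

    #include <vector>

    struct Cell;                                  // hypothetical tree cell
    void interactPairwise(Cell* cell);            // hypothetical per-subtree work

    // Parallelise once over a flat list of cells taken from a suitable tree level.
    void parallelInteract(const std::vector<Cell*>& levelCells)
    {
        #pragma omp parallel for schedule(dynamic)
        for (int i = 0; i < (int)levelCells.size(); ++i) {
            interactPairwise(levelCells[i]);      // each subtree processed by one thread
        }
    }

Using schedule(dynamic) lets threads that finish small subtrees pick up further work, which partly mitigates any residual imbalance.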


Beyond Predict-Evaluate-Correct (PEC) to P(EC)2

The current implementation of the Grav-Sim Predictor-Corrector integrators is ported from ACS and uses the simple PEC mode. This does a prediction prior to gravity evaluation and then a correction afterwards. The use of the correction typically improves energy errors by roughly a factor of 10 (according to Aarseth), without having to do any additional gravity evaluations.

However, it stands to reason that, having done the correction, we now know that the original gravity evaluation was taken at a slightly wrong position and so can be improved. There is therefore an argument for doing the gravity evaluation (and hence the correction) again. The resulting sequence is predict-evaluate-correct-evaluate-correct, or P(EC)2. Theoretically you could keep repeating this process, but experience and the law of diminishing returns suggest that twice is sufficient.
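
In outline, the difference between the two modes is simply how many evaluate/correct passes are made per step. A sketch (the function names are placeholders, not the actual Grav-Sim integrator interface):

    // Placeholder declarations standing in for the real integrator internals.
    void predict(double dt);
    void evaluateGravity();
    void correct(double dt);

    // One time-step in P(EC)^n mode; n = 1 gives the current PEC behaviour,
    // n = 2 gives P(EC)2.
    void step(double dt, int n /* number of evaluate-correct passes */)
    {
        predict(dt);                 // extrapolate positions and velocities
        for (int k = 0; k < n; ++k) {
            evaluateGravity();       // recompute accelerations at the latest positions
            correct(dt);             // apply the corrector using the new accelerations
        }
    }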

Once implemented, the P(EC)2 mode will be doing 2 gravity evaluations per time-step rather than just one. The crucial test of its effectiveness will be to compare the results against the original PEC mode but with twice as many time-steps. On the one hand, the accuracy of the integrator will improve with the number of time-steps according to its order. On the other hand, numerical errors due to repeated close-encounters can be expected to increase as a result.


General Purpose GPUs

The recent widespread availability of graphics cards with general-purpose processing capability makes them an obvious target for future Grav-Sim development. This represents an alternative approach to the use of tree codes - i.e. throw hardware at the problem rather than trying to reduce the scale of the problem itself. Feedback from other researchers suggests that a factor of 10 improvement may be possible with CUDA-based nVidia graphics cards.

It is envisaged that the resulting system could mix-and-match GPU brute-force vs a CPU tree-code on a per-time-step basis, depending on the number of bodies involved. Clearly Shared or Constant time-steps (which involve the maximum number of bodies in each time-step) would run well on the GPU. However, recent experiments with large models (100,000 bodies or more) suggest that even individual time-steps would benefit. In particular, the initial gravity calculation at the start of the simulation and the final time-step prior to each log-point always involve the maximum number of bodies. Even some of the intermediate steps involve sufficiently many bodies to make the tree-codes take a noticeable amount of time, which suggests that the GPU will be a useful alternative here too.
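
The per-time-step choice could be as simple as comparing the number of active bodies against a tuned threshold. A trivial sketch (threshold and function names are purely illustrative):

    // Placeholder declarations for the two gravity back-ends.
    void evaluateBruteForceGPU(int nActive);
    void evaluateTreeCodeCPU(int nActive);

    // Hypothetical per-time-step dispatch between the GPU brute-force kernel
    // and the CPU tree code, based on how many bodies are active this step.
    void evaluateGravityStep(int nActive, int gpuThreshold /* tuned empirically */)
    {
        if (nActive >= gpuThreshold)
            evaluateBruteForceGPU(nActive);   // O(N^2) but massively parallel
        else
            evaluateTreeCodeCPU(nActive);     // O(N log N) on the host
    }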

Depending on progress, it may eventually be possible to run the tree codes on the GPU as well, opening up the way for even larger models.


Special-Purpose Hardware

At some point in the future, it may make sense to integrate Grav-Sim with special-purpose hardware such as the GRAPE-6. It is anticipated that the impact on the Grav-Sim system design would be similar to the general-purpose GPUs, although it is not clear that running tree-codes on special-purpose hardware is a realistic possibility.


KS Regularisation

The N-body literature makes extensive reference to Kustaanheimo and Stiefel (KS) regularisation as a better way of keeping a lid on numerical errors via the use of a generalised fictitious time variable. This is a clear candidate for inclusion in Grav-Sim via a new plug-in interface that gives a choice of orbit analyser, retaining the simple Kepler approach as an option.


Hierarchical and Chain Regularisation

The industry-standard N-body reference is "Gravitational N-Body Simulations: Tools and Algorithms" by Sverre J. Aarseth, Cambridge University Press, 2003. This book covers many complex subject areas; the extended regularisation techniques it describes could be included in Grav-Sim at some point in the future. That would take it beyond a simple collisional simulator that copes with binary stars to the full gamut of triple stars, chain systems and hierarchical systems. The quest for numerical accuracy, and dealing with the pathological cases that would otherwise cause excessive errors, is the main theme of this work.


Ahmad-Cohen Neighbours Scheme (ACNS)

Grav-Sim already employs a very basic neighbours scheme to determine the next time-step. For numeric-only integration, it calculates the minimum collision timescale on a pairwise basis and then groups bodies with similar timescales (within a factor of sqrt(2)) into a common time-step. For analytic integration with the Kepler orbit analyser, it calculates the two smallest collision timescales on a pairwise basis. If the minimum collision timescale represents an elliptical orbit, then it uses the next smallest for timescale determination (assuming unperturbed motion in the meantime).
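
The grouping within a factor of sqrt(2) amounts to quantising the timescales onto a logarithmic grid, so that bodies whose timescales fall in the same bin share a time-step. One way such a quantisation could look (illustrative only, not the existing Grav-Sim code):

    #include <cmath>

    // Quantise a collision timescale onto a grid of half-powers of two,
    // so that timescales within a factor of sqrt(2) land on the same level.
    int timestepLevel(double timescale)
    {
        // log base sqrt(2) of the timescale = 2 * log2(timescale)
        return (int)std::floor(2.0 * std::log2(timescale));
    }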

ACNS is a generalisation of this approach to neighbour lists of more than two bodies. The idea is that the nearer-neighbour interactions are evaluated more frequently than those further away, which leads to an overall improvement of around a factor of 2 according to Aarseth.

This is a possibility for future inclusion in Grav-Sim, in conjunction with hierarchical and chain regularisation. However, it is not clear at this stage whether ACNS is an alternative to the collision timescale calculation or whether the two can usefully be combined.


Spherical Multipole Moments

The Grav-Sim tree codes are currently based on a fixed-order cartesian multipole expansion, up to the quadrupole term. An alternative would be to include a multipole representation in spherical coordinates, which is more easily generalised to an arbitrary number of terms. A trade-off is expected between the increased accuracy and the additional processing required. Bearing in mind that the whole point of a tree code is to go faster than brute-force, current experience suggests that the next 2 terms in the sequence (octupole and hexadecapole) should be beneficial, but after that we should expect diminishing returns. Note that a plug-in interface would be the ideal solution here - retaining a choice between the spherical and cartesian representations.
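
For reference, the spherical multipole expansion of a cell's potential about its centre takes the standard form (valid at field points outside the smallest sphere enclosing the cell's bodies):

    \Phi(r,\theta,\phi) \;=\; -G \sum_{l=0}^{\infty} \sum_{m=-l}^{l}
        \frac{4\pi}{2l+1}\, \frac{q_{lm}}{r^{\,l+1}}\, Y_{lm}(\theta,\phi),
    \qquad
    q_{lm} \;=\; \sum_{i} m_i\, r_i^{\,l}\, Y_{lm}^{*}(\theta_i,\phi_i)

Truncating the sum at l = 2 corresponds to the current monopole-plus-quadrupole accuracy; l = 3 and l = 4 add the octupole and hexadecapole terms.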


Cartesian Octupole and Hexadecapole Terms

The accuracy of the cartesian multipole expansion currently used in the Grav-Sim tree codes can also be improved by incorporating the next 2 terms in the sequence in cartesian form. Note that this will affect the optimum setting of the minimum batch size parameter, given the additional computation required during tree construction.


Fast Multipole Method (FMM)

Once the spherical multipole moments are available in Grav-Sim, it would then be possible to implement (in conjunction with the Pairwise Interactor) the local multipole expansions that are characteristic of the FMM. Note that this contrasts with the Taylor-series local expansions advocated by Walter Dehnen and currently used in the Grav-Sim Pairwise Interactor.


Aarseth Criterion

One suggestion arising from the June 11th talk at MICA Simulations is the use of the "Aarseth Criterion" as an alternative to the collision timescale calculation used in ACS and Grav-Sim. It may be possible to engineer a plug-in interface that gives a choice of the method used in this area (i.e. how to choose the timescale for the next step). Note that Grav-Sim has recently made further use of the collision timescale concept with the Kepler orbit analyser for binary stars. The inclusion of the Aarseth criterion will likely involve significant changes to several parts of this code.
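
For reference, the standard Aarseth time-step criterion for body i combines the force and its first three time derivatives with a dimensionless accuracy parameter eta (Aarseth 2003):

    \Delta t_i \;=\; \sqrt{\;\eta\,
        \frac{|\mathbf{F}_i|\,|\mathbf{F}_i^{(2)}| + |\mathbf{F}_i^{(1)}|^2}
             {|\mathbf{F}_i^{(1)}|\,|\mathbf{F}_i^{(3)}| + |\mathbf{F}_i^{(2)}|^2}\;}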


General Relativity (GR)

The speed of light is not currently modelled in Grav-Sim. Gravity is felt instantaneously everywhere, so subtle effects such as the precession of the orbit of Mercury will not be seen. Inclusion of GR would have a significant effect on all calculations, yet it would only really make a difference in certain specific situations. The ideal solution would therefore be to have it available as an option.


Collisions

Grav-Sim is collisional in the sense that it uses pure un-softened Newtonian gravity by default. However, it represents bodies as point masses that never actually collide. Bodies typically fly off in a different direction (hyperbolic orbit) due to a close gravitational encounter, or slip past each other quietly if softening is used.

Modelling of star collisions is very different to (say) rock collisions and would need separate treatment to do it realistically. This is a very complex subject area but is a candidate for future inclusion in Grav-Sim.


Stellar Evolution (SE)

Although Grav-Sim can be used to model stars in clusters and galaxies, it does not attempt to model the stars themselves. In reality a main sequence star will eventually evolve into a red giant and the transition could affect the results of the simulation, particularly in terms of collisions. Since Stellar Evolution is available via third-party packages in quite a mature form, the most sensible approach would be to link with these rather than including SE directly in the Grav-Sim code. The relatively loose coupling between N-body dynamics and SE makes this attractive in performance terms as well.

There are at least 2 different ways of integrating Grav-Sim with SE:

  • A tight integration via an internal client/server style API (which could be operated across the internet)
  • A loose integration via an external scripting language (such as Python as used by the MUSE project)

Model Generation

The Grav-Sim pre-defined model files were generated with the Plummer code as ported from ACS. There is a range of other models defined in the literature, each of which has its own merits and could be incorporated into Grav-Sim. This could extend Grav-Sim's reach beyond simple clusters to elliptical and spiral galaxies.
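
For context, the classic Plummer sampling recipe (Aarseth, Henon & Wielen 1974) draws the radius by inverting the cumulative mass profile and the speed by rejection sampling against the Plummer distribution. A condensed sketch, not the actual ACS or Grav-Sim code (total mass and scale radius set to 1; direction sampling and centre-of-mass adjustment omitted; frand is a placeholder random number generator):

    #include <cmath>
    #include <cstdlib>

    // Uniform random number in (0,1); placeholder for the real generator.
    static double frand() { return (std::rand() + 1.0) / (RAND_MAX + 2.0); }

    // Draw one radius and speed from the Plummer model (mass 1, scale radius 1).
    void samplePlummer(double& r, double& v)
    {
        // Invert the cumulative mass profile M(r) = r^3 / (1 + r^2)^(3/2).
        double x = frand();
        r = 1.0 / std::sqrt(std::pow(x, -2.0 / 3.0) - 1.0);

        // Rejection-sample q = v / v_escape against g(q) = q^2 * (1 - q^2)^(7/2).
        double q, y;
        do {
            q = frand();
            y = frand() * 0.1;                   // 0.1 bounds the maximum of g(q)
        } while (y > q * q * std::pow(1.0 - q * q, 3.5));

        v = q * std::sqrt(2.0) * std::pow(1.0 + r * r, -0.25);  // v_escape at radius r
    }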

Existing packages such as Nemo and Starlab already include a good selection of models. Supporting their file formats would be a sensible next step.


Visualisation Techniques

Grav-Sim is currently quite limited in the areas of model and orbit visualisation. There are opportunities to improve the GLUT-based Grav-View viewer. However, a better approach may be to link with other visualisation packages and projects.

One interesting example is the OpenSim viewer produced for the Meta-Institute for Computational Astrophysics (MICA), which has been set up in the virtual world Second Life.


Electromagnetism (EM)

The realism of a gravity-only simulation becomes questionable as soon as you get down to the level of dust and gas. More physics than gravity alone is needed once these small bodies collide and electromagnetic forces begin to dominate.


Smoothed Particle Hydrodynamics (SPH)

State-of-the-art cosmological simulations tend to include gas as well as stars, often modelled via a technique known as SPH. One of the leading packages in this area is GADGET2, developed by Volker Springel.


Particle-Mesh (PM)

There are a variety of other ways to model gravity. One that is often used in cosmological simulations is the Particle-Mesh method, which is well suited to extremely large numbers of widely spread particles.