A fellow student and I developed MPI/RT as a final project in a graduate parallel computing course. Roughly speaking, a project in our course passed if it successfully parallelized an algorithm or process that was “difficult” in the sense that it resisted easy parallelization; otherwise, the project failed. Furthermore, the course defined a “successful” parallelization as one that achieved linear or near-linear speedup as the application scaled from 2 up to 32 compute nodes on a distributed memory supercluster.
MPI/RT is a parallel, Monte Carlo ray tracing system. While naïve ray tracing systems are easy to parallelize—so much so that naïve ray tracing is often cited as an example of an “embarrassingly parallel” problem—advanced, aggressively-accelerated ray tracing systems are a different story altogether. Indeed, the introduction of aggressive acceleration techniques opens up dramatic disparities between rays in the amount of intersect testing work required—a significant impediment to parallelization. The “embarrassingly parallel” property of naïve ray tracing relies on the assurance that all rays take about the same amount of time to process. In the naïve case, an even division of rays over compute nodes trivially effects an even division of work. But this isn’t true in the presence of acceleration. In the accelerated case, an even division of rays usually results in an extremely uneven division of work between compute nodes—and a consequently poor parallelization outcome.
MPI/RT is a successfully parallelized, accelerated ray tracing system. To accelerate ray-object intersect testing, it builds a bounding volume hierarchy (or “BVH”) around every object it processes. Even though bounding volume hierarchies are a particularly aggressive acceleration strategy, MPI/RT retains excellent parallelization characteristics, scaling from 2 up to 32 parallel processors with near-linear speedup over a wide variety of input scenes. It achieves this by combining static optimization at start-up time with a dynamic work diffusion scheme at runtime. MPI/RT is implemented in C and is targeted specifically at the nyx supercluster at the University of Michigan Center for Advanced Computing.
You can download a copy of the MPI/RT source code by clicking here.
The Problem—Acceleration Techniques & Parallelization
Ray tracers create shaded and lit images of
three-dimensional scenes by tracing rays fired from a
distinguished eye point into a virtual environment. Here,
the rays either intersect some object, or traverse the
environment completely without hitting anything. Owing to this
design, intersect testing is ray tracing’s central operation.
PAGE 1 | NEXT >