I've been experimenting with Haskell's basic parallelism primitives, par, and pseq. These primitives trigger ("spark") evaluation of expressions in parallel - but only if the runtime deems it necessary.
After many failed or mildly successful experiments, I've finally managed to get a speedup. In short, I stopped trying to reinvent the wheel myself. I simply used parListChunk:
rayTraceImage :: Camera -> Int -> Int -> SceneGraph -> [Light] -> [Colour]
rayTraceImage camera renderWidth renderHeight sceneGraph lights = map (clamp . tracePixel eyePosition sceneGraph lights) rayDirections `using` parListChunk 256 rseq
where !rayDirections = [makeRayDirection renderWidth renderHeight camera (x, y) | y <- [0..(renderHeight - 1)], x <- [0..(renderWidth - 1)]]
!eyePosition = Camera.position camera
Here, we break the list of rays to be traced into data parallel chunks of 256, each free to be evaluated by a different thread. Each of these chunks is processed using map. Map applies the raytracing function to convert a ray direction into a colour, and we clamp the result.
And that's it.
Running it on two threads on my dual-core MacBook Pro halves the execution time.
Running it on two threads on my dual-core MacBook Pro halves the execution time.
What's nice is that the core 'map' expression is separated from the details of parallelisation. We've separated the concepts of raytracing and its parallel decomposition into two separate concerns. We are therefore free to vary either without intertwining their concerns.
Not a critical section, lock-free queue or atomic operation in sight!
Clearly, the nature of the functional approach surrenders a lot of low-level implementation details to your platform's runtime. However, it is more frequently the case that when you require such low-level intervention, you are working around deficiencies in your run-time platform, rather than dealing with fundamental algorithm details. Parallelising many algorithms follows established patterns.
It is also worth noting that the raytracer code is currently pretty straightforward code. I've not implemented any particular tricky optimisations, and certainly none of the ideas in work such as Ingo Wald's parallel raytracing PhD thesis. Not yet, anyway...
A massively parallel GPU-based raytracer (with GPL code):
ReplyDeletehttp://users.softlab.ece.ntua.gr/~ttsiod/cudarenderer-BVH.html
Great post thanks for writing
ReplyDelete