Thursday, 9 February 2012

Sparse Voxel Octrees in Haskell

It's been a long time coming (daughters birth impedes Haskell coding!), but I've finally implemented and uploaded support for Sparse Voxel Octrees in my Haskell Renderer.
It's quite an interesting algorithm to implement in something like Haskell. It's interesting because it really questions the balance of strict vs lazy evaluation in Haskell. Which is better - strict construction and strict traversal, with the platform optimisations that may bring, or lazy construction-as-you-traverse?
In an imperative language, you'd run a complete process to construct the tree, and then traverse the completed tree for each intersection query. You have to pay the total construction cost plus the traversal cost. Since you'd be constructing and storing the whole tree, you need to invest significant effort in data compression. On the other hand, the direct data representation would permit you to make a number of optimisations aimed at evaluation efficiency.
With Haskell, the lazy evaluation thunks can do some of that work for you. Instead of trying to crunch the whole tree down to be as small as possible, you may choose instead to rely on the Haskell runtime to just-in-time construct the relevant portions of it. You would only pay the cost of construction for the parts of the tree you traverse. Would this sum be lower than a strict construction and traversal?

Alternatively, In Haskell, you can still go for the strict option and implement it much as you would with C.

What's interesting is that it introduces this challenging, general question of what offers the most efficiency: A domain-specific, efficiently implemented clever algorithm, or a somewhat more generic algorithm, less specifically optimised but instead relying on lazy evaluation and the higher-order optimisations that may automatically permit?

I'm not sure what the answer is to this question right now. I suspect there is a cross over point that is very specific to the data set in question and therefore your particular application.


My current Haskell SVO code is on github, in my Haskell Renderer:

I'm going to fix up the various remaining TODOs and then pursue a more optimised, C-like version along the lines of the various SIMD-like SVO implementations that have recently emerged.
Implementation Notes

This was ultimately quite an elegant algorithm to implement in Haskell. Much of the algorithm is built off simple geometric tests. These are quite compact and elegant to represent in a functional language like Haskell. For instance, a ray-box test is simply:

boundingBoxIntersectRay :: AABB -> Ray -> Maybe (Double, Double)
boundingBoxIntersectRay (bounds0, bounds1) (Ray rayOrg _ invRayDir rayLen)
     | tmax > tmin = Nothing
     | tmin > 0 && tmin < rayLen = Just (tmin, tmax `Prelude.min` rayLen)
     | tmax > 0 && tmax < rayLen = Just (tmax, tmax)
     | otherwise = Nothing
     (tmin, tmax) = foldr1 (\(a0, b0) (a1, b1) -> (a0 `Prelude.max` a1, b0 `Prelude.min` b1)) (map intercepts [vecX, vecY, vecZ])

intercepts f = let x0 = (f bounds0 - f rayOrg) * f invRayDir
                   x1 = (f bounds1 - f rayOrg) * f invRayDir
               in (x0 `Prelude.min` x1, x0 `Prelude.max` x1)

Here we simply define an operation to calculate the slab intercepts of the box, and then repeat that operation over X, Y and Z, then fold to reduce the results. We're offering a description of the algorithm to Haskell in high level primitives, permitting optimisation and reduction - as opposed to clumsily trying to indirectly tell a C++ compiler what you'd really like it to do. It's surprising how often this sort of thing proves faster in Haskell. For example, using the "any" or "all" functions has proven faster than a set of || or && clauses for me in the past. I could really imagine that a thunk-less version of Haskell, free of the irritation of considering C++ aliasing might produce some very efficient code.

The traversal code is a simple recursive traversal, with few (as yet) clever tricks. It terminates either at the leaves, at the maximum traversal depth or when the feature becomes too small when projected to screen space (though this calculation needs further elaboration). It therefore gives a simple, automatic LOD control.
And so, finally, here it is, our SVO lego-sphere:

No comments:

Post a Comment