pretty pictures, slowly
I've hit a milestone with my implementation of the Ray Tracer
Challenge, I can render the cover image of the book!
This was my primary goal when I first started this book, so I am pretty stoked to have made it here. However, I am having major performance issues. The above image took 17 minutes to render, which is not great. It is significantly slowed down by using a ray bounce depth of 5, so every ray that hits a reflective surface has the potential to 5x the computation, which adds up. I've applied various optimizations already like caching inverse matrices, reducing allocations by mutation, and rewriting the core tuple and matrix code to be more efficient. These all provided a significant speed up, but it is not quite enough. I've experimented with multithreading, and saw a ~2x speed up, but I want to keep this single threaded.
Here's what the profiler output while generating the above image
(using snakeviz for visualization).
As we can see, there is nothing unexpected. We spend the vast
majority of the time multiplying matrices together. I've focused
on this code, and can't squeeze much more performance out of it.
I experimented a bit with numpy and numba here, but didn't see much
improvement (and don't want to use them in this project). The problem
seems to be that we are wasting a lot of work. I've read ahead about
using a Bounded Volume Hierarchy to only check for intersections
on objects within a bounding box, and I look forward to the
potential speedup. As it is now, the above image required
172,535,400 calls to the matrix_mul_tuple
function to be rendered in
300px by 300px. The book notes about an order of magnitude reduction in
the amount of intersection calls when using BVH on an appropriate scene,
which I hope to see as well.
I plan to implement BVH then crunch through the rest of the book pretty quickly, given that I've done this all before. I'll explore further optimizations at the end, and then look into doing another implementation with performance in mind. I've been particularly motivated by a site I found on the book's forum that demos a C++ implementation compiled to WASM. If I can get on the same order of magnitude of performance, I'll be happy.
I've been enjoying the process lately, which is great! While this
ray tracer is not the most performant (as admitted by the author), it
does produce
some of the better images that I have seen when compared to other
hobbyist/educational ray tracers. I am happy with the image quality
now, but I want to focus on performance before doing more post-book
additions like depth-of-field and anti-aliasing. For now, here's
two scenes from the recent chapters: