The CausticOne and CausticTwo Ray Tracing Processing Units
I imagine that, like me, many of you are curious why this type of real-time ray tracing algorithm could not simply be run on a modern x86 CPU like the Nehalem-based Core i7. Information from Caustic can get a little fuzzy on this topic, as they obviously don’t want the world thinking that the hardware they have spent time developing is not a necessary component of their solution. Essentially, the line from the company is that if they “could have” used a standard CPU for the processing in their algorithms, “they would have.” Remember, they stated earlier that they did not start out on the project intending to build a hardware solution.
According to James, modern processors on the market today do not offer the right kind of controls or instructions to efficiently implement the company’s designs. They used the example of an enterprise-level networking switch (hundreds of Gigabit connections) to elaborate: could an Intel Core i7 processor be used in that case? Maybe, but the loss in efficiency, multiple orders of magnitude, completely destroys the cost model. What completes the efficiency argument for Caustic is the claim that their ASIC will use only 20 watts of power per RTPU, compared to the 200 watts of a high-end graphics card.
What Caustic can achieve with the custom-programmed FPGA is a very specific and fast model for one particular type of computation. It would seem likely that the RTPU is doing something to the effect of a complete vector computation (or multiple) per clock cycle while keeping power consumption incredibly low.
The CausticOne is the first batch of hardware that will ship to qualified developers and ISVs in order to allow them to begin testing, programming and designing around the CausticRT platform. This card features an x4 PCI Express 1.0 connection, uses only a single slot, is passively cooled and is solely powered by the bus.
The processors themselves on this iteration of the card are simply FPGAs (field-programmable gate arrays) running at 100 MHz; FPGAs are often used in the development stages of new hardware in order to emulate silicon before it is manufactured. The CausticOne is basically a temporary product to send to developers to prove that the technology is not “vaporware” and to provide a point of reference for performance expectations.
The CausticTwo, the card that will be offered for sale to end-users, will use a custom-designed ASIC (application-specific integrated circuit) that offers a 350 MHz clock speed with 4x the logic of the CausticOne, which works out to 14x the throughput of the first-generation card we saw demoed last month. The card will continue to be a single-slot design powered solely by the PCI Express bus, though the new card will require a PCI Express x16 slot for additional bandwidth. The hardware change will apparently be completely transparent to the developers using the CausticOne; performance of their applications will basically increase fourteenfold. As of my meeting, the plan was to have custom ASIC cards available in “early 2010.”
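For anyone wondering where the 14x figure comes from, it is simply the clock ratio multiplied by the logic ratio. Here is a quick back-of-the-envelope check, assuming throughput scales linearly with both clock speed and the amount of logic (my simplification, not a claim from Caustic); the function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the quoted 14x figure.
# Assumption: throughput scales linearly with clock speed and with logic.
CAUSTIC_ONE_CLOCK_MHZ = 100   # FPGA clock quoted for the CausticOne
CAUSTIC_TWO_CLOCK_MHZ = 350   # ASIC clock quoted for the CausticTwo
LOGIC_MULTIPLIER = 4          # "4x the logic" of the CausticOne


def throughput_multiplier(old_clock_mhz, new_clock_mhz, logic_multiplier):
    """Return relative throughput under the linear-scaling assumption."""
    return (new_clock_mhz / old_clock_mhz) * logic_multiplier


speedup = throughput_multiplier(
    CAUSTIC_ONE_CLOCK_MHZ, CAUSTIC_TWO_CLOCK_MHZ, LOGIC_MULTIPLIER
)
print(f"Estimated speedup: {speedup:g}x")
```

A 3.5x clock bump on its own would be a modest gain; it is the quadrupled logic that pushes the estimate past an order of magnitude.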
How do you program for this thing?
While the Caustic hardware is an important piece of the ray tracing puzzle, it really comes down to the developers to determine how much adoption the new technology can achieve. “CausticRT” is the name the company has given to the entirety of the ray tracing platform, and it features both a hardware and a software component. We have discussed the hardware extensively above, but the software component, known as “CausticGL,” is just as important.
Rather than create a completely new ray tracing API around their hardware, the Caustic founders drew on their past experience with OpenGL and chose to adopt and adapt a preexisting foundation. They settled on OpenGL ES 2.0, a graphics API that includes GLSL shader support but is lightweight in terms of its technology footprint. Extensions were built on top of this OpenGL API to allow the casting of rays within the shading language, using syntax that will be very familiar to existing OpenGL developers. CausticGL is built on this design and becomes a new, custom OpenGL revision.
Without diving too far into the software coding, the advantage of having ray casting abilities in standard OpenGL-type syntax is that developers continue to write code as they are used to doing but can now include rays alongside their other shading code. With the ability to cast rays from shaders, it is likely that many of the shaders programmers are used to creating (complex lighting and shadows come to mind) could be removed or greatly reduced in complexity. Because the CausticRT platform is able to separate the ray tracing compute from the shading compute (with ray tracing going to the Caustic hardware if available and shading going to the GPU), the performance enhancements of this implementation are mostly transparent.
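To give a feel for why rays simplify something like shadows, here is a minimal sketch in plain Python. It is not CausticGL code and does not use any Caustic API; the function names and the sphere-only scene are my own assumptions. The point is that a shadow test collapses to a single question: does a ray from the surface point toward the light hit anything first?

```python
import math


def ray_hits_sphere(origin, direction, center, radius, max_distance):
    """Return True if a unit-length ray hits the sphere within max_distance."""
    # Vector from the sphere center to the ray origin.
    oc = [o - c for o, c in zip(origin, center)]
    # Solve t^2 + 2*b*t + c = 0 for the distance t along the ray.
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    discriminant = b * b - c
    if discriminant < 0:
        return False  # the ray misses the sphere entirely
    root = math.sqrt(discriminant)
    epsilon = 1e-6  # ignore hits at the surface point itself
    # A hit only counts if it lies between the point and the light.
    return any(epsilon < t < max_distance for t in (-b - root, -b + root))


def in_shadow(point, light_pos, occluders):
    """Cast one shadow ray from `point` toward the light.

    `occluders` is a list of (center, radius) spheres; `point` and
    `light_pos` are assumed to be distinct positions.
    """
    to_light = [l - p for l, p in zip(light_pos, point)]
    distance = math.sqrt(sum(v * v for v in to_light))
    direction = [v / distance for v in to_light]
    return any(
        ray_hits_sphere(point, direction, center, radius, distance)
        for center, radius in occluders
    )


blocker = ((0.0, 5.0, 0.0), 1.0)       # sphere directly between point and light
bystander = ((5.0, 5.0, 0.0), 1.0)     # sphere off to the side
print(in_shadow((0.0, 0.0, 0.0), (0.0, 10.0, 0.0), [blocker]))
print(in_shadow((0.0, 0.0, 0.0), (0.0, 10.0, 0.0), [bystander]))
```

Compare that with the shadow-map passes and bias tuning a rasterizer needs for the same result, and the appeal of being able to write the ray directly in the shader becomes clearer.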
Developers can combine CausticGL with other graphics APIs when writing their applications, so support for OpenGL 3.0 and even 3.1 is easy to integrate. Caustic even mentioned that it is possible to combine DirectX models with the CausticRT platform to create hybrid applications though the API trade-offs apparently aren’t that appealing today.
Another interesting note is that with the CausticGL API, the actual hardware itself is completely optional – the code written for the CausticOne or CausticTwo card would run on any normal system with a CPU and GPU though it would be notably slower.
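Conceptually, that optional-hardware behavior amounts to a backend selection step hidden beneath the API. The sketch below is purely illustrative: the backend names and functions are hypothetical and are not taken from CausticGL, whose internals Caustic did not describe to me.

```python
# Hypothetical backend names, in order of preference.
# Nothing here is the real CausticGL API.
PREFERRED_BACKENDS = ["caustic_rtpu", "cpu_gpu_fallback"]


def select_backend(available):
    """Return the first preferred backend present in `available`."""
    for backend in PREFERRED_BACKENDS:
        if backend in available:
            return backend
    raise RuntimeError("no usable ray tracing backend found")


def trace_rays(rays, available):
    """Trace `rays` on whichever backend is present.

    The calling code is identical either way; only the speed would differ.
    """
    backend = select_backend(available)
    return {"backend": backend, "rays_traced": len(rays)}


# Same application call, with and without the add-in card installed.
print(trace_rays(["ray_a", "ray_b"], {"caustic_rtpu", "cpu_gpu_fallback"}))
print(trace_rays(["ray_a", "ray_b"], {"cpu_gpu_fallback"}))
```

The design choice worth noticing is that the application never names the hardware, which is also what would let a CausticOne be swapped for a CausticTwo without code changes.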
Caustic also has plans to promote the CausticGL software API as an industry standard for ray tracing. While they feel like they have a “pretty good chance” of having that ratified, I know of at least three companies that would probably beg to differ…