DirectX 12 Has No More Secrets
Multiadapter is the last feature of DirectX 12, and it brings several modes of load balancing between multiple GPUs.
The DirectX 12 API is finalized and the last of its features are known. Before the BUILD conference, the list consisted of Conservative Rasterization, Rasterizer Ordered Views, Typed UAV Load, Volume Tiled Resources, and a new Tiled Resources revision for non-volumetric content. When the GeForce GTX 980 launched, NVIDIA claimed it would be compatible with DirectX 12 features. Enthusiasts were skeptical, because Microsoft had not officially finalized the spec at the time.
Last week, Microsoft announced the last feature of the graphics API: Multiadapter.
We already knew that Multiadapter existed, at least to some extent. It is the part of the specification that allows developers to address multiple graphics adapters and split tasks between them. In DirectX 11 and earlier, secondary GPUs would remain idle unless the graphics driver sprinkled some magic fairy dust on them with SLI, CrossFire, or Hybrid CrossFire. The only other way to access this dormant hardware was by spinning up an OpenCL (or similar compute API) context on the side.
Apart from RAGE, which transcoded textures with CUDA, I do not know of a high-performance game that did that. I am not sure that the task even ran on a non-primary GPU (if you even installed a secondary graphics card that was from NVIDIA).
Introducing Multiadapter for DirectX 12
In OpenCL, a developer needs to explicitly separate their tasks between all compute devices. In DirectX 12, Multiadapter comes in both “Implicit” and “Explicit” varieties. Implicit Multiadapter tells the graphics driver that you do not want to deal with load balancing. Like SLI and CrossFire, this means Alternate Frame Rendering (AFR). I also expect that Implicit Multiadapter would mirror all memory between devices, and that graphics cards of different models will not qualify, but neither of these two points was mentioned in the keynote. Of course, Microsoft still recommends that developers collaborate with hardware vendors to create a profile, like SLI and CrossFire do today with various driver updates and the GeForce Experience application.
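To make Alternate Frame Rendering concrete, here is a toy model in Python that hands out whole frames to identical GPUs in round-robin order. It is not DirectX 12 code and says nothing about how a real driver schedules work; `afr_schedule` and its parameters are hypothetical names invented for this illustration.

```python
def afr_schedule(num_frames, num_gpus):
    """Assign each frame index to a GPU in round-robin order (toy AFR model)."""
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    # Frame 0 goes to GPU 0, frame 1 to GPU 1, and so on, wrapping around.
    return [frame % num_gpus for frame in range(num_frames)]


# Six frames on two GPUs alternate between GPU 0 and GPU 1.
print(afr_schedule(6, 2))  # [0, 1, 0, 1, 0, 1]
```

Because each GPU in this model renders complete frames on its own, each would need access to the full set of scene data, which is consistent with the guess above about mirrored memory.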
It is unknown if Vulkan, the competing graphics API from the Khronos Group, will have a feature similar to Implicit Multiadapter. We will probably learn more about that later this year.
DirectX 12 also provides an alternative, called Explicit Multiadapter. This is a new concept for DirectX. Like OpenCL, individual GPUs can be separately addressed, sent unique commands, and store unique data in memory. They do not even need to be similar in performance. One possible application is for integrated GPUs to draw a layer of objects, such as a cockpit or a 3D HUD, over what the main graphics card draws. Max McMullen, Principal Development Lead for Direct3D and DXGI at Microsoft, specifically mentioned calculating VR/AR perspective warp on integrated graphics. He also showed the Unreal Engine 4 Elemental Demo with an integrated GPU drawing some of the post-processing effects while the primary GPU worked on the next frame.
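The appeal of handing post-processing to the integrated GPU can be sketched with a simple two-stage pipeline model. The Python below is an assumption-laden illustration, not a measurement or an API example: the function names, the `copy_ms` transfer cost, and every number are made up.

```python
def serial_frame_ms(render_ms, post_ms_primary):
    """Time per frame when the primary GPU renders, then post-processes."""
    return render_ms + post_ms_primary


def pipelined_frame_ms(render_ms, post_ms_integrated, copy_ms=0.0):
    """Steady-state time per frame when the integrated GPU post-processes
    frame N while the primary GPU renders frame N+1.

    The stages overlap, so the slower stage sets the pace. The copy of the
    frame to the integrated GPU is charged to the post-processing stage,
    which is a simplifying assumption.
    """
    return max(render_ms, post_ms_integrated + copy_ms)


# Hypothetical numbers: 14 ms render, 4 ms post on the primary GPU,
# 10 ms post on a slower integrated GPU plus a 2 ms copy.
print(serial_frame_ms(14.0, 4.0))           # 18.0
print(pipelined_frame_ms(14.0, 10.0, 2.0))  # 14.0
```

In this model the integrated GPU can be much slower than the primary one and still raise the frame rate, as long as its stage finishes within the primary GPU's render time. The trade-off is latency: each individual frame still passes through both stages and the copy before it reaches the screen.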
Multiadapter then breaks down Explicit further into two groups: Linked and Unlinked.
Linked GPUs allow special pairings of graphics hardware to collaborate more closely. They can share resources in each other's rendering pipeline and they are presented to the engine as a single GPU that has multiple command processors. We don't know how similar GPUs need to be for this classification though. “Look[s] like a single GPU” sounds like it excludes pairing cards from different vendors, because that sounds painful to implement across multiple, independent GPU drivers. It might be less strict than SLI and (non-Hybrid) CrossFire, but even that seems doubtful. Again, “look[s] like a single GPU” implies similar compute capabilities, and several of the other assumptions that make SLI and CrossFire possible to do automatically.
The other group, Unlinked Explicit Multiadapter, is interesting because it is agnostic to vendor, performance, and capabilities — beyond supporting DirectX 12 at all. This is where you will get benefits even when installing an AMD GPU alongside one from NVIDIA.
On the other hand, Unlinked Explicit Multiadapter is also the bottom of three tiers of developer hand-holding. You will not see any benefits at all, unless the game developer puts a lot of care into creating a load-balancing algorithm, and even more care in their QA department to make sure it works efficiently across arbitrary configurations. We do not yet know how many developers will care that much. After all, as stated earlier, developers could have launched an OpenCL kernel to secondary graphics cards for years, except on Windows Vista because of its multiple graphics driver bug. They didn't. Will that change? Maybe.
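To give a sense of what the simplest possible load balancer might look like, the Python sketch below splits a batch of work items across GPUs in proportion to a throughput figure for each. Everything here is hypothetical: `split_work` is an invented name, the throughput values would have to come from the engine's own profiling, and a real implementation would also have to account for transfer costs and per-GPU capabilities.

```python
def split_work(num_items, throughputs):
    """Split num_items across GPUs in proportion to their throughput.

    Uses largest-remainder rounding so the shares always sum to num_items.
    Throughputs are assumed to be non-negative.
    """
    total = sum(throughputs)
    if total <= 0:
        raise ValueError("throughputs must sum to a positive value")
    exact = [num_items * t / total for t in throughputs]
    shares = [int(x) for x in exact]  # round each share down first
    leftover = num_items - sum(shares)
    # Hand the remaining items to the GPUs with the largest fractional parts.
    by_fraction = sorted(
        range(len(exact)), key=lambda i: exact[i] - shares[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        shares[i] += 1
    return shares


# A discrete GPU measured at three times the speed of an integrated one.
print(split_work(100, [3.0, 1.0]))  # [75, 25]
```

Even this trivial version hints at the QA burden: the right split changes with every hardware pairing, so the ratios cannot be hard-coded.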
Unlinked Explicit Multiadapter could be important for newer systems with integrated graphics though, which makes me wonder about HSA and similar technologies. Since a graphics processor is co-resident with the CPU, some of them can collaborate with less costly set-up work. Some on-processor GPUs can operate on system memory in-place. This saves the time required to copy and overwrite buffers between two segments of the same memory, which benefits workloads that alternate between GPU- and CPU-friendly tasks. Otherwise, a developer is left wondering whether the performance that they gain in offloading will be nullified, or even negative, because of the overhead. Hopefully DirectX 12 allows graphics vendors to skip operations that are irrelevant to their specific architecture, but it might not, and a representative from AMD was unable to clarify (granted, this was by Twitter on a weekend).
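The worry about overhead can be written down as a back-of-the-envelope cost model. The Python below is only a sketch with invented numbers and a hypothetical function name, `offload_gain_ms`. It treats the copies as strictly serial with the computation, which real hardware may partly hide.

```python
def offload_gain_ms(cpu_ms, gpu_ms, copy_in_ms, copy_out_ms):
    """Time saved by offloading a task to the GPU.

    A negative result means the copies cost more than the GPU saved.
    """
    return cpu_ms - (gpu_ms + copy_in_ms + copy_out_ms)


# Hypothetical task: 5 ms on the CPU, 1 ms on the GPU.
# With 3 ms copies each way, offloading loses 2 ms.
print(offload_gain_ms(5.0, 1.0, 3.0, 3.0))  # -2.0
# If the GPU can work on system memory in place, the copies vanish.
print(offload_gain_ms(5.0, 1.0, 0.0, 0.0))  # 4.0
```

The same task flips from a loss to a win once the buffer copies disappear, which is why in-place access matters most for work that bounces between the CPU and GPU.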
The Final Result on Gaming
DirectX 12 will probably lead to several beautiful games, especially as developers optimize their asset creation process for it. We should be able to justify more objects with unique materials. This will make for more lively scenes and hopefully fewer person-hours of development for equivalent results. It could take a little while before we see production houses making big changes, except maybe parts of Ubisoft, so the quality might slowly ramp up over time.
As mentioned on last week's TWiCH, Microsoft presented a high-end technology and art demo from Square Enix. It was rendered at 4K, downsampled to 1080p, on four Titan X graphics cards in Implicit Multiadapter, which is the driver-controlled version that is more similar to CrossFire and SLI than to something like OpenCL. I found the hair and feather cape to be exceptionally well done, although I didn't really find anything else in the demo to be surprising. The hair was really the only thing that felt like a leap, in the way that, for instance, the Unreal Engine 4 Elemental Demo did when it was shown at E3 2012 alongside the Unreal Editor walkthrough. Maybe I'm being too critical.
At the very least, we can put the doubt to rest over what DirectX 12 will be. We know every feature, and just a few small details elude us. While the specification is finalized, we are still waiting on content, samples, more than a few tools, and documentation. Those are still in early access only. Microsoft is putting the finishing touches on those, or so they say.