Don’t Call It SPIR of the Moment
The Khronos Group has released Vulkan 1.1 and SPIR-V 1.3.
Vulkan 1.0 was released a little over two years ago. The announcement, with conformant drivers, conformance tests, tools, and a patch for The Talos Principle, made for a successful launch for the Khronos Group. Of course, games weren’t magically three times faster or anything like that, but it got the API out there; it also redrew the line between game and graphics driver.
The Khronos Group repeats this “hard launch” with Vulkan 1.1.
First, the specifications for both Vulkan 1.1 and SPIR-V 1.3 have been published. We will get into the details of those two standards later. Second, a suite of conformance tests has also been included with this release, which helps prevent an implementation bug from becoming an implied API that software relies upon ad infinitum. Third, several developer tools have been released, mostly by LunarG, into the open-source ecosystem.
Fourth – conformant drivers. The following companies have Vulkan 1.1-certified drivers:
There are two new additions to the API:
The first is Protected Content. This allows developers to restrict access to rendering resources (DRM). Moving on!
The second is Subgroup Operations. We mentioned that they were added to SPIR-V back in 2016 when Microsoft announced HLSL Shader Model 6.0, and some of the instructions were available as OpenGL extensions. They are now a part of the core Vulkan 1.1 specification. These operations allow the individual threads of a GPU in a warp or wavefront to work together on specific instructions.
Shader compilers can use these intrinsics to speed up operations such as:
- Finding the min/max of a series of numbers
- Shuffling and/or copying values between lanes of a group
- Adding several numbers together
- Multiplying several numbers together
- Evaluating whether any, all, or which lanes of a group evaluate true
In other words, shader compilers can do more optimizations, which boosts the speed of several algorithms and should translate to higher performance when shader-limited. It also means that DirectX titles using Shader Model 6.0 should be able to compile into their Vulkan equivalents when using the latter API.
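To make the list above concrete, here is a minimal CPU-side sketch in Python of what these operations compute. It is an illustration only, not shader code: each list element stands in for the value held by one lane of a subgroup, the eight-lane width is an arbitrary assumption, and every function name is hypothetical (loosely modeled on the GLSL subgroup built-ins such as subgroupAdd and subgroupBallot). On real hardware these run across the lanes of a warp or wavefront in a handful of instructions rather than in a loop.

```python
from functools import reduce
import operator

# CPU-side illustration only. Each list element is the value held by one lane
# (thread) of a subgroup. All function names here are hypothetical.

def subgroup_add(lanes):
    """Every lane receives the sum of all lane values."""
    return [sum(lanes)] * len(lanes)

def subgroup_mul(lanes):
    """Every lane receives the product of all lane values."""
    return [reduce(operator.mul, lanes, 1)] * len(lanes)

def subgroup_min(lanes):
    """Every lane receives the smallest lane value."""
    return [min(lanes)] * len(lanes)

def subgroup_max(lanes):
    """Every lane receives the largest lane value."""
    return [max(lanes)] * len(lanes)

def subgroup_shuffle(lanes, source_ids):
    """Lane i reads the value held by lane source_ids[i]."""
    return [lanes[src] for src in source_ids]

def subgroup_ballot(predicates):
    """Return a bitmask with bit i set when lane i's predicate is true."""
    mask = 0
    for lane_id, flag in enumerate(predicates):
        if flag:
            mask |= 1 << lane_id
    return mask

def subgroup_any(predicates):
    """True when at least one lane's predicate is true."""
    return any(predicates)

def subgroup_all(predicates):
    """True only when every lane's predicate is true."""
    return all(predicates)

# One value per lane in an assumed 8-lane subgroup.
values = [3, 1, 4, 1, 5, 9, 2, 6]
above_three = [v > 3 for v in values]

print(subgroup_add(values)[0])          # sum shared by every lane
print(subgroup_max(values)[0])          # maximum shared by every lane
print(bin(subgroup_ballot(above_three)))  # which lanes passed the test
```

The point of the sketch is the shape of the result: a reduction hands the same answer to every lane, a shuffle moves values sideways between lanes, and a ballot answers the "which lanes" question as a bitmask, all without a round trip through memory.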
This leads us to SPIR-V 1.3. (We’ll circle back to Vulkan later.) SPIR-V is the intermediate shader language that Vulkan relies upon. Unlike the original SPIR, which was based on a subset of LLVM IR, SPIR-V is its own binary format. It is the code that the driver translates for, and ultimately runs on, the GPU hardware – Vulkan just deals with how to get this code onto the silicon as efficiently as possible. In a video game, this would be whatever code the developer chose to represent lighting, animation, particle physics, and almost anything else done on the GPU.
The Khronos Group is promoting the fact that shaders for the SPIR-V ecosystem can be written in GLSL, OpenCL C, or even HLSL. In other words, the developer will not need to rewrite their DirectX shaders to operate on Vulkan. This isn’t particularly new – Unity has done this sort of HLSL to SPIR-V conversion ever since they added Vulkan – but it’s good to mention that it’s a promoted workflow. OpenCL C will also be useful for developers who want to move existing OpenCL code into Vulkan on platforms where the latter is available but the former rarely is, such as Android.
Speaking of which, that’s exactly what Google, Codeplay, and Adobe are doing. Adobe wrote a lot of OpenCL C code for their Creative Cloud applications, and they want to move it elsewhere. This ended up being a case study for an OpenCL to Vulkan run-time API translation layer and the Clspv OpenCL C to SPIR-V compiler. The latter is open source, and the former might become open source in the future.
Now back to Vulkan.
The other major change with this new version is the absorption of several extensions into the core 1.1 specification.
The first is Multiview, which allows multiple projections to be rendered at the same time, as seen in the GTX 1080 launch. This can be used for rendering VR, stereoscopic 3D, cube maps, and curved displays without extra draw calls.
The second is device groups, which allows multiple GPUs to work together.
The third allows data to be shared between APIs and even whole applications. The Khronos Group specifically mentions that Steam VR SDK uses this.
The fourth is 16-bit data types. While most GPUs operate on 32-bit values, it might be beneficial to pack data into 16-bit values in memory for algorithms that are limited by bandwidth. It also helps Vulkan be used in non-graphics workloads.
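The bandwidth argument can be sketched on the CPU with Python's standard struct module, which supports IEEE 754 half-precision through the "e" format code. This is only an illustration of the storage trade-off, not of the Vulkan feature itself, and the sample values are made up for the example.

```python
import struct

# Made-up sample data standing in for values a shader might read from memory.
samples = [0.0, 0.5, 1.0, 3.14159, 1000.25]
count = len(samples)

# "f" packs 32-bit floats, "e" packs 16-bit (half-precision) floats.
packed32 = struct.pack(f"<{count}f", *samples)
packed16 = struct.pack(f"<{count}e", *samples)

# Unpack the 16-bit data to see what survived the narrower format.
restored = struct.unpack(f"<{count}e", packed16)
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print(len(packed32), "bytes as 32-bit")
print(len(packed16), "bytes as 16-bit")
print("largest rounding error:", max_error)
```

Half the bytes means half the memory traffic for the same number of values, at the cost of precision that gets coarser as magnitudes grow. Whether that trade is acceptable depends on the algorithm.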
The fifth is HLSL support, which we already discussed; it was an extension and is now core.
The sixth extension is YCbCr support, which is required by several video codecs.
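For readers unfamiliar with YCbCr, the following Python sketch shows the kind of color conversion involved. It assumes the full-range BT.601 matrix (the one used by JPEG) purely for illustration; video content may use other matrices and ranges, and the function names are hypothetical rather than anything from the Vulkan API.

```python
def clamp(value):
    """Clamp to the 8-bit range 0..255."""
    return max(0, min(255, value))

def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit YCbCr (assumed full-range BT.601)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(clamp(round(c)) for c in (y, cb, cr))

def ycbcr_to_rgb(y, cb, cr):
    """Convert 8-bit YCbCr back to 8-bit RGB (assumed full-range BT.601)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(clamp(round(c)) for c in (r, g, b))

white = rgb_to_ycbcr(255, 255, 255)
gray = rgb_to_ycbcr(128, 128, 128)
print(white, gray, ycbcr_to_rgb(*gray))
```

Y carries brightness while Cb and Cr carry color offsets centered on 128, which is why codecs can store the two color channels at reduced resolution. Having the API understand this layout lets decoded video frames be sampled directly instead of being converted by hand like this.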
The last thing that I would like to mention is the Public Vulkan Ecosystem Forum. The Khronos Group has regularly mentioned that they want to get the open-source community more involved in reporting issues and collaborating on solutions. In this case, they are working on a forum where both members and non-members will collaborate, as well as the usual GitHub issues tab and so forth.
You can check out the details here.
“The second is device groups, which allows multiple GPUs to work together.”
Now is the time to go into greater detail on the implications of device groups and what they mean for multi-GPU load balancing between discrete GPUs, or even integrated and discrete GPUs, for gaming usage.
With the high-end GPU SKUs becoming so costly, it would be nice if games developers would begin to target Vulkan’s Explicit Multi-GPU adaptor IP with a stress on making integrated graphics and discrete graphics work together for better gaming performance. There are a lot of folks interested in the new Raven Ridge Desktop APUs with integrated graphics for HTPC/Mini-Form-Factor PC usage, and maybe also in some discrete Vega GPU micro-arch based replacements for the RX570/RX580 Polaris SKUs when they become available. Ditto for Nvidia’s GPUs with other Nvidia GPUs, or even Nvidia’s GPUs and AMD’s GPUs both usable at the same time.
I’m seeing the included Khronos slide/graphic (the second one) discussing homogeneous Multi-GPU sharing (under CF/SLI) but no direct mention of any heterogeneous Multi-GPU sharing, and that inclusion of CrossFire and SLI and the wording is just confusing. Does it mean in a manner similar to CrossFire and SLI, or does it mean that CrossFire and SLI support will still be needed? Both CrossFire and SLI are supposed to be deprecated by AMD and Nvidia respectively going forward, so that wording in the Khronos provided slide/graphic is confusing.
And under the Device Groups heading in the outline provided by Khronos, the second sub-bullet point in that slide/graphic states “Device groups make the number of GPUs in a system relatively transparent to an application.” and the third states “Applications can be written to use one or many GPUs with a minimum of changes.” So I’m guessing that Khronos only mentions CrossFire and SLI for legacy support, and that going forward games developers will be using Vulkan’s Explicit multi-GPU adaptor support via that Vulkan API supported multi-GPU code path for any new and future games.
Scott, do you think that games developers will mostly ignore coding for Multi-GPU and instead just continue targeting single GPUs only, or do these Vulkan extensions make it easier for the games/gaming engine developers to target multi-GPU under the built-in Vulkan multi-GPU feature set?
That github README does not go into any more detail than listing more links, but I wish that Khronos would provide more guidance on heterogeneous multi-GPU usage where Nvidia’s, AMD’s (integrated/discrete), and Intel’s (integrated) graphics could be made use of if they were installed/available on a PC/Laptop.
I was reading elsewhere that multiGPU does not require identical GPUs, but they need to take the same compiled code. So maybe all GCN5, or maybe if we’re lucky all GCN, but not AMD and nVidia simultaneously. I suppose this minimizes the amount of juggling necessary when dealing with each GPU.
The Ars Technica article(1) on Vulkan 1.1 states:
“The new revision standardizes a handful of features that were previously offered as extensions. The release rounds out the API, bringing parity with Microsoft’s DirectX 12 in a few areas where it was absent, improving compatibility with DirectX 12, and laying the groundwork for the next generation of GPUs.
One feature in particular goes a long way toward filling a Vulkan gap relative to Microsoft’s API: explicit multi-GPU support, which allows one program to spread its work across multiple GPUs. Unlike SLI and Crossfire of old, where the task of divvying up the rendering between GPUs was largely handled by the driver, this support gives control to the developer. With the addition, developers can create “device groups” that aggregate multiple physical GPUs into a single virtual device and choose how work is dispatched to the different physical GPUs. Resources from one physical GPU can be used by another GPU, different commands can be run on the different GPUs, and one GPU can show rendered images that were created by another GPU.
This feature does have one deficit relative to DirectX 12, as it requires homogeneous GPU configurations, where every GPU must match (or at least be closely related and use the same driver). DirectX 12 goes a step further, allowing heterogeneous GPU configurations that mix and match different GPUs from different vendors. This extra capability is interesting because, though multiple discrete GPUs are still relatively uncommon and are usually found only in the most expensive gaming systems, it’s very common for systems to have one discrete GPU and one integrated GPU.” (1)
Well, at least those using Raven Ridge/Vega integrated graphics alongside a discrete Vega GPU can have integrated and discrete graphics working together under Vulkan 1.1’s homogeneous multi-GPU support. So I’m looking at building a Mini-Desktop Raven Ridge system with hopes that there will be some mainstream Vega replacement for the RX 570/580 SKUs at some point in time. The fact that MS’s DX12 can offer heterogeneous GPU pairings and Vulkan cannot means that Khronos has some more work ahead to get that support into Vulkan also, and Khronos has only had Vulkan released for a little more than one year, so progress has been very good on their part with getting Vulkan feature parity with DX12 mostly done.
Now for the games/gaming engines to get more Vulkan multi-GPU support going forward, so folks can pair up two weaker AMD GPUs for better gaming performance, because the miners are sure not letting go of their desire to buy up all the Vega 56/64 units that they can get their hands on.
(1)
“Vulkan 1.1 out today with multi-GPU support, better DirectX compatibility
Updated drivers that support the latest version should be out today.”
https://arstechnica.com/gadgets/2018/03/vulkan-1-1-adds-multi-gpu-directx-compatibility-as-khronos-looks-to-the-future/
Edit: Vulkan released for a little more than one year
to: Vulkan released for a little more than two years
Multi-GPU:
In terms of SPLITTING (not AFR) code between GPUs we’re still a long way from that being common.
All game coding is a balance between expected payoff and the cost to implement. Adding multi-GPU support into the game and the expected TROUBLESHOOTING thereafter is simply not worth the effort to most developers.
We really need this to be handled mostly AUTOMATICALLY by both the GAME ENGINE and VULKAN, even if it’s just handling some aspects such as physics, hair or whatever.
AFR:
To be clear, SLI/Crossfire is 99% utilized via AFR (Alternate Frame Rendering) where one GPU creates a frame then the next GPU creates the next frame. The game must support that and there are issues such as frame pacing (stuttering) and added lag as well as scaling in terms of added FPS.
Ideally we want each new frame to be SPLIT between multiple GPUs, but that requires software support, which is again hard (or at least impractical) to implement, though we’re getting much closer.
Hardware:
It also looks like STITCHING together separate GPU elements to make a single GPU via interposer or similar means is the next step in getting high-end performance; in fact I expect the PS5 to do something like that, and NVidia at my best guess might have a product as early as 2020.
This can save a lot of money by increasing the YIELDS due to die size and by operating at a lower frequency. It also solves the problem of products with poor or no SLI/Crossfire (AFR specifically) support.
We might get something with 2x the performance of a GTX1080Ti made with four different dies stitched together at a frequency optimized for a combination of yields and power efficiency.