Why Two 4GB GPUs Aren't Necessarily 8GB
Why Aren't Two 4GB GPUs Really 8GB?
We're trying something new here at PC Perspective. Some topics are fairly difficult to explain cleanly without accompanying images. We also like to go fairly deep into specific topics, so we're hoping that we can provide educational cartoons that explain these issues.
This pilot episode is about load-balancing and memory management in multi-GPU configurations. There seems to be a lot of confusion around what was (and was not) possible with DirectX 11 and OpenGL, and even more confusion about what DirectX 12, Mantle, and Vulkan allow developers to do. It highlights three different load-balancing algorithms, and even briefly mentions what LucidLogix was attempting to accomplish almost ten years ago.
If you like it, and want to see more, please share and support us on Patreon. We're putting this out not knowing if it's popular enough to be sustainable. The best way to see more of this is to share!
Open the expanded article to see the transcript below.
TRANSCRIPT
Crossfire and SLI allow games to load-balance across multiple GPUs. It is basically impossible to do this in OpenGL and DirectX 11 otherwise. Vulkan and DirectX 12 provide game developers with the tools to implement it themselves, but they do not address every limitation. Trade-offs always exist.
In the older APIs, OpenGL and DirectX 11, games and other applications attach geometry buffers, textures, materials, and compute tasks to the API's one, global interface. Afterward, a draw function is called to submit that request to the primary graphics driver. This means that work can only be split from within the driver, and only to the devices that driver controls, which prevents cross-vendor compatibility. LucidLogix created software and hardware that pretended to be the primary graphics driver, load-balancing GPUs from mismatched vendors behind the scenes. It never took off.
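For readers who think in code, here is a rough C++ sketch of that "one global interface" model. The types and function names are invented for illustration (this is not real OpenGL or DirectX 11 code); the point is that the application binds state and calls draw without ever naming a GPU, so only the driver can decide how the work is split.

```cpp
// Hypothetical sketch (not a real API): the application binds everything to one
// implicit, global context and then calls draw. The driver behind that context
// is the only thing that can decide how, or whether, the work is split.
#include <cstdio>
#include <string>

struct GlobalContext {                 // stands in for the API's single global state
    std::string boundGeometry;
    std::string boundTexture;

    void bindGeometry(const std::string& g) { boundGeometry = g; }
    void bindTexture(const std::string& t)  { boundTexture = t; }

    // The draw call hands the currently bound state to the primary driver.
    // The application never says *which* GPU executes it.
    void draw() {
        std::printf("driver receives draw: %s with %s\n",
                    boundGeometry.c_str(), boundTexture.c_str());
    }
};

int main() {
    GlobalContext gl;                      // one context for the whole process
    gl.bindGeometry("character_mesh");
    gl.bindTexture("character_albedo");
    gl.draw();                             // driver alone decides where this runs
}
```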
With Vulkan and DirectX 12, rather than binding data and tasks to a global state, applications assemble commands and push them onto lists. Not only does this allow multiple CPU threads to create work independently, but these lists can also point to any GPU. This is how OpenCL and other compute APIs are modeled, but Mantle was the first to extend it to graphics. Developers can load-balance GPUs by managing multiple lists with different destinations. This also means that the developer can control what each GPU stores in its memory, and ignore the data it doesn't need.
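By contrast, here is an equally rough sketch of the command-list model. Again, the names are invented rather than real Vulkan or DirectX 12 calls; what matters is that each list carries an explicit destination GPU chosen by the application, and different lists can be recorded on different CPU threads.

```cpp
// Hypothetical sketch (not real Vulkan/DirectX 12 calls): commands are recorded
// into lists that each name an explicit destination GPU, so the application,
// not the driver, decides where work and data go.
#include <cstdio>
#include <string>
#include <vector>

struct Command { std::string description; };

struct CommandList {
    int targetGpu;                          // explicit destination, chosen by the app
    std::vector<Command> commands;
    void record(const std::string& what) { commands.push_back({what}); }
};

// Pretend submission: each list is sent to the queue of the GPU it names.
void submit(const CommandList& list) {
    for (const auto& c : list.commands)
        std::printf("GPU %d executes: %s\n", list.targetGpu, c.description.c_str());
}

int main() {
    CommandList gpu0{0}, gpu1{1};           // one list per GPU; these could be built on separate threads

    gpu0.record("draw opaque geometry");    // each GPU only needs the data that
    gpu1.record("draw transparent layer");  // its own lists actually reference
    gpu1.record("post-process bloom");

    submit(gpu0);
    submit(gpu1);
}
```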
That said, even though the game developer has full control over tasks and memory, it doesn't mean that it will be any more efficient than SLI and Crossfire. To load-balance, some algorithm must be chosen that can split work between multiple GPUs, and successfully combine the results. The Alternate Frame Rendering algorithm, or AFR, separates draw calls by the frames they affect. If you have three GPUs, then you can just draw ahead three frames at a time. It's easy to implement, and performance scales very well when you add a nearly-identical card (provided the extra frames add to the experience).
Memory, on the other hand, does not scale well. Neighboring frames will likely draw the exact same list of objects, just with slightly adjusted data, such as camera and object positions. As a result, each GPU will need its own copy of this data in its individual memory pool. If you have two four-gigabyte cards, they will each store roughly the same four gigabytes of data. This is a characteristic of the algorithm itself, not just the limited information that Crossfire and SLI needed to deal with on OpenGL and DirectX 11.
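A toy sketch of AFR, with invented names, makes both properties visible: frames are dealt to the cards round-robin, which balances work well for identical GPUs, but every card has to keep a full copy of the scene because it may be asked to draw any frame.

```cpp
// Hypothetical AFR sketch (illustrative only, not any real API).
#include <cstdio>
#include <string>
#include <vector>

int main() {
    const int gpuCount = 2;
    const std::vector<std::string> sceneAssets = {
        "level geometry", "character textures", "shadow maps"};

    // Each GPU must hold the full asset set, because it may draw any frame.
    for (int gpu = 0; gpu < gpuCount; ++gpu)
        for (const auto& asset : sceneAssets)
            std::printf("GPU %d stores a copy of %s\n", gpu, asset.c_str());

    // Round-robin frame assignment: frame N goes to GPU (N % gpuCount).
    for (int frame = 0; frame < 6; ++frame)
        std::printf("frame %d rendered by GPU %d\n", frame, frame % gpuCount);
}
```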
Other algorithms exist, however. For comparison, imagine a fighting game or a side-scroller. In these titles, objects are often separated into layers by depth, such as the background and the play area. If these layers are rendered by different GPUs into separate images, they could be combined later with transparency or z-sorting. In terms of memory, each GPU would only need to store its fraction of the scene's objects (and a few other things, like the layer it draws). A second benefit is that work does not need to be split evenly between the processors. Non-identical pairings, such as an integrated GPU with a discrete GPU, or an old GPU with a new GPU, could also work together, unlike AFR. I say could, because the difference in performance would need to be known before the tasks are split. To compensate, the engine could vary each layer's resolution, complexity, quality settings, and even refresh rate, depending on what the user can notice. This would be similar to what RAGE did to maintain 60 FPS, and it would likely be a QA disaster outside of special cases. Who wouldn't want to dedicate a Titan graphics card to drawing Street Fighter characters, though?
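As a rough illustration of the layer approach (invented names again, and a deliberately simple heuristic), each layer is handed to whichever GPU would finish it soonest given its relative performance, and each GPU stores only the assets for the layers it owns. A real engine would profile the hardware and rebalance continuously; this only shows the shape of the idea.

```cpp
// Hypothetical layer-split sketch: layers are assigned to GPUs by relative
// performance and composited at the end. Illustrative only.
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

struct Layer { std::string name; float cost; };     // cost = rough render expense

int main() {
    std::vector<Layer> layers = {
        {"background", 1.0f},    // cheap layer
        {"play area",  4.0f}};   // expensive layer

    std::vector<float> gpuPerformance = {1.0f, 4.0f};  // e.g. iGPU vs. discrete GPU

    // Longest-layer-first greedy assignment: give each layer to whichever GPU
    // would finish it (plus its existing load) soonest.
    std::sort(layers.begin(), layers.end(),
              [](const Layer& a, const Layer& b) { return a.cost > b.cost; });

    std::vector<float> load(gpuPerformance.size(), 0.0f);
    for (const auto& layer : layers) {
        size_t best = 0;
        for (size_t g = 1; g < gpuPerformance.size(); ++g)
            if ((load[g] + layer.cost) / gpuPerformance[g] <
                (load[best] + layer.cost) / gpuPerformance[best])
                best = g;
        load[best] += layer.cost;
        std::printf("GPU %zu renders '%s' and stores only its assets\n",
                    best, layer.name.c_str());
    }
    std::printf("final pass: composite the layers with transparency/z-sorting\n");
}
```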
Then again, video memory is large these days. It might be better, for quality and/or performance, to waste RAM in exchange for other benefits. AFR is very balanced for multiple, identical GPUs, and it's easy to implement; unfortunately, it could also introduce latency and stutter, and it is inefficient with video memory. Layer-based methods, on the other hand, are complicated to implement, especially for objects that mutually overlap, but they allow for more control over how tasks and memory are divided. VR and stereoscopic 3D could benefit from another algorithm, where two similar GPUs render separate eyes. Like AFR, this is inefficient with memory, because both eyes will see roughly the same things, but it will load-balance almost perfectly for two identical GPUs. Unlike AFR, it doesn't introduce latency or stutter, but it is useless outside of two nearly-identical GPUs. Other GPUs will either idle, or be used for something else in the system, like physics or post-processing.
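The per-eye split can be sketched even more simply. Assuming two near-identical GPUs (an assumption of this sketch, not a requirement of any particular API), each eye is pinned to its own card every frame, which avoids AFR's extra frames of latency but duplicates the scene in both memory pools.

```cpp
// Hypothetical VR-split sketch: eye 0 is pinned to GPU 0 and eye 1 to GPU 1 on
// every frame, so no frame queuing is needed, but both GPUs hold nearly the
// same scene data. Illustrative only.
#include <cstdio>

int main() {
    const int eyeToGpu[2] = {0, 1};   // fixed mapping: left eye -> GPU 0, right eye -> GPU 1

    for (int frame = 0; frame < 3; ++frame)
        for (int eye = 0; eye < 2; ++eye)
            std::printf("frame %d, eye %d rendered by GPU %d (duplicate scene copy in its VRAM)\n",
                        frame, eye, eyeToGpu[eye]);
}
```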
In any case, the developer knows what their game needs to render. They can now choose the best algorithm for themselves.
So far, game companies haven't done much to support SLI/Crossfire, on average. Much of the advancement has come from Nvidia/AMD creating profiles that work around game issues to support more than one card for AFR. With DX12 and this year's games, we are seeing a lot less support than in previous years.
So while DX12 could be the lift-off dual cards have always needed for a great experience, in practice it's such a small part of the market that I can't see it being catered to. I suspect DX12 is the end of dual cards.
For VR we already have a lot of games, and 99.9% of them do not support dual cards, despite the clear lack of GPU rendering performance we have today. Nvidia's VR Funhouse is a technical demo showing off SMP and SLI, but no one else is using these speed-up technologies.
I just don't see where the incentive is for game developers and publishers to expend resources developing the technology, especially after how long it took AoS to get dual-vendor working right and how little it aided the experience in the end. The GPU manufacturers have a clear incentive, as it sells more GPUs, but game developers aren't going to sell more games by supporting dual-card PC users. So without the games having support, and subsequently fewer people buying more than one card, and even Nvidia writing off anything other than dual SLI for games, well, it's not looking good.
This. And yet some people are touting that DX12 will spark more games to use multi-GPU. Multi-GPU is in the GPU makers' interest, so they can sell more GPUs. For game developers it doesn't make their sales any better; in fact, they're only creating more problems for themselves, because multi-GPU often brings in issues that didn't exist in single-GPU-only operation.
The Oxide Engine is actually quite interesting. Their rendering algorithm allows another way to split tasks between multiple GPUs. They did it in Mantle, but have not yet implemented it in DirectX 12. I've been told it's coming to a future engine version, though. Hopefully, it will be back-ported to the game, too. I'm planning on doing a follow-up animation to deep-dive their algorithm.
Do it in Vulkan and it will work across many different OSes and device markets, and not be as limited as DX12 is to mostly some Windows 10 ("Serf") PCs/laptops.
For me, the most interesting use case would be to put the now-idling iGPU, which is in 90% of gaming desktops, to work and have it render some clouds or whatever. If there is no iGPU, you can always offload the work to the dGPU or any extra CPU cores.
That may have a negative impact on your CPU cores. If the die gets too hot, you may find your CPU being throttled.
True
I disagree. DX12/Vulkan is the BEGINNING of multi-GPU, not the end.
Changing how things work is not simple, and certainly not fast. Game engines, such as Unreal Engine 4, will add support for things like SFR to make it easier for game developers.
SFR will trickle into a few games, and get more support every year. It's even conceivable that a PS5 and an "Xbox 2" will have multi-GPU setups, since by the time they are released (if they are) the software should have pretty good support for this.
It’s CHEAPER to use multiple, smaller cheaps, provided the software supports this.
Again though, saying DX12 is the “end” because you don’t see results already is shortsighted. It’s similar to assuming SteamOS is dead in the water because it’s not popular yet.
Both have long term strategies, though multi-GPU is the only certainty.
“chips” (not cheaps)
Why do many people think that SFR will solve the multi-GPU issue? SFR is not new; it has existed as long as AFR has. Looking at the Civ: BE results from using SFR in Mantle, it is obvious that neither game developers nor GPU makers have solved the issues surrounding SFR. Those issues alone make going multi-GPU not worth it with SFR.
I think that DX12 and Vulkan GPU load balancing has the most potential for innovation, with the larger gaming industry and its developers contributing to GPU load-balancing algorithms and other SDKs and middleware that help automate load balancing on multi-GPU PC/laptop/other systems. The proprietary CF/SLI solutions have always had limited development resources put towards better gaming support in CF/SLI-optimized games, while with Vulkan/DX12 and multi-GPU adapter support, game developers will have control over that aspect of the GPUs via a more standardized graphics API.
So it should not take much time for the games industry and developers to create game engine, middleware, and standard graphics API solutions that profile all the GPUs in use and allow for the most efficient utilization of any and all GPUs in a standardized, simple-to-implement way, provided the entire gaming industry, game developers, game engine developers, and the APIs (Vulkan, DX12) pool their resources to make the management of multi-GPU load balancing more standardized across the games/graphics software industry.
PCIe 4.0 is going to offer more inter-GPU bandwidth for making better use of the VRAM across more than one GPU, so that and maybe some latency improvements can be had. Maybe there needs to be more work towards standardizing the way that hardware/APIs and GPUs communicate with each other via PCIe, which could then be brought into the Vulkan/DX12 APIs so game developers can get at that functionality through the graphics APIs as well. AMD is using XDMA over PCIe, and Nvidia is using its SLI bridge.
Except most game developers are not interested in dealing with multi-GPU, which only a very small subset of PC gamers use.
With the caveat that Vulkan does not currently support multi-GPU memory sharing, so it is dead on arrival. They are working on implementing it now, but no games can be made with multi-GPU support until that work is done.
Yeah. The Khronos Group knows they're lacking in a few areas of multi-GPU support. It's a top priority for Vulkan Next.
It's in Vulkan but it needs to be improved, and it will be, as many will not be moving to Windows 10. And Vulkan will have a much larger install base across many more devices. The money will be there for Vulkan multi-GPU, with Valve and VR gaming supporting plenty of development for Vulkan and multi-GPU.
I have a 1200-watt Corsair power supply. Whenever I enable SLI and play BF4, my system shuts down. When SLI is disabled, BF4 works perfectly. I've tried a few games, but only BF4 does this, and of course it's my main game, so SLI is always disabled.
2x 780s, watercooled; temps are 24-26°C idle and 45°C under load. I've tried many different drivers, same result. Would the power supply be the cause of the instant shutdowns? The CPU is overclocked, but I've tried it at default, same thing.
Sounds like it could be a power issue, but those types of issues are very tricky to track down. Does your board have an aux PCIe power connector, and do you have a power lead from your PSU connected to it?
You could try a new PSU and see if it helps. If problem still occurs, it may point to a board power-related issue as well.
Good luck…
Thanks Morry, I have an EVGA power booster in now. X99 ASUS Deluxe board, no PCIe power. Same problem. I thought maybe an outlet problem; switched to a different room, same problem. Reseated the cards, same problem. Switched power connectors to different GPUs, same problem. If I have power surge protection enabled in the BIOS, that's the error I get, so I disabled the power surge option in the BIOS, and instead of an error it just shuts down. Really weird problem.
Excellent visuals and descriptions. Please do more videos like this.
+1 – Can’t say it better.
Thanks! : D Best way to see more is to share. As you can guess, it takes a LOT of effort, so our targets are pretty high. We were swinging for the fences.
+1, amazing stuff,
I enjoyed this as well. Keep ’em coming.
I loved this! If you guys made these regularly I think you would see your YouTube subs grow exponentially.
Really enjoyed this format, brings something different.
I also hope you can use animations like this for more in-depth tech rundowns in the future.
That is our hope! We definitely need everyone’s help to get the word out because the animations take a ton of time and effort!
The best way to help us bring you more animations is to share and/or donate to our Patreon :).
It would be awesome if multi-GPU support across architectures became widespread. I could make use of that four-year-old GPU on the shelf at home and not have to save up for a GPU matching the one currently in my gaming PC. It seems to me that having mixed-architecture multi-GPU support would drive used GPU sales up and new GPU sales down some.
But that is if it ever takes off.
Scott, the graphics animation/explanation is AWESOME! What program/set of programs did you use for that?
And yes, that was excellent content in my opinion. Thank you.
Thanks!
It was almost entirely done in Blender, although stitching the frames together and a bit of post-processing was done in Adobe After Effects CC. A couple of textures (i.e., the PCIe pins) were done in Photoshop CC, and the audio was edited in Audition CC.
But yeah, it was like 90% Blender with Cycles.
Thank you so much man.
Great video. Pretty good at explaining what DX12/Vulkan is as well.
Personally I think multi-GPU is the way forward. Not many people can realistically warrant spending £700+ on a single card purchase. Developers, however, are constantly pushing the envelope when it comes to how their product looks and what effects are used. That leaves us, the end users, either scaling back the settings to make the game playable and missing out on all the latest wizardry, or turning all the latest effects up to max and trying to play at sub-30 fps. HairWorks is a prime example: I would love to enable it, but doing so makes the game a bit too sluggish. The sweet spot is just where AMD is positioning itself with the 480. Affordable to most people who actually do game, and come Christmas or a birthday they can add another to significantly increase their enjoyment of the latest titles. We all see the latest releases being shown off by their very proud devs, how stunning they look, so smooth... Problem is, the system they highlight it on is way out of reach of your typical non-lottery-winning PC gamer.
Fantastic video, please do more!
Great work, hope you get more support.
Thanks! We'd want to keep doing this indefinitely if it's sustainable!
I feel that this was an excellent way of portraying this message! I would definitely like to see more video shorts like this in the future.
Thanks!
My impression is that only the biggest AAA developers will be able to put resources towards supporting multi-GPU setups in their DX12/Vulkan engines in the future, and even then most will only support 2 GPUs at most.
3- or 4-GPU systems aren't widespread enough in the PC gaming user base to justify the effort and money needed to make it work.
Mid- and small-sized (and especially indie) studios will have to rely on pre-made solutions from MS or Khronos in the API itself, or support from the big 3 (Unreal Engine, Unity, CryEngine) engine developers.
It doesn't matter what PC gamers use; PC gaming is dying, it's a minority of gamers. The thing that will make multi-GPU a reality is the PS5/phones.
The PS4 Pro already uses a CPU that doubles the cores... but yeah, if the next consoles don't use multi-GPU it will probably be dead.
I'm planning on buying two 980 Tis or Titan X Maxwells to run CRTs for the next 10-15 years. Both perform the same, but one is 6GB vs 12GB. I hope someone works out how to do multi-GPU without wasting memory so I can get the 980 Tis instead, because they are 70% cheaper and I wouldn't be wasting VRAM. But after reading this, I doubt it will be possible in mainstream games; the only way to not waste doubled GPU RAM is if you're rendering two different viewports. I guess I'll save up for the Titan X Maxwells.