Multi-GPU graphics and Conclusions

PCPER:  What are your thoughts on the current climate of multi-GPU systems?  Do you see that as a real benefit and do you think developers are able to take advantage of those kinds of hardware configurations easily enough?

CARMACK: From a developer standpoint the uncomfortable truth is that the console capabilities really dominate the development decisions today.  If you look at current titles and how they’ve done on the console, you know, high end action GPU based things, the consoles are so much the dominant factor that it’s difficult to set things up so that you can do much to leverage the really extreme high end desktop settings.  Traditionally you get more resolution, where a console game might be designed for 720p and on the high end PC you go ahead and run at 1080p or even higher resolution; that’s an obvious thing.  You crank up the resolution.  You turn off compression when you have 1GB of video memory available.  And also normally you can go from a 30 Hz console game to a 60 Hz PC game.  So there are a number of things you can crank up there on the PC, but it’s difficult to justify any radically different algorithm, something you would really do with the 4x the power you’d have with a high end PC system.

John Carmack on id Tech 6, Ray Tracing, Consoles, Physics and more
Doom 3 / id Tech 4, 2004

PCPER:  Do you think NVIDIA and AMD are relying too heavily on the multi-GPU technology instead of pushing forward with true next-generation GPUs?  Will multi-GPU systems continue to be an option at all?

CARMACK: I’ve always been a big proponent of these high end boutique systems – way back from the early days of 3dfx I always thought it was a real feather in their cap early on that you could pay more money and have a bigger system and have it double up and just go faster.  I think it’s a really good option and certainly companies like NVIDIA and AMD are throwing all the resources they possibly can at making the newer, next-generation cards.  But to be able to have this ability to just pay more money and get more performance out of a current generation is a really useful thing to have.  Whether it makes sense for gaming to have these thousand dollar graphics cards is quite debatable, but it’s really good for developers; to be able to target something high end that’s going to come out three years from now by being able to pay more money today for 2x more power.  Certainly the whole high end simulation business has benefited a lot from the commoditization of scalable graphics.


On the downside, it was clear that years back, when graphics engines were taking a fairly simple algorithmic approach and you just rendered to your frame buffer, it was easy for them to go ahead and chunk that frame buffer up into an arbitrary number of pieces.  But now there is much tighter coupling between the graphics renderer and the GPUs, where there are all sorts of feedbacks, rendering to sub-buffers, going back and forth, getting dependent conditional query results, and it makes it a lot harder to just chunk the problem up like that.  But that’s the whole tale of multi-processing since the very beginning; we’re fighting that with multiple CPUs.  It’s the primary issue with advancing performance in computing.
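[Editor's note: the scaling problem Carmack describes is essentially Amdahl's law applied to a frame's GPU time budget.  This hypothetical sketch (illustrative numbers, not id Software code) models a frame as a list of render passes, where passes that only rasterize into a frame buffer split cleanly across GPUs, while passes with dependent feedback serialize.]

```python
def frame_time(pass_times, splittable, n_gpus):
    """Estimate frame time (ms) on n_gpus.

    pass_times: GPU cost of each render pass, in milliseconds.
    splittable: parallel flags; True if the pass can be chunked across
                GPUs (plain rasterization into a frame buffer), False if
                it serializes (e.g. a dependent conditional query readback).
    """
    total = 0.0
    for cost, can_split in zip(pass_times, splittable):
        total += cost / n_gpus if can_split else cost
    return total

# A "dumb frame buffer" workload: everything splits, near-linear 4x scaling.
simple = frame_time([30.0], [True], n_gpus=4)              # 7.5 ms

# A modern frame: 24 ms of splittable shading plus 6 ms of serialized
# dependent work; overall speedup collapses well below 4x.
modern = frame_time([24.0, 6.0], [True, False], n_gpus=4)  # 12.0 ms
```

With even 20% of the frame serialized by cross-GPU dependencies, four GPUs deliver only a 2.5x speedup in this toy model, which is the driver writers' problem in a nutshell.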

Quake 4 / id Tech 4, 2005


That is my big takeaway message for a lot of people about the upcoming generation of general purpose computation on GPUs; a lot of people don’t seem to really appreciate how the vertex fragment rasterization approach to computer graphics has been unquestionably the most successful multi-processing solution ever.  If you look back over 40 years of research and what people have done on trying to use multiple processors to solve problems, the fact that we can do so much so easily with the vertex fragment model is a real testament to its value.  A lot of people just think “oh of course I want more flexibility, I’d love to have multiple CPUs doing all these different things” and there’s a lot of people that don’t really appreciate what the suffering is going to be like as we move through that; and that’s certainly going on right now as software tries to move things over, and it’s not “oh, just thread your application”.  Anyone that says that is basically an idiot, not appreciating the problems.  There are depths of subtlety to all of this where it’s been an ivory tower research project since the very beginning and it’s by no means solved.
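[Editor's note: the property that makes the vertex/fragment model such a successful multi-processing solution is that each fragment is a pure function of its inputs.  This hypothetical toy (the shade function and sizes are made up for illustration) shows why that allows chunking the work across any number of processors with no locks and no ordering concerns.]

```python
def shade(x, y):
    """A toy 'fragment shader': a pure function of pixel coordinates,
    touching no shared state."""
    return (x * 31 + y * 17) % 256

def render(width, height, workers=1):
    """Shade every pixel, splitting rows into 'workers' chunks.
    Because shade() is independent per pixel, each chunk could run on
    its own processor, and any partitioning yields the identical image."""
    image = [[0] * width for _ in range(height)]
    chunk = max(1, height // workers)
    for start in range(0, height, chunk):   # one chunk per (virtual) processor
        for y in range(start, min(start + chunk, height)):
            for x in range(width):
                image[y][x] = shade(x, y)
    return image

# The partitioning is invisible in the result -- no synchronization needed.
assert render(8, 8, workers=1) == render(8, 8, workers=4)
```

General CPU threading offers no such guarantee: arbitrary code shares state, so correctness depends on synchronization, which is exactly the "suffering" being described.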


PCPER: NVIDIA and AMD driver teams have to hack up games to get them to work optimally on multi-GPU systems, and that’s more difficult for them today than in the past.  Do you think developers’ dependence on the console market, which is solely single-GPU today, is a cause of those headaches?

CARMACK: It’s probably making it even harder for the PC card guys, because as developers get more sophisticated with the low level access we get on the consoles, the rendering engines are becoming harder to, kind of behind our backs, automatically split across multiple GPUs.  We are doing more sophisticated things on the single GPU – there is a lot more data transfer going back and forth, updated states that have to be replicated across multiple GPUs, dependent sections of the screen doing different things.  It’s still possible but it’s kind of a hairy job, and I definitely don’t envy those driver writers or their task at all.


PCPER: Any thoughts on the 3-4 GPU systems from AMD and NVIDIA?  Overkill?

CARMACK: For many applications, for the class of apps that just treat something like a dumb frame buffer, they really will go ahead and be 4x faster, especially if you’re just trying to run at 4x the resolution on there; that’s easy.  There is no doubt that if you take a game that’s playing at the frame rate you want at a certain resolution, a 4 GPU solution will usually be able to go ahead and render 4x the pixels, or very close to linear scaling.

Rage / id Tech 5, 2007


But what it’s unlikely to do is take a game that’s running 20 FPS at a given nominal resolution and make that game run 60 FPS.  You’re likely bound up by things that aren’t raw GPU throughput, usually CPU throughput in the game.
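[Editor's note: a back-of-the-envelope model of this point, with illustrative numbers.  When CPU and GPU work overlap each frame, the slower side sets the frame rate, so extra GPUs only help when the GPU is the bottleneck.]

```python
def fps(cpu_ms, gpu_ms, n_gpus):
    """Frame rate when CPU and GPU work run concurrently: the frame
    takes as long as the slower of the two, and only the GPU portion
    is divided across GPUs."""
    frame_ms = max(cpu_ms, gpu_ms / n_gpus)
    return 1000.0 / frame_ms

# GPU-bound game at 20 FPS: four GPUs scale it dramatically.
print(fps(cpu_ms=10.0, gpu_ms=50.0, n_gpus=1))  # 20.0 FPS
print(fps(cpu_ms=10.0, gpu_ms=50.0, n_gpus=4))  # 80.0 FPS

# CPU-bound game at 20 FPS: four GPUs barely move the needle.
print(fps(cpu_ms=40.0, gpu_ms=50.0, n_gpus=1))  # 20.0 FPS
print(fps(cpu_ms=40.0, gpu_ms=50.0, n_gpus=4))  # 25.0 FPS
```

In the second case the game tops out at 25 FPS no matter how many GPUs are added, because 40 ms of CPU work per frame caps it there.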


PCPER: You’ve had choice words for what AGEIA was trying to do with the hardware physics add-in cards.  Now that they are off the scene, having been purchased by NVIDIA, what are your thoughts on that past situation?

CARMACK: That was one of those things where it was a stupid plan from the start, and I really hope NVIDIA didn’t pay too much, because I found the whole thing disingenuous.  Many people from the very beginning said their entire business strategy was to be acquired, because it should have been obvious to everybody that the market for an add-in physics card was just not there.  And the market proved not to be there.  The whole thing about setting up a company and essentially lying to consumers, that this is a good idea, in order to cash out and be bought out by a big company, I saw the whole thing as pretty distasteful.  It’s obvious, and we knew when AGEIA was starting, that a few generations down the road we would have these general purpose compute resources on the GPU.  And what we have right now are things like CUDA that you can implement physics on; you can’t mix and match it very well right now, with such a heavyweight systems change, but that’s going to be getting better in future revisions.  And eventually you will be using a common set of resources that can run general data parallel stuff versus very high efficiency rasterization work.  As for the PhysX hardware, while they would have a little bit of talk about how their architecture was somehow much better suited for physics processing, and it might have been somewhat better suited for it, they never told anyone how or why.


PCPER: Do you think moving physics to a GPU is a benefit?

CARMACK: Right now, to offload tasks like that you have to be able to go ahead and stick them in a pretty deep pipeline, so it doesn’t fit the way people do physics modeling in their games very well right now.  But as people choose either to change their architecture to allow a frame of latency in the reports of collision detection in physics, or we get much finer grained parallelization where you don’t have this really long latency and you can kind of force an immediate mode call to GPU operations, then we start using that just the way we do SSE instructions or something in our current code base.  Then, yeah, we definitely will wind up using compute resources for things like that, for collision detection physics.
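[Editor's note: a hypothetical sketch of the "frame of latency" restructuring described above (the class and names are invented for illustration).  Collision queries submitted to an accelerator this frame are only read back next frame, so gameplay code consumes results that are one frame stale instead of stalling on the deep pipeline.]

```python
class PipelinedCollisionQueries:
    """Double-buffered collision query results: write into one buffer
    while the game reads the buffer filled during the previous frame."""

    def __init__(self):
        self._in_flight = None   # results being produced this frame
        self._ready = None       # results the game may read (one frame old)

    def submit(self, queries, solve):
        """Kick off this frame's batch; 'solve' stands in for the
        offloaded (GPU/accelerator) collision test."""
        self._in_flight = [solve(q) for q in queries]

    def end_frame(self):
        """Flip buffers: this frame's results become next frame's reads."""
        self._ready, self._in_flight = self._in_flight, None

    def results(self):
        """Results visible to gameplay code: always one frame stale."""
        return self._ready

# Toy usage: 'collide' tests whether an object's height is below the floor.
collide = lambda y: y < 0
pipe = PipelinedCollisionQueries()

pipe.submit([1.0, -2.0], collide)   # frame 0: queries issued, nothing ready
assert pipe.results() is None
pipe.end_frame()

pipe.submit([-1.0], collide)        # frame 1: frame 0's answers now arrive
assert pipe.results() == [False, True]
```

The design cost is visible in the usage: gameplay logic has to tolerate acting on last frame's collision answers, which is exactly the architectural change being described.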


PCPER: NVIDIA has Novodex, Intel has Havok — will that cause fragmentation in the market?  Do you think Microsoft would combine them into a physics API like they did DirectX?

CARMACK: It will be interesting to see how that plays out, because while I was well known for having certain issues with Microsoft on the graphics API side of things, I really think Microsoft did the industry a good favor by eventually getting to the DX9 class of stuff, having a very intelligent standard that everyone was forced to abide by.  And it was a good thing.  But of course I have worries as we look towards this general compute topic, because if MS took 9 tries to get it right…well.  They probably have some accumulated wisdom about that whole process now, but there is always a chance for MS to sort of overstep their actual experience and lay down a standard that’s no good.  Their standards almost always evolve into something good… it would be wonderful if they got it right on the first step of DX compute, or whatever it’s going to be.  I wouldn’t hold my breath on that, because really all of this is still research.  With graphics we were really, for the larger part, following the SGI model for a long time, and that gave the industry a real leg up.  Right now this comes back to the earlier point: everybody’s still waving their hands about what wonderful stuff we’re going to do here, but we really don’t have the examples, let alone the applications.  So it’s sort of a dangerous time to go in and start making specific standards when there’s not actually all that much of an experience base.

Rage / id Tech 5, 2007


As far as the physics APIs, I do expect that for any API to wind up getting broad game developer support, whether it’s going to be Novodex or Havok, they are going to have to have backends that at least function using any acceleration technology available.  It’ll just be a matter of Intel obviously not trying to make a CUDA implementation very fast, but someone will wind up having a CUDA implementation for it that is at least plug-compatible.  Maybe NVIDIA will end up having wrappers for their APIs to do that.  But that is just kind of the reality with today’s development; unless you are a Microsoft tech developer or something that’s tied to the Xbox 360 platform, developers aren’t going to make a choice where “well, we’re going to use Intel’s stuff and not run on the current console generation” or something.  That’s just not going to happen.