DX11 could rival Mantle
During GDC and after the DX12 reveal, we sat down with NVIDIA’s Tony Tamasi to talk about how DX11 can rival Mantle with efficiency improvements.
The big story at GDC last week was Microsoft’s reveal of DirectX 12 and the future of the dominant API for PC gaming. There was plenty of build-up to the announcement, with Microsoft’s DirectX team posting teasers and starting up a Twitter account for the occasion. I hosted a live blog from the event which included pictures of the slides. It was our most successful event of this type, with literally thousands of people joining in the conversation. Along with the debates over the similarities to AMD’s Mantle API and the timeline for the DX12 release, there are plenty of stories to be told.
After the initial session, I wanted to set up meetings with both AMD and NVIDIA to discuss what had been shown and get some feedback on the GPU giants’ planned directions for implementation. NVIDIA presented us with a very interesting set of data that focused both on the future with DX12 and on the present of DirectX 11.
The reason for the topic is easy to decipher – AMD has built up the image of Mantle as the future of PC gaming and, with a full 18 months before Microsoft’s DirectX 12 is released, how developers and gamers respond will have an important impact on the market. NVIDIA doesn’t like to talk about Mantle directly, but it obviously feels the need to address the questions in a roundabout fashion. During our time with NVIDIA’s Tony Tamasi at GDC, the discussion centered as much on OpenGL and DirectX 11 as anything else.
What are APIs and why do you care?
For those that might not really understand what DirectX and OpenGL are, a bit of background first. APIs (application programming interfaces) are responsible for providing an abstraction layer between hardware and software applications. An API can deliver a consistent programming model (though the language can vary) and do so across various hardware vendors’ products and even between hardware generations. APIs expose hardware feature sets of widely varying complexity while letting developers use the hardware without necessarily knowing it in great detail.
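To make that abstraction concrete, here is a minimal illustrative sketch (not from NVIDIA’s material) of a Direct3D 11 draw routine. The shaders, input layout, device context, and buffers are assumed to have been created and bound elsewhere; the point is that the game only ever issues API calls like these, and the vendor’s driver translates them into hardware-specific commands.

```cpp
#include <d3d11.h>

// Minimal sketch: the game sees only the API, never the GPU directly.
// The driver turns each call into vendor- and generation-specific commands,
// which is why the same code runs on NVIDIA, AMD, or Intel hardware.
void DrawMesh(ID3D11DeviceContext* context,
              ID3D11Buffer* vertexBuffer,
              ID3D11Buffer* indexBuffer,
              UINT indexCount)
{
    // Hypothetical vertex layout: position + normal + texcoord = 8 floats.
    UINT stride = sizeof(float) * 8;
    UINT offset = 0;

    context->IASetVertexBuffers(0, 1, &vertexBuffer, &stride, &offset);
    context->IASetIndexBuffer(indexBuffer, DXGI_FORMAT_R32_UINT, 0);
    context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    context->DrawIndexed(indexCount, 0, 0);
}
```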
Over the years, APIs have developed and evolved but still retain backwards compatibility. Companies like NVIDIA and AMD can improve DirectX implementations to increase performance or efficiency without adversely (usually at least) affecting other games or applications. And because the games use that same API for programming, changes to how NVIDIA/AMD handle the API integration don’t require game developer intervention.
With the release of AMD Mantle, the idea of a “low level” API has been placed in the minds of gamers and developers. The term “low level” can mean many things, but in general it is associated with an API that is more direct, has a thinner set of abstraction layers, and uses less translation from code to hardware. The goal is to reduce the amount of overhead (performance hit) that APIs naturally impose for these translations. With additional performance available, the CPU cycles can be used by the program (game) or left idle to improve battery life. In certain cases, GPU throughput can increase where the API overhead is impeding the video card's progress.
Passing additional control to the game developers, away from the API or GPU driver developers, gives those coders additional power and improves the ability of some vendors to differentiate. Interestingly, not all developers want this kind of control, as it requires more time and more development work, and small teams that depend on the abstraction to make coding easier will see only limited performance advantages.
This transition to a lower level API is being driven by the widening performance gap between CPUs and GPUs. NVIDIA provided the images below.
On the left we see performance scaling in terms of GFLOPS and on the right the metric is memory bandwidth. Clearly the performance of NVIDIA's graphics chips (as well as AMD’s) has far outpaced what the best Intel desktop processors have been able to deliver, and that gap means the industry needs to innovate to find ways to close it.
Even with that huge disparity, there aren't really that many cases that are ripe for performance improvement from CPU efficiency increases. NVIDIA showed us the graphic above with performance changes when scaling a modern Intel Core i7 processor from 2.5 GHz to 3.3 GHz. The 3DMark and AvP benchmarks don't scale at all, Battlefield 3 scales up to 3% with the GTX Titan, Bioshock Infinite scales across the board up to 5%, and Metro: Last Light is the standout with an odd 10%+ change on the HD 7970 GHz Edition.
NVIDIA doesn’t deny that a lower level API is beneficial or needed for PC gaming. It does, however, think that the methodology of AMD’s Mantle is the wrong way to go. Fragmenting the market into additional segments with a proprietary API does not maintain the benefits of hardware abstractions or “cross vendor support”. I realize that many readers will see some irony in this statement considering many in the industry would point to CUDA, PhysX, 3D Vision and others as NVIDIA’s own proprietary feature sets.
NVIDIA’s API Strategy
Obviously NVIDIA is going to support DirectX 12 and continues to support the latest updates to OpenGL. You’ll find DX12 support on Fermi, Kepler and Maxwell parts (in addition to whatever is coming next) and NVIDIA says they have been working with Microsoft since the beginning on the new API. The exact timeline of that and what constitutes “working with” is also up for debate, but that is mostly irrelevant for our discussion.
What NVIDIA did want to focus on with us was the significant improvements that have been made to the efficiency and performance of DirectX 11. When NVIDIA is questioned as to why it didn’t create its own Mantle-like API if Microsoft was dragging its feet, it points to the vast improvements that are possible, and have already been made, with existing APIs like DX11 and OpenGL. The idea is that rather than spend resources on creating a completely new API that needs to be integrated as a totally unique engine port (see Frostbite, CryEngine, etc.), NVIDIA has instead improved the performance, scaling, and predictability of DirectX 11.
This graphic, provided by NVIDIA of course, shows 9 specific Direct3D 11 functions. Efficiency in this case is measured as the speed of each call on a GTX 780 Ti across three progressively newer driver versions (in green), compared against the AMD R9 290X (in red). The Draw, SetIndexBuffer and SetVertexBuffers functions have seen performance improvements of several hundred percent between the R331 driver stack and an as-yet-unreleased driver due out in the next couple of weeks.
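To see why the speed of these particular calls matters, consider a hypothetical render loop (my own sketch, not NVIDIA-provided): with a few thousand objects in a scene, SetVertexBuffers, SetIndexBuffer and Draw are each issued thousands of times per frame, so shaving driver time off every call adds up quickly.

```cpp
#include <d3d11.h>
#include <vector>

// Hypothetical per-object data; buffers are assumed to be created elsewhere.
struct Mesh
{
    ID3D11Buffer* vertexBuffer;
    ID3D11Buffer* indexBuffer;
    UINT          vertexStride;
    UINT          indexCount;
};

// With ~5,000 meshes this loop makes ~15,000 D3D11 calls per frame, or
// roughly 900,000 per second at 60 FPS, exactly the kind of CPU-side work
// that per-call driver efficiency improvements target.
void RenderScene(ID3D11DeviceContext* context, const std::vector<Mesh>& meshes)
{
    for (const Mesh& mesh : meshes)
    {
        UINT offset = 0;
        context->IASetVertexBuffers(0, 1, &mesh.vertexBuffer,
                                    &mesh.vertexStride, &offset);
        context->IASetIndexBuffer(mesh.indexBuffer, DXGI_FORMAT_R16_UINT, 0);
        context->DrawIndexed(mesh.indexCount, 0, 0);
    }
}
```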
These were obviously hand selected by NVIDIA so there may be others that show dramatically worse results, but it is clear that NVIDIA has been working to improve the efficiency of DX11. NVIDIA claims that these fixes are not game specific and will improve performance and efficiency for a lot of GeForce users. Even if that is the case, we will only really see these improvements surface in titles that have addressable CPU limits or very low end hardware, similar to how Mantle works today.
NVIDIA shows this another way by also including AMD Mantle. Using the Star Swarm demo, built specifically for Mantle evaluation, NVIDIA’s GTX 780 Ti with progressive driver releases sees a significant shift in relation to AMD. Let’s focus just on the D3D11 results – the first AMD R9 290X score and then the successive NVIDIA results. Out of the gate, the GTX 780 Ti is faster than the 290X even using the R331 driver. If you move forward to the R334 and the unreleased driver, you see improvements of 57%, pushing NVIDIA’s card much higher than the R9 290X using DX11.
If you include Mantle in the picture, it improves performance on the R9 290X by 87% – a HUGE amount! That result was able to push the StarSwarm performance past that of the GTX 780 Ti with the R331 and R334 drivers but isn’t enough to stay in front of the upcoming release.
Thief, the latest Mantle-based game release, shows a similar story; an advantage for AMD (using driver version 14.2) over the GTX 780 Ti with R331 and R334, but NVIDIA’s card taking the lead (albeit by a small percentage) with the upcoming driver.
If you followed the panels at GDC at all, you might have seen one about OpenGL speed improvements as well. This talk was hosted by NVIDIA, AMD and Intel, and all involved openly bragged about the extension-based changes to the API that have increased efficiency in a similar way to what NVIDIA has done with DX11. Even though OpenGL often gets a bad reputation for being outdated and bulky, the changes have added support for bindless textures, texture arrays, shader storage buffer objects, and commonly discussed DirectX features like tessellation, compute shaders, etc.
Add to that the extreme portability of OpenGL across mobile devices, Windows, Linux, Mac, and even SteamOS, and NVIDIA says its commitment to the open standard is stronger than ever.
The Effect of DirectX 12
As we discussed in our live blog, the benefits of the upcoming DX12 implementation will come in two distinct parts: performance improvements for existing hardware and feature additions for upcoming hardware. Microsoft isn’t talking much about the new features it will offer and is instead focused on the efficiency improvements. These include reductions in submission overhead, improved scalability on multi-core systems, and the ability to mimic a console-style execution environment. All of this gives the developer more power to handle and manipulate the hardware directly.
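DX12 itself was not public at the time, but Direct3D 11’s deferred contexts offer a rough, much less efficient analogue of the multi-core submission model being described: worker threads record command lists in parallel and the main thread replays them. The sketch below is illustrative only; the device, the immediate context and the actual per-thread draw recording are assumed to exist elsewhere.

```cpp
#include <d3d11.h>
#include <thread>
#include <vector>

// Rough analogue of DX12-style multi-threaded submission, using D3D11
// deferred contexts. DX12 formalizes this pattern with far less driver
// overhead and without funneling everything through one immediate context.
void RecordInParallel(ID3D11Device* device,
                      ID3D11DeviceContext* immediateContext,
                      int workerCount)
{
    std::vector<ID3D11CommandList*> commandLists(workerCount, nullptr);
    std::vector<std::thread> workers;

    for (int i = 0; i < workerCount; ++i)
    {
        workers.emplace_back([&, i]()
        {
            ID3D11DeviceContext* deferred = nullptr;
            device->CreateDeferredContext(0, &deferred);

            // ... record this thread's share of the scene on 'deferred' ...

            deferred->FinishCommandList(FALSE, &commandLists[i]);
            deferred->Release();
        });
    }
    for (std::thread& t : workers)
        t.join();

    // In D3D11 submission still serializes through the immediate context;
    // removing that serialization is part of what DX12 promises.
    for (ID3D11CommandList* cl : commandLists)
    {
        immediateContext->ExecuteCommandList(cl, FALSE);
        cl->Release();
    }
}
```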
NVIDIA claims that work on DX12 with Microsoft “began more than four years ago with discussions about reducing resource overhead. For the past year, NVIDIA has been working closely with the DirectX team to deliver a working design and implementation of DX12 at GDC.” This would indicate that while general ideas about what would be in the next version of DX had been circulating for years, the specific effort to build and prepare it started last spring.
NVIDIA is currently the only GPU vendor to have a DX12 capable driver in the hands of developers and the demo that Microsoft showed at GDC was running on a GeForce GTX TITAN BLACK card. (UPDATE: I was told that actually Intel has a DX12 capable driver available as well leaving AMD as the only major vendor without.)
Will NVIDIA feel heat from Mantle?
Though it doesn’t want to admit it, NVIDIA is clearly feeling some pressure from gamers and media due to AMD’s homemade Mantle API. The company’s stance is to wait for DirectX 12 to be released and level the playing field with an industry standard rather than proceed down the pathway of another custom option. In the next 18 months, though, there will be quite a few games released with Mantle support, using the Frostbite engine or CryEngine. How well those games are built, and how much of an advantage the Mantle code path offers, will determine whether gamers respond positively to Radeon cards. NVIDIA, on the other hand, will be focusing its reticle on improving the efficiency and performance of DirectX 11 in its own driver stack in an attempt to raise CPU efficiency (and thus overall performance) to levels that rival Mantle.
During a handful of conversations with NVIDIA on DirectX and Mantle, there was a tone from some that stopped short of anger but hinted at annoyance. It’s possible, judging by NVIDIA’s performance improvements in DX11 efficiency shown here, that AMD could have accomplished the same thing without pushing a new API ahead of DirectX 12. Questions about the division of internal resources on the AMD software team between Mantle and DirectX development are often murmured, as are the motives of the developers continuing to adopt Mantle today. Finding the answers to such questions is a fruitless endeavor, though, and to speculate seems useless – for now.
AMD has done well with Mantle. Whether the company intended for the new API to become a new standard, or merely to force Microsoft's hand with DirectX 12, it is thanks to AMD that we are even talking about efficiency with such passion. Obviously AMD hopes it can get some financial benefit from the time and money spent on the project, with improved market share and better mindshare with gamers on the PC. The number and quality of games that are released in 2014 (and some of 2015) will be the determining factor for that.
Over the next year and a half, NVIDIA will need to prove its case that DirectX 11 can be just as efficient as what AMD has done with Mantle, or at the very least that the performance deltas between the two options are small enough not to base purchasing decisions on. I do believe that upon the release of DX12, the playing field will level once again and development on Mantle will come to a close; that is, at least if Microsoft keeps its promises.
Star Swarm is horrible
Star Swarm is a horrible engine. Not only is its multithreading done so badly that the overhead is killing it (in the non-graphics computation alone), but its coding is so bad that it makes several thousand identical calls to the functions that govern basic vertex processing. Those functions are cheap on their own (from traces they appear to be around a few dozen instructions, so most of their cost is entry/exit processing, i.e. stack manipulation).
That’s in contrast to engines like Civilization V, where it is at worst several hundred calls.
Furthermore, it issues too many batches per frame, meaning the overhead is bigger than the cost of the main computation. (Almost as if the authors don’t know anything.)
To test the effect, one can tweak TargetJobSize in Civilization V and see how it works. (The standard setting is 100, but you can get better performance on, e.g., a Titan by setting it to 1000.)
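To illustrate the point about identical calls, here is a small hypothetical sketch (names invented for this example) of the kind of engine-side caching that keeps redundant state-setting calls from ever reaching the driver:

```cpp
#include <d3d11.h>

// Hypothetical engine-side cache: skip state-setting calls whose argument
// matches what is already bound, so thousands of identical calls never hit
// the driver in the first place.
struct IndexBufferCache
{
    ID3D11Buffer* bound = nullptr;

    void Set(ID3D11DeviceContext* context, ID3D11Buffer* buffer)
    {
        if (buffer == bound)
            return;                                 // identical call, drop it
        context->IASetIndexBuffer(buffer, DXGI_FORMAT_R16_UINT, 0);
        bound = buffer;
    }
};
```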
==
Frankly, programmers still don’t know how to use DX11 well. As for Mantle, it is a proprietary fix for AMD’s drivers that pushes the work onto game engines. That’s the only reason Mantle gets support: it is the only way to get the most out of AMD’s cards, because AMD can’t, or refuses to, fully support DX11’s features.
And considering AMD lied about DX 12, I posit AMD intentionally cripples their DX 11 drivers to force adoption of Mantle.
AMD supports all DX11.2
AMD supports all DX11.2 features.
GCN 1.0 = DX11.1 w/ DX11.2 features
GCN 1.1 = DX11.2
Nvidia supports DX11 only, plus two game-specific features from DX11.2. No Nvidia chip, not even the Maxwell 750/Ti, supports all DX11.2 features.
Fermi = DX11.0
Kepler = DX11.0
Maxwell = DX11.0
That’s so funny. Nvidia &
That’s so funny. Nvidia & fanboys are now coming out & saying DX12 matters, when all these years they (Nvidia & fanboys) couldn’t have cared less about DX11.2. That’s hilarious; why does DX12 suddenly matter so much now?
Nvidia cards supported the 3D
Nvidia cards supported the 3D side of 11.2, which is what gets used in gaming if I remember right; they just didn’t support the 2D side for desktop use, hence why they can’t be called 11.2 cards. The additions in 11.2 really meant nothing to 99.9% of people.
This is one big-o-shit
This is one big-o-shit piece of tech PR. Of course, you can’t build that without some truth in it. Beyond that, it is a black art.
So Nvidia is simply saying that adapting the driver stack to a specific game with some kind of home-cooked magic is better than developing the game for an API (where the API itself manages things, while Nvidia redoes a game-specific driver).
Nicely done then. As an indie dev, I should now negotiate with Nvidia to include my code path in their new driver. Otherwise I probably won’t get this enhanced / calibrated / curve-fitted driver magic, lol.
If there is something worse than a proprietary API, it is the game-specific driver, over which you have almost no control.
Bingo
Bingo
Clearly game devs don’t speak
Clearly game devs don’t speak to the OpenGL or DX makers! Anti-cheat solutions cause most of the issues, with anti-cheat disabling DX features.
You want closer to the
You want closer to the metal? Go ask Steve Gibson (GRC), he codes in assembler. And now they’re making these changes while the fast Fourier transform went sparse in 2014. This alone is humongous; it affects everything. Android is probably changing from the fast Fourier transform to the latest 2014 sparse fast Fourier transform already. So I’m sorry, but the timetable for DX is so slow, I don’t see the benefit. They have supposedly been at this for 4 years; if that’s true, MS never took the sparse fast Fourier transform into account. So users would still be stuck with the old fast Fourier transform?? Hell no!
Bottom line is, if I get AMD
Bottom line is, if I get an AMD graphics card I have the best of both worlds: over 20 games with considerably enhanced performance, and DX12 compatibility for games in 2-3 years (if Microsoft delivers). What do I have to lose? Nothing.
AMD got nvidia by the balls
If what Nvidia is saying here is true, that they can drastically increase performance in games, then why did they wait until now to do it?
Options:
1. They are in bed with Microsoft and Intel to hold performance back and sell more.
2. They are lying.
3. To increase performance this way, they need a lot of money and resources, which means they have to pick the game during development and work with the devs, which means you will get a couple of games a year, for a couple of years, before they stop.
In the end, thanks to AMD, PC gaming is finally evolving.
Might want to rethink that.
Might want to rethink that. Mantle was the biggest edge AMD HAD over Nvidia. Now with DX12 doing the same thing, that edge has just disappeared. And the edge AMD had with Mantle was still kind of small; some games gave AMD a 10% advantage, but in other games, like Thief for example, the 290X only had about a 5% advantage over my GTX 780 (no, not the 780 Ti): 73 fps (AMD) vs 69 fps (GTX 780), and the 73 number was off AMD’s own blog.
http://community.amd.com/community/amd-blogs/amd-gaming/blog/2014/03/17/amd-catalyst-143-beta-the-ultimate-driver-for-thief
Agreed. Nvidia is floundering
Agreed. Nvidia is floundering with empty PR propaganda. There’s evidence emerging that Nvidia only started any real commitment to DX12 around the time they saw the documents stolen from AMD regarding their long term internal plans.
The 4 year PR spin is simply about a couple people in a conversation talking to MS about something that they should think about doing. AMD also had those talks, and when MS had other priorities they went ahead with their own solution, backed by developers, also keeping MS in the loop (after all, AMD has been working very closely with Microsoft on console development 😉 ). Mantle. This is why AMD said at the time there was no DX12, because at that time, there was no development started.
Once NV got ahold of those documents they had stolen, they went into panic mode, and we are now seeing their PR associated with it. An API doesn’t take 6 years to develop for a cash rich company like MS, who is determined to retain their gaming ecosystem (Steam OS also plays a role). AMD did it with a very limited budget in 2 years, with very real and promising results. And it’s on the market today for end users to enjoy. Make no mistake, what we are hearing from NV is nothing but propaganda and half truths. The 4 year development PR is flat out spin.
Never AMD again…poor
Never AMD again…poor drivers and/or hardware design.
Screen goes blank with just a tan color with vertical lines but audio continues.
Looks like Nvidia was lying
Looks like Nvidia was lying and knew about it.
http://techreport.com/news/26210/directx-12-will-also-add-new-features-for-next-gen-gpus
Nvidia knew Fermi, Kepler and Maxwell would not be fully DX12 compatible, but got all their fans thinking they would be with that slide at GDC.
Nvidia sure made a lot of people look like Nvidiots.
Yep. And AMD has FULL
Yep. And AMD has FULL compatibility with DX12. Not a surprise given that AMD and Microsoft worked closely on the Xbox One. Even Tahiti is more feature-advanced than NVidia’s new-fangled Maxwell. Reviewers gave glowing reviews of that piece of crap though. A new architecture that is basically the same as AMD’s 2+ year older architecture. Downclock an R7 265 and you’ll have very similar power consumption and perf/watt to NV’s newest architecture. There is nothing groundbreaking or even impressive about that pile of shit. Quite a pathetic showing on NV’s behalf. But as long as they can ‘convince’ reviewers to give glowing reviews, it’s all that matters it seems.
you would have to down clock
You would have to downclock that 265 quite a bit to get power usage down to Maxwell’s levels, so it would probably be at least 20-30% slower by the time you got there.
Maxwell, on top of being very low power, is also almost 2x faster per watt at mining. Three 750 Tis that only draw around 180 watts beat an R9 290X that sucks down 300 watts. Throw on top of that the 25-40% overclock you can get on the 750 Ti and it gets even worse, and that was done on a card without a 6-pin PCIe connector. The R9’s biggest issues haven’t just been power draw and questionable performance advertising; it’s also the heat it puts out.
Well yes, i would hope
Well yes, I would hope Maxwell would have at least some advantage over AMD’s nearly 3-year-old architecture. But when I can underclock and undervolt my R7 265 to all but match NV’s next-gen architecture, that’s a complete failure in my book. There is no technical merit to choosing a Maxwell over Tahiti, especially when Tahiti is still more feature-rich than NV’s latest.
Your mining comparison is of course flawed, misleading, and, well, plain ol’ FUD. Miners know it. Enthusiasts know it. Aftermarket R9 290Xs have as good, if not better, thermals than 780s, which come with more expensive stock coolers. Slap the same cooler on both, and thermals are equal. Everything else is just PR on NV’s and reviewers’ behalf.
I still think mantle will be
I still think Mantle will be better than DirectX. They should test a normal CPU like a 2500K or a 3670K at stock speeds, with Mantle on AMD versus DirectX on a 780 Ti or so. You will see the 780 Ti being bottlenecked, while with Mantle there is no bottleneck.
if you look at high end r9
If you look at a high-end R9 card with a high-end Intel CPU, or even a mid-range i5, with and without Mantle enabled, the performance difference is only around 10-15%, so the bottleneck of DX11 isn’t that massive compared to what you get with a slower CPU. It’s mostly with AMD CPUs that the bottleneck rears up more and more.
Nvidia’s graphs might look
Nvidia’s graphs might look nice and fancy, but their DX11 improvements are pretty much senseless/nonsense.
I think the Star Swarm benchmark graph is the perfect example. Using an i7-3930K processor (~$550) at an unspecified clock speed (possibly overclocked) and unspecified RAM (I’d guess some high-end crap like 2133-2666 MHz low latency) in combination with a GTX 780 Ti might beat a 290X with Mantle, but the latter is much cheaper and (thanks to Mantle) does not require such a high-end CPU.
I can’t believe how many
I can’t believe how many times I’ve sat through these sorts of topics. Microsoft will not be beaten by AMD. Intel and NVIDIA are not some back-alley, second-rate “yes please let us lose” corporations that will idly sit by and watch a company at the edge of bankruptcy (saved by the bell… 50% of the bell being a Microsoft product – “bell” = current consoles) dent their shining armour. Yes, Mantle is here to stay for a bit; interest will decline once DX12 is out and developers see the numbers, and shareholders realize that the expenses and benefits associated with Mantle are not worth the cost, especially when DX12 is universally accessible. Nobody is questioning Mantle’s technological prowess (though it’s now debatable whether it’s an authentic project by AMD, seeing that OpenGL is basically the same and AMD was in the know about DX12 for a long time), but still, we all owe AMD lots of thanks for bringing this out of the shadows, because it gave the other boys the nudge they needed. For now, it’s all marketing though. Stats and future numbers. Historical evidence is clear on the outcome, so let’s not shy away from that. Mantle simply doesn’t have staying power.
People seemingly forget that
People seemingly forget that the PS4 is able to run Mantle and not DX12; it’s currently whooping the XB1 in sales and will most likely become the dominant console (if it hasn’t already).
When DX12 arrives, to get the most out of it you will need new hardware (especially if you have a GeForce); for Mantle you won’t (unless you have a GeForce). When game engines support Mantle, all future games can take advantage of it and all gamers will benefit from it (unless you have a GeForce).
So M$ has ‘some’ Windows PCs and a console currently getting whooped by the PS4 (which can use Mantle) to fly its DX12 flag; AMD potentially has everything else, including the aforementioned Windows PCs and console.
I highly doubt Mantle will be so short-lived when it has so much disruptive potential (especially if you have a GeForce).
“…it is thanks to AMD that
“…it is thanks to AMD that we are even talking about efficiency with such passion”
And strange that MS, just months after Mantle’s release, is talking about really optimizing DirectX.
And has rebuilt it from the ground up.
Talk about timing… or is it?
And the same goes for OpenGL.
No, it’s panic. But that is good.
It’s just sad to see that much-needed improvements have to be forced.
But it is great that there is competition, or else we would just get slight improvements.
I hope that the same thing happens for better surround audio. TrueAudio.
Sony forced dx12, as far as
Sony forced DX12; as far as Microsoft is concerned, it will use DX12 to get an advantage over the PS4. Of course we have the AMD vs Nvidia side, but the Sony/Microsoft side of DX12 is interesting. Do you think Microsoft would be so willing if DX12 helped the PS4 as well? PS4 hardware > XBONE hardware. I agree Mantle is the bulk of the reason, but I just want to point out the nuance.
I have not had a chance to
I have not had a chance to read through this entire thread (hot topic!) so sorry for repeating already-stated comments. I’m responding because of the remark about fragmentation and Nvidia speaking out being an irony…
You have to build something and establish it before you worry about fragmentation. Nvidia’s CUDA, PhysX, etc. are proprietary, yes, but Nvidia practically laid all the groundwork that allows general alternatives to thrive (OpenCL, open-source Bullet, DirectCompute, etc.).
I’m not sure they could afford, at the time, to give their competitors a free lunch and go all in. So they trail-blazed and championed these technologies under their flag because they needed to open up doors. The competition and their partners rode on their coattails, which is totally fair as the way of business. This time, AMD seems to be leading the charge. The only difference is that the partners and the competition are only footfalls behind.
DX11 nVidia optimised – DX12
DX11 nVidia optimised – DX12 AMD/nVidia optimised – Mantle – OpenGL… One thing is for sure: the state of PC gaming is good, and no matter which way you look at it, all this competition is only a win for us gamers.
With 4K looking to become a standard sooner rather than later, we will need all the GPU grunt we can muster, and with the APIs giving boosts to fps, I will be a happy gamer 😀
True that, I agree. Happy
True that, I agree. Happy days to be a PC gamer 🙂
if intel amd and NVidia all
If Intel, AMD and NVidia all say the same thing? Odds are high it will happen regardless of whether MS approves or not! Why? Come on, guys! OpenGL is for all intents and purposes an Android thing, and Android is pretty much Linux nowadays! Do you know how many corporations do hard-core coding on Linux? Even IBM is there. And then there is the 2014 sparse fast Fourier transform that will replace the fast Fourier transform; this alone brings huge gains, and pretty much a majority of stuff uses that transform in one form or another. So in the end, yeah, Intel, NVidia and AMD decided to push harder for OpenGL. Who wouldn’t? Android is the number one OS on the planet! Devs don’t have a choice; they have to develop for Android and Windows. DX12 would have been nice, but I suspect Android isn’t going to wait for MS to release DX12 before implementing next-gen OpenGL and the 2014 sparse fast Fourier transform. I bet the next Android Nexus will have all of this inside! The gains are just too important not to make this happen!