Summary of Data and Early Thoughts
Because of the complexity and sheer amount of data we have gathered using our Frame Rating performance methodology, we are breaking it up into several articles that each feature different GPU comparisons. Here is the schedule:
- 3/27: Frame Rating Dissected: Full Details on Capture-based Graphics Performance Testing
- 3/27: Radeon HD 7970 GHz Edition vs GeForce GTX 680 (Single and Dual GPU)
- 3/30: AMD Radeon HD 7990 vs GeForce GTX 690 vs GeForce GTX Titan
- 4/2: Radeon HD 7950 vs GeForce GTX 660 Ti (Single and Dual GPU)
- 4/5: Radeon HD 7870 GHz Edition vs GeForce GTX 660 (Single and Dual GPU)
- 4/16: Frame Rating: Visual Effects of Vsync on Gaming Animation
Without question, we have inundated you with about as much information as any one person can take in from a single story, which is why we are spreading our results over several days of releases. This story produced a lot of answers as well as a lot of questions, and I am going to try to address as many of them as possible in our summary, but I encourage our readers to use the comments below and this thread in our forums to keep the discussion going. I’ll be following both over the next week (with GDC getting in the way initially) and look forward to getting some more input.
Single GPU Configurations – Performance as Expected
Today’s results focus on the Radeon HD 7970 GHz Edition and the NVIDIA GeForce GTX 680 as well as their SLI/CrossFire options, but let’s start with a quick talk about the results we see with the single card and single GPU configurations. Frame Rating still tells an interesting and unique story compared to FRAPS, and thanks to some of our data analysis (Min FPS percentiles, International Stutter Units) the HD 7970 and GTX 680 compare differently than they might otherwise.
We definitely can’t say the same for the multi-GPU results, but when using only a single GPU both AMD and NVIDIA platforms show consistent results on a run-to-run basis as well as when we compare Frame Rating to the traditional FRAPS average frame rates and frame times. When we showed you the FRAPS graph followed by the Observed FPS graph, you should have seen that both the single GTX 680 and the single HD 7970 are basically the same on both.
Frame time graphs are going to be different because FRAPS and our capture solution measure frame times at different locations in the graphics pipeline, but generally both versions tell a similar story. If there is hitching or stutter found using the FRAPS time stamps then our at-the-display data will show the same thing, though maybe at different specific locations. Patterns are the key thing to look for, though, as very few gamers are really just playing a game for 60 seconds at a time, let alone the same 60 seconds over and over.
The overall picture comparing the two cards indicates that the AMD Radeon HD 7970 GHz Edition is a faster card for gaming at 1920×1080, 2560×1440 and 5760×1080 triple-monitor resolutions. In Battlefield 3 the performance gap between the HD 7970 and GTX 680 was small at 1920×1080 and 2560×1440 but expanded to a larger margin at 5760×1080 (19%). AMD’s HD 7970 also shows less frame-to-frame variance in BF3 than the GTX 680. This same pattern is seen in Crysis 3 as well, though at 5760×1080 we are only getting average frame rates of 13 and 16 FPS, giving the HD 7970 a 23% advantage.
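For readers who want to check margins like these themselves, the arithmetic can be sketched as below. The function name is my own for illustration, and the inputs are the two rounded Crysis 3 averages quoted above, so treat the result as approximate rather than as our exact measured figure.

```python
def percent_advantage(faster_fps: float, slower_fps: float) -> float:
    """Return how much faster `faster_fps` is than `slower_fps`, in percent."""
    return (faster_fps / slower_fps - 1.0) * 100.0


# Crysis 3 at 5760x1080: the rounded averages of 16 and 13 FPS quoted above.
crysis3_gap = percent_advantage(16, 13)
print(f"Crysis 3 5760x1080 advantage: {crysis3_gap:.0f}%")
```

Note that the percentage is taken relative to the slower card, which is why a 3 FPS gap at these low frame rates turns into a margin above 20%.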
DiRT 3 performed very well on both cards even at the 5760×1080 resolution, though AMD’s HD 7970 maintained a small advantage. Far Cry 3 was much more varied, with the GTX 680 taking the lead at 1920×1080 (20%), but at 2560×1440 and 5760×1080 the cards change places, giving the HD 7970 the lead. Skyrim was another game that saw small performance leads for AMD at higher resolutions, though I did find there to be less frame time variance on the GTX 680 system, which provided a better overall experience for a game that can run on most discrete GPUs on the market today.
Finally, in one of the newest games in our test suite, Sleeping Dogs, the AMD Radeon HD 7970 holds a sizeable advantage across all three tested resolutions. The margins are 34% at 1920×1080, 37% at 2560×1440 and 23% when using triple displays.
While some people might have assumed that this new testing methodology would paint a prettier picture of NVIDIA’s current GPU lineup across the board (due to its involvement in some tools), with single card configurations nothing much is changing in how we view these comparisons. The Radeon HD 7970 GHz Edition and its 3GB frame buffer is still a faster graphics card than a stock GeForce GTX 680 2GB GPU. In my testing there were only a couple of instances in which the experience on the GTX 680 was faster or smoother than the HD 7970 at 1920×1080, 2560×1440 or even 5760×1080.
AMD CrossFire Performance – A Bridge over Troubled Water?
Where AMD has definite issues is with HD 7970s in CrossFire, and our Frame Rating testing is bringing that to light in a startling fashion. In half of our tested games, the pair of Radeon HD 7970s in CrossFire showed no appreciable measured or observed increase in performance compared to a single HD 7970. I cannot overstate that point: our results showed that in Battlefield 3, Crysis 3 and Sleeping Dogs, adding in another $400+ Radeon HD 7970 did nothing to improve your gaming experience, and in some cases made it worse by introducing frame time variances that lead to stutter. Take a look at some of our graphs on those game pages and compare the FRAPS FPS result to the Observed FPS result, which calculates an average frame rate after removing runts and drops. Clearly the dual-card configuration is only barely faster than the single card, which removes the “scaling” of CrossFire. This occurs at 1920×1080 and 2560×1440 in those three games and also happens several times in DiRT 3, but only at 2560×1440 (which leads me to believe this is a GPU performance issue, not a CPU performance issue).
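To make the difference between a FRAPS-style frame rate and an Observed FPS figure concrete, here is a minimal sketch of the idea. It is not the FCAT scripts themselves: the `CapturedFrame` type, the 21-scanline runt cutoff and the sample capture are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Assumption: an illustrative cutoff below which a frame counts as a runt.
RUNT_THRESHOLD_SCANLINES = 21


@dataclass
class CapturedFrame:
    scanlines: int  # rows of captured output this frame occupied; 0 = dropped


def classify(frame: CapturedFrame, threshold: int = RUNT_THRESHOLD_SCANLINES) -> str:
    """Label a frame as a drop, a runt, or a full frame."""
    if frame.scanlines == 0:
        return "drop"
    if frame.scanlines < threshold:
        return "runt"
    return "full"


def frame_rates(frames: list[CapturedFrame], duration_s: float) -> tuple[float, float]:
    """Return (reported FPS counting every frame, observed FPS counting only full frames)."""
    full_frames = sum(1 for f in frames if classify(f) == "full")
    return len(frames) / duration_s, full_frames / duration_s


# Hypothetical one-second capture in which every other frame is a runt.
frames = [CapturedFrame(640 if i % 2 == 0 else 8) for i in range(66)]
reported, observed = frame_rates(frames, duration_s=1.0)
print(f"reported: {reported:.0f} FPS, observed: {observed:.0f} FPS")
```

In this made-up capture, software that counts every rendered frame would report twice the frame rate that a viewer at the display actually benefits from.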
It is worth pointing out that this does not necessarily mean you won’t have a fluid gaming experience on an AMD CrossFire configuration. Sleeping Dogs at 2560×1440 is a perfect example of this: CrossFire shows nearly 50% of the frames as runts, cutting the average frame rate in half, but those non-runt frames are actually delivered in a consistent manner. But a smooth gaming experience at 33 FPS on average on two HD 7970s in CrossFire doesn’t sound that good when you can get the same smooth experience at 33 FPS average with a single HD 7970. Dual GeForce GTX 680s in SLI, on the other hand, produce a fluid animation in Sleeping Dogs at 46 FPS.
In Far Cry 3 and Skyrim we did not have this problem with our performance metrics since we didn’t see large numbers of runts or drops in our testing. For Far Cry 3 in particular, the AMD cards had quite a bit more frame time variance (leading to stutter, non-fluid gameplay) with even the single HD 7970 getting higher marks on the International Stutter Units (ISU) graph than the GTX 680s in SLI.
The second major concern for AMD CrossFire users occurs when you enable triple-monitor configurations with Eyefinity. In every single game we tested, even Skyrim, DiRT 3 and Far Cry 3, which didn’t show major runt issues at single monitor resolutions, just about every other frame of the game was being dropped. Just like the runt frame issue we mentioned above, the Eyefinity drop problem basically means you are running your 5760×1080 configuration at the performance level of a single HD 7970 even though you have invested twice the money AND other performance software (in-game tests, FRAPS) is telling you differently. The results from the recorded video are in fact so bad that the FCAT Perl scripts aren’t quite able to decipher them because they treat the video as a poor capture; we can assure you that is not the case.
As much as we told you the single card results continued to favor AMD’s Radeon HD 7970 GHz Edition, the CrossFire results here counter that. For a buyer of a high end graphics card that costs over $400, the assurance of being able to run a multi-GPU solution to improve performance was not just insinuated, but verbally given. At this point, it is fair to say that AMD is not living up to its promises.
NVIDIA SLI Performance – How we expected multi-GPU to work
The NVIDIA GeForce GTX 680 looks slower than the HD 7970 in our single GPU comparisons, but that all changes when we compare dual-GPU to dual-GPU in this category. While AMD’s solution showed thousands of runt frames on BF3, Crysis 3 and Sleeping Dogs (two of which are AMD Gaming Evolved titles), NVIDIA’s SLI was able to handle scaling without a problem. Battlefield 3 at 2560×1440 goes from an average of 57 FPS on one GTX 680 to 100 FPS on two of them; Crysis 3 at 1920×1080 scales from 31 FPS to 56 FPS; Sleeping Dogs goes from 24 to 46 FPS at 2560×1440. And it is able to do so without massive frame time variance, which means the animations are not only improved by better frame rates but are still nearly as smooth as the single card options.
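One way to put those SLI numbers in perspective is to compare them with a perfect doubling of the single card result. The sketch below does that with the averages quoted above; the `scaling_efficiency` helper and the idea of treating 2x as the ideal are my own framing for illustration.

```python
def scaling_efficiency(single_fps: float, dual_fps: float) -> float:
    """Dual-GPU frame rate as a percentage of an ideal 2x of the single-GPU rate."""
    return dual_fps / (2.0 * single_fps) * 100.0


# (single GTX 680 FPS, GTX 680 SLI FPS) as quoted in the paragraph above.
sli_results = {
    "Battlefield 3 @ 2560x1440": (57, 100),
    "Crysis 3 @ 1920x1080": (31, 56),
    "Sleeping Dogs @ 2560x1440": (24, 46),
}

for title, (single, dual) in sli_results.items():
    print(f"{title}: {scaling_efficiency(single, dual):.0f}% of ideal scaling")
```

By this rough measure all three titles land within about 12 points of perfect scaling.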
The secret to NVIDIA’s success lies in the hardware frame metering technology that it has built into the SLI infrastructure since the beginning, but which is only just recently coming to light. Apparently a combination of both hardware on the GPU and software in the driver, the frame metering technology’s sole purpose is to balance the output of frames from the GPU to the display in such a way as to provide the best animation possible and balance performance and input latency.
In my talks with AMD before this article went live they told us that they were simply doing what the game engine told them to do – displaying frames as soon as they were available. Well, as we can clearly see with the runts in more than half of our tested games, displaying a frame too early can be just as detrimental as displaying it too late. Without the ability to balance the output of the two GPUs (or three or four) you will run into these problems, and in fact we have seen the same thing happen with NVIDIA cards when metering is disabled. We are hoping that NVIDIA will give us the option to disable it and run some more Frame Rating tests to see how they compare in the near future.
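To illustrate why displaying frames as soon as they are ready can produce runts, and what pacing them might look like, here is a toy model. It is emphatically not NVIDIA's implementation, which remains undisclosed; the function names, the fixed target interval and the timestamps are all hypothetical.

```python
def present_asap(ready_times_ms: list[float]) -> list[float]:
    """Show every frame the moment it finishes rendering."""
    return list(ready_times_ms)


def present_metered(ready_times_ms: list[float], target_interval_ms: float) -> list[float]:
    """Toy metering: hold a frame back so it never follows the previous one too closely."""
    presented: list[float] = []
    for ready in ready_times_ms:
        if presented:
            earliest_allowed = presented[-1] + target_interval_ms
            presented.append(max(ready, earliest_allowed))
        else:
            presented.append(ready)
    return presented


def intervals(times_ms: list[float]) -> list[float]:
    """Gaps between consecutive presentation times."""
    return [later - earlier for earlier, later in zip(times_ms, times_ms[1:])]


# Hypothetical two-GPU pattern: each GPU finishes a frame every 30 ms,
# but the second GPU trails the first by only 2 ms.
ready = [0, 2, 30, 32, 60, 62]

print("ASAP intervals:   ", intervals(present_asap(ready)))
print("Metered intervals:", intervals(present_metered(ready, target_interval_ms=15)))
```

In the ASAP case the gaps alternate between 2 ms and 28 ms, so every other frame is on screen for only a sliver of time. The metered case evens the gaps out to 15 ms, at the cost of holding some finished frames back, which is the latency trade-off mentioned above.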
In a couple of games, Far Cry 3 and DiRT 3 on occasion, CrossFire is working as we would expect it to. Skyrim does not exhibit the runt problem but it also doesn’t seem to scale at all over a single GPU either. The inconsistency of this behavior might be just as troubling if my theory is correct. In Skyrim, Far Cry 3 and DiRT 3 at low resolutions, it would appear that the CPU may be the primary bottleneck for performance, and for Far Cry 3, a game that has numerous other technical issues, this may be why CrossFire is actually working. An artificial limiter on the game engine that helps meter out requests for frames to be rendered would essentially act like the hardware frame metering in NVIDIA’s SLI GPUs, allowing for a better overall experience. In games like BF3, Crysis 3 and Sleeping Dogs where the GPU is in more demand, the AMD hardware/software combination is the limiting point in the pipeline and this is where the AMD solution falters.
Vsync – Only a Partial Answer
When I posted my preview of these results during the launch of the GeForce GTX Titan, many of you wanted to know what effects Vsync would have on the runts and frame time variance. As it turns out, Vsync can in fact improve the situation for AMD’s CrossFire pretty dramatically, but still leaves a lot of problems on the table. By metering the frame rendering times of all GPU combinations, including CrossFire, Vsync is able to remove the runts from our captures and keep them from affecting performance. Take a look at the results in Crysis 3 at 1920×1080 on the Radeon HD 7970s in CrossFire to see the other emerging issue though: drastically increased frame time variance. The constant shifting between 16ms and 33ms frame times means that you will often see stuttering animation even when the GPU has the performance to handle higher or even more consistent frame rates.
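The jump between 16ms and 33ms follows from how Vsync ties frame delivery to the display refresh. A simplified model, assuming a 60 Hz display and that each frame simply waits for the next refresh after it finishes rendering, can be sketched like this (the function name and sample render times are hypothetical):

```python
import math

REFRESH_HZ = 60
REFRESH_INTERVAL_MS = 1000 / REFRESH_HZ  # about 16.7 ms per refresh


def vsync_frame_time(render_ms: float) -> float:
    """Displayed frame time when a frame must wait for the next refresh boundary."""
    refreshes_needed = max(1, math.ceil(render_ms / REFRESH_INTERVAL_MS))
    return refreshes_needed * REFRESH_INTERVAL_MS


# Hypothetical render times hovering around the 16.7 ms refresh interval.
render_times_ms = [15, 18, 16, 20, 14]
displayed = [vsync_frame_time(t) for t in render_times_ms]
print([f"{t:.1f}" for t in displayed])
```

Render times that differ by only a few milliseconds end up displayed for either one refresh or two, which is the shifting pattern seen in the frame time graphs.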
To be fair, this same effect happens to NVIDIA’s GTX 680s in SLI. The only difference is that NVIDIA has some options to try to fix it, called Adaptive Vsync and Smooth Vsync. Both are activated through the NVIDIA Control Panel, but Smooth Vsync is only available for SLI users (we are hoping this will be added for single GPU users as well soon). Adaptive Vsync fixes the frame times at your display refresh rate (16ms, 60 Hz most of the time) anytime your frame rate would be higher than 60 FPS, but then allows the engine to essentially “turn off” Vsync under 60 FPS so you don’t get the dramatic stuttering. Smooth Vsync is a little-known feature that attempts to only change the frame rate / frame times when it knows it will have extended periods of available performance.
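The Adaptive Vsync behavior described above can be modeled in a few lines. This is only a sketch of the description given here, assuming a 60 Hz display, and not NVIDIA's driver logic; the function name and sample values are hypothetical.

```python
REFRESH_INTERVAL_MS = 1000 / 60  # assumed 60 Hz display, about 16.7 ms


def adaptive_vsync_frame_time(render_ms: float) -> float:
    """Cap frame time at the refresh interval when fast; pass it through when slow."""
    if render_ms <= REFRESH_INTERVAL_MS:
        # Frame rate would be 60 FPS or higher: Vsync stays on.
        return REFRESH_INTERVAL_MS
    # Frame rate is under 60 FPS: Vsync is effectively off, so no jump to 33 ms.
    return render_ms


# Hypothetical render times on either side of the refresh interval.
render_times_ms = [10, 15, 18, 25]
adaptive = [adaptive_vsync_frame_time(t) for t in render_times_ms]
print([f"{t:.1f}" for t in adaptive])
```

In this model an 18 ms frame stays at 18 ms instead of being held for a second refresh, which is why the dramatic stutter goes away below 60 FPS.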
In some select instances for AMD's CrossFire we can actually see a completely resolved frame variance result, as demonstrated with the Battlefield 3 2560×1440 graphs. But Vsync still introduces other problems to latency and interactivity of PC games and is a topic we are going to dive into again soon.
Below is our full video that features the Frame Rating process, some example results and some discussion on what it all means going forward. I encourage everyone to watch it but you will definitely need the written portion here to fully understand this transition in testing methods. Subscribe to our YouTube channel if you haven't already!
Our Frame Rating graphics performance methods have definitely put a different light on the world of GPU performance. By enabling at-the-display evaluation rather than depending solely on the FRAPS reported numbers that don’t take into account many other steps of the gaming pipeline, we are able to paint a picture of the smoothness and real-world experience as the user would see it directly. The system and process are time consuming, costly and harder to interpret, but I think you will agree that the results we are seeing here today are well worth the effort. Until recently, very few people would have believed that AMD’s CrossFire and Eyefinity technologies would have had so many problems, thanks in many ways to previous testing platforms. As it stands now, half of our six tested gaming titles show significant problems with single monitor CrossFire scaling and all six of them have serious problems with Eyefinity scaling. And that includes AMD-sponsored titles like Crysis 3, Far Cry 3 and Sleeping Dogs.
NVIDIA’s frame metering technology, which still largely remains a mystery thanks to the company’s desire to keep its multi-GPU advantage as long as possible, is more important than we ever thought it would be and makes the GTX 680 stand out as a better solution for high end gaming than the HD 7970. That would not have been the case when looking at only single GPU results or by simply relying on the average frame rates of FRAPS data.
Another problem this causes for AMD and its partners is with dual-GPU hardware such as the PowerColor Devil 13 or the ASUS ARES II. While both are powerful graphics cards, they depend completely on CrossFire technology to meet the performance expectations set by marketing and by cost, and a quick glance at the HD 7970 CrossFire results I’ve shown you today (which basically mirror an ARES II) tells you that neither option is worth the price of admission.
It should come as no surprise that AMD was only recently made aware of these performance issues and was kind of stunned to find out how bad they were. The truth is that AMD could “hack” together a fix to make our frame time graphs look better but it would likely be a solution that introduced more problems than answers without doing the research required to get it right. The driver team has told me several times over the past two weeks that they should have a testable driver to fix the CrossFire problems “in 2 to 3 months.” Until then, buyers that consider a multi-GPU solution a goal or a requirement will want to seriously debate dropping Radeon cards from consideration.
We have quite a few more test results to share with you as the week continues, as well as more testing to do to get a better and clearer picture of what is going on in each and every scenario we can. We also plan on taking a look at more gaming titles with a smaller selection of hardware to find more patterns of success and failure with our current generation of GPUs.