UPDATE 1/28/15 @ 10:25am ET: NVIDIA has posted in its official GeForce.com forums that they are working on a driver update to help alleviate memory performance issues in the GTX 970 and that they will "help out" those users looking to get a refund or exchange.
UPDATE 1/26/25 @ 1:00pm ET: We have posted a much more detailed analysis and look at the GTX 970 memory system and what is causing the unusual memory divisions. Check it out right here!
UPDATE 1/26/15 @ 12:10am ET: I now have a lot more information on the technical details of the architecture that cause this issue and more information from NVIDIA to explain it. I spoke with SVP of GPU Engineering Jonah Alben on Sunday night to really dive into the quesitons everyone had. Expect an update here on this page at 10am PT / 1pm ET or so. Bookmark and check back!
UPDATE 1/24/15 @ 11:25pm ET: Apparently there is some concern online that the statement below is not legitimate. I can assure you that the information did come from NVIDIA, though is not attributal to any specific person – the message was sent through a couple of different PR people and is the result of meetings and multiple NVIDIA employee's input. It is really a message from the company, not any one individual. I have had several 10-20 minute phone calls with NVIDIA about this issue and this statement on Saturday alone, so I know that the information wasn't from a spoofed email, etc. Also, this statement was posted by an employee moderator on the GeForce.com forums about 6 hours ago, further proving that the statement is directly from NVIDIA. I hope this clears up any concerns around the validity of the below information!
Over the past couple of weeks users of GeForce GTX 970 cards have noticed and started researching a problem with memory allocation in memory-heavy gaming. Essentially, gamers noticed that the GTX 970 with its 4GB of system memory was only ever accessing 3.5GB of that memory. When it did attempt to access the final 500MB of memory, performance seemed to drop dramatically. What started as simply a forum discussion blew up into news that was being reported at tech and gaming sites across the web.
Image source: Lazygamer.net
NVIDIA has finally responded to the widespread online complaints about GeForce GTX 970 cards only utilizing 3.5GB of their 4GB frame buffer. From the horse's mouth:
The GeForce GTX 970 is equipped with 4GB of dedicated graphics memory. However the 970 has a different configuration of SMs than the 980, and fewer crossbar resources to the memory system. To optimally manage memory traffic in this configuration, we segment graphics memory into a 3.5GB section and a 0.5GB section. The GPU has higher priority access to the 3.5GB section. When a game needs less than 3.5GB of video memory per draw command then it will only access the first partition, and 3rd party applications that measure memory usage will report 3.5GB of memory in use on GTX 970, but may report more for GTX 980 if there is more memory used by other commands. When a game requires more than 3.5GB of memory then we use both segments.
We understand there have been some questions about how the GTX 970 will perform when it accesses the 0.5GB memory segment. The best way to test that is to look at game performance. Compare a GTX 980 to a 970 on a game that uses less than 3.5GB. Then turn up the settings so the game needs more than 3.5GB and compare 980 and 970 performance again.
Here’s an example of some performance data:
|GTX 980||GTX 970|
|Shadow of Mordor|
|<3.5GB setting = 2688x1512 Very High||72 FPS||60 FPS|
|>3.5GB setting = 3456x1944||55 FPS (-24%)||45 FPS (-25%)|
|<3.5GB setting = 3840x2160 2xMSAA||36 FPS||30 FPS|
|>3.5GB setting = 3840x2160 135% res||19 FPS (-47%)||15 FPS (-50%)|
|Call of Duty: Advanced Warfare|
|<3.5GB setting = 3840x2160 FSMAA T2x, Supersampling off||82 FPS||71 FPS|
|>3.5GB setting = 3840x2160 FSMAA T2x, Supersampling on||48 FPS (-41%)||40 FPS (-44%)|
On GTX 980, Shadows of Mordor drops about 24% on GTX 980 and 25% on GTX 970, a 1% difference. On Battlefield 4, the drop is 47% on GTX 980 and 50% on GTX 970, a 3% difference. On CoD: AW, the drop is 41% on GTX 980 and 44% on GTX 970, a 3% difference. As you can see, there is very little change in the performance of the GTX 970 relative to GTX 980 on these games when it is using the 0.5GB segment.
So it would appear that the severing of a trio of SMMs to make the GTX 970 different than the GTX 980 was the root cause of the issue. I'm not sure if this something that we have seen before with NVIDIA GPUs that are cut down in the same way, but I have asked for clarification from NVIDIA on that.
The ratios fit: 500MB is 1/8th of the 4GB total memory capacity and 2 SMMs is 1/8th of the total SMM count. (Edit: The ratios in fact do NOT match up...odd.)
The full GM204 GPU that is the root cause of this memory issue.
Another theory presented itself as well: is this possibly the reason we do not have a GTX 960 Ti yet? If the patterns were followed from previous generations a GTX 960 Ti would be a GM204 GPU with fewer cores enabled and additional SMs disconnected to enable a lower price point. If this memory issue were to be even more substantial, creating larger differentiated "pools" of memory, then it could be an issue for performance or driver development. To be clear, we are just guessing on this one and that could be something that would not occur at all. Again, I've asked NVIDIA for some technical clarification.
Requests for information aside, we may never know for sure if this is a bug with the GM204 ASIC or predetermined characteristic of design.
The questions remains: does NVIDIA's response appease GTX 970 owners? After all, this memory concern is really just a part of a GPU's story and thus performance testing and analysis already incorporates it essentially. Some users will still likely make a claim of a "bait and switch" but do the benchmarks above, as well as our own results at 4K, make it a less significant issue?
Our own Josh Walrath offers this analysis:
A few days ago when we were presented with evidence of the 970 not fully utilizing all 4 GB of memory, I theorized that it had to do with the reduction of SMM units. It makes sense from an efficiency standpoint to perhaps "hard code" memory addresses for each SMM. The thought behind that would be that 4 GB of memory is a huge amount of a video card, and the potential performance gains of a more flexible system would be pretty minimal.
I believe that the memory controller is working as intended and not a bug. When designing a large GPU, there will invariably be compromises made. From all indications NVIDIA decided to save time, die size, and power by simplifying the memory controller and crossbar setup. These things have a direct impact on time to market and power efficiency. NVIDIA probably figured that a couple percentage of performance lost was outweighed by the added complexity, power consumption, and engineering resources that it would have taken to gain those few percentage points back.