A new GPU, a familiar problem
NVIDIA’s GM206-based GTX 960 card is finally launching after weeks of rumors and leaks. Can it topple AMD’s lead at $199?
Editor's Note: Don't forget to join us today for a live streaming event featuring Ryan Shrout and NVIDIA's Tom Petersen to discuss the new GeForce GTX 960. It will be live at 1pm ET / 10am PT and will include ten (10!) GTX 960 prizes for participants! You can find it all at https://www.pcper.com/live
There are no secrets anymore. Calling today's release of the NVIDIA GeForce GTX 960 a surprise would be like calling another Avengers movie unexpected. If you didn't just assume it was coming, chances are the dozens of leaked slides and performance numbers got your attention. So here it is: today's the day NVIDIA finally upgrades the mainstream segment that had been fed by the GTX 760 for more than a year and a half. But does the brand new GTX 960 based on Maxwell move the needle?
But as you'll soon see, the GeForce GTX 960 is a bit of an odd duck in terms of new GPU releases. As we have seen several times in the last year or two with a stagnant process technology landscape, the new cards aren't going to be wildly better performing than the current cards from either NVIDIA or AMD. In fact, there are some interesting comparisons to make that may surprise fans of both parties.
The good news is that Maxwell and the GM206 GPU will be priced starting at $199, including overclocked models at that level. But to understand what makes it different from the GM204 part we first need to dive a bit into the GM206 GPU and how it matches up with NVIDIA's "small" GPU strategy of the past few years.
The GM206 GPU – Generational Complexity
First and foremost, the GTX 960 is based on the exact same Maxwell architecture as the GTX 970 and GTX 980. The power efficiency, the improved memory bus compression and the new features all make their way into the smaller version of Maxwell selling for $199 as of today. If you missed the discussion of those new features, including MFAA, Dynamic Super Resolution and VXGI, you should read that page of our original GTX 980 and GTX 970 story from last September for a bit of context; these are important aspects of Maxwell and the new GM206.
NVIDIA's GM206 is essentially half of the full GM204 GPU that you find on the GTX 980. That includes 1024 CUDA cores, 64 texture units and 32 ROPs for processing, a 128-bit memory bus and 2GB of graphics memory. This results in half of the memory bandwidth at 112 GB/s and half of the peak compute capability at 2.30 TFLOPS.
Those are significant specification hits and will result in a drop of essentially half the gaming performance for the GTX 960 compared to the GTX 980. Some readers and PC enthusiasts will immediately recognize the GTX 960 as a bigger drop from the flagship part than recent generations of graphics cards from NVIDIA. You're not wrong.
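The "half of GM204" figures can be checked with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes the usual conventions that bandwidth is the effective memory clock times the bus width in bytes, and that peak FP32 throughput counts two floating-point operations (one fused multiply-add) per CUDA core per clock at the rated clock.

```python
def memory_bandwidth_gbs(effective_clock_mhz: float, bus_width_bits: int) -> float:
    """Bandwidth in GB/s: transfers per second times bytes moved per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_mhz * 1e6 * bytes_per_transfer / 1e9


def peak_compute_tflops(cuda_cores: int, rated_clock_mhz: float) -> float:
    """Peak FP32 TFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return cuda_cores * 2 * rated_clock_mhz * 1e6 / 1e12


# Specifications as quoted in this article.
cards = {
    "GTX 960": {"cores": 1024, "clock": 1126, "mem_clock": 7000, "bus": 128},
    "GTX 980": {"cores": 2048, "clock": 1126, "mem_clock": 7000, "bus": 256},
}

for name, c in cards.items():
    bw = memory_bandwidth_gbs(c["mem_clock"], c["bus"])
    tf = peak_compute_tflops(c["cores"], c["clock"])
    print(f"{name}: {bw:.0f} GB/s, {tf:.2f} TFLOPS")
```

With these inputs the GTX 960 works out to 112 GB/s and roughly 2.31 TFLOPS (listed as 2.30 in the specifications), exactly half of the GTX 980 on both counts.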
| | GTX 960 | GTX 970 | GTX 980 | GTX 760 | GTX 770 | GTX 780 | GTX 660 | GTX 670 | GTX 680 |
---|---|---|---|---|---|---|---|---|---|
GPU | GM206 | GM204 | GM204 | GK104 | GK104 | GK110 | GK106 | GK104 | GK104 |
GPU Cores | 1024 | 1664 | 2048 | 1152 | 1536 | 2304 | 960 | 1344 | 1536 |
Rated Clock | 1126 MHz | 1050 MHz | 1126 MHz | 980 MHz | 1046 MHz | 863 MHz | 980 MHz | 915 MHz | 1006 MHz |
Texture Units | 64 | 104 | 128 | 96 | 128 | 192 | 80 | 112 | 128 |
ROP Units | 32 | 64 | 64 | 32 | 32 | 48 | 24 | 32 | 32 |
Memory | 2GB | 4GB | 4GB | 2GB | 2GB | 3GB | 2GB | 2GB | 2GB |
Memory Clock | 7000 MHz | 7000 MHz | 7000 MHz | 6000 MHz | 7000 MHz | 6000 MHz | 6000 MHz | 6000 MHz | 6000 MHz |
Memory Interface | 128-bit | 256-bit | 256-bit | 256-bit | 256-bit | 384-bit | 192-bit | 256-bit | 256-bit |
Memory Bandwidth | 112 GB/s | 224 GB/s | 224 GB/s | 192 GB/s | 224 GB/s | 288 GB/s | 144 GB/s | 192 GB/s | 192 GB/s |
TDP | 120 watts | 145 watts | 165 watts | 170 watts | 230 watts | 250 watts | 140 watts | 170 watts | 195 watts |
Peak Compute | 2.30 TFLOPS | 3.49 TFLOPS | 4.61 TFLOPS | 2.25 TFLOPS | 3.21 TFLOPS | 3.97 TFLOPS | 1.81 TFLOPS | 2.46 TFLOPS | 3.09 TFLOPS |
Transistor Count | 2.94B | 5.2B | 5.2B | 3.54B | 3.54B | 7.08B | 2.54B | 3.54B | 3.54B |
Process Tech | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm |
MSRP | $199 | $329 | $549 | $249 | $399 | $649 | $230 | $399 | $499 |
This table compares the last three brand generations of NVIDIA's GeForce cards across the x80, x70 and x60 products. Take a look at the GTX 680, a card based on the GK104 GPU, and the GTX 660, based on GK106; the mainstream card has 62.5% of the CUDA cores and 75% of the memory bus width. The GTX 760 is actually based on the same GK104 GPU as the GTX 680 and GTX 770 and includes a wider 256-bit memory bus, though it drops to half of the CUDA cores of the GK110-based GTX 780.
It's complicated (trust me, I know), but NVIDIA definitely wants to get to smaller GPU dies again on the lower-priced parts. Way back in 2012 NVIDIA released the GTX 660 with a 2.54 billion transistor die on the 28nm process, a card that stayed performance competitive with the 700-series. The GTX 760 jumped up a lot to a 3.54 billion transistor die and increased the price to $249 at launch. Today's release of the GTX 960 is down to 2.94 billion transistors, near that of the GTX 660, but with a lower starting price point of $199.
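To put the generational comparison in numbers, here is a minimal sketch that computes how much of the x80 part's CUDA cores and memory bus width each x60 card keeps. The values are copied from the table; the dictionary layout and the `share` helper are just illustrative choices.

```python
# (CUDA cores, memory bus width in bits), taken from the table above.
SPECS = {
    "GTX 660": (960, 192),
    "GTX 680": (1536, 256),
    "GTX 760": (1152, 256),
    "GTX 780": (2304, 384),
    "GTX 960": (1024, 128),
    "GTX 980": (2048, 256),
}

PAIRS = [("GTX 660", "GTX 680"), ("GTX 760", "GTX 780"), ("GTX 960", "GTX 980")]


def share(x60: str, x80: str) -> tuple[float, float]:
    """Return the x60 card's cores and bus width as a percentage of the x80 card's."""
    cores_60, bus_60 = SPECS[x60]
    cores_80, bus_80 = SPECS[x80]
    return cores_60 / cores_80 * 100, bus_60 / bus_80 * 100


for x60, x80 in PAIRS:
    cores_pct, bus_pct = share(x60, x80)
    print(f"{x60} vs {x80}: {cores_pct:.1f}% of the cores, {bus_pct:.1f}% of the bus width")
```

The GTX 960 is the only x60 card of the three that falls to 50% of its flagship on both measures.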
Power use on the GTX 960 is amazingly low with a rated TDP of 120 watts, and in our testing the GPU almost never even approaches that level. In fact, when playing a game like DOTA 2 with V-Sync on (60 FPS cap) the card barely draws more than 35 watts! (More details on that on the power page.)
In the press documentation from NVIDIA, the company makes several attempts to put a better spin on the specifications surrounding the GeForce GTX 960. For the first time, NVIDIA mentions an "effective memory clock" rate that is justified by the efficiency improvement in memory compression of Maxwell over Kepler. While this is definitely true, it's been true between generations for years and is part of the reason analysis of GPUs like ours continues to exist. Creating metrics to selectively improve line items is a bad move, and I expressed as much during our early meetings.
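For readers wondering how an "effective" clock can be derived at all, the sketch below shows one plausible construction. It is an assumption, not NVIDIA's published formula: if compression removes some fraction of memory traffic, the same bus behaves as though it were running proportionally faster. The function name and the 25% savings figure are placeholders chosen purely for illustration.

```python
def effective_memory_clock_mhz(actual_clock_mhz: float, traffic_saved: float) -> float:
    """Scale the real memory clock by the traffic that compression avoids.

    traffic_saved is the fraction of bytes that never cross the bus
    (0.0 means no compression benefit). Hypothetical model, not NVIDIA's.
    """
    if not 0.0 <= traffic_saved < 1.0:
        raise ValueError("traffic_saved must be in the range [0, 1)")
    return actual_clock_mhz / (1.0 - traffic_saved)


# 7000 MHz is the GTX 960's rated memory clock; 0.25 is a placeholder value.
print(round(effective_memory_clock_mhz(7000, 0.25)))
```

The weakness of such a metric is visible in the formula itself: the result depends entirely on the savings fraction, which varies by workload and cannot be read off a spec sheet.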
Separately, NVIDIA is moving forward with its continued emphasis on MFAA performance numbers. Remember that multi-frame sampled anti-aliasing (MFAA) was launched with the GTX 980 and GTX 970, and uses a post-processing filter to combine multiple frames temporally at 2xMSAA quality with shifted sample points. The result is a 4xMSAA look at 2xMSAA performance, at least in theory. When the GTX 980 and GTX 970 launched, game support was incredibly limited, making the feature less than exciting. With this new driver, Maxwell GPUs will be able to support MFAA on all DirectX 11 and 10 games that support MSAA, excluding only Dead Rising 3, Dragon Age 2 and Max Payne 3. That applies to most games we test in our suite including Crysis 3, Battlefield 4 and Skyrim; other games like Metro: Last Light or Bioshock Infinite use internal AA methods, not driver-based MSAA, and thus are unable to utilize MFAA.
When NVIDIA's performance comparisons default to MFAA, they are comparing 2xMSAA with MFAA (4xAA quality essentially) to 4xMSAA on other cards. To its credit, NVIDIA says they are only comparing this way to previous NVIDIA hardware, not to AMD's competing hardware. My thoughts on this are mixed at this point as it will no doubt start a race between both parties to fully integrate and showcase custom, proprietary AA methods exclusively going forward. See my page on MFAA performance later in the review for more details.
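The idea behind shifting sample points between frames can be illustrated with a toy model. This is not NVIDIA's implementation: the sample positions are hypothetical, the scene is assumed static, and the temporal filter is reduced to a plain average of two frames. It estimates how much of a single pixel is covered by a vertical edge.

```python
# Hypothetical sub-pixel sample positions (x, y) inside a unit pixel.
PATTERN_EVEN = [(0.375, 0.125), (0.625, 0.875)]  # used on even frames
PATTERN_ODD = [(0.875, 0.375), (0.125, 0.625)]   # used on odd frames
PATTERN_4X = PATTERN_EVEN + PATTERN_ODD          # all four points in one frame


def coverage(samples, edge_x):
    """Fraction of samples that land left of a vertical edge at x = edge_x."""
    inside = sum(1 for x, _y in samples if x < edge_x)
    return inside / len(samples)


edge_x = 0.3  # the geometry covers the left 30% of the pixel

even = coverage(PATTERN_EVEN, edge_x)    # 2 samples, frame N
odd = coverage(PATTERN_ODD, edge_x)      # 2 samples, frame N+1
blended = (even + odd) / 2               # simple temporal average
reference = coverage(PATTERN_4X, edge_x)  # 4 samples in a single frame

print(even, odd, blended, reference)
```

Each two-sample frame can only report 0%, 50% or 100% coverage, but averaging two frames with different sample points recovers the 25% steps of a four-sample pattern. With motion the two frames no longer describe the same image, which is where a real temporal filter has to do more work than this average.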
There are interesting comparisons to be made between the new GTX 960 and the currently shipping competing parts from AMD. Some of the specification differences will be claimed as important advantages for the Radeon lineup. Obviously our performance evaluation will be the final deciding factor, but is there anything to these claims?
| | GTX 960 | GTX 760 | R9 285 | R9 280 |
---|---|---|---|---|
GPU | GM206 | GK104 | Tonga | Tahiti |
GPU Cores | 1024 | 1152 | 1792 | 1792 |
Rated Clock | 1126 MHz | 980 MHz | 918 MHz | 827 MHz |
Texture Units | 64 | 96 | 112 | 112 |
ROP Units | 32 | 32 | 32 | 32 |
Memory | 2GB | 2GB | 2GB | 3GB |
Memory Clock | 7000 MHz | 6000 MHz | 5500 MHz | 5000 MHz |
Memory Interface | 128-bit | 256-bit | 256-bit | 384-bit |
Memory Bandwidth | 112 GB/s | 192 GB/s | 176 GB/s | 240 GB/s |
TDP | 120 watts | 170 watts | 190 watts | 250 watts |
Peak Compute | 2.30 TFLOPS | 2.25 TFLOPS | 3.29 TFLOPS | 3.34 TFLOPS |
Transistor Count | 2.94B | 3.54B | 5.0B | 4.3B |
Process Tech | 28nm | 28nm | 28nm | 28nm |
MSRP | $199 | $249 | $249 | $249 |
While comparing GPU core counts is useless between architectures, many of the other data points can be debated. The most prominent difference is the 128-bit memory bus that GM206 employs compared to the 256-bit memory bus of the R9 285 or even the massive 384-bit memory bus of the R9 280. Raw memory bandwidth is the net result of this - the GTX 960 only sports 112 GB/s while the R9 280 tosses out 240 GB/s, more than twice the value. The wider bus allows the R9 280 to have a 3GB frame buffer but also means it is at a disadvantage in TDP and transistor count / die size. An additional 1.4 billion transistors and 130 watts of TDP are substantial. The Tonga GPU in the R9 285 has more than 2.0 billion additional transistors when compared to the GTX 960 - what they all do though is still up for debate.
There is no denying that from a technological point of view, having a wider memory bus and higher memory bandwidth is a good thing for performance. But it comes at a cost - both in terms of design and in terms of AMD's wallet. Can NVIDIA really build a GPU that is both substantially smaller and equally powerful?
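The size of those gaps is easy to tabulate. The following sketch takes bandwidth, TDP and transistor counts from the table above and prints each Radeon's difference relative to the GTX 960; the tuple layout and the `delta_vs_960` helper are illustrative only.

```python
# (memory bandwidth in GB/s, TDP in watts, transistors in billions), from the table.
CARDS = {
    "GTX 960": (112, 120, 2.94),
    "R9 285": (176, 190, 5.0),
    "R9 280": (240, 250, 4.3),
}


def delta_vs_960(name: str) -> tuple[float, float, float]:
    """Return (bandwidth ratio, extra TDP watts, extra transistors in billions)."""
    bw, tdp, transistors = CARDS[name]
    base_bw, base_tdp, base_transistors = CARDS["GTX 960"]
    return bw / base_bw, tdp - base_tdp, transistors - base_transistors


for name in ("R9 285", "R9 280"):
    ratio, watts, extra = delta_vs_960(name)
    print(f"{name}: {ratio:.2f}x bandwidth, +{watts} W TDP, +{extra:.2f}B transistors")
```

On these numbers the R9 280 has about 2.14x the bandwidth for 130 more watts of TDP and roughly 1.36 billion more transistors, while the R9 285 carries about 2.06 billion more transistors than the GTX 960.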
Will nvidia hardware support adaptive-sync?
I don’t have a question. I just want to say I’m new to building PCs. I just built my first. I’ve watched numerous interviews with developers from many companies, and this was the easiest to understand! Being a car guy, I especially loved the car analogy!
Given the 960 will likely be used in mid range systems, can we expect to see lower priced GSYNC monitors targeted at value for performance customers?
What was the reason behind the limited release of reference design 960’s? Are there any plans to release more in the future?
Will there be a 4gb variant of the 960?
From Tom – the 960 does not support 4GB.
At the Microsoft demo yesterday they claimed that DX12 would mean more power efficient graphics cards. What did they mean by this and how does it affect Maxwell graphics cards?
http://blogs.msdn.com/b/directx/archive/2014/08/13/directx-12-high-performance-and-high-power-savings.aspx
The power efficiency related with DX12 has to do with CPU usage related with overhead in getting those DX commands to the GPU. There may be some power savings on the GPU side, but the bulk of it is going to fall on the CPU side, and should apply regardless of which type of GPU is installed.
Why is this (official) picture of GTX 960 showing two SLI fingers?
http://international.download.nvidia.com/geforce-com/international/image…
That's a dead link.
I just asked the same question on 1st page, and the link is working 🙂
Pentium g3258 running the 960 card? Performance?
There are reports of Maxwell cards that are clocking down on efficiency suddenly during gameplay, will these be resolved with driver updates?
Got any links we can research?
GeForce Forums
GTX 970 3.5GB Vram Issue
https://forums.geforce.com/default/topic/803518/geforce-900-series/gtx-970-3-5gb-vram-issue/
Boost feature causes GTX 970/980 instability in low utilisation situations
https://forums.geforce.com/default/topic/784294/geforce-900-series/boost-feature-causes-gtx-970-980-instability-in-low-utilisation-situations/
Nvidia! Is there a fix coming or not? – Very disappointed new customer. ( 970 low usage )
https://forums.geforce.com/default/topic/792706/geforce-900-series/nvidia-is-there-a-fix-coming-or-not-very-disappointed-new-customer-970-low-usage-/
i’m so happy i dumped my 760 for $160 a few months ago and went ahead and cried and paid the premium for a 970. Comparing a computer with a 960 to a PS4 and now with windows10 cross play, even the lonely xb1, you’d have to be crazy to buy one of these cards.
The only way i’d see it is if it was going to be your first computer and you couldn’t afford a console and really needed to educate yourself with a computer and you had a friend who could build you a computer for cheap, like $400 for everything.
Honestly, my 4ghz i7 & 970 is kind of disappointing for gaming now. I still play battlefield 3&4 & titanfall. But other than those, i don’t really see a point. Consoles are so much better for gaming right now.
You’re preaching to the wrong crowd. This is the PC master race.
Come again?
PC > console
Aah 🙂
What are NVIDIA's plans in regard to FreeSync?
Nvidia has stated that they do not plan to support freesync.
Tom, where do you buy your glasses and what brand are they?
Question. Would an SLI setup of 960s be good for a single 1440p monitor setup?
It should handle 1440P ok, but would have a hard time with 4k.
VRAM doesn't stack in SLI. 1440p adds quite a few pixels. Stick to at least a 4GB card. my2c
Without gsync, would LoL still be running around 30w?
I believe it was running at 30W when limited to 60 FPS.
When developing a new GPU technology like Maxwell, how does nvidia decide how many CUDA cores to include in each design? Is clock speed determined after that decision is made or earlier in the process?
Combo Question:
Why does the GTX 960 have only 2GB? Is that enough for modern games?
Question: Will Nvidia ever develop an SLI solution that doesn’t require a bridge?
Will G-Sync have additional value add / features in future versions?
What’s faster, the GTX 680 or the 960?