NVIDIA has let loose the GM200 GPU upon the world, in the form of the GTX TITAN X. It has 12GB of memory…12GB!!
With the release of the GeForce GTX 980 back in September of 2014, NVIDIA took the lead in performance with single GPU graphics cards. The GTX 980 and GTX 970 were both impressive options. The GTX 970 offered better performance than the R9 290 as did the GTX 980 compared to the R9 290X; on top of that, both did so while running at lower power consumption and while including new features like DX12 feature level support, HDMI 2.0 and MFAA (multi-frame antialiasing). Because of those factors, the GTX 980 and GTX 970 were fantastic sellers, helping to push NVIDIA’s market share over 75% as of the 4th quarter of 2014.
But in the back of our minds, and in the minds of many NVIDIA fans, we knew that the company had another GPU it was holding on to: the bigger, badder version of Maxwell. The only question was going to be WHEN the company would release it and sell us a new flagship GeForce card. In most instances, this decision is based on the competitive landscape, such as when AMD might finally be updating its Radeon R9 290X Hawaii family of products with the rumored R9 390X. Perhaps NVIDIA is tired of waiting, or maybe the strategy is to launch before Fiji GPUs make their debut. Either way, NVIDIA officially took the wraps off of the new GeForce GTX TITAN X at the Game Developers Conference two weeks ago.
At the session hosted by Epic Games’ Tim Sweeney, NVIDIA CEO Jen-Hsun Huang arrived when Tim lamented about needing more GPU horsepower for their UE4 content. In his hands he had the first TITAN X GPU and talked about only a couple of specifications: the card would have 12GB of memory and it would be based on a GPU with 8 billion transistors.
Since that day, you have likely seen picture after picture, rumor after rumor, about specifications, pricing and performance. Wait no longer: the GeForce GTX TITAN X is here. With a $999 price tag and a GPU with 3072 CUDA cores, we clearly have a new king of the court.
GM200 GPU Specifications
The basis for the new GeForce GTX TITAN X is NVIDIA’s GM200 GPU. This large, beastly part is based on the same Maxwell architecture that we know and love from the GTX 980, GTX 970 and GTX 960. It is also built on the same 28nm process technology that those GPUs use. NVIDIA is not moving over to another process tech quite yet, despite the rumors that AMD is going to be migrating to 20nm on its next flagship GPU.
The GM200 includes 3072 CUDA cores, 192 texture units and a 384-bit memory bus. Clearly this GPU isn’t bluffing; it has some power.
| |TITAN X|GTX 980|TITAN Black|R9 290X|
|---|---|---|---|---|
|Rated Clock|1000 MHz|1126 MHz|889 MHz|1000 MHz|
|Memory Clock|7000 MHz|7000 MHz|7000 MHz|5000 MHz|
|Memory Bandwidth|336 GB/s|224 GB/s|336 GB/s|320 GB/s|
|TDP|250 watts|165 watts|250 watts|290 watts|
|Peak Compute|6.14 TFLOPS|4.61 TFLOPS|5.1 TFLOPS|5.63 TFLOPS|
Essentially, the GTX TITAN X’s compute structure is exactly a 50% boost over the GeForce GTX 980 with just a slight reduction in clock rates at stock settings. You get 50% more processing cores, 50% more texture units, 50% more ROP units, 50% larger L2 cache and a 50% larger memory bus. Dang – that’s going to provide some impressive computing power, resulting in a peak theoretical throughput of 6.14 TFLOPS single precision.
During the keynote at GTC, NVIDIA's CEO quoted the single precision compute performance as 7.0 TFLOPS, which differs from the table above. The 6.14 TFLOPS rating above is based on the base clock of the GPU, while the 7.0 TFLOPS number is based on a "peak" clock rate. Also, just for reference, at the rated Boost clock the TITAN X is rated at 6.60 TFLOPS.
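All three figures come out of the same arithmetic, so it is easy to check them. The sketch below assumes the usual convention that each CUDA core retires one fused multiply-add (two floating-point operations) per clock; the function name `peak_tflops` is ours, purely for illustration.

```python
def peak_tflops(cuda_cores: int, clock_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical single precision throughput in TFLOPS.

    Assumes one fused multiply-add (2 FLOPs) per core per clock.
    """
    return cuda_cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12


TITAN_X_CORES = 3072

base_tflops = peak_tflops(TITAN_X_CORES, 1000)   # at the 1000 MHz base clock
boost_tflops = peak_tflops(TITAN_X_CORES, 1075)  # at the 1075 MHz rated Boost clock

# Working backwards: the clock that a 7.0 TFLOPS claim would imply.
implied_peak_clock_mhz = 7.0e12 / (TITAN_X_CORES * 2) / 1e6

print(f"base:  {base_tflops:.2f} TFLOPS")
print(f"boost: {boost_tflops:.2f} TFLOPS")
print(f"clock implied by 7.0 TFLOPS: {implied_peak_clock_mhz:.0f} MHz")
```

Run backwards, the 7.0 TFLOPS number implies a clock of roughly 1140 MHz. That is an inference from the arithmetic, not a clock NVIDIA quoted.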
A unique characteristic of this TITAN X card is that it does not have an accelerated performance configuration for double precision computing, which is something that both the TITAN and the TITAN Black had before it. The double precision performance is still a 1/32nd ratio (relative to single precision). That gives the TITAN X DP compute capability of just 192 GFLOPS. For reference, the TITAN Black has DP performance rated at 1707 GFLOPS with a 1/3rd ratio of the GPU’s 5.12 TFLOPS single precision capability. It appears that NVIDIA is not simply disabling the double precision compute capability on the GM200 GPU, hiding it and saving it for another implementation. Based on the die size, shader count and transistor count, it looks like GM200 just doesn't have it. NVIDIA must have another solution up its sleeve for Maxwell double precision compute.
Oh, and yes, before you ask me in the comments below, I have directly asked NVIDIA for comment about memory configuration concerns, in regard to the GTX 970. The TITAN X does not have any divided memory pools; the 384-bit memory bus is not sectioned off into any sub-groups that may or may not run at slower throughput. So, there’s that.
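To see how wide the double precision gap is, here is a quick sketch of the ratio math using the single precision figures quoted above. The helper name `dp_gflops` is hypothetical, and the 6.144 TFLOPS input is simply the unrounded base-clock figure (3072 cores x 2 x 1000 MHz).

```python
def dp_gflops(sp_tflops: float, dp_ratio: float) -> float:
    """Double precision GFLOPS from single precision TFLOPS and a DP:SP ratio."""
    return sp_tflops * 1000 * dp_ratio


titan_x_dp = dp_gflops(6.144, 1 / 32)     # Maxwell GM200 at a 1/32 rate
titan_black_dp = dp_gflops(5.12, 1 / 3)   # GK110 at a 1/3 rate

print(f"TITAN X:     {titan_x_dp:.0f} GFLOPS")
print(f"TITAN Black: {titan_black_dp:.0f} GFLOPS")
print(f"TITAN Black advantage: {titan_black_dp / titan_x_dp:.1f}x")
```

On paper, that puts the older TITAN Black at nearly nine times the double precision throughput of the new card.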
The GM200 is built on 24 SMM modules, and this is the full GPU implementation, so you should not expect another variant to show up down the road with anything higher than 3072 CUDA cores unless the company spins a completely new GPU revision. Because the GPU is still based on the 28nm TSMC process and is composed of 8 billion transistors, this is not a small component. Our measured die size was 25mm x 25mm, or 625mm². The previous largest GPU we had seen, the GK110, used 7.1 billion transistors and had a die size of around 561mm². (Note that NVIDIA sets the die size at 601mm² based on measurements of 24.66mm x 24.38mm.)
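For anyone who wants to sanity-check the die numbers, the sketch below reproduces the two area figures and derives a rough transistor density. The density values are our own back-of-the-envelope derivation from the quoted transistor counts and die sizes, not figures from NVIDIA, and the variable names are illustrative.

```python
SM_COUNT = 24        # shader modules on the full GM200
CUDA_CORES = 3072

cores_per_sm = CUDA_CORES // SM_COUNT

measured_area_mm2 = 25 * 25          # our measurement
nvidia_area_mm2 = 24.66 * 24.38      # NVIDIA's stated dimensions


def density_millions_per_mm2(transistors: float, area_mm2: float) -> float:
    """Rough transistor density in millions of transistors per square millimeter."""
    return transistors / area_mm2 / 1e6


gm200_density = density_millions_per_mm2(8.0e9, nvidia_area_mm2)
gk110_density = density_millions_per_mm2(7.1e9, 561)

print(f"CUDA cores per shader module: {cores_per_sm}")
print(f"measured area: {measured_area_mm2} mm^2, NVIDIA area: {nvidia_area_mm2:.0f} mm^2")
print(f"GM200: {gm200_density:.1f} M/mm^2, GK110: {gk110_density:.1f} M/mm^2")
```

Both chips land in the same ballpark for density, which is what you would expect from two large parts on the same 28nm process.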
Clock speeds on the TITAN X are lower than the GTX 980 at stock, as you would expect, but the differences aren’t drastic. The TITAN X has a base clock of 1000 MHz and a rated Boost clock of 1075 MHz, while the GTX 980 reference clocks are 1126 MHz and 1216 MHz respectively, which puts the TITAN X roughly 11-12% lower (or, put the other way, the GTX 980's base clock is 12.6% higher). The memory clock is still 7.0 GHz, the same speed as the GTX 980 and even the previous GTX TITAN Black. The memory configuration is improved with 50% higher peak bandwidth than the GTX 980, hitting 336.5 GB/s, which is higher than the 320 GB/s rated by the AMD Radeon R9 290X flagship.
Speaking of memory, the TITAN X will ship with 12GB of memory. Gulp. That is 3x the memory found on the GTX 980 and 2x the GTX TITAN Black, which seemed crazy (at 6GB) when it shipped in February of 2014. I can already hear the debate and no, 12GB of memory is not going to be beneficial for gamers for quite some time, likely a time span even further out than the life of this GPU, to be fair. With the recent debate around the 3.5GB and 4GB frame buffers, we saw that most games available today, even when pushed with settings intended to increase memory usage, rarely extend into the world of 4GB. If you want to think really crazy, let’s assume you are planning on doing 4K Surround gaming with three displays; you might be able to stretch that out to 7-8GB if you try real hard. Of course, there is a secondary audience for this card that focuses on GPGPU compute rather than gaming, where that 12GB of memory could be more useful, more quickly.
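The bandwidth figures follow directly from bus width and effective memory clock. The sketch below is a minimal check under stated assumptions: the 256-bit (GTX 980) and 512-bit (R9 290X) bus widths are not given in this article and are filled in from the cards' published specs, and the function name is illustrative.

```python
def memory_bandwidth_gbs(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz * 1e6 / 1e9


titan_x_bw = memory_bandwidth_gbs(384, 7000)
gtx_980_bw = memory_bandwidth_gbs(256, 7000)   # assumed 256-bit bus
r9_290x_bw = memory_bandwidth_gbs(512, 5000)   # assumed 512-bit bus

print(f"TITAN X: {titan_x_bw:.0f} GB/s")
print(f"GTX 980: {gtx_980_bw:.0f} GB/s")
print(f"R9 290X: {r9_290x_bw:.0f} GB/s")
print(f"TITAN X vs GTX 980: +{(titan_x_bw / gtx_980_bw - 1) * 100:.0f}%")
```

A flat 7000 MHz gives 336 GB/s; the 336.5 GB/s rating would correspond to an effective clock of about 7010 MHz, which is our inference rather than a stated spec.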
The only negative change on the GM200 (compared to GM204) is its rated TDP. With a listing of 250 watts, the GTX TITAN X is clearly going to run hotter than the GTX 980, which has a 165 watt TDP. (Of note, this is a 51% increase in TDP that is in line with the other specification changes.) Keep in mind that the Radeon R9 290X has a TDP of 290 watts, though we have measured it higher than that in our power testing several times. Both the new TITAN X and the R9 290X require a 6+8 pin power connection so it will be interesting to see how the real-world power consumption varies between these two GPUs.
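To put the TDP change in context, here is a rough paper comparison using only the rated TDP and peak compute values from the table above. This is theoretical GFLOPS per rated watt, not measured efficiency, and the dictionary layout is just for illustration.

```python
# (peak single precision TFLOPS, rated TDP in watts), taken from the spec table
cards = {
    "TITAN X": (6.14, 250),
    "GTX 980": (4.61, 165),
    "R9 290X": (5.63, 290),
}

tdp_increase_pct = (cards["TITAN X"][1] / cards["GTX 980"][1] - 1) * 100

gflops_per_watt = {
    name: tflops * 1000 / tdp for name, (tflops, tdp) in cards.items()
}

print(f"TITAN X TDP vs GTX 980: +{tdp_increase_pct:.1f}%")
for name, value in gflops_per_watt.items():
    print(f"{name}: {value:.1f} GFLOPS per rated watt")
```

On paper, the TITAN X gives up a little efficiency to the GTX 980 but stays well ahead of the R9 290X. Real-world power draw can tell a different story than rated TDP.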
The new GTX TITAN X shares the same feature set and capability as the GTX 980. That includes support for MFAA, VXGI acceleration, Dynamic Super Resolution, VR Direct and more. For more information on those features and what they bring to the table check out the links below.
Now, let’s dive into the design of the GTX TITAN X graphics card itself!