TU102 and TU104 Specifications, Overclocking, and Multi-GPU

In an unusual move, NVIDIA is simultaneously launching the Turing architecture into the market with two different GPUs.

Representing the highest-end offerings we are likely to see with Turing, the TU104 and TU102 are also among the largest GPUs we've ever seen, in both silicon area and transistor count.

First, let's take a look at the Turing lineup compared to the rest of the current-generation GPUs.

| | RTX 2080 Ti | Quadro RTX 6000 | GTX 1080 Ti | RTX 2080 | Quadro RTX 5000 | GTX 1080 | TITAN V | RX Vega 64 (Air) |
|---|---|---|---|---|---|---|---|---|
| GPU | TU102 | TU102 | GP102 | TU104 | TU104 | GP104 | GV100 | Vega 64 |
| GPU Cores | 4352 | 4608 | 3584 | 2944 | 3072 | 2560 | 5120 | 4096 |
| Base Clock | 1350 MHz | 1455 MHz | 1408 MHz | 1515 MHz | 1620 MHz | 1607 MHz | 1200 MHz | 1247 MHz |
| Boost Clock | 1545 MHz / 1635 MHz (FE) | 1770 MHz | 1582 MHz | 1710 MHz / 1800 MHz (FE) | 1820 MHz | 1733 MHz | 1455 MHz | 1546 MHz |
| Texture Units | 272 | 288 | 224 | 184 | 192 | 160 | 320 | 256 |
| ROP Units | 88 | 96 | 88 | 64 | 64 | 64 | 96 | 64 |
| Tensor Cores | 544 | 576 | - | 368 | 384 | - | 640 | - |
| Ray Tracing Speed | 10 GRays/s | 10 GRays/s | - | 8 GRays/s | 8 GRays/s | - | - | - |
| Memory | 11 GB | 24 GB | 11 GB | 8 GB | 16 GB | 8 GB | 12 GB | 8 GB |
| Memory Clock | 14000 MHz | 14000 MHz | 11000 MHz | 14000 MHz | 14000 MHz | 10000 MHz | 1700 MHz | 1890 MHz |
| Memory Interface | 352-bit G6 | 384-bit G6 | 352-bit G5X | 256-bit G6 | 256-bit G6 | 256-bit G5X | 3072-bit HBM2 | 2048-bit HBM2 |
| Memory Bandwidth | 616 GB/s | 672 GB/s | 484 GB/s | 448 GB/s | 448 GB/s | 320 GB/s | 653 GB/s | 484 GB/s |
| TDP | 250 W / 260 W (FE) | 260 W | 250 W | 215 W / 225 W (FE) | 230 W | 180 W | 250 W | 292 W |
| Peak Compute (FP32) | 13.4 TFLOPS / 14.2 TFLOPS (FE) | 16.3 TFLOPS | 10.6 TFLOPS | 10 TFLOPS / 10.6 TFLOPS (FE) | 11.2 TFLOPS | 8.2 TFLOPS | 14.9 TFLOPS | 13.7 TFLOPS |
| Transistor Count | 18.6 B | 18.6 B | 12.0 B | 13.6 B | 13.6 B | 7.2 B | 21.0 B | 12.5 B |
| Process Tech | 12nm | 12nm | 16nm | 12nm | 12nm | 16nm | 12nm | 14nm |
| MSRP (current) | $1200 (FE) / $1000 | $6,300 | $699 | $800 / $700 | $2,300 | $549 | $2,999 | $499 |
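Two rows in the table above, Peak Compute and Memory Bandwidth, can be derived from other rows. The following is a minimal sketch of that arithmetic, not NVIDIA's published method. It assumes each GPU core completes one fused multiply-add (counted as 2 FLOPs) per clock at the boost frequency, and it treats the Memory Clock figure as the effective per-pin data rate. The function names are made up for illustration.

```python
def peak_fp32_tflops(cores: int, boost_mhz: float) -> float:
    """Peak FP32 throughput in TFLOPS.

    Assumption: one fused multiply-add (2 FLOPs) per core per clock.
    """
    return 2 * cores * boost_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_mhz: float) -> float:
    """Memory bandwidth in GB/s.

    Assumption: the table's Memory Clock is the effective per-pin data rate,
    so bandwidth = (bus width in bytes) * (data rate).
    """
    return bus_width_bits / 8 * data_rate_mhz / 1000


# (cores, boost clock MHz, bus width bits, memory clock MHz) from the table
cards = {
    "RTX 2080 Ti (reference)": (4352, 1545, 352, 14000),
    "RTX 2080 Ti (FE)": (4352, 1635, 352, 14000),
    "RTX 2080 (reference)": (2944, 1710, 256, 14000),
    "RTX 2080 (FE)": (2944, 1800, 256, 14000),
}

for name, (cores, boost, bus, rate) in cards.items():
    tflops = peak_fp32_tflops(cores, boost)
    bandwidth = memory_bandwidth_gbs(bus, rate)
    print(f"{name}: {tflops:.1f} TFLOPS, {bandwidth:.0f} GB/s")
```

Under these assumptions the results line up with the table's figures for the RTX 2080 Ti (13.4 and 14.2 TFLOPS, 616 GB/s) and the RTX 2080 Founders Edition (10.6 TFLOPS, 448 GB/s).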

At 4352, the RTX 2080 Ti features just over 20% more CUDA cores than the previous-generation GTX 1080 Ti. Similarly, the RTX 2080 features 15% more CUDA cores than the GTX 1080.

Clock speeds, however, seem to be mostly comparable generation-to-generation, with the base clocks of the RTX cards actually coming in a bit lower than those of the Pascal-based GTX 10-series GPUs.
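The generational comparison can be checked directly against the table. This is a small sketch using only the core counts and base clocks listed there; the helper name `percent_change` is just for illustration.

```python
def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# (Turing card, (cores, base MHz), Pascal predecessor, (cores, base MHz))
pairs = [
    ("RTX 2080 Ti", (4352, 1350), "GTX 1080 Ti", (3584, 1408)),
    ("RTX 2080", (2944, 1515), "GTX 1080", (2560, 1607)),
]

for new_name, (new_cores, new_base), old_name, (old_cores, old_base) in pairs:
    cores_delta = percent_change(new_cores, old_cores)
    base_delta = percent_change(new_base, old_base)
    print(f"{new_name} vs {old_name}: "
          f"cores {cores_delta:+.1f}%, base clock {base_delta:+.1f}%")
```

With the table's numbers this works out to about +21.4% cores and -4.1% base clock for the RTX 2080 Ti, and +15.0% cores and -5.7% base clock for the RTX 2080.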

As usual, the RTX 2080 Ti does not represent the fully enabled TU102 die, leaving room for a potential TITAN RTX in the future.

One of the more unusual aspects of the Turing specifications is the need to separate "Founders Edition" and "Reference" specifications. Since NVIDIA is, for the first time, selling its Founders Edition graphics cards overclocked out of the box, it will be interesting to see which third-party designs, if any, run at the reference clock speeds.
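To put a number on that factory overclock, here is a short sketch that compares the Founders Edition and reference boost clocks and TDPs from the specification table. The data layout and function name are illustrative only.

```python
# Boost clock (MHz) and TDP (W) for reference vs. Founders Edition,
# taken from the specification table.
specs = {
    "RTX 2080 Ti": {"reference": (1545, 250), "fe": (1635, 260)},
    "RTX 2080": {"reference": (1710, 215), "fe": (1800, 225)},
}


def factory_overclock(card: str) -> tuple:
    """Return (boost delta in MHz, boost delta in percent, TDP delta in W)."""
    ref_mhz, ref_tdp = specs[card]["reference"]
    fe_mhz, fe_tdp = specs[card]["fe"]
    delta_mhz = fe_mhz - ref_mhz
    return delta_mhz, delta_mhz / ref_mhz * 100, fe_tdp - ref_tdp


for card in specs:
    mhz, pct, watts = factory_overclock(card)
    print(f"{card} FE: +{mhz} MHz boost ({pct:.1f}%), +{watts} W TDP")
```

Both Founders Edition cards come out at +90 MHz on the boost clock (roughly 5.8% for the RTX 2080 Ti and 5.3% for the RTX 2080) with a 10 W higher TDP.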

NVIDIA Scanner

Speaking of overclocking, one of the most personally exciting features of Turing is NVIDIA Scanner.

Essentially, NVIDIA Scanner will be built into programs like EVGA Precision X and MSI Afterburner, allowing for automated overclocking of RTX-based graphics cards.

With one click, users can start the scanner, which will then begin applying higher frequencies, testing stability, and adjusting voltages as necessary until it reaches what NVIDIA has determined is the highest stable frequency at the lowest possible voltage.

The test load that NVIDIA runs is math-based rather than graphical. This NVIDIA-tuned workload is meant to ensure the highest level of stability across the broadest range of applications.
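The general shape of such a scan (raise the frequency, test stability, move to the next voltage point) can be sketched as a simple search loop. This is purely a conceptual illustration and not NVIDIA's actual algorithm or API. The function names, the voltage/frequency values, and the simulated `is_stable` check are all invented for the example; a real tool would run a stress workload on the GPU instead.

```python
from typing import Callable, Dict


def scan_voltage_point(voltage_mv: int, stock_mhz: int, step_mhz: int,
                       limit_mhz: int,
                       is_stable: Callable[[int, int], bool]) -> int:
    """Return the highest frequency that passed the test at this voltage.

    Assumes the stock frequency itself is stable.
    """
    best = stock_mhz
    freq = stock_mhz + step_mhz
    # Keep stepping up until the test fails or the search limit is reached.
    while freq <= limit_mhz and is_stable(voltage_mv, freq):
        best = freq
        freq += step_mhz
    return best


def scan_curve(stock_curve: Dict[int, int], step_mhz: int,
               max_offset_mhz: int,
               is_stable: Callable[[int, int], bool]) -> Dict[int, int]:
    """Scan every voltage point of a voltage -> frequency curve."""
    return {
        voltage: scan_voltage_point(voltage, freq, step_mhz,
                                    freq + max_offset_mhz, is_stable)
        for voltage, freq in stock_curve.items()
    }


# Hypothetical card: made-up stock curve (mV -> MHz) and hidden true limits.
stock_curve = {800: 1500, 900: 1650, 1000: 1800}
true_limits = {800: 1590, 900: 1762, 1000: 1935}


def simulated_is_stable(voltage_mv: int, freq_mhz: int) -> bool:
    """Stand-in for a real stress test: stable if at or below the hidden limit."""
    return freq_mhz <= true_limits[voltage_mv]


tuned = scan_curve(stock_curve, step_mhz=15, max_offset_mhz=300,
                   is_stable=simulated_is_stable)
print(tuned)
```

Because the search moves in fixed 15 MHz steps, the result lands on the last passing step rather than the exact limit. A finer step size trades longer scan time for a tighter result.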

If you've ever had an overclock that you thought was stable only to crash on a new game title, you'll understand the problem that NVIDIA is trying to solve here.

Hardcore enthusiasts need not worry: the same level of manual overclocking support seen in Pascal will also be available for Turing, for users who aren't happy with the overclock from NVIDIA Scanner or who want to tune the card completely themselves.

While NVIDIA is launching this feature with the RTX GPUs, they are expected to (eventually) bring NVIDIA Scanner support to older GPUs, such as the Pascal-based GTX 10-series.

NVLink

The NVLink interface now handles multi-GPU (SLI) support on Turing. While NVIDIA has been using NVLink in enterprise-level products, such as the Volta-based V100 GPU, for some time, Turing marks the first adoption of NVLink on a consumer graphics card.

NVLink is capable of a massive 50GB/s of bandwidth with the single-link connection found on the RTX 2080, and 100GB/s with the dual-link connection on the RTX 2080 Ti. This additional bandwidth won't help your current SLI experience at all, but it is necessary to achieve proper SLI performance at resolutions such as 8K.
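For a rough sense of scale, the sketch below estimates how long it would take to move one rendered 8K frame across the link. It is back-of-the-envelope arithmetic under stated assumptions: an 8K frame of 7680x4320 pixels, 4 bytes per pixel, 1 GB taken as 10^9 bytes, and the full quoted bandwidth available with no protocol overhead. Real multi-GPU traffic patterns are more complex than a single frame copy.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed frame. Assumption: 4 bytes per pixel."""
    return width * height * bytes_per_pixel


def transfer_time_ms(num_bytes: int, bandwidth_gb_per_s: float) -> float:
    """Time to move num_bytes at the given bandwidth (1 GB = 1e9 bytes)."""
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1000


frame_8k = frame_bytes(7680, 4320)
print(f"8K frame: {frame_8k / 1e6:.1f} MB")

# Link bandwidths quoted in the article for each card.
for card, bandwidth in [("RTX 2080 (single link)", 50),
                        ("RTX 2080 Ti (dual link)", 100)]:
    print(f"{card}: {transfer_time_ms(frame_8k, bandwidth):.2f} ms per frame")
```

Under these assumptions, an 8K frame is about 132.7 MB and takes roughly 2.65 ms to transfer at 50GB/s or 1.33 ms at 100GB/s, against a budget of about 16.7 ms per frame at 60 frames per second.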

With a new interface, of course, comes new bridges. NVLink bridges are set to be available alongside the retail launch of the RTX 2080 and 2080 Ti on September 20th from a wide array of partners and NVIDIA themselves.
