GF106 and the cores that love it
The new GeForce GTS 450 1GB graphics card is based on a completely new GF106 GPU that takes the Fermi architecture and pulls it back a bit more to make a lower cost, more efficient unit. Starting at $120 or so, the GTS 450 is the cheapest NVIDIA DX11 part available but how can it compare to the Radeon HD 5770?
Introduction
The high end of the graphics market is always the sexiest but we all know that the mid-range and more budget minded graphics cards are where the likes of AMD and NVIDIA make their money. The first release of the Fermi graphics cards in March of this year brought us the GeForce GTX 480 and GTX 470; both high performance cards but riddled with complaints about power consumption and heat. The GF100 GPU used in those two cards was not going to be well suited to make a move down to the low end space where the $200 price point is key.
In July we got the GeForce GTX 460 based on a completely revamped GPU, the GF104, that featured 336 CUDA cores (shader processors) compared to the 480 cores on the GTX 480 and 448 cores on the GTX 470. Not only that but NVIDIA took the time to re-balance the combination of shader cores, texture units and PolyMorph Engines (what NVIDIA uses for tessellation support in DX11). Those changes resulted in a GPU that was much more competitive in terms of performance per watt when placed against the AMD Radeon HD 5000 series of graphics cards and really became THE card to get in the $190-230 price range.
Today we are going another step lower as NVIDIA releases the GeForce GTS 450 that will retail for $130-140.
The GF106 GPU – Fermi shrinks again
The GeForce GTS 450 1GB graphics card is taking aim at the Radeon HD 5700 series of cards with another GPU revision, GF106. This new GPU will feature 192 CUDA/shader cores, 32 texture units and 4 PolyMorph Engines all while using just over 100 watts of power.
This comparison table that NVIDIA provided in its documentation clearly demonstrates the progression downward from the GF100 flagship to the GF106 designs we are seeing for the first time. Notice that while the GTX 480 has 2.5x the CUDA cores of the new GTS 450, it has 3.75x the PolyMorph Engines but less than 2x the texture units. And because the GTS 450 has the exact same ratios as the GTX 460 (based on the GF104 GPU) we can assume that NVIDIA has found a balance for the low end cards that it is happy with.
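For anyone who wants to check those ratios, here is a minimal Python sketch of the arithmetic. The GTS 450 figures come straight from this article; the GTX 480 counts (15 PolyMorph Engines, 60 texture units) and the GTX 460 counts (7 PolyMorph Engines, 56 texture units) are not spelled out in the text and are filled in from NVIDIA's published specifications, so treat them as assumptions. The function names are purely illustrative.

```python
# Unit counts per card. GTS 450 numbers are from the article; the GTX 480 and
# GTX 460 PolyMorph Engine and texture unit counts are assumed from NVIDIA's
# public spec sheets, not taken from the text.
CARDS = {
    "GTX 480": {"cuda_cores": 480, "polymorph": 15, "texture_units": 60},
    "GTX 460": {"cuda_cores": 336, "polymorph": 7, "texture_units": 56},
    "GTS 450": {"cuda_cores": 192, "polymorph": 4, "texture_units": 32},
}


def scale(card, baseline, unit):
    """How many times more of `unit` does `card` have than `baseline`?"""
    return CARDS[card][unit] / CARDS[baseline][unit]


def per_polymorph(card):
    """(CUDA cores, texture units) available per PolyMorph Engine."""
    c = CARDS[card]
    return (c["cuda_cores"] / c["polymorph"],
            c["texture_units"] / c["polymorph"])


if __name__ == "__main__":
    for unit in ("cuda_cores", "polymorph", "texture_units"):
        print(f"GTX 480 vs GTS 450, {unit}: {scale('GTX 480', 'GTS 450', unit):.3f}x")
    for card in CARDS:
        cores, tex = per_polymorph(card)
        print(f"{card}: {cores:.0f} cores and {tex:.0f} texture units per PolyMorph Engine")
```

With these inputs the GTX 460 and GTS 450 both land on 48 cores and 8 texture units per PolyMorph Engine, while the GTX 480 sits at 32 and 4, which is the re-balancing described above.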
Compared to the GF100, both of the newer GPUs are weighted much more heavily towards texture and shader performance as opposed to running so heavy on tessellation support. This turns out to be very noticeable in games like Metro 2033 and Lost Planet 2 (and we assume the upcoming HAWX 2 as well) and tells me that the PolyMorph Engines were at least partially responsible for the higher power consumption and lower efficiency of GF100 compared to the Radeon HD 5000 series.
You can also see in the first table that NVIDIA has put a target resolution of 1680×1050 on the new GeForce GTS 450, based on information from the Steam hardware survey indicating that more than 50% of PC gamers are using resolutions at or below that level. While that is a fair read of the current market, I suspect that many of these users are on notebooks or Macs and as such may not be in the DIY upgrade market the GTS 450 is targeting.
The GF106 block diagram above looks almost exactly like one half of a GF104 GPU. What is more interesting to me is that this is the first time a Fermi GPU has been fully utilized: the GF100 had one SM (streaming multiprocessor, the collection of CUDA cores associated with a single PolyMorph Engine) disabled, as did the GF104. Both GPUs had the ability to run with higher shader counts but disabled some due to either yield issues or power/heat issues. The GF106 does not do that and all 192 CUDA cores on the die are being utilized in the GTS 450 SKU.
This does NOT mean that we won’t see 144 or even 96 core parts down the road that reach into the sub-$100 markets, but for now, the GF106 is limited to this single graphics product.
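The 144 and 96 core figures are not arbitrary: with 192 cores spread across 4 SMs (one per PolyMorph Engine), each SM holds 48 cores, so those counts correspond to 3 and 2 active SMs. The Python sketch below runs that arithmetic. It assumes any cut-down part would disable whole SMs, the way GF100 and GF104 were trimmed; that is my assumption, not something NVIDIA has announced, and `active_sms` is just an illustrative helper.

```python
TOTAL_CORES = 192  # CUDA cores on GF106, from the article
SM_COUNT = 4       # one SM per PolyMorph Engine, 4 engines per the article

CORES_PER_SM = TOTAL_CORES // SM_COUNT  # 48


def active_sms(core_count):
    """Number of whole SMs needed for a hypothetical part with `core_count` cores.

    Raises ValueError if the count is not a whole number of SMs or does not
    fit on the die. Assumes cores are only ever disabled a full SM at a time.
    """
    sms, remainder = divmod(core_count, CORES_PER_SM)
    if remainder != 0 or not 1 <= sms <= SM_COUNT:
        raise ValueError(f"{core_count} cores is not a valid whole-SM configuration")
    return sms


if __name__ == "__main__":
    for cores in (192, 144, 96):
        print(f"{cores} cores -> {active_sms(cores)} of {SM_COUNT} SMs enabled")
```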
The single GPC (graphics processing cluster) uses a fairly modest 128-bit memory interface utilizing 1GB of GDDR5 memory.
The clock speeds on the GeForce GTS 450 run at 783 MHz for the core, 1566 MHz for the shaders and 900 MHz for the memory. Those are pretty modest speeds but as it turns out those reference clocks are almost useless for this release. Every single partner I talked to leading up to this GPU launch said they had reference AND overclocked cards ready, some reaching as high as 930 MHz core speed; that is an increase of nearly 19% over what NVIDIA is setting.
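To put numbers on those clocks, here is a rough Python sketch. The clock speeds and the 128-bit bus width are from this article; the factor of 4 data transfers per quoted memory clock is how GDDR5 is normally rated and is my assumption here, as are the helper names. The headroom of the fastest partner card comes out near 18.8%.

```python
CORE_MHZ = 783                  # reference core clock, from the article
SHADER_MHZ = 1566               # reference shader clock, from the article
MEMORY_MHZ = 900                # reference memory clock, from the article
BUS_WIDTH_BITS = 128            # memory interface width, from the article
FASTEST_PARTNER_CORE_MHZ = 930  # highest partner core clock mentioned


def percent_increase(base, new):
    """Percentage gain of `new` over `base`."""
    return (new - base) / base * 100


def gddr5_bandwidth_gb_s(memory_mhz, bus_bits, transfers_per_clock=4):
    """Peak memory bandwidth in GB/s.

    Assumes GDDR5 moves 4 bits per pin per quoted clock cycle, which is the
    usual convention for GDDR5 ratings but is not stated in the article.
    """
    bits_per_second = memory_mhz * 1e6 * transfers_per_clock * bus_bits
    return bits_per_second / 8 / 1e9


if __name__ == "__main__":
    print(f"Shader-to-core clock ratio: {SHADER_MHZ / CORE_MHZ:.1f}x")
    print(f"Partner overclock headroom: "
          f"{percent_increase(CORE_MHZ, FASTEST_PARTNER_CORE_MHZ):.1f}%")
    print(f"Peak memory bandwidth: "
          f"{gddr5_bandwidth_gb_s(MEMORY_MHZ, BUS_WIDTH_BITS):.1f} GB/s")
```

Under those assumptions the shader clock is exactly twice the core clock and the 128-bit bus tops out around 57.6 GB/s.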
If so many, if not all, of the GF106 GPUs TSMC can pump out for NVIDIA are capable of 850+ MHz, one has to wonder why NVIDIA would set the reference speeds so low to begin with? The likely answer is that NVIDIA wanted to be able to sell a card that would perform nearly identically to the competition at this price point (HD 5770) while allowing card partners to sell slightly higher priced models that can perform beyond it.
NVIDIA still won’t allow us to use more than two of these cards in SLI (just as with the GTX 460) though as you will see later on the dual-card SLI performance scaling is pretty sweet and makes a set of them (at $260) a very compelling option next to a single GTX 470 (~$290).
The GF106 is also the first Fermi GPU to be released without a heat spreader so that we can finally see the size of the die up close and in person. The GF104 (bottom of this page) is much larger but without seeing the chip itself it is hard to compare them reliably.