ATI’s R580 – What R520 Should Have Been
ATI’s R580 architecture is released to the world today and it might be the fastest GPU we have ever seen here at PC Perspective. Did I mention we look at a LOT of GPUs here…?
ATI’s R580 GPU architecture was probably one of the worst-kept secrets in the industry in quite a long time. Even before R520 was finalized, rumors abounded that the R580 was where ATI was really focusing and where they felt very confident they would get the upper hand on NVIDIA’s reigning G70. The R520 launch, otherwise known as the X1800, X1600 and X1300, vaulted past us less than four months ago and hot on its heels comes another ATI flagship GPU, the X1900; the one that should have been.
In ATI’s defense, they hadn’t planned on having the R580 released this close to the R520, but due to some problems with the yields and scaling on the R520 that weren’t worked out soon enough, the R520 was pushed back almost into the lifetime of the R580. ATI desperately needed to get SOMETHING out to compete with the 7800 cards from NVIDIA, as the aging X850 video cards weren’t even getting close. Thus, X1800 was thrust into the world late and somewhat lacking, and thus, R580 comes to pick up the pieces and spread good tidings to the gamers across the land.
A Pixel Pipeline Bonanza
Let’s jump right into the thick of things: what makes the new R580 architecture better than that of the R520? It can be summed up in two phrases: lots o’ pixel pipes and lots o’ yield. Whereas the R520 had problems getting up to the clock speeds ATI needed to compete, ATI is claiming to have had no problems with the R580, even with the larger and more complex architecture underneath it.
(Note: Much of this architecture is similar to that of the R520 (X1800) that we went into great detail on when it was released back in October. You should probably familiarize yourself with that information to get the most understanding out of what the R580 is offering.)
Below is a quick recap of the R520 architecture. There are 16 pixel pipelines, 8 vertex shader pipelines, 16 texture addressing units and 16 render back-end units that handle rasterization, antialiasing and Z culling.
R520 Architecture – Radeon X1800
It was a decent architecture, and ATI’s first decoupled design, one that allowed them to dynamically adjust the number of individual pipelines as they desired. But NVIDIA’s 24 pixel pipe design still had the edge in a lot of areas. Not so much now, it would seem, by looking at the R580 architecture diagram below.
R580 Architecture – Radeon X1900
That mash of lines and squares is actually showing 48 pixel pipelines in the new X1900 architecture, three times as many as we had in the X1800! Those 48 pixel pipes are still grouped together in sets of four and each pixel pipeline also has a corresponding flow control unit, bringing them up to 48 total as well.
This quick table shows you the raw processing power that the additional 32 pixel processors bring to the table for the X1900 architecture. It is indeed quite impressive.
An important note here is that besides the additional pixel processing power, no other components in the GPU have really been changed all that much. We’ll get into the reasoning behind ATI’s decision to leave the vertex and rendering stages alone in the R580 later. The chip is still based on the 90nm technology that the X1800 used, but tweaked and fixed so that ATI should be getting much better yields this time around. Some other quick comparison specs on the die and transistor count are here:
R580 vs. R520
90nm process technology
Transistors: 384 Million vs. 320 Million
Die size: 315 sq mm vs. 264 sq mm
You can see that the R580 is a pretty big chip, though it isn’t quite as large as the G70 built on the 110 nm process.
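For a rough sense of what ATI got for that extra silicon, here is a quick back-of-the-envelope calculation in Python. It uses only the unit counts and die figures quoted in this article; the dictionary keys and helper names are our own, and nothing here is a measurement.

```python
# Figures as quoted in this article (transistors in millions, die area in sq mm).
# Key and function names are illustrative, not from ATI.
r580 = {"transistors_m": 384, "die_mm2": 315, "pixel_processors": 48}
r520 = {"transistors_m": 320, "die_mm2": 264, "pixel_processors": 16}


def ratio(key):
    """How many times larger the R580 figure is than the R520 figure."""
    return r580[key] / r520[key]


def density(chip):
    """Millions of transistors per square millimeter of die."""
    return chip["transistors_m"] / chip["die_mm2"]


for key in r580:
    print(f"{key}: R580 is {ratio(key):.2f}x R520")

print(f"R580 density: {density(r580):.3f} M transistors per sq mm")
print(f"R520 density: {density(r520):.3f} M transistors per sq mm")
```

By these numbers the R580 triples the pixel processor count while growing the transistor budget and die area by only about 20% and 19% respectively, which fits with the vertex and rendering stages being left mostly alone.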
Other changes made in this new revision include 50% more memory for Hierarchical Z, which allows the X1900 to remove unnecessary pixels from the rendering pipeline even faster at higher resolutions. Those of you with monitors above 1600×1200 will see benefits from that. ATI is also introducing a new texture fetch option to developers in order to improve shadow rendering performance. Dubbed “Fetch4”, this feature allows the GPU to grab a single component from each of four adjacent texture addresses in one fetch, which is useful when only one value per texel is needed (as is the case in most soft shadow rendering techniques).
This is in contrast to a standard texture fetch that still grabs four components, but they correspond to the RGB and alpha values of a single address. If this gets implemented, better performance on such applications would be assured. We’ll wait and see if ATI points us to any developers out there willing to give it a try.
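To make the difference concrete, here is a toy CPU-side model of the two fetch styles in Python. This is only a sketch of the idea as described above, not ATI’s hardware behavior or any real graphics API: the function names, the 2x2 footprint layout, the edge clamping and the simple depth-comparison shadow term are all assumptions made for illustration.

```python
# Toy model: a texture is a 2D grid of (R, G, B, A) tuples.
# All names here are hypothetical; this is not a real driver or shader API.

def standard_fetch(texture, x, y):
    """One fetch returns the four components stored at a single address."""
    return texture[y][x]


def footprint(texture, x, y):
    """Assumed 2x2 block of adjacent addresses, clamped at the texture edge."""
    x1 = min(x + 1, len(texture[0]) - 1)
    y1 = min(y + 1, len(texture) - 1)
    return [(x, y), (x1, y), (x, y1), (x1, y1)]


def fetch4(texture, x, y, channel=0):
    """One fetch returns a single channel from four adjacent addresses."""
    return tuple(texture[sy][sx][channel] for sx, sy in footprint(texture, x, y))


def lit_fraction(depths, fragment_depth):
    """Share of samples that do not occlude the fragment (a simple soft shadow term)."""
    return sum(1 for d in depths if d >= fragment_depth) / len(depths)


def shadow_standard(depth_map, x, y, fragment_depth):
    """Four standard fetches; three of the four components in each go unused."""
    depths = [standard_fetch(depth_map, sx, sy)[0]
              for sx, sy in footprint(depth_map, x, y)]
    return lit_fraction(depths, fragment_depth), 4  # (shadow term, fetch count)


def shadow_fetch4(depth_map, x, y, fragment_depth):
    """A single Fetch4-style fetch gathers the same four depth values."""
    depths = fetch4(depth_map, x, y)
    return lit_fraction(depths, fragment_depth), 1  # (shadow term, fetch count)


# Made-up depth map with depth stored in the red channel.
depth_map = [
    [(0.2, 0.0, 0.0, 1.0), (0.8, 0.0, 0.0, 1.0)],
    [(0.5, 0.0, 0.0, 1.0), (0.9, 0.0, 0.0, 1.0)],
]

print(shadow_standard(depth_map, 0, 0, 0.6))
print(shadow_fetch4(depth_map, 0, 0, 0.6))
```

Both paths produce the same shadow term, but in this model the Fetch4-style path gets there with one fetch instead of four, which is where the claimed performance benefit would come from.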