SLI Changes and FastSync
In addition to the changes in asynchronous compute, I think the changes to SLI will spark the biggest discussion in the community. Let's start with the technical details, first announced at the GTX 1080 launch last week.
When using two GTX graphics cards in previous generations, you only needed to connect a single SLI connector on each card with a bridge, either a flexible ribbon or a rigid PCB. We always knew that the connection between the two GPUs was bandwidth limited, but to what degree was never publicly known. With Pascal and the GTX 1080, NVIDIA is introducing a new SLI bridge called SLI HB, for high bandwidth. These bridges connect to both of the connectors on top of each card, linking them together to improve bandwidth between the GPUs.
The frequency on the new SLI connection on the GTX 1080 is 650 MHz, up from the 400 MHz in previous GPUs.
This can get confusing, so let’s define types of SLI bridges. First you have the “standard” bridge, one that is built like a ribbon cable, or even an older PCB-based one without lighting. Second you have the “LED bridges” that were released in the last couple of years from both EVGA and NVIDIA; if you have an SLI bridge that lights up when connected, that’s what you have. Finally, the new “HB bridge” is built specifically for the GTX 1080 (and we assume other Pascal GPUs).
The original SLI bridges, which you might have collected from motherboards over the years, are only recommended for single display configurations of up to 2560×1440 @ 60 Hz. If you have one of the LED bridges, you can properly drive high refresh rate 2560×1440 displays as well as 4K monitors. If you want to push into 5K or Surround gaming, though, NVIDIA recommends one of the new high bandwidth SLI bridges.
NVIDIA showed a test result comparing the new SLI HB bridge against an older model when running Shadow of Mordor in 4K Surround. While the older bridge couldn't keep up with transferring the triple 4K images, the new higher bandwidth bridge could, resulting in a smoother gameplay experience with no stutters. It should be noted that without the HB bridge in place, the frames from the second GPU are sent over PCI Express, which has clearly higher latency than the SLI connection.
With the GTX 1080 now recommending that you use BOTH SLI connectors on the card for 2-Way SLI configurations, what happens to 3- and 4-card SLI? By default, NVIDIA will only support two GPUs in SLI, and 3- and 4-Way SLI configurations "are no longer recommended." Why?
As games have evolved, it is becoming increasingly difficult for these SLI modes to provide beneficial performance scaling for end users. For instance, many games become bottlenecked by the CPU when running 3-Way and 4-Way SLI, and games are increasingly using techniques that make it very difficult to extract frame-to-frame parallelism.
But there is a catch! Even though it's not recommended, NVIDIA will still allow 3-Way and 4-Way configurations with the GTX 1080 through the use of something called an "Enthusiast Key."
For this class of user we have developed an Enthusiast Key that can be downloaded off of NVIDIA’s website and loaded into an individual’s GPU. This process involves:
1. Run an app locally to generate a signature for your GPU
2. Request an Enthusiast Key from an upcoming NVIDIA Enthusiast Key website
3. Download your key
4. Install your key to unlock the 3- and 4-Way function
Oooookkkkaaayyy….? I'm torn on this one. I can absolutely understand the position NVIDIA is in. If SLI users represent maybe 2-4% of GeForce users, then 3- and 4-GPU users must be an infinitesimally small minority, making the added cost of testing, validation and driver optimization for that group a painful position to defend to accounting. But if you are going to allow some users (and system builders) to enable it, why put this artificial, but easily bypassed, lock and key system in the way? To me it appears to be nothing more than a way to deemphasize multi-GPU configurations above two cards.
NVIDIA is not building HB SLI bridges for 3- and 4-Way configurations, so you'll have to use legacy bridges with your GTX 1080s. While NVIDIA tells us that they are still putting effort into 3+ card multi-GPU configurations, and educating developers about the best practices to integrate it, they admit that "game designs are becoming less friendly over time for more than 2 GPUs." Because of this, they intend to set "appropriate expectations and steer most users towards 2-way systems."
FastSync splits render and display pipelines
I sure hope you weren't tired of technologies that end with "sync," because NVIDIA is releasing another one with the GTX 1080. FastSync is an alternative to the Vsync On and Vsync Off states, but is not variable like G-Sync or FreeSync. The idea is straightforward: decouple the render pipeline from the display pipeline completely, and there are a lot of interesting things you can do. FastSync tells the game engine that Vsync is OFF, allowing it to render frames as fast as possible. The monitor is then sent frames at its maximum refresh rate, but only completely rendered frames, avoiding the tearing artifacts usually associated with Vsync Off.
FastSync creates a virtual buffer system with three locations: the front buffer, the back buffer and the last rendered buffer. The front buffer is the one scanned out to the monitor at the display's refresh rate. The back buffer is the one being rendered to by the GPU and cannot be scanned out until it's complete. The last rendered buffer holds the frame just completed in the back buffer, essentially saving a copy of the game's most recently rendered frame. When the front buffer finishes scanning to the display, the last rendered buffer is copied to the front buffer and scanned out.
Interestingly, because buffer copies would take time and add latency, the buffers are instead dynamically renamed. In high frame rate games the LRB and BB switch positions continuously at the render rate of the application, and when the FB has completed its most recent scan out, the current LRB is renamed to the FB and immediately starts its scan out.
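The renaming scheme above is easy to see in a few lines of code. This is a minimal illustrative sketch, not NVIDIA's implementation; the class and method names are my own, and frame "contents" are reduced to simple IDs so the pointer swaps are visible.

```python
# Sketch of the FastSync three-buffer renaming scheme described above.
# "Renaming" means swapping references between buffer slots, never
# copying pixel data. Hypothetical illustration only.

class FastSyncBuffers:
    def __init__(self):
        # Each slot holds the id of the frame it contains (None = empty).
        self.front = None           # being scanned out to the display
        self.back = None            # being rendered to by the GPU
        self.last_rendered = None   # most recently completed frame (LRB)

    def frame_rendered(self, frame_id):
        """GPU finished a frame: the back buffer and the LRB swap roles,
        so the LRB always holds the newest complete frame."""
        self.back = frame_id
        self.back, self.last_rendered = self.last_rendered, self.back

    def scanout_complete(self):
        """Display finished a refresh: if a newer frame exists in the LRB,
        rename it to the front buffer and start scanning it out."""
        if self.last_rendered is not None:
            self.front, self.last_rendered = self.last_rendered, self.front
        return self.front

# At 200 FPS into a 60 Hz panel, several frames finish per refresh;
# only the newest survives in the LRB, the rest are silently dropped.
bufs = FastSyncBuffers()
for frame in range(1, 5):       # frames 1..4 complete during one refresh
    bufs.frame_rendered(frame)
print(bufs.scanout_complete())  # -> 4 (only the newest frame is shown)
```

Note that if no new frame has completed when the scan out finishes, the same front buffer is simply scanned out again, which is exactly the repeated-frame behavior you'd see when the game runs slower than the refresh rate.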
The usage model for FastSync is games running at very high frame rates (competitive gaming) that otherwise have to choose between the high input latency of Vsync On and the screen tearing of Vsync Off. CS:GO gamers used to hitting 200 FPS will be able to play the game tear-free with only a very slight increase in latency, about 8 ms according to NVIDIA.
This is definitely something that should only be enabled in the NVIDIA Control Panel for games running at frame rates well above the maximum refresh rate of your display. FastSync will by its very nature introduce some variability to the smoothness of your game, as it is "dropping" frames on purpose to keep in time with the refresh of your monitor while also not putting backpressure on the game engine to keep timings lined up. At high frame rates this variability isn't really an issue, as the frame times are so short (200 FPS = 5 ms) that +/- 5 ms is the maximum frame-to-frame variance you would see. At lower frame rates, say 45 FPS, you could see as much as a 22 ms variance.
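The arithmetic behind those variance figures is simple: since FastSync always promotes whichever frame finished most recently, a displayed frame can be up to one full render interval stale, so the worst-case frame-to-frame variance is roughly one frame time, or 1000/FPS milliseconds. A quick sketch (my own illustration, not an NVIDIA formula):

```python
# Worst-case frame-to-frame variance under FastSync is bounded by
# roughly one render interval: 1000 / FPS milliseconds.

def frame_time_ms(fps):
    """Render interval in milliseconds at a given frame rate."""
    return 1000.0 / fps

for fps in (200, 45):
    print(f"{fps} FPS -> ~{frame_time_ms(fps):.0f} ms max variance")
# 200 FPS gives ~5 ms and 45 FPS gives ~22 ms, matching the figures above
```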
FastSync is a cool new feature to improve the experience of FAST games, but don’t think NVIDIA has found a free alternative to variable refresh rate technology.
NVIDIA did state that FastSync was coming to Maxwell as well, and possibly even Kepler graphics cards.