GPGPU, CUDA, and You
One of the big selling points that NVIDIA includes with the Fermi line, other than its really good 3D performance, is GPGPU. Whether it is CUDA-based applications or GPU PhysX, NVIDIA has spent millions to help developers write code that will take advantage of the processing power of modern GPUs.
On one extreme we have products that are specially developed for the oil and gas industry, financial markets, and medical imaging. These applications, while affecting us indirectly, are not something we can run locally on our machines. Then again, the geological datasets that the oil and gas industry uses are multiple terabytes in size, and not something average people have access to. On the other extreme we have free GPGPU programs like Folding@Home, which costs nothing to download and lets you donate your processing power to help research how proteins fold and react to each other in highly accurate simulations.
Folding@Home was initially a big win for ATI and their X1800/X1900 series of chips. These were far more flexible and easier to program for (using ATI’s “Close to the Metal” tools) than NVIDIA’s competing GeForce 7000 series of parts. But with NVIDIA pouring millions into helping to develop these applications, and pushing out hardware that was much more flexible and faster than ATI/AMD’s latest generation of parts, we started to see NVIDIA’s products take the lead. F@H has released a Fermi-based folding client which is simply the fastest solution available as of now. AMD is working with the F@H people to get the HD 5000 series up to speed, but on products like the HD 5870 only half of the stream units will be used, because the client was written for the HD 4000 series of chips (the HD 4870 and 4890 had 800 stream units at maximum, and the program continues to use only that many).
Badaboom was one of the first widely available GPGPU applications which utilized CUDA and the GPU to speed up transcoding of video.
Next up is PhysX. This physics middleware is fairly popular in the industry, but certainly not the overwhelming favorite. It includes GPU-accelerated physics, which was a big selling point for titles such as “Batman: Arkham Asylum”. The controversy surrounding PhysX does tarnish its claims to be so much faster on a GPU, when in fact it is not even fully optimized for use on modern CPUs. Whether or not NVIDIA is shortchanging CPU-based physics, GPU physics is a feature nonetheless. It does have a marked impact on scene complexity and interactions, and it can only be fully utilized on NVIDIA GPUs.
Now we are getting into the realm of paying for software. Badaboom is probably the best-known GPGPU application, and it works exactly as advertised. It is a transcoding package which leverages the GPU to get the job done faster than even a top-end, $1000 multi-core CPU. The $30 US cost is certainly not a fortune, and if time is money, then it could easily pay for itself if even a modest amount of transcoding is done on a daily basis. Professional solutions will have more tweakable settings, but for the vast majority of people actively doing transcoding, this is a remarkable piece of software for the price.
“Blow it up and enhance” has been a big-screen cliché since computers started doing heavy-duty photography and video work. Until now, doing such a thing required a server farm and software that costs more than most houses. Thankfully for us, MotionDSP has released vReveal 2.0, which takes many of the same enhancement algorithms from those previously out-of-reach solutions and integrates them into a $40 software package. Don’t expect to take the Zapruder film segment and enhance it enough to find someone on the grassy knoll, but do expect to clean up cell phone footage to remove graininess, increase resolution, and remove the “shaky cam” effect. Again, this product works as advertised, and is a bargain to boot. It currently only uses NVIDIA GPUs for acceleration.
For example, I was able to run vReveal on a 480p video, enhancing sharpness and upconverting it to 1080p. For a 4-minute video, the GPU-accelerated run applied those effects in approximately 5 minutes. Running the same enhancements on a 6-core Phenom II at 3.2 GHz took around 20 minutes.
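To put those timings in perspective, here is a quick back-of-the-envelope sketch of the arithmetic. It uses only the approximate figures quoted above (4-minute clip, roughly 5 minutes on the GPU, roughly 20 minutes on the CPU); the function names are made up for illustration and are not part of vReveal or any other tool.

```python
def speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """How many times faster the accelerated run is than the baseline run."""
    if baseline_minutes <= 0 or accelerated_minutes <= 0:
        raise ValueError("durations must be positive")
    return baseline_minutes / accelerated_minutes


def realtime_factor(processing_minutes: float, clip_minutes: float) -> float:
    """Minutes of processing needed per minute of footage (1.0 = real time)."""
    if processing_minutes <= 0 or clip_minutes <= 0:
        raise ValueError("durations must be positive")
    return processing_minutes / clip_minutes


# Approximate figures from the vReveal run described above.
CLIP_MINUTES = 4.0
GPU_MINUTES = 5.0
CPU_MINUTES = 20.0

if __name__ == "__main__":
    print(f"GPU vs CPU speedup: {speedup(CPU_MINUTES, GPU_MINUTES):.2f}x")
    print(f"GPU: {realtime_factor(GPU_MINUTES, CLIP_MINUTES):.2f} min per minute of video")
    print(f"CPU: {realtime_factor(CPU_MINUTES, CLIP_MINUTES):.2f} min per minute of video")
```

In other words, the GPU run comes out roughly four times faster and lands close to real time, while the CPU alone needs about five minutes of work for every minute of footage. These are single, approximate measurements, so treat the ratios as ballpark numbers only.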
Now in version 2.0, vReveal has the potential to be a “must have” video enhancement tool for a generation of users relying on cellphone video to capture those special (or idiotic) moments in life. Viral videos of people doing stupid stuff should look good!
Video playback solutions are now starting to adopt GPGPU functionality to enhance standard definition content so that it approaches high definition quality. Total Media Theater 3 features a SimHD add-on which uses modern GPUs to improve standard definition content. This particular solution is in the $80 to $100 range, depending on which version is purchased. Results are mixed, with artifacts often introduced into scenes that distract from the actual content being played. It is a nice step forward though, and we should see improvements to this type of application in the next few years that clean up a lot of the outstanding issues.
Then we have the big dogs, the likes of Adobe CS4 and CS5, which are very expensive software packages and “require” a Quadro-based card to utilize GPU acceleration. This is one area that I feel does need to change, but I am coming at it from the consumer standpoint. The folks at NVIDIA and Adobe feel quite differently, as they enjoy higher margins on professional-grade gear and software.
We can see that we have a limited, but interesting cross section of software that can leverage the GPU to improve performance. Again, NVIDIA has poured millions into developer relations to get these CUDA based apps out onto the market. It is not exactly a “must have” feature that is causing NVIDIA chips to fly off the shelves, but it is a feature nonetheless. For some users, this feature could be far more important than the 3D gaming performance of the Fermi family of cards.