Graphics card manufacturer NVIDIA launched a new Tesla K20X accelerator card today that supplants the existing K20 as the top of the line model. The new card cranks up the double and single precision floating point performance, beefs up the memory capacity and bandwidth, and brings some efficiency improvements to the supercomputer space.
While it is not yet clear how many CUDA cores the K20X has, NVIDIA has stated that it is using the GK110 GPU, and is running with 6GB of memory with 250 GB/s of bandwidth – a nice improvement over the K20’s 5GB at 208 GB/s. Both the new K20X and K20 accelerator cards are based on the company’s Kepler architecture, but NVIDIA has managed to wring out more performance from the K20X. The K20 is rated at 1.17 TFlops peak double precision and 3.52 TFlops peak single precision while the K20X is rated at 1.31 TFlops and 3.95 TFlops.
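To put those spec bumps in perspective, here is a minimal Python sketch that works out the percentage gain of the K20X over the K20 using only the figures quoted above. The dictionary keys and function names are made up for illustration and do not come from NVIDIA.

```python
# Figures quoted in the article for the two Kepler-based Tesla cards.
# Key names are illustrative only.
SPECS = {
    "K20":  {"memory_gb": 5, "bandwidth_gbs": 208, "dp_tflops": 1.17, "sp_tflops": 3.52},
    "K20X": {"memory_gb": 6, "bandwidth_gbs": 250, "dp_tflops": 1.31, "sp_tflops": 3.95},
}


def percent_gain(old, new):
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


def k20x_gains(specs=SPECS):
    """Percent improvement of the K20X over the K20 for every quoted metric."""
    base, top = specs["K20"], specs["K20X"]
    return {metric: percent_gain(base[metric], top[metric]) for metric in base}


if __name__ == "__main__":
    for metric, gain in k20x_gains().items():
        print(f"{metric}: +{gain:.1f}%")
```

By this arithmetic the K20X has roughly 20% more memory and memory bandwidth than the K20, and roughly 12% more peak floating point throughput in both precisions.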
The K20X manages to score 1.22 TFlops in DGEMM, which makes it almost three times as fast as the previous-generation Tesla M2090 accelerator based on the Fermi architecture.
Aside from pure performance, NVIDIA is also touting efficiency gains with the new K20X accelerator card. When two K20X cards are paired with a 2P Sandy Bridge server, NVIDIA claims to achieve 76% efficiency, versus 61% for a 2P Sandy Bridge server equipped with two previous-generation M2090 accelerator cards. Additionally, NVIDIA claims that its new cards have enabled the Titan supercomputer to reach the #1 spot on the Green500 list of the most energy-efficient supercomputers, with a rating of 2,120.16 MFLOPS/W (million floating point operations per second per watt).
NVIDIA claims to have already shipped 30 PFLOPS worth of GPU accelerated computing power. Interestingly, most of that computing power is housed in the recently unveiled Titan supercomputer. This supercomputer contains 18,688 Tesla K20X (Kepler GK110) GPUs and 18,688 16-core AMD Opteron 6274 processors, for a total of 299,008 CPU cores. It will consume 9 megawatts of power and is rated at a peak of 27 petaflops, with 17.59 petaflops sustained in the Linpack benchmark. Further, when compared to Sandy Bridge processors, the K20 series offers between 8.2 and 18.1 times more performance in several scientific applications.
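The Titan figures are easy to sanity check with a few lines of arithmetic. The Python sketch below is a rough estimate rather than an official calculation: it assumes the quoted 9 megawatt figure applies during the Linpack run, which may not match the power actually measured for the efficiency rankings, and the constant and function names are invented for illustration.

```python
GPU_COUNT = 18_688           # Tesla K20X cards in Titan
K20X_PEAK_DP_TFLOPS = 1.31   # peak double precision per card
SYSTEM_PEAK_PFLOPS = 27.0    # quoted peak for the whole machine
LINPACK_PFLOPS = 17.59       # sustained Linpack result
POWER_MW = 9.0               # quoted power draw; ASSUMED to hold during Linpack


def gpu_peak_pflops():
    """Aggregate peak double precision of all GPUs, in petaflops."""
    return GPU_COUNT * K20X_PEAK_DP_TFLOPS / 1000.0


def linpack_efficiency():
    """Fraction of the quoted system peak that Linpack actually sustained."""
    return LINPACK_PFLOPS / SYSTEM_PEAK_PFLOPS


def mflops_per_watt(pflops=LINPACK_PFLOPS, megawatts=POWER_MW):
    """Energy efficiency in MFLOPS/W (1 PFLOPS = 1e9 MFLOPS, 1 MW = 1e6 W)."""
    return pflops * 1e9 / (megawatts * 1e6)


if __name__ == "__main__":
    print(f"GPU share of peak: {gpu_peak_pflops():.2f} of {SYSTEM_PEAK_PFLOPS} PFLOPS")
    print(f"Linpack efficiency: {linpack_efficiency():.1%}")
    print(f"Efficiency at {POWER_MW} MW: {mflops_per_watt():.1f} MFLOPS/W")
```

On these numbers the GPUs account for about 24.5 of the 27 petaflops of peak performance, and Linpack sustains roughly 65% of that peak. At a full 9 megawatts the efficiency would come out near 1,954 MFLOPS/W, so the higher quoted rating implies the machine drew less than 9 megawatts during the benchmark run.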
While the Tesla cards undoubtedly use more power than CPUs, far fewer accelerator cards than processors are needed to hit the same performance numbers. That is where NVIDIA is getting its power efficiency numbers from.
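As a back-of-the-envelope illustration of that trade-off, the Python sketch below turns the application speedups quoted in the press release into an equivalent count of CPU-only servers. It assumes each speedup is a per-server figure and that performance scales linearly with server count. Both are simplifications, and the function name is hypothetical.

```python
# Application speedups quoted in NVIDIA's press release for K20X-equipped
# Sandy Bridge servers versus CPU-only ones.
SPEEDUPS = {
    "MATLAB": 18.1,
    "Chroma": 17.9,
    "SPECFEM3D": 10.5,
    "AMBER": 8.2,
}


def equivalent_cpu_servers(accelerated_servers, speedup):
    """CPU-only servers needed to match the accelerated ones.

    ASSUMES the speedup is per server and scaling is perfectly linear.
    """
    return accelerated_servers * speedup


if __name__ == "__main__":
    for app, speedup in SPEEDUPS.items():
        needed = equivalent_cpu_servers(100, speedup)
        print(f"{app}: 100 accelerated servers ~ {needed:.0f} CPU-only servers")
```

Under those assumptions, 100 accelerated servers would stand in for somewhere between about 820 and 1,810 CPU-only servers depending on the application, which is why the per-card power draw matters less than the total server count.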
NVIDIA is aiming the accelerator cards at researchers and businesses doing 3D graphics, visual effects, high performance computing, climate modeling, molecular dynamics, earth science, simulations, fluid dynamics, and other such computationally intensive tasks. Using CUDA and the parallel nature of the GPU, the Tesla cards can achieve much higher performance than a CPU-only system can. NVIDIA has also engineered technologies to better parallelize workloads and to keep the GPU accelerators fed with data, which the company calls Dynamic Parallelism and Hyper-Q, respectively.
It is interesting to see NVIDIA bring out a new flagship, especially another GK110 card. Systems using the K20 and the new K20X are available now with cards shipping this week and general availability later this month.
You can find the full press release below and a look at the GK110 GPU in our preview.
Anandtech also managed to get a look inside the Titan supercomputer at Oak Ridge National Laboratory, where you can see the Tesla K20X cards in action.
NVIDIA Unveils World’s Fastest, Most Efficient Accelerators; Titan #1 on Top500 List
SALT LAKE CITY – SC12 – Nov. 12, 2012 – NVIDIA today unveiled the NVIDIA® Tesla® K20 family of GPU accelerators, the highest performance, most efficient accelerators ever built, and the technology powering Titan, the world’s fastest supercomputer according to the TOP500 list released this morning at the SC12 supercomputing conference.
Armed with 18,688 NVIDIA Tesla K20X GPU accelerators, the Titan supercomputer at Oak Ridge National Laboratory in Oak Ridge, Tenn., seized the No. 1 supercomputer ranking in the world from Lawrence Livermore National Laboratory’s Sequoia system with a performance record of 17.59 petaflops as measured by the LINPACK benchmark.(1)
Tesla K20 – Performance, Energy-Efficiency Leadership
Based on the revolutionary NVIDIA Kepler™ compute architecture, the new Tesla K20 family features the Tesla K20X accelerator, the flagship of NVIDIA’s Tesla accelerated computing product line. Providing the highest computing performance ever available in a single processor, the K20X provides tenfold application acceleration when paired with leading CPUs.(2) It surpasses all other processors on two common measures of computational performance – 3.95 teraflops single-precision and 1.31 teraflops double-precision peak floating point performance. The new family also includes the Tesla K20 accelerator, which provides 3.52 teraflops of single-precision and 1.17 teraflops of double-precision peak performance.
Tesla K20X and K20 GPU accelerators representing more than 30 petaflops of performance have already been delivered in the last 30 days. This is equivalent to the computational performance of last year’s 10 fastest supercomputers combined.
“We are taking advantage of NVIDIA GPU architectures to significantly accelerate simulations in such diverse areas as climate and meteorology, seismology, astrophysics, fluid mechanics, materials science, and molecular biophysics,” said Dr. Thomas Schulthess, professor of computational physics at ETH Zurich and director of the Swiss National Supercomputing Center. “The K20 family of accelerators represents a leap forward in computing compared to NVIDIA’s prior Fermi architecture, enhancing productivity and enabling us potentially to achieve new insights that previously were impossible.”
Additional early customers include: Clemson University, Indiana University, Thomas Jefferson National Accelerator Facility (Jefferson Lab), King Abdullah University of Science and Technology (KAUST), National Center for Supercomputing Applications (NCSA), National Oceanic and Atmospheric Administration (NOAA), Oak Ridge National Laboratory (ORNL), University of Southern California (USC), and Shanghai Jiao Tong University (SJTU).
Energy-Efficiency for “Greener” Data Centers
The Tesla K20X GPU accelerator delivers three times higher energy efficiency than previous-generation GPU accelerators and widens the efficiency advantage compared to CPUs. Using Tesla K20X accelerators, Oak Ridge’s Titan achieved 2,142.77 megaflops of performance per watt, which surpasses the energy efficiency of the No. 1 system on the most recent Green500 list of the world’s most energy-efficient supercomputers.(3)
Fastest on Broadest Range of Data Center Applications
The Tesla K20 family accelerates the broadest range of scientific, engineering and commercial high performance computing and data center applications. Today, more than 200 software applications take advantage of GPU-acceleration, representing a 60 percent increase in less than a year. When Tesla K20X GPU accelerators are added to servers with Intel Sandy Bridge CPUs, many applications are accelerated up to 10x or more, including:
- MATLAB (engineering) – 18.1 times faster
- Chroma (physics) – 17.9 times faster
- SPECFEM3D (earth science) – 10.5 times faster
- AMBER (molecular dynamics) – 8.2 times faster
More information about the Tesla K20 GPU accelerators is available at NVIDIA booth 2217 at SC12, Nov. 12-15, and on the NVIDIA high performance computing website. Users can also try the Tesla K20 accelerator for free on remotely hosted clusters. Visit the GPU Test Drive website for more information.
Availability
The NVIDIA Tesla K20 family of GPU accelerators is shipping today and available for order from leading server manufacturers, including Appro, ASUS, Cray, Eurotech, Fujitsu, HP, IBM, Quanta Computer, SGI, Supermicro, T-Platforms and Tyan, as well as from NVIDIA reseller partners.
About NVIDIA Tesla GPUs
NVIDIA Tesla GPUs are massively parallel accelerators based on the NVIDIA CUDA parallel computing platform and programming model. Tesla GPUs are designed from the ground up for power-efficient, high performance computing, computational science and supercomputing, delivering dramatically higher application acceleration for a range of scientific and commercial applications than a CPU-only approach.
To learn more about CUDA or download the latest version, visit the CUDA website. More NVIDIA news, company and product information, videos, images and other information is available at the NVIDIA newsroom. Follow us on Twitter at @NVIDIATesla.