NVIDIA Introduces 7nm Ampere A100 Tensor Core GPU

Source: NVIDIA

NVIDIA CEO Jensen Huang has formally announced the first product powered by the company’s new Ampere architecture, the A100. The successor to the Tesla V100 datacenter GPU (announced May 10, 2017), the NVIDIA A100 Tensor Core GPU offers impressive specifications and capabilities.

The NVIDIA A100 is the first GPU based on the NVIDIA Ampere architecture and, according to NVIDIA, provides the greatest generational performance leap across the company’s eight generations of GPUs. It is also built for data analytics, scientific computing and cloud graphics, and is in full production and shipping to customers worldwide, Huang announced.

Eighteen of the world’s leading service providers and systems builders are incorporating the A100 into their offerings, among them Alibaba Cloud, Amazon Web Services, Baidu Cloud, Cisco, Dell Technologies, Google Cloud, Hewlett Packard Enterprise, Microsoft Azure and Oracle.

The A100, and the NVIDIA Ampere architecture it’s built on, boost performance by up to 20x over its predecessors, Huang said. He detailed five key features of the A100:

  • More than 54 billion transistors, making it the world’s largest 7-nanometer processor.
  • Third-generation Tensor Cores with TF32, a new math format that accelerates single-precision AI training out of the box. NVIDIA’s widely used Tensor Cores are now more flexible, faster and easier to use, Huang explained.
  • Structural sparsity acceleration, a new efficiency technique harnessing the inherently sparse nature of AI math for higher performance.
  • Multi-instance GPU, or MIG, allowing a single A100 to be partitioned into as many as seven independent GPUs, each with its own resources.
  • Third-generation NVLink technology, doubling high-speed connectivity between GPUs, allowing A100 servers to act as one giant GPU.
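To give a feel for what the TF32 format in the list above trades away, here is a minimal Python sketch that rounds an FP32 value to TF32 precision. It assumes the layout NVIDIA has described elsewhere (1 sign bit, 8 exponent bits, 10 mantissa bits), which is not stated in this article. The function name `round_to_tf32` and the round-half-up behavior are illustrative choices, not a claim about what the hardware does internally.

```python
import struct

# Assumed TF32 layout (not from this article): 1 sign, 8 exponent, 10 mantissa bits.
TF32_MANTISSA_BITS = 10
FP32_MANTISSA_BITS = 23
DROPPED_BITS = FP32_MANTISSA_BITS - TF32_MANTISSA_BITS  # 13 low mantissa bits

def round_to_tf32(x: float) -> float:
    """Round x to TF32 precision and return it as an FP32-representable value.

    Hypothetical helper for illustration only; NaN, infinity and values near
    the FP32 maximum are not handled.
    """
    # Reinterpret the FP32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    half = 1 << (DROPPED_BITS - 1)                 # half of the last kept bit
    mask = 0xFFFFFFFF ^ ((1 << DROPPED_BITS) - 1)  # clears the dropped bits
    bits = (bits + half) & mask                    # round, then truncate
    return struct.unpack("<f", struct.pack("<I", bits))[0]

if __name__ == "__main__":
    for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-12, 0.1):
        print(value, "->", round_to_tf32(value))
```

The exponent range stays the same as FP32, so only the precision of each value changes. That is consistent with the claim that existing single-precision training code can benefit out of the box.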

The result of all this: 6x higher performance than NVIDIA’s previous generation Volta architecture for training and 7x higher performance for inference.

NVIDIA A100 Specifications

Let’s take a quick look at the specs for this new GPU, with the previous Tesla V100 and older Tesla P100 for comparison:

                    Ampere A100    Tesla V100    Tesla P100
GPU                 GA100          GV100         GP100
Architecture        Ampere         Volta         Pascal
SMs                 108            80            56
CUDA Cores          6912           5120          3584
Tensor Cores        432            640           N/A
Boost Clock         1410 MHz       1530 MHz      1480 MHz
Memory              40GB HBM2e     16GB HBM2     16GB HBM2
Memory Interface    5120-bit       4096-bit      4096-bit
Memory Bandwidth    1.6 TB/s       900 GB/s      616 GB/s
Transistor Count    54B            21.1B         15.3B
Die Size            826 mm²        815 mm²       610 mm²
Process Tech        7 nm           12 nm         16 nm
TDP                 400W           300W          300W

The GA100 GPU is manufactured on TSMC’s 7nm process, with a die size of 826 mm². That is slightly larger than the 815 mm² GV100, yet it contains more than double the transistor count of the previous generation’s GPU: an incredible 54 billion (!) transistors, up from 21.1 billion with GV100.
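As a quick sanity check on the "more than double" claim, the short Python sketch below works out transistor density and generational ratios from the transistor counts and die sizes listed in the spec table. The dictionary and function names are hypothetical, chosen only for this illustration.

```python
# (transistor count, die area in mm^2), taken from the spec table in this article
CHIPS = {
    "GA100": (54.0e9, 826.0),
    "GV100": (21.1e9, 815.0),
    "GP100": (15.3e9, 610.0),
}

def density_millions_per_mm2(name: str) -> float:
    """Transistor density in millions of transistors per square millimeter."""
    transistors, area_mm2 = CHIPS[name]
    return transistors / area_mm2 / 1e6

def transistor_ratio(new: str, old: str) -> float:
    """How many times more transistors `new` has than `old`."""
    return CHIPS[new][0] / CHIPS[old][0]

if __name__ == "__main__":
    for chip in CHIPS:
        print(f"{chip}: {density_millions_per_mm2(chip):.1f} M transistors/mm^2")
    print(f"GA100 vs GV100: {transistor_ratio('GA100', 'GV100'):.2f}x transistors")
```

With these inputs, GA100 comes out at roughly 2.5x the density of GV100 on a die that is only about 1% larger, which is where the move from 12 nm to 7 nm shows up.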

The A100’s Streaming Multiprocessor (SM) count is 108, and each Ampere SM contains 64 FP32 and 32 FP64 cores. This breaks down to 6912 FP32 (single precision) CUDA Cores and 3456 FP64 (double precision) CUDA Cores. And while the 432 Tensor Core count is down from GV100’s 640, A100 is using third-generation Tensor Core technology.
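The core counts follow directly from the per-SM figures, as this small Python sketch shows. All inputs come from the article and its spec table. The per-SM Tensor Core numbers are derived by division rather than quoted, and the names are illustrative.

```python
# Figures from the article: SM counts and per-SM core counts
A100_SMS = 108
V100_SMS = 80
FP32_PER_SM = 64
FP64_PER_SM = 32

def total_cores(sm_count: int, cores_per_sm: int) -> int:
    """Total cores on the GPU given a per-SM count."""
    return sm_count * cores_per_sm

def per_sm(total: int, sm_count: int) -> float:
    """Average number of units per SM, derived from a GPU-wide total."""
    return total / sm_count

if __name__ == "__main__":
    print("A100 FP32 cores:", total_cores(A100_SMS, FP32_PER_SM))  # 6912
    print("A100 FP64 cores:", total_cores(A100_SMS, FP64_PER_SM))  # 3456
    print("A100 Tensor Cores per SM:", per_sm(432, A100_SMS))      # 4.0
    print("V100 Tensor Cores per SM:", per_sm(640, V100_SMS))      # 8.0
```

So the drop from 640 to 432 Tensor Cores reflects fewer, larger units per SM (4 instead of 8) rather than a cut in SM count.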

The third-generation Tensor Cores in the NVIDIA Ampere architecture are beefier than prior versions. They support a larger matrix size — 8x8x4, compared to 4x4x4 for Volta — that lets users tackle tougher problems. That’s one reason why an A100 with a total of 432 Tensor Cores delivers up to 19.5 FP64 TFLOPS, more than double the performance of a Volta V100.
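For context on the 19.5 TFLOPS figure, the following Python sketch estimates peak throughput from core count and clock speed. It relies on two assumptions that are not stated in the article: each core retires one fused multiply-add (2 floating-point operations) per clock, and the GPU runs at the 1410 MHz boost clock from the spec table. The function name is hypothetical.

```python
def peak_tflops(cores: int, boost_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak in TFLOPS.

    Assumes one fused multiply-add (2 FLOPs) per core per clock at boost speed;
    real workloads will fall short of this ceiling.
    """
    return cores * flops_per_clock * boost_mhz * 1e6 / 1e12

A100_BOOST_MHZ = 1410   # from the spec table
A100_FP32_CORES = 6912  # from the article
A100_FP64_CORES = 3456  # from the article

if __name__ == "__main__":
    fp32 = peak_tflops(A100_FP32_CORES, A100_BOOST_MHZ)
    fp64 = peak_tflops(A100_FP64_CORES, A100_BOOST_MHZ)
    print(f"FP32 CUDA-core peak: {fp32:.2f} TFLOPS")
    print(f"FP64 CUDA-core peak: {fp64:.2f} TFLOPS")
    print(f"Twice the FP64 CUDA-core peak: {2 * fp64:.2f} TFLOPS")
```

Under these assumptions the standard FP64 cores peak at about 9.7 TFLOPS, so the quoted 19.5 TFLOPS with Tensor Cores works out to roughly double the CUDA-core FP64 rate.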

Memory size and bandwidth have also increased significantly, with 40GB of HBM2e on a 5120-bit bus providing nearly 1.6 TB/s in bandwidth (1,555 GB/s).
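Memory bandwidth is the bus width (in bytes) multiplied by the per-pin data rate, so the listed figures can be used to back out the data rate each card implies. The Python sketch below uses the 1,555 GB/s and 5120-bit values for the A100 and the 900 GB/s and 4096-bit values for the V100 from the spec table. The function names are illustrative, and the per-pin rates are derived here, not quoted from NVIDIA.

```python
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s for a given bus width and per-pin data rate."""
    return bus_bits / 8 * gbps_per_pin

def implied_gbps_per_pin(bandwidth_gb_per_s: float, bus_bits: int) -> float:
    """Per-pin data rate (Gbit/s) implied by a quoted bandwidth and bus width."""
    return bandwidth_gb_per_s * 8 / bus_bits

if __name__ == "__main__":
    a100_rate = implied_gbps_per_pin(1555, 5120)
    v100_rate = implied_gbps_per_pin(900, 4096)
    print(f"A100 implied data rate: {a100_rate:.2f} Gbps per pin")
    print(f"V100 implied data rate: {v100_rate:.2f} Gbps per pin")
    print(f"A100 bandwidth check: {bandwidth_gb_s(5120, a100_rate):.0f} GB/s")
```

The wider bus alone accounts for a 25% gain over V100; the rest of the increase comes from the faster per-pin data rate.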

The DGX A100 platform shown during the KitchenNote is billed as “the world’s largest GPU”: a massive 8-GPU configuration that weighs 50 lbs (NVIDIA’s CEO pulled it out of the oven just a couple of days ago). The price tag for the DGX? $200,000.

Is Ampere for Consumer GPUs Coming Soon?

As for potential consumer implications from Ampere, no new GeForce products have been announced yet. Via VideoCardz, we find this quote from MarketWatch:

Ampere will eventually replace Nvidia’s Turing and Volta chips with a single platform that streamlines Nvidia’s GPU lineup, Huang said in a pre-briefing with media members Wednesday. While consumers largely know Nvidia for its videogame hardware, the first launches with Ampere are aimed at AI needs in the cloud and for research.

“Unquestionably, it’s the first time that we’ve unified the acceleration workload of the entire data center into one single platform,” Huang said.

It seems that we will have to wait a bit longer for more GeForce-related Ampere news.

About The Author

Sebastian Peak

Editor-in-Chief at PC Perspective. Writer of computer stuff, vintage PC nerd, and full-time dad. Still in search of the perfect smartphone. In his nonexistent spare time Sebastian's hobbies include hi-fi audio, guitars, and road bikes. Currently investigating time travel.

1 Comment

  1. Letard

    Remember to spell it correctly on podcast. It’s not pear but pair as in french mon pere (my father). Respect the man, rip.
