Google Cloud has added NVIDIA A100 Tensor Core GPUs to its Accelerator-Optimized (A2) VM family. The instances are currently available only as a private alpha, with public availability planned for later this year.
These instances differ significantly from NVIDIA's DGX A100. While NVIDIA's offering pairs its GPUs with two 64-core AMD CPUs, Google's instances will be powered by up to 96 vCPUs from Intel's Cascade Lake family. The most significant difference between the two CPU families is that Cascade Lake does not support PCIe 4.0. This should not affect GPU-to-GPU communication, which runs over the NVLink fabric, but it could matter for workloads that spend much of their time copying buffers to and from system memory.
Between GPUs, Google lists peak bandwidth as 9.6 TB/s on its cloud instances, while NVIDIA lists 4.8 TB/s bidirectional on the DGX A100. These figures are presumably the same in practice, with Google simply counting both directions of a full-duplex link. Google's instances do have slightly more RAM than a DGX A100: 1.3 TB versus NVIDIA's 1 TB.
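If that reading is right, the two figures reconcile by simple doubling. A minimal sketch of the arithmetic (the counting convention is my assumption, not something either vendor has confirmed):

```python
# Assumption: Google's 9.6 TB/s counts each direction of a full-duplex
# NVLink fabric separately, while NVIDIA's 4.8 TB/s quotes one direction
# of the bidirectional link.
nvidia_bidirectional_tb_s = 4.8  # as listed for the DGX A100
google_peak_tb_s = 9.6           # as listed for Google's A2 instances

# Counting both directions of a full-duplex link doubles the quoted number.
assert nvidia_bidirectional_tb_s * 2 == google_peak_tb_s
print(f"{nvidia_bidirectional_tb_s} TB/s x 2 directions = "
      f"{nvidia_bidirectional_tb_s * 2} TB/s")
```

Under that convention, both vendors would be describing the same physical interconnect.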
The A100-powered A2 instances are slated for general availability later this year, but pricing has not yet been announced.