Remember last month? Remember when I said that Google’s introduction of Tesla P100s would be good leverage over Amazon, as the latter is still back in the Kepler days (because Maxwell was 32-bit focused)?
To compare the two parts, the Tesla P100 has 3584 CUDA cores, yielding just under 10 TFLOPS of single-precision performance. The Tesla V100, with its ridiculous die size and 5120 CUDA cores, pushes that up over 14 TFLOPS. Like Pascal, it also supports full 1:2:4 FP64:FP32:FP16 performance scaling. It also has access to NVIDIA’s tensor cores, which are specialized for 4×4 matrix multiply-add operations on 16-bit values, which are apparently common in neural networks, for both training and inference.
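To make those numbers a little more concrete, here is a rough Python sketch of the two ideas above: what 1:2:4 scaling implies for each precision, and what a single 4×4 multiply-add (D = A×B + C) looks like. The function names are made up for illustration, the FP32 inputs are just the rounded figures quoted above (not official specifications), and the matrix code uses ordinary Python floats rather than the 16-bit values real tensor cores work on.

```python
# Illustrative only: names are hypothetical and the FP32 figures are the
# rounded numbers quoted in the post, not official specifications.

def scale_1_2_4(fp32_tflops):
    """Apply 1:2:4 FP64:FP32:FP16 scaling to an FP32 throughput figure."""
    return {
        "fp64": fp32_tflops / 2,  # double precision runs at half the FP32 rate
        "fp32": fp32_tflops,
        "fp16": fp32_tflops * 2,  # half precision runs at twice the FP32 rate
    }


def matmul_add_4x4(a, b, c):
    """Return D = A x B + C for 4x4 matrices given as lists of rows."""
    n = 4
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Rounded FP32 figures: "just under 10" and "over 14" TFLOPS.
    for name, fp32 in (("P100", 10.0), ("V100", 14.0)):
        print(name, scale_1_2_4(fp32))

    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    ones = [[1.0] * 4 for _ in range(4)]
    print(matmul_add_4x4(identity, ones, ones))
```

Each output element of the 4×4 case needs four multiply-adds, so one such operation is 64 multiply-adds in total, which is why doing it as a single hardware step is attractive for matrix-heavy workloads.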
Amazon allows up to eight of them in a single instance (with its p3.16xlarge instances).
So that’s cool. While Google has again been quickly leapfrogged by Amazon, it’s good to see NVIDIA getting wins at multiple cloud providers. This keeps money rolling in, which will fund new chip designs for all the other segments.