At the keynote of the GPU Technology Conference (GTC) today, NVIDIA CEO Jen-Hsun Huang disclosed some more updates on the roadmap for future GPU technologies.
Most of the detail was around Pascal, due in 2016, which will introduce three new features: mixed compute precision, 3D (stacked) memory, and NVLink. Mixed precision is a method of computing in FP16, allowing calculations to run much faster at lower accuracy when full single or double precision isn't necessary. Keeping in mind that Maxwell doesn't have an implementation with full-speed DP compute (today), it would seem that NVIDIA is targeting different compute tasks moving forward. Though details are short, mixed precision would likely indicate processing cores that can handle both data types.
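Details on the hardware are scarce, but the accuracy side of the trade-off is easy to see on the CPU with NumPy's half-precision type. This is only an illustration of FP16 rounding, not of Pascal's implementation:

```python
import numpy as np

# FP16 has a 10-bit mantissa vs. 23 bits for FP32, so its machine
# epsilon (smallest relative step) is far coarser.
eps16 = np.finfo(np.float16).eps
eps32 = np.finfo(np.float32).eps

# A value like 1/3 loses precision when stored as FP16:
x32 = np.float32(1.0) / np.float32(3.0)
x16 = np.float16(x32)
err = abs(float(x16) - float(x32))

print(f"FP16 eps: {eps16}, FP32 eps: {eps32}")
print(f"1/3 as FP32: {x32!r}, as FP16: {x16!r}, error: {err:.2e}")
```

For workloads that tolerate three decimal digits of precision (much of deep learning inference, for example), that coarseness is a fair price for doubled arithmetic throughput.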
3D memory is the ability to put stacked memory on-package with the GPU to improve overall memory bandwidth. The visual diagram that NVIDIA showed on stage indicated that Pascal would have 750 GB/s of bandwidth, compared to the 300-350 GB/s of Maxwell today.
NVLink is a new way of connecting GPUs, improving bandwidth by more than 5x over current implementations of PCI Express. NVIDIA claims this will allow as many as 8 GPUs to be connected for deep learning performance improvements (up to 10x). What that means for gaming has yet to be discussed.
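NVIDIA didn't state the comparison point for the "5x" figure. Assuming the baseline is PCIe 3.0 x16 (roughly 985 MB/s per lane per direction, an assumption on my part), the claim works out to something like:

```python
# Back-of-envelope on the ">5x PCIe" claim. The baseline is an
# assumption: PCIe 3.0 moves ~985 MB/s per lane, per direction.
pcie3_lane_gbps = 0.985            # GB/s per lane, per direction
pcie3_x16 = 16 * pcie3_lane_gbps   # ~15.8 GB/s for a full x16 slot

nvlink_estimate = 5 * pcie3_x16    # ~79 GB/s if "5x" is taken literally

print(f"PCIe 3.0 x16: ~{pcie3_x16:.1f} GB/s per direction")
print(f"5x that:      ~{nvlink_estimate:.1f} GB/s")
```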
NVIDIA made some other interesting claims as well. Pascal will deliver more than 2x the performance per watt of Maxwell, even without the three new features listed above. It will also ship (in a compute-targeted product) with a 32GB memory system, compared to the 12GB announced on the Titan X today. Pascal will also have 4x Maxwell's performance in mixed-precision compute.
Do they even mention that NVLink needs a new motherboard to function as advertised?
My guess is NVLink will not be available in the consumer gaming market. It would require new CPUs and motherboards, since the PCIe controller sits in the system agent (aka north bridge), which is integrated into the CPU.
Who needs the motherboard’s north or south bridge? PCIe-based GPU cards currently have their own controller chips for the GDDR5 memory; the 16 PCIe lanes just pass data and other traffic between the motherboard’s PCIe controller and the GPU’s PCIe controller complex. Nvlink is going to be used mostly by the GPU to communicate with its dedicated memory. PCIe is a data protocol: the CPU and/or motherboard has a PCIe controller on its end, and the GPU board has always had its own PCIe controller chip to de-encapsulate the PCIe data and present it to the GPU. The one good thing about Nvlink is that the GPU on the mezzanine module could also have an on-board CPU to accelerate workloads that require branch prediction, and Nvlink would provide coherency between the on-module CPU and the GPU.
You will need a motherboard that supports a mezzanine module, or Nvidia could include the mezzanine connector on the PCI card. Having a mezzanine module is a good thing, especially for all the extra traces and connectivity improvements.
I thought nvlink was for gpu-to-gpu communication, not necessarily for the connection to the cpu. It makes sense if they have 3D (or “2.5D”) memory, since the only things routed out of the package would otherwise be power/gnd and the pci-e link. Instead of routing a 256-bit memory interface, they can route a wide interface to a neighboring gpu. I don’t know if this will allow sufficient bandwidth for the gpus to really share memory rather than having completely independent memory.
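Whether the link can really substitute for local memory is largely a bandwidth-ratio question. Plugging in the article's 750 GB/s stacked-memory figure and an assumed ~80 GB/s for the link itself (5x a ~16 GB/s PCIe 3.0 x16 baseline; both numbers are my assumptions here, not NVIDIA's), remote memory would still be an order of magnitude slower to reach:

```python
# Rough comparison of local vs. remote (over-the-link) memory bandwidth.
local_bw = 750.0   # GB/s, Pascal stacked memory (figure from the article)
link_bw = 80.0     # GB/s, assumed NVLink bandwidth (5x a PCIe 3.0 x16 slot)

ratio = local_bw / link_bw
print(f"Local memory is ~{ratio:.1f}x faster than the inter-GPU link,")
print("so a 'shared' memory pool would still be strongly non-uniform (NUMA-like).")
```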
Nvlink was developed to connect IBM’s Power8 CPUs with Nvidia’s GPU accelerators, and it borrowed heavily from IBM’s CAPI coherent connection fabric technology. Nvidia’s Volta GPUs will be used alongside IBM’s Power9 CPUs in the recently announced supercomputing contract* awarded by the big spender U-SAM, so expect some good technology to filter down from your tax dollars.
Really, Nvidia should get a Power8 license from OpenPower, tell Intel to get to stepping, and make some home gaming servers of its own. IBM needs to lower the pricing on those licensable Power8 design variants a little more for Nvidia and others to use in some Power8-based consumer PC/laptop SKUs. Those Power8 cores are beasts, and would make one hell of a gaming-system CPU to have on the motherboard, or module!
* http://www.anandtech.com/show/8727/nvidia-ibm-supercomputers
Notice how Volta slipped yet another year. To 2018 it seems.
Maybe they feel they need a smaller node, so Pascal will be 14nm and Volta will be 10nm.
NvLink could just get soldered to the motherboard like everything else. AMD GPUs could use the CPU’s pci-e controller, and Nvidia’s driver set could enable NvLink when Nvidia GPUs are installed.
I wonder how “Mixed Precision Compute” on NVidia GPUs stands up against fully HSA-enabled chips from other vendors.
Seems to me that with enough granularity in the architecture of an HSA-compatible GPU, you could get as good if not better “Mixed Precision Compute” performance.
Those other vendors do not own HSA; the scientific and computing ideas behind it were around long before AMD and others jumped on the bandwagon. HSA-aware hardware and OSs, plus UMA, are good ideas for the entire computing market to adopt, especially HSA’s ability to let mixed makers and models of GPUs, including integrated GPUs, all be utilized for any computing workload, all of the time (no switchable either/or, one GPU but not the other).
Consumers really do not understand that computing never needed switchable integrated/discrete graphics; systems could have utilized both integrated and discrete GPUs at the same time, all but for a few big monopolistic, greedy interests who artificially prevented HSA from happening.
Mixed-precision compute capabilities are not necessarily tied to HSA; they just mean better resource allocation for higher- and lower-precision mathematical workloads. Of course, if all the precision needs to be high, then there are some disadvantages as well.
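One concrete way the disadvantage shows up: accumulating many small values in FP16 stalls once the running total grows large relative to FP16's step size, because each addition rounds away. A small NumPy sketch (CPU-only, just to show the rounding behaviour, not any particular GPU):

```python
import numpy as np

# Sum 20,000 copies of 0.01 in FP16 vs. FP32. Once the FP16 running
# total reaches the [32, 64) range, the spacing between representable
# FP16 values (0.03125) exceeds 2 * 0.01, so additions round to nothing.
n = 20_000
acc16 = np.float16(0.0)
acc32 = np.float32(0.0)
for _ in range(n):
    acc16 = np.float16(acc16 + np.float16(0.01))
    acc32 = np.float32(acc32 + np.float32(0.01))

print(f"exact: {n * 0.01}, FP32 sum: {acc32:.2f}, FP16 sum: {acc16:.2f}")
```

This is why mixed-precision schemes typically keep accumulators in FP32 even when the multiplies happen in FP16.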