The Inquirer has posted a tiny bit of information about AMD's upcoming Vega, and since rumours about the new GPU are hard to find, it is the best we have at the moment. AMD's claim is that the second-generation HBM present on the 4GB and 8GB models could offer memory bandwidth equivalent to a GTX 1080 Ti, which makes perfect sense. The GTX 1080 Ti offers 484 GB/s of memory bandwidth, while the first-generation HBM on AMD's R9 Fury series offers 512 GB/s. The real trick is keeping that pipeline full to give AMD's HBM2-based cards a chance to shine, which depends on software developers as much as it does on the hardware. The Inquirer also discusses the possible efficiency advantages Vega will have, which could result in smaller cards as well as an effective mobile product. Pop over to take a look at the current rumours; here's hoping we can provide more detailed information in the near future.
"AMD HAS TEASED more information about its forthcoming Vega-based graphics cards, revealing that they will come with either 4GB or 8GB memory and hinting that a launch is imminent."
Here is some more Tech News from around the web:
- iPhone-havers think they're safe. But they're not @ The Register
- FYI Docs.com users: You may have leaked passwords, personal info – thousands have @ The Register
- LastPass scrambles to fix another major flaw – once again spotted by Google's bugfinders @ The Register
- Johnny Depp signs on to play John McAfee in a film of his life @ The Inquirer
- Samsung 4K Blu-ray Player @ Hardware Secrets
- Futuremark Ends Support for 3DMark Vantage and PCMark Vantage @ [H]ard|OCP
- Konica Minolta Unveils the Future of Work, Or At Least Its Version @ Kitguru
- Win a PC hardware bundle with Gigabyte AORUS, HyperX and KitGuru
One stack of JEDEC-standard HBM2 has a 1024-bit bus subdivided into eight 128-bit channels and can support up to 256 GB/s of total effective bandwidth. So two HBM2 stacks, whether 4 GB or 8 GB and larger, give 512 GB/s, and the question is whether the Vega designs really need more than that 512 GB/s of total effective bandwidth to feed the shaders efficiently.
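As a sanity check on that arithmetic, here is a minimal sketch; the 1024-bit bus width is from the JEDEC spec, while the 2 Gbps-per-pin rate (the spec's ceiling) and the two-stack configuration are assumptions about Vega:

```python
# Back-of-the-envelope HBM2 bandwidth. The 1024-bit bus per stack is
# from the JEDEC spec; 2 Gbps/pin is the spec's maximum data rate and
# the two-stack count is Vega speculation, not a confirmed figure.

def stack_bandwidth_gbs(bus_width_bits=1024, pin_rate_gbps=2.0):
    """Effective bandwidth of one HBM2 stack in GB/s."""
    return bus_width_bits * pin_rate_gbps / 8  # bits -> bytes

per_stack = stack_bandwidth_gbs()   # 256.0 GB/s
two_stacks = 2 * per_stack          # 512.0 GB/s
print(f"One stack: {per_stack} GB/s, two stacks: {two_stacks} GB/s")
```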
It is also worth re-reading this AnandTech article(1) on JEDEC HBM2, as HBM2 has features above and beyond first-generation HBM's, including additional command modes that work with lower overhead. HBM2 is much more than first-generation HBM with larger-capacity DRAM dies.
A closer look at Vega is also needed regarding the HBCC and the Vega GPU's ability to manage its own virtual memory pools out into regular system DRAM and even storage (SSD, etc.). The Vega micro-architecture also has the primitive shader construct and other improvements that will need a closer look once the proper amount of information becomes available closer to launch.
“One of the key enhancements of HBM2 is its Pseudo Channel mode, which divides a channel into two individual sub-channels of 64 bit I/O each, providing 128-bit prefetch per memory read and write access for each one. Pseudo channels operate at the same clock-rate, they share row and column command bus as well as CK and CKE inputs. However, they have separated banks, they decode and execute commands individually. SK Hynix says that the Pseudo Channel mode optimizes memory accesses and lowers latency, which results in higher effective bandwidth.”(1)
(1) "JEDEC Publishes HBM2 Specification as Samsung Begins Mass Production of Chips"
http://www.anandtech.com/show/9969/jedec-publishes-hbm2-specification
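For illustration, a small sketch of the access-granularity difference the quote describes; the pseudo-channel numbers come from the quote itself, while the legacy-mode figures are my assumption based on HBM1's 128-bit channels with 2n prefetch:

```python
# Access granularity implied by the quoted SK Hynix description:
# each 64-bit pseudo channel prefetches 128 bits per read/write.
# The legacy-mode comparison is an assumption (HBM1: 128-bit channel,
# 2n prefetch), not something stated in the quote.

def access_granularity_bytes(io_width_bits, prefetch_n):
    """Minimum bytes moved per read/write access."""
    return io_width_bits * prefetch_n // 8

pseudo = access_granularity_bytes(io_width_bits=64, prefetch_n=2)   # 16 bytes
legacy = access_granularity_bytes(io_width_bits=128, prefetch_n=2)  # 32 bytes
print(f"Pseudo channel access: {pseudo} B, legacy channel access: {legacy} B")
```

Finer-grained accesses plus independently executing sub-channels is where the claimed latency and effective-bandwidth win would come from.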
Awesome stuff. Thanks, fellow anon!
It would be good to have an article about GPU virtual memory support. I remember some such features being added a long time ago, but they sounded limited and may have required game developer support to use. It sounds like AMD has implemented a more general, CPU-like virtual memory system for Vega. It should allow much more efficient use of memory if a lot of memory really was being allocated but never used. A hardware-supported virtual memory system will allocate as many virtual memory pages as you ask for, but it doesn't actually map them to physical memory until you use them. If you ask for 2 GB but only write 512 MB of it, then it will only take 512 MB of physical memory. If it works well, a 4 GB card may be a reasonable product, but a lot of people may not buy it anyway, since they don't think it is enough memory. That could be an issue for sales, even if the memory size isn't really a technical issue. That much high-speed memory takes a lot of power, though, so reducing it may be important for mobile products. There is a reason Apple stayed with 1 GB for iPhones for a long time and still limits it to 2 GB now.
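To see that ask-for-2-GB, pay-for-512-MB behavior concretely on the CPU side, here is a minimal, Linux-only sketch (the anonymous mmap and the sizes are purely illustrative):

```python
# Demand paging demo: reserving virtual memory costs (almost) nothing
# physical until pages are actually touched. Linux-only (resource and
# lazy anonymous mmap semantics); sizes are illustrative.
import mmap
import resource

def max_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

GB = 1024 ** 3
buf = mmap.mmap(-1, 2 * GB)              # "ask for" 2 GB of virtual memory
print(f"After reserving 2 GB: ~{max_rss_mb():.0f} MB resident")

page = mmap.PAGESIZE
for offset in range(0, GB // 2, page):   # touch only the first 512 MB
    buf[offset] = 1
print(f"After writing 512 MB: ~{max_rss_mb():.0f} MB resident")
```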
There are many good articles on the subject of GPU virtual memory at AnandTech and the pay-walled trade journals, so it's always good to search online for other articles on the same subject. We'll have to wait for Zen/Naples and the other server variants, plus the Vega consumer and pro variants, to be released, because that's when the full details will be published on Vega's GPU virtual memory support, AMD's Infinity Fabric, etc. A lot of information is still under NDA until Zen/Naples and the Vega consumer and Radeon Pro WX/Radeon Instinct SKUs (based on the Vega micro-architecture) hit the market, so that's over the next few months: Zen/Naples in Q2, ditto for Vega.
Think about this: for as long as AMD has been making APUs, AMD's GPUs could have had virtual memory capabilities. And not only has AMD had professional discrete GPUs with virtual memory capabilities on some of its Radeon Pro WX variants, some of those Pro GPU SKUs have also had hardware support for virtually slicing a physical GPU up into many virtual GPUs. So on some of AMD's past-generation Radeon Pro SKUs (now branded Radeon Pro WX and Radeon Instinct), the whole physical GPU can be logically partitioned up safely and securely, with each GPU logical partition servicing a different application's graphics/compute acceleration needs.
Gotta have the cores and ROPs to use all that bandwidth.
It isn't just the cores and ROPs that use that bandwidth effectively; it is far more important that the program using it runs efficiently. As a basic example, Bitcoin or Ethereum mining can truly take advantage of the Radeon design and end up with crazy crunching power compared to Nvidia graphics, from old to new. So if used properly, Radeon has much untapped performance under the hood; 9 times out of 10, games simply do not take advantage of the horsepower in there.
So the few games that can actually use the R9 Fury properly (and even mining) show that between the sheer speed of the memory and the fairly wide, low-latency bus, the performance is there.
I think Vega will have a similar ROP/TMU count to Fury, but with much more optimization done, so it can show the performance it should more often, and possibly even more fine-grained volt/amp/power/clock control as well, something AMD learned quite a bit about with Ryzen. I am sure they will put this to great use for Vega (and maybe even the RX 500 series, seeing as it is built on a much more refined 14nm process than the RX 400 series was, i.e. LPP vs LPE). Anyway, it will not be long until we see 🙂
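To put a rough number on "using the bandwidth effectively", here is a simple roofline-style sketch using Fiji-class figures (about 8.6 TFLOPS FP32 peak and 512 GB/s for the R9 Fury X), since Vega's own figures aren't public yet:

```python
# Classic roofline model: a kernel needs roughly peak_flops / bandwidth
# FLOPs per byte of memory traffic before compute, rather than memory
# bandwidth, becomes the limit. Numbers are R9 Fury X (Fiji) figures.

def attainable_tflops(intensity_flops_per_byte, peak_tflops=8.6, bw_gbs=512):
    """Attainable throughput: min(compute peak, bandwidth * intensity)."""
    return min(peak_tflops, intensity_flops_per_byte * bw_gbs / 1000)

balance = 8.6e12 / 512e9   # ~16.8 FLOPs/byte to saturate the ALUs
print(f"Balance point: {balance:.1f} FLOPs/byte")
for ai in (1, 4, 16.8, 64):
    print(f"AI={ai:>5} FLOPs/byte -> {attainable_tflops(ai):.2f} TFLOPS attainable")
```

Workloads like mining sit far above the balance point, which is why they extract so much of a Radeon's raw throughput; many game workloads sit below it and end up bandwidth-bound instead.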
Vega gonna be THREE times faster than an overclocked 1080 Ti. LOLz.
Still don't know why they are releasing a 4GB card. Games nowadays can easily use more than 4GB at 1080p. It just seems silly to me to release their higher-end GPU lineup with only 4GB.
Also, will the HBM modules on these cards be able to hit 512GB/s? If I remember correctly, Hynix was having slight manufacturing problems, so these would only be able to hit the upper 400s, and the 512GB/s modules would come towards the end of this year.
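On the bandwidth question, it comes down to the per-pin data rate; a quick continuation of the earlier arithmetic, where the 1.6 Gbps figure is an assumption based on the slower HBM2 speed grades reported at the time, not a confirmed Vega spec:

```python
# Total bandwidth scales linearly with the per-pin data rate.
# 2.0 Gbps is the JEDEC ceiling; 1.6 Gbps is an assumed early speed grade.
for pin_rate in (1.6, 2.0):
    total = 2 * 1024 * pin_rate / 8   # two stacks, 1024 bits each
    print(f"{pin_rate} Gbps/pin -> {total:.0f} GB/s for two stacks")
```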
My guess is that AMD is aiming for this:
4GB HBM as L1 cache memory
4-8GB DDR4/GDDR5 as L2 cache memory
We saw what they could do for video scrubbing with the Pro SSG demo by eliminating the PCIe bottleneck. It could work for gaming too.
No, Vega will have a larger L2 cache (and L1), but because of the HBCC, the HBM2 will be able to be used like a last-level cache by the HBCC, along with whatever L3 there may be (if any is used).
Remember, the HBC managed by the HBCC on Vega will be a direct client of the L2 cache, and we still do not know at this time whether Vega will have an L3 cache (the diagram AMD provided suggests there may be an L3 HBC, see (1)). Some sort of last-level cache is indicated, but it's still not definite that this last-level cache is in fact HBM2-based. We do know that Vega's HBCC/HBC and memory controller can manage tiers of memory and storage hierarchies as virtual memory pools, down to managing the GPU's own virtual memory paging files on an SSD/hard drive.
So Vega will be very different from what came before. It is a good idea to keep this AnandTech preview handy until the full Vega release information is made public, because things are still yet to be defined as to what the last-level cache will be, relative to the current info provided by AMD.
(1) AnandTech (last graphic on page 3):
“The AMD Vega GPU Architecture Teaser: Higher IPC, Tiling, & More, Coming in H1’2017”
http://www.anandtech.com/show/11002/the-amd-vega-gpu-architecture-teaser/3
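As a purely conceptual illustration of the tiering the HBCC is described as doing (this is not AMD's actual page-management algorithm), here is a much-simplified LRU page cache over a fast and a slow tier:

```python
# Conceptual sketch of HBCC-style tiering: treat the fast tier (HBM2)
# as a page cache over a larger, slower tier (system DRAM / SSD),
# evicting least-recently-used pages under pressure. Illustration only.
from collections import OrderedDict

class TieredMemory:
    def __init__(self, fast_capacity_pages):
        self.capacity = fast_capacity_pages
        self.fast = OrderedDict()   # page -> data, kept in LRU order
        self.slow = {}              # backing store (DRAM/SSD stand-in)

    def access(self, page):
        if page in self.fast:            # hit in the high-bandwidth cache
            self.fast.move_to_end(page)
            return self.fast[page]
        data = self.slow.pop(page, b"")  # miss: page it in from the slow tier
        if len(self.fast) >= self.capacity:
            victim, vdata = self.fast.popitem(last=False)  # evict LRU page
            self.slow[victim] = vdata
        self.fast[page] = data
        return data

mem = TieredMemory(fast_capacity_pages=4)
for p in [0, 1, 2, 3, 0, 4, 5, 0]:       # page 0 stays hot, others churn
    mem.access(p)
print("Resident in fast tier:", list(mem.fast))
```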
P.S. Note that in addition to the HBC/HBCC being a client of the L2, the render back-ends on Vega are also clients of the L2, so the ROPs are now clients of the (larger, by the way) L2 cache. That means more render back-end work served from the faster L2 cache instead of directly from slower memory or lower cache levels (L3, if any is provided above the HBM2).
HBM is too high-latency to be considered an L2 or L1 cache. The bandwidth is too low too, but the latency is the bigger issue IMO.
HBM could be thought of as an L3 or L4, though (hence AMD's rebranding of it under the HBC concept). Those cache levels are typically big and slow compared to on-core caches while still being faster and lower-latency than main DRAM.
The demo you're talking about used MUCH larger-capacity but WAY slower, higher-latency SSDs on the GPU, with the software made "aware" of their existence. That sort of thing will matter for specific HPC GPGPU workloads but not for gaming PCs.
For games it's all about the shaders, the ALUs, and the bandwidth to feed them.
I have a Radeon R9 290X with 2GB of memory. I can run some games at 4K at 30 fps on high settings. When I run dxdiag, it says my video memory is 18GB. Does that mean DirectX 12 already uses system memory? I didn't expect to be able to run games in 4K with just 2GB of GPU memory.