According to WCCFTech, AMD commented on Facebook that they are “putting the finishing touches on the 300 series to make sure they live up to expectation”. I tried looking through AMD's “Posts to Page” for February 3rd and I did not see it listed, so a grain of salt is necessary (either with WCCF or with my lack of Facebook skills).
Image Credit: WCCFTech
The current rumors claim that Fiji XT will have 4096 graphics cores that are fed by a high-bandwidth, stacked memory architecture, which is supposedly rated at 640 GB/s (versus the 224 GB/s of the GeForce GTX 980). When you're dealing with data sets at the scale that GPUs do, bandwidth is a precious resource. That said, they also have cache and other methods to reduce this dependency, but let's just say that, if you offer a graphics vendor a free, order-of-magnitude speed-up in memory bandwidth, you will have a friend, and possibly one for life. Need a couch moved? No problem!
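For what it's worth, the two bandwidth figures quoted above make for a quick back-of-the-envelope comparison (both numbers are from the rumor, and the ratio is just arithmetic):

```python
# Rumored Fiji XT HBM bandwidth vs. GeForce GTX 980 GDDR5 bandwidth (GB/s),
# using the figures quoted in the article.
hbm_gbps = 640
gtx980_gbps = 224

ratio = hbm_gbps / gtx980_gbps
print(f"Rumored HBM bandwidth is ~{ratio:.2f}x the GTX 980's")  # ~2.86x
```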
The R9 Series is expected to be launched next quarter, which could be as early as about a month.
So they are not being pushed into Q2 release?
Was that just a rumor?
Q2 starts in about a month.
Semantics, but does Q2 start in April? That’s more than “about a month”, but whatever, it’s soon.
I kinda get the feeling that some of the PCPER team already know a lot about the upcoming parts, but NDA says they ain’t sayin’ shit. Hoping for the best, to prevent a monopoly to say the least.
Nah, I know nothing about that stuff. I just saw some places say about a month out.
Looking at the last article on videocardz, in March we might see the 360 and 360X. All other cards go April or later.
If that is even remotely correct, then we do not see any of the HBM parts until April, which wouldn’t be surprising.
I know you know nothing solid, if you did then you’d be breaking NDA, BUTTTTTTT, someone, someone ginger perhaps, I assume knows a little more than said person is legally allowed to share.
“The R9 Series is expected to be launched next quarter, which could be as early as about a month.”
Not really. AMD’s next financial earnings release is scheduled for April 15th, so the next quarter starts that day. We are at least two months away.
Does this mean nvidia will be breaking out their real flagship for some ungodly price in the next few months? Funny how they trick the community into believing the 980 is the flagship until they are ready to fuck those people who bought the 980 thinking it was.
No one thought the 980 was their flagship. At least no one who knows core names. The xxX04 core is the mainstream core, and the xxX00 core is the big one. From day one we knew that GM200 would come 6+ months after the GM204.
Maybe not the enthusiasts who are onto nvidia’s paradigm, but most people do not know. No one might be a bit of an understatement. Can’t fault nvidia for being savvy businessmen; nevertheless, I do not like their tactics.
They didn’t really do anything wrong here. The 980 IS their fastest card right now.
What they did with the 970, that’s a different story, a story that could get worse, because there is a guy saying that that last half gigabyte is in fact NEVER used, and that the card goes directly to system memory.
Last thing I read is that they won’t be released till the end of the year.
So the HBM era begins. How many total GB of this memory per card? That’s a whole lot of ones and zeros moving at 640 GB/s; it’s not going to be starved for bandwidth. 2015 is going to be fun from start to finish, with 2016 being even more interesting, and AMD is going all in over the next few years. Hopefully Zen and K12 will put them in a better position. I want to see APUs with loads of this HBM, on package or on die, as bandwidth is the big limiter for integrated GPUs; HBM on APUs would sure help.
“I want to see APUs with loads of this HBM, on package or on die, as bandwidth is the big limiter for integrated GPUs; HBM on APUs would sure help.”
HBM isn’t really on-die. On-package is what Intel already makes with their Iris Pro graphics: it has a separate memory chip in the CPU package, which is essentially just a small circuit board. HBM is still separate chips (so it isn’t on-die), but rather than being on a circuit board like a CPU package, it is on a silicon transposer and uses through-silicon vias to stack multiple chips. By stacking chips on a silicon transposer, they can get much shorter circuit paths and many more pins/pads.
So it’s modules then, with interposers and uber-wide buses to get uber-high bandwidth on-module/package memory. It’s the wider bus that I like, way wider than 64 bits. Now if they could only include some CPU cores in the discrete GPUs to accelerate games a little more and take advantage of HBM’s bandwidth. AMD should be able to make some powerful discrete-card gaming systems, and run a gaming-optimized OS on a discrete card as well; imagine never having to go to the slower/narrower main system bus, with all its inherent latency, with the OS, gaming engine, and everything running from HBM. It looks like everything will eventually be placed on the silicon interposer: CPU, GPU, all sharing HBM. This would easily fit on a PCI card and allow gaming to reach beyond the narrow restrictions of current motherboard designs. Gaming motherboards could be made to host many of these systems, and with each new PCI device the user would also get more CPU cores to go with more GPU power. AMD could very easily make it so, if they can get a revenue stream that frees them from their console-maker dependency.
If you already have CPUs on the gpu, then why would you need an external CPU at all? It would be great to have an APU with unified high-bandwidth stacked memory, but to compete with dedicated solutions it would consume similar power to a dedicated solution, which would mean on the order of 350 to 400 watts. That would require quite a cooling solution and a power delivery solution for a socket. They could make smaller chips and just use more of them, but this runs into off-chip bandwidth issues. To really make them work together well, you need something higher bandwidth than a slot; this is why Nvidia was talking about NVLink using a mezzanine connector, which is more like a giant CPU connector than a slot.
The CPU doesn’t get much from HBM unless it is processing a streaming type workload which would be more suited to a gpu anyway. For branch heavy integer code, it is more about caches, branch prediction, and prefetch. Increasing memory bandwidth has little effect for most integer (non-streaming) applications.
HBM is going to make things interesting. There is very little external interconnect since all of the memory will be in the gpu package. You just need a lot of power and ground and the PCI-e link, which is a small number of pins compared to a 256 or 512-bit memory interface. The gpu and memory will be much more compact than board-mounted memory, so a dual-gpu card will actually be a much more reasonable size than current dual-gpu cards. A quad card might be doable as far as card length, but the power consumption would probably get in the way. You would need a lot of power delivery circuitry and probably water cooling to remove the heat.
Does AMD have anything similar to nvlink planned? It would be nice to actually be able to share memory between GPUs (get 8 GB from a dual card rather than 4 GB), but this would require massive off package bandwidth. It may be doable though since there is no off package memory; if you can dedicate the pins that were used for a memory interface to package-to-package interconnect, then this may allow sufficient bandwidth. I don’t know if the design for this would be worthwhile though. It may be better to just use twice as much memory.
Wild speculation aside, I hope we actually see this soon, but I would expect a smaller gpu with board mounted memory first. An AMD APU with HBM would be great, but I don’t know if the transposer with stacked memory will be cheap enough for anything but high end parts.
Actually, it’s less costly, because putting the memory, and maybe even a CPU, on the package means that if a memory module is bad it doesn’t take out the GPU. The memory is made on a different line, and only memory that passes gets placed on the package. Even a CPU could be added to really speed things up and improve latency. It’s eventually going to be necessary to include a CPU on the package/module with the GPU; if resolutions continue to grow beyond 4K, 64-bit motherboard buses will not be able to keep up with the draw calls and latency issues.
With stacked memory, you have more points where things can go wrong and reduce the yield of final product. When you make a graphics card, you presumably also start with tested memory packages and a tested gpu, but something can go wrong in soldering to the pcb of the final card. Same thing with a silicon interposer. You still have to solder the stacked memory and gpu to the interposer, and this uses much smaller solder bumps compared to those used to mount the package onto the pcb. You can have something go wrong in the construction of the interposer. You then have another step to package and solder the interposer onto the pcb so the interposer adds an extra soldering step at minimum. You also have the stacking of the memory dies which could go wrong, but this isn’t directly comparable. Standard memory chips have the memory array but also have to include the interface logic. In an HBM stack, you have the interface logic on the bottom die in the stack. The other chips on top are almost nothing but the memory array so they are much smaller than a gddr5 die, for example. This is a “good thing” because the memory dies can be produced on a process optimized for memory and the bottom logic die can be produced on a process optimized more for logic.
Anyway, don’t count on products using a silicon transposer to be cheap. The transposer is only slightly larger than the gpu die, so it has to be mounted on a pcb to make the final card. Anytime you add steps, you add an opportunity for something to go wrong.
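The point about extra steps is easy to quantify: per-step yields multiply. The percentages below are purely illustrative placeholders, not real manufacturing data; the takeaway is only that every added assembly step drags the overall yield down.

```python
# Hypothetical per-step assembly yields for an interposer-based package.
# These numbers are illustrative only, not actual manufacturing figures;
# the point is that independent step yields multiply.
step_yields = {
    "stack memory dies (TSVs)":    0.98,
    "solder stacks to interposer": 0.99,
    "solder GPU to interposer":    0.99,
    "solder interposer to PCB":    0.99,
}

overall = 1.0
for step, y in step_yields.items():
    overall *= y

print(f"overall assembly yield: {overall:.1%}")  # ~95.1% even with good per-step yields
```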
I have used both [silicon] transposer and interposer; not sure there is much difference in meaning…?