GPU Enthusiasts Are Throwing a FET
The next GPU architecture from NVIDIA is expected to jump two process nodes.
NVIDIA is rumored to launch Pascal in early (~April-ish) 2016, although some are skeptical that it will even appear before the summer. The design was finalized months ago, and unconfirmed shipping information claims that chips are being stockpiled, which is typical when preparing to launch a product. It is expected to compete against AMD's rumored Arctic Islands architecture, which, according to its equally unconfirmed specifications, will be very similar to Pascal.
This architecture is a big one for several reasons.
Image Credit: WCCFTech
First, it will jump two full process nodes. Current desktop GPUs are manufactured at 28nm, which NVIDIA first used with the GeForce GTX 680 all the way back in early 2012, but Pascal will be manufactured on TSMC's 16nm FinFET+ technology. Smaller features have several advantages, but a huge one for GPUs is the ability to fit more complex circuitry in the same die area. This means that you can include more copies of elements, such as shader cores, and do more in fixed-function hardware, like video encode and decode.
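As a rough, idealized estimate of what two node jumps buy you (real processes do not shrink every dimension equally, so treat this as an upper bound, not a TSMC figure): a circuit's footprint scales with the square of the linear feature size.

```python
# Idealized area scaling: a circuit's footprint shrinks with the
# square of the linear feature size.
old_nm, new_nm = 28.0, 16.0
area_ratio = (new_nm / old_nm) ** 2
print(round(area_ratio, 3))      # ~0.327: about a third of the area
print(round(1 / area_ratio, 2))  # ~3.06x the circuitry in the same die area
```

In practice the gain is smaller than 3x, but it is still the headroom that makes more shader cores and more fixed-function hardware possible.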
That said, we got a lot more life out of 28nm than we really should have. Chips like GM200 and Fiji are huge, relatively power-hungry, and complex, which is exactly the sort of design you avoid producing while yields are low. I asked Josh Walrath, who is our go-to for analysis of fab processes, and he believes that FinFET+ is probably still more complicated today than 28nm was back in 2012, when that node first launched for GPUs.
It's two full steps forward from where we started, but we've been tiptoeing since then.
Image Credit: WCCFTech
Second, Pascal will introduce HBM 2.0 to NVIDIA hardware. HBM 1.0 was introduced with AMD's Radeon Fury X, and it helped in numerous ways, from smaller card size to a substantial increase in memory bandwidth. The 980 Ti can talk to its memory at 336GB/s, while Pascal is rumored to push that to 1TB/s. Capacity won't be sacrificed, either. The top-end card is expected to contain 16GB of global memory, which is twice what any console has. This means less streaming, higher resolution textures, and probably even left-over scratch space for the GPU to generate content in with compute shaders. Also, according to AMD, HBM is an easier architecture to communicate with than GDDR, which should mean a savings in die space that could be used for other things.
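Those peak figures fall straight out of bus width times data rate, so they are easy to sanity-check. A quick sketch (the 980 Ti and Fury X values are their published specs; the Pascal line uses the rumored HBM2 configuration, which is an assumption, not a confirmed spec):

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Peak theoretical memory bandwidth in GB/s: bytes per transfer times transfer rate."""
    return bus_width_bits / 8 * data_rate_gtps

# GeForce GTX 980 Ti: 384-bit GDDR5 at 7 GT/s
print(peak_bandwidth_gbs(384, 7.0))    # 336.0 GB/s
# Radeon Fury X: 4096-bit HBM1 at 1 GT/s
print(peak_bandwidth_gbs(4096, 1.0))   # 512.0 GB/s
# Rumored Pascal: 4096-bit HBM2 at 2 GT/s
print(peak_bandwidth_gbs(4096, 2.0))   # 1024.0 GB/s, i.e. the ~1TB/s figure
```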
Third, the architecture includes native support for three levels of floating point precision. Maxwell, due to how limited 28nm was, saved on complexity by reducing 64-bit IEEE 754 floating point performance to 1/32nd the rate of 32-bit, because FP64 values are rarely used in video games. This saved transistors, but it was a huge, order-of-magnitude step back from the 1/3rd ratio found on the Kepler-based GK110. While it probably won't be back to the 1/2 ratio that was found in Fermi, Pascal should be much better suited for GPU compute.
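To see why 1/32 is an order-of-magnitude regression, multiply each card's single-precision throughput by its FP64 ratio (the FP32 figures below are approximate published numbers, used here only for illustration):

```python
def fp64_tflops(fp32_tflops: float, fp64_ratio: float) -> float:
    """Double-precision throughput implied by an FP64:FP32 rate ratio."""
    return fp32_tflops * fp64_ratio

# GTX Titan X (Maxwell GM200): ~6.1 TFLOPS FP32 at a 1/32 ratio
print(fp64_tflops(6.1, 1 / 32))   # roughly 0.19 TFLOPS FP64
# GTX Titan (Kepler GK110): ~4.5 TFLOPS FP32 at a 1/3 ratio
print(fp64_tflops(4.5, 1 / 3))    # roughly 1.5 TFLOPS FP64
```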
Image Credit: WCCFTech
Mixed precision could help video games too, though. Remember how I said it supports three levels? The third one is 16-bit, which is half the width of the 32-bit format commonly used in video games. Sometimes, that is sufficient. If so, Pascal is said to do these calculations at twice the rate of 32-bit. We'll need to see whether enough games (and other applications) are willing to drop down in precision to justify the die space that these dedicated circuits require, but it should double the performance of anything that does.
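To get a feel for how much precision FP16 gives up, you can round-trip a value through the IEEE 754 16-bit format using Python's standard library (the `struct` module's `'e'` format code, available since Python 3.6). This is a minimal sketch: FP16 has a 10-bit mantissa, so it holds roughly three decimal digits, and its largest finite value is 65504.

```python
import struct

def to_half(x: float) -> float:
    """Round a float to IEEE 754 half precision and back to double."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# A color channel value survives nearly intact: fine for 8-bit output.
print(to_half(0.7215))    # 0.7216796875
# Integers above 2048 are no longer exact; large world-space
# coordinates are where half precision falls apart.
print(to_half(4097.0))    # 4096.0
print(to_half(65504.0))   # 65504.0, the largest finite FP16 value
```

This is why dropping to FP16 is safe for some data, like colors and normals, and risky for others; the die-space question above is about how often games hit the safe cases.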
So basically, this generation should provide the massive jump in performance that enthusiasts have been waiting for. Memory bandwidth and the amount of circuitry that can fit on the die are two major bottlenecks for most modern games and GPU-accelerated software. We'll need to wait for benchmarks to see how the theoretical maps to the practical, but it's a good sign.
Anandtech – AMD Radeon R9 285 Review: Feat. Sapphire R9 285 Dual-X OC
AMD introducing updated FP16 instructions back in 2014 with Tonga
Wow, so 2 AMD GPUs have it. That is it. That is huge.
5 currently, 7 soon.
Arbiter's point is that those are two GPUs: Tonga and Fiji.
Still, NVIDIA currently has zero (unless a really old or mobile one just happens to have native support for it that I'm unaware of, but that's beside the point even if true). I am more interested in whether it will be used frequently enough to justify the die space it requires, versus just emulating it in FP32.
Yes, that's quite a few. That said, the real question is whether games will need or benefit from that level of precision.
Anyhow, I just hope AMD keeps gaining momentum. While I like Nvidia, I’m not comfortable with them having such a big marketshare gap over AMD, and I certainly don’t want them to ever become a monopoly.
I have many reasons for focusing on buying AMD over Nvidia and Intel, but my biggest one is usually not wanting either big company to become a monopoly. Plus, AMD’s products often have good bang/buck in certain price/performance ranges.
You realize if AMD went down and sold Radeon to Intel and gave all of its x86 IP (including x86_64 ownership) to Nvidia, neither would be a monopoly, and we’d have even better competition, right?
Delusion, that’s what yours is. Sounds like you have no idea how patent law works.
Not really. If AMD goes down or gets bought, Intel can't sell any chip they own until they rewrite the contract they have with AMD. It used to be that Intel licensed AMD the x86 patents, but eventually AMD got ahead, and now the tables have turned: both companies license each other's instruction sets.
Intel would never want to make itself a new competitor, and Nvidia being a monopoly would be the worst idea ever.
man, im sooo going to skip the next gen nvidia and amd stuff. you know there's going to be a few kinks with the new chips
I feel like that’s the right idea. I’m betting the first generation 16nm and HBM2 stuff on both sides will have issues.
So staying with tried and true 28nm stuff that’s been tested and proven stable seems like a good idea for now. Plus, by the time the kinks are worked out with the next-gen stuff, the prices will be lower too.
So waiting is win/win for me. Fewer bugs and hardware issues, and lower prices. Plus, we can wait and see which cards give the best price/performance and are best for DX12 and/or legacy DX11.
I upgraded to Windows 10 on the day it released and now wish I’d just waited, looking back.
What happens if they then move to another, smaller process immediately after that? Will you skip that process node too? Just because it is the first generation doesn't mean it will be bad. Yes, there may be issues, but these issues surely won't manifest in a way that inhibits the use of the product or puts it behind the performance of the previous generation. If they did, there would be no reason to release the product, especially when the company relies almost entirely on this type of product for its income.
So wanna make a bet? We're probably going to stay on 16nm much longer than we did on 28nm. :p
Two-plus process node shrinks and a new memory type is a huge upgrade. There's enough competition between AMD and Nvidia that they will launch the best they can as soon as 16nm FinFET is ready. This should be a huge step forward, more than Kepler or Maxwell. Of course, we will have to wait and see.
The bad news is you can't stay with the older generation, because Nvidia will sabotage its game performance via drivers.
That's a very odd statement, considering AMD's DX11 drivers are very inefficient and can cause a loss of performance, especially on weaker CPUs.
AMD writes their own drivers, so I'm not sure how NVidia could sabotage that, and if you're referring to something else, like GameWorks, that's not a driver, so it's a whole different ball of wax.
I know there are people that claim NVidia does not play fair; however, if you look very carefully, you might discover that it is AMD that is not doing so more than NVidia. That's a discussion for another time though.
GP100 isn't really new. It's been in development for YEARS, and so has 16nm FinFET. Just like 28nm and GK110.
These GPUs are designed primarily for several-hundred-million-dollar pre-exascale supercomputers.
GP100 is directly competing with Intel's Knights Landing and Knights Hill, not with anything from AMD.
They HAVE to work, or Nvidia will lose millions in market share to Intel. They'll be shipping to integrators before consumers, in batches of thousands for a single computer.
The fact that you'll be able to put one of them in your PC is amazing. Rest assured they will work just fine; the last few years have been spent working the kinks out.
I doubt this is going to stop developers from making shitty games like Batman. 😛
I just hope this results in a card that can last for at least 5 years when combined with an adaptive sync technology. Gfx tech is evolving way too fast to keep up with in such a recessionary environment.
I always go team red, but I’m genuinely excited to see just how awesome nvidia’s HBM cards are going to be.
It will also be fun to see how far up the scale this shifts GPU price points.
Um, the 28nm node was introduced on the 7970 first, at the end of 2011, not the GTX 680 in 2012.
You know this is Nvidia news, right? Nvidia introduced it in 2012 on their products.
As the article states.
He's right. The article makes it seem Nvidia was the first to introduce 28nm with the GTX 680, which it clearly wasn't.
Golden rule of updating a graphics card (or smartphone): update AT LEAST 2 gens later to see a noticeable change.
Also, these cards will coincide with the Oculus Rift's release, a device that will cost at least $350. (http://www.techtimes.com/articles/90681/20151003/the-very-first-production-oculus-rift-has-been-made-and-it-will-cost-more-than-350.htm)
That thing will need a card pushing at least 2160×1200 at better-than-decent framerates, due to the nature of VR.
When the Oculus starts flying off the shelves, and it will IMHO at any price level below or around $550-$600, GPU manufacturers will have to offer "Rift ready" cards (I expect branded stickers, like "Windows 10 ready"), but at lower costs than today's cards of similar power, because one must consider the cost of the Rift in the equation.
The Rift will not cost thousands anyway, so you can't really call it a hardcore/novelty item to justify sticking hardcore-priced cards next to it, marketing-wise.
I am expecting the gen AFTER the upcoming one to offer much, much better performance per dollar, because of the effect the Oculus will have on the gaming and other industries, and because AMD and NVIDIA will already have covered their R&D expenses for their upcoming cards, thus being able to lower pricing and fight for Rift market share.
I say if you can afford it, upgrade whenever you want! YOLO!
Single precision, 32 bits (about 7 decimal digits).
Halving that to 16 bits... surely that will reduce the precision down so low that game designers and programmers will find it extremely hard to code for the things they need.
But twice as fast… maybe
The half-precision float is probably for deep learning applications. I remember reading about how they can use low precision and still get good results.