Process Technology Overview
Evidence points to 20 nm products as being undesirable for GPU technology.
We have been very spoiled throughout the years. We likely did not realize exactly how spoiled we were until it became very obvious that the rate of process technology advances had hit a virtual brick wall. Every 18 to 24 months a new, faster, more efficient process node was opened up to fabless semiconductor firms, and we were treated to a new generation of products that would blow our hair back. Now we are at a virtual standstill when it comes to new process nodes from the pure-play foundries.
Few expected the 28 nm node to live nearly as long as it has. Some of the first cracks in the façade actually came from Intel. Their 22 nm Tri-Gate (FinFET) process took a little bit longer to get off the ground than expected. We also noticed some interesting electrical features from the products developed on that process. Intel skewed away from higher clockspeeds and focused on efficiency and architectural improvements rather than staying at generally acceptable TDPs and leapfrogging the competition by clockspeed alone. Overclockers noticed that the newer parts did not reach the same clockspeed heights as previous products such as the 32 nm based Sandy Bridge processors. Whether this decision was intentional from Intel or not is debatable, but my gut feeling here is that they responded to the technical limitations of their 22 nm process. Yields and bins likely dictated the max clockspeeds attained on these new products. So instead of vaulting over AMD’s products, they just slowly started walking away from them.
Samsung is one of the first pure-play foundries to offer a working sub-20 nm FinFET product line. (Photo courtesy of ExtremeTech)
When 28 nm was released, the plans on the books were to transition to 20 nm products based on planar transistors, thereby bypassing the added expense of developing FinFETs. It was widely expected that FinFETs were not necessarily required to address the needs of the market. Sadly, that did not turn out to be the case. There are many other reasons why 20 nm planar parts are not common, but the limitations of that particular process have made it a relatively niche node that is appropriate for smaller, low power ASICs (like the latest Apple SOCs). The Apple A8 is rumored to be around 90 square mm, which is a far cry from the traditional midrange GPU that ranges from 250 square mm to 400+ square mm.
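The gap between a 90 square mm SOC and a 400 square mm GPU matters more on a young process than the raw numbers suggest. Here is a rough sketch using two textbook approximations: a common gross-dies-per-wafer estimate and the simple Poisson yield model. The 300 mm wafer size and the defect density are assumptions picked purely for illustration; they are not published figures for any foundry's 20 nm process.

```python
import math

WAFER_DIAMETER_MM = 300.0          # assumption: standard 300 mm wafer
DEFECTS_PER_SQ_CM = 0.5            # hypothetical defect density for an immature node


def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return wafer_area / die_area_mm2 - edge_loss


def poisson_yield(die_area_mm2, defects_per_sq_cm=DEFECTS_PER_SQ_CM):
    """Poisson model: fraction of dies that catch zero killer defects."""
    die_area_cm2 = die_area_mm2 / 100.0
    return math.exp(-die_area_cm2 * defects_per_sq_cm)


# Die sizes taken from the text: A8-class SOC vs. the midrange GPU range
for name, area in [("A8-class SOC", 90), ("midrange GPU", 250), ("large GPU", 400)]:
    gross = gross_dies_per_wafer(area)
    good = gross * poisson_yield(area)
    print(f"{name:14s} {area:4d} sq mm: ~{gross:5.0f} gross dies, "
          f"yield {poisson_yield(area):.0%}, ~{good:4.0f} good dies")
```

Under these assumed inputs the small die loses a modest share of candidates to defects while the large die loses most of them, on top of starting with far fewer dies per wafer. Real foundry yield models are more sophisticated and parts with defects can often be salvaged as lower bins, so treat this as the shape of the problem rather than a prediction.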
The essential difficulty of the 20 nm planar node appears to be a lack of power scaling to match the increased transistor density. TSMC and others have successfully packed more transistors into every square mm as compared to 28 nm, but the electrical characteristics did not scale proportionally well. Yes, there are improvements per transistor, but when designers pack all those transistors into a large design, TDP and voltage issues start to arise. More transistors switching in the same area draw more power, and all of that power ends up as heat that has to be removed. The GPU guys probably looked at this and figured out that while they could achieve a higher transistor density and a wider design, they would have to downclock the entire GPU to hit reasonable TDP levels. Add in the concerns about yields and bins on a new process, and the advantages of going to 20 nm would be slim to none at the end of the day.
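To put some rough arithmetic behind that argument, here is a small sketch built on the textbook approximation that dynamic power per transistor scales with C * V^2 * f. Every scaling factor below is a hypothetical placeholder chosen to show the mechanism; none of them are measured values for TSMC's or anyone else's 20 nm process, and leakage is ignored entirely.

```python
def relative_power_density(density_gain, cap_scale, voltage_scale, freq_scale=1.0):
    """Power per unit area on the new node, relative to the old node.

    Per-transistor dynamic power is approximated as C * V^2 * f, then
    multiplied by the gain in transistors per square mm.
    """
    per_transistor = cap_scale * voltage_scale ** 2 * freq_scale
    return density_gain * per_transistor


def freq_scale_for_equal_density(density_gain, cap_scale, voltage_scale):
    """Clock multiplier that keeps power per unit area at the old node's level."""
    return 1.0 / (density_gain * cap_scale * voltage_scale ** 2)


# Hypothetical inputs, for illustration only
DENSITY_GAIN = 1.9     # assumed: transistors per square mm vs. the old node
CAP_SCALE = 0.75       # assumed: switched capacitance per transistor vs. the old node
VOLTAGE_SCALE = 0.95   # assumed: supply voltage vs. the old node

same_clock = relative_power_density(DENSITY_GAIN, CAP_SCALE, VOLTAGE_SCALE)
needed_clock = freq_scale_for_equal_density(DENSITY_GAIN, CAP_SCALE, VOLTAGE_SCALE)

print(f"Power per sq mm at the same clock: {same_clock:.2f}x the old node")
print(f"Clock needed to match old power per sq mm: {needed_clock:.2f}x")
```

With these made-up inputs, power per square mm rises by roughly 29% at the same clock, and the clock would have to fall to roughly 78% of its old value to get back to the original power density. In practice a designer would trade voltage and clock together rather than clock alone, but the direction is the same: if density improves faster than per-transistor power, a big chip has to give back clockspeed to stay inside its TDP.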
Hindsight is of course 20/20, but back in 2012 we started to hear about a push to develop FD-SOI (fully depleted silicon-on-insulator) products for 28 nm and 20 nm. AMD has a history of using PD-SOI (partially depleted), but when they spun off their fabrication arm to GLOBALFOUNDRIES, the group decided to forgo development on any more SOI products and concentrate on bulk silicon (like Intel had done). The idea here was that materials such as those used in HKMG production would scale adequately from 28 nm to 20 nm, thereby delaying the R&D costs of developing FinFET technology for another couple of years. Why spend the money now if there is no pressing need for it? If bulk silicon and current materials could power the industry for the next few years, why go off on a side branch of SOI technology that could potentially not pay for itself?
ST-Micro developed a 28 nm FD-SOI process, but unfortunately it was done at a fab that could not provide nearly enough wafers a month to satisfy any kind of demand. If I remember correctly, it was limited to several hundred wafers a month. That would be enough to handle some RF designs, but it would be entirely inappropriate for any kind of large scale production of a part that would go into a GPU product line or a low power, mass produced handset. Technically, though, this particular process was a great success in terms of power consumption and transistor switching performance. ST-Micro showed off ARM Cortex-A9 designs that hit 3 GHz, all the while having better overall power characteristics at idle and full load than 28 nm HKMG products.
We started hearing about the potential of this technology and that a theoretical 20 nm FD-SOI planar product would have slightly better electrical characteristics than Intel’s first generation 22 nm Tri-Gate. A gate-last implementation could have been class leading in terms of feature size and power/speed characteristics. Unfortunately for this technology, there was a lot of risk involved with developing a 20 nm FD-SOI product line. Equipment built to handle bulk silicon would have to be modified or replaced entirely to handle FD-SOI. It is an expensive endeavor, and while FD-SOI can support FinFET technology (FinFETs are in fact based on fully depleted deposited layers), most of the current research from multiple competitors has been on bulk silicon. We can address “what ifs” all day, but looking back, planar FD-SOI would have paid off handsomely for whoever managed to develop it, given how long 28 nm HKMG has been extended as a leading edge process technology.
Apple's A8 SOC is one of the first large, mass produced chips based on 20 nm planar technology. (Photo courtesy of Chipworks)
Looking over the foundry landscape, we now understand why the 28 nm HKMG process has lasted as long as it has. It is no longer cutting edge, but it is well understood and quite mature. AMD and NVIDIA have had to do a lot more in terms of design to overcome the limitations of the 28 nm HKMG process. Some years ago I theorized that we would see a situation where process tech would simply come to a standstill for longer than expected, and that this is when design and engineering would have to come to the fore to drive chip level improvements.
28 nm for GPUs Through 2015
This is where some speculation begins. So far we have only seen 28 nm products from NVIDIA as they have refreshed their lineup with Maxwell based parts. The GM200 is a massive chip at around 600 square mm, which is near the practical reticle limit at 28 nm. Yes, guys like IBM have chips that are larger in size, but these are not exactly mass produced parts that are supposed to have reasonable margins attached to them all the while addressing the consumer market. The GM200 looks to be the final puzzle piece for NVIDIA throughout 2015, with Pascal based parts being introduced in 2016.