GPU Specifications
As I mentioned before, the Kepler implementation on Tegra K1 is surprisingly close to the design you will find in a GeForce GTX 780 Ti. The SMX unit includes 192 CUDA cores, a unified memory cache, and dedicated acceleration for tessellation, Z culling, and color ROPs. The primary differences between the Tegra and GeForce units are a move from 16 texture units to 8 and from 8 color ROPs to 4.
Communication on the SMX has been changed up quite a bit though, so a new on-chip network was needed for a power-efficient implementation. Using the same communication routes found on the desktop / discrete Kepler GPUs just wouldn’t work in the SoC; the complexity of the desktop design costs a lot of power.
With a Kepler GPU in Tegra K1 you get all the benefits of Kepler automatically. A feature like hardware tessellation doesn’t surprise PC gamers, but for mobile users and developers the feature is new and it is unique to NVIDIA. Tessellation allows you to dynamically generate geometry based on a level of detail variable, usually set by screen position. This can be a savings of nearly 50x on triangle generation when compared to OpenGL ES 2.0 software tessellation, and significant performance improvements will be seen in specific scenarios that take advantage of it. NVIDIA showcased a couple of demos running on the Tegra K1 reference platform, including a terrain map and NVIDIA’s classic Stone Giant demo; both were impressive and running well.
Geometry shading is also included with Kepler and can be utilized for cube maps, voxel rendering, and shadow volumes. Bindless textures are supported with the Tegra K1 to allow developers to access textures directly from memory. All of these features are expected on discrete GPUs, but are impressive additions to mobile graphics.
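To make the level of detail idea concrete, here is a minimal sketch in Python. It is not NVIDIA's implementation: the function names, the linear falloff, and the use of camera distance as the level of detail variable are all assumptions made for illustration, and the triangle count assumes an idealized uniform subdivision of a triangular patch.

```python
def tessellation_level(distance, near, far, max_level):
    """Map camera distance to an integer tessellation level.

    Hypothetical helper: patches at or inside `near` get `max_level`,
    patches at or beyond `far` get level 1, with a linear falloff between.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    t = (distance - near) / (far - near)
    t = min(max(t, 0.0), 1.0)  # clamp to [0, 1]
    return max(1, round(max_level - t * (max_level - 1)))


def triangles_per_patch(level):
    """Triangles produced when each edge of a triangular patch is split
    into `level` segments (idealized uniform subdivision)."""
    return level * level


near_level = tessellation_level(0.0, near=1.0, far=100.0, max_level=16)
far_level = tessellation_level(100.0, near=1.0, far=100.0, max_level=16)
print(near_level, triangles_per_patch(near_level))
print(far_level, triangles_per_patch(far_level))
```

The point of the sketch is only that distant patches collapse to a handful of triangles while nearby patches get dense geometry, so the triangle budget follows what the viewer can actually see.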
Tegra K1 supports GPU accelerated path rendering for improved text clarity and fast zooming. This feature has been a part of browsers and Android for some time, but it is good for NVIDIA to be keeping up in these key user experience areas with GPU compute.
In an architecture with a pretty limited memory bus width and low bandwidth, compression of textures and color can mean a lot. Compression is not only used for gaming; the Tegra K1 can apply it through many stages of the pipeline to improve both performance and the power efficiency of the platform.
The examples above show how much bandwidth NVIDIA is able to save with the GPU compression of Tegra K1. For mobile devices, saving memory bandwidth directly equates to power savings and battery life. Performance benefits likely won’t be seen until K1 is paired with higher resolution displays, where memory bandwidth could become a bottleneck.
Taking a GPU that currently resides in 200+ watt graphics processors and paring it down to fit into mobile form factors that require a maximum power draw of 2 watts might seem like an impossible task, but NVIDIA was able to accomplish it with Kepler and a long bullet list of features. Rail gating, clock gating, power gating, GPU L2 cache and compression, early Z culling, and optimized interconnects are all at work in Tegra K1 to bring power down.
During briefings NVIDIA gave a specific example of how efficient Kepler can be. Take the GeForce 740M graphics card, which utilizes two SMX units at 19 watts. First, remove 3 watts for IO and memory and 6 watts for leakage from higher voltages, and you are down to 10 watts, or 5 watts per SMX. If you run the voltage at 0.9V rather than the 1.1V implemented and clock down from 1.0 GHz to 500 MHz, then you reach the 2 watt level that K1 needed.
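The arithmetic in that example can be checked with a short sketch. The subtraction steps come straight from NVIDIA's numbers; the final scaling step is my assumption, using the common approximation that dynamic power scales with voltage squared times frequency (NVIDIA did not say which model it used), and the function and variable names are purely illustrative.

```python
def scaled_dynamic_power(power_w, v_old, v_new, f_old_hz, f_new_hz):
    """Scale a power figure assuming P is proportional to V^2 * f.

    This is the textbook CMOS dynamic power approximation, used here as
    an assumption; it ignores leakage and other second-order effects.
    """
    return power_w * (v_new / v_old) ** 2 * (f_new_hz / f_old_hz)


board_power_w = 19.0    # GeForce 740M figure quoted by NVIDIA
io_and_memory_w = 3.0   # removed for IO and memory
leakage_w = 6.0         # removed for leakage from higher voltages
smx_count = 2

per_smx_w = (board_power_w - io_and_memory_w - leakage_w) / smx_count
k1_estimate_w = scaled_dynamic_power(per_smx_w, 1.1, 0.9, 1.0e9, 500e6)

print(f"{per_smx_w:.1f} W per SMX, about {k1_estimate_w:.2f} W after scaling")
```

Under that assumption the single SMX lands at roughly 1.7 watts, which is consistent with the 2 watt target quoted above.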
NVIDIA demonstrated the GPU efficiency of Kepler in the K1 by comparing the reference platform to the iPhone 5s with the Apple A7 SoC, and the Sony Xperia Z Ultra with the Qualcomm Snapdragon 800 and Adreno 330 GPU, in the new GFXBench 3.0. As the name suggests, this graphics test uses OpenGL ES 3.0, and we are looking at the Manhattan 1080p offscreen result below.
In both of the direct comparisons being made, the Tegra K1 is roughly 1.5x more power efficient than the other SoCs at work. At the high performance level the K1 at 1.75W (SoC and memory) runs at the same performance as the iPhone 5s at 2.5W. At 1.5W the K1 performs the same as the Xperia at nearly 2.20W. The obvious issue with these results, other than the fact that they were run and presented by NVIDIA, is that we are only looking at single data points rather than a performance per watt curve. It is easy for a vendor to pick specific use cases where their silicon outperforms the competition, but the ability to do that for all (or most) of the device’s voltage range is much more important.
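The efficiency claim is easy to sanity check from the quoted wattages. The sketch below assumes, as NVIDIA's comparison does, that performance is equal at each pair of power figures, so the efficiency ratio reduces to a ratio of watts; the names are illustrative only, and the Xperia figure is the "nearly 2.20W" value taken at face value.

```python
def efficiency_ratio(competitor_w, k1_w):
    """Perf-per-watt advantage of K1 when both chips deliver equal performance."""
    return competitor_w / k1_w


vs_iphone_5s = efficiency_ratio(2.5, 1.75)   # high performance data point
vs_xperia_z = efficiency_ratio(2.20, 1.5)    # lower performance data point

print(f"vs iPhone 5s: {vs_iphone_5s:.2f}x")
print(f"vs Xperia Z Ultra: {vs_xperia_z:.2f}x")
```

Both ratios come out a little under 1.5x, so the headline number is a rounded one, and each is still only a single point on what should be a full performance per watt curve.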
Well, with Nvidia merging its previously separate mobile and desktop GPU technology, and entering the exclusive top-tier custom ARM ISA design club with Apple and others, it should not be too difficult to estimate what Maxwell will bring to the table. Merging the desktop GPU with some on-die CPU cores, and maybe a large on-die RAM, should begin the move towards less reliance on the motherboard CPU. Gaming engines and other latency/bandwidth-constrained code will now run on, and hopefully reside in, a large on-die RAM to reduce these latency/bandwidth issues between gaming engine code and the GPU. This puts the relevance of the motherboard CPU into question, with respect to discrete GPUs possessing their own complete gaming system ability.
I don’t think you’ll be seeing the x86 CPU going the way of the dodo bird anytime soon… at least not for a while.
For sure x86 will never completely go away. AMD will be doing the very same thing with its own ARM-based APUs as Nvidia’s K1, but AMD will also be taking the x86 ISA on board its discrete GPUs, making them CPU/GPU-accelerated complete gaming platforms, via the x86-based technology AMD has already developed for the gaming consoles! Both Nvidia’s discrete Maxwell GPUs and AMD’s future discrete GPUs will merge the CPU with the GPU and, by themselves, become complete gaming platforms on a PCI card.
In the paragraph at the end of the page on GPU-Specifications, you misuse V where you mean W.
Shield 2 or 3
I feel like this could make for a perfect Steam OS box.
SteamOS is developed for x86(-64) and its games will be, too. It could be a good FirefoxOS console (or whatever).
Can`t wait !
Tegra 4 repeat ?
Samsung & Qualcomm already announced last month that their 64-bit chips will be coming out for the new phone/tablet season. Both have already been leaked to be in phones by spring.
Unless Nvidia sells the 32-bit K1 cheap to make it attractive, I don’t see how it can gain traction. Tegra 4 was overpriced and its modem wasn’t certified, so it was a no-go for phones or tablets that used cell service.
Did you read the article at all? Any of it? These new SoCs will have powerful graphics embedded, and it seems that the graphics are powerful enough to be on par with PS3/Xbox 360. Potentially this could lead to being on par with the Xbox 1 & PS4 within a few years. Now that is exciting.
Can’t believe you never once mentioned that Tegra K1 will lack native on-chip support for LTE.
It also doesn’t support CDMA.
So it won’t work on Verizon or Sprint networks in the U.S.
This is pretty compelling stuff. As you say, it all depends on if they get any design wins. But for my 2 cents, I’d probably buy a phone or tablet with the Tegra K1, assuming it comes out before it gets leapfrogged by the next Adreno or Apple A8.
PCI Express capability is interesting. Does that mean this chip could potentially run Thunderbolt? Might do interesting things for accessory connectivity.
I like the comparison in raw compute power with last gen consoles. At the rate things are going, we’re going to catch up with current gen consoles before next gen consoles come out.
One of the other big things stopping developers from coming out with real, true-to-life console quality games for mobile chips is the lack of a standard controller. With Bluetooth HID controllers, you can have a very different set of buttons on each one, so there is a barrier to entry, both for the developer, who would have to make the game configurable enough that a wide variety of controllers is usable, and for the user, who has to go through that setup and may end up failing to get a good, workable configuration. Consoles have a single defined set of buttons, a single set of hardware, and that means the developer knows exactly what to design for.
So, even if Tegra K1 takes off, we may still have yet to see a lot of heavy hitting games put onto mobile platforms, unless someone comes along and makes a big push for a single controller definition. Apple, as a matter of fact, did this for iOS, so maybe that style of controller will become the controller for iOS, and spill over into the rest of the world, so that maybe there’s one big one that all the game developers design for, and the rest of the controllers can either follow suit or fall behind.
The rumor mill is already placing the Apple A8 in some sort of sub-MacBook Air form factor with a full keyboard and running OS X and iOS. I’m looking for a poor man’s version of those expensive professional graphics tablets, running the Nvidia K1 Denver cores, with at least a 10-12 inch HD screen, and running Linux Mint, for my Gimp graphics and light Blender 3D mesh modeling. I wish Nvidia could have done some high-polygon mesh modeling demos on the K1, like the ones they had with the Cortex-A15 cores! Full OpenGL should work with Blender and Gimp, as well as other open source software.
Can`t wait…AWESOME…thanks PCPER for the info.
This is such a remarkable advance that I can’t wait to see K1-powered hardware hit the market. There was a rumor last week that Microsoft’s next Windows RT tablet (presumably the Surface III) will be built around the Tegra K1. Has there been any confirmation of this?