The Mali-T760 and T720
The T760 is ARM's top-end product for advanced graphics capabilities and performance. It is completely redesigned from the T600 series and is poised to be a performance leader in its market. It includes up to 16 unified shader cores (the actual number is at the licensee’s discretion) and a host of advanced features that improve both performance and power consumption. The T700 architecture is based on very power-efficient tile-based rendering, which was spearheaded by Imagination Technologies some 17 years ago. Tiling is very efficient in terms of memory accesses, which makes it more power efficient than an immediate mode renderer built on a similar process technology.
A new memory management unit is complemented by the addition of ARM Frame Buffer Compression (AFBC) technology. AFBC is a lossless compression format that is used at multiple stages of the rendering and output pipeline. Textures are compressed by the CPU and then delivered to the GPU. The GPU processes the scene, and the resulting output is re-compressed and sent to the display controller, where it is decoded and displayed on-screen. ARM found that working with these compressed objects not only improved performance, but also decreased overall memory usage and power consumption. Compressing at logical points from beginning to end saves bandwidth, which again makes the chip more power efficient: transferring data to and from memory costs more power than compressing and decompressing it in this manner. Even though AFBC adds to the complexity of the chip, it does so intelligently, so the design ends up more efficient overall than the traditional uncompressed approach.
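To put the bandwidth argument in rough numbers, here is a back-of-the-envelope sketch of the memory traffic needed just to write out finished frames. The function name, the example resolution, and especially the `compressed_fraction` value are assumptions made for illustration; ARM did not give us a compression ratio, and real savings would depend on the content being rendered.

```python
def framebuffer_write_bandwidth(width, height, bytes_per_pixel, fps,
                                compressed_fraction=1.0):
    """Bytes per second needed to write every frame to memory once.

    compressed_fraction is a hypothetical placeholder: the size of the
    compressed frame relative to the uncompressed one (1.0 = no compression).
    """
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame * fps * compressed_fraction


# Example: a 2560x1600 tablet panel, 32-bit color, 60 frames per second.
uncompressed = framebuffer_write_bandwidth(2560, 1600, 4, 60)

# Assumed, not measured: pretend lossless compression halves the frame size.
compressed = framebuffer_write_bandwidth(2560, 1600, 4, 60,
                                         compressed_fraction=0.5)

print(f"uncompressed: {uncompressed / 1e6:.1f} MB/s")
print(f"compressed:   {compressed / 1e6:.1f} MB/s")
```

The point is only that the traffic scales directly with resolution and refresh rate, so any fraction shaved off each frame is saved on every frame, every second.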
ARM is also taking advantage of the tiling mechanism employed by the Mali products. Because the output is tiled, the GPU can quickly determine whether a particular tile has changed from one frame to the next. In Android-based applications, the T700 chip will only transmit the tiles that contain changed information, so static tiles stay resident in the display buffer. Only the tiles that have new values are updated in the buffer, which reduces bandwidth and composition workload and lowers power consumption.
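The idea of skipping unchanged tiles can be sketched in a few lines. This is a minimal illustration, not ARM's hardware implementation: it assumes a per-tile CRC32 signature as the change detector, a 16x16 tile size, and a one-byte-per-pixel frame stored row by row. The names `tile_signatures` and `changed_tiles` are hypothetical.

```python
import zlib

TILE = 16  # assumed tile edge in pixels, chosen for illustration only


def tile_signatures(frame, width, height, tile=TILE):
    """Map each tile's top-left corner (x, y) to a CRC32 of its pixels.

    frame is a bytes-like object, one byte per pixel, in row-major order.
    """
    sigs = {}
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            crc = 0
            # Fold each row of the tile into a running CRC.
            for y in range(ty, min(ty + tile, height)):
                start = y * width + tx
                end = y * width + min(tx + tile, width)
                crc = zlib.crc32(frame[start:end], crc)
            sigs[(tx, ty)] = crc
    return sigs


def changed_tiles(prev_sigs, frame, width, height, tile=TILE):
    """Return (tiles whose signature differs from prev_sigs, new signatures)."""
    new_sigs = tile_signatures(frame, width, height, tile)
    dirty = [pos for pos, sig in new_sigs.items() if prev_sigs.get(pos) != sig]
    return dirty, new_sigs


# Usage: a 64x32 frame (8 tiles) in which a single pixel changes.
width, height = 64, 32
frame_a = bytes(width * height)
frame_b = bytearray(frame_a)
frame_b[5 * width + 20] = 255  # pixel at x=20, y=5

sigs_a = tile_signatures(frame_a, width, height)
dirty, sigs_b = changed_tiles(sigs_a, frame_b, width, height)
print(f"{len(dirty)} of {len(sigs_b)} tiles need to be written: {dirty}")
```

Only the tile containing the modified pixel is reported as dirty; the other tiles would be left alone in the buffer, which is where the bandwidth saving comes from.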
This GPU was designed from the ground up to be very tightly integrated with the latest Cortex A15, A53, and A57 processors from ARM. The CPU/GPU links are optimized in these processors to allow the greatest amount of memory bandwidth to be attained. We were not given more details on this, as Imagination Technologies, NVIDIA, and Qualcomm all produce their own graphics portions, which may or may not have the same level of optimization that ARM offers with its Mali-T700 products.
ARM claims that the T760 has a 400% increase in energy efficiency over the Mali-T604, which was one of the first T600 series products released. This comes from performance optimizations in many different areas and a focus on power consumption. All indications point to this being a leading graphics component for the ARM ecosystem, but we have yet to see it implemented in a product. We are still quite a few months from that point, but interest in this new product appears to be strong for high-end tablets. As screen resolutions keep climbing on these products, more performance and power efficiency are needed to keep these tablets lit.
The Mali-T720 is the latest midrange GPU offering from ARM, and it is aimed squarely at both area efficiency and power efficiency. It is a smaller GPU with fewer of the higher-end bandwidth-saving features than the T760. It is again highly optimized compared to the T600 series and gains a significant amount of efficiency over previous cost-optimized GPUs. It features up to 8 shader cores, but has fewer of the memory and bandwidth structures that improve performance on the T760. It is a smaller chip than the T62x series, but it performs at nearly the same level with better power efficiency. It is a big step up from the previous Mali-450, which has been integrated in many designs around the world.
This is a product that will find its way into Cortex A7, A12, and A53 designs due to its overall good performance and impressive power characteristics. ARM obviously put some time into these designs to get the die size down so that they can be implemented more effectively in a variety of applications. Licensees can either fit more shader units and performance into the same die area, or use fewer shader units in SoCs that are more budget conscious.
ARM works very closely with the EDA (Electronic Design Automation) software producers, like Cadence and Synopsys, to improve time to market and develop more efficient chips for the licensees. The current target is the 28 nm node, but ARM has already achieved success with test chips on 20 nm, 16 nm, and 14 nm nodes. These are obviously not ready for prime time, but ARM is working with the foundries and the design software vendors to prepare the road for its licensees. Once a process opens up for mass production, ARM hopes to be there with a solution so that the time to market for these parts will be exceptionally short (comparatively speaking).
These products are not exactly next generation, but they do keep ARM in the running for providing a solid graphics platform for its partners. The competition will not sleep, and it will certainly be interesting to see NVIDIA's Kepler for mobile eventually make it out to market. For the time being, ARM has a solid-looking implementation for the market at 28 nm and beyond.