The Present and Future of Qualcomm GPUs
Qualcomm is now well on its way to the full release of the Adreno A4x series of GPUs, including the A405, A418, A420, and A430 with even higher performance and support for newer APIs like DX11.2. The A420 is already shipping with the Snapdragon 805 in phones like the Droid Turbo and Nexus 6. The A430 will increase the shader count by 2.25x, improving performance for so-called “superphones” and tablets.
Many hurdles still remain for Qualcomm and other mobile GPU vendors that will push the demand for GPU compute well into 2016. Within a year, Qualcomm expects 4K mobile displays that draw the same power as the QHD panels shipping in today's top-end devices. High-quality gaming, including previous-generation console-level graphics (PS3 / Xbox 360) from engines like UE4 (though obviously with smaller data sets), is coming. Newer APIs like DX12 and the next generation of OpenGL ES are right around the corner. Furthermore, virtual reality headsets from companies like Samsung and Oculus are already using mobile devices to take over as a competitive gaming platform. (Currently the Samsung Gear VR headset only supports Qualcomm Snapdragon/Adreno-based devices.)
General purpose GPU computing will also drive the need for faster GPUs. Imaging and multimedia are key differentiators for hardware vendors, including capabilities like 4K video recording, high-frame-rate capture, and real-time effects previews; all can utilize the GPU for parallel processing. OpenCL will be the key to much of this work, so supporting all current and upcoming revisions of that standard (as well as DirectCompute) is critical.
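As a rough illustration (not Qualcomm code), the appeal of OpenCL and DirectCompute for imaging work is that each pixel can be processed independently. The sketch below stands in for that model in plain Python; the per-pixel body is what an OpenCL kernel would run once per work-item, and the function name and BT.601 luma weights are my own choices.

```python
# Sketch: the per-pixel data parallelism that OpenCL/DirectCompute expose.
# On a GPU, each work-item would execute the body for one pixel; a plain
# Python comprehension stands in for the parallel dispatch here.

def to_grayscale(rgb_pixels):
    """BT.601 luma for each (r, g, b) tuple -- one independent work-item each."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in rgb_pixels]

pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (128, 128, 128)]
print(to_grayscale(pixels))
```

Because no pixel depends on any other, the same loop maps directly onto thousands of GPU threads, which is what makes real-time effects on 4K video tractable.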
Adreno A4x Architecture
The Adreno A4x addresses much of this API transition already, but more can and will be done to improve shader performance and tessellation capability, lower transaction latency, and, through intelligent GPU throttling, allow all of this to occur in the same or lower power profiles.
A4x is the first Qualcomm GPU family to support DirectX 11.2 and OpenGL ES 3.1, as well as the Android Extension Pack (AEP) that Google announced along with the Lollipop operating system. This extension to the current graphics APIs adds support for hardware tessellation, geometry shaders, compute shaders, and ASTC texture compression – all targeting PC-quality graphics in mobile devices. The A4x GPUs all support AEP and improve the performance of the individual shader pipes.
OpenCL 1.2 Full Profile is supported, along with Microsoft's DirectCompute API, targeting faster general-purpose compute for mobile GPUs. RenderScript acceleration has also been improved over the A3x line of GPUs.
Texturing performance gets an upgrade in A4x, along with added support for higher levels of anisotropic filtering with less impact on performance. ASTC (Adaptive Scalable Texture Compression) support allows for better level-of-detail capability and overall improved texture quality at the same performance levels. Qualcomm has increased the texture cache sizes and the general-purpose L2 cache as well to help out.
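The bandwidth math behind ASTC is simple enough to sketch. Every ASTC block occupies 128 bits regardless of its footprint, so the block size chosen by the developer sets the bits-per-pixel rate (figures per the ASTC specification; the code itself is only an illustration):

```python
# Sketch: ASTC's variable block footprint trades quality for bandwidth.
# Every block is 128 bits; larger footprints spread those bits over more
# pixels, lowering the bits-per-pixel rate.

ASTC_BLOCK_BITS = 128

def astc_bpp(block_w, block_h):
    """Bits per pixel for a given ASTC block footprint."""
    return ASTC_BLOCK_BITS / (block_w * block_h)

for w, h in [(4, 4), (6, 6), (8, 8), (12, 12)]:
    print(f"{w}x{h}: {astc_bpp(w, h):.2f} bpp")
```

At 12x12 the format drops below one bit per pixel, which is why ASTC lets developers keep more texture detail within the same memory bandwidth budget.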
ROPs perform the final blending of pixels and output to the screen, and Qualcomm has improved them in A4x to achieve peak draw rates more frequently. Better Z-stencil handling allows for faster depth rejection, lowering the number of pixels the GPU must process to draw any particular scene by removing hidden regions from the pipeline.
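The depth-rejection idea can be sketched with a toy rasterizer (a hypothetical scene, not Adreno internals): a fragment is shaded only if it is nearer than what the depth buffer already holds, so occluded fragments never reach the expensive shading step.

```python
# Sketch: early depth rejection on a single scanline.
# A fragment is shaded only if it passes the depth test; occluded
# fragments are rejected before the (expensive) shading step would run.

import math

def rasterize(fragments, width):
    """fragments: list of (x, depth) pairs; smaller depth = nearer."""
    zbuffer = [math.inf] * width
    shaded = 0
    for x, depth in fragments:
        if depth < zbuffer[x]:       # depth test passes: fragment visible
            zbuffer[x] = depth
            shaded += 1              # only now would the shader run
        # else: rejected early, no shading cost incurred
    return shaded

# Three overlapping surfaces on a 4-pixel scanline:
frags = [(0, 1.0), (1, 1.0), (0, 2.0), (1, 0.5), (2, 3.0)]
print(rasterize(frags, 4))  # (0, 2.0) is occluded, so 4 of 5 fragments shade
```

The fewer fragments that survive the depth test, the less shading work the GPU does for the same final image, which is exactly the saving the improved Z-stencil hardware targets.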
All of these GPU improvements are demonstrated with a host of custom-made demos from Qualcomm's internal development house. I saw a hardware-accelerated dynamic tessellation example centered around a highly detailed hornet on cracked desert earth; the results were impressive coming from a reference platform utilizing the Snapdragon 810. GPGPU compute improvements were shown off with an accelerated video panorama capability, using HD video rather than simply stills to create one of the most interesting visual products you will see on a mobile device.
A4x is just the latest GPU available in the market from Qualcomm but you can be sure that more is coming; this company is not standing still. Looking back at what has changed in the mobile market since 2008 is stunning.
If we focus on the capabilities enabled by Adreno and its graphics technologies, the line offers a 60x performance improvement along with a 40x increase in display resolution. The results are impressive by any metric.
Pixel Quality versus Pixel Quantity
In its partnership with dozens of OEMs and software developers, Qualcomm is preparing for another shift in the mobile market when it comes to graphics: that of pixel quality rather than just raw pixel quantity. The Snapdragon Display Engine is a portion of the display pipeline that has technologies integrated to improve the experience of the user beyond just adding pixels for stats-sheet padding. Built up of four discrete segments, the SDE includes composition acceleration, along with something that Qualcomm is calling “ecoPix” for display power reduction, “TruPalette” for improved picture quality, and an array of interfaces for various display technologies.
Composition acceleration is exactly what it sounds like: improving the performance of compositing the multitude of layers that the operating system and applications must combine to provide a high-quality user experience. The reduction of dropped frames, otherwise known as jank, is important in this step as well and is part of a major push from Google to improve the smoothness of the user experience. Jank is critically impactful to how fast a system “feels,” and the composition acceleration engine in the SDE helps here.
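Numerically, jank is easy to pin down: at a 60 Hz refresh the compositor has roughly a 16.67 ms budget per frame, and any frame that takes longer misses a vsync and shows up as a stutter. A minimal sketch, with made-up frame times:

```python
# Sketch: counting janky frames against the 60 Hz compositor budget.
# Any frame that takes longer than ~16.67 ms misses its vsync deadline.

BUDGET_MS = 1000 / 60  # ~16.67 ms per frame at 60 Hz

def count_janky(frame_times_ms):
    """Number of frames that blew the per-frame budget."""
    return sum(1 for t in frame_times_ms if t > BUDGET_MS)

frames = [12.1, 15.9, 22.4, 16.0, 33.5, 14.2]  # illustrative timings
print(count_janky(frames))  # two frames missed the deadline
```

A composition engine that shaves even a couple of milliseconds off each frame pulls borderline frames back under the budget, which is why it affects how fast a device "feels" more than raw throughput does.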
TruPalette is an interesting set of hardware and software tools that aim to improve the appearance of the pixels on the screen, regardless of the quality of the screen they are displayed on. This includes color enhancement to adjust the shift of colors along the gamut and memory color to improve the appearance of skin tones and foliage during those color enhancements. Most interesting is the gamut mapping capability that allows the source video/image to map correctly to the display gamut (provided by the OEM) to maintain color quality and consistency.
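Gamut mapping of this kind is often modeled as a linear transform between color spaces. The sketch below is illustrative only (TruPalette's actual math is not public): source RGB is pushed through a 3x3 matrix toward the panel's primaries and clamped, whereas real pipelines work in linear light with measured, per-panel matrices supplied by the OEM.

```python
# Sketch (illustrative, not TruPalette's algorithm): gamut mapping as a
# 3x3 linear transform from the content's color space to the panel's,
# followed by clamping to the panel's displayable range.

def map_gamut(rgb, matrix):
    """Apply a 3x3 gamut matrix to an RGB triple, clamping to [0, 1]."""
    out = []
    for row in matrix:
        v = sum(c * m for c, m in zip(rgb, row))
        out.append(min(1.0, max(0.0, v)))  # clamp to displayable range
    return out

# Identity matrix: a panel whose gamut already matches the source content.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(map_gamut([0.2, 0.5, 0.9], identity))
```

With a correct per-panel matrix, the same source image lands on the same perceived colors regardless of which OEM's display it is shown on, which is the consistency TruPalette is after.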
EcoPix is a set of features Qualcomm offers to its GPU users to improve power efficiency. Content Adaptive Backlight (CABL) is a technology for LCDs that adjusts the luminance of the displayed frame while simultaneously lowering the backlight of the phone, presenting the same perceived image to the user while saving power. FOSS applies the same idea to OLED screens, which use self-emitting designs. Frame buffer compression lowers the amount of bandwidth required to go from the SoC to the display, which lowers overall power and can reduce the amount of display memory necessary, cutting overall cost. Readers of PC Perspective are very familiar with variable refresh – the idea that a display can refresh at lower-than-standard rates (and in more granular steps). Qualcomm uses that ability to lower the power consumed by the display.
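The CABL trade works because perceived brightness is roughly the product of pixel value and backlight level. A minimal sketch of the idea (the numbers and function name are my own, not Qualcomm's algorithm): boost the frame's pixel values as far as the brightest pixel allows, then dim the backlight by the same factor, leaving every pixel-times-backlight product unchanged.

```python
# Sketch of the content-adaptive backlight idea: perceived brightness is
# roughly pixel value x backlight level, so boosting pixel values while
# dimming the backlight by the same factor keeps the image looking the
# same while cutting backlight power.

def cabl(pixels, backlight):
    """pixels in 0..1, backlight in 0..1. Returns (new_pixels, new_backlight)."""
    headroom = max(pixels)
    if headroom == 0:
        return pixels, backlight            # all-black frame: nothing to do
    boost = 1.0 / headroom                  # largest gain that avoids clipping
    new_pixels = [p * boost for p in pixels]
    return new_pixels, backlight / boost    # dimmer backlight, same products

pixels, backlight = cabl([0.2, 0.4, 0.5], 1.0)
print(backlight)  # backlight halved; perceived brightness is unchanged
```

Dark frames offer the most headroom, so the savings are content-dependent, which is exactly why the technique is called *content-adaptive* backlight.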
Finally, the interface display control PHY allows connections to various outputs including a pair of DSI (display serial interface), an embedded DisplayPort (eDP) option, HDMI, and even full write-back support for wireless displays like Miracast.
All of these technologies demonstrate Qualcomm’s focus on the quality and power efficiency of individual pixels rather than the raw horsepower required to push additional pixels. Obviously a modern mobile GPU needs to balance both of these pressure points and Adreno seems to have OEMs covered in either direction.
An Evangelical Position
My time at Qualcomm a few months back proved to be incredibly interesting and reiterated to me how dedicated the company is to the GPU as a first-class part of the Snapdragon SoC. It might seem obvious that the company had this kind of commitment, but for many who only see the outward marketing of other companies in the field, it often appears that Qualcomm is always one step behind. That definitely isn't the truth; the company appears to be in lock-step with the partners and OEMs that drive the industry forward.
Qualcomm’s executives understand that they are in an interesting position in the mobile market. Though they are easily the market leader in terms of units sold, the company chooses to work with customers on processor design goals rather than designing them in isolation and presenting them to customers after completion. The “silicon for silicon’s sake” methodology can work in a world where 200 watt GPUs are the norm, but when you start discussing metrics like micro-watts, getting things right the first time is critical to success. Qualcomm works with its customers, including all of the largest phone and tablet vendors in the world, to measure compute needs and to develop the right processor for each segment. The company will often find itself evangelizing to its partners about the needs and wants of consumers and the right hardware to get there. As mentioned earlier, Qualcomm was one of the first companies to point to industry-standard benchmarks as a useful and warranted method for component selection and future development. They also understand that outdated tests should be discounted and attention shifted away from them. Qualcomm instead chooses to fight for the benchmarks that actually prove beneficial to the end user.
Qualcomm has a lot more on the line than most of the phone/tablet vendors with each individual chip. It only takes 8 months or so from development to shipment for a standard smartphone today (though there are obviously some exceptions), while the SoC that powers it will take 2-3 years from development to shipment. As with most computing hardware, a large portion of the engineering work that goes into a CPU, or GPU, or even communications processor is about predicting the needs of the future. Though no company is perfect, Qualcomm has gained unmatched credibility with customers and consumers by being correct for many years. Next time you see that cute little dragon ad during a commercial break, just remember how long it has taken to get there.
Can they run Crysis?
If you have a source license for CryEngine, and all the assets for Crysis, then yes, yes they can, now that CryEngine supports Linux. Not to mention Windows 10 support for different platforms.
I’d like the manufacturers of mobile GPUs to provide block diagrams of their GPU products, including any optional components that OEMs may license and use, like dedicated ray-tracing or decoder logic units. And for once I’d like reviews of technologies to be less testimonial to the product and more of a comparison and contrast with competing products, including complete block diagrams of the competitors’ GPU products. I’d also like more generic naming used for all these specialized GPU units that are given trade names/brands, to see whether the functional units have comparable functionality across all brands of mobile GPUs.
If at all possible, utilize the generic computing science name for the hardware functionality instead of the manufacturer’s trade/marketing terminology, or put an annotated parenthetical generic name/term next to the marketing brand/term. For example, SMT (simultaneous multithreading) is the proper generic terminology with which to compare CPUs with Hyper-Threading-style capability. GPUs are even fuller of these trade names, in an attempt by device manufacturers to differentiate, and sometimes obfuscate, their products from the competition and confuse the consumer.
There is too much marketing speak and too little education on most online technology websites, except for the professional trade journals behind paywalls. Most professional trade journals maintain extensive dictionaries of computing terminology, as well as disambiguation entries translating trade/marketing terminology into generic computing science terminology, to allow readers to properly compare hardware from different manufacturers.
The mobile GPU makers/licensors are not providing sufficient data sheets and diagrams for consumers to make an educated decision on which mobile GPU, usually integrated into an SoC, has the largest feature set. There is also little technical information to be had online, except the behind-the-paywall variety or the occasional Hot Chips symposium, where the professional engineers utilize proper computing science terminology – although the marketing monkeys are trying to ruin Hot Chips by forcing their engineers to use the marketing terms.
There really needs to be a good online reference for computing science terminology, as well as proper technical documentation – Wikipedia is piss poor with its “technical” information on the various GPU/CPU/SoC processors, processor pin-outs, and block diagrams – including a dedicated disambiguation section that translates marketing/trade terminology into proper generic computing science terminology.
No pleasing some people; maybe you would be better served by other sites that to us mere mortals are difficult to comprehend. No offense to the article writer, but at least I can understand most of what he says.
The post is not specifically directed at PCPer, which is one of the better benchmarking/computer news sites outside of a paywall, but it would not hurt for these technical websites to pool their resources and get a service to properly document the complete technical details of the products they review, including occasional educational articles to help non-technical readers better understand the technology nomenclature and definitions, and to teach readers to differentiate marketing-brand obfuscation from the actual computing science terminology for the various CPUs/GPUs/other hardware that needs to be compared and contrasted. This may be a consumer-products-oriented site, but the technology is very complex, and with the marketing and MBA types largely in charge of some very large technology companies, the tendency on the part of said companies is more to confuse than to inform.
A lot of high-tech electronic devices have been commoditized and are marketed the same way bulk laundry soap is marketed, but in order for consumers/readers to have any possibility of making an informed decision, proper education and review methods need to be used. This includes defining acronyms and disambiguating marketing “technical” terms/branding with the proper computing science terminology, so readers can properly research the specific technologies in CPU/GPU/SoC/other computing systems.
Too many technology websites are becoming little more than extensions of marketing departments, and click-bait journalism is rampant on more than just a small number of websites. The amount of sponsored article content has skyrocketed, along with reviews that only talk about a single manufacturer’s products without any direct feature-for-feature comparison of any competitor’s equivalent product technology. It has become more difficult to obtain proper data sheets and specifications, especially for the mobile SoC GPUs and their specific technologies, compared to the more thorough analysis of the top-end gaming GPU SKUs.
Well said. PCPer is for users, not for marketing.
Too many marketing terms and I start looking for the “paid promotional article” tag hidden somewhere.
This is why benchmarks exist. It would be nearly impossible, even for someone knowledgeable in the field, to predict how all of these features will work together to enhance the user experience. Going into low-level architectural detail isn’t going to be useful to most users.
Software benchmarks are the most gamed, deceptive statistic where computer hardware is concerned, and nothing replaces a good, thoroughly documented hardware data sheet that gives definite SP/ROP/etc. counts on GPU hardware, as well as proper block diagrams describing the complete workings of a GPU/CPU/other processor – or at least links to data sheets/technical manuals that have the most complete information, without revealing any trade secrets.
The low-level hardware is the most important to see in order to at least have an estimation of what the device has compared to its competition. Everybody knows that any SoC/GPU/CPU may have differing characteristics when placed into an OEM’s final product, especially in mobile/laptop products where the device’s thermal settings may be lowered to run in the mobile form factor.
I’m a big fan of requiring mobile CPU/SoC OEMs and manufacturers to provide some testing rigs/mules for reviews of their SoCs, custom and otherwise, so that the CPUs/GPUs and the SoCs themselves can be properly put through their paces outside of any eventual devices they may be placed in. If it works for the big gaming rigs, it should work for smartphone/tablet SoCs, and believe me, there are testing rigs/mules that can do the job – the industry uses them for internal device development before the phone/tablet product designs are finalized.
Even among phone/tablet systems the mainboards are fairly standardized – maybe not the shapes and dimensions of the mainboard PCBs, but the platform controller chips and chipsets are – and the testing rigs/mules put the mobile SoCs through their paces every bit as thoroughly as gaming rigs are tested, and even more thoroughly for electrical usage and such.
So benchmarks like AnTuTu being gamed by device manufacturers make me mistrust single benchmarks. It’s already at the point where device manufacturers and SoC manufacturers should be required to send their devices to independent third-party testing labs to have the SoCs tested individually, and also in their respective devices, with the information made public in order for the devices to be approved for sale. The FCC does this to a degree, but the information is difficult to find, and the Department of Energy as well as the EPA does testing, but some form of impartial, standardized testing by an outside lab is in order for the entire mobile device industry, as well as the PC/laptop industry.
I wouldn’t trust synthetic benchmarks, but what better gauge of performance can you get than actually running the applications people are interested in?
“Software benchmarks are the most gamed to deceive statistic where computer hardware is concerned, and nothing replaces a good thoroughly documented hardware data sheet, that gives definite SP/ROP/etc. counts on GPU hardware, as well as the proper block diagrams describing the complete workings of a GPU/CPU/Other processor, or at least links to data sheets/technical manuals that do have the most complete information, without revealing any trade secrets.”
The tech specs of these devices obviously make a difference, but they are often not actually useful for comparisons. There can be other bottlenecks in the system. Testing the SoC independently of the device it goes into is also not that useful, since the final device will have its own specific thermal characteristics and possibly other bottlenecks. A lot depends on what screen the SoC is paired with. Apple does not disclose much info on its SoCs, but it doesn’t really matter, since we can run tests and see how it really performs in the applications we are interested in.
good read 🙂
Don’t forget that they’re also the company that pushed CDMA in North America because they had it locked up with patents – even though none of the rest of the world used it because it was rubbish. This left NA and good chunks of Central and South America behind the world in cellular development.
CDMA isn’t rubbish; it has a lot of beneficial features that the GSM standard did not have. Don’t blame the whole network intercommunication problem on the technology; that’s a basic issue with creating standards in a USA-versus-rest-of-the-world scenario.
Remember that GSM went to WCDMA which has its roots in the CDMA technology.
Well, this is certainly very enthusiastic.
Is running games at 4K on a mobile device really necessary? High pixel density is great for text, but for images or video on such a small screen, I doubt it would be noticeable. For a small screen, I wonder if you could really tell whether it was rendering at 4K or just rendering at 1080p and scaling it to 4K.
Do you really have to game on your mobile?
You might have wanted to mention they all may be banned soon, as they’ve been stealing NV tech for years 😉 The Markman hearing showed the judge favoring 6 out of 7 Nvidia patents. That’s a pretty clear sign he thinks Nvidia has a great case against Samsung and Qualcomm, and likely (no matter what ITC people think) that a 12-person jury would see the same, especially when they will be considering a $23-30 billion profit machine in Samsung ($6B for Qualcomm) stealing from an American company with ~$600 million in profits. It is clear mobile is now doing stuff that desktop has been doing for a decade, so someone is stealing without paying the patent owners (NV, and likely even AMD in some ways) who blazed the trail 10-15 years ago. Note the patents NV sued over are from 1999-2001, when the tech now being used in mobile was created. Patents for this stuff are filed long before the products hit (i.e., we won’t see what NV has been working on in the last 5 years until Volta and beyond, probably), but unlike a patent troll, NV/AMD have actually been USING the tech in their desktop products for a decade.
Good luck explaining to a jury how you’re doing it differently than the people who’ve been doing it for 20 years. You will most likely be trampling on NV/AMD (maybe some Intel) patents for decades to come unless they come up with some magically, radically different way to get pixels onto a monitor. There is a very good reason AnandTech called mobile the wild west of patent infringement. It’s time to pay up for all these leeches. An NV win might actually lead to a case for AMD, which they could really use to help fund R&D for the future (spending has gone down for the last 4 years while the company has lost $6 billion over 12 years). They are now playing PC/console ports on mobile directly, so I can’t wait to hear how they’ll claim to do it differently while running the same exact games that came to consoles/PCs over the last decade-plus. It’s also worth noting NV has been trying to get them to pay for 2+ years (i.e., willful infringement after being told to stop or pay up!). This is worse than the Intel case, which ended in $1.5B to Nvidia for the same stuff once the chipset agreement was broken (Intel wasn’t WILLFUL; it only happened due to breaking an agreement).
The worst that happens is that Qualcomm takes a bit from its warchest and buys nvidia.
Adreno is still VLIW4.
Interesting piece Ryan, I think it does show Qualcomm really “caught up” with their graphics performance and design right around the time they collaborated with and acquired ATI’s mobile division.
It would’ve been a nice addendum to see the nature of Qualcomm’s deal with ATI at the time. Clearly they were licensing AMD’s Imageon IP in the earlier collaborations, but it doesn’t look like any of that IP was transferred or continually licensed to Qualcomm when they acquired Imageon. That certainly makes sense, as $65M seemed like a song at the time for an entire mobile graphics division (and still does), but it’s more and more obvious no IP changed hands, nor did any perpetual IP license from AMD. It was really just a transfer of staff and working knowledge.
That really would be the only way Nvidia would have gone this far in their litigation against Qualcomm; if any of AMD’s IP transferred or was still being licensed, that would have put an end to Nvidia’s litigation full stop.