A third primary processor
Qualcomm shares some more details about the Snapdragon 820, this time in the form of the Hexagon 680 DSP.
As the Hot Chips conference begins in Cupertino this week, Qualcomm is set to divulge another set of information about the upcoming Snapdragon 820 processor. Earlier this month the company revealed details about the Adreno 5xx GPU architecture, showcasing improved performance and power efficiency while also adding a new Spectra 14-bit image processor. Today we shift to what Qualcomm calls the “third pillar in the triumvirate of programmable processors” that make up the Snapdragon SoC. The Hexagon DSP (digital signal processor), introduced initially by Qualcomm in 2004, has gone through a massive architecture shift and even programmability shift over the last 10 years.
Qualcomm believes that building a balanced SoC for mobile applications is all about heterogeneous computing, with no one processor carrying the entire load. The majority of the work that any modern Snapdragon processor must handle goes through the primary CPU cores, the GPU or the DSP. We learned about upgrades to the Adreno 5xx series for the Snapdragon 820 and we are promised information about the Kryo CPU architecture soon as well. But the Hexagon 600-series of DSPs actually deals with some of the most important functionality for smartphones and tablets: audio, voice, imaging and video.
Interestingly, Qualcomm opened up the DSP to programmability just four years ago, giving developers the ability to write custom code and software to take advantage of the specific performance capabilities that the DSP offers. Custom photography, videography and sound applications could benefit greatly in terms of performance and power efficiency by using the Qualcomm DSP rather than the primary system CPU or GPU. As of this writing, Qualcomm claims there are “hundreds” of developers actively writing code targeting its family of Hexagon processors.
The Hexagon DSP in Snapdragon 820 consists of three primary partitions. The main compute DSP works in conjunction with the GPU and CPU cores and will do much of the heavy lifting for encompassed workloads. The modem DSP aids the cellular modem in communication throughput. The new guy here is the lower power DSP in the Low Power Island (LPI) that shifts how always-on sensors can communicate with the operating system.
Snapdragon 820 will ship with the Hexagon 680 DSP with several new and intriguing features. First, Hexagon DSPs support hardware multithreading that permits four total threads in flight in the core without any penalty for context switching. Qualcomm achieves this with a pipeline that is designed to completely vacate each stage with a thread and allow for a new context (or the same) to enter the stage immediately, without register switching or data migration. Because the Hexagon DSP is not an out-of-order design, if a thread stall occurs the core will insert the next available thread, with no degradation in performance for the action. This multi-threading capability is enabled through ISA extensions and is part of the standard Qualcomm Snapdragon tool chain.
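To make the interleaving idea concrete, here is a toy cycle-count model of four hardware threads sharing one in-order issue slot. It is only a sketch under our own assumptions: the `simulate` function, the round-robin pick and the stall values are hypothetical and do not describe Qualcomm's actual scheduler.

```python
def simulate(threads):
    """Toy model of zero-cost hardware thread interleaving.

    threads: one list per hardware thread; each entry is the number of
    cycles that thread is stalled after issuing that instruction.
    Returns (total_cycles, idle_cycles).
    """
    n = len(threads)
    pc = [0] * n          # next instruction index for each thread
    ready_at = [0] * n    # first cycle in which each thread may issue again
    cycle = 0
    idle = 0
    last = -1             # thread that issued most recently

    while any(pc[t] < len(threads[t]) for t in range(n)):
        issued = False
        # Round-robin: start with the thread after the last one that issued.
        for step in range(1, n + 1):
            t = (last + step) % n
            if pc[t] < len(threads[t]) and ready_at[t] <= cycle:
                stall = threads[t][pc[t]]
                pc[t] += 1
                ready_at[t] = cycle + 1 + stall
                last = t
                issued = True
                break
        if not issued:
            idle += 1     # every unfinished thread is stalled: a wasted slot
        cycle += 1
    return cycle, idle


program = [0, 3, 0]                # the second instruction stalls for 3 cycles
print(simulate([program]))         # one thread alone
print(simulate([program] * 4))     # four threads interleaved
```

In this model a single thread wastes three cycles waiting on its stall, while four copies of the same program fill every issue slot because another thread is always ready to step in.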
New to the Hexagon 680 DSP are additional ISA extensions called HVX, Hexagon Vector eXtensions. These are an industry-first ultra wide vector SIMD extension for a DSP, going as high as 1024-bit. Qualcomm claims this new vector capability will allow for customer and partner development and innovation in the areas of imaging, VR and CV. This DSP is also an important component to support the Spectra ISP we learned about with the Adreno 5xx GPU architecture, allowing for real-time low light video capture and camera functionality to improve the experience of end consumers.
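For a rough sense of what a 1024-bit vector means in practice, the arithmetic below counts how many elements fit in one vector and how many vector operations a row of pixels would need. The helper names and the 1920-pixel row are our own illustrative assumptions, not figures from Qualcomm.

```python
VECTOR_BITS = 1024  # widest HVX vector width quoted above


def lanes(element_bits, vector_bits=VECTOR_BITS):
    """Number of elements of the given size that fit in one vector."""
    return vector_bits // element_bits


def vector_ops_needed(num_elements, element_bits, vector_bits=VECTOR_BITS):
    """Vector operations needed to touch every element once (rounded up)."""
    per_vector = lanes(element_bits, vector_bits)
    return -(-num_elements // per_vector)  # ceiling division


print(lanes(8))                     # 8-bit pixels per 1024-bit vector
print(lanes(16))                    # 16-bit samples per 1024-bit vector
print(vector_ops_needed(1920, 8))   # one 1920-pixel row of 8-bit pixels
```

With 8-bit pixels, one vector holds 128 of them, so a single 1920-pixel row can be covered in 15 vector operations instead of 1920 scalar ones.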
Qualcomm’s best example of what the new V6x DSP can do centers around improving video capture in real time in low-light environments. Using a new algorithm, the DSP can dynamically lighten only the areas of a video or photo that would appear overly dark. This improves the quality of the scenes, and the DSP is what powers the fast and, just as importantly, power-efficient noise reduction for areas that were under-exposed.
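Qualcomm has not published the algorithm, so the snippet below is only a minimal sketch of the general idea of lifting dark pixels while leaving bright ones alone. The function name, the threshold and the gain curve are all our own assumptions.

```python
def brighten_dark_pixels(pixels, threshold=64, max_gain=2.0):
    """Lift only pixels darker than `threshold` (8-bit values, 0..255).

    The gain fades from `max_gain` at black down to 1.0 at the threshold,
    so there is no visible step where the adjustment stops.
    """
    out = []
    for p in pixels:
        if p < threshold:
            gain = 1.0 + (max_gain - 1.0) * (threshold - p) / threshold
            p = min(255, round(p * gain))
        out.append(p)
    return out


print(brighten_dark_pixels([0, 16, 32, 64, 200]))
```

Pixels at or above the threshold pass through untouched, which is the "only areas that would appear overly dark" behavior described above; the per-pixel `if` is also a small example of data-dependent branching.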
You might wonder why Qualcomm doesn’t just enable this functionality on the GPU, as that is a highly parallel workhorse. The code that was built for the DSP and its HVX extensions contains a lot of branching, which is often problematic for GPUs. Before Hexagon 680 was ready the QC team brought up the new algorithm on the Krait CPU; the improvements seen in both performance and power consumption with the DSP are substantial and clearly demonstrate the advantages of adding this hardware to a SoC.
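The reason branching tends to hurt GPUs is that groups of lanes execute in lockstep, so when lanes in a group disagree on a branch the hardware generally runs both paths one after the other. The toy cost model below illustrates that; the function name and the cycle costs are hypothetical and not measurements of any real GPU.

```python
def lockstep_group_cycles(takes_branch, cost_if=10, cost_else=4):
    """Cycles for one lockstep group of lanes to get through an if/else.

    takes_branch: one boolean per lane. If any lane takes a path, the
    whole group pays for that path.
    """
    cycles = 0
    if any(takes_branch):
        cycles += cost_if      # at least one lane needs the 'if' path
    if not all(takes_branch):
        cycles += cost_else    # at least one lane needs the 'else' path
    return cycles


print(lockstep_group_cycles([True] * 32))           # uniform: 'if' only
print(lockstep_group_cycles([False] * 32))          # uniform: 'else' only
print(lockstep_group_cycles([True] + [False] * 31)) # divergent: both paths
```

A single disagreeing lane is enough to make the whole group pay for both paths, which is why branch-heavy code can be a poor fit for that style of execution.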
The second new addition to the DSP with the V6x generation is a secondary DSP included in a low power island. This LPI is used to create a dedicated location for always-on, sensor-aware applications to do small amounts of compute and monitoring without the power draw of the primary CPU, GPU or even the larger DSP. This is somewhat similar to what Apple did with the M7/M8 co-processors, separating the tracking for things like accelerometers, gyroscopes and compasses from other processors. The advantage that Qualcomm has over that design, though, is that the LPI is on the same piece of silicon as the rest of the hardware, significantly reducing the time to wake from the “low power mode” and start the larger DSP or CPU working on the task at hand.
Qualcomm has given the low power island on the Snapdragon 820 processor its own power rail, so it can remain in its low power state while the rest of the SoC is almost completely shut down and powered off.
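As a rough illustration of the always-on pattern, the sketch below has a tiny monitor look at accelerometer samples and flag only the ones that would justify waking the larger processors. The function name, the units (g) and the threshold are assumptions for illustration and not part of any Qualcomm API.

```python
import math


def samples_that_wake(samples, wake_threshold=0.5):
    """Return indices of accelerometer samples that should wake the big cores.

    samples: (x, y, z) readings in g. A device at rest reads about 1 g in
    total, so a wake is triggered only when the magnitude moves away from
    1 g by more than `wake_threshold`.
    """
    wake_indices = []
    for i, (x, y, z) in enumerate(samples):
        magnitude = math.sqrt(x * x + y * y + z * z)
        if abs(magnitude - 1.0) > wake_threshold:
            wake_indices.append(i)
    return wake_indices


readings = [(0.0, 0.0, 1.0), (0.0, 0.1, 1.0), (2.0, 2.0, 2.0)]
print(samples_that_wake(readings))
```

Only the last reading, a sharp jolt, would wake anything here; the first two are handled entirely by the small monitor, which is the power saving the low power island is after.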
This feature and capability is supported through the same Snapdragon tool chain and will be fully supported in Android L.
Qualcomm’s Snapdragon 820 with the new Hexagon 680 DSP continues to look impressive as the company rolls out details, albeit more slowly than we would like. Fantastic features and capabilities like HVX and the low power island look promising on paper, but we need to see how Qualcomm and its partners actually integrate them in future devices, due out in early 2016.
Qualcomm is a founding member
Qualcomm is a founding member of the HSA Foundation, so I just wonder if the Hexagon DSP’s SIMD FP resources can be utilized for other acceleration tasks when a phone or tablet that uses the Snapdragon SoC is not using the camera. The augmented VR use is interesting, but maybe the DSP’s computational resources could also be used like the GPU, for GPGPU-style workloads. Maybe there could be some tablet graphics software, or even gaming software, made to utilize the computing resources of the DSP. I have seen white papers on the HSA Foundation’s website that talk about doing these kinds of all-purpose acceleration tasks with any computational resources on an SoC/APU. It’s all just ones and zeros anyway, and with the proper HSA APIs, like Vulkan and others, compute can be offloaded to all kinds of processing units.