Over the holidays I noticed that ARM released information on a new core design aimed at autonomous driving systems. The Cortex-A65AE is part of the company's Automotive Enhanced lineup (following the Cortex-A76AE) with its Split-Lock and other features that are part of ARM's Safety Ready program.
Aimed at processors that will be used in self-driving cars, advanced driver assistance systems (ADAS), aviation, and industrial automation, the Cortex-A65AE core design integrates several safety and redundancy features that meet ASIL D, the most stringent risk classification in ISO 26262, the functional safety standard for road vehicles. Processors will be able to have up to eight cores and will support SMT, with each physical core able to run two threads (at different exception levels and/or under different OSes). The cores can be run independently for performance or in lock step for redundancy and integrity checking, comparing each other's calculation results (Split-Lock and Dual Core Lock Step respectively). Using simultaneous multithreading, two threads on a physical core can operate in lock step mode with two other threads on a different physical shadow core, according to AnandTech.
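To make the lock step idea a bit more concrete, here is a minimal software sketch in Python. It is only an illustration of the compare-the-results principle, not how ARM implements it in hardware; the names `run_lockstep`, `run_split`, and `LockstepMismatch` are made up for this example.

```python
from typing import Callable, List


class LockstepMismatch(Exception):
    """Raised when the primary and shadow computations disagree."""


def run_lockstep(primary: Callable[[int], int],
                 shadow: Callable[[int], int],
                 inputs: List[int]) -> List[int]:
    """Run every input on both 'cores' and compare the results.

    `primary` and `shadow` are hypothetical stand-ins for a core and its
    shadow core. In a healthy system they compute the same function.
    """
    results = []
    for value in inputs:
        a = primary(value)
        b = shadow(value)
        if a != b:
            # A divergence means one side suffered a fault.
            raise LockstepMismatch(f"input {value!r}: {a!r} != {b!r}")
        results.append(a)
    return results


def run_split(core: Callable[[int], int], inputs: List[int]) -> List[int]:
    """Split mode: each input is computed once, with no redundancy check."""
    return [core(value) for value in inputs]


if __name__ == "__main__":
    def square(x: int) -> int:
        return x * x

    def faulty_square(x: int) -> int:
        # Simulated fault: the wrong answer for one particular input.
        return x * x + (1 if x == 2 else 0)

    print(run_lockstep(square, square, [1, 2, 3]))
    try:
        run_lockstep(square, faulty_square, [1, 2, 3])
    except LockstepMismatch as err:
        print("fault detected:", err)
```

The trade-off is the one described above: lock step spends a second core on checking the first, while split mode uses both cores for independent work.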
ARM has not yet released full details about the Cortex-A65AE core, but it utilizes a 64-bit out-of-order execution pipeline with the ARMv8-A architecture. It can be customized to suit the needs of ARM's partners, so exact chip specifications will differ, but in general Cortex-A65AE cores can have 16 to 64 KB L1 instruction and data caches, 64 to 256 KB of L2 cache, and an optional L3 cache of up to 4 MB. Other features include support for ARM TrustZone, ECC memory, and ACP connections for accelerators. The new cores are built with ARM's DynamIQ technology and are slated to be used in chips built on the 7nm process node.
According to ARM, Cortex-A65AE cores are 70% faster in integer performance per core and offer up to 3.5 times the memory throughput and six times the read bandwidth for ACP accelerators versus the existing Cortex-A53 cores. The notable performance jump is likely the result of a combination of moving to a smaller process node, the addition of SMT, architectural improvements, and cache and inter-chip routing optimizations.
ARM is positioning the Cortex-A65AE as complementary to the Cortex-A76AE, which is to say that the new core is not a direct replacement for it. While the Cortex-A76AE is high performance, the A65AE is high throughput, and both cores reportedly have their place in future ADAS and self-driving cars. The Cortex-A65AE cores can be clustered together to do the initial processing and sensor fusion calculations from all of the inputs from cameras, radar, lidar, and other hardware. From there, clusters including Cortex-A76AE cores (or a mix of the two) along with other accelerators can be responsible for making the decisions based on the sensor information. How well it works in practice and how this heterogeneous setup will compare to competing offerings from NVIDIA, Intel/Mobileye, and others remains to be seen. I am all for the self-driving car future though, so more competition and development in that space is always nice to see even if it's still a ways off yet!
The Cortex-A65AE being the first Cortex-A core to feature multithreading is also interesting, and I am very curious if we will see that capability expanded to other ARM processors outside of the AE series. While SMT may not be worth it for mobile devices like smartphones and even tablets, perhaps future ARM-powered Always Connected Windows notebook PCs will use processors with SMT-capable cores, as it would be easier to justify the extra cost in power and size to include multithreading.
What are your thoughts?
(PS I hope everyone had a safe holiday or at least a good week if you don't celebrate! I am looking forward to 2019 and continuing to serve you with bad puns and allegedlys technology coverage!)
“The Cortex-A65AE being the first Cortex-A core to feature multithreading is also interesting and I am very curious if we will see that capability expanded to other ARM processors”
Calling it just “multithreading” will confuse readers; maybe that’s why the computer science folks came up with the name SMT (Simultaneous MultiThreading). Modern OSes can juggle thousands of software threads on processors of any design, SMT-enabled or not!
So maybe call it hardware multithreading, or processor multithreading! But really, Simultaneous MultiThreading (SMT) is the actual generic name. Intel’s SMT is the same thing; Intel’s marketing just branded its version of SMT as HyperThreading(TM).
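A rough way to see the software-thread side of that distinction is the Python sketch below. It is illustrative only: `run_software_threads` is a made-up helper, and `os.cpu_count()` reports the logical CPUs the OS sees (which already includes SMT siblings where present), not physical cores.

```python
import os
import threading


def run_software_threads(n_threads: int) -> int:
    """Start n_threads OS-scheduled software threads and count completions."""
    completed = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal completed
        with lock:  # protect the shared counter
            completed += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed


if __name__ == "__main__":
    # Logical CPUs visible to the OS; may be None on some platforms.
    logical_cpus = os.cpu_count() or 1
    # Deliberately oversubscribe: far more software threads than hardware threads.
    n = logical_cpus * 8
    done = run_software_threads(n)
    print(f"{logical_cpus} logical CPUs ran {done} software threads")
```

The OS scheduler time-slices those software threads onto whatever hardware threads exist; SMT only changes how many hardware threads each physical core exposes.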
Wikipedia defines SMT as:
“Simultaneous multithreading (SMT) is a technique for improving the overall efficiency of superscalar CPUs with hardware multithreading. SMT permits multiple independent threads of execution to better utilize the resources provided by modern processor architectures.” (1)
Also, Intel was not the inventor of SMT, according to the Wikipedia entry:
“Historical implementations
While multithreading CPUs have been around since the 1950s, simultaneous multithreading was first researched by IBM in 1968 as part of the ACS-360 project. The first major commercial microprocessor developed with SMT was the Alpha 21464 (EV8). This microprocessor was developed by DEC in coordination with Dean Tullsen of the University of California, San Diego, and Susan Eggers and Henry Levy of the University of Washington. The microprocessor was never released, since the Alpha line of microprocessors was discontinued shortly before HP acquired Compaq which had in turn acquired DEC. Dean Tullsen’s work was also used to develop the Hyper-threading (Hyper-threading technology or HTT) versions of the Intel Pentium 4 microprocessors, such as the “Northwood” and “Prescott”.” (1)
P.S. The team behind AMD’s Project K12 custom ARM core (since mothballed by AMD), led by Jim Keller, was rumored to be working on an SMT-enabled design at the same time Keller and his other team were designing AMD’s Zen.
But really, the Broadcom Vulcan was an SMT4 (4 hardware threads per core) custom ARM design created several years ago. That Vulcan design was sold off to Cavium after Broadcom was acquired by Avago Technologies Limited; Avago has since taken the Broadcom name and is now doing business as Broadcom.
Cavium took the Broadcom Vulcan design as the basis for its ThunderX2 line of custom ARM server cores, so the current ThunderX2 is not related to Cavium’s original ThunderX design. Cavium has since been acquired by Marvell, so it’s now Marvell that has the SMT4-enabled Vulcan (now branded ThunderX2) custom ARM server core, which has been around for a few years now!
(1)
“Simultaneous multithreading”
https://en.wikipedia.org/wiki/Simultaneous_multithreading
Edit: Broadcomm
To: Broadcom
Only one M in Broadcom unlike Qualcomm that has 2, I get that wrong all the time!