A much needed architecture shift
Intel revealed details of the 22nm Silvermont architecture that will be used in 2013 for tablets and phones.
It has been almost exactly five years since the release of the first Atom branded processors from Intel, starting with the Atom 230 and 330 based on the Diamondville design. Built for netbooks and nettops at the time, the Atom chips were a reaction to a unique market that the company had not planned for. While the early Atoms were great sellers, they were universally criticized by the media for slow performance and sub-par user experiences.
Atom has seen numerous refreshes since 2008, but they were all modifications of the simplistic, in-order architecture that was launched initially. With today's official release of the Silvermont architecture, the Atom processors see their first complete redesign from the ground up. With the focus on tablets and phones rather than netbooks, can Intel finally find a foothold in the growing markets dominated by ARM partners?
I should note that even though we are seeing the architectural reveal today, Intel doesn't plan on having shipping parts until late in 2013 for embedded, server and tablets and not until 2014 for smartphones. Why the early reveal on the design then? I think that pressure from ARM's designs (Krait, Exynos) as well as the upcoming release of AMD's own Kabini is forcing Intel's hand a bit. Certainly they don't want to be perceived as having fallen behind and getting news about the potential benefits of their own x86 option out in the public will help.
Silvermont will be the first Atom processor built on the 22nm process, leaving the 32nm designs of Saltwell behind it. This also marks a change in the Atom design process: the adoption of the tick/tock model we have seen on Intel's consumer desktop and notebook parts. Starting with the node drop to 14nm, we'll see an annual cadence that first focuses on the node change, then an architecture change at the same node.
By keeping Atom on the same process technology as Core (Ivy Bridge, Haswell, etc), Intel can put more of a focus on the power capabilities of their manufacturing.
Silvermont offers three important changes to the Atom line. First is improved performance, followed by better power efficiency and finally the adoption of 22nm process technology. Better performance is brought about as a combination of all three but starts with the move from an in-order architecture to the first out-of-order design for an Atom SoC.
The move to an OOO architecture is a long time coming for Intel's low power processing. AMD's Bobcat APU started shipping in early 2011 with an OOO design (though it was originally announced in 2007…) and both the ARM Cortex A9 and A15 are out of order architectures. An OOO design is much more complex than an in-order one, but it can increase power efficiency and performance by allowing more of the CPU architecture to be in use at a given time. Rather than simply conforming to the order of instructions and data as sent by the application, OOO processors have the ability to shift data and instructions to better utilize internal components.
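To make the in-order versus out-of-order difference concrete, here is a minimal toy model in Python. It is only a sketch and not a description of Silvermont's actual scheduler: it assumes a single-issue machine, unique destination registers (so no renaming is needed), and made-up latencies. The `Instr` type, the `run` function and the four-instruction `PROGRAM` are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str       # label for readability only
    dest: str       # register written (assumed unique per instruction)
    srcs: tuple     # registers read
    latency: int    # cycles until the result is available (made-up values)


def run(program, out_of_order):
    """Return the cycle at which the last result becomes available.

    Toy single-issue model: at most one instruction issues per cycle.
    In-order mode may only issue the oldest waiting instruction;
    out-of-order mode may issue the oldest instruction whose inputs are ready.
    """
    # Results of instructions that have not issued yet are "never ready".
    ready_at = {ins.dest: float("inf") for ins in program}
    pending = list(program)
    cycle = 0
    finish = 0
    while pending:
        candidates = pending if out_of_order else pending[:1]
        for ins in candidates:
            # Registers not written by the program are treated as ready at cycle 0.
            if all(ready_at.get(src, 0) <= cycle for src in ins.srcs):
                ready_at[ins.dest] = cycle + ins.latency
                finish = max(finish, cycle + ins.latency)
                pending.remove(ins)
                break  # single issue: stop after one instruction
        cycle += 1
    return finish


# A slow load followed by a dependent add, then independent work.
PROGRAM = [
    Instr("load", "r1", (), 4),
    Instr("add", "r2", ("r1",), 1),
    Instr("mul", "r3", ("a", "b"), 2),
    Instr("add", "r4", ("r3",), 1),
]

print("in-order cycles:    ", run(PROGRAM, out_of_order=False))
print("out-of-order cycles:", run(PROGRAM, out_of_order=True))
```

In this toy program the in-order run stalls behind the load while the independent multiply waits its turn, so it finishes in 8 cycles. The out-of-order run slips the multiply and its dependent add into those empty cycles and finishes in 5.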
Because Silvermont is not simply an evolution of the previous Atom design, talking about the changes that Intel wrought is a bit more complicated. Still though, Intel was quick to point out some areas of particular importance. Branch prediction was a big focus even though the OOO design means mispredictions are less costly. The L2 cache size has been increased per core while also lowering the latency to access that second level memory. The execution units have been redesigned as well to improve instruction latency and throughput.
Silvermont is built to maintain the macro-op (fused-op) instructions throughout the pipeline to avoid "decompression" requirements. This gives the architecture much higher efficiency and they didn't have to make the design "wider" to accommodate larger instructions.
Intel claims that the IPC increase with Silvermont is about 50% – that is a very impressive generation to generation leap. Consider what we are seeing in the Core architecture designs where Sandy Bridge to Ivy Bridge was well under 10% in most cases.
This diagram demonstrates the key changes from an in-order to an out-of-order design. In Saltwell, the architecture performed functions in a fixed, unchanging order and had no flexibility to fill cycles left empty by memory accesses or longer-running operations. With Silvermont that is no longer the case and instructions can be scheduled dynamically by the CPU. The new design also has the benefit of dropping the branch misprediction penalty from 13 cycles to 10 cycles.
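For a rough sense of what shaving three cycles off each misprediction is worth, the back-of-the-envelope calculation below may help. Only the 13-cycle and 10-cycle figures come from Intel's disclosure; the branch fraction and misprediction rate are hypothetical placeholders chosen for illustration, and `mispredict_cycles_per_kilo_instr` is a name made up for this sketch.

```python
def mispredict_cycles_per_kilo_instr(penalty_cycles, branch_fraction, mispredict_rate):
    """Cycles lost to branch mispredictions per 1000 instructions.

    penalty_cycles:   cycles lost on each mispredicted branch
    branch_fraction:  share of instructions that are branches (assumed)
    mispredict_rate:  share of branches that are mispredicted (assumed)
    """
    mispredicts = 1000 * branch_fraction * mispredict_rate
    return mispredicts * penalty_cycles


# Hypothetical workload: 20% branches, 5% of them mispredicted.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

saltwell = mispredict_cycles_per_kilo_instr(13, BRANCH_FRACTION, MISPREDICT_RATE)
silvermont = mispredict_cycles_per_kilo_instr(10, BRANCH_FRACTION, MISPREDICT_RATE)

print(f"13-cycle penalty: {saltwell:.0f} cycles lost per 1000 instructions")
print(f"10-cycle penalty: {silvermont:.0f} cycles lost per 1000 instructions")
```

Under those assumed rates the loss falls from about 130 to about 100 cycles per 1000 instructions. Any improvement in the predictor itself would lower the misprediction rate and stack on top of that.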
Silvermont's multi-core design uses two-core modules as its primary building blocks. While initial products will use either one or two modules (for dual or quad core configurations) the design has the ability to scale to eight cores. Each module is built around a pair of Silvermont cores and has its own L2 cache of up to 1MB, connected via a system agent / crossbar. These multi-module designs will suffer slightly from the lack of any kind of monolithic L2 cache shared across modules compared to previous Core based CPUs from Intel, but implementation will have much to say in that regard.
Each module connects to the rest of the SoC through a fabric that has independent read and write channels for higher bandwidth and lower latency communications. I didn't get much more information on this fabric though I am guessing it looks less like the ring bus on the Core designs and more like a traditional crossbar.
Power management is a big part of the redesign as well and Silvermont actually supports per-core frequency adjustments even though Intel themselves said that was a feature they wouldn't recommend for most use cases. Each module gets its own voltage plane (voltage is not set on a per-core basis). And by giving Silvermont a modular design, Intel is able to quickly build new parts for different markets including the embedded and micro-server spaces.
You might have noticed there was no mention of HyperThreading technology in Silvermont – it is not included. Intel stated that its goal was to improve single threaded performance while maintaining the maximum level of efficiency possible. Better single threaded performance allows the CPU to execute instructions quickly so it can return to a sleep state as soon as possible. With the move to an out-of-order design, Intel avoided the complications of SMT, which improved power efficiency, and provided multi-threaded support through the multi-core modules instead. The amount of die area saved by removing HyperThreading support essentially "paid for" the cost of migrating from an in-order to an out-of-order design.
Along with the complete redesign of the architecture come additional instructions and technologies from the other Intel product lines. Silvermont adds SSE 4.1 and 4.2 and AES-NI accelerations from the Westmere design in addition to enterprise level features like Intel VT-x2, OS Guard and McAfee DeepSAFE. VT-x and AES-NI are important moving forward, especially with the big push to micro-servers and advanced security measures.
Yes, and what does the fine print on those slides say about the type of testing software used? With AMD’s HSA and new unified memory access, how much less power will AMD’s APUs use to transfer data between the CPU and the GPU? AMD’s products with unified memory will only have to transfer a pointer/handle between the CPU and GPU (32 to 64 bits), as opposed to Intel’s designs that will have to transfer gigabytes of data between the CPU’s and GPU’s separate memory spaces! AMD will be creating x86 and ARM based APUs with this unified memory addressing, and AMD’s APUs are becoming more HSA capable! In the low power smartphone market, the more processing power shared between the GPU and CPU will equate to a much more efficient utilization of resources on low power devices; with fewer CPU processing resources needed, this will lower the total transistor count needed and save power! Graphics-wise, in the future the only major competition for low power ARM based APU type systems will be Nvidia, Qualcomm, and a few others. Intel may be able to compete with ARM on power (not currently), but Intel will not be able to compete on the price front with ARM based APUs, as Intel appears to be moving towards the high end/high cost tablet and smartphone market, and has yet to lead in any mobile market!
Good article, informative. the “module” thing smells a bit like ‘bulldozer’. i really hope kabini can compete on the CPU front and TDP.
Do you think that intel and its graphics IP partners can beat GCN at these power envelopes?
by the way, typo on page 2, 9th paragraph, “…definitely seeing higher bust clocks with Silvermont…”
It’s not too similar to bulldozer as only L2 is shared, nothing else.
(Another good reference for Silvermont is http://www.realworldtech.com/silvermont/ )
So finally an Atom worth looking at?
ARM is doomed…WINTEL for LIFE !