Intel Architecture Day 2021: Alder Lake and Arc

Architecture Day Highlights – Consumer Desktop
Among the topics discussed during the virtual Architecture Day 2021 event hosted by none other than Raja Koduri, we received details about the core architecture behind the upcoming Alder Lake processors, which make use of a mix of performance and efficiency cores in an Arm-like way – in x86, of course.
Intel’s freshly renamed Arc graphics, still built on Intel’s Xe HPG (high performance graphics) architecture but no longer known exclusively by code names like DG2, were given a “sneak peek”, with the Alchemist SoC that will power the first Intel Arc products and the new DLSS-like “XeSS” technology discussed.
Alder Lake: A Tale of Two Cores
Intel shared details of the architecture behind the upcoming Alder Lake processors, with separate sections devoted to the new Efficient x86 Core, and Performance x86 Core – the latter representing “the biggest shift in x86 yet,” according to Intel.
Reinventing the multicore architecture, Alder Lake will be Intel’s first performance hybrid architecture with the new Intel Thread Director. This is Intel’s most intelligent client system-on-chip (SoC) architecture, featuring a combination of Efficient-cores and Performance-cores, scaling from ultra-mobile to desktop, and leading the industry transition with multiple industry leading I/O and memory. Products based on Alder Lake will begin shipping this year.
The reference to Intel Thread Director is significant, as this will be the key to effective use of the two cores in real-world situations.
Intel’s unique approach to scheduling was developed to ensure Efficient-cores and Performance-cores work seamlessly together, dynamically and intelligently assigning workloads from the start and optimizing the system for maximum real-world performance and efficiency. With intelligence built directly into the core, Intel Thread Director works seamlessly with the operating system to place the right thread on the right core at the right time.
Efficient x86 Core
A highly scalable x86 microarchitecture for addressing compute requirements across the entire spectrum of our customers’ needs, from low-power mobile applications to many-core microservices. Compared with Skylake, Intel’s most prolific CPU microarchitecture, the Efficient-core delivers 40% more single-threaded performance at the same power, or the same performance while consuming less than 40% of the power. For throughput performance, four Efficient-cores deliver 80% more performance while still consuming less power than two Skylake cores running four threads, or the same throughput performance while consuming 80% less power.
While the word “efficient” might evoke thoughts of performance along the lines of an Atom core, Intel is comparing the new Efficient x86 core to Skylake – and claiming the same single-threaded performance at less than 40% of the power, which is a reduction of more than 60%.
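To put Intel's quoted ratios in performance-per-watt terms, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above; the helper name perf_per_watt_gain is ours, and since Intel says "less than 40% of the power" and "less power" without giving exact numbers, those values are treated as upper bounds on power, which makes the results lower bounds on the gain.

```python
def perf_per_watt_gain(relative_perf: float, relative_power: float) -> float:
    """Perf-per-watt ratio versus a baseline of 1.0 performance at 1.0 power."""
    return relative_perf / relative_power


# Single-threaded: one Efficient-core vs. one Skylake core
same_power = perf_per_watt_gain(1.40, 1.00)  # 40% more performance at the same power
same_perf = perf_per_watt_gain(1.00, 0.40)   # same performance at (less than) 40% of the power

# Throughput: four Efficient-cores vs. two Skylake cores running four threads
more_throughput = perf_per_watt_gain(1.80, 1.00)  # 80% more performance at (less than) equal power
same_throughput = perf_per_watt_gain(1.00, 0.20)  # same throughput at 80% less power

print(f"single-thread, same power:       at least {same_power:.2f}x perf/W")
print(f"single-thread, same performance: at least {same_perf:.2f}x perf/W")
print(f"throughput, more performance:    at least {more_throughput:.2f}x perf/W")
print(f"throughput, same performance:    at least {same_throughput:.2f}x perf/W")
```

The spread between the two ends of each claim is expected: power does not scale linearly with performance, so the efficiency advantage looks much larger when the core is run at the lower operating point.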
Performance x86 Core
This x86 core is not only the highest performing CPU core Intel has ever built, but it also delivers a step function in CPU architecture performance that will drive the next decade of compute. It was designed as a wider, deeper and smarter architecture to expose more parallelism, increase execution parallelism, reduce latency and increase general purpose performance. It also helps support large data and large code footprint applications. Performance-core provides a Geomean improvement of about 19%, across a wide range of workloads over our current 11th Gen Intel Core architecture (Cypress Cove core) at the same frequency.
Targeted for data center processors and for the evolving trends in machine learning, Performance-core brings dedicated hardware, including Intel’s new Advanced Matrix Extensions (AMX), to perform matrix multiplication operations for an order of magnitude performance – a nearly 8x increase in artificial intelligence acceleration. This is architected for software ease of use, leveraging the x86 programming model.
Intel’s claim of 19% improvement is obviously significant, and as this Geomean improvement is “at the same frequency” as their 11th Gen Core it represents a massive jump in IPC. The closing reference to “software ease of use, leveraging the x86 programming model” seems like a direct jab in Apple’s direction. Well, as long as software is well optimized for Intel’s new CPU architecture, we should see some interesting performance comparisons – vs. both AMD Ryzen and Apple M1.
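For readers unfamiliar with the term, a geomean is the n-th root of the product of the per-workload speedups, and at equal frequency each speedup is simply the ratio of instructions per clock. The Python sketch below shows the calculation; the speedups list is made up purely for illustration and is not Intel's data.

```python
import math


def geomean(ratios):
    """Geometric mean: n-th root of the product, computed via logs."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-workload speedups at equal clock speed (NOT Intel's figures)
speedups = [1.05, 1.12, 1.19, 1.25, 1.36]

print(f"geometric mean:  {geomean(speedups):.3f}")
print(f"arithmetic mean: {sum(speedups) / len(speedups):.3f}")
```

The geometric mean is the usual choice for averaging ratios because a single outlier workload pulls it up less than it would an arithmetic mean. A headline figure like 19% can therefore sit on top of individual results that range well above and below it.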
The Arc Alchemist Graphics Story
Intel’s new branding for their high-performance graphics products was given its own press coverage before Architecture Day took place, as the public was informed that Arc (stylized ARC, I’m noticing) will be the name of such products going forward. Kind of like “Radeon” or “GeForce” branding. (Intel ARC 9800 GT!)
A new discrete graphics microarchitecture is designed to scale to enthusiast-class performance for gaming and creation workloads. The Xe HPG microarchitecture features a new Xe-core, a compute-focused programmable and scalable element, and full support for DirectX 12 Ultimate. New matrix engines inside the Xe-cores (referred to as Xe Matrix eXtensions, XMX) accelerate artificial intelligence workloads such as XeSS, a novel upscaling technology that enables high-performance and high-fidelity gaming. Xe HPG-based Alchemist SoCs (formerly code-named DG2) will be coming to market in the first quarter of 2022 under the new Intel Arc brand.
The Arc branding isn’t the only thing that has changed since the last time Intel talked about high-performance graphics; we are no longer dealing with EUs (Execution Units), as Intel now calls its basic building block the Xe-core.
Each Xe-core includes 16x 256-bit Vector Engines and 16x 1024-bit Matrix Engines, and it is built to scale up to some very powerful solutions. In a time when we have never needed a high-performance competitor more, this is great news – but media outlets like us need to remain neutral until we have hardware in hand and know exactly how well it stacks up against the competition.
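For a sense of scale, the per-Xe-core widths quoted above can be multiplied out in a few lines of Python. This is only datapath arithmetic: the 32-bit element size is our assumption for illustration, and the result says nothing about operations per clock, clock speed, or how many Xe-cores a shipping product will have.

```python
# Figures quoted by Intel for one Xe-core
VECTOR_ENGINES = 16
VECTOR_ENGINE_BITS = 256
MATRIX_ENGINES = 16
MATRIX_ENGINE_BITS = 1024

# Assumed element size for illustration (e.g. FP32); not specified in the text above
ELEMENT_BITS = 32

vector_bits_per_core = VECTOR_ENGINES * VECTOR_ENGINE_BITS
matrix_bits_per_core = MATRIX_ENGINES * MATRIX_ENGINE_BITS

lanes_per_vector_engine = VECTOR_ENGINE_BITS // ELEMENT_BITS
vector_lanes_per_core = VECTOR_ENGINES * lanes_per_vector_engine

print(f"vector width per Xe-core: {vector_bits_per_core} bits")
print(f"matrix width per Xe-core: {matrix_bits_per_core} bits")
print(f"32-bit lanes per vector engine: {lanes_per_vector_engine}")
print(f"32-bit vector lanes per Xe-core: {vector_lanes_per_core}")
```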
Speaking of competition, NVIDIA’s DLSS might be getting a serious AI upscaling competitor in Intel’s XeSS. I have no idea how to pronounce it, but it seems to be a tech more along the lines of DLSS than AMD’s FSR.
A super sampling tech that makes use of deep learning? Sounds familiar, eh? It will be enabled “on a broad set of hardware, including our competition”, meaning you can use it on AMD and NVIDIA GPUs, too. It’s early, and we don’t have hardware on hand to test, but the consumer graphics landscape could get very interesting.