AMD Details Carrizo Further
AMD’s Carrizo gets uncovered more with details about design and efficiency.
Some months back AMD introduced us to their “Carrizo” product. Details were slim, but we learned that this would be another 28 nm part that has improved power efficiency over its predecessor. It would be based on the new “Excavator” core that will be the final implementation of the Bulldozer architecture. The graphics will be based on the latest iteration of the GCN architecture as well. Carrizo would be a true SOC in that it integrates the southbridge controller. The final piece of information that we received was that it would be interchangeable with the Carrizo-L SOC, which is an extremely low power APU based on the Puma+ cores.
A few months later we were invited by AMD to their CES meeting rooms to see early Carrizo samples in action. These products were running a variety of applications very smoothly, but we were not informed of speeds and actual power draw. All that we knew is that Carrizo was working and able to run pretty significant workloads like high quality 4K video playback. Details were yet again very scarce other than the expected timeline of release, the TDP ratings of these future parts, and how it was going to be a significant jump in energy efficiency over the previous Kaveri based APUs.
AMD is presenting more information on Carrizo at the ISSCC 2015 conference. This information dives a little deeper into how AMD has made the APU smaller, more power efficient, and faster overall than the previous 15 watt to 35 watt APUs based on Kaveri. AMD claims that they have a product that will increase power efficiency in a way not ever seen before for the company. This is particularly important considering that Carrizo is still a 28 nm product.
New CPU Core, New Design Platform, and a New GCN Core
The Bulldozer architecture for AMD has not panned out exactly as planned for them. While Intel was focusing on high levels of IPC combined with decent frequency scaling, AMD went for a design with shared resources aimed at high clock speeds and highly parallel workloads. While the architecture has kept AMD afloat, it was not nearly the success that they had hoped for. So far we have seen Bulldozer, Piledriver, and Steamroller cores hit the market with varying levels of success. Now it is time for Excavator.
The last new core based on the Bulldozer architecture has been named Excavator and is being introduced with the Carrizo APU. Excavator is designed to provide around 5% greater IPC than the previous Steamroller, but will do so at less power. Excavator has doubled the L1 cache size from previous implementations, and it also adds the latest instructions to the mix.
AMD does not have the resources to do hand layouts on every core out there. They instead rely on a lot of automated place and route. This typically causes transistor budgets to inflate due to the inefficiencies of the software and the use of standard cell designs. It also causes the die size to be larger due to wasted space from using the more regularly shaped standard cell libraries. AMD has done two things that have positively impacted their design of Excavator to make it more competitive and more power efficient.
The CPU guys have worked closely with the GPU engineers to utilize a High Density Cell Library. GPUs have been historically characterized by running slower, but having very dense designs that can do a lot of work per cycle. CPUs have traditionally been faster, but leakier designs with more cache rather than more logic. AMD gambled on using these high density libraries to design Excavator, and it appears to have paid off. AMD has shrunk each Excavator module by an impressive 23% as compared to the previous Steamroller implementation. They have also utilized a more GPU-centric metal stack, which not only enables greater density but also shortens the interconnects between functional units. Previous generations of CPUs have used the tapered metal stack to improve transistor switching speed.
While AMD did not say this up front, these design changes will impact the overall top speed of the Carrizo parts. This is not necessarily a bad thing. It is unlikely that an Excavator based 4 module CPU will ever make it to the desktop, and certainly if there was one it would have a hard time reaching 4 GHz. In a mobile application which does not reach such speeds anyway, it will actually have a positive impact on overall performance and battery consumption. As we can see by the power/speed curves provided by AMD, we see better power scaling to transistor switching speed than the previous “high speed optimized” Steamroller cores. We see a crossover in power/speed at around the 20 watt point, but also consider that this is the TDP “per module” rather than the entire chip. In a 35 watt TDP mobile APU, each module/core pair will be in the 10 to 13 watt TDP range, so the scaling at those power ratings is superior to that of the previous Steamroller. The modules can either run faster at the same TDP, or they can run more efficiently at the same speed.
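As a rough sanity check on the per-module figures above, here is a minimal Python sketch of the budget arithmetic. It assumes a two-module Carrizo part, which is an assumption and not something stated in this article, and the function name is purely illustrative. AMD has not published a breakdown like this.

```python
# Rough TDP budget arithmetic for a 35 W mobile APU.
# ASSUMPTION: two CPU modules per chip (not stated in the article).
# The 10 W and 13 W per-module figures are the range quoted in the article.

def remaining_budget(chip_tdp_w: float, modules: int, per_module_w: float) -> float:
    """Watts left for the GPU and the rest of the SOC after the CPU modules."""
    return chip_tdp_w - modules * per_module_w


if __name__ == "__main__":
    for per_module in (10.0, 13.0):
        left = remaining_budget(35.0, 2, per_module)
        print(f"{per_module:.0f} W per module leaves {left:.0f} W for everything else")
```

Under that assumption, the CPU modules would take roughly 20 to 26 watts of a 35 watt budget, with the rest left for graphics and the integrated southbridge.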
The GPU portion of the APU has also seen a lot of work. It will be based on the latest GCN architecture design, so it has all the latest bells and whistles that AMD has put into cores such as the desktop “Tonga” GPU. In this case the CPU guys helped the GPU engineers with their design flows to improve overall frequency while consuming less power. This also can go the other way in that they can keep the same frequency, but decrease power consumption by some 20%. This change is allowing AMD to enable all 8 GCN cores in even their lowest power APUs based on Carrizo. This was not possible with Steamroller and we would see a max of 6 GCN cores in the 15 watt TDP range. This will improve graphics performance dramatically in those parts without breaking the TDP budget.
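To put the 8 versus 6 GCN core difference in numbers, a small sketch follows. It assumes 64 stream processors per GCN core (compute unit), which is the usual GCN layout but is not stated in this article. The names are illustrative only.

```python
# Stream processor counts for the two 15 W configurations discussed above.
# ASSUMPTION: 64 stream processors per GCN core (compute unit).

SP_PER_GCN_CORE = 64


def stream_processors(gcn_cores: int) -> int:
    """Total stream processors for a given number of enabled GCN cores."""
    return gcn_cores * SP_PER_GCN_CORE


if __name__ == "__main__":
    carrizo = stream_processors(8)   # all 8 GCN cores enabled
    previous = stream_processors(6)  # the previous 15 W maximum
    print(f"8 GCN cores: {carrizo} stream processors")
    print(f"6 GCN cores: {previous} stream processors")
    print(f"Relative increase: {carrizo / previous - 1:.0%}")
```

On paper that is a third more shader hardware in the same 15 watt class, before any frequency differences are taken into account.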
“Finally, the combination of using a very mature and (relatively) inexpensive 28 nm bulk process while having a 23% die shrink for more performance and better power efficiency should make for a less expensive and more desirable product for OEMs and consumers alike. More chips per wafer at a lower cost per wafer than cutting edge processes should help AMD make a little more margin per chip than what they have been used to as of late.”
The die is actually a little bit bigger than Kaveri’s due to the integration of the PCH (and other enhancements): 250 mm² vs. Kaveri’s 245 mm².
Thanks for the clarification! I had overlooked that, sadly.
How can Carrizo both integrate the PCH and use the same old motherboard designs? What am I missing here?
Integrated components are disabled when not needed by a particular setup. So it can work in either scenario.
5% IPC.
5% IPC.
5% IPC.
5%
5% IPC.
5% IPC.
5% IPC.
5% IPC.
5% IPC.
>35% over Phenom.
LE LOL.
“This change is allowing AMD to enable all 8 GCN cores in even their lowest power APUs based on Carrizo.”
I feel like this is a pretty big deal; 15W chips with big 512 core GPUs should be excellent with HSA, not to mention truly mobile gaming. It’s a shame that I can never seem to find any decent AMD-based devices.
Thanks for the article Josh, great as always.
Carrizo lower cost and better than the super expensive Iris Pro 5200 graphics! I wonder if there will be laptop systems with dual integrated, and discrete AMD mobile graphics. Talk about a win win! Nice stepping stone towards ZEN, and AMD will be there with a custom ARM K12 APU and HSA compliant ARM ISA based SKUs for the tablet market, AMD could find its ARMv8a ISA based K12 a popular product for those OEMs who do not have Apple’s in house engineering resources. Nvidia dropped Denver off of the radar. AMD will have skybridge based motherboards to offer OEMs both an ARM, and x86 solutions in a pin compatible socket, saving on engineering separate motherboard solutions. Lots of interesting things coming from AMD between now and into 2016.
Who cares about parts that aren’t powerful enough to power a mechanical fly, bring on the 390x instead of having nvidia gobble up your mechanical flies for lunch!!!
And yet Intel, the so called epitome of gaming, still needs Nvidia or AMD GPUs to properly high end game. These Carrizo parts have better graphics than Intel’s, and if paired with a discrete mobile GPU, with software/gaming engines tuned to AMD’s version of HSA, will be able to leverage the GPU for more than just graphics. The high end Carrizo APU has more SPs(512) than my discrete 7650m(480), and for the graphics tasks I use my laptop for, should be fine for 3d mesh modeling, not so for Intel’s stripped down integrated product, that may be acceptable for gaming(mid level gaming), but not for high polygon count mesh modeling, the more SPs the better. Carrizo’s cost will not break the bank, and I hope that Laptops are forthcoming with the high end Carrizo part, and an AMD discrete GPU, that should make for some interesting gaming/other benchmarks.
AMD should help create a demo gaming engine that could show off Carrizo’s HSA being used to accelerate gaming physics on the integrated GPU while using their discrete mobile GPU for graphics, it looks like the way AMD has structured its GPU into functional blocks that these individual blocks could be given physics tasks, while the remaining GPU blocks, both integrated and discrete, can do graphics, allowing for just the resources necessary for any GPGPU gaming acceleration, and the rest for graphics. I will be very interested in Carrizo for graphics uses, if the software is there to accelerate ray tracing on AMD’s GPU, to help alleviate the bottleneck of not having enough CPU cores for ray tracing interactions, even Intel’s core i7, with 4 cores and 8 threads takes a good while with any ray tracing tasks.
This “mechanical fly” APU that you think is weak, may just surprise in more than just the Gaming benchmarks. And I do not see Intel, or Nvidia with any SOCs, that have integrated graphics that can be used with a discrete GPU. This, Zen, and K12 are on the way, And I’ll be seriously looking at any AMD custom ARMv8a based server/workstation SKUs, as long as I could get an SKU with 16+ CPU cores, the ARMv8a ISA has support for SIMD, and the more CPU cores for ray tracing the better, at least until GPUs start getting dedicated ray tracing hardware support, The PowerVR wizard is the only one to date, but hopefully the entire industry will adopt hardware ray tracing units in GPUs.
So… VAO is basically that instead of CPU frequency dictating voltage, voltage dictates CPU frequency (at least temporarily)?
That might not have been a nice way to put it.
A better way (I think) would be that VAO basically reduces the voltage required to run at a particular frequency. Is that right?
The first guy is right. Voltage will determine frequency. So when voltage unexpectedly drops, then the frequency quickly responds to that and drops as well. It only lasts microseconds, so users shouldn't feel any burps.
This sounds like a bad thing until you realize that means they can run higher clocks most of the time, since they’re now able to respond to changes so much faster!
“So far we have seen Bulldozer, Vishera, and Steamroller cores hit the market with varying levels of success. Now it is time for Excavator.”
Vishera should be Piledriver.
Beyond that, nicely written like always.
I’m still confused why AMD is keeping the core advancement from the AM3+ socket. Seeing how Zen is still 1-2 years away, this would give AM3+ a nice farewell. Or at least make an 8 core Excavator for FM2+ and I’m in.
Either way, hope the actual power/performance will be exactly what they are saying. Good luck AMD, we need competition ASAP.
Quite frankly because they don’t have the resources to compete with Intel in that market segment.
in full agreement.
Problem is what to do with an upgrade. jump to Intel, or wait.. Guess wait and see, so FX8350 will have to do for now…
If you rely on integrated graphics, wait for Carrizo.
If you rely on discrete graphics, wait for Pirate Islands.
Basically wait for AMD. You can always buy Intel later if none of AMD’s new products live up to the hype.
Yup, fully agreed. Wait and see with AMD.
So far, Intel’s Broadwell IGP is not looking too good. I’ve only seen the GT2 and GT3 benchmarks – not GT3e, but it looks like many newer games are still unplayable even at 720p/low. Supposedly they are DX12 compatible, but that doesn’t help the current situation.
Obviously any Intel + dGPU setup will offer the best overall performance and power savings, but it just sucks that they can’t get their mainstream IGP up to acceptable levels.
AMD does not have the money for the contra revenue/Wink Wink, just take this fat brown envelope stuffed with Benjamins! Those laptop OEMs are sure to knuckle under, or break under some simple arm twisting, once those arms, so gently twisted behind their backs, have those nondescript brown envelopes shoved into their hands. That’s what does a lot of Intel’s competing in “that” market segment.
The Mobile market is another matter, and AMD will have K12, as well as Carrizo, and later Zen, to compete in the tablet market, it’s a shame that the Laptop market is so corrupted, but it will not stay that way forever, how is that Tablet market working out so far, for Intel.
Thanks, fixed.
I have no idea why AMD has not ported the advancements to AM3+. Perhaps they just don't feel the monetary means and rewards are there, considering how little real improvement each new core set has produced over the last. Steamroller had slightly better IPC, but it was again aimed at a lower performance 28 nm process vs. 32 nm PD-SOI, so at the 4 GHz area where a new top end CPU would have been introduced… it could have had worse thermals than Piledriver and would not have overclocked as well (good luck in getting it to 5 GHz).
Carrizo is much in the same vein. It is designed for a slower process with greater efficiency. Cramming 4 modules of Excavator plus L3 cache would again likely have produced an AM3+ chip that could barely reach 4 GHz while staying in that 125 watt TDP range. The likelihood of the design approaching 4.5 GHz is pretty low.
AMD just doesn't have the teams of engineers available to go down that many avenues at once, especially considering what Jim Keller and gang have done with future cores. Those other guys are likely busy with the new direction AMD is heading and they consider AM3+ a sidenote and a crutch until they can get the next gen stuff out on x86 and ARM.
Josh you might want to add this to your article
http://www.3dmark.com/3dm11/9453670
3D Mark 11
P2645
with Generic VGA(1x) and AMD FX-8800P Radeon R7, 12 Compute Cores
Faster than a 740m I think.
Can AMD IGPs utilize any system RAM as VRAM? Does it partition it as needed? Is it manually adjustable? If I have 32GB RAM, can I use 16GB for CPU and 16GB for IGP?
You choose in the BIOS how much RAM you will use for the GPU. I think it is usually between 256MB and 2GB, but those numbers could be a little different from system to system.
Good to know, thanks. I only ask because even though the IGP is limited in its rendering ability, there is a benefit to having direct access to unending amounts of VRAM for certain workloads.
I’m looking forward to the real results. So far sounds pretty good. I hope that the socket parity with Carrizo-L will get us a lot of laptops with either as an option.
I also wonder how 35W Carrizo will compare with “low power” (45W) Kaveri desktop chips. Would be nice if AMD could offer 35W Carrizo chips for FM2+ if they can reach the same performance as current 45W chips.
I’m hoping that just being able to share mainboard designs with Carrizo-L means we’ll see more AMD laptops in general (since OEMs wouldn’t need to design and support as many layouts). But yeah, if you take all of the lowend cat-core models and offer full Carrizo options as configurable upgrades that would be win-win.
This has real potential for HTPC use. Low TDP allows for fanless operation (with right case). 4K decode (I do not need but I am sure others will).
If cheap and underclocked Carrizo would come in at around 25W then a small, cheap build could look great and be a perfect HTPC.
The problem for me is that at 35W, given I do not need 4K decode capability, it is no improvement on my i7-3770T, which is overpowered for full HD work, so there is no need to upgrade.
AMD is trying to squeeze water out of rock, but, when you are in a position like they are in, you learn to roll with the punches.
What will make or break these chips are the devices that they might be put in by the OEMs. No one is going to buy a dinky little laptop with absolutely lousy quality hardware. These chips are powerful enough to be useful to 90% of the customer base. These are the devices I would like to see the Carrizo chips in.
Carrizo 35W parts:
Device Thickness 1″ (25mm)
Large 50WHr battery
14″-17.3″ 1080p displays (Higher resolutions are pointless with laughable scaling in Windows)
Carrizo L parts:
Device Thickness about the same (1″ or 25mm)
Large 40WHr battery
11.6″-14″ 1080p displays (Higher resolutions are pointless with laughable scaling in Windows)
You have the optional backlit keyboards, replaceable hardware (HDD to SSD, More SODIMMs, etc) and easy maintenance.
As they are using a mature process node for production, they should aim to supply as many of their flagship chips (in each category) as possible.
Price it around $500-$600 and you have something that will appeal to many.
Again, I will repeat what many have pointed out in the articles concerning this release. AMD should also make an effort to put these chips in small barebones boxes like the Gigabyte BRIX (without a discrete GPU). Charge around $200-$250 and they would make excellent miniPCs for many.
My 2 cents.
I forgot to add….
No freaking discrete GPUs.
AMD graphics switching is still glitchy at best.
Good luck AMD.
Will AMD’s new CPU be able to compete with Intel’s 5th generation Core i5 or even Core i7 CPUs?
You mean this APU?
On the mobile side (i5)? Sure
On desktop. Not really. Intel is sadly the only way to go for high performance single/few threaded stuff.
AM3+ CPUs can compete in some multithreaded stuff, like music production workstations etc… that demand more cores rather than high “clockspeeds”.
I wonder if it would be worthwhile to have an HBM solution for devices with the thermal headroom to support it.
Depends on the implementation. HBM takes away a big bottleneck in most applications, but will add complexity and more TDP. Plus, cooling can be problematic in full 3D designs.