Gunning for Broadwell-E
AMD gave us our first glance at performance of Zen relative to Intel’s Broadwell-E platform.
As I walked away from the St. Regis in downtown San Francisco tonight, I found myself wandering through the streets toward my hotel with something unique in tow: a smile. I was smiling, thinking about what AMD had just demonstrated at its latest Zen processor reveal. The importance of this product launch cannot be overstated for a company struggling to regain a foothold in a market in which it once held a definitive lead. It’s been many years since I left a conference call, a meeting, or a press conference feeling genuinely hopeful and enthusiastic about what AMD had shown me. Tonight I did.
AMD’s CEO Lisa Su and CTO Mark Papermaster took the stage down the street from the Intel Developer Forum to roll out a handful of new architectural details about Zen while also showing the first performance results comparing it to competing parts from Intel. The crowd in attendance, a mix of media and analysts, was impressed. The feeling in the room was palpable.
It’s late as I write this, and while there are some interesting architectural details to discuss, I think it is in everyone’s best interest that we touch on them lightly for now and save the deep dive for when the Hot Chips information comes out early next week. What you really want to know is clear: can Zen force Intel to work for its lead again? Can Zen make that $1700 price tag on the Broadwell-E 6950X seem even more ludicrous? Yes.
The Zen Architecture
Much of what was discussed about the Zen architecture is a re-release of what has come out in recent months. This is a completely new, ground-up microarchitecture, not a revamp of the aging Bulldozer design. It integrates SMT (simultaneous multi-threading), a first for an AMD CPU, to take more efficient advantage of a longer pipeline; Intel has had HyperThreading for a long time now, and AMD is finally joining the fold. A high-bandwidth, low-latency caching system is used to “feed the beast,” as Papermaster put it, and 14nm process technology (starting at GlobalFoundries) gives efficiency and scaling a significant bump while enabling AMD to stretch from notebooks to desktops to servers with the same architecture.
By far the most impressive claim from AMD thus far is a 40% increase in IPC over previous AMD designs. That’s a HUGE claim and is key to the success or failure of Zen. AMD went a long way toward proving to me today that the claim is real and that we will see the impact of that architectural jump from day one.
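As a rough sanity check on what a claim like that means (my own arithmetic, not AMD's): single-thread throughput scales roughly as IPC times clock, so a 40% IPC gain at the same clock is a 40% throughput gain, and it also buys headroom to trade frequency for power.

```python
def relative_perf(ipc_gain: float, clock_ratio: float) -> float:
    """Single-thread throughput relative to the old design.

    ipc_gain: fractional IPC improvement (0.40 = +40%)
    clock_ratio: new clock divided by old clock
    """
    return (1.0 + ipc_gain) * clock_ratio

print(relative_perf(0.40, 1.00))  # 40% IPC gain at the same clock: 1.4x
print(relative_perf(0.40, 0.90))  # even with a 10% clock deficit: ~1.26x
```

The second line is the interesting one: a large IPC gain means Zen would not even need to match Bulldozer-era clocks to come out well ahead.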
The press was told of a handful of high-level changes to the new architecture as well. Branch prediction gets a complete overhaul, and this marks the first AMD processor to include a micro-op cache. Wider execution width and broader instruction schedulers are integrated, all of which adds up to much higher instruction-level parallelism to improve single-threaded performance.
Performance improvements aside, throughput and efficiency go up with Zen as well. AMD has integrated an 8MB L3 cache and improved prefetching for up to 5x the cache bandwidth available per core. SMT keeps the pipeline full to prevent “bubbles” that introduce latency and lower efficiency, while region-specific power gating means that we’ll see Zen in notebooks as well as enterprise servers in 2017. It truly is an impressive design from AMD.
Summit Ridge, the enthusiast platform that will be the first product available with Zen, is based on the AM4 platform, and processors will go up to 8 cores and 16 threads. DDR4 memory support is included, along with PCI Express 3.0 and what AMD calls “next-gen” I/O – I would expect a quick leap forward for AMD to catch up on things like NVMe and Thunderbolt.
The Real Deal – Zen Performance
As part of today’s reveal, AMD is showing the first true comparison between Zen and Intel processors. Sure, AMD showed a Zen-powered system, paired with a Fury X, running the upcoming Deus Ex at 4K, but the really impressive results were shown when comparing Zen to a Broadwell-E platform.
Using Blender to measure the performance of a rendering workload (a Zen CPU mockup, of course), AMD ran an 8-core / 16-thread Zen processor at 3.0 GHz against an 8-core / 16-thread Broadwell-E processor at 3.0 GHz (likely a fixed-clock Core i7-6900K). The point of the demonstration was to showcase the IPC improvements of Zen, and it worked: the render completed on the Zen platform a second or two faster than it did on the Intel Broadwell-E system.
Not much to look at, but Zen on the left, Broadwell-E on the right…
Of course there are lots of caveats: we didn’t set up the systems, I don’t know for sure that GPUs weren’t involved, we don’t know the final clocks of the Zen processors releasing in early 2017, and so on. But I took two things away from the demonstration that are very important.
- The IPC of Zen is on-par or better than Broadwell.
- Zen will scale higher than 3.0 GHz in 8-core configurations.
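The first takeaway follows from simple arithmetic: at matched clocks and thread counts, the ratio of wall-clock render times approximates the ratio of effective IPC. A tiny sketch with made-up times, since AMD didn't publish exact numbers:

```python
def ipc_ratio(time_a_s: float, time_b_s: float) -> float:
    """Effective IPC of system A relative to system B, assuming equal
    clocks and thread counts (lower render time means higher IPC)."""
    return time_b_s / time_a_s

# Hypothetical: Zen finishes in 59 s vs 61 s for Broadwell-E, both at 3.0 GHz
print(round(ipc_ratio(59.0, 61.0), 3))  # a hair above 1.0
```

Even a "second or two" on a roughly minute-long render only supports "on par or slightly better," which is exactly what AMD needed to show.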
AMD obviously didn’t state which specific SKUs are going to launch with the Zen architecture, what clock speeds they will run at, or even what TDPs they are targeting. Instead we were left with a vague but understandable remark of “comparable TDPs to Broadwell-E”.
Pricing? Overclocking? We’ll just have to wait a bit longer for that kind of information.
Closing Thoughts
There is clearly a lot more for AMD to share about Zen, but the announcement and showcase made this week with the early prototype products have solidified for me the capability and promise of this new microarchitecture. As an industry, we have asked for, and needed, a competitor to Intel in the enthusiast CPU space – something we haven’t legitimately had since the Athlon X2 days. Zen is what we have been pining for, what gamers and consumers have needed.
AMD’s processor stars might finally be aligning for a product that combines performance, efficiency, and scalability at the right time. I’m ready for it – are you?
Tempted to say “Awww Yis” but nope. Will wait for benchmarks
Nice!!! I like ZEN so far. Hope they deliver.
Let’s just say they match the Haswell-E. The big issue I see is that they price it close to the Haswell-E, so there is no benefit to anyone other than AMD finally releasing a part that catches up with Intel. The only way AMD can start digging into market share is to halve the price of the 6900K for the Summit Ridge CPU, or go even lower if they can still make a decent profit on the chip.
At this point, for the consumer market, CPU performance is less important than ever before with a few niche exceptions (RTS style games and such). Low power consumption with reasonable performance is what they need. For laptops, sufficient CPU performance with a powerful GPU could be a winning combination. The main thing that causes me to not recommend AMD parts, even for low budget builds, is the high power consumption for the performance and the age of the platform. Zen should resolve both.
Intel has improved their GPUs significantly, but I doubt that Intel integrated graphics will compare well against 14 nm AMD integrated graphics. Intel IGPs have not compared that well even with a massive difference in process technology: it has been Intel 14 nm GPUs against AMD 28 nm. With both on 14 nm FinFET, I doubt Intel graphics will be comparable at all. It should reduce power consumption of AMD chips significantly. Global Foundries/Samsung 14 nm probably isn’t as good as Intel 14 nm, but it will be a lot closer than Intel 14 nm vs. TSMC 28 nm. I wouldn’t trust any benchmarks at this point, but it sounds like they will at least be in the same ballpark for CPU performance. If they can provide that performance with good power efficiency, then I would think they will get a lot more laptop design wins. 14 nm production by itself should increase power efficiency significantly. They will also have improved their design for low power (clock gating and more advanced power management systems) significantly over what was implemented with Excavator. I don’t think the power consumption will be an issue.
Also, if they can get a laptop APU out with a stack of HBM, I would definitely buy it. Intel has their Iris Pro, but I doubt that it will be competitive with an HBM-based APU. Even a single stack of HBM1 could boost graphics performance significantly. AMD just needs to get it out in a reasonable amount of time. The delay to 2017 is very disappointing. I have been wondering if Apple may actually use AMD chips in their next laptop updates. It would make some sense. It would be interesting if all of their early production is going to Apple. I am hoping that AMD has an HBM-based APU available sooner rather than later, although I have wondered if anyone would be able to make a laptop built like a console, with 8 or more gigabytes of GDDR5 rather than DDR4. That would only be a temporary solution until HBM-based APUs are available though.
Your comments sound very familiar to those of people I have encountered who live in the past and haven’t caught up with current technology. It’s time you and many of your ilk pulled your heads out of the sand and started reading the latest tech news, especially when it comes to AMD. We are not responsible for your laziness, as is reflected in your lengthy post.
Hard to tell what you are talking about since you didn’t actually say much of anything. Perhaps you are an Intel fan boy. I wouldn’t dispute that Intel makes great CPUs. The problem is that the GPU is a much more important component now. For most games, almost any modern 4 thread desktop class CPU will do fine. Intel doesn’t have a good GPU, and they seem to have no intention of going after the GPU market. It is a lot lower margins than enterprise CPUs. Intel gets many times the amount of money for the same amount of silicon (as a Xeon CPU or Xeon Phi) compared to how much GPU makers get. Intel has clearly stated that they will focus on the enterprise market. I don’t know if they will essentially abandon the consumer market, but without a high performance GPU, they are not going to be very competitive.
Don’t agree.
Power consumption only matters for the mobile market and laptops, not PCs.
PCs need performance at any cost.
You are living in the past if you think the mobile market is not important. Going forward, being able to do VR and AR without being tied to a massive PC is going to be very important. This will be mobile parts. I have seen a large number of benchmarks indicating that the CPU is irrelevant to most games as long as you have a minimum level of performance. Even some old excavator based parts do well when paired with a powerful GPU.
The hype train has been running at AMD every damn CPU generation. When will people learn?
Considering the last generation was in 2011, and by all accounts it was a lemon, I’d say we’re due for a ride on the IPC express.
these hype train claims are hilarious. Like was said, last real CPU gen was a good while ago. What hype have you been dreaming about?
Also finally AMD copies Intel (SMT, write back cache, micro-op cache, etc), which is a good thing. The only bad thing is that hype, unless AMD can pull a Conroe like this http://www.anandtech.com/show/1963 where the reviewer is allowed to test the CPU.
Don’t forget the Nehalem preview also http://www.anandtech.com/show/2542
IPC gains are obviously good. Could be very good chip for servers. Clock speeds might not be what gamers want or need. The process might be the problem there, but isn’t AMD still tied to Global Foundries for years to come?
They are tied to Global Foundries for a certain number of minimum orders. Once they meet that minimum they can use any other foundry without penalty. I believe their Polaris orders are actually enough to meet this minimum. Once AMD gains market share again, this should be easy. AMD can leverage Samsung and TSMC if they want to at that point.
“Clock speeds might not be what gamers want or need”
Since when do games need a certain clockspeed? I don’t understand why gamers (or anyone else) need a certain clockspeed. At the end of the day, all that matters is how much performance the parts deliver, and in servers, how little power they burn while doing so (and while not doing anything).
Sure, but if Zen’s IPC is the same as Broadwell’s but it clocks 30% lower, that’s a problem. (For enthusiast market at least, I imagine server market wants moar cores at lower frequency instead.)
If it clocks 30% lower with the same IPC and the price is 40% less than Broadwell, I don’t see a problem… ¯\_(ツ)_/¯
I do. Why would I, as a gamer, pick the one with worse single-threaded performance? I’d rather have fewer cores at 40% less cost than lose out on single-threaded.
Then do not buy an 8-core / 16-thread CPU!
If you’re not a gamer, or a casual gamer, or on a tight budget and would rather spend money on the GPU, it’s fine. If you’re a serious enthusiast with a big budget, it’s not.
Most buyers are in the first category, so that’s enough to keep AMD alive.
IPC = per cycle, not clock.
wtf… cycle and clock are the same thing lol
A cycle is the start-to-completion of a task, whereas the clock is cycles per unit time; hence why it’s measured in Hertz. They are similar but aren’t the same thing.
Cycles and clock are mostly the same thing. It’s like water and ice: the same thing.
1 Hz IS one cycle per second. Task cycles are a different thing, so IPC IS instructions per cycle, and Hz counts cycles.
And why the FUCK do people question this now? When Bulldozer came out and failed, Intel fanboys and non-fanboys alike, every single one, were talking “IPC, IPC, IPC… yes, IPC”. Everything was IPC.
Now we question the importance of IPC?
Stop, let AMD fail, they don’t need help. Or let it be something competitive for better prices.
Bulldozer failed, but not on IPC for that matter. AMD stated before Bulldozer was released that its IPC would actually be lower than Phenom’s, but with CMT. The biggest problem with Bulldozer was not IPC or CMT but slow cache, due to the automated design flow they had to use.
No it’s not
IPC is the same regardless of clock speed
If the IPC is the same as Broadwell but the clock speed is less games will suffer.
A good portion of the market is still largely dependent upon single thread performance.
Increasingly less true, and more so every day. Most single-thread-dependent titles also don’t benefit much from CPU speeds over 3GHz unless you’re going for really, REALLY high frame rates.
You’re basically looking for problems here.
I’m impressed by Zen, but what you’re saying is BS. There are tons of unoptimized games that like my 4GHz i7 a lot more than my 3GHz i5. IPC and high clocks are not that important in many server and supercomputer workloads, but they are in games, even now.
False.
IPC is more important than clock speed. If you have the same IPC but lower clock speeds, overclock and set the clocks to your own taste.
Also, heat and power consumption are important.
Having good IPC is the most important thing, and THAT is what Intel did with the first-gen Core i7. It had lower clock speeds than Phenom II, but it was better, and the next gen fixed the clock speed. Result? GG AMD.
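For what it's worth, the whole IPC-versus-clock argument in this thread reduces to one identity: delivered single-thread throughput is IPC multiplied by clock frequency. A minimal sketch (the IPC and clock numbers are illustrative, not measured values for any real chip):

```python
def instr_per_sec(ipc: float, clock_hz: float) -> float:
    """Delivered single-thread throughput in instructions per second."""
    return ipc * clock_hz

chip_a = instr_per_sec(2.0, 3.0e9)  # higher IPC, lower clock
chip_b = instr_per_sec(1.6, 3.6e9)  # lower IPC, higher clock
print(chip_a, chip_b, chip_a > chip_b)
```

In this made-up case the higher-IPC chip wins despite a 600 MHz clock deficit, which is why neither number means anything in isolation.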
Seeing them use Blender is a very nice touch, being a Blender user myself.
With regard to usage of GPUs and CPU at the same time: Blender in its current form can render either on the CPU or on the GPU, but not on both at the same time.
There are hacks, but it would only work with OpenCL devices, and Nvidia support in OpenCL for blender is… horrible. AMD’s GPU OpenCL implementation also has issues.
As this is an open-source application, I guess they could have fixed the various issues with their own developers. The Blender Foundation certainly couldn’t, for lack of developer resources under community-supported funding.
Therefore I would consider the rendering results pure CPU vs CPU at this point.
This just ensures that all my savings are going to AMD when Zen comes out.
One thing is for certain: the Blender render ran only on the CPU, as Cycles GPU rendering has to be explicitly selected and the materials set up to use Cycles on the GPU. The only time Blender will always leverage the GPU is in the editor’s viewport editing modes, and Blender uses OpenGL for that.
One note about Intel’s graphics: it is terrible for editing large, high-polygon-count mesh scenes in Blender’s OpenGL-driven 3D edit modes. Intel lacks the high shader core counts that both Nvidia and AMD offer in their respective GPU SKUs. So if you are trying to edit any high-polygon-count scene in Blender, or a single high-polygon-count mesh model, Intel’s GPUs can and do bog down and make the editing process practically useless.
That said, CPUs suck, suck, suck for rendering, and I’m sure that AMD’s demo was done on a relatively simple mesh model with the lighting/ray-tracing sample rate on lower settings, as both Intel’s and AMD’s CPUs suck at rendering compared to GPUs, which have vastly more FP resources and get rendering jobs done in minutes rather than hours once the ray tracing/AO/AA and other settings are even set to medium. The good news is that AMD’s GCN GPUs now have Cycles rendering support. There is still more work to get AMD’s GPUs working better with Cycles, but at least the support is there for Cycles rendering on the GPU, instead of those mook CPUs straining at 100% on all cores for hours on end to render a high-detail scene with the sample rates turned up to medium.
GPUs RULE!
………the last time AMD did a preview like this….. we got Bulldozer…..
When AMD didn’t do such ‘marketing’, it meant everything was going well, as proven by the Athlon 64 (first integrated memory controller) and the HD 4870/4850……..
I doubt they showed live benchmarks of bulldozer. They probably just made claims.
Yeah, they didn't really cover Bulldozer like this. The only graphs/perf data I saw were some bars showing Bulldozer with significantly more INT performance than the previous Phenom II. That was essentially achieved through the CMT implementation, obviously. I remember hearing from mobo partners about 4 months before launch that the samples they had were not running faster than the 1100T of the time, and the mobo guys were very, very concerned.
AMD very much has something to prove at this point. This was not so much the case when the Athlon 64 debuted nor even when the Phenom II debuted. I’m not surprised that they would choose to start demoing their parts to try and show prospective buyers that they intend to be able to deliver this time around.
this is far more promising than that aots benchmark
CPU numbers from AotS aren’t really a benchmark; they vary pretty wildly. Trying to draw conclusions from the leaked numbers earlier was a fool’s errand.
I just hope that these new CPUs are a match for the current-generation i5s, if not i7s, so that there can be a viable alternative to Intel.
Do you know why 4 or more CPU cores are good for gaming? It’s not so much the CPU’s ability to help the GPU with rendering! Those 4-, 6-, and 8-core CPUs can take the Windows services and adware/spyware bloat that steals CPU cycles and run it on one or two of the available cores, freeing the remaining cores to focus only on the gaming itself. So if you have 6 cores, maybe 2 can take all the cycle-stealing bloat pressure off of the remaining 4 cores, and that helps keep the GPU fed with more work. 8 CPU cores are even better for absorbing that M$ OS bloat while having even more cores’ cycles available to feed the GPU work and run the CPU-only, non-graphics parts of the game.
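The idea in the comment above, keeping background load off the cores a game runs on, can at least be approximated with CPU affinity. A Linux-only sketch using Python's standard library (the "reserve the lowest core" split is purely illustrative, and Windows would use different APIs):

```python
import os

# Linux-only sketch: keep the lowest-numbered core free for background
# services and pin this process to the remaining cores (if there are any).
avail = sorted(os.sched_getaffinity(0))    # cores this process may run on
game_cores = set(avail[1:]) or set(avail)  # all but the first, when possible
os.sched_setaffinity(0, game_cores)        # 0 means "this process"
print(sorted(os.sched_getaffinity(0)))
```

In practice the OS scheduler already spreads background work across idle cores, so manual pinning mostly helps in pathological cases, but it illustrates why spare cores keep bloat off the game threads.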
There you are! I was waiting for a brave Anonymous commenter to swoop in and warn us about how all this tech is just so Microsoft can more easily spy on us! Stay vigilant! Inspect your tinfoil hat regularly!
And your M$ paycheck is in the mail, you shameless M$ shill/GIT! Steam OS/Vulkan and other Linux kernel OSs/Vulkan gaming will be used for many billions of devices across way more than 300 million mostly forced windows 10 installs and forced windows 10 bundling of new OEM PC/Laptop hardware. Vulkan/Linux will be on Billions of devices from PC/laptops to tablets/phones so that burning platform of windows 10 and its vile EULA will be remanded to rot in the dust bin of devices history. Smell your stinky clouds of stink later Nadella’s reeking minion!
M$ the CPU cycles, and internet bandwidth, stealing crooks!
Windows 10 will XBONE your life, watch out for that EULA!
WINTEL sucks!
I like Linux as much as the next enthusiast, and my systems are a mix of Debian and Windows. That said, I think you’re overstating the problem.
The problem is avoiding any abusive Monopoly interests’ products, and M$ and Intel have abused their Monopoly market positions for decades! Nvidia is almost as bad with its product segmentation policies that strip out any useful computing functionality, other than gaming functionality, for Nvidia’s consumer GPU SKUs.
What is a damn shame is that for decades these abusive monopolies were allowed to run free with few legal consequences; they should have been broken up years ago. We have been getting Windows OSs force-bundled on new PC/laptop hardware for ages, with few requirements for the OEMs to provide drivers for alternative OSs and allow the customer to choose which OS is better for that customer’s needs.
Hopefully there will be some Linux OS based laptop OEM products that will allow some to have no Intel/Nvidia SOC/GPU hardware in their consumer laptop SKUs, and absolutely no M$ software products ever on their laptops! Because after windows 10 and its EULA, there is no PC/laptop OS/ecosystem choice other than Linux for a non closed OS/Application ecosystem going forward.
Which Vulkan/Linux games are you talking about? Something from Steam, which is a separate Digital Rights Management (proprietary, freedom-restricting, spyware enabling) platform?
All of the best games use some form of DRM. 99% of the games without DRM can run fine on a PC from 2009 with a mid-level video card. So Zen, and Intel Kaby Lake, and the latest video cards are irrelevant.
I have four Linux PCs in my house and the newest processor is from 2013 and cost $120 brand new, and that’s fine because I don’t run proprietary games on any of them.
Your message is incoherent. If you really care about freedom, you can’t run modern games. That’s the price you pay. Most people won’t pay it. I understand their choice and respect their right to make it. But you can’t escape the evils that are fundamental to Microsoft’s proprietary software business model by choosing to accept another proprietary software master. Here comes the new boss, same as the old boss.
The level of effort this guy puts into his posts is impressive. Although, I think his extremism actually has the opposite effect than he seemingly desires. Calm rational arguments tend to sway public opinion better than using nicknames for opponents and making stuff up…or maybe not; after all, people voted for Trump in the primaries.
F-Off Joe Six-Pack, both the conservative(Trump) and liberal(Clinton) versions of knee-jerk morons! Enjoy serving your merchant princes! Scam_Merica is so full of it when it comes to true democracy! BUST The TRUSTS Teddy, and get democracy back on track and out from under the corporations’ thumbs! Capitalism is the leading cause of Communism, Fascism, Naziism, and Capitalism is as big a threat to democracy as Communism, Fascism, Naziism were/are. Capitalism the great breeder of and supporter of despots around the world, and Joe Six-pack is the little tyrant that keeps the big tyrant’s wheels greased and moving, be the tyrant Capitalist, Communist, Fascist, Nazi, or religious!
I had to look up what Joe Six-Pack means. That’s a funny one. Doesn’t quite describe me per the urban dictionary definition though.
I love that you respond with further pontifications. Please join the site so we know it is you and not some imitator trying to steal your anti-establishment thunder.
I like his stuff. He speaks the truth. However, he has yet to mention what he is using to write his posts on. My vote is hypocrisy.
Intel/Armh/Amd/old Apple/Chrome OS?
There are a few ARM-based Linux devices that almost make sense. Gentoo on an RPi 3 and he’d have my respect.
He is still probably going to need internet though, so right there he might as well just be sleeping with Ballmer.
Is he stealing his internet? There we go, respect again.
Don’t know why he would be on a mostly Windows site complaining about Windows though.
Regardless of hypocrisy or not… keep it up.
Arm-H’s reference design CPUs are not such good designs for laptop/netbook SKUs; let’s stick with the custom ARM variety, where the licensee only licenses the ARMv8A ISA and builds its own core microarchitecture engineered to run the ARMv8A ISA. I’ll take AMD’s K12 if it has STM capabilities and is a wider-order superscalar design like the Apple A7 cores! It looks like the Keller-led design teams responsible for Zen’s performance were also looking to take the same basic core design of Zen and engineer it to run the ARMv8A ISA; so maybe the same basic Zen caching structure and other CPU/SMT resources, minus most of the microcode/micro-op cache (the ARMv8A ISA does not need microcode as much, with its simpler RISC instructions).
So maybe an AMD custom K12 with a wide-order superscalar design and SMT to best even the Apple A-series SKUs in CPU core and graphics performance! That is, until Imagination’s PowerVR SKUs with the dedicated ray tracing units get more design wins. Both AMD and Nvidia need to look at getting dedicated ray tracing functional blocks into their GPU IP, as well as other methods to make CPUs unnecessary for any and all graphics workloads, and free end users from having to spend excessive amounts of money on overpriced CPUs for graphics workloads! CPUs are really a ripoff for the number of cores/FPUs one gets at such high cost relative to what can be had from a GPU.
Props do go out to ARM Holdings’ new Mali/Bifrost GPU microarchitecture, and maybe SoftBank will fund some future Mali/Bifrost laptop/mobile variants that it can offer to the IP licensing market; just be sure to include some dedicated ray tracing functional blocks on any future GPU designs, so those costly CPU bastards are not needed for any graphics workloads. And while you are at it, SoftBank (ARM Holdings’ parent now), why not fund ARM-H to take the ARMv8A ISA and engineer a Power8-style reference CPU microarchitecture to run it? Power8 is a RISC ISA design also! So it would just take some extra money given to ARM-H to have them work something up in the extra-wide-order superscalar Power8-style regimen that can run the ARMv8A RISC ISA.
P.S. Joe Six-Pack is an existential threat to civilization! Now BUST the WINTEL TRUST, break up M$, Intel, the other TRUSTS!
edit: if it has STM
to: if it has SMT
Actually he is not far off from the truth! Besides, calling people tin foil hats no longer works! Did you not hear about the Soros emails? You can no longer cry conspiracy theory and deny the truth! Wake up!
Except that the entire internet really is constantly spied on.
I’m a little sad they didn’t do an instructions-per-clock-per-core comparison to Skylake. If they were shown to be within 3-5% of the Skylake CPU’s performance, I would honestly consider it.
Even if the total thermal output of the CPU was still 95W – 125W (which, by their own graph, is most likely going to happen).
Does anyone know if AM4 will support ECC?
I hope AMD stays alive long enough to research some crazy stuff, like big.LITTLE architecture on the server side, or high-bandwidth memory on the APU, or just Thunderbolt.
Skylake is only really faster than Haswell/Broadwell (at the same clock) if you go with significantly faster (DDR4-3000+) RAM. Most RAM sold isn’t that speed, so I wouldn’t consider this ‘far off’.
This IPC comparison shows that they ARE within 3-5% of Skylake’s IPC, given that Skylake is within that range of Broadwell.
While I think these kinds of debate are fun – it’s part of the reason we’re on this site – I think it’s pointless to make any serious plans or even predictions at this point.
Even as an AMD fanboy, I’m not prognosticating anything good (or, to be fair, bad) about Zen until PCPer and Phoronix and similar sites get to run their tests with retail samples.
Some caveats….
1. Zen SMT throughput is much better, but single-core is still lower, because AMD still has not revised that 40% IPC improvement figure. AMD could have taken a leaf out of IBM’s Power8 book by designing an architecture specifically to run 2 SMT threads, see http://hothardware.com/ContentImages/Article/2515/content/big_amd-zen-throughput-3.jpg – dual register sets and dual program counters…..
2. Blender is open-source software, so it’s easy to add speedups and optimizations for Zen (which also means it’s easy to cripple a competitor’s CPU). Will wait for better (harder to manipulate) benchmarks….
3. Why 3GHz? Could that be the max clockspeed of a cherry-picked Zen engineering sample? The Core i7-6900K base clockspeed is 3.2GHz, so why not 3.2GHz, the same clockspeed as the boost on the Zen engineering sample? 3.2GHz is not far from 3GHz; likewise, 3GHz is not far from 2.8GHz. Reminds me of this https://www.youtube.com/watch?v=R7EZmYth6TM&t=0s cherry-picked 3GHz engineering sample…
4. Delayed to 2017, exactly as many predicted after Lisa Su’s cryptic speak at AMD’s earnings call. Probably AMD needs time to tweak for more clockspeed.
5. Another delay indicator is, where are the AM4 motherboards? Have not seen any announcements or exhibits from major motherboard manufacturers.
The 8-core 14nm Zen is a 95W part.
The 8-core 14nm i7-6900K is a 140W part.
The i7-6900K’s TDP is nearly 50% higher, yet it was SLOWER in this Blender demo.
But yes, the base clock was set at 3GHz vs 3.2GHz.
The reason, it seems, is that at 95W, Zen peaks at 3GHz on 8 cores, but the i7-6900K requires 140W to sustain 3.2GHz on all 8 cores.
At those performance levels, the margin is so fat that AMD has plenty of room to make money. Lots of money.
That's incorrect. In the demo, both systems ran at a fixed 3GHz clockspeed. So Zen is operating at more than 95W, while Broadwell-E is operating at less than 140W.
Also, that 95W figure is for the 2.8GHz base clockspeed known from the engineering sample leaks, so 3GHz will push Zen past 95W easily. And that's the big question: if 3GHz is easy on Zen, then why not go for 3.2GHz on all cores like the Core i7 6900K?
Broadwell-E's 140W applies when all cores run at the factory default 3.2GHz base clockspeed; it draws much less than 140W with all cores at 3GHz.
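The clock/power tradeoff both sides are arguing about can be sketched with the usual dynamic-power rule of thumb, P proportional to f times V squared. The sketch below is purely illustrative; the 1.20V/1.15V operating points are assumptions, not measured Broadwell-E values:

```python
# Rule-of-thumb dynamic power scaling: P ~ f * V^2.
# All operating points below are assumptions for illustration,
# not measured Broadwell-E values.

def scaled_power(p_base_w, f_base_ghz, f_new_ghz, v_base, v_new):
    """Estimate power at a new frequency/voltage operating point."""
    return p_base_w * (f_new_ghz / f_base_ghz) * (v_new / v_base) ** 2

# Hypothetical i7-6900K point: 140 W at 3.2 GHz / 1.20 V, dropped to
# 3.0 GHz with a small voltage reduction to 1.15 V.
p_at_3ghz = scaled_power(140.0, 3.2, 3.0, 1.20, 1.15)
print(f"Estimated all-core power at 3.0 GHz: {p_at_3ghz:.0f} W")
```

Even this crude model suggests the last couple hundred MHz cost disproportionate power, which is why a fixed 3GHz run says little about either part's behavior at stock clocks.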
I dunno about that power usage. My 5820K at stock speeds rarely went over 100W during torture tests. At the 4.2GHz OC I have it at now, it hits 140W in the same tests.
Repeat after me: “Engineering sample”
We still don’t know anything about Zen’s final clocks.
Global Foundries 14nm LPP is a low-power process node. That means AMD's Zen is being fabbed on a low-power node usually designed for mobile SoCs, so expect low frequencies. Previous AMD desktop CPUs have always been fabbed on a high-performance process node.
It is not that simple; they certainly won't use the same libraries, and even the manufacturing process will have to be updated for manufacturing CPUs alone.
Sorry, but we don’t know any of that. AMD says Zen will have comparable thermals to Broadwell-E. That means 125-140W TDPs. And that’s perfectly okay for an 8-core, 16-thread CPU at high clocks. Expecting AMD to surpass Intel on perf/w is neither realistic nor smart. I’d rather keep my hopes realistic and be positively surprised if they knock it out of the park.
The reason for the 3GHz comparison is obvious: engineering samples rarely clock very high. And their intent was to show comparable IPC, so matching clock speeds is the only real way of showing this. As long as retail Zen clocks in at around 4-4.2GHz stock (although I'd expect the 8-core versions lower than that), they seem to be doing fine, assuming this benchmark is representative.
Unfortunately, we don’t know that. However, the chance of them running highly optimized code for a brand new, unreleased architecture is next to zero. So I’m optimistic. But I still don’t expect AMD to surpass Intel. If they come close to matching current Core i series cpus, I’d be more than happy to give them my money.
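The matched-clock reasoning can be made explicit: since single-thread performance is roughly IPC times frequency, locking both chips to the same frequency reduces the comparison to IPC alone. A toy sketch (the IPC figures are hypothetical placeholders, not real Zen or Broadwell numbers):

```python
# Toy model: single-thread performance ~ IPC * clock (GHz).
# The IPC numbers are hypothetical placeholders, used only to show
# why matched clocks isolate the IPC comparison.

def perf(ipc, ghz):
    return ipc * ghz

zen_ipc, bdw_ipc = 1.9, 2.0   # hypothetical instructions per cycle
locked_clock = 3.0            # both parts fixed at 3.0 GHz, as in the demo

ratio = perf(zen_ipc, locked_clock) / perf(bdw_ipc, locked_clock)
print(f"Perf ratio at matched clocks: {ratio:.2f}")
print(f"IPC ratio:                    {zen_ipc / bdw_ipc:.2f}")
```

At equal clocks the performance ratio is exactly the IPC ratio; at different clocks the frequency term would confound it.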
If they come close to 2013 Intel Core i7 performance for a significant price discount over a current equivalent-performance Intel part, I will be satisfied. That’s still a colossal jump from parts like the FX-8320 – and as mediocre as that part is compared to the latest Intel bits, it still suits me fine.
Anything better than that would be good. But I think it would take a miracle for Zen to let AMD close a… what, five? Seven? More? Year period of falling successively further behind in every metric except cost per performance.
You are not the only one having the same question or conclusions.
https://twitter.com/FPiednoel/status/766332772025675776 “For those who wants to verify that Blender use different code on AMD and Intel,install GIT & follow the instructions https://wiki.blender.org/index.php/Dev:Doc/Tools/Git ”
https://twitter.com/FPiednoel/status/766298394046308352 “And the best shoot was with a software they could recompile and optimize … So, it is a “up to claim” , after architecture tuning …”
On that IBM Power8, you mean like this: http://www.anandtech.com/show/10435/assessing-ibms-power8-part-1/9 poor single-thread performance but high SMT-2 throughput.
Finally. Intel hardcore fanboys can dream of an 8 core Cannonlake that will not cost $1000.
Of course they can still hope for AMD’s destruction and happily buy that 8 core Cannonlake… in 2019…. for $1500.
AMD will be, as always, the price/performance leader for both CPUs and GPUs, with no gimping of compute on the GPU! Just wait and see for VR gaming and 4K gaming as games make use of all the extra compute on AMD's GCN GPU SKUs. It's Zen/Polaris for my next laptop purchase, hopefully from a Linux-based laptop OEM.
It's time to get the Intel/Nvidia stench out of the Linux-based OEM PC/laptop market and go fully non-monopoly with the PCs'/laptops' hardware and OS.
‘Comparable configurations’, so what about the number of memory channels used? If that Zen AM4 machine is only dual-channel (which IMHO is most likely), then it's possible that the Broadwell-E machine was also running on dual-channel memory. Are there any testing notes and disclaimers available on the test systems used?
Blender is not that memory intensive,
especially with the scene used.
I actually expect very little bus activity in this test…
And since ALL AMD products (well, mainly CPUs) seem to be very weak at memory bus operations, Zen is not in the clear yet.
The hell it's not; just get some high-polygon mesh models going and some multi-texture materials, turn up some rendering settings, and watch the memory fill up. Try setting the color depth up on some large textures and watch the VRAM get maxed out along with the CPU's RAM.
AMD is damn good at designing memory controllers; just look at its GPUs, and GPUs eat a CPU's breakfast, lunch, and dinner at handling massive memory transfers. AMD has the IP and the brainpower to design good memory controllers.
Memory intensive, as in the ratio of instructions executed per memory bus read.
Blender, like most renderers, taxes the FPU units and branch prediction the most.
And no, AMD has implemented a very, very poor memory controller on all its CPUs for the past 5+ years.
On the GPU side it also seems Nvidia gets much better performance out of the same RAM configuration… but that's not as easy to quantify.
On the CPU, though, benchmarks make it very evident: AMD is a world away from Intel.
I disagree; high bandwidth and large amounts of memory usage have been a strong point for AMD.
I am into flight simulation; I design 3D models of aircraft.
It has been AMD or nothing for the past 5 years…
It is not as easy as you said. Nvidia gets better results with a comparable memory configuration, probably in part because of tile rendering. Still, it is obvious Nvidia has a slight advantage in this regard, and AMD acknowledged it themselves when HBM came out. That's why Polaris has such advanced DCC, matching Maxwell.
Looking at the scene AMD used in the demo, it is less complex than most scenes used in Blender benchmarks. Also, every Blender version, including custom-compiled ones, may give slightly different results depending on the optimizations and features added.
I wonder why AMD did not show Cinebench R15, the de facto benchmark AMD has been using for IPC in their PowerPoint slides. That one is much easier to compare, since plenty of Cinebench R15 benchmark data is available from various reviews.
AOTS, besides being IPC sensitive, is also memory bandwidth sensitive. That's why Skylake with DDR4 can pull ahead of Haswell with DDR3 by quite a margin, and why Haswell-E with DDR4 scored very high as well. Zen is supposed to be using DDR4 too. So yes, possibly some memory bottleneck from memory controller weakness in the AOTS test, which also affected the old FX-8350.
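For context on the DDR3-vs-DDR4 point, theoretical peak bandwidth is just channels times 8 bytes times transfer rate. A quick sketch with common speed grades (the configurations are assumed for illustration):

```python
# Theoretical peak DRAM bandwidth: channels * 8 bytes/transfer * MT/s.
# Speed grades below are common examples, assumed for illustration.

def peak_bw_gbs(channels, mega_transfers):
    return channels * 8 * mega_transfers / 1000.0  # GB/s

ddr3_dual = peak_bw_gbs(2, 1600)   # dual-channel DDR3-1600
ddr4_dual = peak_bw_gbs(2, 2400)   # dual-channel DDR4-2400
ddr4_quad = peak_bw_gbs(4, 2400)   # quad-channel DDR4-2400 (Broadwell-E class)

print(f"DDR3-1600 dual-channel: {ddr3_dual:.1f} GB/s")
print(f"DDR4-2400 dual-channel: {ddr4_dual:.1f} GB/s")
print(f"DDR4-2400 quad-channel: {ddr4_quad:.1f} GB/s")
```

That 50% theoretical gap between dual-channel DDR3 and DDR4 is why bandwidth-sensitive tests like AOTS reward the newer platforms.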
Blender hardly uses the memory:
https://twitter.com/FPiednoel/status/766358463974735872 “Ran a similar workload in Blender, it basically does fit in the L1 cache.”
Performance of some applications can be affected by quad-channel memory. Example: http://core0.staticworld.net/images/article/2015/09/memory_bandwidth_7zip_938-100613932-orig.png Perhaps AMD was running the 6900K with only 2 memory channels for the comparison.
Ryan,
Any word about PCIe lanes and/or chipset(s)?
What else was said about “next-gen IO”?
MRFS
^This.
At a certain point, CPU speed and core count stop being the defining factors, unless you are chasing the last 10% of performance.
However, if you can't do proper SLI/Crossfire, choke your I/O on x4 M.2 drives or on 3400+MHz DDR4, and don't do USB 3.1 Gen 2… what's the point? Last year's technology on a fast CPU?
The IO story needs to be really compelling for AMD to get wins.
Hope they have done the right thing.
Here’s some info: (TL;DR looks like CPU has 36 PCI-E lanes, and the chipset has some other nice features)
http://dresdenboy.blogspot.com/
(AMD’s ex-chip fab was in Dresden Germany)
2nd blog post shows the PCI information for the 8-core Zen:
There are 2 x PCIe x16 bridges and a PCIe x4 bridge that appear to be on the CPU. Below that, more x4 and x1 bridges (connections) are shown, but they might be sitting on the chipset rather than the CPU.
The chipset also has a USB 3.1 port via PCIe x4 (so it's probably the faster USB 3.1 Gen 2), and they show 4 x SATA 6Gbps ports. My guess is M.2 support will come via 1 connection to the CPU and up to 2 more connections to the chipset (looking at the bottom half). There's also a "SATA x16" at the bottom, but I'm not sure what that represents… SATA Express or similar.
If AMD’s Zen chipsets do not come with native support
for 4 x U.2 ports, THEN those chipsets should support
at least three x16 slots with full x16 PCIe 3.0 lanes:
2 for SLI/Crossfire and at least 1 for a modern
NVMe RAID controller (my continuing pet peeve).
We just refined a somewhat crude bandwidth comparison,
and our simplified version allows for an aggregate
controller overhead of 18.7%:
as such, 4 x U.2 SSDs in a hardware RAID-0
should have ~ the same raw bandwidth as DDR3-1600:
4 lanes per U.2 @ 984.6 MB/second = 3,938.4 MB/second
4 U.2 SSDs @ 3,938.4 = 15,753.6 MB/second.
DDR3-1600 x 8 = 12,800 MB/second
12,800 / 15,753.6 = 81.3% (i.e. 18.7% overhead)
Please advise if you confirm any error(s) above.
Thanks!
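The arithmetic above checks out; here it is as a quick script using the same figures as the comment:

```python
# Reproduce the 4x U.2 RAID-0 vs DDR3-1600 comparison from the comment.
PCIE3_LANE_MBS = 984.6               # usable PCIe 3.0 bandwidth per lane, MB/s

u2_drive_mbs = 4 * PCIE3_LANE_MBS    # one U.2 drive on x4 lanes
raid0_mbs = 4 * u2_drive_mbs         # four U.2 SSDs in RAID-0
ddr3_1600_mbs = 1600 * 8             # one 64-bit channel of DDR3-1600

overhead = 1 - ddr3_1600_mbs / raid0_mbs
print(f"4x U.2 RAID-0:  {raid0_mbs:,.1f} MB/s")
print(f"DDR3-1600:      {ddr3_1600_mbs:,} MB/s")
print(f"Overhead ratio: {overhead:.1%}")
```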
Your math is correct from a pure theoretical maximum bandwidth standpoint.. However:
The DDR interface is much lower latency than 4x SSDs over PCI Express for these reasons:
– DDR controller on CPU die
– SSD controllers on each drive
– PCI Express communications protocol overhead
– NAND flash latency
– PCI Express signals can propagate longer distances than RAM; necessitating a higher latency
– Additional signal propagation within the SSD itself
– The CPU is engineered to read/write directly to RAM (registers).. The overhead to request/write data to a block device requires a lot more CPU time.
– etc..
This means that while on paper 16 PCI-E lanes worth of U.2 SSDs have the same peak bandwidth as DDR, they'll never sustain anywhere near that bandwidth, purely due to the latency of communications.
Peak GB/sec is pretty useless for paper exercises because of this..
Something “after NAND flash” (ReRAM, Xpoint, etc) is what’s needed to achieve the speeds you’re thinking of..
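The latency argument can be made concrete with Little's Law: sustained throughput is roughly outstanding requests times request size divided by latency. A rough sketch, with round-number latencies assumed purely for illustration:

```python
# Little's Law sketch: sustained throughput = in-flight requests
# * request size / latency. Latencies are round-number assumptions
# for illustration, not measured device figures.

def throughput_mbs(queue_depth, request_kb, latency_us):
    bytes_in_flight = queue_depth * request_kb * 1024
    return bytes_in_flight / (latency_us / 1e6) / 1e6  # MB/s

# 4 KB requests at queue depth 1:
dram_like = throughput_mbs(1, 4, 0.1)    # ~100 ns DRAM-class latency
nand_like = throughput_mbs(1, 4, 100.0)  # ~100 us NAND SSD-class latency

print(f"QD1 4KB, DRAM-class latency: ~{dram_like:.0f} MB/s")
print(f"QD1 4KB, NAND-class latency: ~{nand_like:.1f} MB/s")
```

The roughly 1000x latency gap, not the lane math, is why the RAID array can't behave like RAM even when the peak numbers line up.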
I was also thinking about the new bottleneck
if/when ReRAM/XPoint is mated with 16 GT/s channels,
i.e. to "sync" with PCIe 4.0. The latter was
in the news this week.
Your thoughts? p.s. Thanks for the above.
MRFS
Please go back and take note that
my calculation was not seeking
“maximum theoretical bandwidth”
because I allowed for a specific
“aggregate controller overhead”:
our simplified version allows for an aggregate
controller overhead of 18.7%
Are you implying that the latter overhead
should be a larger number?
p.s. We’re waiting for Highpoint to
Power to the People!
If anything, the Zen chip might have been of the APU variety and they had OpenCL rendering enabled or something.
In any case, I hope they don't screw it up.
No Zen/Polaris or Zen/Vega APUs until next year! It's just Zen CPU cores this year into the first part of 2017, and even for the server market there will at first be Zen-CPU-only server SKUs, until later in 2017-2018 when the big fat HPC/server/workstation APUs on an interposer arrive, with their 16/32 Zen cores paired with a big fat Vega GPU die and HBM2.
The APU-on-an-interposer era is about to begin at AMD, and that will be something to see. I do hope Apple will drop Intel and go with some 16-Zen-core/big-Vega-die/HBM2 APUs on an interposer from AMD for Apple's Mac Pro refresh, maybe in late 2017!
Considering that AMD is only talking about 8C/16T pure CPUs for HDT, how on earth do you see 16C/32T APUs happening? Sure, we don’t know die sizes for Zen yet, and they’re doing 32C/64T for servers, but fitting 16 cores, a useful GPU (minimum 512 CU, but to match that CPU in a meaningful way, twice that or more), plus a useful amount of HBM on an interposer that doesn’t price this into oblivion? Within a reasonable power envelope (sub-200W)? Not going to happen.
I’m hoping for 8C/16T APUs, but all things considered, a 4C/8T APU with a bigger GPU, HBM and a 125-140W TDP would probably be a better selling product. Gaming PC on a chip? Yes please.
No, nada on any Zen APU. And CPUs suck at rendering, they suck at it badly, at least for rendering jobs on time frames that don't let the dust settle and pile up and the cobwebs appear.
CPUs are worthless for graphics work done fast, and way overpriced for the amount of FP performance a CPU can provide relative to a GPU. Hey everybody, just unplug all the GPUs from your PCs and see how the CPU really does in FPS for games, or in frames for other rendering jobs, without a GPU's help!
Hell, a GPU would have rendered that image so fast it would have appeared to be just a snapshot, with no human eye able to discern any rendering process going on.
Raytracing and rasterizing are very different.
Offline renderers are, at their core, based on managing scene graphs and rays. CPUs are better at that than GPUs. In some cases, doing a task on the GPU is too convoluted to even be practical.
GPUs are more geared toward shading, and toward shading groups of pixels that share identical branching patterns.
That's because branching and random memory access are not something even Pascal or Polaris do well at all.
Also, people often compare a 250W GPU against a 95W CPU, running software not optimized for the CPU's SIMD engine, to claim "GPU rules".
GPUs do offer a tremendous amount of raw floating-point computation,
but it can be a pain to extract any real-world performance out of it.
And if you look, GPUs and CPUs are "merging". AVX-512 is a big step toward that.
I will take a 6-core, 12-thread version with 40+ PCIe lanes for $399.
Who at PCPer is covering Hot Chips this year? Both AMD and Nvidia are going to be there: Nvidia talking about its latest Tegra/Denver designs (most likely for cars), and AMD talking about Zen, hopefully with some Zen whitepapers at Hot Chips. And where are the Polaris whitepapers from AMD?
Also, please publish all the slides, or links to the press kits with all the slides, for any AMD/Nvidia/other trade shows and conferences like this one and Hot Chips. Hot Chips should be covered every year, as it's the big conference for CPU/GPU technology, along with SIGGRAPH!
I've been setting aside $50 a paycheck in preparation for the Zen launch. I'm looking forward to building an ITX AMD system with their desktop processor rather than an APU, thanks to the unified AM4 socket. I really wish they would release sooner rather than later; I don't want to wait until Feb./Mar. like we had to for Kaveri.
I’M SO EXCITED!
I told myself that I would not upgrade my 4770K until I could get a proper 8-core for under $500. This sounds like the competition needed to make that happen sooner rather than later.
I’m so ready.
I was ready to get a 6900K and a new motherboard, probably $1300-$1400 USD in total, to replace my aging (and struggling) i7 3770K that can't even record gameplay without dropping 50% or more of my FPS, or encode two videos at the same time without struggling and degrading overall system performance.
If AMD has got something good, then I am willing to reciprocate in kind.