Trinity Finally Comes to the Desktop
We finally have the FULL review of AMD’s Trinity Products.
Trinity. Where to start? I find myself asking that question, as the road to this release is somewhat tortuous. Trinity, as a product code name, came around in early 2011. The first working silicon was shown that Summer. The first actual release of product was the mobile part in late Spring of this year. Throughout the summer notebook designs based on Trinity started to trickle out. Today we cover the release of the desktop versions of this product.
AMD has certainly had its ups and downs when it comes to APU releases. Their first real APU was Zacate, based on the new Bobcat CPU architecture. This product was an unmitigated success for AMD. Llano, on the other hand, had a pretty rocky start. Production and various supply issues caused it to be far less of a success than hoped. These issues were oddly enough not cleared up until late Spring of this year. By then mobile Trinity was out and people were looking towards the desktop version of the chip. AMD saw the situation, and the massive supply of Llano chips that it had, and decided to delay introduction of desktop Trinity until a later date.
To say that expectations for Trinity are high is an understatement. AMD has been on the ropes for quite a few years in terms of CPU performance. While the Phenom II series was at least competitive with the Core 2 Duo and Quad chips, it did not match up well against the latest i7/i5/i3 series of parts. Bulldozer was supposed to erase the processor advantage Intel had, but it came out of the oven as a seemingly half-baked part. Piledriver was designed to succeed Bulldozer, and is supposed to shore up the architecture to make it more competitive. Piledriver is the basis of Trinity, and it does sport significant improvements in clockspeed, power consumption, and IPC (instructions per clock). People are hopeful that Trinity will be able to match the performance of current Ivy Bridge processors from Intel, or at least get close.
So does it match Intel? In some ways, I suppose. How much better is it than Bulldozer? That particular answer is actually a bit surprising. Is it really that much of a step above Llano? That question has a somewhat surprising answer as well. Make no mistake, Trinity for desktop is a major launch for AMD, and their continued existence as a CPU manufacturer depends heavily on this part.
Last week we showed off the basic gaming performance of Trinity, and there was more than a little bit of controversy surrounding that release. While the gaming tests showed the A10 5800K to be head and shoulders above everything else in the world (at least in the integrated GPU world), there were those who thought that AMD was trying to dictate reviews to give readers a far too positive impression of Trinity’s performance. Our opinion was that we wanted to get performance data out there as soon as possible, but we also warned readers that there was more to Trinity than just good graphics performance.
The Technology Behind Trinity
We covered Trinity this past Spring when the mobile parts were released, but we can certainly take a quick refresher on what makes up the processor.
There are many major upgrades in this latest APU compared to the previous Llano. The big two are the move to the Piledriver micro-architecture, replacing the older-generation “Husky” cores that served Llano and basically originated with the Athlon II series, and a greatly changed graphics portion that uses the newer VLIW 4 architecture rather than the older VLIW 5.
Going to the Piledriver core should allow AMD a larger range of power envelopes than Llano featured. Power control is much more finely grained throughout the design, and the overall architecture is slightly more power efficient per clock than the previous generation. AMD also included all of the new bells and whistles when it comes to instruction set extensions such as AVX, FMA4, and FMA3, along with other new operations that should improve performance. Piledriver is a reworked Bulldozer design, and improvements under the hood help not only IPC but also power characteristics. Piledriver should be able to clock higher, achieve higher IPC, and still stay at the same or lower TDP for a given module count and clockspeed.
VLIW 4 was originally introduced with the HD 6900 series of graphics chips and features greater stream unit utilization than the previous VLIW 5 architecture when it comes to DX10 and DX11 workloads. While there are fewer stream units overall (384 vs. 400 in Llano), the redesigned unit is able to run at a higher clockspeed and is more efficient per clock in most workloads. The A8 3870K had a GPU clockspeed of 600 MHz, while the A10 5800K is at 800 MHz.
GPGPU applications should also run faster on this particular unit due to all of the internal changes and the move to VLIW 4. AMD has been much more aggressive as of late in tuning their OpenCL performance, and Trinity can achieve a pretty impressive 763 GFLOPs of performance at the top end. This is up from the 500 GFLOPs area of Llano.
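To put the stream unit counts and clockspeeds above in perspective, here is a rough back-of-the-envelope sketch of theoretical peak GPU throughput. It assumes each stream unit retires one multiply-add per clock, counted as two floating point operations; that assumption and the function name are mine, not something AMD's materials spell out here.

```python
def peak_gpu_gflops(stream_units: int, clock_mhz: float,
                    flops_per_unit_per_clock: int = 2) -> float:
    """Theoretical single-precision peak in GFLOPS.

    Assumption: each stream unit retires one multiply-add per clock,
    counted as 2 floating-point operations.
    """
    return stream_units * flops_per_unit_per_clock * clock_mhz / 1000.0


# Stream unit counts and GPU clocks as quoted in the text above.
llano_a8_3870k = peak_gpu_gflops(stream_units=400, clock_mhz=600)
trinity_a10_5800k = peak_gpu_gflops(stream_units=384, clock_mhz=800)

print(f"A8 3870K GPU peak:  {llano_a8_3870k:.1f} GFLOPS")
print(f"A10 5800K GPU peak: {trinity_a10_5800k:.1f} GFLOPS")
print(f"Uplift: {trinity_a10_5800k / llano_a8_3870k:.2f}x")
```

Under that assumption the GPU alone works out to about 480 GFLOPS for the A8 3870K and about 614 GFLOPS for the A10 5800K, a 1.28x uplift despite the lower stream unit count. The whole-chip figures quoted above are higher, presumably because they also count the CPU cores.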
Trinity also features an updated UVD unit (Universal Video Decoder) which offloads even more work from the CPU by natively supporting operations from MVC, DivX, and MPEG-4. Outputs also get an upgrade as Trinity can natively support up to 3+1 monitors. Yes, Trinity can do Eyefinity if a user wants to.
Internally things are a lot more beefy when it comes to interconnects. There is a lot more bandwidth internal to the chip than ever before with an AMD APU. This should allow better communication from the CPU cores to the GPU, as well as a greater utilization of available bandwidth from main memory.
Trinity will come in two general categories for the desktop: higher end 100 watt parts and more mainstream 65 watt units. There will also be Trinity units branded as Athlon X4 that do not have a working graphics portion.
Turbo Core 3.0 is another upgraded function that allows the different units to speed up and slow down, depending on utilization. If an application is CPU heavy, then the CPU will be allocated extra power and will clock into Turbo Mode, while the GPU downclocks so that the overall TDP does not go over the limit. If an application is more GPU heavy, then that part is allowed more power and can be run at full speed while the CPU is downclocked. This little dance happens in milliseconds on the processor and there is dedicated logic used to monitor the cores and their usage.
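The give and take described above can be pictured with a toy model. This is only an illustrative sketch and not AMD's actual Turbo Core logic: the proportional split, the 10 watt floor, and every name in it are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class PowerSplit:
    cpu_watts: float
    gpu_watts: float


def split_budget(tdp_watts: float, cpu_util: float, gpu_util: float,
                 floor_watts: float = 10.0) -> PowerSplit:
    """Divide a fixed TDP budget between CPU and GPU in proportion to load.

    Hypothetical model: each side always keeps `floor_watts`, and the
    remaining budget follows relative utilization (values from 0.0 to 1.0).
    """
    if not (0.0 <= cpu_util <= 1.0 and 0.0 <= gpu_util <= 1.0):
        raise ValueError("utilization must be between 0.0 and 1.0")
    spare = tdp_watts - 2 * floor_watts
    if spare < 0:
        raise ValueError("TDP is too small for the requested floor")
    total = cpu_util + gpu_util
    # With both sides idle, split the spare budget evenly.
    cpu_share = 0.5 if total == 0 else cpu_util / total
    cpu_watts = floor_watts + spare * cpu_share
    # The GPU gets whatever is left, so the total never exceeds the TDP.
    return PowerSplit(cpu_watts, tdp_watts - cpu_watts)


cpu_heavy = split_budget(100.0, cpu_util=0.9, gpu_util=0.1)
gpu_heavy = split_budget(100.0, cpu_util=0.2, gpu_util=0.8)
print(f"CPU-heavy load: CPU {cpu_heavy.cpu_watts:.0f} W, GPU {cpu_heavy.gpu_watts:.0f} W")
print(f"GPU-heavy load: CPU {gpu_heavy.cpu_watts:.0f} W, GPU {gpu_heavy.gpu_watts:.0f} W")
```

The point of the sketch is simply that the two halves of the chip draw from one shared budget, so extra power for one side has to come out of the other. The real hardware makes this decision with dedicated monitoring logic on a millisecond timescale.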
Trinity is fabricated by GLOBALFOUNDRIES on their 32 nm PD-SOI/HKMG process. This is by now a very mature process with good yields and well known characteristics. This is probably another reason why Trinity should be more efficient overall.
did you turn off turbo core in bios?
also I’m sure a bios update is on the way
also make sure you use a good heatsink cause it will throttle
Yea, Tigerdirect has them
I think the units for SiSoft Sandra memory bandwidth results should be GB/s, not MB/s.
so what was the power usage when OC’ed to 4.4ghz ?
I will be doing more testing of overclocked performance throughout the week. Apparently there are a few settings that are deleterious to good overclocked results and that we were unaware of (like the above-mentioned Turbo Core), and there are others as well. Once I get a better lock on overclocking, I will throw up some numbers here.
They need to drop these into some x86 tablets. That’s where the real advantage comes I think.
They also need to do a deal and get these in tv or tv boxes.
Agree
Great review, thanks again Josh.
I was afraid this would happen again. I really want AMD to pull ahead or at least catch up.
Barely keeping up with an i3 isn’t keeping my hopes up. I would buy a new AMD CPU in a heartbeat if it could even come close to an i7. I want AMD to win dammit.
I realize that they are constantly working on the next CPU. I really hope somewhere down the line they just stop working on this architecture and go back to the Phenom II architecture (K10 I think), or at least re-release it alongside Bulldozer.
Is that possible? Or just too costly?
To this day they are still faster, just more power hungry. Not much of a down side IMO.
Why aren’t they pumping more cores in stuff? That would help a little, wouldn’t it?
EDIT: oooh, didn’t see the tablet comment, that would be a great idea. Trinity Win8 tablets would be a huge advantage for AMD. I’m sure they can keep up with an Atom.
I don’t want to rain on your parade, but I really have to disagree with a few things you said. Going back to the Phenom II architecture = throwing in the towel and going out of business. AMD is NOT faster and more power hungry. It’s slower and more power hungry. Why aren’t they pumping more cores? They did do that with Bulldozer, and it has not delivered the performance MOST people want (some small niche groups may still appreciate 8 cores).
You misunderstood what I meant. Sorry, to be clear: I was talking about the Phenom II architecture being faster and more power hungry in relation to AMD’s new CPUs (it still pulls far ahead in most of the benchmarks).
It’s quite obvious they are slower and less efficient than Intel’s offerings.
I bring up going back to the Phenom II architecture so that AMD can actually compete, instead of being the “Affordable” CPU manufacturer and not competing at all.
I was thinking more along the lines of 10 cores minimum (clearly 8 wasn’t enough), or perhaps I’m being naive thinking that you can just throw more power at the problem.
For the record, I would be SO SOLD on a 10 or more core consumer part. It’s just cool IMO, fastest or not.
This is my belief based off of their releases so far.
They knew where they were at in generalized ‘speed’ or processing power, and knew their architecture could not continue scaling into higher clocks and higher core counts and keep up with hyperthreading. I really think AMD tried to make an architecture that scaled well for multiple cores and high clock speeds. I think their processor lines show they succeeded in this; it is almost arbitrary for them to include more cores.
Now they are optimizing that core architecture (increased Instructions Per Clock, etc.) and that is the effort that will bring them back into the performance game. Running more real cores, higher overclocks (5GHz+ wooo!), and close to or better IPC will make people wonder WTF Intel has been doing. They have licensed Intel’s 3D trigate patents so they can continue scaling down with their manufacturing processes too.
AMD chose the less risky route here by optimizing their core count scalability first and then optimizing core operation, but it is also the longer, slow and steady route. Trying to create optimized high clock, high IPC cores first would have been MUCH riskier.
They would either be back to dual core counts, with the possibility that the design does not scale into high core counts, in which case they would have to scrap that work when trying to design a high core count processor.
Or they would try to design an architecture that does it all, and having all of that come together in one processor design is very risky in terms of schedule and design performance, with a high possibility of leaving them with a mediocre architecture they are unable to build upon to continue competing.
Just to complete the picture, it is rumored that AMD tried to solve the major bottlenecks of Bulldozer with Piledriver: scheduling and L2 cache speed. There are some AIDA64 memory benchmark results at xbitlabs; they show that in most tests Bulldozer and Piledriver are similar, but the L2 read bandwidth of Piledriver has risen dramatically, almost doubling: http://www.xbitlabs.com/articles/cpu/display/amd-a10-5800k_4.html#sect0
This could be a major reason (along with double the scheduling queue) why the A10 that has no L3 cache managed to stick quite close to the FX 4170.
Anyway, this probably doesn’t bring enough of a performance benefit to make Piledriver look good, but as far as I remember this is the first time since Clawhammer (anyone remember that one?) that AMD hasn’t degraded the speed or latency of the cache on a next-gen CPU. Let’s hope the L3 in Vishera continues on this route.
Another great review from a great man.. Well close enough 😛
Still, as the review goes, I do wonder how much the lack of L3 cache affects performance. I know AM3+ Piledriver is still a few weeks away; I just hope it will make an upgrade from my 1090T worth it.
Any speculation on how much L3 cache could improve CPU performance? 5-10%?
Is there a way to test Bulldozer cores with L3 disabled and enabled to see how much of a performance hit there is?
Now back to Trinity. The new socket is a killer for me, but it is nice that AMD acknowledged this and stated that there will be at least one more CPU architecture (Steamroller?) for the socket before AM3+ and FM2 fuse into one socket.
If only Blender3D supported OpenCL rendering on AMD GPUs, I would be inclined to consider this setup. For now only mobile Trinity seems to make sense.
The results of your Cinebench R10 run for the i3 2105 are MUCH, MUCH LOWER than the normal score…
Please re-run the benchmark, and make sure to test it in 64-bit mode.
We are redoing tests here and will look into it.
Thanks!
This is a very narrow product line – nothing more than quad-core? The price varying only from $71 to $122? No part less than 65W? Aimed almost only at gamers.
Obviously the objective here was to get the FM2 socket exactly right so it can remain stable until late 2014. By that time, Thunderbolt has either replaced all these various nasty display ports, or it has not. Also by that time, PCIe 3.0 x32 devices will be more common and the full 52GB/s of the HTX bus should be available at the DDR3, DDR4 or DDR5 RAM interface. And, 10 gigabit ethernet will either have come down in price (thanks to Thunderbolt) or be irrelevant to desktops and NAS (thanks to Thunderbolt). By 2015, with these existing buses all maxed out, and the IEEE P1905.1 standard settled so that things like powerline networking’s interface to the PSU can be settled, and Thunderbolt vs. DisplayPort vs. 10 gigabit Ethernet settled somehow… FM3 or whatever can be stable from 2015 to 2018.
Maybe longer. The desktop will be dead by then, and the mainboard is going to be sitting in your wall near your electrical box or cable head, and talking to your refrigerator as much as to your TV.
Expect some ARM cores (for very low power idling) in the very next FM2 processor release. That’s the only way to respond to Haswell. Expect also some X6 X8 X10 and X12 processors in that lineup, and a few low-power options below 50W (with the ability to rely on ARM core to respond to routine network and device events to keep that power draw much lower in practice). Much more price variations, perhaps from $50 to $200 or even $300.
Given the graphics performance of these October 2012 chips though, it’s entirely reasonable to rely on the embedded graphics and use the PCIe x16 slot for a PCIe SSD – basically equivalent to slower RAM given the FM2 direct chip connection. Imagine 100GB RAM or 250GB RAM for a few hundred bucks (some OCZ PCIe SSDs sell for as little as $2/GB so that’s $200-$500, same as a good video card).
Given the excellent multi-core performance of database engines, and the very low price of these chips, it’s possible you could see lots of FM2 processors used in database hosts. Especially if there is a way to use OpenCL to do the processing on all those shader cores…?
This is a good preview for Vishera Piledriver. When that time comes, can we see the desktop Piledriver Vishera review with a Core 2 Quad (like the Q6600), i7 920, 2600K, 3770K, and Phenom II quad and hex (and of course old FX)?
AMD’s Trinity platform is a good platform. Yes, it trades blows with the i3, with Intel’s chips hitting hard when it comes to single threaded applications. However, AMD hits Intel hard on entry level gaming.
When it comes down to power consumption, I feel that the whole story isn’t being published. Intel’s HD 4000 just doesn’t cut the mustard for games and basically requires an external video card to edge out the AMD APU. With that said, I have yet to see a power consumption table showing what the i3 or i5 draws with an external video card. AMD’s APU already has a full blown video card on die and reflects it in its power consumption. Intel’s on-die GPU is there to show lower power consumption on the charts, knowing full well that no one in their right mind would run it that way.
The truly sad thing about these tests is the fact that no one in their right mind uses integrated CPU graphics to play games. Yet AMD seems happy to beat Intel in these, since it’s the only thing they can win in.
All kinds of people in their right mind play games with integrated graphics. My children can play all their DC Universe, Roblox, Hero-Up, and all kinds of other 3D games which run perfectly fine on integrated graphics. They don't play Crysis, or BF3, or Skyrim, but the games they play are designed from the ground up to be played comfortably on integrated graphics. I'm actually impressed by how DC Universe looks on an APU. Plays pretty smooth, looks good, and it is a cheap platform for users.
keep us updated on the overclock hunt
have you tried a better cooler?
cpu OC? igpu OC? memory OC?
how does it perform when overclocked?
be cool to do a video review of the overclock 😉
Looks like the Llano 3870K is a much better buy at this stage. One can just clock its GPU to 900 MHz and blow away all the Trinity A10-5800K benchmarks above! Llano does not seem to perform as well on the CPU side when clocked to 3.6 GHz from 3 GHz. The fact that Trinity is so highly clocked means that the K parts are pretty useless, as the OC headroom is very small. Shown here is only 4.4 GHz max, which is really poor. This shows that AMD is leaving only a small headroom “buffer” in their Trinity chips. Intel chips, however, can be clocked to such frequencies fairly easily!
Hi,
I own an AMD A8 5600K and I want to ask you which video card to choose from these:
http://www.msi.com/product/vga/R5770-Hawk.html
http://www.asus.com/Graphics_Cards/AMD_Series/EAH6770_DC_SL_2DI1GD5/
http://www.msi.com/product/vga/R6670-MD1GD5.html#?div=Specification (dual fan variant)
http://ro.asus.com/Graphics_Cards/AMD_Series/EAH6670DIS1GD5/
I want to know if I can use all of them in Dual Graphics mode. AMD recommends the HD6570 and HD6670 for this processor.
THX!