It is common knowledge that computing power consistently improves over time as dies shrink to smaller processes, clock rates increase, and processors do more and more in parallel. One thing people might not consider: how fast is the architecture itself? Think of computing in terms of a factory. You can speed up the conveyor belt and you can add more assembly lines, but just how fast are the workers? There are many ways to increase the efficiency of a CPU: from tweaking the most common instructions, or adding new instruction sets that let the task itself be simplified, to playing with the pipeline length to balance keeping the CPU constantly fed with upcoming instructions against having to dump and reload the pipe when it goes the wrong way down an IF/ELSE statement. Tom’s Hardware wondered about exactly this, and tested a variety of processors released since 2005 with their settings modified so that each could only use one core and could only be clocked at 3 GHz. Can you guess which architecture failed the most miserably?
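As a back-of-envelope illustration of why that setup isolates the architecture (our framing, with made-up IPC numbers for the sake of the example, not figures from the test): runtime is instructions divided by IPC times frequency, so with frequency and core count pinned, any remaining gap is pure instructions-per-clock.

```c
#include <stdio.h>

/* Toy model of a clock-for-clock test: pin frequency and core count,
 * and any remaining runtime difference can only come from IPC
 * (instructions per clock). The IPC values below are illustrative
 * guesses, not measured numbers. */
static double runtime_s(double instructions, double ipc, double hz) {
    return instructions / (ipc * hz);
}

int main(void) {
    const double work = 9e9; /* a hypothetical 9-billion-instruction task */
    const double hz   = 3e9; /* both "CPUs" locked at 3 GHz */
    printf("IPC 0.5 (Netburst-ish): %.1f s\n", runtime_s(work, 0.5, hz));
    printf("IPC 1.5 (Core-2-ish):   %.1f s\n", runtime_s(work, 1.5, hz));
    return 0;
}
```

Same clock, same core count, a 3x gap in runtime: that gap is the microarchitecture, and it is what the minutes-based charts below are measuring.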
Pfft, who says you ONLY need a calculator?
(Image from Intel)
The Netburst architecture was designed to reach very high clock rates at the expense of heat, and of performance. At the time, the race between Intel and its competitors was clock rate: the higher the clock, the better for the marketers, despite a 1.3 GHz Athlon wrecking a 3.2 GHz Celeron in actual performance. If you are in the mood for a little chuckle, this marketing strategy fell apart when AMD decided to name its processors “Athlon XP 3200+” and so forth, rather than by their actual clock rate. One of the major reasons Netburst was so terrible was branch prediction. Branch prediction is a strategy for speeding up a processor: when it reaches a conditional jump from one chunk of code to another, such as “if this is true do that, otherwise do this,” it does not know for sure what will come next. Pipelining is a method of loading multiple upcoming instructions into a processor to keep it constantly working. Branch prediction says “I think I’ll go down this branch” and loads the pipeline assuming that is true; if the guess is wrong, the pipeline must be dumped and the mistake corrected. One way the Pentium 4’s Netburst kept clock rates high was a ridiculously long pipeline, two to four times longer than that of the first-generation Core 2 parts which replaced it; unfortunately, the Pentium 4’s branch prediction was terrible, leaving the processor perpetually stuck dumping its pipeline.
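If you want to feel the mispredict penalty yourself, the classic demonstration is summing only the large values in an array, first in random order and then sorted. Here is a minimal C sketch of that experiment (ours, not part of Tom’s methodology). Compile without aggressive optimization (e.g. -O1); at -O2/-O3 the compiler may replace the branch with a branchless conditional move and hide the effect.

```c
/* Minimal sketch: the same loop runs much slower on unpredictable data,
 * because every mispredicted branch flushes the pipeline. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 20)

static long sum_over_threshold(const int *data, int n) {
    long sum = 0;
    for (int i = 0; i < n; i++) {
        /* The CPU predicts this branch; random data makes it guess
         * wrong about half the time, paying a pipeline flush each miss. */
        if (data[i] >= 128)
            sum += data[i];
    }
    return sum;
}

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

int main(void) {
    static int data[N];
    for (int i = 0; i < N; i++)
        data[i] = rand() % 256;

    clock_t t0 = clock();
    long s1 = sum_over_threshold(data, N);   /* random order: mispredicts often */
    clock_t t1 = clock();

    qsort(data, N, sizeof data[0], cmp_int); /* sorted: branch becomes predictable */
    clock_t t2 = clock();
    long s2 = sum_over_threshold(data, N);
    clock_t t3 = clock();

    printf("random: %ld (%.0f ms)  sorted: %ld (%.0f ms)\n",
           s1, 1000.0 * (t1 - t0) / CLOCKS_PER_SEC,
           s2, 1000.0 * (t3 - t2) / CLOCKS_PER_SEC);
    return 0;
}
```

On a pipeline as deep as Prescott’s 31 stages, each of those mispredicts costs roughly a full pipeline refill, which is exactly the perpetual dump-and-reload described above.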
The sum of all tests… at least time-based ones.
(Image from Tom’s Hardware)
Now that we have excavated Intel’s skeletons and aired them out, it is time to bury them again and look at the more recent results. On the AMD side there does not appear to have been much innovation in efficiency: AMD is only now getting within range of the per-clock architectural efficiency Intel had back in 2007 with the first introduction of Core 2. Obviously, efficiency per core per clock means little in the real world, as it tells you neither the raw performance of a part nor how power-efficient it is. Still, it is interesting to see how big a leap Intel made when it walked away from that turkey of an architectural theory known as Netburst and modeled the future on the Pentium 3 and Pentium M architectures. Lastly, despite the existing lead, it is interesting to note just how much work went into the Sandy Bridge architecture. Intel, despite an already large lead and a focus beyond the x86 mindset, still tightened up its x86 architecture by a very visible margin. It might not be as dramatic as the abandonment of the Pentium 4, but it is laudable in its own right.
You should have thrown up an Atom and Llano CPU for good measure.
I had a P4 3.2 GHz Netburst-architecture CPU and it was fast as f at gaming, but now it’s crud. Just goes to show how game programming has advanced to the point of massive prediction algorithms, just adding to latency :(
PS. Looks like AMD may have a better future in the desktop gaming market :) well let’s HOPE so anyway.
Limiting the system to 3 GHz just nullifies the test, in my opinion.
The newer architectures can handle far more because they do things so much faster, and multi-core design is a MAJOR aspect of the innovation in PC hardware development. Single-core testing, again, nullifies what could otherwise be a reasonable comparison.
For the sake of comparison, it would be marvelous to see a price/minute or price/performance graph as well, showing launch prices, or pricing six months after launch, compared across each generation.
It actually doesn’t nullify the test; it just measures something else. This says nothing about the performance of any part at any given point in time… just *what* the architectures do to get higher performance.
Compared with a modern CPU, any CPU of the Netburst era is very slow. That was the technology at that point. I doubt that an FX-55 or even an FX-57 (the top CPUs of that time) would change anything in this chart.
Netburst was indeed bad when launched, but in time it became competitive with AMD, even though it launched back when AMD was making the Athlon / Athlon XP. Netburst continued to keep up with AMD in apps (not gaming) even after the A64 launched.
In my opinion Netburst wasn’t so bad at all, but it was far from what Intel was capable of producing in those years.
So you’ve got a 2005 Netburst up against AMD parts from three years later as the nearest comparison…
WHY? WHY TAKE 3 YEARS TO SHOW ANYTHING BEATING IT?
I DO BELIEVE IT IS ANOTHER AMD FANBOY LIE… THE CHART IS SO LAME I CANNOT BELIEVE ANYONE ACCEPTS IT AS PROOF.
Can you read? The point is to show how bad a µarch design Netburst was, not to prove that chips three or more years newer can beat it (“just another Intel lie,” only using newer chips to beat older Pentium 4s).
This comparison is clock-for-clock on a single core, and the chart also has a 2007 dual-core Athlon 64 in it, but again using only a single core, so it’s essentially a 2003/4/5 Athlon chip slapping the P4 around like the POS it was.
Was there ever a good reason why they keep doing benchmarks with the worst revisions of the Pentium 4? Willamette and Prescott are the worst revisions of the Pentium 4, and yet people still use those two pieces of shit to get their results…
If anyone had done actual research, Northwood and Cedar Mill are the best revisions of the Pentium 4. But what we have here is Prescott, Prescott, Prescott.
The general knowledge a person has of the Pentium 4 is that it runs hot and is slow, even at 3 GHz… but there was Northwood.
The Northwood at 3 GHz (not counting the 400 MT/s and 533 MT/s versions) was quite competitive back in the Athlon XP days of Barton and Thoroughbred-B.
Heck, I’d even pick a Northwood if I had a good cooling system to start out with; the Athlon XP just wasn’t worth it for all-out performance.
But after Northwood came Prescott… Intel just decided to attempt another clock-speed increase by stretching the pipeline to 31 stages.
Things went downhill from there, until Cedar Mill showed up on its 65 nm process. It’s technically a Prescott with less heat output (allowing for higher overclocks); some or most Cedar Mills could get above 4 GHz, like most Prescotts, without requiring a good cooler on air.
Now, this comment wasn’t meant to say how good the Pentium 4 is; it’s just here to give people more insight into the Pentium 4’s revisions.
Honestly, is a Northwood or a Cedar Mill (D0 stepping with a 65 W TDP) that hard to find?
Good point about the Cedar Mill. The Cedar Mill P4s, especially the 651 with D0 stepping, were solid chips. Unfortunately, by the time they came out, the Core 2 series was just around the corner.
Another day, another useless article from Tom’s.
Prescott 2M (2005) is 50% slower than Brisbane (2008)
Brisbane (2008) is 42% slower than Sandy Bridge (2011)
…
Sandy Bridge vs. 8086: Sandy Bridge forfeits as it cannot run at 5MHz. 8086 wins.
Basically, Tom’s just proved that every µarch is bad compared to what comes years later, and that the pace of advance is slowing. You want to show something is bad? Compare it to something from the same time period, like the stellar K7, or at least to some CPUs from the same year. But no, Tom’s goes and makes a completely irrelevant comparison because reasons.
Oh, and that “limiting the number of active cores to one”: no wonder they dedicated one line to describing what they did, since they have no idea how it works: what exactly is disabled, what happens to the shared cache, the memory controller, etc.
This is definitely the shoddiest comparison methodology I have seen in a high-profile article as far back as I can remember.
The conclusion is right, but the way they got there just made me deduct every single point Tom’s ever gathered with me. Just as well; both Tom’s and their sister site Anandtech seem to be full of amateur bloggers these days.
“These days”
The article is from 2011, mate. No need to be so angry; the world isn’t the cold, dead place you think it is.