Revit 2019 and IOMeter
This page contains two benchmarks that we have recently added to our CPU reviews. We are always looking for benchmarks that provide more of a real-world look at performance instead of just synthetic workloads, and we are excited about these tests. That being said, we always welcome feedback and suggestions on our benchmark suite!
Revit 2019 – RFO Benchmark
RFO Benchmark is a community-developed series of scripts that can be used to evaluate system-level performance in Revit, the popular building information modeling software from Autodesk used primarily by those in architecture and related fields.
This benchmark tests several different aspects of Revit functionality, which stress the system in various ways including the following subtests:
- Update (converting a file to the Revit 2019 format from a previous version)
- Model Creation
- Export (converting the document to a format for raster printing)
- 3D Render
- Graphics (refreshing and rotating within the viewport of Revit)
While the Intel i9-9980XE leads the AMD Ryzen Threadripper processors in the Graphics and Update file tests, the 32-core AMD processor manages to eke out a 7% lead in the Render test, and 5% lead in the Model Export test.
IOMeter
One of the more overlooked aspects of single-threaded performance is disk throughput. As disks get faster and faster, a single application thread can struggle to achieve full bandwidth in some scenarios.
To evaluate this, we are running a RAM disk of 1024MB on the host machine using the SoftPerfect RAM Disk application, and then testing throughput in IOMeter with a 100% random 4K workload. Since a RAM disk is the fastest possible storage device, this will give us an idea of how IOPS scale with single-threaded processor performance.
While the i9-7980XE and i9-9980XE are within a few percentage points of each other, the i9-9980XE has a massive 90% lead in IOPS over the Threadripper 2990WX. Still, the consumer-oriented i9-9900K has a 20% advantage over the 18-core CPU.
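To make the idea of a single-threaded random 4K test concrete, here is a minimal Python sketch. It is not IOMeter and not the configuration used for the results on this page: the function names, the scratch file standing in for the RAM disk, and the choice of reads rather than writes are all assumptions made for illustration. It only shows the shape of the measurement, in which one thread issues block-aligned 4K reads at random offsets and the completed operations are counted.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096  # 4K transfers, matching the workload size described above


def random_read_iops(path, duration_s=1.0, block_size=BLOCK_SIZE, seed=0):
    """Issue random block-aligned reads from a single thread; return reads per second."""
    blocks = os.path.getsize(path) // block_size
    if blocks == 0:
        raise ValueError("file is smaller than one block")
    rng = random.Random(seed)
    completed = 0
    # buffering=0 skips Python's own buffer; the OS page cache may still serve the reads.
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        deadline = start + duration_s
        while time.perf_counter() < deadline:
            f.seek(rng.randrange(blocks) * block_size)
            f.read(block_size)
            completed += 1
        elapsed = time.perf_counter() - start
    return completed / elapsed


def make_test_file(size_bytes):
    """Create a scratch file (hypothetical stand-in for a file on the RAM disk volume)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    path = make_test_file(16 * 1024 * 1024)
    try:
        iops = random_read_iops(path)
        # Throughput follows directly from IOPS times the transfer size.
        print(f"{iops:,.0f} IOPS, {iops * BLOCK_SIZE / 1e6:.1f} MB/s")
    finally:
        os.remove(path)
```

Because everything runs on one thread, the number this reports is bounded by how quickly a single core can get through the seek/read path, which is the effect the test above is meant to expose. Interpreter overhead means the absolute figures will be far below what IOMeter reports.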
GN shows that the 7980XE has higher OC headroom with delidding, but straight out of the box the 9980XE is only SLIGHTLY better in a couple of workloads. Otherwise the only significant change is literally in the name of the product: they changed the 7 to a 9 and kept the same price.
That increase in power consumption over the 7980XE looks a bit suspect.
I agree. I tested both of the CPUs again this morning to verify the numbers before publishing the review though!
Crazy. Maybe a bum sample?
I don’t find it that hard to believe – this is what, a hybrid of the 2nd and 4th Tick in Intel’s cadence? They are hitting 14nm for every last drop of performance, so power usage can only increase.
I would not leave the first generation Threadripper 1950X results out of the lineup as the Price/Performance metric on first generation Threadripper is even greater now with first generation TR SKUs selling for well below their introductory MSRPs.
Also, on price/performance for workstation-type multi-core workloads, there are the 16-core Epyc/Naples dual-socket CPU SKUs, with the Epyc 7301 selling online in the $850-$950 range. So that’s two Epyc 7301s for around $1700 (at current deal pricing), and low-end dual-socket Epyc/SP3 motherboards are running in the $450-$650 range, with 16 total memory channels (8 per socket) at one DDR4 DIMM slot per memory channel and 128 total PCIe lanes. So for memory-bandwidth-hungry workloads, that is two 16-core Epyc CPUs and 16 full channels of memory bandwidth across two sockets. It all depends on whether you need Epyc/SP3 and motherboards that are actually tested and vetted/certified for ECC memory usage for your workloads.
I’ll bet the first-generation Epyc/Naples deals are also going to get better by mid-2019 as the Epyc/Rome SKUs begin to arrive for online purchase.
It all depends on your workload requirements. That Intel HEDT part has higher CPU and memory clocks relative to the Xeon/Epyc parts (which have more memory channels), but less total effective memory bandwidth, with the i9-9980XE being only quad-channel and not supporting ECC memory. And the Threadripper/X399 motherboards are not really vetted/certified to support ECC memory, even though ECC memory can work on some TR motherboards.
The AMD Ryzen-1/TR-2 SKUs appear to be on sale pretty often below MSRP, while all the first-generation Ryzen/TR parts are always selling below MSRP. What little stock of i9-9980XE SKUs can be found online is currently selling at above $2000. The lower-core-count Epyc/Naples CPU SKUs can be doubled up on dual-socket motherboards and still offer 32 cores and 16 total memory channels for some crazy good total effective memory bandwidth. And if you are pricing ECC memory, it is possible to save if your motherboard supports 16 memory channels by populating the channels with lower-cost, low-capacity 4GB/8GB ECC DIMMs and still have 64GB-128GB of total available ECC memory.
That power consumption.
Wow.. AMD TR 2990WX in highly multithreaded benchmarks (Blender, Cinema 4D, etc.) is outdoing Intel’s top chip at similar power.
When was the last time one could say that about Intel vs AMD?
Can’t wait to see what Zen2 delivers. Intel really needs to step up the game with the next release.
1. not a chip i would recommend at all. 2. more cores for more real multi-thread, multitasking workloads. these reviews keep leaving out true multitasking workloads. no one does one thing at a time on a multicore workstation pc or even a consumer level chip. 3. the benchmarks mostly favor intel, because they are written with intel in mind. i have not seen many benchmark apps say they have updates for the ryzen/threadripper family of chips. 4. i say this again and over and over: who is using an app that uses single threads? so why do they keep putting single thread benchmarks in these reviews? WE KNOW intel is king of single thread. but why shine a light on that when we have multithreaded chips? here is another biased review. why end it with “the Intel Core i9-9980XE might be your best bet – if your budget allows” if the chip lost to cheaper, higher core count chips?
tl;dr -> Why is the Threadripper 2920X putting in a higher score for Geekbench single thread than the 2950X?
——
The Geekbench single threaded benchmark stands out to me as a little odd: the Threadripper 2920X gives a result ~9.6% higher than the 2950X (4957 vs 4521). I wouldn’t have expected this given that the 2950X has a 100MHz higher boost clock.
Another interesting, although less dramatic, data point is the Cinebench R15 single threaded scores: the Threadripper 2920X gives a result ~2.3% higher than the 2950X (176 vs 172).
Does anyone have any idea as to what could be causing this?
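The percentages quoted in this comment can be checked with a few lines of Python. This is only a sketch of the arithmetic; the helper name `percent_lead` is made up for illustration, and the scores are the ones quoted above.

```python
def percent_lead(a, b):
    """Return how much higher score a is than score b, in percent."""
    return (a / b - 1.0) * 100.0


# Scores quoted in the comment above (2920X vs 2950X).
geekbench_lead = percent_lead(4957, 4521)
cinebench_lead = percent_lead(176, 172)

print(f"Geekbench single thread: {geekbench_lead:.1f}%")      # 9.6%
print(f"Cinebench R15 single thread: {cinebench_lead:.1f}%")  # 2.3%
```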
The achieved IPC is generally more dependent on memory access performance than actual execution core performance. The execution core can generally execute ridiculously large numbers of instructions in parallel with out-of-order execution and multiple execution units. They often achieve an IPC of less than 1 even though they could execute something like 6 or 8 instructions per clock. This is why we have SMT. The core can execute instructions from another thread while waiting on L3 accesses. Intel has always been very good at cache design, which directly affects single-thread performance. The caching systems are probably much more complex than the cores these days.
I don’t know what causes the difference between the two Threadripper processors. The 2950 is a full 16 core chip with 4 cores per CCX. The 2920 is 3 cores per CCX. The only thing I can think of is that the 2920 has 8 MB of L3 for 3 cores while the 2950 has 8 MB for 4 cores. The other cores are not necessarily idle. They could be running OS and driver threads even if the only application running is single threaded.
Larger per-core L3 cache on the 2920X may be something. That said, the result that the 2700X put in for Geekbench’s single threaded test doesn’t jibe with that hypothesis – it has the same 2MB of L3 cache per core as the 2950X and it scores just a little lower than the 2920X with 4885.
If we were seeing an impact from background kernel and driver threads things would be even worse with all those threads having to be spread over “only” 8 hardware threads for the 2700X, right?
I like it though, keep ’em coming 🙂
Lots of weird things can happen in a multi-threaded, multi-core situation. Threadripper is a NUMA system while the 2700X is not. If the OS does something such that memory is being accessed from a non-local NUMA node, that could cause lower performance. It could be higher performance to have the OS threads on the same CCX as the application or it could be lower. It depends on the memory access patterns. Getting consistent results from a NUMA system is much harder than from a UMA system. Windows seems to be particularly bad at scheduling on Zen. I read some performance testing with some compression applications on Phoronix. Windows Server versions seem to be particularly bad. I am actually wondering if it is applying Intel Meltdown patches to AMD systems, which is unnecessary.
I’m more interested in the 9900X 10 core. I’m running 6 cores now for the 40 pcie lanes. Plex, folding and video encoding with a smattering of gaming. Those slightly lower core count but faster frequency seem to do well in that arena.
Yeah me too. I wonder how enhanced cache (vs 9820K) will affect the performance. Will wait and see. My workload doesn’t benefit much from multitude of cores, but it benefits much from Intel architecture. Serious bummer for me is lack of Thunderbolt3 support on AMD platform.
I am curious as to what the ramdisk test is actually testing. This seems like it is really a memory bandwidth test. I have been trying to figure out the memory controller configuration in Zen 1 and Zen 2. In Zen 1, the IFOP (on package links) are 32-bit but run at 4x base clock. That allows 128-bit per clock. The packet size is also 128-bit. A single channel memory controller is 64-bit, but since it is DDR, it transfers twice each clock for 128-bits per clock. The IFIS (inter-socket, off package links) are 16-bit, but they run at 8x base clock for also about 128-bit per base clock. Serdes has some encoding overhead, so both would be a bit less. DDR interfaces aren’t 100% efficient either, so it may map very well. This seems to imply that any one core is essentially limited to the bandwidth of a single memory channel, which might be what you are seeing in the ramdisk benchmark you are running.
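The per-clock arithmetic in this comment can be laid out as a short Python sketch. The link widths and clock multipliers are the commenter's figures, not verified specifications. The function names are invented for illustration, and the 1600 MHz memory clock (DDR4-3200) is an assumed example value, not a setting taken from the review.

```python
def bits_per_base_clock(width_bits, transfers_per_clock):
    """Bits moved per base (memory) clock for a link of the given width."""
    return width_bits * transfers_per_clock


def peak_gb_per_s(bits_per_clock, clock_mhz):
    """Theoretical peak in GB/s, ignoring encoding and protocol overhead."""
    return bits_per_clock / 8 * clock_mhz * 1e6 / 1e9


# (width in bits, transfers per base clock), as described in the comment above.
LINKS = {
    "IFOP (on-package)": (32, 4),
    "DDR4 channel": (64, 2),
    "IFIS (inter-socket)": (16, 8),
}

ASSUMED_MEMCLK_MHZ = 1600  # assumption: DDR4-3200, chosen only as an example

for name, (width, rate) in LINKS.items():
    bits = bits_per_base_clock(width, rate)
    print(f"{name}: {bits} bits/clock, "
          f"{peak_gb_per_s(bits, ASSUMED_MEMCLK_MHZ):.1f} GB/s peak")
```

All three links come out at 128 bits per base clock, which is the match the comment points to. Real throughput would be lower once SerDes encoding and DDR efficiency are taken into account, as the comment notes.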
For Zen 2, they need a lot more bandwidth per core to support double the AVX throughput. The clock speeds of the serdes links may be doubled with the PCIe 4.0 physical layer. The memory clock is not doubled, so to maintain bandwidth, they would need to stripe across two memory channels instead of just one. On Intel chips, they may stripe across 4 memory channels, which may be why you see the results with the ramdisk. For Intel server chips, they may be able to stripe across all 6 channels, but I don’t know if that is necessary or a good thing for all cases. The actual flops of a Zen 2 core are actually close to an Intel server part with 2x 512-bit AVX units. Zen 2 seems to have 4 256-bit units instead. We don’t know if these are 2 add and 2 multiply. If they have the space on Zen 2 at 7 nm, they may be able to have all 4 do multiply and add.
To me, being able to obtain such stellar performance at an obtainable, albeit premium, price is hands down worth it. AMD has its own niche and can run with the big dogs all by itself, but this new processor I just ordered as part of a general purpose gaming/rendering/research rig (with two RTX 2080 Ti blower edition cards) puts computing power on my desktop that was unimaginable just a few years ago for under five figures. I personally do not own a car, so I have the money normally spent on such a thing to use for technology.
This is the trade-off I make to adopt the best tech I can get without selling my soul and my own daughter into slavery.