Results: Valve Benchmarks and Euler3D
Valve Particle Simulation Benchmark
A few years back, Valve dabbled in multi-threaded applications to extract as much performance as possible out of modern CPUs. One of those apps is a particle simulation, which spreads its work across every available core while rendering particles in a 3D scene.
The AMD processors do not do as well as expected, even though they have more actual cores than the Intel part. I am curious as to what Valve used as a compiler for this program.
Valve Map Compilation Benchmark
The second benchmark Valve put out is a map compilation tool. It takes a game level/map and compiles it, a step that figures in the pre-computed light values and other visual optimizations so that the map looks good, yet is still playable on most systems.
We see a much closer result this time than last. Intel takes first place, and after that it is a case of the more cores the merrier.
Euler3D
This professional-style benchmark is based on an actual engineering simulation featuring fluid flow over an airfoil shape. It is heavily multi-threaded, and can be adjusted to use as many threads as needed. For this I set it to 20 steps and ran it with 1, 2, 4, 6, and 8 threads. I adjusted thread count according to how many actual cores the processors have. I mixed and matched a bit to see how the AMD processors would handle more threads than actual cores.
This benchmark has always heavily favored the Intel processors, and I’m pretty certain that it was compiled using one of the Intel compilers, which ignore SSE flags from AMD processors. Or I could be wrong. Anyway, the AMD processors do not like to run more threads than they have cores, and the only positive difference we see is on the AMD X3 processor when going from 2 to 4 threads. With 2 threads the 3rd core obviously sits idle, while with 4 threads the 3rd core is used, but the thread-switching overhead negatively impacts performance, so it does not exhibit the expected theoretical jump in performance.
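To put rough numbers on the X3 case, here is a minimal sketch of an idealized model. It assumes the work is split into equal chunks, one per thread, with no synchronization or switching cost. Those are my assumptions for illustration, not how Euler3D is known to divide its work, and the function names are hypothetical.

```python
import math


def time_sliced_bound(threads, cores):
    """Best-case speedup over one thread: at most min(threads, cores)
    chunks can make progress at the same time."""
    return min(threads, cores)


def run_to_completion_speedup(threads, cores):
    """Speedup if each thread keeps its core until its chunk is done.

    The work is split into `threads` equal chunks, so the run takes
    ceil(threads / cores) rounds, each lasting 1/threads of the
    single-thread time.
    """
    rounds = math.ceil(threads / cores)
    return threads / rounds


# A 3-core part running 2 and then 4 threads (idealized, no overhead).
for threads in (2, 4):
    low = run_to_completion_speedup(threads, 3)
    high = time_sliced_bound(threads, 3)
    print(f"3 cores, {threads} threads: between {low:.1f}x and {high}x")
```

In this model, 4 threads on 3 cores can only reach the 3x bound if the scheduler shares the cores perfectly among the threads. If it does not, the fourth chunk waits for a free core and the result falls back toward 2x, which fits a gain that is positive but smaller than the theoretical jump.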
It is interesting to see that even though the Intel processor “natively” supports up to 8 threads, the performance difference between 4 threads and 8 is negligible compared to going from 2 threads to 4.
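One way to quantify this kind of scaling is to turn run times into speedup and per-thread efficiency. The sketch below assumes you have elapsed times per thread count. The function name is hypothetical, and the timings are placeholders invented for illustration, not results from this review.

```python
def scaling_table(seconds_by_threads):
    """Return (threads, speedup, efficiency) rows relative to the 1-thread run.

    speedup    = single-thread time / time at this thread count
    efficiency = speedup / thread count (1.0 means perfect scaling)
    """
    base = seconds_by_threads[1]
    rows = []
    for threads in sorted(seconds_by_threads):
        speedup = base / seconds_by_threads[threads]
        rows.append((threads, speedup, speedup / threads))
    return rows


# Placeholder timings in seconds, invented for illustration only.
example = {1: 100.0, 2: 50.0, 4: 25.0, 8: 25.0}
for threads, speedup, efficiency in scaling_table(example):
    print(f"{threads} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

A drop in efficiency between two thread counts, while the speedup stays about the same, is what a "negligible difference" looks like in this form.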