Whiskey Lake vs. Ice Lake Benchmarks: Testing Intel’s Big Leap in Ultraportable Graphics
We test Intel’s claim that Ice Lake graphics are up to twice as fast as last year’s Whiskey Lake
In the lead-up to the launch of its new Ice Lake platform, Intel frequently touted the improved “Gen11” integrated graphics that would debut in the 10nm mobile processors. Intel claimed that this new graphics architecture – now officially named “Iris Plus Graphics” – would significantly close the gap with, or in some cases even exceed, AMD’s Ryzen mobile graphics in the same power range.
We tested that claim earlier this month by comparing the first shipping Ice Lake system, the Dell XPS 13 2-in-1 with the top-end i7-1065G7, to a Lenovo ThinkPad T495 powered by AMD’s top processor in that same 15W TDP category, the “Picasso”-based Ryzen 7 3700U. The results for Gen11 Iris Plus Graphics were impressive. While the 3700U’s Vega 10 GPU still led in several games and applications, the Iris Plus Graphics were very close in performance, if not ahead in some categories.
But Intel made another, related claim prior to Ice Lake’s launch: that Gen11 graphics would be up to twice as fast as the “Gen9” UHD 620 graphics found in the preceding Whiskey Lake platform (“Gen10” graphics, which would have made their debut on Intel’s semi-aborted Cannon Lake platform, were never productized). So we set out to test this claim as well by picking up a Whiskey Lake-based system powered by the top processor from that generation, the Core i7-8565U.
Our goal is to compare the generation-over-generation performance of Intel’s top mobile processors in the 15W TDP category, with a specific focus on graphics. We’re also once again including the Ryzen 7 3700U, not only to compare Intel’s and AMD’s top currently available 15W platforms, but also to better understand the state of the market for low power integrated graphics prior to Ice Lake’s launch, since AMD’s Picasso-based systems have been on the market since the beginning of the year.
Comparison to Earlier Ice Lake vs. Picasso Benchmarks
First, a quick note for those who read our initial coverage on the Ice Lake benchmarks. Shortly after the release of that article, several software and driver updates were released. Both our Dell and Lenovo test systems received Windows 10 updates, and both Intel and AMD released graphics driver updates.
Since we were adding the Whiskey Lake system into the mix, and considering the reader feedback we received on the type of benchmarks we included, we chose to simply retest everything on the latest updates. Therefore, all tests for this article were run with the latest Windows, system firmware, and driver updates as of September 12, 2019.
It’s also worth noting that for some tests, specifically games, we’re using a different testing method. Our initial Ice Lake benchmarks reported minimum, maximum, and average frame rates. Reader feedback requested instead a focus on frame times, which is understandable in any case, but especially with lower power graphics. So, aside from a few built-in benchmarks, we now report average frame rate and average, 99th percentile, and 99.9th percentile frame times.
We further added some games that were requested (Fortnite, Overwatch) and removed a few games that couldn’t reach reasonably playable frame rates on this hardware (Far Cry 5, F1 2018). That’s a long way of saying that results between these two articles aren’t necessarily comparable.
The Test Systems
We tested three laptops: the aforementioned Dell XPS 13 2-in-1 (Ice Lake) and Lenovo ThinkPad T495 (Picasso), as well as the HP Spectre x360 13t (Whiskey Lake).
Note: our ThinkPad T495 specifically uses a Ryzen 7 PRO 3700U processor. This is part of AMD’s PRO line of mobile and desktop processors that offer additional enterprise-related security features and longer warranties. However, it has the same technical specifications and performance characteristics as the non-PRO 3700U. Since none of our tests are related to the chip’s unique enterprise features, we will refer to it as simply the “3700U” going forward.
As we mentioned in our initial Ice Lake coverage, comparing mobile processor platforms can be tricky, since a measurable amount of performance can depend on the choices made by the laptop manufacturer in terms of cooling and memory configuration. And, especially in this lower power, thin-and-light “Ultrabook-style” market range, the ability to match system configurations via upgrades or modifications is extremely limited, if it exists at all. Ideally, you’d want to compare platforms in the exact same chassis, such as a Dell XPS 15 that has options for two different processor types.
Unfortunately, the limited availability of Ice Lake processors just after launch, as well as the limited (but growing!) availability of Ryzen 3000 mobile options, prevents this. Therefore, while we kept the testing conditions (ambient temperature, power status, testing order, etc.) the same, our results must carry the caveat of all laptop reviews: that performance is affected by the laptop design itself.
Another issue that came up in our initial Ice Lake benchmarks is memory. Some readers argued that the comparison between the Ice Lake and Picasso systems was invalid, since the i7-1065G7 uses LPDDR4-3733 while the 3700U is limited to DDR4-2400. The reality is that these are the speeds that Intel and AMD chose for these platforms. Unlike desktops, it’s generally not possible to use faster memory in a laptop than the platform supports. So, because AMD designed its Zen+ Ryzen 3000 mobile platform to use 2400MHz memory, that’s what we’re stuck with.
As to whether that makes the comparison “fair,” especially considering that memory speed has a greater effect on overall performance for Ryzen compared to Intel, our position is that these are the systems that consumers will buy. There is no Ryzen 3000 mobile system running DDR4 at 3733MHz, and there is no Ice Lake system running DDR4 at 2400MHz. We can’t compare hypotheticals; with laptops especially, we must evaluate the system as it will be available to consumers.
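To put the two memory speeds in rough perspective, here is a minimal sketch of the theoretical peak bandwidth each rating implies. The DDR4-2400 and LPDDR4-3733 ratings are transfer rates (megatransfers per second). The 128-bit total bus width is an assumption made for illustration, not a figure we measured on these laptops, and `peak_bandwidth_gbs` is a hypothetical helper name.

```python
def peak_bandwidth_gbs(transfers_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * 1e6 * bytes_per_transfer / 1e9


# Assumption: both platforms expose a 128-bit total memory interface.
BUS_WIDTH_BITS = 128

ddr4_2400 = peak_bandwidth_gbs(2400, BUS_WIDTH_BITS)
lpddr4_3733 = peak_bandwidth_gbs(3733, BUS_WIDTH_BITS)

print(f"DDR4-2400:   {ddr4_2400:.1f} GB/s")
print(f"LPDDR4-3733: {lpddr4_3733:.1f} GB/s")
print(f"Ratio:       {lpddr4_3733 / ddr4_2400:.2f}x")
```

These are ceilings on paper only. Real-world throughput depends on latency, memory controller behavior, and how each laptop is actually configured.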
The other memory-related question dealt with confusion over the AMD system’s memory frequency. The manufacturers of several Picasso-based systems, including our ThinkPad T495, list the memory as DDR4-2666 on the device’s product page, leading some to believe that the 3700U and its counterparts could run its memory at a faster frequency. Unfortunately, this is an error on the part of the system makers and marketers.
Companies like Lenovo are shipping these systems with memory modules rated at DDR4-2666, but the system actually operates the memory at the default 2400MHz. You can see this in the HWiNFO screenshot above, with the red box showing the module rating and the blue box showing the actual operating frequency.
All of that said, here are the tech specs for our three systems:
|Specification||HP Spectre x360||Dell XPS 13 2-in-1||ThinkPad T495|
|Processor||Intel Core i7-8565U (14nm Whiskey Lake U)||Intel Core i7-1065G7 (10nm Ice Lake U)||AMD Ryzen 7 PRO 3700U|
|Microarchitecture||Whiskey Lake||Ice Lake||Picasso|
|Cores/Threads||4 Cores / 8 Threads||4 Cores / 8 Threads||4 Cores / 8 Threads|
|Max Boost Freq.||4.6GHz||3.9GHz||4.0GHz|
|Graphics||UHD Graphics 620||Iris Plus Graphics||Radeon RX Vega 10|
|Graphics Driver||Intel Graphics 126.96.36.19958||Intel Graphics 188.8.131.5255||Adrenalin 2019 Edition 19.9.2|
|Storage||512GB NVMe||512GB NVMe||512GB NVMe|
|CPU Launch Date||August 2018||August 2019||January 2019|
We ran a number of synthetic and real-world benchmarks looking at both CPU and GPU performance. Each system was running Windows 10 1903 with the latest updates as of September 12, 2019 and all nonessential software and services were disabled. All tests were performed with a fully charged battery while connected to power via each laptop’s included power adapter, and all Windows or OEM power management options were configured for maximum performance.
Each test was run three times and the results were averaged. Specific details for each test are addressed in the paragraphs before each chart. The laptops were allowed to idle between tests so that temperatures could settle. Also note that some of the tests include results measuring storage performance. Storage performance is not inherent to each platform and can vary significantly depending on the type and class of the chosen drive. All our systems were configured with a 512GB NVMe drive, but from different manufacturers and with different performance characteristics. Therefore, while we include the storage results for the sake of completeness, readers shouldn’t give them much weight unless they are considering a purchase of the same laptop model and configuration.
As we discussed in Podcast #557, we’re providing our benchmark data spreadsheet via Google Sheets for those interested. Note that this is simply the raw data we collect while testing each system or component. It lacks the context and details for each test that are found in the article, and it may contain results that we chose not to use for reasons of relevance or clarity. Please let us know if you find this useful and we will try to include this type of data where applicable in future reviews.
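For readers reproducing the three-run averaging on their own hardware, a minimal sketch might look like the following. The scores are made-up placeholders rather than our data, and `summarize_runs` is a hypothetical helper. Reporting the run-to-run spread alongside the mean is our suggestion here, not part of the methodology described above.

```python
from statistics import mean


def summarize_runs(results):
    """Return the mean of repeated runs and the spread as a percentage of the mean."""
    avg = mean(results)
    spread_pct = (max(results) - min(results)) / avg * 100
    return avg, spread_pct


# Hypothetical scores from three runs of one benchmark (not measured data).
runs = [1510.0, 1490.0, 1500.0]
avg, spread_pct = summarize_runs(runs)
print(f"average: {avg:.1f}, run-to-run spread: {spread_pct:.2f}%")
```

A large spread between runs is usually a hint that thermals or background tasks interfered, and that the test is worth repeating.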
Finally, as with our initial Ice Lake benchmarks, these tests are looking solely at performance, not efficiency or battery life. Since those aspects are even more dependent on the manufacturer’s laptop design than performance, we are currently testing them as part of an overall review of the laptops themselves.
Whiskey Lake vs. Ice Lake Benchmarks – CPU & Synthetics
We’ll start with benchmarks that focus more on the CPU side, including synthetic benchmark suites that attempt to measure overall system performance.
We used Blender Benchmark 1.0 Beta 2 to test performance in the popular open source 3D graphics application. The Quick Benchmark option measures the time it takes to render two scenes: BMW and Classroom. We reported the results in seconds, with a lower score equaling faster performance. The rendering is done entirely via the CPU.
Web browsing performance is frequently touted by hardware companies and using a browser is one of the most common activities for users. Using the latest version of Chrome (77.0.3865.75), we ran two benchmarks: Kraken 1.1 and JetStream 2. The Kraken results are reported in milliseconds, with a lower score equaling better performance, while the JetStream results are a proprietary score where higher is better.
The Cinebench benchmark measures rendering performance for Maxon’s Cinema 4D. It tests both multi-core and single-core rendering, with a higher score equaling better performance. We ran the current R20 version of the benchmark, but also tested the previous R15 version for those interested in comparing older scores.
Of note, Intel has recently argued that tests like Cinebench are of little value to most consumers, especially in the relatively low power 15W TDP category that we are testing here, since the majority of users almost certainly won’t perform heavy 3D rendering on such systems. While that may be true, Cinebench is still a widely used benchmark in the industry, is free and easily accessible to readers for testing their own existing hardware, and is useful for comparing the relative performance of many different processor categories. We therefore continue to include it as a data point and, as with all our benchmarks, expect our readers to decide how each test translates to their intended usage of a particular processor or system.
The recently launched Geekbench 5 is a multiplatform benchmark that measures single- and multi-core CPU performance and, separately, GPU compute performance. Changes in Geekbench 5 include the removal of memory testing as a factor in the CPU benchmarks (since memory performance can vary independently of the processor depending on component ratings and configuration) and the addition of a Vulkan-based GPU test as an alternative to OpenCL. Results for each test are reported as a proprietary score with higher numbers equaling better performance. Note that the scoring system was completely overhauled for Geekbench 5, so results on this latest version are not comparable to results from Geekbench 4 and earlier.
Novabench is a free multi-platform benchmark that attempts to measure all major aspects of a system’s performance, including CPU, GPU, memory, and storage. We used the latest version, 4.0.6. Results are reported by a proprietary score with higher being better.
PCMark 10 is the latest version of UL’s (formerly Futuremark’s) comprehensive benchmark to measure overall system performance. Users can select different tests depending on their needs, but options include measuring application start-up times, video conferencing performance, responsiveness in word processing and spreadsheet productivity apps, photo and video editing, and gaming. We ran the Extended mode, which tests everything, and reported the overall scores for each category. For specific scores in each subcategory, see our aforementioned benchmark data spreadsheet. Higher scores equal better performance.
Whiskey Lake vs. Ice Lake Benchmarks – Gaming
Next we’ll look at gaming-focused benchmarks, including synthetics that test gaming performance, such as 3DMark, and a handful of games themselves.
A few results, which will be individually noted, used the game’s built-in benchmark. For all other games we measured performance with the Open Capture and Analytics Tool (OCAT) in 60-second blocks. For games where “the action” doesn’t start right away, such as World of Warships, we didn’t start measuring until we had reached a point of active conflict. VSync was of course disabled for all tests.
Except where otherwise noted, game results are reported as an average frame rate in frames per second and as average, 99th percentile, and 99.9th percentile frame times. The latter two are the frame times that 99% and 99.9% of frames meet or beat, so they describe the slowest 1% and 0.1% of frames; they are closely related to the “1% Low” and “0.1% Low” figures reported elsewhere. When comparing these results, you want a higher frame rate and lower frame times that are also as consistent as possible between the average, 99th, and 99.9th percentiles. Higher, less consistent frame times mean you’ll see stuttering or “choppy” rendering during gameplay.
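To make these metrics concrete, here is a minimal sketch of how they could be derived from a list of per-frame times in milliseconds. It assumes a simple nearest-rank percentile. Capture tools may interpolate instead, or define “1% Low” as an average over the slowest frames, so exact figures can differ. The sample data and the `frame_stats` helper are hypothetical.

```python
import math


def frame_stats(frame_times_ms):
    """Summarize a capture given per-frame render times in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    ordered = sorted(frame_times_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank: smallest value with at least p% of frames at or below it.
        # The tiny epsilon guards against floating-point noise at exact boundaries.
        rank = max(1, math.ceil(p * n / 100 - 1e-9))
        return ordered[rank - 1]

    total_ms = sum(ordered)
    return {
        "avg_fps": n / (total_ms / 1000),  # frames divided by elapsed seconds
        "avg_ms": total_ms / n,
        "p99_ms": percentile(99),
        "p99_9_ms": percentile(99.9),
    }


# Hypothetical capture: mostly smooth frames, some slow ones, two severe stutters.
sample = [20.0] * 980 + [40.0] * 18 + [100.0] * 2
stats = frame_stats(sample)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

In this made-up capture the average frame time barely moves off 20 ms, while the 99th and 99.9th percentiles expose the slow frames. That gap is what shows up as stutter.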
UL’s 3DMark application features a number of benchmarks aimed at evaluating a system’s gaming performance. Each benchmark contains tests that focus on the GPU and CPU/Physics, as well as a combined test that stresses both. The results are presented as overall scores with higher being better.
Of note, our Ice Lake system encountered an unusual issue with 3DMark and would only successfully complete the two tests we have included here: Night Raid and Time Spy. We confirmed that other Ice Lake systems did not experience this issue, so it appears to be limited to our test system, but we were not able to resolve it prior to publication. While some 3DMark tests, such as the Extreme versions of Time Spy and Fire Strike, are not suitable for the lower power mobile systems in this review, others are, and we will update this article with additional tests if we are able to resolve the issue. To be clear, this was the only issue of this kind that we experienced among all three test systems during our testing process.
We first tested Time Spy, which is currently the primary test in 3DMark’s ever-growing test suite. This is a DirectX 12-based benchmark designed for gaming PCs that by default renders at 2560×1440. The actual results on these lower powered systems with integrated graphics are of course unplayable but the scores provide a measurement for relative performance in a demanding scenario.
Next is Night Raid, also a DirectX 12 test but one that is designed exactly for our test systems: PCs with integrated graphics and other relatively lower-end hardware, including even Windows on ARM devices. As a result, the test renders at playable frame rates and gives a good idea of expected performance for modern games designed to accommodate integrated graphics.
Unigine Corp. has offered several multi-platform benchmarks over the years based on the company’s game engine. We tested their three most recent benchmarks: Superposition (2017), Valley (2013), and Heaven (2009). The Superposition results carry the most weight since it is the newest test and the best optimized for today’s processors and graphics cards, but we include the older tests for potential comparisons to older hardware.
We ran both the Heaven and Valley benchmarks using their Basic quality preset. For Superposition, we tested both the 720p Low and 1080p Medium presets. The results are presented as the overall benchmark score with higher numbers equaling better performance.
Final Fantasy XIV: Shadowbringers
The Final Fantasy XIV: Shadowbringers Benchmark is the official benchmark for the latest update to the popular MMORPG of the same name. The benchmark uses the same assets as the game to run through a series of demanding scenes, testing the capabilities of your CPU, GPU, and storage. We ran the test with the Standard (Laptop) quality preset. The test presents three results: an overall score (higher is better), an average frame rate in frames per second (higher is better), and a total loading time in seconds (lower is better).
Civilization VI
Sid Meier’s Civilization VI, released in 2016, is the latest version of the popular turn-based strategy game. We used the game’s built-in benchmarks to measure average frame time and 99th percentile frame time, measured in milliseconds, and the average turn time, measured in seconds. The turn time, where lower is better, is primarily a CPU-based test, while the frame times task both CPU and GPU. Again, with frame times a lower, more consistent result is better. Tests were conducted in the game’s DirectX 11 mode at 1920×1080 with the graphics presets configured to Low.
Fortnite
The frighteningly popular Fortnite, first released in 2017, is an online multiplayer game with different free and paid modes. We tested the game at both the Low and Medium presets via its most well-known free-to-play Battle Royale mode and captured the frame rate and frame time data via OCAT. Of note, as is somewhat common with popular online games, Fortnite can set a separate rendering resolution independent of the output resolution in order to improve performance. We overrode this setting and ensured that both the rendering and output resolutions were locked at 1920×1080 for each test.
Grand Theft Auto V
Although first released for PC in early 2015, Grand Theft Auto V today remains one of the most-played games in the world and ranks as the third best-selling video game of all time. In DirectX 11 mode, we configured the game for the lowest graphical detail settings at 1920×1080 and used the game’s built-in benchmark to generate the average frame rate, and average and 99th percentile frame times. The game reports results in five “passes” but we have averaged the total test results for simplicity.
Overwatch
Released in mid-2016, Overwatch is a popular team-based multiplayer first-person shooter. Like Fortnite, the game has the option to set a rendering resolution independent of the output resolution. We therefore ensured that both rendering and output resolutions were locked at 1920×1080 for both the Low and Medium presets. We captured frame rate and frame time data via OCAT in a multiplayer vs. AI training match.
Rainbow Six Siege
Tom Clancy’s Rainbow Six Siege is a popular online tactical shooter first released in late 2015. We configured the graphics settings for 1920×1080 at both the Low and Medium presets and we used OCAT to measure frame rate and frame times during the game’s built-in benchmark. This approach wasn’t practical with other games such as Grand Theft Auto V since those benchmarks had interruptions in rendering to load new scenes, whereas Rainbow Six Siege rendered the entire benchmark run in one pass.
Rocket League
Released as a surprise hit in mid-2015, Rocket League is a popular multiplayer sports game described by the developers as “soccer, but with cars.” We tested both the Performance and High Quality graphics presets at 1920×1080, and measured frame rate and frame times via OCAT in a 3-vs-3 offline match in Champions Field.
World of Warships
One of the popular free-to-play online multiplayer games from Wargaming, 2015’s World of Warships is a naval warfare game featuring both player-vs-player and player-vs-AI game modes with a variety of customizable and upgradable ships. We tested both the Low and Medium graphics presets at 1920×1080 resolution and used OCAT to capture frame rate and frame time data in an 8-vs-8 cruiser battle. Since there is a period of calm at the beginning of each battle as the opposing ships approach each other, we didn’t begin testing until both sides were engaged.
Whiskey Lake vs. Ice Lake: The Big Picture
The results in the preceding charts clearly show that Ice Lake’s Gen11 Iris Plus Graphics are a huge bump over Whiskey Lake’s Gen9 UHD 620. As a final step, we’ve summarized this performance improvement for the graphics-related tests we conducted.
Looking at the synthetic benchmarks that deal with gaming and graphics performance, the results in the chart below show the percentage improvement in each test for Ice Lake over Whiskey Lake.
As you can see, Intel’s claim of “up to 2x” (which would equal a 100% improvement in the way we’re showing the data) comes close to holding for several applications. Novabench fares the worst at a 50% improvement, while the other tests range between about 63% and 93%.
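For clarity on how a percentage improvement relates to an “up to 2x” claim, here is a minimal sketch for higher-is-better scores. The input numbers are placeholders chosen for round results, not our measurements, and the function names are hypothetical.

```python
def percent_improvement(old_score: float, new_score: float) -> float:
    """Percentage gain of new_score over old_score for higher-is-better results."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return (new_score - old_score) / old_score * 100


def speedup_factor(old_score: float, new_score: float) -> float:
    """The same comparison expressed as a multiplier (2.0 means twice as fast)."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return new_score / old_score


# Placeholder scores: a doubling is a 100% improvement, 1.5x is 50%.
print(percent_improvement(1000, 2000), speedup_factor(1000, 2000))
print(percent_improvement(1000, 1500), speedup_factor(1000, 1500))
```

Read this way, a 93% improvement corresponds to roughly a 1.93x speedup, just short of a full doubling.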
Looking at a summary of games, both bars also represent percentage improvement, so higher percentages for both frame rate and frame times are better in this context.
Overwatch on the Low preset shows the smallest improvement, at about 32% for frame rate and 17% for frame times, whereas Rainbow Six Siege saw frame rates improve by over 72% and frame times drop by almost 41%. Improvement for the remaining games ranges from around 40% to just under 63%, which is impressive for a “single generation” jump (not counting the aforementioned Gen10).
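Because frame times are a lower-is-better metric, their improvement is computed as a reduction relative to the old value. The sketch below uses placeholder numbers. It also shows why a frame time reduction looks smaller than the matching frame rate gain, under the idealized assumption that frame rate is exactly 1000 divided by frame time in milliseconds. Averages measured from real captures need not follow that relationship exactly.

```python
def percent_reduction(old_ms: float, new_ms: float) -> float:
    """Percentage drop in a lower-is-better metric such as frame time."""
    if old_ms <= 0:
        raise ValueError("old_ms must be positive")
    return (old_ms - new_ms) / old_ms * 100


def equivalent_fps_gain(reduction_pct: float) -> float:
    """Frame rate gain implied by a frame time reduction, assuming fps = 1000 / ms."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in [0, 100)")
    return reduction_pct / (100 - reduction_pct) * 100


# Placeholder values: 50 ms per frame (20 fps) improving to 30 ms (about 33 fps).
drop = percent_reduction(50.0, 30.0)
gain = equivalent_fps_gain(drop)
print(f"frame time reduction: {drop:.1f}%")
print(f"implied frame rate gain: {gain:.1f}%")
```

The same change reads as a 40% drop in frame time but a roughly 67% rise in frame rate, which is why the two bars for a given game differ in size.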
To keep this improvement in perspective, and understand why it was so big for Intel, just block out Ice Lake from the per-test benchmark charts above and look at Whiskey Lake vs. Picasso, which was the exact state of the market in this 15W TDP category prior to Ice Lake’s launch. Intel still held a lead in most CPU-related workloads, but AMD and Ryzen Mobile owned the graphics side of things. In one generation, however frustrating it may have been to get there, Intel has closed the gap in graphics performance.
This gives consumers a new choice in the ultraportable market. Now, those who need a thin-and-light laptop for travel or work can still game on the go with acceptable performance on either AMD or Intel.
Of course, this conclusion thus far only applies to the highest-end “G7” variant of the Iris Plus Graphics, currently found in Ice Lake’s i7 tier and on two i5 models. We have yet to test the lower-end “G4” or “G1” variants.
The other caveat for Ice Lake is availability. Of the systems that have thus far been announced with 10th Gen Ice Lake processors, very few are actually shipping, with Dell one of the only manufacturers to have consistent supply. So if you decide that Ice Lake will power your next thin and light laptop, you may need to wait a bit for the exact model and configuration to come into stock at your preferred laptop manufacturer.
But, looking just at this top end of the Ice Lake lineup, integrated graphics are no longer a negative factor for Intel. The company has confusingly split its 10th Generation of mobile processors between architectures (the new Comet Lake processors, also branded as 10th Gen, are 14nm Skylake-based parts lacking Gen11 graphics), but if you can find the right Ice Lake processor in a thin-and-light laptop design you prefer, you probably won’t have any regrets about performance.
As for AMD and Ryzen 3000 Mobile, Ice Lake doesn’t fully overtake it, at least not yet, based on the systems and price points we’ve seen. Our benchmarks show that the Ryzen 7 3700U, based on a nearly year-old Zen+ mobile architecture, still outperforms Ice Lake and Gen11 Iris Plus Graphics in some situations. Beyond that, AMD enjoys its traditional “value” advantage, as systems powered by the top-end Ryzen Mobile processors are priced hundreds of dollars less than these first Ice Lake options.
To reference our initial Ice Lake article, our Ice Lake-based Dell set us back just under $1,900 while the Ryzen-based ThinkPad was $1,280 (and is actually even less — $1,199 — as of today). There are reasons beyond the processor for the price difference in these systems, of course, but there’s still going to be a price-to-performance advantage for Ryzen 3000 Mobile for some time.
To reiterate, however, Intel’s strong progress in graphics performance with Ice Lake has given consumers new options and further heated up the competition in this important mobile space. We are now eager to see the actual performance of Intel’s lower-end G4 GPU, and to see how pricing and availability settle across the Ice Lake lineup.