The Fly in the Ointment
The glimpse that Intel has given us of Larrabee is of course very interesting. It is a new and unique way of attempting to render graphics in an efficient and well-thought-out manner. While AMD's and NVIDIA's architectures are not exactly brute force, they still have a ways to go before they can be considered general computing units. Intel is coming from the opposite direction, taking a general computing unit and turning it into a renderer.
Flexibility and programmability are the two building blocks of Intel's Larrabee architecture. While everything looks good on paper, the largest problem that many of us can see is that to achieve good performance along with this flexibility, a tremendous runtime compiler is going to have to be developed to squeeze every drop of performance from Larrabee. This situation is oddly reminiscent of Intel's EPIC architecture, which is the basis for the Itanium series of products. To get good performance across many workloads with Itanium, a true next-generation compiler was needed. Intel was unable to deliver on the compiler technology, and as such Itanium was relegated to the workloads where it excelled and was ignored for the vast majority of processing situations.
Intel is planning for Larrabee to be fully compliant with DirectX and OpenGL, as well as to run its own native applications for HPC or stream computing.
Will Larrabee be the next EPIC? It is hard to say, as Intel certainly is not talking. But to deliver the kind of performance needed to compete with the AMD and NVIDIA products that will be introduced around the time Larrabee is released, Intel is going to have its work cut out for it. While it is offering developers some interesting tools to create unique content, if it is not able to perform at a level consistent with its competitors across common applications, then developers will not take the time to explore the advantages that Intel is able to bring. This is not because of any loyalty to one manufacturer or another, but the simple economics of selling a product to the largest installed base. If Intel is not able to match performance, then users will not buy its products.
High performance computing is going to be one area where Intel will have a leg up on the competition. By utilizing large numbers of X86 cores in a single product, high performance computing applications that once required many server-class machines can be run on one card. While each X86 core is a dual-issue, in-order unit, Intel makes up for any deficiencies by throwing more and more cores at the problem. By running in an environment that allows an external CPU to handle the compiler software, high performance applications can leverage this multi-core architecture fairly efficiently and scale performance with the number of cores on the chip. The advantages of a fairly fast core with very fast inter-core communications can outweigh those of multiple servers running the same number of superscalar out-of-order cores, which have to rely on front side busses, HyperTransport, or the upcoming QuickPath.
In simulations Intel was able to show that performance scales linearly with the number of cores used. As far as Intel could tell, it was not able to hit the point of diminishing returns with a realistic number of cores. This also applies to non-gaming applications.
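That linear-scaling claim can be sanity-checked against Amdahl's law, which says the ideal speedup on n cores is 1/((1−p) + p/n), where p is the fraction of the work that can run in parallel. Near-linear scaling at dozens of cores therefore implies p is very close to 1. A small illustrative calculation (not Intel's simulation, just the textbook formula):

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Amdahl's law: ideal speedup on `cores` cores when
    `parallel_fraction` of the work can run in parallel."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# Near-linear scaling at 32 cores demands an almost entirely
# parallel workload: even 2% serial work costs nearly 40% of
# the ideal speedup.
for p in (1.0, 0.98, 0.90):
    print(p, round(amdahl_speedup(32, p), 1))
```

If Intel's simulations really show no diminishing returns at realistic core counts, it suggests the target workloads have almost no serial component once the driver and compiler work is offloaded to the host CPU.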
For the first generation of products, I am going to go out on a limb and say that it will not be competitive with what AMD and NVIDIA will have for high-end cards. Where I believe Intel could hit a home run is with the high performance computing and stream processing that many complex applications are starting to leverage. NVIDIA has already gained a lot of mindshare in this particular industry with its CUDA outreach programs, and AMD is not all that far behind. The very fact that Intel is using a series of X86 cores in a single product programmable in C/C++ already gives it a giant leg up over its competitors.
Some Closing Thoughts
Intel is showing a very interesting architecture, but time will tell if it will be a consumer success. We can speculate about whether the first chip based on the Larrabee architecture will be a fully custom part; if it is not fully custom, then the X86 and vector units will be custom, leaving the texture units and other fixed-function units to run at a slower speed. But my best guess is that it will be a fully custom part running in the 2 GHz+ range.
The big question will be what process node Intel will introduce it at. When Larrabee sees the light of day in 2009, we could see significant performance differences depending on whether Intel decides to use the "older" 45 nm process or the new 32 nm process. While the smaller process will allow for faster clock speeds, the bigger factor will be how many cores can fit on a die that is appropriately sized for consumer use.
Remember those threads, strands, and fibers? Efficiency internal to these vector units is going to rely heavily on the software managing these fibers and strands.
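To make the terminology concrete: in Larrabee's model a hardware thread runs software-scheduled fibers, and each fiber advances a bundle of strands that map onto the lanes of the 16-wide vector unit; when a fiber stalls on memory, software switches to another fiber to hide the latency. A toy cooperative scheduler (purely illustrative, with hypothetical names; not Intel's actual runtime) sketches the idea:

```python
from collections import deque

VECTOR_WIDTH = 16  # strands per fiber, one per lane of the 16-wide vector unit


def fiber(name, n_steps):
    """A fiber: a generator that advances its strands one vector-wide
    step at a time, yielding whenever it 'stalls' on memory."""
    def gen():
        for _ in range(n_steps):
            yield  # pretend every step ends in a memory stall
    return {"name": name, "gen": gen()}


def run_fibers(fibers):
    """Cooperative round-robin scheduler: advance one fiber's strands,
    then rotate it to the back of the queue so another fiber can cover
    its memory latency. Returns the execution order."""
    queue = deque(fibers)
    order = []
    while queue:
        f = queue.popleft()
        try:
            next(f["gen"])        # advance all 16 strands one step
            order.append(f["name"])
            queue.append(f)       # stalled: re-queue behind the others
        except StopIteration:
            pass                  # fiber finished all its work
    return order


order = run_fibers([fiber("A", 2), fiber("B", 2)])
print(order)  # fibers interleave, hiding each other's stalls
```

The efficiency point in the paragraph above is exactly this: if the software scheduler cannot keep enough ready fibers queued up, the vector lanes sit idle during stalls no matter how wide the hardware is.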
The idea that Intel is making a significant push into the graphics market is making investors in AMD and NVIDIA very skittish, and rightly so. There are some amazingly smart people at Intel, and they are bringing out an interesting and unique architecture to address the performance problems in 3D rendering and high performance stream computing.
In the next few weeks we will be learning more about Larrabee, and we will pass that information along to readers as fast as possible. In the meantime, Intel has shown us some of the high-level design decisions surrounding Larrabee, and they raise more questions than answers at this time. Will Intel be able to develop a runtime compiler that keeps Larrabee running at near 100% efficiency, or is it too much of a jump for compiler technology? Or will the graphics driver for Larrabee drain too much performance from the CPU? As 2009 draws closer, we will eventually find out these answers. Until then, all we know is that Intel has sunk a significant amount of money into the Larrabee project, and for those who remember the $1 billion development of Itanium… it could be quite a gamble.