What is Terascale?
If you’ve never heard the term ‘terascale’ before reading this article, you aren’t alone. Before attending this fall’s IDF, I hadn’t been introduced to the term either. But after hearing and reading about it, and after doing a lot of research into the technology, I can tell you that you’ll walk away from this technology overview excited about the future of computing.
The basic premise of terascale computing is being able to work on terabytes of data on a single machine, which would require teraflops of processor power. (A terabyte is 1,024 gigabytes and a teraflop is 1,000 gigaflops.)
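To get a feel for that premise, here is a back-of-envelope sketch of why terabyte-sized data implies teraflop-class compute. The dataset size and operations-per-element figures are illustrative assumptions, not numbers from Intel.

```python
# Rough scale of a "terascale" workload. All workload figures
# (1 TB dataset, ~10 ops per element) are made-up assumptions.

TERABYTE = 1024**4   # bytes in a (binary) terabyte
TERAFLOP = 10**12    # floating-point operations per second

dataset_bytes = 1 * TERABYTE
floats = dataset_bytes // 4          # 4-byte single-precision values
ops_per_float = 10                   # assumed work per element

seconds = floats * ops_per_float / TERAFLOP
print(f"{floats:,} floats, ~{seconds:.1f} s at 1 teraflop")
```

Even under these generous assumptions, a single pass over one terabyte takes a few seconds at a full teraflop; anything slower quickly becomes impractical for interactive work.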
This slide from one of Intel’s presentations shows the progression from single data and single core processors, through the era of multi-core processors (which we are in now) and into the world of processors with many more cores on them than even the quad-core Intel parts announced this week.
One of the main reasons for the move to multi-core products is that Intel and other chip designers have kept increasing the number of transistors that fit in a given area, but haven’t alleviated the problems of power and heat. The gigahertz war ended poorly for Intel, which had to step back and redesign its CPUs from the ground up with that lesson in mind. Still, Moore’s Law applies, and by 2011 Intel estimates we’ll be seeing chips with over 32 billion transistors on them! But if we can’t increase the power of a single core by cranking up the frequency, what can we do with all those transistor resources? The answer: more cores. Many more.
For our discussions here, the term ‘terascale’ will refer to a processor with 32 or more cores. Moving away from the ‘large’ cores seen in the Core 2 Duo and Athlon 64 lines from Intel and AMD, the cores in a terascale processor will be much simpler (similar to what we see in the Cell processor design). These cores will be low power and probably based on a past-generation Intel architecture that has been refined and perfected. They can provide 4-5x better performance per watt and will scale beyond the limits of current-generation instruction-level parallelism.
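The performance-per-watt argument above can be sketched numerically: under a fixed power budget, many simple cores can out-run a few big ones. The per-core wattage and GFLOPS figures below are made-up assumptions chosen only to reflect the article's 4-5x efficiency claim.

```python
# Illustrative comparison of big vs. simple cores at a fixed power
# budget. Core specs are hypothetical, picked to give the simple
# core ~5x the performance per watt, as the article suggests.

POWER_BUDGET_W = 100

big_core = {"watts": 25.0, "gflops": 10.0}    # 0.4 GFLOPS/W
small_core = {"watts": 2.5, "gflops": 5.0}    # 2.0 GFLOPS/W (5x better)

def throughput(core, budget_w):
    """Aggregate GFLOPS from as many cores as the power budget allows."""
    n = int(budget_w // core["watts"])
    return n, n * core["gflops"]

n_big, flops_big = throughput(big_core, POWER_BUDGET_W)
n_small, flops_small = throughput(small_core, POWER_BUDGET_W)
print(f"big: {n_big} cores -> {flops_big} GFLOPS; "
      f"small: {n_small} cores -> {flops_small} GFLOPS")
```

With these assumed numbers, the same 100 W feeds 4 big cores at 40 GFLOPS or 40 simple cores at 200 GFLOPS; the catch, of course, is that the workload must be parallel enough to keep all those cores busy.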
You should not think of this merely as SMP on a single die; these are vastly different cores with new platform and software requirements. How so? How about memory bandwidth needs of 1.2 terabytes/s, compared to the 12 GB/s in current SMP systems? And what about latency of only 20 cycles for a terascale core, compared to 400 cycles on a modern SMP system? Now you can see the scale we are talking about here and the significance of (and hurdles facing) these designs.