Defining the Architecture
I should note here that although Intel is actively working on this technology, it is NOT a product in development. It may become one, but Intel has indicated that for now this is for research purposes only; something LIKE this will surely be seen in the coming years, but it may not have these exact specifications.
Sure, we are talking about processors with 100 or more cores on them, but what will those cores look like? This slide shows the past (single core processors), present (multi-core processors) and future. These CPUs, built from tens to hundreds of cores, will use something LIKE the current IA (Intel Architecture) but optimized for low power and multithreaded communication. The cores might share some cache and each have their own local cache, but we are talking about smaller amounts than the 4 MB we see in today’s quad core processors.
These new cores will be single threaded, but will still retain the raw processing power of their single, large core cousins (from years gone by). Think of each having the same 32 Gigaflops of performance, running at 4 GHz: if the large core processor has a die size of 21 mm^2 and the new small core takes up 6 mm^2, that works out to about 1.5 Gigaflops / mm^2 on the large core against a quoted 6.4 Gigaflops / mm^2 on the small core. (Dividing 32 by 6 actually gives about 5.3, so the 6.4 figure implies a core closer to 5 mm^2.) Now you can begin to see how scale will bring processing power (though with simplified instructions) to a terascale CPU.
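The density comparison is simple enough to check by hand, but here is a minimal sketch of the arithmetic in Python. It uses only the performance and die-size figures quoted above; the function and variable names are made up for illustration and do not come from Intel.

```python
def gflops_per_mm2(gflops: float, die_area_mm2: float) -> float:
    """Performance density: Gigaflops delivered per square millimetre of die."""
    return gflops / die_area_mm2


# Figures as quoted in the article: both cores are rated at 32 Gigaflops.
large_core = gflops_per_mm2(32.0, 21.0)  # large core on a 21 mm^2 die
small_core = gflops_per_mm2(32.0, 6.0)   # small core on a 6 mm^2 die

print(f"large core: {large_core:.2f} Gigaflops/mm^2")
print(f"small core: {small_core:.2f} Gigaflops/mm^2")
print(f"density advantage of the small core: {small_core / large_core:.1f}x")
```

Because both cores are assumed to deliver the same throughput, the density advantage is just the ratio of the two die areas.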
Also, just as we have seen and discussed with the Cell processor architecture, the Intel terascale technology allows for specific functionality to be built into some of the cores to create a CPU with a more specific task in mind. If your server is constantly generating images, it might be possible to create some custom cores to handle that function quickly and connect them on the terascale die, each one replacing a standard compute unit.
The technology to build these processors is also closer than you might think. Using next-generation digital CMOS circuits, Intel would be able to achieve very low power yet high performance processing. The same circuits would also enable a very high speed, point-to-point signaling interconnect (HyperTransport is a p-t-p interconnect, for example) that would need to deliver 20 Gb/s of IO or more!
One of the features that a terascale processor might bring to the table is built-in safeguards against core failure. If a CPU is built with 40 of these cores on it, but only 32 are enabled, the remaining 8 would be around as a fail-safe; essentially having ‘spares’, like the hot spares in a RAID array, should a core go bad or exhibit some failure-like behavior.
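To make the spare-core idea concrete, here is a toy model in Python. It is only a sketch under my own assumptions: the CorePool class, its mapping table and the first-available replacement policy are hypothetical, and nothing here describes how Intel would actually implement core sparing in hardware.

```python
class CorePool:
    """Toy model of a die with more physical cores than are exposed."""

    def __init__(self, physical_cores: int, enabled_cores: int):
        if enabled_cores > physical_cores:
            raise ValueError("cannot enable more cores than exist")
        # Logical core i starts out mapped to physical core i.
        self.mapping = list(range(enabled_cores))
        # Every remaining physical core is held back as a spare.
        self.spares = list(range(enabled_cores, physical_cores))

    def mark_failed(self, physical_id: int) -> int:
        """Swap a failed active core for a spare and return the spare's id."""
        logical_id = self.mapping.index(physical_id)  # ValueError if not active
        if not self.spares:
            raise RuntimeError("no spare cores left")
        replacement = self.spares.pop(0)
        self.mapping[logical_id] = replacement
        return replacement


# The 40-core die with 32 enabled cores from the example above.
pool = CorePool(physical_cores=40, enabled_cores=32)
replacement = pool.mark_failed(5)
print(f"physical core 5 replaced by spare {replacement}")
print(f"{len(pool.mapping)} cores still enabled, {len(pool.spares)} spares left")
```

The software still sees 32 logical cores after the failure; only the logical-to-physical mapping changes.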
Having such a large number of cores will also enable even more fine-grained dynamic power management: only the cores that are being used would need to be enabled. Also, the ability to move processing from a core that is running hard, creating hot spots on the CPU die, to a core farther away, would allow the CPU to manage its overall heat and safety better.
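Both ideas, powering only the busy cores and moving work away from a hot spot, can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the grid coordinates, the temperature readings, the 85 degree limit and the "farthest idle core" policy are all hypothetical, not anything Intel has described.

```python
from typing import Dict, List, Set, Tuple

Coord = Tuple[int, int]  # (row, column) position of a core on the die


def powered_cores(assignments: Dict[Coord, str]) -> Set[Coord]:
    """Only cores that currently hold a task need to be powered on."""
    return set(assignments)


def migrate_hottest(assignments: Dict[Coord, str], temps: Dict[Coord, float],
                    all_cores: List[Coord], limit_c: float) -> Dict[Coord, str]:
    """Move the task on the hottest busy core to the farthest idle core."""
    if not assignments:
        return {}
    hottest = max(assignments, key=lambda core: temps[core])
    idle = [core for core in all_cores if core not in assignments]
    if temps[hottest] <= limit_c or not idle:
        return dict(assignments)  # nothing is too hot, or nowhere to move

    def distance(core: Coord) -> int:
        # Manhattan distance from the hot spot across the grid.
        return abs(core[0] - hottest[0]) + abs(core[1] - hottest[1])

    # Prefer the farthest idle core; break ties by choosing the coolest one.
    target = max(idle, key=lambda core: (distance(core), -temps[core]))
    updated = dict(assignments)
    updated[target] = updated.pop(hottest)
    return updated


# A 4x4 die with two busy cores, one of them running hot in the corner.
cores = [(r, c) for r in range(4) for c in range(4)]
temps = {core: 50.0 for core in cores}
temps[(0, 0)] = 95.0
tasks = {(0, 0): "render", (0, 1): "decode"}

moved = migrate_hottest(tasks, temps, cores, limit_c=85.0)
print("tasks after migration:", moved)
print("cores that need power:", sorted(powered_cores(moved)))
```

A real scheduler would also have to weigh the cost of moving cache contents between cores, which this sketch ignores.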