At GTC, NVIDIA announced a new device called the DIGITS DevBox:
The DIGITS DevBox is a device that data scientists can purchase and install locally. Plugged into a single electrical outlet, this modified Corsair Air 540 case equipped with quad TITAN X (reviewed here) GPUs can crank out 28 TeraFLOPS of compute power. The installed CPU is a Haswell-E 5930K, and the system is rated to draw 1300W of power. NVIDIA is building these in-house as the expected volume is low, with these units likely going to universities and small compute research firms.
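For a rough sense of where that 28 TFLOPS number comes from (it is a single-precision peak figure, and the per-card numbers below are my own approximations from published TITAN X specs), here is a quick back-of-the-envelope check:

```python
# Rough sanity check of the quoted 28 TFLOPS (single-precision peak, not measured throughput).
# The per-card figures are approximations based on published TITAN X (Maxwell) specs.
cuda_cores = 3072             # CUDA cores per TITAN X
boost_clock_ghz = 1.075       # approximate boost clock
flops_per_core_per_cycle = 2  # one fused multiply-add counts as 2 FLOPs

per_card_tflops = cuda_cores * boost_clock_ghz * flops_per_core_per_cycle / 1000
system_tflops = 4 * per_card_tflops  # quad-GPU DevBox

print(f"~{per_card_tflops:.1f} TFLOPS per card, ~{system_tflops:.0f} TFLOPS across four cards")
# -> roughly 6.6 and 26 TFLOPS, in the same ballpark as NVIDIA's 28 TFLOPS claim
```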
Why would you want such compute power?
DIGITS is a software package available from NVIDIA. Its purpose is to act as a tool for data scientists to build and manipulate deep learning environments (neural networks). This package, running on a DIGITS DevBox, gives far more compute capability to the scientists who need it for their work. Getting this tech into the hands of more scientists will accelerate the field and lead to what NVIDIA hopes will be a ‘Big Bang’ in this emerging GPU-compute-heavy area.
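DIGITS itself is a web front end that drives training frameworks such as Caffe, so there is no single API to quote here; purely as an illustration of the kind of computation it orchestrates, here is a minimal sketch (all data and numbers are made up for the example) of training a tiny neural-network layer on the CPU with NumPy:

```python
# Minimal illustration of the kind of computation a deep-learning tool like DIGITS orchestrates:
# a single dense layer trained by gradient descent on synthetic data (NumPy only, CPU).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 32))            # 256 samples, 32 features
true_w = rng.normal(size=(32, 1))
y = (X @ true_w > 0).astype(float)        # synthetic binary labels

w = np.zeros((32, 1))
for step in range(500):
    logits = X @ w
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid activation
    grad = X.T @ (probs - y) / len(X)      # gradient of cross-entropy loss
    w -= 0.5 * grad                        # gradient-descent update

accuracy = ((X @ w > 0) == (y > 0.5)).mean()
print(f"training accuracy: {accuracy:.2%}")
```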
More from GTC is coming soon, as well as an exclusive PC Perspective Live Stream set to start in just a few minutes! Did I mention we will be giving away a Titan X???
**Update**
Ryan interviewed the lead developer of DIGITS in the video below. This offers a great explanation (and example) of what this deep learning stuff is all about:
Without ECC memory, and with the TSX errata unfixed on this CPU’s SKU, no academic institution or brokerage house will use this for any research or trading. They will also go for the proper GPUs, the ones with the most 64-bit FP units and certified drivers; anything less than a Quadro/FirePro will not be used. They may use the TITAN Z (with expensive certified drivers), but not any gaming SKU, and it will have to come with error-correcting memory. It’s better to spend $100,000+ on the proper equipment than to jeopardize government-funded academic research or risk the financial ruin of a brokerage firm. It’s better to get the proper Xeon/POWER8 CPU SKUs, and GPU SKUs with certified drivers, even for non-graphics workloads.
This build is nothing more than a gaming rig, and it lacks the error correction needed for any research or professional use. Even 28 TFLOPS of 64-bit FP would not be enough for many scientific workloads, and those workloads need maximum data and error-correction protection.
How do you know it doesn’t have ECC memory?
Copy and paste the processor make and model number into Google, go to the Intel ARK database/knowledge base, and read the quick-reference spec sheets. You are not getting ECC memory on any non-Xeon SKU. Oh, and be sure to read up on the TSX instruction errata for the Haswell microarchitecture. Learn to research!
With the GPUs doing the bulk of the work, this device/software combination is not going to rely on synchronized transactions at the CPU level. Its primary target is single-precision workloads, which do not need ECC protection either. Further, this box is $10K and is *built and supported by NVIDIA*. Sure, you can custom build whatever you want all day long, but equivalent enterprise hardware with a service contract is going to far exceed that price point for the same horsepower and footprint. Don't go comparing this to a cluster, either, because that's not what it is for.
Would this same type of scenario apply to the new 2013 Mac Pro as well, since ECC memory was left out on its GPUs? It has always bothered me, and some additional clarification on this would be appreciated. I have wondered if this was the real reason Apple was charging so little for their FirePro GPU upgrades.
Thanks!
Really… -_- , no Apple Trash Can please.
They are charging so little simply because it costs them nothing! The same thing applies to all NVIDIA GPUs, if you want the truth… you should know that these chips in fact do NOT cost them even $10 to make. What the shitty NVIDIA, Apple, and Intel are doing is only possible because there is no real competition in the market… so they simply rip people off, play around with these idiot teenagers, and take more money.
The Apple trash crap doesn't cost nothing, btw… it is really unfairly overpriced! So FUCK YOU APPLE!
“More compute power capability to scientists” is what the article states. The GPU(s) will be used as accelerators alongside the CPU, and what both server-grade CPUs and professional-grade GPUs need and provide is error correction. The driver software for professional and academic-research-grade GPU acceleration must be vetted and as error-free as possible, and those server/supercomputing GPU accelerators have extra error-correcting functionality added/enabled for their memory and memory subsystems, and for their driver software, that consumer gaming GPU SKUs do not.
Scientists run on strict budgets, research grants, and strict peer review; no scientist is going to risk their professional career on this hardware and its uncertified driver software. Maybe this SKU will have some beginner’s training usage, but perhaps TSX transactional memory capability is exactly what is needed for training programmers on transactional-memory-capable systems, and with this device’s CPU SKU the errata prevents that usage as well. Scientists, and the professional computer scientists who support scientific computing, will always get the funding necessary for the proper, certified computing equipment; there is no other choice or substitute. Those research grants run into the millions of dollars, and the funding organizations/GAO are very strict about the vetting requirements for the computing equipment.
Nvidia’s CEO is way out there in marketing dreamland on this one!
Many of these come with grants… I know quite a few friends using NVIDIA/Intel-awarded assets for their PhD research. A university rarely just goes out and purchases them based on a press release. They usually get a free trial run before putting down any order. Folks know exactly what they are going to use it for before buying.
Btw, I think you completely ignored software error recovery. There are huge chunks of the research space where a few bit flips are easily detected, corrected, or lost in the noise. Many statistical methods don’t converge every time anyway, and have huge precision tolerance. Even medical research doesn’t always mandate FP64, because the imaging system (and thus the data source) is not that precise yet anyway. AI, graphics, and computer vision are exactly the kinds of applications where ECC doesn’t matter much; just search the volume of papers published with data from consumer cards. Cost is still very important when a Tesla is 3-4 times more expensive for very few benefits in most cases.
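As one concrete, simplified illustration of the software error recovery mentioned above (the fault model and rejection threshold here are my own assumptions, not anything from NVIDIA or the commenter), an iterative solver can sanity-check each update so that a rare corrupted value is discarded rather than wrecking the run:

```python
# Sketch of software fault tolerance in an iterative method: gradient descent on a
# least-squares problem, with an occasional simulated memory fault corrupting the
# gradient and a simple plausibility check discarding the bad update.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 20))
x_true = rng.normal(size=20)
b = A @ x_true

x = np.zeros(20)
step = 1e-3
prev_norm = np.inf
for _ in range(2000):
    grad = A.T @ (A @ x - b)
    if rng.random() < 0.01:            # simulate a rare bit flip / memory fault
        grad[rng.integers(20)] *= 1e6  # one wildly corrupted component
    if np.linalg.norm(grad) > 100 * prev_norm:
        continue                       # software check: reject an implausible update
    prev_norm = np.linalg.norm(grad)
    x -= step * grad

print("final residual:", np.linalg.norm(A @ x - b))  # still tiny despite the injected faults
```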
They really should just start supporting ECC everywhere. I have had files corrupted at home before because of bad system memory. It wasn’t bad enough to cause crashes, but it still corrupted files before I ran a memory test and found it. I don’t see why it is acceptable for system memory to be unprotected. ECC memory is not that much more expensive.
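On the silent-corruption point above: without ECC, the cheapest practical defense is checksumming the files you care about so that corruption is at least detected; a minimal sketch (the paths are placeholders, purely for illustration) might look like this:

```python
# Minimal sketch: record SHA-256 checksums for files so silent corruption can be detected later.
# (The paths are placeholders; this illustrates the idea, it is not a backup tool.)
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest()

def snapshot(paths, manifest="checksums.json"):
    """Write a manifest of current checksums."""
    Path(manifest).write_text(json.dumps({str(p): sha256_of(Path(p)) for p in paths}, indent=2))

def verify(manifest="checksums.json"):
    """Report any file whose contents no longer match the recorded checksum."""
    recorded = json.loads(Path(manifest).read_text())
    for name, digest in recorded.items():
        if sha256_of(Path(name)) != digest:
            print(f"CORRUPTED OR CHANGED: {name}")

# Example: snapshot(["photos/img001.raw", "thesis/data.csv"]); later, verify()
```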
Really, software error recovery? How is the software going to do that without totally bogging down the system? Why do you think they invented hardware error correction in the first place? If the vision processor and its software are running on an autonomous piece of equipment, and every millisecond means the difference between the correct move and running into the side of a building, which expensive hardware do you think the scientific folks are going to choose? Spend $100,000 on the proper professional GPU and CPU hardware and drivers, with proper hardware error correction and driver certification, or pay millions in damages, just because the error-correction circuitry costs a few extra thousand? The software error correction (as you state) may be inexpensive, but it may not catch the error quickly enough to stop a 5-ton piece of autonomous equipment from running into the corner convenience store and crushing little Debby and her mother!
Intel and NVIDIA awarding assets for education is not in question here; this non-professional equipment is! And do not expect the scientific community, the people who run the grants and funding, or the liability and risk-management departments that all educational institutions employ to ever approve of anything but the proper computing equipment for the task. It will be written into the grant proposal, and no scientist or scientific computing support staff will take the unnecessary risks involved; the liability risks outweigh the savings. ECC matters on ALL medical equipment, you can be damn sure of that, and the medical equipment makers’ insurers will make damn sure of that!
Your spin/damage control has no basis in reality, and your anecdotal “I know quite a few friends using NVIDIA/Intel awarded assets for their PhD research” provides no valid direct or indirect evidence that those assets were anything other than the professional-grade, expensive-for-a-reason (liability reasons) CPU/GPU kit. No sane PhD professor, or PhD candidate, would risk their scientific or professional work on any non-professional-grade CPU/GPU hardware.
NVIDIA’s CEO is just trying to spin up some false goodwill and whitewashing, and possibly pawn off some of his excess SKU stock on unsuspecting noobs! For sure the Intel SKU with the TSX errata will not be used, especially for transactional-memory software development, and the non-professional gaming-grade GPU SKUs will not see wide use in any academic setting. If I were purchasing any CPU/GPU SKUs for professional or academic use, they would have to have the widest possible professional functionality across all academic disciplines, so the equipment could be used by every department! The department currently using the device may not need FP64, but what about the other department that may receive and need this equipment in the future? Did you consider the lifetime usage of a piece of computing equipment in an academic setting, and its usability across multiple disciplines?
Nobody cares about ECC in scientific research – it’s only needed for things like autopilots in planes. There are many, many examples of universities installing GeForce hardware for research. Indeed, we run our national supercomputers’ K80s with ECC off all the time to get better performance and lower power consumption.
FUCK YOU NVIDIA! FUCK YOU…
DIE F@G, DIE AMD.
He was probably just quoting Linus Torvalds, who said the very same thing:
“F*ck you, Nvidia.”
12 of these in parallel would get you onto the Top 500 supercomputer list. 😐
Not really, not without comparable FP64+ capability. Look beyond the big FLOPS number: what precision are those stated 28 TFLOPS? The marketing copy does not say. WOW, “GPUs can crank out 28 TeraFLOPS of compute power” – what a sales pitch. What about the precision of those 28 TeraFLOPS? I’ll bet the Top 500 supercomputer list is ranked on at least FP64, if not figures in the range of FP128, and NVIDIA’s marketing folks are not known for supplying the most complete and accurate numbers.
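For a rough sense of scale (the figures below are approximations on my part: the Maxwell TITAN X runs FP64 at 1/32 of its FP32 rate, and the 2015 Top 500 entry point was on the order of 150+ TFLOPS of measured FP64 Linpack):

```python
# Back-of-the-envelope check of the "12 DevBoxes on the Top 500" idea.
# Figures are approximations: 28 TFLOPS is NVIDIA's FP32 claim per box, the Maxwell
# TITAN X runs FP64 at 1/32 of its FP32 rate, and the Top 500 entry point in 2015
# was roughly 150+ TFLOPS of *measured* FP64 Linpack.
fp32_per_box_tflops = 28.0
fp64_per_box_tflops = fp32_per_box_tflops / 32   # ~0.9 TFLOPS FP64 peak per box
boxes = 12

fp64_total = boxes * fp64_per_box_tflops         # ~10.5 TFLOPS FP64 peak
top500_cutoff_tflops = 150.0                     # approximate 2015 list entry point (Rmax)

print(f"FP64 peak for {boxes} boxes: ~{fp64_total:.1f} TFLOPS "
      f"(Top 500 cutoff ~{top500_cutoff_tflops:.0f} TFLOPS)")
```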
3.5 = 4