AMD Enters Machine Learning Game with Radeon Instinct Products
AMD is launching Polaris, Fiji and Vega based GPU cards for machine learning.
NVIDIA has been diving into the world of machine learning for quite a while, positioning itself and its GPUs at the forefront of artificial intelligence and neural net development. Though the strategies are still filling out, I have seen products like the DIGITS DevBox place a stake in the ground of neural net training and platforms like Drive PX perform inference tasks on those neural nets in self-driving cars. Until today AMD has remained mostly quiet on its plans to enter and address this growing and complex market, instead depending on the compute prowess of its latest Polaris and Fiji GPUs to make a general statement on their own.
The new Radeon Instinct brand of accelerators based on current and upcoming GPU architectures will combine with an open-source approach to software and present researchers and implementers with another option for machine learning tasks.
The statistics and requirements that come along with the machine learning evolution in the compute space are mind boggling. More than 2.5 quintillion bytes of data are generated daily and stored on phones, PCs and servers, both on-site and through a cloud infrastructure. That includes 500 million tweets, 4 million hours of YouTube video, 6 billion Google searches and 205 billion emails.
Machine intelligence is going to allow software developers to address some of the most important areas of computing for the next decade. Automated cars depend on deep learning to train, medical fields can utilize this compute capability to more accurately and expeditiously diagnose and find cures to cancer, security systems can use neural nets to locate potential and current risk areas before they affect consumers; there are more uses for this kind of network and capability than we can imagine.
The Radeon Instinct initiative from AMD will utilize specifically built hardware accelerators along with AMD built ROCm software stacks to build machine learning frameworks and applications that are open and easily utilized by customers.
ROCm is an open-source HPC/Hyperscale-class platform for GPU computing that’s also programming-language independent. We are bringing the UNIX philosophy of choice, minimalism and modular software development to GPU computing. The new ROCm foundation lets you choose or even develop tools and a language run time for your application.
Like any reasonable compute initiative for machine learning, ROCm is built to support many GPUs, both inside a single system and across a many-server environment. It can simplify the stack through RDMA peer-sync. Besides being built for massive scaling, it includes compilers, language run times and, interestingly (and importantly), CUDA-application support. (CUDA being the NVIDIA-developed GPGPU programming platform.)
While I am still learning about this industry myself, the current configurations generally fall into two categories of workloads: training and inference. Training is accomplished with many high-performance servers and is the most time-consuming part of the process. NVIDIA once made claims that its GPUs were capable of a 14x speed up in this process over a comparable CPU, and that was back in 2014!
The inference part is aimed at using that built-up network of data from training. This could be in the form of cameras and a GPU for automated driving or drones using DNNs for impact avoidance. This portion is still accelerated by GPUs but doesn’t require as much relative horsepower to get the job completed.
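To make the training versus inference split concrete, here is a minimal sketch in plain Python. It is a toy one-weight model, not anything AMD or NVIDIA ships; the data, learning rate and pass count are hypothetical values chosen for illustration. The point is only that training repeats a forward and backward step many times over the data, while inference is a single forward pass with the finished weight.

```python
# Toy model y = w * x fitted with gradient descent (pure Python, no GPU).
# The data, learning rate and epoch count are hypothetical illustration values.

def train(samples, epochs=200, lr=0.01):
    """Training: many passes over the data, adjusting w after each pass."""
    w = 0.0
    passes = 0
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
        passes += 1
    return w, passes


def infer(w, x):
    """Inference: one forward pass using the already-trained weight."""
    return w * x


data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]  # samples of y = 3x
weight, passes = train(data)
prediction = infer(weight, 5.0)
print(f"trained weight {weight:.3f} after {passes} passes")
print(f"prediction for x=5: {prediction:.2f}")
```

Real networks have millions of weights instead of one, which is why the repeated training passes are where the bulk of the GPU time goes.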
AMD is announcing three accelerator cards for Instinct today, two for inference and one for training. The MI6 is a Polaris GPU based card with 5.7 TFLOPS of peak compute when measured in FP16 half precision math. It has 16GB of GDDR5 memory and uses about 150 watts of power. The 224 GB/s memory bandwidth specification indicates that we are looking at a Polaris 10 GPU like the one found on the RX 480. AMD claims that all the Instinct cards are passively cooled, which is technically accurate, but when you move the fans from the card to the server chassis, that’s a nebulous claim at best.
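The MI6 numbers can be sanity checked with some back-of-the-envelope arithmetic. The sketch below assumes Polaris 10 figures that are not stated in the announcement: 2304 stream processors, a 256-bit memory bus and 7 Gbps GDDR5, with one fused multiply-add (two FLOPs) per stream processor per clock. Treat those inputs as assumptions rather than confirmed MI6 specifications.

```python
# Back-of-the-envelope peak figures for a Polaris 10 class card.
# ASSUMPTIONS (not from the announcement): 2304 stream processors, a 256-bit
# memory bus and 7 Gbps GDDR5, i.e. public RX 480 style figures.

def peak_tflops(shaders: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak rate assuming one fused multiply-add (2 FLOPs) per shader per clock."""
    return shaders * clock_mhz * 1e6 * flops_per_clock / 1e12


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Bytes moved per second across the whole bus, in GB/s."""
    return bus_width_bits / 8 * data_rate_gbps


def implied_clock_mhz(tflops: float, shaders: int, flops_per_clock: int = 2) -> float:
    """Clock speed needed to reach a quoted peak rate."""
    return tflops * 1e12 / (shaders * flops_per_clock) / 1e6


SHADERS = 2304  # assumed Polaris 10 stream processor count
mi6_bandwidth = memory_bandwidth_gbs(256, 7.0)
mi6_clock = implied_clock_mhz(5.7, SHADERS)
print(f"bandwidth: {mi6_bandwidth:.0f} GB/s")
print(f"implied clock for 5.7 TFLOPS: {mi6_clock:.0f} MHz")
```

Under those assumptions the bandwidth works out to the quoted 224 GB/s, and the 5.7 TFLOPS rating implies a clock a little below the RX 480's boost speed.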
The MI8 accelerator is based on the same Fiji GPU implementation found on the Radeon R9 Nano, with a small form factor design that might help it find its way into unique system configurations. It has 8.2 TFLOPS of FP16 performance and 512GB/s of memory bandwidth, though it is still limited to 4GB of memory because of the HBM integration.
Obviously the one of interest is the MI25, an accelerator with much higher performance aimed at training. This one uses one of the yet to be announced Vega 10 GPUs based on AMD’s upcoming Vega architecture. AMD was very tight lipped about specifications and performance of this card (we don’t even know how much memory the card will have) though we can infer estimated peak compute based on full system performance metrics. Based on servers that were shown, a Vega 10 GPU will likely have 12.5 TFLOPS of single precision (FP32) compute. In comparison, the Titan X Pascal based on GP102 has 11.0 TFLOPS of rated performance, so in theory, Vega 10 should exceed that. One thing to keep in mind: at least in prior AMD GCN architectures, the ratio of TFLOPS to performance has been higher for AMD than NVIDIA. (NVIDIA cards tend to offer better in-game performance at the same theoretical peak rated compute throughput.)
Instinct graphics cards are going to be built and supported by AMD directly, taking a page out of what NVIDIA has done with most of its non-consumer graphics lines. This should give AMD more control over the messaging and branding for this line, something that system integrators spending millions of dollars on machine learning can appreciate.
Along with the hardware release comes MIOpen, a library for deep learning built by AMD to take advantage of the GCN architecture. MIOpen sits at the same level in the stack as C++ STL, NCCL and others, bridging between the ROCm platform and programming languages to the common frameworks like Caffe, TensorFlow, etc.
AMD did supply one relative performance metric using DeepBench GEMM, a common benchmark for deep learning systems. Compared to the Titan X Pascal card, the MI8 (based on the Radeon R9 Nano) comes in just slightly ahead. The MI25 using Vega is about 30% faster than NVIDIA’s Titan X Pascal. As far as I can tell, this benchmark used FP32 data types rather than FP16, so it’s not directly taking advantage of the double packed math capabilities that Vega NCUs offer. (To be clear, the FP16 performance of the Pascal-based Titan X is awful, with a 1:64 ratio to single precision math.)
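For a sense of what a GEMM benchmark stresses, the sketch below counts the floating point operations in one dense matrix multiply and divides by each card's peak rate. The 4096 matrix size is hypothetical, DeepBench's actual kernel sizes and timing method are not reproduced here, and the doubled Vega FP16 rate is an assumption based on the packed math claim rather than a published specification. Real kernels sustain only a fraction of peak, so these are lower bounds on run time.

```python
# FLOP count for a dense matrix multiply C = A (MxK) @ B (KxN), and the
# best-case time at a given peak rate. Matrix sizes are hypothetical.

def gemm_flops(m: int, n: int, k: int) -> int:
    """Each of the M*N outputs needs K multiplies and K adds."""
    return 2 * m * n * k


def best_case_ms(flops: int, peak_tflops: float) -> float:
    """Lower bound on run time if the GPU sustained its peak rate."""
    return flops / (peak_tflops * 1e12) * 1e3


TITAN_X_FP32 = 11.0               # TFLOPS, rated figure quoted in the article
TITAN_X_FP16 = TITAN_X_FP32 / 64  # 1:64 FP16 ratio quoted in the article
VEGA_FP32 = 12.5                  # TFLOPS, the article's estimate
VEGA_FP16 = VEGA_FP32 * 2         # ASSUMPTION: packed math doubles the FP16 rate

work = gemm_flops(4096, 4096, 4096)
rates = [
    ("Titan X Pascal FP32", TITAN_X_FP32),
    ("Titan X Pascal FP16", TITAN_X_FP16),
    ("Vega 10 FP32 (estimate)", VEGA_FP32),
    ("Vega 10 FP16 (assumed)", VEGA_FP16),
]
for name, rate in rates:
    print(f"{name}: {rate:.3f} TFLOPS peak, {best_case_ms(work, rate):.2f} ms best case")
```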
Not letting an opportunity slide by them, AMD did use the Instinct announcement to show the upcoming Naples platform, the Zen architecture implementation for servers. No details were given, but the claim that Naples is “optimized” for GPU and accelerator throughput likely just points to an increase in available PCIe bandwidth and connectivity.
AMD’s stance on systems, differing from what NVIDIA has shown with the GP100 to this point, is to provide the hardware to system designers and let them create a device custom tailored for machine learning. The NVIDIA DGX-1 offers a stunning amount of performance, but it comes at a cost of $129,000 – AMD balked at the claim that compute capability should be priced that high. Partners were on hand and on stage to talk about what working with AMD Instinct should bring and showed off a few system integrations as well.
All three of the above systems use Instinct MI25 Vega-based GPUs and will vary in price along with their impressive stated compute (FP16) rates. Researchers able to get 3 petaflops of compute capability in a standard 42U rack design will have plenty of horsepower to develop the next generation of deep neural networks.
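As a rough check on that rack-level figure, the following sketch works out how many accelerators a 3 petaflop rack would need. It assumes 25 TFLOPS of FP16 per MI25, which is the article's 12.5 TFLOPS FP32 estimate doubled for packed math; AMD has not confirmed a per-card number, so the result is an estimate only.

```python
import math

# ASSUMPTION: per-card FP16 rate is the estimated 12.5 TFLOPS FP32 doubled
# by packed math. AMD has not published a per-card figure.
VEGA_FP32_TFLOPS = 12.5
PACKED_FP16_FACTOR = 2
RACK_TARGET_PFLOPS = 3.0   # stated rack-level FP16 figure
RACK_UNITS = 42

per_gpu_fp16 = VEGA_FP32_TFLOPS * PACKED_FP16_FACTOR
gpus_needed = math.ceil(RACK_TARGET_PFLOPS * 1000 / per_gpu_fp16)
gpus_per_unit = gpus_needed / RACK_UNITS

print(f"assumed FP16 rate per GPU: {per_gpu_fp16:.1f} TFLOPS")
print(f"GPUs needed for {RACK_TARGET_PFLOPS:.0f} PFLOPS: {gpus_needed}")
print(f"average density: {gpus_per_unit:.2f} GPUs per rack unit")
```

Under that assumption the rack would hold on the order of 120 cards, or close to three per rack unit on average.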
This is a move that AMD and the Radeon Technologies Group needed to make. Though the world of machine learning is never going to eclipse the consumer or professional markets in terms of unit sales, the profit margin is incredibly high on configurations built for it. Also, the name recognition and halo effect that comes from being the leader in the training space will trickle down into inference platforms as well as to other markets that do directly interact with business and consumer markets. Building up the Instinct brand makes business sense, marketing sense and competitive sense, and it seems likely that AMD can impact DNN and machine learning fields with its combination of existing and upcoming GPU hardware and a push for an open, community driven software ecosystem.
Instinct, I like that name.
Amd article starts with the word Nvidia…
Yes they tend to do that a lot here. Tarnish another AMD good news story by kissing Nvidia’s ass in the opening statement.
For better or worse AMD does not exist in a vacuum. NVIDIA is still the big name in deep learning, but it is good that AMD is starting to catch up.
Sometimes I vacuum my video cards. That’s probably not what you’re talking about though.
The best part about that is when your vacuum gets a nice static charge going and then you touch your video card. Awesome results!
Well Nvidia is the competition and AMD will be competing with Nvidia so what’s the fuss about! Really why are people so emotionally engendered to a GPU parts supplier, any GPU parts supplier!
This is good news that AMD is competing with Nvidia in the Server/HPC/workstation and machine learning market. Those GPU accelerator parts from AMD are going to be more affordable and Nvidia will have to lower its prices also! AMD has its Zen CPUs to compete in the market also.
How is mentioning AMD’s main competitor any form of “Ass Kissing”? AMD already has some wins in the HPC/Server markets in China(1)! Watch AMD get more Zen CPU as well as Radeon Pro WX, and now Radeon Instinct, GPU accelerator business. Zen and Radeon GPU package deals will be incoming! AMD is a very lean operation with more pricing latitude across its x86 CPU, GPU, and motherboard chip-set pricing. Watch that package pricing latitude net AMD some CPU, GPU and motherboard chip-set market share in the Server/HPC/workstation, machine learning, and consumer markets.
“AMD Cloud Efforts Get Boost From Alibaba
Chinese e-commerce company to use AMD graphics chips as part of its cloud infrastructure services ”
Can’t really expect otherwise here…
Radeon “Instink”. I like the name too.
And the gaming GIT chimes in! Do those webbed hands get in the way when you are using a game controller?
Git is a mild pejorative with origins in British English for an unpleasant, silly, incompetent, stupid, annoying, senile, elderly or childish person. It is usually an insult, more severe than twit or idiot but less severe than wanker, arsehole or twat.
The word git first appeared in print in 1946, but is undoubtedly older. It is originally an alteration of the word get, dating back to the 14th century. A shortening of beget, get insinuates that the recipient is someone’s misbegotten offspring and therefore a bastard. In parts of northern England, Northern Ireland and Scotland get is still used in preference to git; the get form is used in the Beatles song “I’m So Tired”.
The word has been ruled by the Speaker of the House of Commons to be unparliamentary language.
The word was used self-deprecatingly by Linus Torvalds in naming the Git version control system.
You’re definitely worse than a git then. You resort to childish insults more than anyone ranting on about JHH and all the green goblins and such. Apologies if you aren’t that anonymous. But I don’t really care what you think anyways. It’s just a bit of fun you just don’t git. LOL
Nvidia is your imprinted mother, you must defend your mother at all costs! You are the bog standard gaming git that thinks that Nvidia (your MOM) needs to be defended. Maybe you can study up on markets and technology but your lack of gray matter prevents that. Oh, HBM/HBM2 was not all that in your mind until Nvidia announced that it would be getting on the HBM/HBM2 bandwagon. You need to go on over to the ESPN blogs and stay there because GPU technology is not a sport; it’s a competition but not a sports match. If you need to have a definite winner fix at all costs to assist your fragile self image then sports and sports blogs are the way for you!
You get these replies because your posts invite them, and both AMD and Nvidia are technology companies that supply parts to the OEM/AIB market. They employ engineers and PhDs and both support the same standards bodies: JEDEC, VESA, USB-IF, etc. You need to learn to separate out the marketing drivel from reality and start learning about computer science, and then you may be able to have a proper discussion on complicated technology related subjects. Childish insults are more in line with your constant uninformed opinions about markets/market players that you have for some very sycophantic reason sided against, so how does it feel to be on the receiving end! Get over yourself!
You keep defending a brand, I’ll defend technology and technological advancement no matter the maker! AMD has some very interesting technology that even Nvidia has started to use. And I like Nvidia’s Denver CPU technology, it has its uses also! You will see Nvidia adopting more of AMD’s innovations and AMD likewise the other way, provided Nvidia gets the technology accepted by an industry standards body. That proprietary lock-in that Nvidia is trying to force on its clients will not keep any market bound to Nvidia’s grasp; only innovation driven by fair market competition will be in store for Nvidia going forward! Ditto for Intel!
Yes we all know you’re a poor troll
I like you, let’s be friends.
Besides if I really wanted to be mean I’d say the name sucks because instinct is inborn and is in no way impacted or changed by learning.
Quoted directly from Webster’s: n. “A specific, complex pattern of responses by an organism, supposedly inherited, which is quite independent of any thought processes.” Or, pop., “The ability to form a judgement without using the reasoning process.”
Both of these may apply to almost every post made by anonymous.
But machine learning is all about organic learning processes what with all the synapse simulations going on! So somewhere in all that instinctual innately programmed intelligence than needs to exist for species survival the real sentient intelligence is an emergent function of that basic synaptic functionality. And inborn is innate and not to be confused with the more negatively connoted inbred into a smaller simian gene pool gaming git variety that leads to biological regressions and webbed feet and hands along with single digit IQs and that sycophant need to be associated with a winning brand to shore up a fragile ego and definite superego conflict!
Edit: than needs to exist
to: that needs to exist
Get over yourself, megalomaniac. You think you’re superior to everyone else because you’re not a gamer. I have no ego problems whatsoever. I like Nvidia’s products and features.
Go sleep with your Radeon instinct card when it becomes available. It will probably be your only friend. But somehow I don’t think your nocturnal emissions will be covered under warranty.
Is a “Shrout out” really a thing now?
Paul thinks it is apparently.
With ML being integrated into cloud providers, and the push to make it a service, it is very likely to outpace consumer discrete GPU sales in the next 10 years.
That’s right and the ML, cloud, HPC and other business/professional markets can afford to pay better markups to AMD for these Pro WX/Instinct SKUs. So look for the Pro markets to pay for a whole lot of the R&D costs for AMD’s consumer SKUs, as the professional markets will give AMD the real revenues to really innovate even more.
Consumers cannot write off any GPU/CPU expenses on their income taxes, but the professional markets will pay higher markups and write the expenses off. AMD’s CPU and GPU R&D costs are about to be paid for mainly by that professional market, letting AMD price its consumer SKUs even lower! All those big professional market revenues will fund R&D, in addition to the Government exascale R&D grants, and that will produce the CPU/GPU/Other AMD IP that will find its way into AMD’s consumer CPU/GPU/Other SKUs! Great times are ahead, AMD is back in the game!
It will happen in sooner than 10 years for total Revenues produced by the professional markets for AMD, and maybe for unit sales also in the professional markets relative to the consumer markets! AMD has got Zen server SKUs to package price with its Pro WX/Instinct GPU SKUs for some great deals!
“This is a move that AMD and the Radeon Technologies Group needed to make.” So true Shrout.
Where you be, Derz? The channel is not the same without you.
What is AMD’s investor’s instinct telling them right now?
10.67 +0.33 (3.19%)
And what is Nvidia’s investor’s instinct telling them right now?
88.68 -3.14 (-3.42%)
What’s the instinct today. AMD down 10.56 -.12 -1.16%
Nvidia on the other hand 91.51 +1.92 +2.14%.
Wait, the RX 480 has 5.83 TFLOPS of single precision compute, so this Radeon Instinct card is only using its native half precision (16 bit) math units for machine learning?
Interesting Polaris whitepaper and some slides from Tech Day:
“Dissecting the Polaris Architecture”
“Radeon Polaris Tech Day
The Polaris Architecture”
Apparently AI data sets are 16 bit, so that’s why.
Oh well maybe in the new Vega micro-arch AMD can make the 32 bit units do 2 16 bit computations. Maybe the Instinct SKUs are fabbed without any 32 bit units to save power/space or maybe the 32 bit units can be used to do some 16 bit workloads and truncate the high order 16 bit part of the values if these parts are in fact made from an Ellesmere die. Maybe these are binned parts, who knows.
Well the MI25 card will be based on Vega and also support packed FP16 math, so Vega’s micro-arch can do 2 16 bit operations. I’m not sure if that packed 16 bit math is done in a 32 bit unit but it sounds like that’s what is happening on Vega!
“The most exciting Radeon Instinct card, the MI25, echoes the debut of Nvidia’s GP100 GPU on the Tesla P100 accelerator. This card is the first product with AMD’s next-generation Vega GPU on board. Most of the details of Vega and its architecture remain secrets for the moment, but we can say that the Vega GPU on board will offer support for packed FP16 math. That means it can achieve twice the throughput for FP16 data compared to FP32, an important boost for machine-learning applications that don’t need the extra precision. The Polaris and Fiji cards in the Radeon Instinct line support native FP16 data types to make more efficient use of register and memory space, but they can’t perform the packed-math magic of Vega to enjoy a similar performance speedup.”(1)
“AMD opens up machine learning with Radeon Instinct
Vega lights the way”
Machines CANNOT learn…anything ever. Looks like AMD is bullshitting us all with marketing mumbojumbo.
If you are going to call out the “marketing mumbojumbo” for AMD, then you need to do it for everyone, here are a couple of big players doing the exact same thing:
If 16-bit is for deep learning, then is 32-bit+ for deeper learning?
64-Bit for Deepest Learning … PRO?
64 bit is for nuclear weapon design!
2.5 quintillion bytes of data every day and 99% of it is completely irrelevant.
We need machine learning to find out why 2 quintillion bytes of that data is pictures of butts. Or maybe not.