Microsoft Copilot Programming Aide Leaking Secrets And Ignoring Licenses?
We Didn’t Steal Your Code, It Was The AI!
Microsoft's GitHub Copilot is a very interesting tool, with a bit of a larcenous streak. It is a programming aide that can turn natural language into working code, offered mainly as an extension for editors such as Visual Studio Code and able to produce snippets in a variety of languages. Microsoft built it on OpenAI's Codex model, which was trained on billions of lines of code to be able to do this trick. The problem seems to be that nobody ever trained it to read licenses.
The code Microsoft Copilot was trained on came from a variety of open source software repositories, and since Microsoft bought GitHub back in 2018 it is an easy guess as to which repository much of the code came from. There is a small problem, however, which has prompted what could be a rather large lawsuit.
A lot of the code it was trained on, and which it liberally reuses when translating natural language into code, is covered by the GPL, Apache, MIT and other OSS licenses. Those licenses require that the original author be credited when the code is reused, and Microsoft Copilot provides no such attribution, even when the snippets are longer than 150 characters. To make matters worse, some of the code it grabbed contained secrets that had been published in public repositories but were never meant for general consumption.
It will be interesting to see how this plays out.
The tool was trained with machine learning using billions of lines of code from public repositories and can transform natural language into code snippets across dozens of programming languages.
More Tech News From Around The Web
- New Go-Playing Trick Defeats World-Class Go AI, But Loses To Human Amateurs @ Slashdot
- Microsoft feels the need, the need for speed in Teams @ The Register
- Accidental discovery produces superfluorescent light at room temperature @ Physics World
- The all liquid-cooled colo facility rush has begun @ The Register
- Microsoft is Showing Ads in the Windows 11 Sign-Out Menu @ Slashdot
- Qualcomm vs Arm: The bizarro quotient just went off the scale @ The Register
- Elgato HD60 X External Capture Card @ Tweaktown
So Steve Ballmer was right after all – GPL is cancer that attaches itself to the code