Copilot was not trained only on permissively licensed code. It was trained on all public repos, even where the code is fully copyrighted (which is the default absent a more permissive license).
If the copyrighted code was uploaded to GitHub by the owner, there's no problem with this. When you upload code to GitHub, one of the rights that you grant to GitHub is the right to use your content for "improving the Service over time". See D.4. License Grant to Us in the GitHub Terms of Service. Once it is up there, you also grant other users certain rights, like viewing public repos and forking repos into their own copies. See D.5. License Grant to Other Users. Even with the most restrictive protections in place, using GitHub requires you to give up certain rights.
A question would be if creating and training Copilot is "improving the Service over time". I would suspect that it would be, though.
There are still some open questions around what happens when Copilot suggests code verbatim, but these are mostly for the users of Copilot. That said, I would hope that GitHub is thinking about offering information so that users understand the source of the code they use, whether it may be protected, and what licenses it may be offered under. There are still some interesting legal questions here, but I don't think that the training of Copilot is one of them.
A more interesting question would be what GitHub does if someone uploads someone else's copyright-protected code to GitHub and it is used for training Copilot before it is removed. If you don't own the copyright, you can't grant GitHub the rights needed to use that code for anything, including improving the service.
> A question would be if creating and training Copilot is "improving the Service over time". I would suspect that it would be, though.
Definitely an interesting case to be had, but I'd argue that it is not. They're using their customers' code to create an entirely new product that would not be possible without it, not just improving their ability to host a Git repo. Otherwise, what limit is there on "improving the service over time"? Can they do anything with the code they host as long as it improves their service? What about selling bootleg copies of it and using the proceeds to upgrade their servers?
However, D.4 also explicitly says "This license does not grant GitHub the right to sell Your Content". One could argue that because Copilot is a commercial product, it is in fact selling (a derivative of) user code, and thus the grant in D.4 does not apply.
> but then they want to prohibit others from learning from what they share
The linked-to document explicitly DOES NOT prohibit others from learning from what they share.
Quoting it: "If the project is under an open source license, it means that everyone can share a copy – even on GitHub – of the licensed material under certain conditions. A license restricting this right wouldn’t be open source anymore. However, since GitHub may not respect the terms of licensed code that is hosted on their servers, not uploading the code of others there is, in fact, an ethical choice."
"No Adultery" is typically a term of entering a relationship. We can liken that to code licensing. Cheating is explicitly established as being _against_ the ad-hoc contract of the relationship.
Conversely, open source licenses explicitly state that an end user may further distribute that source code anywhere they wish.
People want the benefits of publicly sharing stuff, but then they want to prohibit others from learning from what they share.
There are many options to keep things private. The downside is that you won't get the same exposure.