In early computing, everything was closed source. Quoting the Wikipedia page:
To develop a legal BIOS, Phoenix used a clean room design. Engineers read the BIOS source listings in the IBM PC Technical Reference Manual. They wrote technical specifications for the BIOS APIs for a single, separate engineer—one with experience programming the Texas Instruments TMS9900, not the Intel 8088 or 8086—who had not been exposed to IBM BIOS source code.
The legal team at Phoenix deemed it inappropriate to "recall source in their own words" for legal reasons.
My non-legal intuition is that the companies training their models are violating copyright. But the stakes are too high; it's "too big to fail," if you will. The thinking goes: if we don't do it, our competitors will destroy us. How do you reconcile that?