AI muddies the water further. Large models trained on public repositories can reproduce GPL-licensed snippets verbatim, so prompting a model with tests that mirror the original implementation risks contaminating the output, and a court could find substantial similarity. To reduce that risk: exercise the original only through black-box fuzzing and property-based testing tools, have humans review and scrub model outputs, run similarity scans against the original codebase, and budget for legal review before labeling anything MIT.
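To make the black-box, property-based idea concrete, here is a minimal sketch of a round-trip property check. The run-length codec and all function names are hypothetical stand-ins (nothing here comes from any real project), and in practice a tool like Hypothesis would generate the inputs; a hand-rolled random generator is used below only to keep the example dependency-free:

```python
import random

def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Clean-room run-length encoder, written only from a behavioral spec."""
    out: list[tuple[int, int]] = []
    for b in data:
        if out and out[-1][1] == b:
            out[-1] = (out[-1][0] + 1, b)  # extend the current run
        else:
            out.append((1, b))  # start a new run
    return out

def rle_decode(pairs: list[tuple[int, int]]) -> bytes:
    """Inverse of rle_encode: expand (count, byte) pairs back to bytes."""
    return b"".join(bytes([b]) * n for n, b in pairs)

def check_round_trip(trials: int = 500) -> None:
    """Property check: decode(encode(x)) == x for random inputs.

    The test describes observable behavior only; it never references
    the internals of any pre-existing (e.g. GPL) implementation.
    """
    rng = random.Random(0)  # fixed seed for reproducibility
    for _ in range(trials):
        data = bytes(rng.randrange(4) for _ in range(rng.randrange(64)))
        assert rle_decode(rle_encode(data)) == data
```

The key point is that the property (a round trip) constrains behavior without copying the original's test vectors or structure, which is what keeps the clean-room boundary defensible.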