If you have a product people have paid money for, and you say you're releasing it as open source because it isn't sustainable, the people looking for that code typically aren't interested in "training data".
I get your point that messy code may be useful for other purposes, but I believe the vast majority of users are looking for code to solve their problems, not "training data" or research.
That said, as you say, rights holders can do what they will.