You all keep using the word "data".
Data, as in facts, as in the frequency of one word in relation to another.
"Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed..." (from https://www.copyright.gov/help/faq/faq-protect.html)
It's not a question of if but when the cat gets out of the bag and the legal battle starts. The problem is that copyright applies to the expression, not to the factual information it expresses (in this case, word relations). Now "how math works" and "the language of the law" are going to make for an interesting court case. I suspect that math wins here, but it depends on which judge gets it and how high it goes.