There is a surprising amount of open data out there, and I think digging through it would make a great hobby. That said, maybe use a good VPN and fake creds. People have been murdered for this sort of thing.
Source: https://forbiddenstories.org/case/daphne-project-three-years...
I can't find the link now, but some PhD students in South America (I think?) managed to detect fraud fairly accurately in business and government contracts based simply on the presence of certain clauses.
It would also be interesting to take statistics on elevated incidence of certain diseases and track them against factory production or new factory openings in the area. You could also cross-reference the results against EPA fines.
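A rough sketch of what that disease/factory/fines comparison might look like, stdlib Python only. Everything here is an assumption for illustration: the county rows are made-up numbers, not real data, and the column layout is hypothetical. With real data you would pull these from public health and EPA enforcement datasets and join them on a county or ZIP key.

```python
from math import sqrt
from statistics import median

# Entirely made-up rows for illustration:
# (county, disease_cases, population, factories, epa_fines_usd)
ROWS = [
    ("A", 30, 100_000, 1, 0),
    ("B", 45, 150_000, 2, 10_000),
    ("C", 120, 200_000, 6, 250_000),
    ("D", 20, 100_000, 0, 0),
    ("E", 90, 120_000, 5, 80_000),
]


def rate_per_100k(cases, population):
    """Normalize raw case counts so large and small counties are comparable."""
    return cases * 100_000 / population


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rates = [rate_per_100k(row[1], row[2]) for row in ROWS]
factories = [row[3] for row in ROWS]

# How strongly does disease rate move with factory count across counties?
r = pearson(rates, factories)

# Counties whose rate is above the median AND that have at least one EPA fine.
cutoff = median(rates)
flagged = [row[0] for row, rate in zip(ROWS, rates) if rate > cutoff and row[4] > 0]

print(f"correlation(rate, factories) = {r:.2f}")
print("worth a closer look:", flagged)
```

A high correlation here only tells you where to look, not what caused what. Age, income, and plenty of other confounders would need to be ruled out before anyone should call it a finding.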
You could then send the analysis to Forbidden Stories. Just remember that they are journalists and not necessarily ML/GPT-3 experts, so explain the method in plain terms.
Internet sleuths unite!