I have always wondered where the data from data breaches (e.g. the Equifax leak, Facebook, etc.) can be found. Some of these datasets look like a great source for training ML models. It seems like people on HN, or websites like Have I Been Pwned, have access to this data.