What people are complaining about (totally unreasonably in my view) is obviously Meta is not "open sourcing" all the training data, so nobody can retrain the model from scratch themselves. This argument to me is just silly. The whole point of these models is they distil pretraining on massive data sets you wouldn't have access to otherwise. If you insist on them releasing the data set, they will have to cut it down to 0.1% of the size and you will be getting what you had access to already in the first place.