A well-trained 8B model is already close to saturated with information before you start. Fine-tuning it on new material will therefore easily make it forget much of what it already knew. It just doesn't have the spare capacity to take in much more.
Don't get me wrong. I think a 70B or larger model would be worth fine-tuning, especially if it can be grown further by adding more layers.
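
To make "grown with more layers" a bit more concrete, here is a toy sketch in Python. It assumes that growing means duplicating some existing blocks to add depth before training further, which is only one possible reading. The names `grow_depth` and `every` are made up for illustration, and the "layers" are plain dicts standing in for real transformer blocks.

```python
import copy

def grow_depth(layers, every=2):
    """Return a new list with a deep copy inserted after every `every`-th layer.

    Hypothetical helper: the copies start with identical weights and would
    only become useful after further training.
    """
    grown = []
    for i, layer in enumerate(layers, start=1):
        grown.append(layer)
        if i % every == 0:
            # Deep copy so the new layer can later be trained independently.
            grown.append(copy.deepcopy(layer))
    return grown

# Toy stand-in for a model: 8 "layers", each just a dict of weights.
base = [{"name": f"layer{i}", "w": [0.0] * 4} for i in range(8)]
grown = grow_depth(base, every=2)
print(len(base), "->", len(grown))
```

The added copies are where the extra capacity would come from in this reading. Whether that actually reduces forgetting is the claim above, not something this sketch demonstrates.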