I used that strategy to parse gigabytes of CSVs containing arbitrary natural language from the web. Try getting those files fixed, or figuring out a grammar for gigabytes of fuzzy data...
My approach never failed me, so saying that the strategy does not work is a strong claim when it has reliably done the job for me.
Your examples are all valid, but what you are describing are theoretical attacks on the method, while in practice the method works in almost all cases. We are talking from two different viewpoints: dealing with large amounts of messy data on the one hand, and parser theory in an ideal world on the other.