The issues with a sleep-and-pray strategy are:
1. No amount of time is provably enough.
2. If you take the maximum recorded time and add, say, 20% padding, then waiting that long before processing every dataset could be detrimental to performance.
The example I gave happened to the team I was on in 2017/2018. We had thousands of files totaling terabytes of data in a given batch. The 90th percentile time for consistency was in the low tens of seconds; the 99th percentile was measured in minutes. The manifest-and-retry-if-not-yet-present method avoids having to put in a sleep(5 minutes) on every batch just to cover the 1% of cases.
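
To make the manifest idea concrete, here is a minimal Python sketch. It is not what we actually ran. The function name `wait_for_manifest`, its parameters, and the `exists` callback are all made up for illustration. `exists` stands in for whatever "is this file visible yet?" check your storage system offers. The point is that you only wait as long as the slowest file in this particular batch, and you get a hard error instead of silently processing a partial dataset.

```python
import time
from typing import Callable, Iterable


def wait_for_manifest(
    manifest: Iterable[str],
    exists: Callable[[str], bool],      # hypothetical visibility check for your store
    poll_interval: float = 1.0,         # seconds between polling rounds (assumed default)
    timeout: float = 600.0,             # give up after this many seconds (assumed default)
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until every file named in the manifest is visible.

    Returns the number of polling rounds it took. Raises TimeoutError,
    listing the files still missing, if the deadline passes first.
    """
    pending = set(manifest)
    deadline = clock() + timeout
    rounds = 0
    while True:
        rounds += 1
        # Re-check only the files that were not visible last round.
        pending = {name for name in pending if not exists(name)}
        if not pending:
            return rounds
        if clock() >= deadline:
            raise TimeoutError(f"still missing after {timeout}s: {sorted(pending)}")
        sleep(poll_interval)
```

The writer publishes the manifest (the list of file names in the batch) once it has finished writing, and the reader calls something like `wait_for_manifest(names, exists)` before processing. In the common case it returns after the first round or two; only the slow batches pay the multi-minute wait.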