For example, if pieces of foam regularly hit the tiles after launch, were the tiles spec'd to handle that? Did anybody go back, take a worst-case scenario of a piece of foam hitting a tile (size, speed, etc.), and verify that the tiles could withstand such an impact?
So let's say they identify a tile failure mode as "tile struck by object." They assign a worst-case severity to it; let's say they knew how bad it could be and assigned a severity of "loss of crew." Then they have to identify all the ways the tile could be struck and assign probabilities to each. They use a matrix that maps severity and probability to a risk classification. If the classification is above their threshold, they add mitigations that reduce the severity or the probability (or both) until the risk falls in an acceptable range.
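To illustrate the mechanics, here is a minimal sketch; the category names, matrix values, and threshold are all made up for the example, not NASA's actual scales:

```python
# Toy severity x probability risk matrix. All categories, values, and
# thresholds here are invented for illustration; real hazard analyses
# define their own scales.
SEVERITY = ["negligible", "marginal", "critical", "loss_of_crew"]
PROBABILITY = ["improbable", "remote", "occasional", "probable"]

# Risk class indexed by (severity, probability): 1 = low ... 4 = unacceptable.
MATRIX = [
    # improbable  remote  occasional  probable
    [1,           1,      2,          2],  # negligible
    [1,           2,      2,          3],  # marginal
    [2,           3,      3,          4],  # critical
    [3,           4,      4,          4],  # loss_of_crew
]

ACCEPTABLE = 2  # anything above this demands mitigation

def risk_class(severity: str, probability: str) -> int:
    return MATRIX[SEVERITY.index(severity)][PROBABILITY.index(probability)]

def needs_mitigation(severity: str, probability: str) -> bool:
    return risk_class(severity, probability) > ACCEPTABLE

# "Tile struck by object" with worst-case severity "loss of crew": even a
# remote probability lands above the threshold, so mitigation is required.
assert needs_mitigation("loss_of_crew", "remote")
```

The key point is that mitigations work by moving the event to a different cell of the matrix, and the matrix is only as good as the severity and probability estimates feeding it.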
There's a lot that can go wrong with this process, though. You obviously have to be able to identify the failure modes. Is there some off-the-wall failure nobody could foresee? Maybe. Then you have to have good enough data to objectively determine the risk. In this case, I wonder if all the previous foam strikes led them to discount the risk, treating that failure mode as improbable or negligible. Add to that, the PowerPoint seemed to imply the model they used was too conservative (it was believed to overestimate the actual penetration). I know people involved in some hypervelocity testing of the foam, and they were legitimately surprised at how the foam behaved when it was fired at higher speeds. So in this case the risk was probably unknown beforehand, even though they assumed they understood it sufficiently. To quote Mark Twain: "What gets us into trouble is not what we don't know. It's what we know for sure that just ain't so."
That's just one system on an immensely complex machine. It's easy to sit back with hindsight and say, "Well, they shouldn't have made a decision until they did additional testing to get the data." But if they did that for every system on the Shuttle, it likely would never have left the ground. In practice, engineers deal with all kinds of other cost and schedule constraints.
Why didn't they go back and test with 'real world' foam sizes?
I would push back on the idea that they would not have had to ground the Shuttle. If they thought the foam could cause a loss of crew, they would have grounded the Shuttle until they fully understood the problem. That's exactly what happened in the aftermath of Columbia.
>Why didn't they go back and test with 'real world' foam sizes?
That's exactly what they did after the incident (while the Shuttles were grounded). If you're asking why they didn't do that beforehand, my assumption is that they already had a model they felt they could use. According to the PPT slides in question, they even thought that model was overly conservative. In addition, while foam shedding was out of spec, it was considered "in family," meaning they knew of the issue and felt it was not a flight-safety concern. Both their physical and mental models of the phenomena were, at best, incomplete, but they didn't know that at the time.
Which is weird, because the slide also mentions that a small increase in energy can have a disproportionate effect.
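A back-of-the-envelope check on why the energies involved can sit so far outside a test database (the masses and speeds below are invented, purely for illustration):

```python
# Kinetic energy scales linearly with mass and with the square of speed,
# so debris only modestly bigger and faster than the test articles can
# carry vastly more energy. Numbers are invented for illustration.
def kinetic_energy(mass_kg: float, speed_m_s: float) -> float:
    return 0.5 * mass_kg * speed_m_s ** 2

test_piece = kinetic_energy(0.001, 200.0)      # tiny test-scale piece
big_piece = kinetic_energy(0.6, 200.0)         # ~600x the mass, same speed
print(big_piece / test_piece)                  # 600.0
print(kinetic_energy(0.6, 400.0) / big_piece)  # 4.0: double the speed, 4x the energy
```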
I find it weird that they would rely on their model (for extrapolation) when they knew the behavior of the tiles is non-linear. If they knew that the real world was outside their testing parameters and decided not to test, that sounds to me like a very serious omission.
I.e., it is strange to extrapolate test results to something 600 times bigger, certainly when it concerns impacts on ceramics.
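To make the danger concrete, here is a contrived sketch (not the actual Crater model; the response curve and every number are invented): a model fit only to small-scale test data looks fine inside the tested range, then fails badly when extrapolated 600x beyond it against a non-linear response.

```python
# Contrived illustration: fit a line to low-energy test data, then
# extrapolate far outside the tested range. The "true" response and all
# numbers are invented; this is not the actual Crater model.

def true_damage(energy: float) -> float:
    # Hypothetical non-linear response: mild below a threshold, then
    # growing with the square of the excess energy.
    threshold = 10.0
    if energy <= threshold:
        return 0.1 * energy
    return 1.0 + 0.5 * (energy - threshold) ** 2

# "Test data" gathered only at low energies, all below the threshold.
tests = [(e, true_damage(e)) for e in (1.0, 2.0, 4.0, 8.0)]

# Ordinary least-squares line through the test points.
n = len(tests)
sx = sum(e for e, _ in tests)
sy = sum(d for _, d in tests)
sxx = sum(e * e for e, _ in tests)
sxy = sum(e * d for e, d in tests)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

def predicted_damage(energy: float) -> float:
    return slope * energy + intercept

# Extrapolate to an event 600x the smallest tested energy.
big = 600.0
print(f"model: {predicted_damage(big):.1f}  reality: {true_damage(big):.1f}")
# The fit is essentially perfect inside the tested range, yet the
# extrapolation is off by more than three orders of magnitude once the
# response turns non-linear.
```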