So that’s what it is! I was wondering why reducing context and summarising still leaves it making mistakes and forgetting the steering. And I couldn’t find an explanation for why it starts ignoring instructions when the context is not full at all.
How did you find out that tool calls are what degrade it?
Isn’t this the biggest problem there is, and not just a “design tension”?