It would be quite simple to take a two-tiered approach to the logging problem, since they have separated it into two components. One tier can just write and ship files, while the other provides the real-time streaming they have described.
So the question then becomes: what are the failure modes of their logging setup when clients misbehave? I don't know how Kafka handles misbehaving clients. I suspect one or two of them could cause global effects and slow down the entire cluster, whereas in the current setup the misbehavior stays local: the nearest aggregator just drops messages. Simple memory-usage monitoring and similar checks can then be used to find these issues and mitigate them.
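To make the "localized dropping" idea concrete, here is a minimal sketch of what I have in mind. It is not their actual aggregator; the `Aggregator` class, its method names, and the capacity and threshold numbers are all made up for illustration. The point is only that a bounded buffer turns a flood into dropped messages on one aggregator, and a per-client drop counter gives monitoring something to alert on.

```python
from collections import deque


class Aggregator:
    """Hypothetical local log aggregator with a bounded in-memory buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()
        self.dropped = {}  # client id -> number of dropped messages

    def offer(self, client_id, message):
        """Buffer the message if there is room; otherwise drop it and count the drop."""
        if len(self.buffer) >= self.capacity:
            self.dropped[client_id] = self.dropped.get(client_id, 0) + 1
            return False
        self.buffer.append((client_id, message))
        return True

    def drain(self, n):
        """Hand up to n buffered messages to whatever ships them downstream."""
        count = min(n, len(self.buffer))
        return [self.buffer.popleft() for _ in range(count)]

    def noisy_clients(self, threshold):
        """Clients with at least `threshold` drops; a crude monitoring signal."""
        return sorted(c for c, d in self.dropped.items() if d >= threshold)
```

Note that with a single shared buffer a well-behaved client on the same aggregator can lose messages too while the buffer is full. That is still local to that one aggregator, which is the trade-off I'm describing.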
This is still a heck of a lot simpler than using Kafka and worrying about all sorts of weird distributed-system failure modes. I'm sure Kafka got them started initially, but continuing to use it is like using a sledgehammer to kill a fly. For their use case this setup is the correct one, and migrating to Kafka will still be possible if it becomes necessary. So in my view this is proper engineering. They've made all the right trade-offs instead of just chasing fads and trends.