I suspect the main problem here is that the input motion is too "perfect". There needs to be more randomness involved, otherwise you are just going to get really loud harmonics like you do when playing a perfect square wave.
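One rough way to try this out, purely as a sketch: drive the lips with a sine whose frequency wanders a little instead of a mathematically perfect tone. Nothing here comes from the actual simulator; `lip_drive`, `jitter`, and the 0.999 smoothing factor are made-up names and values for illustration, and only the 44100 frames-per-second rate is taken from the project.

```python
import math
import random

SAMPLE_RATE = 44100  # frames per second, matching the simulator's rate


def lip_drive(freq_hz, n_frames, jitter=0.0, seed=0):
    """Return n_frames of a sine-like drive signal with a wandering pitch.

    jitter is a hypothetical knob: the per-frame strength of the random
    nudge applied to the frequency. jitter=0.0 gives a "perfect" sine.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    phase = 0.0
    drift = 0.0
    samples = []
    for _ in range(n_frames):
        # Leaky random walk: the pitch drifts slowly instead of jumping
        # to an unrelated value on every frame.
        drift = 0.999 * drift + jitter * rng.gauss(0.0, 1.0)
        inst_freq = freq_hz * (1.0 + drift)
        phase += 2.0 * math.pi * inst_freq / SAMPLE_RATE
        samples.append(math.sin(phase))
    return samples


perfect = lip_drive(440.0, SAMPLE_RATE)                 # one second, no wander
human = lip_drive(440.0, SAMPLE_RATE, jitter=0.0005)    # pitch wanders by roughly 1%
```

Feeding `human` rather than `perfect` into the lip model would be one way to test whether the harsh harmonics really come from the input being too clean.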
When I play the trumpet, there is a spectrum of pitch tuning I can do for each note by buzzing my lips differently; and the middle of that spectrum (the correct note) is the easiest to perform. Off-pitch notes are usually achieved by placing my lips off-center. It would definitely be worth exploring this behavior, since an infinitely perfect lip placement does not occur in nature.
When I was taught to play the trumpet, one focus (for all wind instruments, really) was to always use more airflow than necessary, especially when playing quietly. This results in a cleaner sound, probably because the lip movement changes less suddenly while still moving enough pressure to be loud. With that general sense in mind, I suspect the need for more airflow is to compensate for relatively square-wave-like pressure waves from the lip simulation.
I wonder what sounds you could get if you recorded yourself buzzing your lips into your real mouthpiece, then used that as an input. Of course, that would be tricky, because blowing into a mouthpiece feels very different from blowing into a trumpet. Could this model potentially be simulated in reverse to find a good input?
This is about building a 44,100 frames-per-second fluid dynamics engine that just so happens to kinda sorta sound like a trumpet.
Then adding a particle system to the simulation and painting the particles in Blender to provide a nice visualization of the simulator.
---------
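A partial differential equation solved cell-by-cell on a grid like this is usually called a finite difference (or finite volume) method; finite element analysis is a related technique that splits the domain into mesh elements instead. From my understanding, this is basically the same brute-force approach scientists at national labs use.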
Just, ya know, scaled down and hobby-sized to this crude tube simulator (that arguably kinda sorta sounds like a trumpet).
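To make the cell-by-cell idea concrete, here is a minimal sketch of one explicit finite-difference time step for the plain 1D wave equation in a tube. It is not the simulator's actual update rule (that one models fluid dynamics, lips and all); `step_wave` and `courant` are illustrative names, and holding both ends at zero pressure is an assumption made to keep it short.

```python
def step_wave(prev, curr, courant=0.5):
    """Advance a 1D pressure field by one time step.

    prev, curr: pressures per cell at the two most recent time steps.
    courant: c * dt / dx; the explicit scheme is only stable for
    values <= 1.
    """
    n = len(curr)
    c2 = courant * courant
    nxt = [0.0] * n  # the end cells stay at zero pressure (assumed boundary)
    for i in range(1, n - 1):
        # Each interior cell only looks at itself and its two neighbours.
        laplacian = curr[i + 1] - 2.0 * curr[i] + curr[i - 1]
        nxt[i] = 2.0 * curr[i] - prev[i] + c2 * laplacian
    return nxt


# A single pressure pulse in the middle splits into two travelling pulses.
prev = [0.0, 0.0, 0.0, 0.0, 0.0]
curr = [0.0, 0.0, 1.0, 0.0, 0.0]
nxt = step_wave(prev, curr, courant=1.0)
```

Running that loop 44,100 times per simulated second, over every cell in the tube, is where the "brute force" part comes in.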