Autopilot is better thought of as "auto-aviate": if there is already a navigation plan, the aircraft can follow that plan. Simple autopilots just keep the wings level; others can hold an altitude and change heading. More sophisticated ones can change altitude or even fully land the plane.
All of those things, however, require people to manage the "Navigate" part. "Aviate" is a deterministically solved problem, at least in normal flight operations. As you point out, we trust autopilots today, including on (nearly) every single commercial flight.
LLMs are a poor alternative to "aviate", but they could be part of a better flight management automation package. The parent article tries to use the LLM to aviate, with predictable results.
If paired with a capable autopilot (not the relatively basic one on that C-172), the LLM could figure out how to operate the FMS, take you from post-takeoff to final approach, and aid in situational awareness.
Currently, I don't think there is a commercial solution for GA aircraft that could say, "Ok, I'm 20NM from KVNY, but there are three people ahead of me in the pattern, so I have to do a right 360 before descending and joining downwind on 34L".
Having an LLM propose that course of action and tell the autopilot to execute it would be a definite improvement to GA safety.
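A rough sketch of how that division of labor could look, with the LLM confined to proposing maneuvers from a vetted catalog and a deterministic autopilot doing the flying. Everything here (propose_maneuver, Autopilot, the traffic summary) is hypothetical, not any real avionics or LLM API:

    # Hypothetical sketch: the LLM handles "navigate", a deterministic
    # autopilot handles "aviate". No real avionics or LLM API is used.
    from dataclasses import dataclass

    @dataclass
    class Maneuver:
        kind: str                  # e.g. "right_360", "join_downwind"
        altitude_ft: int
        runway: str | None = None

    def propose_maneuver(situation: str) -> Maneuver:
        # Stand-in for an LLM call: turn a situational summary into a plan.
        # A real system would validate the output against hard limits
        # before it ever reached the autopilot.
        if "three people ahead" in situation:
            return Maneuver(kind="right_360", altitude_ft=2500)
        return Maneuver(kind="join_downwind", altitude_ft=1500, runway="34L")

    class Autopilot:
        # Deterministic executor: only flies maneuvers from a vetted catalog.
        def execute(self, m: Maneuver) -> None:
            assert m.kind in {"right_360", "join_downwind"}, "unknown maneuver"
            print(f"executing {m.kind} at {m.altitude_ft} ft")

    Autopilot().execute(
        propose_maneuver("20NM from KVNY, three people ahead in the pattern"))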
And, well, if they are there, they might as well fly for practice.
And no, I would not allow an LLM into the loop for any decision involving the actual flying.
Much of the value of a human crew is as an implicit dogfooding warranty for the passengers. If it wasn't safe to fly, the pilots wouldn't risk it day after day.
Come to think of it, it'd be nice if they posted anonymized third-party psych evaluations of the cockpit crew on the wall by the restrooms. The cabin crew would probably appreciate that too.
Everyone likes to wring their hands about this sort of stuff, but I think it's the exception. Nailing the "macro level" decisions, like "we'll go around this storm but we'll go over that one" or "we must divert to A or B, and we will choose B because it's better for our passengers/company/crew even if it's 10 min more flying to get there", is what keeps the industry humming along mostly in the black rather than in the red. And it's these sorts of things that AI just tends to yolo and get mostly right when they're obvious, but also get immensely wrong when any sort of gotcha materializes.
Furthermore, "ejecting a passenger" from a flight is mostly not something you do while in the air, unless you're nuts. Ejecting a passenger is either done before takeoff, or your crew decides to divert the flight, or you continue to the destination and have law enforcement waiting on the tarmac.
Naturally, pilots get involved when it's a question of where to fly the plane and when to divert, but ultimately the cabin crew is also involved in those decisions about problem passengers.
Never mind that most crashes are caused by humans, and very rarely by technical issues going amok.
Because humans are the fallback for all the scenarios that the tech cannot reliably cover. And my intuition says that the tech around planes is so heavily audited that only things that work with 99.999...% reliability will be left to tech.
"Caused by a human" is the lowest tier, first base human instinct analysis of any accident, and as such, unless proven otherwise, can be discarded out of hand.
It comes down to: if a human mistake is capable of causing an accident, your system is badly designed because it assumes a part of the system known to be unreliable (a human) is always reliable.
The whole trick is designing systems that are safe despite humans being in the loop. Then you get to benefit from the advantages humans bring over machines without suffering the downsides.
It absolutely can; it's called autoland[1]. In really bad visibility, pilots simply can't see the runway until too late, and most aerodromes which expect these conditions have some sort of autoland system installed. The most advanced ones will control every aspect of the plane from top-of-descent (TOD), flaps and throttle configuration, long and short final, gear down, flare, reverse thrust, and roll-out, all the way to a full stop on the runway. Zero pilot input needed.
And most of this was already available in the late 1970s. We have absolutely no need for LLM-based AI in aviation; traditional automation techniques have proven extremely powerful given how restricted the human domain of aviation already is.
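Purely for illustration, the sequencing described above can be sketched as a trivial linear state machine; real autoland systems are redundant, certified, and vastly more involved than this:

    # Illustrative only: the phase sequence of a full autoland from TOD
    # to a stop on the runway, as a linear state machine.
    AUTOLAND_PHASES = [
        "top_of_descent",   # begin the managed descent
        "configure",        # flaps and throttle for the approach
        "long_final",
        "short_final",      # gear down, stabilized on the glideslope
        "flare",            # pitch up just before touchdown
        "touchdown",
        "reverse_thrust",
        "rollout",          # track the centerline to a full stop
    ]

    def next_phase(current: str) -> str | None:
        i = AUTOLAND_PHASES.index(current)
        return AUTOLAND_PHASES[i + 1] if i + 1 < len(AUTOLAND_PHASES) else None

    phase = "top_of_descent"
    while phase is not None:
        print(phase)
        phase = next_phase(phase)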
—Sully Sullenberger
[0] Sully: My Search for What Really Matters. p. 188
Seeing how Claude (or any current LLM) performs in even the most low-stakes coding scenario, I don't think I would ever set foot on a plane where the riskiest 1% of scenarios are decided by one.
I mean, if you have a stable plane, then it'll do alright, as it'll mostly fly straight and level (assuming correct trim). Reacting to turbulence, however, the sampling rate would probably be too slow, so you'd end up with oscillations.
For recognising that you're in a shit situation, yeah, it'll probably do that fine, but it won't be able to give the correct control inputs at the right time.
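The sampling-rate point shows up in a toy loop: proportional control of an integrator-like plant (commanded climb rate feeding altitude error) converges smoothly with a fast loop, but rings and eventually diverges as the interval between updates grows. Generic control math, not modeled on any real aircraft:

    # Toy demo: P control of an integrator plant. The same gain that
    # converges with a fast loop oscillates, then diverges, as dt grows.
    def run(dt: float, kp: float = 0.5, steps: int = 60):
        alt = 100.0                            # start 100 ft off target
        overshoots = 0
        for _ in range(steps):
            new_alt = alt + (-kp * alt) * dt   # P command, integrated by plant
            if new_alt * alt < 0:              # sign flip = flew through target
                overshoots += 1
            alt = new_alt
        return abs(alt), overshoots

    for dt in (0.5, 3.0, 5.0):                 # seconds between updates
        err, n = run(dt)
        print(f"dt={dt:.1f}s -> final |error| {err:.2g} ft, {n} overshoots")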
Even that I'm not sure of. I know relatively little about aviation safety, but I can imagine that there are all kinds of 0.0000000001% corner cases that no plane has ever encountered that still need some sort of reaction, and who knows how easily an LLM can distinguish those from the 0.000000001% corner cases that no plane has ever encountered that are completely fine and can be ignored.
Most of the time. Sometimes you get a double bird strike when you've barely cleared the Hudson River, or similar.
"spawning 5 subagents"
I try to fly about once a week; I’ve never really tried to self-analyze what my inputs are for what I do. My hunch is that there’s quite a bit of I(ntegral) damping I do to avoid overcorrecting, but also quite a bit of D(erivative) adjustment, especially on approach, in order to “skate to the puck”. Definitely going to have to take it up with some flight buddies. Or maybe those with drone software control-loop experience can weigh in?
(D'oh, should have read the specific context: in the case mentioned, the system acts as an integrator (pitch -> altitude), so pure P control is pretty reasonable.)
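For reference, the I and D terms being described map onto a textbook discrete PID loop. A minimal sketch with made-up gains, not tuned for any real aircraft:

    # Generic discrete PID controller; gains and dt are made up.
    class PID:
        def __init__(self, kp: float, ki: float, kd: float, dt: float):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.integral = 0.0
            self.prev_error = None

        def update(self, error: float) -> float:
            if self.prev_error is None:
                self.prev_error = error  # avoid a derivative kick on the first sample
            self.integral += error * self.dt                   # I: trims out steady-state bias
            derivative = (error - self.prev_error) / self.dt   # D: the "skate to the puck" term
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    # e.g. altitude hold: error = target - current altitude, output = pitch command
    pid = PID(kp=0.02, ki=0.001, kd=0.01, dt=0.1)
    for error in (150.0, 140.0, 128.0):          # closing in on the target
        print(f"pitch command: {pid.update(error):.2f}")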
Gold
You'd want all the data from the plane to be input neurons, and all the actions to be output neurons.
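As a shape-of-the-idea sketch only: random untrained weights, unnormalized inputs, nothing you could actually fly. The sensor list and control outputs are made up for illustration:

    # Minimal feedforward net: flight state in, control commands out.
    import numpy as np

    rng = np.random.default_rng(0)

    # Input neurons: whatever the plane reports (would need normalization).
    state = np.array([3500.0,   # altitude (ft)
                      120.0,    # airspeed (kt)
                      2.0,      # pitch (deg)
                      -1.5,     # roll (deg)
                      270.0])   # heading (deg)

    w1 = rng.normal(size=(5, 16)) * 0.1   # one hidden layer, random weights
    w2 = rng.normal(size=(16, 3)) * 0.1

    hidden = np.tanh(state @ w1)
    elevator, aileron, throttle = np.tanh(hidden @ w2)  # output neurons = actions
    print(f"elevator={elevator:.3f} aileron={aileron:.3f} throttle={throttle:.3f}")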
"500 Our Servers Are Experiencing High Load"
"500 Our Servers Are Experiencing High Load"
"500 Our Servers Are Experiencing High Load"
Related from December 2025: Garmin Emergency Autoland deployed for the first time
https://www.flightradar24.com/blog/aviation-news/aviation-sa...
Large planes are autolanded in normal conditions under the oversight of a qualified, capable, and backed-up operator; in harsh conditions, autoland is not used, as far as I understand.
Autoland systems in small planes are emergency systems, meant to land the plane with a disabled operator in any conditions generally acceptable for flying that plane.
This is where I think Taalas-style hardware AI may dominate in the future, especially for vehicle/plane autopilots, even if it can't update weights. But determinism is actually a good thing.
Using Claude sounds like overkill and a poor fit at the same time.
I wouldn't trust Claude to ride my bike, so I certainly wouldn't board its flight.
It would still be better just to let autopilots do the work, because the point of the exercise isn't improved avionics. But it would be an honestly posed challenge for LLMs.
The author tried getting Claude to develop an autopilot script while being able to observe the flight for near-live feedback. It got three attempts and did not manage an autolanding. (There's a reason real autopilots do that with assistance from ground-based aids.)
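The shape of that experiment is roughly an observe-decide-act loop like the one below; every function is a placeholder for illustration, not the author's actual script or any real simulator API:

    # Hypothetical shape of the experiment: poll the sim, ask the model,
    # apply its inputs. All three functions are placeholders.
    import time

    def read_sim_state() -> dict:
        # Placeholder: would poll the flight simulator's telemetry.
        return {"alt_ft": 1200.0, "ias_kt": 85.0, "pitch_deg": -3.0}

    def ask_llm_for_controls(state: dict) -> dict:
        # Placeholder: would send the state to the LLM and parse its reply.
        return {"elevator": 0.05, "throttle": 0.4}

    def send_controls(controls: dict) -> None:
        # Placeholder: would write control inputs back to the simulator.
        print(f"applying {controls}")

    for _ in range(3):          # the real loop would run until touchdown
        send_controls(ask_llm_for_controls(read_sim_state()))
        time.sleep(1.0)         # round-trip latency makes this loop seconds-slow

That latency line is the crux: a loop that closes once per second or slower can handle navigation-scale decisions, but not hand-flying.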