not to offend - but it sounds like your response/worries are based more on an emotional reaction. and rightly so, this is by all means a very scary and uncertain time. and undeniably these companies have not properly taken into account the impact their products will have, or the safety concerns around them.
however, a lot of your claims are false - progress is being made in nearly all the areas you mentioned
> hallucinations
are reduced with GPT-5
https://cdn.openai.com/pdf/8124a3ce-ab78-4f06-96eb-49ea29ffb...
"gpt-5-thinking has a hallucination rate 65% smaller than OpenAI o3"
> limited context window
same deal. gemini 2.5 pro has a 1 million token context window, and GPT-5 has 400k, up from 200k with o3
https://blog.google/technology/google-deepmind/gemini-model-...
"native multimodality and a long context window. 2.5 Pro ships today with a 1 million token context window (2 million coming soon)"
> expensive to operate and train
we don't know the training or operating costs for certain, but GPT-5 provides the most intelligence for the cheapest price, at $10 per 1 million output tokens, which is unprecedented
https://platform.openai.com/docs/models/gpt-5
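to put that price in concrete terms, here's a quick back-of-the-envelope sketch. the $10 rate is the figure quoted above; the workload (2,000 output tokens per reply, 500 replies) is made up purely for illustration, and input-token costs are ignored.

```python
# rate taken from the figure quoted above: $10 per 1 million output tokens
OUTPUT_PRICE_PER_MILLION_USD = 10.0


def output_cost_usd(output_tokens: int) -> float:
    """Cost in USD of generating `output_tokens` tokens at the quoted rate."""
    return output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION_USD


# hypothetical workload: 2,000 output tokens per reply, 500 replies
print(output_cost_usd(2_000 * 500))  # 10.0
```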
> guardrails
are very well implemented by some providers, like google, whose gemini API provides multiple adjustable safety levels
https://ai.google.dev/gemini-api/docs/safety-settings
"You can use these filters to adjust what's appropriate for your use case. For example, if you're building video game dialogue, you may deem it acceptable to allow more content that's rated as Dangerous due to the nature of the game. In addition to the adjustable safety filters, the Gemini API has built-in protections against core harms, such as content that endangers child safety. These types of harm are always blocked and cannot be adjusted."
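for a rough idea of what "adjustable" means in practice, here's a minimal sketch of a request body that loosens only the dangerous-content filter, as in the video game example from the docs. the category and threshold names are the ones i understand the linked safety-settings page to use; `build_request` and the prompt text are hypothetical, and nothing here actually calls the API.

```python
import json


def build_request(prompt: str, dangerous_threshold: str = "BLOCK_ONLY_HIGH") -> dict:
    """Build a generateContent-style JSON body with one adjusted safety filter.

    Assumption: field and enum names follow the Gemini API safety-settings docs.
    The built-in protections (e.g. child safety) are not configurable here.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "safetySettings": [
            {
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "threshold": dangerous_threshold,
            },
        ],
    }


# hypothetical prompt for video game dialogue
print(json.dumps(build_request("write a villain's threat for my game"), indent=2))
```

other categories not listed in the body keep whatever default the API applies.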
now i'd like to ask you for evidence that none of these aspects have been improved - since you claim my examples are vague, yet you make statements like
> Inability to recall simple information
> inability to stay on task
> (doesn't) support its output
> (no) long term planning
i've experienced the exact opposite. not 100% of the time, but compared to GPT-4 all of these areas have massively improved. sorry i can't provide every single chat log i've ever had with these models to satisfy your vagueness-o-meter, or provide benchmarks, which i assume you will brush aside.
between that and the examples i've provided above - you seem to be making claims out of thin air and then claiming others are not providing examples up to your standard.