As someone working with LLMs, I’ve noticed an interesting gap around knowledge cutoffs and "recent" feature support. For example, GitHub introduced LaTeX math rendering in Markdown in May 2022. Despite a 2023 knowledge cutoff, some models like GPT-4o don’t seem aware of this change, while others like Claude 3.5 Sonnet do recognize it but defer to outdated information if cross-referenced with another model.
Why would a model trained with data up to 2023 overlook a significant update on a widely-used platform like GitHub? Could it reflect selective data filtering during training, limitations on incorporating certain types of updates post-cutoff, or perhaps something specific to how different LLMs process and prioritize technical information?
What are your thoughts?