1. Show HN: Proposal for a real long-term AI memory benchmark (penfieldlabs.substack.com) | 4 points | dial481 | 1mo ago | 0 comments
2. Milla Jovovich's MemPalace Claims 100% on LoCoMo. Its Benchmarks.md Disagrees (penfieldlabs.substack.com) | 4 points | dial481 | 1mo ago | 0 comments
3. LoCoMo AI Benchmark: 6.4% of answer key wrong, judge accepts 63% of fake answers (github.com) | 3 points | dial481 | 1mo ago | 3 comments