1. OpenAI frontier models and Codex are now available on AWS (openai.com) | 370 points | typpo | 24d ago | 131 comments
2. How to replicate the Claude Code attack with Promptfoo (promptfoo.dev) | 6 points | typpo | 7mo ago | 0 comments
7. Benchmark Command R vs. GPT/Claude on your own data (promptfoo.dev) | 2 points | typpo | 2y ago | 0 comments
8. DBRX vs. Mixtral vs. GPT: create your own benchmark (promptfoo.dev) | 1 point | typpo | 2y ago | 0 comments
9. How to benchmark Gemini vs. GPT with your own data (promptfoo.dev) | 1 point | typpo | 2y ago | 0 comments
11. How to benchmark Llama2 Uncensored vs. GPT-3.5 on your own inputs (promptfoo.dev) | 16 points | typpo | 2y ago | 0 comments
13. Show HN: CLI for testing and evaluating LLM prompts and outputs (github.com) | 2 points | typpo | 2y ago | 0 comments
15. Show HN: Promptfoo – CLI for testing & improving LLM prompt quality (github.com) | 14 points | typpo | 3y ago | 5 comments