1. Improving LLM Inference with Continuous Batching: Orca Through Tinyorca (junupark.xyz) | 2 points by immortal3 2mo ago | 0 comments
2. Bits-per-Byte (BPB): a tokenizer-agnostic way to measure LLMs (dipkumar.dev) | 1 point by immortal3 8mo ago | 0 comments
4. GPT-5 Router – Inevitable Future of Chat Interfaces (dipkumar.dev) | 3 points by immortal3 10mo ago | 0 comments
5. Instruction Aware Embeddings – Why Your Retriever Is Failing (dipkumar.dev) | 1 point by immortal3 11mo ago | 0 comments
6. Improving Retrieval in RAG (Via Recall, Precision, and NDCG) (dipkumar.dev) | 2 points by immortal3 1y ago | 0 comments
7. Show HN: AceVocab - Learn and master the vocabulary featured in the GRE/GMAT (acevocab.com) | 3 points by immortal3 1y ago | 0 comments
8. AWS BedRock – Converse API – A single endpoint for all models? (dipkumar.dev) | 2 points by immortal3 2y ago | 0 comments
9. Essential Database Design: Five Fields Every Table Must Have (dipkumar.dev) | 3 points by immortal3 2y ago | 1 comment
10. India issues notice to Google for blocking account over nude childhood photo (deccanherald.com) | 3 points by immortal3 2y ago | 0 comments
11. Hugging Face raises $235M from investors including Salesforce and Nvidia (techcrunch.com) | 378 points by immortal3 2y ago | 203 comments
12. Speeding up the GPT with KV cache (memoization) (immortal3.github.io) | 2 points by immortal3 3y ago | 0 comments