immortal3 on Hacker News

1

Improving LLM Inference with Continuous Batching: Orca Through Tinyorca

(junupark.xyz)

2 points by immortal3, 2mo ago, 0 comments

2

Bits-per-Byte (BPB): a tokenizer-agnostic way to measure LLMs

(dipkumar.dev)

1 point by immortal3, 8mo ago, 0 comments

3

Creativity Is a Luxury

(dipkumar.dev)

2 points by immortal3, 10mo ago, 0 comments

4

GPT-5 Router – Inevitable Future of Chat Interfaces

(dipkumar.dev)

3 points by immortal3, 10mo ago, 0 comments

5

Instruction Aware Embeddings – Why Your Retriever Is Failing

(dipkumar.dev)

1 point by immortal3, 11mo ago, 0 comments

6

Improving Retrieval in RAG (Via Recall, Precision, and NDCG)

(dipkumar.dev)

2 points by immortal3, 1y ago, 0 comments

7

Show HN: AceVocab - Learn and master the vocabulary featured in the GRE/GMAT

(acevocab.com)

3 points by immortal3, 1y ago, 0 comments

8

AWS BedRock – Converse API – A single endpoint for all models?

(dipkumar.dev)

2 points by immortal3, 2y ago, 0 comments

9

Essential Database Design: Five Fields Every Table Must Have

(dipkumar.dev)

3 points by immortal3, 2y ago, 1 comment

10

India issues notice to Google for blocking count over nude childhood photo

(deccanherald.com)

3 points by immortal3, 2y ago, 0 comments

11

Hugging Face raises $235M from investors including Salesforce and Nvidia

(techcrunch.com)

378 points by immortal3, 2y ago, 203 comments

12

Speeding up the GPT with KV cache (memoization)

(immortal3.github.io)

2 points by immortal3, 3y ago, 0 comments

