1. LLM Terminology explained simply: Weights, Inference, Effective sequence length (devforth.io) | 5 | dotnot | 6d ago | 1
2. Real-world GPT-OSS-20B benchmark on L4, L40S and H100 (latency, tokens/SEC) (devforth.io) | 1 | dotnot | 1mo ago | 0