Here's a quick demo class that shows the technique. It's amazingly simple.
class Demo EXAMPLE = { "75%" => 0.75, "15%" => 0.15, "9%" => 0.09, "1%" => 0.01 }
def self.sample(choices = EXAMPLE)
choices.max_by { |_, weight| rand ** (1.0 / weight) }.first
end
def self.show(samples: 10000)
items = samples.times.map {sample}
items.each_with_object(Hash.new(0)) do |item, counts|
counts[item]+=1
end.sort_by(&:last).reverse.to_ h
end
end# Demo.show # => {"75%"=>7480, "15%"=>1525, "9%"=>901, "1%"=>94}
Edit: Sorry for the bad formatting. I added a gist:
https://gist.github.com/jasonl99/25d3c922d73f10a75fe228c2de3...
(For choosing from a small number of things this code has the advantage that it's not doing a lot of complicated stuff in Ruby, and in any case the speed probably doesn't matter too much. In fact, even when choosing from a large enough number of things to make this much slower than it need be, the speed may well not matter at all. But it's worth being aware.)
Also worth reading up on are sum-heaps. Alias tables are O(1) to sample from but O(n) to build/modify. Sum-heaps let you modify in O(log(n)) at the cost of sampling in O(log(n)) as well. A good writeup is here: https://timvieira.github.io/blog/post/2016/11/21/heaps-for-i...