Also, from what I can tell, every query is log(N) in the size of the completion set, instead of linear in the length of the query/suggestion (again, like a trie). Seems like this might have trouble scaling to large suggestion sets.
A query for a single-prefix term (i.e. "yank") is technically O(log(N)+M), where N is the # of items in the sorted set, and M is the # of items returned. But since in practice we often want a small # of items, we can keep both N and M small.
One tradeoff we made is that this approach doesn't support incremental updates easily, you can always add new items, but it would take a good bit of work to remove items that have expired, or update the scores of items that change.
As for memory usage, we only store each item once (in a redis hash table), and use the unique ids as entries in the sorted sets. Which sort of approaches a trie, since instead of:
ya -> yan -> yank -> yanke -> yankee -> yankees -> [data about yankees]
we have:
ya -> 101
yan -> 101
...
101 -> [data about yankees]
Aside from not supporting incremental updates, that approach has a high memory overhead. For every word of length N, you've got to store 1 + 2 + ... + N-1 + N characters, which is O(N^2). The constant is 1/2, and you do get some savings in shared prefixes, so that's definitely a worst-case upper bound -- but a trie grows with O(N).
The O(log(N)) lookup time can also be a limiting factor with large sets. A trie is going to give you worst-case linear lookup time in the length of the longest string in your set. This can be a pretty dramatic difference as string sets get large.
I might have to use this in my next service.
Or do you have numbers on the mean length of a phrase you handle currently, the number of such phrases and how much memory it takes?
In curious: What do you think about exposing this service via WebSockets? Would that make it even faster?