Something like this should be done with a hash table, which gives constant-time lookup on average regardless of the number of entries (since entries are not being added non-stop in real time, the cost of resizing the table shouldn't be too great), or with a trie, which stores shared prefixes only once and so has the best storage properties for many URLs on the same domain (the case the author notes). If done right (a suitable hash function, a good implementation, and decent collision handling for the hash table; or any trie implementation that performs decently in space and time), this shouldn't be a problem.
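
To make the two options concrete, here is a minimal Python sketch, not a tuned implementation. The hash-table variant is just the built-in `set`. The trie variant is a hypothetical `UrlTrie` class (the name and the key scheme are my own assumptions): it keys on the host first and then on each path segment, so URLs on the same domain share nodes. For simplicity it ignores the scheme and the query string, which a real implementation would probably need to keep.

```python
from urllib.parse import urlsplit

# Sentinel marking "a stored URL ends at this node"; it can never
# collide with a host or path segment because those are strings.
_END = object()


class UrlTrie:
    """Illustrative trie keyed on host, then on each path segment."""

    def __init__(self):
        self.root = {}

    def _keys(self, url):
        # Assumption: scheme and query string are ignored in this sketch.
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        return [parts.netloc] + segments

    def add(self, url):
        node = self.root
        for key in self._keys(url):
            node = node.setdefault(key, {})
        node[_END] = True

    def __contains__(self, url):
        node = self.root
        for key in self._keys(url):
            node = node.get(key)
            if node is None:
                return False
        return _END in node


urls = [
    "http://example.com/a/b",
    "http://example.com/a/c",
    "http://example.org/x",
]

# Option 1: hash table. Average O(1) membership test.
seen = set(urls)

# Option 2: trie. "example.com" and "a" are stored once for both URLs.
trie = UrlTrie()
for u in urls:
    trie.add(u)
```

Lookup is `url in seen` or `url in trie` in both cases. The set costs one hash of the whole string per lookup, while the trie costs one dictionary lookup per segment but does not repeat the common host and path prefix for every entry.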