Roughly, search engines work in two phases: retrieval and scoring. In retrieval, you figure out which few thousand of the billions of documents in the index could be worthy of being search results. In scoring, you look at each of those documents in more detail to pick the actual top ten.
Scoring based on regular expressions wouldn't be too tough. Retrieval is the killer. Retrieval typically works from "posting lists": for each word, an index of the documents that contain that word. To retrieve based on regular expressions, you would need posting lists for individual characters or short sequences of characters. That would take a lot more space.
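To make the space argument concrete, here is a minimal sketch of both kinds of posting list over a toy corpus. The function names and the three sample documents are made up for illustration, and three-character sequences (trigrams) are just one possible choice of "short sequence"; a real engine would also compress and shard these lists.

```python
from collections import defaultdict

def build_word_index(docs):
    """Map each word to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def build_trigram_index(docs):
    """Map each 3-character sequence to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        lowered = text.lower()
        for i in range(len(lowered) - 2):
            index[lowered[i:i + 3]].add(doc_id)
    return index

def retrieve_all(index, keys):
    """Intersect posting lists: ids of documents that contain every key."""
    postings = [index.get(key, set()) for key in keys]
    return set.intersection(*postings) if postings else set()

# Hypothetical toy corpus, not real search data.
docs = ["the quick brown fox", "the lazy dog", "quick sort in python"]
word_index = build_word_index(docs)
trigram_index = build_trigram_index(docs)

print(retrieve_all(word_index, ["the", "quick"]))
print(retrieve_all(trigram_index, ["qui", "uic", "ick"]))
print(len(word_index), "word keys vs", len(trigram_index), "trigram keys")
```

Even on three short documents the trigram index has several times as many keys as the word index, which is the space cost described above.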
You might be able to hack together some hybrid that would use existing posting lists, for example by requiring that the regular expression contain a whole word. But pure regular expressions would require a different index. That sort of added complexity is not worth it for the feature.
It might be practical to do a hybrid search: a conventional word- or phrase-based search to return a limited set of documents that can then be brute-force searched using a regular expression. This could be especially handy for programmers searching for code samples, a position I often find myself in.
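A rough sketch of that two-step idea, under some loud assumptions: `hybrid_search` is a hypothetical name, the keyword step is a plain linear scan standing in for a real posting-list lookup, and the sample documents are invented.

```python
import re

def hybrid_search(docs, keywords, pattern):
    """Keyword retrieval first, then a brute-force regex pass on the survivors."""
    regex = re.compile(pattern)
    wanted = [keyword.lower() for keyword in keywords]

    # Phase 1 (assumption: a linear scan stands in for the posting-list lookup).
    candidates = []
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        if all(keyword in words for keyword in wanted):
            candidates.append((doc_id, text))

    # Phase 2: run the regex only over the limited candidate set.
    return [doc_id for doc_id, text in candidates if regex.search(text)]

# Hypothetical code-sample corpus.
docs = [
    "for i in range(10): print(i)",
    "for item in items: total += item",
    "while True: pass",
]
print(hybrid_search(docs, ["for"], r"range\(\d+\)"))
```

The regex never touches documents that fail the keyword step, so its cost scales with the size of the candidate set rather than the whole index.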
Who would use the regex search? Only programmers. So your market is tiny compared to a general-purpose search engine.
So more expensive queries that are harder to code up for many fewer people? Sounds like a losing bet.
- how will you make money?
- how will you implement this cheaply enough?
- who will really use this? what are they doing now instead?
[edit] The right model might be a sort of meta-search engine - feed the regex to something like Wikipedia to determine plausible keywords and then return aggregated search results based on the keywords. At prototype and small scale, actual search results could be aggregated from other search engines such as Google or Bing.
[edit] Interestingly, Wikipedia already has regex capability built into AutoWikiBrowser.
http://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser/Regul...