Assuming it indexes everything locally and falls back to traditional search engines when nothing is found, how do you feel about adding a shared middle layer? That layer would simply index the canonical data that doesn't contain any personal info. Contributors could then automatically contribute the pages they index, building a shared search engine over time! The whole thing could work without a crawler of its own (under an appropriate license so people can trust it).
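
To make the idea a bit more concrete, here is a rough sketch of the three-tier lookup (local, then shared, then a traditional engine) plus the contribution step. Everything in it is hypothetical: the names `make_index_tier`, `tiered_search`, and `contribute`, the dictionary standing in for a real index, and the `is_personal` flag, which assumes something else has already decided whether a page contains personal info.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A tier is any function that maps a query to a list of matching URLs.
SearchTier = Callable[[str], List[str]]
Index = Dict[str, List[str]]  # keyword -> URLs; a stand-in for a real index


def make_index_tier(index: Index) -> SearchTier:
    """Wrap a keyword dictionary so it can be used as a search tier."""
    def search(query: str) -> List[str]:
        return index.get(query.lower(), [])
    return search


def tiered_search(
    query: str, tiers: List[Tuple[str, SearchTier]]
) -> Tuple[Optional[str], List[str]]:
    """Try each tier in order and return the first non-empty result.

    Returns (tier_name, results), or (None, []) if every tier came up empty.
    """
    for name, tier in tiers:
        results = tier(query)
        if results:
            return name, results
    return None, []


def contribute(
    url: str,
    keywords: List[str],
    is_personal: bool,
    local_index: Index,
    shared_index: Index,
) -> None:
    """Index a page locally, and also in the shared layer if it is not personal.

    How `is_personal` gets decided is out of scope here (assumption).
    """
    for keyword in keywords:
        key = keyword.lower()
        local_index.setdefault(key, []).append(url)
        if not is_personal:
            shared_index.setdefault(key, []).append(url)
```

A user would call `tiered_search(query, [("local", ...), ("shared", ...), ("fallback", ...)])`, where the last tier wraps whatever traditional engine is configured. Because pages only reach the shared index through `contribute`, the shared layer grows from what people already browse and index, with no crawler involved.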