I'm already cringing about people in this thread talking about "language detection" and "stemming" as if there are good, easy solutions to them.
Take your favorite language detector, like cld2. Apply it to some real-world text, like random posts on Twitter. Did it detect the languages correctly? Welp, there goes that idea.
(Tweets are too short, you say? Tough. Search queries are shorter. You probably aren't lucky enough for your domain's text to be complete articles from the Wall Street Journal, which is what the typical NLP algorithm was trained on.)
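If you want to check this on your own data rather than take my word for it, here's a rough sketch of a harness. Everything in it is hypothetical: `accuracy_by_length`, the `short_cutoff` of 30 characters, and the assumption that your detector can be wrapped as a function from a string to a language code. Plug in whatever detector you actually use and a hand-labeled sample of your real text.

```python
from collections import defaultdict

def accuracy_by_length(detect, samples, short_cutoff=30):
    """Score a language detector separately on short and long texts.

    detect:  any callable taking a string and returning a language code
             (a hypothetical wrapper around whatever detector you use).
    samples: iterable of (text, expected_language_code) pairs you labeled by hand.
    short_cutoff: arbitrary character count separating "short" from "long".
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for text, expected in samples:
        bucket = "short" if len(text) < short_cutoff else "long"
        totals[bucket] += 1
        if detect(text) == expected:
            correct[bucket] += 1
    # Only buckets that actually received samples appear in the result.
    return {bucket: correct[bucket] / totals[bucket] for bucket in totals}
```

Splitting by length is the point: a single overall accuracy number can hide the fact that the detector does fine on paragraphs and falls over on query-sized strings.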
Stemming will always be difficult and subtle. It's useful, but it isn't even linguistically well-defined, so you'll have to tweak it a lot. If stemming seems easy, you haven't looked at where it goes wrong for your use case.
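To make "where it goes wrong" concrete, here's a deliberately naive suffix stripper. It is not Porter, Snowball, or any real library's algorithm; `naive_stem`, the suffix list, and the `min_stem` length are all made up for illustration. Real stemmers are far more careful, but they run into the same two failure modes.

```python
# Toy suffix list, invented for this example; tried longest first.
SUFFIXES = sorted(
    ["ization", "ational", "ities", "ity", "ing", "ies", "al", "es", "ed", "e", "s"],
    key=len,
    reverse=True,
)

def naive_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

# Over-stemming: unrelated meanings collapse into one stem.
print([naive_stem(w) for w in ("universe", "university", "universal")])
# Under-stemming: forms of the same word stay apart.
print([naive_stem(w) for w in ("run", "running", "ran")])
```

The first line prints `univers` three times, so a search for "university" also matches "universe". The second prints `run`, `runn`, `ran`, so a search for "running" misses "ran". Whether either of those is a bug depends entirely on your use case, which is the problem.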