This is called lemmatization, and what you call a "word base" is called a lemma. morpha and its reimplementation in a POS tagger both do this. However, both require POS-tagged input to resolve the inherent ambiguity in natural language.
(POS tagging means assigning word categories, such as noun or verb, to words. I am assuming you want a tool that handles English.)
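To make the point about POS tags concrete, here is a minimal sketch in Python. It is not how morpha or any real lemmatizer works internally; `TOY_LEXICON` and `lemmatize` are hypothetical names with a handful of hand-picked entries, chosen only to show that the same surface form can map to different lemmas depending on its tag.

```python
# Toy lexicon: (surface form, POS tag) -> lemma.
# Entries are illustrative assumptions, not taken from any real tool.
TOY_LEXICON = {
    ("saw", "NOUN"): "saw",
    ("saw", "VERB"): "see",
    ("meeting", "NOUN"): "meeting",
    ("meeting", "VERB"): "meet",
    ("leaves", "NOUN"): "leaf",
    ("leaves", "VERB"): "leave",
}


def lemmatize(word: str, pos: str) -> str:
    """Look up the lemma of a POS-tagged word; fall back to the lowercased word."""
    key = (word.lower(), pos.upper())
    return TOY_LEXICON.get(key, word.lower())


if __name__ == "__main__":
    tagged = [("saw", "NOUN"), ("saw", "VERB"), ("leaves", "NOUN"), ("leaves", "VERB")]
    for word, pos in tagged:
        print(f"{word}/{pos} -> {lemmatize(word, pos)}")
```

Without the tag there is no way to choose between "saw" and "see" for the input "saw", which is why these tools want tagged input.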
Edit: since you are going to use this for search, here are some tips:
- Simple stemming for English has a mixed reputation in the search engine world. Sometimes it works, often it doesn't.
- . , Google. , .
- Lucene, .
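The "sometimes it works, often it doesn't" behaviour of simple stemming can be seen with a deliberately naive suffix stripper. This is a sketch under my own assumptions, not the algorithm used by any particular search engine; `SUFFIXES` and `naive_stem` are made-up names for illustration.

```python
# Suffixes tried in order; the first match is stripped.
# This list is an illustrative assumption, not a real stemming rule set.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    for w in ["walks", "walked", "walking", "news", "running", "horses"]:
        print(f"{w} -> {naive_stem(w)}")
```

"walks", "walked" and "walking" all collapse to "walk", which is what you want for search. But "news" becomes "new", "running" becomes "runn" (so it no longer matches "run"), and "horses" becomes "hors" while "horse" is left alone. Real stemmers have more rules than this, but they run into the same kind of trade-off.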