When you search on Stack Overflow, the results show the excerpt of the question body that best matches your search terms, with those terms highlighted.
I wonder how best to do this manually in C#, that is, without the help of a full-text search engine.
The main problem: how do I quickly select the best excerpt of the text?
What I have done so far:
- I collect the indices of the spaces in the text. This tells me where words start, so I can start substrings at those positions.
- From each space index, I take the next 300 characters and count how many keyword occurrences they contain.
- I pick the 300-character window with the highest count and cut it out of the source text.
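
For reference, the steps above can be sketched roughly as follows. This is only a minimal illustration of the approach as described, not a tuned implementation: the names `ExcerptFinder`, `FindBestExcerpt`, `CountOccurrences` and the `windowLength` parameter are made up for this example, and it assumes case-insensitive, non-overlapping keyword matching with ties going to the earliest window.

```csharp
using System;
using System.Collections.Generic;

public static class ExcerptFinder
{
    // Counts non-overlapping, case-insensitive occurrences of keyword in text.
    private static int CountOccurrences(string text, string keyword)
    {
        if (string.IsNullOrEmpty(keyword)) return 0;

        int count = 0;
        int pos = 0;
        while ((pos = text.IndexOf(keyword, pos, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            count++;
            pos += keyword.Length; // continue searching after this match
        }
        return count;
    }

    // Returns the window of at most windowLength characters, starting at a
    // word boundary, that contains the most keyword occurrences.
    public static string FindBestExcerpt(
        string text, IReadOnlyList<string> keywords, int windowLength = 300)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;

        // Candidate starts: position 0 plus the character after every space.
        var starts = new List<int> { 0 };
        for (int i = 0; i < text.Length - 1; i++)
        {
            if (text[i] == ' ') starts.Add(i + 1);
        }

        int bestStart = 0;
        int bestScore = -1;
        foreach (int start in starts)
        {
            int length = Math.Min(windowLength, text.Length - start);
            string window = text.Substring(start, length);

            int score = 0;
            foreach (string keyword in keywords)
            {
                score += CountOccurrences(window, keyword);
            }

            // Strict comparison: on a tie, the earliest window is kept.
            if (score > bestScore)
            {
                bestScore = score;
                bestStart = start;
            }
        }

        int bestLength = Math.Min(windowLength, text.Length - bestStart);
        return text.Substring(bestStart, bestLength);
    }
}
```

Note that this version re-scans every window from scratch, so the work grows with the number of word starts times the window length times the number of keywords.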
Is this a good approach? Is there a faster way? Is counting the number of occurrences the best way to find the most relevant part?