I am writing a bridge between the user and a search engine, not a search engine itself. Part of the value I add will be inferring the intent of a query. The intent behind a tracking number, stock symbol, or address is fairly obvious. If I can classify the query, I can decide whether the user needs to see search results at all. Of course, if I cannot classify it, they will see the search results. I am currently designing this inference mechanism.
I am writing a parser; it must take any given token and assign it one or more categories. Here are some hypothetical examples in English:
- "denver" is USCITY and PLACENAME
- "aapl" is NASDAQSYMBOL and STOCKTICKERSYMBOL
- "555 555 5555" is USPHONENUMBER
I know that each of these cases is likely to require special handling, but I'm not sure where to start.
Ideally, I would get something simple:
    queryCategory = magicCategoryFinder(query)
    print(queryCategory)
    # "SOMECATEGORY", or a list of categories
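
To make the desired interface more concrete, here is a minimal Python sketch of what such a function might look like for the three examples above. Everything in it is an assumption for illustration only: the snake_case name `magic_category_finder`, the tiny hard-coded lookup sets `US_CITIES` and `NASDAQ_SYMBOLS`, and the phone-number pattern `US_PHONE_RE` are all hypothetical, and a real system would need proper data sources and far more careful matching.

```python
import re

# Hypothetical lookup tables; a real system would load these from a gazetteer
# or an exchange listing rather than hard-coding a few entries.
US_CITIES = {"denver", "boston", "seattle"}
NASDAQ_SYMBOLS = {"aapl", "msft", "goog"}

# Ten digits with optional space, dot, or hyphen separators, e.g. "555 555 5555".
# This is a deliberately loose assumption, not a full US phone number validator.
US_PHONE_RE = re.compile(r"\d{3}[ .-]?\d{3}[ .-]?\d{4}")


def magic_category_finder(query):
    """Return every category the query matches; an empty list means unclassified."""
    token = query.strip().lower()
    categories = []
    if token in US_CITIES:
        categories += ["USCITY", "PLACENAME"]
    if token in NASDAQ_SYMBOLS:
        categories += ["NASDAQSYMBOL", "STOCKTICKERSYMBOL"]
    if US_PHONE_RE.fullmatch(token):
        categories.append("USPHONENUMBER")
    return categories


if __name__ == "__main__":
    for example in ("denver", "aapl", "555 555 5555", "some other query"):
        print(example, magic_category_finder(example))
```

In this sketch an empty list would be the signal to fall back to showing ordinary search results, while a non-empty list could be routed to category-specific handling.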