I have a database table full of addresses that were geocoded by Google Maps. Google abbreviates all directions (West → W, East → E, etc.).
So if I enter an address such as “100 West Pender Street,” the formatted address returned by Google Maps is “100 W Pender St,” and that is what I insert into my table.
Now, when a user searches for this address, all of the following queries should match:
“100 west pender street”, “100 pender”, “100 w pender”, “100 west pender”
and they more or less do. However, the “W” in the table is ignored because it falls below the minimum word length. As a result, addresses on East Pender get equal weight in the search results (the “E” is ignored as well).
What is the best way to handle this?
I suspect that lowering the minimum word length to 1 is a "bad thing."
I could search for known abbreviations (N, E, S, W, St, Ave, Dr, etc.) in the Google addresses and replace them with their expanded forms, but there are some street names where this is not valid (some cities have single-letter street names: J Street, etc.).
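To make the search-and-replace idea concrete, here is a rough sketch in Python. The function name `expand_address`, the two lookup tables, and the rule for single-letter streets are all my own assumptions for illustration: a direction letter is left alone when it is immediately followed by a street type, on the guess that it is then the street name itself (as in “E St”). It does not handle other ambiguities, such as “St” meaning “Saint.”

```python
# Hypothetical lookup tables; a real list would be much longer.
DIRECTIONS = {"n": "north", "e": "east", "s": "south", "w": "west"}
STREET_TYPES = {"st": "street", "ave": "avenue", "dr": "drive"}

def expand_address(address):
    """Expand known abbreviations in a formatted address (heuristic sketch)."""
    tokens = address.lower().replace(",", " ").split()
    expanded = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok in DIRECTIONS:
            # Assumed guard: a letter directly before a street type is
            # treated as the street name ("E St"), not as a direction.
            expanded.append(tok if nxt in STREET_TYPES else DIRECTIONS[tok])
        elif tok in STREET_TYPES:
            expanded.append(STREET_TYPES[tok])
        else:
            expanded.append(tok)
    return " ".join(expanded)

print(expand_address("100 W Pender St"))  # 100 west pender street
print(expand_address("500 E St"))         # 500 e street
```

The expanded string could be stored in a separate searchable column next to the original Google address, but a single-letter street name would still be too short to index.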
Also addresses like “123 160 St” are not searchable at all, because the street number (123) and street name (160) both fall below the minimum word length.
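To illustrate what I think is happening, here is a small Python sketch that mimics a minimum-word-length filter. It is not MySQL's actual tokenizer; `indexed_tokens` is a made-up helper, and the threshold of 4 is an assumption based on the default `ft_min_word_len` for MyISAM full-text indexes.

```python
import re

def indexed_tokens(text, min_len=4):
    """Return the lowercase word tokens that survive a minimum-length filter."""
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return [w for w in words if len(w) >= min_len]

print(indexed_tokens("100 W Pender St"))  # ['pender']
print(indexed_tokens("100 E Pender St"))  # ['pender']
print(indexed_tokens("123 160 St"))       # []
```

With that threshold, the West and East Pender addresses reduce to the same single token, and “123 160 St” leaves nothing to index at all.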
Is MySQL full-text search the right tool for this? Does Sphinx offer something better?
Or is there another solution that I have not yet considered? Keep in mind that the user's search query will be matched not only against the property address, but also against other text columns, such as the name and description of the property.