I maintain a music database in mySQL, how do I return results stored in, for example, “Tiësto” when people search for “Tiesto”?
All data is stored when indexing the full text, if that matters.
I already use a combination of Levenshtein in PHP and REGEXP in SQL - I’m not trying to solve this problem, but just for greater accessibility for search in general.
PHP:
function Levenshtein($word) { $words = array(); for ($i = 0; $i < strlen($word); $i++) { $words[] = substr($word, 0, $i) . '_' . substr($word, $i); $words[] = substr($word, 0, $i) . substr($word, $i + 1); $words[] = substr($word, 0, $i) . '_' . substr($word, $i + 1); } $words[] = $word . '_'; return $words; } $fuzzyartist = Levenshtein($_POST['searchartist']); $searchimplode = "'".implode("', '", $fuzzyartist)."'";
MySql:
SELECT * FROM new_track_database WHERE artist REGEXP concat_ws('|', $searchimplode);
To add, I often perform character set conversions and lowercase sanitation in PHP, but they have always been ANOTHER way - standardizing non-Latin characters. I can't think of an oppsite process, but only in certain circumstances based on the data that I saved.
php regex mysql search levenshtein distance
user3174541 03 Oct '14 at 19:03 2014-10-03 19:03
source share