MySQL based row query

Question

This query has bothered me for the past 10 hours. Here we go:

I am pulling a list of names from the database and want to compare them with each other, so that a name that is a near-duplicate of an earlier one is not returned by the query.

Example:

I have the following names:

Seaside heights
Seaside HGTS
Talladega
Torncal Center
Tornkal ctr
Yonkers
Zebraville

I want it to return like this:

Seaside heights
Talladega
Torncal Center
Yonkers
Zebraville

Basically, I think it should take the first 8 characters of each name, e.g. substr($name, 0, 8) in PHP or SUBSTRING(name, 1, 8) in MySQL (where positions start at 1), compare those 8 characters against the next entry, and skip that entry if they match.

Perhaps I am overthinking this. Any insights or concepts that may work will be appreciated.

php mysql

csteel Mar 03 '12 at 4:53

4 answers

William · Answer 1 · 2012-03-03T04:59:38+0000

First, query all the data.

Then, for each returned record, run a longest common subsequence (LCS) algorithm against the other records.

If the longest common subsequence of two different records is at least a threshold length of your choice, you can classify them as similar.

http://en.wikipedia.org/wiki/Longest_common_subsequence_problem

Edit: it turns out PHP has a good built-in function for this: http://php.net/manual/en/function.similar-text.php
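As a rough sketch of how similar_text() could be applied here: the helper name dropSimilarNames and the 75 percent threshold are assumptions made for illustration, not part of the answer, and the list is assumed to be sorted already (for example with ORDER BY in the query), so each name only needs to be compared with the last one that was kept.

```php
<?php
// Hypothetical helper: keep a name only if it is not "too similar"
// to the most recently kept name. Assumes $names is already sorted.
function dropSimilarNames(array $names, float $threshold = 75.0): array
{
    $kept = [];
    foreach ($names as $name) {
        $last = end($kept); // false while $kept is still empty
        if ($last !== false) {
            // similar_text() is case-sensitive, so compare lowercased copies.
            // The third argument receives the similarity as a percentage.
            similar_text(strtolower($last), strtolower($name), $percent);
            if ($percent >= $threshold) {
                continue; // treat as a near-duplicate of the previous entry
            }
        }
        $kept[] = $name;
    }
    return $kept;
}
```

The threshold is the part that needs tuning against real data: set too low it merges unrelated names, set too high it lets abbreviations through.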

user319198 · Answer 2 · 2012-03-03T05:07:19+0000

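If the differences between the rows look like those in the example, try grouping on the first word:

SELECT MIN(names) AS names FROM tablename GROUP BY SUBSTRING_INDEX(names, ' ', 1);

MIN() picks one name per group, so the query also runs when ONLY_FULL_GROUP_BY is enabled. Note that grouping on the first word merges the two Seaside rows but not Torncal Center and Tornkal ctr, because their first words are spelled differently.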

Jeremy Holovacs · Answer 3 · 2012-03-03T05:02:55+0000

Perhaps you should take a look at SOUNDEX. It will not be perfect, but it may get you in the ballpark.
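A minimal sketch of the Soundex idea on the PHP side, using the built-in soundex() function (MySQL has a SOUNDEX() function as well). The helper name dropSoundalikes is hypothetical; it simply keeps the first name seen for each Soundex code.

```php
<?php
// Hypothetical helper: keep only the first name for each Soundex code.
function dropSoundalikes(array $names): array
{
    $seen = [];
    $kept = [];
    foreach ($names as $name) {
        // soundex() returns a short phonetic key: the first letter
        // followed by digits derived from the following consonants.
        $code = soundex($name);
        if (isset($seen[$code])) {
            continue; // an earlier name already produced this code
        }
        $seen[$code] = true;
        $kept[] = $name;
    }
    return $kept;
}
```

Because the code is built from only the first few consonants, unrelated names that start alike can collide, which is why this is a rough filter rather than an exact one.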

Amber · Answer 4 · 2012-03-03T05:05:23+0000

If the differences between the rows are limited to a small set of abbreviations (HGTS ↔ Heights, CTR ↔ Center, etc.), you can simply keep a table of these mappings, replace the abbreviations with their full versions, and then check for uniqueness.
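One way this could look in PHP, as a sketch only: the ABBREVIATIONS constant stands in for the mapping table (in practice it might be loaded from the database), and the function names are made up for this example.

```php
<?php
// Hypothetical mapping of abbreviation => full word (all lowercase).
const ABBREVIATIONS = [
    'hgts' => 'heights',
    'ctr'  => 'center',
];

// Lowercase the name and expand any known abbreviations word by word.
function normalizeName(string $name): string
{
    $words = preg_split('/\s+/', strtolower(trim($name)));
    foreach ($words as $i => $word) {
        $words[$i] = ABBREVIATIONS[$word] ?? $word;
    }
    return implode(' ', $words);
}

// Keep the first original name for each normalized form.
function dropAbbreviatedDuplicates(array $names): array
{
    $kept = [];
    foreach ($names as $name) {
        $key = normalizeName($name);
        if (!isset($kept[$key])) {
            $kept[$key] = $name;
        }
    }
    return array_values($kept);
}
```

This handles Seaside HGTS, but a spelling variant such as Tornkal versus Torncal is not an abbreviation and would still need one of the fuzzy approaches from the other answers.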
