MySQL regex search without duplicate characters

I have a database table with words from a dictionary.

Now I want to select the words that can be formed from a given set of letters (an anagram search). For example, if I give the string SEPIAN, it should return values like apes, pain, pains, pies, pines, sepia, etc.

For this, I used the query

 SELECT * FROM words WHERE word REGEXP '^[SEPIAN]{1,6}$' 

But this query also returns words such as anna and essen, which repeat characters more often than the search string does. For instance, anna has two n's, but there is only one n in the search string SEPIAN.

How can I write my regex to achieve this? Also, if my search string itself contains duplicate characters, the results should be allowed to contain that many duplicates of those characters.
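One way to pin down the requirement independently of any regex: a word qualifies when it uses no letter more often than the search string supplies it. A minimal Python sketch of that check follows; the function name `can_form` and the sample word list are illustrative assumptions, not part of the question's schema.

```python
from collections import Counter

def can_form(word: str, letters: str) -> bool:
    """Return True if `word` can be spelled using each character of
    `letters` at most as many times as it occurs there."""
    if not word:
        return False  # the query's {1,6} quantifier also excludes empty strings
    available = Counter(letters.lower())
    needed = Counter(word.lower())
    # Every letter of the word must be available in sufficient quantity.
    return all(available[ch] >= n for ch, n in needed.items())

candidates = ["apes", "pain", "pains", "pies", "pines", "sepia", "anna", "essen"]
matches = [w for w in candidates if can_form(w, "SEPIAN")]
```

Here `matches` keeps the first six words and drops anna and essen, which is the behaviour the query should reproduce.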

2 answers

Since MySQL does not support back-references to capture groups, the typical solution (\w).*\1 will not work. This means that any given solution will have to enumerate all possible doubled letters. In addition, as far as I can tell, back-references are not valid inside lookaheads or lookbehinds, and MySQL does not support lookaheads or lookbehinds anyway.

However, you can break this down into two expressions and use the following query:

 SELECT * FROM words WHERE word REGEXP '^[SEPIAN]{1,6}$' AND NOT word REGEXP 'S.*?S|E.*?E|P.*?P|I.*?I|A.*?A|N.*?N' 

Not very pretty, but it works, and it should also be reasonably efficient.


To allow a specified number of duplicates of a character, use the following pattern for that character in your second expression:

 A(.*?A){X,} 

Here A is your character and X is the number of times it is allowed to appear.

So, if you add another N to your search string, making it SEPIANN (2 Ns in total), your query will look like this:

 SELECT * FROM words WHERE word REGEXP '^[SEPIAN]{1,7}$' AND NOT word REGEXP 'S.*?S|E.*?E|P.*?P|I.*?I|A.*?A|N(.*?N){2}' 
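The two patterns can be generated from any search string rather than written by hand. The following Python sketch is one way to do that; `build_patterns` and `matches` are hypothetical helper names, and Python's `re` module stands in for MySQL's REGEXP only to check the logic. It emits a greedy `.*` instead of `.*?`, on the assumption that for a yes/no match the two are interchangeable and the greedy form is accepted by more regex engines. It also assumes the search string contains only ASCII letters.

```python
import re
from collections import Counter
from typing import Tuple

def build_patterns(letters: str) -> Tuple[str, str]:
    """Return (allowed, too_many) regex strings for a search string.

    allowed  : the word uses only the given letters and is no longer
               than the search string.
    too_many : some letter occurs more often than the search string allows.
    """
    if not (letters.isascii() and letters.isalpha()):
        raise ValueError("search string must contain only ASCII letters")
    counts = Counter(letters.upper())
    allowed = "^[{}]{{1,{}}}$".format("".join(counts), len(letters))
    # A letter allowed X times is over-used once it is followed by
    # X more occurrences of itself: A(.*A){X}
    too_many = "|".join(
        "{0}(.*{0}){{{1}}}".format(ch, n) for ch, n in counts.items()
    )
    return allowed, too_many

def matches(word: str, letters: str) -> bool:
    """Apply both patterns the way the SQL query combines them."""
    allowed, too_many = build_patterns(letters)
    return (re.search(allowed, word, re.IGNORECASE) is not None
            and re.search(too_many, word, re.IGNORECASE) is None)

allowed, too_many = build_patterns("SEPIANN")
sql = ("SELECT * FROM words WHERE word REGEXP '{}' "
       "AND NOT word REGEXP '{}'").format(allowed, too_many)
```

Because the input is restricted to letters before it is placed in the SQL text, the generated string cannot contain quotes; with less strict validation the patterns would be better passed as bound parameters.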

I think something like this will help you. The words table:

 | id | word       | alfagram   |
 ---------------------------------
 | 1  | karabar    | aaabkrr    |
 | 2  | malabar    | aaablmr    |
 | 3  | trantantan | aaannnrttt |

Here, alfagram holds the letters of the word sorted in alphabetical order.
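Computing the alfagram column is a one-off step per word. A small Python sketch follows; the helper name `alfagram` simply mirrors the column name, and how the column actually gets populated is not specified in the answer.

```python
def alfagram(word: str) -> str:
    """Return the letters of `word` sorted alphabetically."""
    return "".join(sorted(word.lower()))

# Reproduce the alfagram values for the three sample words.
rows = [(w, alfagram(w)) for w in ("karabar", "malabar", "trantantan")]
```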

PHP code:

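 $searchString = 'abrakadabra';

 // Count how many times each letter occurs in the search string.
 $searchStringAlfa = array();
 for ($i = 0, $c = strlen($searchString); $i < $c; $i++) {
     if (isset($searchStringAlfa[$searchString[$i]])) {
         $searchStringAlfa[$searchString[$i]]++;
     } else {
         $searchStringAlfa[$searchString[$i]] = 1;
     }
 }
 ksort($searchStringAlfa); // alphabetical order, matching the alfagram column

 // Each letter may appear between 0 and its available number of times.
 $regexp = '^';
 foreach ($searchStringAlfa as $alfa => $amount) {
     $regexp .= '[' . $alfa . ']{0,' . $amount . '}';
 }
 $regexp .= '$';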

$searchString is the set of letters you want to search with. Then you only need to execute the query:

 $result = mysql_query('SELECT * FROM words WHERE alfagram REGEXP "'.$regexp.'"'); 

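Additional validation and optimization may be required; for example, $searchString should be restricted to letters before it is used, since the resulting pattern is concatenated directly into the query.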
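As an illustration of the validation step, here is the same pattern construction as a Python sketch, restricted to the letters a-z so that nothing unexpected reaches the regex. `build_alfagram_regex` is a hypothetical name, and Python's `re` is used only to check the pattern against the three sample alfagrams from the table; it is not a claim about how MySQL evaluates it.

```python
import re
from collections import Counter

def build_alfagram_regex(search: str) -> str:
    """Build a pattern matching alfagrams that use each letter of
    `search` at most as often as it occurs in `search`."""
    search = search.lower()
    if not re.fullmatch(r"[a-z]+", search):
        raise ValueError("search string must contain only letters a-z")
    counts = Counter(search)
    # Letters in alphabetical order, each allowed 0..count times.
    body = "".join(
        "[{}]{{0,{}}}".format(ch, counts[ch]) for ch in sorted(counts)
    )
    return "^" + body + "$"

pattern = build_alfagram_regex("abrakadabra")
hits = [a for a in ("aaabkrr", "aaablmr", "aaannnrttt") if re.search(pattern, a)]
```

Note that every quantifier has a lower bound of 0, so the pattern also matches an empty alfagram; that only matters if the table can contain empty words.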


Source: https://habr.com/ru/post/920583/

