Fill a column with the number of substrings in another column

I have two tables, "A" and "B". Table "A" has two columns, "Body" and "Number". The "Number" column is empty; the goal is to fill it.

Table A: Body / Number

 ABABCDEF   /
 IJKLMNOP   /
 QRSTUVWKYZ /

Table "B" has only one column:

Table B: Values

 AB
 CD
 QR

Here is what I am looking for as a result:

 ABABCDEF   / 3
 IJKLMNOP   / 0
 QRSTUVWKYZ / 1

In other words, I want a query that counts, for each row in the Body column, how many times the substrings from the Values column occur in it.

How would you advise me to do this?

+4
2 answers

Here is the finished query; an explanation follows:

 SELECT Body,
        SUM(
          CASE WHEN Value IS NULL THEN 0
               ELSE (LENGTH(Body) - LENGTH(REPLACE(Body, Value, ''))) / LENGTH(Value)
          END
        ) AS Val
 FROM (
   SELECT TableA.Body, TableB.Value
   FROM TableA
   LEFT JOIN TableB ON INSTR(TableA.Body, TableB.Value) > 0
 ) CharMatch
 GROUP BY Body

The original answer also linked to a runnable SQL script for this query.

Now for the explanation ...

The inner query matches the rows of TableA with the substrings in TableB:

 SELECT TableA.Body, TableB.Value
 FROM TableA
 LEFT JOIN TableB ON INSTR(TableA.Body, TableB.Value) > 0

Its results:

 BODY                 VALUE
 -------------------- -----
 ABABCDEF             AB
 ABABCDEF             CD
 IJKLMNOP
 QRSTUVWKYZ           QR

If you just count these rows, you get 2 for the string ABABCDEF, because the join only checks whether each substring exists and does not take into account that AB occurs twice.

MySQL does not have a function like OCCURS, so to count the occurrences we compare the length of the string with its length after the target substring has been removed, then divide the difference by the length of the target substring.

  • REPLACE('ABABCDEF', 'AB', '') ==> 'CDEF'
  • LENGTH('ABABCDEF') ==> 8
  • LENGTH('CDEF') ==> 4

Thus, removing every occurrence of AB shortens the string by 8 - 4 = 4 characters. Divide 4 by 2 ( LENGTH('AB') ) to get the number of occurrences of AB : 2
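The arithmetic above can be checked outside the database. The following is a minimal Python sketch of the same length-difference trick, not part of the original answer; the function name count_occurrences is hypothetical, and Python's str.replace stands in for MySQL's REPLACE.

```python
def count_occurrences(body: str, value: str) -> int:
    """Count non-overlapping occurrences of value in body via the length trick."""
    if not value:
        # An empty target would make the divisor zero, so treat it as no match.
        return 0
    removed = body.replace(value, "")  # same role as REPLACE(Body, Value, '')
    return (len(body) - len(removed)) // len(value)


print(count_occurrences("ABABCDEF", "AB"))  # 2
print(count_occurrences("ABABCDEF", "CD"))  # 1
print(count_occurrences("IJKLMNOP", "QR"))  # 0
```

Summing the per-substring counts for ABABCDEF gives 2 + 1 + 0 = 3, the value expected in the question.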

The string IJKLMNOP would spoil this. It matches no target value, so the LEFT JOIN leaves Value as NULL for that row, and the length arithmetic would produce NULL rather than 0. The CASE inside SUM guards against this.
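The effect of the CASE guard on the unmatched row can be tried without a MySQL server. The sketch below is an assumption-laden stand-in: it runs the query against an in-memory SQLite database through Python's sqlite3 module, using the TableA/TableB names from the answer. SQLite also provides INSTR, REPLACE and LENGTH, but it is not MySQL, and under SQLite the unguarded sum for the unmatched row comes out as NULL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Body TEXT);
    CREATE TABLE TableB (Value TEXT);
    INSERT INTO TableA VALUES ('ABABCDEF'), ('IJKLMNOP'), ('QRSTUVWKYZ');
    INSERT INTO TableB VALUES ('AB'), ('CD'), ('QR');
""")

# Shared tail of both queries: the LEFT JOIN from the answer, grouped per Body.
JOINED = """
    FROM TableA
    LEFT JOIN TableB ON INSTR(TableA.Body, TableB.Value) > 0
    GROUP BY Body
    ORDER BY Body
"""

# Without the guard, the unmatched row carries a NULL Value into the arithmetic.
unguarded = conn.execute(
    "SELECT Body, SUM((LENGTH(Body) - LENGTH(REPLACE(Body, Value, ''))) / LENGTH(Value))"
    + JOINED
).fetchall()

# With the guard, a NULL Value contributes 0 to the sum.
guarded = conn.execute(
    "SELECT Body, SUM(CASE WHEN Value IS NULL THEN 0 ELSE "
    "(LENGTH(Body) - LENGTH(REPLACE(Body, Value, ''))) / LENGTH(Value) END)"
    + JOINED
).fetchall()

print(unguarded)  # [('ABABCDEF', 3), ('IJKLMNOP', None), ('QRSTUVWKYZ', 1)]
print(guarded)    # [('ABABCDEF', 3), ('IJKLMNOP', 0), ('QRSTUVWKYZ', 1)]
```

MySQL's / operator returns a decimal rather than an integer, so the same query there would report values such as 3.0000 instead of 3.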

+2

You need an update query:

 update A
 set cnt = (select sum((length(a.body) - length(replace(a.body, b.value, ''))) / length(b.value))
            from b
           )

This uses a little trick to count the number of occurrences of b.value in a given string. It replaces each occurrence with an empty string and measures the difference in length between the two strings. That difference is then divided by the length of the replaced string.
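As a rough check of this update, here is a sketch that runs it against an in-memory SQLite database via Python's sqlite3 module. SQLite standing in for MySQL is an assumption, as are the column types; the table and column names (A, B, body, value, cnt) follow the answer. The whole length difference is parenthesized before the division so that each substring's count is computed first and then summed.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; column types are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (body TEXT, cnt INTEGER);
    CREATE TABLE B (value TEXT);
    INSERT INTO A (body) VALUES ('ABABCDEF'), ('IJKLMNOP'), ('QRSTUVWKYZ');
    INSERT INTO B VALUES ('AB'), ('CD'), ('QR');
""")

# Correlated subquery: for each row of A, sum the occurrence counts over all of B.
conn.execute("""
    UPDATE A
    SET cnt = (
        SELECT SUM((LENGTH(A.body) - LENGTH(REPLACE(A.body, B.value, ''))) / LENGTH(B.value))
        FROM B
    )
""")

rows = conn.execute("SELECT body, cnt FROM A ORDER BY body").fetchall()
print(rows)  # [('ABABCDEF', 3), ('IJKLMNOP', 0), ('QRSTUVWKYZ', 1)]
```

No join is involved here, so every row of B contributes a number (0 when the substring is absent) and no NULL guard is needed for this sample data.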

If you just need the number of matches (so the first value will be "2" instead of "3"):

 update A
 set cnt = (select count(*)
            from b
            where a.body like concat('%', b.value, '%')
           )
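The match-count variant can be sketched the same way, again assuming an in-memory SQLite database through Python's sqlite3 module in place of MySQL. One deliberate deviation: the pattern is built with SQLite's || concatenation operator instead of concat, because in MySQL's default mode || means logical OR and concat is the right choice there.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; column types are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (body TEXT, cnt INTEGER);
    CREATE TABLE B (value TEXT);
    INSERT INTO A (body) VALUES ('ABABCDEF'), ('IJKLMNOP'), ('QRSTUVWKYZ');
    INSERT INTO B VALUES ('AB'), ('CD'), ('QR');
""")

# Count how many distinct B values appear in each body, ignoring repeats.
conn.execute("""
    UPDATE A
    SET cnt = (
        SELECT COUNT(*)
        FROM B
        WHERE A.body LIKE '%' || B.value || '%'
    )
""")

matches = conn.execute("SELECT body, cnt FROM A ORDER BY body").fetchall()
print(matches)  # [('ABABCDEF', 2), ('IJKLMNOP', 0), ('QRSTUVWKYZ', 1)]
```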
0

Source: https://habr.com/ru/post/1481966/

