I am trying to clean up data in a PostgreSQL table in which some records contain a large number of profanities in the email_address column (these records were entered by agitated users out of frustration with a bug that has since been fixed):
┌───────────────────┐
│   email_address   │
├───────────────────┤
│ foo@go.bar.me.net │
│ foo@foo.com       │
│ foo@example.com   │
│ baz@example.com   │
│ barred@qux.com    │
└───────────────────┘
Desired query output
I would like to write a query that annotates each row of the data table with a profanity score and orders the rows by that score, so that a person can go through the annotated data (presented in a web application) and take the necessary action:
┌───────────────────┬───────┐
│   email_address   │ score │
├───────────────────┼───────┤
│ foo@foo.com       │    18 │
│ foo@go.bar.me.net │    14 │
│ foo@example.com   │     9 │
│ baz@example.com   │     3 │
│ barred@qux.com    │     0 │
└───────────────────┴───────┘
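Attempt #1
The approach I am taking is to build a list of regular expressions (now I have 2 problems...) and scores, so that each profanity contributes its score whenever it is found in email_address. My profanities table looks like this: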
┌──────────────────┬───────┐
│ profanity_regexp │ score │
├──────────────────┼───────┤
│ foo              │     9 │
│ bar(?!red)       │     5 │
│ baz              │     3 │
└──────────────────┴───────┘
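For anyone who wants to reproduce this, the two tables could be set up roughly as follows. This is only a sketch: the column types are assumptions, and the regexp column is named posix_regexp here because that is what the queries in this post reference, even though the printout above labels it profanity_regexp.

```sql
-- Hypothetical schema: column types are assumed, not taken from the real database.
CREATE TABLE data (
    email_address text
);

CREATE TABLE profanities (
    posix_regexp text,    -- shown as profanity_regexp in the printout above
    score        integer
);

-- Sample rows copied from the tables shown in this post.
INSERT INTO data (email_address) VALUES
    ('foo@go.bar.me.net'),
    ('foo@foo.com'),
    ('foo@example.com'),
    ('baz@example.com'),
    ('barred@qux.com');

INSERT INTO profanities (posix_regexp, score) VALUES
    ('foo',        9),
    ('bar(?!red)', 5),  -- negative lookahead, so "barred" is not counted
    ('baz',        3);
```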
Then I use a LATERAL join over regexp_matches to extract all profanities from each email_address (but rows with no profanities are dropped):
    SELECT
        data.email_address,
        array_agg(matches) AS profanities_found
    FROM
        data,
        profanities p,
        LATERAL regexp_matches(data.email_address, p.posix_regexp, 'gi') matches
    GROUP BY
        data.email_address;
This produces the following result:
┌───────────────────┬───────────────────┐
│   email_address   │ profanities_found │
├───────────────────┼───────────────────┤
│ foo@foo.com       │ {{foo},{foo}}     │
│ foo@example.com   │ {{foo}}           │
│ foo@go.bar.me.net │ {{foo},{bar}}     │
│ baz@example.com   │ {{baz}}           │
└───────────────────┴───────────────────┘
SUB-SELECT
I also tried to get an array of profanity score subtotals for each row using this SQL:
    SELECT
        data.email_address,
        array(
            SELECT score * (
                SELECT COUNT(*)
                FROM (SELECT
                    regexp_matches(data.email_address, p.posix_regexp, 'gi')
                ) matches
            )
            FROM profanities p
        ) prof
    FROM data;
This correctly returns all rows (including those with no profanities):
┌───────────────────┬──────────┐
│   email_address   │   prof   │
├───────────────────┼──────────┤
│ foo@go.bar.me.net │ {9,5,0}  │
│ foo@foo.com       │ {18,0,0} │
│ foo@example.com   │ {9,0,0}  │
│ baz@example.com   │ {0,0,3}  │
│ barred@qux.com    │ {0,0,0}  │
└───────────────────┴──────────┘
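How do I sum the result of the lateral join to get the desired output?
Is there another strategy I could use to get the desired result?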
http://sqlfiddle.com/#!17/6685c/4
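
As a sketch of one way the per-profanity subtotals might be collapsed into a single score per address: count the matches inside a LATERAL subquery, so that every (address, profanity) pair yields exactly one row even when nothing matches, then sum score times match count. This assumes the column name posix_regexp used in the queries above and an integer score column. Duplicate email_address values would be merged by the GROUP BY.

```sql
-- Sketch only: assumes tables data(email_address) and
-- profanities(posix_regexp, score) as used in the queries above.
SELECT
    d.email_address,
    COALESCE(SUM(p.score * m.hits), 0) AS score
FROM data d
CROSS JOIN profanities p
CROSS JOIN LATERAL (
    -- COUNT(*) without GROUP BY always returns one row, so addresses
    -- with zero matches are kept with hits = 0 instead of being dropped.
    SELECT COUNT(*) AS hits
    FROM regexp_matches(d.email_address, p.posix_regexp, 'gi')
) m
GROUP BY d.email_address
ORDER BY score DESC;  -- a bare name in ORDER BY resolves to the output column
```

With the sample data in this post, foo@foo.com would get two matches of foo (2 × 9 = 18) and barred@qux.com would get none, because the bar(?!red) lookahead rejects "barred".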