SQL Query to find rows that did not occur this month

I am trying to find the number of sellers who sold last month but did not sell this month.

I have a query that works, but I don’t think it is effective, and I did not understand how to do this for all months.

SELECT count(distinct user_id) as users
FROM transactions
WHERE MONTH(date) = 12
AND YEAR(date) = 2015
AND transactions.status = 'COMPLETED'
AND transactions.amount > 0
AND transactions.user_id NOT IN 
(
    SELECT distinct user_id
    FROM transactions
    WHERE MONTH(date) = 1
    AND YEAR(date) = 2016
    AND transactions.status = 'COMPLETED'
    AND transactions.amount > 0
)

Table structure:

+---------+------------+-------------+--------+
| user_id |    date    |   status    | amount |
+---------+------------+-------------+--------+
|       1 | 2016-01-01 | 'COMPLETED' | 1.00   |
|       2 | 2015-12-01 | 'COMPLETED' | 1.00   |
|       3 | 2015-12-01 | 'COMPLETED' | 2.00   |
|       1 | 2015-12-01 | 'COMPLETED' | 3.00   |
+---------+------------+-------------+--------+

In this case, users with identifiers 2and 3did not make the sale this month.

+4
source share
4 answers

Use conditional aggregation:

SELECT count(*) as users
FROM
 (
    SELECT user_id
    FROM transactions
                       -- 1st of previous month
    WHERE date BETWEEN SUBDATE(SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1), interval 1 month) 
                       -- end of current month
                   AND LAST_DAY(CURRENT_DATE)
    AND transactions.status = 'COMPLETED'
    AND transactions.amount > 0
    GROUP BY user_id
           -- any row from previous month
    HAVING MAX(CASE WHEN date < SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1)
                    THEN date 
               END) IS NOT NULL
           -- no row in current month
       AND MAX(CASE WHEN date >= SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1)
                    THEN date 
               END) IS NULL           
 ) AS dt

SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1)= first day of the current month

SUBDATE(first day of current month, interval 1 month)= first day of the previous month

LAST_DAY(CURRENT_DATE)= end of current month

+2
source

, curdate() DATE_SUB(curdate(), INTERVAL 1 MONTH) ( if /):

SELECT count(distinct user_id) as users
FROM transactions
WHERE MONTH(date) = MONTH(DATE_SUB(curdate(), INTERVAL 1 MONTH))
AND transactions.status = 'COMPLETED'
AND transactions.amount > 0
AND transactions.user_id NOT IN 
(
    SELECT distinct user_id
    FROM transactions
    WHERE MONTH(date) = MONTH(curdate())
    AND transactions.status = 'COMPLETED'
    AND transactions.amount > 0
)

,

+1

. , EXPLAIN.

SELECT COUNT(DISTINCT user_id) users
  FROM transactions t
  LEFT 
  JOIN transactions x
    ON x.user_id = t.user_id
   AND x.date BETWEEN '2016-01-01' AND '2016-01-31'
   AND x.status = 'COMPLETED'
   AND x.amount > 0
 WHERE t.date BETWEEN '2015-12-01' AND '2015-12-31'
   AND t.status = 'COMPLETED'
   AND t.amount > 0
   AND x.user_id IS NULL;
+1

- :
, . , , .
-.

To make your query efficient, I would recommend at least a 2-column index for table transactions on [status, amount]. However, to prevent the query from finding data in the actual table, you can even create an index with four columns [status, amount, date, user_id], which should further improve the performance of your query.

Postgres (v9.0 +, verified)

SELECT   (DATE_PART('year', t.date) || '-' || DATE_PART('month', t.date)) AS d,
         STRING_AGG( DISTINCT t.user_id::TEXT, ',' ) AS buyers
FROM     transactions t
WHERE    t.status = 'COMPLETED'
AND      t.amount > 0
GROUP BY DATE_PART('year', t.date),
         DATE_PART('month', t.date)
ORDER BY DATE_PART('year', t.date),
         DATE_PART('month', t.date)
;

MySQL (not tested)

SELECT   (YEAR(t.date) || '-' || MONTH(t.date)) AS d,
         GROUP_CONCAT( DISTINCT t.user_id ) AS buyers
FROM     transactions t
WHERE    t.status = 'COMPLETED'
AND      t.amount > 0
GROUP BY YEAR(t.date), MONTH(t.date)
ORDER BY YEAR(t.date), MONTH(t.date)
;

Ruby (an example for further processing)

db_result                   = ActiveRecord::Base.connection_pool.with_connection { |con| con.execute( db_query ) }
unique_buyers               = db_result.map{|e|[e['d'],e['buyers'].split(',')]}.to_h

buyers_dec15_but_not_jan16  = unique_buyers['2015-12'] - unique_buyers['2016-1']
buyers_nov15_but_not_dec16  = unique_buyers['2015-11']||[] - unique_buyers['2015-12']
...(and so on)...
0
source

Source: https://habr.com/ru/post/1626236/


All Articles