Select rows for which the FIRST result is a specific value.

I have a custom-written analytics system, and I'm trying to write a query that returns users who landed on a specific page as their first hit. The relevant parts of the table are set up as follows, with some sample data:

pageviews table

 +----+---------------------+----------+-------------+
 | id | time_in             | users_id | articles_id |
 +----+---------------------+----------+-------------+
 |  0 | 2013-08-15 00:00:00 |        0 |           0 |
 |  1 | 2013-08-16 00:00:00 |        0 |           1 |
 |  2 | 2013-08-17 00:00:00 |        1 |           1 |
 |  3 | 2013-08-18 00:00:00 |        1 |           0 |
 |  4 | 2013-08-19 00:00:00 |        1 |           1 |
 |  5 | 2013-08-20 00:00:00 |        2 |           1 |
 +----+---------------------+----------+-------------+

NOTE: The ID fields in my database actually use GUIDs, not ints as in this simple example.

Now, if I ask for users who read article 1 as their first hit, I want my query to return users 1 and 2, but not 0, since user 0 saw article 0 as their first hit on the site. Conversely, if I ask who read article 0 first, only user 0 should be returned.

Here is my query:

 SELECT *
 FROM pageviews
 WHERE articles_id = 1
 GROUP BY users_id
 ORDER BY time_in

But this returns a row for every user who has ever read article 1, and does not filter out users who did not see it as their first hit. I feel like I am going about this the wrong way, so I am asking you.

Thanks in advance.

2 answers

One way to do it

 SELECT v.users_id
 FROM pageviews v
 JOIN (
     SELECT users_id, MIN(time_in) time_in
     FROM pageviews
     GROUP BY users_id
 ) q ON v.users_id = q.users_id
    AND v.time_in = q.time_in
 WHERE v.articles_id = 1

Output:

 | USERS_ID |
 ------------
 |        1 |
 |        2 |

Here is the SQLFiddle demo
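
If you want to try the join locally without a database server, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. The names ROWS, FIRST_HIT_JOIN and users_landing_on are made up for this illustration, and the column types are assumptions (the real table uses GUIDs, and time_in is stored here as ISO-formatted text so that MIN() orders it correctly).

```python
import sqlite3

# Sample rows from the question: (id, time_in, users_id, articles_id)
ROWS = [
    (0, "2013-08-15 00:00:00", 0, 0),
    (1, "2013-08-16 00:00:00", 0, 1),
    (2, "2013-08-17 00:00:00", 1, 1),
    (3, "2013-08-18 00:00:00", 1, 0),
    (4, "2013-08-19 00:00:00", 1, 1),
    (5, "2013-08-20 00:00:00", 2, 1),
]

# Same join as above, with the article id turned into a parameter.
FIRST_HIT_JOIN = """
SELECT v.users_id
FROM pageviews v
JOIN (
    SELECT users_id, MIN(time_in) AS time_in
    FROM pageviews
    GROUP BY users_id
) q ON v.users_id = q.users_id AND v.time_in = q.time_in
WHERE v.articles_id = ?
ORDER BY v.users_id
"""


def users_landing_on(article_id):
    """Return ids of users whose earliest pageview was the given article."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: only the columns shown in the question.
    conn.execute(
        "CREATE TABLE pageviews ("
        "id INTEGER PRIMARY KEY, time_in TEXT, "
        "users_id INTEGER, articles_id INTEGER)"
    )
    conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?, ?)", ROWS)
    result = [row[0] for row in conn.execute(FIRST_HIT_JOIN, (article_id,))]
    conn.close()
    return result


print(users_landing_on(1))  # [1, 2]
print(users_landing_on(0))  # [0]
```

The subquery finds each user's earliest time_in, and the outer WHERE then keeps only those first hits that were on the requested article. Note that if a user has two rows with exactly the same earliest time_in, both rows survive the join.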


Another way:

 SELECT users_id
 FROM pageviews p
 WHERE articles_id = 1
   AND time_in = (SELECT MIN(time_in)
                  FROM pageviews p2
                  WHERE p2.users_id = p.users_id)

Sqlfiddle here
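
The correlated subquery can be checked the same way. This is only a sketch with Python's sqlite3 module and the question's sample data; SAMPLE, FIRST_HIT_SUBQUERY and first_hit_users are hypothetical names, and the index on (users_id, time_in) is my own assumption about what might help the per-row MIN() lookup on a large table, not something from the original schema.

```python
import sqlite3

# Sample rows from the question: (id, time_in, users_id, articles_id)
SAMPLE = [
    (0, "2013-08-15 00:00:00", 0, 0),
    (1, "2013-08-16 00:00:00", 0, 1),
    (2, "2013-08-17 00:00:00", 1, 1),
    (3, "2013-08-18 00:00:00", 1, 0),
    (4, "2013-08-19 00:00:00", 1, 1),
    (5, "2013-08-20 00:00:00", 2, 1),
]

# Same correlated subquery as above, with the article id as a parameter.
FIRST_HIT_SUBQUERY = """
SELECT users_id
FROM pageviews p
WHERE articles_id = ?
  AND time_in = (SELECT MIN(time_in)
                 FROM pageviews p2
                 WHERE p2.users_id = p.users_id)
ORDER BY users_id
"""


def first_hit_users(article_id):
    """Return ids of users whose earliest pageview was the given article."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pageviews ("
        "id INTEGER PRIMARY KEY, time_in TEXT, "
        "users_id INTEGER, articles_id INTEGER)"
    )
    # Assumption: an index like this lets the inner MIN() be answered
    # per user without scanning the whole table.
    conn.execute(
        "CREATE INDEX idx_pageviews_user_time ON pageviews (users_id, time_in)"
    )
    conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?, ?)", SAMPLE)
    users = [row[0] for row in conn.execute(FIRST_HIT_SUBQUERY, (article_id,))]
    conn.close()
    return users


print(first_hit_users(1))  # [1, 2]
print(first_hit_users(0))  # [0]
```

Both answers give the same users on this data; which one performs better depends on the database engine and the indexes available, so it is worth checking the query plan on the real table.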


Source: https://habr.com/ru/post/1497918/

