SQL Server - select rows, delete duplicates, but keep row with highest date

Given the following database table in SQL Server 2008:

    ActionID (PK)  ActionType  ActionDate             UserID  ContentID
    1              'Create'    '2013-05-26 18:40:00'  1       10
    2              'Create'    '2013-05-26 18:30:00'  2       10
    3              'Edit'      '2013-05-26 12:30:00'  5       12
    4              'Edit'      '2013-05-26 12:25:00'  5       12
    5              'Delete'    '2013-05-26 12:22:00'  6       12

I want to write a SQL query that groups by ContentID and ActionType, but returns only the row with the latest ActionDate in each group and ignores the other rows, even if they have different UserID values or other differing column values.

So it should return:

    ActionID (PK)  ActionType  ActionDate             UserID  ContentID
    1              'Create'    '2013-05-26 18:40:00'  1       10
    3              'Edit'      '2013-05-26 12:30:00'  5       12
    5              'Delete'    '2013-05-26 12:22:00'  6       12

But I can’t figure out how to write a query to do this.

2 answers

One approach is to use a CTE (Common Table Expression).

With this CTE, you can partition your data by some criteria, i.e. your ContentID and ActionType, and have SQL Server number all of your rows, starting at 1 for each of these "partitions", ordered by ActionDate with the newest row first.

So try something like this:

    ;WITH Actions AS
    (
        SELECT
            ActionID, ActionType, ActionDate, UserID, ContentID,
            RowNum = ROW_NUMBER() OVER (PARTITION BY ContentID, ActionType
                                        ORDER BY ActionDate DESC)
        FROM dbo.YourTable
        WHERE ......
    )
    SELECT
        ActionID, ActionType, ActionDate, UserID, ContentID
    FROM Actions
    WHERE RowNum = 1
    ORDER BY ActionDate DESC

Does that come close to what you are looking for?
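For the "delete duplicates" part of the title, the same numbering can drive a DELETE, because SQL Server allows deleting through a CTE that reads from a single table. The following is only a sketch under stated assumptions, not part of the original answer: `#Actions` is a hypothetical temp table loaded with the sample rows from the question, and `Numbered` is an illustrative CTE name.

```sql
-- Hypothetical temp table holding the sample rows from the question.
CREATE TABLE #Actions (
    ActionID   INT PRIMARY KEY,
    ActionType VARCHAR(10) NOT NULL,
    ActionDate DATETIME    NOT NULL,
    UserID     INT         NOT NULL,
    ContentID  INT         NOT NULL
);

INSERT INTO #Actions (ActionID, ActionType, ActionDate, UserID, ContentID)
VALUES (1, 'Create', '2013-05-26T18:40:00', 1, 10),
       (2, 'Create', '2013-05-26T18:30:00', 2, 10),
       (3, 'Edit',   '2013-05-26T12:30:00', 5, 12),
       (4, 'Edit',   '2013-05-26T12:25:00', 5, 12),
       (5, 'Delete', '2013-05-26T12:22:00', 6, 12);

-- Number the rows inside each (ContentID, ActionType) group, newest first,
-- then delete every row except the newest one of its group.
;WITH Numbered AS
(
    SELECT
        ActionID,
        RowNum = ROW_NUMBER() OVER (PARTITION BY ContentID, ActionType
                                    ORDER BY ActionDate DESC)
    FROM #Actions
)
DELETE FROM Numbered
WHERE RowNum > 1;

-- The rows with ActionID 1, 3 and 5 remain.
SELECT ActionID, ActionType, ActionDate, UserID, ContentID
FROM #Actions
ORDER BY ActionID;

DROP TABLE #Actions;
```

Running the CTE with a plain SELECT first, before switching it to DELETE, is a cheap way to check which rows would be removed.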

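Another option is to join the table to a derived table that holds the latest ActionDate for each ContentID and ActionType:

    select t1.*
    from Table1 t1
    inner join (
        select ContentID, ActionType, max(ActionDate) as MaxDate
        from Table1
        group by ContentID, ActionType
    ) t2
        on  t1.ContentID  = t2.ContentID
        and t1.ActionType = t2.ActionType
        and t1.ActionDate = t2.MaxDate;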

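Any query that answers your question may give unexpected results if several rows for the same {ContentID, ActionType} pair share the same latest ActionDate: the join above returns all of them, whereas a ROW_NUMBER() ordered only by ActionDate keeps one of them arbitrarily.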
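To make this caveat concrete, here is a sketch of one case where results can surprise: two rows in the same group with an identical latest ActionDate. The table variable `@Actions`, its two rows (including UserID 7), and the choice of ActionID as a tie-breaker are all assumptions made for illustration; they do not come from the question.

```sql
-- Hypothetical data: two 'Edit' rows for ContentID 12 with the same ActionDate.
DECLARE @Actions TABLE (
    ActionID   INT PRIMARY KEY,
    ActionType VARCHAR(10) NOT NULL,
    ActionDate DATETIME    NOT NULL,
    UserID     INT         NOT NULL,
    ContentID  INT         NOT NULL
);

INSERT INTO @Actions (ActionID, ActionType, ActionDate, UserID, ContentID)
VALUES (3, 'Edit', '2013-05-26T12:30:00', 5, 12),
       (4, 'Edit', '2013-05-26T12:30:00', 7, 12);

-- Join on MAX(ActionDate): both tied rows match, so two rows come back.
SELECT t1.ActionID, t1.ActionType, t1.ActionDate, t1.UserID, t1.ContentID
FROM @Actions AS t1
INNER JOIN (
    SELECT ContentID, ActionType, MAX(ActionDate) AS MaxDate
    FROM @Actions
    GROUP BY ContentID, ActionType
) AS t2
    ON  t1.ContentID  = t2.ContentID
    AND t1.ActionType = t2.ActionType
    AND t1.ActionDate = t2.MaxDate;

-- ROW_NUMBER with an explicit tie-breaker (assumed here: highest ActionID wins)
-- returns exactly one row per group, in this case ActionID 4.
;WITH Numbered AS
(
    SELECT
        ActionID, ActionType, ActionDate, UserID, ContentID,
        RowNum = ROW_NUMBER() OVER (PARTITION BY ContentID, ActionType
                                    ORDER BY ActionDate DESC, ActionID DESC)
    FROM @Actions
)
SELECT ActionID, ActionType, ActionDate, UserID, ContentID
FROM Numbered
WHERE RowNum = 1;
```

Whether ActionID is the right tie-breaker depends on what "latest" should mean when dates are equal; any unique column in the ORDER BY makes the result deterministic.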


Source: https://habr.com/ru/post/1482838/

