Field-based duplicate filtering

Question

I have several records (bills) that are essentially duplicates of each other, except for one field that holds the language the bill is in.

For instance:

 ID,BillID,Account,Name,Amount,Lang
 1,0001,abcd,John Smith,10.99,ENG
 2,0002,qwer,Jane Doe,9.99,ENG
 3,0001,abcd,John Smith,10.99,SPA
 4,0003,abcd,John Smith,4.99,CHI

All fields are strings except ID, which is an autonumber.

In my SQL select I have

 SELECT * FROM Bills WHERE Account='abcd'

and it returns 3 rows, but 2 of those rows are for the same bill. I need to return unique bills for a specific account. So in the above scenario, I want to get 2 rows: BillID 0003, and either the ENG or the SPA version of BillID 0001, but not both.

What would the query be?

thanks

EDIT: I can't rely on a particular language always being there. For example, I can't say SELECT * FROM Bills WHERE Account='abcd' AND Lang='ENG', because sometimes a bill exists in only one language that is not ENG, and sometimes it exists in several languages in any combination.

+6

sql sql-server

George Jun 20 '13 at 18:53

3 answers

 select BillID,Account,Name,Amount,max(Lang) as Lang FROM Bills WHERE Account='abcd' group by BillID,Account,Name,Amount;

Since you do not give priority to any particular language when the same bill exists in several languages, the above query will work: it keeps one row per bill and reports whichever language sorts last.

EDIT: Removed "ID" from the group. @Phil you are right .. !!

+7

user2407394 Jun 20 '13 at 18:58

  select BillID,Account,Name,Amount,max(Lang) FROM Bills WHERE Account='abcd' group by BillID,Account,Name,Amount;

Same as user2407394's answer, except without ID in the GROUP BY, since grouping by ID makes every row its own group and the query would still return 3 rows.
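To see the difference, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption made so the example runs anywhere), and the column types are guessed from the question's description.

```python
import sqlite3

# Sample rows copied from the question; SQLite stands in for SQL Server.
SAMPLE = [
    (1, "0001", "abcd", "John Smith", "10.99", "ENG"),
    (2, "0002", "qwer", "Jane Doe", "9.99", "ENG"),
    (3, "0001", "abcd", "John Smith", "10.99", "SPA"),
    (4, "0003", "abcd", "John Smith", "4.99", "CHI"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Bills (ID INTEGER PRIMARY KEY, BillID TEXT, Account TEXT,"
    " Name TEXT, Amount TEXT, Lang TEXT)"
)
conn.executemany("INSERT INTO Bills VALUES (?, ?, ?, ?, ?, ?)", SAMPLE)

# ID is unique per row, so grouping by it leaves every row in its own group.
with_id = conn.execute(
    "SELECT ID, BillID, MAX(Lang) FROM Bills WHERE Account = 'abcd'"
    " GROUP BY ID, BillID, Account, Name, Amount ORDER BY ID"
).fetchall()

# Without ID, the two copies of bill 0001 collapse into a single group.
without_id = conn.execute(
    "SELECT BillID, Account, Name, Amount, MAX(Lang) AS Lang FROM Bills"
    " WHERE Account = 'abcd'"
    " GROUP BY BillID, Account, Name, Amount ORDER BY BillID"
).fetchall()

print(len(with_id))     # 3
print(len(without_id))  # 2
```

Note that MAX(Lang) picks the language that sorts last as a string (SPA rather than ENG for bill 0001), so the surviving language is arbitrary from a business point of view.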

+3

Phil Jun 20 '13 at 19:06

Eric Petroelje · Accepted Answer · 2013-06-20T18:58:41+0000

Probably the easiest way would be to use ROW_NUMBER with PARTITION BY:

 SELECT *
 FROM (
     SELECT b.*,
            ROW_NUMBER() OVER (PARTITION BY BillID ORDER BY Lang) as num
     FROM Bills b
     WHERE Account = 'abcd'
 ) tbl
 WHERE num = 1
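As a rough check of this approach, the sketch below runs the same query against the question's sample rows in an in-memory SQLite database via Python's sqlite3 module. SQLite is an assumption used as a stand-in for SQL Server (window functions need SQLite 3.25 or later), and the outer ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Sample rows copied from the question; SQLite stands in for SQL Server.
SAMPLE = [
    (1, "0001", "abcd", "John Smith", "10.99", "ENG"),
    (2, "0002", "qwer", "Jane Doe", "9.99", "ENG"),
    (3, "0001", "abcd", "John Smith", "10.99", "SPA"),
    (4, "0003", "abcd", "John Smith", "4.99", "CHI"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Bills (ID INTEGER PRIMARY KEY, BillID TEXT, Account TEXT,"
    " Name TEXT, Amount TEXT, Lang TEXT)"
)
conn.executemany("INSERT INTO Bills VALUES (?, ?, ?, ?, ?, ?)", SAMPLE)

# Number the rows inside each BillID partition, then keep only the first one.
QUERY = """
SELECT ID, BillID, Account, Name, Amount, Lang
FROM (
    SELECT b.*,
           ROW_NUMBER() OVER (PARTITION BY BillID ORDER BY Lang) AS num
    FROM Bills b
    WHERE Account = 'abcd'
) tbl
WHERE num = 1
ORDER BY BillID
"""

picked = conn.execute(QUERY).fetchall()
for row in picked:
    print(row)
```

Unlike the GROUP BY versions, this keeps the whole original row, including ID, so the ID and Lang values always come from the same record. Changing the ORDER BY inside the OVER clause changes which language wins when a bill exists in several.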
