How to join two tables in Access while deleting duplicates?

I have read every solution I could find on the Internet, and each one gives me different results.

I have two tables: Clients and Patrons. Both have the same structure: LastName, FirstName, Address, City, State and Zip. Clients has 108,000 records, while Patrons has only 42,000. Some of these records appear in both tables, since I do not have 150,000 clients.

I need one combined list. The problem is that some of my clients live at the same address, so I cannot just delete duplicate addresses, as that would delete legitimate clients. I also have several clients with very common names, say Jane Doe, where there are several people at different addresses, so I can't just filter out duplicate last or first names.

I am using Microsoft Access 2010.

Simply setting the query's Unique Values property to Yes does not help.

I followed the Microsoft help files and got results ranging from 2 to 168,000 records, with most attempts landing somewhere in between.

How can I get one list without duplicates, without having to alphabetize it and go through 150,000 entries line by line?

+6
4 answers

A UNION query returns only distinct rows. (There is also UNION ALL, but that keeps duplicate rows, so you don't want it here.)

Try this query. If it does not return what you want, please explain where it falls short.

SELECT LastName, FirstName, Address, City, State, Zip
FROM Clients
UNION
SELECT LastName, FirstName, Address, City, State, Zip
FROM Patrons
ORDER BY LastName, FirstName;

Perhaps you need another field or fields in ORDER BY. I just suggested something to get you started.
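
To see what the UNION does on a small scale, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for Access (an assumption made only so the example can be run anywhere), and the sample rows are made up. The point is that UNION compares whole rows, so two different Jane Does and two people at the same address all survive, while the exact repeat is dropped.

```python
import sqlite3

# SQLite stands in for Access here; table and column names follow the question.
conn = sqlite3.connect(":memory:")
cols = "LastName, FirstName, Address, City, State, Zip"
for table in ("Clients", "Patrons"):
    conn.execute(f"CREATE TABLE {table} ({cols})")

# Made-up sample rows: Jane Doe at 1 Main St appears in both tables.
conn.executemany("INSERT INTO Clients VALUES (?, ?, ?, ?, ?, ?)", [
    ("Doe", "Jane", "1 Main St", "Springfield", "IL", "62701"),
    ("Doe", "Jane", "9 Oak Ave", "Peoria", "IL", "61602"),       # same name, other address
    ("Doe", "John", "1 Main St", "Springfield", "IL", "62701"),  # same address, other person
])
conn.executemany("INSERT INTO Patrons VALUES (?, ?, ?, ?, ?, ?)", [
    ("Doe", "Jane", "1 Main St", "Springfield", "IL", "62701"),  # exact repeat of a client
    ("Roe", "Rick", "5 Elm Rd", "Peoria", "IL", "61602"),
])

def combined(operator):
    """Run the combining query with either UNION or UNION ALL."""
    sql = (f"SELECT {cols} FROM Clients {operator} "
           f"SELECT {cols} FROM Patrons ORDER BY LastName, FirstName")
    return conn.execute(sql).fetchall()

union_rows = combined("UNION")          # whole-row duplicates removed
union_all_rows = combined("UNION ALL")  # every row from both tables kept

print(len(union_rows), len(union_all_rows))
```

Rows that differ in any column, even by a single abbreviation in the address, still count as different rows for UNION.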

+8

One way to do this is a FULL OUTER JOIN with COALESCE on the values. This lets you know whether a record is in the Clients table, the Patrons table, or both.

Unfortunately, AFAIK Access does not have FULL OUTER JOIN, so you will need to simulate it.

SELECT a.LastName, a.FirstName, a.Address, a.City, a.State, a.Zip, "Both" AS type
FROM Clients a INNER JOIN Patrons b
ON a.LastName = b.LastName AND a.FirstName = b.FirstName AND a.Address = b.Address AND a.City = b.City AND a.State = b.State AND a.Zip = b.Zip
UNION ALL
SELECT a.LastName, a.FirstName, a.Address, a.City, a.State, a.Zip, "Client" AS type
FROM Clients a LEFT JOIN Patrons b
ON a.LastName = b.LastName AND a.FirstName = b.FirstName AND a.Address = b.Address AND a.City = b.City AND a.State = b.State AND a.Zip = b.Zip
WHERE b.PatronID IS NULL
UNION ALL
SELECT b.LastName, b.FirstName, b.Address, b.City, b.State, b.Zip, "Patron" AS type
FROM Clients a RIGHT JOIN Patrons b
ON a.LastName = b.LastName AND a.FirstName = b.FirstName AND a.Address = b.Address AND a.City = b.City AND a.State = b.State AND a.Zip = b.Zip
WHERE a.ClientID IS NULL;

PatronID and ClientID stand for whatever the primary keys are. The join compares FirstName as well, so that two people who share a last name and an address are not treated as one.

If you just need the list, though, you should simply use HansUp's answer.
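
As a rough illustration of the simulated full outer join, here is a small Python sketch. It uses the standard-library sqlite3 module in place of Access, the sample rows are made up, and the ID column and the Origin label are hypothetical names. It assumes a match means all six columns are equal, and it writes the third part as a LEFT JOIN with the tables swapped, which gives the same rows as the RIGHT JOIN form.

```python
import sqlite3

COLS = ["LastName", "FirstName", "Address", "City", "State", "Zip"]
conn = sqlite3.connect(":memory:")
for table in ("Clients", "Patrons"):
    # ID is a hypothetical primary key; use whatever key the real tables have.
    conn.execute(f"CREATE TABLE {table} (ID INTEGER PRIMARY KEY, {', '.join(COLS)})")

def add(table, *rows):
    """Insert made-up sample rows, letting SQLite assign the ID."""
    placeholders = ", ".join("?" * len(COLS))
    conn.executemany(
        f"INSERT INTO {table} ({', '.join(COLS)}) VALUES ({placeholders})", rows)

add("Clients",
    ("Doe", "Jane", "1 Main St", "Springfield", "IL", "62701"),
    ("Doe", "John", "1 Main St", "Springfield", "IL", "62701"))
add("Patrons",
    ("Doe", "Jane", "1 Main St", "Springfield", "IL", "62701"),
    ("Roe", "Rick", "5 Elm Rd", "Peoria", "IL", "61602"))

match = " AND ".join(f"a.{c} = b.{c}" for c in COLS)  # assumption: all six columns
a_cols = ", ".join(f"a.{c}" for c in COLS)
b_cols = ", ".join(f"b.{c}" for c in COLS)

sql = f"""
SELECT {a_cols}, 'Both' AS Origin FROM Clients a INNER JOIN Patrons b ON {match}
UNION ALL
SELECT {a_cols}, 'Client' FROM Clients a LEFT JOIN Patrons b ON {match}
WHERE b.ID IS NULL
UNION ALL
SELECT {b_cols}, 'Patron' FROM Patrons b LEFT JOIN Clients a ON {match}
WHERE a.ID IS NULL
"""
rows = conn.execute(sql).fetchall()

# Map (LastName, FirstName) to the label in the last column.
origins = {(r[0], r[1]): r[6] for r in rows}
print(origins)
```

Keep in mind that if either table contains exact repeats of its own rows, the INNER JOIN part returns one row per matching pair, so the result can still contain repeats.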

+2

I'm not sure a fully automated solution is worth building: you can hardly write code that reliably recognizes Doe, Jane, 1234 Sunset Boulevard and Doe, Jane, 1234 Sunset Bd as one and the same person, even though they really are the same person!

If I were you, I would build a semi-automatic solution in 4 steps:

  • Combine both tables into one table and add a boolean field isDuplicate
  • Use a query to display all similar names, and flag the duplicates that should be deleted.
  • Use a query to display all similar (as similar as possible) addresses, and flag the duplicates that should be deleted.
  • Delete all records where 'isDuplicate' is set to True

Of course, this method is only worthwhile if the number of duplicate names / addresses is limited! I assume your filtering will leave you with a few hundred records to review. How long will that take? An hour or two? I think it's worth it! With a fully automated process, you can never be sure that all duplicates were eliminated, nor that no legitimate client was deleted. Doing the job this way, you can be confident in your result.
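
The review steps above could be supported by a small script that only proposes candidates and leaves the decision to a human. The sketch below is one possible approach, not part of the answer's method: it uses Python's standard difflib.SequenceMatcher to score address similarity within groups of identical names. The record layout, the candidate_pairs helper and the 0.7 threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Return a 0..1 similarity score for two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Made-up rows from the combined table: (id, LastName, FirstName, Address).
records = [
    (1, "Doe", "Jane", "1234 Sunset Boulevard"),
    (2, "Doe", "Jane", "1234 Sunset Bd"),
    (3, "Doe", "Jane", "77 Harbor Road"),
    (4, "Roe", "Rick", "1234 Sunset Boulevard"),
]

def candidate_pairs(records, threshold=0.7):
    """List id pairs with the same name and similar, non-identical addresses."""
    groups = defaultdict(list)
    for rec in records:
        # Only compare within one name, which keeps the pair count manageable.
        groups[(rec[1].lower(), rec[2].lower())].append(rec)
    pairs = []
    for group in groups.values():
        for left, right in combinations(group, 2):
            if left[3] != right[3] and similarity(left[3], right[3]) >= threshold:
                pairs.append((left[0], right[0]))
    return pairs

# Pairs to show a reviewer, who then decides which row gets isDuplicate = True.
print(candidate_pairs(records))
```

The threshold would need tuning on real data: set too low it floods the reviewer, set too high it misses abbreviations.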

0

I was looking for a better way to do this, but I was surprised that the answers here look "difficult". Lacking an easy way to do this automatically, you can fall back on the simple built-in Access features.

Use the Query Wizard to create a Find Unmatched query. This produces a list of the records that exist in one table but not in both (you specify which during the wizard). You can then append these records to the other table or create a new table, as you wish.

I don't know of a way to merge the records in this same step, as that is much more complicated.
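
For reference, an unmatched query boils down to a LEFT JOIN that keeps only the rows with no partner. The Python sketch below shows that pattern followed by the append step, using the standard-library sqlite3 module in place of Access. The sample rows and the Combined table name are made up, and matching on all six columns is an assumption. It also assumes LastName is never empty in Clients, since the IS NULL test relies on that.

```python
import sqlite3

COLS = ["LastName", "FirstName", "Address", "City", "State", "Zip"]
col_list = ", ".join(COLS)
conn = sqlite3.connect(":memory:")
for table in ("Clients", "Patrons", "Combined"):  # Combined is a hypothetical target
    conn.execute(f"CREATE TABLE {table} ({col_list})")

conn.executemany("INSERT INTO Clients VALUES (?, ?, ?, ?, ?, ?)", [
    ("Doe", "Jane", "1 Main St", "Springfield", "IL", "62701"),
    ("Doe", "John", "1 Main St", "Springfield", "IL", "62701"),
])
conn.executemany("INSERT INTO Patrons VALUES (?, ?, ?, ?, ?, ?)", [
    ("Doe", "Jane", "1 Main St", "Springfield", "IL", "62701"),
    ("Roe", "Rick", "5 Elm Rd", "Peoria", "IL", "61602"),
])

# Patrons with no identical row in Clients: the "unmatched" pattern.
match = " AND ".join(f"Patrons.{c} = Clients.{c}" for c in COLS)
unmatched_sql = (f"SELECT Patrons.* FROM Patrons LEFT JOIN Clients ON {match} "
                 "WHERE Clients.LastName IS NULL")
unmatched = conn.execute(unmatched_sql).fetchall()

# Append step: all clients first, then only the unmatched patrons.
conn.execute("INSERT INTO Combined SELECT * FROM Clients")
conn.execute("INSERT INTO Combined " + unmatched_sql)
combined_count = conn.execute("SELECT COUNT(*) FROM Combined").fetchone()[0]

print(unmatched, combined_count)
```

This only removes records repeated across the two tables; repeats inside Clients or inside Patrons are carried over unchanged.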

0

Source: https://habr.com/ru/post/898406/

