Joining poorly designed SQL tables?

Question

I tried to find information about joining tables without foreign keys, but everything I found assumes you can create a foreign key. I cannot modify these tables, and I must report on data that is already being produced. The following sample data illustrates the problem.

    Table A
    Journal   Account   Debit    Credit   Sequence
    --------------------------------------------------
    87041     150-00    100.00   0.00     16384
    87041     150-10    0.00     100.00   32768
    87041     150-00    50.0     0.0      49152
    87041     210-90    0.0      50.0     65536

The second table tracks additional bits of information and is in many respects identical, but it does not have a sequence number that would correctly line its rows up with Table A. It has its own sequence number, which is unrelated to the one in Table A.

    Table B
    Journal   Account   Label     Artist     Sequence
    --------------------------------------------------
    87041     150-00    Label02   Artist12   1
    87041     150-10    Label09   Artist03   2
    87041     150-00    Label04   Artist01   3
    87041     210-90    Label01   Artist05   4

So far, the best I can come up with is joining on Journal and Account, but that duplicates the entries. I got close by playing with grouping and MAX() on the sequence number, but not all duplicates are removed for journal entries with a very large number of rows, and lines that share the same account always show the first match from the second table.

    Closest - but bad - result
    Journal   Account   Debit    Credit   Sequence   Label     Artist
    ----------------------------------------------------------------------
    87041     150-00    100.00   0.00     16384      Label02   Artist12
    87041     150-10    0.00     100.00   32768      Label09   Artist03
    87041     150-00    50.0     0.0      49152      Label02   Artist12   <-- wrong
    87041     210-90    0.0      50.0     65536      Label01   Artist05

How can I join the tables so that duplicates are excluded and the correct label and artist are displayed? It seems to me that I need a query that knows one of the records from Table B has already been used when record 49152 from Table A is looking for a match.

EDIT:

@Justin Crabtree A.Sequence reflects the order in which the lines were entered. Thus, the user could have entered the last line in the example first, then the first line, then the third and, finally, the second.

@Edper Microsoft SQL Server ... hmm, today I can't remote in to the client machine, otherwise I would provide the version.

@Abe Miessler yes you are right.

As soon as I can get back on the server, I will try your suggestion, @pkuderov.

+4

sql-server tsql

Codenamecain Jun 28 '13 at 10:55

4 answers

Try this

    ;WITH a AS (
        SELECT Journal, Account, Debit, Credit, Sequence,
               Id = ROW_NUMBER() OVER(PARTITION BY Journal ORDER BY Sequence)
        FROM dbo.tablea
    )
    , b AS (
        SELECT Journal, Account, Label, Artist,
               Id = ROW_NUMBER() OVER(PARTITION BY Journal ORDER BY Sequence)
        FROM dbo.tableb
    )
    SELECT a.Journal, a.Account, a.Debit, a.Credit, a.Sequence, b.Label, b.Artist
    FROM a
    JOIN b
      ON b.Journal = a.Journal
     AND b.Account = a.Account
     AND b.Id = a.Id

+4

Ti Jun 28 '13 at 23:29

If you ordered the rows of the two tables by their own sequence numbers, would the rows line up in the same order?

If so, this is a possible solution for SQL Server: you can create two CTEs, one for each table, each with a ROW_NUMBER column, so that both tables end up with a matching column you can use for the join. Let me know if you need an example.
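One way to answer that ordering question before relying on row numbers is a quick mismatch check. This is only a sketch: the table names TableA and TableB and the Pos column are assumptions for illustration, and it numbers rows per Journal by each table's own Sequence.

```sql
-- Sketch only: TableA, TableB and Pos are assumed names, not taken from the real schema.
WITH a AS (
    SELECT Journal, Account,
           ROW_NUMBER() OVER (PARTITION BY Journal ORDER BY Sequence) AS Pos
    FROM TableA
),
b AS (
    SELECT Journal, Account,
           ROW_NUMBER() OVER (PARTITION BY Journal ORDER BY Sequence) AS Pos
    FROM TableB
)
-- List every position where the two tables disagree on the account,
-- or where one table has a row at that position and the other does not.
SELECT COALESCE(a.Journal, b.Journal) AS Journal,
       COALESCE(a.Pos, b.Pos)         AS Pos,
       a.Account                      AS AccountA,
       b.Account                      AS AccountB
FROM a
FULL OUTER JOIN b
  ON b.Journal = a.Journal
 AND b.Pos = a.Pos
WHERE a.Account IS NULL
   OR b.Account IS NULL
   OR a.Account <> b.Account;
```

An empty result would suggest the two tables line up position by position within each journal. Any rows returned show where they diverge, in which case numbering per Journal and Account may be the safer choice.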

0

Jkan Jun 28 '13 at 23:28

If I read your requirements correctly, and you want all the rows from table A but only the first matching row from table B, the best option would be OUTER APPLY with TOP (1). It would look something like this:

    select *
    from TableA
    OUTER APPLY (
        select TOP(1) Journal, Account, Label, Artist, Sequence
        FROM TableB
        WHERE Journal = TableA.Journal
          AND Account = TableA.Account
        ORDER BY Sequence
    ) as B

(Definitely pseudo-code, but this should be somewhat close.)

If it comes to it, you can use ROW_NUMBER() partitioned by Journal and Account, and then match the row-number values across the two result sets. You would create one subquery / CTE for TableA and another for TableB, each with a RowNumber value, which is essentially a new integer key. The first row in TableA will correspond to the first row in TableB, the second row in TableA to the second in TableB, and so on. Of course, you run into some problems if a Journal / Account group has more "A" rows than "B" rows.
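For the case where a Journal / Account group has more "A" rows than "B" rows, one possible variation is to keep the row-number matching but use a LEFT JOIN, so the extra "A" rows still appear with NULL in the label and artist columns. This is a sketch under the assumption that the tables are called TableA and TableB; RowNum is a made-up column name.

```sql
-- Sketch only: TableA, TableB and RowNum are assumed names.
SELECT a.Journal, a.Account, a.Debit, a.Credit, a.Sequence,
       b.Label, b.Artist
FROM (
    -- Number each TableA row within its Journal / Account group.
    SELECT Journal, Account, Debit, Credit, Sequence,
           ROW_NUMBER() OVER (PARTITION BY Journal, Account
                              ORDER BY Sequence) AS RowNum
    FROM TableA
) AS a
LEFT JOIN (
    -- Number each TableB row the same way, using TableB's own Sequence.
    SELECT Journal, Account, Label, Artist,
           ROW_NUMBER() OVER (PARTITION BY Journal, Account
                              ORDER BY Sequence) AS RowNum
    FROM TableB
) AS b
  ON b.Journal = a.Journal
 AND b.Account = a.Account
 AND b.RowNum  = a.RowNum
ORDER BY a.Journal, a.Sequence;
```

Extra "B" rows would still be dropped by this version. A FULL OUTER JOIN would surface those as well, if that matters for the report.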

A better question might be: "How does your code determine the correspondence between TableA and TableB if it cannot use data columns to link them together?"

0

Peter Schott Jun 29 '13 at 1:31

pkuderov · Accepted Answer · 2013-06-28T23:25:26+0000

Hi, this is just an idea:

    select a.Journal, a.Account, a.Debit, a.Credit, a.Sequence, b.Label, b.Artist
    from (
        select *,
               row_number() over(partition by Journal, Account order by Sequence) as idInGroup
        from a
    ) as a
    join (
        select *,
               row_number() over(partition by Journal, Account order by Sequence) as idInGroup
        from b
    ) as b
      on a.Journal = b.Journal
     and a.Account = b.Account
     and a.idInGroup = b.idInGroup

Here, I assume that the rows appear in the same relative order, by Sequence, in both tables, and that this ordering is the basis for joining them.
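To try the idea without touching the real tables, here is a sketch that runs the same row-number join against the sample rows from the question. The inline VALUES lists, the CTE names a and b, and the rn column are assumptions made for this demo; the real tables will have their own names and types.

```sql
-- Demo only: sample rows copied from the question into inline CTEs.
WITH a AS (
    SELECT *
    FROM (VALUES
        (87041, '150-00', 100.00,   0.00, 16384),
        (87041, '150-10',   0.00, 100.00, 32768),
        (87041, '150-00',  50.00,   0.00, 49152),
        (87041, '210-90',   0.00,  50.00, 65536)
    ) AS v (Journal, Account, Debit, Credit, Sequence)
),
b AS (
    SELECT *
    FROM (VALUES
        (87041, '150-00', 'Label02', 'Artist12', 1),
        (87041, '150-10', 'Label09', 'Artist03', 2),
        (87041, '150-00', 'Label04', 'Artist01', 3),
        (87041, '210-90', 'Label01', 'Artist05', 4)
    ) AS v (Journal, Account, Label, Artist, Sequence)
),
a_numbered AS (
    -- Position of each Table A row within its Journal / Account group.
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Journal, Account
                                 ORDER BY Sequence) AS rn
    FROM a
),
b_numbered AS (
    -- Same numbering for Table B, based on its own Sequence.
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Journal, Account
                                 ORDER BY Sequence) AS rn
    FROM b
)
SELECT an.Journal, an.Account, an.Debit, an.Credit, an.Sequence,
       bn.Label, bn.Artist
FROM a_numbered AS an
JOIN b_numbered AS bn
  ON bn.Journal = an.Journal
 AND bn.Account = an.Account
 AND bn.rn = an.rn
ORDER BY an.Sequence;
```

With these four sample rows, the row with Sequence 49152 pairs with Label04 / Artist01 instead of repeating Label02 / Artist12, because it is the second 150-00 row in both tables.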
