Multiple left joins - keeping the number of returned rows down?

I'm trying to figure out how best to query a schema consisting of one central table, plus a lot of attribute tables (sorry, not sure of the best terminology here) that record one-to-many relationships. In the business layer, each of these tables corresponds to a collection, which may contain zero or more elements.

Currently, the code I'm looking at retrieves the data by fetching a list of values from the main table, then iterating over it and querying each of the "auxiliary" tables to populate these collections.

I would like to bring this down to a single query, if possible. I tried using multiple LEFT JOINs, but this effectively joins onto the cross product of the values in the auxiliary tables, which leads to a row explosion, especially as you add more joins. There are five such relationships in the table under consideration, so the number of rows returned for each record is potentially huge and almost entirely composed of redundant data.
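To put rough numbers on that blow-up, here is a small Python sketch. The counts are taken from the sample data in this question; the names `child_counts`, `joined_rows` and `aligned_rows` are purely illustrative. It compares how many rows chained LEFT JOINs return per container with how many would be needed if the child lists were simply laid side by side.

```python
from math import prod

# Number of child rows per container in each one-to-many table
# (counts taken from the sample data in this question).
child_counts = {
    "box": {"animals": 0, "foods": 0},
    "sack": {"animals": 1, "foods": 1},
    "bucket": {"animals": 3, "foods": 4},
}

def joined_rows(counts):
    """Rows from chained LEFT JOINs: an empty child table still yields one row."""
    return prod(max(n, 1) for n in counts.values())

def aligned_rows(counts):
    """Rows needed if the child lists are laid side by side and padded with NULL."""
    return max(max(counts.values()), 1)

for name, counts in child_counts.items():
    print(name, joined_rows(counts), aligned_rows(counts))
```

For the bucket this gives 12 joined rows against 4 aligned ones. With five relationships of, say, 10 values each, the same arithmetic gives 10^5 = 100,000 joined rows per record against 10 aligned rows.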

Here's a small synthetic example of some tables, data, the query structure I use, and the results:

Database structure and data:

    create table Containers (
        Id int not null primary key,
        Name nvarchar(8) not null
    );

    create table Containers_Animals (
        Container int not null references Containers(Id),
        Animal nvarchar(8) not null,
        primary key (Container, Animal)
    );

    create table Containers_Foods (
        Container int not null references Containers(Id),
        Food nvarchar(8) not null,
        primary key (Container, Food)
    );

    insert into Containers (Id, Name) values
        (0, 'box'), (1, 'sack'), (2, 'bucket');

    insert into Containers_Animals (Container, Animal) values
        (1, 'monkey'), (2, 'dog'), (2, 'whale'), (2, 'lemur');

    insert into Containers_Foods (Container, Food) values
        (1, 'lime'), (2, 'bread'), (2, 'chips'), (2, 'apple'), (2, 'grape');

In combination with a business object like this:

    class Container {
        public string Name;
        public string[] Animals; // may be empty
        public string[] Foods;   // may be empty
    }

And here is how I build a query against it:

    select c.Name container, a.Animal animal, f.Food food
    from Containers c
    left join Containers_Animals a on a.Container = c.Id
    left join Containers_Foods f on f.Container = c.Id;

Which gives these results:

    container animal   food
    --------- -------- --------
    box       NULL     NULL
    sack      monkey   lime
    bucket    dog      apple
    bucket    dog      bread
    bucket    dog      chips
    bucket    dog      grape
    bucket    lemur    apple
    bucket    lemur    bread
    bucket    lemur    chips
    bucket    lemur    grape
    bucket    whale    apple
    bucket    whale    bread
    bucket    whale    chips
    bucket    whale    grape

Instead, I would like to see, for each record in the root table, a number of rows equal to the maximum number of values associated with it in any one of the relationships, with the empty slots filled with NULL. This would keep the number of rows returned way, way down, while still converting easily to objects. Something like this:

    container animal   food
    --------- -------- --------
    box       NULL     NULL
    sack      monkey   lime
    bucket    dog      apple
    bucket    lemur    bread
    bucket    whale    chips
    bucket    NULL     grape

Can this be done?
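For what it's worth, folding rows of that padded shape back into objects could look roughly like this in Python. This is only a sketch: the `Container` dataclass mirrors the business object above, `build_containers` is a made-up helper name, and `None` stands in for NULL.

```python
from dataclasses import dataclass, field

@dataclass
class Container:
    name: str
    animals: list = field(default_factory=list)  # may be empty
    foods: list = field(default_factory=list)    # may be empty

def build_containers(rows):
    """Fold (container, animal, food) rows into Container objects, skipping NULLs."""
    containers = {}  # keyed by name here; real code would more likely key by Id
    for name, animal, food in rows:
        c = containers.setdefault(name, Container(name))
        if animal is not None:
            c.animals.append(animal)
        if food is not None:
            c.foods.append(food)
    return list(containers.values())

rows = [
    ("box", None, None),
    ("sack", "monkey", "lime"),
    ("bucket", "dog", "apple"),
    ("bucket", "lemur", "bread"),
    ("bucket", "whale", "chips"),
    ("bucket", None, "grape"),
]
for c in build_containers(rows):
    print(c)
```

Because every value appears exactly once, no de-duplication is needed on the client, which is the main attraction over the cross-product result.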

+4
3 answers

Why not just return the two datasets ordered by container, and then merge them on the client? What you ask will make the database engine do a lot more work, with a much more complex query, for (to my mind) little benefit.

It would look something like this. Use two left joins to ensure that each dataset has at least one row for every container name, then step through them in parallel. Here is some rough pseudocode:

    Dim CurrentContainer
    If Not Animals.Eof Then
        CurrentContainer = Animals.Container
    End If
    Do While Not Animals.Eof Or Not Foods.Eof
        Row = New Couplet(AnimalType, FoodType)
        If Not Animals.Eof AndAlso Animals.Container = CurrentContainer Then
            Row.AnimalType = Animals.Animal
            Animals.MoveNext
        End If
        If Not Foods.Eof AndAlso Foods.Container = CurrentContainer Then
            Row.FoodType = Foods.Food
            Foods.MoveNext
        End If
        'Process the row, output it, put it in a stack, build a new recordset, whatever.
        'Move on only once both recordsets have left the current container.
        If (Animals.Eof OrElse Animals.Container <> CurrentContainer) _
           AndAlso (Foods.Eof OrElse Foods.Container <> CurrentContainer) Then
            CurrentContainer = [Container from either non-Eof recordset]
        End If
    Loop
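The same client-side merge can be sketched in runnable form. The Python below is an illustration under assumptions, not a drop-in implementation: `merge_by_container` is a hypothetical name, each result set is modelled as a list of `(container, value)` tuples, and both are assumed to come from a LEFT JOIN off Containers with the same ordering, so every container appears at least once in each.

```python
from itertools import groupby, zip_longest
from operator import itemgetter

def merge_by_container(animals, foods):
    """Yield (container, animal, food) rows from two result sets.

    Assumes both inputs are ordered the same way by container and that each
    has at least one row per container (value None when there is no match).
    """
    animal_groups = groupby(animals, key=itemgetter(0))
    food_groups = groupby(foods, key=itemgetter(0))
    for (container, a_rows), (_, f_rows) in zip(animal_groups, food_groups):
        a_values = [value for _, value in a_rows]
        f_values = [value for _, value in f_rows]
        # Pair values by position; the shorter list is padded with None.
        for animal, food in zip_longest(a_values, f_values):
            yield container, animal, food

animals = [("box", None), ("sack", "monkey"),
           ("bucket", "dog"), ("bucket", "lemur"), ("bucket", "whale")]
foods = [("box", None), ("sack", "lime"),
         ("bucket", "apple"), ("bucket", "bread"),
         ("bucket", "chips"), ("bucket", "grape")]

for row in merge_by_container(animals, foods):
    print(row)
```

On the sample data this yields six rows, one per position within each container, with `None` where one list runs out before the other.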

However, of course, what you ask is possible! Here are two ways.

  • Number the rows of each input separately and join them on position:

      WITH CA AS (
          SELECT *,
              Row_Number() OVER (PARTITION BY Container ORDER BY Animal) Pos
          FROM Containers_Animals
      ), CF AS (
          SELECT *,
              Row_Number() OVER (PARTITION BY Container ORDER BY Food) Pos
          FROM Containers_Foods
      )
      SELECT C.Name, CA.Animal, CF.Food
      FROM Containers C
      LEFT JOIN (
          SELECT Container, Pos FROM CA
          UNION
          SELECT Container, Pos FROM CF
      ) P ON C.Id = P.Container
      LEFT JOIN CA ON C.Id = CA.Container AND P.Pos = CA.Pos
      LEFT JOIN CF ON C.Id = CF.Container AND P.Pos = CF.Pos;
  • Stack the inputs vertically and pivot them:

      WITH FoodAnimals AS (
          SELECT C.Name, 1 Which, CA.Animal Item,
              Row_Number() OVER (PARTITION BY C.Id ORDER BY (CA.Animal)) Pos
          FROM Containers C
          LEFT JOIN Containers_Animals CA ON C.Id = CA.Container
          UNION
          SELECT C.Name, 2 Which, CF.Food,
              Row_Number() OVER (PARTITION BY C.Id ORDER BY (CF.Food)) Pos
          FROM Containers C
          LEFT JOIN Containers_Foods CF ON C.Id = CF.Container
      )
      SELECT P.Name, P.[1] Animal, P.[2] Food
      FROM FoodAnimals FA
      PIVOT (Max(Item) FOR Which IN ([1], [2])) P;
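As a quick way to try the join-on-position idea without a SQL Server instance, here is a hedged Python sketch that runs an equivalent of the first query against an in-memory SQLite database. The assumptions: SQLite 3.25 or later (for window functions), SQLite column types in place of `nvarchar`, explicit `AS` aliases, and an added `ORDER BY` so the output order is deterministic. It is a port for experimentation, not the T-SQL itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Containers (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Containers_Animals (Container INTEGER NOT NULL, Animal TEXT NOT NULL);
    CREATE TABLE Containers_Foods (Container INTEGER NOT NULL, Food TEXT NOT NULL);
    INSERT INTO Containers VALUES (0, 'box'), (1, 'sack'), (2, 'bucket');
    INSERT INTO Containers_Animals VALUES
        (1, 'monkey'), (2, 'dog'), (2, 'whale'), (2, 'lemur');
    INSERT INTO Containers_Foods VALUES
        (1, 'lime'), (2, 'bread'), (2, 'chips'), (2, 'apple'), (2, 'grape');
""")

# Number each child table's rows per container, then join on that position.
query = """
    WITH CA AS (
        SELECT Container, Animal,
               ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Animal) AS Pos
        FROM Containers_Animals
    ), CF AS (
        SELECT Container, Food,
               ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Food) AS Pos
        FROM Containers_Foods
    )
    SELECT C.Name, CA.Animal, CF.Food
    FROM Containers C
    LEFT JOIN (
        SELECT Container, Pos FROM CA
        UNION
        SELECT Container, Pos FROM CF
    ) P ON C.Id = P.Container
    LEFT JOIN CA ON C.Id = CA.Container AND P.Pos = CA.Pos
    LEFT JOIN CF ON C.Id = CF.Container AND P.Pos = CF.Pos
    ORDER BY C.Id, P.Pos
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The inner `UNION` is what produces one row per position that exists in either child table, so the longer list determines how many rows each container gets.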
+4
    ; with a as (
        select ID, c.Name container, a.Animal animal
             , r = row_number() over (partition by c.ID order by a.Animal)
        from Containers c
        left join Containers_Animals a on a.Container = c.Id
    )
    , b as (
        select ID, c.Name container, f.Food food
             , r = row_number() over (partition by c.ID order by f.Food)
        from Containers c
        left join Containers_Foods f on f.Container = c.Id
    )
    select a.container, a.animal, b.food
    from a
    left join b on a.container = b.container and a.r = b.r
    union
    select b.container, a.animal, b.food
    from b
    left join a on a.container = b.container and a.r = b.r
0
    WITH ca_ranked AS (
        SELECT *,
            rnk = ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Animal)
        FROM Containers_Animals
    ), cf_ranked AS (
        SELECT *,
            rnk = ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Food)
        FROM Containers_Foods
    )
    SELECT container = c.Name, animal = ca.Animal, food = cf.Food
    FROM ca_ranked ca
    FULL JOIN cf_ranked cf ON ca.Container = cf.Container AND ca.rnk = cf.rnk
    RIGHT JOIN Containers c ON c.Id = COALESCE(ca.Container, cf.Container)
    ;
0

Source: https://habr.com/ru/post/1397925/

