How to join the newest rows from a table?

I often encounter problems of this form and have not yet found a good solution:

Suppose we have two database tables representing an electronic commerce system.

 userData (userId, name, ...)
 orderData (orderId, userId, orderType, createDate, ...)

For all users of the system, select their user information, their latest order with type = '1', and their latest order with type = '2'. I want to do this in a single query. Here is an example of the desired result:

 (userId, name, ..., orderId1, orderType1, createDate1, ..., orderId2, orderType2, createDate2, ...)
 (101, 'Bob', ..., 472, '1', '4/25/2008', ..., 382, '2', '3/2/2008', ...)
+4
11 answers

This should work; you will need to adjust the table and column names:

 select ud.name,
        order1.order_id, order1.order_type, order1.create_date,
        order2.order_id, order2.order_type, order2.create_date
 from user_data ud, order_data order1, order_data order2
 where ud.user_id = order1.user_id
   and ud.user_id = order2.user_id
   and order1.order_id = (select max(order_id)
                          from order_data od1
                          where od1.user_id = ud.user_id
                            and od1.order_type = 'Type1')
   and order2.order_id = (select max(order_id)
                          from order_data od2
                          where od2.user_id = ud.user_id
                            and od2.order_type = 'Type2')

Denormalizing your data might also be a good idea, because this type of query will be quite expensive. For example, you could add a last_order_date column to your user data.
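One way the denormalization could look is sketched below. This is only an illustration run against an in-memory SQLite database from Python: the trigger name `order_data_after_insert` and the exact column layout are assumptions, not part of the answer, and other databases use different trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_data (
        user_id         INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        last_order_date TEXT            -- denormalized copy, NULL until the first order
    );
    CREATE TABLE order_data (
        order_id    INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL,
        order_type  TEXT NOT NULL,
        create_date TEXT NOT NULL       -- ISO dates, so string comparison matches date order
    );
    -- Hypothetical trigger: keep last_order_date in step with newly inserted orders.
    CREATE TRIGGER order_data_after_insert AFTER INSERT ON order_data
    BEGIN
        UPDATE user_data
        SET last_order_date = NEW.create_date
        WHERE user_id = NEW.user_id
          AND (last_order_date IS NULL OR last_order_date < NEW.create_date);
    END;
    INSERT INTO user_data (user_id, name) VALUES (101, 'Bob');
    INSERT INTO order_data VALUES (382, 101, 'Type2', '2008-03-02');
    INSERT INTO order_data VALUES (472, 101, 'Type1', '2008-04-25');
    INSERT INTO order_data VALUES (473, 101, 'Type1', '2008-01-15');
""")

# Reading the latest order date no longer needs a join or a subquery.
last_order_date = conn.execute(
    "SELECT last_order_date FROM user_data WHERE user_id = 101"
).fetchone()[0]
print(last_order_date)  # 2008-04-25
```

The trade-off is the usual one for denormalized data: reads get cheaper, but every write path (including updates and deletes, which this sketch does not cover) has to keep the copy consistent.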

+4

I have provided three different approaches to solving this problem:

  • Using Pivots
  • Using case statements
  • Using inline queries in the where clause

All solutions assume that we determine the “most recent” order based on the orderId column. Using the createDate column would add complexity due to timestamp collisions and would seriously hinder performance, since createDate is probably not part of the indexed key. I have only tested these queries on MS SQL Server 2005, so I have no idea whether they will work on your server.
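The timestamp-collision concern can be illustrated with a small sketch. It uses an in-memory SQLite database from Python rather than SQL Server, and the three sample rows are made up for the demonstration: two orders share a createDate, so "latest by date" is ambiguous while "latest by orderId" is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderData (
        orderId    INTEGER PRIMARY KEY,
        userId     INTEGER NOT NULL,
        orderType  INTEGER NOT NULL,
        createDate TEXT    NOT NULL
    );
    -- Orders 2 and 3 share the same createDate (a timestamp collision).
    INSERT INTO orderData VALUES
        (1, 101, 1, '2008-04-24'),
        (2, 101, 1, '2008-04-25'),
        (3, 101, 1, '2008-04-25');
""")

# "Latest" by createDate: both colliding rows qualify.
by_date = conn.execute("""
    SELECT orderId FROM orderData o
    WHERE createDate = (SELECT MAX(createDate) FROM orderData
                        WHERE userId = o.userId AND orderType = o.orderType)
    ORDER BY orderId
""").fetchall()

# "Latest" by orderId: the key is unique, so exactly one row qualifies.
by_id = conn.execute("""
    SELECT orderId FROM orderData o
    WHERE orderId = (SELECT MAX(orderId) FROM orderData
                     WHERE userId = o.userId AND orderType = o.orderType)
""").fetchall()

print(by_date)  # [(2,), (3,)]
print(by_id)    # [(3,)]
```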

Solutions (1) and (2) are almost identical. In fact, they both result in the same number of reads from the database.

Solution (3) is not the preferred approach when working with large data sets. It consistently performs hundreds more logical reads than (1) and (2). When filtering for one specific user, approach (3) is comparable to the other methods. In the single-user case, a drop in CPU time helps offset the significantly higher number of reads; however, as the disk becomes busier and cache misses occur, this slight advantage will disappear.

Conclusion

In the presented scenario, use the pivot approach if it is supported by your DBMS. It requires less code than the case statement and makes it easy to add order types in the future.

Note that in some cases PIVOT is not flexible enough, and aggregates built on case expressions (as in approach 2) are the way to go.

Code

Approach (1) using PIVOT:

 select ud.userId, ud.fullname,
        od1.orderId as orderId1, od1.createDate as createDate1, od1.orderType as orderType1,
        od2.orderId as orderId2, od2.createDate as createDate2, od2.orderType as orderType2
 from userData ud
 inner join (
     select userId, [1] as typeOne, [2] as typeTwo
     from (select userId, orderType, orderId from orderData) as orders
     PIVOT (
         max(orderId) FOR orderType in ([1], [2])
     ) as LatestOrders
 ) as LatestOrders on LatestOrders.userId = ud.userId
 inner join orderData od1 on od1.orderId = LatestOrders.typeOne
 inner join orderData od2 on od2.orderId = LatestOrders.typeTwo

Approach (2) using case expressions:

 select ud.userId, ud.fullname,
        od1.orderId as orderId1, od1.createDate as createDate1, od1.orderType as orderType1,
        od2.orderId as orderId2, od2.createDate as createDate2, od2.orderType as orderType2
 from userData ud
 -- assuming not all users will have orders, use outer joins instead
 inner join (
     select od.userId,
            -- can be null if no orders for type
            max(case when orderType = 1 then orderId else null end) as maxTypeOneOrderId,
            -- can be null if no orders for type
            max(case when orderType = 2 then orderId else null end) as maxTypeTwoOrderId
     from orderData od
     group by od.userId
 ) as maxOrderKeys on maxOrderKeys.userId = ud.userId
 inner join orderData od1 on od1.orderId = maxOrderKeys.maxTypeOneOrderId
 inner join orderData od2 on od2.orderId = maxOrderKeys.maxTypeTwoOrderId

Approach (3) using inline queries in the where clause (based on Steve K.'s answer):

 select ud.userId, ud.fullname,
        order1.orderId, order1.orderType, order1.createDate,
        order2.orderId, order2.orderType, order2.createDate
 from userData ud, orderData order1, orderData order2
 where ud.userId = order1.userId
   and ud.userId = order2.userId
   and order1.orderId = (select max(orderId)
                         from orderData od1
                         where od1.userId = ud.userId
                           and od1.orderType = 1)
   and order2.orderId = (select max(orderId)
                         from orderData od2
                         where od2.userId = ud.userId
                           and od2.orderType = 2)

Script for creating the tables and 1000 users with 100 orders each:

 CREATE TABLE [dbo].[orderData](
     [orderId] [int] IDENTITY(1,1) NOT NULL,
     [createDate] [datetime] NOT NULL,
     [orderType] [tinyint] NOT NULL,
     [userId] [int] NOT NULL
 )
 CREATE TABLE [dbo].[userData](
     [userId] [int] IDENTITY(1,1) NOT NULL,
     [fullname] [nvarchar](50) NOT NULL
 )
 -- Create 1000 users with 100 orders each
 declare @userId int
 declare @usersAdded int
 set @usersAdded = 0
 while @usersAdded < 1000
 begin
     insert into userData (fullname) values ('Mario' + ltrim(str(@usersAdded)))
     set @userId = @@identity
     declare @orderSetsAdded int
     set @orderSetsAdded = 0
     -- 10 sets of 10 orders (5 of type 1, 5 of type 2) per user
     while @orderSetsAdded < 10
     begin
         insert into orderData (userId, createDate, orderType) values (@userId, '01-06-08', 1)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-02-08', 1)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-08-08', 1)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-09-08', 1)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-01-08', 1)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-06-06', 2)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-02-02', 2)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-08-09', 2)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-09-01', 2)
         insert into orderData (userId, createDate, orderType) values (@userId, '01-01-04', 2)
         set @orderSetsAdded = @orderSetsAdded + 1
     end
     set @usersAdded = @usersAdded + 1
 end

A small snippet for testing query performance on MS SQL Server, in addition to SQL Profiler:

 -- Uncomment these to clear some caches
 --DBCC DROPCLEANBUFFERS
 --DBCC FREEPROCCACHE
 set statistics io on
 set statistics time on
 -- INSERT TEST QUERY HERE
 set statistics time off
 set statistics io off
+3

Sorry, I don't have Oracle in front of me, but this is the basic structure of what I would do in Oracle:

 SELECT b.userid, b.orderid, b.orderType, b.createDate, <etc>, a.name
 FROM orderData b, userData a
 WHERE a.userid = b.userid
   AND (b.userid, b.orderType, b.createDate) IN (
       SELECT userid, orderType, max(createDate)
       FROM orderData
       WHERE orderType IN (1, 2)
       GROUP BY userid, orderType)
+1

T-SQL Solution Example (MS SQL):

 SELECT u.*, o1.*, o2.*
 FROM (
     SELECT userData.*,
            -- id of the most recent type 1 order for this user
            (SELECT TOP 1 orderId FROM orderData
             WHERE orderData.userId = userData.userId AND orderType = 1
             ORDER BY createDate DESC) AS order1Id,
            -- id of the most recent type 2 order for this user
            (SELECT TOP 1 orderId FROM orderData
             WHERE orderData.userId = userData.userId AND orderType = 2
             ORDER BY createDate DESC) AS order2Id
     FROM userData
 ) AS u
 LEFT JOIN orderData o1 ON (u.order1Id = o1.orderId)
 LEFT JOIN orderData o2 ON (u.order2Id = o2.orderId)

In SQL Server 2005, you can also use the RANK() OVER function. (But AFAIK it is a purely MSSQL-specific feature.)
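A ranking-function variant might look like the sketch below. It is not the answerer's code: it uses ROW_NUMBER() rather than RANK() so that ties cannot produce two "first" rows, and it runs against an in-memory SQLite database from Python, which assumes the bundled SQLite is version 3.25 or later (the first release with window functions). The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userData (userId INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orderData (
        orderId    INTEGER PRIMARY KEY,
        userId     INTEGER NOT NULL,
        orderType  INTEGER NOT NULL,
        createDate TEXT    NOT NULL
    );
    INSERT INTO userData VALUES (101, 'Bob'), (102, 'Ann');
    INSERT INTO orderData VALUES
        (382, 101, 2, '2008-03-02'),
        (400, 101, 1, '2008-04-01'),
        (472, 101, 1, '2008-04-25'),
        (480, 102, 1, '2008-04-26');
""")

ranked_rows = conn.execute("""
    WITH ranked AS (
        -- rn = 1 marks the newest order per (user, type); orderId breaks date ties
        SELECT orderId, userId, orderType, createDate,
               ROW_NUMBER() OVER (PARTITION BY userId, orderType
                                  ORDER BY createDate DESC, orderId DESC) AS rn
        FROM orderData
    )
    SELECT u.userId, u.name,
           o1.orderId, o1.createDate,
           o2.orderId, o2.createDate
    FROM userData u
    LEFT JOIN ranked o1 ON o1.userId = u.userId AND o1.orderType = 1 AND o1.rn = 1
    LEFT JOIN ranked o2 ON o2.userId = u.userId AND o2.orderType = 2 AND o2.rn = 1
    ORDER BY u.userId
""").fetchall()

for row in ranked_rows:
    print(row)
```

Users without an order of a given type (Ann has no type 2 order here) still appear, with NULLs in the corresponding columns, because of the left joins.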

+1

By "their newest", do you mean everything new on the current day? You can always check createDate and get all user and order data where createDate >= the current day.

 SELECT *
 FROM "orderData", "userData"
 WHERE "userData"."userId" = "orderData"."userId"
   AND "orderData"."createDate" >= current_date;

UPDATED

Here is what you want, following your comment:

 SELECT *
 FROM "orderData", "userData"
 WHERE "userData"."userId" = "orderData"."userId"
   AND "orderData"."orderType" = '1'
   AND "orderData"."orderId" = (
       -- latest type 1 order of the current user
       SELECT o."orderId"
       FROM "orderData" AS o
       WHERE o."orderType" = '1'
         AND o."userId" = "userData"."userId"
       ORDER BY o."orderId" DESC
       LIMIT 1
   )

0

You might be able to use a union query for this. The exact syntax needs some work, especially the selection of the latest row per user, but a union should be able to do it.

For instance:

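 SELECT o.userId, o.orderId, o.orderType, o.createDate
 FROM orderData o
 WHERE o.orderType = 1
   AND o.createDate = (SELECT MAX(createDate) FROM orderData
                       WHERE userId = o.userId AND orderType = 1)
 UNION
 SELECT o.userId, o.orderId, o.orderType, o.createDate
 FROM orderData o
 WHERE o.orderType = 2
   AND o.createDate = (SELECT MAX(createDate) FROM orderData
                       WHERE userId = o.userId AND orderType = 2)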
0

I use things like this in MySQL:

 SELECT u.*,
        SUBSTRING_INDEX(MAX(CONCAT(o1.createDate, '##', o1.otherfield)), '##', -1) AS o1_orderfield,
        SUBSTRING_INDEX(MAX(CONCAT(o2.createDate, '##', o2.otherfield)), '##', -1) AS o2_orderfield
 FROM userData AS u
 LEFT JOIN orderData AS o1 ON (o1.userId = u.userId AND o1.orderType = 1)
 LEFT JOIN orderData AS o2 ON (o2.userId = u.userId AND o2.orderType = 2)
 GROUP BY u.userId

In short, use MAX() to get the newest row by prepending the criteria field (createDate) to the field(s) of interest (otherfield). SUBSTRING_INDEX() then strips off the date.
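The same packing trick can be tried outside MySQL. The sketch below runs against an in-memory SQLite database from Python; SQLite has no SUBSTRING_INDEX(), so substr() and instr() are swapped in, `||` replaces CONCAT(), and the `status` column is a hypothetical stand-in for "otherfield". It relies on createDate being stored in a fixed-width, sortable text format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userData (userId INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orderData (
        orderId    INTEGER PRIMARY KEY,
        userId     INTEGER NOT NULL,
        orderType  INTEGER NOT NULL,
        createDate TEXT    NOT NULL,   -- ISO format, so MAX() on text follows date order
        status     TEXT    NOT NULL    -- hypothetical "field of interest"
    );
    INSERT INTO userData VALUES (101, 'Bob'), (102, 'Ann');
    INSERT INTO orderData VALUES
        (400, 101, 1, '2008-04-01', 'shipped'),
        (472, 101, 1, '2008-04-25', 'pending');
""")

packed_rows = conn.execute("""
    SELECT userId,
           -- drop everything up to and including the '##' separator
           substr(packed, instr(packed, '##') + 2) AS latestStatus
    FROM (
        SELECT u.userId AS userId,
               MAX(o.createDate || '##' || o.status) AS packed
        FROM userData u
        LEFT JOIN orderData o ON o.userId = u.userId AND o.orderType = 1
        GROUP BY u.userId
    )
    ORDER BY userId
""").fetchall()

print(packed_rows)  # [(101, 'pending'), (102, None)]
```

A user with no matching orders yields NULL, because both the concatenation and MAX() propagate the missing value.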

OTOH, if you need an arbitrary number of orders (if orderType can be any number, and not a limited ENUM), it's better to handle this with a separate query, something like this:

 select o.*
 from orderData o
 where o.userId = XXX
   and o.createDate = (select max(createDate) from orderData
                       where userId = o.userId and orderType = o.orderType)
 order by o.orderType

for each user.

0

Assuming that orderId increases monotonically with time:

 SELECT *
 FROM userData u
 INNER JOIN orderData o ON o.userId = u.userId
 INNER JOIN (
     -- This subquery gives the last order of each type for each customer
     SELECT MAX(o2.orderId) AS orderId
         --, o2.userId    -- optional - include if joining for a particular customer
         --, o2.orderType -- optional - include if joining for a particular type
     FROM orderData o2
     GROUP BY o2.userId, o2.orderType
 ) AS LastOrders
     ON LastOrders.orderId = o.orderId -- expand join to include customer or type if desired

Then pivot on the client or, if you are using SQL Server, use its PIVOT functionality.

0

Here is one way to get the type 1 and type 2 data onto the same row
(by placing the type 1 and type 2 information in their own subselects, which are then used in the from clause):

 SELECT a.name, ud1.*, ud2.*
 FROM userData a,
      (SELECT userid, orderid, orderType, createDate, <etc>
       FROM orderData
       WHERE (userid, orderType, createDate) IN (
           SELECT userid, orderType, max(createDate)
           FROM orderData
           WHERE orderType = 1
           GROUP BY userid, orderType)) ud1,
      (SELECT userid, orderid, orderType, createDate, <etc>
       FROM orderData
       WHERE (userid, orderType, createDate) IN (
           SELECT userid, orderType, max(createDate)
           FROM orderData
           WHERE orderType = 2
           GROUP BY userid, orderType)) ud2
 -- tie each subselect back to its user
 WHERE a.userid = ud1.userid
   AND a.userid = ud2.userid
0

This is how I do it. This is standard SQL and works in any database.

 SELECT u.userId, u.name,
        o1.orderId, o1.orderType, o1.createDate,
        o2.orderId, o2.orderType, o2.createDate
 FROM userData AS u
 LEFT OUTER JOIN (
     SELECT o1a.orderId, o1a.userId, o1a.orderType, o1a.createDate
     FROM orderData AS o1a
     LEFT OUTER JOIN orderData AS o1b
         ON (o1a.userId = o1b.userId
             AND o1a.orderType = o1b.orderType
             AND o1a.createDate < o1b.createDate)
     WHERE o1a.orderType = 1 AND o1b.orderId IS NULL
 ) AS o1 ON (u.userId = o1.userId)
 LEFT OUTER JOIN (
     SELECT o2a.orderId, o2a.userId, o2a.orderType, o2a.createDate
     FROM orderData AS o2a
     LEFT OUTER JOIN orderData AS o2b
         ON (o2a.userId = o2b.userId
             AND o2a.orderType = o2b.orderType
             AND o2a.createDate < o2b.createDate)
     WHERE o2a.orderType = 2 AND o2b.orderId IS NULL
 ) o2 ON (u.userId = o2.userId);

Please note that if you have several orders of either type whose dates are equal to the most recent date, you will get several rows in the result set. If this happens for both types, you will get N x M rows in the result set. Therefore, I would recommend that you select the rows of each type in separate queries.
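The duplicate-row effect, and one possible way around it, can be seen in the sketch below. It is an assumption-laden illustration rather than part of the answer: it runs against an in-memory SQLite database from Python, uses made-up rows, and adds orderId as a tie-breaker to the anti-join condition so that only one row survives when dates collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderData (
        orderId    INTEGER PRIMARY KEY,
        userId     INTEGER NOT NULL,
        orderType  INTEGER NOT NULL,
        createDate TEXT    NOT NULL
    );
    -- Orders 10 and 11 are both dated with the most recent date.
    INSERT INTO orderData VALUES
        (10, 101, 1, '2008-04-25'),
        (11, 101, 1, '2008-04-25'),
        (12, 101, 1, '2008-04-20');
""")

# Anti-join: keep row a only if no row b for the same user and type is "newer".
ANTI_JOIN = """
    SELECT a.orderId
    FROM orderData AS a
    LEFT OUTER JOIN orderData AS b
        ON a.userId = b.userId AND a.orderType = b.orderType AND {newer}
    WHERE a.orderType = 1 AND b.orderId IS NULL
    ORDER BY a.orderId
"""

# "Newer" judged by date alone: both tied orders survive.
date_only = conn.execute(
    ANTI_JOIN.format(newer="a.createDate < b.createDate")
).fetchall()

# "Newer" judged by date, then by orderId: exactly one order survives.
with_tiebreak = conn.execute(
    ANTI_JOIN.format(
        newer="(a.createDate < b.createDate"
              " OR (a.createDate = b.createDate AND a.orderId < b.orderId))"
    )
).fetchall()

print(date_only)      # [(10,), (11,)]
print(with_tiebreak)  # [(11,)]
```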

0

Steve K is absolutely right, thanks! I rewrote his answer a bit to account for the fact that there might be no order of a particular type (which I failed to mention, so I can't blame Steve K.).

Here is what I ended up using:

 select ud.name,
        order1.orderId, order1.orderType, order1.createDate,
        order2.orderId, order2.orderType, order2.createDate
 from userData ud
 left join orderData order1
     on order1.orderId = (select max(orderId)
                          from orderData od1
                          where od1.userId = ud.userId
                            and od1.orderType = '1')
 left join orderData order2
     on order2.orderId = (select max(orderId)
                          from orderData od2
                          where od2.userId = ud.userId
                            and od2.orderType = '2')
 where ...[some limiting factors on the selection of users]...;
0

Source: https://habr.com/ru/post/1277332/

