Faster SQL query than joins

I have a large table with more than 10,000 rows, which will grow to 1,000,000 in the near future. I need to run a query that returns the total time for each keyword for each user. My current query is quite slow because it uses left joins and needs one subquery per keyword:

    SELECT rawdata.user, t1.Facebook_Time, t2.Outlook_Time, t3.Excel_time
    FROM rawdata
    LEFT JOIN (
        SELECT user, SEC_TO_TIME(SUM(TIME_TO_SEC(EndTime-StartTime))) AS 'Facebook_Time'
        FROM rawdata
        WHERE MainWindowTitle LIKE '%Facebook%'
        GROUP BY user
    ) t1 ON rawdata.user = t1.user
    LEFT JOIN (
        SELECT user, SEC_TO_TIME(SUM(TIME_TO_SEC(EndTime-StartTime))) AS 'Outlook_Time'
        FROM rawdata
        WHERE MainWindowTitle LIKE '%Outlook%'
        GROUP BY user
    ) t2 ON rawdata.user = t2.user
    LEFT JOIN (
        SELECT user, SEC_TO_TIME(SUM(TIME_TO_SEC(EndTime-StartTime))) AS 'Excel_Time'
        FROM rawdata
        WHERE MainWindowTitle LIKE '%Excel%'
        GROUP BY user
    ) t3 ON rawdata.user = t3.user

The table looks like this:

    WindowTitle | StartTime | EndTime  | User
    ------------|-----------|----------|--------
    Form1       | DateTime  | DateTime | user1
    Form2       | DateTime  | DateTime | user2
    ...         | ...       | ...      | ...
    Form_n      | DateTime  | DateTime | user_n

The result should look like this:

    User   | Keyword    | SUM(EndTime-StartTime)
    -------|------------|-----------------------
    User1  | 'Facebook' | 00:34:12
    User1  | 'Outlook'  | 00:12:34
    User1  | 'Excel'    | 00:43:13
    User2  | 'Facebook' | 00:34:12
    User2  | 'Outlook'  | 00:12:34
    User2  | 'Excel'    | 00:43:13
    ...    | ...        | ...
    User_n | ...        | ...

And the question is, what is the fastest way in MySQL to do this?

1 answer

I think the pattern search is probably what slows it down: an ordinary index cannot be used for a LIKE pattern that starts with a wildcard. Avoiding the subqueries in favor of a direct join may also help, but the wildcard search is the much bigger problem. In any case, can you modify the table to add a categoryName or categoryID column, which can be indexed and does not require a wildcard search? For example: WHERE categoryName = 'Outlook'.
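As an illustration of dropping the per-keyword subqueries, here is a minimal sketch using conditional aggregation, which reads rawdata once instead of joining it to three derived tables. It is not the asker's query: it assumes StartTime and EndTime are DATETIME columns and swaps in TIMESTAMPDIFF for the original EndTime-StartTime expression. It still uses the wildcard patterns, so it does not address the index problem.

```sql
-- Sketch only: one scan of rawdata, one output row per user.
-- Assumption: StartTime and EndTime are DATETIME columns.
SELECT
    user,
    -- CASE without ELSE yields NULL for non-matching rows, and SUM skips NULLs,
    -- so a user with no matching rows gets NULL, as with the LEFT JOIN version.
    SEC_TO_TIME(SUM(CASE WHEN MainWindowTitle LIKE '%Facebook%'
                         THEN TIMESTAMPDIFF(SECOND, StartTime, EndTime) END)) AS Facebook_Time,
    SEC_TO_TIME(SUM(CASE WHEN MainWindowTitle LIKE '%Outlook%'
                         THEN TIMESTAMPDIFF(SECOND, StartTime, EndTime) END)) AS Outlook_Time,
    SEC_TO_TIME(SUM(CASE WHEN MainWindowTitle LIKE '%Excel%'
                         THEN TIMESTAMPDIFF(SECOND, StartTime, EndTime) END)) AS Excel_Time
FROM rawdata
GROUP BY user;
```

Running EXPLAIN on both versions should show whether the single scan actually helps on your data.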

To optimize the data in your tables, add a category identifier (ideally this would be a reference to a separate table, but this example just uses arbitrary numbers):

    ALTER TABLE rawdata ADD COLUMN categoryID INT NOT NULL;
    ALTER TABLE rawdata ADD INDEX (categoryID);
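If you prefer the separate table over arbitrary numbers, it might look roughly like the sketch below. The table name category and its column definitions are hypothetical; the IDs are chosen to match the numbers used in this answer.

```sql
-- Hypothetical lookup table; name and column sizes are assumptions.
CREATE TABLE category (
    categoryID   INT NOT NULL PRIMARY KEY,
    categoryName VARCHAR(64) NOT NULL UNIQUE
);

-- IDs match the arbitrary numbers used for rawdata.categoryID in this answer.
INSERT INTO category (categoryID, categoryName) VALUES
    (1, 'Outlook'),
    (2, 'Facebook'),
    (3, 'Excel');
```

A foreign key from rawdata.categoryID to this table could be added later, once every existing row has a valid category.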

Then fill in the categoryID field for existing data:

    UPDATE rawdata SET categoryID = 1 WHERE MainWindowTitle LIKE '%Outlook%';
    UPDATE rawdata SET categoryID = 2 WHERE MainWindowTitle LIKE '%Facebook%';
    -- etc...

Then change your INSERT statements to follow the same rules.
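If changing every INSERT in the application is inconvenient, one possible alternative is a trigger that assigns the category on the way in. This is only a sketch: the trigger name is hypothetical, the title column is assumed to be MainWindowTitle as in the question's query, and 0 is assumed to mean "no category".

```sql
-- Sketch only: set categoryID automatically for each new row.
-- Trigger name and the 0 = "uncategorized" convention are assumptions.
CREATE TRIGGER rawdata_set_category
BEFORE INSERT ON rawdata
FOR EACH ROW
SET NEW.categoryID = CASE
    WHEN NEW.MainWindowTitle LIKE '%Outlook%'  THEN 1
    WHEN NEW.MainWindowTitle LIKE '%Facebook%' THEN 2
    WHEN NEW.MainWindowTitle LIKE '%Excel%'    THEN 3
    ELSE 0  -- title matches none of the keywords
END;
```

The wildcard match then runs once per inserted row rather than over the whole table on every report.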

Then write the SELECT query like this (replacing the wildcard searches with categoryID):

    SELECT rawdata.user, t1.Facebook_Time, t2.Outlook_Time, t3.Excel_time
    FROM rawdata
    LEFT JOIN (
        SELECT user, SEC_TO_TIME(SUM(TIME_TO_SEC(EndTime-StartTime))) AS 'Facebook_Time'
        FROM rawdata
        WHERE categoryID = 2
        GROUP BY user
    ) t1 ON rawdata.user = t1.user
    LEFT JOIN (
        SELECT user, SEC_TO_TIME(SUM(TIME_TO_SEC(EndTime-StartTime))) AS 'Outlook_Time'
        FROM rawdata
        WHERE categoryID = 1
        GROUP BY user
    ) t2 ON rawdata.user = t2.user
    LEFT JOIN (
        SELECT user, SEC_TO_TIME(SUM(TIME_TO_SEC(EndTime-StartTime))) AS 'Excel_Time'
        FROM rawdata
        WHERE categoryID = 3
        GROUP BY user
    ) t3 ON rawdata.user = t3.user
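
The result layout shown in the question has one row per user and keyword rather than one column per keyword. With categoryID in place, that shape could come from a single GROUP BY with no joins at all. The following is a sketch under the same assumptions as above (DATETIME columns, TIMESTAMPDIFF swapped in for EndTime-StartTime); the alias total_time is made up for the example.

```sql
-- Sketch only: one row per (user, category), matching the layout in the question.
SELECT
    user,
    categoryID,
    SEC_TO_TIME(SUM(TIMESTAMPDIFF(SECOND, StartTime, EndTime))) AS total_time
FROM rawdata
WHERE categoryID IN (1, 2, 3)   -- 1 = Outlook, 2 = Facebook, 3 = Excel
GROUP BY user, categoryID
ORDER BY user, categoryID;
```

A composite index on (categoryID, user) may help this query further, but check with EXPLAIN before relying on it.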

Source: https://habr.com/ru/post/1437008/

