How to efficiently get the last n rows per group (GROUP BY) in SQLite?

I have a table of event results, and I need to get the most recent n events for each player for a given list of players.

This is on iOS, so it needs to be fast. I looked at a lot of top-n-per-group solutions that use subqueries or joins, but they run slowly on my dataset of 100k rows, even on a MacBook Pro. So far, my dumb solution is to make 6 separate queries, since I will only ever work with at most 6 players. This is not too slow, but there must be a better way, right? Here's the gist of what I'm doing now:

    results_by_pid = {}
    player_ids = [1, 2, 3, 4, 5, 6]
    n_events = 6
    for pid in player_ids:
        # one query per player: newest events first, capped at n_events
        results_by_pid[pid] = exec_sql(
            f"SELECT * FROM results WHERE player_id = {pid} "
            f"ORDER BY event_date DESC LIMIT {n_events}")

And then I go on my merry way. But how can I turn this into one fast query?

2 answers

There is no better way. The SQL features that would help here (window functions) were not implemented in SQLite at the time of writing; they only arrived in version 3.25.0.
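Note that this depends on the SQLite version: window functions were added in SQLite 3.25.0, so on a new enough build a single query is possible. Below is a minimal sketch in Python's sqlite3 module, not a benchmark. The table layout (id, player_id, event_date as ISO text), the sample data, and the helper name last_n_per_player are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, player_id INTEGER, event_date TEXT)"
)
# Hypothetical sample data: 10 events for each of players 1..8.
conn.executemany(
    "INSERT INTO results (player_id, event_date) VALUES (?, ?)",
    [(pid, f"2024-01-{day:02d}") for pid in range(1, 9) for day in range(1, 11)],
)


def last_n_per_player(conn, player_ids, n):
    """Return {player_id: [rows, newest first]} from one query.

    Requires SQLite 3.25.0+ because it relies on ROW_NUMBER().
    """
    placeholders = ",".join("?" for _ in player_ids)
    sql = f"""
        SELECT id, player_id, event_date FROM (
            SELECT id, player_id, event_date,
                   ROW_NUMBER() OVER (
                       PARTITION BY player_id ORDER BY event_date DESC
                   ) AS rn
            FROM results
            WHERE player_id IN ({placeholders})
        )
        WHERE rn <= ?
        ORDER BY player_id, event_date DESC
    """
    grouped = {pid: [] for pid in player_ids}
    for row in conn.execute(sql, (*player_ids, n)):
        grouped[row[1]].append(row)
    return grouped


window_results = last_n_per_player(conn, [1, 2, 3, 4, 5, 6], 6)
```

Whether this beats six small indexed queries on a real 100k-row table is something to measure rather than assume.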

SQLite is designed as an embedded database, where most of the logic stays in the application. Unlike with client/server databases, where network round trips are to be avoided, there is no performance penalty for mixing SQL commands and program logic.

A less dumb solution would have you fetch the player IDs with a SELECT player_id FROM ... query beforehand, which should not be a problem.

To make the individual queries efficient, make sure you have a single index on the two columns player_id and event_date.
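As a rough sketch of that advice, here is the per-player loop with a composite index, using Python's sqlite3 module. The schema, the sample data, and the index name idx_results_player_date are assumptions for illustration; only the results table and the player_id and event_date columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, player_id INTEGER, event_date TEXT)"
)
# One composite index: the equality column first, then the sort column.
conn.execute(
    "CREATE INDEX idx_results_player_date ON results (player_id, event_date)"
)
# Hypothetical sample data: 10 events for each of players 1..6.
conn.executemany(
    "INSERT INTO results (player_id, event_date) VALUES (?, ?)",
    [(pid, f"2024-01-{day:02d}") for pid in range(1, 7) for day in range(1, 11)],
)

PER_PLAYER_SQL = (
    "SELECT * FROM results WHERE player_id = ? "
    "ORDER BY event_date DESC LIMIT ?"
)


def recent_results(conn, player_ids, n_events):
    """Run one small indexed query per player, with bound parameters."""
    return {
        pid: conn.execute(PER_PLAYER_SQL, (pid, n_events)).fetchall()
        for pid in player_ids
    }


results_by_pid = recent_results(conn, [1, 2, 3, 4, 5, 6], 6)

# Ask SQLite how it intends to run the query; the last column is the detail text.
plan = conn.execute("EXPLAIN QUERY PLAN " + PER_PLAYER_SQL, (1, 6)).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
```

If plan_text mentions the index and no temporary B-tree for the ORDER BY, each query can walk the index backwards and stop after n rows.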


This isn't exactly an answer, but here goes...

I have found that making things really fast often comes from insights into the nature of the data itself and the schema. For example, searching an ordered list is faster than searching an unordered one, but you have to pay the cost up front, both in design and in execution.

So ask yourself whether there are any natural partitions in your data that could reduce the number of records SQLite needs to search. You might ask whether the last n events all fall within a certain period of time. Will they all be from the last seven days? The last month? If so, you can build a query that excludes whole chunks of data before running the more complex search.
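To make the partition idea concrete, here is a hedged sketch in Python's sqlite3 module: a date cutoff trims the candidate rows before a correlated top-n subquery runs over them. The schema, the sample data, the helper name recent_within_window, and the assumption that event_date holds ISO-formatted text (so string comparison matches date order) are all mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, player_id INTEGER, event_date TEXT)"
)
conn.execute("CREATE INDEX idx_player_date ON results (player_id, event_date)")
# Hypothetical sample data: 30 daily events for each of players 1..3.
conn.executemany(
    "INSERT INTO results (player_id, event_date) VALUES (?, ?)",
    [(pid, f"2024-01-{day:02d}") for pid in range(1, 4) for day in range(1, 31)],
)


def recent_within_window(conn, player_ids, n, cutoff_date):
    """Top n rows per player, considering only rows on or after cutoff_date.

    Caveats: a player with fewer than n events in the window gets fewer rows,
    and rows sharing the same event_date can push a group past n.
    """
    placeholders = ",".join("?" for _ in player_ids)
    sql = f"""
        SELECT r.id, r.player_id, r.event_date
        FROM results AS r
        WHERE r.player_id IN ({placeholders})
          AND r.event_date >= ?            -- cheap filter applied first
          AND (SELECT COUNT(*) FROM results AS newer
               WHERE newer.player_id = r.player_id
                 AND newer.event_date > r.event_date) < ?
        ORDER BY r.player_id, r.event_date DESC
    """
    return conn.execute(sql, (*player_ids, cutoff_date, n)).fetchall()


wide_rows = recent_within_window(conn, [1, 2], 6, "2024-01-20")
narrow_rows = recent_within_window(conn, [1, 2], 6, "2024-01-28")
```

With the narrow cutoff only three events per player fall inside the window, so the caller would have to widen the window or fall back to the plain per-player query.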

Also, if you just can't get the query to run fast enough, you can consider cheating with the UX! So many engineers neglect their UX. Will your query be triggered by pushing a view controller? Then kick it off on a background thread from the PREVIOUS view controller and let it run while iOS animates. How long is the push animation? 0.2 seconds? At what point does your user tell the app (through some UX control) which player IDs will be queried? As soon as they touch that button or table view cell, you can prefetch some data. So if the overall work you have to do is O(n log n), you can probably split it into O(n) and O(log n) parts.

Just some thoughts, while I avoid my own hard work.


Other thoughts

What about a separate table containing the IDs of the last n inserts? You could add a trigger that removes old IDs once the table grows beyond n rows. Say:

    -- SQLite has no real DATE storage class (the name only gives NUMERIC
    -- affinity), but you get the point.
    CREATE TABLE IF NOT EXISTS recent_results (
        result_id  INTEGER PRIMARY KEY,
        event_date DATE
    );

    -- Replace N with a literal number.
    CREATE TRIGGER IF NOT EXISTS optimizer
    AFTER INSERT ON recent_results
    WHEN (SELECT COUNT(*) FROM recent_results) > N
    BEGIN
        -- drop the oldest row so that only N remain
        DELETE FROM recent_results
        WHERE result_id = (SELECT result_id FROM recent_results
                           ORDER BY event_date ASC LIMIT 1);
    END;
    -- Or something like that. I just threw it together.
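Since the question is about the last n events per player, here is a per-player variation of that idea, sketched with Python's sqlite3 module. It is my adaptation rather than the trigger above: the extra player_id column, the trigger name track_recent, the index name, and the cap N = 3 are all assumptions for illustration.

```python
import sqlite3

N = 3  # hypothetical per-player cap; a trigger body needs it as a literal

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
CREATE TABLE results (
    id         INTEGER PRIMARY KEY,
    player_id  INTEGER NOT NULL,
    event_date TEXT NOT NULL
);
CREATE TABLE recent_results (
    result_id  INTEGER PRIMARY KEY,
    player_id  INTEGER NOT NULL,
    event_date TEXT NOT NULL
);
CREATE INDEX recent_by_player ON recent_results (player_id, event_date);

-- Mirror every new result, then trim that player's rows back to the newest N.
CREATE TRIGGER track_recent AFTER INSERT ON results
BEGIN
    INSERT INTO recent_results (result_id, player_id, event_date)
    VALUES (NEW.id, NEW.player_id, NEW.event_date);
    DELETE FROM recent_results
    WHERE player_id = NEW.player_id
      AND result_id NOT IN (
          SELECT result_id FROM recent_results
          WHERE player_id = NEW.player_id
          ORDER BY event_date DESC, result_id DESC
          LIMIT {N}
      );
END;
""")

# Hypothetical sample data: 7 daily events for players 1 and 2.
conn.executemany(
    "INSERT INTO results (player_id, event_date) VALUES (?, ?)",
    [(pid, f"2024-01-{day:02d}") for day in range(1, 8) for pid in (1, 2)],
)

recent = conn.execute(
    "SELECT player_id, event_date FROM recent_results "
    "ORDER BY player_id, event_date DESC"
).fetchall()
```

Reading the last n events for all six players then becomes a single scan of a tiny table, at the price of a little extra work on every insert.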

Or you could simply create an in-memory temporary table that you populate when the app launches and update whenever the app writes new results. That way you only pay the steep price once!
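A minimal sketch of the temporary-table idea, again in Python's sqlite3 module. The table name recent_cache and the helpers build_recent_cache and record_result are hypothetical, as are the schema and sample data; PRAGMA temp_store = MEMORY is issued right after connecting because it cannot be changed inside a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Keep temporary tables in memory; set this before any transaction starts.
conn.execute("PRAGMA temp_store = MEMORY")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, player_id INTEGER, event_date TEXT)"
)
# Hypothetical sample data: 5 daily events for players 1 and 2.
conn.executemany(
    "INSERT INTO results (player_id, event_date) VALUES (?, ?)",
    [(pid, f"2024-01-{day:02d}") for pid in (1, 2) for day in range(1, 6)],
)
conn.commit()

TRIM_SQL = """
    DELETE FROM recent_cache
    WHERE player_id = ?
      AND result_id NOT IN (
          SELECT result_id FROM recent_cache
          WHERE player_id = ?
          ORDER BY event_date DESC, result_id DESC
          LIMIT ?
      )
"""


def build_recent_cache(conn, player_ids, n):
    """Pay the expensive part once, at launch: copy each player's newest n rows."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS recent_cache ("
        "result_id INTEGER PRIMARY KEY, player_id INTEGER, event_date TEXT)"
    )
    conn.execute("DELETE FROM recent_cache")
    for pid in player_ids:
        conn.execute(
            "INSERT INTO recent_cache (result_id, player_id, event_date) "
            "SELECT id, player_id, event_date FROM results "
            "WHERE player_id = ? ORDER BY event_date DESC LIMIT ?",
            (pid, n),
        )


def record_result(conn, player_id, event_date, n):
    """Write a new result and keep the cache in step with it."""
    cur = conn.execute(
        "INSERT INTO results (player_id, event_date) VALUES (?, ?)",
        (player_id, event_date),
    )
    conn.execute(
        "INSERT INTO recent_cache (result_id, player_id, event_date) VALUES (?, ?, ?)",
        (cur.lastrowid, player_id, event_date),
    )
    conn.execute(TRIM_SQL, (player_id, player_id, n))
    conn.commit()


build_recent_cache(conn, [1, 2], 3)
record_result(conn, 1, "2024-01-06", 3)
cached = conn.execute(
    "SELECT player_id, event_date FROM recent_cache "
    "ORDER BY player_id, event_date DESC"
).fetchall()
```

The cache lives only as long as the connection, so it has to be rebuilt on every launch.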

Just a few more things for you to consider. Be creative, and remember that you usually get to define whatever you want, both the data structure and the algorithm. Good luck!


Source: https://habr.com/ru/post/1469288/

