I have a large table (200,000,000 rows), declared as:
thread ( forum_id tinyint, thread_id int, date_first datetime, date_last datetime, replycount mediumint, extra blob )
forum_id and thread_id together form the primary key. For large forums (around a million topics), I sometimes have to run queries like SELECT thread_id FROM thread ORDER BY date_last DESC LIMIT 500000, 10. These queries with huge offsets take a second, or sometimes several, to complete.
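To illustrate the kind of check I mean, here is a sketch of inspecting the plan for the slow query (the exact plan will depend on what indexes exist):

-- Sketch: examine the plan for the slow pagination query.
-- With LIMIT 500000, 10, the server still has to generate and
-- discard the first 500,000 rows of the sorted result before
-- returning the 10 requested ones.
EXPLAIN
SELECT thread_id
FROM thread
ORDER BY date_last DESC
LIMIT 500000, 10;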
To speed this up, I could duplicate the data and create separate tables for the forums with the most threads. Only a few forums contain more than 100,000 topics, so there would be a table like
thread_for_forumid_123456 ( thread_id int, date_first datetime, date_last datetime, replycount mediumint )
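To make the idea concrete, a minimal sketch of how such a table could be created and filled. The primary key on thread_id and the secondary index on date_last are assumptions on my part, the index being there so the ORDER BY can be served in index order; the forum id 123456 just follows the example table name:

-- Sketch: per-forum copy for one large forum.
-- PRIMARY KEY (thread_id) is assumed: within a single forum,
-- thread_id is unique because (forum_id, thread_id) is the
-- primary key of the main table.
CREATE TABLE thread_for_forumid_123456 (
    thread_id  INT       NOT NULL,
    date_first DATETIME  NOT NULL,
    date_last  DATETIME  NOT NULL,
    replycount MEDIUMINT NOT NULL,
    PRIMARY KEY (thread_id),
    KEY idx_date_last (date_last)  -- assumed index to back ORDER BY date_last
);

-- Populate it by copying that forum's rows from the main table.
INSERT INTO thread_for_forumid_123456 (thread_id, date_first, date_last, replycount)
SELECT thread_id, date_first, date_last, replycount
FROM thread
WHERE forum_id = 123456;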
What do you think about this approach? Will it speed up the queries with huge offsets? Do you have any other suggestions? Thanks.