How do I get MySQL to use an INDEX for a view query?

I am working on a Java EE web project with a MySQL database. We need a view to summarize data from 3 tables with more than 3M rows in total. Each table was created with an index, but I have not found a way to take advantage of those indexes in conditional SELECT statements against the view, which we created using [group by].

People have suggested to me that using views in MySQL is not a good idea, because you cannot create an index on a view in MySQL the way you can in Oracle. But in some tests that I ran, indexes can be used in a SELECT statement against a view. Perhaps I created these views in the wrong way.

I will use an example to describe my problem.

We have a table that records high-score data from NBA games, with an index on the column [happened_in]:

    CREATE TABLE `highscores` (
      `tbl_id` int(11) NOT NULL auto_increment,
      `happened_in` int(4) default NULL,
      `player` int(3) default NULL,
      `score` int(3) default NULL,
      PRIMARY KEY (`tbl_id`),
      KEY `index_happened_in` (`happened_in`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Insert data (8 rows):

    INSERT INTO highscores(happened_in, player, score)
    VALUES (2006, 24, 61), (2006, 24, 44), (2006, 24, 81),
           (1998, 23, 51), (1997, 23, 46), (2006, 3, 55),
           (2007, 24, 34), (2008, 24, 37);

Then I create a view to see the highest score Kobe Bryant got in each year:

    CREATE OR REPLACE VIEW v_kobe_highscores AS
    SELECT player, max(score) AS highest_score, happened_in
    FROM highscores
    WHERE player = 24
    GROUP BY happened_in;

I wrote a conditional statement to see the highest score that Kobe got in 2006:

 select * from v_kobe_highscores where happened_in = 2006; 

When I run EXPLAIN on this in Toad for MySQL, I find that MySQL scans all of the rows to build the view and then finds the data matching the condition within it, without using the index on [happened_in].

 explain select * from v_kobe_highscores where happened_in = 2006; 

(Screenshot: EXPLAIN result)

The view that we use in our project is built on tables with millions of rows. Scanning every row of a table on each query against the view is unacceptable. Please help! Thanks!

@zerkms Here is the result of my test on real-life data. I do not see much difference between them. I think @spencer7593 has the right point: the MySQL optimizer does not "push" that predicate into the view query. (Screenshot: real-life test)

+42
mysql indexing
Dec 19
3 answers

How do you get MySQL to use an index for a view query? The short answer: provide an index that MySQL can use.

In this case, the optimal index is probably a "covering" index:

 ... ON highscores (player, happened_in, score) 

MySQL will probably use this index, and EXPLAIN will show "Using index". The WHERE player = 24 clause is an equality predicate on the leading column of the index. The GROUP BY happened_in (the second column in the index) may allow MySQL to use the index to avoid a sort operation. Including the score column in the index lets the query be satisfied entirely from the index pages, without visiting (looking up) the data pages that the index references.

That is the quick answer. The longer answer is that MySQL is very unlikely to use an index with a leading column of happened_in for the view query.




Why the view causes a performance issue

One of the problems you encounter with a MySQL view is that MySQL does not “push” the predicate from the outer query down into the view query.

Your outer query specifies WHERE happened_in = 2006. The MySQL optimizer does not consider that predicate when it runs the inner "view query". The view query is executed separately, before the outer query. The result set from that query is "materialized"; that is, the results are stored as an intermediate MyISAM table. (MySQL calls this a "derived table", and that name makes sense once you understand the operations MySQL performs.)

The bottom line is that the index that you defined on happened_in is not used by MySQL when it processes the query that forms the definition of the view.

After the intermediate “derived table” is created, THEN the outer query is executed, using that “derived table” as its row source. It is only when that outer query runs that the predicate happened_in = 2006 is evaluated.

Note that all of the rows from the view query are stored, which (in your case) is a row for EVERY value of happened_in, not just the one you specify in the equality predicate of the outer query.

The way view queries are processed may be "unexpected" to some, and this is one reason that using "views" in MySQL can lead to performance problems, compared to the way view queries are processed by other relational databases.
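To make the processing order described above concrete, here is a sketch of the question's query with the view definition expanded as an inline derived table. It only illustrates the behavior this answer describes and is not output from MySQL; the alias v is an arbitrary name chosen for the example.

```sql
-- Sketch: how the query against the view is effectively processed,
-- per the description above. The alias "v" is arbitrary.
SELECT v.*
  FROM ( -- Step 1: the view query runs on its own and is materialized.
         -- The outer predicate on happened_in is NOT applied here.
         SELECT player, MAX(score) AS highest_score, happened_in
           FROM highscores
          WHERE player = 24
          GROUP BY happened_in
       ) v
       -- Step 2: the outer predicate filters the materialized rows.
 WHERE v.happened_in = 2006;
```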




Improving view query performance with a suitable covering index

Given your view definition and your query, about the best you are going to get is a "Using index" access method for the view query. To get that, you need a covering index, e.g.

 ... ON highscores (player, happened_in, score) 

This is most likely the most beneficial index (performance-wise) for your existing view definition and your existing query. The player column is the leading column because the view query has an equality predicate on that column. The happened_in column comes next because you have a GROUP BY operation on that column, and MySQL will be able to use this index to optimize the GROUP BY operation. We also include the score column because it is the only other column referenced in your query. That makes the index a “covering” index, because MySQL can satisfy the query directly from the index pages, without having to visit any pages in the base table. And that is as good as we are going to get out of this query plan: "Using index" with no "Using filesort".
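As a sketch, the covering index could be created and checked as follows. The index name highscores_ix_cover is hypothetical (the answer elides the name), and any valid name would do. Per the answer, the hoped-for plan shows "Using index" and no "Using filesort", but the actual EXPLAIN output depends on your MySQL version and data volume.

```sql
-- Hypothetical index name; the column order follows the answer:
-- equality column first, then the GROUP BY column, then the aggregated column.
CREATE INDEX highscores_ix_cover
    ON highscores (player, happened_in, score);

-- Check the plan of the query that forms the view definition.
EXPLAIN
SELECT player, MAX(score) AS highest_score, happened_in
  FROM highscores
 WHERE player = 24
 GROUP BY happened_in;
```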




Comparing performance to a standalone query with no view

You can compare the execution plan of your query against the view with that of an equivalent standalone query:

    SELECT player
         , MAX(score) AS highest_score
         , happened_in
      FROM highscores
     WHERE player = 24
       AND happened_in = 2006
     GROUP
        BY player
         , happened_in

The standalone query can also make use of a covering index, e.g.

 ... ON highscores (player, happened_in, score) 

but without the need to materialize an intermediate MyISAM table.
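One way to run that comparison is to EXPLAIN both statements side by side, as sketched below. The view name is the one from the question; no expected output is shown because the plans depend on the MySQL version, the indexes present, and the amount of data. Based on the explanation above, the plan for the view should include a step for the materialized derived table, which the standalone query should not need.

```sql
-- Plan for the query that goes through the view.
EXPLAIN
SELECT *
  FROM v_kobe_highscores
 WHERE happened_in = 2006;

-- Plan for the equivalent standalone query, where both predicates
-- are applied directly to the base table.
EXPLAIN
SELECT player, MAX(score) AS highest_score, happened_in
  FROM highscores
 WHERE player = 24
   AND happened_in = 2006
 GROUP BY player, happened_in;
```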




I am not sure whether any of the preceding gives a direct answer to the question you asked.

Q: How do I get MySQL to use an INDEX for a view query?

A: Define a suitable INDEX that the view query can use.

The short answer is to provide a “covering index” (an index that includes all of the columns referenced in the view query). The leading columns in that index should be the columns referenced in equality predicates (in your case, the player column would be the leading column, because the query has the predicate player = 24). Also, the columns referenced in the GROUP BY should be leading columns in the index, which allows MySQL to optimize the GROUP BY operation by using the index rather than a sort operation.

The key point here is that the view query is basically a standalone query; the results of that query are stored in an intermediate “derived” table (a MyISAM table that is created when the query against the view is run).

Using views in MySQL is not necessarily a “bad idea,” but I would strongly caution those who choose to use views in MySQL to BE AWARE of how MySQL processes queries that reference those views. The way MySQL processes view queries differs (significantly) from the way view queries are processed by other databases (e.g. Oracle, SQL Server).

+30
Dec 19 '12 at 3:27

Creating a composite index with player + happened_in columns (in that particular order) is the best you can do in this case.

PS: do not test the behavior of the MySQL optimizer on such a small number of rows, because it will likely prefer a full scan over indexes. If you want to see what happens in real life, fill the table with a real-life amount of data.

+2
Dec 19 '12 at 3:11

This does not directly answer the question, but it is a closely related workaround for others who run into this problem. It gives the same benefits as using a view while minimizing the disadvantages.

I set up a PHP function that takes parameters for the things that need to be pushed inside the query to maximize index use, instead of applying them in a JOIN or WHERE clause outside the view. The function builds the SQL syntax for the view and returns it. The calling program can then do something like this:

    $table = tablesyntax(parameters);
    select field1, field2 from {$table} as x... + other SQL

This way you get the encapsulation benefits of a view and the ability to call it as if it were a view, but without the index limitations.
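For illustration, applied to the highscores example from the question, the SQL that such a function returns might expand to the statement below. This is an assumed sketch, not the answerer's actual code: the filter on happened_in (here the literal 2006, standing in for a function parameter) is placed inside the derived table, where an index on highscores can be used, rather than outside a view.

```sql
-- Assumed expansion of {$table} for the question's example.
-- The value 2006 stands in for a parameter passed to the function.
SELECT x.player, x.highest_score, x.happened_in
  FROM ( SELECT player, MAX(score) AS highest_score, happened_in
           FROM highscores
          WHERE player = 24
            AND happened_in = 2006   -- parameter applied inside, before grouping
          GROUP BY player, happened_in
       ) AS x;
```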

0
Feb 11 '14 at 6:08