Difference between USING and ON when joining more than two tables

Let's say I have three tables with the following data:

    CREATE TABLE movies (
        movie_id INT,
        movie_name VARCHAR(255),
        PRIMARY KEY (movie_id)
    );

    CREATE TABLE movie_ratings (
        movie_rating_id INT,
        movie_id INT,
        rating_value TINYINT,
        PRIMARY KEY (movie_rating_id),
        KEY movie_id (movie_id)
    );

    CREATE TABLE movie_actors (
        movie_actor_id INT,
        movie_id INT,
        actor_id INT,
        PRIMARY KEY (movie_actor_id),
        KEY movie_id (movie_id)
    );

    INSERT INTO movies VALUES (1, 'Titanic'), (2, 'Star Trek');
    INSERT INTO movie_ratings VALUES (1, 1, 5), (2, 1, 4), (3, 1, 5);
    INSERT INTO movie_actors VALUES (1, 1, 2), (2, 2, 2);

If I wanted to get the average rating and the number of actors for each movie, I could do it using JOINs:

    SELECT m.movie_name,
           AVG(rating_value) AS avgRating,
           COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON m.movie_id = a.movie_id
    GROUP BY m.movie_id;

Let me call this query A. Query A can be rewritten with USING as follows:

    SELECT m.movie_name,
           AVG(rating_value) AS avgRating,
           COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r USING (movie_id)
    LEFT JOIN movie_actors a USING (movie_id)
    GROUP BY m.movie_id;

Call this query B.

Both of these queries return 1 as numActors for the movie Star Trek. Now let me modify the query a bit:

    SELECT m.movie_name,
           AVG(rating_value) AS avgRating,
           COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON r.movie_id = a.movie_id
    GROUP BY m.movie_id;

Let me call this query C. Instead of m.movie_id = a.movie_id, I now join on r.movie_id = a.movie_id. For query C, numActors is 0 for Star Trek.
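The difference between queries A and C can be reproduced with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made for easy testing; the schema is simplified (no secondary KEY definitions) and the helper name run is illustrative only.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema simplified from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
    CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
    CREATE TABLE movie_actors (movie_actor_id INT PRIMARY KEY, movie_id INT, actor_id INT);
    INSERT INTO movies VALUES (1, 'Titanic'), (2, 'Star Trek');
    INSERT INTO movie_ratings VALUES (1, 1, 5), (2, 1, 4), (3, 1, 5);
    INSERT INTO movie_actors VALUES (1, 1, 2), (2, 2, 2);
""")

# The only difference between A and C is which table the actors are joined to.
QUERY = """
    SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON {left}.movie_id = a.movie_id
    GROUP BY m.movie_id
    ORDER BY m.movie_id
"""

def run(left_alias):
    """Run the query with the actors joined against the given alias."""
    return conn.execute(QUERY.format(left=left_alias)).fetchall()

result_a = run("m")  # query A: actors joined to movies
result_c = run("r")  # query C: actors joined to ratings

print("A:", result_a)
print("C:", result_c)
```

In query C, Star Trek has no ratings, so r.movie_id is NULL on its row and no actor row can match it. Titanic reports 3 for numActors in both queries because its single actor row is repeated once per rating row.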

My questions:

  1. How do I write query C with USING? Is it possible?
  2. Is USING essentially ON with the current table and the table specified in FROM?
  3. If the answer to #2 is yes, then what does USING do when an implicit JOIN is used and several tables are in FROM?
+5
4 answers

1. Is it possible to rewrite query C with USING?

Yes, you can, with a nested join:

    SELECT m.movie_name,
           AVG(rating_value) AS avgRating,
           COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN (
        movie_ratings r
        LEFT JOIN movie_actors a USING (movie_id)
    ) USING (movie_id)
    GROUP BY m.movie_id
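As a rough check of the nested-join form, the sketch below runs the flat USING query (query B) and the nested USING query side by side. It assumes Python's sqlite3 module as a stand-in for MySQL and a simplified schema; SQLite accepts the parenthesized join, but its behavior is not guaranteed to match every MySQL version.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema simplified from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
    CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
    CREATE TABLE movie_actors (movie_actor_id INT PRIMARY KEY, movie_id INT, actor_id INT);
    INSERT INTO movies VALUES (1, 'Titanic'), (2, 'Star Trek');
    INSERT INTO movie_ratings VALUES (1, 1, 5), (2, 1, 4), (3, 1, 5);
    INSERT INTO movie_actors VALUES (1, 1, 2), (2, 2, 2);
""")

# Query B: both joins attach to the running result, so actors match movies.
flat_using = """
    SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r USING (movie_id)
    LEFT JOIN movie_actors a USING (movie_id)
    GROUP BY m.movie_id
    ORDER BY m.movie_id
"""

# Nested form: actors are joined to ratings first, then the pair joins movies.
nested_using = """
    SELECT m.movie_name, AVG(rating_value) AS avgRating, COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN (
        movie_ratings r
        LEFT JOIN movie_actors a USING (movie_id)
    ) USING (movie_id)
    GROUP BY m.movie_id
    ORDER BY m.movie_id
"""

result_b = conn.execute(flat_using).fetchall()
result_nested = conn.execute(nested_using).fetchall()

print("flat USING:  ", result_b)
print("nested USING:", result_nested)
```

Under these assumptions the nested form reports 0 actors for Star Trek, like query C, while the flat form reports 1, like query A.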

2. Is USING essentially ON with the current table and the table specified in FROM?

No. The MySQL documentation says:

Evaluation of multi-way natural joins differs in a very important way that affects the result of NATURAL or USING joins and that can require query rewriting. Suppose that you have three tables t1(a,b), t2(c,b), and t3(a,c) that each have one row: t1(1,2), t2(10,2), and t3(7,10). Suppose also that you have this NATURAL JOIN on the three tables:

SELECT ... FROM t1 NATURAL JOIN t2 NATURAL JOIN t3;

Previously, the left operand of the second join was considered to be t2, whereas it should be the nested join (t1 NATURAL JOIN t2). As a result, the columns of t3 are checked for common columns only in t2, and, if t3 has common columns with t1, these columns are not used as equi-join columns. Thus, previously, the preceding query was transformed to the following equi-join:

SELECT ... FROM t1, t2, t3 WHERE t1.b = t2.b AND t2.c = t3.c;

So in older versions of MySQL, your query B behaved not like query A, but like query C!

3. What does USING do when an implicit JOIN is used and multiple tables are in FROM?

Again, referring to the MySQL documentation:

Previously, the comma operator (,) and JOIN both had the same precedence, so the join expression t1, t2 JOIN t3 was interpreted as ((t1, t2) JOIN t3). Now JOIN has higher precedence, so the expression is interpreted as (t1, (t2 JOIN t3)). This change affects statements that use an ON clause, because that clause can refer only to columns in the operands of the join, and the change in precedence changes the interpretation of what those operands are.

It is all about join order and precedence. So t1, t2 JOIN t3 USING (x) performs t2 JOIN t3 USING (x) first and then joins the result with t1.

+2

If the column name is the same in both tables, then yes, you can use USING().

In other words, this:

    SELECT movie_name,
           AVG(rating_value) AS averageRating,
           COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON m.movie_id = a.movie_id
    GROUP BY m.movie_id;

is the same as:

    SELECT movie_name,
           AVG(rating_value) AS averageRating,
           COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings USING (movie_id)
    LEFT JOIN movie_actors USING (movie_id)
    GROUP BY movie_id;

As for ambiguity, there is none here. The tables are joined where movie_id is equal, and your SELECT statement pulls movie_name, which exists in only one table.

However, if you said this:

 SELECT movie_id, AVG(rating_value) AS averageRating, COUNT(actor_id) AS numActors 

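with the joins written using ON, MySQL will report an error because movie_id is ambiguous: it exists in more than one of the joined tables. (With USING, the unqualified join column is normally accepted, as the GROUP BY movie_id above relies on.) To remove the ambiguity, qualify movie_id with an alias or table name when selecting it.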

This is a valid select statement:

 SELECT m.movie_id, AVG(rating_value) AS averageRating, COUNT(actor_id) AS numActors 

There will be no errors for this.
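The ambiguity rule can be tried out with a small sketch. It assumes Python's sqlite3 module as a stand-in for MySQL, so the exact error text is SQLite's, not MySQL's; the helper name column_error is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only the columns needed are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
    CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
    CREATE TABLE movie_actors (movie_actor_id INT PRIMARY KEY, movie_id INT, actor_id INT);
""")

FROM_ON = """
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON m.movie_id = a.movie_id
    GROUP BY m.movie_id
"""

FROM_USING = """
    FROM movies m
    LEFT JOIN movie_ratings r USING (movie_id)
    LEFT JOIN movie_actors a USING (movie_id)
    GROUP BY m.movie_id
"""

def column_error(sql):
    """Return the error message raised by the query, or None if it runs."""
    try:
        conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None

# Unqualified movie_id with ON joins: the column exists in three tables.
unqualified_on = column_error("SELECT movie_id, AVG(rating_value), COUNT(actor_id)" + FROM_ON)
# Qualifying the column with the alias resolves it.
qualified_on = column_error("SELECT m.movie_id, AVG(rating_value), COUNT(actor_id)" + FROM_ON)
# With USING, the join column can be referenced without a qualifier.
unqualified_using = column_error("SELECT movie_id, AVG(rating_value), COUNT(actor_id)" + FROM_USING)

print(unqualified_on)
print(qualified_on)
print(unqualified_using)
```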

I would like to point out one danger here. If you LEFT JOIN movies to all these tables, you can get NULL values. If a movie has no ratings, AVG(rating_value) returns NULL for it. You will not have this problem with COUNT(actor_id), which simply returns 0. I do not know whether this matters to you, but keep in mind that this column may return NULL.
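The NULL-versus-0 behavior is easy to see in a short sketch, again assuming Python's sqlite3 module as a stand-in for MySQL. The COALESCE wrapper is one possible way to substitute a default and is an illustration, not something the question asked for.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; Star Trek deliberately has no ratings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
    CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
    INSERT INTO movies VALUES (1, 'Titanic'), (2, 'Star Trek');
    INSERT INTO movie_ratings VALUES (1, 1, 5), (2, 1, 4), (3, 1, 5);
""")

rows = conn.execute("""
    SELECT m.movie_name,
           AVG(rating_value)              AS avgRating,    -- NULL when no rows match
           COUNT(rating_value)            AS numRatings,   -- 0 when no rows match
           COALESCE(AVG(rating_value), 0) AS avgOrZero     -- explicit default
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    GROUP BY m.movie_id
    ORDER BY m.movie_id
""").fetchall()

star_trek = rows[1]
print(star_trek)
```

Python reports SQL NULL as None, so the Star Trek row carries None for the plain average and 0 for both the count and the COALESCE column.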

I built example tables in MySQL workbench and I can't get SQL Fiddle to work to show you, but if you want the data I created let me know and I will edit the question.

+3

There is no ambiguity, since USING applies to the tables in the join, so this query

    SELECT movie_name, AVG(rating_value), COUNT(actor_id)
    FROM movies m
    LEFT JOIN movie_ratings r USING (movie_id)
    LEFT JOIN movie_actors a USING (movie_id)
    GROUP BY m.movie_id;

is largely equivalent to the version written with ON, except that the movie_id column should appear only once in the results, instead of three times as it does with ON.

See this example of how the column is resolved: http://ideone.com/qMj5XK (it uses SQLite, since I think SQL Fiddle is not working, but MySQL should behave the same).
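The point about movie_id appearing once can be checked by listing the columns that SELECT * returns. This is a minimal sketch assuming Python's sqlite3 module as a stand-in for MySQL; the helper name star_columns is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; no rows are needed to inspect columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INT PRIMARY KEY, movie_name VARCHAR(255));
    CREATE TABLE movie_ratings (movie_rating_id INT PRIMARY KEY, movie_id INT, rating_value TINYINT);
    CREATE TABLE movie_actors (movie_actor_id INT PRIMARY KEY, movie_id INT, actor_id INT);
""")

def star_columns(sql):
    """Return the result column names of a query, in order."""
    cursor = conn.execute(sql)
    return [column[0] for column in cursor.description]

on_columns = star_columns("""
    SELECT * FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON m.movie_id = a.movie_id
""")

using_columns = star_columns("""
    SELECT * FROM movies m
    LEFT JOIN movie_ratings r USING (movie_id)
    LEFT JOIN movie_actors a USING (movie_id)
""")

print("ON:   ", on_columns)
print("USING:", using_columns)
```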

+2

How do I write query C with USING? Is it possible?

As jpw mentioned in his answer, yes, you can use USING for query C. It will join m to r on movie_id and m to a also on movie_id. In fact, USING in MySQL follows the SQL:2003 standard.

Is USING essentially doing ON with the current table and the table referenced in FROM?

Yes, USING does an ON with the current table and the table specified in FROM. The only difference is the number of columns you get back if you use an asterisk in the SELECT. The Oracle documentation for USING is much more useful than the MySQL documentation.

If the answer to #2 is yes, then what does USING do when an implicit JOIN is used and several tables are in FROM?

You can try this for yourself, but I'm sure it will not work with an implicit join (FROM tableA, tableB). This might just be another reason why implicit joins should be avoided.

In addition, since USING can only be used with an explicit join, this would mean a very awkward query that mixes explicit and implicit joins, which is something you probably want to avoid.

Edit:

By the way, numActors is 0 in query C because your join condition is wrong. In effect, it says that if a movie has no ratings, then it has no actors! If you fix this, you will get the same result as query B.

    SELECT m.movie_name,
           AVG(rating_value) AS avgRating,
           COUNT(actor_id) AS numActors
    FROM movies m
    LEFT JOIN movie_ratings r ON m.movie_id = r.movie_id
    LEFT JOIN movie_actors a ON m.movie_id = a.movie_id -- instead of r.movie_id = a.movie_id
    GROUP BY m.movie_id;
+1

Source: https://habr.com/ru/post/1205647/