ANSI and non-ANSI SQL JOIN syntax

I have my business logic in ~ 7000 lines of T-SQL stored procedures, and most of them have the following JOIN syntax:

SELECT AA, BB, CC FROM aaa AS A, bbb AS B, ccc AS C WHERE AB = B.ID AND BC = C.ID AND C.ID = @param 

I will get a performance boost if I replace such a query with the following:

 SELECT AA, BB, CC FROM aaa AS A JOIN bbb AS B ON AB = B.ID JOIN ccc AS C ON BC = C.ID AND C.ID = @param 

Or are they the same?

+50
sql join sql-server tsql
Oct 21 '09 at 6:28
source share
7 answers

Two queries are the same, except for the second - ANSI-92 SQL syntax, and the first - senior SQL syntax, which does not include the join clause. They should draw up exactly the same internal query plan, although you can check it out.

You must use the ANSI-92 syntax for several reasons.

  • Using the JOIN clause shares the logic of relationships with (WHERE) and is thus cleaner and more understandable.
  • This does not matter with this particular query, but there are several circumstances where the older external connection syntax (using +) is ambiguous, and the query results, therefore, are implementation dependent - or the query cannot be resolved at all. This does not happen with ANSI-92
  • This is good practice, as most developers and dba will use ANSI-92 at this time, and you must follow the standard. Of course, all modern query tools will generate ANSI-92.
  • As pointed out by @gbn, it tends to avoid random cross joins.

I myself resisted ANSI-92 for a while, as there is a slight conceptual advantage over the old syntax, since it’s easier to envision SQL as a massive Cartesian join of all the tables used, followed by a filtering operation - a mental technique that can be useful for understanding what it does SQL query. However, several years ago, I decided that I needed to move with time, and after a relatively short period of adjustment, I now prefer it - mainly because of the first reason mentioned above. The only place to move away from the ANSI-92 syntax, or rather not to use this parameter, is natural associations, which are implicitly dangerous.

+67
Oct. 21 '09 at 7:10
source share

The second construct is known as “mixed join syntax” in the SQL community. The first AFAIK construct does not have a widespread name, so call it the old-style inner join syntax.

The usual arguments are as follows:

Advantages of the “traditional” syntax: predicates are physically grouped in a WHERE in any order that makes the query as a whole, and n-ary relationships in particular are easier to read and understand ( ON syntax infixed clauses can propagate predicates, so you need to look for the appearance of one table or column over visual distance).

Deficiencies of the Traditional syntax: When one of the join predicates is excluded, there is no parsing error, and the result is a Cartesian product (known as CROSS JOIN in the infixed syntax), and such an error can be difficult to detect and debug. In addition, join predicates and filter predicates are physically grouped in a WHERE , which can lead to their confusion from each other.

+4
Oct 21 '09 at 9:15
source share

Complete both and check your query plans. They must be equal.

+3
Oct 21 '09 at 6:30 a.m.
source share

Two queries are equal - the first uses the ANSI JOIN syntax, the second uses the ANSI JOIN syntax. I recommend sticking with the ANSI JOIN syntax.

And yes, LEFT OUTER JOINs (which, incidentally, are also ANSI JOIN syntax) is what you want to use when it is likely that the table you are joining may not contain the corresponding records.

Link: Conditional connections in SQL Server

+3
Oct 21 '09 at 6:36
source share

OK, they do the same. It agreed. Unlike many, I use an older convention. That SQL-92 is “easier to understand” is controversial. Having written programming languages ​​for 40 years (gulp), I know that “easy to read” begins first before any other agreement with “visual acuity” (the term is incorrect, but this is the best phrase that I can use). When reading SQL, the FIRST thing you're worried about is which tables are involved, and then which table (most) defines the seed. Then you take care of the relevant data restrictions, and then the selected attributes. While SQL-92 basically separates these ideas, there are so many noisy words, the brain needs to interpret and process them, and this makes reading SQL slower.

 SELECT Mgt.attrib_a AS attrib_a ,Sta.attrib_b AS attrib_b ,Stb.attrib_c AS attrib_c FROM Main_Grain_Table Mgt ,Surrounding_TabA Sta ,Surrounding_tabB Stb WHERE Mgt.sta_join_col = Sta.sta_join_col AND Mgt.stb_join_col = Stb.stb_join_col AND Mgt.bus_logic_col = 'TIGHT' 

Visual acuity! Place commas for the new attributes in front. It also simplifies commenting code. Use a specific case for functions and keywords Use a specific case for tables Use a specific case for attributes Vertically Build operators and operations Make the first table in FROM represent the grain of the data Make the first WHERE tables equal to the join constraints, and let specific, hard constraints float at the bottom . Select 3 alias characters for ALL tables in your database and use the EVERY alias you refer to the table with. You should use this alias as a prefix for the (many) indexes on this table. 6 out of 1 1/2 dozens of the other, right? May be. But even if you use the ANSI-92 agreement (like me, and in the case of continued work), use the principles of visual acuity, vertical alignment so that your mental gaze can not fall into the places you want to see, and it is easy to avoid things (especially noisy words) you do not need.

+3
Jul 28 '15 at 12:41
source share

In my view, the FROM clause is where I decide which columns I need in the rows for my SELECT clause. This expresses a business rule that displays on the same line the values ​​needed in the calculations. A business rule can be a customer who has invoices, as a result of which invoices are built, including customer liability. It can also be places in the same zip code as customers, leaving a list of places and customers that are close to each other.

Here I determine the centricity of the rows in my result set. In the end, we are simply shown the metaphor of the list in the RDBMS, and each list has a topic (entity), and each line is an instance of the object. If we understand the centricity of the rows, we understand the object of the result set.

The WHERE clause, which is conceptually executed after the lines defined in the from clause, selects rows that are not required (or contain lines that are required) for the SELECT clause to work.

Since join logic can be expressed in both the FROM clause and the WHERE clause, and because there are sentences for dividing and overcoming complex logic, I choose to put the join logic, which includes the values ​​in the columns in the FROM clause, because it is essentially expressing a business rule that is supported by matching values ​​in columns.

i.e. I will not write the WHERE clause as follows:

  WHERE Column1 = Column2 

I will put this in the FROM clause as follows:

  ON Column1 = Column2 

Similarly, if a column should be compared with external values ​​(values ​​that may or may not be in the column), for example, comparing a zip code with a specific zip code, I will put this in a WHERE clause because I basically say I only need such rows .

i.e. I will not write the FROM clause as follows:

  ON PostCode = '1234' 

I will put this in the WHERE clause as follows:

  WHERE PostCode = '1234' 
+1
Oct 18 '14 at 20:51
source share

The ANSI syntax does not apply either the predicative placement in the correct sentence (whether it be ON or WHERE), nor the proximity of the ON clause to the adjacent table. Developer is free to write mess

 SELECT C.FullName, C.CustomerCode, O.OrderDate, O.OrderTotal, OD.ExtendedShippingNotes FROM Customer C CROSS JOIN Order O INNER JOIN OrderDetail OD ON C.CustomerID = O.CustomerID AND C.CustomerStatus = 'Preferred' AND O.OrderTotal > 1000.0 WHERE O.OrderID = OD.OrderID; 

Speaking of query tools that “will generate ANSI-92,” I comment here because it generated

 SELECT 1 FROM DEPARTMENTS C JOIN EMPLOYEES A JOIN JOBS B ON C.DEPARTMENT_ID = A.DEPARTMENT_ID ON A.JOB_ID = B.JOB_ID 

The only syntax that avoids the usual "file with a restriction-project-Cartesian" is an external join. This operation is more complicated because it is not associative (both with itself and with a normal connection). At least it’s wise to circle the query with the outer join. However, this is an exotic operation; if you use it too often, I suggest using the relational database class.

0
Mar 02 '11 at 19:30
source share



All Articles