Which DBMS allows ordering on an attribute that is missing from the select clause?

Suppose I have a table named Cars with two columns: CarName and BrandName.

Now I want to execute this query:

 select CarName from Cars order by BrandName 

As you can see, I would like to return a list that is sorted by a column that is not in the selected part of the query.

The logical (non-optimized) order in which the clauses of an SQL query are evaluated is: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY.

The issue that arises is that BrandName is no longer part of what is left after the SELECT step has been evaluated.

I searched for this in books, on Google and on Stack Overflow, but so far I have only found a few SO comments along the lines of "I know of a database system that does not allow this, but I don't remember which one."

So my questions are:
1) What do the SQL-92 or SQL-99 standards say about this?
2) Which databases allow this query and which do not?

(Background: a couple of students asked me about this, and I want to give them a better answer.)

EDIT:
- Successfully tested on Microsoft SQL Server 2012

2 answers

Your query is completely legal syntax: you can order by columns that are not in the select list.

If you need the full specification of what is legal in an ORDER BY, the SQL:2003 standard contains a long list of rules about what the ORDER BY may and may not contain (02-Foundation, p. 415, section 7.13 <query expression>, part 28). It confirms that your query is legal syntax.
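As a quick way to try this out, here is a minimal sketch that runs the query from the question. It assumes SQLite (through Python's standard `sqlite3` module) purely as a convenient stand-in; SQLite is not one of the systems discussed in this thread, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database; the three sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cars (CarName TEXT, BrandName TEXT)")
conn.executemany(
    "INSERT INTO Cars (CarName, BrandName) VALUES (?, ?)",
    [("Golf", "Volkswagen"), ("Corolla", "Toyota"), ("A4", "Audi")],
)

# BrandName drives the sort but does not appear in the select list.
rows = conn.execute("SELECT CarName FROM Cars ORDER BY BrandName").fetchall()
car_names = [name for (name,) in rows]
print(car_names)  # ['A4', 'Corolla', 'Golf'] (Audi, Toyota, Volkswagen)
```

The car names come back sorted by their brand even though the brand itself is never returned.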

I think your confusion may come from selecting and/or ordering by columns that are not in the GROUP BY, or from ordering by columns that are not in the select list when using DISTINCT.

Both have the same underlying problem, and MySQL is the only DBMS that, as far as I know, allows either.

The problem is that when using GROUP BY or DISTINCT, any columns not contained in either are not needed, so it does not matter if they have several different values across rows, because they are never needed. Imagine this simple data set:

 ID  | Column1 | Column2  |
 ----|---------+----------|
 1   |    A    |    X     |
 2   |    A    |    Z     |
 3   |    B    |    Y     |

If you write:

 SELECT DISTINCT Column1 FROM T; 

You'll get

 Column1
 ---------
 A
 B

If you then add ORDER BY Column2, which of the two values of Column2 will be used to order A: X or Z? It is not defined how to pick a value for Column2.
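To make the ambiguity concrete, the following sketch spells out the choice explicitly with an aggregate. It assumes SQLite via Python's `sqlite3` as a stand-in engine, and `column1_ordered_by` is a hypothetical helper introduced only for this illustration. Depending on whether X or Z represents group A, the order of the result flips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO T (ID, Column1, Column2) VALUES (?, ?, ?)",
    [(1, "A", "X"), (2, "A", "Z"), (3, "B", "Y")],
)

def column1_ordered_by(aggregate):
    """Hypothetical helper: one row per Column1, sorted by an explicit aggregate of Column2."""
    if aggregate not in ("MIN", "MAX"):  # only fixed literals are ever interpolated
        raise ValueError("aggregate must be MIN or MAX")
    sql = f"SELECT Column1 FROM T GROUP BY Column1 ORDER BY {aggregate}(Column2)"
    return [value for (value,) in conn.execute(sql)]

by_min = column1_ordered_by("MIN")  # A is represented by X, B by Y
by_max = column1_ordered_by("MAX")  # A is represented by Z, B by Y
print(by_min)  # ['A', 'B']
print(by_max)  # ['B', 'A']
```

Both orderings are equally defensible, which is why a plain ORDER BY Column2 on top of SELECT DISTINCT Column1 has no single well-defined answer.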

The same applies to selecting columns that are not in the GROUP BY. To simplify, just imagine the first two rows of the previous table:

 ID  | Column1 | Column2  |
 ----|---------+----------|
 1   |    A    |    X     |
 2   |    A    |    Z     |

In MySQL you can write

 SELECT ID, Column1, Column2 FROM T GROUP BY Column1; 

This actually violates the SQL standard but works in MySQL. The problem is that it is non-deterministic; the result:

 ID  | Column1 | Column2  |
 ----|---------+----------|
 1   |    A    |    X     |

is no more or less correct than

 ID  | Column1 | Column2  |
 ----|---------+----------|
 2   |    A    |    Z     |

So what you are saying is: give me one row for each distinct value of Column1, which both result sets satisfy. So how do you know which one you will get? Well, you don't. It is a fairly popular misconception that you can add an ORDER BY clause to influence the result, for example that the following query:

 SELECT ID, Column1, Column2 FROM T GROUP BY Column1 ORDER BY ID DESC; 

would ensure that you get the following result:

 ID  | Column1 | Column2  |
 ----|---------+----------|
 2   |    A    |    Z     |

because of the ORDER BY ID DESC; however, this is not true (as shown here).

The MySQL docs state:

The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.

So even though you have an ORDER BY, it is not applied until after one row per group has been selected, and that one row is non-deterministic.
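If the intent behind such a query is "one row per Column1 value, taking the row with the highest ID", one deterministic way to say so is a correlated subquery rather than an ORDER BY. This is only a sketch of one possible rewrite, again assuming SQLite via Python's `sqlite3` as a stand-in engine; it is not taken from the MySQL documentation quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO T (ID, Column1, Column2) VALUES (?, ?, ?)",
    [(1, "A", "X"), (2, "A", "Z"), (3, "B", "Y")],
)

# Keep exactly the row whose ID is the largest within its Column1 group.
latest_per_group = conn.execute(
    """
    SELECT t.ID, t.Column1, t.Column2
    FROM T AS t
    WHERE t.ID = (SELECT MAX(t2.ID) FROM T AS t2 WHERE t2.Column1 = t.Column1)
    ORDER BY t.ID
    """
).fetchall()
print(latest_per_group)  # [(2, 'A', 'Z'), (3, 'B', 'Y')]
```

Here the row choice is part of the WHERE clause, so the ORDER BY only sorts rows that have already been selected unambiguously.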

The SQL standard does allow columns in the select list that are not contained in the GROUP BY or in an aggregate function; however, these columns must be functionally dependent on a column in the GROUP BY. From the SQL:2003 standard (5WD-02-Foundation-2003-09, p. 346) - http://www.wiscorp.com/sql_2003_standard.zip

15) If T is a grouped table, then let G be the set of grouping columns of T. In each <value expression> contained in <select list>, each column reference that references a column of T shall reference some column C that is functionally dependent on G or shall be contained in an aggregated argument of a <set function specification> whose aggregation query is QS.

For example, ID in the sample table is the PRIMARY KEY, so we know it is unique in the table. The following query therefore conforms to the SQL standard and will run in MySQL, yet it fails in many DBMSs at the moment (at the time of writing, PostgreSQL is the closest DBMS I know of to implementing the standard correctly; example here):

 SELECT ID, Column1, Column2 FROM T GROUP BY ID; 

Since ID is unique for each row, there can be only one value of Column1 and one value of Column2 for each ID, so there is no ambiguity about what should be returned for each row.
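The lack of ambiguity can be checked directly. The sketch below assumes SQLite via Python's `sqlite3` as a stand-in engine. Note the caveat: SQLite accepts non-aggregated columns in grouped queries in general, so this run only illustrates that the result is well defined when grouping by the primary key; it says nothing about how strictly a given DBMS enforces the standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
source_rows = [(1, "A", "X"), (2, "A", "Z"), (3, "B", "Y")]
conn.executemany("INSERT INTO T (ID, Column1, Column2) VALUES (?, ?, ?)", source_rows)

# Every ID group contains exactly one row, so there is nothing to choose between.
group_sizes = conn.execute(
    "SELECT ID, COUNT(*) FROM T GROUP BY ID ORDER BY ID"
).fetchall()

grouped = conn.execute(
    "SELECT ID, Column1, Column2 FROM T GROUP BY ID ORDER BY ID"
).fetchall()

print(group_sizes)  # [(1, 1), (2, 1), (3, 1)]
print(grouped)      # [(1, 'A', 'X'), (2, 'A', 'Z'), (3, 'B', 'Y')]
```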


There is no logical reason why a DBMS would not allow you to do this. The usual restrictions apply to SELECT DISTINCT or to the presence of a GROUP BY clause.

DBMSs currently known to support this:

  • Microsoft SQL Server 2012
  • Oracle
  • PostgreSQL
  • MySQL
  • DB2

Source: https://habr.com/ru/post/959257/

