What is optimal? UNION vs. WHERE IN (str1, str2, str3)

I am writing a program that sends an email at a specific local time for each client. I have a .NET method that takes a time, its time zone, and a destination time zone, and returns the time in the destination time zone. So my approach is to select every distinct time zone in the database, check whether it is the correct time there using the method, and then select every client in the database with those time zones.

The query will look like one of the two below. Keep in mind that the order of the result set does not matter, so a union would be fine. Which is faster, or do they really do the same thing?

 SELECT email FROM tClient WHERE timezoneID IN (1, 4, 9)

or

 SELECT email FROM tClient WHERE timezoneID = 1
 UNION ALL
 SELECT email FROM tClient WHERE timezoneID = 4
 UNION ALL
 SELECT email FROM tClient WHERE timezoneID = 9
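
A quick way to check whether the two forms return the same rows is to run both against a small test table. The sketch below uses Python's built-in sqlite3 module as a stand-in for the real server; the column types, the sample data, and the example.com addresses are assumptions made up for illustration, and it says nothing about speed on SQL Server.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the schema details are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tTimezone (timezoneID INTEGER PRIMARY KEY, name VARCHAR(20));
    CREATE TABLE tClient (
        email TEXT,
        timezoneID INTEGER REFERENCES tTimezone(timezoneID)
    );
""")
conn.executemany(
    "INSERT INTO tTimezone VALUES (?, ?)",
    [(i, f"zone{i}") for i in range(1, 11)],
)
# 100 hypothetical clients spread evenly over time zones 1..10.
conn.executemany(
    "INSERT INTO tClient VALUES (?, ?)",
    [(f"user{i}@example.com", i % 10 + 1) for i in range(100)],
)

in_query = "SELECT email FROM tClient WHERE timezoneID IN (1, 4, 9)"
union_all_query = """
    SELECT email FROM tClient WHERE timezoneID = 1
    UNION ALL
    SELECT email FROM tClient WHERE timezoneID = 4
    UNION ALL
    SELECT email FROM tClient WHERE timezoneID = 9
"""

# Row order is not guaranteed by either query, so sort before comparing.
in_rows = sorted(conn.execute(in_query).fetchall())
union_all_rows = sorted(conn.execute(union_all_query).fetchall())
same_rows = in_rows == union_all_rows
print(same_rows, len(in_rows))
```

Sorting before the comparison matters because neither query promises any particular order.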

Edit: timezoneID is a foreign key to tTimezone, a table with timezoneID as its primary key and a varchar(20) column for the time zone name. Also, I went with WHERE IN, since I did not feel like opening the analyzer.

Edit 2: The query processes 200k rows in under 100 ms, so at this point I am done.

+4
7 answers

Hey! These queries are not equivalent.

The results will be the same only if each email belongs to exactly one time zone. The SQL engine does not know this, however, and with UNION it will try to remove duplicates. So the first query should be faster.

Always use UNION ALL unless you know why you want to use UNION.

If you don't know what the difference is, see this SO question.

Note: this answer refers to a previous version of the question.
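
To make the difference concrete, here is a minimal sketch, again using Python's sqlite3 module as a stand-in. The three sample rows are hypothetical and deliberately put one address in two time zones, which is exactly the case where UNION and UNION ALL disagree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tClient (email TEXT, timezoneID INTEGER)")
conn.executemany(
    "INSERT INTO tClient VALUES (?, ?)",
    [
        ("a@example.com", 1),
        ("a@example.com", 4),  # same address in a second zone (hypothetical data)
        ("b@example.com", 9),
    ],
)

# Same three SELECTs, glued together with either UNION or UNION ALL.
template = (
    "SELECT email FROM tClient WHERE timezoneID = 1 {op} "
    "SELECT email FROM tClient WHERE timezoneID = 4 {op} "
    "SELECT email FROM tClient WHERE timezoneID = 9"
)

union_rows = conn.execute(template.format(op="UNION")).fetchall()
union_all_rows = conn.execute(template.format(op="UNION ALL")).fetchall()
in_rows = conn.execute(
    "SELECT email FROM tClient WHERE timezoneID IN (1, 4, 9)"
).fetchall()

# UNION collapses the duplicate address; UNION ALL and IN keep both rows.
print(len(union_rows), len(union_all_rows), len(in_rows))  # prints: 2 3 3
```

Removing duplicates is extra work for the engine, which is the reason given above for expecting plain UNION to be the slower form.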

+4

For most database performance questions, the real answer is to run the query and analyze what the database does with your data set. Run an explain plan or a trace to see whether your query hits the proper indexes, and create indexes if necessary.

I would most likely go with the first query, using the IN clause, since it carries the most semantics of what you want. timezoneID is presumably the primary key of some time zone table, so it should be a foreign key on the client table and indexed. Depending on the database optimizer, I would expect it to do an index scan on the foreign key index.
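
As an illustration of "run it and look at the plan", the sketch below asks SQLite for its query plan through Python's sqlite3 module. SQLite and its `EXPLAIN QUERY PLAN` statement are stand-ins here: the question is about SQL Server, which has its own execution plan tools and may choose a different plan. The index name `idx_tclient_timezone` and the helper `plan` are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tClient (email TEXT, timezoneID INTEGER);
    -- Hypothetical index on the foreign key column.
    CREATE INDEX idx_tclient_timezone ON tClient (timezoneID);
""")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column


in_plan = plan("SELECT email FROM tClient WHERE timezoneID IN (1, 4, 9)")
union_all_plan = plan(
    "SELECT email FROM tClient WHERE timezoneID = 1 "
    "UNION ALL SELECT email FROM tClient WHERE timezoneID = 4 "
    "UNION ALL SELECT email FROM tClient WHERE timezoneID = 9"
)

for label, details in (("IN", in_plan), ("UNION ALL", union_all_plan)):
    print(label)
    for detail in details:
        print("   ", detail)
```

The exact wording of the plan lines varies between SQLite versions, so read them for whether the index is mentioned rather than matching them literally.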

+2

My first guess would be that

  SELECT email FROM tClient WHERE timezoneID IN (1, 4, 9)

will be faster, since it needs only a single scan of the table to find the results, but I suggest checking the execution plan for both queries.

+1

I don't have the MS SQL Query Analyzer at hand to actually test my hypothesis, but I think the WHERE IN variant will be faster, because with UNION the server will have to do 3 table scans, while WHERE IN will need only one. If you have Query Analyzer, check the execution plans for both queries.

You can often come across advice online to avoid WHERE IN, but that advice applies to cases where subqueries are used. This case falls outside that recommendation, and the IN form is also easier to read and understand.
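
The "3 scans versus 1" hypothesis can be checked on an unindexed table by counting scan steps in the plan. This is a sketch against SQLite via Python's sqlite3 module, not against SQL Server, whose optimizer may behave differently; the helper `count_scans` is hypothetical, and the table deliberately has no index on timezoneID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No index on timezoneID on purpose, so every lookup has to scan the table.
conn.execute("CREATE TABLE tClient (email TEXT, timezoneID INTEGER)")


def count_scans(sql):
    """Count the plan steps that SQLite reports as full table scans."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    details = [row[-1] for row in rows]  # the detail text is the last column
    return sum(1 for detail in details if detail.startswith("SCAN"))


in_scans = count_scans(
    "SELECT email FROM tClient WHERE timezoneID IN (1, 4, 9)"
)
union_all_scans = count_scans(
    "SELECT email FROM tClient WHERE timezoneID = 1 "
    "UNION ALL SELECT email FROM tClient WHERE timezoneID = 4 "
    "UNION ALL SELECT email FROM tClient WHERE timezoneID = 9"
)
print("IN scans:", in_scans)
print("UNION ALL scans:", union_all_scans)
```

With an index on timezoneID the picture changes, since each branch can become an index lookup instead of a scan, so this only illustrates the unindexed case.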

+1

I think some very important information is missing from the question. First of all, it matters a great deal whether timezoneID is indexed or not, whether it is part of the primary key, and so on. I would advise everyone to take a look at the analyzer, but in my experience the WHERE clause should be faster, especially with an index. The logic is roughly this: a union query carries additional overhead, such as checking the types and the number of columns in each part.

+1

In the book "SQL Performance Tuning", the authors found that UNION queries were slower in all 7 DBMSs they tested (SQL Server 2000, Sybase ASE 12.5, Oracle 9i, DB2, etc.): http://books.google.com/books?id=3H9CC54qYeEC&pg=PA32&vq=UNION&dq=sql+performance+tuning&source=gbs_search_s&sig=ACfU3U18uYZWYVHxr2I3uUj8kmPz9RpmiA#PPA33,M1

Later DBMS versions may have optimized that difference away, but it is doubtful. In addition, the UNION approach is much longer and harder to maintain than IN (what if you want to add another value?).

If you have no reason to use UNION, stick with the OR/IN approach.

+1

Some DBMS query optimizers rewrite your query to make it more efficient, so depending on the DBMS you are using, the choice probably does not matter.

0

Source: https://habr.com/ru/post/1276421/