I need to select rows from the BUNDLES table that have one of several SAP_STATE_ID values. These values depend on whether the corresponding SAP status should be exported or not.
This query runs very quickly (there is an index on the SAP_STATE_ID column):
SELECT b.* FROM BUNDLES b WHERE b.SAP_STATE_ID IN (2,3,5,6)
But I would like to get the list of IDs dynamically, for example:
SELECT b.* FROM BUNDLES b WHERE b.SAP_STATE_ID IN (SELECT s.SAP_STATE_ID FROM SAP_STATES s WHERE s.EXPORT_TO_SAP = 1)
And this query suddenly takes far too long. I would expect SQL Server to run the subquery first (it does not depend on anything from the outer query) and then execute the rest as in my first example. I tried rewriting it to use a join instead of a subquery:
SELECT b.* FROM BUNDLES b JOIN SAP_STATES s ON (s.SAP_STATE_ID = b.SAP_STATE_ID) WHERE s.EXPORT_TO_SAP = 1
but it has the same poor performance. It seems to be running the subquery once for each row of the BUNDLES table, or something like that. I do not really know how to read execution plans, but I tried. It says that 81% of the cost goes to scanning the BUNDLES primary key index. I have no idea why it would do this: BUNDLE_ID is defined as the PRIMARY KEY, but it does not appear anywhere in the query.
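To put numbers behind "fast" and "slow", the two variants can be run side by side with I/O and timing statistics switched on. This is only a sketch of how the comparison could be made; it uses the table and column names from the queries above and assumes nothing else about the schema.

```sql
-- Sketch only: report logical reads and elapsed time for each statement
-- in the Messages tab, so the two variants can be compared directly.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant with the static list (the fast one)
SELECT b.*
FROM BUNDLES b
WHERE b.SAP_STATE_ID IN (2,3,5,6);

-- Variant with the subquery (the slow one)
SELECT b.*
FROM BUNDLES b
WHERE b.SAP_STATE_ID IN (SELECT s.SAP_STATE_ID
                         FROM SAP_STATES s
                         WHERE s.EXPORT_TO_SAP = 1);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical reads reported for BUNDLES in each case should show whether the slow variant really reads the whole table.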
Does anyone have an explanation for why SQL Server is so "stupid" here? Is there a way to achieve what I want with good performance, but without having to provide a static list of SAP_STATE_ID values?
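One experiment that might narrow this down is to ask the optimizer to seek on BUNDLES rather than scan it. The sketch below assumes SQL Server 2008 or later, where the FORCESEEK table hint is available; whether it actually helps here is not confirmed, and it is meant as a diagnostic rather than a fix.

```sql
-- Sketch only (assumes SQL Server 2008+): force an index seek on BUNDLES
-- to see whether the plan and timing approach those of the static IN list.
SELECT b.*
FROM BUNDLES b WITH (FORCESEEK)
JOIN SAP_STATES s ON (s.SAP_STATE_ID = b.SAP_STATE_ID)
WHERE s.EXPORT_TO_SAP = 1;
```

If this version is fast, the slowdown would seem to come from the optimizer's choice of a scan rather than from the dynamic list itself.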
Script for both tables and their indexes: http://mab.to/xbYiI0wKj
Execution plan for the subquery version: http://mab.to/8Qh6gpdYZ
Execution plan for the join version: http://mab.to/YCqeGCUbr
(for some reason, these two plans look the same, and both suggest creating the BUNDLES.SAP_STATE_ID index that already exists)