I originally wrote:
The first version of the query is MUCH less readable to me, especially since you don't bother aliasing the matching column inside the correlated subquery. JOINs are much clearer.
I still stand by those statements, but I would like to add to my original answer based on the new information added to the question. You asked whether there are general rules or theories about which performs better, TOP (1) or JOIN, leaving aside readability and preference. I will restate, as I commented, that no, there are no general rules or theories. When you have a specific example, it is very easy to prove which one works better. Let's take these two queries, similar to yours, but which run against system objects that we can all verify:
-- query 1:
SELECT name,
  (SELECT TOP (1) [object_id]
     FROM sys.all_sql_modules
     WHERE [object_id] = o.[object_id])
FROM sys.all_objects AS o;

-- query 2:
SELECT o.name, m.[object_id]
FROM sys.all_objects AS o
LEFT OUTER JOIN sys.all_sql_modules AS m
  ON o.[object_id] = m.[object_id];
They return identical results (3,179 rows on my system), and by that I mean the same data and the same number of rows. One clue that they are not really the same query (or at least do not get the same execution plan) is that the results come back in a different order. Although I would not expect any particular order to be maintained or respected, because I did not include ORDER BY anywhere, I would expect SQL Server to pick the same order if the two queries were in fact using the same plan.
But they are not. We can see this by inspecting the plans and comparing them. In this case I will use SQL Sentry Plan Explorer, a free execution plan analysis tool from my company. You can get some of this information from Management Studio, but other parts are much easier to get at in Plan Explorer (for example, actual duration and CPU). The top plan is the subquery version; the bottom is the join:

[Images: execution plans for both queries, subquery version on top, join version on the bottom]
In the actual execution plans, 85% of the total cost of running the two queries is in the subquery version. That makes it more than 5 times as expensive as the join. Both CPU and I/O are much higher with the subquery version; look at all those reads! 6,600+ pages to return ~3,000 rows, while the join version returns the same data with far less I/O: only 110 pages.
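If you want to check the read counts yourself without Plan Explorer, one option is to turn on SET STATISTICS IO and SET STATISTICS TIME in Management Studio and run both queries in the same batch. This is only a sketch of how I would approach it; the numbers you get will differ from mine depending on version and how many objects your instance has, and because sys.all_objects and sys.all_sql_modules are views, the output lists the underlying system tables rather than the view names.

```sql
-- Sketch: compare logical reads and CPU time for the two queries.
-- The figures are written to the Messages tab, not the results grid.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- query 1: correlated subquery with TOP (1)
SELECT name,
  (SELECT TOP (1) [object_id]
     FROM sys.all_sql_modules
     WHERE [object_id] = o.[object_id])
FROM sys.all_objects AS o;

-- query 2: LEFT OUTER JOIN
SELECT o.name, m.[object_id]
FROM sys.all_objects AS o
LEFT OUTER JOIN sys.all_sql_modules AS m
  ON o.[object_id] = m.[object_id];

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Add up the "logical reads" values reported for each statement and compare the totals, along with the CPU time and elapsed time lines.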
Why is the subquery version so much more expensive? Because it works essentially like a scalar function: you go and grab the matching TOP row from the other table, but you do it for every row of the outer query. We can see that the operation is executed 3,179 times by looking at the "Top Operations" tab, which shows the number of executions for each operation. Once again, the more expensive subquery version is on top, and the join version is below:


I will spare you a more thorough analysis, but by and large, the optimizer knows what it is doing. State your intent (a join of this type between these tables) and 99% of the time it will work out on its own the best way to do it (that is, the execution plan). If you try to outsmart the optimizer, keep in mind that you are venturing into fairly advanced territory.
There are exceptions to every rule, but in this specific case, the subquery is definitely worse. Does that mean the syntax proposed in the first query is always a bad idea? Absolutely not. There may be obscure cases where the subquery version works just as well as the join. I can't think of many where the subquery will work better. So I would err on the side of the one that is likely to be as good or better and that is more readable. I don't see any benefit to the subquery version, even if you find it more readable, because it is likely to lead to worse performance.
In general, I highly recommend that you stick to the more readable, self-documenting syntax unless you find a case where the optimizer gets it wrong (and I would wager that in 99% of those cases the problem is bad statistics or parameter sniffing, not the query syntax). I would suspect that, outside of those cases, reproducible examples where convoluted queries work better than their more direct and logical equivalents will be quite rare. Your motivation for seeking out those cases should be about the same as your preference for unintuitive syntax over the generally accepted "best practice" syntax.