At the company where I now work, they run a PHP / MySQL relational database. I had always thought that if I needed to pull related information from different tables, I could just write a simple join, such as:
SELECT table_1.id, table_2.id FROM table_1 LEFT JOIN table_2 ON table_1.sub_id = table_2.id
When I started here, though, I found that this is what they do instead:
<?php
$query = mysql_query("SELECT sub_id FROM table_1");
while ($rs = mysql_fetch_assoc($query)) {
    // one extra query per row of the first result set
    $query_2 = mysql_fetch_assoc(mysql_query("SELECT * FROM table_2 WHERE id = '{$rs['sub_id']}'"));
}
When I asked why they use the second method, they said it runs faster than a join. They manage a database with millions of records across different tables, and some of the rows are fairly wide. They said they want to avoid joins because a poorly written query could lock a table (or several of them). Another thing to keep in mind is that this database drives a massive report builder that clients can use to create their own reports, and if a client goes crazy and builds a huge report, it can cause chaos.
I was confused, so I thought I would throw this out to the general programming public. This may be a matter of opinion, but is it really faster to run the while loop (one larger query to pull a lot of rows, followed by many small queries as you loop over them), or to do a join (one larger query that pulls all the necessary data at once)? As long as the indexes are set up correctly, does it even matter? One more thing to keep in mind: the tables in question use the InnoDB storage engine.
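For concreteness, this is the kind of quick side-by-side timing I had in mind when asking (just a sketch using the placeholder table names from above; it assumes a connection is already open and that both approaches fetch the same data):

<?php
// Approach 1: one joined query
$start = microtime(true);
$result = mysql_query("SELECT t1.id, t2.* FROM table_1 t1 LEFT JOIN table_2 t2 ON t1.sub_id = t2.id");
while ($row = mysql_fetch_assoc($result)) { /* consume row */ }
echo "join: " . (microtime(true) - $start) . " sec\n";

// Approach 2: one outer query plus one small query per row
$start = microtime(true);
$outer = mysql_query("SELECT sub_id FROM table_1");
while ($rs = mysql_fetch_assoc($outer)) {
    $row = mysql_fetch_assoc(mysql_query("SELECT * FROM table_2 WHERE id = '{$rs['sub_id']}'"));
}
echo "loop: " . (microtime(true) - $start) . " sec\n";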
Thanks!
Update 8/28/14
So, I figured I would post an update on this now that I have had a while to work with it. After this discussion, I decided to rebuild the report generator here at work. I don't have final benchmark numbers, but I thought I would share how it turned out.
I think I overdid it a bit, because I turned the entire report (which is quite dynamic in terms of the data it returns) into one massive join fest. Most of the joins, if not all, are on primary key values, so they all run very fast. Say the report has 30 columns of data and pulls 2,000 records: previously, each individual field ran its own query to retrieve its data (since that piece of data could live in another table). 30 x 2,000 = 60,000 queries, and even at a modest 0.0003 seconds per query, that added roughly 18 seconds to the total query time (which is pretty much what I remember). Now that I've rebuilt the query as one massive join on a bunch of primary keys (where possible), the same report loads in about 2-3 seconds, and most of that time is spent loading the HTML. Each returned record runs between 0 and 4 additional queries depending on the data it needs (it may not need any if everything can be fetched through the joins, which happens in about 75% of cases). So the same 2,000 records now generate an additional 0-8,000 queries (much better than 60,000).
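Roughly speaking, the rebuild replaced per-field lookups with a single query that joins everything it can on primary keys. This is only a sketch; records, field_data, record_id, and field_id are made-up names, not the actual report builder schema:

<?php
// before: one lookup per field per record (30 columns x 2,000 records = 60,000 queries)
while ($record = mysql_fetch_assoc($records)) {
    foreach ($field_ids as $field_id) {
        $value = mysql_fetch_assoc(mysql_query(
            "SELECT value FROM field_data
             WHERE record_id = '{$record['id']}' AND field_id = '{$field_id}'"));
    }
}

// after: one join per column, all on primary/indexed keys, in a single query
$result = mysql_query(
    "SELECT r.id, f1.value AS col_1, f2.value AS col_2 /* ... one join per column ... */
     FROM records r
     LEFT JOIN field_data f1 ON f1.record_id = r.id AND f1.field_id = 1
     LEFT JOIN field_data f2 ON f2.record_id = r.id AND f2.field_id = 2");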
I would say the while-loop approach is still useful in some cases, but, as noted in the comments below, benchmarking is what it really comes down to. In my case joins were the best option, but in other areas of my site the loop approach makes more sense. In one case, I have a report where a client can select several categories and pull data only for those categories. That ended up as a category_id IN (..., ..., ...) clause with 50-500 IDs, and the index choked and died on me when I left it all in the final query. So what I did was split the IDs into groups of 10 and run the same query x/10 times, and the results came back faster than before, because the index handles 10 IDs far better than 500. In that case I saw a significant improvement in my queries precisely because of the while-loop approach.
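In code, the chunking boiled down to something like this (a sketch; $category_ids and report_data stand in for my real variable and table names):

<?php
// run the same query once per group of 10 IDs instead of one IN() list of 50-500
$rows = array();
foreach (array_chunk($category_ids, 10) as $chunk) {
    $in = "'" . implode("','", array_map('mysql_real_escape_string', $chunk)) . "'";
    $result = mysql_query("SELECT * FROM report_data WHERE category_id IN ($in)");
    while ($row = mysql_fetch_assoc($result)) {
        $rows[] = $row;
    }
}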