Finding an unused join in an SQL query

I currently maintain a significant number of SQL queries. Some of them were created by copy and paste; afterwards the unnecessary fields were deleted, but sometimes the tables those fields came from were left in the joins.

I am looking for a tool (or anything other than eyes + brain) that, given an SQL query, will report which of the joined tables have no field selected in the SELECT part.

Do you know about such a tool?

thanks

+4
4 answers

A tool might hypothetically exist, but it would only be guaranteed to be correct if all of the following criteria were met for the join in question:

  • It is a LEFT (outer) JOIN, or an INNER JOIN whose cardinality is known to be 1-1, and ...
  • The joined table is not referenced in the SELECT, HAVING, GROUP BY or WHERE clauses, and ...
  • No function with a side effect is used in the JOIN ...

This is probably why SQL parsers do not issue deterministic warnings for this the way, say, the C# compiler warns about an unused variable. But it might be worth creating an SQL validator that looks for some of these conditions and lets the user know that there is room for optimization.
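A validator along those lines could start out as a plain text heuristic. The sketch below is only an illustration, not an existing tool: the function name unused_join_candidates and both regular expressions are made up for this example, it handles only aliased tables, and it does not understand subqueries, string literals or comments. It checks just the second criterion, namely that the alias appears nowhere outside its own ON clause.

```python
import re

# Hypothetical heuristic: matches "JOIN <table> [AS] <alias> ON".
JOIN_HEAD = re.compile(r"\bjoin\s+(\w+)\s+(?:as\s+)?(\w+)\s+on\b", re.IGNORECASE)
# Keywords assumed to end an ON clause (next join or next query clause).
CLAUSE_END = re.compile(
    r"\b(?:inner|left|right|full|cross|join|where|group\s+by|having|order\s+by)\b",
    re.IGNORECASE,
)

def unused_join_candidates(sql):
    """Return (table, alias) pairs whose alias is used only in its own ON clause."""
    candidates = []
    for head in JOIN_HEAD.finditer(sql):
        table, alias = head.group(1), head.group(2)
        end = CLAUSE_END.search(sql, head.end())
        on_end = end.start() if end else len(sql)
        # Cut this join's ON clause out and look for "alias." in what remains.
        rest = sql[:head.end()] + " " + sql[on_end:]
        if not re.search(r"\b" + re.escape(alias) + r"\.", rest, re.IGNORECASE):
            candidates.append((table, alias))
    return candidates

query = """
SELECT c.CustomerName
FROM Customer c
INNER JOIN Sales s ON c.CustomerID = s.CustomerID
LEFT JOIN Region r ON r.RegionID = c.RegionID
WHERE s.SalesDate >= '2011-01-01'
"""
print(unused_join_candidates(query))
```

Anything it reports is a candidate for review only: the join type and cardinality still have to be checked by hand, because an inner join that filters or multiplies rows will be flagged even though removing it changes the result.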

+3

Just because no fields from a table appear in the SELECT list does not mean that the join is unimportant to the query logic; the results may change if the join is removed.

Consider this simple example: return the names of all customers who purchased a product in 2011.

SELECT DISTINCT c.CustomerName
FROM Customer c
INNER JOIN Sales s
    ON c.CustomerID = s.CustomerID
    AND s.SalesDate >= '2011-01-01'
    AND s.SalesDate < '2012-01-01'

No columns from the Sales table are returned in the SELECT list, but the join is critical to returning the correct result set.

Bottom line: I think you will still need a human eye / brain review to clean things up properly.
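To see the effect concretely, here is a small sketch using an in-memory SQLite database from Python. The table layout and the two customers are invented for illustration; only the shape of the query follows the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Sales (SalesID INTEGER PRIMARY KEY, CustomerID INTEGER, SalesDate TEXT);
-- Made-up sample data: Alice bought in 2011, Bob only in 2010.
INSERT INTO Customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Sales VALUES (10, 1, '2011-03-15'), (11, 2, '2010-07-01');
""")

# The join selects no Sales columns but still restricts the rows.
with_join = conn.execute("""
SELECT DISTINCT c.CustomerName
FROM Customer c
INNER JOIN Sales s
    ON c.CustomerID = s.CustomerID
    AND s.SalesDate >= '2011-01-01'
    AND s.SalesDate < '2012-01-01'
ORDER BY c.CustomerName
""").fetchall()

# Same query with the "unused" join removed.
without_join = conn.execute("""
SELECT DISTINCT c.CustomerName
FROM Customer c
ORDER BY c.CustomerName
""").fetchall()

print(with_join)     # [('Alice',)]
print(without_join)  # [('Alice',), ('Bob',)]
```

Dropping the join brings Bob back into the result even though he bought nothing in 2011, so the join cannot be treated as dead code.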

+6

The function below replaces all selected fields with count(*), and its second part removes unnecessary joins. It works only with aliased tables, should be verified by hand on very complex queries, and does not work if there are subqueries in the join condition.

function sql_query_count($sql) {
    //replace select fields with count(*)
    $a = true;
    $b = 0;
    $first_select = stripos($sql, 'select ');
    $last_from = 0;
    $i = 0;
    while($a){
        $i++;
        $b = stripos($sql, ' from ',$last_from);
        $c = strripos(substr($sql, $last_from, $b), 'select ');
        if ($c == $first_select || $c === false || $i>100) $a = false;
        $last_from = $b+6;
    }
    if (stripos($sql, 'order by') !== false)
        $sql = substr($sql, 0, stripos($sql, 'order by'));
    $sql1 = 'select count(*) as c ' . substr($sql, $b);

    //remove unnecessary joins
    $joins = preg_split("/ join /i", $sql1);
    $join_count = count($joins);
    $join_type = '';
    if (count($joins)>1){
        for ($index = 0; $index < $join_count+2; $index++) {
            $sql_new = '';
            $where = '';
            $i = 0;
            foreach ($joins as $key => $value) {
                $i++;
                $parts = preg_split("/ where /i", trim($value));
                $value = $parts[0];
                unset($parts[0]);
                $where = implode(' where ', $parts);
                $occurence_count = 0;
                if ($i > 1) {
                    $a = explode(' on ', $value);
                    $c = preg_replace('!\s+!', ' ', trim($a[0]));
                    $c = explode(' ', $c);
                    $occurence_count = substr_count($sql1, ' '.$c[1].'.')+substr_count($sql1, '='.$c[1].'.');
                }
                $t = explode(' ', $value);
                $j = '';
                if (trim(strtolower($t[count($t) - 1])) == 'inner'){
                    $j = 'inner';
                    unset($t[count($t) - 1]);
                } else if (trim(strtolower($t[count($t) - 2])).' '.trim(strtolower($t[count($t) - 1])) == 'left outer'){
                    $j = 'left outer';
                    unset($t[count($t) - 1]);
                    unset($t[count($t) - 1]);
                }
                if ($occurence_count == 0 || $occurence_count > 1)
                    $sql_new.= ' '.$join_type.(($join_type!='')?' join ':'').implode(' ', $t);
                $join_type = $j;
            }
            $sql_new .= ' where '.$where;
            $sql1 = $sql_new;
            $joins = preg_split("/ join /i", $sql1);
        }
    }
    return $sql1;
}

0

As mentioned above, identifying redundant INNER JOINs is problematic, since they sometimes affect the returned data even if no columns are actually selected from those tables.

That said, identifying redundant LEFT JOINs is possible. I use this automatic query optimizer to automatically optimize SQL queries. Among other things, it can identify redundant left joins.
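Why LEFT JOINs are the tractable case can be shown with a small SQLite sketch in Python. It does not show how any particular optimizer works; the schema and rows are invented. A left join never drops rows from the left table, so if the join key is unique on the joined table and none of its columns are referenced, removing the join leaves the result unchanged. If the key is not unique, the join can still multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, RegionID INTEGER);
CREATE TABLE Region (RegionID INTEGER PRIMARY KEY, RegionName TEXT);
CREATE TABLE Sales (SalesID INTEGER PRIMARY KEY, CustomerID INTEGER);
-- Made-up sample data: Alice has a region and two sales, Bob has neither.
INSERT INTO Region VALUES (1, 'North');
INSERT INTO Customer VALUES (1, 'Alice', 1), (2, 'Bob', NULL);
INSERT INTO Sales VALUES (10, 1), (11, 1);
""")

def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

base = names("SELECT c.CustomerName FROM Customer c ORDER BY c.CustomerID")

# Left join on Region's primary key: at most one match per customer.
unique_left = names("""
SELECT c.CustomerName FROM Customer c
LEFT JOIN Region r ON r.RegionID = c.RegionID
ORDER BY c.CustomerID
""")

# Left join on a non-unique key: one output row per matching sale.
many_left = names("""
SELECT c.CustomerName FROM Customer c
LEFT JOIN Sales s ON s.CustomerID = c.CustomerID
ORDER BY c.CustomerID
""")

print(base)         # ['Alice', 'Bob']
print(unique_left)  # ['Alice', 'Bob']
print(many_left)    # ['Alice', 'Alice', 'Bob']
```

So even for left joins, a tool needs to know about the uniqueness of the join key (or about a DISTINCT that hides the duplicates) before it can call the join redundant.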

0

Source: https://habr.com/ru/post/1346929/

