Is closing a PreparedStatement after a single execution a design flaw?

Question

I have looked in various places and heard a lot of dubious claims, ranging from the claim that PreparedStatement should be preferred over Statement in all cases, if only for the performance benefit, to the claim that PreparedStatement should be used exclusively for batched statements and nothing else.

However, there seems to be a blind spot in the (mostly online) discussions I have followed. Let me present a concrete scenario.

We have an EDA-designed (event-driven architecture) application with a database connection pool. Events come in; some of them require persistence, some do not. Some are generated artificially (e.g. update/reset something every X minutes). Some events arrive and are processed sequentially, but other types of events (also requiring persistence) can (and will) be processed concurrently.

Apart from those artificially created events, there is no structure in how events requiring persistence arrive.

This application was developed a long time ago (around 2005) and supports several DBMSs. A typical event handler (where persistence is required) does the following:

get a connection from the pool
prepare the SQL statement
execute the prepared statement
process the result set, if applicable, and close it
close the prepared statement
prepare another statement, if necessary, and process it in the same way
return the connection to the pool
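
As an illustration only, the steps above could be sketched in Java roughly as follows. The class name OrderEventHandler, the orders table and its columns are hypothetical and do not come from the application described here; the sketch assumes the pool is exposed as a javax.sql.DataSource.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical handler: the table and column names are illustrative only.
final class OrderEventHandler {
    private final DataSource pool;

    OrderEventHandler(DataSource pool) {
        this.pool = pool;
    }

    /** Returns the status stored for the given order, or null if no row matches. */
    String loadStatus(long orderId) throws SQLException {
        // try-with-resources closes in reverse order: result set, statement,
        // then the connection, whose close() hands it back to the pool.
        try (Connection conn = pool.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT status FROM orders WHERE id = ?")) {
            ps.setLong(1, orderId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```

Each call prepares the statement, executes it once and closes it, which is exactly the pattern the question is about.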

If the event requires batch processing, the statement is prepared once and the addBatch / executeBatch methods are used. This is an obvious performance advantage, and these cases are not what this question is about.

Recently, I was told that preparing (parsing) a statement, executing it once and closing it is essentially a misuse of PreparedStatement: that it provides zero performance benefit, regardless of whether server-side or client-side prepared statements are used, and that typical DBMSs (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to their prepared statement cache (or at least their default JDBC driver / data source will not).

In addition, I had to test certain scenarios in a dev environment on MySQL, and it seems that the Connector/J usage advisor agrees with this idea. For all non-batched prepared statements, calling close() prints:

PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times

Due to the choice of application design outlined earlier, having a PreparedStatement instance cache that contains every single SQL statement used by any event for every connection in the connection pool sounds like a bad choice.

Can anyone elaborate on this in more detail? Is the "prepare, execute (once), close" logic flawed and essentially discouraged?

PS: Explicitly specifying useUsageAdvisor=true and cachePrepStmts=true for Connector/J, with either useServerPrepStmts=true or useServerPrepStmts=false, still produces the performance warning when calling close() on PreparedStatement instances for every non-batched SQL statement.
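
For reference, a minimal sketch of how the settings named above could be passed to the driver as connection properties. The property names are the ones quoted in the question; the class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper; only the three property names come from the question.
final class ConnectorJSettings {
    /** Builds connection properties with the usage advisor and statement cache enabled. */
    static Properties usageAdvisorProperties(boolean serverSidePrepare) {
        Properties props = new Properties();
        props.setProperty("useUsageAdvisor", "true");
        props.setProperty("cachePrepStmts", "true");
        props.setProperty("useServerPrepStmts", Boolean.toString(serverSidePrepare));
        return props;
    }
}
```

The result could then be passed to DriverManager.getConnection(url, props), where url is your own MySQL JDBC URL.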


java jdbc eda

afk5min Oct 11 '15 at 13:39


4 answers

Gord thompson · Answer 1 · 2015-10-11T14:49:40+0000

Is the "prepare, execute [once], close" logic flawed and essentially discouraged?

I do not see that as a problem per se. The SQL statement has to be "prepared" at some point, either explicitly (with a PreparedStatement) or "on the fly" (with a Statement). There may be slightly more overhead if we use a PreparedStatement instead of a Statement for something that will be executed only once, but it is unlikely that the overhead will be significant, especially if the claim you quote is true:

typical DBMSs (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to their prepared statement cache (or at least their default JDBC driver / data source will not).

What is discouraged is a pattern like this:

 for (int thing : thingList) {
     PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
     ps.setInt(1, thing);
     ps.executeUpdate();
     ps.close();
 }

because the PreparedStatement is used only once and the same SQL statement is prepared over and over. (Although even this may not matter much if the SQL statement and its execution plan are indeed cached.) A better way to do this is

 PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
 for (int thing : thingList) {
     ps.setInt(1, thing);
     ps.executeUpdate();
 }
 ps.close();

... or even better, with "try with resources" ...

 try (PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ")) {
     for (int thing : thingList) {
         ps.setInt(1, thing);
         ps.executeUpdate();
     }
 }

Note that this is true even without using batch processing. The SQL statement is still prepared once and used several times.

Marmite bomber · Answer 2 · 2015-10-11T15:59:01+0000

As already noted, the most expensive part is parsing the statement in the database. Some database systems (this is highly database-dependent; I will mainly speak about Oracle) can profit if the statement has already been parsed and is in the shared pool. (In Oracle terminology this is called a soft parse, which is cheaper than a hard parse, i.e. parsing a new statement.) You can profit from a soft parse even if you use the prepared statement only once.

Thus, the important task is to give the database a chance to reuse the statement. A typical counterexample is the handling of a collection-based IN list in Hibernate. You end up with a statement such as

  .. FROM T WHERE X in (?,?,?, … length based on the size of the collection,?,? ,?,?)

You cannot reuse this statement if the collection size is different.
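
To make the point concrete, here is a minimal sketch (not Hibernate's actual code) of how such a statement text is typically built. The class name InListSql and the leading SELECT * are assumptions added for illustration; only the FROM T WHERE X IN (...) part comes from the example above.

```java
import java.util.Collections;

// Hypothetical helper that builds an IN-list query for a given collection size.
final class InListSql {
    /** Returns the query text with one placeholder per collection element. */
    static String forSize(int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be at least 1");
        }
        // One "?" per element, so the text depends on the collection size.
        String placeholders = String.join(",", Collections.nCopies(size, "?"));
        return "SELECT * FROM T WHERE X IN (" + placeholders + ")";
    }
}
```

Because forSize(2) and forSize(3) produce different SQL text, the database sees them as two distinct statements and has to parse each one separately.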

A good starting point for getting an overview of the range of SQL statements generated by a running application is (for Oracle) the V$SQL view. Filter on PARSING_SCHEMA_NAME with your connection pool user and check SQL_TEXT and the EXECUTIONS count.

Two extreme situations should be avoided:

Passing parameters (IDs) as literals in the query text (this is well known), and
Reusing a statement for different access paths.

An example of the latter is a query that, when the parameter is provided, uses index access to a limited part of the table, whereas without the parameter all records must be processed (full table scan). In that case it is no problem at all to create two different statements (since parsing them leads to different execution plans).
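
A minimal sketch of that idea, assuming a hypothetical customer table with a region column: instead of one catch-all statement, the code picks between two constant statement texts, each of which can be parsed and planned on its own.

```java
// Hypothetical example: table, columns and class name are illustrative only.
final class CustomerQueries {
    // Selective access; the optimizer can use an index on region.
    static final String BY_REGION = "SELECT id, name FROM customer WHERE region = ?";
    // No filter; a full table scan is expected.
    static final String ALL = "SELECT id, name FROM customer";

    /** Picks the statement text that matches the intended access path. */
    static String select(String regionOrNull) {
        return regionOrNull != null ? BY_REGION : ALL;
    }
}
```

Both texts are constants, so each of them still gives the database a chance to reuse its parsed form.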

Bohemian · Answer 3 · 2015-10-11T14:48:42+0000

PreparedStatements are preferable because one is needed regardless of whether you create it programmatically or not; internally, the database creates one every time a query is executed, and creating one programmatically just gives you a handle to it. Creating and discarding a PreparedStatement each time does not add overhead compared with using a Statement.

Creating one requires a lot of effort from the database (parsing, permission checking, optimization, access strategy, etc.). Reusing one bypasses this effort for subsequent executions.

Instead of discarding them, try either writing the query in such a way that it can be reused, for example by ignoring null input parameters:

 where someCol = coalesce(?, someCol)

(so if you set the parameter to null, i.e. "unspecified", the condition succeeds)

or, if you absolutely must build the query every time, keep references to the PreparedStatements in a Map, with the built query as the key, and reuse them when you get a hit. Use a WeakHashMap<String, PreparedStatement> as the map implementation to avoid running out of memory.
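
A rough sketch of such a cache, under stated assumptions: it is kept per connection, it is not thread-safe, and it uses a size-bounded LinkedHashMap in access order rather than the WeakHashMap mentioned above, so that evicted statements can be closed explicitly. That substitution and the StatementCache name are choices made for this sketch, not part of the answer.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-connection cache keyed by SQL text; not thread-safe. */
final class StatementCache implements AutoCloseable {
    private final Connection conn;
    private final Map<String, PreparedStatement> cache;

    StatementCache(Connection conn, int maxEntries) {
        this.conn = conn;
        // accessOrder = true keeps the least recently used entry first.
        this.cache = new LinkedHashMap<String, PreparedStatement>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, PreparedStatement> eldest) {
                if (size() <= maxEntries) {
                    return false;
                }
                closeQuietly(eldest.getValue()); // release the evicted statement
                return true;
            }
        };
    }

    /** Returns a cached statement for this SQL text, preparing it on a miss. */
    PreparedStatement get(String sql) throws SQLException {
        PreparedStatement ps = cache.get(sql);
        if (ps == null || ps.isClosed()) {
            ps = conn.prepareStatement(sql);
            cache.put(sql, ps);
        }
        return ps;
    }

    /** Closes every cached statement; the connection itself is left open. */
    @Override
    public void close() {
        cache.values().forEach(StatementCache::closeQuietly);
        cache.clear();
    }

    private static void closeQuietly(PreparedStatement ps) {
        try {
            ps.close();
        } catch (SQLException ignored) {
            // Nothing useful can be done if closing fails.
        }
    }
}
```

Callers must not close the statements they get from the cache. With a connection pool, one such cache per pooled connection would be needed, which is the cost the question is worried about.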

Marmite bomber · Answer 4 · 2015-10-11T16:23:22+0000

PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times

I think you can safely ignore this warning. It is similar to a claim that it is more efficient to work for the first 40 hours of the week, then sleep for the next 56 hours, then eat for the following 7 hours, and have the rest as free time.

You need exactly one execution per event. Should you perform 50 just to get a better average?
