ResultSet: getting column values by index compared to retrieving by label

When using JDBC, I often come across constructions like

ResultSet rs = ps.executeQuery();
while (rs.next()) {
    int id = rs.getInt(1);
    // Some other actions
}

I have asked myself (and the authors of the code) why not use labels to get column values:

 int id = rs.getInt("CUSTOMER_ID"); 

The best explanation I've heard is something to do with performance. But does it really make processing noticeably faster? I don't believe so, although I have never measured it. Even if retrieving by label is a bit slower, it provides better readability and flexibility, in my opinion. So can someone give me a good explanation of whether to avoid getting column values by column index in favor of the column label? What are the pros and cons of both approaches (perhaps with respect to particular DBMSs)?

+48
java optimization jdbc maintenance resultset
Oct 09 '08 at 11:18
13 answers

You should use string labels by default.

Pros:

  • Column Order Independence
  • Improved readability / maintainability

Cons:

  • You have no control over the column names (for example, when access is via stored procedures)

Which would you prefer?

Ints?

int i = 1;
customerId = resultSet.getInt(i++);
customerName = resultSet.getString(i++);
customerAddress = resultSet.getString(i++);

or strings?

customerId = resultSet.getInt("customer_id");
customerName = resultSet.getString("customer_name");
customerAddress = resultSet.getString("customer_address");

And what if a new column is inserted at position 1? Which code would you prefer then? Or, if the column order changes, which version of the code would you need to change at all?

That is why you should use string labels by default.

+43
09 Oct '08 at 14:07

Warning: I'm going to get pompous here because it drives me crazy.

99%* of the time, this is a ridiculous micro-optimization that people vaguely believe makes things "better". It completely ignores the fact that unless you are in an extremely tight and busy loop over millions of SQL results all the time, which is hopefully rare, you will never notice it. For everyone who is not doing that, the developer time spent maintaining, updating, and fixing bugs in the column indexing far outweighs the incremental hardware cost of your infinitesimally slower application.

Don't code optimizations like this in. Code for the person who maintains it. Then observe, measure, analyze, and optimize. Observe again, measure again, analyze again, and optimize again.

Optimization is pretty much the last step in development, not the first.

* Figure is made up.

+56
Oct 09 '08 at 11:26

An answer has been accepted; nonetheless, here is some additional information and personal experience that I have not yet seen mentioned.

Use column names (constants rather than literals are preferred) in general and wherever possible. This is clearer, easier to maintain, and future changes are less likely to break the code.

There is, however, a use for column indexes. In some cases they are faster, but not by enough to override the above reasons for names*. They are very valuable when developing tools and general methods that deal with ResultSets. Finally, an index may be required because the column has no name (for example, an unnamed aggregate) or because there are duplicate names, so there is no easy way to refer to both.

* Note that I have written some JDBC drivers and looked inside some open source ones; internally, they use column indexes to reference the result columns. In all the cases I have worked with, the driver first maps a column name to an index. So you can easily see that, in all of those cases, using the column name will always take longer. This may not be true for all drivers.

+6
Feb 04 '10 at 5:10

From the Java documentation:

The ResultSet interface provides getter methods (getBoolean, getLong, etc.) for extracting column values from the current row. Values can be obtained using either the column index number or the column name. In general, using a column index will be more efficient. The columns are numbered from 1. For maximum portability, the columns of the result set in each row should be read in order from left to right, and each column should be read only once.

Of course, each method (named or indexed) has its place. I agree that named columns should be the default. However, in cases where a huge number of loop iterations is required, and where the SELECT statement is defined and maintained in the same section of code (or class), indexes should be fine. It is advisable to list the columns being selected rather than writing "SELECT * FROM ...", since otherwise any change to the table would break the code.

+5
Jun 26 '13 at 6:27

Of course, using column names improves readability and simplifies maintenance. But using column names has a downside. As you know, SQL allows multiple columns with the same name, so there is no guarantee that the column name you pass to the resultSet getter method actually points to the column you intend to access. In theory, using indexes instead of column names would be preferable, but it reduces readability ...

Thanks

+4
Feb 04 '10 at 4:57

I don't think that using labels has much of an impact on performance. But there is another reason not to use Strings. Or ints, for that matter.

Consider using constants. Using an int constant makes the code more readable and also less likely to contain errors.

Besides being more readable, a constant also prevents you from making typos in the label names: the compiler will raise an error if you do. And any IDE worth anything will pick it up. This is not the case if you use String or int literals.
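A minimal sketch of the constants idea, using String constants for labels (the same pattern works for int indexes). The names CustomerColumns, CustomerReader, and describe are hypothetical, and the column labels are borrowed from the customer example in an earlier answer:

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical holder for the column labels of one query.
final class CustomerColumns {
    static final String ID = "customer_id";
    static final String NAME = "customer_name";

    private CustomerColumns() {}
}

// Hypothetical reader that refers to columns only through the constants.
final class CustomerReader {
    // Formats the current row; a misspelled constant name fails at compile time,
    // whereas a misspelled string literal would only fail at run time.
    static String describe(ResultSet rs) throws SQLException {
        return rs.getInt(CustomerColumns.ID) + ": " + rs.getString(CustomerColumns.NAME);
    }
}
```

Writing CustomerColumns.NMAE is rejected by the compiler, while the literal "customer_nmae" would compile and fail only when the query runs.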

+2
Oct 09 '08 at 11:27

I did some performance profiling on this exact subject on an Oracle database. Our code has a ResultSet with many columns and a huge number of rows. Of the 20 seconds (!) the request takes to execute, the method oracle.jdbc.driver.ScrollableResultSet.findColumn(String name) takes about 4 seconds.

Obviously there is something wrong with the overall design, but using indexes instead of column names would probably take those 4 seconds away.

+2
Dec 31 '09 at 9:06

You can get the best of both! The speed of using indexes with the maintainability and safety of using column names.

First, if you don't loop through the result set, just use the column names.

  • Define a set of integer variables, one for each column that you will access. The variable names can include the column name, e.g. iLast_Name.

  • Before the result set loop, iterate through the column metadata and set the value of each integer variable to the column index of the corresponding column name. If the index of the Last_Name column is 3, set the value of iLast_Name to 3.

  • In the result set loop, use the integer variable names in the GET / SET methods. The variable name is a visual clue to the developer / maintainer as to the actual name of the column being accessed, but the value is the column index and will give better performance.

NOTE. The initial mapping (that is, the column name to index conversion) is performed only once before the loop, and not for each record and column in the loop.
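The steps above can be sketched roughly as follows. This version swaps the metadata loop for the standard ResultSet.findColumn call, which performs the same label-to-index mapping; the class and method names (CachedIndexExample, readLastNames) are hypothetical, and Last_Name is the column from the example above:

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: resolve the label once, then read by index.
final class CachedIndexExample {
    static List<String> readLastNames(ResultSet rs) throws SQLException {
        // One-time mapping before the loop: findColumn returns the 1-based
        // index of the column with the given label.
        final int iLastName = rs.findColumn("Last_Name");

        List<String> lastNames = new ArrayList<>();
        while (rs.next()) {
            // Inside the loop only the index is used; the variable name still
            // tells the reader which column is being accessed.
            lastNames.add(rs.getString(iLastName));
        }
        return lastNames;
    }
}
```

If the query stops returning a Last_Name column, findColumn throws an SQLException before the first row is read, so the failure is no later than it would be with label-based getters.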

+2
Nov 28 '10 at 21:00

The JDBC driver takes care of the column-to-index lookup. So if you retrieve values by column name, the driver performs a lookup each time (usually in a hash map) to find the index corresponding to the column name.
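To illustrate the kind of lookup described here, below is a rough sketch of a label-to-index map. It is not taken from any real driver; LabelIndex is a hypothetical class, and the two behaviors it models (case-insensitive labels, first match wins for duplicated names) follow the documented ResultSet contract:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of a driver-side label lookup, not real driver code.
final class LabelIndex {
    private final Map<String, Integer> indexByLabel = new HashMap<>();

    // labels are given in result-set order; JDBC column indexes start at 1.
    LabelIndex(List<String> labels) {
        for (int i = 0; i < labels.size(); i++) {
            // putIfAbsent keeps the first occurrence when a label is duplicated.
            indexByLabel.putIfAbsent(labels.get(i).toLowerCase(Locale.ROOT), i + 1);
        }
    }

    // Every label-based getter would pay for this hash lookup before reading by index.
    int findColumn(String label) {
        Integer index = indexByLabel.get(label.toLowerCase(Locale.ROOT));
        if (index == null) {
            throw new IllegalArgumentException("No such column: " + label);
        }
        return index;
    }
}
```

The per-call cost is a string normalization plus a hash lookup, which is small but is paid for every column of every row when labels are used inside the loop.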

+1
09 Oct '08 at 11:24

I agree with the previous answers that performance is not something that should force us to choose either approach. It would be better to consider the following things instead:

  • Code readability: for every developer reading your code, labels make much more sense than indexes.
  • Maintenance: think about the SQL query and how it is maintained. What is more likely to happen in your case after fixing, improving, or refactoring the SQL query: a change in the order of the extracted columns, or a change in the result column names? It seems to me that a change in column order (as a result of adding or removing columns in the result set) is more likely.
  • Encapsulation: whichever you choose, try to isolate the code that runs the SQL query and parses its result in a single component, and make only this component aware of the column names and their mapping to indexes (if you decide to use them).
0
Oct 09 '08 at 12:20

Using an index is an attempt at optimization.

The time saved by this is wasted on the extra effort it takes the developer to look up the necessary data and check that their code still works properly after changes.

I think it is our built-in instinct to use numbers instead of text.

0
Oct 09 '08 at 12:29

In addition to the lookup in a Map for labels, it also leads to the creation of extra Strings. Though that will happen on the stack, it still carries a cost.

It all comes down to individual choice, and to date I have used only indexes :-)

0
Dec 19 '08 at 9:20

As noted by other authors, I would stick with column names unless you have a really good reason not to. The performance impact is negligible compared, for example, to query optimization. In this case, maintenance is much more important than a small optimization.

0
Apr 15 '19 at 14:47


