Java: how to read from a database efficiently?

I am trying to read a column from a SQL database in Java, and I want the result to be returned as an array. Here is the function:

    public static double[] open(Connection conn, String symbol, int startdate, int enddate) throws SQLException {
        int id = database.get_stockid(conn, symbol);
        Statement stat = conn.createStatement();
        ResultSet rs = stat.executeQuery("select price_open from stock_data where stock_id=" + id + " and date>=" + startdate + " and date<=" + enddate + ";");
        ArrayList<Double> data = new ArrayList<Double>();
        while (rs.next()) {
            // The label must match the selected column, price_open
            data.add(rs.getDouble("price_open"));
        }
        double[] data1 = new double[data.size()];
        for (int a = 0; a < data1.length; ++a) {
            data1[a] = data.get(a);
        }
        return data1;
    }

This is pretty slow: it takes 1.5 seconds with my SQLite database. Is this the standard way to read a column, or am I doing something wrong? This is a bottleneck in my application, so I need it to be as fast as possible.


Edited: Thank you. I just found out that the ArrayList is not causing the problem. The bottleneck seems to be in the SQL part: loading just 10 days of data takes as long as loading 10 years of data. So I have to improve my SQL, but how?

Here's the improved code:

    public static double[] open(Connection conn, String symbol, int startdate, int enddate) throws SQLException {
        int id = database.get_stockid(conn, symbol);
        PreparedStatement stat = conn.prepareStatement("select price_open from stock_data where (stock_id=" + id + ") and (date between " + startdate + " and " + enddate + ");");
        ResultSet rs = stat.executeQuery();
        ArrayList<Double> data = new ArrayList<Double>();
        while (rs.next()) {
            data.add(rs.getDouble(1));
        }
        double[] data1 = new double[data.size()];
        for (int a = 0; a < data1.length; ++a) {
            data1[a] = data.get(a);
        }
        return data1;
    }
+4
3 answers
  • Replace

     double[] data1 = new double[data.size()]; for(int a = 0; a < data1.length; ++a) { data1[a]=data.get(a); } 

    with

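     // Java 8+. Note that data.toArray(new double[data.size()]) does not compile,
     // because toArray needs an object array, not a primitive one.
     double[] data1 = data.stream().mapToDouble(Double::doubleValue).toArray(); 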
  • Check what the query execution time is (by profiling the application or inspecting logs on the database side) and check whether it can be reduced, for example by adding indexes on the columns used in the where clause, i.e. stock_id and date.

  • If you can estimate the number of records your query returns, or you know that it will be at least N records, then instead of:

     ArrayList<Double> data = new ArrayList<Double>(); 

    use:

     ArrayList<Double> data = new ArrayList<Double>(AMOUNT_OF_RECORDS); 

    This prevents the ArrayList from repeatedly growing (allocating a larger backing array and copying the elements from the old array into it).

    BTW. For the ArrayList class, the default initial capacity is 10.

  • Are the results returned from your query unique? Maybe most of the values returned from the query are duplicates? If so, and if you do not need the duplicates, add the DISTINCT keyword to your query:

     select distinct price_open from stock_data ... 

    This reduces the amount of data exchanged with the database and leaves fewer results to process. Keep in mind that it changes the result: for a price series where every row matters, DISTINCT is not appropriate.

  • Use PreparedStatement instead of Statement:

    • to protect against SQL injection (bind the values as parameters rather than concatenating them into the query string)
    • for performance, since a PreparedStatement allows the database to reuse an already parsed query
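The list-handling advice above (presizing the ArrayList and converting it to a primitive array) can be sketched roughly as follows. The class and method names are made up for illustration, and the stream-based conversion assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the names are illustrative and not from the question.
final class ColumnBuffer {
    /** Creates a list presized for about expectedRows values, so it does not regrow while filling. */
    static List<Double> presized(int expectedRows) {
        // ArrayList rejects negative capacities, so clamp the estimate at zero.
        return new ArrayList<>(Math.max(expectedRows, 0));
    }

    /** Unboxes the list into a double[] (requires Java 8+). */
    static double[] toPrimitive(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }
}
```

As the asker's own measurements suggest, this part is unlikely to be the bottleneck; it mostly tidies up the code.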

Update #1

  • Remember to always release all resources, i.e. the ResultSet and the PreparedStatement.
    • In Java 1.7+, you can use the try-with-resources statement (see the Java® Language Specification).
    • In older versions of Java, you must call the close methods in a finally block, with separate exception handling for each call, so that an exception thrown by the first close does not prevent the second close from being called.
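Putting the PreparedStatement and resource-handling advice together, a minimal sketch might look like this. It assumes Java 7+ for try-with-resources and Java 8+ for the stream conversion. The class name is hypothetical, the method takes the stock id directly instead of calling the asker's get_stockid helper, and the table and column names are taken from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; table and column names come from the question.
final class OpenPriceReader {
    // "?" marks a bound parameter, so no values are concatenated into the SQL text.
    static final String SQL =
        "select price_open from stock_data where stock_id = ? and date between ? and ?";

    static double[] open(Connection conn, int stockId, int startDate, int endDate)
            throws SQLException {
        // Both resources are closed automatically, in reverse order, even if an exception is thrown.
        try (PreparedStatement stat = conn.prepareStatement(SQL)) {
            stat.setInt(1, stockId);
            stat.setInt(2, startDate);
            stat.setInt(3, endDate);
            try (ResultSet rs = stat.executeQuery()) {
                List<Double> data = new ArrayList<>();
                while (rs.next()) {
                    data.add(rs.getDouble(1)); // column 1 is price_open
                }
                return data.stream().mapToDouble(Double::doubleValue).toArray();
            }
        }
    }
}
```

Binding the values also means the SQL text stays identical between calls, which is what lets a driver or database reuse the prepared query.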
+5

Your query:

 select price_open from stock_data where stock_id="+id+" and date>="+startdate+" and date<="+enddate+" 

To optimize this, create an index on stock_data(stock_id, date). The database can then use an index lookup to find the matching rows instead of scanning the whole table.

If your data is really big, you can instead create an index on stock_data(stock_id, date, price_open). These are exactly the three columns the query refers to, so the index alone can satisfy the query (a covering index) without having to load the original data pages.
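As a rough sketch, the two indexes could be created once from Java like this. The index names and the class name are made up for illustration, and the "if not exists" clause assumes SQLite, which the asker is using; other databases may need slightly different syntax.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; the index names are illustrative only.
final class StockIndexes {
    // Index for the where clause: equality on stock_id, then a range on date.
    static final String LOOKUP_INDEX =
        "create index if not exists idx_stock_data_stock_date on stock_data(stock_id, date)";

    // Covering variant: also stores price_open, so the table rows need not be read.
    static final String COVERING_INDEX =
        "create index if not exists idx_stock_data_cover on stock_data(stock_id, date, price_open)";

    /** Runs one of the DDL statements above; call it once, not on every query. */
    static void create(Connection conn, String ddl) throws SQLException {
        try (Statement stat = conn.createStatement()) {
            stat.executeUpdate(ddl);
        }
    }
}
```

In SQLite, prefixing the select with EXPLAIN QUERY PLAN should show whether the index is actually being used.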

+2

You can improve performance by using a primitive array instead of an ArrayList, but this requires you to know how large the result set is.

Refer to columns by index instead of by name; this may also improve performance slightly.

  data.add(rs.getDouble(1)); 
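If the size of the result set is not known in advance, one alternative (a variation on the suggestion above, not part of the original answer) is a small growable primitive buffer, which avoids boxing each value into a Double. The class name is hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper: a minimal growable buffer of primitive doubles.
final class GrowableDoubles {
    private double[] values;
    private int size;

    GrowableDoubles(int initialCapacity) {
        // Start with at least 16 slots so that doubling always makes progress.
        values = new double[Math.max(initialCapacity, 16)];
    }

    void add(double value) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size * 2); // double the capacity when full
        }
        values[size++] = value;
    }

    /** Returns a copy trimmed to the number of values actually added. */
    double[] toArray() {
        return Arrays.copyOf(values, size);
    }
}
```

Inside the read loop this would be used as buf.add(rs.getDouble(1)), with buf.toArray() returned at the end.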
+1

Source: https://habr.com/ru/post/1482754/

