I am trying to insert 100,000 rows into a MySQL table in under 5 seconds using Hibernate (JPA). I have tried every trick Hibernate offers and still cannot do better than 35 seconds.
1st optimization: I started with the IDENTITY sequence generator, which took 60 seconds to insert. I later abandoned the sequence generator and started assigning the @Id field myself, reading MAX(id) once and then using AtomicInteger.incrementAndGet() to assign each id. That reduced the insert time to 35 seconds.
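To make the manual id assignment concrete, here is a minimal sketch of the idea. `IdAllocator` is a name made up for this post, and the constructor argument stands in for the result of the `MAX(id)` query. It assumes a single JVM is the only writer to the table, since the counter lives in memory.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hands out consecutive ids, starting just above a known maximum. */
final class IdAllocator {
    private final AtomicInteger last;

    /** @param currentMaxId result of SELECT MAX(id), or 0 for an empty table */
    IdAllocator(int currentMaxId) {
        this.last = new AtomicInteger(currentMaxId);
    }

    /** Thread-safe within one JVM: returns the next unused id. */
    int nextId() {
        return last.incrementAndGet();
    }
}
```

Each new entity then gets `ids.nextId()` as its `@Id` value before it is persisted.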
2nd optimization: I enabled batch inserts by adding

    <prop key="hibernate.jdbc.batch_size">30</prop>
    <prop key="hibernate.order_inserts">true</prop>
    <prop key="hibernate.current_session_context_class">thread</prop>
    <prop key="hibernate.jdbc.batch_versioned_data">true</prop>

to the configuration. I was shocked to find that batch inserts did absolutely nothing to reduce the insert time. It was still 35 seconds!
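For what it's worth, this is the arithmetic behind my expectation. With a batch size of 30, I assumed the 100,000 inserts would be grouped into 3,334 batches (3,333 full ones plus a final batch of 10). `BatchMath` is just a throwaway helper written for this post, not anything from Hibernate.

```java
final class BatchMath {
    /** Number of batches needed to send rowCount rows in groups of batchSize. */
    static int batchCount(int rowCount, int batchSize) {
        return (rowCount + batchSize - 1) / batchSize; // ceiling division
    }

    /** Size of the final, possibly partial, batch (assumes rowCount > 0). */
    static int lastBatchSize(int rowCount, int batchSize) {
        int remainder = rowCount % batchSize;
        return remainder == 0 ? batchSize : remainder;
    }
}
```

`BatchMath.batchCount(100_000, 30)` gives 3334 and `BatchMath.lastBatchSize(100_000, 30)` gives 10.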
Now I'm thinking of trying to insert using multiple threads. Does anyone have any pointers? Should I choose MongoDB?
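Roughly what I have in mind for the multi-threaded variant is sketched below. I have not tried it yet. `ParallelInsertSketch` and the `insertChunk` callback are placeholders invented for this post; in the real code the callback would have to persist one chunk with its own `EntityManager` and transaction, since an `EntityManager` is not meant to be shared across threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

final class ParallelInsertSketch {

    /** Splits rows into consecutive chunks of at most chunkSize elements. */
    static <T> List<List<T>> partition(List<T> rows, int chunkSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, rows.size());
            chunks.add(new ArrayList<>(rows.subList(from, to)));
        }
        return chunks;
    }

    /**
     * Runs insertChunk once per chunk on a fixed pool of worker threads and
     * waits for all of them. insertChunk is a placeholder for a call that
     * persists one chunk in its own transaction.
     */
    static <T> void insertInParallel(List<T> rows, int chunkSize, int threads,
                                     Consumer<List<T>> insertChunk) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> chunk : partition(rows, chunkSize)) {
                pending.add(pool.submit(() -> insertChunk.accept(chunk)));
            }
            for (Future<?> f : pending) {
                f.get(); // a worker failure surfaces here as an ExecutionException
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this actually helps depends on where the 35 seconds are being spent, which is the part I cannot figure out.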
The following is my configuration:

1. Hibernate configuration:

```xml
<bean id="entityManagerFactoryBean"
      class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="packagesToScan" value="com.progresssoft.manishkr" />
    <property name="jpaVendorAdapter">
        <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter" />
    </property>
    <property name="jpaProperties">
        <props>
            <prop key="hibernate.hbm2ddl.auto">${hibernate.hbm2ddl.auto}</prop>
            <prop key="hibernate.dialect">${hibernate.dialect}</prop>
            <prop key="hibernate.show_sql">${hibernate.show_sql}</prop>
            <prop key="hibernate.format_sql">${hibernate.format_sql}</prop>
            <prop key="hibernate.jdbc.batch_size">30</prop>
            <prop key="hibernate.order_inserts">true</prop>
            <prop key="hibernate.current_session_context_class">thread</prop>
            <prop key="hibernate.jdbc.batch_versioned_data">true</prop>
        </props>
    </property>
</bean>

<bean class="org.springframework.jdbc.datasource.DriverManagerDataSource" id="dataSource">
    <property name="driverClassName" value="${database.driver}"></property>
    <property name="url" value="${database.url}"></property>
    <property name="username" value="${database.username}"></property>
    <property name="password" value="${database.password}"></property>
</bean>

<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
    <property name="entityManagerFactory" ref="entityManagerFactoryBean" />
</bean>

<tx:annotation-driven transaction-manager="transactionManager" />
```
2. Entity configuration:

```java
@Entity
@Table(name = "myEntity")
public class MyEntity {

    @Id
    private Integer id;

    @Column(name = "deal_id")
    private String dealId;

    .... ....

    @Temporal(TemporalType.TIMESTAMP)
    @Column(name = "timestamp")
    private Date timestamp;

    @Column(name = "amount")
    private BigDecimal amount;

    @OneToOne(cascade = CascadeType.ALL)
    @JoinColumn(name = "source_file")
    private MyFile sourceFile;

    public Deal(Integer id,String dealId, ....., Timestamp timestamp, BigDecimal amount, SourceFile sourceFile) {
        this.id = id;
        this.dealId = dealId;
        ... ... ...
        this.amount = amount;
        this.sourceFile = sourceFile;
    }

    public String getDealId() {
        return dealId;
    }

    public void setDealId(String dealId) {
        this.dealId = dealId;
    }

    ... ... ....

    public BigDecimal getAmount() {
        return amount;
    }

    public void setAmount(BigDecimal amount) {
        this.amount = amount;
    }

    ....

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }
}
```
3. Persistence code (service):

```java
@Service
@Transactional
public class ServiceImpl implements MyService{

    @Autowired
    private MyDao dao;

    ....

    void foo(){
        for(MyObject d : listOfObjects_100000){
            dao.persist(d);
        }
    }
}
```

4. Dao class:
```java
@Repository
public class DaoImpl implements MyDao{

    @PersistenceContext
    private EntityManager em;

    public void persist(Deal deal){
        em.persist(deal);
    }
}
```
Logs:

```
DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:32.906 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:32.906 [http-nio-8080-exec-2] DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:32.906 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:32.906 [http-nio-8080-exec-2] DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:32.906 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:32.906 [http-nio-8080-exec-2] DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:32.906 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:32.906 [http-nio-8080-exec-2] DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:32.906 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:32.906 [http-nio-8080-exec-2]
... ...
DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:34.002 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:34.002 [http-nio-8080-exec-2] DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:34.002 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:34.002 [http-nio-8080-exec-2] DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:34.002 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:34.002 [http-nio-8080-exec-2] DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement
18:26:34.002 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - insert into deal (amount, deal_id, timestamp, from_currency, source_file, to_currency, id) values (?, ?, ?, ?, ?, ?, ?)
18:26:34.002 [http-nio-8080-exec-2] DEBUG o.h.e.j.batch.internal.BatchingBatch - Executing batch size: 27
18:26:34.011 [http-nio-8080-exec-2] DEBUG org.hibernate.SQL - update deal_source_file set invalid_rows=?, source_file=?, valid_rows=? where id=?
18:26:34.015 [http-nio-8080-exec-2] DEBUG o.h.e.j.batch.internal.BatchingBatch - Executing batch size: 1
18:26:34.018 [http-nio-8080-exec-2] DEBUG o.h.e.t.i.jdbc.JdbcTransaction - committed JDBC Connection
18:26:34.018 [http-nio-8080-exec-2] DEBUG o.h.e.t.i.jdbc.JdbcTransaction - re-enabling autocommit
18:26:34.032 [http-nio-8080-exec-2] DEBUG o.s.orm.jpa.JpaTransactionManager - Closing JPA EntityManager [org.hibernate.jpa.internal.EntityManagerImpl@2354fb09] after transaction
18:26:34.032 [http-nio-8080-exec-2] DEBUG o.s.o.jpa.EntityManagerFactoryUtils - Closing JPA EntityManager
18:26:34.032 [http-nio-8080-exec-2] DEBUG o.h.e.j.internal.JdbcCoordinatorImpl - HHH000420: Closing un-released batch
18:26:34.032 [http-nio-8080-exec-2] DEBUG o.h.e.j.i.LogicalConnectionImpl - Releasing JDBC connection
18:26:34.033 [http-nio-8080-exec-2] DEBUG o.h.e.j.i.LogicalConnectionImpl - Released JDBC connection
```