I'm using ActiveRecord to bulk transfer some data from a table in one database to a table in another database, about 4 million rows.

I use find_each to fetch the rows in batches. For each record I run some logic and then write it to the other database. I've tried both writing records directly one at a time and batching the inserts with the excellent activerecord-import gem.
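The overall structure looks roughly like this (the model names, the transform helper, and the buffer size are placeholders for my real setup):

```ruby
require 'active_record'
require 'activerecord-import'

buffer = []

# Read from the source db in batches, transform each row, and bulk-insert
# into the target db with activerecord-import.
SourcePackage.find_each(batch_size: 1000) do |row|
  buffer << TargetPackage.new(transform(row))    # per-record logic (transform is a stand-in)
  if buffer.size >= 1000
    TargetPackage.import(buffer)                 # bulk insert via activerecord-import
    buffer.clear
  end
end
TargetPackage.import(buffer) unless buffer.empty?  # flush the final partial batch
```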
Any ideas? Is ActiveRecord caching something that I can disable?
Update (Jan 17, 2012):
I think I'm going to give up on this. I tried:

* Making sure everything is wrapped in an ActiveRecord::Base.uncached do block
* Adding ActiveRecord::IdentityMap.enabled = false (I think that should disable the identity map for the current thread, although it's not clearly documented, and I believe the identity map isn't enabled by default in current Rails anyway)

Neither of these had much effect; memory still leaks.
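For concreteness, those attempts looked roughly like this (again with a placeholder model name):

```ruby
# Explicitly disable the identity map for this thread
# (should be a no-op if it's already off by default).
ActiveRecord::IdentityMap.enabled = false

# Make sure nothing inside the loop is held by the query cache.
ActiveRecord::Base.uncached do
  SourcePackage.find_each(batch_size: 1000) do |row|
    # ... transform and write to the target db as before ...
  end
end
```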
Then I added periodic explicit garbage collection:
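Something along these lines; the interval is arbitrary and the names are placeholders:

```ruby
i = 0
SourcePackage.find_each(batch_size: 1000) do |row|
  import_row(row)           # per-record logic + write to the other db (hypothetical helper)
  i += 1
  GC.start if i % 500 == 0  # force a GC run every 500 records
end
```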
This seems to slow down the rate of the memory leak, but the leak still happens (eventually exhausting all memory and bombing out).
So I think I'm giving up, and concluding that it is not currently feasible to use ActiveRecord to read millions of rows from one database and insert them into another. Maybe there's a memory leak in the MySQL-specific code being used (MySQL is my db), or somewhere else in ActiveRecord, or who knows.