You can write in the middle of a file, but you must take care to keep the length of the line you overwrite unchanged; otherwise you will overwrite part of the text that follows. Here is an example using `File.seek`: `IO::SEEK_CUR` makes the offset relative to the current position of the file pointer, which is at the end of the line that has just been read, and the +1 is for the CR character at the end of the line.
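```ruby
look_for     = "bbb"
replace_with = "xxxxx"

# DATA refers to this script file, so the script is reopened read-write
File.open(DATA, 'r+') do |file|
  file.each_line do |line|
    if (line[look_for])
      # Jump back to the start of the line just read; the +1 covers the CR
      # of a Windows CRLF line ending, which line.length does not count
      file.seek(-(line.length + 1), IO::SEEK_CUR)
      file.write line.gsub(look_for, replace_with)
    end
  end
end

__END__
aaabbb
bbbcccddd
dddeee
eee
```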
After running this, the data at the end of the script looks like the following, which I assume is not what you had in mind:
```
aaaxxxxx
bcccddd
dddeee
eee
```
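One way to guard against the corruption shown above is to refuse any replacement whose byte length differs from the text it replaces. The following is only a minimal sketch, not part of the original answer: the helper name `replace_same_length` is hypothetical, and it assumes the file can be treated as raw bytes. It opens the file in binary mode and records byte positions with `pos`, so no +1 correction for line endings is needed.

```ruby
# Hypothetical helper (an assumption for illustration, not from the answer):
# replaces look_for with replace_with in place, but only when both strings
# have the same byte length, so the following text is never overwritten.
def replace_same_length(path, look_for, replace_with)
  needle      = look_for.b      # binary copies, to match binary-mode reads
  replacement = replace_with.b
  unless needle.bytesize == replacement.bytesize
    raise ArgumentError, "replacement must have the same byte length"
  end

  File.open(path, 'r+b') do |file|
    loop do
      line_start = file.pos     # byte offset where this line begins
      line = file.gets
      break if line.nil?        # end of file
      next unless line.include?(needle)

      line_end = file.pos       # byte offset just past this line
      file.seek(line_start, IO::SEEK_SET)
      file.write(line.gsub(needle) { replacement })
      file.seek(line_end, IO::SEEK_SET)  # resume reading at the next line
    end
  end
end
```

With this sketch, a call such as `replace_same_length('all.txt', 'Melissa Etheridge', 'Malissa Etheridge')` passes the check because both strings are 17 bytes long, while replacing "bbb" with "xxxxx" raises an `ArgumentError` instead of damaging the next line.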
That said, this method is much faster than the classic "read and write to a new file" method. See these tests on a 1.7 GB music data file. For the classic approach, I used Wayne's technique. The test uses the `.bmbm` method, so file caching does not play a big role. Tests were run with MRI Ruby 2.3.0 on Windows 7. I checked that the strings were actually replaced with both methods.
```ruby
require 'benchmark'
require 'tempfile'
require 'fileutils'

look_for     = "Melissa Etheridge"
replace_with = "Malissa Etheridge"
very_big_file = 'D:\Documents\muziekinfo\all.txt'.gsub('\\','/')

# In-place replacement: seek back over the line just read and overwrite it.
# Only safe here because both strings have the same length.
# (The method shares its name with the local variable above; a call with
# arguments is still resolved as a method call.)
def replace_with file_path, look_for, replace_with
  File.open(file_path, 'r+') do |file|
    file.each_line do |line|
      if (line[look_for])
        file.seek(-(line.length + 1), IO::SEEK_CUR)
        file.write line.gsub(look_for, replace_with)
      end
    end
  end
end

# Classic approach: copy every line to a temp file, then move it over the original.
def replace_with_classic path, look_for, replace_with
  temp_file = Tempfile.new('foo')
  File.foreach(path) do |line|
    if (line[look_for])
      temp_file.write line.gsub(look_for, replace_with)
    else
      temp_file.write line
    end
  end
  temp_file.close
  FileUtils.mv(temp_file.path, path)
ensure
  temp_file.close
  temp_file.unlink
end

Benchmark.bmbm do |x|
  x.report("adapt          ") { 1.times {replace_with very_big_file, look_for, replace_with}}
  x.report("restore        ") { 1.times {replace_with very_big_file, replace_with, look_for}}
  x.report("classic adapt  ") { 1.times {replace_with_classic very_big_file, look_for, replace_with}}
  x.report("classic restore") { 1.times {replace_with_classic very_big_file, replace_with, look_for}}
end
```
This gave:
```
Rehearsal ---------------------------------------------------
adapt             6.989000   0.811000   7.800000 (  7.800598)
restore           7.192000   0.562000   7.754000 (  7.774481)
classic adapt    14.320000   9.438000  23.758000 ( 32.507433)
classic restore  14.259000   9.469000  23.728000 ( 34.128093)
----------------------------------------- total: 63.040000sec

                      user     system      total        real
adapt             7.114000   0.718000   7.832000 (  8.639864)
restore           6.942000   0.858000   7.800000 (  8.117839)
classic adapt    14.430000   9.485000  23.915000 ( 32.195298)
classic restore  14.695000   9.360000  24.055000 ( 33.709054)
```
Thus, replacing in the file itself was about 4 times faster in real time (roughly 8 to 9 seconds versus 32 to 34 seconds).