How to populate a table from an Excel spreadsheet in Rails?

I have a simple four-column Excel spreadsheet that matches universities with their identification codes for search. The file is quite large (300k).

I need to come up with a way to turn this data into a populated table in my Rails application. The catch is that the document is updated now and then, so it cannot be a one-time solution. Ideally, it would be some kind of Ruby script that reads the file and creates the records automatically, so that when we receive a new version by email we can simply run the update again. I'm on Heroku, if that matters at all.

How can I accomplish something like this?

+3
2 answers

If you can save the spreadsheet in CSV format, the tools for processing CSV files are much better than those for parsing Excel spreadsheets. I have found that an effective way to deal with this kind of problem is to write a rake task that reads the CSV file and creates all the records as needed.

So, for example, here's how to read all the lines from a file using the old but still effective FasterCSV gem:

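require 'fastercsv' # On Ruby 1.9 and later, use require 'csv' and CSV.read instead

data = FasterCSV.read('lib/tasks/data.csv')
columns = data.shift # Remove the header row and keep it as the list of column names
# The index of a column that is always unique per row in the spreadsheet
# (-1 selects the last column; adjust as needed)
unique_column_index = -1
data.each do |row|
  r = Record.find_or_initialize_by_unique_column(row[unique_column_index])
  columns.each_with_index do |column_name, index|
    r[column_name] = row[index]
  end
  begin
    r.save!
  rescue => e
    Rails.logger.error("Failed to save #{r.inspect}: #{e.message}")
  end
end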

This does rely on you having a unique column in the original spreadsheet to key the records on.

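If you put this into a rake task, you can call it from your Capistrano script so that it runs on every deploy. Using find_or_initialize means that existing records are updated rather than duplicated.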
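
As a rough sketch of the same idea on a current Ruby, where FasterCSV has become the standard library's CSV class: the example below reads CSV text with a header row and upserts each row keyed on a unique column, so running it again with a newer file updates existing entries instead of duplicating them. The column names (code, name), the sample data, the import_csv helper, and the in-memory STORE hash are all assumptions made so that the snippet runs on its own. In a Rails app the hash lookup would be replaced by a model call such as find_or_initialize_by.

```ruby
require 'csv'

# Hypothetical stand-in for the database table, keyed by the unique column.
STORE = {}

# Upserts every row of csv_text into STORE. unique_column names the header
# whose value identifies a record. Returns the number of rows processed.
def import_csv(csv_text, unique_column)
  rows = CSV.parse(csv_text, headers: true)
  rows.each do |row|
    key = row[unique_column]
    # Same role as find_or_initialize: reuse the record if it already exists.
    record = (STORE[key] ||= {})
    row.to_h.each { |column_name, value| record[column_name] = value }
  end
  rows.size
end

# Invented sample data: two versions of the same spreadsheet.
first_version  = "code,name\nU001,Example University\nU002,Sample College\n"
second_version = "code,name\nU001,Example University (renamed)\nU003,Demo Institute\n"

import_csv(first_version, 'code')
import_csv(second_version, 'code')

puts STORE.size            # number of distinct codes seen across both files
puts STORE['U001']['name'] # value taken from the second import
```

Because each row is looked up by its unique code before being written, the import can be repeated whenever a new version of the file arrives.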

+2

You can also parse an Excel spreadsheet saved in XML format with Hpricot. For example:

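require 'hpricot'

# The ss:Index attribute used below belongs to the "XML Spreadsheet 2003" format,
# so the workbook must be saved that way; a real .xlsx file is a zip archive
# and cannot be parsed like this.
doc  = open("data.xml") { |f| Hpricot(f) }
rows = doc.search('row')
rows = rows[1..-1] # Skips the header row

rows = rows.map do |row|
  columns = []
  row.search('cell').each do |cell|
    # Excel stores cell indexes rather than blank cells
    next_index          = (cell.attributes['ss:Index']) ? (cell.attributes['ss:Index'].to_i - 1) : columns.length
    columns[next_index] = cell.search('data').inner_html
  end
  columns # One array of cell values per spreadsheet row
end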
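
Since Hpricot is no longer maintained, here is a hedged sketch of the same index-filling logic with REXML, which ships with Ruby, swapped in for Hpricot. It assumes the workbook was saved as "XML Spreadsheet 2003". The sample XML is invented and simplified (real files declare more namespaces and carry extra metadata), and parse_rows is a hypothetical helper name.

```ruby
require 'rexml/document'

# Invented, simplified sample in the XML Spreadsheet 2003 shape.
# The second row skips its middle cell, so Excel marks the next one with ss:Index.
SAMPLE_XML = <<~DOC
  <Workbook xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
    <Worksheet><Table>
      <Row><Cell><Data>code</Data></Cell><Cell><Data>name</Data></Cell><Cell><Data>city</Data></Cell></Row>
      <Row><Cell><Data>U001</Data></Cell><Cell ss:Index="3"><Data>Springfield</Data></Cell></Row>
    </Table></Worksheet>
  </Workbook>
DOC

# Returns one array of cell values per data row, with nil for blank cells.
def parse_rows(xml)
  doc  = REXML::Document.new(xml)
  rows = REXML::XPath.match(doc, '//Row')
  rows.drop(1).map do |row| # drop(1) skips the header row
    columns = []
    row.elements.each('Cell') do |cell|
      # ss:Index is 1-based and only present on a cell that follows blanks.
      index    = cell.attributes['ss:Index']
      position = index ? index.to_i - 1 : columns.length
      data     = cell.elements['Data']
      columns[position] = data ? data.text : nil
    end
    columns
  end
end

rows = parse_rows(SAMPLE_XML)
p rows
```

Assigning to columns[position] past the current end of the array leaves nil in the skipped slots, which is how the blank cell is restored.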
0

Source: https://habr.com/ru/post/1750179/

