Change the row after applying transformations

I want to write a Kiba transformation that repeats the same information on a given number of rows. In this case, I have an xls file that contains subheadings, and these subheadings also contain data, for example:

Client: John Doe, Code: 1234
qty, date, price
1, 12/12/2017, 300.00
6, 12/12/2017, 300.00
total: 2100
Client: Nick Leitgeb, Code: 2345
qty, date, price
1, 12/12/2017, 100.00
2, 12/12/2017, 200.00
2, 12/12/2017, 50.00
total: 600
Client: …..

To extract the relevant data, I use the following transformation, which keeps the rows that match at least one of the two regular expressions provided (a date or the word "Client"):

transform SelectByRegexes, regex: [/\d+\/\d+\/\d+/, /Client:/], matches: 1

This will give me the following result:

Client: John Doe, Code: 1234
1, 12/12/2017, 300.00
6, 12/12/2017, 300.00
Client: Nick Leitgeb, Code: 2345
1, 12/12/2017, 100.00
2, 12/12/2017, 200.00
2, 12/12/2017, 50.00
…..
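
For readers who have not seen the class behind that transform, here is a minimal sketch of how a filter like SelectByRegexes could be written. The class body is hypothetical (the real one is not shown in the question), and it assumes each row is an array of strings. It relies only on the Kiba convention that a class transform exposes process(row) and that returning nil drops the row.

```ruby
# Hypothetical sketch; the actual SelectByRegexes from the question may differ.
class SelectByRegexes
  def initialize(regex:, matches: 1)
    @regexes = regex     # list of regular expressions to test
    @matches = matches   # minimum number of expressions that must match
  end

  # Kiba calls process(row) for every row; returning nil removes the row.
  def process(row)
    line = row.join(', ')
    hits = @regexes.count { |re| line.match?(re) }
    hits >= @matches ? row : nil
  end
end

# Stand-alone check without a Kiba job (rows assumed to be arrays of strings).
filter = SelectByRegexes.new(regex: [%r{\d+/\d+/\d+}, /Client:/], matches: 1)
rows = [
  ['Client: John Doe', 'Code: 1234'],
  ['qty', 'date', 'price'],
  ['1', '12/12/2017', '300.00'],
  ['total: 2100']
]
kept = rows.map { |row| filter.process(row) }.compact
```

Here kept holds only the client row and the dated row; the column header and the total row are dropped.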

Now that I have the information I want, I need to repeat the client and code on each data row and remove the subheading rows:

John Doe, 1234, 1, 12/12/2017, 300.00
John Doe, 1234, 6, 12/12/2017, 300.00
Nick Leitgeb, 2345, 1, 12/12/2017, 100.00
Nick Leitgeb, 2345, 2, 12/12/2017, 200.00
Nick Leitgeb, 2345, 2, 12/12/2017, 50.00

I think this could be done in a source or pre_process block. How would I do that with source/pre_process?

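You can do this with Kiba. One option, after the source has read the rows, is a transform that remembers the last client row: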

require 'logger'

last_seen_client_row = nil
logger = Logger.new(STDOUT)

transform do |row|
  # detect "Client/Code" rows - pseudo code, adjust as needed
  if row[0] =~ /\AClient:/
    # this is a top-level header, memorize it
    last_seen_client_row = row
    logger.info "Client boundaries detected for client XXX"
    next # returning nil removes the row from the pipeline
  else
    # assuming you are working with arrays (I usually prefer Hashes though!);
    # make sure to dupe the data to avoid modifying the memorized row
    last_seen_client_row.dup + row
  end
end
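
The same idea can also be packaged as a class transform that strips the "Client:" and "Code:" labels, so the output matches the rows asked for in the question. This is only a sketch: the class name PrependClient is made up for illustration, and it assumes array rows whose header cells look like "Label: value".

```ruby
# Hypothetical class transform; assumes header rows such as
# ["Client: John Doe", "Code: 1234"] and data rows as arrays of strings.
class PrependClient
  def initialize
    @client = nil # values taken from the most recent "Client:" row
  end

  def process(row)
    if row[0].to_s.start_with?('Client:')
      # Keep only the values: "Client: John Doe" -> "John Doe"
      @client = row.map { |cell| cell.split(':', 2).last.strip }
      nil # drop the subheading row itself
    elsif @client
      @client + row # Array#+ builds a new array, so @client is not mutated
    end
    # rows seen before any "Client:" row fall through and are dropped (nil)
  end
end

# Stand-alone check without a Kiba job.
step = PrependClient.new
rows = [
  ['Client: John Doe', 'Code: 1234'],
  ['1', '12/12/2017', '300.00'],
  ['6', '12/12/2017', '300.00'],
  ['Client: Nick Leitgeb', 'Code: 2345'],
  ['1', '12/12/2017', '100.00']
]
result = rows.map { |row| step.process(row) }.compact
```

In a job this would be declared as transform PrependClient, placed after the regex filter. Keeping the state inside the instance avoids a script-level variable, at the cost of assuming rows arrive in file order.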

Source: https://habr.com/ru/post/1670973/

