Split a string at multiple points with Ruby on Rails

I have a row in my database that stores the notes for a user as a single string. I want to break this string up so that each note is split into its content, user and date.

Here is the line format:

"Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>" 

I need to break this into an array:

 ["Example Note", "Josh Test", "12:53 PM 8/14/12", "Another example note", "John Doe", "12:00 PM 9/15/12", "Last Example Note", "Joe Smoe", "1:00 AM 10/12/12"] 

I am still experimenting with this. Any ideas are very welcome, thanks! :)

3 answers

You can use regex for a simpler approach.

    s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

    # Split before each <i>, after each </i><br><br>, and at the space before the
    # time (a space followed by a digit, unless that space comes right after "on").
    s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/)
    # => ["Example Note", "Josh Test", "12:53 PM on 8/14/12", "Another example note", "John Doe", "12:00 PM on 9/15/12", "Last Example Note", "Joe Smoe", "1:00 AM on 10/12/12"]

The datetime elements are left unformatted (they still contain the word "on"), but they can be reformatted in a separate step if needed.

Edit: Removed the unnecessary + character.
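To bring the datetime elements from the split above into the format the question asks for, one option is to post-process every third element. This is only a sketch: it assumes each note always yields exactly three parts (note, user, timestamp), and the names `parts`, `flat` and `times` are made up for illustration.

```ruby
require 'time'

s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

parts = s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/)

# Assumption: parts come in groups of three (note, user, timestamp).
# Drop the literal " on " from the timestamp only, leaving note text untouched.
flat = parts.each_slice(3).flat_map do |note, user, stamp|
  [note, user, stamp.sub(' on ', ' ')]
end

# Optionally parse each cleaned timestamp into a Time object (local time zone).
times = flat.each_slice(3).map do |_note, _user, stamp|
  Time.strptime(stamp, '%I:%M %p %m/%d/%y')
end
```

Here `flat` is the flat array of strings, while `times` holds real `Time` objects in case the timestamps need to be sorted or reformatted later.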


You can use Nokogiri to parse the required text using XPath or CSS selectors. As a simple bare-bones parsing example to get you started, the following maps the text of each <i> tag to a new element in the array:

    require 'nokogiri'

    html = Nokogiri::HTML("Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>")

    # Collect the text content of every <i> element.
    my_array = html.css('i').map { |node| node.content }
    #=> ["Josh Test 12:53 PM on 8/14/12", "John Doe 12:00 PM on 9/15/12", "Joe Smoe 1:00 AM on 10/12/12"]

With a CSS selector, you can just as easily do something like:

    require 'nokogiri'

    html = Nokogiri::HTML("<h1>My Message</h1><p>Hi today date is: <time>Friday, May 31st</time></p>")

    message_header  = html.css('h1').first.content        #=> "My Message"
    # content also includes the text of nested elements such as <time>
    message_body    = html.css('p').first.content         #=> "Hi today date is: Friday, May 31st"
    message_sent_at = html.css('p > time').first.content  #=> "Friday, May 31st"

Perhaps this may be useful:

    require 'date'

    text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

    # Each note ends with "<br><br>", so split on that first.
    pro_notes = text.split('<br><br>').map do |note_e|
      # Separate the note text from the contents of the <i> tag.
      note, meta = note_e.split('<i>')
      # words => ["Josh", "Test", "12:53", "PM", "on", "8/14/12"]
      words = meta.sub('</i>', '').split(' ')
      full_name = "#{words[0]} #{words[1]}"
      # Keep the AM/PM marker so that afternoon times are not read as morning times.
      dt = DateTime.strptime("#{words[2]} #{words[3]} #{words[5]}", '%I:%M %p %m/%d/%y')
      [full_name, note.strip, dt]
    end
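The same per-note parsing can also be sketched with a single `String#scan` call that captures all fields at once. The pattern below is an assumption rather than part of the answer above: it expects a two-word user name and a timestamp of the form "H:MM AM/PM on M/D/YY", and the names `note_pattern` and `records` are hypothetical.

```ruby
require 'date'

text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

# Assumed layout: note text, <i>, two-word name, clock time, "on", date, </i>, optional <br> tags.
note_pattern = /\s*(.+?)\s*<i>(\S+ \S+) (\d{1,2}:\d{2} [AP]M) on (\d{1,2}\/\d{1,2}\/\d{2})<\/i>(?:<br>)*/

# scan returns one array of captures per note; turn each into a small hash.
records = text.scan(note_pattern).map do |note, name, clock, date|
  {
    note: note,
    user: name,
    at:   DateTime.strptime("#{clock} #{date}", '%I:%M %p %m/%d/%y')
  }
end
```

Compared with index-based splitting, the captures name each field explicitly, at the cost of silently skipping any note that does not match the assumed layout.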

Source: https://habr.com/ru/post/1483855/

