I am trying to take a list, serialize each item, and put it in a CSV file with a key, so that I end up with a text file of key/value pairs. This will eventually run through Hadoop Streaming, so before you ask: I think it really needs to be a text file (but I am open to other ideas). It all seemed pretty straightforward at first, but I can't get the serialization to work the way I want (yet).
If I do this:
> rawToChar(serialize("blah", NULL, ascii=T))
[1] "A\n2\n133888\n131840\n16\n1\n9\n4\nblah\n"
Then I have those pesky \n characters, which will later break my CSV parsing. I could replace each \n with some other string, which I don't mind doing, but it seems a bit dirty.
Another option that came to mind is to skip rawToChar() and write the raw ASCII bytes to the text file instead:
> serialize("blah", NULL, ascii=T)
[1] 41 0a 32 0a 31 33 33 38 38 38 0a 31 33 31 38 34 30 0a 31 36 0a 31 0a 39 0a
[26] 34 0a 62 6c 61 68 0a
Well, if I just dump that into a text file, I get a \n after each item. So I tried a little paste/collapse:
> ser <- serialize("blah", NULL, ascii=T)
> ser2 <- paste(ser, collapse="")
> ser2
[1] "410a320a3133333838380a3133313834300a31360a310a390a340a626c61680a"
Now that is a value I can write to a CSV text file! Just... how do I turn it back into the original object? Let's take just the first hex element: 41. I can't even figure out how to create a raw vector and put the hex value 41 into one of its elements. When I try to assign the hex value into a raw vector, I get something like this:
> r <- raw(1)
> r[1] <- 41
Error in r[1] <- 41 :
incompatible types (from double to raw) in subassignment type fix
> r[1] <- as.raw(41)
> r[1]
[1] 29
Shit! 29 != 41 (except for really large values of 29 and really small values of 41, of course)
Any ideas on how to crack this nut?
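To make the goal concrete, here is a minimal sketch of the round trip I am after. It assumes the hex string can be split into two-character pairs and parsed as base 16 with strtoi(); the helper names to_hex and from_hex are hypothetical, made up for illustration. It also shows why the transcript above printed 29: the literal 41 is decimal, and decimal 41 is hex 29, whereas 0x41 is the hex literal.

```r
# Hypothetical helpers (not part of base R): object -> hex string -> object.

to_hex <- function(obj) {
  # serialize() returns a raw vector; as.character() gives two-digit hex per byte
  bytes <- serialize(obj, NULL, ascii = TRUE)
  paste(as.character(bytes), collapse = "")
}

from_hex <- function(hex) {
  # split the string into two-character pairs, one pair per byte
  starts <- seq(1, nchar(hex), by = 2)
  pairs <- substring(hex, starts, starts + 1)
  # parse each pair as a base-16 integer, convert to raw, then unserialize
  unserialize(as.raw(strtoi(pairs, base = 16L)))
}

hex <- to_hex("blah")       # a single-line string, safe for a CSV field
restored <- from_hex(hex)   # back to the original object

# A hex literal, not a decimal one, gives the byte that prints as 41
r <- raw(1)
r[1] <- as.raw(0x41)
```

Since the hex string contains only the characters 0-9 and a-f, it needs no quoting or escaping in the CSV, at the cost of doubling the size of the serialized value.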