R: Creating CSV from serialized objects

I am trying to take a list, serialize each item, and write it to a CSV file with a key, producing a text file of key/value pairs. This will eventually go through Hadoop Streaming, so before you ask, I think it really needs to be a text file (but I am open to other ideas). It all seemed pretty straightforward, but I can't get serialization to work the way I want (yet).

If I do this:

> rawToChar(serialize("blah", NULL, ascii=T))
[1] "A\n2\n133888\n131840\n16\n1\n9\n4\nblah\n"

Then I have those pesky \n characters that will later wreck my CSV parsing. I could replace each \n with some other string, which I don't mind doing, but it feels a bit dirty.

Another option that came to mind is to skip rawToChar() and dump the raw ASCII bytes to a text file:

> serialize("blah", NULL, ascii=T)
 [1] 41 0a 32 0a 31 33 33 38 38 38 0a 31 33 31 38 34 30 0a 31 36 0a 31 0a 39 0a
[26] 34 0a 62 6c 61 68 0a

Well, if I just dump that into a text file, I will get a \n after each item in the vector. So I tried a little paste/collapse trick:

> ser <- serialize("blah", NULL, ascii=T)
> ser2 <- paste(ser, collapse="")
> ser2
[1] "410a320a3133333838380a3133313834300a31360a310a390a340a626c61680a"

Now that is a value I can write to a CSV text file! Just... how do I turn it back into an object? Take just the first hex element, 41: I can't even figure out how to create a raw vector and put the hex value 41 into one of its elements. When I try to assign the hex value to an element of a raw vector, I get something like this:

> r <- raw(1)
> r[1] <- 41
Error in r[1] <- 41 : 
  incompatible types (from double to raw) in subassignment type fix
> r[1] <- as.raw(41)
> r[1]
[1] 29 

Shit! 29 != 41 (except for really large values of 29 and really small values of 41, of course)

Any ideas on how to crack this nut?


The caTools package provides Base64 encoding and decoding, for example:

> library(caTools)
> s<-base64encode(serialize("blah",NULL))
> s
[1] "WAoAAAACAAIKAQACAwAAAAAQAAAAAQAAAAkAAAAEYmxhaA=="
> unserialize(base64decode(s,"raw"))
[1] "blah"
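
The same round trip should work for richer objects than a single string. A minimal sketch, assuming caTools is installed; the data frame `df` and the variable names are made up for illustration. It also checks that the encoded string contains no commas or newlines, which is the property that matters for the CSV file:

```r
library(caTools)

# Hypothetical example object: a small data frame.
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

# Serialize to raw bytes, then encode as a single Base64 string.
encoded <- base64encode(serialize(df, NULL))

# TRUE would mean the string contains a comma or newline,
# either of which would break a key,value CSV line.
has_delims <- grepl("[,\n]", encoded)

# Decode back to raw bytes and rebuild the object.
decoded <- unserialize(base64decode(encoded, "raw"))
```

If everything works as expected, `identical(decoded, df)` is TRUE and `has_delims` is FALSE.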

jmoy's answer works. Building on it, here are two functions that write a list to a CSV file and read it back:

listToCsv <- function(inList, outFileName) {
  library(caTools)
  if (!is.list(inList))
    stop("listToCsv: The input list fails the is.list() check.")
  fileName <- outFileName
  # Create (or truncate) the output file.
  cat("", file = fileName, append = FALSE)

  i <- 1
  for (item in inList) {
    # One line per item: integer key, comma, Base64 of the serialized item.
    myLine <- paste(i, ",", base64encode(serialize(item, NULL, ascii = TRUE)), "\n", sep = "")
    cat(myLine, file = fileName, append = TRUE)
    i <- i + 1
  }
}

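csvToList <- function(inFileName) {
  library(caTools)
  # Read every line of the file named by the inFileName argument.
  linesIn <- readLines(inFileName, n = -1)
  outList <- vector("list", length(linesIn))

  for (i in seq_along(linesIn)) {
    # Field 2 is the Base64 payload; Base64 output never contains commas.
    payload <- strsplit(linesIn[[i]], split = ",")[[1]][[2]]
    # Assign via list() so an item that unserializes to NULL keeps its slot.
    outList[i] <- list(unserialize(base64decode(payload, "raw")))
  }
  return(outList)
}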

Note that you want as.raw(65) here, since 65 (decimal) is 41 (hexadecimal):

 > as.hexmode(65)
[1] "41"
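
Following from that, the hex string in the question can be turned back into an object by parsing each two-character pair as hexadecimal before calling as.raw(). A minimal sketch in base R; the variable names are only illustrative:

```r
# Serialize to a hex string, as in the question.
ser <- serialize("blah", NULL, ascii = TRUE)
hex <- paste(ser, collapse = "")

# Split the string into two-character hex pairs.
starts <- seq(1, nchar(hex), by = 2)
pairs <- substring(hex, starts, starts + 1)

# strtoi(..., base = 16L) parses each pair as hexadecimal, so as.raw()
# receives the decimal value it expects (e.g. "41" becomes 65).
bytes <- as.raw(strtoi(pairs, base = 16L))

# Rebuild the original object from the recovered bytes.
restored <- unserialize(bytes)
```

Here `bytes` should be identical to `ser`, and `restored` should be the original string. The hex form is twice the size of the raw bytes, so Base64 is the more compact choice for large objects.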

, Hadoop?


Source: https://habr.com/ru/post/1751775/

