Read a tab-delimited file with unusual characters, then write an exact copy

Problem

I have a tab delimited input file that looks like this:

    Variable [1]	Variable [2]
    111	Something
    Nothing	222

The first row holds the column names, and the next two rows hold the column values. As you can see, the column names contain both spaces and special characters (square brackets).

I want to import this file into R and then write it back out to a new text file that looks exactly like the input. To do this, I wrote the following script (assuming the input file is called "Test.txt"):

    file <- "Test.txt"
    x <- read.table(file, header = TRUE, sep = "\t")
    write.table(x, file = "TestOutput.txt", sep = "\t", col.names = TRUE, row.names = FALSE)

From this, I get output that looks like this:

    "Variable..1."	"Variable..2."
    "1"	"111"	"Something"
    "2"	"Nothing"	"222"

There are a few problems with this output.

  • The characters "[" and "]" have been converted to dots.
  • Spaces have been converted to dots.
  • Quotation marks have appeared everywhere.

How can I make the output file look exactly like the input file?

What I tried so far

Regarding problems one and two, I tried specifying the column names myself by creating a vector, c("Variable [1]", "Variable [2]"), and passing it to the col.names parameter of read.table(). This gives me the same result. I also tried different encodings using the encoding option of read.table(). If I print the vector mentioned above, the variable names appear as they should, so I assume the problem lies in the conversion between the "text → R" and "R → text" steps. That is, if I look at the data frame created by read.table() without supplying any vector of my own, the column names are already wrong.

Regarding problem number three, I am pretty much lost and cannot figure out what to try.

1 answer

Given the following input file as test.txt:

    Variable [1]	Variable [2]
    111	Something
    Nothing	222

If the columns are separated by tabs, you can use the following code to create an exact copy:

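    # check.names = FALSE keeps the header text unchanged
    a <- read.table(file = 'test.txt', check.names = FALSE, sep = '\t', header = TRUE, stringsAsFactors = FALSE)
    # quote = FALSE writes strings without surrounding quotation marks
    write.table(x = a, file = 'test_copy.txt', quote = FALSE, row.names = FALSE, col.names = TRUE, sep = '\t')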
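
To check that the copy really matches, a small self-contained sketch along the following lines may help. It is only an illustration: the temporary file names and the sample contents are assumptions based on the layout described in the question, and the extra arguments colClasses = "character", quote = "" and comment.char = "" are my own additions, meant to keep read.table() from reinterpreting any values. The make.names() call shows where the dots come from when check.names is left at its default.

```r
# Build a small tab-delimited sample matching the question's layout
# (temporary file names, purely illustrative).
input  <- tempfile(fileext = ".txt")
output <- tempfile(fileext = ".txt")
writeLines(c("Variable [1]\tVariable [2]",
             "111\tSomething",
             "Nothing\t222"), input)

# With check.names = TRUE (the default), header names are passed through
# make.names(), which replaces characters that are invalid in R names.
make.names("Variable [1]")   # "Variable..1."

# check.names = FALSE keeps the header as written.
# colClasses = "character" (an added assumption) keeps every value as text,
# so nothing is reformatted on the way back out.
a <- read.table(input, header = TRUE, sep = "\t", check.names = FALSE,
                colClasses = "character", quote = "", comment.char = "")

# quote = FALSE stops write.table() from wrapping strings in double quotes.
write.table(a, file = output, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = TRUE)

# Compare the two files line by line.
same <- identical(readLines(input), readLines(output))
print(same)
```

If the comparison comes back FALSE on a real file, the usual suspects are values that R has converted on input (for example numbers with trailing zeros, or empty fields read as NA), which is why the sketch reads everything as character.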

Source: https://habr.com/ru/post/902197/
