This is a bug in RStudio, and not the only one. I saw you got a general answer about RStudio's problems: it currently has poor support for non-English locales on Windows. As far as I know, this is not the first time (or version) with similar problems. You may also run into some newer problems that I think are related to Windows 10. Note that, because of other problems on my machine, I use the English locale to print Hebrew.
So I tried to debug your problem and came up with some findings, and some new ideas (I think) about where the problem lies. I think it could be debugged further in order to write a complete function that fixes it, but due to time (and hour) limitations, I decided to stop here.
I created this data:
x <- data.frame("x"= c("דור","dor"))
As mentioned, using the Hebrew locale I also get gibberish:
    Sys.setlocale("LC_ALL", "Hebrew")
    [1] "LC_COLLATE=Hebrew_Israel.1255;LC_CTYPE=Hebrew_Israel.1255;LC_MONETARY=Hebrew_Israel.1255;LC_NUMERIC=C;LC_TIME=Hebrew_Israel.1255"
    "דור"
    [1] "ãåø"
    x
        x
    1 ãåø
    2 dor
Using the English locale, I get this output:
    Sys.setlocale("LC_ALL", "English")
    [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
    "דור"
    [1] "דור"
    x
                             x
    1 <U+05D3><U+05D5><U+05E8>
    2                      dor
Note that output that is not a data.frame prints fine. The problem also occurs with the data.table class, while list and matrix objects print fine.
Checking print.data.frame and print.table reveals the main suspect: format.
Further investigation confirms this suspicion:
    as.matrix(x)
         x
    [1,] "דור"
    [2,] "dor"
    format(as.matrix(x))
         x
    [1,] "<U+05D3><U+05D5><U+05E8>"
    [2,] "dor                     "
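The comparison above can be turned into a small check that flags which cells format() has replaced with <U+XXXX> escapes. This is only a sketch: the variable names are illustrative, the Hebrew string is written with \u escapes so the script does not depend on the file encoding, and stringsAsFactors = FALSE is my assumption to keep the column as character. Whether any cell is actually escaped depends on the platform and locale, so no output is shown.

```r
# Same toy data as above; "\u05d3\u05d5\u05e8" is the Hebrew word from the example.
x <- data.frame(x = c("\u05d3\u05d5\u05e8", "dor"), stringsAsFactors = FALSE)

raw_cells       <- as.matrix(x)       # character matrix, format() not applied
formatted_cells <- format(raw_cells)  # what print.data.frame relies on

# Declared encoding of each string ("UTF-8" for the Hebrew one, "unknown" for ASCII).
cell_encoding <- Encoding(raw_cells[, "x"])

# TRUE for every cell that format() turned into <U+XXXX> escapes.
escaped <- grepl("<U+", formatted_cells, fixed = TRUE)

print(cell_encoding)
print(escaped)
```

If `escaped` contains TRUE in your session, format() is mangling that cell under the current locale, which matches the suspicion above.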
As such, in your case, I suggest the following workflow:
    Sys.setlocale("LC_ALL", "Hebrew")
    x <- read.csv("https://raw.githubusercontent.com/talgalili/temp2/gh-pages/Hebrew_UTF8.txt", encoding="UTF-8")
    as.matrix(x)
          âéì..áùðéí. îéâãø
     [1,] "23.0"      "זכר"
     [2,] "24.0"      "נקבה"
     [3,] "23.0"      "נקבה"
     [4,] "24.0"      "נקבה"
     [5,] "25.0"      "זכר"
     [6,] "18.0"      "זכר"
     [7,] "26.0"      "זכר"
     [8,] "21.5"      "נקבה"
     [9,] "24.0"      "זכר"
    [10,] "26.0"      "זכר"
    [11,] "24.0"      "זכר"
    [12,] "19.0"      "נקבה"
    [13,] "19.0"      "נקבה"
    [14,] "24.5"      "זכר"
    [15,] "21.0"      "נקבה"
Both locales, Hebrew and English, worked on my machine, but col.names did not work in either.
In conclusion, this is far from a complete solution, just a small and partial workaround for a printing problem (or rather, a formatting one). It also sheds a bit more light on this Hebrew / non-English issue in RStudio, on which better solutions can be built. One example of a solution to a similar problem of writing Hebrew on Windows can be seen in this SO thread.
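If you use this workaround often, it can be wrapped in a tiny helper. The function name print_via_matrix is hypothetical (it is not part of base R or RStudio), and it does nothing beyond the as.matrix() step shown above, so the column names are not repaired by it.

```r
# Hypothetical helper: print a data frame through as.matrix() so that
# print.data.frame() and its format() step are bypassed.
print_via_matrix <- function(df) {
  m <- as.matrix(df)  # every column is converted to character if any is non-numeric
  print(m)
  invisible(m)        # return the matrix so it can be reused
}

# Toy data; "\u05d3\u05d5\u05e8" is the Hebrew word from the example above.
x <- data.frame(x = c("\u05d3\u05d5\u05e8", "dor"), stringsAsFactors = FALSE)
m <- print_via_matrix(x)
```

Keep in mind that as.matrix() turns numeric columns into strings when the data frame also has text columns, so this is for display only, not for further computation.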