I am trying to create an HTML document using the following R Markdown script:
---
title: "SQL"
date: "`r format(Sys.time(), '%Y-%m-%d')`"
output:
  html_document:
    keep_md: yes
---
```{r setup, echo = FALSE}
library(knitr)
path <- "drive"
extension <- "sql"
```
This document contains the code from all files with extension ``r extension`` in ``r paste0(getwd(), "/", path)``.
```{r, results = "asis", echo = FALSE}
# Collect the matching files; file.path() adds the separator that paste0() omits
fileNames <- list.files(path, pattern = sprintf(".*%s$", extension))
fileInfos <- file.info(file.path(path, fileNames))
for (fileName in fileNames) {
  filePath <- file.path(path, fileName)
  cat(sprintf("## File `%s` \n\n### Meta data \n\n", fileName))
  # One-row table with the file's size, mode, and modification time
  cat(sprintf(
    "| size (KB) | mode | modified |\n|---|---|---|\n %s | %s | %s\n\n",
    round(fileInfos[filePath, "size"]/1024, 2),
    fileInfos[filePath, "mode"],
    fileInfos[filePath, "mtime"]))
  # Write the file content verbatim into a fenced block
  cat(sprintf("### Content\n\n```\n%s\n```\n\n", paste(readLines(filePath), collapse = "\n")))
}
```
The same setup, the same R scripts, and the same directories work fine on a Windows computer, but not on Ubuntu with RStudio Server. I assume there is a problem with the encoding.
Sometimes execution starts with:
During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_COLLATE failed, using "C"
3: Setting LC_TIME failed, using "C"
4: Setting LC_MESSAGES failed, using "C"
5: Setting LC_MONETARY failed, using "C"
6: Setting LC_PAPER failed, using "C"
7: Setting LC_MEASUREMENT failed, using "C"
and then execution just freezes.
Then I need to run:
Sys.setenv(LANG="de_DE.UTF-8")
and repeat the execution. The following output then appears:
processing file: all_SQLs.Rmd
|................ | 25%
inline R code fragments
|................................ | 50%
label: setup (with options)
List of 2
$ echo : logi FALSE
$ indent: chr " "
|................................................. | 75%
inline R code fragments
|.................................................................| 100%
label: unnamed-chunk-1 (with options)
List of 2
$ results: chr "asis"
$ echo : logi FALSE
output file:SQL.knit.md
/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS SQL.utf8.md
pandoc: Cannot decode byte '\xfc': Data.Text.Internal.Encoding.Fusion.streamUtf8: Invalid UTF-8 stream
Error: pandoc document conversion failed with error 1
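For context, the byte `\xfc` is how "ü" is stored in Latin-1 (ISO-8859-1), so the pandoc error suggests that at least one of the input files is not saved as UTF-8. Below is a minimal sketch of how this could be checked from R. The helper names `find_non_utf8` and `latin1_to_utf8` are hypothetical and not part of the script above, and treating the files as Latin-1 is only an assumption:

```r
# Hypothetical helper: return the paths of files that contain at least one
# line that is not valid UTF-8.
find_non_utf8 <- function(paths) {
  bad <- vapply(paths, function(p) {
    lines <- readLines(p, warn = FALSE)  # bytes are kept as they are on disk
    !all(validUTF8(lines))               # validUTF8() checks the raw bytes
  }, logical(1))
  paths[bad]
}

# Hypothetical helper: re-encode text under the ASSUMPTION that it is Latin-1.
latin1_to_utf8 <- function(x) {
  iconv(x, from = "latin1", to = "UTF-8")
}

# Small demonstration with a temporary file that contains the byte 0xfc.
demo_file <- tempfile(fileext = ".sql")
writeBin(c(charToRaw("SELECT 'M"), as.raw(0xfc), charToRaw("nchen';\n")), demo_file)
print(find_non_utf8(demo_file))                    # lists the demo file
print(validUTF8(latin1_to_utf8(readLines(demo_file, warn = FALSE))))
```

In the script above, the check could be applied as `find_non_utf8(file.path(path, fileNames))` to see which files are affected before knitting.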
I have tried everything I could think of, including setting the locale in both R and Ubuntu to the same standard:
Locale in R:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8
[4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
Locale in Ubuntu:
LANG=de_DE.UTF-8
LANGUAGE=
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
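The R-side settings can be collected from inside the running session with a few base R calls, which makes it easier to compare them with the shell output above. A minimal sketch (the helper name `locale_report` is made up for illustration):

```r
# Hypothetical helper: gather the locale-related settings of the current R session.
locale_report <- function() {
  list(
    locale = Sys.getlocale(),                 # all LC_* categories in one string
    ctype  = Sys.getlocale("LC_CTYPE"),       # category that governs character encoding
    lang   = Sys.getenv("LANG"),              # LANG as seen by the R process
    utf8   = isTRUE(l10n_info()[["UTF-8"]])   # TRUE if R treats the session as UTF-8
  )
}

str(locale_report())
```

Running this in the same session that knits the document shows whether the R process actually sees the settings reported by the shell.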
When I use knit2html() instead of the Knit HTML button in RStudio, I get the following:
Warning messages:
1: In readLines(con) :
incomplete final line found on '/SQL.Rmd'
2: In regexpr("<[Hh][1-6].*?>(.*)</[Hh][1-6].*?>", html, perl = TRUE) :
input string 1 is invalid UTF-8
3: In grepl(i, html, perl = TRUE) :
input string 1 is invalid UTF-8
4: In grepl(i, html, perl = TRUE) :
input string 1 is invalid UTF-8