How to ensure that strings are in UTF-8?

How can I convert this string, the surveyÂ’s rules, to UTF-8 in Scala?

I tried these approaches, but they didn't work:

scala> val text = "the surveyÂ’s rules"
text: String = the surveyÂ’s rules

scala> scala.io.Source.fromBytes(text.getBytes(), "UTF-8").mkString
res17: String = the surveyÂ’s rules

scala> new String(text.getBytes(),"UTF8")
res21: String = the surveyÂ’s rules

OK, I solved it this way: not by converting the string, but by reading the file with an explicit codec:

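import java.io.File
import java.nio.charset.CodingErrorAction
import scala.io.{Codec, Source}

// Decode as US-ASCII and silently drop any input that is malformed or unmappable
implicit val codec: Codec = Codec("US-ASCII").onMalformedInput(CodingErrorAction.IGNORE).onUnmappableCharacter(CodingErrorAction.IGNORE)

val src = Source.fromFile(new File(folderDestination + name + ".csv"))
val src2 = Source.fromFile(new File(folderDestination + name + ".csv"))

val reader = CSVReader.open(src.reader())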
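
If the file is actually UTF-8 encoded, an alternative to dropping non-ASCII bytes is to pass a UTF-8 codec when opening it. The following is a minimal sketch under that assumption; `ReadUtf8Sketch`, `readUtf8` and `writeSample` are hypothetical names introduced only for illustration and are not part of the original code.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.io.{Codec, Source}

object ReadUtf8Sketch {
  // Reads the whole file as text, decoding its bytes as UTF-8.
  // Assumes the file really is UTF-8; malformed input may raise an exception.
  def readUtf8(path: Path): String = {
    val src = Source.fromFile(path.toFile)(Codec.UTF8)
    try src.mkString finally src.close()
  }

  // Hypothetical helper for trying the sketch out:
  // writes the given text to a temporary file as UTF-8 bytes.
  def writeSample(text: String): Path = {
    val tmp = Files.createTempFile("sample", ".csv")
    Files.write(tmp, text.getBytes(StandardCharsets.UTF_8))
    tmp
  }
}
```

Passing the codec explicitly keeps the result independent of the platform default, unlike an implicit codec picked up from the environment.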
+4
2 answers

Note that when text.getBytes() is called without arguments, you actually get an array of bytes representing the string encoded in the platform's default charset. For example, on Windows this may be some single-byte encoding; on Linux it might already be UTF-8.
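
A small sketch of the point above: the no-argument getBytes() behaves like getBytes(Charset.defaultCharset()), and the same character can take a different number of bytes depending on the charset. It also suggests why the attempts in the question changed nothing: if the default charset happens to be UTF-8, encoding with getBytes() and decoding as UTF-8 simply returns the same string. The object and member names are made up for illustration.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object DefaultCharsetSketch {
  // A single non-ASCII character (e with acute accent), written as an escape
  // so the example does not depend on the encoding of the source file.
  val sample: String = "\u00e9"

  // Encoded with whatever charset the platform uses by default.
  def defaultBytes: Array[Byte] = sample.getBytes()

  // The same thing, with the default charset spelled out.
  def explicitDefaultBytes: Array[Byte] = sample.getBytes(Charset.defaultCharset())

  // One byte in ISO-8859-1, two bytes in UTF-8.
  def latin1Bytes: Array[Byte] = sample.getBytes(StandardCharsets.ISO_8859_1)
  def utf8Bytes: Array[Byte] = sample.getBytes(StandardCharsets.UTF_8)
}
```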

To be sure of the result, pass the encoding explicitly to getBytes(). On Java 7 and later:

import java.nio.charset.StandardCharsets

val bytes = text.getBytes(StandardCharsets.UTF_8)

On Java 6:

import java.nio.charset.Charset

val bytes = text.getBytes(Charset.forName("UTF-8"))

Now bytes contains the string encoded as UTF-8.
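
To check the result, the bytes can be decoded back with the same explicit charset and inspected as hex. This is only a sketch; `Utf8RoundTripSketch`, `toUtf8`, `fromUtf8` and `hex` are hypothetical helper names, not part of any library.

```scala
import java.nio.charset.StandardCharsets

object Utf8RoundTripSketch {
  // Encode a string to UTF-8 bytes, independent of the platform default.
  def toUtf8(text: String): Array[Byte] =
    text.getBytes(StandardCharsets.UTF_8)

  // Decode UTF-8 bytes back into a string.
  def fromUtf8(bytes: Array[Byte]): String =
    new String(bytes, StandardCharsets.UTF_8)

  // Render bytes as space-separated uppercase hex; the mask makes them unsigned.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02X").mkString(" ")
}
```

For example, `Utf8RoundTripSketch.hex(Utf8RoundTripSketch.toUtf8("\u2019"))` gives `E2 80 99`, the three-byte UTF-8 form of the right single quotation mark.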

+9

Just set the JVM's file.encoding property to UTF-8 as follows:

-Dfile.encoding=UTF-8

This makes UTF-8 the default encoding.

If you run your code with the scala command, use scala -Dfile.encoding=UTF-8.
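
To see which default the JVM actually picked up, the values can be printed at startup. This is a small sketch with a hypothetical object name; the output depends on the platform and on the flags passed to the JVM.

```scala
import java.nio.charset.Charset

object ShowDefaultEncoding {
  // Name of the charset the JVM uses when none is given explicitly.
  def defaultCharsetName: String = Charset.defaultCharset().name()

  // Value of the file.encoding system property, if it is set.
  def fileEncodingProperty: Option[String] =
    Option(System.getProperty("file.encoding"))

  def main(args: Array[String]): Unit = {
    println(s"default charset: $defaultCharsetName")
    println(s"file.encoding:   ${fileEncodingProperty.getOrElse("(not set)")}")
  }
}
```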

+4

Source: https://habr.com/ru/post/1542465/

