Convert a String to its Unicode code points

Assuming I have the string foo = "This is an apple"

Its Unicode code point equivalent would be

"\\x54\\x68\\x69\\x73.......... \\x61\\x70\\x70\\x6c\\x65"

  T his ............. apple

How do I convert from the String foo to the string "\\x54\\x68\\x69\\x73.......... \\x61\\x70\\x70\\x6c\\x65"?

2 answers

Try this:

    public static String generateUnicode(String input) {
        StringBuilder b = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            b.append(String.format("\\u%04x", (int) c));
        }
        return b.toString();
    }
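The question asks for `\x`-style escapes rather than `\u`. A minimal sketch of that variant follows, assuming every character fits in two hex digits (code points up to U+00FF). The class name `HexEscapes` and the method `toHexEscapes` are made up for illustration and do not come from the answer above.

```java
public class HexEscapes {

    // Hypothetical helper: renders each char as a backslash, an 'x',
    // and two lowercase hex digits.
    // Assumption: the input only contains chars up to 0xFF; anything larger
    // is rejected because two hex digits cannot represent it.
    public static String toHexEscapes(String input) {
        StringBuilder b = new StringBuilder(input.length() * 4);
        for (char c : input.toCharArray()) {
            if (c > 0xFF) {
                throw new IllegalArgumentException(
                        "char does not fit in two hex digits: U+" + Integer.toHexString(c));
            }
            b.append(String.format("\\x%02x", (int) c));
        }
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHexEscapes("This is an apple"));
    }
}
```

For "This is an apple" this should print `\x54\x68\x69\x73\x20\x69\x73\x20\x61\x6e\x20\x61\x70\x70\x6c\x65`. The capital T is 0x54; 0x74 would be a lowercase t.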

Here is a snippet of working code to convert:

    public class HexTest {

        public static void main(String[] args) {
            String testStr = "hello日本語 ";
            System.out.println(stringToUnicode3Representation(testStr));
        }

        private static String stringToUnicode3Representation(String str) {
            StringBuilder result = new StringBuilder();
            char[] charArr = str.toCharArray();
            for (int i = 0; i < charArr.length; i++) {
                // OR-ing with 0x10000 and dropping the leading "1" pads the hex value to four digits
                result.append("\\u").append(Integer.toHexString(charArr[i] | 0x10000).substring(1));
            }
            return result.toString();
        }
    }

This prints:

\u0068\u0065\u006c\u006c\u006f\u65e5\u672c\u8a9e\u0020

If you want to get rid of the leading zeros, you can adapt it as described here.

Here is another version of the conversion. Passing "This is an apple", you get

\u54\u68\u69\u73\u20\u69\u73\u20\u61\u6e\u20\u61\u70\u70\u6c\u65

using:

    private static String str2UnicodeRepresentation(String str) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            int cp = Character.codePointAt(str, i);
            int charCount = Character.charCount(cp);
            // A code point may take more than one char (a surrogate pair); skip the second char
            if (charCount == 2) {
                i++;
            }
            result.append(String.format("\\u%x", cp));
        }
        return result.toString();
    }
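The per-char and per-code-point approaches only differ for characters outside the Basic Multilingual Plane, which Java stores as a surrogate pair of two `char` values. A small sketch comparing the two follows. The class `EscapeComparison` and the helper names `perChar` and `perCodePoint` are hypothetical, and the code-point version uses `String.codePoints()` (Java 8 and later) instead of the manual index loop.

```java
import java.util.stream.Collectors;

public class EscapeComparison {

    // Per-char escaping: one zero-padded, four-digit escape per UTF-16 code unit.
    public static String perChar(String s) {
        StringBuilder b = new StringBuilder();
        for (char c : s.toCharArray()) {
            b.append(String.format("\\u%04x", (int) c));
        }
        return b.toString();
    }

    // Per-code-point escaping: one unpadded escape per Unicode code point.
    public static String perCodePoint(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("\\u%x", cp))
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        // U+1F600 is built from its numeric code point to avoid source-encoding issues.
        String s = "a" + new String(Character.toChars(0x1F600));
        System.out.println(perChar(s));
        System.out.println(perCodePoint(s));
    }
}
```

For the letter a followed by U+1F600, the per-char version should print `\u0061\ud83d\ude00` and the per-code-point version `\u61\u1f600`. The unpadded form is for display only, since a Java source escape requires exactly four hex digits after `\u`.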

Source: https://habr.com/ru/post/989407/

