I am currently writing a Java program that communicates with a Chrome extension, so I need to implement Chrome's Native Messaging protocol myself. The Google Chrome docs say:
... each message is serialized using JSON, UTF-8 encoded and is preceded with a 32-bit message length in native byte order. (Source)
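To illustrate the wire format described in that quote, here is a minimal sketch of how a complete message is framed, assuming the native byte order is little-endian (true on x86/x64, where Chrome runs in practice); the class name `Framing` is mine, not part of any API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class Framing {
    // Frame a JSON message: a 32-bit little-endian length prefix,
    // followed by the UTF-8 encoded JSON body.
    public static byte[] frame(String json) throws IOException {
        byte[] body = json.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(body.length)
                .array());
        out.write(body);
        return out.toByteArray();
    }
}
```

For example, `frame("{}")` produces the six bytes `02 00 00 00 7B 7D`: a length of 2, then the two JSON bytes.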
I tried to implement this in Java, but I run into problems once my messages reach certain lengths, even though my implementation looks correct to me. Here is my current implementation, based on earlier SO answers and questions (here):
// Read the message size from Chrome. This part works correctly.
public static int getInt(char[] bytes) {
    return (bytes[3] << 24) & 0xff000000 |
           (bytes[2] << 16) & 0x00ff0000 |
           (bytes[1] <<  8) & 0x0000ff00 |
           (bytes[0] <<  0) & 0x000000ff;
}

// Transform the length into the 32-bit message length prefix.
// This part works for small numbers, but does not work for length 2269, for example.
public static String getBytes(int length) {
    return String.format("%c%c%c%c",
        (char) ( length        & 0xFF),
        (char) ((length >>  8) & 0xFF),
        (char) ((length >> 16) & 0xFF),
        (char) ((length >> 24) & 0xFF));
}
The problem seems to lie in how Java handles characters. I expected plain single-byte characters, as in C. In practice, Java apparently sometimes turns these characters into multi-byte Unicode sequences (or at least that is my suspicion so far). This shows in the following output (piped through xxd to display the actual bytes) from the Java program for a length of 2269:
0000000: c39d 0800 00 .....
Expected Result (with python):
import struct
struct.pack('I', 2269)
# outputs in interactive mode: '\xdd\x08\x00\x00'
What exactly is going on here? Why does Java turn my 0xDD into 0xC3 0x9D, and how can I make getBytes produce the output that Chrome Native Messaging expects? Using a different language is not an option.
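My current suspicion, sketched in code: 2269 is 0x08DD, so the low length byte is 0xDD, and the `char` `\u00DD` is not ASCII. When that `char` travels through a `String` and a UTF-8 encoder (as `System.out.print` on a UTF-8 console does), it is expanded into the two-byte sequence C3 9D. A byte-based replacement that never touches `String` avoids this; the class and method names here are my own, and `FileOutputStream(FileDescriptor.out)` is one way to bypass `PrintStream`'s character encoding:

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class NativeMessagingFix {
    // Byte-based replacement for the String-returning getBytes:
    // each length byte stays a raw byte, so nothing is re-encoded.
    public static byte[] lengthBytes(int length) {
        return new byte[] {
            (byte) ( length        & 0xFF),
            (byte) ((length >>  8) & 0xFF),
            (byte) ((length >> 16) & 0xFF),
            (byte) ((length >> 24) & 0xFF)
        };
    }

    public static void main(String[] args) throws IOException {
        // The char \u00DD written as a String through a UTF-8 encoder
        // becomes the two bytes C3 9D...
        byte[] asUtf8 = "\u00DD".getBytes(StandardCharsets.UTF_8);

        // ...but written as a raw byte it stays DD. Write to the raw
        // stdout file descriptor instead of going through System.out:
        OutputStream raw = new FileOutputStream(FileDescriptor.out);
        raw.write(lengthBytes(2269)); // writes DD 08 00 00
        raw.flush();
    }
}
```

This matches the `struct.pack('I', 2269)` output above, since Python's `struct` also emits raw bytes rather than encoded characters.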