Convert jbyteArray to an array of characters and then print to the console

I am writing a JNI program where my .cpp file receives a jbyteArray, and I want to print its contents using printf. To do that, I believe I need to convert the jbyteArray to an array of characters.

For background, the Java side of my JNI code converts a String to a byte array, and this byte array is then passed as an argument to my JNI function.

What I have so far prints the string correctly, but it is followed by extra garbage characters, and I don't know how to get rid of them or whether I am doing something wrong.

Here is what the string is:

dsa 

and what prints on the console:

 dsa,  

The garbage characters change depending on what the String is. Here is the relevant part of the code:

.java file:

 public class tcr extends javax.swing.JFrame {
     static {
         System.loadLibrary("tcr");
     }

     public native int print(byte file1[]);
     .....
     String filex1 = data1TextField.getText(); // gets a file path as a String from a GUI JTextField
     byte file1[] = filex1.getBytes(); // convert the file path from String to byte array
     tcr t = new tcr();
     t.print(file1);
 }

.cpp code:

 JNIEXPORT jint JNICALL Java_tcr_print(JNIEnv *env, jobject thisobj, jbyteArray file1) {
     jboolean isCopy;
     jbyte* a = env->GetByteArrayElements(file1, &isCopy);
     char* b;
     b = (char*)a;
     printf("%s\n", b);
 }

Any help would be appreciated.

+6
2 answers

Look at what you are doing:

 jbyte* a = env->GetByteArrayElements(file1,&isCopy); 

a now points to the memory address where the bytes of the string are stored. Suppose the byte array contains the string "Hello world". In UTF-8 encoding, the bytes are:

48 65 6c 6c 6f 20 77 6f 72 6c 64

 char* b = (char*)a; 

b now points to the same area of memory. It is a char pointer, so you probably want to use it as a C string. However, that will not work. C strings are defined as a sequence of bytes ending with a null byte. Look at the bytes above and you will see that there is no null byte at the end of this string.

 printf("%s\n",b); 

Here is the problem. You pass a char pointer to printf with %s, which tells printf that it is a C string. It is not a C string, but printf still tries to print every character until it reaches a zero byte. So what you see after dsa are actually bytes from the memory that follows the end of the byte array, up to the first null byte that happens to be there. You can fix this by copying the bytes to a buffer that is one byte longer than the byte array and then setting the last element to zero.

UPDATE:

You can create a larger buffer and add a null byte as follows:

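 jsize textLength = env->GetArrayLength(file1); // the array has no terminator, so ask JNI for its length instead of calling strlen
 char* b = (char*)malloc(textLength + 1); // one extra byte for the terminator
 memcpy(b, a, textLength);
 b[textLength] = '\0';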

Now b is a valid null-terminated C string. Also, be sure to call ReleaseByteArrayElements; you can do this right after the memcpy call. Free b with free() once you are done with it.

+7

A jbyteArray is actually a very good way to pass a Java string through JNI. It lets you easily convert the string to the character set and encoding required by the libraries, files, and devices that you use on the C++ side.

Make sure you understand "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)"

A Java String uses the Unicode character set and the UTF-16 encoding (with platform-dependent byte order).

String.getBytes() converts to the platform's default encoding. In doing so, it makes an assumption about the character set and encoding you need, and about what to do with characters that are not in the target character set. You can use the other overloads of String.getBytes or the Charset methods if you want to control these things explicitly.

When deciding which character set and encoding to use, consider that Unicode has been the basis of the main string type in Java, .NET, VB, ... for a couple of decades; it is also used in compiler source files for Java, ..., and generally on the web. Of course, you may be constrained by whatever you need to interoperate with.

Now, it seems that the problem you are facing is either that the target character set does not contain the characters in your Java string, so a substitution character is used, or that the console you use does not display them properly.

The console (or any application with a user interface) obviously has to pick a font to display characters with. Typical fonts do not cover the million or so code points available in Unicode. You can change the configuration of your console (or use a different one). For example, on Windows you can use cmd.exe or Windows PowerShell. In cmd.exe you can change the font and use chcp to change the active code page (character set).

UPDATE:

As @main-- points out, if you use a function that expects a null-terminated string, you have to supply the terminator yourself, usually by copying the array, since the JVM retains ownership of the array. That is the actual cause of the behavior in this case. But everything above still matters.

+2

Source: https://habr.com/ru/post/948795/

