Java strings are UTF-16. Text in any other encoding exists only as a byte sequence. To decode character data correctly, you must supply the encoding at the moment you create the string from those bytes. If you already have a corrupted string, it is too late.
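As a rough illustration of why the encoding has to be supplied at decode time, the sketch below decodes the same byte twice. The class and method names are made up for this example; only the String(byte[],Charset) constructor and StandardCharsets come from the JDK.

```java
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // 0xE9 is 'é' in ISO-8859-1; decoding with the right charset keeps it intact.
    static String decodeCorrectly() {
        byte[] data = { (byte) 0xE9 };
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // A lone 0xE9 is malformed UTF-8, so the constructor substitutes U+FFFD.
    // The original byte can no longer be recovered from the resulting string.
    static String decodeWrongly() {
        byte[] data = { (byte) 0xE9 };
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decodeCorrectly().equals("\u00E9"));
        System.out.println(decodeWrongly().contains("\uFFFD"));
    }
}
```

Once the replacement character is in the string, re-encoding it cannot bring the original byte back, which is what "too late" means here.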
Assuming ID3, the specifications define the encoding rules. For example, ID3v2.4.0 can restrict the permitted encodings through a flag in the extended header:
q - Text encoding restrictions
    0    No restrictions
    1    Strings are only encoded with ISO-8859-1 [ISO-8859-1] or UTF-8 [UTF-8].
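Checking that flag could look roughly like the sketch below. It assumes the restrictions byte of the extended header is laid out as %ppqrrstt, which would put q at bit 5; verify this against the ID3v2.4.0 structure document before relying on it. The class and method names are hypothetical.

```java
class RestrictionBits {
    // Assumption: restrictions byte layout is %ppqrrstt, so q has mask 0x20.
    static final int TEXT_ENCODING_MASK = 0x20;

    // Returns true when strings are limited to ISO-8859-1 or UTF-8.
    static boolean textEncodingRestricted(int restrictionsByte) {
        return (restrictionsByte & TEXT_ENCODING_MASK) != 0;
    }
}
```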
Encoding handling is defined further on in the same document:
If nothing else is said, strings, including numeric strings and URLs, are represented as ISO-8859-1 characters in the range $20 - $FF. Such strings are represented in frame descriptions as <text string>, or <full text string> if newlines are allowed. If nothing else is said, newline characters are forbidden. In ISO-8859-1 a newline is represented, when allowed, with $0A only.
Frames that allow different types of text encoding contain a text encoding description byte. Possible encodings:
$00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
$01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All strings in the same frame SHALL have the same byteorder. Terminated with $00 00.
$02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM. Terminated with $00 00.
$03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
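One way to map these four encoding bytes onto JDK charsets is sketched below. The class, field, and method names are illustrative only. The sketch relies on Java's UTF-16 charset reading a leading BOM to pick the byte order, which matches what $01 asks for, and it expects the terminator to have been stripped already.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Id3Charsets {
    // Index = ID3v2.4.0 text encoding byte ($00 to $03).
    static final Charset[] BY_ENCODING_BYTE = {
        StandardCharsets.ISO_8859_1,
        StandardCharsets.UTF_16,   // byte order taken from the BOM
        StandardCharsets.UTF_16BE, // no BOM expected
        StandardCharsets.UTF_8
    };

    // Decodes text bytes (terminator already removed) for the given encoding byte.
    static String decode(int encodingByte, byte[] text) {
        if (encodingByte < 0 || encodingByte >= BY_ENCODING_BYTE.length) {
            throw new IllegalArgumentException("Unknown encoding byte: " + encodingByte);
        }
        return new String(text, BY_ENCODING_BYTE[encodingByte]);
    }
}
```

For instance, the bytes FF FE 41 00 with encoding byte $01 and the bytes 00 41 with encoding byte $02 should both decode to the same one-letter string.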
Use transcoding classes such as InputStreamReader or (more likely in this case) the String(byte[],Charset) constructor to decode the data. See also Java: An Approximate Guide to Character Encoding.
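For the stream-based route, a minimal sketch with InputStreamReader might look like this. ReaderDemo and its methods are hypothetical helpers written for this example; the point is only that the charset is passed explicitly rather than left to the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class ReaderDemo {
    // Reads the whole stream as text, decoding with the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Convenience wrapper for data that is already in memory.
    static String decodeBytes(byte[] data, Charset charset) {
        return readAll(new ByteArrayInputStream(data), charset);
    }
}
```

A reader suits long or unbounded text. For short, terminated fields such as ID3 strings, collecting the bytes first and calling the String constructor is usually simpler.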
Parsing the string components of an ID3v2.4.0 data structure would look something like this:
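    // untested code
    import java.io.ByteArrayOutputStream;
    import java.io.DataInputStream;
    import java.io.IOException;
    import java.util.Arrays;

    public String parseID3String(DataInputStream in) throws IOException {
        String[] encodings = { "ISO-8859-1", "UTF-16", "UTF-16BE", "UTF-8" };
        // The first byte selects the text encoding ($00 to $03).
        int index = in.read();
        if (index < 0 || index >= encodings.length) {
            throw new IOException("Unknown text encoding byte: " + index);
        }
        String encoding = encodings[index];
        // UTF-16 variants end with $00 00; the other encodings end with $00.
        byte[] terminator = encoding.startsWith("UTF-16") ? new byte[2] : new byte[1];
        byte[] buf = terminator.clone();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        while (true) {
            in.readFully(buf);
            if (Arrays.equals(terminator, buf)) {
                break; // keep the terminator out of the decoded text
            }
            buffer.write(buf);
        }
        return new String(buffer.toByteArray(), encoding);
    }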