Decoding a UTF-8 encoded email subject?

I have a string in this form: =?utf-8?B?zr...

I want to get the file name back as correctly decoded UTF-8 text. Is there a library method available on Maven Central that will do this decoding for me, or will I need to match the pattern and decode the Base64 manually?

3 answers

In MIME terminology, these encoded pieces are called encoded words. Take a look at javax.mail.internet.MimeUtility.decodeText in JavaMail. The decodeText method decodes all encoded words in a string.

You can pull it in with Maven using

  <dependency>
    <groupId>javax.mail</groupId>
    <artifactId>mail</artifactId>
    <version>1.4.4</version>
  </dependency>

MimeUtility.decodeText works for me, for example:

 // decodeText can throw java.io.UnsupportedEncodingException
 String decoded = MimeUtility.decodeText("=?UTF-8?B?4K6q4K+N4K6q4K+K4K604K6/4K614K+BIQ==?=");

You can also use javax.mail.internet.MimeUtility.decodeWord(), which decodes a single encoded word.

On the other hand, if you use JavaMail to parse your emails, you don't have to worry about parsing the subject at all, nor about the syntax of the MIME body (attachments).

By the way, the encoding doesn't have to be Base64 (typical of Apple clients); it can also be Quoted-Printable (typical of the MS Outlook client).

Thunderbird uses whichever encoding is shorter (Base64 for Japanese, QP for most European languages).

If you really want to implement it yourself, look at RFC 2047 and RFC 2184. There are a few subtleties to handle, such as a header split into encoded words that use two different character sets, or joining adjacent encoded words by dropping the whitespace that separates them.
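
For anyone who does go the manual route, here is a minimal sketch of an RFC 2047 decoder that uses only the JDK. The class name EncodedWordSketch and its methods are hypothetical names made up for this illustration; they are not part of JavaMail. It handles both B (Base64) and Q (Quoted-Printable style) encoded words, decodes each word with its own character set, and drops whitespace between adjacent encoded words. It deliberately ignores the RFC 2184 language suffix (for example utf-8*en) and does not repair multi-byte characters that a sender has split across two encoded words.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a JavaMail class.
final class EncodedWordSketch {
    // Matches =?charset?encoding?encoded-text?= where encoding is B or Q.
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?\\s]+)\\?([bBqQ])\\?([^?\\s]*)\\?=");

    static String decode(String header) {
        StringBuilder out = new StringBuilder();
        Matcher m = ENCODED_WORD.matcher(header);
        int last = 0;
        boolean previousWasEncoded = false;
        while (m.find()) {
            String gap = header.substring(last, m.start());
            // Whitespace that only separates two encoded words is dropped.
            if (!(previousWasEncoded && gap.trim().isEmpty())) {
                out.append(gap);
            }
            // Each encoded word carries its own charset.
            // Charset.forName throws for names the JVM does not know.
            Charset charset = Charset.forName(m.group(1));
            byte[] bytes = m.group(2).equalsIgnoreCase("B")
                    ? Base64.getDecoder().decode(m.group(3))
                    : decodeQ(m.group(3));
            out.append(new String(bytes, charset));
            last = m.end();
            previousWasEncoded = true;
        }
        out.append(header.substring(last)); // trailing plain text, if any
        return out.toString();
    }

    // Q encoding: '_' means space, "=XX" is a hex-encoded byte.
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '=' && i + 2 < text.length()) {
                int hi = Character.digit(text.charAt(i + 1), 16);
                int lo = Character.digit(text.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    buf.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            buf.write(c == '_' ? ' ' : c);
        }
        return buf.toByteArray();
    }
}
```

Calling EncodedWordSketch.decode on a raw header value returns the decoded string, with plain text outside encoded words passed through unchanged. Malformed Base64 makes Base64.getDecoder() throw IllegalArgumentException, so real code would need to decide how to handle broken input; MimeUtility.decodeText remains the less error-prone choice.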


Source: https://habr.com/ru/post/898166/

