Parsing multipart/mixed with a multipart/alternative body in Java

I receive emails from a client where they insert a multipart/alternative message inside a multipart/mixed message. When I get the body of the message, it only returns the multipart/alternative level, when what I really want is the text/html part contained in the multipart/alternative.

I looked through the Javadocs for javax.mail and I cannot find an easy way to get the body of a BodyPart that is itself a multipart, or to skip the first multipart/mixed part and go into the multipart/alternative body to read the text/html and text/plain pieces.

The email structure is as follows:

    ...
    Content-Type: multipart/mixed; boundary="----=_Part_19487_1145362154.1418138792683"

    ------=_Part_19487_1145362154.1418138792683
    Content-Type: multipart/alternative; boundary="----=_Part_19486_1391901275.1418138792683"

    ------=_Part_19486_1391901275.1418138792683
    Content-Transfer-Encoding: 7bit
    Content-Type: text/plain; charset=ISO-8859-1

    ...
    ------=_Part_19486_1391901275.1418138792683
    Content-Transfer-Encoding: 7bit
    Content-Type: text/html; charset=ISO-8859-1

    ...
    ------=_Part_19486_1391901275.1418138792683--
    ------=_Part_19487_1145362154.1418138792683--

This is an outline of the code used to parse the emails:

    Message[] found = fldr.search(searchCondition);
    for (int i = 0; i < found.length; i++) {
        Message m = found[i];
        Object o = m.getContent();
        if (o instanceof Multipart) {
            log.info("**This is a Multipart Message. ");
            Multipart mp = (Multipart)o;
            log.info("The Multipart message has " + mp.getCount() + " parts.");
            for (int j = 0; j < mp.getCount(); j++) {
                BodyPart b = mp.getBodyPart(j);

                // If the content type is multipart, get the content of that part,
                // make it the new container and restart the loop in that part of the message.
                if (b.getContentType().contains("multipart")) {
                    mp = (Multipart)b.getContent();
                    j = 0;
                    continue;
                }

                log.info("This content type is " + b.getContentType());

                if (!b.getContentType().contains("text/html")) {
                    continue;
                }

                Object o2 = b.getContent();
                if (o2 instanceof String) {
                    <do things with content here>
                }
            }
        }
    }

It seems to stop at the second boundary and not parse anything further. In the case of the above message, it stops at boundary="----=_Part_19486_1391901275.1418138792683" and never gets to any of the message text.

2 answers

In this block:

    if (b.getContentType().contains("multipart")) {
        mp = (Multipart)b.getContent();
        j = 0;
        continue;
    }

You set j to 0 and ask the loop to continue, hoping that it starts again from the beginning. But the j++ increment still runs before the next iteration, so your loop resumes at 1, not 0, and the first part of the new container is skipped.
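The effect is easy to reproduce without any mail classes. The following is a minimal sketch, not the asker's code: the class `LoopRestartDemo` and the method `visited` are hypothetical names, and the "restart" is triggered once at index 1 purely for illustration. It records which indices the loop body actually sees for a given restart value.

```java
import java.util.ArrayList;
import java.util.List;

final class LoopRestartDemo {

    // Records every index the loop body sees. At index 1 the loop "restarts"
    // once by assigning restartValue to j and calling continue.
    static List<Integer> visited(int count, int restartValue) {
        List<Integer> seen = new ArrayList<>();
        boolean restarted = false;
        for (int j = 0; j < count; j++) {
            seen.add(j);
            if (j == 1 && !restarted) {
                restarted = true;
                j = restartValue;
                continue; // the j++ update still runs before the next iteration
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visited(3, 0));  // prints [0, 1, 1, 2]
        System.out.println(visited(3, -1)); // prints [0, 1, 0, 1, 2]
    }
}
```

With a restart value of 0, index 0 is never revisited after the restart; with -1 the increment brings j back to 0 first.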

Set j to -1 to solve your problem.

    if (b.getContentType().contains("multipart")) {
        mp = (Multipart)b.getContent();
        j = -1;
        continue;
    }
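Note that replacing mp inside the loop also means the loop never returns to the remaining parts of the outer container once the nested one is exhausted. A recursive walk avoids both the index bookkeeping and that limitation. The sketch below only illustrates the shape of the recursion: `Node`, `findFirst` and `sampleMessage` are hypothetical stand-ins, not javax.mail types. With javax.mail, the container branch would presumably correspond to a part whose getContent() returns a Multipart, iterated with getCount() and getBodyPart(), as in the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class NestedPartSearch {

    // Hypothetical stand-in for a MIME part: a leaf carries text,
    // a container carries child parts. Not a javax.mail type.
    static final class Node {
        final String contentType;
        final String text;         // null for containers
        final List<Node> children; // empty for leaves

        private Node(String contentType, String text, List<Node> children) {
            this.contentType = contentType;
            this.text = text;
            this.children = children;
        }

        static Node leaf(String contentType, String text) {
            return new Node(contentType, text, Collections.<Node>emptyList());
        }

        static Node container(String contentType, Node... children) {
            return new Node(contentType, null, Arrays.asList(children));
        }
    }

    // Depth-first search for the first leaf whose content type starts with
    // wantedType (expected in lowercase, e.g. "text/html").
    static Optional<String> findFirst(Node node, String wantedType) {
        String type = node.contentType.toLowerCase(Locale.ROOT);
        if (type.startsWith("multipart/")) {
            for (Node child : node.children) {
                Optional<String> hit = findFirst(child, wantedType);
                if (hit.isPresent()) {
                    return hit;
                }
            }
            return Optional.empty();
        }
        return type.startsWith(wantedType) ? Optional.ofNullable(node.text) : Optional.empty();
    }

    // Mirrors the structure in the question: mixed -> alternative -> plain + html.
    static Node sampleMessage() {
        return Node.container("multipart/mixed",
                Node.container("multipart/alternative",
                        Node.leaf("text/plain; charset=ISO-8859-1", "plain body"),
                        Node.leaf("text/html; charset=ISO-8859-1", "<p>html body</p>")));
    }

    public static void main(String[] args) {
        System.out.println(findFirst(sampleMessage(), "text/html").orElse("not found"));
    }
}
```

Because each container is handled by its own call, any nesting depth works and sibling parts after a nested multipart are still visited.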

I tested your code and it did not work for me.

In my case, b.getContentType() returns the type in uppercase (for example, "TEXT/HTML; charset=UTF-8"). So I converted it to lowercase and it worked.

    String contentType = b.getContentType().toLowerCase(Locale.ENGLISH);
    if (!contentType.contains("text/html")) {
        continue;
    }
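A slightly stricter variant compares only the media type and ignores both case and parameters such as charset. The sketch below is a self-contained illustration; `ContentTypeCheck` and `hasMediaType` are hypothetical names, and the input is assumed to be the raw string returned by getContentType(). JavaMail's Part interface also offers isMimeType(String), which is documented to compare the primary and sub type only; that may be the simpler choice if it behaves as expected in your library version.

```java
final class ContentTypeCheck {

    // True when the media type of a Content-Type header value equals wanted,
    // ignoring case and any parameters after the first ';'.
    static boolean hasMediaType(String headerValue, String wanted) {
        if (headerValue == null) {
            return false;
        }
        int semicolon = headerValue.indexOf(';');
        String mediaType = (semicolon >= 0
                ? headerValue.substring(0, semicolon)
                : headerValue).trim();
        return mediaType.equalsIgnoreCase(wanted);
    }

    public static void main(String[] args) {
        System.out.println(hasMediaType("TEXT/HTML; charset=UTF-8", "text/html"));       // prints true
        System.out.println(hasMediaType("text/plain; charset=ISO-8859-1", "text/html")); // prints false
    }
}
```

Unlike contains("text/html"), this does not match a header that merely mentions the string somewhere in a parameter.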

Source: https://habr.com/ru/post/979354/