Apache POI - Docx Output Problem

I am evaluating Apache POI for writing docx files. Specifically, I need to generate content in the docx file in different languages (e.g. Hindi / Marathi). I ran into the following problem:

When the docx file is written, the Hindi / Marathi text is displayed as square boxes, even though the font "Arial Unicode MS" supports it. When I inspect the boxes, MS Word reports the font as "Calibri", although I explicitly set the font to "Arial Unicode MS". If I select the boxes in MS Word and then change the font to "Arial Unicode MS", the Hindi / Marathi words display correctly. Any idea why this is happening? Please note that I am using a development version of POI, because the previous stable version does not support setting font families. Here is the source:

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class CreateDocumentFromScratch {

    public static void main(String[] args) {
        XWPFDocument document = new XWPFDocument();

        XWPFParagraph paragraphTwo = document.createParagraph();
        XWPFRun paragraphTwoRunOne = paragraphTwo.createRun();
        paragraphTwoRunOne.setFontFamily("Arial Unicode MS");
        paragraphTwoRunOne.setText("नसल्यास");

        XWPFParagraph paragraphThree = document.createParagraph();
        XWPFRun paragraphThreeRunOne = paragraphThree.createRun();
        paragraphThreeRunOne.setFontFamily("Arial Unicode MS");
        paragraphThreeRunOne.setText("This is nice");

        FileOutputStream outStream = null;
        try {
            outStream = new FileOutputStream("c:/will/First.doc");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }

        try {
            document.write(outStream);
            outStream.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Any help would be appreciated.

1 answer

Resurrecting a very old post: can the OP confirm which version of MS Office is being used? The problem appears with MS Office 2003 running on Windows XP, but it may also occur on later versions of the OS.

It looks like MS Word uses the Mangal font for Hindi script [encoding standard: Indic: Hindi ISCII 57002 (Devanagari)]. The following link explains this:

https://support.office.com/en-ca/article/Choose-text-encoding-when-you-open-and-save-files-60d59c21-88b5-4006-831c-d536d42fd861
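Since Word appears to choose a separate font for Devanagari text, it can help to know which strings actually contain such characters before deciding which runs need a font like Mangal. Below is a minimal sketch that uses only the JDK and does not touch POI. The class name ScriptCheck and the method containsDevanagari are hypothetical names chosen for illustration.

```java
public class ScriptCheck {

    // Returns true if at least one code point in the text belongs to the
    // Devanagari script, which is used to write Hindi and Marathi.
    public static boolean containsDevanagari(String text) {
        return text.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp)
                        == Character.UnicodeScript.DEVANAGARI);
    }

    public static void main(String[] args) {
        // The Marathi word from the question, written as Unicode escapes
        // so the source file stays ASCII-only.
        String marathi = "\u0928\u0938\u0932\u094D\u092F\u093E\u0938";
        System.out.println(containsDevanagari(marathi));        // true
        System.out.println(containsDevanagari("This is nice")); // false
    }
}
```

A check like this only identifies the affected text. Which font Word then applies still depends on the document and on the language support installed on the machine.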

Suggested workaround: in the Windows XP Control Panel, open Regional and Language Options, select the Languages tab, and check the box "Install files for complex script and right-to-left languages (including Thai)".

Reboot the computer.

However, the problem did not occur when opening the file with LibreOffice 4.3.5.2 on Windows or LibreOffice 4.2.7.2 on Linux (Ubuntu).

The following libraries were used: poi-3.10-FINAL-20140208.jar, poi-ooxml-3.10-FINAL-20140208.jar, poi-ooxml-schemas-3.10-FINAL-20140208.jar, xmlbeans-2.3.0.jar, dom4j-1.6.1.jar, stax-api-1.0.1.jar.
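To see which fonts the generated file really declares, one option is to look at the w:rFonts elements in word/document.xml inside the docx, which is a zip archive. In the OOXML format this element has separate attributes for different character ranges (w:ascii, w:hAnsi, w:eastAsia, w:cs), and w:cs is the one intended for complex scripts such as Devanagari. Whether the POI version in use fills only some of these attributes is an assumption that this check could confirm or rule out. A rough sketch using only the JDK follows. The class name RunFontSlots and the method rFontsAttributes are hypothetical, and the regular expression is a shortcut, not a full XML parser.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class RunFontSlots {

    // Matches each <w:rFonts ...> element and captures its attribute text.
    private static final Pattern R_FONTS =
            Pattern.compile("<w:rFonts\\b([^>]*?)/?>");

    // Returns the raw attribute text of every w:rFonts element in the XML.
    public static List<String> rFontsAttributes(String documentXml) {
        List<String> found = new ArrayList<>();
        Matcher m = R_FONTS.matcher(documentXml);
        while (m.find()) {
            found.add(m.group(1).trim());
        }
        return found;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java RunFontSlots <file.docx>");
            return;
        }
        // A docx file is a zip archive; the body text lives in word/document.xml.
        try (ZipFile zip = new ZipFile(args[0])) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                System.err.println("word/document.xml not found");
                return;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                String xml = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                rFontsAttributes(xml).forEach(System.out::println);
            }
        }
    }
}
```

If the printed attributes list "Arial Unicode MS" only for w:ascii and nothing for w:cs, that would be consistent with Word falling back to another font for the Hindi / Marathi run.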


Source: https://habr.com/ru/post/908324/

