How to read the font size of each word in a Word document using POI?

I am trying to find out whether a Word document contains any text with a font size of 2. However, I have not been able to do this. To begin with, I tried to read the font of each word in a sample Word document that has only one line and 7 words. I am not getting the correct results.

Here is my code:

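    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.extractor.WordExtractor;
    import org.apache.poi.hwpf.usermodel.CharacterRun;
    import org.apache.poi.hwpf.usermodel.Paragraph;
    import org.apache.poi.hwpf.usermodel.Range;

    HWPFDocument doc = new HWPFDocument(fileStream);
    WordExtractor we = new WordExtractor(doc);
    Range range = doc.getRange();
    String[] paragraphs = we.getParagraphText();
    for (int i = 0; i < paragraphs.length; i++) {
        Paragraph pr = range.getParagraph(i);
        int k = 0;
        // Walk the character runs until one ends where the paragraph ends.
        while (true) {
            CharacterRun run = pr.getCharacterRun(k++);
            System.out.println("Color: " + run.getColor());
            System.out.println("Font: " + run.getFontName());
            System.out.println("Font Size: " + run.getFontSize());
            if (run.getEndOffset() == pr.getEndOffset())
                break;
        }
    }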

However, the above code always doubles the font size. That is, if the actual font size in the document is 12, it prints 24, and if the actual font size is 8, it prints 16.

Is this the right way to read the font size from a Word document?

1 answer

Yes, that is the right way; the size is measured in half-points.

In docx you will have something like:

 <w:rPr> <w:sz w:val="28" /> </w:rPr> 

The ECMA-376 specification defines the unit for @sz as ST_HpsMeasure (Measurement in Half-Points).
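To make the unit concrete, here is a minimal sketch that reads the w:sz value from a run-properties fragment like the one above and converts it to points. It uses only the JDK's DOM parser, not POI. The class name RunSizeReader and the method sizeInPoints are hypothetical names chosen for illustration, and the sketch assumes w:val holds a plain integer number of half-points.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
public class RunSizeReader {
    // Namespace bound to the "w" prefix in WordprocessingML.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the font size in points of the first w:sz element found. */
    public static double sizeInPoints(String rPrXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document dom = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rPrXml)));
            NodeList sizes = dom.getElementsByTagNameNS(W_NS, "sz");
            if (sizes.getLength() == 0) {
                throw new IllegalArgumentException("no w:sz element found");
            }
            Element sz = (Element) sizes.item(0);
            // Assumption: w:val is a plain integer count of half-points.
            int halfPoints = Integer.parseInt(sz.getAttributeNS(W_NS, "val"));
            return halfPoints / 2.0;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse run properties", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:rPr xmlns:w=\"" + W_NS + "\"><w:sz w:val=\"28\"/></w:rPr>";
        System.out.println(sizeInPoints(xml) + " pt");
    }
}
```

For the fragment above, a w:val of 28 corresponds to 14 pt.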

The same is true of the binary doc format, which HWPF supports. If you look at [MS-DOC], you will see that it also specifies the text size in half-points.
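So to get the size that Word displays, halve the value that HWPF reports. Below is a small sketch of that conversion; FontSizes and halfPointsToPoints are hypothetical names, and the sample values stand in for what run.getFontSize() would return. It deliberately avoids POI imports so that it compiles on its own.

```java
// Hypothetical helper, for illustration only.
public class FontSizes {
    /** Converts a size in half-points, as stored in Word files, to points. */
    public static double halfPointsToPoints(int halfPoints) {
        // Divide by 2.0 rather than 2 so that sizes such as 10.5 pt survive.
        return halfPoints / 2.0;
    }

    public static void main(String[] args) {
        // Stand-ins for values returned by run.getFontSize() in the question.
        int[] reported = {24, 16, 21};
        for (int halfPoints : reported) {
            System.out.println(halfPoints + " half-points = "
                    + halfPointsToPoints(halfPoints) + " pt");
        }
    }
}
```

In the loop from the question, this would mean printing halfPointsToPoints(run.getFontSize()) instead of the raw value. Returning a double rather than an int keeps fractional sizes intact.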


Source: https://habr.com/ru/post/949181/

