HTML formatted cell value from Excel using Apache POI

I am using Apache POI to read an Excel document. For the most part it serves my purpose. The one thing I am stuck on is retrieving a cell value as HTML.

I have one cell in which the user enters some text and applies formatting to it (e.g. bullets, numbering, bold, italic).

So when I read it, the content should come back as HTML rather than as the plain string that POI returns.

I have gone through almost the entire POI API but could not find anything for this. I want to keep the formatting for only one column, not the whole workbook. By column, I mean the text entered in the cells of that column. I want that text as HTML.

I have also looked into and tried Apache Tika. However, as far as I understand, it can only extract the text, not the text formatting.

Please help me. I'm running out of options.

Suppose I wrote "My name is Angel and Demon" in Excel, with "Angel" in bold and "Demon" in italic.

The output I need to get in Java is: My name is <b>Angel</b> and <i>Demon</i>

1 answer

I pasted this as Unicode text into cell A1 of the Excel file:

 <html><p>This is a test. Will this text be <b>bold</b> or <i>italic</i></p></html> 

Excel renders this HTML as the following text, with "bold" in bold and "italic" in italic:

This is a test. Will this text be bold or italic

My code is:

    import java.io.FileInputStream;
    import java.io.IOException;

    import org.apache.poi.ss.util.CellReference;
    import org.apache.poi.xssf.usermodel.XSSFCell;
    import org.apache.poi.xssf.usermodel.XSSFFont;
    import org.apache.poi.xssf.usermodel.XSSFRichTextString;
    import org.apache.poi.xssf.usermodel.XSSFRow;
    import org.apache.poi.xssf.usermodel.XSSFSheet;
    import org.apache.poi.xssf.usermodel.XSSFWorkbook;

    public class ExcelWithHtml {

        // Cell A1 was filled by pasting:
        // <html><p>This is a test. Will this text be <b>bold</b> or
        // <i>italic</i></p></html>
        public static void main(String[] args) throws IOException {
            new ExcelWithHtml().readFirstCellOfXSSF("/Users/rcacheira/testeHtml.xlsx");
        }

        public void readFirstCellOfXSSF(String filePathName) throws IOException {
            // try-with-resources closes the stream and the workbook even on failure
            try (FileInputStream fis = new FileInputStream(filePathName);
                 XSSFWorkbook wb = new XSSFWorkbook(fis)) {
                XSSFSheet sheet = wb.getSheetAt(0);
                String cellHtml = getHtmlFormatedCellValueFromSheet(sheet, "A1");
                System.out.println(cellHtml);
            }
        }

        public String getHtmlFormatedCellValueFromSheet(XSSFSheet sheet, String cellName) {
            CellReference cellReference = new CellReference(cellName);
            XSSFRow row = sheet.getRow(cellReference.getRow());
            XSSFCell cell = row.getCell(cellReference.getCol());
            XSSFRichTextString cellText = cell.getRichStringCellValue();
            String text = cellText.getString();

            // A cell without any rich-text formatting has no runs: return it as is.
            if (cellText.numFormattingRuns() == 0) {
                return text;
            }

            StringBuilder htmlCode = new StringBuilder();
            boolean inBold = false;
            boolean inItalic = false;
            for (int i = 0; i < cellText.numFormattingRuns(); i++) {
                // The font may be null for a run that has no explicit formatting.
                XSSFFont font = cellText.getFontOfFormattingRun(i);
                boolean bold = font != null && font.getBold();
                boolean italic = font != null && font.getItalic();

                // On a style change, close the open tags and open the new ones,
                // so that the tags always stay properly nested.
                if (bold != inBold || italic != inItalic) {
                    htmlCode.append(closeTags(inBold, inItalic));
                    htmlCode.append(openTags(bold, italic));
                    inBold = bold;
                    inItalic = italic;
                }

                int indexStart = cellText.getIndexOfFormattingRun(i);
                int indexEnd = indexStart + cellText.getLengthOfFormattingRun(i);
                htmlCode.append(text, indexStart, indexEnd);
            }
            htmlCode.append(closeTags(inBold, inItalic));
            return htmlCode.toString();
        }

        private static String openTags(boolean bold, boolean italic) {
            return (bold ? "<b>" : "") + (italic ? "<i>" : "");
        }

        // Closes in the reverse order of openTags.
        private static String closeTags(boolean bold, boolean italic) {
            return (italic ? "</i>" : "") + (bold ? "</b>" : "");
        }
    }

My output:

 This is a test. Will this text be <b>bold</b> or <i>italic</i> 

I think this is what you want. It only shows what is possible; I did not follow best coding practices, I just wrote it quickly to produce the output.
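The core of the approach above, turning a sequence of formatting runs into HTML tags, can also be sketched without any POI dependency, which makes it easy to test in isolation. In the minimal sketch below, `RunsToHtml` and `Run` are hypothetical names invented for illustration, not part of Apache POI. It also escapes `&`, `<` and `>` in the cell text, which the code above does not do, and it handles only bold and italic.

```java
import java.util.List;

/** Hypothetical helper (not part of Apache POI): converts formatting runs to HTML. */
final class RunsToHtml {

    /** One stretch of text with uniform formatting; a stand-in for a POI formatting run. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    static String toHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder();
        boolean inBold = false;
        boolean inItalic = false;
        for (Run run : runs) {
            if (run.bold != inBold || run.italic != inItalic) {
                // Close in reverse order of opening so the tags stay properly nested.
                if (inItalic) html.append("</i>");
                if (inBold) html.append("</b>");
                if (run.bold) html.append("<b>");
                if (run.italic) html.append("<i>");
                inBold = run.bold;
                inItalic = run.italic;
            }
            html.append(escape(run.text));
        }
        // Close whatever is still open after the last run.
        if (inItalic) html.append("</i>");
        if (inBold) html.append("</b>");
        return html.toString();
    }

    /** Escapes the characters that would otherwise be read as HTML markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With POI, each `Run` would presumably be built from the substring of a formatting run plus the `getBold()` and `getItalic()` values of the font returned by `getFontOfFormattingRun(i)`, treating a null font as unformatted text. Bullets and numbering are not covered by this sketch.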


Source: https://habr.com/ru/post/1481394/

