Searching docx files in Java

I am writing an application that searches the content of documents. I have already written code that searches plain-text files edited in Notepad.

I also want to do the same for docx files. After some research, I found these two options.

  • http://www.infoq.com/articles/cracking-office-2007-with-java This method requires me to unzip the docx file and then search the XML files inside it. The extraction step adds overhead and, frankly, I don't know how to process the XML (discarding attribute content, etc.).

  • http://www.javadocx.com/download This method lets me import a jar library into my project. It looks like I can create docx files with it, but I don't understand how to open existing docx files with it.

Can someone recommend an alternative way to do this, or help with either of the two methods above?

1 answer

Try Apache Tika (http://tika.apache.org/), docx4j, or Apache POI. Each of them can open a docx file and extract its text.
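
For comparison, the unzip-and-parse approach from the question can be sketched with the JDK alone, which may help when judging whether a library is worth adding. A docx file is a ZIP archive; Word normally stores the body in the entry word/document.xml, with the visible text inside w:t elements. The class name DocxTextSearch and its method names are made up for illustration. The sketch assumes the common (transitional) WordprocessingML namespace and ignores headers, footers, footnotes and comments, which live in other entries.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls the body text out of a .docx without unpacking it to disk.
class DocxTextSearch {
    // Namespace of transitional WordprocessingML (assumption: not a "strict" OOXML file).
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    // Assumption: the main document part sits at its conventional location.
    private static final String MAIN_PART = "word/document.xml";

    /** Streams through the ZIP entries and returns the text of the main document part. */
    static String extractText(InputStream docx) throws IOException, XMLStreamException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (MAIN_PART.equals(e.getName())) {
                    return readRuns(zip);
                }
            }
        }
        throw new IOException("No " + MAIN_PART + " entry; not a .docx file?");
    }

    /** Keeps only character data inside w:t; all attributes and other markup are skipped. */
    private static String readRuns(InputStream xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Untrusted documents: do not process DTDs or external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        StringBuilder text = new StringBuilder();
        boolean inTextRun = false;
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (isW(reader, "t")) inTextRun = true;
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        if (inTextRun) text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (isW(reader, "t")) inTextRun = false;
                        else if (isW(reader, "p")) text.append('\n'); // one line per paragraph
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    private static boolean isW(XMLStreamReader r, String localName) {
        return W_NS.equals(r.getNamespaceURI()) && localName.equals(r.getLocalName());
    }

    /** Case-insensitive substring search over the extracted text. */
    static boolean contains(Path file, String query) throws IOException, XMLStreamException {
        try (InputStream in = Files.newInputStream(file)) {
            return extractText(in).toLowerCase(Locale.ROOT)
                    .contains(query.toLowerCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java DocxTextSearch.java <file.docx> <query>");
            return;
        }
        System.out.println(contains(Paths.get(args[0]), args[1]) ? "match" : "no match");
    }
}
```

Because the text of adjacent w:t runs is concatenated without a separator, a word that Word split across several runs (for example when part of it is bold) can still be found. For anything beyond the body text, the libraries named above are likely the less error-prone route.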


Source: https://habr.com/ru/post/1389492/

