I am writing an application that searches the content of documents. I have already written the code to search plain-text documents (the kind edited in Notepad).
I also want to do the same for docx files. After some research, I came up with these two options.
http://www.infoq.com/articles/cracking-office-2007-with-java This method requires me to unzip the docx file and then search the XML files inside it. The extraction step adds overhead and, frankly, I don't know how to process the XML (discarding attribute content, etc.).
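For reference, here is a minimal sketch of what I understand the first approach to involve, using only the JDK. It is not taken from the linked article. It assumes the main body text lives in the `word/document.xml` entry of the archive and that visible text sits in `w:t` elements grouped into `w:p` paragraphs. The class and method names (`DocxTextSearch`, `extractText`, `containsText`) are hypothetical, and the code needs Java 9+ for `readAllBytes`. Headers, footers, footnotes, tabs and line breaks are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class DocxTextSearch {
    // Namespace of the WordprocessingML elements (w:p, w:r, w:t, ...).
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Reads the docx as a zip stream, so nothing is extracted to disk.
    public static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the body text is stored in this entry.
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                byte[] xml = zip.readAllBytes(); // bytes of the current entry only
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
                Document dom = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));
                StringBuilder out = new StringBuilder();
                collect(dom.getDocumentElement(), out);
                return out.toString();
            }
        }
        throw new IOException("No word/document.xml entry found");
    }

    // Walks the tree in document order: keeps the text of w:t elements,
    // ignores attributes and all other markup, and ends each w:p with a newline.
    private static void collect(Node node, StringBuilder out) {
        boolean wordNs = W_NS.equals(node.getNamespaceURI());
        if (wordNs && "t".equals(node.getLocalName())) {
            out.append(node.getTextContent());
            return;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, out);
        }
        if (wordNs && "p".equals(node.getLocalName())) {
            out.append('\n');
        }
    }

    // Plain substring search over the extracted text.
    public static boolean containsText(InputStream docx, String query) throws Exception {
        return extractText(docx).contains(query);
    }
}
```

Something like `DocxTextSearch.containsText(Files.newInputStream(Paths.get("report.docx")), "budget")` would then report whether the word occurs (the file name is only an example). Because the text of adjacent runs is concatenated, a word that Word has split across several `w:r` runs should still match.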
http://www.javadocx.com/download This method lets me import a jar library into my project, and it looks like I can create docx files with it, but I don't understand how to open existing docx files with it.
Can someone recommend an alternative way to do this, or help with either of the two methods above?