I am writing an application that searches the content of documents. I have already written the code for searching documents that can be edited in Notepad (plain-text files).
I also want to do the same for docx files. After some research I have come up with these two options:
http://www.infoq.com/articles/cracking-office-2007-with-java This method requires me to extract the docx file and then search the XML files inside it. However, the extraction step adds overhead, and frankly I don't know how to process an XML file (discarding attribute content, etc.).
http://www.javadocx.com/download This method lets me import a JAR library into my project, and supposedly I can create docx files with it. What I don't understand is how to open existing docx files with it.
Can anyone recommend an alternative method, or help with either of the two methods mentioned above?
Try Apache Tika (http://tika.apache.org/), docx4j, or Apache POI.
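If adding a library is not an option, the first approach from the question can probably be done with the JDK alone. A docx file is a zip archive, and the main body text normally lives in the entry word/document.xml, so it can be read straight from the zip stream without extracting anything to disk. The sketch below is only an illustration under those assumptions: the class name DocxTextSearch and its methods are made up for this example, it matches elements by local name only (t for text runs, p for paragraphs), and it ignores headers, footers, footnotes, tabs and line breaks. A streaming parser (StAX) only hands back element names and character data, so attribute content is discarded without any extra work.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of any library.
final class DocxTextSearch {

    /** Returns the body text of a docx stream, one line per paragraph. */
    public static String extractText(InputStream docx)
            throws IOException, XMLStreamException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the main document part is word/document.xml.
                if (entry.getName().equals("word/document.xml")) {
                    return textOf(zip);
                }
            }
        }
        return ""; // no main document part found
    }

    /** Case-insensitive search for a term in the body text. */
    public static boolean contains(InputStream docx, String term)
            throws IOException, XMLStreamException {
        String text = extractText(docx).toLowerCase(Locale.ROOT);
        return text.contains(term.toLowerCase(Locale.ROOT));
    }

    private static String textOf(InputStream xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        StringBuilder text = new StringBuilder();
        boolean inTextRun = false;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                // <w:t> holds visible text; attributes are never read.
                inTextRun = "t".equals(reader.getLocalName());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                // <w:p> closes a paragraph, so end the line there.
                if ("p".equals(reader.getLocalName())) {
                    text.append('\n');
                }
                inTextRun = false;
            } else if (inTextRun && (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA)) {
                text.append(reader.getText());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: DocxTextSearch <file.docx> <term>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(contains(in, args[1]) ? "found" : "not found");
        }
    }
}
```

Text from adjacent runs is joined without a separator because Word often splits a single word across several runs. For anything beyond a quick body-text search, one of the libraries above is likely the safer choice.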