I have been using Liferay for the past two years, but I have never needed extensive document management.
Now I have a portlet where users upload documents (MS Office OLE2 documents, ODS documents, PDF, etc.) and I have to save them with all available metadata.
I know how to do this without Liferay: I would probably use Apache Solr with Apache Tika (UpdateRichDocuments and ExtractingRequestHandler) or Apache Jackrabbit, which uses Apache Tika under the hood (org.apache.jackrabbit.extractor.*).
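For reference, this is roughly what I mean by the Solr route outside Liferay. It is only a sketch under assumptions: a Solr core at a hypothetical URL, the ExtractingRequestHandler mapped to its usual /update/extract path, and the JDK 11+ HTTP client. The class name, core URL, document id and the attr_ prefix are made up for illustration.

```java
import java.io.FileNotFoundException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper, not part of Liferay or Solr.
class SolrExtractSketch {

    // Builds the extract URL. "literal.id" supplies the document id,
    // "uprefix" prefixes extracted metadata fields that the schema does
    // not define, and "commit" makes the document visible immediately.
    static URI buildExtractUri(String coreUrl, String docId) {
        String id = URLEncoder.encode(docId, StandardCharsets.UTF_8);
        return URI.create(coreUrl + "/update/extract"
                + "?literal.id=" + id
                + "&uprefix=attr_"
                + "&commit=true");
    }

    // Wraps the uploaded file as the raw request body; Tika inside Solr
    // is expected to detect the actual document type.
    static HttpRequest buildRequest(String coreUrl, String docId, Path file)
            throws FileNotFoundException {
        return HttpRequest.newBuilder(buildExtractUri(coreUrl, docId))
                .header("Content-Type", "application/octet-stream")
                .POST(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }

    // Sends the request and returns the status code plus response body.
    static String send(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    public static void main(String[] args) {
        // Only prints the URL; no network call is made here.
        System.out.println(
                buildExtractUri("http://localhost:8983/solr/docs", "doc-42"));
    }
}
```

With a running Solr instance, `send(buildRequest(coreUrl, id, path))` would index the file together with whatever metadata Tika extracts. That is the behaviour I am trying to reproduce inside Liferay.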
The problem is that when I look at the Liferay trunk, there are several key classes:
Hooks (JCRHook, FileSystemHook, CMISHook, S3Hook) that are used directly inside DLLocalServiceImpl
Another alternative is to use DLAppLocalServiceImpl, which uses DLRepositoryLocalServiceImpl. There the files are also saved to the repository through Hooks, but with a lot of other machinery around them.
Liferay does not have a dependency on the jackrabbit-text-extractors library, so I suppose that if I wanted metadata to be extracted from PDF, DOC, and ODS documents, I would have a very hard time, because the DL service layer does not accept additional properties.
I think I would have to bypass the DL services and the JCR hook and access Jackrabbit directly, but then I would lose compatibility, the ability to migrate my repository, etc.
Can anyone elaborate on this, please? Thank you.
lisak