Full Text Indexing Excel Files

How can I configure the Plone search engine to enable full text indexing of Excel files? I have already installed pdftotext and wv for full text indexing of PDF and Word files.

+4
2 answers

If you add Products.OpenXml to your instance eggs and install it in Plone, you can index modern Office formats, at least .docx and .xlsx. This does not work for plain old Excel files (.xls).

I tried this a few weeks ago with a Plone 4.3.2 buildout configuration:

[instance]
eggs =
    ...
    Products.OpenXml

[versions]
# You need a more recent lxml than the Plone default; some 3.x version
lxml = 3.3.3
Products.OpenXml = 1.1.1

There is also Products.AROfficeTransforms. Unlike Products.OpenXml, Products.AROfficeTransforms handles old Excel files (.xls):

[instance]
eggs =
    ...
    Products.AROfficeTransforms

[versions]
Products.AROfficeTransforms = 0.11.0

For this, xlhtml must be installed. Note that xlhtml is old software, dating from 2002.
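Since these transforms depend on external programs, it can help to check that the helper binaries are on the PATH of the user running the instance. Below is a minimal sketch in Python 3 (run it with any Python 3 interpreter, not necessarily the one Plone uses). The function name `missing_tools` is hypothetical, and the binary names `pdftotext`, `wvHtml` and `xlhtml` are assumptions about what your installed transforms call; adjust them as needed.

```python
import shutil


def missing_tools(names):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed binary names; change them to match your installation.
    missing = missing_tools(["pdftotext", "wvHtml", "xlhtml"])
    print("missing:", ", ".join(missing) if missing else "none")
```

If a tool is reported as missing, the corresponding transform will most likely fail or produce no text, so the file contents will not be searchable.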

+5

Another option is ftw.tika.

It supports the following formats:

  • Microsoft Office (Office Open XML)
      • *.docx Word documents
      • *.dotx Word templates
      • *.xlsx Excel sheets
      • *.xltx Excel templates
      • *.pptx Powerpoint presentations
      • *.potx Powerpoint templates
      • *.ppsx Powerpoint slideshows
  • Microsoft Office (97)
  • Rich Text
  • ODF OpenOffice
  • OpenOffice 1.x
  • Adobe (InDesign, Illustrator, Photoshop)
  • PDF
  • WordPerfect
  • E-Mail

It uses Apache Tika for the text extraction.

It integrates with portal_transforms.
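Because text extraction in Plone goes through the portal_transforms tool, one way to check that an installed transform really works is to ask the tool for a text/plain conversion of a sample file. The following is only a sketch: the helper name `extract_text` is hypothetical, and it assumes the tool's `convertTo` method accepts a `mimetype` keyword and returns either None or an object with `getData()`, as Products.PortalTransforms does to my knowledge.

```python
def extract_text(transforms, raw_bytes, mimetype):
    """Ask a portal_transforms-like tool for the plain text of a file.

    Returns the extracted text, or None when the tool reports that it
    has no way to convert `mimetype` to text/plain.
    """
    result = transforms.convertTo("text/plain", raw_bytes, mimetype=mimetype)
    if result is None:
        return None
    return result.getData()


# Possible use in a `bin/instance debug` session. The site id "Plone"
# and the file name are assumptions; replace them with your own:
#
#   tool = app.Plone.portal_transforms
#   with open("sample.xls", "rb") as f:
#       print(extract_text(tool, f.read(), "application/vnd.ms-excel"))
```

If this returns None or an empty string for an .xls file, the full text index will have nothing to store for such files, whichever add-on is installed.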


+1

Source: https://habr.com/ru/post/1537072/

