In addition to the accepted answer: there are also commercial alternatives to the Adobe IFilter for indexing text. They provide a similar API and also offer additional premium functionality:
- Foxit PDF IFilter: provides much faster text indexing than the Adobe plugin.
- PDFLib PDF iFilter: includes support for corrupted PDF documents, plus an additional API for running your own queries.
If you are looking for a single tool that can be used from both managed .NET applications and legacy programming languages such as classic ASP or VB6, the commercial ByteScout PDF Extractor SDK covers both cases: it offers a .NET API as well as an ActiveX/COM API.
Disclaimer: I work for ByteScout.