I am trying to write a C# utility that mimics the behavior of filtdump.exe from the Windows Search SDK (since filtdump does not appear to be redistributable). I am running into a combination of conflicting or missing documentation and technical problems that I cannot track down. I hope someone can help remove one or another of these obstacles.
According to MSDN, filtdump uses ILoadFilter::LoadIFilter to load the IFilter. I claim that MSDN is wrong, because it also says that ILoadFilter::LoadIFilter exists only on Windows 7, yet filtdump works fine on other versions of Windows too. Process Monitor indicates that it actually calls LoadIFilter() from query.dll, so that is what I do:
    public static class NativeMethods
    {
        // From Windows SDK v7.1, NTQuery.h
        [DllImport("query.dll", CharSet = CharSet.Unicode)]
        public static extern int LoadIFilter(
            string pwcsPath,
            [MarshalAs(UnmanagedType.IUnknown)] ref object pUnkOuter,
            ref IFilter ppIUnk);
    }

    object iUnknown = null;
    IFilter filter = null;
    var result = NativeMethods.LoadIFilter(args[0], ref iUnknown, ref filter);
    if (result != ResultCodes.S_OK)
    {
        Console.WriteLine("Failed to load an IFilter for {0}: {1}", args[0], result);
        return;
    }
For the most part, this application and filtdump give me the same results: both can open and extract text from plain text files, Word documents and Outlook e-mails, and both fail on the same set of other documents that have no IFilter. PDF files, however, are a problem. filtdump manages to open and extract text from most of the PDFs I have given it, but every PDF I try in my own application returns HRESULT 0x80004005, E_FAIL.
This is the same error as in this question, but I get it for every PDF file and filtdump does not, so I know the IFilter works on at least some documents. Has anyone done this kind of thing with PDF files before who can see what I am doing wrong?
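For anyone trying to reproduce this, the raw result is easier to read in hex, together with whatever message .NET maps it to. This is only a minimal sketch: `HResultInfo` and `Describe` are hypothetical names that are not part of the code above, and the only framework call used is `Marshal.GetExceptionForHR`.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (not part of the original utility): formats an HRESULT
// as hex plus the message .NET associates with it.
public static class HResultInfo
{
    public static string Describe(int hr)
    {
        // GetExceptionForHR returns null when the value is not an error code
        // (for example S_OK or S_FALSE).
        Exception ex = Marshal.GetExceptionForHR(hr);
        string message = ex == null ? "not an error code" : ex.Message;

        // X8 prints the 32-bit value as eight hex digits, e.g. 80004005.
        return string.Format("0x{0:X8}: {1}", hr, message);
    }

    public static void Main()
    {
        // 0x80004005 is E_FAIL; unchecked is required because the literal
        // does not fit in a signed int.
        Console.WriteLine(Describe(unchecked((int)0x80004005)));
    }
}
```

In the utility itself, `Describe(result)` could replace the bare `result` argument in the `Console.WriteLine` call. The message text depends on the installed .NET runtime and system locale, so it is not shown here.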