I am writing an iPhone application for reading PDF documents.
I know how to display a PDF file using the CGPDF* APIs in iOS.
Now I want to search for text in a PDF file and highlight the matches. So I need a library that can tell me where each piece of text is positioned on the page. In addition, the library must handle Unicode text, including Chinese characters.
I have searched for a few days but still cannot find anything suitable.
I tried xpdf, but it is written in C++, and I do not know how to use C++ code in an iPhone application.
I also tried http://www.codeproject.com/KB/cpp/ExtractPDFText.aspx but it does not handle Chinese characters.
I also tried writing the code myself, but text encoding in PDF is really complicated.
For example, I don't know what to refer to when I want to decode text shown in the following font:
    8 0 obj
    << /Type /Font /Subtype /Type0 /Encoding /Identity-H
       /BaseFont /RNXJTV+PMingLiU /DescendantFonts [ 157 0 R ] >>
    endobj

    157 0 obj
    << /Type /Font /Subtype /CIDFontType2 /BaseFont /RNXJTV+PMingLiU
       /CIDSystemInfo << /Registry (Adobe) /Ordering (CNS1) /Supplement 0 >>
       /FontDescriptor 158 0 R /W 161 0 R /DW 1000 /CIDToGIDMap 162 0 R >>
    endobj

    158 0 obj
    << /Type /FontDescriptor /Ascent 801 /CapHeight 711 /Descent -199
       /Flags 32 /FontBBox [0 -199 999 801] /FontName /RNXJTV+PMingLiU
       /ItalicAngle 0 /StemV 0 /Leading 199 /MaxWidth 1000 /XHeight 533
       /FontFile2 159 0 R >>
    endobj
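As far as I can tell from the PDF reference, with /Encoding /Identity-H each character code in a shown string is two bytes, high byte first, and the code is used directly as the CID. Below is a minimal sketch of that first step only. The function name `cids_from_identity_h` is my own (hypothetical), and the handling of a trailing odd byte is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split the raw bytes of a PDF string shown with an
 * /Identity-H font into CIDs. Each code is 2 bytes, big-endian, and with
 * Identity-H the code value is the CID itself.
 * Writes at most out_cap CIDs and returns how many were written.
 * Assumption: a trailing odd byte is silently ignored. */
size_t cids_from_identity_h(const unsigned char *bytes, size_t len,
                            uint16_t *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i + 1 < len && n < out_cap; i += 2) {
        out[n++] = (uint16_t)((bytes[i] << 8) | bytes[i + 1]);
    }
    return n;
}
```

On iOS the input bytes would presumably come from `CGPDFStringGetBytePtr` and `CGPDFStringGetLength` on the string operand of a text-showing operator. What this does not solve is the step I am stuck on: turning a CID into a Unicode character. The font above shows no /ToUnicode entry, so I assume some mapping for the Adobe-CNS1 character collection is needed.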