I have a problem reading data from a PDF file with iTextSharp. I want to read only a certain part of a PDF page: I only need the address information, which is always in the same position. I have seen examples that use iTextSharp to read the text of all pages, such as:
    StringBuilder text = new StringBuilder();
    if (File.Exists(fileName))
    {
        PdfReader pdfReader = new PdfReader(fileName);
        for (int page = 1; page <= pdfReader.NumberOfPages; page++)
        {
            ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
            string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
            currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
            text.Append(currentText);
        }
        pdfReader.Close();
    }
    return text.ToString();
But how can I limit the extraction to a certain area of the page? I am open to using anything, even OCR, since some files may turn out to be scanned images in the future (though not at the moment). This project is for my own use only, so there is no commercial use.
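For reference, here is a minimal sketch of what region-limited extraction could look like. It assumes iTextSharp 5.x and its `RegionTextRenderFilter` and `FilteredTextRenderListener` classes. The `AddressReader` class, the `ReadRegion` helper, and the rectangle coordinates are hypothetical names and placeholder values, not something taken from a real file.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class AddressReader
{
    // Hypothetical helper: returns the text found inside one rectangle of one page.
    // Assumes iTextSharp 5.x.
    public static string ReadRegion(string fileName, int page, Rectangle region)
    {
        if (!File.Exists(fileName))
        {
            return string.Empty;
        }

        PdfReader pdfReader = new PdfReader(fileName);
        try
        {
            // The filter lets through only text that falls within the rectangle.
            RenderFilter[] filters = { new RegionTextRenderFilter(region) };

            // The location-based strategy receives only the text the filter allowed.
            ITextExtractionStrategy strategy =
                new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filters);

            return PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
        }
        finally
        {
            pdfReader.Close();
        }
    }
}
```

It could be called like this, where the numbers are placeholders for wherever the address block actually sits:

```csharp
// Rectangle(llx, lly, urx, ury): lower-left and upper-right corners in PDF points
// (1/72 inch), with the origin at the bottom-left corner of the page.
Rectangle addressArea = new Rectangle(50f, 650f, 300f, 750f);
string address = AddressReader.ReadRegion(fileName, 1, addressArea);
```

This would only work for PDFs that contain real text; a scanned page has no text to extract, so it would return an empty string and would still need OCR.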
Thanks!