Getting the page count of a password-protected PDF in Python

I am trying to find a way to get the number of pages in a password-protected PDF using Python 3. So far I have tried the PyPDF2 and pdfminer2 modules. Both fail because the file has not been decrypted.

    #!/usr/bin/python3
    from PyPDF2 import PdfFileReader

    pdfFile = PdfFileReader(open("document.pdf", "rb"))
    print(pdfFile.numPages)

This code raises an error:

 PyPDF2.utils.PdfReadError: File has not been decrypted 

Is there a way to get the number of pages without decryption?

2 answers

You can use pdfrw

Example

a.pdf and b.pdf are the same 30-page document. The only difference is that b.pdf is password-protected, while a.pdf is a plain PDF without any protection.

    >>> from pdfrw import PdfReader
    >>> print(len(PdfReader('b.pdf').pages))
    30
    >>> print(len(PdfReader('a.pdf').pages))
    30

Use the following command to install it:

 pip install pdfrw 



The following worked for me:

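    from PyPDF2 import PdfFileReader

    password = "PASSWORD"  # placeholder: replace with the document's password
    pdf = PdfFileReader(open('path/to/file.pdf', 'rb'))
    pdf.decrypt(password)  # decrypt before reading any page data
    print(pdf.getNumPages())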

I would recommend removing the read protection with a command-line tool such as qpdf. It is easy to install: on Ubuntu, for example, run apt-get install qpdf if you don't already have it.

    qpdf --password=PASSWORD --decrypt SECURED.pdf UNSECURED.pdf

Then open the unlocked file with pdfminer and process it as needed.
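If you prefer to drive the qpdf step from a Python script, something like the following might work. This is only a sketch: it assumes qpdf is installed and on your PATH, and the function names build_qpdf_decrypt_command and decrypt_pdf are made up for illustration. It uses only the --password and --decrypt options from the command above.

```python
import shutil
import subprocess


def build_qpdf_decrypt_command(password, secured_path, unsecured_path):
    """Build the argument list for the qpdf decrypt call (hypothetical helper)."""
    return [
        "qpdf",
        f"--password={password}",
        "--decrypt",
        secured_path,
        unsecured_path,
    ]


def decrypt_pdf(password, secured_path, unsecured_path):
    """Run qpdf to write an unprotected copy of the PDF (hypothetical helper).

    Raises FileNotFoundError if qpdf cannot be found, and
    subprocess.CalledProcessError if qpdf exits with a non-zero status.
    """
    if shutil.which("qpdf") is None:
        raise FileNotFoundError("qpdf is not installed or not on PATH")
    command = build_qpdf_decrypt_command(password, secured_path, unsecured_path)
    # Passing a list (no shell) avoids quoting problems in the password.
    subprocess.run(command, check=True)
```

Calling decrypt_pdf("PASSWORD", "SECURED.pdf", "UNSECURED.pdf") would then produce the same UNSECURED.pdf as the shell command, which you can open with any of the libraries mentioned here.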


Source: https://habr.com/ru/post/1271087/

