Calling pdftotext from a Python script stops working when I move from my local machine to my web host

I wrote a small Python script to parse and extract information from a PDF. I tested it on my local machine, which has Python 2.6.2 and pdftotext version 0.12.4.

I am now trying to run it on my hosting server (DreamHost), which has Python 2.5.2 and pdftotext version 3.02.

But when I run the script there, I get the following error on the pdftotext line (I also reproduced it with a simple throwaway script): "Error: Could not open file '-'"

import os
import subprocess

def ConvertPDFToText(currentPDF):
    pdfData = currentPDF.read()

    tf = os.tmpfile()
    tf.write(pdfData)
    tf.seek(0)

    if len(pdfData) > 0:
        # "-" is meant to make pdftotext read the PDF from stdin and write the text to stdout.
        out, err = subprocess.Popen(["pdftotext", "-layout", "-", "-"], stdin=tf, stdout=subprocess.PIPE).communicate()
        return out
    else:
        return None

Note that I pass the same PDF file to this function on both machines, and the script has access to it. In another function, I can write out the PDF document from the same script running on the web host.
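Since the two hosts report different pdftotext versions, it may help to record the version from inside the script itself. The sketch below is only an illustration: the helper names `parse_version` and `pdftotext_banner` are hypothetical, the banner format is assumed to resemble the "pdftotext version 3.02" strings quoted in the question, and it is written for Python 3 rather than the 2.x interpreters mentioned above.

```python
import re
import subprocess


def parse_version(banner):
    """Return the first dotted number found in a version banner as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def pdftotext_banner(executable="pdftotext"):
    """Run `<executable> -v` and return everything it printed, or None if it cannot be started."""
    try:
        proc = subprocess.Popen(
            [executable, "-v"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so the banner is captured either way
        )
    except OSError:
        return None
    out, _ = proc.communicate()
    return out.decode("utf-8", "replace")
```

Logging `pdftotext_banner()` on each host would show which build the script actually picks up from its PATH. The parsed tuples are best treated as labels: the two numbers in the question may come from differently packaged builds, so ordering them may not be meaningful.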


+3
3

Does the pdftotext version on the server accept - as the input file name?

+4

Noufal, thanks. Passing the os.tmpfile() handle on stdin did not work on the server, so I switched to named temporary files:

import subprocess
import tempfile

def ConvertPDFToText(currentPDF):
    pdfData = currentPDF.read()
    if len(pdfData) == 0:
        return None

    # Write the PDF to a named temporary file so pdftotext can open it by path.
    tf = tempfile.NamedTemporaryFile()
    tf.write(pdfData)
    tf.flush()

    # pdftotext writes the extracted text to this second named file.
    outputTf = tempfile.NamedTemporaryFile()
    subprocess.call(["pdftotext", "-layout", tf.name, outputTf.name])
    return outputTf.read()
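For readers on a newer interpreter, the named-temporary-file approach above could be sketched roughly as follows. This is an illustration, not the poster's code: it assumes Python 3, the function name `convert_pdf_to_text` and the `command` parameter are hypothetical, and the converter is assumed to take an input path followed by an output path, as pdftotext does in the call above.

```python
import os
import subprocess
import tempfile


def convert_pdf_to_text(pdf_bytes, command=("pdftotext", "-layout")):
    """Write pdf_bytes to disk, run the converter on real paths, return the text bytes."""
    if not pdf_bytes:
        return None

    # A private directory keeps both files together and is removed on exit.
    with tempfile.TemporaryDirectory() as workdir:
        pdf_path = os.path.join(workdir, "input.pdf")
        txt_path = os.path.join(workdir, "output.txt")

        with open(pdf_path, "wb") as handle:
            handle.write(pdf_bytes)

        # check_call raises CalledProcessError if the converter exits non-zero.
        subprocess.check_call(list(command) + [pdf_path, txt_path])

        with open(txt_path, "rb") as handle:
            return handle.read()
```

Both files are closed before and after the external command runs, so nothing depends on how a particular pdftotext build treats stdin or an already open handle.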


+6

Some things to try on the command line, outside of Python:

# pdftotext -layout - -

# pdftotext -layout

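Maybe this pdftotext uses stdin/stdout when no file names are given, and - is not interpreted specially. In that case the call would be:

    out, err = subprocess.Popen(["pdftotext", "-layout"], stdin=tf, stdout=subprocess.PIPE).communicate()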
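The two command lines suggested above can also be compared from a script, which is handy when the host offers no convenient shell. A minimal sketch, assuming Python 3; the helper name `probe` is hypothetical, and it only reports what the program did without assuming how any pdftotext build handles its arguments.

```python
import subprocess


def probe(args, stdin_bytes=b""):
    """Run args, feed stdin_bytes, and return (returncode, stdout, stderr).

    returncode is None when the program could not be started at all.
    """
    try:
        proc = subprocess.Popen(
            args,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,  # keep the error text instead of losing it
        )
    except OSError as exc:
        return None, b"", str(exc).encode("utf-8", "replace")
    out, err = proc.communicate(stdin_bytes)
    return proc.returncode, out, err
```

For example, `probe(["pdftotext", "-layout", "-", "-"], pdf_bytes)` and `probe(["pdftotext", "-layout"], pdf_bytes)` could be run on both hosts, where `pdf_bytes` stands for the contents of the PDF, and the returned stderr values compared.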

0

Source: https://habr.com/ru/post/1788680/

