Bookmark PDF file via command line

I am looking for a command line tool to add bookmarks to a PDF file.

I have a page number and a label. I would like to create a bookmark called label that refers to the page number.

Does anyone know a command line tool (preferably OSX) for this?

I have about 4,000 pages of PDF files and about 150 bookmarks, and I would like to automate it.

My plan is to use a system call inside an R script.

EDIT

I create about 4,000 separate PDF files with graphs and merge them with the OS X script /System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py. I used to use pdfjoin from the pdfjam package, but it was too slow. That is how I get my final PDF file, to which I currently add the bookmarks manually using Adobe Acrobat Professional.

4 answers

You can also use pdftk. It is available for OS X as well.

I won't go into all the details here, because that has already been done elsewhere. Just briefly:

  1. Create a sample PDF from your source files (without bookmarks).
  2. Add a few bookmarks with Adobe Acrobat (which you have access to).
  3. Run one of the following commands:

     pdftk my.pdf dump_data output -
     pdftk my.pdf dump_data output bookmarks+otherdata.txt
  4. Examine the output format.

  5. Modify the output .txt file by adding all the entries you need.
  6. Run pdftk again:

     pdftk my.pdf update_info bookmarks.txt output bookmarked.pdf

Additional Information

This is the bookmark format that I noticed when examining the output in step 4 above.

    BookmarkBegin
    BookmarkTitle: -- Your Title 1 --
    BookmarkLevel: 1
    BookmarkPageNumber: 1
    BookmarkBegin
    BookmarkTitle: -- Your Title 2 --
    BookmarkLevel: 1
    BookmarkPageNumber: 2
    BookmarkBegin
    BookmarkTitle: -- Your Title 3 --
    ...
    ... and so on...

Replace the placeholder entries above with your own titles and page numbers.
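Since you have about 150 page/label pairs, you probably do not want to type these records by hand. Here is a minimal Python sketch that writes them in the format shown above. The function names and the sample entries are hypothetical, and it assumes plain ASCII labels:

```python
def pdftk_bookmark_block(title, page, level=1):
    """Return one bookmark record in the dump_data layout shown above."""
    return (
        "BookmarkBegin\n"
        f"BookmarkTitle: {title}\n"
        f"BookmarkLevel: {level}\n"
        f"BookmarkPageNumber: {page}\n"
    )


def pdftk_bookmarks(entries):
    """Build the bookmark text from (page, label) pairs, sorted by page."""
    return "".join(
        pdftk_bookmark_block(label, page) for page, label in sorted(entries)
    )


# Hypothetical sample data; replace with your own page numbers and labels.
entries = [(12, "Region A"), (1, "Overview"), (40, "Region B")]
print(pdftk_bookmarks(entries), end="")
```

Redirect the output to a file (for example bookmarks.txt) and pass that file to the update_info command from the last step.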


Ok, here is a quick way to do three tasks at once:

  • Combine 400 single-page PDF files.
  • Create top level document ToC (table of contents).
  • Create a PDF bookmark for each page.

It requires a LaTeX installation.

You start with an empty LaTeX template, for example, the following:

    \documentclass[]{article}
    \usepackage{pdfpages}
    \usepackage{hyperref}
    \hypersetup{breaklinks=true,
                bookmarks=true,
                pdfauthor={},
                pdftitle={},
                colorlinks=true,
                citecolor=blue,
                urlcolor=blue,
                linkcolor=magenta,
                pdfborder={0 0 0}}
    \begin{document}
    {
        \hypersetup{linkcolor=black}
        \setcounter{tocdepth}{3}
        % Comment next line in or out if you want a ToC or not:
        \tableofcontents
    }
    %% Here goes your additional code:
    %% 1 line per included PDF!
    \end{document}

Now, before the very last line of this template, insert one line for each external PDF file you want to include.

  • If you want to generate a ToC, the line must be formatted as follows:

     \includepdf[pages={<pagenumber>},addtotoc={<pagenumber>,<section>,<level>,<heading>,<label>}]{pdffilename.pdf}
  • If you are sure that each included PDF is a one-page document, this simplifies to:

     \includepdf[addtotoc={<pagenumber>,<section>,<level>,<heading>,<label>}]{pdffilename.pdf}

All five of the following addtotoc parameters are required, in the order given, for the files to show up in the bookmarks and in the ToC. See further below for a specific example:

  • <pagenumber> : the page number of the inserted document to link to. (In your case it is always "1", because you only insert single-page documents; you could, however, insert a 5-page document and link to page 3 of the inserted PDF file.)
  • <section> : the name of the LaTeX sectioning unit. It can be section, subsection, subsubsection ... In your case, "section".
  • <level> : the depth of the LaTeX section. In your case, "1".
  • <heading> : a string, used as the bookmark text.
  • <label> : must be unique for each bookmark. It is used inside the PDF file to jump to the correct page when you click a bookmark.

To test this quickly, I used Ghostscript to create 20 one-page PDF documents:

    for i in {1..20}; do
        gs -o p${i}.pdf -sDEVICE=pdfwrite \
           -c "/Helvetica findfont 30 scalefont setfont \
               100 600 moveto \
               (Page ${i}) show \
               showpage";
    done

With these test files, the lines I needed to insert into the template look like this:

    \includepdf[addtotoc={1,section,1,Page 1 (First),p1}]{p1.pdf}
    \includepdf[addtotoc={1,section,1,Page 2,p2}]{p2.pdf}
    \includepdf[addtotoc={1,section,1,Page 3,p3}]{p3.pdf}
    [...]
    \includepdf[addtotoc={1,section,1,Page 11 (In the Middle),p11}]{p11.pdf}
    [...]
    \includepdf[addtotoc={1,section,1,Page 20 (Last),p20}]{p20.pdf}
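With thousands of files you would generate these lines rather than write them by hand. A minimal Python sketch follows; the function name and the sample file list are hypothetical, and it assumes headings that contain no commas or LaTeX special characters:

```python
def includepdf_line(filename, heading, label, page=1, section="section", level=1):
    """Return one includepdf line with the five addtotoc parameters in order."""
    toc = f"{page},{section},{level},{heading},{label}"
    return f"\\includepdf[addtotoc={{{toc}}}]{{{filename}}}"


# Hypothetical sample data: (file name, bookmark heading) pairs.
pages = [("p1.pdf", "Page 1 (First)"), ("p2.pdf", "Page 2"), ("p3.pdf", "Page 3")]

for filename, heading in pages:
    # Use the file name without its extension as the unique label (p1.pdf -> p1).
    label = filename.rsplit(".", 1)[0]
    print(includepdf_line(filename, heading, label))
```

Paste the printed lines into the template just before the final \end{document} line.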

Save the template with the inserted lines, then run the following command twice:

    pdflatex template.tex
    pdflatex template.tex

As a result, the file will have bookmarks similar to this in Preview.app:

Screenshot: Preview.app with the bookmarks opened


Note. LaTeX is available for OSX in two ways:


I will add one or two other methods for adding bookmarks on the command line later, or in the next few days, if I have more time.

For now this will have to do, because I have never shown it here on SO, AFAICR.

But I thought, since you gave the background "I am combining single-page PDF files and it's slow, now I want to add bookmarks too ...", I could show how to do it all with one single method.

Hint: another way would be to use pdftk, which is also available for Mac OS X!


Here is another answer. This one uses Ghostscript for the PDF-to-PDF processing and the pdfmark PostScript operator to insert the bookmarks.

On the topic of pdfmark, see also:

This method involves two steps:

  • Create a text file (in fact, a PostScript file) with a limited set of pdfmark commands, one line per bookmark you want to add.
  • Run a Ghostscript command that processes your existing PDF file together with the text file.

1.

The contents of the text file should look something like this:

    [/Page 1 /View [/XYZ null null null] /Title (This is page 1) /OUT pdfmark
    [/Page 2 /View [/XYZ null null null] /Title (Dunno which page this is....) /OUT pdfmark
    [/Page 3 /View [/XYZ null null null] /Title (Some other name) /OUT pdfmark
    [/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark
    [/Page 5 /View [/XYZ null null null] /Title (File 5) /OUT pdfmark
    [/Page 6 /View [/XYZ null null null] /Title (File 6) /OUT pdfmark
    [/Page 7 /View [/XYZ null null null] /Title (File 7) /OUT pdfmark
    % more lines for more pages to bookmark...
    [/Page 13 /View [/XYZ null null null] /Title (File 13) /OUT pdfmark
    [/Page 14 /View [/XYZ null null null] /Title (Bookmark for page 14) /OUT pdfmark
    % more lines for more pages to bookmark...

Name this file, for example: addmybookmarks.txt
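If the page numbers and labels already exist as data, the file can be generated. The following Python sketch is one way to do it; the function names and sample entries are hypothetical, and it assumes ASCII titles. Backslashes and parentheses are escaped because they are special inside a PostScript string:

```python
def escape_ps_string(text):
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def pdfmark_line(page, title):
    """Return one pdfmark bookmark line in the layout shown above."""
    return (
        f"[/Page {page} /View [/XYZ null null null] "
        f"/Title ({escape_ps_string(title)}) /OUT pdfmark"
    )


# Hypothetical sample data; replace with your own page numbers and labels.
entries = [(1, "This is page 1"), (2, "Some other name"), (14, "Bookmark for page 14")]

for page, title in entries:
    print(pdfmark_line(page, title))
```

Redirecting the output to addmybookmarks.txt gives a file in the same form as the hand-written one.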

2.

Now run this command:

    gs -o bookmarked.pdf   \
       -sDEVICE=pdfwrite   \
       addmybookmarks.txt  \
       -f original.pdf

The resulting PDF, bookmarked.pdf, now has bookmarks. See the screenshot:

Screenshot: bookmarks added with the help of Ghostscript and pdfmark
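Since the plan is to drive this from a script, here is a rough Python sketch of the same Ghostscript call made through subprocess. The function names are hypothetical, and it assumes that gs is on the PATH and that the file names match the ones used above:

```python
import subprocess


def gs_bookmark_command(original, pdfmarks, output):
    """Build the Ghostscript argument list used in step 2 above."""
    return ["gs", "-o", output, "-sDEVICE=pdfwrite", pdfmarks, "-f", original]


def add_bookmarks_with_gs(original, pdfmarks, output):
    """Run Ghostscript and raise CalledProcessError if it exits with an error."""
    subprocess.run(gs_bookmark_command(original, pdfmarks, output), check=True)


# Only print the command here; call add_bookmarks_with_gs(...) to actually run it.
print(" ".join(gs_bookmark_command("original.pdf", "addmybookmarks.txt", "bookmarked.pdf")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.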


Here is a Python method for adding bookmarks to the PDF outline (table of contents). It runs on macOS without any other installations.

    #!/usr/bin/python
    from Foundation import NSURL, NSString
    import Quartz

    # Change these file paths to a local test PDF and an output file.
    infile = "/path/to/file.pdf"
    outfile = "/path/to/output.pdf"

    def getOutline(page, label):
        # Create a destination at the top-left corner of the page (0-based index).
        myPage = myPDF.pageAtIndex_(page)
        pageSize = myPage.boundsForBox_(Quartz.kCGPDFMediaBox)
        x = 0
        y = Quartz.CGRectGetMaxY(pageSize)
        pagePoint = Quartz.CGPointMake(x, y)
        myDestination = Quartz.PDFDestination.alloc().initWithPage_atPoint_(myPage, pagePoint)
        # Create the outline entry (bookmark) that points at that destination.
        myLabel = NSString.stringWithString_(label)
        myOutline = Quartz.PDFOutline.alloc().init()
        myOutline.setLabel_(myLabel)
        myOutline.setDestination_(myDestination)
        return myOutline

    pdfURL = NSURL.fileURLWithPath_(infile)
    myPDF = Quartz.PDFDocument.alloc().initWithURL_(pdfURL)
    if myPDF:
        # List your page indexes (starting at 0) and labels here.
        bookmarks = [(0, 'Page 1'), (1, 'Page 2'), (2, 'Page 3')]
        # Create a root outline and add each bookmark to it in order.
        rootOutline = Quartz.PDFOutline.alloc().init()
        for index, (page, label) in enumerate(bookmarks):
            rootOutline.insertChild_atIndex_(getOutline(page, label), index)
        myPDF.setOutlineRoot_(rootOutline)
        myPDF.writeToFile_(outfile)

Source: https://habr.com/ru/post/987418/

