Summary
Allows for the management of PDF documents, including facilities for merging and deleting pages, setting document open behavior, and creating or changing document security settings.
Discussion
PDFDocumentOpen and PDFDocumentCreate are two functions that provide a reference to a PDFDocument object.
A common scenario for creating PDF files is building a map book. The steps typically involve creating a new PDFDocument object, appending content from existing PDF files, and saving the PDFDocument object to disk. Another common scenario is modifying the contents or properties of an existing PDF file. Once a PDFDocument is referenced, you can use the appendPages, insertPages, and deletePages methods to change its pages, and the updateDocProperties and updateDocSecurity methods to modify PDF file settings.
The deletePages method is useful for swapping out only the pages that have been modified. Processing dozens of pages can take a long time. If only relatively few pages have been modified, it is faster to delete only those pages and then insert the newly updated pages using the insertPages method.
Currently, PDF security that is set using Python is limited to RC4 encryption, whereas PDF security that is set in ArcGIS AllSource is limited to AES 256-bit encryption. This means that when you manage PDF documents using Python, you can work only with documents that have no security or that use RC4 encryption.
Properties
Property | Explanation | Data Type |
pageCount (Read Only) | Returns an integer that represents the total number of pages in the PDF document. | Long |
Method Overview
Method | Explanation |
appendPages (pdf_path, {input_pdf_password}) | Appends one PDF document to the end of another. |
deletePages (page_range) | Provides the ability to delete one or multiple pages within an existing PDF document. |
insertPages (pdf_path, {before_page_number}, {input_pdf_password}) | Allows inserting the contents of one PDF document at the beginning or in between the pages of another PDFDocument. |
saveAndClose () | Saves any changes made to the currently referenced PDFDocument. |
updateDocProperties ({pdf_title}, {pdf_author}, {pdf_subject}, {pdf_keywords}, {pdf_open_view}, {pdf_layout}) | Updates the PDF metadata. You can also use this method to set behaviors that will trigger when the document is opened in Adobe Reader or Adobe Acrobat, such as the initial view mode and the page thumbnails view. |
updateDocSecurity (new_master_password, {new_user_password}, {encryption}, {permissions}) | Sets password, encryption, and security restrictions on a PDF. |
Methods
appendPages (pdf_path, {input_pdf_password})
Parameter | Explanation | Data Type |
pdf_path | A string that includes the location and name of the input PDF document to be appended. | String |
input_pdf_password | A string that defines the master password to a protected file. It must be a master password; a user password will not work. (The default value is None) | String |
When appending secured PDF documents that have different security settings, the output settings will be based on the primary document to which pages are being appended. For example, if the document that is being appended to does not have password information saved, but the appended pages do, the resulting document will not have password information saved.
To add pages to the beginning of the current PDF document, use insertPages instead.
deletePages (page_range)
Parameter | Explanation | Data Type |
page_range | A string that defines the page or pages to be deleted. Delete a single page by passing in a single value as a string (for example, "3"). Multiple pages can be deleted using a comma between each value (for example, "3, 5, 7"). Ranges can also be applied (for example, "1, 3, 5-12"). | String |
It is important to keep track of the pages that are being deleted because the internal PDF page numbers are automatically adjusted each time pages are deleted. For example, page 3 becomes page 2 immediately after page 1 or page 2 is deleted. If both page 1 and page 2 are deleted, page 3 becomes page 1. You need to consider this if you are using deletePages and then immediately using insertPages along with a before_page_number value.
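The renumbering described above can be worked out ahead of time. The following is a minimal sketch in plain Python and is not part of arcpy: parse_page_range and page_number_after_delete are hypothetical helper names, and the parsing rules are assumed from the page_range examples shown for deletePages (single values, comma-separated values, and hyphenated ranges).

```python
def parse_page_range(page_range):
    """Expand a string such as "1, 3, 5-12" into a sorted list of page numbers.

    Assumption: the syntax mirrors the deletePages examples in this topic.
    """
    pages = set()
    for part in page_range.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A hyphenated range such as "5-12" includes both end pages.
            first, last = (int(value) for value in part.split("-"))
            pages.update(range(first, last + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


def page_number_after_delete(original_page, page_range):
    """Return the new number of a surviving page, or None if it was deleted."""
    deleted = parse_page_range(page_range)
    if original_page in deleted:
        return None
    # Every deleted page that came before this one shifts it down by one.
    return original_page - sum(1 for page in deleted if page < original_page)
```

For example, page_number_after_delete(3, "1, 2") returns 1, which matches the case described above where page 3 becomes page 1. The returned value is the number to consider when choosing a before_page_number for a following insertPages call.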
insertPages (pdf_path, {before_page_number}, {input_pdf_password})
Parameter | Explanation | Data Type |
pdf_path | A string that includes the location and name of the input PDF document to be inserted. | String |
before_page_number | An integer that defines a page number in the currently referenced PDFDocument before which the new pages will be inserted. For example, if the before_page_number value is 1, the new pages will be inserted before all existing pages. (The default value is 1) | Integer |
input_pdf_password | A string that defines the master password to a protected file. It must be a master password; a user password will not work. (The default value is None) | String |
When inserting secured PDF documents that have different security settings, the output settings will be based on the primary document into which pages are being inserted. For example, if the document into which pages are being inserted does not have password information saved, but the inserted pages do, the resulting document will not have password information saved.
To add pages to the end of the current PDF document, use appendPages instead.
saveAndClose ()
The saveAndClose method must be used for changes to be maintained. If a script exits before saveAndClose is executed, changes will not be saved. If you are creating a new file using PDFDocumentCreate, the file won't appear on disk until saveAndClose is executed.
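Because unsaved changes are lost, it can help to make the save step hard to forget. The following is a minimal sketch, assuming any object that exposes a saveAndClose method; the saving helper is a hypothetical name and is not provided by arcpy. It calls saveAndClose only when the with block completes without an exception, so a failed script leaves the file as it was.

```python
from contextlib import contextmanager


@contextmanager
def saving(pdf_doc):
    """Yield pdf_doc and call saveAndClose only if the block finishes cleanly."""
    yield pdf_doc
    # Reached only when the with block raised no exception.
    pdf_doc.saveAndClose()


# Hypothetical usage, assuming arcpy is available and the paths exist:
# import arcpy
# path = r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf"
# with saving(arcpy.mp.PDFDocumentCreate(path)) as pdf_doc:
#     pdf_doc.appendPages(r"C:\Projects\YosemiteNP\Title.pdf")
```

Skipping the save on an error is a design choice in this sketch; if partial changes should still be written, call saveAndClose in a try/finally block instead.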
updateDocProperties ({pdf_title}, {pdf_author}, {pdf_subject}, {pdf_keywords}, {pdf_open_view}, {pdf_layout})
Parameter | Explanation | Data Type |
pdf_title | The document title. This is a PDF metadata property. (The default value is None) | String |
pdf_author | The document author. This is a PDF metadata property. (The default value is None) | String |
pdf_subject | The document subject. This is a PDF metadata property. (The default value is None) | String |
pdf_keywords | The document keywords. This is a PDF metadata property. (The default value is None) | String |
pdf_open_view | Specifies the Adobe Reader view mode that will be used. (The default value is USE_THUMBS) | String |
pdf_layout | Specifies the Adobe Reader layout mode that will be used. (The default value is SINGLE_PAGE) | String |
A pdf_open_view setting of FULL_SCREEN will prompt a warning about full-screen mode when the PDF is opened. Setting pdf_open_view to a different option will not clear this setting unless pdf_open_view is set to USE_NONE.
updateDocSecurity (new_master_password, {new_user_password}, {encryption}, {permissions})
Parameter | Explanation | Data Type |
new_master_password | The primary document password. | String |
new_user_password | The user password needed to open the PDF for viewing. (The default value is None) | String |
encryption | Specifies the encryption technique that will be used for the PDF. (The default value is AES-256) | String |
permissions [permissions,...] | A string or list of strings that specifies the permissions that will be granted by the document security settings. (The default value is ALL) | String |
Tip:
A password on a secured PDF can be removed by setting the new_master_password or new_user_password parameter to an empty string.
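The tip above might be applied as in the following minimal sketch. The remove_passwords helper is a hypothetical name, the behavior of empty strings is taken from the tip rather than verified here, and pdf_doc is assumed to be a PDFDocument that was opened with its master password.

```python
def remove_passwords(pdf_doc):
    """Clear both passwords on an open PDFDocument-like object, then save."""
    # Empty strings follow the tip above; treat this as assumed behavior.
    pdf_doc.updateDocSecurity(new_master_password="", new_user_password="")
    # Changes are not kept unless saveAndClose is executed.
    pdf_doc.saveAndClose()
```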
Code sample
This script creates a new PDF document, appends the contents of three separate PDF documents, and saves the resulting PDF file.
import arcpy, os
# Set the file name and remove the file if it already exists
pdfPath = r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf"
if os.path.exists(pdfPath):
    os.remove(pdfPath)
# Create the file and append pages
pdfDoc = arcpy.mp.PDFDocumentCreate(pdfPath)
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\Title.pdf")
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\MapPages.pdf")
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\ContactInfo.pdf")
# Commit changes and delete the variable reference
pdfDoc.saveAndClose()
del pdfDoc
The following script modifies the PDF document metadata properties and sets the style in which the document opens.
import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf")
pdfDoc.updateDocProperties(pdf_title="Yosemite Main Attractions Map Book",
                           pdf_author="Esri",
                           pdf_subject="Main Attractions Map Book",
                           pdf_keywords="Yosemite; Map Books; Attractions",
                           pdf_open_view="USE_THUMBS",
                           pdf_layout="SINGLE_PAGE")
pdfDoc.saveAndClose()
del pdfDoc
The following script sets the master and user passwords, encrypts the PDF using RC4 encryption, and requires a password when the document opens. Be sure to read the secured PDF limitations in the class description above.
import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf")
pdfDoc.updateDocSecurity("master", "user", "RC4")
pdfDoc.saveAndClose()
del pdfDoc
The following script replaces a total of four pages in an existing PDF using deletePages followed by insertPages. Note how the new page 3 was inserted before the current page 3, which was really page 4 before the original page 3 was removed. The same applies to the range of pages 5–7. Be sure to read the secured PDF limitations in the class description above.
import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf", "master")
pdfDoc.deletePages("3, 5-7")
pdfDoc.insertPages(r"C:\Projects\Yosemite\NewPage3.pdf", 3, "master")
pdfDoc.insertPages(r"C:\Projects\Yosemite\NewPages5-7.pdf", 5, "master")
pdfDoc.saveAndClose()
del pdfDoc
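The page-number bookkeeping in the script above can be avoided by replacing one block of pages at a time. The following is a minimal sketch, not an arcpy function: replace_page_block is a hypothetical helper, it assumes the replacement PDF has the same number of pages as the block it replaces, and it assumes that pageCount reflects deletions before saveAndClose is called.

```python
def replace_page_block(pdf_doc, first_page, last_page, replacement_pdf,
                       input_pdf_password=None):
    """Swap pages first_page..last_page for the pages of replacement_pdf."""
    if first_page == last_page:
        page_range = str(first_page)
    else:
        page_range = f"{first_page}-{last_page}"
    pdf_doc.deletePages(page_range)
    if first_page > pdf_doc.pageCount:
        # The deleted block was at the end, so there is no page to insert
        # before; append the replacement pages instead.
        pdf_doc.appendPages(replacement_pdf, input_pdf_password)
    else:
        # The page that followed the deleted block now sits at first_page.
        pdf_doc.insertPages(replacement_pdf, first_page, input_pdf_password)
```

Because each block is deleted and refilled before the next one is touched, later page numbers stay the same as in the original document. The caller is still responsible for calling saveAndClose afterward.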