Working with PDF Documents
MaSH’s PDF libraries allow you to create, edit and split PDF documents.
Splitting PDFs
Using Captur and Foldr’s custom fields, you can extract the parts of a PDF you require into new PDF documents.
A custom field can be treated as either a header or a footer. As a header, the first new document begins on the page where the first instance of the field occurs and ends on the page before the next occurrence. As a footer, the first new document begins at the start of the document and ends on the page where the first instance of the field occurs.
splitOnHeader
mash.pdf.splitOnHeader(File: file, number: field, boolean: ?discard = false) -> array
Splits a PDF document, represented by a File object, at each location of a captured value and returns an array containing the new Files. As used in the example below, each entry in the array exposes the new File as file and the captured value it was split on as value. If discard is set to true, any pages prior to the first instance of the field are discarded; otherwise they are returned as the first new document.
Parameters
file
A Foldr file object whose content is a PDF document
field
The ID of the custom field which contains the captured data that we will be splitting the document on.
discard (optional)
Whether to discard any pages prior to the first instance of the captured field. Defaults to false.
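As an illustration only, and not MaSH's actual implementation, the page ranges that a header split produces can be sketched in Python. The function name header_ranges and its parameters are hypothetical. The sketch assumes that the final document runs to the last page, and that a document with no occurrences of the field is returned whole unless discard is true.

```python
def header_ranges(field_pages, page_count, discard=False):
    """Return inclusive, 1-indexed (start, end) page ranges for a header split.

    field_pages: page numbers on which the captured field occurs (hypothetical input).
    page_count:  total number of pages in the source PDF.
    discard:     drop the pages before the first occurrence when True.
    """
    starts = sorted(set(field_pages))
    if not starts:
        # Assumption: with no occurrences there is nothing to split on.
        return [] if discard else [(1, page_count)]

    ranges = []
    # Pages before the first occurrence form the first document unless discarded.
    if starts[0] > 1 and not discard:
        ranges.append((1, starts[0] - 1))

    # Each document starts on an occurrence and ends on the page before the next one.
    for i, start in enumerate(starts):
        end = starts[i + 1] - 1 if i + 1 < len(starts) else page_count
        ranges.append((start, end))
    return ranges


# An 8-page document with the field on pages 3 and 6.
print(header_ranges([3, 6], 8))                # [(1, 2), (3, 5), (6, 8)]
print(header_ranges([3, 6], 8, discard=True))  # [(3, 5), (6, 8)]
```

With discard left at its default, the two leading pages come back as their own document; with discard set to true, only the documents that begin with the field are returned.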
Natural
# Our PDF document
set file to mash.file(10, "test.pdf")
# The ID of the field which we are Capturing to
set field to 6
# The folder which we will write our split files to
set destination to mash.file(10, "output")
# Split the document
set splits to mash.pdf.splitOnHeader(file, field, true)
error
    printline "Error: {{mash.error.message}}"
    printline "Line: {{mash.error.line}}"
    printline "Expression: {{mash.error.expression}}"
end
# Write each file to the destination folder
each splits as document
    # The data inside the field which was used for the split
    printline "Found value {{document.value}}"
    set written to destination.write(document.file.name, document.file.contents)
    printline "File written to: {{written.path}}"
end
Standard
# Our PDF document
file = mash.file(10, "test.pdf")
# The ID of the field which we are Capturing to
field = 6
# The folder which we will write our split files to
destination = mash.file(10, "output")
# Split the document
splits = mash.pdf.splitOnHeader(file, field, true)
error
    printline("Error: {{mash.error.message}}")
    printline("Line: {{mash.error.line}}")
    printline("Expression: {{mash.error.expression}}")
end
# Write each file to the destination folder
each splits as document
    # The data inside the field which was used for the split
    printline("Found value {{document.value}}")
    written = destination.write(document.file.name, document.file.contents)
    printline("File written to: {{written.path}}")
end
Output
File written to: output/test-001.pdf
File written to: output/test-002.pdf
File written to: output/test-003.pdf
splitOnFooter
mash.pdf.splitOnFooter(File: file, number: field, boolean: ?discard = false) -> array
Similar to splitOnHeader, except that this method starts at the beginning of the document and ends the first new document on the page where the first instance of the captured field occurs. If discard is set to true, any pages after the last instance are not returned; otherwise they make up the final document.
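The footer behaviour can be sketched in Python in the same spirit. This is an illustration of the page arithmetic only, not MaSH's implementation; the function name footer_ranges and its parameters are hypothetical, and the handling of a document with no occurrences of the field is an assumption.

```python
def footer_ranges(field_pages, page_count, discard=False):
    """Return inclusive, 1-indexed (start, end) page ranges for a footer split.

    field_pages: page numbers on which the captured field occurs (hypothetical input).
    page_count:  total number of pages in the source PDF.
    discard:     drop the pages after the last occurrence when True.
    """
    ranges = []
    start = 1
    # Each document ends on a page where the field occurs.
    for end in sorted(set(field_pages)):
        ranges.append((start, end))
        start = end + 1

    # Pages after the last occurrence form the final document unless discarded.
    # Assumption: with no occurrences, this covers the whole document.
    if start <= page_count and not discard:
        ranges.append((start, page_count))
    return ranges


# A 7-page document with the field on pages 2 and 5.
print(footer_ranges([2, 5], 7))                # [(1, 2), (3, 5), (6, 7)]
print(footer_ranges([2, 5], 7, discard=True))  # [(1, 2), (3, 5)]
```

Compared with a header split, the field marks the last page of each new document rather than the first, so discard affects the trailing pages instead of the leading ones.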
Editing PDFs
MaSH can be used to watermark or annotate existing PDF documents. mash.pdf.builder() returns an instance of FPDI; see the FPDI documentation to read more about the package.
The following example adds an image to an existing PDF document:
Natural
# Grab the image file
set stamp to mash.file(10, "stamp.png")
# Grab the PDF file
set source to mash.file(10, "template.pdf")
# Where our modified PDF will be saved
set destination to mash.file(10, "output")
# Get an instance of the PDF builder
set pdf to mash.pdf.builder()
# Load our source file into the builder
pdf.setSourceFile(source)
# Grab the first page from the source file to use as a template
set template to pdf.importPage(1)
# Add a page to the builder
pdf.addPage()
# Get the size of the template page
set size to pdf.getTemplateSize(template)
# Copy the template page into our current page
pdf.useTemplate(template, 0, 0, size['width'], size['height'], true)
# Add our image to the page
pdf.Image(stamp, 26, 35, 0, 0, 'png')
# Write our new PDF to the destination folder
destination.write("modified.pdf", pdf)
Standard
# Grab the image file
stamp = mash.file(10, "stamp.png")
# Grab the PDF file
source = mash.file(10, "template.pdf")
# Where our modified PDF will be saved
destination = mash.file(10, "output")
# Get an instance of the PDF builder
pdf = mash.pdf.builder()
# Load our source file into the builder
pdf.setSourceFile(source)
# Grab the first page from the source file to use as a template
template = pdf.importPage(1)
# Add a page to the builder
pdf.addPage()
# Get the size of the template page
size = pdf.getTemplateSize(template)
# Copy the template page into our current page
pdf.useTemplate(template, 0, 0, size['width'], size['height'], true)
# Add our image to the page
pdf.Image(stamp, 26, 35, 0, 0, 'png')
# Write our new PDF to the destination folder
destination.write("modified.pdf", pdf)