
Deleting Content

PDFDancer provides consistent deletion methods across all content types. This guide covers how to remove pages, paragraphs, text lines, and images from your PDFs.


Deleting Pages

Delete a Single Page

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Delete the third page (page number 3)
    pdf.page(3).delete()

    pdf.save("output.pdf")

Delete Multiple Pages

When deleting multiple pages, work from the highest page number to the lowest so that the pages still waiting to be deleted keep their numbers.

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Delete pages 2, 3, and 5
    # Delete in reverse order to avoid page number shifting
    for page_number in reversed([2, 3, 5]):
        pdf.page(page_number).delete()

    pdf.save("output.pdf")

Why Delete in Reverse Order?

When you delete page 2 from a 5-page document, page 3 becomes page 2, page 4 becomes page 3, and so on. Deleting in reverse order (highest page numbers first) ensures that earlier page numbers remain valid throughout the deletion process.
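The shifting effect can be reproduced without a PDF at all. The sketch below models a document as a plain Python list and uses two illustrative helpers, `delete_pages` and `deletion_order`, which are assumptions made for this example and not part of PDFDancer. It also shows that `reversed(...)` is only safe when the input list is already ascending and free of duplicates; sorting a set in descending order covers the general case.

```python
def delete_pages(pages, page_numbers):
    """Simulate deleting 1-based page numbers from a list, in the order given."""
    pages = list(pages)  # work on a copy so the caller's list is untouched
    for number in page_numbers:
        del pages[number - 1]
    return pages


def deletion_order(page_numbers):
    """Return the unique page numbers, highest first."""
    return sorted(set(page_numbers), reverse=True)


original = ["p1", "p2", "p3", "p4", "p5"]

# Ascending order: once p2 is gone, "page 3" refers to the original p4,
# so p3 survives and p4 is removed by mistake.
ascending_result = delete_pages(original, [2, 3])

# Descending order: each deletion only shifts pages that come after it,
# so the remaining targets keep their numbers.
descending_result = delete_pages(original, deletion_order([2, 5, 3, 2]))

print(ascending_result)
print(descending_result)
```

With a real document, the same ordering could be fed to the loop shown earlier, for example `for page_number in deletion_order([2, 5, 3]): pdf.page(page_number).delete()`.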


Deleting Text

Delete a Paragraph

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Find and delete a paragraph
    paragraph = pdf.page(1).select_paragraphs_starting_with("The Complete")[0]
    paragraph.delete()

    # Verify deletion
    remaining = pdf.page(1).select_paragraphs_starting_with("The Complete")
    assert remaining == []

    pdf.save("output.pdf")

Delete a Text Line

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Find and delete a text line
    text_line = pdf.page(1).select_text_lines_starting_with("Footer")[0]
    text_line.delete()

    # Verify deletion
    remaining = pdf.page(1).select_text_lines_starting_with("Footer")
    assert remaining == []

    pdf.save("output.pdf")

Delete All Matching Text

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Delete all paragraphs containing "DRAFT"
    drafts = pdf.select_paragraphs_containing("DRAFT")
    for paragraph in drafts:
        paragraph.delete()

    pdf.save("output.pdf")

Deleting Images

Delete a Single Image

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    images = pdf.page(1).select_images()

    if images:
        # Delete the first image
        images[0].delete()

    pdf.save("output.pdf")

Delete All Images on a Page

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    images = pdf.page(1).select_images()

    # Delete all images on page 1
    for image in images:
        image.delete()

    pdf.save("output.pdf")

Delete Images at Specific Location

from pdfdancer import PDFDancer

with PDFDancer.open("document.pdf") as pdf:
    # Find and delete images at specific coordinates
    images = pdf.page(1).select_images_at(x=100, y=200)

    for image in images:
        image.delete()

    pdf.save("output.pdf")
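The paragraph, text line, and image examples above all follow the same pattern: select a list of objects, then call `delete()` on each one. A small helper can capture that pattern. The `delete_all` function below is a hypothetical convenience written for this guide, not a PDFDancer API; it assumes only that each item exposes a `delete()` method, as the selected objects in the examples do. `FakeElement` is a stand-in so the sketch runs without a PDF.

```python
def delete_all(items):
    """Call delete() on every item and return how many were deleted."""
    items = list(items)  # snapshot so deletions cannot disturb the iteration
    for item in items:
        item.delete()
    return len(items)


class FakeElement:
    """Stand-in for a selected object; used only in this demo."""

    def __init__(self):
        self.deleted = False

    def delete(self):
        self.deleted = True


elements = [FakeElement(), FakeElement(), FakeElement()]
count = delete_all(elements)
print(f"Deleted {count} elements")
```

With a real document, the same helper could be called as `delete_all(pdf.page(1).select_images())` or `delete_all(pdf.select_paragraphs_containing("DRAFT"))`, and the returned count is handy for logging how much was removed.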

Deleting vs Redaction

Deletion removes content from the document entirely. If you need to permanently remove sensitive information while leaving a visual indicator (black box), use redaction instead.

Use Case                             Approach
Remove obsolete pages                Deletion
Remove draft watermarks              Deletion
Remove sensitive data (SSN, etc.)    Redaction
Clean up layout                      Deletion
Compliance requirements              Redaction
