# Peepdf

Peepdf is a Python-based tool for exploring PDF files to determine whether a file is harmful. It aims to provide every component a security researcher needs for PDF analysis, without having to combine three or four separate tools. With peepdf you can list all the objects in a document with the suspicious elements highlighted. It supports the most commonly used filters and encodings, and it can parse different versions of a file, object streams, and encrypted files.

$ ./peepdf.py -i pdffile.pdf

### Example

We will now see how to extract an embedded file from a PDF. Nothing looks suspicious when the file is opened normally in a PDF viewer, so let's load it in peepdf:

$ ./peepdf.py -i nothing.pdf
File: nothing.pdf
MD5: 56572d46b09ef2b3de1faa4c9d5e1cb0
SHA1: 99b73b7d87815f669d54bb1c430b703d4ae827a4
Size: 925647 bytes
Version: 1.1
Binary: True
Linearized: False
Encrypted: False
Objects: 8
Streams: 2
URIs: 0
Errors: 0

Version 0:
Catalog: 1
Info: No
Objects (8): [1, 2, 3, 4, 5, 6, 7, 8]
Streams (2): [5, 8]
Encoded (1): [8]
Suspicious elements:
/Names (1): [1]
/EmbeddedFiles: [1]
/EmbeddedFile: [8]


The suspicious elements show that the PDF contains an embedded file, stored in object 8.
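As a rough cross-check of the suspicious elements listed in the summary, the same name tokens can be searched for directly in the raw bytes of the file. The sketch below is only an approximation and is not how peepdf itself works: the function name `count_pdf_names` and the list of names are assumptions made for illustration, it counts occurrences rather than objects, and it will miss names hidden inside compressed object streams or written with hex escapes.

```python
import re

# Name tokens that the peepdf summary above lists as suspicious.
SUSPICIOUS_NAMES = (b"/Names", b"/EmbeddedFiles", b"/EmbeddedFile")

# A PDF name ends at whitespace or a delimiter, so the next byte must not be
# a regular character. This keeps /EmbeddedFile from matching /EmbeddedFiles.
_NAME_END = rb"(?![^\x00\s()<>\[\]{}/%])"


def count_pdf_names(data, names=SUSPICIOUS_NAMES):
    """Count how often each name appears as a complete token in raw PDF bytes."""
    counts = {}
    for name in names:
        pattern = re.compile(re.escape(name) + _NAME_END)
        counts[name.decode("ascii")] = len(pattern.findall(data))
    return counts
```

To try it on the file from this example, read the file in binary mode and pass the bytes in, for instance `count_pdf_names(open("nothing.pdf", "rb").read())`. A non-zero count for `/EmbeddedFile` is a hint worth following up in peepdf, not a verdict.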

To extract the embedded file, run the stream command at the PPDF> prompt and redirect its output to a file:

PPDF> stream 8 > embedfile

$ file embedfile
embedfile: PNG image data, 960 x 640, 8-bit/color RGB, non-interlaced
$ xdg-open embedfile

The file embedded in the PDF turns out to be a PNG image.
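The check that `file` performs on the extracted data can also be sketched in Python, which is handy when triaging many extracted streams. The example below is a minimal sketch, not part of peepdf: the function name `png_info` is an assumption, and it only inspects the 8-byte PNG signature and the IHDR chunk that immediately follows it.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_info(data):
    """Return basic IHDR fields if data starts like a PNG file, else None."""
    # Signature (8 bytes) + chunk length (4) + chunk type (4) + IHDR body (13).
    if len(data) < 29 or not data.startswith(PNG_SIGNATURE):
        return None
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        return None
    # IHDR body: width, height, bit depth, color type, compression method,
    # filter method, interlace method (all big-endian).
    width, height, bit_depth, color_type, _comp, _filt, interlace = struct.unpack(
        ">IIBBBBB", data[16:29]
    )
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,  # 2 means truecolor RGB
        "interlaced": interlace != 0,
    }
```

For the file extracted above, this would be called as `png_info(open("embedfile", "rb").read())`. A return value of `None` means the data does not start with a PNG header, so another format check would be needed.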