
How to check if PDF was modified

Developers https://www.devze.com 2022-12-10 21:17 Source: Internet

I have a PDF generated by a third-party system. I modified it using a PDF editor or other software.

Is it possible to detect that a PDF file was modified without having the original file?

I will add some more details.

There are no encryption or signature features.

The document is created by an IT system. A user receives the document and modifies it.

Is it possible to track that change somehow?

I thought that all these applications leave some data in the PDF header, or encoded somewhere inside the file, and that it is possible to check it. However, the properties shown by Windows Explorer show nothing... so I was interested in whether there is something smarter than viewing the properties/header in Explorer.


The problem with this is that just opening the PDF on a Mac in Preview and hitting Command-S to save the file will reset both the Creation and Modification dates to the current date/time. So even the creation date will be wrong. Even novice users can do this unknowingly, so if you're trying to track someone who may be purposefully modifying the document, it may lead to a false positive.

Unfortunately, what you're asking about is just too easy to spoof.


You could always check the md5sum of the PDF file, provided you have a checksum of the original to compare against. I'm not sure what environment you are using, but that should help get you started.
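A minimal sketch of the checksum idea in Python, assuming a digest of the original file was recorded somewhere when the document was generated (without such a reference value, a hash alone proves nothing). The function names `file_digest` and `is_unmodified` are hypothetical, and SHA-256 is used as the default here instead of MD5; pass `"md5"` to match the output of `md5sum`.

```python
import hashlib

def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_unmodified(path, recorded_digest, algorithm="sha256"):
    """True if the file's current digest equals the digest recorded earlier."""
    return file_digest(path, algorithm) == recorded_digest.lower()
```

Any change to the file, including a re-save that leaves the visible content identical, will produce a different digest, so this detects "not byte-for-byte the same file" rather than "content was edited".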


It's going to be rough without the original file unless security features like encryption or digital signatures were applied to it, which doesn't sound like the case. Do you have access to any information at all about the original file? A file size, creation date, any of the metadata, etc.?


If the tool used to modify the PDF is working according to the PDF spec then in the Info dictionary it should update ModDate but leave CreationDate alone. You may also see some non-zero generation numbers on the objects although it is just as possible that all the objects have been regenerated and will therefore be generation 0. The trial version of CosEdit will allow you to look at these 2 items.
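As a rough illustration of the generation-number check, the sketch below scans the raw bytes of a PDF for object headers of the form `N G obj` and reports those whose generation `G` is not 0. The helper name `nonzero_generations` is hypothetical. It assumes the objects are not packed into compressed object streams, and because it is a plain byte scan rather than a real PDF parser, binary stream data could occasionally produce false matches.

```python
import re

# An indirect object starts with "<object number> <generation> obj".
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")

def nonzero_generations(pdf_bytes):
    """Return (object_number, generation) pairs whose generation is not 0."""
    found = []
    for match in OBJ_HEADER.finditer(pdf_bytes):
        obj_num, gen = int(match.group(1)), int(match.group(2))
        if gen != 0:
            found.append((obj_num, gen))
    return found
```

An empty result does not mean the file is untouched: as noted above, an editor may regenerate every object with generation 0.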

If however the tool has been used to intentionally modify the PDF without leaving a trace then they would be spoofing those bits of data so they won't help you.


Are the users modifying the PDF using Acrobat? If so, then what Danio mentioned above should work. Strictly speaking, modifying the PDF should change its ModDate or xmp:ModifyDate without changing its CreationDate. However, not all tools adhere to this; quite a few simply leave all metadata untouched, so this method of checking isn't 100% reliable unless you know which PDF editor your users employ.

If the editor your users use does change ModDate or xmp:ModifyDate, then you should be able to see it in two places. One is when you open the document in Acrobat and hit Ctrl-D to view Document Properties. The Creation field and Modified field should have different timestamps. There may also be APIs that can be used to programmatically retrieve this metadata. The other way you can visualize it is to simply open the PDF in Notepad and search for the properties. Most of the document won't be human readable but these timestamps should be. If they do get changed appropriately, you can always parse for them in your application. Good luck!
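Parsing for the timestamps could look roughly like the sketch below, which searches the raw file bytes for `/CreationDate` and `/ModDate` entries, the same thing you would do by eye in Notepad. The names `info_dates` and `looks_modified` are hypothetical. It assumes the Info dictionary is stored uncompressed and the dates are written as literal strings in parentheses; files that keep these entries in compressed object streams, or only in XMP metadata, will not match.

```python
import re

# Matches entries such as: /ModDate (D:20190824160229+08'00')
DATE_ENTRY = re.compile(rb"/(CreationDate|ModDate)\s*\(([^)]*)\)")

def info_dates(pdf_bytes):
    """Collect CreationDate / ModDate strings; the last occurrence in the file wins."""
    dates = {}
    for match in DATE_ENTRY.finditer(pdf_bytes):
        dates[match.group(1).decode("ascii")] = match.group(2).decode("latin-1")
    return dates

def looks_modified(pdf_bytes):
    """True only if both dates are present and differ from each other."""
    dates = info_dates(pdf_bytes)
    return (
        "CreationDate" in dates
        and "ModDate" in dates
        and dates["CreationDate"] != dates["ModDate"]
    )
```

Read the file with `open(path, "rb").read()` and pass the bytes in. A `False` result is not proof of integrity, since many tools leave both dates untouched or rewrite both.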


If you're using Ubuntu Linux 18.04 with Document Viewer, you can:

  • click on File options (3 vertical line ellipsis)
  • click on Properties...
  • look for Created / Modified fields in the Properties pop up

Beware: A sufficiently knowledgeable user can manipulate the PDF contents without changing the Created and Modified time stamps in the PDF metadata and the file system.


You can use a tool to read the PDF file's properties.

I use pdfinfo, which prints many properties of the file that you can then check.

pdfinfo 58dcc41d01293.pdf
    Author:         worker
    Creator:        Microsoft® Word 2016
    Producer:       Microsoft® Word 2016
    CreationDate:   Sat Aug 24 16:02:29 2019
    ModDate:        Sat Aug 24 16:02:29 2019
    Tagged:         yes
    UserProperties: no
    Suspects:       no
    Form:           none
    JavaScript:     no
    Pages:          55
    Encrypted:      no
    Page size:      841.92 x 595.32 pts (A4)
    Page rot:       0
    File size:      3346838 bytes
    Optimized:      no
    PDF version:    1.7
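If you want to check this output from a program, a small sketch like the following could work. It assumes `pdfinfo` (from poppler-utils or Xpdf) is installed and on the PATH, and that its output keeps the `Key: value` layout shown above; the function names `run_pdfinfo`, `parse_pdfinfo` and `dates_differ` are hypothetical.

```python
import subprocess

def run_pdfinfo(path):
    """Run pdfinfo on a file and return its text output."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return result.stdout

def parse_pdfinfo(text):
    """Turn 'Key:   value' lines into a dict, splitting on the first colon only."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def dates_differ(fields):
    """True if both dates are present and CreationDate != ModDate."""
    created = fields.get("CreationDate")
    modified = fields.get("ModDate")
    return created is not None and modified is not None and created != modified
```

For the sample output above, both dates are identical, so `dates_differ(parse_pdfinfo(...))` would report no difference. As with the other metadata checks, that only shows the dates were not changed, not that the content was left alone.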