I'm trying to programmatically parse my Gmail for various indexing functions, and am having trouble finding certain headers that I thought were standard email headers. I'm using the Zend IMAP library, and have no problems with authentication and otherwise viewing/manipulating my Gmail. However, some headers are missing. For instance:
- About 1 in 10 of the messages are missing the "message-id" header, including many sent from other Gmail addresses.
- Occasionally, though rarely, the 'content-type', 'content-disposition', and 'filename' headers are missing from attachment headers. These always seem to be messages that are part of a longer thread of messages.
Can anybody explain why these headers might be missing? If the "message-id" header is missing, what is used as the unique identifier? Perhaps some sort of combination of other headers?
According to RFC 5322:
The only required header fields are the origination date field and the originator address field(s). All other header fields are syntactically optional.
The same RFC says:
Though listed as optional in the table in section 3.6, every message SHOULD have a "Message-ID:" field. Furthermore, reply messages SHOULD have "In-Reply-To:" and "References:" fields as appropriate and as described below.
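So Message-ID isn't, strictly speaking, mandatory. If it's missing, try looking for either the In-Reply-To or References fields. Note that those fields identify the messages being replied to rather than the message itself, so they help place it in a thread but are not a unique key on their own.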
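For indexing, one possible workaround when Message-ID is absent is to derive a surrogate key from the headers that are present. The sketch below is only an illustration in Python rather than Zend: the function name `message_key`, the choice of Date, From, and Subject as inputs, and the `local.invalid` suffix are all assumptions, not anything defined by RFC 5322 or by Gmail. Two distinct messages with identical values for those three headers would collide, so treat it as a starting point.

```python
import hashlib
from email import message_from_string


def message_key(raw: str) -> str:
    """Return the Message-ID if present, else a surrogate key (hypothetical scheme)."""
    msg = message_from_string(raw)

    message_id = msg.get("Message-ID")
    if message_id and str(message_id).strip():
        return str(message_id).strip()

    # Assumed fallback: hash a few headers that are usually present.
    # Date and From are the fields RFC 5322 actually requires.
    parts = [str(msg.get(name, "") or "") for name in ("Date", "From", "Subject")]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()

    # Shape the result like a Message-ID so downstream code can treat both alike.
    return f"<surrogate-{digest[:32]}@local.invalid>"
```

The same input always yields the same key, so re-indexing a mailbox should not create duplicates. If the index only needs to be unique within a single IMAP mailbox, the message UID the server assigns (together with the mailbox's UIDVALIDITY value) may be a simpler identifier than anything derived from headers.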