We're parsing an email inbox signed up to a mailing list (Mailman) that does nothing except sit there and capture emails from other users on the mailing list. This is going to be PHP connecting to an email box, grabbing new emails and putting them into a MySQL database for use as a web archive that's searchable.
I noticed that many of the subjects have RE:, FW: or FWD: in front of them (obviously), and wondered whether I need to strip these out manually to get a grouping by subject when outputting database results to the web page.
Maybe there's a PHP/Mail or PEAR class that will automatically handle message grouping/threading that I'm not aware of. Thanks for your help!
The proper way to thread them is not by subject, but rather by the Message-ID and References headers. The References header contains a whitespace-separated list of the Message-IDs of the earlier messages in the thread. By using these, the actual content of the subject line becomes less relevant, since it can get modified and mangled. Conversely, you might get many separate threads with subjects like "Need help please" that should not be threaded together.
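As a rough sketch of that idea (written in Python for brevity; the same logic ports to PHP), the code below groups messages by the first Message-ID listed in their References header and falls back to the message's own Message-ID when there are no references. The function names and sample messages are made up for illustration. It assumes References holds whitespace-separated IDs in angle brackets and that the first entry is the thread's root, which some mail clients do not guarantee.

```python
import re
from collections import defaultdict
from email import message_from_string

# A message ID is anything between angle brackets with no whitespace inside.
MSG_ID_RE = re.compile(r"<[^<>\s]+>")

def thread_root(msg):
    """Return the oldest referenced Message-ID, or the message's own ID."""
    refs = MSG_ID_RE.findall(msg.get("References", ""))
    if refs:
        return refs[0]
    own = MSG_ID_RE.findall(msg.get("Message-ID", ""))
    return own[0] if own else None

def group_by_thread(raw_messages):
    """Map each thread root ID to the subjects of the messages in that thread."""
    threads = defaultdict(list)
    for raw in raw_messages:
        msg = message_from_string(raw)
        threads[thread_root(msg)].append(msg["Subject"])
    return dict(threads)

# Hypothetical sample data: two unrelated threads that share a subject line.
SAMPLE_MESSAGES = [
    "Message-ID: <a@example.com>\nSubject: Need help please\n\nfirst question",
    "Message-ID: <b@example.com>\nReferences: <a@example.com>\n"
    "Subject: RE: Need help please\n\nreply to the first question",
    "Message-ID: <c@example.com>\nSubject: Need help please\n\nunrelated question",
]

for root, subjects in group_by_thread(SAMPLE_MESSAGES).items():
    print(root, subjects)
```

The two messages titled "Need help please" end up in different threads, while the "RE:" reply joins the first one without any subject stripping.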
You probably want to look into the References and In-Reply-To email headers. These give you information about which email the current email is in reply to.
There's a good algorithm for threading email based on this information here: http://www.jwz.org/doc/threading.html
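To give a feel for the parent/child linking step, here is a much simplified sketch in Python (not the full algorithm from that page, which also handles missing messages, loops and subject-based grouping). It takes the last entry of References as the direct parent and falls back to the first ID in In-Reply-To. The function names and sample messages are hypothetical.

```python
import re
from email import message_from_string

# A message ID is anything between angle brackets with no whitespace inside.
MSG_ID_RE = re.compile(r"<[^<>\s]+>")

def parent_id(msg):
    """Direct parent: last References entry, else first In-Reply-To ID."""
    refs = MSG_ID_RE.findall(msg.get("References", ""))
    if refs:
        return refs[-1]
    in_reply_to = MSG_ID_RE.findall(msg.get("In-Reply-To", ""))
    return in_reply_to[0] if in_reply_to else None

def build_children(raw_messages):
    """Map each parent ID (None for thread starters) to its child IDs."""
    children = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        ids = MSG_ID_RE.findall(msg.get("Message-ID", ""))
        if not ids:
            continue  # no usable Message-ID, so the message cannot be linked
        children.setdefault(parent_id(msg), []).append(ids[0])
    return children

# Hypothetical sample data: a -> b -> c, where b only carries In-Reply-To.
SAMPLE_THREAD = [
    "Message-ID: <a@example.com>\nSubject: Question\n\nroot",
    "Message-ID: <b@example.com>\nIn-Reply-To: <a@example.com>\n"
    "Subject: RE: Question\n\nfirst reply",
    "Message-ID: <c@example.com>\nReferences: <a@example.com> <b@example.com>\n"
    "Subject: RE: Question\n\nsecond reply",
]

tree = build_children(SAMPLE_THREAD)
print(tree)
```

Storing the parent ID alongside each message in the database would let the archive page rebuild the reply tree with a single pass over the rows.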