I'm wondering how I'd strip the quoted content out of an email, and perhaps do Gmail-like threading, when I have the email body HTML returned in Ruby on Rails.
For example, the following text is sent to my app:
Hey XYZ,
I've fixed that error and tested it a couple of times, it seems to be working fine now.
On Fri, Feb 18, 2011 at 7:44 AM, Joe David <joe@david.com> wrote:
Initial thread starts here...
--
Thanks
joe@david.com
Email format is a tricky thing. You could find the boundaries between emails using a regex that detects strings like "On Fri, Feb 18, 2011 at 7:44 AM, Joe David <joe@david.com> wrote:". But you can't guarantee that all incoming messages will have a string like that between emails.
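As a rough sketch of the regex idea, assuming the plain-text body is already extracted and the sender's client writes a Gmail-style "On ... wrote:" line (the constant and method names here are made up for illustration):

```ruby
# Hypothetical pattern for Gmail-style attribution lines such as
# "On Fri, Feb 18, 2011 at 7:44 AM, Joe David <joe@david.com> wrote:".
# Other mail clients and locales word this line differently.
ATTRIBUTION_LINE = /^On .+ wrote:\s*$/

# Returns the text above the first attribution line, or the whole
# body (stripped) when no such line is found.
def strip_quoted_reply(body)
  index = body =~ ATTRIBUTION_LINE
  index ? body[0...index].rstrip : body.strip
end
```

A new line that happens to start with "On " and end with "wrote:" would also be treated as a boundary, so this is only a heuristic.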
Another option is to check for the > character, which may appear at the beginning of each quoted line. However, once again, you have to worry about what happens if you receive a message that doesn't follow this convention.
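A minimal sketch of the > check, again assuming a plain-text body (the method name is hypothetical):

```ruby
# Drops every line whose first non-blank character is ">",
# which is how many clients mark quoted text in plain-text mail.
def drop_quoted_lines(body)
  body.each_line.reject { |line| line.match?(/\A[ \t]*>/) }.join.strip
end
```

Lines that merely contain a > somewhere in the middle are kept; only a leading > counts as a quote marker here.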
Yet another option, and I think Gmail does this, is to look for matches between the incoming message and previous messages. That is, if you see "Initial thread starts here...Thanks, joe@david.com" in the message, and you also have that in a previous message in your database, you could infer that it's a quote from earlier in the thread.
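Here is one way the matching idea could look, as a sketch only. It assumes you can load the earlier message bodies of the thread as an array of strings (previous_bodies below is a stand-in for whatever your database query returns), and it compares line by line, which is my simplification and not necessarily what Gmail does:

```ruby
require "set"

# Strips leading quote markers and surrounding whitespace so that
# "> Thanks" and "Thanks" compare as equal.
def normalize_line(line)
  line.sub(/\A[\s>]+/, "").strip
end

# Removes every line of body that also appears in an earlier message.
# previous_bodies is a hypothetical array of earlier message texts.
def strip_known_quotes(body, previous_bodies)
  seen = Set.new
  previous_bodies.each do |previous|
    previous.each_line do |line|
      normalized = normalize_line(line)
      seen << normalized unless normalized.empty?
    end
  end

  body.each_line.reject { |line| seen.include?(normalize_line(line)) }.join.strip
end
```

Short lines such as "Thanks" will be removed even when the sender typed them fresh, so in practice you would probably want to require a run of several consecutive matching lines before treating them as a quote.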