Stripping out the original message from an email reply
Problem
My application receives email from users. A reply sent from Gmail, for example, arrives like this:
This is some new text
On Sun, Apr 1, 2012 at 3:32 AM, My app wrote:
> Original...
> message..
Of course, this treatment varies from client to client.
Right now I locate the '4f77ed3860c258a567aeabf8' identifier and discard everything after it, because I know which email address the user replied to. This is not a general solution, but it works for my purposes, except when the "Original message" line contains a line break, as in the example above.
Is there a better, standard way to strip previous messages out of a user's reply to an email?
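For illustration, the cut-off approach can be made tolerant of a wrapped attribution line. This is only a minimal sketch: the function name `stripQuotedReply` is hypothetical, and it assumes an English, Gmail-style "On ... wrote:" line that wraps onto at most one extra line. Other clients and languages would need their own patterns.

```javascript
// Hypothetical helper: cut a reply at a Gmail-style "On ... wrote:" line.
// Assumption: the attribution is in English and wraps onto at most two lines.
function stripQuotedReply(body) {
  const lines = body.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    if (!/^On\s/.test(lines[i])) continue;
    // Try the line alone, then joined with the next line (wrapped case).
    const single = lines[i].trim();
    const joined = (lines[i] + ' ' + (lines[i + 1] || '')).trim();
    const isAttribution = [single, joined].some((s) => /^On\s.+wrote:$/.test(s));
    if (isAttribution) {
      // Keep only what the user typed above the attribution.
      return lines.slice(0, i).join('\n').trim();
    }
  }
  // No attribution found: return the body unchanged (apart from trimming).
  return body.trim();
}
```

A new line that happens to start with "On" and is followed by a line ending in "wrote:" would be misread as an attribution, so this remains a heuristic rather than a general solution.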
Solution
If you want a fully reliable way to remove everything except the most recent post, compare the new message with the previous one character by character. If you don't want to write your own diff parser, check out this library:
https://github.com/cemerick/jsdifflib
Or, if you want a lightweight algorithm, check this one out:
http://ejohn.org/projects/javascript-diff-algorithm/
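As a rough sketch of the comparison idea, the example below drops every reply line that also occurs in the message the application originally sent. It does not use either library above and is a plain line-set comparison rather than a real diff algorithm. The function name `removeOriginalLines` is hypothetical, and it assumes the application still has the original message text on hand.

```javascript
// Hypothetical helper: keep only reply lines that do not occur in the
// message the application originally sent.
// Assumption: `original` is the exact text that was sent to the user.
function removeOriginalLines(reply, original) {
  // Normalise a line: drop leading ">" quote markers and surrounding spaces.
  const normalise = (line) => line.replace(/^(\s*>)+/, '').trim();

  // Collect the non-empty lines of the original message for fast lookup.
  const sent = new Set(
    original.split(/\r?\n/).map(normalise).filter((l) => l !== '')
  );

  return reply
    .split(/\r?\n/)
    .filter((line) => !sent.has(normalise(line)))
    .join('\n')
    .trim();
}

const original = 'Original...\nmessage..';
const reply =
  'This is some new text\n' +
  'On Sun, Apr 1, 2012 at 3:32 AM, My app wrote:\n' +
  '> Original...\n' +
  '> message..';
console.log(removeOriginalLines(reply, original));
```

Two limitations are worth noting. The "On ... wrote:" line is generated by the mail client and is not part of the original message, so it survives and needs separate handling. A new line that is identical to a line of the original is also removed. A proper diff, as the libraries above provide, handles re-wrapped or partially edited lines better.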