Node.js: Regular expressions to get e-mail headers and body

Developer https://www.devze.com 2023-02-25 13:16 Source: Internet
I know very little about regular expressions and I'm having trouble getting the information I need from an e-mail, so I'd like your help reading the fields "status", "to", "from", "subject" and "body".

The e-mail has failed, details:

Action: failed
Status: 5.0.0 (permanent failure)

---------- Forwarded message ----------
From: exp@gmail.com
To: regular_exp@gmail.com
Date: Tue, 12 Apr 2011 13:55:23 +0000
Subject: test
hellloooooo

What's the best way to do it using JavaScript?

Thanks


A regular expression is probably not the best tool for this job. What you really want is a library that properly parses RFC 2822 email messages, especially since you want to extract the body. If you look at the spec, you'll see that there's a lot of complexity involved in parsing an email (text encodings, MIME, etc.).
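To give one concrete piece of that complexity: in a raw RFC 2822 message the headers end at the first empty line, and a long header may be folded onto following lines that start with a space or tab. A per-line regex misses the folded part. Here is a minimal sketch of handling just that. `splitMessage` is a name I made up for illustration, not a library function, and collapsing each fold to a single space is a simplification. It does nothing about MIME or text encodings.

```javascript
// Sketch only: split a raw message into headers and body, unfolding
// continuation lines. splitMessage is a hypothetical helper, not a library API.
function splitMessage(raw) {
    // Normalise line endings so the rest can assume "\n".
    var text = raw.replace(/\r\n/g, "\n");

    // The header block ends at the first empty line.
    var sep = text.indexOf("\n\n");
    var headerBlock = sep === -1 ? text : text.slice(0, sep);
    var body = sep === -1 ? "" : text.slice(sep + 2);

    // A line starting with a space or tab continues the previous header.
    // Simplification: the fold is collapsed to one space.
    var unfolded = headerBlock.replace(/\n[ \t]+/g, " ");

    var headers = {};
    unfolded.split("\n").forEach(function (line) {
        var m = /^([^:\s]+):\s*(.*)$/.exec(line);
        if (m) headers[m[1].toLowerCase()] = m[2];
    });
    return { headers: headers, body: body };
}

// Made-up sample with a folded Subject header.
var sample = "From: exp@gmail.com\r\n" +
    "Subject: a long subject that the sending\r\n" +
    "  program folded onto a second line\r\n" +
    "\r\n" +
    "hellloooooo\r\n";

var parsed = splitMessage(sample);
console.log(parsed.headers.subject);
console.log(parsed.body);
```

Even this small case needs more than one pattern, which is why a real parser is the safer choice for anything beyond a fixed, known format.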

Using mailparser:

// Note: this uses the event-based API of an early mailparser release,
// loaded here from a local copy of the module.
var mailparser = require("./mailparser"),
    fs = require("fs"),
    util = require("util"); // "sys" is the deprecated old name of "util"

fs.readFile('mail.txt', function (err, data) {
    if (err) throw err;

    var mp = new mailparser.MailParser();

    // callback for the headers object
    mp.on("headers", function (headers) {
        console.log("HEADERS");
        console.log(util.inspect(headers, false, 5));
    });

    // callback for the body object
    mp.on("body", function (body) {
        console.log("BODY");
        console.log(util.inspect(body, false, 7));
    });

    // feed the raw message to the parser, then signal the end of input
    mp.feed(data.toString("ascii"));
    mp.end();
});


Assuming that these fields are as simple and consistent as

[\n]From: [...][\n]

then an expression like

/\nFrom: (.+)\n/

would work for you. Replace From: with Date: etc. The capture group holds the field's value.

And use string.match(regExp)

Update:

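var bodyRegex = /\nSubject: (.+)\n([\s\S]*)/;
var string = ...; // the raw e-mail text
var result = string.match(bodyRegex); // null if the pattern does not match
result[1]; // Subject
result[2]; // Body (everything after the Subject line, including line breaks)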
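Putting the pieces together for the message in the question, something like the following might do. It is only a sketch under the same assumption as above: every field sits on its own "Name: value" line and the body is whatever follows the Subject line. `getField` is a hypothetical helper name, and the field names passed to it are assumed to be plain words with no regex metacharacters.

```javascript
// The sample message from the question, reproduced as a string.
var mail = [
    "The e-mail has failed, details:",
    "",
    "Action: failed",
    "Status: 5.0.0 (permanent failure)",
    "",
    "---------- Forwarded message ----------",
    "From: exp@gmail.com",
    "To: regular_exp@gmail.com",
    "Date: Tue, 12 Apr 2011 13:55:23 +0000",
    "Subject: test",
    "hellloooooo"
].join("\n");

// Hypothetical helper: return the value of a "Name: value" line, or null.
function getField(text, name) {
    // "m" lets ^ and $ match at line boundaries; "i" ignores case.
    var match = text.match(new RegExp("^" + name + ":[ \\t]*(.*)$", "mi"));
    return match ? match[1] : null;
}

// Assumption: the body is everything after the Subject line.
var bodyMatch = mail.match(/^Subject:.*\n([\s\S]*)/m);

var fields = {
    status: getField(mail, "Status"),
    to: getField(mail, "To"),
    from: getField(mail, "From"),
    subject: getField(mail, "Subject"),
    body: bodyMatch ? bodyMatch[1] : null
};

console.log(fields);
```

Anchoring on the start of a line with the "m" flag avoids depending on a newline before and after each field, so a field on the first or last line of the text still matches.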