I've been whacking on this regex for a while, trying to build something that can pick out multiple ordered property values (DTSTART, DTEND, SUMMARY) from an .ics file. I have 开发者_如何学JAVAother options (like reading one line at a time and scanning), but wanted to build a single regex that can handle the whole thing.
SAMPLE PERL
# There has got to be a better way...
my $x1 = '(?:^DTSTART[^\:]*:(?<dts>.*?)$)';
my $x2 = '(?:^DTEND[^\:]*:(?<dte>.*?)$)';
my $x3 = '(?:^SUMMARY[^\:]*:(?<dtn>.*?)$)';
my $fmt = "$x1.*$x2.*$x3|$x1.*$x3.*$x2|$x2.*$x1.*$x3|$x2.*$x3.*$x1|$x3.*$x1.*$x2|$x3.*$x2.*$x1";
if ($evts[1] =~ /$fmt/smo) {
printf "lines:\n==>\n%s\n==>\n%s\n==>\n%s\n", $+{dts}, $+{dte}, $+{dtn};
} else {
print "Failed.\n";
}
SAMPLE DATA
BEGIN:VEVENT
UID:0A5ECBC3-CAFB-4CCE-91E3-247DF6C6652A TRANSP:OPAQUE SUMMARY:Gandalf_flinger1 DTEND:20071127T170005 DTSTART,lang=en_us:20071127T103000 DTSTAMP:20100325T003424Z X-APPLE-EWS-BUSYSTATUS:BUSY SEQUENCE:0 END:VEVENTSAMPLE OUTPUT
lines:
==> 20071127T103000 ==> 20071127T170005 ==> Gandalf_flinger1CPAN is your friend:
vFile
iCal parser
You will pull your hair out until bald without a parser on vFile format (other than trivial files.) Regex for this is very hard.
Instead of permuting the three regexes into one big pattern with ORs, why not test the three patterns separately, since (given the anchoring $
s, ) they cannot overlap?
my $x1 = qr/(?:^DTSTART[^:]*:(?<dts>.*?)$)/smo;
my $x2 = qr/(?:^DTEND[^:]*:(?<dte>.*?)$)/smo;
my $x3 = qr/(?:^SUMMARY[^:]*:(?<dtn>.*?)$)/smo;
if ($evts[1] =~ $x1 and $evts[1] =~ $x2 and $evts[1] =~ $x3)
{
# ...
}
(I also turned the x variables into patterns themselves, and removed the unneeded escape in the character classes.)
It's better to use three regexes and some extra logic. This problem isn't a good match for regexes.
That's ugly... I think that the "better way" is to match each property, once at a time.
精彩评论