I'm parsing some RSS feeds that aggregate what's going on in a given city. I'm only interested in the stuff that is happening today.
At the moment I have this:
require 'rubygems'
require 'rss/1.0'
require 'rss/2.0'
require 'open-uri'
require 'shorturl'
source = "http://rss.feed.com/example.xml"
content = ""
open(source) do |s| content = s.read end
rss = RSS::Parser.parse(content, false)
t = Time.now
day = t.day.to_s
month = t.strftime("%b")
rss.items.each do |rss|
if rss.title.to_s.include?(day) && rss.title.to_s.include?(month)
# does stuff with it
end
end
Of course, because I only check whether the title (which I know contains the event date in the format "(2nd Apr 11)") contains the day and the month (e.g. '2' and 'May'), I also get events that happen on the 12th of May, the 20th of May and so on. How can I make it foolproof and only get today's events?
Here's a sample title: "Diggin Deeper @ The Big Chill House (12th May 11)"
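# date_string is the item's title, e.g. "Diggin Deeper @ The Big Chill House (12th May 11)"
today = Time.now.strftime("%d:%b:%y")
# Require at least one digit followed by a two-letter suffix (st/nd/rd/th),
# so the match cannot start on arbitrary text earlier in the title
if date_string =~ /(\d+)[a-z]{2} (\w+) (\d\d)/
  article_date = sprintf("%02i:%s:%s", $1.to_i, $2, $3)
  if today == article_date
    # this is today
  else
    # this is not today
  end
else
  raise("No date found in title.")
end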
There could potentially be problems if the title contains other numbers. Does the title have any bounding characters around the date, such as a hyphen before the date or brackets around it? Adding those to the regex could prevent trouble. Could you give us an example title? (An alternative would be to use Time#strftime to create a string which would perfectly match the date as it appears in the title and then just use String#include? with that string, but I don't think there's an elegant way to put the 'th'/'nd'/'rd'/etc on the day.)
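The Time#strftime plus String#include? alternative mentioned above might look roughly like the sketch below. The helpers `ordinal_suffix` and `todays_tag` are hypothetical names written for this example, not part of Ruby's standard library, and the bracketed "(2nd Apr 11)" format is assumed from the question.

```ruby
# Hypothetical helper: English ordinal suffix for a day of the month.
def ordinal_suffix(day)
  return "th" if (11..13).include?(day % 100)
  case day % 10
  when 1 then "st"
  when 2 then "nd"
  when 3 then "rd"
  else "th"
  end
end

# Hypothetical helper: builds the date exactly as the titles are assumed
# to show it, e.g. "(12th May 11)", brackets included.
def todays_tag(time = Time.now)
  "(#{time.day}#{ordinal_suffix(time.day)} #{time.strftime('%b %y')})"
end

title = "Diggin Deeper @ The Big Chill House (12th May 11)"
puts title.include?(todays_tag(Time.local(2011, 5, 12)))  # true
puts title.include?(todays_tag(Time.local(2011, 5, 2)))   # false
```

Including the opening bracket in the search string is what stops "2nd May" from matching inside "12th May".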
Use something like this:
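def check_day(date)
  t = Time.now
  day = t.day.to_s
  month = t.strftime("%b")
  year = t.strftime("%y")
  # Accept any ordinal suffix and the current two-digit year
  # instead of a hard-coded "nd" and "11"
  if date =~ /^#{day}(?:st|nd|rd|th)\s#{month}\s#{year}/
    puts "today!"
  else
    puts "not today!"
  end
end
# Output shown is what you would get when run on 3rd May 2011:
check_day "3rd May 11" #=> today!
check_day "13th May 11" #=> not today!
check_day "30th May 11" #=> not today!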
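A different approach from the string comparisons above would be to parse the bracketed date into a Date object and compare it with Date.today. This is only a sketch: `event_date` is a hypothetical helper name, and it assumes the date always appears in brackets as in the sample title.

```ruby
require 'date'

# Hypothetical helper: pulls "(12th May 11)" out of a title and returns a
# Date, or nil when no bracketed date is present or it is not a valid date.
def event_date(title)
  m = title.match(/\((\d{1,2})(?:st|nd|rd|th) (\w{3}) (\d\d)\)/)
  return nil unless m
  Date.strptime("#{m[1]} #{m[2]} #{m[3]}", "%d %b %y")
rescue ArgumentError
  nil
end

title = "Diggin Deeper @ The Big Chill House (12th May 11)"
puts event_date(title)                           # 2011-05-12
puts event_date(title) == Date.new(2011, 5, 12)  # true
puts event_date(title) == Date.today             # true only on the event's day
```

Comparing Date objects avoids substring problems altogether, since the 2nd and the 12th are simply different dates.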