I am trying to parse a web page and find the RSS feed for the website. Can anybody suggest how I would go about finding a website's RSS feed programmatically?
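I would parse the HTML and query it with XPath. Sites usually advertise their feeds with <link> elements in the page head, so to find RSS feeds (rss+xml) use the XPath expression //link[@type="application/rss+xml"]/@href
and to find Atom feeds (atom+xml) use //link[@type="application/atom+xml"]/@href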
Something like the code snippet below, which uses the libxml2 parser through a wrapper I've built based on the hpple wrapper. Even if you can't use it directly, it should give you some pseudocode ideas. It returns an array of feed URLs as NSURLs. The code assumes you already have the HTML you wish to search in an NSData object.
- (NSArray *)getRSSFeedsUrlFromData:(NSData *)data {
    // Manual reference counting (pre-ARC): the returned array is autoreleased.
    NSMutableArray *rssFeeds = [NSMutableArray array];
    // Document and DocumentElement are the custom hpple-based wrapper classes.
    Document *doc = [[Document alloc] initWithHTMLData:data];

    // One XPath query per feed type: rss+xml first, then atom+xml.
    NSArray *queries = [NSArray arrayWithObjects:
        @"//link[@type=\"application/rss+xml\"]/@href",
        @"//link[@type=\"application/atom+xml\"]/@href",
        nil];

    for (NSString *query in queries) {
        for (DocumentElement *element in [doc search:query]) {
            // URLWithString: returns nil for a malformed string,
            // and adding nil to an array raises an exception.
            NSURL *url = [NSURL URLWithString:[element content]];
            if (url != nil) {
                [rssFeeds addObject:url];
            }
        }
    }

    [doc release];
    return rssFeeds;
}
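Since the Objective-C snippet depends on a custom wrapper, here is a rough, self-contained sketch of the same lookup in Python, using only the standard library. It is an illustration rather than a translation: the names FeedLinkParser and find_feed_urls and the example.com URLs are made up, it walks the tags with html.parser instead of running XPath, it compares the type attribute case-insensitively, and it resolves relative href values against a base URL, which the code above does not do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types that mark a <link> element as a feed (same two as the XPath queries).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Hypothetical helper: collects href values of feed <link> elements."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attributes = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        link_type = (attributes.get("type") or "").strip().lower()
        href = attributes.get("href")
        if link_type in FEED_TYPES and href:
            # Assumption: relative hrefs should be resolved against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feed_urls(html, base_url=""):
    """Return feed URLs advertised in the HTML, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.feeds


page = """<html><head>
<link rel="alternate" type="application/rss+xml" href="/feed.xml">
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/atom+xml" href="atom.xml" />
</head><body></body></html>"""

print(find_feed_urls(page, "https://example.com/blog/"))
# ['https://example.com/feed.xml', 'https://example.com/blog/atom.xml']
```

Unlike the two separate XPath queries, this returns RSS and Atom links interleaved in the order they appear in the page.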