I am trying to extract the summary (the introductory section) of a Wikipedia article and store it as a string. This works well for some articles, but Wikipedia's markup is inconsistent from page to page, so my NSScanner code fails fairly often.
Here's my NSScanner implementation:
// `string` holds the article's HTML source.
NSString *separatorString = @"<table id=\"toc\" class=\"toc\">";
NSString *muString = @"</table>";
NSString *container = nil;
NSScanner *aScanner = [NSScanner scannerWithString:string];
// Skip past the first closing </table> tag.
[aScanner scanUpToString:muString intoString:nil];
[aScanner scanString:muString intoString:nil];
// Capture everything up to the table of contents.
[aScanner scanUpToString:separatorString intoString:&container];
How could this be improved? Or is there another way of getting this?
To visualize which bit of the article I want, here's an example:
http://en.wikipedia.org/wiki/Indigo
From this page, I'd want everything from "Indigo is the color on the electromagnetic spectrum" to "in English was in 1289".
Thanks!
You could use WebKit's DOM API to walk the actual structure, rather than trying to parse the text blindly.
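As a minimal sketch of that idea, assuming the page has already finished loading in a legacy WebView on macOS, and assuming the introductory paragraphs are <p> elements that are siblings of, and come before, the element with id "toc" (this matches the markup in the question, but it is an assumption about Wikipedia's layout and may not hold for every article). The helper name IntroTextFromDocument is hypothetical:

```objc
#import <Foundation/Foundation.h>
#import <WebKit/WebKit.h>

// Hypothetical helper: collects the text of every <p> element that
// precedes the table of contents (id="toc") among its siblings.
// Returns nil if the document has no element with that id.
static NSString *IntroTextFromDocument(DOMDocument *document) {
    DOMElement *toc = [document getElementById:@"toc"];
    if (toc == nil) {
        return nil;
    }

    NSMutableArray *paragraphs = [NSMutableArray array];
    // Walk backwards from the table of contents, keeping document order
    // by inserting each paragraph at the front of the array.
    for (DOMNode *node = [toc previousSibling]; node != nil; node = [node previousSibling]) {
        if ([[node nodeName] caseInsensitiveCompare:@"p"] == NSOrderedSame) {
            [paragraphs insertObject:[node textContent] atIndex:0];
        }
    }
    return [paragraphs componentsJoinedByString:@"\n\n"];
}
```

You would call it once the load has completed, for example from the frame load delegate's webView:didFinishLoadForFrame: with [frame DOMDocument] as the argument. Because textContent returns only text, the result has no tags left to strip, and articles that lack a table before the introduction no longer break the extraction.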