Hi, I am trying to loop through an XML document using NSXMLParser and am having trouble with the description tag.
Some news websites put strange characters (HTML tags, <, >, a, etc.) in that tag, so the parsing does not work as expected. Could anyone provide some help?
Thanks
You'll need to convert entity references to the characters that they represent. Any HTML tags would either need to be stripped, or fed into a UIWebView.
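As a rough illustration of the entity-conversion step, here is a minimal sketch. The function name DecodeBasicEntities is hypothetical (it is not a Foundation API), and it assumes only the handful of entities listed need handling; real feeds may also contain numeric or other named entities.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of Foundation: decodes a few common
// entity references into the characters they represent.
static NSString *DecodeBasicEntities(NSString *input) {
    NSMutableString *result = [NSMutableString stringWithString:input];
    // Entity/replacement pairs. &amp; comes last so that "&amp;lt;"
    // decodes to "&lt;" rather than all the way to "<".
    NSArray *pairs = [NSArray arrayWithObjects:
        @"&lt;", @"<",
        @"&gt;", @">",
        @"&quot;", @"\"",
        @"&#39;", @"'",
        @"&amp;", @"&",
        nil];
    for (NSUInteger i = 0; i + 1 < [pairs count]; i += 2) {
        [result replaceOccurrencesOfString:[pairs objectAtIndex:i]
                                withString:[pairs objectAtIndex:i + 1]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, [result length])];
    }
    return result;
}
```

Calling DecodeBasicEntities on the text collected for the description element should give a string that can then be stripped of tags or handed to a UIWebView.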
To strip the HTML tags, you can use a method like this:
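- (NSString *)flattenHTML:(NSString *)html {
    NSScanner *theScanner = [NSScanner scannerWithString:html];
    // Treat whitespace as ordinary text so spacing around tags is preserved.
    [theScanner setCharactersToBeSkipped:nil];
    NSMutableString *result = [NSMutableString string];
    NSString *text = nil;
    while ([theScanner isAtEnd] == NO) {
        // Copy the text that precedes the next tag, if there is any.
        if ([theScanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }
        // Skip the tag itself, including its closing ">".
        if ([theScanner scanString:@"<" intoString:NULL]) {
            [theScanner scanUpToString:@">" intoString:NULL];
            [theScanner scanString:@">" intoString:NULL];
        }
    }
    // Trim leading and trailing whitespace left behind by removed tags.
    return [result stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}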
After that, you can replace any remaining unwanted characters with ordinary string replacement.
Hope this helps.
Thanks,
Madhup