I am parsing an RSS feed, but I am not able to parse the encoding data from the feed. How do I parse the encoding data from an RSS feed?
It's a rough task. feedparser (Python) does a number of things to try to guess the right character set. There are a few places where the encoding can be declared, such as the XML declaration at the top of the document and the Content-Type header of the HTTP response (which overrides the XML declaration). If it's declared in neither place (or the declared value is completely invalid, which is quite common), feedparser falls back to statistical guessing. There's one last technique: try decoding the feed as UTF-8 and, if that fails, convert it from ISO-8859-1 to UTF-8 and try again. Good luck!
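If you want to see roughly what that lookup order looks like in code, here is a minimal standard-library sketch. It is not feedparser's actual implementation: the function names (`sniff_http_charset`, `sniff_xml_encoding`, `decode_feed`) are made up for illustration, the regular expressions are a simplification, and the statistical-guessing step is left out entirely.

```python
import re


def sniff_xml_encoding(raw: bytes):
    """Return the encoding named in the XML declaration, or None (hypothetical helper)."""
    match = re.match(
        rb'\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', raw
    )
    return match.group(1).decode("ascii") if match else None


def sniff_http_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None (hypothetical helper)."""
    if not content_type:
        return None
    match = re.search(
        r'charset\s*=\s*"?([A-Za-z0-9._-]+)"?', content_type, re.IGNORECASE
    )
    return match.group(1) if match else None


def decode_feed(raw: bytes, content_type=None):
    """Decode feed bytes and return (text, encoding_used)."""
    # The HTTP charset is tried before the XML declaration, then plain UTF-8.
    candidates = [
        sniff_http_charset(content_type),
        sniff_xml_encoding(raw),
        "utf-8",
    ]
    for encoding in candidates:
        if not encoding:
            continue
        try:
            return raw.decode(encoding), encoding
        except (LookupError, UnicodeDecodeError):
            # Unknown encoding name, or bytes that do not fit it: try the next one.
            continue
    # ISO-8859-1 maps every byte to a character, so this last resort cannot fail.
    return raw.decode("iso-8859-1"), "iso-8859-1"
```

You would pass in the raw response body and, if you have it, the value of the Content-Type header, for example `text, used = decode_feed(body, headers.get("Content-Type"))`. Because the ISO-8859-1 fallback never raises, a feed in some other single-byte encoding will decode without an error but may contain wrong characters, which is where a statistical guesser earns its keep.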