Android Pull Parsing RSS feeds troubles

I am working on a very simple RSS reader for Android as a learning experience. I decided to use XmlPullParser to parse the feeds because it is simple and efficient enough for my needs. Parsing my test feed (rss.slashdot.org/slashdot/slashdot) fails with an error that I can't resolve despite scouring the web for answers. The log output (from Eclipse) is:

START_TAG <image>@2:1252 in java.io.InputStreamReader@43e7a488
START_TAG (empty) <{http://www.w3.org/2005/Atom}atom10:link rel='self' type='application/rss+xml' href='http://rss.slashdot.org/Slashdot/slashdot'>@2:1517 in java.io.InputStreamReader@43e7a488
DEBUG/JRSS(313): java.net.MalformedURLException: Protocol not found:

The file in question is:

<image>
    ...
</image>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://rss.slashdot.org/Slashdot/slashdot" />
<feedburner:info uri="slashdot/slashdot" />
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" />
    ...

So the error appears to occur at the feedburner tag.

Finally, my code is:

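import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

import android.content.Context;
import android.util.Log;

public class XmlHelper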
{
    private XmlPullParserFactory factory;
    private XmlPullParser xpp;
    private final int START_TAG = XmlPullParser.START_TAG;

    // Debugging Tag
    private final String TAG = "JRSS";

    // for channels and items
    private final String TITLE = "title";
    private final String LINK = "link";
    private final String DESCRIPTION = "description";
    private final String PUBDATE = "pubDate";

    // element keys for channel
    private final String LANGUAGE = "language";
    private final String IMAGE = "image";
    private final String ITEM = "item";

    // for items
    private final String AUTHOR = "author";

    // for images
    private final String URL = "url";
    private final String WIDTH = "width";
    private final String HEIGHT = "height";

    public XmlHelper(Context context)
    {
        try
        {
            factory = XmlPullParserFactory.newInstance();
        }
        catch (XmlPullParserException e)
        {
            Log.d(TAG, e.toString());
        }
        factory.setNamespaceAware(true);
    }

    public Channel addFeed(URL url) throws XmlPullParserException, IOException
    {       
        Channel c = new Channel();
        c.items = new ArrayList<Item>();

        xpp = factory.newPullParser();
        xpp.setInput(url.openStream(), null);

        // move past rss element
        xpp.nextTag();
        // move past channel element
        xpp.nextTag();

        while(xpp.nextTag() == START_TAG)
        {
            Log.d(TAG, xpp.getPositionDescription());

            if(xpp.getName().equals(TITLE))
                c.title = xpp.nextText();

            else if(xpp.getName().equals(LINK))
                c.url = new URL(xpp.nextText());

            else if(xpp.getName().equals(DESCRIPTION))
                c.description = xpp.nextText();

            else if(xpp.getName().equals(LANGUAGE))
                c.language = xpp.nextText();

            else if(xpp.getName().equals(ITEM))
            {
                Item i = parseItem(xpp);
                c.items.add(i);
            }

            else if(xpp.getName().equals(IMAGE))
            {
                parseImage(xpp);
            }

            else
                xpp.nextText();
        }

        return c;
    }

    public Item parseItem(XmlPullParser xpp) throws MalformedURLException, XmlPullParserException, IOException
    {
        Item i = new Item();

        while(xpp.nextTag() == START_TAG)
        {
            // do nothing for now
            xpp.nextText();
        }

        return i;
    }

    private void parseImage(XmlPullParser xpp) throws XmlPullParserException, IOException
    {
        // do nothing for now
        while(xpp.nextTag() == START_TAG)
        {
            xpp.nextText();
        }
    }
}

I don't really know if there is a way to just ignore this (because at this point I don't care about the feedburner tag) or if there is some feature of the parser that I can set to make this work, or if I'm going about this the wrong way. Any help / advice / guidance would be appreciated.
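
Judging from the log above, the parser seems to fail at the atom10:link element rather than at the feedburner one. With namespace awareness switched on, getName() returns only the local name, so atom10:link looks like a plain link, its nextText() is empty, and new URL("") throws the MalformedURLException. Below is a minimal sketch of one way around this, assuming the RSS elements of this feed are in no namespace. PullParserUtil, isRssElement and skip are hypothetical names, not part of the XmlPullParser API.

```java
import java.io.IOException;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;

// Hypothetical helper class; the names are illustrative only.
public final class PullParserUtil {
    private PullParserUtil() {}

    // True only for an element with the given local name that is in no
    // namespace. Assumption: plain RSS 2.0 elements carry no namespace,
    // while extensions such as atom10:link or feedburner:info do.
    public static boolean isRssElement(XmlPullParser xpp, String name) {
        return "".equals(xpp.getNamespace()) && name.equals(xpp.getName());
    }

    // Skips the current element including any nested children.
    // The parser must be positioned on a START_TAG; on return it is
    // positioned on the matching END_TAG.
    public static void skip(XmlPullParser xpp)
            throws XmlPullParserException, IOException {
        if (xpp.getEventType() != XmlPullParser.START_TAG) {
            throw new IllegalStateException("skip() must be called on a START_TAG");
        }
        int depth = 1;
        while (depth > 0) {
            switch (xpp.next()) {
                case XmlPullParser.START_TAG:
                    depth++;
                    break;
                case XmlPullParser.END_TAG:
                    depth--;
                    break;
                case XmlPullParser.END_DOCUMENT:
                    throw new XmlPullParserException("Unexpected end of document");
                default:
                    break; // text, comments etc. are ignored
            }
        }
    }
}
```

In addFeed, the link branch would then test isRssElement(xpp, LINK) instead of xpp.getName().equals(LINK), and the final else could call skip(xpp), so that unknown elements with child elements (not just text) are ignored as well.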

Pull parsing is more efficient than SAX, but in my opinion it still leaves you a lot of work before your reader can parse any feed out there.

You need to cater to all formats: RSS 1, RSS 2, Atom, etc. Even then you will have to contend with poorly formatted feeds.

I faced similar problems in the past, so I decided to do my feed parsing on a server and just fetch the parsed contents. This allows me to run more complex libraries and parsers, which I can modify without pushing out updates for my app. You should look at server-side options so that you can keep your app lightweight and simple.

I have the following service running on AppEngine, which allows for much simpler XML / JSON parsing at your end. The response has a fixed and simple structure. You can use this for parsing:

http://evecal.appspot.com/feedParser

You can send both POST and GET requests with the following parameters.

feedLink : The URL of the RSS feed
response : JSON or XML as the response format

Examples:

For a POST request

curl --data-urlencode "feedLink=http://feeds.bbci.co.uk/news/world/rss.xml" --data-urlencode "response=json" http://evecal.appspot.com/feedParser

For a GET request

evecal.appspot.com/feedParser?feedLink=http://feeds.nytimes.com/nyt/rss/HomePage&response=xml
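
When building the GET request from code, it is probably safer to URL-encode the parameter values, since the feed URL itself contains characters such as ":" and "/". Here is a minimal sketch of that step. The endpoint and the parameter names feedLink and response are taken from the description above, while FeedParserRequest and buildGetUrl are hypothetical names, and whether the service strictly requires encoding is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; only the endpoint and parameter names come from the post.
public final class FeedParserRequest {
    private static final String ENDPOINT = "http://evecal.appspot.com/feedParser";

    private FeedParserRequest() {}

    // Builds the GET URL with both query values percent-encoded as UTF-8.
    public static String buildGetUrl(String feedLink, String responseFormat)
            throws UnsupportedEncodingException {
        return ENDPOINT
                + "?feedLink=" + URLEncoder.encode(feedLink, "UTF-8")
                + "&response=" + URLEncoder.encode(responseFormat, "UTF-8");
    }
}
```

For example, buildGetUrl("http://feeds.nytimes.com/nyt/rss/HomePage", "xml") yields a URL that can be passed to new URL(...).openStream() on a background thread.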

My Android app "NewsSpeak" uses this too.