PySolr RSS dataimport

Developer https://www.devze.com 2022-12-18 07:27 Source: the web
I am using PySolr to run my search. I want to index an RSS feed and was wondering whether this is possible with PySolr and, if so, how to do it.

I have found instructions on how to do this in Solr at http://wiki.apache.org/solr/DataImportHandler#HttpDataSource_Example

but I can't find anything on how to do the equivalent in PySolr.

Thanks


You probably don't need to do the equivalent in PySolr. If you already have Solr indexing the feed, as per the example, then you just use PySolr to query that index. Something like:

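from pysolr import Solr

# Connect to the Solr core that already indexes the feed.
solr = Solr('http://localhost:8983/solr/rss/')
response = solr.search('some query string')
print(response.hits)  # total number of matching documents
for result in response.docs:
    do_stuff_with(result)  # placeholder for your own handling code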
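
If Solr itself does the indexing through the DataImportHandler, you can still kick off an import from Python with a plain HTTP request. This is only a rough sketch: it assumes the handler is registered under the name `dataimport` on the core (as is conventional in the wiki example), and the helper names `dataimport_url` and `trigger_import` are made up for illustration, not part of PySolr.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def dataimport_url(core_url, command="full-import", handler="dataimport"):
    """Build the URL that asks a DataImportHandler to run a command.

    Assumption: the handler is registered as /dataimport in solrconfig.xml.
    """
    query = urlencode({"command": command, "wt": "json"})
    return "{}/{}?{}".format(core_url.rstrip("/"), handler, query)


def trigger_import(core_url, command="full-import"):
    """Send the request and return Solr's raw response body as text."""
    with urlopen(dataimport_url(core_url, command)) as resp:
        return resp.read().decode("utf-8")


# Example (needs a running Solr with the handler configured):
# print(trigger_import('http://localhost:8983/solr/rss/'))
```

Once the import has finished, the PySolr query above works against the resulting index as usual.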

If you really want to do it from the Python side, then you'll need to fetch and parse the RSS there (using other libraries, e.g. Universal Feed Parser). PySolr just wraps the interaction with Solr; it doesn't “do” data sources.
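
A minimal sketch of that Python-side route might look like the following. To keep it dependency-free it parses the feed with the standard library's `xml.etree.ElementTree` rather than Universal Feed Parser, and it only handles plain RSS 2.0 `<item>` elements. The field names (`id`, `title`, `link`, `description`) are assumptions about your Solr schema, and `rss_to_docs` / `index_feed` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def rss_to_docs(xml_text):
    """Turn RSS 2.0 XML into a list of dicts that can be passed to Solr.add()."""
    root = ET.fromstring(xml_text)
    docs = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        docs.append({
            # Assumed schema: these fields must exist in your Solr core.
            # Use the item's <guid> as the unique key, falling back to its link.
            "id": item.findtext("guid", default=link).strip(),
            "title": item.findtext("title", default="").strip(),
            "link": link,
            "description": item.findtext("description", default="").strip(),
        })
    return docs


def index_feed(solr, xml_text):
    """Send the parsed items to Solr; `solr` is expected to be a pysolr.Solr."""
    docs = rss_to_docs(xml_text)
    solr.add(docs, commit=True)
    return len(docs)


# Example (needs pysolr installed, a running Solr, and a real feed URL):
# import pysolr
# from urllib.request import urlopen
# xml_text = urlopen('http://example.com/feed.rss').read()
# index_feed(pysolr.Solr('http://localhost:8983/solr/rss/'), xml_text)
```

Real-world feeds (Atom, namespaced extensions, malformed XML) are messier than this, which is where a dedicated feed parser earns its keep.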

You may want to check out Haystack, which uses PySolr (and can use other engines) and neatly abstracts the job of creating search index entries and shipping them off to Solr for indexing.
