Get a text snippet from a search index generated by Solr and Nutch

Developer https://www.devze.com 2023-03-24 06:12 Source: Internet
I have just configured Nutch and Solr to successfully crawl and index the text on a web site by following the getting started tutorials. Now I am trying to make a search page by modifying the example Velocity templates.

Now to my question. How can I tell solr to provide a relevant text snippet of the content of the hits? I only get the following fields associated with each hit:

score, boost, digest, id, segment, title, date, tstamp and url.

The content really is indexed, because I can search for words that I know are only in the full text, but I still don't get the full text back with the hit.


Don't forget: indexed is not the same as stored.

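You can search for words in a document when its fields are indexed, even if none of them is stored. To get the content of a specific field back in the results, that field must also have stored="true" in schema.xml. After changing that setting, the documents have to be reindexed before the stored value becomes available.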
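To see which fields fall into the "searchable but not returned" category, a small script can scan the schema. This is only a sketch: the schema fragment and its field names (id, title, content) are made up for illustration, and it reads only the attributes written on each field element, ignoring any defaults inherited from the field type.

```python
import xml.etree.ElementTree as ET

# Illustrative schema fragment; the field names are assumptions,
# not copied from any real schema.xml.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="content" type="text" indexed="true" stored="false"/>
  </fields>
</schema>
"""


def searchable_but_not_returnable(schema_xml):
    """Return the names of fields that are indexed but not stored.

    Simplification: only explicit attributes on <field> are read;
    a missing attribute is treated as "false".
    """
    root = ET.fromstring(schema_xml.strip())
    names = []
    for field in root.iter("field"):
        indexed = field.get("indexed", "false") == "true"
        stored = field.get("stored", "false") == "true"
        if indexed and not stored:
            names.append(field.get("name"))
    return names


print(searchable_but_not_returnable(SAMPLE_SCHEMA))  # ['content']
```

A field listed here can be matched by a query but will never appear in the returned documents, which is exactly the symptom described in the question.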

If your full-text field is stored, then the default field list probably does not include it. You can add it with the fl parameter:

http://<solr-url>:port/select/?......&fl=mytext,*

This example assumes your full text is stored in a field called mytext.
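If you build the request from code rather than by hand, something like the following could work. It is a rough sketch: the base URL http://localhost:8983/solr and the helper name build_select_url are placeholders I made up, and mytext is the hypothetical field name from the example above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base_url, query, fields):
    """Build a Solr select URL that asks for specific fields via fl.

    base_url is a placeholder for wherever your Solr instance lives.
    """
    params = {
        "q": query,
        "fl": ",".join(fields),  # comma-separated field list
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field name, for illustration only.
url = build_select_url("http://localhost:8983/solr", "nutch", ["mytext", "*"])
print(url)

# Decoding the query string shows the value Solr would receive for fl.
print(parse_qs(urlsplit(url).query)["fl"])
```

urlencode percent-encodes the comma and the asterisk, which is harmless because the server decodes them before reading the parameter.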

Finally, if you only want a snippet of the text around the matched words (not the whole text), look at the highlighting component in Solr/Lucene.
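As a rough illustration of highlighting, the sketch below builds the query parameters (hl, hl.fl, hl.snippets, hl.fragsize) and pulls the snippets out of a JSON response. The field name content, the helper names, and the sample response are assumptions written by hand to show the shape of the "highlighting" section, not output from a real server. Note that the standard highlighter also needs the field to be stored.

```python
from urllib.parse import urlencode


def build_highlight_query(query, field, snippets=1, fragsize=150):
    """Return a query string that turns on highlighting for one field."""
    return urlencode({
        "q": query,
        "hl": "true",            # enable highlighting
        "hl.fl": field,          # field to take snippets from
        "hl.snippets": snippets, # max snippets per document
        "hl.fragsize": fragsize, # approximate snippet length in characters
        "wt": "json",
    })


def snippets_by_doc(response, field):
    """Map each document key to its list of highlighted snippets."""
    highlighting = response.get("highlighting", {})
    return {doc_id: fields.get(field, [])
            for doc_id, fields in highlighting.items()}


# Hand-written sample in the shape of a Solr JSON response (illustrative).
SAMPLE_RESPONSE = {
    "response": {"numFound": 1, "docs": [{"id": "http://example.com/"}]},
    "highlighting": {
        "http://example.com/": {
            "content": ["crawled with <em>nutch</em> and indexed in solr"]
        }
    },
}

print(build_highlight_query("nutch", "content"))
print(snippets_by_doc(SAMPLE_RESPONSE, "content"))
```

In a Velocity template you would render these snippets next to each hit instead of the full field value.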
