I want to use SOLR's remote-streaming facility to extract and index the content of files.
This works fine if I pass stream.file=xxx as a parameter in an HTTP GET request.
However, I have a lot of these files and want to batch them up (i.e. not issue one GET per file).
Is there a way I can do this in SOLR?
For example, I'd like to be able to POST some XML like this:
<add>
<doc stream_file="filename">
<field name="id">123</field>
</doc>
<doc>...
This was recently asked (and answered) on the solr-user mailing list.
I find that multiple ADDs are fast, as long as you COMMIT once for the whole batch rather than after every ADD. My guess is that the remaining per-request overhead is too small to justify writing your own RequestHandler.
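To illustrate the "many ADDs, one COMMIT" approach, here is a minimal Python sketch. It assumes remote streaming is enabled, that the extracting handler is mounted at /update/extract on a local Solr, and that the document id is passed as literal.id; the URL, the file paths, and the helper names (build_extract_urls, index_files) are made up for the example, so adjust them to your setup.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: change host, port and handler path to match your install.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_urls(files, base_url=SOLR_EXTRACT_URL):
    """Build one request URL per (doc_id, path) pair.

    Only the last URL carries commit=true, so the whole batch
    is committed once instead of after every file.
    """
    urls = []
    last = len(files) - 1
    for i, (doc_id, path) in enumerate(files):
        params = {"stream.file": path, "literal.id": doc_id}
        if i == last:
            params["commit"] = "true"
        urls.append(base_url + "?" + urlencode(params))
    return urls


def index_files(files, base_url=SOLR_EXTRACT_URL):
    """Send the requests one by one (hypothetical helper, no error handling)."""
    for url in build_extract_urls(files, base_url):
        with urlopen(url) as response:
            response.read()  # drain the response; Solr reports status in the body
```

Calling index_files([("123", "/data/a.pdf"), ("124", "/data/b.pdf")]) would still issue one request per file, but only the final request triggers a commit, which is where most of the cost usually is.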