How can one extract all开发者_JS百科 the tokens from solr? Not from one document, but from all the documents indexed in solr?
Thanks!
You may do something like this(This sample is approved to be working on a lucene 4.x index):
IndexSearcher isearcher = new IndexSearcher(dir, true);
IndexReader reader = isearcher.getIndexReader();
Fields fields = MultiFields.getFields(reader);
Collection<String> cols = reader.getFieldNames(IndexReader.FieldOption.ALL);
for (String col : cols) {
Terms te = fields.terms(col);
if (te != null) {
TermsEnum tex = te.getThreadTermsEnum();
while (tex.next() != null)
// do something
tex.getTerm().text();
}
}
This iterates over all columns and also over every term per col. You may lookup the methods provided by TermsEnum like getTerm()
.
精彩评论