Apache Solr PDF indexing

Developer https://www.devze.com 2023-04-03 06:45 Source: Internet

I want to index PDF documents whenever they are uploaded by an application. At indexing time I send the file name and file type in the URL as follows:

http://localhost:8983/solr/update/extract?stream.file=/D:\apache-solr-3.3.0\example\exampledocs\Accessing_MySQL_from_IntalioBPMS.pdf&stream.contentType=application/pdf&literal.id=111&literal.fileName=Test.pdf&literal.fileType=pdf&commit=true

I also have the fields fileName and fileType in my schema.xml file.

After indexing the PDF documents, a search returns only the content and id of each document, but not the filename and filetype.

What am I doing wrong?


When you define your schema, you must specify which fields are stored (that is, retrievable in search results). In this case, it is likely that your filename and filetype fields are indexed but not stored.

Make sure your schema is like the following:

<field name="filename" type="{yourDesiredType}" indexed="true" stored="true"/>
<field name="filetype" type="{yourDesiredType}" indexed="true" stored="true"/>
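
If you want to double-check what your schema currently says, a small script can read the stored attribute of each field. The following is only a sketch: the schema fragment embedded in it is a made-up example (including the "string" type and the stored="false" value), and stored_flags is a hypothetical helper name. In practice you would read your own schema.xml instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment for illustration only.
# Replace it with the contents of your own schema.xml.
SCHEMA_XML = """
<schema name="example" version="1.4">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="filename" type="string" indexed="true" stored="false"/>
    <field name="filetype" type="string" indexed="true" stored="true"/>
  </fields>
</schema>
"""


def stored_flags(schema_xml, names):
    """Return {field name: value of its 'stored' attribute, or None if absent}."""
    root = ET.fromstring(schema_xml)
    flags = {}
    for field in root.iter("field"):
        name = field.get("name")
        if name in names:
            flags[name] = field.get("stored")
    return flags


if __name__ == "__main__":
    flags = stored_flags(SCHEMA_XML, {"filename", "filetype"})
    for name in sorted(flags):
        print(name, "stored =", flags[name])
```

A field reported with stored = false will be searchable but will not come back in the results. Note that stored values are written at index time, so after changing the schema you need to restart Solr and index the documents again.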

For more information about editing schema.xml, see http://wiki.apache.org/solr/SchemaXml.
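
To verify the fix after re-indexing, you can ask Solr to return exactly the fields you expect by using the fl parameter on a query. Here is a minimal sketch that only builds the query URL. It assumes the base URL from the question (http://localhost:8983/solr), the standard /select handler, the document id 111 from the question, and the lowercase field names used in the schema snippet above; build_select_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields):
    """Build a Solr select URL that requests only the given fields via 'fl'."""
    params = {"q": query, "fl": ",".join(fields)}
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Assumed values: base URL and id taken from the question,
    # field names taken from the schema snippet in the answer.
    url = build_select_url(
        "http://localhost:8983/solr",
        "id:111",
        ["id", "filename", "filetype"],
    )
    print(url)
```

Opening the printed URL in a browser should show the document with its filename and filetype values if those fields are stored. If they are still missing, check that the field names in the request match the names in your schema.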
