Apache Solr PDF indexing

Developer https://www.devze.com 2023-04-03 06:45 Source: Internet

I want to index PDF documents whenever one is uploaded by an application. At indexing time I send the file name and file type in the URL, as follows:

http://localhost:8983/solr/update/extract?stream.file=/D:\apache-solr-3.3.0\example\exampledocs\Accessing_MySQL_from_IntalioBPMS.pdf&stream.contentType=application/pdf&literal.id=111&literal.fileName=Test.pdf&literal.fileType=pdf&commit=true
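A request like the one above can also be assembled programmatically so that the Windows path and the literal.* values are percent-encoded. The following is only a minimal sketch: the helper name build_extract_url is hypothetical, and the host, handler path, and parameter names are simply copied from the URL in the question.

```python
from urllib.parse import urlencode

# Base URL of the extract handler, copied from the request in the question.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(file_path, doc_id, file_name, file_type):
    """Return an extract-handler URL with every parameter percent-encoded.

    Hypothetical helper for illustration; it only builds the string and
    does not contact Solr.
    """
    params = [
        ("stream.file", file_path),
        ("stream.contentType", "application/pdf"),
        ("literal.id", doc_id),
        ("literal.fileName", file_name),  # must match the schema field name
        ("literal.fileType", file_type),  # must match the schema field name
        ("commit", "true"),
    ]
    return SOLR_EXTRACT_URL + "?" + urlencode(params)


# The path is copied verbatim from the question, including its leading slash.
url = build_extract_url(
    r"/D:\apache-solr-3.3.0\example\exampledocs\Accessing_MySQL_from_IntalioBPMS.pdf",
    "111",
    "Test.pdf",
    "pdf",
)
print(url)
```

Encoding the values this way keeps backslashes and slashes in the path from being misread as part of the URL structure.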

I have the fields fileName and fileType in my schema.xml file as well.

After indexing the PDF documents, a search returns only the content and id of the document, but not the fileName and fileType.

What am I doing wrong?

When you define your schema, you must specify which fields are stored (that is, retrievable in search results). In this case, it is likely that your file name and file type fields are only indexed and not stored.

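Make sure your schema declares both fields with stored="true", using the same names as in your literal.* parameters (field names are case-sensitive):

<field name="fileName" type="{yourDesiredType}" indexed="true" stored="true"/>
<field name="fileType" type="{yourDesiredType}" indexed="true" stored="true"/>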
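If you want to double-check which fields are actually stored, a small script can read the schema and report the ones that are not. This is a rough sketch, not an official Solr tool: the schema fragment is a made-up example, the "string" type is an assumption, and the helper unstored_fields is hypothetical. It treats a missing stored attribute as "not stored", which is a simplification, since Solr falls back to the field type's defaults in that case.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment for illustration; "string" is an assumed type.
# fileType is deliberately left unstored to show what the check reports.
SCHEMA_XML = """
<schema name="example" version="1.4">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="fileName" type="string" indexed="true" stored="true"/>
    <field name="fileType" type="string" indexed="true" stored="false"/>
  </fields>
</schema>
"""


def unstored_fields(schema_xml, names):
    """Return the entries of `names` that are absent or not stored="true".

    Field names are compared case-sensitively, as Solr does.
    """
    root = ET.fromstring(schema_xml)
    stored = {
        field.get("name")
        for field in root.iter("field")
        # Simplification: an omitted attribute counts as not stored here.
        if field.get("stored", "false").lower() == "true"
    }
    return [name for name in names if name not in stored]


missing = unstored_fields(SCHEMA_XML, ["fileName", "fileType"])
print(missing)  # prints ['fileType']
```

To run it against a real installation, read your own schema.xml into a string and pass the field names used in your literal.* parameters. After changing stored to true, the documents have to be re-indexed before the values show up in results.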

For more information about editing the schema.xml, go to http://wiki.apache.org/solr/SchemaXml.