
Jackrabbit Text search in Arabic PDF File

devze.com https://www.devze.com 2023-02-25 10:26 Source: Internet

I can successfully run a text search on an Arabic text file using the following code in Jackrabbit. For an Arabic PDF file, however, the same search does not work. If I search for some of the non-Arabic text inside the file, I get the correct result, but if I search for an Arabic word inside the file, I get no results.

Query query = queryManager.createQuery(
    "select * from [nt:resource] AS resource where contains(resource.*, '%القط%')",
    Query.JCR_SQL2);

QueryResult result = query.execute();
RowIterator ri = result.getRows();

// Print every row that matched the full-text query
while (ri.hasNext()) {
    Row row = ri.nextRow();
    System.out.println("Row: " + row.toString());
}

Thanks


Possibly PDFBox, which Jackrabbit relies on to extract text from PDF files, could not parse the file. In that case, there should be a warning in the log file.
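If the log shows no warning, it may help to look at the text the extractor actually produced. The following is only a rough diagnostic sketch, not something taken from Jackrabbit or PDFBox: it assumes you have already obtained the extracted text as a String by some other means, and it checks whether the search term occurs in it as-is, only after Unicode NFKC normalization (some PDFs store Arabic letters as presentation-form glyphs), or only in reversed (visual) order. The class name ArabicMatchCheck, the method diagnose, and the sample "extracted" string are all hypothetical.

```java
import java.text.Normalizer;

class ArabicMatchCheck {

    // Describes how (or whether) the term occurs in the extracted text.
    // Returns "exact", "after-nfkc", "reversed", or "missing".
    static String diagnose(String extracted, String term) {
        if (extracted.contains(term)) {
            return "exact";
        }
        // NFKC maps Arabic presentation forms (U+FE70..U+FEFF) to base letters.
        String normText = Normalizer.normalize(extracted, Normalizer.Form.NFKC);
        String normTerm = Normalizer.normalize(term, Normalizer.Form.NFKC);
        if (normText.contains(normTerm)) {
            return "after-nfkc";
        }
        // Some extractors emit right-to-left text in visual (reversed) order.
        String reversed = new StringBuilder(normText).reverse().toString();
        if (reversed.contains(normTerm)) {
            return "reversed";
        }
        return "missing";
    }

    public static void main(String[] args) {
        // The search term from the question, written with escapes.
        String term = "\u0627\u0644\u0642\u0637";
        // Hypothetical extractor output: the same word as presentation-form glyphs.
        String extracted = "\uFE8D\uFEDF\uFED8\uFEC2";
        System.out.println(diagnose(extracted, term));
    }
}
```

A result other than "exact" would suggest that the indexed text differs from the logical-order Arabic used in the query, which would explain why non-Arabic words in the same PDF still match.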
