How to convert pdf to utf-8

开发者 https://www.devze.com 2022-12-24 23:37 Source: web
I am trying to upload a PDF file using a web service API, but the API does not work for PDF files. It works fine for text files. When I try to upload a PDF file, it gives this error: Client-SOAP-ERROR: Encoding: string '%PDF-1.4 %\xc7...' is not a valid utf-8 string

So can we convert this PDF file into a UTF-8 string? I am using PHP as the scripting language.


A PDF is a binary file. It sounds like you're treating it as plain text.

Are you sure you're uploading it the way you should? It sounds like you're putting the raw PDF in your SOAP request - it seems likely that you're supposed to Base64-encode it if that's the case. Otherwise, you're gonna run into all kinds of trouble with special XML characters happening to appear in the file, messing the file up entirely.

In other words, double-check the API and make sure you aren't supposed to do something with the file (hint: if this thing accepts files like this, you can be pretty sure you need to do something).
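
If the API does expect Base64, the encoding step could look roughly like the sketch below. This is only an illustration: the helper name encodeFileForSoap and the commented-out uploadFile call are hypothetical, and the real operation and parameter names have to come from the service's own documentation.

```php
<?php
// Hypothetical helper (not part of any library): read a binary file and
// return its contents as a Base64 string, which is plain ASCII and
// therefore always a valid UTF-8 string.
function encodeFileForSoap(string $path): string
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read file: $path");
    }
    return base64_encode($bytes);
}

// Demo: a temporary file holding PDF-like bytes, including the
// non-UTF-8 byte \xc7 from the error message.
$original = "%PDF-1.4 %\xc7\xec";
$tmp = tempnam(sys_get_temp_dir(), 'pdf');
file_put_contents($tmp, $original);

$payload = encodeFileForSoap($tmp);
unlink($tmp);

// The receiving side can recover the exact original bytes.
$roundTrip = base64_decode($payload, true);

// Assumed call shape only; the operation name and its parameters
// are placeholders, not the real API:
// $client->uploadFile(['name' => 'report.pdf', 'content' => $payload]);
```

Base64 makes the payload about a third larger, but it keeps the binary data intact and avoids any clash with XML special characters.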


It sounds like the API only supports plain text. You would need to change the API so that it supports other file formats.

… assuming you don't want to convert the PDF to plain text, which could be done with a tool such as pdftotext.
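
If text extraction is acceptable, calling pdftotext from PHP might look like the sketch below. It assumes the pdftotext command-line tool is installed and on the PATH. The function names buildPdftotextCommand and pdfToText are made up for this example.

```php
<?php
// Hypothetical helper: build the shell command for pdftotext.
// "-enc UTF-8" sets the output text encoding, and "-" as the output
// file name makes pdftotext write the text to standard output.
function buildPdftotextCommand(string $pdfPath): string
{
    return 'pdftotext -enc UTF-8 ' . escapeshellarg($pdfPath) . ' -';
}

// Hypothetical helper: run pdftotext and return the extracted text.
// Assumes pdftotext is installed; throws if the command fails.
function pdfToText(string $pdfPath): string
{
    exec(buildPdftotextCommand($pdfPath), $lines, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("pdftotext failed with exit code $exitCode");
    }
    return implode("\n", $lines);
}
```

Calling pdfToText('report.pdf') would then return a string that can be sent like any other text file. Layout, images, and fonts are lost, so this only helps if the service needs the text and not the PDF itself.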
