开发者

How to clean up garbage text from string using PHP?

开发者 https://www.devze.com 2023-01-31 16:24 出处:网络
I am trying to parse a word document file. I upload the using PHP then I am trying to get contents using file_get_contents(); function but the problem is when its displayed in front end a lots of garb

I am trying to parse a word document file. I upload the using PHP then I am trying to get contents using file_get_contents(); function but the problem is when its displayed in front end a lots of garbage code in there like

Æ�Ѐ¤d�¤d�[$\$gd®l±����„h¤d�¤d�[$\$^„hgd®l±���
&�F�¤d�¤d�[$\$g开发者_StackOverflow中文版d3¡���gd3¡����„,¤d�¤d�[$\$^„,gd(E����¤d�¤d�[$\$gdÿ/��<��C��D��I��Å������O��P��‚��¡��¢��¬��­��®��Ù��ã��ó��ô�����

So my question is how can I clean up this text?


Maybe give this a shot? http://www.phpclasses.org/package/3553-PHP-Edit-Microsoft-Word-documents-using-COM-objects.html


Word documents (like docx and doc) are not straight text files - they are actually proprietary file types that do not just have the text from byte 0 - this is how they have fancy formatting and fonts. .docx files are actually archives (.zip files) that contain a myriad of XML and styles.

Your best bet is to use a text input form, or find code online that allows you to extract just the text. Or, download the doc files to your own computer and use your own copy of MS word to open it.

0

精彩评论

暂无评论...
验证码 换一张
取 消