I have a site scraped into the $html variable.
Now I want to replace some characters using this expression:
$string1 = preg_replace('/[^A-Za-z0-9äöü!&_=\+-]/i', ' ', $string);
The problem is that special characters are encoded differently depending on the page's charset.
I have a variable $charset that holds the charset string of the page, e.g. $charset="utf-8" or "iso-8859-1". The German letter ü in my expression is stored as different bytes in UTF-8 than in ISO-8859-1.
Is there a way to make the replace function take the page's charset into account, without writing a separate regular expression for each possible charset?
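Or you can try adding
$string = utf8_encode($string);
right before preg_replace. Note that utf8_encode() returns the converted string instead of modifying its argument, and it only converts ISO-8859-1 to UTF-8, so it should be applied only when $charset is iso-8859-1. I'm not sure it will solve your problem, but it might.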
For more information, see: http://se2.php.net/manual/en/function.utf8-encode.php.
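A more general variant of the same idea, as a rough sketch: convert the text to UTF-8 using whatever $charset says, then run a single pattern with the /u modifier. The function name clean_text is made up for illustration. The sketch assumes the mbstring extension is loaded, that $charset holds an encoding name mbstring recognises, and that the script file itself is saved as UTF-8 (otherwise the äöü in the pattern would not be valid UTF-8).

```php
<?php
// Hypothetical helper (not from the original post): normalise to UTF-8 first,
// then apply one Unicode-aware pattern regardless of the source charset.
function clean_text(string $string, string $charset): ?string
{
    // Convert from the page's declared charset to UTF-8.
    // Assumption: $charset is a name mbstring knows, e.g. "utf-8" or "iso-8859-1".
    $utf8 = mb_convert_encoding($string, 'UTF-8', $charset);

    // The /u modifier makes PCRE treat pattern and subject as UTF-8,
    // so ä, ö and ü are matched as whole characters instead of single bytes.
    // preg_replace() returns null if the regex engine reports an error.
    return preg_replace('/[^A-Za-z0-9äöü!&_=+-]/iu', ' ', $utf8);
}

// "grüße" encoded as ISO-8859-1: ü is byte 0xFC, ß is byte 0xDF.
$latin1 = "gr\xFC\xDFe";
echo clean_text($latin1, 'iso-8859-1'), "\n"; // prints: grü e
```

The result is always UTF-8, so if the page has to be written back out in its original charset, it would need to be converted back afterwards.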