PHP preg_replace not working / weird string type / number of characters not adding up

Posted on https://www.devze.com, 2023-01-23 01:31. Source: Internet
I want to remove the following specified characters from a string:

<
&
"
#
%

So something like:

Test%#"&<value

should become:

Testvalue

The order of characters is immaterial.


There's something weird about the string type.

A test string looks like this:

var_dump($test): string(25) "A BCDEFG/<&"#%/HI" 

The number of characters is NOT adding up to 25 and I'm not sure why.


If I do:

$displayName = strtr($displayName, array('<' => '', '&' => '', '"' => '', '#' => '', '%' => ''));

I get:

 string(20) "A BCDEFG/lt;quot;/HI"


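Escaping the < works, although < does not strictly need escaping here. Note that this pattern matches the five characters only as the exact sequence <&"#%: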

$displayName = preg_replace('/\<&"#%/', '', $displayName);

However, the surrounding / characters serve as delimiters for the preg_* family of functions, so they are not part of the pattern. To remove the slashes in the string as well, use a different delimiter:

$displayName = preg_replace('|/\<&"#%/|', '', $displayName);

(Here I use | as delimiter instead since this character is not part of the expression itself.)
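To make the difference concrete, here is a small sketch comparing the sequence pattern above with a character class. The variable names are made up for illustration, and the second test string is the example from the question. Without square brackets the pattern only matches the whole run in that exact order, so it does nothing when the characters are scattered through the string.

```php
<?php
// Without [...] the pattern is a literal sequence, not a set of characters.
$sequencePattern = '|/<&"#%/|'; // matches only the exact run /<&"#%/
$classPattern    = '/[<&"#%]/'; // matches any single one of the five characters

$a = 'A BCDEFG/<&"#%/HI';
$b = 'Test%#"&<value';

$seqA = preg_replace($sequencePattern, '', $a); // the run is present, so it is removed along with both slashes
$seqB = preg_replace($sequencePattern, '', $b); // no such run, so the string is left unchanged
$clsA = preg_replace($classPattern, '', $a);    // each listed character is removed, the slashes stay
$clsB = preg_replace($classPattern, '', $b);    // each listed character is removed wherever it appears

var_dump($seqA, $seqB, $clsA, $clsB);
```

If the goal is to strip the characters wherever they occur, the character class (or str_replace) is probably the better fit.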


EDIT
If you're just interested in replacing the characters <, &, ", #, and %, this is probably preferable:

$displayName = str_replace(array('<', '&', '"', '#', '%'), '', $displayName);

EDIT 2
After a great deal of confusion, it seems that the $displayName string actually contained A BCDEFG/&lt;&&quot;#%/HI, which is exactly 25 bytes long and so accounts for the string(25) in the var_dump() output. In that case you could replace the HTML entities directly (untested):

    $displayName = str_replace(array('&lt;', '&quot;', '&', '#', '%'), '', $displayName);
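An alternative sketch, assuming the raw value really is the entity-encoded string described here: decode the entities first with htmlspecialchars_decode() and then strip the plain characters. The intermediate variable names are only for illustration.

```php
<?php
// Assumption: this is the raw value, as suggested by string(25) in the var_dump() output.
$displayName = 'A BCDEFG/&lt;&&quot;#%/HI';
$rawLength = strlen($displayName); // counts the entity text, byte by byte

// Turn &lt; and &quot; back into < and " before stripping anything.
$decoded = htmlspecialchars_decode($displayName, ENT_QUOTES);
$decodedLength = strlen($decoded);

// Now the plain characters can be removed as in the earlier str_replace() call.
$clean = str_replace(array('<', '&', '"', '#', '%'), '', $decoded);

var_dump($rawLength, $decodedLength, $clean);
```

Compared with replacing the entities directly, decoding first also catches other encodings of the same characters, such as &amp;. Whether that is desirable depends on where the string comes from.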


If you want to do it with regular expressions:

<?php
$displayName = 'A BCDEFG/<&"#%/HI';
$displayName = preg_replace('/[\<&"#%]/', '', $displayName);
echo($displayName);
?>

This outputs:

A BCDEFG//HI


Regarding the "number of characters not adding up" part: it used to be true that one character was one byte, but you can't count on that any more. var_dump() reports the length of the string in bytes, not in characters, which is usually not the number you care about.

When working with strings in high-level languages, you should concentrate on the number of characters and not on how many bytes a string occupies. (There are exceptions, of course.)
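A minimal sketch of the byte versus character distinction, using a made-up UTF-8 sample string. strlen() counts bytes, while mb_strlen() counts characters but requires the mbstring extension, so the sketch falls back to a Unicode-aware regular expression when that extension is missing.

```php
<?php
// Byte count vs. character count for a UTF-8 string.
// "\xC3\xAF" is the two-byte UTF-8 encoding of the letter i with diaeresis.
$text = "na\xC3\xAFve";

// strlen() counts bytes, the same number var_dump() prints.
$bytes = strlen($text);

// mb_strlen() needs the mbstring extension; the fallback counts code points
// by matching every character with the /u (UTF-8) modifier.
$chars = function_exists('mb_strlen')
    ? mb_strlen($text, 'UTF-8')
    : preg_match_all('/./su', $text);

var_dump($bytes, $chars);
```

Here the byte count is one higher than the character count because a single character takes two bytes.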
