I am trying to strip special characters (allowing a few) from some junked-up data, but some still get through. I found a regex snippet earlier, but it does not remove some characters, such as asterisks.
$clean_body = $raw_text;
$clean_title = preg_replace("/[^!&\/A-Za-z0-9_ ]/","", $clean_body);
$clean_title = substr($clean_title, 0, 64);
$clean_body = nl2br($clean_body);
if ($nid) {
    $node = node_load($nid);
    unset($node->field_category);
} else {
    $node = new stdClass();
    $node->type = 'article';
    node_object_prepare($node);
}
$split_title = str_split($clean_title);
foreach ($split_title as $key => $character) {
    if ($key > 15) {
        if ($character == ' ' && !preg_match("/[^!&\/,.-]/", $split_title[$key - 1])) {
            $node->title = html_entity_decode(substr(strip_tags($clean_title), 0, $key - 1)) . '...';
        }
    }
}
The first part attempts to clean out anything in the raw text that isn't normal punctuation or alphanumeric. Then I split the title into an array and look for a space. What I want is a title that is at least 15 characters long and is truncated on a space (leaving whole words intact) without stopping on a punctuation character. This is the part I am having trouble with.
Some titles still come out as ***************** or ** HOW TO MAKE $$$$$$ BLOGGING **, when the first title should not even have *'s, and the second should be HOW TO MAKE..., for example.
What about "/[^!&\/\w\s]/ui"? Works fine on my machine.
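For reference, here is a minimal sketch of how that pattern could be dropped into the cleaning step. The function name clean_title() and the whitespace-collapsing step are my own assumptions, not something from the question; the character class is the one suggested above.

```php
<?php
// Hypothetical helper name; only the pattern comes from the suggestion above.
function clean_title($raw)
{
    // Drop everything except !, &, /, word characters and whitespace.
    // With the u modifier, preg_replace() returns null on invalid UTF-8.
    $clean = preg_replace('/[^!&\/\w\s]/ui', '', $raw);
    if ($clean === null) {
        return '';
    }
    // Assumption: collapse the gaps left behind by the removed symbols.
    return trim(preg_replace('/\s+/', ' ', $clean));
}

var_dump(clean_title('** HOW TO MAKE $$$$$$ BLOGGING **')); // 'HOW TO MAKE BLOGGING'
var_dump(clean_title('*****************'));                 // ''
```

Note that a title made only of asterisks cleans down to an empty string, so whatever builds the final title still needs to handle that case.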
Your problem (or, one of them anyhow) is this logic:
if ($key > 15) {
    if ($character == ' ' && !preg_match("/[^!&\/,.-]/", $split_title[$key - 1])) {
        $node->title = html_entity_decode(substr(strip_tags($clean_title), 0, $key - 1)) . '...';
    }
}
You're only setting $node->title if these conditions match when iterating the characters in the $split_title array.
What happens when they don't match? $node->title doesn't get set (or overwritten? You didn't give much context, so I can't tell).
Using this as a test:
$clean_body = '** HOW TO MAKE $$$$$$ BLOGGING **';
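You can see that these conditions do not match, so $node->title does not get set (or overwritten). After cleaning, the only space past index 15 is the trailing one, and the character before it is G. Because of the negation, the inner test only passes when the preceding character is one of !&/,.- which looks like the opposite of what you intended.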
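One way around this is to compute the title in a single place that always returns something, and assign it unconditionally. The following is only a rough sketch under my own assumptions: truncate_title(), its parameters and the 'Untitled' fallback are hypothetical names, and it works on bytes (strlen/substr), so multibyte titles would need the mb_* equivalents.

```php
<?php
/**
 * Hypothetical helper: cut $title at the first space found at or after
 * $min characters, trim trailing punctuation, and always return a value.
 */
function truncate_title($title, $min = 15, $fallback = 'Untitled')
{
    $title = trim($title);
    if ($title === '') {
        return $fallback;          // nothing usable survived the cleaning
    }
    if (strlen($title) <= $min) {
        return $title;             // already short enough
    }
    $pos = strpos($title, ' ', $min);
    if ($pos === false) {
        return $title;             // no space to break on, keep it whole
    }
    // Avoid ending on punctuation such as "today,..."
    return rtrim(substr($title, 0, $pos), " !&/,.-") . '...';
}

$node = new stdClass();
$node->title = truncate_title('Breaking news today, more later');
var_dump($node->title); // 'Breaking news today...'
```

Because the assignment no longer sits inside the loop's condition, $node->title is set for every input, including the all-asterisk case once it has been cleaned down to an empty string.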