How can I use preg_replace to keep only uppercase words, digits, special characters, and words that start with an uppercase letter but contain no more than 3 characters?
For example:
Portocjnk Karaer HDS-C 7/11, 9/15, 8/15-E => HDS-C 7/11, 9/15, 8/15-E
Karcher Karcher B 140 R Bp => B 140 R Bp
Karcher Karcher B 140 R Bsp Trr => B 140 R Bsp Trr
Tatata Tatat Yard-Man YM 84 M-W 31AY97KV643 => YM 84 M-W 31AY97KV643
(Tatata) (Tatat) Yard-Man YM 84 M-W 31AY97KV643 => YM 84 M-W 31AY97KV643
Thanks in advance.
preg_replace('|\b([A-Z][a-z][a-z][a-z][a-z\-]*)\b|','',$text);
This one works for most of your examples: it removes every capitalized word of four or more letters.
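A minimal sketch of that call wrapped in a helper, assuming PHP 7 or later. The function name stripLongCapitalizedWords and the whitespace clean-up step are illustrative additions, not part of the answer; the pattern is the same one, written with a {3} quantifier.

```php
<?php
// Hypothetical helper: the name and the whitespace clean-up are illustrative additions.
function stripLongCapitalizedWords(string $text): string
{
    // Same pattern as in the answer: an uppercase letter followed by at least
    // three lowercase letters, plus any further lowercase letters or hyphens.
    $stripped = preg_replace('|\b[A-Z][a-z]{3}[a-z\-]*\b|', '', $text);

    // Removing a word leaves its surrounding spaces behind, so collapse
    // runs of whitespace and trim both ends.
    return trim(preg_replace('|\s+|', ' ', $stripped));
}

echo stripLongCapitalizedWords('Karcher Karcher B 140 R Bp'), "\n";
// B 140 R Bp
echo stripLongCapitalizedWords('Tatata Tatat Yard-Man YM 84 M-W 31AY97KV643'), "\n";
// Man YM 84 M-W 31AY97KV643
```

The second call shows why this covers only most of the examples: "Yard-" is removed, but "Man" has just three letters and survives. In the example with parentheses, the empty "()" pairs are also left behind.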
This would be a simplistic whitelist approach. Instead of using preg_replace to remove the unwanted words, it first extracts the desired parts with preg_match_all. Afterwards, the matches in the $result array need to be joined back into a single string.
preg_match_all('#\b[A-Z\d][A-Z\d/,-]*\b|\b(?<!-)[A-Z][a-z]{1,2}\b#', $str, $result);
$result = implode(" ", $result[0]);
You might need to add some more of the "special" characters to the second [...] character class, [A-Z\d/,-].
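The extraction above can be sketched as a small function, assuming PHP 7 or later. The name keepModelCodes is hypothetical; the pattern is copied unchanged from the answer.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function keepModelCodes(string $str): string
{
    // First alternative: tokens that start with an uppercase letter or a digit
    // and continue with uppercase letters, digits, "/", "," or "-".
    // Second alternative: a capitalized word of two or three letters that is
    // not preceded by a hyphen (so the "Man" in "Yard-Man" is skipped).
    $pattern = '#\b[A-Z\d][A-Z\d/,-]*\b|\b(?<!-)[A-Z][a-z]{1,2}\b#';

    preg_match_all($pattern, $str, $result);

    // $result[0] holds the full matches in order of appearance.
    return implode(' ', $result[0]);
}

echo keepModelCodes('(Tatata) (Tatat) Yard-Man YM 84 M-W 31AY97KV643'), "\n";
// YM 84 M-W 31AY97KV643
echo keepModelCodes('Portocjnk Karaer HDS-C 7/11, 9/15, 8/15-E'), "\n";
// HDS-C 7/11 9/15 8/15-E
```

Note that the commas are lost in the second call: the trailing \b cannot match between a comma and a space, so each match ends at the last digit. If the commas matter, the pattern needs further adjustment.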
Check out https://stackoverflow.com/questions/89718/is-there-anything-like-regexbuddy-in-the-open-source-world for some nice tools that might help in designing regular expressions.