Like:
The quick, brown fox jumps over a lazy dog. DJs flock by when MTV ax quiz prog. Junk MTV quiz graced by fox whelps. Bawds jog, flick quartz, vex nymphs. Waltz, bad nymph, for quick jigs vex! Fox nymphs grab quick-jived waltz. Brick quiz whangs jumpy veldt fox. Bright vixens jump; dozy fowl quack. Quick wafting zephyrs vex bold Jim. Quick zephyrs blow, vexing daft Jim.
(this is just sample text, the real one is much longer)
How can I get, let's say, 5 random words from this text?
I tried using explode(' ', $text), then shuffling the array and picking 5 elements from it, but I get all the punctuation and other characters. I only want words made of a-z characters, and each word needs to have at least 3 characters.
You can use the built-in str_word_count for this:
$words = str_word_count($str, 1);
shuffle($words);
$selection = array_slice($words, 0, 5);
See it in action.
You can also pick the random words from the $words array another way (such as array_rand) if you are concerned about performance; shuffle is just the most convenient.
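The snippet above does not enforce the asker's 3-character minimum, and str_word_count also treats ' and - as word characters. A minimal sketch of one way to add both checks follows; the function name pickRandomWords, its parameters, and the shortened sample text are assumptions made for illustration, not part of the answer.

```php
<?php
// Hypothetical helper: the name and parameters are illustrative only.
function pickRandomWords(string $text, int $count = 5, int $minLength = 3): array
{
    // Format 1 returns the words found in $text as a list.
    $words = str_word_count($text, 1);

    // Keep words made only of a-z letters (either case) that are long enough.
    // Words containing ' or - (e.g. "quick-jived") are dropped by this check.
    $words = array_values(array_filter($words, function ($word) use ($minLength) {
        return preg_match('/^[a-z]{' . $minLength . ',}$/i', $word) === 1;
    }));

    shuffle($words);
    return array_slice($words, 0, $count);
}

$text = 'The quick, brown fox jumps over a lazy dog. DJs flock by when MTV ax quiz prog.';
$picked = pickRandomWords($text);
print_r($picked);
```

If hyphenated words should be split into their parts instead of being dropped, a regex-based split is probably the better fit.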
Use preg_split:
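// The i flag makes the class case-insensitive, so uppercase letters are not treated as separators.
$words = preg_split('#[^a-z0-9]+#i', $string, -1, PREG_SPLIT_NO_EMPTY);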
$key = array_rand($words);
return $words[$key];
This will split the string on any sequence of non-alphanumeric characters.
If you work with UTF-8 data, try this instead:
$words = preg_split('#[^\pL\pN]+#u', $string, -1, PREG_SPLIT_NO_EMPTY);
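To pick several words rather than one, array_rand can be asked for more than one key. The following is a rough sketch under a few assumptions: the sample sentence is made up, the 3-character minimum comes from the question, and "character" here means a Unicode letter or digit.

```php
<?php
// Made-up sample input containing non-ASCII letters.
$string = 'Zoë flicks quartz; 5 naïve nymphs vex Jürgen!';

// Split on runs of characters that are neither letters (\pL) nor digits (\pN).
$words = preg_split('#[^\pL\pN]+#u', $string, -1, PREG_SPLIT_NO_EMPTY);

// Keep words of at least 3 characters; the u flag counts characters, not bytes.
$words = array_values(array_filter($words, function ($word) {
    return preg_match('#^[\pL\pN]{3,}$#u', $word) === 1;
}));

// array_rand requires 1 <= num <= count($words), so cap the request.
$num = min(5, count($words));
$selection = [];
if ($num > 0) {
    // With num = 1 array_rand returns a single key, hence the (array) cast.
    foreach ((array) array_rand($words, $num) as $key) {
        $selection[] = $words[$key];
    }
}
print_r($selection);
```

Note that array_rand returns the keys in their original order, so call shuffle on $selection if the order of the picked words should be random as well.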
Just remove the unwanted characters:
$words = explode(' ', $string);
$words = array_map(function ($word) {
    // The callback must return the trimmed word, otherwise every element becomes null.
    return trim($word, '.,-:;"\'');
}, $words);
and filter by word length:
$words = array_filter($words, function ($word) {
    return strlen($word) > 2;
});
Strip everything except letters and spaces before you do the explode:
$string = preg_replace("/[^a-z ]+/i", "", $string);
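Put together with the explode and shuffle steps from the question, this could look like the sketch below. It is a variant, not the answer's exact code: it replaces the unwanted characters with a space instead of an empty string, on the assumption that "quick-jived" should become two words rather than "quickjived". The sample text is shortened from the question.

```php
<?php
$string = 'Fox nymphs grab quick-jived waltz. Bright vixens jump; dozy fowl quack.';

// Assumption: replace with ' ' (not '') so hyphenated words are split, not fused.
$clean = preg_replace('/[^a-z ]+/i', ' ', $string);

// explode yields empty strings for consecutive spaces; the length check removes
// those along with words shorter than 3 characters.
$words = array_values(array_filter(explode(' ', $clean), function ($word) {
    return strlen($word) > 2;
}));

shuffle($words);
$selection = array_slice($words, 0, 5);
print_r($selection);
```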