I'm using preg_match_all and str_replace on a block of text to grab YouTube URLs and replace them with the correct embed code.
Let's say I have the following block of text:
"bla bla bla bla <-youtube-url-> last few words"
Everything works fine: the YouTube URL is replaced with the embed code, etc. However, the "last few words" disappear from the final output after str_replace is run. I suspect that the regex is swallowing everything after the URL. This is what I'm using to match and extract YouTube IDs:
%(?:youtube\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i
Any help would be greatly appreciated!
Update:
I just discovered that the problem only happens if the youtube url has any trailing parameters. The following input swallows last few words:
'www.youtube.com/watch?v=XXXXXXXXX&parameter=data last few words'
But if the input is like this:
'www.youtube.com/watch?v=XXXXXXXXX last few words'
it works fine. Can anyone help with the needed adjustments for the regular expression?
I usually break up complicated alternations to find out what's going on.
It appears you might have trouble with the last term, [^"&?/ ]{11}, but I'm not sure what you are trying to do. (The code below is in Perl.)
$samp = 'www.youtube.com/watch?v=XXXXXXXXX&parameter=data last few words';
$regex = qr%
    (?:
        youtube\.com/
        (?:
            ( [^/]+/.+/ ) # 1
          |
            ( (?: v | e(?:mbed)? ) / ) # 2, slash applies to both v and e(mbed), as in the original pattern
          |
            ( .*[?&]v= ) # 3
        )
      |
        ( youtu\.be/ ) # 4
    )
    ( [^"&?/ ]{1,11} ) # 5, was {11}
    (.*)$ # 6, the remainder
%xi;
if ( $samp =~ /$regex/ )
{
    # just print what matched
    print "all: '$&' \n";
    print "1: '$1' \n";
    print "2: '$2' \n";
    print "3: '$3' \n";
    print "4: '$4' \n";
    print "5: '$5' \n";
    print "6: '$6' \n";
}
Output:
all: 'youtube.com/watch?v=XXXXXXXXX&parameter=data last few words'
1: ''
2: ''
3: 'watch?v='
4: ''
5: 'XXXXXXXXX'
6: '&parameter=data last few words'
Change the .+ to \S+ so that you don't capture whitespace as part of the regex.
%(?:youtube\.com/(?:[^/]+/\S+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i
The .* was capturing the entire line, and the rest of your regex wasn't doing anything.
I'm not clear on what exactly you are trying to do, but I suggest that you try a regex tester tool, like this one (there are others). It lets you visually examine the results of a regex.
My bad. There was no problem with the regex, as I first suspected.
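A quick way to confirm this is to run the pattern from the question against both sample inputs. The sketch below does so in JavaScript rather than PHP (the pattern uses no PHP-specific syntax). The 11-character ID XXXXXXXXXXX and the [EMBED:...] placeholder are assumptions made for illustration, not the real embed code.

```javascript
// The pattern from the question, written as a JavaScript regex literal.
const pattern = /(?:youtube\.com\/(?:[^\/]+\/.+\/|(?:v|e(?:mbed)?)\/|.*[?&]v=)|youtu\.be\/)([^"&?\/ ]{11})/i;

// Sample inputs modeled on the question; the 11-character ID is a placeholder.
const inputs = [
  "www.youtube.com/watch?v=XXXXXXXXXXX&parameter=data last few words",
  "www.youtube.com/watch?v=XXXXXXXXXXX last few words",
];

// Replace the matched part with a stand-in for the embed code.
const results = inputs.map((input) => input.replace(pattern, "[EMBED:$1]"));

for (const result of results) {
  console.log(result);
}
```

In both cases the match ends right after the ID, so the trailing words are still present after the replacement.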
I was passing the user input to the PHP handler without escaping the input via encodeURIComponent() first. Thus, the handler assumed &parameter=data was the next input parameter, resulting in a broken POST variable.
Sorry for my incompetence, and thanks for all the help!
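For anyone hitting the same thing, here is a minimal sketch of the difference. It assumes a hypothetical form field named text, and it uses URLSearchParams only as a stand-in for how a form-encoded body gets split into fields on the server; the actual PHP handler is not shown in this thread.

```javascript
// Sample input from the question; "text" is a hypothetical field name.
const input = "www.youtube.com/watch?v=XXXXXXXXX&parameter=data last few words";

// Unescaped: the "&" inside the input is read as a field separator.
const rawFields = new URLSearchParams("text=" + input);

// Escaped: encodeURIComponent turns "&", "?", "=" and spaces into %XX sequences,
// so the whole input stays inside one field.
const safeFields = new URLSearchParams("text=" + encodeURIComponent(input));

console.log(rawFields.get("text"));      // truncated at the "&"
console.log(rawFields.get("parameter")); // the rest ended up in a separate field
console.log(safeFields.get("text"));     // the full input, intact
```

Without escaping, everything after the "&" lands in a separate field, which is why the trailing words never reached the regex at all.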