I'm trying to remove everything after and including '.html' in a web address string. Current (failing) code is:
$input = 'http://example.com/somepage.html?foo=bar&baz=x';
$result = preg_replace("/(.html)[^.html]+$/i",'',$input);
Desired outcome:
value of $result is 'http://example.com/somepage'
Some other examples of $input that should lead to the same value of $result:
http://example.com/somepage
http://example.com/somepage.html
http://example.com/somepage.html?url=http://example.com/index.html
Your regular expression is wrong: it only matches strings ending with <any one char> "html" <one or more chars that are NOT ., h, t, m or l>. The square brackets define a character class, not a literal string, and the leading ^ negates it. Since preg_replace just returns the string as-is if there was no match, you'd be fine with matching the literal .html and ignoring anything after it:
$result = preg_replace('/\.html.*/', '', $input);
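To check that pattern against the inputs listed in the question, here is a minimal sketch. The function name stripHtmlSuffix is hypothetical and only wraps the one-liner above; the fallback to the original string on a regex error is my own addition.

```php
<?php
// Hypothetical helper wrapping the pattern from the answer above.
function stripHtmlSuffix(string $url): string
{
    // \. matches a literal dot; .* then consumes the rest of the string.
    // When ".html" does not occur, preg_replace returns the input unchanged.
    // preg_replace returns null on error, so fall back to the input (assumption).
    return preg_replace('/\.html.*/', '', $url) ?? $url;
}

// The example inputs from the question.
$inputs = [
    'http://example.com/somepage.html?foo=bar&baz=x',
    'http://example.com/somepage',
    'http://example.com/somepage.html',
    'http://example.com/somepage.html?url=http://example.com/index.html',
];

foreach ($inputs as $input) {
    echo stripHtmlSuffix($input), PHP_EOL;
}
```

Each line printed should be http://example.com/somepage. The match starts at the leftmost ".html", so the second ".html" inside the query string of the last example does not matter.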
Why not use parse_url instead?
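A rough sketch of what the parse_url route could look like, assuming the input is always an absolute URL with a scheme and host. The function name stripHtmlViaParseUrl is hypothetical, and the sketch deliberately drops the query string and fragment once ".html" is found in the path; user info is not handled.

```php
<?php
// Hypothetical helper; assumes an absolute URL such as http://host/path.
function stripHtmlViaParseUrl(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not an absolute URL: leave it untouched
    }

    // Look for ".html" in the path only, so a ".html" that appears
    // solely inside the query string is never considered.
    $path = $parts['path'] ?? '';
    $pos = strpos($path, '.html');
    if ($pos === false) {
        return $url; // nothing to strip
    }

    $port = isset($parts['port']) ? ':' . $parts['port'] : '';
    return $parts['scheme'] . '://' . $parts['host'] . $port . substr($path, 0, $pos);
}

echo stripHtmlViaParseUrl('http://example.com/somepage.html?url=http://example.com/index.html'), PHP_EOL;
```

One difference from the regex approach: a URL whose only ".html" sits in the query string is returned unchanged here, whereas the regex would cut it at that point.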
If you ever have issues with the syntax for preg_replace() then you can also use explode():
$parts = explode(".html", $input, 2);
$result = $parts[0];