My valid URLs could look more or less like this:
http://someurl.com/some/path/file.pdf
or
http://someurl.com/some/path/file.pdf?param=value
or
http://someurl.com/some/path/file.pdf?param=value&second=val
where the file extension could be .pdf, or some other extension like .jpg or .psd, or nothing at all.
I have the URL stored without the someurl.com portion, so I only have the some/path/file.pdf part of the URL.
How can I use regex to know the file extension if it is present? Is regex the right tool for this?
I would use parse_url() and pathinfo(). These are the most correct functions for the job.
$url = 'http://someurl.com/some/path/file.pdf?param=value';
$path = parse_url($url, PHP_URL_PATH);
$ext = pathinfo($path, PATHINFO_EXTENSION);
var_dump($ext); // string(3) "pdf"
See it on CodePad.org.
You could use regex, but it will be more difficult to follow.
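For comparison, a regex version might look roughly like the sketch below. It assumes the input is the stored path without the host (as in the question), and the function name path_extension_regex is made up for illustration. Treat it as one possible approach, not a complete URL parser.

```php
<?php
// Hypothetical helper: the name path_extension_regex is illustrative only.
// Assumes $stored is a path without scheme or host, e.g. "some/path/file.pdf?param=value".
function path_extension_regex(string $stored): string
{
    // Drop the query string and fragment first, so dots in parameter values are ignored.
    $path = preg_replace('/[?#].*$/s', '', $stored);

    // Capture what follows the last dot of the final path segment.
    // [^./]+ stops a dot inside a directory name from being counted.
    if (preg_match('~\.([^./]+)$~', $path, $m)) {
        return $m[1];
    }

    return ''; // no extension present
}

echo path_extension_regex('some/path/file.pdf?param=value&second=val'), "\n"; // pdf
echo path_extension_regex('some/path/file.jpg'), "\n";                        // jpg
```

Two steps and a character class are needed just to match what pathinfo() already does, which is why the built-in functions are easier to follow.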
To know the file type reliably, you would probably need to send an HTTP HEAD request and look at the Content-Type header of the response. Regex would work on the URL itself, but it is not guaranteed to catch all cases.
For example:
http://someurl.com/some/path/file might be a text file without an extension (as on most *nix systems), and the regex would fail to provide a file extension.
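A much better option is PHP's parse_url() function:
$path = parse_url($url, PHP_URL_PATH);
// strrpos() returns false when there is no dot; compare strictly so a dot at position 0 still counts
$pos = strrpos($path, '.');
$extension = ($pos !== false) ? substr($path, $pos) : ""; // includes the leading dot, e.g. ".pdf"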
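Since the question stores the URL without the host, here is a small sketch showing that parse_url() also copes with relative URLs, and that pathinfo() returns an empty string for the extensionless case mentioned above. The helper name stored_path_extension is hypothetical, not something from the posts.

```php
<?php
// Hypothetical helper: stored_path_extension is an illustrative name.
function stored_path_extension(string $stored): string
{
    // parse_url() accepts relative URLs; with PHP_URL_PATH it returns null
    // when there is no path and false when the URL is seriously malformed.
    $path = parse_url($stored, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }

    // pathinfo() looks only at the last path segment, so a dot in a
    // directory name is not mistaken for an extension.
    return pathinfo($path, PATHINFO_EXTENSION);
}

var_dump(stored_path_extension('some/path/file.pdf?param=value&second=val')); // string(3) "pdf"
var_dump(stored_path_extension('some/path/file'));                            // string(0) ""
var_dump(stored_path_extension('some/path.d/file'));                          // string(0) ""
```

Unlike the strrpos() approach, this returns the extension without the leading dot.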
You don't need regex; you can just use parse_url().
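$url = parse_url('http://example.com/path/to/file.php?param=value');
// strrpos() returns false when the path has no dot, so check before slicing
$pos = strrpos($url['path'], '.');
$extension = ($pos === false) ? '' : substr($url['path'], $pos + 1);
echo $extension; // outputs "php"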
http://php.net/parse-url
http://php.net/substr
http://php.net/strrpos