
Check for valid link (URL)

开发者 (devze.com) https://www.devze.com 2023-01-05 15:43 Source: web

I was reading through this other question, which has some really good regexes for the job, but as far as I can see none of them work with Bash commands, as Bash commands don't support such complex regexes.

if echo "http://www.google.com/test/link.php" | grep -q '(https?|ftp|file)://[-A-Z0-9\+&@#/%?=~_|!:,.;]*[-A-Z0-9\+&@#/%=~_|]'; then 
    echo "Link valid"
else
    echo "Link not valid"
fi

But this doesn't work as grep -q doesn't work ...

Edit: OK, I just realised that grep has an "extended-regex" (-E) option, which seems to make it work. But if anyone has a better/faster way I would still love to hear about it.
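
For reference, here is a minimal sketch of what the -E variant from the edit might look like. The pattern is the one from the question, unchanged. The helper name is_valid_link is made up for illustration, and the -i flag is an assumption added here: the character classes list only A-Z, which in some locales (the C locale, for one) would not match the lowercase letters in the test URL.

```shell
# is_valid_link is a hypothetical helper name, not part of the original post.
is_valid_link() {
    # -E: extended regex syntax, so ( | ) and ? work without backslashes
    # -i: assumption, added so that A-Z also covers lowercase letters
    # -q: print nothing; only the exit status is used
    printf '%s\n' "$1" |
        grep -Eqi '(https?|ftp|file)://[-A-Z0-9\+&@#/%?=~_|!:,.;]*[-A-Z0-9\+&@#/%=~_|]'
}

if is_valid_link "http://www.google.com/test/link.php"; then
    echo "Link valid"
else
    echo "Link not valid"
fi
```

Note that the pattern is not anchored, so this only checks that the input contains something that looks like a link, not that the whole string is one.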


The following works in Bash >= version 3.2 without using grep:

regex='(https?|ftp|file)://[-[:alnum:]\+&@#/%?=~_|!:,.;]*[-[:alnum:]\+&@#/%=~_|]'
string='http://www.google.com/test/link.php'
if [[ $string =~ $regex ]]
then 
    echo "Link valid"
else
    echo "Link not valid"
fi

I simplified your regex by using [:alnum:], which matches any alphanumeric character, including non-ASCII ones (e.g. Э or ß), though support varies by the underlying regex library. Here is another potential simplification, which uses + instead of * followed by a repeated sequence (although your second sequence is different from the first, so the two patterns are not strictly equivalent).

regex='(https?|ftp|file)://[-[:alnum:]\+&@#/%?=~_|!:,.;]+'
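
To see where the two patterns differ, one option is to print what each one actually matched via BASH_REMATCH. This is only a sketch: the variable names strict and loose, the helper matched_part, and the sample text are all made up for illustration.

```shell
#!/usr/bin/env bash
# The two patterns from this answer, under illustrative names.
strict='(https?|ftp|file)://[-[:alnum:]\+&@#/%?=~_|!:,.;]*[-[:alnum:]\+&@#/%=~_|]'
loose='(https?|ftp|file)://[-[:alnum:]\+&@#/%?=~_|!:,.;]+'

# matched_part (hypothetical helper): print the part of $2 matched by regex $1.
matched_part() {
    # BASH_REMATCH[0] holds the text matched by the whole regex.
    [[ $2 =~ $1 ]] && printf '%s\n' "${BASH_REMATCH[0]}"
}

text='See http://www.google.com/test/link.php.'
matched_part "$strict" "$text"
matched_part "$loose" "$text"
```

With this input the first pattern should stop at link.php, because its final bracket expression excludes trailing punctuation such as the period, while the simplified pattern also swallows the sentence-ending period.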


Since I don't have enough rep to comment above, I am going to amend the answer given by Dennis above with this one.

I incorporated Christopher's update to the regex and then added more to it so that the URL has to at least be in this format:

http://w.w (has to have a period in it).

And tweaked output a bit :)

regex='^(https?|ftp|file)://[-A-Za-z0-9\+&@#/%?=~_|!:,.;]*[-A-Za-z0-9\+&@#/%=~_|]\.[-A-Za-z0-9\+&@#/%?=~_|!:,.;]*[-A-Za-z0-9\+&@#/%=~_|]$'

url='http://www.google.com/test/link.php'
if [[ $url =~ $regex ]]
then 
    echo "$url IS valid"
else
    echo "$url IS NOT valid"
fi
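
A quick way to see what the anchors and the required period change is to run the same check over a few inputs. This is a rough sketch: the wrapper name check_url and the sample URLs other than the first two are invented for illustration, and the regex is the one above, unchanged.

```shell
#!/usr/bin/env bash
regex='^(https?|ftp|file)://[-A-Za-z0-9\+&@#/%?=~_|!:,.;]*[-A-Za-z0-9\+&@#/%=~_|]\.[-A-Za-z0-9\+&@#/%?=~_|!:,.;]*[-A-Za-z0-9\+&@#/%=~_|]$'

# check_url is a hypothetical wrapper; exit status 0 means the whole string matched.
check_url() {
    # $regex is left unquoted so Bash treats it as a pattern, not a literal string.
    [[ $1 =~ $regex ]]
}

for url in \
    'http://www.google.com/test/link.php' \
    'http://w.w' \
    'http://localhost' \
    'www.google.com' \
    'http://www.google.com/test/link.php.'
do
    if check_url "$url"; then
        echo "$url IS valid"
    else
        echo "$url IS NOT valid"
    fi
done
```

The first two inputs should be reported as valid. The other three should not: http://localhost has no period, www.google.com has no scheme, and the last one ends in a period, which the final bracket expression does not allow before the $ anchor.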


Probably because the regular expression is written in PCRE syntax. See if you have (or can install) the program pcregrep on your system - it has the same syntax as grep but accepts Perl-compatible regexes - and you should be able to make that work.

Another option is to try the -P option to grep, but the man page says that's "highly experimental" so it may or may not actually work.

I will say that you should think carefully about whether it's really appropriate to be using this or any regex to validate a URL. If you want to have a correct validation, you'd probably be better off finding or writing a small script in, say, Perl, to use the URL validation facilities of the language.

EDIT: In response to your edit in the question, I didn't notice that that regex is also valid in "extended" syntax. I don't think you can get better/faster than that.
