How can I strip a URL like http://www.facebook.com/pages/create.php down to just the host, so that the result is www.facebook.com?
I tried it this way, but it doesn't work:
line.split('/', 2)[2]
My problem is probably the two forward slashes (//), and some of the URLs start directly with www instead of a scheme.
Thanks for your help, Adia
You might want to look at Python's urlparse module (in Python 3 the same function is imported with "from urllib.parse import urlparse").
>>> from urlparse import urlparse
>>> o = urlparse('http://www.facebook.com/pages/create.php')
>>> o.netloc
'www.facebook.com'
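One caveat: urlparse only fills in netloc when the string contains the // marker, so a URL that starts directly with www ends up entirely in path. Here is a minimal Python 3 sketch that works around this. The helper name get_host and the list of recognised prefixes are assumptions made for illustration, not part of the original answer.

```python
from urllib.parse import urlparse

def get_host(url):
    """Return the host part of url, tolerating a missing scheme."""
    # Assumption: anything not starting with one of these prefixes has no scheme.
    if not url.startswith(("http://", "https://", "//")):
        # Prepending "//" makes urlparse treat the first segment as netloc.
        url = "//" + url
    return urlparse(url).netloc

print(get_host("http://www.facebook.com/pages/create.php"))  # www.facebook.com
print(get_host("www.facebook.com/pages/create.php"))         # www.facebook.com
```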
Probably the best bet would be to extract the server part with a regex, i.e.,
\/[a-z0-9\-\.]*[a-zA-Z0-9\-]+\.[a-z]{2,3}\/
That can cover www.facebook.com, facebook.com, some-domain.tv, www.some-domain.net, etc.
NOTE: the leading and trailing slashes are part of the regex, not regex delimiters.
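Because that pattern requires a slash on both sides of the host, it will not match a URL that starts directly with www or one that has no path. A hedged variant of the same idea is sketched below: it makes the scheme optional, captures the host in a group, and allows top-level domains longer than three letters. The pattern and the name host_via_regex are my own adaptation, not the regex from the answer above, and it deliberately ignores hosts without a dot (such as localhost).

```python
import re

# Optional scheme, then a dotted host name, then a delimiter or end of string.
HOST_RE = re.compile(
    r"^(?:[a-z][a-z0-9+.\-]*://)?"   # optional scheme such as http://
    r"([a-z0-9\-.]+\.[a-z]{2,})"     # host: labels followed by a TLD
    r"(?:[/:?#]|$)",                 # host ends at /, :, ?, # or end of input
    re.IGNORECASE,
)

def host_via_regex(url):
    """Return the captured host, or None if the pattern does not match."""
    m = HOST_RE.match(url)
    return m.group(1) if m else None

for u in ("http://www.facebook.com/pages/create.php",
          "www.some-domain.net/index.html",
          "some-domain.tv"):
    print(host_via_regex(u))
```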
Try:
line.split("//", 1)[-1].split("/", 1)[0]
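To see why this works where the attempt in the question fails, here is a small comparison on both kinds of input. The sample URLs come from the question; the variable names are just for illustration.

```python
with_scheme = "http://www.facebook.com/pages/create.php"
without_scheme = "www.facebook.com/pages/create.php"

for line in (with_scheme, without_scheme):
    # Attempt from the question: the third piece after two splits on "/".
    attempt = line.split("/", 2)[2]
    # Suggested fix: drop everything up to "//" (if present), then cut at the first "/".
    host = line.split("//", 1)[-1].split("/", 1)[0]
    print(attempt, "|", host)

# Prints:
# www.facebook.com/pages/create.php | www.facebook.com
# create.php | www.facebook.com
```

Using [-1] rather than [1] after the first split is what makes the scheme optional: when there is no // in the string, split returns a single-element list and [-1] is the whole string.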
I would do:
ch[7 if ch[0:7]=='http://' else 0:].partition('/')[0]
I'm not sure it's valid for all the cases you'll encounter; for instance, it does not strip an https:// prefix.
Also:
ch[(ch[0:7]=='http://')*7:].partition('/')[0]
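The second form relies on True and False behaving as 1 and 0 in arithmetic, so the slice starts at 7 or at 0. If other prefixes need to be handled as well, the same slice-then-partition idea can be spelled out more explicitly. This is only a sketch; the function name host_from and the prefix list are assumptions.

```python
def host_from(ch, prefixes=("http://", "https://")):
    """Strip a known scheme prefix, then keep everything before the first '/'."""
    for prefix in prefixes:
        if ch.startswith(prefix):
            ch = ch[len(prefix):]  # drop the scheme
            break
    # partition returns (head, sep, tail); head is the whole string if no '/' exists.
    return ch.partition("/")[0]

print(host_from("http://www.facebook.com/pages/create.php"))  # www.facebook.com
print(host_from("www.facebook.com/pages/create.php"))         # www.facebook.com
```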