Split from a specific delimiter

开发者 (Developer) https://www.devze.com 2023-02-05 06:32 Source: the web
How can I strip a URL like http://www.facebook.com/pages/create.php down to just the host, so that the result is www.facebook.com?

I tried it this way, but it doesn't work:

line.split('/', 2)[2]

My problem is probably with the two forward slashes (//), and some of the URLs start directly with www.

Thanks for your help, Adia
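
For reference, here is a small sketch of why the attempt in the question returns more than the host. It only uses the URL from the question; the variable names `parts` and `third` are illustrative. With a maxsplit of 2, the string is cut at the first two slashes only, so the third piece still carries the whole path.

```python
line = 'http://www.facebook.com/pages/create.php'

# maxsplit=2 cuts at the first two "/" characters only, which are the
# two slashes of "//". Everything after them stays in one piece.
parts = line.split('/', 2)
print(parts)

# The third element is therefore host plus path, not the host alone.
third = parts[2]
print(third)
```

A URL that starts with www has no `//` at all, so the same expression would cut inside the path instead.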


You might want to look at Python's urlparse module (in Python 3 it is urllib.parse).

>>> from urllib.parse import urlparse  # Python 2: from urlparse import urlparse
>>> o = urlparse('http://www.facebook.com/pages/create.php')
>>> o.netloc
'www.facebook.com'
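
The question also mentions URLs that start with www. A minimal sketch of one way to handle those, assuming Python 3 and a hypothetical helper name `host_of` (not part of the standard library): urlparse only fills in netloc when the string contains `//`, so the helper prepends it when it is missing.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the network location of url (hypothetical helper)."""
    # Without "//", urlparse treats "www.example.com/..." as a path
    # and leaves netloc empty, so add the marker when it is absent.
    if "//" not in url:
        url = "//" + url
    return urlparse(url).netloc


print(host_of('http://www.facebook.com/pages/create.php'))
print(host_of('www.facebook.com/pages/create.php'))
```

Note that netloc also includes a port or user info when the URL has them. This sketch does not try to handle a scheme-less URL whose path itself contains `//`.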


Probably the best bet would be to extract the server part with a regex, i.e.,

\/[a-z0-9\-\.]*[a-zA-Z0-9\-]+\.[a-z]{2,3}\/

That can cover www.facebook.com, facebook.com, some-domain.tv, www.some-domain.net, etc.

NOTE: the leading and trailing slashes are part of the pattern itself, not regex delimiters.
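
A sketch of how that pattern might be used from Python. The capturing group and the helper name `host_by_regex` are additions for illustration; the pattern is otherwise the one above. Because the pattern requires a slash on both sides, it finds nothing in a URL that starts with www or that has no slash after the host.

```python
import re

# Same pattern as above, with a capturing group added (an assumption
# for this sketch) so the surrounding slashes are not returned.
HOST_RE = re.compile(r'\/([a-z0-9\-\.]*[a-zA-Z0-9\-]+\.[a-z]{2,3})\/')


def host_by_regex(url):
    """Return the captured host, or None when the pattern finds no match."""
    m = HOST_RE.search(url)
    return m.group(1) if m else None


print(host_by_regex('http://www.facebook.com/pages/create.php'))
print(host_by_regex('http://some-domain.tv/index.html'))
# No leading slash before the host, so this one is not matched.
print(host_by_regex('www.facebook.com/pages/create.php'))
```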


Try:

line.split("//", 1)[-1].split("/", 1)[0]
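
Spelled out step by step, the one-liner above could look like this. The function name `host_by_split` is just for illustration. The `[-1]` is what makes it work for URLs without `//`: when the separator is absent, split returns the whole string as its only element.

```python
def host_by_split(line):
    """Same logic as the one-liner, split into two commented steps."""
    # Drop everything up to and including the first "//" if there is one.
    # With no "//", split returns [line], and [-1] is the whole string.
    rest = line.split("//", 1)[-1]
    # Keep only what comes before the first remaining "/".
    return rest.split("/", 1)[0]


print(host_by_split('http://www.facebook.com/pages/create.php'))
print(host_by_split('www.facebook.com/pages/create.php'))
```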


I would do:

ch[7 if ch[0:7]=='http://' else 0:].partition('/')[0]

I’m not sure it’s valid for all the cases you’ll encounter.

Also, using the comparison result (True or False) as 1 or 0:

ch[(ch[0:7]=='http://')*7:].partition('/')[0]
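
One case where the hard-coded 7 falls short is an https:// URL, since its first seven characters are `https:/`. A hedged sketch of a variant that checks a list of prefixes instead; the helper name `host_by_prefix` and the default prefix tuple are assumptions, not something from the thread.

```python
def host_by_prefix(ch, prefixes=("http://", "https://")):
    """Strip one known scheme prefix, then cut at the first "/"."""
    for p in prefixes:
        if ch.startswith(p):
            # Remove exactly the prefix that matched, whatever its length.
            ch = ch[len(p):]
            break
    return ch.partition("/")[0]


# The original expression, for comparison on an https URL.
url = 'https://www.facebook.com/pages/create.php'
original = url[7 if url[0:7] == 'http://' else 0:].partition('/')[0]

print(original)
print(host_by_prefix(url))
```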