How can I capture the final URL after redirects are followed, using Python, Django, or a shell-based tool?

Developer https://www.devze.com 2023-02-10 08:00 Source: web
OK, I'm doing a Django project where I have affiliate links for different sites. I want to automatically determine the final domain after all the redirects are followed, because a lot of the affiliate links will come from a third party rather than the destination itself. For example, an affiliate link may look like this:

   http://afl.affiliatenetworking.com/tracker.asp?ref=abc123afialiate       

but may end up redirecting to amazon.com, for example. Is there anything in Python (or an external utility on Linux) that can tell me where I end up after all the redirects are followed?

Thanks!


By default, urllib2.urlopen() follows redirects. The response has a geturl() method which returns the address of the actual place you ended up. See the documentation.
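
On Python 3, urllib2 was folded into urllib.request, so the same idea can be sketched as below. This is only a minimal sketch: the function name resolve_final_url and the 10-second default timeout are assumptions made for illustration, not part of any library.

```python
from urllib.request import urlopen


def resolve_final_url(url, timeout=10):
    """Return the URL reached after urlopen() has followed any HTTP redirects.

    The function name and the default timeout are illustrative assumptions.
    """
    # urlopen() follows HTTP redirects by default.
    with urlopen(url, timeout=timeout) as response:
        # geturl() reports the URL of the resource that was actually retrieved.
        return response.geturl()
```

Calling resolve_final_url() with the affiliate link should give back the address of the last page in the chain. Note that this only sees HTTP-level redirects; a tracker that redirects through JavaScript or a meta refresh tag would not be followed.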


You don't need any custom tools to perform such a check. Basic shell utils are enough:

curl -s --head --location 'http://afl.affiliatenetworking.com/tracker.asp?ref=abc123afialiate' | grep -i '^location' | tail -n 1

The above will follow all of the redirects and extract the last Location header, which is the final destination.
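
The curl pipeline prints a whole header line such as "Location: https://...", while the question asks for the final domain. One possible way to pull the hostname out of such a line in Python is sketched here. The helper name domain_from_location_line is hypothetical, and the sketch assumes the header carries an absolute URL (for a relative Location value it returns None).

```python
from urllib.parse import urlsplit


def domain_from_location_line(line):
    """Extract the hostname from a header line like 'Location: https://host/path'.

    Hypothetical helper for illustration. Returns None when the header value
    is a relative URL, since such a value carries no hostname.
    """
    # Split only on the first colon so the URL's own colons stay intact.
    name, sep, value = line.partition(":")
    if not sep or name.strip().lower() != "location":
        raise ValueError("not a Location header line: %r" % (line,))
    # urlsplit() lowercases the hostname and drops any port number.
    return urlsplit(value.strip()).hostname
```

The header name is compared case-insensitively because servers do not all capitalise it the same way.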


You can also try FancyURLopener (http://docs.python.org/library/urllib.html#urllib.FancyURLopener). It handles most redirect cases, and because it subclasses URLopener, the object returned by open() provides geturl(). So you can simply write:

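import urllib  # Python 2: FancyURLopener lives in the top-level urllib module

fancy = urllib.FancyURLopener({})  # empty dict: use no proxies
link = fancy.open('http://some/affiliate/link')  # follows redirects
final_link = link.geturl()  # URL reached after the last redirect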

Works great for me :)
