Python unescape URL

Developer https://www.devze.com 2023-01-27 08:21 Source: Internet
I have a URL in this form: http:\\/\\/en.wikipedia.org\\/wiki\\/The_Truman_Show. How can I turn it into a normal URL? I have tried urllib.unquote without much success.

I can always use regular expressions or some simple string replacement, but I believe there is a better way to handle this...


urllib.unquote is for replacing %xx escape codes in URLs with the characters they represent. It won't be useful for this.
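
To make the difference concrete, here is a minimal sketch of what unquote does and does not touch. It assumes Python 3, where the function lives at urllib.parse.unquote; the percent-encoded URL is a made-up input for illustration, and the backslash-escaped one assumes a single backslash before each slash.

```python
from urllib.parse import unquote  # Python 3 location of urllib.unquote

# Hypothetical percent-encoded input, used only for contrast.
percent_encoded = "http://en.wikipedia.org/wiki/The%20Truman%20Show"
# Backslash-escaped input (assumed form: one backslash before each slash).
backslash_escaped = r"http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"

decoded = unquote(percent_encoded)      # the %20 sequences are decoded
unchanged = unquote(backslash_escaped)  # no %xx sequences, so nothing changes

print(decoded)
print(unchanged)
```

The backslashes survive because they are not percent-escapes, which is why unquote appears to do nothing here.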

Your "simple string replace stuff" is probably the best solution.


Have you tried using json.loads from the json module?

>>> import json
>>> json.loads('"http:\\/\\/en.wikipedia.org\\/wiki\\/The_Truman_Show"')
'http://en.wikipedia.org/wiki/The_Truman_Show'

The input I'm showing isn't exactly what you have: I've wrapped it in double quotes to make it valid JSON.

When you first get it from the JSON, how are you decoding it? That's probably where the problem is.
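
If the URL really does come out of a JSON document, decoding the whole document with the json module should remove the escaping as a side effect. The following is only a sketch under that assumption: the payload and its key names ("title", "url") are invented for illustration and are not taken from the question.

```python
import json

# Hypothetical raw response body; the structure and key names are assumptions.
raw = r'{"title": "The Truman Show", "url": "http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"}'

# "\/" is a legal JSON escape for "/", so the parser undoes it while decoding.
data = json.loads(raw)
url = data["url"]

print(url)
```

Pulling the string out of the raw text by hand (for example with slicing or a regex) skips this decoding step, which would explain why the backslashes are still there.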


There is no need to look for a library function when you can transform the URL yourself. Since there are no visible rules other than "/" being replaced by "\/", you can simply replace it back:

def unescape_this(url):
    # The raw string r"\\/" matches two literal backslashes followed by a slash.
    return url.replace(r"\\/", "/")
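
Because it is not clear from the question whether each slash is preceded by one backslash or two, a slightly more tolerant variant may be useful. This is a sketch of the regular-expression route the asker mentioned, not part of the original answer; the name unescape_slashes is hypothetical.

```python
import re


def unescape_slashes(url):
    """Collapse any run of backslashes directly before a slash into the slash."""
    # r"\\+/" means: one or more literal backslashes, then a forward slash.
    return re.sub(r"\\+/", "/", url)


single = r"http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"
double = r"http:\\/\\/en.wikipedia.org\\/wiki\\/The_Truman_Show"

print(unescape_slashes(single))
print(unescape_slashes(double))
```

Backslashes that are not followed by a slash are left alone, so the function only reverses this one escaping rule.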