I am writing a script that grabs some data from my website using HTTP GET.
My problem is that I have to pass Unicode characters to the website.
I read a file that contains these characters and then try to build a URL in order to make the request.
The file is UTF-8 encoded and I use this to read from it:
f = codecs.open("values.txt", encoding='utf-8')
Then I read the first line of the file and concatenate the value with the URL:
sUrl = "http://example.com?word="
value = f.readline()
visitUrl = sUrl + value
If I use print visitUrl, the output is correct, i.e. http://example.com?word=π
How can I use visitUrl without destroying my special characters?
I tried to encode the string to ascii but it doesn't work for all characters.
Quote the URL:
import urllib
s = u'Здравей'
urllib.quote(s.encode('utf-8'))
# %D0%97%D0%B4%D1%80%D0%B0%D0%B2%D0%B5%D0%B9
Or use urlencode directly to build the query part of the URL:
urllib.urlencode({'data': s.encode('utf-8')})
# 'data=%D0%97%D0%B4%D1%80%D0%B0%D0%B2%D0%B5%D0%B9'
Build the URL with urllib.urlencode rather than trying to construct it by concatenating strings. Non-ASCII characters in a URL need to be URL encoded.
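To tie this back to the question, here is a minimal sketch of building the request URL with urlencode. The helper name build_url is hypothetical, and the import fallback is only there so the same snippet runs on Python 3 (where urlencode lives in urllib.parse) as well as Python 2. It also strips the value first, on the assumption that the trailing newline left by readline() should not end up in the URL.

```python
# -*- coding: utf-8 -*-
try:
    from urllib.parse import urlencode  # Python 3 location
except ImportError:
    from urllib import urlencode  # Python 2 location, as used above


def build_url(base_url, word):
    """Hypothetical helper: append `word` as a percent-encoded query parameter."""
    # readline() keeps the trailing newline, so strip it first, then
    # encode the text to UTF-8 bytes before percent-encoding.
    query = urlencode({'word': word.strip().encode('utf-8')})
    return base_url + '?' + query


# Stand-in for the value read from values.txt in the question.
visit_url = build_url('http://example.com', u'π\n')
print(visit_url)  # http://example.com?word=%CF%80
```

The resulting string contains only ASCII characters, so it can be passed to whatever function performs the GET request.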