I'm trying to figure out how to convert a string representation of a byte-string into an actual byte-string type. I'm not very used to Python (just hacking on it to help a friend), so I'm not sure if there's some easy "casting" method (like my beloved Java has ;) ). Basically I have a text file, which has as its contents a byte-string:
b'\x03\xacgB\x16\xf3\xe1\\v\x1e\xe1\xa5\xe2U\xf0g\x956#\xc8\xb3\x88\xb4E\x9e\x13\xf9x\xd7\xc8F\xf4'
I currently read in this file as follows:
aFile = open('test.txt')
x = aFile.read()
print(x) # prints b'\x03\xacgB\x16\xf3\xe1\\v\x1e\xe1\xa5\xe2U\xf0g\x956#\xc8\xb3\x88\xb4E\x9e\x13\xf9x\xd7\xc8F\xf4'
print(type(x)) # prints <class 'str'>
How do I make x be of type <class 'bytes'>? Thanks for any help.
Edit: Having read one of the replies below, I think I'm maybe constraining the question too much. My apologies for that. The input string doesn't have to be in Python byte-string format (i.e. with the b and the quotation marks), it could just be the plain byte-string:
\x03\xacgB\x16\xf3\xe1\\v\x1e\xe1\xa5\xe2U\xf0g\x956#\xc8\xb3\x88\xb4E\x9e\x13\xf9x\xd7\xc8F\xf4
If this makes it easier or is better practice, I can use this.
>>> r'\x03\xacgB\x16\xf3\xe1\\v\x1e\xe1\xa5\xe2U\xf0g\x956#\xc8\xb3\x88\xb4E\x9e\x13\xf9x\xd7\xc8F\xf4'.decode('string-escape')
'\x03\xacgB\x16\xf3\xe1\\v\x1e\xe1\xa5\xe2U\xf0g\x956#\xc8\xb3\x88\xb4E\x9e\x13\xf9x\xd7\xc8F\xf4'
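This works for strings that don't have b'...' around them. Note that it is Python 2 only: in Python 3, str has no decode method and the string-escape codec is gone. Otherwise you are encouraged to use ast.literal_eval().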
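For the b'...' form from the question, the ast.literal_eval() route could look roughly like this. It's a minimal sketch assuming Python 3; parse_bytes_literal is just a name I made up, and the sample text is a shortened version of the file contents.

```python
import ast

def parse_bytes_literal(text):
    """Turn the text "b'...'" into a real bytes object without running code."""
    # literal_eval only accepts Python literals, never function calls.
    value = ast.literal_eval(text.strip())
    if not isinstance(value, bytes):
        raise ValueError("expected a bytes literal, got %s" % type(value).__name__)
    return value

# Simulates what aFile.read() returns, including a trailing newline.
text = r"b'\x03\xacgB\x16\xf3\xe1\\v'" + "\n"
data = parse_bytes_literal(text)
print(type(data))  # <class 'bytes'>
print(len(data))   # 9
```

The isinstance check is there because literal_eval would also happily return a str, an int, or a list if the file contained one of those.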
Since your input is in Python's syntax, for some reason (*), the thing to do here is just call eval:
>>> r"b'\x12\x12'"
"b'\\x12\\x12'"
>>> eval(r"b'\x12\x12'")
b'\x12\x12'
Be careful, though, as this may be a security problem. eval will run any code, so you may need to sanitize the input. In your case it's simple: just check that the thing you're eval-ing is indeed a string in the format you expect. If security isn't an issue here, just don't bother.
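One way to do that check, as a rough sketch assuming Python 3: match the whole input against a pattern for a single-quoted bytes literal before handing it to eval. The pattern and the name checked_eval are my own and only cover printable ASCII, \xNN, and the escapes \\, \' and \"; widen it if your data uses other escapes.

```python
import re

# Assumed format: b'...' in single quotes, whose body is made of \xNN escapes,
# the escapes \\ \' \", or printable ASCII other than a quote or backslash.
BYTES_LITERAL = re.compile(
    r"""b'(?:\\x[0-9a-fA-F]{2}|\\[\\'"]|[\x20-\x26\x28-\x5b\x5d-\x7e])*'"""
)

def checked_eval(text):
    """eval() the text only if the whole thing looks like a bytes literal."""
    text = text.strip()
    if BYTES_LITERAL.fullmatch(text) is None:
        raise ValueError("input is not a plain bytes literal")
    return eval(text)

print(checked_eval(r"b'\x12\x12'"))  # b'\x12\x12'

try:
    checked_eval("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", exc)
```

Using fullmatch rather than match matters here, since anything trailing after the closing quote would otherwise slip through to eval.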
Regarding your EDIT: Still, eval is the simplest approach here (after adding the b'' if it's not there). You could also, of course, do this manually by converting each \xXX to its real value.
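The manual conversion could be sketched like this for the plain form without b'...'. It assumes Python 3 and ASCII input, and only handles \xNN plus a handful of simple escapes; unescape_to_bytes and the helper names are made up for illustration.

```python
import re

# Simple one-character escapes and the characters they stand for.
_SIMPLE = {"n": "\n", "r": "\r", "t": "\t", "'": "'", '"': '"', "\\": "\\"}

# Matches a backslash followed by xNN or by one of the simple escape characters.
_ESCAPE = re.compile(r"""\\(x[0-9a-fA-F]{2}|[nrt'"\\])""")

def unescape_to_bytes(text):
    """Convert text such as \\x03\\xacgB (literal backslashes) to bytes."""
    def replace(match):
        token = match.group(1)
        if token in _SIMPLE:
            return _SIMPLE[token]
        # token is 'xNN': turn the two hex digits into one character.
        return chr(int(token[1:], 16))
    # Every resulting character is below 256, so latin-1 maps it to one byte.
    return _ESCAPE.sub(replace, text).encode("latin-1")

raw = r"\x03\xacgB\x16\xf3\xe1\\v"
print(unescape_to_bytes(raw))  # b'\x03\xacgB\x16\xf3\xe1\\v'
```

For ASCII-only input like this, text.encode("ascii").decode("unicode_escape").encode("latin-1") should give the same bytes in one line, at the cost of also interpreting every other escape that the unicode_escape codec understands.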
(*) Why, really? This seems like a strange choice for a data representation format