Our code calls stringWithUTF8String, but some of our data contains the octal escape sequence \340 in the string. This breaks some of our code, because we never expected the function to return nil. I did some research and found that any octal sequence from \200 to \777 gives the same result. I know I can handle the nil return, but I want to understand why it returns nil and what those octal escapes are interpreted as.
NSString *result = [NSString stringWithUTF8String:"Mfile \340 xyz.jpg"];
Running this code returns nil for result. It appears that to code defensively we will have to check for a nil result everywhere we use it, which seems unfortunate. The documentation for the function does not mention nil as a possible return value. I would bet that there is a lot of code out there that does not check for it either.
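\340 is the single byte 0xE0, which is not valid UTF-8 on its own: in UTF-8 it is the lead byte of a three-byte sequence and must be followed by two continuation bytes, so stringWithUTF8String cannot decode it and returns nil. You need to use a different encoding for this, such as the ASCII encoding (if the data is really Latin-1, NSISOLatin1StringEncoding may be the better fit). Do,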
NSString * result = [NSString stringWithCString:"Mfile \340 xyz.jpg" encoding:NSASCIIStringEncoding];
NSLog(@"%@", result);
If you want iOS to handle it as UTF-8, you have to make sure the bytes you pass in are valid UTF-8, so you may need to convert the octal-escaped characters to something valid (and human readable) first.
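To see why a lone \340 is rejected, here is a rough sketch in plain C (which also compiles as Objective-C) of a structural UTF-8 check. The names utf8_expected_length and utf8_is_well_formed are made up for illustration and are not part of Foundation. This is a simplified check that only looks at lead and continuation bytes, so it does not reject overlong forms or surrogates, and it is not what stringWithUTF8String does internally.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: how many bytes a UTF-8 sequence should occupy,
   judged from its first byte. Returns 0 if the byte cannot start a sequence. */
static size_t utf8_expected_length(unsigned char lead) {
    if (lead < 0x80) return 1;            /* 0xxxxxxx: plain ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx, e.g. \340 == 0xE0 */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;                             /* stray continuation or invalid byte */
}

/* Hypothetical helper: simplified structural check. Every lead byte must be
   followed by the right number of continuation bytes (10xxxxxx). */
static bool utf8_is_well_formed(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    while (*p) {
        size_t n = utf8_expected_length(*p);
        if (n == 0) return false;
        for (size_t i = 1; i < n; i++) {
            /* A NUL terminator fails this test too, so we never read past the end. */
            if ((p[i] & 0xC0) != 0x80) return false;
        }
        p += n;
    }
    return true;
}
```

With this, utf8_is_well_formed("Mfile \340 xyz.jpg") is false because 0xE0 announces a three-byte sequence but is followed by a space, while "Mfile \303\240 xyz.jpg" (the UTF-8 encoding of à) passes. A check like this could be run before calling stringWithUTF8String to decide whether to fall back to another encoding.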
I added a category method called safeStringWithUTF8String: which is now called everywhere instead. It simply checks the return value for nil and returns the empty string if the input is not valid. Not great, but I'm not sure what else to do, since we have to be able to handle any data passed in.