I have gotten a value, encoded like so:
%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%
I noticed that one of the characters near the end, the %u2013, seems to be encoded differently. It appears to be some form of Unicode character, but it is causing me to get "URI malformed" errors. Is there a way to replace these with standard encoding characters? In this example, it seems %u2013 is supposed to be a dash.
To be complete and more correct, the regular expression should also accept letters from A to F, since the %u2013 refers to a four-digit hexadecimal number. And you should definitely include the percent sign in the regular expression, otherwise you end up interpreting Blu2000 as a Unicode escape sequence, which it isn't.
function fixUnicodeUrl(url) {
    var result = url.replace(/%u[0-9a-f]{4}/gi, function (match) {
        var codepoint = parseInt(match.substring(2), 16);
        var str = String.fromCharCode(codepoint);
        return encodeURIComponent(str);
    });
    return result;
}
var yourUrl = '%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%';
alert(fixUnicodeUrl(yourUrl));
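Two things can still trip this up, so here is a rough sketch rather than a definitive fix. First, characters outside the Basic Multilingual Plane arrive as two consecutive %uXXXX escapes (a surrogate pair), and encodeURIComponent throws if it is handed only one half. Second, the sample value ends in a lone %, which decodeURIComponent rejects even after the escapes are converted. The names fixUnicodeRuns and tryDecode below are made up for illustration, and the "complete" string assumes the trailing % in the question is just a truncation.

```javascript
// Hypothetical helper: convert runs of consecutive %uXXXX escapes in one go,
// so that surrogate pairs are encoded together instead of one half at a time.
function fixUnicodeRuns(url) {
    return url.replace(/(?:%u[0-9a-f]{4})+/gi, function (run) {
        // Pull the four hex digits out of every escape in the run.
        var units = run.match(/[0-9a-f]{4}/gi).map(function (hex) {
            return parseInt(hex, 16);
        });
        // Build the string from its UTF-16 code units, then re-encode as UTF-8.
        return encodeURIComponent(String.fromCharCode.apply(null, units));
    });
}

// Hypothetical helper: decode the repaired value, returning null instead of
// throwing a URIError when the input is still malformed.
function tryDecode(value) {
    try {
        return decodeURIComponent(fixUnicodeRuns(value));
    } catch (e) {
        return null;
    }
}

// Assumption: the value from the question without its dangling '%'.
var complete = '%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013';
var truncated = complete + '%';

console.log(JSON.stringify(tryDecode(complete)));
console.log(tryDecode(truncated)); // null, because of the trailing '%'
```

Returning null is only one way to signal failure; if the caller would rather see the original error, drop the try/catch and let the URIError propagate.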
That is malformed for sure. Where are you getting it from?
Here's a way to fix all occurrences of that type of malformation.
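var str = '%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%';
// Match the whole %uXXXX escape: percent sign included, hex digits allowed.
str = str.replace( /%u([0-9a-f]{4})/gi, function( sequence, hex )
{
    // Turn the four hex digits into a character without resorting to eval.
    return encodeURIComponent( String.fromCharCode( parseInt( hex, 16 ) ) );
} );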