JavaScript: handling odd characters in an encoded string

Developer https://www.devze.com 2023-01-04 04:07 Source: Internet

I have gotten a value, encoded like so:

%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%

I noticed that one of the characters near the end, %u2013, seems to be encoded differently from the rest. It appears to be some form of Unicode escape, but it is causing me to get "URI malformed" errors. Is there a way to replace these with standard percent-encoded characters? In this example, %u2013 seems to stand for a dash (U+2013, the en dash).

To be complete and more correct, the regular expression should also accept the letters A to F, since the part after %u (2013 here) is a four-digit hexadecimal number. You should also definitely include the percent sign in the regular expression; otherwise a string such as Blu2000 would be misread as containing a Unicode escape sequence, which it does not.

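function fixUnicodeUrl(url) {
    // Match the non-standard %uXXXX form: percent sign, "u", four hex digits.
    return url.replace(/%u[0-9a-f]{4}/gi, function (match) {
        // Drop the leading "%u" and parse the rest as a hexadecimal code unit.
        var codepoint = parseInt(match.substring(2), 16);
        var str = String.fromCharCode(codepoint);
        // Re-encode as standard UTF-8 percent-encoding, e.g. %E2%80%93.
        return encodeURIComponent(str);
    });
}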

var yourUrl = '%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%';
alert(fixUnicodeUrl(yourUrl));
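
One caveat: fixUnicodeUrl converts each %uXXXX escape on its own, so a character outside the Basic Multilingual Plane, which arrives as two escapes (a surrogate pair such as %uD83D%uDE00), would make encodeURIComponent throw a URIError. A minimal sketch of one way around that, assuming it is acceptable to convert consecutive escapes as a single run; the name fixUnicodeRuns is hypothetical and not part of the original answer:

```javascript
// Hypothetical helper (not from the original answer): converts runs of
// %uXXXX escapes into standard UTF-8 percent-encoding.
function fixUnicodeRuns(input) {
    // Match one or more consecutive %uXXXX escapes so that a surrogate
    // pair such as %uD83D%uDE00 is converted as one character.
    return input.replace(/(?:%u[0-9a-f]{4})+/gi, function (run) {
        // "%uD83D%uDE00" splits into ["", "D83D", "DE00"]; skip the empty head.
        var units = run.split(/%u/i).slice(1).map(function (hex) {
            return parseInt(hex, 16);
        });
        var text = String.fromCharCode.apply(null, units);
        return encodeURIComponent(text);
    });
}

var fixed = fixUnicodeRuns('Course%20%u2013%20%uD83D%uDE00');
console.log(fixed);                      // Course%20%E2%80%93%20%F0%9F%98%80
console.log(decodeURIComponent(fixed));  // the text with an en dash and an emoji
```

A lone, unpaired surrogate would still throw here, and so would decoding a string that ends in a dangling %, as the sample value does; both cases need separate handling.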

That is malformed for sure. Where are you getting it from?

Here's a way to fix all occurrences of that type of malformation.

var str = '%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%';

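// Match the full %uXXXX escape (percent sign plus four hex digits) and
// convert it without eval.
str = str.replace( /%u([0-9a-f]{4})/gi, function( sequence, hex )
{
  return encodeURIComponent( String.fromCharCode( parseInt( hex, 16 ) ) );
} );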
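
As a quick check of why the pattern needs both the percent sign and the hex letters A to F, here is a small sketch comparing a loose pattern with a strict one. The variable names and the findAll helper are illustrative only:

```javascript
// Two candidate patterns for finding %uXXXX escapes (names are illustrative).
var loose = /u\d{4}/g;           // no percent sign, decimal digits only
var strict = /%u[0-9a-f]{4}/gi;  // percent sign plus four hex digits

// Returns every match of the pattern in the text, or an empty array.
function findAll(pattern, text) {
    return text.match(pattern) || [];
}

console.log(findAll(loose, 'Blu2000'));   // ['u2000']  (false positive)
console.log(findAll(strict, 'Blu2000'));  // []
console.log(findAll(loose, '%u201C'));    // []  (misses the hex letter C)
console.log(findAll(strict, '%u201C'));   // ['%u201C']
```

The loose pattern also leaves the leading % in place when used for replacement, so the output would start with a stray percent sign.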