Convert Unicode to UTF8

Developer https://www.devze.com 2023-02-24 06:12 Source: Internet
I am trying to mash up two different third-party services in JavaScript. I am getting strings in one character encoding that I need to convert to a different encoding in JavaScript.

For example, the string is tést.

I am given an encoded string like this: te%u0301st. The accent is encoded as %u0301. I need to convert this to the string t%C3%A9st, where the é is encoded as %C3%A9. How can I convert e%u0301 to %C3%A9 in JavaScript?

Thanks


You appear to be trying to normalize your input, probably to Unicode Normalization Form C (NFC). I do not know of any simple way to do this in JavaScript; you may need to implement the normalization algorithm yourself, or find a library that does so.
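
For what it's worth, engines that implement ES2015 or later expose String.prototype.normalize, which covers this case without a hand-written algorithm. A minimal sketch, assuming such an engine is available (the variable names are illustrative only):

```javascript
// Assumes an ES2015+ engine where String.prototype.normalize exists.
const decomposed = 'te\u0301st';              // 'e' followed by U+0301 COMBINING ACUTE ACCENT
const composed = decomposed.normalize('NFC'); // composes the pair into U+00E9

console.log(decomposed.length);        // 5
console.log(composed.length);          // 4
console.log(composed === 't\u00e9st'); // true
```

On older engines the method is missing, so a library or polyfill would still be needed there.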

edited to remove answer to the wrong question


If all you need is any URL-escaped Unicode encoding, this will do the trick:

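function convert(s) {
  // Turn one %uXXXX escape into the UTF-16 code unit it names.
  function parse(match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  }
  // Use [0-9a-f]: the range [0-f] would also match characters such as ':' and 'G'.
  return encodeURIComponent(s.replace(/%u([0-9a-f]{4})/gi, parse));
}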

convert('te%u0301st'); // => te%CC%81st

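If you specifically need Normalization Form C, note that the decomposed string 'te\u0301st' has length 5 in JavaScript, while the precomposed 'tést' has length 4. Without built-in support such as String.prototype.normalize (ES2015 and later), you would need to implement a whole lot of Unicode intelligence yourself.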
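
As a rough sketch of the NFC route, assuming an ES2015+ engine: decode the %uXXXX escapes, compose with normalize('NFC'), then percent-encode as UTF-8. The function name convertNFC is hypothetical and not part of the original answer.

```javascript
// Sketch only: relies on String.prototype.normalize (ES2015+).
// convertNFC is an illustrative name, not an established API.
function convertNFC(s) {
  // Replace each %uXXXX escape with the UTF-16 code unit it names.
  const decoded = s.replace(/%u([0-9a-f]{4})/gi, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
  // Compose base letter + combining mark, then emit UTF-8 percent-escapes.
  return encodeURIComponent(decoded.normalize('NFC'));
}

console.log(convertNFC('te%u0301st')); // t%C3%A9st
```

Note that encodeURIComponent also escapes any literal % left in the input, so a string that mixes %uXXXX with ordinary %XX escapes would need those decoded first.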
