multibyte identifiers list

Developer https://www.devze.com 2023-02-07 01:12 Source: Internet

I was looking into multi-byte characters and how they are used, but how many different identifiers/patterns are used for different multi-byte characters?

e.g. &nbsp;, &#nbsp;, U+0026, %20

How many different identifiers such as &, &#, U+, %, etc. are there?

I'm trying to check inputs: if they contain words that are more than 255 characters long, then it is probably multi-byte (a hack attempt), and then I can check whether the word, once split, contains a multi-byte identifier and stop the hack attempt.

%XX - a URL-encoded (percent-encoded) value for embedding into URLs, e.g. %20 is a space (ASCII hex 20, decimal 32)
&nbsp; - a named character entity, a non-breaking space in this case
U+0026 - a Unicode code point in hex notation, an & in this case
&#...; - a numbered character entity in decimal (base 10): &#38; = &
&#x...; - a numbered character entity in hex (base 16): &#x26; = &
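
To see how the notations listed above relate to actual characters, here is a minimal PHP sketch that decodes one example of each. The variable names are only illustrative, and it assumes PHP 7 or later (for the "\u{...}" string escape) with UTF-8 as the target encoding.

```php
<?php
// Flags shared by the entity examples: decode quotes too, use the HTML5 entity table.
$flags = ENT_QUOTES | ENT_HTML5;

// %20: URL percent-encoding, two hex digits per byte.
$space = rawurldecode('%20');

// &nbsp;: named HTML entity, decodes to U+00A0 (bytes C2 A0 in UTF-8).
$nbsp = html_entity_decode('&nbsp;', $flags, 'UTF-8');

// &#38; and &#x26;: numeric HTML entities, decimal and hex, both an ampersand.
$ampDec = html_entity_decode('&#38;', $flags, 'UTF-8');
$ampHex = html_entity_decode('&#x26;', $flags, 'UTF-8');

// U+0026 is a way of naming a code point, not an encoding found in input;
// in PHP source the same code point can be written with the \u{...} escape.
$ampEsc = "\u{0026}";

echo bin2hex($space), ' ', bin2hex($nbsp), ' ', $ampDec, $ampHex, $ampEsc, "\n";
```

Note that only the non-breaking space here ends up as more than one byte; the other notations are ASCII spellings of single-byte characters, so the marker alone does not tell you whether the decoded result is multi-byte.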

Are you trying to avoid homoglyph-based spoofing? Does "identifier" mean username here?

If yes, and if your users use a Latin alphabet, just allow only ASCII letters and numbers:

$identifier = preg_replace('#[^A-Za-z0-9]+#', '', $identifier);
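
As a rough sketch of how that one-liner behaves, the snippet below wraps it in a function and adds a stricter variant that rejects the input instead of silently rewriting it. Both function names (sanitizeIdentifier, isPlainAsciiIdentifier) are hypothetical, chosen only for this illustration, and the sample username uses a Cyrillic "а" (U+0430) as an assumed homoglyph example.

```php
<?php
// Hypothetical helper: strip every byte that is not an ASCII letter or digit.
function sanitizeIdentifier(string $identifier): string
{
    // preg_replace can return null on a regex error, so fall back to ''.
    return preg_replace('#[^A-Za-z0-9]+#', '', $identifier) ?? '';
}

// Hypothetical helper: accept the identifier only if it is already clean.
// \A and \z anchor the match to the whole string, including a trailing newline.
function isPlainAsciiIdentifier(string $identifier): bool
{
    return preg_match('#\A[A-Za-z0-9]+\z#', $identifier) === 1;
}

// "p" + Cyrillic "а" (U+0430) + "ypal" looks like "paypal" but is not ASCII.
$spoofed = "p\u{0430}ypal";

var_dump(sanitizeIdentifier($spoofed));      // the Cyrillic letter is removed
var_dump(isPlainAsciiIdentifier($spoofed));  // rejected
var_dump(isPlainAsciiIdentifier('admin42')); // accepted
```

Stripping changes what the user typed, which can make two different inputs collapse into the same identifier; rejecting the input and asking the user to choose another name may be the safer design for usernames.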
