How to program a "decimal HTML decoder"?

Developer https://www.devze.com 2023-01-30 07:33 Source: Internet
I wish to create (in the R language) a "decimal HTML decoder" such as the one implemented in this website:

http://www.hashemian.com/tools/html-url-encode-decode.php

But I'm not sure where to start. Could someone suggest pointers on what to read, or which translation table (or formula) to use?

My original motivation for this is decoding Hebrew characters (for example, translating something like this:

&#1513;&#1500;&#1493;&#1501;

To this:

שלום

)

(hat tip goes to Matt Shotwell for the pointers)


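inp <- "&#x5E9;&#x5DC;&#x5D5;&#x5DD;"
# Split on "&" and turn the leading "#" into "0", so "#x5E9;" becomes "0x5E9;"
nohash <- sub("#", "0", strsplit(inp, "&")[[1]])
nohash <- nohash[nohash != ""]  # drop the empty piece before the first "&"
nohash
# [1] "0x5E9;" "0x5DC;" "0x5D5;" "0x5DD;"
# Remove the trailing ";" and convert to decimal; base = 16L makes the hex parsing explicit
strtoi(sub(";", "", nohash), base = 16L)
# [1] 1513 1500 1493 1501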
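
Going one step further than the snippet above, here is a rough sketch of a decoder that turns the numbers back into characters. It assumes the input contains only numeric entities, either decimal (like "&#1513;") or hexadecimal (like "&#x5E9;"), and that each number is simply the Unicode code point of the character, so no separate translation table should be needed. The function name decode_numeric_entities is made up for this example; named entities such as "&amp;" are not handled.

```r
# Hypothetical helper (not from any package): decode numeric HTML entities,
# decimal ("&#1513;") or hexadecimal ("&#x5E9;"), in a single string.
decode_numeric_entities <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  pattern <- "&#([0-9]+|[xX][0-9A-Fa-f]+);"
  m <- gregexpr(pattern, x)
  entities <- regmatches(x, m)[[1]]
  if (length(entities) == 0L) return(x)  # nothing to decode

  # Strip the "&#" prefix and the ";" suffix, leaving "1513" or "x5E9"
  body <- substr(entities, 3L, nchar(entities) - 1L)
  is_hex <- grepl("^[xX]", body)

  # Assumption: each number is a Unicode code point
  codes <- integer(length(body))
  codes[is_hex] <- strtoi(substring(body[is_hex], 2L), base = 16L)
  codes[!is_hex] <- as.integer(body[!is_hex])

  # multiple = TRUE gives one string per code point, one per matched entity
  regmatches(x, m) <- list(intToUtf8(codes, multiple = TRUE))
  x
}

decode_numeric_entities("&#1513;&#1500;&#1493;&#1501;")
decode_numeric_entities("&#x5E9;&#x5DC;&#x5D5;&#x5DD;")
```

On a UTF-8 console both calls should print the Hebrew word from the question. Text around the entities is left untouched, so mixed input such as "a&#65;b" should also work.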

Edit: the time for editing my comment has expired, so I'll add this link, which seems to have a conversion table:
