I wish to create (in the R language) a "decimal HTML decoder" such as the one implemented in this website:
http://www.hashemian.com/tools/html-url-encode-decode.php
But I'm not sure where to start. Could someone suggest pointers on what to read, or which translation table (or formula) to use?
My original motivation for this is decoding Hebrew characters. (For example, translating something like this:
&#1513;&#1500;&#1493;&#1501;
To this:
שלום
)
(hat tip goes to Matt Shotwell for the pointers)
inp <- "&#x5E9;&#x5DC;&#x5D5;&#x5DD;"
nohash <- sub("#", "0", strsplit(inp, "&")[[1]]) # split on "&", then turn "#x..." into "0x..."
nohash
# [1] "" "0x5E9;" "0x5DC;" "0x5D5;" "0x5DD;"
strtoi( sub(";", "", nohash) ) # remove trailing ";" and convert to decimal
# [1] 0 1513 1500 1493 1501
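To go the rest of the way from code points to actual characters, something along these lines might work. It is only a sketch: `decode_numeric_refs` is a made-up helper name, it assumes a UTF-8 session, and it handles numeric references only (decimal `&#1513;` and hex `&#x5E9;`), not named entities such as `&amp;`. The bases are passed to `strtoi` explicitly rather than relying on prefix detection.

```r
# Hypothetical helper (not from the original post): decode decimal and
# hexadecimal numeric character references inside a single string.
decode_numeric_refs <- function(x) {
  # Find every reference of the form &#NNNN; or &#xHHHH;
  m <- gregexpr("&#([0-9]+|[xX][0-9A-Fa-f]+);", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(refs) {
    vapply(refs, function(ref) {
      body <- substr(ref, 3, nchar(ref) - 1)   # drop the leading "&#" and trailing ";"
      code <- if (grepl("^[xX]", body)) {
        strtoi(substring(body, 2), 16L)        # hex reference: skip the "x", parse base 16
      } else {
        strtoi(body, 10L)                      # decimal reference: parse base 10
      }
      intToUtf8(code)                          # code point -> one-character string
    }, character(1), USE.NAMES = FALSE)
  })
  x
}

decode_numeric_refs("&#1513;&#1500;&#1493;&#1501;")
decode_numeric_refs("&#x5E9;&#x5DC;&#x5D5;&#x5DD;")
```

Both calls should give back the same Hebrew word. For the vector computed above, `intToUtf8(c(1513, 1500, 1493, 1501))` does the same job directly once the leading 0 is dropped.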
Edit: the time for editing my comment has expired, so I'll add this link, which seems to have a conversion table: