Add utf-8 support to tcl

Developer https://www.devze.com 2023-02-25 02:37 Source: Internet
set botlisten(port) "3333"
set botlisten(password) "123456"
set botlisten(channel) "#chan"
# Accept connections on the port and hand each one to botlisten.
listen $botlisten(port) script botlisten
proc botlisten {idx} {
    # Route all further input on this connection to botlisten2.
    control $idx botlisten2
}
proc botlisten2 {idx args} {
    global botlisten newTorrentChannel
    set args [join $args]
    # The first word is the password; the rest is the message to relay.
    set botlisten(pass) [lindex [split $args] 0]
    set botlisten(message) [join [lrange [split $args] 1 end]]
    # Compare literally: string match would treat the supplied word as a glob pattern.
    if {[string equal $botlisten(pass) $botlisten(password)]} then {
        putquick "PRIVMSG $botlisten(channel) :$botlisten(message)"
    } else {
        putlog "Unauthorized person tried to connect to the bot"
    }
}

Let's say the message contains these characters: ąčęėįšųūž. The bot then outputs strange characters. So, in my opinion, the solution is to add UTF-8 support.


Tcl has had fully-integrated UTF-8 support for well over a decade (since Tcl 8.1, though nobody sane uses that version any more as there are monotonically better ones).

However, in general it is necessary to tell Tcl about what encoding is used on a particular communications channel with the outside world (with fconfigure's -encoding option). Tcl uses a default guess that is system dependent; on my system, it's actually UTF-8 but on others it is ISO 8859-1 or -15 or the appropriate Windows codepage. (Tcl's good at making default guesses BTW.) On sockets it's more awkward, since the encoding is something that's really a protocol-level decision (some protocols specify a particular encoding – SMTP does, IIRC – some switch encodings during the operation of the protocol – HTTP is a prime example of that – and some don't specify at all – IRC is the classic example of that). In some cases, the encoding command is necessary, so that scripts can take manual control over the conversion between byte sequences and characters. It's rather rare though.
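
As a rough illustration of the point about channel encodings, the sketch below writes three of the characters from the question to a scratch file as UTF-8 and reads the same bytes back twice, once with the right -encoding and once with the wrong one. The file name utf8-demo.txt and the variable names are made up for the example, and the characters are written as \u escapes so the result does not depend on how the script file itself is encoded.

```tcl
# Minimal sketch: the same bytes seen through two different channel encodings.
set text "\u0105\u010d\u0119"   ;# the characters ą č ę, written as escapes

# Hypothetical scratch file in the current directory.
set path [file join [pwd] utf8-demo.txt]

# Write the string, telling Tcl the channel carries UTF-8.
set out [open $path w]
fconfigure $out -encoding utf-8
puts -nonewline $out $text
close $out

# Read it back with the matching encoding: the original string comes back.
set in [open $path r]
fconfigure $in -encoding utf-8
set good [read $in]
close $in

# Read it back as ISO 8859-1: every byte becomes its own character.
set in [open $path r]
fconfigure $in -encoding iso8859-1
set bad [read $in]
close $in

file delete $path

puts [string equal $good $text]   ;# 1
puts [string length $bad]         ;# 6 (two bytes per character, each decoded separately)
```

The same fconfigure call works on a socket that your own Tcl code opened. It does not help when the host program (eggdrop here) owns the connection and never exposes the channel to the script.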

Of course, if the code being used is just taking Tcl's strings and pushing them blindly across the net using low-level networking (hellooo, eggdrop!) then there's not really all that much the general Tcl level can do. The workarounds in that case are either to build eggdrop to use a different encoding (as Zero's link from his comment says) or to use the encoding command to do the munging, like this:

Convert a normal string into UTF-8 encoded form:

set encoded [encoding convertto utf-8 $normalString]

Convert encoded UTF-8 back into a normal string:

set normalString [encoding convertfrom utf-8 $encoded]
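
To see what the two commands do, here is a small round trip using the characters from the question, again written as \u escapes so the script file's own encoding doesn't matter. The variable decoded is just a name picked for this sketch. Which direction an eggdrop script actually needs depends on how that eggdrop build handles strings, so treat this as a demonstration of the commands, not a drop-in fix.

```tcl
# The nine characters from the question: ąčęėįšųūž
set normalString "\u0105\u010d\u0119\u0117\u012f\u0161\u0173\u016b\u017e"

# Characters -> UTF-8 bytes.
set encoded [encoding convertto utf-8 $normalString]

# UTF-8 bytes -> characters.
set decoded [encoding convertfrom utf-8 $encoded]

puts [string length $normalString]           ;# 9 characters
puts [string length $encoded]                ;# 18 bytes, two per character
puts [string equal $decoded $normalString]   ;# 1
```

If the bot shows two odd characters for every accented letter, the bytes are most likely being treated as one character each somewhere along the way, which is the 18-versus-9 difference shown here.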