I have a legacy database table with a mixed encoding. Some lines are UTF-8 and some lines are ISO 8859-1.
Are there some heuristics I can apply on the content of a line to guess which encoding best represents the content?
Convert from UTF-8. If that fails then it's not UTF-8, so you should probably convert from Latin-1 instead.
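A minimal sketch of that fallback, assuming the mbstring extension is available for the validity check. The helper name `to_utf8` is made up for illustration and is not part of any library:

```php
<?php
// Hypothetical helper: returns $text as UTF-8, guessing the source encoding.
function to_utf8(string $text): string
{
    // Valid UTF-8 (which includes plain ASCII) is returned unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Not valid UTF-8: assume ISO 8859-1, where every byte maps to a character.
    return iconv('ISO-8859-1', 'UTF-8', $text);
}

var_dump(to_utf8("caf\xC3\xA9") === "caf\xC3\xA9"); // input was already UTF-8
var_dump(to_utf8("caf\xE9") === "caf\xC3\xA9");     // input was Latin-1
```

This works as a heuristic because non-ASCII Latin-1 text is rarely valid UTF-8 by accident. It can still misclassify a Latin-1 line whose high bytes happen to form valid UTF-8 sequences (for example the two bytes 0xC3 0xA9), so it is a best guess rather than a guarantee.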
Compare
iconv("UTF-8", "ISO-8859-1//IGNORE", $text)
and
iconv("UTF-8", "ISO-8859-1", $text)
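Both calls treat $text as UTF-8 input. If the results are equal (and not false), the text converted cleanly, so consider it UTF-8. If they differ, the text contains bytes that are not valid UTF-8, or characters that have no ISO 8859-1 equivalent, so it is most likely ISO 8859-1 already. Note that how //IGNORE handles invalid input depends on the iconv implementation, so test this on your own setup.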