HTML entity decoding in Java: apostrophe

Developer https://www.devze.com 2023-01-21 07:17 Source: Internet
I have to decode, using Java, HTML strings which contain the following entities: "&#39;" and "&apos;". I'm using Apache Commons Lang, but it doesn't decode those two entities, so I'm currently doing the following. It works, but I'm looking for the fastest way to do this.

import org.apache.commons.lang.StringEscapeUtils;

public class StringUtil {

        public static String decodeHTMLString(String s) {
            // Replace the apostrophe entities by hand, then let Commons Lang decode the rest.
            return StringEscapeUtils.unescapeHtml(s.replace("&amp;#39;", "'").replace("&apos;", "'"));
        }

}

I searched for older questions, but none seems to answer my question.


Well, I would imagine that part of the problem is that one of your entities is double-encoded: "&amp;#39;". A single decoding pass turns that into "&#39;", not an apostrophe, so no decoder will give you an apostrophe from it in one pass.

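As for "&apos;", apparently that one is not technically part of the HTML entity set: it is defined by XML (and XHTML) but not by HTML 4, which would explain why unescapeHtml leaves it alone.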
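
If all you need is the apostrophe handling, here is a minimal sketch that covers the three spellings discussed here ("&#39;", the double-encoded "&amp;#39;", and "&apos;") in one regex pass instead of chained replace calls. The class name ApostropheDecoder and the method decodeApostrophes are made up for illustration, and it uses only the JDK; I have not benchmarked it against the chained version, so treat any speed difference as an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Commons Lang): decodes only apostrophe entities.
final class ApostropheDecoder {

    // Matches "&#39;", the double-encoded "&amp;#39;", and "&apos;".
    // Compiled once so repeated calls do not pay the compilation cost.
    private static final Pattern APOSTROPHE_ENTITY =
            Pattern.compile("&(?:amp;)?#39;|&apos;");

    private ApostropheDecoder() {
    }

    // Returns s with every matched entity replaced by a plain apostrophe.
    // All other entities (for example "&amp;" or "&lt;") are left untouched.
    static String decodeApostrophes(String s) {
        return APOSTROPHE_ENTITY.matcher(s).replaceAll("'");
    }
}
```

You would then pass the result to StringEscapeUtils.unescapeHtml as before to decode the remaining entities, e.g. StringEscapeUtils.unescapeHtml(ApostropheDecoder.decodeApostrophes(s)).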
