Problem reading the <TITLE> tag from a web page in Java

Developer https://www.devze.com 2023-03-06 08:24 Source: web

I am using the JTidy parser to parse a web page. It works, sort of:

InputStream in=new URL("http://www.medicinenet.com/alopecia_areata/article.htm").openStream();
Document doc= new Tidy().parseDOM(in, null);
String titleText=doc.getElementsByTagName("title").item(0).getFirstChild().getNodeValue();

It works fine for <title>...</title>, but the page at the URL I passed has its title tag in capital letters, <TITLE>...</TITLE>, so it returns null.

How can I read both <TITLE>...</TITLE> and <title>...</title> in one statement in Java? Please help.


Just check whether the lowercase lookup found anything, then fall back to uppercase:

NodeList titles = doc.getElementsByTagName("title");
// item(0) is null when nothing matched, so test the list before dereferencing it
if (titles.getLength() == 0) titles = doc.getElementsByTagName("TITLE");
String titleText = titles.item(0).getFirstChild().getNodeValue();

getElementsByTagName is case sensitive, so this is the simplest option.
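
If you would rather not list every spelling (a page could also use <Title>), another option is to walk all elements and compare names while ignoring case. Below is a minimal sketch of that idea, not a tested JTidy recipe. The class TitleLookup and its methods findTitle and parse are names I made up for illustration, and parse uses the JDK's built-in XML parser as a stand-in for JTidy so the example runs without extra libraries. I am assuming the Document returned by Tidy.parseDOM accepts the "*" wildcard in the same way.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of JTidy or the JDK.
final class TitleLookup {

    /** Returns the text of the first element named "title" in any letter case, or null if there is none. */
    static String findTitle(Document doc) {
        // "*" asks the DOM for every element, so the name comparison is done by hand.
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Node node = all.item(i);
            if ("title".equalsIgnoreCase(node.getNodeName())) {
                Node text = node.getFirstChild();
                // An empty <title/> has no child node, so return "" instead of failing.
                return text == null ? "" : text.getNodeValue();
            }
        }
        return null;
    }

    /** Demo-only stand-in for JTidy: parses well-formed markup with the JDK parser. */
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse demo markup", e);
        }
    }

    public static void main(String[] args) {
        // prints "Upper"
        System.out.println(findTitle(parse("<HTML><HEAD><TITLE>Upper</TITLE></HEAD></HTML>")));
        // prints "lower"
        System.out.println(findTitle(parse("<html><head><title>lower</title></head></html>")));
    }
}
```

With the question's code this would be called as findTitle(doc), where doc comes from new Tidy().parseDOM(in, null). It returns null rather than throwing when the page has no title element at all.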
