
Why doesn't HTMLunit work on this https webpage?

Developer https://www.devze.com 2023-02-17 14:01 Source: Internet

I'm trying to learn more about HtmlUnit and am running some tests at the moment. I am trying to get basic information, such as the page title and text, from this site:

https://....com (removed the full url, important part is that it is https)

The code I use is this, which is working fine on other websites:

  final WebClient webClient = new WebClient();
  final HtmlPage page;
  page = (HtmlPage)webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
  System.out.println(page.getTitleText());
  System.out.println(page.asText());

Why can't I get this basic information? If it is because of security measures, what are the specifics, and can I bypass them? Thanks.

Edit: Hmm, the code stops working after webClient.getPage(); "test2" is never printed, so I cannot check whether page is null or not.

  final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_2);
  final HtmlPage page;
  System.out.println("test1");
  try {
      page = (HtmlPage)webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
      System.out.println("test2");
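
A plausible reason "test2" never prints is that getPage() throws before that line is reached. As a rough sketch in plain JDK (no HtmlUnit required), a helper like the one below could be called from a catch block to check whether a TLS/SSL error sits somewhere in the exception's cause chain. The names FailureDiagnosis, findSslFailure and describe are made up for illustration, and the exceptions in main are simulated stand-ins, not real output from this site.

```java
import java.io.IOException;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLHandshakeException;

// Hypothetical helper: not part of HtmlUnit or of the original question.
class FailureDiagnosis {

    /** Returns the first SSLException found in the cause chain, or null if there is none. */
    public static SSLException findSslFailure(Throwable t) {
        // Bound the walk so a self-referential cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (t instanceof SSLException) {
                return (SSLException) t;
            }
            t = t.getCause();
        }
        return null;
    }

    /** Produces a one-line summary suitable for printing from a catch block. */
    public static String describe(Throwable t) {
        SSLException ssl = findSslFailure(t);
        return ssl != null
                ? "TLS/SSL failure: " + ssl.getMessage()
                : "Other failure: " + t;
    }

    public static void main(String[] args) {
        // Simulated failures, standing in for whatever getPage() might throw.
        Throwable certProblem =
                new RuntimeException(new SSLHandshakeException("simulated certificate error"));
        Throwable networkProblem = new IOException("simulated timeout");

        System.out.println(describe(certProblem));
        System.out.println(describe(networkProblem));
    }
}
```

In the snippet above, that would mean adding a catch (Exception e) clause after the try block and printing FailureDiagnosis.describe(e), so the failure is reported instead of passing silently.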


I solved this by adding this line of code:

webClient.setUseInsecureSSL(true);

which is a deprecated way of disabling SSL certificate checking. In current HtmlUnit versions you have to use:

webClient.getOptions().setUseInsecureSSL(true);
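
To make concrete what "insecure SSL" means here: with the option on, the client accepts any server certificate, valid or not. The sketch below shows the same general idea in plain JDK terms, using a trust manager that never rejects a certificate. It is only an illustration under that assumption, not HtmlUnit's actual implementation, and the names InsecureSslSketch, TrustAllManager and insecureContext are hypothetical. Since it switches off protection against man-in-the-middle attacks, it only makes sense for tests against hosts you already trust.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical illustration: not taken from HtmlUnit's source.
class InsecureSslSketch {

    /** A trust manager that accepts every certificate chain without checking it. */
    static final class TrustAllManager implements X509TrustManager {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: never throws, so every client chain is accepted.
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // Intentionally empty: never throws, so every server chain is accepted.
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    }

    /** Builds a TLS context whose only trust manager is the trust-all one above. */
    public static SSLContext insecureContext() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // null key managers and null SecureRandom select the JDK defaults.
            context.init(null, new TrustManager[] { new TrustAllManager() }, null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create TLS context", e);
        }
    }

    public static void main(String[] args) {
        SSLContext context = insecureContext();
        System.out.println("Created context for protocol: " + context.getProtocol());
    }
}
```

A context like this skips the certificate validation that would otherwise make a connection fail, which is consistent with the setting fixing the problem on this site.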


I think that this is an authentication problem. If I go to that page in Firefox, I get a login box.

Try

webClient.setAuthentication(realm,username,password);

before the call to getPage().
