How do I extract web wiki pages, which are password protected?

开发者 https://www.devze.com 2022-12-18 03:22 Source: web
I want to download a few web pages, and the pages they link to, from a password-protected wiki. I have the user name and password and can open the pages in a normal browser. Because I want to save these pages to my local drive for later reference, I am using wget to fetch them:

wget --http-user=USER --http-password=PASS http://mywiki.mydomain.com/myproject

But this does not work: the site asks for the password again. Is there a better way to do this that avoids the repeated password prompt? Also, what is the best way to fetch all the links and sub-links on a particular page and store them in a single folder?

Update: The page I am trying to access is behind an HTTPS gateway, and its certificate is not being validated. Is there any way to get through this?

mysystem-dsktp ~ $ wget --http-user=USER --http-password=PASS https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f
--2010-01-24 18:09:21--  https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f
Resolving secure.site.mydomain.com... 124.123.23.12, 124.123.23.267, 124.123.102.191, ...
Connecting to secure.site.mydomain.com|124.123.23.12|:443... connected.
ERROR: cannot verify secure.site.mydomain.com's certificate, issued by `/C=US/O=Equifax/OU=Equifax Secure Certificate Authority':
  Unable to locally verify the issuer's authority.
To connect to secure.site.mydomain.com insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.

I also tried the --no-check-certificate option, but it does not work either: I get only the login page, not the page I requested.


Could you try passing the credentials in the URL, like this?

wget http://USER:PASSWD@mywiki.mydomain.com/myproject


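It seems you're trying to access a page secured by a login form. wget's --http-user and --http-password options only handle HTTP authentication, so they have no effect on a form login.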

You could use the --no-check-certificate option and follow the suggestions in this forum thread: Can't log in with wget.
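A form login usually sets a session cookie, so one possible approach is to submit the form once with wget, save the cookie, and reuse it for a recursive download. Below is a minimal sketch of that idea, not a tested recipe for this particular site. The form field names (username, password), the cookie file name, the output folder wiki-pages, and the recursion depth are all assumptions; the real field names have to be read from the login page's HTML. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: log in through an HTML form, then fetch the wiki pages with the session cookie.
# ASSUMPTIONS (not from the question): form fields "username"/"password",
# cookie file "cookies.txt", output folder "wiki-pages", recursion depth 2.

LOGIN_URL='https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f'
WIKI_URL='http://mywiki.mydomain.com/site/myproject/'
COOKIES='cookies.txt'

# Print the command (unquoted, for reading only); execute it only when DRY_RUN=0.
run() {
    printf '%s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Step 1: POST the login form and save the cookies, including session cookies.
login() {
    run wget --no-check-certificate \
        --save-cookies "$COOKIES" --keep-session-cookies \
        --post-data 'username=USER&password=PASS' \
        -O /dev/null "$LOGIN_URL"
}

# Step 2: reuse the cookies to fetch the page and the pages it links to,
# staying below the start URL and writing every file into one folder.
mirror() {
    run wget --no-check-certificate --load-cookies "$COOKIES" \
        --recursive --level=2 --no-parent \
        --page-requisites --convert-links \
        --no-directories --directory-prefix=wiki-pages \
        "$WIKI_URL"
}

login && mirror
```

--no-directories together with --directory-prefix places every downloaded file in a single folder, which covers the "single folder" part of the question. If the gateway's CA certificate is available as a file, passing it with --ca-certificate is a safer choice than --no-check-certificate.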
