Update: I upgraded wget from 1.10 to 1.12, and that solved the problem.
For example
www.example.com/level1/level2/../test.html
For this URL, both wget and a browser will visit
www.example.com/level1/test.html
But for
www.example.com/../test.html
wget will visit
www.example.com/../test.html
whereas a browser will visit
www.example.com/test.html
I was using wget to fetch some web pages in order to get the size of each page and of the elements inside it. I found that some pages reference resources as "../css/xxx.jpg" instead of "css/xxx.jpg". These pages load fine in a browser, but not with wget.
Is there a way to solve this? Thank you.
Before passing URLs to wget, trim any "../" from the beginning of the path. (Splitting the URLs into components would help.)
How to do this depends on what language or framework you are using.
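As one possible illustration, here is a minimal sketch in Python (an assumed choice; the question does not say which language is in use). It splits the URL into components and resolves "." and ".." path segments without ever stepping above the site root, which matches the browser behaviour described in the question. The function name normalize_url is hypothetical, and the sketch assumes absolute URLs that include a scheme such as http://.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Resolve "." and ".." path segments, never going above the root.

    Hypothetical helper; assumes an absolute URL with a scheme.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")

    resolved = []
    for seg in segments:
        if seg == "..":
            # Go up one level, but keep the leading empty segment
            # that represents the root "/".
            if len(resolved) > 1:
                resolved.pop()
        elif seg != ".":
            resolved.append(seg)

    # A path ending in "." or ".." refers to a directory, so keep
    # the trailing slash.
    if segments[-1] in (".", ".."):
        resolved.append("")

    return urlunsplit(parts._replace(path="/".join(resolved)))


if __name__ == "__main__":
    print(normalize_url("http://www.example.com/level1/level2/../test.html"))
    print(normalize_url("http://www.example.com/../test.html"))
```

The normalized URL can then be passed to wget in place of the original one. The query string and fragment are left untouched.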