Why is one cookie missed?

开发者 (Developer) https://www.devze.com 2023-02-24 09:02 Source: web

I'm scraping a page that is the result of a redirect: I visit page1 first, and it redirects to page2 via http-equiv="refresh". I'm scraping page2. The content of page2 depends on some cookies that page1 sets. I can see that page1 returns 2 cookies, but when I request page2 (sending the same CookieContainer), one cookie is missing. What's wrong in my code?

Thank you:

First, I create a CookieContainer and an HttpWebRequest and request page1:

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(eQuery);
req.AllowAutoRedirect = true; // follows HTTP 3xx redirects only; it does not follow the meta refresh
req.CookieContainer = cookiesContainer;

This is the response I get from page1:

HTTP/1.1 200 OK
Date: Tue, 12 Apr 2011 19:14:06 GMT
Server: (...)
Set-Cookie: NAME1=VALUE1; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: NAME2=VALUE2; expires=Wed, 13-Apr-2011 19:14:06 GMT
Content-Length: 174
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html

(...)

Everything is fine so far: the response sets two cookies and I get two cookie objects in the container.

Then I parse the "content" value of the meta http-equiv tag to get the next URL, and request it with similar code and the same container. But only one cookie is sent. Here is the HTTP request that goes out:

GET DETECTED_URL_IN_HTTP_EQUIV_REFRESH HTTP/1.1
User-Agent: (...)
Host: example.com
Cookie: NAME1=VALUE1

As you can see, cookie NAME2 is missing. Why is that happening? Is it related to the differences between the two cookies (one has a path, the other has an expiration date)? Any idea how I can send both cookies?

PS: I don't have access to page1, so I can't set the path or expiration of the cookies. I'm only scraping those pages.

Thank you.


If a cookie does not specify a path, it defaults to the path of the request that set it. So if you received a cookie with no path declaration in response to this request:

http://contoso.com/subfolder/test.aspx

The browser would only send that cookie back for further requests under the /subfolder/ directory. To have it sent back for all paths, the cookie must be set with path=/. That matches what you see: NAME1 has path=/ and is sent to page2, while NAME2 has no path and is not.
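Since you cannot change how page1 sets its cookies, one possible workaround is to re-add the path-less cookies to the container with Path="/" before requesting page2. The following is only a minimal sketch under stated assumptions: the two URLs are made up for illustration (the question does not give the real paths), WidenCookiePaths is a hypothetical helper rather than part of any library, and the expires attribute is left out because the date in the question is already in the past. The exact default path that CookieContainer assigns may vary by runtime version, but in either case it is narrower than "/".

```csharp
using System;
using System.Linq;
using System.Net;

public static class CookiePathDemo
{
    // Hypothetical helper: re-adds every cookie visible at `source` with Path="/",
    // so it is also sent to other paths on the same host.
    public static void WidenCookiePaths(CookieContainer container, Uri source)
    {
        // Take a snapshot first so the loop never iterates a collection being modified.
        var snapshot = container.GetCookies(source).Cast<Cookie>().ToList();
        foreach (Cookie c in snapshot)
        {
            if (c.Path == "/") continue; // already valid for the whole site
            var widened = new Cookie(c.Name, c.Value, "/") { Expires = c.Expires };
            container.Add(source, widened); // domain defaults to the host of `source`
        }
    }

    public static void Main()
    {
        // Assumed URLs, for illustration only.
        var page1 = new Uri("http://example.com/subfolder/page1.php");
        var page2 = new Uri("http://example.com/other/page2.php");

        var container = new CookieContainer();
        // Simulates the two Set-Cookie headers returned by page1.
        container.SetCookies(page1, "NAME1=VALUE1; path=/");
        container.SetCookies(page1, "NAME2=VALUE2"); // no path: defaults to page1's path

        // Before the workaround only NAME1 matches page2.
        Console.WriteLine(container.GetCookieHeader(page2));

        WidenCookiePaths(container, page1);

        // After the workaround both cookies match page2.
        Console.WriteLine(container.GetCookieHeader(page2));
    }
}
```

In the scraper, the call to WidenCookiePaths would go right after reading the page1 response and before creating the request for page2, passing the same container and the page1 URI.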
