I understand the use of ETags for optimistic concurrency control (e.g. in a RESTful style of architecture), and I've read that ETags should be different for different representations of the same resource. Why is that?
Ultimately aren't we interested in knowing if the resource has changed so we can handle concurrent modifications? I'm having a hard time even envisioning when a resource's representation would change without the resource itself changing, so I'm obviously missing some basic understanding.
Good question, and I think it's a matter of some debate.
I think that most would say that the ETag represents not only the resource version, but also the content type. This would make sense for caching responses based on content type, language, etc.
Check out the following links:
- Wikipedia article section about strong vs weak ETags: http://en.wikipedia.org/wiki/HTTP_ETag#Strong_and_weak_validation
- An informative discussion on content encoding and ETags: http://www.gossamer-threads.com/lists/apache/dev/339577
This is not a matter of debate once you lay out the facts or read the HTTP and HTTPbis specs.
An ETag is a means of caching and concurrency control. Weak ETags are only a means of poor man's caching.
In terms of caching (GET): URI + content type + ETag lets you save bandwidth by responding with just a 304 status code instead of resending the payload.
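To make the caching case concrete, here is a minimal sketch in Python. The names `make_etag` and `conditional_get` are hypothetical helpers, not part of any framework, and the sketch assumes the client sends a single ETag in `If-None-Match` (real requests may carry a list or `*`).

```python
import hashlib


def make_etag(uri: str, content_type: str, payload: bytes) -> str:
    """Strong ETag over URI + content type + exact payload bytes (hypothetical helper)."""
    h = hashlib.sha256()
    for part in (uri.encode("utf-8"), content_type.encode("utf-8"), payload):
        # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return '"' + h.hexdigest() + '"'


def conditional_get(uri, content_type, payload, if_none_match=None):
    """Return (status, headers, body), honoring a single If-None-Match value."""
    etag = make_etag(uri, content_type, payload)
    if if_none_match == etag:
        # The client's cached copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Type": content_type}, payload
```

Because the content type is part of the hash, an ETag obtained for the JSON representation will not produce a 304 for the XML representation of the same resource.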
In terms of concurrency control (POST, PUT, PATCH): it is imperative to have the ETag calculated based on URI + content type + bit-exact response payload. Why?
- If you calculate the ETag based on a whole object, a superset of the response payload (i.e. your payload gives a+b, but the object is actually a+b+c), then a PATCH, for instance, will end up failing because the ETag changed. You refresh, you get the same data but a different ETag, you retry the PATCH with the new ETag, and now it works. FAIL
- If you calculate the ETag based on a subset of the payload, you are taking the user out of control of the conditions for the unsafe call, without any transparency at all. A PATCH will succeed even if the data associated with that ETag has changed, which is obviously not how the HTTP request was intended. FAIL
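The two failure modes can be reproduced with a small sketch. Everything here is illustrative: the record with fields a, b, c mirrors the a+b+c example, the helper names are made up, and the URI and content type are left out of the hash for brevity.

```python
import hashlib
import json


def digest(obj) -> str:
    """Hash a JSON-serializable object deterministically."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def payload_of(record: dict) -> dict:
    """The representation the client actually sees: only a and b."""
    return {"a": record["a"], "b": record["b"]}


def etag_superset(record: dict) -> str:
    return digest(record)  # covers a+b+c, more than the payload


def etag_subset(record: dict) -> str:
    return digest({"a": record["a"]})  # covers only a, less than the payload


def etag_exact(record: dict) -> str:
    return digest(payload_of(record))  # covers exactly what was sent


old = {"a": 1, "b": 2, "c": 3}
hidden_change = {"a": 1, "b": 2, "c": 4}   # only the unexposed field changed
visible_change = {"a": 1, "b": 9, "c": 3}  # an exposed field changed

# Superset: the ETag changes although the client-visible payload is identical.
spurious_failure = etag_superset(old) != etag_superset(hidden_change)
# Subset: the ETag stays the same although the client-visible payload changed.
missed_conflict = etag_subset(old) == etag_subset(visible_change)
print(spurious_failure, missed_conflict)
```

Only `etag_exact` changes if and only if the payload the client saw has changed.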
Conditional requests should be treated with semantics similar to "Given that my view of the world is still the same, then perform the request. Fail otherwise". My view of the world is made up of a past response (URI + headers + payload).
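A rough sketch of that "perform it only if my view is still current" rule for an unsafe request might look like this. The in-memory `store` dict and both function names are assumptions made for illustration, and answering 428 when `If-Match` is missing is one possible design choice rather than a requirement.

```python
import hashlib


def make_etag(uri: str, content_type: str, payload: bytes) -> str:
    """Strong ETag over URI + content type + exact payload bytes (hypothetical helper)."""
    h = hashlib.sha256()
    for part in (uri.encode("utf-8"), content_type.encode("utf-8"), payload):
        h.update(len(part).to_bytes(8, "big"))  # length prefix avoids ambiguous joins
        h.update(part)
    return '"' + h.hexdigest() + '"'


def conditional_put(store: dict, uri: str, content_type: str, new_payload: bytes, if_match):
    """Apply the update only if the client's ETag matches the current representation."""
    if uri not in store:
        return 404
    if if_match is None:
        return 428  # Precondition Required: refuse unconditional writes
    if if_match != make_etag(uri, content_type, store[uri]):
        return 412  # Precondition Failed: the client's view is stale
    store[uri] = new_payload
    return 204
```

A second writer that reuses an ETag from before the first update gets 412 and has to re-read the resource before retrying.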