For a web page that exists, but for which a user does not have sufficient privileges (they are not logged in or d开发者_如何学JAVAo not belong to the proper user group), what is the proper HTTP response to serve?

401 Unauthorized?

403 Forbidden?

Something else?

What I've read on each so far isn't very clear on the difference between the two. What use cases are appropriate for each response?

A clear explanation from Daniel Irvine ^{[original link]}:

There's a problem with 401 Unauthorized, the HTTP status code for authentication errors. And that’s just it: it’s for authentication, not authorization. Receiving a 401 response is the server telling you, “you aren’t authenticated–either not authenticated at all or authenticated incorrectly–but please reauthenticate and try again.” To help you out, it will always include a WWW-Authenticate header that describes how to authenticate.

This is a response generally returned by your web server, not your web application.

It’s also something very temporary; the server is asking you to try again.

So, for authorization I use the 403 Forbidden response. It’s permanent, it’s tied to my application logic, and it’s a more concrete response than a 401.

Receiving a 403 response is the server telling you, “I’m sorry. I know who you are–I believe who you say you are–but you just don’t have permission to access this resource. Maybe if you ask the system administrator nicely, you’ll get permission. But please don’t bother me again until your predicament changes.”

In summary, a 401 Unauthorized response should be used for missing or bad authentication, and a 403 Forbidden response should be used afterwards, when the user is authenticated but isn’t authorized to perform the requested operation on the given resource.

Another nice pictorial format of how http status codes should be used.

Edit: RFC2616 is obsolete, see RFC9110.

401 Unauthorized:

If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials.

403 Forbidden:

The server understood the request, but is refusing to fulfill it.

From your use case, it appears that the user is not authenticated. I would return 401.

Something the other answers are missing is that it must be understood that Authentication and Authorization in the context of RFC 2616 refers ONLY to the HTTP Authentication protocol of RFC 2617. Authentication by schemes outside of RFC2617 is not supported in HTTP status codes and are not considered when deciding whether to use 401 or 403.

Brief and Terse

Unauthorized indicates that the client is not RFC2617 authenticated and the server is initiating the authentication process. Forbidden indicates either that the client is RFC2617 authenticated and does not have authorization or that the server does not support RFC2617 for the requested resource.

Meaning if you have your own roll-your-own login process and never use HTTP Authentication, 403 is always the proper response and 401 should never be used.

Detailed and In-Depth

From RFC2616

10.4.2 401 Unauthorized

The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource. The client MAY repeat the request with a suitable Authorization header field (section 14.8).

and

10.4.4 403 Forbidden The server understood the request but is refusing to fulfil it. Authorization will not help and the request SHOULD NOT be repeated.

The first thing to keep in mind is that "Authentication" and "Authorization" in the context of this document refer specifically to the HTTP Authentication protocols from RFC 2617. They do not refer to any roll-your-own authentication protocols you may have created using login pages, etc. I will use "login" to refer to authentication and authorization by methods other than RFC2617

So the real difference is not what the problem is or even if there is a solution. The difference is what the server expects the client to do next.

401 indicates that the resource can not be provided, but the server is REQUESTING that the client log in through HTTP Authentication and has sent reply headers to initiate the process. Possibly there are authorizations that will permit access to the resource, possibly there are not, but let's give it a try and see what happens.

403 indicates that the resource can not be provided and there is, for the current user, no way to solve this through RFC2617 and no point in trying. This may be because it is known that no level of authentication is sufficient (for instance because of an IP blacklist), but it may be because the user is already authenticated and does not have authority. The RFC2617 model is one-user, one-credentials so the case where the user may have a second set of credentials that could be authorized may be ignored. It neither suggests nor implies that some sort of login page or other non-RFC2617 authentication protocol may or may not help - that is outside the RFC2616 standards and definition.

Edit: RFC2616 is obsolete, see RFC7231 and RFC7235.

  +-----------------------
  | RESOURCE EXISTS ? (if private it is often checked AFTER auth check)
  +-----------------------
    |       |
 NO |       v YES
    v      +-----------------------
   404     | IS LOGGED-IN ? (authenticated, aka user session)
   or      +-----------------------
   401        |              |
   403     NO |              | YES
   3xx        v              v
              401            +-----------------------
       (404 no reveal)       | CAN ACCESS RESOURCE ? (permission, authorized, ...)
              or             +-----------------------
             redirect          |            |
             to login       NO |            | YES
                               |            |
                               v            v
                               403          OK 200, redirect, ...
                      (or 404: no reveal)
                      (or 404: resource does not exist if private)
                      (or 3xx: redirection)

Checks are usually done in this order:

404 if resource is public and does not exist or 3xx redirection
OTHERWISE:
401 if not logged-in or session expired
403 if user does not have permission to access resource (file, json, ...)
404 if resource does not exist or not willing to reveal anything, or 3xx redirection

UNAUTHORIZED: Status code (401) indicating that the request requires authentication, usually this means user needs to be logged-in (session). User/agent unknown by the server. Can repeat with other credentials. NOTE: This is confusing as this should have been named 'unauthenticated' instead of 'unauthorized'. This can also happen after login if session expired. Special case: Can be used instead of 404 to avoid revealing presence or non-presence of resource (credits @gingerCodeNinja)

FORBIDDEN: Status code (403) indicating the server understood the request but refused to fulfill it. User/agent known by the server but has insufficient credentials. Repeating request will not work, unless credentials changed, which is very unlikely in a short time span. Special case: Can be used instead of 404 to avoid revealing presence or non-presence of resource (credits @gingerCodeNinja) in the case that revealing the presence of the resource exposes sensitive data or gives an attacker useful information.

NOT FOUND: Status code (404) indicating that the requested resource is not available. User/agent known but server will not reveal anything about the resource, does as if it does not exist. Repeating will not work. This is a special use of 404 (github does it for example).

As mentioned by @ChrisH there are a few options for redirection 3xx (301, 302, 303, 307 or not redirecting at all and using a 401):

Difference between HTTP redirect codes
How long do browsers cache HTTP 301s?
What is correct HTTP status code when redirecting to a login page?
What's the difference between a 302 and a 307 redirect?

According to RFC 2616 (HTTP/1.1) 403 is sent when:

The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead

In other words, if the client CAN get access to the resource by authenticating, 401 should be sent.

Assuming HTTP authentication (WWW-Authenticate and Authorization headers) is in use, if authenticating as another user would grant access to the requested resource, then 401 Unauthorized should be returned.

403 Forbidden is used when access to the resource is forbidden to everyone or restricted to a given network or allowed only over SSL, whatever as long as it is no related to HTTP authentication.

If HTTP authentication is not in use and the service has a cookie-based authentication scheme as is the norm nowadays, then a 403 or a 404 should be returned.

Regarding 401, this is from RFC 7235 (Hypertext Transfer Protocol (HTTP/1.1): Authentication):

3.1. 401 Unauthorized

The 401 (Unauthorized) status code indicates that the request has not been applied because it lacks valid authentication credentials for the target resource. The origin server MUST send a WWW-Authenticate header field (Section 4.4) containing at least one challenge applicable to the target resource. If the request included authentication credentials, then the 401 response indicates that authorization has been refused for those credentials. The client MAY repeat the request with a new or replaced Authorization header field (Section 4.1). If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user agent SHOULD present the enclosed representation to the user, since it usually contains relevant diagnostic information.

The semantics of 403 (and 404) have changed over time. This is from 1999 (RFC 2616):

10.4.4 403 Forbidden

The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.

In 2014 RFC 7231 (Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content) changed the meaning of 403:

6.5.3. 403 Forbidden

The 403 (Forbidden) status code indicates that the server understood the request but refuses to authorize it. A server that wishes to make public why the request has been forbidden can describe that reason in the response payload (if any).

If authentication credentials were provided in the request, the server considers them insufficient to grant access. The client SHOULD NOT automatically repeat the request with the same credentials. The client MAY repeat the request with new or different credentials. However, a request might be forbidden for reasons unrelated to the credentials.

An origin server that wishes to "hide" the current existence of a forbidden target resource MAY instead respond with a status code of 404 (Not Found).

Thus, a 403 (or a 404) might now mean about anything. Providing new credentials might help... or it might not.

I believe the reason why this has changed is RFC 2616 assumed HTTP authentication would be used when in practice today's Web apps build custom authentication schemes using for example forms and cookies.

401 Unauthorized: I don't know who you are. This an authentication error.
403 Forbidden: I know who you are, but you don't have permission to access this resource. This is an authorization error.

This is an older question, but one option that was never really brought up was to return a 404. From a security perspective, the highest voted answer suffers from a potential information leakage vulnerability. Say, for instance, that the secure web page in question is a system admin page, or perhaps more commonly, is a record in a system that the user doesn't have access to. Ideally you wouldn't want a malicious user to even know that there's a page / record there, let alone that they don't have access. When I'm building something like this, I'll try to record unauthenticate / unauthorized requests in an internal log, but return a 404.

OWASP has some more information about how an attacker could use this type of information as part of an attack.

This question was asked some time ago, but people's thinking moves on.

Section 6.5.3 in this draft (authored by Fielding and Reschke) gives status code 403 a slightly different meaning to the one documented in RFC 2616.

It reflects what happens in authentication & authorization schemes employed by a number of popular web-servers and frameworks.

I've emphasized the bit I think is most salient.

6.5.3. 403 Forbidden

The 403 (Forbidden) status code indicates that the server understood the request but refuses to authorize it. A server that wishes to make public why the request has been forbidden can describe that reason in the response payload (if any).

If authentication credentials were provided in the request, the server considers them insufficient to grant access. The client SHOULD NOT repeat the request with the same credentials. The client MAY repeat the request with new or different credentials. However, a request might be forbidden for reasons unrelated to the credentials.

An origin server that wishes to "hide" the current existence of a forbidden target resource MAY instead respond with a status code of 404 (Not Found).

Whatever convention you use, the important thing is to provide uniformity across your site / API.

These are the meanings:

401: User not (correctly) authenticated, the resource/page require authentication

403: User's role or permissions does not allow to access requested resource, for instance user is not an administrator and requested page is for administrators.

Note: Technically, 403 is a superset of 401, since is legal to give 403 for unauthenticated user too. Anyway is more meaningful to differentiate.

!!! DEPR: The answer reflects what used to be common practice, up until 2014 !!!

TL;DR

401: A refusal that has to do with authentication
403: A refusal that has NOTHING to do with authentication

Practical Examples

If apache requires authentication (via .htaccess), and you hit Cancel, it will respond with a 401 Authorization Required

If nginx finds a file, but has no access rights (user/group) to read/access it, it will respond with 403 Forbidden

RFC (2616 Section 10)

401 Unauthorized (10.4.2)

Meaning 1: Need to authenticate

The request requires user authentication. ...

Meaning 2: Authentication insufficient

... If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. ...

403 Forbidden (10.4.4)

Meaning: Unrelated to authentication

... Authorization will not help ...

More details:

The server understood the request, but is refusing to fulfill it.

It SHOULD describe the reason for the refusal in the entity

The status code 404 (Not Found) can be used instead

(If the server wants to keep this information from client)

they are not logged in or do not belong to the proper user group

You have stated two different cases; each case should have a different response:

If they are not logged in at all you should return 401 Unauthorized
If they are logged in but don't belong to the proper user group, you should return 403 Forbidden

Note on the RFC based on comments received to this answer:

If the user is not logged in they are un-authenticated, the HTTP equivalent of which is 401 and is misleadingly called Unauthorized in the RFC. As section 10.4.2 states for 401 Unauthorized:

"The request requires user authentication."

If you're unauthenticated, 401 is the correct response. However if you're unauthorized, in the semantically correct sense, 403 is the correct response.

I have created a simple note for you which will make it clear.

403 Forbidden vs 401 Unauthorized HTTP responses

In English:

401

You are potentially allowed access but for some reason on this request you were denied. Such as a bad password? Try again, with the correct request you will get a success response instead.

403

You are not, ever, allowed. Your name is not on the list, you won't ever get in, go away, don't send a re-try request, it will be refused, always. Go away.

401: You need HTTP basic auth to see this.

If the user just needs to log in using you site's standard HTML login form, 401 would not be appropriate because it is specific to HTTP basic auth.

403: This resource exists but you are not authorized to see it, and HTTP basic auth won't help.

I don't recommend using 403 to deny access to things like /includes, because as far as the web is concerned, those resources don't exist at all and should therefore 404.

In other words, 403 means "this resource requires some form of auth other than HTTP basic auth (such as using the web site's standard HTML login form)".

https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2

401: Who are you again?? (programmer walks into a bar with no ID or invalid ID)

403: Oh great, you again. I've got my eye on you. Go on, get outta here. (programmer walks into a bar they are 86'd from)

I think it is important to consider that, to a browser, 401 initiates an authentication dialog for the user to enter new credentials, while 403 does not. Browsers think that, if a 401 is returned, then the user should re-authenticate. So 401 stands for invalid authentication while 403 stands for a lack of permission.

Here are some cases under that logic where an error would be returned from authentication or authorization, with important phrases bolded.

A resource requires authentication but no credentials were specified.

401: The client should specify credentials.

The specified credentials are in an invalid format.

400: That's neither 401 nor 403, as syntax errors should always return 400.

The specified credentials reference a user which does not exist.

401: The client should specify valid credentials.

The specified credentials are invalid but specify a valid user (or don't specify a user if a specified user is not required).

401: Again, the client should specify valid credentials.

The specified credentials have expired.

401: This is practically the same as having invalid credentials in general, so the client should specify valid credentials.

The specified credentials are completely valid but do not suffice the particular resource, though it is possible that credentials with more permission could.

403: Specifying valid credentials would not grant access to the resource, as the current credentials are already valid but only do not have permission.

The particular resource is inaccessible regardless of credentials.

403: This is regardless of credentials, so specifying valid credentials cannot help.

The specified credentials are completely valid but the particular client is blocked from using them.

403: If the client is blocked, specifying new credentials will not do anything.

401 response means one of the following:

An access token is missing.
An access token is either expired, revoked, malformed, or invalid.

403 response on the other hand means that the access token is indeed valid, but that the user does not have appropriate privileges to perform the requested action.

Given the latest RFC's on the matter (7231 and 7235) the use-case seems quite clear (italics added):

401 is for unauthenticated ("lacks valid authentication"); i.e. 'I don't know who you are, or I don't trust you are who you say you are.'

401 Unauthorized

The 401 (Unauthorized) status code indicates that the request has not been applied because it lacks valid authentication credentials for the target resource. The server generating a 401 response MUST send a WWW-Authenticate header field (Section 4.1) containing at least one challenge applicable to the target resource.

If the request included authentication credentials, then the 401 response indicates that authorization has been refused for those credentials. The user agent MAY repeat the request with a new or replaced Authorization header field (Section 4.2). If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user agent SHOULD present the enclosed representation to the user, since it usually contains relevant diagnostic information.

403 is for unauthorized ("refuses to authorize"); i.e. 'I know who you are, but you don't have permission to access this resource.'

403 Forbidden

The 403 (Forbidden) status code indicates that the server understood the request but refuses to authorize it. A server that wishes to make public why the request has been forbidden can describe that reason in the response payload (if any).

If authentication credentials were provided in the request, the server considers them insufficient to grant access. The client SHOULD NOT automatically repeat the request with the same credentials. The client MAY repeat the request with new or different credentials. However, a request might be forbidden for reasons unrelated to the credentials.

An origin server that wishes to "hide" the current existence of a forbidden target resource MAY instead respond with a status code of 404 (Not Found).

I have a slightly different take on it from the accepted answer.

It seems more semantic and logical to return a 403 when authentication fails and a 401 when authorisation fails.

Here is my reasoning for this:

When you are requesting to be authenticated, You are authorised to make that request. You need to otherwise no one would even be able to be authenticated in the first place.

If your authentication fails you are forbidden, that makes semantic sense.

On the other hand the forbidden can also apply for Authorisation, but Say you are authenticated and you are not authorised to access a particular endpoint. It seems more semantic to return a 401 Unauthorised.

Spring Boot's security returns 403 for a failed authentication attempt

In the case of 401 vs 403, this has been answered many times. This is essentially a 'HTTP request environment' debate, not an 'application' debate.

There seems to be a question on the roll-your-own-login issue (application).

In this case, simply not being logged in is not sufficient to send a 401 or a 403, unless you use HTTP Auth vs a login page (not tied to setting HTTP Auth). It sounds like you may be looking for a "201 Created", with a roll-your-own-login screen present (instead of the requested resource) for the application-level access to a file. This says:

"I heard you, it's here, but try this instead (you are not allowed to see it)"