When designing the database tables for storing simple User/Role information, can anyone tell me why it would be a bad idea to store the Role information directly on the User table (for example, Com开发者_Go百科ma-Separated in a Roles column).

Thoughts:

The database doesn't need to know about the roles, that's the UI's domain
The quicker the access to a specific user's roles the better
Sure, if sometime in the future I want access to all of the users for a specific role the query might

be a little slow, but who cares at that point?

Does this make any sense? Am I off my rocker? Wouldn't creating Roles and UserRole tables be overkill and add unnecessary sql and code overhead?

UPDATE:

To further illustrate my point... in code, I want to know if user "Steve" is in role "Administrator".

Option 1: query the UserRole table for a list of roles for user "Steve". Loop through that list and see if the RoleName matches "Administrator".

Option 2: split the csv in the User's Role property and see if the resultant list contains "Administrator"

UPDATE II:

I agree that my suggestion violates all sorts of "best practice" type thinking, particularly around DB design. However, I am not seeing how the "best practices" make any sense in this sort of scenario. I do like to rock the best practices boat now and then...I like to code in a way that seems smart, which means sometimes I need to understand more to know when I'm not being smart :)

Because it violates 3rd normal form. You want to seperate all of your entities as different objects which means you need a seperate table, as well as a relationship table.

Violation of data

If you keep all of this in one table you are placing too much irrelevant information about a user in a user's table. A user's table should have fields that pertain only to that user, such as their name, user account, etc. But in this case you've decided to throw in some role information. This doesn't make sense as you are adding attributes that do not necessairly have anything to do with a user.

The result is you start adding fields that do not pertain to a user, and you end up having a ton of unrelated information. The solution to this is have a users table like so:

User table
UserID
User
...

Role table
RoleID
Role
....

User Role Table (the relationship)
UserRoles
UserID
RoleID
...

Updating / Inserting of data

The next issue you have to deal with if you store role information inside of a user's table is how to perform valid updates, inserts. This makes it all the more difficult. When you edit a record, you have to ensure you edit the right value from the CSV.

Finding the right role

Here lies the most difficult problem, how to find the role given a user. You might come up with this great parsing technique in C# or SQL Server, it works great..but it becomes terribly slow and hard to read. You start dealing with SubString(), Left(), Right(), Len() and a whole slew of other functions just to parse out the role of a user.

The Solution

You may think that putting it all into one table is easier right now. It probably will take a lot less up front time. But you have to develop applications with the future in mind. The UI will be much simplier if you follow the rules of 3nf and create a nice relational structure. Not only will the UI admin screens look nice, but getting a role for a specific UserID will be so trivial, as opposed to parsing or searching...

I don't think it's a great plan. Suppose you're manually updating someone's roles and you type the name of a role slightly wrong? If you had a separate table, a database constraint would warn you. Suppose you decide to change the name of a role? If you had a separate table, you would only need to change it in one place.

Database normalization is done for good reasons; it's not just nitpicky. You wouldn't repeat key code in your codebase in more than one place; database denormalization is the equivalent.

EDITED TO ADD: You make the point that the application is ultimately going to make decisions based on the values returned by the database; e.g. granting certain options if the user has a role called "Admin". This is true, and it is another, separate place where the consistency of, for example, role names can go awry. I don't think denormalizing the database makes this less likely.

One good approach to help with this (and a good way to implement authorization in general) is to have a single location in code where the role is translated into certain general abilities (e.g. admins can read and write all entities, guests can read certain entities and can't write anything, etc.). Then, in the many places where you need to establish access, you check against an ability, rather than checking against a role.

That is to say, in a view, if you're deciding whether to show the "edit" button on the description of an item, you don't check by doing if role=='ADMIN' or role=='EDITOR', you check by doing if user.can_edit(item). Somewhere else you've established that admins and editors get the ability to edit items. See, for example, the approach that the Rails authorization system CanCan uses.

Using this approach, there's only one place where you're referencing the names of roles (e.g. in CanCan you have a class called "abilities" which defines all of the rules for who can do what, based on their roles. Everywhere else, you reference what abilities a user has to determine what they can do or see.

You need extra logic to parse your CSV at the code side.
Most ORM does not support CSV.
If you have a long role name (ex. UserWhoHaveAccessToXObjButNotYObjAndAtTheSameTimeCanDoZ) then you are wasting space repeating it for all the users.

Also, your first assumption does not hold for all situations. What if MY database need to know about roles?

It depends on whether you want the Database Management system to manage the data or you want the application to manage the data. In most cases, the DBMS does a better job, because that's what it's built for.

You said

So why not just have the roles defined in the application and only in the application?

You can do that reliably only if you can truthfully say, "I'm 100% certain that every programmer and every database administrator will manually preserve data integrity perfectly no matter what application touches this database, no matter what language it's programmed in, no matter how experienced or inexperienced the programmer is, and no matter how complex the database becomes, from now until the end of its life."

If you get to the point in your career where you manage a programming project, part of your job will be assigning coding responsibilities to the persons best suited to carry them out. At the architectural level, someone also assigns program responsibilities--like data integrity--to the module or subsystem best suited to carry them out. Data integrity can only be guaranteed by assigning that responsibility to the dbms. Assigning it to any other part of the system, whether code or human, is just an expression of hope, and hope doesn't scale well.