I just started learning about database normalization and I have a question about one of my tables. My database right now is structured horribly, and one of the reasons is because I have a table that looks like this.
Customers Table
ID | Date_Entered | First_Name | Middle_Name | Last_Name | Maiden_Name
...
Address__street_dmv | Address_city_dmv | Address_state_dmv | Address_zip_dmv
...
Address__street_source2 | Address_state_source2 | Address_city_source2 | etc
.
The addresses keeping going on and on because my company obtains address data from multiple sources. But, of course, some of these address will be Null for some of our customers. So I think I need a separate addresses table like this that connects to the Customers table.
.
Addresses
ID | Number | Street | State | Zip | Source (drop down menu)
But then I was thinking the source would be redundant data. So, do I need a separate sources table like this?
Sources
Source_ID | Source
And change the addresses table like this?
ID | Number | Street | State | Zip开发者_运维知识库 | Source _ID (drop down)
It doesn’t seem right because now the Source_ID is redundant… Please help.
Bonus points if you can tell me whether or not I should include Maiden and Middle names in the Customer table since these too could possibly be Null (If not, how would the new table be structured?)
Sorry for being a noob.
I would go with something like
Customer
ID | Date_Entered | First_Name | Middle_Name | Last_Name | Maiden_Name
Addresses
ID | Number | Street | State_ID | Zip
Customers_Address
ID | Customer_ID | Address_ID | Source_ID
This allows you to have same address from multiple sources. You might also want to have separate table for streets, possibly like
Table_Street (ID | State_ID | Name)
and then in the Addresses
table you would only have Street_ID
instead of both Street
and State_ID
. This also allows you to show a selection list of streets when user has selected state.
I'd say it is OK to have Maiden and Middle names in the Customer table, even if they are rarely used.
Part of your question has to do with normalization, and part of it does not. That doesn't mean part of your question isn't important. It just means it's important for reasons that have nothing to do with normalization.
Your addresses are essentially a repeating group, in one sense of that term. So it does make sense to remove them from Customers. (This has to do with normalization; repeating groups violates 1NF.)
"Source" is not redundant data, and deciding whether to substitute an ID number for text has nothing to do with normalization.
When you move a table from a lower normal form to a higher normal form, the original table ends up with fewer columns. Substituting an ID number for text doesn't change the number of columns.
And every column in which you substitute a meaningless ID number for text requires a join to get the meaningful text back. Following your same logic, you could also substitute meaningless ID numbers for street, state, and zip, but that would then require four joins to get meaningful data back.
I'm not an SQL expert but here is what I think you are trying to describe.
A customer which is an unique entity has a current address and can have many other addresses, if this is correct yes you should separate the additional addresses into their own table.
Secondly the way you discover that a customer has x amount of addresses is that you obtain this information different companies for each customer, if this is the case I would have a separate table for the companies and record this as you planned, yes you will have repeating rows of source_id but this would be the case as they supply information about many different customers.
In relation to the maiden and middle names are these required by your business rules if so store them when required.
Again my SQL development is only student level really but from what I understand this is how I would go about it.
Hope this helps and if anyone can give more expert information go with it.
you could also try the following approach:
Customer
CustomerID (PK) | Date_Entered | First_Name | Middle_Name | Last_Name | Maiden_Name
Addresses
CustomerId (PK)(FK) | SourceID(PK) | Number | Street | State| Zip
This assumes a one-to-many relationship between Customer and Addresses. It also eliminates the Customer_Address table altogether, in favor of using two tables (Customer and Addresses), and defining a compound primary key for the Addresses table as the CustomerId and the SourceID. In this model, the CustomerId and the SourceId uniquely determines the Number, Street, State, and Zip. It also enforces data integrity by ensuring that each customer can only have one address from each source. Let me know if this helps or if I'm way off base. I'm still learning!
精彩评论