Database normalization question_问答_开发者_运维开发者技术经验分享

I just started learning about database normalization and I have a question about one of my tables. My database right now is structured horribly, and one of the reasons is because I have a table that looks like this.

Customers Table

ID |  Date_Entered   |  First_Name  |  Middle_Name |  Last_Name |    Maiden_Name

...

Address__street_dmv | Address_city_dmv  | Address_state_dmv |   Address_zip_dmv

...

Address__street_source2  |  Address_state_source2  |  Address_city_source2  | etc

The addresses keeping going on and on because my company obtains address data from multiple sources. But, of course, some of these address will be Null for some of our customers. So I think I need a separate addresses table like this that connects to the Customers table.

Addresses

ID  |   Number  |    Street    |  State  |   Zip  |    Source (drop down menu)

But then I was thinking the source would be redundant data. So, do I need a separate sources table like this?

Sources

Source_ID  |    Source

And change the addresses table like this?

ID  |   Number    |  Street    |  State  |   Zip开发者_运维知识库    |  Source _ID (drop down)

It doesn’t seem right because now the Source_ID is redundant… Please help.

Bonus points if you can tell me whether or not I should include Maiden and Middle names in the Customer table since these too could possibly be Null (If not, how would the new table be structured?)

Sorry for being a noob.

I would go with something like

Customer

ID |  Date_Entered  |  First_Name  |  Middle_Name | Last_Name | Maiden_Name

Addresses

ID  |   Number  |    Street    |  State_ID  |   Zip

Customers_Address

ID | Customer_ID | Address_ID | Source_ID

This allows you to have same address from multiple sources. You might also want to have separate table for streets, possibly like

Table_Street (ID | State_ID | Name)

and then in the Addresses table you would only have Street_ID instead of both Street and State_ID. This also allows you to show a selection list of streets when user has selected state.

I'd say it is OK to have Maiden and Middle names in the Customer table, even if they are rarely used.

Part of your question has to do with normalization, and part of it does not. That doesn't mean part of your question isn't important. It just means it's important for reasons that have nothing to do with normalization.

Your addresses are essentially a repeating group, in one sense of that term. So it does make sense to remove them from Customers. (This has to do with normalization; repeating groups violates 1NF.)

"Source" is not redundant data, and deciding whether to substitute an ID number for text has nothing to do with normalization.

When you move a table from a lower normal form to a higher normal form, the original table ends up with fewer columns. Substituting an ID number for text doesn't change the number of columns.

And every column in which you substitute a meaningless ID number for text requires a join to get the meaningful text back. Following your same logic, you could also substitute meaningless ID numbers for street, state, and zip, but that would then require four joins to get meaningful data back.

I'm not an SQL expert but here is what I think you are trying to describe.

A customer which is an unique entity has a current address and can have many other addresses, if this is correct yes you should separate the additional addresses into their own table.

Secondly the way you discover that a customer has x amount of addresses is that you obtain this information different companies for each customer, if this is the case I would have a separate table for the companies and record this as you planned, yes you will have repeating rows of source_id but this would be the case as they supply information about many different customers.

In relation to the maiden and middle names are these required by your business rules if so store them when required.

Again my SQL development is only student level really but from what I understand this is how I would go about it.

Hope this helps and if anyone can give more expert information go with it.

you could also try the following approach:

Customer

CustomerID (PK) | Date_Entered | First_Name | Middle_Name | Last_Name | Maiden_Name

Addresses

CustomerId (PK)(FK) | SourceID(PK) | Number | Street | State| Zip

This assumes a one-to-many relationship between Customer and Addresses. It also eliminates the Customer_Address table altogether, in favor of using two tables (Customer and Addresses), and defining a compound primary key for the Addresses table as the CustomerId and the SourceID. In this model, the CustomerId and the SourceId uniquely determines the Number, Street, State, and Zip. It also enforces data integrity by ensuring that each customer can only have one address from each source. Let me know if this helps or if I'm way off base. I'm still learning!