How would I go about normalizing a table that has duplicate tuples?
--------------------------------------
ID | Name | Email
--------------------------------------
1  | John | user@somedomainname
2  | John | user@somedomainname
In this case two users have the same name and email.
If a table has duplicate tuples, it can't have a primary key, and a primary key is required for first normal form.
Brax: First, look closely at the entities your table is describing. Duplicate information is a common sign that you're storing two (or more) entities in the same table. Then split these out. Write a query using group by, or distinct, or some application logic to find the unique values. Ensure this by using unique constraints where appropriate. Ensure these entities have a primary key.
Second: add foreign key columns to your existing table, so that it can form a relation to the new table(s) you just created. Populate these foreign key columns with the keys of the matching rows in the new table(s).
Third: Drop the columns containing the information you just offloaded to the separate entity table(s).
Since your question is very generic, so is this answer... but I hope it helps at least a little bit.
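As a rough illustration of the three steps above, here is a minimal sketch using Python's built-in sqlite3 module. The table names `account` and `person` and the column `person_id` are made up for the example and are not from the question. The last step rebuilds the table instead of dropping columns directly, only because older SQLite versions lack `ALTER TABLE ... DROP COLUMN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical starting point: the table from the question.
    CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO account VALUES (1, 'John', 'user@somedomainname');
    INSERT INTO account VALUES (2, 'John', 'user@somedomainname');

    -- Step 1: move the repeated (name, email) pairs into their own table,
    -- with a primary key and a unique constraint.
    CREATE TABLE person (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT NOT NULL,
        UNIQUE (name, email)
    );
    INSERT INTO person (name, email)
        SELECT DISTINCT name, email FROM account;

    -- Step 2: add a foreign key column and fill it from the new table.
    ALTER TABLE account ADD COLUMN person_id INTEGER REFERENCES person(id);
    UPDATE account SET person_id = (
        SELECT p.id FROM person AS p
        WHERE p.name = account.name AND p.email = account.email
    );

    -- Step 3: remove the offloaded columns by rebuilding the table.
    CREATE TABLE account_new (
        id        INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id)
    );
    INSERT INTO account_new SELECT id, person_id FROM account;
    DROP TABLE account;
    ALTER TABLE account_new RENAME TO account;
""")

people = conn.execute("SELECT id, name, email FROM person").fetchall()
accounts = conn.execute(
    "SELECT id, person_id FROM account ORDER BY id"
).fetchall()
print(people)
print(accounts)
```

After this runs, `person` holds one row for John and both `account` rows point at it.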
Make the fields which form the tuple a Primary Key.
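A small sketch of that idea, assuming a SQLite database accessed through Python's sqlite3 module (the table name `users` is just a placeholder): with a composite primary key on name and email, the second identical row is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        name  TEXT NOT NULL,
        email TEXT NOT NULL,
        PRIMARY KEY (name, email)  -- the tuple itself is the key
    )
    """
)
conn.execute("INSERT INTO users VALUES (?, ?)", ("John", "user@somedomainname"))

try:
    # Same (name, email) pair again: violates the primary key.
    conn.execute("INSERT INTO users VALUES (?, ?)", ("John", "user@somedomainname"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(duplicate_rejected, row_count)
```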
Are these users assumed to be the same person? (Perhaps user 1 registered, then maybe abandoned that email address or entered it incorrectly, then user 2 used the same details.) If they are assumed to be different people, you need to add additional data to uniquely identify them, e.g. date of registration.
Assuming that the two users are the same person, first decide which of the two rows you want to keep (for example, always choose the one with the lowest ID, so in this case it's 1). Then find all rows in other tables that reference the row you want to remove (in this case, it's 2). Update all of these foreign keys so they point to 1 instead. Then you can delete the row that has ID = 2.
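The merge described above could look roughly like this with Python's sqlite3 module. The `orders` table and its `user_id` column are hypothetical stand-ins for whatever tables reference the user rows in your schema, and `merge_users` is a helper name invented for this sketch.

```python
import sqlite3

def merge_users(conn, keep_id, drop_id):
    """Repoint references from drop_id to keep_id, then delete drop_id."""
    with conn:  # one transaction: both statements commit or neither does
        # Repeat this UPDATE for every table that references users.id.
        conn.execute(
            "UPDATE orders SET user_id = ? WHERE user_id = ?",
            (keep_id, drop_id),
        )
        conn.execute("DELETE FROM users WHERE id = ?", (drop_id,))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id)
    );
    INSERT INTO users VALUES (1, 'John', 'user@somedomainname');
    INSERT INTO users VALUES (2, 'John', 'user@somedomainname');
    INSERT INTO orders VALUES (10, 1);
    INSERT INTO orders VALUES (11, 2);
""")

merge_users(conn, keep_id=1, drop_id=2)

users = conn.execute("SELECT id, name, email FROM users").fetchall()
orders = conn.execute("SELECT id, user_id FROM orders ORDER BY id").fetchall()
print(users)
print(orders)
```

Running the update and the delete in one transaction means a failure part-way through does not leave rows pointing at a deleted user.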
Now that you have tidied up the data, you should amend your schema to prevent this from happening again: put a unique constraint on Name and Email (or even just on Email if you should not allow the same email to be used by two people).
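One way to add such a constraint to an existing table, sketched here for SQLite through Python's sqlite3 module, is a unique index. The index name `users_name_email_uq` is arbitrary, and other databases typically offer `ALTER TABLE ... ADD CONSTRAINT ... UNIQUE` for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'John', 'user@somedomainname')")

# Only succeeds once the existing duplicates have been cleaned up.
conn.execute("CREATE UNIQUE INDEX users_name_email_uq ON users (name, email)")

try:
    conn.execute("INSERT INTO users VALUES (2, 'John', 'user@somedomainname')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```

To forbid sharing an email address regardless of name, index only the `email` column instead.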