How to normalize live database

Developer https://www.devze.com 2023-02-17 03:10 Source: web
I need to normalize a data structure. I have one table with a lot of redundant data (42 columns).

A few example columns:

files_shit (id, filename String, upload_user, user_name, tags text, ....)

and I want to create three tables: files, users and tags.

I have almost 30 000 records.

What is the best way to copy the data from files_shit to files, users and tags and create the references? (Between tags and files there will be another table, file_tags.)


First, you cannot convert this table in place; you will have to create new ones. A simple way is to treat the existing table as a staging table: create the new tables, then select from the staging table and insert into them.

You will have to identify the primary key for each table. Then fill up the tables (you may have to work out which table to fill first for reasons of referential integrity, etc.).

Pseudocode, e.g.: INSERT INTO files (columns...) SELECT <files columns> FROM files_shit GROUP BY primary_column;

(Note: this means you will use the existing primary column(s) as the primary key. If you want to use autogenerated integers (optimal), you will have to perform lookups.)
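The staging-table copy above could be sketched roughly like this. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it is self-contained; the cut-down column list, the sample rows, and the assumption that upload_user uniquely identifies a user are illustrative and not taken from the real 42-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
-- Staging table: a cut-down stand-in for the original wide table.
CREATE TABLE files_shit (
    id          INTEGER PRIMARY KEY,
    filename    TEXT,
    upload_user TEXT,
    user_name   TEXT
);
INSERT INTO files_shit VALUES
    (1, 'a.txt', 'u1', 'Alice'),
    (2, 'b.txt', 'u1', 'Alice'),
    (3, 'c.txt', 'u2', 'Bob');

-- New tables keyed on columns that already exist in the staging table
-- (assumption: upload_user is unique per user).
CREATE TABLE users (
    upload_user TEXT PRIMARY KEY,
    user_name   TEXT
);
CREATE TABLE files (
    id          INTEGER PRIMARY KEY,
    filename    TEXT,
    upload_user TEXT REFERENCES users(upload_user)
);

-- Parent table first: one row per distinct user.
-- MIN() just picks one name if the redundant copies disagree.
INSERT INTO users (upload_user, user_name)
SELECT upload_user, MIN(user_name)
FROM files_shit
GROUP BY upload_user;

-- Then the child rows that reference it.
INSERT INTO files (id, filename, upload_user)
SELECT id, filename, upload_user
FROM files_shit;
""")

users = conn.execute(
    "SELECT upload_user, user_name FROM users ORDER BY upload_user"
).fetchall()
file_count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```

Filling users before files is what keeps the foreign key satisfied; reversing the two INSERT statements would fail with foreign keys enabled.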

A lot depends on the new schema and its relations (which you haven't defined clearly here). Hope this helps.

EDIT: Lookups

You will have an INT id field for each table, e.g. file_id. These will be system generated (mostly auto_increment). In simple words, this information is not in your current table. So when you add a file to the file table and it gets a file_id, you will have to 'look up' the id for this file to add it to the user table, in order to satisfy your foreign key relationships (based on how they exist). Simple example: try adding additional file_id/tag_id columns to your main table.

Fill the tag table first (basically, start with the tables that don't refer to any other).

Fill the main table's tag_id for each row by joining to the tag table (the lookup):

UPDATE <mainTable> mT JOIN tag_table tT ON mT.tag_pk_column = tT.tag_pk_column
SET mT.tag_id = tT.tag_id;

Now insert into files ... SELECT file_pk_col, tag_id FROM <mainTable> GROUP BY file_pk_col

This is an example lookup for the tag table.
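As a rough sketch of the lookup steps above, again using an in-memory SQLite database via Python's sqlite3 so it runs on its own. Several things here are assumptions rather than facts from the question: the lookup is shown for users (the same pattern applies to tags), the tags column is assumed to hold comma-separated names, the staging id is reused as file_id, and SQLite needs a correlated subquery where MySQL would accept UPDATE ... JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE files_shit (
    id INTEGER PRIMARY KEY, filename TEXT,
    upload_user TEXT, user_name TEXT, tags TEXT
);
INSERT INTO files_shit VALUES
    (1, 'a.txt', 'u1', 'Alice', 'red,blue'),
    (2, 'b.txt', 'u1', 'Alice', 'blue'),
    (3, 'c.txt', 'u2', 'Bob',   '');

CREATE TABLE users (
    user_id     INTEGER PRIMARY KEY AUTOINCREMENT,
    upload_user TEXT UNIQUE,
    user_name   TEXT
);
CREATE TABLE files (
    file_id  INTEGER PRIMARY KEY,
    filename TEXT,
    user_id  INTEGER REFERENCES users(user_id)
);
CREATE TABLE tags (
    tag_id INTEGER PRIMARY KEY AUTOINCREMENT,
    name   TEXT UNIQUE
);
CREATE TABLE file_tags (
    file_id INTEGER REFERENCES files(file_id),
    tag_id  INTEGER REFERENCES tags(tag_id),
    PRIMARY KEY (file_id, tag_id)
);

-- 1. Fill the table that refers to nothing else; ids are generated here.
INSERT INTO users (upload_user, user_name)
SELECT upload_user, MIN(user_name) FROM files_shit GROUP BY upload_user;

-- 2. Add a lookup column to the staging table and fill it from users.
ALTER TABLE files_shit ADD COLUMN user_id INTEGER;
UPDATE files_shit SET user_id = (
    SELECT u.user_id FROM users u
    WHERE u.upload_user = files_shit.upload_user
);

-- 3. Insert the files using the looked-up id.
INSERT INTO files (file_id, filename, user_id)
SELECT id, filename, user_id FROM files_shit;
""")

# 4. Split the assumed comma-separated tags and build the junction table.
rows = conn.execute("SELECT id, tags FROM files_shit").fetchall()
for file_id, tag_text in rows:
    names = sorted({t.strip() for t in (tag_text or "").split(",") if t.strip()})
    for name in names:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO file_tags (file_id, tag_id) "
            "SELECT ?, tag_id FROM tags WHERE name = ?",
            (file_id, name),
        )
conn.commit()

file_users = conn.execute(
    "SELECT f.filename, u.user_name FROM files f "
    "JOIN users u ON u.user_id = f.user_id ORDER BY f.file_id"
).fetchall()
tag_pairs = conn.execute(
    "SELECT f.filename, t.name FROM file_tags ft "
    "JOIN files f ON f.file_id = ft.file_id "
    "JOIN tags t ON t.tag_id = ft.tag_id ORDER BY f.filename, t.name"
).fetchall()
```

The generated user_id and tag_id values never need to be known in advance; each is fetched by matching on the old natural key, which is the whole point of the lookup.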


The simplest way is to take the database offline, create new tables, including all the required constraints, and use INSERT INTO . . . SELECT column_list FROM old_table to populate the new tables. Some data probably won't satisfy the constraints in the new tables; you'll have to fix that.
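A minimal sketch of what the constraint problem can look like, using SQLite through Python's sqlite3 so it is runnable as-is. The NOT NULL constraints, the sample rows, and the placeholder-filename fix are all hypothetical; how bad rows should really be repaired depends on the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files_shit (id INTEGER PRIMARY KEY, filename TEXT, upload_user TEXT);
INSERT INTO files_shit VALUES
    (1, 'a.txt', 'u1'),
    (2, NULL,    'u1'),   -- will violate NOT NULL in the new table
    (3, 'c.txt', 'u2');

CREATE TABLE files (
    file_id     INTEGER PRIMARY KEY,
    filename    TEXT NOT NULL,
    upload_user TEXT NOT NULL
);
""")

copy_sql = (
    "INSERT INTO files (file_id, filename, upload_user) "
    "SELECT id, filename, upload_user FROM files_shit"
)

# First attempt: the whole copy is rolled back when a constraint fails.
try:
    with conn:  # commits on success, rolls back on exception
        conn.execute(copy_sql)
    first_attempt_ok = True
except sqlite3.IntegrityError:
    first_attempt_ok = False

# Find the offending rows so they can be fixed in the staging table.
bad_rows = conn.execute(
    "SELECT id FROM files_shit WHERE filename IS NULL OR upload_user IS NULL"
).fetchall()

# Hypothetical fix: give nameless files a placeholder, then copy again.
with conn:
    conn.execute(
        "UPDATE files_shit SET filename = 'unknown-' || id WHERE filename IS NULL"
    )
    conn.execute(copy_sql)

copied = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```

Running the copy inside a transaction means a failed attempt leaves the new tables empty rather than half-filled, so it can simply be repeated after the staging data is cleaned up.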

It gets more complicated if you can't take the database offline, or if you have to make the changes transparent to application programs. Triggers, rules, and updatable views will help with that.
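One way to picture the "transparent to applications" part: a view that keeps the old table's name and shape on top of the new tables, with an INSTEAD OF trigger so that legacy INSERT statements still work. This is only a sketch in SQLite (via Python's sqlite3); the column subset and trigger name are made up, it assumes the old table has already been dropped or renamed, and other databases spell this differently (e.g. rules or updatable views in PostgreSQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    user_id     INTEGER PRIMARY KEY AUTOINCREMENT,
    upload_user TEXT UNIQUE NOT NULL,
    user_name   TEXT
);
CREATE TABLE files (
    file_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    filename TEXT NOT NULL,
    user_id  INTEGER NOT NULL REFERENCES users(user_id)
);

-- A view with the old table's name and column layout.
CREATE VIEW files_shit AS
SELECT f.file_id AS id, f.filename, u.upload_user, u.user_name
FROM files f JOIN users u ON u.user_id = f.user_id;

-- Redirect legacy inserts into the normalized tables.
CREATE TRIGGER files_shit_insert INSTEAD OF INSERT ON files_shit
BEGIN
    INSERT OR IGNORE INTO users (upload_user, user_name)
    VALUES (NEW.upload_user, NEW.user_name);
    INSERT INTO files (filename, user_id)
    SELECT NEW.filename, user_id FROM users
    WHERE upload_user = NEW.upload_user;
END;
""")

# Legacy application code keeps writing to "files_shit" unchanged.
legacy_insert = (
    "INSERT INTO files_shit (filename, upload_user, user_name) VALUES (?, ?, ?)"
)
conn.execute(legacy_insert, ("a.txt", "u1", "Alice"))
conn.execute(legacy_insert, ("b.txt", "u1", "Alice"))
conn.commit()

legacy_rows = conn.execute(
    "SELECT filename, upload_user, user_name FROM files_shit ORDER BY id"
).fetchall()
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Old code sees the same rows it always did, while the user is stored only once underneath. UPDATE and DELETE against the view would need their own INSTEAD OF triggers.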
