开发者

MySQL Database Duplicates

开发者 https://www.devze.com 2023-03-31 20:56 出处:网络
Rather than explain the whole situation, which would just add confusion to the matter, I\'ve just decided to go with a very light table example andask my question below it

Rather than explain the whole situation, which would just add confusion to the matter, I've just decided to go with a very light table example and ask my question below it

I have a database table which looks kind of like this:

    product_id | category_id | category_name
         1            1           Cat One
         2            2           Cat One
         3            3           Cat One
         4            4           Cat Two
         4            5           Cat Two

How can I set the second and third row to have the same category_id as the first? So it looks like:

    product_id | category_id | category_name
        1             1            Cat One
        2             1            Cat One
        3             1            Cat One
        4             4            Cat Two
        5             4            Cat Two

Bearing in mind I need to do this for 12,474 rows! :(

I've been bashing my head against the desk for hours with this. Any ideas would be massively apprecia开发者_StackOverflowted

Thanks, Andy

P.S: I forgot to add that the category_id's should be the same because the category_name is Cat One for all of them

Edit 2: edited table sample a bit to avoid confusion


Create a table that maps each category name to an id

CREATE TABLE `categories` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) SELECT NULL as id, `category_name` as name FROM `tableName` GROUP BY `category_name`

Then UPDATE everything

UPDATE `tableName`, `categories` 
  SET `tableName`.`category_id` = `categories`.`id` 
  WHERE `tableName`.`category_name` = `categories`.`name`

And then drop the categories table!


I think you'll need to use a little PHP or other scripting languange.

Try something along these lines:

$result_set = query("
    SELECT category_id, categeory_name
    FROM table_name
    WHERE 1
    GROUP BY category_name
    ORDER BY category_id ASC
"); //should give you all unique category_names and the lowest category_id for that name

while($row = mysql_fetch_assoc($result_set)) {
    query("UPDATE table_name SET category_id = {$row['category_id']} WHERE category_name = '{$row['category_name']}'");
}


UPDATE `table_name` SET `category_id`='1' WHERE `category_name`="Cat One";

Basic UPDATE with a WHERE clause.


Try this with subqueries (it has never been tested):

UPDATE `table_name` u SET 
      `category_id`= (SELECT `category_id` 
                      FROM `table_name` t 
                      WHERE u.category_name = t.category_name 
                      GROUP BY category_name);


UPDATE table_name SET category_id=1
WHERE category_name="Cat One" AND category_id<>1;

if the category_name is the decide factor

UPDATE table_name, category 
SET table_name.category_id=category.category_id
WHERE table_name.category_name=category.category_name
AND table_name.category_id<>category.category_id;

and your question is getting more fuzzy ...

UPDATE table_name, 
(select min(category_id) as cid, category_name from category 
group by category_name) as alias_table 
SET table_name.category_id=cid
WHERE table_name.category_name=alias_table.category_name;


You should ask yourself first if the design you have is truly the one you want.

Specifically I am talking about normalizing your relations. Your tables do not appear to be in second normal form. You might want to consider having a category table, then having a foreign key from the product table to the category table.

Looks like this

product_id | category_id 
     1            1         
     2            1          
     3            1         
     4            2         
     5            2        

category_id | category_name
     1            cat one
     2            cat two

To get to second normal form you would need to migrate the data.

Create the category table

create table categories (category_id int(11), category_name varchar(100));

Then select all of the redundant data into the table.

insert into categories
select 
 category_id,
 category_name
from products

Delete redundant data

delete c2 from categories c1, categories c2
where c1.category_name = c2.category_name and c1.category_id < c2.category_id

Now point the products table at the normalized data

update products, categories
set product.category_id = categories.category_id
where product.category_name = categories.category_name

Then drop the category_name column from products

alter tables products drop column category_name

Finally add a foreign key constraint from products to categories.

The end result is a normalized schema removing the possibility of redundant data.

Edit: I Really mean Second Normal Form, not first.

Another Edit: The Delete Redundant Data step was initially wrong, and would have deleted all categories =)

0

精彩评论

暂无评论...
验证码 换一张
取 消