I'm looking for an example of how to split out comma-delimited data in a field of one table and fill a second table with those individual elements, in order to make a one-to-many relational database schema. This is probably really simple, but let me give an example:
I'll start with everything in one table, Widgets, which has a "state" field to contain states that have that widget:
Table: WIDGET
===============================
| id | unit | states          |
===============================
| 1  | abc  | AL,AK,CA        |
-------------------------------
| 2  | lmn  | VA,NC,SC,GA,FL  |
-------------------------------
| 3  | xyz  | KY              |
===============================
Now, what I'd like to create via code is a second table, WIDGET_ST, joined to WIDGET, that has widget id, widget state id, and widget state name fields. For example:
Table: WIDGET_ST
==============================
| w_id | w_st_id | w_st_name |
------------------------------
| 1    | 1       | AL        |
| 1    | 2       | AK        |
| 1    | 3       | CA        |
| 2    | 1       | VA        |
| 2    | 2       | NC        |
| 2    | 3       | SC        |
| 2    | 4       | GA        |
| 2    | 5       | FL        |
| 3    | 1       | KY        |
==============================
I am learning C# and PHP, so responses in either language would be great.
Thanks.
I wrote some scripts to import the Stack Overflow data dump into an SQL database. I split the tags list to populate a many-to-many table as you describe. I use a technique similar to the following:
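The steps below assume $pdoStmt is a PDO statement already prepared against the WIDGET table. A minimal setup might look like this (the DSN, credentials, and database name are placeholders, not something from the question):

    // Sketch: connect with PDO and select the rows to be normalized.
    // Host, credentials, and database name are placeholders.
    $pdo = new PDO("mysql:host=localhost;dbname=mydb", "user", "password");
    $pdoStmt = $pdo->query("SELECT id, states FROM WIDGET");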
Open the output CSV files and initialize the state lookup before the loop:

    $stateid = array();                       // state name => generated state id
    $stfile  = fopen("states.csv", "w+");     // one row per distinct state
    $mmfile  = fopen("manytomany.csv", "w+"); // one row per widget/state pair
    $i = 0;

Read a row from WIDGET:

    while ($row = $pdoStmt->fetch()) {

Use explode() to split on a comma:

        $states = explode(",", $row["states"]);

Loop over the elements, assigning an id to each state the first time it appears and writing a line to each CSV file:

        foreach ($states as $st) {
            if (!array_key_exists($st, $stateid)) {
                $stateid[$st] = ++$i;
                fprintf($stfile, "%d,%s\n", $i, $st);
            }
            fprintf($mmfile, "%s,%s\n", $row["id"], $stateid[$st]);
        }
    }

Close the files when the loop finishes:

    fclose($stfile);
    fclose($mmfile);
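With the three sample WIDGET rows above, the two files would come out like this (state ids are assigned globally, in order of first appearance, because the STATES table becomes a shared lookup):

    states.csv          manytomany.csv
    1,AL                1,1
    2,AK                1,2
    3,CA                1,3
    4,VA                2,4
    5,NC                2,5
    6,SC                2,6
    7,GA                2,7
    8,FL                2,8
    9,KY                3,9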
When you're done, load the CSV files into the database. You can do this in the mysql client; since the files are comma-separated, tell LOAD DATA which field terminator to use:

    mysql> LOAD DATA INFILE 'states.csv' INTO TABLE STATES FIELDS TERMINATED BY ',';
    mysql> LOAD DATA INFILE 'manytomany.csv' INTO TABLE WIDGET_ST FIELDS TERMINATED BY ',';
It may seem like a lot of work, but the LOAD DATA command runs 20x faster than inserting one row at a time, so it's worthwhile if your data set is large.
Re your comment:
Right, I also have data in a database already. It turns out that the solution I show above, dumping to CSV files and re-importing in normalized format, is many times faster than doing INSERT statements inside the loop that splits the data.
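For comparison, the row-at-a-time version would be something like this (just a sketch, using the same $pdo and $pdoStmt assumptions as above, and assuming STATES and WIDGET_ST have exactly the columns written to the CSV files):

    // Sketch of the slower alternative: issue INSERTs while splitting,
    // instead of writing CSV files and bulk-loading them.
    $insState  = $pdo->prepare("INSERT INTO STATES VALUES (?, ?)");
    $insWidget = $pdo->prepare("INSERT INTO WIDGET_ST VALUES (?, ?)");
    $stateid = array();
    $i = 0;
    while ($row = $pdoStmt->fetch()) {
        foreach (explode(",", $row["states"]) as $st) {
            if (!array_key_exists($st, $stateid)) {
                $stateid[$st] = ++$i;
                $insState->execute(array($i, $st));
            }
            $insWidget->execute(array($row["id"], $stateid[$st]));
        }
    }

Every widget/state pair costs a round trip to the server here, which is exactly the per-row overhead the CSV dump plus LOAD DATA avoids.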
Each brand of database has its own tool for importing bulk data. See my answer to Optimizing big import in PHP for a list of bulk import solutions per database.
You should use the tools provided by each database. Trying to remain cross-platform only makes your code a jack of all trades and master of none. Besides, in 90% of cases where people bend over backwards to make their code database-independent, it turns out they never use more than one database. And you can't achieve complete database independence anyway.