Just a small question regarding joins. I have a table with around 30 fields and I was thinking about making a second table to store 10 of those fields. Then I would just join them in with the main data. The 10 fields that I was planning to store in a second table are not queried directly; they are just settings for the data in the first table.
Something like:
Table 1
Id
Data1
Data2
Data3
etc ...
Table 2
Id (same id as table one)
Settings1
Settings2
Settings3
Is this a bad solution? Should I just use one table? How much performance impact does it have? All entries in table 1 would also then have an entry in table 2.
Small update is in order. Most of the Data fields are of the type varchar and 2 of them are of the type text. How is indexing treated? My plan is to index 2 data fields, email (varchar 50) and author (varchar 20). And yes, all records in Table 1 will have a record in Table 2. Most of the settings fields are of the bit type, around 80%. The rest is a mix between int and varchar. The varchars can be null.
This is known as vertical partitioning, and is a legitimate strategy. You might do this for the following reasons:
- If certain columns are accessed or changed far more frequently than others.
- If you want to store some of the columns on one set of storage media and the others on another.
- If you have extensive triggers that don't need to run in response to updates on certain of the columns.
There will likely be a small performance hit when accessing through the JOIN, but performance may increase in the cases where you only need to access one or the other of the component tables.
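As a rough sketch of the split described above, here is a one-to-one pair of tables in SQLite, driven from Python. The table and column names (`post`, `post_settings`, `is_public`, `allow_html`) are made up for illustration and are not from the question, and SQLite only stands in for whatever database is actually in use. It shows that reading the main data alone touches one table, while reading data plus settings needs the JOIN.

```python
import sqlite3

# In-memory database; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE post (
    id     INTEGER PRIMARY KEY,
    email  TEXT,
    author TEXT,
    body   TEXT
);
-- Same id as post: exactly one settings row per post row.
CREATE TABLE post_settings (
    id         INTEGER PRIMARY KEY REFERENCES post(id),
    is_public  INTEGER NOT NULL DEFAULT 0,
    allow_html INTEGER NOT NULL DEFAULT 0
);
""")
conn.execute("INSERT INTO post VALUES (1, 'a@example.com', 'alice', 'hello')")
conn.execute("INSERT INTO post_settings VALUES (1, 1, 0)")

# Main data only: a single-table read, no join involved.
data_only = conn.execute(
    "SELECT author, body FROM post WHERE id = 1"
).fetchone()

# Data plus settings: this is where the join cost is paid.
joined = conn.execute("""
    SELECT p.author, s.is_public, s.allow_html
    FROM post AS p
    JOIN post_settings AS s ON s.id = p.id
    WHERE p.id = 1
""").fetchone()

print(data_only, joined)
```

Whether the join cost matters depends on how often the second query shape is needed compared with the first.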
This depends on whether every row in table 1 needs a row in table 2 (does every row have settings?) and on how many of these settings are regularly NULL.
If all (or nearly all) of the settings are used, and used regularly, I would recommend placing them in the same table.
Also, the number of columns you mention is not extreme, so I would recommend a single table, which avoids joins whenever the fields are used.
If the data belongs together, keep it together.
If your settings don't make sense without the main table data and correspond one-to-one to them (i.e. one row in table1 has exactly one row in table2), then you should add them to the main table.
Your solution is not bad, but consider the following:
- A table with a large number of columns, many of which store NULL, is a sign that you should create another table.
- If you want to store settings, create a design that can store any number of settings without storing NULLs, for example a table with the fields primarykeyid, foreignkeyid, settingname, settingvalue. Note that with this second option you store the overhead of the setting name, but if you use an int or tinyint it will perform much better.
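A minimal sketch of that name/value layout, again using SQLite from Python as a stand-in. It reads the int/tinyint remark as "store a small integer key instead of the setting name", which is an interpretation rather than something the answer spells out. The names `setting_name`, `record_setting`, and `settings_for` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Lookup table so each setting name is stored once (assumed design).
CREATE TABLE setting_name (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- One row per setting that is actually set; unset settings have no row,
-- so no NULLs are stored.
CREATE TABLE record_setting (
    id         INTEGER PRIMARY KEY,
    record_id  INTEGER NOT NULL,
    setting_id INTEGER NOT NULL REFERENCES setting_name(id),
    value      TEXT NOT NULL,
    UNIQUE (record_id, setting_id)
);
""")
conn.executemany(
    "INSERT INTO setting_name VALUES (?, ?)",
    [(1, "is_public"), (2, "signature")],
)
conn.executemany(
    "INSERT INTO record_setting (record_id, setting_id, value) VALUES (?, ?, ?)",
    [(10, 1, "1"), (10, 2, "Regards"), (11, 1, "0")],
)

def settings_for(record_id):
    """Return the settings stored for one record as a name -> value dict."""
    rows = conn.execute(
        """SELECT n.name, s.value
           FROM record_setting AS s
           JOIN setting_name AS n ON n.id = s.setting_id
           WHERE s.record_id = ?""",
        (record_id,),
    ).fetchall()
    return dict(rows)

print(settings_for(10))
print(settings_for(11))
```

The trade-off is that every value shares one column type (text here), and reading all settings for a record returns several rows instead of one.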
Hope this helps
Do table 1 and 2 have the same number of rows? That is, does every row in table 1 have a corresponding row in table 2 and vice versa? If so, a join is unnecessary; you can just pick out the columns you want with a select.
A join would only be useful if you have rows in table 2 that correspond with more than 1 row in table 1, e.g. a row of settings that applies to multiple rows of data.
Edit: My questions are already answered by the edit in the original post. I would say a join is not necessary here and would only adversely impact performance.
If you are going to regularly query the settings table at the same time as the data table, then you will want them as a single table. If you query each set of data separately, it can be useful to keep them apart.
There is a performance hit on every join, but you also want to look at how you are using the tables. If you need them at the same time, it's best to keep them together; otherwise you can break them up, as long as you don't plan to use them together often in the future.
The size of the fields, the frequency and type of use (selects, updates, etc.) and the size of the tables (rows) all matter, as does the way you index them.
In general this is not bad design if the subset of data in one table is very commonly viewed (say in a list) and the second subset is viewed/edited much less often and/or contains very large fields (like varchar(max)).
However, if the data is always viewed together, this is probably not the way to go. So if you have to read those settings in the second table every time you read the first table, don't go down this road.
Update: Considering your index and the fact that almost all settings are a fixed size (bit), I would just make it one table. If you need a subset query without the entire table, you may want to consider a covering index instead of splitting the table.
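To illustrate the covering-index idea, here is a small sketch in SQLite via Python. The table name `post` and index name `idx_post_email_author` are assumptions, and the index here covers both `email` and `author` together, which differs from two separate single-column indexes. A query that only needs the indexed columns can be answered from the index without reading the wide table rows, which is what the query plan is checked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One wide table holding data and settings together (hypothetical columns).
CREATE TABLE post (
    id        INTEGER PRIMARY KEY,
    email     TEXT,
    author    TEXT,
    body      TEXT,
    is_public INTEGER NOT NULL DEFAULT 0
);
-- The index contains every column the subset query below needs.
CREATE INDEX idx_post_email_author ON post (email, author);
""")
conn.execute(
    "INSERT INTO post (email, author, body) VALUES (?, ?, ?)",
    ("a@example.com", "alice", "a long text body"),
)

query = "SELECT email, author FROM post WHERE email = ?"
result = conn.execute(query, ("a@example.com",)).fetchall()

# Ask SQLite how it would run the query; the last field of each plan row
# is a human-readable description of the step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("a@example.com",)).fetchall()
plan_text = " ".join(row[-1] for row in plan)
uses_covering_index = "COVERING INDEX" in plan_text

print(result, uses_covering_index)
```

Other database engines expose the same idea with their own syntax and plan output, so the plan check above is specific to SQLite.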