I have a large DB (many millions of rows) and I'm trying to make the best choices for datatypes for 2 fields. Most everything I have made VARCHAR or INT. However, for 2 fields I'm wondering if ENUM is the best way.
Field 1: The first field is gender. My data is currently either 'Male' or 'Female', or it could be blank. I initially set it up like this:
GENDER VARCHAR(6) NOT NULL
Is this the best way, or would it be better to set it up as:
GENDER ENUM ('Male', 'Female') NOT NULL
And do I need to make it NOT NULL to allow for the blank, or do I need to add the blank, i.e.
GENDER ENUM ('Male', 'Female', '') NOT NULL
Not to mention, I'm considering converting the entire field to just M or F.
Field 2: I have pretty much the same things to consider, except for the state field, which could include 52 values (50 states, DC, plus blank).
I guess the biggest question is: is all this ENUM stuff worth it? My DB has many millions of rows, so everything is a factor, but should I just be using VARCHAR(2) for the states instead of ENUM?
The rule of thumb I usually apply to such cases is NOT to use MySQL ENUMs. Using them creates maintenance issues, especially around adding/removing/renaming some of the values. In InnoDB, renaming and removing an enum value is heavy on big tables. Adding a value isn't (as long as you don't add it in the middle).
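To make the cheap and the heavy case concrete, here is a rough sketch. It assumes a hypothetical people table whose gender column is currently ENUM('Male', 'Female') NOT NULL; the table name and the added 'Other' and 'Unknown' values are made up for illustration.

```sql
-- Cheap: the new value is appended at the end of the list, so the numbers
-- already stored for 'Male' and 'Female' stay the same.
ALTER TABLE people
  MODIFY gender ENUM('Male', 'Female', 'Other') NOT NULL;

-- Heavy: 'Unknown' is inserted in the middle, which renumbers 'Female' and
-- 'Other', so MySQL has to rewrite the table. Slow on many millions of rows.
ALTER TABLE people
  MODIFY gender ENUM('Male', 'Unknown', 'Female', 'Other') NOT NULL;
```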
As you probably DO want to keep this column in context, and not to allow any value out of this context, the best way IMHO is to use INT, and connect it as a foreign key to a values table (columns id, value).
You will be able to add and rename values in this table easily, and before you remove a value the FK will enforce handling any existing records in the main table which have this value.
To read the data easily, all you need is a simple JOIN.
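A minimal sketch of that layout, assuming InnoDB and hypothetical table names (genders, people) and constraint name; only the id and value columns come from the description above.

```sql
-- Values table: one row per allowed value.
CREATE TABLE genders (
  id    INT         NOT NULL PRIMARY KEY,
  value VARCHAR(20) NOT NULL UNIQUE
) ENGINE=InnoDB;

INSERT INTO genders (id, value) VALUES (1, 'Male'), (2, 'Female');

-- Main table: gender_id must match a row in genders, or be NULL when
-- nothing was entered.
CREATE TABLE people (
  id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  gender_id INT NULL,
  CONSTRAINT fk_people_gender
    FOREIGN KEY (gender_id) REFERENCES genders (id)
) ENGINE=InnoDB;

-- Reading the text back. LEFT JOIN keeps people whose gender_id is NULL.
SELECT p.id, g.value AS gender
FROM people AS p
LEFT JOIN genders AS g ON g.id = p.gender_id;
```

Renaming a value is then a single-row UPDATE on genders, and a DELETE from genders fails while any row in people still references that id. With only a handful of values, a smaller integer type such as TINYINT UNSIGNED would work for both id columns as well.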
Note: since genders are pretty final, you may want to leave it as VARCHAR(1) or use an ENUM like Johan suggests, but who knows? You may want to support transgenders and androgyny in the future. Not kidding.
If you want to have a value for no value entered, use null; that's what null is designed for!
If you want to specify something in between male and female, use
ENUM('male','female','other') NULL;
Note that an enum does not store the literal text value in the column. male is stored as 1, female as 2, other as 3, etc.
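This means that it is more compact than varchar: an enum with up to 255 values takes 1 byte per row, whereas a varchar stores every character of the text plus a length byte.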
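If you want to see the stored numbers for yourself, a quick sketch (assuming a people table with a gender column defined as the ENUM above):

```sql
-- Using an ENUM column in a numeric context returns its stored index
-- instead of the text; rows where gender is NULL stay NULL.
SELECT gender, gender + 0 AS stored_index
FROM people;
```

One side effect worth knowing: ORDER BY gender sorts by these index numbers, i.e. in the order the values were listed in the column definition, not alphabetically.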
If you are struggling with null in your selects, note that you can use the ifnull or coalesce functions to replace the null with something more useful.
SELECT IFNULL(gender,'other') as gender FROM people;
-- or the equivalent statement
SELECT COALESCE(gender,'other') as gender FROM people;