What is cardinality in MySQL? Please explain in simple, non-technical language.
If a ind开发者_如何学JAVAex detail of any table displays the cardinality of a field say group_id
as 11, then what does that mean?
Max cardinality: All values are unique
Min cardinality: All values are the same
Some columns are called high-cardinality columns because they have constraints in place (like unique) prohibiting you from putting the same value in every row.
Cardinality is a property which affects the ability to cluster, sort and search data. It is therefore an important measurement for the query planners in DBs, it is a heuristic which they can use to choose the best plans.
Wikipedia summarizes cardinality in SQL as follows:
In SQL (Structured Query Language), the term cardinality refers to the uniqueness of data values contained in a particular column (attribute) of a database table. The lower the cardinality, the more duplicated elements in a column. Thus, a column with the lowest possible cardinality would have the same value for every row. SQL databases use cardinality to help determine the optimal query plan for a given query.
It is an estimate of the number of unique values in the index.
For a table with a single primary key column, the cardinality should normally be equal to the number of rows in the table.
More information.
It's basically associated with the degree of uniqueness of a column's values as per the Wikipedia article linked to by Kami.
Why it is important to consider is that it affects indexing strategy. There will be little point indexing a low cardinality column with only 2 possible values as the index will not be selective enough to be used.
The higher cardinality, the better is differentiation of rows. Differentiation helps navigating less branches to get data.
Therefore higher cordinality values mean:
- better performance of read-queries;
- bigger database size;
- worse performance of write-queries, because hidden index data is being updated.
In mathematical terms, cardinality is the count of values in a set of values. A set can only contains unique values. An example would be the set "A".
Let the set "A" be: A={1,2,3} - the cardinality of that set is |3|.
If set "A" contains 5 values A={10,21,33,42,57}, then the cardinality is |5|.
SET VALUES Cardinality
A 1,2,3 3
B 10,21,33,42,57 5
What that means in the context of MySQL is that the cardinality of a table column is the count of that column's unique values. If you are looking at the cardinality of your primary key column (e.g. table.id), then the cardinality of that column will tell you how many rows that table contains, as there is one unique ID for each row in the table. You don't have to perform a "COUNT(*)" on that table to find out how many rows it has, simply look at the cardinality.
In a simple way, cardinality is the number of rows or tuples within the table. No. of columns is called "degree"
From the manual:
Cardinality
An estimate of the number of unique values in the index. This is updated by running ANALYZE TABLE or myisamchk -a. Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables. The higher the cardinality, the greater the chance that MySQL uses the index when doing joins.
And an analysis from Percona:
CREATE TABLE `antest` (
`i` int(10) unsigned NOT NULL,
`c` char(80) default NULL,
KEY `i` (`i`),
KEY `c` (`c`,`i`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
mysql> select count(distinct c) from antest;
+-------------------+
| count(distinct c) |
+-------------------+
| 101 |
+-------------------+
1 row in set (0.36 sec)
mysql> select count(distinct i) from antest;
+-------------------+
| count(distinct i) |
+-------------------+
| 101 |
+-------------------+
1 row in set (0.20 sec)
mysql> select count(distinct i,c) from antest;
+---------------------+
| count(distinct i,c) |
+---------------------+
| 10201 |
+---------------------+
1 row in set (0.43 sec)
mysql> show index from antest;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| antest | 1 | i | 1 | i | A | NULL | NULL | NULL | | BTREE | |
| antest | 1 | c | 1 | c | A | NULL | NULL | NULL | YES | BTREE | |
| antest | 1 | c | 2 | i | A | NULL | NULL | NULL | | BTREE | |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.00 sec)
mysql> analyze table sys_users;
+--------------------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------------------------------+---------+----------+----------+
| antest | analyze | status | OK |
+--------------------------------+---------+----------+----------+
1 row in set (0.01 sec)
mysql> show index from antest;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| antest | 1 | i | 1 | i | A | 101 | NULL | NULL | | BTREE | |
| antest | 1 | c | 1 | c | A | 101 | NULL | NULL | YES | BTREE | |
| antest | 1 | c | 2 | i | A | 10240 | NULL | NULL | | BTREE | |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.01 sec)
精彩评论