开发者

Composite Primary and Cardinality

开发者 https://www.devze.com 2022-12-31 05:35 出处:网络
I have some questions on Composite Primary Keys and the cardinality of the columns. I searched开发者_运维问答 the web, but did not find any definitive answer, so I am trying again. The questions are:

I have some questions on Composite Primary Keys and the cardinality of the columns. I searched开发者_运维问答 the web, but did not find any definitive answer, so I am trying again. The questions are:

Context: Large (50M - 500M rows) OLAP Prep tables, not NOSQL, not Columnar. MySQL and DB2

1) Does the order of keys in a PK matter?

2) If the cardinality of the columns varies heavily, which should be used first. For example, if I have CLIENT/CAMPAIGN/PROGRAM where CLIENT is highly cardinal, CAMPAIGN is moderate, PROGRAM is almost like a bitmap index, what order is the best?

3) What order is the best for Join, if there is a Where clause and when there is no Where Clause (for views)

Thanks in advance.


You've got "MySQL and DB2". This answer is for DB2, MySQL has none of this.

Yes, of course that is logical, but the optimiser takes much more than just that into account.

Generally, the order of the columns in the WHERE clause (join) do not (and should not) matter.

However, there are two items related to the order of predicates which may be the reason for your question.

  1. What does matter, is the order of the columns in the index, against which the WHERE clause is processed. Yes, there it is best to specify the columns in the order of highest cardinality to lowest. That allows the optimiser to target a smaller range of rows.

    • And along those lines do not bother implementing indices for single-column, low cardinality columns (there are useless). If the index is correct, then it will be used more often.
      .
  2. The order of tables being joined (not columns in the join) matters very much, it is probably the most important consideration. In fact Join Transitive Closure is automatic, and the optimiser evaluates all possible join orders, and chooses what it thinks is the best, based on Statistics (which is why UPDATE STATS is so important).

    Regardless of the no of rows in the tables, if you are joining 100 rows from table_A on a bad index with 1,000,000 rows in table_B on a good index, you want the order A:B, not B:A. If you are getting less than the max IOPS, you may want to do something about it.

    The correct sequence of steps is, no surprise:

    • check that the index is correct as per (1). Do not just add another index, correct the ones you have.

    • check that update stats is being executed regularly

    • always try the default operation of the optimiser first. Set stats on and measure I/Os. Use representative sets of values (that the user will use in production).

    • check the shoowplan, to ensure that the code is correct. Of course that will also identify the join order chosen.

    • if the performance is not good enough, and you believe that the the join order chosen by the optimiser for those sets of values is sub-optimal, SET JTC OFF (syntax depends on your version of DB2), then specify the order that you want in the WHERE clause. Measure I/Os. Use representative sets

    • form an opinion. Choose whichever is better performance overall. Never tune for single queries.


1) Does the order of keys in a PK matter?

Yes, it changes the order of the record for the index that is used to police the PRIMARY KEY.

2) If the cardinality of the columns varies heavily, which should be used first. For example, if I have CLIENT/CAMPAIGN/PROGRAM where CLIENT is highly cardinal, CAMPAIGN is moderate, PROGRAM is almost like a bitmap index, what order is the best?

For select queries, this totally depends on the queries you are going to use. If you are searching for all three columns at once, the order is not important; if you are searching for two or one columns, they should be leading in the index.

For inserts, it is better to make the leading column match the order in which the records are inserted.

3) What order is the best for Join, if there is a Where clause and when there is no Where Clause (for views)

Again, this depends on the WHERE clause.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号