开发者

Why would SQL Server choose Clustered Index Scan over Non-Clustered one?

开发者 https://www.devze.com 2023-03-12 03:14 出处:网络
In one of the tables I am querying, a clustered index was created over a key that\'s not a primary key. (I don\'t know why.)

In one of the tables I am querying, a clustered index was created over a key that's not a primary key. (I don't know why.)

However, there's a non-clustered index for the primary key for this table.

In the execution plan, SQL is choosing the clustered index, rather than the non-clustered index for the primary key which I am using in my query.

Is there a reason why SQL would do this? How can I force SQL to choose the non-clustered index instead?


Appending more detail:

The table has many fields and the query contains many joins. Let me abstract it a bit.

The table definition looks like this:

SlowTable
[SlowTable_id] [int] IDENTITY(200000000,1) NOT NULL,
[fk1Field] [int] NULL,
[fk2Field] [int] NULL,
[other1Field] [varchar] NULL,
etc. etc...

and then the indices for this table are:

fk1Field (Clustered)
SlowTable_id (Non-Unique, Non-Clustered)
fk2Field (Non-Unique, Non-Clustered)
... and 14 other Non-Unique, Non-Clustered indices on other fields

Presumably there are lots more queries made against fk1Field which is why they selected this as the basis for the Clustered index.

The query I have uses a view:

SELECT
  [field list]
FROM 
     SourceTable1 S1
     INNER JOIN SourceTable2 S2
       ON S2.S2_id = S1.S2_id
     INNER JOIN SourceTable3 S3
       ON S3.S3_id = S2.S3_id
     INNER JOIN SlowTable ST
       ON ST.SlowTable_id = S1.SlowTable_id
     INNER JOIN [many other tables, around 7 more...]

The execution plan is quite big, with the nodes concerned say

Hash Match 
(Inner Join)
Cost: 9%

with a thick arrow pointing to

Clustered Index Scan (Clustered)
SlowTable.fk1Field 
Cost: 77%

I hope this provides enough detail on the issue.

Thanks!


ADDENDUM 2: Correction to my previous post. The view doesn't have a where clause. It is just a series of inner joins. The execution plan was taken from an Insert statement that uses the View (listed as SLOW_VIEW) in a complex query that looks like the following:

(What this stored procedure does is to do a proportional split of the total amount of some records, based on weights, computed as percentage against a total. This mimics distributing a value from, say, one account, to other accounts.)

INSERT INTO dbo.WDTD(
    FieldA,
    FieldB,
    GWB_id,
    C_id,
    FieldC,
    PG_id,
    FieldD,
    FieldE,
    O_id,
    FieldF,
    FieldG,
    FieldH,
    FieldI,
    GWBIH_id,
    T_id,
    JO_id,
    PC_id,
    PP_id,
    FieldJ,
    FieldK,
    FieldL,
    FieldM,
    FieldN,
    FieldO,
    FieldP,
    FieldQ,
    FieldS)
SELECT DISTINCT
    @FieldA FieldA,
    GETDATE() FieldB,
    @Parameter1 GWB_id,
    GWBIH.C_id C_id,
    P.FieldT FieldC,
    P.PG_id PG_id,
    PAM.FieldD FieldD,
    PP.FieldU FieldE,
    GWBIH.O_id O_id,
    CO.FieldF FieldF,
    CO.FieldG FieldG,
    PSAM.FieldH FieldH,
    PSAM.FieldI FieldI,
    SOURCE.GWBIH_id GWBIH_id,
    ' ' T_id,
    GWBIH.JO_id JO_id,
    SOURCE.PC_id PC_id,
    GWB.PP_id,
    SOURCE.FieldJ FieldJ,
    1 FieldK,
    ROUND((SUM(GWBIH.Total) / AGG.Total) * SOURCE.Total, 2) FieldL,
    ROUND((SUM(GWBIH.Total) / AGG.Total) * SOURCE.Total, 2) FieldM,
    0 FieldN,
    ' ' FieldO,
    ESGM.FieldP_flag FieldP,
    SOURCE.FieldQ FieldQ,
     '[UNPROCESSED]'
FROM
    dbo.Table1 GWBIH
    INNER JOIN dbo.Table2 GWBPH
        ON GWBPH.GWBP_id = GWBIH.GWBP_id
    INNER JOIN dbo.Table3 GWB
        ON GWB.GWB_id = GWBPH.GWB_id
    INNER JOIN dbo.Table4 P
        ON P.P_id = GWBPH.P_id
    INNER JOIN dbo.Table5 ESGM
        ON ESGM.ET_id = P.ET_id
    INNER JOIN dbo.Table6 PAM
        ON PAM.PG_id = P.PG_id
    INNER JOIN dbo.Table7 O
        ON O.dboffcode = GWBIH.O_id
    INNER JOIN dbo.Table8 CO
        ON
            CO.Country_id = O.Country_id
            AND CO.Brand_id = O.Brand_id
    INNER JOIN dbo.Table9 PSAM
        ON PSAM.Office_id = GWBIH.O_id
    INNER JOIN dbo.Table10 PCM
        ON PCM.PC_id = GWBIH.PC_id
    INNER JOIN dbo.Table11 PC
        ON PC.PC_id = GWBIH.PC_id
    INNER JOIN dbo.Table12 PP
        ON PP.PP_id = GWB.PP_id
            -- THIS IS THE VIEW THAT CONTAINS THE CLUSTERED INDEX SCAN
    INNER JOIN dbo.SLOW_VIEW GL
        ON GL.JO_id = GWBIH.JO_id
    INNER JOIN开发者_如何学编程 (
        SELECT
            GWBIH.C_id C_id,
            GWBPH.GWB_id,
            SUM(GWBIH.Total) Total
        FROM
            dbo.Table1 GWBIH
            INNER JOIN dbo.Table2 GWBPH
                ON GWBPH.GWBP_id = GWBIH.GWBP_id
            INNER JOIN dbo.Table10 PCM
                ON PCM.PC_id = GWBIH.PC_id
        WHERE
            PCM.Split_flag = 0
            AND GWBIH.JO_id IS NOT NULL
        GROUP BY
            GWBIH.C_id,
            GWBPH.GWB_id
            ) AGG
        ON AGG.C_id = GWBIH.C_id
            AND AGG.GWB_id = GWBPH.GWB_id
    INNER JOIN (
        SELECT
            GWBIH.GWBIH_id GWBIH_id,
            GWBIH.C_id C_id,
            GWBIH.FieldQ FieldQ,
            GWBP.GWB_id GWB_id,
            PCM.PC_id PC_id,
            CASE
            WHEN WT.FieldS IS NOT NULL
                THEN WT.FieldS
            WHEN WT.FieldS IS NULL
                THEN PCMS.FieldT
            END FieldJ,
            SUM(GWBIH.Total) Total
        FROM
            dbo.Table1 GWBIH
            INNER JOIN dbo.Table2 GWBP
                ON GWBP.GWBP_id = GWBIH.GWBP_id
            INNER JOIN dbo.Table4 P
                ON P.P_id = GWBP.P_id
            INNER JOIN dbo.Table10 PCM
                ON PCM.PC_id = GWBIH.PC_id
            INNER JOIN dbo.Table11 PCMS
                ON PCMS.PC_id = PCM.PC_id
            LEFT JOIN dbo.WT WT
                ON WT.ET_id = P.ET_id
                AND WT.PC_id = GWBIH.PC_id
        WHERE
            PCM.Split_flag = 1
        GROUP BY
            GWBIH.GWBI_id,
            GWBIH.C_id,
            GWBIH.FieldQ,
            GWBP.GWB_id,
            WT.FieldS,
            PCM.PC_id,
            PCMS.ImportCode
            ) SOURCE
        ON SOURCE.C_id = GWBIH.C_id
            AND SOURCE.GWB_id = GWBPH.GWB_id
WHERE
    PCM.Split_flag = 0
    AND AGG.Total > 0
    AND GWBPH.GWB_id = @Parameter1
    AND NOT EXISTS (
        SELECT *
        FROM dbo.WDTD
        WHERE
            TD.C_id = GWBIH.C_id
            AND TD.FieldA = GWBPH.GWB_id
            AND TD.JO_id = GWBIH.JO_id
            AND TD.PC_id = SOURCE.PC_id
            AND TD.GWBIH_id = ' ')
GROUP BY
    GWBIH.C_id,
    P.FieldT,
    GWBIH.JO_id,
    GWBIH.O_id,
    GWBPH.GWB_id,
    P.PG_id,
    PAM.FieldD,
    PP.FieldU,
    GWBIH.O_id,
    CO.FieldF,
    CO.FieldG,
    PSAM.FieldH,
    PSAM.FieldI,
    GWBIH.JO_id,
    SOURCE.PC_id,
    GWB.PP_id,
    SOURCE.FieldJ,
    ESGM.FieldP_flag,
    SOURCE.GWBIH_id,
    SOURCE.FieldQ,
    AGG.Total,
    SOURCE.Total

ADDENDUM 3: When doing an execution plan on the select statement of the view, I see this:

Hash Match     <====   Bitmap           <------ etc...
(Inner Join)           (Bitmap Create)
Cost: 0%               Cost: 0%
    ^
    |
    |
Parallelism                           Clustered Index Scan (Clustered)
(Repartition Streams)    <====        Slow_Table.fk1Field
Cost: 1%                              Cost: 98%

ADDENDUM 4: I think I found the problem. The Clustered Index Scan isn't referring to my clause that references the Primary Key, but rather another clause that needs a field that is, in some way, related to fk1Field above.


Most likely one of:

  • too many rows to make the index effective
  • index doesn't fit the ON/WHERE conditions
  • index isn't covering and SQL Server avoids a key lookup

Edit, after update:

Your indexes are useless because they are all single column indexes, so it does a clustered index scan.

You need an index that matches your ON, WHERE, GROUP BY conditions with INCLUDES for your SELECT list.


If the query you're executing isn't selecting a small subset of the records, SQL Server may well choose to ignore any "otherwise useful" non-clustered index and just scan through the clustered index (in this instance, most likely all rows in the table) - the logic being that the amount of I/O required to perform the query vs. the non-clustered index outweights that required for a full scan.

If you can post the schema of your table(s) + a sample query, I'm sure we can offer more information.


Ideally you shouldn't be telling SQL Server to do either or, it can pick the best, if you give it a good query. Query hints was created to steer the engine a bit, but you shouldn't have to use this just yet.

Sometimes it is beneficial to cluster the table differently that the primary key, is rare, but it can be useful (the clustering controls the data layout while the primary key ensures correctness).

I can tell you exactly why SQL Server picks the clustered index if you show me your query and schema otherwise I'd only be guessing on likely causes and execution plan is helpful in these cases.

For a non-clustered index to be considered it has to be meaningful to the query and if you non-clustered index doesn't cover your query, there's no guaratee that it will be used at all.


A clustered index scan is essentially a table scan (on a table that happens to have a clustered index). You really should post your statement to get a better answer. Your where clause may not be searchable (see sargs), or if you are selecting many records, sql server may scan the table rather than use the index and later have to look up related columns.

0

精彩评论

暂无评论...
验证码 换一张
取 消