Does SQLite multi column primary key need an additional index?_问答_开发者

Does SQLite multi column primary key need an additional index?

开发者 https://www.devze.com 2023-01-04 04:59 出处：网络

If I create a table like so: CREATE TABLE something (column1, column2, PRIMARY KEY (column1, column2));

If I create a table like so:

CREATE TABLE something (column1, column2, PRIMARY KEY (column1, column2));

Neither column1 nor column2 are unique by themselves. However, I will do most of my queries on column1.

Does the multi column primary key create an index for both columns separately? I would think that if you specify a multi column primary开发者_如何学Python key it would index them together, but I really don't know.

Would there be any performance benefit to adding a UNIQUE INDEX on column1?

There will probably not be a performance benefit, because queries against col1=xxx and col2=yyy would use the same index as queries like col1=zzz with no mention of col2. But my experience is only Oracle, SQL Server, Ingres, and MySQL. I don't know for sure.

You certainly don't want to add a unique index on column 1 as you just stated:

Neither column1 nor column2 are unique by themselves.

If column one comes first, it will be first in the multicolumn index in most databases and thus it is likely to be used. The second column is the one that might not use the index. I wouldn't add one on the second column unless you see problems and again, I would add an index not a unique index based on the comment you wrote above.

But SQL lite must have some way of seeing what it is using like most other databases, right? Set the Pk and see if queries uing just column1 are using it.

I stumbled across this question while researching this same question, so figured I'd share my findings. Note that all of the below is tested on SQLite 3.39.4. I make no guarantees about how it will hold up on old/future versions. That said, SQLite is not exactly known for radically changing behavior at random.

To give a concrete answer for SQLite specifically: an index on column1 would provide no benefits, but an index on column2 would.

Let's look at a simple SQL script:

CREATE TABLE tbl (
    column1 TEXT NOT NULL,
    column2 TEXT NOT NULL,
    val INTEGER NOT NULL,
    PRIMARY KEY (column1, column2)
);

-- Uncomment to make the final SELECT fast
-- CREATE INDEX column2_ix ON tbl (column2);

EXPLAIN QUERY PLAN SELECT val FROM tbl WHERE column1 = 'column1' AND column2 = 'column2';
EXPLAIN QUERY PLAN SELECT val FROM tbl WHERE column1 = 'column1';
EXPLAIN QUERY PLAN SELECT val FROM tbl WHERE column2 = 'column2';

EXPLAIN QUERY PLAN is SQLite's method of allowing you to inspect what its query planner is actually going to do.

You can execute the script via something like:

$ sqlite3 :memory: < sample.sql

This gives the output

QUERY PLAN
`--SEARCH tbl USING INDEX sqlite_autoindex_tbl_1 (column1=? AND column2=?)
QUERY PLAN
`--SEARCH tbl USING INDEX sqlite_autoindex_tbl_1 (column1=?)
QUERY PLAN
`--SCAN tbl

So the first two queries, the ones which SELECT on (column1, column2) and (column1), will use the index to perform the search. Which should be nice and fast.

Note that the last query, the SELECT on (column2) has different output, though. It says it's going to SCAN the table -- that is, go through each row one by one. This will be significantly less performant.

What happens if we uncomment the CREATE INDEX in the above script? This will give the output

QUERY PLAN
`--SEARCH tbl USING INDEX sqlite_autoindex_tbl_1 (column1=? AND column2=?)
QUERY PLAN
`--SEARCH tbl USING INDEX sqlite_autoindex_tbl_1 (column1=?)
QUERY PLAN
`--SEARCH tbl USING INDEX column2_ix (column2=?)

Now the query on column2 will also use an index, and should be just as performant as the others.