I'd like to use SQLite FTS3 (FTS4, actually) to index a table with integer columns, conceptually something like this:
CREATE VIRTUAL TABLE whole (document INTEGER, page INTEGER, content TEXT,
UNIQUE(document, page)) USING fts4();
I know that FTS3 treats all columns other than rowid as TEXT, so I'll have to use two tables:
CREATE VIRTUAL TABLE data USING fts4();
CREATE TABLE metadata(document INTEGER, page INTEGER, UNIQUE(document, page));
I want to be able to query for documents, or for pages in a given document:
SELECT DISTINCT document FROM metadata NATURAL JOIN data WHERE content MATCH 'foo';
SELECT page FROM metadata NATURAL JOIN data
WHERE document = 123 AND content MATCH 'foo';
I think the NATURAL JOIN requires me to ensure that the rowids are kept in sync, but what's the best way to do that? Should I be using a FOREIGN KEY or other constraint? Would a sub-select be better than a join?
I'd like an insertion for a document and page already in the database to overwrite the text content. Is that possible programmatically via SQL or will I have to check to see if the row already exists in the info table?
I'm also going to be wanting to DELETE FROM both tables for a given document -- is there a way to do this in a single statement?
All advice gratefully received, but as I am a SQL newbie, code samples particularly appreciated!
Update: It's not at all clear to me how I can create a foreign key constraint here. If I choose metadata
as the parent table (which would be my preference, in the absence of a bidirectional constraint):
PRAGMA foreign_keys = ON;
CREATE TABLE metadata (document INTEGER, page INTEGER);
CREATE VIRTUAL TABLE data USING fts4(content TEXT, docid REFERENCES metadata);
I get Error: vtable constructor failed: data
(unsur开发者_如何学Goprisingly, because docid
is an alias for rowid
, but of course I can't use another column because all columns except rowid
must be TEXT
).
Whereas if I try the other way round:
PRAGMA foreign_keys = ON;
CREATE VIRTUAL TABLE data USING fts4();
CREATE TABLE metadata (document INTEGER, page INTEGER, docid REFERENCES data);
the table construction succeeds, but if I try:
INSERT INTO data (docid, content) VALUES (123, 'testing');
INSERT INTO metadata (docid, document, page) VALUES (123, 12, 23);
I get Error: foreign key mismatch
.
In short, it's pretty difficult to enforce consistency where FTS3 is involved.
SQLite virtual tables don't allow for triggers, and FTS3 tables ignore constraints and affinities.
The best I have been able to do so far is as follows:
CREATE TABLE metadata (document INTEGER, page INTEGER, UNIQUE(document, page));
CREATE VIRTUAL TABLE data USING fts4();
CREATE VIEW whole AS SELECT metadata.rowid AS rowid, document, page, content
FROM metadata JOIN data ON metadata.rowid = data.rowid;
CREATE TRIGGER whole_insert INSTEAD OF INSERT ON whole
BEGIN
INSERT INTO metadata (document, page) VALUES (NEW.document, NEW.page);
INSERT INTO data (rowid, content) VALUES (last_insert_rowid(), NEW.content);
END;
CREATE TRIGGER whole_delete INSTEAD OF DELETE ON whole
BEGIN
DELETE FROM metadata WHERE rowid = OLD.rowid;
DELETE FROM data WHERE rowid = OLD.rowid;
END;
To enforce consistency I could (with PRAGMA recursive_triggers = NO
) create triggers to raise exceptions on direct operations on the metadata
and data
tables, but this is probably overkill for my purposes (likewise, I don't need the UPDATE
trigger for the whole
table).
I think most DBAs would agree that if you're using SQL, you should take advantage of all the faculties it offers.
Foreign Keys are the general route I would recommend and they are documented here.
In general with SQL databases, you should never ever be manually enforcing consistency. Particularly in cases that fit fine with a foreign key such as this.
For the DELETE FROM case, SQLite doesn't support the "cascade" keyword like say MS SQL, but it has triggers which allow you to have this behavior anyway. The documentation for SQLite triggers can be found here.
Lastly, I would skip doing the natural join.
精彩评论