Setting sqlite I/O priority from python module (speeding up sqlite commits)

I have a Python program on an embedded system that needs to write to an SQLite database. This database is mission-critical and thus needs to be fully synchronous. The problem is that the database commits take a LONG time (3-30 seconds) for a single insert. The insert is wrapped in a transaction, but there's really no way to split these inserts across multiple transactions.

I've been searching for any way to make the database commit take less time, but I'm sort of lost.

I've tried setting PRAGMA journal_mode=PERSIST, and that seemed to help, but only by a small amount. Now I'm thinking it could be SQLite being starved of I/O time.

Is there a way to increase the I/O priority of JUST sqlite3? I don't want to raise the priority of the whole Python process, because we're also doing logging, config updates, and other file I/O, but I want to force SQLite to take as much I/O time as possible.

I'm open to other suggestions to speed commit time up as well.

This is what I'm doing in my insert:

    cur = None
    try:
        logging.info('Inserting into ' + table + ': ' + str(m))
        # Build the column list and placeholders, skipping NULL values.
        # Note: table and column names are concatenated into the SQL string,
        # so they must come from trusted input; only values are bound.
        sql = "INSERT INTO " + table + "("
        bind = " VALUES("
        values = []
        for k, v in m.items():
            if v is not None:
                sql += k + ","
                bind += "?,"
                values.append(v)
        sql = sql.rstrip(',') + ")"
        bind = bind.rstrip(',') + ")"

        cur = conn.cursor()
        cur.execute("PRAGMA journal_mode=PERSIST")
        logging.info(sql + bind)
        cur.execute(sql + bind, values)  # It's definitely this that takes the most time. Yes, I've profiled.
        conn.commit()
        return cur.lastrowid
    except Exception:
        raise
    finally:
        if cur is not None:
            cur.close()


Have you tried 3.7's WAL?

Historically, SQLite has been very slow at handling safe writes (that is, without setting synchronous=off). It wrote the entire transaction to a journal, flushed the journal to disk, then copied the whole thing back over the original file, with many blocking syncs in between, serializing the entire operation.

SQLite 3.7's write-ahead logging (WAL) should largely fix this problem; it avoids the redundant write, which is expensive for large transactions, and significantly reduces the number of required FS syncs.

See: http://www.sqlite.org/wal.html
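
For illustration, a minimal sketch of turning WAL on from Python's sqlite3 module. The synchronous=NORMAL line is an extra assumption on my part, not part of the answer above; in WAL mode it skips some syncs at the cost of possibly losing, but never corrupting, the most recent transactions after a power failure:

    import sqlite3

    conn = sqlite3.connect("app.db")
    # journal_mode is persistent for WAL: once set, it sticks across
    # connections until explicitly changed back.
    conn.execute("PRAGMA journal_mode=WAL")
    # Optional assumption: NORMAL skips some fsyncs in WAL mode. A power
    # failure may lose the last few transactions but will not corrupt the
    # database; keep the default FULL if even that is unacceptable.
    conn.execute("PRAGMA synchronous=NORMAL")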


You don't say what your embedded platform is. If it is Linux, then there is a reason this is happening.

In order to do a commit SQLite has to wait until the related data is absolutely positively definitively on disc. It often has to do this three times for a transaction - for the database, the journal and the directory itself containing both files. Even WAL requires one sync.

The system call fsync is used to do this; it blocks until the data for the associated file/directory handle is on disc. However, the common Linux ext3/ext4 family of filesystems effectively turns fsync into a sync call, and sync blocks until all outstanding data for the entire filesystem is on disk. (The same implementation behaviour may be present in other embedded operating systems.)

You can use strace or a similar tool to trace the system calls and their timing which will help you identify or eliminate this as a cause.

If it is (and that is highly likely), you have two solutions. One is repeatedly calling sync in the background, or configuring the kernel's writeback behaviour (bdflush/kflushd, or the vm.dirty_* sysctls on modern kernels) to use short intervals, so that the amount of uncommitted write data stays low. The Linux default lets dirty data sit for around 30 seconds, unless you are in laptop mode, in which case it can be several minutes.
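
On a modern kernel those knobs live under /proc/sys/vm. A rough sketch of shortening the writeback intervals from Python (requires root; the exact values here are illustrative assumptions, not recommendations):

    # Values are in centiseconds. Defaults are 500 (flusher wakes every 5 s)
    # and 3000 (dirty data expires after 30 s).
    with open("/proc/sys/vm/dirty_writeback_centisecs", "w") as f:
        f.write("100")  # wake the flusher every 1 s
    with open("/proc/sys/vm/dirty_expire_centisecs", "w") as f:
        f.write("200")  # treat dirty data as old after 2 s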

The other is to put the database on its own filesystem so that uncommitted writes on other filesystems have no effect on your database filesystem.

Under Linux the ionice system call/tool can be used to change I/O priority. (It requires root to boost your priority.) However, if the sync behaviour above is the cause of the problem, this won't really help: there will still be the same amount of uncommitted data, and changing the order in which it is written won't help.
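
If you still want to experiment with it from Python, the third-party psutil package wraps the underlying ioprio_set call; a sketch, assuming psutil is installed and you are on Linux:

    import psutil

    # This applies to the whole process, so to prioritize only the database
    # writes you would need to do them from a dedicated child process.
    p = psutil.Process()
    p.ionice(psutil.IOPRIO_CLASS_BE, value=0)  # best-effort class, highest level
    # The real-time class (psutil.IOPRIO_CLASS_RT) requires root.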

If your underlying filesystem is some sort of crappy flash, then you can also configure SQLite's page size (default 1 KB) to match your filesystem's block size. This will likely help a little.
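
A sketch of matching the page size, assuming a 4 KB flash block size; note that page_size only takes effect on an empty database or after a VACUUM, and cannot be changed once the database is in WAL mode:

    conn.execute("PRAGMA page_size=4096")  # before any tables are created...
    conn.execute("VACUUM")                 # ...or rebuild an existing database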


Depending on how your DB is indexed, you might get much better performance from a larger page cache. You can change it by executing:

cursor.execute("PRAGMA cache_size=200000")

I believe that in most cases that will give you a 200 MB cache (cache_size is measured in pages, 1 KB each by default, so it depends on your page size); you might want to adjust it if you have more or less RAM available.
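
As an aside, SQLite 3.7.10 and later also accept a negative value, interpreted as KiB, which makes the setting independent of page size:

    cursor.execute("PRAGMA cache_size=-200000")  # about 200 MB regardless of page size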


Why don't you try logging the insert to a separate (persistent) file and only then making the database commit? That gives you recoverability without having to mess with sqlite at all: you can make your INSERTs asynchronous if they're journaled beforehand.
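
A minimal sketch of that idea, with a hypothetical journal_insert helper and JSON-serializable rows assumed; the fsync makes the record durable before the (now asynchronous) real INSERT runs:

    import json
    import os

    def journal_insert(record, path="inserts.journal"):  # hypothetical helper
        """Durably append one record before the real INSERT is attempted."""
        data = (json.dumps(record) + "\n").encode()
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        try:
            os.write(fd, data)
            os.fsync(fd)  # the record survives a crash from this point on
        finally:
            os.close(fd)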
