SQLite's test code to production code ratio

SQLite claims to have 679 times as much test code as production code. http://www.sqlite.org/testing.html

Does anyone know how this is possible? Do they generate any test code automatically? What are the major parts of these "45678.3 KSLOC" of test code?

"Does anyone knows how it is possible?"

"It is possible" to have 679 times as much test code because a single feature can be used in many different ways. Consider just a single function that takes two parameters. I can generate alot of test code for that one function that tests boundary conditions and many other combinations of conditions. When you consider setup/teardown of the tests, there is additional code there. Depending on their testing framework this overhead may significantly add to the amount of code in testing.

What it really boils down to is the fact that a piece of software can be used in so many different ways, which means that you have many different scenarios to test for. This is the beauty of elegant software: a simple program can be applied to numerous scenarios. But that is the same thing that makes verifying and testing software so challenging.

It's presumably possible if the developers spent 679 times as much time writing test code as they spent writing production code. Just think: if they'd opted instead for 339 times as much test code, they could have had two entire database engines, each still with a ludicrous amount of test coverage.

I once watched a fellow developer trying to placate a furious customer about slipped deadlines by informing them that he had written 5 times as much test code as production code. The customer was not placated, if you can imagine. At least I don't think 5X coverage is extreme anymore.

It uses Tcl to power the test framework so it's much easier to write tests than it is to write the implementation. This encourages thorough testing, which is what you want in a database, yes? Moreover, a fair fraction of those tests are proprietary, aimed at testing in embedded environments; I imagine some corporate user (or users) paid for that sort of thing. It's also quite possible that the same feature is tested multiple times.

Looking at section 3.1 (OOM):

OOM testing is accomplished by simulating OOM errors. SQLite allows an application to substitute an alternative malloc() implementation using the sqlite3_config(SQLITE_CONFIG_MALLOC,...) interface. The TCL and TH3 test harnesses are both capable of inserting a modified version of malloc() that can be rigged to fail after a certain number of allocations. These instrumented mallocs can be set to fail only once and then start working again, or to continue failing after the first failure. OOM tests are done in a loop. On the first iteration of the loop, the instrumented malloc is rigged to fail on the first allocation. Then some SQLite operation is carried out and checks are done to make sure SQLite handled the OOM error correctly. Then the time-to-failure counter on the instrumented malloc is increased by one and the test is repeated. The loop continues until the entire operation runs to completion without ever encountering a simulated OOM failure. Tests like this are run twice, once with the instrumented malloc set to fail only once, and again with the instrumented malloc set to fail continuously after the first failure.
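The loop described in that passage can be sketched roughly as follows. This is a Python stand-in under stated assumptions, not SQLite's TCL or TH3 harness: `FailingAllocator`, `operation`, and `oom_loop` are hypothetical names, and the "operation" is just three allocations in a row.

```python
class FailingAllocator:
    """Stand-in for an instrumented malloc(): fails starting at the Nth allocation."""

    def __init__(self, fail_at: int, persistent: bool):
        self.fail_at = fail_at        # 1-based index of the first allocation to fail
        self.persistent = persistent  # keep failing after the first failure?
        self.calls = 0
        self.failures = 0

    def alloc(self, size: int) -> bytearray:
        self.calls += 1
        if self.persistent:
            should_fail = self.calls >= self.fail_at
        else:
            should_fail = self.calls == self.fail_at
        if should_fail:
            self.failures += 1
            raise MemoryError("simulated OOM")
        return bytearray(size)


def operation(allocator: FailingAllocator) -> int:
    """Hypothetical operation under test: needs three allocations to finish."""
    buffers = [allocator.alloc(16) for _ in range(3)]
    return len(buffers)


def oom_loop(persistent: bool) -> int:
    """Raise the time-to-failure counter until the operation completes cleanly.

    Returns the number of iterations that hit a simulated OOM.
    """
    fail_at = 1
    oom_runs = 0
    while True:
        allocator = FailingAllocator(fail_at, persistent)
        try:
            result = operation(allocator)
        except MemoryError:
            # A real harness would check here that nothing leaked or was corrupted.
            oom_runs += 1
        else:
            if allocator.failures == 0:
                assert result == 3  # ran to completion with no simulated failure
                return oom_runs
            oom_runs += 1  # the operation absorbed the failure internally
        fail_at += 1


if __name__ == "__main__":
    # Run twice, as the quoted text describes: fail once, then fail continuously.
    for persistent in (False, True):
        print("persistent" if persistent else "one-shot", oom_loop(persistent))
```

Even this toy operation needs one iteration per allocation in each mode, which hints at how a single real SQL operation with hundreds of allocations multiplies into a very large number of test runs.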

Note that section 7 explicitly states 100% core coverage as determined by gcov. I agree with Donal Fellows that the test framework is largely responsible for the test coverage beyond what a call graph would suggest. It's a very different thing to see malloc() entered nn times and write a test for it than it is to write dozens of tests geared to simulate environments where malloc() is likely to fail.

Yes, the resulting coverage is an artifact of diligence; however, so is the selection of a test framework that enables that kind of diligence.

Finally, to reiterate the obvious, malloc() takes only a single size argument and returns a void pointer. This suggests that the tests written around it are there by deliberate design, not automatically generated.