I was thinking back to my freshman year at college (five years ago) when I took an exam to place out of intro-level computer science. There was a question about loop invariants, and I was wondering whether loop invariants were really necessary in this case or whether the question was simply a bad example... The question was to write an iterative definition of a factorial function, and then to prove that the function was correct.
The code that I provided for the factorial function was as follows:
public static int factorial(int x)
{
    if (x < 0) {
        throw new IllegalArgumentException("Parameter must be >= 0");
    } else if (x == 0) {
        return 1;
    } else {
        int result = 1;
        for (int i = 1; i <= x; i++) {
            result *= i;
        }
        return result;
    }
}
My own proof of correctness was a proof by cases, and in each case I asserted that it was correct by definition (x! is undefined for negative values, 0! is 1, and x! is 1*2*3...*x for a positive value of x). The professor wanted me to prove the loop using a loop invariant; however, my argument was that it was correct "by definition", because the definition of "x!" for a positive integer x is "the product of the integers from 1... x", and the for-loop in the else clause is simply a literal translation of this definition. Is a loop invariant really needed as a proof of correctness in this case? How complicated must a loop be before a loop invariant (and proper initialization and termination conditions) become necessary for a proof of correctness?
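For concreteness, the sort of invariant-annotated version I believe the professor was after would look something like this (my own paraphrase, with the invariant, initialization, and termination spelled out as comments; the method name is just for illustration):

public static int factorialWithInvariant(int x)
{
    if (x < 0) {
        throw new IllegalArgumentException("Parameter must be >= 0");
    }
    int result = 1;                       // Initialization: result == 0! == 1
    for (int i = 1; i <= x; i++) {
        // Invariant (top of each iteration): result == (i - 1)!
        result *= i;
        // Now result == i!, so the invariant holds again once i is incremented
    }
    // Termination: the loop exits with i == x + 1, hence result == x!
    return result;
}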
Additionally, I was wondering... how often are such formal proofs used in the industry? I have found that about half of my courses are very theoretical and proof-heavy and about half are very implementation and coding-heavy, without any formal or theoretical material. How much do these overlap in practice? If you do use proofs in the industry, when do you apply them (always, only if it's complicated, rarely, never)?
Edit
If we, ourselves, are convinced that a piece of code is correct, can convince others (informally) that it is correct, and there are unit tests in place, to what extent are formal proofs of correctness needed?
The professor wanted me to prove the loop using a loop invariant;
Your professor wanted to make sure you understood loop invariants, not just prove something about a very simple function.
Is a loop invariant really needed as a proof of correctness in this case?
Well, technically, no. By that reasoning, you don't need to write a factorial function, either: just use a library function! But that's not the point of the exercise.
How complicated must a loop be before a loop invariant (and proper initialization and termination conditions) become necessary for a proof of correctness?
I know some smart people who can probably prove just about anything without invariants, and then there are people who need to use them even for trivial cases like the one above. That's like asking "how heavy does a rock have to be before you need a wheelbarrow to move it?".
Additionally, I was wondering... how often are such formal proofs used in the industry?
Written out explicitly? Probably rarely, unless you're in certain industries. But I still think about them when writing any but the most simple loop.
It's kind of like how I don't diagram sentences, but that doesn't mean I never think about grammar, especially if I'm writing some text that's really important. I can tell you what my pronoun's antecedent is, even though I'd never bother to put that fact on paper.
to what extent are formal proofs of correctness needed?
It depends, of course, but I think it's important for programmers to know how to write code that is not prone to errors, code that tends to be correct by construction.
One example is the concept of "look-ahead", such as in parsing, where the next token of input is not "read", then "looked at", and then possibly "put back" if it is not what is wanted, but rather "looked at" and then possibly "accepted" if it is what is wanted. When writing loops to cycle through database records and extract subtotals, for example, this simple change in perspective can result in much simpler and more reliable code.
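As a rough illustration (the class and method names here are mine, not from any particular parsing library), the "look at, then possibly accept" style can be wrapped up like this:

import java.util.Iterator;
import java.util.List;

// Hypothetical look-ahead wrapper: the current token is always visible,
// and it is only consumed when we decide to accept it.
class TokenStream {
    private final Iterator<String> it;
    private String current;               // the token we are "looking at"

    TokenStream(List<String> tokens) {
        this.it = tokens.iterator();
        advance();
    }

    String peek() {                       // look without consuming
        return current;
    }

    boolean accept(String expected) {     // consume only if it is what we want
        if (expected.equals(current)) {
            advance();
            return true;
        }
        return false;                     // nothing to "put back": we never read it
    }

    private void advance() {
        current = it.hasNext() ? it.next() : null;
    }
}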
Another example is differential execution, a technique I stumbled on many years ago. It appears to allow any algorithm to be incrementally re-executed, so as to incrementally update its results. I use it extensively in user interfaces whose contents can change dynamically. For a long time I felt that it worked in all cases, but couldn't be sure, until I finally proved it; the proof is at the bottom of my Wikipedia page. After that, I knew that if I stuck to some simple constraints, I could rely on it to work, no matter how much code depended on it.
At the same time, we may have utmost confidence in the correctness of some algorithm, but find it very difficult to formally prove, because our proof techniques are poor. Consider the lowly bubble-sort. It obviously works, but try to prove it formally, by applying rules to source code. I've done it, but it is not easy. I haven't tried more advanced sorting algorithms.
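For illustration, here is the lowly bubble-sort with the invariants a formal proof would have to establish written out as comments (my annotations, which are of course not themselves a proof):

// Bubble sort, with the key invariants noted as comments.
public static void bubbleSort(int[] a) {
    for (int i = a.length - 1; i > 0; i--) {
        // Outer invariant: a[i+1 .. a.length-1] is sorted and holds the
        // largest elements of the array, already in their final positions.
        for (int j = 0; j < i; j++) {
            // Inner invariant: a[j] is the maximum of a[0 .. j].
            if (a[j] > a[j + 1]) {
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
            }
        }
    }
}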
When you're solving hard problems and writing code that will be reused long after you've moved on, you (should) go through the process of proving the correctness of every routine you write, every day. Test driven development is a formalization of that idea, but the core of it is: you need to prove at least to yourself and preferably to others (code review!) that the code you've written will handle all possible inputs and paths in an appropriate manner.
Do we bicker about code invariants? No. Do we grade papers before you can check in? Sort of. If the team isn't comfortable with your code or your "proof", you go back to your box to fix it until it passes review.
In recent years "test driven development" under various names has been the furthest most people bother to go with reasoning about their code. It's rather like very careful and repeatable experimentation, versus logical reasoning. Science versus mathematics!
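To make the "careful, repeatable experimentation" idea concrete, here is roughly what it looks like for the factorial function from the question (a sketch using JUnit 5; the Factorial class name is an assumption):

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

class FactorialTest {
    @Test
    void handlesBaseCaseAndTypicalValues() {
        assertEquals(1, Factorial.factorial(0));
        assertEquals(1, Factorial.factorial(1));
        assertEquals(120, Factorial.factorial(5));
    }

    @Test
    void rejectsNegativeInput() {
        assertThrows(IllegalArgumentException.class, () -> Factorial.factorial(-1));
    }
}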
There is some use of pre-conditions, post-conditions and loop/class invariants in languages like Eiffel, and the forthcoming "contracts" support in .NET 4.0 may help to popularise these ideas further.
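Java has no built-in contract syntax, but the flavour of pre- and post-conditions can be approximated with plain assert statements (a rough sketch; assertions only run when enabled with -ea, and the post-condition here is deliberately weak):

// Approximating a pre-condition and a post-condition with asserts.
public static int factorialChecked(int x) {
    assert x >= 0 : "pre-condition: x must be non-negative";
    int result = 1;
    for (int i = 1; i <= x; i++) {
        result *= i;
    }
    assert result >= 1 : "post-condition: x! is at least 1";
    return result;
}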
Personally, I use assertions pretty infrequently these days; when I'm looping through a structure I usually don't write it as a loop any more. I write it as a query, e.g. LINQ in C# or similar facilities in other languages like JavaScript. So there is less imperative state manipulation to get wrong (usually there isn't any), and any assertion about the results would be redundant, as it would simply restate the conditions in the query: in the query approach, you describe the results you want.
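The same shift is available in Java via streams; compare the imperative accumulator with the query form (an illustrative example; the names are mine):

import java.util.List;

public class SumOfEvens {
    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);

        // Imperative: mutable accumulator state that an assertion would only restate.
        int total = 0;
        for (int n : numbers) {
            if (n % 2 == 0) {
                total += n;
            }
        }

        // Query style: the code simply describes the result that is wanted.
        int totalViaStream = numbers.stream()
                                    .filter(n -> n % 2 == 0)
                                    .mapToInt(Integer::intValue)
                                    .sum();

        System.out.println(total + " " + totalViaStream);   // both print 12
    }
}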
This is not to say that I never use assertions, but I tend to use them in conjunction with a unit test, and only for very involved algorithms that perform some complicated "in-place" mutation of a collection. In such cases there is no "built-in" way to ask for the results I want; I have to write the algorithm imperatively (perhaps because it would be horribly expensive to copy the entire data structure), so I cover it with assertions to help my unit test flag up internal problems.