I was hesitating to ask this, since it seems very easy.
What is wrong in this pseudocode?
In the switching software (written in C), there was:
- a long "do... while" construct, which contained
- a "switch" statement, which contained
- an "if" clause, which contained
- a "break," which was intended for the "if" clause
- but instead broke from the "switch" statement.
This caused a crash of the telephone system in 1990 (See: http://users.csc.calpoly.edu/~jdalbey/SWE/Papers/att_collapse.html).
I need a very simple explanation of why this code is wrong. I think the simplest answer is that a break is not possible within an if clause? So what statement needs to be written instead of a break within an if clause to get the intended effect, which is leaving the if clause?
I suspect that the description / pseudo-code is incorrect when it says:
a "break," which was intended for the "if" clause
It would make sense if that was meant to be:
- a break, which was intended to terminate the do while loop
The problem description then makes sense.
    do
    {
        ...
        switch (...)
        {
        case ...:
            ...
            break;
        ...
        case ...:
            ...
            if (critical_condition())
                break;  // Intended to exit loop - actually exits switch only
            ...
            break;      // Terminates the case in the switch
        }
    } while (!time_to_stop());
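To see where that break actually lands, here is a minimal sketch. The names run_loop and struct trace are made up for illustration and are not from the original software. It counts how often the code after the switch runs, which would be zero for the critical case if the break really left the loop.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping, only for this illustration. */
struct trace {
    int iterations;    /* how many times the loop body started */
    int after_switch;  /* how many times the code after the switch ran */
};

struct trace run_loop(bool critical, int max_iterations)
{
    struct trace t = {0, 0};

    do {
        t.iterations++;
        switch (1) {
        case 1:
            if (critical)
                break;      /* leaves the switch, not the do...while */
            break;          /* ordinary end of the case */
        }
        t.after_switch++;   /* still reached when critical is true */
    } while (t.iterations < max_iterations);

    return t;
}
```

Calling run_loop(true, 3) goes round the loop three times and runs the statement after the switch each time, so the inner break had no effect on the loop.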
Reading the URL referenced in the question, the pseudo-code there is:
In pseudocode, the program read as follows:
     1  while (ring receive buffer not empty and side buffer not empty) DO
     2    Initialize pointer to first message in side buffer or ring receive buffer
     3    get copy of buffer
     4    switch (message)
     5      case (incoming_message):
     6        if (sending switch is out of service) DO
     7          if (ring write buffer is empty) DO
     8            send "in service" to status map
     9          else
    10            break
                END IF
    11        process incoming message, set up pointers to optional parameters
    12        break
          END SWITCH
    13    do optional parameter work
When the destination switch received the second of the two closely timed messages while it was still busy with the first (buffer not empty, line 7), the program should have dropped out of the if clause (line 7), processed the incoming message, and set up the pointers to the database (line 11). Instead, because of the break statement in the else clause (line 10), the program dropped out of the case statement entirely and began doing optional parameter work which overwrote the data (line 13). Error correction software detected the overwrite and shut the switch down while it couls [sic] reset. Because every switch contained the same software, the resets cascaded down the network, incapacitating the system.
This agrees with my hypothesis - the pseudo-code in the question is an incorrect characterization of the pseudo-code in the paper.
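As a rough C rendering of the paper's pseudo-code (not the real 4ESS source), the sketch below puts the buggy shape next to one possible repair. Everything in it is an assumption made for illustration: the function names, the struct outcome flags, and the reading that the if on line 7 is nested inside the if on line 6. The flags record which of lines 8, 11 and 13 ran.

```c
#include <stdbool.h>

/* Hypothetical record of which pseudo-code lines were executed. */
struct outcome {
    bool status_sent;         /* line 8  */
    bool message_processed;   /* line 11 */
    bool optional_work_done;  /* line 13 */
};

enum message_kind { INCOMING_MESSAGE, OTHER_MESSAGE };

/* Shape of the reported defect: the break on line 10 skips line 11. */
struct outcome handle_buggy(enum message_kind kind,
                            bool sender_out_of_service,
                            bool write_buffer_empty)
{
    struct outcome r = {false, false, false};

    switch (kind) {
    case INCOMING_MESSAGE:
        if (sender_out_of_service) {
            if (write_buffer_empty)
                r.status_sent = true;     /* line 8 */
            else
                break;                    /* line 10: exits the switch */
        }
        r.message_processed = true;       /* line 11 */
        break;                            /* line 12 */
    default:
        break;
    }
    r.optional_work_done = true;          /* line 13 */
    return r;
}

/* One possible repair: no break is needed to "leave" an if. */
struct outcome handle_fixed(enum message_kind kind,
                            bool sender_out_of_service,
                            bool write_buffer_empty)
{
    struct outcome r = {false, false, false};

    switch (kind) {
    case INCOMING_MESSAGE:
        if (sender_out_of_service && write_buffer_empty)
            r.status_sent = true;         /* line 8 */
        r.message_processed = true;       /* line 11 always runs */
        break;                            /* line 12 */
    default:
        break;
    }
    r.optional_work_done = true;          /* line 13 */
    return r;
}
```

With the sender out of service and a non-empty write buffer, handle_buggy does the optional parameter work without having processed the message, which is the overwrite scenario the paper describes. handle_fixed processes the message first.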
Another reference on the same subject (found via a Google search 'att crash 1990 4ess') says:
Error Description
What was reported in ACM's Software Engineering Notes [Reference 2] is that the software defect was traced to an elementary programming error, which is described as follows:
In the offending "C" program text there was a construct of the form: [Erratic indentation as in original]
    /* ``C'' Fragment to Illustrate AT&T Defect */
    do {
        switch expression {
            ...
            case (value):
                if (logical) {
                    sequence of statements
                    break
                } else {
                    another sequence of statements
                }
                statements after if...else statement
        }
        statements after case statement
    } while (expression)
    statements after do...while statement
Programming Mistake Described
The mistake is that the programmer thought that the break statement applied to the if statement in the above passage, was clearly never exercised. If it had been, then the testers would have noticed the abnormal behavior and would have been able to corr [sic]
The only caveat to this statement is the following: it is possible that tests applied to the code contain information which would reveal the error; however, if the testers do not examine the output and notice the error, then the deficiency is not with th [sic]
In the case of a misplaced break statement, it is very likely that the error would have been detected.
References
"Can We Trust Our Software?", Newsweek, 29 January 1990.
ACM SIGSOFT, Software Engineering Notes, Vol. 15, No. 2, Page 11ff, April 1990.
Apparently, the programmer really did just think that break would end the if statement; it was a small mental blackout that led to a large real-world blackout.
If I understand it right, the else block where the offending break statement occurs is merely part of that "one line bug", as it was called earlier [1]. I don't see any good reason for that else to exist there, unless those "certain types of messages" that received optimization were thought to be the only case of a non-empty buffer while processing a message. The description you linked omits a good deal of domain knowledge, without which I, at least, cannot fully understand that piece of code. I'll try to give an explanation anyway.
As break statements can only refer to a switch or a loop, I can assume that:
Hypothesis #1
the original coder intended to "speed processing of certain types of messages" by cutting the while statement short with such a break. However, the nesting misled him and caused him to overlook that the break would affect the switch statement, not the while.
Hypothesis #2
the original coder really intended to end the switch statement quickly, but put that break too early and forgot to update the pointers to optional parameters first, e.g. to mark somehow that no optional parameters were provided with the current message.
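If hypothesis #1 is what was meant, C offers no break that reaches past the switch, so leaving the enclosing loop needs something else. Below is a sketch of two common options, a flag tested by the loop condition and a goto to a label after the loop. The message codes and function names are invented for the example.

```c
#include <stddef.h>

/* Hypothetical message codes for the illustration. */
enum { MSG_DATA = 0, MSG_STOP = 1 };

/* Option 1: set a flag inside the switch and test it in the loop condition.
   Returns the number of data messages handled before the stop message. */
size_t handle_until_stop_flag(const int *messages, size_t count)
{
    size_t handled = 0;
    size_t i = 0;
    int done = 0;

    while (!done && i < count) {
        switch (messages[i]) {
        case MSG_STOP:
            done = 1;   /* ask the loop to stop */
            break;      /* this break only leaves the switch */
        default:
            handled++;
            break;
        }
        i++;
    }
    return handled;
}

/* Option 2: jump to a label placed directly after the loop. */
size_t handle_until_stop_goto(const int *messages, size_t count)
{
    size_t handled = 0;

    for (size_t i = 0; i < count; i++) {
        switch (messages[i]) {
        case MSG_STOP:
            goto out;   /* leaves both the switch and the for loop */
        default:
            handled++;
            break;
        }
    }
out:
    return handled;
}
```

Both variants stop at the first MSG_STOP. Which one reads better is a matter of taste and house style. Moving the loop into its own function and using return is a third option.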
[1] I would thus call it a "two-line bug".