How Big Should main() Be, in C? [closed]_问答_开发者

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. 开发者_C百科 Closed 12 years ago.

I'm learning a little C over the holiday weekend, and I started to look at other programs written in C. I ended up looking at GNU Netcat, thinking it would be a good example.

I was a bit shocked to see a 600 line main() function. Is this normal? If it is normal is this considered good C coding practices?

There's a quote of an American President (Lincoln?) who was asked how long a man's legs should be. "Long enough to reach from his body to the ground," he said.

Getting back on topic:

Authors of books like "Clean Code" advertise that every function do only one thing (that's grossly simplified by me here), so in theory your main() should maybe call an initialization function and then another function orchestrating the work of the application, and that's all.

In practice, many programmers find a lot of tiny functions irritating. A perhaps more useful metric is that a function should usually fit on one screen, if only to make it easier to see and think about.

If a program is complex and most of its functionality is in main(), someone hasn't done a decent job of breaking the problem down. Essentially you should strive for manageability, understandability and readability. There's usually no good reason for a main() to be huge.

I often find on certain kinds of applications that main() has hundreds of lines of initialization followed by about 20 lines of top-level loop.

It's my habit not to break functions out until I need to call them twice. This sometimes leads to me writing a 300 line function, but as soon as I see the same block occur twice I break that block out.

As for main, initialization routines are often once so 600 lines does not sound unreasonable.

A 600 line main is a bit of a warning sign. But if you look at it and can't see any way of breaking it up into smaller pieces other than to do this.

void the_first_part_of_main(args...);
void the_second_part_of_main(args...);
...

main()
{
   the_first_part_of_main();
   the_second_part_of_main();
   ...
}

Then you should leave it alone.

Regardless of the language, I would try and restrict a subroutine method to roughly what's visible in one page of code, and extract functionality to subroutines wherever possible.

600 lines sounds quite long for any implementation. Perhaps there's some overriding reason wrt. passing arguments around and clarity (I've not looked at the example you've posted) but it sounds to be at the far end of what's normally practised, and it should be possible to subdivide this task.

I suspect it's been developed by continual incremental addition of functionality over the years, and nobody has stopped and refactored this to be more readable/maintainable. If there are no unit tests for this (and in my experience main() methods don't often get written tests - for whatever reasons) then there's going to be an understandable reluctance to refactor it.

The limit is the editor window...

Heh, that's horrible but I've seen worse. I've seen large, multi-thousand line fortran programs with no subroutines at all.

I believe the answer is: it should fit in an editor window and it should have low cyclomatic complexity.

If a main program is just a series of function calls or computations, then I suppose it could be as long as necessary and it could have an exemption from the editor window constraint. Even then, I would be a bit surprised that there was not a natural way to extract meaningful discrete methods.

But if it is testing and branching and returning and breaking and continue-ing then it needs to be broken up into individual and individually tested functional components.

Hopefully they are planning on refactoring. This looks very rough.

  443   while (optind < argc) {
  444     const char *get_argv = argv[optind++];
  445     char *q, *parse = strdup(get_argv);
  446     int port_lo = 0, port_hi = 65535;
  447     nc_port_t port_tmp;
  448 
  449     if (!(q = strchr(parse, '-')))    /* simple number? */
  450       q = strchr(parse, ':');     /* try with the other separator */
  451 
  452     if (!q) {
  453       if (netcat_getport(&port_tmp, parse, 0))
  454   netcat_ports_insert(old_flag, port_tmp.num, port_tmp.num);
  455       else
  456   goto got_err;
  457     }
  458     else {        /* could be in the forms: N1-N2, -N2, N1- */
  459       *q++ = 0;
  460       if (*parse) {
  461   if (netcat_getport(&port_tmp, parse, 0))
  462     port_lo = port_tmp.num;
  463   else
  464     goto got_err;
  465       }
  466       if (*q) {
  467   if (netcat_getport(&port_tmp, q, 0))
  468     port_hi = port_tmp.num;
  469   else
  470     goto got_err;
  471       }
  472       if (!*parse && !*q)     /* don't accept the form '-' */
  473   goto got_err;
  474 
  475       netcat_ports_insert(old_flag, port_lo, port_hi);
  476     }
  477 
  478     free(parse);
  479     continue;
  480 
  481  got_err:
  482     free(parse);
  483     ncprint(NCPRINT_ERROR, _("Invalid port specification: %s"), get_argv);
  484     exit(EXIT_FAILURE);
  485   }

By some standards a 600 line function of any sort is a bad idea, but there is no reason main should be treated any differently to any other function.

The only reason I can think of such a situation arising is where a program is developed rapidly, and as it grows no one ever bothers to split it up into more logical units.

As short as possible. Usually, whenever there's an operation I can assign a name to, I create a new method for it.

I would say your routines should be as long/short as necessary to be effective and reliably and automatically tested. A 600-statement routine likely has multiple paths through it and the combinations of routines might get very large very quickly. I try to break down functions into someting that make it easily readable. Functions are either "functional" or "narrative." All the while including unit tests.

My personal coding style is to try and only use the the main function for command-line argument parsing, and any big-ticket initialization that the program needs.

Nearly all 600-line functions I have seen were also stupidly written. This doesn't have to be so.

However, in these cases it was just a failure of a programmer to present some zoomed-out view, and give meaningful names to sections - both high-level (like, Initialize()) and low-level (something that takes a common 3-line pattern and hides it under one name, with parameters).

In cases of extreme stupidity, they were optimizing function call performance when it was not required.

main(), like any function, should be exactly as big as it needs to be. "As it needs to be" will vary a lot depending on what it needs to do. Having said that, it shouldn't have to be more than a couple of hundred lines in most cases. 600 lines is a bit on the hefty side, and some of that could / should probably be refactored into separate functions.

For an extreme example, one team I was on was tasked with speeding up some code to drive a 3d display. The code was originally written by a wirehead who was obviously taught himself programming using old-school FORTRAN; main() was over five thousand lines of code, with random bits #includeed here and there. Instead of breaking code out into functions, he'd simply branch to a subroutine within main() via goto (somewhere between 13 and 15 gotos, branching both directions seemingly at random). As a first step we simply turned on level 1 optimization; the compiler promptly swallowed up all available memory and swap space and panicked the kernel. The code so brittle that we couldn't make any changes without breaking something. We finally told the customer they had two choices: allow us to rewrite the entire system from scratch or buy faster hardware.

They bought faster hardware.