开发者

Programming style question on how to code functions

开发者 https://www.devze.com 2022-12-27 02:51 出处:网络
So, I was just coding a bit today, and I realized that I don\'t have much consistency when it comes to a coding style when programming functions. One of my main concerns is whether or not its proper t

So, I was just coding a bit today, and I realized that I don't have much consistency when it comes to a coding style when programming functions. One of my main concerns is whether or not its proper to code it so that you check that the input of the user is valid OUTSIDE of the function, or just throw the values passed by the user into the function and check if the values are valid in there. Let me sketch an example:

I have a function that lists hosts based on an environment, and I want to be able to split the environment into chunks of hosts. So an example of the usage is this:

listhosts -e testenv -s 2 1

This will get all the hosts from the "testenv", split it up into two parts, and it is displaying part one.

In my code, I have a function that you pass it in a list, and it returns a list of lists based on you parameters for splitting. BUT, before I pass it a list, I first verify the parameters in my MAIN during the getops process, so in the main I check to make sure there are no negatives passed by the user, I make sure the user didnt request to split into say, 4 parts, but asking to display part 5 (which would not be valid), etc.

tl;dr: Would you check the validity of a users input the flow of you're M开发者_StackOverflowAIN class, or would you do a check in your function itself, and either return a valid response in the case of valid input, or return NULL in the case of invalid input?

Obviously both methods work, I'm just interested to hear from experts as to which approach is better :) Thanks for any comments and suggestions you guys have! FYI, my example is coded in Python, but I'm still more interested in a general programming answer as opposed to a language-specific one!


Good question! My main advice is that you approach the problem systematically. If you are designing a function f, here is how I think about its specification:

  1. What are the absolute requirements that a caller of f must meet? Those requirements are f's precondition.

  2. What does f do for its caller? When f returns, what is the return value and what is the state of the machine? Under what circumstances does f throw an exception, and what exception is thrown? The answers to all these questions constitute f's postcondition.

    The precondition and postcondition together constitute f's contract with callers. Only a caller meeting the precondition gets to rely on the postcondition.

  3. Finally, bearing directly on your question, what happens if f's caller doesn't meet the precondition? You have two choices:

    • You guarantee to halt the program, one hopes with an informative message. This is a checked run-time error.

    • Anything goes. Maybe there's a segfault, maybe memory is corrupted, maybe f silently returns a wrong answer. This is an unchecked run-time error.

    Notice some items not on this list: raising an exception or returning an error code. If these behaviors are to be relied upon, they become part of f's contract.

Now I can rephrase your question:

What should a function do when its caller violates its contract?

In most kinds of applications, the function should halt the program with a checked run-time error. If the program is part of an application that needs to be reliable, either the application should provide an external mechanism for restarting an application that halts with a checked run-time error (common in Erlang code), or if restarting is difficult, all functions' contracts should be made very permissive so that "bad input" still meets the contract but promises always to raise an exception.

In every program, unchecked run-time errors should be rare. An unchecked run-time error is typically justified only on performance grounds, and even then only when code is performance-critical. Another source of unchecked run-time errors is programming in unsafe languages; for example, in C, there's no way to check whether memory pointed to has actually been initialized.

Another aspect of your question is

What kinds of contracts make the best designs?

The answer to this question varies more depending on the problem domain. Because none of the work I do has to be high-availability or safety-critical, I use restrictive contracts and lots of checked run-time errors (typically assertion failures). When you are designing the interfaces and contracts of a big system, it is much easier if you keep the contracts simple, you keep the preconditions restrictive (tight), and you rely on checked run-time errors when arguments are "bad".

I have a function that you pass it in a list, and it returns a list of lists based on you parameters for splitting. BUT, before I pass it a list, I first verify the parameters in my MAIN during the getops process, so in the main I check to make sure there are no negatives passed by the user, I make sure the user didnt request to split into say, 4 parts, but asking to display part 5.

I think this is exactly the right way to solve this particular problem:

  1. Your contract with the user is that the user can say anything, and if the user utters a nonsensical request, your program won't fall over— it will issue a sensible error message and then continue.

  2. Your internal contract with your request-processing function is that you will pass it only sensible requests.

  3. You therefore have a third function, outside the second, whose job it is to distinguish sense from nonsense and act accordingly—your request-processing function gets "sense", the user is told about "nonsense", and all contracts are met.

One of my main concerns is whether or not its proper to code it so that you check that the input of the user is valid OUTSIDE of the function.

Yes. Almost always this is the best design. In fact, there's probably a design pattern somewhere with a fancy name. But if not, experienced programmers have seen this over and over again. One of two things happens:

  • parse / validate / reject with error message

  • parse / validate / process

This kind of design has one data type (request) and four functions. Since I'm writing tons of Haskell code this week, I'll give an example in Haskell:

data Request -- type of a request
parse    :: UserInput -> Request            -- has a somewhat permissive precondition
validate :: Request -> Maybe ErrorMessage   -- has a very permissive precondition
process  :: Request -> Result               -- has a very restrictive precondition

Of course there are many other ways to do it. Failures could be detected at the parsing stage as well as the validation stage. "Valid request" could actually be represented by a different type than "unvalidated request". And so on.


I'd do the check inside the function itself to make sure that the parameters I was expecting were indeed what I got.

Call it "defensive programming" or "programming by contract" or "assert checking parameters" or "encapsulation", but the idea is that the function should be responsible for checking its own pre- and post-conditions and making sure that no invariants are violated.

If you do it outside the function, you leave yourself open to the possibility that a client won't perform the checks. A method should not rely on others knowing how to use it properly.

If the contract fails you either throw an exception, if your language supports them, or return an error code of some kind.


Checking within the function adds complexity, so my personal policy is to do sanity checking as far up the stack as possible, and catch exceptions as they arise. I also make sure that my functions are documented so that other programmers know what the function expects of them. They may not always follow such expectations, but to be blunt, it is not my job to make their programs work.


It often makes sense to check the input in both places.

In the function you should validate the inputs and throw an exception if they are incorrect. This prevents invalid inputs causing the function to get halfway through and then throw an unexpected exception like "array index out of bounds" or similar. This will make debugging errors much simpler.

However throwing exceptions shouldn't be used as flow control and you wouldn't want to throw the raw exception straight to the user, so I would also add logic in the user interface to make sure I never call the function with invalid inputs. In your case this would be displaying a message on the console, but in other cases it might be showing a validation error in a GUI, possibly as you are typing.


"Code Complete" suggests an isolation strategy where one could draw a line between classes that validate all input and classes that treat their input as already validated. Anything allowed to pass the validation line is considered safe and can be passed to functions that don't do validation (they use asserts instead, so that errors in the external validation code can manifest themselves).


How to handle errors depends on the programming language; however, when writing a commandline application, the commandline really should validate that the input is reasonable. If the input is not reasonable, the appropriate behavior is to print a "Usage" message with an explanation of the requirements as well as to exit with a non-zero status code so that other programs know it failed (by testing the exit code).

Silent failure is the worst kind of failure, and that is what happens if you simply return incorrect results when given invalid arguments. If the failure is ever caught, then it will most likely be discovered very far away from the true point of failure (passing the invalid argument). Therefore, it is best, IMHO to throw an exception (or, where not possible, to return an error status code) when an argument is invalid, since it flags the error as soon as it occurs, making it much easier to identify and correct the true cause of failure.

I should also add that it is very important to be consistent in how you handle invalid inputs; you should either check and throw an exception on invalid input for all functions or do that for none of them, since if users of your interface discover that some functions throw on invalid input, they will begin to rely on this behavior and will be incredibly surprised when other function simply return invalid results rather than complaining.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号