Bad inputs for unit testing on a text parser

Developer (开发者) https://www.devze.com 2022-12-09 14:34 Source: Internet

I would like to write unit tests for a text parser class, but I am not sure which bad inputs I need to give the test method to be sure it works in all possible cases.

The method uses some regular expressions plus string manipulation, but if I have understood the unit testing approach correctly, I should not concentrate on the method's implementation and should instead treat it as a black box. Right?

The method is

Public Function GetSymbol(ByVal symbol As String) As SymbolInfo

If the specified symbol is found, it returns an instance of SymbolInfo with all the needed data; otherwise it returns Nothing.

My problem now is creating the range of bad inputs for this method: there are so many possibilities that I have no idea how to cover them effectively.

Can you give me some hints? I am new to unit testing and have just started.

I don't believe that a pure Black Box approach is completely right.

My thinking would go like this:

A). Look at the inputs. Think of ways that a person could send in bad stuff. Examples:

Null reference
Empty string
Very long string
String with all digits
String with weird punctuation characters
String with unprintable characters or unusual character sets

In each case you need to know the expected outcome - that is the formal definition of what is "right". To this extent you are doing Black Box testing. You don't want to know what the code does, you want to know what it should do.
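To make the list in A) concrete, here is a minimal table-driven sketch. It is written in Python rather than VB.NET, and get_symbol, SymbolInfo, the known symbols and the "valid symbol" pattern are all hypothetical stand-ins for the real parser. It also assumes the expected outcome for every bad input is "return nothing"; your spec may instead call for an exception in some cases.

```python
import re
from typing import NamedTuple, Optional


class SymbolInfo(NamedTuple):
    name: str


# Hypothetical stand-in for the parser under test: it knows two symbols
# and returns None (the counterpart of VB's Nothing) for anything else.
_KNOWN = {"FOO": SymbolInfo("FOO"), "BAR": SymbolInfo("BAR")}
_VALID = re.compile(r"[A-Z][A-Z0-9_]*")  # assumed shape of a valid symbol


def get_symbol(symbol: Optional[str]) -> Optional[SymbolInfo]:
    if symbol is None or not _VALID.fullmatch(symbol):
        return None
    return _KNOWN.get(symbol)


# One entry per category of bad input from the list above.
BAD_INPUTS = {
    "null reference": None,
    "empty string": "",
    "very long string": "A" * 1_000_000,
    "all digits": "1234567890",
    "weird punctuation": "!@#$%^&*()",
    "unprintable characters": "FOO\x00\x07",
    "other character set": "s\u00edmbolo\u4e2d\u6587",
}


def run_bad_input_checks():
    """Return the labels of bad inputs that were NOT rejected."""
    return [label for label, value in BAD_INPUTS.items()
            if get_symbol(value) is not None]
```

Keeping the inputs in a labelled table means a failure tells you which category broke, and adding a new bad input is a one-line change.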

B). But now we open the box. You look at the code and look for the various branches it has. Do your tests exercise those branches? There are coverage tools to analyse your tests. Suppose that you had a special case: to take a silly example, the word MY-SPECIAL-SYMBOL is subject to some special processing, and there's an "If" for that. Your test cases need to include that word as an input. You often can only find such corner cases by looking at the code. (Yes, the spec should tell you, but the implementors often find interesting extras, so look at the code.)
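A small sketch of what B) means in practice, again in Python with a hypothetical get_symbol. The special-case branch here is invented to mirror the MY-SPECIAL-SYMBOL example; the point is only that the test set has one input per branch, plus a near miss of the special case.

```python
from typing import NamedTuple, Optional


class SymbolInfo(NamedTuple):
    name: str
    special: bool = False


_KNOWN = {"FOO"}  # hypothetical symbol table


def get_symbol(symbol: str) -> Optional[SymbolInfo]:
    # Branch you would only discover by reading the code:
    # one symbol gets special processing.
    if symbol == "MY-SPECIAL-SYMBOL":
        return SymbolInfo(symbol, special=True)
    if symbol in _KNOWN:
        return SymbolInfo(symbol)
    return None


def test_every_branch():
    # Special-case branch.
    assert get_symbol("MY-SPECIAL-SYMBOL") == SymbolInfo("MY-SPECIAL-SYMBOL", True)
    # Ordinary "found" branch.
    assert get_symbol("FOO") == SymbolInfo("FOO", False)
    # "Not found" branch, including a near miss of the special case.
    assert get_symbol("my-special-symbol") is None
    assert get_symbol("UNKNOWN") is None
```

A coverage tool would then confirm whether all three branches were actually hit.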

C). What about side effects? Suppose that this method was supposed to update a cache for each symbol it found. It's a formal requirement of the component, and yet the interface doesn't tell you that. Again you are going inside the Black Box. So here you might use mocks to let you verify that the component in turn passed the right data to the cache API.
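The cache scenario in C) could be checked roughly like this. The sketch uses Python's standard unittest.mock; the SymbolParser class, its constructor and the cache's store method are assumptions made up for the illustration, not part of the original method.

```python
from typing import Optional
from unittest.mock import Mock


class SymbolParser:
    """Hypothetical parser that records every symbol it finds in a cache."""

    def __init__(self, cache, known):
        self._cache = cache
        self._known = known

    def get_symbol(self, symbol: str) -> Optional[dict]:
        info = self._known.get(symbol)
        if info is not None:
            self._cache.store(symbol, info)  # the side effect under test
        return info


def test_cache_updated_only_on_hit():
    cache = Mock()  # stands in for the real cache
    parser = SymbolParser(cache, {"FOO": {"name": "FOO"}})

    # A found symbol must be passed on to the cache with the right data.
    assert parser.get_symbol("FOO") == {"name": "FOO"}
    cache.store.assert_called_once_with("FOO", {"name": "FOO"})

    # A missing symbol must not touch the cache at all.
    cache.reset_mock()
    assert parser.get_symbol("MISSING") is None
    cache.store.assert_not_called()
```

This only works if the cache can be handed to the parser from outside, so a side effect like this is also an argument for injecting the dependency.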

If you are using .NET you could use Pex to generate inputs for your unit tests. Even if you don't keep the generated tests themselves it would give you a good set of inputs to get you started.

As a general rule, start with values that are clearly outside the accepted range, such as Nothing (null), the empty string, strings containing whitespace characters, and strings containing separator characters. Once all those bad inputs are handled gracefully or raise the appropriate exception, you can start building bad inputs from your knowledge of what good input looks like: for example, upper- and lowercase versions of valid input strings, strings including international characters, and so on. The optimal set of bad inputs depends heavily on the structure of what you parse.
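One way to build inputs from a known-good value, as described above, is a small helper that changes one thing at a time. This is only a Python sketch: near_miss_variants is a hypothetical helper, and the comma separator and the accented character are assumptions you would replace with whatever matters for your format.

```python
def near_miss_variants(good: str, separator: str = ","):
    """Build inputs that each differ from a known-good symbol in one respect."""
    return {
        "lowercase": good.lower(),
        "uppercase": good.upper(),
        "leading space": " " + good,
        "trailing newline": good + "\n",
        "embedded tab": good[:1] + "\t" + good[1:],
        "with separator": good + separator + good,
        "international character": good + "\u00e9",
    }


# Example: variants of the (hypothetical) valid symbol "Foo".
variants = near_miss_variants("Foo")
```

Whether each variant should be accepted or rejected is a question for the parser's specification, so every entry still needs its own expected outcome in the test.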