I have a function like:
// string is a null-terminated char array. Replace开发者_开发问答 all a in the string with b
void ReplaceCharInString(char *string, char a, char b)
{
// loop over the string char by char, to find all "a"s and replace them with "b"
}
I'm doing the defensive programming. The problem is the implementation replies on the client to really pass in an array of chars. If an address of a single char is passed in, the program definitely runs into a bad state(probably crashes). How do I check and avoid this? (I know if I pass in the std::string object, the problem goes away of course)
No, you cannot check this and have to trust the user of the function to pass an actual correctly null-terminated string.
This is also the way all the C standard library functions like strcpy()
, strlen()
, ... work.
No you can't check if you're running out of the allocated memory if the string is not null terminated.
As sth said C standard library functions like strcpy(), strlen(), also rely on the fact that the string is valid (null terminated).
... however, one solution could be Mudflap. It is costly (in term of performance) and is only valid with GCC.
Mudflap is a library that instruments all pointer/array operations. With this you will be able to check if a specific location is valid memory or not.
In fact the only reason I see for using Mudflap is if security is a very big issue for your application. But even in this case GCC provides a better alternative against buffer overflow (see -fstack-protector[-all]).
Best is to use std::string.
You really can't trust user-inputted data at all. Using std::string
is an improvement, since it prevents buffer overflows, but it's still not entirely safe. What's to stop the user from inputting a 10GB string that consumes all of your memory?
You really need to validate the input when you first receive it from the user, (whether it's coming from a socket, or through stdin.) For example, you can enforce a maximum number of bytes to read, or ensure that all input characters are in the ASCII range.
If you have to use C-strings, you can memset
the input buffer to 0, and make sure the last character in the buffer is always a NULL byte, to ensure that your string is properly null-terminated.
You avoid this by documenting your code properly. Nobody but a complete beginner at C (who should not be writing serious code) would ever take the address of a single char
variable (or even declare such a variable), so your fear is not very relevant.
One way in C99 to partly guard against this is to declare the function as:
void ReplaceCharInString(char string[static 2], char a, char b);
This means that string
must be a valid pointer to at least 2 chars, which would protect against the single-char-pointer issue you mentioned at the cost of making the function fail to support the empty string. But it still cannot protect you from incorrect malloc
, etc.
Writing safe C is really a matter of proper code documentation and auditing, and firing incompetent people (or better yet never hiring them to begin with). The language is designed to be light and efficient. If you go trying to put your own huge safety harness layers on it, you'll probably do a worse job than the people who designed higher-level languages, and then you might as well have just used a higher-level language to begin with.
With C++, you should both use function overloading and make sure no one will send you the address of a char.
void ReplaceCharInString(std::string & p_string, char p_a, char p_b)
{
for(size_t i = 0, iMax = p_string.size(); i < iMax; ++i)
{
// replace if needed : p_string[i]
}
}
void ReplaceCharInString(char & p_char, char p_a, char p_b)
{
// replace if needed : p_char
}
template<size_t size>
void ReplaceCharInString(char (& p_string)[size], char p_a, char p_b) ;
{
for(size_t i = 0; i < size; ++i)
{
// replace if needed : p_string[i]
}
}
With only those three functions (and no function taking char *
as its first parameter), no one will be able to call them with a raw char *
:
void foo()
{
char a[4] ;
std::string b ;
char c ;
char * d ;
// initialize a, b, c and d to some values
ReplaceCharInString(a, 'x', 'z') ; // compiles
ReplaceCharInString(b, 'x', 'z') ; // compiles
ReplaceCharInString(c, 'x', 'z') ; // compiles
ReplaceCharInString(d, 'x', 'z') ; // COMPILATION ERROR
}
So there's no way someone will call your function with the address of a single char.
Anyway, in C++, there are very few reasons to use char *
as a string (you would use instead a std::string
), and even less a char *
as an address to a single char (you would use instead a char &
), so no one will complain about the lack of support for char *
in the functions above.
The function just gets a pointer to somewhere in memory. It's defensive programming practice to check that it is not NULL. This is a usually value if the caller has "forgotten" to allocate the memory or some allocation failed.
There are more checks for invalid char arrays possible, but these are platform dependent.
At least you cannot distinguish between pointer to char array and pointer to char.
Your prototype is wrong. When you pass an array to a function, you always must pass the length of the array since inside the function you never know the size of the array (it is always the size of the pointer). So you should declare an int variable for the size of the array so that the declaration itself makes explicit the use of an array.
void ReplaceCharInString(char *string, int len, char a, char b)
This way you do not depend on reading up to NULL character of string input arg, and just use the length. If you can not do this, and you have to keep the declaration as is, then you should have an explicit contract on what the users of the function should do, to not have problems.Then the responsibility is delegated to the caller of the function to honor the contract
精彩评论