I am working on understanding recursion, and I think I have it down alright... I'm trying to build a search function (like the std::string.find()) that searches a given string for another string for example:
G开发者_高级运维iven (big) string: "ru the running cat"
search (small) string: "run"
I'm trying to return a index for the where the word i am searching for (in the case above it would be **7) the recursive method i have is as follows - i can't seem to get it to return the index properly.
calling the recursive function:
index = index_of(imput.c_str(), search.c_str())
Recursion method:
int index_of( const char * i, const char * s) {
int j;
if (*s == '\0') { return; }
if (*i == '\0') {
return NULL;
}
else if ( *i == *s ) {
index_of((i++), (s++));
}
else {
j += index_of((i++), s);
}
return j;
}
another foreseeable problem is that when (like in the example - OK it sucks but i need one that worked) it reaches the "ru_" it's still stuck on the ' ' (SPACE) -> how would i get it to 'reset' the s pointer?
- Don't change any state variables. Your code should not include the operator
++
anywhere. You are not trying to loop over a datastructure and change your local variables in some fashion - you are trying to generate an entirely new but smaller problem each time. So, all those++
operators - whether pre or post increment - are red flags. You have more than one sub-problem. (...so single function recursion isn't ideal).
Let's look at this systematically.
Assume you have a working
index_of
and you just want to call it with input that's shorter than yours, and that both haystack and needle aren't empty yet. Then one of two things may be:The haystack starts with the same letter as the needle, and you just need to look a little deeper to verify this.
- What happens if verification succeeds - what if it fails? is this anindex_of
subproblem?...or the haystack starts off wrong, and you need to look deeper.
- Is looking deeper into the haystack an index_of subproblem?
Notice that if the haystack starts OK it doesn't necessarily mean that it starts with the full search string - and if it starts OK but does not start with the full search string, you really don't need to continue looking. That means that the "starts-with" sub-problem is fundamentally different than the index-of sub-problem:
- index_of: here, failure to find the search string at index 0 means you need to look further.
- starts_with: here, failure to find the search string at index 0 means you need to stop.
It is possible to say
startswith(haystack, needle):= 0==index_of(haystack, needle)
, but obviouslyindex_of
is a more complicated problem with a more expensive solution - so you shouldn't do that; it's also more confusing at first.Identify your base cases - When either needle or haystack are empty, what does that mean?
Good names help clarify things, and recursion is hard to read at the best of times - Yacoby's reply for instance has some meaningful choices here.
Summary
I'm not going to solve your own puzzle for you, but a few tips in recap...
- Avoid changing your local variables: you're trying to define a subproblem and then just call the right function for those newer, shorter parameters. Don't use side-effects, they make things terribly complex.
- Recursion doesn't necessarily mean just one function
A
callingA
- it can be any cyclic call graph that eventually terminates; you may use more functions - If needle and haystack start with the same letter, that doesn't mean that the entire needle is at the start of the haystack - and if it is not, you still need to continue searching
- This line is wrong:
if (*s == '\0') { return 1; }
This is how I would do it, which is simplify everything by moving it into different functions. Untested but it should hopefully give you an idea.
/**
* Checks if one string starts with another string. This returns an
* incorrect result if it is called where prefix is an empty string.
*/
bool starts_with_impl(const char* haystack, const char* prefix) {
if ( *prefix == 0 ){
//reached the end of prefix without having found a difference in characters
return true;
}else if ( *haystack == 0 || *prefix != *haystack ){
//either prefix is longer than haystack or there is a difference in characters.
return false;
}
//move along the haystack and prefix by one character
return starts_with_impl(++haystack, ++prefix);
}
/**
* Wrapper around starts_with_impl that returns false if prefix is an empty string
*/
bool starts_with(const char* haystack, const char* prefix) {
return *prefix ? starts_with_impl(haystack, prefix) : false;
}
int get_substr_impl(const char* haystack, const char* const needle, int index) {
if ( *haystack == 0 ){
//reached the end of haystack with no match, -1 is no string found
return -1;
}else if ( starts_with(haystack, needle) ){
//we have found a substring match.
return index;
}
//move along haystack by one character
return get_substr_impl(++haystack, needle, ++index);
}
/**
* Wrapper function for the above to hide the fact we need an additional argument.
* I am avoiding using a default argument as it makes a messy api
*/
int get_substr(const char* haystack, const char* const needle) {
return get_substr_impl(haystack, needle, 0);
}
From the comments
you've got 2 const's in the get_substr_impl method....
Intentional.
// means that the data is constant, in other words I can't change the value of needle
const char* needle;
//means that as well as the data being constant
//I can't change the address that the pointer points to.
const char* const needle;
from the main method wouldn't i just call teh get_substr_impl with the same parameters given in get_substr?
I split it as get_substr_impl
has an additional (required) argument int index
that is required for the inner workings of the function and should always start at 0. While you could call get_substr_impl("abc", "a", 0);
, it looks nicer and is more understandable to call get_substr("abc", "a");
and it avoids errors (like calling get_substr_impl("abc", "a", 1);
)
First of all, you have two i
s in your code. That shouldn't even compile.
Also, index_of((i++), (s++));
is effectively the same as:
index_of(i, s);
++i;
++s;
In other words, you calling index_of over & over with the same parameters it was called with the first time. (it'll never return to get to the ++i).
Also, purely stylistic, but since the return type is an int, you should return 0, and save NULL for use with pointers.
The main problem here is that you are using post-increment, and result of (i++) is i . You have to use ++i
精彩评论