开发者

find the position of a string in another string [duplicate]

开发者 https://www.devze.com 2022-12-21 15:25 出处:网络
This question already has answers here: Closed 12 years ago. Possible Duplicate: substring algorithm Given two strings, A and B, ho开发者_高级运维w to find the first position of B in A?
This question already has answers here: Closed 12 years ago.

Possible Duplicate:

substring algorithm

Given two strings, A and B, ho开发者_高级运维w to find the first position of B in A? For instance, A = " ab123cdefgcde"; B= "cde" Then the first position of B in A is 5.

Is there any trick to solve this problem or just search A from the start?


You really must to scan A from the start.

There are good algorithms of fast substring search, e.g. http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm

There is also a standard function strstr:

strstr(A,B)

http://www.cplusplus.com/reference/clibrary/cstring/strstr/


See http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm


The problem that you are solving is the "exact string matching" problem. The naive solution runs in O(n^2) time, but you can do much better than that. Some linear-time algorithms to solve this problem are Knuth-Morris-Pratt (KMP), Boyer-Moore, and Apostolico-Giancarlo. Another way to solve it is by constructing a finite state automaton that enters an accepting state when it sees the pattern string. The best possible solution is O(n), and all those have that worst-case running time. You do have to scan the string from one end to the other; however, it is possible to skip a fraction of the characters (which Boyer-Moore and Apostolico-Giancarlo will do), since some mismatches can imply other mismatches.

If you need to code this yourself, I recommend you go with the Knuth-Morris-Pratt algorithm, since it is a little bit more intuitive and easier to implement than the other solutions I've mentioned. Most programming languges, though, have an "indexOf" or "find" function that will solve this for you.


The optimal way to solve this is by using the KMP algorithm: http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm


"Trick" is another word for algorithm, I guess. The most famous one is Knuth-Morris-Pratt.


In C programming you have function strstr to find the sub string position in the source String.


If you have a big string and you're going to search for many substrings,
it's good to have in mind the suffix array structure.
Basically, you create an array of pointers and you sort it.
Then you can locate any substring with a binary search.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号