开发者

How to determine whether a string is composed only of letters given by a second string

开发者 https://www.devze.com 2022-12-20 04:05 出处:网络
I have two strings.How can I determine whether the first string is composed only of letters given by the second string?

I have two strings. How can I determine whether the first string is composed only of letters given by the second string?

For example:

A = abcd
B = abbcc

should return false, since d is not in the second string.

A = aab
B = ab

should return true.

If the program most of the time returns false, how can I optimize this program? If it returns true most of the time, how can I op开发者_开发技巧timize it then?


Sort both strings. Then go through A, and have a pointer going through B. If the character in A is the same as what the B pointer points to, keep looking through A. If the character in A is later in the alphabet than what the B pointer points to, advance the B pointer. If the character in A is earlier in the alphabet than what the B pointer points to, return false. If you run out of A, return true.


[How do I] determine [if] the first string [is composed of characters that appear in] the second string?

Here's a quick algorithm:

  1. Treat the first and second strings as two sets of characters S and T.
  2. Perform the set difference S - T. Call the result U.

If U is nonempty, return false. Otherwise, return true.


Here is one simple way.

def isComposedOf(A, B):
    bset = set(B)
    for c in A:
        if c not in bset:
            return False
    return True

This algorithm walks each string once, so it runs in O(len(A) + len(B)) time.

When the answer is yes, you cannot do better than len(A) comparisons even in the best case, because no matter what you must check every letter. And in the worst case one of the characters in A is hidden very deep in B. So O(len(A) + len(B)) is optimal, as far as worst-case performance is concerned.

Similarly: when the answer is no, you cannot do better than len(B) comparisons even in the best case; and in the worst case the character that isn't in B is hidden very deep in A. So O(len(A) + len(B)) is again optimal.

You can reduce the constant factor by using a better data structure for bset.

You can avoid scanning all of B in some (non-worst) cases where the answer is yes by building it lazily, scanning more of B each time you find a character in A that you haven't seen before.


> If the program is always return false, how to optimize this program?

return false

> If it is always return true, how to optimize it?

return true

EDIT: Seriously, it's a good question what algorithm optimizes for the case of failure, and what algorithm optimizes for the case of success. I don't know what algorithm strstr uses, it may be a generally-good algorithm which isn't optimal for either of those assumptions.

Maybe you'll catch someone here who knows offhand. If not, this looks like a good place to start reading:Exact String Matching Algorithms


assuming strings has all lower case in it, then you can have a bit vector and set the bit based on position position = str1[i] - 'a'. to set it you would do
bitVector |= (1<<pos). And then for str2 you would check if a bit is set in bitVector for all bits, if so return true otherwise return false.

0

精彩评论

暂无评论...
验证码 换一张
取 消