
Find subsequences of strings within strings

Source: 开发者, https://www.devze.com, 2023-01-15 23:46 (reposted from the web)

I want to write a function that checks whether a string contains occurrences of other strings.

However, the substrings being checked may be interrupted within the main string by other letters.

For instance:

a = 'abcde'
b = 'ace'
c = 'acb'

The function should report that b is in a, but that c is not.

I've already tried set(a).intersection(set(b)), but my problem with that is that it also reports c as being in a.
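
To make the problem with the set approach concrete, here is a minimal sketch using the strings above. Sets keep neither order nor duplicates, so the intersection can only say which letters are shared, not whether they occur in sequence. The variable names are illustrative only.

```python
a = 'abcde'
b = 'ace'
c = 'acb'

# Sets discard order, so both intersections contain every letter
# of the candidate, even though only b occurs in order within a.
shared_b = set(a).intersection(set(b))
shared_c = set(a).intersection(set(c))

b_letters_all_present = shared_b == set(b)
c_letters_all_present = shared_c == set(c)
print(b_letters_all_present, c_letters_all_present)  # True True
```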


You can turn your expected sequence into a regex:

import re

def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
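    # Escape each character so regex metacharacters in s1 are matched literally.
    pat = ".*".join(map(re.escape, s1))
    if re.search(pat, s2):
        return True
    return False

# or, more compactly:
def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    return bool(re.search(".*".join(map(re.escape, s1)), s2))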

a = 'abcde' 
b = 'ace' 
c = 'acb'

assert sequence_in(b, a)
assert not sequence_in(c, a)

"ace" gets turned into the regex "a.*c.*e", which finds those three characters in sequence, with possible intervening characters.

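As a rough illustration of how the pattern is built, and of why escaping matters when the wanted sequence may contain regex metacharacters, consider the sketch below. The helper name build_pattern is hypothetical and not part of the answer above.

```python
import re

def build_pattern(s1):
    """Hypothetical helper: join the escaped characters of `s1` with '.*'."""
    return ".*".join(re.escape(ch) for ch in s1)

print(build_pattern("ace"))  # a.*c.*e

# Without escaping, the "." in the wanted sequence matches any character.
unescaped = ".*".join("a.c")  # 'a.*..*c'
print(bool(re.search(unescaped, "abxc")))              # True (false positive)
print(bool(re.search(build_pattern("a.c"), "abxc")))   # False
print(bool(re.search(build_pattern("a.c"), "ab.xc")))  # True
```

If the searched string can contain newlines, passing re.DOTALL to re.search lets ".*" span them, since "." does not match a newline by default.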

How about something like this:

def issubstr(substr, mystr, start_index=0):
    try:
        for letter in substr:
            start_index = mystr.index(letter, start_index) + 1
        return True
    except ValueError:  # str.index raises ValueError when the letter is not found
        return False

Or, using str.find, which returns -1 instead of raising:

def issubstr(substr, mystr, start_index=0):
    for letter in substr:
        start_index = mystr.find(letter, start_index) + 1
        if start_index == 0:  # find returned -1, so the letter was not found
            return False
    return True
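
The same left-to-right scan can also report where each letter was matched, which may help when debugging. This is only a sketch; the function name subsequence_positions is made up for illustration.

```python
def subsequence_positions(substr, mystr):
    """Return the indices in `mystr` where each letter of `substr` is matched,
    or None if `substr` is not a subsequence of `mystr`."""
    positions = []
    start = 0
    for letter in substr:
        found = mystr.find(letter, start)
        if found == -1:
            return None
        positions.append(found)
        start = found + 1  # continue searching after the matched letter
    return positions

print(subsequence_positions('ace', 'abcde'))  # [0, 2, 4]
print(subsequence_positions('acb', 'abcde'))  # None
```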


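def issubstr(s1, s2):
    # Keep only the characters of s2 that occur in s1, then compare.
    # Caveat: gives a false negative if s2 repeats a character of s1,
    # e.g. issubstr('ace', 'abcae') is False.
    return "".join(x for x in s2 if x in s1) == s1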

>>> issubstr('ace', 'abcde')
True

>>> issubstr('acb', 'abcde')
False
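
The join-based comparison above filters s2 down to the letters of s1 and compares the result with s1. That fails when s2 contains an extra occurrence of one of those letters: 'abcae' filters to 'acae', which is not equal to 'ace'. A possible alternative, sketched here under the hypothetical name is_subsequence, walks a single iterator over s2 so that each search resumes where the previous one stopped.

```python
def is_subsequence(s1, s2):
    """True if the characters of `s1` appear in `s2` in order."""
    remaining = iter(s2)
    # `ch in remaining` advances the iterator until it finds `ch`,
    # so each later search starts after the previous match.
    return all(ch in remaining for ch in s1)

print(is_subsequence('ace', 'abcde'))  # True
print(is_subsequence('acb', 'abcde'))  # False
print(is_subsequence('ace', 'abcae'))  # True
```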