开发者

Pattern Matching Python

开发者 https://www.devze.com 2023-03-10 01:53 出处:网络
I\'m currently stuck on trying to make a naive algorithm which given a piece of apattern e.g aabba search for it in a text e.g abbbbaababaabbaaabbaa one letter at a time. It will compare a with the t

I'm currently stuck on trying to make a naive algorithm which given a piece of a pattern e.g aabba search for it in a text e.g abbbbaababaabbaaabbaa one letter at a time. It will compare a with the text if that is right then compares the next letter and if that's wrong the whole pattern will shift one and compare a with b etc

we were give code example

print "Input text: ",
text = raw_input()
print "Input pattern: ",
pattern = raw_input()

index = text.find(pattern)
while index > -1:
    print index
    index = text.find(pattern, index+1)

but the find() function in p开发者_StackOverflow中文版ython is too fast(I need a non optimized sort of algorithm, using while and for loops statements I guess).

Any help appreciated, Thanks


I guess here's what you need, the following code does character-by-character comparison. You may also replace the calls to find by iterations over text which includes checks whether the first character of text matches the first character of pattern:

def my_find(text, pattern):
    '''Find the start index of a pattern string in a text.
    Return -1 if not found, and assume that pattern is not empty'''

    found = False
    current_start_index = text.find(pattern[0])
    index_text = current_start_index
    index_pattern = 0

    while not found and index_text + len(pattern) - 1 < len(text) and \
            current_start_index != -1:

        index_text += 1
        index_pattern += 1

        while index_text < len(text) and \
                index_pattern < len(pattern) and \
                text[index_text] == pattern[index_pattern]:

            if index_pattern == len(pattern) - 1:
                found = True
                break
            else:
                index_text += 1
                index_pattern += 1

        if not found:
            current_start_index = text.find(pattern[0],current_start_index + 1)
            index_text = current_start_index

    if found:
        return current_start_index
    else:
        -1


def my_find(haystack, needle):
    n_len = len(needle)
    start = 0
    while start <= (len(haystack)-n_len+1):
        if haystack[start:start+n_len-1] == needle:
            return True
        start += 1

This is, as far as I can understand, your algorithm. Not tested, will test and let you know if it works.

Tested and seems to work.


Sounds like you are learning about regular expressions, here's a snippet that may help you get started.

myFileName = "abbababaaa"
patternToMatch = "ababa"

i = 0
j = 0
while (i < len(myFileName)):
    if (patternToMatch[i:i] == myFileName[j:j]):
        i++
        j++
    else:
        i = 0        

if len(patternToMatch) == i:
    # matched a pattern
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号