How to make this function not prematurely split?

Developer https://www.devze.com 2023-01-19 08:24 Source: Internet
I've written this function...

internal static IEnumerable<KeyValuePair<char?, string>> SplitUnescaped(this string input, char[] separators)
{
    int index = 0;
    var state = new Stack<char>();

    for (int i = 0; i < input.Length; ++i)
    {
        char c = input[i];
        char s = state.Count > 0 ? state.Peek() : default(char);

        if (state.Count > 0 && (s == '\\' || (s == '[' && c == ']') || ((s == '"' || s == '\'') && c == s)))
            state.Pop();
        else if (c == '\\' || c == '[' || c == '"' || c == '\'')
            state.Push(c);
        if (state.Count == 0 && separators.Contains(c))
        {
            yield return new KeyValuePair<char?, string>(c, input.Substring(index, i - index));
            index = i + 1;
        }
    }

    yield return new KeyValuePair<char?, string>(null, input.Substring(index));
}

It splits a string on the given separators, as long as they aren't escaped, in quotes, or in brackets. It seems to work pretty well, but there's one problem with it.

The characters I want to split on include a space:

{ '>', '+', '~', ' ' };

So, given the string

a > b

I want it to split on > and ignore the spaces, but given

a b

I do want it to split on the space.

How can I fix the function?


You could continue to split on both ' ' and '>' and then remove the strings which are empty.
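
One caveat with that idea: for "a > b" the original function yields (' ', "a"), ('>', ""), (' ', ""), (null, "b"), so simply dropping the empty strings would also drop the '>'. A minimal sketch of a post-processing step that keeps it might look like the following. CollapseEmpty and SplitHelpers are hypothetical names, not part of the original post, and the sketch assumes that a non-space separator should win over an adjacent space.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class SplitHelpers
{
    // Hypothetical helper (not from the original post): drops empty pieces
    // produced by SplitUnescaped while preserving a non-space separator.
    internal static IEnumerable<KeyValuePair<char?, string>> CollapseEmpty(
        this IEnumerable<KeyValuePair<char?, string>> pieces)
    {
        // The last non-empty piece seen; held back until its separator is settled.
        KeyValuePair<char?, string>? pending = null;

        foreach (var piece in pieces)
        {
            if (piece.Value.Length > 0)
            {
                if (pending.HasValue)
                    yield return pending.Value;
                pending = piece;
            }
            else if (pending.HasValue && pending.Value.Key == ' ')
            {
                // Empty piece: its separator replaces a plain space on the held piece.
                pending = new KeyValuePair<char?, string>(piece.Key, pending.Value.Value);
            }
        }

        if (pending.HasValue)
            yield return pending.Value;
    }
}
```

Used as input.SplitUnescaped(separators).CollapseEmpty(), this should give ('>', "a"), (null, "b") for "a > b" and (' ', "a"), (null, "b") for "a b". Empty pieces at the very start of the input are dropped together with their separators.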


I think this does it...

internal static IEnumerable<KeyValuePair<char?, string>> SplitUnescaped(this string input, char[] separators)
{
    int startIndex = 0;
    var state = new Stack<char>();
    input = input.Trim(separators);

    for (int i = 0; i < input.Length; ++i)
    {
        char c = input[i];
        char s = state.Count > 0 ? state.Peek() : default(char);

        if (state.Count > 0 && (s == '\\' || (s == '[' && c == ']') || ((s == '"' || s == '\'') && c == s)))
            state.Pop();
        else if (c == '\\' || c == '[' || c == '"' || c == '\'')
            state.Push(c);
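        else if (state.Count == 0 && separators.Contains(c))
        {
            int endIndex = i;
            // Skip spaces that are directly followed by another separator, so "a > b" reports '>'.
            while (i + 1 < input.Length && input[i] == ' ' && separators.Contains(input[i + 1])) { ++i; }
            yield return new KeyValuePair<char?, string>(input[i], input.Substring(startIndex, endIndex - startIndex));
            // Skip trailing spaces, but stop one short so the loop's ++i lands on the
            // first character of the next piece and it still goes through the state checks.
            while (i + 1 < input.Length && input[i + 1] == ' ') { ++i; }
            startIndex = i + 1;
        }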
    }

    yield return new KeyValuePair<char?, string>(null, input.Substring(startIndex));
}

I was trying to push the space onto the stack too before, and then doing some checks against that, but I think this is easier.
