
Evaluating a regular expression range

Developer, https://www.devze.com, 2022-12-31 19:48, source: Internet
Is there a nice way to evaluate a regular expression range, say, for a URL such as

http://example.com/[a-z]/[0-9].htm

This would be converted into:

http://example.com/a/0.htm
http://example.com/a/1.htm
http://example.com/a/2.htm
...
http://example.com/a/9.htm
...
http://example.com/z/0.htm
http://example.com/z/1.htm
http://example.com/z/2.htm
...
http://example.com/z/9.htm

I've been scratching my head about this, and I can't see a pretty way of doing it without going through the alphabet and looping through the numbers.

Thanks in advance!


If you really need to do this, it's not that hard to generate the strings using recursion. Here's a snippet to do just that in Java:

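public class Explode {
    // Expands each single-character range such as [a-c]; prefix is the text built so far.
    static void dfs(String prefix, String suffix) {
        final int k = suffix.indexOf('[');
        if (k == -1) {
            // No range left: print the finished string.
            System.out.println(prefix + suffix);
        } else {
            prefix += suffix.substring(0, k);   // literal text before the range
            char from = suffix.charAt(k+1);     // lower bound, e.g. 'a' in [a-c]
            char to = suffix.charAt(k+3);       // upper bound, e.g. 'c' in [a-c]
            suffix = suffix.substring(k+5);     // skip past the closing ']'
            for (char ch = from; ch <= to; ch++) {
                dfs(prefix + ch, suffix);
            }
        }
    }
    public static void main(String[] args) {
        String template = "http://example.com/[a-c]/[0-2][x-z].htm";
        dfs("", template);
    }
}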


This is a standard recursive tuple generator, but with some string infixing in between. It's trivial to port to C#. You'd want to use a mutable StringBuilder-like class for better performance.
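To illustrate the StringBuilder suggestion, here is a minimal sketch of the same recursion that appends to and rolls back a single shared buffer instead of concatenating strings. The class name ExplodeBuilder and the method expand are made up for this example, and it assumes, like the snippet above, that every range has the exact single-character form [x-y]. It collects the results in a list rather than printing them directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical name; a sketch of the recursion above using one shared StringBuilder.
class ExplodeBuilder {
    // Expands every single-character range such as [a-c] in the template.
    static List<String> expand(String template) {
        List<String> out = new ArrayList<>();
        dfs(new StringBuilder(), template, 0, out);
        return out;
    }

    private static void dfs(StringBuilder prefix, String template, int pos, List<String> out) {
        int mark = prefix.length();                  // remember where to roll back to
        int k = template.indexOf('[', pos);
        if (k == -1) {
            // No range left: emit the prefix plus the literal tail.
            prefix.append(template, pos, template.length());
            out.add(prefix.toString());
        } else {
            prefix.append(template, pos, k);         // literal text before the range
            char from = template.charAt(k + 1);
            char to = template.charAt(k + 3);
            for (char ch = from; ch <= to; ch++) {
                prefix.append(ch);
                dfs(prefix, template, k + 5, out);   // continue after the closing ']'
                prefix.setLength(prefix.length() - 1);   // undo the appended character
            }
        }
        prefix.setLength(mark);                      // restore the caller's prefix
    }

    public static void main(String[] args) {
        for (String url : expand("http://example.com/[a-c]/[0-2][x-z].htm")) {
            System.out.println(url);
        }
    }
}
```

Passing an index into the template instead of taking substrings also avoids copying the suffix at every level; whether that matters depends on how many strings you generate.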


I don't think there is a way to expand regular expressions in general. Your example

http://foo.com/[a-z]/[0-9].htm

is a very simple regex: it contains no * or +, for instance. How would you expand a regex that does, given that it can match infinitely many strings?

In your case you might get away with some loops, but as I said, this is an atypical (easy) regex.
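For the fixed pattern in the question, the loop approach could look roughly like the sketch below. The class name UrlLoops and the method urls are hypothetical, and the two ranges are hard-coded, so this only covers that one URL shape.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical name; hard-codes the two ranges from http://example.com/[a-z]/[0-9].htm
class UrlLoops {
    static List<String> urls() {
        List<String> out = new ArrayList<>();
        for (char letter = 'a'; letter <= 'z'; letter++) {      // the [a-z] part
            for (char digit = '0'; digit <= '9'; digit++) {     // the [0-9] part
                out.add("http://example.com/" + letter + "/" + digit + ".htm");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String url : urls()) {
            System.out.println(url);
        }
    }
}
```

This yields 26 × 10 = 260 URLs, in the same order as the listing in the question.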
