
Multifunction RegEx for parsing JCL variables - out of working solutions

Developer https://www.devze.com 2023-02-08 20:28 Source: Internet

I'm a bit lost creating a RegEx under C#.NET.

I'm writing something like a parser, so I use Regex.Replace to search text for certain "variables" and replace them with their "values".

Each variable starts with an ampersand ("&") and ends with either an ampersand (the beginning of another variable) or a dot.

Each variable (as well as the text surrounding variables) can only consist of alphanumeric characters and certain "special" characters, namely "$", "@", "#" and "-".

Neither the variables nor the rest of the text can contain space characters (" ").

Now, the problem is that I'm trying to figure out a RegEx that replaces one possible ending character (".") while not replacing the other possible ending character ("&"). This happens to be quite an issue:

  • "&"+variable+"[^A-Za-z0-9#@$]" does what I want, except that it also replaces the "&", which is not acceptable.
  • "&"+variable+"(.)?\b" replaces the dot, but only if it is followed by an alphanumeric character, not if it is followed by one of & @ # $ -, which can occur, so this doesn't work either.
  • "&"+variable+"(.)?(?!A-Za-z0-9)" does exactly what I want for the ending characters, but it doesn't recognize the true end of a variable: a search-and-replace for "&DEN" also replaces that part of another variable called "&DENV", of which "&DEN" is a substring. This would create false or misleading results, which is totally unacceptable.
These were all the possibilities I could think of (and search for). Is it possible to do what I need with a single RegEx at all, under the C#.NET RegEx parser?
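
One likely reason the third attempt over-matches is that `(?!A-Za-z0-9)` is a negative lookahead for the literal text `A-Za-z0-9`, not for a character class, and the unescaped `(.)?` is free to swallow any character, including the `V` of `&DENV`. The following is only a minimal sketch of that observation, using the `DEN`/`DENV` names from above; `LookaheadDemo`, `Apply`, and the `WithClass` pattern are hypothetical names introduced for illustration, and `WithClass` is one guess at the intended pattern rather than a definitive fix.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaheadDemo
{
    // Third attempt from the list above: the lookahead tests for the literal text "A-Za-z0-9".
    public const string Attempt = "&DEN(.)?(?!A-Za-z0-9)";

    // Assumed intent: consume an escaped dot, or else require that no name character follows.
    public const string WithClass = @"&DEN(?:\.|(?![A-Za-z0-9#@$-]))";

    public static string Apply(string pattern, string input)
    {
        return Regex.Replace(input, pattern, "28");
    }

    static void Main()
    {
        foreach (var input in new[] { "&DEN", "&DENV", "&DEN&DENV", "&DEN.anything" })
        {
            Console.WriteLine("{0}: attempt -> {1}, with class -> {2}",
                input, Apply(Attempt, input), Apply(WithClass, input));
        }
    }
}
```

With the literal lookahead, `&DENV` is rewritten to `28`; with the character class inside the lookahead, `&DENV` is left alone and `&DEN.anything` becomes `28anything`.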

Just to illustrate the desired behavior:

string variable="DEN";
string replaceWith="28";
string replText;
string regex = "<desired regex>";
replText = Regex.Replace(replText, "&"+variable+regex, replaceWith);


"&DEN" => replaced => replText=="28"

"&DENV" => not replaced => replText=="&DENV"

"&DEN&DEN" => replaced => replText=="2828"

"&DEN&DENV" => replaced, not replaced => replText=="28&DENV"

"&DEN.anything" => replaced and dot removed => replText=="28anything"

"&DEN..anything" => replaced and first dot removed => replText=="28.anything"

The variable could also be something like "#DE@N-$".

The following works correctly on all of your examples. I assumed that a variable &FOO should only be replaced if it's followed by ., &, or end-of-string $. If it's followed by anything else, it's not replaced.

In order to require a terminating & without consuming it, I used a lookahead assertion (?=&). Assertions force the string to match the regex, but they don't consume any characters, so those characters aren't replaced. A trailing . is still consumed, however, so it is replaced together with the variable.

Finally, a MatchEvaluator is specified to use the captured pattern to do a lookup in the replacements dictionary for the replacement value. If the pattern (variable name) is not found, the text is effectively untouched (the full original capture is returned).

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class Program
{
    static string ReplaceVariables(Dictionary<string, string> replacements, string input)
    {
        // Group 1 is the variable name; group 2 is a consumed '.', a zero-width '&' lookahead, or end of input.
        return Regex.Replace(input, @"&([\w\d$@#-]+)(\.|(?=&)|$)", m =>
        {
            string replacement = null;
            // Unknown names fall through and the original match is returned unchanged.
            return replacements.TryGetValue(m.Groups[1].Value, out replacement)
                 ? replacement
                 : m.Groups[0].Value;
        });
    }

    static void Main(string[] args)
    {
        string[] tests = new[]
        {
            "&DEN", "&DENV", "&DEN&DEN",
            "&DEN&DENV", "&DEN.anything",
            "&DEN..anything", "&DEN Foo",
        };

        var replace = new Dictionary<string, string>
        {
            { "DEN", "28" },
            { "FOO", "42" }
        };

        foreach (var test in tests)
            Console.WriteLine("{0} -> {1}", test, ReplaceVariables(replace, test));
    }
}
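
If one call per variable is preferred, as in the question's `Regex.Replace(replText, "&"+variable+regex, replaceWith)`, the same terminator logic can probably be appended to an escaped variable name. This is a rough sketch rather than part of the answer above: `PerVariableDemo` and `ReplaceOne` are hypothetical names, and `Regex.Escape` is used on the assumption that names such as `#DE@N-$` may contain regex metacharacters.

```csharp
using System;
using System.Text.RegularExpressions;

static class PerVariableDemo
{
    // Replaces &<variable> when it is followed by '.', by '&', or by the end of the input.
    public static string ReplaceOne(string input, string variable, string value)
    {
        // Regex.Escape neutralises metacharacters such as '$' and '#' in the name.
        string pattern = "&" + Regex.Escape(variable) + @"(?:\.|(?=&)|$)";
        // A MatchEvaluator keeps a '$' in the value from being read as a substitution token.
        return Regex.Replace(input, pattern, m => value);
    }

    static void Main()
    {
        Console.WriteLine(ReplaceOne("&DEN&DENV", "DEN", "28"));
        Console.WriteLine(ReplaceOne("&DEN..anything", "DEN", "28"));
        Console.WriteLine(ReplaceOne("&#DE@N-$.txt", "#DE@N-$", "28"));
    }
}
```

Because the pattern ends in an alternation of `\.`, `(?=&)` and `$`, a longer name such as `&DENV` does not match the pattern built for `DEN`, so it is left untouched.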

OK, I think I finally found it, using ORs. The regex
(.)?([^A-Za-z0-9#\@\$\&\,\;\:-\<>()\ ]|(?=\&)|\b)
seems to work fine. I'm just posting this in case anyone finds it helpful.

EDIT: Sorry, I hadn't refreshed the page, so I replied without knowing that a better answer had already been provided by Chris Schmich.
