How to prevent something I'd call "regex injection"?
I'm using regular expressions to parse strings that look like this, for example:
Size: 10, qty: 20
Writing a regex to capture "10" and "20" is not hard by itself. "Size" and "qty" are, however, customizable: the user can choose some other words instead.
So what I do is:
var pattern = String.Format(
@"{0}[ \t]*(?<size>{1}|\d*)[ \t]*:[ \t]*{2}:[ \t]*(?<quantity>[\d]*)",
sizeSign,
univerSizeAbbrev,
qtySign);
But how do I 'sanitize' sizeSign, qtySign (or univerSizeAbbrev for that matter)?
Regex does not have procedure parameters like SQL does (?), so how do I make sure, positively sure, that sizeSign and qtySign are always treated as literals, whatever they are?
Use Regex.Escape:
Escapes a minimal set of characters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) by replacing them with their escape codes. This instructs the regular expression engine to interpret these characters literally rather than as metacharacters.
Make sure you include:
using System.Text.RegularExpressions;
And then escape the variables like this:
sizeSign = Regex.Escape(sizeSign);
qtySign = Regex.Escape(qtySign);
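To see what the escaping buys you, here is a minimal sketch. It uses a simplified pattern rather than the full one from the question, and the class name EscapeDemo, the method BuildQuantityPattern and the label "qty.*" are all made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: builds a simplified "<label>: <number>" pattern.
    // The user-supplied label is escaped so it can only match literally.
    public static string BuildQuantityPattern(string qtySign)
    {
        return Regex.Escape(qtySign) + @":\s*(?<quantity>\d+)";
    }

    public static void Main()
    {
        string label = "qty.*"; // a label that happens to contain metacharacters

        // Unescaped: ".*" is interpreted by the regex engine.
        string raw = label + @":\s*(?<quantity>\d+)";
        Console.WriteLine(Regex.IsMatch("qtyXYZ: 20", raw));   // True (unintended)

        // Escaped: only the literal text "qty.*" matches.
        string safe = BuildQuantityPattern(label);
        Console.WriteLine(Regex.IsMatch("qtyXYZ: 20", safe));  // False
        Console.WriteLine(Regex.Match("qty.*: 20", safe).Groups["quantity"].Value); // 20
    }
}
```

Escape the values at the point where you build the pattern, not where you read them, so the original label is still available unescaped for display.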
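If you are allowed to assume that the identifiers can only consist of letter characters, this becomes easy. Just test each with
str.Any(ch => !Char.IsLetter(ch));
and reject any choice for which this returns true, since that means the string contains at least one non-letter character. (Any is a LINQ extension method, so you need using System.Linq;.)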
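Wrapped up as a helper, the check might look like the sketch below. The class name LabelValidator and the method IsLettersOnly are hypothetical, and rejecting empty labels is an extra assumption on my part, not something the question asks for.

```csharp
using System;
using System.Linq;

public static class LabelValidator
{
    // Hypothetical helper: accepts a label only if it is non-empty and
    // every character is a letter according to Char.IsLetter.
    public static bool IsLettersOnly(string label)
    {
        return !string.IsNullOrEmpty(label) && label.All(Char.IsLetter);
    }

    public static void Main()
    {
        Console.WriteLine(IsLettersOnly("Size"));   // True
        Console.WriteLine(IsLettersOnly("qty.*"));  // False
        Console.WriteLine(IsLettersOnly(""));       // False
    }
}
```

Note that Char.IsLetter accepts any Unicode letter, not just A-Z, so labels in other alphabets pass this check. A label that passes contains no regex metacharacters, so it can go into the pattern without escaping.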