Regular Expression to Match Exact Word - Search String Highlight

Developer (开发者) https://www.devze.com 2023-01-10 14:09 Source: web
I'm using the following two methods to highlight search keywords. They work, but they also highlight partial words.

For Example:

Text: "This is .net Programming" Search Key Word: "is"

It highlights the "is" inside "This" as well as the standalone word "is".

Please let me know the correct regular expression to highlight only whole-word matches.

private string HighlightSearchKeyWords(string searchKeyWord, string text)
{
    Regex exp = new Regex(@", ?");
    searchKeyWord = "(\b" + exp.Replace(searchKeyWord, @"|") + "\b)";
    exp = new Regex(searchKeyWord, RegexOptions.Singleline | RegexOptions.IgnoreCase);
    return exp.Replace(text, new MatchEvaluator(MatchEval));
}

private string MatchEval(Match match)
{
    if (match.Groups[1].Success)
    {
        return "<span class='search-highlight'>" + match.ToString() + "</span>";
    }
    return ""; //no match
}


You really just need an @ before your "(\b" and "\b)". In a regular C# string literal, "\b" is the backspace character (U+0008), not the two-character regex token \b that you would expect. I have also written another version that uses a replacement pattern instead of a full-blown MatchEvaluator method.
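To make the difference between the two kinds of string literal concrete, here is a minimal sketch. The class name VerbatimDemo and the helper CountMatches are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

class VerbatimDemo
{
    // Hypothetical helper: counts case-insensitive matches of pattern in text.
    public static int CountMatches(string pattern, string text) =>
        Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;

    static void Main()
    {
        string regular = "\b";    // one character: backspace (U+0008)
        string verbatim = @"\b";  // two characters: backslash followed by 'b'

        Console.WriteLine(regular.Length);   // 1
        Console.WriteLine(verbatim.Length);  // 2

        string text = "This is .net Programming";

        // Built from regular literals: the regex looks for real backspace characters.
        Console.WriteLine(CountMatches("(\bis\b)", text));   // 0

        // Built from verbatim literals: \b is the regex word boundary.
        Console.WriteLine(CountMatches(@"(\bis\b)", text));  // 1
    }
}
```

Escaping the backslash in a regular literal ("\\b") gives the same pattern as the verbatim form; the @ prefix is just easier to read.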

How about this one:

private string keywordPattern(string searchKeyword)
{
    var keywords = searchKeyword.Split(',').Select(k => k.Trim()).Where(k => k != "").Select(k => Regex.Escape(k));

    return @"\b(" + string.Join("|", keywords) + @")\b";
}

private string HighlightSearchKeyWords(string searchKeyword, string text)
{
    var pattern = keywordPattern(searchKeyword);
    Regex exp = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    return exp.Replace(text, @"<span class=""search-highlight"">$0</span>");
}

Usage:

var res = HighlightSearchKeyWords("is,this", "Is this programming? This is .net Programming.");

Result:

<span class="search-highlight">Is</span> <span class="search-highlight">this</span> programming? <span class="search-highlight">This</span> <span class="search-highlight">is</span> .net Programming.

Updated to use \b and a simplified replacement pattern. (The old version used (^|\s) instead of the first \b and ($|\s) instead of the last \b, so it would also work on search terms that contain non-word characters.)

Updated to your comma notation for search terms

Updated: I forgot Regex.Escape, added now. Otherwise a search for "\w" would blow the thing up :)

Updated due to a comment ;)
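The trade-off between \b and whitespace delimiters mentioned in the first update can be sketched as follows. Note that this uses lookarounds, (?<!\S) and (?!\S), rather than the (^|\s) and ($|\s) groups of the old version, so that the surrounding whitespace is not consumed. That substitution, the class name BoundaryDemo and the helper FirstMatch are my own assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class BoundaryDemo
{
    // Hypothetical helper: returns the first match of pattern in text, or "" if none.
    public static string FirstMatch(string pattern, string text) =>
        Regex.Match(text, pattern, RegexOptions.IgnoreCase).Value;

    static void Main()
    {
        string text = "This is .net Programming";
        string term = Regex.Escape(".net");  // becomes \.net

        // \b needs a word character on exactly one side. Before ".net" there is
        // a space and a dot, both non-word characters, so there is no boundary.
        Console.WriteLine(FirstMatch(@"\b" + term + @"\b", text) == "");  // True

        // Whitespace-delimited variant: the term must not touch a
        // non-whitespace character on either side.
        Console.WriteLine(FirstMatch(@"(?<!\S)" + term + @"(?!\S)", text));  // .net
    }
}
```

The whitespace variant has its own cost: a keyword followed directly by punctuation, such as "programming?", no longer matches "programming". Which delimiter is right depends on the kind of search terms you expect.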


Try this fixed line:

searchKeyWord = @"(\b" + exp.Replace(searchKeyWord, @"|") + @"\b)";


You need to enclose the keywords in a non-capturing group. Otherwise you will get false positives when you use multiple keywords separated by commas, as in the sample, because the first \b then applies only to the first keyword and the last \b only to the last one.

private string EscapeKeyWords(string searchKeyWord)
{
    string[] keyWords = searchKeyWord.Split(',');
    for (int i = 0; i < keyWords.Length; i++) keyWords[i] = Regex.Escape(keyWords[i].Trim());

    return String.Join("|", keyWords);
}

private string HighlightSearchKeyWords(string searchKeyWord, string text)
{
    searchKeyWord = @"(\b(?:" + EscapeKeyWords(searchKeyWord) + @")\b)";
    Regex exp = new Regex(searchKeyWord, RegexOptions.Singleline | RegexOptions.IgnoreCase);
    return exp.Replace(text, @"<span class=""search-highlight"">$0</span>");
}
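To see the false positives that the non-capturing group prevents, compare an ungrouped and a grouped alternation. This is only a sketch; the class name GroupingDemo, the helper Find and the sample sentence are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class GroupingDemo
{
    // Hypothetical helper: returns all case-insensitive matches as strings.
    public static string[] Find(string pattern, string text) =>
        Regex.Matches(text, pattern, RegexOptions.IgnoreCase)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    static void Main()
    {
        string text = "The isle is on the moon";

        // Ungrouped: parsed as (\bis) OR (on\b), so it also hits the start
        // of "isle" and the end of "moon".
        Console.WriteLine(string.Join(",", Find(@"\bis|on\b", text)));      // is,is,on,on

        // Grouped: both boundaries apply to every alternative.
        Console.WriteLine(string.Join(",", Find(@"\b(?:is|on)\b", text)));  // is,on
    }
}
```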