Which class should I use for making many replacements in a string?

Developer (https://www.devze.com) 2023-02-04 12:03 Source: Internet
I have to make a lot of text replacements. Which class is best used to do this in a performant manner? Is it StringBuilder?

StringBuilder stringBuilder = new StringBuilder(startString);
stringBuilder.Replace(literala1, literala2);
stringBuilder.Replace(literalb1, literalb2);
stringBuilder.Replace(literalc1, literalc2);
...

or is there a better class to do this? By the way, the literals will be mostly constants.


This exact question was dealt with at length on Roberto Farah's blog: Comparing RegEx.Replace, String.Replace and StringBuilder.Replace – Which has better performance?

I'll summarize the findings here, which come as a shock to many .NET developers. It turns out that for relatively simple string replacement (in cases where matches do not need to be case-insensitive), Regex.Replace() has the worst performance and String.Replace() has the best.

A link is also provided to an article on CodeProject that confirms these findings: StringBuilder vs String / Fast String Operations with .NET 2.0

In general, I would say the rules ought to be as follows:

  • Use String.Replace() when you only have to do a small number of replacements (say around 5)
  • Use StringBuilder.Replace() when you have to do a larger number of replacements
  • Reserve regular expressions (Regex.Replace) for the most complex scenarios, where it's worth paying a slight performance penalty for the elegance of a single expression that handles all of the necessary replacements.
  • Ignore all of the above guidelines and use whatever makes your code most readable or expressive. Prematurely optimizing something like this isn't worth the time it took me to write this answer.
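If you do want to check the first two rules against your own data rather than take them on faith, a minimal sketch like the following can help. The class and method names (ReplaceComparison, WithString, WithStringBuilder, Time) are made up for illustration, and the Stopwatch loop is only a rough measurement, not a proper benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;

// Hypothetical helper for comparing the two approaches; not from the linked articles.
public static class ReplaceComparison
{
    // Chained String.Replace: every call allocates a new string.
    public static string WithString(string input, IEnumerable<KeyValuePair<string, string>> pairs)
    {
        foreach (var pair in pairs)
            input = input.Replace(pair.Key, pair.Value);
        return input;
    }

    // Chained StringBuilder.Replace: edits the builder's buffer, one final ToString().
    public static string WithStringBuilder(string input, IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var builder = new StringBuilder(input);
        foreach (var pair in pairs)
            builder.Replace(pair.Key, pair.Value);
        return builder.ToString();
    }

    // Rough timing only: runs the action repeatedly and returns the total elapsed time.
    public static TimeSpan Time(Func<string> action, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Both methods apply the pairs in order and should return the same string, so you can pass each one to Time with your real input and replacement list and compare the results yourself.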


I would go with Regex.Replace. This overload: http://msdn.microsoft.com/en-us/library/cft8645c.aspx

All your different inputs can be matched in the regular expression and all your different replacement strings could go in your MatchEvaluator.
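A minimal sketch of that idea, assuming the replacements live in a dictionary. The RegexReplacer class and its ReplaceAll method are hypothetical names, and the static Regex.Replace(string, string, MatchEvaluator) overload is used here for brevity, which is not necessarily the overload linked above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: one alternation pattern, one pass, lookup in the evaluator.
public static class RegexReplacer
{
    // Assumes a non-empty dictionary with non-empty keys.
    public static string ReplaceAll(string input, IDictionary<string, string> replacements)
    {
        // Escape each key so it is matched literally, and try longer keys first
        // so that "foobar" is preferred over "foo" at the same position.
        string pattern = string.Join("|",
            replacements.Keys
                .OrderByDescending(key => key.Length)
                .Select(key => Regex.Escape(key)));

        // The lambda is converted to a MatchEvaluator; it maps each match to its replacement.
        return Regex.Replace(input, pattern, match => replacements[match.Value]);
    }
}
```

For example, with the keys "a.b" and "a+b" mapped to "X" and "Y", the input "a.b a+b" comes back as "X Y". Note that the input is scanned once, so replaced text is not re-examined by later replacements, unlike a chain of Replace calls.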


StringBuilder is probably the best class for doing this, as it won't create extra copies of the underlying character buffer during replacements. If you are performance-sensitive, then String may be bad because it creates copies of the string with every call to Replace, and using a Regex will probably be inferior to the straightforward search-and-replace of StringBuilder.


I found that using this code, which implements Aho-Corasick string matching, to find all the strings to match and then going through the string only once, with StringBuilder doing the replacements, was a lot better than looping over a set of string replacements one at a time.
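The code referred to above is not reproduced here, so the following is only my own minimal sketch of the same approach: build an Aho-Corasick automaton from the search strings, scan the input once to collect matches, then assemble the result with a StringBuilder. MultiReplacer and ReplaceAll are hypothetical names, and the overlap policy (leftmost match wins, longer match preferred at the same start) is an assumption on my part.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of single-pass multi-string replacement using Aho-Corasick.
public static class MultiReplacer
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Next = new Dictionary<char, Node>();
        public Node Fail;        // longest proper suffix that is also a trie prefix
        public Node OutputLink;  // nearest node on the failure chain that ends a pattern
        public string Pattern;   // non-null when a search string ends at this node
    }

    public static string ReplaceAll(string input, IDictionary<string, string> replacements)
    {
        // 1. Build a trie of all non-empty search strings.
        var root = new Node();
        foreach (var key in replacements.Keys)
        {
            if (key.Length == 0) continue;
            var node = root;
            foreach (char c in key)
            {
                if (!node.Next.TryGetValue(c, out var child))
                    node.Next[c] = child = new Node();
                node = child;
            }
            node.Pattern = key;
        }

        // 2. Breadth-first pass to set failure and output links.
        var queue = new Queue<Node>();
        foreach (var first in root.Next.Values) { first.Fail = root; queue.Enqueue(first); }
        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            foreach (var edge in node.Next)
            {
                var f = node.Fail;
                while (f != null && !f.Next.ContainsKey(edge.Key)) f = f.Fail;
                edge.Value.Fail = f == null ? root : f.Next[edge.Key];
                edge.Value.OutputLink = edge.Value.Fail.Pattern != null
                    ? edge.Value.Fail : edge.Value.Fail.OutputLink;
                queue.Enqueue(edge.Value);
            }
        }

        // 3. Scan the input once, recording every match as (start, pattern).
        var matches = new List<(int Start, string Pattern)>();
        var state = root;
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            while (state != root && !state.Next.ContainsKey(c)) state = state.Fail;
            if (state.Next.TryGetValue(c, out var next)) state = next;
            for (var hit = state.Pattern != null ? state : state.OutputLink; hit != null; hit = hit.OutputLink)
                matches.Add((i - hit.Pattern.Length + 1, hit.Pattern));
        }

        // 4. Leftmost first, longer first on ties; skip matches that overlap a chosen one.
        matches.Sort((a, b) => a.Start != b.Start
            ? a.Start.CompareTo(b.Start)
            : b.Pattern.Length.CompareTo(a.Pattern.Length));
        var result = new StringBuilder(input.Length);
        int pos = 0;
        foreach (var m in matches)
        {
            if (m.Start < pos) continue;
            result.Append(input, pos, m.Start - pos);
            result.Append(replacements[m.Pattern]);
            pos = m.Start + m.Pattern.Length;
        }
        result.Append(input, pos, input.Length - pos);
        return result.ToString();
    }
}
```

One behavioral difference to be aware of: because the input is scanned only once, replaced text is never matched again. With "a" mapped to "b" and "b" mapped to "c", this sketch turns "abc" into "bcc", whereas two chained Replace calls would give "ccc".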
