Is using a StringBuilder for writing XML ok?

Developer https://www.devze.com 2022-12-26 13:49 Source: Internet
It feels dirty. But maybe it isn't... is it ok to use a StringBuilder for writing XML? My gut instinct says "although this feels wrong, it's probably pretty darn performant because it's not loading extra libraries and overhead, and it's not doing whatever extra method calls XmlWriter invokes." It also seems like it's just less code in general. What's the benefit of XmlWriter?

Here's what it looks like. I'm building an OpenSearch XML doc based on the domain you come in from.

public void ProcessRequest(HttpContext context)
{
    context.Response.ContentType = "text/xml";

    string domain = WebUtils.ReturnParsedSourceUrl(null); //returns something like www.sample.com
    string cachedChan = context.Cache[domain + "_opensearchdescription"] as String;

    if (cachedChan == null)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        sb.Append("<OpenSearchDescription xmlns=\"http://a9.com/-/spec/opensearch/1.1/\" xmlns:moz=\"http://www.mozilla.org/2006/browser/search/\">");
        sb.Append("    <ShortName>Search</ShortName>");
        sb.Append("    <Description>Use " + domain + " to search.</Description>");
        sb.Append("    <Contact>contact@sample.com</Contact>");
        sb.Append("    <Url type=\"text/html\" method=\"get\" template=\"http://" + domain + "/Search.aspx?q={searchTerms}\" />");
        sb.Append("    <moz:SearchForm>http://" + domain + "/Search.aspx</moz:SearchForm>");
        sb.Append("    <Image height=\"16\" width=\"16\" type=\"image/x-icon\">http://" + domain + "/favicon.ico</Image>");
        sb.Append("</OpenSearchDescription>");

        cachedChan = sb.ToString();

        context.Cache.Insert(domain + "_opensearchdescription", cachedChan, null, DateTime.Now.AddDays(14), TimeSpan.Zero);
    }

    context.Response.Write(cachedChan);
}

Followup, ~2 years later: I realized that what I meant to say, and completely failed to say, is: what is the benefit of gobs of code using XML classes to generate this file, vs. just using strings? Is there one? Is this worse than (for example) John Saunders' example?

I used Jim Schubert's method, opting for 'I can read this and it makes sense' rather than vying for 'correctness'. I'm glad I did. There's nothing wrong with John Saunders' example, but I felt that it was way overbearing for what I was trying to accomplish. Pragmatism? Maybe.


That's very wrong. Use one of the .NET APIs which understand XML to write XML.

Using a System.Xml.XmlWriter will not cause any performance problem by loading "any extra libraries".


The reason to use the XML APIs is that they understand the rules of XML. For instance, they'll know the set of characters that need to be quoted inside an element, and the different set that need to be quoted inside an attribute.

This might not be an issue in your case: maybe you're certain that domain will not have any characters in it that will need to be quoted. In any broader situation, it's best to let the XML APIs do XML - which they know how to do - so you don't have to do it yourself.
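
To make the point about the two sets of characters concrete, here is a minimal sketch using LINQ to XML. The class and method names (EscapingSketch, AsElementText, AsAttribute) are made up for illustration and are not part of the question's code. It serializes the same value once as element text and once as an attribute value so the two results can be compared.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class EscapingSketch
{
    // Serializes <Description>value</Description>; the API escapes
    // whatever element text requires (for example & and <).
    public static string AsElementText(string value)
    {
        return new XElement("Description", value).ToString();
    }

    // Serializes <Url template="value" />; attribute values additionally
    // need the double quote escaped, which the API also handles.
    public static string AsAttribute(string value)
    {
        return new XElement("Url", new XAttribute("template", value)).ToString();
    }
}
```

Calling both methods with a value such as Tom & "Jerry" should show the ampersand escaped in both outputs, while the double quotes are left alone in the element text and escaped only inside the attribute. With string concatenation you would have to remember that difference yourself.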


Here's an example of how easy it is to produce valid XML using LINQ to XML:

public static string MakeXml()
{
    XNamespace xmlns = "http://a9.com/-/spec/opensearch/1.1/";
    XNamespace moz = "http://www.mozilla.org/2006/browser/search/";
    string domain = "localhost"; // host name only; "http://" is added below
    var doc = new XDocument(
        new XDeclaration("1.0", "UTF-8", "yes"),
        new XElement(
            xmlns + "OpenSearchDescription",
            new XElement(xmlns + "ShortName", "Search"),
            new XElement(
                xmlns + "Description",
                String.Format("Use {0} to search.", domain)),
            new XElement(xmlns + "Contact", "contact@sample.com"),
            new XElement(
                xmlns + "Url",
                new XAttribute("type", "text/html"),
                new XAttribute("method", "get"),
                new XAttribute(
                    "template",
                    // {{ and }} emit the literal OpenSearch placeholder {searchTerms}
                    String.Format(
                        "http://{0}/Search.aspx?q={{searchTerms}}",
                        domain))),
            new XElement(
                moz + "SearchForm",
                String.Format("http://{0}/Search.aspx", domain)),
            new XElement(
                xmlns + "Image",
                new XAttribute("height", 16),
                new XAttribute("width", 16),
                new XAttribute("type", "image/x-icon"),
                String.Format("http://{0}/favicon.ico", domain))));
    return doc.ToString(); // If you _must_ have a string (note: ToString() omits the XML declaration)
}


I wouldn't use StringBuilder for this, because you have to call the Append method for every line. You could use XmlWriter and that won't hurt performance.

You can reduce the amount of IL code generated by doing the following:

// In a verbatim (@"...") string a double quote is written as "", and
// {{ }} produce literal braces when the template goes through String.Format.
private const string XML_TEMPLATE = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<OpenSearchDescription xmlns=""http://a9.com/-/spec/opensearch/1.1/"" xmlns:moz=""http://www.mozilla.org/2006/browser/search/"">
    <ShortName>Search</ShortName>
    <Description>Use {0} to search.</Description>
    <Contact>contact@sample.com</Contact>
    <Url type=""text/html"" method=""get"" template=""http://{0}/Search.aspx?q={{searchTerms}}"" />
    <moz:SearchForm>http://{0}/Search.aspx</moz:SearchForm>
    <Image height=""16"" width=""16"" type=""image/x-icon"">http://{0}/favicon.ico</Image>
</OpenSearchDescription>";

And in your method:

    if (cachedChan == null)
    {
        cachedChan = String.Format(XML_TEMPLATE, domain);

        context.Cache.Insert(domain + "_opensearchdescription", 
               cachedChan, null, DateTime.Now.AddDays(14), TimeSpan.Zero);
    }

That should work well for you, because the method as you have it now has to build a new string for every StringBuilder.Append() call that concatenates domain, then call Append. The String.Format call only generates 17 lines of IL code, compared to StringBuilder generating 8 lines of ctor code, then 6 lines for every Append call. That said, on today's hardware an extra 50 lines of IL won't be noticeable.


Well, this is subtle. Like all other optimizations in life, you break abstraction boundaries and pay the price for that, in order to gain efficiency.

From my experience, it is indeed significantly faster, not because of loading libraries of course (if anything, that would make it slower) but because it saves on string allocations. I don't remember exactly how much faster, sorry. Measuring it with a profiler will be hard because you also save on garbage collection costs.

But don't blame me when you have to deal with encodings and escaping, and hell knows what else, and remember to read the XML standard carefully before getting these XMLs out anywhere.


Please do not use StringBuilder. Anyone who tells you that it is significantly faster hasn't presented you with any real data. The difference in speed is inconsequential, and you will have a nightmare of maintenance ahead of you.

Have a look: StringBuilder vs XmlTextWriter


Well, there's nothing wrong with manually writing XML strings per se, but it is far more error prone. Unless you have a compelling performance reason to do this (that is, you've measured and found that the XML formatting is a bottleneck) I'd use the XML classes instead. You'll save a lot in debugging and development time.

As an aside, why are you mixing dynamic string operations with your builder calls? Instead of:

sb.Append("    <Description>Use " + domain + " to search.</Description>"); 

try this:

sb.Append("    <Description>Use ").Append(domain).Append(" to search.</Description>");


Your gut is wrong. Whether you hand-write the XML or use an XmlWriter, the most efficient way to send XML to an HttpResponse would be to append text directly to the Response. Building the whole string and then sending it wastes resources.
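
As a rough sketch of that idea, the code below writes the document with an XmlWriter straight onto whatever TextWriter it is given, so no intermediate string is built. The class name OpenSearchWriter and its Write method are hypothetical, and only part of the question's document is emitted to keep the example short.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class OpenSearchWriter
{
    private const string OpenSearchNs = "http://a9.com/-/spec/opensearch/1.1/";

    // Streams a trimmed-down OpenSearch description to the given writer.
    public static void Write(TextWriter output, string domain)
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("OpenSearchDescription", OpenSearchNs);
            writer.WriteElementString("ShortName", OpenSearchNs, "Search");
            writer.WriteElementString(
                "Description", OpenSearchNs, "Use " + domain + " to search.");

            writer.WriteStartElement("Url", OpenSearchNs);
            writer.WriteAttributeString("type", "text/html");
            writer.WriteAttributeString("method", "get");
            writer.WriteAttributeString(
                "template", "http://" + domain + "/Search.aspx?q={searchTerms}");
            writer.WriteEndElement(); // Url

            writer.WriteEndElement(); // OpenSearchDescription
            writer.WriteEndDocument();
        }
        // Disposing the XmlWriter flushes it; by default it does not
        // close the underlying TextWriter.
    }
}
```

In the handler from the question this would presumably be called as OpenSearchWriter.Write(context.Response.Output, domain). The trade-off is that nothing is cached, so the document is regenerated on every request; whether that matters is something to measure rather than assume.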


Will the domain variable ever contain "&" characters, or another character that needs to be encoded? You may want to spend the time to do defensive programming and validate your input.
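
One possible shape for that defensive check is sketched below. DomainGuard and its method names are invented for this example. It relies on two framework calls, Uri.CheckHostName and SecurityElement.Escape, and assumes the value is meant to be a plain host name such as www.sample.com.

```csharp
using System;
using System.Security;

// Hypothetical helper, for illustration only.
public static class DomainGuard
{
    // True when the value parses as a DNS name or IP address.
    public static bool IsValidHost(string host)
    {
        return Uri.CheckHostName(host) != UriHostNameType.Unknown;
    }

    // Escapes the characters that are special in XML (<, >, &, " and ')
    // for cases where the value is still pasted into a hand-built string.
    public static string EscapeForXml(string value)
    {
        return SecurityElement.Escape(value);
    }
}
```

A handler could reject the request when IsValidHost returns false, and pass the value through EscapeForXml before appending it. If the document is produced by one of the XML APIs instead, the escaping step is unnecessary, though the validation may still be worth keeping.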


You could create a strongly typed object and use the XmlSerializer classes to generate the XML data.
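
A minimal sketch of that approach follows, assuming a hand-written class that covers only three of the elements from the question. The type OpenSearchDescription and the helper SerializerSketch are made up for illustration; a complete version would also need to model the Url, Image and moz:SearchForm elements.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type mirroring part of the OpenSearch document.
[XmlRoot("OpenSearchDescription",
         Namespace = "http://a9.com/-/spec/opensearch/1.1/")]
public class OpenSearchDescription
{
    public string ShortName { get; set; }
    public string Description { get; set; }
    public string Contact { get; set; }
}

// Hypothetical helper, for illustration only.
public static class SerializerSketch
{
    public static string Serialize(OpenSearchDescription description)
    {
        var serializer = new XmlSerializer(typeof(OpenSearchDescription));
        using (var writer = new StringWriter())
        {
            // StringWriter reports UTF-16, so the XML declaration in the
            // result names utf-16 rather than UTF-8.
            serializer.Serialize(writer, description);
            return writer.ToString();
        }
    }
}
```

The serializer takes care of escaping and namespaces, at the cost of an extra class to maintain. For a document this small that may or may not be worth it.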
