XSD regular expression pattern in .Net causes application to hang

Developer (https://www.devze.com), 2022-12-21 04:37. Source: Internet
Processing time doubles each time the "Y" moves one position to the right. Can anybody tell me why, and how to solve this problem?

I have many long IDs stored in a database that can't be changed, so I can't limit the length much.

using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Schema;

namespace TestRegex
{
 class Program
 {
  static void Main(string[] args)
  {

   DateTime start = DateTime.Now;

   /******************************************
    *  ID to validate
    ******************************************/
   //string id = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"; // Ok: Fast
     string id = "xxxxxxxxxxxxxxxxxxxxxYxxxxxxx"; // Invalid: Slow
   //string id = "xxxxxxxxxxxxxxxxxxxxxxYxxxxxx"; // Invalid: Slower
   //string id = "xxxxxxxxxxxxxxxxxxxxxxxYxxxxx"; // Invalid: Very slow
   //string id = "xxxxxxxxxxxxxxxxxxxxxxxxYxxxx"; // Invalid: Very very slow

   /******************************************
    *  XML to validate
    ******************************************/  
   XmlDocument doc = new XmlDocument();
   doc.LoadXml("<root id='" + id + "'></root>");

   /******************************************
    *  XSD validator
    ******************************************/
   string xsl =
@"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           elementFormDefault='unqualified'
           attributeFormDefault='unqualified'>

 <xs:simpleType name='id'>
        <xs:restriction base='xs:string'>
            <xs:pattern value='^([a-z_]+[0-9]*)+' />
        </xs:restriction>
 </xs:simpleType>

    <xs:element name='root'>
        <xs:complexType>
            <xs:attribute name='id' use='required' type='id' />
  </xs:complexType>
 </xs:element>
</xs:schema>
";

   /******************************************
    *  Adds XSD to XML and validates it
    ******************************************/
   // The schema is already a string, so read it directly instead of
   // round-tripping through bytes with the system default encoding.
   XmlTextReader reader = new XmlTextReader(new StringReader(xsl));

   XmlSchema schema = XmlSchema.Read(reader, new ValidationEventHandler(Validate));
   doc.Schemas.Add(schema);
   doc.Validate(new ValidationEventHandler(Validate));


   /******************************************
    *  Performance results
    ******************************************/
   Console.WriteLine(id.Length + " = " + (DateTime.Now - start).TotalSeconds);
   Console.Read();
  }

  private static void Validate(object o, ValidationEventArgs args)
  {
   if (args.Exception != null)
   {
    Console.WriteLine(args.Exception);
   }
  }
 }
}


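This looks like a case of catastrophic backtracking: the nested quantifiers in ([a-z_]+[0-9]*)+ let the engine split a run of letters between iterations of the outer group in exponentially many ways, and it tries them all before concluding that an ID containing "Y" cannot match.
Your regex is more complex than it needs to be. If I'm reading it correctly, it accepts lowercase letters, underscores and digits, as long as the first character isn't a digit. You can rewrite it as:

^[a-z_]\w*

Note that \w is broader than [a-z_0-9] (it also matches uppercase letters, for example), so this is slightly more permissive than the original.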
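The doubling can be reproduced outside the schema validator. An XSD pattern has to match the entire value, so when the engine reaches the "Y" it backtracks through every way of dividing the preceding n letters among iterations of the outer group, and there are 2^(n-1) such divisions; one more letter before the "Y" doubles the work. The following is a minimal sketch, not the validator's actual code path: it assumes .NET 4.5 or later (for the match-timeout overload), adds an explicit $ to mimic whole-value matching, and uses a hypothetical class name BacktrackingDemo. Newer runtimes may optimize some patterns, so the timings it prints will vary.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
    // Pattern from the question, with an explicit end anchor (assumption:
    // this mimics the whole-value match that XSD validation requires).
    public const string Nested = @"^([a-z_]+[0-9]*)+$";

    // Equivalent pattern without nested quantifiers.
    public const string Flat = @"^[a-z_][a-z_0-9]*$";

    // Returns true/false for the match result, or null if the timeout was hit.
    public static bool? TryMatch(string pattern, string input, TimeSpan timeout)
    {
        try
        {
            return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }

    public static void Main()
    {
        TimeSpan timeout = TimeSpan.FromSeconds(2);
        foreach (string pattern in new[] { Nested, Flat })
        {
            for (int prefix = 16; prefix <= 24; prefix += 2)
            {
                // 'prefix' valid letters, then the invalid "Y", then a short tail.
                string id = new string('x', prefix) + "Yxxxx";
                Stopwatch sw = Stopwatch.StartNew();
                bool? result = TryMatch(pattern, id, timeout);
                sw.Stop();
                Console.WriteLine("{0,-22} prefix={1,2} result={2,-8} {3} ms",
                    pattern, prefix,
                    result.HasValue ? result.Value.ToString() : "timeout",
                    sw.ElapsedMilliseconds);
            }
        }
    }
}
```

Passing a timeout is also a reasonable safeguard wherever a pattern is run directly against untrusted input, since it turns a hang into a catchable RegexMatchTimeoutException.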


Solved!

The regex ^([a-z_][a-z_0-9]*) accepts the same IDs and is much faster.
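For reference, here is a sketch of the question's validation with the rewritten pattern dropped into the schema. It reuses the same XmlSchema.Read and XmlDocument.Validate calls as the original program; the class name FixedSchemaDemo and the helper CountErrors are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class FixedSchemaDemo
{
    // Same schema as in the question, with only the pattern replaced.
    private const string Xsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           elementFormDefault='unqualified'
           attributeFormDefault='unqualified'>
  <xs:simpleType name='id'>
    <xs:restriction base='xs:string'>
      <xs:pattern value='^([a-z_][a-z_0-9]*)' />
    </xs:restriction>
  </xs:simpleType>
  <xs:element name='root'>
    <xs:complexType>
      <xs:attribute name='id' use='required' type='id' />
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Returns the number of validation errors reported for the given id.
    public static int CountErrors(string id)
    {
        int errors = 0;
        ValidationEventHandler handler = (sender, e) =>
        {
            if (e.Severity == XmlSeverityType.Error) errors++;
        };

        XmlSchema schema;
        using (StringReader reader = new StringReader(Xsd))
        {
            schema = XmlSchema.Read(reader, handler);
        }

        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root id='" + id + "'></root>");
        doc.Schemas.Add(schema);
        doc.Validate(handler);
        return errors;
    }

    public static void Main()
    {
        // The IDs from the question: one valid, one with the offending "Y".
        Console.WriteLine(CountErrors("xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"));
        Console.WriteLine(CountErrors("xxxxxxxxxxxxxxxxxxxxxxxxYxxxx"));
    }
}
```

Because the rewritten pattern has no nested quantifier, there is only one way to match each prefix, so rejecting an invalid ID takes time proportional to its length rather than exponential in it.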
