Having trouble wrapping my head around this. I need to parse the following string using a regular expression to create the definition list below:
Width=3/8 in|Length=1 in|Thread - TPI or Pitch=|Bolt/Screw Length=|Material=|Coating=|Type=Snap-On|Used With=|Quantity=5000 per pack|Wt.=20 lb|Color=
The result would be something like this:
<dt>Width</dt>
<dd>3/8 in</dd>
<dt>Length </dt>
<dd>1 Inch</dd>
<dt>Thread - TPI or Pitch</dt>
<dd></dd>
<dt>Quantity</dt>
<dd>5000 a pack</dd>
<dt>Wt.</dt>
<dd>20 lb</dd>
If you don't need to reorder items or change their values, and are confident the values themselves don't contain the equals signs or vertical bars used as markup in the input, you could apply a series of regular expressions to introduce the HTML. Using Java's String class from Scala, this could be a dense but effective one-liner:
"Escape test=&<>|Width=3/8 in|Length=1 in|Thread - TPI or Pitch=|Bolt/Screw Length=|Material=|Coating=|Type=Snap-On|Used With=|Quantity=5000 per pack|Wt.=20 lb|Color=".
replaceAll("&","&amp;").
replaceAll("<","&lt;").
replaceAll(">","&gt;").
replaceAll("^","<dl>\n\t<dt>").
replaceAll("=","</dt>\n\t<dd>").
replaceAll("\\|","</dd>\n\n\t<dt>").
replaceAll("$","</dd>\n</dl>")
which yields
<dl>
	<dt>Escape test</dt>
	<dd>&amp;&lt;&gt;</dd>

	<dt>Width</dt>
	<dd>3/8 in</dd>

	<dt>Length</dt>
	<dd>1 in</dd>

	<dt>Thread - TPI or Pitch</dt>
	<dd></dd>

	<dt>Bolt/Screw Length</dt>
	<dd></dd>

	<dt>Material</dt>
	<dd></dd>

	<dt>Coating</dt>
	<dd></dd>

	<dt>Type</dt>
	<dd>Snap-On</dd>

	<dt>Used With</dt>
	<dd></dd>

	<dt>Quantity</dt>
	<dd>5000 per pack</dd>

	<dt>Wt.</dt>
	<dd>20 lb</dd>

	<dt>Color</dt>
	<dd></dd>
</dl>
You can use
([^=|]+)=([^|]+)(?:\||$)
Apply with the "global" flag.
Explanation:
(               # start match group 1
  [^=|]+        #   any character that's not a "=" or "|", at least once
)               # end match group 1
=               # a literal "="
(               # start match group 2
  [^|]+         #   any character that's not a "|", at least once
)               # end match group 2
(?:             # non-capturing group: followed by
  \|            #   either a literal "|"
  |             #   or…
  $             #   the end of the string
)               # end non-capturing group
The string parts you are interested in are in match groups 1 and 2, respectively. For me, the above matches:
group 1     group 2
Width       3/8 in
Length      1 in
Type        Snap-On
Quantity    5000 per pack
Wt.         20 lb
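Your example is inconsistent in the "Thread - TPI or Pitch" case: that entry has an empty value but still appears in your expected output, while the other empty entries do not. The expression above skips every entry whose value is empty; change the second + to * if you want to keep them.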
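For what it's worth, here is a minimal Java sketch of how the expression could be applied. Java has no "global" flag, so repeated Matcher.find() calls stand in for it. The class name DefinitionListSketch and the helpers escapeHtml and toDefinitionList are made up for illustration, and the sketch assumes values never contain "=" or "|".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DefinitionListSketch {
    // The expression from above: key in group 1, non-empty value in group 2.
    private static final Pattern PAIR = Pattern.compile("([^=|]+)=([^|]+)(?:\\||$)");

    // Hypothetical helper: escapes the characters that are special in HTML text.
    public static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Hypothetical helper: turns "key=value|key=value" input into a <dl> block.
    public static String toDefinitionList(String input) {
        StringBuilder html = new StringBuilder("<dl>\n");
        Matcher m = PAIR.matcher(input);
        while (m.find()) { // each find() call picks up the next key/value pair
            html.append("<dt>").append(escapeHtml(m.group(1).trim())).append("</dt>\n");
            html.append("<dd>").append(escapeHtml(m.group(2).trim())).append("</dd>\n");
        }
        return html.append("</dl>").toString();
    }

    public static void main(String[] args) {
        System.out.println(toDefinitionList(
            "Width=3/8 in|Length=1 in|Material=|Type=Snap-On|Color="));
    }
}
```

Entries such as Material= and Color= produce no output here, because group 2 requires at least one character.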
Something like this:
/(?:(.*?)=(.*?)(\||$))+/
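One thing to watch with a quantified outer group: in Java at least, after a match the capture groups hold only the text from the last repetition, so the earlier pairs are lost. Below is a rough sketch, with hypothetical names (LazyPairSketch, lastPair, allPairs), that contrasts the pattern as written with its inner part applied once per pair.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyPairSketch {
    // The pattern as written above, with the repeated outer group.
    private static final Pattern REPEATED = Pattern.compile("(?:(.*?)=(.*?)(\\||$))+");
    // Only the inner part, intended to be applied once per pair.
    private static final Pattern SINGLE = Pattern.compile("(.*?)=(.*?)(?:\\||$)");

    // Matches the whole input at once; the groups keep only the last repetition.
    public static String[] lastPair(String input) {
        Matcher m = REPEATED.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Walks the input pair by pair, collecting every key and value.
    public static List<String[]> allPairs(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = SINGLE.matcher(input);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String input = "Width=3/8 in|Thread - TPI or Pitch=|Wt.=20 lb";
        System.out.println(String.join(" -> ", lastPair(input)));
        for (String[] pair : allPairs(input)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Because the lazy .*? is allowed to match nothing, entries with an empty value (such as Thread - TPI or Pitch=) are kept by this variant. The flip side is that (.*?) can also run across a "|" if an entry has no "=" at all, so it is only safe on well-formed input.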