I have tried to use the following regular expression to remove the whitespace around HTML tags, including leading whitespace:
Find: \s*([<>])\s*
Replace: $1
But each time I do this, I end up with 186 occurrences of a literal $1 in my document. Any assistance would be greatly appreciated.
Here is an example of what I am talking about
This
<fieldset id="prod_desc">
<p>Original AA </p>
<b>Features:</b>
<ul>
<li>2 pole rectangular dome tent with 13.4 sq ft of vestibule storage </li>
<li>Durable, shockcorded, self-supporting fiberglass frame and ring and pin/pole pocket assembly </li>
<li>2 side opening door panels are constructed entirely of no see-um mesh to maximize air flow inside </li>
<li>Poke-out vent in side wall allows the option of additional ventilation when needed </li>
<li>2 interior storage pockets keep essential items handy Specifications: </li>
<li>Season: 3 </li>
<li>Sleeps: 2 </li>
<li>Doors: 2 </li>
<li>Windows: 2 </li>
<li>Weight: 5 lbs 12 oz </li>
<li>Area: 36.5 Sq. Ft. </li>
<li>Center Height: 3' 7.5"</li>
</ul>
</fieldset>
should become:
<fieldset id="prod_desc"><p>Original AA</p><b>Features:</b><ul><li>2 pole rectangular dome tent with 13.4 sq ft of vestibule storage</li><li>Durable, shockcorded, self-supporting fiberglass frame and ring and pin/pole pocket assembly</li><li>2 side opening door panels are constructed entirely of no see-um mesh to maximize air flow inside</li><li>Poke-out vent in side wall allows the option of additional ventilation when needed</li><li>2 interior storage pockets keep essential items handy Specifications:</li><li>Season: 3</li><li>Sleeps: 2</li><li>Doors: 2</li><li>Windows: 2</li><li>Weight: 5 lbs 12 oz</li><li>Area: 36.5 Sq. Ft.</li><li>Center Height: 3' 7.5"</li></ul></fieldset>
Notepad++ doesn't support $1 for backreferences before version 6.0, which introduced PCRE support for find-and-replace. For older versions, use \1 for backreferences.
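The same symptom can be reproduced outside Notepad++. As a rough illustration only (this uses Python's re module, not Notepad++ itself, and the helper name collapse is made up for this sketch), a replacement string whose group syntax the engine does not recognize is inserted literally:

```python
import re

# Hypothetical helper for illustration; the pattern is the one from the question.
def collapse(text, replacement):
    return re.sub(r'\s*([<>])\s*', replacement, text)

sample = '<p>a </p>'

# Python's re does not treat "$1" as a group reference, so it is inserted as-is,
# much like the literal $1 occurrences described in the question.
literal = collapse(sample, '$1')

# "\1" is the group reference Python understands, so the bracket is kept.
expanded = collapse(sample, r'\1')

print(literal)
print(expanded)
```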
You should be finding \s*(<[^>]+>)\s* instead. As of Notepad++ version 6.0, released in March 2012, this alone should work for you. I tried your original regex and, much to my surprise, it works as well.
Previous versions cannot do multi-line regex replacements. To strip newlines, perform the regex replacement first, then do an extended find (UNIX line endings):
\n
For Windows line endings:
\r\n
Replace either case with nothing.
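For checking the pattern outside the editor, here is a minimal sketch in Python. It assumes Python's re module behaves like the newer Notepad++ engine for this pattern, and the function name strip_tag_whitespace is hypothetical. Because \s also matches \n and \r in Python, no separate newline step is needed here:

```python
import re

# Matches a whole tag plus any whitespace (including line breaks) on either side.
TAG_WITH_WHITESPACE = re.compile(r'\s*(<[^>]+>)\s*')

def strip_tag_whitespace(html):
    # Keep only the captured tag; the surrounding whitespace is dropped.
    return TAG_WITH_WHITESPACE.sub(r'\1', html)

sample = '<ul>\n<li>Season: 3 </li>\n<li>Sleeps: 2 </li>\n</ul>\n'
print(strip_tag_whitespace(sample))
```

Whitespace inside the text content, such as the space in "Season: 3", is left alone because it is not adjacent to a tag.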
You could use the expression \s+\<(.*)\>\s+
and replace with <$1> (or <\1> in Notepad++); the angle brackets must be written back because the capture group does not include them.
Or you could use this approach:
- first, match \s+\< and replace with <
- second, match \>\s+ and replace with >
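The two-pass approach can be sketched in Python as follows. This is an illustration under the assumption that plain < and > are used in the patterns (Python needs no backslash before them), and the function name two_pass_strip is made up for the example:

```python
import re

def two_pass_strip(html):
    # Pass 1: remove whitespace that precedes an opening angle bracket.
    html = re.sub(r'\s+<', '<', html)
    # Pass 2: remove whitespace that follows a closing angle bracket.
    html = re.sub(r'>\s+', '>', html)
    return html

sample = '<ul>\n<li>Season: 3 </li>\n<li>Sleeps: 2 </li>\n</ul>\n'
print(two_pass_strip(sample))
```

Since neither pass needs a capture group, this variant sidesteps the $1 versus \1 question entirely.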