I made this regex:
/\<+[a-zA-Z0-9\=\"\s]+\>+.+\<\/+[a-zA-Z0-9]+\>/gi
which matches a full html tag like:
<p>this is a paragraph</p>
But the problem with this is that it matches all of the elements as one match:
<div><p>this is a paragraph</p></div>
But I would like to get all of the HTML elements separated.
Note: The HTML tags are in a string not in the DOM.
Before the regex solution I tried creating a new div element and setting the string as its innerHTML, but that didn't work properly and I don't really know why.
So I'm looking for a REGEX solution which solves this one match problem.
Thanks
Replacing the inner +.+
with +[^<]+
would prevent it from matching the whole string, but it would then only match elements that contain no child tags. Regular expressions are not the correct choice for processing strings that contain nested components. For that you should be using a parser.
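To illustrate the substitution suggested above, here is a minimal sketch that runs both patterns against the sample string from the question. The variable names (html, greedy, noNested) are made up for this example. It shows that the changed pattern no longer swallows the whole string, and also that it skips the outer div entirely, which is the nesting limitation mentioned above.

```javascript
// Sample input from the question.
const html = '<div><p>this is a paragraph</p></div>';

// Original pattern: the greedy .+ runs on to the last closing tag.
const greedy = /\<+[a-zA-Z0-9\=\"\s]+\>+.+\<\/+[a-zA-Z0-9]+\>/gi;

// Suggested change: [^<]+ stops at the next "<", so the element
// content may not contain any child tags.
const noNested = /\<+[a-zA-Z0-9\=\"\s]+\>+[^<]+\<\/+[a-zA-Z0-9]+\>/gi;

const greedyMatches = html.match(greedy);
const noNestedMatches = html.match(noNested);

console.log(greedyMatches);   // [ '<div><p>this is a paragraph</p></div>' ]
console.log(noNestedMatches); // [ '<p>this is a paragraph</p>' ]
```

Only the innermost p element is returned by the second pattern; the div that wraps it is never reported as its own match.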
Regular expressions are simply the wrong tool for the job here.
Regular expressions are not appropriate for handling HTML. You mention that the HTML is not part of the DOM:
Note: The HTML tags are in a string not in the DOM.
You can use jQuery to build an object from the HTML string and then use DOM selectors / traversal to work with it:
$(myHTMLString).find('p')...
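If jQuery is not available, the same idea can probably be done with the browser's built-in DOMParser. This is only a sketch under that assumption: the function name listElements is invented for this example, and it needs a browser (or another environment that provides DOMParser), so the demo call is guarded.

```javascript
// Parse an HTML string without attaching it to the page, then return
// the outerHTML of every element found, in document order.
// Assumes a global DOMParser, as browsers provide.
function listElements(htmlString) {
  const doc = new DOMParser().parseFromString(htmlString, 'text/html');
  return Array.from(doc.body.querySelectorAll('*'), (el) => el.outerHTML);
}

// Only run the demo where DOMParser exists (e.g. in a browser console).
if (typeof DOMParser !== 'undefined') {
  console.log(listElements('<div><p>this is a paragraph</p></div>'));
  // [ '<div><p>this is a paragraph</p></div>', '<p>this is a paragraph</p>' ]
}
```

Each element comes back as its own string, nested or not, which is what the regex could not do.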