I need to catch the following tags + content in html source of the page:
<li class="someClass someClass2">
... some html code ...
</li>
I'm not very good at regular expressions, so I'd also appreciate comments containing links to a good tutorial. I've been checking out http://www.regular-expressions.info/, but I'm not very happy with the explanations there.
What I found on the above site was something like this:
<li\b[^>]*>(.*?)</li>
This matches all the <li> tags, which is not what I want. I tried messing around with it, and tested this one:
<li class="someClass someClass[1-9]{1,1}[0-9]*">(.*?)</li>
Unfortunately, this one doesn't do the job either. The second class name is in the format someClassX, where X is from {1, 2, ... } (well, obviously, it's not a set of natural numbers :) )
All I get from this regexp is "no matches". I'm using Ubuntu, Kodos tool.
What's even more depressing is the fact that this regexp:
<li class="someClass someClass[1-9]{1,1}[0-9]*">
actually catches the opening <li> tags, but nothing more, just as if it gets "distracted" by the newline character.
I'm still looking for a solution on Google, and I'll post it here if found, but I would also really appreciate some helpful input :)
Thx
This regex does what you're looking for (in Kodos at least... your mileage may vary!)
<li class="someClass someClass[0-9]+">(.*\n)*?</li>
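Note that this pattern only matches when </li> comes right at the start of a line, because (.*\n)*? consumes whole lines. Since Kodos uses Python's re module, another option is to keep the original (.*?) group and turn on the DOTALL flag, so that "." also matches newlines. Below is a minimal sketch of that idea; the sample HTML and the variable names are made up for illustration, and it assumes the <li> elements are not nested.

```python
import re

# Hypothetical sample input; only the class names come from the question.
html = """<ul>
<li class="someClass someClass2">
  <a href="#">first</a>
</li>
<li class="other">
  skipped
</li>
<li class="someClass someClass13">
  second
</li>
</ul>"""

pattern = r'<li class="someClass someClass[0-9]+">(.*?)</li>'

# Without re.DOTALL, "." does not match "\n", so (.*?) cannot span lines.
single_line = re.compile(pattern)

# With re.DOTALL, "." matches newlines too; the lazy ".*?" stops at the
# first </li> it finds, so each item is captured separately.
multi_line = re.compile(pattern, re.DOTALL)

no_matches = single_line.findall(html)
bodies = [body.strip() for body in multi_line.findall(html)]

print(no_matches)
print(bodies)
```

In Kodos, the equivalent should be ticking the "Dot All" flag. For anything beyond simple, non-nested markup, an HTML parser is likely to be more robust than a regex.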