How to get this regex working?

Developer https://www.devze.com 2023-01-14 16:36 Source: Internet

I have a small problem. In

<tr><td>3</td><td>foo</td><td>2</td>

I want to find the foo. I use:

$<tr><td>\d</td><td>(.*)</td>$

This doesn't work, because the (.*) doesn't stop at the </td> right after foo; it runs on to the </td> at the end of the string.

You have to make the .* lazy instead of greedy: a greedy .* consumes as much as it can, while a lazy .*? consumes as little as it can.
Your $ anchors also don't make sense, since $ matches only at the end of the string. Try:

<tr><td>\d<\/td><td>(.*?)<\/td>

(As seen on rubular.)
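
To see the difference between the greedy and the lazy capture, here is a minimal sketch in Python (my choice of language; the question doesn't name one). The variable names are only illustrative, and the \/ escapes are dropped because they matter only where / delimits the regex:

```python
import re

row = "<tr><td>3</td><td>foo</td><td>2</td>"

# Greedy: (.*) runs as far as it can, then backs up to the LAST </td>.
greedy = re.search(r"<tr><td>\d</td><td>(.*)</td>", row)

# Lazy: (.*?) grows one character at a time and stops at the FIRST </td>.
lazy = re.search(r"<tr><td>\d</td><td>(.*?)</td>", row)

print(greedy.group(1))  # foo</td><td>2
print(lazy.group(1))    # foo
```

Note that the lazy version still crosses tags if the first </td> is missing, so it only suits input as regular as the row above.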

NOTE: I don't advocate using regex to parse HTML. But sometimes the task at hand is simple enough to be handled by regex, and a full-blown XML parser would be overkill (for example: this question). Knowing how to pick the "right tool for the job" is an important skill in programming.

Your leading $ should be a ^.

If you don't want to match all the way to the end of the string, don't use a $ at the end. Even then, since * is greedy, the .* will grab as much as it can. Some regex implementations have a non-greedy quantifier (*?) which would work, but you probably just want to change (.*) to ([^<]*).
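
A quick sketch of that suggestion, again in Python as an assumed language with illustrative variable names. It puts ^ at the start, drops the trailing $, and uses [^<]* so the capture can never run past the next tag:

```python
import re

row = "<tr><td>3</td><td>foo</td><td>2</td>"

# [^<]* matches any run of characters that are not "<",
# so the capture ends at the first closing tag by construction.
match = re.search(r"^<tr><td>\d</td><td>([^<]*)</td>", row)
print(match.group(1))  # foo

# The pattern from the question: the leading $ only matches at the end
# of the string, so nothing can follow it and the search finds nothing.
original = re.search(r"$<tr><td>\d</td><td>(.*)</td>$", row)
print(original)  # None
```

Unlike the lazy quantifier, the negated character class works in practically every regex flavor, but it won't match a cell that contains nested tags.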

Use: