Regex to extract specific urls from <img> tags in an HTML document

Developer https://www.devze.com 2023-03-22 06:05 Source: Internet

I am trying to extract a specific url pattern from the body of some content and replace it with a newly formed url. But I am running into problems with my regex patterns and wanted to see if you could

Here is the code I am testing this with:

$body = '<p><img src="/file/637/view" height="540" width="640"></p>';
$pattern = '/src="/file/(0-9)+/view"/';
$pattern = '/src="/file/(.)+/view"/';
$pattern = '/"/file/[0-9]+/view"';
$pattern = '/\<img src="(.)+"(.)+"\>/';

preg_match($pattern, $body, $matches);

Now, the last pattern will grab the entire image tag, which is great, but what I want is to extract all image URLs (just the URL) that use the "/file/(some number)/view" pattern, so that I can form new URLs and then do a string replace on them. All of the other patterns fail to find anything when I run print_r on the $matches var.

Obviously the body string represents the content body that I am scanning for this. Any suggestions as to how to get this to work and grab just the image url? This will have to work for situations with multiple images intermingled with lots of other html including anchor tags.

Try replacing (.)+ with (.*?), since (.)+ is greedy and its group keeps only the last character it matched. Or, for your specific problem, try the following:

$body = '<p><img src="/file/637/view" height="540" width="640"></p>';
$pattern = '/\/file\/([0-9]+)\/view/';


preg_match($pattern, $body, $matches);
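
Since the question mentions several images mixed in with anchor tags, here is a minimal sketch that uses preg_match_all instead of preg_match and restricts the match to src attributes inside <img> tags. The sample markup in $body is made up for illustration, and the pattern assumes double-quoted attributes as in the question.

```php
<?php
// Made-up sample content: two images plus an anchor that uses the same URL shape.
$body = '<p><img src="/file/637/view" height="540" width="640"></p>'
      . '<a href="/file/12/view">link</a>'
      . '<p><img alt="x" src="/file/702/view"></p>';

// Only match src attributes that sit inside an <img ...> tag.
// Group 1 is the whole URL, group 2 is just the number.
$pattern = '/<img\b[^>]*\bsrc="(\/file\/([0-9]+)\/view)"/i';

preg_match_all($pattern, $body, $matches);

$urls = $matches[1]; // every matching image URL
$ids  = $matches[2]; // the numeric part of each URL

print_r($urls);
```

The anchor's href is skipped because the pattern requires the match to start at <img and to sit in a src attribute.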

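You need to escape the slashes. The / character is the pattern delimiter in your code, so every literal / inside the pattern has to be written as \/, and your patterns contain unescaped ones.

Try this: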

$body = '<p><img src="/file/637/view" height="540" width="640"></p>';
$pattern = '/<img src="\/file\/([0-9]+)\/view"/';

preg_match($pattern, $body, $matches);

echo ($matches[1]);
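
For the replacement step the question describes, one possible approach is preg_replace_callback, which rewrites each matching URL in a single pass rather than extracting first and string-replacing afterwards. This is only a sketch: build_new_url and the '/images/<id>.jpg' target format are hypothetical, since the question does not say what the new URLs look like. It also uses # as the delimiter, so the slashes need no escaping.

```php
<?php
// Made-up sample content with two images and an anchor that must stay untouched.
$body = '<p><img src="/file/637/view" height="540" width="640"></p>'
      . '<p><a href="/file/637/view">original</a> <img src="/file/9/view"></p>';

// Hypothetical helper: the real target URL scheme is not given in the question.
function build_new_url(string $id): string
{
    return '/images/' . $id . '.jpg';
}

// Group 1: everything from <img up to the opening quote of src.
// Group 2: the number. Group 3: the closing quote.
$pattern = '#(<img\b[^>]*\bsrc=")/file/([0-9]+)/view(")#i';

$rewritten = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // Keep the surrounding tag text and swap in the new URL.
        return $m[1] . build_new_url($m[2]) . $m[3];
    },
    $body
);

echo $rewritten, PHP_EOL;
```

Only the src values inside <img> tags change; the anchor's href keeps its original URL.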