I'm trying to fetch my articles and build a slider out of them.
Each of my articles has an image inside its text, like this:
<p>
<img src="story_img.jpg" width=120 height=80>
In the last couple of weeks I often had to download a lot of files, submitted to a web-based teaching platform. Downloading all these files by hand is very annoying so I implemented a short Groovy script. Since Groovy has a great support for parsing well-formed XML-like information it fails if you want to parse unstructured and nasty HTML code.
</p>
What I need is simple: first extract the image from the text, then remove it from the text,
so that I end up with two variables:
$imgOfText = ?
$TextWithOutImg = ?
I tried different approaches in PHP and even read this topic,
but I couldn't get it to work.
It's HTML, so you can parse it! Use DOMDocument:
$html = '<p>';
$html.= '<img src="story_img.jpg" width=120 height=80>';
$html.= 'In the last couple of weeks I often had to download a lot ';
$html.= 'of files, submitted to a web-based teaching platform. Downloading ';
$html.= 'all these files by hand is very annoying so I implemented a short ';
$html.= 'Groovy script. Since Groovy has a great support for parsing well-';
$html.= 'formed XML-like information it fails if you want to parse ';
$html.= 'unstructured and nasty HTML code.';
$html.= '</p>';
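$doc = new DOMDocument();
$doc->loadHTML($html);
// Grab the first <p> and the first <img> in the document.
$p = $doc->getElementsByTagName('p')->item(0);
$img = $doc->getElementsByTagName('img')->item(0);
$imgOfText = $img->getAttribute('src');
// nodeValue returns only the text content, so the <img> tag is not included.
$TextWithOutImg = $p->nodeValue;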
Demo here
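If the surrounding markup should be kept and only the image dropped, a variation on the same DOMDocument idea is to remove the img node and serialize what is left. This is only a minimal sketch: the shortened sample text, the libxml error suppression, and the null check are additions for illustration and are not part of the answer above.

```php
<?php
// Shortened sample input (assumption: one <p> containing one <img>).
$html = '<p><img src="story_img.jpg" width=120 height=80>Some article text.</p>';

// Collect libxml warnings internally instead of printing them for loose HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$imgOfText = null;
$img = $doc->getElementsByTagName('img')->item(0);
if ($img !== null) {
    // Read the src first, then detach the node from its parent.
    $imgOfText = $img->getAttribute('src');
    $img->parentNode->removeChild($img);
}

// Serialize only the <p> element, which no longer contains the image.
$p = $doc->getElementsByTagName('p')->item(0);
$TextWithOutImg = $doc->saveHTML($p);

echo "Image Source: " . $imgOfText . "\n";
echo "Text Without Image: " . $TextWithOutImg . "\n";
```

Unlike nodeValue, this keeps the p tags and any inline markup inside the paragraph, which may or may not be what the slider needs.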
How about this live demo I whipped up? It's just some very basic parsing using strpos(). I'm sure this could be done with regular expressions, but I never was any good at those :)
CODE
<?php
$html = '<p>';
$html.= ' <img src="story_img.jpg" width=120 height=80>';
$html.= ' In the last couple of weeks I often had to download a lot ';
$html.= 'of files, submitted to a web-based teaching platform. Downloading ';
$html.= 'all these files by hand is very annoying so I implemented a short ';
$html.= 'Groovy script. Since Groovy has a great support for parsing well-';
$html.= 'formed XML-like information it fails if you want to parse ';
$html.= 'unstructured and nasty HTML code.';
$html.= '</p>';
$spot = strpos($html, 'src="', strpos($html, '<img'))+5;
$spot2 =strpos($html, '"', $spot);
$imgOfText = substr($html, $spot, $spot2-$spot);
$spot = strpos($html, '<img');
$spot2 = strpos($html, '>', $spot)+1;
$TextWithOutImg = substr($html,0,$spot).substr($html,$spot2);
echo "Image Source: ".$imgOfText."\n\n";
echo "Text Without Image:\n".$TextWithOutImg;
?>
OUTPUT
Image Source: story_img.jpg
Text Without Image:
<p>In the last couple of weeks I often had to download a lot of files, submitted to a web-based teaching platform. Downloading all these files by hand is very annoying so I implemented a short Groovy script. Since Groovy has a great support for parsing well-formed XML-like information it fails if you want to parse unstructured and nasty HTML code.</p>
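For comparison, the regular-expression route mentioned above might look roughly like this. It is a hedged sketch rather than a robust parser: it assumes the src value is quoted, only looks at the first img tag, and uses a shortened sample string that is not from the original post.

```php
<?php
// Shortened sample input (assumption: src is wrapped in single or double quotes).
$html = '<p><img src="story_img.jpg" width=120 height=80>Some article text.</p>';

// Capture the quoted src value of the first <img> tag, if there is one.
$imgOfText = null;
if (preg_match('/<img\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1/i', $html, $m)) {
    $imgOfText = $m[2];
}

// Strip the first <img ...> tag; the limit of 1 leaves later images alone.
$TextWithOutImg = preg_replace('/<img\b[^>]*>/i', '', $html, 1);

echo "Image Source: " . $imgOfText . "\n";
echo "Text Without Image: " . $TextWithOutImg . "\n";
```

Unlike the strpos() version, this leaves the string unchanged and $imgOfText as null when no img tag is found, but it will still trip over unusual markup such as a ">" inside an attribute value.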
Try this topic: PHP - remove <img> tag from string
There are a number of PHP libraries that can parse HTML, even invalid HTML:
PHPQuery
Simple HTML DOM
Zend DOM Query
Here is a PHPQuery example that prints the src of every img tag that appears on the Stack Overflow home page.
<?php
$html = file_get_contents('http://stackoverflow.com');
include('phpQuery.php');
$pq = phpQuery::newDocumentHTML($html, 'utf-8');
foreach ($pq->find('img') as $img)
{
echo pq($img)->attr('src') .'<br>';
}
?>
Another example, which extracts the text of every paragraph:
foreach ($pq->find('p') as $p)
{
echo pq($p)->text() .'<br>';
}