Extracting the src, alt and title attributes of img tags from HTML with PHP is straightforward and can be done in two ways: with regular expression matching or with the Document Object Model (DOM).
Using regular expressions to extract img tags is a bad idea and will likely lead to unmaintainable and unreliable code.
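To illustrate why the regular expression route is fragile, here is a minimal sketch. The sample markup and the naive pattern are made up for this illustration and are not taken from any real project. The pattern assumes a lowercase tag, double-quoted values, and src appearing before alt, so it misses perfectly valid variations that the DOM parser handles.

```php
<?php
// Three valid <img> tags written in three different styles (illustrative sample).
$html = '<img src="a.jpg" alt="A">'
      . "<img alt='B' src='b.jpg'>"
      . '<IMG SRC=c.jpg ALT=C>';

// Hypothetical naive pattern: lowercase tag, src before alt, double quotes only.
$pattern = '/<img\s+src="([^"]*)"\s+alt="([^"]*)"/';
$regexCount = preg_match_all($pattern, $html, $matches);

// The DOM parser copes with attribute order, quoting style and letter case.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$domCount = $doc->getElementsByTagName('img')->length;

echo "regex matched: $regexCount, DOM found: $domCount\n";
```

The pattern could be extended to cover each of these cases, but every extension makes it harder to read and still leaves other valid HTML forms uncovered.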
Using DOM: Here is a DOMDocument-based example that extracts img src, alt and title from HTML using PHP.
Assuming you have the URL of the page:
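<?php
$url = "http://example.com";

// Fetch the page; file_get_contents() returns false on failure.
$html = file_get_contents($url);
if ($html === false) {
    exit("Could not fetch $url");
}

// Collect libxml warnings about malformed HTML instead of silencing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Print src, alt and title of every <img> element (empty string if absent).
foreach ($doc->getElementsByTagName('img') as $tag) {
    echo $tag->getAttribute('src'), "\n";
    echo $tag->getAttribute('alt'), "\n";
    echo $tag->getAttribute('title'), "\n";
}
?>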
If you already have the HTML content, the idea is the same. This example uses XPath for illustration:
<?php
$content = '<html><body>Test<br><img src="myimage.jpg" alt="image alt" title="image title"></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($content);

// Convert to SimpleXML to make the XPath query and attribute access simpler.
$xml = simplexml_import_dom($doc);
$images = $xml->xpath('//img');

foreach ($images as $img) {
    // Print each attribute only if it is present on the element.
    if (isset($img['src'])) {
        echo "Source : " . $img['src'] . "<br />";
    }
    if (isset($img['alt'])) {
        echo "Alt : " . $img['alt'] . "<br />";
    }
    if (isset($img['title'])) {
        echo "Title : " . $img['title'] . "<br />";
    }
}
?>
The DOMDocument::loadHTML() method copes with HTML syntax and does not require the input document to be XHTML. The conversion to SimpleXMLElement is not necessary; it just makes running the XPath query and reading its results simpler.
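Since SimpleXMLElement is optional, the same extraction can be sketched with DOMXPath alone. This is one possible variant, not the only way to do it. The second img tag in the sample markup and the $images array layout are assumptions added for illustration, and the query //img[@src] deliberately skips images without a src attribute.

```php
<?php
// Sample markup; the second <img> is added here to show a missing alt/title.
$content = '<html><body>Test<br>'
         . '<img src="myimage.jpg" alt="image alt" title="image title">'
         . '<img src="second.png">'
         . '</body></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($content);
libxml_clear_errors();

// Query the DOM directly, without converting to SimpleXML.
$xpath = new DOMXPath($doc);

$images = [];
foreach ($xpath->query('//img[@src]') as $img) {
    $images[] = [
        'src'   => $img->getAttribute('src'),
        // getAttribute() returns '' for a missing attribute, so check explicitly.
        'alt'   => $img->hasAttribute('alt') ? $img->getAttribute('alt') : null,
        'title' => $img->hasAttribute('title') ? $img->getAttribute('title') : null,
    ];
}

print_r($images);
```

Collecting the results into an array rather than echoing them keeps the extraction separate from the output, so the values can be escaped with htmlspecialchars() before they are printed into a page.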