Create thumbnails from images on a webpage with PHP

Posted July 20th, 2010 in PHP

In previous posts I have looked at how to extract images from a web page with PHP and the Simple HTML DOM Parser and generate thumbnails with PHP using a class I created. This page combines the two by downloading all the images from a specified web page and creating thumbnails for them.

The code

Read the two posts linked above for full details about how the HTML DOM Parser and my thumbnail generation class work. Then have a look at the few lines of code below that combine the two to generate the thumbnails.

Note that the code downloads the images directly using file_get_contents and therefore needs URL-aware fopen wrappers (the allow_url_fopen setting) enabled to work. Read my last post, which shows how to check if these are enabled.
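As a quick reminder, here is a minimal sketch of such a check. It assumes the relevant setting is allow_url_fopen, which is the PHP option that controls URL-aware fopen wrappers. The function name urlWrappersEnabled() is made up for this example and is not part of the script below.

```php
<?php
// urlWrappersEnabled() is a hypothetical helper name used for illustration.
// It reports whether file functions such as file_get_contents() may open URLs.
function urlWrappersEnabled(): bool
{
    // ini_get() returns the setting as a string such as "1", "0", "On" or "".
    $value = ini_get('allow_url_fopen');

    // FILTER_VALIDATE_BOOLEAN turns "1", "on", "true" and "yes" into true.
    return filter_var($value, FILTER_VALIDATE_BOOLEAN);
}

if (!urlWrappersEnabled()) {
    echo "allow_url_fopen is disabled; remote images cannot be downloaded\n";
}
```

Running a check like this before the main script gives a clear message instead of a string of failed downloads.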

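// Load the Simple HTML DOM Parser and fetch the page to scan.
require_once('/path/to/simple_html_dom.php');
$html = file_get_html('http://www.cnn.com/');

// Use the src as the array key so each image is only listed once.
$images = array();
foreach($html->find('img') as $element) {
    $images[$element->src] = true;
}

// $tg must already hold an instance of the thumbnail generator class
// from the earlier post; it is not created in this snippet.
foreach($images as $url => $void) {
    $tg->generate($url, 100, 100, '/path/to/thumbnails/' . md5($url) . '.jpg');
}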

The example downloads the www.cnn.com homepage and extracts all the images using the HTML DOM Parser, whose selector syntax is similar to jQuery's.

The images that are found are put into an array indexed by the full URL. Because an array key can only exist once, this effectively eliminates duplicates (which in the case of the CNN homepage at the current time includes a 1 pixel spacer image used many times).
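The key-based de-duplication can be seen in isolation in the sketch below. The URLs are invented for illustration and stand in for the src values the parser would return.

```php
<?php
// Sample src values standing in for what the parser might return;
// these URLs are made up for illustration.
$found = array(
    'http://example.com/spacer.gif',
    'http://example.com/logo.png',
    'http://example.com/spacer.gif',
    'http://example.com/spacer.gif',
);

$images = array();
foreach ($found as $src) {
    // Assigning to an existing key overwrites it rather than adding a new entry.
    $images[$src] = true;
}

// array_keys() gives back the distinct URLs in the order they were first seen.
$unique = array_keys($images);
print_r($unique);
```

The spacer image appears three times in the input but only once in the result.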

This array is then looped through and the thumbnail images generated. In the example I've named each one using an md5 hash of the full URL with a .jpg extension/format. This avoids problems with slashes, query strings and other characters in the URL that are awkward in a filename.
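To show why the hash makes a safe filename, here is a small sketch. thumbnailPath() is a hypothetical helper written for this example, and the directory is the same placeholder path used in the script above.

```php
<?php
// thumbnailPath() is a hypothetical helper; the default directory is a placeholder.
function thumbnailPath(string $url, string $dir = '/path/to/thumbnails/'): string
{
    // md5() always returns 32 hexadecimal characters, so the filename
    // contains no slashes, query strings or other awkward characters.
    return rtrim($dir, '/') . '/' . md5($url) . '.jpg';
}

echo thumbnailPath('http://example.com/images/logo.png?v=2'), "\n";
```

The same URL always maps to the same filename, so running the script again overwrites the existing thumbnail instead of creating a second copy.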

The above example will create thumbnails that are a maximum of 100x100 pixels.
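"Maximum" here means the thumbnail fits inside a 100x100 box. The sketch below shows one common way to work out such dimensions. It assumes the aspect ratio is kept and that smaller images are not enlarged, which may differ from what the thumbnail class actually does. fitWithin() is a hypothetical helper and not a method of that class.

```php
<?php
// fitWithin() is a hypothetical helper, not part of the thumbnail class.
// It scales a width and height down to fit inside a bounding box while
// keeping the aspect ratio, and never enlarges a smaller image.
// Both $width and $height are assumed to be greater than zero.
function fitWithin(int $width, int $height, int $maxWidth, int $maxHeight): array
{
    // The smaller of the two ratios is the one that makes both sides fit;
    // capping at 1 prevents upscaling.
    $scale = min($maxWidth / $width, $maxHeight / $height, 1);

    return array(
        max(1, (int) round($width * $scale)),
        max(1, (int) round($height * $scale)),
    );
}

list($w, $h) = fitWithin(800, 600, 100, 100);
echo "{$w}x{$h}\n"; // 100x75
```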

CSS images won't be included

Note that the above example only gets images from the page which are defined with an <img> tag; any defined as CSS backgrounds, whether inline in a style attribute or in a style sheet, will not be downloaded.
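If you did want to pick up CSS images as well, one possible starting point is to pull the url(...) references out of a CSS string with a regular expression. This is only a rough sketch: cssImageUrls() is a hypothetical helper, it does not resolve relative URLs, and it is not part of the script above.

```php
<?php
// cssImageUrls() is a hypothetical helper. It pulls url(...) references out
// of a CSS string, such as a style attribute or the text of a style sheet.
function cssImageUrls(string $css): array
{
    // Match url( ... ) with optional single or double quotes around the value.
    preg_match_all('/url\(\s*[\'"]?([^\'")]+?)[\'"]?\s*\)/i', $css, $matches);

    // Drop repeated references and reindex the result from zero.
    return array_values(array_unique($matches[1]));
}

$style = 'background: url("/img/header.png") no-repeat; color: #fff';
print_r(cssImageUrls($style));
```

Note that URLs found in a style sheet are relative to the style sheet's own location, so they would need resolving before they could be downloaded.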

Refinements to the script

The script could be refined to exclude images that are below a certain size (e.g. images which are less than 100 pixels wide or 100 pixels high could be ignored) or of a particular format. You could do the latter with my thumbnail generation class by setting the allowable types. I'll have a look at these (and any suggestions made in the comments below) and post an update in a few days.
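As a rough idea of how the size check might look, the sketch below uses PHP's getimagesizefromstring() (available from PHP 5.4) on the downloaded image data. isTooSmall() is a hypothetical helper and the 100 pixel defaults are just the figures mentioned above.

```php
<?php
// isTooSmall() is a hypothetical helper. $data is the raw image file contents,
// for example the result of file_get_contents($url).
function isTooSmall(string $data, int $minWidth = 100, int $minHeight = 100): bool
{
    // getimagesizefromstring() returns false when the data is not a
    // recognised image; treat that as something to skip as well.
    $info = @getimagesizefromstring($data);
    if ($info === false) {
        return true;
    }

    // The first two elements of the result are the width and height in pixels.
    list($width, $height) = $info;
    return $width < $minWidth || $height < $minHeight;
}

// A 1x1 transparent GIF, like the spacer image found on the CNN homepage.
$spacer = base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
var_dump(isTooSmall($spacer));
```

In the main loop this would mean downloading each image once, testing it, and only then passing it on for thumbnail generation.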
