Download a sequence of files with curl


In the past, to download a sequence of files (e.g. named blue00.png to blue09.png), I've used a for loop with wget, but there's a simpler and more powerful way to do the same thing with curl.

Example files

There are a bunch of freely available* map icon images for use with Google Maps in the google-maps-icons project on Google Code.
* freely available with attribution

The examples in this post download the images
from
  http://google-maps-icons.googlecode.com/files/blue00.png
to
  http://google-maps-icons.googlecode.com/files/blue09.png

Using a for loop with wget

As I mentioned, in the past I would have used a for loop with wget, and I have covered this before in my "no seq on Mac OS X - use jot instead" post. Using this method, the above files would be downloaded like so:

On a Mac:

for number in `jot - 0 9`; do wget http://google-maps-icons.googlecode.com/files/blue0$number.png; done

On Linux:

for number in `seq 0 9`; do wget http://google-maps-icons.googlecode.com/files/blue0$number.png; done
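
If neither seq nor jot is to hand, the same zero-padded names can be generated with a plain counter and printf. This is only a sketch: the function name icon_urls is made up for illustration, and it prints the URLs rather than downloading them so the list can be checked first.

```shell
#!/bin/sh
# Print the URLs blue00.png .. blue09.png, one per line.
# icon_urls is a hypothetical helper name, not part of wget or curl.
icon_urls() {
    base="http://google-maps-icons.googlecode.com/files"
    number=0
    while [ "$number" -le 9 ]; do
        # %02d pads the counter to two digits: 00, 01, ... 09
        printf '%s/blue%02d.png\n' "$base" "$number"
        number=$((number + 1))
    done
}

# Show the list; nothing is downloaded here.
icon_urls
```

To actually fetch the files, the list can be piped into wget, which reads URLs from standard input when given -i -, i.e. icon_urls | wget -i -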

Using curl instead

Curl can expand sequences (numeric ranges, including ones with leading zeroes, and alphanumeric series) as part of the URL in the download command, which makes this a lot easier. This is all covered in the man page, so I suggest reading it for a complete understanding of the options available.

To download the blue icons from 00 to 09 with curl, do this:

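curl -O "http://google-maps-icons.googlecode.com/files/blue0[0-9].png"

The -O flag tells curl to save each download to a local file named after the remote file instead of writing it to standard output. The URL is quoted so that the shell doesn't try to treat the square brackets as a filename pattern.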

Because curl supports leading zeroes in the sequence we can also easily download the images from 00 to 20 with just one command:

curl -O "http://google-maps-icons.googlecode.com/files/blue[00-20].png"
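
After a ranged download like this, it can be handy to confirm that every file in the range actually arrived. The following is a rough sketch of one way to check; missing_icons is a hypothetical helper written for this post, not something curl provides, and it assumes the files were saved under their original names.

```shell
#!/bin/sh
# List which of blue00.png .. blue20.png are absent from a directory.
# missing_icons is a hypothetical helper, not a curl feature.
missing_icons() {
    dir="${1:-.}"    # directory to check, defaults to the current one
    number=0
    while [ "$number" -le 20 ]; do
        name=$(printf 'blue%02d.png' "$number")
        # Print the name only when the file does not exist.
        [ -e "$dir/$name" ] || printf '%s\n' "$name"
        number=$((number + 1))
    done
}

# Check the current directory; no output means all 21 files are present.
missing_icons .
```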

Saving with a different filename, based on the sequence

If you want to download some files but save them under names different from the source filenames, curl can do that too. The current value of each [] sequence is available as #1, #2, etc., where the number corresponds to the position of the sequence in the URL.

As an example, to download http://www.example.com/page/1 to http://www.example.com/page/20 but save the files as 1.html to 20.html, do this:

curl -o "#1.html" "http://www.example.com/page/[1-20]"

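I found that when the -o filename starts with a #, the value needs to be enclosed in quotes. Unquoted, the shell treats the # and everything after it on the line as a comment, so curl never sees the filename and reports the error "curl: option -o: requires parameter"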
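
To see what the shell hands over with and without the quotes, here is a small sketch that swaps curl for a stand-in function. count_args is a hypothetical helper that only reports how many arguments it received; no download takes place.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for curl: it just prints its argument count.
count_args() {
    echo "$#"
}

# Unquoted: the word starting with # begins a comment, so only -o is passed.
count_args -o #1.html http://www.example.com/page/[1-20]

# Quoted: -o, the filename template and the URL all arrive as arguments.
count_args -o "#1.html" "http://www.example.com/page/[1-20]"
```

Run as a script, the first call prints 1 and the second prints 3, which matches curl complaining that -o has no parameter in the unquoted case.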

Now read the manpage

This post is just intended as an introduction to using curl to download files with sequences in their filenames. It's all covered in the manual page, along with a lot of other useful things you can do with curl, so I suggest reading it.

To read the manpage, open up a command prompt / terminal on Mac or *nix and run:

man curl

Assuming it's installed of course ;)


