Downloading Google Analytics data attachments with PHP IMAP

Posted March 15th, 2009 in PHP

This post is part of a series about having Google Analytics data sent by email and then downloading and parsing the data with PHP. With the automatic sending of data by email already set up, this post pulls together the work of the previous posts and downloads the attachment by looping through the email messages in the inbox and looking for the specific message. The next three posts in the series look at how to extract the data from a CSV, TSV or XML formatted attachment.

The report this example comes from in Google Analytics is the full top content page, as shown in the following screenshot example:

[Screenshot: example Google Analytics report]

When setting up the email I added "Top Content" to the subject line to distinguish this report from others, so the subject of an incoming email looks like "Analytics www.electrictoolbox.com 20090213-20090315 (Top Content)". The date range will vary from day to day.

First the code, and then a brief explanation of what it's doing:

// $server, $login and $password are assumed to be set elsewhere
$connection = imap_open($server, $login, $password);
$result = imap_search($connection, 'FROM "example@gmail.com" SUBJECT "Analytics www.electrictoolbox.com "');

// imap_search returns false when no messages match, so only loop over an array
if(is_array($result)) {
    foreach($result as $message_number) {
        $headers = imap_headerinfo($connection, $message_number);
        // the attachment filename is the lower-cased subject with underscores for spaces
        $filename_to_match = strtolower(str_replace(' ', '_', $headers->subject)) . '.csv';
        // capture the domain, start date, end date and custom subject
        if(preg_match('/Analytics (.*) (\d{8})\-(\d{8}) \((.*)\)/i', $headers->subject, $matches)) {
            $attachments = extract_attachments($connection, $message_number);
            for($i = 0; $i < count($attachments); $i++) {
                if(strtolower($attachments[$i]['filename']) == $filename_to_match) {
                    process_attachment($attachments[$i]['attachment'], $matches[1], $matches[2], $matches[3], $matches[4]);
                }
            }
        }
    }
}

imap_close($connection);

function process_attachment($attachment, $domain, $date_from, $date_to, $subject) {

    /* code to process the data here */
   
}

Now a brief explanation of what each bit of code does.

The first part, shown below, uses the imap_search function to search for emails from 'example@gmail.com' whose subject contains 'Analytics www.electrictoolbox.com '.

$result = imap_search($connection, 
  'FROM "example@gmail.com" SUBJECT "Analytics www.electrictoolbox.com "');

In your code you need to change the from email address to whatever Google sends it as; it will be your own email address used to log into Analytics. The domain in the subject would need to be the domain that you are receiving data from.
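
If the from address and domain change between sites, the search string can be built from variables rather than hard-coded. The following is only a sketch: build_search_criteria is a hypothetical helper name of my own, not part of the IMAP extension, and stripping double quotes is a simple precaution I've assumed is enough for these two values.

```php
<?php
// Hypothetical helper (not part of the IMAP extension): builds the
// criteria string that is passed to imap_search.
function build_search_criteria(string $from, string $domain): string
{
    // Remove double quotes so a value cannot break out of the quoted criteria
    $from = str_replace('"', '', $from);
    $domain = str_replace('"', '', $domain);

    // The trailing space after the domain mirrors the original search string
    return sprintf('FROM "%s" SUBJECT "Analytics %s "', $from, $domain);
}

$criteria = build_search_criteria('example@gmail.com', 'www.electrictoolbox.com');
echo $criteria, "\n";
// prints: FROM "example@gmail.com" SUBJECT "Analytics www.electrictoolbox.com "
```

The resulting string would then be passed as the second parameter to imap_search in place of the literal string.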

I covered the imap_search function in an earlier post titled "Looping through messages to find a specific subject". Read that for more details.

The code then loops through the result set and gets the headers. The next line of code looks like this:

$filename_to_match = 
  strtolower(str_replace(' ', '_', $headers->subject)) . '.csv';

To work out which attachment to extract data from, we need to compare filenames. The filename of the attachment is the same as the subject but with underscores instead of spaces. It's also wise to make the comparison in lower case to avoid any case sensitivity issues. Finally, the file format extension is appended to the end; in this case we're looking for a CSV file.
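
To make the transformation concrete, here is the same logic wrapped in a small function and run against the example subject from earlier in this post. The function name expected_attachment_filename and the $extension parameter are my own additions for illustration.

```php
<?php
// Hypothetical wrapper around the filename logic used in this post
function expected_attachment_filename(string $subject, string $extension = 'csv'): string
{
    // Replace spaces with underscores, lower-case the result, append the extension
    return strtolower(str_replace(' ', '_', $subject)) . '.' . $extension;
}

$subject = 'Analytics www.electrictoolbox.com 20090213-20090315 (Top Content)';
echo expected_attachment_filename($subject), "\n";
// prints: analytics_www.electrictoolbox.com_20090213-20090315_(top_content).csv
```

Passing 'tsv' or 'xml' as the second parameter would produce the filename to look for with the other formats.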

The next line uses a regular expression to extract the domain, date range and custom subject from the subject line. They are returned in the $matches array.

if(preg_match('/Analytics (.*) (\d{8})\-(\d{8}) \((.*)\)/i', $headers->subject, $matches)) {
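
To show what ends up in $matches, here is the same regular expression run against the example subject from earlier in this post. The conversion of the start date with DateTime is an extra step I've added for illustration and is not part of the code above.

```php
<?php
$subject = 'Analytics www.electrictoolbox.com 20090213-20090315 (Top Content)';
$pattern = '/Analytics (.*) (\d{8})\-(\d{8}) \((.*)\)/i';

if (preg_match($pattern, $subject, $matches)) {
    // $matches[0] holds the full match; the captured groups start at index 1
    list(, $domain, $date_from, $date_to, $report) = $matches;
    // $domain is 'www.electrictoolbox.com', $date_from is '20090213',
    // $date_to is '20090315' and $report is 'Top Content'

    // Optional extra step: turn the YYYYMMDD string into a DateTime object
    $from = DateTime::createFromFormat('!Ymd', $date_from);
    echo $domain, ' ', $from->format('Y-m-d'), ' ', $report, "\n";
    // prints: www.electrictoolbox.com 2009-02-13 Top Content
}
```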

If matches were found, the attachments are extracted from the email using the extract_attachments function. This was posted in a separate post earlier today so this one doesn't get too long. Refer to that post and the earlier one it was based on for more details about what the function does.

Next we loop through the attachments, and if one matches the filename we're looking for, the process_attachment function is called.

if(strtolower($attachments[$i]['filename']) == $filename_to_match)
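
The matching step can also be pulled out into its own function so it can be tried without a mailbox. This is a sketch only: find_matching_attachment is a hypothetical name, the sample array assumes that extract_attachments returns entries with 'filename' and 'attachment' keys as used in the code above, and the attachment contents are placeholders. The nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the first attachment whose lower-cased
// filename equals $filename_to_match, or null if there is none.
function find_matching_attachment(array $attachments, string $filename_to_match): ?array
{
    foreach ($attachments as $attachment) {
        if (isset($attachment['filename'])
            && strtolower($attachment['filename']) === $filename_to_match) {
            return $attachment;
        }
    }
    return null;
}

// Placeholder data shaped like the array used in the code above
$attachments = [
    ['filename' => 'logo.png', 'attachment' => 'placeholder image data'],
    ['filename' => 'Analytics_www.electrictoolbox.com_20090213-20090315_(Top_Content).csv',
     'attachment' => 'placeholder csv data'],
];

$match = find_matching_attachment(
    $attachments,
    'analytics_www.electrictoolbox.com_20090213-20090315_(top_content).csv'
);
echo $match === null ? 'no match' : 'matched ' . $match['filename'], "\n";
```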

You might also want to add some additional testing to ensure someone can't fake an email to you containing dummy data. It's probably never going to happen but you never know...
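
As one possible starting point for that additional testing, the values extracted from the subject could be sanity checked before the attachment is processed. This is only a sketch under my own assumptions: is_plausible_report and the $allowed_domains list are hypothetical, and because email headers can be forged, a check like this only catches malformed or unexpected data rather than a determined forger.

```php
<?php
// Hypothetical sanity check on the values extracted from the subject line
function is_plausible_report(string $domain, string $date_from, string $date_to, array $allowed_domains): bool
{
    // Only accept reports for domains we expect to receive
    if (!in_array(strtolower($domain), $allowed_domains, true)) {
        return false;
    }

    // Both dates must be real calendar dates in YYYYMMDD form
    foreach ([$date_from, $date_to] as $date) {
        if (!preg_match('/^(\d{4})(\d{2})(\d{2})$/', $date, $parts)) {
            return false;
        }
        if (!checkdate((int) $parts[2], (int) $parts[3], (int) $parts[1])) {
            return false;
        }
    }

    // YYYYMMDD strings sort chronologically, so a string comparison is enough
    return strcmp($date_from, $date_to) <= 0;
}

$allowed_domains = ['www.electrictoolbox.com'];
var_dump(is_plausible_report('www.electrictoolbox.com', '20090213', '20090315', $allowed_domains));
```

A check like this could sit just before the call to process_attachment, skipping the message if it returns false.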

The process_attachment function in the above code is just a placeholder. Tomorrow I'll look at how to extract CSV data from the attachment, which could then be loaded into a database.

Read about the other posts in this series here, including a complete list of the posts in the series.
