Posts from December 2009

Converting an RFC 3339 date to a Python timestamp
(plus an update to my Google Docs backup script)

I’ve been working on updating the script that backs up my Google Docs.

One of the biggest issues with the current version is that it’s dumb and always downloads every single file, whether that file has changed or not. The nightly download is getting slower and slower so I figured it was time to make the script a bit smarter.

It turns out that the feed containing the document list has an <updated> property which is, obviously, the last date/time the file was updated. Pretty handy, huh?

The problem is that the date/time stamp was in a format I’d never seen before:

2009-01-26T01:47:26.036Z

What?

After some digging, I found the Entry Update Date mentioned in the Protocol Reference which helpfully informed me that the Date is in RFC 3339 format.

Again, what?

A little more digging led me to the fact that RFC 3339 is the date format used in Atom feeds, which makes sense since that’s exactly what the Google Docs document list is. The format itself is pretty straightforward. The T separates the date and time portions, and the time zone comes at the end, either as a numeric offset (like -08:00) or as a Z, which is shorthand for a zero offset. In this case we have the Z, so we know we’re dealing with UTC (GMT).

My next problem was how best to turn this RFC 3339 date into a Python timestamp. My first instinct was to hack together a quick regex and be done with it but, in the end, I decided to check with StackOverflow to see if there was a better way to do it.

And I’m glad I checked because I was immediately pointed to PyFeed by Steven Hastings which, in addition to being a handy library for parsing Atom feeds, includes a set of functions for manipulating RFC 3339 dates. The one I needed was tf_from_timestamp(ts_string), which takes an RFC 3339 string and returns a Python timestamp.
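For comparison, here is a minimal sketch of the same conversion using only the standard library. It is not PyFeed's implementation. It assumes Python 3.7 or later, where strptime's %z directive accepts both a literal Z and numeric offsets such as +01:00. The function name rfc3339_to_timestamp is made up for this example.

```python
from datetime import datetime

def rfc3339_to_timestamp(ts_string):
    """Convert an RFC 3339 string to seconds since the Unix epoch (float).

    Hypothetical helper, not part of PyFeed.
    """
    # Fractional seconds are optional in RFC 3339, so pick a matching format.
    if "." in ts_string:
        fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    else:
        fmt = "%Y-%m-%dT%H:%M:%S%z"
    # %z (Python 3.7+) parses "Z" as UTC and "+01:00" style numeric offsets,
    # so the result is a timezone-aware datetime.
    parsed = datetime.strptime(ts_string, fmt)
    return parsed.timestamp()

ts = rfc3339_to_timestamp("2009-01-26T01:47:26.036Z")
print(int(ts))  # whole seconds since the epoch
```

Because the parsed datetime is timezone-aware, a string with an offset (say 02:47:26+01:00 on the same day) should map to the same instant as the Z form.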

The current version of Doxworker (thanks to Ben for the name, fixing the spreadsheet downloads, and folder support) can be downloaded here. You’ll need to modify doxworker.cfg and you may also need to modify main.py depending on the location of your doxworker.cfg file.
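To illustrate the "only download what changed" idea, here is a rough sketch in Python 3. It is not the actual Doxworker logic. It assumes the feed's <updated> value has already been converted to a Unix timestamp and that the local file's modification time is a fair stand-in for when it was last downloaded. The name needs_download is hypothetical.

```python
import os

def needs_download(remote_updated_ts, local_path):
    """Return True if the local copy is missing or older than the remote one.

    remote_updated_ts: the feed's <updated> value as seconds since the epoch.
    local_path: where the backup copy of the document lives (hypothetical).
    """
    if not os.path.exists(local_path):
        # Never downloaded before, so fetch it.
        return True
    # Compare the local modification time against the remote update time.
    return os.path.getmtime(local_path) < remote_updated_ts
```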

Flickr REST API Basics

It’s been a while since I’ve played with the Flickr API and I was pleasantly surprised by how much easier it’s gotten to use. I don’t believe the REST interface was available last time I tried (although it’s possible I was just dumb and missed it) so I ended up using somebody’s PHP XML-RPC class which was a bit clunky.

The REST API is dead-simple to use. The basic URL format is:


http://api.flickr.com/services/rest/?method=<method-name>&name=value...

While only some API calls require authentication, all require an API key.

Here’s what a call to flickr.photos.search method might look like:


http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=YOURKEY&tags=monkey

The above call returns the 100 most recently uploaded photos matching the tag monkey. The flickr.photos.search method has lots of other options that let you customize the search.
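As a quick illustration, a URL like the one above can be assembled with the standard library so the parameter values get escaped properly. This is a sketch in Python 3. The helper name build_flickr_url is made up, and YOURKEY is a placeholder, not a real key.

```python
from urllib.parse import urlencode

REST_ENDPOINT = "http://api.flickr.com/services/rest/"

def build_flickr_url(method, **params):
    """Build a REST URL for the given API method (hypothetical helper)."""
    # Put "method" first, then the caller's parameters; urlencode escapes
    # spaces, commas and other special characters in the values.
    query = urlencode({"method": method, **params})
    return REST_ENDPOINT + "?" + query

print(build_flickr_url("flickr.photos.search", api_key="YOURKEY", tags="monkey"))
```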

You get back a big chunk of XML (there’s a parameter to return JSON and several other formats instead) that looks like this:

<rsp stat="ok">
    <photos page="1" pages="3410" perpage="100" total="340901">
        <photo id="4195976808" owner="40592053@N02" secret="7605f0aa9f" server="2698" farm="3" title="373" ispublic="1" isfriend="0" isfamily="0"/>
        <photo id="4195224913" owner="32143071@N00" secret="199a3c18cb" server="2650" farm="3" title="Q &amp; Milk II" ispublic="1" isfriend="0" isfamily="0"/>
    </photos>
</rsp>
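Here is a small sketch (Python 3) of pulling the photo ids and titles out of a response shaped like the sample above. The sample is embedded as a string with the ampersand escaped as &amp; so that it is well-formed XML. The function name photo_summaries is made up for the example.

```python
from xml.dom import minidom

SAMPLE = """<rsp stat="ok">
    <photos page="1" pages="3410" perpage="100" total="340901">
        <photo id="4195976808" owner="40592053@N02" secret="7605f0aa9f" server="2698" farm="3" title="373" ispublic="1" isfriend="0" isfamily="0"/>
        <photo id="4195224913" owner="32143071@N00" secret="199a3c18cb" server="2650" farm="3" title="Q &amp; Milk II" ispublic="1" isfriend="0" isfamily="0"/>
    </photos>
</rsp>"""

def photo_summaries(xml_text):
    """Return a list of (id, title) tuples from a search response."""
    dom = minidom.parseString(xml_text)
    # The root <rsp> element carries the status of the call.
    if dom.documentElement.getAttribute("stat") != "ok":
        raise ValueError("Flickr reported a failed call")
    return [
        (photo.getAttribute("id"), photo.getAttribute("title"))
        for photo in dom.getElementsByTagName("photo")
    ]

for photo_id, title in photo_summaries(SAMPLE):
    print(photo_id, title)
```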

Once you have a photo id, you can call flickr.photos.getInfo which, among other things, lets you get the URL to the page for that photo. You can also get thumbnail URLs in different sizes with a call to flickr.photos.getSizes.

I decided to do my latest playing around in Python rather than PHP. Here’s a very basic function I threw together for making API calls:

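import urllib
import urllib2
from xml.dom import minidom

def flickrRequest(method, params):
    # urlencode escapes values (spaces, commas, ...) that plain concatenation would not
    args = urllib.urlencode(params)
    url = "http://api.flickr.com/services/rest/?method=" + method + "&" + args
    resp = urllib2.urlopen(url)
    # Drop any bytes that can't be decoded so minidom can always parse the response
    raw_xml = unicode(resp.read(), errors="ignore")
    return minidom.parseString(raw_xml.encode("utf-8"))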

It takes an API method name and a dictionary of parameters, then uses urllib2 to get the XML response. The XML response is parsed using minidom and the DOM object is returned.

Since this is just for fun, the code basically just glosses over the Unicode aspect of the response by ignoring any Unicode errors. This lets us parse the XML at the expense of possibly mangling some characters (generally only in the image title).
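For what it's worth, one way to avoid the mangling is to hand the raw bytes straight to minidom and let the parser deal with the encoding. The sketch below is Python 3 and uses a made-up response with a non-ASCII title in place of a real network call. It assumes the response is UTF-8 and says so in its XML declaration.

```python
from xml.dom import minidom

# Stand-in for the bytes returned by resp.read(); the title is invented.
raw_bytes = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<rsp stat="ok"><photo id="1" title="caf\u00e9 monkey"/></rsp>'
).encode("utf-8")

# parseString accepts bytes and decodes them according to the XML declaration,
# so there is no manual decode step where characters could get dropped.
dom = minidom.parseString(raw_bytes)
title = dom.getElementsByTagName("photo")[0].getAttribute("title")
print(title)
```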

This is how the above function would be used to do a simple search:

tags = ["monkey", "chimp"]

searchParams = {
    'api_key': FLICKR_API_KEY,
    'tags': ','.join(tags),
    'tag_mode': 'any',
    'content_type': 1,
    'page': 1,
    'sort': 'date-posted-desc',
}

dom = flickrRequest('flickr.photos.search', searchParams)

This searches for any pictures that are tagged monkey or chimp. This only returns the first page (the first 100 by default). You can get the next batch by incrementing the page parameter.
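The paging could be sketched roughly like this in Python 3. To keep the example runnable without an API key, the function takes a fetch_page callable (a hypothetical name) that returns the parsed DOM for a given page number. In real use it would wrap a call to flickr.photos.search with the page parameter set. The demo uses a fake fetcher with made-up data.

```python
from xml.dom import minidom

def iter_all_photos(fetch_page, max_pages=None):
    """Yield <photo> nodes from every page, starting at page 1."""
    page = 1
    while True:
        dom = fetch_page(page)
        photos_node = dom.getElementsByTagName("photos")[0]
        for photo in photos_node.getElementsByTagName("photo"):
            yield photo
        # The "pages" attribute on <photos> says how many pages exist in total.
        total_pages = int(photos_node.getAttribute("pages"))
        if page >= total_pages or (max_pages is not None and page >= max_pages):
            break
        page += 1

# Fake fetcher standing in for a real API call: three pages, one photo each.
def fake_fetch(page):
    xml = '<rsp stat="ok"><photos page="%d" pages="3"><photo id="%d"/></photos></rsp>'
    return minidom.parseString(xml % (page, page))

ids = [photo.getAttribute("id") for photo in iter_all_photos(fake_fetch)]
print(ids)
```

The max_pages argument is just a safety valve so a search with thousands of pages doesn't turn into thousands of requests.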

Now let’s say I want to call flickr.photos.getInfo for each image:

#This gives me all of the 'photo' nodes in the XML
photos = dom.getElementsByTagName("photo")

for photo in photos:
  dom = flickrRequest('flickr.photos.getInfo', {'api_key': FLICKR_API_KEY, 'photo_id': photo.attributes["id"].value})

  #The dom object now contains XML with info about the current photo
  #You can take a look at the format here:  http://www.flickr.com/services/api/flickr.photos.getInfo.html
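As a rough sketch of what to do with that response (Python 3), the function below looks for the photo page URL. It assumes the getInfo response contains a <url type="photopage"> element, as described in the API documentation linked above. The sample XML is a heavily trimmed, made-up stand-in for a real response, and photo_page_url is a hypothetical helper name.

```python
from xml.dom import minidom

# Trimmed, invented sample; a real response has many more elements.
SAMPLE_INFO = """<rsp stat="ok">
  <photo id="12345">
    <title>example photo</title>
    <urls>
      <url type="photopage">http://www.flickr.com/photos/example/12345/</url>
    </urls>
  </photo>
</rsp>"""

def photo_page_url(info_dom):
    """Return the photo page URL from a getInfo DOM, or None if absent."""
    for url in info_dom.getElementsByTagName("url"):
        # The URL itself is the text content of the <url> element.
        if url.getAttribute("type") == "photopage" and url.firstChild is not None:
            return url.firstChild.data.strip()
    return None

print(photo_page_url(minidom.parseString(SAMPLE_INFO)))
```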

And those are the basics of using the Flickr REST API. My ultimate goal with learning the API is to write some sort of script that will let me back up all of the meta-information (tags, sets, collections, descriptions) of my Flickr account.