AppEngine Task Queue - that’s just what we wanted / May 30th 2009

Well, today has been quite a day at Google I/O. First of all, the morning keynote about Google Wave was just mind-blowing, with so many possibilities for this new protocol. It has profound implications for how we communicate and co-create, and I’ll probably write something up after the Wave API session I’m now waiting for. One of the talks I was most looking forward to before the conference was the App Engine Offline Processing look-ahead talk. Last year’s talks by Brett Slatkin were really good, and yesterday’s Building Scalable, Complex Apps on App Engine was pretty inspirational, with some cool Datastore design patterns for pubsub and DAG queries (I’ll fill in the links for these two later when I’m near my notes from yesterday).

Today it was all about the Task Queue. I had a quick chat yesterday with a couple of the App Engine team, including Guido, and it was clear that the Task Queue would solve the use cases I’m interested in. I’ve been thinking a lot about creating active listening and indexing apps for social services.

The really nice thing about the Task Queue is that it’s built in a very interesting and web-standardsy sort of way. It uses the Web Hooks event-driven metaphor: new tasks are pulled off the head of the queue and are then POSTed to a web service that you set up to run the job. The handler for this web service can itself put things onto the task queue, so you can produce very complex and intricate workflows.
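To make the web hook idea concrete, here is a minimal sketch in plain Python. It is not the App Engine API: the in-memory `queue`, the `handlers` table, and the `/work/fetch` and `/work/index` URLs are all made up for illustration. It only shows the shape of the model, where a handler that runs one task can enqueue more.

```python
from collections import deque

# In-memory stand-in for the Task Queue (hypothetical, not the App Engine API).
queue = deque()      # pending tasks, oldest first
handlers = {}        # URL path -> handler function, standing in for web hooks
log = []             # records what the handlers did


def add(url, params):
    """Enqueue a task: the URL to POST to plus its parameters."""
    queue.append((url, params))


def fetch_feed(params):
    # First stage: pretend to fetch a feed, then fan out one task per item.
    for item in params["items"]:
        add("/work/index", {"item": item})


def index_item(params):
    # Second stage: pretend to index a single item.
    log.append("indexed " + params["item"])


handlers["/work/fetch"] = fetch_feed
handlers["/work/index"] = index_item


def run_all():
    """Pop tasks off the head of the queue and 'POST' each to its handler."""
    while queue:
        url, params = queue.popleft()
        handlers[url](params)


add("/work/fetch", {"items": ["a", "b", "c"]})
run_all()
print(log)  # ['indexed a', 'indexed b', 'indexed c']
```

One task goes in, three more come out of its handler, and the queue drains itself. In the real service the dispatch would be an HTTP POST to your app rather than a function call.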

As Brett pointed out in his talks, each handler/activity must be idempotent. It’s not guaranteed that a task will run only once; a repeat run is unlikely but not impossible. So semaphore flags that record whether the work has already been done are pretty essential in any activity you run from the task queue. Putting things onto a queue sounds easy, and here’s a quick bit of code I jotted down about it.

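from google.appengine.api.labs import taskqueue  # labs path in the early SDKs, later google.appengine.api
taskqueue.add(url='/work/mail', params=some_dict)  # task is POSTed to /work/mail with these params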
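On the idempotency point, this is roughly what I mean by a semaphore flag. It’s a sketch in plain Python, with a dict standing in for the datastore and a list standing in for the mail service; `done_flags`, `outbox`, `send_mail_task` and the `message_id` parameter are all names I’ve invented for the example.

```python
# Hypothetical stand-ins: a dict plays the datastore, a list plays the outbox.
done_flags = {}   # task key -> True once the work has completed
outbox = []       # mail "sent" so far


def send_mail_task(params):
    """Handler that is safe to run more than once for the same task."""
    key = "mail:" + params["message_id"]
    if done_flags.get(key):
        return "skipped"          # a duplicate delivery: nothing left to do
    outbox.append(params["to"])   # the actual side effect
    done_flags[key] = True        # set the flag only after the work succeeds
    return "sent"


task = {"message_id": "42", "to": "someone@example.com"}
first = send_mail_task(task)
second = send_mail_task(task)     # simulate the queue running the task again
print(first, second, len(outbox))  # sent skipped 1
```

In a real app the check and the flag write would probably need to happen inside a datastore transaction, otherwise two copies of the task running at the same moment could both get past the check.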

You can run multiple task queues and you can configure how they’re throttled with a queue.yaml file like this.

queue.yaml

queue:
- name: mail_queue
  rate: 2000/d

- name: speedy_queue
  rate: 5/s
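To get a feel for what those rates mean, here’s a back-of-the-envelope calculation in Python. It assumes the rate is a count followed by a slash and a unit (s, m, h or d) and that tasks are spread evenly, which the real scheduler may well not do; `seconds_between_tasks` is just a helper I made up.

```python
# Seconds per unit; "s", "m", "h", "d" are the assumed unit suffixes.
SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def seconds_between_tasks(rate):
    """Turn a rate such as '5/s' into the average gap between tasks."""
    count, unit = rate.split("/")
    return SECONDS[unit] / float(count)


print(seconds_between_tasks("2000/d"))  # 43.2
print(seconds_between_tasks("5/s"))     # 0.2
```

So mail_queue would average one task roughly every 43 seconds, while speedy_queue would run one every fifth of a second.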

It’s going to make a whole load of new applications possible. Cron jobs allow you to build fairly rudimentary polling devices, but if you have lots of activities to do within these scheduled jobs, or you want to give your web app a modern asynchronous architecture with things like write-behind and parallelised data writes, this will really help. Adding jobs to the queue is almost instantaneous, and in the demos many of the queued jobs had already executed in the time between the request and the response.
