2011

About twelve months ago, I wrote an incomplete blog post about my goals for 2011. This isn’t it, but digging that up inspired some points in this quick year-end recap.

Notable Work

Spent the first four or so months of the year working on Census data: launched the Spokesman-Review Census Center right after the first data dumps came around. I got invited by the awesome NICAR folks to help hack on census.ire.org. (Took the original NationBrowse down in mid-November since the Spokesman project functionally supersedes it.)

Coming off of that Census high, we started putting together more “data browse” tools at the Spokesman-Review.

Notable Personal Developments

  • An illness in the family at the turn of the year. Hustled to visit for a week in January. (Everything’s fine now and I’m thankful for that.)
  • Came close to leaving Spokane for good, for the shine and glamor of Silicon Valley. Couldn’t bring myself to leave loose ends and burn bridges that way.
  • Ran the 12k Lilac Bloomsday Run with nearly no training, more or less out of shape. (And fared better than I thought I would.)
  • Started going to the gym a couple times a week.
  • Took a day trip to a missile silo.
  • Took a weekend trip to northeast Washington and Nelson, British Columbia.
  • Witnessed the final Space Shuttle launch in person.
  • Some absurdly awesome baseball happened this year, too. (Outside of the Cardinals: who could forget the last day of the season? [1] [2] [3])
  • Oh, and I moved in with a girl. That’s something that people do, right?

No projections for 2012. This year was ridiculous and I couldn’t have predicted it, so I don’t want to bother.

No concrete goals or resolutions, either. (The calendar is such an arbitrary measuring device and who keeps resolutions anyway?) Just this: do good, and do more awesome than last year.*

*Subject to change.

Django 1.4 first thoughts: CachedStaticFilesStorage

The first alpha release of Django 1.4 is out. I've been toying with trunk on and off for a while now, but hadn’t taken more than a passing glance at all the new features. At first glance: I’m thrilled.

I plan on writing a few posts about some new bits that stick out to me. Here’s the first.

CachedStaticFilesStorage and {% static %}

The new CachedStaticFilesStorage stores your files but also creates a copy where the MD5 hash of the file's contents is added to the file's name. (e.g.: something like foo/style.css would be saved as foo/style.55e7cbb9ba48.css)
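To make the naming scheme concrete, here's a rough sketch of how a hashed filename like the one above could be built. This is not Django's actual code: the hashed_name function is a name I made up for illustration, and the 12-character truncation is an assumption based on the example filename.

```python
import hashlib
import os


def hashed_name(name, content):
    """Return `name` with a short MD5 digest of `content` before the extension.

    Illustrative only: the 12-character digest length is an assumption
    taken from the example filename, not a documented guarantee.
    """
    digest = hashlib.md5(content).hexdigest()[:12]
    root, ext = os.path.splitext(name)
    return "%s.%s%s" % (root, digest, ext)


# The digest depends only on the bytes of the file, so an unchanged file
# keeps its name and an edited file gets a new one.
print(hashed_name("foo/style.css", b"body { margin: 0; }"))
```

Since the name changes whenever the contents do, a browser or CDN can hold on to the old copy indefinitely without ever serving stale CSS.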

Taking a look at the implementation, it looks like a mixin (CachedFilesMixin) provides the functionality — and appears to be compatible as a plugin to any existing storage backend. The implications of this are pretty useful.

Say, for example, you’re currently using S3/CloudFront to serve your static files (by using the S3BotoStorage in django-storages). You’d simply do the following…

from storages.backends.s3boto import S3BotoStorage
from django.contrib.staticfiles.storage import CachedFilesMixin

class S3HashedFilesStorage(CachedFilesMixin, S3BotoStorage):
    """
    Extends S3BotoStorage to also save hashed copies (i.e.
    with filenames containing the file's MD5 hash) of the
    files it saves.
    """
    pass

… and then set your STATICFILES_STORAGE to reference the above class. (Caveat: I haven’t tested the above yet.)
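For reference, the settings change would look something like the line below. The module path myproject.storage is hypothetical; use whatever module you put the class in. (Same caveat as above: untested.)

```python
# settings.py
# "myproject.storage" is a hypothetical module path, assumed here to be
# where the S3HashedFilesStorage class from above lives.
STATICFILES_STORAGE = "myproject.storage.S3HashedFilesStorage"
```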

To get your templates to pick up the hashed versions of filenames, you'd use the new {% static %} template tag: just convert this…

<link href="{{ STATIC_URL }}style/base.css" rel="stylesheet"/>

…to this…

{% load static from staticfiles %}
<link href="{% static "style/base.css" %}" rel="stylesheet"/>

…and you’re now ready to rock with static files that you can cache forever without worry of them going stale.

Disclosure: I’m both excited and pissed because I actually spent a bit of time on a staticfiles hashing utility that wasn’t nearly as elegant as this mixin. Kudos to anyone who worked on these staticfiles enhancements.


Update (12:15PM PST): It isn’t mentioned in the docs for CachedStaticFilesStorage, but the filenames are cached in Django’s cache system (so the static files themselves do not need to be read constantly to generate the MD5 hashes). If you’re using memcached as your cache backend, this means that for deep directory structures and long filenames, you may hit the 250 character limit imposed on memcached keys. (Note that encountering this is rare outside of collecting static assets since models.FileField defaults to max_length=100 anyway.)

Solution: set up a function to shorten/hash your cache keys and point the KEY_FUNCTION option of the relevant cache in settings.CACHES to that function (as a dotted path).
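Here's a minimal sketch of such a key function. Django calls key functions with (key, key_prefix, version). The name make_short_key and the choice to hash only over-long keys are my own assumptions, so treat this as a starting point rather than a drop-in fix.

```python
import hashlib

# memcached rejects keys longer than 250 characters.
MAX_KEY_LENGTH = 250


def make_short_key(key, key_prefix, version):
    """Build a cache key, falling back to an MD5 digest when it is too long.

    Short keys keep the usual "prefix:version:key" form so they stay
    readable; only over-long keys are replaced by a 32-character digest.
    """
    full_key = "%s:%s:%s" % (key_prefix, version, key)
    if len(full_key) <= MAX_KEY_LENGTH:
        return full_key
    digest = hashlib.md5(full_key.encode("utf-8")).hexdigest()
    return "%s:%s:%s" % (key_prefix, version, digest)
```

You'd then add something like 'KEY_FUNCTION': 'myproject.cache_utils.make_short_key' to the cache's entry in CACHES (that module path is hypothetical). Note that this sketch doesn't guard against an extremely long key prefix, and it doesn't strip characters memcached disallows, such as spaces.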


Another addendum: If you’re using multiple memcached servers and don’t want to incur network overhead just for those staticfile MD5 hashes, you can set a custom cache backend config — CachedStaticFilesStorage looks for a 'staticfiles' cache before falling back to the default, so you can force CachedStaticFilesStorage to cache these values in local memory instead of memcached (hat tip to Ted):

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': '127.0.0.1:11211',
    },
    'staticfiles': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'staticfiles-filehashes'
    }
}

Oh look, a link relevant to my previous post.

Seth Godin, on the trap of social media noise:

If we put a number on it, people will try to make the number go up.

[…] In Corey's words, the conventional, broken wisdom is:

  • Follow a ton of people to get people to follow back
  • Focus on the # of followers, not the interests of followers or your relationship with them.
  • Pump links through the social platform (take your pick, or do them all!)
  • Offer nothing of value, and no context. This is a megaphone, not a telephone.
  • Think you're winning, because you're playing video games (highest follower count wins!)

This looks like winning (the numbers are going up!), but it's actually a double-edged form of losing. First, you're polluting a powerful space, turning signals into noise and bringing down the level of discourse for everyone.

The Information Diet

When you’re young, you look at television and think, There’s a conspiracy. The networks have conspired to dumb us down. But when you get a little older, you realize that’s not true. The networks are in business to give people exactly what they want. That’s a far more depressing thought. Conspiracy is optimistic! You can shoot the bastards! We can have a revolution! But the networks are really in business to give people what they want. It’s the truth.

-Steve Jobs


Clay Johnson, founder of Blue State Digital (which ran online strategy for Obama’s 2008 campaign) and former director of Sunlight Labs, is writing a book called The Information Diet (website).

A preview of the first chapter was posted today. (The introduction opens with the Jobs quote above.) Here’s a snip regarding the thoughts that led him to leave Sunlight Labs:

Transparency wasn’t the universal answer I was looking for. You cannot simply flood the market with broccoli and hope that people stop eating french fries. If large numbers of people only seek out information that confirms their beliefs, then flooding the market with data from and about the government will really not work as well as the theorists predict; the data ends up being twisted by the left- and right-wing noise machines, and turned into more fodder to keep America spinning.


Some people may have heard me rant about the self-validating echo chamber (also: filter bubble) effect we have going on in our society. (Sorry, sweetie.) And hey, I’m guilty of it as much as anyone. The topic ties in neatly with the philosophical struggle I have with the more timesucking aspects of the internet versus my career as a web developer. (My love/hate relationship with social networks being one facet of that.)

In any case: I’m really looking forward to reading this book and seeing where Johnson goes with this idea of a diet from junk information.