I’ve been a terrible blogger lately, but I’m going to be catching up on writing very soon. (Yes, I always say that.) I’ve got about as many “blog posts yet to write” as I have projects — and I’ve got an interesting plethora of those as of late.

…Actually, let’s talk about some of those for a minute. I plan on discussing a few of these at length over the next couple weeks (as big news is on the horizon for some), but here’s a collection of some things I’ve been working on since I last wrote:

I’ve probably forgotten something in there… But you’ll hear back from me soon enough.

(Are you a developer-ish person? Do these projects sound interesting to you? You should check out the Knight-Mozilla OpenNews Project, get involved with the gang, and apply to be a 2014 fellow. Seriously, check it out.)

I started blogging here about ten years ago, when I was still a kid, and when hand-editing a handful of HTML pages was an acceptable way to run a blog. (And no "permalinks"/"detail pages"!) LiveJournal was still invite-only. The first public release of WordPress wouldn’t happen for another six months. (And you’d have to host it yourself: the WordPress.com service wouldn’t arrive for another three years.)

By “blogging here,” I mean here on this Ship of Theseus, which was once straight hand-crafted HTML and then an AvidGamers CMS hack and then plain HTML again and then a pMachine CMS blog and then WordPress and then a hand-crafted Django app. It has gone through plenty of thematic phases and has outlasted most Xanga blogs and MySpace profiles. But I still consider it one and the same, as it’s only ever existed in one place at any given time (though the place has varied).

I’m sometimes called out for being too nostalgic, a trait which probably explains how intact the majority of the archive remains. There are some non-public posts in the mix, but I have generally dragged the bulk of it along publicly — including even the awkward, rambling, philosophizing teenager.

Now, a lot of that content is embarrassing (and I personally haven’t looked through it and re-vetted it in years), but I think I’ve stubbornly dragged it along as some sort of awkward personal benchmark: some form of this has existed for nearly half of the entire lifetime of the World Wide Web (circa 1991-92).

(To be fair to that count, this website is not a direct continuation of my first attempt, created in 1998. That <blink>-ing, <marquee>-scrolling, MIDI-playing, child-built website was a different beast altogether.)

Now, I’d say “I’ll get back into the habit of brain dumping here,” but I’ve never habitually done that (save for a few months maybe eight or nine years ago). But there’s big news to talk about, and hopefully I’ll get to talking about that — and other fun things — soon.

Here’s to another ten years and outlasting a few more hosting providers/blog sites/social networks.

django-medusa: Rendering Django sites as static HTML

If you’ve ever poked around on this blog, you may have noticed the colophon which mentions very briefly:

Unlike most Django sites, this is compiled into static HTML pages by django-medusa, a Django app I'm currently building.

This little tool has been open-source since I deployed the new version of this website, maybe nine months ago, but I hadn’t really done anything with it or mentioned it much anywhere. It powers the several hundred pages on this site and turns them into static HTML, which is then hosted in S3. (Details below.)

(The only other time I’ve mentioned this project publicly was in response to django-bakery, a tool that the L.A. Times Data Desk uses to process some data projects into static pages. Clearly, this is an interesting idea to some people.)

tl;dr for Django pros: Test out the tutorial “hello world” and see the README. Come back if you want the more detailed narrative breakdown of the app (and how the app powers this blog).


The app basically auto-finds “renderer” definitions for your apps and then provides a Django management command that builds the static rendition of the website (with the output directed based on some settings).

Renderers live in renderer.py files that, like models.py and tests.py, are auto-discovered as long as the app is listed in INSTALLED_APPS. Each file defines a renderer class with a get_paths instance method that returns a list/tuple of absolute URLs: all the URLs that should be converted to static files. Renderers are set up like this so that, on an app-by-app basis (or even varying within an app), you can dynamically generate all the possible URLs that exist in your site.
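To make the discovery idea concrete, here is a minimal, Django-free sketch. It is not django-medusa's actual code: the StaticSiteRenderer base class below is a stand-in, the discover_renderers helper is hypothetical, and the convention that each renderer module exposes a `renderers` list is an assumption made for illustration. The "siteapp" package is faked in memory so the example runs on its own.

```python
import importlib
import sys
import types


class StaticSiteRenderer:
    """Stand-in base class for this sketch, not django-medusa's real one."""

    def get_paths(self):
        raise NotImplementedError


def discover_renderers(installed_apps, module_name="renderer"):
    """Collect renderer classes from each listed app's renderer module."""
    found = []
    for app in installed_apps:
        try:
            module = importlib.import_module(f"{app}.{module_name}")
        except ImportError:
            continue  # No importable renderer module in this app: skip it.
        # Assumed convention: the module lists its classes in `renderers`.
        found.extend(getattr(module, "renderers", []))
    return found


class ColophonRenderer(StaticSiteRenderer):
    def get_paths(self):
        # Absolute URL paths that should become static files.
        return ["/", "/colophon/"]


# Register a fake "siteapp" package in memory so the sketch runs without Django.
fake_app = types.ModuleType("siteapp")
fake_app.__path__ = []  # An empty __path__ marks the module as a package.
fake_module = types.ModuleType("siteapp.renderer")
fake_module.renderers = [ColophonRenderer]
sys.modules["siteapp"] = fake_app
sys.modules["siteapp.renderer"] = fake_module

# Ask every discovered renderer for its paths; unknown apps are ignored.
all_paths = []
for renderer_class in discover_renderers(["siteapp", "app_without_renderers"]):
    all_paths.extend(renderer_class().get_paths())
```

Catching ImportError this broadly also hides real import errors inside a renderer module, so a production version would probably want to distinguish "module missing" from "module broken".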

Here’s a couple renderer definitions that actually power part of this site.

The specific URL names and model bits aren’t important: basically, you’ll notice that the BlogHomeRenderer in my example generates the entire URL structure for /blog/* by querying for all live blog posts and then using Django’s URL reversing methods to figure out all the paths that could possibly be built. (That file in particular uses sets instead of lists/tuples, so that it can just blindly generate all the URLs and have duplicates ignored. It casts the set to a list upon returning.)
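The set-based approach can be sketched roughly as follows. This is an illustration, not the site's real renderer: the Post tuple, the blog_paths function, and the /blog/YYYY/MM/slug/ URL scheme are all hypothetical, and plain string formatting stands in for Django's URL reversing so that the snippet runs without a project.

```python
from datetime import date
from typing import NamedTuple


class Post(NamedTuple):
    """Hypothetical stand-in for a blog post model instance."""
    slug: str
    published: date
    is_live: bool


def blog_paths(posts):
    """Build every /blog/* path implied by the live posts."""
    paths = {"/blog/"}
    for post in posts:
        if not post.is_live:
            continue  # Drafts never get a static page.
        day = post.published
        # Year and month archives repeat across posts; the set drops the
        # duplicates, so there is no need to check before adding.
        paths.add(f"/blog/{day:%Y}/")
        paths.add(f"/blog/{day:%Y}/{day:%m}/")
        paths.add(f"/blog/{day:%Y}/{day:%m}/{post.slug}/")
    # Return a list, as the renderer contract expects; sorting just keeps
    # the output stable between runs.
    return sorted(paths)


posts = [
    Post("first", date(2012, 5, 1), True),
    Post("second", date(2012, 5, 20), True),
    Post("draft", date(2012, 6, 2), False),
]
paths = blog_paths(posts)
```

Two live posts in the same month yield five paths rather than seven, because the shared year and month archive URLs are only stored once.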

The process that actually generates the output simply uses (or abuses) Django’s internal test client to request each URL and store the resulting data (and mimetype/other metadata, if using the right backend; I’ll touch on this more below). I believe that this paradigm gives each app the most flexibility to define its own outputs, and it keeps app-and-view-building as Django-like as possible (i.e. you are still building within the urlconf and view system). It seems hacky at first to rely on those internal HTTP test client mechanisms, but I haven’t yet encountered any issues, and the rendering command can even (optionally) parallelize the test client crawl to achieve faster rendering.
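A rough sketch of that crawl, under stated assumptions: FakeClient and FakeResponse are stand-ins for Django's test client and its responses (only a get(path) method and three attributes are modeled), render_path and render_all are hypothetical names, and the thread pool is one possible way to parallelize, not necessarily how the app does it.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeResponse:
    """Stand-in for the response object a test client would return."""

    def __init__(self, status_code, content, content_type):
        self.status_code = status_code
        self.content = content
        self.content_type = content_type


class FakeClient:
    """Stand-in for Django's test client: get(path) returns a response."""

    def __init__(self, pages):
        self.pages = pages

    def get(self, path):
        if path in self.pages:
            return FakeResponse(200, self.pages[path], "text/html")
        return FakeResponse(404, b"", "text/html")


def render_path(client, path):
    """Request one path and return what a backend would need to store."""
    response = client.get(path)
    if response.status_code != 200:
        raise ValueError(f"{path} returned {response.status_code}")
    return path, response.content_type, response.content


def render_all(client, paths, workers=4):
    """Crawl all paths in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: render_path(client, p), paths))


client = FakeClient({"/": b"<h1>home</h1>", "/colophon/": b"<h1>colophon</h1>"})
results = render_all(client, ["/", "/colophon/"])
```

Failing loudly on a non-200 response is a design choice for this sketch: a path that a renderer advertises but the site cannot serve is probably a bug worth stopping for.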


The staticsitegen management command then renders the URL structure you’ve defined into static files. There are currently three rendering classes:

  • a disk-based renderer which outputs the directory tree in HTML files, turning bare URLs into directories with an index.html (so /colophon/ would result in output_dir/colophon/index.html being generated)
  • an Amazon S3 renderer which uploads the files directly to an S3 bucket (overwriting duplicates)
  • a Google App Engine renderer which uploads the files to a static GAE instance, similar to the S3 renderer’s behavior
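The URL-to-file mapping described for the disk-based renderer can be sketched as a small pure function. The name output_path is hypothetical, and the handling of paths without a trailing slash (written as plain files) is an assumption, since only the /colophon/ case is described above.

```python
import posixpath


def output_path(output_dir, url_path):
    """Map an absolute URL path to a file path under output_dir."""
    relative = url_path.lstrip("/")
    # Bare "directory" URLs (and the site root) get an index.html inside.
    if relative == "" or url_path.endswith("/"):
        relative = posixpath.join(relative, "index.html")
    return posixpath.join(output_dir, relative)


colophon_file = output_path("output_dir", "/colophon/")
root_file = output_path("output_dir", "/")
feed_file = output_path("output_dir", "/feed.xml")
```

Here /colophon/ maps to output_dir/colophon/index.html, the site root maps to output_dir/index.html, and a URL that already names a file, such as /feed.xml, is written as-is.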

The advantages of the latter two primarily deal with situations where non-HTML content is generated: if any of your views returns JSON, XML, or some other data format, then the S3/GAE renderers will attempt to store the generated files with that mimetype.
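As an illustration of why that matters, here is a minimal sketch of choosing the Content-Type to store with an uploaded file. The function name upload_content_type is hypothetical, and the fallback of guessing from the file extension is an assumption, not documented behavior of the S3 or GAE renderers.

```python
import mimetypes


def upload_content_type(path, response_headers):
    """Pick the Content-Type to store alongside an uploaded file."""
    header = response_headers.get("Content-Type")
    if header:
        return header  # Trust what the view declared (JSON, XML, ...).
    # Assumed fallback: guess from the file extension, else a generic type.
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"


json_type = upload_content_type(
    "/api/posts.json", {"Content-Type": "application/json; charset=utf-8"}
)
html_type = upload_content_type("/colophon/index.html", {})
unknown_type = upload_content_type("/data/blob", {})
```

A disk-based renderer has nowhere to record this metadata, which is why a JSON view written to disk depends on the web server guessing the type from the file name later.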


This blog basically runs on a local dev server that uses an SQLite database as storage. I use the S3 renderer for this, to cut out the filesystem middle-man. For static files, I use the built-in staticfiles app along with django-storages; the collectstatic command automatically uploads static resources to S3, the same way the medusa S3 renderer does.

I write everything on that local server, then use my staticsitegen command to upload the whole URL tree. (In the event I updated any template bits, every URL is re-generated and overwritten.) I then use collectstatic to sync my static (CSS/JS/etc.) files. (For CloudFront or EdgeCast CDNs, right around here is where I’d run a script to immediately invalidate some of the more recent URL roots so that blog index pages get refreshed faster.)
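The publish routine above could be wrapped in a small script along these lines. This is a sketch under assumptions: it presumes a manage.py in the current directory, the helper names are hypothetical, and the CDN invalidation step is left out because it depends on the provider. The --noinput flag is Django's standard way to skip collectstatic's confirmation prompt.

```python
import subprocess
import sys


def publish_commands(manage_py="manage.py"):
    """Return the management commands to run, in order."""
    return [
        # 1. Render every URL and upload the resulting pages.
        [sys.executable, manage_py, "staticsitegen"],
        # 2. Sync CSS/JS/etc. without the interactive confirmation.
        [sys.executable, manage_py, "collectstatic", "--noinput"],
    ]


def publish(run=subprocess.check_call):
    """Run each step, stopping at the first command that fails."""
    for command in publish_commands():
        run(command)


# Dry run: collect the commands instead of executing them.
planned = []
publish(run=planned.append)
```

Calling publish() with no arguments runs the two commands for real; passing a different `run` callable, as in the dry run, makes the sequence easy to inspect or test.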

That’s basically it. This blog and the Onion Browser site (which is simply implemented as direct_to_template views) have been running via this system for about nine months and it’s been solid. Not even the Onion Browser release rush (being featured on Hacker News, Reddit, Gizmodo, Lifehacker, etc. etc. etc.) affected the site in the slightest.

Despite the static nature of the underlying pages, I don’t lose the ability to have comments, stats, and other features: I had been running Disqus for comments for quite some time, and I still use Google Analytics for analytics. (I recently disabled comments on this site, more out of apathy and lack of use than any philosophical stance.)


The README has a pretty good technical overview that goes into more code detail than the above paragraphs. You might want to start with this basic “hello world” tutorial first, though: it’ll get your feet wet and demonstrate the ease with which you can convert a (simple) Django project into a statically-generated Django project, simply by adding a renderer definition and some settings.

I’m planning to clean the code a bit and tidy up (and add to) the documentation in the next couple weeks, but I figured I’d been sitting on this long enough.

You can discuss this on Hacker News. Feel free to bug me via e-mail or follow me on Twitter, too.

FBI FOI/PA Letter

Just a “we have received your request, we are searching our records, here is your case number, here is how to check your status” notification.


FOIA / Privacy Act and Acxiom Requests

Inspired by Andy Boyle’s FBI FOI/PA request on himself, and partially driven by my own morbid curiosity…

And inspired by a recent New York Times article regarding Acxiom, one of the largest consumer-targeting database marketing operations…

I’m sending Freedom of Information Act / Privacy Act requests to the FBI, CIA, and NSA for records regarding myself and Onion Browser. Additionally, I’ve requested my U.S. Reference Information Report from Acxiom.

I’m mainly interested in:

  1. What exactly is contained in Acxiom’s commercialized “consumer targeting” databases?
  2. How does that compare to anything that may come up in a FOI/PA request?
  3. All jokes aside, did any attention actually come to Onion Browser during the post-release buzz? (At the height of it I’d noticed some .onion http_referer traffic on sites I’d rather not talk about, so there’s always a possibility of a mention in a page that was scraped or collected.)

FOI/PA letterhead to FBI

FOI/PA letterhead to CIA

FOI/PA letterhead to NSA

  1. Any records on, about, mentioning, or concerning myself. […]
  2. Any records (not included in 1) on, about, mentioning, or concerning the TOR (The Onion Router) anonymizing network (“TOR network”) which also reference myself. […]
  3. Any records on, about, mentioning, or concerning the software “Onion Browser”: an anonymizing web browser for the iOS platform (iPhone, iPad, etc.) which utilizes the TOR network. Variations of name include “iOS-OnionBrowser” and “OnionBrowser”. […]

The full content of these letters (with my personal information redacted) is available here.


The Acxiom request was filed online as per Acxiom’s instructions, and I printed a screenshot of my filled-out form along with a letter to go with my $5 processing fee.

Request to Acxiom


A response (though not necessarily the requested information) is required within 20 working days of receipt of a Freedom of Information Act request (5 U.S.C. § 552(a)(6)(A)).

Acxiom (as a private company) is under no obligation to respond in a timely fashion, and the New York Times article even mentions the reporter’s own difficulty with her request:

On May 25, the reporter submitted an online request to Acxiom for her file, along with a personal check, sent by Express Mail, for the $5 processing fee. Three weeks later, no response had arrived.

I’ll keep y’all updated.