Five Years

Or, “A Few Career-Related Thoughts Related to My Impending Graduation”

Are you where you thought you’d be at the beginning of this year? A year ago? Five years ago? Are you who you thought you’d be?


Five years ago, my life went kind of like this:

MIT rejection letter

I was an ambitious (and somewhat delusional) straight-A student. I didn’t get into the schools I wanted. My faith in “the system” was upended. (Which, in itself, is worthy of its own long discussion some other time.) Dreams were broken. Plans were changed and tossed out.

Instead of following through with any of my original backup schools, I applied to the University of Missouri, to follow my friends. As a kid I’d lived in Columbia (before moving to St. Louis) and I guess in another little delusion of mine, I thought I’d come back to bring things full circle, so to speak. I’d spent years of my childhood wanting it.

(Funny side story about that: During my short photojournalism stint, I profiled and interviewed an award-winning professor who turned out to be the father of a childhood friend of mine, from the days I still lived in Columbia. Full circle, indeed.)

At the time, I was accused of merely settling for the “easy out.” But I’d be lying if I said it wasn’t the right choice — it didn’t make a lot of sense back then, but in hindsight the decision has panned out way better than I could have expected.


Around the end of my freshman year, sick of the computer science curriculum, I toyed with the idea of getting a journalism degree instead. On a whim, I went over to the student newspaper, the Maneater, and started taking photo assignments.

And while that didn’t quite pan out — I’ve since come back to tech and decided on a degree in Information Technology — the gut decision to get involved in the media industry has completely driven my career since then. (Oh, but my reporting/photographing days sure were something.)

Due to a couple folks working the ’Eater site my freshman year, I got acquainted with Python & Django early — an early draft of the new themaneater.com was written in Django 0.91, or pre-“magic removal”. (Not to mention the “Maneater lineage” everyone adored — co-creator Adrian Holovaty had been online editor of the paper, years ago.) I’d like to say I got into it, but I had trouble understanding any of it until a couple years later, when we buckled down and finally took the time to get a new site out the door. (Having started over twice, from scratch.)

But we finally got it done in Django, just as it was starting to become the “hip” thing. And after that launch, one thing led to another, and well… What started as a random major change (to journalism) turned back into a programming gig; what started as tinkering with Django over the course of a few months turned into a nice little career niche.

I’m lucky as hell for having been in the right place at the right time.


Sophomore year, I found my way out of my awkward, shy shell and worked as a barback for a couple semesters. Picked up another part-time job as a programmer at a great local startup. I was working hard and it didn’t feel like work — I honestly loved every minute of it.

I failed out of school on account of really terrible grades. By that, I mean impossibly terrible, are you actually trying to fail grades. After the initial shock, I took it in stride. I appealed and got back in immediately, without having to take the requisite one semester off. I took responsibility for the mistake and learned to juggle a bit better. But also: I learned what it felt like to really pour myself into something I loved doing — the difference between a job and an awesome job.

It may have set me back well over a year, but in hindsight, that year was one of the most fulfilling times of my life.


A quick aside: This article — “Many gifted children fail academically” — and the relevant Hacker News thread posted a couple days ago really hit home. Couldn’t help but get a wry smile when I read this comment:

…MIT alum here, but there's nothing all that singularly unique about MIT in my book. I'm glad I was able to go, but I also realize (claim?) that most people who would be successful at MIT will be successful wherever they go, and that MIT is likely a rounding error in their success. MIT doesn't turn lead applicants into gold graduates.


Today, I take my last final exams. (I really wanted to say “my final finals.”)

Barring any unforeseen circumstances, I graduate on Friday. Pomp and Circumstance and all that — I will finally be done with my “formal education.” (And while I loved and learned plenty from it: good riddance, since high school I’ve always liked my way better.)

I’m so prone to hyperbole when I talk about the near-future, but to tell you the truth, I’m not really sure where I’m going. I’ve committed to going back to Spokane for the summer. (The joke I keep hearing goes, “it just wouldn’t be a Mike Tigas summer without Spokane.”) But after that? Who knows. And you know? It’s somewhat refreshing to have that clean-ish slate ahead of me. I’d put off the thought of “tomorrow” for so long that it’s simultaneously amazing and overwhelming to think about now. No more school. This is it. What am I going to do now? I’ve got my degree — what am I fighting for now?

My impending graduation and departure from Columbia feels like a breakup to me. I’ve mostly passed the sad, reminiscing phase for now, but now that I’m looking ahead I’m stuck in that now what? phase. Starting over fresh is, once again, amazing and overwhelming to think about.

(I feel like that talk would start with something like: Columbia, we’ve had great times together, but I don’t think it’s working out — not now, at least. I’ve had this on-and-off thing with Spokane for a while. And don’t get me started on how long I’ve been pining for New York. I’d love to see you again someday but, for now, maybe we should see other people places? Don’t worry, I’m not crawling back to St. Louis. At least, not right now.)

My biggest takeaway from the past few years: trust my gut instinct more often. As someone with an awful tendency to overthink things, my (at the time, half-brained and random) decisions to go to this school and get involved with journalism here are probably the most significant good choices I’ve made over the past few years.

Another takeaway: don’t be afraid to fail. I’ve always been a supporter of the better to have loved and lost than never to have loved at all mantra. (The professional corollary would likely fall along the lines of: “better to do something you love and fail” than not.) Crippling fear of failure in the past meant I’d often never give some things a chance, but I think I’ve gotten better at it over the past couple years.

I don’t know where I see myself in two months, much less a year or five years or ten. But so what? I trust myself enough to believe that I’ll find a way to make things work out. I mean, the past five years worked out well enough on the fly.

Again, I’ve been lucky as hell; right place, right time.

I can’t wait for the next thing, whatever that may be.

How-to: Easy wireless eavesdropping with a Mac

Simple question: is unsecured wireless an actual, real-world problem?

Simple answer: YES. HELL YES.

Not a single coffee shop I frequent has any sort of wireless security. While I understand the consequences of that, I know that others don’t. Plenty of people take unprotected, public wireless for granted. Some don’t understand the risks and others believe that wireless eavesdropping is beyond the technical reach of just any ol’ person. That’s simply not true.

It’s dangerously easy for anyone to do — and today, I’m going to show you how someone can start eavesdropping on an unprotected wireless network in mere minutes. I’m going to show you just how easy it is. And then I’ll talk about what you can do about it.


Super important disclaimer text: If you’re not doing this on your own wireless network, get permission first. Otherwise, you may be breaking the law. I will not be held liable for what you do, based on whatever you learn from here. If you don’t agree with that, stop reading.


This is Mac-oriented, for simplicity’s sake: OS X comes with a lot of things that make this way too easy and that’s the point I’d like to get across. (This is completely doable on other systems, however.[1])

This guide is for tech-savvy folks who’ve used the command-line before. (A previous draft was more general-purpose, but far longer than I was comfortable publishing.)

Tools

Mac OS X comes with a version of tcpdump, which is a common command-line tool for “dumping” (aka “sniffing”; saving) the packets that zip across a network.

To actually analyze and get interesting information out of the mass of information in a packet dump — download Wireshark. I’m using the Development Release (1.3.4), but Stable should work fine as well. Install that to your Applications folder by dragging it over.

Using tcpdump

My usual use case looks something like the following. (I’ll explain all of the bits below.)

sudo tcpdump \
    -i $WIFICARD \
    -I \
    -n \
    -w $OUTPUT_FILE \
    not ether host $ETHER_ADDR \
    and not host $IP_ADDR \
    and not "(wlan[0:1] & 0xfc) == 0x40" \
    and not "(wlan[0:1] & 0xfc) == 0x50" \
    and not "(wlan[0:1] & 0xfc) == 0x80" \
    and not "(wlan[0:1] & 0xfc) == 0xa4" \
    and not "(wlan[0:1] & 0xfc) == 0xc4" \
    and not "(wlan[0:1] & 0xfc) == 0xd4"
 
  • -i sets the network card you’ll be using ($WIFICARD is your wireless card — en1, for example, is usually the identifier for Airport cards in Mac laptops)
  • -I puts your network card in “monitor mode,” where it listens in on all packets on the network, not just the ones addressed to you.
  • -n disables name resolution, since we don’t need it for our packet dump
  • -w sets the output packet dump file ($OUTPUT_FILE could be something like ~/Desktop/capture.pcap)
  • The last few options filter down our dataset:
    • Don’t save data between our computer and the access point, since we’re interested in eavesdropping other people ($ETHER_ADDR and $IP_ADDR would be your MAC and IP addresses on the local network, respectively)
    • Don’t save miscellaneous packets like wireless beacon packets and pings. There are a lot of them, and they don’t hold any useful data.
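
The six wlan[0:1] & 0xfc comparisons are easier to read once that byte is decoded. The first byte of an 802.11 frame holds the protocol version in its two lowest bits, the frame type in the next two, and the subtype in the top four; masking with 0xfc simply drops the version. Here is a rough Python sketch of that decoding. The function and table names are made up for illustration and are not part of tcpdump.

```python
# Decode the first 802.11 frame-control byte the same way the
# tcpdump filter does: mask off the version bits, then split the
# remainder into frame type and subtype.

FRAME_TYPES = {0: "management", 1: "control", 2: "data"}

# Only the (type, subtype) pairs that the capture filter excludes.
EXCLUDED_SUBTYPES = {
    (0, 4): "probe request",
    (0, 5): "probe response",
    (0, 8): "beacon",
    (1, 10): "PS-Poll",
    (1, 12): "CTS",
    (1, 13): "ACK",
}


def describe_frame_control(first_byte):
    """Return (type name, subtype name) for a frame-control byte."""
    masked = first_byte & 0xFC          # drop the 2 protocol-version bits
    frame_type = (masked >> 2) & 0x03   # bits 2-3
    subtype = (masked >> 4) & 0x0F      # bits 4-7
    type_name = FRAME_TYPES.get(frame_type, "other")
    subtype_name = EXCLUDED_SUBTYPES.get((frame_type, subtype), "not filtered")
    return type_name, subtype_name


for value in (0x40, 0x50, 0x80, 0xA4, 0xC4, 0xD4):
    print(hex(value), describe_frame_control(value))
```

In other words, the filter throws away probe requests and responses, beacons, and a few control frames, and keeps the data frames.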

Tip: you can run airport -I to see what your $WIFICARD is. From there, you can get the others by running ifconfig $WIFICARD — look at the values next to “ether” and “inet.”
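
If you would rather not read the ifconfig output by eye, the two values can be pulled out with a couple of regular expressions. This is a minimal Python sketch; parse_ifconfig and the sample text are illustrative (the sample reuses the addresses from the example command in this post, and real ifconfig output varies a little between systems).

```python
import re


def parse_ifconfig(output):
    """Return (mac, ipv4) found next to "ether" and "inet", or None for each."""
    ether = re.search(r"^\s*ether\s+([0-9a-fA-F:]{17})", output, re.MULTILINE)
    # "inet\s+" requires whitespace right after "inet", so "inet6" lines are skipped.
    inet = re.search(r"^\s*inet\s+(\d{1,3}(?:\.\d{1,3}){3})", output, re.MULTILINE)
    return (
        ether.group(1) if ether else None,
        inet.group(1) if inet else None,
    )


# Illustrative output for a wireless card named en1.
SAMPLE_IFCONFIG = """en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
\tether 00:26:bb:0b:1e:01
\tinet6 fe80::226:bbff:fe0b:1e01%en1 prefixlen 64 scopeid 0x5
\tinet 192.168.1.100 netmask 0xffffff00 broadcast 192.168.1.255
\tstatus: active
"""

mac_address, ip_address = parse_ifconfig(SAMPLE_IFCONFIG)
print(mac_address, ip_address)
```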

An example:

sudo tcpdump -i en1 -I -n -w ~/Desktop/dump.pcap not ether host 00:26:bb:0b:1e:01 and not host 192.168.1.100 and not "(wlan[0:1] & 0xfc) == 0x40" and not "(wlan[0:1] & 0xfc) == 0x50" and not "(wlan[0:1] & 0xfc) == 0x80" and not "(wlan[0:1] & 0xfc) == 0xa4" and not "(wlan[0:1] & 0xfc) == 0xc4" and not "(wlan[0:1] & 0xfc) == 0xd4"
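
That one-liner is long enough to be error-prone, so here is a small Python sketch that assembles the same arguments from the four variables. It only builds and prints the command and does not run anything. build_tcpdump_command is a hypothetical helper, not something that ships with tcpdump, and it passes the whole filter as a single argument, which tcpdump accepts.

```python
import shlex

# First-byte values (after masking with 0xfc) that the filter excludes.
EXCLUDED_FRAME_BYTES = (0x40, 0x50, 0x80, 0xA4, 0xC4, 0xD4)


def build_tcpdump_command(wificard, output_file, ether_addr, ip_addr):
    """Return the tcpdump invocation as an argument list."""
    clauses = [
        "not ether host {}".format(ether_addr),
        "not host {}".format(ip_addr),
    ]
    for value in EXCLUDED_FRAME_BYTES:
        clauses.append("not (wlan[0:1] & 0xfc) == 0x{:02x}".format(value))
    capture_filter = " and ".join(clauses)
    return [
        "sudo", "tcpdump",
        "-i", wificard,      # wireless interface
        "-I",                # monitor mode
        "-n",                # no name resolution
        "-w", output_file,   # where to write the packet dump
        capture_filter,
    ]


command = build_tcpdump_command(
    "en1", "dump.pcap", "00:26:bb:0b:1e:01", "192.168.1.100"
)
# Print a copy-pasteable version with shell quoting applied.
print(" ".join(shlex.quote(part) for part in command))
```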

Alternatively, I’ve wrapped up that command in a script that (should) automatically figure out your IP and MAC addresses, then start a packet dump that saves to your desktop.

You can view the script here and download it from here.

Since the tcpdump command within the script is being run via sudo, it’ll ask for your password — tcpdump needs to be run as an administrator to switch the wireless card over to “monitor mode.” (Aside: check out the code before running it. Never ever run anything with sudo on the command-line unless you’re absolutely sure it’s safe.)

Assuming you’ve downloaded it to your Downloads folder, creating a packet dump is as simple as:

cd ~/Downloads
chmod +x sniff.sh
./sniff.sh

If the script is working, you’ll notice the dump file appear on the desktop and grow as you capture packets. You are now eavesdropping on other people’s connections on the given wireless network. At any point, you can finish up and close the script by pressing control-c.

Making sense of the data

Open up Wireshark.

Go to File->Open and go open up that .pcap file that you’ve created.

You should now have a huge list of packets. For our intents and purposes, we really don’t care about a lot of packet types, so paste the following into the “Filter” box and click on “Apply”. (Note that since Wireshark is an X11-based application, pasting is done with control-v, rather than ⌘-v.)

(http or smtp or imap or pop or aim or jabber or aim_chat or aim_buddylist) and not (tcp.analysis.retransmission or tcp.analysis.lost_segment or not http.response.code)

You should now have a packet dump that looks sort of like the following. (Click for a larger view.)

You can now dig around and browse all of the data that went through the wireless network: Web pages, SMTP/IMAP/POP e-mail, AIM conversations, Jabber (Google Talk, Facebook Chat) conversations — provided they’re unencrypted. (Side note: AIM and Google Talk now default to using SSL encryption. Most e-mail hosts do, too.)

The “packet data” panel (the second or third one — bottom one in my example image) allows you to drill down the layers of protocols-within-protocols in every packet. Play around with it!

The following filters might also be nice to experiment with:

  • aim.messageblock.message — will only show IM messages over the AIM network.
  • http.request.uri contains "profile.php" — will only show Web pages with "profile.php" in the link (i.e., Facebook)
  • http.request.uri contains "login"
  • http.request.uri contains "mail"
  • http contains "username" — will only show requests that have the string "username" anywhere within the URL or content. (Surprise: this includes submissions to unencrypted login forms, if there are any.)
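
To see why that last filter is so alarming, consider what an unencrypted login form submission looks like on the wire. The request below is entirely made up (hypothetical host, username, and password), but the shape is standard HTTP, and this short Python sketch shows that reading the fields back out takes no special skill.

```python
from urllib.parse import parse_qs

# A fabricated plaintext HTTP POST, as it might appear in a capture.
CAPTURED_REQUEST = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 32\r\n"
    b"\r\n"
    b"username=alice&password=hunter22"
)


def extract_form_fields(raw_request):
    """Split headers from body and decode the URL-encoded form fields."""
    _headers, _separator, body = raw_request.partition(b"\r\n\r\n")
    fields = parse_qs(body.decode("ascii"))
    # parse_qs returns a list per field; keep the first value of each.
    return {name: values[0] for name, values in fields.items()}


print(extract_form_fields(CAPTURED_REQUEST))
```

Over HTTPS, the same request is encrypted before it ever reaches the wireless card, so none of this is visible to an eavesdropper.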

But wait! There’s more!

Wireshark can automatically parse out intercepted files and save them to your hard drive. This means you don’t even need to make sense of the raw protocols to get “tangible” results.

Go to File->Export->Objects->HTTP. Click on “Save All.” Type in a name for this folder and hit “OK” — ignore the “Some files could not be saved” error.

Open up that folder and you’ll see nearly every file transmitted over the network while you were capturing packets:

To drive the point home

Scared yet? You should be.

Unsecured public wireless networks are a huge risk to those who don’t understand just how “open” they are.

I’ve just shown you how little time and effort an eavesdropping attack takes. In mere minutes of idle time (about 10 in my example dump), anyone has the ability to collect a treasure trove of information on the people using a wireless network around them.

Digital eavesdropping and identity theft don’t have to be targeted crimes against specific people. Digital thieves can cast wide nets and hope they drag something valuable in.

What you can do

If your school or company has a VPN, log into it whenever you connect to an open wireless network. (Provided your connection doesn’t need extra authentication like Cisco Clean Access, even non-computer devices like the iPhone support VPN.) Connecting through a VPN encrypts data between you and the VPN — only after your information makes it to your VPN’s internet connection does it become unencrypted (and from there, it goes to the internet normally).

Alternatively, if you’re savvy enough to have SSH access to a Web server, you can use it as a secure proxy tunnel in practically the same way. If you understood what I just said, you can probably wing it.
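
For those who want a starting point: one common way to do this is OpenSSH’s dynamic port forwarding, which opens a local SOCKS proxy that sends traffic through the SSH connection. That choice is an assumption here, since the method isn’t spelled out above. The Python sketch below only builds the command; the user, host, and port are placeholders.

```python
def build_socks_tunnel_command(user, host, local_port=1080):
    """Return an ssh invocation that opens a local SOCKS proxy."""
    return [
        "ssh",
        "-N",                    # no remote command, just forward traffic
        "-D", str(local_port),   # dynamic (SOCKS) forwarding on this local port
        "{}@{}".format(user, host),
    ]


# Placeholder account and server names.
print(" ".join(build_socks_tunnel_command("me", "example.com")))
```

With that running, you would point your browser’s SOCKS proxy setting at localhost and the chosen port.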

If you don’t have access to the above, you can’t really do that much. Ideally, you should ask your local business to enable WPA on their network and either post the password or have customers ask for it. (My nearby Rocket Market operated their wireless like this, back when I lived up in Spokane.)

Most importantly: tread lightly. Never do anything “confidential” on an unprotected wireless network. And whenever you do go out, only log into sites and services that use SSL. (Facebook, Twitter, Gmail, and many other major sites always send your username & password via HTTPS. Gmail can be read over HTTPS, as can most other e-mail services. iChat can be set to “Require SSL” under your account’s server settings.)

Cautiousness is a virtue, online. Be careful and always be prepared for the worst. Think before you log in. Don’t use the same password everywhere. (I used to keep a rotation of about four passwords before switching to all random passwords and 1Password as a password manager.) Don’t take the Internet for granted.

Oh yeah, and don’t ever try anything I’ve mentioned here, unless you have permission. §


[1] Wireshark does work on all platforms and also performs the sniffing aspects on Windows/Linux — if your drivers allow it. With a little bit of effort, you can figure that out. You can still make do with my Wireshark analysis instructions once you have a packet dump.

Compartmentalization

Double the fun! I’ve gone into the archives and published “Signal to Noise,” a previously unpublished entry from August of last year. You should probably read that first, as it goes hand-in-hand with this one.


My attempts at compartmentalization have failed. There is only one inbox.

On the down side (that was the up side), there is no “off the clock.” There is no “not on company time.” There is no “not speaking on behalf of…” Disclaimers to the contrary are commonplace, well-rehearsed, and futile. Technologies that “help” us to link our disparate personas will inevitably intertwine them with our impersonas too. There are no “strictly personal venues.” And when nothing can be said without being misconstrued, there is nothing left to be said.

My attempts at compartmentalization have failed. There is only one outbox.

Mark Pilgrim, One

It’s almost relieving to witness someone as well-known as Mark Pilgrim, running headfirst into this very issue.[1] This is the one demon that prevents me from posting draft after draft of blog prose. There is a crippling fear and question, “what if my personal thoughts and my professional persona are irreconcilable?”

I once made the mistake of mixing the much-too-personal with my blog, years ago — and, judging from the volume of entries I’ve published, I’ve practically been sitting on my hands, since.

But why compartmentalize? Why build those walls to divvy up our lives?

There is a fine line that separates “transparency” from “way too personal,” and it’s a line I regret crossing before. But I think I’d rather be judged as Mike Tigas — mistakes, missteps, misadventures, and all — than project a “manufactured” identity under my own name.

As a self-employed freelancer — whose brand is his name — I’m not sure I see the utility of having separate “professional” and “personal” lives. And even in general: work and home are very different places, but throughout the day isn’t it still the same life you’re living? (In some professions there will be exceptions to this, I’m sure.)

Online, sacrificing your identity for the sake of image is folly — your pseudo-identity just becomes a pretense, like you’re just a marketing gimmick for the product or brand you represent. And if that brand is you — is the dog walking the master at that point? (At what point does your brand stop representing you, but rather you represent what you wish it could stand for?)

I’m not saying you should talk “inside baseball” in the open. I’ve been under NDAs and I’ve been in situations without ’em where openly discussing my work could be disastrous. But I suppose my point is censorship of personality: who you are in either environment shouldn’t differ all that much. You’re you. Everyone makes mistakes. If someone really wants to find something incriminating on you, they probably will, despite your best efforts. If you aren’t comfortable being yourself, then who are you?


…I’m working on that answer. I’ve been working on it for as long as I can remember, actually — winging it, floating between hobbies and work that I enjoy, looking for “a fit.” I graduate in six weeks. That will only be the beginning, I’m sure.


[1] In fact, a couple bloggers I follow and idolize share that common theme. (I don’t really know what that says about me.) Pilgrim lost his job over a post regarding alcoholism and addiction. Heather Armstrong’s work rants also got her fired.

Even now — nearly ten years since both lost jobs over blogging — the way they write is still intriguing and very human to me. Doesn’t hurt that they both ooze wit and charm through their writing. Compared to other blogs I follow out of topical interest, I follow them (and some others) just for the prose and writing style.

…And it probably helps their case that Pilgrim now works for Google and that Armstrong is possibly the most widely read female blogger today. Minor details.

On reality and authenticity

Mark Lamster, returning from a trip to Las Vegas:

Drinks at Prime Meats, in Brooklyn, with my wife. Realistically, this place is as much an artifice as anything on the Strip, a re-imagining of a 19th-century saloon, complete with polished bar, antique typography, Edison bulbs. Why, then, does it feel so much more honest? Because its aesthetic is filtered through a contemporary sensibility? Because it seems a natural part of a vibrant neighborhood? Is this all bullshit I invent to make myself feel more comfortable?

Mark Lamster, What Am I Doing Here? Tall Buildings and High Anxiety in Las Vegas


Carnegie Mellon Professor, Jesse Schell, on the psychology of games: Video here. It’s good in its entirety, but the relevant parts start at about 10:25. Segment quoted below starts at about 12:15:

Go look at TV — the people on TV, their heads are spinning! Everything is about reality TV. Go to the grocery store: it’s not just groceries anymore! Organic groceries — they’re more genuine, they’re more real groceries. You go to McDonald’s, and a Big Mac — well, you could get a Big Mac, or you could get the real burger, the Angus Burger, made with real this and that and whatever. Everything’s suddenly about reality.

[…] Gilmore and Pine put forth this interesting concept: that the most valuable thing in products today is are they real, are they authentic. Which is a bold hypothesis. And then they go further and they say, “Well, now why is it? Why now? It didn’t always used to be this way. Certainly it’s not what sold stuff in the ’80s. Right? […] What is it now that people are demanding reality, demanding authenticity?”

And they’re arguing that all this virtual stuff that’s been creeping up on us over the last twenty years has really cut us off from nature. We’re cut off from nature, we’re cut off from self-sufficiency.

[…] We live in a bubble of fake bullshit and we have this hunger to get to anything that’s real. Even if the best we can do is a Starbucks mocha with real Swiss chocolate — we’ll take it! Oh, that’s real! Look how real that seems to me, relative to what I’m used to!

Jesse Schell, Design Outside the Box Presentation

In that segment, Schell frequently references Authenticity, by Gilmore and Pine, so you might also want to check that out.


This is something I often wonder about, as the Internet grows by leaps and bounds. For example, my recurring love-hate relationship with the Great Internet Timesuck and my tendency to quit Facebook and invoke Vonnegut just about every year. As I said before, I feel as if there’s some sort of cultural push back on the horizon — maybe this “thirst for reality” is already here, just in some other form?

NationBrowse

I haven’t quite graduated yet, but I did take my “capstone” class last semester. The objective was vaguely, “do something innovative,” so I pitched (what I thought was) the data app of my dreams.

This is how it all went down. This is essentially a brain dump of all the little notes I’ve collected while working on this project. Boy, do I collect a lot of notes.

The end result

Quick note: The server running the demo is ill-equipped for the massive dataset size — I’ll talk more about this below. …If you click around and you get a timeout error, wait a minute to let the server catch up (or cache up…) and try again.

NationBrowse screenshot

In its current state, nationbrowse.com is a mess, but showing it off is the easiest starting point to work from:

Warning: A lot of technical talk, from here on out.

Background bits

Heavily inspired by: The Apps for America contests [1,2], ThisWeKnow, DataMasher, this Mapping L.A. Neighborhoods project from the Los Angeles Times, and EveryBlock. (ThisWeKnow and DataMasher, we actually hadn’t heard of until partway through the semester — was really great to see more reference projects show up along the way.)

The team: Graham Greenfield, Jeremy Howard, Nick Roma, and myself. While all had programming experience, none of the others had used Python, developed GIS software, or worked on a Web app with real-world data. (It went extremely well. They picked up quickly. Python is awesome.)

Source code: Here, on github.

The basics: Python, Django, and PostgreSQL. GeoDjango via PostGIS.

Server: Served over Apache+mod_wsgi, on an internal port. nginx sits at port 80 and proxies requests over to the Apache instance.

Caching: Memcached. Using python-memcached instead of (the now unmaintained) cmemcache. Using the cache middleware along with custom caching all over the place. (There are a few notes in the next section, regarding nginx+memcached.)

Mapping: OpenLayers, for client-side shape rendering.

Graphs: Google Chart API.

Data: U.S. Census TIGER/Line for shapefiles. U.S. Census 2000 & American Community Survey 2008 for most statistics. FBI Uniform Crime Reports for other numbers.

Issues & things we cut

A lot of our initial ambitions were fiercely struck down by performance considerations. Last I checked, a bzip2-compressed database dump sat at over one gigabyte due to the sheer number of states, counties, and ZIP codes stored and the precision of the shapefiles and statistics. On a VPS with 256MB of RAM, pitting PostgreSQL against a set of data at this size proved to be a royal pain in the ass.

Wanted to use TileCache/Mapnik, the “EveryBlock stack,” to generate maps server-side: performance was awful given the hardware/dataset circumstances. (Not to mention adding the configuration complexity of having a whole Apache mod_python instance running alongside the site’s Django wsgi instance.) Instead: we found a way to render shapes in OpenLayers, on the user’s Web browser, by sending along raw WKT geo data in the Javascript for a given map. The (sometimes huge) file size increase was far preferable to the (dangerously high) server load.
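
As a rough idea of what “sending along raw WKT” means, here is a small Python sketch that turns a list of coordinates into a WKT polygon and wraps a set of shapes in a Javascript variable. The function names, the SHAPES variable, and the coordinates are all invented for illustration; this is not the project’s code, which pulled its geometry from PostGIS.

```python
import json


def ring_to_wkt(points):
    """Build a WKT POLYGON string from a list of (lon, lat) pairs."""
    ring = list(points)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT rings must end where they start
    coords = ", ".join("{} {}".format(lon, lat) for lon, lat in ring)
    return "POLYGON(({}))".format(coords)


def shapes_to_js(shapes):
    """Serialize {name: points} into a Javascript assignment."""
    payload = {name: ring_to_wkt(points) for name, points in shapes.items()}
    return "var SHAPES = {};".format(json.dumps(payload, sort_keys=True))


# A made-up triangle, just to show the output format.
print(shapes_to_js({"example": [(0, 0), (1, 0), (1, 1)]}))
```

On the browser side, OpenLayers can parse strings like these into vector features, which is where the rendering cost ends up.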

Wanted to use MatPlotLib, to generate server-side graphs: again, performance was killing the site. This was actually completely implemented [1, 2], but not strong enough for us to demo with. Instead: we built wrappers around the Google Chart API, which offloads the rendering work to some magical Google server.
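
A wrapper like that mostly boils down to building a URL. The Python sketch below shows the general idea for a pie chart, using the image chart parameters cht (chart type), chs (size), chd (data), and chl (labels). The function name, base URL constant, and sample numbers are illustrative rather than taken from the project, and Google has since shut the image chart service down, so treat this as a picture of the URL format rather than something to call today.

```python
from urllib.parse import urlencode

CHART_BASE_URL = "https://chart.googleapis.com/chart"


def pie_chart_url(values, labels, width=300, height=150):
    """Return an image-chart URL for a pie chart.

    Values use the simple text encoding, which expects numbers from 0 to 100.
    """
    params = [
        ("cht", "p"),                                      # pie chart
        ("chs", "{}x{}".format(width, height)),            # size in pixels
        ("chd", "t:" + ",".join(str(v) for v in values)),  # data series
        ("chl", "|".join(labels)),                         # slice labels
    ]
    return CHART_BASE_URL + "?" + urlencode(params)


# Made-up percentages, just to show the resulting URL.
print(pie_chart_url([60, 40], ["Owner", "Renter"]))
```

The server never draws anything; the page just includes an image tag pointing at the generated URL.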

nginx is being used as a reverse proxy and we’d hoped it could serve cached results, directly out of memcached. There are still some issues with corrupted/misencoded data being returned to the browser. (The classic “gibberish loads in browser” effect.) Not sure if this is due to the large size of things being stored, or some encoding misconfiguration — if anyone has any ideas, I’d love to hear ’em. (I’m using this serve-from-cache method on this blog, and it’s working just fine, with a near-exact configuration.)

Similar to DataMasher, we wanted to develop a way to let users automatically create comparative (and inferential) statistics. Unlike DataMasher, we sought to build something statistically sound — we were talked out of this by some folks at the Social Science Statistics Center, who noted that blindly comparing Census data would create junk data in nearly every case. At this point, we just threw our all into descriptive statistics — hence a focus on maps, charts, and tables.

Pieces of note

The cacheutil library is a little “swiss army knife” that includes a few useful functions: the safe_get_cache/safe_set_cache/safecache methods and template tag, which sanitize and hash cache keys; some decorators for caching methods, class methods, and class properties; and a middleware for those wanting nginx to serve directly from the cache [1,2].
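
The reason cache keys need sanitizing at all: memcached rejects keys longer than 250 bytes or containing whitespace and control characters, and keys built from place names or query parameters hit both limits easily. Here is a minimal Python sketch of the hashing idea. make_safe_key and its prefix are hypothetical and are not the cacheutil functions themselves.

```python
import hashlib

MEMCACHED_MAX_KEY_LENGTH = 250


def make_safe_key(*parts, prefix="nb"):
    """Collapse arbitrary key parts into a short, memcached-safe key."""
    raw = ":".join(str(part) for part in parts)
    # A hex digest has a fixed length and contains no spaces or odd characters.
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return "{}:{}".format(prefix, digest)


# Made-up key parts, including spaces and punctuation.
key = make_safe_key("county stats", "St. Louis, MO", 2008)
print(key, len(key))
```

The trade-off is that hashed keys are unreadable when you poke at the cache by hand, so a readable prefix helps.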

A threading shortcut function that allows you to call some function in the background, while the rest of your view moves on and gets returned to the user’s browser. (Useful for loading views or calling functions in advance, to pre-cache ’em before a user actually goes there.)
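
The idea fits in a few lines of Python. This is a sketch under assumed names (run_in_background, warm_cache, and the key are all made up), not the project’s actual helper.

```python
import threading


def run_in_background(func, *args, **kwargs):
    """Start func in a separate thread and return immediately."""
    thread = threading.Thread(target=func, args=args, kwargs=kwargs)
    thread.daemon = True  # do not keep the process alive just for this work
    thread.start()
    return thread


warmed = []


def warm_cache(key):
    """Stand-in for an expensive query whose result would be cached."""
    warmed.append(key)


worker = run_in_background(warm_cache, "county:example")
worker.join()  # a real view would return without waiting; joined here for the demo
print(warmed)
```

In a view, you would skip the join and let the response go out while the thread finishes on its own.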

Some pluggable utilities for generating Google Graphs URLs.

A ton of Javascript magic, using jQuery and OpenLayers. Between the template and the static helper functions, you get that nice map with toggle-able shapes (to change which variable the map is shaded by) and the nice hover effect on the shapes — as seen on the homepage.

If you are interested in using MatPlotLib and Django, you can split your chart generation functions and the bits that actually grab the data & generate a PNG response. While this project couldn’t use it in the end, there’s a lot of potential for dynamic awesomeness there.

Credits due

Ted came up with the name a long time ago, when I first threw around the idea of a data project like this.

My team was awesome for going along with something so ridiculously ambitious. For a one-semester undergraduate capstone project, in which 75% of the team hadn’t even used the language, it really worked out. Graham and Jeremy were troopers and put a lot of work into the MatPlotLib renderers [1, 2] that weren’t fully implemented in the end product. Nick, without any prior Javascript or jQuery experience, built a GUI “query builder” (which, unfortunately, is not functional in the live demo).

After repeatedly shooting down Flash-based maps and discovering that server-side map tiles were out of the question, the dynamic elements of the map are heavily inspired from staring at the source of this Los Angeles Times mapping project. (And weeding my way through the OpenLayers documentation and mailing lists.) It’s not the prettiest, but there’s a lot of dynamic flexibility to it that I haven’t yet seen in other OpenLayers implementations.

Last complaints

Setting up a PostGIS database is a pain. Importing the entire State, County, and ZipCode sets is even worse. I did it here — note that I had to manually import Puerto Rican municipio (equivalent to counties) by tweaking the INSERT statements and unescaping some of the characters with diacritics and forcing PostgreSQL to run it as UTF-8. Hopefully that’ll save you some pain if you try this someday.

Census data is a mess. Know how to get to raw data from the homepage? Yeah. (Try the Download Center over here.) The data was pipe-delimited (and therefore, PostgreSQL could import it directly), but turning the many, many arbitrary columns into model fields was a pain.
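
To give a flavor of the column-mapping chore, here is a tiny Python sketch that reads pipe-delimited text and renames cryptic columns to friendlier field names. The sample rows, column codes, and field names are invented for illustration; the real files have far more columns, and the project loaded them through PostgreSQL rather than like this.

```python
import csv
import io

# Hypothetical mapping from source column codes to model-style field names.
COLUMN_TO_FIELD = {
    "GEO_ID": "geo_id",
    "NAME": "name",
    "P001001": "total_population",
}

# Made-up sample data in the same pipe-delimited shape.
SAMPLE_DATA = "GEO_ID|NAME|P001001\nX1|Example County, Missouri|12345\n"


def read_rows(text):
    """Yield one dict per data row, keyed by the friendlier field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    for row in reader:
        yield {field: row[column] for column, field in COLUMN_TO_FIELD.items()}


for record in read_rows(SAMPLE_DATA):
    print(record)
```

Now multiply that mapping by hundreds of columns per table, and you have the pain in question.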

Oh, and mixing data from disparate sources? (Say, the FBI Uniform Crime Reports, whose data is entirely distributed in Excel spreadsheets.) Good luck.

I would really love to see a more open method to access a lot of this data. After working on this project, I have to say that there are still significant barriers to doing useful things with open government data. ThisWeKnow uses RDF/SPARQL and is — judging from their goals and execution — an excellent start.

Epilogue

I don’t believe NationBrowse is “complete.” It’s a nice technology demo and was a nice experiment in how a large data app can be built with very few resources. But it’s a data ghetto. It’s a standalone site, with very little context and very little use of the massive underlying dataset.

If I could have another go at this, I’d have emphasized data export functionality or some other way to get “joined” data from disparate sets and sources. Possibly create an API around the underlying data. And even then, the data still needs to go in, somehow.

But hey, if four guys in college can find a way to make something of that data, for (near-)free, maybe there’s hope.

I implore you to dig around in the repository and especially check out the notable bits.

You can comment on this post via Google Buzz. Or, you can contact me directly.

This is a quote I love to come back to, time and again.

Even as a Web developer — a person who gets paid to go out and build up the great expanses of the Internet — I love this quote. And, to a great extent, I believe in it.

Electronic communities build nothing. You wind up with nothing. We are dancing animals. How beautiful it is to get up and go out and do something. We are here on Earth to fart around. Don’t let anybody tell you any different.

— Kurt Vonnegut, in A Man Without a Country


Google Buzz was released earlier this week. Facebook redesigned its main page. A lot of people paid a whole lot of attention to these things.

I had a good conversation with Carolina a couple nights ago, about the substitution of social networks for real social interaction. (Her friend Amanda expressed dismay at the whole thing, which is what got us on the subject.) And while I concede there are plenty of uses for these communities (reconnecting with distant folks, planning events, having non-live conversations in comment streams), I can’t help but notice:

An increasing number of the people I speak to believe we’re placing far too much collective importance on these things. Me? I fear the people too young to remember dial-up Internet and earlier. And seriously, think about it: I’m sure there are some kids who communicate through these networking sites more than through any other medium, whether text, phone, or in person. This is all they’ll have ever known. (In practice, I’m sure the reality lies somewhere between texting and the Internet.)

In my wildest dreams, I imagine we’ll get to a point where this dawns on everyone and we have a large cultural pushback. Maybe, like the whole/organic food fad, it’ll only be a minority. But sometimes I feel like the undercurrents are there.

Does anybody even remember Google Wave? Friendster? Xanga?

The iPad & Game Consoles

A quick thought or two on the iPad hubbub and the “casual computing vs. tinkering” conversation that’s been happening as of late. But first:

ThinkPad, anyone?

I concede “iPad” is a terrible name simply because of the similarity to Apple’s existing “iPod,” but I really don’t understand the fascination with “pad” jokes. A “-Pad” name has been pretty successful — without the toilet humor — for about 18 years now.

There’s precedent for names like this in recent history: take the Nintendo Wii. Like the Wii, I’m pretty sure we’ll move on from picking on nomenclature once we start using the damn thing.

Which sort of leads into my main point

One of the general arguments against the iPad being successful is that it’s more expensive than a netbook, it’s not as full featured, it doesn’t even multitask, and so on.

Who cares? Between my brother and me, we own several high-end computers that, by default, are closed systems. They don’t multitask. You can’t easily make your own content for them. You can’t really mess around with a lot of the performance-oriented settings.

They are: a PlayStation 3, an Xbox 360, a Wii, and a few other systems.

For the most part, direct comparisons between these devices and “general computers” tend to be “apples to oranges” comparisons. (The classic “console vs. PC gaming” argument is probably the best example.)

They’re purpose-built machines; they’re in a different league, and that’s that.

There are lots of folks who own Macs or older PCs and want a way to play the latest games, and many of them own game consoles because that provides the easiest out-of-the-box experience, as opposed to buying and maintaining a PC gaming rig. And it’s much easier than trying to play Crysis on a PC whose hardware is four or five years old, or on one with a sub-$500 price tag.

My point is, there is a place for the iPad and people will buy it even if it is (several orders of magnitude) less versatile and far more expensive than a netbook. It doesn’t have to be a netbook to succeed. As long as the iPad gives the user enough of what they want (presumably: Web content, books, and apps) and wraps that up in an enjoyable experience, then Apple has a legitimate competitor against the netbook market.

Another point of reference: Some folks will go out and buy an Xbox 360 because of the platform-exclusive titles, like Halo. I could try to talk about how technologically superior the PlayStation 3 is to the Xbox 360, but I can’t specifically dissuade someone who loves Halo. Some folks will go the iPhone/iPad route specifically for the exclusive apps and features, too.

On hackers and tinkerers

On the other hand, there is the “tinkering argument” — that the spread and adoption of these “closed systems” will bring an end to the days of tinkerers.

Video game consoles also provide a great analogue to the iPad’s “closedness” in this regard: they come “closed,” of course. But my Xbox 360 is modified to play burned games and doing the same to the Wii is, supposedly, a piece of cake. You don’t have to look far to find people willing to do the same with Apple’s closed systems.

(My brother and I do live on the far end of the tinkering range — in both PCs and game consoles — so my experiences are obviously a tiny bit skewed.)

Interestingly enough, I do notice that a great percentage of the PC gamers I know do tinker with the settings, update their drivers, upgrade their parts, etc., on a regular basis, or at the very least know how to perform those tasks. And while I know of primarily-console folks who’ve modified or hacked their systems, they are a much rarer breed. This is exactly what the fear is: tinkering falling by the wayside because the closed-off systems inherently have fewer things to tinker with.[1]

While I have no reservations about the “closed” nature of the iPad specifically, I am one of the people who will be concerned if this truly is the “future of computing.”

At best, some console hacks are merely inconvenient[2], while at worst there are those that are outright illegal. I strongly believe that those who want to do more with their computing devices will inevitably find a way to do it. I just think it will play out better for everyone if we encourage and facilitate rather than criminalize curiosity and innovation.


[1] Alex Payne & Jim Stogdill both have excellent points on this, which inspired me to write a bit about it.
[2] Older PS3 models do allow you to install Linux on an unmodified console. And as far as I know, there are no hacks for the PS3 that allow you to play burned games.

I don’t have comments set up on this site yet, but if you’d like to, you can comment on this blog post over on Facebook. You don’t even have to be my friend.

I'm a three-time (soon to be four-time) published author. When aspiring authors learn this, they invariably ask what word processor I use. It doesn't fucking matter! I happen to write in Emacs. I also code in Emacs, which is a nice bonus. Other people write and code in vi. Other people write in Microsoft Word and code in TextMate+ or TextEdit or some fancy web-based collaborative editor like EtherPad or Google Wave. Whatever. Picking the right text editor will not make you a better writer. Writing will make you a better writer. Writing, and editing, and publishing, and listening -- really listening -- to what people say about your writing. […] Just fucking write, then publish, then write some more. One day your writing will get featured on a site like Reddit and you'll go from 5 readers to 5000 in a matter of hours, and they'll all tell you how much your writing sucks. And most of them will be right! Learn how to respond to constructive criticism and filter out the trolls, and you can write the next great American novel in edlin.

Mark Pilgrim, on The Setup. (Emphasis mine.)

Cop out

I owe you some vacation photos, but alas — I don’t have them yet. For now, enjoy a set of photos from a previous trip to New Jersey & New York City.

CONCRETE Subway Fun
Basketball Man
Grease Trucks

Oh yeah, and I probably owe you a mention or so about the rebuild of this site, don’t I?

Trust me, I’ll have something current up soon.