Thursday, September 07, 2006

Holovaty: We're building databases for the future, not the present

As you may have heard, a fellow named Adrian Holovaty has a Big Idea, and it's a really good one: newspaper information needs to be updated for the digital age by storing it not only in the hundred-year-old "story" format, but in little database chunks. What's the business model? He's quick to say he doesn't know, but his answer this week to one concern should be comforting to data compilers with small audiences.

Here's Adrian's line: if you've lifted a few words out of your story and flagged them in a way that a computer can recognize -- if a computer can identify your story's "who" and "where" -- then you're setting yourself up to someday ask a computer to map all those "who"s against, say, a database of political donors, or real estate purchasers, or sources. It's the difference between Fisher-Price and Lego.
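
To make that concrete, here's a rough sketch in Python of what "flagging" buys you. It isn't Adrian's code or anything the Post runs; the Story class, the match_people function, the names and the donor amounts are all made up for illustration. The point is only that once the "who" lives in its own field, matching it against another dataset is a few lines of work.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """A hypothetical story record: the usual text plus machine-readable tags."""
    headline: str
    body: str
    who: list[str] = field(default_factory=list)    # people flagged in the story
    where: list[str] = field(default_factory=list)  # places flagged in the story


def match_people(stories, donors):
    """Return (headline, person, amount) for every flagged person found in donors."""
    matches = []
    for story in stories:
        for person in story.who:
            if person in donors:
                matches.append((story.headline, person, donors[person]))
    return matches


# Invented sample data: two tagged stories and a tiny donor table.
stories = [
    Story("Council approves rezoning", "Full story text goes here.",
          who=["Jane Doe", "John Roe"], where=["Main Street"]),
    Story("School board race heats up", "Full story text goes here.",
          who=["John Roe"], where=["Eastside"]),
]
donors = {"John Roe": 500}  # name -> donation amount, assumed for the example

matches = match_people(stories, donors)
for headline, person, amount in matches:
    print(f"{person} (donated {amount}) appears in: {headline}")
```

Exact name matching is the weakest part of the sketch. A real newsroom would need to handle nicknames, misspellings and two people with the same name, which is one more reason to keep the data clean.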

His most visible work at the Post has been stuff like the wonderfully nichey political ads database. But in a long blog post this week, he reminds us that mere newsy databases aren't the endgame -- the greater purpose isn't serving today's reader, but laying the groundwork for future remixes of the data.

"If you store everything on your Web site as a news article, the Web site is not necessarily hard to use. Rather, it's a problem of lost opportunity. ... That Web site cannot do the cool things that readers are beginning to expect."

That's a bit of encouragement for small papers considering similar projects in the face of minimal pageviews. (Squint. You see that? Wagging in the distance? It's the long tail!)

It's also some motivation to keep that data well-scrubbed: more's at stake here than Saturday's paper.


Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 2.5 License.