What We Do Part 2…

I promised more disclosure.  So here we go.

In Part 1, I laid out some background on the problem we solve.  There were a few things I didn’t get to cover (time has been in short supply of late).

IDC estimates that digital data is growing at 56% per year.  To give you a mental picture of how ridiculous that is, the data in the digital ecosystem in 2006 was the equivalent of 12 columns of books going from here to the sun.  By 2010, those 12 columns will reach to Pluto AND BACK.

Yeah.  It’s a lot of data and a lot of growth.
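
If you want to check the compounding yourself, here is a back-of-the-envelope sketch in Python.  It assumes the 56% rate holds steady and compounds once a year from 2006 to 2010, which is a simplification; the function name is made up purely for illustration.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Multiplier after compounding `annual_rate` (0.56 means 56%) for `years` years."""
    return (1.0 + annual_rate) ** years


# Assumption: four full years of growth, 2006 through 2010.
factor = growth_factor(0.56, 4)
print(f"{factor:.1f}x")  # prints "5.9x"
```

In other words, at that pace the pile of data grows to roughly six times its size in four years.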

And the problem is that you, as an individual using current tools, simply can’t keep pace. 

Time for a graph:

[Graph: growth in data]

As you can see, there is an ever-widening gap between "all data" and "frequently used data".  In other words, more and more data is being used less and less often.

And that breaks a lot of the tools we rely on. 

For example, search relies on having good keywords.  Good keywords rely on some familiarity with the data being searched.  Less familiarity leads to poorer keywords.  And as we’ve all seen firsthand, having a poor keyword usually leads to useless search results.

A normal hierarchical system gets into trouble too, whether we’re talking menu systems, wikis or filesystems.  As you shove more and more data into the system, you have two choices:

1.  Create ever broader subcategories.  This makes filing easy but retrieval gets harder as more and more stuff gets thrown in any single category.  For example, dividing all your friends into "tall" or "short" doesn’t really add a lot to the organization of your Rolodex.

2.  Create many new subcategories.  This makes filing hard as it becomes increasingly subjective (is a zebra an animal that is black with white stripes or white with black stripes?).  It also makes retrieval ever more dependent on being familiar with the organizational system.  For example, try using a library to find something specific without a basic understanding of the Dewey Decimal System…

Since we generally want to store and retrieve information, neither option is appealing.

For a much more detailed discussion of how things are breaking down, take a look at "Everything Is Miscellaneous" by David Weinberger.

Okay, so the world is going to hell in a handbasket and life as we know it is done. 

Well, not quite.

The dirty little secret is that by and large, all this new data is actually being created by someone.  So almost any piece of data in the digital universe is familiar to somebody somewhere.  And their knowledge is the information that can be tapped to solve the problem of managing all the data.

Sounds a bit like tagging, no?  You’ll have to come back and read more later in the week to find out.
