Tuesday, October 30, 2007

The Quantified Self

Amazon examines its digital archive of several hundred thousand scanned books to extract what it calls Statistically Improbable Phrases (SIPs). This process compares the sentences in each book to the millions of sentences in the rest of the library to find distinctive and improbable phrases. Rare sequences of words inhabiting the same few books suggest these works share more than that phrase. For instance, one of my books ("Out Of Control") employs the phrase “perpetual novelty” more than once. That word-pair shows up in 22 other books. When I click on those references I am brought to the exact place in each of these books where that passage occurs. I can quickly see the relevance (or not) of these works. Amazon highlights two dozen other improbable phrases in my book; each one leads to another cluster of related works that I was unaware of. Clearly this helps Amazon to sell more books (“If you like this one, you’ll like these.”). But it is also a new way of knowing. Once text is digital, books seep into each other’s bindings to create the wisdom of the library. Books know about each other. And they carry links between books and what people say about them. The collective intelligence of a library allows us to see things we can’t see in a single book.
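
The comparison described above can be sketched in a few lines of Python. This is only a toy illustration of the general idea, not Amazon's actual method, which is not spelled out here. It assumes that a "phrase" is a pair of adjacent words and that "improbable" means a pair used far more often in one book than in the rest of the library. The function names (`bigrams`, `improbable_phrases`, `books_sharing`), the scoring ratio, and the three-book library are all made up for the example.

```python
import re
from collections import Counter


def bigrams(text):
    """Return every pair of adjacent words in the text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


def improbable_phrases(library, title, min_count=2, top=5):
    """Rank word pairs that one book repeats but the rest of the library rarely uses.

    The score is an assumed, simplified measure: the pair's rate in the book
    divided by its (smoothed) rate in all the other books combined.
    """
    book_counts = Counter(bigrams(library[title]))
    rest_counts = Counter()
    for other, text in library.items():
        if other != title:
            rest_counts.update(bigrams(text))

    book_total = sum(book_counts.values()) or 1
    rest_total = sum(rest_counts.values()) or 1

    scored = []
    for pair, count in book_counts.items():
        if count < min_count:
            continue  # the phrase must recur within the book itself
        book_rate = count / book_total
        # Add one so a pair never seen elsewhere does not divide by zero.
        rest_rate = (rest_counts[pair] + 1) / (rest_total + 1)
        scored.append((book_rate / rest_rate, " ".join(pair)))

    scored.sort(reverse=True)
    return [phrase for _, phrase in scored[:top]]


def books_sharing(library, phrase, exclude=None):
    """List the other books that contain the given two-word phrase."""
    target = tuple(phrase.lower().split())
    return sorted(
        title
        for title, text in library.items()
        if title != exclude and target in set(bigrams(text))
    )


# A hypothetical three-book library, invented for this sketch.
library = {
    "A": "perpetual novelty drives the system and perpetual novelty never ends",
    "B": "the system drives the market and the market never ends",
    "C": "perpetual novelty is the mark of the living system",
}

for phrase in improbable_phrases(library, "A"):
    print(phrase, "->", books_sharing(library, phrase, exclude="A"))
```

Each phrase that survives the ranking works as a link: looking it up across the library returns the small cluster of other books that share it, which is the "books know about each other" effect in miniature.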
