Our Data, Our Selves: Data Mining for Self-Knowledge

If you haven’t read Gary Shteyngart’s Super Sad True Love Story, I would encourage you to go, sell all you have, and buy it. I guess I would call it a dystopian black-comic satire, and at one point I would have called it futuristic. Now I’m not so sure. The creepy thing is that about every other week I notice some new development and say to myself, “Wow, that’s right out of Shteyngart.” This latest from the NYTimes is another case in point. The article traces Stephen Wolfram’s efforts to mine his immense collection of personal data, from the records of his email to the keystrokes on his computer, for patterns of creativity, productivity, and the like.

He put the system to work, examining his e-mail and phone calls. As a marker for his new-idea rate, he used the occurrence of new words or phrases he had begun using over time in his e-mail. These words were different from the 33,000 or so that the system knew were in his standard lexicon.

The analysis showed that the practical aspects of his days were highly regular — a reliable dip in e-mail about dinner time, and don’t try getting him on the phone then, either.

But he said the system also identified, as hoped, some of the times and circumstances of creative action. Graphs of what the system found can be seen on his blog, called “The Personal Analytics of My Life.”

The algorithms that Dr. Wolfram and his group wrote “are prototypes for what we might be able to do for everyone,” he said.

The system may someday end up serving as a kind of personal historian, as well as a potential coach for improving work habits and productivity. The data could also be a treasure trove for people writing their autobiographies, or for biographers entrusted with the information.
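For the curious, the “new-idea rate” described in the article can be roughed out in a few lines of Python. To be clear, this is only my own sketch of the general idea, not Dr. Wolfram’s actual algorithm: the function name, the (date, text) message format, the crude tokenizing, and the tiny sample lexicon are all assumptions made up for illustration.

```python
import re
from collections import Counter
from datetime import date


def new_word_counts(messages, lexicon):
    """Count first-time words per (year, month).

    messages: iterable of (date, text) pairs (a hypothetical input format).
    lexicon: set of lowercase words treated as the writer's standard vocabulary.
    """
    seen = set(lexicon)  # words that no longer count as new
    per_month = Counter()
    # Walk the messages in chronological order so "first use" is meaningful.
    for sent_on, text in sorted(messages, key=lambda m: m[0]):
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in seen:
                seen.add(word)
                per_month[(sent_on.year, sent_on.month)] += 1
    return dict(per_month)


# Invented sample data, purely for illustration.
sample = [
    (date(2012, 1, 5), "Dinner at seven"),
    (date(2012, 1, 9), "The hypergraph idea again"),
    (date(2012, 2, 2), "More on the hypergraph and its automaton"),
]
lexicon = {"dinner", "at", "seven", "the", "idea", "again",
           "more", "on", "and", "its"}

print(new_word_counts(sample, lexicon))
```

Each word counts only the first time it appears, so a month’s tally is a rough proxy for how much new vocabulary entered the writer’s email that month. Whether new vocabulary really tracks new ideas is, of course, exactly the kind of question the numbers can’t answer on their own.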

Wolfram’s project is eerily like the world of Shteyngart’s novel, in which people carry data scores that are instantly readable by themselves and by others. The main character obsesses continuously over the state of his own data, and he judges the nature and potential of his relationships on the basis of other people’s data.

Socrates was the first, I think, to say the unexamined life was not worth living, but I’m not entirely sure this was what he had in mind.  There is a weird distancing effect involved in this process by which we remove ourselves from ourselves and look at the numbers.

At the same time, I’m fascinated by the prospects, and I think it’s not all that different from the idea of “distant reading” that is now becoming common through certain digital humanities practices in literature. Instead of reading two or three novels closely, scholars analyze hundreds or thousands of them statistically in order to understand the important trends in literary history at a particular point in time, as well as the way specific novels might fit into that statistical history.

Nevertheless, a novel isn’t a person. I remain iffy about reducing myself to a set of numbers I can work to improve, modify, analyze, and interpret. The examined life typically leads not to personal policies but to a sense of mystery: how much there is that we don’t know about ourselves, how much there is that can’t be reduced to what I can see or what I can count. If I could understand my life by numbers, would I?

For your edification I include the book trailer for Shteyngart’s novel below.