The Social Graph is Neither
I first came across the phrase social graph in 2007, in an essay by Brad Fitzpatrick, though I'd be curious to know if it goes back further.
MORE >>>The idea of representing relationships between people as networks is old, but this was the first time I had thought about treating the connections between all living people as one big object that you could manipulate with a computer.
At the time he wrote, Fitzpatrick had two points to make. The first was that it made no sense for every social website to try and recreate the same web of relationships, over and over, by making people send each other follow requests. The second was that this relationship data should not be proprietary, but a common resource that rival services could build on as a foundation.
Fitzpatrick subsequently went to work for Google, and his Utopian vision of open standards and open data became subsumed in a rivalry between Google and Facebook. Both companies now offer their version of a social graph API, and Google (which is trying to catch up) has taken up the banner of open standards and data portability.
This rivalry has brought the phrase 'social graph' into wider use. Last week Forbes even went to the extent of calling the social graph an exploitable resource comprarable to crude oil, with riches to those who figure out how to mine it and refine it.
I think this is a fascinating metaphor. If the social graph is crude oil, doesn't that make our friends and colleagues the little animals that get crushed and buried underground?
But right now I would like to take issue with the underlying concept, which I think has two flaws:
I. It's not a graph
The idea of the social graph is that each person is a dot in a kind of grand connect-the-dots game, the various relationships between us forming the lines. Some of these lines may point in one direction (Ted works for Sylvia), while others are reciprocal (Bob is my neighbor). All of them taken together form a mathematical structure called the social graph. Facebook even has a pretty picture!.
We nerds love graphs because they are easy to represent in a computer and there is a vast literature on how to do useful things with them. When you ask Google for directions from Detroit to Redwood City, for example, you're interacting with a graph that represents the US road network. The same principle applies any time a site tells you people who bought object X might also be interested in book Y.
In order to model something as a graph, you have to have a clear definition of what its nodes and edges represent. In most social sites, this does not pose a problem. The nodes are users, while edges means something like 'accepted a connection request from', or 'followed', or 'exchanged email with', depending on where you are.
The way you interpret this is another matter - does clicking 'follow' imply you're friends with someone in real life? But at least what the data model represents is unambiguous.
But when you start talking about building a social graph that transcends any specific implementation, you quickly find yourself in the weeds. Is accepting someone's invitation on LinkedIn the same kind of connection as mutually following them on Twitter? Can we define some generic connections like 'fan of' or 'follower' and re-use them for multiple sites? Does it matter that you can see who your followers are on site X but not on site Y?
One way to solve this comparison problem is with standards. Before pooling your data in the social graph, you first map it to a common vocabulary. Google, for example, uses XFN as part of their Social Graph API. This defines a set of about twenty allowed relationships. (Facebook has a much more austere set: close_friends, acquaintances, restricted, and the weaselly user_created).
But these common relationships turn out to be kind of slippery. To use XFN as my example, how do I decide if my cubicle mate is a friend, acquaintance or just a contact? And if I call him my friend, should I interpret that in the northern California sense, or in in some kind of universal sense of friendship?
In the old country, for example, we have two kinds of 'friendship' (distinguished by whether you address one another with the informal pronoun) and going from one status to the other is a pretty big deal; you have to drink a toast with your arms all in a pretzel and it's considered a huge faux pas to suggest it before both people feel ready. But at least it's not ambiguous!
And of course sex complicates things even more. Will it get me in hot water to have a crush on someone but have a different person as my muse? Does spouse imply sweetheart, or do I have to explicilty declare that (perhaps on our 20th anniversary)? And should restrainingOrder be an edge or a node in this data model?
There's also the matter of things that XFN doesn't allow you to describe. There's no nemesis or rival, since the standards writers wanted to exclude negativity. The gender-dependent second e on fiancé(e) panicked the spec writers, so they left that relationship out. Neither will they allow you to declare an ex-spouse or an ex-colleague.
And then there's the question of how to describe the more complicated relationships that human beings have. Maybe my friend Bill is a little abrasive if he starts drinking, but wonderful with kids - how do I mark that? Dawn and I go out sometimes to kvetch over coffee, but I can't really tell if she and I would stay friends if we didn't work together. I'd like to be better friends with Pat. Alex is my AA sponsor. Just how many kinds of edges are in this thing?
And speaking of booze, how come there's a field for declaring I'm an alcoholic (opensocial.Enum.Drinker.HEAVILY) but no way to tell people I smoke pot? Why are the only genders male and female? Have the people who designed this protocol really never made the twenty mile drive to San Francisco?
What happens to dead people in the social graph? Facebook keeps profiles around for a while in memoriam, so we probably shouldn't just purge dead contacts from the social graph immediately. But we certainly don't want them haunting us on LinkedIn - maybe there should be a second, Elysian social graph where we can put those nodes to await us?
You can call this nitpicking, but this stuff matters! This is supposed to be a canonical representation of human relationships. But it only takes five minutes of reading the existing standards to see that they're completely inadequate.
Here the Ghost of Abstractions Past materializes in a flurry of angle brackets, and says in a sepulchral whisper:
“How about we let people define arbitrary relationships between nodes…”
(subject,verb,object)“Maybe even in XML…”
<Person "john"> <likesToShareRecipesWith "susan" /> </Person>“Of course, we'll need namespaces…”
<ns:Person rdf:about="http://www.example.org/#john"> <ns:likesToShareRecipesWith rdf:resource="http://www.example.org/#susan" /> </ns:Person>And RDF rises lurching out of the grave to infect the brains of another generation of young developers.
But even if we go ahead and build the Semantic Web, 2004 edition, and populate it with information about all our connections to other people, it still won't be expressive enough.
One big sticking point is privacy. Do I really want to find out that my pastor and I share the same dominatrix? If not, then who is going to be in charge of maintaining all the access control lists for every node and edge so that some information is not shared? You can either have a decentralized, communally owned social graph (like Fitzpatrick envisioned) or good privacy controls, but not the two together.
There's another fundamental problem in that a graph is a static thing, with no concept of time. Real life relationships are a shared history, but in the social graph they're just a single connection. My friend from ten years ago has the same relationship to me as the friend I dined with yesterday. You're left with forcing people (or their software) to maintain lists like 'Recent Contacts' because there is no place in the model to fit this information.
"No problem," says Poindexter. "We'll add a time series of state transitions and exponentially decaying edge weights, model group dynamics as directional flows, and pass a context object in with each query..." and around we go.
This obsession with modeling has led us into a social version of the Uncanny Valley, that weird phenomenon from computer graphics where the more faithfully you try to represent something human, the creepier it becomes. As the model becomes more expressive, we really start to notice the places where it fails.
Personally, I think finding an adequate data model for the totality of interpersonal connections is an AI-hard problem. But even if you disagree, it's clear that a plain old graph is not going to cut it.




Ran into 
For more than 65 years, the family that founded Sub-Zero has believed that consumers purchase the company’s premium products because of their quality. But recently, with a groundswell of support for American-made items, the Madison, Wis.-based manufacturer of built-in home refrigeration has found it important to tell that story, too.
